From 5c7c86b56e35fbf9d639e0daebb4b9f65a90013e Mon Sep 17 00:00:00 2001 From: Jaron Kent-Dobias Date: Wed, 18 Sep 2024 21:27:18 +0200 Subject: Full pass of edits. --- topology.tex | 183 +++++++++++++++++++++++++++++++---------------------------- 1 file changed, 97 insertions(+), 86 deletions(-) diff --git a/topology.tex b/topology.tex index 2ed25aa..bbd978a 100644 --- a/topology.tex +++ b/topology.tex @@ -66,8 +66,9 @@ this behavior into five phases with different implications for the topology of the solution manifold. When $M=1$ there is a correspondence between this problem and level sets of the energy in the spherical spin glasses. We conjecture that the transition energy dividing two of the topological phases -corresponds to the asymptotic limit of gradient descent from a random initial -condition, possibly resolving an open problem in out-of-equilibrium dynamics. +corresponds to the energy asymptotically reached by gradient descent from a +random initial condition, possibly resolving an open problem in +out-of-equilibrium dynamics. } } @@ -127,13 +128,13 @@ set of solutions, which can influence the behavior of algorithms trying to find them \cite{Baldassi_2016_Unreasonable, Baldassi_2019_Properties, Beneventano_2023_On}. Here, we show how topological information about the set of solutions can be calculated in a simple problem of satisfying random -nonlinear equalities. This allows us to reason about the connectivity of the +nonlinear equalities. This allows us to reason about the connectivity and structure of the solution set. The topological properties revealed by this calculation yield surprising results for the well-studied spherical spin glasses, where a topological transition thought to occur at a threshold energy $E_\text{th}$ where marginal minima are dominant is shown to occur at a different energy $E_\text{sh}$. We conjecture that this difference resolves an outstanding -problem in gradient descent dynamics in these systems. +problem with the out-of-equilibrium dynamics in these systems. We consider the problem of finding configurations $\mathbf x\in\mathbb R^N$ lying on the $(N-1)$-sphere $\|\mathbf x\|^2=N$ that simultaneously satisfy $M$ @@ -146,16 +147,16 @@ Gaussian random functions with covariance \end{equation} for some choice of function $f$. When the covariance function $f$ is polynomial, the $V_k$ are also polynomial, with a term of degree $p$ in $f$ corresponding to -all possible terms of degree $p$ in $V_k$. One can explicitly construct functions that satisfy \eqref{eq:covariance} by taking +all possible terms of degree $p$ in the $V_k$. One can explicitly construct functions that satisfy \eqref{eq:covariance} by taking \begin{equation} V_k(\mathbf x) =\sum_{p=0}^\infty\frac1{p!}\sqrt{\frac{f^{(p)}(0)}{N^p}} \sum_{i_1\cdots i_p}^NJ^{(k,p)}_{i_1\cdots i_p}x_{i_1}\cdots x_{i_p} \end{equation} -with the elements of the tensors $J^{(k,p)}$ as independently distributed +where the elements of the tensors $J^{(k,p)}$ are independently distributed unit normal random variables. The series coefficients of $f$ therefore control the variances of the random coefficients -in the random polynomials $V_k$. When $M=1$, this problem corresponds to +in the polynomials $V_k$. When $M=1$, this problem corresponds to finding the level set of a spherical spin glass at energy density $E=V_0/\sqrt{N}$. @@ -171,7 +172,7 @@ studied properties of the cost function \mathscr C(\mathbf x)=\frac12\sum_{k=1}^M\big[V_k(\mathbf x)-V_0\big]^2 \end{equation} which achieves zero only for configurations that satisfy all the constraints. -From the perspective of the cost function, the set of solutions looks like a network of flat canyons at the bottom of the landscape. +From the perspective of the cost function, the set of solutions looks like a network of flat canyons at the bottom of the cost landscape. Here we dispense with the cost function and study the set of solutions directly. This set can be written as \begin{equation} @@ -220,7 +221,7 @@ a manifold, the Euler characteristic of its product with the circle $S^1$ is zero. The canonical method for computing the Euler characteristic is to construct a -complex on the manifold in question, essentially a higher-dimensional +complex on the manifold in question, which is a higher-dimensional generalization of a polygonal tiling. Then $\chi$ is given by an alternating sum over the number of cells of increasing dimension, which for 2-manifolds corresponds to the number of vertices, minus the number of edges, plus the @@ -298,7 +299,7 @@ form \overline{\chi(\Omega)} =\left(\frac N{2\pi}\right)^2\int dR\,dD\,dm\,d\hat m\,g(R,D,m,\hat m)\,e^{N\mathcal S_\chi(R,D,m,\hat m)} \end{equation} -where $g$ is a prefactor of $o(N^0)$, and $\mathcal S_\chi$ is an effective action defined by +where $g$ is a prefactor of $o(N^0)$, $\mathcal S_\chi$ is an effective action defined by \begin{equation} \label{eq:euler.action} \begin{aligned} \mathcal S_\chi(R,D,m,\hat m) @@ -322,20 +323,22 @@ The remaining order parameters are defined by the scalar products && \hat m=-i\frac1N\hat {\mathbf x}\cdot\mathbf x_0 \end{align} - -\subsection{Features of the effective action} - -This integral can be evaluated to leading order in $N$ by a saddle point approximation. First -we extremize with respect to $R$, $D$, and $\hat m$, which take the saddle-point -values -\begin{equation} +between the configurations $\mathbf x$, the auxiliary configurations +$\hat{\mathbf x}$, and the height axis $\mathbf x_0$. +The integral \eqref{eq:pre-saddle.characteristic} can be evaluated to leading +order in $N$ by a saddle point approximation. First we extremize with respect +to $R$, $D$, and $\hat m$, which take the saddle-point values +\begin{align} + R=R^* + && D=-\frac{m+R_*}{1-m^2}R_* - \hspace{8em} + && \hat m=0 -\end{equation} +\end{align} +where we have defined \begin{equation} \label{eq:rs} \begin{aligned} - R=R_* + R_* \equiv\frac{-m(1-m^2)}{2[f(1)-(1-m^2)f'(1)]^2} \Bigg[ \alpha V_0^2f'(1) @@ -347,7 +350,8 @@ values \Bigg] \end{aligned} \end{equation} -This results in an effective action as a function of $m$ alone given by +Upon substitution of these solutions into \eqref{eq:euler.action}, we find an +effective action as a function of $m$ alone given by \begin{equation} \label{eq:S.m} \mathcal S_\chi(m) =-\frac\alpha2\bigg[ @@ -360,14 +364,12 @@ This results in an effective action as a function of $m$ alone given by \bigg] +\frac12\log\left(-\frac m{R_*}\right) \end{equation} -This function is plotted as a function of $m$ in Fig.~\ref{fig:action} for a -selection of sample parameters. To finish evaluating the integral, this +This function is plotted in Fig.~\ref{fig:action} for a +selection of parameters. +To finish evaluating the integral, this expression should be maximized with respect to $m$. If $m_*$ is such a maximum, then the resulting Euler characteristic is $\overline{\chi(\Omega)}\propto -e^{N\mathcal S_\chi(m_*)}$. The order parameter $m$ is the overlap of the -configuration $\mathbf x$ with the height axis $\mathbf x_0$. Therefore, the -value $m$ that maximizes this action can be understood as the latitude on the -sphere where most of the contribution to the Euler characteristic is made. +e^{N\mathcal S_\chi(m_*)}$. \begin{figure}[tbh] \includegraphics{figs/action_1.pdf} @@ -393,6 +395,12 @@ sphere where most of the contribution to the Euler characteristic is made. } \label{fig:action} \end{figure} +\subsection{Features of the effective action} + +The order parameter $m$ is the overlap of the +configuration $\mathbf x$ with the height axis $\mathbf x_0$. Therefore, the +value $m$ that maximizes this action can be understood as the latitude on the +sphere at which most of the contribution to the Euler characteristic is made. The action $\mathcal S_\chi$ is extremized with respect to $m$ at $m=0$ or at $m=\pm m_*$ for \begin{equation} m_*=\sqrt{1-\frac{\alpha}{f'(1)}\big(V_0^2+f(1)\big)} @@ -457,7 +465,7 @@ function $f$ it is not possible to write an explicit formula for $V_\text{\textsc{sat}}$, and we calculate it through a numeric root-finding algorithm. -When $m_\text{min}^2>0$, the solution at $m=0$ is difficult to interpret, since +When $V_0^20$, and $\overline{\chi(\Omega)^2}\neq[\overline{\chi(\Omega)}]^2$ always. -} we find three +} we identify three saddle points that could contribute to the value of $\overline{\chi(\Omega)^2}$: two at $\pm m_*$ where $\frac1N\log\overline{\chi(\Omega)^2}=\frac1N\log\overline{\chi(\Omega)}\simeq0$, @@ -537,7 +545,7 @@ which are schematically represented in Fig.~\ref{fig:cartoons}.\footnote{ The solution manifold is shown as a shaded region, and the height axis $\mathbf x_0$ is a black arrow. In Regime I, the average Euler characteristic is consistent with a manifold with a single simply-connected - component. In Regime II, holes occupy the equator but the most polar + component. In Regime II, holes occupy the equator but the temperate regions are topologically simple. In Regime III, holes dominate and the edge of the manifold is not necessarily simple. In Regime IV, disconnected components dominate. In Regime V, the manifold is empty. @@ -550,7 +558,7 @@ This regime is found when the magnitude of the target value $V_0$ is less than the onset $V_\text{on}$ and $\operatorname{Re}\mathcal S(0)<0$, so that the maxima at $m=\pm m_*$ exist and are the dominant contributions to the average Euler characteristic. Here, $\overline{\chi(\Omega)}=2+o(1)$ for even $N-M-1$, -strongly indicating a topology homeomorphic to the $S^{N-M-1}$ sphere. This regime is the only nontrivial one found with linear covariance $f(q)$, where the solution manifold must be a sphere if it is not empty. +strongly indicating a topology homeomorphic to the $S^{N-M-1}$ sphere. This regime is the only nontrivial one found with linear covariance $f(q)=q$, where the solution manifold must be a sphere if it is not empty. \paragraph{Regime II: \boldmath{$\overline{\chi(\Omega)}$} large and negative, isolated contributions at \boldmath{$m=\pm m_*$}.} @@ -596,14 +604,14 @@ characteristic zero. \caption{ \textbf{Topological phase diagram.} - Topological phases of the model for three different homogeneous covariance + Topological phases of the problem for three different homogeneous covariance functions. The regimes are defined in the text and depicted as cartoons in Fig.~\ref{fig:cartoons}. The shaded region in the center panel shows where these results are unstable to \textsc{rsb}. In the limit of $\alpha\to0$, the behavior of level sets of the spherical spin glasses are recovered: the righthand plot shows how the ground state energy $E_\text{gs}$ and threshold energy $E_\text{th}$ of the 3-spin spherical model correspond with the limits - of the satisfiability and shattering transitions in the pure cubic model. Note that + of the satisfiability and shattering transitions in the pure cubic problem. Note that for mixed models with inhomogeneous covariance functions, $E_\text{th}$ is not the lower limit of $V_\text{sh}$. } \label{fig:phases} @@ -624,13 +632,13 @@ while III and IV are separated by the shattering transition at $V_\text{sh}$. Finally, IV and V are now separated by the satisfiability transition at $V_\text{\textsc{sat}}$. -One interesting +An interesting feature occurs in the limit of $\alpha$ to zero. If $V_0$ is likewise rescaled in the correct way, the limit of these phase boundaries approaches known landmark energy values in the pure spherical spin glasses. In particular, the -limit to zero $\alpha$ of the scaled satisfiability transition +limit $\alpha\to0$ of the scaled satisfiability transition $V_\text{\textsc{sat}}\sqrt\alpha$ approaches the ground state energy -$E_\text{gs}$, while the limit to zero $\alpha$ of the scaled shattering +$E_\text{gs}$, while the limit $\alpha\to0$ of the scaled shattering transition $V_\text{sh}\sqrt\alpha$ approaches the threshold energy $E_\text{th}$. The correspondence between ground state and satisfiability is expected: when the energy of a level set is greater in magnitude than the @@ -640,7 +648,7 @@ energy is typically understood as the point where the landscape fractures into pieces. However, this second correspondence is only true for the pure spherical models with homogeneous $f(q)$. For any other model with an inhomogeneous $f(q)$, $E_\text{sh}^2