diff options
Diffstat (limited to 'frsb_kac-rice_letter.tex')
-rw-r--r-- | frsb_kac-rice_letter.tex | 435 |
1 files changed, 435 insertions, 0 deletions
diff --git a/frsb_kac-rice_letter.tex b/frsb_kac-rice_letter.tex new file mode 100644 index 0000000..4fb22e7 --- /dev/null +++ b/frsb_kac-rice_letter.tex @@ -0,0 +1,435 @@ + +\documentclass[reprint,aps,prl,longbibliography,floatfix]{revtex4-2} + +\usepackage[utf8]{inputenc} % why not type "Bézout" with unicode? +\usepackage[T1]{fontenc} % vector fonts plz +\usepackage{amsmath,amssymb,latexsym,graphicx} +\usepackage{newtxtext,newtxmath} % Times for PR +\usepackage[dvipsnames]{xcolor} +\usepackage[ + colorlinks=true, + urlcolor=MidnightBlue, + citecolor=MidnightBlue, + filecolor=MidnightBlue, + linkcolor=MidnightBlue +]{hyperref} % ref and cite links with pretty colors +\usepackage{anyfontsize} + +\begin{document} + +\title{ + Unveiling the complexity of hierarchical energy landscapes +} + +\author{Jaron Kent-Dobias} +\author{Jorge Kurchan} +\affiliation{Laboratoire de Physique de l'Ecole Normale Supérieure, Paris, France} + +\begin{abstract} + Complexity is a measure of the number of stationary points in complex + landscapes. We derive a general solution for the complexity of mean-field + complex landscapes. It incorporates Parisi's solution for the ground state, + as it should. Using this solution, we count the stationary points of two + models: one with multi-step replica symmetry breaking, and one with full + replica symmetry breaking. These examples demonstrate the consistency of the + solution and reveal that the signature of replica symmetry breaking at high + energy densities is found in high-index saddles, not minima. +\end{abstract} + +\maketitle + +The functions used to describe the energies, costs, and fitnesses of disordered +systems in physics, computer science, and biology are typically \emph{complex}, +meaning that they have a number of minima that grows exponentially with the +size of the system \cite{Maillard_2020_Landscape, Ros_2019_Complex, +Altieri_2021_Properties}. Though they are often called `rough landscapes' to +evoke the intuitive image of many minima in something like a mountain range, +the metaphor to topographical landscapes is strained by the reality that these +complex landscapes also exist in very high dimensions: think of the dimensions +of phase space for $N$ particles, or the number of parameters in a neural +network. + +The \emph{complexity} of a function is the average of the logarithm of the +number of its minima, maxima, and saddle points (collectively stationary +points), under conditions fixing the value of the energy or the index of the +stationary point +\cite{Bray_1980_Metastable}. +Since in complex landscapes this +number grows exponentially with system size, their complexity is an extensive +quantity. Understanding the complexity offers an understanding about the +geometry and topology of the landscape, which can provide insight into +dynamical behavior. + +When complex systems are fully connected, i.e., each degree of freedom +interacts directly with every other, they are often described by a hierarchical +structure of the type first proposed by Parisi, the \emph{replica symmetry +breaking} (RSB) \cite{Parisi_1979_Infinite}. This family of structures is rich, spanning uniform +\emph{replica symmetry} (RS), an integer $k$ levels of hierarchical nested +structure ($k$RSB), a full continuum of nested structure (full RSB or FRSB), +and arbitrary combinations thereof. Though these rich structures are understood +in the equilibrium properties of fully connected models, the complexity has +only been computed in RS cases. + +In this paper and its longer companion, we share the first results for the +complexity with nontrivial hierarchy \cite{Kent-Dobias_2022_How}. Using a +general form for the solution detailed in a companion article, we describe the +structure of landscapes with a 1RSB complexity and a full RSB complexity +\footnote{The Thouless--Anderson--Palmer (TAP) complexity is the complexity of + a kind of mean-field free energy. Because of some deep thermodynamic + relationships between the TAP complexity and the equilibrium free energy, the +TAP complexity can be computed with extensions of the equilibrium method. As a +result, the TAP complexity has been previously computed for nontrivial +hierarchical structure.}. + +We study the mixed $p$-spin spherical models, with Hamiltonian +\begin{equation} \label{eq:hamiltonian} + H(\mathbf s)=-\sum_p\frac1{p!}\sum_{i_1\cdots i_p}^NJ^{(p)}_{i_1\cdots i_p}s_{i_1}\cdots s_{i_p} +\end{equation} +is defined for vectors $\mathbf s\in\mathbb R^N$ confined to the $N-1$ sphere +$S^{N-1}=\{\mathbf s\mid\|\mathbf s\|^2=N\}$. The coupling coefficients $J$ are taken at random, with +zero mean and variance $\overline{(J^{(p)})^2}=a_pp!/2N^{p-1}$ chosen so that +the energy is typically extensive. The overbar will always denote an average +over the coefficients $J$. The factors $a_p$ in the variances are freely chosen +constants that define the particular model. For instance, the so-called `pure' +models have $a_p=1$ for some $p$ and all others zero. + +The complexity of the $p$-spin models has been extensively studied by +physicists and mathematicians. Among physicists, the bulk of work has been on + the so-called `TAP' complexity, +which counts minima in the mean-field Thouless--Anderson--Palmer () free energy \cite{Rieger_1992_The, +Crisanti_1995_Thouless-Anderson-Palmer, Cavagna_1997_An, +Cavagna_1997_Structure, Cavagna_1998_Stationary, Cavagna_2005_Cavity, +Giardina_2005_Supersymmetry}. The landscape complexity has been proven for pure +and mixed models without RSB \cite{Auffinger_2012_Random, +Auffinger_2013_Complexity, BenArous_2019_Geometry}. The mixed models been +treated without RSB \cite{Folena_2020_Rethinking}. And the methods of +complexity have been used to study many geometric properties of the pure +models, from the relative position of stationary points to one another to shape +and prevalence of instantons \cite{Ros_2019_Complexity, Ros_2021_Dynamical}. + +The variance of the couplings implies that the covariance of the energy with +itself depends on only the dot product (or overlap) between two configurations. +In particular, one finds +\begin{equation} \label{eq:covariance} + \overline{H(\mathbf s_1)H(\mathbf s_2)}=Nf\left(\frac{\mathbf s_1\cdot\mathbf s_2}N\right), +\end{equation} +where $f$ is defined by the series +\begin{equation} + f(q)=\frac12\sum_pa_pq^p. +\end{equation} +One needn't start with a Hamiltonian like +\eqref{eq:hamiltonian}, defined as a series: instead, the covariance rule +\eqref{eq:covariance} can be specified for arbitrary, non-polynomial $f$, as in +the `toy model' of M\'ezard and Parisi \cite{Mezard_1992_Manifolds}. In fact, defined this way the mixed spherical model encompasses all isotropic Gaussian fields on the sphere. + +The family of spherical models thus defined is quite rich, and by varying the +covariance $f$ nearly any hierarchical structure can be found in +equilibrium. Because of a correspondence between the ground state complexity +and the entropy at zero temperature, any hierarchical structure in the +equilibrium should be reflected in the complexity. + +The complexity is calculated using the Kac--Rice formula, which counts the +stationary points using a $\delta$-function weighted by a Jacobian +\cite{Kac_1943_On, Rice_1939_The}. The count is given by +\begin{equation} + \begin{aligned} + \mathcal N(E, \mu) + &=\int_{\mathbb R^N}d\boldsymbol\xi\,e^{-\frac12\|\boldsymbol\xi\|^2/\sigma^2}\int_{S^{N-1}}d\mathbf s\, \delta\big(\nabla H(\mathbf s)-\boldsymbol\xi\big)\,\big|\det\operatorname{Hess}H(\mathbf s)\big| \\ + &\hspace{2pc}\times\delta\big(NE-H(\mathbf s)\big)\delta\big(N\mu-\operatorname{Tr}\operatorname{Hess}H(\mathbf s)\big) + \end{aligned} +\end{equation} +with two additional $\delta$-functions inserted to fix the energy density $E$ +and the stability $\mu$. The additional `noise' field $\boldsymbol\xi$ +helps regularize the $\delta$-functions for the energy and stability at finite +$N$, and will be convenient for defining the order parameter matrices later. The complexity is then +\begin{equation} \label{eq:complexity} + \Sigma(E,\mu)=\lim_{N\to\infty}\lim_{\sigma\to0}\frac1N\overline{\log\mathcal N(E, \mu}). +\end{equation} +Most of the difficulty of this calculation resides in the logarithm in this +formula. + +The stability $\mu$, sometimes called the radial reaction, determines the depth +of minima or the index of saddles. At large $N$ the Hessian can be shown to +consist of the sum of a GOE matrix with variance $f''(1)/N$ shifted by a +constant diagonal matrix of value $\mu$. Therefore, the spectrum of the Hessian +is a Wigner semicircle of radius $\mu_\mathrm m=\sqrt{4f''(1)}$ centered at $\mu$. When +$\mu>\mu_\mathrm m$, stationary points are minima whose sloppiest eigenvalue is +$\mu-\mu_\mathrm m$. When $\mu=\mu_\mathrm m$, the stationary points are marginal minima with +flat directions. When $\mu<\mu_\mathrm m$, the stationary points are saddles with +indexed fixed to within order one (fixed macroscopic index). + +It's worth reviewing the complexity for the best-studied case of the pure model +for $p\geq3$ \cite{Cugliandolo_1993_Analytical}. Here, because the covariance +is a homogeneous polynomial, $E$ and $\mu$ cannot be fixed separately, and one +implies the other: $\mu=pE$. Therefore at each energy there is only one kind of +stationary point. When the energy reaches $E_\mathrm{th}=-\mu_\mathrm m/p$, the +population of stationary points suddenly shifts from all saddles to all minima, +and there is an abrupt percolation transition in the topology of +constant-energy slices of the landscape. This behavior of the complexity can be +used to explain a rich variety of phenomena in the equilibrium and dynamics of +the pure models: the `threshold' \cite{Cugliandolo_1993_Analytical} energy $E_\mathrm{th}$ corresponds to the +average energy at the dynamic transition temperature, and the asymptotic energy +reached by slow aging dynamics. + +Things become much less clear in even the simplest mixed models. For instance, +one mixed model known to have a replica symmetric complexity was shown to +nonetheless not have a clear relationship between features of the complexity +and the asymptotic dynamics \cite{Folena_2020_Rethinking}. There is no longer a +sharp topological transition. + +In the pure models, $E_\mathrm{th}$ also corresponds to the \emph{algorithmic +threshold} $E_\mathrm{alg}$, defined by the lowest energy reached by local +algorithms like approximate message passing \cite{ElAlaoui_2020_Algorithmic, +ElAlaoui_2021_Optimization}. In the spherical models, this has been proven to +be +\begin{equation} + E_{\mathrm{alg}}=-\int_0^1dq\,\sqrt{f''(q)} +\end{equation} +For full RSB systems, $E_\mathrm{alg}=E_0$ and the algorithm can reach the +ground state energy. For the pure $p$-spin models, +$E_\mathrm{alg}=E_\mathrm{th}$, where $E_\mathrm{th}$ is the energy at which +marginal minima are the most common stationary points. Something about the +topology of the energy function might be relevant to where this algorithmic +threshold lies. + +To compute the complexity in the generic case, we use the replica method to +treat the logarithm inside the average of \eqref{eq:complexity}, and the +$\delta$-functions are written in a Fourier basis. The average of the factor +including the determinant and the factors involving $\delta$-functions can be +averaged over the disorder separately \cite{Bray_2007_Statistics}. The result +can be written +\begin{equation} + \Sigma(E,\mu)=\lim_{N\to\infty}\lim_{n\to0}\frac1N\frac{\partial}{\partial n} + \int_{\mathrm M_n(\mathbb R)} dQ\,dR\,dD\,e^{N\mathcal S(Q,R,D\mid E,\mu)}, +\end{equation} +where the effective action $\mathcal S$ is a function of three matrices indexed +by the $n$ replicas: +\begin{equation} + \begin{aligned} + &Q_{ab}=\frac{\mathbf s_a\cdot\mathbf s_b}N + \hspace{4em} + R_{ab}=\frac{\boldsymbol\xi_a\cdot\mathbf s_b}{N\sigma^2} + \\ + &D_{ab}=\frac1{N\sigma^4}\left(\sigma^2\delta_{ab}-\boldsymbol\xi_a\cdot\boldsymbol\xi_b\right). + \end{aligned} +\end{equation} +The matrix $Q$ is a clear analogue of the usual overlap matrix of the +equilibrium case. The matrices $R$ and $D$ have interpretations as response +functions: $R$ is related to the typical displacement of stationary points by +perturbations to the potential, and $D$ is related to the change in the +complexity caused by the same perturbations. The general expression for the +complexity as a function of these matrices is also found in +\cite{Folena_2020_Rethinking}. + +The complexity is found by the saddle point method, extremizing $\mathcal S$ +with respect to $Q$, $R$, and $D$ and replacing the integral with its integrand +evaluated at the extremum. We make the \emph{ansatz} that all three matrices have +a hierarchical structure, and moreover that they share the same hierarchical +structure. This means that the size of the blocks of equal value of each is the +same, though the values inside these blocks will vary from matrix to matrix. +This form can be shown to exactly reproduce the ground state energy predicted +by the equilibrium solution, a key consistency check. + +Along one line in the energy--stability plane the solution takes a simple form: +the matrices $R$ and $D$ corresponding to responses are diagonal, leaving +only the overlap matrix $Q$ with nontrivial off-diagonal entries. This +simplification makes the solution along this line analytically tractable even +for FRSB. The simplification is related to the presence of an approximate +supersymmetry in the Kac--Rice formula, studied in the past in the context of +the TAP free energy \cite{Annibale_2003_Supersymmetric, Annibale_2003_The, +Annibale_2004_Coexistence}. This line of `supersymmetric' solutions terminates +at the ground state, and describes the most numerous types of stable minima. + +Using this solution, one finds a correspondence between properties of the +overlap matrix $Q$ at the ground state energy, where the complexity vanishes, +and the overlap matrix in the equilibrium problem in the limit of zero +temperature. The saddle point parameters of the two problems are related +exactly. In the case where the vicinity of the equilibrium ground state is +described by a $k$RSB solution, the complexity at the ground state is +$(k-1)$RSB. This can be intuitively understood by considering the difference +between measuring overlaps between equilibrium \emph{states} and stationary +\emph{points}. For states, the finest level of the hierarchical description +gives the typical overlap between two points drawn from the same state, which +has some distribution about the ground state at nonzero temperature. For +points, this finest level does not exist. + +In general, solving the saddle-point equations for the parameters of the three +replica matrices is challenging. Unlike the equilibrium case, the solution is +not extremal, and so minimization methods cannot be used. However, the line of +simple `supersymmetric' solutions offers a convenient foothold: starting from +one of these solutions, the parameters $E$ and $\mu$ can be slowly varied to +find the complexity everywhere. This is how the data in what follows was produced. + +\begin{figure} + \centering + \hspace{-1em} + \includegraphics[width=\columnwidth]{figs/316_complexity_contour_1_letter.pdf} + \includegraphics[width=\columnwidth]{figs/316_detail_letter_legend.pdf} + + \caption{ + Complexity of the $3+16$ model in the energy $E$ and stability $\mu$ + plane. Solid lines show the prediction of 1RSB complexity, while dashed + lines show the prediction of RS complexity. Below the yellow marginal line + the complexity counts saddles of increasing index as $\mu$ decreases. Above + the yellow marginal line the complexity counts minima of increasing + stability as $\mu$ increases. + } \label{fig:2rsb.contour} +\end{figure} + +\begin{figure} + \centering + \includegraphics[width=\columnwidth]{figs/316_detail_letter.pdf} + \includegraphics[width=\columnwidth]{figs/316_detail_letter_legend.pdf} + + \caption{ + Detail of the `phases' of the $3+16$ model complexity as a function of + energy and stability. Solid lines show the prediction of 1RSB complexity, while dashed + lines show the prediction of RS complexity. Above the yellow marginal stability line the + complexity counts saddles of fixed index, while below that line it counts + minima of fixed stability. The shaded red region shows places where the + complexity is described by the 1RSB solution, while the shaded gray region + shows places where the complexity is described by the RS solution. In white + regions the complexity is zero. Several interesting energies are marked + with vertical black lines: the traditional `threshold' $E_\mathrm{th}$ + where minima become most numerous, the algorithmic threshold + $E_\mathrm{alg}$ that bounds the performance of smooth algorithms, and the + average energies at the $2$RSB and $1$RSB equilibrium transitions $\langle + E\rangle_2$ and $\langle E\rangle_1$, respectively. Though the figure is + suggestive, $E_\mathrm{alg}$ lies at slightly lower energy than the termination of the RS + -- 1RSB transition line. + } \label{fig:2rsb.phases} +\end{figure} + +For the first example, we study a model whose complexity has the simplest +replica symmetry breaking scheme, 1RSB. By choosing a covariance $f$ as the sum +of polynomials with well-separated powers, one develops 2RSB in equilibrium. +This should correspond to 1RSB in the complexity. We take +\begin{equation} + f(q)=\frac12\left(q^3+\frac1{16}q^{16}\right) +\end{equation} +established to have a 2RSB ground state \cite{Crisanti_2011_Statistical}. +With this covariance, the model sees a replica symmetric to 1RSB transition at +$\beta_1=1.70615\ldots$ and a 1RSB to 2RSB transition at +$\beta_2=6.02198\ldots$. The typical equilibrium energies at these phase +transitions are listed in Table~\ref{tab:energies}. + +\begin{table} + \begin{tabular}{l|cc} + & $3+16$ & $2+4$ \\\hline\hline + $\langle E\rangle_\infty$ &---& $-0.531\,25\hphantom{1\,111\dots}$ \\ + $\hphantom{\langle}E_\mathrm{max}$ & $-0.886\,029\,051\dots$ & $-1.039\,701\,412\dots$\\ + $\langle E\rangle_1$ & $-0.906\,391\,055\dots$ & ---\\ + $\langle E\rangle_2$ & $-1.195\,531\,881\dots$ & ---\\ + $\hphantom{\langle}E_\mathrm{dom}$ & $-1.273\,886\,852\dots$ & $-1.056\,6\hphantom{11\,111\dots}$\\ + $\hphantom{\langle}E_\mathrm{alg}$ & $-1.275\,140\,128\dots$ & $-1.059\,384\,319\ldots$\\ + $\hphantom{\langle}E_\mathrm{th}$ & $-1.287\,575\,114\dots$ & $-1.059\,384\,319\ldots$\\ + $\hphantom{\langle}E_\mathrm{m}$ & $-1.287\,605\,527\ldots$ & $-1.059\,384\,319\ldots$ \\ + $\hphantom{\langle}E_0$ & $-1.287\,605\,530\ldots$ & $-1.059\,384\,319\ldots$\\\hline + \end{tabular} + \caption{ + Landmark energies of the equilibrium and complexity problems for the two + models studied. $\langle E\rangle_1$, $\langle E\rangle_2$ and $\langle + E\rangle_\infty$ are the average energies in equilibrium at the RS--1RSB, + 1RSB--2RSB, and RS--FRSB transitions, respectively. $E_\mathrm{max}$ is the + highest energy at which any stationary points are described by a RSB + complexity. $E_\mathrm{dom}$ is the energy at which dominant stationary + points have an RSB complexity. $E_\mathrm{alg}$ is the algorithmic + threshold below which smooth algorithms cannot go. $E_\mathrm{th}$ is the + traditional threshold energy, defined by the energy at which marginal + minima become most common. $E_\mathrm m$ is the lowest energy at which + saddles or marginal minima are found. $E_0$ is the ground state energy. + } \label{tab:energies} +\end{table} + +In this model, the RS complexity gives an inconsistent answer for the +complexity of the ground state, predicting that the complexity of minima +vanishes at a higher energy than the complexity of saddles, with both at a +lower energy than the equilibrium ground state. The 1RSB complexity resolves +these problems, shown in Fig.~\ref{fig:2rsb.contour}. It predicts the same ground state as equilibrium and with a +ground state stability $\mu_0=6.480\,764\ldots>\mu_\mathrm m$. It predicts that +the complexity of marginal minima (and therefore all saddles) vanishes at +$E_\mathrm m$, which is very slightly greater than $E_0$. Saddles become +dominant over minima at a higher energy $E_\mathrm{th}$. The 1RSB complexity +transitions to a RS description for dominant stationary points at an energy +$E_\mathrm{dom}$. The highest energy for which the 1RSB description exists is +$E_\mathrm{max}$. The numeric values for all these energies are listed in +Table~\ref{tab:energies}. + +For minima, the complexity does +not inherit a 1RSB description until the energy is with in a close vicinity of +the ground state. On the other hand, for high-index saddles the complexity +becomes described by 1RSB at quite high energies. This suggests that when +sampling a landscape at high energies, high index saddles may show a sign of +replica symmetry breaking when minima or inherent states do not. + +Fig.~\ref{fig:2rsb.phases} shows a different detail of the complexity in the +vicinity of the ground state, now as functions of the energy difference and +stability difference from the ground state. Several of the landmark energies +described above are plotted, alongside the boundaries between the `phases.' +Though $E_\mathrm{alg}$ looks quite close to the energy at which dominant +saddles transition from 1RSB to RS, they differ by roughly $10^{-3}$, as +evidenced by the numbers cited above. Likewise, though $\langle E\rangle_1$ +looks very close to $E_\mathrm{max}$, where the 1RSB transition line +terminates, they too differ. The fact that $E_\mathrm{alg}$ is very slightly +below the place where most saddle transition to 1RSB is suggestive; we +speculate that an analysis of the typical minima connected to these saddles by +downward trajectories will coincide with the algorithmic limit. An analysis of +the typical nearby minima or the typical downward trajectories from these +saddles at 1RSB is warranted \cite{Ros_2019_Complex, Ros_2021_Dynamical}. Also +notable is that $E_\mathrm{alg}$ is at a significantly higher energy than +$E_\mathrm{th}$; according to the theory, optimal smooth algorithms in this +model stall in a place where minima are exponentially subdominant. + +\begin{figure} + \centering + \includegraphics[width=\columnwidth]{figs/24_phases_letter.pdf} + \includegraphics[width=\columnwidth]{figs/24_detail_letter_legend.pdf} + \caption{ + `Phases' of the complexity for the $2+4$ model in the energy $E$ and + stability $\mu$ plane. Solid lines show the prediction of 1RSB complexity, + while dashed lines show the prediction of RS complexity. The region shaded + gray shows where the RS solution is correct, while the region shaded red + shows that where the FRSB solution is correct. The white region shows where + the complexity is zero. + } \label{fig:frsb.phases} +\end{figure} + +If the covariance $f$ is chosen to be concave, then one develops FRSB in equilibrium. To this purpose, we choose +\begin{equation} + f(q)=\frac12\left(q^2+\frac1{16}q^4\right), +\end{equation} +also studied before in equilibrium \cite{Crisanti_2004_Spherical, Crisanti_2006_Spherical}. Because the ground state is FRSB, for this model $E_0=E_\mathrm{alg}=E_\mathrm{th}=E_\mathrm m$. +In the equilibrium solution, the transition temperature from RS to FRSB is $\beta_\infty=1$, with corresponding average energy $\langle E\rangle_\infty$, also in Table~\ref{tab:energies}. + +Fig.~\ref{fig:frsb.phases} shows the regions of complexity for the $2+4$ model, +computed using finite-$k$ RSB approximations. Notably, the phase boundary +predicted by a perturbative expansion correctly predicts where the finite +$k$RSB approximations terminate. Like the 1RSB model in the previous +subsection, this phase boundary is oriented such that very few, low energy, +minima are described by a FRSB solution, while relatively high energy saddles +of high index are also. Again, this suggests that studying the mutual +distribution of high-index saddle points might give insight into lower-energy +symmetry breaking in more general contexts. + +We have used our solution for mean-field complexity to explore how hierarchical +RSB in equilibrium corresponds to analogous hierarchical structure in the +energy landscape. In the examples we studied, a relative minority of energy +minima are distributed in a nontrivial way, corresponding to the lowest energy +densities. On the other hand, very high-index saddles begin exhibit RSB at much +higher energy densities, on the order of the energy densities associated with +RSB transitions in equilibrium. More wore is necessary to explore this +connection, as well as whether a purely \emph{geometric} explanation can be +made for the algorithmic threshold. Applying this method to the most realistic +RSB scenario for structural glasses, the so-called 1FRSB which has features of +both 1RSB and FRSB, might yield insights about signatures that should be +present in the landscape. + +\paragraph{Acknowledgements} +The authors would like to thank Valentina Ros for helpful discussions. + +\paragraph{Funding information} +JK-D and JK are supported by the Simons Foundation Grant No.~454943. + +\bibliography{frsb_kac-rice} + +\end{document} |