From b02db1ba7786ffff0ababd07af25b1679264fd67 Mon Sep 17 00:00:00 2001 From: Jaron Kent-Dobias Date: Thu, 24 Aug 2023 12:05:15 +0200 Subject: Reflowed paragraphs. --- when_annealed.tex | 242 +++++++++++++++++++++++++++++------------------------- 1 file changed, 128 insertions(+), 114 deletions(-) diff --git a/when_annealed.tex b/when_annealed.tex index b65f4d0..b07e3cd 100644 --- a/when_annealed.tex +++ b/when_annealed.tex @@ -55,14 +55,15 @@ Random high-dimensional energies, cost functions, and interaction networks are important across disciplines: the energy landscape of glasses, the likelihood landscape of machine learning and inference, and the interactions between -organisms in an ecosystem are just a few examples \cite{Stein_1995_Broken, Krzakala_2007_Landscape, Altieri_2021_Properties, Yang_2023_Stochastic}. A traditional tool for -making sense of their behavior is to analyze the statistics of points where -their dynamics are stationary \cite{Cavagna_1998_Stationary, -Fyodorov_2004_Complexity, Fyodorov_2007_Density, Bray_2007_Statistics}. For -energy or cost landscapes, these correspond to the minima, maxima, and saddles, -while for ecosystems and other non-gradient dynamical systems these correspond -to equilibria of the dynamics. When many stationary points are present, the -system is considered complex. +organisms in an ecosystem are just a few examples \cite{Stein_1995_Broken, +Krzakala_2007_Landscape, Altieri_2021_Properties, Yang_2023_Stochastic}. A +traditional tool for making sense of their behavior is to analyze the +statistics of points where their dynamics are stationary +\cite{Cavagna_1998_Stationary, Fyodorov_2004_Complexity, Fyodorov_2007_Density, +Bray_2007_Statistics}. For energy or cost landscapes, these correspond to the +minima, maxima, and saddles, while for ecosystems and other non-gradient +dynamical systems these correspond to equilibria of the dynamics. When many +stationary points are present, the system is considered complex. 
Despite the importance of stationary point statistics for understanding complex behavior, they are often calculated using an uncontrolled approximation. @@ -70,34 +71,34 @@ Because their number is so large, it cannot be reliably averaged. The annealed approximation takes this average anyway, risking a systematic bias by rare and atypical samples. The annealed approximation is known to be exact for certain models and in certain circumstances, but it is used outside those circumstances -without much reflection \cite{Wainrib_2013_Topological, Kent-Dobias_2021_Complex, -Gershenzon_2023_On-Site}. In a few cases researchers have instead made the -better-controlled quenched average, which averages the logarithm of the number -of stationary points, and find deviations from the annealed approximation with -important implications for behavior \cite{Cavagna_1999_Quenched, Crisanti_2006_Spherical, Muller_2006_Marginal, -Ros_2019_Complex, Kent-Dobias_2023_How, Ros_2023_Quenched, Ros_2023_Generalized}. Generically, -the annealed approximation to the complexity is wrong when a nonvanishing -fraction of pairs of stationary points have nontrivial correlations in their -mutual position. +without much reflection \cite{Wainrib_2013_Topological, + Kent-Dobias_2021_Complex, Gershenzon_2023_On-Site}. In a few cases + researchers have instead made the better-controlled quenched average, which + averages the logarithm of the number of stationary points, and find + deviations from the annealed approximation with important implications for + behavior \cite{Cavagna_1999_Quenched, Crisanti_2006_Spherical, + Muller_2006_Marginal, Ros_2019_Complex, Kent-Dobias_2023_How, +Ros_2023_Quenched, Ros_2023_Generalized}. Generically, the annealed +approximation to the complexity is wrong when a nonvanishing fraction of pairs +of stationary points have nontrivial correlations in their mutual position. 
A heuristic line of reasoning for the appropriateness of the annealed approximation is sometimes made when the approximation is correct for an equilibrium calculation on the same system. The argument goes like this: since -the limit of zero temperature in an equilibrium calculation -concentrates the Boltzmann measure onto the lowest set of minima, the equilibrium free -energy in the limit to zero temperature will be governed by the same -statistics as the count of that lowest set of minima. This argument is strictly -valid only for the lowest minima, which at least in glassy problems are -rarely relevant to dynamical behavior. What about the \emph{rest} of the -stationary points? +the limit of zero temperature in an equilibrium calculation concentrates the +Boltzmann measure onto the lowest set of minima, the equilibrium free energy in +the limit to zero temperature will be governed by the same statistics as the +count of that lowest set of minima. This argument is strictly valid only for +the lowest minima, which at least in glassy problems are rarely relevant to +dynamical behavior. What about the \emph{rest} of the stationary points? In this paper, we show that the behavior of the ground state, or \emph{any} equilibrium behavior, does not govern whether stationary points will have a correct annealed average. In a prototypical family of models of random -functions, we determine a condition for when annealed averages -should fail and some stationary points will have nontrivial correlations in their -mutual position. We produce examples of models whose equilibrium is guaranteed -to never see such correlations between thermodynamic states, but where a +functions, we determine a condition for when annealed averages should fail and +some stationary points will have nontrivial correlations in their mutual +position. 
We produce examples of models whose equilibrium is guaranteed to +never see such correlations between thermodynamic states, but where a population of saddle points is nevertheless correlated. We study the mixed spherical models, which are models of Gaussian-correlated @@ -113,8 +114,10 @@ Specifying the covariance function $f$ uniquely specifies the model. The series coefficients of $f$ need to be nonnnegative in order for $f$ to be a well-defined covariance. The case where $f$ is a homogeneous polynomial has been extensively studied, and corresponds to the pure spherical models of glass -physics or the spiked tensor models of statistical inference \cite{Castellani_2005_Spin-glass}. Here our examples will be models with $f(q)=\frac12\big(\lambda q^3+(1-\lambda)q^s\big)$ for -$\lambda\in(0,1)$, called $3+s$ models.\footnote{ +physics or the spiked tensor models of statistical inference +\cite{Castellani_2005_Spin-glass}. Here our examples will be models with +$f(q)=\frac12\big(\lambda q^3+(1-\lambda)q^s\big)$ for $\lambda\in(0,1)$, +called $3+s$ models.\footnote{ Though the examples and discussion will focus on the $3+s$ models, most formulas (including the principal result in \eqref{eq:condition}) are valid for arbitrary covariance functions $f$ under the condition that $f'(0)=0$, i.e., @@ -127,12 +130,12 @@ $\lambda\in(0,1)$, called $3+s$ models.\footnote{ trivial overlap $q_0$ is also important in situations where a deterministic field (or spike) is present, as in \cite{Ros_2019_Complex}, but deterministic fields are likewise not considered here. -}These are examples of \emph{mixed} -spherical models, which have been studied in the physics and statistics -literature and host a zoo of complex orders and phase transitions -\cite{Crisanti_2004_Spherical, Crisanti_2006_Spherical, -Krakoviack_2007_Comment, Crisanti_2007_Amorphous-amorphous, -Crisanti_2011_Statistical, BenArous_2019_Geometry, Subag_2020_Following, ElAlaoui_2020_Algorithmic}. 
+}These are examples of \emph{mixed} spherical models, which have been studied +in the physics and statistics literature and host a zoo of complex orders and +phase transitions \cite{Crisanti_2004_Spherical, Crisanti_2006_Spherical, + Krakoviack_2007_Comment, Crisanti_2007_Amorphous-amorphous, +Crisanti_2011_Statistical, BenArous_2019_Geometry, Subag_2020_Following, +ElAlaoui_2020_Algorithmic}. There are several well-established results on the equilibrium of this model. @@ -274,13 +277,13 @@ where $\Delta x=1-x$ and -\log\left(\left|\frac{\mu}{\mu_\text m}\right|-\sqrt{\big(\frac\mu{\mu_\text m}\big)^2-1}\right) & \mu^2>\mu_\text m^2 \end{cases} \end{equation} -The details of the derivation of these expressions can be found in \cite{Kent-Dobias_2023_How}. -The extremal problem in $\hat\beta$, $r_\mathrm d$, $r_1$, $d_\mathrm d$, and -$d_1$ has a unique solution and can be found explicitly, but the resulting -formula is unwieldy. The action can have multiple extrema, but the one for which the complexity is -\emph{smallest} gives the correct solution. There is always a solution for -$x=1$ which is independent of $q_1$, corresponding to the replica symmetric -case, and with $\Sigma_\mathrm +The details of the derivation of these expressions can be found in +\cite{Kent-Dobias_2023_How}. The extremal problem in $\hat\beta$, $r_\mathrm +d$, $r_1$, $d_\mathrm d$, and $d_1$ has a unique solution and can be found +explicitly, but the resulting formula is unwieldy. The action can have multiple +extrema, but the one for which the complexity is \emph{smallest} gives the +correct solution. There is always a solution for $x=1$ which is independent of +$q_1$, corresponding to the replica symmetric case, and with $\Sigma_\mathrm a(E,\mu)=\mathcal S_{\oldstylenums1\textsc{rsb}}(E,\mu\mid q_1,1)$. The crux of this paper will be to determine when this solution is not the global one. @@ -300,8 +303,9 @@ a(E,\mu)=0$. 
Going along this line in the replica symmetric solution, the {\oldstylenums1}\textsc{rsb} complexity transitions at a critical point where $x=q_1=1$ \cite{Kent-Dobias_2023_How}. Since all the parameters in the bifurcating solution are known at this point, we can search for it by looking -for a flat direction. In the annealed solution for -points describing saddles (with $\mu^2\leq\mu_\mathrm m^2$ and therefore the simpler form of \eqref{eq:hess.term}), this line is +for a flat direction. In the annealed solution for points describing saddles +(with $\mu^2\leq\mu_\mathrm m^2$ and therefore the simpler form of +\eqref{eq:hess.term}), this line is \begin{equation} \label{eq:extremal.line} \mu_0=-\frac{2Ef'f''}{z_f}-\sqrt{\frac{2f''u_f}{z_f^2}\bigg(\log\frac{f''}{f'}z_f-E^2(f''-f')\bigg)} \end{equation} @@ -317,12 +321,13 @@ elsewhere) the constants y_f=f'(f'-f)+f''f \\ z_f&=f(f''-f')+f'^2 \notag \end{align} -When $f$ and its derivatives appear without an argument, the implied argument is always 1, so, e.g., $f'\equiv f'(1)$. -If $f$ has at least two nonzero coefficients at second order or higher, all of -these constants are positive. Though in figures we focus on the lower branch of -saddles, another set of identical solutions always exists for $(E,\mu)\mapsto(-E,-\mu)$. -We also define $E_\textrm{min}$, the minimum energy at which saddle points with -an extensive number of downward directions are found, as the energy for which +When $f$ and its derivatives appear without an argument, the implied argument +is always 1, so, e.g., $f'\equiv f'(1)$. If $f$ has at least two nonzero +coefficients at second order or higher, all of these constants are positive. +Though in figures we focus on the lower branch of saddles, another set of +identical solutions always exists for $(E,\mu)\mapsto(-E,-\mu)$. 
We also define +$E_\textrm{min}$, the minimum energy at which saddle points with an extensive +number of downward directions are found, as the energy for which $\mu_0(E_\mathrm{min})=\mu_\mathrm m$. Let $M$ be the matrix of double partial derivatives of the action with @@ -369,16 +374,18 @@ $e$, $g$, and $h$ are given by \caption{ Stationary point statistics as a function of energy density $E$ and - stability $\mu$ for a model with $f(q)=\frac12(\frac12q^3+\frac12q^5)$. The dashed black - line shows the line of zero annealed complexity and - enclosed inside the annealed complexity is positive. The solid black line (only visible in the inset) gives the line of zero {\oldstylenums1\textsc{rsb}} complexity. The red region (blown - up in the inset) shows where the annealed complexity gives the wrong count - and a {\oldstylenums1}\textsc{rsb} complexity in necessary. The red points - show where $\det M=0$. The left point, which is only an upper bound on the - transition, coincides with it in this case. The gray shaded region - highlights the minima, which are stationary points with $\mu\geq\mu_\mathrm - m$. $E_\textrm{min}$ is marked on the plot as the lowest energy at which - saddles of extensive index are found. + stability $\mu$ for a model with $f(q)=\frac12(\frac12q^3+\frac12q^5)$. The + dashed black line shows the line of zero annealed complexity and enclosed + inside the annealed complexity is positive. The solid black line (only + visible in the inset) gives the line of zero {\oldstylenums1\textsc{rsb}} + complexity. The red region (blown up in the inset) shows where the annealed + complexity gives the wrong count and a {\oldstylenums1}\textsc{rsb} + complexity is necessary. The red points show where $\det M=0$. The left + point, which is only an upper bound on the transition, coincides with it in + this case. The gray shaded region highlights the minima, which are + stationary points with $\mu\geq\mu_\mathrm m$.
$E_\textrm{min}$ is marked + on the plot as the lowest energy at which saddles of extensive index are + found. } \label{fig:complexity_35} \end{figure} @@ -419,15 +426,16 @@ proportional to -2\log^2\frac{f''}{f'}f'^2f''v_f \end{aligned} \end{equation} -If $G_f>0$, then the bifurcating solutions exist, and there are some saddles whose -complexity is corrected by a {\oldstylenums1\textsc{rsb}} solution. -Therefore, $G_f>0$ is a sufficient condition to see at least {\oldstylenums1}\textsc{rsb} in the -complexity. If $G_f<0$, then there is nowhere along the extremal line where -saddles can be described by such a complexity, but this does not definitively -rule out \textsc{rsb}: the model may be unstable to different \textsc{rsb} -orders, or its phase boundary may simply not have a critical point on the extremal line. We -discuss the former possibility later in the paper. The range of $3+s$ models where $G_f$ is positive is -shown in Fig.~\ref{fig:phases}. +If $G_f>0$, then the bifurcating solutions exist, and there are some saddles +whose complexity is corrected by a {\oldstylenums1\textsc{rsb}} solution. +Therefore, $G_f>0$ is a sufficient condition to see at least +{\oldstylenums1}\textsc{rsb} in the complexity. If $G_f<0$, then there is +nowhere along the extremal line where saddles can be described by such a +complexity, but this does not definitively rule out \textsc{rsb}: the model may +be unstable to different \textsc{rsb} orders, or its phase boundary may simply +not have a critical point on the extremal line. We discuss the former +possibility later in the paper. The range of $3+s$ models where $G_f$ is +positive is shown in Fig.~\ref{fig:phases}. \begin{figure*} \centering @@ -488,23 +496,25 @@ extended from $E_{\oldstylenums1\textsc{rsb}}^+$. \caption{ Examples of $3+14$ models where the critical point - $E_{\oldstylenums1\textsc{rsb}}^-$ (Top) is the lower bound and (Bottom) is not the lower bound - of energies where \textsc{rsb} saddles are found. 
In both plots the red dot - shows $E_{\oldstylenums1\textsc{rsb}}^-$, while the solid red lines show - the transition boundary the \textsc{rs} and {\oldstylenums1\textsc{rsb}} complexity. The dashed black - line shows the \textsc{rs} zero complexity line, while the solid black line - shows the {\oldstylenums1}\textsc{rsb} zero complexity line. The dashed red - lines show the spinodals of the {\oldstylenums1\textsc{rsb}} phases. The - dotted red line shows a discontinuous phase transition between different - {\oldstylenums1}\textsc{rsb} phases. \textbf{Top:} $\lambda=0.67$. - The transition line that begins at - $E_{\oldstylenums1\textsc{rsb}}^+$ does not intersect - $E_{\oldstylenums1\textsc{rsb}}^-$ but terminates at a higher energy. - $E_{\oldstylenums1\textsc{rsb}}^-$ is a lower bound on the energy of \textsc{rsb} saddles. There are two competing {\oldstylenums1\textsc{rsb}} phases among saddles. - \textbf{Bottom:} $\lambda=0.69$. The transition line that - begins at $E_{\oldstylenums1\textsc{rsb}}^+$ terminates at a lower energy - than $E_{\oldstylenums1\textsc{rsb}}^-$, and therefore its terminus defines - the lower bound. + $E_{\oldstylenums1\textsc{rsb}}^-$ (Top) is the lower bound and (Bottom) is + not the lower bound of energies where \textsc{rsb} saddles are found. In + both plots the red dot shows $E_{\oldstylenums1\textsc{rsb}}^-$, while the + solid red lines show the transition boundary between the \textsc{rs} and + {\oldstylenums1\textsc{rsb}} complexity. The dashed black line shows the + \textsc{rs} zero complexity line, while the solid black line shows the + {\oldstylenums1}\textsc{rsb} zero complexity line. The dashed red lines + show the spinodals of the {\oldstylenums1\textsc{rsb}} phases. The dotted + red line shows a discontinuous phase transition between different + {\oldstylenums1}\textsc{rsb} phases. \textbf{Top:} $\lambda=0.67$.
The + transition line that begins at $E_{\oldstylenums1\textsc{rsb}}^+$ does not + intersect $E_{\oldstylenums1\textsc{rsb}}^-$ but terminates at a higher + energy. $E_{\oldstylenums1\textsc{rsb}}^-$ is a lower bound on the energy + of \textsc{rsb} saddles. There are two competing + {\oldstylenums1\textsc{rsb}} phases among saddles. \textbf{Bottom:} + $\lambda=0.69$. The transition line that begins at + $E_{\oldstylenums1\textsc{rsb}}^+$ terminates at a lower energy than + $E_{\oldstylenums1\textsc{rsb}}^-$, and therefore its terminus defines the + lower bound. } \label{fig:order} \end{figure} @@ -518,18 +528,19 @@ Consider a specific $H$ with \end{aligned} \end{equation} where the interaction tensors $J$ are drawn from zero-mean normal distributions -with $\overline{(J^{(p)})^2}=p!/2N^{p-1}$ and likewise for $J^{(s)}$. Functions $H$ defined this way have the covariance -property \eqref{eq:covariance} with $f(q)=\frac12\big(\lambda -q^p+(1-\lambda)q^s\big)$. With the $J$s drawn in this way and fixed for $p=3$ -and $s=14$, we can vary $\lambda$, and according to Fig.~\ref{fig:phases} we -should see a transition in the type of order at the ground state. What causes -the change? Our analysis indicates that stationary points with the required -order \emph{already exist in the landscape} as unstable saddles for small -$\lambda$, then eventually stabilize into metastable minima and finally become -the lowest lying states. This is different from the picture of existing -uncorrelated low-lying states splitting apart into correlated clusters. Where -uncorrelated stationary points do appear to split apart, when $\lambda$ is -decreased from large values, is among saddles, not minima. +with $\overline{(J^{(p)})^2}=p!/2N^{p-1}$ and likewise for $J^{(s)}$. Functions +$H$ defined this way have the covariance property \eqref{eq:covariance} with +$f(q)=\frac12\big(\lambda q^p+(1-\lambda)q^s\big)$. 
With the $J$s drawn in this +way and fixed for $p=3$ and $s=14$, we can vary $\lambda$, and according to +Fig.~\ref{fig:phases} we should see a transition in the type of order at the +ground state. What causes the change? Our analysis indicates that stationary +points with the required order \emph{already exist in the landscape} as +unstable saddles for small $\lambda$, then eventually stabilize into metastable +minima and finally become the lowest lying states. This is different from the +picture of existing uncorrelated low-lying states splitting apart into +correlated clusters. Where uncorrelated stationary points do appear to split +apart, when $\lambda$ is decreased from large values, is among saddles, not +minima. A similar analysis can be made for other mixed models, like the $2+s$, which should see complexities with other forms of \textsc{rsb}. For instance, in @@ -545,24 +556,27 @@ $s>2$, this transition line \emph{always} intersects the extremal line \eqref{eq:extremal.line}, and so \textsc{rsb} complexity will always be found among some population of stationary points. However, it is likely that for much of the parameter space the so-called one-full \textsc{rsb} -({\oldstylenums1\textsc{frsb}}), rather than \textsc{frsb}, is the correct solution, as it likely is for -large $s$ and certain $\lambda$ in the $3+s$ models studied here. Further work to find the conditions for -transitions of the complexity to {\oldstylenums1\textsc{frsb}} and {\oldstylenums2\textsc{frsb}} is necessary. For values -of $s$ where there is trivial \textsc{rsb} in the ground state, we -expect that the {\oldstylenums1\textsc{rsb}} complexity is correct. +({\oldstylenums1\textsc{frsb}}), rather than \textsc{frsb}, is the correct +solution, as it likely is for large $s$ and certain $\lambda$ in the $3+s$ +models studied here. Further work to find the conditions for transitions of the +complexity to {\oldstylenums1\textsc{frsb}} and {\oldstylenums2\textsc{frsb}} +is necessary. 
For values of $s$ where there is trivial \textsc{rsb} in the +ground state, we expect that the {\oldstylenums1\textsc{rsb}} complexity is +correct. What are the implications for dynamics? We find that nontrivial correlations -tend to exist among saddle points with the largest or smallest possible index at -a given energy density, which are quite atypical in the landscape. However, +tend to exist among saddle points with the largest or smallest possible index +at a given energy density, which are quite atypical in the landscape. However, these strangely correlated saddle points must descend to uncorrelated minima, which raises questions about whether structure on the boundary of a basin of attraction is influential to the dynamics that descends into that basin. These -saddles might act as early-time separatrices for descent trajectories of certain algorithms. With -open problems in even the gradient decent dynamics on these models (itself attracted to an atypical subset of marginal minima), it +saddles might act as early-time separatrices for descent trajectories of +certain algorithms. With open problems in even the gradient descent dynamics on +these models (itself attracted to an atypical subset of marginal minima), it remains to be seen whether such structures could be influential -\cite{Folena_2020_Rethinking, Folena_2021_Gradient, Folena_2023_On}. This structure among saddles -cannot be the only influence, since it seems that the $3+4$ model is `safe' -from nontrivial \textsc{rsb} among saddles. +\cite{Folena_2020_Rethinking, Folena_2021_Gradient, Folena_2023_On}. This +structure among saddles cannot be the only influence, since it seems that the +$3+4$ model is `safe' from nontrivial \textsc{rsb} among saddles. We have determined the conditions under which the complexity of the mixed $3+s$ spherical models has different quenched and annealed averages, as the result of @@ -570,9 +584,9 @@ nontrivial correlations between stationary points.
We saw that these conditions can arise among certain populations of saddle points even when the model is guaranteed to lack such correlations between equilibrium states, and exist for saddle points at a wide range of energies. This suggests that studies making -complexity calculations cannot reliably use equilibrium behavior to defend -the annealed approximation. Our result has direct implications for the -geometry of these landscapes, and perhaps could be influential to certain +complexity calculations cannot reliably use equilibrium behavior to defend the +annealed approximation. Our result has direct implications for the geometry of +these landscapes, and perhaps could be influential to certain out-of-equilibrium dynamics. -- cgit v1.2.3-70-g09d2