\documentclass[fleqn,a4paper]{article} \usepackage[utf8]{inputenc} % why not type "Bézout" with unicode? \usepackage[T1]{fontenc} % vector fonts plz \usepackage{fullpage,amsmath,amssymb,latexsym,graphicx} \usepackage{newtxtext,newtxmath} % Times for PR \usepackage{appendix} \usepackage[dvipsnames]{xcolor} \usepackage[ colorlinks=true, urlcolor=MidnightBlue, citecolor=MidnightBlue, filecolor=MidnightBlue, linkcolor=MidnightBlue ]{hyperref} % ref and cite links with pretty colors \usepackage[ style=phys, eprint=true, maxnames = 100 ]{biblatex} \usepackage{anyfontsize,authblk} \addbibresource{when_annealed.bib} \begin{document} \title{ When is the average number of saddles in a function typical? } \author{Jaron Kent-Dobias} \affil{Istituto Nazionale di Fisica Nucleare, Sezione di Roma I} \maketitle \begin{abstract} A common measure of the complexity of a function is the count of its stationary points. For complicated functions, this count often grows exponentially with the volume and dimension of their domain. In practice, the count is averaged over a class of such functions (the annealed average), but the large numbers involved can result in averages biased by extremely rare samples. Typical counts are reliably found by taking the average of the logarithm (the quenched average), which is more costly and not often done in practice. When most stationary points are uncorrelated with each other, quenched and anneals averages are equal. There are heuristics from equilibrium calculations that guarantee when most of the lowest minima will be uncorrelated. Here, we show that these equilibrium heuristics cannot be used to draw conclusions about other minima and saddles. We produce examples among Gaussian-correlated functions on the hypersphere where the count of certain saddles and minima has different quenched and annealed averages, despite being guaranteed `safe' in the equilibrium setting. We produce the necessary conditions for the emergence of nontrivial correlations between saddles. We discuss the implications for the geometry of those functions, and in what out-of-equilibrium settings might show a signature of this. \end{abstract} Random energies, cost functions, and interaction networks are important to many areas of modern science. The energy landscape of glasses, the likelihood landscape of machine learning and high-dimensional inference, and the interactions between organisms in an ecosystem are just a few examples. A traditional tool for making sense of their behavior is to analyze the statistics of points where their dynamics are stationary. For energy or cost landscapes, these correspond to the minima, maxima, and saddles, while for ecosystems and other non-gradient dynamical systems these correspond to equilibria of the dynamics. When many stationary points are present, the system is considered complex. Despite the importance of stationary point statistics for understanding complex behavior, they are often calculated using an uncontrolled approximation. Because their number is so large, it cannot be reliably averaged. The annealed approximation takes this average anyway, risking a systematic bias by rare and atypical samples. The annealed approximation is known to be exact for certain models and in certain circumstances, but it is used outside those circumstances without much reflection. In a few cases researches have made instead the better-controlled quenched average, which averages the logarithm of the number of stationary points \cite{Ros_2019_Complexity, Kent-Dobias_2023_How, Ros_2023_Quenched}. A heuristic line of reasoning for the correctness of the annealed approximation for the statistics of stationary points is sometimes made when the annealed approximation is correct for an equilibrium calculation on the same system. The argument goes like this: since the limit of zero temperature or noise in an equilibrium calculation concentrates the measure onto the lowest set of minima, the equilibrium free energy in the limit to zero temperature should be governed by the same statistics as the count of that lowest set of minima. This argument is valid, but only for the lowest set of minima, which at least in glassy problems are rarely relevant to dynamical behavior. What about the \emph{rest} of the stationary points? In this paper, we show that the behavior of the ground state, or \emph{any} equilibrium behavior, does not govern whether stationary points will have a correct annealed average. In a prototypical family of models of random functions, we calculate the conditions for when annealed averages should fail and stationary points should have nontrivial correlations in their mutual position. We produce examples of models whose equilibrium is guaranteed to never see correlations between states, but where a population of saddle points is correlated. We study the mixed spherical models, which are models of Gaussian-correlated random functions with isotropic statistics on the $(N-1)$-sphere. Each class of functions $H:S^{N-1}\to\mathbb R$ is defined by the average covariance between the function evaluated at two different points $\pmb\sigma_1,\pmb\sigma_2\in S^{N-1}$, which is a function of the scalar product or overlap between the two configurations: \begin{equation} \label{eq:covariance} \overline{H(\pmb\sigma_1)H(\pmb\sigma_2)}=\frac1Nf\bigg(\frac{\pmb\sigma_1\cdot\pmb\sigma_2}N\bigg) \end{equation} Specifying the covariance function $f$ uniquely specifies the class of functions. The series coefficients of $f$ need to be nonnnegative in order for $f$ to be a well-defined covariance. The case where $f$ is a homogeneous polynomial has been extensively studied, and corresponds to the pure spherical models of glass physics or the spiked tensor models of statistical inference. Here we will study cases where $f(q)=\frac12\big(\lambda q^p+(1-\lambda)q^s\big)$ for $\lambda\in(0,1)$. These are examples of \emph{mixed} spherical models. These classes of functions have been extensively studied in the physics and statistics literature, and host a zoo of complex orders and phase transitions \cite{Crisanti_2004_Spherical, Crisanti_2006_Spherical, Crisanti_2011_Statistical}. There are two important features which differentiate stationary points $\pmb\sigma^*$ in the spherical models: their \emph{energy density} $E=\frac1NH(\pmb\sigma^*)$ and their \emph{stability} $\mu=\frac1N\operatorname{\mathrm{Tr}}\operatorname{\mathrm{Hess}}H(\pmb\sigma^*)$. The energy density should be familiar, as the `height' in the landscape. The stability is so-called because it governs the spectrum of the stationary point. In each spherical model, the spectrum of each stationary point is a Wigner semicircle of the same width $\mu_\mathrm m=\sqrt{4f''(1)}$, but shifted by constant. The stability $\mu$ sets this constant shift. When $\mu<\mu_\mathrm m$, the spectrum still has support over zero and we have saddles with an extensive number of downward directions. When $\mu>\mu_\mathrm m$ the spectrum has support only over positive eigenvalues, and we have stable minima. When $\mu=\mu_\mathrm m$, the spectrum has a pseudogap, and we have marginal minima. \begin{figure} \centering \includegraphics{figs/phases_34.pdf} \caption{ A phase diagram of the boundaries we discuss in this paper for the $3+s$ model with $f=\frac12\big(\lambda q^3+(1-\lambda)q^s\big)$. The blue region shows where there exist some stationary points whose complexity is {\oldstylenums1}\textsc{rsb}. The yellow region shows where $f$ is not convex and therefore \textsc{rsb} solutions are possible in equilibrium. The green region shows where \textsc{rsb} solutions are correct at the ground state, adapted from \cite{Auffinger_2022_The}. } \label{fig:phases} \end{figure} The number $\mathcal N(E,\mu)$ of stationary points with energy density $E$ and stability $\mu$ is exponential in $N$ for these models. Their complexity $\Sigma(E,\mu)$ is defined by the average of the logarithm of their number, or $\Sigma(E,\mu)=\frac1N\overline{\log\mathcal N(E,\mu)}$. More often the annealed complexity is calculated, where the average is taken before the logarithm: $\Sigma_\mathrm a(E,\mu)=\frac1N\log\overline{\mathcal N(E,\mu)}$. When the complexity is calculated using the Kac--Rice formula and a physicists' tool set, the problem is reduced to the evaluation of an integral by the saddle point method for large $N$ \cite{Kent-Dobias_2023_How}. An ansatz for the complexity needs to be made. Here we restrict ourselves to a {\oldstylenums1}\textsc{rsb} ansatz, which is parameterized by two quantities: $q_1$ and $x$. They have a geometric interpretation: given a stationary point fixed with certain properties, $1-x$ corresponds to the proportion of other stationary points with the same properties that are correlated with it, and $q_1$ gives the overlap that this correlated population has with it. In the annealed or replica-symmetric case, $x=1$ and all but a vanishing fraction of stationary points are uncorrelated with each other. We are guaranteed that $\Sigma\leq\Sigma_{\oldstylenums1\textsc{rsb}}\leq\Sigma_\mathrm a$, and we will discuss later in what settings the {\oldstylenums1}\textsc{rsb} complexity is correct. The complexity is given by extremizing an effective action, \begin{equation} \Sigma_{\oldstylenums1\textsc{rsb}}(E,\mu)=\lim_{n\to0}\int dq_1\,dx\,\mathcal S(q_1,x\mid E,\mu)e^{nN\mathcal S(q_1,x\mid E,\mu)} =\mathop{\mathrm{extremum}}_{q_1,x}\mathcal S(q_1,x\mid E,\mu) \end{equation} for the action $\mathcal S$ given by \begin{equation} \begin{aligned} \mathcal S&(q_1,x\mid E,\mu) =\mathcal D(\mu) +\mathop{\textrm{extremum}}_{\hat\beta,r_\mathrm d,r_1,d_\mathrm d,d_1} \Bigg\{ \hat\beta E-r_\mathrm d\mu\\ &+\frac12\bigg[ \hat\beta^2\big[f(1)-\Delta xf(q_1)\big] +(2\hat\beta r_\mathrm d-d_\mathrm d)f'(1) -\Delta x(2\hat\beta r_1-d_1)f'(q_1) +r_\mathrm d^2f''(1)-\Delta x\,r_1^2f''(q_1) \\ &+\frac1x\log\Big( \big(r_\mathrm d-\Delta x\,r_1\big)^2+d_\mathrm d\big(1-\Delta x\,q_1\big)-\Delta x\,d_1\big(1-\Delta xq_1\big) \Big) -\frac{\Delta x}x\log\Big( (r_\mathrm d-r_1)^2+(d_\mathrm d-d_1)(1-q_1) \Big) \bigg] \Bigg\} \end{aligned} \end{equation} where $\Delta x=1-x$ and \begin{equation} \mathcal D(\mu) =\begin{cases} \frac12+\log\left(\frac12\mu_\text m\right)+\frac{\mu^2}{\mu_\text m^2} & \mu^2\leq\mu_\text m^2 \\ \frac12+\log\left(\frac12\mu_\text m\right)+\frac{\mu^2}{\mu_\text m^2} -\left|\frac{\mu}{\mu_\text m}\right|\sqrt{\big(\frac\mu{\mu_\text m}\big)^2-1} -\log\left(\left|\frac{\mu}{\mu_\text m}\right|-\sqrt{\big(\frac\mu{\mu_\text m}\big)^2-1}\right) & \mu^2>\mu_\text m^2 \end{cases} \end{equation} The extremal problem in $\hat\beta$, $r_\mathrm d$, $r_1$, $d_\mathrm d$, and $d_1$ has a unique solution and can be found explicitly, but the resulting formula is much more complicated so we do not include it here. There can be multiple extrema at which to evaluate $\mathcal S$, in this case the one for which $\Sigma$ is \emph{smallest} gives the correct solution. There is always a solution for $x=1$ which is independent of $q_1$, which corresponds to the replica symmetric case and which is equal to the annealed calculation, so $\Sigma_\mathrm a(E,\mu)=\mathcal S(E,\mu\mid q_1,1)$. The crux of this paper will be to determine when this solution is not the global one. It isn't accurate to say that a solution to the saddle point equations is `stable' or `unstable.' The problem of solving the complexity in this way is not a variational problem, so there is nothing to be maximized or minimized, and in general even global solutions have positive and negative eigenvalues of the Hessian. However, the eigenvalues of the Hessian can still tell us something about the emergence of new solutions: when another solution bifurcates smoothly from an existing one, the Hessian evaluated at that point will have a zero eigenvalue. Unfortunately this is a difficult procedure to apply in general, since one must know the parameters in the new solution, and some parameters, e.g., $q_1$, are unconstrained in the old solution. There is one place where one can consistently search for a bifurcating solution to the saddle point equations: along the zero complexity line $\Sigma(E,\mu)=0$. Going along this line in the replica symmetric solution, the {\oldstylenums1}\textsc{rsb} complexity transitions at a critical point where $x=q_1=1$ \cite{Kent-Dobias_2023_How}. Since all the parameters in the bifurcating solution are known at this point, one can search for it by looking for a zero eigenvalue in the way described above. In the replica symmetric solution for points describing saddles, this line is \begin{equation} \label{eq:extremal.line} \mu_0=-\frac1{z_f}\left(2Ef'f''+\sqrt{2f''u_f\bigg(\log\frac{f''}{f'}z_f-E^2(f''-f')\bigg)}\right) \end{equation} where we define for brevity (here and elsewhere) the constants \begin{align} u_f=f(f'+f'')-f'^2 && v_f=f'(f''+f''')-f''^2 \\ w_f=2f''(f''-f')+f'f''' && y_f=f'(f'-f)+f''f && z_f=f(f''-f')+f'^2 \end{align} Note that for $f$ to define a sensible covariance, all of its series coefficients must be nonnegative. If $f$ has at least two nonzero coefficients at second order or higher, all of these constants are positive. Let $M$ be the matrix of double partial derivatives of $\mathcal S$ with respect to $q_1$ and $x$. We evaluate $M$ at the replica symmetric saddle point $x=1$ with the additional constraint that $q_1=1$ and along the extremal complexity line \eqref{eq:extremal.line}. We determine when a zero eigenvalue appears, indicated the presence of a bifurcating {\oldstylenums1}\textsc{rsb} solution, by solving $0=\det M$. We find \begin{equation} \det M=-\left(\frac{\partial^2\mathcal S}{\partial q_1\partial x}\bigg|_{\substack{x=1\\q_1=1}}\right)^2\propto(ay^2+bE^2+2cyE-d)^2 \end{equation} where $y=-\frac12z_f\mu-f'f''E$ is proportional to the square-root term in \eqref{eq:extremal.line} and the constants $a$, $b$, $c$, and $d$ are defined below. Changing to $y$ is a convenient choice because the branch of \eqref{eq:extremal.line} is chosen by the sign of $y$ (the lower-energy branch we are interested in corresponds with $y>0$) and the relationship between $y$ and $E$ on the extremal line is $g=2hy^2+eE^2$, where the constants $e$, $g$, and $h$ are given by \begin{equation} \begin{aligned} a&=\frac{w_f\big(3y_f^2-4ff'f''(f'-f)\big)-6y_f^2(f''-f')f''}{(u_fz_ff'')^2f'} \qquad b=\frac{f'w_f}{z_f^2} \\ c&=\frac{w_f}{f''z_f^2} \qquad d=\frac{w_f}{f'f''} \qquad e=f''-f' \qquad g=z_f\log\frac{f''}{f'} \qquad h=\frac1{f''u_f} \end{aligned} \end{equation} The solutions for $\det M=0$ correspond to energies that satisfy \begin{equation} E_{\oldstylenums1\textsc{rsb}}^\pm =\operatorname{sign}(bg-de)\frac{-cg\pm\sqrt{c^2g^2+(2dh-ag)(bg-de)}} { \sqrt{2c^2eg+(2bh-ae)(bg-de)\mp2ce\sqrt{c^2g^2+(2dh-ag)(bg-de)}} } \end{equation} The expression inside the inner square root is proportional to \begin{equation} G_f = f'\log\frac{f''}{f'}\big[ 3y_f(f''-f')f'''-2(f'-2f)f''w_f \big] -2(f''-f')u_fw_f -2\log^2\frac{f''}{f'}f'^2f''v_f \end{equation} If $G_f>0$, then there are two points along the extremal complexity line where a solution bifurcates, and a new line of {\oldstylenums1}\textsc{rsb} solutions between them. Therefore, $G_f>0$ is a necessary condition to see {\oldstylenums1}\textsc{rsb} in the complexity. \begin{figure} \centering \includegraphics{figs/complexity_35.pdf} \caption{ Stationary point statistics as a function of energy density $E$ and stability $\mu$ for a $3+5$ model with $\lambda=\frac12$. The dashed black line shows the line of zero complexity, where stationary points vanish, and enclosed inside they are found in exponential number. The red region (blown up in the inset) shows where the annealed complexity gives the wrong count and a {\oldstylenums1}\textsc{rsb} complexity in necessary. The red points show where $\det M=0$. The left point, which is only an upper bound on the transition, coincides with it in this case. The gray shaded region highlights the minima, which are stationary points with $\mu>\mu_\mathrm m$. } \label{fig:complexity_35} \end{figure} \begin{figure} \centering \includegraphics{figs/range_plot_1.pdf} \hspace{-3em} \includegraphics{figs/range_plot_2.pdf} \hspace{-3em} \includegraphics{figs/range_plot_3.pdf} \hspace{-3em} \includegraphics{figs/range_plot_4.pdf} \\ \vspace{-2em} \includegraphics{figs/range_plot_log_1.pdf} \hspace{-3em} \includegraphics{figs/range_plot_log_2.pdf} \hspace{-3em} \includegraphics{figs/range_plot_log_3.pdf} \hspace{-3em} \includegraphics{figs/range_plot_log_4.pdf} \caption{ The range of energies where \textsc{rsb} saddles are found for the $3+s$ model with varying $s$ and $\lambda$. In the top row the black line shows $E_\textrm{min}$, the minimum energy where saddles are found, and in the bottom row this energy is subtracted away to emphasize when the \textsc{rsb} region crosses into minima. For most $s$, both the top and bottom lines are given by $E_{\oldstylenums1\textsc{rsb}}$, but for $s=14$ there is a portion where the low-energy boundary has $q_1<1$. In that plot, the continuation of the $E_{\oldstylenums1\textsc{rsb}}$ line is shown dashed. } \end{figure} \begin{figure} \centering \includegraphics{figs/order_plot_1.pdf}\\ \vspace{-1em} \includegraphics{figs/order_plot_2.pdf} \caption{ Examples of $3+14$ models where the solution $E_{\oldstylenums1\textsc{rsb}}^-$ does and doesn't define the lower limit of energies where \textsc{rsb} saddles are found. In both plots the red dot shows $E_{\oldstylenums1\textsc{rsb}}^-$, while the solid red lines shows the transition boundary with the \textsc{rs} complexity. The dashed black line shows the \textsc{rs} zero complexity line, while the solid black line shows the {\oldstylenums1}\textsc{rsb} zero complexity line. \textbf{Top:} $\lambda=0.67$. Here the end of the transition line that begins at $E_{\oldstylenums1\textsc{rsb}}^+$ does not match $E_{\oldstylenums1\textsc{rsb}}^-$ but terminates at higher energies. $E_{\oldstylenums1\textsc{rsb}}^-$ still corresponds with the lower bound. \textbf{Bottom:} $\lambda=0.69$. Here the end of the transition line that begins at $E_{\oldstylenums1\textsc{rsb}}^+$ terminates at lower energies than $E_{\oldstylenums1\textsc{rsb}}^-$, and therefore its terminus defines the lower bound. } \end{figure} \begin{equation} \mu =-\frac{(f_1'+f_0'')u_f}{(2f_1-f_1')f_1'f_0''^{1/2}} -\frac{f_1''-f_1'}{f_1'-2f_1}E \end{equation} There are implications for the emergence of \textsc{rsb} in equilibrium. Consider a specific $H$ with \begin{equation} H(\pmb\sigma) =\frac{\sqrt\lambda}{p!}\sum_{i_1\cdots i_p}J^{(p)}_{i_1\cdots i_p}\sigma_{i_1}\cdots\sigma_{i_2} +\frac{\sqrt{1-\lambda}}{s!}\sum_{i_1\cdots i_s}J^{(s)}_{i_1\cdots i_s}\sigma_{i_1}\cdots\sigma_{i_s} \end{equation} where the interaction tensors $J$ are drawn from zero-mean normal distributions with $\overline{(J^{(p)})^2}=p!/2N^{p-1}$ and likewise for $J^{(s)}$. It is straightforward to confirm that $H$ defined this way has the covariance property \eqref{eq:covariance} with $f(q)=\frac12\big(\lambda q^p+(1-\lambda)q^s\big)$. With the $J$s drawn in this way and fixed for $p=3$ and $s=16$, we can vary $\lambda$, and according to Fig.~\ref{fig:phases} we should see a transition in the type of order at the ground state. What causes the change? Our analysis indicates that stationary points with the required order \emph{already exist in the landscape} as unstable saddles for small $\lambda$, then eventually stabilize into metastable minima and finally become the lowest lying states. This is different from the picture of existing uncorrelated low-lying states splitting apart into correlated clusters. \printbibliography \end{document}