\documentclass[fleqn,a4paper]{article}

\usepackage[utf8]{inputenc} % why not type "Bézout" with unicode?
\usepackage[T1]{fontenc} % vector fonts plz
\usepackage{fullpage,amsmath,amssymb,latexsym,graphicx}
\usepackage{newtxtext,newtxmath} % Times for PR
\usepackage{appendix}
\usepackage[dvipsnames]{xcolor}
\usepackage[
  colorlinks=true,
  urlcolor=MidnightBlue,
  citecolor=MidnightBlue,
  filecolor=MidnightBlue,
  linkcolor=MidnightBlue
]{hyperref} % ref and cite links with pretty colors
\usepackage[
  style=phys,
  eprint=true,
  maxnames = 100
]{biblatex}
\usepackage{anyfontsize,authblk}

\addbibresource{when_annealed.bib}

\begin{document}

\title{
  When is the average number of saddles typical?
}

\author{Jaron Kent-Dobias}
\affil{Istituto Nazionale di Fisica Nucleare, Sezione di Roma I}

\maketitle

\begin{abstract}
  A common measure of the complexity of a function is the count of its
  stationary points. For complicated functions, this count often grows
  exponentially with the volume and dimension of their domain. In practice, the
  count is averaged over a class of such functions (the annealed average), but
  the large numbers involved can result in averages biased by extremely rare
  samples. Typical counts are reliably found by taking the average of the
  logarithm (the quenched average), which is more costly and not often done in
  practice. When most stationary points are uncorrelated with each other,
  quenched and annealed averages are equal. There are heuristics from
  equilibrium calculations that guarantee when most of the lowest minima will
  be uncorrelated. Here, we show that these equilibrium heuristics cannot be
  used to draw conclusions about other minima and saddles. We produce examples
  among Gaussian-correlated functions on the hypersphere where the count of
  certain saddles and minima has different quenched and annealed averages,
  despite being guaranteed `safe' in the equilibrium setting. We produce the
  necessary conditions for the emergence of nontrivial correlations between
  saddles.
  We discuss the implications for the geometry of those functions, and which
  out-of-equilibrium settings might show a signature of this.
\end{abstract}

Random energies, cost functions, and interaction networks are important to many
areas of modern science. The energy landscape of glasses, the likelihood
landscape of machine learning and high-dimensional inference, and the
interactions between organisms in an ecosystem are just a few examples. A
traditional tool for making sense of the diverse behavior these systems exhibit
is to analyze the statistics of points where their dynamics is stationary. For
energy or cost landscapes, these correspond to the minima, maxima, and saddles
of the function, while for ecosystems and other purely dynamical systems these
correspond to the equilibria of the dynamics. When many stationary points are
present, the system is considered complex.

Despite the importance of stationary point statistics for understanding complex
behavior, they are often calculated using an uncontrolled approximation.
Because their number is so large, it cannot reliably be averaged over disorder
in the system. The annealed approximation takes this average anyway, risking a
systematic bias by rare and atypical samples. The annealed approximation is
known to be exact for certain models and in certain circumstances, but it is
widely used outside those circumstances without much reflection. In a few
cases, the controlled quenched average, which averages the logarithm of their
number, has been considered \cite{Ros_2019_Complexity, Kent-Dobias_2023_How,
Ros_2023_Quenched}.

A heuristic argument for the correctness of the annealed approximation for the
statistics of stationary points is sometimes made when the annealed
approximation is correct for an equilibrium calculation on the same system.
The argument goes like this: since the limit of zero temperature or noise in an
equilibrium calculation concentrates the measure onto the lowest set of minima,
the equilibrium average at very low temperature should be governed by the same
statistics as the count of that lowest set of minima. This argument is valid,
but only for the lowest set of minima, which at least in glassy problems are
rarely relevant to dynamical behavior. What about the \emph{rest} of the
stationary points?

In this paper, we show that the behavior of the ground state, or \emph{any}
equilibrium behavior, does not govern whether stationary points will have a
correct annealed average. In a prototypical family of models of random
functions, we calculate the conditions for when annealed averages should fail
and stationary points should have nontrivial correlations in their mutual
position. We produce examples of models whose equilibrium is guaranteed to
never see correlations between states, but where a population of saddle points
is correlated.

We study classes of Gaussian-correlated random functions with isotropic
statistics on the $(N-1)$-sphere. Each class of functions
$H:S^{N-1}\to\mathbb R$ is defined by the average covariance between the
function evaluated at two different points
$\pmb\sigma_1,\pmb\sigma_2\in S^{N-1}$, which is a function of the scalar
product or overlap between the two configurations:
\begin{equation} \label{eq:covariance}
  \overline{H(\pmb\sigma_1)H(\pmb\sigma_2)}=Nf\bigg(\frac{\pmb\sigma_1\cdot\pmb\sigma_2}N\bigg)
\end{equation}
Specifying the covariance function $f$ uniquely specifies the class of
functions. The series coefficients of $f$ need to be nonnegative in order for
$f$ to be a well-defined covariance. The case where $f$ is a homogeneous
polynomial has been extensively studied, and corresponds to the pure spherical
models of glass physics or the spiked tensor models of statistical inference.
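As a minimal worked illustration of these two statements (the coefficient
symbol $a_k$ and the normalization $\frac12$ of the homogeneous example are
assumptions made here for concreteness), write the series as
$f(q)=\sum_ka_kq^k$, so that the requirement is $a_k\geq0$ for all $k$. A
homogeneous choice $f(q)=\frac12q^p$ has the single nonzero coefficient
$a_p=\frac12$, and its value and first two derivatives at $q=1$ are
\begin{equation*}
  f(1)=\frac12
  \qquad
  f'(1)=\frac p2
  \qquad
  f''(1)=\frac{p(p-1)}2
\end{equation*}
which for $p=3$ gives $f(1)=\frac12$, $f'(1)=\frac32$, and $f''(1)=3$.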
Here we will study cases where
$f(q)=\frac12\big(\lambda q^p+(1-\lambda)q^s\big)$ for $\lambda\in(0,1)$. These
are examples of \emph{mixed} spherical models. These classes of functions have
been extensively studied in the physics and statistics literature, and host a
zoo of complex orders and phase transitions \cite{Crisanti_2004_Spherical,
Crisanti_2006_Spherical, Crisanti_2011_Statistical}.

There are two important features which differentiate stationary points in the
spherical models: their \emph{energy density} $E=\frac1NH(\pmb\sigma^*)$ and
their \emph{stability}
$\mu=\frac1N\operatorname{\mathrm{Tr}}\operatorname{\mathrm{Hess}}H(\pmb\sigma^*)$.
The energy density should be familiar, as the `height' in the landscape. The
stability is so-called because it governs the spectrum of the stationary point.
In each spherical model, the spectrum of each stationary point is a Wigner
semicircle of the same width $\mu_\mathrm m=\sqrt{4f''(1)}$, but shifted by a
constant. The stability $\mu$ sets this constant shift. When
$\mu<\mu_\mathrm m$, the spectrum still has support over zero and we have
saddles with an extensive number of downward directions. When
$\mu>\mu_\mathrm m$, the spectrum has support only over positive eigenvalues,
and we have stable minima. When $\mu=\mu_\mathrm m$, the spectrum has a
pseudogap, and we have marginal minima.

\begin{figure}
  \centering
  \includegraphics{figs/phases_34.pdf}
  \caption{
    A phase diagram of the boundaries we discuss in this paper for the $3+s$
    model with $f=\frac12\big(\lambda q^3+(1-\lambda)q^s\big)$. The blue region
    shows where there exist some stationary points whose complexity is
    {\oldstylenums1}\textsc{rsb}. The yellow region shows where $f$ is not
    convex and therefore \textsc{rsb} solutions are possible in equilibrium.
    The green region shows where \textsc{rsb} solutions are correct at the
    ground state, adapted from \cite{Auffinger_2022_The}.
  }
  \label{fig:phases}
\end{figure}

The number $\mathcal N(E,\mu)$ of stationary points with energy density $E$ and
stability $\mu$ is exponential in $N$ for these models. The complexity is
defined by the average of the logarithm of this, or
$\Sigma(E,\mu)=\frac1N\overline{\log\mathcal N(E,\mu)}$. More often the
annealed complexity is calculated, where the average is taken before the
logarithm: $\Sigma_\mathrm a(E,\mu)=\frac1N\log\overline{\mathcal N(E,\mu)}$.

When the complexity is calculated using the Kac--Rice formula and a physicists'
tool set, the problem is reduced to the evaluation of an integral by the saddle
point method for large $N$ \cite{Kent-Dobias_2023_How}. The complexity is given
by extremizing an effective action,
$\Sigma(E,\mu)=\mathop{\mathrm{extremum}}_{q_1,x}\mathcal S(q_1,x\mid E,\mu)$
for the action $\mathcal S$ given by
\begin{equation}
  \begin{aligned}
    \mathcal S&(q_1,x\mid E,\mu)
    =\mathcal D(\mu)
    +\mathop{\textrm{extremum}}_{\hat\beta,r_\mathrm d,r_1,d_\mathrm d,d_1}
    \Bigg\{
      \hat\beta E-r_\mathrm d\mu\\
      &+\frac12\bigg[
        \hat\beta^2\big[f(1)-\Delta xf(q_1)\big]
        +(2\hat\beta r_\mathrm d-d_\mathrm d)f'(1)
        -\Delta x(2\hat\beta r_1-d_1)f'(q_1)
        +r_\mathrm d^2f''(1)-\Delta x\,r_1^2f''(q_1) \\
        &+\log\Big(
          \big(r_\mathrm d-\Delta x\,r_1\big)^2+d_\mathrm d\big(1-\Delta x\,q_1\big)-\Delta x\,d_1\big(1-\Delta xq_1\big)
        \Big)
        -\frac{\Delta x}x\log\Big(
          (r_\mathrm d-r_1)^2+d_\mathrm d\big(1-\Delta xq_1\big)
        \Big)
      \bigg]
    \Bigg\}
  \end{aligned}
\end{equation}
where $\Delta x=1-x$ and
\begin{equation}
  \mathcal D(\mu)
  =\begin{cases}
    \frac12+\log\left(\frac12\mu_\text m\right)+\frac{\mu^2}{\mu_\text m^2}
    & \mu^2\leq\mu_\text m^2 \\
    \frac12+\log\left(\frac12\mu_\text m\right)+\frac{\mu^2}{\mu_\text m^2}
    -\left|\frac{\mu}{\mu_\text m}\right|\sqrt{\big(\frac\mu{\mu_\text m}\big)^2-1}
    -\log\left(\left|\frac{\mu}{\mu_\text m}\right|-\sqrt{\big(\frac\mu{\mu_\text m}\big)^2-1}\right)
    & \mu^2>\mu_\text m^2
  \end{cases}
\end{equation}
The extremal problem is quadratic in
$\hat\beta$, $r_\mathrm d$, $r_1$, $d_\mathrm d$, and $d_1$ and therefore its
solution is unique and can be found explicitly, but the resulting formula is
much more complicated so we do not include it here. There can be multiple
extrema at which to evaluate $\mathcal S$; in this case, the one for which
$\Sigma$ is \emph{smallest} gives the correct solution. There is always a
solution for $x=1$ which is independent of $q_1$, which corresponds to the
replica symmetric case and which is equal to the annealed calculation. The crux
of this paper will be to determine when this solution is not the global one.
For brevity, we define (here and elsewhere) the constants
\begin{align}
  u_f=f(f'+f'')-f'^2
  &&
  v_f=f'(f''+f''')-f''^2
  \\
  w_f=2f''^2+f'(f'''-2f'')
  &&
  y_f=f'(f'-f)+f''f
  &&
  z_f=f(f''-f')+f'^2
\end{align}

It isn't accurate to say that a solution to the saddle point equations is
`stable' or `unstable.' The problem of solving the complexity in this way is
not a variational problem, so there is nothing to be maximized or minimized,
and in general even global solutions have positive and negative eigenvalues of
the Hessian. However, the eigenvalues of the Hessian can still tell us
something about the emergence of new solutions: when another solution
bifurcates smoothly from an existing one, the Hessian evaluated at that point
will have a zero eigenvalue. Unfortunately this is a difficult procedure to
apply in general, since one must know the parameters in the new solution, and
some parameters, e.g., $q_1$, are unconstrained in the old solution.

There is one place where one can consistently search for a bifurcating solution
to the saddle point equations: along the zero complexity line
$\Sigma(E,\mu)=0$. Going along this line in the replica symmetric solution, the
{\oldstylenums1}\textsc{rsb} complexity transitions at a critical point where
$x=q_1=1$ \cite{Kent-Dobias_2023_How}.
Since all the parameters in the bifurcating solution are known at this point,
one can search for it by looking for a zero eigenvalue in the way described
above. In the replica symmetric solution for points describing saddles, this
line is
\begin{equation} \label{eq:extremal.line}
  \mu=-\frac1{z_f}\left(2Ef'f''+\sqrt{-2f''u_f\big(E^2(f''-f')-\log\frac{f''}{f'}z_f\big)}\right)
\end{equation}

Let $M$ be the matrix of double partial derivatives of $\mathcal S$ with
respect to $q_1$ and $x$. We evaluate $M$ at the replica symmetric saddle point
$x=1$ with the additional constraint that $q_1=1$ and along the extremal
complexity line \eqref{eq:extremal.line}. We determine when a zero eigenvalue
appears, indicating the presence of a bifurcating {\oldstylenums1}\textsc{rsb}
solution, by solving $0=\det M$. We find
\begin{equation}
  \det M=-\left(\frac{\partial^2\mathcal S}{\partial q_1\partial x}\bigg|_{\substack{x=1\\q_1=1}}\right)^2\propto(ay^2+bE^2+cyE+d)^2
\end{equation}
where $y^2=eE^2+g$ and the constants $a$, $b$, $c$, $d$, $e$, and $g$ are
defined by
\begin{equation}
  \begin{aligned}
    a&=\frac{f''}{u_f}\left[
      \big(3y_f^2-4ff'f''(f'-f)\big)f'''-8f(f'-f)f''^2(f''-f')
    \right]
    \qquad
    b=2f'f''^2w_f
    \\
    c&=2w_f\sqrt{2f''^3u_f}
    \qquad
    d=-\frac{2f''}{f'}z_f^2w_f
    \qquad
    e=-(f''-f')
    \qquad
    g=z_f\log\frac{f''}{f'}
  \end{aligned}
\end{equation}
The solutions for $\det M=0$ correspond to energies that satisfy
\begin{equation}
  E_{\oldstylenums1\textsc{rsb}}
  =-\sqrt{\frac12\frac{c^2g-2(ade+a^2eg+b(d+ag))\pm|c|\sqrt{4d^2e-4d(b-ae)g+(c^2-4ab)g^2}}{b^2+2abe+e(a^2e-c^2)}}
\end{equation}
The expression inside the inner square root is proportional to
\begin{equation}
  \begin{aligned}
    G_f
    &=
    2(f''-f')u_fw_f
    -2\log^2\frac{f''}{f'}f'^2f''v_f
    \\
    &\qquad
    -f'\log\frac{f''}{f'}\Big[
      4(f'-2f)(f''-f')f''^2-\big(-3(f'-f)f'^2+f'(f'-2f)f''+3ff''^2\big)f'''
    \Big]
  \end{aligned}
\end{equation}
If $G_f>0$, then there are two points along the extremal complexity line where
a solution bifurcates, and a new line
of {\oldstylenums1}\textsc{rsb} solutions between them. Therefore, $G_f>0$ is a
necessary condition to see {\oldstylenums1}\textsc{rsb} in the complexity.

\begin{figure}
  \centering
  \includegraphics{figs/complexity_35.pdf}
  \caption{
    Stationary point statistics as a function of energy density $E$ and
    stability $\mu$ for a $3+5$ model with $\lambda=\frac12$. The dashed black
    line shows the line of zero complexity, where stationary points vanish;
    inside the region it encloses they are found in exponential number. The
    red region (blown up in the inset) shows where the annealed complexity
    gives the wrong count and a {\oldstylenums1}\textsc{rsb} complexity is
    necessary. The red points show where $\det M=0$. The gray shaded region
    highlights the minima, which are stationary points with
    $\mu>\mu_\mathrm m$.
  }
  \label{fig:complexity_35}
\end{figure}

\begin{equation}
  \mu
  =-\frac{(f_1'+f_0'')u_f}{(2f_1-f_1')f_1'f_0''^{1/2}}
  -\frac{f_1''-f_1'}{f_1'-2f_1}E
\end{equation}

There are implications for the emergence of \textsc{rsb} in equilibrium.
Consider a specific $H$ with
\begin{equation}
  H(\pmb\sigma)
  =\frac{\sqrt\lambda}{p!}\sum_{i_1\cdots i_p}J^{(p)}_{i_1\cdots i_p}\sigma_{i_1}\cdots\sigma_{i_p}
  +\frac{\sqrt{1-\lambda}}{s!}\sum_{i_1\cdots i_s}J^{(s)}_{i_1\cdots i_s}\sigma_{i_1}\cdots\sigma_{i_s}
\end{equation}
where the interaction tensors $J$ are drawn from zero-mean normal distributions
with $\overline{(J^{(p)})^2}=p!/2N^{p-1}$ and likewise for $J^{(s)}$. It is
straightforward to confirm that $H$ defined this way has the covariance
property \eqref{eq:covariance} with
$f(q)=\frac12\big(\lambda q^p+(1-\lambda)q^s\big)$. With the $J$s drawn in this
way and fixed for $p=3$ and $s=16$, we can vary $\lambda$, and according to
Fig.~\ref{fig:phases} we should see a transition in the type of order at the
ground state. What causes the change?
Our analysis indicates that stationary points with the required order \emph{already exist in the landscape} as unstable saddles for small $\lambda$, then eventually stabilize into metastable minima and finally become the lowest lying states. This is different from the picture of existing uncorrelated low-lying states splitting apart into correlated clusters. \printbibliography \end{document}