Diffstat (limited to 'frsb_kac-rice.tex')
-rw-r--r-- | frsb_kac-rice.tex | 216 |
1 files changed, 127 insertions, 89 deletions
diff --git a/frsb_kac-rice.tex b/frsb_kac-rice.tex index 43cf24e..bc6647e 100644 --- a/frsb_kac-rice.tex +++ b/frsb_kac-rice.tex @@ -28,87 +28,71 @@ Bray and Moore \cite{Bray_1980_Metastable} attempted the first calculation for the Sherrington--Kirkpatrick model, in a paper remarkable for being one of the first applications of a replica symmetry breaking scheme. As became clear when the actual ground-state of the model was computed by Parisi \cite{Parisi_1979_Infinite} with a different scheme, the Bray--Moore result was not exact, and in fact the problem has been open ever since. To this date the program of computing the number of saddles of a mean-field -glass has been only carried out for a small subset of models. -These include most notably the (pure) $p$-spin model ($p>2$) \cite{Rieger_1992_The, Crisanti_1995_Thouless-Anderson-Palmer}. -The problem of studying the critical points of these landscapes +glass has only been carried out for a small subset of models, including most notably the (pure) $p$-spin model ($p>2$) \cite{Rieger_1992_The, Crisanti_1995_Thouless-Anderson-Palmer}. +In a parallel development, it has evolved into an active field in probability theory \cite{Auffinger_2012_Random, Auffinger_2013_Complexity, BenArous_2019_Geometry}. In this paper we present what we argue is the general replica ansatz for the -computation of the number of saddles of generic mean-field models, including the Sherrington--Kirkpatrick model. It reproduces the Parisi result in the limit +computation of the number of saddles of generic mean-field models, which we expect to include the Sherrington--Kirkpatrick model. It reproduces the Parisi result in the limit of small temperature for the lowest states, as it should. To understand the importance of this computation, consider the following situation.
When one solves the problem of spheres in large dimensions, one finds that there is -a transition at a given temperature to a one-step one step symmetry breaking (1RSB) phase at a Kauzmann temperature. At a lower temperature, -the is a transition to a full RSB phase (see \cite{gross1985mean,gardner1985spin}, the `Gardner ' phase \cite{charbonneau2014fractal}. - -Now, this transition involves the lowest, equilibrium states. Because they are obviously unreachable at any reasonable timescale, an often addressed to ask "what is the Gardner transition line for higher than equilibrium energy-densities" (see, for a review \cite{berthier2019gardner})? For example, when studying `jamming' at zero temperature, the question is posed as to "on what side of the 1RSB-FRS transition -are the high energy (or low density) states reachable dynamically. Posed in this way, such a question does not have a clear definition. +a transition to a one-step replica symmetry breaking (1RSB) phase at a Kauzmann temperature, +and, at a lower temperature, +another transition to a full RSB phase, the `Gardner' phase \cite{charbonneau2014fractal} (see \cite{gross1985mean,gardner1985spin}). +Now, this transition involves the lowest, equilibrium states. Because they are obviously unreachable at any reasonable timescale, an often-asked question is ``what is the Gardner transition line for higher than equilibrium energy densities?'' (see \cite{berthier2019gardner} for a review). For example, when studying `jamming' at zero temperature, the question is posed as ``on what side of the 1RSB-FRSB transition +are the high energy (or low density) states reachable dynamically?'' +Posed in this way, such a question does not have a clear definition.
In the present paper we give a concrete strategy to define unambiguously such an issue: we consider the local energy minima at a given energy and study their number and other properties: the solution involves a replica-symmetry breaking scheme that is well-defined, and corresponds directly to the topological characteristics of those minima. \section{The model} -Here we consider, for definiteness, the mixed $p$-spin model, itself a particular case -of the `Toy Model' of M\'ezard and Parisi \cite{Mezard_1992_Manifolds} +Here we consider, for definiteness, the mixed $p$-spin model, \begin{equation} H(s)=-\sum_p\frac1{p!}\sum_{i_1\cdots i_p}J^{(p)}_{i_1\cdots i_p}s_{i_1}\cdots s_{i_p} \end{equation} -for $\overline{(J^{(p)})^2}=a_pp!/2N^{p-1}$. Then +for $\overline{(J^{(p)})^2}=a_pp!/2N^{p-1}$. Then \begin{equation} \overline{H(s_1)H(s_2)}=Nf\left(\frac{s_1\cdot s_2}N\right) \end{equation} for \begin{equation} f(q)=\frac12\sum_pa_pq^p -\end{equation} -Can be thought of as a model of generic gaussian functions on the sphere. +\end{equation} +or, more generally, the `Toy Model' of M\'ezard and Parisi \cite{Mezard_1992_Manifolds}, which involves non-polynomial forms for $f(q)$. + +These may be thought of as models of generic Gaussian functions on the sphere.
To constrain the model to the sphere, we use a Lagrange multiplier $\mu$, with the total energy being \begin{equation} H(s)+\frac\mu2(s\cdot s-N) \end{equation} -At any critical point, the hessian is +At any critical point, the gradient and Hessian are \begin{equation} - \operatorname{Hess}H=\partial\partial H+\mu I -\end{equation} -$\partial\partial H$ is a GOE matrix with variance -\begin{equation} - \overline{(\partial_i\partial_jH)^2}=\frac1Nf''(1)\delta_{ij} + \operatorname{Grad}H=\partial H+\mu s \qquad ; \qquad \operatorname{Hess}H=\partial\partial H+\mu I \end{equation} -and therefore its spectrum is given by the Wigner semicircle with radius $\sqrt{4f''(1)}$, or -\begin{equation} - \rho(\lambda)=\frac1{2\pi f''(1)}\sqrt{4f''(1)-\lambda^2} -\end{equation} -and the spectrum of $\operatorname{Hess}H$ is this shifted by $\mu$, or $\rho(\lambda+\mu)$. +An important observation, made by Bray and Dean \cite{Bray_2007_Statistics}, is that the gradient and Hessian +are independent for random Gaussian disorder. +The average over disorder +breaks into a product of two independent averages, one for the gradient factor +and one for any function of the Hessian, in particular its number of negative eigenvalues, the index $\mathcal I$ of the saddle (see Fyodorov +\cite{Fyodorov_2007_Replica} for a detailed discussion). + + + -The parameter $\mu$ fixes the spectrum of the hessian. By manipulating it, one -can decide to find the complexity of saddles of a certain macroscopic index, or -of minima with a certain harmonic stiffness.
When $\mu$ is taken to be within -the range $\pm\sqrt{4f''(1)}=\pm\mu_m$, the critical points are constrained to have -index -\begin{equation} - \mathcal I(\mu)=N\int_0^\infty d\lambda\,\rho(\lambda+\mu) - =N\left\{\frac12-\frac1\pi\left[ - \arctan\left(\frac\mu{\sqrt{\mu_m^2-\mu^2}}\right) - +\frac\mu{\mu_m^2}\sqrt{\mu_m^2-\mu^2} - \right] - \right\} -\end{equation} -When $\mu>\mu_m$, the critical -points are minima whose sloppiest eigenvalue is $\mu-\mu_m$. Finally, -when $\mu=\mu_m$, the critical points are marginal minima. -The parameter $\mu$ fixes the spectrum of the hessian. When it is an integration variable, -and one restricts the domain of all integrations to compute saddles of a certain macroscopic index, or -of minima with a certain harmonic stiffness, its value is the `softest' mode that adapts to change the Hessian \cite{Fyodorov_2007_Replica}. When it is fixed, then the restriction of the index of saddles is `payed' by the realization of the eigenvalues of the Hessian, usually a -`harder' mode. \section{Equilibrium} Here we review the equilibrium solution. \cite{Crisanti_1992_The, Crisanti_1993_The, Crisanti_2004_Spherical, Crisanti_2006_Spherical} -The free energy is well known to take the form +The free energy averaged over disorder is +\begin{equation} +\beta F = - \overline{\ln \int ds \; e^{-\beta H(s)}} +\end{equation} +Computing the logarithm as the limit of $n \rightarrow 0$ replicas is standard, and it takes the form \begin{equation} \beta F=-1-\log2\pi-\frac12\lim_{n\to0}\frac1n\left(\beta^2\sum_{ab}^nf(Q_{ab})+\log\det Q\right) \end{equation} @@ -180,12 +164,12 @@ $q_k$ gives the overlap within a state, e.g., within the basin of a well inside the energy landscape. At zero temperature, the measure is completely localized on the bottom of the well, and therefore the overlap with each state becomes one. We will see that the complexity of low-energy stationary points in -Kac--Rice is also given by a $(k-1)$-RSB anstaz. Heuristically, this is because +the Kac--Rice computation is also given by a $(k-1)$-RSB ansatz. Heuristically, this is because
Heuristically, this is because +Kac--Rice computation is also given by a $(k-1)$-RSB anstaz. Heuristically, this is because each stationary point also has no width and therefore overlap one with itself. -\section{Kac--Rice} +\section{Kac--Rice calculation} \cite{Auffinger_2012_Random, BenArous_2019_Geometry} @@ -194,39 +178,33 @@ which integrates a over the function's domain a $\delta$-function containing the gradient multiplied by the absolute value of the determinant \cite{Rice_1939_The, Kac_1943_On}. In addition, we insert a $\delta$-function fixing the energy density $E$, giving the number of stationary points at -energy $E$ and radial reaction $\mu$ as +energy $E$ in terms of a Lagrange multiplier $\mu$ as: \begin{equation} - \mathcal N(E, \mu) - =\int ds\,\delta(NE-H(s))\delta(\partial H(s)+\mu s)|\det(\partial\partial H(s)+\mu I)| + \mathcal N(E, {\cal I}_o) + =\int ds d\mu\,\delta(NE-H(s))\delta(\partial H(s)+\mu s)|\det(\partial\partial H(s)+\mu I)| \Theta\left[{\cal I}(s,\mu)-{\cal I}_o)\right] \end{equation} +where $\Theta$ is one in the region in which the Hessian the argument is vanishes, and zero elsewhere. This number will typically be exponential in $N$. In order to find typical counts when disorder is averaged, we will want to average its logarithm -instead, which is known as the complexity: +instead, which is known as the averaged complexity: \begin{equation} - \Sigma(E,\mu)=\lim_{N\to\infty}\frac1N\overline{\log\mathcal N(E, \mu)} + \Sigma(E,\mu)=\lim_{N\to\infty}\frac1N\overline{\log\mathcal N(E, {\cal I}_o)} \end{equation} -The radial reaction $\mu$, which acts like a kind of `mass' term, takes a -fixed value here, which means that the complexity is for a given energy -density and hessian spectrum. This will turn out to be important when we -discriminate between counting all solutions, or selecting those of a given -index, for example minima. 
The complexity of solutions without fixing $\mu$ is -given by maximizing the complexity as a function of $\mu$. - -If one averages over $\mathcal N$ and afterward takes its logarithm, one arrives at the annealed complexity +If one averages over $\mathcal N$ and afterward takes its logarithm, one arrives at the so-called {\em annealed} complexity \begin{equation} - \Sigma_\mathrm a(E,\mu) - =\lim_{N\to\infty}\frac1N\log\overline{\mathcal N(E,\mu)} + \Sigma_\mathrm a(E,{\cal I}_o) + =\lim_{N\to\infty}\frac1N\log\overline{\mathcal N(E,{\cal I}_o)} \end{equation} This has been previously computed for the mixed $p$-spin models \cite{BenArous_2019_Geometry}, with the result -\begin{equation} - \begin{aligned} - \Sigma_\mathrm a(E,\mu) - =-\frac{E^2(f'(1)+f''(1))+2E\mu f'(1)+f(1)\mu^2}{2f(1)(f'(1)+f''(1))-2f'(1)^2}-\frac12\log f'(1)\\ - +\operatorname{Re}\left[\frac\mu{\mu+\sqrt{\mu^2-4f''(1)}} - -\log\left(\frac1{2f''(1)}\left(\mu-\sqrt{\mu^2-4f''(1)}\right)\right) - \right] - \end{aligned} -\end{equation} +%\begin{equation} + % \begin{aligned} + %\Sigma_\mathrm a(E,\mu) + %=-\frac{E^2(f'(1)+f''(1))+2E\mu f'(1)+f(1)\mu^2}{2f(1)(f'(1)+f''(1))-2f'(1)^2}-\frac12\log f'(1)\\ + % +\operatorname{Re}\left[\frac\mu{\mu+\sqrt{\mu^2-4f''(1)}} + % -\log\left(\frac1{2f''(1)}\left(\mu-\sqrt{\mu^2-4f''(1)}\right)\right) + % \right] + % \end{aligned} +%\end{equation} The annealed complexity is known to equal the actual (quenched) complexity in circumstances where there is at most one level of RSB. This is the case for the pure $p$-spin models, or for mixed models where $1/\sqrt{f''(q)}$ is a convex @@ -235,41 +213,77 @@ symmetry breaking. For instance, when $f(q)=\frac12(q^2+\frac1{16}q^4)$, the anneal complexity predicts that minima vanish well before the dominant saddles, a contradiction for any bounded function, as seen in Fig.~\ref{fig:frsb.complexity}.
+A sometimes more illuminating quantity to consider is the Legendre transform $G$ of the complexity: +\begin{equation} + e^{NG(\hat \beta, {\cal I}_o)} = \int dE \; e^{N\left[-\hat \beta E+\Sigma(E, {\cal I}_o)\right]} +\end{equation} +There will be a critical value $\hat \beta_c$ beyond which the complexity is zero: above +this value the measure is split between the lowest $O(1)$ energy states. We shall not study this regime here, which interpolates between the dynamically relevant and the equilibrium states, but just mention that +it is an interesting object of study. + + + + + + \subsection{The replicated problem} -The replicated Kac--Rice formula was introduced by Ros et al.~\cite{Ros_2019_Complex}, and its -effective action for the mixed $p$-spin model has previously been computed by -Folena et al.~\cite{Folena_2020_Rethinking}. Here we review the derivation. + In order to average the complexity over disorder properly, the logarithm must be dealt with. We use the standard replica trick, writing \begin{equation} \begin{aligned} - \log\mathcal N(E,\mu) - &=\lim_{n\to0}\frac\partial{\partial n}\mathcal N^n(E,\mu) \\ - &=\lim_{n\to0}\frac\partial{\partial n}\int\prod_a^n ds_a\,\delta(NE-H(s_a))\delta(\partial H(s_a)+\mu s_a)|\det(\partial\partial H(s_a)+\mu I)| + \log\mathcal N(E,{\cal I}_o) + &=\lim_{n\to0}\frac\partial{\partial n}\mathcal N^n(E,{\cal I}_o) \\ + &=\lim_{n\to0}\frac\partial{\partial n}\int\prod_a^n ds_a\,\delta(NE-H(s_a))\delta(\partial H(s_a)+\mu s_a)|\det(\partial\partial H(s_a)+\mu I)| \Theta\left[{\cal I}(s_a,\mu)-{\cal I}_o\right] \end{aligned} \end{equation} - -As noted by Bray and Dean \cite{Bray_2007_Statistics}, gradient and Hessian -are independent for a random Gaussian function, and the average over disorder -breaks into a product of two independent averages, one for the gradient factor -and one for the determinant.
The integration of all variables, including the -disorder in the last factor, may be restricted to the domain such that the -matrix $\partial\partial H(s_a)-\mu I$ has a specified number of negative -eigenvalues (the index $\mathcal I$ of the saddle), (see Fyodorov -\cite{Fyodorov_2007_Replica} for a detailed discussion). In practice, we are +The replicated Kac--Rice formula was introduced by Ros et al.~\cite{Ros_2019_Complex}, and its +effective action for the mixed $p$-spin model has previously been computed by +Folena et al.~\cite{Folena_2020_Rethinking}. Here we review the derivation. + In practice, we are therefore able to write \begin{equation} \begin{aligned} - \Sigma(E, \mu) + \Sigma(E, {\cal I}_o) &=\lim_{N\to\infty}\frac1N\lim_{n\to0}\frac\partial{\partial n}\int\left(\prod_a^nds_a\right)\,\overline{\prod_a^n \delta(NE-H(s_a))\delta(\partial H(s_a)+\mu s_a)} \times - \overline{\prod_a^n |\det(\partial\partial H(s_a)+\mu I)|} + \overline{\prod_a^n |\det(\partial\partial H(s_a)+\mu I)| \Theta\left[{\cal I}(s_a,\mu)-{\cal I}_o\right]} \end{aligned} \end{equation} + +\subsubsection{The Hessian factors} + +The Hessian matrix $\partial\partial H$ is, in the large-$N$ limit and +for almost every point and realization of disorder, a GOE matrix with variance +\begin{equation} + \overline{(\partial_i\partial_jH)^2}=\frac1Nf''(1)\delta_{ij} +\end{equation} +and therefore its spectrum is given by the Wigner semicircle with radius $\sqrt{4f''(1)}$, or +\begin{equation} + \rho(\lambda)=\frac1{2\pi f''(1)}\sqrt{4f''(1)-\lambda^2} +\end{equation} +and the spectrum of $\operatorname{Hess}H$ is this shifted by $\mu$, or $\rho(\lambda+\mu)$.
+The parameter $\mu$ thus fixes the spectrum of the Hessian. When $\mu$ is taken to be within +the range $\pm\sqrt{4f''(1)}=\pm\mu_m$, the critical (or in fact, any) points have +index density +\begin{equation} + \mathcal I(\mu)=\int_0^\infty d\lambda\,\rho(\lambda+\mu) + =\left\{\frac12-\frac1\pi\left[ + \arctan\left(\frac\mu{\sqrt{\mu_m^2-\mu^2}}\right) + +\frac\mu{\mu_m^2}\sqrt{\mu_m^2-\mu^2} + \right] + \right\} +\end{equation} +When $\mu>\mu_m$, the critical +points are minima whose sloppiest eigenvalue is $\mu-\mu_m$. +The factor $\Theta[{\cal I}(s_a,\mu)-{\cal I}_o]$ selects the domain of integration over $\mu$ and $s_a$. + + To largest order in $N$, the average over the product of determinants factorizes into the product of averages, each of which is given by the same expression depending only on $\mu$: \begin{equation} \begin{aligned} + & \overline{\prod_a^n |\det(\partial\partial H(s_a)+\mu I)| \Theta\left[{\cal I}(s_a,\mu)-{\cal I}_o\right]}\\ \mathcal D(\mu) &=\frac1N\overline{\log|\det(\partial\partial H(s_a)+\mu I)|} =\int d\lambda\,\rho(\lambda+\mu)\log|\lambda| \\ @@ -279,6 +293,9 @@ To largest order in $N$, the average over the product of determinants factorizes \right\} \end{aligned} \end{equation} + +\subsubsection{The gradient factors} + The $\delta$-functions are treated by writing them in the Fourier basis, introducing auxiliary fields $\hat s_a$ and $\hat\beta$, \begin{equation} \prod_a^n\delta(NE-H(s_a))\delta(\partial H(s_a)+\mu s_a) @@ -345,6 +362,27 @@ matrices, we arrive at the form for the complexity where $\hat\beta$, $C$, $R$ and $D$ must be evaluated at extrema of this expression.
+ +{\color{blue} + + + +The complexity is defined as +\begin{equation} + \Sigma(E,\mu)= \frac 1N \log\mathcal N(E,\mu) +\end{equation} +The same information is contained, and better expressed, in its double Legendre +transform $J(\hat \beta, R_d)$: +\begin{equation} + e^{N J(\hat \beta, R_d)} =\int \; d\mu\, dE \; e^{N\left[\Sigma(E,\mu)+R_d\mu-\hat \beta E+{\cal{D}}(\mu)\right]} +\end{equation} +$R_d$ is conjugate to $\mu$ and through it to the index density, while $\hat \beta$ plays the role of an inverse temperature conjugate to the complexity, that has been used since the beginning of the spin-glass field. + + +} + + + \section{Replica ansatz} Based on previous work on the SK model and the equilibrium solution of the |