1 files changed, 125 insertions, 119 deletions
diff --git a/frsb_kac_new.tex b/frsb_kac_new.tex
index e6b604e..3db21e4 100644
--- a/frsb_kac_new.tex
+++ b/frsb_kac_new.tex
@@ -65,6 +65,15 @@ and therefore its spectrum is given by the Wigner semicircle with radius $\sqrt{
 \end{equation}
 and the spectrum of $\operatorname{Hess}H$ is this shifted by $\mu$, or $\rho(\lambda+\mu)$.
 
+The parameter $\mu$ fixes the spectrum of the Hessian. By manipulating it, one
+can decide to find the complexity of saddles of a certain macroscopic index, or
+of minima with a certain harmonic stiffness. When $\mu$ is taken to be within
+the range $\pm\sqrt{4f''(1)}=\pm\mu_m$, the critical points are constrained to have
+index $\mathcal I=\frac12N(1-\mu/\mu_m)$. When $\mu>\mu_m$, the critical
+points are minima whose sloppiest eigenvalue is $\mu-\mu_m$. Finally,
+when $\mu=\mu_m$, the critical points are marginal minima.
+
+
 The parameter $\mu$ fixes the spectrum of the hessian. When it is an integration variable,
 and one restricts the domain of all integrations to compute saddles of a certain macroscopic index, or
 of minima with a certain harmonic stiffness, its value is the `softest' mode that adapts to change the Hessian \cite{Fyodorov_2007_Replica}. When it is fixed, then the restriction of the index of saddles is `payed' by the realization of the eigenvalues of the Hessian, usually a
@@ -123,7 +132,7 @@ which must be extremized over the matrix $Q$. When the solution is a Parisi matr
 \begin{equation}
   \chi(q)=\int_1^qdq'\,\int_0^{q'}dq''\,P(q'')
 \end{equation}
-Since it is the double integral of a probability distribution, $\chi$ must be convex, monotonically decreasing and have $\chi(1)=0$, $\chi'(1)=1$. The free energy can be written as a functional over $\chi$ as
+Since it is the double integral of a probability distribution, $\chi$ must be concave, monotonically decreasing and have $\chi(1)=0$, $\chi'(1)=1$. The free energy can be written as a functional over $\chi$ as
 \begin{equation}
   \beta F=-1-\log2\pi-\frac12\int dq\,\left(\beta^2f''(q)\chi(q)+\frac1{\chi(q)}\right)
 \end{equation}
@@ -174,7 +183,7 @@ $q_1,\ldots,q_{k-1}$, which gives
   \right)
 \end{equation}
 In the continuum case, this is
-\begin{equation} \label{eq:ground.state.free.energy}
+\begin{equation} \label{eq:ground.state.free.energy.cont}
   \lim_{\beta\to\infty}\tilde\beta F
   =-\frac12z\tilde\beta f'(1)-\frac12\int dq\left(
     \tilde\beta^2f''(q)\tilde\chi(q)+\frac1{\tilde\chi(q)+\tilde\beta z^{-1}}
@@ -192,66 +201,108 @@ each stationary point also has no width and therefore overlap one with itself.
 
 
 
-\section{Kac-Rice}
+\section{Kac--Rice}
 
 \cite{Auffinger_2012_Random, BenArous_2019_Geometry}
 
+The stationary points of a function can be counted using the Kac--Rice formula,
+which integrates over the function's domain a $\delta$-function containing
+the gradient multiplied by the absolute value of the determinant
+\cite{Rice_1939_The, Kac_1943_On}. In addition, we insert a $\delta$-function
+fixing the energy density $\epsilon$, giving the number of stationary points at
+energy $\epsilon$ and radial reaction $\mu$ as
 \begin{equation}
   \mathcal N(\epsilon, \mu)
   =\int ds\,\delta(N\epsilon-H(s))\delta(\partial H(s)+\mu s)|\det(\partial\partial H(s)+\mu I)|
 \end{equation}
+This number will typically be exponential in $N$. In order to find typical
+counts when disorder is averaged, we will want to average its logarithm
+instead, which is known as the complexity:
 \begin{equation}
-  \Sigma(\epsilon,\mu)=\frac1N\log\mathcal N(\epsilon, \mu)
+  \Sigma(\epsilon,\mu)=\lim_{N\to\infty}\frac1N\overline{\log\mathcal N(\epsilon, \mu)}
 \end{equation}
+The radial reaction $\mu$, which acts like a kind of `mass' term, takes a
+fixed value here, which means that the complexity is for a given energy
+density and Hessian spectrum. This will turn out to be important when we
+discriminate between counting all solutions, or selecting those of a given
+index, for example minima. The complexity of solutions without fixing $\mu$ is
+given by maximizing the complexity as a function of $\mu$.
 
- The `mass' term $\mu$ may take a fixed value, or it may be an integration constant,
-for example fixing the spherical constraint.
-This will turn out to be important when we discriminate between counting all solutions, or selecting those of a given index, for example minima
-
+If one averages over $\mathcal N$ and afterward takes its logarithm, one arrives at the annealed complexity
+\begin{equation}
+  \Sigma_\mathrm a(\epsilon,\mu)
+  =\lim_{N\to\infty}\frac1N\log\overline{\mathcal N(\epsilon,\mu)}
+\end{equation}
+This has been previously computed for the mixed $p$-spin models \cite{BenArous_2019_Geometry}, with the result
+\begin{equation}
+  \begin{aligned}
+    \Sigma_\mathrm a(\epsilon,\mu)
+    =-\frac{\epsilon^2(f'(1)+f''(1))+2\epsilon\mu f'(1)+f(1)\mu^2}{2f(1)(f'(1)+f''(1))-2f'(1)^2}-\frac12\log f'(1)\\
+    +\operatorname{Re}\left[\frac\mu{\mu+\sqrt{\mu^2-4f''(1)}}
+      -\log\left(\frac1{2f''(1)}\left(\mu-\sqrt{\mu^2-4f''(1)}\right)\right)
+    \right]
+  \end{aligned}
+\end{equation}
+The annealed complexity is known to equal the actual (quenched) complexity in
+circumstances where there is at most one level of RSB. This is the case for the
+pure $p$-spin models, or for mixed models where $1/\sqrt{f''(q)}$ is a convex
+function. However, it fails dramatically for models with higher replica
+symmetry breaking. For instance, when $f(q)=\frac12(q^2+\frac1{16}q^4)$, the
+annealed complexity predicts that minima vanish well before the dominant saddles,
+a contradiction for any bounded function, as seen in Fig.~\ref{fig:frsb.complexity}.
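+
+As a check on the direction in which the annealed calculation can fail, and
+assuming both limits exist, Jensen's inequality applied to the concave
+logarithm gives
+\begin{equation}
+  \overline{\log\mathcal N(\epsilon,\mu)}\leq\log\overline{\mathcal N(\epsilon,\mu)}
+  \qquad\Longrightarrow\qquad
+  \Sigma(\epsilon,\mu)\leq\Sigma_\mathrm a(\epsilon,\mu)
+\end{equation}
+so the annealed complexity can only overestimate the quenched one.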
 
-\subsection{The replicated  problem}
+\subsection{The replicated problem}
 
-\cite{Ros_2019_Complex}
-\cite{Folena_2020_Rethinking}
+The replicated Kac--Rice formula was introduced by Ros et al.~\cite{Ros_2019_Complex}, and its
+effective action for the mixed $p$-spin model has previously been computed by
+Folena et al.~\cite{Folena_2020_Rethinking}. Here we review the derivation.
 
+In order to average the complexity over disorder properly, the logarithm must be dealt with. We use the standard replica trick, writing
 \begin{equation}
   \begin{aligned}
-    \Sigma(\epsilon, \mu)
-    &=\frac1N\lim_{n\to0}\frac\partial{\partial n}\mathcal N^n(\epsilon) \\
-    &=\frac1N\lim_{n\to0}\frac\partial{\partial n}\int\prod_a^n ds_a\,\delta(N\epsilon-H(s_a))\delta(\partial H(s_a)+\mu s_a)|\det(\partial\partial H(s_a)+\mu I)|
+    \log\mathcal N(\epsilon,\mu)
+    &=\lim_{n\to0}\frac\partial{\partial n}\mathcal N^n(\epsilon,\mu) \\
+    &=\lim_{n\to0}\frac\partial{\partial n}\int\prod_a^n ds_a\,\delta(N\epsilon-H(s_a))\delta(\partial H(s_a)+\mu s_a)|\det(\partial\partial H(s_a)+\mu I)|
   \end{aligned}
 \end{equation}
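+The first equality is the usual replica identity, which can be checked by
+writing $\mathcal N^n=e^{n\log\mathcal N}$ and differentiating:
+\begin{equation}
+  \frac\partial{\partial n}\mathcal N^n
+  =\log\mathcal N\,e^{n\log\mathcal N}
+  \xrightarrow{n\to0}\log\mathcal N
+\end{equation}
+As is standard in replica calculations, the integral representation in the
+second line holds for positive integer $n$, and its continuation to $n\to0$ is
+an assumption rather than something derived here.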
 
-{\bf
-As noted by Bray and Dean  \cite{Bray_2007_Statistics}, gradient and Hessian are independent
-for a Gaussian distribution, and
-the average over disorder breaks into a product of two independent averages, one for the gradient factor and one for the determinant. The integration of all variables, including the disorder in the last factor, may be restricted to the domain such that the matrix $\partial\partial H(s_a)-\mu I$ has  a specified number of negative eigenvalues (the index {\cal{I}} of the saddle), 
-(see Fyodorov \cite{Fyodorov_2007_Replica} for a detailed discussion) }
-
-{ $\hat \beta$ is a parameter conjugate to the state energies, i.e. playing the role of an inverse temperature for the metastable states. }
+As noted by Bray and Dean \cite{Bray_2007_Statistics}, gradient and Hessian
+are independent for a random Gaussian function, and the average over disorder
+breaks into a product of two independent averages, one for the gradient factor
+and one for the determinant. The integration of all variables, including the
+disorder in the last factor, may be restricted to the domain such that the
+matrix $\partial\partial H(s_a)+\mu I$ has a specified number of negative
+eigenvalues (the index $\mathcal I$ of the saddle); see Fyodorov
+\cite{Fyodorov_2007_Replica} for a detailed discussion. In practice, we are
+therefore able to write
 \begin{equation}
   \begin{aligned}
-    \overline{\Sigma(\epsilon, \mu)}
-    &=\frac1N\lim_{n\to0}\frac\partial{\partial n}\int\left(\prod_a^nds_a\right)\,\overline{\prod_a^n \delta(N\epsilon-H(s_a))\delta(\partial H(s_a)+\mu s_a)}
+    \Sigma(\epsilon, \mu)
+    &=\lim_{N\to\infty}\frac1N\lim_{n\to0}\frac\partial{\partial n}\int\left(\prod_a^nds_a\right)\,\overline{\prod_a^n \delta(N\epsilon-H(s_a))\delta(\partial H(s_a)+\mu s_a)}
     \times
     \overline{\prod_a^n |\det(\partial\partial H(s_a)+\mu I)|}
   \end{aligned}
 \end{equation}
-
-
-
-
-
-
-
-
-
+To leading order in $N$, the average over the product of determinants factorizes into the product of averages, each of which is given by the same expression depending only on $\mu$:
+\begin{equation}
+  \begin{aligned}
+    \mathcal D(\mu)
+    &=\frac1N\overline{\log|\det(\partial\partial H(s_a)+\mu I)|}
+    =\int d\lambda\,\rho(\lambda+\mu)\log|\lambda| \\
+    &=\operatorname{Re}\left\{
+    \frac12\left(1+\frac\mu{2f''(1)}\left(\mu-\sqrt{\mu^2-4f''(1)}\right)\right)
+    -\log\left(\frac1{2f''(1)}\left(\mu-\sqrt{\mu^2-4f''(1)}\right)\right)
+  \right\}
+  \end{aligned}
+\end{equation}
+The $\delta$-functions are treated by writing them in the Fourier basis, introducing auxiliary fields $\hat s_a$ and $\hat\beta$,
 \begin{equation}
   \prod_a^n\delta(N\epsilon-H(s_a))\delta(\partial H(s_a)+\mu s_a)
   =\int \frac{d\hat\beta}{2\pi}\prod_a^n\frac{d\hat s_a}{2\pi}
     e^{\hat\beta(N\epsilon-H(s_a))+i\hat s_a\cdot(\partial H(s_a)+\mu s_a)}
 \end{equation}
-
+$\hat \beta$ is a parameter conjugate to the state energies, i.e. playing the
+role of an inverse temperature for the metastable states. The average over disorder can now be taken, and since everything is Gaussian it gives
 \begin{equation}
   \begin{aligned}
     \overline{
@@ -271,7 +322,7 @@ the average over disorder breaks into a product of two independent averages, one
         (i\hat s_b\cdot\partial_b-\hat\beta)
         f\left(\frac{s_a\cdot s_b}N\right)
       \right\} \\
-    &\hspace{-13em}\exp\left\{
+    &\hspace{-14em}=\exp\left\{
         \frac N2\sum_{ab}^n
         \left[
           \hat\beta^2f\left(\frac{s_a\cdot s_b}N\right)
@@ -283,104 +334,64 @@ the average over disorder breaks into a product of two independent averages, one
   \end{aligned}
 \end{equation}
 
-We introduce the parameters:
+We introduce new fields
 \begin{align}
   Q_{ab}=\frac1Ns_a\cdot s_b &&
   R_{ab}=-i\frac1N\hat s_a\cdot s_b &&
   D_{ab}=\frac1N\hat s_a\cdot\hat s_b
 \end{align}
-The meaning of $R_{ab}$ is that of a response of replica $a$ to a linear
-field in replica $b$:
+$Q_{ab}$ is the overlap between spins belonging to different replicas. The
+meaning of $R_{ab}$ is that of a response of replica $a$ to a linear field in
+replica $b$:
 \begin{equation}
     R_{ab} = \frac 1 N \sum_i \overline{\frac{\delta s_i^a}{\delta h_i^b}}
 \end{equation}
 The $D$ may similarly be seen as the variation of the complexity with respect to a random field.
 
-In terms of these parameters, we have
+By substituting these parameters into the expressions above and then making a change of variables in the integration from $s_a$ and $\hat s_a$ to these three matrices, we arrive at the form for the complexity
 \begin{equation}
   \begin{aligned}
-    S
-    =\mathcal D(\mu)+\hat\beta\epsilon+\lim_{n\to0}\frac1n\left(
+    &\Sigma(\epsilon,\mu)
+    =\mathcal D(\mu)+\hat\beta\epsilon+\\
+    &\lim_{n\to0}\frac1n\left(
       -\mu\sum_a^nR_{aa}
       +\frac12\sum_{ab}\left[
-        \hat\beta^2f(Q_{ab})+2\hat\beta R_{ab}f'(Q_{ab})
-        -D_{ab}f'(Q_{ab})+R_{ab}^2f''(Q_{ab})
-      \right] \right. \\ \left.
-    +\frac12\log\det\begin{bmatrix}Q&iR\\iR&D\end{bmatrix}
-    \right)
-  \end{aligned}
-\end{equation}where
-
-\begin{equation}
-  \begin{aligned}
-    \mathcal D(\mu)
-    &=\frac1N\overline{\log|\det(\partial\partial H(s_a)+\mu I)|}
-    =\int d\lambda\,\rho(\lambda+\mu)\log|\lambda| \\
-    &=\operatorname{Re}\left\{
-    \frac12\left(1+\frac\mu{2f''(1)}\left(\mu\pm\sqrt{\mu^2-4f''(1)}\right)\right)
-    -\log\left(\frac1{2f''(1)}\left(\mu\pm\sqrt{\mu^2-4f''(1)}\right)\right)
-  \right\}
-  \end{aligned}
-\end{equation}
-
-
-
-Following the usual steps (Appendix B) we arrive at the replicated action:
-
-\begin{equation}
-  \begin{aligned}
-    S
-    =\mathcal D(\mu)+\hat\beta\epsilon+\lim_{n\to0}\frac1n\left(
-      -\mu\sum_a^nR_{aa}
-      +\frac12\sum_{ab}\left[
-        \hat\beta^2f(Q_{ab})+2\hat\beta R_{ab}f'(Q_{ab})
-        -D_{ab}f'(Q_{ab})+R_{ab}^2f''(Q_{ab})
-      \right] \right. \\ \left.
+        \hat\beta^2f(Q_{ab})+(2\hat\beta R_{ab}-D_{ab})f'(Q_{ab})
+        +R_{ab}^2f''(Q_{ab})
+      \right]
     +\frac12\log\det\begin{bmatrix}Q&iR\\iR&D\end{bmatrix}
     \right)
   \end{aligned}
 \end{equation}
+where $\hat\beta$, $Q$, $R$ and $D$ must be evaluated at extrema of this
+expression. With $Q$, $R$, and $D$ distinct replica matrices, this is
+potentially quite challenging.
 
 \section{Replica ansatz}
 
-We shall make the following ansatz:
-
-\begin{eqnarray}\label{ansatz}
-  Q_{ab}&=& \text{a Parisi matrix} \nonumber \\
-  R_{ab}&=&R_d \; \delta_{ab} \nonumber\\
-  D_{ab}&=& D_d \; \delta_{ab}
-  \label{diagonal}\end{eqnarray}
+We shall make the following ansatz for the saddle point:
+\begin{align}\label{ansatz}
+  Q_{ab}= \text{a Parisi matrix} &&
+  R_{ab}=R_d \delta_{ab} &&
+  D_{ab}= D_d \delta_{ab}
+\end{align}
 From what we have seen above, this means that replica $a$ is insensitive to
 a small field applied to replica $b$ if $a \neq b$, a property related to ultrametricity. A similar situation happens in quantum replicated systems, 
 with time appearing only on the diagonal terms: see Appendix C for details.
 
-From its very definition, it is easy to see just perturbing the equations 
-with a field that $R_d$ is the trace of the inverse Hessian, as one expect indeed of a response. Putting: 
-\begin{equation}
-   \mathcal D(\mu)
-    =\frac1N\overline{\log|\det(\partial\partial H(s_a)-\mu I)|}
-    =\int d\lambda\,\rho(\lambda-\mu)\log|\lambda| 
-\end{equation}
-this means that:
+From its very definition, it is easy to see, by perturbing the equations with a
+field, that $R_d$ is the trace of the inverse Hessian, as one indeed expects of a response. This means that
 \begin{equation}
- R_d =    \mathcal D(\mu)'
+ R_d =    \mathcal D'(\mu)
 \end{equation}
 
 \subsection{Solution}
 
 
-Insert the diagonal ansatz \cite{diagonal} one gets
-REINSTATED THIS---------
-\begin{equation}
-  \begin{aligned}
-    \mathcal D(\mu)
-    &=\operatorname{Re}\left\{\frac12\left(1+\frac\mu{2f''(1)}\left(\mu\pm\sqrt{\mu^2-4f''(1)}\right)\right)-\log\left(\frac1{2f''(1)}\left(\mu\pm\sqrt{\mu^2-4f''(1)}\right)\right)\right\}
-  \end{aligned}
-\end{equation}
---------------------------------------------------
+Inserting the diagonal ansatz \eqref{ansatz}, one gets
 \begin{equation} \label{eq:diagonal.action}
   \begin{aligned}
-    S
+    \Sigma(\epsilon,\mu)
     =\mathcal D(\mu)
     +
       \hat\beta\epsilon-\mu R_d
@@ -389,15 +400,7 @@ REINSTATED THIS---------
       +\frac12\lim_{n\to0}\frac1n\left(\hat\beta^2\sum_{ab}f(Q_{ab})+\log\det((D_d/R_d^2)Q+I)\right)
   \end{aligned}
 \end{equation}
-
-
-
-
-Using standard manipulations (Appendix B), one finds
-
-
-
-
+Using standard manipulations (Appendix B), one also finds a continuous version
 \begin{equation} \label{eq:functional.action}
   \begin{aligned}
     S
@@ -411,6 +414,7 @@ Using standard manipulations (Appendix B), one finds
       \right)
   \end{aligned}
 \end{equation}
+where $\chi(q)$ is defined with respect to $Q$ exactly as in the equilibrium case.
 Note the close similarity of this action to the equilibrium replica one, at finite temperature.
 \begin{equation}
   \beta F=-\frac12\lim_{n\to0}\frac1n\left(\beta^2\sum_{ab}f(Q_{ab})+\log\det Q\right)-1-\log2\pi
@@ -427,13 +431,14 @@ to $\mu$. This gives
 \end{equation}
 as expected.
 To take the derivative, we must resolve the real part inside the definition of
-$\mathcal D$. When saddles dominate,, $\mu<\mu_m$, and
+$\mathcal D$. When saddles dominate, $\mu<\mu_m$, and
 \begin{equation}
   \mathcal D(\mu)=\frac12+\frac12\log f''(1)+\frac{\mu^2}{4f''(1)}
 \end{equation}
-It follows that the dominant saddles have $\mu=2f''(1)R_d$. Their index 
-is thus ${\cal{I}}=  $  THIS NEEDS MATHEMATICA
-
+It follows that the dominant saddles have $\mu=2f''(1)R_d$. Their index
+is thus $\mathcal I=\frac12N(1-R_d\sqrt{f''(1)})$. If
+$R_d\sqrt{f''(1)}>1$ then we were wrong to assume that saddles dominate, and
+the most numerous saddles will be found just above $\mu=\mu_m$.
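+
+As a short worked step, the stationarity condition $R_d=\mathcal D'(\mu)$ found
+above, applied to this branch of $\mathcal D$, reads
+\begin{equation}
+  R_d
+  =\frac\partial{\partial\mu}\left(\frac12+\frac12\log f''(1)+\frac{\mu^2}{4f''(1)}\right)
+  =\frac\mu{2f''(1)}
+\end{equation}
+and with $\mu_m=\sqrt{4f''(1)}$ the ratio entering the index is
+$\mu/\mu_m=2f''(1)R_d/(2\sqrt{f''(1)})=R_d\sqrt{f''(1)}$.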
 
 \subsubsection{Minima}
 \label{sec:counting.minima}
@@ -450,6 +455,8 @@ When minima dominate, $\mu>\mu_m$ and all the roots inside $\mathcal D(\mu)$ are
 \begin{center}
 \includegraphics[width=13cm]{frsb_complexity-2.pdf}
 \end{center}
+\caption{
+} \label{fig:frsb.complexity}
 \end{figure}
 
 \subsubsection{Recovering the replica ground state}
@@ -459,10 +466,10 @@ The ground state energy corresponds to that where the complexity of dominant sta
 Consider the extremum problem of \eqref{eq:diagonal.action} with respect to $R_d$ and $D_d$. This gives the equations
 \begin{align}
   0
-  &=\frac{\partial S}{\partial D_d}
+  &=\frac{\partial\Sigma}{\partial D_d}
   =-\frac12f'(1)+\frac12\frac1{R_d^2}\lim_{n\to0}\frac1n\operatorname{Tr}((D_d/R_d^2)Q+I)^{-1}Q \label{eq:saddle.d}\\
   0
-  &=\frac{\partial S}{\partial R_d}
+  &=\frac{\partial\Sigma}{\partial R_d}
   =-\mu+\hat\beta f'(1)+R_df''(1)+\frac1{R_d}-\frac{D_d}{R_d^3}\lim_{n\to0}\frac1n\operatorname{Tr}((D_d/R_d^2)Q+I)^{-1}Q \label{eq:saddle.r}
 \end{align}
 Adding $2(D_d/R_d)$ times \eqref{eq:saddle.d} to \eqref{eq:saddle.r} and multiplying by $R_d$ gives
@@ -482,7 +489,7 @@ When the dominant stationary points are saddles, we can use the $\mu$ from \S\re
 \end{equation}
 If saddles dominate all the way to the ground state, then they must become marginal minima at the ground state. Therefore at the ground state energy $\mu=\mu_m=\sqrt{4f''(1)}$, and once again $R_d\hat\beta-D_d=0$.
 
-In any case, at the ground state $D_d=R_d\hat\beta$. Substuting this into the action, and also substituting the optimal $\mu$ for saddles or minima, and taking $\Sigma(\epsilon_0,\mu^*)=0$, gives
+In any case, at the ground state $D_d=R_d\hat\beta$. Substituting this into the action, and also substituting the optimal $\mu$ for saddles or minima, and taking $\Sigma(\epsilon_0,\mu^*)=0$, gives
 \begin{equation}
   \hat\beta\epsilon_0
   =-\frac12R_d\hat\beta f'(1)-\frac12\lim_{n\to0}\frac1n\left(
@@ -490,7 +497,6 @@ In any case, at the ground state $D_d=R_d\hat\beta$. Substuting this into the ac
       +\log\det(\hat\beta R_d^{-1} Q+I)
   \right)
 \end{equation}
-
 which is precisely \eqref{eq:ground.state.free.energy} with $R_d=z$ and $\hat\beta=\tilde\beta$. 
 
 {\em We arrive at one of the main results of our paper: a $(k-1)$-RSB ansatz in Kac--Rice will predict the correct ground state energy for a model whose equilibrium state at small temperatures is $k$-RSB }
@@ -510,7 +516,7 @@ The frozen phase for a given index ${\cal{I}}$ is the one for values of $\hat \b
  
  The complexity of that index is zero, and we are looking at the lowest saddles 
 in the problem, a question that to the best of our knowledge has not been discussed
-in the Kac-Rice context -- for good reason, since the complexity - the original motivation - is zero.
+in the Kac--Rice context -- for good reason, since the complexity (the original motivation) is zero.
 However, our ansatz tells us something of the actual organization of  the lowest saddles of each index in phase space. 
 
 
@@ -634,7 +640,7 @@ $F$ is a $k-1$ RSB ansatz with all eigenvalues scaled by $y$ and shifted by $z$.
 
 
 
-\section{ RSB for the Kac-Rice integral}
+\section{ RSB for the Kac--Rice integral}
 
 
 \subsection{Solution}
@@ -753,7 +759,7 @@ where
   \right\}
 \end{equation}
 and $\Lambda$ is the space of functions $\chi:[0,1]\to[0,1]$ which are
-monotonically decreasing, convex, and have $\chi(1)=0$ and $\chi'(1)=-1$.
+monotonically decreasing, concave, and have $\chi(1)=0$ and $\chi'(1)=-1$.
 If there is more than one extremum of this function, choose the one with the
 smallest value of $\Sigma$. The sign of the root inside $\mathcal D(\mu)$ is
 negative for $\mu>0$ and positive for $\mu<0$.