From 4de9086f8eea30c002eab452b5ac1fdd471ca76d Mon Sep 17 00:00:00 2001
From: Jaron Kent-Dobias <jaron@kent-dobias.com>
Date: Thu, 7 Jul 2022 23:34:24 +0200
Subject: lots of changes

---
 frsb_kac-rice.tex | 301 ++++++++++++++++++++++++++++--------------------------
 1 file changed, 158 insertions(+), 143 deletions(-)

(limited to 'frsb_kac-rice.tex')
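
Note (not part of the commit): a quick numerical sanity check of the index
density $\mathcal I(\mu)$ that this patch rewrites in the Hessian-factors
subsection. This is only a sketch under stated assumptions: it takes the example
model $f(q)=\frac12(q^2+\frac1{16}q^4)$, for which $f''(1)=1.375$, the function
names are illustrative and do not come from the manuscript, and the quadrature
is a plain midpoint rule in standard-library Python.

```python
import math


def semicircle_density(lam, f2):
    """Wigner semicircle of radius sqrt(4*f2), where f2 stands for f''(1)."""
    r2 = 4.0 * f2
    if lam * lam > r2:
        return 0.0
    return math.sqrt(r2 - lam * lam) / (2.0 * math.pi * f2)


def index_density_closed_form(mu, f2):
    """Closed form for I(mu) as written in the patch, clamped outside +-mu_m."""
    mu_m = math.sqrt(4.0 * f2)
    if mu >= mu_m:
        return 0.0  # all eigenvalues positive: minima
    if mu <= -mu_m:
        return 1.0  # all eigenvalues negative: maxima
    root = math.sqrt(mu_m**2 - mu**2)
    return 0.5 - (math.atan(mu / root) + mu * root / mu_m**2) / math.pi


def index_density_numeric(mu, f2, steps=20_000):
    """Midpoint-rule estimate of int_0^inf rho(lam + mu) dlam."""
    mu_m = math.sqrt(4.0 * f2)
    lo, hi = max(mu, -mu_m), mu_m  # the shifted density vanishes outside this window
    if lo >= hi:
        return 0.0
    h = (hi - lo) / steps
    return h * sum(semicircle_density(lo + (i + 0.5) * h, f2) for i in range(steps))


if __name__ == "__main__":
    f2 = 1.375  # assumed: f''(1) for f(q) = (q^2 + q^4/16)/2
    for mu in (-1.5, 0.0, 0.7, 2.0):
        print(mu, index_density_closed_form(mu, f2), index_density_numeric(mu, f2))
```

The two columns printed for each $\mu$ should agree to within the quadrature
error, which would support dropping the factor of $N$ from the index density.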

diff --git a/frsb_kac-rice.tex b/frsb_kac-rice.tex
index c5a288b..5a69d6d 100644
--- a/frsb_kac-rice.tex
+++ b/frsb_kac-rice.tex
@@ -38,7 +38,7 @@ of small temperature for the lowest states, as it should.
 To understand the importance of this computation, consider the following situation. When one solves the problem of spheres in large dimensions, one finds that there is
 a transition at a given temperature to a one-step one step symmetry breaking (1RSB) phase at a Kauzmann temperature,
 and, at a lower temperature,
-another  transition to  a full RSB phase (see \cite{Gross_1985_Mean-field, Gardner_1985_Spin}, the `Gardner ' phase \cite{Charbonneau_2014_Fractal}. 
+another transition to a full RSB phase (see \cite{Gross_1985_Mean-field, Gardner_1985_Spin}), the `Gardner' phase \cite{Charbonneau_2014_Fractal}.
 Now, this transition involves the lowest, equilibrium states. Because they are
 obviously unreachable at any reasonable timescale, an often addressed question
 to ask is: what is the Gardner transition line for higher than equilibrium
@@ -52,51 +52,48 @@ energy and study their number and other properties: the solution involves a
 replica-symmetry breaking scheme that is well-defined, and corresponds directly
 to the topological characteristics of those minima.
 
-
-
 \section{The model}
+\label{sec:model}
 
-Here we consider, for definiteness, the  mixed $p$-spin model, whose Hamiltonian
-\begin{equation}
+For definiteness, we consider the mixed $p$-spin spherical model, whose Hamiltonian
+\begin{equation} \label{eq:hamiltonian}
   H(\mathbf s)=-\sum_p\frac1{p!}\sum_{i_1\cdots i_p}^NJ^{(p)}_{i_1\cdots i_p}s_{i_1}\cdots s_{i_p}
 \end{equation}
 is defined for vectors $\mathbf s\in\mathbb R^N$ confined to the sphere
-$\|\mathbf s\|^2=N$.  The coupling coefficients are taken at random, with zero
-mean and variance $\overline{(J^{(p)})^2}=a_pp!/2N^{p-1}$. This implies that
-the covariance of the energy with itself depends only on the dot product, or
-overlap, between two configurations, and in particular that
+$\|\mathbf s\|^2=N$.  The coupling coefficients $J$ are taken at random, with
+zero mean and variance $\overline{(J^{(p)})^2}=a_pp!/2N^{p-1}$ chosen so that
+the energy is typically extensive. This implies that the covariance of the
+energy with itself depends only on the dot product (or overlap) between two
+configurations. In particular, one has
 \begin{equation}
   \overline{H(\mathbf s_1)H(\mathbf s_2)}=Nf\left(\frac{\mathbf s_1\cdot\mathbf s_2}N\right)
 \end{equation}
-for
+where $f$ is defined by the series
 \begin{equation}
   f(q)=\frac12\sum_pa_pq^p
 \end{equation}
-More generally, one does not need to start with a Hamiltonian like ours and can simply invoke the covariance rule for arbitrary, non-polynomial $f$, as in the `toy model' of M\'ezard and Parisi \cite{Mezard_1992_Manifolds}.
+More generally, one need not start with a Hamiltonian like
+\eqref{eq:hamiltonian} and can instead invoke the covariance rule for arbitrary,
+non-polynomial $f$, as in the `toy model' of M\'ezard and Parisi
+\cite{Mezard_1992_Manifolds}.
 
 These may be thought of as a model of generic Gaussian functions on the sphere.
 To constrain the model to the sphere, we use a Lagrange multiplier $\mu$, with the total energy being
 \begin{equation}
   H(\mathbf s)+\frac\mu2(\|\mathbf s\|^2-N)
 \end{equation}
-
 At any critical point, the gradient and Hessian are
 \begin{align}
   \nabla H(\mathbf s,\mu)=\partial H(\mathbf s)+\mu\mathbf s &&
   \operatorname{Hess}H(\mathbf s,\mu)=\partial\partial H(\mathbf s)+\mu I
 \end{align}
-where $\partial=\frac\partial{\partial\mathbf s}$ always.
-The important observation was made by Bray and Dean  \cite{Bray_2007_Statistics} that gradient and Hessian
-are independent for  random Gaussian disorder.
-The average over disorder
-breaks into a product of two independent averages, one for the gradient factor
-and one for any function of the Hessian, in particular its number of negative eigenvalues, the index $\mathcal I$ of the saddle (see Fyodorov
-\cite{Fyodorov_2007_Replica} for a detailed discussion).. 
-
-
-
-
-
+where $\partial=\frac\partial{\partial\mathbf s}$ always.  Bray and Dean
+\cite{Bray_2007_Statistics} made the important observation that the
+gradient and Hessian are independent for random Gaussian disorder.  The
+average over disorder therefore breaks into a product of two independent averages, one
+for the gradient factor and one for any function of the Hessian, in particular
+its number of negative eigenvalues, the index $\mathcal I$ of the saddle (see
+Fyodorov \cite{Fyodorov_2007_Replica} for a detailed discussion).
 
 \section{Equilibrium}
 
@@ -180,19 +177,22 @@ one. We will see that the complexity of low-energy stationary points in
 Kac--Rice computation is also given by a $(k-1)$-RSB anstaz. Heuristically, this is because
 each stationary point also has no width and therefore overlap one with itself.
 
-
-
-\section{Kac--Rice calculation}
-
-\cite{Auffinger_2012_Random, BenArous_2019_Geometry}
+\section{Landscape complexity}
 
 The stationary points of a function can be counted using the Kac--Rice formula,
 which integrates a over the function's domain a $\delta$-function containing
 the gradient multiplied by the absolute value of the determinant
-\cite{Rice_1939_The, Kac_1943_On}. In addition, we insert a $\delta$-function
-fixing the energy density $E$, and another delta function to count the number of 
-saddles with trace of the Hessian $=\mu^*$. The latter will give us everything we need
-to characterize the saddles, as we shall see later
+\cite{Rice_1939_The, Kac_1943_On}. It gives the number of stationary points $\mathcal N$ as
+\begin{equation}
+  \mathcal N
+    =\int d\mathbf s\, d\mu\,\delta\big(\tfrac12(\|\mathbf s\|^2-N)\big)\,\delta\big(\nabla H(\mathbf s,\mu)\big)\,\big|\det\operatorname{Hess}H(\mathbf s,\mu)\big|
+\end{equation}
+It is more interesting to count stationary points which share certain
+properties, like energy density $E$ or index $\mathcal I$. These properties can
+be fixed by inserting additional $\delta$-functions into the integral. Rather
+than fix the index directly, we fix the trace of the Hessian, which we'll soon
+show is equivalent to fixing the value of $\mu$, and fixing $\mu$ fixes the index
+to within order one. Inserting these $\delta$-functions, we arrive at
 \begin{equation}
   \begin{aligned}
     \mathcal N(E, \mu^*)
@@ -211,23 +211,16 @@ If one averages over $\mathcal N$ and afterward takes its logarithm, one arrives
   \Sigma_\mathrm a(E,\mu^*)
   =\lim_{N\to\infty}\frac1N\log\overline{\mathcal N(E,\mu^*)}
 \end{equation}
-This has been previously computed for the mixed $p$-spin models \cite{BenArous_2019_Geometry}, with the result
-%\begin{equation}
- % \begin{aligned}
-    %\Sigma_\mathrm a(E,\mu)
-    %=-\frac{E^2(f'(1)+f''(1))+2E\mu f'(1)+f(1)\mu^2}{2f(1)(f'(1)+f''(1))-2f'(1)^2}-\frac12\log f'(1)\\
-   % +\operatorname{Re}\left[\frac\mu{\mu+\sqrt{\mu^2-4f''(1)}}
-   %   -\log\left(\frac1{2f''(1)}\left(\mu-\sqrt{\mu^2-4f''(1)}\right)\right)
-  %  \right]
- % \end{aligned}
-%\end{equation}
-The annealed complexity is known to equal the actual (quenched) complexity in
-circumstances where there is at most one level of RSB. This is the case for the
-pure $p$-spin models, or for mixed models where $1/\sqrt{f''(q)}$ is a convex
-function. However, it fails dramatically for models with higher replica
-symmetry breaking. For instance, when $f(q)=\frac12(q^2+\frac1{16}q^4)$, the
-anneal complexity predicts that minima vanish well before the dominant saddles,
-a contradiction for any bounded function, as seen in Fig.~\ref{fig:frsb.complexity}.
+This has been previously computed for the mixed $p$-spin models
+\cite{BenArous_2019_Geometry}.  The annealed complexity is known to equal the
+actual (quenched) complexity in circumstances where there is at most one level
+of replica symmetry breaking in the model's equilibrium. This is the case for
+the pure $p$-spin models, or for mixed models where $1/\sqrt{f''(q)}$ is a
+convex function. However, it fails dramatically for models with higher replica
+symmetry breaking. For instance, when $f(q)=\frac12(q^2+\frac1{16}q^4)$ (a
+model we study in detail later), the annealed complexity predicts that minima
+vanish well before the dominant saddles, a contradiction for any bounded
+function, as seen in Fig.~\ref{fig:frsb.complexity}.
 
 A sometimes more illuminating quantity to consider is the Legendre transform $G$ of the complexity:
 \begin{equation}
@@ -237,16 +230,16 @@ There will be a critical value $\hat \beta_c$ beyond  which the complexity is ze
 this value the measure is split between the lowest $O(1)$ energy states. We shall not study here this  regime that interpolates between the dynamically relevant and the equilibrium states, but just mention that
 it is an interesting object of study.
 
-
-
-
-
-
 \subsection{The replicated problem}
 
+The replicated Kac--Rice formula was introduced by Ros et
+al.~\cite{Ros_2019_Complex}, and its effective action for the mixed $p$-spin
+model has previously been computed by Folena et
+al.~\cite{Folena_2020_Rethinking}. Here we review the derivation.
 
-
-In order to average the complexity over disorder properly, the logarithm must be dealt with. We use the standard replica trick, writing
+In order to average the complexity over disorder, we must deal with the
+logarithm. We use the standard replica trick to convert the logarithm into a
+product, which gives
 \begin{equation}
   \begin{aligned}
     \log\mathcal N(E,\mu^*)
@@ -256,11 +249,11 @@ In order to average the complexity over disorder properly, the logarithm must be
     &\hspace{13pc}  \times\delta\big(NE-H(\mathbf s_a)\big)\delta\big(N\mu^*-\operatorname{Tr}\operatorname{Hess}H(\mathbf s_a,\mu_a)\big)
   \end{aligned}
 \end{equation}
-The replicated Kac--Rice formula was introduced by Ros et al.~\cite{Ros_2019_Complex}, and its
-effective action for the mixed $p$-spin model has previously been computed by
-Folena et al.~\cite{Folena_2020_Rethinking}. Here we review the derivation.
- In practice, we are
-therefore able to write
+As discussed in \S\ref{sec:model}, it has been shown that to the largest order
+in $N$, the Hessian of Gaussian random functions is independent of their
+gradient, once both are conditioned on certain properties. Here, they are only
+related by their shared value of $\mu$. Because of this statistical
+independence, we may write
 \begin{equation}
   \begin{aligned}
     \Sigma(E, \mu^*)
@@ -271,43 +264,46 @@ therefore able to write
     \overline{\prod_a^n |\det\operatorname{Hess}(\mathbf s_a,\mu_a)|\,\delta\big(N\mu^*-\operatorname{Tr}\operatorname{Hess}H(\mathbf s_a,\mu_a)\big)}
   \end{aligned}
 \end{equation}
+which simplifies matters. The averages of the two factors may now be treated separately.
 
 \subsubsection{The Hessian factors}
 
-The spectrum of the matrix $\partial\partial H$ is uncorrelated from the
-gradient. In the large $N$ limit for almost every point and realization of
-disorder a GOE matrix with variance
+The spectrum of the matrix $\partial\partial H(\mathbf s)$ is uncorrelated with the
+gradient. In the large-$N$ limit, for almost every point and realization of
+disorder, this matrix is a GOE matrix with variance
 \begin{equation}
-  \overline{(\partial_i\partial_jH)^2}=\frac1Nf''(1)\delta_{ij}
+  \overline{(\partial_i\partial_jH(\mathbf s))^2}=\frac1Nf''(1)\delta_{ij}
 \end{equation}
-and therefore its spectrum is given by the Wigner semicircle with radius $\sqrt{4f''(1)}$, or
+Therefore in that limit its spectrum is given by the Wigner semicircle with radius $\sqrt{4f''(1)}$, or
 \begin{equation}
-  \rho(\lambda)=\frac1{2\pi f''(1)}\sqrt{4f''(1)-\lambda^2}
-\end{equation}
-and the spectrum of $\operatorname{Hess}H$ is this shifted by $\mu$, or $\rho(\lambda+\mu)$.
-The parameter $\mu$ thus fixes the spectrum of the Hessian, when $\mu$ is taken to be within
-the range $\pm\sqrt{4f''(1)}=\pm\mu_m$, the critical (or in fact, any) points  have
-index density
+  \rho(\lambda)=\begin{cases}
+    \frac1{2\pi f''(1)}\sqrt{4f''(1)-\lambda^2} & \lambda^2\leq 4f''(1) \\
+    0 & \text{otherwise}
+  \end{cases}
+\end{equation}
+The spectrum of the Hessian $\operatorname{Hess}H(\mathbf s,\mu)$ is the same
+semicircle shifted by $\mu$, or $\rho(\lambda+\mu)$.  The parameter $\mu$ thus
+fixes the spectrum of the Hessian: when $\mu$ is taken to be within the range
+$\pm\mu_m\equiv\pm\sqrt{4f''(1)}$, the critical points have index density
 \begin{equation}
   \mathcal I(\mu)=\int_0^\infty d\lambda\,\rho(\lambda+\mu)
-  =N\left\{\frac12-\frac1\pi\left[
+  =\frac12-\frac1\pi\left[
     \arctan\left(\frac\mu{\sqrt{\mu_m^2-\mu^2}}\right)
     +\frac\mu{\mu_m^2}\sqrt{\mu_m^2-\mu^2}
   \right]
-  \right\}
 \end{equation}
-When $\mu>\mu_m$, the critical
-points are minima whose sloppiest eigenvalue is $\mu-\mu_m$.
+When $\mu>\mu_m$, the critical points are minima whose sloppiest eigenvalue is
+$\mu-\mu_m$, and when $\mu=\mu_m$, the critical points are marginal minima.
 
 
-To largest order in $N$, the average over the product of determinants factorizes into the product of averages, each of which is given by the same expression depending only on $\mu$:
+To largest order in $N$, the average over the product of determinants
+factorizes into the product of averages, each of which is given by the same
+expression depending only on $\mu$ \cite{Ros_2019_Complex}. We therefore find
 \begin{equation}
-  \begin{aligned}
-    \overline{\prod_a^n |\det\operatorname{Hess}(\mathbf s_a,\mu_a)|\,\delta\big(N\mu^*-\operatorname{Tr}\operatorname{Hess}H(\mathbf s_a,\mu_a)\big)}
-   \rightarrow e^{nN{\cal D}(\mu^*)}\prod_a^n\delta(\mu_a-\mu^*)
-  \end{aligned}
+  \overline{\prod_a^n |\det\operatorname{Hess}H(\mathbf s_a,\mu_a)|\,\delta\big(N\mu^*-\operatorname{Tr}\operatorname{Hess}H(\mathbf s_a,\mu_a)\big)}
+ \rightarrow \prod_a^ne^{N{\cal D}(\mu_a)}\delta(\mu_a-\mu^*)
 \end{equation}
-with
+where the function $\mathcal D$ is defined by
 \begin{equation}
   \begin{aligned}
     \mathcal D(\mu)
@@ -319,43 +315,54 @@ with
   \right\}
   \end{aligned}
 \end{equation}
+It follows that by fixing the trace of the Hessian, we have effectively fixed
+the value of $\mu$ in all replicas to $\mu^*$, and therefore the index as well.
 
-
-What we have described is the {\em typical} spectrum for given $\mu$. What about the deviations of the spectrum -- we are particularly interested in the number of negative eigenvalues -- at given $\mu$. The result is well known qualitatively: there are two possibilities:\\
-$\bullet$ For $|\mu|>\mu_m$ there is the possibility of a finite number of eigenvalues of
-the second derivative matrix
+What we have described is the {\em typical} spectrum for given $\mu$. What
+about deviations of the spectrum (we are particularly interested in the
+number of negative eigenvalues) at given $\mu$? The result is well known
+qualitatively, and there are two possibilities:
+\begin{itemize}
+  \item For $|\mu|>\mu_m$ there is the possibility of a finite number of eigenvalues of
+    the second derivative matrix
+  \item The second
+\end{itemize}
 
 \subsubsection{The gradient factors}
 
-The $\delta$-functions are treated by writing them in the Fourier basis, introducing auxiliary fields $\hat{\mathbf s}_a$ and $\hat\beta$,
+The $\delta$-functions are treated by writing them in the Fourier basis.
+Introducing auxiliary fields $\hat{\mathbf s}_a$ and $\hat\beta$, for each
+replica one writes
 \begin{equation}
-  \delta\big(\tfrac12(\|\mathbf s_a\|^2-N)\big)\,\delta\big(\nabla H(\mathbf s_a,\mu^*)\big)\delta(NE-H(\mathbf s_a))
-  =\int\frac{d\hat\mu}{2\pi}\,\frac{d\hat\beta}{2\pi}\,\frac{d\hat{\mathbf s}_a}{2\pi}
-  e^{\frac12\hat\mu(\|\mathbf s_a\|^2-N)+\hat\beta(NE-H(\mathbf s_a))+i\hat{\mathbf s}_a\cdot(\partial H(\mathbf s_a)+\mu^*\mathbf s_a)}
-\end{equation}
-$\hat \beta$ is a parameter conjugate to the state energies, i.e. playing the
-role of an inverse temperature for the metastable states. The average over
-disorder can now be taken for the pieces which depend on $H$, and since
-everything is Gaussian it gives
+  \begin{aligned}
+    &\delta\big(\tfrac12(\|\mathbf s_a\|^2-N)\big)\,\delta\big(\nabla H(\mathbf s_a,\mu^*)\big)\delta(NE-H(\mathbf s_a)) \\
+    &\hspace{12pc}=\int\frac{d\hat\mu}{2\pi}\,\frac{d\hat\beta}{2\pi}\,\frac{d\hat{\mathbf s}_a}{(2\pi)^N}
+      e^{\frac12\hat\mu(\|\mathbf s_a\|^2-N)+\hat\beta(NE-H(\mathbf s_a))+i\hat{\mathbf s}_a\cdot(\partial H(\mathbf s_a)+\mu^*\mathbf s_a)}
+  \end{aligned}
+\end{equation}
+Anticipating the Parisi-style solution, we don't label $\hat\mu$ or $\hat\beta$
+with replica indices, since replica vectors won't be broken in the scheme.  The
+average over disorder can now be taken for the pieces which depend explicitly
+on the Hamiltonian, and since everything is Gaussian this gives
 \begin{equation}
   \begin{aligned}
     \overline{
-      \exp\left\{
+      \exp\left[
         \sum_a^n(i\hat {\mathbf s}_a\cdot\partial_a-\hat\beta)H(s_a)
-      \right\}
+      \right]
     }
-    &=\exp\left\{
+    &=\exp\left[
         \frac12\sum_{ab}^n
         (i\hat{\mathbf s}_a\cdot\partial_a-\hat\beta)
         (i\hat{\mathbf s}_b\cdot\partial_b-\hat\beta)
         \overline{H(\mathbf s_a)H(\mathbf s_b)}
-      \right\} \\
-    &=\exp\left\{
+      \right] \\
+    &=\exp\left[
         \frac N2\sum_{ab}^n
         (i\hat{\mathbf s}_a\cdot\partial_a-\hat\beta)
         (i\hat{\mathbf s}_b\cdot\partial_b-\hat\beta)
         f\left(\frac{\mathbf s_a\cdot\mathbf s_b}N\right)
-      \right\} \\
+      \right] \\
     &\hspace{-14em}=\exp\left\{
         \frac N2\sum_{ab}^n
         \left[
@@ -374,17 +381,17 @@ We introduce new matrix fields
   R_{ab}=-i\frac1N\hat{\mathbf s}_a\cdot{\mathbf s}_b &&
   D_{ab}=\frac1N\hat{\mathbf s}_a\cdot\hat{\mathbf s}_b
 \end{align}
-Their physical meaning is explained in \S\ref{sec:interpretation}.
-By substituting these parameters into the expressions above and then making a
-change of variables in the integration from $s_a$ and $\hat s_a$ to these three
-matrices, we arrive at the form for the complexity
+Their physical meaning is explained in \S\ref{sec:interpretation}.  By
+substituting these parameters into the expressions above and then making a
+change of variables in the integration from $\mathbf s_a$ and $\hat{\mathbf
+s}_a$ to these three matrices, we arrive at the form for the complexity
 \begin{equation}
   \begin{aligned}
-    &\Sigma(E,\mu^*)
-    =\mathcal D(\mu^*)+\hat\beta E-\frac12\hat\mu+\\
-    &\lim_{n\to0}\frac1n\left(
-      \frac12\hat\mu\operatorname{Tr}C-\mu^*\operatorname{Tr}R
-      +\frac12\sum_{ab}\left[
+    \Sigma(E,\mu^*)
+    &=\mathcal D(\mu^*)+\hat\beta E-\frac12\hat\mu+
+    \lim_{n\to0}\frac1n\left(
+      \frac12\hat\mu\operatorname{Tr}C-\mu^*\operatorname{Tr}R\right.\\
+    &\hspace{2em}\left.+\frac12\sum_{ab}\left[
         \hat\beta^2f(C_{ab})+(2\hat\beta R_{ab}-D_{ab})f'(C_{ab})
         +R_{ab}^2f''(C_{ab})
       \right]
@@ -392,22 +399,21 @@ matrices, we arrive at the form for the complexity
     \right)
   \end{aligned}
 \end{equation}
-where $\hat\mu$, $\hat\beta$, $C$, $R$ and $D$ must be evaluated at extrema of this
-expression.
-
-
+where $\hat\mu$, $\hat\beta$, $C$, $R$ and $D$ must be evaluated at the extrema
+of this expression which minimize the complexity. Extremizing with respect to
+$\hat\mu$ is not difficult, and results in setting the diagonal of $C$ to one,
+fixing the spherical constraint. Maintaining $\hat\mu$ in the complexity is
+useful for writing down the extremal conditions, but when convenient we will
+drop the dependence.
 
-
-
-
-The same information is contained, and better expressed in its Legendre
+The same information is contained but better expressed in the Legendre
 transform
 \begin{equation}
   \begin{aligned}
     &G(\hat \beta,\mu^*)
-    =\mathcal D(\mu^*)-\frac12\hat\mu\\
+    =\mathcal D(\mu^*)+\\
     &\lim_{n\to0}\frac1n\left(
-      \frac12\hat\mu\operatorname{Tr}C-\mu^*\operatorname{Tr}R
+      -\mu^*\operatorname{Tr}R
       +\frac12\sum_{ab}\left[
         \hat\beta^2f(C_{ab})+(2\hat\beta R_{ab}-D_{ab})f'(C_{ab})
         +R_{ab}^2f''(C_{ab})
@@ -416,17 +422,16 @@ transform
     \right)
   \end{aligned}
 \end{equation}
-Denoting $r_d \equiv \frac 1 n {\mbox Tr} R$, we have the double Legendre transform $K(\hat \beta, r_d)$:
+Denoting $r_d \equiv \frac 1 n \operatorname{Tr} R$, we can write down the double Legendre transform $K(\hat \beta, r_d)$:
 \begin{equation}
-  e^{N K(\hat \beta, r_d)} =\int \, d\mu^* \,dE \, e^{N\left\{\Sigma(E,\mu^*)+r_d\mu^* -\hat\beta E -\mathcal D(\mu^*)\right\}}
+  e^{N K(\hat \beta, r_d)} =\int\,dE\,d\mu^* e^{N\left\{\Sigma(E,\mu^*) -\hat\beta E+r_d\mu^* -\mathcal D(\mu^*)\right\}}
 \end{equation}
 given by
 \begin{equation}
   \begin{aligned}
     &K(\hat \beta,r_d)
    = \lim_{n\to0}\frac1n\left(
-     \frac{\hat\mu}2\operatorname{Tr}(C-I)
-      +\frac12\sum_{ab}\left[
+      \frac12\sum_{ab}\left[
         \hat\beta^2f(C_{ab})+(2\hat\beta R_{ab}-D_{ab})f'(C_{ab})
         +R_{ab}^2f''(C_{ab})
       \right]
@@ -434,10 +439,12 @@ given by
     \right)
   \end{aligned}
 \end{equation}
-$r_d$ is conjugate to $\mu^*$ and through it to the index density, while $\hat
-\beta$ plays the role of an inverse temperature conjugate to the complexity,
-that has been used since the beginning of the spin-glass field. In this way
-$K(\hat \beta,r_d)$ contains all the information about saddle densities.
+where the diagonal of $C$ is fixed to one and the diagonal of $R$ is fixed to
+$r_d$.  The variable $r_d$ is conjugate to $\mu^*$ and through it to the index
+density, while $\hat \beta$ plays the role of an inverse temperature conjugate
+to the complexity, which has been used since the beginning of the spin-glass
+field. In this way $K(\hat \beta,r_d)$ contains all the information about
+saddle densities.
 
 
 
@@ -446,11 +453,12 @@ $K(\hat \beta,r_d)$ contains all the information about saddle densities.
 
 \section{Replica ansatz}
 
-Based on previous work on the SK model and the equilibrium solution of the
-spherical model, we expect $C$, and $R$ and $D$ to be hierarchical matrices,
-i.e., to follow Parisi's scheme. In the end, when the limit of $n\to0$ is
-taken, each can be represented in the canonical way by its diagonal and a
-continuous function on the domain $[0,1]$ which parameterizes each of its rows, with
+Based on previous work on the Sherrington--Kirkpatrick model and the
+equilibrium solution of the spherical model, we expect $C$, $R$, and $D$ to
+be hierarchical matrices, i.e., to follow Parisi's scheme. In the end, when the
+limit of $n\to0$ is taken, each can be represented in the canonical way by its
+diagonal and a continuous function on the domain $[0,1]$ which parameterizes
+each of its rows, with
 \begin{align}
   C\;\leftrightarrow\;[c_d, c(x)]
   &&
@@ -483,21 +491,28 @@ where $\odot$ denotes the Hadamard product, or the componentwise product. Equati
   D=f'(C)^{-1}-RC^{-1}R
 \end{equation}
 
-In addition to these equations, one is also often interested in maximizing the complexity as a function of $\mu^*$, to find the dominant or most common type of stationary points. These are given by the condition
+In addition to these equations, we often want to maximize the complexity as a
+function of $\mu^*$, to find the most common type of stationary points. These
+are given by the condition
 \begin{equation} \label{eq:cond.mu}
   0=\frac{\partial\Sigma}{\partial\mu^*}
   =\mathcal D'(\mu^*)-r_d
 \end{equation}
-Since $\mathcal D(\mu^*)$ is effectively a piecewise function, with different forms for $\mu^*$ greater or less than $\mu_m$, there are two regimes. When $\mu^*>\mu_m$, the critical points are minima, and \eqref{eq:cond.mu} implies
+Since $\mathcal D(\mu^*)$ is effectively a piecewise function, with different
+forms for $\mu^*$ greater or less than $\mu_m$, there are two regimes. When
+$\mu^*>\mu_m$ and the critical points are minima, \eqref{eq:cond.mu} implies
 \begin{equation} \label{eq:mu.minima}
   \mu^*=\frac1{r_d}+r_df''(1)
 \end{equation}
-When $\mu^*<\mu_m$, they are saddles, and
+When $\mu^*<\mu_m$ and the critical points are saddles, it implies
 \begin{equation} \label{eq:mu.saddles}
   \mu^*=2f''(1)r_d
 \end{equation}
 
-We will find it often useful to have the extremal conditions in a form without matrix inverses, both for numerics at finite $k$-RSB and for expanding in the continuous case. By simple manipulations, the matrix equations can be written as
+It is often useful to have the extremal conditions in a form without matrix
+inverses, both for numerics at finite $k$-RSB and for expanding in the
+continuous case. By simple manipulations, the matrix equations can be written
+as
 \begin{align}
   0&=\left[\hat\beta^2f'(C)+(2\hat\beta R-D)\odot f''(C)+R\odot R\odot f'''(C)+\hat\mu I\right]C+f'(C)D  \\
   0&=\left[\hat\beta f'(C)+R\odot f''(C)-\mu^*I\right]C+f'(C)R \\
@@ -512,7 +527,7 @@ operations.
 \section{Supersymmetric solution}
 
 The Kac--Rice problem has an approximate supersymmetry, which is found when the
-absolute value of the determinant is neglected. When this is done, the
+absolute value of the determinant is neglected; this supersymmetry has been studied in great detail in the complexity of the Thouless--Anderson--Palmer free energy \cite{Annibale_2003_The, Annibale_2003_Supersymmetric, Annibale_2004_Coexistence}. When the absolute value is neglected, the
 determinant can be represented by an integral over Grassmann variables, which
 yields a complexity depending on `bosons' and `fermions' that share the
 supersymmetry. The Ward identities associated with the supersymmetry imply
@@ -548,7 +563,7 @@ Understanding that $R$ is diagonal, this implies
   \mu^*=\frac1{r_d}+r_df''(1)
 \end{equation}
 which is precisely the condition \eqref{eq:mu.minima}. Therefore, \emph{the
-supersymmetric solution only counts dominant minima.}
+supersymmetric solution only counts the most common minima} \cite{Annibale_2004_Coexistence}.
 
 Inserting the supersymmetric ansatz $D=\hat\beta R$ and $R=r_dI$, one gets
 \begin{equation} \label{eq:diagonal.action}
@@ -576,7 +591,7 @@ $\Sigma(E_0,\mu^*)=0$, gives
 which is precisely \eqref{eq:ground.state.free.energy} with $r_d=z$,
 $\hat\beta=\tilde\beta$, and $C=\tilde Q$.
 
-{\em We arrive at one of the main results of our paper: a $(k-1)$-RSB ansatz in
+{\em Therefore a $(k-1)$-RSB ansatz in
 Kac--Rice will predict the correct ground state energy for a model whose
 equilibrium state at small temperatures is $k$-RSB } Moreover, there is an
 exact correspondance between the saddle parameters of each.  If the equilibrium
@@ -629,7 +644,7 @@ supersymmetric complexity
     =\mathcal D(\mu^*)
     +
       \hat\beta E-\mu^* r_d
-      +\frac12\hat\beta r_df'(1)+\frac12r_d^2f''(1)+\frac12\log r_d^2
+      +\frac12\left(\hat\beta r_df'(1)+r_d^2f''(1)+\log r_d^2\right)
       +\frac12\int_0^1dq\,\left(
         \hat\beta^2f''(q)\chi(q)+\frac1{\chi(q)+r_d/\hat\beta}
       \right)
@@ -649,7 +664,7 @@ given by
 is correct. This is only correct if it satisfies the boundary condition
 $\chi(1)=0$, which requires $r_d=f''(1)^{-1/2}$. This in turn implies
 $\mu^*=\frac1{r_d}+f''(1)r_d=\sqrt{4f''(1)}=\mu_m$. Therefore, the FRSB ground state
-is exactly marginal! It is straightforward to check that these conditions are
+is exactly marginal. It is straightforward to check that these conditions are
 indeed a saddle of the complexity.
 
 This has several implications. First, other than the ground state, there are
-- 
cgit v1.2.3-70-g09d2