\documentclass[submission, Phys]{SciPost}

\usepackage[utf8]{inputenc} % why not type "Bézout" with unicode?
\usepackage[T1]{fontenc} % vector fonts plz
\usepackage{amsmath,amssymb,latexsym,graphicx}
\usepackage{newtxtext,newtxmath} % Times for PR
\usepackage{anyfontsize,authblk}

\usepackage{tikz}
\usetikzlibrary{calc,fadings,decorations.pathreplacing,calligraphy}

\hypersetup{
  colorlinks=true,
  urlcolor={blue!50!black},
  citecolor={blue!50!black},
  filecolor={blue!50!black},
  linkcolor={blue!50!black}
}

\newcommand\pgfmathsinandcos[3]{%
  \pgfmathsetmacro#1{sin(#3)}%
  \pgfmathsetmacro#2{cos(#3)}%
}
\newcommand\LongitudePlane[3][current plane]{%
  \pgfmathsinandcos\sinEl\cosEl{#2} % elevation
  \pgfmathsinandcos\sint\cost{#3} % azimuth
  \tikzset{#1/.style={cm={\cost,\sint*\sinEl,0,\cosEl,(0,0)}}}
}

\newcommand\LatitudePlane[3][current plane]{%
  \pgfmathsinandcos\sinEl\cosEl{#2} % elevation
  \pgfmathsinandcos\sint\cost{#3} % latitude
  \pgfmathsetmacro\yshift{\cosEl*\sint}
  \tikzset{#1/.style={cm={\cost,0,0,\cost*\sinEl,(0,\yshift)}}} %
}
\newcommand\NewLatitudePlane[4][current plane]{%
  \pgfmathsinandcos\sinEl\cosEl{#3} % elevation
  \pgfmathsinandcos\sint\cost{#4} % latitude
  \pgfmathsetmacro\yshift{#2*\cosEl*\sint}
  \tikzset{#1/.style={cm={\cost,0,0,\cost*\sinEl,(0,\yshift)}}} %
}
\newcommand\TangentPlane[5][current plane]{%
  \pgfmathsinandcos\sint\cost{#3} % elevation
  \pgfmathsinandcos\sinb\cosb{-#4} % latitude
  \pgfmathsinandcos\sina\cosa{#5+90} % azimuth
  \pgfmathsetmacro\xshift{\cosb*\sina}
  \pgfmathsetmacro\yshift{-\cost*\sinb-\cosa*\cosb*\sint}
  \tikzset{#1/.style={cm={-\sina*\sinb,\cosa*\sinb*\sint-\cost*\cosb,\cosa,\sina*\sint,(#2*\xshift,#2*\yshift)}}} %
}
\newcommand\DrawLongitudeCircle[2][1]{
  \LongitudePlane{\angEl}{#2}
  \tikzset{current plane/.prefix style={scale=#1}}
   % angle of "visibility"
  \pgfmathsetmacro\angVis{atan(sin(#2)*cos(\angEl)/sin(\angEl))} %
  \draw[current plane] (\angVis:1) arc (\angVis:\angVis+180:1);
  \draw[current plane,dashed] (\angVis-180:1) arc (\angVis-180:\angVis:1);
}
\newcommand\DrawLatitudeCircle[2][1]{
  \LatitudePlane{\angEl}{#2}
  \tikzset{current plane/.prefix style={scale=#1}}
  \pgfmathsetmacro\sinVis{sin(#2)/cos(#2)*sin(\angEl)/cos(\angEl)}
  % angle of "visibility"
  \pgfmathsetmacro\angVis{asin(min(1,max(\sinVis,-1)))}
  \draw[current plane] (\angVis:1) arc (\angVis:-\angVis-180:1);
  \draw[current plane,dashed] (180-\angVis:1) arc (180-\angVis:\angVis:1);
}

\tikzset{%
  >=latex, % option for nice arrows
  inner sep=0pt,%
  outer sep=2pt%
}

\begin{document}

\title{
  Arrangement of nearby minima and saddles in the mixed spherical energy landscapes
}

\author{Jaron Kent-Dobias}
\affil{Istituto Nazionale di Fisica Nucleare, Sezione di Roma I}

\maketitle
\begin{abstract}
  The mixed spherical models were recently found to violate long-held
  assumptions about mean-field glassy dynamics. In particular, the threshold
  energy, where most stationary points are marginal and that in the simpler
  pure models attracts long-time dynamics, seems to lose significance. Here,
  we compute the typical distribution of stationary points relative to each
  other in mixed models with a replica symmetric complexity. We examine the
  stability of nearby points, accounting for the presence of an isolated
  eigenvalue in their spectrum due to their proximity. Despite finding rich
  structure not present in the pure models, we find nothing that distinguishes
  the points that do attract the dynamics. Instead, we find new geometric
  significance of the old threshold energy, and invalidate pictures of
  the arrangement of most marginal inherent states into a continuous manifold.
\end{abstract}

\tableofcontents

\section{Introduction}

Many systems exhibit ``glassiness,'' characterized by rapid slowing of dynamics
over a short parameter interval. These include actual (structural) glasses,
spin glasses, certain inference and optimization problems, and more 
\cite{Stein_1995_Broken, Krzakala_2007_Landscape, Altieri_2021_Properties, Yang_2023_Stochastic}. Glassiness
is qualitatively understood to arise from structure of an energy or cost
landscape, whether due to the proliferation of metastable states, or to the
raising of barriers which cause effective dynamic constraints
\cite{Cavagna_2001_Fragile, Stillinger_2013_Glass,
Kirkpatrick_2015_Colloquium}. However, in most models there is no known
quantitative correspondence between these landscape properties and the dynamic
behavior they are purported to describe.

There is such a correspondence in one of the simplest mean-field models of
glasses: in the pure spherical models, the dynamic transition corresponds with
the energy level at which thermodynamic states attached to marginal inherent
states\footnote{
  For this paper, which focuses on minima, we will take \emph{state} to mean
  \emph{minimum} or equivalently \emph{inherent state} and not a thermodynamic
  state. Any discussion of thermodynamic or equilibrium states will explicitly
  specify this.
} dominate the free energy
\cite{Castellani_2005_Spin-glass}. At that level, called the \emph{threshold
energy} $E_\mathrm{th}$, slices of the landscape at fixed energy undergo a
percolation transition. In fact, this threshold energy is significant in other
ways: it attracts the long-time dynamics after quenches in temperature to below
the dynamical transition from any starting temperature
\cite{Biroli_1999_Dynamical, Sellke_2023_The}. All of this can be understood in terms of the
landscape structure, namely the statistics of stationary points of the energy.

In slightly less simple models, the mixed spherical models, the story changes.
In these models there is a range of energies with exponentially many marginal minima. It
was believed that the energy level at which these marginal minima are the most
common type of stationary point would play the same role as the threshold
energy in the pure models (in fact we will refer to this energy level as the
threshold energy in the mixed models). However, recent work has shown that
this is incorrect. Quenches from different starting temperatures above the
dynamical transition temperature result in dynamics that approach marginal minima at different
energy levels, and the purported threshold does not attract the long-time
dynamics in most cases \cite{Folena_2020_Rethinking, Folena_2021_Gradient}.

This paper studies the two-point structure of stationary points in the mixed
spherical models, or their arrangement relative to each other, previously
studied only for the pure models \cite{Ros_2019_Complexity}. This gives various
kinds of information. When one point is a minimum, we see what other kinds of
minima are nearby, and the height of the saddle points that separate them.
When both points are saddles, we see the arrangement of barriers relative to
each other.

More specifically, one \emph{reference} point is fixed with certain properties.
Then, we compute the logarithm of the number of other points constrained to lie
at a fixed overlap from the reference point. This proximity constraint
selects points with atypical properties. For
instance, when the required overlap is made sufficiently
large, typical constrained points tend to have an isolated eigenvalue pulled out
of their spectrum, and its associated eigenvector is correlated with the
direction of the reference point. Without the proximity constraint, such an
isolated eigenvalue amounts to a large deviation from the spectrum of
typical stationary points.

In order to address the open problem of what energies attract the long-time dynamics,
we focus on the neighborhoods of the marginal minima, to see if there is
anything interesting to differentiate sets of them from each other. Though we
find rich structure in this population, their properties pivot around the
debunked threshold energy, and the apparent attractors of long-time dynamics
are not distinguished. Moreover, we show that the usual picture of a
marginal `manifold' of inherent states separated by subextensive barriers \cite{Kurchan_1996_Phase} is only true
at the threshold energy, while at other energies typical marginal minima are far apart
and separated by extensive barriers. Therefore, with respect to the problem of
dynamics this paper merely deepens the outstanding issues.

In \S\ref{sec:model} we define the mixed spherical models and outline some of
their important properties. In the following section \S\ref{sec:results}, we go
over the main results of this work and their interpretation. In
\S\ref{sec:complexity} we outline the calculation of the two-point complexity
and its expansion in the near-neighborhood of a reference point. Details of the
calculation of the complexity are in Appendix \ref{sec:complexity-details}. In
\S\ref{sec:eigenvalue} we introduce a method for calculating the value of an
isolated eigenvalue in the spectrum at stationary points, and outline the
calculation for the mixed spherical models. Details of this calculation are in
Appendix \ref{sec:eigenvalue-details}. Finally, we draw some conclusions about
our results in \S\ref{sec:conclusion}. For the interested reader, a comparison
between the two-point complexity and the Franz--Parisi potential in the mixed
spherical models is presented in Appendix \ref{sec:franz-parisi}.

\section{The model}
\label{sec:model}

The mixed spherical models are defined by the Hamiltonian
\begin{equation} \label{eq:hamiltonian}
  H(\mathbf s)=-\sum_p\frac1{p!}\sum_{i_1\cdots i_p}^NJ^{(p)}_{i_1\cdots i_p}s_{i_1}\cdots s_{i_p}
\end{equation}
where the vectors $\mathbf s\in\mathbb R^N$ are confined to the sphere
$\|\mathbf s\|^2=N$ \cite{Kirkpatrick_1987_p-spin-interaction, Crisanti_1992_The, Crisanti_2004_Spherical}.  The coupling coefficients $J$ are fully-connected and random, with
zero mean and variance $\overline{(J^{(p)})^2}=a_pp!/2N^{p-1}$ scaled so that
the energy is typically extensive. The overbar denotes an average
over the coefficients $J$. The factors $a_p$ in the variances are freely chosen
constants that define the particular model. For instance, the `pure'
$p$-spin model has $a_{p'}=\delta_{p'p}$. This class of models encompasses all
statistically isotropic Gaussian random Hamiltonians defined on the
hypersphere.

The covariance between the energies at two different points is a function of their overlap, or normalized dot product:
\begin{equation} \label{eq:covariance}
  \overline{H(\mathbf s_1)H(\mathbf s_2)}=Nf\left(\frac{\mathbf s_1\cdot\mathbf s_2}N\right)
\end{equation}
where the function $f$ is defined from the coefficients $a_p$ by
\begin{equation}
  f(q)=\frac12\sum_pa_pq^p
\end{equation}
The choice of $f$ has significant effect on the form of equilibrium order in the model, and
likewise influences the geometry of stationary points \cite{Crisanti_2004_Spherical, Crisanti_2006_Spherical}.
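
As a consistency check, the covariance \eqref{eq:covariance} can be recovered from the Hamiltonian \eqref{eq:hamiltonian} in a short sketch. We assume here, as is common but not spelled out above, that the couplings $J^{(p)}$ are symmetric under permutations of their indices and otherwise independent, and we neglect terms with repeated indices, which are subleading in $N$. Terms with different $p$ do not contribute because the couplings have zero mean. Each choice of $i_1\cdots i_p$ in $H(\mathbf s_1)$ then pairs with the $p!$ permutations of the same indices in $H(\mathbf s_2)$, giving
\begin{equation}
  \overline{H(\mathbf s_1)H(\mathbf s_2)}
  \simeq\sum_p\frac1{(p!)^2}\,p!\,\frac{a_pp!}{2N^{p-1}}(\mathbf s_1\cdot\mathbf s_2)^p
  =N\,\frac12\sum_pa_p\left(\frac{\mathbf s_1\cdot\mathbf s_2}N\right)^p
\end{equation}
which is $Nf(\mathbf s_1\cdot\mathbf s_2/N)$ with $f(q)=\frac12\sum_pa_pq^p$.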

To enforce the spherical constraint at stationary points, we make use of a Lagrange multiplier $\omega$. This results in the extremal problem
\begin{equation}
  H(\mathbf s)+\frac\omega2(\|\mathbf s\|^2-N)
\end{equation}
The gradient and Hessian at a stationary point are then
\begin{align}
  \nabla H(\mathbf s,\omega)=\partial H(\mathbf s)+\omega\mathbf s
  &&
  \operatorname{Hess}H(\mathbf s,\omega)=\partial\partial H(\mathbf s)+\omega I
\end{align}
where $\partial=\frac\partial{\partial\mathbf s}$ denotes the derivative with respect to $\mathbf s$.

\begin{figure}
  \begin{minipage}[t]{0.31\textwidth}
    \centering
    \includegraphics{figs/spectrum_saddle.pdf}\\
    \footnotesize\textbf{extensive saddle}
  \end{minipage}
  \hfill
  \begin{minipage}[t]{0.31\textwidth}
    \centering
    \includegraphics{figs/spectrum_marginal.pdf}\\
    \footnotesize\textbf{marginal minimum}
  \end{minipage}
  \hfill
  \begin{minipage}[t]{0.31\textwidth}
    \centering
    \includegraphics{figs/spectrum_minimum.pdf}\\
    \footnotesize\textbf{stable minimum}
  \end{minipage}

  \vspace{1em}

  \begin{minipage}[t]{0.31\textwidth}
    \centering
    \includegraphics{figs/spectrum_saddle_2.pdf}\\
    \footnotesize\textbf{oriented saddle}
  \end{minipage}
  \hfill
  \begin{minipage}[t]{0.31\textwidth}
    \centering
    \includegraphics{figs/spectrum_minimum_2.pdf}\\
    \footnotesize\textbf{oriented minimum}
  \end{minipage}
  \hfill
  \begin{minipage}[t]{0.31\textwidth}
    \centering
    \includegraphics{figs/spectrum_saddle_3.pdf}\\
    \footnotesize\textbf{oriented index-one saddle}
  \end{minipage}

  \caption{
    Illustration of the interpretation of the stability $\mu$, which sets the
    location of the center of the eigenvalue spectrum. In the top row we have
    spectra without an isolated eigenvalue. \textbf{(a)} $\mu<\mu_\mathrm m$,
    there are an extensive number of downward directions, and the associated
    point is an \emph{extensive saddle}. \textbf{(b)} $\mu=\mu_\mathrm m$ and
    we have a \emph{marginal minimum} with asymptotically flat directions.
    \textbf{(c)} $\mu>\mu_\mathrm m$, all eigenvalues are positive, and the
    point is a \emph{stable minimum}. On the bottom we show what happens in the
    presence of an isolated eigenvalue. \textbf{(d)} One eigenvalue leaves the
    bulk spectrum of a saddle point and it remains a saddle point, but now with
    an eigenvector correlated with the orientation of the reference vector, so
    we call this an \emph{oriented saddle}. \textbf{(e)} The same happens for
    a minimum, and we can call it an \emph{oriented minimum}. \textbf{(f)} One
    eigenvalue outside a positive bulk spectrum is negative, destabilizing what
    would otherwise have been a stable minimum, producing an \emph{oriented
    index-one saddle}.
  } \label{fig:spectra}
\end{figure}

When we count stationary points, we classify them by certain properties. One of
these is the energy density $E=H/N$. We will also fix the \emph{stability}
$\mu=\frac1N\operatorname{Tr}\operatorname{Hess}H$, also known as the radial
reaction. In the mixed spherical models, all stationary points have a
semicircle law for the eigenvalue spectrum of their Hessians, each with the
same width $\mu_\mathrm m$, but whose center is shifted by different amounts. Fixing the
stability $\mu$ fixes this shift, and therefore fixes the spectrum of the
associated stationary point. When the stability is smaller than the width of
the spectrum, or $\mu<\mu_\mathrm m$, there are an extensive number of negative
eigenvalues, and the stationary point is a saddle with a large index whose
value is set by the stability. When the stability is greater than the width of
the spectrum, or $\mu>\mu_\mathrm m$, the semicircle distribution lies only
over positive eigenvalues, and unless an isolated eigenvalue leaves the
semicircle and becomes negative, the stationary point is a minimum. Finally,
when $\mu=\mu_\mathrm m$, the edge of the semicircle touches zero and we have
marginal minima. Fig.~\ref{fig:spectra} shows what different values of the
stability imply about the spectrum at stationary points.
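
To make this concrete, the bulk spectrum can be written explicitly. The following is a sketch under the assumption that `width' above refers to the radius of the semicircle, which is the reading consistent with marginality at $\mu=\mu_\mathrm m$. The symbols $\rho$ for the eigenvalue density and $\mathcal I$ for the fraction of negative eigenvalues are introduced here only for illustration. The density of eigenvalues $\lambda$ is
\begin{equation}
  \rho(\lambda)=\frac2{\pi\mu_\mathrm m^2}\sqrt{\mu_\mathrm m^2-(\lambda-\mu)^2}
  \qquad
  |\lambda-\mu|\leq\mu_\mathrm m
\end{equation}
whose lower edge sits at $\mu-\mu_\mathrm m$. For $|\mu|\leq\mu_\mathrm m$ the fraction of negative eigenvalues is
\begin{equation}
  \mathcal I(\mu)=\int_{\mu-\mu_\mathrm m}^0d\lambda\,\rho(\lambda)
  =\frac12-\frac1\pi\left[\frac\mu{\mu_\mathrm m}\sqrt{1-\frac{\mu^2}{\mu_\mathrm m^2}}+\arcsin\frac\mu{\mu_\mathrm m}\right]
\end{equation}
which equals $\frac12$ at $\mu=0$ and vanishes at $\mu=\mu_\mathrm m$. The index $N\mathcal I(\mu)$ is therefore extensive whenever $\mu<\mu_\mathrm m$ and goes continuously to zero at the marginal stability.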

In the pure spherical models, $E$ and $\mu$ cannot be fixed separately: fixing
one uniquely fixes the other. This property leads to the great simplification
of these models: marginal minima exist \emph{only} at one energy level, and
therefore only that energy has the possibility of trapping the long-time
dynamics. In generic mixed models this is not the case and at a given energy
level $E$ there are many stabilities for which exponentially many marginal
points are found. We define the threshold energy $E_\mathrm{th}$ as the energy
at which most stationary points are marginal.\footnote{
  Note that crucially this is
\emph{not} the energy that has the most marginal stationary points: this energy
level with the largest number of marginal points has even more saddles of
extensive index. So $E_\mathrm{th}$ contains a \emph{minority} of the
marginal points, even if those marginal points are the \emph{majority} of
stationary points with energy $E_\mathrm{th}$.
}
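
The rigid link between $E$ and $\mu$ in the pure models can be seen in a few lines. This is a sketch that assumes the trace of $\partial\partial H$ contributes only subextensively, so that the stability coincides with the Lagrange multiplier, $\mu\simeq\omega$, at large $N$. A pure $p$-spin Hamiltonian is homogeneous of degree $p$, so that $\mathbf s\cdot\partial H(\mathbf s)=pH(\mathbf s)$. Contracting the stationarity condition $\partial H(\mathbf s)+\omega\mathbf s=0$ with $\mathbf s$ and using $\|\mathbf s\|^2=N$ gives
\begin{equation}
  0=\mathbf s\cdot\partial H(\mathbf s)+\omega\|\mathbf s\|^2=pNE+\omega N
  \qquad\Longrightarrow\qquad
  \mu\simeq\omega=-pE
\end{equation}
In a mixed model the same steps give $\omega=-\sum_ppE_p$, where $E_p$ (a symbol introduced here) is the contribution of the $p$-body term to the energy density. Since the energy fixes only the sum $E=\sum_pE_p$, the stability can vary at fixed $E$.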

\begin{figure}
  \centering
  \includegraphics{figs/single_complexity.pdf}
  \caption{
    Plot of the complexity (logarithm of the number of stationary points) for
    the $3+4$ mixed spherical model studied in this paper. Energies and stabilities
    of interest are marked, including the ground state energy
    $E_\mathrm{gs}$, the marginal stability $\mu_\mathrm
    m$, and the threshold energy $E_\mathrm{th}$. The blue line shows the location
    of the most common type of stationary point at each energy level. The
    highlighted red region shows the approximate range of minima that attract
    aging dynamics from a quench to zero temperature found in
    \cite{Folena_2020_Rethinking}.
  } \label{fig:complexities}
\end{figure}

In this study, our numeric examples are drawn exclusively from the model studied in
\cite{Folena_2020_Rethinking}, whose covariance function is given by
\begin{equation}
  f_{3+4}(q)=\frac12\big(q^3+q^4\big)
\end{equation}
This model is chosen for two reasons. First, the ordering of its stationary points is like that of the pure spherical models, without any clustering \cite{Kent-Dobias_2023_When}. Second, properties
of its long-time dynamics have been extensively studied and are available for comparison. Though the numeric examples all come from the $3+4$ model, the results apply to any model sharing its simple order. The annealed one-point
complexity of these models was calculated in \cite{BenArous_2019_Geometry}, and
for this model the annealed calculation is expected to be correct.
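
For reference, the derivatives of this covariance function at $q=1$ follow directly from its definition:
\begin{align}
  f_{3+4}(1)=1
  &&
  f_{3+4}'(1)=\frac12(3+4)=\frac72
  &&
  f_{3+4}''(1)=\frac12(6+12)=9
\end{align}
If one assumes the relation $\mu_\mathrm m=\sqrt{4f''(1)}$ between the marginal stability and the covariance function, which is not derived in this section, these values give $\mu_\mathrm m=6$ for the $3+4$ model. This is consistent with the reference stability $\mu_0=6.1>\mu_\mathrm m$ quoted in Fig.~\ref{fig:min.neighborhood}.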

The one-point complexity of the $3+4$ model as a function of energy $E_0$ and
stability $\mu_0$ is plotted in Fig.~\ref{fig:complexities}. The same plot for a
pure $p$-spin model would consist of only a line, because $E_0$ and $\mu_0$ cannot
be varied independently. Several important features of the complexity are
highlighted: the energies of the ground state $E_\mathrm{gs}$ and the threshold
$E_\mathrm{th}$, along with the line of marginal stability $\mu_\mathrm m$. Along
the line of marginal stability, energies that attract aging dynamics from
different temperatures are highlighted in red. One might expect some feature to mark the ends of this range, something that would differentiate
marginal minima that support aging dynamics from those that do not. As
indicated in the introduction, the two-point complexity we study in this paper
does not produce such a result.

\section{Results}
\label{sec:results}

Our results stem from the two-point complexity $\Sigma_{12}$, which is defined
as the logarithm of the number of stationary points with energy $E_1$ and
stability $\mu_1$ that lie at an overlap $q$ with a typical reference
stationary point whose energy is $E_0$ and stability is $\mu_0$. When the
complexity is positive, there are exponentially many stationary points with the
given properties conditioned on the existence of the reference point. When it is
zero, there are only order-one such points, and when it is negative there are
exponentially few (effectively, none). In the examples below, the boundary of
zero complexity between exponentially many and few points is often highlighted, with parameter regions that have negative complexity having no color.
Finally, as a result of the condition that the counted points lie with a given
proximity to the reference point, their spectrum can be modified by the
presence of an isolated eigenvalue, which can change their stability as shown in
Fig.~\ref{fig:spectra}.
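
Schematically, and with notation introduced here only for illustration, let $\mathcal N_{12}$ denote the number of stationary points with energy $E_1$ and stability $\mu_1$ at overlap $q$ with a given reference point. Assuming the usual normalization of the complexity per degree of freedom, the three regimes described above correspond to the sign of the exponent in
\begin{equation}
  \mathcal N_{12}\sim e^{N\Sigma_{12}(E_1,\mu_1,q\mid E_0,\mu_0)}
  \qquad
  \begin{cases}
    \Sigma_{12}>0 & \text{exponentially many points} \\
    \Sigma_{12}=0 & \text{order-one points} \\
    \Sigma_{12}<0 & \text{exponentially few points}
  \end{cases}
\end{equation}
This sketch suppresses the averages over the disorder and over reference points that enter the precise definition.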

\subsection{Barriers around deep states}

\begin{figure}
  \includegraphics[scale=0.95]{figs/gapped_min_energy.pdf}
  \raisebox{5em}{\includegraphics[scale=0.95]{figs/gapped_min_energy_legend.pdf}}
  \hfill
  \includegraphics[scale=0.95]{figs/gapped_min_stability.pdf}
  \raisebox{5em}{\includegraphics[scale=0.95]{figs/gapped_min_stability_legend.pdf}}

  \caption{
    The neighborhood of a reference minimum with $E_0=-1.71865<E_\mathrm{th}$
    and $\mu_0=6.1>\mu_\mathrm m$. \textbf{Left:} The most common type of
    stationary point lying at fixed overlap $q$ and energy $E_1$ from the
    reference minimum. The black line gives the smallest or largest energies
    where neighbors can be found at a given overlap. \textbf{Right:} The most
    common type of stationary point lying at fixed overlap $q$ and stability
    $\mu_1$ from the reference minimum. Note that this describes a different
    set of stationary points than shown in the left plot. On both plots, the
    shading of the right-hand part depicts the state of an isolated eigenvalue
    in the spectrum of the Hessian of the neighboring points. Those more
    lightly shaded are points with an isolated eigenvalue that does not change
    their stability, e.g., corresponding with Fig.~\ref{fig:spectra}(d--e). The more
    darkly shaded are oriented index-one saddles, e.g., corresponding with
    Fig.~\ref{fig:spectra}(f). The dot-dashed line on the left plot depicts the
    trajectory of the solid line on the right plot, and the dot-dashed line on
    the right plot depicts the trajectory of the solid line on the left plot.
    In this case, the points lying nearest to the reference minimum are saddles
    with $\mu<\mu_\mathrm m$, but with energies smaller than the threshold
    energy, which makes them an atypical population of saddles.
  } \label{fig:min.neighborhood}
\end{figure}

If the reference configuration is a stable minimum, then there is a
gap in the overlap between it and its nearest neighbors in
configuration space. We can characterize these neighbors as a function of their
overlap and stability, with one example seen in
Fig.~\ref{fig:min.neighborhood}. For stable minima, the qualitative results for
the pure $p$-spin model continue to hold, with some small modifications
\cite{Ros_2019_Complexity}.

The largest difference between the pure and mixed models is the decoupling of nearby
stable points from nearby low-energy points: in the pure $p$-spin model, the
left and right panels of Fig.~\ref{fig:min.neighborhood} would be identical up
to a constant factor $-p$. Instead, for mixed models they differ substantially,
as evidenced by the dot-dashed lines in both plots that in the pure models
would correspond exactly with the solid lines. One significant consequence of
this difference is the diminished significance of the threshold energy
$E_\mathrm{th}$: in the left panel, marginal minima of the threshold energy are
the most common among unconstrained points with $q=0$, but marginal minima of lower energy
are more common in the vicinity of the example reference minimum. In the pure models, all marginal minima are at the threshold energy.

The nearest neighbor points are always oriented saddles, sometimes
saddles with an extensive index and sometimes index-one saddles
(Fig.~\ref{fig:spectra}(d, f)). This is a result of the persistent presence of a negative isolated eigenvalue in the spectrum of the nearest neighbors, e.g., as in the shaded regions of Fig.~\ref{fig:min.neighborhood}. As in the pure models, the minimum energy and
maximum stability of nearby points are not monotonic in $q$: there is a range of
overlap where the minimum energy of neighbors decreases with overlap. The
transition from stable minima to index-one saddles along the line of lowest-energy states
occurs at its local minimum, another similarity with the pure models
\cite{Ros_2019_Complexity}. This point is interesting because it describes the properties of the nearest stable minima to the reference point. It is not clear why the local minimum of the boundary coincides with this point or what implications that has for behavior.

\subsection{Grouping of saddle points}

At stabilities lower than the marginal stability one finds saddles with an
extensive index. Though, being unstable, saddles are not attractors of
dynamics, their properties do influence out-of-equilibrium dynamics. For example,
high-index saddle points are stationed at the boundaries between different
basins of attraction of gradient flow, and for a given basin the flow between
adjacent saddle points defines a complex with implications for the landscape
topology \cite{Audin_2014_Morse}.

Other stationary points are found at arbitrarily small distances from a
reference extensive saddle point, with a linear pseudogap in their complexity. The energy and stability of these near
neighbors approach that of the reference point as the difference in overlap
$\Delta q$ is brought to zero. However, the energy and
stability approach those of the reference at different rates: the energy difference between the reference
and its neighbors shrinks like $\Delta q^2$, while the stability difference
shrinks like $\Delta q$. This means that the near neighborhood of saddle points
is dominated by the presence of other saddle points at very similar energy, but
relatively variable index. Descending between saddles must simultaneously lower the index and the energy, but if the
energy and stability change with the same order of magnitude, the connected
saddle points must lie at a macroscopic distance from each other. This makes it impossible to use the properties of nearest neighbors to draw inferences about the way saddle
points are connected by gradient flow.
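
The mismatch between the two rates can be made explicit by eliminating the overlap. In the following sketch the positive constants $c_E$ and $c_\mu$ are placeholders for the typical scales of the two differences, whose values are not given in this section. Writing the scalings above as
\begin{align}
  |E_1-E_0|\simeq c_E\,\Delta q^2
  &&
  |\mu_1-\mu_0|\simeq c_\mu\,\Delta q
\end{align}
one finds
\begin{equation}
  |E_1-E_0|\simeq\frac{c_E}{c_\mu^2}\,(\mu_1-\mu_0)^2
\end{equation}
so that among near neighbors the energy difference is parametrically smaller than the stability difference, with their ratio vanishing like $\Delta q$.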

\subsection{Geometry of marginal states}

The set of marginal states is of special interest. First, marginal states are
known to attract physical and algorithmic dynamics \cite{Folena_2023_On}.
Second, they have more structure than in the pure models, with different types
of marginal states being found at different energies.  We find, surprisingly,
that the properties of marginal states pivot around the threshold energy, the
energy at which most stationary points are marginal, but which is not
significant for aging dynamics.

\begin{itemize}
  \item \textbf{Energies below the threshold.} These marginal states have a
    macroscopic gap in their overlap with nearby minima and saddles. Their
    nearest-neighbor stationary points are saddles with an oriented direction,
    and their nearest neighbors always have a higher energy density than the reference state.
    Fig.~\ref{fig:marginal.prop.below} shows examples of the neighborhoods of
    these marginal minima.

  \item \textbf{Energies above the threshold.} These marginal states have neighboring
    stationary points at arbitrarily close distance, with a quadratic pseudogap in
    their complexity. Their nearest neighbors are \emph{strictly} saddle points with
    an extensive number of downward directions and their nearest neighbors always have a higher energy
    density than the reference state. The nearest neighboring marginal states
    have an overlap gap with the reference state.
    Fig.~\ref{fig:marginal.prop.above} shows examples of the neighborhoods of
    these marginal minima.

  \item \textbf{At the threshold energy.} These marginal states have neighboring
    stationary points at arbitrarily close distance, with a cubic pseudogap
    in their complexity. The nearest ones include oriented saddle
    points with an extensive number of downward directions, and oriented stable
    and marginal minima. Though most of the nearest states are found at higher
    energies, they can also be found at the same energy density as the reference
    state. Fig.~\ref{fig:marginal.prop.thres} shows examples of the
    neighborhoods of these marginal states.
\end{itemize}
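
The three cases can be summarized by the behavior of the largest complexity of neighbors as the overlap approaches one. The following is a schematic reading of the list above, which interprets a pseudogap as a power-law vanishing of the complexity. Here $\Delta q=1-q$, $\Sigma_{12}^\mathrm{max}$ denotes the two-point complexity maximized over $E_1$ and $\mu_1$, and $q^*<1$ denotes the overlap of the nearest neighbors below the threshold. These symbols are introduced for illustration and the prefactors are left unspecified:
\begin{equation}
  \Sigma_{12}^\mathrm{max}(q)
  \begin{cases}
    <0\text{ for }q^*<q<1 & E_0<E_\mathrm{th} \\
    \propto\Delta q^2 & E_0>E_\mathrm{th} \\
    \propto\Delta q^3 & E_0=E_\mathrm{th}
  \end{cases}
\end{equation}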

\begin{figure}
  \includegraphics[scale=0.93]{figs/nearest_energies_below.pdf}
  \hfill
  \includegraphics[scale=0.93]{figs/nearest_stabilities_below.pdf}
  \hfill
  \includegraphics[scale=0.93]{figs/nearest_marginal_below.pdf}

  \caption{
    The neighborhood of marginal states at several energies below the threshold
    energy. \textbf{Left:} The range of energies $E_1$ at which nearby states
    are found. For any $E_0<E_\mathrm{th}$, the nearest class of states is at
    an extensive distance, and their energies are higher than that of the
    reference configuration. \textbf{Center:} The range of stabilities $\mu_1$
    at which nearby states are found. For $E_0$ near the threshold, the nearest
    states are always index-one saddles with $\mu>\mu_\mathrm m$, but as the
    overlap gap widens their population becomes model-dependent.
    \textbf{Right:} The range of energies at which \emph{other} marginal states
    are found. Here, the more darkly shaded regions denote where an isolated
    eigenvalue appears. Marginal states below the threshold are always
    separated by a gap in their overlap.
  } \label{fig:marginal.prop.below}
\end{figure}

\begin{figure}
  \includegraphics[scale=0.93]{figs/nearest_energies_above.pdf}
  \hfill
  \includegraphics[scale=0.93]{figs/nearest_stabilities_above.pdf}
  \hfill
  \includegraphics[scale=0.93]{figs/nearest_marginal_above.pdf}

  \caption{
    The neighborhood of marginal states at several energies above the threshold
    energy. \textbf{Left:} The range of energies $E_1$ at which nearby states
    are found. For any $E_0>E_\mathrm{th}$, there always exists a $q$
    sufficiently close to one such that the nearby states have strictly greater
    energy than the reference state. \textbf{Center:} The range of stabilities
    $\mu_1$ at which nearby states are found. There is always a
    sufficiently large overlap beyond which all nearby states are saddles with
    an extensive number of downward directions. \textbf{Right:} The range of
    energies at which \emph{other} marginal states are found. Here, the more
    darkly shaded regions denote where an isolated eigenvalue appears. Marginal
    states above the threshold are always separated by a gap in their overlap.
  } \label{fig:marginal.prop.above}
\end{figure}

\begin{figure}
  \includegraphics{figs/nearest_energies_thres.pdf}
  \hfill
  \includegraphics{figs/nearest_stabilities_thres.pdf}
  \hfill
  \includegraphics{figs/nearest_marginal_thres.pdf}

  \caption{
    The neighborhood of marginal minima at the threshold energy
    $E_0=E_\mathrm{th}$. In all plots, the dashed lines show the population of
    most common neighbors at the given overlap $q$.
    \textbf{Left:} The range of energies $E_1$ at which nearby points are
    found. The approach of both the minimum and maximum energies goes like
    $(1-q)^3$. \textbf{Center:} The range of stabilities $\mu_1$ at which nearby
    points are found. The approach of both limits goes like $(1-q)^2$.
    \textbf{Right:} The range of nearby marginal minima. The more darkly shaded
    region denotes where an isolated eigenvalue appears. Marginal minima at the
    threshold lie asymptotically close together.
  } \label{fig:marginal.prop.thres}
\end{figure}

This leads us to some general conclusions. First, at all energy densities
except at the threshold energy, \emph{typical marginal minima are separated by
extensive energy barriers}. Therefore, the picture of a marginal
\emph{manifold} of many (even all) marginal states lying arbitrarily close and
being connected by subextensive energy barriers can only describe the
collection of marginal minima at the threshold energy, which is an atypical population of marginal minima. At energies both below and above the threshold energy,
typical marginal minima are isolated from each other.\footnote{
We must put a small caveat here: for any combination of energy and stability of the reference point, this calculation
admits order-one other marginal minima to lie a subextensive distance from the
reference point. For such a population of points, $\Sigma_{12}=0$ and $q=1$,
which is always a permitted solution when at least one marginal direction
exists. These points are separated by small barriers from one another, but they
also cover a vanishing piece of configuration space, and each such cluster of
points is isolated by extensive barriers from each other cluster in the way
described above. Motion on a `manifold' of nearby marginal minima within such a
cluster cannot describe aging, since the overlap with the initial condition
will never change from one.
}

This has implications for how quench dynamics should be interpreted. When
typical marginal states are approached above the threshold energy, they must be reached
via the neighborhood of saddles with an extensive index, not via other marginal
states. On the other hand, typical marginal states approached below the threshold
energy must be reached after an extensive distance in configuration space
without encountering any stationary point. The geometric conditions of the
neighborhoods above and below are quite different, but the observed aging
dynamics do not appear to qualitatively change \cite{Folena_2020_Rethinking,
Folena_2021_Gradient}. Therefore, if the marginal minima attracting dynamics are typical ones, the conditions in the neighborhood of the
marginal minimum eventually reached at infinite time appear to be irrelevant
for the nature of aging dynamics at any finite time.

A version of
this story was told a long time ago by the authors of
\cite{Kurchan_1996_Phase}, who write on aging in the pure spherical models
where the limit of $N\to\infty$ is taken before that of $t\to\infty$: ``it is
important to remark that this [...]\ does \emph{not} mean that the system
relaxes into a near-threshold state: at all finite times an infinite system has
a Hessian with an \emph{infinite} number of directions in which the energy is a
maximum. [...] We have seen that the saddles separating threshold minima are
typically $O(N^{1/3})$ above the threshold level, while the energy is at all
finite times $O(N)$ above this level.'' In the present case of the mixed
spherical models, where \cite{Folena_2020_Rethinking} has shown aging dynamics
asymptotically approaching marginal states that we have shown have $O(N)$
saddles separating them, this lesson must be taken all the more seriously.

On the other hand, it is possible that \emph{atypical} marginal minima are
relevant for attracting the dynamics. Studying these points would require a
different kind of computation, where the fixed reference point is abandoned and
both points are treated on equal footing. Such a calculation is beyond the
scope of this paper, but is clear fodder for future research.

\section{Calculation of the two-point complexity}
\label{sec:complexity}

To calculate the two-point complexity, we extend a common method for counting
stationary points: the Kac--Rice method \cite{Kac_1943_On,
Rice_1944_Mathematical}. The basic idea is that stationary points of a function
can be counted by integrating a Dirac $\delta$-function containing the
function's gradient over its domain. Because the argument of the
$\delta$-function is nonlinear in the integration variable, it must be weighted
by the determinant of its Jacobian, which happens to be the
Hessian of the function. It is not common that this procedure can be
analytically carried out for an explicit function. However, in the spherical
models it can be carried out \emph{on average}.
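
As a minimal illustration, consider a hypothetical smooth function $g$ of one
real variable $x$ (not a quantity from this paper) whose stationary points
$x_1,\ldots,x_k$ in an interval $[a,b]$ are nondegenerate and lie in its
interior. The composition rule for the $\delta$-function gives
\begin{equation}
  \delta\big(g'(x)\big)=\sum_{i=1}^k\frac{\delta(x-x_i)}{|g''(x_i)|}
  \qquad\Longrightarrow\qquad
  \int_a^bdx\,\delta\big(g'(x)\big)\,\big|g''(x)\big|=k
\end{equation}
so the Jacobian factor $|g''|$ cancels the weight carried by each
$\delta$-function and every stationary point contributes exactly one to the
count. For instance, $g(x)=\cos x$ on $[-1,7]$ has stationary points at $x=0$,
$\pi$, and $2\pi$, and the integral evaluates to three.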

To lighten notation, we introduce the Kac--Rice measure
\begin{equation}
  d\nu_H(\mathbf s,\omega)
  =2\,d\mathbf s\,d\omega\,\delta(\|\mathbf s\|^2-N)\,
  \delta\big(\nabla H(\mathbf s,\omega)\big)\,
  \big|\det\operatorname{Hess}H(\mathbf s,\omega)\big|
\end{equation}
containing the $\delta$-function of the gradient and determinant of the Hessian of the Hamiltonian, along with a $\delta$-function enforcing the spherical constraint.
If integrated over
configuration space, $\mathcal N_H=\int d\nu_H(\mathbf s,\omega)$ gives the
total number of stationary points of the function. The Kac--Rice method has been used in many studies to analyze the geometry of random functions \cite{Cavagna_1998_Stationary, Fyodorov_2007_Density, Bray_2007_Statistics, Kent-Dobias_2023_How}. More interesting is the
measure conditioned on the energy density $E$ and stability $\mu$ of the
points,
\begin{equation} \label{eq:measure.energy}
  d\nu_H(\mathbf s,\omega\mid E,\mu)
  =d\nu_H(\mathbf s,\omega)\,
  \delta\big(NE-H(\mathbf s)\big)\,
  \delta\big(N\mu-\operatorname{Tr}\operatorname{Hess}H(\mathbf s,\omega)\big)
\end{equation}
While $\mu$ is strictly the trace of the Hessian divided by $N$, we call it the stability
because in this family of models all stationary points have a bulk spectrum of
the same shape, shifted by different constants. The stability $\mu$ sets this
shift, and therefore determines the stiffness of minima and the typical index of saddle points. See
Fig.~\ref{fig:spectra} for examples.

We want the typical number of stationary points with energy density
$E_1$ and stability $\mu_1$ that lie a fixed overlap $q$ from a reference
stationary point of energy density $E_0$ and stability $\mu_0$. For a
\emph{typical} number, we cannot average the total number $\mathcal N_H$, which
is exponentially large in $N$ and therefore can be biased by atypical examples.
Instead, we average the logarithm of this number. The two-point complexity is
thus defined by
\begin{equation} \label{eq:complexity.definition}
  \Sigma_{12}
  =\frac1N\overline{\int\frac{d\nu_H(\pmb\sigma,\varsigma\mid E_0,\mu_0)}{\int d\nu_H(\pmb\sigma',\varsigma'\mid E_0,\mu_0)}\,
  \log\bigg(\int d\nu_H(\mathbf s,\omega\mid E_1,\mu_1)\,\delta(Nq-\pmb\sigma\cdot\mathbf s)\bigg)}
\end{equation}
Inside the logarithm, the measure \eqref{eq:measure.energy} is integrated with
the further condition that $\mathbf s$ has a certain overlap with the reference configuration $\pmb\sigma$.
The entire expression is then integrated over $\pmb\sigma$ again by
the Kac--Rice measure, then divided by a normalization. This is equivalent to
summing the logarithm over all stationary points $\pmb\sigma$ with the given
properties, then dividing by their total number, i.e., an average.
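
The difference between averaging the number and averaging its logarithm
follows from the concavity of the logarithm. Writing $\mathcal N$ as shorthand
(introduced here only for illustration) for any positive random count, such as
the integral inside the logarithm of \eqref{eq:complexity.definition}, Jensen's
inequality gives
\begin{equation}
  \frac1N\overline{\log\mathcal N}\leq\frac1N\log\overline{\mathcal N}
\end{equation}
so the average of the logarithm can never exceed the logarithm of the average.
The two differ when rare samples with anomalously large $\mathcal N$ dominate
$\overline{\mathcal N}$, which is the bias by atypical examples referred to
above.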

It is difficult to take the disorder average of anything that is not an
exponential integral. The normalization integral over $\pmb\sigma'$ in the
denominator and the integral inside the logarithm both pose a problem. Each can
be treated using the replica trick: $\lim_{m\to0} x^{m-1}=\frac1x$ and $\lim_{n\to0}\frac\partial{\partial n}x^n=\log x$. Applying these transformations, we have
\begin{equation}
  \Sigma_{12}
  =\frac1N\lim_{n\to0}\lim_{m\to0}\frac\partial{\partial n}\overline{\int\left(\prod_{b=1}^md\nu_H(\pmb\sigma_b,\varsigma_b\mid E_0,\mu_0)\right)\left(\prod_{a=1}^nd\nu_H(\mathbf s_a,\omega_a\mid E_1,\mu_1)\,\delta(Nq-\pmb \sigma_1\cdot \mathbf s_a)\right)}
\end{equation}
Note that among the $\pmb\sigma$ replicas, $\pmb\sigma_1$ is special. The $m-1$
replicas $\pmb\sigma_2,\ldots,\pmb\sigma_m$ correspond to the replicated
normalization integral over $\pmb\sigma'$, which is completely uncoupled from
$\mathbf s$. The variable $\pmb\sigma_1$ is not a replica: it is the same as
$\pmb\sigma$ in \eqref{eq:complexity.definition}, and is the only one of the $\pmb\sigma$s that couples with $\mathbf s$.
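
Both replica identities follow from writing $x^n=e^{n\log x}$ for $x>0$, a
standard step sketched here for completeness:
\begin{equation}
  x^n=1+n\log x+O(n^2)
  \qquad\Longrightarrow\qquad
  \lim_{n\to0}\frac\partial{\partial n}x^n=\lim_{n\to0}x^n\log x=\log x
\end{equation}
while $\lim_{m\to0}x^{m-1}=x^{-1}\lim_{m\to0}x^m=x^{-1}$. For positive integer
$n$ and $m$ the powers of an integral are written as products of independent
copies of that integral, and the result is then continued to $n\to0$ and
$m\to0$.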

This expression can now be averaged over the disordered couplings, and its
integration evaluated using the saddle point method. We must assume the form of
order among the replicas $\mathbf s$ and $\pmb\sigma$, and we take them to be
replica symmetric. Replica symmetry means that at the saddle point, all
distinct pairs of replicas have the same overlap. This choice is well-motivated
for the $3+4$ and similar models.
Details of the calculation can be found in
Appendix~\ref{sec:complexity-details}.

The resulting expression for the complexity, which must
still be extremized over the order parameters $\hat\beta_1$, $r^{01}$,
$r^{11}_\mathrm d$, $r^{11}_0$, and $q^{11}_0$, is
\begin{equation} \label{eq:complexity.full}
  \begin{aligned}
    &\Sigma_{12}(E_0,\mu_0,E_1,\mu_1,q)
    =\mathop{\mathrm{extremum}}_{\hat\beta_1,r^{11}_\mathrm d,r^{11}_0,r^{01},q^{11}_0}\Bigg\{
    \mathcal D(\mu_1)-\frac12+\hat\beta_1E_1-r^{11}_\mathrm d\mu_1
    +\hat\beta_1\big(r^{11}_\mathrm df'(1)-r^{11}_0f'(q^{11}_0)\big)\\
    &\qquad+\hat\beta_0\hat\beta_1f(q)+(\hat\beta_0r^{01}+\hat\beta_1r^{10}+r^{00}_\mathrm d r^{01})f'(q)
    +\frac{r^{11}_\mathrm d-r^{11}_0}{1-q^{11}_0}(r^{10}-qr^{00}_\mathrm d)f'(q)\\
    &+\frac12\Bigg[
      \hat\beta_1^2\big(f(1)-f(q^{11}_{0})\big)
      +(r^{11}_\mathrm d)^2f''(1)+2r^{01}r^{10}f''(q)-(r^{11}_0)^2f''(q^{11}_0)
      +\frac{(r^{10}-qr^{00}_\mathrm d)^2}{1-q^{11}_0}f'(1)
      \\
    &\qquad+\frac{1-q^2}{1-q^{11}_0}+\left(
      (r^{01})^2-\frac{r^{11}_\mathrm d-r^{11}_0}{1-q^{11}_0}
      \left(2qr^{01}-\frac{(1-q^2)r^{11}_0-(q^{11}_0-q^2)r^{11}_\mathrm d}{1-q^{11}_0}\right)
    \right)\big(f'(1)-f'(q^{11}_0)\big) \\
    &\qquad
    -\frac1{f'(1)}\frac{f'(1)^2-f'(q)^2}{f'(1)-f'(q^{11}_0)}
    +\frac{r^{11}_\mathrm d-r^{11}_0}{1-q^{11}_0}\big(r^{11}_\mathrm df'(1)-r^{11}_0f'(q^{11}_0)\big)
    +\log\left(\frac{1-q^{11}_0}{f'(1)-f'(q^{11}_0)}\right)
  \Bigg]\Bigg\}
  \end{aligned}
\end{equation}
where the function $\mathcal D$ is defined in \eqref{eq:hessian.func} of Appendix~\ref{sec:complexity-details}.
It is possible to further extremize this expression over all the other
variables but $q_0^{11}$, for which the saddle point conditions have a unique
solution. However, the resulting expression is quite complicated and provides
no insight. In fact, the numeric root-finding problem is more stable when preserving these parameters, rather than analytically eliminating them.

In practice, the complexity can be calculated in two ways. First,
the extremal problem can be done numerically, initializing from $q=0$ where the
problem reduces to that of the single-point complexity of points with energy
$E_1$ and stability $\mu_1$, which has an analytical solution. Then small steps in $q$ or other
parameters are taken to analytically continue the solution. This is how the data in all the plots of
this paper was produced. Second, the complexity can be calculated in the near
neighborhood of a reference point by expanding in powers of small $\Delta q=1-q$. This expansion indicates when nearby points can be found at arbitrarily small distance, and in that case gives the form of the pseudogap in their complexity.

If there is no overlap gap between the reference point and its nearest
neighbors, their complexity can be calculated by an expansion in $1-q$. First,
we'll use this method to describe the most common type of stationary point in
the close vicinity of a reference point. These are given by further maximizing the two-point complexity over
the energy $E_1$ and stability $\mu_1$ of the nearby points. This gives the
conditions
\begin{align}
  \hat\beta_1=0 &&
  \mu_1=2r^{11}_\mathrm df''(1)
\end{align}
where the second condition is only true for $\mu_1^2\leq\mu_\mathrm m^2$, i.e., when the
nearby points are saddle points or marginal minima. When these conditions are
inserted into the complexity, an expansion is made in small $1-q$, and the
saddle point in the remaining parameters is taken, the result is
\begin{equation}
  \Sigma_{12}
  =\frac{f'''(1)}{8f''(1)^2}(\mu_\mathrm m^2-\mu_0^2)\left(\sqrt{2+\frac{2f''(1)\big(f''(1)-f'(1)\big)}{f'''(1)f'(1)}}-1\right)(1-q)
  +O\big((1-q)^2\big)
\end{equation}
independent of $E_0$. Notice that the slope of the complexity is positive for $\mu_0<\mu_\text m$ and vanishes when the stability of the reference point approaches the marginal stability. This implies that extensive saddle points have arbitrarily close neighbors with a linear pseudogap, while stable minima have an overlap gap with their nearest neighbors. For marginal minima, the existence of arbitrarily close neighbors must be decided at quadratic order and higher.

To describe the properties of these most common
neighbors, it is convenient to first make a definition. The population of
stationary points that is most common at each energy (the blue line in
Fig.~\ref{fig:complexities}) has the relation
\begin{equation}
  E_\mathrm{dom}(\mu_0)=-\frac{f'(1)^2+f(1)\big(f''(1)-f'(1)\big)}{2f''(1)f'(1)}\mu_0
\end{equation}
between $E_0$ and $\mu_0$ for $\mu_0^2\leq\mu_\mathrm m^2$. Using this
definition, the energy and stability of the most common neighbors at small
$\Delta q$ are
\begin{align} \label{eq:expansion.E.1}
  E_1&=E_0+\frac12\frac{v_f}{u_f}\big(E_0-E_\mathrm{dom}(\mu_0)\big)(1-q)^2+O\big((1-q)^3\big) \\
    \label{eq:expansion.mu.1}
  \mu_1&=\mu_0-\frac{v_f}{u_f}\big(E_0-E_\mathrm{dom}(\mu_0)\big)(1-q)+O\big((1-q)^2\big)
\end{align}
where $v_f$ and $u_f$ are positive functionals of $f$ defined in \eqref{eq:v.and.u} of Appendix~\ref{sec:complexity-details}.
The most common neighboring saddles to a reference saddle are much nearer to
the reference in energy ($\Delta q^2$) than in stability ($\Delta q$). In fact,
this scaling also holds for all neighbors of a reference
saddle, not just the most common.
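
Eliminating the common prefactor between \eqref{eq:expansion.E.1} and
\eqref{eq:expansion.mu.1} makes this difference in scaling explicit for the
most common neighbors. Multiplying \eqref{eq:expansion.mu.1} by
$-\frac12(1-q)$ and comparing with \eqref{eq:expansion.E.1} gives
\begin{equation}
  E_1-E_0=-\frac12(\mu_1-\mu_0)(1-q)+O\big((1-q)^3\big)
\end{equation}
which does not depend on $v_f$, $u_f$, or $E_\mathrm{dom}$. As a rough
illustration, at overlap $q=0.9$ the leading-order shift in energy is twenty
times smaller in magnitude than the shift in stability.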

Because both expressions are proportional to $E_0-E_\mathrm{dom}(\mu_0)$,
whether the energy and stability of nearby points increases or decreases from
that of the reference point depends only on whether the energy of the reference
point is above or below that of the most common population at the same
stability, i.e., to the right or left of the blue line in Fig.~\ref{fig:complexities}. In particular, since $E_\mathrm{dom}(\mu_\mathrm m)=E_\mathrm{th}$,
the threshold energy is also the pivot around which points asymptotically
near marginal minima change their properties.

To better examine the population of marginal points, it is necessary to look at
the next term in the series of the complexity with $\Delta q$, since the linear
coefficient becomes zero at the marginal line. When $\mu_0=\mu_\text m$, the quadratic term in the expansion for the dominant population of near neighbors is
\begin{equation}
  \Sigma_{12}
  =\frac12\frac{f'''(1)v_f}{f''(1)^{3/2}u_f}
  \left(\sqrt{\frac{2\big[f'(1)(f'''(1)-f''(1))+f''(1)^2\big]}{f'(1)f'''(1)}}-1\right)\big(E_0-E_\textrm{th}\big)(1-q)^2+O\big((1-q)^3\big)
\end{equation}
This coefficient is positive when $E_0>E_\text{th}$ and negative when $E_0<E_\text{th}$. Therefore,
marginal minima whose energy $E_0$ is greater than the threshold have neighbors at arbitrarily close distance with a quadratic pseudogap, while those whose energy is less than the threshold have an overlap gap. Exactly at the threshold the cubic term in the expansion is necessary; it is not insightful to share explicitly but it is positive for the $3+4$ and similar models.

The properties of the nearby states above the threshold can be further
quantified. Though we know from \eqref{eq:expansion.E.1} and
\eqref{eq:expansion.mu.1} that the most common nearby points at small distance
are extensive saddle points with higher energy than the reference point, we do
not know what other kinds of stationary points might also be found in close
proximity. Could these marginal minima sit at very small distance from other
marginal minima? The answer is that the very near neighbors are
exclusively extensive saddles of higher energy. Therefore, the marginal
minima with energies above the threshold energy also have overlap gaps with one
another. These results on the range of possible neighbors are elaborated in Appendix~\ref{subsec:range}.


\section{Finding the isolated eigenvalue}
\label{sec:eigenvalue}

The two-point complexity $\Sigma_{12}$ depends on the spectrum at both stationary points
through the determinant of their Hessians, but only on the bulk of the
distribution. This bulk is unaffected by the conditions of energy
and proximity. However, these conditions give rise to small-rank perturbations
to the Hessian, which can cause a subextensive number of eigenvalues to leave the
bulk. We study the possibility of \emph{one} stray eigenvalue.

We use a technique recently developed to find the smallest eigenvalue of
random matrices \cite{Ikeda_2023_Bose-Einstein-like}. One defines an artificial quadratic
statistical mechanics model with configurations defined on the sphere, whose
interaction tensor is given by the matrix of interest. By construction, the
ground state is located in the direction of the eigenvector associated with the
smallest eigenvalue, and the ground state energy is proportional to that
eigenvalue.
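
This correspondence can be checked directly for a generic real symmetric
$N\times N$ matrix $A$ (a placeholder for the matrix of interest), taking the
quadratic energy to be $\frac12\mathbf x^TA\mathbf x$ with $\|\mathbf x\|^2=N$.
Expanding $\mathbf x=\sum_ic_i\mathbf v_i$ in orthonormal eigenvectors
$\mathbf v_i$ of $A$ with eigenvalues $\lambda_i$, one finds
\begin{equation}
  \min_{\|\mathbf x\|^2=N}\frac12\mathbf x^TA\mathbf x
  =\min_{\sum_ic_i^2=N}\frac12\sum_{i=1}^N\lambda_ic_i^2
  =\frac N2\lambda_\mathrm{min}
\end{equation}
with the minimum attained when all the weight sits on the eigenvector of the
smallest eigenvalue. Under this normalization the ground state energy per
degree of freedom is $\frac12\lambda_\mathrm{min}$, so the smallest eigenvalue
is twice the ground state energy density.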

\begin{figure}
  \centering
  \begin{tikzpicture}
    \def\R{4 } % sphere radius
    \def\Rt{2 } % tangent plane radius
    \def\angEl{15} % elevation angle
    \def\angsa{-160} % azimuth of s_1
    \def\angq{40} % elevation of constraint circle
    \filldraw[ball color=white] (0,0) circle (\R);
  %  \filldraw[fill=white] (0,0) circle (\R);

    \foreach \t in {0,\angq} { \DrawLatitudeCircle[\R]{\t} }
    %\foreach \t in {\angsa} { \DrawLongitudeCircle[\R]{\t} }

    \pgfmathsetmacro\H{\R*cos(\angEl)} % distance to north pole
    \coordinate (O) at (0,0);
    \node[circle,draw,black,scale=0.3] at (0,0) {};
    \coordinate (N) at (0,\H);
    \draw node[right=10,below] at (0,\H){$\pmb\sigma_1$};
    \draw[thick, ->](O)--(N);

    \NewLatitudePlane[planeP]{\R}{\angEl}{\angq};
    \path[planeP] (\angsa:\R) coordinate (P);
    \path[planeP] (0:1.5*\R) coordinate (Q);
    \path[planeP] (0:\R) coordinate (Q2);
    \draw[left] node at (P){$\mathbf s_1$};

    \NewLatitudePlane[equator]{\R}{\angEl}{00};
    \path[equator] (-30:\R) coordinate (Pprime);
    \path[equator] (0:{1.5*cos(\angq)*\R}) coordinate (Qe);
    \path[equator] (0:\R) coordinate (Qe2);
    \draw node[right=5,below] at (Pprime){$\pmb\sigma_c$};

    \NewLatitudePlane[sbplane]{\R}{\angEl}{\angq};
    \path[sbplane] (20:\R) coordinate (sb);
    \draw node[right=3,above=1] at (sb){$\mathbf s_b$};

    \TangentPlane[tplane]{\R}{\angEl}{\angq}{\angsa};
    \draw[tplane,fill=gray,fill opacity=0.3] circle (\Rt);
    \draw[tplane,->,thick] (0,0) -> ({\Rt*cos(160)},{\Rt*sin(160)}) node[above=1.5,right] {$\mathbf x_a$};
    \draw[tplane,->,thick] (0,0) -> ({\Rt*cos(250)},{\Rt*sin(250)}) node[above=1.5,left=0.1] {$\mathbf x_b$};

    \draw[thick, ->] (O)->(P);
    \draw[thick, ->] (O)->(Pprime);
    \draw[thick, ->] (O)->(sb);

    \draw[dotted] (Qe) -- (Qe2);
    \draw[dotted] (Q2) -- (Q);
    \draw[decorate, decoration = {brace,raise=3}] (Q) -- (Qe) node[pos=0.5,right=7]{$q$};
  \end{tikzpicture}
  \caption{
    A sketch of the vectors involved in the calculation of the isolated
    eigenvalue. All replicas $\mathbf x$, which correspond with candidate
    eigenvectors of the Hessian evaluated at $\mathbf s_1$, sit in an $N-2$
    sphere corresponding with the tangent plane (not to scale) of the first
    $\mathbf s$ replica. All of the $\mathbf s$ replicas lie on the sphere,
    constrained to be at fixed overlap $q$ with the first of the $\pmb\sigma$
    replicas, the reference configuration. All of the $\pmb\sigma$ replicas lie
    on the sphere.
  } \label{fig:sphere}
\end{figure}

Our matrix of interest is the Hessian evaluated at a stationary point of the mixed spherical
model, conditioned on the relative position, energies, and stabilities
discussed above. We must restrict the artificial spherical model to lie in the
tangent plane of the `real' spherical configuration space at the point of
interest, to avoid our eigenvector pointing in a direction that violates the
spherical constraint. A sketch of the setup is shown in Fig.~\ref{fig:sphere}. The free energy of the artificial model given a point $\mathbf s$
and for a specific realization of the disordered Hamiltonian is
\begin{equation}
  \begin{aligned}
    \beta F_H(\beta\mid\mathbf s,\omega)
    &=-\frac1N\log\left(\int d\mathbf x\,\delta(\mathbf x\cdot\mathbf s)\delta(\|\mathbf x\|^2-N)\exp\left\{
        -\beta\frac12\mathbf x^T\operatorname{Hess}H(\mathbf s,\omega)\mathbf x
    \right\}\right) \\
    &=-\lim_{\ell\to0}\frac1N\frac\partial{\partial\ell}\int\left[\prod_{\alpha=1}^\ell d\mathbf x_\alpha\,\delta(\mathbf x_\alpha^T\mathbf s)\delta(N-\mathbf x_\alpha^T\mathbf x_\alpha)\exp\left\{
      -\beta\frac12\mathbf x^T_\alpha\big(\partial\partial H(\mathbf s)+\omega I\big)\mathbf x_\alpha
    \right\}\right]
  \end{aligned}
\end{equation}
where the first $\delta$-function keeps the configurations in the tangent
plane, and the second enforces the spherical constraint. We have anticipated
treating the logarithm with replicas. We are interested in points $\mathbf s$
that have certain properties: they are stationary points of $H$ with given
energy density and stability, and fixed overlap with a reference configuration
$\pmb\sigma$. We therefore average the free energy above over such points,
giving
\begin{equation}
  \begin{aligned}
    F_H(\beta\mid E_1,\mu_1,q,\pmb\sigma)
    &=\int\frac{d\nu_H(\mathbf s,\omega\mid E_1,\mu_1)\delta(Nq-\pmb\sigma\cdot\mathbf s)}{\int d\nu_H(\mathbf s',\omega'\mid E_1,\mu_1)\delta(Nq-\pmb\sigma\cdot\mathbf s')}F_H(\beta\mid\mathbf s,\omega) \\
    &=\lim_{n\to0}\int\left[\prod_{a=1}^nd\nu_H(\mathbf s_a,\omega_a\mid E_1,\mu_1)\,\delta(Nq-\pmb\sigma\cdot\mathbf s_a)\right]F_H(\beta\mid\mathbf s_1,\omega_1)
  \end{aligned}
\end{equation}
again anticipating the use of replicas. Finally, the reference configuration $\pmb\sigma$ should itself be a stationary point of $H$ with its own energy density and stability, as before. Averaging over these conditions gives
\begin{equation}
  \begin{aligned}
    F_H(\beta\mid E_0,\mu_0,E_1,\mu_1,q)
    &=\int\frac{d\nu_H(\pmb\sigma,\varsigma\mid E_0,\mu_0)}{\int d\nu_H(\pmb\sigma',\varsigma'\mid E_0,\mu_0)}\,F_H(\beta\mid E_1,\mu_1,q,\pmb\sigma) \\
    &=\lim_{m\to0}\int\left[\prod_{a=1}^m d\nu_H(\pmb\sigma_a,\varsigma_a\mid E_0,\mu_0)\right]\,F_H(\beta\mid E_1,\mu_1,q,\pmb\sigma_1)
  \end{aligned}
\end{equation}
This formidable expression is now ready to be averaged over the disordered Hamiltonians $H$. Once averaged,
the minimum eigenvalue of the conditioned Hessian is then given by twice the ground state energy, or
\begin{equation}
  \lambda_\text{min}=2\lim_{\beta\to\infty}\overline{F_H(\beta\mid E_0,\mu_0,E_1,\mu_1,q)}
\end{equation}
For this calculation, there are three different sets of replicated variables.
Note that, as for the computation of the complexity, the $\pmb\sigma_1$ and
$\mathbf s_1$ replicas are \emph{special}. The first again is the only one of the
$\pmb\sigma$ replicas constrained to lie at fixed overlap with \emph{all} the
$\mathbf s$ replicas, and the second is the only one of the $\mathbf s$ replicas at
which the Hessian is evaluated.

The calculation of this minimum eigenvalue is very similar to that of the
complexity. The details of this calculation can be found in
Appendix~\ref{sec:eigenvalue-details}. The result for the minimum eigenvalue is given by
\begin{equation} \label{eq:minimum.eigenvalue.text}
  \lambda_\mathrm{min}
  =\mu_1-\left(y+\frac1yf''(1)\right)
\end{equation}
where $y$ is an order parameter whose value is set by the saddle-point conditions
\begin{align} \label{eq:eigen.conditions.main}
  0=-f''(1)+y^2(1-\mathcal X^TC\mathcal X)
  &&
  0=(B-yC)\mathcal X
\end{align}
Here $\mathcal X\in\mathbb R^5$ is a vector of order parameters, and $B$ and $C$
are $5\times 5$ matrices whose elements are explicit functions of the order
parameters from the two-point complexity problem and of $f$ and its
derivatives. The matrices $B$ and $C$ are given in \eqref{eq:matrix.b} and \eqref{eq:matrix.c} of Appendix~\ref{sec:eigenvalue-details}.

There is a trivial solution for $\mathcal X=0$ and $y^2=f''(1)$. This results
in a minimum eigenvalue
\begin{equation}
  \lambda_\mathrm{min}=\mu_1-\sqrt{4f''(1)}=\mu_1-\mu_\mathrm m
\end{equation}
that corresponds with the bottom edge of the semicircle distribution. This is
the correct solution in the absence of an isolated eigenvalue. Any
solution corresponding to the presence of an isolated eigenvalue must have nonzero $\mathcal
X$. The only way to satisfy this with the second of the saddle conditions
\eqref{eq:eigen.conditions.main} is for $y$ such that one of the eigenvalues of
$B-yC$ is zero. Under these circumstances, if the normalized eigenvector
associated with the zero eigenvalue is $\hat{\mathcal X}_0$, then $\mathcal
X=\|\mathcal X_0\|\hat{\mathcal X}_0$ is a solution. The magnitude $\|\mathcal
X_0\|$ of this solution is set by the first saddle point condition, namely
\begin{equation}
  \|\mathcal X_0\|^2=\frac1{\hat{\mathcal X}_0^TC\hat{\mathcal X}_0}\left(1-\frac{f''(1)}{y^2}\right)
\end{equation}
In practice, we find that $\hat{\mathcal X}_0^TC\hat{\mathcal X}_0$ is positive
at the saddle point. Therefore, for the solution to exist we must have
$y^2\geq f''(1)$. We also find that there is at most one $y$ that produces a
zero eigenvalue of $B-yC$ and satisfies this inequality, so the solution seems
to be unique.
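
A short check, assuming $y>0$ as in the trivial solution, shows why any such
nontrivial solution lies below the bulk. By the inequality of arithmetic and
geometric means,
\begin{equation}
  y+\frac{f''(1)}y\geq2\sqrt{f''(1)}=\mu_\mathrm m
\end{equation}
with equality only at $y^2=f''(1)$. Inserted into
\eqref{eq:minimum.eigenvalue.text}, a solution with $y^2>f''(1)$ therefore
gives $\lambda_\mathrm{min}<\mu_1-\mu_\mathrm m$, an eigenvalue strictly below
the bottom edge of the semicircle.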

With this solution, we simultaneously find the smallest eigenvalue and
information about the orientation of its associated eigenvector: namely, its
overlap $q_\mathrm{min}$ with the tangent vector that points directly from one stationary point to the other. This information is encoded in the order parameter vector
$\mathcal X$, and the details of how it is computed can be found at the end of
Appendix~\ref{sec:eigenvalue-details}. The emergence of an isolated eigenvalue
and its associated eigenvector are shown in Fig.~\ref{fig:isolated.eigenvalue},
for the same reference point properties that were used in
Fig.~\ref{fig:min.neighborhood}. For small overlaps, the minimum eigenvalue
corresponds with the bottom of the semicircle distribution, or the trivial
solution. As the overlap is increased, one eigenvalue continuously leaves the
spectrum, with an eigenvector whose overlap with the vector between stationary
points also grows continuously from zero.

\begin{figure}
  \includegraphics{figs/isolated_eigenvalue.pdf}
  \hfill
  \includegraphics{figs/eigenvector_overlap.pdf}

  \caption{
    Properties of the isolated eigenvalue and the overlap of its associated
    eigenvector with the direction of the reference point. These curves
    correspond with the lower solid curve in Fig.~\ref{fig:min.neighborhood}.
    \textbf{Left:} The value of the minimum eigenvalue as a function of
    overlap. The dashed line shows the continuation of the bottom of the
    semicircle. Where the dashed line separates from the solid line, the
    isolated eigenvalue has appeared. \textbf{Right:} The overlap between the
    eigenvector associated with the minimum eigenvalue and the direction of the
    reference point. The overlap is zero until an isolated eigenvalue appears,
    and then it grows continuously until the nearest neighbor is reached.
  } \label{fig:isolated.eigenvalue}
\end{figure}

Though the two-point complexity $\Sigma_{12}$ fails to distinguish the marginal
minima at the limits of aging dynamics, one might imagine that something
related to the isolated eigenvalue might succeed in distinguishing them. This
does not appear to be the case. Above and below the threshold energy, the
nature of the isolated eigenvalue of nearest neighbors does not change: it is
always present and varies continuously. There is an energy both above and below
the threshold where the nearest marginal states transition from having an
isolated eigenvalue to not having one; see for instance in the right panel of
Fig.~\ref{fig:marginal.prop.above} that the grey region vanishes. One might
reason that this could change the connectivity of nearby marginal-like states and
thereby the aging dynamics. However, the energies where these changes occur are not close to the limits
of aging dynamics measured by \cite{Folena_2020_Rethinking}, so this reasoning does not hold.

\section{Conclusion}
\label{sec:conclusion}

We have computed the complexity of neighboring stationary points for the mixed
spherical models. When we studied the neighborhoods of marginal minima, we
found something striking: only those at the threshold energy have other
marginal minima nearby. For the many marginal minima away from the threshold
(including the exponential majority), there is a gap in overlap between them.

This has implications for pictures of relaxation and aging. In most $p+s$
models studied, quenches from infinite to zero temperature (gradient descent
starting from a random point) relax towards marginal states with energies above
the threshold energy \cite{Folena_2023_On}, while at least in some models a
quench to zero temperature from a temperature around the dynamic transition
relaxes towards marginal states with energies below the threshold energy
\cite{Folena_2020_Rethinking, Folena_2021_Gradient}. We found (see especially
Figs.~\ref{fig:marginal.prop.below} and \ref{fig:marginal.prop.above}) that the
neighborhoods of marginal states above and below the threshold are quite
different, and yet the emergent aging behavior relaxing toward states above and
below the threshold seems to be the same. Therefore, aging dynamics
appears to be insensitive to the neighborhood of the marginal state being
approached. To understand something better about why certain states attract the
dynamics in certain situations, nonlocal information, like the
structure of their entire basin of attraction, seems vital.

It is possible that replica symmetry breaking among the constrained stationary
points could change the details of the two-point complexity of very nearby
states. Indeed, it is difficult to rule out \textsc{rsb} in complexity
calculations. However, such corrections would not change the overarching
conclusions of this paper, namely that most marginal minima are separated from
each other by a macroscopic overlap gap and high barriers. This is because the
replica symmetric complexity bounds any \textsc{rsb} complexities from above,
and so \textsc{rsb} corrections can only decrease the complexity. Therefore,
the overlap gaps, which correspond to regions of negative complexity, cannot be
removed by a more detailed saddle point ansatz.

Our calculation studied the neighborhood of typical reference points with the
given energy and stability. However, it is possible that marginal minima with
atypical neighborhoods actually attract the dynamics, as has been argued in certain neural networks \cite{Baldassi_2016_Unreasonable, Baldassi_2021_Unveiling}. To determine this, a
different type of calculation is needed. As our calculation is akin to the
quenched Franz--Parisi potential, study of atypical neighborhoods would entail
something like the annealed Franz--Parisi approach, i.e.,
\begin{equation}
  \Sigma^*(E_0,\mu_0,E_1,\mu_1,q)=\frac1N\overline{\log\left(
      \int d\nu_H(\pmb\sigma,\varsigma\mid E_0,\mu_0)\,d\nu_H(\mathbf s,\omega\mid E_1,\mu_1)\,\delta(Nq-\pmb\sigma\cdot\mathbf s)
  \right)}
\end{equation}
which puts the two points on equal footing. This calculation and exploration of
the atypical neighborhoods it reveals is a clear future direction.

The methods developed in this paper are straightforwardly (if not easily)
generalized to landscapes with replica symmetry broken complexities
\cite{Kent-Dobias_2023_How}. We suspect that many of the qualitative features
of this study would persist, with neighboring states being divided into
different clusters based on the \textsc{rsb} order but with the basic presence
or absence of overlap gaps and the nature of the stability of near-neighbors
remaining unchanged. Interesting structure might emerge in the arrangement of
marginal states in \textsc{frsb} systems, where the ground state itself is
marginal and coincides with the threshold.

\paragraph{Acknowledgements}

The author would like to thank Valentina Ros, Giampaolo Folena, Chiara
Cammarota, and Jorge Kurchan for useful discussions related to this work.

\paragraph{Funding information}

JK-D is supported by a \textsc{DynSysMath} Specific Initiative by the
INFN.

\appendix

\section{Details of the calculation for the two-point complexity}
\label{sec:complexity-details}

The two-point complexity defined in \eqref{eq:complexity.definition} consists
of the average over integrals containing products of Dirac
$\delta$-functions and determinants of Hessians. To compute it, we first split
the factors into two groups: one group that contains any dependence on the
Hessian (the determinants and the $\delta$-functions fixing the stabilities)
and a second group containing all other $\delta$-functions. The average over
disorder for the two groups of factors can be made independently, which is
described in subsections \ref{subsec:hessian} and \ref{subsec:other.factors}
for the Hessian and other factors, respectively.

Once the average is made over disorder, the result is an exponential integral
that depends only on scalar products between the replicated configurations
$\mathbf s$ and $\pmb\sigma$ and their conjugate fields. The explicit
dependence on these microscopic configurations is removed using a
Hubbard--Stratonovich transformation, which replaces the scalar products with
overlap order parameters. This is described in subsection
\ref{subsec:hubbard.strat}. Finally, the complexity is an exponential integral
over several order parameter fields, and is amenable to evaluation by a saddle
point method, detailed in subsection \ref{subsec:saddle}.

\subsection{The Hessian factors}
\label{subsec:hessian}

The factors dependent on the Hessian can be averaged over disorder using results from random matrix theory.
The double partial derivatives of the energy are Gaussian with the variance
\begin{equation}
  \overline{(\partial_i\partial_jH(\mathbf s))^2}=\frac1Nf''(1)
\end{equation}
which means that the matrix of partial derivatives belongs to the GOE class. Its spectrum is given by the Wigner semicircle
\begin{equation}
  \rho(\lambda)=\begin{cases}
    \frac2{\pi\mu_\text m}\sqrt{1-\big(\frac{\lambda}{\mu_\text m}\big)^2} & \lambda^2\leq\mu_\text m^2 \\
    0 & \text{otherwise}
  \end{cases}
\end{equation}
with radius $\mu_\text m=\sqrt{4f''(1)}$. Since the Hessian differs from the
matrix of partial derivatives by adding the constant diagonal matrix $\omega
I$, it follows that the spectrum of the Hessian is a Wigner semicircle shifted
by $\omega$, or $\rho(\lambda+\omega)$.
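
As a concrete illustration, consider the $3+4$ model and assume the
normalization $f(q)=\frac12(q^3+q^4)$, which is not restated in this appendix
and should be checked against the main text. Then
\begin{equation}
  f''(1)=\frac12(6+12)=9
  \qquad
  \mu_\text m=\sqrt{4f''(1)}=6
\end{equation}
so that, under this assumption, the spectrum of the Hessian at a point with
Lagrange multiplier $\omega$ is supported on an interval of half-width $6$
whose center lies a distance $|\omega|$ from the origin, and its edge touches
zero when $|\omega|=\mu_\text m$.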

The average over factors depending on the Hessian alone can be made separately
from those depending on the gradient or energy, since for random Gaussian
fields the Hessian is independent of these \cite{Bray_2007_Statistics}. In
principle the fact that we have conditioned the Hessian to belong to stationary
points of certain energy, stability, and proximity to another stationary point
will modify its statistics, but these changes will only appear at subleading
order in $N$ \cite{Ros_2019_Complexity}. This is because the conditioning amounts to a rank-one perturbation to the Hessian matrix, which does not affect the bulk of its spectrum. At leading order, the expectations related to different replicas factorize, each yielding
\begin{equation}
  \overline{\big|\det\operatorname{Hess}H(\mathbf s,\omega)\big|\,\delta\big(N\mu-\operatorname{Tr}\operatorname{Hess}H(\mathbf s,\omega)\big)}
  =e^{N\int d\lambda\,\rho(\lambda+\mu)\log|\lambda|}\delta(N\mu-N\omega)
\end{equation}
Therefore, each of the Lagrange multipliers is fixed to one of the stabilities $\mu$. We define the function
\begin{equation} \label{eq:hessian.func}
  \begin{aligned}
    \mathcal D(\mu)
    &=\int d\lambda\,\rho(\lambda+\mu)\log|\lambda| \\
    &=\begin{cases}
      \frac12+\log\left(\frac12\mu_\text m\right)+\frac{\mu^2}{\mu_\text m^2}
       & \mu^2\leq\mu_\text m^2 \\
      \frac12+\log\left(\frac12\mu_\text m\right)+\frac{\mu^2}{\mu_\text m^2}
      -\left|\frac{\mu}{\mu_\text m}\right|\sqrt{\big(\frac\mu{\mu_\text m}\big)^2-1}
      -\log\left(\left|\frac{\mu}{\mu_\text m}\right|-\sqrt{\big(\frac\mu{\mu_\text m}\big)^2-1}\right) & \mu^2>\mu_\text m^2
    \end{cases}
  \end{aligned}
\end{equation}
and using it the full factor due to the Hessians can be written
\begin{equation}
  e^{Nm\mathcal D(\mu_0)+Nn\mathcal D(\mu_1)}\left[\prod_a^m\delta(N\mu_0-N\varsigma_a)\right]\left[\prod_a^n\delta(N\mu_1-N\omega_a)\right]
\end{equation}
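A quick consistency check on \eqref{eq:hessian.func}, offered only as a sketch:
the two branches agree at the marginal stability. Setting $|\mu|=\mu_\text m$
in the second branch, the square roots vanish and the logarithm becomes
$\log1=0$, leaving
\begin{equation}
  \mathcal D(\pm\mu_\text m)
  =\frac12+\log\left(\frac12\mu_\text m\right)+1
  =\frac32+\log\left(\frac12\mu_\text m\right)
\end{equation}
which is also the value of the first branch at the same point. The function
$\mathcal D$ is therefore continuous across $\mu^2=\mu_\text m^2$, where the
edge of the shifted semicircle touches zero.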

\subsection{The other factors}
\label{subsec:other.factors}

The other factors consist of $\delta$-functions of the gradient and $\delta$-functions containing the energy and spherical constraints. We take advantage of the Fourier representation of the $\delta$-function to express each of them as an exponential integral over an auxiliary field. For instance,
\begin{equation}
  \delta\big(\nabla H(\mathbf s,\mu_1)\big)
  =\int\frac{d\hat{\mathbf s}}{(2\pi)^N}e^{i\hat{\mathbf s}\cdot\nabla H(\mathbf s,\mu_1)}
\end{equation}
replaces a $\delta$-function of the gradient by introducing the auxiliary field $\hat{\mathbf s}$. Carrying out such a transformation to each of the remaining factors gives an exponential integrand of the form
\begin{equation}
  e^{
    Nm\hat\beta_0E_0+Nn\hat\beta_1E_1
    -\sum_a^m\left[(\pmb\sigma_a\cdot\hat{\pmb\sigma}_a)\mu_0
      -\frac12\hat\mu_0(N-\pmb\sigma_a\cdot\pmb\sigma_a)
    \right]
      -\sum_a^n\left[(\mathbf s_a\cdot\hat{\mathbf s}_a)\mu_1
      -\frac12\hat\mu_1(N-\mathbf s_a\cdot\mathbf s_a)
      -\frac12\hat\mu_{12}(Nq-\pmb\sigma_1\cdot\mathbf s_a)
    \right]
    +\int d\mathbf t\,\mathcal O(\mathbf t)H(\mathbf t)
  }
\end{equation}
where we have introduced the linear operator
\begin{equation}
  \mathcal O(\mathbf t)
  =\sum_a^m\delta(\mathbf t-\pmb\sigma_a)\left(
    i\hat{\pmb\sigma}_a\cdot\partial_{\mathbf t}-\hat\beta_0
  \right)
  +
  \sum_a^n\delta(\mathbf t-\mathbf s_a)\left(
    i\hat{\mathbf s}_a\cdot\partial_{\mathbf t}-\hat\beta_1
  \right)
\end{equation}
consolidating all of the $H$-dependent terms.
Here the $\hat\beta$s are the fields auxiliary to the energy constraints, the
$\hat\mu$s are auxiliary to the spherical and overlap constraints, and the
$\hat{\pmb\sigma}$s and $\hat{\mathbf s}$s are auxiliary to the constraints that
the gradient be zero.
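As a one-variable illustration of the same step (a sketch: the choice of
contour along the imaginary axis is our assumption about the convention, and
$\hat\beta$ here stands for the auxiliary variable of a single replica), the
energy constraint on one constrained configuration can be written
\begin{equation}
  \delta\big(NE_1-H(\mathbf s)\big)
  =\int_{-i\infty}^{i\infty}\frac{d\hat\beta}{2\pi i}\,e^{\hat\beta(NE_1-H(\mathbf s))}
\end{equation}
The exponent supplies one term of the form $N\hat\beta E_1$ and one factor
$-\hat\beta$ multiplying $H$, which is the origin of the $-\hat\beta_1$ in the
operator $\mathcal O$.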
We have written the $H$-dependent terms in this strange form for the ease of taking the average over $H$: since it is Gaussian-correlated, it follows that
\begin{equation}
  \overline{e^{\int d\mathbf t\,\mathcal O(\mathbf t)H(\mathbf t)}}
  =e^{\frac12\int d\mathbf t\,d\mathbf t'\,\mathcal O(\mathbf t)\mathcal O(\mathbf t')\overline{H(\mathbf t)H(\mathbf t')}}
  =e^{N\frac12\int d\mathbf t\,d\mathbf t'\,\mathcal O(\mathbf t)\mathcal O(\mathbf t')f\big(\frac{\mathbf t\cdot\mathbf t'}N\big)}
\end{equation}
It remains only to apply the doubled operators to $f$ and then evaluate the
simple integrals over the $\delta$ measures. We do not include these details,
which were carried out with computer algebra software. The result of this
calculation is found in the effective action \eqref{eq:intermed.complexity},
where it contributes all terms besides the functions $\mathcal D$ contributed by the Hessian terms in the previous section and the
logarithms contributed by the Hubbard--Stratonovich transformation of the next section.
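
To indicate what applying the doubled operators involves, here is a sketch for
a pair of constrained replicas $a$ and $b$, keeping only the terms of
$\mathcal O$ localized at $\mathbf s_a$ and $\mathbf s_b$ and writing the
result in terms of the overlaps formally introduced in \eqref{eq:fields}. We
assume for simplicity that $R^{11}$ is symmetric. Using $\partial_{\mathbf
t}f\big(\frac{\mathbf t\cdot\mathbf t'}N\big)=\frac{\mathbf
t'}Nf'\big(\frac{\mathbf t\cdot\mathbf t'}N\big)$, the three kinds of term are
\begin{align}
  (-\hat\beta_1)(-\hat\beta_1)\,Nf\big(\tfrac{\mathbf s_a\cdot\mathbf s_b}N\big)
  &=N\hat\beta_1^2f(C^{11}_{ab}) \\
  (i\hat{\mathbf s}_a\cdot\partial_{\mathbf t})(-\hat\beta_1)\,Nf\big(\tfrac{\mathbf t\cdot\mathbf s_b}N\big)\Big|_{\mathbf t=\mathbf s_a}
  &=-i\hat\beta_1(\hat{\mathbf s}_a\cdot\mathbf s_b)f'(C^{11}_{ab})
  =N\hat\beta_1R^{11}_{ba}f'(C^{11}_{ab}) \\
  (i\hat{\mathbf s}_a\cdot\partial_{\mathbf t})(i\hat{\mathbf s}_b\cdot\partial_{\mathbf t'})\,Nf\big(\tfrac{\mathbf t\cdot\mathbf t'}N\big)\Big|_{\mathbf t=\mathbf s_a,\,\mathbf t'=\mathbf s_b}
  &=-ND^{11}_{ab}f'(C^{11}_{ab})+NR^{11}_{ab}R^{11}_{ba}f''(C^{11}_{ab})
\end{align}
where we used $\hat{\mathbf s}_a\cdot\mathbf s_b=iNR^{11}_{ba}$ and
$\hat{\mathbf s}_a\cdot\hat{\mathbf s}_b=ND^{11}_{ab}$. Counting the cross term
twice (once for each ordering) and including the overall factor of $\frac12$
gives
$\frac12N\sum_{ab}\big[\hat\beta_1^2f(C^{11}_{ab})+(2\hat\beta_1R^{11}_{ab}-D^{11}_{ab})f'(C^{11}_{ab})+(R^{11}_{ab})^2f''(C^{11}_{ab})\big]$,
which has the form of the corresponding sum in \eqref{eq:two-point.action}.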

\subsection{Hubbard--Stratonovich}
\label{subsec:hubbard.strat}

Having expanded the resulting expression, we are left with an argument in the exponential which is a function of scalar products between the fields $\mathbf s$, $\hat{\mathbf s}$, $\pmb\sigma$, and $\hat{\pmb\sigma}$. We will change integration coordinates from these fields to matrix fields given by their scalar products, defined as
\begin{equation} \label{eq:fields}
  \begin{aligned}
    C^{00}_{ab}=\frac1N\pmb\sigma_a\cdot\pmb\sigma_b &&
    R^{00}_{ab}=-i\frac1N{\pmb\sigma}_a\cdot\hat{\pmb\sigma}_b &&
    D^{00}_{ab}=\frac1N\hat{\pmb\sigma}_a\cdot\hat{\pmb\sigma}_b \\
    C^{01}_{ab}=\frac1N\pmb\sigma_a\cdot\mathbf s_b &&
    R^{01}_{ab}=-i\frac1N{\pmb\sigma}_a\cdot\hat{\mathbf s}_b &&
    R^{10}_{ab}=-i\frac1N\hat{\pmb\sigma}_a\cdot{\mathbf s}_b &&
    D^{01}_{ab}=\frac1N\hat{\pmb\sigma}_a\cdot\hat{\mathbf s}_b \\
    C^{11}_{ab}=\frac1N\mathbf s_a\cdot\mathbf s_b &&
    R^{11}_{ab}=-i\frac1N{\mathbf s}_a\cdot\hat{\mathbf s}_b &&
    D^{11}_{ab}=\frac1N\hat{\mathbf s}_a\cdot\hat{\mathbf s}_b
  \end{aligned}
\end{equation}
We insert into the integral the product of $\delta$-functions enforcing these
definitions, integrated over the new matrix fields, which is equivalent to
multiplying by one. For example, one such factor of one is given by
\begin{equation}
  1=\int dC^{00}\,N^{m^2}\prod_{ab}^m\delta(NC^{00}_{ab}-\pmb\sigma_a\cdot\pmb\sigma_b)
\end{equation}
Once this is done, the many scalar products appearing throughout the integrand can be
replaced by the matrix fields. The only remaining dependence on the original vector
fields is through these new $\delta$-functions. These are treated schematically in
the following way: let $\{\mathbf a_a\}=\{\mathbf s_a,\pmb\sigma_a,\hat{\mathbf
s}_a,\hat{\pmb\sigma}_a\}$ index all of the original vector fields, and let
$Q_{ab}=\frac1N\mathbf a_a\cdot\mathbf a_b$ likewise concatenate all of the
matrix fields. Then the $\delta$-functions described above can be promoted to an exponential integral of the form
\begin{equation}
  \int d\mathbf a\,d\hat Q\,e^{
    N\frac12\operatorname{Tr}\hat QQ
    -\frac12\mathbf a^T\hat Q\mathbf a
  }
\end{equation}
using an auxiliary matrix field $\hat Q$.
The integral over the vector fields $\mathbf a$ is Gaussian and can be evaluated, giving
\begin{equation}
  \int d\hat Q\,e^{
    N\frac12\operatorname{Tr}\hat QQ
  }(\det\hat Q)^{-N/2}
  =
  \int d\hat Q\,e^{
    \frac12 N(\operatorname{Tr}\hat QQ
    -\log\det\hat Q)
  }
\end{equation}
Finally, the integral over $\hat Q$ can be evaluated using the saddle point
method, giving $\hat Q=Q^{-1}$. Therefore, the term contributed to the effective
action as a result of the transformation is
\begin{equation}
  \frac12\log\det Q
  =
  \frac12\log\det\begin{bmatrix}
    C^{00}&iR^{00}&C^{01}&iR^{01}\\
    iR^{00}&D^{00}&iR^{10}&D^{01}\\
    C^{01}&iR^{10}&C^{11}&iR^{11}\\
    iR^{01}&D^{01}&iR^{11}&D^{11}
  \end{bmatrix}
\end{equation}
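The saddle point in $\hat Q$ can be made explicit with a short computation,
sketched here under the assumption that $\hat Q$ is symmetric and invertible.
Using $\partial_{\hat Q}\operatorname{Tr}\hat QQ=Q$ and $\partial_{\hat
Q}\log\det\hat Q=\hat Q^{-1}$,
\begin{equation}
  0=\frac\partial{\partial\hat Q}\,\frac12\left(\operatorname{Tr}\hat QQ-\log\det\hat Q\right)
  =\frac12\left(Q-\hat Q^{-1}\right)
  \quad\Longrightarrow\quad
  \hat Q=Q^{-1}
\end{equation}
and evaluating the exponent at this point gives
\begin{equation}
  \frac12\left(\operatorname{Tr}Q^{-1}Q-\log\det Q^{-1}\right)
  =\frac12\operatorname{Tr}I+\frac12\log\det Q
\end{equation}
The first term does not depend on the order parameters, so only the second
carries the dependence on $Q$ quoted above.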

\subsection{Replica ansatz and saddle point}
\label{subsec:saddle}

After the transformation of the previous section, the complexity has been
brought to the form of an exponential integral over the matrix order parameters
\eqref{eq:fields}, with an exponent proportional to $N$. We are therefore in a position to
evaluate this integral using a saddle point method.
We will always assume that the square matrices $C^{00}$, $R^{00}$, $D^{00}$,
$C^{11}$, $R^{11}$, and $D^{11}$ are hierarchical matrices, i.e., of the Parisi form, with each set of
three sharing the same structure. In particular, we immediately
define $c_\mathrm d^{00}$, $r_\mathrm d^{00}$, $d_\mathrm d^{00}$, $c_\mathrm d^{11}$, $r_\mathrm d^{11}$, and
$d_\mathrm d^{11}$ as the value of the diagonal elements of these matrices,
respectively. Note that $c_\mathrm d^{00}=c_\mathrm d^{11}=1$ due to the spherical constraint.

In this paper, we focus on models with a replica symmetric complexity, but
many of the intermediate formulae are valid for arbitrary replica symmetry
breakings. At most {\oldstylenums1}\textsc{rsb} in equilibrium is guaranteed if the function
$\chi(q)=f''(q)^{-1/2}$ is convex \cite{Crisanti_1992_The}. The complexity at the ground state must
reflect the structure of equilibrium, and therefore be replica symmetric.
Recent work has found that the complexity of saddle points can have
other \textsc{rsb} orders even when the ground state is replica symmetric, but the $3+4$ model has a safely replica symmetric complexity everywhere \cite{Kent-Dobias_2023_When}.

Defining the `block' fields $\mathcal Q_{00}=(\hat\beta_0, \hat\mu_0, C^{00},
R^{00}, D^{00})$, $\mathcal Q_{11}=(\hat\beta_1, \hat\mu_1, C^{11}, R^{11},
D^{11})$, and $\mathcal Q_{01}=(\hat\mu_{12},C^{01},R^{01},R^{10},D^{01})$,
the resulting complexity is
\begin{equation} \label{eq:intermed.complexity}
  \Sigma_{12}
  =\frac1N\lim_{n\to0}\lim_{m\to0}\frac\partial{\partial n}\int d\mathcal Q_{00}\,d\mathcal Q_{11}\,d\mathcal Q_{01}\,e^{Nm\mathcal S_0(\mathcal Q_{00})+Nn\mathcal S_1(\mathcal Q_{11},\mathcal Q_{01}\mid\mathcal Q_{00})}
\end{equation}
where
\begin{equation} \label{eq:one-point.action}
  \begin{aligned}
    &\mathcal S_0(\mathcal Q_{00})
    =\hat\beta_0E_0-r^{00}_\mathrm d\mu_0-\frac12\hat\mu_0(1-c^{00}_\mathrm d)+\mathcal D(\mu_0)\\
    &\quad+\frac1m\bigg\{
      \frac12\sum_{ab}^m\left[
        \hat\beta_0^2f(C^{00}_{ab})+(2\hat\beta_0R^{00}_{ab}-D^{00}_{ab})f'(C^{00}_{ab})+(R_{ab}^{00})^2f''(C_{ab}^{00})
  \right]+\frac12\log\det\begin{bmatrix}C^{00}&iR^{00}\\iR^{00}&D^{00}\end{bmatrix}
\bigg\}
\end{aligned}
\end{equation}
is the action for the ordinary, one-point complexity, and the remainder is given by
\begin{equation} \label{eq:two-point.action}
  \begin{aligned}
    &\mathcal S_1(\mathcal Q_{11},\mathcal Q_{01}\mid\mathcal Q_{00})
    =\hat\beta_1E_1-r^{11}_\mathrm d\mu_1-\frac12\hat\mu_1(1-c^{11}_\mathrm d)+\mathcal D(\mu_1) \\
    &\quad+\frac1n\sum_b^n\left\{-\frac12\hat\mu_{12}(q-C^{01}_{1b})+\sum_a^m\left[
        \hat\beta_0\hat\beta_1f(C^{01}_{ab})+(\hat\beta_0R^{01}_{ab}+\hat\beta_1R^{10}_{ab}-D^{01}_{ab})f'(C^{01}_{ab})+R^{01}_{ab}R^{10}_{ab}f''(C^{01}_{ab})
    \right]\right\}
    \\
    &\quad+\frac1n\bigg\{
      \frac12\sum_{ab}^n\left[
        \hat\beta_1^2f(C^{11}_{ab})+(2\hat\beta_1R^{11}_{ab}-D^{11}_{ab})f'(C^{11}_{ab})+(R^{11}_{ab})^2f''(C^{11}_{ab})
      \right]\\
    &\quad+\frac12\log\det\left(
      \begin{bmatrix}
        C^{11}&iR^{11}\\iR^{11}&D^{11}
      \end{bmatrix}-
      \begin{bmatrix}
        C^{01}&iR^{01}\\iR^{10}&D^{01}
      \end{bmatrix}^T
      \begin{bmatrix}
        C^{00}&iR^{00}\\iR^{00}&D^{00}
      \end{bmatrix}^{-1}
      \begin{bmatrix}
        C^{01}&iR^{01}\\iR^{10}&D^{01}
      \end{bmatrix}
    \right)
  \bigg\}
  \end{aligned}
\end{equation}
Because of the structure of this problem in the twin limits of $m$ and $n$ to
zero, the parameters $\mathcal Q_{00}$ can be evaluated at a saddle point of
$\mathcal S_0$ alone. This means that these parameters will take the same value
they take when the ordinary, one-point complexity is calculated. For a replica
symmetric complexity of the reference point, this results in
\begin{align}
  \hat\beta_0
  &=-\frac{\mu_0f'(1)+E_0\big(f'(1)+f''(1)\big)}{u_f}\\
  r_\mathrm d^{00}
  &=\frac{\mu_0f(1)+E_0f'(1)}{u_f} \\
  d_\mathrm d^{00}
  &=\frac1{f'(1)}
  -\left(
    \frac{\mu_0f(1)+E_0f'(1)}{u_f}
  \right)^2
\end{align}
where we define for brevity (here and elsewhere) the constants
\begin{align} \label{eq:v.and.u}
  u_f=f(1)\big(f'(1)+f''(1)\big)-f'(1)^2
  &&
  v_f=f'(1)\big(f''(1)+f'''(1)\big)-f''(1)^2
\end{align}
Note that because the coefficients of $f$ must be nonnegative for $f$ to
be a sensible covariance, both $u_f$ and $v_f$ are nonnegative, and strictly positive for the mixed models.\footnote{
  Note also
that $u_f=v_f=0$ if $f$ is a homogeneous polynomial as in the pure models.
These expressions are invalid for the pure models because $\mu_0$ and $E_0$
cannot be fixed independently; we would have done the equivalent of inserting
two identical $\delta$-functions. For the pure models, the terms $\hat\beta_0$ and
$\hat\beta_1$ must be set to zero in our prior formulae (as if the energy was
not constrained) and then the saddle point taken.
}
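
For readers who wish to check these saddle point values, the following sketch
reproduces them from \eqref{eq:one-point.action}. It assumes a replica
symmetric saddle in which $C^{00}$, $R^{00}$, and $D^{00}$ are diagonal with
$c^{00}_\mathrm d=1$, that $f(0)=0$ so the vanishing off-diagonal overlaps
contribute nothing, and that the one-point multiplier $\hat\beta_0$ appears in
the quadratic terms. Writing $r=r^{00}_\mathrm d$ and $d=d^{00}_\mathrm d$, the
action reduces to
\begin{equation}
  \mathcal S_0
  =\hat\beta_0E_0-r\mu_0+\mathcal D(\mu_0)
  +\frac12\left[\hat\beta_0^2f(1)+(2\hat\beta_0r-d)f'(1)+r^2f''(1)\right]
  +\frac12\log(d+r^2)
\end{equation}
and its stationarity conditions in $d$, $\hat\beta_0$, and $r$ are
\begin{align}
  0=-\frac12f'(1)+\frac12\frac1{d+r^2}
  &&
  0=E_0+\hat\beta_0f(1)+rf'(1)
  &&
  0=-\mu_0+\hat\beta_0f'(1)+rf''(1)+\frac r{d+r^2}
\end{align}
The first gives $d+r^2=1/f'(1)$, which turns the third into
$0=-\mu_0+\hat\beta_0f'(1)+r\big(f'(1)+f''(1)\big)$. Solving the resulting
linear system for $\hat\beta_0$ and $r$ yields the expressions above, with
$u_f$ appearing as its determinant. As a numerical illustration, if one assumes
$f(q)=\frac12(q^3+q^4)$ for the $3+4$ model, then $f(1)=1$, $f'(1)=\frac72$,
$f''(1)=9$, and $f'''(1)=15$, so that $u_f=\frac14$ and $v_f=3$.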


In general, we expect the $m\times n$ matrices $C^{01}$, $R^{01}$, $R^{10}$,
and $D^{01}$ to have constant \emph{rows} of length $n$, with blocks of rows
corresponding to the \textsc{rsb} structure of the single-point complexity for the model.
For
the scope of this paper, where we restrict ourselves to replica symmetric
complexities, they have the following form at the saddle point:
\begin{align} \label{eq:01.ansatz}
  C^{01}=
  \begin{subarray}{l}
  \hphantom{[}\begin{array}{ccc}\leftarrow&n&\rightarrow\end{array}\hphantom{\Bigg]}\\
  \left[
    \begin{array}{ccc}
      q&\cdots&q\\
      0&\cdots&0\\
      \vdots&\ddots&\vdots\\
      0&\cdots&0
    \end{array}
  \right]\begin{array}{c}
    \\\uparrow\\m-1\\\downarrow
  \end{array}
\end{subarray}
  &&
  R^{01}
  =\begin{bmatrix}
    r_{01}&\cdots&r_{01}\\
    0&\cdots&0\\
    \vdots&\ddots&\vdots\\
    0&\cdots&0
  \end{bmatrix}
  &&
  R^{10}
  =\begin{bmatrix}
    r_{10}&\cdots&r_{10}\\
    0&\cdots&0\\
    \vdots&\ddots&\vdots\\
    0&\cdots&0
  \end{bmatrix}
  &&
  D^{01}
  =\begin{bmatrix}
    d_{01}&\cdots&d_{01}\\
    0&\cdots&0\\
    \vdots&\ddots&\vdots\\
    0&\cdots&0
  \end{bmatrix}
\end{align}
where only the first row is nonzero. The other entries, which correspond to the
completely uncorrelated replicas in an \textsc{rsb} picture, are all zero
because uncorrelated vectors on the sphere are orthogonal.

The most challenging part of inserting our replica symmetric ansatz is the
volume element in the $\log\det$, which involves the product and inverse of
block replica matrices. The inverse of a block hierarchical matrix is still a
block hierarchical matrix, since
\begin{equation}
  \begin{bmatrix}
    C^{00}&iR^{00}\\iR^{00}&D^{00}
  \end{bmatrix}^{-1}
  =
  \begin{bmatrix}
    (C^{00}D^{00}+R^{00}R^{00})^{-1}D^{00} & -i(C^{00}D^{00}+R^{00}R^{00})^{-1}R^{00} \\
    -i(C^{00}D^{00}+R^{00}R^{00})^{-1}R^{00} & (C^{00}D^{00}+R^{00}R^{00})^{-1}C^{00}
  \end{bmatrix}
\end{equation}
and hierarchical matrices are closed under inverses and products.
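This inverse can be verified by direct multiplication, assuming (as holds for
hierarchical matrices sharing the same structure) that $C^{00}$, $R^{00}$, and
$D^{00}$ commute. Writing $\Delta=C^{00}D^{00}+R^{00}R^{00}$ as a shorthand
introduced only for this check,
\begin{equation}
  \begin{bmatrix}
    C^{00}&iR^{00}\\iR^{00}&D^{00}
  \end{bmatrix}
  \begin{bmatrix}
    D^{00}&-iR^{00}\\-iR^{00}&C^{00}
  \end{bmatrix}
  =
  \begin{bmatrix}
    C^{00}D^{00}+R^{00}R^{00} & i(R^{00}C^{00}-C^{00}R^{00}) \\
    i(R^{00}D^{00}-D^{00}R^{00}) & R^{00}R^{00}+D^{00}C^{00}
  \end{bmatrix}
  =
  \begin{bmatrix}
    \Delta&0\\0&\Delta
  \end{bmatrix}
\end{equation}
so that multiplying the second factor by $\Delta^{-1}$ gives the stated
inverse.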
Because of the structure of the 01 matrices, the volume element will depend
only on the diagonals of the matrices in this inverse block matrix. If we define
\begin{align}
  \tilde c_\mathrm d^{00}&=[(C^{00}D^{00}+R^{00}R^{00})^{-1}C^{00}]_{\text d} \\
  \tilde r_\mathrm d^{00}&=[(C^{00}D^{00}+R^{00}R^{00})^{-1}R^{00}]_{\text d} \\
  \tilde d_\mathrm d^{00}&=[(C^{00}D^{00}+R^{00}R^{00})^{-1}D^{00}]_{\text d}
\end{align}
as the diagonals of the blocks of the inverse matrix, then the result  of the product is
\begin{equation}
  \begin{aligned}
     &   \begin{bmatrix}
       C^{01}&iR^{01}\\iR^{10}&D^{01}
     \end{bmatrix}^T
     \begin{bmatrix}
       C^{00}&iR^{00}\\iR^{00}&D^{00}
     \end{bmatrix}^{-1}
     \begin{bmatrix}
       C^{01}&iR^{01}\\iR^{10}&D^{01}
     \end{bmatrix} \\
     &\qquad=\begin{bmatrix}
       q^2\tilde d_\mathrm d^{00}+2qr_{10}\tilde r^{00}_\mathrm d-r_{10}^2\tilde c^{00}_\mathrm d
      &
      i\left[d_{01}(r_{10}\tilde c^{00}_\mathrm d-q\tilde r^{00}_\mathrm d)+r_{01}(r_{10}\tilde r^{00}_\mathrm d+q\tilde d^{00}_\mathrm d)\right]
      \\
      i\left[d_{01}(r_{10}\tilde c^{00}_\mathrm d-q\tilde r^{00}_\mathrm d)+r_{01}(r_{10}\tilde r^{00}_\mathrm d+q\tilde d^{00}_\mathrm d)\right]
      &
      d_{01}^2\tilde c^{00}_\mathrm d+2r_{01}d_{01}\tilde r^{00}_\mathrm d-r_{01}^2\tilde d^{00}_\mathrm d
     \end{bmatrix}
  \end{aligned}
\end{equation}
where each block is a constant $n\times n$ matrix. Because the matrices
$C^{00}$, $R^{00}$, and $D^{00}$ are diagonal in the replica symmetric case,
the diagonals of the blocks above take a simple form:
\begin{align}
  \tilde c_\mathrm d^{00}=f'(1) &&
  \tilde r_\mathrm d^{00}=r^{00}_\mathrm df'(1) &&
  \tilde d_\mathrm d^{00}=d^{00}_\mathrm df'(1)
\end{align}
Once these expressions are inserted into the complexity, the limits of $n$ and
$m$ to zero can be taken, and the parameters from $D^{01}$ and $D^{11}$ can be
extremized explicitly. The result is \eqref{eq:complexity.full} from section \ref{sec:complexity} of the main text.
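
The simple form of the diagonals $\tilde c^{00}_\mathrm d$, $\tilde
r^{00}_\mathrm d$, and $\tilde d^{00}_\mathrm d$ quoted above follows in one
line from the one-point saddle, as this sketch shows. For diagonal matrices
with $c^{00}_\mathrm d=1$, the matrix $(C^{00}D^{00}+R^{00}R^{00})^{-1}$ has
diagonal $1/\big(d^{00}_\mathrm d+(r^{00}_\mathrm d)^2\big)$, and the saddle
point value $d^{00}_\mathrm d=1/f'(1)-(r^{00}_\mathrm d)^2$ gives
$d^{00}_\mathrm d+(r^{00}_\mathrm d)^2=1/f'(1)$. Hence
\begin{equation}
  \tilde c^{00}_\mathrm d=\frac1{d^{00}_\mathrm d+(r^{00}_\mathrm d)^2}=f'(1)
  \qquad
  \tilde r^{00}_\mathrm d=\frac{r^{00}_\mathrm d}{d^{00}_\mathrm d+(r^{00}_\mathrm d)^2}=r^{00}_\mathrm df'(1)
  \qquad
  \tilde d^{00}_\mathrm d=\frac{d^{00}_\mathrm d}{d^{00}_\mathrm d+(r^{00}_\mathrm d)^2}=d^{00}_\mathrm df'(1)
\end{equation}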

\subsection{The range of energies and stabilities of nearby points}
\label{subsec:range}

The range of parameters that result in a positive complexity is found by taking
the complexity \eqref{eq:complexity.full} and further requiring that
$\Sigma_{12}=0$. The maximum and minimum stability are then found by maximizing
this constrained expression over the energy, while the maximum and minimum
energy are found by maximizing it over the stability. In the small-$\Delta q$
expansion outlined in \S\ref{sec:complexity}, these ranges can be computed
analytically. We share the results here for the neighbors to marginal minima
with energies greater than the threshold energy, and confirm that the
analytically computed ranges match those found numerically.

The limits of the stability within which points are found near marginal minima
above the threshold are given by
$\mu_1=\mu_\mathrm m+\delta\mu_1(1-q)\pm\delta\mu_2(1-q)^{3/2}+O\big((1-q)^2\big)$
where $\delta\mu_1$ is given by the coefficient in \eqref{eq:expansion.mu.1}
and
\begin{equation}
  \delta\mu_2=\frac{v_f}{f'(1)f''(1)^{3/4}}\sqrt{
    \frac{E_0-E_\mathrm{th}}2\frac{2f''(1)\big(f''(1)-f'(1)\big)+f'(1)f'''(1)}{u_f}
  }
\end{equation}
Since the limits differ from the most common points at higher order in $\Delta q$, nearby points are of the same kind as the most common population.
Similarly, one finds that the energy lies in the range $E_1=E_0+\delta
E_1(1-q)^2\pm\delta E_2(1-q)^{5/2}+O\big((1-q)^3\big)$ for $\delta E_1$ given
by the coefficient in \eqref{eq:expansion.E.1} and
\begin{equation}
  \begin{aligned}
    \delta E_2
    &=\frac{\sqrt{E_0-E_\mathrm{th}}}{4f'(1)f''(1)^{3/4}}\bigg(
      \frac{
        v_f
        }{3u_f}
        \big[
          f'(1)(2f''(1)-(2-(2-\delta q_0)\delta q_0)f'''(1))-2f''(1)^2
        \big]
        \\
    &\hspace{12pc}\times
        \big[f'(1)\big(6f''(1)+(18-(6-\delta q_0)\delta q_0)f'''(1)\big)-6f''(1)^2
        \big]
    \bigg)^\frac12
  \end{aligned}
\end{equation}
and $\delta q_0$ is the coefficient in the expansion $q_0=1-\delta q_0(1-q)+O((1-q)^2)$, given by the real root of the cubic equation
\begin{equation}
  0=((16-(6-\delta q_0)\delta q_0)\delta q_0-12)f'(1)f'''(1)-2\delta q_0(f''(1)-f'(1))f''(1)
\end{equation}
These predictions from the small $1-q$ expansion are compared with numeric
saddle points for the complexity of marginal minima in
Fig.~\ref{fig:expansion}, and the results agree well at small $1-q$.

\begin{figure}
  \centering
  \includegraphics{figs/expansion_energy.pdf}
  \hspace{1em}
  \includegraphics{figs/expansion_stability.pdf}

  \caption{
    Demonstration of the convergence of the $(1-q)$-expansion for marginal
    reference minima. Solid lines and shaded region are the same as in
    Fig.~\ref{fig:marginal.prop.above} for $E_0-E_\mathrm{th}\simeq0.00667$.
    The dotted lines show the expansion of most common neighbors, while the
    dashed lines in both plots show the expansion for the minimum and maximum
    energies and stabilities found at given $q$.
  } \label{fig:expansion}
\end{figure}


\section{Details of calculation for the isolated eigenvalue}
\label{sec:eigenvalue-details}

Many of the steps in the evaluation of the isolated eigenvalue are similar to
those from the evaluation of the two-point complexity: the treatment of the
average over disorder and the Hubbard--Stratonovich transformation follow the
exact same reasoning. We will not repeat the details of techniques that were
already reported in the previous appendix.

The treatment of the factors in the average over disorder proceeds as it does
for the complexity in subsection \ref{subsec:other.factors}, now with the
disorder-dependent terms captured in the linear operator
\begin{equation}
  \mathcal O(\mathbf t)=
  \sum_a^m\delta(\mathbf t-\pmb\sigma_a)(i\hat{\pmb\sigma}_a\cdot\partial_\mathbf t-\hat\beta_0)
  +
  \sum_b^n\delta(\mathbf t-\mathbf s_b)(i\hat{\mathbf s}_b\cdot\partial_\mathbf t-\hat\beta_1)
  -\frac12
  \delta(\mathbf t-\mathbf s_1)\beta\sum_c^\ell(\mathbf x_c\cdot\partial_{\mathbf t})^2
\end{equation}
that is applied to $H$ by integrating over $\mathbf t\in\mathbb R^N$. The
resulting expression for the integrand produces dependencies on the scalar
products in \eqref{eq:fields} and on the new scalar products involving the
tangent plane vectors $\mathbf x$:
\begin{align}
  A_{ab}=\frac1N\mathbf x_a\cdot\mathbf x_b
  &&
  X^0_{ab}=\frac1N\pmb\sigma_a\cdot\mathbf x_b
  &&
  \hat X^0_{ab}=-i\frac1N\hat{\pmb\sigma}_a\cdot\mathbf x_b
  &&
  X^1_{ab}=\frac1N\mathbf s_a\cdot\mathbf x_b
  &&
  \hat X^1_{ab}=-i\frac1N\hat{\mathbf s}_a\cdot\mathbf x_b
\end{align}
Replacing the original variables using a Hubbard--Stratonovich transformation then proceeds like it did for the complexity in subsection \ref{subsec:hubbard.strat}.
Defining as before a block variable $\mathcal Q_x=(A,X^0,\hat X^0,X^1,\hat X^1)$
and consolidating the previous block variables $\mathcal Q=(\mathcal Q_{00},
\mathcal Q_{01},\mathcal Q_{11})$, we can write the minimum eigenvalue
schematically as
\begin{equation}
  \lambda_\mathrm{min}
  =-2\lim_{\beta\to\infty}
  \lim_{\substack{\ell\to0\\m\to0\\n\to0}}\frac\partial{\partial\ell}\frac1{\beta N}
  \int d\mathcal Q\,d\mathcal Q_x\,
  e^{N[
    m\mathcal S_0(\mathcal Q_{00})
    +n\mathcal S_1(\mathcal Q_{11},\mathcal Q_{01}\mid\mathcal Q_{00})
    +\ell\mathcal S_x(\mathcal Q_x\mid\mathcal Q_{00},\mathcal Q_{01},\mathcal Q_{11})
  ]}
\end{equation}
where $\mathcal S_0$ is given by \eqref{eq:one-point.action}, $\mathcal S_1$ is
given by \eqref{eq:two-point.action}, and the new action $\mathcal S_x$ is
given by
\begin{equation} \label{eq:action.eigenvalue}
  \begin{aligned}
    \ell\mathcal S_x(\mathcal Q_x\mid\mathcal Q)
    =-&\frac12\ell\beta\mu_1+
    \frac12\beta\sum_b^\ell\bigg\{
      \frac12\beta f''(1)\sum_a^\ell A_{ab}^2\\
    &+\sum_a^m\left[
        \big(\hat\beta_0f''(C^{01}_{a1})+R^{10}_{a1}f'''(C^{01}_{a1})\big)(X^0_{ab})^2
        +2f''(C^{01}_{a1})X^0_{ab}\hat X^0_{ab}
      \right] \\
    &+\sum_a^n\left[
        \big(\hat\beta_1f''(C^{11}_{a1})+R^{11}_{a1}f'''(C^{11}_{a1})\big)(X^1_{ab})^2
        +2f''(C^{11}_{a1})X^1_{ab}\hat X^1_{ab}
      \right]
    \bigg\}\\
    &+\frac12\log\det\left(
      A-
      \begin{bmatrix}
        X^0\\\hat X^0\\X^1\\\hat X^1
      \end{bmatrix}^T
      \begin{bmatrix}
        C^{00}&iR^{00}&C^{01}&iR^{01}\\
        iR^{00}&D^{00}&iR^{10}&D^{01}\\
        (C^{01})^T&(iR^{10})^T&C^{11}&iR^{11}\\
        (iR^{01})^T&(D^{01})^T&iR^{11}&D^{11}\\
      \end{bmatrix}^{-1}
      \begin{bmatrix}
        X^0\\\hat X^0\\X^1\\\hat X^1
      \end{bmatrix}
    \right)
  \end{aligned}
\end{equation}
As usual in these quenched Franz--Parisi style computations, the saddle point expressions for the variables $\mathcal Q$ in the joint limits of $m$, $n$, and $\ell$ to zero are independent of $\mathcal Q_x$, and so these quantities take the same value they do for the two-point complexity that we computed above. The saddle point conditions for the variables $\mathcal Q_x$ are found by extremizing the action with respect to them once the variables $\mathcal Q$ from the complexity have been fixed.

To evaluate this expression, we need a sensible ansatz for the variables
$\mathcal Q_x$. The matrix $A$ we expect to be an ordinary hierarchical matrix,
and since the model is a spherical 2-spin model, the order at finite but low temperature
will be replica symmetric with nonzero $a_0$. The expected form of the $X$ matrices
follows our reasoning for the 01 matrices of Appendix \ref{subsec:saddle}: namely, they
should have constant rows and a column structure which matches that of the
level of \textsc{rsb} order associated with the degrees of freedom that
parameterize the columns. Since both the reference configurations and the
constrained configurations have replica symmetric order, we expect
\begin{align}
  X^0
  =
  \begin{subarray}{l}
    \hphantom{[}\begin{array}{ccc}\leftarrow&\ell&\rightarrow\end{array}\hphantom{\Bigg]}\\
  \left[
    \begin{array}{ccc}
      x_0&\cdots&x_0\\
      0&\cdots&0\\
      \vdots&\ddots&\vdots\\
      0&\cdots&0
    \end{array}
  \right]\begin{array}{c}
    \\\uparrow\\m-1\\\downarrow
  \end{array}\\
  \vphantom{\begin{array}{c}n\end{array}}
  \end{subarray}
  &&
  \hat X^0
  =
  \left[
    \begin{array}{ccc}
      \hat x_0&\cdots&\hat x_0\\
      0&\cdots&0\\
      \vdots&\ddots&\vdots\\
      0&\cdots&0
    \end{array}
  \right]
  &&
  X^1
  =
  \begin{subarray}{l}
    \hphantom{[}\begin{array}{ccc}\leftarrow&\ell&\rightarrow\end{array}\hphantom{\Bigg]}\\
  \left[
    \begin{array}{ccc}
      0&\cdots&0\\
      x_1&\cdots&x_1\\
      \vdots&\ddots&\vdots\\
      x_1&\cdots&x_1
    \end{array}
  \right]\begin{array}{c}
    \\\uparrow\\n-1\\\downarrow
  \end{array}\\
  \vphantom{\begin{array}{c}n\end{array}}
  \end{subarray}
  &&
  \hat X^1
  =\begin{bmatrix}
    \hat x_1^0&\cdots&\hat x_1^0\\
    \hat x_1^1&\cdots&\hat x_1^1\\
    \vdots&\ddots&\vdots\\
    \hat x_1^1&\cdots&\hat x_1^1
  \end{bmatrix}
\end{align}
Here, the lower blocks of the 0 matrices are zero, because the replicas whose overlap they represent (that of the normalization of the reference configuration) have no
correlation with the reference or anything else. The first row of the $X^1$ matrix
needs to be zero because of the constraint that the tangent space vectors lie
in the tangent plane to the sphere, and therefore have $\mathbf x_a\cdot\mathbf
s_1=0$ for any $a$. This produces five parameters to deal with, which we
compile in the vector $\mathcal X=(x_0,\hat x_0,x_1,\hat x_1^1,\hat x_1^0)$.

Inserting this ansatz is straightforward in the first part of
\eqref{eq:action.eigenvalue}, but the term with $\log\det$ is again more complicated.
We must invert the block matrix inside. We define
\begin{equation}
    \begin{bmatrix}
      C^{00}&iR^{00}&C^{01}&iR^{01}\\
      iR^{00}&D^{00}&iR^{10}&D^{01}\\
      (C^{01})^T&(iR^{10})^T&C^{11}&iR^{11}\\
      (iR^{01})^T&(D^{01})^T&iR^{11}&D^{11}\\
    \end{bmatrix}^{-1}
    =
  \begin{bmatrix}
    M_{11} & M_{12} \\
    M_{12}^T & M_{22}
  \end{bmatrix}
\end{equation}
where the blocks inside the inverse are given by
\begin{align}
  M_{11}&=
    \left(
      \begin{bmatrix}
        C^{00}&iR^{00}\\iR^{00}&D^{00}
      \end{bmatrix}
      -
      \begin{bmatrix}
        C^{01}&iR^{01}\\
        iR^{10}&D^{01}
      \end{bmatrix}
      \begin{bmatrix}
        C^{11}&iR^{11}\\iR^{11}&D^{11}
      \end{bmatrix}^{-1}
      \begin{bmatrix}
        C^{01}&iR^{01}\\
        iR^{10}&D^{01}
      \end{bmatrix}^T
    \right)^{-1}
    \\
  M_{12}&=-
  M_{11}
      \begin{bmatrix}
        C^{01}&iR^{01}\\
        iR^{10}&D^{01}
      \end{bmatrix}
      \begin{bmatrix}
        C^{11}&iR^{11}\\iR^{11}&D^{11}
      \end{bmatrix}^{-1}
      \\
  M_{22}&=
    \left(
      \begin{bmatrix}
        C^{11}&iR^{11}\\iR^{11}&D^{11}
      \end{bmatrix}
      -
      \begin{bmatrix}
        C^{01}&iR^{01}\\
        iR^{10}&D^{01}
      \end{bmatrix}^T
      \begin{bmatrix}
        C^{00}&iR^{00}\\iR^{00}&D^{00}
      \end{bmatrix}^{-1}
      \begin{bmatrix}
        C^{01}&iR^{01}\\
        iR^{10}&D^{01}
      \end{bmatrix}
    \right)^{-1}
\end{align}
Here, $M_{22}$ is the inverse of the matrix already analyzed as part of
\eqref{eq:two-point.action}. Following our discussion of the inverses of block
replica matrices above, and reasoning about their products with the rectangular
block-constant matrices, things can be worked out using a computer
algebra system. For instance, the second term in $M_{11}$ contributes nothing
once the appropriate limits are taken, because each contribution is
proportional to $n$.

The contribution from the product with the block inverse matrix can be written as
\begin{equation} \label{eq:inverse.quadratic.form}
    \begin{bmatrix}
      X^0\\i\hat X^0
    \end{bmatrix}^TM_{11}
    \begin{bmatrix}
      X^0\\i\hat X^0
    \end{bmatrix}
    +
    2\begin{bmatrix}
      X^0\\i\hat X^0
    \end{bmatrix}^TM_{12}
    \begin{bmatrix}
      X^1\\i\hat X^1
    \end{bmatrix}
    +
    \begin{bmatrix}
      X^1\\i\hat X^1
    \end{bmatrix}^TM_{22}
    \begin{bmatrix}
      X^1\\i\hat X^1
    \end{bmatrix}
\end{equation}
and without too much reasoning one can see that the result is an $\ell\times\ell$ constant matrix. If $A$ is a replica matrix and $c$ is a constant, then
\begin{equation}
  \log\det(A-c)=\log\det A-\frac{c}{\sum_{i=0}^k(a_{i+1}-a_i)x_{i+1}}
\end{equation}
where $a_{k+1}=1$ and $x_{k+1}=1$.
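As a sketch of how this identity is used, take $A$ to be replica symmetric
($k=0$), with unit diagonal and off-diagonal entries $a_0$, and read both
sides per replica to leading order in $\ell$ (this normalization is our
reading of the formula rather than something stated explicitly). The
eigenvalues of $A$ are $1+(\ell-1)a_0$ once and $1-a_0$ with multiplicity
$\ell-1$, so
\begin{equation}
  \lim_{\ell\to0}\frac1\ell\log\det A
  =\log(1-a_0)+\frac{a_0}{1-a_0}
\end{equation}
while the matrix determinant lemma applied to the rank-one shift by the
constant matrix with entries $c$ gives
\begin{equation}
  \lim_{\ell\to0}\frac1\ell\big[\log\det(A-c)-\log\det A\big]
  =\lim_{\ell\to0}\frac1\ell\log\left(1-\frac{c\,\ell}{1-a_0+\ell a_0}\right)
  =-\frac c{1-a_0}
\end{equation}
in agreement with the denominator $(a_1-a_0)x_1=1-a_0$ of the general formula.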
The basic form of the action is therefore (for replica symmetric $A$)
\begin{equation}
  2\mathcal S_x(\mathcal Q_x\mid\mathcal Q)
  =-\beta\mu_1+\frac12\beta^2f''(1)(1-a_0^2)+\log(1-a_0)+\frac{a_0}{1-a_0}+\mathcal X^T\left(\beta B-\frac1{1-a_0}C\right)\mathcal X
\end{equation}
where the matrix $B$ comes from the $\mathcal X$-dependent parts of the first
lines of \eqref{eq:action.eigenvalue} and is given by
\begin{equation} \label{eq:matrix.b}
  B=\begin{bmatrix}
    \hat\beta_0f''(q)+r_{10}f'''(q)&f''(q)&0&0&0\\
    f''(q)&0&0&0&0\\
    0&0&-\hat\beta_1f''(q^{11}_0)-r^{11}_0f'''(q^{11}_0)&-f''(q_0^{11})&0\\
    0&0&-f''(q_0^{11})&0&0\\
    0&0&0&0&0
  \end{bmatrix}
\end{equation}
and where the matrix $C$ encodes the coefficients of the quadratic form
\eqref{eq:inverse.quadratic.form}, and is given element-wise by
\begin{align}
  \notag
  &
  C_{11}=d^{00}_\mathrm df'(1)
  \qquad
  C_{12}=r^{00}_\mathrm df'(1)
  \qquad
  C_{22}=-f'(1)
  \\
  \notag
  &
  C_{13}
  =\frac1{1-q_0}\left(
    (r^{11}_d-r^{11}_0)\left(r^{01}-q\frac{r^{11}_d-r^{11}_0}{1-q_0}\right)(f'(1)-f'(q_0))+qf'(1)d^{00}_d+r^{00}_d(r^{10}f'(1)+(r^{11}_d-r^{11}_0)f'(q))
    \right)
  \\
  \notag
  &
  C_{15}=r^{00}_df'(q)+\left(r^{01}-q\frac{r^{11}_d-r^{11}_0}{1-q_0}\right)(f'(1)-f'(q_0))
  \qquad
  C_{14}=-C_{15}
  \\
  &
  C_{23}=\frac1{1-q_0}\left((qr^{00}_d-r^{10})f'(1)-(r^{11}_d-r^{11}_0)f'(q)\right)
  \qquad
  C_{24}=f'(q)
  \qquad
  C_{25}=-C_{24} \label{eq:matrix.c}
  \\
  \notag
  &
  C_{33}
  =
    -\frac{r^{11}_d-r^{11}_0}{1-q_0}\left[
      \frac{r^{11}_d-r^{11}_0}{1-q_0}f'(1)
      -2\left(
        \frac{qr^{01}-r^{11}_0}{1-q_0}+\frac{1-q^2}{1-q_0}\frac{r^{11}_d-r^{11}_0}{1-q_0}
        \right)(f'(1)-f'(q_0))
        -2\frac{qr^{00}_d-r^{10}}{1-q_0}f'(q)
    \right]\\
  \notag
  &\qquad-\frac{1-q^2}{(1-q_0)^2}-\frac{(r^{10}-qr^{00}_d)^2}{(1-q_0)^2}f'(1)
    \\
  \notag
  &
  C_{34}
  =-(qr^{01}-r^{11}_0)\frac{f'(1)-f'(q_0)}{1-q_0}-\frac{r^{11}_d-r^{11}_0}{1-q_0}\left(
    \frac{1-q^2}{1-q_0}(f'(1)-f'(q_0))-f'(q_0)
    \right)-f'(q)\frac{qr^{00}_d-r^{10}}{1-q_0}
    \\
  \notag
  &
    C_{35}=-C_{34}-\frac{r^{11}_d-r^{11}_0}{1-q_0}(f'(1)-f'(q_0))
  \qquad
    C_{44}=f'(1)-2f'(q_0)
    \qquad
    C_{45}=f'(q_0)
    \qquad
    C_{55}=-f'(1)
  \notag
\end{align}
The saddle point conditions read
\begin{align}
  0=-\beta^2f''(1)a_0+\frac{a_0-\mathcal X^TC\mathcal X}{(1-a_0)^2}
  &&
  0=\bigg(\beta B-\frac1{1-a_0}C\bigg)\mathcal X
\end{align}
Note that the second of these conditions implies that the quadratic form in
$\mathcal X$ in the action vanishes at the saddle.
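For completeness, the first condition can be checked by differentiating the
action with respect to $a_0$ at fixed $\mathcal X$. This is only a sketch, and
it assumes that $B$ and $C$ carry no dependence on $a_0$, as is the case for
the entries listed above. The elementary derivatives of the $a_0$-dependent
terms are
\begin{align*}
  \frac{\partial}{\partial a_0}\left[\frac12\beta^2f''(1)(1-a_0^2)\right]&=-\beta^2f''(1)a_0 \\
  \frac{\partial}{\partial a_0}\left[\log(1-a_0)+\frac{a_0}{1-a_0}\right]&=-\frac1{1-a_0}+\frac1{(1-a_0)^2}=\frac{a_0}{(1-a_0)^2} \\
  \frac{\partial}{\partial a_0}\left[-\frac{\mathcal X^TC\mathcal X}{1-a_0}\right]&=-\frac{\mathcal X^TC\mathcal X}{(1-a_0)^2}
\end{align*}
and summing the three lines reproduces the first saddle point condition.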

We would like to take the limit of $\beta\to\infty$. As is usual in the
two-spin model, the appropriate limit of the order parameter is
$a_0=1-(y\beta)^{-1}$. Upon inserting this scaling and taking the limit, we
finally find
\begin{equation}
  \lambda_\mathrm{min}=-2\lim_{\beta\to\infty}\frac1\beta\mathcal S_x
  =\mu_1-\left(y+\frac1yf''(1)\right)
\end{equation}
with associated saddle point conditions
\begin{align}
  0=-f''(1)+y^2(1-\mathcal X^TC\mathcal X)
  &&
  0=(B-yC)\mathcal X
\end{align}
as reported in the main text.
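The limit can be followed term by term. As a sketch, using only the scaling
$a_0=1-(y\beta)^{-1}$ stated above and dropping the quadratic form in
$\mathcal X$, which vanishes at the saddle, the $a_0$-dependent terms in the
action behave for large $\beta$ as
\begin{align*}
  \frac12\beta^2f''(1)(1-a_0^2)&=\frac\beta yf''(1)+O(1) \\
  \frac{a_0}{1-a_0}&=y\beta-1 \\
  \log(1-a_0)&=-\log(y\beta)
\end{align*}
The logarithm is subleading in $\beta$, so that
$2\mathcal S_x/\beta\to-\mu_1+y+f''(1)/y$. In the saddle point conditions,
$(1-a_0)^{-1}=y\beta$ and $a_0\to1$, which after dividing by $\beta^2$ and
$\beta$, respectively, gives the two conditions above.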

The solution described here also encodes information about the correlation
between the eigenvector $\mathbf x_\text{min}$ associated with the minimum eigenvalue and the tangent
direction $\mathbf x_{0\leftarrow1}$ connecting the two stationary points.
The overlap between these vectors is directly related to the value of the order parameter $x_0=\frac1N\pmb\sigma_1\cdot\mathbf x_a$. The tangent vector is $\mathbf x_{0\leftarrow
1}=\frac1{\sqrt{1-q^2}}\big(\pmb\sigma_1-q\mathbf s_a\big)$, which lies strictly in the
tangent plane of $\mathbf s_a$ and is normalized because $\|\pmb\sigma_1-q\mathbf s_a\|^2=N(1-q^2)$. The overlap between the two vectors is then
\begin{equation}
  q_\mathrm{min}=\frac{\mathbf x_{0\leftarrow 1}\cdot\mathbf x_\mathrm{min}}N
  =\frac{x_0}{\sqrt{1-q^2}}
\end{equation}
where we have used $\mathbf x_\mathrm{min}\cdot\mathbf s_a=0$, which follows from the restriction of
the $\mathbf x$ vectors to the tangent plane at $\mathbf s_a$.


\section{Comparison with the Franz--Parisi potential}
\label{sec:franz-parisi}

The comparison between the Franz--Parisi potential at zero temperature and the
minimum-energy limit of the two-point complexity is of interest
because the two computations qualitatively describe the same thing.
However, it was previously found that the two computations produce different
results in the pure spherical models, to the surprise of those researchers
\cite{Ros_2019_Complexity}. The origin of this difference is subtle. The
zero-temperature Franz--Parisi potential underestimates the energy where nearby
minima are found, because it includes any configuration that is a minimum on
the subspace created by constraining the overlap. Many of these configurations
will not have zero gradient perpendicular to the overlap constraint manifold,
and therefore are not proper minima of the energy.

A strange feature of the comparison for the pure spherical models was that the
two-point complexity and the Franz--Parisi potential coincided at their local
maximum in $q$. It is not clear why this coincidence occurs, but it is good
news for those who use the Franz--Parisi potential to estimate the height of
the free energy barrier between states. Though it underestimates the energy of
nearby states everywhere else, it correctly gives the value of
this highest barrier.

Here, we compute the Franz--Parisi potential for the mixed spherical models at zero
temperature, with respect to a reference configuration fixed to be a stationary
point of energy $E_0$ and stability $\mu_0$ \cite{Franz_1995_Recipes,
Franz_1998_EffectivePotential}. Comparing with the lower energy boundary of the
two-point complexity, we find that the story in the mixed models is the same as
that in the pure models: the Franz--Parisi potential underestimates the lowest
energy of nearby minima almost everywhere except at its peak, where the two
measures coincide.

The potential is defined as the average free
energy of a system constrained to have a fixed overlap $q$ with a reference
configuration (here a stationary point with fixed energy and stability), and is
given by
\begin{equation} \label{eq:franz-parisi.definition}
  \beta V_\beta(q\mid E_0,\mu_0)
  =-\frac1N\overline{\int\frac{d\nu_H(\pmb\sigma,\varsigma\mid E_0,\mu_0)}{\int d\nu_H(\pmb\sigma',\varsigma'\mid E_0,\mu_0)}\,
  \log\bigg(\int d\mathbf s\,\delta\big(\|\mathbf s\|^2-N\big)\,\delta(\pmb\sigma\cdot\mathbf s-Nq)\,e^{-\beta H(\mathbf s)}\bigg)}
\end{equation}
Both the denominator and the logarithm are treated using the replica trick, which yields
\begin{equation}
  \beta V_\beta(q\mid E_0,\mu_0)
  =-\frac1N\lim_{\substack{m\to0\\n\to0}}\frac\partial{\partial n}\overline{\int\left(\prod_{b=1}^md\nu_H(\pmb\sigma_b,\varsigma_b\mid E_0,\mu_0)\right)\left(\prod_{a=1}^nd\mathbf s_a\,\delta(\|\mathbf s_a\|^2-N)\,\delta(\pmb \sigma_1\cdot \mathbf s_a-Nq)\,e^{-\beta H(\mathbf s_a)}\right)}
\end{equation}
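The two replica limits implement standard identities, recalled here as a
reminder. The symbols $Z$ and $\mathcal N$ are shorthand introduced only for
this remark: $Z$ denotes the constrained partition function inside the
logarithm of \eqref{eq:franz-parisi.definition}, and $\mathcal N$ the
normalization $\int d\nu_H$ in its denominator. Then
\begin{align*}
  \log Z=\lim_{n\to0}\frac\partial{\partial n}Z^n
  &&
  \frac1{\mathcal N}=\lim_{m\to0}\mathcal N^{m-1}
\end{align*}
The factor $\mathcal N^{m-1}$ supplies the $m-1$ additional replicas
$\pmb\sigma_2,\ldots,\pmb\sigma_m$ of the reference configuration, while $Z^n$
supplies the $n$ replicas $\mathbf s_a$, all of which are constrained with
respect to the first reference replica $\pmb\sigma_1$.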
The evaluation of this average proceeds in much the same way as for the complexity or
the isolated eigenvalue. Once the $\delta$-functions are converted to
exponentials, the $H$-dependent terms can be expressed by convolution with the
linear operator
\begin{equation}
  \mathcal O(\mathbf t)
  =\sum_a^m\delta(\mathbf t-\pmb\sigma_a)\left(
    i\hat{\pmb\sigma}_a\cdot\partial_{\mathbf t}-\hat\beta_0
    \right)
    -\beta
    \sum_a^n\delta(\mathbf t-\mathbf s_a)
\end{equation}
Averaging over $H$ squares the application of this operator to $f$ as before.
After performing a Hubbard--Stratonovich transformation using matrix order parameters
identical to those used in the calculation of the complexity, we find that
\begin{equation}
  \beta V_\beta(q\mid E_0,\mu_0)=-\frac1N\lim_{\substack{m\to0\\n\to0}}\frac\partial{\partial n}\int d\mathcal Q_0\,d\mathcal Q_1\,e^{Nm\mathcal S_0(\mathcal Q_0)+Nn\mathcal S_\mathrm{FP}(\mathcal Q_1\mid\mathcal Q_0)}
\end{equation}
where $\mathcal S_0$ is the same as in \eqref{eq:one-point.action} and
\begin{equation}
  n\mathcal S_{\mathrm{FP}}
  =\frac12\beta^2\sum_{ab}^nf(Q_{ab})
  +\beta\sum_a^m\sum_b^n\left[
    \hat\beta_0f(C^{01}_{ab})
    +R^{10}_{ab}f'(C^{01}_{ab})
  \right]
  +\frac12\log\det\left(
    Q-\begin{bmatrix}C^{01}\\iR^{10}\end{bmatrix}^T\begin{bmatrix}C^{00}&iR^{00}\\iR^{00}&D^{00}\end{bmatrix}^{-1}\begin{bmatrix}C^{01}\\iR^{10}\end{bmatrix}
    \right)
\end{equation}
Here, because we are at low but nonzero temperature for the constrained configuration,
we make a {\oldstylenums1}\textsc{rsb} ansatz for the matrix $Q$, while the
$00$ matrices will take their saddle point value for the one-point complexity
and the $01$ matrices have the same structure as \eqref{eq:01.ansatz}.
Inserting these gives
\begin{equation}
  \begin{aligned}
    -\beta V_\beta&=\frac12\beta^2\big[f(1)-(1-x)f(q_1)-xf(q_0)\big]
    +\beta\hat\beta_0f(q)+\beta r^{10}f'(q)-\frac12\frac{1-x}x\log(1-q_1)
    \\
                 &\qquad+\frac1{2x}\log(1-(1-x)q_1-xq_0)
     +\frac12\frac{q_0-d^{00}_df'(1)q^2-2r^{00}_df'(1)r^{10}q+(r^{10})^2f'(1)}{
      1-(1-x)q_1-xq_0
    }
  \end{aligned}
\end{equation}
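The logarithmic terms can be traced to the spectrum of the
{\oldstylenums1}\textsc{rsb} matrix. As a sketch, assuming the standard
parameterization in which the $n\times n$ matrix $Q$ has unit diagonal, entries
$q_1$ inside diagonal blocks of size $x$, and entries $q_0$ elsewhere, its
eigenvalues and their multiplicities are
\begin{align*}
  &1-q_1 && \text{with multiplicity } n\left(1-\tfrac1x\right) \\
  &1-(1-x)q_1-xq_0 && \text{with multiplicity } \tfrac nx-1 \\
  &1-(1-x)q_1-xq_0+nq_0 && \text{with multiplicity } 1
\end{align*}
so that
\begin{equation*}
  \lim_{n\to0}\frac1n\log\det Q
  =-\frac{1-x}x\log(1-q_1)+\frac1x\log(1-(1-x)q_1-xq_0)+\frac{q_0}{1-(1-x)q_1-xq_0}
\end{equation*}
The constant shift produced by the quadratic form inside the determinant of
$\mathcal S_\mathrm{FP}$ modifies only the numerator of the last term.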
The saddle point for $r^{10}$ can be taken explicitly. After this, we take the
limit $\beta\to\infty$. There are two possibilities. First, in the replica
symmetric case, $x=1$ and at large $\beta$ the overlap $q_0$ scales like
$q_0=1-(y_0\beta)^{-1}$. Inserting this, the limit is
\begin{equation}
  V_\infty^{\textsc{rs}}=-\hat\beta_0 f(q)-r^{00}_\mathrm df'(q)q-\frac12\left(y_0(1-q^2)+\frac{f'(1)^2-f'(q)^2}{y_0f'(1)}\right)
\end{equation}
The saddle point in $y_0$ can now be taken, taking care to choose the solution with $y_0>0$. This gives
\begin{equation}
  V_\infty^{\textsc{rs}}(q\mid E_0,\mu_0)=-\hat\beta_0f(q)-r^{00}_\mathrm df'(q)q-\sqrt{(1-q^2)\left(f'(1)-\frac{f'(q)^2}{f'(1)}\right)}
\end{equation}
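For reference, the stationarity condition used in this step follows from
differentiating the $y_0$-dependent bracket in the preceding expression for
$V_\infty^{\textsc{rs}}$:
\begin{equation*}
  0=\frac\partial{\partial y_0}\left(y_0(1-q^2)+\frac{f'(1)^2-f'(q)^2}{y_0f'(1)}\right)
  \quad\Longrightarrow\quad
  y_0^2=\frac{f'(1)^2-f'(q)^2}{f'(1)(1-q^2)}
\end{equation*}
The positive root is real provided $f'(q)^2\leq f'(1)^2$ and $|q|<1$.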
The second case is when the inner statistical mechanics problem has replica
symmetry breaking. Here, $q_0$ approaches a nontrivial limit, but
$x=z\beta^{-1}$ approaches zero and $q_1=1-(y_1\beta)^{-1}$ approaches one. The result is
\begin{equation}
  \begin{aligned}
    V_\infty^{\oldstylenums{1}\textsc{rsb}}(q\mid E_0,\mu_0)
    &=-\hat\beta_0f(q)-r^{00}_\mathrm df'(q)q-\frac12\bigg(
      z(f(1)-f(q_0))+\frac{f'(1)}{y_1}-\frac{y_1(q^2-q_0)}{1+y_1z(1-q_0)} \\
    &\hspace{8pc}-(1+y_1z(1-q_0))\frac{f'(q)^2}{y_1f'(1)}+\frac1z\log\left(1+zy_1(1-q_0)\right)
    \bigg)
  \end{aligned}
\end{equation}
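The structure of this expression can be understood from the large-$\beta$
behavior of the combinations that appear in the finite-temperature potential.
As a sketch, inserting only the scalings $x=z\beta^{-1}$ and
$q_1=1-(y_1\beta)^{-1}$ stated above gives, to leading order in $\beta$,
\begin{align*}
  1-(1-x)q_1-xq_0&\simeq\frac{1+y_1z(1-q_0)}{y_1\beta} \\
  \frac1x\log(1-(1-x)q_1-xq_0)-\frac{1-x}x\log(1-q_1)&\simeq\frac\beta z\log\big(1+zy_1(1-q_0)\big) \\
  \frac12\beta^2\big[f(1)-(1-x)f(q_1)-xf(q_0)\big]&\simeq\frac\beta2\left(\frac{f'(1)}{y_1}+z\big(f(1)-f(q_0)\big)\right)
\end{align*}
which accounts for the recurring combination $1+y_1z(1-q_0)$ and for the
logarithm in the result.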
Though the saddle point in $y_1$ can be evaluated explicitly in
$V_\infty^{\oldstylenums{1}\textsc{rsb}}$, doing so
delivers no insight. The final potential is found by taking the saddle over
$z$, $y_1$, and $q_0$. A plot comparing the result to the minimum energy
saddles is found in Fig.~\ref{fig:franz-parisi}. As noted above, there is
little qualitatively different from what was found in \cite{Ros_2019_Complexity}
for the pure models.

\begin{figure}
  \centering
  \includegraphics{figs/franz_parisi.pdf}

  \caption{
    Comparison of the lowest-energy stationary points at overlap $q$ with a
    reference minimum of $E_0=-1.71865<E_\mathrm{th}$ and
    $\mu_0=6.1>\mu_\mathrm m$ (yellow, top), and the zero-temperature Franz--Parisi potential
    with respect to the same reference minimum (blue, bottom). The two curves
    coincide precisely at their minimum $q=0$ and at the local maximum $q\simeq0.5909$.
  } \label{fig:franz-parisi}
\end{figure}

\bibliographystyle{SciPost_bibstyle}
\bibliography{2-point}

\end{document}