1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
|
\documentclass[reprint,aps,prl,longbibliography]{revtex4-2}
\usepackage[utf8]{inputenc} % why not type "Bézout" with unicode?
\usepackage[T1]{fontenc} % vector fonts plz
\usepackage{amsmath,amssymb,latexsym,graphicx}
\usepackage{newtxtext,newtxmath} % Times for PR
\usepackage[dvipsnames]{xcolor}
\usepackage[
colorlinks=true,
urlcolor=MidnightBlue,
citecolor=MidnightBlue,
filecolor=MidnightBlue,
linkcolor=MidnightBlue
]{hyperref} % ref and cite links with pretty colors
\usepackage{anyfontsize}
\begin{document}
\title{
Unvieling the complexity of heirarchical energy landscapes
}
\author{Jaron Kent-Dobias}
\author{Jorge Kurchan}
\affiliation{Laboratoire de Physique de l'Ecole Normale Supérieure, Paris, France}
\begin{abstract}
We derive the general solution for counting the stationary points of
mean-field complex landscapes. It incorporates Parisi's solution
for the ground state, as it should. Using this solution, we count the
stationary points of two models: one with multi-step replica symmetry
breaking, and one with full replica symmetry breaking.
\end{abstract}
\maketitle
The functions used to describe the energies, costs, and fitnesses of disordered
systems in physics, computer science, and biology are typically \emph{complex},
meaning that they have a number of minima that grows exponentially with the
size of the system. Though they are often called `rough landscapes' to evoke
the intuitive image of many minima in something like a mountain range, the
metaphor to topographical landscapes is strained by the reality that these
complex landscapes also exist in very high dimensions: think of the dimensions
of phase space for $N$ particles, or the number of parameters in a neural
network.
The \emph{complexity} of a function is the logarithm of the average number of
its minima, maxima, and saddle points (collectively stationary points), under
conditions like the value of the energy or the index of the stationary point.
Since in complex landscapes this number grows exponentially with system size,
their complexity is an extensive quantity. Understanding the complexity offers
an understanding about the geometry and topology of the landscape, which can
provide insight into dynamical behavior.
When complex systems are fully connected, i.e., each degree of freedom
interacts directly with every other, they are often described by a hierarchical
structure of the type first proposed by Parisi, the \emph{replica symmetry
breaking} (RSB). This family of structures is rich, spanning uniform
\emph{replica symmetry} (RS), an integer $k$ levels of hierarchical nested
structure ($k$RSB), a full continuum of nested structure (full RSB or FRSB),
and arbitrary combinations thereof. Though these rich structures are understood
in the equilibrium properties of fully connected models, the complexity has
only been computed in RS cases.
In this paper we share the first results for the complexity with nontrivial
hierarchy. Using a general form for the solution, we detail the structure of
landscapes with a 1RSB complexity and a full RSB complexity.
The Thouless--Anderson--Palmer (TAP) complexity is the complexity of a kind of
mean-field free energy. Because of some deep thermodynamic relationships
between the TAP complexity and the equilibrium free energy, the TAP complexity
can be computed with extensions of the equilibrium method. As a result, the TAP
complexity has been previously computed for nontrivial hierarchical structure.
We study the mixed $p$-spin spherical models, with Hamiltonian
\begin{equation} \label{eq:hamiltonian}
H(\mathbf s)=-\sum_p\frac1{p!}\sum_{i_1\cdots i_p}^NJ^{(p)}_{i_1\cdots i_p}s_{i_1}\cdots s_{i_p}
\end{equation}
is defined for vectors $\mathbf s\in\mathbb R^N$ confined to the sphere
$\|\mathbf s\|^2=N$. The coupling coefficients $J$ are taken at random, with
zero mean and variance $\overline{(J^{(p)})^2}=a_pp!/2N^{p-1}$ chosen so that
the energy is typically extensive. The overbar will always denote an average
over the coefficients $J$. The factors $a_p$ in the variances are freely chosen
constants that define the particular model. For instance, the so-called `pure'
models have $a_p=1$ for some $p$ and all others zero.
The variance of the couplings implies that the covariance of the energy with
itself depends on only the dot product (or overlap) between two configurations.
In particular, one finds
\begin{equation} \label{eq:covariance}
\overline{H(\mathbf s_1)H(\mathbf s_2)}=Nf\left(\frac{\mathbf s_1\cdot\mathbf s_2}N\right)
\end{equation}
where $f$ is defined by the series
\begin{equation}
f(q)=\frac12\sum_pa_pq^p
\end{equation}
One needn't start with a Hamiltonian like
\eqref{eq:hamiltonian}, defined as a series: instead, the covariance rule
\eqref{eq:covariance} can be specified for arbitrary, non-polynomial $f$, as in
the `toy model' of M\'ezard and Parisi \cite{Mezard_1992_Manifolds}.
The family of spherical models thus defined is quite rich, and by varying the
covariance $f$ nearly any hierarchical structure can be found in
equilibrium. Because of a correspondence between the ground state complexity
and the entropy at zero temperature, any hierarchical structure in the
equilibrium should be reflected in the complexity.
The complexity is calculated using the Kac--Rice formula, which counts the
stationary points using a $\delta$-function weighted by a Jacobian. The count
is given by
\begin{equation}
\begin{aligned}
\mathcal N(E, \mu)
&=\int_{S^{N-1}}d\mathbf s\, \delta\big(\nabla H(\mathbf s)\big)\,\big|\det\operatorname{Hess}H(\mathbf s)\big| \\
&\hspace{2pc}\times\delta\big(NE-H(\mathbf s)\big)\delta\big(N\mu-\operatorname{Tr}\operatorname{Hess}H(\mathbf s)\big)
\end{aligned}
\end{equation}
with two additional $\delta$-functions inserted to fix the energy density $E$
and the stability $\mu$. The complexity is then
\begin{equation} \label{eq:complexity}
\Sigma(E,\mu)=\lim_{N\to\infty}\frac1N\overline{\log\mathcal N(E, \mu})
\end{equation}
The stability $\mu$, sometimes called the radial reaction, determines the depth
of minima or the index of saddles. At large $N$ the Hessian can be shown to
consist of the sum of a GOE matrix with variance $f''(1)/N$ shifted by a
constant diagonal matrix of value $\mu$. Therefore, the spectrum of the Hessian
is a Wigner semicircle of radius $\mu_m=\sqrt{4f''(1)}$ centered at $\mu$. When
$\mu>\mu_m$, stationary points are minima whose sloppiest eigenvalue is
$\mu-\mu_m$. When $\mu=\mu_m$, the stationary points are marginal minima with
flat directions. When $\mu<\mu_m$, the stationary points are saddles with
indexed fixed to within order one (fixed macroscopic index).
It's worth reviewing the complexity for the best-studied case of the pure model
for $p\geq3$. Here, because the covariance is a homogeneous polynomial, $E$ and
$\mu$ cannot be fixed separately, and one implies the other: $\mu=pE$.
Therefore at each energy there is only one kind of stationary point. When the
energy reaches $E_\mathrm{th}=\mu_m/p$, the population of stationary points
suddenly shifts from all saddles to all minima, and there is an abrupt
percolation transition in the topology of constant-energy slices of the
landscape. This behavior of the complexity can be used to explain a rich
variety of phenomena in the equilibrium and dynamics of the pure models: the
threshold energy $E_\mathrm{th}$ corresponds to the average energy at the
dynamic transition temperature, and the asymptotic energy reached by slow aging
dynamics, and to the algorithmic limit $E_\mathrm{alg}$.
Things become much less clear in even the simplest mixed models. For instance,
one mixed model known to have a replica symmetric complexity was shown to
nonetheless not have a clear relationship between features of the complexity
and the asymptotic dynamics \cite{Folena_2020_Rethinking}.
To compute the complexity in the generic case, we use the replica method to
treat the logarithm inside the average of \eqref{eq:complexity}, and the
$\delta$-functions are written in a Fourier basis.
\begin{figure}
\centering
\includegraphics[width=\columnwidth]{figs/316_detail.pdf}
\caption{
Detail of the `phases' of the $3+16$ model complexity as a function of
energy and stability. Above the yellow marginal stability line the
complexity counts saddles of fixed index, while below that line it counts
minima of fixed stability. The shaded red region shows places where the
complexity is described by the 1RSB solution, while the shaded gray region
shows places where the complexity is described by the RS solution. In white
regions the complexity is zero. Several interesting energies are marked
with vertical black lines: the traditional `threshold' $E_\mathrm{th}$
where minima become most numerous, the algorithmic threshold
$E_\mathrm{alg}$ that bounds the performance of smooth algorithms, and the
average energies at the $2$RSB and $1$RSB equilibrium transitions $\langle
E\rangle_2$ and $\langle E\rangle_1$, respectively. Though the figure is
suggestive, $E_\mathrm{alg}$ lies at slightly lower energy than the termination of the RS
-- 1RSB transition line.
} \label{fig:2rsb.phases}
\end{figure}
It is known that by choosing a covariance $f$ as the sum of polynomials with
well-separated powers, one develops 2RSB in equilibrium. This should correspond
to 1RSB in Kac--Rice. For this example, we take
\begin{equation}
f(q)=\frac12\left(q^3+\frac1{16}q^{16}\right)
\end{equation}
established to have a 2RSB ground state \cite{Crisanti_2011_Statistical}.
With this covariance, the model sees a replica symmetric to 1RSB transition at
$\beta_1=1.70615\ldots$ and a 1RSB to 2RSB transition at
$\beta_2=6.02198\ldots$. At these transitions, the average energies in equilibrium are
$\langle E\rangle_1=-0.906391\ldots$ and $\langle E\rangle_2=-1.19553\ldots$,
respectively, and the ground state energy is $E_0=-1.287\,605\,530\ldots$.
Besides these typical equilibrium energies, an energy of special interest for
looking at the landscape topology is the \emph{algorithmic threshold}
$E_\mathrm{alg}$, defined by the lowest energy reached by local algorithms like
approximate message passing \cite{ElAlaoui_2020_Algorithmic,
ElAlaoui_2021_Optimization}. In the spherical models, this has been proven to
be
\begin{equation}
E_{\mathrm{alg}}=-\int_0^1dq\,\sqrt{f''(q)}
\end{equation}
For full RSB systems, $E_\mathrm{alg}=E_0$ and the algorithm can reach the
ground state energy. For the pure $p$-spin models,
$E_\mathrm{alg}=E_\mathrm{th}$, where $E_\mathrm{th}$ is the energy at which
marginal minima are the most common stationary points. Something about the
topology of the energy function might be relevant to where this algorithmic threshold
lies. For the $3+16$ model at hand, $E_\mathrm{alg}=-1.275\,140\,128\ldots$.
In this model, the RS complexity gives an inconsistent answer for the
complexity of the ground state, predicting that the complexity of minima
vanishes at a higher energy than the complexity of saddles, with both at a
lower energy than the equilibrium ground state. The 1RSB complexity resolves
these problems, predicting the same ground state as equilibrium and with a ground state stability $\mu_0=6.480\,764\ldots>\mu_m$. It predicts that the
complexity of marginal minima (and therefore all saddles) vanishes at
$E_m=-1.287\,605\,527\ldots$, which is very slightly greater than $E_0$. Saddles
become dominant over minima at a higher energy $E_\mathrm{th}=-1.287\,575\,114\ldots$.
The 1RSB complexity transitions to a RS description for dominant stationary
points at an energy $E_1=-1.273\,886\,852\ldots$. The highest energy for which
the 1RSB description exists is $E_\mathrm{max}=-0.886\,029\,051\ldots$
For minima, the complexity does
not inherit a 1RSB description until the energy is with in a close vicinity of
the ground state. On the other hand, for high-index saddles the complexity
becomes described by 1RSB at quite high energies. This suggests that when
sampling a landscape at high energies, high index saddles may show a sign of
replica symmetry breaking when minima or inherent states do not.
Fig.~\ref{fig:2rsb.phases} shows a different detail of the complexity in the
vicinity of the ground state, now as functions of the energy difference and
stability difference from the ground state. Several of the landmark energies
described above are plotted, alongside the boundaries between the `phases.'
Though $E_\mathrm{alg}$ looks quite close to the energy at which dominant
saddles transition from 1RSB to RS, they differ by roughly $10^{-3}$, as
evidenced by the numbers cited above. Likewise, though $\langle E\rangle_1$
looks very close to $E_\mathrm{max}$, where the 1RSB transition line
terminates, they too differ. The fact that $E_\mathrm{alg}$ is very slightly
below the place where most saddle transition to 1RSB is suggestive; we
speculate that an analysis of the typical minima connected to these saddles by
downward trajectories will coincide with the algorithmic limit. An analysis of
the typical nearby minima or the typical downward trajectories from these
saddles at 1RSB is warranted \cite{Ros_2019_Complex, Ros_2021_Dynamical}. Also
notable is that $E_\mathrm{alg}$ is at a significantly higher energy than
$E_\mathrm{th}$; according to the theory, optimal smooth algorithms in this
model stall in a place where minima are exponentially subdominant.
\begin{figure}
\centering
\includegraphics[width=\columnwidth]{figs/24_phases.pdf}
\caption{
`Phases' of the complexity for the $2+4$ model in the energy $E$ and
stability $\mu^*$ plane. The region shaded gray shows where the RS solution
is correct, while the region shaded red shows that where the FRSB solution
is correct. The white region shows where the complexity is zero.
} \label{fig:frsb.phases}
\end{figure}
If the covariance $f$ is chosen to be concave, then one develops FRSB in equilibrium. To this purpose, we choose
\begin{equation}
f(q)=\frac12\left(q^2+\frac1{16}q^4\right)
\end{equation}
also studied before in equilibrium \cite{Crisanti_2004_Spherical, Crisanti_2006_Spherical}. Because the ground state is FRSB, for this model
\begin{equation}
E_0=E_\mathrm{alg}=E_\mathrm{th}=-\int_0^1dq\,\sqrt{f''(q)}=-1.059\,384\,319\ldots
\end{equation}
In the equilibrium solution, the transition temperature from RS to FRSB is $\beta_\infty=1$, with corresponding average energy $\langle E\rangle_\infty=-0.53125\ldots$.
Along the supersymmetric line, the FRSB solution can be found in full, exact
functional form. To treat the FRSB away from this line numerically, we resort to
finite $k$RSB approximations. Since we are not trying to find the actual
$k$RSB solution, but approximate the FRSB one, we drop the extremal condition
\eqref{eq:cond.x} for $x_1,\ldots,x_k$ and instead set
\begin{equation}
x_i=\left(\frac i{k+1}\right)x_\textrm{max}
\end{equation}
and extremize over $x_\textrm{max}$ alone. This dramatically simplifies the
equations that must be solved to find solutions. In the results that follow, a
20RSB approximation is used to trace the dominant saddles and marginal minima, while
a 5RSB approximation is used to trace the (much longer) boundaries of the
complexity.
Fig.~\ref{fig:frsb.complexity} shows the complexity for this model as a
function of energy difference from the ground state for several notable
trajectories in the energy and stability plane. Fig.~\ref{fig:frsb.phases}
shows these trajectories, along with the phase boundaries of the complexity in
this plane. Notably, the phase boundary predicted by \eqref{eq:mu.transition}
correctly predicts where all of the finite $k$RSB approximations terminate.
Like the 1RSB model in the previous subsection, this phase boundary is oriented
such that very few, low energy, minima are described by a FRSB solution, while
relatively high energy saddles of high index are also. Again, this suggests
that studying the mutual distribution of high-index saddle points might give
insight into lower-energy symmetry breaking in more general contexts.
\paragraph{Acknowledgements}
The authors would like to thank Valentina Ros for helpful discussions.
\paragraph{Funding information}
JK-D and JK are supported by the Simons Foundation Grant No.~454943.
\bibliography{frsb_kac-rice}
\end{document}
|