\documentclass[aps,prl,nobibnotes,reprint,longbibliography,floatfix]{revtex4-2}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{amsmath,amssymb,latexsym,graphicx}
\usepackage{newtxtext,newtxmath}
\usepackage{bbold,anyfontsize}
\usepackage[dvipsnames]{xcolor}
\usepackage[
colorlinks=true,
urlcolor=BlueViolet,
citecolor=BlueViolet,
filecolor=BlueViolet,
linkcolor=BlueViolet
]{hyperref}
\begin{document}
\title{
On the topology of solutions to random continuous constraint satisfaction problems
}
\author{Jaron Kent-Dobias}
\email{jaron.kent-dobias@roma1.infn.it}
\affiliation{Istituto Nazionale di Fisica Nucleare, Sezione di Roma I, Rome, Italy 00184}
\begin{abstract}
We consider the set of solutions to $M$ random polynomial equations on the
$(N-1)$-sphere. When solutions exist, they form a manifold. We compute the
average Euler characteristic of this manifold, and find different behaviors
depending on $\alpha=M/N$. When $\alpha<1$, the average Euler characteristic
is subexponential in $N$ but positive, indicating the presence of few
simply-connected components. When $1\leq\alpha<\alpha_\mathrm a^*$, it is
exponentially large in $N$, indicating a shattering transition in the space
of solutions. Finally, when $\alpha_\mathrm a^*\leq\alpha$, solutions cease
to exist. We further compute the average logarithm of the Euler
characteristic, which is representative of typical manifolds. We compare
these results with the analogous calculation for the topology of level sets
of the spherical spin glasses, whose connected phase has a negative Euler
characteristic indicative of many holes.
\end{abstract}
\maketitle
We consider the problem of finding configurations $\mathbf x\in\mathbb R^N$
lying on the $(N-1)$-sphere $\|\mathbf x\|^2=N$ that simultaneously satisfy $M$
nonlinear constraints $V_k(\mathbf x)=0$ for $1\leq k\leq M$. The nonlinear
constraints are taken to be centered Gaussian random functions with covariance
\begin{equation} \label{eq:covariance}
\overline{V_i(\mathbf x)V_j(\mathbf x')}=\delta_{ij}f\left(\frac{\mathbf x\cdot\mathbf x'}N\right)
\end{equation}
for some choice of $f$.
When the covariance function $f$ is polynomial, the $V_k$ are also polynomial, with a term of degree $p$ in $f$ corresponding to all possible terms of degree $p$ in $V_k$. In particular, taking
\begin{equation}
V_k(\mathbf x)
=\sum_{p=0}^\infty\sqrt{\frac{f^{(p)}(0)}{p!\,N^p}}
\sum_{i_1\cdots i_p}^NJ^{(k,p)}_{i_1\cdots i_p}x_{i_1}\cdots x_{i_p}
\end{equation}
with the elements of the tensors $J^{(k,p)}$ independently distributed as unit
normal variables satisfies \eqref{eq:covariance}, since averaging the
degree-$p$ terms over $J^{(k,p)}$ produces $f^{(p)}(0)(\mathbf x\cdot\mathbf
x'/N)^p/p!$, the $p$th term in the Taylor series of $f$.
This problem and small variations thereof have attracted attention recently for
their resemblance to encryption, optimization, and vertex models of confluent
tissues \cite{Fyodorov_2019_A, Fyodorov_2020_Counting,
Fyodorov_2022_Optimization, Urbani_2023_A, Kamali_2023_Dynamical,
Kamali_2023_Stochastic, Urbani_2024_Statistical, Montanari_2023_Solving,
Montanari_2024_On, Kent-Dobias_2024_Conditioning, Kent-Dobias_2024_Algorithm-independent}. In each of these cases, the authors studied properties of
the cost function
\begin{equation}
\mathcal C(\mathbf x)=\frac12\sum_{k=1}^MV_k(\mathbf x)^2
\end{equation}
which achieves zero cost only for configurations that satisfy all the
constraints.
Here we dispense with defining a cost function, and instead study
the set of solutions directly.
The set of solutions to our nonlinear random constraint satisfaction problem
can be written as
\begin{equation}
\Omega=\{\mathbf x\in\mathbb R^N\mid \|\mathbf x\|^2=N,0=V_k(\mathbf x)
\;\forall\;k=1,\ldots,M\}
\end{equation}
$\Omega$ is almost always a manifold without singular points. The conditions for a singular point are that
$0=\frac\partial{\partial\mathbf x}V_k(\mathbf x)$ for all $k$. This is
equivalent to asking that the constraints $V_k$ all have a stationary point at
the same place. When the $V_k$ are independent and random, this is vanishingly
unlikely, requiring $NM$ independent equations to be simultaneously satisfied.
This means that different connected components of the set of solutions do not
intersect, nor are there self-intersections, without extraordinary fine-tuning.
One result of this previous work is the value of $\alpha$ for the
satisfiability transition, which is $\alpha_\text{\textsc{sat}}=f'(1)/f(1)$.
For $\alpha$ larger than $\alpha_\text{\textsc{sat}}$ solutions do not exist.
The Euler characteristic $\chi$ of a manifold is a topological invariant \cite{Hatcher_2002_Algebraic}. It is
perhaps most familiar in the context of connected compact orientable surfaces, where it
characterizes the number of handles in the surface: $\chi=2(1-\#)$ for $\#$
handles. For general $d$, the Euler characteristic of the $d$-sphere is $2$ if $d$ is even and 0 if $d$ is odd. The canonical method for computing the Euler characteristic is to
define a complex on the manifold in question, essentially a
higher-dimensional generalization of a polygonal tiling. Then $\chi$ is given
by an alternating sum over the number of cells of increasing dimension, which
for 2-manifolds corresponds to the number of vertices, minus the number of
edges, plus the number of faces.
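As a simple illustration, which is not part of the analysis here, the surface
of a cube is a complex on the 2-sphere with 8 vertices, 12 edges, and 6 faces,
giving
\begin{equation}
\chi=8-12+6=2
\end{equation}
in agreement with the value $\chi=2(1-\#)$ for a surface with no handles.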
Morse theory offers another way to compute the Euler characteristic using the
statistics of stationary points of a function $H:\Omega\to\mathbb R$ \cite{Audin_2014_Morse}. For
functions $H$ without any symmetries with respect to the manifold, the surfaces
of gradient flow between adjacent stationary points form a complex. The
alternating sum over cells to compute $\chi$ becomes an alternating sum over
the count of stationary points of $H$ with increasing index, or
\begin{equation}
\chi=\sum_{i=0}^N(-1)^i\mathcal N_H(\text{index}=i)
\end{equation}
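A standard example, included here only for orientation, is the height function
on an upright torus. It has one minimum (index 0), two saddle points (index 1),
and one maximum (index 2), so that
\begin{equation}
\chi=(-1)^0+2(-1)^1+(-1)^2=1-2+1=0
\end{equation}
consistent with $\chi=2(1-\#)$ for a surface with one handle.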
Conveniently, we can express this abstract sum as an integral over the manifold
using a small variation on the Kac--Rice formula for counting stationary
points. Since the sign of the determinant of the Hessian matrix of $H$ at a
stationary point of index $i$ is $(-1)^i$, if we count stationary points weighted by
the sign of the determinant, we arrive at the Euler characteristic, or
\begin{equation} \label{eq:kac-rice}
\chi=\int_\Omega d\mathbf x\,\delta\big(\nabla H(\mathbf x)\big)\det\operatorname{Hess}H(\mathbf x)
\end{equation}
When the Kac--Rice formula is used to \emph{count} stationary points, the
absolute value of the determinant appears, a nuisance that one must take pains to preserve
\cite{Fyodorov_2004_Complexity}. Here we are correct to exclude it.
We treat the integral over the implicitly defined manifold $\Omega$ using the
method of Lagrange multipliers. We introduce one multiplier $\omega_0$ to
enforce the spherical constraint and $M$ multipliers $\omega_k$ to enforce the vanishing of
each of the $V_k$, resulting in the Lagrangian
\begin{equation}
L(\mathbf x,\pmb\omega)
=H(\mathbf x)+\frac12\omega_0\big(\|\mathbf x\|^2-N\big)
+\sum_{k=1}^M\omega_kV_k(\mathbf x)
\end{equation}
The integral over $\Omega$ in \eqref{eq:kac-rice} then becomes
\begin{equation} \label{eq:kac-rice.lagrange}
\chi=\int_{\mathbb R^N} d\mathbf x\int_{\mathbb R^{M+1}}d\pmb\omega
\,\delta\big(\partial L(\mathbf x,\pmb\omega)\big)
\det\partial\partial L(\mathbf x,\pmb\omega)
\end{equation}
where $\partial=[\frac\partial{\partial\mathbf x},\frac\partial{\partial\pmb\omega}]$
is the vector of partial derivatives with respect to all $N+M+1$ variables.
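Written out in components, and assuming only the Lagrangian above, the
condition $\partial L=0$ enforced by the $\delta$ function reads
\begin{align}
\frac{\partial L}{\partial\mathbf x}
&=\nabla H(\mathbf x)+\omega_0\mathbf x+\sum_{k=1}^M\omega_k\nabla V_k(\mathbf x)=0
\notag \\
\frac{\partial L}{\partial\omega_0}
&=\frac12\big(\|\mathbf x\|^2-N\big)=0
\notag \\
\frac{\partial L}{\partial\omega_k}
&=V_k(\mathbf x)=0
\notag
\end{align}
The last two lines restrict $\mathbf x$ to $\Omega$, while the first requires
that the gradient of $H$ have no component tangent to $\Omega$, which is the
condition for a stationary point of $H$ restricted to the manifold.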
To evaluate the average of $\chi$ over the constraints, we first translate the $\delta$ functions and determinant to integral form, with
\begin{align}
\delta\big(\partial L(\mathbf x,\pmb\omega)\big)
=\int\frac{d\hat{\mathbf x}}{(2\pi)^N}\frac{d\hat{\pmb\omega}}{(2\pi)^{M+1}}
e^{i[\hat{\mathbf x},\hat{\pmb\omega}]\cdot\partial L(\mathbf x,\pmb\omega)}
\\
\det\partial\partial L(\mathbf x,\pmb\omega)
=\int d\bar{\pmb\eta}\,d\pmb\eta\,d\bar{\pmb\gamma}\,d\pmb\gamma\,
e^{-[\bar{\pmb\eta},\bar{\pmb\gamma}]^T\partial\partial L(\mathbf x,\pmb\omega)[\pmb\eta,\pmb\gamma]}
\end{align}
for real variables $\hat{\mathbf x}$ and $\hat{\pmb\omega}$, and Grassmann
variables $\bar{\pmb\eta}$, $\pmb\eta$, $\bar{\pmb\gamma}$, and $\pmb\gamma$.
With these transformations in place, there is a compact way to express $\chi$
using superspace notation. For a review of the superspace formalism for
evaluating integrals of the form \eqref{eq:kac-rice.lagrange}, see Appendices A
\& B of \cite{Kent-Dobias_2024_Conditioning}. Introducing the Grassmann indices
$\bar\theta_1$ and $\theta_1$, we define superfields
\begin{align}
\pmb\phi(1)
&=\mathbf x+\bar\theta_1\pmb\eta+\bar{\pmb\eta}\theta_1+\hat{\mathbf x}\bar\theta_1\theta_1
\label{eq:superfield.phi} \\
\pmb\sigma(1)
&=\pmb\omega+\bar\theta_1\pmb\gamma+\bar{\pmb\gamma}\theta_1+\hat{\pmb\omega}\bar\theta_1\theta_1
\label{eq:superfield.sigma}
\end{align}
with which we can represent $\chi$ by
\begin{equation}
\chi=\int d\pmb\phi\,d\pmb\sigma\,\exp\left\{
\int d1\,L\big(\pmb\phi(1),\pmb\sigma(1)\big)
\right\}
\end{equation}
We are now in a position to average over the distribution of constraints. Using
standard manipulations, we find the average Euler characteristic is
\begin{equation}
\begin{aligned}
\overline{\chi}&=\int d\pmb\phi\,d\sigma_0\,\exp\Bigg\{
-\frac M2\log\operatorname{sdet}f\left(\frac{\phi(1)^T\phi(2)}N\right) \\
&\qquad+\int d1\,\left[
H\big(\phi(1)\big)+\frac12\sigma_0(1)\big(\|\phi(1)\|^2-N\big)
\right]
\Bigg\}
\end{aligned}
\end{equation}
Now we are forced to make a decision about the function $H$. Because $\chi$ is
a topological invariant, any choice will work so long as it does not share some
symmetry with the underlying manifold, i.e., that $H$ satisfies the Smale condition. Because our manifold of random
constraints has no symmetries, we can take a simple height function $H(\mathbf
x)=\mathbf x_0\cdot\mathbf x$ for some $\mathbf x_0\in\mathbb R^N$ with
$\|\mathbf x_0\|^2=N$. $H$ is a height function because when $\mathbf x_0$ is
used as the polar axis, $H$ gives the height on the sphere.
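As a sanity check of this choice, consider the sphere without any constraints
($M=0$), a case not treated in the text. Stationarity of the Lagrangian
requires $\mathbf x_0+\omega_0\mathbf x=0$ with $\|\mathbf x\|^2=N$, which has
the two solutions $\mathbf x=\pm\mathbf x_0$ with $\omega_0=\mp1$. These are
the maximum and minimum of the height, with index $N-1$ and $0$, respectively,
and
\begin{equation}
\chi=(-1)^0+(-1)^{N-1}=1+(-1)^{N-1}
\end{equation}
which is 2 when the dimension $N-1$ of the sphere is even and 0 when it is odd,
as expected.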
With this choice made, we can integrate over the superfields $\pmb\phi$.
Defining two order parameters $\mathbb Q(1,2)=\frac1N\phi(1)\cdot\phi(2)$ and
$\mathbb M(1)=\frac1N\phi(1)\cdot\mathbf x_0$, the result is
\begin{align}
\overline{\chi}
&=\int d\mathbb Q\,d\mathbb M\,d\sigma_0 \notag\\
&\quad\times\exp\Bigg\{
\frac N2\log\operatorname{sdet}(\mathbb Q-\mathbb M\mathbb M^T)
-\frac M2\log\operatorname{sdet}f(\mathbb Q) \notag \\
&\qquad+N\int d1\,\left[
\mathbb M(1)+\frac12\sigma_0(1)\big(\mathbb Q(1,1)-1\big)
\right]
\Bigg\}
\end{align}
This expression is an integral of an exponential with a leading factor of $N$
over several order parameters, and is therefore in a convenient position for
evaluating at large $N$ with a saddle point. The order parameter $\mathbb Q$ is
made up of scalar products of the original integration variables in our
problem in \eqref{eq:superfield.phi}, while $\mathbb M$ contains their scalar
products with $\mathbf x_0$, and $\sigma_0$ contains $\omega_0$ and
$\hat\omega_0$. We can solve the saddle point equations in all of these
parameters save for $m=\frac1N\mathbf x_0\cdot\mathbf x$, the overlap with the
height axis. The result reduces the average Euler characteristic to
\begin{equation}
\overline\chi\propto\int dm\,e^{N\mathcal S_\mathrm a(m)}
\end{equation}
where the annealed action $\mathcal S_\mathrm a$ is given by
\begin{equation} \label{eq:ann.action}
\begin{aligned}
&\mathcal S_\mathrm a(m)
=\frac12\Bigg[
\log\left(
\frac{\frac{f'(1)}{f(1)}(1-m^2)-1}{\alpha-1}
\right) \\
&\hspace{4em} -\alpha\log\left(
\frac{\alpha}{\alpha-1}\left(
1-\frac1{\frac{f'(1)}{f(1)}(1-m^2)}
\right)
\right)
\Bigg]
\end{aligned}
\end{equation}
and must be evaluated at a maximum with respect to $m$. This function is
plotted for a specific covariance function $f$ in Fig.~\ref{fig:action}, where
several distinct regimes can be seen.
\begin{figure}
\includegraphics{figs/action.pdf}
\caption{
The annealed action $\mathcal S_\mathrm a$ of \eqref{eq:ann.action} plotted
as a function of $m$ at several values of $\alpha$. Here, the covariance
function is $f(q)=\frac12q^2$ and $\alpha_\text{\textsc{sat}}=2$. When
$\alpha<1$, the action is maximized for $m^2>0$ and its value is zero. When
$1\leq\alpha<\alpha_\text{\textsc{sat}}$, the action is maximized at
$m=0$ and is positive. When $\alpha>\alpha_\text{\textsc{sat}}$ there is no
maximum.
} \label{fig:action}
\end{figure}
First, when $\alpha<1$ the action $\mathcal S_\mathrm a$ is nonpositive
and has maxima at some $m^2>0$. At these maxima, $\mathcal S_\mathrm a(m)=0$.
When $\alpha>1$, the action flips over and becomes nonnegative. In the
regime $1<\alpha<\alpha_\text{\textsc{sat}}$, there is a single maximum at
$m=0$ where the action is positive. When $\alpha\geq\alpha_\text{\textsc{sat}}$
the maximum in the action vanishes.
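These regimes can be checked directly from \eqref{eq:ann.action}. Introducing
the shorthand $u=\alpha_\text{\textsc{sat}}(1-m^2)$, which is a notation
adopted only for this check, the action can be rearranged as
\begin{equation}
\mathcal S_\mathrm a
=\frac12\left[
(1-\alpha)\log\left(\frac{u-1}{\alpha-1}\right)
+\alpha\log\left(\frac u\alpha\right)
\right]
\end{equation}
with derivative
\begin{equation}
\frac{\partial\mathcal S_\mathrm a}{\partial u}
=\frac{u-\alpha}{2u(u-1)}
\end{equation}
The action vanishes at $u=\alpha$, or
$m^2=1-\alpha/\alpha_\text{\textsc{sat}}$. When $\alpha<1$ the action is
defined for $0<u<1$ and this point is a maximum, while when $\alpha>1$ it is
defined for $u>1$ and this point is a minimum. At $m=0$, where
$u=\alpha_\text{\textsc{sat}}$, one finds for $\alpha>1$
\begin{equation}
\frac{\partial^2\mathcal S_\mathrm a}{\partial m^2}\bigg|_{m=0}
=-\frac{\alpha_\text{\textsc{sat}}-\alpha}{\alpha_\text{\textsc{sat}}-1}
\end{equation}
which, assuming $\alpha_\text{\textsc{sat}}>1$, is negative for
$\alpha<\alpha_\text{\textsc{sat}}$ and positive for
$\alpha>\alpha_\text{\textsc{sat}}$.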
This results in distinctive regimes for $\overline\chi$, with an example plotted in Fig.~\ref{fig:characteristic}. If $m^*$ is the location of the maximum of $\mathcal S_\mathrm a$, then
\begin{equation}
\frac1N\log\overline\chi=\mathcal S_\mathrm a(m^*)
\end{equation}
When $\alpha<1$, the action evaluates to zero, and therefore $\overline\chi$ is
positive and subexponential in $N$. When $1<\alpha<\alpha_\text{\textsc{sat}}$, the action
is positive, and $\overline\chi$ is exponentially large in $N$. Finally, when
$\alpha\geq\alpha_\text{\textsc{sat}}$ the action and $\overline\chi$ are ill-defined.
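As a concrete instance, take the covariance $f(q)=\frac12q^2$ used in the
figures, for which $f'(1)/f(1)=2$, and the illustrative value
$\alpha=\frac32$. Evaluating \eqref{eq:ann.action} at $m=0$ gives
\begin{equation}
\mathcal S_\mathrm a(0)
=\frac12\left[\log2-\frac32\log\frac32\right]
\approx0.042
\end{equation}
so that $\overline\chi$ grows roughly as $e^{0.042N}$ in this case.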
\begin{figure}
\includegraphics{figs/characteristic.pdf}
\caption{
The logarithm of the average Euler characteristic $\overline\chi$ as a
function of $\alpha$. The covariance function is $f(q)=\frac12q^2$ and
$\alpha_\text{\textsc{sat}}=2$.
} \label{fig:characteristic}
\end{figure}
We can interpret this by reasoning about the topology of $\Omega$ consistent with
these results. Cartoons that depict this reasoning are shown in
Fig.~\ref{fig:cartoons}. In the regime $\alpha<1$, $\overline\chi$ is positive but not
very large. This is consistent with a solution manifold made up of few large
components, each with the topology of a hypersphere. The saddle point value
$m^*$ for the overlap with the height axis $\mathbf x_0$ corresponds to the
latitude at which most stationary points that contribute to the Euler
characteristic are found. This means we can interpret $1-m^*$ as the typical
squared distance between a randomly selected point on the sphere and the
solution manifold.
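The relation between overlap and distance behind this interpretation follows
from a short computation. For any point $\mathbf x$ on the sphere with overlap
$m=\frac1N\mathbf x_0\cdot\mathbf x$,
\begin{equation}
\frac1N\|\mathbf x-\mathbf x_0\|^2
=\frac1N\big(\|\mathbf x\|^2+\|\mathbf x_0\|^2-2\mathbf x_0\cdot\mathbf x\big)
=2(1-m)
\end{equation}
so the squared distance per dimension between $\mathbf x_0$ and the stationary
points at overlap $m^*$ is proportional to $1-m^*$, with a factor of 2 in this
normalization.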
\begin{figure}
\includegraphics[width=0.32\columnwidth]{figs/connected.pdf}
\hfill
\includegraphics[width=0.32\columnwidth]{figs/shattered.pdf}
\hfill
\includegraphics[width=0.32\columnwidth]{figs/gone.pdf}
\includegraphics{figs/bar.pdf}
\caption{
Cartoon of the topology of the solution manifold of the continuous constraint satisfaction problem (CCSP) implied by our
calculation. The arrow shows the vector $\mathbf x_0$ defining the height
function. The region of solutions is shaded orange, and the critical points
of the height function restricted to this region are marked with a point.
For $\alpha<1$, there are few simply connected regions with most of the
minima and maxima contributing to the Euler characteristic concentrated at
the height $m^*$. For $\alpha\geq1$, there are many simply
connected regions and most of their minima and maxima are concentrated at
the equator.
} \label{fig:cartoons}
\end{figure}
When $1<\alpha<\alpha_\text{\textsc{sat}}$, $\overline\chi$ is positive and
very large. This is consistent with a solution manifold made up of
exponentially many disconnected components, each with the topology of a
hypersphere. If this interpretation is correct, our calculation effectively
counts these components. This is a realization of a shattering transition in
the solution manifold. Here $m^*$ is zero because for any choice of height
axis, the vast majority of stationary points that contribute to the Euler
characteristic are found near the equator. Finally, for
$\alpha\geq\alpha_\text{\textsc{sat}}$, there are no longer solutions that
satisfy the constraints. The Euler characteristic is not defined for an empty
set, and in this regime the calculation yields no solution.
In the regime where $\log\overline\chi$ is positive, it is possible that our
calculation yields a value which is not characteristic of typical sets of
constraints. This motivates computing $\overline{\log\chi}$, the average of
the logarithm, which should be characteristic of typical
samples. This is the so-called quenched calculation.
\begin{equation}
D=\beta R
\qquad
\beta=-\frac{m+\sum_aR_{1a}}{\sum_aC_{1a}}
\qquad
\hat m=0
\end{equation}
\begin{acknowledgements}
JK-D is supported by a \textsc{DynSysMath} Specific Initiative of the INFN.
The author thanks Pierfrancesco Urbani for helpful conversations on these topics.
\end{acknowledgements}
\bibliography{topology}
\appendix
\section{Euler characteristic of the spherical spin glasses}
We can compare this calculation with what we expect to find for the manifold
defined by $V(\mathbf x)=NE$ for a single function $V$, with a rescaled covariance $\overline{V(\mathbf x)V(\mathbf x')}=Nf(\mathbf x\cdot\mathbf x'/N)$. This corresponds to the
energy level set of a spherical spin glass. Now the Lagrangian is
\begin{equation}
L(\mathbf x,\omega_0,\omega_1)=
H(\mathbf x)+\frac12\omega_0\big(\|\mathbf x\|^2-N\big)
+\omega_1\big(V(\mathbf x)-NE\big)
\end{equation}
The derivation proceeds almost identically to before, but we do not integrate
out $\sigma_1$. We have
\begin{align}
\overline{\chi}&=\int d\mathbb Q\,d\mathbb M\,d\sigma_0\,d\sigma_1\,\exp\Bigg\{
\frac N2\log\operatorname{sdet}(\mathbb Q-\mathbb M\mathbb M^T)
\notag \\
&\quad+N\int d1\,\bigg[
\mathbb M(1)+\frac12\sigma_0(1)\big(\mathbb Q(1,1)-1\big) \notag \\
&-E\sigma_1(1)
+\frac 12\int d2\,\sigma_1(1)\sigma_1(2)f\big(\mathbb Q(1,2)\big)
\bigg]
\Bigg\}
\end{align}
The saddle point condition for $\sigma_1$ gives
\begin{equation}
\sigma_1(1)=E\int d2\,f(\mathbb Q)^{-1}(1,2)
\end{equation}
which then yields
\begin{align}
\overline{\chi}&=\int d\mathbb Q\,d\mathbb M\,d\sigma_0\,\exp\Bigg\{
\frac N2\log\operatorname{sdet}(\mathbb Q-\mathbb M\mathbb M^T)
\notag \\
&\quad+N\int d1\,\bigg[
\mathbb M(1)+\frac12\sigma_0(1)\big(\mathbb Q(1,1)-1\big) \notag \\
&-\frac12E^2\int d2\,f(\mathbb Q)^{-1}(1,2)
\bigg]
\Bigg\}
\end{align}
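The last step can be checked directly. Inserting the saddle point value of
$\sigma_1$ into the two terms that contain it, and treating the inverse of
$f(\mathbb Q)$ formally as a symmetric operator inverse in superspace, gives
\begin{equation}
\begin{aligned}
&-E\int d1\,\sigma_1(1)
+\frac12\int d1\,d2\,\sigma_1(1)\sigma_1(2)f\big(\mathbb Q(1,2)\big) \\
&\quad=-E^2\int d1\,d2\,f(\mathbb Q)^{-1}(1,2)
+\frac12E^2\int d1\,d2\,f(\mathbb Q)^{-1}(1,2) \\
&\quad=-\frac12E^2\int d1\,d2\,f(\mathbb Q)^{-1}(1,2)
\end{aligned}
\end{equation}
where the second term on the middle line uses the fact that $f(\mathbb Q)$
composed with its inverse is the identity.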
\end{document}