bezout.tex


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332

\documentclass[aps,prl,reprint,longbibliography,floatfix,fleqn]{revtex4-2}

\usepackage[utf8]{inputenc} % why not type "Bézout" with unicode?
\usepackage[T1]{fontenc} % vector fonts plz
\usepackage[
  colorlinks=true,
  urlcolor=purple,
  citecolor=purple,
  filecolor=purple,
  linkcolor=purple
]{hyperref} % ref and cite links with pretty colors
\usepackage{amsmath, amssymb, graphicx, xcolor} % standard packages

\begin{document}

\title{Complex complex landscapes: saturating the Bézout bound} % change me

\author{Jaron Kent-Dobias}
\author{Jorge Kurchan}

\affiliation{Laboratoire de Physique de l'Ecole Normale Supérieure, Paris, France}

\date\today

\begin{abstract}
  We study the saddle-points of the $p$-spin mode -- the best understood example of `complex (rugged) landscape' -- in the space in which all its $N$ variables are allowed to be complex. The problem becomes
  a system of $N$ random equations of degree $p-1$. 
  We solve for quantities averaged over randomness in the $N \rightarrow \infty$ limit. 
  We show that the number of solutions saturates the Bézout bound $\ln {\cal{N}}\sim N \ln (p-1) $\cite{Bezout_1779_Theorie}.
The Hessian of each saddle is given by a random matrix of the form $M=a A^2+b B^2 +ic [A,B]_-+ d [A,B]_+$,
where $A$ and $B$ are GOE matrices and $a-d$ real. Its spectrum  has a transition 
from one-cut to two-cut that generalizes the notion  of `threshold level' that is well-known in the real problem.
In the case that the disorder is itself real, only the square-root of the total  number solutions are real.
In terms of real and imaginary parts of the energy, the solutions are divided in  sectors where the saddles have
different topological properties.
\end{abstract}

\maketitle

Spin-glasses have long been considered the paradigm of `complex landscapes' of many variables, a subject that
includes Neural Networks and optimization problems, most notably  Constraint Satisfaction ones.
The most tractable family of these  are the mean-field spherical p-spin models defined by the energy:
\begin{equation} \label{eq:bare.hamiltonian}
  H_0 = \sum_p \frac{c_p}{p!}\sum_{i_1\cdots i_p}^NJ_{i_1\cdots i_p}z_{i_1}\cdots z_{i_p},
\end{equation}
where the $J_{i_1\cdots i_p}$ are real Gaussian variables and the $z_i$ are real and constrained
to a sphere $\sum_i z_i^2=N$. If there is a single  term of a given $p$, this is known as the `pure $p$-spin' model, the case we shall study here.

This problem has been attacked from several angles: the replica trick to
compute the Boltzmann--Gibbs distribution, a Kac--Rice \cite{Kac_1943_On,
Rice_1939_The, Fyodorov_2004_Complexity} procedure (similar to the Fadeev--Popov
integral)  to compute the number of saddle-points of the energy function, and
the gradient-descent -- or more generally Langevin -- dynamics staring from a
high-energy configuration.  Thanks to the relative simplicity of the energy,
all these approaches are possible analytically in the large $N$ limit.

In this paper we shall extend the study to the case  where $z\in\mathbb C^N$ are  and $J$ is a symmetric tensor
whose elements are complex normal with $\overline{|J|^2}=p!/2N^{p-1}$ and
$\overline{J^2}=\kappa\overline{|J|^2}$ for complex parameter $|\kappa|<1$. The constraint becomes $z^2=N$.

The motivations for this paper are of two types. On the practical side, there are situations in which  complex variables
have in a disorder problem appear naturally: such is the case in which they are {\em phases}, as in random laser problems \cite{Antenucci_2015_Complex, etc}. Another problem where a Hamiltonian very close to ours has been proposed is the Quiver Hamiltonians \cite{Anninos_2016_Disordered} modeling Black Hole horizons in the zero-temperature limit.  

There is however a more fundamental reason for this study: we know from experience that extending a problem to the complex plane often uncovers an underlying simplicity that is hidden in the purely real case. Consider, for example, the procedure of starting from a simple, known Hamiltonian $H_{00}$ and studying $\lambda H_{00} + (1-\lambda H_{0} )$, evolving adiabatically from $\lambda=0$ to $\lambda=1$, as is familiar from quantum annealing. The $H_{00}$ is a polynomial of degree $p$  chosen to have simple, known roots. Because we are working in
complex variables, and the roots are simple all the way (we shall confirm this), we may follow a root from $\lambda=0$ to $\lambda=1$. With real 
variables minima of functions appear and disappear, and this procedure is not possible. The same idea may be implemented by 
performing diffusion in the $J$'s, and following the roots, in complete analogy with Dyson's stochastic dynamics.


Let us go back to our model.
For the  constraint we  choose here $z^2=N$, rather than $|z|^2=N$, in order to preserve the holomorphic nature
of the functions. In addition, the  
nonholomorphic spherical constraint has a disturbing lack of critical points
nearly everywhere, since $0=\partial^* H=-p\epsilon z$ is only
satisfied for $\epsilon=0$, as $z=0$ is forbidden by the constraint.
It is enforced using the method of Lagrange
multipliers: introducing the $\epsilon\in\mathbb C$, this gives
\begin{equation} \label{eq:constrained.hamiltonian}
  H = H_0+\frac p2\epsilon\left(N-\sum_i^Nz_i^2\right).
\end{equation}
It is easy to see that {\em for a pure $p$-spin}, at  any critical point $\epsilon=H/N$, the average energy.


Since $H$ is holomorphic, a point is a critical point of its real part if and
only if it is also a critical point of its imaginary part. The number of
critical points of $H$ is therefore the number of critical points of
$\mathop{\mathrm{Re}}H$. Writing $z=x+iy$, $\mathop{\mathrm{Re}}H$ can be
interpreted as a real function of $2N$ real variables. The number of critical
points it has is given by the usual Kac--Rice formula:
\begin{equation} \label{eq:real.kac-rice}
  \begin{aligned}
    \mathcal N_J(\kappa,\epsilon)
      &= \int dx\,dy\,\delta(\partial_x\mathop{\mathrm{Re}}H)\delta(\partial_y\mathop{\mathrm{Re}}H) \\
      &\qquad\times\left|\det\begin{bmatrix}
          \partial_x\partial_x\mathop{\mathrm{Re}}H & \partial_x\partial_y\mathop{\mathrm{Re}}H \\
          \partial_y\partial_x\mathop{\mathrm{Re}}H & \partial_y\partial_y\mathop{\mathrm{Re}}H
        \end{bmatrix}\right|.
  \end{aligned}
\end{equation}
The Cauchy--Riemann equations may be used to write \eqref{eq:real.kac-rice} in
a manifestly complex way.  Using the Wirtinger derivative
$\partial=\partial_x-i\partial_y$, one can write
$\partial_x\mathop{\mathrm{Re}}H=\mathop{\mathrm{Re}}\partial H$ and
$\partial_y\mathop{\mathrm{Re}}H=-\mathop{\mathrm{Im}}\partial H$. Carrying
these transformations through, we have
\begin{equation} \label{eq:complex.kac-rice}
  \begin{aligned}
    \mathcal N_J&(\kappa,\epsilon)
      = \int dx\,dy\,\delta(\mathop{\mathrm{Re}}\partial H)\delta(\mathop{\mathrm{Im}}\partial H) \\
        &\qquad\qquad\qquad\times\left|\det\begin{bmatrix}
            \mathop{\mathrm{Re}}\partial\partial H & -\mathop{\mathrm{Im}}\partial\partial H \\
            -\mathop{\mathrm{Im}}\partial\partial H & -\mathop{\mathrm{Re}}\partial\partial H
          \end{bmatrix}\right| \\
      &= \int dx\,dy\,\delta(\mathop{\mathrm{Re}}\partial H)\delta(\mathop{\mathrm{Im}}\partial H)
        \left|\det[(\partial\partial H)^\dagger\partial\partial H]\right| \\
      &= \int dx\,dy\,\delta(\mathop{\mathrm{Re}}\partial H)\delta(\mathop{\mathrm{Im}}\partial H)
        |\det\partial\partial H|^2.
  \end{aligned}
\end{equation}
This gives three equivalent expressions for the determinant of the Hessian: as
that of a $2N\times 2N$ real matrix, that of an $N\times N$ Hermitian matrix,
or the norm squared of that of an $N\times N$ complex symmetric matrix.

These equivalences belie a deeper connection between the spectra of the
corresponding matrices: each eigenvalue of the real matrix has a negative
partner. For each pair $\pm\lambda$ of the real matrix, $\lambda^2$ is an
eigenvalue of the Hermitian matrix and $|\lambda|$ is a \emph{singular
value} of the complex symmetric matrix. The distribution of positive
eigenvalues of the Hessian is therefore the same as the distribution of
singular values of $\partial\partial H$, while both are the same as the
distribution of square-rooted eigenvalues of $(\partial\partial
H)^\dagger\partial\partial H$.

The expression \eqref{eq:complex.kac-rice} is to be averaged over the $J$'s as
$N \Sigma= \overline{\ln \mathcal N_J} = \int dJ \; \ln N_J$, a calculation
that involves the replica trick. In most the parameter-space that we shall
study here, the {\em annealed approximation} $N \Sigma \sim \ln \overline{
\mathcal N_J} = \ln \int dJ \; N_J$ is exact. 

A useful property of the Gaussian distributions is that gradient and Hessian
may be seen to be independent \cite{Bray_2007_Statistics,
Fyodorov_2004_Complexity}, so that we may treat the $\delta$-functions and the
Hessians as independent. We compute each by taking the saddle point. The
$\delta$-functions are converted to exponentials by the introduction of
auxiliary fields $\hat z=\hat x+i\hat y$. The average over $J$ can then be
performed. A generalized Hubbard--Stratonovich then allows a change of
variables from the $4N$ original and auxiliary fields to eight bilinears
defined by
\begin{equation}
  \begin{aligned}
    Na=z^*\cdot z
    &&
    N\hat c=\hat z\cdot\hat z
    &&
    Nb=\hat z^*\cdot z \\
    N\hat a=\hat z^*\cdot\hat z
    &&
    Nd=\hat z\cdot z
  \end{aligned}
\end{equation}
and their conjugates. The result is, to leading order in $N$,
\begin{equation} \label{eq:saddle}
    \overline{\mathcal N_J}(\kappa,\epsilon)
        = \int da\,d\hat a\,db\,db^*d\hat c\,d\hat c^*dd\,dd^*e^{Nf(a,\hat a,b,\hat c,d)},
\end{equation}
where
\begin{widetext}
  \textcolor{red}{\textbf{[appendix?? I'm putting too much right now so as to trim later...]}}
  \begin{equation}
    f=2+\frac12\log\det\frac12\begin{bmatrix}
      1 & a & d & b \\
      a & 1 & b^* & d^* \\
      d & b^* & \hat c & \hat a \\
      b & d^* & \hat a & \hat c^*
    \end{bmatrix}
    +\mathop{\mathrm{Re}}\left\{\frac18\left[\hat aa^{p-1}+(p-1)|d|^2a^{p-2}+\kappa(\hat c^*+(p-1)b^2)\right]-\epsilon b\right\}
    +\int d\lambda\,\rho(\lambda)\log|\lambda|^2
  \end{equation}
  where $\rho(\lambda)$, the distribution of eigenvalues $\lambda$ of
  $\partial\partial H$, is dependant on $a$ alone. This function has a maximum in
  $\hat a$, $b$, $\hat c$, and $d$ at which its value is (for simplicity, with
  $\kappa\in\mathbb R$)
  \begin{equation}
    \begin{aligned}
      f(a)&=1+\frac12\log\left(\frac4{p^2}\frac{a^2-1}{a^{2(p-1)}-\kappa^2}\right)+\int d\lambda\,\rho(\lambda)\log|\lambda|^2 \\
          &\hspace{80pt}-\frac{a^p(1+p(a^2-1))-a^2\kappa}{a^{2p}+a^p(a^2-1)(p-1)-a^2\kappa^2}(\mathop{\mathrm{Re}}\epsilon)^2
          -\frac{a^p(1+p(a^2-1))+a^2\kappa}{a^{2p}-a^p(a^2-1)(p-1)-a^2\kappa^2}(\mathop{\mathrm{Im}}\epsilon)^2,
    \end{aligned}
  \end{equation}
\end{widetext}
This leaves a single parameter, $a$, which dictates the magnitude of $z^*\cdot z$, or alternatively the magnitude $y^2$ of the imaginary part. The latter vanishes as $a\to1$, where we should recover known results for the real $p$-spin.


The Hessian of \eqref{eq:constrained.hamiltonian} is $\partial\partial
H=\partial\partial H_0-p\epsilon I$, or the Hessian of
\eqref{eq:bare.hamiltonian} with a constant added to its diagonal. The
eigenvalue distribution $\rho$ of the constrained Hessian is therefore related
to the eigenvalue distribution $\rho_0$ of the unconstrained one by a similar
shift, or $\rho(\lambda)=\rho_0(\lambda+p\epsilon)$. The Hessian of
\eqref{eq:bare.hamiltonian} is
\begin{equation} \label{eq:bare.hessian}
  \partial_i\partial_jH_0
  =\frac{p(p-1)}{p!}\sum_{k_1\cdots k_{p-2}}^NJ_{ijk_1\cdots k_{p-2}}z_{k_1}\cdots z_{k_{p-2}},
\end{equation}

{\color{red} \bf here I would explain the question of the det and also of the appearance of the gap, would draw a picture of ellipse etc, and would send the reader to an appendix for most of this part of the calculation}

which makes its ensemble that of Gaussian complex symmetric matrices. Given its variances
$\overline{|\partial_i\partial_j H_0|^2}=p(p-1)a^{p-2}/2N$ and
$\overline{(\partial_i\partial_j H_0)^2}=p(p-1)\kappa/2N$, $\rho_0(\lambda)$ is constant inside the ellipse
\begin{equation} \label{eq:ellipse}
  \left(\frac{\mathop{\mathrm{Re}}(\lambda e^{i\theta})}{a^{p-2}+|\kappa|}\right)^2+
  \left(\frac{\mathop{\mathrm{Im}}(\lambda e^{i\theta})}{a^{p-2}-|\kappa|}\right)^2
  <\frac{p(p-1)}{2a^{p-2}}
\end{equation}
where $\theta=\frac12\arg\kappa$ \cite{Nguyen_2014_The}. The eigenvalue
spectrum of $\partial\partial H$ therefore is that of an ellipse whose center
is shifted by $p\epsilon$.

\begin{figure}[htpb]
  \centering

  \includegraphics{fig/spectra_0.0.pdf}
  \includegraphics{fig/spectra_0.5.pdf}\\
  \includegraphics{fig/spectra_1.0.pdf}
  \includegraphics{fig/spectra_1.5.pdf}

  \caption{
    Eigenvalue and singular value spectra of the matrix $\partial\partial H$
    for $p=3$, $a=\frac54$, and $\kappa=\frac34e^{i3\pi/4}$ with (a)
    $\epsilon=0$, (b) $\epsilon=\frac12|\epsilon_{\mathrm{th}}|$, (c)
    $\epsilon=|\epsilon_{\mathrm{th}}|$, and (d)
    $\epsilon=\frac32|\epsilon_{\mathrm{th}}|$. The shaded region of each inset
    shows the support of the eigenvalue distribution. The solid line on each
    plot shows the distribution of singular values, while the overlaid
    histogram shows the empirical distribution from $2^{10}\times2^{10}$ complex
    normal matrices with the same covariance and diagonal shift as
    $\partial\partial H$.
  } \label{fig:spectra}
\end{figure}

The eigenvalue spectrum of the Hessian of the real part, or equivalently the
eigenvalue spectrum of $(\partial\partial H)^\dagger\partial\partial H$, is the
singular value spectrum of $\partial\partial H$. When $\kappa=0$ and the
elements of $J$ are standard complex normal, this corresponds to a complex
Wishart distribution. For $\kappa\neq0$ the problem changes, and to our
knowledge a closed form is not known.  We have worked out an implicit form for
this spectrum using the saddle point of a replica symmetric calculation for the
Green function. The result is
\begin{widetext}
  \begin{equation}
    G(\sigma)=\lim_{n\to0}\int d\alpha\,d\chi\,d\chi^*\frac\alpha2
    \exp nN\left\{
      1+\frac{p(p-1)}{16}a^{p-2}\alpha^2-\frac{\alpha\sigma}2+\frac12\log(\alpha^2-|\chi|^2)
      +\frac p4\mathop{\mathrm{Re}}\left(\frac{(p-1)}8\kappa^*\chi^2-\epsilon^*\chi\right)
      \right\}
  \end{equation}
\end{widetext}
The argument of the exponential has several saddles, but the one with the smallest value of $\mathop{\mathrm{Re}}\alpha$ gives the correct solution \textcolor{red}{\textbf{why????? we never figured this out...}}.

The transition from a one-cut to two-cut singular value spectrum naturally
corresponds to the origin leaving the support of the eigenvalue spectrum.
Weyl's theorem requires that the product over the norm of all eigenvalues must
not be greater than the product over all singular values \cite{Weyl_1912_Das}.
Therefore, the absence of zero eigenvalues implies the absence of zero singular
values. The determination of the threshold energy is therefore reduced to a
geometry problem, and yields
\begin{equation} \label{eq:threshold.energy}
  |\epsilon_{\mathrm{th}}|^2
  =\frac{p-1}{2p}\frac{(1-|\delta|^2)^2a^{p-2}}
  {1+|\delta|^2-2|\delta|\cos(\arg\kappa+2\arg\epsilon)}
\end{equation}
for $\delta=\kappa a^{-(p-2)}$.

% This is kind of a boring definition...
\begin{equation} \label{eq:count.def.marginal}
  \overline{\mathcal N}(\kappa,\epsilon)
  =\int da\,\overline{\mathcal N}(\kappa,\epsilon,a)
\end{equation}

\begin{equation} \label{eq:count.zero.energy}
  \overline{\mathcal N}(\kappa,0,a)
  =\left[(p-1)a^{p-1}\sqrt{\frac{1-a^{-2}}{a^{2(p-1)}-|\kappa|^2}}\right]^N
\end{equation}

\begin{equation}
  \overline{\mathcal N}(\kappa,\epsilon)
  =\lim_{a\to\infty}\overline{\mathcal N}(\kappa,\epsilon,a)
  =(p-1)^N
\end{equation}

For $|\kappa|<1$,
\begin{equation}
  \lim_{a\to1}\overline{\mathcal N}(\kappa,\epsilon,a)
  =0
\end{equation}

\begin{equation}
  \lim_{a\to1}\overline{\mathcal N}(1,0,a)
  =(p-1)^{N/2}
\end{equation}


{\color{teal} {\bf somewhere else}

Another instrument we have to study this problem is to compute the following partition function:
\begin{equation}
  \begin{aligned}
    Z(a,\beta)&=\int dx\, dy \, e^{-\mathop{\mathrm{Re}}(\beta H_0)}\\
     &\qquad\delta(\sum_i z_i^2-N) \delta\left(\sum_i y_i^2 -N \frac{a-1}{2}\right).
  \end{aligned}
\end{equation}
The energy $\Re H_0, \Im H_0$ are in a one-to one relation with the temperatures $\beta_R,\beta_I$. The entropy $S(a,H_0) = \ln Z+ +\beta_{R} \langle \Re H_0 \rangle +\beta_I \langle \Im H_0\rangle$
is the logarithm of the number of configurations of a given $(a,H_0)$.
This problem may be solved exactly with replicas, {\em but it may also be simulated} \cite{Bray_1980_Metastable}.
Consider for example the ground-state energy for given $a$, that is, the energy in the limit $\beta_R \rightarrow \infty$ taken adjusting  $\beta_I$ so that $\Im H_0=0$ . For $a=1$ this coincides with the ground-state of the real problem.

\begin{figure}[htpb]
  \centering
  \includegraphics{fig/desert.pdf}
  \caption{
    The minimum value of $a$ for which the complexity is positive as a function
    of (real) energy $\epsilon$  for the pure 4-spin model at several values of
    $\kappa$.
  }
\end{figure}

}
\bibliographystyle{apsrev4-2}
\bibliography{bezout}

\end{document}