\documentclass[aps,prl,reprint,longbibliography,floatfix]{revtex4-2} \usepackage[utf8]{inputenc} % why not type "Bézout" with unicode? \usepackage[T1]{fontenc} % vector fonts plz \usepackage[ colorlinks=true, urlcolor=purple, citecolor=purple, filecolor=purple, linkcolor=purple ]{hyperref} % ref and cite links with pretty colors \usepackage{amsmath, amssymb, graphicx, xcolor} % standard packages \begin{document} \title{Complex complex landscapes} \author{Jaron Kent-Dobias} \author{Jorge Kurchan} \affiliation{Laboratoire de Physique de l'Ecole Normale Supérieure, Paris, France} \date\today \begin{abstract} We study the saddle-points of the $p$-spin model -- the best understood example of a `complex' (rugged) landscape -- when its $N$ variables are complex. These points are the solutions to a system of $N$ random equations of degree $p-1$. We solve for $\overline{\mathcal{N}}$, the number of solutions averaged over randomness in the $N\to\infty$ limit. We find that it saturates the Bézout bound $\log\overline{\mathcal{N}}\sim N \log(p-1)$. The Hessian of each saddle is given by a random matrix of the form $M=a A^2+b B^2 +ic [A,B]_-+ d [A,B]_+$, where $A$ and $B$ are GOE matrices and $a-d$ real. Its spectrum has a transition from one-cut to two-cut that generalizes the notion of `threshold level' that is well-known in the real problem. The results from the real problem are recovered in the limit of real disorder. In this case, only the square-root of the total number solutions are real. In terms of real and imaginary parts of the energy, the solutions are divided in sectors where the saddles have different topological properties. \end{abstract} \maketitle Spin-glasses have long been considered the paradigm of `complex landscapes' of many variables, a subject that includes Neural Networks and optimization problems, most notably Constraint Satisfaction ones. The most tractable family of these are the mean-field spherical p-spin models \cite{Crisanti_1992_The} (for a review see \cite{Castellani_2005_Spin-glass}) defined by the energy: \begin{equation} \label{eq:bare.hamiltonian} H_0 = \sum_p \frac{c_p}{p!}\sum_{i_1\cdots i_p}^NJ_{i_1\cdots i_p}z_{i_1}\cdots z_{i_p}, \end{equation} where the $J_{i_1\cdots i_p}$ are real Gaussian variables and the $z_i$ are real and constrained to a sphere $\sum_i z_i^2=N$. If there is a single term of a given $p$, this is known as the `pure $p$-spin' model, the case we shall study here. This problem has been studied also in the algebra \cite{Cartwright_2013_The} and probability literature \cite{Auffinger_2012_Random, Auffinger_2013_Complexity}. It has been attacked from several angles: the replica trick to compute the Boltzmann--Gibbs distribution \cite{Crisanti_1992_The}, a Kac--Rice \cite{Kac_1943_On, Rice_1939_The, Fyodorov_2004_Complexity} procedure (similar to the Fadeev--Popov integral) to compute the number of saddle-points of the energy function \cite{Crisanti_1995_Thouless-Anderson-Palmer}, and the gradient-descent -- or more generally Langevin -- dynamics staring from a high-energy configuration \cite{Cugliandolo_1993_Analytical}. Thanks to the relative simplicity of the energy, all these approaches are possible analytically in the large $N$ limit. In this paper we shall extend the study to the case where the variables are complex $z\in\mathbb C^N$ and $J$ is a symmetric tensor whose elements are complex normal with $\overline{|J|^2}=p!/2N^{p-1}$ and $\overline{J^2}=\kappa\overline{|J|^2}$ for complex parameter $|\kappa|<1$. The constraint becomes $z^2=N$. The motivations for this paper are of two types. On the practical side, there are indeed situations in which complex variables in a disorder problem appear naturally: such is the case in which they are {\em phases}, as in random laser problems \cite{Antenucci_2015_Complex}. Another problem where a Hamiltonian very close to ours has been proposed is the Quiver Hamiltonians \cite{Anninos_2016_Disordered} modeling Black Hole horizons in the zero-temperature limit. There is however a more fundamental reason for this study: we know from experience that extending a problem to the complex plane often uncovers an underlying simplicity that is hidden in the purely real case. Consider, for example, the procedure of starting from a simple, known Hamiltonian $H_{00}$ and studying $\lambda H_{00} + (1-\lambda H_{0} )$, evolving adiabatically from $\lambda=0$ to $\lambda=1$, as is familiar from quantum annealing. The $H_{00}$ is a polynomial of degree $p$ chosen to have simple, known saddles. Because we are working in complex variables, and the saddles are simple all the way (we shall confirm this), we may follow a single one from $\lambda=0$ to $\lambda=1$, while with real variables minima of functions appear and disappear, and this procedure is not possible. The same idea may be implemented by performing diffusion in the $J$'s, and following the roots, in complete analogy with Dyson's stochastic dynamics. For our model the constraint we choose $z^2=N$, rather than $|z|^2=N$, in order to preserve the holomorphic nature of the functions. In addition, the nonholomorphic spherical constraint has a disturbing lack of critical points nearly everywhere, since $0=\partial^* H=-p\epsilon z$ is only satisfied for $\epsilon=0$, as $z=0$ is forbidden by the constraint. It is enforced using the method of Lagrange multipliers: introducing the $\epsilon\in\mathbb C$, this gives \begin{equation} \label{eq:constrained.hamiltonian} H = H_0+\frac p2\epsilon\left(N-\sum_i^Nz_i^2\right). \end{equation} It is easy to see that {\em for a pure $p$-spin}, at any critical point $\epsilon=H/N$, the average energy. Critical points are given by the set of equations: \begin{equation} \frac{c_p}{(p-1)!}\sum_{ i, i_2\cdots i_p}^NJ_{i, i_2\cdots i_p}z_{i_2}\cdots z_{i_p} = \epsilon z_i \end{equation} which for given $\epsilon$ are a set of $N$ equations of degree $p-1$, to which one must add the constraint condition. In this sense this study also provides a complement to the work on the distribution of zeroes of random polynomials \cite{Bogomolny_1992_Distribution}, which are for $N=1$ and $p \rightarrow \infty$. Since $H$ is holomorphic, a critical point of $\Re H_0$ is also a critical point of $\Im H_0$. The number of critical points of $H$ is therefore the number of critical points of $\mathop{\mathrm{Re}}H$. From each critical point emerges a gradient line of $\Re H_0$, which is also one of constant phase $\Im H_0=const$. Writing $z=x+iy$, $\mathop{\mathrm{Re}}H$ can be interpreted as a real function of $2N$ real variables. The number of critical points it has is given by the usual Kac--Rice formula: \begin{equation} \label{eq:real.kac-rice} \begin{aligned} \mathcal N_J(\kappa,\epsilon) &= \int dx\,dy\,\delta(\partial_x\mathop{\mathrm{Re}}H)\delta(\partial_y\mathop{\mathrm{Re}}H) \\ &\qquad\times\left|\det\begin{bmatrix} \partial_x\partial_x\mathop{\mathrm{Re}}H & \partial_x\partial_y\mathop{\mathrm{Re}}H \\ \partial_y\partial_x\mathop{\mathrm{Re}}H & \partial_y\partial_y\mathop{\mathrm{Re}}H \end{bmatrix}\right|. \end{aligned} \end{equation} The Cauchy--Riemann equations may be used to write \eqref{eq:real.kac-rice} in a manifestly complex way. Using the Wirtinger derivative $\partial=\partial_x-i\partial_y$, one can write $\partial_x\mathop{\mathrm{Re}}H=\mathop{\mathrm{Re}}\partial H$ and $\partial_y\mathop{\mathrm{Re}}H=-\mathop{\mathrm{Im}}\partial H$. Carrying these transformations through, we have \begin{equation} \label{eq:complex.kac-rice} \begin{aligned} \mathcal N_J&(\kappa,\epsilon) = \int dx\,dy\,\delta(\mathop{\mathrm{Re}}\partial H)\delta(\mathop{\mathrm{Im}}\partial H) \\ &\qquad\qquad\qquad\times\left|\det\begin{bmatrix} \mathop{\mathrm{Re}}\partial\partial H & -\mathop{\mathrm{Im}}\partial\partial H \\ -\mathop{\mathrm{Im}}\partial\partial H & -\mathop{\mathrm{Re}}\partial\partial H \end{bmatrix}\right| \\ &= \int dx\,dy\,\delta(\mathop{\mathrm{Re}}\partial H)\delta(\mathop{\mathrm{Im}}\partial H) \left|\det[(\partial\partial H)^\dagger\partial\partial H]\right| \\ &= \int dx\,dy\,\delta(\mathop{\mathrm{Re}}\partial H)\delta(\mathop{\mathrm{Im}}\partial H) |\det\partial\partial H|^2. \end{aligned} \end{equation} This gives three equivalent expressions for the determinant of the Hessian: as that of a $2N\times 2N$ real matrix, that of an $N\times N$ Hermitian matrix, or the norm squared of that of an $N\times N$ complex symmetric matrix. These equivalences belie a deeper connection between the spectra of the corresponding matrices: each eigenvalue of the real matrix has a negative partner. For each pair $\pm\lambda$ of the real matrix, $\lambda^2$ is an eigenvalue of the Hermitian matrix and $|\lambda|$ is a \emph{singular value} of the complex symmetric matrix. The distribution of positive eigenvalues of the Hessian is therefore the same as the distribution of singular values of $\partial\partial H$, the distribution of square-rooted eigenvalues of $(\partial\partial H)^\dagger\partial\partial H$. The expression \eqref{eq:complex.kac-rice} is to be averaged over the $J$'s as $N \Sigma= \overline{\ln \mathcal N_J} = \int dJ \; \ln N_J$, a calculation that involves the replica trick. In most the parameter-space that we shall study here, the {\em annealed approximation} $N \Sigma \sim \ln \overline{ \mathcal N_J} = \ln \int dJ \; N_J$ is exact. A useful property of the Gaussian distributions is that gradient and Hessian for given $\epsilon$ may be seen to be independent \cite{Bray_2007_Statistics, Fyodorov_2004_Complexity}, so that we may treat the $\delta$-functions and the Hessians as independent. We compute each by taking the saddle point. The $\delta$-functions are converted to exponentials by the introduction of auxiliary fields $\hat z=\hat x+i\hat y$. The average over $J$ can then be performed. A generalized Hubbard--Stratonovich then allows a change of variables from the $4N$ original and auxiliary fields to eight bilinears defined by \begin{equation} \begin{aligned} Na=z^*\cdot z && N\hat c=\hat z\cdot\hat z && Nb=\hat z^*\cdot z \\ N\hat a=\hat z^*\cdot\hat z && Nd=\hat z\cdot z \end{aligned} \end{equation} and their conjugates. The result is, to leading order in $N$, \begin{equation} \label{eq:saddle} \overline{\mathcal N_J}(\kappa,\epsilon) = \int da\,d\hat a\,db\,db^*d\hat c\,d\hat c^*dd\,dd^*e^{Nf(a,\hat a,b,\hat c,d)}, \end{equation} where \begin{widetext} \begin{equation} f=2+\frac12\log\det\frac12\begin{bmatrix} 1 & a & d & b \\ a & 1 & b^* & d^* \\ d & b^* & \hat c & \hat a \\ b & d^* & \hat a & \hat c^* \end{bmatrix} +\mathop{\mathrm{Re}}\left\{\frac18\left[\hat aa^{p-1}+(p-1)|d|^2a^{p-2}+\kappa(\hat c^*+(p-1)b^2)\right]-\epsilon b\right\} +\int d\lambda\,\rho(\lambda)\log|\lambda|^2 \end{equation} where $\rho(\lambda)$, the distribution of eigenvalues $\lambda$ of $\partial\partial H$, is dependant on $a$ alone. This function has a maximum in $\hat a$, $b$, $\hat c$, and $d$ at which its value is (for simplicity, with $\kappa\in\mathbb R$) \begin{equation} \label{eq:free.energy.a} \begin{aligned} f(a)&=1+\frac12\log\left(\frac4{p^2}\frac{a^2-1}{a^{2(p-1)}-\kappa^2}\right)+\int d\lambda\,\rho(\lambda)\log|\lambda|^2 \\ &\hspace{80pt}-\frac{a^p(1+p(a^2-1))-a^2\kappa}{a^{2p}+a^p(a^2-1)(p-1)-a^2\kappa^2}(\mathop{\mathrm{Re}}\epsilon)^2 -\frac{a^p(1+p(a^2-1))+a^2\kappa}{a^{2p}-a^p(a^2-1)(p-1)-a^2\kappa^2}(\mathop{\mathrm{Im}}\epsilon)^2, \end{aligned} \end{equation} \end{widetext} This leaves a single parameter, $a$, which dictates the magnitude of $z^*\cdot z$, or alternatively the magnitude $y^2$ of the imaginary part. The latter vanishes as $a\to1$, where we should recover known results for the real $p$-spin. \begin{figure}[htpb] \centering \includegraphics{fig/spectra_0.0.pdf} \includegraphics{fig/spectra_0.5.pdf}\\ \includegraphics{fig/spectra_1.0.pdf} \includegraphics{fig/spectra_1.5.pdf} \caption{ Eigenvalue and singular value spectra of the matrix $\partial\partial H$ for $p=3$, $a=\frac54$, and $\kappa=\frac34e^{-i3\pi/4}$ with (a) $\epsilon=0$, (b) $\epsilon=-\frac12|\epsilon_{\mathrm{th}}|$, (c) $\epsilon=-|\epsilon_{\mathrm{th}}|$, and (d) $\epsilon=-\frac32|\epsilon_{\mathrm{th}}|$. The shaded region of each inset shows the support of the eigenvalue distribution. The solid line on each plot shows the distribution of singular values, while the overlaid histogram shows the empirical distribution from $2^{10}\times2^{10}$ complex normal matrices with the same covariance and diagonal shift as $\partial\partial H$. } \label{fig:spectra} \end{figure} The Hessian of \eqref{eq:constrained.hamiltonian} is $\partial\partial H=\partial\partial H_0-p\epsilon I$, or the Hessian of \eqref{eq:bare.hamiltonian} with a constant added to its diagonal. The eigenvalue distribution $\rho$ of the constrained Hessian is therefore related to the eigenvalue distribution $\rho_0$ of the unconstrained one by a similar shift, or $\rho(\lambda)=\rho_0(\lambda+p\epsilon)$. The Hessian of \eqref{eq:bare.hamiltonian} is \begin{equation} \label{eq:bare.hessian} \partial_i\partial_jH_0 =\frac{p(p-1)}{p!}\sum_{k_1\cdots k_{p-2}}^NJ_{ijk_1\cdots k_{p-2}}z_{k_1}\cdots z_{k_{p-2}}, \end{equation} {\bf \color{red} restricting to directions proportional to $z$, i.e. orthogonal to the constraint}, these makes its ensemble that of Gaussian complex symmetric matrices. Given its variances $\overline{|\partial_i\partial_j H_0|^2}=p(p-1)a^{p-2}/2N$ and $\overline{(\partial_i\partial_j H_0)^2}=p(p-1)\kappa/2N$, $\rho_0(\lambda)$ is constant inside the ellipse \begin{equation} \label{eq:ellipse} \left(\frac{\mathop{\mathrm{Re}}(\lambda e^{i\theta})}{a^{p-2}+|\kappa|}\right)^2+ \left(\frac{\mathop{\mathrm{Im}}(\lambda e^{i\theta})}{a^{p-2}-|\kappa|}\right)^2 <\frac{p(p-1)}{2a^{p-2}} \end{equation} where $\theta=\frac12\arg\kappa$ \cite{Nguyen_2014_The}. The eigenvalue spectrum of $\partial\partial H$ therefore is that of an ellipse in the complex plane whose center lies at $-p\epsilon$. Examples of these distributions are shown in the insets of Fig.~\ref{fig:spectra}. The eigenvalue spectrum of the Hessian of the real part is the one we need for our Kac-Rice formula. It is different from the spectrum $\partial\partial H$, but rather equivalent to the square-root eigenvalue spectrum of $(\partial\partial H)^\dagger\partial\partial H$,in other words, the singular value spectrum of $\partial\partial H$. When $\kappa=0$ and the elements of $J$ are standard complex normal, this corresponds to a complex Wishart distribution. For $\kappa\neq0$ the problem changes, and to our knowledge a closed form is not in the literature. We have worked out an implicit form for this spectrum using the saddle point of a replica symmetric calculation for the Green function. Introducing replicas to bring the partition function to the numerator of the Green function \cite{Livan_2018_Introduction} gives \begin{widetext} \begin{equation} \label{eq:green.replicas} G(\sigma)=\lim_{n\to0}\int d\zeta\,d\zeta^*\,(\zeta_i^{(0)})^*\zeta_i^{(0)} \exp\left\{ \frac12\sum_\alpha^n\left[(\zeta_i^{(\alpha)})^*\zeta_i^{(\alpha)}\sigma -\Re\left(\zeta_i^{(\alpha)}\zeta_j^{(\alpha)}\partial_i\partial_jH\right) \right] \right\} \end{equation} with sums taken over repeated latin indices. The average can then be made over $J$ and Hubbard--Stratonovich used to change variables to replica matrices $N\alpha_{\alpha\beta}=(\zeta^{(\alpha)})^*\cdot\zeta^{(\beta)}$ and $N\chi_{\alpha\beta}=\zeta^{(\alpha)}\cdot\zeta^{(\beta)}$ and a series of replica vectors. Taking the replica-symmetric ansatz leaves all off-diagonal elements and vectors zero, and $\alpha_{\alpha\beta}=\alpha_0\delta_{\alpha\beta}$, $\chi_{\alpha\beta}=\chi_0\delta_{\alpha\beta}$. The result is \begin{equation}\label{eq:green.saddle} \overline G(\sigma)=N\lim_{n\to0}\int d\alpha_0\,d\chi_0\,d\chi_0^*\,\alpha_0 \exp\left\{nN\left[ 1+\frac{p(p-1)}{16}a^{p-2}\alpha_0^2-\frac{\alpha_0\sigma}2+\frac12\log(\alpha_0^2-|\chi_0|^2) +\frac p4\mathop{\mathrm{Re}}\left(\frac{(p-1)}8\kappa^*\chi_0^2-\epsilon^*\chi_0\right) \right]\right\}. \end{equation} \end{widetext} The argument of the exponential has several saddles. The solutions $\alpha_0$ are the roots of a sixth-order polynomial, but the root with the smallest value of $\mathop{\mathrm{Re}}\alpha_0$ in all the cases we studied gives the correct solution. A detailed analysis of the saddle point integration is needed to understand why this is so. Given such $\alpha_0$, the density of singular values follows from the jump across the cut, or \begin{equation} \label{eq:spectral.density} \rho(\sigma)=\frac1{i\pi N}\left( \lim_{\mathop{\mathrm{Im}}\sigma\to0^+}\overline G(\sigma) -\lim_{\mathop{\mathrm{Im}}\sigma\to0^-}\overline G(\sigma) \right) \end{equation} The transition from a one-cut to two-cut singular value spectrum naturally corresponds to the origin leaving the support of the eigenvalue spectrum. Weyl's theorem requires that the product over the norm of all eigenvalues must not be greater than the product over all singular values \cite{Weyl_1912_Das}. Therefore, the absence of zero eigenvalues implies the absence of zero singular values. The determination of the threshold energy -- the energy at which the distribution of singular values becomes gapped -- is therefore reduced to a geometry problem, and yields \begin{equation} \label{eq:threshold.energy} |\epsilon_{\mathrm{th}}|^2 =\frac{p-1}{2p}\frac{(1-|\delta|^2)^2a^{p-2}} {1+|\delta|^2-2|\delta|\cos(\arg\kappa+2\arg\epsilon)} \end{equation} for $\delta=\kappa a^{-(p-2)}$. With knowledge of this distribution, the integral in \eqref{eq:free.energy.a} may be preformed for arbitrary $a$. For all values of $\kappa$ and $\epsilon$, the resulting expression is always maximized for $a=\infty$. Taking this saddle gives \begin{equation} \label{eq:bezout} \ln \overline{\mathcal N}(\kappa,\epsilon) ={N\log(p-1)} \end{equation} This is precisely the Bézout bound, the maximum number of solutions that $N$ equations of degree $p-1$ may have \cite{Bezout_1779_Theorie}. More insight is gained by looking at the count as a function of $a$, defined by \begin{equation} \label{eq:count.def.marginal} {\mathcal N}(\kappa,\epsilon,a) ={\mathcal N}(\kappa,\epsilon/ \sum_i y_i^2