-rw-r--r-- | stokes.tex | 300 |
1 files changed, 163 insertions, 137 deletions
@@ -2,7 +2,6 @@ \usepackage[utf8]{inputenc} % why not type "Stokes" with unicode? \usepackage[T1]{fontenc} % vector fonts -\usepackage{newtxtext,newtxmath} % Times for PR \usepackage[ colorlinks=true, urlcolor=purple, @@ -10,7 +9,8 @@ filecolor=purple, linkcolor=purple ]{hyperref} % ref and cite links with pretty colors -\usepackage{amsmath, graphicx, xcolor} % standard packages +\usepackage{amsmath, amssymb, graphicx, xcolor} % standard packages +\usepackage{newtxtext,newtxmath} % Times for PR \usepackage[subfolder]{gnuplottex} % need to compile separately for APS \begin{document} @@ -44,31 +44,38 @@ \maketitle - -Consider a thermodynamic calculation involving the (real) $p$-spin model for a -particular instantiation of the coupling tensor $J$ +Consider an action $\mathcal S_\lambda$ defined on the phase space $\Omega$ and +depending on parameters $\lambda$. In the context of statistical mechanics, +$\mathcal S_{\beta,J}=-\beta H_J$ for some hamiltonian $H_J$ with quenched +parameters $J$ at inverse temperature $\beta$. A typical calculation stems from +the partition function \begin{equation} \label{eq:partition.function} - Z_J(\beta)=\int_{S^{N-1}}ds\,e^{-\beta H_J(s)} -\end{equation} -where $S^{N-1}$ is the $N-1$-sphere of radius $N$. Quantities of interest are -usually related to the quenched free energy, produced by averaging over the -$J$s the sample free energy $F_J$ -\begin{equation} - \overline F(\beta,\kappa)=-\beta^{-1}\int dJ\,p_\kappa(J)\log Z_J -\end{equation} -which can depend in general on the inverse temperature $\beta$ and on some -parameter $\kappa$ which governs the distribution of $J$s. For most -applications, $\beta$ is taken to be real and positive, and the distribution -$p_\kappa$ is taken to be Gaussian or discrete on $\pm1$. - -We are interested in analytically continuing expressions like $\overline F$ -into the region of complex $\beta$ or distributions $p_\kappa$ involving -complex $J$. 
- -When the argument of the exponential integrand in \eqref{eq:partition.function} -acquires an imaginary component, various numeric and perturbative schemes for -approximating its value can face immediate difficulties due to the emergence of -a sign problem, resulting from rapid oscillations coinciding with saddles. + Z(\lambda)=\int_\Omega ds\,e^{\mathcal S_\lambda(s)}. +\end{equation} +This integral is often dominated by its behavior near stationary points of the +action, and understanding these points is usually important to evaluate the +partition function. + +Recent developments have found that stationary points of the action are +important for understanding another aspect of the partition function: its +analytic continuation. The integral \eqref{eq:partition.function} is first +interpreted as a contour in a larger complex phase space, then deformed into a +linear combination of specially constructed contours each enumerated by a +stationary point. Analytic continuation of parameters preserves this +decomposition except at nongeneric points where contours intersect. + +We investigate the plausibility of analytic continuation in complex models +where the action has a macroscopic number of stationary points. Such actions +are common in studies of glasses, spin glass, machine learning, black holes, +\dots We find that the geometry of the landscape, and in particular the +relative position and spectrum of stationary points, is key. + +Analytic continuation of partition functions is useful for many reasons. Some +physically motivated theories have actions whose partition function is formally +infinite, but can be defined by continuing from a set of parameters where it +converges. Other theories have oscillatory actions that lead to a severe sign +problem in estimating the partition function, which can be addressed by taking +advantage of a deformed phase space where the phase of the action varies slowly. 
Unfortunately the study is not so relevant for low-dimensional `rugged' landscapes, which are typically constructed from the limits of series or @@ -77,56 +84,51 @@ integrals of analytic functions which are not themselves analytic \section{Integration by Lefschetz thimble} -Consider an $N$-dimensional hermitian manifold $M$ and a Hamiltonian $H:M\to\mathbb C$. The partition function -\begin{equation} - Z(\beta)=\int_S du\,e^{-\beta H(u)} -\end{equation} -for $S$ a submanifold (not necessarily complex) of $M$. For instance, the -$p$-spin spherical model can be defined on the complex space $M=\{z\mid -z^2=N\}$, but typically one is interested in the subspace $S=\{z\mid -z^2=N,z\in\mathbb R\}$. - -If $S$ is orientable, then the integral can be converted to one over a contour -corresponding to $S$. In this case, the contour can be freely deformed without -affecting the value of the integral. Two properties of this deformed contour -would be ideal. First, that as $|u|\to\infty$ the real part of $-\beta H(u)$ -goes to $-\infty$. This ensures that the integral is well defined. Second, that -the contours piecewise correspond to surfaces of constant phase of $-\beta H$, -so as to ameliorate sign problems. - -Remarkably, there is a recipe for accomplishing both these criteria at once, -courtesy of Morse theory. For a more thorough review, see -\citet{Witten_2011_Analytic}. Consider a critical point of $H$. The union of -all gradient descent trajectories on the real part of $-\beta H$ that terminate -at the critical point as $t\to-\infty$ is known as the \emph{Lefschetz thimble} -corresponding with that critical point. Since each point on the Lefschetz -thimble is a descent from a critical point, the value of -$\operatorname{Re}(-\beta H)$ is bounded from above by its value at the -critical point. Likewise, we shall see that the imaginary part of $\beta H$ is -preserved under gradient descent on its real part. 
- -Morse theory provides the universal correspondence between contours and -thimbles: one must produce an integer-weighted linear combination of thimbles -such that the homology of the combination is equivalent to that of the contour. -If $\Sigma$ is the set of critical points and for each $\sigma\in\Sigma$ -$\mathcal J_\sigma$ is its Lefschetz thimble, then this gives +We return to the partition function \eqref{eq:partition.function}. If +the action can be continued to a holomorphic function on the Kähler +manifold $\tilde\Omega\supset\Omega$ and $\Omega$ is orientable in $\tilde\Omega$, +then \eqref{eq:partition.function} can be considered a contour integral. In +this case, the contour can be freely deformed without affecting the value of +the integral. Two properties of this deformed contour would be ideal. First, +that as $|s|\to\infty$ the real part of the action goes to $-\infty$, to ensure +the integral converges. Second, that the contours piecewise correspond to +surfaces of slowly varying phase of the action, so as to ameliorate sign +problems. + +Remarkably, there is an elegant recipe for accomplishing both these criteria at +once, courtesy of Morse theory. For a more thorough review, see +\citet{Witten_2011_Analytic}. Consider a stationary point of the action. The +union of all gradient descent trajectories on the real part of the action that +begin at the stationary point is known as a \emph{Lefschetz thimble}. Since +each point on the Lefschetz thimble is found through descent from the +stationary point, the real part of the action is bounded from above by its +value at the stationary point. Likewise, we shall see that the imaginary part +of the action is constant on a thimble. + +Morse theory provides a universal correspondence between contours and thimbles. 
+For any contour $\Omega$, there exists a linear combination of thimbles such +that the relative homology of the combination with respect to descent in the +action is equivalent to that of the contour. If $\Sigma$ is the set of +stationary points of the action and for each $\sigma\in\Sigma$ the set +$\mathcal J_\sigma\subset\tilde\Omega$ is its thimble, then this gives \begin{equation} \label{eq:thimble.integral} - Z(\beta)=\sum_{\sigma\in\Sigma}n_\sigma\int_{\mathcal J_\sigma}du\,e^{-\beta H(u)} + Z(\lambda)=\sum_{\sigma\in\Sigma}n_\sigma\int_{\mathcal J_\sigma}ds\,e^{\mathcal S_\lambda(s)}. \end{equation} Each of these integrals is very well-behaved: convergent asymptotic series -exist for their value about the critical point $\sigma$, for example. One must -know the integer weights $n_\sigma$. - -Under analytic continuation of, say, $\beta$, the form of -\eqref{eq:thimble.integral} persists. When the homology of the thimbles is -unchanged by the continuation, the integer weights are likewise unchanged, and -one can therefore use the knowledge of these weights in one regime to compute -the partition function in other. However, their homology can change, and when -this happens the integer weights can be traded between critical points. These -trades occur when two thimbles intersect, or alternatively when one critical -point lies on the gradient descent of another. These places are called -\emph{Stokes points}, and the gradient descent trajectories that join two -critical points are called \emph{Stokes lines}. +exist for their value about each critical point. The integer weights $n_\sigma$ +are fixed by comparison with the initial contour. For a real action, all maxima +in $\Omega$ contribute in equal magnitude. + +Under analytic continuation, the form of \eqref{eq:thimble.integral} +generically persists. 
When the relative homology of the thimbles is unchanged +by the continuation, the integer weights are likewise unchanged, and one can +therefore use the knowledge of these weights in one regime to compute the +partition function in the other. However, their relative homology can change, +and when this happens the integer weights can be traded between critical +points. These trades occur when two thimbles intersect, or alternatively when +one stationary point lies in the gradient descent of another. These places are +called \emph{Stokes points}, and the gradient descent trajectories that join +two stationary points are called \emph{Stokes lines}. The prevalence (or not) of Stokes points in a given continuation, and whether those that do appear affect the weights of critical points of interest, is a @@ -137,105 +139,114 @@ resulting weights. \section{Gradient descent dynamics} -For a holomorphic Hamiltonian $H$, dynamics are defined by gradient descent on -$\operatorname{Re}\beta H$. In hermitian geometry, the gradient is given by raising -an index of the conjugate differential, or -$\operatorname{grad}f=(\partial^*f)^\sharp$. This implies that, in terms of -coordinates $u:M\to\mathbb C^N$, gradient descent follows the dynamics -\begin{equation} \label{eq:flow.raw} - \dot z^i - =-(\partial^*_{\gamma}\operatorname{Re}\beta H)h^{\gamma\alpha}\partial_\alpha z^i - =-\tfrac12(\beta\partial_\gamma H)^*h^{\gamma\alpha}\partial_\alpha z^i -\end{equation} -where $h$ is the Hermitian metric and we have used the fact that, for holomorphic $H$, $\partial^*H=0$. - -This can be simplied by noting that $h^{\gamma\alpha}=h^{-1}_{\gamma\alpha}$ for -$h_{\gamma\alpha}=(\partial_\gamma z^i)^*\partial_\alpha z^i=(J^\dagger -J)_{\gamma\alpha}$ where $J$ is the Jacobian of the coordinate map. 
Writing -$\partial H=\partial H/\partial z$ and inserting Jacobians everywhere they -appear, \eqref{eq:flow.raw} becomes +The `dynamics' describing thimbles is defined by gradient descent on the real +part of the action. +\begin{equation} \label{eq:flow.coordinate.free} + \dot s + =-\operatorname{grad}\operatorname{Re}\mathcal S + =-\left(\frac\partial{\partial s^*}\operatorname{Re}\mathcal S\right)^\sharp + =-\frac12\frac{\partial\mathcal S^*}{\partial s^*}g^{-1}\frac\partial{\partial s}, +\end{equation} +where $g$ is the metric and the holomorphicity of the action was used to set +$\partial^*\mathcal S=0$. + +We will be dealing with actions where it is convenient to refer to coordinates +in a higher-dimensional embedding space. Let $z:\tilde\Omega\to\mathbb C^N$ be +an embedding of phase space into complex euclidean space. This gives +\begin{equation}\label{eq:flow.raw} + \dot z + =-\frac12\frac{\partial\mathcal S^*}{\partial z^*}(Dz)^* g^{-1}(Dz)^T\frac\partial{\partial z} +\end{equation} +where $Dz=\partial z/\partial s$ is the Jacobian of the embedding. +The embedding induces a metric on $\tilde\Omega$ by $g=(Dz)^\dagger Dz$. +Writing $\partial=\partial/\partial z$, this gives \begin{equation} \label{eq:flow} - \dot z=-\tfrac12(\beta\partial H)^\dagger J^*[J^\dagger J]^{-1}J^T - =-\tfrac12\beta^*(\partial H)^\dagger P -\end{equation} -which is nothing but the projection of $(\partial H)^*$ into the tangent space of the manifold, with $P=J^*[J^\dagger J]^{-1}J^T$. Note that $P$ is hermitian: $P^\dagger=(J^*[J^\dagger J]^{-1}J^T)^\dagger=J^*[J^\dagger J]^{-1}J^T=P$. - -Gradient descent on $\operatorname{Re}\beta H$ is equivalent to Hamiltonian dynamics -with the Hamiltonian $\operatorname{Im}\beta H$. 
This is because $(M, h)$ is Kähler -and therefore admits a symplectic structure, but that the flow conserves -$\operatorname{Im}\beta H$ can be shown using \eqref{eq:flow} and the holomorphic property of $H$: + \dot z=-\tfrac12(\partial\mathcal S)^\dagger(Dz)^*[(Dz)^\dagger(Dz)]^{-1}(Dz)^T + =-\tfrac12(\partial \mathcal S)^\dagger P +\end{equation} +which is nothing but the projection of $(\partial\mathcal S)^*$ into the +tangent space of the manifold, with $P=(Dz)^*[(Dz)^\dagger(Dz)]^{-1}(Dz)^T$. +Note that $P$ is hermitian. + +Gradient descent on $\operatorname{Re}\mathcal S$ is equivalent to Hamiltonian +dynamics with the Hamiltonian $\operatorname{Im}\mathcal S$. This is because +$(\tilde\Omega, g)$ is Kähler and therefore admits a symplectic structure, but +that the flow conserves $\operatorname{Im}\mathcal S$ can be shown using +\eqref{eq:flow} and the holomorphic property of $\mathcal S$: \begin{equation} \begin{aligned} - \frac d{dt}&\operatorname{Im}\beta H - =\dot z\partial\operatorname{Im}\beta H+\dot z^*\partial^*\operatorname{Im}\beta H \\ + \frac d{dt}&\operatorname{Im}\mathcal S + =\dot z\partial\operatorname{Im}\mathcal S+\dot z^*\partial^*\operatorname{Im}\mathcal S \\ &=\frac i4\left( - \beta^*\beta(\partial H)^\dagger P\partial H-\beta\beta^*(\partial H)^TP^\dagger(\partial H)^* + (\partial \mathcal S)^\dagger P\partial\mathcal S-(\partial\mathcal S)^TP^*(\partial\mathcal S)^* \right) \\ - &=\frac i4|\beta|^2\left( - (\partial H)^\dagger P\partial H-[(\partial H)^\dagger P\partial H]^* + &=\frac i4\left( + (\partial\mathcal S)^\dagger P\partial\mathcal S-[(\partial\mathcal S)^\dagger P\partial\mathcal S]^* \right) \\ - &=\frac i4|\beta|^2\left( - \|\partial H\|^2-(\|\partial H\|^*)^2 + &=\frac i4\left( + \|\partial\mathcal S\|^2-(\|\partial\mathcal S\|^*)^2 \right)=0. 
\end{aligned} \end{equation} -As a result of this conservation law, surfaces of constant $\operatorname{Im}\beta H$ -will be important when evaluting the possible endpoints of trajectories. A consequence of this conservation is that the flow in the energy takes a simple form: +As a result of this conservation law, surfaces of constant imaginary action +will be important when evaluating the possible endpoints of trajectories. A +consequence of this conservation is that the flow in the action takes a simple +form: \begin{equation} - \dot H - =\dot z\partial H - =-\frac12\beta^*(\partial H)^\dagger P H - =-\frac12\beta^*\|\partial H\|^2. + \dot{\mathcal S} + =\dot z\partial\mathcal S + =-\frac12(\partial\mathcal S)^\dagger P\partial\mathcal S + =-\frac12\|\partial\mathcal S\|^2. \end{equation} -In the complex-$H$ plane, dynamics is occurs along straight lines whose -direction is the same as $\arg \beta^*$. +In the complex-$\mathcal S$ plane, the dynamics occurs along straight lines in +the negative real direction. -Let us consider the generic case, where the critical points of $H$ have -distinct energies. What is the topology of the $C=\operatorname{Im}\beta H$ level +Let us consider the generic case, where the critical points of $\mathcal S$ have +distinct energies. What is the topology of the $C=\operatorname{Im}\mathcal S$ level set? We shall argue its form by construction. Consider initially the situation in the absence of any critical point. In this case the level set consists of a -single simply connected surface, locally diffeomorphic to $\mathbb R^{2N}$. Now, `place' a generic -(nondegenerate) critical point in the function at $u_0$. In the vicinity of the critical +single simply connected surface, locally diffeomorphic to $\mathbb R^{2D-1}$. Now, `place' a generic +(nondegenerate) critical point in the function at $z_0$. 
In the vicinity of the critical point, the flow is locally \begin{equation} \begin{aligned} \dot z - &\simeq-\tfrac12\beta^*(\partial\partial H)^\dagger P(z-z_0)^* + &\simeq-\tfrac12(\partial\partial\mathcal S|_{z=z_0})^\dagger P(z-z_0)^* \end{aligned} \end{equation} -The matrix $(\partial\partial H)^\dagger P$ has a spectrum identical to that of -$(\partial\partial H)^\dagger$ save marginal directions corresponding to the normals to +The matrix $(\partial\partial\mathcal S)^\dagger P$ has a spectrum identical to that of +$(\partial\partial\mathcal S)^\dagger$ save marginal directions corresponding to the normals to the manifold. Assuming we are working in a diagonal basis, this becomes \begin{equation} - \dot z_i=-\tfrac12\beta^*\lambda_i\Delta z_i^*+O(\Delta z^2,(\Delta z^*)^2) + \dot z_i=-\tfrac12\lambda_i\Delta z_i^*+O(\Delta z^2,(\Delta z^*)^2) \end{equation} Breaking into real and imaginary parts gives \begin{equation} \begin{aligned} \frac{d\Delta x_i}{dt}&=-\frac12\left( - \operatorname{Re}\beta\lambda_i\Delta x_i+\operatorname{Im}\beta\lambda_i\Delta y_i + \operatorname{Re}\lambda_i\Delta x_i+\operatorname{Im}\lambda_i\Delta y_i \right) \\ \frac{d\Delta y_i}{dt}&=-\frac12\left( - \operatorname{Im}\beta\lambda_i\Delta x_i-\operatorname{Re}\beta\lambda_i\Delta y_i + \operatorname{Im}\lambda_i\Delta x_i-\operatorname{Re}\lambda_i\Delta y_i \right) \end{aligned} \end{equation} Therefore, in the complex plane defined by each eigenvector of -$(\partial\partial H)^\dagger P$ there is a separatrix flow of the form in +$(\partial\partial\mathcal S)^\dagger P$ there is a separatrix flow of the form in Figure \ref{fig:local_flow}. The effect of these separatrices in each complex direction of the tangent space $T_{z_0}M$ is to separate that space into four quadrants: two disconnected pieces with greater imaginary part than the critical point, and two with lesser imaginary part. 
This partitioning implies -that the level set of $\operatorname{Im}\beta H=C$ for -$C\neq\operatorname{Im}\beta H(z_0)$ is split into two disconnected pieces, one +that the level set of $\operatorname{Im}\mathcal S=C$ for +$C\neq\operatorname{Im}\mathcal S(z_0)$ is split into two disconnected pieces, one lying in each of two quadrants corresponding with its value relative to that at the critical point. \begin{figure} \includegraphics[width=\columnwidth]{figs/local_flow.pdf} \caption{ - Gradient descent in the vicinity of a critical point, in the $z$--$z*$ - plane for an eigenvector $z$ of $(\partial\partial H)^\dagger P$. The flow + Gradient descent in the vicinity of a critical point, in the $z$--$z^*$ + plane for an eigenvector $z$ of $(\partial\partial\mathcal S)^\dagger P$. The flow lines are colored by the value of $\operatorname{Im}H$. } \label{fig:local_flow} \end{figure} @@ -243,14 +254,14 @@ the critical point. Continuing to `insert' critical points whose imaginary energy differs from $C$, one repeatedly partitions the space this way with each insertion. Therefore, for the generic case with $\mathcal N$ critical points, with $C$ differing in -value from all critical points, the level set $\operatorname{Im}\beta H=C$ has +value from all critical points, the level set $\operatorname{Im}\mathcal S=C$ has $\mathcal N+1$ connected components, each of which is simply connected, -diffeomorphic to $\mathbb R^{2N-1}$ and connects two sectors of infinity to +diffeomorphic to $\mathbb R^{2D-1}$ and connects two sectors of infinity to each other. When $C$ is brought to the same value as the imaginary part of some critical point, two of these disconnected surfaces grow nearer and pinch together -at the critical point when $C=\operatorname{Im}\beta H$, as in the black lines of +at the critical point when $C=\operatorname{Im}\mathcal S$, as in the black lines of Figure \ref{fig:local_flow}. 
The unstable manifold of the critical point, which corresponds with the portion of this surface that flows away, produces the Lefschetz thimble associated with that critical point. @@ -263,7 +274,7 @@ must have the same imaginary energy if they are to be connected by a Stokes line. This is not a generic phenomenon, but will happen often as one model parameter is continuously varied. When two critical points do have the same imaginary energy and $C$ is brought to that value, the level set -$C=\operatorname{Im}\beta H$ sees formally disconnected surfaces pinch together in +$C=\operatorname{Im}\mathcal S$ sees formerly disconnected surfaces pinch together in two places. We shall argue that generically, a Stokes line will exist whenever the two critical points in question lie on the same connected piece of this surface. @@ -286,15 +297,30 @@ those with different energies, Stokes lines will be rare. \section{p-spin spherical models} -For $p$-spin spherical models, one is constrained to the manifold $M=\{z\mid -z^2=N\}$. The normal to this manifold at any point $z\in M$ is always in the -direction $z$. The projection operator onto the tangent space of this manifold -is given by +The $p$-spin spherical models are statistical mechanics models defined by the +action $\mathcal S=-\beta H$ for the Hamiltonian +\begin{equation} \label{eq:p-spin.hamiltonian} + H(x)=\sum_{p=2}^\infty\frac{a_p}{p!}\sum_{i_1\cdots i_p}^NJ_{i_1\cdots i_p}x_{i_1}\cdots x_{i_p} +\end{equation} +where the $x\in\mathbb R^N$ are constrained to lie on the sphere $x^2=N$. The tensor +components $J$ are complex normally distributed with zero mean and variances +$\overline{|J|^2}=p!/2N^{p-1}$ and $\overline{J^2}=\kappa\overline{|J|^2}$, and +the numbers $a$ define the model. The pure real $p$-spin model has +$a_i=\delta_{ip}$ and $\kappa=1$. + +The phase space manifold $\Omega=\{x\mid x^2=N, x\in\mathbb R^N\}$ has a +complex extension $\tilde\Omega=\{z\mid z^2=N, z\in\mathbb C^N\}$. 
The natural +extension of the hamiltonian \eqref{eq:p-spin.hamiltonian} to this complex manifold is +holomorphic. The normal to this manifold at any point $z\in\tilde\Omega$ is +always in the direction $z$. The projection operator onto the tangent space of +this manifold is given by \begin{equation} P=I-\frac{zz^\dagger}{|z|^2}, \end{equation} where indeed $Pz=z-z|z|^2/|z|^2=0$ and $Pz'=z'$ for any $z'$ orthogonal to $z$. +To find critical points, we use the method of Lagrange multipliers. Introducing $\mu\in\mathbb C$, + \subsection{2-spin} The Hamiltonian of the $2$-spin model is defined for $z\in\mathbb C^N$ by |