Lots of writing, especially in appendix on superspace.

author: Jaron Kent-Dobias <jaron@kent-dobias.com> 2024-06-10 19:29:07 +0200
committer: Jaron Kent-Dobias <jaron@kent-dobias.com> 2024-06-10 19:29:07 +0200
commit: 2de476de17aea90b040875a7e3e87c308944a035 (patch)
tree: c1e7eeca470f42d3c6fae69f48871b44725a4c8e
parent: d02197918d759195dd1d863de28f352f70cbf14c (diff)
download: marginal-2de476de17aea90b040875a7e3e87c308944a035.tar.gz
marginal-2de476de17aea90b040875a7e3e87c308944a035.tar.bz2
marginal-2de476de17aea90b040875a7e3e87c308944a035.zip
1 files changed, 139 insertions, 30 deletions
diff --git a/marginal.tex b/marginal.tex
index 246bcaf..024b0ac 100644
--- a/marginal.tex
+++ b/marginal.tex
@@ -442,6 +442,40 @@ Finally, the marginal complexity is defined by evaluating the complexity conditi
   =\Sigma_0(E,\mu_\text m(E))
 \end{equation}
 
+\subsection{General features of saddle point computation}
+
+\begin{align}
+  \label{eq:delta.grad}
+  &\delta\big(\nabla H(\mathbf x_a,\pmb\omega_a)\big)
+    =\int\frac{d\hat{\mathbf x}_a}{(2\pi)^N}e^{i\hat{\mathbf x}_a^T\nabla H(\mathbf x_a,\pmb\omega_a)} \\
+    \label{eq:delta.energy}
+  &\delta\big(NE-H(\mathbf x_a)\big)
+    =\int\frac{d\hat\beta_a}{2\pi}e^{\hat\beta_a(NE-H(\mathbf x_a))} \\
+  &\delta\big(N\lambda^*-\mathbf s^T\operatorname{Hess}H(\mathbf x_a,\pmb\omega)\mathbf s\big)
+  \label{eq:delta.eigen}
+    =\int\frac{d\hat\lambda_a}{2\pi}e^{\hat\lambda_a(N\lambda^*-\mathbf s^T\operatorname{Hess}H(\mathbf x_a,\pmb\omega)\mathbf s)}
+\end{align}
+
+Here we will merely sketch the steps that are standard. We start by translating elements of the Kac--Rice measure into terms more familiar to physicists. This means writing \eqref{eq:delta.grad}, \eqref{eq:delta.energy}, and \eqref{eq:delta.eigen}
+for the Dirac $\delta$ functions. At this point we will also discuss an
+important step we will use repeatedly in this paper: to drop the absolute value
+signs around the determinant in the Kac--Rice measure. This can potentially
+lead to severe problems with the complexity. However, it is a justified step
+when the parameters of the problem, i.e., $E$, $\mu$, and $\lambda^*$, put us in
+a regime where the exponential majority of stationary points have the same
+index. This is true for maxima and minima, and for saddle points whose spectra have a strictly positive bulk with a fixed number of negative
+outliers. Dropping the absolute value sign allows us to write
+\begin{equation}
+  \det\operatorname{Hess}H(\mathbf x_a, \pmb\omega_a)
+  =\int d\pmb\eta_a\,d\bar{\pmb\eta}_a\,e^{\bar{\pmb\eta}_a^T\operatorname{Hess}H(\mathbf x_a,\pmb\omega_a)\pmb\eta_a}
+\end{equation}
+for $N$-dimensional Grassmann variables $\bar{\pmb\eta}_a$ and $\pmb\eta_a$. For
+the spherical models this step is unnecessary, since there are other ways to
+treat the determinant keeping the absolute value signs, as in previous works
+\cite{Folena_2020_Rethinking, Kent-Dobias_2023_How}. However, since others of
+our examples are for models where the same techniques are impossible, it is
+useful to see the fermionic method in action in this simple case.
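+
+As a minimal check of this representation (an illustrative case we add here,
+assuming the convention $\int d\eta\,d\bar\eta\,\bar\eta\eta=1$, consistent with
+the integration rules of Appendix~\ref{sec:superspace}), take $N=1$ and replace
+the Hessian by a single number $a$. Since $(\bar\eta\eta)^2=0$, the exponential
+terminates at linear order and
+\begin{equation}
+  \int d\eta\,d\bar\eta\,e^{\bar\eta a\eta}
+  =\int d\eta\,d\bar\eta\,\big(1+a\bar\eta\eta\big)
+  =a
+\end{equation}
+which is the determinant of the $1\times1$ matrix, sign included.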
+
 \section{Examples}
 
 \subsection{Spherical spin glasses}
@@ -475,36 +509,7 @@ calculation for this case, since we will something about its application in
 more nontrivial settings.
 
 The procedure to treat the complexity of the spherical models has been made in
-detail elsewhere \cite{Kent-Dobias_2023_How}. Here we will merely sketch the steps that are standard. We start by translating elements of the Kac--Rice measure into terms more familiar to physicists. This means writing
-\begin{align}
-  \label{eq:delta.grad}
-  \delta\big(\nabla H(\mathbf x_a,\pmb\omega_a)\big)
-    &=\int\frac{d\hat{\mathbf x}_a}{(2\pi)^N}e^{i\hat{\mathbf x}_a^T\nabla H(\mathbf x_a,\pmb\omega_a)} \\
-    \label{eq:delta.energy}
-  \delta\big(NE-H(\mathbf x_a)\big)
-    &=\int\frac{d\hat\beta_a}{2\pi}e^{\hat\beta_a(NE-H(\mathbf x_a))} \\
-  \delta\big(N\lambda^*-\mathbf s^T\operatorname{Hess}H(\mathbf x_a,\pmb\omega)\mathbf s\big)
-  \label{eq:delta.eigen}
-    &=\int\frac{d\hat\lambda_a}{2\pi}e^{\hat\lambda_a(N\lambda^*-\mathbf s^T\operatorname{Hess}H(\mathbf x_a,\pmb\omega)\mathbf s)}
-\end{align}
-for the Dirac $\delta$ functions. At this point we will also discuss an
-important step we will use repeatedly in this paper: to drop the absolute value
-signs around the determinant in the Kac--Rice measure. This can potentially
-lead to severe problems with the complexity. However, it is a justified step
-when the parameters of the problem, i.e., $E$, $\mu$, and $\lambda^*$ put us in
-a regime where the exponential majority of stationary points have the same
-index. This is true for maxima and minima, and for saddle points whose spectra have a strictly positive bulk with a fixed number of negative
-outliers. Dropping the absolute value sign allows us to write
-\begin{equation}
-  \det\operatorname{Hess}H(\mathbf x_a, \pmb\omega_a)
-  =\int d\pmb\eta_a\,d\bar{\pmb\eta}_a\,e^{\bar{\pmb\eta}_a^T\operatorname{Hess}H(\mathbf x_a,\pmb\omega)\pmb\eta_a}
-\end{equation}
-for $N$-dimensional Grassmann variables $\bar{\pmb\eta}_a$ and $\pmb\eta_a$. For
-the spherical models this step is unnecessary, since there are other ways to
-treat the determinant keeping the absolute value signs, as in previous works
-\cite{Folena_2020_Rethinking, Kent-Dobias_2023_How}. However, since other of
-our examples are for models where the same techniques are impossible, it is
-useful to see the fermionic method in action in this simple case.
+detail elsewhere \cite{Kent-Dobias_2023_How}. 
 
 Once these substitutions have been made, the entire expression
 \eqref{eq:min.complexity.expanded} is an exponential integral whose argument is
@@ -954,6 +959,110 @@ taking the zero-temperature limit, we find
 
 \appendix
 
+\section{A primer on superspace}
+\label{sec:superspace}
+
+The superspace $\mathbb R^{N|2D}$ is a vector space with $N$ real indices and
+$2D$ Grassmann indices $\bar\theta_1,\theta_1,\ldots,\bar\theta_D,\theta_D$.
+The Grassmann indices anticommute like fermions. Their integration is defined by
+\begin{equation}
+  \int d\theta\,\theta=1
+  \qquad
+  \int d\theta\,1=0
+\end{equation}
+Because the Grassmann indices anticommute, their square is always zero.
+Therefore, any series expansion of a function with respect to a given Grassmann
+index will terminate exactly at linear order, while a series expansion with
+respect to $n$ Grassmann variables will terminate exactly at $n$th order. If
+$f$ is an arbitrary function, then
+\begin{equation}
+  \int d\theta\,f(a+b\theta)
+  =\int d\theta\,\left[f(a)+f'(a)b\theta\right]
+  =f'(a)b
+\end{equation}
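+As a concrete instance (our illustrative choice of function, not one used
+elsewhere in the text), taking $f(x)=e^x$ with ordinary numbers $a$ and $b$ gives
+\begin{equation}
+  \int d\theta\,e^{a+b\theta}
+  =\int d\theta\,e^a\big(1+b\theta\big)
+  =be^a
+\end{equation}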
+This kind of behavior of integrals over the Grassmann indices makes them useful
+for compactly expressing the Kac--Rice measure. To see why, consider the
+specific superspace $\mathbb R^{N|2}$, where an arbitrary vector can be expressed as
+\begin{equation}
+  \pmb\phi(1)=\mathbf x+\bar\theta_1\pmb\eta+\bar{\pmb\eta}\theta_1+\bar\theta_1\theta_1i\hat{\mathbf x}
+\end{equation}
+where $\mathbf x,\hat{\mathbf x}\in\mathbb R^N$ and $\bar{\pmb\eta},\pmb\eta$ are
+$N$-dimensional Grassmann vectors. The dependence of $\pmb\phi$ on 1 indicates
+the index of Grassmann variables $\bar\theta_1,\theta_1$ inside, since we will
+sometimes want to use, e.g., $\pmb\phi(2)$ defined identically save for
+substitution by $\bar\theta_2,\theta_2$. Consider the series expansion of an arbitrary function $f$ of this supervector:
+\begin{equation}
+  \begin{aligned}
+    f\big(\pmb\phi(1)\big)
+    &=f(\mathbf x)
+    +\big(\bar\theta_1\pmb\eta+\bar{\pmb\eta}\theta_1+\bar\theta_1\theta_1i\hat{\mathbf x}\big)^T\partial f(\mathbf x) \\
+    &\quad+\frac12\big(\bar\theta_1\pmb\eta+\bar{\pmb\eta}\theta_1\big)^T\partial\partial f(\mathbf x)\big(\bar\theta_1\pmb\eta+\bar{\pmb\eta}\theta_1\big) \\
+    &=f(\mathbf x)
+    +\big(\bar\theta_1\pmb\eta+\bar{\pmb\eta}\theta_1+\bar\theta_1\theta_1i\hat{\mathbf x}\big)^T\partial f(\mathbf x) \\
+    &\qquad-\bar\theta_1\theta_1\bar{\pmb\eta}^T\partial\partial f(\mathbf x)\pmb\eta
+  \end{aligned}
+\end{equation}
+where in the last step we used the fact that the Hessian matrix is symmetric and
+that squares of Grassmann indices vanish. Using the integration rules defined above, we find
+\begin{equation}
+  \int d\theta_1\,d\bar\theta_1\,f\big(\pmb\phi(1)\big)
+  =i\hat{\mathbf x}^T\partial f(\mathbf x)-\bar{\pmb\eta}^T\partial\partial f(\mathbf x)\pmb\eta
+\end{equation}
+These two terms are precisely the exponential representation of the Dirac
+$\delta$ function of the gradient and determinant of the Hessian (without
+absolute value sign) that make up the basic Kac--Rice measure, so that we can write
+\begin{equation}
+  \begin{aligned}
+    &\int d\mathbf x\,\delta\big(\nabla H(\mathbf x)\big)\,\det\operatorname{Hess}H(\mathbf x) \\
+    &\qquad=\int d\mathbf x\,d\bar{\pmb\eta}\,d\pmb\eta\,\frac{d\hat{\mathbf x}}{(2\pi)^N}\,e^{i\hat{\mathbf x}^T\nabla H(\mathbf x)-\bar{\pmb\eta}^T\operatorname{Hess}H(\mathbf x)\pmb\eta} \\
+    &\qquad=\int d\pmb\phi\,e^{\int d1\,H(\pmb\phi(1))}
+  \end{aligned}
+\end{equation}
+where we have written $d1=d\theta_1\,d\bar\theta_1$ and $d\pmb\phi=d\mathbf
+x\,d\bar{\pmb\eta}\,d\pmb\eta\,\frac{d\hat{\mathbf x}}{(2\pi)^N}$. Besides some deep connections
+to the physics of BRST, this compact notation dramatically simplifies the
+analytical treatment of the problem. This simplification is possible
+because a large variety of algebraic and integral operations in superspace
+are direct analogues of their ordinary real
+counterparts. For instance, consider a super linear operator $M(1,2)$, which
+like the super vector $\pmb\phi$ is made up of a linear combination of $N\times
+N$ regular or Grassmann matrices indexed by every nonvanishing combination of
+the Grassmann indices $\bar\theta_1,\theta_1,\bar\theta_2,\theta_2$. Such a supermatrix acts on supervectors by ordinary matrix multiplication and convolution in the Grassmann indices, i.e.,
+\begin{equation}
+  (M\pmb\phi)(1)=\int d2\,M(1,2)\pmb\phi(2)
+\end{equation}
+Integrals involving superfields contracted into such operators result in schematically familiar expressions, like that of the standard Gaussian:
+\begin{equation}
+  \int d\pmb\phi\,e^{\int\,d1\,d2\,\pmb\phi(1)^TM(1,2)\pmb\phi(2)}
+  =(\operatorname{sdet}M)^{-N/2}
+\end{equation}
+where the usual role of the determinant is replaced by the superdeterminant.
+The superdeterminant can be defined using the ordinary determinant by writing a
+block version of the matrix $M$: if $\mathbf e(1)=\{1,\bar\theta_1\theta_1\}$ is
+the basis vector of the even subspace of the superspace and $\mathbf
+f(1)=\{\bar\theta_1,\theta_1\}$ is that of the odd subspace, then we can form a
+block representation of $M$ in analogy to the matrix form of an operator in quantum mechanics by
+\begin{equation}
+  \int d1\,d2\,\begin{bmatrix}
+    \mathbf e(1)M(1,2)\mathbf e(2)^T
+    &
+    \mathbf e(1)M(1,2)\mathbf f(2)^T
+    \\
+    \mathbf f(1)M(1,2)\mathbf e(2)^T
+    &
+    \mathbf f(1)M(1,2)\mathbf f(2)^T
+  \end{bmatrix}
+  =\begin{bmatrix}
+    A & B \\ C & D
+  \end{bmatrix}
+\end{equation}
+Then the superdeterminant of $M$ is given by
+\begin{equation}
+  \operatorname{sdet}M=\det(A-BD^{-1}C)\det(D)^{-1}
+\end{equation}
+which is the same as the usual formula for the determinant of a block matrix
+save for the inverse of $\det D$.
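+As a sanity check under the simplifying assumption $B=C=0$ (a special case
+chosen here for illustration), the formula reduces to
+\begin{equation}
+  \operatorname{sdet}M=\frac{\det A}{\det D}
+\end{equation}
+so that a block-diagonal supermatrix with $A=D$ has unit superdeterminant,
+whereas the ordinary block determinant would give $(\det A)^2$.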
+
 \section{Complexity of dominant optima in the least-squares problem}
 \label{sec:dominant.complexity}