author Jaron Kent-Dobias <jaron@kent-dobias.com> 2024-01-18 17:42:43 +0100
committer Jaron Kent-Dobias <jaron@kent-dobias.com> 2024-01-18 17:42:43 +0100
commit 209712a2b2c2ffdc420d3994a18e202a403e9d3f
tree 069ed5ef8dfb5a8a9fccadf34d1bdf667164ab4c /marginal.tex
parent 9e7a47e1e28bbacc4a05cc1328ac073e867fafb3
Some introduction writing.
Diffstat (limited to 'marginal.tex')
-rw-r--r-- marginal.tex 84
1 files changed, 58 insertions, 26 deletions
diff --git a/marginal.tex b/marginal.tex
index f06bc04..8c7cf0e 100644
--- a/marginal.tex
+++ b/marginal.tex
@@ -37,6 +37,62 @@
%\begin{abstract}
%\end{abstract}
+\section{Introduction}
+
+Systems with rugged landscapes are important across many disciplines, from the
+physics of glasses and spin glasses to statistical inference problems. The
+behavior of these systems is best understood when equilibrium or optimal
+solutions are studied and averages can be taken statically over all possible
+configurations. However, such systems are also infamous for their tendency to
+defy equilibrium and optimal expectations in practice, due to the presence of
+dynamic transitions or crossovers that leave physical or algorithmic dynamics
+stuck exploring only a subset of configurations.
+
+In some simple models of such landscapes, it was recently found that marginal
+minima are significant as the attractors of gradient descent dynamics
+\cite{Folena_2020_Rethinking, Folena_2023_On}. This extends to more novel
+algorithms, like message passing \cite{} \textbf{Find out if this is true}.
+\textbf{Think of other examples.}
+While it is still not known how to predict which marginal minima will be
+attractors, this ubiquity of behavior suggests that cartography of marginal
+minima is a useful step in bounding out-of-equilibrium dynamical behavior.
+
+In the traditional methods for analyzing the geometric structure of rugged
+landscapes, it is not necessarily straightforward to condition an analysis on
+the marginality of minima. Using the method of a Legendre transformation of the
+Parisi parameter corresponding to a set of real replicas, one can force the
+result to be marginal by restricting the value of that parameter, but this
+results in only the marginal minima at the energy level at which they are the
+majority of stationary points \cite{Monasson_1995_Structural}. It is now
+understood that out-of-equilibrium dynamics usually goes to marginal minima at
+other energy levels \cite{Folena_2023_On}.
+
+The alternative, used to great success in the spherical models, is to start by
+developing a detailed understanding of the Hessian matrix at stationary points.
+Then, one can condition the analysis on whatever properties of the Hessian are
+necessary to lead to marginal minima. This strategy is so successful in the
+spherical models because it is very straightforward to implement: a natural
+parameter in the analysis of these models linearly shifts the spectrum of the
+Hessian, and so fixing this parameter by whatever means naturally allows one to
+require that the Hessian spectrum have a pseudogap.
+Unfortunately, this strategy is less straightforward to generalize. Many models
+of interest, especially in inference problems, have Hessian statistics that are
+poorly understood.
+
+Here, we introduce a generic method for conditioning the statistics of
+stationary points on their marginality. The technique makes use of a novel way
+to condition an integral over parameters to select only those that result in a
+certain value of the smallest eigenvalue of a matrix that is a function of
+those parameters. By requiring that the smallest eigenvalue of the Hessian at
+stationary points be zero, we restrict to marginal minima, either those with a
+pseudogap in their bulk spectrum or those with outlying eigenvectors. We
+provide a heuristic to distinguish these two cases. We demonstrate the method
+on the spherical models, where it is unnecessary but instructive, and on
+extensions of the spherical models with non-GOE Hessians where the technique is
+more useful.
+
+\section{How to condition on the smallest eigenvalue}
+
An arbitrary function $g$ of the minimum eigenvalue of a matrix $A$ can be expressed as
\begin{equation}
g(\lambda_\textrm{min}(A))
@@ -51,7 +107,7 @@ associated with the minimum eigenvalue. By definition,
$x_\mathrm{min}(A)^TAx_\mathrm{min}(A)=x_\mathrm{min}(A)^Tx_\mathrm{min}(A)\lambda_\mathrm{min}(A)=N\lambda_\mathrm{min}(A)$
assuming the normalization is $\|x_\mathrm{min}(A)\|^2=N$. The second equality
extends a technique first introduced in \cite{Ikeda_2023_Bose-Einstein-like}
-and used in \cite{me}. A Boltzmann distribution is introduced over a spherical
+and used in \cite{Kent-Dobias_2024_Arrangement}. A Boltzmann distribution is introduced over a spherical
model whose Hamiltonian is quadratic with interaction matrix given by $A$. In
the limit of zero temperature, the measure will concentrate on the ground
states of the model, which correspond with the eigenvectors $\pm x_\mathrm{min}$
@@ -216,7 +272,7 @@ ensure that the integrals are constrained to the tangent space of the configurat
\end{aligned}
\end{equation}
-\section{Spherical model}
+\section{Application to the spherical models}
\begin{align}
C_{ab}=\frac1N\mathbf s_a\cdot\mathbf s_b
@@ -265,30 +321,6 @@ We will discuss at the end of this paper when these order parameters can be expe
\end{equation}
where the maximum over $\omega$ needs to lie at a real value.
-\section{Superfield formalism}
-
-\begin{equation}
- \pmb\phi_a(1)=\pmb\sigma_a+\bar\theta(1)\pmb\eta_a+\bar{\pmb\eta}_a\theta(1)+\hat{\pmb\sigma}_a\bar\theta(1)\theta(1)
-\end{equation}
-\begin{equation}
- \pmb\xi_b(1)=\pmb\sigma_1+\mathbf x_b\bar\vartheta(1)\theta(1)+\mathbf x_b\bar\theta(1)\vartheta(1)
-\end{equation}
-\begin{equation}
- \int d\theta\,d\bar\theta\,\left[
- (1+\hat\beta\bar\theta\theta)H(\pmb\phi)
- +\int d\vartheta\,d\bar\vartheta\,H(\pmb\xi)
- \right]
- =\hat{\pmb\sigma}^T\partial H(\pmb\sigma)
- +\pmb\eta^T\partial\partial H(\pmb\sigma)\pmb\eta
- +\beta\mathbf x^T\partial\partial H(\pmb\sigma)\mathbf x
- +\hat\beta H(\pmb\sigma)
-\end{equation}
-\begin{equation}
- \int d1\,d2\,(1+\hat\beta\bar\theta(1)\theta(1))(1+\hat\beta\bar\theta(2)\theta(2))
- f\left(\frac{\pmb\phi_a(1)\cdot\pmb\phi_b(2)}N\right)
- +\int d1\,(1+\hat\beta\bar\theta(1)\theta(1))f\left(\frac{\pmb\phi_a(1)\cdot\pmb\xi_b(2)}N\right)
-\end{equation}
-
\section{Twin spherical model}
$\Omega=S^{N-1}\times S^{N-1}$