author | Jaron Kent-Dobias <jaron@kent-dobias.com> | 2024-01-18 17:42:43 +0100 |
---|---|---|
committer | Jaron Kent-Dobias <jaron@kent-dobias.com> | 2024-01-18 17:42:43 +0100 |
commit | 209712a2b2c2ffdc420d3994a18e202a403e9d3f (patch) | |
tree | 069ed5ef8dfb5a8a9fccadf34d1bdf667164ab4c /marginal.tex | |
parent | 9e7a47e1e28bbacc4a05cc1328ac073e867fafb3 (diff) | |
Some introduction writing.
Diffstat (limited to 'marginal.tex')
-rw-r--r-- | marginal.tex | 84 |
1 files changed, 58 insertions, 26 deletions
diff --git a/marginal.tex b/marginal.tex
index f06bc04..8c7cf0e 100644
--- a/marginal.tex
+++ b/marginal.tex
@@ -37,6 +37,62 @@
 %\begin{abstract}
 %\end{abstract}
 
+\section{Introduction}
+
+Systems with rugged landscapes are important across many disciplines, from the
+physics to glasses and spin-glasses to the statistical inference problems. The
+behavior of these systems is best understood when equilibrium or optimal
+solutions are studied and averages can be taken statically over all possible
+configurations. However, such systems are also infamous for their tendency to
+defy equilibrium and optimal expectations in practice, due to the presence of
+dynamic transitions or crossovers that leave physical or algorithmic dynamics
+stuck exploring only a subset of configurations.
+
+In some simple models of such landscapes, it was recently found that marginal
+minima are significant as the attractors of gradient descent dynamics
+\cite{Folena_2020_Rethinking, Folena_2023_On}. This extends to more novel
+algorithms, like message passing \cite{} \textbf{Find out if this is true}.
+\textbf{Think of other examples.}
+While it is still not known how to predict which marginal minima will be
+attractors, this ubiquity of behavior suggests that cartography of marginal
+minima is a useful step in bounding out-of-equilibrium dynamical behavior.
+
+In the traditional methods for analyzing the geometric structure of rugged
+landscapes, it is not necessarily straightforward to condition an analysis on
+the marginality of minima. Using the method of a Legendre transformation of the
+Parisi parameter corresponding to a set of real replicas, one can force the
+result to be marginal by restricting the value of that parameter, but this
+results in only the marginal minima at the energy level at which they are the
+majority of stationary points \cite{Monasson_1995_Structural}. It is now
+understood that out-of-equilibrium dynamics usually goes to marginal minima at
+other energy levels \cite{Folena_2023_On}.
+
+The alternative, used to great success in the spherical models, is to start by
+making a detailing understanding of the Hessian matrix at stationary points.
+Then, one can condition the analysis on whatever properties of the Hessian are
+necessary to lead to marginal minima. This strategy is so successful in the
+spherical models because it is very straightforward to implement: a natural
+parameter in the analysis of these models linearly shifts the spectrum of the
+Hessian, and so fixing this parameter by whatever means naturally allows one to
+require that the Hessian spectrum have a pseudogap.
+Unfortunately this strategy is less straightforward to generalize. Many models
+of interest, especially in inference problems, have Hessian statistics that are
+poorly understood.
+
+Here, we introduce a generic method for conditioning the statistics of
+stationary points on their marginality. The technique makes use of a novel way
+to condition an integral over parameters to select only those that result in a
+certain value of the smallest eigenvalue of a matrix that is a function of
+those parameters. By requiring that the smallest eigenvalue of the Hessian at
+stationary points be zero, we restrict to marginal minima, either those with a
+pseudogap in their bulk spectrum or those with outlying eigenvectors. We
+provide a heuristic to distinguish these two cases. We demonstrate the method
+on the spherical models, where it is unnecessary but instructive, and on
+extensions of the spherical models with non-GOE Hessians where the technique is
+more useful.
+
+\section{How to condition on the smallest eigenvalue}
+
 An arbitrary function $g$ of the minimum eigenvalue of a matrix $A$ can be expressed as
 \begin{equation}
   g(\lambda_\textrm{min}(A))
@@ -51,7 +107,7 @@ associated with the minimum eigenvalue. By definition,
 $x_\mathrm{min}(A)^TAx_\mathrm{min}(A)=x_\mathrm{min}(A)^Tx_\mathrm{min}(A)\lambda_\mathrm{min}(A)=N\lambda_\mathrm{min}(A)$
 assuming the normalization is $\|x_\mathrm{min}(A)\|^2=N$. The second equality
 extends a technique first introduced in \cite{Ikeda_2023_Bose-Einstein-like}
-and used in \cite{me}. A Boltzmann distribution is introduced over a spherical
+and used in \cite{Kent-Dobias_2024_Arrangement}. A Boltzmann distribution is introduced over a spherical
 model whose Hamiltonian is quadratic with interaction matrix given by $A$. In
 the limit of zero temperature, the measure will concentrate on the ground states
 of the model, which correspond with the eigenvectors $\pm x_\mathrm{min}$
@@ -216,7 +272,7 @@ ensure that the integrals are constrained to the tangent space of the configurat
   \end{aligned}
 \end{equation}
 
-\section{Spherical model}
+\section{Application to the spherical models}
 
 \begin{align}
   C_{ab}=\frac1N\mathbf s_a\cdot\mathbf s_b
@@ -265,30 +321,6 @@ We will discuss at the end of this paper when these order parameters can be expe
 \end{equation}
 where the maximum over $\omega$ needs to lie at a real value.
 
-\section{Superfield formalism}
-
-\begin{equation}
-  \pmb\phi_a(1)=\pmb\sigma_a+\bar\theta(1)\pmb\eta_a+\bar{\pmb\eta}_a\theta(1)+\hat{\pmb\sigma}_a\bar\theta(1)\theta(1)
-\end{equation}
-\begin{equation}
-  \pmb\xi_b(1)=\pmb\sigma_1+\mathbf x_b\bar\vartheta(1)\theta(1)+\mathbf x_b\bar\theta(1)\vartheta(1)
-\end{equation}
-\begin{equation}
-  \int d\theta\,d\bar\theta\,\left[
-    (1+\hat\beta\bar\theta\theta)H(\pmb\phi)
-    +\int d\vartheta\,d\bar\vartheta\,H(\pmb\xi)
-  \right]
-  =\hat{\pmb\sigma}^T\partial H(\pmb\sigma)
-  +\pmb\eta^T\partial\partial H(\pmb\sigma)\pmb\eta
-  +\beta\mathbf x^T\partial\partial H(\pmb\sigma)\mathbf x
-  +\hat\beta H(\pmb\sigma)
-\end{equation}
-\begin{equation}
-  \int d1\,d2\,(1+\hat\beta\bar\theta(1)\theta(1))(1+\hat\beta\bar\theta(2)\theta(2))
-  f\left(\frac{\pmb\phi_a(1)\cdot\pmb\phi_b(2)}N\right)
-  +\int d1\,(1+\hat\beta\bar\theta(1)\theta(1))f\left(\frac{\pmb\phi_a(1)\cdot\pmb\xi_b(2)}N\right)
-\end{equation}
-
 \section{Twin spherical model}
 
 $\Omega=S^{N-1}\times S^{N-1}$