-rw-r--r--  marginal.bib  42
-rw-r--r--  marginal.tex  81
2 files changed, 120 insertions, 3 deletions
diff --git a/marginal.bib b/marginal.bib
index 9f56ed1..1a5c853 100644
--- a/marginal.bib
+++ b/marginal.bib
@@ -126,6 +126,13 @@
  issn = {1751-8121}
 }

+@misc{Gael_2010_Eigen,
+ author = {Guennebaud, Gaël and Jacob, Benoît and others, },
+ title = {Eigen v3},
+ year = {2010},
+ howpublished = {http://eigen.tuxfamily.org}
+}
+
 @article{Gamarnik_2021_TheA,
  author = {Gamarnik, David},
  title = {The overlap gap property: A topological barrier to optimizing over random structures},
@@ -294,6 +301,17 @@
  doi = {10.1002/cpa.21922}
 }

+@article{Tange_2011_GNU,
+ author = {Tange, Ole},
+ title = {GNU Parallel: The Command-Line Power Tool},
+ journal = {;login: The USENIX Magazine},
+ year = {2011},
+ month = {2},
+ number = {1},
+ volume = {36},
+ pages = {42--47}
+}
+
 @phdthesis{Tublin_2022_A,
  author = {Tublin, Rashel},
  title = {A Few Results in Random Matrix Theory and Random Optimization},
@@ -331,3 +349,27 @@
  eprinttype = {arxiv}
 }

+@article{Wolfe_1969_Convergence,
+ author = {Wolfe, Philip},
+ title = {Convergence Conditions for Ascent Methods},
+ journal = {SIAM Review},
+ publisher = {Society for Industrial & Applied Mathematics (SIAM)},
+ year = {1969},
+ month = {April},
+ number = {2},
+ volume = {11},
+ pages = {226--235},
+ url = {http://dx.doi.org/10.1137/1011036},
+ doi = {10.1137/1011036},
+ issn = {1095-7200}
+}
+
+@book{Zinn-Justin_2002_Quantum,
+ author = {Zinn-Justin, Jean},
+ title = {Quantum field theory and critical phenomena},
+ publisher = {Clarendon Press Oxford University Press},
+ year = {2002},
+ address = {Oxford New York},
+ isbn = {0198509235}
+}
+
diff --git a/marginal.tex b/marginal.tex
index 4afa2c5..6c28e08 100644
--- a/marginal.tex
+++ b/marginal.tex
@@ -448,6 +448,9 @@ Finally, the marginal complexity is defined by evaluating the complexity conditi

 \subsection{General features of saddle point computation}

+Several elements of the computation of the marginal complexity, and indeed the
+ordinary dominant complexity, follow from the formulae of the above section in
+the same way.
 \begin{align}
  \label{eq:delta.grad}
  &\delta\big(\nabla H(\mathbf x_a,\pmb\omega_a)\big)
@@ -1104,9 +1107,15 @@ absolute value sign) that make up the basic Kac--Rice measure, so that we can wr
  \end{aligned}
 \end{equation}
 where we have written $d1=d\theta_1\,d\bar\theta_1$ and $d\pmb\phi=d\mathbf
-x\,d\bar{\pmb\eta}\,d\pmb\eta\,\frac{d\hat{\mathbf x}}{(2\pi)^N}$. Besides some deep connections
-to the physics of BRST, this compact notation dramatically simplifies the
-analytical treatment of the problem. The reason why this simplification is
+x\,d\bar{\pmb\eta}\,d\pmb\eta\,\frac{d\hat{\mathbf x}}{(2\pi)^N}$. Besides some
+deep connections to the physics of BRST, this compact notation dramatically
+simplifies the analytical treatment of the problem. The energy of stationary points can also be fixed using this notation, by writing
+\begin{equation}
+ \int d\pmb\phi\,\frac{d\hat\beta}{2\pi}\,e^{\hat\beta E+\int d1\,(1-\hat\beta\bar\theta_1\theta_1)H(\pmb\phi(1))}
+\end{equation}
+which a small calculation confirms results in the same expression as \eqref{eq:delta.energy}.
+
+The reason why this simplification is
 possible is because there are a large variety of superspace algebraic and
 integral operations with direct corollaries to their ordinary real
 counterparts. For instance, consider a super linear operator $M(1,2)$, which
@@ -1155,6 +1164,72 @@ save for the inverse of $\det D$. The same method can be used to calculate the
 superdeterminant in arbitrary superspaces, where for $\mathbb R^{N|2D}$ each basis has $2^{2D-1}$ elements.
 For instance, for $\mathbb R^{N|4}$ we have $\mathbf e(1,2)=\{1,\bar\theta_1\theta_1,\bar\theta_2\theta_2,\bar\theta_1\theta_2,\bar\theta_2\theta_1,\bar\theta_1\bar\theta_2,\theta_1\theta_2,\bar\theta_1\theta_1\bar\theta_2\theta_2\}$ and $\mathbf f(1,2)=\{\bar\theta_1,\theta_1,\bar\theta_2,\theta_2,\bar\theta_1\theta_1\bar\theta_2,\bar\theta_2\theta_2\theta_1,\bar\theta_1\theta_1\theta_2,\bar\theta_2\theta_2\theta_1\}$.

+\section{BRST symmetry}
+\label{sec:brst}
+
+The superspace representation is also helpful because it can make manifest an
+unusual symmetry in the dominant complexity of minima that would otherwise be
+obfuscated. This arises from considering the Kac--Rice formula as a kind of
+gauge fixing procedure \cite{Zinn-Justin_2002_Quantum}. Around each stationary
+point consider making the coordinate transformation $\mathbf u=\nabla H(\mathbf
+x)$. Then in the absence of fixing the trace, the Kac--Rice measure becomes
+\begin{equation}
+ \int d\nu(\mathbf x,\pmb\omega\mid E)
+ =\int\sum_\sigma d\mathbf u\,\delta(\mathbf u)\,
+ \delta\big(NE-H(\mathbf x_\sigma)\big)
+\end{equation}
+where the sum is over stationary points. This integral has a symmetry of its
+measure of the form $\mathbf u\mapsto\mathbf u+\delta\mathbf u$. Under the
+nonlinear transformation that connects $\mathbf u$ and $\mathbf x$, this
+implies a symmetry of the measure in the Kac--Rice integral of $\mathbf
+x\mapsto\mathbf x+(\operatorname{Hess}H)^{-1}\delta\mathbf u$. This symmetry, while exact, is
+nonlinear and difficult to work with.
+
+When the absolute value sign has been dropped and Grassmann vectors introduced,
+this symmetry can be simplified considerably. Due to the expansion properties
+of Grassmann integrals, any appearance of $-\bar{\pmb\eta}\pmb\eta^T$ in the
+integrand resolves to $(\operatorname{Hess}H)^{-1}$. The
+symmetry of the measure can then be written
+\begin{equation}
+ \mathbf x\mapsto \mathbf x-\bar{\pmb\eta}\pmb\eta^T\delta\mathbf u
+ =\mathbf x+\bar{\pmb\eta}\delta\epsilon
+\end{equation}
+where $\delta\epsilon=-\pmb\eta^T\delta\mathbf u$ is a Grassmann number. This
+establishes that $\delta\mathbf x=\bar{\pmb\eta}\delta\epsilon$, now linear. The rest of
+the transformation can be built by requiring that the action is invariant after
+expansion in $\delta\epsilon$. Ignoring for a moment the piece of the measure
+fixing the trace of the Hessian, this gives
+\begin{align}
+ \delta\mathbf x=\bar{\pmb\eta}\,\delta\epsilon &&
+ \delta\hat{\mathbf x}=-i\hat\beta\bar{\pmb\eta}\,\delta\epsilon &&
+ \delta\pmb\eta=-i\hat{\mathbf x}\,\delta\epsilon &&
+ \delta\bar{\pmb\eta}=0
+\end{align}
+so that the differential form of the symmetry is
+\begin{equation}
+ \mathcal D=\bar{\pmb\eta}\frac\partial{\partial\mathbf x}
+ -i\hat\beta\bar{\pmb\eta}\frac\partial{\partial\hat{\mathbf x}}
+ -i\hat{\mathbf x}\frac\partial{\partial\pmb\eta}
+\end{equation}
+The Ward identities associated with this symmetry give rise to relationships among the order parameters. These identities are
+\begin{align}
+ 0=\frac1N\mathcal D\langle\mathbf x_a^T\pmb\eta_b\rangle
+ =\frac1N\left[
+ \langle\bar{\pmb\eta}_a^T\pmb\eta_b\rangle-
+ i\langle\mathbf x_a^T\hat{\mathbf x}_b\rangle
+ \right]
+ =G_{ab}+R_{ab} \\
+ 0=\frac iN\mathcal D\langle\hat{\mathbf x}_a^T\pmb\eta_b\rangle
+ =\frac1N\left[
+ \hat\beta\langle\bar{\pmb\eta}_a^T\pmb\eta_b\rangle
+ +\langle\hat{\mathbf x}_a^T\hat{\mathbf x}_b\rangle
+ \right]
+ =\hat\beta G_{ab}+D_{ab}
+\end{align}
+These identities establish $G_{ab}=-R_{ab}$ and $D_{ab}=\hat\beta R_{ab}$,
+allowing elimination of the matrices $G$ and $D$ in favor of $R$. Fixing the
+trace to $\mu$ explicitly breaks this symmetry, and the simplification is lost.
+
 \section{Complexity of dominant optima in the least-squares problem}
 \label{sec:dominant.complexity}
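
A possible worked step for the Ward identities in the new BRST section, offered as a sketch and not as the author's derivation. It assumes the order parameters are the ones implied by the final equality of each identity in the diff, namely $G_{ab}=\frac1N\langle\bar{\pmb\eta}_a^T\pmb\eta_b\rangle$, $R_{ab}=-\frac iN\langle\mathbf x_a^T\hat{\mathbf x}_b\rangle$ and $D_{ab}=\frac1N\langle\hat{\mathbf x}_a^T\hat{\mathbf x}_b\rangle$. These definitions are read off from the patch, not quoted from the rest of marginal.tex, and replica indices on $\mathcal D$ are suppressed as in the text.

```latex
% Sketch only: G, R and D are inferred from the last equalities of the
% Ward identities in the diff, not taken from their definitions in the paper.
\begin{align}
  \mathcal D\big(\mathbf x_a^T\pmb\eta_b\big)
    &=\bar{\pmb\eta}_a^T\pmb\eta_b-i\,\mathbf x_a^T\hat{\mathbf x}_b, \\
  i\,\mathcal D\big(\hat{\mathbf x}_a^T\pmb\eta_b\big)
    &=i\big(-i\hat\beta\,\bar{\pmb\eta}_a^T\pmb\eta_b
            -i\,\hat{\mathbf x}_a^T\hat{\mathbf x}_b\big)
     =\hat\beta\,\bar{\pmb\eta}_a^T\pmb\eta_b
      +\hat{\mathbf x}_a^T\hat{\mathbf x}_b, \\
  G_{ab}+R_{ab}=0
    &\implies G_{ab}=-R_{ab}, \\
  \hat\beta G_{ab}+D_{ab}=0
    &\implies D_{ab}=-\hat\beta G_{ab}=\hat\beta R_{ab}.
\end{align}
```

The first two lines apply the three terms of $\mathcal D$ one at a time: the $\partial/\partial\mathbf x$ term replaces $\mathbf x$ by $\bar{\pmb\eta}$, the $\partial/\partial\hat{\mathbf x}$ term replaces $\hat{\mathbf x}$ by $-i\hat\beta\bar{\pmb\eta}$, and the $\partial/\partial\pmb\eta$ term replaces $\pmb\eta$ by $-i\hat{\mathbf x}$. Taking the average, dividing by $N$ and substituting the assumed definitions gives the two relations stated in the section.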