Added response to reviewers.

author: Jaron Kent-Dobias <jaron@kent-dobias.com> 2024-10-29 15:09:11 +0100
committer: Jaron Kent-Dobias <jaron@kent-dobias.com> 2024-10-29 15:09:11 +0100
commit: 1a2018495c5ef8d2ad84a496ede7b8cbab486a15 (patch)
tree: 713f68602c5fbcbced1993410d3cdcced945bae2
parent: 77cf86b193f24630890990e105fe40730d353fd0 (diff)
download: marginal-1a2018495c5ef8d2ad84a496ede7b8cbab486a15.tar.gz
marginal-1a2018495c5ef8d2ad84a496ede7b8cbab486a15.tar.bz2
marginal-1a2018495c5ef8d2ad84a496ede7b8cbab486a15.zip
1 files changed, 153 insertions, 0 deletions
diff --git a/response.txt b/response.txt
new file mode 100644
index 0000000..e072f5d
--- /dev/null
+++ b/response.txt
@@ -0,0 +1,153 @@
+I thank the referees for their useful feedback, which led to positive changes
+to the manuscript. All changes made since the first submission can be found
+highlighted in an attached PDF generated by latexdiff.
+
+In addition to the changes made in response to referee comments detailed below,
+there were three other changes made to the resubmitted manuscript:
+
+ - References to a "companion paper" were changed to a "related work", since
+   the two papers are not being considered as companions.
+
+ - Soon after submission I identified a mistake in Appendix A regarding the
+   matrix form of a super linear operator. This mistake did not affect any of
+   the formulae or results of the rest of the manuscript, but has nevertheless
+   been amended.
+
+ - I found and repaired a spelling mistake in the paragraph after what is now
+   equation 67.
+
+Report of the First Referee:
+> 1) Terms such as "marginal minima" and "pseudogap" are used without clear
+> definitions. These terms refers to different concepts depending on various
+> fields, which can yield meaningless confusions. Provide clear definitions for
+> these technical terms when they appear at the first time.
+
+The text of the second paragraph of the introduction has been expanded to more
+precisely define the terms "marginal minima" and "pseudogap". Its final
+sentences now read:
+
+  The level set associated with this threshold energy contains mostly
+  \emph{marginal minima}, or minima whose Hessian matrix have a continuous
+  spectral density over all sufficiently small positive eigenvalues. In most
+  circumstances the spectrum is \emph{pseudogapped}, which means that the
+  spectral density smoothly approaches zero as zero eigenvalue is approached
+  from above.
+
+If this level of definition is not sufficiently clear to the reviewer, or if
+there are further terms I have neglected, I welcome further comment on the
+matter.
+
+> 2) In eqs (23)-(25), I could not figure out why both notations of L(x,w) and
+> H(x, w) are used. If the two notations refer to the identical quantity, it
+> should be unified. Otherwise, their difference should be explained.
+
+This confusion stems from a notational ambiguity. The domain of H and the
+domain of ∇H are not the same, and writing ∇H(x, ω) is not meant to imply the
+existence of a function H(x, ω). I have expanded the text around equations (24)
+and (25) in an attempt to clarify this point.
+
+> 3) At the first reading, I am confused with eq. (38). The author writes
+> "This is because the trace of $\partial \partial H$ is typically an order of
+> $N$ smaller than trace of $\partial \partial g_i". This would be true for
+> Hamiltonian of eq. (45). However, does it hold for sums of squared random
+> functions such as eq. (71)? Let us consider a trivial case
+> $V_i(x) = r_i \cdot x/\sqrt{N}$, where $r_i$ is a random vector from
+> $N(0, I_{N\times N})$. This makes eq. (71) a quadratic form of a negative
+> definite matrix, for which its trace of Hessian scale as $O(N)$.  This may be
+> an exceptional case. However, statements such as the above before showing
+> concrete target systems can confuse readers. I would like to ask the author
+> to amend the writing.
+
+I thank the referee for catching this mistake. Fortunately, its effect on the
+manuscript was minor, because correctly accounting for cases like the referee
+describes results in only a constant correction to μ. Since only the relative
+value of μ is important for identifying marginal minima, the marginal
+complexity calculated while neglecting it is still correct, as in the model
+examined in the "related work" arXiv:2407.02092 which has such a linear term.
+
+I have changed the text of the manuscript and several equations to correct this
+mistake. This can be seen in the vicinity of equations (using the new
+manuscript's numbering) 38/39, after equation 46, after equation 59, and after
+equation 75. In Sections IV.C and D this leads to changes in display math that
+were not captured by the latexdiff, in equations 76, 85, 86, D2, D6, D8, and
+D11, all consisting of replacing μ with μ + f'(0).
+
+Report of the Second Referee:
+> (1) The first two examples (spherical spin glasses and multi-spherical spin
+> glasses) exhibit the property that the complexity of marginal states splits into
+> two contributions: the “unconstrained” complexity and a large deviation function
+> associated with the smallest eigenvalue of the Hessian. In the text it is
+> claimed that this behavior follows from the Gaussian nature of the Hessian. Is
+> this statement general? If one constructs models whose Hessians are not
+> invariant—for example, with an entry-dependent variance pattern—can one still
+> expect this statement to hold?
+
+This question is an astute one, and I cannot speak to whether Gaussianity alone
+is a sufficient condition for the separation of the action. Positing properties
+of the Hessian is not enough for reasoning about this, since the key question
+is how correlations between the Hessian, gradient, and energy compare in
+magnitude with their self-correlations. So, one would need to construct an
+ensemble of random functions whose Hessian has such a property to begin
+addressing this.
+
+Rather than venture into this probably rich research line, I have simply
+clarified in the text after what is now equation 53 that this is characteristic
+of isotropic and Gaussian random functions.
+
+> (2)  It appears that Eq. (66) and its zero-temperature limit, when evaluated at
+> the saddle point, provide a parametrization of the large deviation functions for
+> the smallest Hessian eigenvalue, analogous to Eq. (52) for the GOE case. Is
+> there any way to express this large deviation function more transparently, or in
+> a form that makes the limit  ϵ→0 easier to read?
+
+Unfortunately I have not found a way to nicely express such a thing. The
+zero-temperature limit of what is now equation 67 is a much more unwieldy
+expression than equation 67 itself, and is not appropriate for inclusion in a
+manuscript, let alone a source of intuitive insight. Though the referee's
+suggestion of a reduction in the ε→0 limit does exist, it involves the
+nontrivial coordination of limiting saddle-point values in the variables making
+up the expression.
+
+> (3) Maybe the author can comment on the relation between his approach and the
+> methods developed in the past to track marginal minima (mostly in the sense of
+> an isolated eigenvalue of the Hessian rather than pseudogapped), such as:
+> 
+> Marginal states in mean-field glasses
+> Markus Müller, Luca Leuzzi, and Andrea Crisanti
+> PHYSICAL REVIEW B 74, 134431 2006
+
+I have added a final paragraph to the conclusion discussing the relationship
+between these two papers. It reads:
+
+  The title of our paper and that of \citeauthor{Muller_2006_Marginal} suggest
+  they address the same topic, but this is not the case
+  \cite{Muller_2006_Marginal}. That work differs in three important and
+  fundamental ways. First, it describes minima of the TAP free energy and
+  involves peculiarities specific to the TAP. Second, it describes dominant
+  minima which happen to be marginal, not a condition for finding subdominant
+  marginal minima. Finally, it focuses on minima with a single soft direction
+  (which are the typical minima of the low temperature Sherrington--Kirkpatrick
+  TAP free energy), while we aim to avoid such minima in favor of ones that
+  have a pseudogap (which we argue are relevant to out-of-equilibrium
+  dynamics). The fact that the typical minima studied by
+  \citeauthor{Muller_2006_Marginal} are not marginal in this latter sense may
+  provide an intuitive explanation for the seeming discrepancy between the
+  proof that the low-energy Sherrington--Kirkpatrick model cannot be sampled
+  \cite{ElAlaoui_2022_Sampling} and the proof that a message passing algorithm
+  can find near-ground states \cite{Montanari_2021_Optimization}: the algorithm
+  finds the atypical low-lying states that are marginal in the sense considered
+  here but cannot find the typical ones considered by
+  \citeauthor{Muller_2006_Marginal}.
+
+> (4) When introducing the method around Eq. (1), it should be stated that this
+> works for symmetric matrices A.
+
+The text now reflects this.
+
+> Typos:
+> 
+> Eq (A8): integral should be over d2
+> Page 11, last line second column: “minima can dynamics” —> a verb is missing
+> here
+
+These small mistakes have been fixed in the new manuscript.