author | Jaron Kent-Dobias <jaron@kent-dobias.com> | 2024-10-29 15:09:11 +0100 |
---|---|---|
committer | Jaron Kent-Dobias <jaron@kent-dobias.com> | 2024-10-29 15:09:11 +0100 |
commit | 1a2018495c5ef8d2ad84a496ede7b8cbab486a15 (patch) | |
tree | 713f68602c5fbcbced1993410d3cdcced945bae2 | |
parent | 77cf86b193f24630890990e105fe40730d353fd0 (diff) | |
-rw-r--r-- | response.txt | 153 |
1 files changed, 153 insertions, 0 deletions
diff --git a/response.txt b/response.txt
new file mode 100644
index 0000000..e072f5d
--- /dev/null
+++ b/response.txt
@@ -0,0 +1,153 @@
+I thank the referees for their useful feedback, which led to positive changes
+to the manuscript. All changes made since the first submission can be found
+highlighted in an attached PDF generated by latexdiff.
+
+In addition to the changes made in response to referee comments detailed below,
+there were three other changes made to the resubmitted manuscript:
+
+ - References to a "companion paper" were changed to a "related work", since
+   the two papers are not being considered as companions.
+
+ - Soon after submission I identified a mistake in Appendix A regarding the
+   matrix form of a super linear operator. This mistake did not affect any of
+   the formulae or results of the rest of the manuscript, but has nevertheless
+   been amended.
+
+ - I found and repaired a spelling mistake in the paragraph after what is now
+   equation 67.
+
+Report of the First Referee:
+> 1) Terms such as "marginal minima" and "pseudogap" are used without clear
+> definitions. These terms refers to different concepts depending on various
+> fields, which can yield meaningless confusions. Provide clear definitions for
+> these technical terms when they appear at the first time.
+
+The text of the second paragraph of the introduction has been expanded to more
+precisely define the terms "marginal minima" and "pseudogap". Its final
+sentences now read:
+
+  The level set associated with this threshold energy contains mostly
+  \emph{marginal minima}, or minima whose Hessian matrix has a continuous
+  spectral density over all sufficiently small positive eigenvalues. In most
+  circumstances the spectrum is \emph{pseudogapped}, which means that the
+  spectral density smoothly approaches zero as zero eigenvalue is approached
+  from above.
+
+If this level of definition is not sufficiently clear to the reviewer, or if
+there are further terms I have neglected, I welcome further comment on the
+matter.
+
+> 2) In eqs (23)-(25), I could not figure out why both notations of L(x,w) and
+> H(x, w) are used. If the two notations refer to the identical quantity, it
+> should be unified. Otherwise, their difference should be explained.
+
+This confusion stems from a notational ambiguity. The domain of H and the
+domain of ∇H are not the same, and writing ∇H(x, ω) is not meant to imply the
+existence of a function H(x, ω). I have expanded the text around equations (24)
+and (25) in an attempt to clarify this point.
+
+> 3) At the first reading, I am confused with eq. (38). The author writes
+> "This is because the trace of $\partial \partial H$ is typically an order of
+> $N$ smaller than trace of $\partial \partial g_i". This would be true for
+> Hamiltonian of eq. (45). However, does it hold for sums of squared random
+> functions such as eq. (71)? Let us consider a trivial case
+> $V_i(x) = r_i \cdot x/\sqrt{N}$, where $r_i$ is a random vector from
+> $N(0, I_{N\times N})$. This makes eq. (71) a quadratic form of a negative
+> definite matrix, for which its trace of Hessian scale as $O(N)$. This may be
+> an exceptional case. However, statements such as the above before showing
+> concrete target systems can confuse readers. I would like to ask the author
+> to amend the writing.
+
+I thank the referee for catching this mistake. Fortunately, its effect on the
+manuscript was minor, because correctly accounting for cases like the referee
+describes results in only a constant correction to μ. Since only the relative
+value of μ is important for identifying marginal minima, the marginal
+complexity calculated while neglecting it is still correct, as in the model
+examined in the "related work" arXiv:2407.02092 which has such a linear term.
+
+I have changed the text of the manuscript and several equations to correct this
+mistake. This can be seen in the vicinity of equations (using the new
+manuscript's numbering) 38/39, after equation 46, after equation 59, and after
+equation 75. In Sections IV.C and D this leads to changes in display math that
+were not captured by the latexdiff, in equations 76, 85, 86, D2, D6, D8, and
+D11, all consisting of replacing μ with μ + f'(0).
+
+Report of the Second Referee:
+> (1) The first two examples (spherical spin glasses and multi-spherical spin
+> glasses) exhibit the property that the complexity of marginal states splits into
+> two contributions: the “unconstrained” complexity and a large deviation function
+> associated with the smallest eigenvalue of the Hessian. In the text it is
+> claimed that this behavior follows from the Gaussian nature of the Hessian. Is
+> this statement general? If one constructs models whose Hessians are not
+> invariant—for example, with an entry-dependent variance pattern—can one still
+> expect this statement to hold?
+
+This question is an astute one, and I cannot speak to whether Gaussianity alone
+is a sufficient condition for the separation of the action. Positing properties
+of the Hessian is not enough for reasoning about this, since the key question
+is how correlations between the Hessian, gradient, and energy compare in
+magnitude with their self-correlations. So, one would need to construct an
+ensemble of random functions whose Hessian has such a property to begin
+addressing this.
+
+Rather than venture into this probably rich research line, I have simply
+clarified in the text after what is now equation 53 that this is characteristic
+of isotropic and Gaussian random functions.
+
+> (2) It appears that Eq. (66) and its zero-temperature limit, when evaluated at
+> the saddle point, provide a parametrization of the large deviation functions for
+> the smallest Hessian eigenvalue, analogous to Eq. (52) for the GOE case. Is
+> there any way to express this large deviation function more transparently, or in
+> a form that makes the limit ϵ→0 easier to read?
+
+Unfortunately I have not found a way to nicely express such a thing. The
+zero-temperature limit of what is now equation 67 is a much more unwieldy
+expression than equation 67 itself, and is not appropriate for inclusion in a
+manuscript, let alone a source of intuitive insight. Though the referee's
+suggestion of a reduction in the ε→0 limit does exist, it involves the
+nontrivial coordination of limiting saddle-point values in the variables making
+up the expression.
+
+> (3) Maybe the author can comment on the relation between his approach and the
+> methods developed in the past to track marginal minima (mostly in the sense of
+> an isolated eigenvalue of the Hessian rather than pseudogapped), such as:
+>
+> Marginal states in mean-field glasses
+> Markus Müller, Luca Leuzzi, and Andrea Crisanti
+> PHYSICAL REVIEW B 74, 134431 2006
+
+I have added a final paragraph to the conclusion discussing the relationship
+between these two papers. It reads:
+
+  The title of our paper and that of \citeauthor{Muller_2006_Marginal} suggest
+  they address the same topic, but this is not the case
+  \cite{Muller_2006_Marginal}. That work differs in three important and
+  fundamental ways. First, it describes minima of the TAP free energy and
+  involves peculiarities specific to the TAP. Second, it describes dominant
+  minima which happen to be marginal, not a condition for finding subdominant
+  marginal minima. Finally, it focuses on minima with a single soft direction
+  (which are the typical minima of the low temperature Sherrington--Kirkpatrick
+  TAP free energy), while we aim to avoid such minima in favor of ones that
+  have a pseudogap (which we argue are relevant to out-of-equilibrium
+  dynamics). The fact that the typical minima studied by
+  \citeauthor{Muller_2006_Marginal} are not marginal in this latter sense may
+  provide an intuitive explanation for the seeming discrepancy between the
+  proof that the low-energy Sherrington--Kirkpatrick model cannot be sampled
+  \cite{ElAlaoui_2022_Sampling} and the proof that a message passing algorithm
+  can find near-ground states \cite{Montanari_2021_Optimization}: the algorithm
+  finds the atypical low-lying states that are marginal in the sense considered
+  here but cannot find the typical ones considered by
+  \citeauthor{Muller_2006_Marginal}.
+
+> (4) When introducing the method around Eq. (1), it should be stated that this
+> works for symmetric matrices A.
+
+The text now reflects this.
+
+> Typos:
+>
+> Eq (A8): integral should be over d2
+> Page 11, last line second column: “minima can dynamics” —> a verb is missing
+> here
+
+These small mistakes have been fixed in the new manuscript.