author: Jaron Kent-Dobias <jaron@kent-dobias.com>, 2023-12-05 09:11:03 +0100
committer: Jaron Kent-Dobias <jaron@kent-dobias.com>, 2023-12-05 09:11:03 +0100
commit: 9e07af8c6d5a4b983c515b839b1a899a6089b99f
tree: bb90293a133d964d6ab51e58462c7eee22eeda20
parent: f3c0e82cffe808deca34801eee07513c2d45a90d
Amended response. (ref: scipost)
-rw-r--r-- response.md | 334
1 files changed, 80 insertions, 254 deletions
diff --git a/response.md b/response.md
index 5cbbf0a..740531a 100644
--- a/response.md
+++ b/response.md
@@ -1,307 +1,133 @@
For reviewer 2
-> On page 9, the author points out that there are solutions with complexity 0
-> that do not show an extensive barrier "in any situation". First, this "in any
-> situation" is quite unclear. Does the author mean above and below the
-> threshold energy? Does this solution exist even at high energy?
-> Can the author comment on what this solution could imply?
-
-The sense of "any situation" means "for a reference point of any energy and
-stability." This includes energies above and below the threshold energy, at
-stabilities that imply saddles, minima, or marginal minima, and even for
-combinations of energy and stability where the complexity of stationary points
-is negative.
-
-The two-point complexity is computed under the condition that the reference
-point exists. Given that the reference point exists, there is at least one
-point that can be found at zero overlap with the reference: itself. This
-reasoning alone rationalizes why we should find a solution with Σ₁₂ = 0, q = 0,
-and E₀ = E₁, μ₀ = μ₁ for any E₀ and μ₀.
-
-The interpretation is more subtle. Two different stationary points cannot lie
-at the same point, but the complexity calculation only resolves numbers of
-points that are exponential in N and differences in overlap that are linear in
-N. Therefore, the complexity calculation is compatible with many stationary
-points being contained in the subextensive region of dimension Δq = O(1/N)
-around any reference point. We can reason as to where these extremely near
-neighbors are likely or unlikely to exist in specific conditions, but the
-complexity calculation cannot rule them out.
-
-This is point is not crucial to anything in the paper, except to make more
-precise the statement that non-threshold marginal minima are separated by a gap
-in their overlap. Because marginal minima have very flat directions, they are
-good candidates for possessing these extremely near neighbors, and this might
-lead one to say they are not isolated. However, if such extremely near
-neighbors exist, they are irrelevant to dynamics: the entire group is isolated,
-since the complexity of similar stationary points at a small but extensive
-overlap further is negative.
-
-Because the point is not important to the conclusions of the paper, the
-paragraph has been revised for clarity and moved to a footnote.
-
-> At the technical level, I am confused by one of the constraints imposed in
-> eq.16, when \sigma_1 couples with all replicated s_a. I was expecting a sum
-> of \sigma_b.s_a over a and b. This may represent a rotation applied to all
-> replicas along a reference direction, which is probably what the author did,
-> but there is no comment about that. In general, it would have helped to
-> specify the constraints enforced instead of writing simply "Lagrange
-> multipliers" before eq.16.
-
-The fact noticed by the referee that only σ₁ appears in the scalar product with
-sₐ in equation (16) of the original manuscript was not introduced in that
-equation, but instead was introduced in equation (10) of the original
-manuscript. Right after that equation, the special status of σ₁ was clarified.
-This arises because of the structure of equation (9): in that equation, the
-logarithmic expression being averaged depends only on σ, which corresponds with
-σ₁ in the following equation. σ₂ through σₘ correspond to σ', which is
-replicated (m - 1) times to bring the partition function into the numerator.
-Therefore, there is a clear reason behind the asymmetry among the replicas
-associated with the reference spin, and it was not due to an ad-hoc
-transformation as suggested by the referee.
-
-As part of the rewriting of the manuscript for clarity, this subtlety has been
-emphasized around what are now equations (9) and (10).
-
-> The author analyses the problem using the Franz-Parisi potential, however,
-> this analysis does not seem to matter in the paper. We can read a comment at
-> the end of Sec.3.1 but without actual implications. It should be either
-> removed or expanded, at the moment it seems just without purpose.
-
-The analysis of the Franz-Parisi potential has been moved to Appendix C, with a
-more explanatory discussion of our interest in it included in the manuscript.
-In short, the referee is right to point out that it has no implications for the
-main topic of the paper. It is included because some specialists will be
-interested in the comparison between it and the two-point complexity. This
-reasoning is now explained at the beginning of Appendix C.
-
-> "We see arrangements of barriers relative to each other, perhaps...". Why
-> "perhaps"? Second, where is this analysis carried out? In the results
-> section, the author analyses stable minima and marginal states, I don't know
-> where to look. Adding a reference would have helped.
-
-The sentence in question has now been rephrased, but "perhaps" was due to the
-fact that not very much is learned about the mutual arrangement of saddles from
-this work. In order to make clear what conclusions can be drawn about saddles
-from our calculation, we have added a new subsection to the Results section,
-3.2: Grouping of saddle points. This subsection contains two paragraphs
-detailing what one might want to know about the geometry of saddle points, and
-what we actually learn from the two-point complexity.
-
-> After eq.3, the author comments on the replica ansatz, but this is out of
-> place. We are still introducing the model. It would be better to have it at
-> the end of the section (where indeed the author comes back to the same
-> concept) or remove it entirely.
-
-The referee is right to point out this oversight, and the note about the
-specific influence of the covariance function f on the form of RSB has been
-moved into the details for the calculation of the complexity, in subsection
-A.4: Replica ansatz and saddle point. Where it was in section 2 we now say
-
-"The choice of *f* has significant effect on the form of equilibrium order in
-the model, and likewise influences the geometry of stationary points."
-
-> fig.1, add a caption under each figure saying what they are (oriented
-> saddles, oriented minima, etc), it is much easier to read.
+> On page 9, the author points out that there are solutions with complexity 0 that do not show an extensive barrier "in any situation". First, this "in any situation" is quite unclear. Does the author mean above and below the threshold energy? Does this solution exist even at high energy? Can the author comment on what this solution could imply?
+
+The sense of "any situation" means "for a reference point of any energy and stability." This includes energies above and below the threshold energy, at stabilities that imply saddles, minima, or marginal minima, and even for combinations of energy and stability where the complexity of stationary points is negative.
+
+The two-point complexity is computed under the condition that the reference point exists. Given that the reference point exists, there is at least one point that can be found at zero overlap with the reference: itself. This reasoning alone rationalizes why we should find a solution with Σ₁₂ = 0, q = 0, and E₀ = E₁, μ₀ = μ₁ for any E₀ and μ₀.
+
+The interpretation is more subtle. Two different stationary points cannot lie at the same point, but the complexity calculation only resolves numbers of points that are exponential in N and differences in overlap that are linear in N. Therefore, the complexity calculation is compatible with many stationary points being contained in the subextensive region of dimension Δq = O(1/N) around any reference point. We can reason as to where these extremely near neighbors are likely or unlikely to exist in specific conditions, but the complexity calculation cannot rule them out.
+
+This point is not crucial to anything in the paper, except to make more precise the statement that non-threshold marginal minima are separated by a gap in their overlap. Because marginal minima have very flat directions, they are good candidates for possessing these extremely near neighbors, and this might lead one to say they are not isolated. However, if such extremely near neighbors exist, they are irrelevant to dynamics: the entire group is isolated, since the complexity of similar stationary points at a small but extensive overlap further is negative.
+
+Because the point is not important to the conclusions of the paper, the paragraph has been revised for clarity and moved to a footnote.
+
+> At the technical level, I am confused by one of the constraints imposed in eq.16, when \sigma_1 couples with all replicated s_a. I was expecting a sum of \sigma_b.s_a over a and b. This may represent a rotation applied to all replicas along a reference direction, which is probably what the author did, but there is no comment about that. In general, it would have helped to specify the constraints enforced instead of writing simply "Lagrange multipliers" before eq.16.
+
+The fact noticed by the referee that only σ₁ appears in the scalar product with sₐ in equation (16) of the original manuscript was not introduced in that equation, but instead was introduced in equation (10) of the original manuscript. Right after that equation, the special status of σ₁ was clarified. This arises because of the structure of equation (9): in that equation, the logarithmic expression being averaged depends only on σ, which corresponds with σ₁ in the following equation. σ₂ through σₘ correspond to σ', which is replicated (m - 1) times to bring the partition function into the numerator. Therefore, there is a clear reason behind the asymmetry among the replicas associated with the reference spin, and it was not due to an ad-hoc transformation as suggested by the referee.
+
+As part of the rewriting of the manuscript for clarity, this subtlety has been emphasized around what are now equations (9) and (10).
+
+> The author analyses the problem using the Franz-Parisi potential, however, this analysis does not seem to matter in the paper. We can read a comment at the end of Sec.3.1 but without actual implications. It should be either removed or expanded, at the moment it seems just without purpose.
+
+The analysis of the Franz-Parisi potential has been moved to Appendix C, with a more explanatory discussion of our interest in it included in the manuscript. In short, the referee is right to point out that it has no implications for the main topic of the paper. It is included because some specialists will be interested in the comparison between it and the two-point complexity. This reasoning is now explained at the beginning of Appendix C.
+
+> "We see arrangements of barriers relative to each other, perhaps...". Why "perhaps"? Second, where is this analysis carried out? In the results section, the author analyses stable minima and marginal states, I don't know where to look. Adding a reference would have helped.
+
+The sentence in question has now been rephrased, but "perhaps" was due to the fact that not very much is learned about the mutual arrangement of saddles from this work. In order to make clear what conclusions can be drawn about saddles from our calculation, we have added a new subsection to the Results section, 3.2: Grouping of saddle points. This subsection contains two paragraphs detailing what one might want to know about the geometry of saddle points, and what we actually learn from the two-point complexity.
+
+> After eq.3, the author comments on the replica ansatz, but this is out of place. We are still introducing the model. It would be better to have it at the end of the section (where indeed the author comes back to the same concept) or remove it entirely.
+
+The referee is right to point out this oversight, and the note about the specific influence of the covariance function f on the form of RSB has been moved into the details for the calculation of the complexity, in subsection A.4: Replica ansatz and saddle point. Where it was in section 2 we now say
+
+"The choice of *f* has significant effect on the form of equilibrium order in the model, and likewise influences the geometry of stationary points."
+
+> fig.1, add a caption under each figure saying what they are (oriented saddles, oriented minima, etc), it is much easier to read.
The suggestion of the referee was good and was implemented in the new manuscript.
-> fig.2, elaborate a bit more in the main text. This is introduced at the of
-> the section without any comment.
+> fig.2, elaborate a bit more in the main text. This is introduced at the of the section without any comment.
-A paragraph discussing Fig. 2 has been added to the main text, and the end of
-Section 2.
+A paragraph discussing Fig. 2 has been added to the main text, at the end of Section 2.
-> fig.3, "the dot-dashed lines on both plots depict the trajectory of the solid
-> line on the other plot", which one?
+> fig.3, "the dot-dashed lines on both plots depict the trajectory of the solid line on the other plot", which one?
-The answer is both. This confusing sentence has been clarified in the new
-manuscript. It now reads:
+The answer is both. This confusing sentence has been clarified in the new manuscript. It now reads:
-"The dot-dashed line on the left plot depicts the trajectory of the solid line
-on the right plot, and the dot-dashed line on the right plot depicts the
-trajectory of the solid line on the left plot."
+"The dot-dashed line on the left plot depicts the trajectory of the solid line on the right plot, and the dot-dashed line on the right plot depicts the trajectory of the solid line on the left plot."
-> fig.3, "In this case, the points lying nearest to the reference minimum are
-> saddle with mu\<mu, but with energies smaller than the threshold energy", so?
-> What is the implication? This misses a conclusion.
+> fig.3, "In this case, the points lying nearest to the reference minimum are saddle with mu\<mu, but with energies smaller than the threshold energy", so? What is the implication? This misses a conclusion.
-These low-lying saddles represent large deviations from the typical complexity.
-The point has been clarified by appending "which makes them an atypical
-population of saddles" to the sentence.
+These low-lying saddles represent large deviations from the typical complexity. The point has been clarified by appending "which makes them an atypical population of saddles" to the sentence.
-> Sec.3.1, the author comments on the similarity with the pure model, without
-> explaining what is similar. What should we expect on the p-spin? At least the
-> relevant aspects. It would also be useful to plot a version of Fig.3 for the
-> p-spin. It would make the discussion easier to follow.
+> Sec.3.1, the author comments on the similarity with the pure model, without explaining what is similar. What should we expect on the p-spin? At least the relevant aspects. It would also be useful to plot a version of Fig.3 for the p-spin. It would make the discussion easier to follow.
-In the reversed manuscript, the points of comparison with the pure models are
-mode more explicit, as the referee suggests. We do not think it is necessary to
-include a figure for the pure models, instead clarifying the most important
-departure in the text:
+In the revised manuscript, the points of comparison with the pure models are made more explicit, as the referee suggests. We do not think it is necessary to include a figure for the pure models, instead clarifying the most important departure in the text:
-"The largest difference between the pure and mixed models is the decoupling of
-nearby stable points from nearby low-energy points: in the pure *p*-spin model,
-the left and right panels of Fig. 3 would be identical up to a constant factor
-−*p*."
+"The largest difference between the pure and mixed models is the decoupling of nearby stable points from nearby low-energy points: in the pure *p*-spin model, the left and right panels of Fig. 3 would be identical up to a constant factor −*p*."
-For those interested in more detailed comparisons, the relevant figure for the
-pure models is found in the paper twice cited in that subsection.
+For those interested in more detailed comparisons, the relevant figure for the pure models is found in the paper twice cited in that subsection.
> "the nearest neighbour points are always oriented saddles", where do I see this?
We have added a sentence to clarify this point:
-"This is a result of the persistent presence of a negative isolated eigenvalue
-in the spectrum of the nearest neighbors, e.g., as in the shaded regions of
-Fig. 3."
+"This is a result of the persistent presence of a negative isolated eigenvalue in the spectrum of the nearest neighbors, e.g., as in the shaded regions of Fig. 3."
-> the sentence "like in the pure models, the emergence [...]" is extremely hard
-> to parse and the paragraph ends without a conclusion. What are the
-> consequences?
+> the sentence "like in the pure models, the emergence [...]" is extremely hard to parse and the paragraph ends without a conclusion. What are the consequences?
This sentence has been expanded to make it more clear, and the statement now reads
-"Like in the pure models, the minimum energy and maximum stability of nearby
-points are not monotonic in *q*: there is a range of overlap where the minimum
-energy of neighbors decreases with overlap. The transition from stable minima
-to index-one saddles along the line of lowest-energy states occurs at its local
-minimum, another similarity with the pure models [13]. This point is
-interesting because it describes the properties of the nearest stable minima to
-the reference point. It is not clear why the local minimum of the boundary
-coincides with this point or what implications that has for behavior."
+"Like in the pure models, the minimum energy and maximum stability of nearby points are not monotonic in *q*: there is a range of overlap where the minimum energy of neighbors decreases with overlap. The transition from stable minima to index-one saddles along the line of lowest-energy states occurs at its local minimum, another similarity with the pure models [13]. This point is interesting because it describes the properties of the nearest stable minima to the reference point. It is not clear why the local minimum of the boundary coincides with this point or what implications that has for behavior."
-We also now emphasize that the implications are not known. However, the
-coincidence itself it interesting, at the very least for the ability to predict
-where an isolated eigenvalue should destabilize nearby minima without making the
-computation for the eigenvalue.
+We also now emphasize that the implications are not known. However, the coincidence itself is interesting, at the very least for the ability to predict where an isolated eigenvalue should destabilize nearby minima without making the computation for the eigenvalue.
> at page 9, the author talk about \Sigma_12 that however has not been defined yet.
-The referee is correct to point out this oversight, which has now been amended
-by a qualitative definition of Σ₁₂ at the beginning of the results section.
+The referee is correct to point out this oversight, which has now been amended by a qualitative definition of Σ₁₂ at the beginning of the results section.
-> this section starts without explaining what is the strategy to solve the
-> problem. Explaining how the following subsection will contribute to the
-> solution without entering into the details of the computation would be of
-> great help.
+> this section starts without explaining what is the strategy to solve the problem. Explaining how the following subsection will contribute to the solution without entering into the details of the computation would be of great help.
-The explanation of the calculation for the complexity has been reorganized and
-expanded in the new manuscript. In part of this expansion, we added more
-explanation of this kind. Most of this is now found in Appendix A.
+The explanation of the calculation for the complexity has been reorganized and expanded in the new manuscript. In part of this expansion, we added more explanation of this kind. Most of this is now found in Appendix A.
-> "This replica symmetry will be important later" how? Either we have an
-> explanation following or it should be removed.
+> "This replica symmetry will be important later" how? Either we have an explanation following or it should be removed.
The comment has been removed in the new manuscript.
-> at the end of a step it would be good to wrap everything up. For instance,
-> sec.4.2 ends with "we do not include these details, which are standard" at
-> least give a reference. Second, add the final result.
+> at the end of a step it would be good to wrap everything up. For instance, sec.4.2 ends with "we do not include these details, which are standard" at least give a reference. Second, add the final result.
In the revised manuscript, more has been done to wrap up each section. For instance, what was section 4.2 and is now section A.2 now ends
-"The result of this calculation is found in the effective action (44), where it
-contributes all terms besides the functions D contributed by the Hessian terms
-in the previous section and the logarithms contributed by the
-Hubbard–Stratonovich transformation of the next section."
+"The result of this calculation is found in the effective action (44), where it contributes all terms besides the functions D contributed by the Hessian terms in the previous section and the logarithms contributed by the Hubbard–Stratonovich transformation of the next section."
-> "there is a desert where none are found" -> solutions are exponentially rare
-> (or something else)
+> "there is a desert where none are found" -> solutions are exponentially rare (or something else)
The statement has been rewritten, and now says
-"Therefore, marginal minima whose energy *E*₀ is greater than the threshold have
-neighbors at arbitrarily close distance with a quadratic pseudogap, while those
-whose energy is less than the threshold have an overlap gap."
+"Therefore, marginal minima whose energy *E*₀ is greater than the threshold have neighbors at arbitrarily close distance with a quadratic pseudogap, while those whose energy is less than the threshold have an overlap gap."
-> I would suggest a rewriting, especially the last sessions (4-6). I understand
-> the intention of removing simple details, but they should be replaced by
-> comments. The impression (which can be wrong but gives the idea) is of some
-> working notes where simple steps have been removed, resulting in
-> hard-to-follow computations. Finally, I would also recommend moving these
-> sections to an appendix (after acknowledgement and funding).
+> I would suggest a rewriting, especially the last sessions (4-6). I understand the intention of removing simple details, but they should be replaced by comments. The impression (which can be wrong but gives the idea) is of some working notes where simple steps have been removed, resulting in hard-to-follow computations. Finally, I would also recommend moving these sections to an appendix (after acknowledgement and funding).
-As suggested by both referees, much of the paper was rewritten and expanded,
-especially to provide more details in the calculation of the complexity. It was
-also rearranged to put most of those details in appendices.
+As suggested by both referees, much of the paper was rewritten and expanded, especially to provide more details in the calculation of the complexity. It was also rearranged to put most of those details in appendices.
For reviewer 1
-> i) On page 7, when referring to the set of marginal states that attract
-> dynamics "as evidenced by power-law relaxations", it would be convenient to
-> provide references for this statement.
-
-The evidence of power-law relaxation to marginal minima is contained in G.
-Folena and F. Zamponi, On weak ergodicity breaking in mean-field spin glasses,
-SciPost Physics 15(3), 109 (2023). In the original manuscript this work was
-cited at the end of the sentence, but the sentence has now be rephrased and the
-specific point about power-law relaxation has been removed to improve clarity.
-
-> ii) On the same page, the author refers to a quadratic pseudo-gap in the
-> complexity function associated with marginal states. It would be helpful to
-> have some more indication of how this was derived or, again, to provide
-> appropriate references.
-
-The form of the pseudo-gap in overlap for marginal states above the threshold
-energy is demonstrated in the subsection on the expansion of the complexity in
-the near neighborhood (equation 40 in the original manuscript).
-
-> iii) Section 4 “Calculation of the two-point complexity”. The author states
-> that conditioning the Hessian matrix of the stationary points to have a given
-> energy and given stability properties influences the statistics of points
-> only at the sub-leading order. It would be valuable to clarify the conditions
-> under which this occurs. I was thus wondering whether the author can
-> straightforwardly generalize such a computation and give some insights in the
-> case of a sparse (no longer fully connected) model.
+We thank the referee for their positive assessment. We believe their concerns with the manuscript have been addressed in the updated version. Here we address their specific concerns.
+
+> i) On page 7, when referring to the set of marginal states that attract dynamics "as evidenced by power-law relaxations", it would be convenient to provide references for this statement.
+
+The evidence of power-law relaxation to marginal minima is contained in [G. Folena and F. Zamponi, On weak ergodicity breaking in mean-field spin glasses, SciPost Physics 15(3), 109 (2023)](https://scipost.org/SciPostPhys.15.3.109). In the original manuscript this work was cited at the end of the sentence, but the sentence has now been rephrased and the specific point about power-law relaxation has been removed to improve clarity.
+
+> ii) On the same page, the author refers to a quadratic pseudo-gap in the complexity function associated with marginal states. It would be helpful to have some more indication of how this was derived or, again, to provide appropriate references.
+
+The form of the pseudo-gap in overlap for marginal states above the threshold energy is demonstrated in the subsection on the expansion of the complexity in the near neighborhood (equation (40) in the original manuscript, equation (17) in the revised manuscript). In the revised manuscript more has been done to emphasize the of the pseudogap analysis.
+
+> iii) Section 4 “Calculation of the two-point complexity”. The author states that conditioning the Hessian matrix of the stationary points to have a given energy and given stability properties influences the statistics of points only at the sub-leading order. It would be valuable to clarify the conditions under which this occurs. I was thus wondering whether the author can straightforwardly generalize such a computation and give some insights in the case of a sparse (no longer fully connected) model.
In the manuscript, we have added a small clarification as to the reason for this:
-"This is because the conditioning amounts to a rank-one perturbation to the
-Hessian matrix, which does not affect the bulk of its spectrum."
-
-From this, one can reason that the same assumptions will hold whenever rank-one
-perturbations do not affect the bulk spectrum. While we are not experts in the
-theory of sparse matrices, it seems likely this condition breaks down when a
-matrix is sufficiently sparse.
-
-> iv) Eq. (34) is quite complicated and difficult to grasp by eye. I thus
-> wonder whether the numerical protocol is robust enough to be sure that by
-> initializing differently, not exactly at q=0, the same solution is always
-> found. How sensitive is the protocol to the choice of initial conditions?
-
-It is true that (34) (now (11)) is quite complicated, but the numeric methods
-we use find the same solutions quite robustly. First, we make use of arbitrary
-precision arithmetic in Mathematica to ensure that the roots of the saddle
-point equations derived from this expression are indeed good roots, in this
-case with a 30-digit working precision. Second, the initialization near a good
-solution known from analytics, namely the solution at q = 0, is crucial because
-if initialized from random conditions a valid solution is never found. If we
-initialize the root-finding algorithm using the known solution at q = 0 and
-then attempt to solve the equations at some small q > 0, a consistent solution
-is found so long as q is sufficiently small. This is also true if the initial
-condition is randomly perturbed by a small amount. If the first q > 0 is too
-large or the random perturbation is too large, only nonphysical solutions are
-found. Luckily, we expect that the complexity of stationary points at different
-proximities varies smoothly, so that this procedure is justified.
-
-> v) In Section 5, the analysis of an isolated eigenvalue, which can be
-> attributed to a low-rank perturbation in the Hessian matrix, is discussed.
-> The technique results from a generalization of a paper recently published by
-> H. Ikeda, restricted to a quadratic model though. It would be worthwhile to
-> discuss how many of these predictions can be extended to models defined by a
-> double-well potential or to optimization problems relying on non-quadratic
-> functions (such as ReLu, sigmoid).
-
-The technique is quite generic, and the ability to apply it to other models
-rests mostly in the tractability of the saddle point calculation. I guess the
-reviewer is referencing the KHGPS model, or simple neural networks. The
-principle challenge in these cases is the Kac–Rice calculation itself, which
-has not be extended to systems without Gaussian disorder. If this were
-resolved, using this technique to analyse the properties of an isolated
-eigenvalue would be a painful corollary. (ReLu is problematic with respect to
-these landscape methods, however, because it does not have a well-defined
-Hessian everywhere.)
+"This is because the conditioning amounts to a rank-one perturbation to the Hessian matrix, which does not affect the bulk of its spectrum."
+
+From this, one can reason that the same assumptions will hold whenever rank-one perturbations do not affect the bulk spectrum. While we are not experts in the theory of sparse matrices, it seems likely this condition breaks down when a matrix is sufficiently sparse.
+
+> iv) Eq. (34) is quite complicated and difficult to grasp by eye. I thus wonder whether the numerical protocol is robust enough to be sure that by initializing differently, not exactly at q=0, the same solution is always found. How sensitive is the protocol to the choice of initial conditions?
+
+It is true that equation (34) (now (11)) is quite complicated, but the numeric methods we use find the same solutions quite robustly. First, we make use of arbitrary precision arithmetic in Mathematica to ensure that the roots of the saddle point equations derived from this expression are indeed good roots, in this case with a 30-digit working precision. Second, the initialization near a good solution known analytically, namely the solution at *q* = 0, is crucial because if initialized from random conditions a valid solution is never found. If we initialize the root-finding algorithm using the known solution at *q* = 0 and then attempt to solve the equations at some small *q* > 0, a consistent solution is found so long as q is sufficiently small. This is also true if the initial condition is randomly perturbed by a small amount. If the first *q* > 0 is too large or the random perturbation is too large, only nonphysical solutions are found. Luckily, we expect that the complexity of stationary points at different proximities varies smoothly, so that this procedure is justified.
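The continuation strategy described in the response above (start from a solution known at *q* = 0, then solve at slightly larger *q* using the previous solution as the initial guess) can be sketched as follows. This is only an illustrative sketch, not the author's Mathematica code: the cubic `toy_equation`, the helper names `newton` and `continue_in_q`, the step size, and the tolerance are all hypothetical stand-ins, and the real saddle-point equations are multidimensional. The 30-digit precision is set with Python's standard `decimal` module to mirror the working precision quoted in the response.

```python
from decimal import Decimal, getcontext

# 30 significant digits, mirroring the working precision quoted in the response.
getcontext().prec = 30


def toy_equation(x, q):
    """Hypothetical stand-in for a saddle-point equation (NOT the paper's)."""
    return x**3 - x + q


def toy_derivative(x, q):
    """Derivative of toy_equation with respect to x."""
    return 3 * x**2 - 1


def newton(x0, q, tol=Decimal("1e-25"), max_iter=100):
    """Newton iteration for toy_equation(x, q) = 0, started from x0."""
    x = x0
    for _ in range(max_iter):
        step = toy_equation(x, q) / toy_derivative(x, q)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton iteration did not converge")


def continue_in_q(q_values, x_start=Decimal(0)):
    """Follow the branch that starts at the known q = 0 solution x = 0."""
    branch = []
    x = x_start
    for q in q_values:
        x = newton(x, q)  # the previous solution seeds the next solve
        branch.append((q, x))
    return branch


# Small steps in q: 0, 0.05, ..., 0.30
q_values = [Decimal(k) / 20 for k in range(0, 7)]
branch = continue_in_q(q_values)

for q, x in branch:
    print(q, x)
```

In this toy problem the branch that starts at *x* = 0 only exists for |*q*| < 2/(3√3) ≈ 0.385, so the tracking stops well before that value. The design choice illustrated is the one the response relies on: each solve is seeded by the neighboring solution, which is reasonable only if the solution varies smoothly with *q*.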
+
+> v) In Section 5, the analysis of an isolated eigenvalue, which can be attributed to a low-rank perturbation in the Hessian matrix, is discussed. The technique results from a generalization of a paper recently published by H. Ikeda, restricted to a quadratic model though. It would be worthwhile to discuss how many of these predictions can be extended to models defined by a double-well potential or to optimization problems relying on non-quadratic functions (such as ReLu, sigmoid).
+
+The technique is quite generic, and the ability to apply it to other models rests mostly in the tractability of the saddle point calculation. I guess the reviewer is referencing the KHGPS model, or simple neural networks. The principal challenge in these cases is the Kac–Rice calculation itself, which has not been quantitatively extended to systems with non-Gaussian disorder. If this were resolved, using this technique to analyze the properties of an isolated eigenvalue would be a painful corollary. (ReLu is problematic with respect to these landscape methods, however, because it does not have a well-defined Hessian everywhere.)
## Requested changes