
For reviewer 2

> On page 9, the author points out that there are solutions with complexity 0
> that do not show an extensive barrier "in any situation". First, this "in any
> situation" is quite unclear. Does the author mean above and below the
> threshold energy? Does this solution exist even at high energy?
> Can the author comment on what this solution could imply?

By "any situation" we mean "for a reference point of any energy and
stability." This includes energies above and below the threshold energy,
stabilities that imply saddles, minima, or marginal minima, and even
combinations of energy and stability where the complexity of stationary points
is negative.

The two-point complexity is computed under the condition that the reference
point exists. Given that the reference point exists, there is at least one
point that can be found at zero overlap with the reference: itself. This
reasoning alone rationalizes why we should find a solution with Σ₁₂ = 0, q = 0,
and E₀ = E₁, μ₀ = μ₁ for any E₀ and μ₀.

The interpretation is more subtle. Two different stationary points cannot lie
at the same point, but the complexity calculation only resolves numbers of
points that are exponential in N and differences in overlap that are linear in
N. Therefore, the complexity calculation is compatible with many stationary
points being contained in the subextensive region of dimension Δq = O(1) around
any reference point. We can reason as to where these extremely near neighbors
are likely or unlikely to exist in specific conditions, but the complexity
calculation cannot rule them out.
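
As a minimal illustration of this resolution limit, assume the usual schematic definition of the complexity as the exponential growth rate of the number of stationary points. The symbols 𝒩 (the number of points counted near the reference) and k (an arbitrary fixed exponent) are introduced here only for illustration and do not appear in the manuscript.

```latex
% Schematic definition: complexity as an exponential growth rate
\Sigma = \lim_{N\to\infty} \frac{1}{N} \log \mathcal{N}

% A single point and a polynomially large cluster give the same value:
\mathcal{N} = 1
  \;\Rightarrow\; \Sigma = \lim_{N\to\infty} \frac{\log 1}{N} = 0,
\qquad
\mathcal{N} = N^{k}
  \;\Rightarrow\; \Sigma = \lim_{N\to\infty} \frac{k \log N}{N} = 0
```

Any count that grows subexponentially in N therefore yields zero complexity, so a solution with Σ₁₂ = 0 does not distinguish an isolated reference point from a reference point with subexponentially many extremely near neighbors.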

This point is not crucial to anything in the paper, except to make more
precise the statement that non-threshold marginal minima are separated by a gap
in their overlap. Because marginal minima have very flat directions, they are
good candidates for possessing these extremely near neighbors, and this might
lead one to say they are not isolated. However, if such extremely near
neighbors exist, they are irrelevant to dynamics: the entire group is isolated,
since the complexity of similar stationary points at a small but extensive
overlap further is negative.

> At the technical level, I am confused by one of the constraints imposed in
> eq.16, when \sigma_1 couples with all replicated s_a. I was expecting a sum
> of \sigma_b.s_a over a and b. This may represent a rotation applied to all
> replicas along a reference direction, which is probably what the author did,
> but there is no comment about that. In general, it would have helped to
> specify the constraints enforced instead of writing simply "Lagrange
> multipliers" before eq.16.

The asymmetry noticed by the referee, that only σ₁ appears in the scalar
product with sₐ in equation (16) of the original manuscript, was not introduced
in that equation but in equation (10) of the original manuscript, and the
special status of σ₁ was clarified immediately after it. It arises from the
structure of equation (9): there, the logarithmic expression being averaged
depends only on σ, which corresponds to σ₁ in the following equation. σ₂
through σₘ correspond to σ', which is replicated (m - 1) times to bring the
partition function into the numerator. There is therefore a clear reason for
the asymmetry among the replicas associated with the reference spin, and it is
not the result of an ad hoc transformation, as the referee suggested.
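
The origin of this asymmetry can be sketched schematically as follows. The notation is assumed for illustration and may differ from the manuscript: dν denotes the measure over stationary points of the reference, and f(σ) stands in for the logarithmic expression of equation (9), which depends only on σ.

```latex
% The normalization in the denominator is raised into the numerator
% by replicating \sigma' a total of (m - 1) times and taking m \to 0.
\frac{\int d\nu(\sigma)\, f(\sigma)}{\int d\nu(\sigma')}
  = \lim_{m\to0} \int d\nu(\sigma)\, f(\sigma)
      \left[ \int d\nu(\sigma') \right]^{m-1}
  = \lim_{m\to0} \int \left[ \prod_{b=1}^{m} d\nu(\sigma_b) \right] f(\sigma_1)
```

In the last expression σ has been relabeled σ₁ and the (m - 1) copies of σ' have been relabeled σ₂ through σₘ. Only σ₁ enters f, and hence only σ₁ appears in the scalar product with sₐ.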

> The author analyses the problem using the Franz-Parisi potential, however,
> this analysis does not seem to matter in the paper. We can read a comment at
> the end of Sec.3.1 but without actual implications. It should be either
> removed or expanded, at the moment it seems just without purpose.

The analysis of the Franz-Parisi potential has been moved to an appendix, with
a more explanatory discussion of our interest in it included in the manuscript.

> "We see arrangements of barriers relative to each other, perhaps...". Why
> "perhaps"? Second, where is this analysis carried out? In the results
> section, the author analyses stable minima and marginal states, I don't know
> where to look. Adding a reference would have helped.

> After eq.3, the author comments on the replica ansatz, but this is out of
> place. We are still introducing the model. It would be better to have it at
> the end of the section (where indeed the author comes back to the same
> concept) or remove it entirely.

> fig.1, add a caption under each figure saying what they are (oriented
> saddles, oriented minima, etc), it is much easier to read.

The suggestion of the referee was good and was implemented in the new manuscript.

> fig.2, elaborate a bit more in the main text. This is introduced at the of
> the section without any comment.

> fig.3, "the dot-dashed lines on both plots depict the trajectory of the solid
> line on the other plot", which one?

> fig.3, "In this case, the points lying nearest to the reference minimum are
> saddle with mu\<mu, but with energies smaller than the threshold energy", so?
> What is the implication? This misses a conclusion.

> Sec.3.1, the author comments on the similarity with the pure model, without
> explaining what is similar. What should we expect on the p-spin? At least the
> relevant aspects. It would also be useful to plot a version of Fig.3 for the
> p-spin. It would make the discussion easier to follow.

> "the nearest neighbour points are always oriented saddles", where do I see this?

> the sentence "like in the pure models, the emergence [...]" is extremely hard
> to parse and the paragraph ends without a conclusion. What are the
> consequences?

> at page 9, the author talk about \Sigma_12 that however has not been defined yet.

> this section starts without explaining what is the strategy to solve the
> problem. Explaining how the following subsection will contribute to the
> solution without entering into the details of the computation would be of
> great help.

> "This replica symmetry will be important later" how? Either we have an
> explanation following or it should be removed.

> at the end of a step it would be good to wrap everything up. For instance,
> sec.4.2 ends with "we do not include these details, which are standard" at
> least give a reference. Second, add the final result.

> "there is a desert where none are found" -> solutions are exponentially rare
> (or something else)

> I would suggest a rewriting, especially the last sessions (4-6). I understand
> the intention of removing simple details, but they should be replaced by
> comments. The impression (which can be wrong but gives the idea) is of some
> working notes where simple steps have been removed, resulting in
> hard-to-follow computations. Finally, I would also recommend moving these
> sections to an appendix (after acknowledgement and funding).

For reviewer 1

> i) On page 7, when referring to the set of marginal states that attract
> dynamics "as evidenced by power-law relaxations", it would be convenient to
> provide references for this statement.

The evidence of power-law relaxation to marginal minima is contained in G.
Folena and F. Zamponi, On weak ergodicity breaking in mean-field spin glasses,
SciPost Physics 15(3), 109 (2023). In the original manuscript this work was
cited at the end of the sentence, but the sentence has now been rephrased and the
specific point about power-law relaxation has been removed to improve clarity.

> ii) On the same page, the author refers to a quadratic pseudo-gap in the
> complexity function associated with marginal states. It would be helpful to
> have some more indication of how this was derived or, again, to provide
> appropriate references.

The form of the pseudo-gap in overlap for marginal states above the threshold
energy is demonstrated in the subsection on the expansion of the complexity in
the near neighborhood (equation 40 in the original manuscript).

> iii) Section 4 “Calculation of the two-point complexity”. The author states
> that conditioning the Hessian matrix of the stationary points to have a given
> energy and given stability properties influences the statistics of points
> only at the sub-leading order. It would be valuable to clarify the conditions
> under which this occurs. I was thus wondering whether the author can
> straightforwardly generalize such a computation and give some insights in the
> case of a sparse (no longer fully connected) model.

CITE VALENTINA??

> iv) Eq. (34) is quite complicated and difficult to grasp by eye. I thus
> wonder whether the numerical protocol is robust enough to be sure that by
> initializing differently, not exactly at q=0, the same solution is always
> found. How sensitive is the protocol to the choice of initial conditions?

DO A LITTLE EXPERIMENT

> v) In Section 5, the analysis of an isolated eigenvalue, which can be
> attributed to a low-rank perturbation in the Hessian matrix, is discussed.
> The technique results from a generalization of a paper recently published by
> H. Ikeda, restricted to a quadratic model though. It would be worthwhile to
> discuss how many of these predictions can be extended to models defined by a
> double-well potential or to optimization problems relying on non-quadratic
> functions (such as ReLu, sigmoid).

ALWAYS QUADRATIC!

Requested changes

> I found the paper interesting but quite technical in some points. Moving the
> saddle-point computations and part of the analysis (see for instance on pages
> 11-13 and 18-20) to the supplement would make it easier to capture the main
> results, especially for general readers without extensive expertise in the
> replica trick and these models.

OK