%
% Created by Jaron Kent-Dobias on Thu Apr 20 12:50:56 EDT 2017.
% Copyright (c) 2017 Jaron Kent-Dobias. All rights reserved.
%
\documentclass[aps,prl,reprint]{revtex4-1}

\usepackage[utf8]{inputenc}
\usepackage{amsmath,amssymb,latexsym,mathtools,xifthen}
\usepackage{algcompatible}
\usepackage{algorithm}

% uncomment to label only equations that are referenced in the text
%\mathtoolsset{showonlyrefs=true}

% I want labels but don't want to type out ``equation''
\def\[{\begin{equation}}
\def\]{\end{equation}}

% math not built-in
\def\arcsinh{\mathop{\mathrm{arcsinh}}\nolimits}
\def\arccosh{\mathop{\mathrm{arccosh}}\nolimits}
\def\ei{\mathop{\mathrm{Ei}}\nolimits} % exponential integral Ei
\def\re{\mathop{\mathrm{Re}}\nolimits}
\def\im{\mathop{\mathrm{Im}}\nolimits}
\def\sgn{\mathop{\mathrm{sgn}}\nolimits}
\def\dd{d} % differential
\def\O{O} % big O
\def\o{o} % little O
\def\Z{\mathbb Z}
\def\R{\mathbb R}
\def\G{S}

% subscript for ``critical'' values, e.g., T_\c
\def\c{\mathrm c}

% scaling functions
\def\fM{\mathcal M} % magnetization
\def\fX{\mathcal Y} % susceptibility
\def\fF{\mathcal F} % free energy
\def\fiF{\mathcal H} % imaginary free energy
\def\fS{\mathcal S} % surface tension
\def\fG{\mathcal G} % exponential factor
\def\H{\mathcal H}

% lattice types
\def\sq{\mathrm{sq}}
\def\tri{\mathrm{tri}}
\def\hex{\mathrm{hex}}

% dimensions
\def\dim{d}
\def\twodee{\textsc{2d} }
\def\threedee{\textsc{3d} }
\def\fourdee{\textsc{4d} }

% fancy partial derivative
\newcommand\pd[3][]{
  \ifthenelse{\isempty{#1}}
  {\def\tmp{}}
  {\def\tmp{^#1}}
  \frac{\partial\tmp#2}{\partial#3\tmp}
}

\newcommand\nn[2]{\langle#1#2\rangle}
\newcommand\avg[1]{\langle#1\rangle}
\newcommand\eavg[1]{\langle#1\rangle_{\mathrm e}}
\newcommand\mavg[1]{\langle#1\rangle_{\mathrm m}}
\def\e{\mathrm e}
\def\m{\mathrm m}
\newcommand\new[1]{\tilde{#1}}
\renewcommand\vec[1]{\mathbf{#1}}
\newcommand\unit[1]{\hat{\mathbf{#1}}}

% used to reformat display math to fit in two-column ``reprint'' mode
\makeatletter
\newif\ifreprint
\@ifclasswith{revtex4-1}{reprint}{\reprinttrue}{\reprintfalse}
\makeatother

\begin{document}

\title{Efficiently sampling Ising states in an external field}
\author{Jaron Kent-Dobias}
\author{James P.~Sethna}
\affiliation{Laboratory of Atomic and Solid State Physics, Cornell University, Ithaca, NY, USA}
\date\today

\begin{abstract}
  We introduce an extension of the Wolff algorithm that performs efficiently in an external magnetic field. Near the Ising critical point, the correlation time of our algorithm has a conventional scaling form that reduces to that of the Wolff algorithm at zero field. As an application, we directly measure scaling functions of observables in the metastable state of the 2D Ising model.
\end{abstract}

\maketitle

The Ising model is a simple model of a magnet made up of locally interacting spins. As with most large thermal systems, its properties cannot be computed explicitly; they are instead estimated using Monte Carlo methods. Near its continuous phase transition, a divergent correlation length leads to a divergent correlation time in any locally updating algorithm, hampering computation. At zero external field, this problem was largely alleviated by cluster algorithms, like the Wolff algorithm, whose dynamics are nonlocal: each step flips a group of spins whose typical size diverges with the correlation length. However, the Wolff algorithm only works at zero field. We describe an extension of this algorithm that works in an arbitrary external field while preserving the Wolff algorithm's small dynamic exponent.
The Wolff algorithm works by first choosing a random spin and adding it to an empty cluster. Every neighbor of that spin pointed in the same direction as the spin is added to the cluster with probability $1-e^{-2\beta J}$, where $\beta=1/T$ and $J$ is the coupling between sites. This process is then repeated for the neighbors of every spin added to the cluster. When all sites surrounding the cluster have been exhausted, the cluster is flipped. Our algorithm is a simple extension of this. An extra spin is introduced (often referred to as a ``ghost spin'') that couples to all others with coupling $H$. The traditional Wolff algorithm is then performed on this larger lattice exactly as described above, with the extra spin treated no differently from any other. Observables of the original system can be exactly estimated on the new one using a simple mapping. As an application, we use our algorithm to measure critical scaling functions of the 2D Ising model in its metastable phase.

\section{Introduction}

Consider an undirected graph $G=(V,E)$ describing a system of interacting spins. The set of vertices $V=\{1,\ldots,n\}$ enumerates the sites of the network, and the set of edges $E$ describes the connections between interacting sites. On each site is a spin that can take any value from the set $S=\{-1,1\}$. The state of the system is described by a function $s:V\to S$, leading to a configuration space of all possible states $S^n=S\times\cdots\times S$. The Hamiltonian $\H:S^n\to\R$ gives the energy of a particular state $s\in S^n$ and is defined by
\[
  \H(s)=-\sum_{\{i,j\}\in E}J_{ij}s_is_j-HM(s),
  \label{eq:ham}
\]
where $H$ is the external magnetic field, $J:E\to\R$ gives the coupling between spins on connected sites, and $M:S^n\to\R$ is the magnetization of the system, defined for a state $s\in S^n$ by
\[
  M(s)=\sum_{i\in V}s_i.
\]
For the purpose of this study we consider only ferromagnetic systems, where the function $J$ is nonnegative. All formal results can be generalized to the antiferromagnetic or mixed cases, though properties such as algorithmic efficiency cannot.

An observable of this system is a function $A:S^n\to\R$ depending on the system's state. Both the Hamiltonian and the magnetization defined above are observables. Assuming the ergodic hypothesis holds, the expected value $\avg A$ of any observable $A$ is defined by its average over every state $s$ in the configuration space $S^n$ weighted by the Boltzmann factor $e^{-\beta\H(s)}$, or
\[
  \avg A=\frac1Z\sum_{s\in S^n}e^{-\beta\H(s)}A(s),
  \label{eq:avg.obs}
\]
where $\beta$ is the inverse temperature and the partition function $Z$, defined by
\[
  Z=\sum_{s\in S^n}e^{-\beta\H(s)},
\]
gives the correct normalization for the weighted sum.

The sum over configurations in \eqref{eq:avg.obs} is intractable for all but very small systems, so expected values of observables must usually be approximated. Monte Carlo methods are a common way of accomplishing this. These methods sample states $s$ in the configuration space $S^n$ according to the Boltzmann distribution $e^{-\beta\H(s)}$, so that averages of observables over the samples asymptotically approach the true expected values. The Metropolis--Hastings algorithm \cite{metropolis1953equation,hastings1970monte} is very popular for systems in statistical physics. A starting state $s\in S^n$ is randomly perturbed to a state $s'$, usually by flipping one spin. The change in energy $\Delta\H=\H(s')-\H(s)$ due to the perturbation is then computed.
If the change is negative, the perturbed state $s'$ is accepted as the new state. Otherwise, the perturbed state is accepted with probability $e^{-\beta\Delta\H}$. This process is repeated indefinitely, and samples are collected by recording the state $s$ between iterations. The algorithm is shown schematically in Algorithm \ref{alg:met-hast}. Metropolis--Hastings is very general, but unless the perturbations are very carefully chosen the algorithm suffers in regimes where large correlations are present in the system, for instance near continuous phase transitions. There the algorithm experiences what is known as critical slowing down: likely states consist of large correlated clusters, and moving between such states in configuration space requires very many single-spin perturbations.

\begin{figure}
  \begin{algorithm}[H]
    \begin{algorithmic}
      \REQUIRE $s\in S^n$
      \STATE $s'\gets$ \texttt{Perturb}($s$)
      \STATE $\Delta\H\gets\H(s')-\H(s)$
      \IF {$\exp(-\beta\Delta\H)>$ \texttt{UniformRandom}$(0,1)$}
        \STATE $s\gets s'$
      \ENDIF
    \end{algorithmic}
    \caption{Metropolis--Hastings}
    \label{alg:met-hast}
  \end{algorithm}
\end{figure}

The Wolff algorithm \cite{wolff1989collective} solves many of these problems, but only at zero external field, $H=0$. It avoids critical slowing down by flipping carefully constructed clusters of spins at once, in a way that samples highly correlated states quickly while always accepting prospective states. A random site $i$ is selected from the graph and its spin is flipped. Each of the site's neighbors $j$ is then also flipped with probability $1-e^{-2\beta|J_{ij}|}$ if doing so would lower the energy of the bond $i$--$j$, i.e., if it was originally aligned with $i$. The process is repeated with every neighbor that was flipped. While this algorithm successfully addresses the problem of critical slowing down at zero field, it does not directly apply at nonzero field. A common extension is the hybrid Wolff--Metropolis, where a Wolff cluster is formed using the ordinary formula and then accepted or rejected Metropolis-style based on the change in energy due to the external field, $-H\,\Delta M$. Because each spin has the same coupling to the external field, a strictly more efficient but exactly equivalent version of this hybrid is made by distributing the cluster acceptance or rejection across the spins in the cluster as they are added. In this version, the cluster is abandoned with probability $1-e^{-2\beta H}$ every time a spin is added to it, so that a cluster of $|c|$ spins flipped against the field survives with the Metropolis acceptance probability $e^{-2\beta H|c|}$.
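To make the local update concrete, the following is a minimal Python sketch of the single-spin-flip Metropolis--Hastings step of Algorithm~\ref{alg:met-hast}, specialized to the Hamiltonian \eqref{eq:ham} on a periodic square lattice with uniform coupling $J$; the function and variable names are purely illustrative and not taken from any particular library.
\begin{verbatim}
import numpy as np

def metropolis_sweep(s, beta, J=1.0, H=0.0, rng=None):
    """One sweep of single-spin-flip Metropolis--Hastings
    on an L x L periodic square lattice.  The spins s are
    stored as an array of +/-1; all names are illustrative."""
    rng = rng or np.random.default_rng()
    L = s.shape[0]
    for _ in range(L * L):
        i, j = rng.integers(L, size=2)
        # sum of the four nearest-neighbor spins (periodic)
        nn = (s[(i + 1) % L, j] + s[(i - 1) % L, j]
              + s[i, (j + 1) % L] + s[i, (j - 1) % L])
        # energy change if spin (i, j) were flipped
        dE = 2.0 * s[i, j] * (J * nn + H)
        # accept downhill moves, else with prob e^{-beta dE}
        if dE <= 0 or rng.random() < np.exp(-beta * dE):
            s[i, j] *= -1
    return s
\end{verbatim}
One call attempts $L^2$ single-spin flips; near the critical point many such sweeps are needed to decorrelate the large clusters described above.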
At zero field, the Wolff algorithm is extremely efficient near the critical point: its dynamic exponent for the 2D Ising model has been measured as $z=0.29(1)$ \cite{wolff1989comparison,liu2014dynamic}, compared with $z=0.35(1)$ for the related Swendsen--Wang algorithm \cite{swendsen1987nonuniversal}.

\begin{figure}
  \begin{algorithm}[H]
    \begin{algorithmic}
      \REQUIRE $s\in\G^n$
      \STATE \textbf{let} $q$ be an empty stack
      \STATE \textbf{let} $i_0\in V$
      \STATE $q.\mathtt{push}(i_0)$
      \STATE $\sigma\gets s_{i_0}$
      \WHILE {$q$ is not empty}
        \STATE $i\gets q.\mathtt{pop}$
        \IF {$s_i=\sigma$}
          \FORALL {$j$ such that $\{i,j\}\in E$}
            \IF {$1-\exp(-2\beta J_{ij}s_is_j)>\mathop{\mathtt{UniformRandom}}(0,1)$}
              \STATE $q.\mathtt{push}(j)$
            \ENDIF
          \ENDFOR
          \STATE $s_i\gets-s_i$
        \ENDIF
      \ENDWHILE
    \end{algorithmic}
    \caption{Wolff (Zero-Field)}
    \label{alg:wolff}
  \end{algorithm}
\end{figure}

\begin{figure}
  \begin{algorithm}[H]
    \begin{algorithmic}
      \REQUIRE $s\in\G^n$
      \STATE $s'\gets s$
      \STATE \textbf{let} $q$ be an empty stack
      \STATE \textbf{let} $i_0\in V$
      \STATE $q.\mathtt{push}(i_0)$
      \STATE $\sigma\gets s_{i_0}'$
      \STATE \texttt{completed} $\gets$ \textbf{true}
      \WHILE {$q$ is not empty}
        \STATE $i\gets q.\mathtt{pop}$
        \IF {$s_i'=\sigma$}
          \FORALL {$j$ such that $\{i,j\}\in E$}
            \IF {$1-\exp(-2\beta J_{ij}s_i's_j')>\mathop{\mathtt{UniformRandom}}(0,1)$}
              \STATE $q.\mathtt{push}(j)$
            \ENDIF
          \ENDFOR
          \STATE $s_i'\gets-s_i'$
          \STATE $q.\mathtt{push}(i)$
          \IF {$1-\exp(-2\beta\sigma H)>\mathop{\mathtt{UniformRandom}}(0,1)$}
            \STATE \texttt{completed} $\gets$ \textbf{false}
            \STATE \textbf{break}
          \ENDIF
        \ENDIF
      \ENDWHILE
      \IF {completed}
        \STATE $s\gets s'$
      \ENDIF
    \end{algorithmic}
    \caption{Hybrid Wolff/Metropolis--Hastings}
    \label{alg:h.wolff}
  \end{algorithm}
\end{figure}

\section{Cluster-flip in a Field}

Consider the new graph $\tilde G=(\tilde V,\tilde E)$ defined from $G$ by
\begin{align}
  \tilde V&=\{0\}\cup V\\
  \tilde E&=E\cup\{\{0,i\}\mid i\in V\},
\end{align}
that is, by adding a zeroth vertex and an edge from every other vertex to it. The network of spins described by this graph now has states that occupy the configuration space $S^{n+1}$. Extend the coupling between spins to $\tilde J:\tilde E\to\R$, defined by
\[
  \tilde J_{ij}=\begin{cases}
    J_{ij} & i,j>0\\
    H & \text{otherwise},
  \end{cases}
\]
so that the original spins are coupled to each other exactly as before, and each is coupled to the new spin with strength $H$. Now define a Hamiltonian $\tilde\H:S^{n+1}\to\R$ on this new, larger configuration space by
\[
  \tilde\H(s)=-\sum_{\{i,j\}\in\tilde E}\tilde J_{ij}s_is_j.
\]
This new Hamiltonian resembles the old one, but consists only of spin--spin interactions, with no external field. However, by regrouping the terms in the sum we may write
\[
  \tilde\H(s)=-\sum_{\{i,j\}\in E}J_{ij}s_is_j-H\tilde M(s),
\]
where the new magnetization $\tilde M:S^{n+1}\to\R$ is defined for $s=(s_0,s_1,\ldots,s_n)\in S^{n+1}$ by
\[
  \tilde M(s)=s_0\sum_{i\in V}s_i=M(s_0\times(s_1,\ldots,s_n)).
\]
In fact, any observable $A$ of the original system can be written as an observable $\tilde A$ of the new system by defining
\[
  \tilde A(s)=A(s_0\times(s_1,\ldots,s_n)),
\]
and the expected value of $\tilde A$ in the new system equals that of $A$ in the old one.
This can be seen by using the facts that $\tilde\H(s)=\tilde\H(-s)$, that $\sum_{s\in S^n}f(s)=\sum_{s\in S^n}f(-s)$ for any $f:S^n\to\R$, and that $\tilde\H(1,s)=\H(s)$ for $s\in S^n$, from which it follows that
\[
  \begin{aligned}
    \tilde Z\avg{\tilde A}
    &=\sum_{s\in S^{n+1}}\tilde A(s)e^{-\beta\tilde\H(s)}
    =\sum_{s_0\in S}\sum_{s\in S^n}\tilde A(s_0,s)e^{-\beta\tilde\H(s_0,s)}\\
    &=\bigg(\sum_{s\in S^n}A(s)e^{-\beta\tilde\H(1,s)}
    +\sum_{s\in S^n}A(-s)e^{-\beta\tilde\H(-1,s)}\bigg)\\
    &=\bigg(\sum_{s\in S^n}A(s)e^{-\beta\tilde\H(1,s)}
    +\sum_{s\in S^n}A(-s)e^{-\beta\tilde\H(1,-s)}\bigg)\\
    &=\bigg(\sum_{s\in S^n}A(s)e^{-\beta\tilde\H(1,s)}
    +\sum_{s\in S^n}A(s)e^{-\beta\tilde\H(1,s)}\bigg)\\
    &=\bigg(\sum_{s\in S^n}A(s)e^{-\beta\H(s)}
    +\sum_{s\in S^n}A(s)e^{-\beta\H(s)}\bigg)\\
    &=2Z\avg A.
  \end{aligned}
\]
An identical calculation with $A=1$ shows that $\tilde Z=2Z$, from which it immediately follows that $\avg{\tilde A}=\avg A$. Notice that this correspondence also holds for the Hamiltonian itself, since $\tilde\H(s)=\H(s_0\times(s_1,\ldots,s_n))$.

Our new spin system with its additional ghost spin is, when $H$ is greater than zero, simply a ferromagnetic spin system in the absence of an external field. Therefore, the Wolff algorithm can be applied to it with no modification. Since there is an exact correspondence between the expectation values of our ordinary spin system in a field and those of their appropriately defined counterparts in the new system, sampling Monte Carlo simulations of the new system allows us to estimate expectation values for the old one. This ``new'' algorithm, if you can call it that, is shown in Algorithm \ref{alg:wolff-field}.

\begin{figure}
  \begin{algorithm}[H]
    \begin{algorithmic}
      \REQUIRE $s\in S^{n+1}$
      \STATE \textbf{let} $q$ be an empty stack
      \STATE \textbf{let} $i_0\in V$
      \STATE $q.\mathtt{push}(i_0)$
      \STATE $\sigma\gets s_{i_0}$
      \WHILE {$q$ is not empty}
        \STATE $i\gets q.\mathtt{pop}$
        \IF {$s_i=\sigma$}
          \FORALL {$j$ such that $\{i,j\}\in\tilde E$}
            \IF {$1-\exp(-2\beta \tilde J_{ij}s_is_j)>\mathop{\mathtt{UniformRandom}}(0,1)$}
              \STATE $q.\mathtt{push}(j)$
            \ENDIF
          \ENDFOR
          \STATE $s_i\gets-s_i$
        \ENDIF
      \ENDWHILE
    \end{algorithmic}
    \caption{Wolff (Nonzero Field)}
    \label{alg:wolff-field}
  \end{algorithm}
\end{figure}

At sufficiently small $H$, both our algorithm and the hybrid Wolff--Metropolis reduce to the ordinary Wolff algorithm and share its runtime properties. At very large $H$, the hybrid Wolff--Metropolis behaves like a single-spin-flip Metropolis algorithm: clusters are almost always abandoned after the first spin is added, and that spin is flipped with probability $\sim e^{-2\beta H}$, since the energy is dominated by contributions from the field.

We measured the autocorrelation time $\tau$ of the internal energy $\H$ of a square-lattice Ising model ($J=1$ and $E$ the set of nearest-neighbor pairs) resulting from each of these three algorithms at various fields and temperatures. This was done using a batch-mean estimator. Time was measured as ``the number of spins that the algorithm has attempted to flip.'' For example, every Metropolis--Hastings step takes unit time, every Wolff step takes time equal to the number of spins in the flipping cluster, and every hybrid Wolff/Metropolis--Hastings step takes time equal to the number of spins in the cluster the step attempted to flip, whether it succeeds or not. The autocorrelation time as a function of the reduced field $\beta H$ is shown in Fig.~\ref{fig:correlation}. By this measure of time, the modified Wolff algorithm thermalizes faster than both the hybrid Wolff/Metropolis and ordinary Metropolis--Hastings in all regimes.
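For concreteness, the following is a minimal Python sketch of a single update of Algorithm~\ref{alg:wolff-field}, specialized to a periodic square lattice with uniform coupling $J\geq0$ and field $H\geq0$; the function and variable names are ours and purely illustrative.
\begin{verbatim}
import numpy as np

def wolff_field_step(s, s0, beta, J=1.0, H=0.0, rng=None):
    """One ghost-spin Wolff cluster flip on an L x L
    periodic square lattice, assuming J, H >= 0.  s is the
    array of lattice spins and s0 the ghost spin."""
    rng = rng or np.random.default_rng()
    L = s.shape[0]
    p_bond = 1.0 - np.exp(-2.0 * beta * J)   # lattice bonds
    p_ghost = 1.0 - np.exp(-2.0 * beta * H)  # ghost bonds
    i0 = tuple(rng.integers(L, size=2))
    sigma = s[i0]        # orientation of the seed spin
    s[i0] = -sigma       # spins are flipped as they join
    stack = [i0]
    while stack:
        i, j = stack.pop()
        # attempt the four lattice bonds of site (i, j)
        for k in ((i + 1) % L, j), ((i - 1) % L, j), \
                 (i, (j + 1) % L), (i, (j - 1) % L):
            if s[k] == sigma and rng.random() < p_bond:
                s[k] = -sigma
                stack.append(k)
        # attempt the bond from (i, j) to the ghost spin
        if s0 == sigma and rng.random() < p_ghost:
            s0 = -sigma
            # the ghost neighbors every site, so attempt
            # its remaining bonds right away
            for k in zip(*np.nonzero(s == sigma)):
                if rng.random() < p_ghost:
                    s[k] = -sigma
                    stack.append(k)
    return s, s0
\end{verbatim}
When $H=0$ the ghost bond is never activated and this reduces to the ordinary Wolff update; estimates of observables of the original system are accumulated from $s_0\times(s_1,\ldots,s_n)$ as described above.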
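The batch-mean estimate of $\tau$ referred to above can be sketched as follows; this is one common convention, it assumes the measurement series is much longer than the correlation time, and the number of batches is an arbitrary illustrative choice.
\begin{verbatim}
import numpy as np

def batch_mean_tau(x, n_batches=32):
    """Batch-mean estimate of the integrated autocorrelation
    time of a measurement series x (e.g. the energy, recorded
    at regular intervals of attempted spin flips)."""
    x = np.asarray(x, dtype=float)
    m = len(x) // n_batches        # samples per batch
    xb = x[:m * n_batches].reshape(n_batches, m)
    means = xb.mean(axis=1)
    # var(batch mean) ~ (2 tau / m) var(sample) for m >> tau
    return 0.5 * m * means.var(ddof=1) / x.var(ddof=1)
\end{verbatim}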
At high field ($\beta H\sim1$), the new algorithm has an autocorrelation time of roughly $n$, while both Metropolis--Hastings and the hybrid algorithm have an autocorrelation time of $2n$. At low temperature, neither the new algorithm nor Metropolis--Hastings has an autocorrelation time that depends on the field, but the hybrid method performs very poorly at intermediate values of the field, in a way that becomes worse with system size and that scales worse than the hybrid algorithm does even at the critical temperature. At high temperature, all three algorithms perform similarly and in a way that is independent of system size. At the critical temperature, both the Metropolis--Hastings and hybrid algorithms experience critical slowing down in a similar way, whereas the slowing down of the new algorithm is far less severe: it always performs more efficiently at nonzero field than the zero-field Wolff algorithm.

\begin{figure}
  \centering
  \input{fig_correlation}
  \caption{The autocorrelation time $\tau$ of the internal energy $\H$ for an $n=32\times32$ square-lattice Ising model simulated by the three nonzero-field methods detailed in this paper. The top plot is for simulations at high temperature, $T=5.80225$, the middle plot is for simulations at the critical temperature, $T=2.26919$, and the bottom plot is for simulations at low temperature.}
  \label{fig:correlation}
\end{figure}

\begin{figure}
  \centering
  \input{fig_correlation_collapse-hL}
\end{figure}

\section{Magnetization Estimator}

At any system size, the ensemble average magnetization $\avg M$ is identically zero at zero field. However, this is not what is observed at low temperature. This is because in the low-temperature phase the ergodic hypothesis---that the time-average value of observables is equal to their ensemble average---is violated. As the system size grows, the likelihood that any reasonable dynamics produces a fluctuation flipping the magnetization from one direction to the other becomes vanishingly small, and therefore it is inappropriate to estimate expected values in the low-temperature phase by averaging over the whole configuration space. Instead, values must be estimated by averaging over the portion of configuration space that is accessible to the dynamics.

For finite-size systems, like any we would simulate, dynamics at zero field or even at small nonzero field do allow the whole configuration space to be explored. However, one usually wants to use the results from finite-size systems to estimate expected values in the thermodynamic limit, where this is no longer true. At zero field, for instance, it is common practice to use $\avg{|M|}$ instead of $\avg M$ to estimate the expected value of the magnetization. But what to do at finite field? And is this approach justified? Since, in the thermodynamic limit, expected values are given by an average over a restricted configuration space, we can estimate those expected values at finite size by making the same restriction. Define the reduced configuration spaces
\begin{align}
  S_\e^n&=\{s\in S^n\mid \sgn(H)M(s)>0\} \\
  S_\m^n&=\{s\in S^n\mid \sgn(H)M(s)<0\} \\
  S_0^n&=\{s\in S^n\mid \sgn(H)M(s)=0\}
\end{align}
where
\[
  \sgn(H)=\begin{cases}1&H\geq0\\-1&H<0.\end{cases}
\]
Clearly $S^n=S^n_\e\cup S^n_\m\cup S^n_0$, which implies that $Z=Z_\e+Z_\m+Z_0$ for the correspondingly restricted partition functions. Since $S^n_\e=\{-s\mid s\in S^n_\m\}$, $|S^n_\e|=|S^n_\m|$. At zero field, $\H(s)=\H(-s)$, and therefore $Z_\e=Z_\m$.
This can be used to show the near-equivalence (at zero field) of taking the expectation value $\avg{|M|}$ of the absolute value of the magnetization and taking the expectation value $\eavg M$ of the magnetization on the reduced configuration space, since
\begin{align}
  \avg{|M|}
  &=\frac1Z\sum_{s\in S^n}e^{-\beta\H(s)}|M(s)|\\
  &=\frac1{Z_\e+Z_\m+Z_0}\bigg(\sum_{s\in S^n_\e}e^{-\beta\H(s)}|M(s)|+
  \sum_{s\in S^n_\m}e^{-\beta\H(s)}|M(s)|+\sum_{s\in S^n_0}e^{-\beta\H(s)}|M(s)|\bigg)\\
  &=\frac1{2Z_\e+Z_0}\bigg(\sum_{s\in S^n_\e}e^{-\beta\H(s)}M(s)+
  \sum_{s\in S^n_\e}e^{-\beta\H(-s)}|M(-s)|\bigg)\\
  &=\frac2{2Z_\e+Z_0}\sum_{s\in S^n_\e}e^{-\beta\H(s)}M(s)\\
  &=\frac1{1+\frac{Z_0}{2Z_\e}}\eavg M.
\end{align}
At infinite temperature, $Z_0/Z_\e\simeq n^{-1/2}\sim L^{-1}$ for a large square system of $n=L\times L$ sites. At any finite temperature, and especially in the ferromagnetic phase, $Z_0\ll Z_\e$ in a much more extreme way.

If the ensemble average over only positive magnetizations can be said to converge to the equilibrium state of the Ising model in the thermodynamic limit, what of the average over only \emph{negative} magnetizations, that is, over the ordinarily unphysical space $S^n_\m$? This is, in one sense, precisely the definition of the metastable state of the Ising model. Expected values of observables in the metastable state can therefore be computed by considering this reduced ensemble.

How, in practice, are these samplings over reduced ensembles performed? There may be efficient algorithms for sampling directly within a particular reduced ensemble, but for finite-size systems any of the algorithms described here explores the whole configuration space. Any of them can nonetheless be used to sample all of the reduced configuration spaces simultaneously, by letting it sample the whole space and only adding a contribution to a given average when the sampled state lies in the reduced space of interest.

\begin{acknowledgments}
  Thanks!
\end{acknowledgments}

\bibliography{monte-carlo}

\end{document}