%
% Created by Jaron Kent-Dobias on Thu Apr 20 12:50:56 EDT 2017.
% Copyright (c) 2017 Jaron Kent-Dobias. All rights reserved.
%
\documentclass[aps,prl,reprint]{revtex4-1}

\usepackage[utf8]{inputenc}
\usepackage{amsmath,amssymb,latexsym,mathtools,xifthen}
\usepackage{algcompatible}
\usepackage{algorithm}

% uncomment to label only equations that are referenced in the text
%\mathtoolsset{showonlyrefs=true}

% I want labels but don't want to type out ``equation''
\def\[{\begin{equation}}
\def\]{\end{equation}}

% math not built-in
\def\arcsinh{\mathop{\mathrm{arcsinh}}\nolimits}
\def\arccosh{\mathop{\mathrm{arccosh}}\nolimits}
\def\ei{\mathop{\mathrm{Ei}}\nolimits} % exponential integral Ei
\def\re{\mathop{\mathrm{Re}}\nolimits}
\def\im{\mathop{\mathrm{Im}}\nolimits}
\def\sgn{\mathop{\mathrm{sgn}}\nolimits}
\def\dd{d} % differential
\def\O{O} % big O
\def\o{o} % little O
\def\Z{\mathbb Z}
\def\R{\mathbb R}
\def\G{S}

% subscript for ``critical'' values, e.g., T_\c
\def\c{\mathrm c}

% scaling functions
\def\fM{\mathcal M} % magnetization
\def\fX{\mathcal Y} % susceptibility
\def\fF{\mathcal F} % free energy
\def\fiF{\mathcal H} % imaginary free energy
\def\fS{\mathcal S} % surface tension
\def\fG{\mathcal G} % exponential factor

\def\H{\mathcal H}

% lattice types
\def\sq{\mathrm{sq}}
\def\tri{\mathrm{tri}}
\def\hex{\mathrm{hex}}

% dimensions
\def\dim{d}
\def\twodee{\textsc{2d} }
\def\threedee{\textsc{3d} }
\def\fourdee{\textsc{4d} }

% fancy partial derivative
\newcommand\pd[3][]{
  \ifthenelse{\isempty{#1}}
  {\def\tmp{}}
  {\def\tmp{^#1}}
  \frac{\partial\tmp#2}{\partial#3\tmp}
}

\newcommand\nn[2]{\langle#1#2\rangle}
\newcommand\avg[1]{\langle#1\rangle}
\newcommand\eavg[1]{\langle#1\rangle_{\mathrm e}}
\newcommand\mavg[1]{\langle#1\rangle_{\mathrm m}}
\def\e{\mathrm e}
\def\m{\mathrm m}

\newcommand\new[1]{\tilde{#1}}
\renewcommand\vec[1]{\mathbf{#1}}
\newcommand\unit[1]{\hat{\mathbf{#1}}}

% used to reformat display math to fit in two-column ``reprint'' mode
\makeatletter
\newif\ifreprint
\@ifclasswith{revtex4-1}{reprint}{\reprinttrue}{\reprintfalse}
\makeatother

\begin{document}

\title{An efficient Wolff algorithm in an external magnetic field}

\author{Jaron Kent-Dobias}
\author{James P.~Sethna}
\affiliation{Laboratory of Atomic and Solid State Physics, Cornell University, Ithaca, NY, USA}

\date\today

\begin{abstract}
  We introduce an extension of the Wolff algorithm that performs efficiently in an external magnetic field. Near the Ising critical point, the correlation time of our algorithm has a conventional scaling form that reduces to that of the Wolff algorithm at zero field and becomes more efficient at any nonzero field. As an application, we directly measure scaling functions of observables in the metastable state of the 2D Ising model.
\end{abstract}

\maketitle

The Ising model is a simple model of a magnet composed of discrete, locally interacting, one-component spins. As for most systems in statistical mechanics, its ensemble properties cannot be computed exactly and are often estimated using Monte Carlo methods. Near its continuous phase transition, the divergent correlation length leads to a divergent correlation time in any locally updating algorithm, hampering computation. With no external field, this critical slowing-down is largely alleviated by cluster algorithms---the most efficient of which is the Wolff algorithm---whose dynamics are nonlocal: each step flips a cluster of spins whose average size diverges with the correlation length.
While less efficient cluster algorithms, like Swendsen--Wang, have been modified to perform at nonzero field, the Wolff algorithm itself works only at zero field. We describe an extension of the Wolff algorithm that works in an arbitrary external field while preserving Wolff's efficiency throughout the entire temperature--field parameter space.

The Wolff algorithm works by first choosing a random spin and adding it to an empty cluster. Every neighbor of that spin that points in the same direction as the spin is also added to the cluster with probability $1-e^{-2\beta J}$, where $\beta=1/T$ is the inverse temperature and $J$ is the coupling between sites. This process is repeated for the neighbors of every spin added to the cluster. When all sites surrounding the cluster have been exhausted, the cluster is flipped.

Our algorithm is a simple extension of this. An extra spin---often referred to as the ``ghost spin''---is introduced and made a nearest neighbor of all others with coupling $|H|$, the magnitude of the external field. The traditional Wolff algorithm is then performed on this extended lattice exactly as described above, with the extra spin treated no differently from any other, i.e., allowed to be added to clusters and subsequently flipped. Observables of the original system can be estimated using the new one through a simple exact correspondence.

This paper is divided into three sections. First, the Ising model (and our notation for it) is introduced, along with extant Monte Carlo algorithms. Second, we introduce our algorithm in detail and compare its efficiency with the existing ones. Finally, we use the new algorithm to directly measure critical scaling functions for observables of the 2D Ising model in its metastable state.

\section{Introduction}

Consider an undirected graph $G=(V,E)$ describing a network of interacting spins. The set of vertices $V=\{1,\ldots,n\}$ enumerates the sites of the network, and the set of edges $E$ describes connections between neighboring sites. On each site is a spin that can take any value from the set $S=\{-1,1\}$. The state of the system is described by a function $s:V\to S$, leading to a configuration space of all possible states $S^n=S\times\cdots\times S$. The Hamiltonian $\H:S^n\to\R$ gives the energy of a particular state $s\in S^n$ and is defined by
\[
  \H(s)=-\sum_{\{i,j\}\in E}J_{ij}s_is_j-HM(s),
  \label{eq:ham}
\]
where $H$ is the external magnetic field, $J:E\to\R$ gives the coupling between spins on neighboring sites, and $M:S^n\to\R$ is the magnetization of the system, defined for a state $s\in S^n$ by
\[
  M(s)=\sum_{i\in V}s_i.
\]
For the purpose of this study we consider only ferromagnetic systems, where the function $J$ is nonnegative.

An observable of the system is a function $A:S^n\to\R$ depending on the system's state. Both the Hamiltonian and the magnetization defined above are observables. Assuming the ergodic hypothesis holds, the expected value $\avg A$ of any observable $A$ is defined by its average over every state $s$ in the configuration space $S^n$ weighted by the Boltzmann factor $e^{-\beta\H(s)}$, or
\[
  \avg A=\frac1Z\sum_{s\in S^n}e^{-\beta\H(s)}A(s),
  \label{eq:avg.obs}
\]
where the partition function $Z$ is defined by
\[
  Z=\sum_{s\in S^n}e^{-\beta\H(s)}
\]
and gives the correct normalization for the weighted sum. Unfortunately, the sum over configurations in \eqref{eq:avg.obs} is intractable for all but very small systems, since the number of terms grows as $2^n$. Expectation values must therefore be approximated.
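For very small systems the sums above can be evaluated by direct enumeration. The following minimal sketch (in Python, assuming for concreteness a periodic chain of $n$ spins with uniform coupling; it is an illustration only, not the code used in this work) computes $\avg A$ directly from \eqref{eq:avg.obs} and makes the exponential cost of the exact sum apparent:
\begin{verbatim}
import itertools, math

def exact_average(A, n, J=1.0, H=0.0, beta=1.0):
    # direct evaluation of <A> by summing over all
    # 2^n states of a ferromagnetic ring
    Z, total = 0.0, 0.0
    for s in itertools.product((-1, 1), repeat=n):
        bonds = sum(s[i] * s[(i + 1) % n]
                    for i in range(n))
        M = sum(s)
        weight = math.exp(-beta * (-J * bonds - H * M))
        Z += weight
        total += weight * A(s)
    return total / Z

# magnetization per spin of a 10-spin ring
print(exact_average(lambda s: sum(s) / len(s),
                    10, H=0.1))
\end{verbatim}
Already at $n=30$ the sum contains more than $10^9$ terms.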
Monte Carlo methods are a common way of approximating such expectation values. These methods sample states $s$ from the configuration space $S^n$ according to the Boltzmann distribution $e^{-\beta\H(s)}$, so that averages of observables over the sampled states asymptotically approach the true expected values.

The Metropolis--Hastings algorithm \cite{metropolis1953equation,hastings1970monte} is very popular for systems in statistical physics. A starting state $s\in S^n$ is randomly perturbed to a state $s'$, usually by flipping one spin. The change in energy $\Delta\H=\H(s')-\H(s)$ due to the perturbation is then computed. If the change is negative, the perturbed state $s'$ is accepted as the new state; otherwise it is accepted with probability $e^{-\beta\Delta\H}$. This process is repeated indefinitely, and states are recorded at intervals long enough that successive samples are uncorrelated, e.g., at separations larger than the correlation time $\tau$. Metropolis--Hastings is very general, but unless the perturbations are very carefully chosen the algorithm performs poorly in regimes where large correlations are present in the system, for instance near continuous phase transitions. There it suffers from what is known as critical slowing-down: likely states consist of large correlated clusters that require many perturbations to move between in configuration space.

%\begin{figure}
%  \begin{algorithm}[H]
%    \begin{algorithmic}
%      \REQUIRE $s\in S^n$
%      \STATE $s'\gets$ \texttt{Perturb}($s$)
%      \STATE $\Delta\H\gets\H(s')-\H(s)$
%      \IF {$\exp(-\beta\Delta\H)>$ \texttt{UniformRandom}$(0,1)$}
%        \STATE $s\gets s'$
%      \ENDIF
%    \end{algorithmic}
%    \caption{Metropolis--Hastings}
%    \label{alg:met-hast}
%  \end{algorithm}
%\end{figure}

The Wolff algorithm \cite{wolff1989collective} solves many of these problems, but only at zero external field, $H=0$. It avoids critical slowing-down by flipping carefully constructed clusters of spins at once, in a way that samples highly correlated states quickly while always accepting prospective states. A random site $i$ is selected from the graph and its spin is flipped. Each of the site's neighbors $j$ is also flipped with probability $1-e^{-2\beta J_{ij}}$ if doing so would lower the energy of the bond $i$--$j$. The process is repeated with every neighbor that was flipped. Measured values of the dynamic exponent $z$, which characterizes how the correlation time grows with the correlation length at the critical point, are $z=0.29(1)$ for the Wolff algorithm \cite{wolff1989comparison,liu2014dynamic} and $z=0.35(1)$ for Swendsen--Wang \cite{swendsen1987nonuniversal}.

While the Wolff algorithm successfully addresses critical slowing-down at zero field, it does not directly apply at nonzero field. A common extension is the hybrid Wolff--Metropolis, in which a Wolff cluster is formed using the ordinary prescription and then accepted or rejected in the Metropolis style based on the change in energy due to the external field, $-H\Delta M$. Because each spin has the same coupling to the external field, a strictly more efficient but exactly equivalent version of this hybrid is made by distributing the cluster acceptance or rejection across the spins as they are added: whenever the cluster being built is oriented along the field, so that flipping it costs field energy, construction is abandoned with probability $1-e^{-2\beta|H|}$ each time a spin is added.
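For concreteness, a single zero-field Wolff update might be implemented as in the following minimal sketch (in Python, assuming a periodic $L\times L$ square lattice with uniform coupling $J$ and the state stored as a flat list of $\pm1$ values; an illustration only, not the implementation used for the measurements below):
\begin{verbatim}
import math, random

def wolff_update(s, L, J=1.0, beta=1.0):
    # one zero-field Wolff cluster flip on a
    # periodic L x L square lattice
    p_add = 1.0 - math.exp(-2.0 * beta * J)
    seed = random.randrange(L * L)
    sigma = s[seed]    # orientation of the cluster
    s[seed] = -sigma   # flip spins as they are added
    stack = [seed]
    size = 1
    while stack:
        i = stack.pop()
        x, y = i % L, i // L
        neighbors = ((x + 1) % L + y * L,
                     (x - 1) % L + y * L,
                     x + ((y + 1) % L) * L,
                     x + ((y - 1) % L) * L)
        for j in neighbors:
            # aligned neighbors join the cluster
            # with probability 1 - e^{-2 beta J}
            if s[j] == sigma and random.random() < p_add:
                s[j] = -sigma
                stack.append(j)
                size += 1
    return size        # the "time" taken by this step
\end{verbatim}
The extension described in the next section amounts to giving every spin one additional neighbor---the ghost site---with coupling $|H|$.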
%\begin{figure}
%  \begin{algorithm}[H]
%    \begin{algorithmic}
%      \REQUIRE $s\in\G^n$
%      \STATE \textbf{let} $q$ be an empty stack
%      \STATE \textbf{let} $i_0\in V$
%      \STATE $q.\mathtt{push}(i_0)$
%      \STATE $\sigma\gets s_{i_0}$
%      \WHILE {$q$ is not empty}
%        \STATE $i\gets q.\mathtt{pop}$
%        \IF {$s_i=\sigma$}
%          \FORALL {$j$ such that $\{i,j\}\in E$}
%            \IF {$1-\exp(-2\beta J_{ij}s_is_j)>\mathop{\mathtt{UniformRandom}}(0,1)$}
%              \STATE $q.\mathtt{push}(j)$
%            \ENDIF
%          \ENDFOR
%          \STATE $s_i\gets-s_i$
%        \ENDIF
%      \ENDWHILE
%    \end{algorithmic}
%    \caption{Wolff (Zero-Field)}
%    \label{alg:wolff}
%  \end{algorithm}
%\end{figure}

%\begin{figure}
%  \begin{algorithm}[H]
%    \begin{algorithmic}
%      \REQUIRE $s\in\G^n$
%      \STATE $s'\gets s$
%      \STATE \textbf{let} $q$ be an empty stack
%      \STATE \textbf{let} $i_0\in V$
%      \STATE $q.\mathtt{push}(i_0)$
%      \STATE $\sigma\gets s_{i_0}'$
%      \STATE \texttt{completed} $\gets$ \textbf{true}
%      \WHILE {$q$ is not empty}
%        \STATE $i\gets q.\mathtt{pop}$
%        \IF {$s_i'=\sigma$}
%          \FORALL {$j$ such that $\{i,j\}\in E$}
%            \IF {$1-\exp(-2\beta J_{ij}s_i's_j')>\mathop{\mathtt{UniformRandom}}(0,1)$}
%              \STATE $q.\mathtt{push}(j)$
%            \ENDIF
%          \ENDFOR
%          \STATE $s_i'\gets-s_i'$
%          \STATE $q.\mathtt{push}(i)$
%          \IF {$1-\exp(-2\beta\sigma H)>\mathop{\mathtt{UniformRandom}}(0,1)$}
%            \STATE \texttt{completed} $\gets$ \textbf{false}
%            \STATE \textbf{break}
%          \ENDIF
%        \ENDIF
%      \ENDWHILE
%      \IF {completed}
%        $s\gets s'$
%      \ENDIF
%    \end{algorithmic}
%    \caption{Hybrid Wolff/Metropolis--Hastings}
%    \label{alg:h.wolff}
%  \end{algorithm}
%\end{figure}

\section{Cluster-flip in a Field}

Consider the new graph $\tilde G=(\tilde V,\tilde E)$ defined from $G$ by
\begin{align}
  \tilde V&=\{0\}\cup V\\
  \tilde E&=E\cup\{\{0,i\}\mid i\in V\},
\end{align}
that is, by adding a zeroth vertex and an edge from every other vertex to the new one. The spin at site zero is often known as a ``ghost spin.'' The network of spins described by this graph now has states that occupy the configuration space $S^{n+1}$. Extend the coupling between spins to $\tilde J:\tilde E\to\R$ by
\[
  \tilde J(e)=\begin{cases}
    J(e) & e\in E\\
    |H| & \text{otherwise},
  \end{cases}
\]
so that the original spins are coupled to each other exactly as before, and each is also coupled to the new spin with strength $|H|$, the magnitude of the external field. Now define a Hamiltonian $\tilde\H:S^{n+1}\to\R$ on this new, larger configuration space by
\[
  \tilde\H(s)=-\sum_{\{i,j\}\in\tilde E}\tilde J_{ij}s_is_j.
  \label{eq:new.h.simple}
\]
This new Hamiltonian resembles the old one, but consists only of spin--spin interactions, with no external field. However, by separating the terms that involve the ghost spin from the rest, we may equivalently write
\[
  \tilde\H(s)=-\sum_{\{i,j\}\in E}J_{ij}s_is_j-H\tilde M(s),
\]
where the new magnetization $\tilde M:S^{n+1}\to\R$ is defined for $s\in S^{n+1}$ by
\[
  \begin{aligned}
    \tilde M(s)
    &=\sgn(H)s_0\sum_{i\in V}s_i\\
    &=M(s_0\sgn(H)(s_1,\ldots,s_n)).
  \end{aligned}
\]
In fact, any observable $A$ of the original system can be written as an observable $\tilde A$ of the new system by defining
\[
  \tilde A(s)=A(s_0\sgn(H)(s_1,\ldots,s_n))
\]
such that $\avg{\tilde A}=\avg A$. This can be seen readily by using the symmetry $\tilde\H(-s)=\tilde\H(s)$ of the Hamiltonian \eqref{eq:new.h.simple}, the invariance of configuration-space sums under negation of their summand, and the fact that $\tilde\H(1,\sgn(H)(s_1,\ldots,s_n))=\H(s)$ for any $s\in S^n$.
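Explicitly, splitting the sum over $S^{n+1}$ according to the value of the ghost spin $s_0$, folding the $s_0=-1$ half onto the $s_0=1$ half using these symmetries, and then relabeling $(s_1,\ldots,s_n)\to\sgn(H)(s_1,\ldots,s_n)$ gives
\[
  \begin{aligned}
    \sum_{s\in S^{n+1}}e^{-\beta\tilde\H(s)}\tilde A(s)
    &=2\sum_{s\in S^n}e^{-\beta\tilde\H(1,s)}\tilde A(1,s)\\
    &=2\sum_{s\in S^n}e^{-\beta\H(s)}A(s).
  \end{aligned}
\]
The same manipulation applied to $A=1$ gives $\tilde Z=2Z$, and dividing the two establishes $\avg{\tilde A}=\avg A$.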
Notice in particular that this is true of the Hamiltonian itself: $\tilde\H$ is exactly the observable of the new system that corresponds to $\H$ in the old one. Our new spin system with one additional spin is simply a ferromagnetic spin system in the absence of an external field. Therefore, the Wolff algorithm can be applied to it with absolutely no modifications. Since there is an exact correspondence between the expectation values of our ordinary spin system in a field and their appropriately defined counterparts in the new system, sampling Monte Carlo simulations of the new system allows us to estimate expectation values for the old one. This ``new'' algorithm, if you can call it that, is shown in Algorithm \ref{alg:wolff-field}.

\begin{figure}
  \begin{algorithm}[H]
    \begin{algorithmic}
      \REQUIRE $s\in S^{n+1}$
      \STATE \textbf{let} $q$ be an empty stack
      \STATE \textbf{let} $i_0\in V$
      \STATE $q.\mathtt{push}(i_0)$
      \STATE $\sigma\gets s_{i_0}$
      \WHILE {$q$ is not empty}
        \STATE $i\gets q.\mathtt{pop}$
        \IF {$s_i=\sigma$}
          \FORALL {$j$ such that $\{i,j\}\in\tilde E$}
            \IF {$1-\exp(-2\beta\tilde J_{ij}s_is_j)>\mathop{\mathtt{UniformRandom}}(0,1)$}
              \STATE $q.\mathtt{push}(j)$
            \ENDIF
          \ENDFOR
          \STATE $s_i\gets-s_i$
        \ENDIF
      \ENDWHILE
    \end{algorithmic}
    \caption{Wolff (Nonzero Field)}
    \label{alg:wolff-field}
  \end{algorithm}
\end{figure}

At sufficiently small $|H|$, both our algorithm and the hybrid Wolff--Metropolis reduce to the ordinary Wolff algorithm and share its runtime properties. At very large $|H|$, the hybrid Wolff--Metropolis behaves exactly like Metropolis--Hastings: nearly every cluster consists of a single spin, which is flipped with probability $\sim e^{-2\beta H}$, since the energy is dominated by contributions from the field.

We measured the autocorrelation time $\tau$ of the internal energy $\H$ of a square-lattice Ising model ($J=1$ and $E$ the set of nearest-neighbor pairs) resulting from each of these three algorithms at various fields and temperatures, using the initial convex sequence estimator described in \cite{geyer1992practical}. Time was measured as ``the number of spins that the algorithm has attempted to flip'': every Metropolis--Hastings step takes unit time, every Wolff step takes time equal to the number of spins in the flipped cluster, and every hybrid Wolff/Metropolis--Hastings step takes time equal to the number of spins in the cluster it attempted to flip, whether the flip succeeds or not. The autocorrelation time as a function of reduced field $\beta H$ is shown in Fig.~\ref{fig:correlation}.

By this measure of time, the modified Wolff algorithm thermalizes faster than both the hybrid Wolff/Metropolis and ordinary Metropolis--Hastings in all regimes. At high field ($\beta H\sim1$) the new algorithm has an autocorrelation time of roughly $n$, while both Metropolis--Hastings and the hybrid algorithm have an autocorrelation time of $2n$. At low temperature, neither the new algorithm nor Metropolis--Hastings has an autocorrelation time that depends on field, but the hybrid method performs very poorly at intermediate values of field in a way that becomes worse with system size, scaling worse with size than the hybrid algorithm does even at the critical temperature. At high temperature, all three algorithms perform similarly and in a way that is independent of system size.
At the critical temperature, both the Metropolis--Hastings and hybrid algorithms experience critical slowing-down in a similar way, whereas the new algorithm slows down far less severely, always performing more efficiently at nonzero field than the zero-field Wolff algorithm.

\begin{figure}
  \centering
  \input{fig_correlation}
  \caption{The autocorrelation time $\tau$ of the internal energy $\H$ for an $n=32\times32$ square-lattice Ising model simulated by the three nonzero-field methods detailed in this paper. The top plot is for simulations at high temperature, $T=5.80225$, the middle plot is for simulations at the critical temperature, $T=2.26919$, and the bottom plot is for simulations at low temperature.}
  \label{fig:correlation}
\end{figure}

\begin{figure}
  \centering
  \input{fig_correlation_collapse-hL}
\end{figure}

Our algorithm for the Ising model in a field can be generalized to the $q$-state Potts or $O(n)$ models in exactly the same way as the conventional Wolff algorithm.

\section{Magnetization Estimator}

At any size, the ensemble-average magnetization $\avg M$ is identically zero at zero field. However, this is not what is observed at low temperature. This is because in the low-temperature phase the ergodic hypothesis---that the time average of observables equals their ensemble average---is violated. As the system size grows, the likelihood of a fluctuation in any reasonable dynamics that flips the magnetization from one direction to the other becomes vanishingly small, and it is therefore inappropriate to estimate expected values in the low-temperature phase by averaging over the whole configuration space. Instead, values must be estimated by averaging over the portion of configuration space that is accessible to the dynamics.

For finite-size systems, like any we would simulate, dynamics at zero field or even small nonzero field do allow the whole configuration space to be explored. However, results from finite-size systems are usually used to estimate expected values in the thermodynamic limit, where this is no longer true. At zero field, for instance, it is common practice to use $\avg{|M|}$ rather than $\avg M$ to estimate the expected magnetization. But what should be done at finite field? Is this approach justified?

The origin of the restricted configuration space in the thermodynamic limit is the fact that, at criticality or at nonzero $H$ below the critical temperature, an infinite cluster forms and can never be flipped by the dynamics. In a finite system there is no infinite cluster, and even the largest cluster has a nonzero probability of being flipped. However, we still want to be able to use finite systems to estimate the quantities of infinite ones. We approach this by thinking of a finite system as a piece of an infinite one. Clearly, time averages of quantities in the chunk are equal to time averages in the whole system, simply with much slower convergence. However, simulation of finite systems presents a problem: in each sample we are given a snapshot of a chunk of the system, but we do not know which direction the infinite cluster in the surrounding infinite system is pointing. Therefore, we must make a guess in such a way that, in the limit of large system size, the probability that our guess is wrong---that the infinite cluster points opposite the direction we expected---approaches zero.
To estimate the equilibrium values of quantities in the infinite system, we admit only values from states whose corresponding infinite cluster is expected to point along the external field. If the external field is zero, we choose one direction at random, say positive. Snapshots of the chunk that we suspect are likely to be pieces of an infinite system whose infinite cluster faces opposite the external field are discarded for the purpose of computing equilibrium values, but they still represent something physical: when the model's state contains an infinite cluster oriented against the external field, it is known as \emph{metastable}. Therefore, in a finite simulation one can make convergent estimates of quantities in both the equilibrium and metastable states of the infinite system by dividing sampled states into two sets: those that are likely snapshots of a finite block in an infinite system whose infinite cluster is oriented with the field, and those whose infinite cluster is oriented against the field.

All we must do now is specify how to distinguish between ``likely oriented with'' and ``likely oriented against.'' Others have used the direction of the largest cluster in the finite system to make this distinction. Far from the critical point in the low-temperature phase this does a clear and consistent job. However, we would like to study systems very near the critical point. We will take the likely direction of the infinite cluster to be given by \emph{the direction in which the system is magnetized}. Clearly, as our finite chunk grows larger and larger, the probability that its magnetization differs from the direction of the infinite cluster goes to zero.

Since, in the thermodynamic limit, expected values are given by an average over a restricted configuration space, we can estimate those expected values at finite size by making the same restriction. Define the reduced configuration spaces
\begin{align}
  S_\e^n&=\{s\in S^n\mid \sgn(H)M(s)>0\} \\
  S_\m^n&=\{s\in S^n\mid \sgn(H)M(s)<0\} \\
  S_0^n&=\{s\in S^n\mid \sgn(H)M(s)=0\},
\end{align}
where
\[
  \sgn(H)=\begin{cases}1&H\geq0\\-1&H<0.\end{cases}
\]
Clearly, $S^n=S^n_\e\cup S^n_\m\cup S^n_0$, which implies that $Z=Z_\e+Z_\m+Z_0$. Since $S^n_\e=\{-s\mid s\in S^n_\m\}$, $|S^n_\e|=|S^n_\m|$. At zero field, $\H(s)=\H(-s)$, and therefore $Z_\e=Z_\m$. This can be used to show the near-equivalence (at zero field) of taking expectation values $\avg{|M|}$ of the absolute value of the magnetization and taking expectation values $\eavg M$ of the magnetization on the reduced configuration space: states in $S^n_\m$ contribute to $\avg{|M|}$ exactly as their negations in $S^n_\e$ do, while states in $S^n_0$ contribute nothing, so that
\begin{align}
  \avg{|M|}=\frac1{1+Z_0/2Z_\e}\eavg M.
\end{align}
At infinite temperature, $Z_0/Z_\e\simeq n^{-1/2}\sim L^{-1}$ for a square lattice of $n=L^2$ sites. At any finite temperature, and especially in the ferromagnetic phase, $Z_0\ll Z_\e$ in a much more extreme way.

If the ensemble average over only positive magnetizations can be said to converge to the equilibrium state of the Ising model in the thermodynamic limit, what of the average over only \emph{negative} magnetizations, the space $S^n_\m$, ordinarily unphysical? This is, in one sense, precisely the definition of the metastable state of the Ising model. Expected values of observables in the metastable state can therefore be computed by considering this reduced ensemble. How, in practice, are these samplings over reduced ensembles performed?
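In the simplest approach, each sampled state is assigned to $S^n_\e$, $S^n_\m$, or $S^n_0$ according to the sign of $\sgn(H)M(s)$, and running averages are kept separately for each ensemble. A minimal sketch of this bookkeeping follows (in Python; the stream of sampled states and the observable are placeholders, and the states are assumed to have been drawn from the full Boltzmann distribution by any of the samplers above):
\begin{verbatim}
def reduced_averages(samples, A, H):
    # accumulate <A> separately over the equilibrium (e),
    # metastable (m), and zero-magnetization (0) ensembles,
    # given states sampled from the full distribution
    sums = {"e": 0.0, "m": 0.0, "0": 0.0}
    counts = {"e": 0, "m": 0, "0": 0}
    sgn_H = 1 if H >= 0 else -1
    for s in samples:
        M = sum(s)
        key = ("e" if sgn_H * M > 0
               else "m" if sgn_H * M < 0 else "0")
        sums[key] += A(s)
        counts[key] += 1
    return {k: sums[k] / counts[k]
            for k in sums if counts[k]}
\end{verbatim}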
There may be efficient algorithms for sampling directly within a particular reduced ensemble, but for finite-size systems any of the algorithms described here explores the whole configuration space. Any of them can therefore be used to sample every reduced configuration space simultaneously: the sampler is run on the whole space, and a contribution is added to a given average only when the sampled state lies in the reduced space of interest.

\begin{acknowledgments}
  Thanks!
\end{acknowledgments}

\bibliography{monte-carlo}

\end{document}