From 5cbe90854d1b3d784f956e3005e84b18cde9aeaa Mon Sep 17 00:00:00 2001 From: Jaron Kent-Dobias Date: Thu, 29 Mar 2018 13:31:29 -0400 Subject: lots of changes --- monte-carlo.tex | 739 +++++++++++++++++++++++--------------------------------- 1 file changed, 306 insertions(+), 433 deletions(-) (limited to 'monte-carlo.tex') diff --git a/monte-carlo.tex b/monte-carlo.tex index 7133461..929fc65 100644 --- a/monte-carlo.tex +++ b/monte-carlo.tex @@ -6,7 +6,7 @@ \documentclass[aps,prl,reprint]{revtex4-1} \usepackage[utf8]{inputenc} -\usepackage{amsmath,amssymb,latexsym,mathtools,xifthen} +\usepackage{amsmath,amssymb,latexsym,mathtools,xifthen,upgreek} \usepackage{algcompatible} \usepackage{algorithm} @@ -23,6 +23,7 @@ \def\ei{\mathop{\mathrm{Ei}}\nolimits} % exponential integral Ei \def\re{\mathop{\mathrm{Re}}\nolimits} \def\im{\mathop{\mathrm{Im}}\nolimits} +\def\tr{\mathop{\mathrm{Tr}}\nolimits} \def\sgn{\mathop{\mathrm{sgn}}\nolimits} \def\dd{d} % differential \def\O{O} % big O @@ -30,6 +31,8 @@ \def\Z{\mathbb Z} \def\R{\mathbb R} \def\G{S} +\def\X{\mathbb X} +\def\inf{\mathrm{inf}} % subscript for ``critical'' values, e.g., T_\c \def\c{\mathrm c} @@ -74,6 +77,9 @@ \renewcommand\vec[1]{\mathbf{#1}} \newcommand\unit[1]{\hat{\mathbf{#1}}} +\def\mcmc{\textsc{McMC}} +\def\vsigma{\pmb{\upsigma}} + % used to reformat display math to fit in two-column ``reprint'' mode \makeatletter \newif\ifreprint @@ -82,7 +88,7 @@ \begin{document} -\title{An efficient Wolff algorithm in an external magnetic field} +\title{An efficient cluster algorithm for spin systems in a symmetry-breaking field} \author{Jaron Kent-Dobias} \author{James P.~Sethna} \affiliation{Laboratory of Atomic and Solid State Physics, Cornell University, Ithaca, NY, USA} @@ -90,472 +96,339 @@ \date\today \begin{abstract} - We introduce an extension of the Wolff algorithm that preforms efficiently - in an external magnetic field. Near the Ising critical point, the - correlation time of our algorithm has a conventional scaling form that - reduces to that of the Wolff algorithm at zero field and becomes more efficient - at any nonzero field. As an application, we - directly measure scaling functions of observables in the metastable state of - the 2D Ising model. \end{abstract} \maketitle -The Ising model is a simple model of a magnet comprised of discrete locally -interacting one-component spins. Like most systems in statistical mechanics, -calculation of its ensemble properties cannot be done explicitly and is often -performed using Monte Carlo methods. Near its continuous phase transition, -divergent correlation length leads to divergent correlation time in any -locally-updating algorithm, hampering computation. With no external field, -this critical slowing-down is largely alleviated by cluster algorithms---the -most efficient of which is the Wolff algorithm---whose dynamics are nonlocal -since each step flips a cluster of spins whose average size diverges with the -correlation length. While less efficient cluster algorithms, like -Swendsen--Wang, have been modified to perform in nonzero field, Wolff -only works at zero field. We describe an extension of the Wolff -algorithm that works in arbitrary external field while preserving Wolff's efficiency throughout the entire temperature--field parameter -space. - -The Wolff algorithm works by first choosing a random spin and adding it to an -empty cluster. 
Every neighbor of that spin that is pointed in the same
direction as the spin is also added to the cluster with probability
$1-e^{-2\beta J}$, where $\beta=1/T$ is inverse temperature and $J$ is the
coupling between sites. This process is repeated for the neighbors of every
spin added to the cluster. When all sites surrounding the cluster have been
exhausted, the cluster is flipped. Our algorithm is a simple extension of
this. An extra spin---often referred to as the ``ghost spin''---is introduced
and made a nearest neighbor of all others with coupling $|H|$, the magnitude
of the external field. The traditional Wolff algorithm is then performed on
this new extended lattice exactly as described above, with the extra spin
treated no differently from any others, i.e., allowed to be added to clusters
and subsequently flipped. Observables in the original system can be exactly
estimated using the new one by a simple correspondence.

This paper is divided into three sections. First, the Ising model (and our
notation for it) is introduced, along with extant Monte Carlo algorithms.
Second, we introduce our algorithm in detail and compare its efficiency with
the existing ones. Finally, we use this new algorithm to directly measure
critical scaling functions for observables of the 2D Ising model in its
metastable state.

\section{Introduction}

Consider an undirected graph $G=(V,E)$ describing a network of interacting
spins. The set of vertices $V=\{1,\ldots,n\}$ enumerates the sites of the
network, and the set of edges $E$ describes connections between neighboring
sites. On each site is a spin that can take any value from the set
$S=\{-1,1\}$. The state of the system is described by a function $s:V\to S$,
leading to a configuration space of all possible states
$S^n=S\times\cdots\times S$. The Hamiltonian $\H:S^n\to\R$ gives the energy
of a particular state $s\in S^n$ and is defined by

Let $G=(V,E)$ be a graph, where the set of vertices $V=\{1,\ldots,N\}$
enumerates the sites of a lattice and the set of edges $E$ contains pairs of
neighboring sites. Let $R$ be a group and $X$ an $R$-set, with the action of
group elements $r\in R$ on elements $s\in X$ denoted $r\cdot s$. $X$ is the
set of states accessible by a spin, and $R$ is the \emph{symmetry group} of
$X$. The set $X$ must admit a measure $\mu$ that is invariant under the
action of $R$, i.e., for any $A\subseteq X$ and $r\in R$,
$\mu(r\cdot A)=\mu(A)$. This requirement is met by the counting measure on
any discrete set, and by the volume measure on any Riemannian manifold whose
symmetry group acts by isometries, such as $O(n)$ acting on $S^{n-1}$ in the
$n$-component model. Finally, the subset of elements in $R$ of order two must
act transitively on $X$. This property, though it may appear obscure, is
shared by any symmetric space \cite{loos1969symmetric} or by any transitive,
finitely generated isometry group. In fact, all the examples listed here have
spin spaces with natural metrics whose symmetry groups are precisely the
isometry groups of those spaces. We put one spin at each site of the lattice
described by $G$, so that the state of the entire spin system is described by
elements $\vec s\in X\times\cdots\times X=X^N$.
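To make this abstract setup concrete, the ingredients $(X,R,\cdot)$ can be
written down directly in code. The following Python sketch is purely
illustrative (all names are ours, and nothing here comes from a standard
library); it encodes the spin states and order-two group elements for two of
the models treated below.
\begin{verbatim}
import random

# Ising model: X = {-1, +1}, R = Z/2Z acting by negation.
# The one nontrivial element has order two and acts by s -> -s.
ising_involutions = [lambda s: -s]

# q-state clock model: X = Z/qZ, R = D_q.  The reflections
# s_n : x -> -(n + x) (mod q) are among the order-two elements.
def clock_involutions(q):
    return [lambda x, n=n: (-(n + x)) % q for n in range(q)]

# The cluster rule described below draws a uniformly random
# order-two element ("involution") on each step.
def random_involution(involutions):
    return random.choice(involutions)
\end{verbatim}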
The Hamiltonian of this system is a function $\H:X^N\to\R$ defined by
\[
-  \H(s)=-\sum_{\{i,j\}\in E}J_{ij}s_is_j-HM(s),
-  \label{eq:ham}
  \H(\vec s)=-\!\!\!\!\sum_{\{i,j\}\in E}\!\!\!\!Z(s_i,s_j)-\sum_{i\in V}B(s_i),
\]
-where $H$ is the external magnetic field, $J:E\to\R$ gives the coupling
-between spins on neighboring sites and
-$M:S^n\to\R$ is the magnetization of the system defined for a state
-$s\in S^n$ by
where $Z:X\times X\to\R$ couples adjacent spins and
$B:X\to\R$ is an external field. $Z$ must be symmetric in its arguments and
invariant under the action of any element of $R$ applied to the entire
lattice, that is, for any $r\in R$ and $s,t\in X$,
$Z(r\cdot s,r\cdot t)=Z(s,t)$. One may also allow $Z$ to be a function of the
edge---for modelling random-bond, long-range, or anisotropic
interactions---or allow $B$ to be a function of site---for modelling random
fields. All the formal results of this paper hold equally well for these
cases, but we will drop the additional index notation for clarity.

\begin{table*}[htpb]
  \begin{tabular}{l||ccccc}
    & Spin states ($X$) & Symmetry ($R$) & Action ($g\cdot s$) &
    Coupling ($Z(s,t)$) & Field ($B(s)$) \\
    \hline\hline
    Ising & $\{-1,1\}$ & $\Z/2\Z$ & $0\cdot s\mapsto s$, $1\cdot s\mapsto -s$
    & $st$ & $Hs$ \\
    $n$-component & $S^{n-1}$ & $\mathrm O(n)$ & $M\cdot s\mapsto Ms$ &
    $s^{\mathrm T}t$ & $H^{\mathrm T}s$\\
    Potts & $\mathbb Z/q\mathbb Z$ & $D_q$ & $r_m\cdot s=m+s$, $s_m\cdot
    s=-m-s$ & $\delta(s,t)$ & $\sum_mH_m\delta(m,s)$\\
    Clock & $\mathbb Z/q\mathbb Z$ & $D_q$ & $r_m\cdot s=m+s$, $s_m\cdot
    s=-m-s$ & $\cos(2\pi\frac{s-t}q)$ & $\sum_mH_m\cos(2\pi\frac{s-m}q)$\\
    Roughening & $\mathbb Z$ & $D_\infty$ & $r_m\cdot s=m+s$, $s_m\cdot
    s=-m-s$ & $(s-t)^2$ & $Hs^2$\\
  \end{tabular}
  \caption{The spin states, symmetry groups, group actions, couplings, and
    fields that define several common spin models in this framework.}
  \label{table:models}
\end{table*}

The goal of statistical mechanics as applied to these systems is to compute
expectation values of observables $A:X^N\to\R$. Assuming the ergodic
hypothesis holds (for systems with broken-symmetry states, it does not), the
expected value $\avg A$ of an observable $A$ is its average over every state
$\vec s$ in the configuration space $X^N$ weighted by the probability
$p(\vec s)$ of that state appearing, or
\[
-  M(s)=\sum_{i\in V}s_i.
  \avg A
  =\frac{\int_{X^N}A(\vec s)e^{-\beta\H(\vec s)}\,\dd\mu(\vec s)}
    {\int_{X^N}e^{-\beta\H(\vec s)}\,\dd\mu(\vec s)}.
\]
-For the purpose of this study, we will only be considering ferromagnetic
-systems where the function $J$ is nonnegative.
-
-An observable of the system is a function $A:S^n\to\R$ depending on the
-system's state. Both the Hamiltonian and magnetization defined above are
-observables. Assuming the ergodic hypothesis holds, the expected value
-$\avg A$ of any observable $A$ is defined by its average over every state
-$s$ in the configuration space $S^n$ weighted by the Boltzmann factor
-$e^{-\beta\H(s)}$, or

We will first describe a generalized version of the celebrated Wolff
algorithm in the standard case where $B(s)=0$. After reflecting on the
technical requirements of that algorithm, we will introduce a transformation
to our system and Hamiltonian that allows the same algorithm to be applied
with nonzero, in fact \emph{arbitrary}, external fields.

The Wolff algorithm proceeds in the following way.
\begin{enumerate}
  \item Pick a random site and a random rotation $r\in R$ of order two, and
    add the site to a stack.
  \item While the stack isn't empty,
  \begin{enumerate}
    \item pop site $m$ from the stack.
    \item If site $m$ isn't marked,
      \begin{enumerate}
        \item mark the site.
        \item For every $j$ such that $\{m,j\}\in E$, add site $j$ to the
          stack with probability
          \[
            p_r(s_m,s_j)=\max\{0,1-e^{\beta(Z(r\cdot s_m,s_j)-Z(s_m,s_j))}\}.
          \]
        \item Take $s_m\mapsto r\cdot s_m$.
      \end{enumerate}
  \end{enumerate}
\end{enumerate}
When the stack is exhausted, a cluster of connected spins will have been
rotated by the action of $r$. In order for this algorithm to be useful, it
must satisfy ergodicity and detailed balance. The probability
$P(\vec s\to\vec s')$ that the configuration $\vec s$ is brought to $\vec s'$
by the flipping of a cluster $C\subseteq E$ is related to the probability of
the reverse process $P(\vec s'\to\vec s)$ by
\begin{widetext}
\[
-  \avg A=\frac1Z\sum_{s\in S^n}e^{-\beta\H(s)}A(s),
-  \label{eq:avg.obs}
  \begin{aligned}
    \frac{P(\vec s\to\vec s')}{P(\vec s'\to\vec s)}
    &=\prod_{\{i,j\}\in
    C}\frac{p_r(s_i,s_j)}{p_{r^{-1}}(s_i',s_j')}\prod_{\{i,j\}\in\partial
    C}\frac{1-p_r(s_i,s_j)}{1-p_{r^{-1}}(s'_i,s'_j)}
    =\prod_{\{i,j\}\in
    C}\frac{p_r(s_i,s_j)}{p_{r}(r\cdot s_i,r\cdot s_j)}\prod_{\{i,j\}\in\partial
    C}\frac{1-p_r(s_i,s_j)}{1-p_{r^{-1}}(r\cdot s_i,s_j)}
    \\
    &=\prod_{\{i,j\}\in
    C}\frac{p_r(s_i,s_j)}{p_{r}(s_i,s_j)}\prod_{\{i,j\}\in\partial
    C}e^{\beta(Z(r\cdot s_i,s_j)-Z(s_i,s_j))}
    =\frac{e^{-\beta\H(\vec s')}}{e^{-\beta\H(\vec s)}}
\end{aligned}
\]
\end{widetext}
which satisfies detailed balance so long as $p_r(s,t)=p_{r^{-1}}(s,t)$.
Therefore, for the algorithm to function one must choose rotations at random
only from those group elements of $R$ that act as involutions. Usually, as is
the case in all the examples here, the action is faithful enough that this is
the same as choosing only group elements of order two to act on the spins.
Ergodicity is achieved if the subset of symmetry rotations that obeys this
property allows any spin state to access any other under composition. This is
true for all the models in Table \ref{table:models}. (A minimal
implementation sketch of this cluster rule is given below.)

The algorithm described above relies on the fact that the coupling $Z$
depends only on the relative orientation of the spins---globally reorienting
the lattice by some rotation does not affect the Hamiltonian. The external
field $B$ breaks this symmetry. However, this can be resolved. Define a new
graph $\tilde G=(\tilde V,\tilde E)$, where $\tilde V=\{0,1,\ldots,N\}$ and
\[
  \tilde E=E\cup\big\{\{0,i\}\mid i\in V\big\}.
\]
-where the partition function $Z$ is defined by
-\[
-  Z=\sum_{s\in S^n}e^{-\beta\H(s)}
-\]
-and gives the correct normalization for the weighted sum.
-Unfortunately, the sum over configurations in \eqref{eq:avg.obs} is
-intractable for all but very small systems. Therefore expectation values are
-usually approximated by some means. Monte Carlo methods are a common way of
-accomplishing this. These methods sample states $s$ from the configuration
-space $S^n$ according to the Boltzmann distribution $e^{-\beta\H(s)}$ so
-that averages of observables made using their incomplete samples
-asymptotically approach the true expected value.
-
-The Metropolis--Hastings algorithm
-\cite{metropolis1953equation,hastings1970monte} is very popular for systems
-in statistical physics. A starting state $s\in S^n$ is randomly perturbed to
-the state $s'$, usually by flipping one spin. The change in energy
-$\Delta\H=\H(s')-\H(s)$ due to the perturbation is then computed.
If the change is negative the perturbed -state $s'$ is accepted as the new state. Otherwise the perturbed state -is accepted with probability $e^{-\beta\Delta\H}$. This process is repeated -indefinitely and a sample of states is made by sampling the state $s$ -between iterations sufficiently separated that the successively sampled states -are uncorrelated, e.g., at separations larger than the correlation time $\tau$. Metropolis--Hastings is very general, but unless the -perturbations are very carefully chosen the algorithm suffers in regimes where -large correlations are present in the system, for instance near continuous -phase transitions. Here the algorithm suffers from what is known as critical -slowing-down, where likely states consist of large correlated clusters that -take many perturbations to move between in configuration space. - -%\begin{figure} -% \begin{algorithm}[H] -% \begin{algorithmic} -% \REQUIRE $s\in S^n$ -% \STATE $s'\gets$ \texttt{Perturb}($s$) -% \STATE $\Delta\H\gets\H(s')-\H(s)$ -% \IF {$\exp(-\beta\Delta\H)>$ \texttt{UniformRandom}$(0,1)$} -% \STATE $s\gets s'$ -% \ENDIF -% \end{algorithmic} -% \caption{Metropolis--Hastings} -% \label{alg:met-hast} -% \end{algorithm} -%\end{figure} -% -The Wolff algorithm \cite{wolff1989collective} solves many of these problems, but only at zero external -field, $H=0$. This algorithm solves the problem of critical slowing-down by -flipping carefully-constructed clusters of spins at once in a way that samples -high-correlated states quickly while also always accepting prospective states. -A random site $i$ is selected from the graph and its spin is flipped. Each of the -site's neighbors $j$ is also flipped with probability $1-e^{-2\beta J_{ij}}$ -if doing so would lower the energy of the bond $i$--$j$. The process is -repeated with every neighbor that was flipped. While this algorithm -successfully addresses the problems of critical slowing down at zero field, it -doesn't directly apply at nonzero field. A common extension is the hybrid -Wolff--Metropolis, where a Wolff cluster is formed using the ordinary formula, -then accepted or rejected Metropolis-style based on the change in energy -\emph{relative to the external field}, $H\Delta M$. Because each spin has the -same coupling to the external field, a strictly more efficient -but exactly equivalent version of this hybrid is made by distributing the -cluster acceptance or rejection across the spins in the cluster as they are -added. In this version, the cluster is abandoned with probability $1-e^{-2\beta -H}$ every time a spin is added to it. 
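Returning to the generalized algorithm enumerated above: one possible
realization of a single zero-field cluster step is the short Python sketch
below. This is a minimal illustration under our own naming conventions, not
a reference implementation; \texttt{spins} is assumed to be a mutable list
of spin states, \texttt{neighbors} a function returning the sites adjacent
to a given site, \texttt{Z} the coupling, and \texttt{r} a randomly chosen
order-two group action.
\begin{verbatim}
import math
import random

def wolff_step(spins, neighbors, Z, r, beta):
    """One generalized Wolff update: grow a cluster from a random
    seed and rotate it by the involution r, adding neighbor j of a
    cluster site m with probability
    max(0, 1 - exp(beta * (Z(r(s_m), s_j) - Z(s_m, s_j))))."""
    stack = [random.randrange(len(spins))]
    marked = set()
    while stack:
        m = stack.pop()
        if m in marked:
            continue
        marked.add(m)
        for j in neighbors(m):
            dZ = Z(r(spins[m]), spins[j]) - Z(spins[m], spins[j])
            if random.random() < max(0.0, 1.0 - math.exp(beta * dZ)):
                stack.append(j)
        spins[m] = r(spins[m])  # rotate after bonds are decided
    return len(marked)  # cluster size, a natural unit of work
\end{verbatim}
For the Ising model, where $Z(s,t)=st$ and $r(s)=-s$, the bond probability
reduces to the familiar $1-e^{-2\beta}$ for aligned neighbors and zero
otherwise.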
- -$z=0.29(1)$ \cite{wolff1989comparison,liu2014dynamic} $z=0.35(1)$ for -Swendsen--Wang \cite{swendsen1987nonuniversal} - -%\begin{figure} -% \begin{algorithm}[H] -% \begin{algorithmic} -% \REQUIRE $s\in\G^n$ -% \STATE \textbf{let} $q$ be an empty stack -% \STATE \textbf{let} $i_0\in V$ -% \STATE $q.\mathtt{push}(i_0)$ -% \STATE $\sigma\gets s_{i_0}$ -% \WHILE {$q$ is not empty} -% \STATE $i\gets q.\mathtt{pop}$ -% \IF {$s_i=\sigma$} -% \FORALL {$j$ such that $\{i,j\}\in E$} -% \IF {$1-\exp(-2\beta -% J_{ij}s_is_j)>\mathop{\mathtt{UniformRandom}}(0,1)$} -% \STATE $q.\mathtt{push}(j)$ -% \ENDIF -% \ENDFOR -% \STATE $s_i\gets-s_i$ -% \ENDIF -% \ENDWHILE -% \end{algorithmic} -% \caption{Wolff (Zero-Field)} -% \label{alg:wolff} -% \end{algorithm} -%\end{figure} -% -% -% -%\begin{figure} -% \begin{algorithm}[H] -% \begin{algorithmic} -% \REQUIRE $s\in\G^n$ -% \STATE $s'\gets s$ -% \STATE \textbf{let} $q$ be an empty stack -% \STATE \textbf{let} $i_0\in V$ -% \STATE $q.\mathtt{push}(i_0)$ -% \STATE $\sigma\gets s_{i_0}'$ -% \STATE \texttt{completed} $\gets$ \textbf{true} -% \WHILE {$q$ is not empty} -% \STATE $i\gets q.\mathtt{pop}$ -% \IF {$s_i'=\sigma$} -% \FORALL {$j$ such that $\{i,j\}\in E$} -% \IF {$1-\exp(-2\beta -% J_{ij}s_i's_j')>\mathop{\mathtt{UniformRandom}}(0,1)$} -% \STATE $q.\mathtt{push}(j)$ -% \ENDIF -% \ENDFOR -% \STATE $s_i'\gets-s_i'$ -% \STATE $q.\mathtt{push}(i)$ -% \IF {$1-\exp(-2\beta\sigma H)>\mathop{\mathtt{UniformRandom}}(0,1)$} -% \STATE \texttt{completed} $\gets$ \textbf{false} -% \STATE \textbf{break} -% \ENDIF -% \ENDIF -% \ENDWHILE -% \IF {completed} -% $s\gets s'$ -% \ENDIF -% \end{algorithmic} -% \caption{Hybrid Wolff/Metropolis--Hastings} -% \label{alg:h.wolff} -% \end{algorithm} -%\end{figure} - -\section{Cluster-flip in a Field} - -Consider the new graph $\tilde G=(\tilde V,\tilde E)$ defined from $G$ by -\begin{align} - \tilde V&=0\cup V\\ - \tilde E&=E\cup\{\{0,i\}\mid i\in V\}, -\end{align} -or by adding a zeroth vertex and edges from every other vertex to the new -one. Thee spin at site zero is often known as a ``ghost spin.'' The network of spins -described by this graph now have states that occupy a configuration space -$S^{n+1}$. Extend the coupling between spins $\tilde J:\tilde E\to\R$ by +We have introduced a new site to the lattice that neighbors every other site. +Instead of assigning this site a spin whose value comes from the set $X$, we +will assign it values $s_0\in R$, symmetry group elements, so that the new +configuration space of the model is $R\times X^N$. We introduce a Hamiltonian +$\tilde\H:R\times X^N\to\R$ defined by \[ - \tilde J(e)=\begin{cases} - J(e) & e\in E\\ - |H| & \text{otherwise}, - \end{cases} +\begin{aligned} + \tilde\H(s_0,\vec s) + &=-\!\!\!\!\sum_{\{i,j\}\in E}\!\!\!\!Z(s_i,s_j) + -\sum_{i\in V}B(s_0^{-1}\cdot s_i)\\ + &=-\!\!\!\!\sum_{\{i,j\}\in\tilde E}\!\!\!\!\tilde Z(s_i,s_j) +\end{aligned} \] -so that each spin is coupled to every other spin the same way they were -before, and coupled to the new spin with strength $|H|$, the magnitude of the -external field. Now define a -Hamiltonian function -$\tilde\H:S^{n+1}\to\R$ on this new, larger configuration space defined by +where the new coupling $\tilde Z:(R\cup X)\times(R\cup X)\to\R$ is defined for $s,t\in +R\cup X$ by \[ - \tilde\H(s)=-\sum_{\{i,j\}\in\tilde E}\tilde J_{ij}s_is_j. 
- \label{eq:new.h.simple} + \tilde Z(s,t) = + \begin{cases} + Z(s,t) & \text{if $s,t\in X$} \\ + B(s^{-1}\cdot t) & \text{if $s\in R$} \\ + B(t^{-1}\cdot s) & \text{if $t\in R$} + \end{cases} + \label{eq:new.z} \] -This new Hamiltonian resembles the old one, but is comprised only of -spin--spin interactions with no external field. However, by changing the terms -considered in the sum we may equivalently write +Note that this modified coupling is invariant under the action of group +elements: for any $r,s_0\in R$ and $s\in X$, \[ - \tilde\H(s)=-\sum_{\{i,j\}\in E}J_{ij}s_is_j-H\tilde M(s) +\begin{aligned} + \tilde Z(rs_0,r\cdot s) + &=B((rs_0)^{-1}\cdot (r\cdot s))\\ + &=B((s_0^{-1}r^{-1})\cdot(r\cdot s))\\ + &=B((s_0^{-1}r^{-1}r)\cdot s)\\ + &=B(s_0^{-1}\cdot s) + =\tilde Z(s_0,s) +\end{aligned} \] -where the new magnetization $\tilde M:S^{n+1}\to\R$ is defined for $s\in -S^{n+1}$ by +The invariance $\tilde Z$ to rotations given other arguments follows from the +invariance properties of $Z$. + +We have produced a system that incorporates the field function $B$ whose +Hamiltonian is invariant to global rotations, but how does it relate to our +previous system, whose properties we actually want to measure? If $A:X^N\to\R$ +is an observable of the original system, one can construct an observable +$\tilde A:R\times X^N\to\R$ of the new system defined by \[ - \begin{aligned} - \tilde M(s) - &=\sgn(H)s_0\sum_{i\in V}s_i\\ - &=M(s_0\sgn(H)(s_1,\ldots,s_n)). - \end{aligned} + \tilde A(s_0,\vec s)=A(s_0^{-1}\cdot\vec s) \] -In fact, any observable $A$ of the original system can be written as an -observable $\tilde A$ of the new system by defining +whose expectation value in the new system equals that of the original +observable in the old system. First, note that $\tilde\H(1,\vec s)=\H(\vec s)$. Since +the Hamiltonian is invarient under global rotations, it follows that for any +$g\in R$, $\tilde\H(g,g\cdot\vec s)=\tilde\H(g^{-1}g,g^{-1}g\cdot\vec +s)=\tilde\H(1,\vec s)=\H(\vec s)$. +Using the invariance properties of the measure on $X$ and introducing a +measure $\rho$ on $R$, it follows that \[ - \tilde A(s)=A(s_0\sgn(H)(s_1,\ldots,s_n)) +\begin{aligned} + \avg{\tilde A} + &=\frac{ + \int_R\int_{X^N}\tilde A(s_0,\vec + s)e^{-\beta\tilde\H(s_0,\vec s)}\,\dd\mu(\vec s)\,\dd\rho(s_0) + } { + \int_R\int_{X^N}e^{-\beta\tilde\H(s_0,\vec s)}\,\dd\mu(\vec s)\,\dd\rho(s_0) + }\\ + &=\frac{ + \int_R\int_{X^N}A(s_0^{-1}\cdot\vec + s)e^{-\beta\tilde\H(s_0,\vec s)}\,\dd\mu(\vec s)\,\dd\rho(s_0) + } { + \int_R\int_{X^N}e^{-\beta\tilde\H(s_0,\vec s)}\,\dd\mu(\vec s)\,\dd\rho(s_0) + }\\ + &=\frac{ + \int_R\int_{X^N}A(\vec + s')e^{-\beta\tilde\H(s_0,s_0\cdot\vec + s')}\dd\mu(s_0\cdot\vec s')\,\dd\rho(s_0) + } { + \int_R\int_{X^N}e^{-\beta\tilde\H(s_0,s_0\cdot\vec s')}\dd\mu(s_0\cdot\vec + s')\,\dd\rho(s_0) + }\\ + &=\frac{ + \int_R\dd\rho(s_0)}{ + \int_R\dd\rho(s_0)}\frac{\int_{X^N}A(\vec + s')e^{-\beta\H(\vec + s')}\dd\mu(\vec s') +}{\int_{X^N}e^{-\beta\H(\vec s')}\dd\mu(\vec + s') + } + =\avg A +\end{aligned} \] -such that $\avg{\tilde A}=\avg A$. This can be seen readily by using the -symmetry $\tilde\H(-s)=\tilde\H(s)$ of the Hamiltonian -\eqref{eq:new.h.simple}, the invariance of configuration space sums under -negation of their summand, and the fact that $\tilde\H(1,s)=\H(s)$ for any $s\in -S^n$. Notice in particular that this is true for the Hamiltonian $\tilde\H$ as -well. - -Our new spin system with an additional field is simply a ferromagnetic spin system in the absence of an external field. 
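To illustrate the construction concretely, the zero-field sketch given
earlier extends to the new system by letting site $0$ carry a group element
and routing its bonds through $\tilde Z$. The Python below is again a
hedged sketch built on assumed helper functions of our own naming:
\texttt{act}, \texttt{inv}, and \texttt{compose} stand for the group
action, inversion, and composition, and \texttt{B} for the field function.
\begin{verbatim}
import math
import random

def wolff_field_step(s0, spins, neighbors, Z, B, act, inv,
                     compose, r, beta):
    """One cluster step on the extended lattice: site 0 holds the
    group element s0, sites 1..N hold spins, and bonds into site 0
    carry the coupling Ztilde(s0, s) = B(act(inv(s0), s))."""
    N = len(spins)

    def zt(m, j, rotated):
        # coupling on bond {m, j}, with site m rotated by r or not
        if m == 0:
            g = compose(r, s0) if rotated else s0
            return B(act(inv(g), spins[j - 1]))
        sm = act(r, spins[m - 1]) if rotated else spins[m - 1]
        if j == 0:
            return B(act(inv(s0), sm))
        return Z(sm, spins[j - 1])

    stack, marked = [random.randrange(N + 1)], set()
    while stack:
        m = stack.pop()
        if m in marked:
            continue
        marked.add(m)
        js = range(1, N + 1) if m == 0 else list(neighbors(m)) + [0]
        for j in js:
            dz = zt(m, j, True) - zt(m, j, False)
            if random.random() < max(0.0, 1.0 - math.exp(beta * dz)):
                stack.append(j)
        if m == 0:
            s0 = compose(r, s0)   # "rotate" the ghost site
        else:
            spins[m - 1] = act(r, spins[m - 1])
    return s0
\end{verbatim}
Estimates of $\avg A$ are then accumulated by rotating the configuration
back and evaluating $A$ on $\texttt{act}(\texttt{inv}(s_0),s_i)$ at each
site, mirroring $\tilde A(s_0,\vec s)=A(s_0^{-1}\cdot\vec s)$.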
Therefore, the Wolff algorithm can be applied to it with absolutely no
modifications. Since there is an exact correspondence between the
expectation values of our ordinary spin system in a field and their
appropriately defined counterparts in our new system, sampling Monte Carlo
simulations of our new system allows us to estimate the expectation values
for the old. This ``new'' algorithm, if you can call it that, is shown in
Algorithm \ref{alg:wolff-field}.

To summarize, spin systems in a field may be treated in the following way.
\begin{enumerate}
  \item Add a site to your lattice adjacent to every other site.
  \item Initialize a ``spin'' at that site whose value is a member of the
    symmetry group of your ordinary spins.
  \item Carry out the ordinary Wolff cluster-flip procedure on this new
    lattice, substituting $\tilde Z$ as defined in \eqref{eq:new.z} for $Z$.
\end{enumerate}
Ensemble averages of observables $A$ can then be estimated by sampling the
value of $\tilde A$ on the new system.

\section{Examples}

\subsection{The Ising Model}

In the Ising model, spins are drawn from the set $\{1,-1\}$. The symmetry
group of this model is $C_2$, the cyclic group on two elements, which can be
conveniently represented by the multiplicative group with elements
$\{1,-1\}$, exactly the same as the spins themselves. The only nontrivial
element is of order two. Because the symmetry group and the spins are
described by the same elements, running the algorithm on the Ising model in
a field amounts to simply adding an extra spin coupled to all others and
performing the ordinary algorithm on the result.

\subsection{The $n$-component Model}

In the $n$-component model, spins are described by vectors on the
$(n-1)$-sphere, so that $X=S^{n-1}$. The symmetry group of this model is
$O(n)$, the $n\times n$ orthogonal matrices, which act on the spins by
matrix multiplication. The elements of $O(n)$ of order two include
reflections about hyperplanes through the origin and $\pi$ rotations about
axes through the origin. Since the former generate the entire group, the set
of reflections alone suffices to provide ergodicity. Computation of the
coupling of ordinary spins with the external field and of expectation values
requires a matrix inversion, but since the matrices in question are
orthogonal this is quickly accomplished by a transpose.

\subsection{The Potts \& Clock Models}

In both the $q$-state Potts and clock models, spins are described by
$\Z/q\Z$, the set of integers modulo $q$. The symmetry group of these models
is the dihedral group $D_q=\{r_0,\ldots,r_{q-1},s_0,\ldots,s_{q-1}\}$, the
group of symmetries of a regular $q$-gon. The element $r_n$ represents a
rotation of the polygon by $2\pi n/q$, and the element $s_n$ represents a
reflection composed with the rotation $r_n$. The group acts on the spins by
permutation: $r_n\cdot m={n+m}\pmod q$ and $s_n\cdot m={-(n+m)}\pmod q$.
Intuitively, this can be thought of as the natural action of the group on
the vertices of a regular polygon that have been numbered $0$ through $q-1$.
The elements of $D_q$ of order two are all of the reflections, along with
$r_{q/2}$ if $q$ is even, though the latter can be generated by composing
the former. Though the reflections alone do not make up the entire group,
for any $n,m\in\Z/q\Z$ there exists a reflection that takes $n\to m$,
ensuring ergodicity.
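In an implementation, these dihedral elements are conveniently stored as an
integer together with a boolean. The following Python sketch (again with our
own illustrative naming) supplies the action, composition, and inversion
that the cluster rule needs:
\begin{verbatim}
from dataclasses import dataclass

@dataclass(frozen=True)
class Dihedral:
    """Element r_n (flip=False) or s_n (flip=True) of D_q, acting
    on Z/qZ by r_n: x -> n + x and s_n: x -> -(n + x), mod q."""
    n: int
    flip: bool
    q: int

    def act(self, x):
        y = (self.n + x) % self.q
        return (-y) % self.q if self.flip else y

    def compose(self, other):
        # (self * other) acts by x -> self.act(other.act(x))
        sign = -1 if other.flip else 1
        return Dihedral((sign * self.n + other.n) % self.q,
                        self.flip != other.flip, self.q)

    def inv(self):
        # reflections are involutions; rotations invert to r_{-n}
        return self if self.flip else Dihedral((-self.n) % self.q,
                                               False, self.q)
\end{verbatim}
The order-two elements the algorithm draws from are exactly the reflections
(\texttt{flip=True}), together with $r_{q/2}$ when $q$ is even.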
As the sketch shows, the elements of the dihedral group can be stored simply
as an integer and a boolean that records whether the element is a pure
rotation or a reflection. The principal difference between the Potts and
clock models is that, in the latter case, the form of the coupling $Z$
admits a geometric interpretation of the spins as two-dimensional unit
vectors fixed with even spacing along the unit circle.

\subsection{Roughening}

Though not often thought of as a spin model, simple roughening of surfaces
can be described in this framework. The set of states is the integers $\Z$
and its symmetry group is the infinite dihedral group
$D_\infty=\{r_i,s_i\mid i\in\Z\}$, where the action of the symmetry on the
spins $j\in\Z$ is given by $r_i\cdot j=i+j$ and $s_i\cdot j=i-j$. These are
shifts by $i$ and reflections taking $j$ to $i-j$, respectively. The
elements of order two are the reflections $s_i$, which suffice to provide
ergodicity, as any integer can be taken to any other in one step of this
kind. The coupling is usually taken to be $Z(i,j)=|i-j|$, though it may also
be any function of the absolute difference. Because random choices of
integer will almost always result in energy changes so large that the whole
system is flipped, it is better to select random reflections about integers
close to the average state of the system. Continuous roughening
models---where the spin states are described by real numbers and the
symmetry group is $\mathrm E(1)$, the Euclidean group for one-dimensional
space---are equally well described.

At sufficiently small $|H|$ both our algorithm and the hybrid
Wolff--Metropolis reduce to the ordinary Wolff algorithm, and have its
runtime properties. At very large $|H|$, the hybrid Wolff--Metropolis
behaves exactly like Metropolis, where almost always only one spin is
flipped at a time with probability $\sim e^{-2\beta H}$, since the energy is
dominated by contributions from the field.

We measured the autocorrelation time $\tau$ of the internal energy $\H$ of a
square-lattice Ising model ($J=1$ and $E$ is the set of nearest neighbor
pairs) resulting from using each of these three algorithms at various fields
and temperatures. This was done using an initial convex sequence estimator,
as described in \cite{geyer1992practical}.
Time was measured as ``the number of spins that the algorithm has attempted
to flip.'' For example, every Metropolis--Hastings step takes unit time,
every Wolff step takes time equal to the number of spins in the flipping
cluster, and every hybrid Wolff/Metropolis--Hastings step takes time equal
to the number of spins in the cluster the step attempted to flip, whether it
succeeds or not. The autocorrelation time as a function of reduced field
$\beta H$ is shown in Fig.~\ref{fig:correlation}. By this measure of time,
the modified Wolff algorithm thermalizes faster than both the hybrid
Wolff/Metropolis and ordinary Metropolis--Hastings in all regimes. At high
field ($\beta H\sim1$) the new algorithm has an autocorrelation time of
roughly $n$, while both Metropolis--Hastings and the hybrid algorithm have
an autocorrelation time of $2n$. At low temperature, neither the new
algorithm nor Metropolis--Hastings has an autocorrelation time that depends
on field, but the hybrid method performs very poorly at intermediate values
of field in a way that becomes worse with system size, scaling worse than it
does even at the critical temperature. At high temperature, all three
algorithms perform similarly and in a way that is independent of system
size. At the critical temperature, both the Metropolis--Hastings and hybrid
algorithms experience critical slowing down in a similar way, whereas the
new algorithm slows down far less, always performing more efficiently at
nonzero field than the zero-field Wolff algorithm.

\begin{figure}
  \centering
  \input{fig_correlation}
  \caption{The autocorrelation time $\tau$ of the internal energy $\H$ for a
    $n=32\times32$ square-lattice Ising model simulated by the three nonzero
    field methods detailed in this paper. The top plot is for simulations at
    high temperature, $T=5.80225$, the middle plot is for simulations at the
    critical temperature, $T=2.26919$, and the bottom plot is for
    simulations at }
  \label{fig:correlation}
\end{figure}

\begin{figure}
  \centering
  \input{fig_correlation_collapse-hL}
\end{figure}

Our algorithm for the Ising model in a field can be generalized to run on
the $q$-spin Potts or $O(n)$ models in exactly the same way as the
conventional Wolff algorithm.

\section{Magnetization Estimator}

At any size, the ensemble average magnetization $\avg M$ is identically zero
at zero field. However, this is not what is observed at low temperature.
This is because in the low temperature phase the ergodic hypothesis---that
the time-average value of observables is equal to their ensemble
average---is violated. As the system size grows, the likelihood of a
fluctuation in any reasonable dynamics that flips the magnetization from one
direction to the other becomes vanishingly small, and therefore it is
inappropriate to estimate expected values in the low temperature phase by
averaging over the whole configuration space.
Instead, values must be estimated by averaging over the portion of
configuration space that is accessible to the dynamics.

For finite size systems, like any we would simulate, dynamics at zero field
or even small nonzero field do allow the whole configuration space to be
explored. However, one usually wants to use the results from finite size
systems to estimate the expected values in the thermodynamic limit, where
this is no longer true. At zero field, for instance, it is common practice
to use $\avg{|M|}$ to estimate the expected value for the magnetization
instead of $\avg M$. But what to do at finite field? Is this approach
justified?

The provenance of the restricted configuration space in the thermodynamic
limit is the fact that, at criticality or for nonzero $H$ below the critical
temperature, an infinite cluster forms and can never be flipped by the
dynamics. In a finite system there is no infinite cluster, and even the
largest cluster has a nonzero probability of being flipped. However, we
still want to be able to use finite systems to estimate the quantities of
infinite ones. We approach this by thinking of a finite system as a piece of
an infinite one. Clearly, time-averages of quantities in the chunk are equal
to time-averages of the whole system, simply with much slower convergence.
However, simulation of finite systems presents a problem: in each sample, we
are given a snapshot of a chunk of the system but do not know which
direction the infinite cluster in the surrounding infinite system is
pointing. Therefore, we must make a guess such that, in the limit of larger
and larger system size, the probability that our guess is wrong and that the
infinite cluster is pointed in a direction opposite the one we expected
approaches zero. To estimate the equilibrium values of quantities in the
infinite system, we only admit values resulting from states whose
corresponding infinite cluster is expected to point in the direction of our
external field. If the external field is zero, we choose one direction at
random, say positive.

Snapshots of the chunk that we suspect are likely to be a piece of an
infinite cluster facing opposite the external field are discarded for the
purpose of computing equilibrium values of the system, but still represent
something physical. When the model's state contains an infinite cluster
oriented against the external field, it is known as \emph{metastable}.
Therefore in a finite simulation one can make convergent estimates of
quantities in both the equilibrium and metastable states of the infinite
system by dividing sampled states into two sets: those that are likely
snapshots of a finite block in an infinite system whose infinite cluster is
oriented with the field, and those whose infinite cluster is oriented
against the field. All we must do now is specify how to distinguish between
likely oriented with and likely oriented against. Others have used the
direction of the largest cluster in the finite system to make this
distinction. Far from the critical point in the low-temperature phase this
does a clear and consistent job. However, we would like to study systems
very near the critical point. We will take the likely direction of the
infinite cluster to be given by \emph{the direction in which the system is
magnetized}. Clearly, as our finite chunk grows larger and larger, the
probability that its magnetization is different from the direction of the
infinite cluster goes to zero.
Since, in the thermodynamic limit, expected values are given by an average
over a restricted configuration space, we can estimate those expected values
at finite size by making the same restriction. Define the reduced
configuration spaces
\begin{align}
  S_\e^n&=\{s\in S^n\mid \sgn(H)M(s)>0\}
  \\
  S_\m^n&=\{s\in S^n\mid \sgn(H)M(s)<0\}
  \\
  S_0^n&=\{s\in S^n\mid \sgn(H)M(s)=0\}
\end{align}
where
\[
  \sgn(H)=\begin{cases}1&H\geq0\\-1&H<0.\end{cases}
\]
Clearly, $S^n=S^n_\e\cup S^n_\m\cup S^n_0$, which implies that
$Z=Z_\e+Z_\m+Z_0$. Since $S^n_\e=\{-s\mid s\in S^n_\m\}$,
$|S^n_\e|=|S^n_\m|$. At zero field, $\H(s)=\H(-s)$, and therefore
$Z_\e=Z_\m$. This can be used to show the near-equivalence (at zero field)
of taking expectation values $\avg{|M|}$ of the absolute value of the
magnetization and taking expectation values $\eavg M$ of the magnetization
on a reduced configuration space, since
\begin{align}
  \avg{|M|}=\frac1{1+Z_0/2Z_\e}\eavg M.
\end{align}
At infinite temperature, $Z_0/Z_\e\simeq n^{-1/2}\sim L^{-1}$ for large $L$,
$n$. At any finite temperature, especially in the ferromagnetic phase,
$Z_0\ll Z_\e$ in a much more extreme way.

If the ensemble average over only positive magnetizations can be said to
converge to the equilibrium state of the Ising model in the thermodynamic
limit, what of the average over only \emph{negative} magnetizations, or the
space $S^n_\m$, ordinarily unphysical? This is, in one sense, precisely the
definition of the metastable state of the Ising model. Expected values of
observables in the metastable state can therefore be computed by considering
this reduced ensemble.

How, in practice, are these samplings over reduced ensembles performed?
There may be efficient algorithms for doing Monte Carlo sampling inside a
particular reduced ensemble, but for finite-size systems with any of the
algorithms described here the whole configuration space is available. Any of
these can be used, however, to simultaneously sample all reduced
configuration spaces by allowing them to sample the whole space and then
only adding a contribution to a given average if the associated state is in
the reduced space of interest.

\begin{figure}
  \centering
  \input{fig_correlation_collapse-hL}
  \caption{The correlation time $\tau$ as a function of the renormalization
    invariant $hL^{\beta\delta/\nu}$ for the $N=L\times L$ square-lattice
    Ising model for $L=8$, $16$, $32$, $64$, $128$, and $256$. The dynamic
    exponent is taken to be $z=0.3$.}
\end{figure}

\begin{figure}
  \centering
  \input{fig_correlation-temp}
  \caption{The correlation time $\tau$ as a function of the renormalization
    invariant $ht^{-\beta\delta}$ for the $N=128\times 128$ square-lattice
    Ising model. The dynamic exponent is taken to be $z=0.3$.}
\end{figure}

\begin{acknowledgments}
\end{acknowledgments}

\bibliography{monte-carlo}
--
cgit v1.2.3-54-g00ecf