summaryrefslogtreecommitdiff
path: root/monte-carlo.tex
diff options
context:
space:
mode:
Diffstat (limited to 'monte-carlo.tex')
-rw-r--r--monte-carlo.tex739
1 files changed, 306 insertions, 433 deletions
diff --git a/monte-carlo.tex b/monte-carlo.tex
index 7133461..929fc65 100644
--- a/monte-carlo.tex
+++ b/monte-carlo.tex
@@ -6,7 +6,7 @@
\documentclass[aps,prl,reprint]{revtex4-1}
\usepackage[utf8]{inputenc}
-\usepackage{amsmath,amssymb,latexsym,mathtools,xifthen}
+\usepackage{amsmath,amssymb,latexsym,mathtools,xifthen,upgreek}
\usepackage{algcompatible}
\usepackage{algorithm}
@@ -23,6 +23,7 @@
\def\ei{\mathop{\mathrm{Ei}}\nolimits} % exponential integral Ei
\def\re{\mathop{\mathrm{Re}}\nolimits}
\def\im{\mathop{\mathrm{Im}}\nolimits}
+\def\tr{\mathop{\mathrm{Tr}}\nolimits}
\def\sgn{\mathop{\mathrm{sgn}}\nolimits}
\def\dd{d} % differential
\def\O{O} % big O
@@ -30,6 +31,8 @@
\def\Z{\mathbb Z}
\def\R{\mathbb R}
\def\G{S}
+\def\X{\mathbb X}
+\def\inf{\mathrm{inf}}
% subscript for ``critical'' values, e.g., T_\c
\def\c{\mathrm c}
@@ -74,6 +77,9 @@
\renewcommand\vec[1]{\mathbf{#1}}
\newcommand\unit[1]{\hat{\mathbf{#1}}}
+\def\mcmc{\textsc{McMC}}
+\def\vsigma{\pmb{\upsigma}}
+
% used to reformat display math to fit in two-column ``reprint'' mode
\makeatletter
\newif\ifreprint
@@ -82,7 +88,7 @@
\begin{document}
-\title{An efficient Wolff algorithm in an external magnetic field}
+\title{An efficient cluster algorithm for spin systems in a symmetry-breaking field}
\author{Jaron Kent-Dobias}
\author{James P.~Sethna}
\affiliation{Laboratory of Atomic and Solid State Physics, Cornell University, Ithaca, NY, USA}
@@ -90,472 +96,339 @@
\date\today
\begin{abstract}
- We introduce an extension of the Wolff algorithm that preforms efficiently
- in an external magnetic field. Near the Ising critical point, the
- correlation time of our algorithm has a conventional scaling form that
- reduces to that of the Wolff algorithm at zero field and becomes more efficient
- at any nonzero field. As an application, we
- directly measure scaling functions of observables in the metastable state of
- the 2D Ising model.
\end{abstract}
\maketitle
-The Ising model is a simple model of a magnet comprised of discrete locally
-interacting one-component spins. Like most systems in statistical mechanics,
-calculation of its ensemble properties cannot be done explicitly and is often
-performed using Monte Carlo methods. Near its continuous phase transition,
-divergent correlation length leads to divergent correlation time in any
-locally-updating algorithm, hampering computation. With no external field,
-this critical slowing-down is largely alleviated by cluster algorithms---the
-most efficient of which is the Wolff algorithm---whose dynamics are nonlocal
-since each step flips a cluster of spins whose average size diverges with the
-correlation length. While less efficient cluster algorithms, like
-Swendsen--Wang, have been modified to perform in nonzero field, Wolff
-only works at zero field. We describe an extension of the Wolff
-algorithm that works in arbitrary external field while preserving Wolff's efficiency throughout the entire temperature--field parameter
-space.
-
-The Wolff algorithm works by first choosing a random spin and adding it to an
-empty cluster. Every neighbor of that spin that is pointed in the same
-direction as the spin is also added to the cluster with probability
-$1-e^{-2\beta J}$, where $\beta=1/T$ is inverse temperature and $J$ is the
-coupling between sites. This process is repeated for the neighbors of every
-spin added to the cluster. When all sites surrounding the cluster have been
-exhausted, the cluster is flipped. Our algorithm is a simple extension of
-this. An extra spin---often referred to as the ``ghost spin''---is introduced
-and made a nearest neighbor of all others with coupling $|H|$, the magnitude
-of the external field. The traditional Wolff algorithm is then preformed on
-this new extended lattice exactly as described above, with the extra spin
-treated no differently from any others, i.e., allowed to be added to clusters
-and subsequently flipped. Observables in the original system can be exactly
-estimated using the new one by a simple correspondence.
-
-This paper is divided into three sections. First, the Ising model (and our
-notation for it) is introduced, along with extant Monte Carlo algorithms.
-Second, we introduce our algorithm in detail and compare its efficiency with
-the existing ones. Finally, we use this new algorithm to directly measure
-critical scaling functions for observables of the 2D Ising model in its
-metastable state.
-
-\section{Introduction}
-
-Consider an undirected graph $G=(V,E)$ describing a network of interacting spins. The set
-of vertices $V=\{1,\ldots,n\}$ enumerates the sites of the network, and the
-set of edges $E$ describes connections between neighboring sites. On each site
-is a spin that can take any value from the set $S=\{-1,1\}$. The state of the
-system is described by a function $s:V\to S$, leading to a configuration space
-of all possible states $S^n=S\times\cdots\times S$. The Hamiltonian
-$\H:S^n\to\R$ gives the energy of a particular state $s\in S^n$ and is defined
-by
+Let $G=(V,E)$ be a graph, where the set of vertices $V=\{1,\ldots,N\}$
+enumerates the sites of a lattice and the set of edges $E$ contains pairs of
+neighboring sites. Let $R$ be a group and $X$ an $R$-set, with the action
+of group elements $r\in R$ on elements $s\in X$ denoted $r\cdot s$. $X$ is the
+set of states accessible by a spin, and $R$ is the \emph{symmetry group} of
+$X$. The set $X$ must admit a measure $\mu$ that is invariant under the action of $R$, e.g., for any
+$A\subseteq X$ and $r\in R$, $\mu(r\cdot A)=\mu(A)$. This trait is shared by
+the counting measure on any discrete set, or by any group acting by isometries
+on a Riemannian manifold, such as $O(n)$ on $S^{n-1}$ in the $n$-component
+model. Finally, the subset of elements in $R$ of order two must act
+transitively on $X$. This property, while apparently obscure, is shared by any
+symmetric space \cite{loos1969symmetric} or by any transitive, finitely generated isometry group. In fact, all the examples listed here have spins spaces with natural
+metrics whose symmetry group is the set of isometries of the spin spaces.
+We put one spin at each site of the lattice described by $G$, so that the
+state of the entire spin system is described by elements $\vec s\in X\times\cdots\times
+X=X^N$.
+
+The Hamiltonian of this system is a function $\H:X^N\to\R$ defined by
\[
- \H(s)=-\sum_{\{i,j\}\in E}J_{ij}s_is_j-HM(s),
- \label{eq:ham}
+ \H(\vec s)=-\!\!\!\!\sum_{\{i,j\}\in E}\!\!\!\!Z(s_i,s_j)-\sum_{i\in V}B(s_i),
\]
-where $H$ is the external magnetic field, $J:E\to\R$ gives the coupling
-between spins on neighboring sites and
-$M:S^n\to\R$ is the magnetization of the system defined for a state
-$s\in S^n$ by
+where $Z:X\times X\to\R$ couples adjacent spins and
+$B:X\to\R$ is an external field. $Z$ must be symmetric in its arguments and
+invariant under the action of any element of $R$ applied to the entire lattice, that is, for any $r\in R$ and
+$s,t\in X$, $Z(r\cdot s,r\cdot t)=Z(s,t)$.
+One may also allow $Z$ to also be a function of the edge---for modelling
+random-bond, long-range, or anisotropic interactions---or allow $B$ to be a
+function of site---for modelling random fields. All the formal results of this paper hold equally
+well for these cases, but we will drop the additional index notation for clarity.
+
+\begin{table*}[htpb]
+ \begin{tabular}{l||ccccc}
+ & Spin states ($X$) & Symmetry ($R$) & Action ($g\cdot s$) &
+ Coupling ($Z(s,t)$) & Field ($B(s)$) \\
+ \hline\hline
+ Ising & $\{-1,1\}$ & $\Z/2\Z$ & $0\cdot
+ s\mapsto s$, $1\cdot s\mapsto -s$ & $st$ & $Hs$ \\
+ $n$-component & $S^{n-1}$ & $\mathrm O(n)$ & $M\cdot s\mapsto Ms$ & $s^{\mathrm T}t$ & $H^{\mathrm T}s$\\
+ Potts & $\mathbb Z/q\mathbb Z$ & $D_n$ & $r_m\cdot s=m+s$, $s_m\cdot
+ s=m-s$ & $\delta(s,t)$ & $\sum_mH_m\delta(m,s)$\\
+ Clock & $\mathbb Z/q\mathbb Z$ & $D_n$ & $r_m\cdot s=m+s$, $s_m\cdot
+ s=-m-s$ & $\cos(2\pi\frac{s-t}q)$ & $\sum_mH_m\cos(2\pi\frac{s-m}q)$\\
+ Roughening & $\mathbb Z$ & $D_\inf$ & $r_m\cdot s=m+s$, $s_m\cdot s=-m-s$
+ & $(s-t)^2$ & $Hs^2$\\
+ \end{tabular}
+ \caption{Some models.}
+ \label{table:models}
+\end{table*}
+
+The goal of statistical mechanics as applied to these systems is to compute
+expectation values of observables $A:X^N\to\R$. Assuming the ergodic
+hypothesis holds (for systems with broken-symmetry states, it does not), the
+expected value $\avg A$ of an observable $A$ is its average over every state
+$\vec s$
+in the configuration space $X^N$ weighted by the probability $p(\vec s)$ of
+that state appearing, or
\[
- M(s)=\sum_{i\in V}s_i.
+\avg A
+ =\frac{\int_{X^N}A(\vec s)e^{-\beta\H(\vec s)}\,\dd\mu(\vec s)}
+ {\int_{X^N}e^{-\beta\H(\vec s)}\,\dd\mu(\vec s)}
\]
-For the purpose of this study, we will only be considering ferromagnetic
-systems where the function $J$ is nonnegative.
-
-An observable of the system is a function $A:S^n\to\R$ depending on the
-system's state. Both the Hamiltonian and magnetization defined above are
-observables. Assuming the ergodic hypothesis holds, the expected value $\avg
-A$ of
-any observable $A$ is defined by its average over every state $s$ in the
-configuration space $S^n$ weighted by the Boltzmann factor $e^{-\beta\H(s)}$,
-or
+
+We will first describe a generalized version of the celebrated Wolff algorithm
+in the standard case where $B(s)=0$. After reflecting on the technical
+requirements of that algorithm, we will introduce a transformation to our
+system and Hamiltonian that allows the same algorithm to be applied with
+nonzero, in fact \emph{arbitrary}, external fields.
+
+The Wolff algorithm proceeds in the following way.
+\begin{enumerate}
+ \item Pick a random site and a random rotation $r\in R$ of order two, and add the site to
+ a stack.
+ \item While the stack isn't empty,
+ \begin{enumerate}
+ \item pop site $m$ from the stack.
+ \item If site $m$ isn't marked,
+ \begin{enumerate}
+ \item mark the site.
+ \item For every $j$ such that $\{m,j\}\in E$, add site $j$ to the
+ stack with probability
+ \[
+ p_r(s_m,s_j)=\min\{0,1-e^{\beta(Z(r\cdot s_m,s_j)-Z(s_m,s_j))}\}.
+ \]
+ \item Take $s_m\mapsto r\cdot s_m$.
+ \end{enumerate}
+ \end{enumerate}
+\end{enumerate}
+When the stack is exhausted, a cluster of connected spins will have been
+rotated by the action of $r$. In order for this algorithm to be useful, it
+must satisfy ergodicity and detailed balance. The probability $P(\vec s\to\vec
+s')$ that the configuration $\vec s$ is brought to $\vec s'$ by the flipping
+of a cluster $C\subseteq E$ is related to the probability of the reverse process $P(\vec
+s'\to\vec s)$ by
+\begin{widetext}
\[
- \avg A=\frac1Z\sum_{s\in S^n}e^{-\beta\H(s)}A(s),
- \label{eq:avg.obs}
+ \begin{aligned}
+ \frac{P(\vec s\to\vec s')}{P(\vec s'\to\vec s)}
+ &=\prod_{\{i,j\}\in
+ C}\frac{p_r(s_i,s_j)}{p_{r^{-1}}(s_i',s_j')}\prod_{\{i,j\}\in\partial
+ C}\frac{1-p_r(s_i,s_j)}{1-p_{r^{-1}}(s'_i,s'_j)}
+ =\prod_{\{i,j\}\in
+ C}\frac{p_r(s_i,s_j)}{p_{r}(r\cdot s_i,r\cdot s_j)}\prod_{\{i,j\}\in\partial
+ C}\frac{1-p_r(s_i,s_j)}{1-p_{r^{-1}}(r\cdot s_i,s_j)}
+ \\
+ &=\prod_{\{i,j\}\in
+ C}\frac{p_r(s_i,s_j)}{p_{r}(s_i,s_j)}\prod_{\{i,j\}\in\partial
+ C}e^{\beta(Z(r\cdot s_i,s_j)-Z(s_i,s_j))}
+ =\frac{e^{-\beta\H(\vec s)}}{e^{-\beta\H(\vec s')}}
+\end{aligned}
\]
-where the partition function $Z$ is defined by
+\end{widetext}
+which satisfies detailed balance so long as $p_r(s,t)=p_{r^{-1}}(s,t)$.
+Therefore, for the algorithm to function one must choose rotations at random
+only from those group elements in $R$ that act by idempotents. Usually, as is
+the case in all the examples here, the action is natural enough that this is
+the same as only choosing group elements of order two to act on your spins.
+Ergodicity is achieved if the subset of symmetry rotations that obey this
+property allow any spin state to access any other under composition. This is
+true for all the
+models in Table \ref{table:models}.
+
+The function of the algorithm described above depends on the fact that the
+coupling $Z$ depends only on the relative orientation of the spins---global
+reorientations by acting by some rotation do not affect the Hamiltonian. The
+external field $B$ breaks this symmetry. However, this can be resolved. Define
+a new graph $\tilde G=(\tilde V,\tilde E)$, where $\tilde V=\{0,1,\ldots,N\}$
+and
\[
- Z=\sum_{s\in S^n}e^{-\beta\H(s)}
+ \tilde E=E\cup\big\{\{0,i\}\mid i\in V\big\}.
\]
-and gives the correct normalization for the weighted sum.
-Unfortunately, the sum over configurations in \eqref{eq:avg.obs} are intractable for all but
-for very small systems. Therefore expectation values are usually
-approximated by some means. Monte Carlo methods are a common way of
-accomplishing this. These methods sample states $s$ from the configuration space
-$S^n$ according to the Boltzmann distribution $e^{-\beta\H(s)}$ so that
-averages of observables made using their incomplete samples asymptotically approach the
-true expected value.
-
-The Metropolis--Hastings algorithm
-\cite{metropolis1953equation,hastings1970monte} is very popular for systems in statistical
-physics. A starting state $s\in S^n$ is randomly perturbed to the state $s'$,
-usually by flipping one spin. The change in energy $\Delta\H=\H(s')-\H(s)$ due
-to the perturbation is then computed. If the change is negative the perturbed
-state $s'$ is accepted as the new state. Otherwise the perturbed state
-is accepted with probability $e^{-\beta\Delta\H}$. This process is repeated
-indefinitely and a sample of states is made by sampling the state $s$
-between iterations sufficiently separated that the successively sampled states
-are uncorrelated, e.g., at separations larger than the correlation time $\tau$. Metropolis--Hastings is very general, but unless the
-perturbations are very carefully chosen the algorithm suffers in regimes where
-large correlations are present in the system, for instance near continuous
-phase transitions. Here the algorithm suffers from what is known as critical
-slowing-down, where likely states consist of large correlated clusters that
-take many perturbations to move between in configuration space.
-
-%\begin{figure}
-% \begin{algorithm}[H]
-% \begin{algorithmic}
-% \REQUIRE $s\in S^n$
-% \STATE $s'\gets$ \texttt{Perturb}($s$)
-% \STATE $\Delta\H\gets\H(s')-\H(s)$
-% \IF {$\exp(-\beta\Delta\H)>$ \texttt{UniformRandom}$(0,1)$}
-% \STATE $s\gets s'$
-% \ENDIF
-% \end{algorithmic}
-% \caption{Metropolis--Hastings}
-% \label{alg:met-hast}
-% \end{algorithm}
-%\end{figure}
-%
-The Wolff algorithm \cite{wolff1989collective} solves many of these problems, but only at zero external
-field, $H=0$. This algorithm solves the problem of critical slowing-down by
-flipping carefully-constructed clusters of spins at once in a way that samples
-high-correlated states quickly while also always accepting prospective states.
-A random site $i$ is selected from the graph and its spin is flipped. Each of the
-site's neighbors $j$ is also flipped with probability $1-e^{-2\beta J_{ij}}$
-if doing so would lower the energy of the bond $i$--$j$. The process is
-repeated with every neighbor that was flipped. While this algorithm
-successfully addresses the problems of critical slowing down at zero field, it
-doesn't directly apply at nonzero field. A common extension is the hybrid
-Wolff--Metropolis, where a Wolff cluster is formed using the ordinary formula,
-then accepted or rejected Metropolis-style based on the change in energy
-\emph{relative to the external field}, $H\Delta M$. Because each spin has the
-same coupling to the external field, a strictly more efficient
-but exactly equivalent version of this hybrid is made by distributing the
-cluster acceptance or rejection across the spins in the cluster as they are
-added. In this version, the cluster is abandoned with probability $1-e^{-2\beta
-H}$ every time a spin is added to it.
-
-$z=0.29(1)$ \cite{wolff1989comparison,liu2014dynamic} $z=0.35(1)$ for
-Swendsen--Wang \cite{swendsen1987nonuniversal}
-
-%\begin{figure}
-% \begin{algorithm}[H]
-% \begin{algorithmic}
-% \REQUIRE $s\in\G^n$
-% \STATE \textbf{let} $q$ be an empty stack
-% \STATE \textbf{let} $i_0\in V$
-% \STATE $q.\mathtt{push}(i_0)$
-% \STATE $\sigma\gets s_{i_0}$
-% \WHILE {$q$ is not empty}
-% \STATE $i\gets q.\mathtt{pop}$
-% \IF {$s_i=\sigma$}
-% \FORALL {$j$ such that $\{i,j\}\in E$}
-% \IF {$1-\exp(-2\beta
-% J_{ij}s_is_j)>\mathop{\mathtt{UniformRandom}}(0,1)$}
-% \STATE $q.\mathtt{push}(j)$
-% \ENDIF
-% \ENDFOR
-% \STATE $s_i\gets-s_i$
-% \ENDIF
-% \ENDWHILE
-% \end{algorithmic}
-% \caption{Wolff (Zero-Field)}
-% \label{alg:wolff}
-% \end{algorithm}
-%\end{figure}
-%
-%
-%
-%\begin{figure}
-% \begin{algorithm}[H]
-% \begin{algorithmic}
-% \REQUIRE $s\in\G^n$
-% \STATE $s'\gets s$
-% \STATE \textbf{let} $q$ be an empty stack
-% \STATE \textbf{let} $i_0\in V$
-% \STATE $q.\mathtt{push}(i_0)$
-% \STATE $\sigma\gets s_{i_0}'$
-% \STATE \texttt{completed} $\gets$ \textbf{true}
-% \WHILE {$q$ is not empty}
-% \STATE $i\gets q.\mathtt{pop}$
-% \IF {$s_i'=\sigma$}
-% \FORALL {$j$ such that $\{i,j\}\in E$}
-% \IF {$1-\exp(-2\beta
-% J_{ij}s_i's_j')>\mathop{\mathtt{UniformRandom}}(0,1)$}
-% \STATE $q.\mathtt{push}(j)$
-% \ENDIF
-% \ENDFOR
-% \STATE $s_i'\gets-s_i'$
-% \STATE $q.\mathtt{push}(i)$
-% \IF {$1-\exp(-2\beta\sigma H)>\mathop{\mathtt{UniformRandom}}(0,1)$}
-% \STATE \texttt{completed} $\gets$ \textbf{false}
-% \STATE \textbf{break}
-% \ENDIF
-% \ENDIF
-% \ENDWHILE
-% \IF {completed}
-% $s\gets s'$
-% \ENDIF
-% \end{algorithmic}
-% \caption{Hybrid Wolff/Metropolis--Hastings}
-% \label{alg:h.wolff}
-% \end{algorithm}
-%\end{figure}
-
-\section{Cluster-flip in a Field}
-
-Consider the new graph $\tilde G=(\tilde V,\tilde E)$ defined from $G$ by
-\begin{align}
- \tilde V&=0\cup V\\
- \tilde E&=E\cup\{\{0,i\}\mid i\in V\},
-\end{align}
-or by adding a zeroth vertex and edges from every other vertex to the new
-one. Thee spin at site zero is often known as a ``ghost spin.'' The network of spins
-described by this graph now have states that occupy a configuration space
-$S^{n+1}$. Extend the coupling between spins $\tilde J:\tilde E\to\R$ by
+We have introduced a new site to the lattice that neighbors every other site.
+Instead of assigning this site a spin whose value comes from the set $X$, we
+will assign it values $s_0\in R$, symmetry group elements, so that the new
+configuration space of the model is $R\times X^N$. We introduce a Hamiltonian
+$\tilde\H:R\times X^N\to\R$ defined by
\[
- \tilde J(e)=\begin{cases}
- J(e) & e\in E\\
- |H| & \text{otherwise},
- \end{cases}
+\begin{aligned}
+ \tilde\H(s_0,\vec s)
+ &=-\!\!\!\!\sum_{\{i,j\}\in E}\!\!\!\!Z(s_i,s_j)
+ -\sum_{i\in V}B(s_0^{-1}\cdot s_i)\\
+ &=-\!\!\!\!\sum_{\{i,j\}\in\tilde E}\!\!\!\!\tilde Z(s_i,s_j)
+\end{aligned}
\]
-so that each spin is coupled to every other spin the same way they were
-before, and coupled to the new spin with strength $|H|$, the magnitude of the
-external field. Now define a
-Hamiltonian function
-$\tilde\H:S^{n+1}\to\R$ on this new, larger configuration space defined by
+where the new coupling $\tilde Z:(R\cup X)\times(R\cup X)\to\R$ is defined for $s,t\in
+R\cup X$ by
\[
- \tilde\H(s)=-\sum_{\{i,j\}\in\tilde E}\tilde J_{ij}s_is_j.
- \label{eq:new.h.simple}
+ \tilde Z(s,t) =
+ \begin{cases}
+ Z(s,t) & \text{if $s,t\in X$} \\
+ B(s^{-1}\cdot t) & \text{if $s\in R$} \\
+ B(t^{-1}\cdot s) & \text{if $t\in R$}
+ \end{cases}
+ \label{eq:new.z}
\]
-This new Hamiltonian resembles the old one, but is comprised only of
-spin--spin interactions with no external field. However, by changing the terms
-considered in the sum we may equivalently write
+Note that this modified coupling is invariant under the action of group
+elements: for any $r,s_0\in R$ and $s\in X$,
\[
- \tilde\H(s)=-\sum_{\{i,j\}\in E}J_{ij}s_is_j-H\tilde M(s)
+\begin{aligned}
+ \tilde Z(rs_0,r\cdot s)
+ &=B((rs_0)^{-1}\cdot (r\cdot s))\\
+ &=B((s_0^{-1}r^{-1})\cdot(r\cdot s))\\
+ &=B((s_0^{-1}r^{-1}r)\cdot s)\\
+ &=B(s_0^{-1}\cdot s)
+ =\tilde Z(s_0,s)
+\end{aligned}
\]
-where the new magnetization $\tilde M:S^{n+1}\to\R$ is defined for $s\in
-S^{n+1}$ by
+The invariance $\tilde Z$ to rotations given other arguments follows from the
+invariance properties of $Z$.
+
+We have produced a system that incorporates the field function $B$ whose
+Hamiltonian is invariant to global rotations, but how does it relate to our
+previous system, whose properties we actually want to measure? If $A:X^N\to\R$
+is an observable of the original system, one can construct an observable
+$\tilde A:R\times X^N\to\R$ of the new system defined by
\[
- \begin{aligned}
- \tilde M(s)
- &=\sgn(H)s_0\sum_{i\in V}s_i\\
- &=M(s_0\sgn(H)(s_1,\ldots,s_n)).
- \end{aligned}
+ \tilde A(s_0,\vec s)=A(s_0^{-1}\cdot\vec s)
\]
-In fact, any observable $A$ of the original system can be written as an
-observable $\tilde A$ of the new system by defining
+whose expectation value in the new system equals that of the original
+observable in the old system. First, note that $\tilde\H(1,\vec s)=\H(\vec s)$. Since
+the Hamiltonian is invarient under global rotations, it follows that for any
+$g\in R$, $\tilde\H(g,g\cdot\vec s)=\tilde\H(g^{-1}g,g^{-1}g\cdot\vec
+s)=\tilde\H(1,\vec s)=\H(\vec s)$.
+Using the invariance properties of the measure on $X$ and introducing a
+measure $\rho$ on $R$, it follows that
\[
- \tilde A(s)=A(s_0\sgn(H)(s_1,\ldots,s_n))
+\begin{aligned}
+ \avg{\tilde A}
+ &=\frac{
+ \int_R\int_{X^N}\tilde A(s_0,\vec
+ s)e^{-\beta\tilde\H(s_0,\vec s)}\,\dd\mu(\vec s)\,\dd\rho(s_0)
+ } {
+ \int_R\int_{X^N}e^{-\beta\tilde\H(s_0,\vec s)}\,\dd\mu(\vec s)\,\dd\rho(s_0)
+ }\\
+ &=\frac{
+ \int_R\int_{X^N}A(s_0^{-1}\cdot\vec
+ s)e^{-\beta\tilde\H(s_0,\vec s)}\,\dd\mu(\vec s)\,\dd\rho(s_0)
+ } {
+ \int_R\int_{X^N}e^{-\beta\tilde\H(s_0,\vec s)}\,\dd\mu(\vec s)\,\dd\rho(s_0)
+ }\\
+ &=\frac{
+ \int_R\int_{X^N}A(\vec
+ s')e^{-\beta\tilde\H(s_0,s_0\cdot\vec
+ s')}\dd\mu(s_0\cdot\vec s')\,\dd\rho(s_0)
+ } {
+ \int_R\int_{X^N}e^{-\beta\tilde\H(s_0,s_0\cdot\vec s')}\dd\mu(s_0\cdot\vec
+ s')\,\dd\rho(s_0)
+ }\\
+ &=\frac{
+ \int_R\dd\rho(s_0)}{
+ \int_R\dd\rho(s_0)}\frac{\int_{X^N}A(\vec
+ s')e^{-\beta\H(\vec
+ s')}\dd\mu(\vec s')
+}{\int_{X^N}e^{-\beta\H(\vec s')}\dd\mu(\vec
+ s')
+ }
+ =\avg A
+\end{aligned}
\]
-such that $\avg{\tilde A}=\avg A$. This can be seen readily by using the
-symmetry $\tilde\H(-s)=\tilde\H(s)$ of the Hamiltonian
-\eqref{eq:new.h.simple}, the invariance of configuration space sums under
-negation of their summand, and the fact that $\tilde\H(1,s)=\H(s)$ for any $s\in
-S^n$. Notice in particular that this is true for the Hamiltonian $\tilde\H$ as
-well.
-
-Our new spin system with an additional field is simply a ferromagnetic spin system in the absence of an external field.
-Therefore, the Wolff algorithm can be applied to it with absolutely no
-modifications. Since there is an exact correspondence between the expectation
-values of our ordinary spin system in a field and their appropriately defined
-counterparts in our new system, sampling Monte Carlo simulations of our new
-system allows us to estimate the expectation values for the old. This ``new''
-algorithm, if you can call it that, is shown in Algorithm
-\ref{alg:wolff-field}.
+To summarize, spin systems in a field may be treated in the following way.
+\begin{enumerate}
+ \item Add a site to your lattice adjacent to every other site.
+ \item Initialize a ``spin'' at that site that is a representation of a
+ member of the symmetry group of your ordinary spins.
+ \item Carry out the ordinary Wolff cluster-flip procedure on this new
+ lattice, substituting $\tilde Z$ as defined in \eqref{eq:new.z} for $Z$.
+\end{enumerate}
+Ensemble averages of observables $A$ can then be estimated by sampling the
+value of $\tilde A$ on the new system.
+
+\section{Examples}
+
+\subsection{The Ising Model}
+
+In the Ising model, spins are drawn from the set $\{1,-1\}$. The symmetry
+group of this model is $C_2$, the cyclic group on two elements, which can be
+conveniently represented by the multiplicative group with elements $\{1,-1\}$,
+exactly the same as the spins themselves. The only nontrivial element is of
+order two. Because the symmetry group and the spins are described by the same
+elements, performing the algorithm on the Ising model in a field is very
+accurately described by simply adding an extra spin coupled to all others and
+running the ordinary algorithm.
+
+\subsection{The $n$-component Model}
+
+In the $n$-component model, spins are described by vectors on the $(n-1)$-sphere,
+so that $X=S^{n-1}$. The symmetry group of this model is $O(n)$, $n\times n$
+orthogonal matrices. The symmetry group acts on the spins by matrix
+multiplication. The elements of $O(n)$ that are order two are reflections
+about some hyperplane through the origin and $\pi$ rotations about any axis
+through the origin. Since the former generate the entire group, the set of
+reflections alone suffices to provide ergodicity. Computation of the coupling
+of ordinary spins with the external field and expectation values requires a
+matrix inversion, but since the matrices in question are orthogonal this is
+quickly accomplished by a transpose.
+
+\subsection{The Potts \& Clock Models}
+
+In both the $q$-state Potts and clock models, spins are described by
+$\Z/q\Z$, the set of integers modulo $q$. The symmetry group of this model is the dihedral group
+$D_q=\{r_0,\ldots,r_{q-1},s_0,\ldots,s_{q-1}\}$, the group of symmetries of a
+regular $q$-gon. The element $r_n$ represents a rotation of the polygon by
+$2\pi n/q$, and the element $s_n$ represents a reflection composed with a
+rotation $r_n$. The group acts on the spins by permutation: $r_n\cdot
+m={n+m}\pmod q$
+and $s_n\cdot m={-(n+m)}\pmod q$. Intuitively, this can be thought of
+as the natural action of the group on the vertices of a regular polygon that have
+been numbered $0$ through $q-1$. The elements of $D_q$ that are of order 2 are
+all reflections and $r_{q/2}$ if $q$ is even, though the former can generate
+the latter. While the reflections do not necessarily generate the entire group, for any
+$n,m\in\Z/q\Z$ there
+exists a
+reflection that takes $n\to m$, ensuring
+ergodicity. The elements of the dihedral group can be stored simply as an
+integer and a boolean that represents whether the element is a pure rotation or a
+reflection. The principle difference between the Potts and clock models is
+that, in the latter case, the form of the coupling $Z$ allows a geometric
+interpretation as being two-dimensional vectors fixed with even spacing along
+the unit circle.
+
+
+\subsection{Roughening}
+
+Though not often thought of as a spin model, simple roughening of surfaces can
+be described in this framework. The set of states is the integers $\Z$ and its
+symmetry group is the infinite dihedral group $D_\infty=\{r_i,s_i\mid
+i\in\Z\}$, where the action of the symmetry on the spins $j\in\Z$ is given by $r_i\cdot
+j=i+j$ and $s_i\cdot j=i-j$. These are shifts by $i$ and reflection about the
+integer $i$, respectively. The elements of order two are the reflections
+$s_i$, which suffice to provide ergodicity as any integer can be taken to any
+other in one step of this kind. The coupling is usually taken to be
+$Z(i,j)=|i-j|$, though it may also be any function of the absolute difference.
+Because randomly choices of integer will almost always result in energy
+changes so big that the whole system is always flipped, it is better to select
+random reflections about integers close to the average state of the system.
+Continuous roughening models---where the spin states are described by real
+numbers and the symmetry group is $\mathrm E(1)$, the Euclidean group for
+one-dimensional space---are equally well described.
+
%\begin{figure}
-% \begin{algorithm}[H]
-% \begin{algorithmic}
-% \REQUIRE $s\in S^n$
-% \STATE \textbf{let} $q$ be an empty stack
-% \STATE \textbf{let} $i_0\in V$
-% \STATE $q.\mathtt{push}(i_0)$
-% \STATE $\sigma\gets s_{i_0}$
-% \WHILE {$q$ is not empty}
-% \STATE $i\gets q.\mathtt{pop}$
-% \IF {$s_i=\sigma$}
-% \FORALL {$j$ such that $\{i,j\}\in\tilde E$}
-% \IF {$1-\exp(-2\beta \tilde
-% J_{ij}s_is_j)>\mathop{\mathtt{UniformRandom}}(0,1)$}
-% \STATE $q.\mathtt{push}(j)$
-% \ENDIF
-% \ENDFOR
-% \STATE $s_i\gets-s_i$
-% \ENDIF
-% \ENDWHILE
-% \end{algorithmic}
-% \caption{Wolff (Nonzero Field)}
-% \label{alg:wolff-field}
-% \end{algorithm}
+% \centering
+% \input{fig_correlation}
+% \caption{The autocorrelation time $\tau$ of the internal energy $\H$ for a $n=32\times32$ square-lattice Ising
+% model simulated by the three nonzero field methods detailed in this paper.
+% The top plot is for simulations at high temperature, $T=5.80225$, the middle
+%plot is for simulations at the critical temperature, $T=2.26919$, and the
+%bottom plot is for simulations at }
+% \label{fig:correlation}
%\end{figure}
-%
-
-At sufficiently small $|H|$ both our algorithm and the hybrid Wolff--Metropolis reduce
-to the ordinary Wolff algorithm, and have its runtime properties. At very
-large $|H|$, the hybrid
-Wolff--Metropolis behaves exactly like Metropolis, where almost only one spin
-is ever flipped at a time with probability $\sim e^{-2\beta H}$, since the
-energy is dominated by contributions from the field.
-
-We measured the autocorrelation time $\tau$ of the internal energy $\H$ of a
-square-lattice Ising model ($J=1$ and $E$ is the set of nearest neighbor
-pairs)
-resulting from using each of these three algorithms at various
-fields and temperatures. This was done using an initial convex sequence
-estimator, as described in \cite{geyer1992practical}. Time was
-measured as ``the number of spins that the algorithm has attempted to flip.''
-For example, every Metropolis--Hastings step takes unit time, every Wolff step takes
-time equal to the number of spins in the flipping cluster, and every hybrid
-Wolff/Metropolis--Hastings step takes time equal to the number of spins in the
-cluster the step attempted to flip, whether it succeeds or not. The
-autocorrelation time as a function of reduced field $\beta H$ is shown in
-Fig.~\ref{fig:correlation}. By this measure of time, the modified Wolff
-algorithm thermalizes faster than both the hybrid Wolff/Metropolis and
-ordinary Metropolis--Hastings in all regimes. At high field ($\beta H\sim1$)
-the new algorithm has an autocorrelation time of roughly $n$, while both
-Metropolis--Hastings and the hybrid algorithm have an autocorrelation time of
-$2n$. At low temperature, neither the new algorithm nor Metropolis-Hastings
-have an autocorrelation time that depends on field, but the hybrid method
-performs very poorly at intermediate values of field in a way that becomes
-worse with system size, scaling worse than the hybrid algorithm does even it
-the critical temperature. At high temperature, all three algorithms perform
-similarly and in a way that is independent of system size. At the critical
-temperature, both the Metropolis--Hastings and hybrid algorithms experience
-critical slowing down in a similar way, whereas the new algorithm slows down
-in a far reduced way, always performing more efficiently at nonzero field than
-the zero-field Wolff algorithm.
\begin{figure}
\centering
- \input{fig_correlation}
- \caption{The autocorrelation time $\tau$ of the internal energy $\H$ for a $n=32\times32$ square-lattice Ising
- model simulated by the three nonzero field methods detailed in this paper.
- The top plot is for simulations at high temperature, $T=5.80225$, the middle
-plot is for simulations at the critical temperature, $T=2.26919$, and the
-bottom plot is for simulations at }
- \label{fig:correlation}
+ \input{fig_correlation_collapse-hL}
+ \caption{The correlation time $\tau$ as a function of the renormalization
+ invarient $hL^{-\beta\delta/\nu}$ for the $N=L\times L$ square lattice
+ Ising model for $L=8$, $16$, $32$, $64$, $128$, and $256$. $z=0.3$
+ }
\end{figure}
\begin{figure}
\centering
- \input{fig_correlation_collapse-hL}
+ \input{fig_correlation-temp}
+ \caption{The correlation time $\tau$ as a function of the renormalization
+ invarient $ht^{-\beta\delta}$ for the $N=128\times 128$ square lattice
+ Ising model. $z=0.3$
+ }
\end{figure}
-Our algorithm for the Ising model in a field can be
-generalized to run on the $q$-spin Potts or $O(n)$ models in exactly the same
-way as the conventional Wolff algorithm.
-
-\section{Magnetization Estimator}
-
-At any size, the ensemble average magnetization $\avg M$ is identically zero
-at zero field. However, this is not what is observed at low temperature. This
-is because in the low temperature phase the ergodic hypothesis---that the
-time-average value of observables is equal to their ensemble average---is
-violated. As the system size grows the likelihood of a fluctuation in any
-reasonable dynamics that flips the magnetization from one direction to the
-other becomes vanishingly small, and therefore it is inappropriate to estimate
-expected values in the low temperature phase by averaging over the whole
-configuration space. Instead, values must be estimated by averaging over the
-portion of configuration space that is accessible to the dynamics.
-
-For finite size systems, like any we would simulate, dynamics at zero field or
-even small nonzero field do allow the whole configuration space to be
-explored. However, people usually want to use the results from finite size
-systems to estimate the expected values in the thermodynamic limit, where this
-is no longer true. At zero field, for instance, it is common practice to use
-$\avg{|M|}$ to estimate the expected value for the magnetization instead of
-$\avg M$. But what to do at finite field? Is this approach justified?
-
-The provenance of the restricted configuration space in the thermodynamic
-limit is the fact that, at criticality or for nonzero $H$ under the critical
-temperature, an infinite cluster forms and can never be flipped by the
-dynamics. In a finite system there is no infinite cluster, and even the
-largest cluster has a nonzero probability of being flipped. However, we still
-want to be able to use finite systems to estimate the quantities of infinite
-ones. We approach this by thinking of a finite system as a piece of an
-infinite one. Clearly, time-averages of quantities in the chunk are equal to
-time-averages of the whole system, simply with much slower convergence.
-However, simulation of finite systems present a problem: in each sample, we
-are given a snapshot of a chunk of the system but do not know which direction
-the infinite cluster in the surrounding infinite system is pointing.
-Therefore, we must make a guess such that, in the limit of larger and larger
-system size, the probability that our guess is wrong and that the infinite
-cluster is pointed in a direction opposite the one we expected approaches
-zero. To estimate the equilibrium values of quantities in the infinite system,
-we only admit values resulting from states whose corresponding infinite
-cluster is expected to point in the direction of our external field. If the
-external field is zero, we choose one direction at random, say positive.
-
-Snapshots of the chunk what we suspect are likely to be a piece of an infinite
-cluster facing opposite the external field are discarded for the purpose of
-computing equilibrium values of the system, but still represent something
-physical. When the model's state contains an infinite cluster oriented against
-the external field, it is known as \emph{metastable}. Therefore in a finite
-simulation one can make convergent estimates of quantities in both the
-equilibrium and metastable states of the infinite system by dividing sampled
-states into two sets: those that are likely snapshots of a finite block in an
-infinite system whose infinite cluster is oriented with the field, and those
-whose infinite cluster is oriented against the field. All we must do now is
-specify how to distinguish between likely oriented with and likely oriented
-against. Others have used the direction of the largest cluster in the finite
-system to make this distinction. Far from the critical point in the
-low-temperature phase this does a clear and consistent job. However, we would
-like to study systems very near the critical point. We will take the likely direction of the
-infinite cluster to be given by \emph{the direction in which the system is
-magnetized}. Clearly, as our finite chunk grows larger and larger, the
-probability that its magnetization is different from the direction of the
-infinite cluster goes to zero. In the thermodynamic limite
-
-Since, in the thermodynamic limit expected values are given by an average
-over a restricted configuration space, we can estimate those expected values
-at finite size by making the same restriction. Defining the reduced
-configuration spaces
-\begin{align}
- S_\e^n&=\{s\in S^n\mid \sgn(H)M(s)>0\}
- \\
- S_\m^n&=\{s\in S^n\mid \sgn(H)M(s)<0\}
- \\
- S_0^n&=\{s\in S^n\mid \sgn(H)M(s)=0\}
-\end{align}
-where
-\[
- \sgn(H)=\begin{cases}1&H\geq0\\-1&H<0.\end{cases}
-\]
-Clearly, $S^n=S^n_\e\cup S^n_\m\cup S^n_0$, which implies that
-$Z=Z_\e+Z_\m+Z_0$. Since $S^n_\e=\{-s\mid s\in S^n_\m\}$, $|S^n_\e|=|S^n_\m|$.
-At zero field, $\H(s)=\H(-s)$, and therefore $Z_\e=Z_\m$. This can be used to
-show the near-equivalence (at zero field) of taking expectation values
-$\avg{|M|}$ of the absolute value of the magnetization and taking expectation
-values $\avg M_\e$ of the magnetization on a reduced configuration space,
-since
-\begin{align}
- \avg{|M|}=\frac1{1+Z_0/2Z_\e}\eavg M
-\end{align}
-At infinite temperature, $Z_0/Z_\e\simeq n^{-1/2}\sim L^{-1}$ for large $L$,
-$n$. At any finite temperature, especially in the ferromagnetic phase,
-$Z_0\ll Z_\e$ in a much more extreme way.
-
-If the ensemble average over only positive magnetizations can be said to
-converge to the equilibrium state of the Ising model in the thermodynamic
-limit, what of the average over only \emph{negative} magnetizations, or the
-space $S^n_\m$,
-ordinarily unphysical? This is, in one sense, precisely the definition of the
-metastable state of the Ising model. Expected values of observables in the
-metastable state can therefore be computed by considering this reduced
-ensemble.
-
-How, in practice, are these samplings over reduced ensembles preformed? There
-may be efficient algorithms for doing Monte Carlo sampling inside a particular
-reduced ensemble, but for finite-size systems with any of the algorithms
-described here the whole configuration space is available. Any of these can be
-used, however, to simultaneously sample all reduced configuration spaces by
-allowing them to sample the whole space and then only add a contribution to
-a given average if the associated state is in the reduced space of interest.
-
\begin{acknowledgments}
- Thanks!
\end{acknowledgments}
\bibliography{monte-carlo}