
Cluster Algorithms: Beating Critical Slowing Down

The problem: critical slowing down

Near $T_c \approx 2.269\,J/k_B$ the correlation length $\xi$ diverges. A single Metropolis flip only propagates local information, so decorrelating a whole configuration requires $\tau \sim \xi^{z}$ Monte Carlo sweeps, with dynamic exponent $z \approx 2.17$ for the 2D Ising universality class. In practice: simulations grind to a halt exactly where we most want to measure.
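To get a feel for the numbers, here is a rough, prefactor-free estimate (an illustration only, not a measurement), assuming $\xi$ is capped by the linear lattice size $L$ so that $\tau \sim L^{z}$:

# Back-of-the-envelope: tau ~ L^z sweeps per independent sample at T_c.
# Prefactors of order one are ignored; this only shows the trend.
z = 2.17
for L in (16, 32, 64, 128):
    print(f"L = {L:4d}:  tau ~ {L**z:8.0f} sweeps")

Doubling the lattice multiplies the decorrelation cost by $2^{2.17} \approx 4.5$, on top of the $4\times$ cost of each sweep itself.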

Wolff’s idea. Instead of flipping one spin, grow a connected cluster of like-oriented spins by adding neighbors with probability

$$p_{\text{add}} = 1 - e^{-2\beta J},$$

and flip the entire cluster. The factor is tuned so the move satisfies detailed balance with respect to the Boltzmann distribution. Near $T_c$ a single cluster move can flip a domain the size of the correlation length, giving a dynamic exponent $z_{\text{Wolff}} \approx 0.25$, essentially free sampling.
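Why this particular factor? A compact version of the standard argument (stated without the full bookkeeping): suppose the grown cluster's boundary crosses $m$ bonds to parallel spins and $n$ bonds to antiparallel spins. Flipping the cluster costs $\Delta E = 2J(m - n)$, while the probabilities of proposing this cluster in the forward and reverse directions carry factors $(1 - p_{\text{add}})^{m}$ and $(1 - p_{\text{add}})^{n}$ from the bonds that were not activated. Detailed balance then requires the acceptance ratio

$$\frac{A(\mu \to \nu)}{A(\nu \to \mu)} = e^{-\beta \Delta E}\,(1 - p_{\text{add}})^{\,n - m} = \left[ e^{2\beta J}\,(1 - p_{\text{add}}) \right]^{\,n - m},$$

which equals 1 for every possible cluster once $1 - p_{\text{add}} = e^{-2\beta J}$: the flip can always be accepted. At $T_c$ this gives $p_{\text{add}} = 2 - \sqrt{2} \approx 0.586$, so a bit over half of the eligible neighbors join.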

import numpy as np
import matplotlib.pyplot as plt
from numba import njit
from matplotlib.animation import FuncAnimation
from IPython.display import HTML

rng = np.random.default_rng(0)

Two samplers side by side

Both routines return a trace of the magnetization, one entry per update (a full sweep for Metropolis, a single cluster flip for Wolff), so we can compare their autocorrelation.

@njit
def metropolis_ising(spins, T=2.27, J=1.0, n_sweeps=2000):
    N = spins.shape[0]
    beta = 1.0 / T
    M_trace = np.empty(n_sweeps)
    for sweep in range(n_sweeps):
        for _ in range(N * N):          # one sweep = N*N attempted flips
            i = np.random.randint(N)
            j = np.random.randint(N)
            s = spins[i, j]
            # sum of the four nearest neighbors with periodic boundaries
            nb = (spins[(i+1) % N, j] + spins[(i-1) % N, j]
                  + spins[i, (j+1) % N] + spins[i, (j-1) % N])
            dE = 2.0 * J * s * nb       # energy cost of flipping spins[i, j]
            if dE <= 0 or np.random.rand() < np.exp(-beta * dE):
                spins[i, j] = -s        # Metropolis accept
        M_trace[sweep] = spins.mean()   # one magnetization sample per sweep
    return spins, M_trace

@njit
def wolff_ising(spins, T=2.27, J=1.0, n_clusters=2000):
    N = spins.shape[0]
    beta = 1.0 / T
    p_add = 1.0 - np.exp(-2.0 * beta * J)       # bond-activation probability
    mask = np.zeros((N, N), dtype=np.uint8)     # 1 = site is in the current cluster
    stack_x = np.empty(N * N, dtype=np.int64)   # explicit DFS stack (numba-friendly)
    stack_y = np.empty(N * N, dtype=np.int64)
    M_trace = np.empty(n_clusters)

    for step in range(n_clusters):
        mask[:, :] = 0
        # seed the cluster at a random site
        i0 = np.random.randint(N)
        j0 = np.random.randint(N)
        seed = spins[i0, j0]
        mask[i0, j0] = 1
        stack_x[0] = i0
        stack_y[0] = j0
        top = 1
        # grow: pop a site, try to add each aligned neighbor with prob p_add
        while top > 0:
            top -= 1
            x = stack_x[top]
            y = stack_y[top]
            for dx, dy in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                xn = (x + dx) % N
                yn = (y + dy) % N
                if mask[xn, yn] == 0 and spins[xn, yn] == seed:
                    if np.random.rand() < p_add:
                        mask[xn, yn] = 1
                        stack_x[top] = xn
                        stack_y[top] = yn
                        top += 1
        # flip the whole cluster at once
        for x in range(N):
            for y in range(N):
                if mask[x, y]:
                    spins[x, y] = -spins[x, y]
        M_trace[step] = spins.mean()
    return spins, M_trace

Run both at $T_c$

Use the same lattice size and initial configuration so the comparison is fair. We equilibrate both chains first, then collect the production trace.

N = 32
Tc = 2.0 / np.log(1.0 + np.sqrt(2.0))   # exact 2D Ising critical T, J=1
init = rng.choice(np.array([-1, 1]), size=(N, N)).astype(np.int64)

spins_m = init.copy()
spins_m, _ = metropolis_ising(spins_m, T=Tc, n_sweeps=500)       # burn-in
_, M_metro = metropolis_ising(spins_m, T=Tc, n_sweeps=4000)

spins_w = init.copy()
spins_w, _ = wolff_ising(spins_w, T=Tc, n_clusters=500)          # burn-in
_, M_wolff = wolff_ising(spins_w, T=Tc, n_clusters=4000)

Autocorrelation of $|M|$

We measure $C(t) = \langle |M|_s\,|M|_{s+t}\rangle - \langle |M|\rangle^2$, normalized to $C(0) = 1$. At $T_c$ the Metropolis trace decorrelates over hundreds of sweeps; the Wolff trace is nearly white.

def autocorr(x, maxlag=400):
    x = np.asarray(x) - np.mean(x)
    var = np.dot(x, x) / len(x)
    lags = np.arange(maxlag)
    C = np.array([np.mean(x[:len(x)-t] * x[t:]) for t in lags]) / var
    return lags, C

lag_m, C_m = autocorr(np.abs(M_metro), maxlag=400)
lag_w, C_w = autocorr(np.abs(M_wolff), maxlag=400)

# crude integrated autocorrelation time: tau_int = 1/2 + sum_t C(t),
# truncating negative (noise-dominated) lags
tau_m = 0.5 + C_m[1:].clip(min=0).sum()
tau_w = 0.5 + C_w[1:].clip(min=0).sum()

fig, ax = plt.subplots(1, 2, figsize=(11, 4))
ax[0].plot(np.abs(M_metro), lw=0.8, color='steelblue', label='Metropolis')
ax[0].plot(np.abs(M_wolff), lw=0.8, color='crimson', label='Wolff')
ax[0].set(xlabel='update #', ylabel='$|M|$',
          title=f'Traces at $T_c = {Tc:.3f}$')
ax[0].legend()
ax[0].grid(True, ls=':', alpha=0.5)

ax[1].plot(lag_m, C_m, color='steelblue', label=f'Metropolis $\\tau_{{int}}\\approx{tau_m:.0f}$')
ax[1].plot(lag_w, C_w, color='crimson', label=f'Wolff $\\tau_{{int}}\\approx{tau_w:.0f}$')
ax[1].axhline(0, color='k', lw=0.5)
ax[1].set(xlabel='lag', ylabel='$C(t)/C(0)$', title='Normalized autocorrelation of $|M|$')
ax[1].legend()
ax[1].grid(True, ls=':', alpha=0.5)
fig.tight_layout()

The integrated autocorrelation time $\tau_{\text{int}}$ tells you how many Monte Carlo updates you need between effectively independent samples. At $T_c$ Metropolis typically needs hundreds; Wolff is of order 1, independent samples essentially for free. This speed-up is the entire reason cluster algorithms exist.
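A useful rule of thumb: a trace of $n$ correlated measurements contains roughly $n_{\text{eff}} \approx n / (2\tau_{\text{int}})$ independent ones. The minimal sketch below applies this to the quantities already computed above. One caveat, which slightly flatters Wolff in the raw comparison: a Wolff update touches a single cluster while a Metropolis sweep touches every site, so a fully fair comparison would also account for the work per update.

# Effective sample size n_eff = n / (2 * tau_int) for both traces.
n = len(M_metro)
print(f"Metropolis: n_eff ~ {n / (2 * tau_m):.0f} of {n} measurements")
print(f"Wolff:      n_eff ~ {n / (2 * tau_w):.0f} of {n} measurements")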

Watching a cluster grow and flip

A small animation of a few Wolff moves makes the physical picture clear: many single updates flip domains comparable to the correlation length.

spins_anim = init.copy()
frames = [spins_anim.copy()]
for _ in range(60):
    spins_anim, _ = wolff_ising(spins_anim, T=Tc, n_clusters=1)
    frames.append(spins_anim.copy())

fig, ax = plt.subplots(figsize=(4, 4))
im = ax.imshow(frames[0], cmap='coolwarm', vmin=-1, vmax=1)
ax.set_xticks([]); ax.set_yticks([])
title = ax.set_title('Wolff step 0')

def update(k):
    im.set_data(frames[k])
    title.set_text(f'Wolff step {k}')
    return im, title

ani = FuncAnimation(fig, update, frames=len(frames), interval=150, blit=False)
plt.close()
HTML(ani.to_jshtml())

Problems

  1. Dynamic exponent. Measure $\tau_{\text{int}}$ at $T_c$ for $N = 8, 16, 32, 64$ with both algorithms. Fit $\tau_{\text{int}} \sim N^{z}$ and confirm $z_{\text{Metropolis}} \approx 2.1$ while $z_{\text{Wolff}} \approx 0.25$.

  2. Off-criticality. Repeat the comparison at $T = 1.5$ and $T = 3.0$. Why does the Wolff advantage shrink far from $T_c$?

  3. Cluster-size distribution. Modify wolff_ising to return the size of every cluster. Plot its histogram at $T_c$; it should be a broad power law reflecting scale invariance. (A quick way to measure sizes without touching the function is sketched just after this list.)
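For orientation on Problem 3, one possible external check (a sketch, not the intended modification): a cluster of size $k$ flips exactly $k$ spins, so diffing configurations across single-cluster updates recovers the size.

# Measure cluster sizes by diffing configurations across single-cluster updates.
sizes = np.empty(2000, dtype=np.int64)
s = init.copy()
for k in range(sizes.size):
    before = s.copy()
    s, _ = wolff_ising(s, T=Tc, n_clusters=1)
    sizes[k] = (s != before).sum()     # number of flipped spins = cluster size

plt.hist(sizes, bins=np.logspace(0, np.log10(N * N), 30))
plt.xscale('log'); plt.yscale('log')
plt.xlabel('cluster size'); plt.ylabel('count')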

Reference

U. Wolff, Collective Monte Carlo updating for spin systems, Phys. Rev. Lett. 62, 361 (1989). doi:10.1103/PhysRevLett.62.361
