Rapid Mixing of Glauber Dynamics up to Uniqueness via Contraction

Zongchen Chen Georgia Institute of Technology. Email: chenzongchen@gatech.edu. Research supported in part by NSF grant CCF-2007022.    Kuikui Liu University of Washington. Email: liukui17@cs.washington.edu. Research supported in part by NSF grant CCF-1907845 and ONR-YIP grant N00014-17-1-2429.    Eric Vigoda University of California, Santa Barbara. Email: vigoda@ucsb.edu. Research supported in part by NSF grant CCF-2007022.
Abstract

For general antiferromagnetic 2-spin systems, including the hardcore model on weighted independent sets and the antiferromagnetic Ising model, there is an 𝖥𝖯𝖳𝖠𝖲 for the partition function on graphs of maximum degree Δ when the infinite regular tree lies in the uniqueness region by Li et al. (2013). Moreover, in the tree non-uniqueness region, Sly (2010) showed that there is no 𝖥𝖯𝖱𝖠𝖲 to estimate the partition function unless 𝖭𝖯=𝖱𝖯. The algorithmic results follow from the correlation decay approach due to Weitz (2006) or the polynomial interpolation approach developed by Barvinok (2016). However the running time is only polynomial for constant Δ. For the hardcore model, recent work of Anari et al. (2020) establishes rapid mixing of the simple single-site Markov chain known as the Glauber dynamics in the tree uniqueness region. Our work simplifies their analysis of the Glauber dynamics by considering the total pairwise influence of a fixed vertex v on other vertices, as opposed to the total influence of other vertices on v, thereby extending their work to all 2-spin models and improving the mixing time.

More importantly our proof ties together the three disparate algorithmic approaches: we show that contraction of the so-called tree recursions with a suitable potential function, which is the primary technique for establishing efficiency of Weitz’s correlation decay approach and Barvinok’s polynomial interpolation approach, also establishes rapid mixing of the Glauber dynamics. We emphasize that this connection holds for all 2-spin models (both antiferromagnetic and ferromagnetic), and existing proofs for the correlation decay or polynomial interpolation approach immediately imply rapid mixing of the Glauber dynamics. Our proof utilizes that the graph partition function is a divisor of the partition function for Weitz’s self-avoiding walk tree. This fact leads to new tools for the analysis of the influence of vertices, and may be of independent interest for the study of complex zeros.

1 Introduction

A remarkable connection has been established between the computational complexity of approximate counting problems in general graphs of maximum degree Δ and the statistical physics phase transition on infinite, regular trees of degree Δ (or up to Δ in the more general case). This connection holds for 2-state antiferromagnetic spin systems – the hardcore model on independent sets and the Ising model are the most interesting examples of such systems.

Given an n-vertex graph G=(V,E), configurations of the 2-spin model are the 2n assignments of spins 0,1 to the vertices. A 2-spin system is defined by three parameters: edge weights β,γ>0 and a vertex weight λ>0. Edge parameter β controls the (relative) strength of interaction between neighboring 1-spins, γ corresponds to neighboring 0-spins, and λ is the external field applied to vertices with 1-spins.

Every spin configuration σ{0,1}V is assigned a weight

wG(σ)=βm1(σ)γm0(σ)λn1(σ),

where, for spin s{0,1}, ms(σ)=#{uvE:σu=σv=s} is the number of monochromatic edges with spin s, and n1(σ)=#{vV:σv=1} is the number of vertices with spin 1 (as is standard, the parameters are normalized so we can avoid two additional parameters). The Gibbs distribution over spin configurations is given by μG(σ)=wG(σ)ZG(β,γ,λ), where ZG(β,γ,λ)=σ{0,1}Vβm1(σ)γm0(σ)λn1(σ) is the partition function.

There are two examples of particular interest: the hardcore model and the Ising model. When β=0 and γ=1 then the only configurations with non-zero weight are independent sets of G and the weight of an independent set σ is w(σ)=λ|σ|; this example is known as the hardcore model where the parameter λ corresponds to the fugacity.

In the case β=γ then the important quantity is the total number of monochromatic edges m(σ)=m0(σ)+m1(σ) and the weight of a configuration σ is w(σ)=βm(σ)λn1(σ); this is the classical Ising model where the parameter β corresponds to the inverse temperature and λ is the external field (λ=1 means no external field). Note, when β>1 then the model is ferromagnetic as neighboring vertices prefer to have the same spin, and β<1 is the antiferromagnetic Ising model. In the general 2-spin system, the model is ferromagnetic when βγ>1 and antiferromagnetic when βγ<1. (When βγ=1 we get a trivial product distribution.)

The fundamental algorithmic tasks are to sample from the Gibbs distribution and to estimate the partition function. For the approximate sampling problem we are given a graph G and an ϵ>0 and our goal is to generate a sample from a distribution π which is within total variation distance ϵ of the Gibbs distribution μG in time poly(n,log(1/ϵ)). An efficient approximate sampling algorithm implies an 𝖥𝖯𝖱𝖠𝖲 (fully-polynomial randomized approximation scheme) for the approximate counting problem [JVV86, ŠVV09]. Recall, given an n-vertex graph G, and ϵ,δ>0, an 𝖥𝖯𝖱𝖠𝖲 outputs a (1±ϵ)-approximation of ZG with probability 1δ in time poly(n,1/ϵ,log(1/δ)), whereas an 𝖥𝖯𝖳𝖠𝖲 is the deterministic analog (i.e., δ=0).

A standard approach to the approximate sampling problem is the Markov Chain Monte Carlo (MCMC) method; in fact there is a simple Markov chain known as the Glauber dynamics. The Glauber dynamics works as follows: from a configuration Xt at time t, choose a random vertex v, we then set Xt+1(w)=Xt(w) for all wv, and finally we choose Xt+1(v) from the conditional distribution of μ(σv|σw=Xt+1(w) for all wv). For the case of the hardcore model, then Xt+1(v) is set to occupied (i.e., spin 1) with probability λ/(1+λ) if no neighbors are currently occupied, and otherwise it is set to unoccupied.

It is straightforward to verify that the Glauber dynamics is ergodic with the Gibbs distribution as the unique stationary distribution. The mixing time is the minimum number of steps to guarantee, from the worst initial state X0, that the distribution of Xt is within total variation distance 1/4 of the Gibbs distribution. The goal is to prove that the mixing time is polynomial in n, in which case the chain is said to be rapidly mixing.

For the case of the ferromagnetic Ising model (with or without an external field), a classical result of Jerrum and Sinclair [JS93] gives an 𝖥𝖯𝖱𝖠𝖲 for all graphs via the MCMC method. This is the only case with an efficient algorithm for general graphs. For antiferromagnetic 2-spin models the picture is closely tied to statistical physics phase transitions on the regular tree.

The uniqueness/non-uniqueness phase transition is nicely illustrated for the case of the hardcore model. Consider the infinite Δ-regular tree T rooted at r, and let Th denote the tree truncated at the first h levels. This phase transition captures whether the configuration at the leaves of Th “influences” the root, in the limit h. For the hardcore model we can consider even height trees (corresponding to the all even boundary condition) versus odd height trees. Let ph denote the marginal probability that the root is occupied in the Gibbs distribution μTh. Let p𝖾𝗏𝖾𝗇=limhp2h and p𝗈𝖽𝖽=limhp2h+1. We say that tree uniqueness holds if p𝖾𝗏𝖾𝗇=p𝗈𝖽𝖽 and tree non-uniqueness holds if they are not equal. For all Δ3 there exists a critical fugacity λc(Δ)=(Δ1)Δ1/(Δ2)Δ) [Kel85], where tree uniqueness holds iff λλc(Δ).

The remarkable connection is that an algorithmic phase transition for general graphs of maximum degree Δ occurs at this same tree critical point. For all constant Δ, all δ>0, all λ<(1δ)λc(Δ), all graphs of maximum degree Δ, [Wei06] presented an 𝖥𝖯𝖳𝖠𝖲 for approximating the partition function. On the other side, for all δ>0, all λ>(1+δ)λc(Δ), [Sly10, SS14, GŠV16] proved that, unless 𝖭𝖯=𝖱𝖯, there is no 𝖥𝖯𝖱𝖠𝖲 for estimating the partition function.

One important caveat is that the running time of Weitz’s algorithm is (n/ϵ)ClogΔ where the approximation factor is (1±ϵ) and the constant C depends polynomially on the gap δ (recall, λ<(1δ)λc). Weitz’s correlation decay algorithm was extended to the antiferromagnetic Ising model in the tree uniqueness region by Sinclair et al. [SST14], and to all antiferromagnetic 2-spin systems in the corresponding tree uniqueness region (as we detail below) by Li, Lu, and Yin [LLY13].

An intriguing new algorithmic approach was presented by Barvinok [Bar16] and refined by Patel and Regts [PR17], utilizing the absence of zeros of the partition function in the complex plane to efficiently approximate a suitable transformation of the logarithm of the partition function using Taylor approximation. This polynomial interpolation approach was shown to be efficient in the same tree uniqueness region as for Weitz’s result by Peters and Regts [PR19], although the exponent in the running time depends exponentially on Δ.

It was long conjectured that the simple Glauber dynamics is rapidly mixing in the tree uniqueness region. This was recently proved by Anari, Liu, and Oveis Gharan [ALO20]; they proved, for all δ>0, the mixing time is nO(exp(1/δ)) whenever λ<(1δ)λc(Δ). We improve this result. First, we improve the mixing time from nO(exp(1/δ)) to nO(1/δ) as detailed in the following theorem.

Theorem 1 (Hardcore model).

Let Δ3 be an integer and δ(0,1). For every n-vertex graph G of maximum degree Δ and every 0<λ(1δ)λc(Δ), the mixing time of the Glauber dynamics for the hardcore model on G with fugacity λ is O(n2+32/δ).

This bound is optimal barring further improvements in the local-to-global arguments from [AL20]. Our improved result follows from a simpler, cleaner proof approach which enables us to extend our result to a wide variety of 2-spin models, matching the key results for the correlation decay algorithm with vastly improved running times.

Our proof approach unifies the three major algorithmic tools for approximate counting: correlation decay, polynomial interpolation, and MCMC. Most known results for both correlation decay and polynomial interpolation approach are proved by showing contraction of a suitably defined potential function on the so-called tree recursions; the tree recursions arise as a result of Weitz’s self-avoiding walk tree that we will describe in more detail later in this paper. A recent work of Shao and Sun [SS20] unifies these two approaches by showing that the contraction which is normally used to prove efficiency of the correlation decay algorithm, also implies (under some additional analytic conditions) that the polynomial interpolation approach is efficient.

Here we prove that this same contraction of a potential function also implies rapid mixing of the Glauber dynamics, with our improved running time that is independent of Δ; see 4 and 5 for a detailed statement. Our proof utilizes several new tools concerning Weitz’s self-avoiding walk tree, which are detailed in Section 3. In particular, we show that the partition function of a graph G divides the partition function of Weitz’s self-avoiding walk tree; see 8. This result is potentially of independent interest for establishing absence of zeros for the partition function with complex parameters, as it enables one to consider the self-avoiding walk tree. This result also yields a new, useful equivalence for bounding the influence in a graph in terms of the self-avoiding tree, which strengthens the previously known connection by Weitz [Wei06]; see 8 for details.

As an easy consequence we obtain rapid mixing for the Glauber dynamics for the antiferromagnetic Ising model in the tree uniqueness region. In terms of the edge activity, the two critical points for the Ising model on the Δ-regular tree are at βc(Δ)=Δ2Δ and β¯c(Δ)=1βc(Δ)=ΔΔ2; the first lies in the antiferromagnetic regime, while the second lies in the ferromagnetic regime. If βc(Δ)<β<β¯c(Δ), then uniqueness holds for all external field λ on the Δ-regular tree.

As mentioned earlier, for the ferromagnetic Ising model, an 𝖥𝖯𝖱𝖠𝖲 was known for general graphs [JS93]. Furthermore, Mossel and Sly [MS13] proved O(nlogn) mixing time of the Glauber dynamics for the ferromagnetic Ising model when 1β<β¯c(Δ). However, rapid mixing for the antiferromagnetic Ising model in the tree uniqueness region was not known.

We provide the following mixing result for the case β>βc(Δ). Note, when ββc there is an additional uniqueness region for certain values of the external field λ; this region is covered by 3.

Theorem 2 (Antiferromagnetic Ising Model).

Let Δ3 be an integer and δ(0,1). Assume that 1>ββc(Δ)+δ(1βc(Δ)) and λ>0. Then for every n-vertex graph G of maximum degree Δ, the mixing time of the Glauber dynamics for the Ising model on G with edge weight β and external field λ is O(n2+1.5/δ).

Our results for the hardcore and Ising models fit within a larger framework of general antiferromagnetic 2-spin systems. Recall that the antiferromagnetic case is when βγ<1.

For general 2-spin systems the appropriate tree phase transition is more complicated as there are models where the tree uniqueness threshold is not monotone in Δ. Hence the appropriate notion is “up-to-Δ uniqueness” as considered by [LLY13]. Roughly speaking, we say uniqueness with gap δ(0,1) holds on the d-regular tree if for every integer 1, all vertices at distance from the root have total “influence” (1δ) on the marginal of the root. We say up-to-Δ uniqueness with gap δ holds if uniqueness with gap δ holds on the d-regular tree for all 1dΔ; see Section 2 for the precise definition.

Both 1 and 2 are corollaries of the following general rapid mixing result which holds for general antiferromagnetic 2-spin systems in the entire tree uniqueness region.

Theorem 3 (General antiferromagnetic 2-spin system).

Let Δ3 be an integer and δ(0,1). Let β,γ,λ be reals such that 0βγ, γ>0, βγ<1 and λ>0. Assume that the parameters (β,γ,λ) are up-to-Δ unique with gap δ. Then for every n-vertex graph G of maximum degree Δ, the mixing time of the Glauber dynamics for the antiferromagnetic 2-spin system on G with parameters (β,γ,λ) is O(n2+72/δ).

We also match existing correlation decay results [GL18, SS20] for ferromagnetic 2-spin models; see Section 8 for results, and Appendix F for proofs.

1.1 Mixing by the potential method

The tree recursion is very useful in the study of approximating counting. Consider a tree rooted at r. Suppose that r has d children, denoted by v1,,vd. For 1iΔi we define Tvi to be the subtree of T rooted at vi that contains all descendant of vi. Let Rr=μT(σr = 1)/μT(σr = 0) denote the marginal ratio of the root, and Rvi=μTvi(σvi = 1)/μTvi(σvi = 0) for each subtree. The tree recursion is a formula that computes Rr given Rv1,,Rvd, due to the independence of Tvi’s. More specifically, we can write Rr=Fd(Rv1,,Rvd) where Fd:[0,+]d[0,+] is a multivariate function such that for (x1,,xd)[0,+]d,

Fd(x1,,xd)=λi=1dβxi+1xi+γ.

In this paper, however, we pay particular interest in the log of marginal ratios. The reason is that we will carefully study the pairwise influence matrix G of the Gibbs distribution μG, introduced in [ALO20] and defined as for every r,vV

G(r v)=μG(σv = 1σr = 1)μG(σv = 1σr = 0).

In [ALO20], the authors show that if the maximum eigenvalue of G is bounded appropriately, then the Glauber dynamics is rapid mixing. One crucial observation we make in this paper is that the influence G(r v) of r on v can be viewed as the derivative of logRr with respect to the log external field at v (see 12). Thus, it is more convenient for us to work with the log ratios. To this end, we rewrite the tree recursion as logRv=Hd(logRv1,,logRvd) where Hd:[,+]d[,+] is a function such that for (y1,,yd)[,+]d,

Hd(y1,,yd)=logλ+i=1dlog(βeyi+1eyi+γ).

Observe that H=logFexp. Moreover, we define

h(y)=(1βγ)ey(βey+1)(ey+γ)

for y[,+], so that yiHd(y1,,yd)=h(yi) for each i.

To prove our main results, we use the potential method, which has been widely used to establish the decay of correlation. By choosing a suitable potential function for the log ratios, we show that the total influence from a given vertex decays exponentially with the distance, and thus establish rapid mixing of the Glauber dynamics. Let us first specify our requirements on the potential. For every integer d0, we define a bounded interval Jd which contains all log ratios at a vertex of degree d. More specifically, we let Jd=[log(λβd),log(λ/γd)] when βγ<1, and Jd=[log(λ/γd),log(λβd)] when βγ>1. Furthermore, define J=d=0Δ1Jd to be the interval containing all log ratios with degree less than Δ.

Definition 4 ((α,c)-Potential function).

Let Δ3 be an integer. Let β,γ,λ be reals such that 0βγ, γ>0 and λ>0. Let Ψ:[,+](,+) be a differentiable and increasing function with image S=Ψ[,+] and derivative ψ=Ψ. For any α(0,1) and c>0, we say Ψ is an (α,c)-potential function with respect to Δ and (β,γ,λ) if it satisfies the following conditions:

  1. 1.

    (Contraction) For every integer d such that 1d<Δ and every (y~1,,y~d)Sd, we have

    HdΨ(y~1,,y~d)1=i=1dψ(y)ψ(yi)|h(yi)|1α

    where HdΨ=ΨHdΨ1, yi=Ψ1(y~i) for 1id, and y=Hd(y1,,yd).

  2. 2.

    (Boundedness) For every y1,y2J, we have

    ψ(y2)ψ(y1)|h(y1)|cΔ.

In the definition of (α,c)-potential, one should think of y as the log marginal ratio at a vertex and the potential function is of logR. The following theorem establishes rapid mixing of the Glauber dynamics given an (α,c)-potential function.

Theorem 5.

Let Δ3 be an integer. Let β,γ,λ be reals such that 0βγ, γ>0 and λ>0. Suppose that there is an (α,c)-potential with respect to Δ and (β,γ,λ) for some α(0,1) and c>0. Then for every n-vertex graph G of maximum degree Δ, the mixing time of the Glauber dynamics for the 2-spin system on G with parameters (β,γ,λ) is O(n2+c/α).

We outline our proofs in Section 3. Note that in both 4 and 5, the constant c is allowed to depend on the maximum degree Δ and parameters (β,γ,λ) in general. For example, a straightforward black-box application of the potential in [LLY13] would give c=Θ(Δ) for the Boundedness condition, resulting in nΘ(Δ) mixing. However, this is undesirable for graphs with potentially unbounded degrees. One of our contributions is that we show the Boundedness condition holds for a universal constant c independent of Δ and (β,γ,λ). Thus, our mixing time is O(n2+c/δ) with no parameters in the exponent except for 1/δ.

In Section 7, we give a slightly more general definition of (α,c)-potentials, which relaxes the Boundedness condition, and is necessary for our analysis of antiferromagnetic 2-spin systems with 0β<1<γ. 5 still holds for this larger class of potentials.

We remark that in all previous works of the potential method, results and proofs are always presented in terms of Fd, the tree recursion of R, and Φ, a potential function of R. In fact, our results can also be translated into the language of (Fd,Φ). To see this, since Hd=logFdexp, it is straightforward to check that HdΨ=ΨHdΨ1=ΦFdΦ1=FdΦ if we pick Φ=Ψlog, and thereby HdΨ=FdΦ. This implies that the Contraction condition in 4 holds for (Hd,Ψ) if and only if the corresponding contraction condition holds for (Fd,Φ). The Boundedness condition can also be stated equivalently for (Fd,Φ). Nevertheless, in this paper we choose to work with (Hd,Ψ) for the following two reasons. First, as mentioned earlier, the fact that G(r v) is a derivative of logRr makes it natural to consider the tree recursion for the log ratios. Indeed, it is easier and cleaner to present our results and proofs using (Hd,Ψ) directly rather than switching to (Fd,Φ). Second, the potential function Ψ we will use is obtained from the exact potential Φ in [LLY13], by the transformation Ψ=Φexp.111To be more precise, we also multiply a constant factor which only simplifies our calculation and does not matter much; also notice that [LLY13] denotes the potential function by φ and its derivative by Φ=φ. It is intriguing to notice that the derivative of this potential is simply ψ=|h|. Then the Contraction condition has a nice form: i=1dh(y)h(yi)1α; and the Boundedness condition only involves an upper bound on h(y). This seems to shed some light on the mysterious potential function Φ from [LLY13], and also indicates that Hd is a meaningful variant of the tree recursion to consider. To add one more evidence, for a lot of cases (e.g., Δ2Δ<βγ<ΔΔ2) where the potential Φ=log is picked, that just means we can pick Ψ to be the identity function and Hd itself is contracting without any nontrivial potential.

Revision in July 2021.

After the publication of this paper in FOCS 2020, a small error was found in [LLY13] regarding descriptions of the uniqueness region for antiferromagnetic 2-spin systems. The error was fixed in the latest version of [LLY13]. In this revision, we update corresponding results and proofs in Section 7 and Appendix E that rely on the changes in [LLY13]; in particular, 36 is adjusted in accordance with the current description of uniqueness regions. We remark that these changes are purely technical and do not affect the validity of our main results like 5.

Acknowledgments.

We would like to thank Shayan Oveis Gharan and Nima Anari for stimulating discussions. We also thank the anonymous referees for helpful comments and suggestions. We are grateful to Yitong Yin for communicating with us about the latest update of [LLY13] and for providing helpful instructions on modifying statements and proofs of results in Appendix E, particularly 36.

2 Preliminaries

Mixing time and spectral gap

Let P be the transition matrix of an ergodic (i.e., irreducible and aperiodic) Markov chain on a finite state space Ω with stationary distribution μ. Let Pt(x0,) denote the distribution of the chain after t steps starting from x0Ω. The mixing time of P is defined as

Tmix(P)=maxx0Ωmin{t0:Pt(x0,)μ()TV14}.

We say P is reversible if μ(x)P(x,y)=μ(y)P(y,x) for all x,yΩ. If P is reversible, then P has only real eigenvalues which can be denoted by 1=λ1λ|Ω|1. The spectral gap of P is defined to be 1λ2 and the absolute spectral gap of P is defined as λ(P)=1max{|λ2|,|λ|Ω||}. If P is also positive semidefinite with respect to the inner product ,μ, then all eigenvalues of P are nonnegative and thus λ(P)=1λ2. Finally, the mixing time and the absolute spectral gap are related by

Tmix(P)1λ(P)log(4minxΩμ(x)). (1)

Uniqueness

Let Δ3 be an integer or Δ=. Let β,γ,λ be reals such that 0βγ, γ>0, βγ<1 and λ>0. For 1d<Δ, define

fd(R)=λ(βR+1R+γ)d

and denote the unique fixed point of fd by Rd. For δ(0,1), we say the parameters (β,γ,λ) are up-to-Δ unique with gap δ if |fd(Rd)|<1δ for all 1d<Δ.

Ratio and influence

Consider the 2-spin system on a graph G=(V,E). Let ΛV and σΛ{0,1}Λ. For all vV\Λ, we define the marginal ratio at v to be

RGσΛ(v)=μG(σv = 1σΛ)μG(σv = 0σΛ).

For all u,vV\Λ, we define the (pairwise) influence of u on v by

GσΛ(u v)=μG(σv = 1σu = 1,σΛ)μG(σv = 1σu = 0,σΛ).

Write GσΛ for the (pairwise) influence matrix whose entries are given by GσΛ(u v).

Weitz’s self-avoiding walk tree

Let G=(V,E) be a connected graph and rV be a vertex of G. The self-avoiding walk (SAW) tree is defined as follows. Suppose that there is a total ordering of the vertex set V. A self-avoiding walk from r is a path r=v0v1v such that vivj for all 0i<j. The SAW tree Tsaw(G,r) is a tree rooted at r, consisting of all self-avoiding walks r=v0v1v with deg(v)=1, and those appended with one more vertex that closes the cycle (i.e., r=v0v1vvi for some 0i2 such that {v,vi}E). Note that a vertex of G might have many copies in the SAW tree, and the degrees of vertices are preserved except for leaves. See Fig. 1 for an example.

We can define a 2-spin system on Tsaw(G,r) with the same parameters (β,γ,λ), in which some of the leaves are fixed to a particular spin. More specifically, for a self-avoiding walk r=v0v1v appended with vi, we fix vi to be spin 1 if vi+1<v with respect to the total ordering on V, and spin 0 if vi+1>v. For each vV we denote the set of all free (unfixed) copies of v in Tsaw(G,r) by 𝒞v. For ΛV and a partial configuration σΛ{0,1}Λ, we define the SAW tree with conditioning σΛ by assigning the spin σv to every copy v^ of v from 𝒞v and removing all descendants of v^, for each vΛ. Note that in general, different copies of v from 𝒞v can receive different spin assignments. Finally, in the case that every vertex v has a distinct field λv, all copies of v from 𝒞v will have the same field λv in the SAW tree.

Refer to caption
Figure 1: A graph G and the self-avoiding walk tree Tsaw(G,r) rooted at r. Vertices with the same label in Tsaw(G,r) are copies of the same vertex from G. (/: fixed to spin 1/0.)

3 Proof outline for main results

Step 1 ([ALO20]): Spectral Independence implies rapid mixing.

Our proof builds on [ALO20] who showed that the Glauber dynamics for sampling from the hardcore distribution on graphs of maximum degree at most Δ mixes in O(nexp(O(1/δ))) steps whenever λ(1δ)λc(Δ). One of the key ingredients of their proof is a notion they call spectral independence. [ALO20] shows that the spectral independence property implies rapid mixing. Note that the diagonal entries of GσΛ are 1, as opposed to 0 in the original definition in [ALO20].

Definition 6 (Spectral Independence [ALO20]).

We say the Gibbs distribution μG on an n-vertex graph G is (η0,,ηn2)-spectrally independent, if for every 0kn2, ΛV of size k and σΛ{0,1}Λ, one has λmax(GσΛ)1ηk.

Theorem 7 ([ALO20]).

If μ is an (η0,,ηn2)-spectrally independent distribution, then the Glauber dynamics for sampling from μ has spectral gap at least

1ni=0n2(1ηini1).

Our primary goal now is to bound the maximum eigenvalue of GσΛ.

Step 2: Self-avoiding walk trees preserve influences.

From standard linear algebra, we know that the maximum eigenvalue of GσΛ is upper bounded by both the 1-norm GσΛ1=maxrVvV|GσΛ(v r)|, which corresponds to total influences on a vertex r, and the infinity-norm GσΛ=maxrVvV|GσΛ(r v)|, corresponding to total influences of r. In [ALO20] the authors use GσΛ1 as an upper bound on λmax(GσΛ). Roughly speaking, they show that the sum of absolute influences on a fixed vertex r, is upper bounded by the maximum absolute influences on r in the self-avoiding walk tree rooted at r, over all boundary conditions. Here in this paper, we will use GσΛ to upper bound λmax(GσΛ) instead. In fact, much more is true if we look at the influences from r in the self-avoiding tree. We show that for every vertex vV, the influence GσΛ(r v) in G is preserved in the self-avoiding walk tree T=Tsaw(G,r) rooted at r, in the form of sum of influences TσΛ(r v^) over all copies v^ of v.

The way we establish this fact is by viewing the partition function as a polynomial in λ. In fact, it will be useful to consider the more general case with an arbitrary external field λv for every vV. Let 𝝀={λv:vV} denote the fields. For ΛV and σΛ{0,1}Λ, the weight of σ{0,1}V\Λ conditional on σΛ is defined to be wG(σσΛ)=βm1(σσΛ)γm0(σσΛ)vV\Λλvσv where mi(σΛ) is the number of i-i edges with at least one endpoint in V\Λ for i=0,1. Furthermore, ZGσΛ=σ{0,1}V\ΛwG(σσΛ) is the partition function conditioned on σΛ. We shall view β and γ as some fixed constants and think of 𝝀 as n=|V| variables. In this sense, we regard the weights wG(σσΛ) as monomials in 𝝀 and the partition function ZGσΛ as a polynomial in 𝝀. Moreover, the marginal ratios RGσΛ(v) and the influences GσΛ(r v) for r,vV are all functions in 𝝀. Our main result is that the partition function of G divides that of Tsaw(G,r) for each rV. From that, we show that the SAW tree preserves influences of the root, as well as re-establishing Weitz’s celebrated result [Wei06], see 13.

Lemma 8.

Let G=(V,E) be a connected graph, rV be a vertex and ΛV\{r} such that G\Λ is connected. Let T=Tsaw(G,r) be the self-avoiding walk tree of G rooted at r. Then for every σΛ{0,1}Λ, ZGσΛ divides ZTσΛ. More precisely, there exists a polynomial PG,rσΛ=PG,rσΛ(𝛌) independent of λr such that

ZTσΛ=ZGσΛPG,rσΛ. (2)

As a corollary, for each vertex vV,

GσΛ(r v)=v^𝒞vTσΛ(r v^), (3)

where 𝒞v is the set of all free (unfixed) copies of v in T.

Remark 1.

We emphasize that for the purposes of bounding the total influence of a vertex in G, only Eq. 3 of 8 is needed, which can be proved in a purely combinatorial fashion. However, we believe the divisibility property Eq. 2 of the multivariate partition function of G and its self-avoiding walk tree may be of independent interest.

We note that a univariate version of the divisibility statement Eq. 2 has already appeared in [Ben18] for the hardcore model and [LSS19] for the zero-field Ising model in the study of complex roots of the partition function. From 8, we can get vV|GσΛ(r v)|vVT|TσΛ(r v)| for any fixed r. That means, we only need to upper bound the sum of all influences for trees, in order to get an upper bound on λmax(GσΛ).

Step 3: Decay of influences given a good potential.

The tree recursion provides us a great tool for computing the (log) ratios of vertices recursively for trees. As we show in 12, the influence GσΛ(r v) is in fact a version of derivative of the log marginal ratio at r. Thus, the tree recursion can be used naturally to relate these influences. We then apply the potential method, which has been widely used in literature to establish the decay of correlations (strong spatial mixing). The following lemma shows that the sum of absolute influences to distance k has exponential decay with k, which can be thought of as the decay of pairwise influences.

Lemma 9.

If there exists an (α,c)-potential function Ψ with respect to Δ and (β,γ,λ) where α(0,1) and c>0, then for every ΛVT\{r}, σΛ{0,1}Λ and all integers k1,

vLr(k)|TσΛ(r v)|c(1α)k1

where Lr(k) denote the set of all free vertices at distance k away from r.

5 is then proved by combining 7, 8 and 9. We leave its proof to Appendix A.

Step 4: Find a good potential.

As our final step, we need to find an (α,c)-potential function as defined in 4. The potential Ψ we choose is exactly the one from [LLY13], adapted to the log marginal ratios and the tree recursion H (see Section 6 for more details). We show that if the parameters (β,γ,λ) are up-to-Δ unique with gap δ(0,1) and either βγ>Δ2Δ or γ1, then Ψ is an (α,c)-potential.

Lemma 10.

Let Δ3 be an integer. Let β,γ,λ be reals such that 0βγ, γ>0, βγ<1 and λ>0. Assume that (β,γ,λ) is up-to-Δ unique with gap δ(0,1). Define the function Ψ implicitly by

Ψ(y)=ψ(y)=(1βγ)ey(βey+1)(ey+γ)=|h(y)|,Ψ(0)=0. (4)

If βγ>Δ2Δ, then Ψ is an (α,c)-potential function with αδ/2 and c1.5. If βγΔ2Δ and γ1, then Ψ is an (α,c)-potential with αδ/2 and c18; we can further take c4 if β=0.

We deduce 3 for the case βγ>Δ2Δ or γ1 from 5 and 10. The proof of it can be found in Appendix A. The case that βγΔ2Δ and γ>1 is trickier. As discussed in Section 5 of [LLY13], when βγΔ2Δ and γ>1, for some λ>0 the spin system lies in the uniqueness region for arbitrary graphs, even with unbounded degrees (i.e., up-to- unique). Thus, in this case the total influences of a vertex can be as large as Θ(Δ/δ), resulting in nΘ(Δ/δ) mixing time. To deal with this, we consider a suitably weighted sum of absolute influences of a fixed vertex, which also upper bounds the maximum eigenvalue of the influence matrix. 4 and 5 are then modified to a slightly stronger version. The statements and proofs for this case are presented in Section 7 and Appendix D.

The rest of the paper is organized as follows. In Section 4 we prove 8 about properties of the SAW tree. In Section 5 we establish 9 regarding the decay of influences by the potential method. We verify the Contraction condition in Section 6 for our choice of potential. Section 7 is devoted to the case that βγΔ2Δ and γ>1, where a more general version of 4 and 5 is required; missing proofs can be found in Appendix D. In Appendix E we verify the Boundedness condition and its generalization for our potential in all cases. We consider ferromagnetic spin systems in Section 8 and the proofs are left to Appendix F. We prove all of our main results in Appendix A.

4 Preservation of influences for self-avoiding walk trees

In this section we show that the self-avoiding walk (SAW) tree, introduced in [Wei06] (see also [SS05]), maintains all the influence of the root, and thus establishes 8. To do this, we show that the partition function of G, viewed as a polynomial of the external fields 𝝀, divides that of the SAW tree. From there we prove that the influence of the root vertex r on another vertex v in G, is exactly equal to that on all copies of v in the SAW tree. Using our proof approach, we show that the marginal of the root is maintained in the SAW tree, re-establishing Weitz’s celebrated result [Wei06], and also all pairwise covariances concerned with v are preserved.

Theorem 11.

Let G=(V,E) be a connected graph, rV be a vertex and ΛV\{r} such that G\Λ is connected. Let T=Tsaw(G,r) be the self-avoiding walk tree of G rooted at r. Then for every σΛ{0,1}Λ, ZGσΛ divides ZTσΛ. More precisely, there exists a polynomial PG,rσΛ=PG,rσΛ(𝛌) such that

ZTσΛ=ZGσΛPG,rσΛ.

Moreover, the polynomial PG,rσΛ is independent of λr.

Remark 2.

The proof of 11 can be adapted to give a purely combinatorial proof of Eq. 3 in 8. Like in the proof of [Wei06, Theorem 3.1], one can proceed via vertex splitting and telescoping, where instead of telescoping a product of marginal ratios, one instead telescopes a sum of single-vertex influences.

We remark that [Ben18] proved a univariate version of 11 for the hardcore model, and [LSS19] showed a similar result for the zero-field Ising model with a uniform edge weight. Our result holds for all 2-spin systems and arbitrary fields for each vertex. We can also generalize it to arbitrary edge weights for each edge in a straightforward fashion. It is crucial that the quotient polynomial PG,rσΛ is independent of the field λr at the root, from which we can deduce the preservation of marginal and influences of the root immediately.

Before proving 11, we first give a few consequences of it. For all u,vV\Λ, we define the marginal at v as MGσΛ(v)=μG(v = 1σΛ) (henceforth we write v=i for the event σv=i for convenience), and the covariance of u and v as

KGσΛ(u,v)=μG(u = v = 1σΛ)μG(u = 1σΛ)μG(v = 1σΛ).

The following lemma relates the quantities we are interested in with appropriate derivatives of the (log) partition function. Parts 1 and 2 of the lemma are folklore.

Lemma 12.

For every graph G=(V,E), ΛV and σΛ{0,1}Λ, the following holds:

  1. 1.

    For all vV,

    (λvλv)logZGσΛ=MGσΛ(v);
  2. 2.

    For all u,vV,

    (λvλv)(λuλu)logZGσΛ=(λvλv)MGσΛ(u)=KGσΛ(u,v);
  3. 3.

    For all u,vV,

    (λvλv)logRGσΛ(u)=GσΛ(u v).
Proof.

The first two parts are standard. We include the proofs of these two facts in Appendix B for completeness. For Part 3, we deduce from Part 2 that

(λvλv)logRGσΛ(u)=(λvλv)log(MGσΛ(u)1MGσΛ(u))=(λvλv)MGσΛ(u)MGσΛ(u)(1MGσΛ(u))=KGσΛ(u,v)KGσΛ(u,u).

It remains to show that

GσΛ(u v)=KGσΛ(u,v)KGσΛ(u,u),

which actually holds for any two binary random variables. To see this, we first compute KGσΛ(u,u)GσΛ(u v) by definition:

KGσΛ(u,u)GσΛ(u v)
= μG(u = 1σΛ)μG(u = 0σΛ)[μG(v = 1u = 1,σΛ)μG(v = 1u = 0,σΛ)]
= μG(u = 1,v = 1σΛ)μG(u = 0σΛ)μG(u = 1σΛ)μG(u = 0,v = 1σΛ)
= μG(u = 1,v = 1σΛ)μG(u = 0,v = 0σΛ)μG(u = 1,v = 0σΛ)μG(u = 0,v = 1σΛ).

Meanwhile, the covariance can be written as

KGσΛ(u,v) =μG(u = 1,v = 1σΛ)μG(u = 1σΛ)μG(v = 1σΛ)
=μG(u = 1,v = 1σΛ)μG(u = 0,v = 0σΛ)μG(u = 1,v = 0σΛ)μG(u = 0,v = 1σΛ).

This shows that GσΛ(u v)=KGσΛ(u,v)/KGσΛ(u,u) and thus establishes Part 3.

We deduce 8 from 11 and the second item of the following lemma. The proof of 11 is presented in Section 4.1.

Lemma 13.

Let G=(V,E) be a connected graph, rV be a vertex and ΛV\{r} such that G\Λ is connected. Let T=Tsaw(G,r) be the self-avoiding walk tree of G rooted at r. Then for every σΛ{0,1}Λ we have:

  1. 1.

    ([Wei06, Theorem 3.1]) Preservation of marginal of the root r:

    MGσΛ(r)=MTσΛ(r)andRGσΛ(r)=RTσΛ(r);
  2. 2.

    Preservation of covariances and influences of r: for every vV,

    KGσΛ(r,v)=v^𝒞vKTσΛ(r,v^)andGσΛ(r v)=v^𝒞vTσΛ(r v^).

    where 𝒞v is the set of all free (unfixed) copies of v in T.

Proof.

By 11, there exists a polynomial PG,rσΛ=PG,rσΛ(𝝀) such that ZTσΛ=ZGσΛPG,rσΛ and PG,rσΛ is independent of λr. Then it follows from 12 that

MTσΛ(r)=(λrλr)logZTσΛ=(λrλr)(logZGσΛ+logPG,rσΛ)=(λrλr)logZGσΛ=MGσΛ(r),

and therefore RTσΛ(r)=RGσΛ(r). For the second item, again from 12 we get

KGσΛ(r,v)=(λvλv)MGσΛ(r)=(λvλv)MTσΛ(r).

Recall that for the spin system on the SAW tree T, every free copy v^ of v from 𝒞v has the same external field λv^=λv. Then, by the chain rule of derivatives and 12, we deduce that

KGσΛ(r,v)=v^𝒞v(λv^λv^)MTσΛ(r)λv^λvλvλv^=v^𝒞vKTσΛ(r,v^).

Finally, we have

GσΛ(r v)=(λvλv)logRGσΛ(r)=(λvλv)logRTσΛ(r)=v^𝒞vTσΛ(r v^),

where the last equality follows as above.

4.1 Proof of 11

Before presenting our proof, let us first review the notations and definitions introduced earlier. Denote the set of fields at all vertices by 𝝀={λv:vV}. For ΛV and σΛ{0,1}Λ, the weight of σ{0,1}V\Λ conditional on σΛ is given by

wG(σσΛ)=βm1(σσΛ)γm0(σσΛ)vV\Λλvσv,

where for i=0,1, mi(σΛ) denotes the number of edges such that both endpoints receive the spin i and at least one of them is in V\Λ. The partition function conditional on σΛ is defined as ZGσΛ=σ{0,1}V\ΛwG(σσΛ). For the SAW tree, we define the conditional weights and partition function in the same way. In particular, recall that when we fix a conditioning σΛ on the SAW tree, we also remove all descendants of v^𝒞v for each vΛ.

For every vV\Λ and i{0,1}, we shall write v=i to represent the set of configurations such that σv=i (i.e., {σ{0,1}V\Λ:σv=i}) and let ZGσΛ(v = i) be sum of weights of all configurations with v = i. We further extend this notation and write ZGσΛ(U = σU) for every UV\Λ and σU{0,1}U. For the SAW tree we adopt the same notations as well.

Proof of 11.

We will show that there exists a polynomial PG,rσΛ=PG,rσΛ(𝝀), independent of λr, such that

ZTσΛ(r = 1)=ZGσΛ(r = 1)PG,rσΛandZTσΛ(r = 0)=ZGσΛ(r = 0)PG,rσΛ. (5)

The high-level proof idea of Eq. 5 is similar to the corresponding result in [Wei06, Theorem 3.1]. Let m be the number of edges with at least one endpoint in V\Λ. We use induction on m. When m=0 the statement is trivial since T=G. Assume that Eq. 5 holds for all graphs and all conditioning with less than m edges. Suppose that the root r has d neighbors v1,,vd. Define G to be the graph obtained by replacing the vertex r with d vertices r1,,rd and then connecting {ri,di} for 1id.

Consider first the case where (G\{r})\Λ is still connected. For each i, let Gi=Gri. Define the 2-spin system on Gi with the same parameters (β,γ,𝝀), plus an additional conditioning that the vertices r1,,ri1 are fixed to spin 0 while ri+1,,rd are fixed to spin 1; we denote this conditioning by σUi with Ui={v1,,vd}\{vi}. Then, T=Tsaw(G,r) can be generated by the following recursive procedure. Also see Fig. 2 for an illustration.

Refer to caption
Figure 2: A recursive construction of the self-avoiding walk (SAW) tree. Here Ti is the SAW tree of Gi rooted at vi for i=1,2,3. (/: fixed to spin 1/0.)
Algorithm: Tsaw(G,r)
  1. 1.

    For each i, let Ti=Tsaw(Gi,vi) plus the conditioning σUi;

  2. 2.

    Let T=Tsaw(G,r) be the union of r and T1,,Td by connecting {r,vi} for 1id; output T.

For the purpose of proof, we also consider the 2-spin system on G with the same parameters (β,γ,𝝀), with an exception that we let the vertices r1,,rd have no fields (i.e., setting λri=1 for 1id instead of λr). We then observe that

ZGσΛ(r = 1)=λrZGσΛ(r1 = 1,,rd = 1),

and the same holds with spin 1 replaced by 0. For 1id, let σΛi denote the union of the conditioning σΛ and σUi, where Λi=ΛUi. Then for every 1id we have

ZGσΛ(r1 = 0,,ri1 = 0,ri = 1,,rd = 1)=βZGiσΛi(vi = 1)+ZGiσΛi(vi = 0).

Notice that both sides are independent of the field λr: for the left side, all ri’s do not have a field for the spin system on G; for the right side, recall that we do not count the weight of fixed vertices for the conditional partition function for each Gi. Now define QG,rσΛ=QG,rσΛ(𝝀) by

QG,rσΛ=i=2dZGσΛ(r1 = 0,,ri1 = 0,ri = 1,,rd = 1),

which is independent of λr. Then we get

ZGσΛ(r = 1)QG,rσΛ =λri=1dZGσΛ(r1 = 0,,ri1 = 0,ri = 1,,rd = 1)
=λri=1d(βZGiσΛi(vi = 1)+ZGiσΛi(vi = 0)).

Using a similar argument, we also have

ZGσΛ(r = 0)QG,rσΛ =i=1dZGσΛ(r1 = 0,,ri = 0,ri+1 = 1,,rd = 1)
=i=1d(ZGiσΛi(vi = 1)+γZGiσΛi(vi = 0)).

Since we assume that (G\{r})\Λ is connected, the graph Gi\Λ is also connected for each i. Then, by the induction hypothesis, for each i there exists a polynomial PGi,viσΛi=PGi,viσΛi(𝝀) such that

ZTiσΛi(r = 1)=ZGiσΛi(r = 1)PGi,viσΛiandZTiσΛi(r = 0)=ZGiσΛi(r = 0)PGi,viσΛi;

these polynomials are independent of λr since the conditional partition functions for Gi’s do not involve λr. Now if we let

PG,rσΛ=QG,rσΛi=1dPGi,viσΛi,

then it follows from the tree recursion that

ZTσΛ(r = 1) =λri=1d(βZTiσΛi(vi = 1)+ZTiσΛi(vi = 0))
=λri=1d(βZGiσΛi(vi = 1)PGi,viσΛi+ZGiσΛi(vi = 0)PGi,viσΛi)
=ZGσΛ(r = 1)QG,rσΛi=1dPGi,viσΛi
=ZGσΛ(r = 1)PG,rσΛ.

The other equality ZTσΛ(r = 0)=ZGσΛ(r = 0)PG,rσΛ is established in the same way. This completes the proof for the case that (G\{r})\Λ is connected.

If (G\{r})\Λ has two or more connected components, then we can construct Tsaw(G,r) by the SAW tree of each component. Recall that G is defined by splitting the vertex r into d copies in the graph G. Suppose that G\Λ has k connected component for an integer k2. Let G(1),,G(k) be the subgraphs induced by each component, along with vertices from Λ that are adjacent to it. For each j, let G(j) be the graph obtained from G(j) by contracting all copies of r into one vertex r(j), and let T(j)=Tsaw(G(j),r(j)). Observe that once we contract the roots r(1),,r(k) of T(1),,T(k), the resulting tree is Tsaw(G,r).

We define the 2-spin system on each G(j) with the same parameters (β,γ,𝝀), except that the vertex r(j) does not have a field (i.e., λr(j)=1 instead of λr). For 1jk, let Λ(j)=ΛV(G(j)) and σΛ(j) be the configuration σΛ restricted on Λ(j). Then G(j)\Λ(j) is connected for every j and, since k2, each G(j) with conditioning σΛ(j) has fewer than m edges. Thus, we can apply the induction hypothesis; namely, for 1jk there exists a polynomial PG(i),r(i)σΛ(j)=PG(i),r(i)σΛ(j)(𝝀), which is independent of λr, such that

ZT(j)σΛ(j)(r(j) = 1)=ZG(j)σΛ(j)(r(j) = 1)PG(j),r(j)σΛ(j)andZT(j)σΛ(j)(r(j) = 0)=ZG(j)σΛ(j)(r(j) = 0)PG(j),r(j)σΛ(j).

We define the polynomial PG,rσΛ=PG,rσΛ(𝝀) to be

PG,rσΛ=j=1kPG(j),r(j)σΛ(j).

It is then easy to check that

ZTσΛ(r = 1) =λrj=1kZT(j)σΛ(j)(r(j) = 1)=λrj=1k(ZG(j)σΛ(j)(r(j) = 1)PG(j),r(j)σΛ(j))
=ZGσΛ(r = 1)j=1kPG(j),r(j)σΛ(j)=ZGσΛ(r = 1)PG,rσΛ,

and similarly ZTσΛ(r = 0)=ZGσΛ(r = 0)PG,rσΛ. The theorem then follows.

5 Influence bound for trees

In this section, we study the influences of the root on other vertices in a tree. We give an upper bound on the total influences of the root on all vertices at a fixed distance away. To do this, we apply the potential method, which has been used to establish the correlation decay property (see, e.g., [LLY12, LLY13, GL18]). Given an arbitrary potential function Ψ, our upper bound is in terms of properties of Ψ, involving bounds on HdΨ1 and |ψ| where ψ=Ψ. We then deduce 9 in the case that Ψ an (α,c)-potential.

Assume that T=(VT,ET) is a tree rooted at r of maximum degree at most Δ. Let ΛVT\{r} and σΛ{0,1}Λ be arbitrary and fixed. Consider the 2-spin system on T with parameters (β,γ,λ), conditioned on σΛ. We need to bound the influence TσΛ(r v) from the root r to another vertex vVT. Notice that if v is disconnected from r when Λ is removed, then TσΛ(r v)=0 by the Markov property of spin systems. Therefore, we may assume that, by removing all such vertices, Λ contains only leaves of T.

For a vertex vVT, let Tv=(VTv,ETv) be the subtree of T rooted at v that contains all descendant of v; note that Tr=T. We will write Lv(k)VT\Λ for the set of all free vertices at distance k away from v in Tv. We pay particular interest in the marginal ratio at v in the subtree Tv, and write Rv=RTvσΛ(v) for simplicity. The logRv’s are related by the tree recursion H. If a vertex v has d children, denoted by u1,,ud, then the tree recursion is given by

logRv=Hd(logRu1,,logRud),

where for 1dΔ and (y1,,yd)[,+]d,

Hd(y1,,yd)=logλ+i=1dlog(βeyi+1eyi+γ).

Also recall that for y[,+], we define

h(y)=(1βγ)ey(βey+1)(ey+γ)

and yiHd(y1,,yd)=h(yi) for all 1idΔ.

The following lemma allows us to bound the sum of all influences from the root to distance k, using an arbitrary potential function.

Lemma 14.

Let Ψ:[,+](,+) be a differentiable and increasing (potential) function with image S=Ψ[,+] and derivative ψ=Ψ. Denote the degree of the root r by Δr. Then for every integer k1,

vLr(k)|TσΛ(r v)|ΔrAΨBΨ(max1d<Δsup𝒚~SdHdΨ(𝒚~)1)k1

where

AΨ=maxuLr(1){|h(logRu)|ψ(logRu)}andBΨ=maxvLr(k){ψ(logRv)}.

Before proving 14, we first present two useful properties of the influences on trees. Firstly, it was shown in [ALO20] that the influences satisfy the following form of chain rule on trees.

Lemma 15 ([ALO20, Lemma B.2]).

Suppose that u,v,wVT are three distinct vertices such that u is on the unique path from v to w. Then

TσΛ(v w)=TσΛ(v u)TσΛ(u w).

Secondly, for two adjacent vertices on a tree, the influence from one to the other is given by the function h.

Lemma 16.

Let vVT and u be a child of v in the subtree Tv. Then

TσΛ(v u)=h(logRu).
Proof.

The lemma can be proved through an explicit computation of the influence. Here we present a more delicate proof utilizing 12, which gives some insights into the relation between the influence and the function h. We assume that v has d children in the subtree Tv, denoted by u1=u and u2,,ud respectively. We also assume, as a more general setting than uniform fields, that each vertex w is attached to a field λw of its own. Then 12 and the tree recursion imply that

TσΛ(v u) =TvσΛ(v u)=(λuλu)logRv
=(λuλu)Hd(logRu1,,logRud)
=i=1dlogRuiHd(logRu1,,logRud)(λuλu)logRui
=i=1dh(logRui)TuiσΛ(ui u)=h(logRu),

where the last equality is because TuiσΛ(ui u)=0 for uiu and TuσΛ(u u)=1.

We are now ready to prove 14.

Proof of 14.

For a vertex vVT, denote the number of its children by dv; note that dr=Δr. Let u1,,uΔr be the children of the root r. We may assume that all these children of r are free, since if ui is fixed then TσΛ(r ui)=0 by definition. Then by 15 and 16, we get

vLr(k)|TσΛ(r v)| =i=1Δr|TσΛ(r ui)|vLui(k1)|TσΛ(ui v)|
=i=1Δr|h(logRui)|vLui(k1)|TσΛ(ui v)|
=i=1Δr|h(logRui)|ψ(logRui)vLui(k1)ψ(logRui)|TσΛ(ui v)|.

Hence, we obtain that

vLr(k)|TσΛ(r v)|Δrmax1iΔr{|h(logRui)|ψ(logRui)}max1iΔr{vLui(k1)ψ(logRui)|TσΛ(ui v)|}. (6)

Next, we show by induction that for every vertex uVT\{r} and every integer k0 we have

vLu(k)ψ(logRu)|TσΛ(u v)|maxvLu(k){ψ(logRv)}(maxwVTusup𝒚~SdwHdwΨ(𝒚~)1)k. (7)

Observe that once we establish Eq. 7, the lemma follows immediately by plugging Eq. 7 into Eq. 6. We will use induction on k to prove Eq. 7. When k=0, if uΛ is fixed then Lu(0)= and there is nothing to show; otherwise, Eq. 7 becomes

ψ(logRu)|TσΛ(u u)|ψ(logRu),

which holds with equality since TσΛ(u u)=1. Now suppose that Eq. 7 holds for some integer k10 (and for every vertex uVT\{r}). Let uVT\{r} be arbitrary and denote the children of u by w1,,wd, where 1d<Δ (if d=0 then Lu(k)= and Eq. 7 holds trivially). Again by 15 and 16 we have

vLu(k)ψ(logRu)|TσΛ(u v)| =i=1dψ(logRu)|TσΛ(u wi)|vLwi(k1)|TσΛ(wi v)|
=i=1dψ(logRu)ψ(logRwi)|h(logRwi)|vLwi(k1)ψ(logRwi)|TσΛ(wi v)|.

Using the induction hypothesis, we get

vLu(k)ψ(logRu)|TσΛ(u v)|
i=1dψ(logRu)ψ(logRwi)|h(logRwi)|maxvLwi(k1){ψ(logRv)}(maxwVTwisup𝒚~SdwHdwΨ(𝒚~)1)k1
maxvLu(k){ψ(logRv)}(maxwVTu\{u}sup𝒚~SdwHdwΨ(𝒚~)1)k1i=1dψ(logRu)ψ(logRwi)|h(logRwi)|
maxvLu(k){ψ(logRv)}(maxwVTusup𝒚~SdwHdwΨ(𝒚~)1)k,

where the last inequality follows from that

i=1dψ(logRu)ψ(logRwi)|h(logRwi)| =i=1d|Ψ(logRwi)HdΨ(Ψ(logRw1),,Ψ(logRwd))|
=HdΨ(Ψ(logRw1),,Ψ(logRwd))1.

This establishes Eq. 7, and thus completes the proof of the lemma.

We then derive 9 as a corollary.

Proof of 9.

Since Ψ is an (α,c)-potential, the Contraction condition implies that

max1d<Δsup𝒚~SdHdΨ(𝒚~)11α.

Meanwhile, since the degree of a vertex vVT\{r} in the subtree Tv is less than Δ, we have logRvJ. Then the Boundedness condition implies that for all uLr(1) and vLr(k),

ψ(logRv)ψ(logRu)|h(logRu)|cΔ.

Therefore, we get

ΔrAΨBΨ=ΔrmaxuLr(1){|h(logRu)|ψ(logRu)}maxvLr(k){ψ(logRv)}c.

The lemma then follows immediately from 14.

6 Verifying a good potential: Contraction

In this section, we make a first step for proving 10. Let Δ3 be an integer. Let β,γ,λ be reals such that 0βγ, γ>0, βγ<1 and λ>0. Recall that define our potential function Ψ:[,+](,+) through its derivative by

Ψ(y)=ψ(y)=(1βγ)ey(βey+1)(ey+γ),Ψ(0)=0. (1)

We include a short proof in Appendix C to show that Ψ is well-defined. If (β,γ,λ) is up-to-Δ unique with gap δ(0,1), then we show that Ψ satisfies the Contraction condition for α=δ/2. This holds for all parameters (β,γ,λ) in the uniqueness region, without requiring that γ1. Later in Appendix E, we establish the Boundedness condition for Ψ when γ1, completing the proof of 10. The case of γ>1 is more complicated and is left to Section 7.

Before giving our proof, we first point out that the potential function Ψ is essentially the same potential function Φ used in [LLY13] (notice that [LLY13] uses φ as the notation of the potential function and Φ=φ for its derivative). Recall that the tree recursion for the marginal ratios is given by the function Fd:[0,+]d[0,+] where 1dΔ such that for all (x1,,xd)[0,+]d,

Fd(x1,,xd)=λi=1dβxi+1xi+γ.

The potential function Φ:[0,+](,+) from [LLY13] is defined implicitly via its derivative as

Φ(x)=φ(x)=1x(βx+1)(x+γ),Φ(1)=0.

The follows lemma explains how we obtain our potential Ψ from Φ.

Lemma 17.

We have Ψ=1βγ(Φexp); namely, Ψ(y)=1βγΦ(ey) for all y[,+].

Proof.

It is straightforward to check that

ψ(y)=(1βγ)ey(βey+1)(ey+γ)=1βγey1ey(βey+1)(ey+γ)=1βγeyφ(ey).

Therefore,

Ψ(y)=0yψ(t)dt=1βγ0yetφ(et)dt=1βγ1eyφ(s)ds=1βγΦ(ey).

Combining the results of Lemmas 12, 13 and 14 from [LLY13], we get that the potential function Φ satisfies the following gradient bound when (β,γ,λ) is in the uniqueness region. Note that this can be regarded as the Contraction condition but for Φ and Fd.

Theorem 18 ([LLY13]).

Let SΦ=Φ[0,+] be the image of Φ. If the parameters (β,γ,λ) are up-to-Δ unique with gap δ(0,1), then for every integer d such that 1d<Δ and every (x~1,,x~d)SΦd,

FdΦ(x~1,,x~d)11δ

where FdΦ=ΦFdΦ1.

Recall our definition from Section 1.1. The tree recursion, in terms of the log marginal ratios, is described by the function Hd:[,+]d[,+] where 1dΔ such that for every (y1,,yd)[,+]d,

Hd(y1,,yd)=logλ+i=1dlog(βeyi+1eyi+γ).

Observe that Hd=logFdexp, since we move from ratios to log ratios. We are now ready to establish the Contraction condition for Ψ.

Lemma 19.

Let SΨ=Ψ[,+] be the image of Ψ. If the parameters (β,γ,λ) are up-to-Δ unique with gap δ(0,1), then for every integer d such that 1d<Δ and every (y~1,,y~d)SΨd,

HdΨ(y~1,,y~d)11δ

where HdΨ=ΨHdΨ1.

Proof.

Define the linear function a: to be a(x)=1βγx for x. Then 17 gives Ψ=aΦexp, and thereby Ψlog=aΦ. It follows that for every 1d<Δ,

HdΨ=ΨHdΨ1=ΨlogFdexpΨ1=aΦFdΦ1a1=aFdΦa1.

That means, for every (y~1,,y~d)SΨd we have

HdΨ(y~1,,y~d)=1βγFdΦ(x~1,,x~d)

where x~i=y~i/1βγ for 1id. Then, for each i,

y~iHdΨ(y~1,,y~d)=1βγx~iFdΦ(x~1,,x~d)dx~idy~i=x~iFdΦ(x~1,,x~d).

This implies that HdΨ(y~1,,y~d)=FdΦ(x~1,,x~d) for all (y~1,,y~d)SΨd, and the lemma then follows from 18.

7 Remaining antiferromagnetic cases: βγΔ2Δ and γ>1

In this section, we discuss the case where βγΔ2Δ and γ>1. As studied in [LLY13], in this case the uniqueness region is more complicated. For example, there exists a critical λc>0 such that the 2-spin system with λ<λc is in the uniqueness region for arbitrary graphs; namely, (β,γ,λ) is up-to- unique. To deal with large degrees, we need to relax the Boundedness condition in 4 and define a more general version of (α,c)-potentials. We shall see that 5 still holds for this general (α,c)-potential. The reason behind it is that in order to bound the maximum eigenvalue of the influence matrix, it suffices to consider a vertex-weighted sum of absolute influences of a vertex with large degree.

Remark 20.

We give more background on the uniqueness region in Section E.1. Note that in a recent revision of [LLY13], the authors updated the descriptions of the uniqueness region for the case βγΔ2Δ and γ>1, fixing a small error in the previous version. Statements and proofs in this section and Appendix E of this paper are also adjusted accordingly based on the new version of [LLY13].

Recall that our goal is to bound the maximum eigenvalue of the matrix GσΛ. We can do this by upper bounding the absolute row sum vV\Λ|GσΛ(r v)| for fixed r, thereby giving us a valid upper bound on λmax(GσΛ). However, this approach does not work when βγΔ2Δ and γ>1. In this case, the potential Ψ fails to be an (α,c)-potential for a universal constant c independent of Δ. In fact, no such (α,c)-potentials exist as the absolute row sum vV\Λ|GσΛ(r v)| can be as large as Θ(Δ). Especially, if the parameters (β,γ,λ) are up-to- unique, which means the spin system has uniqueness for arbitrary graphs, then the absolute row sum vV\Λ|GσΛ(r v)| can be Θ(n) where n=|V|. We give a specific example where this is the case.

Example 21.

Consider the antiferromagnetic 2-spin system specified by parameters β=0, γ>1 and λ>0 on the star graph centered at r with Δ leaves. A simple calculation reveals that |G(r v)|=λλ+γ for any leaf vertex vr. Hence, vr|G(r v)|=Δλλ+γ. Now, since γ>1, we have

λc=λc(γ,Δ)=min1<d<Δγd+1dd(d1)d+1=Θγ(1),

forcing vr|G(r v)|=Θγ(Δ) even when λ<λc lies in the uniqueness region. However, we still have λmax(G)=O(1) since vr|G(v r)|=O(1).

To solve this issue, one might want to consider the absolute column sum, involving the sum of absolute influences on a fixed vertex. However, this will not allow us to use the beautiful connection between graphs and SAW trees as showed in 8. Instead, we consider here a vertex-weighted version of the absolute row sum of GσΛ, which also upper bounds the maximum eigenvalue.

Lemma 22.

Let ρ:V+ be a positive weight function of vertices. If there is a constant ξ>0 such that for every rV we have

vV\Λρv|GσΛ(r v)|ξρr, (8)

then λmax(GσΛ)ξ.

Proof.

Let 𝒫=diag{ρv:vV\Λ}. Then the assumption is equivalent to 𝒫1GσΛ𝒫ξ. It follows that λmax(GσΛ)=λmax(𝒫1GσΛ𝒫)ξ.

We then modify our definition of (α,c)-potentials from 4 which allows a weaker Boundedness condition. We remark that the only two differences between 23 and 4 is that: we allow Δ=; and the Boundedness condition is relaxed to what we call General Boundedness. Recall that for every 0d<Δ, we let Jd=[log(λβd),log(λ/γd)] when βγ<1, and Jd=[log(λ/γd),log(λβd)] when βγ>1.

Definition 23 (General (α,c)-potential function).

Let Δ3 be an integer or Δ=. Let β,γ,λ be reals such that 0βγ, γ>0 and λ>0. Let Ψ:[,+](,+) be a differentiable and increasing function with image S=Ψ[,+] and derivative ψ=Ψ. For any α(0,1) and c>0, we say Ψ is a general (α,c)-potential function with respect to Δ and (β,γ,λ) if it satisfies the following conditions:

  1. 1.

    (Contraction) For every integer d such that 1d<Δ and every (y~1,,y~d)Sd, we have

    HdΨ(y~1,,y~d)1=i=1dψ(y)ψ(yi)|h(yi)|1α

    where HdΨ=ΨHdΨ1, yi=Ψ1(y~i) for 1id, and y=Hd(y1,,yd).

  2. 2.

    (General Boundedness) For all integers d1,d2 such that 0d1,d2<Δ, and all reals y1Jd1,y2Jd2, we have

    ψ(y2)ψ(y1)|h(y1)|2cd1+d2+2.

Notice that General Boundedness is a weaker condition than Boundedness. To see this, if a potential function Ψ satisfies Boundedness with parameter c, then for every 0di<Δ and every yiJdi where i=1,2 we have

ψ(y2)ψ(y1)|h(y1)|cΔ2cd1+d2+2.

The following theorem generalizes 5 and shows that a general (α,c)-potential function is sufficient to establish rapid mixing of the Glauber dynamics.

Theorem 24.

Let Δ3 be an integer or Δ=+. Let β,γ,λ be reals such that 0βγ, γ>0 and λ>0. Suppose that there is a general (α,c)-potential with respect to Δ and (β,γ,λ) for some α(0,1) and c>0. Then for every n-vertex graph G of maximum degree Δ, the mixing time of the Glauber dynamics for the 2-spin system on G with parameters (β,γ,λ) is O(n2+2c/α).

We then give a counterpart of 10, showing that Ψ is a general (α,c)-potential when βγΔ2Δ and γ>1. 3 for this case is then obtained from 24 and 25.

Lemma 25.

Let Δ3 be an integer. Let β,γ,λ be reals such that 0β<1<γ and βγΔ2Δ. Assume that (β,γ,λ) is up-to-Δ unique with gap δ(0,1). Then the function Ψ defined implicitly by Eq. 4 is a general (α,c)-potential function with αδ/2 and c18; we can further take c4 if β=0.

The proof of 24 can be found in Appendix D. For 25, the Contraction condition of Ψ follows from 19, and General Boundedness is proved in Appendix E together with all other cases.

8 Ferromagnetic cases

In the ferromagnetic case, the best known correlation decay results are given in [GL18, SS20]. Using the potential functions in [GL18] and [SS20], we show the following two results, which match the known correlation decay results. In fact, the potential function from [SS20] turns out to be an (α,c)-potential function for constants α=Θ(δ) and cO(1).

Theorem 26.

Fix an integer Δ3, positive real numbers β,γ,λ and 0<δ<1, and assume (β,γ,λ) satisfies one of the following three conditions:

  1. 1.

    Δ2+δΔδβγΔδΔ2+δ, and λ is arbitrary;

  2. 2.

    βγΔΔ2 and λ(1δ)γmax{1,βΔ1}((Δ2)βγΔ);

  3. 3.

    βγΔΔ2 and λ11δ(Δ2)βγΔβmin{1,1/γΔ1}.

Then the identity function Ψ(y)=y (based on the potential given in [SS20]) is an (α,c)-potential function for α=Θ(δ) and cO(1). Furthermore, for every n-vertex graph G of maximum degree at most Δ, the mixing time of the Glauber dynamics for the 2-spin system on G with parameters (β,γ,λ) is O(n2+c/δ), for a universal constant c>0.

Remark 3.

Condition 1 includes both the ferromagnetic case 1<βγΔδΔ2+δ and the antiferromagnetic case Δ2+δΔδβγ<1. Note that in both cases (β,γ,λ) is up-to-Δ unique with gap δ. For the antiferromagnetic case, the identity function Ψ is an (α,c)-potential with c1.5 and a better contraction rate αδ, compared with the bound αδ/2 of the potential Ψ given by Eq. 4 in 10. For the ferromagnetic case with β=γ>1 (Ising model), a stronger result by [MS13] was known, which gives O(nlogn) mixing.

The potential function from [GL18] is indeed an (α,c)-potential, but c must, unfortunately, depend on Δ. We have the following result, which is weaker than the correlation decay algorithm in [GL18] for unbounded degree graphs.

Theorem 27.

Fix an integer Δ3, and nonnegative real numbers β,γ,λ satisfying β1γ, βγΔΔ2, and λ<(γβ)βγβγ1. Then for every n-vertex graph G with maximum degree at most Δ, the mixing time of the Glauber dynamics for the ferromagnetic 2-spin system on G with parameters (β,γ,λ) is O(nC), for a constant C depending only on β,γ,λ,Δ, but not n.

Proofs of these theorems are provided in Appendix F.

References

  • [AL20] Vedat Levi Alev and Lap Chi Lau “Improved analysis of higher order random walks and applications” In Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing (STOC), 2020, pp. 1198–1211
  • [ALO20] Nima Anari, Kuikui Liu and Shayan Oveis Gharan “Spectral Independence in High-Dimensional Expanders and Applications to the Hardcore Model” In arXiv preprint arXiv:2001.00303, 2020
  • [Bar16] Alexander Barvinok “Combinatorics and Complexity of Partition Functions” Springer AlgorithmsCombinatorics, 2016
  • [Ben18] Ferenc Bencs “On trees with real-rooted independence polynomial” In Discrete Mathematics 341.12 Elsevier, 2018, pp. 3321–3330
  • [GL18] Heng Guo and Pinyan Lu “Uniqueness, Spatial Mixing, and Approximation for Ferromagnetic 2-Spin Systems” In ACM Transactions on Computation Theory 10.4, 2018
  • [GŠV16] Andreas Galanis, Daniel Štefankovič and Eric Vigoda “Inapproximability of the Partition Function for the Antiferromagnetic Ising and Hard-Core Models” In Combinatorics, Probability and Computing 25.4 Cambridge University Press, 2016, pp. 500–559
  • [JS93] Mark Jerrum and Alistair Sinclair “Polynomial-time approximation algorithms for the Ising model” In SIAM Journal on Computing 22.5 SIAM, 1993, pp. 1087–1116
  • [JVV86] Mark R Jerrum, Leslie G Valiant and Vijay V Vazirani “Random generation of combinatorial structures from a uniform distribution” In Theoretical Computer Science 43 Elsevier, 1986, pp. 169–188
  • [Kel85] F.. Kelly “Stochastic Models of Computer Communication Systems” In Journal of the Royal Statistical Society. Series B (Methodological) 47.3, 1985, pp. 379–395
  • [LLY12] Liang Li, Pinyan Lu and Yitong Yin “Approximate Counting via Correlation Decay in Spin Systems” In Proceedings of the 23rd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), 2012, pp. 922–940
  • [LLY13] Liang Li, Pinyan Lu and Yitong Yin “Correlation Decay Up to Uniqueness in Spin Systems” In Proceedings of the 24th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), 2013, pp. 67–84
  • [LSS19] Jingcheng Liu, Alistair Sinclair and Piyush Srivastava “Fisher zeros and correlation decay in the Ising model” In Journal of Mathematical Physics 60.10 AIP Publishing LLC, 2019, pp. 103304
  • [MS13] Elchanan Mossel and Allan Sly “Exact thresholds for Ising-Gibbs samplers on general graphs” In Annals of Probability 41.1, 2013, pp. 294–328
  • [PR17] Viresh Patel and Guus Regts “Deterministic Polynomial-Time Approximation Algorithms for Partition Functions and Graph Polynomials” In SIAM Journal on Computing 46, 2017, pp. 1893–1919
  • [PR19] Han Peters and Guus Regts “On a conjecture of Sokal concerning roots of the independence polynomial” In The Michigan Mathematical Journal 68.1, 2019, pp. 33–55
  • [Sly10] Allan Sly “Computational Transition at the Uniqueness Threshold” In Proceedings of the 51st Annual IEEE Symposium on Foundations of Computer Science (FOCS), 2010, pp. 287–296
  • [SS05] Alexander D. Scott and Alan D. Sokal “The Repulsive Lattice Gas, the Independent-Set Polynomial, and the Lovász Local Lemma” In Journal of Statistical Physics 118.5, 2005, pp. 1151–1261
  • [SS14] Allan Sly and Nike Sun “The Computational Hardness of Counting in Two-Spin Models on d-Regular Graphs” In The Annals of Probability 42.6, 2014, pp. 2383–2416
  • [SS20] Shuai Shao and Yuxin Sun “Contraction: A Unified Perspective of Correlation Decay and Zero-Freeness of 2-Spin Systems” In Proceedings of the 47th International Colloquium on Automata, Languages, and Programming (ICALP), 2020, pp. 96:1–15
  • [SST14] Alistair Sinclair, Piyush Srivastava and Marc Thurley “Approximation Algorithms for Two-State Anti-Ferromagnetic Spin Systems on Bounded Degree Graphs” In Journal of Statistical Physics 155.4, 2014, pp. 666–686
  • [ŠVV09] Daniel Štefankovič, Santosh Vempala and Eric Vigoda “Adaptive simulated annealing: A near-optimal connection between sampling and counting” In Journal of the ACM 56.3, 2009, pp. 1–36
  • [Wei06] Dror Weitz “Counting Independent Sets Up to the Tree Threshold” In Proceedings of the 38th Annual ACM Symposium on Theory of Computing (STOC), 2006, pp. 140–149

Appendix A Proof of main results

In this section we give the proofs of 1, 2, 3 and 5.

Proof of 5.

Note that since the transition matrix P for the Glauber dynamics has all nonnegative eigenvalues, we have that λ(P)=1λ2(P) and so in order to deduce mixing, it suffices to lower bound 1λ2(P). We do this by employing 7. It suffices to show (η0,,ηn2)-spectrally independence for sufficiently small ηi.

To bound ηi, it suffices to bound vV\{r}|GσΛ(r v)| for all graphs G=(V,E) with n=|V| vertices and all boundary conditions σΛ on a subset Λ of i vertices. We claim the following:

vV\{r}|GσΛ(r v)|min{cα,C(ni1)} (9)

where C(0,1) is a constant depending only on β,γ,λ,Δ. The first upper bound cδ is deduced by

vV\{r}|GσΛ(r v)| vVT\{r}|TσΛ(r v)| (8; T=Tsaw(G,r))
=k=1vLr(k)|TσΛΛ(r v)| (split the sum by levels)
ck=1(1α)k1 (9)
=cα.

The second upper bound C(ni1) is more trivial. Intuitively, it means each absolute pairwise influence |GσΛ(r v)| is at most some constant C and hence the sum of absolute influences is upper bounded by C(ni1). The following two claims, whose proofs are provided in Section A.2, give a more precise statement.

Claim 28 (Antiferromagnetic Case).

Fix an integer Δ3 and real numbers β,γ,λ, and assume 0βγ, γ>0, βγ<1 and λ>0. Then for every n-vertex graph G of maximum degree at most Δ, the antiferromagnetic 2-spin system on G with parameters (β,γ,λ) is Cn-spectrally independent, for a constant 0<C<1 depending only on β,γ,λ,Δ. Furthermore, if (β,γ,Δ) is up-to-Δ unique, then we can drop the dependence on Δ.

Claim 29 (Ferromagnetic Case).

Fix an integer Δ3 and positive real numbers β,γ,λ, and assume βγ and βγ>1. Then for every n-vertex graph G of maximum degree at most Δ, the ferromagnetic 2-spin system on G with parameters (β,γ,λ) is Cn-spectrally independent, for a constant 0<C<1 depending only on β,γ,λ,Δ.

With Eq. 9 in hand, we immediately see that by 7,

1λ2(P)1ni=0n2(1ηini1)1n(1C)2c/α1i=0n2c/α1(1cα1ni1).

Using the fact that 1xexp(xx2) for all 0x12 (which can be proved straightforwardly by calculus), we get

i=0n2c/α1(1cα1ni1)=j=2c/αn1(1cα1j)exp(cαj=2c/αn11jc2α2j=2c/αn11j2).

Now since

j=2c/αn11jj=2n1j1ndxx=logn

and

j=2c/αn11j2j=21j(j1)=1,

we deduce that

1λ2(P)(1C)2c/α1e(c/α)2n(1+c/α).

The theorem then follows from Eq. 1.

Proof of 3.

We leverage 5 and 24, which shows O(n2+cα) mixing as long as there is an (α,c)-potential, or O(n2+2cα) mixing if there is a general (α,c)-potential. We use the potential given by Eq. 4, which is an adaptation of the potential function in [LLY13] to the log marginal ratios. When (β,γ,λ) is up-to-Δ unique with gap δ(0,1), it is an (α,c)-potential or a general (α,c)-potential by 10 and 25, with αδ/2 and c a universal constant specified by the range of parameters. The theorem then follows.

Proof of 1.

By 30 later in Section A.1, λ(1δ)λc(Δ) implies up-to-Δ uniqueness with gap δ/4. Since γ1, we can again appeal to 10 to obtain an (α,c)-potential with αδ/8 and c4. 1 then follows by 5 with O(n2+32/δ) mixing.

Proof of 2.

By 31 later in Section A.1, ββc(Δ)+δ(1βc(Δ)) implies up-to-Δ uniqueness with gap δ. Again, appealing to 10, we obtain an (α,c)-potential with αδ/2 and c1.5. 2 then follows by 5 with O(n2+3/δ) mixing.

Though we technically get O(n2+3/δ) by using the [LLY13] potential, we can improve it to O(n2+1.5/δ) mixing by using the trivial identity function as the potential. See the first case of 26 (proved in Section F.1) and Remark 3.

A.1 Uniqueness gaps in terms of parameter paps

In this section we state and prove 30 and 31, which relate the parameter gaps with the uniqueness gaps.

Claim 30 (Hardcore Model; Lemma C.1 from [ALO20]).

Fix an integer Δ3, 0<δ<1, and β=0,γ>0. If λ(1δ)λc(γ,Δ), then (β,γ,λ) is up-to-Δ unique with gap δ/4.

Claim 31 (Large βγ).

Fix an integer Δ3, and 0<δ<1. If βγΔ2Δ+δ(1Δ2Δ)=Δ2(1δ)Δ, then (β,γ,λ) is up-to-Δ unique with gap 0<δ<1 for all λ. Note if β=γ, this is precisely the condition ββc(Δ)+δ(1βc(Δ)).

Proof.

Consider the univariate recursion for the marginal ratios with d<Δ children fd(R)=λ(βR+1R+γ)d. Differentiating, we have

fd(R) =dλ(βR+1R+γ)d1(βR+γβR+1(R+γ)2)=d(1βγ)λ(βR+1R+γ)d1(βR+1)(R+γ)
=d(1βγ)fd(R)(βR+1)(R+γ).

At the unique fixed point Rd, we have fd(Rd)=Rd so

|fd(Rd)|=d(1βγ)Rd(βRd+1)(Rd+γ).

By 37, we have the upper bound

|fd(Rd)|d1βγ(1+βγ)2=d1βγ1+βγ.

Since we assumed βγΔ2(1δ)Δ, we obtain

d1βγ1+βγdΔ(Δ2(1δ))Δ+(Δ2(1δ))=d1δΔ1+δ(1δ)dΔ1.

As this is at most 1δ for all d<Δ, we have up-to-Δ uniqueness with gap δ.

A.2 Spectral independence bounds for constant-size graphs

In this section, we prove spectral independence bounds for graphs with fewer than O(c/α)-many vertices, since for graphs with such few vertices, our bounds based on contraction of the tree recursions become trivial.

Proof of 28.

If Rv denotes the marginal ratio of a vertex vG, then RvλβΔ. In the case γ1, we have Rvλ/γΔ as well; if γ>1, we have Rvλ. It follows that we immediately have the bounds

|G(u v)|{|λλ+γΔλβΔ1+λβΔ|=λ(1βΔγΔ)(λ+γΔ)(1+λβΔ),if γ1|λ1+λλβΔ1+λβΔ|=λ(1βΔ)(λ+1)(1+λβΔ),o.w.

for all u,vG. Note that these constants are less than 1, and only depend on β,γ,λ,Δ, yielding the first claim.

Now, we proceed to remove the dependence on Δ when up-to-Δ uniqueness holds. We have the following cases:

  1. 1.

    If γ>1, we immediately obtain a bound of λ1+λ which is independent of Δ.

  2. 2.

    If β=0 and γ1, then λ(1βΔγΔ)(λ+γΔ)(1+λβΔ)=λλ+γΔλγΔ. Since (β,γ,λ) is up-to-Δ unique, we must have λλc(γ,Δ)=min1<d<Δγd+1dd(d1)d+1γΔ(Δ1)Δ1(Δ2)ΔγΔO(1/Δ). It follows that λγΔO(1/Δ).

  3. 3.

    If βγ>Δ2Δ and γ1, then

    λ(1βΔγΔ)(λ+γΔ)(1+λβΔ)1βΔγΔ1e2.
  4. 4.

    If βγΔ2Δ, then let Δ0 be the maximal 1<d<Δ such that βγ>d2d. If λλc(β,γ,Δ), then by 35, we have

    λ(1βΔγΔ)(λ+γΔ)(1+λβΔ)λγΔO(Δ0/Δ).

    If λλ¯c(β,γ,Δ), then again by 35, we have

    λ(1βΔγΔ)(λ+γΔ)(1+λβΔ)1λβΔO(Δ0/Δ).
Proof of 29.

The proof is identical to the antiferromagnetic case and we omit it here.

Appendix B Proof of 12 (Parts 1 and 2)

Proof of 12 (Parts 1 and 2).

To see the first equality, we compute directly and get

(λvλv)logZGσΛ =1ZGσΛ(λvλv)ZGσΛ
=1ZGσΛσ{0,1}V\Λ(λvλv)(βm1(σ)γm0(σ)wVλwσw)
=1ZGσΛσ{0,1}V\Λσv(βm1(σ)γm0(σ)wVλwσw)
=σ{0,1}V\ΛσvμG(σσΛ)=MGσΛ(v).

For Part 2, using the result above, we can also get

(λvλv)(λuλu)logZGσΛ
= (λvλv)(1ZGσΛ(λuλu)ZGσΛ)
= 1ZGσΛ(λvλv)(λuλu)ZGσΛ1(ZGσΛ)2(λvλv)ZGσΛ(λuλu)ZGσΛ
= 1ZGσΛ(λvλv)(σ{0,1}V\Λσu(βm1(σ)γm0(σ)wVλwσw))MGσΛ(u)MGσΛ(v)
= 1ZGσΛσ{0,1}V\Λσu(λvλv)(βm1(σ)γm0(σ)wVλwσw)MGσΛ(u)MGσΛ(v)
= 1ZGσΛσ{0,1}V\Λσuσv(βm1(σ)γm0(σ)wVλwσw)MGσΛ(u)MGσΛ(v)
= σ{0,1}V\ΛσuσvμG(σσΛ)MGσΛ(u)MGσΛ(v)
= KGσΛ(u,v).

Appendix C A technical lemma for Ψ

The following lemma implies that the potential Ψ given by Eq. 4 is well-defined.

Lemma 32.

For all β,γ>0 such that βγ<1, we have

+(1βγ)ey(βey+1)(ey+γ)<+.
Proof.

For the + side we have

0+(1βγ)ey(βey+1)(ey+γ)=0+1βγβey+γey+βγ+1<0+1βey<+.

Similarly, for the side we have

0(1βγ)ey(βey+1)(ey+γ)<01γey<+.

Appendix D Mixing by the potential method: Proof of 24

In this section, we prove 24 in the same way of 5, as outlined in Section 3. The major difference here is that we consider a weighted sum of absolute influences vV\Λρv|GσΛ(r v)| where ρ:V+ is a weight function. This is sufficient for us to bound the eigenvalue of the influence matrix, as indicated by 22. We will choose the weight of a vertex v to be ρv=Δv, the degree of v. The following lemma provides us an upper bound on the weighted sum of absolute influences to distance k, given a general (α,c)-potential. In particular, it generalizes 9.

Lemma 33.

If there exists a general (α,c)-potential function Ψ with respect to Δ and (β,γ,λ) where α(0,1) and c>0, then for every ΛVT\{r}, σΛ{0,1}Λ and all integers k1,

vLr(k)Δv|TσΛ(r v)|2c(1α)k1Δr

where Lr(k) denote the set of all free vertices at distance k away from r.

To prove 33, we first state the following generalization of 14 for any weight function ρ. The proof of 34 is identical to 14 and we omit here.

Lemma 34.

Let Ψ:[,+](,+) be a differentiable and increasing (potential) function with image S=Ψ[,+] and derivative ψ=Ψ. Denote the degree of the root r by Δr. Then for every integer k1,

vLr(k)ρv|TσΛ(r v)|ΔrAΨBΨρ(max1d<Δsup𝒚~SdHdΨ(𝒚~)1)k1

where

AΨ=maxuLr(1){|h(logRu)|ψ(logRu)}andBΨρ=maxvLr(k){ρvψ(logRv)}.

We then prove 33 and 24.

Proof of 33.

Denote the degree of a vertex vVT\{r} by Δv, and the degree of v in the subtree Tv by dv=Δv1. Pick the weights of vertices to be ρv=Δv for all vVT. Since Ψ is a general (α,c)-potential, the Contraction condition implies that

max1d<Δsup𝒚~SdHdΨ(𝒚~)11α.

Since logRvJdv by the definition of Jd, the General Boundedness condition implies that for all uLr(1) and vLr(k),

ψ(logRv)ψ(logRu)|h(logRu)|2cΔu+Δv.

Therefore, we get

ΔrAΨBΨρ=ΔrmaxuLr(1){|h(logRu)|ψ(logRu)}maxvLr(k){Δvψ(logRv)}2cΔr.

The lemma then follows immediately from 34.

Proof of 24.

The proof of 24 is almost identical to 5. We point out that the only difference here is that we consider the weighted sum of absolute influences of a given vertex. Since the SAW tree preserve degrees of vertices, we can still apply 8. Then, combining 7, 22, 8 and 33, we complete the proof of the theorem.

Appendix E Verifying a good potential: Boundedness

In this subsection, we show the Boundedness or General Boundedness condition for our potential function Ψ defined by Eq. 4 in different ranges of parameters. Combining 19, we complete the proofs of 10 and 25.

In Section E.1 we give background on the uniqueness region of the parameters (β,γ,λ), based on the work of [LLY13]. We then show Boundedness and General Boundedness in Section E.2. Proofs of technical lemmas are left to Section E.3.

E.1 Preliminaries of the uniqueness region

In this section we give a brief description of the uniqueness region of parameters (β,γ,λ). All the results here, and also their proofs, can be found in Lemma 21 from the latest version of [LLY13].

Let Δ3 be an integer and β,γ,λ be reals. We assume that 0βγ, γ>0, βγ<1 and λ>0. For 1dΔ define

fd(R)=λ(βR+1R+γ)d

and denote the unique fixed point of fd by Rd. Recall that the parameters (β,γ,λ) are up-to-Δ unique with gap δ(0,1) if |fd(Rd)|<1δ for all 1d<Δ.

When β=0, the spin system is called a hard-constraint model. In this case, there exists a critical threshold for the external field defined as

λc=λc(γ,Δ)=min1<d<Δγd+1dd(d1)d+1,

such that the parameters (0,γ,λ) are up-to-Δ unique if and only if λ<λc. In particular, when γ1 the critical field is given by

λc=λc(γ,Δ)=γΔ(Δ1)Δ1(Δ2)Δ.

When β>0, the spin system is called a soft-constraint model. If βγ>Δ2Δ, then (β,γ,λ) is up-to-Δ unique for all λ>0. If βγΔ2Δ the uniqueness region is more complicated which we now describe. Let

Δ¯=1+βγ1βγ,

so that for every 1d<Δ¯ we have d1βγ1+βγ<1, and for every dΔ¯ we have d1βγ1+βγ1. For every Δ¯d<Δ, we define x1(d)x2(d) to be the two positive roots of the quadratic equation

d(1βγ)x(βx+1)(x+γ)=1.

More specifically, x1(d) and x2(d) are given by

x1(d)=θ(d)θ(d)24βγ2βandx2(d)=θ(d)+θ(d)24βγ2β

where

θ(d)=d(1βγ)(1+βγ).

Notice that θ(d)2βγ for all dΔ¯. For i=1,2 we let

λi(d)=xi(d)(xi(d)+γβxi(d)+1)d.

Then, the parameters (β,γ,λ) are up-to-Δ unique if and only if λ belongs to the following regime

𝒜=Δ¯d<Δ[(0,λ1(d))(λ2(d),)]. (10)

In particular, when γ1 there are two critical thresholds 0<λc<λ¯c such that the parameters (β,γ,λ) are up-to-Δ unique if and only if λ<λc or λ>λ¯c (i.e., 𝒜=(0,λc)(λ¯c,)), where

λc=λc(β,γ,Δ)=minΔ¯d<Δλ1(d)andλ¯c=λ¯c(β,γ,Δ)=maxΔ¯d<Δλ2(d)=λ2(Δ1).

The following bounds on the critical fields are helpful for our proofs later.

Lemma 35.
  1. 1.

    If β=0, then for every integer d such that 1<d<Δ we have

    λc4γd+1d1.
  2. 2.

    If β>0 and βγΔ2Δ, then for every integer d such that Δ¯d<Δ we have

    λ1(d)18γd+1θ(d)andλ2(d)θ(d)18βd+1

    where θ(d)=d(1βγ)(1+βγ).

The proof of 35 is postponed to Section E.3.

E.2 Proofs of boundedness

In this section we complete the proofs of 10 and 25 by establishing Boundedness and General Boundedness in the corresponding range of parameters.

Let Δ3 be an integer. Let β,γ,λ be reals such that 0βγ, γ>0, βγ<1 and λ>0. Recall that the potential function Ψ is defined by

Ψ(y)=ψ(y)=(1βγ)ey(βey+1)(ey+γ)=|h(y)|,Ψ(0)=0. (1)

It is surprising to find out that ψ=|h|, as the potential Ψ is exactly the one from [LLY13] as indicated by 17. This seems not to be a coincidence, and it provides some intuition why the potential from [LLY13] works. More importantly, the fact that ψ=|h| is helpful in our proof of Boundedness and General Boundedness. Recall that for 0d<Δ and βγ<1 we let Jd=[log(λβd),log(λ/γd)] to be the range of log marginal ratios of a vertex with d children. Then for every 0di<Δ and yiJdi where i=1,2, we have

ψ(y2)ψ(y1)|h(y1)|=|h(y1)||h(y2)|. (11)

The following lemma gives upper bounds on |h(y1)||h(y2)|, from which and Eq. 11 we deduce Boundedness and General Boundedness immediately. The brackets in the lemma indicate which lemma the bound is applied to.

Lemma 36.

Let Δ3 be an integer. Let β,γ,λ be reals such that 0βγ, γ>0, βγ<1 and λ>0. Assume that the parameters (β,γ,λ) are up-to-Δ unique with gap δ(0,1). Then for all integers d1,d2 such that 0d1,d2<Δ, and all reals yiJdi where i=1,2, the following holds:

  1. H.

    Hard-constraint models: β=0 and λ<λc.

    1. H.1.

      (10) If γ1, then

      |h(y1)|4Δ.
    2. H.2.

      (25) If γ>1, then

      |h(y1)||h(y2)|8d1+d2+2.
  2. S.

    Soft-constraint models: β>0 and λ𝒜.

    1. S.1.

      (10) If βγ>Δ2Δ, then

      |h(y1)|1.5Δ.
    2. S.2.

      (10) If βγΔ2Δ and γ1, then

      |h(y1)|18Δ.
    3. S.3.

      (25) If βγΔ2Δ and γ>1, then

      |h(y1)||h(y2)|36d1+d2+2.

The following lemma, whose proof can be found in Section E.3, is helpful.

Lemma 37.

For every y[,+] we have

|h(y)|=|1βγ|ey(βey+1)(ey+γ)|1βγ|1+βγ.

We present here the proof of 36.

Proof of 36.

We use notations and results from Section E.1.

H. Hard-constraint models: β=0 and λ<λc.

H.1. γ1.

For every y1Jd1 we deduce from 35 that

ey1λγd1λcγΔ14γΔ2.

Hence,

|h(y1)|=ey1ey1+γ4γΔ24γΔ2+γ=4Δ+24Δ.

H.2. γ>1.

Let y¯=y1+y22 and d¯=d1+d22. Then we get

|h(y1)||h(y2)|=ey1ey1+γey2ey2+γ=1(1+γey1)(1+γey2)11+γey¯,

where the last inequality follows from the AM–GM inequality by

(1+γey1)(1+γey2)=1+γ(ey1+ey2)+γ2e2y¯1+2γey¯+γ2e2y¯=(1+γey¯)2.

Since yiJdi for i=1,2, we have

ey¯=ey1ey2λγd1λγd2=λγd¯.

If d¯2, then we deduce from 35 and γ>1 that

ey¯λcγd¯4γd¯1.

It follows that

|h(y1)||h(y2)|11+γey¯11+d¯14=4d¯+38d1+d2+2.

If d¯<2, then it is easy to see that

|h(y1)||h(y2)|18d1+d2+2.

S. Soft-constraint models: β>0 and λ𝒜.

S.1. βγ>Δ2Δ.

For every y1J we deduce from 37 that

|h(y1)|1βγ1+βγ1Δ11.5Δ.

S.2. βγΔ2Δ and γ1.

In this case, we have either λ<λc or λ>λ¯c where λc,λ¯c are the two critical fields. Consider first λ>λ¯c. For every y1Jd1 we deduce from 35 and β<1 that

ey1λβd1λ¯cβΔ1θ(Δ1)18β

where θ(d)=d(1βγ)(1+βγ). Hence,

|h(y1)|=(1βγ)ey1(βey1+1)(ey1+γ) =1βγβey1+γey1+(1+βγ)
1βγθ(Δ1)18+(1+βγ)=18(1βγ)(Δ1)(1βγ)+17(1+βγ)18Δ.

Next we consider λ<λc. For every y1Jd1 we deduce from 35 and γ1 that

ey1λγd1λcγΔ118γθ(Δ1).

Hence,

|h(y1)|=1βγβey1+γey1+(1+βγ)1βγθ(Δ1)18+(1+βγ)18Δ.

S.3. βγΔ2Δ and γ>1.

Let y¯=y1+y22, d¯=d1+d22, dL=d¯, and dR=d¯. We first consider some trivial cases. If d¯2 then it is easy to see that

|h(y1)||h(y2)|16d1+d2+2.

If d¯>2 and dLΔ¯, then we deduce from 37 that

|h(y1)||h(y2)|1βγ1+βγ=1Δ¯2d1+d226d1+d2+2.

Hence, in the following we may assume that d¯>2 and dL>Δ¯.

Since the parameters (β,γ,λ) are up-to-Δ unique, we have λ𝒜 where the regime 𝒜 is given by Eq. 10. Observe that

𝒜(0,λ1(dL))(λ2(dR),)(λ2(dL),λ1(dR))

where the last interval is nonempty only when λ2(dL)<λ1(dR). This means that λ is contained in at least one of the three intervals. We establish the bound by considering these three cases separately.

Case 1: λ<λ1(dL). By the Cauchy-Schwarz inequality, we have

|h(y1)||h(y2)| =1βγβey1+γey1+(1+βγ)1βγβey2+γey2+(1+βγ)
1βγ(βey1+γey1)(βey2+γey2)+(1+βγ). (12)

Therefore, we get

|h(y1)||h(y2)|1βγγey¯+(1+βγ).

Since yiJdi for i=1,2 and γ>1, we deduce from 35 that

ey¯λγd¯λ1(dL)γdL18γθ(dL),

where θ(dL)=dL(1βγ)(1+βγ). It follows that

|h(y1)||h(y2)|1βγγey¯+(1+βγ)1βγθ(dL)18+(1+βγ)36d1+d2+2.

Case 2: λ>λ2(dR). Similarly, we obtain from Eq. 12 that

|h(y1)||h(y2)|1βγβey¯+(1+βγ).

Since yiJdi for i=1,2 and β<1, we deduce from 35 that

ey¯λβd¯λ2(dR)βdRθ(dR)18β,

where θ(dR)=dR(1βγ)(1+βγ). It follows that

|h(y1)||h(y2)|1βγβey¯+(1+βγ)1βγθ(dR)18+(1+βγ)36d1+d2+2.

Case 3: λ2(dL)<λ<λ1(dR). We may assume that d1d2. By Eq. 12, we obtain

|h(y1)||h(y2)|1βγβγey2y12+(1+βγ).

Since yiJdi for i=1,2 and β<1<γ, we have

ey2y1βd2γd1βdLγdR.

Meanwhile, we deduce from 35 that

θ(dL)18βdL+1λ2(dL)<λ<λ1(dR)18γdR+1θ(dR),

which implies

βγey2y12βdL+1γdR+1θ(dL)θ(dR)18θ(dL)18.

It follows that

|h(y1)||h(y2)|1βγβγey2y12+(1+βγ)1βγθ(dL)18+(1+βγ)36d1+d2+2.

E.3 Proofs of technical lemmas

Proof of 35.

1. For every 1<d<Δ we have

λcγd+1dd(d1)d+1=γd+1d1(dd1)d4γd+1d1,

where the last inequality follows from that (dd1)d4 for all integer d>1.

2. For every Δ¯d<Δ we have

x1(d)=2γθ(d)+θ(d)24βγ2γθ(d).

Observe that the function x+γβx+1 is monotone increasing in x when βγ<1, and thus we deduce that

x1(d)+γβx1(d)+12γθ(d)+γ2βγθ(d)+1=γ2+d(1βγ)(1+βγ)2βγ+d(1βγ)(1+βγ)=γd+1d1.

Therefore,

λ1(d)=x1(d)(x1(d)+γβx1(d)+1)d2γθ(d)γd(d+1d1)d18γd+1θ(d)

where the last inequality follows from that (d+1d1)d9 for all integer d>1.

The second part can be proved similarly. For every Δ¯d<Δ we have

x2(d)=θ(d)+θ(d)24βγ2βθ(d)2β,

and hence,

x2(d)+γβx2(d)+1θ(d)2β+γθ(d)2+1=1βd(1βγ)(1+βγ)+2βγd(1βγ)(1+βγ)+2=1βd1d+1.

We then conclude that

λ2(d)=x2(d)(x2(d)+γβx2(d)+1)dθ(d)2β1βd(d1d+1)dθ(d)18βd+1,

where the last inequality again follows from that (d+1d1)d9 for all integer d>1.

Proof of 37.

We deduce from the AM–GM inequality that

|h(y)|=|1βγ|βey+γey+1+β|1βγ|2βγ+1+β=|1βγ|1+βγ.

Appendix F Proofs for ferromagnetic cases

F.1 Proof of 26

Proof of 26.

Throughout, we use the “trivial potential” function Ψ(y)=y. Note that then, ψ(y)=1 is a constant function. Now, we prove Contraction and Boundedness. We split into the three cases.

  1. 1.

    We first prove the Contraction part. By 37, for all y[,+] we have

    |h(y)||1βγ|1+βγ1δΔ1.

    Now let us prove the Boundedness condition. From the above inequality we have

    |h(y)|1Δ11.5Δ

    for Δ3.

  2. 2.

    For the Contraction part, since log(λmax{1,1/γΔ1})yilog(λmax{1,βΔ1}), we have

    |Hd(𝒚)yi| =|h(yi)|=βγ11+βγ+γeyi+βeyiβγ11+βγ+γeyi
    βγ11+βγ+γλmax{1,βΔ1}.

    Since we assumed λ(1δ)γmax{1,βΔ1}((Δ2)βγΔ), it follows that we have the upper bound

    βγ11+βγ+(Δ2)βγΔ1δ =(1δ)βγ1(Δ1δ)βγ(Δ1+δ)
    =(1δ)βγ1(Δ1δ)(βγ1)+2δ
    1δΔ1δ(1Θ(δ))1Δ1.

    Now, we prove the Boundedness condition. Note that since λγmax{1,βΔ1}((Δ2)βγΔ), it follows that ylog(λmax{1,βΔ1})log(γ(Δ2)βγΔ). A simple calculation reveals that γ(Δ2)βγΔγβ and so by 37, we have

    |h(y)| |h(log(γ(Δ2)βγΔ))|(βγ1)elog(γ(Δ2)βγΔ)elog(γ(Δ2)βγΔ)+γ
    =(βγ1)11+(Δ2)βγΔ=βγ1(Δ2)(βγ1)1O(1/Δ).
  3. 3.

    For the Contraction part, since log(λmax{1,1/γΔ1})yilog(λmax{1,βΔ1}), we have

    |Hd(𝒚)yi| =|h(yi)|=βγ11+βγ+γeyi+βeyiβγ11+βγ+βeyi
    βγ11+βγ+βλmax{1,1/γΔ1}.

    Since we assumed λ11δ(Δ2)βγΔβmin{1,1/γΔ1}, it follows that we have the upper bound

    βγ11+βγ+(Δ2)βγΔ1δ

    which is again is upper bounded by (1Θ(δ))1Δ1 as we calculated in case 2 above.

    Now, we prove the Boundedness condition. Note that since λ(Δ2)βγΔβmin{1,1/γΔ2, it follows that ylog(λmin{1,1/γΔ1}log((Δ2)βγΔβ). A simple calculation reveals that (Δ2)βγΔβγβ and so by 37, we have

    |h(y)| |h(log((Δ2)βγΔβ))|(βγ1)1β(Δ2)βγΔβ+1
    =βγ1(Δ2)(βγ1)1O(1/Δ).

F.2 Proof of 27

In this subsection, we use results from [GL18] to prove 27. Their potential function is implicitly defined by its derivative for the marginal ratios as

Φ(R)=ϕ(R)=min{βγ1αγlogλ+γβλ+1,1RlogλR}

for a constant 0α1 depending only on β,γ,λ (see [GL18] for a precise definition). In our context, the corresponding potential for the log ratios is

Ψ(y)=ψ(y)=eyϕ(ey)=min{βγ1αγlogλ+γβλ+1ey,1logλey}

and is bounded by constants depending on β,γ,λ,Δ for log(λ/γΔ1)ylogλ.

One of the main technical results in [GL18] is showing that the tree recursion is contracting with the potential function Φ, and the derivative ϕ is bounded in the sense that there exist positive constants C1,C2 depending only on β,γ,λ such that C1ϕ(R)C2 for all 0Rλ. [GL18] refer to such a function as a universal potential function.

In our context, we get that Ψ is an (α,c)-potential function which satisfies 4, but with a constant c that depends on γ,Δ. Indeed, worst case, we have

maxy1,y2ψ(y2)ψ(y1)ψ(logλ)ψ(log(λ/γΔ1))=λβγ1αγlogλ+γβλ+1βγ1αlogλ+γβλ+1λγΔ=γΔ1.

More precisely, we have the following result from [GL18], stated in terms of the log marginal ratios.

Theorem 38.

Assume β,γ,λ are nonnegative real numbers satisfying β1γ, βγ1, and λ<(γβ)βγβγ1. Then the function Ψ is an (α,c)-potential function for a constant 0<α<1 depending on β,γ,λ, and a constant c>0 depending on β,γ,λ,Δ.

Combined with 5, this gives O(nC) mixing with a constant C depending only on β,γ,λ,Δ. We note this is weaker than the correlation decay result in [GL18], since there, C does not depend on Δ, and hence is efficient for arbitrary graphs.

Appendix G Slightly faster mixing

In this section, we slightly optimize our mixing time results for certain antiferromagnetic 2-spin systems by more carefully taking into account the tradeoff between the (nontrivial) spectral independence bound we prove based on contraction, and the (trivial) spectral independence bound we obtained in Section A.2 for handling constant-sized graphs.

Proposition 39.

Suppose a distribution μ on subsets of [n] is (η0,,ηn2)-spectrally independent for ηimin{a,(ni1)b}, for some a0 and 0b1. Then the Glauber dynamics for sampling from μ has spectral gap at least 1nΩ(abn)a

Proof.

Suppose we have already conditioned on c-fraction of elements to be “in/out. The resulting distribution is both b(1c)n-spectrally independent and a-spectrally independent. The exact threshold c for which the bound b(1c)n is better than a is given by

c=1abn

We note such a c only makes sense when 01abn1, or equivalently, bna. Now, we apply the a-spectral independence bound for all conditional distributions based on fixing at most c-fraction of vertices. We apply the (ni1)b-spectral independence otherwise. We obtain a final spectral gap lower bound of

1n(1b)(1c)nk=0cn(1ank1)

Observe that

(1b)(1c)n=(1b)abexp(a)

We also have

k=0cn(1ank1) exp(ak=0cn1nk1)
exp(a(k=0n21nk1lognk=cn+1n21nk1log(1c)n))
exp(alog11c)
exp(alogbna)
(abn)a

Putting these together, we obtain the desired lower bound.

With this result, we can apply it to the antiferromagnetic models with βγΔ2Δ,γ1 and β=0,γ1, since looking in the proof of 28, we have such systems are Cn-spectrally independent roughly with CO(1/Δ).

Corollary 40 (Soft Constraints).

Fix integers Δ3, 1<Δ¯<Δ. Let β,γ,λ0 be nonnegative real numbers satisfying Δ¯2Δ¯βγΔ¯1Δ¯+1 and γ1. Assume further that (β,γ,λ) is up-to-Δ unique with gap 0<δ<1. Then for every n-vertex graph G with maximum degree at most Δ, the Glauber dynamics for sampling from the antiferromagnetic 2-spin system with parameters (β,γ,λ) mixes in O(Δ¯nΔ)O(1/δ) steps.

Corollary 41 (Hard Constraints).

Fix an integer Δ3, fix β=0, and let 0γ1,λ0 be up-to-Δ unique with gap 0<δ<1. Then for every n-vertex graph G with maximum degree at most Δ, the Glauber dynamics for sampling from the antiferromagnetic 2-spin system with parameters (β,γ,λ)-mixes in O(nΔ)O(1/δ) steps.