Rapid Mixing of Glauber Dynamics up to Uniqueness via Contraction

Zongchen Chen, Georgia Institute of Technology. Email: chenzongchen@gatech.edu. Research supported in part by NSF grant CCF-2007022.

Kuikui Liu, University of Washington. Email: liukui17@cs.washington.edu. Research supported in part by NSF grant CCF-1907845 and ONR-YIP grant N00014-17-1-2429.

Eric Vigoda, University of California, Santa Barbara. Email: vigoda@ucsb.edu. Research supported in part by NSF grant CCF-2007022.

Abstract

For general antiferromagnetic $2$ -spin systems, including the hardcore model on weighted independent sets and the antiferromagnetic Ising model, Li et al. (2013) gave an $\mathsf{FPTAS}$ for the partition function on graphs of maximum degree $\Delta$ when the infinite regular tree lies in the uniqueness region. Moreover, in the tree non-uniqueness region, Sly (2010) showed that there is no $\mathsf{FPRAS}$ to estimate the partition function unless $\mathsf{NP}=\mathsf{RP}$ . The algorithmic results follow from the correlation decay approach due to Weitz (2006) or the polynomial interpolation approach developed by Barvinok (2016). However, the running time is polynomial only for constant $\Delta$ . For the hardcore model, recent work of Anari et al. (2020) establishes rapid mixing of the simple single-site Markov chain known as the Glauber dynamics in the tree uniqueness region. Our work simplifies their analysis of the Glauber dynamics by considering the total pairwise influence of a fixed vertex $v$ on other vertices, as opposed to the total influence of other vertices on $v$ , thereby extending their work to all 2-spin models and improving the mixing time.

More importantly, our proof ties together the three disparate algorithmic approaches: we show that contraction of the so-called tree recursions with a suitable potential function, which is the primary technique for establishing efficiency of Weitz’s correlation decay approach and Barvinok’s polynomial interpolation approach, also establishes rapid mixing of the Glauber dynamics. We emphasize that this connection holds for all 2-spin models (both antiferromagnetic and ferromagnetic), and existing proofs for the correlation decay or polynomial interpolation approach immediately imply rapid mixing of the Glauber dynamics. Our proof uses the fact that the graph partition function is a divisor of the partition function for Weitz’s self-avoiding walk tree. This fact leads to new tools for the analysis of the influence of vertices, and may be of independent interest for the study of complex zeros.

1 Introduction

A remarkable connection has been established between the computational complexity of approximate counting problems in general graphs of maximum degree $\Delta$ and the statistical physics phase transition on infinite, regular trees of degree $\Delta$ (or up to $\Delta$ in the more general case). This connection holds for 2-state antiferromagnetic spin systems – the hardcore model on independent sets and the Ising model are the most interesting examples of such systems.

Given an $n$ -vertex graph $G=(V,E)$ , configurations of the 2-spin model are the $2^{n}$ assignments of spins $\{0,1\}$ to the vertices. A 2-spin system is defined by three parameters: edge weights $\beta,\gamma>0$ and a vertex weight $\lambda>0$ . Edge parameter $\beta$ controls the (relative) strength of interaction between neighboring $1$ -spins, $\gamma$ corresponds to neighboring $0$ -spins, and $\lambda$ is the external field applied to vertices with $1$ -spins.

Every spin configuration $\sigma\in\{0,1\}^{V}$ is assigned a weight

w_{G}(\sigma)=\beta^{m_{1}(\sigma)}\gamma^{m_{0}(\sigma)}\lambda^{n_{1}(\sigma)},

where, for spin $s\in\{0,1\}$ , $m_{s}(\sigma)=\#\{uv\in E:\sigma_{u}=\sigma_{v}=s\}$ is the number of monochromatic edges with spin $s$ , and $n_{1}(\sigma)=\#\{v\in V:\sigma_{v}=1\}$ is the number of vertices with spin $1$ (as is standard, the parameters are normalized so we can avoid two additional parameters). The Gibbs distribution over spin configurations is given by $\mu_{G}(\sigma)=\frac{w_{G}(\sigma)}{Z_{G}(\beta,\gamma,\lambda)},$ where $Z_{G}(\beta,\gamma,\lambda)=\sum_{\sigma\in\{0,1\}^{V}}\beta^{m_{1}(\sigma)}\gamma^{m_{0}(\sigma)}\lambda^{n_{1}(\sigma)}$ is the partition function.
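The definitions above can be checked by brute force on very small graphs. The following is a minimal sketch, assuming vertices are labeled $0,\dots,n-1$ and edges are given as pairs; the function names `weight`, `partition_function`, and `gibbs` are illustrative and do not come from the paper. The enumeration takes $2^{n}$ steps, so it is only meant for sanity checks.

```python
from itertools import product

def weight(edges, sigma, beta, gamma, lam):
    """w_G(sigma) = beta^{m_1(sigma)} * gamma^{m_0(sigma)} * lam^{n_1(sigma)}."""
    m1 = sum(1 for u, v in edges if sigma[u] == 1 and sigma[v] == 1)
    m0 = sum(1 for u, v in edges if sigma[u] == 0 and sigma[v] == 0)
    n1 = sum(sigma)
    return beta ** m1 * gamma ** m0 * lam ** n1

def partition_function(n, edges, beta, gamma, lam):
    """Z_G(beta, gamma, lam): sum of weights over all 2^n configurations."""
    return sum(weight(edges, s, beta, gamma, lam)
               for s in product((0, 1), repeat=n))

def gibbs(n, edges, beta, gamma, lam):
    """Gibbs distribution mu_G as a dict mapping configurations to probabilities."""
    z = partition_function(n, edges, beta, gamma, lam)
    return {s: weight(edges, s, beta, gamma, lam) / z
            for s in product((0, 1), repeat=n)}

# Hardcore model (beta = 0, gamma = 1) on a path with three vertices.
path_edges = [(0, 1), (1, 2)]
z_path = partition_function(3, path_edges, 0.0, 1.0, 2.0)
```

With fugacity $2$ the path has independent sets of total weight $1+3\cdot 2+2^{2}=11$, which is what `z_path` evaluates to.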

There are two examples of particular interest: the hardcore model and the Ising model. When $\beta=0$ and $\gamma=1$ then the only configurations with non-zero weight are independent sets of $G$ and the weight of an independent set $\sigma$ is $w(\sigma)=\lambda^{|\sigma|}$ ; this example is known as the hardcore model where the parameter $\lambda$ corresponds to the fugacity.

In the case $\beta=\gamma$ then the important quantity is the total number of monochromatic edges $m(\sigma)=m_{0}(\sigma)+m_{1}(\sigma)$ and the weight of a configuration $\sigma$ is $w(\sigma)=\beta^{m(\sigma)}\lambda^{n_{1}(\sigma)}$ ; this is the classical Ising model where the parameter $\beta$ corresponds to the inverse temperature and $\lambda$ is the external field ( $\lambda=1$ means no external field). Note, when $\beta>1$ then the model is ferromagnetic as neighboring vertices prefer to have the same spin, and $\beta<1$ is the antiferromagnetic Ising model. In the general $2$ -spin system, the model is ferromagnetic when $\beta\gamma>1$ and antiferromagnetic when $\beta\gamma<1$ . (When $\beta\gamma=1$ we get a trivial product distribution.)

The fundamental algorithmic tasks are to sample from the Gibbs distribution and to estimate the partition function. For the approximate sampling problem we are given a graph $G$ and an $\epsilon>0$ and our goal is to generate a sample from a distribution $\pi$ which is within total variation distance $\leq\epsilon$ of the Gibbs distribution $\mu_{G}$ in time $\operatorname{poly}(n,\log(1/\epsilon))$ . An efficient approximate sampling algorithm implies an $\mathsf{FPRAS}$ (fully-polynomial randomized approximation scheme) for the approximate counting problem [JVV86, ŠVV09]. Recall, given an $n$ -vertex graph $G$ , and $\epsilon,\delta>0$ , an $\mathsf{FPRAS}$ outputs a $(1\pm\epsilon)$ -approximation of $Z_{G}$ with probability $\geq 1-\delta$ in time $\operatorname{poly}(n,1/\epsilon,\log(1/\delta))$ , whereas an $\mathsf{FPTAS}$ is the deterministic analog (i.e., $\delta=0$ ).

A standard approach to the approximate sampling problem is the Markov Chain Monte Carlo (MCMC) method; in fact there is a simple Markov chain known as the Glauber dynamics. The Glauber dynamics works as follows: from a configuration $X_{t}$ at time $t$ , choose a vertex $v$ uniformly at random, set $X_{t+1}(w)=X_{t}(w)$ for all $w\neq v$ , and finally choose $X_{t+1}(v)$ from the conditional distribution $\mu(\sigma_{v}\mid\sigma_{w}=X_{t+1}(w)\mbox{ for all }w\neq v)$ . For the hardcore model, $X_{t+1}(v)$ is set to occupied (i.e., spin $1$ ) with probability $\lambda/(1+\lambda)$ if no neighbors are currently occupied, and otherwise it is set to unoccupied.
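One step of the Glauber dynamics for the hardcore model can be sketched as follows. This is an illustrative sketch only: the adjacency-list representation, the function names, and the choice of the empty independent set as the starting state are assumptions made here, not details from the paper.

```python
import random

def glauber_step_hardcore(adj, state, lam, rng):
    """One update: pick a uniformly random vertex and resample its spin
    from the conditional distribution given the rest of the configuration."""
    v = rng.randrange(len(state))
    if any(state[u] == 1 for u in adj[v]):
        state[v] = 0  # an occupied neighbor forces v to be unoccupied
    else:
        state[v] = 1 if rng.random() < lam / (1.0 + lam) else 0
    return state

def run_glauber_hardcore(adj, lam, steps, seed=0):
    """Run the chain from the empty independent set for a fixed number of steps."""
    rng = random.Random(seed)
    state = [0] * len(adj)
    for _ in range(steps):
        glauber_step_hardcore(adj, state, lam, rng)
    return state

# 4-cycle 0-1-2-3-0 with fugacity 1.
cycle_adj = [[1, 3], [0, 2], [1, 3], [0, 2]]
sample = run_glauber_hardcore(cycle_adj, lam=1.0, steps=1000)
```

Every state visited is an independent set, because a vertex is only ever occupied when none of its neighbors is.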

It is straightforward to verify that the Glauber dynamics is ergodic with the Gibbs distribution as the unique stationary distribution. The mixing time is the minimum number of steps to guarantee, from the worst initial state $X_{0}$ , that the distribution of $X_{t}$ is within total variation distance $\leq 1/4$ of the Gibbs distribution. The goal is to prove that the mixing time is polynomial in $n$ , in which case the chain is said to be rapidly mixing.

For the case of the ferromagnetic Ising model (with or without an external field), a classical result of Jerrum and Sinclair [JS93] gives an $\mathsf{FPRAS}$ for all graphs via the MCMC method. This is the only case with an efficient algorithm for general graphs. For antiferromagnetic 2-spin models the picture is closely tied to statistical physics phase transitions on the regular tree.

The uniqueness/non-uniqueness phase transition is nicely illustrated for the case of the hardcore model. Consider the infinite $\Delta$ -regular tree $T$ rooted at $r$ , and let $T_{h}$ denote the tree truncated at the first $h$ levels. This phase transition captures whether the configuration at the leaves of $T_{h}$ “influences” the root, in the limit $h\rightarrow\infty$ . For the hardcore model we can consider even height trees (corresponding to the all even boundary condition) versus odd height trees. Let $p_{h}$ denote the marginal probability that the root is occupied in the Gibbs distribution $\mu_{T_{h}}$ . Let $p_{\mathsf{even}}=\lim_{h\rightarrow\infty}p_{2h}$ and $p_{\mathsf{odd}}=\lim_{h\rightarrow\infty}p_{2h+1}$ . We say that tree uniqueness holds if $p_{\mathsf{even}}=p_{\mathsf{odd}}$ and tree non-uniqueness holds if they are not equal. For all $\Delta\geq 3$ there exists a critical fugacity $\lambda_{c}(\Delta)=(\Delta-1)^{\Delta-1}/(\Delta-2)^{\Delta}$ [Kel85], where tree uniqueness holds iff $\lambda\leq\lambda_{c}(\Delta)$ .
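For concreteness, the critical fugacity $\lambda_{c}(\Delta)=(\Delta-1)^{\Delta-1}/(\Delta-2)^{\Delta}$ is easy to evaluate. The helper name below is an illustrative choice.

```python
def hardcore_critical_fugacity(max_degree):
    """lambda_c(Delta) = (Delta - 1)^(Delta - 1) / (Delta - 2)^Delta, for Delta >= 3."""
    if max_degree < 3:
        raise ValueError("the formula is stated for Delta >= 3")
    return (max_degree - 1) ** (max_degree - 1) / (max_degree - 2) ** max_degree

# lambda_c for Delta = 3, 4, 5
thresholds = [hardcore_critical_fugacity(d) for d in (3, 4, 5)]
```

For $\Delta=3,4,5$ this gives $4$, $27/16$, and $256/243$ respectively.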

The remarkable connection is that an algorithmic phase transition for general graphs of maximum degree $\Delta$ occurs at this same tree critical point. For all constant $\Delta$ , all $\delta>0$ , all $\lambda<(1-\delta)\lambda_{c}(\Delta)$ , all graphs of maximum degree $\Delta$ , [Wei06] presented an $\mathsf{FPTAS}$ for approximating the partition function. On the other side, for all $\delta>0$ , all $\lambda>(1+\delta)\lambda_{c}(\Delta)$ , [Sly10, SS14, GŠV16] proved that, unless $\mathsf{NP}=\mathsf{RP}$ , there is no $\mathsf{FPRAS}$ for estimating the partition function.

One important caveat is that the running time of Weitz’s algorithm is $(n/\epsilon)^{C\log\Delta}$ where the approximation factor is $(1\pm\epsilon)$ and the constant $C$ depends polynomially on the gap $\delta$ (recall, $\lambda<(1-\delta)\lambda_{c}$ ). Weitz’s correlation decay algorithm was extended to the antiferromagnetic Ising model in the tree uniqueness region by Sinclair et al. [SST14], and to all antiferromagnetic 2-spin systems in the corresponding tree uniqueness region (as we detail below) by Li, Lu, and Yin [LLY13].

An intriguing new algorithmic approach was presented by Barvinok [Bar16] and refined by Patel and Regts [PR17], utilizing the absence of zeros of the partition function in the complex plane to efficiently approximate a suitable transformation of the logarithm of the partition function using Taylor approximation. This polynomial interpolation approach was shown to be efficient in the same tree uniqueness region as for Weitz’s result by Peters and Regts [PR19], although the exponent in the running time depends exponentially on $\Delta$ .

It was long conjectured that the simple Glauber dynamics is rapidly mixing in the tree uniqueness region. This was recently proved by Anari, Liu, and Oveis Gharan [ALO20]; they proved, for all $\delta>0$ , the mixing time is $n^{O(\exp(1/\delta))}$ whenever $\lambda<(1-\delta)\lambda_{c}(\Delta)$ . We improve this result. First, we improve the mixing time from $n^{O(\exp(1/\delta))}$ to $n^{O(1/\delta)}$ as detailed in the following theorem.

Theorem 1 (Hardcore model).

Let $\Delta\geq 3$ be an integer and $\delta\in(0,1)$ . For every $n$ -vertex graph $G$ of maximum degree $\Delta$ and every $0<\lambda\leq(1-\delta)\lambda_{c}(\Delta)$ , the mixing time of the Glauber dynamics for the hardcore model on $G$ with fugacity $\lambda$ is $O(n^{2+32/\delta})$ .

This bound is optimal barring further improvements in the local-to-global arguments from [AL20]. Our improved result follows from a simpler, cleaner proof approach which enables us to extend our result to a wide variety of 2-spin models, matching the key results for the correlation decay algorithm with vastly improved running times.

Our proof approach unifies the three major algorithmic tools for approximate counting: correlation decay, polynomial interpolation, and MCMC. Most known results for both the correlation decay and polynomial interpolation approaches are proved by showing contraction of a suitably defined potential function on the so-called tree recursions; the tree recursions arise as a result of Weitz’s self-avoiding walk tree that we will describe in more detail later in this paper. A recent work of Shao and Sun [SS20] unifies these two approaches by showing that the contraction which is normally used to prove efficiency of the correlation decay algorithm also implies (under some additional analytic conditions) that the polynomial interpolation approach is efficient.

Here we prove that this same contraction of a potential function also implies rapid mixing of the Glauber dynamics, with our improved running time that is independent of $\Delta$ ; see Definition 4 and Theorem 5 for a detailed statement. Our proof utilizes several new tools concerning Weitz’s self-avoiding walk tree, which are detailed in Section 3. In particular, we show that the partition function of a graph $G$ divides the partition function of Weitz’s self-avoiding walk tree; see 8. This result is potentially of independent interest for establishing absence of zeros for the partition function with complex parameters, as it enables one to consider the self-avoiding walk tree. This result also yields a new, useful equivalence for bounding the influence in a graph in terms of the self-avoiding tree, which strengthens the previously known connection by Weitz [Wei06]; see 8 for details.

As an easy consequence we obtain rapid mixing for the Glauber dynamics for the antiferromagnetic Ising model in the tree uniqueness region. In terms of the edge activity, the two critical points for the Ising model on the $\Delta$ -regular tree are at $\beta_{c}(\Delta)=\frac{\Delta-2}{\Delta}$ and $\overline{\beta}_{c}(\Delta)=\frac{1}{\beta_{c}(\Delta)}=\frac{\Delta}{\Delta-2}$ ; the first lies in the antiferromagnetic regime, while the second lies in the ferromagnetic regime. If $\beta_{c}(\Delta)<\beta<\overline{\beta}_{c}(\Delta)$ , then uniqueness holds for every external field $\lambda$ on the $\Delta$ -regular tree.

As mentioned earlier, for the ferromagnetic Ising model, an $\mathsf{FPRAS}$ was known for general graphs [JS93]. Furthermore, Mossel and Sly [MS13] proved $O(n\log{n})$ mixing time of the Glauber dynamics for the ferromagnetic Ising model when $1\leq\beta<\overline{\beta}_{c}(\Delta)$ . However, rapid mixing for the antiferromagnetic Ising model in the tree uniqueness region was not known.

We provide the following mixing result for the case $\beta>\beta_{c}(\Delta)$ . Note, when $\beta\leq\beta_{c}$ there is an additional uniqueness region for certain values of the external field $\lambda$ ; this region is covered by Theorem 3.

Theorem 2 (Antiferromagnetic Ising Model).

Let $\Delta\geq 3$ be an integer and $\delta\in(0,1)$ . Assume that $1>\beta\geq\beta_{c}(\Delta)+\delta(1-\beta_{c}(\Delta))$ and $\lambda>0$ . Then for every $n$ -vertex graph $G$ of maximum degree $\Delta$ , the mixing time of the Glauber dynamics for the Ising model on $G$ with edge weight $\beta$ and external field $\lambda$ is $O(n^{2+1.5/\delta})$ .

Our results for the hardcore and Ising models fit within a larger framework of general antiferromagnetic 2-spin systems. Recall that the antiferromagnetic case is when $\beta\gamma<1$ .

For general 2-spin systems the appropriate tree phase transition is more complicated as there are models where the tree uniqueness threshold is not monotone in $\Delta$ . Hence the appropriate notion is “up-to- $\Delta$ uniqueness” as considered by [LLY13]. Roughly speaking, we say uniqueness with gap $\delta\in(0,1)$ holds on the $d$ -regular tree if for every integer $\ell\geq 1$ , all vertices at distance $\ell$ from the root have total “influence” $\lesssim(1-\delta)^{\ell}$ on the marginal of the root. We say up-to- $\Delta$ uniqueness with gap $\delta$ holds if uniqueness with gap $\delta$ holds on the $d$ -regular tree for all $1\leq d\leq\Delta$ ; see Section 2 for the precise definition.

Both Theorems 1 and 2 are corollaries of the following general rapid mixing result which holds for general antiferromagnetic $2$ -spin systems in the entire tree uniqueness region.

Theorem 3 (General antiferromagnetic $2$ -spin system).

Let $\Delta\geq 3$ be an integer and $\delta\in(0,1)$ . Let $\beta,\gamma,\lambda$ be reals such that $0\leq\beta\leq\gamma$ , $\gamma>0$ , $\beta\gamma<1$ and $\lambda>0$ . Assume that the parameters $(\beta,\gamma,\lambda)$ are up-to- $\Delta$ unique with gap $\delta$ . Then for every $n$ -vertex graph $G$ of maximum degree $\Delta$ , the mixing time of the Glauber dynamics for the antiferromagnetic $2$ -spin system on $G$ with parameters $(\beta,\gamma,\lambda)$ is $O(n^{2+72/\delta})$ .

We also match existing correlation decay results [GL18, SS20] for ferromagnetic 2-spin models; see Section 8 for results, and Appendix F for proofs.

1.1 Mixing by the potential method

The tree recursion is very useful in the study of approximate counting. Consider a tree $T$ rooted at $r$ . Suppose that $r$ has $d$ children, denoted by $v_{1},\dots,v_{d}$ . For $1\leq i\leq d$ we define $T_{v_{i}}$ to be the subtree of $T$ rooted at $v_{i}$ that contains all descendants of $v_{i}$ . Let $R_{r}=\mu_{T}(\sigma_{r}=1)/\mu_{T}(\sigma_{r}=0)$ denote the marginal ratio of the root, and $R_{v_{i}}=\mu_{T_{v_{i}}}(\sigma_{v_{i}}=1)/\mu_{T_{v_{i}}}(\sigma_{v_{i}}=0)$ for each subtree. The tree recursion is a formula that computes $R_{r}$ given $R_{v_{1}},\dots,R_{v_{d}}$ , due to the independence of the subtrees $T_{v_{i}}$ . More specifically, we can write $R_{r}=F_{d}(R_{v_{1}},\dots,R_{v_{d}})$ where $F_{d}:[0,+\infty]^{d}\to[0,+\infty]$ is a multivariate function such that for $(x_{1},\dots,x_{d})\in[0,+\infty]^{d}$ ,

F_{d}(x_{1},\dots,x_{d})=\lambda\prod_{i=1}^{d}\frac{\beta x_{i}+1}{x_{i}+\gamma}.

In this paper, however, we are particularly interested in the logarithms of the marginal ratios. The reason is that we will carefully study the pairwise influence matrix $\mathcal{I}_{G}$ of the Gibbs distribution $\mu_{G}$ , introduced in [ALO20] and defined for every $r,v\in V$ by

\mathcal{I}_{G}(r\to v)=\mu_{G}(\sigma_{v}=1\mid\sigma_{r}=1)-\mu_{G}(\sigma_{v}=1\mid\sigma_{r}=0).

In [ALO20], the authors show that if the maximum eigenvalue of $\mathcal{I}_{G}$ is bounded appropriately, then the Glauber dynamics is rapidly mixing. One crucial observation we make in this paper is that the influence $\mathcal{I}_{G}(r\to v)$ of $r$ on $v$ can be viewed as the derivative of $\log R_{r}$ with respect to the log external field at $v$ (see 12). Thus, it is more convenient for us to work with the log ratios. To this end, we rewrite the tree recursion as $\log R_{r}=H_{d}(\log R_{v_{1}},\dots,\log R_{v_{d}})$ where $H_{d}:[-\infty,+\infty]^{d}\to[-\infty,+\infty]$ is a function such that for $(y_{1},\dots,y_{d})\in[-\infty,+\infty]^{d}$ ,

H_{d}(y_{1},\dots,y_{d})=\log\lambda+\sum_{i=1}^{d}\log\left(\frac{\beta e^{y_{i}}+1}{e^{y_{i}}+\gamma}\right).

Observe that $H_{d}=\log\circ F_{d}\circ\exp$ . Moreover, we define

h(y)=-\frac{(1-\beta\gamma)e^{y}}{(\beta e^{y}+1)(e^{y}+\gamma)}

for $y\in[-\infty,+\infty]$ , so that $\frac{\partial}{\partial y_{i}}H_{d}(y_{1},\dots,y_{d})=h(y_{i})$ for each $i$ .
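The relations $H_{d}=\log\circ F_{d}\circ\exp$ and $\frac{\partial}{\partial y_{i}}H_{d}=h(y_{i})$ can be checked numerically. The sketch below is restricted to finite inputs (the endpoints $\pm\infty$ are not handled), and the parameter values and helper names are arbitrary choices made for illustration.

```python
import math

def F(xs, beta, gamma, lam):
    """Tree recursion for marginal ratios: lam * prod_i (beta*x_i + 1) / (x_i + gamma)."""
    out = lam
    for x in xs:
        out *= (beta * x + 1.0) / (x + gamma)
    return out

def H(ys, beta, gamma, lam):
    """Tree recursion for log marginal ratios."""
    return math.log(lam) + sum(
        math.log((beta * math.exp(y) + 1.0) / (math.exp(y) + gamma)) for y in ys
    )

def h(y, beta, gamma):
    """Claimed partial derivative of H with respect to any single coordinate."""
    e = math.exp(y)
    return -(1.0 - beta * gamma) * e / ((beta * e + 1.0) * (e + gamma))

def central_difference(ys, i, beta, gamma, lam, eps=1e-6):
    """Numerical estimate of dH/dy_i by a symmetric finite difference."""
    up = list(ys)
    down = list(ys)
    up[i] += eps
    down[i] -= eps
    return (H(up, beta, gamma, lam) - H(down, beta, gamma, lam)) / (2.0 * eps)

# Arbitrary antiferromagnetic parameters (beta * gamma < 1) and test point.
beta, gamma, lam = 0.5, 0.8, 1.5
ys = [0.3, -1.2, 2.0]
log_F = math.log(F([math.exp(y) for y in ys], beta, gamma, lam))
gaps = [abs(central_difference(ys, i, beta, gamma, lam) - h(ys[i], beta, gamma))
        for i in range(len(ys))]
```

Here `log_F` agrees with `H(ys, ...)` up to rounding, and every entry of `gaps` is at the level of finite-difference error.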

To prove our main results, we use the potential method, which has been widely used to establish the decay of correlation. By choosing a suitable potential function for the log ratios, we show that the total influence from a given vertex decays exponentially with the distance, and thus establish rapid mixing of the Glauber dynamics. Let us first specify our requirements on the potential. For every integer $d\geq 0$ , we define a bounded interval $J_{d}$ which contains all log ratios at a vertex of degree $d$ . More specifically, we let $J_{d}=\left[{\log(\lambda\beta^{d}),\log(\lambda/\gamma^{d})}\right]$ when $\beta\gamma<1$ , and $J_{d}=\left[{\log(\lambda/\gamma^{d}),\log(\lambda\beta^{d})}\right]$ when $\beta\gamma>1$ . Furthermore, define $J=\bigcup_{d=0}^{\Delta-1}J_{d}$ to be the interval containing all log ratios with degree less than $\Delta$ .

Definition 4 ( $(\alpha,c)$ -Potential function).

Let $\Delta\geq 3$ be an integer. Let $\beta,\gamma,\lambda$ be reals such that $0\leq\beta\leq\gamma$ , $\gamma>0$ and $\lambda>0$ . Let $\Psi:[-\infty,+\infty]\to(-\infty,+\infty)$ be a differentiable and increasing function with image $S=\Psi[-\infty,+\infty]$ and derivative $\psi=\Psi^{\prime}$ . For any $\alpha\in(0,1)$ and $c>0$ , we say $\Psi$ is an $(\alpha,c)$ -potential function with respect to $\Delta$ and $(\beta,\gamma,\lambda)$ if it satisfies the following conditions:

1. (Contraction) For every integer $d$ such that $1\leq d<\Delta$ and every $(\tilde{y}_{1},\dots,\tilde{y}_{d})\in S^{d}$ , we have

$\left\|{\nabla H_{d}^{\Psi}(\tilde{y}_{1},\dots,\tilde{y}_{d})}\right\|_{1}=\sum_{i=1}^{d}\frac{\psi(y)}{\psi(y_{i})}\cdot|h(y_{i})|\leq 1-\alpha$

where $H_{d}^{\Psi}=\Psi\circ H_{d}\circ\Psi^{-1}$ , $y_{i}=\Psi^{-1}(\tilde{y}_{i})$ for $1\leq i\leq d$ , and $y=H_{d}(y_{1},\dots,y_{d})$ .

2. (Boundedness) For every $y_{1},y_{2}\in J$ , we have

$\frac{\psi(y_{2})}{\psi(y_{1})}\cdot\left|{h(y_{1})}\right|\leq\frac{c}{\Delta}.$

In the definition of an $(\alpha,c)$ -potential, one should think of $y$ as the log marginal ratio at a vertex, so that $\Psi$ is a potential function of $\log R$ . The following theorem establishes rapid mixing of the Glauber dynamics given an $(\alpha,c)$ -potential function.

Theorem 5.

Let $\Delta\geq 3$ be an integer. Let $\beta,\gamma,\lambda$ be reals such that $0\leq\beta\leq\gamma$ , $\gamma>0$ and $\lambda>0$ . Suppose that there is an $(\alpha,c)$ -potential with respect to $\Delta$ and $(\beta,\gamma,\lambda)$ for some $\alpha\in(0,1)$ and $c>0$ . Then for every $n$ -vertex graph $G$ of maximum degree $\Delta$ , the mixing time of the Glauber dynamics for the $2$ -spin system on $G$ with parameters $(\beta,\gamma,\lambda)$ is $O(n^{2+c/\alpha})$ .

We outline our proofs in Section 3. Note that in both Definition 4 and Theorem 5, the constant $c$ is allowed to depend on the maximum degree $\Delta$ and parameters $(\beta,\gamma,\lambda)$ in general. For example, a straightforward black-box application of the potential in [LLY13] would give $c=\Theta(\Delta)$ for the Boundedness condition, resulting in $n^{\Theta(\Delta)}$ mixing. However, this is undesirable for graphs with potentially unbounded degrees. One of our contributions is that we show the Boundedness condition holds for a universal constant $c$ independent of $\Delta$ and $(\beta,\gamma,\lambda)$ . Thus, our mixing time is $O(n^{2+c/\delta})$ with no parameters in the exponent except for $1/\delta$ .

In Section 7, we give a slightly more general definition of $(\alpha,c)$ -potentials, which relaxes the Boundedness condition, and is necessary for our analysis of antiferromagnetic $2$ -spin systems with $0\leq\beta<1<\gamma$ . Theorem 5 still holds for this larger class of potentials.

We remark that in all previous works using the potential method, results and proofs are always presented in terms of $F_{d}$ , the tree recursion of $R$ , and $\Phi$ , a potential function of $R$ . In fact, our results can also be translated into the language of $(F_{d},\Phi)$ . To see this, since $H_{d}=\log\circ F_{d}\circ\exp$ , it is straightforward to check that $H_{d}^{\Psi}=\Psi\circ H_{d}\circ\Psi^{-1}=\Phi\circ F_{d}\circ\Phi^{-1}=F_{d}^{\Phi}$ if we pick $\Phi=\Psi\circ\log$ , and thereby $\nabla H_{d}^{\Psi}=\nabla F_{d}^{\Phi}$ . This implies that the Contraction condition in Definition 4 holds for $(H_{d},\Psi)$ if and only if the corresponding contraction condition holds for $(F_{d},\Phi)$ . The Boundedness condition can also be stated equivalently for $(F_{d},\Phi)$ . Nevertheless, in this paper we choose to work with $(H_{d},\Psi)$ for the following two reasons. First, as mentioned earlier, the fact that $\mathcal{I}_{G}(r\to v)$ is a derivative of $\log R_{r}$ makes it natural to consider the tree recursion for the log ratios. Indeed, it is easier and cleaner to present our results and proofs using $(H_{d},\Psi)$ directly rather than switching to $(F_{d},\Phi)$ . Second, the potential function $\Psi$ we will use is obtained from the exact potential $\Phi$ in [LLY13], by the transformation $\Psi=\Phi\circ\exp$ . (To be more precise, we also multiply by a constant factor, which only simplifies our calculation and does not matter much; also notice that [LLY13] denotes the potential function by $\varphi$ and its derivative by $\Phi=\varphi^{\prime}$ .) It is intriguing to notice that the derivative of this potential is simply $\psi=\sqrt{|h|}$ . Then the Contraction condition has a nice form: $\sum_{i=1}^{d}\sqrt{h(y)h(y_{i})}\leq 1-\alpha$ ; and the Boundedness condition only involves an upper bound on $|h(y)|$ . This seems to shed some light on the mysterious potential function $\Phi$ from [LLY13], and also indicates that $H_{d}$ is a meaningful variant of the tree recursion to consider. As further evidence, in many cases (e.g., $\frac{\Delta-2}{\Delta}<\sqrt{\beta\gamma}<\frac{\Delta}{\Delta-2}$ ) where the potential $\Phi=\log$ is picked, we can simply pick $\Psi$ to be the identity function, and $H_{d}$ itself is contracting without any nontrivial potential.
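As a rough numerical illustration of the Contraction condition with $\psi=\sqrt{|h|}$, one can evaluate $\sum_{i=1}^{d}\sqrt{h(y)h(y_{i})}$ with $y=H_{d}(y_{1},\dots,y_{d})$ on a finite grid. This is only a sketch under stated assumptions: the constant factor mentioned above is dropped (it cancels in the ratio $\psi(y)/\psi(y_{i})$), the grid bounds and step are arbitrary, and a grid search is of course not a proof. The example uses the hardcore model with $d=2$ and $\lambda=1$, which is below $\lambda_{c}(3)=4$.

```python
import math
from itertools import product

def h(y, beta, gamma):
    e = math.exp(y)
    return -(1.0 - beta * gamma) * e / ((beta * e + 1.0) * (e + gamma))

def H(ys, beta, gamma, lam):
    return math.log(lam) + sum(
        math.log((beta * math.exp(y) + 1.0) / (math.exp(y) + gamma)) for y in ys
    )

def contraction_sum(ys, beta, gamma, lam):
    """sum_i sqrt(h(y) * h(y_i)) where y = H_d(y_1, ..., y_d)."""
    y = H(ys, beta, gamma, lam)
    return sum(math.sqrt(h(y, beta, gamma) * h(yi, beta, gamma)) for yi in ys)

def max_on_grid(d, beta, gamma, lam, lo=-6.0, hi=6.0, step=0.25):
    """Largest value of the contraction sum over a finite grid (illustrative only)."""
    count = int(round((hi - lo) / step)) + 1
    pts = [lo + k * step for k in range(count)]
    return max(contraction_sum(ys, beta, gamma, lam)
               for ys in product(pts, repeat=d))

# Hardcore model: beta = 0, gamma = 1.
grid_max = max_on_grid(d=2, beta=0.0, gamma=1.0, lam=1.0)
```

On this grid the maximum stays strictly below $1$, consistent with contraction for these parameters.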

Revision in July 2021.

After the publication of this paper in FOCS 2020, a small error was found in [LLY13] regarding descriptions of the uniqueness region for antiferromagnetic 2-spin systems. The error was fixed in the latest version of [LLY13]. In this revision, we update corresponding results and proofs in Section 7 and Appendix E that rely on the changes in [LLY13]; in particular, 36 is adjusted in accordance with the current description of uniqueness regions. We remark that these changes are purely technical and do not affect the validity of our main results such as Theorem 5.

Acknowledgments.

We would like to thank Shayan Oveis Gharan and Nima Anari for stimulating discussions. We also thank the anonymous referees for helpful comments and suggestions. We are grateful to Yitong Yin for communicating with us about the latest update of [LLY13] and for providing helpful instructions on modifying statements and proofs of results in Appendix E, particularly 36.

2 Preliminaries

Mixing time and spectral gap

Let $P$ be the transition matrix of an ergodic (i.e., irreducible and aperiodic) Markov chain on a finite state space $\Omega$ with stationary distribution $\mu$ . Let $P^{t}(x_{0},\cdot)$ denote the distribution of the chain after $t$ steps starting from $x_{0}\in\Omega$ . The mixing time of $P$ is defined as

T_{\mathrm{mix}}(P)=\max_{x_{0}\in\Omega}\min\left\{t\geq 0:\left\|{P^{t}(x_{0},\cdot)-\mu(\cdot)}\right\|_{\mathrm{TV}}\leq\frac{1}{4}\right\}.

We say $P$ is reversible if $\mu(x)P(x,y)=\mu(y)P(y,x)$ for all $x,y\in\Omega$ . If $P$ is reversible, then $P$ has only real eigenvalues which can be denoted by $1=\lambda_{1}\geq\dots\geq\lambda_{|\Omega|}\geq-1$ . The spectral gap of $P$ is defined to be $1-\lambda_{2}$ and the absolute spectral gap of $P$ is defined as $\lambda^{*}(P)=1-\max\{|\lambda_{2}|,|\lambda_{|\Omega|}|\}$ . If $P$ is also positive semidefinite with respect to the inner product $\langle\cdot,\cdot\rangle_{\mu}$ , then all eigenvalues of $P$ are nonnegative and thus $\lambda^{*}(P)=1-\lambda_{2}$ . Finally, the mixing time and the absolute spectral gap are related by

T_{\mathrm{mix}}(P)\leq\frac{1}{\lambda^{*}(P)}\log\left(\frac{4}{\min_{x\in\Omega}\mu(x)}\right). \tag{1}

Uniqueness

Let $\Delta\geq 3$ be an integer or $\Delta=\infty$ . Let $\beta,\gamma,\lambda$ be reals such that $0\leq\beta\leq\gamma$ , $\gamma>0$ , $\beta\gamma<1$ and $\lambda>0$ . For $1\leq d<\Delta$ , define

f_{d}(R)=\lambda\left({\frac{\beta R+1}{R+\gamma}}\right)^{d}

and denote the unique fixed point of $f_{d}$ by $R_{d}^{*}$ . For $\delta\in(0,1)$ , we say the parameters $(\beta,\gamma,\lambda)$ are up-to- $\Delta$ unique with gap $\delta$ if $|f^{\prime}_{d}(R^{*}_{d})|<1-\delta$ for all $1\leq d<\Delta$ .
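The uniqueness condition can be tested numerically for finite $\Delta$. The sketch below assumes $\beta\gamma<1$, so that $f_{d}$ is decreasing and $R-f_{d}(R)$ changes sign exactly once on $[0,\lambda/\gamma^{d}]$, which makes bisection applicable. The function names and the argument names `max_degree` (for $\Delta$) and `gap` (for $\delta$) are illustrative, and the case $\Delta=\infty$ is not covered.

```python
def f(R, d, beta, gamma, lam):
    """f_d(R) = lam * ((beta*R + 1) / (R + gamma))^d."""
    return lam * ((beta * R + 1.0) / (R + gamma)) ** d

def f_prime(R, d, beta, gamma, lam):
    """Derivative of f_d, obtained by differentiating the formula above."""
    return (d * f(R, d, beta, gamma, lam) * (beta * gamma - 1.0)
            / ((beta * R + 1.0) * (R + gamma)))

def fixed_point(d, beta, gamma, lam, iters=200):
    """Bisection for the root of R - f_d(R) on [0, lam / gamma^d]."""
    lo, hi = 0.0, lam / gamma ** d
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid - f(mid, d, beta, gamma, lam) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def up_to_delta_unique(max_degree, beta, gamma, lam, gap):
    """Check |f_d'(R_d^*)| < 1 - gap for all 1 <= d < max_degree."""
    return all(
        abs(f_prime(fixed_point(d, beta, gamma, lam), d, beta, gamma, lam)) < 1.0 - gap
        for d in range(1, max_degree)
    )

# Hardcore model (beta = 0, gamma = 1) with Delta = 3.
critical_fixed_point = fixed_point(2, 0.0, 1.0, 4.0)
```

At the critical fugacity $\lambda_{c}(3)=4$ the fixed point for $d=2$ is $R^{*}=1$ and $f_{2}^{\prime}(1)=-1$, so no positive gap is possible there.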

Ratio and influence

Consider the $2$ -spin system on a graph $G=(V,E)$ . Let $\Lambda\subseteq V$ and ${\sigma_{\Lambda}}\in\{0,1\}^{\Lambda}$ . For all $v\in V\backslash\Lambda$ , we define the marginal ratio at $v$ to be

R_{G}^{\sigma_{\Lambda}}(v)=\frac{\mu_{G}(\sigma_{v}=1\mid{\sigma_{\Lambda}})}{\mu_{G}(\sigma_{v}=0\mid{\sigma_{\Lambda}})}.

For all $u,v\in V\backslash\Lambda$ , we define the (pairwise) influence of $u$ on $v$ by

\mathcal{I}_{G}^{\sigma_{\Lambda}}(u\to v)=\mu_{G}(\sigma_{v}=1\mid\sigma_{u}=1,\,{\sigma_{\Lambda}})-\mu_{G}(\sigma_{v}=1\mid\sigma_{u}=0,\,{\sigma_{\Lambda}}).

Write $\mathcal{I}_{G}^{\sigma_{\Lambda}}$ for the (pairwise) influence matrix whose entries are given by $\mathcal{I}_{G}^{\sigma_{\Lambda}}(u\to v)$ .
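On a very small graph the influence matrix can be computed directly from the definition by enumeration. The sketch below assumes vertices labeled $0,\dots,n-1$, represents the pinning $\sigma_{\Lambda}$ as a dictionary, and assumes every conditioning event has positive probability (in the hardcore model, for example, a vertex adjacent to a pinned occupied vertex cannot itself be conditioned to be occupied). All names are illustrative.

```python
from itertools import product

def weight(edges, sigma, beta, gamma, lam):
    m1 = sum(1 for u, v in edges if sigma[u] == 1 and sigma[v] == 1)
    m0 = sum(1 for u, v in edges if sigma[u] == 0 and sigma[v] == 0)
    return beta ** m1 * gamma ** m0 * lam ** sum(sigma)

def cond_prob_one(n, edges, v, pinned, beta, gamma, lam):
    """mu_G(sigma_v = 1 | pinned), where pinned maps vertices to spins."""
    num = den = 0.0
    for s in product((0, 1), repeat=n):
        if any(s[u] != spin for u, spin in pinned.items()):
            continue
        w = weight(edges, s, beta, gamma, lam)
        den += w
        if s[v] == 1:
            num += w
    return num / den  # assumes the conditioning event has positive weight

def influence_matrix(n, edges, beta, gamma, lam, pinned=None):
    """Entries I(u -> v) for all unpinned u, v, as a dict keyed by (u, v)."""
    pinned = dict(pinned or {})
    free = [v for v in range(n) if v not in pinned]
    infl = {}
    for u in free:
        for v in free:
            p1 = cond_prob_one(n, edges, v, {**pinned, u: 1}, beta, gamma, lam)
            p0 = cond_prob_one(n, edges, v, {**pinned, u: 0}, beta, gamma, lam)
            infl[(u, v)] = p1 - p0
    return infl

# Hardcore model with fugacity 1 on a single edge {0, 1}.
edge_influence = influence_matrix(2, [(0, 1)], 0.0, 1.0, 1.0)
```

For the single edge, occupying vertex $0$ forces vertex $1$ to be unoccupied, so the influence of $0$ on $1$ is $0-\lambda/(1+\lambda)=-1/2$, and the diagonal entries equal $1$.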

Weitz’s self-avoiding walk tree

Let $G=(V,E)$ be a connected graph and $r\in V$ be a vertex of $G$ . The self-avoiding walk (SAW) tree is defined as follows. Suppose that there is a total ordering of the vertex set $V$ . A self-avoiding walk from $r$ is a path $r=v_{0}-v_{1}-\dots-v_{\ell}$ such that $v_{i}\neq v_{j}$ for all $0\leq i<j\leq\ell$ . The SAW tree $T_{\textsc{saw}}(G,r)$ is a tree rooted at $r$ , consisting of all self-avoiding walks $r=v_{0}-v_{1}-\dots-v_{\ell}$ with $\deg(v_{\ell})=1$ , and those appended with one more vertex that closes the cycle (i.e., $r=v_{0}-v_{1}-\dots-v_{\ell}-v_{i}$ for some $0\leq i\leq\ell-2$ such that $\{v_{\ell},v_{i}\}\in E$ ). Note that a vertex of $G$ might have many copies in the SAW tree, and the degrees of vertices are preserved except for leaves. See Fig. 1 for an example.

We can define a $2$ -spin system on $T_{\textsc{saw}}(G,r)$ with the same parameters $(\beta,\gamma,\lambda)$ , in which some of the leaves are fixed to a particular spin. More specifically, for a self-avoiding walk $r=v_{0}-v_{1}-\dots-v_{\ell}$ appended with $v_{i}$ , we fix $v_{i}$ to be spin $1$ if $v_{i+1}<v_{\ell}$ with respect to the total ordering on $V$ , and spin $0$ if $v_{i+1}>v_{\ell}$ . For each $v\in V$ we denote the set of all free (unfixed) copies of $v$ in $T_{\textsc{saw}}(G,r)$ by $\mathcal{C}_{v}$ . For $\Lambda\subseteq V$ and a partial configuration $\sigma_{\Lambda}\in\{0,1\}^{\Lambda}$ , we define the SAW tree with conditioning ${\sigma_{\Lambda}}$ by assigning the spin $\sigma_{v}$ to every copy $\hat{v}$ of $v$ from $\mathcal{C}_{v}$ and removing all descendants of $\hat{v}$ , for each $v\in\Lambda$ . Note that in general, different copies of $v$ from $\mathcal{C}_{v}$ can receive different spin assignments. Finally, in the case that every vertex $v$ has a distinct field $\lambda_{v}$ , all copies of $v$ from $\mathcal{C}_{v}$ will have the same field $\lambda_{v}$ in the SAW tree.
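The construction above can be exercised on small graphs by evaluating the tree recursion $F_{d}$ along self-avoiding walks, without materializing the tree. The sketch below assumes a simple graph given as adjacency lists over vertices $0,\dots,n-1$, uses the natural order on labels as the total ordering, assumes $\gamma>0$, and does no conditioning ($\Lambda=\emptyset$). A leaf fixed to spin $1$ contributes the factor $\beta$ (the limit of $(\beta x+1)/(x+\gamma)$ as $x\to\infty$) and a leaf fixed to spin $0$ contributes $1/\gamma$ (the value at $x=0$). The comparison with a brute-force computation on $G$ relies on Weitz's result that the root marginal is preserved; the function names are illustrative.

```python
from itertools import product

def saw_ratio(adj, path, beta, gamma, lam):
    """Marginal ratio at the end of `path` within the SAW tree rooted at path[0]."""
    v = path[-1]
    ratio = lam
    for u in adj[v]:
        if len(path) >= 2 and u == path[-2]:
            continue  # stepping straight back to the parent is not allowed
        if u in path:
            # u = v_i closes a cycle: the appended copy of v_i is a fixed leaf,
            # with spin 1 if v_{i+1} < v_l and spin 0 otherwise.
            i = path.index(u)
            ratio *= beta if path[i + 1] < v else 1.0 / gamma
        else:
            x = saw_ratio(adj, path + [u], beta, gamma, lam)
            ratio *= (beta * x + 1.0) / (x + gamma)
    return ratio

def brute_force_ratio(adj, r, beta, gamma, lam):
    """mu_G(sigma_r = 1) / mu_G(sigma_r = 0) by enumerating all configurations."""
    n = len(adj)
    edges = [(u, v) for u in range(n) for v in adj[u] if u < v]
    z = [0.0, 0.0]
    for s in product((0, 1), repeat=n):
        m1 = sum(1 for u, v in edges if s[u] == 1 and s[v] == 1)
        m0 = sum(1 for u, v in edges if s[u] == 0 and s[v] == 0)
        z[s[r]] += beta ** m1 * gamma ** m0 * lam ** sum(s)
    return z[1] / z[0]

triangle = [[1, 2], [0, 2], [0, 1]]
tree_value = saw_ratio(triangle, [0], 0.5, 0.8, 1.5)
graph_value = brute_force_ratio(triangle, 0, 0.5, 0.8, 1.5)
```

For the triangle, the two values agree up to rounding; in the hardcore case with fugacity $\lambda$ both equal $\lambda/(1+2\lambda)$.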

Figure 1: A graph $G$ and the self-avoiding walk tree $T_{\textsc{saw}}(G,r)$ rooted at $r$ . Vertices with the same label in $T_{\textsc{saw}}(G,r)$ are copies of the same vertex from $G$ . ( $\CIRCLE$ / $\Circle$ : fixed to spin $1$ / $0$ .)

3 Proof outline for main results

Step 1 ([ALO20]): Spectral Independence implies rapid mixing.

Our proof builds on [ALO20] who showed that the Glauber dynamics for sampling from the hardcore distribution on graphs of maximum degree at most $\Delta$ mixes in $O(n^{\exp(O(1/\delta))})$ steps whenever $\lambda\leq(1-\delta)\lambda_{c}(\Delta)$ . One of the key ingredients of their proof is a notion they call spectral independence. [ALO20] shows that the spectral independence property implies rapid mixing. Note that the diagonal entries of $\mathcal{I}_{G}^{\sigma_{\Lambda}}$ are $1$ , as opposed to $0$ in the original definition in [ALO20].

Definition 6 (Spectral Independence [ALO20]).

We say the Gibbs distribution $\mu_{G}$ on an $n$ -vertex graph $G$ is $(\eta_{0},\dots,\eta_{n-2})$ -spectrally independent, if for every $0\leq k\leq n-2$ , $\Lambda\subseteq V$ of size $k$ and $\sigma_{\Lambda}\in\{0,1\}^{\Lambda}$ , one has $\lambda_{\max}(\mathcal{I}_{G}^{\sigma_{\Lambda}})-1\leq\eta_{k}$ .

Theorem 7 ([ALO20]).

If $\mu$ is an $(\eta_{0},\dots,\eta_{n-2})$ -spectrally independent distribution, then the Glauber dynamics for sampling from $\mu$ has spectral gap at least

\frac{1}{n}\,\prod_{i=0}^{n-2}\left({1-\frac{\eta_{i}}{n-i-1}}\right).
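To get a feel for the bound in Theorem 7, it can be evaluated numerically. The following Python sketch is illustrative only: the function name `spectral_gap_lower_bound` and the sample values of $\eta_{i}$ are assumptions made for this example and do not come from [ALO20] or from this paper.

```python
def spectral_gap_lower_bound(etas):
    """Evaluate (1/n) * prod_{i=0}^{n-2} (1 - eta_i / (n - i - 1)).

    `etas` is the list (eta_0, ..., eta_{n-2}), so n = len(etas) + 1.
    """
    n = len(etas) + 1
    bound = 1.0 / n
    for i, eta in enumerate(etas):
        bound *= 1.0 - eta / (n - i - 1)
    return bound

# Hypothetical inputs: with eta_i = 0 for all i the bound reduces to 1/n.
print(spectral_gap_lower_bound([0.0] * 9))   # n = 10
# A constant eta_i = 0.5; the last factor (i = n - 2) is 1 - 0.5 / 1.
print(spectral_gap_lower_bound([0.5] * 9))
```

Note that a factor becomes non-positive as soon as $\eta_{i}\geq n-i-1$ , in which case the product formula evaluated here no longer gives a useful bound.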

Our primary goal now is to bound the maximum eigenvalue of $\mathcal{I}_{G}^{\sigma_{\Lambda}}$ .

Step 2: Self-avoiding walk trees preserve influences.

From standard linear algebra, we know that the maximum eigenvalue of $\mathcal{I}_{G}^{\sigma_{\Lambda}}$ is upper bounded by both the $1$ -norm $\left\|{\mathcal{I}_{G}^{\sigma_{\Lambda}}}\right\|_{1}=\max_{r\in V}\sum_{v\in V}|\mathcal{I}_{G}^{\sigma_{\Lambda}}(v\text{\scriptsize{~{}$\rightarrow$~{}}}r)|$ , which corresponds to total influences on a vertex $r$ , and the infinity-norm $\left\|{\mathcal{I}_{G}^{\sigma_{\Lambda}}}\right\|_{\infty}=\max_{r\in V}\sum_{v\in V}|\mathcal{I}_{G}^{\sigma_{\Lambda}}(r\text{\scriptsize{~{}$\rightarrow$~{}}}v)|$ , corresponding to total influences of $r$ . In [ALO20] the authors use $\left\|{\mathcal{I}_{G}^{\sigma_{\Lambda}}}\right\|_{1}$ as an upper bound on $\lambda_{\max}(\mathcal{I}_{G}^{\sigma_{\Lambda}})$ . Roughly speaking, they show that the sum of absolute influences on a fixed vertex $r$ is upper bounded by the maximum absolute influences on $r$ in the self-avoiding walk tree rooted at $r$ , over all boundary conditions. In this paper, we instead use $\left\|{\mathcal{I}_{G}^{\sigma_{\Lambda}}}\right\|_{\infty}$ to upper bound $\lambda_{\max}(\mathcal{I}_{G}^{\sigma_{\Lambda}})$ . In fact, much more is true if we look at the influences from $r$ in the self-avoiding walk tree. We show that for every vertex $v\in V$ , the influence $\mathcal{I}_{G}^{\sigma_{\Lambda}}(r\text{\scriptsize{~{}$\rightarrow$~{}}}v)$ in $G$ is preserved in the self-avoiding walk tree $T=T_{\textsc{saw}}(G,r)$ rooted at $r$ , in the form of a sum of influences $\mathcal{I}_{T}^{\sigma_{\Lambda}}(r\text{\scriptsize{~{}$\rightarrow$~{}}}\hat{v})$ over all copies $\hat{v}$ of $v$ .

The way we establish this fact is by viewing the partition function as a polynomial in $\lambda$ . In fact, it will be useful to consider the more general case with an arbitrary external field $\lambda_{v}$ for every $v\in V$ . Let $\bm{\lambda}=\{\lambda_{v}:v\in V\}$ denote the fields. For $\Lambda\subseteq V$ and ${\sigma_{\Lambda}}\in\{0,1\}^{\Lambda}$ , the weight of $\sigma\in\{0,1\}^{V\backslash\Lambda}$ conditional on ${\sigma_{\Lambda}}$ is defined to be $w_{G}(\sigma\mid{\sigma_{\Lambda}})=\beta^{m_{1}(\sigma\mid{\sigma_{\Lambda}})}\gamma^{m_{0}(\sigma\mid{\sigma_{\Lambda}})}\prod_{v\in V\backslash\Lambda}\lambda_{v}^{\sigma_{v}}$ where $m_{i}(\cdot\mid{\sigma_{\Lambda}})$ is the number of $i$ - $i$ edges with at least one endpoint in $V\backslash\Lambda$ for $i=0,1$ . Furthermore, $Z_{G}^{\sigma_{\Lambda}}=\sum_{\sigma\in\{0,1\}^{V\backslash\Lambda}}w_{G}(\sigma\mid{\sigma_{\Lambda}})$ is the partition function conditioned on ${\sigma_{\Lambda}}$ . We shall view $\beta$ and $\gamma$ as some fixed constants and think of $\bm{\lambda}$ as $n=|V|$ variables. In this sense, we regard the weights $w_{G}(\sigma\mid{\sigma_{\Lambda}})$ as monomials in $\bm{\lambda}$ and the partition function $Z_{G}^{\sigma_{\Lambda}}$ as a polynomial in $\bm{\lambda}$ . Moreover, the marginal ratios $R_{G}^{\sigma_{\Lambda}}(v)$ and the influences $\mathcal{I}_{G}^{\sigma_{\Lambda}}(r\text{\scriptsize{~{}$\rightarrow$~{}}}v)$ for $r,v\in V$ are all functions of $\bm{\lambda}$ . Our main result is that the partition function of $G$ divides that of $T_{\textsc{saw}}(G,r)$ for each $r\in V$ . From that, we show that the SAW tree preserves influences of the root, as well as re-establishing Weitz’s celebrated result [Wei06], see Lemma 13.
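The definitions of the conditional weight and partition function can be made concrete by brute-force enumeration on a tiny graph. The sketch below is only an illustration under our own conventions: the function name `partition_function`, the vertex labelling $0,\dots,n-1$ and the argument `pinned` (standing for ${\sigma_{\Lambda}}$ ) are assumptions of this example, and the enumeration is exponential in the number of free vertices.

```python
from itertools import product

def partition_function(n, edges, beta, gamma, lam, pinned=None):
    """Brute-force Z_G^{sigma_Lambda} for a 2-spin system on vertices 0..n-1.

    lam[v] is the external field at v; `pinned` maps vertices of Lambda to spins.
    Only edges with at least one free endpoint and fields of free vertices count.
    """
    pinned = pinned or {}
    free = [v for v in range(n) if v not in pinned]
    total = 0.0
    for spins in product((0, 1), repeat=len(free)):
        sigma = dict(pinned)
        sigma.update(zip(free, spins))
        weight = 1.0
        for a, b in edges:
            if a in pinned and b in pinned:
                continue                      # edge lies entirely inside Lambda
            if sigma[a] == sigma[b]:
                weight *= beta if sigma[a] == 1 else gamma
        for v in free:
            if sigma[v] == 1:
                weight *= lam[v]
        total += weight
    return total

# A single edge {0, 1}: Z = gamma + lam_0 + lam_1 + beta * lam_0 * lam_1.
beta, gamma, lam = 0.5, 0.8, [1.5, 2.0]
print(partition_function(2, [(0, 1)], beta, gamma, lam))
# Pinning vertex 1 to spin 1 leaves Z = 1 + beta * lam_0.
print(partition_function(2, [(0, 1)], beta, gamma, lam, pinned={1: 1}))
```

Each term of the sum is a monomial in the fields of the free vertices, which is the sense in which $Z_{G}^{\sigma_{\Lambda}}$ is treated as a polynomial in $\bm{\lambda}$ .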

Lemma 8.

Let $G=(V,E)$ be a connected graph, $r\in V$ be a vertex and $\Lambda\subseteq V\backslash\{r\}$ such that $G\backslash\Lambda$ is connected. Let $T=T_{\textsc{saw}}(G,r)$ be the self-avoiding walk tree of $G$ rooted at $r$ . Then for every ${\sigma_{\Lambda}}\in\{0,1\}^{\Lambda}$ , $Z_{G}^{\sigma_{\Lambda}}$ divides $Z_{T}^{\sigma_{\Lambda}}$ . More precisely, there exists a polynomial $P_{G,r}^{\sigma_{\Lambda}}=P_{G,r}^{\sigma_{\Lambda}}(\bm{\lambda})$ independent of $\lambda_{r}$ such that

\displaystyle Z_{T}^{\sigma_{\Lambda}}=Z_{G}^{\sigma_{\Lambda}}\cdot P_{G,r}^{\sigma_{\Lambda}}.

(2)

As a corollary, for each vertex $v\in V$ ,

\displaystyle\mathcal{I}_{G}^{\sigma_{\Lambda}}(r\text{\scriptsize{~{}$\rightarrow$~{}}}v)=\sum_{\hat{v}\in\mathcal{C}_{v}}\mathcal{I}_{T}^{\sigma_{\Lambda}}(r\text{\scriptsize{~{}$\rightarrow$~{}}}\hat{v}),

(3)

where $\mathcal{C}_{v}$ is the set of all free (unfixed) copies of $v$ in $T$ .
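Eq. 3 can be checked numerically on a very small example. The following Python sketch builds the SAW tree of a triangle using the fixing rule described above and compares both sides of Eq. 3 by exhaustive enumeration. It is a sanity check rather than part of the argument: the helper names `saw_tree` and `influence`, the integer node ids, and the parameter values are all assumptions of this example, and a uniform field $\lambda$ is used for simplicity.

```python
from itertools import product

def saw_tree(adj, r):
    """Build T_saw(G, r). Returns (edges, label, fixed): tree nodes are integers,
    label[node] is the vertex of G it copies, and fixed[node] is the pinned spin
    of a cycle-closing leaf."""
    label, edges, fixed = {0: r}, [], {}

    def extend(node, walk):
        for w in adj[walk[-1]]:
            if len(walk) >= 2 and w == walk[-2]:
                continue                      # do not walk straight back
            child = len(label)
            label[child] = w
            edges.append((node, child))
            if w in walk:                     # the walk closes a cycle at w = v_i
                i = walk.index(w)
                fixed[child] = 1 if walk[i + 1] < walk[-1] else 0
            else:
                extend(child, walk + [w])

    extend(0, [r])
    return edges, label, fixed

def influence(nodes, edges, fixed, r, v, beta, gamma, lam):
    """mu(v=1 | r=1) - mu(v=1 | r=0), by enumerating all free spins."""
    others = [u for u in nodes if u not in fixed and u != r]

    def marginal_given(root_spin):
        num = den = 0.0
        for spins in product((0, 1), repeat=len(others)):
            sigma = dict(fixed)
            sigma.update(zip(others, spins))
            sigma[r] = root_spin
            w = 1.0
            for a, b in edges:
                if sigma[a] == sigma[b]:
                    w *= beta if sigma[a] == 1 else gamma
            # The field at r is a common factor once r is conditioned on.
            w *= lam ** sum(sigma[u] for u in others)
            den += w
            num += w * sigma[v]
        return num / den

    return marginal_given(1) - marginal_given(0)

# Triangle on {0, 1, 2}, rooted at r = 0 (parameters chosen arbitrarily).
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
g_edges = [(0, 1), (0, 2), (1, 2)]
beta, gamma, lam = 0.3, 0.9, 1.7
t_edges, label, fixed = saw_tree(adj, 0)
for v in (1, 2):
    in_graph = influence(adj.keys(), g_edges, {}, 0, v, beta, gamma, lam)
    copies = [x for x in label if label[x] == v and x not in fixed]
    in_tree = sum(influence(label.keys(), t_edges, fixed, 0, x, beta, gamma, lam)
                  for x in copies)
    print(v, in_graph, in_tree)   # the two values should agree up to rounding
```

For the triangle the tree has seven nodes, two of which are pinned leaves, so each vertex $v\neq r$ has exactly two free copies.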

Remark 1.

We emphasize that for the purposes of bounding the total influence of a vertex in $G$ , only Eq. 3 of Lemma 8 is needed, which can be proved in a purely combinatorial fashion. However, we believe the divisibility property Eq. 2 of the multivariate partition function of $G$ and its self-avoiding walk tree may be of independent interest.

We note that a univariate version of the divisibility statement Eq. 2 has already appeared in [Ben18] for the hardcore model and [LSS19] for the zero-field Ising model in the study of complex roots of the partition function. From Lemma 8, we can get $\sum_{v\in V}|\mathcal{I}_{G}^{\sigma_{\Lambda}}(r\text{\scriptsize{~{}$\rightarrow$~{}}}v)|\leq\sum_{v\in V_{T}}|\mathcal{I}_{T}^{\sigma_{\Lambda}}(r\text{\scriptsize{~{}$\rightarrow$~{}}}v)|$ for any fixed $r$ . This means that we only need to upper bound the sum of all influences for trees in order to get an upper bound on $\lambda_{\max}(\mathcal{I}_{G}^{\sigma_{\Lambda}})$ .

Step 3: Decay of influences given a good potential.

The tree recursion provides us a great tool for computing the (log) ratios of vertices recursively for trees. As we show in Lemma 12, the influence $\mathcal{I}_{G}^{\sigma_{\Lambda}}(r\text{\scriptsize{~{}$\rightarrow$~{}}}v)$ is in fact a form of derivative of the log marginal ratio at $r$ . Thus, the tree recursion can be used naturally to relate these influences. We then apply the potential method, which has been widely used in the literature to establish the decay of correlations (strong spatial mixing). The following lemma shows that the sum of absolute influences to distance $k$ decays exponentially in $k$ , which can be thought of as the decay of pairwise influences.

Lemma 9.

If there exists an $(\alpha,c)$ -potential function $\Psi$ with respect to $\Delta$ and $(\beta,\gamma,\lambda)$ where $\alpha\in(0,1)$ and $c>0$ , then for every $\Lambda\subseteq V_{T}\backslash\{r\}$ , ${\sigma_{\Lambda}}\in\{0,1\}^{\Lambda}$ and all integers $k\geq 1$ ,

\sum_{v\in L_{r}(k)}\left|{\mathcal{I}_{T}^{\sigma_{\Lambda}}(r\text{\scriptsize{~{}$\rightarrow$~{}}}v)}\right|\leq c\cdot(1-\alpha)^{k-1}

where $L_{r}(k)$ denotes the set of all free vertices at distance $k$ away from $r$ .

Theorem 5 is then proved by combining Theorem 7, Lemma 8 and Lemma 9. We leave its proof to Appendix A.
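The way these ingredients fit together can be sketched informally as follows. This is only a heuristic outline under the hypotheses of Lemma 8 and Lemma 9, not the precise argument, which is given in Appendix A. For a fixed $r$ , we bound each influence in $G$ by the absolute influences on its copies in $T=T_{\textsc{saw}}(G,r)$ , group the free vertices of $T$ by their distance $k\geq 1$ from the root, and sum the geometric series:

```latex
\sum_{v\in V\setminus\{r\}}\left|\mathcal{I}_{G}^{\sigma_{\Lambda}}(r\to v)\right|
\;\leq\;\sum_{k\geq 1}\;\sum_{v\in L_{r}(k)}\left|\mathcal{I}_{T}^{\sigma_{\Lambda}}(r\to v)\right|
\;\leq\;\sum_{k\geq 1}c\,(1-\alpha)^{k-1}
\;=\;\frac{c}{\alpha}.
```

Since the diagonal entries of the influence matrix equal $1$ , bounding $\lambda_{\max}(\mathcal{I}_{G}^{\sigma_{\Lambda}})$ by the infinity-norm then suggests $\lambda_{\max}(\mathcal{I}_{G}^{\sigma_{\Lambda}})-1\leq c/\alpha$ , a quantity independent of $n$ that can play the role of $\eta_{k}$ in Theorem 7.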

Step 4: Find a good potential.

As our final step, we need to find an $(\alpha,c)$ -potential function as defined in Definition 4. The potential $\Psi$ we choose is exactly the one from [LLY13], adapted to the log marginal ratios and the tree recursion $H$ (see Section 6 for more details). We show that if the parameters $(\beta,\gamma,\lambda)$ are up-to- $\Delta$ unique with gap $\delta\in(0,1)$ and either $\sqrt{\beta\gamma}>\frac{\Delta-2}{\Delta}$ or $\gamma\leq 1$ , then $\Psi$ is an $(\alpha,c)$ -potential.

Lemma 10.

Let $\Delta\geq 3$ be an integer. Let $\beta,\gamma,\lambda$ be reals such that $0\leq\beta\leq\gamma$ , $\gamma>0$ , $\beta\gamma<1$ and $\lambda>0$ . Assume that $(\beta,\gamma,\lambda)$ is up-to- $\Delta$ unique with gap $\delta\in(0,1)$ . Define the function $\Psi$ implicitly by

\Psi^{\prime}(y)=\psi(y)=\sqrt{\frac{(1-\beta\gamma)e^{y}}{(\beta e^{y}+1)(e^{y}+\gamma)}}=\sqrt{\left|{h(y)}\right|},\qquad\Psi(0)=0.

(4)

If $\sqrt{\beta\gamma}>\frac{\Delta-2}{\Delta}$ , then $\Psi$ is an $(\alpha,c)$ -potential function with $\alpha\geq\delta/2$ and $c\leq 1.5$ . If $\sqrt{\beta\gamma}\leq\frac{\Delta-2}{\Delta}$ and $\gamma\leq 1$ , then $\Psi$ is an $(\alpha,c)$ -potential with $\alpha\geq\delta/2$ and $c\leq 18$ ; we can further take $c\leq 4$ if $\beta=0$ .
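Since $\Psi$ is only defined implicitly through Eq. 4, it may help to see how it can be evaluated. The sketch below computes $\psi$ directly and obtains $\Psi$ by numerical integration; the function names `psi` and `Psi`, the choice of Simpson's rule and the step count are assumptions of this example, not part of the paper. For the hardcore model ( $\beta=0$ , $\gamma=1$ ) the integral has the closed form $2\operatorname{asinh}(e^{y/2})-2\operatorname{asinh}(1)$ , which the example uses as a cross-check.

```python
from math import asinh, exp, sqrt

def psi(y, beta, gamma):
    """psi(y) = sqrt((1 - beta*gamma) e^y / ((beta e^y + 1)(e^y + gamma)))."""
    ey = exp(y)
    return sqrt((1.0 - beta * gamma) * ey / ((beta * ey + 1.0) * (ey + gamma)))

def Psi(y, beta, gamma, steps=200):
    """Psi(y) = integral of psi from 0 to y, by composite Simpson's rule.

    `steps` must be even; the step h is negative when y < 0.
    """
    h = y / steps
    total = psi(0.0, beta, gamma) + psi(y, beta, gamma)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * psi(j * h, beta, gamma)
    return total * h / 3.0

# Hardcore model: psi(y) = sqrt(e^y / (e^y + 1)), whose antiderivative
# vanishing at 0 is 2*asinh(e^{y/2}) - 2*asinh(1).
y = 1.0
print(Psi(y, 0.0, 1.0), 2 * asinh(exp(y / 2)) - 2 * asinh(1.0))
```

The normalization $\Psi(0)=0$ holds by construction, since the integral over an empty interval vanishes.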

We deduce Theorem 3 for the case $\sqrt{\beta\gamma}>\frac{\Delta-2}{\Delta}$ or $\gamma\leq 1$ from Theorem 5 and Lemma 10; the proof can be found in Appendix A. The case that $\sqrt{\beta\gamma}\leq\frac{\Delta-2}{\Delta}$ and $\gamma>1$ is trickier. As discussed in Section 5 of [LLY13], when $\sqrt{\beta\gamma}\leq\frac{\Delta-2}{\Delta}$ and $\gamma>1$ , for some $\lambda>0$ the spin system lies in the uniqueness region for arbitrary graphs, even with unbounded degrees (i.e., up-to- $\infty$ unique). Thus, in this case the total influences of a vertex can be as large as $\Theta(\Delta/\delta)$ , resulting in $n^{\Theta(\Delta/\delta)}$ mixing time. To deal with this, we consider a suitably weighted sum of absolute influences of a fixed vertex, which also upper bounds the maximum eigenvalue of the influence matrix. Definition 4 and Theorem 5 are then modified to slightly stronger versions. The statements and proofs for this case are presented in Section 7 and Appendix D.

The rest of the paper is organized as follows. In Section 4 we prove Lemma 8 about properties of the SAW tree. In Section 5 we establish Lemma 9 regarding the decay of influences by the potential method. We verify the Contraction condition in Section 6 for our choice of potential. Section 7 is devoted to the case that $\sqrt{\beta\gamma}\leq\frac{\Delta-2}{\Delta}$ and $\gamma>1$ , where a more general version of Definition 4 and Theorem 5 is required; missing proofs can be found in Appendix D. In Appendix E we verify the Boundedness condition and its generalization for our potential in all cases. We consider ferromagnetic spin systems in Section 8 and the proofs are left to Appendix F. We prove all of our main results in Appendix A.

4 Preservation of influences for self-avoiding walk trees

In this section we show that the self-avoiding walk (SAW) tree, introduced in [Wei06] (see also [SS05]), maintains all the influence of the root, and thus establishes Lemma 8. To do this, we show that the partition function of $G$ , viewed as a polynomial in the external fields $\bm{\lambda}$ , divides that of the SAW tree. From there we prove that the influence of the root vertex $r$ on another vertex $v$ in $G$ is exactly equal to the sum of its influences on all copies of $v$ in the SAW tree. Using our proof approach, we show that the marginal of the root is maintained in the SAW tree, re-establishing Weitz’s celebrated result [Wei06], and also that all pairwise covariances concerned with $r$ are preserved.

Theorem 11.

Let $G=(V,E)$ be a connected graph, $r\in V$ be a vertex and $\Lambda\subseteq V\backslash\{r\}$ such that $G\backslash\Lambda$ is connected. Let $T=T_{\textsc{saw}}(G,r)$ be the self-avoiding walk tree of $G$ rooted at $r$ . Then for every ${\sigma_{\Lambda}}\in\{0,1\}^{\Lambda}$ , $Z^{\sigma_{\Lambda}}_{G}$ divides $Z^{\sigma_{\Lambda}}_{T}$ . More precisely, there exists a polynomial $P^{\sigma_{\Lambda}}_{G,r}=P^{\sigma_{\Lambda}}_{G,r}(\bm{\lambda})$ such that

Z^{\sigma_{\Lambda}}_{T}=Z^{\sigma_{\Lambda}}_{G}\cdot P^{\sigma_{\Lambda}}_{G,r}.

Moreover, the polynomial $P^{\sigma_{\Lambda}}_{G,r}$ is independent of $\lambda_{r}$ .
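As a small numerical illustration of the theorem, one can compute the quotient $Z^{\sigma_{\Lambda}}_{T}/Z^{\sigma_{\Lambda}}_{G}$ for a triangle with $\Lambda=\emptyset$ and observe that it does not change when only the root field $\lambda_{r}$ is varied. In the sketch below the SAW tree of the triangle is written out by hand, and the function names `Z` and `quotient`, the node ids and all parameter values are assumptions of this example.

```python
from itertools import product

def Z(nodes, edges, field, fixed, beta, gamma):
    """Brute-force partition function; field[x] is the field at free node x and
    `fixed` maps pinned nodes to spins (pinned nodes carry no field)."""
    free = [x for x in nodes if x not in fixed]
    total = 0.0
    for spins in product((0, 1), repeat=len(free)):
        sigma = dict(fixed)
        sigma.update(zip(free, spins))
        w = 1.0
        for a, b in edges:
            if sigma[a] == sigma[b]:
                w *= beta if sigma[a] == 1 else gamma
        for x in free:
            if sigma[x] == 1:
                w *= field[x]
        total += w
    return total

beta, gamma = 0.3, 0.9
g_edges = [(0, 1), (0, 2), (1, 2)]
# SAW tree of the triangle rooted at 0: the paths 0-1-2-0' and 0-2-1-0'', where
# the cycle-closing leaves 0' (node 3) and 0'' (node 6) are pinned to 1 and 0.
t_edges = [(0, 1), (1, 2), (2, 3), (0, 4), (4, 5), (5, 6)]
t_label = {0: 0, 1: 1, 2: 2, 3: 0, 4: 2, 5: 1, 6: 0}
t_fixed = {3: 1, 6: 0}

def quotient(lam):
    """Z_T / Z_G for per-vertex fields lam = (lam_0, lam_1, lam_2)."""
    z_g = Z(range(3), g_edges, lam, {}, beta, gamma)
    t_field = {x: lam[t_label[x]] for x in t_label}
    z_t = Z(range(7), t_edges, t_field, t_fixed, beta, gamma)
    return z_t / z_g

# Changing only the root field lam_0 should leave the quotient unchanged.
print(quotient((0.5, 1.2, 2.0)), quotient((3.0, 1.2, 2.0)))
```

Working through this example by hand gives the quotient $\beta^{2}\lambda_{1}\lambda_{2}+\beta\gamma\lambda_{2}+\lambda_{1}+\gamma^{2}$ , which indeed does not involve $\lambda_{0}$ .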

Remark 2.

The proof of Theorem 11 can be adapted to give a purely combinatorial proof of Eq. 3 in Lemma 8. As in the proof of [Wei06, Theorem 3.1], one can proceed via vertex splitting and telescoping, where instead of telescoping a product of marginal ratios, one telescopes a sum of single-vertex influences.

We remark that [Ben18] proved a univariate version of Theorem 11 for the hardcore model, and [LSS19] showed a similar result for the zero-field Ising model with a uniform edge weight. Our result holds for all $2$ -spin systems and arbitrary fields for each vertex. We can also generalize it to arbitrary edge weights for each edge in a straightforward fashion. It is crucial that the quotient polynomial $P^{\sigma_{\Lambda}}_{G,r}$ is independent of the field $\lambda_{r}$ at the root, from which we can deduce the preservation of the marginal and influences of the root immediately.

Before proving Theorem 11, we first give a few consequences of it. For all $u,v\in V\backslash\Lambda$ , we define the marginal at $v$ as $M_{G}^{\sigma_{\Lambda}}(v)=\mu_{G}(v\text{\scriptsize{~{}$=$~{}}}1\mid{\sigma_{\Lambda}})$ (henceforth we write $v=i$ for the event $\sigma_{v}=i$ for convenience), and the covariance of $u$ and $v$ as

K_{G}^{\sigma_{\Lambda}}(u,v)=\mu_{G}(u\text{\scriptsize{~{}$=$~{}}}v\text{\scriptsize{~{}$=$~{}}}1\mid{\sigma_{\Lambda}})-\mu_{G}(u\text{\scriptsize{~{}$=$~{}}}1\mid{\sigma_{\Lambda}})\mu_{G}(v\text{\scriptsize{~{}$=$~{}}}1\mid{\sigma_{\Lambda}}).

The following lemma relates the quantities we are interested in with appropriate derivatives of the (log) partition function. Parts 1 and 2 of the lemma are folklore.

Lemma 12.

For every graph $G=(V,E)$ , $\Lambda\subseteq V$ and ${\sigma_{\Lambda}}\in\{0,1\}^{\Lambda}$ , the following holds:

1.

For all $v\in V$ ,

$\left(\lambda_{v}\frac{\partial}{\partial\lambda_{v}}\right)\log Z^{\sigma_{\Lambda}}_{G}=M_{G}^{\sigma_{\Lambda}}(v);$
2.

For all $u,v\in V$ ,

$\left(\lambda_{v}\frac{\partial}{\partial\lambda_{v}}\right)\left(\lambda_{u}\frac{\partial}{\partial\lambda_{u}}\right)\log Z^{\sigma_{\Lambda}}_{G}=\left(\lambda_{v}\frac{\partial}{\partial\lambda_{v}}\right)M_{G}^{\sigma_{\Lambda}}(u)=K_{G}^{\sigma_{\Lambda}}(u,v);$
3.

For all $u,v\in V$ ,

$\left(\lambda_{v}\frac{\partial}{\partial\lambda_{v}}\right)\log R_{G}^{\sigma_{\Lambda}}(u)=\mathcal{I}_{G}^{\sigma_{\Lambda}}(u\text{\scriptsize{~{}$\rightarrow$~{}}}v).$

Proof.

The first two parts are standard. We include the proofs of these two facts in Appendix B for completeness. For Part 3, we deduce from Part 2 that

\displaystyle\left(\lambda_{v}\frac{\partial}{\partial\lambda_{v}}\right)\log R_{G}^{\sigma_{\Lambda}}(u)=\left(\lambda_{v}\frac{\partial}{\partial\lambda_{v}}\right)\log\left(\frac{M_{G}^{\sigma_{\Lambda}}(u)}{1-M_{G}^{\sigma_{\Lambda}}(u)}\right)=\frac{\left(\lambda_{v}\frac{\partial}{\partial\lambda_{v}}\right)M_{G}^{\sigma_{\Lambda}}(u)}{M_{G}^{\sigma_{\Lambda}}(u)\left(1-M_{G}^{\sigma_{\Lambda}}(u)\right)}=\frac{K_{G}^{\sigma_{\Lambda}}(u,v)}{K_{G}^{\sigma_{\Lambda}}(u,u)}.

It remains to show that

\mathcal{I}_{G}^{\sigma_{\Lambda}}(u\text{\scriptsize{~{}$\rightarrow$~{}}}v)=\frac{K_{G}^{\sigma_{\Lambda}}(u,v)}{K_{G}^{\sigma_{\Lambda}}(u,u)},

which actually holds for any two binary random variables. To see this, we first compute $K_{G}^{\sigma_{\Lambda}}(u,u)\cdot\mathcal{I}_{G}^{\sigma_{\Lambda}}(u\text{\scriptsize{~{}$\rightarrow$~{}}}v)$ by definition:

		$\displaystyle K_{G}^{\sigma_{\Lambda}}(u,u)\cdot\mathcal{I}_{G}^{\sigma_{\Lambda}}(u\text{\scriptsize{~{}$\rightarrow$~{}}}v)$
	$\displaystyle={}$	$\displaystyle\mu_{G}(u\text{\scriptsize{~{}$=$~{}}}1\mid{\sigma_{\Lambda}})\cdot\mu_{G}(u\text{\scriptsize{~{}$=$~{}}}0\mid{\sigma_{\Lambda}})\cdot\left[\mu_{G}(v\text{\scriptsize{~{}$=$~{}}}1\mid u\text{\scriptsize{~{}$=$~{}}}1,\,{\sigma_{\Lambda}})-\mu_{G}(v\text{\scriptsize{~{}$=$~{}}}1\mid u\text{\scriptsize{~{}$=$~{}}}0,\,{\sigma_{\Lambda}})\right]$
	$\displaystyle={}$	$\displaystyle\mu_{G}(u\text{\scriptsize{~{}$=$~{}}}1,v\text{\scriptsize{~{}$=$~{}}}1\mid{\sigma_{\Lambda}})\cdot\mu_{G}(u\text{\scriptsize{~{}$=$~{}}}0\mid{\sigma_{\Lambda}})-\mu_{G}(u\text{\scriptsize{~{}$=$~{}}}1\mid{\sigma_{\Lambda}})\cdot\mu_{G}(u\text{\scriptsize{~{}$=$~{}}}0,v\text{\scriptsize{~{}$=$~{}}}1\mid{\sigma_{\Lambda}})$
	$\displaystyle={}$	$\displaystyle\mu_{G}(u\text{\scriptsize{~{}$=$~{}}}1,v\text{\scriptsize{~{}$=$~{}}}1\mid{\sigma_{\Lambda}})\cdot\mu_{G}(u\text{\scriptsize{~{}$=$~{}}}0,v\text{\scriptsize{~{}$=$~{}}}0\mid{\sigma_{\Lambda}})-\mu_{G}(u\text{\scriptsize{~{}$=$~{}}}1,v\text{\scriptsize{~{}$=$~{}}}0\mid{\sigma_{\Lambda}})\cdot\mu_{G}(u\text{\scriptsize{~{}$=$~{}}}0,v\text{\scriptsize{~{}$=$~{}}}1\mid{\sigma_{\Lambda}}).$

Meanwhile, the covariance can be written as

	$\displaystyle K_{G}^{\sigma_{\Lambda}}(u,v)$	$\displaystyle=\mu_{G}(u\text{\scriptsize{~{}$=$~{}}}1,v\text{\scriptsize{~{}$=$~{}}}1\mid{\sigma_{\Lambda}})-\mu_{G}(u\text{\scriptsize{~{}$=$~{}}}1\mid{\sigma_{\Lambda}})\cdot\mu_{G}(v\text{\scriptsize{~{}$=$~{}}}1\mid{\sigma_{\Lambda}})$
		$\displaystyle=\mu_{G}(u\text{\scriptsize{~{}$=$~{}}}1,v\text{\scriptsize{~{}$=$~{}}}1\mid{\sigma_{\Lambda}})\cdot\mu_{G}(u\text{\scriptsize{~{}$=$~{}}}0,v\text{\scriptsize{~{}$=$~{}}}0\mid{\sigma_{\Lambda}})-\mu_{G}(u\text{\scriptsize{~{}$=$~{}}}1,v\text{\scriptsize{~{}$=$~{}}}0\mid{\sigma_{\Lambda}})\cdot\mu_{G}(u\text{\scriptsize{~{}$=$~{}}}0,v\text{\scriptsize{~{}$=$~{}}}1\mid{\sigma_{\Lambda}}).$

This shows that $\mathcal{I}_{G}^{\sigma_{\Lambda}}(u\text{\scriptsize{~{}$\rightarrow$~{}}}v)=K_{G}^{\sigma_{\Lambda}}(u,v)/K_{G}^{\sigma_{\Lambda}}(u,u)$ and thus establishes Part 3. ∎
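Part 3 of the lemma can also be checked numerically on a small example. The sketch below compares the influence computed from its definition, the ratio of covariances, and a central finite difference of $\log R_{G}(u)$ in the variable $\log\lambda_{v}$ (with $\Lambda=\emptyset$ ). The helper names, the three-vertex path and all parameter values are assumptions of this example.

```python
from itertools import product
from math import exp, log

def gibbs(n, edges, beta, gamma, lam):
    """Return {configuration: probability} for fields lam[0..n-1]."""
    weights = {}
    for sigma in product((0, 1), repeat=n):
        w = 1.0
        for a, b in edges:
            if sigma[a] == sigma[b]:
                w *= beta if sigma[a] == 1 else gamma
        for x in range(n):
            if sigma[x] == 1:
                w *= lam[x]
        weights[sigma] = w
    z = sum(weights.values())
    return {sigma: w / z for sigma, w in weights.items()}

def prob(mu, event):
    """Probability that sigma[x] == s for every (x, s) in `event`."""
    return sum(p for sigma, p in mu.items()
               if all(sigma[x] == s for x, s in event.items()))

def influence(mu, u, v):
    """mu(v=1 | u=1) - mu(v=1 | u=0), for u != v."""
    return (prob(mu, {u: 1, v: 1}) / prob(mu, {u: 1})
            - prob(mu, {u: 0, v: 1}) / prob(mu, {u: 0}))

def covariance(mu, u, v):
    return prob(mu, {u: 1, v: 1}) - prob(mu, {u: 1}) * prob(mu, {v: 1})

def log_ratio(n, edges, beta, gamma, lam, u):
    mu = gibbs(n, edges, beta, gamma, lam)
    return log(prob(mu, {u: 1}) / prob(mu, {u: 0}))

# Path 0 - 1 - 2 with arbitrary parameters and vertex-dependent fields.
n, edges = 3, [(0, 1), (1, 2)]
beta, gamma, lam = 0.4, 1.3, [0.7, 1.1, 2.5]
u, v, h = 0, 2, 1e-5
mu = gibbs(n, edges, beta, gamma, lam)
ratio_of_covariances = covariance(mu, u, v) / covariance(mu, u, u)

# lam_v * d/d(lam_v) is the derivative in log(lam_v); use a central difference.
up, down = lam[:], lam[:]
up[v] = lam[v] * exp(h)
down[v] = lam[v] * exp(-h)
derivative = (log_ratio(n, edges, beta, gamma, up, u)
              - log_ratio(n, edges, beta, gamma, down, u)) / (2 * h)
print(influence(mu, u, v), ratio_of_covariances, derivative)
```

The first two numbers should agree up to floating-point rounding, and the third up to the discretization error of the finite difference.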

We deduce Lemma 8 from Theorem 11 and the second item of the following lemma. The proof of Theorem 11 is presented in Section 4.1.

Lemma 13.

1.

([Wei06, Theorem 3.1]) Preservation of marginal of the root $r$ :

$M_{G}^{\sigma_{\Lambda}}(r)=M_{T}^{\sigma_{\Lambda}}(r)\qquad\text{and}\qquad R_{G}^{\sigma_{\Lambda}}(r)=R_{T}^{\sigma_{\Lambda}}(r);$
2.

Preservation of covariances and influences of $r$ : for every $v\in V$ ,

$K_{G}^{\sigma_{\Lambda}}(r,v)=\sum_{\hat{v}\in\mathcal{C}_{v}}K_{T}^{\sigma_{\Lambda}}(r,\hat{v})\qquad\text{and}\qquad\mathcal{I}_{G}^{\sigma_{\Lambda}}(r\text{\scriptsize{~{}$\rightarrow$~{}}}v)=\sum_{\hat{v}\in\mathcal{C}_{v}}\mathcal{I}_{T}^{\sigma_{\Lambda}}(r\text{\scriptsize{~{}$\rightarrow$~{}}}\hat{v}),$

where $\mathcal{C}_{v}$ is the set of all free (unfixed) copies of $v$ in $T$ .

Proof.

By Theorem 11, there exists a polynomial $P^{\sigma_{\Lambda}}_{G,r}=P^{\sigma_{\Lambda}}_{G,r}(\bm{\lambda})$ such that $Z^{\sigma_{\Lambda}}_{T}=Z^{\sigma_{\Lambda}}_{G}\cdot P^{\sigma_{\Lambda}}_{G,r}$ and $P^{\sigma_{\Lambda}}_{G,r}$ is independent of $\lambda_{r}$ . Then it follows from Lemma 12 that

M_{T}^{\sigma_{\Lambda}}(r)=\left(\lambda_{r}\frac{\partial}{\partial\lambda_{r}}\right)\log Z^{\sigma_{\Lambda}}_{T}=\left(\lambda_{r}\frac{\partial}{\partial\lambda_{r}}\right)\left(\log Z^{\sigma_{\Lambda}}_{G}+\log P^{\sigma_{\Lambda}}_{G,r}\right)=\left(\lambda_{r}\frac{\partial}{\partial\lambda_{r}}\right)\log Z^{\sigma_{\Lambda}}_{G}=M_{G}^{\sigma_{\Lambda}}(r),

and therefore $R_{T}^{\sigma_{\Lambda}}(r)=R_{G}^{\sigma_{\Lambda}}(r)$ . For the second item, again from Lemma 12 we get

K_{G}^{\sigma_{\Lambda}}(r,v)=\left(\lambda_{v}\frac{\partial}{\partial\lambda_{v}}\right)M_{G}^{\sigma_{\Lambda}}(r)=\left(\lambda_{v}\frac{\partial}{\partial\lambda_{v}}\right)M_{T}^{\sigma_{\Lambda}}(r).

Recall that for the spin system on the SAW tree $T$ , every free copy $\hat{v}$ of $v$ from $\mathcal{C}_{v}$ has the same external field $\lambda_{\hat{v}}=\lambda_{v}$ . Then, by the chain rule of derivatives and Lemma 12, we deduce that

K_{G}^{\sigma_{\Lambda}}(r,v)=\sum_{\hat{v}\in\mathcal{C}_{v}}\left(\lambda_{\hat{v}}\frac{\partial}{\partial\lambda_{\hat{v}}}\right)M_{T}^{\sigma_{\Lambda}}(r)\cdot\frac{\partial\lambda_{\hat{v}}}{\partial\lambda_{v}}\cdot\frac{\lambda_{v}}{\lambda_{\hat{v}}}=\sum_{\hat{v}\in\mathcal{C}_{v}}K_{T}^{\sigma_{\Lambda}}(r,\hat{v}).

Finally, we have

\mathcal{I}_{G}^{\sigma_{\Lambda}}(r\text{\scriptsize{~{}$\rightarrow$~{}}}v)=\left(\lambda_{v}\frac{\partial}{\partial\lambda_{v}}\right)\log R_{G}^{\sigma_{\Lambda}}(r)=\left(\lambda_{v}\frac{\partial}{\partial\lambda_{v}}\right)\log R_{T}^{\sigma_{\Lambda}}(r)=\sum_{\hat{v}\in\mathcal{C}_{v}}\mathcal{I}_{T}^{\sigma_{\Lambda}}(r\text{\scriptsize{~{}$\rightarrow$~{}}}\hat{v}),

where the last equality follows as above. ∎

4.1 Proof of Theorem 11

Before presenting our proof, let us first review the notations and definitions introduced earlier. Denote the set of fields at all vertices by $\bm{\lambda}=\{\lambda_{v}:v\in V\}$ . For $\Lambda\subseteq V$ and ${\sigma_{\Lambda}}\in\{0,1\}^{\Lambda}$ , the weight of $\sigma\in\{0,1\}^{V\backslash\Lambda}$ conditional on ${\sigma_{\Lambda}}$ is given by

w_{G}(\sigma\mid{\sigma_{\Lambda}})=\beta^{m_{1}(\sigma\mid{\sigma_{\Lambda}})}\gamma^{m_{0}(\sigma\mid{\sigma_{\Lambda}})}\prod_{v\in V\backslash\Lambda}\lambda_{v}^{\sigma_{v}},

where for $i=0,1$ , $m_{i}(\cdot\mid{\sigma_{\Lambda}})$ denotes the number of edges such that both endpoints receive the spin $i$ and at least one of them is in $V\backslash\Lambda$ . The partition function conditional on ${\sigma_{\Lambda}}$ is defined as $Z_{G}^{\sigma_{\Lambda}}=\sum_{\sigma\in\{0,1\}^{V\backslash\Lambda}}w_{G}(\sigma\mid{\sigma_{\Lambda}})$ . For the SAW tree, we define the conditional weights and partition function in the same way. In particular, recall that when we fix a conditioning ${\sigma_{\Lambda}}$ on the SAW tree, we also remove all descendants of $\hat{v}\in\mathcal{C}_{v}$ for each $v\in\Lambda$ .

For every $v\in V\backslash\Lambda$ and $i\in\{0,1\}$ , we shall write $v=i$ to represent the set of configurations such that $\sigma_{v}=i$ (i.e., $\{\sigma\in\{0,1\}^{V\backslash\Lambda}:\sigma_{v}=i\}$ ) and let $Z_{G}^{\sigma_{\Lambda}}(v\text{\scriptsize{~{}$=$~{}}}i)$ be the sum of weights of all configurations with $v\text{\scriptsize{~{}$=$~{}}}i$ . We further extend this notation and write $Z_{G}^{\sigma_{\Lambda}}(U\text{\scriptsize{~{}$=$~{}}}\sigma_{U})$ for every $U\subseteq V\backslash\Lambda$ and $\sigma_{U}\in\{0,1\}^{U}$ . For the SAW tree we adopt the same notation.

Proof of Theorem 11.

We will show that there exists a polynomial $P^{\sigma_{\Lambda}}_{G,r}=P^{\sigma_{\Lambda}}_{G,r}(\bm{\lambda})$ , independent of $\lambda_{r}$ , such that

Z^{\sigma_{\Lambda}}_{T}(r\text{\scriptsize{~{}$=$~{}}}1)=Z^{\sigma_{\Lambda}}_{G}(r\text{\scriptsize{~{}$=$~{}}}1)\cdot P^{\sigma_{\Lambda}}_{G,r}\quad\text{and}\quad Z^{\sigma_{\Lambda}}_{T}(r\text{\scriptsize{~{}$=$~{}}}0)=Z^{\sigma_{\Lambda}}_{G}(r\text{\scriptsize{~{}$=$~{}}}0)\cdot P^{\sigma_{\Lambda}}_{G,r}.

(5)

The high-level proof idea of Eq. 5 is similar to the corresponding result in [Wei06, Theorem 3.1]. Let $m$ be the number of edges with at least one endpoint in $V\backslash\Lambda$ . We use induction on $m$ . When $m=0$ the statement is trivial since $T=G$ . Assume that Eq. 5 holds for all graphs and all conditionings with fewer than $m$ edges. Suppose that the root $r$ has $d$ neighbors $v_{1},\dots,v_{d}$ . Define $G^{\prime}$ to be the graph obtained by replacing the vertex $r$ with $d$ vertices $r_{1},\dots,r_{d}$ and then connecting $\{r_{i},v_{i}\}$ for $1\leq i\leq d$ .

Consider first the case where $(G\backslash\{r\})\backslash\Lambda$ is still connected. For each $i$ , let $G_{i}=G^{\prime}-r_{i}$ . Define the $2$ -spin system on $G_{i}$ with the same parameters $(\beta,\gamma,\bm{\lambda})$ , plus an additional conditioning that the vertices $r_{1},\dots,r_{i-1}$ are fixed to spin $0$ while $r_{i+1},\dots,r_{d}$ are fixed to spin $1$ ; we denote this conditioning by $\sigma_{U_{i}}$ with $U_{i}=\{r_{1},\dots,r_{d}\}\backslash\{r_{i}\}$ . Then, $T=T_{\textsc{saw}}(G,r)$ can be generated by the following recursive procedure. Also see Fig. 2 for an illustration.

Algorithm: $T_{\textsc{saw}}(G,r)$

1.

For each $i$ , let $T_{i}=T_{\textsc{saw}}(G_{i},v_{i})$ plus the conditioning $\sigma_{U_{i}}$ ;
2.

Let $T=T_{\textsc{saw}}(G,r)$ be the union of $r$ and $T_{1},\dots,T_{d}$ by connecting $\{r,v_{i}\}$ for $1\leq i\leq d$ ; output $T$ .

For the purpose of proof, we also consider the $2$ -spin system on $G^{\prime}$ with the same parameters $(\beta,\gamma,\bm{\lambda})$ , with an exception that we let the vertices $r_{1},\dots,r_{d}$ have no fields (i.e., setting $\lambda_{r_{i}}=1$ for $1\leq i\leq d$ instead of $\lambda_{r}$ ). We then observe that

Z^{\sigma_{\Lambda}}_{G}(r\text{\scriptsize{~{}$=$~{}}}1)=\lambda_{r}\cdot Z^{\sigma_{\Lambda}}_{G^{\prime}}(r_{1}\text{\scriptsize{~{}$=$~{}}}1,\dots,r_{d}\text{\scriptsize{~{}$=$~{}}}1),

and the same holds with spin $1$ replaced by $0$ . For $1\leq i\leq d$ , let $\sigma_{\Lambda_{i}}$ denote the union of the conditioning ${\sigma_{\Lambda}}$ and $\sigma_{U_{i}}$ , where $\Lambda_{i}=\Lambda\cup U_{i}$ . Then for every $1\leq i\leq d$ we have

Z^{\sigma_{\Lambda}}_{G^{\prime}}(r_{1}\text{\scriptsize{~{}$=$~{}}}0,\dots,r_{i-1}\text{\scriptsize{~{}$=$~{}}}0,r_{i}\text{\scriptsize{~{}$=$~{}}}1,\dots,r_{d}\text{\scriptsize{~{}$=$~{}}}1)=\beta\cdot Z^{\sigma_{\Lambda_{i}}}_{G_{i}}(v_{i}\text{\scriptsize{~{}$=$~{}}}1)+Z^{\sigma_{\Lambda_{i}}}_{G_{i}}(v_{i}\text{\scriptsize{~{}$=$~{}}}0).

Notice that both sides are independent of the field $\lambda_{r}$ : for the left side, all $r_{i}$ ’s do not have a field for the spin system on $G^{\prime}$ ; for the right side, recall that we do not count the weight of fixed vertices for the conditional partition function for each $G_{i}$ . Now define $Q^{\sigma_{\Lambda}}_{G,r}=Q^{\sigma_{\Lambda}}_{G,r}(\bm{\lambda})$ by

Q^{\sigma_{\Lambda}}_{G,r}=\prod_{i=2}^{d}Z^{\sigma_{\Lambda}}_{G^{\prime}}(r_{1}\text{\scriptsize{~{}$=$~{}}}0,\dots,r_{i-1}\text{\scriptsize{~{}$=$~{}}}0,r_{i}\text{\scriptsize{~{}$=$~{}}}1,\dots,r_{d}\text{\scriptsize{~{}$=$~{}}}1),

which is independent of $\lambda_{r}$ . Then we get

	$\displaystyle Z^{\sigma_{\Lambda}}_{G}(r\text{\scriptsize{~{}$=$~{}}}1)\cdot Q^{\sigma_{\Lambda}}_{G,r}$	$\displaystyle=\lambda_{r}\cdot\prod_{i=1}^{d}Z^{\sigma_{\Lambda}}_{G^{\prime}}(r_{1}\text{\scriptsize{~{}$=$~{}}}0,\dots,r_{i-1}\text{\scriptsize{~{}$=$~{}}}0,r_{i}\text{\scriptsize{~{}$=$~{}}}1,\dots,r_{d}\text{\scriptsize{~{}$=$~{}}}1)$
		$\displaystyle=\lambda_{r}\cdot\prod_{i=1}^{d}\left(\beta\cdot Z^{\sigma_{\Lambda_{i}}}_{G_{i}}(v_{i}\text{\scriptsize{~{}$=$~{}}}1)+Z^{\sigma_{\Lambda_{i}}}_{G_{i}}(v_{i}\text{\scriptsize{~{}$=$~{}}}0)\right).$

Using a similar argument, we also have

	$\displaystyle Z^{\sigma_{\Lambda}}_{G}(r\text{\scriptsize{~{}$=$~{}}}0)\cdot Q^{\sigma_{\Lambda}}_{G,r}$	$\displaystyle=\prod_{i=1}^{d}Z^{\sigma_{\Lambda}}_{G^{\prime}}(r_{1}\text{\scriptsize{~{}$=$~{}}}0,\dots,r_{i}\text{\scriptsize{~{}$=$~{}}}0,r_{i+1}\text{\scriptsize{~{}$=$~{}}}1,\dots,r_{d}\text{\scriptsize{~{}$=$~{}}}1)$
		$\displaystyle=\prod_{i=1}^{d}\left(Z^{\sigma_{\Lambda_{i}}}_{G_{i}}(v_{i}\text{\scriptsize{~{}$=$~{}}}1)+\gamma\cdot Z^{\sigma_{\Lambda_{i}}}_{G_{i}}(v_{i}\text{\scriptsize{~{}$=$~{}}}0)\right).$

Since we assume that $(G\backslash\{r\})\backslash\Lambda$ is connected, the graph $G_{i}\backslash\Lambda$ is also connected for each $i$ . Then, by the induction hypothesis, for each $i$ there exists a polynomial $P^{\sigma_{\Lambda_{i}}}_{G_{i},v_{i}}=P^{\sigma_{\Lambda_{i}}}_{G_{i},v_{i}}(\bm{\lambda})$ such that

Z^{\sigma_{\Lambda_{i}}}_{T_{i}}(v_{i}\text{\scriptsize{~{}$=$~{}}}1)=Z^{\sigma_{\Lambda_{i}}}_{G_{i}}(v_{i}\text{\scriptsize{~{}$=$~{}}}1)\cdot P^{\sigma_{\Lambda_{i}}}_{G_{i},v_{i}}\quad\text{and}\quad Z^{\sigma_{\Lambda_{i}}}_{T_{i}}(v_{i}\text{\scriptsize{~{}$=$~{}}}0)=Z^{\sigma_{\Lambda_{i}}}_{G_{i}}(v_{i}\text{\scriptsize{~{}$=$~{}}}0)\cdot P^{\sigma_{\Lambda_{i}}}_{G_{i},v_{i}};

these polynomials are independent of $\lambda_{r}$ since the conditional partition functions for $G_{i}$ ’s do not involve $\lambda_{r}$ . Now if we let

P^{\sigma_{\Lambda}}_{G,r}=Q^{\sigma_{\Lambda}}_{G,r}\cdot\prod_{i=1}^{d}P^{\sigma_{\Lambda_{i}}}_{G_{i},v_{i}},

then it follows from the tree recursion that

	$\displaystyle Z^{\sigma_{\Lambda}}_{T}(r\text{\scriptsize{~{}$=$~{}}}1)$	$\displaystyle=\lambda_{r}\cdot\prod_{i=1}^{d}\left(\beta\cdot Z^{\sigma_{\Lambda_{i}}}_{T_{i}}(v_{i}\text{\scriptsize{~{}$=$~{}}}1)+Z^{\sigma_{\Lambda_{i}}}_{T_{i}}(v_{i}\text{\scriptsize{~{}$=$~{}}}0)\right)$
		$\displaystyle=\lambda_{r}\cdot\prod_{i=1}^{d}\left(\beta\cdot Z^{\sigma_{\Lambda_{i}}}_{G_{i}}(v_{i}\text{\scriptsize{~{}$=$~{}}}1)\cdot P^{\sigma_{\Lambda_{i}}}_{G_{i},v_{i}}+Z^{\sigma_{\Lambda_{i}}}_{G_{i}}(v_{i}\text{\scriptsize{~{}$=$~{}}}0)\cdot P^{\sigma_{\Lambda_{i}}}_{G_{i},v_{i}}\right)$
		$\displaystyle=Z^{\sigma_{\Lambda}}_{G}(r\text{\scriptsize{~{}$=$~{}}}1)\cdot Q^{\sigma_{\Lambda}}_{G,r}\cdot\prod_{i=1}^{d}P^{\sigma_{\Lambda_{i}}}_{G_{i},v_{i}}$
		$\displaystyle=Z^{\sigma_{\Lambda}}_{G}(r\text{\scriptsize{~{}$=$~{}}}1)\cdot P^{\sigma_{\Lambda}}_{G,r}.$

The other equality $Z^{\sigma_{\Lambda}}_{T}(r\text{\scriptsize{~{}$=$~{}}}0)=Z^{\sigma_{\Lambda}}_{G}(r\text{\scriptsize{~{}$=$~{}}}0)\cdot P^{\sigma_{\Lambda}}_{G,r}$ is established in the same way. This completes the proof for the case that $(G\backslash\{r\})\backslash\Lambda$ is connected.

If $(G\backslash\{r\})\backslash\Lambda$ has two or more connected components, then we can construct $T_{\textsc{saw}}(G,r)$ from the SAW trees of the components. Recall that $G^{\prime}$ is defined by splitting the vertex $r$ into $d$ copies in the graph $G$ . Suppose that $G^{\prime}\backslash\Lambda$ has $k$ connected components for an integer $k\geq 2$ . Let $G^{\prime}_{(1)},\dots,G^{\prime}_{(k)}$ be the subgraphs induced by each component, along with vertices from $\Lambda$ that are adjacent to it. For each $j$ , let $G_{(j)}$ be the graph obtained from $G^{\prime}_{(j)}$ by contracting all copies of $r$ into one vertex $r_{(j)}$ , and let $T_{(j)}=T_{\textsc{saw}}(G_{(j)},r_{(j)})$ . Observe that once we contract the roots $r_{(1)},\dots,r_{(k)}$ of $T_{(1)},\dots,T_{(k)}$ , the resulting tree is $T_{\textsc{saw}}(G,r)$ .

We define the $2$ -spin system on each $G_{(j)}$ with the same parameters $(\beta,\gamma,\bm{\lambda})$ , except that the vertex $r_{(j)}$ does not have a field (i.e., $\lambda_{r_{(j)}}=1$ instead of $\lambda_{r}$ ). For $1\leq j\leq k$ , let $\Lambda_{(j)}=\Lambda\cap V(G_{(j)})$ and $\sigma_{\Lambda_{(j)}}$ be the configuration ${\sigma_{\Lambda}}$ restricted to $\Lambda_{(j)}$ . Then $G_{(j)}\backslash\Lambda_{(j)}$ is connected for every $j$ and, since $k\geq 2$ , each $G_{(j)}$ with conditioning $\sigma_{\Lambda_{(j)}}$ has fewer than $m$ edges. Thus, we can apply the induction hypothesis; namely, for $1\leq j\leq k$ there exists a polynomial $P^{\sigma_{\Lambda_{(j)}}}_{G_{(j)},r_{(j)}}=P^{\sigma_{\Lambda_{(j)}}}_{G_{(j)},r_{(j)}}(\bm{\lambda})$ , which is independent of $\lambda_{r}$ , such that

Z^{\sigma_{\Lambda_{(j)}}}_{T_{(j)}}(r_{(j)}\text{\scriptsize{~{}$=$~{}}}1)=Z^{\sigma_{\Lambda_{(j)}}}_{G_{(j)}}(r_{(j)}\text{\scriptsize{~{}$=$~{}}}1)\cdot P^{\sigma_{\Lambda_{(j)}}}_{G_{(j)},r_{(j)}}\quad\text{and}\quad Z^{\sigma_{\Lambda_{(j)}}}_{T_{(j)}}(r_{(j)}\text{\scriptsize{~{}$=$~{}}}0)=Z^{\sigma_{\Lambda_{(j)}}}_{G_{(j)}}(r_{(j)}\text{\scriptsize{~{}$=$~{}}}0)\cdot P^{\sigma_{\Lambda_{(j)}}}_{G_{(j)},r_{(j)}}.

We define the polynomial $P^{\sigma_{\Lambda}}_{G,r}=P^{\sigma_{\Lambda}}_{G,r}(\bm{\lambda})$ to be

P^{\sigma_{\Lambda}}_{G,r}=\prod_{j=1}^{k}P^{\sigma_{\Lambda_{(j)}}}_{G_{(j)},r_{(j)}}.

It is then easy to check that

	$\displaystyle Z^{\sigma_{\Lambda}}_{T}(r\text{\scriptsize{~{}$=$~{}}}1)$	$\displaystyle=\lambda_{r}\cdot\prod_{j=1}^{k}Z^{\sigma_{\Lambda_{(j)}}}_{T_{(j)}}(r_{(j)}\text{\scriptsize{~{}$=$~{}}}1)=\lambda_{r}\cdot\prod_{j=1}^{k}\left(Z^{\sigma_{\Lambda_{(j)}}}_{G_{(j)}}(r_{(j)}\text{\scriptsize{~{}$=$~{}}}1)\cdot P^{\sigma_{\Lambda_{(j)}}}_{G_{(j)},r_{(j)}}\right)$
		$\displaystyle=Z^{\sigma_{\Lambda}}_{G}(r\text{\scriptsize{~{}$=$~{}}}1)\cdot\prod_{j=1}^{k}P^{\sigma_{\Lambda_{(j)}}}_{G_{(j)},r_{(j)}}=Z^{\sigma_{\Lambda}}_{G}(r\text{\scriptsize{~{}$=$~{}}}1)\cdot P^{\sigma_{\Lambda}}_{G,r},$

and similarly $Z^{\sigma_{\Lambda}}_{T}(r\text{\scriptsize{~{}$=$~{}}}0)=Z^{\sigma_{\Lambda}}_{G}(r\text{\scriptsize{~{}$=$~{}}}0)\cdot P^{\sigma_{\Lambda}}_{G,r}$ . The theorem then follows. ∎

5 Influence bound for trees

In this section, we study the influences of the root on other vertices in a tree. We give an upper bound on the total influence of the root on all vertices at a fixed distance from it. To do this, we apply the potential method, which has been used to establish the correlation decay property (see, e.g., [LLY12, LLY13, GL18]). Given an arbitrary potential function $\Psi$ , our upper bound is stated in terms of properties of $\Psi$ , involving bounds on $\left\|{\nabla H_{d}^{\Psi}}\right\|_{1}$ and $|\psi|$ where $\psi=\Psi^{\prime}$ . We then deduce Lemma 9 in the case that $\Psi$ is an $(\alpha,c)$ -potential.

Assume that $T=(V_{T},E_{T})$ is a tree rooted at $r$ of maximum degree at most $\Delta$ . Let $\Lambda\subseteq V_{T}\backslash\{r\}$ and ${\sigma_{\Lambda}}\in\{0,1\}^{\Lambda}$ be arbitrary and fixed. Consider the $2$ -spin system on $T$ with parameters $(\beta,\gamma,\lambda)$ , conditioned on ${\sigma_{\Lambda}}$ . We need to bound the influence $\mathcal{I}_{T}^{\sigma_{\Lambda}}(r\text{\scriptsize{~{}$\rightarrow$~{}}}v)$ from the root $r$ to another vertex $v\in V_{T}$ . Notice that if $v$ is disconnected from $r$ when $\Lambda$ is removed, then $\mathcal{I}_{T}^{\sigma_{\Lambda}}(r\text{\scriptsize{~{}$\rightarrow$~{}}}v)=0$ by the Markov property of spin systems. Therefore, we may assume that, by removing all such vertices, $\Lambda$ contains only leaves of $T$ .

For a vertex $v\in V_{T}$ , let $T_{v}=(V_{T_{v}},E_{T_{v}})$ be the subtree of $T$ rooted at $v$ that contains all descendants of $v$ ; note that $T_{r}=T$ . We will write $L_{v}(k)\subseteq V_{T}\backslash\Lambda$ for the set of all free vertices at distance $k$ from $v$ in $T_{v}$ . We are particularly interested in the marginal ratio at $v$ in the subtree $T_{v}$ , and write $R_{v}=R_{T_{v}}^{\sigma_{\Lambda}}(v)$ for simplicity. The $\log R_{v}$ ’s are related by the tree recursion $H$ . If a vertex $v$ has $d$ children, denoted by $u_{1},\dots,u_{d}$ , then the tree recursion is given by

\log R_{v}=H_{d}(\log R_{u_{1}},\dots,\log R_{u_{d}}),

where for $1\leq d\leq\Delta$ and $(y_{1},\dots,y_{d})\in[-\infty,+\infty]^{d}$ ,

H_{d}(y_{1},\dots,y_{d})=\log\lambda+\sum_{i=1}^{d}\log\left(\frac{\beta e^{y_{i}}+1}{e^{y_{i}}+\gamma}\right).

Also recall that for $y\in[-\infty,+\infty]$ , we define

h(y)=-\frac{(1-\beta\gamma)e^{y}}{(\beta e^{y}+1)(e^{y}+\gamma)}

and $\frac{\partial}{\partial y_{i}}H_{d}(y_{1},\dots,y_{d})=h(y_{i})$ for all $1\leq i\leq d\leq\Delta$ .
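The derivative identity above can be checked numerically. The following is a minimal Python sketch, not part of the original analysis: the parameter values and the helper name `central_difference` are illustrative assumptions, and only finite inputs $y_{i}$ are handled. It compares a central finite difference of $H_{d}$ with the closed form $h(y_{i})$.

```python
import math

def H(ys, beta, gamma, lam):
    """Tree recursion H_d on finite log marginal ratios y_1, ..., y_d."""
    return math.log(lam) + sum(
        math.log((beta * math.exp(y) + 1.0) / (math.exp(y) + gamma)) for y in ys
    )

def h(y, beta, gamma):
    """Closed form for the partial derivative of H_d in any single coordinate."""
    ey = math.exp(y)
    return -(1.0 - beta * gamma) * ey / ((beta * ey + 1.0) * (ey + gamma))

def central_difference(ys, i, beta, gamma, lam, eps=1e-6):
    """Numerical approximation of the partial derivative of H_d in y_i."""
    up, down = list(ys), list(ys)
    up[i] += eps
    down[i] -= eps
    return (H(up, beta, gamma, lam) - H(down, beta, gamma, lam)) / (2.0 * eps)

# Hypothetical antiferromagnetic parameters (beta * gamma < 1) and inputs.
beta, gamma, lam = 0.3, 1.5, 0.8
ys = [-0.4, 0.2, 1.1]
max_error = max(
    abs(central_difference(ys, i, beta, gamma, lam) - h(ys[i], beta, gamma))
    for i in range(len(ys))
)
print(max_error)
```

For these antiferromagnetic parameters every value $h(y_{i})$ is negative, which matches the sign of the factor $-(1-\beta\gamma)$ in the definition of $h$ .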

The following lemma allows us to bound the sum of all influences from the root to distance $k$ , using an arbitrary potential function.

Lemma 14.

Let $\Psi:[-\infty,+\infty]\to(-\infty,+\infty)$ be a differentiable and increasing (potential) function with image $S=\Psi[-\infty,+\infty]$ and derivative $\psi=\Psi^{\prime}$ . Denote the degree of the root $r$ by $\Delta_{r}$ . Then for every integer $k\geq 1$ ,

\sum_{v\in L_{r}(k)}\left|\mathcal{I}_{T}^{\sigma_{\Lambda}}(r\text{\scriptsize{~{}$\rightarrow$~{}}}v)\right|\leq\Delta_{r}A_{\Psi}B_{\Psi}\left(\max_{1\leq d<\Delta}\,\sup_{\tilde{\bm{y}}\in S^{d}}\left\|{\nabla H_{d}^{\Psi}(\tilde{\bm{y}})}\right\|_{1}\right)^{k-1}

where

A_{\Psi}=\max_{u\in L_{r}(1)}\left\{\frac{|h(\log R_{u})|}{\psi(\log R_{u})}\right\}\quad\text{and}\quad B_{\Psi}=\max_{v\in L_{r}(k)}\left\{\psi(\log R_{v})\right\}.

Before proving Lemma 14, we first present two useful properties of the influences on trees. Firstly, it was shown in [ALO20] that the influences satisfy the following form of chain rule on trees.

Lemma 15 ([ALO20, Lemma B.2]).

Suppose that $u,v,w\in V_{T}$ are three distinct vertices such that $u$ is on the unique path from $v$ to $w$ . Then

\mathcal{I}_{T}^{\sigma_{\Lambda}}(v\text{\scriptsize{~{}$\rightarrow$~{}}}w)=\mathcal{I}_{T}^{\sigma_{\Lambda}}(v\text{\scriptsize{~{}$\rightarrow$~{}}}u)\cdot\mathcal{I}_{T}^{\sigma_{\Lambda}}(u\text{\scriptsize{~{}$\rightarrow$~{}}}w).

Secondly, for two adjacent vertices on a tree, the influence from one to the other is given by the function $h$ .

Lemma 16.

Let $v\in V_{T}$ and $u$ be a child of $v$ in the subtree $T_{v}$ . Then

\mathcal{I}_{T}^{\sigma_{\Lambda}}(v\text{\scriptsize{~{}$\rightarrow$~{}}}u)=h(\log R_{u}).

Proof.

The lemma can be proved through an explicit computation of the influence. Here we present a more delicate proof utilizing 12, which gives some insights into the relation between the influence and the function $h$ . We assume that $v$ has $d$ children in the subtree $T_{v}$ , denoted by $u_{1}=u$ and $u_{2},\dots,u_{d}$ respectively. We also assume, as a more general setting than uniform fields, that each vertex $w$ is attached to a field $\lambda_{w}$ of its own. Then 12 and the tree recursion imply that

	$\displaystyle\mathcal{I}_{T}^{\sigma_{\Lambda}}(v\text{\scriptsize{~{}$\rightarrow$~{}}}u)$	$\displaystyle=\mathcal{I}_{T_{v}}^{\sigma_{\Lambda}}(v\text{\scriptsize{~{}$\rightarrow$~{}}}u)=\left(\lambda_{u}\frac{\partial}{\partial\lambda_{u}}\right)\log R_{v}$
		$\displaystyle=\left(\lambda_{u}\frac{\partial}{\partial\lambda_{u}}\right)H_{d}(\log R_{u_{1}},\dots,\log R_{u_{d}})$
		$\displaystyle=\sum_{i=1}^{d}\frac{\partial}{\partial\log R_{u_{i}}}H_{d}(\log R_{u_{1}},\dots,\log R_{u_{d}})\cdot\left(\lambda_{u}\frac{\partial}{\partial\lambda_{u}}\right)\log R_{u_{i}}$
		$\displaystyle=\sum_{i=1}^{d}h(\log R_{u_{i}})\cdot\mathcal{I}_{T_{u_{i}}}^{\sigma_{\Lambda}}(u_{i}\text{\scriptsize{~{}$\rightarrow$~{}}}u)=h(\log R_{u}),$

where the last equality holds because $\mathcal{I}_{T_{u_{i}}}^{\sigma_{\Lambda}}(u_{i}\text{\scriptsize{~{}$\rightarrow$~{}}}u)=0$ for $u_{i}\neq u$ and $\mathcal{I}_{T_{u}}^{\sigma_{\Lambda}}(u\text{\scriptsize{~{}$\rightarrow$~{}}}u)=1$ . ∎
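Lemma 16 can also be checked by brute force on a small example. The sketch below is illustrative only: the five-vertex tree and the parameter values are assumptions, there is no boundary condition ( $\Lambda=\emptyset$ ), and the influence is taken to be the difference of the conditional probabilities that $u$ has spin $1$ given that $v$ has spin $1$ or spin $0$ , with weight $\lambda$ per spin-$1$ vertex, $\beta$ per $(1,1)$ edge and $\gamma$ per $(0,0)$ edge.

```python
import itertools
import math

def weight(sigma, edges, beta, gamma, lam):
    """Gibbs weight: lam per spin 1, beta per (1,1) edge, gamma per (0,0) edge."""
    w = lam ** sum(sigma)
    for a, b in edges:
        if sigma[a] == sigma[b]:
            w *= beta if sigma[a] == 1 else gamma
    return w

def prob_one(n, edges, params, target, cond, spin):
    """P(sigma_target = 1 | sigma_cond = spin), by enumerating all configurations."""
    num = den = 0.0
    for sigma in itertools.product((0, 1), repeat=n):
        if sigma[cond] == spin:
            w = weight(sigma, edges, *params)
            den += w
            num += w * sigma[target]
    return num / den

def h(y, beta, gamma):
    ey = math.exp(y)
    return -(1.0 - beta * gamma) * ey / ((beta * ey + 1.0) * (ey + gamma))

# Hypothetical tree: v = 0 has children u = 1 and 2; u has leaf children 3 and 4.
edges = [(0, 1), (0, 2), (1, 3), (1, 4)]
beta, gamma, lam = 0.3, 1.5, 0.8
params = (beta, gamma, lam)

# Influence of v on u, taken here as P(u = 1 | v = 1) - P(u = 1 | v = 0).
influence = prob_one(5, edges, params, 1, 0, 1) - prob_one(5, edges, params, 1, 0, 0)

# Marginal ratio of u in its own subtree: both children are leaves with ratio lam.
R_u = lam * ((beta * lam + 1.0) / (lam + gamma)) ** 2
print(influence, h(math.log(R_u), beta, gamma))
```

The two printed numbers should agree up to floating-point error, and the sibling of $u$ plays no role, as expected from the Markov property.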

We are now ready to prove Lemma 14.

Proof of Lemma 14.

For a vertex $v\in V_{T}$ , denote the number of its children by $d_{v}$ ; note that $d_{r}=\Delta_{r}$ . Let $u_{1},\dots,u_{\Delta_{r}}$ be the children of the root $r$ . We may assume that all these children of $r$ are free, since if $u_{i}$ is fixed then $\mathcal{I}_{T}^{\sigma_{\Lambda}}(r\text{\scriptsize{~{}$\rightarrow$~{}}}u_{i})=0$ by definition. Then by Lemmas 15 and 16, we get

	$\displaystyle\sum_{v\in L_{r}(k)}\left|\mathcal{I}_{T}^{\sigma_{\Lambda}}(r\text{\scriptsize{~{}$\rightarrow$~{}}}v)\right|$	$\displaystyle=\sum_{i=1}^{\Delta_{r}}\left|\mathcal{I}_{T}^{\sigma_{\Lambda}}(r\text{\scriptsize{~{}$\rightarrow$~{}}}u_{i})\right|\sum_{v\in L_{u_{i}}(k-1)}\left|\mathcal{I}_{T}^{\sigma_{\Lambda}}(u_{i}\text{\scriptsize{~{}$\rightarrow$~{}}}v)\right|$
		$\displaystyle=\sum_{i=1}^{\Delta_{r}}\left|h(\log R_{u_{i}})\right|\sum_{v\in L_{u_{i}}(k-1)}\left|\mathcal{I}_{T}^{\sigma_{\Lambda}}(u_{i}\text{\scriptsize{~{}$\rightarrow$~{}}}v)\right|$
		$\displaystyle=\sum_{i=1}^{\Delta_{r}}\frac{\left|h(\log R_{u_{i}})\right|}{\psi(\log R_{u_{i}})}\sum_{v\in L_{u_{i}}(k-1)}\psi(\log R_{u_{i}})\left|\mathcal{I}_{T}^{\sigma_{\Lambda}}(u_{i}\text{\scriptsize{~{}$\rightarrow$~{}}}v)\right|.$

Hence, we obtain that

\sum_{v\in L_{r}(k)}\left|\mathcal{I}_{T}^{\sigma_{\Lambda}}(r\text{\scriptsize{~{}$\rightarrow$~{}}}v)\right|\leq\Delta_{r}\cdot\max_{1\leq i\leq\Delta_{r}}\left\{\frac{|h(\log R_{u_{i}})|}{\psi(\log R_{u_{i}})}\right\}\cdot\max_{1\leq i\leq\Delta_{r}}\left\{\sum_{v\in L_{u_{i}}(k-1)}\psi(\log R_{u_{i}})\left|\mathcal{I}_{T}^{\sigma_{\Lambda}}(u_{i}\text{\scriptsize{~{}$\rightarrow$~{}}}v)\right|\right\}.

(6)

Next, we show by induction that for every vertex $u\in V_{T}\backslash\{r\}$ and every integer $k\geq 0$ we have

\sum_{v\in L_{u}(k)}\psi(\log R_{u})\left|\mathcal{I}_{T}^{\sigma_{\Lambda}}(u\text{\scriptsize{~{}$\rightarrow$~{}}}v)\right|\leq\max_{v\in L_{u}(k)}\left\{\psi(\log R_{v})\right\}\cdot\left(\max_{w\in V_{T_{u}}}\sup_{\tilde{\bm{y}}\in S^{d_{w}}}\left\|{\nabla H_{d_{w}}^{\Psi}(\tilde{\bm{y}})}\right\|_{1}\right)^{k}.

(7)

Observe that once we establish Eq. 7, the lemma follows immediately by plugging Eq. 7 into Eq. 6. We will use induction on $k$ to prove Eq. 7. When $k=0$ , if $u\in\Lambda$ is fixed then $L_{u}(0)=\emptyset$ and there is nothing to show; otherwise, Eq. 7 becomes

\psi(\log R_{u})\left|\mathcal{I}_{T}^{\sigma_{\Lambda}}(u\text{\scriptsize{~{}$\rightarrow$~{}}}u)\right|\leq\psi(\log R_{u}),

which holds with equality since $\mathcal{I}_{T}^{\sigma_{\Lambda}}(u\text{\scriptsize{~{}$\rightarrow$~{}}}u)=1$ . Now suppose that Eq. 7 holds for some integer $k-1\geq 0$ (and for every vertex $u\in V_{T}\backslash\{r\}$ ). Let $u\in V_{T}\backslash\{r\}$ be arbitrary and denote the children of $u$ by $w_{1},\dots,w_{d}$ , where $1\leq d<\Delta$ (if $d=0$ then $L_{u}(k)=\emptyset$ and Eq. 7 holds trivially). Again by Lemmas 15 and 16 we have

	$\displaystyle\sum_{v\in L_{u}(k)}\psi(\log R_{u})\left|\mathcal{I}_{T}^{\sigma_{\Lambda}}(u\text{\scriptsize{~{}$\rightarrow$~{}}}v)\right|$	$\displaystyle=\sum_{i=1}^{d}\psi(\log R_{u})\left|\mathcal{I}_{T}^{\sigma_{\Lambda}}(u\text{\scriptsize{~{}$\rightarrow$~{}}}w_{i})\right|\sum_{v\in L_{w_{i}}(k-1)}\left|\mathcal{I}_{T}^{\sigma_{\Lambda}}(w_{i}\text{\scriptsize{~{}$\rightarrow$~{}}}v)\right|$
		$\displaystyle=\sum_{i=1}^{d}\frac{\psi(\log R_{u})}{\psi(\log R_{w_{i}})}\left|h(\log R_{w_{i}})\right|\sum_{v\in L_{w_{i}}(k-1)}\psi(\log R_{w_{i}})\left|\mathcal{I}_{T}^{\sigma_{\Lambda}}(w_{i}\text{\scriptsize{~{}$\rightarrow$~{}}}v)\right|.$

Using the induction hypothesis, we get

		$\displaystyle\sum_{v\in L_{u}(k)}\psi(\log R_{u})\left|\mathcal{I}_{T}^{\sigma_{\Lambda}}(u\text{\scriptsize{~{}$\rightarrow$~{}}}v)\right|$
	$\displaystyle\leq{}$	$\displaystyle\sum_{i=1}^{d}\frac{\psi(\log R_{u})}{\psi(\log R_{w_{i}})}\left|h(\log R_{w_{i}})\right|\cdot\max_{v\in L_{w_{i}}(k-1)}\left\{\psi(\log R_{v})\right\}\cdot\left(\max_{w\in V_{T_{w_{i}}}}\sup_{\tilde{\bm{y}}\in S^{d_{w}}}\left\|{\nabla H_{d_{w}}^{\Psi}(\tilde{\bm{y}})}\right\|_{1}\right)^{k-1}$
	$\displaystyle\leq{}$	$\displaystyle\max_{v\in L_{u}(k)}\left\{\psi(\log R_{v})\right\}\cdot\left(\max_{w\in V_{T_{u}}\backslash\{u\}}\sup_{\tilde{\bm{y}}\in S^{d_{w}}}\left\|{\nabla H_{d_{w}}^{\Psi}(\tilde{\bm{y}})}\right\|_{1}\right)^{k-1}\cdot\sum_{i=1}^{d}\frac{\psi(\log R_{u})}{\psi(\log R_{w_{i}})}\left|h(\log R_{w_{i}})\right|$
	$\displaystyle\leq{}$	$\displaystyle\max_{v\in L_{u}(k)}\left\{\psi(\log R_{v})\right\}\cdot\left(\max_{w\in V_{T_{u}}}\sup_{\tilde{\bm{y}}\in S^{d_{w}}}\left\|{\nabla H_{d_{w}}^{\Psi}(\tilde{\bm{y}})}\right\|_{1}\right)^{k},$

where the last inequality follows from the identity

	$\displaystyle\sum_{i=1}^{d}\frac{\psi(\log R_{u})}{\psi(\log R_{w_{i}})}\left|h(\log R_{w_{i}})\right|$	$\displaystyle=\sum_{i=1}^{d}\left|\frac{\partial}{\partial\Psi(\log R_{w_{i}})}\,H_{d}^{\Psi}\left(\Psi(\log R_{w_{1}}),\dots,\Psi(\log R_{w_{d}})\right)\right|$
		$\displaystyle=\left\|{\nabla H_{d}^{\Psi}\left(\Psi(\log R_{w_{1}}),\dots,\Psi(\log R_{w_{d}})\right)}\right\|_{1}.$

This establishes Eq. 7, and thus completes the proof of the lemma. ∎

We then derive Lemma 9 as a corollary.

Proof of Lemma 9.

Since $\Psi$ is an $(\alpha,c)$ -potential, the Contraction condition implies that

\max_{1\leq d<\Delta}\sup_{\tilde{\bm{y}}\in S^{d}}\left\|{\nabla H_{d}^{\Psi}(\tilde{\bm{y}})}\right\|_{1}\leq 1-\alpha.

Meanwhile, since the degree of a vertex $v\in V_{T}\backslash\{r\}$ in the subtree $T_{v}$ is less than $\Delta$ , we have $\log R_{v}\in J$ . Then the Boundedness condition implies that for all $u\in L_{r}(1)$ and $v\in L_{r}(k)$ ,

\frac{\psi(\log R_{v})}{\psi(\log R_{u})}\cdot|h(\log R_{u})|\leq\frac{c}{\Delta}.

Therefore, we get

\Delta_{r}A_{\Psi}B_{\Psi}=\Delta_{r}\cdot\max_{u\in L_{r}(1)}\left\{\frac{|h(\log R_{u})|}{\psi(\log R_{u})}\right\}\cdot\max_{v\in L_{r}(k)}\left\{\psi(\log R_{v})\right\}\leq c.

The lemma then follows immediately from Lemma 14. ∎
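For orientation, substituting the two displayed bounds of this proof into the bound of Lemma 14 gives the following explicit decay. The geometric summation over $k$ is an added observation, included only to indicate how the constants $c$ and $\alpha$ combine:

\sum_{v\in L_{r}(k)}\left|\mathcal{I}_{T}^{\sigma_{\Lambda}}(r\text{\scriptsize{~{}$\rightarrow$~{}}}v)\right|\leq c\cdot(1-\alpha)^{k-1}\quad\text{for every }k\geq 1,\qquad\text{and hence}\qquad\sum_{k\geq 1}\sum_{v\in L_{r}(k)}\left|\mathcal{I}_{T}^{\sigma_{\Lambda}}(r\text{\scriptsize{~{}$\rightarrow$~{}}}v)\right|\leq c\sum_{k\geq 1}(1-\alpha)^{k-1}=\frac{c}{\alpha}.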

6 Verifying a good potential: Contraction

In this section, we take a first step toward proving 10. Let $\Delta\geq 3$ be an integer. Let $\beta,\gamma,\lambda$ be reals such that $0\leq\beta\leq\gamma$ , $\gamma>0$ , $\beta\gamma<1$ and $\lambda>0$ . Recall that we define our potential function $\Psi:[-\infty,+\infty]\to(-\infty,+\infty)$ through its derivative by

\Psi^{\prime}(y)=\psi(y)=\sqrt{\frac{(1-\beta\gamma)e^{y}}{(\beta e^{y}+1)(e^{y}+\gamma)}},\qquad\Psi(0)=0.

(1)

We include a short proof in Appendix C to show that $\Psi$ is well-defined. If $(\beta,\gamma,\lambda)$ is up-to- $\Delta$ unique with gap $\delta\in(0,1)$ , then we show that $\Psi$ satisfies the Contraction condition for $\alpha=\delta/2$ . This holds for all parameters $(\beta,\gamma,\lambda)$ in the uniqueness region, without requiring that $\gamma\leq 1$ . Later in Appendix E, we establish the Boundedness condition for $\Psi$ when $\gamma\leq 1$ , completing the proof of 10. The case of $\gamma>1$ is more complicated and is left to Section 7.

Before giving our proof, we first point out that the potential function $\Psi$ is essentially the same potential function $\Phi$ used in [LLY13] (notice that [LLY13] uses $\varphi$ as the notation of the potential function and $\Phi=\varphi^{\prime}$ for its derivative). Recall that the tree recursion for the marginal ratios is given by the function $F_{d}:[0,+\infty]^{d}\to[0,+\infty]$ where $1\leq d\leq\Delta$ such that for all $(x_{1},\dots,x_{d})\in[0,+\infty]^{d}$ ,

F_{d}(x_{1},\dots,x_{d})=\lambda\prod_{i=1}^{d}\frac{\beta x_{i}+1}{x_{i}+\gamma}.

The potential function $\Phi:[0,+\infty]\to(-\infty,+\infty)$ from [LLY13] is defined implicitly via its derivative as

\Phi^{\prime}(x)=\varphi(x)=\frac{1}{\sqrt{x(\beta x+1)(x+\gamma)}},\qquad\Phi(1)=0.

The following lemma explains how we obtain our potential $\Psi$ from $\Phi$ .

Lemma 17.

We have $\Psi=\sqrt{1-\beta\gamma}\cdot(\Phi\circ\exp)$ ; namely, $\Psi(y)=\sqrt{1-\beta\gamma}\cdot\Phi(e^{y})$ for all $y\in[-\infty,+\infty]$ .

Proof.

It is straightforward to check that

\psi(y)=\sqrt{\frac{(1-\beta\gamma)e^{y}}{(\beta e^{y}+1)(e^{y}+\gamma)}}=\sqrt{1-\beta\gamma}\cdot e^{y}\cdot\sqrt{\frac{1}{e^{y}(\beta e^{y}+1)(e^{y}+\gamma)}}=\sqrt{1-\beta\gamma}\cdot e^{y}\varphi(e^{y}).

Therefore,

\Psi(y)=\int_{0}^{y}\psi(t)\,\mathrm{d}t=\sqrt{1-\beta\gamma}\cdot\int_{0}^{y}e^{t}\varphi(e^{t})\,\mathrm{d}t=\sqrt{1-\beta\gamma}\cdot\int_{1}^{e^{y}}\varphi(s)\,\mathrm{d}s=\sqrt{1-\beta\gamma}\cdot\Phi(e^{y}).\qed
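The change of variables $s=e^{t}$ in this proof can be sanity-checked numerically. The sketch below is a hedged illustration rather than part of the proof: the parameter values, the sample points and the use of a composite Simpson rule are all assumptions made for the check. It evaluates both sides of Lemma 17 by quadrature, using the normalizations $\Psi(0)=0$ and $\Phi(1)=0$ .

```python
import math

def psi(y, beta, gamma):
    """Derivative of the potential Psi in log-ratio coordinates."""
    ey = math.exp(y)
    return math.sqrt((1.0 - beta * gamma) * ey / ((beta * ey + 1.0) * (ey + gamma)))

def phi(x, beta, gamma):
    """Derivative of the potential Phi in ratio coordinates."""
    return 1.0 / math.sqrt(x * (beta * x + 1.0) * (x + gamma))

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals; b < a is allowed."""
    step = (b - a) / n
    total = f(a) + f(b)
    for j in range(1, n):
        total += (4.0 if j % 2 else 2.0) * f(a + j * step)
    return total * step / 3.0

beta, gamma = 0.3, 1.5  # hypothetical values with beta * gamma < 1
scale = math.sqrt(1.0 - beta * gamma)
gaps = []
for y in (-2.0, -0.5, 0.7, 1.3):
    Psi_y = simpson(lambda t: psi(t, beta, gamma), 0.0, y)             # Psi(0) = 0
    Phi_ey = simpson(lambda s: phi(s, beta, gamma), 1.0, math.exp(y))  # Phi(1) = 0
    gaps.append(abs(Psi_y - scale * Phi_ey))
print(max(gaps))
```

The printed gap reflects only quadrature and rounding error; the check covers finite $y$ and does not address the endpoints $\pm\infty$ .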

Combining the results of Lemmas 12, 13 and 14 from [LLY13], we get that the potential function $\Phi$ satisfies the following gradient bound when $(\beta,\gamma,\lambda)$ is in the uniqueness region. Note that this can be regarded as the Contraction condition but for $\Phi$ and $F_{d}$ .

Theorem 18 ([LLY13]).

Let $S_{\Phi}=\Phi[0,+\infty]$ be the image of $\Phi$ . If the parameters $(\beta,\gamma,\lambda)$ are up-to- $\Delta$ unique with gap $\delta\in(0,1)$ , then for every integer $d$ such that $1\leq d<\Delta$ and every $(\tilde{x}_{1},\dots,\tilde{x}_{d})\in S_{\Phi}^{d}$ ,

\left\|{\nabla F_{d}^{\Phi}(\tilde{x}_{1},\dots,\tilde{x}_{d})}\right\|_{1}\leq\sqrt{1-\delta}

where $F_{d}^{\Phi}=\Phi\circ F_{d}\circ\Phi^{-1}$ .

Recall our definition from Section 1.1. The tree recursion, in terms of the log marginal ratios, is described by the function $H_{d}:[-\infty,+\infty]^{d}\to[-\infty,+\infty]$ where $1\leq d\leq\Delta$ such that for every $(y_{1},\dots,y_{d})\in[-\infty,+\infty]^{d}$ ,

H_{d}(y_{1},\dots,y_{d})=\log\lambda+\sum_{i=1}^{d}\log\left(\frac{\beta e^{y_{i}}+1}{e^{y_{i}}+\gamma}\right).

Observe that $H_{d}=\log\circ F_{d}\circ\exp$ , since we move from ratios to log ratios. We are now ready to establish the Contraction condition for $\Psi$ .

Lemma 19.

Let $S_{\Psi}=\Psi[-\infty,+\infty]$ be the image of $\Psi$ . If the parameters $(\beta,\gamma,\lambda)$ are up-to- $\Delta$ unique with gap $\delta\in(0,1)$ , then for every integer $d$ such that $1\leq d<\Delta$ and every $(\tilde{y}_{1},\dots,\tilde{y}_{d})\in S_{\Psi}^{d}$ ,

\left\|{\nabla H_{d}^{\Psi}(\tilde{y}_{1},\dots,\tilde{y}_{d})}\right\|_{1}\leq\sqrt{1-\delta}

where $H_{d}^{\Psi}=\Psi\circ H_{d}\circ\Psi^{-1}$ .

Proof.

Define the linear function $a:{\mathbb{R}}\to{\mathbb{R}}$ to be $a(x)=\sqrt{1-\beta\gamma}\cdot x$ for $x\in{\mathbb{R}}$ . Then Lemma 17 gives $\Psi=a\circ\Phi\circ\exp$ , and thereby $\Psi\circ\log=a\circ\Phi$ . It follows that for every $1\leq d<\Delta$ ,

H_{d}^{\Psi}=\Psi\circ H_{d}\circ\Psi^{-1}=\Psi\circ\log\circ F_{d}\circ\exp\circ\Psi^{-1}=a\circ\Phi\circ F_{d}\circ\Phi^{-1}\circ a^{-1}=a\circ F_{d}^{\Phi}\circ a^{-1}.

That means, for every $(\tilde{y}_{1},\dots,\tilde{y}_{d})\in S_{\Psi}^{d}$ we have

H_{d}^{\Psi}(\tilde{y}_{1},\dots,\tilde{y}_{d})=\sqrt{1-\beta\gamma}\cdot F_{d}^{\Phi}(\tilde{x}_{1},\dots,\tilde{x}_{d})

where $\tilde{x}_{i}=\tilde{y}_{i}/\sqrt{1-\beta\gamma}$ for $1\leq i\leq d$ . Then, for each $i$ ,

\frac{\partial}{\partial\tilde{y}_{i}}H_{d}^{\Psi}(\tilde{y}_{1},\dots,\tilde{y}_{d})=\sqrt{1-\beta\gamma}\cdot\frac{\partial}{\partial\tilde{x}_{i}}F_{d}^{\Phi}(\tilde{x}_{1},\dots,\tilde{x}_{d})\cdot\frac{\mathrm{d}\tilde{x}_{i}}{\mathrm{d}\tilde{y}_{i}}=\frac{\partial}{\partial\tilde{x}_{i}}F_{d}^{\Phi}(\tilde{x}_{1},\dots,\tilde{x}_{d}).

This implies that $\nabla H_{d}^{\Psi}(\tilde{y}_{1},\dots,\tilde{y}_{d})=\nabla F_{d}^{\Phi}(\tilde{x}_{1},\dots,\tilde{x}_{d})$ for all $(\tilde{y}_{1},\dots,\tilde{y}_{d})\in S_{\Psi}^{d}$ , and the lemma then follows from Theorem 18. ∎
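Lemma 19 bounds the gradient norm by $\sqrt{1-\delta}$ , whereas the Contraction condition is stated with the bound $1-\alpha$ . The elementary step connecting the two, spelled out here for completeness, is

\left(1-\frac{\delta}{2}\right)^{2}=1-\delta+\frac{\delta^{2}}{4}\geq 1-\delta\quad\Longrightarrow\quad\sqrt{1-\delta}\leq 1-\frac{\delta}{2}\quad\text{for all }\delta\in(0,1),

so the Contraction condition holds with $\alpha=\delta/2$ , as claimed at the beginning of this section.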

7 Remaining antiferromagnetic cases: $\sqrt{\beta\gamma}\leq\frac{\Delta-2}{\Delta}$ and $\gamma>1$

In this section, we discuss the case where $\sqrt{\beta\gamma}\leq\frac{\Delta-2}{\Delta}$ and $\gamma>1$ . As studied in [LLY13], in this case the uniqueness region is more complicated. For example, there exists a critical $\lambda_{c}^{*}>0$ such that the $2$ -spin system with $\lambda<\lambda_{c}^{*}$ is in the uniqueness region for arbitrary graphs; namely, $(\beta,\gamma,\lambda)$ is up-to- $\infty$ unique. To deal with large degrees, we need to relax the Boundedness condition in Definition 4 and define a more general version of $(\alpha,c)$ -potentials. We shall see that Theorem 5 still holds for this general $(\alpha,c)$ -potential. The reason is that, in order to bound the maximum eigenvalue of the influence matrix, it suffices to consider a vertex-weighted sum of absolute influences of a vertex with large degree.

Remark 20.

We give more background on the uniqueness region in Section E.1. Note that in a recent revision of [LLY13], the authors updated the descriptions of the uniqueness region for the case $\sqrt{\beta\gamma}\leq\frac{\Delta-2}{\Delta}$ and $\gamma>1$ , fixing a small error in the previous version. Statements and proofs in this section and Appendix E of this paper are also adjusted accordingly based on the new version of [LLY13].

Recall that our goal is to bound the maximum eigenvalue of the matrix $\mathcal{I}_{G}^{\sigma_{\Lambda}}$ . We can do this by upper bounding the absolute row sum $\sum_{v\in V\backslash\Lambda}|\mathcal{I}_{G}^{\sigma_{\Lambda}}(r\text{\scriptsize{~{}$\rightarrow$~{}}}v)|$ for fixed $r$ , thereby giving us a valid upper bound on $\lambda_{\max}(\mathcal{I}_{G}^{\sigma_{\Lambda}})$ . However, this approach does not work when $\sqrt{\beta\gamma}\leq\frac{\Delta-2}{\Delta}$ and $\gamma>1$ . In this case, the potential $\Psi$ fails to be an $(\alpha,c)$ -potential for a universal constant $c$ independent of $\Delta$ . In fact, no such $(\alpha,c)$ -potentials exist, as the absolute row sum $\sum_{v\in V\backslash\Lambda}|\mathcal{I}_{G}^{\sigma_{\Lambda}}(r\text{\scriptsize{~{}$\rightarrow$~{}}}v)|$ can be as large as $\Theta(\Delta)$ . In particular, if the parameters $(\beta,\gamma,\lambda)$ are up-to- $\infty$ unique, which means the spin system has uniqueness for arbitrary graphs, then the absolute row sum $\sum_{v\in V\backslash\Lambda}|\mathcal{I}_{G}^{\sigma_{\Lambda}}(r\text{\scriptsize{~{}$\rightarrow$~{}}}v)|$ can be $\Theta(n)$ where $n=|V|$ . We give a specific example where this is the case.

Example 21.

Consider the antiferromagnetic 2-spin system specified by parameters $\beta=0$ , $\gamma>1$ and $\lambda>0$ on the star graph centered at $r$ with $\Delta$ leaves. A simple calculation reveals that $\left|{\mathcal{I}_{G}(r\text{\scriptsize{~{}$\rightarrow$~{}}}v)}\right|=\frac{\lambda}{\lambda+\gamma}$ for any leaf vertex $v\neq r$ . Hence, $\sum_{v\neq r}\left|{\mathcal{I}_{G}(r\text{\scriptsize{~{}$\rightarrow$~{}}}v)}\right|=\Delta\cdot\frac{\lambda}{\lambda+\gamma}$ . Now, since $\gamma>1$ , we have

\lambda_{c}=\lambda_{c}(\gamma,\Delta)=\min_{1<d<\Delta}\frac{\gamma^{d+1}d^{d}}{(d-1)^{d+1}}=\Theta_{\gamma}(1),

forcing $\sum_{v\neq r}\left|{\mathcal{I}_{G}(r\text{\scriptsize{~{}$\rightarrow$~{}}}v)}\right|=\Theta_{\gamma}(\Delta)$ even when $\lambda<\lambda_{c}$ lies in the uniqueness region. However, we still have $\lambda_{\max}(\mathcal{I}_{G})=O(1)$ since $\sum_{v\neq r}|\mathcal{I}_{G}(v\text{\scriptsize{~{}$\rightarrow$~{}}}r)|=O(1)$ .
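The contrast between the row sum and the column sum in Example 21 can be observed directly on small stars. The following brute-force sketch is illustrative only: the values $\gamma=2$ and $\lambda=1$ and the range of $\Delta$ are assumptions, and the influence is taken to be the difference of conditional probabilities of spin $1$ . It also evaluates the displayed formula for $\lambda_{c}$ to confirm that the chosen $\lambda$ lies below it.

```python
import itertools

def star_prob(num_leaves, beta, gamma, lam, target, cond, spin):
    """P(sigma_target = 1 | sigma_cond = spin) on a star; vertex 0 is the center r."""
    num = den = 0.0
    for sigma in itertools.product((0, 1), repeat=num_leaves + 1):
        if sigma[cond] != spin:
            continue
        w = lam ** sum(sigma)  # field lam for every vertex with spin 1
        for leaf in range(1, num_leaves + 1):
            if sigma[0] == sigma[leaf]:
                w *= beta if sigma[0] == 1 else gamma
        den += w
        num += w * sigma[target]
    return num / den

def influence(num_leaves, beta, gamma, lam, src, dst):
    """Influence of src on dst, taken as P(dst = 1 | src = 1) - P(dst = 1 | src = 0)."""
    return (star_prob(num_leaves, beta, gamma, lam, dst, src, 1)
            - star_prob(num_leaves, beta, gamma, lam, dst, src, 0))

beta, gamma, lam = 0.0, 2.0, 1.0  # hypothetical parameters with beta = 0 and gamma > 1
results = {}
for delta in range(3, 9):
    leaves = range(1, delta + 1)
    row_sum = sum(abs(influence(delta, beta, gamma, lam, 0, v)) for v in leaves)
    col_sum = sum(abs(influence(delta, beta, gamma, lam, v, 0)) for v in leaves)
    # Threshold from the displayed formula, minimizing over 1 < d < delta.
    lambda_c = min(gamma ** (d + 1) * d ** d / (d - 1) ** (d + 1) for d in range(2, delta))
    results[delta] = (row_sum, col_sum, lambda_c)
    print(delta, round(row_sum, 4), round(col_sum, 4), round(lambda_c, 2))
```

In this small range the row sum grows linearly in $\Delta$ , matching $\Delta\cdot\frac{\lambda}{\lambda+\gamma}$ , while the column sum stays below $1$ .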

To solve this issue, one might want to consider the absolute column sum, involving the sum of absolute influences on a fixed vertex. However, this will not allow us to use the beautiful connection between graphs and SAW trees as shown in 8. Instead, we consider here a vertex-weighted version of the absolute row sum of $\mathcal{I}_{G}^{\sigma_{\Lambda}}$ , which also upper bounds the maximum eigenvalue.

Lemma 22.

Let $\rho:V\to\mathbb{R}^{+}$ be a positive weight function of vertices. If there is a constant $\xi>0$ such that for every $r\in V$ we have

\sum_{v\in V\backslash\Lambda}\rho_{v}\cdot\left|\mathcal{I}_{G}^{\sigma_{\Lambda}}(r\text{\scriptsize{~{}$\rightarrow$~{}}}v)\right|\leq\xi\cdot\rho_{r},

(8)

then $\lambda_{\mathrm{max}}(\mathcal{I}_{G}^{\sigma_{\Lambda}})\leq\xi$ .

Proof.

Let $\mathcal{P}=\mathrm{diag}\{\rho_{v}:v\in V\backslash\Lambda\}$ . Then the assumption is equivalent to $\|\mathcal{P}^{-1}\mathcal{I}_{G}^{\sigma_{\Lambda}}\mathcal{P}\|_{\infty}\leq\xi$ . It follows that $\lambda_{\max}(\mathcal{I}_{G}^{\sigma_{\Lambda}})=\lambda_{\max}(\mathcal{P}^{-1}\mathcal{I}_{G}^{\sigma_{\Lambda}}\mathcal{P})\leq\xi$ . ∎
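The equivalence used in this proof can be unpacked entrywise. This is a routine expansion that relies only on two standard facts: similar matrices have the same eigenvalues, and every eigenvalue is bounded in absolute value by the induced $\infty$ -norm (the maximum absolute row sum). Writing the $(r,v)$ entry of $\mathcal{I}_{G}^{\sigma_{\Lambda}}$ as $\mathcal{I}_{G}^{\sigma_{\Lambda}}(r\text{\scriptsize{~{}$\rightarrow$~{}}}v)$ , we have

\left(\mathcal{P}^{-1}\mathcal{I}_{G}^{\sigma_{\Lambda}}\mathcal{P}\right)(r,v)=\frac{\rho_{v}}{\rho_{r}}\cdot\mathcal{I}_{G}^{\sigma_{\Lambda}}(r\text{\scriptsize{~{}$\rightarrow$~{}}}v),\qquad\text{so that}\qquad\left\|\mathcal{P}^{-1}\mathcal{I}_{G}^{\sigma_{\Lambda}}\mathcal{P}\right\|_{\infty}=\max_{r\in V\backslash\Lambda}\frac{1}{\rho_{r}}\sum_{v\in V\backslash\Lambda}\rho_{v}\cdot\left|\mathcal{I}_{G}^{\sigma_{\Lambda}}(r\text{\scriptsize{~{}$\rightarrow$~{}}}v)\right|\leq\xi.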

We then modify the definition of $(\alpha,c)$ -potentials from Definition 4 to allow a weaker Boundedness condition. The only two differences between Definition 23 and Definition 4 are that we allow $\Delta=\infty$ and that the Boundedness condition is relaxed to what we call General Boundedness. Recall that for every $0\leq d<\Delta$ , we let $J_{d}=\left[{\log(\lambda\beta^{d}),\log(\lambda/\gamma^{d})}\right]$ when $\beta\gamma<1$ , and $J_{d}=\left[{\log(\lambda/\gamma^{d}),\log(\lambda\beta^{d})}\right]$ when $\beta\gamma>1$ .

Definition 23 (General $(\alpha,c)$ -potential function).

Let $\Delta\geq 3$ be an integer or $\Delta=\infty$ . Let $\beta,\gamma,\lambda$ be reals such that $0\leq\beta\leq\gamma$ , $\gamma>0$ and $\lambda>0$ . Let $\Psi:[-\infty,+\infty]\to(-\infty,+\infty)$ be a differentiable and increasing function with image $S=\Psi[-\infty,+\infty]$ and derivative $\psi=\Psi^{\prime}$ . For any $\alpha\in(0,1)$ and $c>0$ , we say $\Psi$ is a general $(\alpha,c)$ -potential function with respect to $\Delta$ and $(\beta,\gamma,\lambda)$ if it satisfies the following conditions:

1.

(Contraction) For every integer $d$ such that $1\leq d<\Delta$ and every $(\tilde{y}_{1},\dots,\tilde{y}_{d})\in S^{d}$ , we have

$\left\|{\nabla H_{d}^{\Psi}(\tilde{y}_{1},\dots,\tilde{y}_{d})}\right\|_{1}=\sum_{i=1}^{d}\frac{\psi(y)}{\psi(y_{i})}\cdot|h(y_{i})|\leq 1-\alpha$

where $H_{d}^{\Psi}=\Psi\circ H_{d}\circ\Psi^{-1}$ , $y_{i}=\Psi^{-1}(\tilde{y}_{i})$ for $1\leq i\leq d$ , and $y=H_{d}(y_{1},\dots,y_{d})$ .
2.

(General Boundedness) For all integers $d_{1},d_{2}$ such that $0\leq d_{1},d_{2}<\Delta$ , and all reals $y_{1}\in J_{d_{1}},y_{2}\in J_{d_{2}}$ , we have

$\frac{\psi(y_{2})}{\psi(y_{1})}\cdot|h(y_{1})|\leq\frac{2c}{d_{1}+d_{2}+2}.$

Notice that General Boundedness is a weaker condition than Boundedness. To see this, if a potential function $\Psi$ satisfies Boundedness with parameter $c$ , then for every $0\leq d_{i}<\Delta$ and every $y_{i}\in J_{d_{i}}$ where $i=1,2$ we have

\frac{\psi(y_{2})}{\psi(y_{1})}\cdot|h(y_{1})|\leq\frac{c}{\Delta}\leq\frac{2c}{d_{1}+d_{2}+2}.
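The second inequality in this display uses only the degree constraint $0\leq d_{1},d_{2}<\Delta$ . Written out for finite $\Delta$ ,

d_{1},d_{2}\leq\Delta-1\quad\Longrightarrow\quad d_{1}+d_{2}+2\leq 2\Delta\quad\Longrightarrow\quad\frac{c}{\Delta}=\frac{2c}{2\Delta}\leq\frac{2c}{d_{1}+d_{2}+2}.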

The following theorem generalizes Theorem 5 and shows that a general $(\alpha,c)$ -potential function is sufficient to establish rapid mixing of the Glauber dynamics.

Theorem 24.

Let $\Delta\geq 3$ be an integer or $\Delta=+\infty$ . Let $\beta,\gamma,\lambda$ be reals such that $0\leq\beta\leq\gamma$ , $\gamma>0$ and $\lambda>0$ . Suppose that there is a general $(\alpha,c)$ -potential with respect to $\Delta$ and $(\beta,\gamma,\lambda)$ for some $\alpha\in(0,1)$ and $c>0$ . Then for every $n$ -vertex graph $G$ of maximum degree $\Delta$ , the mixing time of the Glauber dynamics for the $2$ -spin system on $G$ with parameters $(\beta,\gamma,\lambda)$ is $O(n^{2+2c/\alpha})$ .

We then give a counterpart of 10, showing that $\Psi$ is a general $(\alpha,c)$ -potential when $\sqrt{\beta\gamma}\leq\frac{\Delta-2}{\Delta}$ and $\gamma>1$ . 3 for this case is then obtained from 24 and 25.

Lemma 25.

Let $\Delta\geq 3$ be an integer. Let $\beta,\gamma,\lambda$ be reals such that $0\leq\beta<1<\gamma$ and $\sqrt{\beta\gamma}\leq\frac{\Delta-2}{\Delta}$ . Assume that $(\beta,\gamma,\lambda)$ is up-to- $\Delta$ unique with gap $\delta\in(0,1)$ . Then the function $\Psi$ defined implicitly by Eq. 4 is a general $(\alpha,c)$ -potential function with $\alpha\geq\delta/2$ and $c\leq 18$ ; we can further take $c\leq 4$ if $\beta=0$ .

The proof of Theorem 24 can be found in Appendix D. For Lemma 25, the Contraction condition of $\Psi$ follows from Lemma 19, and General Boundedness is proved in Appendix E together with all other cases.

8 Ferromagnetic cases

In the ferromagnetic case, the best known correlation decay results are given in [GL18, SS20]. Using the potential functions in [GL18] and [SS20], we show the following two results, which match the known correlation decay results. In fact, the potential function from [SS20] turns out to be an $(\alpha,c)$ -potential function for constants $\alpha=\Theta(\delta)$ and $c\leq O(1)$ .

Theorem 26.

Fix an integer $\Delta\geq 3$ , positive real numbers $\beta,\gamma,\lambda$ and $0<\delta<1$ , and assume $(\beta,\gamma,\lambda)$ satisfies one of the following three conditions:

1.

$\frac{\Delta-2+\delta}{\Delta-\delta}\leq\sqrt{\beta\gamma}\leq\frac{\Delta-\delta}{\Delta-2+\delta}$ , and $\lambda$ is arbitrary;
2.

$\sqrt{\beta\gamma}\geq\frac{\Delta}{\Delta-2}$ and $\lambda\leq(1-\delta)\frac{\gamma}{\max\{1,\beta^{\Delta-1}\}\cdot((\Delta-2)\beta\gamma-\Delta)}$ ;
3.

$\sqrt{\beta\gamma}\geq\frac{\Delta}{\Delta-2}$ and $\lambda\geq\frac{1}{1-\delta}\cdot\frac{(\Delta-2)\beta\gamma-\Delta}{\beta\cdot\min\{1,1/\gamma^{\Delta-1}\}}$ .

Then the identity function $\Psi(y)=y$ (based on the potential given in [SS20]) is an $(\alpha,c)$ -potential function for $\alpha=\Theta(\delta)$ and $c\leq O(1)$ . Furthermore, for every $n$ -vertex graph $G$ of maximum degree at most $\Delta$ , the mixing time of the Glauber dynamics for the 2-spin system on $G$ with parameters $(\beta,\gamma,\lambda)$ is $O(n^{2+c/\delta})$ , for a universal constant $c>0$ .

Remark 3.

Condition 1 includes both the ferromagnetic case $1<\sqrt{\beta\gamma}\leq\frac{\Delta-\delta}{\Delta-2+\delta}$ and the antiferromagnetic case $\frac{\Delta-2+\delta}{\Delta-\delta}\leq\sqrt{\beta\gamma}<1$ . Note that in both cases $(\beta,\gamma,\lambda)$ is up-to- $\Delta$ unique with gap $\delta$ . For the antiferromagnetic case, the identity function $\Psi$ is an $(\alpha,c)$ -potential with $c\leq 1.5$ and a better contraction rate $\alpha\geq\delta$ , compared with the bound $\alpha\geq\delta/2$ of the potential $\Psi$ given by Eq. 4 in 10. For the ferromagnetic case with $\beta=\gamma>1$ (Ising model), a stronger result by [MS13] was known, which gives $O(n\log n)$ mixing.

The potential function from [GL18] is indeed an $(\alpha,c)$ -potential, but $c$ must, unfortunately, depend on $\Delta$ . We have the following result, which is weaker than the correlation decay algorithm in [GL18] for unbounded degree graphs.

Theorem 27.

Fix an integer $\Delta\geq 3$ , and nonnegative real numbers $\beta,\gamma,\lambda$ satisfying $\beta\leq 1\leq\gamma$ , $\sqrt{\beta\gamma}\geq\frac{\Delta}{\Delta-2}$ , and $\lambda<\left({\frac{\gamma}{\beta}}\right)^{\frac{\sqrt{\beta\gamma}}{\sqrt{\beta\gamma}-1}}$ . Then for every $n$ -vertex graph $G$ with maximum degree at most $\Delta$ , the mixing time of the Glauber dynamics for the ferromagnetic 2-spin system on $G$ with parameters $(\beta,\gamma,\lambda)$ is $O(n^{C})$ , for a constant $C$ depending only on $\beta,\gamma,\lambda,\Delta$ , but not $n$ .

Proofs of these theorems are provided in Appendix F.

References

[AL20] Vedat Levi Alev and Lap Chi Lau “Improved analysis of higher order random walks and applications” In Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing (STOC), 2020, pp. 1198–1211
[ALO20] Nima Anari, Kuikui Liu and Shayan Oveis Gharan “Spectral Independence in High-Dimensional Expanders and Applications to the Hardcore Model” In arXiv preprint arXiv:2001.00303, 2020
[Bar16] Alexander Barvinok “Combinatorics and Complexity of Partition Functions” Algorithms and Combinatorics, Springer, 2016
[Ben18] Ferenc Bencs “On trees with real-rooted independence polynomial” In Discrete Mathematics 341.12 Elsevier, 2018, pp. 3321–3330
[GL18] Heng Guo and Pinyan Lu “Uniqueness, Spatial Mixing, and Approximation for Ferromagnetic 2-Spin Systems” In ACM Transactions on Computation Theory 10.4, 2018
[GŠV16] Andreas Galanis, Daniel Štefankovič and Eric Vigoda “Inapproximability of the Partition Function for the Antiferromagnetic Ising and Hard-Core Models” In Combinatorics, Probability and Computing 25.4 Cambridge University Press, 2016, pp. 500–559
[JS93] Mark Jerrum and Alistair Sinclair “Polynomial-time approximation algorithms for the Ising model” In SIAM Journal on Computing 22.5 SIAM, 1993, pp. 1087–1116
[JVV86] Mark R Jerrum, Leslie G Valiant and Vijay V Vazirani “Random generation of combinatorial structures from a uniform distribution” In Theoretical Computer Science 43 Elsevier, 1986, pp. 169–188
[Kel85] F. P. Kelly “Stochastic Models of Computer Communication Systems” In Journal of the Royal Statistical Society. Series B (Methodological) 47.3, 1985, pp. 379–395
[LLY12] Liang Li, Pinyan Lu and Yitong Yin “Approximate Counting via Correlation Decay in Spin Systems” In Proceedings of the 23rd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), 2012, pp. 922–940
[LLY13] Liang Li, Pinyan Lu and Yitong Yin “Correlation Decay Up to Uniqueness in Spin Systems” In Proceedings of the 24th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), 2013, pp. 67–84
[LSS19] Jingcheng Liu, Alistair Sinclair and Piyush Srivastava “Fisher zeros and correlation decay in the Ising model” In Journal of Mathematical Physics 60.10 AIP Publishing LLC, 2019, pp. 103304
[MS13] Elchanan Mossel and Allan Sly “Exact thresholds for Ising-Gibbs samplers on general graphs” In Annals of Probability 41.1, 2013, pp. 294–328
[PR17] Viresh Patel and Guus Regts “Deterministic Polynomial-Time Approximation Algorithms for Partition Functions and Graph Polynomials” In SIAM Journal on Computing 46, 2017, pp. 1893–1919
[PR19] Han Peters and Guus Regts “On a conjecture of Sokal concerning roots of the independence polynomial” In The Michigan Mathematical Journal 68.1, 2019, pp. 33–55
[Sly10] Allan Sly “Computational Transition at the Uniqueness Threshold” In Proceedings of the 51st Annual IEEE Symposium on Foundations of Computer Science (FOCS), 2010, pp. 287–296
[SS05] Alexander D. Scott and Alan D. Sokal “The Repulsive Lattice Gas, the Independent-Set Polynomial, and the Lovász Local Lemma” In Journal of Statistical Physics 118.5, 2005, pp. 1151–1261
[SS14] Allan Sly and Nike Sun “The Computational Hardness of Counting in Two-Spin Models on $d$ -Regular Graphs” In The Annals of Probability 42.6, 2014, pp. 2383–2416
[SS20] Shuai Shao and Yuxin Sun “Contraction: A Unified Perspective of Correlation Decay and Zero-Freeness of 2-Spin Systems” In Proceedings of the 47th International Colloquium on Automata, Languages, and Programming (ICALP), 2020, pp. 96:1–15
[SST14] Alistair Sinclair, Piyush Srivastava and Marc Thurley “Approximation Algorithms for Two-State Anti-Ferromagnetic Spin Systems on Bounded Degree Graphs” In Journal of Statistical Physics 155.4, 2014, pp. 666–686
[ŠVV09] Daniel Štefankovič, Santosh Vempala and Eric Vigoda “Adaptive simulated annealing: A near-optimal connection between sampling and counting” In Journal of the ACM 56.3, 2009, pp. 1–36
[Wei06] Dror Weitz “Counting Independent Sets Up to the Tree Threshold” In Proceedings of the 38th Annual ACM Symposium on Theory of Computing (STOC), 2006, pp. 140–149

Appendix A Proof of main results

In this section we give the proofs of 1, 2, 3 and 5.

Proof of 5.

Since the transition matrix $P$ of the Glauber dynamics has only nonnegative eigenvalues, we have $\lambda^{*}(P)=1-\lambda_{2}(P)$ , so in order to deduce mixing it suffices to lower bound $1-\lambda_{2}(P)$ . We do this by employing 7: it suffices to show $(\eta_{0},\dots,\eta_{n-2})$ -spectral independence for sufficiently small $\eta_{i}$ .

To bound $\eta_{i}$ , it suffices to bound $\sum_{v\in V\backslash\{r\}}\left|{\mathcal{I}_{G}^{\sigma_{\Lambda}}(r\text{\scriptsize{~{}$\rightarrow$~{}}}v)}\right|$ for all graphs $G=(V,E)$ with $n=|V|$ vertices and all boundary conditions $\sigma_{\Lambda}$ on a subset $\Lambda$ of $i$ vertices. We claim the following:

\sum_{v\in V\backslash\{r\}}\left|{\mathcal{I}_{G}^{\sigma_{\Lambda}}(r\text{\scriptsize{~{}$\rightarrow$~{}}}v)}\right|\leq\min\left\{{\frac{c}{\alpha},C(n-i-1)}\right\}

(9)

where $C\in(0,1)$ is a constant depending only on $\beta,\gamma,\lambda,\Delta$ . The first upper bound $\frac{c}{\alpha}$ is deduced by

$\displaystyle\sum_{v\in V\backslash\{r\}}\left|{\mathcal{I}_{G}^{\sigma_{\Lambda}}(r\text{\scriptsize{~{}$\rightarrow$~{}}}v)}\right|$	$\displaystyle\leq\sum_{v\in V_{T}\backslash\{r\}}\left|{\mathcal{I}_{T}^{\sigma_{\Lambda}}(r\text{\scriptsize{~{}$\rightarrow$~{}}}v)}\right|$	(8; $T=T_{\textsc{saw}}(G,r)$ )
	$\displaystyle=\sum_{k=1}^{\infty}\sum_{v\in L_{r}(k)}\left|{\mathcal{I}_{T}^{\sigma_{\Lambda}}(r\text{\scriptsize{~{}$\rightarrow$~{}}}v)}\right|$	(split the sum by levels)
	$\displaystyle\leq c\sum_{k=1}^{\infty}(1-\alpha)^{k-1}$	(9)
	$\displaystyle=\frac{c}{\alpha}.$

The second upper bound $C(n-i-1)$ is immediate: each absolute pairwise influence $\left|{\mathcal{I}_{G}^{\sigma_{\Lambda}}(r\text{\scriptsize{~{}$\rightarrow$~{}}}v)}\right|$ is at most some constant $C$ , and hence the sum of absolute influences is upper bounded by $C(n-i-1)$ . The following two claims, whose proofs are provided in Section A.2, give a more precise statement.

Claim 28 (Antiferromagnetic Case).

Fix an integer $\Delta\geq 3$ and real numbers $\beta,\gamma,\lambda$ , and assume $0\leq\beta\leq\gamma$ , $\gamma>0$ , $\beta\gamma<1$ and $\lambda>0$ . Then for every $n$ -vertex graph $G$ of maximum degree at most $\Delta$ , the antiferromagnetic $2$ -spin system on $G$ with parameters $(\beta,\gamma,\lambda)$ is $C n$ -spectrally independent, for a constant $0<C<1$ depending only on $\beta,\gamma,\lambda,\Delta$ . Furthermore, if $(\beta,\gamma,\Delta)$ is up-to- $\Delta$ unique, then we can drop the dependence on $\Delta$ .

Claim 29 (Ferromagnetic Case).

Fix an integer $\Delta\geq 3$ and positive real numbers $\beta,\gamma,\lambda$ , and assume $\beta\leq\gamma$ and $\beta\gamma>1$ . Then for every $n$ -vertex graph $G$ of maximum degree at most $\Delta$ , the ferromagnetic $2$ -spin system on $G$ with parameters $(\beta,\gamma,\lambda)$ is $C n$ -spectrally independent, for a constant $0<C<1$ depending only on $\beta,\gamma,\lambda,\Delta$ .

With Eq. 9 in hand, we immediately see that by 7,

1-\lambda_{2}(P)\geq\frac{1}{n}\,\prod_{i=0}^{n-2}\left({1-\frac{\eta_{i}}{n-i-1}}\right)\geq\frac{1}{n}\cdot(1-C)^{2\lceil c/\alpha\rceil-1}\cdot\prod_{i=0}^{n-2\lceil c/\alpha\rceil-1}\left({1-\frac{c}{\alpha}\cdot\frac{1}{n-i-1}}\right).

Using the fact that $1-x\geq\exp(-x-x^{2})$ for all $0\leq x\leq\frac{1}{2}$ (which can be proved straightforwardly by calculus), we get

\displaystyle\prod_{i=0}^{n-2\lceil c/\alpha\rceil-1}\left({1-\frac{c}{\alpha}\cdot\frac{1}{n-i-1}}\right)=\prod_{j=2\lceil c/\alpha\rceil}^{n-1}\left({1-\frac{c}{\alpha}\cdot\frac{1}{j}}\right)\geq\exp\left(-\frac{c}{\alpha}\sum_{j=2\lceil c/\alpha\rceil}^{n-1}\frac{1}{j}-\frac{c^{2}}{\alpha^{2}}\sum_{j=2\lceil c/\alpha\rceil}^{n-1}\frac{1}{j^{2}}\right).

Now since

\sum_{j=2\lceil c/\alpha\rceil}^{n-1}\frac{1}{j}\leq\sum_{j=2}^{n}\frac{1}{j}\leq\int_{1}^{n}\frac{dx}{x}=\log n

and

\sum_{j=2\lceil c/\alpha\rceil}^{n-1}\frac{1}{j^{2}}\leq\sum_{j=2}^{\infty}\frac{1}{j(j-1)}=1,

we deduce that

1-\lambda_{2}(P)\geq(1-C)^{2\lceil c/\alpha\rceil-1}\cdot e^{-(c/\alpha)^{2}}\cdot n^{-(1+c/\alpha)}.

The theorem then follows from Eq. 1. ∎
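The chain of inequalities above can be sanity-checked numerically. The following is a minimal sketch, not part of the proof: it takes $\eta_{i}=\min\{c/\alpha,C(n-i-1)\}$ as in Eq. 9, evaluates the product from 7 directly, and compares it with the closed-form lower bound. The function names and the parameter `r` (standing for $c/\alpha$) are illustrative choices.

```python
import math

def exact_product_bound(n, r, C):
    """(1/n) * prod_{i=0}^{n-2} (1 - eta_i/(n-i-1)) with eta_i = min(r, C*(n-i-1)).

    Here r plays the role of c/alpha and C is the constant from Eq. 9.
    """
    prod = 1.0 / n
    for i in range(n - 1):      # i = 0, ..., n-2
        j = n - i - 1           # j = n-1, ..., 1
        eta = min(r, C * j)
        prod *= 1.0 - eta / j
    return prod

def closed_form_bound(n, r, C):
    """(1-C)^{2*ceil(r)-1} * exp(-r^2) * n^{-(1+r)}, the final bound in the proof."""
    k = 2 * math.ceil(r) - 1
    return (1.0 - C) ** k * math.exp(-r * r) * n ** (-(1.0 + r))

if __name__ == "__main__":
    for n in (10, 50, 200):
        for r in (0.5, 3.0):
            for C in (0.5, 0.9):
                print(n, r, C, exact_product_bound(n, r, C), closed_form_bound(n, r, C))
```

In every printed row the directly evaluated product should be at least the closed-form value, which is typically far from tight.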

Proof of 3.

We leverage 5 and 24, which show $O(n^{2+\frac{c}{\alpha}})$ mixing as long as there is an $(\alpha,c)$ -potential, or $O(n^{2+\frac{2c}{\alpha}})$ mixing if there is a general $(\alpha,c)$ -potential. We use the potential given by Eq. 4, which is an adaptation of the potential function in [LLY13] to the log marginal ratios. When $(\beta,\gamma,\lambda)$ is up-to- $\Delta$ unique with gap $\delta\in(0,1)$ , it is an $(\alpha,c)$ -potential or a general $(\alpha,c)$ -potential by 10 and 25, with $\alpha\geq\delta/2$ and $c$ a universal constant specified by the range of parameters. The theorem then follows. ∎

Proof of 1.

By 30 later in Section A.1, $\lambda\leq(1-\delta)\lambda_{c}(\Delta)$ implies up-to- $\Delta$ uniqueness with gap $\geq\delta/4$ . Since $\gamma\leq 1$ , we can again appeal to 10 to obtain an $(\alpha,c)$ -potential with $\alpha\geq\delta/8$ and $c\leq 4$ . 1 then follows by 5 with $O(n^{2+32/\delta})$ mixing. ∎

Proof of 2.

By 31 later in Section A.1, $\beta\geq\beta_{c}(\Delta)+\delta(1-\beta_{c}(\Delta))$ implies up-to- $\Delta$ uniqueness with gap $\delta$ . Again, appealing to 10, we obtain an $(\alpha,c)$ -potential with $\alpha\geq\delta/2$ and $c\leq 1.5$ . 2 then follows by 5 with $O(n^{2+3/\delta})$ mixing.

Though we technically get $O(n^{2+3/\delta})$ by using the [LLY13] potential, we can improve it to $O(n^{2+1.5/\delta})$ mixing by using the trivial identity function as the potential. See the first case of 26 (proved in Section F.1) and Remark 3. ∎

A.1 Uniqueness gaps in terms of parameter gaps

In this section we state and prove 30 and 31, which relate the parameter gaps with the uniqueness gaps.

Claim 30 (Hardcore Model; Lemma C.1 from [ALO20]).

Fix an integer $\Delta\geq 3$ , $0<\delta<1$ , and $\beta=0,\gamma>0$ . If $\lambda\leq(1-\delta)\lambda_{c}(\gamma,\Delta)$ , then $(\beta,\gamma,\lambda)$ is up-to- $\Delta$ unique with gap $\delta/4$ .

Claim 31 (Large $\sqrt{\beta\gamma}$ ).

Fix an integer $\Delta\geq 3$ , and $0<\delta<1$ . If $\sqrt{\beta\gamma}\geq\frac{\Delta-2}{\Delta}+\delta\left({1-\frac{\Delta-2}{\Delta}}\right)=\frac{\Delta-2(1-\delta)}{\Delta}$ , then $(\beta,\gamma,\lambda)$ is up-to- $\Delta$ unique with gap $\delta$ for all $\lambda$ . Note that if $\beta=\gamma$ , this is precisely the condition $\beta\geq\beta_{c}(\Delta)+\delta(1-\beta_{c}(\Delta))$ .

Proof.

Consider the univariate recursion for the marginal ratios with $d<\Delta$ children, $f_{d}(R)=\lambda\left({\frac{\beta R+1}{R+\gamma}}\right)^{d}$ . Differentiating, we have

	$\displaystyle f_{d}^{\prime}(R)$	$\displaystyle=d\lambda\left({\frac{\beta R+1}{R+\gamma}}\right)^{d-1}\cdot\left({\frac{\beta}{R+\gamma}-\frac{\beta R+1}{(R+\gamma)^{2}}}\right)=-d(1-\beta\gamma)\lambda\left({\frac{\beta R+1}{R+\gamma}}\right)^{d}\cdot\frac{1}{(\beta R+1)(R+\gamma)}$
		$\displaystyle=-d(1-\beta\gamma)\cdot\frac{f_{d}(R)}{(\beta R+1)(R+\gamma)}.$

At the unique fixed point $R_{d}^{*}$ , we have $f_{d}(R_{d}^{*})=R_{d}^{*}$ so

\displaystyle\left|{f_{d}^{\prime}(R_{d}^{*})}\right|=d(1-\beta\gamma)\frac{R_{d}^{*}}{(\beta R_{d}^{*}+1)(R_{d}^{*}+\gamma)}.

By 37, we have the upper bound

\displaystyle\left|{f_{d}^{\prime}(R_{d}^{*})}\right|\leq d\cdot\frac{1-\beta\gamma}{(1+\sqrt{\beta\gamma})^{2}}=d\cdot\frac{1-\sqrt{\beta\gamma}}{1+\sqrt{\beta\gamma}}.

Since we assumed $\sqrt{\beta\gamma}\geq\frac{\Delta-2(1-\delta)}{\Delta}$ , we obtain

\displaystyle d\cdot\frac{1-\sqrt{\beta\gamma}}{1+\sqrt{\beta\gamma}}\leq d\cdot\frac{\Delta-(\Delta-2(1-\delta))}{\Delta+(\Delta-2(1-\delta))}=d\cdot\frac{1-\delta}{\Delta-1+\delta}\leq(1-\delta)\frac{d}{\Delta-1}.

As this is at most $1-\delta$ for all $d<\Delta$ , we have up-to- $\Delta$ uniqueness with gap $\delta$ . ∎
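The argument can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than part of the proof: it assumes $\beta\gamma<1$ , so that $f_{d}$ is decreasing and its fixed point lies in $[0,\lambda/\gamma^{d}]$ , locates $R_{d}^{*}$ by bisection, and evaluates $|f_{d}^{\prime}(R_{d}^{*})|$ through the formula derived above. All function names are hypothetical.

```python
def f(R, d, beta, gamma, lam):
    """Tree recursion f_d(R) = lam * ((beta*R + 1) / (R + gamma))^d."""
    return lam * ((beta * R + 1.0) / (R + gamma)) ** d

def fixed_point(d, beta, gamma, lam, iters=200):
    """Bisection for the unique fixed point, assuming beta*gamma < 1.

    In that case f_d is decreasing, so f_d(R) - R changes sign exactly once
    on [0, f_d(0)] = [0, lam / gamma^d].
    """
    lo, hi = 0.0, lam / gamma ** d
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid, d, beta, gamma, lam) > mid:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def abs_derivative_at_fixed_point(d, beta, gamma, lam):
    """|f_d'(R*)| = d * (1 - beta*gamma) * R* / ((beta*R* + 1) * (R* + gamma))."""
    R = fixed_point(d, beta, gamma, lam)
    return d * (1.0 - beta * gamma) * R / ((beta * R + 1.0) * (R + gamma))

if __name__ == "__main__":
    Delta, delta = 5, 0.2
    beta = gamma = 0.7   # sqrt(beta*gamma) = 0.7 >= (Delta - 2*(1-delta))/Delta = 0.68
    for lam in (0.1, 1.0, 10.0):
        for d in range(1, Delta):
            print(lam, d, abs_derivative_at_fixed_point(d, beta, gamma, lam))
```

With these example parameters every printed value should stay below $1-\delta=0.8$ , matching the claimed gap.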

A.2 Spectral independence bounds for constant-size graphs

In this section, we prove spectral independence bounds for graphs with only $O(c/\alpha)$ vertices, since for graphs with so few vertices our bounds based on contraction of the tree recursions become trivial.

Proof of 28.

If $R_{v}$ denotes the marginal ratio of a vertex $v\in G$ , then $R_{v}\geq\lambda\beta^{\Delta}$ . In the case $\gamma\leq 1$ , we have $R_{v}\leq\lambda/\gamma^{\Delta}$ as well; if $\gamma>1$ , we have $R_{v}\leq\lambda$ . It follows that we immediately have the bounds

\displaystyle\left|{\mathcal{I}_{G}(u\text{\scriptsize{~{}$\rightarrow$~{}}}v)}\right|\leq\begin{cases}\left|{\frac{\lambda}{\lambda+\gamma^{\Delta}}-\frac{\lambda\beta^{\Delta}}{1+\lambda\beta^{\Delta}}}\right|=\frac{\lambda(1-\beta^{\Delta}\gamma^{\Delta})}{(\lambda+\gamma^{\Delta})(1+\lambda\beta^{\Delta})},&\quad\text{if }\gamma\leq 1\\ \left|{\frac{\lambda}{1+\lambda}-\frac{\lambda\beta^{\Delta}}{1+\lambda\beta^{\Delta}}}\right|=\frac{\lambda(1-\beta^{\Delta})}{(\lambda+1)(1+\lambda\beta^{\Delta})},&\quad\text{o.w.}\end{cases}

for all $u,v\in G$ . Note that these constants are less than $1$ , and only depend on $\beta,\gamma,\lambda,\Delta$ , yielding the first claim.

Now, we proceed to remove the dependence on $\Delta$ when up-to- $\Delta$ uniqueness holds. We have the following cases:

1. If $\gamma>1$ , we immediately obtain a bound of $\frac{\lambda}{1+\lambda}$ which is independent of $\Delta$ .

2. If $\beta=0$ and $\gamma\leq 1$ , then $\frac{\lambda(1-\beta^{\Delta}\gamma^{\Delta})}{(\lambda+\gamma^{\Delta})(1+\lambda\beta^{\Delta})}=\frac{\lambda}{\lambda+\gamma^{\Delta}}\leq\frac{\lambda}{\gamma^{\Delta}}$ . Since $(\beta,\gamma,\lambda)$ is up-to- $\Delta$ unique, we must have $\lambda\leq\lambda_{c}(\gamma,\Delta)=\min_{1<d<\Delta}\frac{\gamma^{d+1}d^{d}}{(d-1)^{d+1}}\leq\frac{\gamma^{\Delta}(\Delta-1)^{\Delta-1}}{(\Delta-2)^{\Delta}}\leq\gamma^{\Delta}\cdot O(1/\Delta)$ . It follows that $\frac{\lambda}{\gamma^{\Delta}}\leq O(1/\Delta)$ .

3. If $\sqrt{\beta\gamma}>\frac{\Delta-2}{\Delta}$ and $\gamma\leq 1$ , then

$\displaystyle\frac{\lambda(1-\beta^{\Delta}\gamma^{\Delta})}{(\lambda+\gamma^{\Delta})(1+\lambda\beta^{\Delta})}\leq 1-\beta^{\Delta}\gamma^{\Delta}\approx 1-e^{-2}.$

4. If $\sqrt{\beta\gamma}\leq\frac{\Delta-2}{\Delta}$ , then let $\Delta_{0}$ be the maximal $1<d<\Delta$ such that $\sqrt{\beta\gamma}>\frac{d-2}{d}$ . If $\lambda\leq\lambda_{c}(\beta,\gamma,\Delta)$ , then by 35, we have

$\displaystyle\frac{\lambda(1-\beta^{\Delta}\gamma^{\Delta})}{(\lambda+\gamma^{\Delta})(1+\lambda\beta^{\Delta})}\leq\frac{\lambda}{\gamma^{\Delta}}\leq O(\Delta_{0}/\Delta).$

If $\lambda\geq\overline{\lambda}_{c}(\beta,\gamma,\Delta)$ , then again by 35, we have

$\frac{\lambda(1-\beta^{\Delta}\gamma^{\Delta})}{(\lambda+\gamma^{\Delta})(1+\lambda\beta^{\Delta})}\leq\frac{1}{\lambda\beta^{\Delta}}\leq O(\Delta_{0}/\Delta).\qed$

Proof of 29.

The proof is identical to the antiferromagnetic case and we omit it here. ∎

Appendix B Proof of 12 (Parts 1 and 2)

Proof of 12 (Parts 1 and 2).

To see the first equality, we compute directly and get

	$\displaystyle\left(\lambda_{v}\frac{\partial}{\partial\lambda_{v}}\right)\log Z^{\sigma_{\Lambda}}_{G}$	$\displaystyle=\frac{1}{Z^{\sigma_{\Lambda}}_{G}}\cdot\left(\lambda_{v}\frac{\partial}{\partial\lambda_{v}}\right)Z^{\sigma_{\Lambda}}_{G}$
		$\displaystyle=\frac{1}{Z^{\sigma_{\Lambda}}_{G}}\sum_{\sigma\in\{0,1\}^{V\backslash\Lambda}}\left(\lambda_{v}\frac{\partial}{\partial\lambda_{v}}\right)\left(\beta^{m_{1}(\sigma)}\gamma^{m_{0}(\sigma)}\prod_{w\in V}\lambda_{w}^{\sigma_{w}}\right)$
		$\displaystyle=\frac{1}{Z^{\sigma_{\Lambda}}_{G}}\sum_{\sigma\in\{0,1\}^{V\backslash\Lambda}}\sigma_{v}\left(\beta^{m_{1}(\sigma)}\gamma^{m_{0}(\sigma)}\prod_{w\in V}\lambda_{w}^{\sigma_{w}}\right)$
		$\displaystyle=\sum_{\sigma\in\{0,1\}^{V\backslash\Lambda}}\sigma_{v}\cdot\mu_{G}(\sigma\mid{\sigma_{\Lambda}})=M_{G}^{\sigma_{\Lambda}}(v).$

For Part 2, using the result above, we can also get

		$\displaystyle\left(\lambda_{v}\frac{\partial}{\partial\lambda_{v}}\right)\left(\lambda_{u}\frac{\partial}{\partial\lambda_{u}}\right)\log Z^{\sigma_{\Lambda}}_{G}$
	$\displaystyle={}$	$\displaystyle\left(\lambda_{v}\frac{\partial}{\partial\lambda_{v}}\right)\left(\frac{1}{Z^{\sigma_{\Lambda}}_{G}}\cdot\left(\lambda_{u}\frac{\partial}{\partial\lambda_{u}}\right)Z^{\sigma_{\Lambda}}_{G}\right)$
	$\displaystyle={}$	$\displaystyle\frac{1}{Z^{\sigma_{\Lambda}}_{G}}\cdot\left(\lambda_{v}\frac{\partial}{\partial\lambda_{v}}\right)\left(\lambda_{u}\frac{\partial}{\partial\lambda_{u}}\right)Z^{\sigma_{\Lambda}}_{G}-\frac{1}{(Z^{\sigma_{\Lambda}}_{G})^{2}}\cdot\left(\lambda_{v}\frac{\partial}{\partial\lambda_{v}}\right)Z^{\sigma_{\Lambda}}_{G}\cdot\left(\lambda_{u}\frac{\partial}{\partial\lambda_{u}}\right)Z^{\sigma_{\Lambda}}_{G}$
	$\displaystyle={}$	$\displaystyle\frac{1}{Z^{\sigma_{\Lambda}}_{G}}\cdot\left(\lambda_{v}\frac{\partial}{\partial\lambda_{v}}\right)\left(\sum_{\sigma\in\{0,1\}^{V\backslash\Lambda}}\sigma_{u}\left(\beta^{m_{1}(\sigma)}\gamma^{m_{0}(\sigma)}\prod_{w\in V}\lambda_{w}^{\sigma_{w}}\right)\right)-M_{G}^{\sigma_{\Lambda}}(u)\cdot M_{G}^{\sigma_{\Lambda}}(v)$
	$\displaystyle={}$	$\displaystyle\frac{1}{Z^{\sigma_{\Lambda}}_{G}}\sum_{\sigma\in\{0,1\}^{V\backslash\Lambda}}\sigma_{u}\cdot\left(\lambda_{v}\frac{\partial}{\partial\lambda_{v}}\right)\left(\beta^{m_{1}(\sigma)}\gamma^{m_{0}(\sigma)}\prod_{w\in V}\lambda_{w}^{\sigma_{w}}\right)-M_{G}^{\sigma_{\Lambda}}(u)\cdot M_{G}^{\sigma_{\Lambda}}(v)$
	$\displaystyle={}$	$\displaystyle\frac{1}{Z^{\sigma_{\Lambda}}_{G}}\sum_{\sigma\in\{0,1\}^{V\backslash\Lambda}}\sigma_{u}\cdot\sigma_{v}\left(\beta^{m_{1}(\sigma)}\gamma^{m_{0}(\sigma)}\prod_{w\in V}\lambda_{w}^{\sigma_{w}}\right)-M_{G}^{\sigma_{\Lambda}}(u)\cdot M_{G}^{\sigma_{\Lambda}}(v)$
	$\displaystyle={}$	$\displaystyle\sum_{\sigma\in\{0,1\}^{V\backslash\Lambda}}\sigma_{u}\cdot\sigma_{v}\cdot\mu_{G}(\sigma\mid{\sigma_{\Lambda}})-M_{G}^{\sigma_{\Lambda}}(u)\cdot M_{G}^{\sigma_{\Lambda}}(v)$
	$\displaystyle={}$	$\displaystyle K_{G}^{\sigma_{\Lambda}}(u,v).\qed$
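Both identities can be checked by brute force on a small graph. The sketch below is a numerical illustration only. It assumes the weight $\beta^{m_{1}(\sigma)}\gamma^{m_{0}(\sigma)}\prod_{w}\lambda_{w}^{\sigma_{w}}$ with $m_{1}$ (resp. $m_{0}$ ) counting edges whose endpoints both have spin $1$ (resp. $0$ ), takes $\Lambda=\emptyset$ , and approximates $\lambda_{v}\frac{\partial}{\partial\lambda_{v}}$ by a central finite difference in $\log\lambda_{v}$ . All function names are hypothetical.

```python
import itertools
import math

def weights(edges, n, beta, gamma, lams):
    """Map each sigma in {0,1}^n to beta^{m1} * gamma^{m0} * prod_w lam_w^{sigma_w}."""
    out = {}
    for sigma in itertools.product((0, 1), repeat=n):
        m1 = sum(1 for u, v in edges if sigma[u] == 1 and sigma[v] == 1)
        m0 = sum(1 for u, v in edges if sigma[u] == 0 and sigma[v] == 0)
        field = math.prod(l for l, s in zip(lams, sigma) if s == 1)
        out[sigma] = beta ** m1 * gamma ** m0 * field
    return out

def log_Z(edges, n, beta, gamma, lams):
    return math.log(sum(weights(edges, n, beta, gamma, lams).values()))

def marginal(edges, n, beta, gamma, lams, v):
    """M(v) = Pr[sigma_v = 1], computed by enumeration."""
    w = weights(edges, n, beta, gamma, lams)
    return sum(wt for s, wt in w.items() if s[v] == 1) / sum(w.values())

def covariance(edges, n, beta, gamma, lams, u, v):
    """K(u, v) = E[sigma_u * sigma_v] - M(u) * M(v), computed by enumeration."""
    w = weights(edges, n, beta, gamma, lams)
    both = sum(wt for s, wt in w.items() if s[u] == 1 and s[v] == 1) / sum(w.values())
    return both - marginal(edges, n, beta, gamma, lams, u) * marginal(edges, n, beta, gamma, lams, v)

def scaled(lams, v, t):
    """Copy of lams with log(lam_v) shifted by t."""
    out = list(lams)
    out[v] *= math.exp(t)
    return out

def first_log_derivative(edges, n, beta, gamma, lams, v, h=1e-4):
    """Central difference for (lam_v d/d lam_v) log Z."""
    plus = log_Z(edges, n, beta, gamma, scaled(lams, v, h))
    minus = log_Z(edges, n, beta, gamma, scaled(lams, v, -h))
    return (plus - minus) / (2 * h)

def mixed_log_derivative(edges, n, beta, gamma, lams, u, v, h=1e-4):
    """Central difference for (lam_v d/d lam_v)(lam_u d/d lam_u) log Z."""
    total = 0.0
    for su in (1, -1):
        for sv in (1, -1):
            shifted = scaled(scaled(lams, u, su * h), v, sv * h)
            total += su * sv * log_Z(edges, n, beta, gamma, shifted)
    return total / (4 * h * h)

if __name__ == "__main__":
    edges = [(0, 1), (1, 2), (2, 0), (2, 3)]   # a triangle with a pendant vertex
    lams = [0.5, 1.2, 0.9, 2.0]
    args = (edges, 4, 0.3, 0.8)
    print(first_log_derivative(*args, lams, 2), marginal(*args, lams, 2))
    print(mixed_log_derivative(*args, lams, 0, 3), covariance(*args, lams, 0, 3))
```

Each printed pair should agree up to the finite-difference error.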

Appendix C A technical lemma for $\Psi$

The following lemma implies that the potential $\Psi$ given by Eq. 4 is well-defined.

Lemma 32.

For all $\beta,\gamma>0$ such that $\beta\gamma<1$ , we have

\int_{-\infty}^{+\infty}\sqrt{\frac{(1-\beta\gamma)e^{y}}{(\beta e^{y}+1)(e^{y}+\gamma)}}\,dy<+\infty.

Proof.

For the $+\infty$ side we have

\int_{0}^{+\infty}\sqrt{\frac{(1-\beta\gamma)e^{y}}{(\beta e^{y}+1)(e^{y}+\gamma)}}\,dy=\int_{0}^{+\infty}\sqrt{\frac{1-\beta\gamma}{\beta e^{y}+\gamma e^{-y}+\beta\gamma+1}}\,dy<\int_{0}^{+\infty}\frac{1}{\sqrt{\beta e^{y}}}\,dy<+\infty.

Similarly, for the $-\infty$ side we have

\int_{-\infty}^{0}\sqrt{\frac{(1-\beta\gamma)e^{y}}{(\beta e^{y}+1)(e^{y}+\gamma)}}\,dy<\int_{-\infty}^{0}\frac{1}{\sqrt{\gamma e^{-y}}}\,dy<+\infty.\qed

Appendix D Mixing by the potential method: Proof of 24

In this section, we prove 24 in the same way as 5, as outlined in Section 3. The major difference here is that we consider a weighted sum of absolute influences $\sum_{v\in V\backslash\Lambda}\rho_{v}\cdot\left|\mathcal{I}_{G}^{\sigma_{\Lambda}}(r\text{\scriptsize{~{}$\rightarrow$~{}}}v)\right|$ where $\rho:V\to{\mathbb{R}}^{+}$ is a weight function. This is sufficient for us to bound the eigenvalue of the influence matrix, as indicated by 22. We will choose the weight of a vertex $v$ to be $\rho_{v}=\Delta_{v}$ , the degree of $v$ . The following lemma provides an upper bound on the weighted sum of absolute influences at distance $k$ , given a general $(\alpha,c)$ -potential. In particular, it generalizes 9.

Lemma 33.

If there exists a general $(\alpha,c)$ -potential function $\Psi$ with respect to $\Delta$ and $(\beta,\gamma,\lambda)$ where $\alpha\in(0,1)$ and $c>0$ , then for every $\Lambda\subseteq V_{T}\backslash\{r\}$ , ${\sigma_{\Lambda}}\in\{0,1\}^{\Lambda}$ and all integers $k\geq 1$ ,

\sum_{v\in L_{r}(k)}\Delta_{v}\cdot\left|{\mathcal{I}_{T}^{\sigma_{\Lambda}}(r\text{\scriptsize{~{}$\rightarrow$~{}}}v)}\right|\leq 2c\cdot(1-\alpha)^{k-1}\cdot\Delta_{r}

where $L_{r}(k)$ denotes the set of all free vertices at distance $k$ away from $r$ .

To prove 33, we first state the following generalization of 14 for any weight function $\rho$ . The proof of 34 is identical to that of 14 and we omit it here.

Lemma 34.

\sum_{v\in L_{r}(k)}\rho_{v}\cdot\left|\mathcal{I}_{T}^{\sigma_{\Lambda}}(r\text{\scriptsize{~{}$\rightarrow$~{}}}v)\right|\leq\Delta_{r}A_{\Psi}B_{\Psi}^{\rho}\left(\max_{1\leq d<\Delta}\,\sup_{\tilde{\bm{y}}\in S^{d}}\left\|{\nabla H_{d}^{\Psi}(\tilde{\bm{y}})}\right\|_{1}\right)^{k-1}

where

A_{\Psi}=\max_{u\in L_{r}(1)}\left\{\frac{|h(\log R_{u})|}{\psi(\log R_{u})}\right\}\quad\text{and}\quad B_{\Psi}^{\rho}=\max_{v\in L_{r}(k)}\left\{\rho_{v}\cdot\psi(\log R_{v})\right\}.

We then prove 33 and 24.

Proof of 33.

Denote the degree of a vertex $v\in V_{T}\backslash\{r\}$ by $\Delta_{v}$ , and the degree of $v$ in the subtree $T_{v}$ by $d_{v}=\Delta_{v}-1$ . Pick the weights of vertices to be $\rho_{v}=\Delta_{v}$ for all $v\in V_{T}$ . Since $\Psi$ is a general $(\alpha,c)$ -potential, the Contraction condition implies that

\max_{1\leq d<\Delta}\sup_{\tilde{\bm{y}}\in S^{d}}\left\|{\nabla H_{d}^{\Psi}(\tilde{\bm{y}})}\right\|_{1}\leq 1-\alpha.

Since $\log R_{v}\in J_{d_{v}}$ by the definition of $J_{d}$ , the General Boundedness condition implies that for all $u\in L_{r}(1)$ and $v\in L_{r}(k)$ ,

\frac{\psi(\log R_{v})}{\psi(\log R_{u})}\cdot|h(\log R_{u})|\leq\frac{2c}{\Delta_{u}+\Delta_{v}}.

Therefore, we get

\Delta_{r}A_{\Psi}B_{\Psi}^{\rho}=\Delta_{r}\cdot\max_{u\in L_{r}(1)}\left\{\frac{|h(\log R_{u})|}{\psi(\log R_{u})}\right\}\cdot\max_{v\in L_{r}(k)}\left\{\Delta_{v}\cdot\psi(\log R_{v})\right\}\leq 2c\cdot\Delta_{r}.

The lemma then follows immediately from 34. ∎

Proof of 24.

The proof of 24 is almost identical to that of 5. The only difference is that we consider the weighted sum of absolute influences of a given vertex. Since the SAW tree preserves the degrees of vertices, we can still apply 8. Then, combining 7, 22, 8 and 33, we complete the proof of the theorem. ∎

Appendix E Verifying a good potential: Boundedness

In this section, we show the Boundedness or General Boundedness condition for our potential function $\Psi$ defined by Eq. 4 in different ranges of parameters. Combining this with 19, we complete the proofs of 10 and 25.

In Section E.1 we give background on the uniqueness region of the parameters $(\beta,\gamma,\lambda)$ , based on the work of [LLY13]. We then show Boundedness and General Boundedness in Section E.2. Proofs of technical lemmas are left to Section E.3.

E.1 Preliminaries of the uniqueness region

In this section we give a brief description of the uniqueness region of parameters $(\beta,\gamma,\lambda)$ . All the results here, and also their proofs, can be found in Lemma 21 from the latest version of [LLY13].

Let $\Delta\geq 3$ be an integer and $\beta,\gamma,\lambda$ be reals. We assume that $0\leq\beta\leq\gamma$ , $\gamma>0$ , $\beta\gamma<1$ and $\lambda>0$ . For $1\leq d\leq\Delta$ define

f_{d}(R)=\lambda\left({\frac{\beta R+1}{R+\gamma}}\right)^{d}

and denote the unique fixed point of $f_{d}$ by $R_{d}^{*}$ . Recall that the parameters $(\beta,\gamma,\lambda)$ are up-to- $\Delta$ unique with gap $\delta\in(0,1)$ if $|f^{\prime}_{d}(R^{*}_{d})|<1-\delta$ for all $1\leq d<\Delta$ .

When $\beta=0$ , the spin system is called a hard-constraint model. In this case, there exists a critical threshold for the external field defined as

\lambda_{c}=\lambda_{c}(\gamma,\Delta)=\min_{1<d<\Delta}\frac{\gamma^{d+1}d^{d}}{(d-1)^{d+1}},

such that the parameters $(0,\gamma,\lambda)$ are up-to- $\Delta$ unique if and only if $\lambda<\lambda_{c}$ . In particular, when $\gamma\leq 1$ the critical field is given by

\lambda_{c}=\lambda_{c}(\gamma,\Delta)=\frac{\gamma^{\Delta}(\Delta-1)^{\Delta-1}}{(\Delta-2)^{\Delta}}.

When $\beta>0$ , the spin system is called a soft-constraint model. If $\sqrt{\beta\gamma}>\frac{\Delta-2}{\Delta}$ , then $(\beta,\gamma,\lambda)$ is up-to- $\Delta$ unique for all $\lambda>0$ . If $\sqrt{\beta\gamma}\leq\frac{\Delta-2}{\Delta}$ , the uniqueness region is more complicated, and we now describe it. Let

\overline{\Delta}=\frac{1+\sqrt{\beta\gamma}}{1-\sqrt{\beta\gamma}},

so that for every $1\leq d<\overline{\Delta}$ we have $d\cdot\frac{1-\sqrt{\beta\gamma}}{1+\sqrt{\beta\gamma}}<1$ , and for every $d\geq\overline{\Delta}$ we have $d\cdot\frac{1-\sqrt{\beta\gamma}}{1+\sqrt{\beta\gamma}}\geq 1$ . For every $\overline{\Delta}\leq d<\Delta$ , we define $x_{1}(d)\leq x_{2}(d)$ to be the two positive roots of the quadratic equation

\frac{d(1-\beta\gamma)x}{(\beta x+1)(x+\gamma)}=1.

More specifically, $x_{1}(d)$ and $x_{2}(d)$ are given by

x_{1}(d)=\frac{\theta(d)-\sqrt{\theta(d)^{2}-4\beta\gamma}}{2\beta}\qquad\text{and}\qquad x_{2}(d)=\frac{\theta(d)+\sqrt{\theta(d)^{2}-4\beta\gamma}}{2\beta}

where

\theta(d)=d(1-\beta\gamma)-(1+\beta\gamma).

Notice that $\theta(d)\geq 2\sqrt{\beta\gamma}$ for all $d\geq\overline{\Delta}$ . For $i=1,2$ we let

\lambda_{i}(d)=x_{i}(d)\left(\frac{x_{i}(d)+\gamma}{\beta x_{i}(d)+1}\right)^{d}.

Then, the parameters $(\beta,\gamma,\lambda)$ are up-to- $\Delta$ unique if and only if $\lambda$ belongs to the following regime

\mathcal{A}=\bigcap_{\overline{\Delta}\leq d<\Delta}\Big[(0,\lambda_{1}(d))\cup(\lambda_{2}(d),\infty)\Big].

(10)

In particular, when $\gamma\leq 1$ there are two critical thresholds $0<\lambda_{c}<\overline{\lambda}_{c}$ such that the parameters $(\beta,\gamma,\lambda)$ are up-to- $\Delta$ unique if and only if $\lambda<\lambda_{c}$ or $\lambda>\overline{\lambda}_{c}$ (i.e., $\mathcal{A}=(0,\lambda_{c})\cup(\overline{\lambda}_{c},\infty)$ ), where

\lambda_{c}=\lambda_{c}(\beta,\gamma,\Delta)=\min_{\overline{\Delta}\leq d<\Delta}\lambda_{1}(d)\qquad\text{and}\qquad\overline{\lambda}_{c}=\overline{\lambda}_{c}(\beta,\gamma,\Delta)=\max_{\overline{\Delta}\leq d<\Delta}\lambda_{2}(d)=\lambda_{2}(\Delta-1).
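The quantities $\theta(d)$ , $x_{i}(d)$ and $\lambda_{i}(d)$ are straightforward to compute. The following sketch, with hypothetical function names and example parameters chosen only for illustration, evaluates them and checks the defining property of the thresholds: at $\lambda=\lambda_{i}(d)$ the fixed point of $f_{d}$ is $x_{i}(d)$ and $|f_{d}^{\prime}(x_{i}(d))|=1$ . It assumes $\beta>0$ , $\beta\gamma<1$ and $d\geq\overline{\Delta}$ , so that the discriminant is nonnegative.

```python
import math

def theta(d, beta, gamma):
    """theta(d) = d(1 - beta*gamma) - (1 + beta*gamma)."""
    return d * (1.0 - beta * gamma) - (1.0 + beta * gamma)

def roots(d, beta, gamma):
    """The two positive roots x_1(d) <= x_2(d) of d(1-bg)x = (beta*x + 1)(x + gamma)."""
    t = theta(d, beta, gamma)
    disc = math.sqrt(t * t - 4.0 * beta * gamma)   # real when d >= Delta-bar
    return (t - disc) / (2.0 * beta), (t + disc) / (2.0 * beta)

def critical_field(x, d, beta, gamma):
    """lambda_i(d) = x_i(d) * ((x_i(d) + gamma) / (beta*x_i(d) + 1))^d."""
    return x * ((x + gamma) / (beta * x + 1.0)) ** d

def f(R, d, beta, gamma, lam):
    return lam * ((beta * R + 1.0) / (R + gamma)) ** d

def abs_fprime(R, d, beta, gamma, lam):
    """|f_d'(R)| = d(1 - beta*gamma) f_d(R) / ((beta*R + 1)(R + gamma))."""
    return d * (1.0 - beta * gamma) * f(R, d, beta, gamma, lam) / ((beta * R + 1.0) * (R + gamma))

if __name__ == "__main__":
    beta, gamma, d = 0.2, 0.5, 3    # Delta-bar is about 1.92 here, so d = 3 qualifies
    for x in roots(d, beta, gamma):
        lam = critical_field(x, d, beta, gamma)
        print(x, lam, f(x, d, beta, gamma, lam), abs_fprime(x, d, beta, gamma, lam))
```

For each root, the third printed value should reproduce the root itself and the fourth should equal $1$ up to rounding.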

The following bounds on the critical fields are helpful for our proofs later.

Lemma 35.

1. If $\beta=0$ , then for every integer $d$ such that $1<d<\Delta$ we have

$\lambda_{c}\leq\frac{4\gamma^{d+1}}{d-1}.$

2. If $\beta>0$ and $\sqrt{\beta\gamma}\leq\frac{\Delta-2}{\Delta}$ , then for every integer $d$ such that $\overline{\Delta}\leq d<\Delta$ we have

$\lambda_{1}(d)\leq\frac{18\gamma^{d+1}}{\theta(d)}\qquad\text{and}\qquad\lambda_{2}(d)\geq\frac{\theta(d)}{18\beta^{d+1}}$

where $\theta(d)=d(1-\beta\gamma)-(1+\beta\gamma)$ .

The proof of 35 is postponed to Section E.3.
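As a quick numerical illustration of Part 1 (not a substitute for the proof), one can evaluate $\lambda_{c}(\gamma,\Delta)$ from its definition as a minimum and compare it with $\frac{4\gamma^{d+1}}{d-1}$ for each admissible $d$ . The sketch below does this; the function names are illustrative, and the bound is tight at $d=2$ , so the comparison allows for floating-point rounding.

```python
def lambda_c_hard(gamma, Delta):
    """lambda_c(gamma, Delta) = min over 1 < d < Delta of gamma^{d+1} d^d / (d-1)^{d+1}."""
    return min(gamma ** (d + 1) * d ** d / (d - 1) ** (d + 1) for d in range(2, Delta))

def part1_bound(gamma, d):
    """Right-hand side of Part 1: 4 * gamma^{d+1} / (d - 1)."""
    return 4.0 * gamma ** (d + 1) / (d - 1)

if __name__ == "__main__":
    for gamma in (0.5, 1.0, 2.0):
        for Delta in (3, 6, 12):
            lc = lambda_c_hard(gamma, Delta)
            worst = max(lc / part1_bound(gamma, d) for d in range(2, Delta))
            print(gamma, Delta, lc, worst)   # worst ratio should not exceed 1
```

The inequality reduces to $(d/(d-1))^{d}\leq 4$ for $d\geq 2$ , which is why the ratio equals $1$ exactly when $\Delta=3$ .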

E.2 Proofs of boundedness

In this section we complete the proofs of 10 and 25 by establishing Boundedness and General Boundedness in the corresponding range of parameters.

Let $\Delta\geq 3$ be an integer. Let $\beta,\gamma,\lambda$ be reals such that $0\leq\beta\leq\gamma$ , $\gamma>0$ , $\beta\gamma<1$ and $\lambda>0$ . Recall that the potential function $\Psi$ is defined by

\Psi^{\prime}(y)=\psi(y)=\sqrt{\frac{(1-\beta\gamma)e^{y}}{(\beta e^{y}+1)(e^{y}+\gamma)}}=\sqrt{\left|{h(y)}\right|},\qquad\Psi(0)=0.

(1)

It is perhaps surprising that $\psi=\sqrt{|h|}$ , since the potential $\Psi$ is exactly the one from [LLY13], as indicated by 17. This does not seem to be a coincidence, and it provides some intuition for why the potential from [LLY13] works. More importantly, the fact that $\psi=\sqrt{|h|}$ is helpful in our proof of Boundedness and General Boundedness. Recall that for $0\leq d<\Delta$ and $\beta\gamma<1$ we let $J_{d}=\left[{\log(\lambda\beta^{d}),\log(\lambda/\gamma^{d})}\right]$ be the range of log marginal ratios of a vertex with $d$ children. Then for every $0\leq d_{i}<\Delta$ and $y_{i}\in J_{d_{i}}$ where $i=1,2$ , we have

\frac{\psi(y_{2})}{\psi(y_{1})}\cdot|h(y_{1})|=\sqrt{|h(y_{1})|\cdot|h(y_{2})|}.

(11)

The following lemma gives upper bounds on $\sqrt{|h(y_{1})|\cdot|h(y_{2})|}$ , from which, together with Eq. 11, we deduce Boundedness and General Boundedness immediately. The parenthesized numbers in the lemma indicate which result each bound is applied to.

Lemma 36.

Let $\Delta\geq 3$ be an integer. Let $\beta,\gamma,\lambda$ be reals such that $0\leq\beta\leq\gamma$ , $\gamma>0$ , $\beta\gamma<1$ and $\lambda>0$ . Assume that the parameters $(\beta,\gamma,\lambda)$ are up-to- $\Delta$ unique with gap $\delta\in(0,1)$ . Then for all integers $d_{1},d_{2}$ such that $0\leq d_{1},d_{2}<\Delta$ , and all reals $y_{i}\in J_{d_{i}}$ where $i=1,2$ , the following holds:

H. Hard-constraint models: $\beta=0$ and $\lambda<\lambda_{c}$ .

H.1. (10) If $\gamma\leq 1$ , then

$|h(y_{1})|\leq\frac{4}{\Delta}.$

H.2. (25) If $\gamma>1$ , then

$\sqrt{|h(y_{1})|\cdot|h(y_{2})|}\leq\frac{8}{d_{1}+d_{2}+2}.$

S. Soft-constraint models: $\beta>0$ and $\lambda\in\mathcal{A}$ .

S.1. (10) If $\sqrt{\beta\gamma}>\frac{\Delta-2}{\Delta}$ , then

$|h(y_{1})|\leq\frac{1.5}{\Delta}.$

S.2. (10) If $\sqrt{\beta\gamma}\leq\frac{\Delta-2}{\Delta}$ and $\gamma\leq 1$ , then

$|h(y_{1})|\leq\frac{18}{\Delta}.$

S.3. (25) If $\sqrt{\beta\gamma}\leq\frac{\Delta-2}{\Delta}$ and $\gamma>1$ , then

$\sqrt{|h(y_{1})|\cdot|h(y_{2})|}\leq\frac{36}{d_{1}+d_{2}+2}.$

The following lemma, whose proof can be found in Section E.3, is helpful.

Lemma 37.

For every $y\in[-\infty,+\infty]$ we have

|h(y)|=\frac{|1-\beta\gamma|e^{y}}{(\beta e^{y}+1)(e^{y}+\gamma)}\leq\frac{|1-\sqrt{\beta\gamma}|}{1+\sqrt{\beta\gamma}}.
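The bound in 37 can be explored numerically. The sketch below, with illustrative function names, evaluates $|h(y)|$ on a grid and at the point $y^{*}=\frac{1}{2}\log(\gamma/\beta)$ , where $\beta e^{y}+\gamma e^{-y}$ is minimized and the bound is expected to be attained. It assumes $\beta,\gamma>0$ and is a sanity check rather than a proof.

```python
import math

def abs_h(y, beta, gamma):
    """|h(y)| = |1 - beta*gamma| e^y / ((beta e^y + 1)(e^y + gamma))."""
    e = math.exp(y)
    return abs(1.0 - beta * gamma) * e / ((beta * e + 1.0) * (e + gamma))

def lemma37_bound(beta, gamma):
    """|1 - sqrt(beta*gamma)| / (1 + sqrt(beta*gamma))."""
    s = math.sqrt(beta * gamma)
    return abs(1.0 - s) / (1.0 + s)

def grid_max(beta, gamma, lo=-20.0, hi=20.0, steps=4000):
    """Maximum of |h| over an evenly spaced grid on [lo, hi]."""
    return max(abs_h(lo + (hi - lo) * k / steps, beta, gamma) for k in range(steps + 1))

if __name__ == "__main__":
    for beta, gamma in ((0.2, 0.5), (2.0, 3.0)):   # one antiferromagnetic, one ferromagnetic example
        y_star = 0.5 * math.log(gamma / beta)
        print(grid_max(beta, gamma), abs_h(y_star, beta, gamma), lemma37_bound(beta, gamma))
```

In both examples the grid maximum should not exceed the bound, and the value at $y^{*}$ should coincide with it.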

We present here the proof of 36.

Proof of 36.

We use notations and results from Section E.1.

H. Hard-constraint models: $\beta=0$ and $\lambda<\lambda_{c}$ .

H.1. $\gamma\leq 1$ .

For every $y_{1}\in J_{d_{1}}$ we deduce from 35 that

e^{y_{1}}\leq\frac{\lambda}{\gamma^{d_{1}}}\leq\frac{\lambda_{c}}{\gamma^{\Delta-1}}\leq\frac{4\gamma}{\Delta-2}.

Hence,

|h(y_{1})|=\frac{e^{y_{1}}}{e^{y_{1}}+\gamma}\leq\frac{\frac{4\gamma}{\Delta-2}}{\frac{4\gamma}{\Delta-2}+\gamma}=\frac{4}{\Delta+2}\leq\frac{4}{\Delta}.
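Case H.1 can also be checked numerically. The sketch below is an illustration under the assumptions of this case ( $\beta=0$ , $\gamma\leq 1$ , $\lambda<\lambda_{c}$ ): it uses the closed form of $\lambda_{c}(\gamma,\Delta)$ from Section E.1, evaluates $|h|$ at the right endpoint of each $J_{d}$ (where it is largest, since $|h|$ is increasing in $y$ ), and compares the result with $\frac{4}{\Delta+2}$ . Function names are hypothetical.

```python
def lambda_c_closed_form(gamma, Delta):
    """lambda_c(gamma, Delta) = gamma^Delta (Delta-1)^(Delta-1) / (Delta-2)^Delta, valid for gamma <= 1."""
    return gamma ** Delta * (Delta - 1) ** (Delta - 1) / (Delta - 2) ** Delta

def max_abs_h_hardcore(gamma, lam, Delta):
    """Largest value of |h(y)| = e^y / (e^y + gamma) over y in J_d, 0 <= d < Delta, with beta = 0."""
    best = 0.0
    for d in range(Delta):
        R = lam / gamma ** d          # e^y at the right endpoint of J_d
        best = max(best, R / (R + gamma))
    return best

if __name__ == "__main__":
    for gamma in (0.3, 1.0):
        for Delta in (3, 5, 10, 30):
            lam = 0.999 * lambda_c_closed_form(gamma, Delta)   # just inside the uniqueness region
            print(gamma, Delta, max_abs_h_hardcore(gamma, lam, Delta), 4.0 / (Delta + 2))
```

For $\Delta=3$ the two printed values are nearly equal, which suggests that the constant $4$ cannot be improved by this argument for small $\Delta$ .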

H.2. $\gamma>1$ .

Let $\bar{y}=\frac{y_{1}+y_{2}}{2}$ and $\bar{d}=\frac{d_{1}+d_{2}}{2}$ . Then we get

\sqrt{|h(y_{1})|\cdot|h(y_{2})|}=\sqrt{\frac{e^{y_{1}}}{e^{y_{1}}+\gamma}}\cdot\sqrt{\frac{e^{y_{2}}}{e^{y_{2}}+\gamma}}=\frac{1}{\sqrt{(1+\gamma e^{-y_{1}})(1+\gamma e^{-y_{2}})}}\leq\frac{1}{1+\gamma e^{-\bar{y}}},

where the last inequality follows from the AM–GM inequality by

(1+\gamma e^{-y_{1}})(1+\gamma e^{-y_{2}})=1+\gamma(e^{-y_{1}}+e^{-y_{2}})+\gamma^{2}e^{-2\bar{y}}\geq 1+2\gamma e^{-\bar{y}}+\gamma^{2}e^{-2\bar{y}}=(1+\gamma e^{-\bar{y}})^{2}.

Since $y_{i}\in J_{d_{i}}$ for $i=1,2$ , we have

e^{\bar{y}}=\sqrt{e^{y_{1}}\cdot e^{y_{2}}}\leq\sqrt{\frac{\lambda}{\gamma^{d_{1}}}\cdot\frac{\lambda}{\gamma^{d_{2}}}}=\frac{\lambda}{\gamma^{\bar{d}}}.

If $\bar{d}\geq 2$ , then we deduce from 35 and $\gamma>1$ that

e^{\bar{y}}\leq\frac{\lambda_{c}}{\gamma^{\lfloor\bar{d}\rfloor}}\leq\frac{4\gamma}{\lfloor\bar{d}\rfloor-1}.

It follows that

\sqrt{|h(y_{1})|\cdot|h(y_{2})|}\leq\frac{1}{1+\gamma e^{-\bar{y}}}\leq\frac{1}{1+\frac{\lfloor\bar{d}\rfloor-1}{4}}=\frac{4}{\lfloor\bar{d}\rfloor+3}\leq\frac{8}{d_{1}+d_{2}+2}.

If $\bar{d}<2$ , then it is easy to see that

\sqrt{|h(y_{1})|\cdot|h(y_{2})|}\leq 1\leq\frac{8}{d_{1}+d_{2}+2}.

S. Soft-constraint models: $\beta>0$ and $\lambda\in\mathcal{A}$ .

S.1. $\sqrt{\beta\gamma}>\frac{\Delta-2}{\Delta}$ .

For every $y_{1}\in J_{d_{1}}$ we deduce from 37 that

|h(y_{1})|\leq\frac{1-\sqrt{\beta\gamma}}{1+\sqrt{\beta\gamma}}\leq\frac{1}{\Delta-1}\leq\frac{1.5}{\Delta}.

S.2. $\sqrt{\beta\gamma}\leq\frac{\Delta-2}{\Delta}$ and $\gamma\leq 1$ .

In this case, we have either $\lambda<\lambda_{c}$ or $\lambda>\overline{\lambda}_{c}$ where $\lambda_{c},\overline{\lambda}_{c}$ are the two critical fields. Consider first $\lambda>\overline{\lambda}_{c}$ . For every $y_{1}\in J_{d_{1}}$ we deduce from 35 and $\beta<1$ that

e^{y_{1}}\geq\lambda\beta^{d_{1}}\geq\overline{\lambda}_{c}\beta^{\Delta-1}\geq\frac{\theta(\Delta-1)}{18\beta}

where $\theta(d)=d(1-\beta\gamma)-(1+\beta\gamma)$ . Hence,

	$\displaystyle|h(y_{1})|=\frac{(1-\beta\gamma)e^{y_{1}}}{(\beta e^{y_{1}}+1)(e^{y_{1}}+\gamma)}$	$\displaystyle=\frac{1-\beta\gamma}{\beta e^{y_{1}}+\gamma e^{-y_{1}}+(1+\beta\gamma)}$
		$\displaystyle\leq\frac{1-\beta\gamma}{\frac{\theta(\Delta-1)}{18}+(1+\beta\gamma)}=\frac{18(1-\beta\gamma)}{(\Delta-1)(1-\beta\gamma)+17(1+\beta\gamma)}\leq\frac{18}{\Delta}.$

Next we consider $\lambda<\lambda_{c}$ . For every $y_{1}\in J_{d_{1}}$ we deduce from 35 and $\gamma\leq 1$ that

e^{y_{1}}\leq\frac{\lambda}{\gamma^{d_{1}}}\leq\frac{\lambda_{c}}{\gamma^{\Delta-1}}\leq\frac{18\gamma}{\theta(\Delta-1)}.

Hence,

|h(y_{1})|=\frac{1-\beta\gamma}{\beta e^{y_{1}}+\gamma e^{-y_{1}}+(1+\beta\gamma)}\leq\frac{1-\beta\gamma}{\frac{\theta(\Delta-1)}{18}+(1+\beta\gamma)}\leq\frac{18}{\Delta}.

S.3. $\sqrt{\beta\gamma}\leq\frac{\Delta-2}{\Delta}$ and $\gamma>1$ .

Let $\bar{y}=\frac{y_{1}+y_{2}}{2}$ , $\bar{d}=\frac{d_{1}+d_{2}}{2}$ , $d_{L}=\lfloor\bar{d}\rfloor$ , and $d_{R}=\lceil\bar{d}\rceil$ . We first consider some trivial cases. If $\bar{d}\leq 2$ then it is easy to see that

\sqrt{|h(y_{1})|\cdot|h(y_{2})|}\leq 1\leq\frac{6}{d_{1}+d_{2}+2}.

If $\bar{d}>2$ and $d_{L}\leq\overline{\Delta}$ , then we deduce from 37 that

\sqrt{|h(y_{1})|\cdot|h(y_{2})|}\leq\frac{1-\sqrt{\beta\gamma}}{1+\sqrt{\beta\gamma}}=\frac{1}{\overline{\Delta}}\leq\frac{2}{d_{1}+d_{2}-2}\leq\frac{6}{d_{1}+d_{2}+2}.

Hence, in the following we may assume that $\bar{d}>2$ and $d_{L}>\overline{\Delta}$ .

Since the parameters $(\beta,\gamma,\lambda)$ are up-to- $\Delta$ unique, we have $\lambda\in\mathcal{A}$ where the regime $\mathcal{A}$ is given by Eq. 10. Observe that

\mathcal{A}\subseteq(0,\lambda_{1}(d_{L}))\cup(\lambda_{2}(d_{R}),\infty)\cup(\lambda_{2}(d_{L}),\lambda_{1}(d_{R}))

where the last interval is nonempty only when $\lambda_{2}(d_{L})<\lambda_{1}(d_{R})$ . This means that $\lambda$ is contained in at least one of the three intervals. We establish the bound by considering these three cases separately.

Case 1: $\lambda<\lambda_{1}(d_{L})$ . By the Cauchy-Schwarz inequality, we have

	$\displaystyle\sqrt{|h(y_{1})|\cdot|h(y_{2})|}$	$\displaystyle=\sqrt{\frac{1-\beta\gamma}{\beta e^{y_{1}}+\gamma e^{-y_{1}}+(1+\beta\gamma)}}\cdot\sqrt{\frac{1-\beta\gamma}{\beta e^{y_{2}}+\gamma e^{-y_{2}}+(1+\beta\gamma)}}$
		$\displaystyle\leq\frac{1-\beta\gamma}{\sqrt{(\beta e^{y_{1}}+\gamma e^{-y_{1}})(\beta e^{y_{2}}+\gamma e^{-y_{2}})}+(1+\beta\gamma)}.$		(12)

Therefore, we get

\sqrt{|h(y_{1})|\cdot|h(y_{2})|}\leq\frac{1-\beta\gamma}{\gamma e^{-\bar{y}}+(1+\beta\gamma)}.

Since $y_{i}\in J_{d_{i}}$ for $i=1,2$ and $\gamma>1$ , we deduce from 35 that

e^{\bar{y}}\leq\frac{\lambda}{\gamma^{\bar{d}}}\leq\frac{\lambda_{1}(d_{L})}{\gamma^{d_{L}}}\leq\frac{18\gamma}{\theta(d_{L})},

where $\theta(d_{L})=d_{L}(1-\beta\gamma)-(1+\beta\gamma)$ . It follows that

\sqrt{|h(y_{1})|\cdot|h(y_{2})|}\leq\frac{1-\beta\gamma}{\gamma e^{-\bar{y}}+(1+\beta\gamma)}\leq\frac{1-\beta\gamma}{\frac{\theta(d_{L})}{18}+(1+\beta\gamma)}\leq\frac{36}{d_{1}+d_{2}+2}.

Case 2: $\lambda>\lambda_{2}(d_{R})$ . Similarly, we obtain from Eq. 12 that

\sqrt{|h(y_{1})|\cdot|h(y_{2})|}\leq\frac{1-\beta\gamma}{\beta e^{\bar{y}}+(1+\beta\gamma)}.

Since $y_{i}\in J_{d_{i}}$ for $i=1,2$ and $\beta<1$ , we deduce from 35 that

e^{\bar{y}}\geq\lambda\beta^{\bar{d}}\geq\lambda_{2}(d_{R})\beta^{d_{R}}\geq\frac{\theta(d_{R})}{18\beta},

where $\theta(d_{R})=d_{R}(1-\beta\gamma)-(1+\beta\gamma)$ . It follows that

\sqrt{|h(y_{1})|\cdot|h(y_{2})|}\leq\frac{1-\beta\gamma}{\beta e^{\bar{y}}+(1+\beta\gamma)}\leq\frac{1-\beta\gamma}{\frac{\theta(d_{R})}{18}+(1+\beta\gamma)}\leq\frac{36}{d_{1}+d_{2}+2}.

Case 3: $\lambda_{2}(d_{L})<\lambda<\lambda_{1}(d_{R})$ . We may assume that $d_{1}\geq d_{2}$ . By Eq. 12, we obtain

\sqrt{|h(y_{1})|\cdot|h(y_{2})|}\leq\frac{1-\beta\gamma}{\sqrt{\beta\gamma}e^{\frac{y_{2}-y_{1}}{2}}+(1+\beta\gamma)}.

Since $y_{i}\in J_{d_{i}}$ for $i=1,2$ and $\beta<1<\gamma$ , we have

e^{y_{2}-y_{1}}\geq\beta^{d_{2}}\gamma^{d_{1}}\geq\beta^{d_{L}}\gamma^{d_{R}}.

Meanwhile, we deduce from 35 that

\frac{\theta(d_{L})}{18\beta^{d_{L}+1}}\leq\lambda_{2}(d_{L})<\lambda<\lambda_{1}(d_{R})\leq\frac{18\gamma^{d_{R}+1}}{\theta(d_{R})},

which implies

\sqrt{\beta\gamma}e^{\frac{y_{2}-y_{1}}{2}}\geq\sqrt{\beta^{d_{L}+1}\gamma^{d_{R}+1}}\geq\frac{\sqrt{\theta(d_{L})\theta(d_{R})}}{18}\geq\frac{\theta(d_{L})}{18}.

It follows that

\sqrt{|h(y_{1})|\cdot|h(y_{2})|}\leq\frac{1-\beta\gamma}{\sqrt{\beta\gamma}e^{\frac{y_{2}-y_{1}}{2}}+(1+\beta\gamma)}\leq\frac{1-\beta\gamma}{\frac{\theta(d_{L})}{18}+(1+\beta\gamma)}\leq\frac{36}{d_{1}+d_{2}+2}.\qed
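The Cauchy-Schwarz step in Eq. 12 reduces to the elementary inequality $(p_{1}+s)(p_{2}+s)\geq(\sqrt{p_{1}p_{2}}+s)^{2}$ for $p_{1},p_{2},s\geq 0$, applied with $p_{i}=\beta e^{y_{i}}+\gamma e^{-y_{i}}$ and $s=1+\beta\gamma$. The following is a small numerical sanity check of that step, not part of the proof; the function names, the sampled parameter ranges, and the floating-point tolerance are assumptions made for illustration.

```python
import math
import random

def h_abs(y, beta, gamma):
    """|h(y)| for an antiferromagnetic system (beta * gamma < 1)."""
    denom = beta * math.exp(y) + gamma * math.exp(-y) + 1 + beta * gamma
    return (1 - beta * gamma) / denom

def eq12_rhs(y1, y2, beta, gamma):
    """Right-hand side of Eq. 12."""
    p1 = beta * math.exp(y1) + gamma * math.exp(-y1)
    p2 = beta * math.exp(y2) + gamma * math.exp(-y2)
    return (1 - beta * gamma) / (math.sqrt(p1 * p2) + 1 + beta * gamma)

def check_eq12(trials=10_000, seed=0):
    """Test sqrt(|h(y1)| |h(y2)|) <= RHS of Eq. 12 on random inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        gamma = rng.uniform(1.0, 5.0)          # regime gamma > 1 (illustrative range)
        beta = rng.uniform(0.0, 1.0) / gamma   # ensures beta * gamma < 1
        y1, y2 = rng.uniform(-5, 5), rng.uniform(-5, 5)
        lhs = math.sqrt(h_abs(y1, beta, gamma) * h_abs(y2, beta, gamma))
        # Small relative slack absorbs floating-point rounding.
        if lhs > eq12_rhs(y1, y2, beta, gamma) * (1 + 1e-12):
            return False
    return True
```

Cases 1 to 3 then each discard part of the product under the square root in Eq. 12, keeping only the term that the assumption on $\lambda$ controls.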

E.3 Proofs of technical lemmas

Proof of 35.

1. For every $1<d<\Delta$ we have

\lambda_{c}\leq\frac{\gamma^{d+1}d^{d}}{(d-1)^{d+1}}=\frac{\gamma^{d+1}}{d-1}\left(\frac{d}{d-1}\right)^{d}\leq\frac{4\gamma^{d+1}}{d-1},

where the last inequality follows from the fact that $(\frac{d}{d-1})^{d}\leq 4$ for all integers $d>1$ .

2. For every $\overline{\Delta}\leq d<\Delta$ we have

x_{1}(d)=\frac{2\gamma}{\theta(d)+\sqrt{\theta(d)^{2}-4\beta\gamma}}\leq\frac{2\gamma}{\theta(d)}.

Observe that the function $\frac{x+\gamma}{\beta x+1}$ is monotone increasing in $x$ when $\beta\gamma<1$ , and thus we deduce that

\frac{x_{1}(d)+\gamma}{\beta x_{1}(d)+1}\leq\frac{\frac{2\gamma}{\theta(d)}+\gamma}{\frac{2\beta\gamma}{\theta(d)}+1}=\gamma\cdot\frac{2+d(1-\beta\gamma)-(1+\beta\gamma)}{2\beta\gamma+d(1-\beta\gamma)-(1+\beta\gamma)}=\gamma\cdot\frac{d+1}{d-1}.

Therefore,

\lambda_{1}(d)=x_{1}(d)\left(\frac{x_{1}(d)+\gamma}{\beta x_{1}(d)+1}\right)^{d}\leq\frac{2\gamma}{\theta(d)}\cdot\gamma^{d}\cdot\left(\frac{d+1}{d-1}\right)^{d}\leq\frac{18\gamma^{d+1}}{\theta(d)}

where the last inequality follows from the fact that $(\frac{d+1}{d-1})^{d}\leq 9$ for all integers $d>1$ .

The second part can be proved similarly. For every $\overline{\Delta}\leq d<\Delta$ we have

x_{2}(d)=\frac{\theta(d)+\sqrt{\theta(d)^{2}-4\beta\gamma}}{2\beta}\geq\frac{\theta(d)}{2\beta},

and hence,

\frac{x_{2}(d)+\gamma}{\beta x_{2}(d)+1}\geq\frac{\frac{\theta(d)}{2\beta}+\gamma}{\frac{\theta(d)}{2}+1}=\frac{1}{\beta}\cdot\frac{d(1-\beta\gamma)-(1+\beta\gamma)+2\beta\gamma}{d(1-\beta\gamma)-(1+\beta\gamma)+2}=\frac{1}{\beta}\cdot\frac{d-1}{d+1}.

We then conclude that

\lambda_{2}(d)=x_{2}(d)\left(\frac{x_{2}(d)+\gamma}{\beta x_{2}(d)+1}\right)^{d}\geq\frac{\theta(d)}{2\beta}\cdot\frac{1}{\beta^{d}}\cdot\left(\frac{d-1}{d+1}\right)^{d}\geq\frac{\theta(d)}{18\beta^{d+1}},

where the last inequality again follows from the fact that $(\frac{d+1}{d-1})^{d}\leq 9$ for all integers $d>1$ . ∎
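The two elementary inequalities and the resulting bounds on $\lambda_{1}(d)$ and $\lambda_{2}(d)$ can be checked numerically. The sketch below is only an illustration under assumed parameter values ($\beta=0.2$, $\gamma=1.5$ in the usage note); the helper names are hypothetical, and the formulas for $\theta(d)$, $x_{1}(d)$, $x_{2}(d)$ are copied from the proof above.

```python
import math

def theta(d, beta, gamma):
    """theta(d) = d(1 - beta*gamma) - (1 + beta*gamma)."""
    return d * (1 - beta * gamma) - (1 + beta * gamma)

def lambda_1_2(d, beta, gamma):
    """Return (lambda_1(d), lambda_2(d)) via x_1(d) and x_2(d)."""
    t = theta(d, beta, gamma)
    disc = t * t - 4 * beta * gamma
    if t <= 0 or disc < 0:
        raise ValueError("d is too small for x_1(d), x_2(d) to be defined")
    root = math.sqrt(disc)
    x1 = 2 * gamma / (t + root)
    x2 = (t + root) / (2 * beta)

    def f(x):
        return x * ((x + gamma) / (beta * x + 1)) ** d

    return f(x1), f(x2)

def check_lemma_bounds(beta, gamma, degrees):
    """Check lambda_1(d) <= 18 gamma^(d+1)/theta(d) and
    lambda_2(d) >= theta(d)/(18 beta^(d+1)) for each d in degrees."""
    for d in degrees:
        t = theta(d, beta, gamma)
        lam1, lam2 = lambda_1_2(d, beta, gamma)
        if lam1 > 18 * gamma ** (d + 1) / t:
            return False
        if lam2 < t / (18 * beta ** (d + 1)):
            return False
    return True

def elementary_inequalities_hold(max_d=200):
    """(d/(d-1))^d <= 4 and ((d+1)/(d-1))^d <= 9 for 2 <= d < max_d."""
    return all((d / (d - 1)) ** d <= 4 and ((d + 1) / (d - 1)) ** d <= 9
               for d in range(2, max_d))
```

For instance, `check_lemma_bounds(0.2, 1.5, range(4, 31))` exercises degrees for which $\theta(d)^{2}\geq 4\beta\gamma$ holds with these parameters. Both elementary inequalities are tight at $d=2$.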

Proof of 37.

We deduce from the AM–GM inequality that

|h(y)|=\frac{|1-\beta\gamma|}{\beta e^{y}+\gamma e^{-y}+1+\beta\gamma}\leq\frac{|1-\beta\gamma|}{2\sqrt{\beta\gamma}+1+\beta\gamma}=\frac{|1-\sqrt{\beta\gamma}|}{1+\sqrt{\beta\gamma}}.\qed
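The AM-GM bound is attained at $e^{y}=\sqrt{\gamma/\beta}$, where $\beta e^{y}=\gamma e^{-y}=\sqrt{\beta\gamma}$. The following sketch checks this numerically for a few assumed parameter pairs, covering both $\beta\gamma<1$ and $\beta\gamma>1$; the grid, the tolerance, and the function names are illustrative choices, and $h$ is taken with denominator $\beta e^{y}+\gamma e^{-y}+1+\beta\gamma$.

```python
import math

def h_abs(y, beta, gamma):
    """|h(y)| = |1 - beta*gamma| / (beta e^y + gamma e^{-y} + 1 + beta*gamma)."""
    denom = beta * math.exp(y) + gamma * math.exp(-y) + 1 + beta * gamma
    return abs(1 - beta * gamma) / denom

def h_sup(beta, gamma):
    """Claimed uniform bound |1 - sqrt(beta*gamma)| / (1 + sqrt(beta*gamma))."""
    s = math.sqrt(beta * gamma)
    return abs(1 - s) / (1 + s)

def bound_holds_on_grid(beta, gamma, y_max=20.0, steps=4001):
    """Check |h(y)| <= h_sup on an evenly spaced grid of [-y_max, y_max]."""
    bound = h_sup(beta, gamma) * (1 + 1e-12)   # slack for rounding
    for i in range(steps):
        y = -y_max + 2 * y_max * i / (steps - 1)
        if h_abs(y, beta, gamma) > bound:
            return False
    return True

def maximizer(beta, gamma):
    """The point y* with e^{y*} = sqrt(gamma / beta), where AM-GM is tight."""
    return 0.5 * math.log(gamma / beta)
```

Evaluating `h_abs(maximizer(beta, gamma), beta, gamma)` should agree with `h_sup(beta, gamma)` up to rounding, which is why the bound cannot be improved without restricting the range of $y$.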

Appendix F Proofs for ferromagnetic cases

F.1 Proof of 26

Proof of 26.

Throughout, we use the “trivial potential” function $\Psi(y)=y$ , so that $\psi(y)=1$ is a constant function. We now prove Contraction and Boundedness, treating the three cases separately.

1.

We first prove the Contraction part. By 37, for all $y\in[-\infty,+\infty]$ we have

$\left|{h(y)}\right|\leq\frac{|1-\sqrt{\beta\gamma}|}{1+\sqrt{\beta\gamma}}\leq\frac{1-\delta}{\Delta-1}.$

Now let us prove the Boundedness condition. From the above inequality we have

$\left|{h(y)}\right|\leq\frac{1}{\Delta-1}\leq\frac{1.5}{\Delta}$

for $\Delta\geq 3$ .

2.

For the Contraction part, since $\log(\lambda\min\{1,1/\gamma^{\Delta-1}\})\leq y_{i}\leq\log(\lambda\max\{1,\beta^{\Delta-1}\})$ , we have

	$\displaystyle\left|{\frac{\partial H_{d}(\bm{y})}{\partial y_{i}}}\right|$	$\displaystyle=\left|{h(y_{i})}\right|=\frac{\beta\gamma-1}{1+\beta\gamma+\gamma e^{-y_{i}}+\beta e^{y_{i}}}\leq\frac{\beta\gamma-1}{1+\beta\gamma+\gamma e^{-y_{i}}}$
		$\displaystyle\leq\frac{\beta\gamma-1}{1+\beta\gamma+\frac{\gamma}{\lambda\max\{1,\beta^{\Delta-1}\}}}.$

Since we assumed $\lambda\leq(1-\delta)\frac{\gamma}{\max\{1,\beta^{\Delta-1}\}\cdot((\Delta-2)\beta\gamma-\Delta)}$ , it follows that we have the upper bound

	$\displaystyle\frac{\beta\gamma-1}{1+\beta\gamma+\frac{(\Delta-2)\beta\gamma-\Delta}{1-\delta}}$	$\displaystyle=(1-\delta)\frac{\beta\gamma-1}{(\Delta-1-\delta)\beta\gamma-(\Delta-1+\delta)}$
		$\displaystyle=(1-\delta)\frac{\beta\gamma-1}{(\Delta-1-\delta)(\beta\gamma-1)+2\delta}$
		$\displaystyle\leq\frac{1-\delta}{\Delta-1-\delta}\leq(1-\Theta(\delta))\frac{1}{\Delta-1}.$

Now, we prove the Boundedness condition. Note that since $\lambda\leq\frac{\gamma}{\max\{1,\beta^{\Delta-1}\}\cdot((\Delta-2)\beta\gamma-\Delta)}$ , it follows that $y\leq\log(\lambda\max\{1,\beta^{\Delta-1}\})\leq\log\left({\frac{\gamma}{(\Delta-2)\beta\gamma-\Delta}}\right)$ . A simple calculation reveals that $\frac{\gamma}{(\Delta-2)\beta\gamma-\Delta}\leq\sqrt{\frac{\gamma}{\beta}}$ and so by 37, we have

	$\displaystyle\left|{h(y)}\right|$	$\displaystyle\leq\left|{h\left({\log\left({\frac{\gamma}{(\Delta-2)\beta\gamma-\Delta}}\right)}\right)}\right|\leq\frac{(\beta\gamma-1)e^{\log\left({\frac{\gamma}{(\Delta-2)\beta\gamma-\Delta}}\right)}}{e^{\log\left({\frac{\gamma}{(\Delta-2)\beta\gamma-\Delta}}\right)}+\gamma}$
		$\displaystyle=(\beta\gamma-1)\frac{1}{1+(\Delta-2)\beta\gamma-\Delta}=\frac{\beta\gamma-1}{(\Delta-2)(\beta\gamma-1)-1}\leq O(1/\Delta).$

3.

For the Contraction part, since $\log(\lambda\min\{1,1/\gamma^{\Delta-1}\})\leq y_{i}\leq\log(\lambda\max\{1,\beta^{\Delta-1}\})$ , we have

$\displaystyle\left|{\frac{\partial H_{d}(\bm{y})}{\partial y_{i}}}\right|$ $\displaystyle=\left|{h(y_{i})}\right|=\frac{\beta\gamma-1}{1+\beta\gamma+\gamma e^{-y_{i}}+\beta e^{y_{i}}}\leq\frac{\beta\gamma-1}{1+\beta\gamma+\beta e^{y_{i}}}$

$\displaystyle\leq\frac{\beta\gamma-1}{1+\beta\gamma+\beta\lambda\min\{1,1/\gamma^{\Delta-1}\}}.$

Since we assumed $\lambda\geq\frac{1}{1-\delta}\cdot\frac{(\Delta-2)\beta\gamma-\Delta}{\beta\cdot\min\{1,1/\gamma^{\Delta-1}\}}$ , it follows that we have the upper bound

$\displaystyle\frac{\beta\gamma-1}{1+\beta\gamma+\frac{(\Delta-2)\beta\gamma-\Delta}{1-\delta}}$

which is again upper bounded by $(1-\Theta(\delta))\frac{1}{\Delta-1}$ , as we calculated in case 2 above.

Now, we prove the Boundedness condition. Note that since $\lambda\geq\frac{(\Delta-2)\beta\gamma-\Delta}{\beta\min\{1,1/\gamma^{\Delta-1}\}}$ , it follows that $y\geq\log(\lambda\min\{1,1/\gamma^{\Delta-1}\})\geq\log\left({\frac{(\Delta-2)\beta\gamma-\Delta}{\beta}}\right)$ . A simple calculation reveals that $\frac{(\Delta-2)\beta\gamma-\Delta}{\beta}\geq\sqrt{\frac{\gamma}{\beta}}$ and so by 37, we have

$\displaystyle\left|{h(y)}\right|$ $\displaystyle\leq\left|{h\left({\log\left({\frac{(\Delta-2)\beta\gamma-\Delta}{\beta}}\right)}\right)}\right|\leq(\beta\gamma-1)\frac{1}{\beta\cdot\frac{(\Delta-2)\beta\gamma-\Delta}{\beta}+1}$

$\displaystyle=\frac{\beta\gamma-1}{(\Delta-2)(\beta\gamma-1)-1}\leq O(1/\Delta).\qed$

F.2 Proof of 27

In this subsection, we use results from [GL18] to prove 27. Their potential function is implicitly defined by its derivative for the marginal ratios as

\displaystyle\Phi^{\prime}(R)=\phi(R)=\min\left\{{\frac{\beta\gamma-1}{\alpha\gamma\log\frac{\lambda+\gamma}{\beta\lambda+1}},\frac{1}{R\log\frac{\lambda}{R}}}\right\}

for a constant $0\leq\alpha\leq 1$ depending only on $\beta,\gamma,\lambda$ (see [GL18] for a precise definition). In our context, the corresponding potential for the log ratios is

\displaystyle\Psi^{\prime}(y)=\psi(y)=e^{y}\phi(e^{y})=\min\left\{{\frac{\beta\gamma-1}{\alpha\gamma\log\frac{\lambda+\gamma}{\beta\lambda+1}}e^{y},\frac{1}{\log\frac{\lambda}{e^{y}}}}\right\}

and is bounded by constants depending on $\beta,\gamma,\lambda,\Delta$ for $\log(\lambda/\gamma^{\Delta-1})\leq y\leq\log\lambda$ .

One of the main technical results in [GL18] is showing that the tree recursion is contracting with the potential function $\Phi$ , and the derivative $\phi$ is bounded in the sense that there exist positive constants $C_{1},C_{2}$ depending only on $\beta,\gamma,\lambda$ such that $C_{1}\leq\phi(R)\leq C_{2}$ for all $0\leq R\leq\lambda$ . [GL18] refer to such a function as a universal potential function.

In our context, we get that $\Psi$ is an $(\alpha,c)$ -potential function which satisfies 4, but with a constant $c$ that depends on $\gamma,\Delta$ . Indeed, in the worst case, we have

\displaystyle\max_{y_{1},y_{2}}\frac{\psi(y_{2})}{\psi(y_{1})}\geq\frac{\psi(\log\lambda)}{\psi(\log(\lambda/\gamma^{\Delta-1}))}=\frac{\lambda\frac{\beta\gamma-1}{\alpha\gamma\log\frac{\lambda+\gamma}{\beta\lambda+1}}}{\frac{\beta\gamma-1}{\alpha\log\frac{\lambda+\gamma}{\beta\lambda+1}}\cdot\frac{\lambda}{\gamma^{\Delta}}}=\gamma^{\Delta-1}.

More precisely, we have the following result from [GL18], stated in terms of the log marginal ratios.

Theorem 38.

Assume $\beta,\gamma,\lambda$ are nonnegative real numbers satisfying $\beta\leq 1\leq\gamma$ , $\sqrt{\beta\gamma}\geq 1$ , and $\lambda<\left({\frac{\gamma}{\beta}}\right)^{\frac{\sqrt{\beta\gamma}}{\sqrt{\beta\gamma}-1}}$ . Then the function $\Psi$ is an $(\alpha,c)$ -potential function for a constant $0<\alpha<1$ depending on $\beta,\gamma,\lambda$ , and a constant $c>0$ depending on $\beta,\gamma,\lambda,\Delta$ .

Combined with 5, this gives $O(n^{C})$ mixing with a constant $C$ depending only on $\beta,\gamma,\lambda,\Delta$ . We note this is weaker than the correlation decay result in [GL18], since there, $C$ does not depend on $\Delta$ , and hence is efficient for arbitrary graphs.

Appendix G Slightly faster mixing

In this section, we slightly optimize our mixing time results for certain antiferromagnetic 2-spin systems by more carefully taking into account the tradeoff between the (nontrivial) spectral independence bound we prove based on contraction, and the (trivial) spectral independence bound we obtained in Section A.2 for handling constant-sized graphs.

Proposition 39.

Suppose a distribution $\mu$ on subsets of $[n]$ is $(\eta_{0},\dots,\eta_{n-2})$ -spectrally independent for $\eta_{i}\leq\min\{a,(n-i-1)b\}$ , for some $a\geq 0$ and $0\leq b\leq 1$ . Then the Glauber dynamics for sampling from $\mu$ has spectral gap at least $\frac{1}{n}\cdot\Omega\left({\frac{a}{bn}}\right)^{a}$ .

Proof.

Suppose we have already conditioned on a $c$ -fraction of elements to be “in/out”. The resulting distribution is both $b(1-c)n$ -spectrally independent and $a$ -spectrally independent. The exact threshold $c$ for which the bound $b(1-c)n$ is better than $a$ is given by

\displaystyle c=1-\frac{a}{bn}

We note such a $c$ only makes sense when $0\leq 1-\frac{a}{bn}\leq 1$ , or equivalently, $bn\geq a$ . Now, we apply the $a$ -spectral independence bound for all conditional distributions based on fixing at most $c$ -fraction of vertices. We apply the $(n-i-1)b$ -spectral independence otherwise. We obtain a final spectral gap lower bound of

\displaystyle\frac{1}{n}\cdot(1-b)^{(1-c)n}\cdot\prod_{k=0}^{cn}\left({1-\frac{a}{n-k-1}}\right)

Observe that

\displaystyle(1-b)^{(1-c)n}=(1-b)^{\frac{a}{b}}\gtrsim\exp(-a)

We also have

	$\displaystyle\prod_{k=0}^{cn}\left({1-\frac{a}{n-k-1}}\right)$	$\displaystyle\gtrsim\exp\left({-a\sum_{k=0}^{cn}\frac{1}{n-k-1}}\right)$
		$\displaystyle\gtrsim\exp\left({-a\left({\underset{\approx\log n}{\underbrace{\sum_{k=0}^{n-2}\frac{1}{n-k-1}}}-\underset{\approx\log(1-c)n}{\underbrace{\sum_{k=cn+1}^{n-2}\frac{1}{n-k-1}}}}\right)}\right)$
		$\displaystyle\gtrsim\exp\left({-a\cdot\log\frac{1}{1-c}}\right)$
		$\displaystyle\gtrsim\exp\left({-a\log\frac{bn}{a}}\right)$
		$\displaystyle\gtrsim\left({\frac{a}{bn}}\right)^{a}$

Putting these together, we obtain the desired lower bound. ∎
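To make the tradeoff concrete, the two factors in the spectral gap bound can be evaluated directly. The sketch below is a numerical illustration only: the function names are hypothetical, $cn$ is rounded down to an integer (an assumption the proof does not spell out), and the parameters in the usage note are arbitrary.

```python
import math

def gap_factors(n, a, b):
    """Return (tail, prod), where
         tail = (1 - b)^((1 - c) n),
         prod = prod_{k=0}^{cn} (1 - a / (n - k - 1)),
       with the threshold c = 1 - a / (b n) from the proof."""
    if not (0 < b < 1 and 0 <= a <= b * n):
        raise ValueError("need 0 < b < 1 and 0 <= a <= b * n")
    cn = math.floor(n - a / b)      # c * n, rounded down (assumption)
    if n - cn - 1 <= a:
        raise ValueError("every factor 1 - a/(n-k-1) must be positive")
    tail = (1 - b) ** (n - cn)
    prod = 1.0
    for k in range(cn + 1):
        prod *= 1 - a / (n - k - 1)
    return tail, prod

def spectral_gap_lower_bound(n, a, b):
    """(1/n) * tail * prod, the final lower bound in the proof."""
    tail, prod = gap_factors(n, a, b)
    return tail * prod / n
```

For example, with $n=4096$, $a=2$, $b=1/16$ the product runs over $m=n-k-1$ from $4095$ down to $31$, and since $a=2$ it telescopes to $\frac{29\cdot 30}{4094\cdot 4095}$, which is within a constant factor of $(a/(bn))^{a}=(1/128)^{2}$, in line with the claimed bound.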

We can apply this result to the antiferromagnetic models with $\sqrt{\beta\gamma}\leq\frac{\Delta-2}{\Delta},\gamma\leq 1$ and with $\beta=0,\gamma\leq 1$ , since the proof of 28 shows that such systems are roughly $Cn$ -spectrally independent with $C\leq O(1/\Delta)$ .

Corollary 40 (Soft Constraints).

Fix integers $\Delta\geq 3$ , $1<\overline{\Delta}<\Delta$ . Let $\beta,\gamma,\lambda\geq 0$ be nonnegative real numbers satisfying $\frac{\overline{\Delta}-2}{\overline{\Delta}}\leq\sqrt{\beta\gamma}\leq\frac{\overline{\Delta}-1}{\overline{\Delta}+1}$ and $\gamma\leq 1$ . Assume further that $(\beta,\gamma,\lambda)$ is up-to- $\Delta$ unique with gap $0<\delta<1$ . Then for every $n$ -vertex graph $G$ with maximum degree at most $\Delta$ , the Glauber dynamics for sampling from the antiferromagnetic 2-spin system with parameters $(\beta,\gamma,\lambda)$ mixes in $O\left({\frac{\overline{\Delta}\cdot n}{\Delta}}\right)^{O(1/\delta)}$ steps.

Corollary 41 (Hard Constraints).

Fix an integer $\Delta\geq 3$ , fix $\beta=0$ , and let $0\leq\gamma\leq 1,\lambda\geq 0$ be up-to- $\Delta$ unique with gap $0<\delta<1$ . Then for every $n$ -vertex graph $G$ with maximum degree at most $\Delta$ , the Glauber dynamics for sampling from the antiferromagnetic 2-spin system with parameters $(\beta,\gamma,\lambda)$ mixes in $O\left({\frac{n}{\Delta}}\right)^{O(1/\delta)}$ steps.

