Approximate counting using Taylor’s theorem:
a survey

Viresh Patel111School of Mathematical Sciences, Queen Mary, University of London. Email: viresh.patel@qmul.ac.uk.    Guus Regts222Korteweg de Vries Institute for Mathematics, University of Amsterdam. Email: guusregts@gmail.com. Funded by the Netherlands Organisation of Scientific Research (NWO): VI.Vidi.193.068
Abstract

In this article we consider certain well-known polynomials associated with graphs including the independence polynomial and the chromatic polynomial. These polynomials count certain objects in graphs: independent sets in the case of the independence polynomial and proper colourings in the case of the chromatic polynomial. They also have interpretations as partition functions in statistical physics.

The algorithmic problem of (approximately) computing these types of polynomials has been studied for close to 50 years, especially using Markov chain techniques. Around eight years ago, Barvinok devised a new algorithmic approach based on Taylor’s theorem for computing the permanent of certain matrices, and the approach has been applied to various graph polynomials since then. This article is intended as a gentle introduction to the approach as well as a partial survey of associated techniques and results.

Keywords: approximate counting, independence polynomial, complex zeros, chromatic polynomial.

1 Introduction

Computational counting is an area of theoretical computer science, which, at its heart, is concerned with the computational problem of counting certain structures inside some combinatorial object given as input. Think of counting the number of satisfying assignments of some logical formula, or the number of independent sets in a graph. Often the structures to be counted have some natural weighting and one is interested in the weighted count.

In this article, we focus on graph counting problems and in particular on finding efficient algorithms for (approximately) counting objects of interest inside some input graph. The counting problems we consider here are ones where the number of objects to be counted is typically super-polynomial in the size of the graph and cannot be directly enumerated in polynomial time. For example, the number of independent sets of a graph can be exponentially large in the size of the graph333As a contrasting example the number of triangles of a graph is polynomial in the size of the graph and can be enumerated by brute force in polynomial time. Of course it is interesting to know whether there is an algorithm for counting triangles that is better than using brute force, but we do not pursue this here. and indeed, the problem of (approximately) counting independent sets of a graph is a rich area of research (here an independent set in a graph is a subset of vertices no two of which are adjacent). The problem of exact counting is often computationally hard (this is the case for independent sets [78, 86, 42]) so one is usually interested in approximation algorithms for counting problems. A notable exception is the problem of counting spanning trees of a graph. The number of spanning trees is typically exponential in the size of the graph, but spanning trees can be counted in polynomial time via the matrix tree theorem [62]. Throughout the article we use the example of counting independent sets in graphs to illustrate the various ideas we discuss. In fact the ideas apply more generally for counting many other graph theoretic objects including trees, matchings, cuts, and proper colourings (we discuss proper colourings towards the end of the article).

The basic combinatorial counting problems are often not treated directly, but are considered in more generality by examining their corresponding generating functions. For example, for independent sets, one is interested in the independence polynomial, which for a graph G=(V,E), is defined to be the polynomial

ZG(λ):=SVindependentλ|S|=k0αkλk,

where αk=αk(G) is the number of independent sets of size k in G. This polynomial encodes a lot of information about the (sizes) of independent sets in G. For example it is easy to see that ZG(1) gives the number of independent sets in G and ZG(1)/ZG(1) gives the average size of an independent set in G. Knowing the value of ZG(λ) for very large λ would allow one to extract the degree of the polynomial i.e. the size of the largest independent set (which is known to be NP-hard to compute and even to approximate within a constant factor). This already tells us we should not expect to be able to efficiently approximate the independence polynomial at all values of λ.

In this article, we describe a recent technique, the so-called Taylor polynomial interpolation method of Barvinok (first introduced in [7]), for designing approximation algorithms for computational counting problems. Our aim here is to introduce the reader to the ideas behind the method and to give a flavour of the mathematics involved. We do not intend to give a complete survey of results that use the technique and nor do we fully formalise all of the ideas we present. For the latter, we refer the reader to the excellent book of Barvinok [6] and to [72].

One distinguishing feature of the Taylor interpolation method is that, as well as its applications to ordinary counting problems, it also applies to evaluations of generating functions at negative and complex numbers. This is in contrast to earlier techniques. One motivation for understanding such complex evaluations is in quantum computing [90, 26, 70, 49], although we will not discuss this here. Another is that complex evaluations are sometimes useful for real counting problems (see e.g. [3]), and perhaps most importantly, broadening our perspective to the complex plane gives a deeper understanding of the underlying computational complexity of various counting problems (see Section 5 for more discussion on this).

Other techniques for designing approximate counting algorithms (which we will not discuss) include the Markov chain Monte Carlo method (see Jerrum [62] for an excellent introduction to the area) as well as the correlation decay method first introduced by Weitz [89] and Bandyopadhyay and Gamarnik [4] (see e.g. Chapters 5 and 6 of [6] for an introduction). A very recent technique, closely related to Barvinok’s interpolation method, is based on the cluster expansion from statistical physics and has been introduced by Jenssen, Keevash and Perkins [61]. We say a few words about this at the end of Section 4.

1.1 Connection to statistical physics

The generating functions for the counting problems we encounter are often studied in the statistical physics community (using different terminology). For example the independence polynomial is known as the partition function of the hard-core model in statistical physics. The hard-core model is a model for gases. Given a closed container of a gas at equilibrium consider examining the gas in a small region of space inside the container. The (discretised) space in the region is represented by a grid graph, where vertices of the graph represent points in space. Each such point can either be occupied or unoccupied by a gas molecule but adjacent points in space cannot both be occupied due to repulsive forces between the molecules. Therefore, at any moment in time, the gas molecules can only occupy an independent set in the grid. The probability (S) that at any moment in time the occupied points form a particular independent set S of the grid is proportional to λ|S|, where λ[0,) is a temperature-like parameter often called the fugacity. A high temperature corresponds to a small value of λ, which, as we intuitively expect, makes it less likely that we see a large set S of occupied points in our small region of space. Since (S)λ|S|, and SV independent(S)=1, we see that (S)=λ|S|/ZG(λ). Here we see the independence polynomial ZG(λ) (known here as the partition function of the hard-core model) appearing as the normalising constant in the probability. Again, this partition function is much more than just a normalising constant, and encodes a lot of physical information about the system. For example, by considering the limiting behaviour of lnZG(λ)/|V(G)| and its derivatives for larger and larger graphs (usually grids), discontinuities of these limit functions give information about phase transitions in the system, that is, sharp changes in the physical parameters associated with the system indicating a qualitative change in the system. We direct the reader to [46] for a comprehensive and rigourous mathematical treatment of phase transitions for many models and to [81, 38, 53] for more on the hard-core model. We will not be concerned directly with the statistical physics, but some results originally proved by statistical physicists will be used in the algorithmic approach we describe.

1.2 Preliminaries

We have already mentioned the independence polynomial as an example of a graph polynomial that we may wish to approximate. The independence polynomial will serve as a running example throughout the article to illustrate various ideas. Here we mention a few basic properties of the independence polynomial to give the reader a feel for this object.

Recall that ZG(λ)=λ|S|, where the sum is over all independent sets S of G. The first easy but important fact to note is that the empty set is an independent set, so ZG(0) (i.e. the constant term in the polynomial) is always 1. Another important fact is that the independence polynomial is multiplicative, that is ZG1G2(λ)=ZG1(λ)ZG2(λ), where we write G1G2 for the disjoint union of the graphs G1 and G2. This is because every independent set S of G1G2 can be written uniquely as S=S1S2, where Si is an independent set of Gi. Therefore λ|S|=λ|S1|λ|S2|, which allows us to factorise the sum. Using this multiplicative property, we also see, for example, that the independence polynomial of k isolated vertices is (1+λ)k. One can also see directly that the complete graph on k vertices has independence polynomial 1+kλ.

We now describe the type of algorithm we ideally wish to obtain for our graph counting problems. Suppose p=p(G) is a graph parameter, e.g. p(G) is the number of independent sets in G, or p(G)=ZG(λ) for some fixed λ. Note that we allow p(G) to be a complex number. A fully polynomial-time approximation scheme (or FPTAS for short) for p is an algorithm that takes as input a graph G and an error tolerance ε>0 and outputs a (complex) number N such that N=eεtp(G) for some t with |t|1 in time polynomial in |G| (the number of vertices of G) and ε1. Note that when ε is small, we have N=eεtp(G)(1+εt)p(G), so that N is roughly within a distance ε|p(G)| of the true value of p(G). For this reason we call such output N a multiplicative ε-approximation (for p(G)).444Note that this definition of FPTAS is consistent with the usual notion of FPTAS for real parameters. We also discuss algorithms that provide the same approximation as above but that run in time super-polynomial in |G|.

2 Barvinok’s interpolation method

In this section we describe the Taylor polynomial interpolation method of Barvinok, a method that can be applied to a wide variety of counting problems. Consider some graph polynomial, that is, each graph G has some associated polynomial P(z)=PG(z). As with the independence polynomial, we should imagine that PG is not directly accessible, i.e. at least some of its coefficients are difficult to compute from G. We will however assume that the degree of the polynomial PG is always bounded by a constant times |G|; this is certainly the case for the independence polynomial and is easy to verify for most graph polynomials one might consider. Our goal is to (efficiently) obtain a multiplicative ε-approximation for PG(z) for z.

The insight of Barvinok was to use Taylor’s theorem, about power series approximations of smooth functions, to obtain the desired approximation. At first sight we seem to gain nothing from Taylor’s theorem because the Taylor series of a polynomial is simply the polynomial itself. However, notice that the truncated Taylor series of a (non-polynomial) function gives an additive ε-approximation to the function, whereas we are interested in a multiplicative ε-approximation. Therefore, rather than considering the Taylor series of PG(z), we should in fact consider the Taylor series of g(z):=lnPG(z) and then take the exponential of the result to obtain the desired approximation.555In order for g(z) to be well-defined we need to fix a branch of the logarithm here; we say more below.

To this end, consider the Taylor series of g(z) about zero:

g(z)=k=0g(k)(0)k!zk,

where g(k) denotes the kth derivative of g. Unfortunately, the Taylor series of a function does not usually converge for all z. We will return shortly to the question of convergence, but let us assume for now that the Taylor series does converge to g(z) for a value of z we are interested in. In that case, if we write Tm(z) for the first m terms of the Taylor series of g above, then for m sufficiently large, we will have that |Tm(z)g(z)|<ε, i.e. Tm(z)=g(z)+εt for some t with |t|<1. Taking the exponential of both sides of the equation, we obtain exp(Tm(z))=exp(εt)PG(z) i.e. exp(Tm(z)) a multiplicative ε-approximation for PG(z).

This gives us the desired approximation, but several questions remain. Firstly, there is the question of convergence mentioned above. Secondly, if the Taylor series does converge, then how large does m have to be to guarantee that |Tm(z)g(z)|<ε? Finally, how can we actually compute Tm(z) in order to compute our approximation exp(Tm(z)) for PG(z)? Note that we do not have direct access to the numbers g(k)(0); these have to be computed in some way.

For the first question of convergence, Taylor’s theorem says that the Taylor series for g converges inside the disk DR:={z:|z|R} for any R>0 provided that g is analytic inside DR. In our case, this holds provided PG(z)0 for all zDR.666Formally, to ensure g is analytic, we fix lnPG(0), and take the branch of g(z)=lnPG(z) on DR given by g(z)=lnPG(0)+0zPG(w)/PG(w)𝑑w. So the Taylor series will converge inside the largest disk that contains no roots of PG(z). Establishing such zero-freeness results for particular graph polynomials will be the subject of Section 4.

The second question concerns the rate of convergence of the Taylor series of g. Here we take advantage of the particular form of g as the logarithm of a polynomial. If η1,,ηd are the (complex) roots777Again, we do not typically have access to the roots of PG; we work with the roots only in the analysis of the algorithm. of PG(z) then we can write PG(z)=ai=1d(1zηi), where a=PG(0) is assumed to be non-zero. Then taking logarithms of both sides, we have

g(z)=ln(a)+i=1dln(1(z/ηi)).

Using that the Taylor series of ln(1z)=zz22z33 for |z|<1, we obtain the Taylor series of g as

g(z)=ln(a)i=1dk=1(z/ηi)kk

for |z|<mini|ηi| (precisely the condition of zero-freeness mentioned above). Assuming |z|δmini|ηi| for some δ(0,1), this gives

|g(z)Tm(z)|i=1dk=m|(z/ηi)kk|i=1dk=mδk=dδm1δ.

In order to bound the last expression by ε, it is sufficient to take mCln(d/ε) for some constant C depending on δ. For such m we have that exp(Tm(z)) is a multiplicative ε-approximation for PG(z).

The final question of actually computing Tm(z) is more subtle and will only be partially addressed here and in the next section. We will show that if we know the values of the first m=Cln(d/ε) coefficients of PG(z) then we can compute the derivatives g(0),g(1),,g(m) in time poly(m). However, we do not typically have immediate access to the first m coefficients of our graph polynomials. For example, in the case of the independence polynomial ZG(λ), the coefficient αk of λk is the number of independent sets of size k in G: computing this naively with a brute force approach of checking every k-tuple of vertices takes time nk (where n=|G|) and so computing αm takes time nm=nO(lnn) (noting that the degree of ZG i.e. the size of the largest independent set could be and often is linear in n). In the next section, we show how to compute α0,,αm in poly(n) time and the idea turns out to generalise for many other graph polynomials of interest. For now, here is how to compute Tm(z) given the first m coefficients of PG(z).

Suppose P(z)=PG(z)=a0+a1z++adzd. We defined g(z)=lnPG(z). We know g(0)(0)=g(0)=ln(a0). If we differentiate once and rearrange, we obtain g(1)(z)P(z)=P(1)(z). If we now repeatedly differentiate this expression, we obtain the following expressions:

P(1) =g(1)P(0)
P(2) =g(2)P(0)+g(1)P(1)
P(r) =g(r)P(0)+(r11)g(r1)P(1)+(r12)g(r2)P(2)++(r1r1)g(1)P(r1).

Evaluating these expressions at zero, and noting that P(r)(0)=r!ar, we obtain

a1 =a0g(1)(0)
2a2 =a0g(2)(0)+a1g(1)(0)
rar =a0g(r)(0)+(r1)!(r1)!a1g(r1)(0)+(r1)!(r2)!a2g(r2)(0)++(r1)!1!ar1g(1)(0).

We see that if we know a0,,ar and we have computed g(0)(0),,g(r1)(0), then we can use the rth equation above to compute g(r)(0) in time O(r). Therefore given a0,,am, we can compute Tm(z) in O(m2) time.

The following summarises what we have shown in this section and is the essence of the Taylor polynomial interpolation method.

Theorem 2.1.

Suppose 𝒢 is an (infinite) set of graphs and for each G𝒢, PG(z) is a polynomial associated with G, where

PG(z)=i=0d(G)ai(G)zi

Suppose there exists R>0 and a function T:× with the properties that

  • (i)

    PG(z)0 whenever |z|R for all graphs G𝒢, and

  • (ii)

    we are able to compute ai(G) in time bounded by T(|G|,i), where we assume for convenience that T is non-decreasing in both arguments.

Then there is an algorithm, which, given input G𝒢, ε>0, and z with |z|<R, computes a multiplicative ε-approximation of PG(z) in time mT(n,m)+O(m2), where n=|G| and m:=Cln(d(G)/ε) (as defined earlier).

Some remarks are in order. The theorem is formulated for a general class of graphs 𝒢 rather than all graphs because often, we are only able to establish conditions (i) and (ii) effectively for certain types of graphs (typically bounded degree graphs). This is best illustrated by applying the result above to the independence polynomial.

For the independence polynomial, if we consider 𝒢 to be the set of all graphs, then there is no zero-free disk of positive radius888The k-vertex complete graph has independence polynomial 1+kλ, so its roots tend to 0. so we can only take R=0 in condition (i). However, if we restrict 𝒢 to graphs of maximum degree Δ, we will see in Section 4 that we can take R=(Δ1)Δ1/ΔΔ. Similarly for condition (ii), if we take 𝒢 to be all graphs, then the brute force approach mentioned earlier is essentially the only way to compute coefficients of ZG(λ) giving T(n,k)=O(nk): overall this gives a super-polynomial running time of mT(n,m)+O(m2)=nO(m)+O(m2)=nO(ln(n/ε)). Such a quasi-polynomial running time is already quite promising because it is significantly better than the exponential running time of a brute-force algorithm. However, in the next section we will see that for graphs of maximum degree at most Δ, we can compute the coefficients much faster and take T(n,k)=poly(n)ΔO(k), thereby establishing an overall polynomial running time of T(n,m)=poly(n)ΔO(m)=poly(n)ΔO(ln(n/ε))=(n/ε)O(lnΔ). Combining the results from the next two sections will therefore give an FPTAS for computing ZG(λ) on graphs of maximum degree at most Δ provided |λ|<(Δ1)Δ1/ΔΔ.

Finally, we remark that one can in fact relax condition (i) to include regions that are not necessarily disks provided the region is “thick” in a certain sense and contains the point 0. Concretely one should think of a small neighbourhood of a real interval or a sector region. Relaxing condition (i) to non-disk regions is achieved by making suitable polynomial transformations of PG; see Section 2.2.2 of [6] and [10] for details.

3 Polynomial running time for bounded degree graphs

In the last section we saw how we can use Taylor’s theorem to design algorithms to approximate graph polynomials. Let P=PG be a graph polynomial. Examining Theorem 2.1, we require two ingredients to establish an approximation algorithm to compute PG(z). First we need to establish a zero-free disk for PG; this will be discussed in detail in the next section. Second, we need to be able to efficiently compute the first O(ln|G|) coefficients of PG, which we discuss in detail here. There is usually a straightforward, direct approach for computing these coefficients, which leads to quasi-polynomial time algorithms, but which is not fast enough for an FPTAS. We already saw this in the last section with the independence polynomial, where we saw that computing the coefficients naively leads to an nO(lnn)-time algorithm.999Computing the coefficients naively often gives us quasi-polynomial time approximation algorithms as with the example of the independence polynomial. This is already very good because it is a significant improvement on the exponential time taken to enumerate independent sets in a graph. Achieving the polynomial runtime of an FPTAS is considered to be the gold standard in the area. In this section, we show how to compute the coefficients of the independence polynomial more efficiently for bounded degree graphs. The technique generalises to many other graph polynomials but all the key ideas are best understood through the concrete example of the independence polynomial. We give the statement for general graph polynomials at the end of the section.

It is worth noting that, barring a few exceptions, the setting of bounded degree graphs is often the setting of interest. For example, the problem of computing a multiplicative ε-approximation for ZG(z) is known to be computationally hard for all complex z0 if we have no restriction on G; see [84, 51, 23].

3.1 Computing the coefficients of the independence polynomial efficiently

Recall that the independence polynomial ZG(λ) is given by

ZG(λ)=k0αkλk,

where αk=αk(G) is the number of independent sets of size k in G. Throughout this section we focus on bounded degree graphs and write 𝒢Δ for the set of graphs of maximum degree at most Δ. If we apply Theorem 2.1 to ZG(λ) (with G𝒢Δ), assuming we have some suitable zero-free disk containing λ, Theorem 2.1 gives us an algorithm to compute a multiplicative ε-approximation of ZG(λ) in time mT(n,m)+O(m2), where T(n,i) is the time needed to compute αi(G) for n-vertex graphs G𝒢Δ and mCln(n/ε). We will sketch a proof of the following.

Theorem 3.1.

For G𝒢Δ, we can compute αi(G) in time poly(|G|)ΔO(i), i.e. for n-vertex graphs of maximum degree at most Δ, we can take T(n,i)=poly(n)ΔO(i).

Using the theorem above, Theorem 2.1 gives us an approximation algorithm for the independence polynomial with running time

mT(n,m)+O(m2)=poly(n)ΔO(ln(n/ε)=poly(n)(n/ε)O(ln(Δ)).

We see that this running time is of the form required for a fully polynomial time approximation scheme. We now sketch the proof of Theorem 3.1.

Sketch proof of Theorem 3.1

We begin by generalising the algorithmic problem we are interested in. We are interested in computing αk(G), the number of independent sets of size k in G, when G𝒢Δ. Equivalently, αk(G) is the number of induced copies of the graph Ik in G, where Ik is the graph consisting of k vertices and no edges. Generally for graphs H and G, write ind(H,G) for the number of induced copies101010The number of induced copies of a graph H in the graph G=(V,E) is defined as the number of vertex subsets SV such that G[S]=H. of H in G. Then αk(G)=ind(Ik,G).

The first observation is that, while we do not know how to efficiently compute ind(Ik,G), it is not too hard to efficiently compute ind(H,G), when H is connected.

Observation 3.1.

We can compute ind(H,G) in time poly(n)ΔO(k), where G𝒢Δ, n=|G| and k=|H|.

To see this, we first pick any spanning tree T in H (i.e. a subgraph of H that is a tree and that uses all k vertices of H). Such a spanning tree exists because H is connected. The idea is to find all (not necessarily induced) copies of T in G and to check which of the copies of T extend to an induced copy of H. This accounts for all induced copies of H because every induced copy of H in G contains a (not necessarily induced) copy of T in G.

There are only relatively few (not necessarily induced) copies of T in G. Indeed, first we enumerate the vertices of T in a breadth-first ordering v1,v2,,vk. We embed T into G one vertex at a time in order. There are n choices of where to embed v1. Each subsequent vertex of T has at most Δ possibilities for its embedding into G because when we come to embed vi, its parent in T (say vi) has already been embedded as some vertex xi in G, so the embedding of vi must be a neighbour of xi in G. Therefore altogether there are at most nΔk1 embeddings of T in G and each such embedding of T is checked to see if it gives an induced copy of of H.

From Observation 3.1, we see that we can compute ind(H,G), when H is connected, but the graph H we are interested in, namely Ik, is very much disconnected. It would be useful if we could express ind(Ik,G) in terms of ind(H,G) for connected H. A trivial case of this is the fact that ind(I2,G)=(n2)ind(e,G), where e is the graph on two vertices with an edge between them. This says nothing other than that the number of edges and non-edges in an n-vertex graph sum to (n2). With a little more work, we can express ind(I3,G) in terms of induced counts of connected graphs as follows. There are four graphs on three vertices, namely I3, the triangle denoted T, the path on three vertices denoted P3 and the disjoint union of an edge and a vertex denoted e+I1. By enumerating all induced subgraphs of G on three vertices, we have

ind(I3,G)=(n3)ind(T,G)ind(P2,G)ind(e+I1,G).

The only disconnected graph on the right hand side is e+I1, and by simple counting, it is not too hard to show that

ind(e+I1,G)=(n2)ind(e,G)2ind(P3,G)3ind(T,G).

Substituting the second formula into the first gives an expression for ind(I3,G) in terms of induced counts of connected graphs.

These calculations suggest that it is possible to express ind(Ik,G) in terms of induced counts ind(H,G) for connected graphs H, but that the calculations and formulae will get cumbersome. A new idea is needed to approach the problem in a systematic and manageable way. The next observation is the key insight to overcoming this hurdle and is at the heart of the proof of Theorem 3.1. It was proved by Csikvári and Frenkel [36]; the proof is short and can also be found in [72].

Observation 3.2.

Suppose τ(G) is an additive graph property, meaning that it satisfies the following two properties.

  • (i)

    τ(G) can be written as sum of products of induced graph counts, i.e. for all G

    τ(G)=i=1rμiHiind(H,G),

    where i is a (finite) set of graphs and μi is a constant for each i=1,,r, and

  • (ii)

    τ(G1G2)=τ(G1)+τ(G2) for all graphs G1 and G2.

Then τ is in fact of a simpler form, namely, for all graphs G, we have

τ(G)=λ1ind(H1,G)++λsind(Hs,G),

where H1,,Hs are connected graphs and λ1,,λs.

The observation above says that every additive graph parameter is a linear combination of ind(Hi,G) for connected Hi, and so by Observation 3.1, such additive graph parameters can be computed efficiently.111111Actually efficient computation is not immediate because it depends on the number and size of the Hi; we address this later. Our task now is reduced to the task of expressing αk(G)=ind(Ik,G) in terms of additive graph parameters. In order to do this, we now switch from the combinatorial to the polynomial perspective of αk(G).

Recall that the αk(G) are the coefficients of the independence polynomial ZG, i.e. ZG(λ)=α0+α1λ+αdλd. Suppose that η1,,ηd are the roots of ZG. Noting that the constant term α0 is one, we can write ZG(λ)=(1η11λ)(1ηd1λ). While we cannot compute the ηi directly, we can relate them to the coefficients αk by expanding the product above. We see that the αk are the elementary symmetric polynomials in ηi1, namely

α0=1,α1=1idηi1,α2=1i<jdηi1ηj1etc.

Another important class of symmetric polynomials are the power sums. Let us define the ith power sum pi to be

pi=η1i++ηdi.

It is well known that the power sums can be related to the elementary symmetric polynomials using the Newton identities. There are several short derivations of these identities. In the context of our problem, the Newton identities give the following expressions relating the αi and the pi.

α1 =α0p1
2α2 =α0p2+α1p1
3α3 =α0p3+α1p2+α2p1
tαt =α0pt+α1pt1++αt1p1.

From this it is easy to see that if we know the values of the pi then we can inductively compute the αi. Indeed, if we know the values of p1,,pt, and we also know (by induction) the values of α1,αt1 then using the tth identity, we can compute αt. Thus the problem of efficiently computing the αi is reduced to that of efficiently computing the pi. It is possible to efficiently compute the power sums because, as the reader may have guessed, the power sums are additive graph parameters.

Observation 3.3.

The power sums pi=pi(G) as defined above have the property of being additive graph parameters.

It is easy to verify that pi satisfies the second property of an additive graph parameter, namely that pi(G1G2)=pi(G1)+pi(G2) for any graphs G1 and G2. Indeed, since ZG1G2=ZG1ZG2 (see Section 1.2), if η1,,ηd are the roots of of ZG1 and ν1,,νd are the roots of ZG2 then η1,,ηd,ν1,,νd are the roots of ZG1G2 so that

pi(G1G2)=η1i++ηdi+ν1i++νdi=pi(G1)+pi(G2).

For the first property, we use the Newton identities. Note that, since α0=1, we can rearrange the tth identity and express pt as a sum of products of p1,,pt1 and α1,,αt. We know that the αi are induced graph counts, and if we assume by induction that p1,,pt1 are also sums of products of induced graph counts, then we see that pt is also a sum of products of induced graph counts and so satisfies property (i) of an additive graph parameter.

We now have all the ingredients to explain how to compute the αk efficiently. We can compute the power sums pi efficiently. This is because the power sums are additive graph parameters (Observation 3.3) and they are therefore linear combinations of induced counts of connected graphs (Observation 3.2). Each induced graph count ind(H,G) in this linear combination can be computed efficiently when G is of bounded degree since H is connected (Observation 3.1) thus allowing us to compute the power sums efficiently. Once we have computed the power sums p1,p2,, we can inductively compute the αi using the Newton identities.

This gives the main ideas of the argument although there are a few subtleties that we have glossed over. The main one is that it is not quite obvious that we can compute the power sums pi(G) efficiently, i.e. in time poly(|G|)ΔO(i). While the pi(G) can be expressed as a linear combination of induced counts of connected graphs

pi(G)=λ1ind(H1,G)++λsind(Hs,G),

we have not said how to find H1,,Hs and λ1,,λs. Conceivably, s could be superexponential in i or the Hi could have size superlinear in i; in either case we would not automatically get the desired running time. However, by using the Newton identities more carefully, and using the fact that G has bounded degree it is not too difficult to overcome these technical obstacles. All the details can be found in [72].

3.2 Computing the coefficients of other graph polynomials efficiently

In Section 3.1, we described the main idea of how we can efficiently compute the first ln|G| coefficients of the independence polynomial ZG for graphs G of bounded degree. The ideas can be generalised to work for many other graph polynomials of interest.

What are the crucial properties of the independence polynomial ZG that we use in the sketch proof of Theorem 3.1? The whole proof is based around manipulating induced graph counts, so we certainly need the coefficients of ZG to be (functions of) induced graph counts. We also crucially need that ZG is multiplicative, which allows us to conclude that the power sums are additive, therefore allowing us to compute them efficiently.

In [72], we show that if a graph polynomial P=PG satisfies certain properties given below, then its coefficients can be computed efficiently for bounded degree graphs i.e. the ith coefficient of PG can be computed in time poly(n)ΔO(i) where G is an n-vertex graph of maximum degree at most Δ. As with the independence polynomial, this is enough to use the Taylor polynomial interpolation method from Section 2 to give an approximation algorithm for computing PG(z) (provided z is in a suitable zero-free disk) with the required run time of an FPTAS.

Suppose P=PG is a graph polynomial given by PG(z)=a0+a1z++adzd. Suppose that P satisfies the following properties for some fixed constant α>0:

  • (i)

    for each , the th coefficient of P can be expressed as a “α-bounded” linear combination of induced graph counts, that is, for all G𝒢Δ

    a(G)=HζH,ind(H,G),

    where the sum is over graphs H with at most α vertices and ζH, are constants (independent of G);

  • (ii)

    in property (i), for each H we can compute ζH, in time exp(O(|H|); and

  • (iii)

    PG is multiplicative, i.e. PG1G2=PG1PG2.

Then we can compute ai(G) in time poly(|G|)ΔO(i). Again, using the Taylor polynomial interpolation method, this leads to an FPTAS for approximating PG(z) for G𝒢Δ, again provided we establish a suitable zero-free disk containing z.

Note that in the case of the independence polynomial, properties (i) and (ii) are trivial and we saw it is easy to verify property (iii). These properties also hold for various other graph polynomials including the matching polynomial, the chromatic polynomial, and the Tutte polynomial.121212The Tutte polynomial is a polynomial in two variables, but the properties above hold if one of the variables is fixed We will not check these here, but refer the interested reader to [72]. It is also worth noting that the technique described in this section can be adapted and applied to polynomials beyond those satisfying properties (i)-(iii) above; see [68, 15, 70].

In Section 2 we explained how one can design algorithms for approximating graph polynomials using Taylor’s theorem. In this section, we showed how to make these algorithms efficient (having the running time of an FPTAS) for many graph polynomials provided we restrict attention to bounded degree graphs. We have seen in Section 2 that essential to all of these algorithms is to establish a suitable zero-free disk or zero-free region in the complex plane for the graph polynomial in question. Our discussion of algorithms ends at this point and in the next section, we turn our attention entirely to the independent problem of establishing these zero-free regions.

4 Techniques for proving absence of zeros

In the previous sections, we have sketched how the problem of approximately evaluating graph polynomials (particularly the independence polynomial) in a region of the complex plane is reduced to the problem of establishing that the polynomial has no zeros in that region. There is a long history of proving such results about the locations of zeros of graph polynomials and partition functions. The techniques used often have their origin in statistical physics but have now been picked up and extended by the theoretical computer science community. In this section we will discuss three different techniques.

4.1 Recursion and ratios

Many graph polynomials satisfy recursions in which the polynomial for a given graph can be expressed in terms of the polynomial for smaller graphs. Such recursions allow us to prove properties about the graph polynomial, such as absence of zeros, by induction. However, rather than working with the polynomials directly, it is often more productive to work instead with related quantities. We illustrate this approach through our running example of the independence polynomial and at the end of the section we direct the reader to further work in which this technique is employed.

Our aim is to sketch a proof of the following result due to Shearer [83], Dobrushin [38] and Scott and Sokal [81]:

Theorem 4.1.

Let G=(V,E) be a graph with maximum degree Δ2 and let λ satisfy |λ|λ(Δ):=(Δ1)Δ1ΔΔ. Then ZG(λ)0.

Let us briefly discuss this result before delving into the proof. First, recall that by the Taylor polynomial interpolation method (particularly Theorem 2.1 and Theorem 3.1), this result immediately implies an FPTAS for computing ZG(λ) for G𝒢Δ inside the zero-free disk given by |λ|<λ(Δ). Second, note that if we are only interested in zero-free disks, then one cannot improve Theorem 4.1 in the sense that we cannot increase the constant λ(Δ). Indeed, one can show that there is a sequence of graphs Gn (in fact trees) of maximum degree Δ and negative numbers λn such that ZGn(λn)=0 and λnλ(Δ) [81]. However there has been a lot of interest recently in establishing zero-freeness for non-disk regions. Most notably, it was shown recently [74] that ZG(λ)0 whenever G𝒢Δ and λR where R is an open set containing the interval [0,λc(Δ)) and λc(Δ):=(Δ1)Δ1/(Δ2)Δ. One significance of λc(Δ) is that it is an algorithmic threshold for real parameters λ: using the interpolation method, the result in [74] implies that there is an FPTAS131313In fact, an FPTAS was established earlier in [89] using the correlation decay method. to compute ZG(λ) whenever G𝒢Δ and λ[0,λc(Δ)), while for λ>λc(Δ), it is known that there is no such FPTAS unless P=NP [84, 51].

We now discuss the proof of Theorem 4.1. Let G=(V,E) be a graph and fix a vertex vV. We can write down a recursion for ZG(λ)=SV independent λ|S| by splitting the sum over those independent sets that do not contain v and those that do to obtain

ZG(λ)=ZGv(λ)+λZG[N[v](λ), (1)

where Gv (resp. GN[v]) denote the graphs obtained from G by removing v (resp. v and its neighbours in G). As mentioned earlier, rather than working directly with a recursion for ZG, it turns out to be more useful to work with a recursion of a related quantity. Define the ratio, RG,v, by

RG,v(λ):=λZGN[v](λ)ZGv(λ). (2)

Observe that provided ZGv(λ)0, we have ZG(λ)=0 if and only if RG,v(λ)=1 (using (1)). So to prove absence of zeros it suffices to inductively show that the ratios avoid 1.

Next we establish a recursion for these ratios. Let G be a graph with fixed vertex u0 and let λ. Let u1,,ud be the neighbours of u0 in G (in any order). Set G0=Gu0 and define for i=1,,d, Gi:=Gi1ui (so Gd=GN[u0]). Suppose that ZGi(λ)0 for all i=0,,d. Then we use ‘telescoping’ to write

RG,u0(λ)λ=ZGd(λ)ZG0(λ)=ZG1(λ)ZG0(λ)ZG2(λ)ZG1(λ)ZGd(λ)ZGd1(λ).

Applying (1) to each of the denominators and after some rearranging we end up with the following identity:

RG,u0(λ)=λi=1d(1+RGi1,ui(λ)). (3)

The identity above captures all the relevant combinatorics of independent sets that we need and the rest of the proof essentially boils down to proving a property about the above recursion.

Proof of Theorem 4.1.

We may assume that G is connected (if G has connected components H1,,Hk then ZG(λ)=ZH1(λ)ZHk(λ) and so it is sufficient to prove the theorem for each Hi).

Fix v0V. We will show by induction that the following holds for all UV{v0}:

  • (i)

    ZG[U](λ)0,

  • (ii)

    if u0U has a neighbour in VU, then |RG[U],u0(λ)|<1/Δ.

Indeed if |U|=0 then this is trivially true, so suppose that |U|>0. Then since G is connected, there is u0U that has a neighbour vVU. Let us write H=G[U] and let u1,,ud be the neighbours of u0 in H. Let H0=Hu0 and Hi=Hi1ui for i>0. Then, by induction ZHi(λ)0 and |RHi,ui+1(λ)|<1/Δ (since ui+1 has a neighbour in UV(Hi), namely u0). So we may use (3) to conclude that

|RH,u0(λ)|=|λ|i=1d|1+RHi1,ui(λ)| <|λ|(11/Δ)d
|λ|(Δ1Δ)(Δ1)=1/Δ, (4)

where we used that dΔ1 (since u0 has a neighbour in VU) and that |λ|λΔ. This shows (ii). Then, we also see that RH,u0(λ)1 and so ZH(λ)0, showing (i). This completes the induction.

To conclude the proof of the theorem we apply the same trick once more to RG,v0. From (4) we then obtain the bound |RG,v0|<1/(Δ1) since v0 may have d=Δ neighbours rather than dΔ1. Again we have RG,v0(λ)1 and so ZG(λ)0, as desired.

The proof essentially consists of two steps. First express a suitably chosen ratio in terms of ratios of smaller graphs. Secondly, use this expression to inductively show that these ratios are ‘trapped’ in some suitable region of the complex plane (the open disk of radius 1/Δ in the proof above). Of course the real ingenuity comes in finding the right ‘trapping region’.

This approach can be traced back to work of Dobrushin [38] and possibly even earlier. Recent years have seen many variations and refinements of this approach resulting in significant extensions of Theorem 4.1 [74, 19, 21] and zero-free regions for permanents [7, 8, 11], for the graph homomorphism partition functions [17, 16], for the partition function of the Ising and Potts models [67, 69, 75, 13, 22, 35], for Holant problems [76] and for various other graph polynomials [5, 9, 15, 14, 12, 64].

4.2 Stability of multivariate polynomials

In this subsection we briefly mention the technique of polynomial stability without going into too much detail. The basic idea here is that there are certain operations on polynomials that preserve certain useful properties. If one can use these operations to construct some desired graph polynomial or partition function from “elementary” polynomials, we can establish useful properties of the graph polynomial / partition function. The method is often most effective for multivariate polynomials, and indeed many graph polynomials have multivariate counterparts.

For our running example, the independence polynomial, the multivariate counterpart is defined as follows. Let G=(V,E) be a graph and associate to each vertex v a variable xv. The multivariate independence polynomial is then defined as

ZG((xv))=SVindependentxS,

where we use the shorthand notation xS:=vSxv. Note that if we set all the variables equal to λ then we recover the original (univariate) independence polynomial. The multivariate independence polynomial is a multi-affine polynomial meaning that it is affine in each variable (i.e. if we fix all but one variable xv it becomes a polynomial of degree 1 in xv). It is easy to see that any multi-affine polynomial f (in the same variables (xv)vV) can be written as f=SVasxS for some constants aS.

For two multi-affine polynomials P=SVpSxS and Q=SVqSxS, their Schur product, PQ is defined as the multi-affine polynomial in which the coefficient of xS is pSqS i.e. PQ=SVpSqSxS. We can build up the polynomial ZG using Schur product of simpler polynomials as follows. Suppose H1 and H2 are graphs on the same vertex set V and G is the union141414This is very different from the disjoint union of graphs that we made heavy use of in Section 3. of H1 and H2 (i.e. the edges of G are precisely the edges of H1 together with the edges of H2). Then

ZG=ZH1ZH2.

This is easy to see since we know S is an independent set of G if and only if S is an independent set of both H1 and H2 and the Schur product has the corresponding property that the coefficient of xS is 1 in ZH1ZH2 if and only if it is 1 in both ZH1 and in ZH2. For example the 4-cycle C4 with vertex set {1,2,3,4} and edges {1,2}, {2,3}, {3,4} and {4,1} is the union of two matchings M1 with edges {1,2},{3,4} and M2 with edges {1,3},{2,4}. Using the multiplicative property151515We showed this property for the univariate independence polynomial and it follows in the same way for the multivariate version of the independence polynomial, we know

ZM1=(1+x1+x2)(1+x3+x4) and ZM2=(1+x2+x3)(1+x1+x4)

and using the Schur product property, one can check

ZC4=ZM1ZM2=(1+x1+x2+x3+x4+x1x3+x2x4).

The Schur product corresponds beautifully well to taking unions of graphs for the independence polynomial, but does it preserve any useful properties? Writing 𝔻 for the open unit disk in , we say a multi-affine polynomial P=SVpSxS is 𝔻-stable if P((xv)vV)0 whenever xv𝔻 for all vV. It is well known (see [6]) that if P and Q are 𝔻-stable then so is PQ. This seems promising for us, but unfortunately, the independence polynomial of a matching or indeed a single edge (out of which we build all other independence polynomials) is not 𝔻-stable, e.g. ZM1(12,12,0,0)=0. The independence polynomial of a matching is however non-zero if all the arguments are in an open disk of radius 1/2. Now, using the fact that every graph in 𝒢Δ is the union of at most Δ+1 matchings (Vizing’s theorem) and applying a simple scaling argument, one can still make use of the 𝔻-stability of Schur products to show that ZG is non-zero if all arguments are in a disk of radius smaller than 1/2Δ+1, where G𝒢Δ.

This is a much weaker bound than Theorem 4.1 from the the previous subsection, but is given simply to illustrate the idea of stability. The idea of using multi-affine polynomials and operations preserving zero-freeness was pioneered by Asano [2] about fifty years ago to give a short and elegant proof of the famous Lee-Yang theorem (see also [6] for a proof using Schur products.) The theorem states that the partition function of the Ising model (in terms of vertex activities), which essentially is the generating function of the edge cuts in the graph, has all its zeros on the unit circle under suitable conditions; we choose not to introduce the relevant background here. By now there are several variations of the technique, some of which use the Grace-Szëgo-Walsh theorem, and they have been applied to partition functions of several models and graph polynomials [80, 80, 88, 54, 54, 20].

4.3 The polymer method

We introduced the multivariate independence polynomial in the last subsection to illustrate the idea of polynomial stability. It turns out that many other graph polynomials and partition functions can be expressed as evaluations of multivariate independence polynomials of a particular type. For this reason, there has been a lot of interest in understanding and proving conditions that guarantee zero-freeness of such multivariate independence polynomials. This idea of first rewriting a partition function/graph polynomial as an evaluation of a multivariate independence polynomial and then checking conditions from the literature known to guarantee that the latter evaluation is nonzero is a powerful technique originating in statistical physics. There, the multivariate independence polynomial is sometimes called the partition function of a polymer model, and the technique we describe is sometimes called the polymer method.

We will give an example of this idea applied to the chromatic polynomial, a graph polynomial used for counting proper colourings of a graph, which we will shortly introduce. We sketch a proof of a result of Férnandez and Procacci [44] and Jackson, Procacci and Sokal [59] about zero-freeness of the chromatic polynomial. At the end of the subsection, we list some recent results based on this technique and indicate how a variation of this technique can in fact be used directly to design efficient algorithms to approximate graph polynomials, without having to use the interpolation method.

4.3.1 The chromatic polynomial

For a graph G=(V,E) and integer q, a proper q colouring of G is an assignment of q colours (usually labelled 1,,q) to the vertices such that adjacent vertices receive different colours. This means in particular that all vertices assigned some fixed colour i form an independent set. The function χG counts the number of proper q-colourings of G, that is, for each q, χG(q) is defined to be the number of proper q-colourings of G. For example the number of proper q-colourings of a triangle is q(q1)(q2) since after ordering the vertices arbitrarily, the first vertex can receive any of the q colours, the second vertex may receive any of the colours except the colour of the first vertex, and the third vertex may receive any colour except those of the first two vertices (which are different).161616Note that the formula is correct even when q<3, i.e. when there are no proper q-colourings of the triangle. More generally, the number of proper q colourings of Kr, the complete graph on r vertices is q(q1)(qr+1), i.e. χKr(q)=q(q1)(qr+1) for every q. For any tree T on r vertices, χT(q)=q(q1)r1 for all q since if we colour the vertices in a breadth-first ordering, then the first vertex may receive any of the q colours, while each subsequent vertex can receive any colour except that of its parent. Of course, it is not usually so easy to determine χG(q) because it is NP-complete to decide if there is even one proper q-colouring of G, i.e. whether χG(q) is positive or not. Nonetheless, as the examples above suggest, χG(q) is always a polynomial in q as we shall see shortly, and χG is called the chromatic polynomial of G.

The chromatic polynomial was introduced in 1912 by Birkhoff in an attempt to prove the four colour theorem. It has a long history and has been studied from many perspectives together with its far-reaching generalisation, the Tutte polynomial (see [40, 43] for a comprehensive account).

We now establish a very useful formula for the chromatic polynomial called the random cluster model, due to Fortuin and Kasteleyn (see [45]); it is sometimes used as the definition of the chromatic polynomial. Formally, a proper q-colouring of a graph G=(V,E) is a function f:V{1,,q}=:[q] such that f(u)f(v) whenever {u,v}E. Then we can write

χG(q)=f:V[q]{u,v}E𝟙f(u)f(v),

where 𝟙f(u)f(v) is the indicator function that f(u)f(v) (so that the product is 1 if and only if all edges are properly coloured). Replacing 𝟙f(u)f(v) with (1𝟙f(u)=f(v))) and expanding, we obtain

χG(q)=f:V[q]{u,v}E(1𝟙f(u)=f(v))) =f:V[q]FE(1)|F|{u,v}F𝟙f(u)=f(v)
=FE(1)|F|f:V[q]{u,v}F𝟙f(u)=f(v).

The inner sum in the last expression is equal to qk(F), where k(F) is the number of components of the graph (V,F). The reason is that the product is 1 for an assignment f if and only if every edge of F is monochromatic in f, which means that f must assign a single colour to each component of (V,F). There are precisely qk(F) ways of doing this. Thus

χG(q):=FEqk(F)(1)|F|, (5)

and so we see that χG is indeed a polynomial (although there are easier ways of showing this) and has degree |G|.

Our goal will be to prove the following zero-freeness result for the chromatic polynomial.

Theorem 4.2 ([44, 59]).

Let G be any graph. Then all the zeros of χG are contained in the disk of radius 6.91Δ(G) centered at 0 in the complex plane.

It is likely that the constant 6.91 can be improved, but it is not clear what the optimal value is likely to be; see [79, Footnote 4] for further discussion. By the Taylor polynomial interpolation method, Theorem 4.2 almost immediately implies an FPTAS for approximating χG(q) whenever G𝒢Δ and |q|6.92Δ. The trick is to apply the interpolation method to the polynomial q|V|χG(1/q), which has no zeros in the disk of radius 16.91Δ. From the combinatorial perspective, this implies an FPTAS to count the number of proper q-colourings of any graph G𝒢Δ whenever q>6.91Δ. It is believed that there is an FPTAS for counting proper q-colourings whenever q>Δ and this is an active area of research. By proving a zero-freeness result for a different polynomial (the partition function of the Potts model) Liu, Sinclair, and Srivastava [66] have shown that there is an FPTAS when q2Δ, and this is currently the state of the art.171717There are improved bounds if we allow randomised algorithms based on the Markov chain Monte Carlo method [87, 33].

We now sketch the proof of Theorem 4.2. As mentioned, we will need to work again with the (multivariate) independence polynomial and to make use of a suitable zero-freeness result for it.

4.3.2 The chromatic polynomial as a multivariate independence polynomial

Our first lemma shows how to express the chromatic polynomial of a graph G as an evaluation of the multivariate independence polynomial of an associated graph. For this we need some notation. Let G=(V,E) be a graph. Define a new graph Γ whose vertices are subsets S of V of size at least two. (In the context of the polymer method, these sets are called polymers.) Two of those sets S,T are connected by an edge if and only if ST. Notice that the graph Γ is independent of the edges of G.

We now associate weights to vertices of Γ as follows; these will depend on the edges of G and on q (the variable in the chromatic polynomial). For each vertex S of Γ, i.e. SV with |S|2, define

λS:=FE(S)connected(1)|F|q|S|1. (6)

Now the multivariate independence polynomial of Γ with the (complex) vertex weights λS is given by

ZΓ((λS))=IV(Γ)independentSIλS. (7)
Lemma 4.3.

With notation as above we have

q|V|χG(1/q)=ZΓ((λS)).
Proof.

We start by expanding the left-hand side using (5)

q|V|χG(1/q) =FE(1)|F|q|V|k(F)=FEC component ofF(1)|C|q|V(C)|1.

Next, we break up the sum over FE in terms of the component structure of F as follows. We sum over all F that have exactly k connected components with vertex sets S1,,Sk and then we sum over all possible choices of S1,,Sk and all possible choices of k. In fact we can ignore the components that consist of a single vertex (and no edge) since they contribute a factor of 1 to the product above. In this way (after exchanging a sum and product) we obtain

q|V|χG(1/q)=k0S1,,SkVSiSj= for ij|Si|2i=1kFiE(Si)(Si,Fi) connected(1)|Fi|q|Si|1.

By construction any collection of sets {S1,,Sk} contributing to this sum forms an independent set of size k in the graph Γ. The weights are constructed precisely so that the last expression is ZΓ((λS)), as desired.

4.3.3 Zero-freeness conditions and their verification

Here we present a result due to Biascot, Férnandez and Procacci [25] that provides useful conditions that guarantee that our multivariate independence polynomial (for graphs of the type Γ) does not evaluate to zero. We will then verify these conditions for our situation. Let G=(V,E) and Γ be as before.

Theorem 4.4 ([25]).

For any complex numbers (λS)SV(Γ) and any a>1, if, for each vV, it holds that

SvS|S|2|λS|a|S|a1, (8)

then ZΓ((λS))0.

The theorem can be proved along the same lines as the proof of Theorem 4.1. See [25, Proposition 3.1] for a proof along these lines and a discussion of how this condition compares with other similar conditions including the Kotécky-Preis conditions [63] and Dobrushin’s conditions [38].

To verify the conditions in Theorem 4.4, we need a bound on the weights λS given in (6). Our first step in this direction is to get rid of the ‘alternating signs’ in (6). The lemma below can for example be proved using well-known properties of the Tutte polynomial; see e.g. [43] for these properties and see [73, 85] for a direct proof.

Lemma 4.5.

Let H be a connected graph and denote by τ(H) the number of spanning trees in H. Then

|FE(H)(V(H),F) connected(1)|F||τ(H).

For a graph G=(V,E), a vertex vV, and a variable x we define the tree generating function by

TG,v(x):=TE(G)(V(T),T) is a tree, vV(T)x|T|.

We can now bound SvS,|S|2|λS|a|S| in terms of the tree generating function as follows:

SvS,|S|2|λS|a|S| =SvS,|S|2|FE(S)connected(1)|F|q|S|1|a|S|
SvS,|S|2|FE(S)connected(1)|F|||q||S|1a|S|
SvS,|S|2τ(G[S])|q||S|1a|S|
=aTG,v(a|q|)a. (9)

The next lemma shows how to bound the tree generating function. The proof we give is slightly shorter than the proof given in [59], and is new as far as we know.

Lemma 4.6 ([58]).

Let G=(V,E) be a graph of maximum degree at most Δ1 and let vV. Fix any α>1. Then

TG,v(lnααΔ)α.
Proof.

The proof is by induction on the number of vertices of G. If |V|=1, the statement is clearly true. Next assume that |V|2. Given a tree T such that vV(T) let S be the set of neighbours of v in V(T). After removing v from T, the tree decomposes into the disjoint union of |S| trees, each containing a unique vertex from S. Therefore, writing c=lnααΔ, we have

TG,v(c)SNG(v)c|S|sSTGv,s(c),

which by induction is bounded by

SNG(V)(cα)|S|(1+(lnα)/Δ)Δelnα=α.

This finishes the proof.

We now combine all our ingredients to finish the proof of Theorem 4.2. Fix Δ2. For a>1 to be determined, define α=α(a)=21/a. Then if

|q|lnαaαΔ=ln(21/a)(2a1)Δ,

we have χG(1/q)0 for any graph of maximum degree at most Δ. Indeed, for such a value of q0 we have |aq|lnααΔ and therefore by (9) and the previous lemma

SvS,|S|2|λS|a|S|a(TG,v(a|q|)1)a(α1)=a(11/a)=a1,

and so by Theorem 4.4 and Lemma 4.3 we have χG(1/q)0. In other words if |q|Δ2a1ln(21/a) we have χG(q)0. One can determine

mina>12a1ln(21/a)<6.91,

where the minimum is attained at a1.588. This finishes the proof sketch of Theorem 4.2.

4.3.4 Recipe and relation to cluster expansion

The steps we took to prove Theroem 4.2 suggest a ‘recipe’ for proving absence of zeros using the polymer approach:

  • Express the graph polynomial as an evaluation of the multivariate independence polynomial of an associated graph.

  • Use the conditions from Theorem 4.4 (or other conditions) that guarantee the evaluation is nonzero.

  • Verify these conditions using combinatorial arguments.

Most combinatorial applications of this ‘recipe’ include various extensions and variations of the chromatic polynomial. See [85, 44, 39, 60, 59, 41, 36, 35] for some examples in this direction.

From a statistical physics perspective both Theorem 4.2 and Theorem 4.1 are statements about so-called high temperature models (in the case of Theorem 4.1, high temperature means small values of λ for the independence polynomial). Surprisingly, for some restricted families of graphs, the ‘recipe’ above can also sometimes be used at low temperature (see e.g. [28, 46] for this in statistical physics). For example, it has been used in combination with the interpolation method to design efficient approximation algorithms to approximate the independence polynomial at large λ on certain subgraphs of the integer lattice d [57], but also on bipartite expander graphs [61]. In fact, [61] slightly modified the approach from [57]. The idea is to use conditions like those in Theorem 4.4 to show absolute convergence of the cluster expansion, a formal power series of the logarithm of ZΓ((λS)), and to bound the remainder after truncating it at a suitable depth. This avoids the use of the interpolation method and may occasionally lead to faster algorithms, but other than that is quite similar in spirit. See [27, 30, 65, 55, 31, 71, 50, 47, 56, 32] for some results inspired by and based on this.

5 Concluding remarks

We have shown how absence of zeros allows one to design efficient algorithms to approximately compute evaluations of graph polynomials using Barvinok’s interpolation method. A key part of this method is establishing absence of zeros for the graph polynomials in question. A few natural questions that remain are: how do other approaches for approximate counting relate to absence of zeros, and what does presence of zeros mean for the possibility to design efficient approximation algorithms. In this section we will briefly address these two questions pointing the interested reader to the relevant literature.

5.1 Absence of zeros and other algorithm approaches

As mentioned in the introduction there are two other (and older) approaches for designing approximation algorithms to compute evaluations of graph polynomials: a Markov chain based sampling approach and the method of correlation decay. We will not discuss the workings of these approaches here, but we mention how these approaches relate to the interpolation method, or rather, how they relate to absence of zeros.

Recently it was shown that a standard technique for proving decay of correlations can be transformed to prove absence of zeros near the real axis [82, 66]. In the other direction, some results appeared indicating that absence of zeros can be used to establish some form of decay of correlations [52, 77]. Perhaps more surprisingly, in [1, 34] it was shown that if a multivariate version of the polynomial has no zeros near the positive real axis, then the associated Glauber dynamics (a local Markov chain often used in approximate counting and sampling) mixes rapidly. These results indicate that, while absence of complex zeros is vital for the interpolation method, it also plays a key role (albeit in disguise) in these two other approaches for approximate counting.

5.2 Presence of zeros

In this section we discuss how presence of zeros is related to hardness of approximation. We will again specialise the discussion to the independence polynomial and give some references to results on other polynomials at the end of this section. In what follows we shall see that presence of zeros implies hardness of approximating the independence polynomial.

Let us first state the precise algorithmic problem in question. Let λ[i] (the set of complex numbers whose real and imaginary parts are both rational) and let Δ. Consider the following computational problem.

  • Name

    #Hard-CoreNorm(λ,Δ)

  • Input

    A graph G of maximum degree at most Δ.

  • Output

    If ZG(λ)0 the algorithm must output a rational number N such that N/1.001|ZG(λ)|1.001N. If ZG(λ)=0 the algorithm may output any rational number.

It is easy to show that replacing the constant 1.001 by any other constant C>1 does not change the complexity of the problem.181818 An algorithm that solves the problem above in polynomial time can also be used to solve the problem with 1.001 replaced by (1.001)2 by running the original algorithm on the disjoint union of two copies of the graph.

The typical notion of hardness one considers for computational counting problems is #P-completeness/hardness. We do not introduce the notion formally, but wish to impress only that one does not expect a polynomial-time algorithm for a #P-complete counting problem (just as one does not expect a polynomial-time algorithm for an NP-complete problem). For example, the problem of exactly counting the number of independent sets of a graph of maximum degree Δ is known to be #P-complete [78, 86, 42] for any Δ3.

Returning to the problem #Hard-CoreNorm(λ,Δ), we define the sets

𝒫Δ ={λ[i] #Hard-CoreNorm(λ,Δ) is #P-hard},
𝒵Δ ={λZG(λ)=0 for some graph G𝒢Δ}.

Building on [23], it was shown in [37] that the closure of the set 𝒵Δ is contained in the closure of the set 𝒫Δ, meaning that arbitrarily close to any zero λ𝒵Δ there exists a parameter λ[i] such that #Hard-CoreNorm(λ,Δ) is #P-hard. (A similar result holds if instead of approximating the norm one wishes to approximate the argument of ZG(λ).) Recall that Theorem 4.1 gives a zero-free region for the independence polynomial that contains the point 0. If one can show that the maximal zero-free region of the independence polynomial for bounded degree graphs is connected, then this would result in an essentially complete understanding of the hardness of approximating the independence polynomial at complex parameters on bounded degree graphs [37]. We remark that quite recently it was shown that in the Δ limit this is true [18], but unfortunately this does not give us information for any finite Δ yet.

For some models/polynomials such as the matching polynomial [24] and the ferromagnetic Ising model (as a function of the external field) [29] we know that absence/presence of zeros on bounded degree graphs exactly corresponds to easiness/ hardness of approximation. For others such as the Ising model (as a function of edge interaction) [48] a partial correspondence has been established. This suggests a program of study to understand this connection for more models and also to understand the phenomenon in general.

The interpolation method is clearly a powerful technique for establishing fast approximation algorithms to evaluate graph polynomials at complex parameters, but more than that, it often seems to capture the dichotomy between parameters where approximate computation is easy and hard.

References

  • [1] Yeganeh Alimohammadi, Nima Anari, Kirankumar Shiragur, and Thuy-Duong Vuong. Fractionally log-concave and sector-stable polynomials: counting planar matchings and more. Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing, pages 433–446.
  • [2] Taro Asano. Lee-Yang theorem and the Griffiths inequality for the anisotropic Heisenberg ferromagnet. Phys. Rev. Lett., 24:1409–1411, 1970.
  • [3] Zonglei Bai, Yongzhi Cao, and Hanpin Wang. Zero-freeness and approximation of real Boolean Holant problems. Theoretical Computer Science, 917:12–30, 2022.
  • [4] Antar Bandyopadhyay and David Gamarnik. Counting without sampling: asymptotics of the log-partition function for certain statistical physics models. Random Structures & Algorithms, 33(4):452–479, 2008.
  • [5] Alexander Barvinok. Computing the partition function for cliques in a graph. Theory Comput., 11:339–355, 2015.
  • [6] Alexander Barvinok. Combinatorics and Complexity of Partition Functions, volume 30 of Algorithms and combinatorics. Springer, 2016.
  • [7] Alexander Barvinok. Computing the permanent of (some) complex matrices. Found. Comput. Math., 16(2):329–342, 2016.
  • [8] Alexander Barvinok. Approximating permanents and hafnians. Discrete Anal., Paper No. 2, 34, 2017.
  • [9] Alexander Barvinok. Computing the partition function of a polynomial on the Boolean cube. In A journey through discrete mathematics, pages 135–164, 2017.
  • [10] Alexander Barvinok. Approximating real-rooted and stable polynomials, with combinatorial applications. Online J. Anal. Comb., (14):Paper No. 8, 13, 2019.
  • [11] Alexander Barvinok. Computing permanents of complex diagonally dominant matrices and tensors. Israel J. Math., 232(2):931–945, 2019.
  • [12] Alexander Barvinok. Stability and complexity of mixed discriminants. Math. Comp., 89(322):717–735, 2020.
  • [13] Alexander Barvinok and Nicholas Barvinok. More on zeros and approximation of the Ising partition function. Forum Math. Sigma, 9: Paper No. e46, 18, 2021.
  • [14] Alexander Barvinok and Anthony Della Pella. Testing for dense subsets in a graph via the partition function. SIAM J. Discrete Math., 34(1):308–327, 2020.
  • [15] Alexander Barvinok and Guus Regts. Weighted counting of solutions to sparse systems of equations. Combin. Probab. Comput., 28(5):696–719, 2019.
  • [16] Alexander Barvinok and Pablo Soberón. Computing the partition function for graph homomorphisms with multiplicities. J. Combin. Theory Ser. A, 137:1–26, 2016.
  • [17] Alexander Barvinok and Pablo Soberón. Computing the partition function for graph homomorphisms. Combinatorica, 37(4):633–650, 2017.
  • [18] Ferenc Bencs, Pjotr Buys, and Han Peters. The limit of the zero locus of the independence polynomial for bounded degree graphs. arXiv preprint arXiv:2111.06451, 2021.
  • [19] Ferenc Bencs and Péter Csikvári. Note on the zero-free region of the hard-core model. arXiv preprint arXiv:1807.08963, 2018.
  • [20] Ferenc Bencs, Péter Csikvári, and Guus Regts. Some applications of Wagner’s weighted subgraph counting polynomial. Electron. J. Combin., 28(4):Paper No. 4.14, 21, 2021.
  • [21] Ferenc Bencs, Péter Csikvári, Piyush Srivastava, and Jan Vondrák. On complex roots of the independence polynomial. arXiv preprint arXiv:2204.04868, 2022.
  • [22] Ferenc Bencs, Ewan Davies, Viresh Patel, and Guus Regts. On zero-free regions for the anti-ferromagnetic Potts model on bounded-degree graphs. Ann. Inst. Henri Poincaré D, 8(3):459–489, 2021.
  • [23] Ivona Bezáková, Andreas Galanis, Leslie Ann Goldberg, and Daniel Štefankovič. Inapproximability of the independent set polynomial in the complex plane. SIAM J. Comput., 49(5):STOC18–395–STOC18–448, 2020.
  • [24] Ivona Bezáková, Andreas Galanis, Leslie Ann Goldberg, and Daniel Štefankovič. The complexity of approximating the matching polynomial in the complex plane. ACM Trans. Comput. Theory, 13(2):Art. 13, 37, 2021.
  • [25] Rodrigo Bissacot, Roberto Fernández, and Aldo Procacci. On the convergence of cluster expansions for polymer gases. J. Stat. Phys., 139(4):598–617, 2010.
  • [26] Magnus Bordewich, Michael Freedman, Lázalo Lovász, and Dominic Welsh. Approximate counting and quantum computation. Combin. Probab. Comput., 14(5-6):737–754, 2005.
  • [27] Christian Borgs, Jennifer Chayes, Tyler Helmuth, Will Perkins, and Prasad Tetali. Efficient sampling and counting algorithms for the Potts model on d at all temperatures. Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing, pages 738–751, 2020.
  • [28] Christian Borgs and John Imbrie. A unified approach to phase diagrams in field theory and statistical mechanics. Comm. Math. Phys., 123(2):305–328, 1989.
  • [29] Pjotr Buys, Andreas Galanis, Viresh Patel, and Guus Regts. Lee-Yang zeros and the complexity of the ferromagnetic Ising model on bounded-degree graphs. Forum Math. Sigma, 10:Paper No. e7, 43, 2022.
  • [30] Sarah Cannon and Will Perkins. Counting independent sets in unbalanced bipartite graphs. In Proceedings of the 2020 ACM-SIAM Symposium on Discrete Algorithms, pages 1456–1466, 2020.
  • [31] Charles Carlson, Ewan Davies, and Alexandra Kolla. Efficient algorithms for the Potts model on small-set expanders. arXiv preprint arXiv:2003.01154, 2020.
  • [32] Katrin Casel, Philipp Fischbeck, Tobias Friedrich, Andreas Göbel, and J. A. Gregor Lagodzinski. Zeros and approximations of Holant polynomials on the complex plane. Comput. Complexity, 31(2):Paper No. 11, 2022.
  • [33] Sitan Chen, Michelle Delcourt, Ankur Moitra, Guillem Perarnau, and Luke Postle. Improved bounds for randomly sampling colorings via linear programming. Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 2216–2234, 2019.
  • [34] Zongchen Chen, Kuikui Liu, and Eric Vigoda. Spectral independence via stability and applications to Holant-type problems. IEEE 62nd Annual Symposium on Foundations of Computer Science, pages 149–160, 2022.
  • [35] Matthew Coulson, Ewan Davies, Alexandra Kolla, Viresh Patel, and Guus Regts. Statistical physics approaches to unique games. 35th Computational Complexity Conference, volume 169 of LIPIcs. Leibniz Int. Proc. Inform., Art. No. 13, 27. 2020.
  • [36] Péter Csikvári and Péter Frenkel. Benjamini-Schramm continuity of root moments of graph polynomials. European J. Combin., 52(part B):302–320, 2016.
  • [37] David de Boer, Pjotr Buys, Lorenzo Guerini, Han Peters, and Guus Regts. Zeros, chaotic ratios and the computational complexity of approximating the independence polynomial. arXiv preprint arXiv:2104.11615, 2021.
  • [38] R. L. Dobrushin. Estimates of semi-invariants for the Ising model at low temperatures. In Topics in statistical and theoretical physics, volume 177 of Amer. Math. Soc. Transl. Ser. 2, pages 59–81, 1996.
  • [39] F. M. Dong and K. M. Koh. Bounds for the real zeros of chromatic polynomials. Combin. Probab. Comput., 17(6):749–759, 2008.
  • [40] F. M. Dong, K. M. Koh, and K. L. Teo. Chromatic polynomials and chromaticity of graphs. World Scientific Publishing Co. Pte. Ltd., Hackensack, NJ, 2005.
  • [41] Fengming Dong and Xian’an Jin. Zeros of Jones polynomials of graphs. Electron. J. Combin., 22(3):Paper 3.23, 18, 2015.
  • [42] Martin Dyer and Catherine Greenhill. On Markov chains for independent sets. J. Algorithms, 35(1):17–49, 2000.
  • [43] Joanna A. Ellis-Monaghan and Iain Moffatt. Handbook of the Tutte Polynomial and Related Topics. CRC Press, 2022.
  • [44] Roberto Fernández and Aldo Procacci. Regions without complex zeros for chromatic polynomials on graphs with bounded degree. Combin. Probab. Comput., 17(2):225–238, 2008.
  • [45] C. M. Fortuin and P. W. Kasteleyn. On the random-cluster model. I. Introduction and relation to other models. Physica, 57:536–564, 1972.
  • [46] S. Friedli and Y. Velenik. Statistical mechanics of lattice systems: a concrete mathematical introduction. Cambridge University Press, Cambridge, 2018.
  • [47] Tobias Friedrich, Andreas Göbel, Martin S Krejca, and Marcus Pappik. Polymer dynamics via cliques: New conditions for approximations. arXiv preprint arXiv:2007.08293, 2020.
  • [48] Andreas Galanis, Leslie Ann Goldberg, and Andrés Herrera-Poyatos. The complexity of approximating the complex-valued Ising model on bounded degree graphs. arXiv preprint arXiv:2105.00287, 2021.
  • [49] Andreas Galanis, Leslie Ann Goldberg, and Andrés Herrera-Poyatos. The complexity of approximating the complex-valued Potts model. Comput. Complexity, 31(1):Paper No. 2, 94, 2022.
  • [50] Andreas Galanis, Leslie Ann Goldberg, and James Stewart. Fast algorithms for general spin systems on bipartite expanders. ACM Trans. Comput. Theory, 13(4):Art. 25, 18, 2021.
  • [51] Andreas Galanis, Daniel Štefankovič, and Eric Vigoda. Inapproximability of the partition function for the antiferromagnetic Ising and hard-core models. Combin. Probab. Comput., 25(4):500–559, 2016.
  • [52] David Gamarnik. Correlation decay and the absence of zeros property of partition functions. Random Structures & Algorithms, 2022.
  • [53] David Gaunt and Michael Fisher. Hard-sphere lattice gases. i. plane-square lattice. J. Chem. Phys., 43(8):2840–2863, 1965.
  • [54] Heng Guo, Chao Liao, Pinyan Lu, and Chihao Zhang. Zeros of Holant problems: locations and algorithms. ACM Trans. Algorithms, 17(1):Art. 4, 25, 2021.
  • [55] Tyler Helmuth, Matthew Jenssen, and Will Perkins. Finite-size scaling, phase coexistence, and algorithms for the random cluster model on random graphs. arXiv preprint arXiv:2006.11580, 2020.
  • [56] Tyler Helmuth and Ryan L Mann. Efficient algorithms for approximating quantum partition functions at low temperature. arXiv preprint arXiv:2201.06533, 2022.
  • [57] Tyler Helmuth, Will Perkins, and Guus Regts. Algorithmic Pirogov-Sinai theory. Probab. Theory Related Fields, 176(3-4):851–895, 2020.
  • [58] Jeroen Huijben and Guus Regts. Private communication. 2021.
  • [59] Bill Jackson, Aldo Procacci, and Alan D. Sokal. Complex zero-free regions at large |q| for multivariate Tutte polynomials (alias Potts-model partition functions) with general complex edge weights. J. Combin. Theory Ser. B, 103(1):21–45, 2013.
  • [60] Bill Jackson and Alan D. Sokal. Zero-free regions for multivariate Tutte polynomials (alias Potts-model partition functions) of graphs and matroids. J. Combin. Theory Ser. B, 99(6):869–903, 2009.
  • [61] Matthew Jenssen, Peter Keevash, and Will Perkins. Algorithms for #BIS-hard problems on expander graphs. SIAM J. Comput., 49(4):681–710, 2020.
  • [62] Mark Jerrum. Counting, sampling and integrating: algorithms and complexity. Lectures in Mathematics ETH Zürich. Birkhäuser Verlag, Basel, 2003.
  • [63] R. Kotecký and D. Preiss. Cluster expansion for abstract polymer models. Comm. Math. Phys., 103(3):491–498, 1986.
  • [64] Liang Li and Guangzeng Xie. Complex contraction on trees without proof of correlation decay. arXiv preprint arXiv:2112.15347, 2021.
  • [65] Chao Liao, Jiabao Lin, Pinyan Lu, and Zhenyu Mao. An FPTAS for the hardcore model on random regular bipartite graphs. Theoretical Computer Science, 929:174–190, 2022.
  • [66] Jingcheng Liu, Alistair Sinclair, and Piyush Srivastava. Correlation decay and partition function zeros: Algorithms and phase transitions. SIAM Journal on Computing, 0(0):FOCS19–200–FOCS19–252, 0.
  • [67] Jingcheng Liu, Alistair Sinclair, and Piyush Srivastava. Fisher zeros and correlation decay in the Ising model. J. Math. Phys., 60(10):103304, 12, 2019.
  • [68] Jingcheng Liu, Alistair Sinclair, and Piyush Srivastava. The Ising partition function: zeros and deterministic approximation. J. Stat. Phys., 174(2):287–315, 2019.
  • [69] Jingcheng Liu, Alistair Sinclair, and Piyush Srivastava. A deterministic algorithm for counting colorings with 2Δ colors. IEEE 60th Annual Symposium on Foundations of Computer Science, pages 1380–1404, 2019.
  • [70] Ryan L. Mann and Michael J. Bremner. Approximation Algorithms for Complex-Valued Ising Models on Bounded Degree Graphs. Quantum, 3:162, July 2019.
  • [71] Ryan L. Mann and Tyler Helmuth. Efficient algorithms for approximating quantum partition functions. J. Math. Phys., 62(2):Paper No. 022201, 7, 2021.
  • [72] Viresh Patel and Guus Regts. Deterministic polynomial-time approximation algorithms for partition functions and graph polynomials. SIAM J. Comput., 46(6):1893–1919, 2017.
  • [73] Oliver Penrose. Convergence of fugacity expansions for classical systems. Statistical mechanics: foundations and applications, page 101, 1967.
  • [74] Han Peters and Guus Regts. On a conjecture of Sokal concerning roots of the independence polynomial. Michigan Math. J., 68(1):33–55, 2019.
  • [75] Han Peters and Guus Regts. Location of zeros for the partition function of the Ising model on bounded degree graphs. J. Lond. Math. Soc. (2), 101(2):765–785, 2020.
  • [76] Guus Regts. Zero-free regions of partition functions with applications to algorithms and graph limits. Combinatorica, 38(4):987–1015, 2018.
  • [77] Guus Regts. Absence of zeros implies strong spatial mixing. arXiv preprint arXiv:2111.04809, 2021.
  • [78] Dan Roth. On the hardness of approximate reasoning. Artificial Intelligence, 82(1-2):273–302, 1996.
  • [79] Gordon F. Royle and Alan D. Sokal. Linear bound in terms of maxmaxflow for the chromatic roots of series-parallel graphs. SIAM J. Discrete Math., 29(4):2117–2159, 2015.
  • [80] David Ruelle. Zeros of graph-counting polynomials. Comm. Math. Phys., 200(1):43–56, 1999.
  • [81] Alexander D. Scott and Alan D. Sokal. The repulsive lattice gas, the independent-set polynomial, and the Lovász local lemma. J. Stat. Phys., 118(5-6):1151–1261, 2005.
  • [82] Shuai Shao and Yuxin Sun. Contraction: a unified perspective of correlation decay and zero-freeness of 2-spin systems. J. Stat. Phys., 185(2):Paper No. 12, 25, 2021.
  • [83] J. B. Shearer. On a problem of Spencer. Combinatorica, 5(3):241–245, 1985.
  • [84] Allan Sly and Nike Sun. Counting in two-spin models on d-regular graphs. Ann. Probab., 42(6):2383–2416, 2014.
  • [85] Alan D. Sokal. Bounds on the complex zeros of (di)chromatic polynomials and Potts-model partition functions. Combin. Probab. Comput., 10(1):41–77, 2001.
  • [86] Salil P. Vadhan. The complexity of counting in sparse, regular, and planar graphs. SIAM J. Comput., 31(2):398–427, 2001.
  • [87] Eric Vigoda. Improved bounds for sampling colorings. J. Math. Phys., 41(3):1555–1569, 2000.
  • [88] David G. Wagner. Weighted enumeration of spanning subgraphs with degree constraints. J. Combin. Theory Ser. B, 99(2):347–357, 2009.
  • [89] Dror Weitz. Counting independent sets up to the tree threshold. Proceedings of the 38th Annual ACM Symposium on Theory of Computing, pages 140–149, 2006.
  • [90] D. J. A. Welsh. Complexity: knots, colourings and counting, volume 186 of London Mathematical Society Lecture Note Series. Cambridge University Press, Cambridge, 1993.