Counting independent sets in graphs

Wojciech Samotij School of Mathematical Sciences, Tel Aviv University, Tel Aviv 69978, Israel; and Trinity College, Cambridge CB2 1TQ, UK samotij@post.tau.ac.il
Abstract.

In this short survey article, we present an elementary, yet quite powerful, method of enumerating independent sets in graphs. This method was first employed more than three decades ago by Kleitman and Winston and has subsequently been used numerous times by many researchers in various contexts. Our presentation of the method is illustrated with several applications of it to ‘real-life’ combinatorial problems. In particular, we derive bounds on the number of independent sets in regular graphs, sum-free subsets of {1,,n}, and C4-free graphs and give a short proof of an analogue of Roth’s theorem on 3-term arithmetic progressions in sparse random sets of integers which was originally formulated and proved by Kohayakawa, Łuczak, and Rödl.

Research supported in part by a grant from the Israel Science Foundation

1. Introduction

Many well-studied problems in combinatorics concern characterising discrete structures that satisfy certain ‘local’ constraints. For example, the celebrated theorem of Szemerédi [42] gives an upper bound on the maximum size of a subset of the first n integers which does not contain an arithmetic progression of a fixed length k. To give another example, the archetypal problem studied in extremal graph theory, dating back to the work of Mantel [32] and Turán [43], is that of characterising graphs which do not contain a fixed graph H as a subgraph.

Problems of this type fall into the following general framework. We are given a finite set V and a collection of subsets of V. What can be said about sets IV that do not contain any member of ? Such a collection is often called a hypergraph with vertex set V, members of are termed edges, and any set IV that contains no edge is called an independent set. In view of this, one might say that a large part of combinatorics is concerned with studying independent sets in various hypergraphs. For instance, in the first example from the previous paragraph, V is the set {1,,n} and is the collection of all k-term arithmetic progressions contained in V; stated in this language, Szemerédi’s theorem says that for every positive constant δ, every independent set in has fewer than δn elements, provided that n is sufficiently large. In the second example, V is the edge set of a complete graph on a given set of n vertices and is the family of all (n|V(H)|) sets of |E(H)| edges that form a copy of H in the complete graph; in this notation, if H is a clique with k+1 vertices, then Turán’s theorem says that the largest independent sets in are precisely the edge sets of the complete balanced k-partite subgraphs of the complete graph with edge set V and the well-known theorem of Kolaitis, Prömel, and Rothschild [29] states that almost all independent sets of are k-partite, that is, the number i() of independent sets in that are not the edge sets of k-partite subgraphs of the complete graph with edge set V satisfies i()/i()0 as n.

For a hypergraph , let () denote the family of all independent sets in , let i()=|()|, and let α() be the largest cardinality of an element of (), usually called the independence number of . There are two natural problems that one usually poses about a specific hypergraph :

  1. (i)

    Determine α() and describe all I() with α() elements.

  2. (ii)

    Estimate i() and describe a ‘typical’ member of ().

Let us remark here that providing a precise characterisation of a typical element of () usually yields a precise estimate for i().

An apparent connection between problems (i) and (ii) may be easily observed in the following two inequalities, which are trivial consequences of the above definitions and the fact that the family () is closed under taking subsets:

2α()i()m=0α()(|V()|m). (1)

Note that, unless α() is very close to |V()|, the lower and upper bounds on i() given in (1) are quite far apart. Since for many interesting hypergraphs this naive lower bound is actually fairly close to being best possible, the efforts of many researchers have been focused on improving the upper bound.

In this short survey article, we present an elementary, yet very powerful, method for proving stronger upper bounds in the case when all edges of have size two, that is, when is a graph. This method was first described more than three decades ago by Kleitman and Winston, who used it to obtain upper bounds on the number of lattices111A lattice is a partially ordered set in which every two elements have a supremum and an infimum. [25] and graphs without cycles of length four [26]. Variations of this method were subsequently rediscovered by several researchers, most notably by Sapozhenko, in the context of enumerating independent sets in regular graphs [1, 36] and sum-free sets in abelian groups [1, 31, 37]. We shall illustrate our presentation of this method with several applications of it to ‘real-life’ combinatorial problems. We would like to stress here that none of the results or proof techniques presented here are new, but we hope that there is some value in seeing them next to one another.

2. The Kleitman–Winston algorithm

Suppose that we are given an arbitrary graph G with n vertices. Our goal is to give an upper bound on i(G), the number of independent sets in G. The idea of Kleitman and Winston was to devise an algorithm that, given a particular independent set I(G), would encode I in an invertible way. Crucially, the encoding should be performed in a way which makes it easy to estimate the total number of outputs of the algorithm. Since for every invertible encoding, the total number of outputs is precisely i(G), in this way one could derive an upper bound on this quantity.

The crucial idea of Kleitman and Winston was to consider the vertices of G ordered according to their degrees and encode each independent set I as a sequence of positions of the elements of I in that ordering. We make this precise below.

Definition.

Let G be a graph and fix an arbitrary total order on V(G). For every AV(G), the max-degree ordering of A is the ordering (v1,,v|A|) of all elements of A, where for each j{1,,|A|}, vj is the maximum-degree vertex in the subgraph of G induced by A{v1,,vj1}; ties are broken by giving preference to vertices that come earlier in the fixed total order on V(G).

The Algorithm.

Suppose that a graph G, an I(G), and an integer q|I| are given. Set A=V(G) and S=. For s=1,,q, do the following:

  1. (a)

    Let (v1,,v|A|) be the max-degree ordering of A.

  2. (b)

    Let js be the minimal index j such that vjI.

  3. (c)

    Move vjs from A to S.

  4. (d)

    Delete v1,,vjs1 from A.

  5. (e)

    Delete NG(vjs)A from A.

Output (j1,,jq) and AI.

For each output sequence (j1,,jq) and every s{1,,q}, denote by A(j1,,js) and S(j1,,js) the sets A and S at the end of the sth iteration of the algorithm (run on some input I that produces this particular sequence (j1,,jq)), respectively. Observe that these definitions do not depend on the choice of I as the sequence (j1,,jq) uniquely determines how the sets S and A evolve throughout the algorithm. More precisely, if running the algorithm on two inputs I,I(G) produces the same sequence (j1,,jq), then both these executions will also yield the same sets S and A. Indeed, all the modifications of the sets S and A in the sth iteration of the algorithm depend solely on js.

Note crucially that S(j1,,js)I and IS(j1,,js)A(j1,,js) for every s. Indeed, by the minimality of js and the assumption that I is independent, the only vertices of I that are deleted from A are moved to S. It follows that one may recover the set I from the output of the algorithm, as I=S(j1,,jq)(A(j1,,jq)I). We also note for future reference that the sequence (j1,,jq) can be recovered from the set S(j1,,jq). Indeed, if running the algorithm on some input I(G) produces a sequence (j1,,jq) and S=S(j1,,jq), then the same sequence will be produced by running the algorithm with I replaced by S. Finally, let us observe that j1++jq|V(G)||A(j1,,jq)|, as in steps (c) and (d) of the sth iteration of the main loop, we removed from A some js vertices.

Let i(G,m) be the number of independent sets in G that have precisely m elements. The above observations readily imply that for every m and q with mq,

i(G,m)(js)i(G[A(j1,,jq)],mq)(js)(|A(j1,,jq)|mq), (2)

where the above sums range over all output sequences (j1,,jq). In particular, letting n=|V(G)|,

i(G)m=0q1(nm)+(js)i(G[A(j1,,jq)])m=0q1(nm)+(js)2|A(j1,,jq)|. (3)

In view of (2) and (3), it is in our interest to make the set A(j1,,jq) as small as possible, uniformly for all values of (j1,,jq). This is why we consider the vertices of A listed according to the max-degree ordering. (An attentive reader might have already noticed that this particular ordering maximises degG(vjs,A) in each iteration of the algorithm.) Suppose that we are at the sth iteration of the main loop of the algorithm and let A=A{v1,,vjs1}, where A is as at the start of this iteration, that is, A=A(j1,,js1). By the definition of the max-degree ordering,

|NG(vjs)A|=maxvAdegG(v,A)2eG(A)|A|.

In particular, if eG(A)=β(|A|2), then the right-hand side of the above inequality is β(|A|1). Consequently, the number of vertices that are removed from A during the sth iteration of the main loop of the algorithm is at least js+β(|A|1), which is at least β|A|, as |A|1=|A|js and β1. In other words, as long as the density of the subgraph induced by the set A exceeds some β, each iteration of the main loop of the algorithm shrinks A by a factor of at most 1β.

The following two lemmas, which are both implicit in the work of Kleitman and Winston, summarise the above discussion. The first lemma gives a simple bound on the number of independent sets of a given size in a graph which satisfies a certain local density condition. The exact statement of this lemma is taken from [27]. The second lemma characterises the family of all independent sets in such a locally dense graph. The statement of this lemma is inspired by the statement of the main result of [7].

Lemma 1.

Let G be a graph on n vertices and assume that an integer q and reals R and β[0,1] satisfy

Reβqn. (4)

Suppose that the number of edges induced in G by every set UV(G) with |U|R satisfies

eG(U)β(|U|2). (5)

Then, for every integer m with mq,

i(G,m)(nq)(Rmq). (6)
Proof.

Since there are exactly (nq) sequences (j1,,jq) satisfying j1++jqn and js1 for each s, the sum in the right-hand side of (2) has at most (nq) terms. Therefore, it suffices to show that for each sequence (j1,,jq) that is outputted by the algorithm, the set A(j1,,jq) has at most R elements. If this were not the case, then there would be some sequence (j1,,jq) such that for each s{1,,q}, the set A{v1,,vjs1} in the sth iteration of the main loop of the algorithm (run on some input that results in this particular sequence) would have more than R elements and therefore induce in G a subgraph with edge density at least β. It follows from our discussion that each of the q iterations would shrink the set A by a factor of at most 1β. Since |A|=|V(G)|=n at the start of the algorithm, then, by (4),

|A(j1,,jq)|(1β)qneβqnR,

a contradiction.

Lemma 2.

Let G be a graph on n vertices and assume that an integer q and reals R and D satisfy

R+qDn. (7)

Suppose that the number of edges induced in G by every set UV(G) with |U|R satisfies

2eG(U)D|U|. (8)

Then there exists a collection 𝒮 of q-element subsets of V(G) and two mappings g:(G)𝒮 and f:𝒮𝒫(V(G)) such that |f(S)|R for each S𝒮 and g(I)If(g(I))g(I) for every I(G) with at least q elements.

Proof.

We define the mappings f and g and the family 𝒮 as follows. We simply run the algorithm with input I for each I(G) with at least q elements and let g(I) and f(g(I)) be the final sets S and A, respectively. Moreover, we let 𝒮 be the family of all such S, that is, the set of values taken by g. The discussion in the paragraph following the description of the algorithm should convince us that this is a valid definition of f, that g(I)If(g(I))g(I) for each I as above, and that 𝒮 consists solely of q-element subsets of V(G). It suffices to check that |f(g(I))|R for each such I. If this were not the case, then there would be some sequence (j1,,jq) such that for each s{1,,q}, the set A{v1,,vjs1} in the sth iteration of the main loop of the algorithm (run on an input I that generates this sequence) would have more than R elements and therefore induce in G a subgraph with average degree at least D. But then, each of the q iterations would remove from A at least D+1 vertices. Since |A|=|V(G)|=n at the start of the algorithm, then by (7),

|A(j1,,jq)|nDqR,

a contradiction.

Before we close this section, let us make several final remarks. First, the conclusion of Lemma 2 is stronger than the conclusion of Lemma 1. This is simply because the existence of f and g as in the statement of the second lemma imply the bound on i(G,m) asserted by the first lemma. Moreover, it should be clear from the proofs that the assumptions of the two lemmas are ‘interchangeable’ in the following sense. If a graph G satisfies the assumptions of Lemma 1 with some q, R, and β, then the conclusion of Lemma 2 holds for G with the same q and R; and vice-versa, if a graph G satisfies the assumptions of Lemma 2 with some q, R, and D, then the conclusion of Lemma 1 holds for G with the same q and R. (The latter statement is redundant because, as we have already noted above, the conclusion of Lemma 2 is stronger than the conclusion of Lemma 1.)

3. Applications

3.1. Independent sets in regular graphs

During a number theory conference at Banff in 1988, Granville conjectured (see [1]) that an n-vertex d-regular graph can have no more than 2(1+o(1))n2 independent sets, where o(1) is some function that tends to 0 as d. A few years later, this was shown to be true by Alon [1], who proved that in fact

i(G)2(1+O(d0.1))n2

for every n-vertex d-regular graph G. As our first application of Lemma 1, we derive a somewhat stronger estimate, which was obtained several years later by Sapozhenko [36], using arguments very similar to those presented in Section 2.

Theorem 3 ([36]).

There is an absolute constant C such that every n-vertex d-regular graph G satisfies

i(G)2(1+Clogdd)n2.

Alon [1] speculated that when n is divisible by 2d, then the disjoint union of n2d complete bipartite graphs Kd,d has the maximum number of independent sets among all d-regular graphs with n vertices. A slightly stronger statement (Theorem 4 below) was later conjectured by Kahn [23], who proved it under the additional assumption that G is bipartite, using a beautiful entropy argument. This assumption was recently shown to be unnecessary by Zhao [45], who gave a short and elegant argument showing that for every n-vertex d-regular graph G, there exists a 2n-vertex d-regular bipartite graph G such that i(G)i(G)1/2. The results of Kahn and Zhao yield the following.

Theorem 4 ([23, 45]).

For every n-vertex d-regular graph G,

i(G)i(Kd,d)n2d=(2d+11)n2d.

We now derive Theorem 3 from Lemma 1.

Proof of Theorem 3.

Let G be an n-vertex d-regular graph. We shall in fact estimate i(G,m) for each m and deduce the claimed bound on i(G) by summing over all m. Since i(G)2n and C is an arbitrary constant, we may assume that d is sufficiently large (and therefore n is sufficiently large). We consider two cases. First, if mn/10, then we simply note that

i(G,m)(nn10)(10e)n1020.48n, (9)

where we used the well-known inequality (ab)(ea/b)b valid for all a and b.

In the complementary case, m>n/10, we shall apply Lemma 1. To this end, let BV(G) and note that

d|B|=vBdegG(v)=2e(B)+e(B,Bc)2e(B)+vBcdegG(v)=2e(B)+d(n|B|). (10)

Fix an arbitrary β, let R=n2+βn22d, and observe that if |B|R, then (10) yields

e(B)d2(2|B|n)d2(2Rn)βn22β(|B|2). (11)

Assume that β>10/n and let q=1/β. By Lemma 1, since

eβqne1nR,

then for every m with mn/10q,

i(G,m)(nq)(n2+βn22dmq)(enq)q(n2+βn22dmq)(eβn)1/β(n2+βn22dmq). (12)

Summing (9) and (12) over all m yields

i(G)20.49n+2n2+βn22d+1/βlog2(eβn)

We obtain the claimed bound by letting β=dlogdn; we note that dlogd>10 as we assumed that d is large.

We ought to indicate here that one may significantly improve the upper bound given by Theorem 3 by a somewhat more careful analysis of the execution of the Kleitman–Winston algorithm than the one given in the proof of Lemma 1. The main reason why one should expect such an improvement to be possible is the crudeness of the second inequality in (11) in the case when |B|n/2 is much larger than Rn/2. The proof of Lemma 1 uses (11) to show that in each step of the algorithm, the set A loses at least β|A| elements whereas in reality A will lose many more elements as long as |A| is not very close to n/2+βn2/(2d). By considering the ‘evolution’ of |A| partitioned into ‘dyadic’ intervals (n/2+n/2i+1,n/2+n/2i], where 1ilog2dlog2log2d, one may prove that there is an absolute constant C such that every n-vertex d-regular graph G satisfies

i(G)2(1+C(logd)2d)n2.

One rigorous way of tracking this ‘evolution’ of |A| is to repeatedly invoke Lemma 2 with Ri=n/2+n/2i+1 and Di=d/2i for i=1,,log2dlog2log2d. We leave filling in the details as an exercise for the reader.

3.2. Sum-free sets

The conjecture of Granville mentioned in the previous section was motivated by a problem posed by Cameron and Erdős at the same number theory conference. A set A of elements of an abelian group is called sum-free if there are no x,y,zA satisfying x+y=z. Let [n] denote the set {1,,n}. Cameron and Erdős raised the question of determining the number SF([n]) of sum-free sets contained in the set [n]. They noted that any set containing either only odd integers or only integers greater than n/2 is sum-free, and that it is unlikely that there is another large collection of sum-free sets that are not essentially of one of the above two types. In view of this, they conjectured that SF([n])=O(2n/2). Soon afterwards, Alon [1] showed that the aforementioned conjecture of Granville implies the following weaker estimate on SF([n]), which will serve as a second example application of Lemma 1.

Theorem 5 ([1]).

The set {1,,n} has at most 2(1/2+o(1))n sum-free subsets.

The Cameron–Erdős conjecture was solved some fifteen years later by Green [18] and, independently, by Sapozhenko [38]. The solution due to Sapozhenko uses a method akin to the Kleitman–Winston algorithm presented in Section 2, while the one due to Green uses discrete Fourier analysis.222However, one might still argue that the general ‘philosophy’ behind Green’s proof is similar. We do not discuss either of their arguments here, but instead refer the interested reader to the original papers. Finally, we mention that strong estimates on the number of sum-free subsets of [n] with a given number of elements, which imply the conjecture, were recently obtained in [3]; the proof there employs the ideas presented in Section 2.

Proof of Theorem 5.

Observe first that the number of all subsets of [n] which contain fewer than n2/3 elements from {1,,n/21} is at most (n/2)n2/32n/2+1. Therefore, we may restrict our attention to sum-free sets that contain at least n2/3 elements strictly smaller than n/2. For each such set A, let SA be the set of n2/3 smallest elements of A.

Given a set S{1,,n/21}, define an auxiliary graph GS with vertex set [n] by letting

E(GS)={xy:x+sy(modn) for some sS(S)}

and note that GS is 2|S|-regular, as n(n/21)>n/21 and hence S and S contain different residues modulo n. The crucial observation is that for every sum-free A as above, the set ASA is an independent set in the graph GSA. Indeed, otherwise there would be x,yASA and an sSA(SA) with x+sy(modn); since 1|s|<x,yn, this is only possible when x+s=y. In particular, for a given S{1,,n/21}, there are at most i(GS) sum-free sets A satisfying S=SA. By Theorem 3, we conclude that

SF([n])(n/2)n2/32n/2+1+(n/2n2/3)2(1+O(n1/3logn))n22(1/2+O(n1/3logn))n.

Before closing this section, we remark that the paper of Alon [1] started a very successful line of inquiry into the closely related problem of determining the number of sum-free sets contained in an arbitrary finite abelian group; see, e.g., [2, 19, 20, 31, 37]. In many of these works, variations of the ideas presented in Section 2 play a prominent role.

3.3. Independent sets in regular graphs without small eigenvalues

Since every n-vertex bipartite graph G satisfies α(G)n/2 and hence it contains at least 2n/2 independent sets, the upper bounds for i(G) proved in Section 3.1 are essentially best possible whenever G is bipartite. It is natural to ask whether these bounds can be improved when one assumes that G is ‘far’ from being bipartite. An affirmative answer to this question was given by Alon and Rödl [5].

Recall that the adjacency matrix of an n-vertex graph G is a real-valued symmetric n×n matrix and therefore it has n real eigenvalues. Denote these eigenvalues by λ1,,λn, where λ1λn. It is well known that the quantity max{|λ2|,|λn|}, called the second eigenvalue of G, is closely tied with, among other parameters, the expansion properties of G. We shall be interested only in the smallest eigenvalue λn of G, which we denote by λ(G). It was first proved by Hoffman [21] that every d-regular n-vertex graph G satisfies α(G)λ(G)dλ(G)n. This was later significantly strengthened333In particular, Lemma 6 implies that eG(A)>0 for every A with more than λ(G)dλ(G)n vertices. by Alon and Chung [4], who established the following relation between λ(G) and the number of edges induced by large sets of vertices in G, cf. the expander mixing lemma (see, e.g., [22]).

Lemma 6 ([4]).

Let G be an n-vertex d-regular graph. For all AV(G),

2eG(A)dn|A|2+λ(G)n|A|(n|A|).

Alon and Rödl [5] were the first to prove that if λ(G) is much larger than d, then each such G has far fewer than 2n/2 independent sets. As our next application of Lemma 1, we derive a similar estimate, originally proved in [2].

Theorem 7 ([2]).

For every ε>0, there exists a constant C such that the following holds. If G is an n-vertex d-regular graph with λ(G)λ, then

i(G,m)((λd+λ+ε)nm),

provided that mCn/d.

Proof of Theorem 7.

Fix some ε>0, let G be an n-vertex d-regular graph, and let λ=λ(G). We may assume that λd+λ+ε<1 as otherwise there is nothing to prove. Let UV(G) be an arbitrary set with |U|(λd+λ+ε2)n. Lemma 6 implies that

2eG(U)dn|U|2λn|U|(n|U|)=|U|n((d+λ)|U|λn)εd2|U|εdn(|U|2).

Let β=εdn, q=log(2/ε)εnd, and R=(λd+λ+ε2)n and observe that Reβqn. If follows from Lemma 1 that for every m with mq,

i(G,m)(nq)(Rmq). (13)

Let r(t) denote the right-hand side of (13) with q replaced by t. We may clearly assume that mα(G)λd+λn, as otherwise i(G,m)=0. An elementary calculation shows that

r(t+1)r(t)=ntt+1mtRm+t+1nm(t+1)(Rm)2mε(t+1)

and hence

i(G,m)=r(q)=t=0q1r(t+1)r(t)r(0)(2m)qεqq!(Rm)(2emεq)q(RR+εn/2)m(R+εn/2m),

where we used the inequalities a!>(a/e)a and (ac)(a/b)c(bc) valid whenever abc0. Finally, if K is sufficiently large (as a function of ε) and CKlog(2/ε)ε, then for every m with mCn/dKq,

(2emεq)q/mRR+εn/2(2Keε)1/K(1ε2)1,

which completes the proof of the theorem.

We close this section with several remarks. First, the constant λd+λ in the assertion of the theorem is optimal as for many values of n, d, and α, there are n-vertex d-regular graphs with α(G)=λ(G)dλ(G)n=αn. Second, the assumption that mCn/d cannot be relaxed as for every ε>0, every n-vertex d-regular graph G satisfies i(G,m)((1ε)nm) whenever mεn/(d+1). (To see this, consider the greedy process of constructing an independent set which repeatedly picks an arbitrary vertex of G and removes it and all of its neighbours from G.) Third, the above theorem implies the conjecture of Granville stated in Section 3.1 as λ(G)d for every d-regular graph G. Finally, we refer the interested reader to [2] and [5], where Theorem 7 was used to obtain upper bounds on the number of sum-free sets in abelian groups of even order and lower bounds on some multicolor Ramsey numbers, respectively.

3.4. The number of C4-free graphs

As our next example, we present the main result from one of the papers of Kleitman and Winston [26] which introduced the methods described in Section 2. Call a graph C4-free if it does not contain a cycle of length four and let ex(n,C4) denote the maximum number of edges in a C4-free graph with n vertices. A classical result of Kővári, Sós, and Turán [30] together with a construction due to Brown [10] and Erdős, Rényi, and Sós [16] imply that

ex(n,C4)=(12+o(1))n3/2.

Let fn(C4) be the number of (labeled) C4-free graphs on the vertex set {1,,n}. Since each subgraph of a C4-free graph is itself C4-free, we have

2ex(n,C4)fn(C4)m=0ex(n,C4)((n2)m)=2Θ(ex(n,C4)logn),

which yields

ex(n,C4)log2fn(C4)O(ex(n,C4)logn). (14)

Answering a question of Erdős, Kleitman and Winston [26] showed that the lower bound in (14) is tight up to a constant factor.

Theorem 8 ([26]).

There is a positive constant C such that

log2fn(C4)Cn3/2.

Before we continue with the proof of the theorem, let us make a few comments. In fact, Erdős asked whether log2fn(H)=(1+o(1))ex(n,H) for an arbitrary H that contains a cycle. This was shown to be the case by Erdős, Frankl, and Rödl [15] under the assumption that χ(H)3. Very recently, Morris and Saxton [33] proved that log2fn(C6)1.0007ex(n,C6) for infinitely many n. But the notoriously difficult problem of determining whether or not log2fn(H)=O(ex(n,H)) for every bipartite H that is not a forest remains unsolved, apart from the following two special cases: H is a cycle length four [26], six [24], or ten [33] or H is an unbalanced complete bipartite graph [8, 9]. More exactly, it is proved in [9] and [33] that log2fn(Ks,t)=O(n21/s) whenever 2st and that log2fn(C2)=O(n1+1/) for every 2, respectively. As it is commonly believed that ex(n,Ks,t)=Ω(n21/s) whenever st and that ex(n,C2)=Ω(n1+1/), both these results are most likely best possible. Finally, we mention that the proofs of most of the results mentioned in this paragraph use either a variant of Lemma 1 or extensions of the ideas presented in Section 2 to hypergraphs, see Section 4.2.

Proof of Theorem 8.

Note that one can order the vertices of every n-vertex graph G as v1,,vn in such a way that for every i{2,,n}, letting Gi=G[{v1,,vi}],

δ(Gi1)degGi(vi)1.

Indeed, one may obtain such an ordering by iteratively letting vi be a minimum-degree vertex of G{vi+1,,vn} for i=n,,2. In particular, every labeled n-vertex graph G can be constructed in the following way. First, choose an ordering v1,,vn of the vertices and let G1 be the empty graph with vertex set {v1}. Second, for each i{2,n}, build a graph Gi by adding to the graph Gi1 a vertex labeled vi in such a way that its degree di (in Gi) satisfies diδ(Gi1)+1. Finally, we let G=Gn. Observe that G is C4-free if and only if Gi is C4-free for each i.

Now, given integers d and i with di, let gi(d) denote the maximum number of ways to attach a vertex of degree d to an i-vertex C4-free graph with minimum degree at least d1 in such a way that the resulting graph remains C4-free. This number is well defined as clearly gi(d)(id). Moreover, let gi=max{gi(d):di}. The argument given in the previous paragraph proves that

fn(C4)n!n!i=2ngi1. (15)

Indeed, there are n! ways to order [n] as v1,,vn and for each such ordering, there are at most n! choices for the sequence d2,,dn of degrees. In view of (15), the following claim easily implies the assertion of the theorem.

Claim.

There exists a constant C such that gnexp(Cn) for all n.

Without loss of generality, we may assume that n is large. Thus, if dn/logn, then

gn(d)(nd)(nnlogn)(enlogn)nlognexp(n).

Therefore, we shall from now on assume that d>n/logn. Let G be a C4-free graph on n vertices with δ(G)d1. Let H be the square of G, that is, the graph with V(H)=V(G) and

E(H)={xy:xz,yzE(G) for some zV(G)}.

Crucially, observe that adding v to G will result in a C4-free graph if and only if the neighbourhood of v is an independent set in H. Hence, i(H,d) is an upper bound on the number of C4-free extensions of G by a vertex of degree d. We shall estimate i(H,d) using Lemma 1.

To this end, we show that subgraphs of H induced by large subsets of V(H) have reasonably high density. Since G is C4-free, every edge xy of H corresponds to a unique vertex zV(G) such that xz and yz are edges of G. Therefore, for each BV(H),

eH(B)=zV(G)(degG(z,B)2)n(zdeg(z,B)/n2),

where the last inequality is Jensen’s inequality applied to the convex function x(x2). Since

zV(G)degG(z,B)=xBdegG(x)|B|δ(G)(d1)|B|,

then assuming that |B|2nd1 implies

eH(B)n(d1)|B|2n((d1)|B|n1)(d1)22n(|B|2).

Finally, let R=2nd1, β=(d1)22n, and q=3(logn)3. Since d>n/logn and n is large, then βqlogn and therefore eβqn1R. If follows from Lemma 1 that

i(H,d)(nq)(2nd1dq)e4log4n(2en(dq)2)dqsupk>0(enk)2k=e2n,

where we used the assumption that n is large and the fact that sup{(ex)x:x>0}=e.

3.5. Roth’s theorem in random sets

As our final example, we present a short proof of a well-known result of Kohayakawa, Łuczak, and Rödl [28]. Recall that [n] denotes the set {1,,n}. A famous theorem of Roth [34] asserts that for every positive δ, any set of at least δn integers from [n] contains a 3-term arithmetic progression (3-term AP), provided that n is sufficiently large (as a function of δ only). Given a positive δ, we shall say that a set A is δ-Roth if each BA satisfying |B|δ|A| contains a 3-term AP. We may now restate Roth’s theorem as follows. For every positive δ, there exists an n0 such that the set [n] is δ-Roth whenever nn0. With the aim of showing that there exist ‘smaller’ and ‘sparser’ δ-Roth sets Kohayakawa, Łuczak, and Rödl [28] proved the following result.

Theorem 9 ([28]).

For every positive δ, there exists a constant C such that if Cnmn, then the probability that a uniformly chosen random m-element subset of {1,,n} is δ-Roth tends to 1 as n.

We shall deduce Theorem 9 as an easy corollary of the following upper bound for the number of subsets of [n] of a given cardinality that do not contain a 3-term AP, originally proved in [7] and [39] in a much more general form. This upper bound will be derived from Roth’s theorem using Lemma 2 with one additional twist which was previously considered in [2].

Theorem 10.

For every positive ε, there exists a constant D such that if Dnmn,

|{A[n]:|A|=m and A contains no 3-term AP}|(εnm).
Proof of Theorem 9.

Fix a positive δ, let ε=δ/6, and let D be the constant from the statement of Theorem 10. Let C=D/δ and suppose that Cnmn. Since δmDn, Theorem 10 implies that the set 𝒜 defined by

𝒜={A[n]:|A|=δm and A contains no 3-term AP}

has at most (εnδm) elements. Now, let R be an m-element subset of [n] chosen uniformly at random. Clearly,

Pr(R is not δ-Roth)=Pr(RA for some A𝒜)A𝒜Pr(RA)A𝒜(mn)|A|=|𝒜|(mn)δm(εnδm)(mn)δm(εenδmmn)δm2δm.

Our proof of Theorem 10 will use the following simple consequence of Roth’s theorem, observed first by Varnavides [44], as a ‘black box’.

Proposition 11 ([34, 44]).

For every positive δ, there exist an integer n0 and a positive β such that if nn0, then every set of at least δn integers from {1,,n} contains at least βn2 3-term APs.

Proof of Theorem 10.

Fix a positive ε, let n0 and β be the constants from the statement of Proposition 11 invoked with δ=ε/2, and suppose that nn0. Given an arbitrary set B[n] and integers m and n, let

a(B,m) =|{IB:|I|=m and I contains no 3-term AP}|,
a(n,m) =max{a(B,m):B[n] with |B|=n}.

Our aim is to show that a([n],m)=a(n,m)(εnm), provided that mCn for some constant C which depends only on ε. This inequality will follow from the trivial observation that a(n,m)(nm) for all n and m and the following claim.

Claim.

If nεn/2 and m2n, then a(n,m)2(nn)2a(nβn/12,m2n).

Let be the 3-uniform hypergraph with vertex set [n] whose edges are all triples of numbers which form a 3-term AP. Let B be an arbitrary n-element subset of [n]. By Proposition 11, e(B)βn2. Let ZB be the set of all vertices of [B], the subhypergraph of induced by B, whose degree is at least βn. In other words, Z is the set of all numbers in B that belong to at least βn three-term APs contained in B. Since the maximum degree of is at most 2n, we have |Z|βn.

We first estimate the number of m-element subsets of B with no 3-term AP that contain fewer than n elements of Z. Since each such set A may be partitioned into A1 and A2, where |A1|=n and A2BZ, there are at most (nn)a(nβn,mn) such sets. We may therefore focus on counting subsets of B that contain at least n elements of Z. We shall obtain a suitable upper bound for their number using Lemma 2.

Let W be an arbitrary subset of Z and consider the auxiliary graph GW with vertex set B whose edges are all pairs {x,y} such that {x,y,z} for some zW. Since for a given pair {x,y}[n], there are at most three different z such that {x,y,z}, it follows that e(GW)|W|βn/3 and the maximum degree of GW is no more than 3|W|. It follows that for an arbitrary subset U of B with at least nβn/12 elements,

eGW(U)e(GW)|BU|Δ(GW)βn|W|3βn123|W|=βn|W|12. (16)

Observe crucially that if some set IW contains no 3-term APs, then I is an independent set in the graph GW.

Let w=n and fix some WZ with |W|=w. We shall prove an upper bound on the number of ways one can extend W to an m-element subset of B that contains no 3-term APs. By our above discussion, if IW is such a set, then I is an independent set of GW with mw elements. Let 𝒮 be the family of sets and let f and g be the maps whose existence is postulated by Lemma 2 with G=GW, q=n, R=nβn/12, and D=βw/6. Note that the assumptions of the lemma are satisfied by our discussion above, see (16). Since clearly for each extension I of W to an m-element subset of B with no 3-term APs, If(g(I)) contains no 3-term APs, the number EW of extensions of W satisfies

EWS𝒮a(f(S),mwq)(nq)a(R,mwq).

We conclude that

a(B,m)(nn)a(nβn,mn)+WZ:|W|=wEW(nn)2a(nβn,m2n)+(nw)(nq)a(nβn/12,m2n)2(nn)2a(nβn/12,m2n),

which, since B was an arbitrary n-element subset of [n], proves the claim.

Let K=(126ε)/β and suppose that mn. We recursively invoke the claim K times to obtain

a(n,m)2K(nn)2K(εn/2m2Kn)2K(2Kn2Kn)(εn/2m2Kn). (17)

As in the proof of Theorem 7, denote by r(t) the right-hand side of (17) with 2Kn replaced by t. We may clearly assume that m<εn/4 as otherwise a(n,m)=0 by Roth’s theorem (we may assume that n is sufficiently large). An elementary calculation shows that

r(t+1)r(t)=2Kntt+1mtεn/2m+t+12Knm(t+1)(εn/2m)8Kmε(t+1)

and hence, letting T=2Kn,

a(n,m)r(T)2K(8Km)TεTT!(εn/2m)2K(8eKmεT)T(12)m(εnm).

Finally, if D is sufficiently large as a function of K and ε, then for every m with mDnD/(2K)T, we have

2K/m(8eKmεT)T/m2,

which completes the proof of the theorem.

4. Concluding remarks and further reading

4.1. Other applications of the Kleitman–Winston method

There have been quite a few successful applications of the Kleitman–Winston method other than the ones presented in Section 3. In particular, variants of Lemma 1 were used in the following works: Kleitman and Wilson [24] proved that the number of n-vertex graphs with girth larger than 2 is 2O(n1+1/); Dellamonica, Kohayakawa, Lee, Rödl, and the author [13, 14, 27] proved sharp bounds on the number of subsets of [n] with a given cardinality which contain no non-trivial solutions to the equation a1++ah=b1++bh for every h2; Balogh, Das, Delcourt, Liu, and Sharifzadeh [6] and Gauy, Hàn, and Oliveira [17] proved sharp bounds for the number of intersecting families of k-element subsets of [n] with a given cardinality and for the typical size of the largest intersecting subfamily contained in a random collection of k-element subsets of [n].

4.2. Extensions of the Kleitman–Winston method to hypergraphs

It seems natural to seek a generalisation of the Kleitman–Winston method that would yield non-trivial upper bounds for the number of independent sets in a hypergraphs of higher uniformity. Perhaps somewhat surprisingly, such generalisations were considered only fairly recently. To the best of our knowledge this was first done in [8, 9], where sharp upper bounds for the number of n-vertex graphs which do not contain a copy of a fixed complete bipartite subgraph were proved using a generalisation of the argument presented in Section 3.4. Around the same time, similar ideas were developed by Saxton and Thomason, who used them to establish lower bounds for the list chromatic number of regular uniform hypergraphs [40]. Inspired by the groundbreaking work of Conlon and Gowers [12] and Schacht [41], these efforts culminated in far-reaching generalisations of the Kleitman–Winston method to arbitrary uniform hypergraphs, obtained independently by Saxton and Thomason [39], and by Balogh, Morris, and the author [7]. For further details, we refer the interested reader to [7, 11, 12, 35, 39, 41].


Acknowledgments. I would like to thank Noga Alon, Józsi Balogh, Domingos Dellamonica, Yoshi Kohayakawa, Sang June Lee, Rob Morris, and Vojta Rödl for many interesting discussions on the topics of independent sets in graphs and the Kleitman–Winston method and its applications over the past several years. These discussions have greatly influenced the content of this paper. I would also like to thank David Conlon, Asaf Ferber, and Rob Morris for their careful reading of an earlier version of this manuscript and many valuable comments which helped me improve the exposition and saved me from making several embarrassing mistakes. Finally, special thanks to Jarik Nešetřil for his encouragement to write this survey.

References

  • [1] N. Alon, Independent sets in regular graphs and sum-free subsets of finite groups, Israel J. Math. 73 (1991), 247–256.
  • [2] N. Alon, J. Balogh, R. Morris, and W. Samotij, Counting sum-free sets in abelian groups, Israel J. Math. 199 (2014), 309–344.
  • [3] by same author, A refinement of the Cameron-Erdős conjecture, Proc. Lond. Math. Soc. (3) 108 (2014), 44–72.
  • [4] N. Alon and F. R. K. Chung, Explicit construction of linear sized tolerant networks, Proceedings of the First Japan Conference on Graph Theory and Applications (Hakone, 1986), vol. 72, 1988, pp. 15–19.
  • [5] N. Alon and V. Rödl, Sharp bounds for some multicolor Ramsey numbers, Combinatorica 25 (2005), 125–141.
  • [6] J. Balogh, S. Das, M. Delcourt, H. Liu, and M. Sharifzadeh, The typical structure of intersecting families of discrete structures, arXiv:1408.2559 [math.CO].
  • [7] J. Balogh, R. Morris, and W. Samotij, Independent sets in hypergraphs, to appear in J. Amer. Math. Soc.
  • [8] J. Balogh and W. Samotij, The number of Km,m-free graphs, Combinatorica 31 (2011), 131–150.
  • [9] by same author, The number of Ks,t-free graphs, J. Lond. Math. Soc. (2) 83 (2011), 368–388.
  • [10] W. G. Brown, On graphs that do not contain a Thomsen graph, Canad. Math. Bull. 9 (1966), 281–285.
  • [11] D. Conlon, Combinatorial theorems relative to a random set, arXiv:1404.3324 [math.CO].
  • [12] D. Conlon and W. T. Gowers, Combinatorial theorems in sparse random sets, arXiv:1011.4310 [math.CO].
  • [13] D. Dellamonica Jr., Y. Kohayakawa, S. Lee, V. Rödl, and W. Samotij, The number of B3-sets of a given cardinality, submitted.
  • [14] by same author, On the number of Bh-sets, to appear in Combin. Probab. Comput.
  • [15] P. Erdős, P. Frankl, and V. Rödl, The asymptotic number of graphs not containing a fixed subgraph and a problem for hypergraphs having no exponent, Graphs Combin. 2 (1986), 113–121.
  • [16] P. Erdős, A. Rényi, and V. T. Sós, On a problem of graph theory, Studia Sci. Math. Hungar. 1 (1966), 215–235.
  • [17] M. M. Gauy, H. Hàn, and I. C. Oliveira, Erdős–Ko–Rado for random hypergraphs: asymptotics and stability, arXiv:1409.3634 [math.CO].
  • [18] B. Green, The Cameron-Erdős conjecture, Bull. London Math. Soc. 36 (2004), 769–778.
  • [19] B. Green and I. Z. Ruzsa, Counting sumsets and sum-free sets modulo a prime, Studia Sci. Math. Hungar. 41 (2004), 285–293.
  • [20] by same author, Sum-free sets in abelian groups, Israel J. Math. 147 (2005), 157–188.
  • [21] A. J. Hoffman, On eigenvalues and colorings of graphs, Graph Theory and its Applications (Proc. Advanced Sem., Math. Research Center, Univ. of Wisconsin, Madison, Wis., 1969), Academic Press, New York, 1970, pp. 79–91.
  • [22] S. Hoory, N. Linial, and A. Wigderson, Expander graphs and their applications, Bull. Amer. Math. Soc. (N.S.) 43 (2006), 439–561 (electronic).
  • [23] J. Kahn, An entropy approach to the hard-core model on bipartite graphs, Combin. Probab. Comput. 10 (2001), 219–237.
  • [24] D. J. Kleitman and D. B. Wilson, On the number of graphs which lack small cycles, manuscript, 1996.
  • [25] D. J. Kleitman and K. J. Winston, The asymptotic number of lattices, Ann. Discrete Math. 6 (1980), 243–249, Combinatorial mathematics, optimal designs and their applications (Proc. Sympos. Combin. Math. and Optimal Design, Colorado State Univ., Fort Collins, Colo., 1978).
  • [26] by same author, On the number of graphs without 4-cycles, Discrete Math. 41 (1982), 167–172.
  • [27] Y. Kohayakawa, S. Lee, V. Rödl, and W. Samotij, The number of Sidon sets and the maximum size of Sidon sets contained in a sparse random set of integers, to appear in Random Structures Algorithms.
  • [28] Y. Kohayakawa, T. Łuczak, and V. Rödl, Arithmetic progressions of length three in subsets of a random set, Acta Arith. 75 (1996), 133–163.
  • [29] Ph. G. Kolaitis, H. J. Prömel, and B. L. Rothschild, Kl+1-free graphs: asymptotic structure and a 0-1 law, Trans. Amer. Math. Soc. 303 (1987), 637–671.
  • [30] T. Kővari, V. T. Sós, and P. Turán, On a problem of K. Zarankiewicz, Colloquium Math. 3 (1954), 50–57.
  • [31] V. F. Lev, T. Łuczak, and T. Schoen, Sum-free sets in abelian groups, Israel J. Math. 125 (2001), 347–367.
  • [32] W. Mantel, Problem 28, Wiskundige Opgaven 10 (1907), 60–61.
  • [33] R. Morris and D. Saxton, The number of C2-free graphs, arXiv:1309.2927 [math.CO].
  • [34] K. F. Roth, On certain sets of integers, J. London Math. Soc. 28 (1953), 104–109.
  • [35] W. Samotij, Stability results for random discrete structures, Random Structures Algorithms 44 (2014), 269–289.
  • [36] A. A. Sapozhenko, On the number of independent sets in extenders, Diskret. Mat. 13 (2001), 56–62.
  • [37] by same author, Asymptotics of the number of sum-free sets in abelian groups of even order, Dokl. Akad. Nauk 383 (2002), 454–457.
  • [38] by same author, The Cameron-Erdős conjecture, Dokl. Akad. Nauk 393 (2003), 749–752.
  • [39] D. Saxton and A. Thomason, Hypergraph containers, arXiv:1204.6595 [math.CO].
  • [40] by same author, List colourings of regular hypergraphs, Combin. Probab. Comput. 21 (2012), 315–322.
  • [41] M. Schacht, Extremal results for random discrete structures, submitted.
  • [42] E. Szemerédi, On sets of integers containing no k elements in arithmetic progression, Acta Arith. 27 (1975), 199–245.
  • [43] P. Turán, Eine Extremalaufgabe aus der Graphentheorie, Mat. Fiz. Lapok 48 (1941), 436–452.
  • [44] P. Varnavides, On certain sets of positive density, J. London Math. Soc. 34 (1959), 358–360.
  • [45] Y. Zhao, The number of independent sets in a regular graph, Combin. Probab. Comput. 19 (2010), 315–320.