Graph removal lemma explained

In graph theory, the graph removal lemma states that when a graph contains few copies of a given subgraph, all of the copies can be eliminated by removing a small number of edges. The special case in which the subgraph is a triangle is known as the triangle removal lemma.

The graph removal lemma can be used to prove Roth's theorem on 3-term arithmetic progressions, and a generalization of it, the hypergraph removal lemma, can be used to prove Szemerédi's theorem. It also has applications to property testing.

Formulation

Let $H$ be a graph with $h$ vertices. The graph removal lemma states that for any $\epsilon>0$, there exists a constant $\delta=\delta(\epsilon,H)>0$ such that for any $n$-vertex graph $G$ with fewer than $\delta n^h$ subgraphs isomorphic to $H$, it is possible to eliminate all copies of $H$ by removing at most $\epsilon n^2$ edges from $G$.

An alternative way to state this is to say that for any $n$-vertex graph $G$ with $o(n^h)$ subgraphs isomorphic to $H$, it is possible to eliminate all copies of $H$ by removing $o(n^2)$ edges from $G$. Here, the $o$ indicates the use of little-o notation.

In the case when $H$ is a triangle, the resulting lemma is called the triangle removal lemma.
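
To make the quantities in the statement concrete, the following minimal Python sketch counts the triangles in a small graph and then deletes edges greedily until no triangle remains. The function names `count_triangles` and `greedy_triangle_removal` are illustrative and do not come from the source. The greedy strategy is only a simple heuristic: it gives an upper bound on the number of edges that need to be removed, not the minimum that the lemma refers to.

```python
from itertools import combinations


def _is_triangle(a, b, c, edge_set):
    """True if all three edges of the triple are present."""
    return {frozenset((a, b)), frozenset((a, c)), frozenset((b, c))} <= edge_set


def count_triangles(n, edges):
    """Count unordered triangles in a graph on vertices 0..n-1 (brute force)."""
    edge_set = {frozenset(e) for e in edges}
    return sum(
        1 for a, b, c in combinations(range(n), 3) if _is_triangle(a, b, c, edge_set)
    )


def greedy_triangle_removal(n, edges):
    """Delete one edge of some remaining triangle until none is left.

    Returns the list of removed edges. This is a heuristic upper bound on the
    number of edges that must be removed, not the optimum.
    """
    edge_set = {frozenset(e) for e in edges}
    removed = []
    while True:
        triangle = next(
            (t for t in combinations(range(n), 3) if _is_triangle(*t, edge_set)),
            None,
        )
        if triangle is None:
            return removed
        a, b, _ = triangle
        edge_set.remove(frozenset((a, b)))
        removed.append((a, b))


# Example input: the complete graph K4 (6 edges).
k4_edges = list(combinations(range(4), 2))
```

On `k4_edges` the greedy procedure removes three edges, although removing a perfect matching (two edges) already suffices, which illustrates why it is only an upper bound.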

History

The original motivation for the study of the triangle removal lemma was the Ruzsa–Szemerédi problem. The initial formulation, due to Imre Z. Ruzsa and Szemerédi in 1978, was slightly weaker than the triangle removal lemma used nowadays and can be roughly stated as follows: every locally linear graph on $n$ vertices contains $o(n^2)$ edges. This statement can be quickly deduced from the modern triangle removal lemma. Ruzsa and Szemerédi also provided an alternative proof of Roth's theorem on arithmetic progressions as a simple corollary.

In 1986, during their work on generalizations of the Ruzsa–Szemerédi problem to arbitrary $r$-uniform hypergraphs, Erdős, Frankl, and Rödl provided a statement for general graphs very close to the modern graph removal lemma: if a graph $H_2$ is a homomorphic image of $H_1$, then any $H_1$-free graph $G$ on $n$ vertices can be made $H_2$-free by removing $o(n^2)$ edges.

The modern formulation of the graph removal lemma was first stated by Füredi in 1994. The proof generalized the earlier approaches of Ruzsa and Szemerédi and of Erdős, Frankl, and Rödl, and also used the Szemerédi regularity lemma.

Graph counting lemma

A key component of the proof of the graph removal lemma is the graph counting lemma, which counts subgraphs in systems of regular pairs. The graph counting lemma is also very useful on its own. According to Füredi, it is used "in most applications of regularity lemma".

Heuristic argument

Let $H$ be a graph on $h$ vertices, whose vertex set is $V=\{1,2,\ldots,h\}$ and edge set is $E$. Let $X_1,X_2,\ldots,X_h$ be sets of vertices of some graph $G$ such that, for all $ij\in E$, the pair $(X_i,X_j)$ is $\epsilon$-regular (in the sense of the regularity lemma). Let also $d_{ij}$ be the density between the sets $X_i$ and $X_j$. Intuitively, a regular pair $(X,Y)$ with density $d$ should behave like a random Erdős–Rényi-like graph, where every pair of vertices $(x,y)\in X\times Y$ is selected to be an edge independently with probability $d$. This suggests that the number of copies of $H$ on vertices $x_1,x_2,\ldots,x_h$ such that $x_i\in X_i$ should be close to the expected number from the Erdős–Rényi model:

$$\prod_{ij\in E(H)}d_{ij}\prod_{i\in V(H)}|X_i|$$

where $E(H)$ and $V(H)$ are the edge set and the vertex set of $H$.
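
As a rough illustration of this heuristic, the sketch below compares the predicted count (product of densities times product of part sizes) with a brute-force count of tuples $(x_1,\ldots,x_h)$, $x_i\in X_i$, that map every edge of $H$ to an edge of $G$. The names `heuristic_count` and `exact_count` and the dictionary-based input format are assumptions made for this example only; the brute-force count is exponential in $h$ and is meant for toy inputs.

```python
from itertools import product
from math import prod


def heuristic_count(densities, sizes):
    """Heuristic prediction: product of the d_ij times product of the |X_i|.

    densities: dict mapping each edge (i, j) of H to the density d_ij
    sizes:     dict mapping each vertex i of H to |X_i|
    """
    return prod(densities.values()) * prod(sizes.values())


def exact_count(h_edges, parts, g_edges):
    """Count tuples with x_i in X_i such that every edge ij of H has x_i x_j in G.

    h_edges: list of edges (i, j) of H
    parts:   dict mapping each vertex i of H to the list of vertices in X_i
    g_edges: list of edges of G
    """
    g = {frozenset(e) for e in g_edges}
    keys = sorted(parts)
    total = 0
    for choice in product(*(parts[k] for k in keys)):
        image = dict(zip(keys, choice))
        if all(frozenset((image[i], image[j])) in g for i, j in h_edges):
            total += 1
    return total
```

For a complete tripartite graph with parts of size 2 and $H$ a triangle, all densities are 1 and both functions return 8. The two counts need not agree in general: the heuristic is only expected to be close when the pairs are regular.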

Precise statement

The straightforward formalization of the above heuristic claim is as follows. Let $H$ be a graph on $h$ vertices, whose vertex set is $V=\{1,2,\ldots,h\}$ and edge set is $E$. Let $\delta>0$ be arbitrary. Then there exists $\epsilon>0$ such that for any $X_1,X_2,\ldots,X_h$ as above, satisfying $d_{ij}>\delta$ for all $ij\in E$, the number of graph homomorphisms from $H$ to $G$ such that vertex $i\in V(H)$ is mapped to $X_i$ is not smaller than

$$(1-\delta)\prod_{ij\in E(H)}(d_{ij}-\delta)\prod_{i\in V(H)}|X_i|.$$
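
The lower bound in this statement can be evaluated directly. The helper below is a minimal sketch that only computes the expression $(1-\delta)\prod(d_{ij}-\delta)\prod|X_i|$; the name `counting_lemma_lower_bound` and the dictionary inputs are assumptions for illustration. It does not check regularity, so the returned value is a valid bound only when the pairs are $\epsilon$-regular for the $\epsilon$ supplied by the lemma.

```python
from fractions import Fraction
from math import prod


def counting_lemma_lower_bound(densities, sizes, delta):
    """Evaluate (1 - delta) * prod(d_ij - delta) * prod(|X_i|).

    densities: dict mapping each edge (i, j) of H to d_ij
    sizes:     dict mapping each vertex i of H to |X_i|
    delta:     the parameter delta; every d_ij must exceed it
    """
    if any(d <= delta for d in densities.values()):
        raise ValueError("the statement requires d_ij > delta for every edge ij")
    return (
        (1 - delta)
        * prod(d - delta for d in densities.values())
        * prod(sizes.values())
    )


# Triangle with all densities 1/2, parts of size 10, delta = 1/10.
example_bound = counting_lemma_lower_bound(
    {(1, 2): Fraction(1, 2), (1, 3): Fraction(1, 2), (2, 3): Fraction(1, 2)},
    {1: 10, 2: 10, 3: 10},
    Fraction(1, 10),
)
```

In this example the heuristic count is $(1/2)^3\cdot 1000=125$, while the guaranteed lower bound is $288/5=57.6$; the gap is the price of the error terms $\delta$.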

Blow-up Lemma

One can even find bounded-degree subgraphs of blow-ups of $H$ in a similar setting. The following claim appears in the literature under the name of the blow-up lemma and was first proven by Komlós, Sárközy and Szemerédi. The precise statement given here is a slightly simplified version due to Komlós, who also referred to it as the key lemma, as it is used in numerous regularity-based proofs.

Let $H$ be an arbitrary graph and $t\in\mathbb{Z}_+$. Construct $H(t)$ by replacing each vertex $i$ of $H$ by an independent set $V_i$ of size $t$ and replacing every edge $ij$ of $H$ by the complete bipartite graph on $(V_i,V_j)$. Let $\epsilon,\delta>0$ be arbitrary reals, let $N$ be a positive integer, and let $H_2$ be a subgraph of $H(t)$ with $h$ vertices and with maximum degree $\Delta$. Define

$$\epsilon_0=\frac{\delta^{\Delta}}{2+\Delta}.$$

Finally, let $G$ be a graph and let $X_i$, one for each vertex $i$ of $H$, be disjoint sets of vertices of $G$, each of size $N$, such that whenever $ij\in E(H)$ the pair $(X_i,X_j)$ is an $\epsilon$-regular pair with density at least $\epsilon+\delta$. Then if $\epsilon\leq\epsilon_0$ and $t-1\leq\epsilon_0 N$, the number of injective graph homomorphisms from $H_2$ to $G$ is at least

$$(\epsilon_0 N)^h.$$

In fact, one can restrict to counting only those homomorphisms in which every vertex $k\in[h]$ of $H_2$ with $k\in V_i$ is mapped to a vertex in $X_i$.
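
The blow-up construction itself is easy to write down. The sketch below builds $H(t)$ from a vertex list and an edge list, labelling the copies of vertex $i$ as pairs `(i, a)`; this labelling and the names `blow_up` and `epsilon_zero` are assumptions made for the example. The second helper reads the threshold in the statement as $\epsilon_0=\delta^{\Delta}/(2+\Delta)$.

```python
from itertools import product


def blow_up(h_vertices, h_edges, t):
    """Return (vertices, edges) of the blow-up H(t).

    Each vertex i becomes the independent set V_i = {(i, 0), ..., (i, t - 1)};
    each edge ij becomes the complete bipartite graph between V_i and V_j.
    """
    vertices = [(i, a) for i in h_vertices for a in range(t)]
    edges = [
        ((i, a), (j, b))
        for i, j in h_edges
        for a, b in product(range(t), repeat=2)
    ]
    return vertices, edges


def epsilon_zero(delta, max_degree):
    """Threshold eps_0 = delta**Delta / (2 + Delta), with Delta = max_degree."""
    return delta ** max_degree / (2 + max_degree)


# Blow-up of a triangle with t = 2: 6 vertices and 3 * 2 * 2 = 12 edges.
tri_vertices, tri_edges = blow_up([1, 2, 3], [(1, 2), (1, 3), (2, 3)], 2)
```

The resulting graph is the complete tripartite graph with parts of size 2, and no edge joins two copies of the same vertex.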

Proof

We will provide a proof of the counting lemma in the case when $H$ is a triangle (the triangle counting lemma). The proofs of the general case and of the blow-up lemma are very similar and do not require different techniques.

Take $\epsilon=\delta/2$. Let $X_1'\subset X_1$ be the set of those vertices in $X_1$ which have at least $(d_{12}-\epsilon)|X_2|$ neighbors in $X_2$ and at least $(d_{13}-\epsilon)|X_3|$ neighbors in $X_3$. Note that if there were more than $\epsilon|X_1|$ vertices in $X_1$ with fewer than $(d_{12}-\epsilon)|X_2|$ neighbors in $X_2$, then these vertices together with the whole of $X_2$ would witness $\epsilon$-irregularity of the pair $(X_1,X_2)$. Repeating this argument for $X_3$ shows that we must have $|X_1'|>(1-2\epsilon)|X_1|$. Now take an arbitrary $x\in X_1'$ and define $X_2'$ and $X_3'$ as the sets of neighbors of $x$ in $X_2$ and $X_3$ respectively. By definition, and since $d_{ij}>\delta=2\epsilon$, we have $|X_2'|\geq(d_{12}-\epsilon)|X_2|\geq\epsilon|X_2|$ and $|X_3'|\geq(d_{13}-\epsilon)|X_3|\geq\epsilon|X_3|$, so by regularity of $(X_2,X_3)$ we obtain the existence of at least

$$(d_{23}-\epsilon)|X_2'||X_3'|\geq(d_{23}-\epsilon)(d_{12}-\epsilon)(d_{13}-\epsilon)|X_2||X_3|$$

triangles containing $x$. Since $x$ was chosen arbitrarily from the set $X_1'$ of size at least $(1-2\epsilon)|X_1|$, we obtain a total of at least

$$|X_1'|\,(d_{23}-\epsilon)(d_{12}-\epsilon)(d_{13}-\epsilon)|X_2||X_3|\geq(1-2\epsilon)(d_{12}-\epsilon)(d_{13}-\epsilon)(d_{23}-\epsilon)|X_1||X_2||X_3|$$

triangles, which finishes the proof, since $\epsilon=\delta/2$ gives $1-2\epsilon=1-\delta$ and $d_{ij}-\epsilon\geq d_{ij}-\delta$.

Proof

Proof of the triangle removal lemma

To prove the triangle removal lemma, consider an $\epsilon/4$-regular partition $V_1\cup\cdots\cup V_M$ of the vertex set of $G$. This exists by the Szemerédi regularity lemma. The idea is to remove all edges between irregular pairs, low-density pairs, and small parts, and prove that if at least one triangle still remains, then many triangles remain. Specifically, remove all edges between parts $V_i$ and $V_j$ if

- $(V_i,V_j)$ is not $\epsilon/4$-regular,
- the density $d(V_i,V_j)$ is less than $\epsilon/2$, or
- either $V_i$ or $V_j$ has at most $\frac{\epsilon}{4M}n$ vertices.

This procedure removes at most $\epsilon n^2$ edges. If there exists a triangle with vertices in $V_i,V_j,V_k$ after these edges are removed, then the triangle counting lemma tells us there are at least

$$\left(1-\frac{\epsilon}{2}\right)\left(\frac{\epsilon}{4}\right)^3\left(\frac{\epsilon}{4M}\right)^3\cdot n^3$$

triples in $V_i\times V_j\times V_k$ which form a triangle. Thus, we may take

$$\delta<\frac{1}{6}\left(1-\frac{\epsilon}{2}\right)\left(\frac{\epsilon}{4}\right)^3\left(\frac{\epsilon}{4M}\right)^3.$$
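
The bookkeeping in this argument can be checked with exact arithmetic. The sketch below takes the standard cleaning thresholds as given (pairs that are not $\epsilon/4$-regular, pairs of density below $\epsilon/2$, and parts with at most $\epsilon n/(4M)$ vertices) and assumes the convention that the irregular pairs of the partition cover at most $(\epsilon/4)n^2$ vertex pairs. The function names are illustrative only.

```python
from fractions import Fraction


def removed_edge_budget(eps, n):
    """Upper bounds on the edges deleted by each cleaning rule (assumed thresholds)."""
    return {
        # irregular pairs cover at most (eps/4) n^2 vertex pairs
        "irregular pairs": eps / 4 * n**2,
        # each low-density pair has density < eps/2, so at most (eps/2) n^2 in total
        "low-density pairs": eps / 2 * n**2,
        # at most M small parts, each with <= eps*n/(4M) vertices of degree <= n
        "small parts": eps / 4 * n**2,
    }


def delta_threshold(eps, M):
    """The value (1/6)(1 - eps/2)(eps/4)^3 (eps/(4M))^3 that delta must stay below."""
    return Fraction(1, 6) * (1 - eps / 2) * (eps / 4) ** 3 * (eps / (4 * M)) ** 3


budget = removed_edge_budget(Fraction(1, 2), 100)
threshold = delta_threshold(Fraction(1, 2), 2)
```

The three budgets add up to exactly $\epsilon n^2$. Even for $\epsilon=1/2$ and only $M=2$ parts the threshold is $2^{-24}$, and in the actual proof $M$ is the (very large) number of parts supplied by the regularity lemma.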

Proof of the graph removal lemma

The proof for general $H$ is analogous to the triangle case, and uses the graph counting lemma instead of the triangle counting lemma.

Induced Graph Removal Lemma

A natural generalization of the graph removal lemma is to consider induced subgraphs. In property testing it is often useful to consider how far a graph is from being induced $H$-free. A graph $G$ is considered to contain an induced subgraph $H$ if there is an injective map $f:V(H)\to V(G)$ such that $(f(u),f(v))$ is an edge of $G$ if and only if $(u,v)$ is an edge of $H$. Notice that non-edges are considered as well. $G$ is induced $H$-free if it contains no induced subgraph $H$. We say that $G$ is $\epsilon$-far from being induced $H$-free if $G$ cannot be made induced $H$-free by adding or deleting at most $\epsilon n^2$ edges.
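
The difference between containing a copy and containing an induced copy can be checked by brute force on small graphs. The following sketch tries every injective map $f:V(H)\to V(G)$ and tests the "if and only if" condition of the definition; the name `has_induced_copy` is an assumption for this example, and the search is exponential, so it is suitable for toy inputs only.

```python
from itertools import permutations


def has_induced_copy(g_vertices, g_edges, h_vertices, h_edges):
    """True if some injective f maps edges of H to edges of G and non-edges to non-edges."""
    g = {frozenset(e) for e in g_edges}
    h = {frozenset(e) for e in h_edges}
    h_vertices = list(h_vertices)
    for image in permutations(list(g_vertices), len(h_vertices)):
        f = dict(zip(h_vertices, image))
        if all(
            (frozenset((f[u], f[v])) in g) == (frozenset((u, v)) in h)
            for idx, u in enumerate(h_vertices)
            for v in h_vertices[idx + 1:]
        ):
            return True
    return False


path3 = [(0, 1), (1, 2)]              # path on three vertices
triangle = [(0, 1), (1, 2), (0, 2)]   # complete graph K3
path4 = [(0, 1), (1, 2), (2, 3)]      # path on four vertices
```

A triangle contains the three-vertex path as a subgraph but not as an induced subgraph, because the non-edge of the path is mapped to an edge; the four-vertex path does contain it as an induced subgraph.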

Formulation

A version of the graph removal lemma for induced subgraphs was proved by Alon, Fischer, Krivelevich, and Szegedy in 2000. It states that for any graph $H$ with $h$ vertices and any $\epsilon>0$, there exists a constant $\delta>0$ such that if an $n$-vertex graph $G$ has fewer than $\delta n^h$ induced subgraphs isomorphic to $H$, then it is possible to eliminate all induced copies of $H$ by adding or removing fewer than $\epsilon n^2$ edges.

The problem can be reformulated as follows: given a red-blue coloring $H'$ of the complete graph $K_h$ (analogous to the graph $H$ on the same $h$ vertices, where non-edges are blue and edges are red) and a constant $\epsilon>0$, there exists a constant $\delta>0$ such that if a red-blue coloring of $K_n$ has fewer than $\delta n^h$ subgraphs isomorphic to $H'$, then it is possible to eliminate all copies of $H'$ by changing the colors of fewer than $\epsilon n^2$ edges. Notice that our previous "cleaning" process, where we remove all edges between irregular pairs, low-density pairs, and small parts, only involves removing edges. Removing edges corresponds only to changing edge colors from red to blue. However, there are situations in the induced case where the optimal edit distance involves changing edge colors from blue to red as well. Thus, the regularity lemma is insufficient to prove the induced graph removal lemma. The proof of the induced graph removal lemma must take advantage of the strong regularity lemma.

Proof

Strong Regularity Lemma

The strong regularity lemma is a strengthened version of Szemerédi's regularity lemma. For any infinite sequence of constants $\epsilon_0\ge\epsilon_1\ge\cdots>0$, there exists an integer $M$ such that for any graph $G$, we can obtain two (equitable) partitions $\mathcal{P}$ and $\mathcal{Q}$ such that the following properties are satisfied:

- $\mathcal{Q}$ refines $\mathcal{P}$, that is, every part of $\mathcal{P}$ is the union of some collection of parts in $\mathcal{Q}$.
- $\mathcal{P}$ is $\epsilon_0$-regular and $\mathcal{Q}$ is $\epsilon_{|\mathcal{P}|}$-regular.
- $q(\mathcal{Q})<q(\mathcal{P})+\epsilon_0$.
- $|\mathcal{Q}|\le M$.

The function $q$ is the energy function defined in the Szemerédi regularity lemma. Essentially, we can find a pair of partitions $\mathcal{P},\mathcal{Q}$ where $\mathcal{Q}$ is regular compared to $\mathcal{P}$, and at the same time $\mathcal{P},\mathcal{Q}$ are close to each other. (This property is captured in the third condition.)
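
The third condition compares the energies of the two partitions. As a minimal sketch, the code below computes the energy under one common convention, $q(\mathcal{P})=\sum_{i,j}\frac{|V_i||V_j|}{n^2}d(V_i,V_j)^2$, summed over ordered pairs of parts including $i=j$, with $d(A,B)$ the fraction of ordered pairs in $A\times B$ that are edges. This convention is an assumption (normalizations differ slightly between texts), as are the names `density` and `energy`.

```python
from fractions import Fraction


def density(a, b, edge_set):
    """d(A, B): fraction of ordered pairs (x, y) in A x B that are edges."""
    hits = sum(1 for x in a for y in b if frozenset((x, y)) in edge_set)
    return Fraction(hits, len(a) * len(b))


def energy(partition, edges):
    """Energy q(P) under the convention stated above (an assumption of this sketch)."""
    edge_set = {frozenset(e) for e in edges}
    n = sum(len(part) for part in partition)
    return sum(
        Fraction(len(a) * len(b), n * n) * density(a, b, edge_set) ** 2
        for a in partition
        for b in partition
    )


# A single edge on two vertices: refining the trivial partition raises the energy.
coarse = energy([[0, 1]], [(0, 1)])
fine = energy([[0], [1]], [(0, 1)])
```

Here `coarse` is $1/4$ and `fine` is $1/2$. Refinement never decreases the energy, so requiring $q(\mathcal{Q})<q(\mathcal{P})+\epsilon_0$ says that the refinement $\mathcal{Q}$ gains little over $\mathcal{P}$.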

Corollary of the Strong Regularity Lemma

The following corollary of the strong regularity lemma is used in the proof of the induced graph removal lemma. For any infinite sequence of constants $\epsilon_0\ge\epsilon_1\ge\cdots>0$, there exists $\delta>0$ such that for every graph $G$ on $n$ vertices there exist a partition $\mathcal{P}=\{V_1,\ldots,V_k\}$ and subsets $W_i\subset V_i$ for each $i$ satisfying the following properties:

- $|W_i|>\delta n$ for each $i$.
- $(W_i,W_j)$ is $\epsilon_{|\mathcal{P}|}$-regular for each pair $i,j$.
- $|d(W_i,W_j)-d(V_i,V_j)|\le\epsilon_0$ for all but $\epsilon_0|\mathcal{P}|^2$ pairs $i,j$.

The main idea of the proof of this corollary is to start with two partitions $\mathcal{P}$ and $\mathcal{Q}$ that satisfy the strong regularity lemma with $q(\mathcal{Q})<q(\mathcal{P})+\epsilon_0^3/8$. Then for each part $V_i\in\mathcal{P}$, we choose uniformly at random some $W_i\subset V_i$ that is a part of $\mathcal{Q}$. The expected number of irregular pairs $(W_i,W_j)$ is less than 1. Thus, there exists some collection of $W_i$ such that every pair is $\epsilon_{|\mathcal{P}|}$-regular.

The important aspect of this corollary is that every pair $(W_i,W_j)$ is $\epsilon_{|\mathcal{P}|}$-regular. This allows us to consider both edges and non-edges when we perform our cleaning argument.

Sketch of Proof of the Induced Graph Removal Lemma

With these results, we are able to prove the induced graph removal lemma. Take any graph $G$ with $n$ vertices that has fewer than $\delta n^{v(H)}$ induced copies of $H$. The idea is to start with a collection of vertex sets $W_i$ which satisfy the conditions of the corollary of the strong regularity lemma. We can then perform a "cleaning" process where we remove all edges between pairs of parts $(W_i,W_j)$ with low density, and add all edges between pairs of parts $(W_i,W_j)$ with high density. We choose the density requirements such that we add or delete at most $\epsilon n^2$ edges.

If the new graph has no induced copies of $H$, then we are done. Suppose the new graph has an induced copy of $H$, and suppose the vertex $v_i\in V(H)$ is embedded in $W_{f(i)}$. If there is an edge connecting $v_i,v_j$, then the pair $(W_{f(i)},W_{f(j)})$ does not have low density, since otherwise the edges between these parts would have been removed in the cleaning process. Similarly, if there is no edge connecting $v_i,v_j$, then $(W_{f(i)},W_{f(j)})$ does not have high density, since otherwise all edges between these parts would have been added in the cleaning process.

Thus, by a counting argument similar to the proof of the triangle counting lemma (that is, by the graph counting lemma), we can show that $G$ has more than $\delta n^{v(H)}$ induced copies of $H$, contradicting the assumption.

Generalizations

The graph removal lemma was later extended to directed graphs and to hypergraphs.

Quantitative bounds

The use of the regularity lemma in the proof of the graph removal lemma forces $\delta$ to be extremely small, bounded by a tower function whose height is polynomial in $\epsilon^{-1}$; that is, $\delta=1/\operatorname{tower}(\epsilon^{-O(1)})$ (here $\operatorname{tower}(k)$ is the tower of twos of height $k$). A tower function of height $\epsilon^{-O(1)}$ is necessary in all regularity proofs, as is implied by results of Gowers on lower bounds in the regularity lemma. However, in 2011 Fox provided a new proof of the graph removal lemma which does not use the regularity lemma, improving the bound to $\delta=1/\operatorname{tower}(5h^2\log\epsilon^{-1})$ (here $h$ is the number of vertices of the removed graph $H$). His proof, however, uses regularity-related ideas such as energy increment, but with a different notion of energy, related to entropy. This proof can also be rephrased using the Frieze–Kannan weak regularity lemma, as noted by Conlon and Fox. In the special case of bipartite $H$, it was shown that $\delta=\epsilon^{O(1)}$ is sufficient.
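
To give a feeling for how quickly these bounds degrade, here is a minimal sketch of the tower function, assuming the convention $\operatorname{tower}(0)=1$ and $\operatorname{tower}(k)=2^{\operatorname{tower}(k-1)}$ (so that $\operatorname{tower}(1)=2$). The function name is illustrative.

```python
def tower(k):
    """Tower of twos of height k: tower(0) = 1 and tower(k) = 2 ** tower(k - 1)."""
    value = 1
    for _ in range(k):
        value = 2 ** value
    return value


small_towers = [tower(k) for k in range(5)]
```

Already $\operatorname{tower}(5)=2^{65536}$ is a number with 65537 binary digits, and $\operatorname{tower}(6)$ cannot be stored explicitly, so $1/\operatorname{tower}(\cdot)$ is astronomically small even for moderate heights.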

There is a large gap between the upper and lower bounds for $\delta$ in the general case. The current best result true for all graphs $H$ is due to Alon and states that, for each nonbipartite $H$, there exists a constant $c>0$ such that $\delta<(\epsilon/c)^{c\log(c/\epsilon)}$ is necessary for the graph removal lemma to hold, while for bipartite $H$ the optimal $\delta$ has polynomial dependence on $\epsilon$, which matches the lower bound. The construction for the nonbipartite case is a consequence of the Behrend construction of a large Salem–Spencer set. Indeed, as the triangle removal lemma implies Roth's theorem, the existence of a large Salem–Spencer set may be translated into an upper bound for $\delta$ in the triangle removal lemma. This method can be leveraged for arbitrary nonbipartite $H$ to give the aforementioned bound.
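
A Salem–Spencer set is a set of integers containing no three-term arithmetic progression. The sketch below is a simple quadratic-time check of that property; the function name `is_salem_spencer` is an assumption for this example, and the code is not an implementation of the Behrend construction itself.

```python
def is_salem_spencer(numbers):
    """True if no three distinct elements a < b < c satisfy a + c == 2 * b."""
    elements = set(numbers)
    ordered = sorted(elements)
    for idx, a in enumerate(ordered):
        for c in ordered[idx + 1:]:
            # b = (a + c) / 2 lies strictly between a and c, so it is distinct from both
            if (a + c) % 2 == 0 and (a + c) // 2 in elements:
                return False
    return True


# Shifts of the integers whose base-3 expansion uses only the digits 0 and 1.
candidate = [1, 2, 4, 5, 10, 11, 13, 14]
```

The set `candidate` passes the check, while `[1, 2, 3]` fails it because it is itself a three-term progression.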
