Pseudorandom graph explained

In graph theory, a graph is said to be a pseudorandom graph if it obeys certain properties that random graphs obey with high probability. There is no concrete definition of graph pseudorandomness, but there are many reasonable characterizations of pseudorandomness one can consider.

Pseudorandom properties were first formally considered by Andrew Thomason in 1987.[1][2] He defined a condition called "jumbledness": a graph G=(V,E) is said to be (p,\alpha)-jumbled for real p and \alpha with 0<p<1\leq\alpha if

\left|e(U)-p\binom{|U|}{2}\right|\leq\alpha|U|

for every subset U of the vertex set V, where e(U) is the number of edges among U (equivalently, the number of edges in the subgraph induced by the vertex set U). It can be shown that the Erdős–Rényi random graph G(n,p) is almost surely (p,O(\sqrt{np}))-jumbled. However, graphs with less uniformly distributed edges, for example a graph on 2n vertices consisting of an n-vertex complete graph and n completely independent vertices, are not (p,\alpha)-jumbled for any small \alpha, making jumbledness a reasonable quantifier for "random-like" properties of a graph's edge distribution.
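
For very small graphs the definition can be checked directly. The following is a minimal Python sketch, not taken from Thomason's paper: it enumerates every vertex subset U and returns the smallest \alpha for which the inequality holds. The function names min_jumbledness_alpha and clique_plus_isolated are hypothetical, and the running time is exponential in the number of vertices.

```python
from itertools import combinations
from math import comb


def min_jumbledness_alpha(n, edges, p):
    """Smallest alpha with |e(U) - p*C(|U|, 2)| <= alpha*|U| for every nonempty U.

    Brute force over all 2^n - 1 subsets, so this is only usable for tiny graphs.
    """
    edge_set = {frozenset(e) for e in edges}
    worst = 0.0
    for size in range(1, n + 1):
        for subset in combinations(range(n), size):
            # e(U): number of edges of the subgraph induced by U
            e_u = sum(1 for pair in combinations(subset, 2)
                      if frozenset(pair) in edge_set)
            worst = max(worst, abs(e_u - p * comb(size, 2)) / size)
    return worst


def clique_plus_isolated(m):
    """Hypothetical helper: K_m on vertices 0..m-1 plus m isolated vertices."""
    return 2 * m, list(combinations(range(m), 2))


if __name__ == "__main__":
    n, edges = clique_plus_isolated(4)
    p = len(edges) / comb(n, 2)  # overall edge density, here 6/28
    print(min_jumbledness_alpha(n, edges, p))
```

For the clique-plus-isolated-vertices example with a 4-vertex clique (8 vertices, edge density 6/28), the worst subset is the clique itself, giving 33/28, about 1.18. Taking U to be the clique shows that this value is at least (m-1)(1-p)/2 for a clique of size m, so it grows linearly in the number of vertices, in contrast to the O(\sqrt{np}) behaviour of G(n,p).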

Connection to local conditions

Thomason showed that the "jumbled" condition is implied by a simpler-to-check condition that depends only on the codegrees of pairs of vertices rather than on every subset of the vertex set of the graph. Letting \operatorname{codeg}(u,v) be the number of common neighbors of two vertices u and v, Thomason showed that, given a graph G on n vertices with minimum degree np, if \operatorname{codeg}(u,v)\leq np^2+\ell for every u and v, then G is \left(p,\sqrt{(p+\ell)n}\right)-jumbled. This gives a sufficient condition for jumbledness that can be checked in time polynomial in the number of vertices, and it can be used to show pseudorandomness of specific graphs.
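
The codegree condition can be evaluated with a pairwise scan. The sketch below is an illustration under stated assumptions: p is taken to be the minimum degree divided by n, \ell is the smallest nonnegative value satisfying the codegree inequality, and the Paley graph on 13 vertices is only an example input chosen here, not one discussed in the text. The names paley_graph and codegree_jumbledness are hypothetical.

```python
from itertools import combinations
from math import sqrt


def paley_graph(q):
    """Paley graph on Z_q for a prime q with q % 4 == 1 (illustrative input)."""
    squares = {(x * x) % q for x in range(1, q)}
    edges = [(u, v) for u, v in combinations(range(q), 2)
             if (v - u) % q in squares]
    return q, edges


def codegree_jumbledness(n, edges):
    """Return (p, ell, alpha) with alpha = sqrt((p + ell) * n)."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    p = min(len(nbrs) for nbrs in adj) / n  # minimum degree equals n*p
    max_codeg = max(len(adj[u] & adj[v])
                    for u, v in combinations(range(n), 2))
    # Assumption: use the smallest nonnegative ell with codeg <= n*p^2 + ell.
    ell = max(0.0, max_codeg - n * p * p)
    return p, ell, sqrt((p + ell) * n)


if __name__ == "__main__":
    n, edges = paley_graph(13)
    print(codegree_jumbledness(n, edges))
```

The scan takes O(n^2) set intersections. For the Paley graph on 13 vertices (6-regular, with every pair of distinct vertices having 2 or 3 common neighbors) it yields p=6/13, \ell=3/13 and \alpha=3, up to floating-point rounding.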

Chung–Graham–Wilson theorem

In the spirit of the conditions considered by Thomason and their alternately global and local nature, several weaker conditions were considered by Chung, Graham, and Wilson in 1989.[3] A graph G on n vertices with edge density p can satisfy each of the following conditions for a given \varepsilon>0:

Discrepancy: for any subsets X,Y of the vertex set V=V(G), the number of edges between X and Y is within \varepsilon n^2 of p|X||Y|.

Discrepancy on individual sets: for any subset X of V, the number of edges among X is within \varepsilon n^2 of p\binom{|X|}{2}.

Subgraph counting: for every graph H, the number of labeled copies of H among the subgraphs of G is within \varepsilon n^{v(H)} of p^{e(H)}n^{v(H)}.

4-cycle counting: the number of labeled 4-cycles among the subgraphs of G is within \varepsilon n^4 of p^4n^4.

Codegree: letting \operatorname{codeg}(u,v) be the number of common neighbors of two vertices u and v,

\sum_{u,v\in V}\left|\operatorname{codeg}(u,v)-p^2n\right|\leq\varepsilon n^3.

Eigenvalue bounding: if \lambda_1\geq\lambda_2\geq\cdots\geq\lambda_n are the eigenvalues of the adjacency matrix of G, then \lambda_1 is within \varepsilon n of pn and \max\left(\left|\lambda_2\right|,\left|\lambda_n\right|\right)\leq\varepsilon n.
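
The discrepancy and codegree quantities are straightforward to compute for a concrete graph. The following Python sketch is illustrative only: the helper names are hypothetical, the Paley graph on 13 vertices is an example chosen here, e(X,Y) is counted as ordered adjacent pairs, and the codegree sum runs over all ordered pairs including u=v. The conditions themselves are asymptotic, so the values for one small graph only show how the quantities are formed.

```python
def paley_adjacency(q):
    """Adjacency sets of the Paley graph on Z_q, q prime with q % 4 == 1."""
    squares = {(x * x) % q for x in range(1, q)}
    return [{v for v in range(q) if (v - u) % q in squares} for u in range(q)]


def edge_density(adj):
    """p = e(G) / C(n, 2), computed from the degree sum 2e(G)."""
    n = len(adj)
    return sum(len(nbrs) for nbrs in adj) / (n * (n - 1))


def edges_between(adj, xs, ys):
    """e(X, Y): ordered pairs (x, y) in X x Y that are adjacent."""
    ys = set(ys)
    return sum(len(adj[x] & ys) for x in xs)


def discrepancy(adj, xs, ys):
    """|e(X, Y) - p|X||Y||, to be compared with eps * n^2."""
    return abs(edges_between(adj, xs, ys) - edge_density(adj) * len(xs) * len(ys))


def codegree_deviation(adj):
    """Sum over ordered pairs (u, v) of |codeg(u, v) - p^2 n|, vs. eps * n^3."""
    n = len(adj)
    target = edge_density(adj) ** 2 * n
    return sum(abs(len(adj[u] & adj[v]) - target)
               for u in range(n) for v in range(n))


if __name__ == "__main__":
    adj = paley_adjacency(13)
    n = len(adj)
    print(discrepancy(adj, range(n), range(n)) / n ** 2)
    print(codegree_deviation(adj) / n ** 3)
```

Dividing by n^2 and n^3 respectively gives the smallest \varepsilon for which the chosen pair of sets and the codegree sum satisfy the corresponding inequality.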

These conditions may all be stated in terms of a sequence of graphs \{G_n\} where G_n is on n vertices with (p+o(1))\binom{n}{2} edges. For example, the subgraph counting condition becomes that the number of labeled copies of any graph H in G_n is \left(p^{e(H)}+o(1)\right)n^{v(H)} as n\to\infty, and the discrepancy condition becomes that \left|e(X,Y)-p|X||Y|\right|=o(n^2), using little-o notation.

A pivotal result about graph pseudorandomness is the Chung–Graham–Wilson theorem, which states that many of the above conditions are equivalent, up to polynomial changes in \varepsilon. A sequence of graphs which satisfies those conditions is called quasi-random. It is considered particularly surprising that the weak condition of having the "correct" 4-cycle density implies the other seemingly much stronger pseudorandomness conditions. Graphs such as the 4-cycle, the density of which in a sequence of graphs is sufficient to test the quasi-randomness of the sequence, are known as forcing graphs.

Some implications in the Chung–Graham–Wilson theorem are clear by the definitions of the conditions: the discrepancy on individual sets condition is simply the special case of the discrepancy condition for Y=X, and 4-cycle counting is a special case of subgraph counting. In addition, the graph counting lemma, a straightforward generalization of the triangle counting lemma, implies that the discrepancy condition implies subgraph counting.

The fact that 4-cycle counting implies the codegree condition can be proven by a technique similar to the second-moment method. Firstly, the sum of codegrees can be bounded from below:

\sum_{u,v\in V}\operatorname{codeg}(u,v)=\sum_{x\in V}\deg(x)^2\geq n\left(\frac{2e(G)}{n}\right)^2=\left(p^2+o(1)\right)n^3.

Given the 4-cycle counting condition, the sum of squares of codegrees is bounded from above:

\sum_{u,v\in V}\operatorname{codeg}(u,v)^2=\text{Number of labeled copies of }C_4+o(n^4)\leq\left(p^4+o(1)\right)n^4.

Therefore, the Cauchy–Schwarz inequality gives

\sum_{u,v\in V}\left|\operatorname{codeg}(u,v)-p^2n\right|\leq n\left(\sum_{u,v\in V}\left(\operatorname{codeg}(u,v)-p^2n\right)^2\right)^{1/2},

which can be expanded out using our bounds on the first and second moments of \operatorname{codeg} to give the desired bound. A proof that the codegree condition implies the discrepancy condition can be done by a similar, albeit trickier, computation involving the Cauchy–Schwarz inequality.
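
The expansion referred to above can be written out as follows. This is a sketch of the standard computation, using only the two moment bounds just stated and the fact that there are n^2 ordered pairs (u,v); the first moment enters with a negative sign, which is why a lower bound on the sum of codegrees is the one that is needed.

```latex
\begin{align*}
\sum_{u,v\in V}\left(\operatorname{codeg}(u,v)-p^2n\right)^2
  &= \sum_{u,v\in V}\operatorname{codeg}(u,v)^2
     - 2p^2n\sum_{u,v\in V}\operatorname{codeg}(u,v) + p^4n^4 \\
  &\leq \left(p^4+o(1)\right)n^4 - 2p^2n\left(p^2+o(1)\right)n^3 + p^4n^4 \\
  &= o(n^4).
\end{align*}
```

Substituting this into the Cauchy–Schwarz bound gives a codegree sum of at most n\cdot o(n^2)=o(n^3), which is the codegree condition.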

The eigenvalue condition and the 4-cycle condition can be related by noting that the number of labeled 4-cycles in G is, up to an o(n^4) term stemming from degenerate 4-cycles (closed walks of length 4 that repeat a vertex), \operatorname{tr}\left(A_G^4\right), where A_G is the adjacency matrix of G. The two conditions can then be shown to be equivalent by invocation of the Courant–Fischer theorem.
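
The relation between the trace and the 4-cycle count can be checked numerically on a small graph. The following pure-Python sketch is an illustration, with hypothetical helper names and the Paley graph on 13 vertices as an example input chosen here. It computes the trace of the fourth power of the adjacency matrix (the number of closed walks of length 4) and, separately, the number of labeled 4-cycles, obtained by choosing an ordered pair of opposite corners and then an ordered pair of distinct common neighbors.

```python
def paley_matrix(q):
    """0/1 adjacency matrix of the Paley graph on Z_q, q prime, q % 4 == 1."""
    squares = {(x * x) % q for x in range(1, q)}
    return [[1 if (v - u) % q in squares else 0 for v in range(q)]
            for u in range(q)]


def mat_mul(a, b):
    """Plain cubic-time product of two square integer matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def closed_4_walks(a):
    """tr(A^4), the number of closed walks of length 4."""
    a2 = mat_mul(a, a)
    a4 = mat_mul(a2, a2)
    return sum(a4[i][i] for i in range(len(a)))


def labeled_4_cycles(a):
    """Labeled copies of C_4; (A^2)[u][w] is codeg(u, w) for u != w."""
    a2 = mat_mul(a, a)
    n = len(a)
    return sum(a2[u][w] * (a2[u][w] - 1)
               for u in range(n) for w in range(n) if u != w)


if __name__ == "__main__":
    a = paley_matrix(13)
    print(closed_4_walks(a), labeled_4_cycles(a))
```

For this graph the two counts are 1482 and 624. The difference of 858 consists of degenerate closed walks, whose number is at most 2n^3 in any graph on n vertices and is therefore negligible against n^4 as n grows.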

Connections to graph regularity

The concept of graphs that act like random graphs connects strongly to the concept of graph regularity used in the Szemerédi regularity lemma. For \varepsilon>0, a pair of vertex sets X,Y is called \varepsilon-regular if for all subsets A\subset X,B\subset Y satisfying |A|\geq\varepsilon|X|,|B|\geq\varepsilon|Y|, it holds that

\left|d(X,Y)-d(A,B)\right|\leq\varepsilon,

where d(X,Y) denotes the edge density between X and Y: the number of edges between X and Y divided by |X||Y|. This condition implies a bipartite analogue of the discrepancy condition, and essentially states that the edges between A and B behave in a "random-like" fashion. In addition, it was shown by Miklós Simonovits and Vera T. Sós in 1991 that a graph satisfies the above weak pseudorandomness conditions used in the Chung–Graham–Wilson theorem if and only if it possesses a Szemerédi partition where nearly all densities are close to the edge density of the whole graph.[4]
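
For tiny vertex sets the \varepsilon-regularity condition can be tested by exhaustive search. The sketch below is only an illustration of the definition: the names density, large_subsets and is_eps_regular are hypothetical, the pair (X,Y) is assumed to be disjoint with edges stored as (x,y) tuples, and the search is exponential in |X|+|Y|.

```python
from itertools import combinations
from math import ceil


def density(edge_set, a, b):
    """d(A, B): number of edges between A and B divided by |A||B|."""
    hits = sum(1 for x in a for y in b if (x, y) in edge_set)
    return hits / (len(a) * len(b))


def large_subsets(s, min_size):
    """All nonempty subsets of s with at least min_size elements."""
    items = sorted(s)
    for size in range(max(1, min_size), len(items) + 1):
        yield from combinations(items, size)


def is_eps_regular(edge_set, x, y, eps):
    """Check |d(X, Y) - d(A, B)| <= eps for all sufficiently large A, B."""
    d_xy = density(edge_set, x, y)
    min_a = ceil(eps * len(x))
    min_b = ceil(eps * len(y))
    for a in large_subsets(x, min_a):
        for b in large_subsets(y, min_b):
            if abs(d_xy - density(edge_set, a, b)) > eps:
                return False
    return True


if __name__ == "__main__":
    X, Y = {0, 1, 2, 3}, {4, 5, 6, 7}
    matching = {(i, i + 4) for i in range(4)}
    print(is_eps_regular(matching, X, Y, 0.5))
    print(is_eps_regular(matching, X, Y, 0.25))
```

A perfect matching between two 4-element sets has density 1/4. It passes the test for \varepsilon=0.5, where only subsets of size at least 2 are examined, and fails for \varepsilon=0.25, where a single matched pair already has density 1.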

Sparse pseudorandomness

Chung–Graham–Wilson theorem analogues

The Chung–Graham–Wilson theorem, specifically the implication of subgraph counting from discrepancy, does not follow for sequences of graphs with edge density approaching 0, or, for example, the common case of d-regular graphs on n vertices as n\to\infty. The following sparse analogues of the discrepancy and eigenvalue bounding conditions are commonly considered:

Sparse discrepancy: for any subsets X,Y of the vertex set V=V(G), the number of edges between X and Y is within \varepsilon dn of \frac{d}{n}|X||Y|.

Sparse eigenvalue bounding: if \lambda_1\geq\lambda_2\geq\cdots\geq\lambda_n are the eigenvalues of the adjacency matrix of G, then \max\left(\left|\lambda_2\right|,\left|\lambda_n\right|\right)\leq\varepsilon d.

It is generally true that this eigenvalue condition implies the corresponding discrepancy condition, but the reverse is not true: the disjoint union of a random large d-regular graph and a (d+1)-vertex complete graph has two eigenvalues of exactly d but is likely to satisfy the discrepancy property. However, as proven by David Conlon and Yufei Zhao in 2017, slight variants of the discrepancy and eigenvalue conditions for d-regular Cayley graphs are equivalent up to linear scaling in \varepsilon.[5] One direction of this follows from the expander mixing lemma, while the other requires the assumption that the graph is a Cayley graph and uses the Grothendieck inequality.
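
The expander mixing lemma states that in a d-regular graph on n vertices with \lambda=\max(|\lambda_2|,|\lambda_n|), every pair of vertex sets X,Y satisfies |e(X,Y)-\frac{d}{n}|X||Y||\leq\lambda\sqrt{|X||Y|}, where e(X,Y) counts ordered adjacent pairs. Since \sqrt{|X||Y|}\leq n, a bound \lambda\leq\varepsilon d yields the sparse discrepancy condition. The following Python sketch checks the inequality on randomly sampled set pairs in the Petersen graph, an example chosen here because its spectrum (3, 1 and -2) is well known, so \lambda=2 is supplied as a known value rather than computed. The helper names are hypothetical.

```python
import random
from math import sqrt


def petersen_adjacency():
    """Petersen graph: outer 5-cycle 0..4, spokes i--i+5, inner pentagram 5..9."""
    adj = [set() for _ in range(10)]
    for i in range(5):
        for u, v in ((i, (i + 1) % 5), (i, i + 5), (5 + i, 5 + (i + 2) % 5)):
            adj[u].add(v)
            adj[v].add(u)
    return adj


def mixing_ratio(adj, d, xs, ys):
    """|e(X, Y) - d|X||Y|/n| / sqrt(|X||Y|); the lemma bounds this by lambda."""
    n = len(adj)
    e_xy = sum(len(adj[x] & ys) for x in xs)  # ordered adjacent pairs
    return abs(e_xy - d * len(xs) * len(ys) / n) / sqrt(len(xs) * len(ys))


def max_sampled_ratio(adj, d, samples=2000, seed=0):
    """Largest ratio over randomly sampled nonempty pairs (X, Y)."""
    rng = random.Random(seed)
    n = len(adj)
    worst = 0.0
    for _ in range(samples):
        xs = {v for v in range(n) if rng.random() < 0.5}
        ys = {v for v in range(n) if rng.random() < 0.5}
        if xs and ys:
            worst = max(worst, mixing_ratio(adj, d, xs, ys))
    return worst


if __name__ == "__main__":
    adj = petersen_adjacency()
    print(max_sampled_ratio(adj, d=3))  # stays at or below lambda = 2
```

Sampling is used only to keep the check fast; it verifies the inequality on the sampled pairs and is not a proof.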

Consequences of eigenvalue bounding

A d-regular graph G on n vertices is called an (n,d,\lambda)-graph if, letting the eigenvalues of the adjacency matrix of G be d=\lambda_1\geq\lambda_2\geq\cdots\geq\lambda_n, \max\left(\left|\lambda_2\right|,\left|\lambda_n\right|\right)\leq\lambda. The Alon–Boppana bound gives that \max\left(\left|\lambda_2\right|,\left|\lambda_n\right|\right)\geq 2\sqrt{d-1}-o(1) (where the o(1) term is as n\to\infty), and Joel Friedman proved that a random d-regular graph on n vertices is an (n,d,\lambda)-graph for \lambda=2\sqrt{d-1}+o(1).[6] In this sense, how much \lambda exceeds 2\sqrt{d-1} is a general measure of the non-randomness of a graph. There are graphs with \lambda\leq 2\sqrt{d-1}, which are termed Ramanujan graphs. They have been studied extensively and there are a number of open problems relating to their existence and commonness.

Given an (n,d,\lambda)-graph for small \lambda, many standard graph-theoretic quantities can be bounded to near what one would expect from a random graph. In particular, the size of \lambda has a direct effect on subset edge density discrepancies via the expander mixing lemma. Other examples are as follows, letting G be an (n,d,\lambda)-graph:

If d\leq\frac{n}{2}, the vertex-connectivity \kappa(G) of G satisfies \kappa(G)\geq d-\frac{36\lambda^2}{d}.[7]

If \lambda\leq d-2, G is d-edge-connected. If n is even, G contains a perfect matching.

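The maximum cut of G is at most \frac{n(d+\lambda)}{4}.

The largest independent subset of a subset U\subset V(G) in G is of size at least \frac{n}{2(d-\lambda)}\ln\left(\frac{|U|(d-\lambda)}{n(\lambda+1)}+1\right).[8]

The chromatic number of G is at most \frac{6(d-\lambda)}{\ln\left(\frac{d+1}{\lambda+1}\right)}.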
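
As a small sanity check of one such consequence, the following Python sketch compares the maximum-cut bound n(d+\lambda)/4 for (n,d,\lambda)-graphs with a brute-force maximum cut. The Petersen graph is an example chosen here; it is a (10,3,2)-graph, and \lambda=2 is supplied as a known value rather than computed. The function names are hypothetical, and the search is exponential in n.

```python
def petersen_edges():
    """Petersen graph: outer 5-cycle 0..4, spokes i--i+5, inner pentagram 5..9."""
    edges = []
    for i in range(5):
        edges.append((i, (i + 1) % 5))
        edges.append((i, i + 5))
        edges.append((5 + i, 5 + (i + 2) % 5))
    return edges


def max_cut(n, edges):
    """Largest number of edges crossing a bipartition, by exhaustive search."""
    best = 0
    # Vertex n-1 is fixed on side 0, which halves the search space.
    for mask in range(1 << (n - 1)):
        cut = sum(1 for u, v in edges
                  if ((mask >> u) & 1) != ((mask >> v) & 1))
        best = max(best, cut)
    return best


def max_cut_bound(n, d, lam):
    """Upper bound n(d + lambda)/4 on the maximum cut of an (n, d, lambda)-graph."""
    return n * (d + lam) / 4


if __name__ == "__main__":
    print(max_cut(10, petersen_edges()), max_cut_bound(10, 3, 2))
```

For the Petersen graph the exhaustive search gives a maximum cut of 12 edges out of 15, against a bound of 12.5.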

Connections to the Green–Tao theorem

Pseudorandom graphs factor prominently in the proof of the Green–Tao theorem. The theorem is proven by transferring Szemerédi's theorem, the statement that a set of positive integers with positive natural density contains arbitrarily long arithmetic progressions, to the sparse setting (as the primes have natural density 0 in the integers). The transference to sparse sets requires that the sets behave pseudorandomly, in the sense that corresponding graphs and hypergraphs have the correct subgraph densities for some fixed set of small (hyper)subgraphs.[9] It is then shown that a suitable superset of the prime numbers, called pseudoprimes, in which the primes are dense, obeys these pseudorandomness conditions, completing the proof.

Notes and References

  1. Thomason, Andrew (1987). "Pseudo-random graphs". Annals of Discrete Math. 33: 307–331.
  2. Krivelevich, Michael; Sudakov, Benny (2006). "Pseudo-random Graphs". More Sets, Graphs and Numbers. Bolyai Society Mathematical Studies. 15. pp. 199–262. https://people.math.ethz.ch/~sudakovb/pseudo-random-survey.pdf. doi:10.1007/978-3-540-32439-3_10. ISBN 978-3-540-32377-8. 1952661.
  3. Chung, F. R. K.; Graham, R. L.; Wilson, R. M. (1989). "Quasi-Random Graphs". Combinatorica. 9 (4): 345–362. doi:10.1007/BF02125347. 17166765.
  4. Simonovits, Miklós; Sós, Vera (1991). "Szemerédi's partition and quasirandomness". Random Structures and Algorithms. 2: 1–10. doi:10.1002/rsa.3240020102.
  5. Conlon, David; Zhao, Yufei (2017). "Quasirandom Cayley graphs". Discrete Analysis. 6. arXiv:1603.03025. doi:10.19086/da.1294. 56362932.
  6. Friedman, Joel (2003). "Relative expanders or weakly relatively Ramanujan graphs". Duke Math. J. 118 (1): 19–35. 1978881. doi:10.1215/S0012-7094-03-11812-8.
  7. Krivelevich, Michael; Sudakov, Benny; Vu, Van H.; Wormald, Nicholas C. (2001). "Random regular graphs of high degree". Random Structures and Algorithms. 18 (4): 346–363. doi:10.1002/rsa.1013. 16641598.
  8. Alon, Noga; Krivelevich, Michael; Sudakov, Benny (1999). "List coloring of random and pseudorandom graphs". Combinatorica. 19 (4): 453–472. doi:10.1007/s004939970001. 5724231.
  9. Conlon, David; Fox, Jacob; Zhao, Yufei (2014). "The Green–Tao theorem: an exposition". EMS Surveys in Mathematical Sciences. 1 (2): 249–282. arXiv:1403.2957. doi:10.4171/EMSS/6. 3285854. 119301206.