Giant component explained

In network theory, a giant component is a connected component of a given random graph that contains a significant fraction of the entire graph's vertices.

More precisely, in graphs drawn randomly from a probability distribution over arbitrarily large graphs, a giant component is a connected component whose fraction of the overall number of vertices is bounded away from zero. In sufficiently dense graphs distributed according to the Erdős–Rényi model, a giant component exists with high probability.

Giant component in Erdős–Rényi model

Giant components are a prominent feature of the Erdős–Rényi model (ER) of random graphs, in which each possible edge connecting pairs of a given set of vertices is present, independently of the other edges, with probability . In this model, if

p\le

1-\epsilon
n
for any constant

\epsilon>0

, then with high probability (in the limit as

n

goes to infinity) all connected components of the graph have size, and there is no giant component. However, for

p\ge

1+\epsilon
n
there is with high probability a single giant component, with all other components having size . For

p=pc=

1
n
, intermediate between these two possibilities, the number of vertices in the largest component of the graph,

Pinf

is with high probability proportional to

n2/3

.[1]

Giant component is also important in percolation theory.[2] When a fraction of nodes,

q=1-p

, is removed randomly from an ER network of degree

\langlek\rangle

, there exists a critical threshold,

pc=

1
\langlek\rangle
. Above

pc

there exists a giant component (largest cluster) of size,

Pinf

.

Pinf

fulfills,

Pinf=p(1-\exp(-\langlek\ranglePinf))

. For

p<pc

the solution of this equation is

Pinf=0

, i.e., there is no giant component.

At

pc

, the distribution of cluster sizes behaves as a power law,

n(s)

~

s-5/2

which is a feature of phase transition.

Alternatively, if one adds randomly selected edges one at a time, starting with an empty graph, then it is not until approximately

n/2

edges have been added that the graph contains a large component, and soon after that the component becomes giant. More precisely, when edges have been added, for values of close to but larger than

n/2

, the size of the giant component is approximately

4t-2n

.[1] However, according to the coupon collector's problem,

\Theta(nlogn)

edges are needed in order to have high probability that the whole random graph is connected.

Graphs with arbitrary degree distributions

A similar sharp threshold between parameters that lead to graphs with all components small and parameters that lead to a giant component also occurs in random graphs with non-uniform degree distributions.The degree distribution does not define a graph uniquely. However under assumption that in all respects other than their degree distribution,the graphs are treated as entirely random, many results on finite/infinite-component sizes are known. In this model, the existence of the giant component depends only on the first two (mixed) moments of the degree distribution. Let a randomly chosen vertex have degree

k

, then the giant component exists[3] if and only if\mathbb E [k^2] - 2 \mathbb E [k]>0.

E[k]

which is also written in the form of

{\langlek\rangle}

is the mean degree of the network. Similar expressions are also valid for directed graphs, in which case the degree distribution is two-dimensional. There are three types of connected components in directed graphs. For a randomly chosen vertex:
  1. out-component is a set of vertices that can be reached by recursively following all out-edges forward;
  2. in-component is a set of vertices that can be reached by recursively following all in-edges backward;
  3. weak component is a set of vertices that can be reached by recursively following all edges regardless of their direction.

Criteria for giant component existence in directed and undirected configuration graphs

Let a randomly chosen vertex has

kin

in-edges and

kout

out edges. By definition, the average number of in- and out-edges coincides so that

c=E[kin]=E[kout]

. If

G0(x)=style\sumk\displaystyleP(k)xk

is the generating function of the degree distribution

P(k)

for an undirected network, then

G1(x)

can be defined as

G1(x)=style\sumk\displaystyle

k
\langlek\rangle

P(k)xk-1

. For directed networks, generating function assigned to the joint probability distribution

P(kin,kout)

can be written with two valuables

x

and

y

as:

l{G}(x,y)=

\sum
kin,kout

\displaystyleP({kin,kout

})x^y^ , then one can define

g(x)=

1
c

{\partiall{G}\over\partialx}\verty=1

and

f(y)=

1
c

{\partiall{G}\over\partialy}\vertx=1

. The criteria for giant component existence in directed and undirected random graphs are given in the table below:
TypeCriteria
undirected: giant component

E[k2]-2E[k]>0

, or

G'1(1)=1

directed: giant in/out-component

E[kinkout]-E[kin]>0

,[4] or

g'1(1)=f'1(1)=1

directed: giant weak component

2E[kin]E[kinkout] -E[kin]

2]
E[k
out

-E[kin]

2]
E[k
in
2]
+E[k
out

-E[kin

2
k
out]

>0

[5]
+

Notes and References

  1. .
  2. Book: Newman, M. E. J. . Networks : an introduction . 2010 . Oxford University Press . New York . 456837194.
  3. Molloy . Michael . Reed . Bruce . A critical point for random graphs with a given degree sequence . Random Structures & Algorithms . 6 . 2–3 . 1995 . 1042-9832 . 10.1002/rsa.3240060204 . 161–180.
  4. Newman . M. E. J. . Strogatz . S. H. . Watts . D. J. . Random graphs with arbitrary degree distributions and their applications . Physical Review E . 64 . 2 . 2001-07-24 . 1063-651X . 10.1103/physreve.64.026118 . 11497662 . 026118. free. cond-mat/0007235. 2001PhRvE..64b6118N .
  5. Kryven . Ivan . Emergence of the giant weak component in directed random graphs with arbitrary degree distributions . Physical Review E . 94 . 1 . 2016-07-27 . 2470-0045 . 10.1103/physreve.94.012315 . 27575156 . 012315. 1607.03793. 2016PhRvE..94a2315K . 206251373 .