Giant component explained

In network theory, a giant component is a connected component of a given random graph that contains a significant fraction of the entire graph's vertices.

More precisely, in graphs drawn randomly from a probability distribution over arbitrarily large graphs, a giant component is a connected component whose fraction of the overall number of vertices is bounded away from zero. In sufficiently dense graphs distributed according to the Erdős–Rényi model, a giant component exists with high probability.

Giant component in Erdős–Rényi model

Giant components are a prominent feature of the Erdős–Rényi model (ER) of random graphs, in which each possible edge connecting pairs of a given set of vertices is present, independently of the other edges, with probability . In this model, if

p\le

	1-\epsilon
	n

for any constant

\epsilon>0

, then with high probability (in the limit as

goes to infinity) all connected components of the graph have size, and there is no giant component. However, for

p\ge

	1+\epsilon
	n

there is with high probability a single giant component, with all other components having size . For

p=p_c=

	1
	n

, intermediate between these two possibilities, the number of vertices in the largest component of the graph,

P_inf

is with high probability proportional to

n^2/3

.^[1]

Giant component is also important in percolation theory.^[2] When a fraction of nodes,

q=1-p

, is removed randomly from an ER network of degree

\langlek\rangle

, there exists a critical threshold,

p_c=

	1
	\langlek\rangle

. Above

p_c

there exists a giant component (largest cluster) of size,

P_inf

fulfills,

P_inf=p(1-\exp(-\langlek\rangleP_inf))

. For

p<p_c

the solution of this equation is

P_inf=0

, i.e., there is no giant component.

p_c

, the distribution of cluster sizes behaves as a power law,

n(s)

s^-5/2

which is a feature of phase transition.

Alternatively, if one adds randomly selected edges one at a time, starting with an empty graph, then it is not until approximately

n/2

edges have been added that the graph contains a large component, and soon after that the component becomes giant. More precisely, when edges have been added, for values of close to but larger than

n/2

, the size of the giant component is approximately

4t-2n

.^[1] However, according to the coupon collector's problem,

\Theta(nlogn)

edges are needed in order to have high probability that the whole random graph is connected.

Graphs with arbitrary degree distributions

A similar sharp threshold between parameters that lead to graphs with all components small and parameters that lead to a giant component also occurs in random graphs with non-uniform degree distributions.The degree distribution does not define a graph uniquely. However under assumption that in all respects other than their degree distribution,the graphs are treated as entirely random, many results on finite/infinite-component sizes are known. In this model, the existence of the giant component depends only on the first two (mixed) moments of the degree distribution. Let a randomly chosen vertex have degree

, then the giant component exists^[3] if and only if

\mathbb E [k^2] - 2 \mathbb E [k]>0.

E[k]

which is also written in the form of

{\langlek\rangle}

is the mean degree of the network. Similar expressions are also valid for directed graphs, in which case the degree distribution is two-dimensional. There are three types of connected components in directed graphs. For a randomly chosen vertex:

out-component is a set of vertices that can be reached by recursively following all out-edges forward;
in-component is a set of vertices that can be reached by recursively following all in-edges backward;
weak component is a set of vertices that can be reached by recursively following all edges regardless of their direction.

Criteria for giant component existence in directed and undirected configuration graphs

Let a randomly chosen vertex has

k_in

in-edges and

k_out

out edges. By definition, the average number of in- and out-edges coincides so that

c=E[k_in]=E[k_out]

. If

G_0(x)=style\sum_k\displaystyleP(k)x^k

is the generating function of the degree distribution

P(k)

for an undirected network, then

G_1(x)

can be defined as

G_1(x)=style\sum_k\displaystyle

	k
	\langlek\rangle

P(k)x^k-1

. For directed networks, generating function assigned to the joint probability distribution

P(k_in,k_out)

can be written with two valuables

and

as:

l{G}(x,y)=

\sum
	k_in,k_out

\displaystyleP({k_in,k_out

})x^y^ , then one can define

g(x)=

	1
	c

{\partiall{G}\over\partialx}\vert_y=1

and

f(y)=

	1
	c

{\partiall{G}\over\partialy}\vert_x=1

. The criteria for giant component existence in directed and undirected random graphs are given in the table below:

Type

Criteria

undirected: giant component

E[k^2]-2E[k]>0

, or

G'₁₍₁₎₌₁

directed: giant in/out-component

E[k_ink_out]-E[k_in]>0

,^[4] or

g'₁₍₁₎₌f'₁₍₁₎=1

directed: giant weak component

2E[k_in]E[k_ink_out]
-E[k_in]

	2]
E[k
	out

-E[k_in]

	2]
E[k
	in

	2]
+E[k
	out

-E[k_in

	2
k
	out]

^[5]

Notes and References

.
Book: Newman, M. E. J. . Networks : an introduction . 2010 . Oxford University Press . New York . 456837194.
Molloy . Michael . Reed . Bruce . A critical point for random graphs with a given degree sequence . Random Structures & Algorithms . 6 . 2–3 . 1995 . 1042-9832 . 10.1002/rsa.3240060204 . 161–180.
Newman . M. E. J. . Strogatz . S. H. . Watts . D. J. . Random graphs with arbitrary degree distributions and their applications . Physical Review E . 64 . 2 . 2001-07-24 . 1063-651X . 10.1103/physreve.64.026118 . 11497662 . 026118. free. cond-mat/0007235. 2001PhRvE..64b6118N .
Kryven . Ivan . Emergence of the giant weak component in directed random graphs with arbitrary degree distributions . Physical Review E . 94 . 1 . 2016-07-27 . 2470-0045 . 10.1103/physreve.94.012315 . 27575156 . 012315. 1607.03793. 2016PhRvE..94a2315K . 206251373 .