In graph theory, a component of an undirected graph is a connected subgraph that is not part of any larger connected subgraph. The components of any graph partition its vertices into disjoint sets, and are the induced subgraphs of those sets. A graph that is itself connected has exactly one component, consisting of the whole graph. Components are sometimes called connected components.
The number of components in a given graph is an important graph invariant, and is closely related to invariants of matroids, topological spaces, and matrices. In random graphs, a frequently occurring phenomenon is the incidence of a giant component, one component that is significantly larger than the others; and of a percolation threshold, an edge probability above which a giant component exists and below which it does not.
The components of a graph can be constructed in linear time, and a special case of the problem, connected-component labeling, is a basic technique in image analysis. Dynamic connectivity algorithms maintain components as edges are inserted or deleted in a graph, in low time per change. In computational complexity theory, connected components have been used to study algorithms with limited space complexity, and sublinear time algorithms can accurately estimate the number of components.
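A component of a given undirected graph may be defined as a connected subgraph that is not part of any larger connected subgraph. For instance, the graph shown in the first illustration has three components. Every vertex $v$ of a graph belongs to exactly one of the graph's components, which may be found as the induced subgraph of the set of vertices reachable from $v$.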
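Another definition of components involves the equivalence classes of an equivalence relation defined on the graph's vertices. In an undirected graph, a vertex $v$ is reachable from a vertex $u$ if there is a path from $u$ to $v$. Reachability is an equivalence relation: it is reflexive, because there is a trivial path of length zero from any vertex to itself; symmetric, because a path from $u$ to $v$ traversed in reverse is a path from $v$ to $u$; and transitive, because a path from $u$ to $v$ and a path from $v$ to $w$ can be concatenated into a walk from $u$ to $w$, which contains a path from $u$ to $w$. The components are the subgraphs induced by the equivalence classes of this relation.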
Similar definitions involving equivalence classes have been used to define components for other forms of graph connectivity, including the weak components and strongly connected components of directed graphs and the biconnected components of undirected graphs.
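The number of components of a given finite graph can be used to count the number of edges in its spanning forests: In a graph with $n$ vertices and $c$ components, every spanning forest has exactly $n-c$ edges. This number $n-c$ is the matroid-theoretic rank of the graph, and the rank of its graphic matroid. The rank of the dual cographic matroid equals the circuit rank of the graph, the minimum number of edges that must be removed from the graph to break all its cycles. In a graph with $m$ edges, $n$ vertices, and $c$ components, the circuit rank is $m-n+c$.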
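As a small worked example of these counts (the graph is chosen here purely for illustration), take a triangle together with one separate edge, so that $n=5$, $m=4$, and $c=2$:

$$
\begin{aligned}
\text{edges in any spanning forest} &= n - c = 5 - 2 = 3,\\
\text{circuit rank} &= m - n + c = 4 - 5 + 2 = 1.
\end{aligned}
$$

The single unit of circuit rank corresponds to the triangle: removing any one of its edges leaves a forest with three edges.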
A graph can be interpreted as a topological space in multiple ways, for instance by placing its vertices as points in general position in three-dimensional Euclidean space and representing its edges as line segments between those points. The components of a graph can be generalized through these interpretations as the topological connected components of the corresponding space; these are equivalence classes of points that cannot be separated by pairs of disjoint closed sets. Just as the number of connected components of a topological space is an important topological invariant, the zeroth Betti number, the number of components of a graph is an important graph invariant, and in topological graph theory it can be interpreted as the zeroth Betti number of the graph.
The number of components arises in other ways in graph theory as well. In algebraic graph theory it equals the multiplicity of 0 as an eigenvalue of the Laplacian matrix of a finite graph. It is also the index of the first nonzero coefficient of the chromatic polynomial of the graph, and the chromatic polynomial of the whole graph can be obtained as the product of the polynomials of its components. Numbers of components play a key role in the Tutte theorem characterizing finite graphs that have perfect matchings and the associated Tutte–Berge formula for the size of a maximum matching, and in the definition of graph toughness.
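To make the Laplacian and chromatic-polynomial statements concrete, consider an illustrative graph on vertices $1, 2, 3$ with the single edge $\{1,2\}$, which has two components. Its Laplacian (degree matrix minus adjacency matrix) and chromatic polynomial are:

$$
L = \begin{pmatrix} 1 & -1 & 0 \\ -1 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix},
\qquad \text{eigenvalues } 0,\ 0,\ 2,
\qquad P(k) = \underbrace{k(k-1)}_{\text{edge } \{1,2\}} \cdot \underbrace{k}_{\text{vertex } 3} = k^3 - k^2 .
$$

The eigenvalue $0$ has multiplicity two, and the lowest-degree nonzero term of $P(k)$ is $k^2$, so both quantities recover the number of components.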
It is straightforward to compute the components of a finite graph in linear time (in terms of the numbers of the vertices and edges of the graph) using either breadth-first search or depth-first search. In either case, a search that begins at some particular vertex $v$ will find the entire component containing $v$ (and no more) before returning. All components of a graph can be found by looping through its vertices, starting a new breadth-first or depth-first search whenever the loop reaches a vertex that has not already been included in a previously found component. A published description of essentially this algorithm states that it was already "well known".
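The loop-and-search procedure described above can be sketched as follows. This is a minimal illustration using breadth-first search, not a reference implementation; the function name `connected_components` and the adjacency-dictionary input format are assumptions made for this example.

```python
from collections import deque


def connected_components(adj):
    """Return the components of an undirected graph as lists of vertices.

    `adj` is assumed to map every vertex to an iterable of its neighbors,
    with each edge listed in both directions.
    """
    seen = set()
    components = []
    for start in adj:
        if start in seen:
            continue  # already part of a previously found component
        # Breadth-first search from an unvisited vertex finds exactly
        # the component that contains it.
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            u = queue.popleft()
            component.append(u)
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        components.append(component)
    return components


# Illustrative graph: a path 1-2-3, an edge 4-5, and an isolated vertex 6.
example = {1: [2], 2: [1, 3], 3: [2], 4: [5], 5: [4], 6: []}
components = connected_components(example)
```

Each vertex enters the queue once and each adjacency list is scanned once, which is where the linear running time comes from.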
Connected-component labeling, a basic technique in computer image analysis, involves the construction of a graph from the image and component analysis on the graph. The vertices are the subset of the pixels of the image, chosen as being of interest or as likely to be part of depicted objects. Edges connect adjacent pixels, with adjacency defined either orthogonally according to the Von Neumann neighborhood, or both orthogonally and diagonally according to the Moore neighborhood. Identifying the connected components of this graph allows additional processing to find more structure in those parts of the image or identify what kind of object is depicted. Researchers have developed component-finding algorithms specialized for this type of graph, allowing it to be processed in pixel order rather than in the more scattered order that would be generated by breadth-first or depth-first searching. This can be useful in situations where sequential access to the pixels is more efficient than random access, either because the image is represented in a hierarchical way that does not permit fast random access or because sequential access produces better memory access patterns.
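One way to process pixels strictly in raster order is a two-pass scheme that assigns provisional labels and records which labels turn out to be equivalent. The sketch below is one such scheme, offered as an illustration rather than as any specific published algorithm; the function name `label_components`, the list-of-lists image format, and the `diagonal` flag (Moore rather than Von Neumann neighborhood) are assumptions of this example.

```python
def label_components(image, diagonal=False):
    """Label foreground pixels (truthy entries) of a 2D list in raster order.

    Returns a grid of the same shape holding 0 for background pixels and
    labels 1, 2, ... for foreground pixels, one label per component.
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    parent = [0]  # parent[i]: union-find parent of provisional label i

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Only neighbors that were already scanned need to be inspected.
    offsets = [(-1, 0), (0, -1)]
    if diagonal:
        offsets += [(-1, -1), (-1, 1)]

    labels = [[0] * cols for _ in range(rows)]
    # First pass: provisional labels, merging labels that meet.
    for r in range(rows):
        for c in range(cols):
            if not image[r][c]:
                continue
            roots = set()
            for dr, dc in offsets:
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols and labels[rr][cc]:
                    roots.add(find(labels[rr][cc]))
            if roots:
                root = min(roots)
                for other in roots:
                    parent[other] = root
                labels[r][c] = root
            else:
                parent.append(len(parent))  # fresh provisional label
                labels[r][c] = len(parent) - 1
    # Second pass: replace provisional labels by compact final labels.
    final = {}
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                root = find(labels[r][c])
                labels[r][c] = final.setdefault(root, len(final) + 1)
    return labels


image = [
    [1, 0, 0, 1],
    [1, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]
orthogonal = label_components(image)
moore = label_components(image, diagonal=True)
```

In this example the pixels at row 0, column 3 and row 1, column 2 touch only diagonally, so they form separate components under the Von Neumann neighborhood and a single component under the Moore neighborhood.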
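There are also efficient algorithms to dynamically track the components of a graph as vertices and edges are added, by using a disjoint-set data structure to keep track of the partition of the vertices into equivalence classes, replacing any two classes by their union when an edge connecting them is added. These algorithms take amortized time $O(\alpha(n))$ per operation, where adding vertices and edges and determining the component in which a vertex falls are both operations, and $\alpha$ is a very slowly growing inverse of the very quickly growing Ackermann function. When both edge insertions and edge deletions are allowed, dynamic connectivity algorithms can still maintain the same information, in amortized time $O(\log^2 n/\log\log n)$ per change and time $O(\log n/\log\log n)$ per connectivity query.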
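The insertion-only case can be illustrated with a small disjoint-set structure. The sketch below uses union by size with path compression, a standard combination that attains the inverse-Ackermann amortized bound; the class name `DisjointSet` and its method names are hypothetical choices for this example. It does not handle edge deletions, which require the more elaborate dynamic connectivity structures mentioned above.

```python
class DisjointSet:
    """Track the components of a graph as vertices and edges are added."""

    def __init__(self):
        self.parent = {}  # parent pointer of each vertex
        self.size = {}    # size of the class rooted at each root
        self.count = 0    # current number of components

    def add(self, v):
        """Add an isolated vertex (no effect if it is already present)."""
        if v not in self.parent:
            self.parent[v] = v
            self.size[v] = 1
            self.count += 1

    def find(self, v):
        """Return the representative of v's component, compressing the path."""
        root = v
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[v] != root:
            self.parent[v], v = root, self.parent[v]
        return root

    def add_edge(self, u, v):
        """Add edge u-v; return True if it merged two components."""
        self.add(u)
        self.add(v)
        ru, rv = self.find(u), self.find(v)
        if ru == rv:
            return False
        if self.size[ru] < self.size[rv]:
            ru, rv = rv, ru  # attach the smaller class under the larger
        self.parent[rv] = ru
        self.size[ru] += self.size[rv]
        self.count -= 1
        return True

    def connected(self, u, v):
        """Report whether u and v currently lie in the same component."""
        return self.find(u) == self.find(v)


ds = DisjointSet()
for vertex in range(5):
    ds.add(vertex)
ds.add_edge(0, 1)
ds.add_edge(1, 2)
```

After these two insertions, vertices 0, 1, and 2 form one class while 3 and 4 remain singletons, so `ds.count` is 3.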
Components of graphs have been used in computational complexity theory to study the power of Turing machines that have a working memory limited to a logarithmic number of bits, with the much larger input accessible only through read access rather than being modifiable. The problems that can be solved by machines limited in this way define the complexity class L. It was unclear for many years whether connected components could be found in this model, when formalized as a decision problem of testing whether two vertices belong to the same component, and in 1982 a related complexity class, SL, was defined to include this connectivity problem and any other problem equivalent to it under logarithmic-space reductions. It was finally proven in 2008 that this connectivity problem can be solved in logarithmic space, and therefore that SL = L.
In a graph represented as an adjacency list, with random access to its vertices, it is possible to estimate the number of connected components, with constant probability of obtaining additive (absolute) error at most $\varepsilon n$, in sublinear time $O(\varepsilon^{-2}\log\varepsilon^{-1})$.
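The idea behind such estimators can be illustrated with a simplified sampling scheme. It rests on the identity that summing $1/|C(v)|$ over all vertices $v$, where $C(v)$ is the component containing $v$, gives exactly the number of components. The sketch below samples vertices, explores each sampled component only up to a size cap of about $2/\varepsilon$, and rescales the average. It is an assumption-laden illustration and does not achieve the time bound quoted above; the function names and the `samples` and `rng` parameters are hypothetical.

```python
import math
import random
from collections import deque


def truncated_component_size(adj, start, cap):
    """Return the size of start's component, or None if it exceeds cap."""
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                if len(seen) > cap:
                    return None  # stop exploring large components early
                queue.append(w)
    return len(seen)


def estimate_components(adj, eps, samples, rng=random):
    """Estimate the number of components from `samples` random vertices.

    Components larger than ceil(2 / eps) contribute zero, which biases the
    estimate downward by at most eps * n / 2.
    """
    vertices = list(adj)
    cap = math.ceil(2 / eps)
    total = 0.0
    for _ in range(samples):
        size = truncated_component_size(adj, rng.choice(vertices), cap)
        if size is not None:
            total += 1 / size
    return len(vertices) * total / samples


# Illustrative input: 250 disjoint paths on 4 vertices each (1000 vertices).
paths = {v: [] for v in range(1000)}
for v in range(999):
    if v % 4 != 3:
        paths[v].append(v + 1)
        paths[v + 1].append(v)
estimate = estimate_components(paths, eps=0.1, samples=200, rng=random.Random(0))
```

The number of samples needed for a given confidence is not derived here; it would follow from a standard concentration bound on the sampled average.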
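In random graphs the sizes of components are given by a random variable, which, in turn, depends on the specific model of how random graphs are chosen. In the $G(n,p)$ version of the Erdős–Rényi–Gilbert model, a graph on $n$ vertices is generated by choosing randomly and independently for each pair of vertices whether to include an edge connecting that pair, with probability $p$ of including the edge and probability $1-p$ of leaving it out. The behavior of the components falls into three ranges of $p$, described here in terms of a positive constant $\varepsilon$ that does not depend on $n$; each statement holds with high probability, that is, with probability tending to one as $n$ grows.
Subcritical, $p<(1-\varepsilon)/n$: In this range of $p$, all components are simple and very small. The largest component has logarithmic size.
Critical, $p \approx 1/n$: The largest connected component has a number of vertices proportional to $n^{2/3}$. There may exist several other large components; however, the total number of vertices in non-tree components is again proportional to $n^{2/3}$.
Supercritical, $p>(1+\varepsilon)/n$: There is a single giant component containing a linear number of vertices. For large values of $p$ its size approaches the whole graph: $|C_1| \approx yn$, where $y$ is the positive solution to the equation $e^{-pny}=1-y$.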
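In the supercritical range, the fraction $y$ of vertices in the giant component is the positive solution of $e^{-pny} = 1 - y$. A rough numerical sketch, assuming plain fixed-point iteration on $y \mapsto 1 - e^{-cy}$ with $c = pn$ is adequate (it converges slowly when $c$ is close to 1), is shown below; the function name `giant_fraction` and the default iteration count are choices made for this example.

```python
import math


def giant_fraction(c, iterations=200):
    """Approximate the positive solution y of exp(-c * y) = 1 - y.

    Here c stands for p * n and is assumed to be greater than 1; for
    c <= 1 the iteration drifts toward the trivial solution y = 0.
    """
    y = 1.0  # start above the positive root so the iterates decrease to it
    for _ in range(iterations):
        y = 1.0 - math.exp(-c * y)
    return y


y = giant_fraction(2.0)  # average degree about 2, that is, p = 2 / n
```

For $c = 2$ the iteration settles near $y \approx 0.797$, meaning that roughly four fifths of the vertices lie in the giant component.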
For different models including the random subgraphs of grid graphs, the connected components are described by percolation theory. A key question in this theory is the existence of a percolation threshold, a critical probability above which a giant component (or infinite component) exists and below which it does not.