Louvain method explained

The Louvain method for community detection is a method to extract non-overlapping communities from large networks created by Blondel et al.^[1] from the University of Louvain (the source of this method's name). The method is a greedy optimization method that appears to run in time

O(n ⋅ logn)

where

is the number of nodes in the network.^[2]

Modularity optimization

The inspiration for this method of community detection is the optimization of modularity as the algorithm progresses. Modularity is a scale value between −0.5 (non-modular clustering) and 1 (fully modular clustering) that measures the relative density of edges inside communities with respect to edges outside communities. Optimizing this value theoretically results in the best possible grouping of the nodes of a given network. But because going through all possible iterations of the nodes into groups is impractical, heuristic algorithms are used.

In the Louvain Method of community detection, first small communities are found by optimizing modularity locally on all nodes, then each small community is grouped into one node and the first step is repeated. The method is similar to the earlier method by Clauset, Newman and Moore^[3] that connects communities whose amalgamation produces the largest increase in modularity. The Louvain algorithm was shown to correctly identify the community structure when it exists, in particular in the stochastic block model.^[4]

Algorithm

The value to be optimized is modularity, defined as a value in the range

[-1/2,1]

that measures the density of links inside communities compared to links between communities. For a weighted graph, modularity is defined as:

	1
	2m

	N [A
\sum
	ij

	k_ik_j
	2m

]\delta(c_i,c_j),

where:

A_ij

represents the edge weight between nodes

and

; see Adjacency matrix;

k_i

and

k_j

are the sum of the weights of the edges attached to nodes

and

, respectively;

is the sum of all of the edge weights in the graph;

is the total number of nodes in the graph;

c_i

and

c_j

are the communities to which the nodes

and

belong; and

\delta

is Kronecker delta function:

\begin{align} \delta(c_i,c_j)&=\begin{cases} 1&ifc_iandc_jarethesamecluster\\ 0&otherwise \end{cases} \end{align}

Based on the above equation, the modularity of a community

can be calculated as:^[5]

\begin{align} Q_c&=\dfrac{1}{2m}\sum_i\sum_jA_ij1\left\{c_i=c_j=c\right\} -\left(\sum_i\dfrac{k_i}{2m}1\left\{c_i=c\right\}\right)²\ &=

	\Sigma_in
	2m

-\left(

	\Sigma_tot
	2m

\right)^{2
\end{align}}

where

\Sigma_in

is the sum of edge weights between nodes within the community

(each edge is considered twice); and

\Sigma_tot

is the sum of all edge weights for nodes within the community (including edges which link to other communities).

As nodes in different communities do not contribute to the modularity

, it can be written as:

Q=\sum_cQ_c

In order to maximize modularity efficiently, the Louvain Method has two phases that are repeated iteratively.

Phase 1:

1. First, each node in the network is assigned to its own community.

2. Next, for each node

, the change in modularity is calculated for removing

from its own community and moving it into the community of each neighbor

. This value is computed in two steps:

(a) Compute the change in modularity

\DeltaQ

for removing node

from its original community.

(b) Compute the change in modularity

\DeltaQ

for inserting an isolated node

(i.e. node

has no connections and is in a community of only itself) into the community of neighbouring node, denoted

c_j

In the following, we will show the derivation for (b). The equations for (a) are similar and can be computed by similar methods.

First, we compute the modularity of the isolated cluster of node

, which we will call

c_i

. Here we are assuming that there are no loops, and so

A_\mu\mu=0

for all values of

\mu

	prev
\begin{align} Q
	i

	N_i=1
\dfrac{1}{2m} \sum
	\mu=1

	N_i=1
\sum
	\nu=1

A_\mu\nu-

	N_i=1
\left(\sum
	\mu=1

\dfrac{k_\mu

}\right)^2 \\&= - \left(\dfrac\right)^2\end

Next, we compute the modularity of the cluster

c_j

before we have added the new node

. We already computed this equation:

	prev
\begin{align} Q
	j

	N_j
\dfrac{1}{2m} \sum
	\mu=1

	N_j
\sum
	\nu=1

A_\mu\nu-

	N_j
\left(\sum
	\mu=1

\dfrac{k_\mu

}\right)^2\end

Finally, we compute the modularity of the cluster

c_j

after we have added a new node

	updated
\begin{align} Q
	j

	N_j+1
\dfrac{1}{2m} \sum
	\mu=1

	N_j+1
\sum
	\nu=1

A_\mu\nu-

	N_j+1
\left(\sum
	\mu=1

\dfrac{k_\mu

}\right)^2 \end

We can rewrite the first term as follows:

	N_j+1
\begin{align} \dfrac{1}{2m} \sum
	\mu=1

	N_j+1
\sum
	\nu=1

A_\mu\nu&=

	N_j
\dfrac{1}{2m} \left[\sum
	\mu=1

	N_j
\sum
	\nu=1

A_\mu\nu

	N_j+1
+ 2\sum
	\mu=1

A
	\mu,N_j+1

\right] \\ &=

	N_j
\dfrac{1}{2m} \left[\sum
	\mu=1

	N_j
\sum
	\nu=1

A_\mu\nu

	(j)
+ 2k
	i

\right] \end{align}

where

	(j)
k
	i

represents the sum of the weights of all the edges which go between node

and the nodes in community

c_j

. In other words,

	(j)
k
	i

is the degree of node

within community

c_j

We can rewrite the second term as:

	N_j+1
\begin{align} \left(\sum
	\mu=1

\dfrac{k_\mu

}\right)^2 &= \sum_^ \sum_^\dfrac \\ &= \sum_^ \sum_^\dfrac +2k_ \sum_^\dfrac +\dfrac \\ &= \dfrac\left[\left(\sum_{\mu=1}^{N_j} k_{\mu} +k_{N_j+1}\right) \left(\sum_{\nu=1}^{N_j} k_{\nu} +k_{N_j+1}\right) \right] \\ &= \left(\dfrac\right)^2\end

Putting this together we have:

	updated
\begin{align} Q
	j

	N_j
\dfrac{1}{2m} \left[\sum
	\mu=1

	N_j
\sum
	\nu=1

A_\mu\nu

	(j)
+ 2k
	i

\right]-

	N_j
\left(\dfrac{\sum
	\mu=1

k_\mu

+k
	N_j+1

}\right)^2\end

Putting together the equations for

	prev
Q
	i

	prev
Q
	j

, and

	updated
Q
	j

, we can compute the change in modularity

\DeltaQ

for adding an isolated node

to the cluster

c_j

. This is sometimes referred to as the gain:

$\begin\Delta Q &= Q_^ - Q_j^ -Q_i^ \\&= \dfrac\left[\sum_{\mu=1}^{N_j}\sum_{\nu=1}^{N_j}A_{\mu\nu} +
2k_{i}^{(j)}\right] - \left(\dfrac\right)^2 \\ &- \left[\dfrac{1}{2m}
\sum_{\mu=1}^{N_j}\sum_{\nu=1}^{N_j}A_{\mu\nu} -
\left(\sum_{\mu=1}^{N_j} \dfrac{k_{\mu}}{2m}\right)^2 -
\left(\dfrac{k_{i}}{2m}\right)^2\right]\end$

3. Once the change in modularity

\DeltaQ

has been computed for all communities

\{c_j\}

that node

is connected to, node

is placed into the community that resulted in the greatest modularity increase. If no increase is possible, node

remains in its original community.

4. This process is applied repeatedly and sequentially to all nodes until no modularity increase can occur. Once this local maximum of modularity is hit, the first phase has ended.

Phase 2:

The second phase of the algorithm involves reducing communities to a single node and repeating the steps in Phase 1:

1. Each community

c_j

is reduced to a single node. Edges connecting nodes from

c_j

to other communities and likewise reduced to a single weighted edge.

2. Once the new graph is created, the second phase has ended and the first phase can be re-applied to the new network.

Previous uses

Twitter social Network (2.4 Million nodes, 38 million links) by Josep Pujol, Vijay Erramilli, and Pablo Rodriguez:^[6] The authors explore the problem of partitioning Online Social Networks onto different machines.
Mobile phone Network (4 Million nodes, 100 Million links) by Derek Greene, Donal Doyle, and Padraig Cunningham:^[7] Community-tracking strategies for identifying dynamic communities of different dynamic social networks.
Detecting species in network-based dynamical model.^[8]

Disadvantages

Louvain produces only non-overlapping communities, which means that each node can belong to at most one community. This is highly unrealistic in many real-world applications. For example, in social networks, most people belong to multiple communities: their family, their friends, their co-workers, old school buddies, etc. In biological networks, most genes or proteins belong to more than one pathway or complex. Furthermore, Louvain has been shown to sometimes produce arbitrarily badly connected communities, and has been effectively superseded (at least in the non-overlapping case) by the Leiden algorithm.^[9]

Comparison to other methods of non-overlapping community detection

When comparing modularity optimization methods, the two measures of importance are the speed and the resulting modularity value. A higher speed is better as it shows a method is more efficient than others and a higher modularity value is desirable as it points to having better-defined communities.The compared methods are, the algorithm of Clauset, Newman, and Moore,^[3] Pons and Latapy,^[10] and Wakita and Tsurumi.^[11]

Modularity Optimization Comparison^[12] !! Karate! Arxiv! Internet! Web nd.edu! Phone! Web uk-2005! Web WebBase 2001
Nodes/links	34/77	9k/24k	70k/351k	325k/1M	2.6M/6.3M	39M/783M	118M/1B
Clauset, Newman, and Moore	.38/0s	.772/3.6s	.692/799s	.927/5034s	-/-	-/-	-/-
Pons and Latapy	.42/0s	.757/3.3s	.729/575s	.895/6666s	-/-	-/-	-/-
Wakita and Tsurumi	.42/0s	.761/0.7s	.667/62s	.898/248s	.56/464s	-/-	-/-
Louvain Method	.42/0s	.813/0s	.781/1s	.935/3s	.769/134s	.979/738s	.984/152mn

-/- in the table refers to a method that took over 24hrs to run. This table (from^[13]) shows that the Louvain method outperforms many similar modularity optimization methods in both the modularity and the time categories.

References

"The Louvain method for community detection in large networks" Vincent Blondel http://perso.uclouvain.be/vincent.blondel/research/louvain.html

Notes and References

Blondel. Vincent D. Guillaume. Jean-Loup. Lambiotte. Renaud. Lefebvre. Etienne. 334423. Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment. 9 October 2008. 2008. 10. P10008. 10.1088/1742-5468/2008/10/P10008. 0803.0476. 2008JSMTE..10..008B.
Lancichinetti . Andrea . Fortunato . Santo . Community detection algorithms: A comparative analysis . Physical Review E . 80 . 5 . 2009-11-30 . 056117 . 1539-3755 . 10.1103/physreve.80.056117 . 20365053 . 0908.1062 . 2009PhRvE..80e6117L . 14193110 .
Clauset. Aaron. Newman. M. E. J.. Moore. Cristopher. 8977721. 2004-12-06. Finding community structure in very large networks. cond-mat/0408187. Physical Review E. 70. 6. 066111. 10.1103/PhysRevE.70.066111. 15697438. 1539-3755. 2004PhRvE..70f6111C.
Cohen-Addad. Vincent. Kosowski. Adrian. Mallmann-Trenn. Frederik. Saulpic. David . On the Power of Louvain in the Stochastic Block Model . 4055–4066 . Curran Associates, Inc. . Advances in Neural Information Processing Systems (Neurips 2020) . 2020 .
https://eecs.wsu.edu/~ananth/papers/Ghosh_IPDPS18.pdf
0905.4918v1. Pujol. Josep M.. Divide and Conquer: Partitioning Online Social Networks. Erramilli. Vijay. Rodriguez. Pablo. cs.NI. 2009.
Tracking the Evolution of Communities in Dynamic Social Networks. Derek. Greene. Dónal. Doyle. Pádraig. Cunningham. University College Dublin. UCD-CSI-2011-06. May 2011. 2014-11-20 . dead . https://web.archive.org/web/20130512153616/http://www.csi.ucd.ie/files/ucd-csi-2011-06.pdf . 2013-05-12.
Markovitch. Omer . Krasnogor. Natalio . Predicting species emergence in simulated complex pre-biotic networks . PLOS ONE. 13. 2. 10.1371/journal.pone.0192871. 29447212 . 5813963 . e0192871. 2018 . 2018PLoSO..1392871M . free .
Traag . V. A. . Waltman . L. . van Eck . N. J. . 2019-03-26 . From Louvain to Leiden: guaranteeing well-connected communities . Scientific Reports . en . 9 . 1 . 5233 . 10.1038/s41598-019-41695-z . 2045-2322 . 6435756 . 30914743. 1810.08473 . 2019NatSR...9.5233T .
Pascal. Pons. Matthieu. Latapy. 121714719. Journal of Graph Algorithms and Applications. 10. 2. 191–218. 2006. Computing Communities in Large Networks Using Random Walks. 10.7155/jgaa.00124. cond-mat/0412368 . free.
cs/0702048. Wakita. Ken. Finding Community Structure in Mega-scale Social Networks. Tsurumi. Toshiyuki. 2007.
0803.0476. 10.1088/1742-5468/2008/10/P10008. Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment. 2008. 10. P10008. 2008. Blondel. Vincent D.. Guillaume. Jean-Loup. Lambiotte. Renaud. Lefebvre. Etienne. 334423. 2008JSMTE..10..008B.
Book: Aynaud . Thomas . Graph Partitioning . Blondel . Vincent D. . Guillaume . Jean-Loup . Lambiotte . Renaud . Wiley . 2013 . 978-1-84821-233-6 . Bichot . Charles-Edmond . 1 . 13 February 2013 . 315–345 . en . Multilevel Local Optimization of Modularity . 10.1002/9781118601181.ch13 . Siarry . Patrick . https://onlinelibrary.wiley.com/doi/10.1002/9781118601181.ch13.

Louvain method explained

Modularity optimization

Algorithm

Previous uses

Disadvantages

Comparison to other methods of non-overlapping community detection

See also

References

Notes and References