An Empirical Study of Community Detection Algorithms on Social and Road Networks

Waqas Nawaz

Keywords: Graph Mining, Community Detection, Road Network, Social Network, Algorithms, Community Analysis

Issue I, Volume I, Pages 185-200

Community detection in social networks is a problem with considerable interest, since, discovering

communities reveals hidden information about networks. There exist many algorithms to detect

inherent community structures and recently few of them are investigated on social networks. However,

it is non-trivial to decide the best approach in the presence of diverse nature of graphs, in

terms of density and sparsity, and inadequate analysis of the results. Therefore, in this study, we

analyze and compare various algorithms to detect communities in two networks, namely social

and road networks, with varying structural properties. The algorithms under consideration are

evaluated with unique metrics for internal and external connectivity of communities that includes

internal density, average degree, cut ratio, conductance, normalized cut, and average Jaccard Index.

The evaluation results revealed key insights about selected algorithms and underlying community

structures.

[1] A. Clauset, M. E. J. Newman, and C. Moore, “Finding community structure in very large

networks,” Phys. Rev. E, vol. 70, no. 6, p. 066111, 2004.

[2] J. Chen and Y. Saad, “Dense subgraph extraction with application to community detection,”

Knowledge and Data Engineering, IEEE Transactions on, vol. 24, no. 7, pp. 1216–1230,2012.

[3] J. Huang, H. Sun, Q. Song, H. Deng, and J. Han, “Revealing density-based clustering structure

from the core-connected tree of a network,” Knowledge and Data Engineering, IEEE

Transactions on, vol. 25, no. 8, pp. 1876–1889, 2013.

[4] R. R. Khorasgani, J. Chen, and O. R. Za¨ıane, “Top leaders community detection approach in

information networks,” in 4th SNA-KDD Workshop on Social Network Mining and Analysis. Citeseer, 2010.

[5] J. M. Kumpula, M. Kivel¨a, K. Kaski, and J. Saram¨aki, “Sequential algorithm for fast clique

percolation,” Physical Review E, vol. 78, no. 2, p. 026109, 2008.

[6] I. X. Leung, P. Hui, P. Lio, and J. Crowcroft, “Towards real-time community detection in

large networks,” Physical Review E, vol. 79, no. 6, p. 066107, 2009.

[7] U. N. Raghavan, R. Albert, and S. Kumara, “Near linear time algorithm to detect community

structures in large-scale networks,” Physical Review E, vol. 76, no. 3, p. 036106, 2007.

[8] F. Zhao and A. K. Tung, “Large scale cohesive subgraphs discovery for social network visual

analysis,” Proceedings of the VLDB Endowment, vol. 6, no. 2, pp. 85–96, 2012.

[9] J. Yang and J. Leskovec, “Defining and evaluating network communities based on groundtruth,”

Knowledge and Information Systems, vol. 42, no. 1, pp. 181–213, 2015.

[10] J. Xie, S. Kelley, and B. K. Szymanski, “Overlapping community detection in networks: The

state-of-the-art and comparative study,” Acm computing surveys (csur), vol. 45, no. 4, p. 43,2013.

[11] S. Harenberg, G. Bello, L. Gjeltema, S. Ranshous, J. Harlalka, R. Seay, K. Padmanabhan,

and N. Samatova, “Community detection in large-scale networks: a survey and empirical

evaluation,” Wiley Interdisciplinary Reviews: Computational Statistics, vol. 6, no. 6, pp. 426– 439, 2014.

[12] J. Leskovec, K. J. Lang, and M. Mahoney, “Empirical comparison of algorithms for network

community detection,” in Proceedings of the 19th international conference on World wide

web. ACM, 2010, pp. 631–640.

[13] M. Wang, C. Wang, J. X. Yu, and J. Zhang, “Community detection in social networks: An

in-depth benchmarking study with a procedure-oriented framework,” PVLDB, vol. 8, no. 10,

pp. 998–1009, 2015.

[14] G. Misra, J. M. Such, and H. Balogun, “Non-sharing communities? an empirical study of

community detection for access control decisions,” in 2016 IEEE/ACM International Conference

on Advances in Social Networks Analysis and Mining (ASONAM). IEEE, 2016, pp.49–56.

[15] N. Garg and R. Rani, “A comparative study of community detection algorithms using graphs

and r,” in 2017 International Conference on Computing, Communication and Automation

(ICCCA). IEEE, 2017, pp. 273–278.

[16] Z. Zhao, S. Zheng, C. Li, J. Sun, L. Chang, and F. Chiclana, “A comparative study on community

detection methods in complex networks,” Journal of Intelligent & Fuzzy Systems, no.

Preprint, pp. 1–10, 2018.

[17] K. Chandusha, S. R. Chintalapudi, and M. H. M. Krishna Prasad, “An empirical study on

community detection algorithms,” in Smart Intelligent Computing and Applications, S. C.

Satapathy, V. Bhateja, and S. Das, Eds. Singapore: Springer Singapore, 2019, pp. 35–44.

[18] F. Radicchi, C. Castellano, F. Cecconi, V. Loreto, and D. Parisi, “Defining and identifying

communities in networks,” Proceedings of the National Academy of Sciences of the United

States of America, vol. 101, no. 9, pp. 2658–2663, 2004.

[19] B. Adamcsek, G. Palla, I. J. Farkas, I. Der´enyi, and T. Vicsek, “Cfinder: locating cliques and

overlapping modules in biological networks,” Bioinformatics, vol. 22, no. 8, pp. 1021–1023,2006.

[20] A. Clauset, “Source code of cnm algorithm,” http://www.cs.unm.edu/aaron/research/

fastmodularity.htm (Last accessed on 16 March, 2019).

[21] F. Radicchi, “Source code of radicchi algorithm,” http://homes.soic.indiana.edu/filiradi/

resources.html (Last accessed on 16 March, 2019).

[22] “R code of lpa algorithm,” http://igraph.wikidot.com/community-detection-in-r (Last accessed

on 16 March, 2019).

[23] “Python code of lpa algorithm,” http://orkohunter-networkx.readthedocs.org/en/latest/

modules/networkx/algorithms/community/asyn lpa.html (Last accessed on 16 March, 2019).

[24] R. R. Khorasgani, “Source code of topleaders algorithm,” https://webdocs.cs.ualberta.ca/

rabbanyk/TopLeader/ (Last accessed on 16 March, 2019).

[25] J. M. Kumpula, “Source codes of scp algorithm,” http://complex.cs.aalto.fi/resources/

software/ (Last accessed on 16 March, 2019).

[26] “Code of mbdsge algorithm,” https://github.com/sranshous/Graph-Community-Detection

(Last accessed on 16 March, 2019).

[27] “Eigen: Linear algebra library,” http://eigen.tuxfamily.org/index.php?title=Main Page (Last

accessed on 16 March, 2019).

[28] J. Leskovec and A. Krevl, “SNAP Datasets: Stanford large network dataset collection,” http:

//snap.stanford.edu/data (Last accessed on 16 March, 2019), 2014.

[29] J. Leskovec and et. al., “Statistical properties of community structure in large social and

information networks,” in Proceedings of the 17th international conference on World Wide

Web. ACM, 2008, pp. 695–704.

[30] J. Leskovec, K. J. Lang, A. Dasgupta, and M. W. Mahoney, “Community structure in large

networks: Natural cluster sizes and the absence of large well-defined clusters,” Internet Mathematics,

vol. 6, no. 1, pp. 29–123, 2009.

[31] J. Shi and J. Malik, “Normalized cuts and image segmentation,” Pattern Analysis and Machine

Intelligence, IEEE Transactions on, vol. 22, no. 8, pp. 888–905, 2000.

[32] S. Fortunato, “Community detection in graphs,” Physics Reports, vol. 486, no. 3, pp. 75–174, 2010.