Community detection methods have been used in computer science, sociology, physics, biology, and brain information science. Many methods are based on the optimization of modularity. The algorithm proposed by Blondel et al. (Blondel V D, Guillaume J L, Lambiotte R and Lefebvre E 2008 J. Stat. Mech. 10 10008) is one of the most widely used methods because of its good performance, especially in the big data era. In this paper we improve both the correctness and the performance of this algorithm. Tests show that different node orders lead to different performance and different community structures. We find that some nodes swing between communities, which degrades performance. We therefore design strategies for the node sweeping order that reduce the computing cost caused by repeated swings. We introduce a new concept, the overlapping degree (OV), which measures the strength of connection between nodes. Three improvement strategies are proposed, based on constant OV, adaptive OV, and adaptive weighted OV, respectively. Experiments on synthetic and real datasets show that the improved strategies increase both performance and correctness.
Funding: Project supported by the Major State Basic Research Development Program of China (Grant Nos. 2013CB329602 and 2012CB316303), the Science Research Foundation for the Returned Overseas Chinese Scholars, China (Grant No. 2010-31), the International Collaborative Project of Shanxi Province, China (Grant No. 2011081034), and the National Natural Science Foundation of China (Grant Nos. 61232010 and 61202215).