Quadratic program-based modularity maximization for fuzzy community detection in social networks

Document Type

Article

Publication Date

9-29-2014

Department

Department of Electrical and Computer Engineering; Center for Data Sciences

Abstract

One of the most important elements of social network analysis is community detection, i.e., finding groups of similar people based on their traits. In this paper, we present the fuzzy modularity maximization (FMM) approach for community detection, which finds overlapping (i.e., fuzzy) communities where appropriate by maximizing a generalized form of Newman's modularity. The first proposed FMM solution uses a tree-based structure to find a globally optimal solution, while the second uses alternating optimization to efficiently search for a locally optimal solution. Both approaches are based on a proposed algorithm called one-step modularity maximization (OSMM), which computes the optimal cluster memberships for one person in the social network. We prove that OSMM can be formulated as a simplified quadratic knapsack optimization problem, which has O(n) time complexity. We then propose a tree-based algorithm, called FMM/Find Best Leaf Node (FMM/FBLN), which represents sequences of OSMM steps in a tree-based structure. We prove that FMM/FBLN finds globally optimal solutions for FMM; however, its time complexity is O(n^d), d ≥ 2, so it is impractical for most real-world networks. To combat this inefficiency, we propose five heuristic-based alternating optimization schemes, FMM/H1-H5, all of which are shown to have O(n^2) time complexity. We compare the results of the FMM/H solutions with those of the state-of-the-art community detection algorithms MULTICUT spectral FCM (MSFCM) and GALS, and with those of two fuzzy community detection algorithms, GA and the vertex-similarity based gradient-descent method (VSGD), on ten real-world datasets. We conclude that one of the five heuristic algorithms (FMM/H2) is very competitive with GALS and much more effective than MSFCM, GA, and VSGD. Furthermore, all of the FMM/H schemes are at least two orders of magnitude faster than GALS in run time. Finally, FMM/H, unlike GALS (which only produces crisp partitions) and MSFCM (which always finds fuzzy partitions), is the only fuzzy community detection algorithm to date that can find the max-modularity partition, fuzzy or crisp.
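
The abstract does not spell out the generalized modularity that FMM maximizes. As a rough illustration only, the sketch below evaluates one commonly used fuzzy extension of Newman's modularity, Q = (1/2m) * sum_ij (A_ij - k_i*k_j/2m) * sum_c u_ic*u_jc, where each row of the membership matrix sums to 1. The function name `fuzzy_modularity`, its arguments, and the choice of this particular formula are assumptions made for this example; they are not taken from the paper, and the sketch does not implement OSMM, FMM/FBLN, or the FMM/H heuristics.

```python
def fuzzy_modularity(adj, memberships):
    """Evaluate an assumed fuzzy generalization of Newman's modularity.

    adj:         symmetric n x n adjacency matrix (list of lists of numbers).
    memberships: n x c matrix; row i holds node i's memberships, summing to 1.
                 A crisp partition is the special case of one-hot rows.
    """
    n = len(adj)
    degrees = [sum(row) for row in adj]
    total = sum(degrees)  # equals 2m for an undirected graph
    if total == 0:
        raise ValueError("graph has no edges")

    q = 0.0
    for i in range(n):
        for j in range(n):
            # Shared membership of nodes i and j across all communities.
            overlap = sum(ui * uj for ui, uj in zip(memberships[i], memberships[j]))
            # Observed edge weight minus the weight expected from degrees alone.
            q += (adj[i][j] - degrees[i] * degrees[j] / total) * overlap
    return q / total


# Toy graph (hypothetical data): two disjoint edges, 0-1 and 2-3.
adj = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]
crisp = [[1, 0], [1, 0], [0, 1], [0, 1]]   # each edge is its own community
uniform = [[0.5, 0.5]] * 4                 # every node split evenly

print(fuzzy_modularity(adj, crisp))
print(fuzzy_modularity(adj, uniform))
```

On this toy graph the crisp partition scores higher than the evenly split fuzzy one, which matches the abstract's point that a max-modularity partition may be either crisp or fuzzy depending on the network. The double loop is O(n^2 c) and is meant for clarity, not speed.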

Publication Title

IEEE Transactions on Fuzzy Systems
