RESEARCH PRODUCT
Clique Percolation Method: Memory Efficient Almost Exact Communities
Alexis Baudin, Maximilien Danisch, Sergey Kirgizov, Clémence Magnien, Marwan Ghanem
subject
Social and Information Networks (cs.SI); Physics and Society (physics.soc-ph); Data Structures and Algorithms (cs.DS); Information Retrieval (cs.IR); FOS: Computer and information sciences; FOS: Physical sciences; Computer Science - Social and Information Networks; Physics - Physics and Society; Computer Science - Data Structures and Algorithms; Computer Science - Information Retrieval; [INFO.INFO-SI] Computer Science [cs]/Social and Information Networks [cs.SI]; [PHYS.PHYS.PHYS-SOC-PH] Physics [physics]/Physics [physics]/Physics and Society [physics.soc-ph]; [INFO.INFO-DS] Computer Science [cs]/Data Structures and Algorithms [cs.DS]; [INFO.INFO-IR] Computer Science [cs]/Information Retrieval [cs.IR]; MathematicsofComputing_DISCRETEMATHEMATICS
description
Automatic detection of relevant groups of nodes in large real-world graphs, i.e. community detection, has applications in many fields and has received a lot of attention in the last twenty years. The most popular method designed to find overlapping communities (where a node can belong to several communities) is perhaps the clique percolation method (CPM). This method formalizes the notion of community as a maximal union of $k$-cliques that can be reached from each other through a series of adjacent $k$-cliques, where two cliques are adjacent if and only if they overlap on $k-1$ nodes. Despite much effort, CPM has not scaled to large graphs for medium values of $k$. Recent work has shown that it is possible to efficiently list all $k$-cliques in very large real-world graphs for medium values of $k$. We build on top of this work and scale up CPM. In cases where this first, exact algorithm faces memory limitations, we propose another algorithm, CPMZ, that provides a solution close to the exact one, using more time but less memory.
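The CPM definition in the description above can be illustrated with a minimal Python sketch. This is a naive illustration of the definition only, not the scalable algorithm or CPMZ from the paper: it enumerates every $k$-clique by brute-force extension and merges adjacent cliques with a union-find structure, so it is suitable only for small graphs. The function names `k_cliques` and `cpm_communities` are hypothetical, and the sketch assumes sortable node labels and $k \geq 2$.

```python
from itertools import combinations


def k_cliques(adj, k):
    """Yield every k-clique of the graph as a sorted tuple (naive extension)."""
    def extend(clique, candidates):
        if len(clique) == k:
            yield tuple(clique)
            return
        for i, v in enumerate(candidates):
            # Keep only later candidates that are also neighbours of v, so every
            # remaining candidate is adjacent to all nodes already in the clique.
            rest = [u for u in candidates[i + 1:] if u in adj[v]]
            yield from extend(clique + [v], rest)

    yield from extend([], sorted(adj))


def cpm_communities(edges, k):
    """Return CPM communities (sorted node lists) for an undirected edge list."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    cliques = list(k_cliques(adj, k))
    parent = list(range(len(cliques)))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Two k-cliques are adjacent iff they share a (k-1)-node subset, so we
    # index cliques by their (k-1)-subsets and merge cliques that collide.
    owner = {}
    for idx, clique in enumerate(cliques):
        for sub in combinations(clique, k - 1):
            if sub in owner:
                parent[find(idx)] = find(owner[sub])
            else:
                owner[sub] = idx

    # A community is the union of the nodes of all cliques in one component.
    groups = {}
    for idx, clique in enumerate(cliques):
        groups.setdefault(find(idx), set()).update(clique)
    return sorted(sorted(g) for g in groups.values())


# Two triangles sharing edge (2, 3), plus a third triangle touching only node 4.
example_edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4), (4, 5), (4, 6), (5, 6)]
print(cpm_communities(example_edges, 3))
```

In this toy graph, the triangles {1, 2, 3} and {2, 3, 4} overlap on $k-1 = 2$ nodes and merge into one community, while {4, 5, 6} shares only node 4 with them and forms a second community, so node 4 belongs to both. Storing every clique and every $(k-1)$-subset, as this sketch does, is precisely the kind of memory cost that becomes prohibitive on large graphs.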
year | journal | country | edition | language |
---|---|---|---|---|
2022-02-02 | | | | |