6533b7d7fe1ef96bd1267c5f
RESEARCH PRODUCT
Statistically validated networks in bipartite complex systems.
Michele TumminelloMichele TumminelloJyrki PiiloFabrizio LilloFabrizio LilloFabrizio LilloSalvatore MiccichèRosario N. Mantegnasubject
Theoretical computer scienceComputer sciencelcsh:MedicineNetwork theorySocial and Behavioral SciencesBioinformaticsQuantitative Biology - Quantitative MethodsSociologyProtein Interaction Mappinglcsh:ScienceQuantitative Methods (q-bio.QM)MultidisciplinarySystems BiologyApplied MathematicsPhysicsStatisticsComplex SystemsGenomicsLink (geometry)Social NetworksSpecialization (logic)Interdisciplinary PhysicsBipartite graphProbability distributionResearch ArticleNetwork analysisPhysics - Physics and SocietyComplex systemFOS: Physical sciencesPhysics and Society (physics.soc-ph)Type (model theory)BiologyModels BiologicalNetwork theory Statistical PhysicsStatistical MechanicsSet (abstract data type)Statistical MethodsBiologyStructure (mathematical logic)Statistical Physicslcsh:RComputational BiologyModels TheoreticalComparative GenomicsSettore FIS/07 - Fisica Applicata(Beni Culturali Ambientali Biol.e Medicin)FOS: Biological sciencesNetwork theorylcsh:QNull hypothesisMathematicsdescription
Many complex systems present an intrinsic bipartite nature and are often described and modeled in terms of networks [1-5]. Examples include movies and actors [1, 2, 4], authors and scientific papers [6-9], email accounts and emails [10], plants and animals that pollinate them [11, 12]. Bipartite networks are often very heterogeneous in the number of relationships that the elements of one set establish with the elements of the other set. When one constructs a projected network with nodes from only one set, the system heterogeneity makes it very difficult to identify preferential links between the elements. Here we introduce an unsupervised method to statistically validate each link of the projected network against a null hypothesis taking into account the heterogeneity of the system. We apply our method to three different systems, namely the set of clusters of orthologous genes (COG) in completely sequenced genomes [13, 14], a set of daily returns of 500 US financial stocks, and the set of world movies of the IMDb database [15]. In all these systems, both different in size and level of heterogeneity, we find that our method is able to detect network structures which are informative about the system and are not simply expression of its heterogeneity. Specifically, our method (i) identifies the preferential relationships between the elements, (ii) naturally highlights the clustered structure of investigated systems, and (iii) allows to classify links according to the type of statistically validated relationships between the connected nodes.
year | journal | country | edition | language |
---|---|---|---|---|
2011-01-01 | PLoS ONE |