RESEARCH PRODUCT
Gene-based and semantic structure of the Gene Ontology as a complex network
Michele Tumminello, Claudia Coronnello, Salvatore Miccichè
subject
0301 basic medicine; 03 medical and health sciences; 030104 developmental biology; Statistics and Probability; FOS: Computer and information sciences; FOS: Physical sciences; FOS: Biological sciences; Physics - Physics and Society; Physics and Society (physics.soc-ph); Quantitative Biology - Molecular Networks; Molecular Networks (q-bio.MN); Quantitative Biology - Quantitative Methods; Quantitative Methods (q-bio.QM); Statistics - Applications; Applications (stat.AP); Condensed Matter Physics; Complex systems; Complex network; Networks; Weighted network; Semantic network; Semantic similarity; Bipartite system; Community detection; Genes; Gene ontology; Ontology; Ontology-based data integration; Computer science; Data mining; Artificial intelligence; Natural language processing; computer.software_genre; business.industry; ComputingMethodologies_GENERAL
description
The last decade has seen the advent and consolidation of ontology-based tools, such as the Gene Ontology (GO), for the identification and biological interpretation of classes of genes. The information accumulated over time and included in the GO is encoded in the definitions of terms and in the semantic relations set up amongst terms. This approach might be usefully complemented by a bottom-up approach based on the knowledge of relationships amongst genes. To this end, we investigate the Gene Ontology from a complex network perspective. We consider two networks: the semantic network of terms naturally associated with the semantic relationships provided by the Gene Ontology consortium, and a gene-based weighted network in which the nodes are the terms and a link between any two terms is set up whenever genes are annotated in both terms. One aim of the present paper is to understand whether the semantic and the gene-based networks share the same structural properties. We then consider network communities. The identification of communities in the SVNs can be the basis of a simple protocol aimed at fully exploiting the possible relationships amongst terms, thus improving the knowledge of the semantic structure of GO. This is also important from a biomedical point of view, as it might reveal how genes over-expressed in a certain term also affect other biological functions not directly linked by the GO semantics. As a by-product, we present a simple methodology that gives a first-glance insight into the biological characterization of groups of GO terms.
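The gene-based network described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that annotations are available as a mapping from each term to the set of genes annotated in it, and that the link weight is simply the number of shared genes; the abstract does not specify the weighting. The function name `gene_based_network` and the term and gene labels (`GO:A`, `g1`, ...) are hypothetical placeholders.

```python
from collections import defaultdict
from itertools import combinations


def gene_based_network(annotations):
    """Project term -> genes annotations onto a weighted term-term network.

    annotations: dict mapping a term label to the set of genes annotated in it.
    Returns a dict mapping (term_a, term_b), with term_a < term_b, to the
    number of genes annotated in both terms (an assumed choice of weight).
    """
    # Invert the mapping: for each gene, collect the terms it is annotated in.
    gene_to_terms = defaultdict(set)
    for term, genes in annotations.items():
        for gene in genes:
            gene_to_terms[gene].add(term)

    # Every pair of terms sharing a gene gains one unit of weight per gene.
    weights = defaultdict(int)
    for terms in gene_to_terms.values():
        for term_a, term_b in combinations(sorted(terms), 2):
            weights[(term_a, term_b)] += 1
    return dict(weights)


# Hypothetical toy annotations, for illustration only.
toy_annotations = {
    "GO:A": {"g1", "g2", "g3"},
    "GO:B": {"g2", "g3"},
    "GO:C": {"g3", "g4"},
    "GO:D": {"g5"},
}
network = gene_based_network(toy_annotations)
for (term_a, term_b), weight in sorted(network.items()):
    print(term_a, term_b, weight)
```

In this toy example the term `GO:D` shares no gene with any other term, so it remains isolated in the projected network. A community detection algorithm could then be run on the resulting weighted links.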
year | journal | country | edition | language |
---|---|---|---|---|
2012-11-10 | | | | |