AUTHOR
Fabio Fassetti
Discriminating graph pattern mining from gene expression data
We consider the problem of mining gene expression data in order to single out interesting features that characterize healthy/unhealthy samples of an input dataset. We present an approach based on a network model of the input gene expression data, where there is a labelled graph for each sample. To the best of our knowledge, this is the first attempt to build a different graph for each sample and, then, to have a database of graphs for representing a sample set. Our main goal is that of singling out interesting differences between healthy and unhealthy samples, through the extraction of "discriminating patterns" among graphs belonging to the two different sample sets. Differently from the …
Data Sources and Models
Biological networks rely on the storage and retrieval of data associated with the physical interactions and/or functional relationships among different actors. In particular, the attention may be on the interactions among cellular components, such as proteins, genes, RNA, or for example on phenotype–genotype associations. Data from which biological networks are built are usually stored in public databases, and we provide here a brief summary of the main publicly available types of data and associations. Moreover, we also explain how it is possible to construct suitable network models from these associations, focusing on protein–protein interaction networks, gene–disease networks and net…
(Discriminative) Pattern Discovery on Biological Networks
This work provides a review of biological networks as a model for analysis, presenting and discussing a number of illuminating analyses. Biological networks are an effective model for providing insights about biological mechanisms. Networks with different characteristics are employed for representing different scenarios. This powerful model allows analysts to perform many kinds of analyses which can be mined to provide interesting information about underlying biological behaviors. The text also covers techniques for discovering exceptional patterns, such as a pattern accounting for local similarities and also collaborative effects involving interactions between multiple actors (for example …
FEDRO: a software tool for the automatic discovery of candidate ORFs in plants with c→u RNA editing
RNA editing is an important mechanism for gene expression in plant organelles. It alters the direct transfer of genetic information from DNA to proteins, due to the introduction of differences between RNAs and the corresponding coding DNA sequences. Software tools that succeed in finding genes in other organisms are not always able to perform this task correctly in plant organellar genomes. Moreover, the available software tools predicting RNA editing events utilise algorithms that do not account for events which may generate a novel start codon. We present Fedro, a Java software tool implementing a novel strategy to generate candidate Open Reading Frames (ORFs) resulting from Cytidi…
Exceptional Pattern Discovery
This chapter is devoted to a discussion of exceptional pattern discovery, namely the scenarios, contexts, and techniques concerning the mining of patterns which are so rare or so frequent as to be considered exceptional and, therefore, of interest to an expert seeking to shed light on the domain. Frequent patterns have found broad applications in areas like association rule mining, indexing, and clustering [1, 20, 23]. The application of frequent patterns in classification has also achieved some success in the classification of relational data [6, 13, 14, 19, 25], text [15], and graphs [7]. The part is organized as follows. First, the frequent pattern mining on classical datasets is presented. This is not …
Discovering discriminative graph patterns from gene expression data
We consider the problem of mining gene expression data in order to single out interesting features characterizing healthy/unhealthy samples of an input dataset. We present an approach based on a network model of the input gene expression data, where there is a labelled graph for each sample. To the best of our knowledge, this is the first attempt to build a different graph for each sample and, then, to have a database of graphs for representing a sample set. Our main goal is that of singling out interesting differences between healthy and unhealthy samples, through the extraction of "discriminative patterns" among graphs belonging to the two different sample sets. Differently from the other…
Contributions from ADBIS 2018 workshops
The ADBIS conferences provide an international forum for the presentation of research on database theory, development of advanced DBMS technologies, and their applications. The 22nd edition of ADBIS, held on September 2–5, 2018, in Budapest, Hungary, includes six thematic workshops collecting contributions from various domains representing new trends in the broad research areas of databases and information systems.
Problems and Techniques
When biological networks are considered, the extraction of interesting knowledge often involves subgraph isomorphism checks, a problem known to be NP-complete. For this reason, many approaches try to simplify the problem under consideration by considering structures simpler than graphs, such as trees or paths. Furthermore, the number of existing approximate techniques is notably greater than the number of exact methods. In this chapter, we provide an overview of three important problems defined on biological networks: network alignment, network clustering, and motif extraction from biological networks. For each of these problems, we also describe some of the most important techniques proposed…
IP6K gene identification in plant genomes by tag searching
Background: Plants have played a special role in inositol polyphosphate (IP) research, since the first IP, the fully phosphorylated inositol ring of phytic acid (IP6), was discovered in plant seeds. It is now known that phytic acid is further metabolized by the IP6 Kinases (IP6Ks) to generate IPs containing a pyrophosphate moiety. The IP6Ks are evolutionarily conserved enzymes identified in several mammalian, fungal and amoebal species. Although IP6K has not yet been identified in plant chromosomes, there are many clues suggesting its presence in plant cells. Results: In this paper we propose a new approach to search for the plant IP6K gene, which led to the identification in plant genome…
Automatic simulation of RNA editing in plants for the identification of novel putative Open Reading Frames
In plant mitochondria an essential mechanism for gene expression is RNA editing, often influencing the synthesis of functional proteins. RNA editing alters the linearity of genetic information transfer, introducing differences between RNAs and their coding DNA sequences that hinder both experimental and computational research of genes. Thus, common software tools for gene search, successfully exploited to find canonical genes, can often fail to discover genes encrypted in the genome of plants. In this work we propose a novel strategy useful to intercept candidate coding sequences resulting from some possible editing substitutions on the start and stop codons of a given input organism DNA. O…
Discriminative pattern discovery for the characterization of different network populations
Motivation: An interesting problem is to study how gene co-expression varies in two different populations, associated with healthy and unhealthy individuals, respectively. To this aim, two important aspects should be taken into account: (i) in some cases, pairs/groups of genes show collaborative attitudes, emerging in the study of disorders and diseases; (ii) information coming from each single individual may be crucial to capture specific details, at the basis of complex cellular mechanisms; therefore, it is important to avoid missing potentially powerful information associated with the single samples. Results: Here, a novel approach is proposed, such that two different input popul…
Discovering new proteins in plant mitochondria by RNA editing simulation
In plant mitochondria an essential mechanism for gene expression is RNA editing, often influencing the synthesis of functional proteins. RNA editing alters the linearity of genetic information transfer. Indeed, it causes differences between RNAs and their coding DNA sequences that hinder both experimental and computational research of genes. Therefore, common software tools for gene search, successfully applied to find canonical genes, often fail to discover genes encrypted in the genome of plants. Here we propose a novel strategy useful to identify candidate coding sequences resulting from possible editing substitutions. In particular, we consider c→u substitutions leading to the creation…