0000000000532434
AUTHOR
Simona Ester Rombo
Singling out functional similarities in graph databases
Basic Statistical Indices for SeqAn
Algorithms for Graph and Network Analysis: Clustering and Search of Motifs in Graphs
In this article we deal with problems that involve the analysis of topology in graphs modeling biological networks. In particular, we consider two important problems: (i) Network clustering, aiming at finding compact subgraphs inside the input graph in order to isolate molecular complexes, and (ii) searching for motifs, i.e., sub-structures repeated in the input network and presenting high significance (e.g., in terms of their frequency). We provide a compact overview of the main techniques proposed in the literature to solve these problems.
Traffic Reduction in P2P Systems: A Semantic Approach (original title: Riduzione del Traffico nei Sistemi P2P: un Approccio Semantico)
A recommendation system for the prediction of drug-target associations
In this chapter a recommendation system is presented, based on the integration of a Protein-Protein Interaction (PPI) network taken from the IntAct database and a set of associations between drugs and targets taken from the DrugBank database. Given an input drug, the system suggests new targets depending on how proteins are connected in the PPI network. The framework adopted for the implementation is Apache Spark, which is used to load, manage and manipulate data by means of Resilient Distributed Datasets (RDDs) and to run the Alternating Least Squares (ALS) machine learning algorithm, a matrix factorization algorithm for distributed and parallel computing. Finally, an a…
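The alternating update at the heart of ALS can be illustrated on a single machine. The following is a minimal sketch, not the chapter's Spark implementation: it assumes a rank-1 factorization (one latent factor per drug and per target), and the names `als_rank1`, `predict`, `reg` and the toy scores are hypothetical.

```python
from typing import Dict, List, Tuple

# (drug index, target index) -> observed association score (hypothetical data layout)
Ratings = Dict[Tuple[int, int], float]


def als_rank1(ratings: Ratings, n_drugs: int, n_targets: int,
              reg: float = 0.1, iterations: int = 50) -> Tuple[List[float], List[float]]:
    """Fit ratings[d, t] ~ u[d] * v[t] by alternating least squares (rank 1 only)."""
    u = [1.0] * n_drugs
    v = [1.0] * n_targets
    for _ in range(iterations):
        # Fix v and solve the regularised least-squares problem for each u[d].
        for d in range(n_drugs):
            num = sum(r * v[t] for (dd, t), r in ratings.items() if dd == d)
            den = reg + sum(v[t] ** 2 for (dd, t) in ratings if dd == d)
            u[d] = num / den if den > 0 else 0.0
        # Fix u and solve for each v[t] in the same way.
        for t in range(n_targets):
            num = sum(r * u[d] for (d, tt), r in ratings.items() if tt == t)
            den = reg + sum(u[d] ** 2 for (d, tt) in ratings if tt == t)
            v[t] = num / den if den > 0 else 0.0
    return u, v


def predict(u: List[float], v: List[float], drug: int, target: int) -> float:
    """Predicted score for a (drug, target) pair, observed or not."""
    return u[drug] * v[target]


if __name__ == "__main__":
    # Toy data: three observed pairs, the pair (1, 1) is held out.
    observed = {(0, 0): 1.0, (0, 1): 3.0, (1, 0): 2.0}
    u, v = als_rank1(observed, n_drugs=2, n_targets=2, reg=0.0, iterations=500)
    print(predict(u, v, 1, 1))
```

With `reg=0.0` the toy data are fit exactly by a rank-1 model, so the held-out prediction approaches the value implied by the three observed entries. A distributed version with a higher rank would delegate these per-row solves to a framework such as Spark.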
IP6K gene identification by tag search
Discovering Protein Complexes in Protein Interaction Networks
A technique to search functional similarities in PPI networks
We describe a method to search for similarities across protein-protein interaction networks of different organisms. The core of the technique consists of computing a maximum weight matching of the bipartite graphs that result from comparing the neighbourhoods of proteins belonging to different networks. Both quantitative and reliability information are exploited. We tested the method on the networks of S. cerevisiae, D. melanogaster and C. elegans. The experiments showed that the technique is able to detect functional orthologs when sequence similarity alone is not sufficient. They also demonstrated the capability of our approach in discovering common biological processes involving unc…
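The matching step mentioned above can be sketched as follows. This is an illustrative sketch only, not the authors' implementation: it assumes the neighbourhoods of two proteins have already been compared and summarised in a small matrix of hypothetical similarity scores, and it finds the maximum weight matching by exhaustive dynamic programming over subsets, which is practical only for small neighbourhoods.

```python
from functools import lru_cache
from typing import Dict, List, Tuple


def max_weight_matching(weights: List[List[float]]) -> Tuple[float, Dict[int, int]]:
    """Maximum weight bipartite matching by dynamic programming over bitmasks.

    weights[i][j] is the (hypothetical) similarity between neighbour i of a
    protein in one network and neighbour j of a protein in the other network.
    A weight of 0 means that the two neighbours cannot be paired.
    """
    n_left = len(weights)
    n_right = len(weights[0]) if weights else 0

    @lru_cache(maxsize=None)
    def best(i: int, used: int) -> Tuple[float, Tuple[Tuple[int, int], ...]]:
        # Best total weight for left nodes i..n_left-1, given the set of
        # right nodes already taken (encoded as the bits of `used`).
        if i == n_left:
            return 0.0, ()
        top_score, top_pairs = best(i + 1, used)  # leave node i unmatched
        for j in range(n_right):
            w = weights[i][j]
            if w > 0 and not used & (1 << j):
                score, pairs = best(i + 1, used | (1 << j))
                if w + score > top_score:
                    top_score, top_pairs = w + score, ((i, j),) + pairs
        return top_score, top_pairs

    score, pairs = best(0, 0)
    return score, dict(pairs)


if __name__ == "__main__":
    # Two neighbours on one side, three on the other (toy scores).
    similarity = [
        [0.75, 0.25, 0.0],
        [0.5, 0.5, 0.125],
    ]
    print(max_weight_matching(similarity))
```

For larger neighbourhoods a polynomial-time assignment algorithm, such as the Hungarian method, would be the usual choice. The total weight of the matching can then serve as a similarity score for the pair of proteins being compared.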
Efficient Derivation of Frequent Structured Patterns from Biological Databases (original title: Derivazione Efficiente di Pattern Strutturati Frequenti da Database di Natura Biologica)
Bi-GRAPPIN: Bipartite graph based protein-protein interaction networks similarity search
Restricted Neighborhood search clustering revisited: an evolutionary computation perspective
Discovering meaningful protein-protein interaction modules by a co-clustering based approach
A big data approach for sequences indexing on the cloud via Burrows-Wheeler transform
Indexing sequence data is important in the context of Precision Medicine, where large amounts of "omics" data have to be collected and analyzed daily in order to categorize patients and identify the most effective therapies. Here we propose an algorithm for the computation of the Burrows-Wheeler transform relying on Big Data technologies, i.e., Apache Spark and Hadoop. Our approach is the first that distributes the index computation and not only the input dataset, making it possible to benefit fully from the available cloud resources. Copyright © 2020 for this paper by its authors.
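For readers unfamiliar with the transform itself, the following is a minimal single-machine sketch of the Burrows-Wheeler transform and its inverse. It is the textbook sorted-rotations construction, not the distributed Spark/Hadoop algorithm proposed in the paper. The function names and the `$` sentinel are assumptions made for illustration.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Burrows-Wheeler transform via sorted rotations (quadratic space, small inputs only)."""
    if sentinel in text:
        raise ValueError("the sentinel must not occur in the input text")
    s = text + sentinel
    # Sort all cyclic rotations of the text and read off the last column.
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column: str, sentinel: str = "$") -> str:
    """Invert the transform by repeatedly prepending the last column and sorting."""
    table = [""] * len(last_column)
    for _ in range(len(last_column)):
        table = sorted(last_column[i] + table[i] for i in range(len(last_column)))
    # The original text is the rotation that ends with the sentinel.
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


if __name__ == "__main__":
    transformed = bwt("banana")
    print(transformed)
    print(inverse_bwt(transformed))
```

The sketch materialises every rotation, so its memory use grows quadratically with the input length. Practical indexes are instead built from suffix arrays or, as in the paper, by distributing the computation.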
Pattern Discovery In Biosequences: From Simple To Complex Patterns
Asymmetric Global Alignment of Protein-Protein Interaction Graph Databases
Genomic Databases Characteristics
IP6K gene identification in plant cells via tag discovery
Optimal extraction of motif patterns in 2D
The combinatorial explosion of motif patterns occurring in 1D and 2D arrays leads to the consideration of special classes of motifs growing linearly with the size of the input array. Such motifs, called irredundant motifs, are able to succinctly represent all of the other motifs occurring in the same array within reasonable time and space bounds. In previous work irredundant motifs were extracted from 2D arrays in $O(N^2 \log^2 n \log\log n)$ and $O(N^3)$ time, where N is the size of the 2D input array and n is its largest dimension. In this paper, we present an algorithm to extract irredundant motifs from 2D arrays that is quadratic in the size of the input. The input is defined on a binary al…
Discriminating Graph Pattern Mining from Gene Expression Data
We consider the problem of mining gene expression data in order to single out interesting features characterizing healthy/unhealthy samples of an input dataset. We present an approach based on a network model of the input gene expression data, where there is a labelled graph for each sample. To the best of our knowledge, this is the first attempt to build a different graph for each sample and, then, to have a database of graphs for representing a sample set. Our main goal is that of singling out interesting differences between healthy and unhealthy samples, through the extraction of "discriminative patterns" among graphs belonging to the two different sample sets. Differently from the other…
Foreword: Algorithms, Strings and Theoretical Approaches in the Big Data Era – Special Issue in Honor of the 60th Birthday of Professor Raffaele Giancarlo (Editorial)
Raffaele Giancarlo was born in 1957 in Salerno, Italy. He received his Laurea Degree in Computer Science from the University of Salerno in 1982. His Laurea thesis on combinatorial algorithms on words was supervised by Professor Alberto Apostolico. Some years later, in 1984, he was one of the few young researchers attending the Advanced Research Workshop on Combinatorial Algorithms on Words held at Maratea (Italy). In the same year, he won a public competition for an Assistant Professor position at the University of Salerno. He also decided to pursue graduate studies in the US. Raffaele Giancarlo obtained his Ph.D. in Computer Science from Columbia University in 1990, defending one of the first …