AUTHOR
Luigi Palopoli
PROTEIN SECONDARY STRUCTURE PREDICTION: HOW TO IMPROVE ACCURACY BY INTEGRATION
This paper proposes a technique to improve protein secondary structure prediction. The approach combines the results of a set of prediction tools, choosing the most accurate parts of each prediction. The correctness of the resulting prediction is measured using the accuracy parameters adopted in several editions of CASP. Experimental evaluations validating the proposed approach are also reported.
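The idea of combining several predictions and scoring the result can be illustrated with a minimal sketch. It assumes that each predictor returns a string over the three states H (helix), E (strand) and C (coil), that the combination is a simple per-residue majority vote, and that accuracy is measured with Q3, the percentage of correctly predicted residues. The function names `majority_vote` and `q3` are hypothetical, and the voting rule is a stand-in for illustration, not the selection strategy of the paper.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine equal-length secondary structure strings position by position.

    Assumption: a plain majority vote; the paper's own rule for choosing
    the best parts of each prediction is not reproduced here.
    """
    if len({len(p) for p in predictions}) != 1:
        raise ValueError("all predictions must have the same length")
    combined = []
    for column in zip(*predictions):
        # most_common orders ties by first occurrence, so earlier predictors win ties
        combined.append(Counter(column).most_common(1)[0][0])
    return "".join(combined)


def q3(predicted, observed):
    """Percentage of residues whose predicted state matches the observed one."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Toy data, invented for illustration only
predictions = ["HHHECC", "HHEECC", "CHHECC"]
observed = "HHHECC"
consensus = majority_vote(predictions)
print(consensus, q3(consensus, observed))
```

In this toy case each individual prediction contains at most one error, and the errors fall on different residues, so the vote recovers the observed structure.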
Assistive robotic walker parameter identification for estimation of human thrust without force sensors
In this paper we propose a parameter identification procedure for the wheel-motor dynamics of a robotic walker, i.e., a commercial trolley for elderly people endowed with cognitive, sensing and guidance capabilities. The objective of the wheel-motor dynamic model is to generate a suitable time reference to be used in an estimation algorithm. The ultimate goal of the estimation algorithm is to retrieve the thrust, i.e., torque and force, that the older adult user of the robotic walker applies to the platform. These quantities are of paramount importance for adopting intelligent and comfortable walker guidance algorithms. The novelty of this approach is the avoidance of additional costly se…
JSSPrediction: a Framework to Predict Protein Secondary Structures Using Integration
Identifying protein secondary structures is a difficult task. Recently, many software tools for protein secondary structure prediction have been produced and made available on-line, mostly with good performance. However, each prediction tool works well only for certain families of proteins, so users have to know which predictor to use for a given unknown protein. We propose a framework to improve secondary structure prediction by integrating results obtained from a set of available predictors. Our contribution consists in the definition of a two-phase approach: (i) select a set of predictors which perform well on the unknown protein family, and (ii) integrate the prediction resul…
Flexible pattern discovery with (extended) disjunctive logic programming
The post-genomic era has brought up a wide range of challenging new issues for the areas of knowledge discovery and intelligent information management. Among them, the discovery of complex pattern repetitions in string databases plays an important role, specifically in those contexts where even which pattern classes are to be considered interesting is unknown. This paper provides a contribution in this precise setting, proposing a novel approach, based on disjunctive logic programming extended with several advanced features, for discovering interesting pattern classes from a given data set.
FEDRO: a software tool for the automatic discovery of candidate ORFs in plants with C→U RNA editing
RNA editing is an important mechanism for gene expression in plant organelles. It alters the direct transfer of genetic information from DNA to proteins, due to the introduction of differences between RNAs and the corresponding coding DNA sequences. Software tools that are successful in searching for genes in other organisms are not always able to perform this task correctly in plant organellar genomes. Moreover, the available software tools predicting RNA editing events utilise algorithms that do not account for events which may generate a novel start codon. We present Fedro, a Java software tool implementing a novel strategy to generate candidate Open Reading Frames (ORFs) resulting from Cytidi…
Protein-protein interaction network querying by a "focus and zoom" approach
We propose an approach to network querying in protein-protein interaction networks based on bipartite graph weighted matching. An algorithm is presented that first “focuses” on the potentially relevant portion of the target graph by performing a global alignment of the target graph with the query graph, and then “zooms” in on the actual matching nodes by considering their topological arrangement, thereby obtaining a (possibly) approximate occurrence of the query graph within the target graph. Approximation is related to node insertions, node deletions and edge deletions possibly intervening in the query graph. The technique manages networks of arbitrary topology. Moreover, edge labels are used to represe…
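As a rough illustration of bipartite weighted matching between query and target nodes, the sketch below assigns each query node to a distinct target node so that the total similarity is maximal. It assumes a precomputed similarity matrix and uses exhaustive search, which is only viable for very small inputs; a polynomial-time method such as the Hungarian algorithm would be used in practice. The function name `best_matching` and the scores are hypothetical, and the sketch does not reproduce the "focus" or "zoom" steps of the paper.

```python
from itertools import permutations


def best_matching(similarity):
    """Maximum-weight assignment of query nodes to distinct target nodes.

    similarity[i][j] is the (assumed, precomputed) score between query
    node i and target node j; there must be at least as many target
    nodes as query nodes.
    """
    n_query = len(similarity)
    n_target = len(similarity[0])
    if n_query > n_target:
        raise ValueError("need at least as many target nodes as query nodes")
    best_score, best_targets = float("-inf"), None
    # Try every injective mapping of query nodes onto target nodes
    for targets in permutations(range(n_target), n_query):
        score = sum(similarity[q][t] for q, t in enumerate(targets))
        if score > best_score:
            best_score, best_targets = score, targets
    return best_score, list(enumerate(best_targets))


# Toy scores for 2 query proteins against 3 target proteins (invented)
similarity = [
    [0.9, 0.1, 0.3],
    [0.2, 0.8, 0.7],
]
score, pairs = best_matching(similarity)
print(score, pairs)
```

Here query node 0 is paired with target node 0 and query node 1 with target node 1, leaving target node 2 unmatched.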
Algorithms for Graph and Network Analysis: Graph Alignment
In this article we discuss the problem of graph alignment, which has long been used for the purpose of analyzing and comparing biological networks. In particular, we describe different facets of graph alignment, according to the number of input networks, the fixed output objective, and the possible heterogeneity of input data. Accordingly, we will discuss pairwise and multiple alignment, global and local alignment, etc. Moreover, we provide a comprehensive overview of the algorithms and techniques proposed in the literature to solve each of the specific considered types of graph alignment. In order to make the material presented here complete and useful to guide the reader in the use o…
Protein Structure Metapredictors
Asymmetric Comparison and Querying of Biological Networks
Comparing and querying the protein-protein interaction (PPI) networks of different organisms is important to infer knowledge about conservation across species. Known methods that perform these tasks operate symmetrically, i.e., they do not assign a distinct role to the input PPI networks. However, in most cases, the input networks are indeed distinguishable on the basis of how well the corresponding organism is biologically characterized. In this paper a new idea is developed, that is, to exploit differences in the characterization of the organisms at hand in order to devise methods for comparing their PPI networks. We use the PPI network (called Master) of the best characterized organism as a …
New Trends in Graph Mining
Searching for repeated features characterizing biological data is fundamental in computational biology. When biological networks are under analysis, the presence of repeated modules across the same network (or several distinct ones) is shown to be very relevant. Indeed, several studies prove that biological networks can often be understood in terms of coalitions of basic repeated building blocks, often referred to as network motifs. This work provides a review of the main techniques proposed for motif extraction from biological networks. In particular, the main intrinsic difficulties related to the problem are pointed out, along with solutions proposed in the literature to overcome them. Open ch…
"Master-Slave" Biological Network Alignment
Performing global alignment between protein-protein interaction (PPI) networks of different organisms is important to infer knowledge about conservation across species. Known methods that perform this task operate symmetrically, that is to say, they do not assign a distinct role to the input PPI networks. However, in most cases, the input networks are indeed distinguishable on the basis of how well the corresponding organism is biologically characterized. For well-characterized organisms the associated PPI network is supposed to encode in a sound manner all the information about their proteins and associated interactions, which is far from being the case for poorly characterized ones. He…
A summary of genomic databases: overview and discussion
In the last few years both the amount of electronically stored biological data and the number of biological data repositories have grown significantly (more than eight hundred repositories can be counted today). In spite of the enormous amount of available resources, a user may be disoriented when searching for specific data. Thus, an accurate analysis of biological data and repositories turns out to be useful to obtain a systematic view of biological database structures, tools and contents and, eventually, to facilitate the access and recovery of such data. In this chapter, we propose an analysis of genomic databases, which are databases of fundamental importance for the research in bioinfo…
IP6K gene identification in plant genomes by tag searching
Background: Plants have played a special role in inositol polyphosphate (IP) research, since the first IP, the fully phosphorylated inositol ring of phytic acid (IP6), was discovered in plant seeds. It is now known that phytic acid is further metabolized by the IP6 kinases (IP6Ks) to generate IPs containing a pyrophosphate moiety. IP6Ks are evolutionarily conserved enzymes identified in several mammalian, fungal and amoeba species. Although IP6K has not yet been identified in plant chromosomes, there are many clues suggesting its presence in plant cells. Results: In this paper we propose a new approach to search for the plant IP6K gene, which led to the identification in plant genome…
Experimental Evaluation of Protein Secondary Structure Predictors
Understanding the biological function of a protein, which is largely determined by its 3D shape, is a key issue in modern biology. The 3D shape, in turn, is determined by the amino acid sequence of the protein. Since the direct inspection of such 3D structures is rather expensive and time consuming, a number of software techniques have been developed in the last few years that predict a spatial model, either of the secondary or of the tertiary form, for a given target protein starting from its amino acid sequence. This paper offers a comparison of several available automatic secondary structure prediction tools. The comparison is of the experimental kind, where two relevant sets of proteins, a non…
Automatic simulation of RNA editing in plants for the identification of novel putative Open Reading Frames
In plant mitochondria an essential mechanism for gene expression is RNA editing, often influencing the synthesis of functional proteins. RNA editing alters the linearity of genetic information transfer, introducing differences between RNAs and their coding DNA sequences that hinder both experimental and computational research of genes. Thus common software tools for gene search, successfully exploited to find canonical genes, often fail in discovering genes encrypted in the genome of plants. In this work we propose a novel strategy useful to identify candidate coding sequences resulting from some possible editing substitutions on the start and stop codons of a given input organism DNA. O…
Improving protein secondary structure predictions by prediction fusion
Protein secondary structure prediction is still a challenging problem today. Although a number of prediction methods have been presented in the literature, the various prediction tools that are available on-line produce results whose quality is not always fully satisfactory. Therefore, a user has to know which predictor to use for a given protein to be analyzed. In this paper, we propose a server implementing a method to improve the accuracy in protein secondary structure prediction. The method is based on integrating the prediction results computed by some available on-line prediction tools to obtain a combined prediction of higher quality. Given an input protein p whose secondary struct…
Discovering new proteins in plant mitochondria by RNA editing simulation
In plant mitochondria an essential mechanism for gene expression is RNA editing, often influencing the synthesis of functional proteins. RNA editing alters the linearity of genetic information transfer. Indeed, it causes differences between RNAs and their coding DNA sequences that hinder both experimental and computational research of genes. Therefore common software tools for gene search, successfully applied to find canonical genes, often fail in discovering genes encrypted in the genome of plants. Here we propose a novel strategy useful to identify candidate coding sequences resulting from possible editing substitutions. In particular, we consider C→U substitutions leading to the creation…
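A cytidine-to-uridine (C→U) substitution can turn the codon ACG into the start codon AUG. The minimal sketch below lists the positions in a DNA string where such an edit would create a start codon, and applies one edit (written as C→T at the DNA level). It scans the forward strand only and ignores reading frames and stop codons. The function names `candidate_start_sites` and `apply_edit` and the example sequence are hypothetical, and this is not the strategy implemented by the authors.

```python
def candidate_start_sites(dna):
    """Return 0-based positions of ACG triplets, which C->U editing could turn into a start codon.

    Assumption: forward strand only, all three reading frames pooled together.
    """
    dna = dna.upper()
    sites = []
    pos = dna.find("ACG")
    while pos != -1:
        sites.append(pos)
        pos = dna.find("ACG", pos + 1)
    return sites


def apply_edit(dna, pos):
    """Simulate the edit at an ACG site, writing the edited base as T (ACG -> ATG)."""
    dna = dna.upper()
    if dna[pos:pos + 3] != "ACG":
        raise ValueError("no ACG triplet at this position")
    return dna[:pos + 1] + "T" + dna[pos + 2:]


# Invented toy sequence with two editable sites
sequence = "GGACGTTTACGA"
sites = candidate_start_sites(sequence)
edited = apply_edit(sequence, sites[0])
print(sites, edited)
```

Each reported site could then be handed to an ordinary ORF finder on the edited sequence, which is one possible way to use such a simulation.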
Extracting similar sub-graphs across PPI Networks
Singling out conserved modules (corresponding to connected sub-graphs) throughout protein-protein interaction networks of different organisms is a major issue in bioinformatics because of its potential applications in biology. This paper presents a method to discover highly matching sub-graphs in such networks. Sub-graph extraction is carried out by taking into account, on the one hand, both protein sequence and network structure similarities and, on the other hand, both quantitative and reliability information possibly available about interactions. The method is conceived as a generalization of a known technique, able to discover functional orthologs in interaction networks. Some preliminar…