Search results for " BioInformatics."
showing 10 items of 65 documents
"A KNOWLEDGE-BASED EXPERT SYSTEM IN BIOINFORMATICS: AN APPLICATION TO REVERSE ENGINEERING GENE REGULATORY NETWORK"
2011
The huge amount of biological data has spurred the development of many bioinformatics tools, databases and web services. There is no single way to tackle a computational biology problem: different methodologies and strategies, each with its own pros and cons, can be applied. In this PhD thesis I present a knowledge-based expert system that aims to help a bioinformatics researcher choose the proper strategy and heuristic to resolve a bioinformatics issue. The Knowledge Base (KB) of the system is structured by means of an ontology and encodes the expertise about the application domain. The KB is organized into decision-making modules that introduce a set of metareaso…
Normalised compression distance and evolutionary distance of genomic sequences: comparison of clustering results
2009
Genomic sequences are usually compared using evolutionary distance, a procedure that requires aligning the sequences. Aligning long sequences is time-consuming, and the resulting dissimilarity is not a metric. Recently, the normalised compression distance was introduced as a method to calculate the distance between two generic digital objects, and it seems a suitable way to compare genomic strings. In this paper, the clustering and the non-linear mapping obtained using the evolutionary distance and the compression distance are compared, to determine whether the two sets of distances are similar.
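The normalised compression distance mentioned in this abstract is commonly defined as NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C(s) is the compressed size of s. The sketch below is an illustration only: it assumes zlib as the compressor, which is a choice made here and is not stated in the abstract, and the function names are hypothetical.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of `data` after zlib compression (assumed compressor)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalised compression distance between two byte strings.

    Values near 0 suggest the inputs share most of their structure;
    values near 1 suggest they share very little.
    """
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)  # compress the concatenation
    return (cxy - min(cx, cy)) / max(cx, cy)
```

Because no alignment is involved, a full pairwise distance matrix for clustering can be built by calling `ncd` on every pair of sequences. Real compressors only approximate the ideal behaviour, so the result is not guaranteed to be exactly symmetric or to be exactly zero for identical inputs.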
ValWorkBench: an open source Java library for cluster validation, with applications to microarray data analysis.
2015
Background: Cluster analysis is one of the most well known activities in scientific investigation and the object of research in many disciplines, ranging from statistics to computer science. It is central to the life sciences due to the advent of high throughput technologies, e.g., classification of tumors. In particular, in cluster analysis, it is of relevance to assess cluster quality and to predict the number of clusters in a dataset, if any. This latter task is usually performed via internal validation measures. Despite their potentially important role, both the use of classic internal validation measures and the design of new ones, specific for microarray data, do not seem to have grea…
ballaxy: web services for structural bioinformatics.
2014
Abstract Motivation: Web-based workflow systems have gained considerable momentum in sequence-oriented bioinformatics. In structural bioinformatics, however, such systems are still relatively rare; while commercial stand-alone workflow applications are common in the pharmaceutical industry, academic researchers often still rely on command-line scripting to glue individual tools together. Results: In this work, we address the problem of building a web-based system for workflows in structural bioinformatics. For the underlying molecular modelling engine, we opted for the BALL framework because of its extensive and well-tested functionality in the field of structural bioinformatics. The large …
The Power of Word-Frequency Based Alignment-Free Functions: a Comprehensive Large-Scale Experimental Analysis
2021
Abstract Motivation: Alignment-free (AF) distance/similarity functions are a key tool for sequence analysis. Experimental studies on real datasets abound and, to some extent, there are also studies regarding their control of false positive rate (Type I error). However, assessment of their power, i.e. their ability to identify true similarity, has been limited to some members of the D2 family. The corresponding experimental studies have concentrated on short sequences, a scenario no longer adequate for current applications, where sequence lengths may vary considerably. Such a State of the Art is methodologically problematic, since information regarding a key feature such as power is either mi…
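As a rough illustration of a word-frequency based alignment-free function, the basic D2 statistic is the inner product of the k-mer count vectors of two sequences. The sketch below shows only that basic statistic, not the variants or the power analysis studied in the paper; the function names are hypothetical.

```python
from collections import Counter


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping word of length k in `seq`."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(x: str, y: str, k: int) -> int:
    """Basic D2 statistic: sum over shared k-mers of count_x * count_y."""
    cx = kmer_counts(x, k)
    cy = kmer_counts(y, k)
    # Only k-mers present in both sequences contribute a non-zero term.
    return sum(cx[w] * cy[w] for w in cx.keys() & cy.keys())
```

A larger value indicates more shared words. The raw statistic grows with sequence length, which is one reason normalised variants exist.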
SKINK: a web server for string kernel based kink prediction in α-helices
2014
Abstract Motivation: The reasons for distortions from optimal α-helical geometry are widely unknown, but their influences on structural changes of proteins are significant. Hence, their prediction is a crucial problem in structural bioinformatics. Here, we present a new web server, called SKINK, for string kernel based kink prediction. Extending our previous study, we also annotate the most probable kink position in a given α-helix sequence. Availability and implementation: The SKINK web server is freely accessible at http://biows-inf.zdv.uni-mainz.de/skink. Moreover, SKINK is a module of the BALL software, also freely available at www.ballview.org. Contact: benny.kneissl@roche.com
kmcEx: memory-frugal and retrieval-efficient encoding of counted k-mers.
2018
Abstract Motivation: K-mers along with their frequency have served as an elementary building block for error correction, repeat detection, multiple sequence alignment, genome assembly, etc., attracting intensive studies in k-mer counting. However, the output of k-mer counters itself is large; very often, it is too large to fit into main memory, leading to highly narrowed usability. Results: We introduce a novel idea of encoding k-mers as well as their frequency, achieving good memory saving and retrieval efficiency. Specifically, we propose a Bloom filter-like data structure to encode counted k-mers by coupled-bit arrays, one for k-mer representation and the other for frequency encoding. Exper…
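For readers unfamiliar with the data structure this abstract builds on, the sketch below shows a plain Bloom filter used for k-mer membership. It is a generic textbook structure, not the coupled-bit-array encoding of kmcEx, and it stores no frequencies; the class name, hashing scheme and parameters are assumptions made for illustration.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    """Minimal Bloom filter: no false negatives, possible false positives."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str) -> Iterator[int]:
        # Derive several hash functions by salting BLAKE2b with an index.
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(
                item.encode(), digest_size=8, salt=seed.to_bytes(8, "little")
            ).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def kmers(seq: str, k: int) -> Iterator[str]:
    """Yield every overlapping k-mer of `seq`."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]
```

Every inserted k-mer is always reported as present, while a k-mer that was never inserted may occasionally be reported as present too. The memory cost is fixed by `num_bits` regardless of how many k-mers are added.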
Gradation of Fuzzy Preconcept Lattices
2021
Noting certain limitations of concept lattices in the fuzzy context, especially in view of their practical applications, in this paper we propose a more general approach based on what we call graded fuzzy preconcept lattices. We believe that this approach is more adequate for dealing with fuzzy information than the one based on fuzzy concept lattices. We consider two possible gradation methods for a fuzzy preconcept lattice: an inner one, called D-gradation, and an outer one, called M-gradation. We study their properties and illustrate them with a series of examples, in particular ones of a practical nature.
Algorithmics for the Life Sciences
2013
The life sciences, in particular molecular biology and medicine, have witnessed fundamental progress since the discovery of the “Double Helix”. A relevant part of such an incredible advancement in knowledge has been possible thanks to synergies with the mathematical sciences, on the one hand, and computer science, on the other. Here we review some of the most relevant aspects of this cooperation, focusing on contributions given by the design, analysis and engineering of fast algorithms for the life sciences.
Peptide classification using optimal and information theoretic syntactic modeling
2010
Accepted version of an article published in the journal: Pattern Recognition. Published version available on Sciverse: http://dx.doi.org/10.1016/j.patcog.2010.05.022 We consider the problem of classifying peptides using the information residing in their syntactic representations. This problem, which has been studied for more than a decade, has typically been investigated using distance-based metrics that involve the edit operations required in the peptide comparisons. In this paper, we shall demonstrate that the Optimal and Information Theoretic (OIT) model of Oommen and Kashyap [22], applicable to syntactic pattern recognition, can be used to tackle the peptide classification problem. We advoca…