Search results for "COMPUTATION"
Showing 10 of 1478 documents
Alignment-free sequence comparison using absent words
2018
Sequence comparison is a prerequisite to virtually all comparative genomic analyses. It is often realised by sequence alignment techniques, which are computationally expensive. This has led to increased research into alignment-free techniques, which are based on measures referring to the composition of sequences in terms of their constituent patterns. These measures, such as q-gram distance, are usually computed in time linear with respect to the length of the sequences. In this paper, we focus on the complementary idea: how two sequences can be efficiently compared based on information that does not occur in the sequences. A word is an absent word of some sequence if it does not oc…
Measuring the clustering effect of BWT via RLE
2017
The Burrows–Wheeler Transform (BWT) is a reversible transformation on which several text compressors and many other tools used in Bioinformatics and Computational Biology are based. The BWT is not actually a compressor, but a transformation that performs a context-dependent permutation of the letters of the input text that often creates runs of equal letters (clusters) longer than the ones in the original text, usually referred to as the “clustering effect” of BWT. In particular, from a combinatorial point of view, great attention has been given to the case in which the BWT produces the fewest clusters (cf. [5], [16], [21], [23]). In this paper we are concerned about t…
Coupling News Sentiment with Web Browsing Data Improves Prediction of Intra-Day Price Dynamics
2015
The new digital revolution of big data is deeply changing our capability of understanding society and forecasting the outcome of many social and economic systems. Unfortunately, information can be very heterogeneous in the importance, relevance, and surprise it conveys, affecting severely the predictive power of semantic and statistical methods. Here we show that the aggregation of web users' behavior can be elicited to overcome this problem in a hard to predict complex system, namely the financial market. Specifically, our in-sample analysis shows that the combined use of sentiment analysis of news and browsing activity of users of Yahoo! Finance greatly helps forecasting intra-day and dai…
Linear-time sequence comparison using minimal absent words & applications
2016
Sequence comparison is a prerequisite to virtually all comparative genomic analyses. It is often realized by sequence alignment techniques, which are computationally expensive. This has led to increased research into alignment-free techniques, which are based on measures referring to the composition of sequences in terms of their constituent patterns. These measures, such as q-gram distance, are usually computed in time linear with respect to the length of the sequences. In this article, we focus on the complementary idea: how two sequences can be efficiently compared based on information that does not occur in the sequences. A word is an absent word of some sequence if it does not occur in…
A clustering package for nucleotide sequences using Laplacian Eigenmaps and Gaussian Mixture Model
2018
In this article, a new Python package for nucleotide sequence clustering is proposed. This package, freely available online, implements a Laplacian eigenmap embedding and a Gaussian Mixture Model for DNA clustering. It takes nucleotide sequences as input, and produces the optimal number of clusters along with a relevant visualization. Although we did not optimise the computational speed, our method still performs reasonably well in practice. Our focus was mainly on data analytics and accuracy and, as a result, our approach outperforms the state of the art, even in the case of divergent sequences. Furthermore, a priori knowledge on the number of clust…
A stable brain from unstable components: Emerging concepts and implications for neural computation
2017
Neuroscientists have often described the adult brain in terms similar to an electronic circuit board, dependent on fixed, precise connectivity. However, with the advent of technologies allowing chronic measurements of neural structure and function, the emerging picture is that neural networks undergo significant remodeling over multiple timescales, even in the absence of experimenter-induced learning or sensory perturbation. Here, we attempt to reconcile the parallel observations that critical brain functions are stably maintained, while synapse- and single-cell properties appear to be reformatted regularly throughout adult life. In this review, we discuss experimental evidence at multiple …
Intermittent targeted therapies and stochastic evolution in patients affected by chronic myeloid leukemia
2016
Front line therapy for the treatment of patients affected by chronic myeloid leukemia (CML) is based on the administration of tyrosine kinase inhibitors, namely imatinib or, more recently, axitinib. Although imatinib is highly effective and represents an example of a successful molecular targeted therapy, the appearance of resistance is observed in a proportion of patients, especially those in advanced stages. In this work, we investigate the appearance of resistance in patients affected by CML, by modeling the evolutionary dynamics of cancerous cell populations in a simulated patient treated by an intermittent targeted therapy. We simulate, with the Monte Carlo method, the stochastic evolu…
An overview of recent molecular dynamics applications as medicinal chemistry tools for the undruggable site challenge
2018
Molecular dynamics (MD) has become increasingly popular due to the development of hardware and software solutions and the improvement in algorithms, which allowed researchers to scale up calculations in order to speed them up. MD simulations are usually used to address protein folding issues or protein-ligand complex stability through energy profile analysis over time. In recent years, the development of new tools able to deeply explore a potential energy surface (PES) has allowed researchers to focus on the dynamic nature of the binding recognition process and binding-induced protein conformational changes. Moreover, modern approaches have been demonstrated to be effective and reliable in …
Block Sorting-Based Transformations on Words: Beyond the Magic BWT
2018
The Burrows-Wheeler Transform (BWT) is a word transformation introduced in 1994 for Data Compression and later results have contributed to make it a fundamental tool for the design of self-indexing compressed data structures. The Alternating Burrows-Wheeler Transform (ABWT) is a more recent transformation, studied in the context of Combinatorics on Words, that works in a similar way, using an alternating lexicographical order instead of the usual one. In this paper we study a more general class of block sorting-based transformations. The transformations in this new class prove to be interesting combinatorial tools that offer new research perspectives. In particular, we show that all the tra…
Assessing statistical significance in multivariable genome wide association analysis
2016
Motivation: Although Genome-Wide Association Studies (GWAS) genotype a very large number of single nucleotide polymorphisms (SNPs), the data are often analyzed one SNP at a time. The low predictive power of single SNPs, coupled with the high significance threshold needed to correct for multiple testing, greatly decreases the power of GWAS. Results: We propose a procedure in which all the SNPs are analyzed in a multiple generalized linear model, and we show its use for extremely high-dimensional datasets. Our method yields P-values for assessing significance of single SNPs or groups of SNPs while controlling for all other SNPs and the family-wise error rate (FWER). Thus, our method tests whe…