Search results for " Detection"
showing 10 items of 1676 documents
CARE: context-aware sequencing read error correction.
2020
Motivation: Error correction is a fundamental pre-processing step in many Next-Generation Sequencing (NGS) pipelines, in particular for de novo genome assembly. However, existing error correction methods either suffer from high false-positive rates since they break reads into independent k-mers or do not scale efficiently to large amounts of sequencing reads and complex genomes. Results: We present CARE, an alignment-based scalable error correction algorithm for Illumina data using the concept of minhashing. Minhashing allows for efficient similarity search within large sequencing read collections, which enables fast computation of high-quality multiple alignments. Sequencing errors ar…
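The minhashing idea mentioned in this abstract can be illustrated with a minimal sketch. This is not CARE's implementation: the function names, the k-mer length, the number of hash functions, and the use of salted BLAKE2b as the hash family are all assumptions made for illustration.

```python
import hashlib

def kmers(read, k):
    """Return the set of all length-k substrings of a read."""
    return {read[i:i + k] for i in range(len(read) - k + 1)}

def minhash_signature(read, k=8, num_hashes=16):
    """Hypothetical helper: one minimum hash value per salted hash function."""
    kmer_set = kmers(read, k)
    if not kmer_set:
        raise ValueError("read is shorter than k")
    signature = []
    for seed in range(num_hashes):
        # Each seed gives a different salted hash, i.e. a different hash function.
        salt = seed.to_bytes(8, "little")
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(km.encode(), digest_size=8, salt=salt).digest(),
                "little",
            )
            for km in kmer_set
        ))
    return signature

def estimated_similarity(sig_a, sig_b):
    """Fraction of matching positions; estimates Jaccard similarity of the k-mer sets."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)
```

Reads whose signatures agree in many positions are likely to share many k-mers, so they can serve as candidates for alignment without comparing every pair of reads directly.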
Structure and evolution of a European Parliament via a network and correlation analysis
2016
We present a study of the network of relationships among elected members of the Finnish parliament, based on a quantitative analysis of initiative co-signatures, and its evolution over 16 years. To understand the structure of the parliament, we constructed a statistically validated network of members, based on the similarity between the patterns of initiatives they signed. We looked for communities within the network and characterized them in terms of members' attributes, such as electoral district and party. To gain insight into the nested structure of communities, we constructed a hierarchical tree of members from the correlation matrix. Afterwards, we studied parliament dynamics yearly, wi…
A simple comparative analysis of exact and approximate quantum error correction
2014
We present a comparative analysis of exact and approximate quantum error correction by means of simple unabridged analytical computations. For the sake of clarity, using primitive quantum codes, we study the exact and approximate error correction of the two simplest unital (Pauli errors) and nonunital (non-Pauli errors) noise models, respectively. The similarities and differences between the two scenarios are stressed. In addition, the performances of quantum codes quantified by means of the entanglement fidelity for different recovery schemes are taken into consideration in the approximate case. Finally, the role of self-complementarity in approximate quantum error correction is briefly ad…
Quantum Walk Search through Potential Barriers
2015
An ideal quantum walk transitions from one vertex to another with perfect fidelity, but in physical systems, the particle may be hindered by potential energy barriers. Then the particle has some amplitude of tunneling through the barriers, and some amplitude of staying put. We investigate the algorithmic consequence of such barriers for the quantum walk formulation of Grover's algorithm. We prove that the failure amplitude must scale as $O(1/\sqrt{N})$ for search to retain its quantum $O(\sqrt{N})$ runtime; otherwise, it searches in classical $O(N)$ time. Thus searching larger "databases" requires increasingly reliable hop operations or error correction. This condition holds for both discre…
Testing with a nuisance parameter present only under the alternative: a score-based approach with application to segmented modelling
2016
We introduce a score-type statistic to test for a non-zero regression coefficient when the relevant term involves a nuisance parameter present only under the alternative. Despite the non-regularity and complexity of the problem and unlike the previous approaches, the proposed test statistic does not require the nuisance to be estimated. It is simple to implement by relying on the conventional distributions, such as Normal or t, and it is justified in the setting of probabilistic coherence. We focus on testing for the existence of a breakpoint in segmented regression, and illustrate the methodology with an analysis on data of DNA copy number aberrations and gene expression profiles from…
kmcEx: memory-frugal and retrieval-efficient encoding of counted k-mers.
2018
Motivation: K-mers along with their frequency have served as an elementary building block for error correction, repeat detection, multiple sequence alignment, genome assembly, etc., attracting intensive studies in k-mer counting. However, the output of k-mer counters itself is large; very often, it is too large to fit into main memory, leading to highly narrowed usability. Results: We introduce a novel idea of encoding k-mers as well as their frequency, achieving good memory saving and retrieval efficiency. Specifically, we propose a Bloom filter-like data structure to encode counted k-mers by coupled-bit arrays, one for k-mer representation and the other for frequency encoding. Exper…
Bayesian measures of surprise for outlier detection
2003
From a Bayesian point of view, testing whether an observation is an outlier is usually reduced to a testing problem concerning a parameter of a contaminating distribution. This requires elicitation of both (i) the contaminating distribution that generates the outlier and (ii) prior distributions on its parameters. However, very little information is typically available about how the possible outlier could have been generated. Thus easy, preliminary checks in which these assessments can often be avoided may prove useful. Several such measures of surprise are derived for outlier detection in normal models. Results are applied to several examples. Default Bayes factors, where the contaminating…
Outlier detection with automatic modelling: TRAMO/SEATS versus X-12-ARIMA
2012
Efficient change point detection in genomic sequences of continuous measurements
2010
Motivation: Knowing the exact locations of multiple change points in genomic sequences serves several biological needs, for instance when data represent aCGH profiles and it is of interest to identify possibly damaged genes involved in cancer and other diseases. Only a few of the currently available methods deal explicitly with estimation of the number and location of change points, and moreover these methods may be somewhat vulnerable to deviations from the model assumptions usually employed. Results: We present a computationally efficient method to obtain estimates of the number and location of the change points. The method is based on a simple transformation of data and it provides re…
A multi-scale approach for testing and detecting peaks in time series
2020
An approach is presented that combines a statistical test for peak detection with the estimation of peak positions in time series. Motivated by empirical observations in neuronal recordings, we aim at investigating peaks of different heights and widths. We use a moving window approach to compare the differences of estimated slope coefficients of local regression models. We combine multiple windows and use the global maximum of all different processes as a test statistic. After rejection, a multiple filter algorithm combines peak positions estimated from multiple windows. Analysing neuronal activity recorded in anaesthetized mice, the procedure could identify significant differences between …
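The moving-window comparison of local slopes described in this abstract can be sketched as follows. This is an illustrative simplification and not the authors' test statistic: it uses a single window size, an unscaled slope difference, and hypothetical function names.

```python
def ols_slope(values):
    """Least-squares slope of values regressed on their indices 0..n-1."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den

def slope_differences(series, h):
    """Left-window slope minus right-window slope at each admissible time t.

    Positions without two full windows of length h are left as None.
    A rising left window followed by a falling right window (a peak)
    gives a large positive value.
    """
    if h < 2:
        raise ValueError("window length h must be at least 2")
    out = [None] * len(series)
    for t in range(h, len(series) - h + 1):
        out[t] = ols_slope(series[t - h:t]) - ols_slope(series[t:t + h])
    return out

# Toy triangular series with its maximum at index 10.
series = list(range(11)) + list(range(9, -1, -1))
diffs = slope_differences(series, h=5)
```

In this toy series the difference is largest around the apex, where the left window rises and the right window falls. Combining several window sizes, as the abstract describes, would require repeating the computation for each `h` and is not shown here.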