Search results for "Suffix"

showing 10 items of 75 documents

Acceleration of short and long DNA read mapping without loss of accuracy using suffix array

2014

HPG Aligner applies suffix arrays for DNA read mapping. This implementation produces a highly sensitive and extremely fast mapping of DNA reads that scales up almost linearly with read length. The approach presented here is faster (over 20 for long reads) and more sensitive (over 98% in a wide range of read lengths) than the current state-of-the-art mappers. HPG Aligner is not only an optimal alternative for current sequencers but also the only solution available to cope with longer reads and growing throughputs produced by forthcoming sequencing technologies.

Statistics and ProbabilityComputer scienceSequence analysisSequence alignmentdatabase searchescomputer.software_genreBiochemistrylaw.inventionAccelerationchemistry.chemical_compoundlawCIENCIAS DE LA COMPUTACION E INTELIGENCIA ARTIFICIALAnimalsHumansMolecular BiologyDatabasesequencing dataSuffix arraySequence analysisHigh-Throughput Nucleotide SequencingalignmentSequence Analysis DNAApplications NotesComputer Science ApplicationsComputational MathematicsComputational Theory and MathematicschemistryDrosophilaSuffixSequence AlignmentcomputerAlgorithmAlgorithmsSoftwareDNA
researchProduct

Parallel Construction and Query of Index Data Structures for Pattern Matching on Square Matrices

1999

AbstractWe describe fast parallel algorithms for building index data structures that can be used to gather various statistics on square matrices. The main data structure is the Lsuffix tree, which is a generalization of the classical suffix tree for strings. Given ann×ntext matrixA, we build our data structures inO(logn) time withn2processors on a CRCW PRAM, so that we can quickly processAin parallel as follows: (i) report some statistical information aboutA, e.g., find the largest repeated square submatrices that appear at least twice inAor determine, for each position inA, the smallest submatrix that occurs only there; (ii) given, on-line, anm×mpattern matrixPAT, check whether it occurs i…

Statistics and ProbabilityNumerical AnalysisControl and OptimizationAlgebra and Number TheoryApplied MathematicsGeneral MathematicsSuffix treeParallel algorithmData structureSquare matrixSquare (algebra)law.inventionTree (data structure)lawPattern matchingAlgorithmMathematicsData compressionJournal of Complexity
researchProduct

Variable Length Memory Chains: Characterization of stationary probability measures

2021

Variable Length Memory Chains (VLMC), which are generalizations of finite order Markov chains, turn out to be an essential tool to modelize random sequences in many domains, as well as an interesting object in contemporary probability theory. The question of the existence of stationary probability measures leads us to introduce a key combinatorial structure for words produced by a VLMC: the Longest Internal Suffix. This notion allows us to state a necessary and sufficient condition for a general VLMC to admit a unique invariant probability measure. This condition turns out to get a much simpler form for a subclass of VLMC: the stable VLMC. This natural subclass, unlike the general case, enj…

Statistics and ProbabilityPure mathematicsLongest Internal SuffixStationary distributionMarkov chain60J05 60C05 60G10Probability (math.PR)010102 general mathematics01 natural sciencesMeasure (mathematics)Variable Length Memory Chains010104 statistics & probabilityProbability theoryConvergence of random variablesFOS: MathematicsCountable setState spaceRenewal theory[MATH]Mathematics [math]0101 mathematicsstable context treessemi-Markov chainsMathematics - Probabilitystationary probability measureMathematicsBernoulli
researchProduct

A loop-free two-close Gray-code algorithm for listing k-ary Dyck words

2006

AbstractP. Chase and F. Ruskey each published a Gray code for length n binary strings with m occurrences of 1, coding m-combinations of n objects, which is two-close—that is, in passing from one binary string to its successor a single 1 exchanges positions with a 0 which is either adjacent to the 1 or separated from it by a single 0. If we impose the restriction that any suffix of a string contains at least k−1 times as many 0's as 1's, we obtain k-suffixes: suffixes of k-ary Dyck words. Combinations are retrieved as special case by setting k=1 and k-ary Dyck words are retrieved as a special case by imposing the additional condition that the entire string has exactly k−1 times as many 0's a…

Theoretical Computer ScienceCombinatoricsGray codeComputational Theory and MathematicsDiscrete Mathematics and CombinatoricsTwo-closeBinary stringsSpecial caseSuffixk-ary Dyck wordsGray codeLoop-free algorithmAlgorithmMathematicsCoding (social sciences)Journal of Discrete Algorithms
researchProduct

Boosting Textual Compression in Optimal Linear Time

2005

We provide a general boosting technique for Textual Data Compression. Qualitatively, it takes a good compression algorithm and turns it into an algorithm with a better compression performance guarantee. It displays the following remarkable properties: (a) it can turn any memoryless compressor into a compression algorithm that uses the “best possible” contexts; (b) it is very simple and optimal in terms of time; and (c) it admits a decompression algorithm again optimal in time. To the best of our knowledge, this is the first boosting technique displaying these properties.Technically, our boosting technique builds upon three main ingredients: the Burrows--Wheeler Transform, the Suffix Tree d…

Theoretical computer scienceBurrows–Wheeler transformSuffix treeString (computer science)Data_CODINGANDINFORMATIONTHEORYBurrows-Wheeler transformSubstringArithmetic codinglaw.inventionLempel-Ziv compressorsArtificial IntelligenceHardware and ArchitectureControl and Systems Engineeringlawtext compressionempirical entropyArithmetic codingGreedy algorithmTime complexityAlgorithmSoftwareInformation SystemsMathematicsData compression
researchProduct

Algorithmics for the Life Sciences

2013

The life sciences, in particular molecular biology and medicine, have wit- nessed fundamental progress since the discovery of the “the Double Helix”. A rele- vant part of such an incredible advancement in knowledge has been possible thanks to synergies with the mathematical sciences, on the one hand, and computer science, on the other. Here we review some of the most relevant aspects of this cooperation focusing on contributions given by the design, analysis and engineering of fast al- gorithms for the life sciences.

Theoretical computer scienceSettore INF/01 - InformaticaKolmogorov complexityMathematical scienceslawComputer scienceSuffix treeAlgorithmicsDesign and Analysis of Algorithms Bioinformaticslaw.invention
researchProduct

Multi-Dimensional motivic pattern extraction founded on adaptive redundancy filtering

2005

Abstract We present a computational model for discovering repeated patterns in symbolic representations of monodic music. Patterns are discovered through an incremental adaptive identification along a multi-dimensional parametric space. The difficulties of pattern discovery mainly come from combinatorial redundancies, that our model is able to control efficiently. A specificity relation is defined between pattern descriptions, unifying suffix and inclusion relations and enabling a filtering of redundant descriptions. Combinatorial proliferation caused by successive repetitions of patterns is managed using cyclic patterns. The modelling of these redundancy control mechanisms enables an autom…

Theoretical computer scienceVisual Arts and Performing ArtsRelation (database)Space (commercial competition)050105 experimental psychology060404 music[INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI][INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing[STAT.ML]Statistics [stat]/Machine Learning [stat.ML][INFO.INFO-FL]Computer Science [cs]/Formal Languages and Automata Theory [cs.FL]Redundancy (engineering)0501 psychology and cognitive sciencesControl (linguistics)MathematicsParametric statistics[INFO.INFO-PL]Computer Science [cs]/Programming Languages [cs.PL][SHS.MUSIQ]Humanities and Social Sciences/Musicology and performing artsbusiness.industry05 social sciences06 humanities and the artsAutomation[INFO.INFO-SD]Computer Science [cs]/Sound [cs.SD]Multi dimensionalNASuffixbusiness0604 artsMusic
researchProduct

$O(n^2 log n)$ Time On-line Construction of Two-Dimensional Suffix Trees

2007

The two-dimensional suffix tree of an n × n square matrix A is a compacted trie that represents all square submatrices of A [11]. For the off-line case, i.e., A is given in advance to the algorithm, it is known how to build it in optimal time, for any type of alphabet size [11], [18]. Motivated by applications in Image Compression [22], Giancarlo and Guaiana [14] considered the on-line version of the two-dimensional suffix tree and presented an O(n2 log2 n)-time algorithm, which we refer to as GG. That algorithm is a nontrivial generalization of Ukkonen’s on-line algorithm for standard suffix trees [23]. The main contribution in this paper is an O(logn) factor improvement in the time comple…

Two-dimensional suffix tree On-line algorithm Index data structures.
researchProduct

Los procesos de derivación locucional en el continuum discursivo de la literatua medieval de castigos

2016

Las características formales y estilísticas del género sapiencial se antojan fundamentales para el estudio histórico de la fraseología. Por su brevedad y concisión, por su capacidad de adaptarse lingu?ísticamente a tradiciones literarias que nacen y se desarrollan en contextos culturales muy diversos, las colecciones de sentencias medievales que tuvieron su eclosión en el siglo XIII y que llegaron hasta el siglo XV como consecuencia de un continuum discursivo, se caracterizaron por manifestar un uso locucional bastante nutrido. Esta investigación tiene como principal objetivo caracterizar todas las locuciones prepositivas cuyo núcleo se vio sometido a un proceso de derivación léxica, lo cua…

UNESCO::CIENCIAS DE LAS ARTES Y LAS LETRASHistorical PhraseologyPrepositional LocutionsDiscursive Traditionslcsh:Literature (General):CIENCIAS DE LAS ARTES Y LAS LETRAS [UNESCO]Derivational Suffixes.Wisdom Literaturelcsh:PN1-6790
researchProduct

Le suffixe diminutif: un marqueur d'appropriation du signifiant.

2010

International audience

[ SHS ] Humanities and Social Scienceslinguistique hispanique[SHS] Humanities and Social Sciencessuffixe diminutifsuffixation quantitativeComputingMilieux_MISCELLANEOUS[SHS]Humanities and Social Sciences
researchProduct