6533b852fe1ef96bd12aacc1
RESEARCH PRODUCT
Indexing a sequence for mapping reads with a single mismatch
Alessio LangiuAlessio LangiuM. Sohel RahmanMaxime CrochemoreMaxime Crochemoresubject
Computer sciencegenome sequenceGeneral Mathematics[INFO.INFO-DS]Computer Science [cs]/Data Structures and Algorithms [cs.DS]General Physics and AstronomyContext (language use)algorithmscomputer.software_genrePattern matchingSequenceSearch engine indexingGeneral EngineeringWildcard characterArticlescomputer.file_formatConstruct (python library)Data structuremapping readspattern matchingComputingMethodologies_DOCUMENTANDTEXTPROCESSINGData mining[INFO.INFO-BI]Computer Science [cs]/Bioinformatics [q-bio.QM]Focus (optics)mismatchcomputerAlgorithmindexingdescription
Mapping reads against a genome sequence is an interesting and useful problem in computational molecular biology and bioinformatics. In this paper, we focus on the problem of indexing a sequence for mapping reads with a single mismatch. We first focus on a simpler problem where the length of the pattern is given beforehand during the data structure construction. This version of the problem is interesting in its own right in the context of the next generation sequencing. In the sequel, we show how to solve the more general problem. In both cases, our algorithm can construct an efficient data structure in time and space and can answer subsequent queries in time. Here, n is the length of the sequence, m is the length of the read, 0< ε <1 and is the optimal output size.
year | journal | country | edition | language |
---|---|---|---|---|
2014-01-01 | Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences |