Search results for "sequencing data"

showing 6 items of 16 documents

No evidence of EMAST in whole genome sequencing data from 248 colorectal cancers.

2021

Microsatellite instability (MSI) is caused by defective DNA mismatch repair (MMR), and manifests as accumulation of small insertions and deletions (indels) in short tandem repeats of the genome. Another form of repeat instability, elevated microsatellite alterations at selected tetranucleotide repeats (EMAST), has been suggested to occur in 50% to 60% of colorectal cancers (CRCs), of which approximately one quarter are accounted for by MSI. Unlike for MSI, the criteria for defining EMAST are not consensual. EMAST CRCs have been suggested to form a distinct subset of CRCs that has been linked to a higher tumor stage, chronic inflammation, and poor prognosis. EMAST CRCs not exhibiting MSI have b…

Cancer Research; congenital hereditary and neonatal diseases and abnormalities; 3122 Cancers; colorectal cancer; suolistosyövät; Biology; mikrosatelliitit; medicine.disease_cause; Genome; DNA sequencing; EMAST; 03 medical and health sciences; 0302 clinical medicine; INDEL Mutation; Genetics; medicine; Humans; Genetic Testing; Indel; neoplasms; Genetics; Whole genome sequencing; next generation sequencing; Mutation; DNA-analyysi; Whole Genome Sequencing; 1184 Genetics developmental biology physiology; Microsatellite instability; medicine.disease; digestive system diseases; 3. Good health; 030220 oncology & carcinogenesis; genome sequencing data; Microsatellite; syöpätaudit; DNA mismatch repair; colorectal cancers; Colorectal Neoplasms; Microsatellite Repeats; Genes, chromosomes; cancer; REFERENCES

researchProduct

Acceleration of short and long DNA read mapping without loss of accuracy using suffix array

2014

HPG Aligner applies suffix arrays for DNA read mapping. This implementation produces a highly sensitive and extremely fast mapping of DNA reads that scales up almost linearly with read length. The approach presented here is faster (over 20× for long reads) and more sensitive (over 98% in a wide range of read lengths) than the current state-of-the-art mappers. HPG Aligner is not only an optimal alternative for current sequencers but also the only solution available to cope with longer reads and growing throughputs produced by forthcoming sequencing technologies.
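
To make the idea of suffix-array read mapping concrete, here is a minimal sketch of exact-match lookup over a suffix array. It is not HPG Aligner's implementation: the function names `build_suffix_array` and `find_occurrences` are hypothetical, the construction is deliberately naive, and real mappers also handle mismatches and indels, which this toy example does not.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes of `text`, sorted lexicographically."""
    # Naive construction (sorts full suffix strings); acceptable for a toy example only.
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, suffix_array, pattern):
    """Return all start positions where `pattern` occurs exactly in `text`."""
    m = len(pattern)

    # Binary search for the first suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    first = lo

    # Binary search for the first suffix whose m-character prefix is > pattern.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    # Every suffix in [first, lo) starts with the pattern.
    return sorted(suffix_array[first:lo])


reference = "ACGTACGTTACG"  # made-up reference sequence
sa = build_suffix_array(reference)
print(find_occurrences(reference, sa, "ACG"))  # [0, 4, 9]
```

Each lookup costs a number of string comparisons logarithmic in the reference length, which is why the index is attractive once the reference is fixed and many reads must be mapped.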

Statistics and Probability; Computer science; Sequence analysis; Sequence alignment; database searches; computer.software_genre; Biochemistry; law.invention; Acceleration; chemistry.chemical_compound; law; CIENCIAS DE LA COMPUTACION E INTELIGENCIA ARTIFICIAL; Animals; Humans; Molecular Biology; Database; sequencing data; Suffix array; Sequence analysis; High-Throughput Nucleotide Sequencing; alignment; Sequence Analysis DNA; Applications Notes; Computer Science Applications; Computational Mathematics; Computational Theory and Mathematics; chemistry; Drosophila; Suffix; Sequence Alignment; computer; Algorithm; Algorithms; Software; DNA

researchProduct

piRNAclusterDB 2.0: update and expansion of the piRNA cluster database.

2021

PIWI-interacting RNAs (piRNAs) and their partnering PIWI proteins defend the animal germline against transposable elements and play a crucial role in fertility. Numerous studies in the past have uncovered many additional functions of the piRNA pathway, including gene regulation, anti-viral defense, and somatic transposon repression. Further, comparative analyses across phylogenetic groups showed that the PIWI/piRNA system evolves rapidly and exhibits great evolutionary plasticity. However, the presence of so-called piRNA clusters as the major source of piRNAs is common to nearly all metazoan species. These genomic piRNA-producing loci are highly divergent across taxa and critically…

Transposable element; Small RNA; endocrine system; AcademicSubjects/SCI00010; Sequencing data; Piwi-interacting RNA; Datasets as Topic; Biology; computer.software_genre; Germline; Evolution Molecular; 03 medical and health sciences; 0302 clinical medicine; Databases Genetic; Genetics; Animals; Cluster Analysis; Humans; Database Issue; RNA Small Interfering; Phylogeny; 030304 developmental biology; Regulation of gene expression; 0303 health sciences; Internet; Genome; Phylogenetic tree; Database; urogenital system; Genetic Loci; Argonaute Proteins; DNA Transposable Elements; computer; 030217 neurology & neurosurgery; Software; Nucleic acids research

researchProduct

Lightweight LCP construction for next-generation sequencing datasets

2012

The advent of "next-generation" DNA sequencing (NGS) technologies has meant that collections of hundreds of millions of DNA sequences are now commonplace in bioinformatics. Knowing the longest common prefix array (LCP) of such a collection would facilitate the rapid computation of maximal exact matches, shortest unique substrings and shortest absent words. CPU-efficient algorithms for computing the LCP of a string have been described in the literature, but require the presence in RAM of large data structures. This prevents such methods from being feasible for NGS datasets. In this paper we propose the first lightweight method that simultaneously computes, via sequential scans, the LCP and B…
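
For readers unfamiliar with the LCP array, the sketch below computes it for a single short string with Kasai's algorithm. This is the conventional in-RAM approach, shown only to illustrate what an LCP array contains; it is not the lightweight sequential-scan method the abstract proposes, and the function names are hypothetical.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes of `text`, sorted lexicographically."""
    # Naive construction; fine for a toy example only.
    return sorted(range(len(text)), key=lambda i: text[i:])


def build_lcp_array(text, suffix_array):
    """Kasai's algorithm: lcp[k] is the length of the longest common prefix of the
    suffixes at suffix_array[k - 1] and suffix_array[k]; lcp[0] is 0 by convention."""
    n = len(text)
    rank = [0] * n
    for k, start in enumerate(suffix_array):
        rank[start] = k

    lcp = [0] * n
    h = 0  # current match length, carried over between consecutive text positions
    for i in range(n):
        if rank[i] == 0:
            h = 0
            continue
        j = suffix_array[rank[i] - 1]  # suffix that precedes suffix i in sorted order
        while i + h < n and j + h < n and text[i + h] == text[j + h]:
            h += 1
        lcp[rank[i]] = h
        if h > 0:
            h -= 1  # dropping the first character loses at most one matching symbol
    return lcp


text = "banana"
sa = build_suffix_array(text)
print(sa)                         # [5, 3, 1, 0, 4, 2]
print(build_lcp_array(text, sa))  # [0, 1, 3, 0, 0, 2]
```

Note that the `rank`, `suffix_array`, and `lcp` lists all live in memory at once, which is the kind of RAM footprint the abstract describes as infeasible for NGS-scale collections.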

Whole genome sequencing; Genomics (q-bio.GN); FOS: Computer and information sciences; Sequence; BWT; LCP; next-generation sequencing datasets; BWT LCP text indexes next-generation sequencing datasets massive datasets; Settore INF/01 - Informatica; Computer science; Computation; String (computer science); LCP array; Parallel computing; Data structure; DNA sequencing; Substring; BWT; LCP; FOS: Biological sciences; Computer Science - Data Structures and Algorithms; Quantitative Biology - Genomics; Data Structures and Algorithms (cs.DS); next-generation sequencing datasets

researchProduct

Exploiting Glomus intraradices sequencing data to dissect molecular mechanisms of plant genome control over fungal gene expression in mycorrhiza

2006

[SDV] Life Sciences [q-bio]; molecular mechanisms of plant genome control; [SDV]Life Sciences [q-bio]; sequencing data; fungal gene expression; mycorrhiza; Glomus intraradices; ComputingMilieux_MISCELLANEOUS

researchProduct

SNPs detection by eBWT positional clustering

2019

Sequencing technologies keep getting cheaper and faster, putting growing pressure on data structures designed to store raw data efficiently and, possibly, to support analysis on it. In this view, there is growing interest in alignment-free and reference-free variant-calling methods that use only (suitably indexed) raw read data. We develop the positional clustering theory that (i) describes how the extended Burrows–Wheeler Transform (eBWT) of a collection of reads tends to cluster together bases that cover the same genome position, (ii) predicts the size of such clusters, and (iii) exhibits an elegant and precise LCP-array-based procedure to locate such clusters in the e…
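
The clustering effect described above can be observed on a toy example. The sketch below builds a BWT of a read collection naively, assuming the common formulation in which each read gets its own end marker and markers are ordered by read index; whether the paper uses exactly this variant is an assumption here, and the function name and reads are made up for illustration.

```python
def ebwt_with_end_markers(reads):
    """Naive multi-string BWT: sort every suffix of every read and output, for each,
    the character that precedes it in its read ("$" for whole-read suffixes)."""
    entries = []
    for read_index, read in enumerate(reads):
        # pos == len(read) is the empty suffix, i.e. the end marker on its own.
        for pos in range(len(read) + 1):
            suffix = read[pos:]
            preceding = read[pos - 1] if pos > 0 else "$"
            entries.append((suffix, read_index, preceding))

    # Python orders a string before any longer string it prefixes, which matches an
    # end marker smaller than every base; ties are broken by read index.
    entries.sort(key=lambda entry: (entry[0], entry[1]))
    return "".join(preceding for _, _, preceding in entries)


# Three overlapping reads taken from the made-up genome "GATTACA".
reads = ["GATTA", "ATTAC", "TTACA"]
print(ebwt_with_end_markers(reads))
```

In the output, suffixes that share a long right context (for example those beginning with "TA") sort next to each other, so the bases preceding them, which come from the same genome position, end up adjacent. A variant at that position would show up as a run containing two different letters.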

lcsh:QH426-470; Computer science; LCP array; Reference-free; [SDV]Life Sciences [q-bio]; 0206 medical engineering; Sequencing data; SNP; Assembly-free; 02 engineering and technology; BWT LCP array SNPs Reference-free Assembly-free; computer.software_genre; Software; BWT; Structural Biology; Computational Theory and Mathematic; Cluster (physics); Cluster analysis; lcsh:QH301-705.5; Molecular Biology; ComputingMilieux_MISCELLANEOUS; Settore INF/01 - Informatica; business.industry; Research; Applied Mathematics; LCP array; Data structure; Pipeline (software); lcsh:Genetics; Computational Theory and Mathematics; lcsh:Biology (General); Data mining; BWT; LCP array; SNPs; Reference-free; Assembly-free; [INFO.INFO-BI]Computer Science [cs]/Bioinformatics [q-bio.QM]; business; Raw data; computer; 020602 bioinformatics; SNPs

researchProduct