Search results for "GENOMICS"

showing 10 items of 390 documents

Adaptive reference-free compression of sequence quality scores

2014

Motivation: Rapid technological progress in DNA sequencing has stimulated interest in compressing the vast datasets that are now routinely produced. Relatively little attention has been paid to compressing the quality scores that are assigned to each sequence, even though these scores may be harder to compress than the sequences themselves. By aggregating a set of reads into a compressed index, we find that the majority of bases can be predicted from the sequence of bases that are adjacent to them and hence are likely to be less informative for variant calling or other applications. The quality scores for such bases are aggressively compressed, leaving a relatively small number at full reso…

Statistics and Probability; FOS: Computer and information sciences; Computer science; media_common.quotation_subject; Reference-free; computer.software_genre; Biochemistry; DNA sequencing; Set (abstract data type); Redundancy (information theory); BWT; Computer Science - Data Structures and Algorithms; Code (cryptography); Animals; Humans; Quality (business); Data Structures and Algorithms (cs.DS); Quantitative Biology - Genomics; Caenorhabditis elegans; Molecular Biology; media_common; Genomics (q-bio.GN); Sequence; Genome; Settore INF/01 - Informatica; reference-free compression; High-Throughput Nucleotide Sequencing; Genomics; Sequence Analysis DNA; Data Compression; compression; Computer Science Applications; Computational Mathematics; Computational Theory and Mathematics; FOS: Biological sciences; Data mining; quality score; Metagenomics; computer; BWT; compression; quality score; reference-free compression; Algorithms; Reference genome
researchProduct

Towards next-generation diagnostics for tuberculosis: identification of novel molecular targets by large-scale comparative genomics.

2020

5 pages, 2 figures. AVAILABILITY AND IMPLEMENTATION: The database of non-tuberculous mycobacteria assemblies can be accessed at: 10.5281/zenodo.3374377. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online: http://dx.doi.org/10.1093/bioinformatics/btz729

Statistics and Probability; Tuberculosis; Genomics; Computational biology; Biology; Biochemistry; Mycobacterium tuberculosis; 03 medical and health sciences; medicine; Humans; Tuberculosis; Discovery Notes; Molecular Biology; 030304 developmental biology; Comparative genomics; 0303 health sciences; 030306 microbiology; Scale (chemistry); Genomics; Mycobacterium tuberculosis; medicine.disease; biology.organism_classification; Genome Analysis; 3. Good health; Computer Science Applications; Computational Mathematics; Computational Theory and Mathematics; Mycobacterium tuberculosis complex; Molecular targets; Identification (biology); Biomarkers; Bioinformatics (Oxford, England)
researchProduct

Two hundred and fifty-four metagenome-assembled bacterial genomes from the bank vole gut microbiota.

2020

Vertebrate gut microbiota provide many essential services to their host. To better understand the diversity of such services provided by gut microbiota in wild rodents, we assembled metagenome shotgun sequence data from a small mammal, the bank vole Myodes glareolus (Rodentia, Cricetidae). We were able to identify 254 metagenome-assembled genomes (MAGs) that were at least 50% (n = 133 MAGs), 80% (n = 77 MAGs) or 95% (n = 44 MAGs) complete. As is typical for a rodent gut microbiota, these MAGs are dominated by taxa assigned to the phyla Bacteroidetes (n = 132 MAGs) and Firmicutes (n = 80), with some Spirochaetes (n = 15) and Proteobacteria (n = 11). Based on coverage over…

Statistics and Probability; metagenomics; bacterial genomics; Genome; Bacteria; metsämyyrä (bank vole); Arvicolinae; suolistomikrobisto (gut microbiota); Bacterial; sequencing; genomiikka (genomics); Library and Information Sciences; microbial ecology; bakteerit (bacteria); Computer Science Applications; Education; Gastrointestinal Microbiome; mikrobiekologia (microbial ecology); Animals; lcsh:Q; Statistics Probability and Uncertainty; lcsh:Science; Information Systems
researchProduct

One is not enough: On the effects of reference genome for the mapping and subsequent analyses of short-reads.

2020

Mapping of high-throughput sequencing (HTS) reads to a single arbitrary reference genome is a frequently used approach in microbial genomics. However, the choice of a reference may represent a source of errors that may affect subsequent analyses such as the detection of single nucleotide polymorphisms (SNPs) and phylogenetic inference. In this work, we evaluated the effect of reference choice on short-read sequence data from five clinically and epidemiologically relevant bacteria (Klebsiella pneumoniae, Legionella pneumophila, Neisseria gonorrhoeae, Pseudomonas aeruginosa and Serratia marcescens). Publicly available whole-genome assemblies encompassing the genomic diversity of these species…

Systematic error; Single Nucleotide Polymorphisms; Pathology and Laboratory Medicine; Genome; Klebsiella Pneumoniae; Database and Informatics Methods; Data sequences; Klebsiella; Medicine and Health Sciences; Biology (General); Clade; Phylogeny; Data Management; Ecology; Phylogenetic tree; Bacterial Genomics; Microbial Genetics; Chromosome Mapping; High-Throughput Nucleotide Sequencing; Phylogenetic Analysis; Genomics; Bacterial Pathogens; Phylogenetics; Legionella Pneumophila; Computational Theory and Mathematics; Medical Microbiology; Modeling and Simulation; Pathogens; Sequence Analysis; Research Article; Computer and Information Sciences; Bioinformatics; QH301-705.5; Legionella; Sequence alignment; Single-nucleotide polymorphism; Genomics; Computational biology; Microbial Genomics; Biology; Research and Analysis Methods; Polymorphism Single Nucleotide; Microbiology; Cellular and Molecular Neuroscience; Phylogenetics; Genetics; SNP; Bacterial Genetics; Evolutionary Systematics; Molecular Biology; Microbial Pathogens; Ecology Evolution Behavior and Systematics; Taxonomy; Evolutionary Biology; Bacteria; Organisms; Biology and Life Sciences; Bacteriology; Sequence Alignment; Genome Bacterial; Reference genome; PLoS Computational Biology
researchProduct

Whole genome sequencing of the black grouse (Tetrao tetrix): reference guided assembly suggests faster-Z and MHC evolution

2014

Background: The different regions of a genome do not evolve at the same rate. For example, comparative genomic studies have suggested that the sex chromosomes and the regions harbouring the immune defence genes in the Major Histocompatibility Complex (MHC) may evolve faster than other genomic regions. The advent of next-generation sequencing technologies has made it possible to study which genomic regions are evolutionarily liable to change and which are static, and has enabled an increasing number of genome studies of non-model species. However, de novo sequencing of the whole genome of an organism remains non-trivial. In this study, we present the draft genome of the black grouse, wh…

Tetrao tetrix; Male; Genome evolution; Biology; Genome; Polymorphism Single Nucleotide; Chromosomes; Birds; Evolution Molecular; Major Histocompatibility Complex; Gene density; Genetics; Animals; Genetik (genetics); Genome size; Repetitive Sequences Nucleic Acid; Genetics; Comparative genomics; Whole genome sequencing; teeri (black grouse); Genome; Computational Biology; High-Throughput Nucleotide Sequencing; Molecular Sequence Annotation; Genome project; Genomics; Evolutionary biology; Reference genome; Biotechnology; Research Article; BMC Genomics
researchProduct

Evaluation of GPU-based Seed Generation for Computational Genomics Using Burrows-Wheeler Transform

2012

The unprecedented volume of short reads produced by new high-throughput sequencers makes it challenging to align them to reference genomes with both high sensitivity and high speed. Many CPU-based short read aligners have been developed to address this challenge. Among them, one popular approach is the seed-and-extend heuristic. For this heuristic, the first and foremost step is to generate seeds between the input reads and the reference genome, where hash tables are the most frequently used data structure. However, hash tables are memory-consuming, making them poorly suited to memory-stringent many-core architectures such as GPUs, even though they usually have a nearly constant query time com…

Theoretical computer science; Burrows–Wheeler transform; Computational complexity theory; Computer science; Computational genomics; Parallel computing; Data structure; Time complexity; Hash table; 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum
researchProduct

Statistically validated networks in bipartite complex systems.

2011

Many complex systems present an intrinsic bipartite nature and are often described and modeled in terms of networks [1-5]. Examples include movies and actors [1, 2, 4], authors and scientific papers [6-9], email accounts and emails [10], plants and animals that pollinate them [11, 12]. Bipartite networks are often very heterogeneous in the number of relationships that the elements of one set establish with the elements of the other set. When one constructs a projected network with nodes from only one set, the system heterogeneity makes it very difficult to identify preferential links between the elements. Here we introduce an unsupervised method to statistically validate each link of the pr…

Theoretical computer science; Computer science; lcsh:Medicine; Network theory; Social and Behavioral Sciences; Bioinformatics; Quantitative Biology - Quantitative Methods; Sociology; Protein Interaction Mapping; lcsh:Science; Quantitative Methods (q-bio.QM); Multidisciplinary; Systems Biology; Applied Mathematics; Physics; Statistics; Complex Systems; Genomics; Link (geometry); Social Networks; Specialization (logic); Interdisciplinary Physics; Bipartite graph; Probability distribution; Research Article; Network analysis; Physics - Physics and Society; Complex system; FOS: Physical sciences; Physics and Society (physics.soc-ph); Type (model theory); Biology; Models Biological; Network theory Statistical Physics; Statistical Mechanics; Set (abstract data type); Statistical Methods; Biology; Structure (mathematical logic); Statistical Physics; lcsh:R; Computational Biology; Models Theoretical; Comparative Genomics; Settore FIS/07 - Fisica Applicata(Beni Culturali Ambientali Biol.e Medicin); FOS: Biological sciences; Network theory; lcsh:Q; Null hypothesis; Mathematics; PLoS ONE
researchProduct

High-throughput sequencing of RNA silencing-associated small RNAs in olive (Olea europaea L.).

2011

14 pages, 5 figures, 3 tables, S4 figures, S2 tables

Time Factors; Science; Molecular Sequence Data; Sequence Databases; Plant Science; Biology; Deep sequencing; Transcriptomes; RNA interference; Gene Expression Regulation Plant; Genome Analysis Tools; Olea; Gene expression; microRNA; Genome Databases; Plant Genomics; Gene silencing; Gene Regulatory Networks; Genome Sequencing; Biology; Conserved Sequence; Genetics; Plant Growth and Development; Multidisciplinary; Polymorphism Genetic; Base Sequence; Reverse Transcriptase Polymerase Chain Reaction; Sequence Analysis RNA; Gene Expression Profiling; Q; R; RNA; Gene Expression Regulation Developmental; High-Throughput Nucleotide Sequencing; Reproducibility of Results; Genomics; Olive trees; Functional Genomics; RNA silencing; MicroRNAs; RNA Plant; Small Molecules; Medicine; RNA Interference; Research Article; Biotechnology; Developmental Biology; PloS one
researchProduct

A complete set of nascent transcription rates for yeast genes

2010

The amount of mRNA in a cell is the result of two opposing reactions: transcription and mRNA degradation. These reactions are governed by kinetic laws, and the most regulated step for many genes is the transcription rate. The transcription rate, which is assumed to be exercised mainly at the RNA polymerase recruitment level, can be calculated using the RNA polymerase densities determined either by run-on or immunoprecipitation using specific antibodies. The yeast Saccharomyces cerevisiae is the ideal model organism to generate a complete set of nascent transcription rates that will prove useful for many gene regulation studies. By combining genomic data from both the GRO (Genomic Run-on) a…

Transcription factories; Saccharomyces cerevisiae Proteins; Transcription Genetic; RNA Stability; Genes Fungal; DNA transcription; lcsh:Medicine; Yeast and Fungal Models; RNA polymerase II; Saccharomyces cerevisiae; Biology; Biochemistry; Genètica molecular (molecular genetics); chemistry.chemical_compound; Saccharomyces; Model Organisms; Molecular cell biology; Transcripció genètica (genetic transcription); Gene Expression Regulation Fungal; RNA polymerase; Genetics; RNA Messenger; RNA synthesis; lcsh:Science; Biology; RNA polymerase II holoenzyme; Genetics; Multidisciplinary; General transcription factor; Gene Expression Profiling; lcsh:R; Promoter; Genomics; Chromatin; Functional Genomics; Nucleic acids; Genòmica (genomics); RNA processing; chemistry; biology.protein; RNA; lcsh:Q; RNA Polymerase II; Gene expression; Transcription factor II D; Transcription factor II B; Research Article
researchProduct

Annotation of microsporidian genomes using transcriptional signals

2012

EA GenoSol CT3; International audience; High-quality annotation of microsporidian genomes is essential for understanding the biological processes that govern the development of these parasites. Here we present an improved structural annotation method using transcriptional DNA signals. We apply this method to re-annotate four previously annotated genomes, which allows us to detect annotation errors and identify a significant number of unpredicted genes. We then annotate the newly sequenced genome of Anncaliia algerae. A comparative genomic analysis of A. algerae permits the identification of not only microsporidian core genes, but also potentially highly expressed genes encoding membrane-asso…

Transcription Genetic; genome annotation; MESH: Molecular Sequence Annotation; General Physics and Astronomy; MESH: Phosphotransferases; Genome; transcriptional signal; MESH: Protein Transport; MESH: Fungal Proteins; DNA Fungal; Conserved Sequence; ComputingMilieux_MISCELLANEOUS; Genetics; 0303 health sciences; Fungal protein; MESH: Conserved Sequence; Multidisciplinary; MESH: Genomics; 030302 biochemistry & molecular biology; Genomics; Genome project; Protein Transport; Molecular Sequence Annotation; [SDV.BBM.GTP] Life Sciences [q-bio]/Biochemistry Molecular Biology/Genomics [q-bio.GN]; MESH: Genome Fungal; MESH: Fungal Proteins; MESH: Phosphotransferases; Genome Fungal; Transposable element; MESH: Protein Transport; Genes Fungal; Genomics; MESH: Molecular Sequence Annotation; MESH: Microsporidia; MESH: Open Reading Frames; Computational biology; Biology; General Biochemistry Genetics and Molecular Biology; Fungal Proteins; Open Reading Frames; 03 medical and health sciences; MESH: Conserved Sequence; [SDV.BBM.GTP] Life Sciences [q-bio]/Biochemistry Molecular Biology/Genomics [q-bio.GN]; Anncaliia algerae; parasitic diseases; Gene; 030304 developmental biology; bioinformatic; MESH: Transcription Genetic; MESH: Genome Fungal; Phosphotransferases; structural annotation; MESH: Genomics; fungi; MESH: Transcription Genetic; Molecular Sequence Annotation; General Chemistry; MESH: Open Reading Frames; MESH: Microsporidia; MESH: DNA Fungal; microsporidia; MESH: Genes Fungal; [INFO.INFO-BI] Computer Science [cs]/Bioinformatics [q-bio.QM]; MESH: DNA Fungal; MESH: Genes Fungal; Nature Communications
researchProduct