Search results for "METAGENOMICS"

showing 10 items of 168 documents

Mining virulence genes using metagenomics.

2011

When a bacterial genome is compared to the metagenome of an environment it inhabits, most genes recruit at high sequence identity. In free-living bacteria (for instance marine bacteria compared against the ocean metagenome) certain genomic regions are totally absent in recruitment plots, representing therefore genes unique to individual bacterial isolates. We show that these Metagenomic Islands (MIs) are also visible in bacteria living in human hosts when their genomes are compared to sequences from the human microbiome, despite the compartmentalized structure of human-related environments such as the gut. From an applied point of view, MIs of human pathogens (e.g. those identified in enter…

Science; Virulence; Bacterial genome size; Biology; Genome; Microbiology; Microbiome; Genome Evolution; Comparative genomics; Genetics; Escherichia Coli; Multidisciplinary; Bacteria; Q; Human microbiome; R; Genomics; Pathogenicity island; Bacterial Pathogens; Metagenomics; Microbial Evolution; Medicine; Genome Bacterial; Research Article; PLoS ONE

researchProduct

Compressive biological sequence analysis and archival in the era of high-throughput sequencing technologies

2013

High-throughput sequencing technologies produce large collections of data, mainly DNA sequences with additional information, requiring the design of efficient and effective methodologies for both their compression and storage. In this context, we first provide a classification of the main techniques that have been proposed, according to three specific research directions that have emerged from the literature and, for each, we provide an overview of the current techniques. Finally, to make this review useful to researchers and technicians applying the existing software and tools, we include a synopsis of the main characteristics of the described approaches, including details on their impleme…

Sequence analysis; Computer science; business.industry; Computational Biology; High-Throughput Nucleotide Sequencing; Context (language use); Data Compression; Bioinformatics; Data science; DNA sequencing; Software; Sequence analysis Data compression; Metagenomics; State (computer science); business; Sequence Alignment; Molecular Biology; Algorithms; Information Systems; Data compression; Briefings in Bioinformatics

researchProduct

Classification of Sequences with Deep Artificial Neural Networks: Representation and Architectural Issues

2021

DNA sequences are the basic data type processed in a generic biological data analysis study. One key component of such analysis is sequence classification, a methodology widely used to analyze sequential data of different kinds. However, its application to DNA sequences requires a proper representation of those sequences, which is still an open research problem. Machine Learning (ML) methodologies have made a fundamental contribution to solving this problem. Among them, Deep Neural Network (DNN) models have recently shown strongly encouraging results. In this chapter, we deal with specific classification problems related to t…

Sequence; Biological data; Sequence classification; Settore INF/01 - Informatica; Artificial neural network; Process (engineering); Computer science; business.industry; Deep learning; Bacteria classification; Nucleosome identification; Deep neural network; Machine learning; computer.software_genre; Data type; Component (UML); Artificial intelligence; Metagenomics; Representation (mathematics); business; computer

researchProduct

CROSSMAPPER: estimating cross-mapping rates and optimizing experimental design in multi-species sequencing studies

2020

Motivation: Numerous sequencing studies, including transcriptomics of host-pathogen systems, sequencing of hybrid genomes, xenografts, mixed species systems, metagenomics and meta-transcriptomics, involve samples containing genetic material from divergent organisms. A crucial step in these studies is identifying from which organism each sequencing read originated, and the experimental design should be directed to minimize biases caused by cross-mapping of reads to incorrect source genomes. Additionally, pooling of sufficiently different genetic material into a single sequencing library could significantly reduce experimental costs but requires careful planning and assessment of the impact of…

Statistics and Probability; :Informàtica::Aplicacions de la informàtica::Bioinformàtica [Àrees temàtiques de la UPC]; Computer science; computer.software_genre; Biochemistry; Genome; Transcriptome; 03 medical and health sciences; Resource (project management); Genomes; Transcriptomics; Molecular Biology; Organism; Genòmica -- Informàtica; 030304 developmental biology; 0303 health sciences; 030306 microbiology; High-Throughput Nucleotide Sequencing; Genomics; Sequence Analysis DNA; DNA; Genome analysis; Genome Analysis; Anàlisis de seqüències; Computer Science Applications; Applications Note; Computational Mathematics; Computational Theory and Mathematics; Cross-mapping; Research Design; Metagenomics; RNA; Data mining; Line (text file); computer; Software; Genètica; parametres

researchProduct

MCRL: using a reference library to compress a metagenome into a non-redundant list of sequences, considering viruses as a case study

2019

Motivation: Metagenomes offer a glimpse into the total genomic diversity contained within a sample. Currently, however, there is no straightforward way to obtain a non-redundant list of all putative homologs of a set of reference sequences present in a metagenome. Results: To address this problem, we developed a novel clustering approach called ‘metagenomic clustering by reference library’ (MCRL), where a reference library containing a set of reference genes is clustered with respect to an assembled metagenome. According to our proposed approach, reference genes homologous to similar sets of metagenomic sequences, termed ‘signatures’, are iteratively clustered in a greedy fashion, re…

Statistics and Probability; Contig; Computer science; Robustness (evolution); Computational biology; Original Papers; Biochemistry; Computer Science Applications; Set (abstract data type); Computational Mathematics; Computational Theory and Mathematics; Metagenomics; Reference genes; Gene family; Human virome; Cluster analysis; Molecular Biology; Bioinformatics

researchProduct

Adaptive reference-free compression of sequence quality scores

2014

Motivation: Rapid technological progress in DNA sequencing has stimulated interest in compressing the vast datasets that are now routinely produced. Relatively little attention has been paid to compressing the quality scores that are assigned to each sequence, even though these scores may be harder to compress than the sequences themselves. By aggregating a set of reads into a compressed index, we find that the majority of bases can be predicted from the sequence of bases that are adjacent to them and hence are likely to be less informative for variant calling or other applications. The quality scores for such bases are aggressively compressed, leaving a relatively small number at full reso…

Statistics and Probability; FOS: Computer and information sciences; Computer science; media_common.quotation_subject; Reference-free; computer.software_genre; Biochemistry; DNA sequencing; Set (abstract data type); Redundancy (information theory); BWT; Computer Science - Data Structures and Algorithms; Code (cryptography); Animals; Humans; Quality (business); Data Structures and Algorithms (cs.DS); Quantitative Biology - Genomics; Caenorhabditis elegans; Molecular Biology; media_common; Genomics (q-bio.GN); Sequence; Genome; Settore INF/01 - Informatica; reference-free compression; High-Throughput Nucleotide Sequencing; Genomics; Sequence Analysis DNA; Data Compression; compression; Computer Science Applications; Computational Mathematics; Computational Theory and Mathematics; FOS: Biological sciences; Data mining; quality score; Metagenomics; computer; Algorithms; Reference genome

researchProduct

Metagenomics reveals our incomplete knowledge of global diversity

2008

Metagenomic sequencing obtains huge amounts of sequences from environmental and clinical samples, thus providing a glimpse of the global prokaryotic diversity of both species and genes in these sources. The current trend in metagenomic analysis follows the so-called gene-centric approach, focused on describing the environments by the study of the functional roles of the proteins encoded in the sequenced genes. In this way, it is clear that metagenomic analysis relies heavily on the accurate knowledge of the universe of proteins stored in the databases. Nevertheless, it is known that some biases exist in the composition of databases (which are rich in sequences from common, cultivable and ea…

Statistics and Probability; Genetics; Phylogenetic tree; biology; Phylum; Genetic Variation; Genomics; Biodiversity; Genome Analysis; biology.organism_classification; Biochemistry; Computer Science Applications; Computational Mathematics; Taxon; Computational Theory and Mathematics; Evolutionary biology; Metagenomics; GenBank; CIENCIAS DE LA COMPUTACION E INTELIGENCIA ARTIFICIAL; Taxonomic rank; Letter to the Editor; Molecular Biology; Ecosystem; Acidobacteria

researchProduct

Two hundred and fifty-four metagenome-assembled bacterial genomes from the bank vole gut microbiota.

2020

Vertebrate gut microbiota provide many essential services to their host. To better understand the diversity of such services provided by gut microbiota in wild rodents, we assembled metagenome shotgun sequence data from a small mammal, the bank vole Myodes glareolus (Rodentia, Cricetidae). We were able to identify 254 metagenome-assembled genomes (MAGs) that were at least 50% (n = 133 MAGs), 80% (n = 77 MAGs) or 95% (n = 44 MAGs) complete. As is typical for a rodent gut microbiota, these MAGs are dominated by taxa assigned to the phyla Bacteroidetes (n = 132 MAGs) and Firmicutes (n = 80), with some Spirochaetes (n = 15) and Proteobacteria (n = 11). Based on coverage over…

Statistics and Probability; metagenomics; bacterial genomics; Genome; Bacteria; metsämyyrä; Arvicolinae; suolistomikrobisto; Bacterial; sequencing; genomiikka; Library and Information Sciences; microbial ecology; bakteerit; Computer Science Applications; Education; Gastrointestinal Microbiome; mikrobiekologia; Animals; lcsh:Q; Statistics Probability and Uncertainty; lcsh:Science; Information Systems

researchProduct

Animal rennets as sources of dairy lactic acid bacteria

2014

The microbial composition of artisan and industrial animal rennet pastes was studied using both culture-dependent and culture-independent approaches. Pyrosequencing targeting the 16S rRNA gene allowed the identification of 361 operational taxonomic units (OTUs) to the genus/species level. Among lactic acid bacteria (LAB), Streptococcus thermophilus and some lactobacilli, mainly Lactobacillus crispatus and Lactobacillus reuteri, were the most abundant species, with differences among the samples. Twelve groups of microorganisms were targeted by viable plate counts, revealing a dominance of mesophilic cocci. All rennets were able to acidify ultrahigh-temperature-processed (UHT) milk as shown by pH …

Streptococcus thermophilus; Colony Count; Colony Count Microbial; Applied Microbiology and Biotechnology; Acidification; Animal rennet pastes; Autolysis; Lactic acid bacteria; Microbial ecology; Pyrosequencing; Microbial; Cheese; RNA Ribosomal 16S; Lactobacillus; Enterococcus casseliflavus; Cluster Analysis; Phylogeny; Ecology; biology; Lactobacillus crispatus; Bacterial; Animal rennet paste; food and beverages; Hydrogen-Ion Concentration; Biota; Animals; DNA Bacterial; DNA Ribosomal; Enterococcus; Microbial Viability; Milk; Molecular Sequence Data; Sequence Analysis DNA; Chymosin; Sequence Analysis; Biotechnology; 16S; Enterococcus faecalis; Microbiology; Ribosomal; DNA; biology.organism_classification; Lactobacillus reuteri; Food Microbiology; RNA; Metagenomics; Food Science; Enterococcus faecium; Settore AGR/16 - Microbiologia Agraria

researchProduct

Comparison of different assembly and annotation tools on analysis of simulated viral metagenomic communities in the gut

2013

Background: The main limitations in the analysis of viral metagenomes are perhaps the high genetic variability and the lack of information in extant databases. To address these issues, several bioinformatic tools have been specifically designed or adapted for metagenomics by improving read assembly and creating more sensitive methods for homology detection. This study compares the performance of different available assemblers and taxonomic annotation software using simulated viral-metagenomic data. Results: We simulated two 454 viral metagenomes using genomes from NCBI's RefSeq database based on the list of actual viruses found in previously published metagenomes. Three different ass…

Taxonomic classification; Computational biology; Biology; Genome; Contig Mapping; User-Computer Interface; 03 medical and health sciences; Annotation; Databases Genetic; Genetics; RefSeq; Cluster Analysis; Humans; Computer Simulation; Taxonomic rank; 030304 developmental biology; De Bruijn sequence; Internet; Principal Component Analysis; 0303 health sciences; Bacteria; Contig; Chimera identification; 030306 microbiology; Computational Biology; Functional annotation; Viral metagenome; Intestines; Assembler performance; Metagenomics; Viruses; Algorithms; Research Article; Biotechnology; BMC Genomics

researchProduct