Search results for "Reference genome"

showing 10 items of 27 documents

Chloroplast genomes of Rubiaceae: Comparative genomics and molecular phylogeny in subfamily Ixoroideae.

2020

In Rubiaceae phylogenetics, the number of markers often proved a limitation with authors failing to provide well-supported trees at tribal and generic levels. A robust phylogeny is a prerequisite to study the evolutionary patterns of traits at different taxonomic levels. Advances in next-generation sequencing technologies have revolutionized biology by providing, at reduced cost, huge amounts of data for an increased number of species. Due to their highly conserved structure, generally recombination-free, and mostly uniparental inheritance, chloroplast DNA sequences have long been used as choice markers for plant phylogeny reconstruction. The main objectives of this study are: 1) to gain in…

0106 biological sciences; 0301 basic medicine; Chloroplasts; Plant Genomes; Coffea; Rubiaceae; Plant Science; Plant Genetics; 01 natural sciences; Genome; Plant Genomics; Plastids; Genome Evolution; Phylogeny; Data Management; Multidisciplinary; Ixoroideae; Q; DNA Chloroplast; R; High-Throughput Nucleotide Sequencing; food and beverages; Phylogenetic Analysis; Genomics; Phylogenetics; Chloroplast DNA; Engineering and Technology; Medicine; Genome Plant; Research Article; Biotechnology; Genome evolution; Computer and Information Sciences; Nuclear gene; Plant Cell Biology; Science; Genomics; Bioengineering; Biology; 010603 evolutionary biology; Polymorphism Single Nucleotide; Molecular Evolution; Evolution Molecular; 03 medical and health sciences; Chloroplast Genome; Genetics; Evolutionary Systematics; Genome Chloroplast; Taxonomy; Comparative genomics; Evolutionary Biology; Biology and Life Sciences; Computational Biology; Cell Biology; Sequence Analysis DNA; Comparative Genomics; biology.organism_classification; Genome Analysis; Genomic Libraries; 030104 developmental biology; Evolutionary biology; Plant Biotechnology; Reference genome; PLoS ONE

researchProduct

Establishing gene models from the Pinus pinaster genome using gene capture and BAC sequencing

2016

Background: In the era of DNA throughput sequencing, assembling and understanding gymnosperm mega-genomes remains a challenge. Although drafts of three conifer genomes have recently been published, this number is too low to understand the full complexity of conifer genomes. Using techniques focused on specific genes, gene models can be established that can aid in the assembly of gene-rich regions, and this information can be used to compare genomes and understand functional evolution. Results: In this study, gene capture technology combined with BAC isolation and sequencing was used as an experimental approach to establish de novo gene structures without a reference genome. Probes were design…

0301 basic medicine; Chromosomes Artificial Bacterial; DNA Plant; Genomics; Biology; Maritime pine; Genome; 03 medical and health sciences; Gene capture; Genetics; Gene family; Genomic library; Gene; BAC; Gene Library; Genetics; Models Genetic; Exons; Genomics; Sequence Analysis DNA; Pinus; Introns; Gene structure; Promoter studies; 030104 developmental biology; Bioinformatic pipeline; Gene model construct; DNA microarray; Functional genomics; Genome Plant; Reference genome; Research Article; Biotechnology; BMC Genomics

researchProduct

GIbPSs: a toolkit for fast and accurate analyses of genotyping-by-sequencing data without a reference genome.

2015

Genotyping-by-sequencing (GBS) and related methods are increasingly used for studies of non-model organisms from population genetic to phylogenetic scales. We present GIbPSs, a new genotyping toolkit for the analysis of data from various protocols such as RAD, double-digest RAD, GBS, and two-enzyme GBS without a reference genome. GIbPSs can handle paired-end GBS data and is able to assign reads from both strands of a restriction fragment to the same locus. GIbPSs is most suitable for population genetic and phylogeographic analyses. It avoids genotyping errors due to indel variation by identifying and discarding affected loci. GIbPSs creates a genotype database that offers rich functionality…

0301 basic medicine; Genetics; education.field_of_study; Genotyping Techniques; Population; Computational Biology; Locus (genetics); Computational biology; Sequence Analysis DNA; Biology; 03 medical and health sciences; Phylogeography; 030104 developmental biology; Genetics Population; Genotype; Genetics; education; Indel; Genotyping; Genotyping Techniques; Ecology Evolution Behavior and Systematics; Paired-end tag; Biotechnology; Reference genome; Molecular ecology resources

researchProduct

Population Structure in the Model Grass Brachypodium distachyon Is Highly Correlated with Flowering Differences across Broad Geographic Areas

2016

The small, annual grass Brachypodium distachyon (L.) Beauv., a close relative of wheat (Triticum aestivum L.) and barley (Hordeum vulgare L.), is a powerful model system for cereals and bioenergy grasses. Genome-wide association studies (GWAS) of natural variation can elucidate the genetic basis of complex traits but have been so far limited in B. distachyon by the lack of large numbers of well-characterized and sufficiently diverse accessions. Here, we report on genotyping-by-sequencing (GBS) of 84 B. distachyon, seven B. hybridum, and three B. stacei accessions with diverse geographic origins including Albania, Armenia, Georgia, Italy, Spain, and Turkey. Over 90,000 high-quality single-nu…

0301 basic medicine; Germplasm; Linkage disequilibrium; lcsh:QH426-470; Population; Plant Science; lcsh:Plant culture; Quantitative trait locus; Biology; phenology; 03 medical and health sciences; Genetic; Genetic variation; evolution; Genetics; lcsh:SB1-1110; education; biogeography; education.field_of_study; Ecology; Settore BIO/02 - Botanica Sistematica; food and beverages; population structure; Vernalization; biology.organism_classification; Brachypodium distachyon genome DNA Poaceae Population structure; lcsh:Genetics; 030104 developmental biology; Evolutionary biology; Settore BIO/03 - Botanica Ambientale E Applicata; Brachypodium distachyon; Agronomy and Crop Science; Reference genome

researchProduct

Mycobacterium tuberculosis complex lineage 5 exhibits high levels of within-lineage genomic diversity and differing gene content compared to the type…

2021

Pathogens of the Mycobacterium tuberculosis complex (MTBC) are considered to be monomorphic, with little gene content variation between strains. Nevertheless, several genotypic and phenotypic factors separate strains of the different MTBC lineages (L), especially L5 and L6 (traditionally termed Mycobacterium africanum) strains, from each other. However, this genome variability and gene content, especially of L5 strains, has not been fully explored and may be important for pathobiology and current approaches for genomic analysis of MTBC strains, including transmission studies. By comparing the genomes of 355 L5 clinical strains (including 3 complete genomes and 352 Illumina whole-genome sequenc…

0301 basic medicine; Lineage (genetic); Genotype; 030106 microbiology; Sequence assembly; Pathogens and Epidemiology; lineage 5; Genome; genomic diversity; 03 medical and health sciences; Species Specificity; Drug Resistance Multiple Bacterial; Genotype; Humans; Tuberculosis; H37Rv; Biology; Gene; Research Articles; reference genome; within-lineage variability; Genetics; Whole Genome Sequencing; biology; Chromosome Mapping; Genetic Variation; High-Throughput Nucleotide Sequencing; Mycobacterium tuberculosis; Sequence Analysis DNA; gene presence/absence; General Medicine; biology.organism_classification; 030104 developmental biology; L5.3.2; Mycobacterium tuberculosis complex; M. africanum; Human medicine; Mycobacterium africanum; Genome Bacterial; Reference genome; Microbial Genomics

researchProduct

Detailed analysis of inversions predicted between two human genomes: errors, real polymorphisms, and their origin and population distribution.

2016

The growing catalogue of structural variants in humans often overlooks inversions as one of the most difficult types of variation to study, even though they affect phenotypic traits in diverse organisms. Here, we have analysed in detail 90 inversions predicted from the comparison of two independently assembled human genomes: the reference genome (NCBI36/HG18) and HuRef. Surprisingly, we found that two thirds of these predictions (62) represent errors either in assembly comparison or in one of the assemblies, including 27 misassembled regions in HG18. Next, we validated 22 of the remaining 28 potential polymorphic inversions using different PCR techniques and characterized their breakpoints …

0301 basic medicine; Population; Biology; Genome; Evolution Molecular; 03 medical and health sciences; Genetics; Humans; 1000 Genomes Project; Allele; Selection Genetic; education; Molecular Biology; Allele frequency; Genetics (clinical); Genetics; education.field_of_study; Polymorphism Genetic; Genome Human; Sequence Inversion; Breakpoint; Molecular Sequence Annotation; General Medicine; Sequence Analysis DNA; 030104 developmental biology; Chromosome Inversion; Human genome; Reference genome; Human molecular genetics

researchProduct

MetaCache: context-aware classification of metagenomic reads using minhashing.

2017

Motivation: Metagenomic shotgun sequencing studies are becoming increasingly popular, with prominent examples including the sequencing of human microbiomes and diverse environments. A fundamental computational problem in this context is read classification, i.e. the assignment of each read to a taxonomic label. Due to the large number of reads produced by modern high-throughput sequencing technologies and the rapidly increasing number of available reference genomes, corresponding software tools suffer from either long runtimes, large memory requirements or low accuracy. Results: We introduce MetaCache, novel software for read classification using the big data technique minhashing. Our…

0301 basic medicine; Statistics and Probability; Computer science; Sequence analysis; Context (language use); Biochemistry; Genome; 03 medical and health sciences; chemistry.chemical_compound; 0302 clinical medicine; RefSeq; Humans; Molecular Biology; Information retrieval; Shotgun sequencing; High-Throughput Nucleotide Sequencing; Sequence Analysis DNA; Computer Science Applications; Computational Mathematics; 030104 developmental biology; Computational Theory and Mathematics; chemistry; Metagenomics; Metagenomics; 030217 neurology & neurosurgery; DNA; Algorithms; Software; Reference genome; Bioinformatics (Oxford, England)

researchProduct

Reference genome assessment from a population scale perspective: an accurate profile of variability and noise.

2017

Motivation: Current plant and animal genomic studies are often based on newly assembled genomes that have not been properly consolidated. In this scenario, misassembled regions can easily lead to false-positive findings. Although quality control scores are included within genotyping protocols, they are usually employed to evaluate individual sample quality rather than reference sequence reliability. We propose a statistical model that combines quality control scores across samples in order to detect incongruent patterns at every genomic region. Our model is inherently robust since common artifact signals are expected to be shared between independent samples over misassembled regions …

0301 basic medicine; Statistics and Probability; Quality Control; Genotype; Computer science; media_common.quotation_subject; Population; Genomics; Bioinformatics; computer.software_genre; Biochemistry; Genome; 03 medical and health sciences; Genetic variation; Animals; Humans; Quality (business); Allele; education; Molecular Biology; Genotyping; Reliability (statistics); media_common; Protocol (science); education.field_of_study; Genome; Models Statistical; Genetic Variation; Reproducibility of Results; Genomics; Genome Analysis; Original Papers; Computer Science Applications; Computational Mathematics; 030104 developmental biology; Computational Theory and Mathematics; Data mining; computer; Software; Reference genome

researchProduct

Parallel and Space-Efficient Construction of Burrows-Wheeler Transform and Suffix Array for Big Genome Data

2016

Next-generation sequencing technologies have led to the sequencing of more and more genomes, propelling related research into the era of big data. In this paper, we present ParaBWT, a parallelized Burrows-Wheeler transform (BWT) and suffix array construction algorithm for big genome data. In ParaBWT, we have investigated a progressive construction approach to constructing the BWT of single genome sequences in linear space complexity, but with a small constant factor. This approach has been further parallelized using multi-threading based on a master-slave coprocessing model. Once the BWT has been obtained, the suffix array is constructed in a memory-efficient manner. The performance of ParaBWT has b…

0301 basic medicine; Theoretical computer science; Burrows–Wheeler transform; Computer science; Genomics; Data_CODINGANDINFORMATIONTHEORY; Parallel computing; Genome; law.invention; 03 medical and health sciences; law; Genetics; Humans; Ensembl; Multi-core processor; Applied Mathematics; Linear space; Suffix array; Chromosome Mapping; High-Throughput Nucleotide Sequencing; Genomics; Sequence Analysis DNA; 030104 developmental biology; Algorithms; Biotechnology; Reference genome; IEEE/ACM Transactions on Computational Biology and Bioinformatics

researchProduct

2018

Background: The European beech is arguably the most important climax broad-leaved tree species in Central Europe, widely planted for its valuable wood. Here, we report the 542 Mb draft genome sequence of an up to 300-year-old individual (Bhaga) from an undisturbed stand in the Kellerwald-Edersee National Park in central Germany. Findings: Using a hybrid assembly approach, Illumina reads with short- and long-insert libraries, coupled with long Pacific Biosciences reads, we obtained an assembled genome size of 542 Mb, in line with flow cytometric genome size estimation. The largest scaffold was of 1.15 Mb, the N50 length was 145 kb, and the L50 count was 983. The assembly contained 0.12% of Ns.…

0301 basic medicine; Whole genome sequencing; biology; Health Informatics; Genome browser; biology.organism_classification; Genome; Computer Science Applications; Population genomics; 03 medical and health sciences; 030104 developmental biology; Fagus sylvatica; Evolutionary biology; Genome size; Beech; Reference genome; GigaScience

researchProduct