Search results for "Sequence assembly"
showing 10 items of 26 documents
De novo assembly of the zucchini genome reveals a whole-genome duplication associated with the origin of the Cucurbita genus
2018
Summary The Cucurbita genus (squashes, pumpkins and gourds) includes important domesticated species such as C. pepo, C. maxima and C. moschata. In this study, we present a high-quality draft of the zucchini (C. pepo) genome. The assembly has a size of 263 Mb, a scaffold N50 of 1.8 Mb and 34 240 gene models. It includes 92% of the conserved BUSCO core gene set, and it is estimated to cover 93.0% of the genome. The genome is organized in 20 pseudomolecules that represent 81.4% of the assembly, and it is integrated with a genetic map of 7718 SNPs. Despite the small genome size, three independent lines of evidence support that the C. pepo genome is the result of a whole-genome duplication: the …
“Out of the can”: a draft genome assembly, liver transcriptome, and nutrigenomics of the European sardine, Sardina pilchardus
2018
Clupeiformes, such as sardines and herrings, represent an important share of worldwide fisheries. Among those, the European sardine (Sardina pilchardus, Walbaum 1792) exhibits significant commercial relevance. While the last decade showed a steady and sharp decline in capture levels, recent advances in culture husbandry represent promising research avenues. Yet, the complete absence of genomic resources from sardine imposes a severe bottleneck to understand its physiological and ecological requirements. We generated 69 Gbp of paired-end reads using Illumina HiSeq X Ten and assembled a draft genome assembly with an N50 scaffold length of 25,579 bp and BUSCO completeness of 82.1% (Actinoptery…
Correction to: A haplotype-resolved, de novo genome assembly for the wood tiger moth (Arctia plantaginis) through trio binning
2021
“Out of the Can”: A Draft Genome Assembly, Liver Transcriptome and Nutrigenomics of the European Sardine, Sardina p…
2018
Clupeiformes, such as sardines and herrings, represent an important share of worldwide fisheries. Among those, the European sardine (Sardina pilchardus, Walbaum 1792) exhibits significant commercial relevance. While the last decade showed a steady and sharp decline in capture levels, recent advances in culture husbandry represent promising research avenues. Yet, the complete absence of genomic resources from sardine imposes a severe bottleneck to understand its physiological and ecological requirements. We generated 69 Gbp of paired-end reads using Illumina HiSeq X Ten and assembled a draft genome assembly with an N50 scaffold length of 25579 bp and BUSCO completeness of 82.1% (Actinopteryg…
Impact of analytic provenance in genome analysis
2014
Background Many computational methods are available for assembly and annotation of newly sequenced microbial genomes. However, when new genomes are reported in the literature, there is frequently very little critical analysis of choices made during the sequence assembly and gene annotation stages. These choices have a direct impact on the biologically relevant products of a genomic analysis - for instance identification of common and differentiating regions among genomes in a comparison, or identification of enriched gene functional categories in a specific strain. Here, we examine the outcomes of different assembly and analysis steps in typical workflows in a comparison among strains of Vi…
Sequencing, De Novo Assembly and Annotation of the Colorado Potato Beetle, Leptinotarsa decemlineata, Transcriptome
2012
Background. The Colorado potato beetle (Leptinotarsa decemlineata) is a major pest and a serious threat to potato cultivation throughout the northern hemisphere. Despite its high importance for invasion biology, phenology and pest management, little is known about L. decemlineata from a genomic perspective. We subjected European L. decemlineata adult and larval transcriptome samples to 454-FLX massively-parallel DNA sequencing to characterize a basal set of genes from this species. We created a combined assembly of the adult and larval datasets including the publicly available midgut larval Roche 454 reads and provided basic annotation. We were particularly interested in diapause-specific g…
Tandem repeats lead to sequence assembly errors and impose multi-level challenges for genome and protein databases
2019
Abstract The widespread occurrence of repetitive stretches of DNA in genomes of organisms across the tree of life imposes fundamental challenges for sequencing, genome assembly, and automated annotation of genes and proteins. This multi-level problem can lead to errors in genome and protein databases that are often not recognized or acknowledged. As a consequence, end users working with sequences with repetitive regions are faced with ‘ready-to-use’ deposited data whose trustworthiness is difficult to determine, let alone to quantify. Here, we provide a review of the problems associated with tandem repeat sequences that originate from different stages during the sequencing-assembly-annotatio…
Mycobacterium tuberculosis complex lineage 5 exhibits high levels of within-lineage genomic diversity and differing gene content compared to the type …
2020
Abstract Pathogens of the Mycobacterium tuberculosis complex (MTBC) are considered monomorphic, with little gene content variation between strains. Nevertheless, several genotypic and phenotypic factors separate the different MTBC lineages (L), especially L5 and L6 (traditionally termed Mycobacterium africanum), from each other. However, genome variability and gene content especially of L5 and L6 strains have not been fully explored and may be potentially important for pathobiology and current approaches for genomic analysis of MTBC isolates, including transmission studies. We compared the genomes of 358 L5 clinical isolates (including 3 completed genomes and 355 Illumina WGS (whole genome seque…
The new era of genome sequencing using high-throughput sequencing technology: generation of the first version of the Atlantic cod genome
2016
Abstract The genome of Atlantic cod (Gadus morhua L.) published in 2011 was the first example of a teleost genome obtained using a pure high-throughput sequencing (HTS) technology strategy, and the first large vertebrate genome generated by exclusively using Roche/454 sequencing technology. At the start of the sequencing project in 2009, two HTS technologies were available, the Roche/454 and Illumina technologies. Because of the longer read length of the Roche/454 technology and a wider range of suitable software utilizing those data at the time, we chose to use this technology for the first version of the Atlantic cod genome. In this chapter, we describe the process leading to the assembly…
Direct sequencing of human gut virome fractions obtained by flow cytometry
2015
The sequence assembly of the human gut virome encounters several difficulties. A high proportion of human and bacterial matches is detected in purified viral samples. Viral DNA extraction results in a low DNA concentration, which does not reach the minimal limit required for sequencing library preparation. Therefore, the viromes are usually enriched by whole genome amplification, which is, however, prone to the development of chimeras and amplification bias. In addition, as there is a very wide diversity of gut viral species, very extensive sequencing efforts must be made for the assembling of whole viral genomes. We present an approach to improve human gut virome assembly by employing a mo…