RESEARCH PRODUCT

Whole genome sequencing data and de novo draft assemblies for 66 teleost species

Ole K. Tørresen, Michael Matschiner, Martin Malmstrøm, Kjetill S. Jakobsen, Sissel Jentoft

subject

0106 biological sciences; 0301 basic medicine; Statistics and Probability; Data Descriptor; Computational biology; Library and Information Sciences; 010603 evolutionary biology; 01 natural sciences; Genome; Education; 03 medical and health sciences; biology.animal; Genome assembly algorithms; Animals; DNA sequencing; Gene; Phylogeny; Genetics; Whole genome sequencing; biology; Phylogenetic tree; Comparative genomics; Gene tree; Fishes; Robustness (evolution); Vertebrate; Genomics; Computer Science Applications; Metadata; 030104 developmental biology; Statistics, Probability and Uncertainty; Information Systems

description

Teleost fishes comprise more than half of all vertebrate species, yet genomic data are only available for 0.2% of their diversity. Here, we present whole genome sequencing data for 66 new species of teleosts, vastly expanding the availability of genomic data for this important vertebrate group. We report on de novo assemblies based on low-coverage (9–39×) sequencing and present detailed methodology for all analyses. To facilitate further utilization of this data set, we present statistical analyses of the gene space completeness and verify the expected phylogenetic position of the sequenced genomes in a large mitogenomic context. We further present a nuclear marker set used for phylogenetic inference and evaluate each gene tree in relation to the species tree to test for homogeneity in the phylogenetic signal. Collectively, these analyses illustrate the robustness of this highly diverse data set and enable extensive reuse of the selected phylogenetic markers and the genomic data in general. This data set covers all major teleost lineages and provides unprecedented opportunities for comparative studies of teleosts.

A machine-accessible metadata file describing the reported data is provided in ISA-Tab format.

https://doi.org/10.1038/sdata.2016.132