RESEARCH PRODUCT
GIbPSs: a toolkit for fast and accurate analyses of genotyping-by-sequencing data without a reference genome.
D. Thiele, A. Hapke
subject
0301 basic medicine; Genetics; education.field_of_study; Genotyping Techniques; Population; Computational Biology; Locus (genetics); Sequence Analysis DNA; Biology; 03 medical and health sciences; Phylogeography; 030104 developmental biology; Genetics Population; Genotype; education; Indel; Genotyping; Ecology Evolution Behavior and Systematics; Paired-end tag; Biotechnology; Reference genome
description
Genotyping-by-sequencing (GBS) and related methods are increasingly used for studies of non-model organisms from population genetic to phylogenetic scales. We present GIbPSs, a new genotyping toolkit for the analysis of data from various protocols such as RAD, double-digest RAD, GBS, and two-enzyme GBS without a reference genome. GIbPSs can handle paired-end GBS data and is able to assign reads from both strands of a restriction fragment to the same locus. GIbPSs is most suitable for population genetic and phylogeographic analyses. It avoids genotyping errors due to indel variation by identifying and discarding affected loci. GIbPSs creates a genotype database that offers rich functionality for data filtering and export in numerous formats. We performed comparative analyses of simulated and real GBS data with GIbPSs and another program, pyRAD. This program accounts for indel variation by aligning homologous sequences. GIbPSs performed better than pyRAD in several aspects. It required much less computation time and displayed higher genotyping accuracy. GIbPSs retained smaller numbers of loci overall in analyses of real GBS data. It nevertheless delivered more complete genotype matrices with greater locus overlap between individuals and greater numbers of loci sampled in all individuals.
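The completeness measures named at the end of the abstract (share of called genotypes, locus overlap between individuals, loci sampled in all individuals) can be illustrated with a minimal Python sketch. This is not GIbPSs code or its database schema; the toy genotype matrix, the function names, and the use of `None` for a missing call are all assumptions made for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical toy data: individual -> locus -> genotype call (None = missing).
# This layout is an assumption for illustration, not the GIbPSs database format.
Genotypes = Dict[str, Dict[str, Optional[str]]]

genotypes: Genotypes = {
    "ind1": {"loc1": "A/A", "loc2": "A/G", "loc3": None},
    "ind2": {"loc1": "A/T", "loc2": None, "loc3": "C/C"},
    "ind3": {"loc1": "T/T", "loc2": "G/G", "loc3": "C/C"},
}
loci: List[str] = ["loc1", "loc2", "loc3"]


def matrix_completeness(genotypes: Genotypes, loci: List[str]) -> float:
    """Fraction of individual-by-locus cells that hold a genotype call."""
    total = len(genotypes) * len(loci)
    called = sum(
        1
        for calls in genotypes.values()
        for locus in loci
        if calls.get(locus) is not None
    )
    return called / total if total else 0.0


def loci_in_all_individuals(genotypes: Genotypes, loci: List[str]) -> List[str]:
    """Loci that have a genotype call in every individual."""
    return [
        locus
        for locus in loci
        if all(calls.get(locus) is not None for calls in genotypes.values())
    ]


def pairwise_locus_overlap(genotypes: Genotypes, a: str, b: str) -> int:
    """Number of loci genotyped in both individual a and individual b."""
    called_a = {l for l, g in genotypes[a].items() if g is not None}
    called_b = {l for l, g in genotypes[b].items() if g is not None}
    return len(called_a & called_b)


if __name__ == "__main__":
    print(f"completeness: {matrix_completeness(genotypes, loci):.3f}")
    print("loci in all individuals:", loci_in_all_individuals(genotypes, loci))
    print("overlap ind1/ind2:", pairwise_locus_overlap(genotypes, "ind1", "ind2"))
```

In this toy matrix, seven of nine cells are called and only `loc1` is sampled in all three individuals. A tool can therefore retain fewer loci overall and still deliver a more complete matrix, if the loci it keeps are genotyped in most individuals.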
year | journal | country | edition | language |
---|---|---|---|---|
2015-06-29 | Molecular Ecology Resources | | | |