RESEARCH PRODUCT

SNVSniffer: An integrated caller for germline and somatic SNVs based on Bayesian models

Martin Loewer, Yongchao Liu, Srinivas Aluru, Bertil Schmidt

subject

Genetics, Somatic cell, Bayesian probability, SNP, Multinomial distribution, Single-nucleotide polymorphism, Conditional probability distribution, Biology, Genome, Germline

description

The discovery of single nucleotide variants (SNVs) from next-generation sequencing (NGS) data typically works by aligning reads to a given genome and then building an alignment map from which the presence of SNVs is inferred. Various approaches have been developed to call either germline SNVs (or SNPs) in normal cells or somatic SNVs in cancer/tumor cells. Nonetheless, efficient callers that handle both germline and somatic SNVs have not yet been extensively investigated. In this paper, we present SNVSniffer, an integrated caller for germline and somatic SNVs from NGS data based on Bayesian probabilistic models. In SNVSniffer, our germline SNV calling models allele counts per site as a multinomial conditional distribution. Meanwhile, our somatic SNV calling relies on NGS tumor-normal sample pairs and introduces a hybrid approach that combines a subtraction approach with a joint sample analysis, which models tumor-normal allele counts per site as a joint multinomial conditional distribution. Moreover, we investigate a lightweight tumor purity estimation approach, which demonstrates high accuracy on synthetic tumors. Compared to some leading SNP callers (SAMtools, GATK and FaSD) and somatic SNV callers (VarScan2, SomaticSniper, JointSNVMix2, MuTect), SNVSniffer demonstrates comparable or even better accuracy at faster speed. SNVSniffer, the synthetic tumor-normal data and the supplementary information are available at http://snvsniffer.sourceforge.net.
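
The idea of treating per-site allele counts as multinomial and scoring genotypes with Bayes' rule can be illustrated with a minimal sketch. This is not SNVSniffer's actual model: the function names, the single fixed sequencing error rate, and the uniform prior over the ten diploid genotypes are all assumptions made here for illustration only.

```python
import math
from itertools import combinations_with_replacement

BASES = "ACGT"

def base_probs(genotype, error_rate):
    """Per-read base probabilities for a diploid genotype (illustrative model).

    Assumes each allele is sampled with probability 0.5 and is read correctly
    with probability 1 - error_rate, otherwise as one of the 3 other bases.
    """
    probs = {}
    for base in BASES:
        p = 0.0
        for allele in genotype:
            p += 0.5 * ((1.0 - error_rate) if base == allele else error_rate / 3.0)
        probs[base] = p
    return probs

def log_multinomial(counts, probs):
    """Log multinomial probability of the observed allele counts."""
    n = sum(counts.values())
    log_coef = math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts.values())
    return log_coef + sum(c * math.log(probs[b]) for b, c in counts.items() if c > 0)

def genotype_posteriors(counts, error_rate=0.01, priors=None):
    """Posterior over the 10 diploid genotypes given allele counts at one site.

    `priors` is a hypothetical genotype -> probability mapping; a uniform
    prior is assumed when it is omitted. Requires 0 < error_rate < 1.
    """
    genotypes = list(combinations_with_replacement(BASES, 2))
    if priors is None:
        priors = {g: 1.0 / len(genotypes) for g in genotypes}
    log_post = {
        g: math.log(priors[g]) + log_multinomial(counts, base_probs(g, error_rate))
        for g in genotypes
    }
    # Normalize in log space to avoid underflow at high coverage.
    m = max(log_post.values())
    unnorm = {g: math.exp(v - m) for g, v in log_post.items()}
    total = sum(unnorm.values())
    return {g: v / total for g, v in unnorm.items()}

if __name__ == "__main__":
    site_counts = {"A": 12, "C": 0, "G": 11, "T": 1}
    post = genotype_posteriors(site_counts)
    best = max(post, key=post.get)
    print(best, round(post[best], 4))
```

A real caller would typically use per-base quality scores and a prior informed by the reference allele rather than the fixed error rate and uniform prior assumed above.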

https://doi.org/10.1109/bibm.2015.7359659