
RESEARCH PRODUCT

PVAmpliconFinder: a workflow for the identification of human papillomaviruses from high-throughput amplicon sequencing

Magali Olivier, Sankhadeep Dutta, Massimo Tommasino, Tarik Gheit, Adam Grundhoff, Marcis Leja, Dana E. Rollison, Nicole Fischer, Alexis Robitaille, Rosario Nicola Brancaccio

subject

Computer science; Computational biology; lcsh:Computer applications to medicine. Medical informatics; Biochemistry; Workflow; User-Computer Interface; 03 medical and health sciences; Structural Biology; Humans; Virus discovery; lcsh:QH301-705.5; Papillomaviridae; Molecular Biology; Throughput (business); Phylogeny; Amplicon sequencing; 030304 developmental biology; Sanger sequencing; 0303 health sciences; Biological data; 030306 microbiology; Methodology Article; Applied Mathematics; High-Throughput Nucleotide Sequencing; Papillomavirus; Amplicon; Computer Science Applications; Identification (information); lcsh:Biology (General); Metagenomics; DNA, Viral; lcsh:R858-859.7; Primer (molecular biology); DNA microarray

description

Abstract

Background: The detection of known human papillomaviruses (PVs) from targeted wet-lab approaches has traditionally used PCR-based methods coupled with Sanger sequencing. With the introduction of next-generation sequencing (NGS), these approaches can be revisited to integrate the sequencing power of NGS. Although computational tools have been developed for metagenomic approaches to search for known or novel viruses in NGS data, no appropriate tool is available for the classification and identification of novel viral sequences from data produced by amplicon-based methods.

Results: We have developed PVAmpliconFinder, a data analysis workflow designed to rapidly identify and classify known and potentially new Papillomaviridae sequences from NGS amplicon sequencing with degenerate PV primers. Here, we describe the features of PVAmpliconFinder and its implementation using biological data obtained from amplicon sequencing of human skin swab specimens and oral rinses from healthy individuals.

Conclusions: PVAmpliconFinder identified putative new HPV sequences, including one that was validated by wet-lab experiments. PVAmpliconFinder can be easily modified and applied to other viral families. PVAmpliconFinder addresses a gap by providing a solution for the analysis of NGS amplicon sequencing, increasingly used in clinical research. The PVAmpliconFinder workflow, along with its source code, is freely available on the GitHub platform: https://github.com/IARCbioinfo/PVAmpliconFinder.

https://doi.org/10.1186/s12859-020-03573-8