
RESEARCH PRODUCT

German-Wide Interlaboratory Study Compares Consistency, Accuracy and Reproducibility of Whole-Genome Short Read Sequencing.

Carlus Deneke, Sylvia Kleta, Simon H. Tausch, Burkhard Malorny, Kerstin Stingl, Anne Wöhlke, Thomas Hankeln, Kathrin Szabo, Erik Brinks, Maria Borowiak, Laura Uelze, Markus Bönn, Larissa Murr

subject

Microbiology (medical); Whole genome sequencing; 0303 health sciences; Reproducibility; interlaboratory study; 030306 microbiology; lcsh:QR1-502; Computational biology; Ion semiconductor sequencing; Biology; Microbiology; Genome; ion torrent; lcsh:Microbiology; 03 medical and health sciences; food safety; Consistency (statistics); whole-genome sequencing; Data quality; illumina; Base calling; Illumina dye sequencing; 030304 developmental biology

description

We compared the consistency, accuracy and reproducibility of next-generation short read sequencing between ten laboratories involved in food safety (research institutes, state laboratories, universities and companies) from Germany and Austria. Participants were asked to sequence six DNA samples of three bacterial species (Campylobacter jejuni, Listeria monocytogenes and Salmonella enterica) in duplicate, according to their routine in-house sequencing protocol. Four different types of Illumina sequencing platforms (MiSeq, NextSeq, iSeq, NovaSeq) and one Ion Torrent sequencing instrument (S5) were involved in the study. Sequence quality parameters were determined for all data sets and compared centrally between laboratories. SNP and cgMLST calling were performed to assess the reproducibility of sequence data collected for individual samples. Overall, we found Illumina short read data to be more accurate (higher base calling accuracy, fewer misassemblies) and more consistent (little variability between independent sequencing runs within a laboratory) than Ion Torrent sequence data, with little variation between the different Illumina instruments. Two laboratories with Illumina instruments submitted lower-quality sequence data, probably because they used a library preparation kit that has difficulty sequencing low-GC genome regions. Differences in data quality became more evident after the short reads were assembled into genomes, with Ion Torrent assemblies showing a large number of allele differences relative to Illumina assemblies. Clonality of samples was confirmed through SNP calling, which proved to be a more suitable method than cgMLST calling for an integrated analysis of Illumina and Ion Torrent data sets in this study.

10.3389/fmicb.2020.573972
https://pubmed.ncbi.nlm.nih.gov/33013811