Q-nexus: a comprehensive and efficient analysis pipeline designed for ChIP-nexus


RESEARCH PRODUCT


Benjamin S. Menkuec, Peter Hansen, Peter N. Robinson, Jonas Ibn-salem, Jochen Hecht, Matthias Truss, Sebastian Roskosch

subject

0301 basic medicine; FOS: Computer and information sciences; Duplication rates; Chromatin Immunoprecipitation; Bioinformatics; Pipeline (computing); 610 Biology; 600 Technology, medicine, applied sciences::610 Medicine and health; 03 medical and health sciences; Software; ChIP-nexus; Genetics; Preprocessor; Nucleotide Motifs; Library complexity; ChIP-exo; Protocol (science); Binding Sites; fungi; Computational Biology; High-Throughput Nucleotide Sequencing; Reproducibility of Results; Chip; Data mapping; DNA-Binding Proteins; Algorithm; 030104 developmental biology; Data mining; Peak calling; Algorithms; Protein Binding; Transcription Factors; Research Article; Biotechnology

description

Background: ChIP-nexus, an extension of the ChIP-exo protocol, maps the borders of protein-bound DNA sequences at nucleotide resolution, requires less input DNA, and enables selective removal of PCR duplicates using random barcodes. However, the use of random barcodes requires additional preprocessing of the mapping data, which complicates the computational analysis. To date, only a very limited number of software packages are available for the analysis of ChIP-exo data, and these have not yet been systematically tested and compared on ChIP-nexus data.

Results: Here, we present a comprehensive software package for ChIP-nexus data that exploits the random barcodes for selective removal of PCR duplicates and for quality control. Furthermore, we developed bespoke methods to estimate the width of the protected region resulting from protein-DNA binding and to infer binding positions from ChIP-nexus data. Finally, we applied our peak calling method, as well as two other methods, MACE and MACS2, to the available ChIP-nexus data.

Conclusions: The Q-nexus software is efficient and easy to use. It calculates novel statistics on duplication rates that take the random barcodes into account. Our method for estimating the width of the protected region yields unbiased signatures that are highly reproducible across biological replicates and, at the same time, very specific for the respective factors analyzed. As judged by the irreproducible discovery rate (IDR), our peak calling algorithm shows substantially better reproducibility. An implementation of Q-nexus is available at http://charite.github.io/Q/.

This project was supported by the Bundesministerium für Bildung und Forschung (BMBF; project nos. 0313911 and 13GW0099) and the European Community’s Seventh Framework Programme (grant agreement no. 602300; SYBIL). Furthermore, we acknowledge support of the Spanish Ministry of Economy and Competitiveness, ‘Centro de Excelencia Severo Ochoa 2013-2017’.
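
The idea of barcode-aware duplicate removal mentioned in the description can be illustrated with a minimal Python sketch. It is not the Q-nexus implementation: the `Read` record, its field names, and the two duplication-rate definitions are assumptions made for this example. The sketch treats reads as PCR duplicates only when they share chromosome, strand, 5' position, and random barcode, so that reads at the same position with different barcodes are kept as independent fragments.

```python
from collections import namedtuple

# Hypothetical read record for illustration; not a Q-nexus data structure.
Read = namedtuple("Read", ["chrom", "strand", "pos", "barcode"])


def remove_pcr_duplicates(reads):
    """Keep the first read for each (chrom, strand, pos, barcode) combination.

    Reads at the same position with different random barcodes are kept,
    because they are assumed to come from distinct original fragments.
    """
    seen = set()
    kept = []
    for read in reads:
        key = (read.chrom, read.strand, read.pos, read.barcode)
        if key not in seen:
            seen.add(key)
            kept.append(read)
    return kept


def duplication_rates(reads):
    """Return (position-only rate, barcode-aware rate) as fractions of reads.

    Assumed definitions for this sketch:
      position-only rate = 1 - unique (chrom, strand, pos) / total reads
      barcode-aware rate = 1 - unique (chrom, strand, pos, barcode) / total reads
    """
    total = len(reads)
    if total == 0:
        return 0.0, 0.0
    unique_positions = {(r.chrom, r.strand, r.pos) for r in reads}
    unique_with_barcode = {(r.chrom, r.strand, r.pos, r.barcode) for r in reads}
    return (
        1 - len(unique_positions) / total,
        1 - len(unique_with_barcode) / total,
    )
```

Comparing the two rates shows how many reads a position-only filter would discard even though their barcodes mark them as distinct fragments.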

year	journal
2016-01-01	BMC Genomics

10.1186/s12864-016-3164-6 http://dx.doi.org/10.1186/s12864-016-3164-6