RESEARCH PRODUCT
Computational Chromosome Conformation Capture by Correlation of ChIP-seq at CTCF motifs
Jonas Ibn-Salem, Miguel A. Andrade-Navarro
subject
Physics, Chromosome conformation capture, CTCF, genetic processes, natural sciences, Human genome, Promoter, Computational biology, Binding site, Sequence motif, Transcription factor, Chromatin
description
Background: Transcription factors (TFs) bind to gene promoters or to distal regulatory elements that interact with the promoter via chromatin looping. While the TF binding sites themselves are detected genome-wide by ChIP-seq experiments, it is difficult to associate them with regulated genes without information on chromatin looping. Recent experimental techniques such as Hi-C or ChIA-PET measure long-range interactions genome-wide but are experimentally elaborate and have limited resolution. Here, we present Computational Chromosome Conformation Capture by Correlation of ChIP-seq at CTCF motifs (7C).
Results: While ChIP-seq was not designed to detect contacts, the formaldehyde treatment in the ChIP-seq protocol cross-links proteins with each other and with DNA. Consequently, regions that are not directly bound by the targeted TF but interact with the binding site via chromatin looping are also co-immunoprecipitated and sequenced. This results in minor ChIP-seq signals at loop anchor regions when in close proximity to the directly bound site. Since many chromatin loop anchors are characterized by CTCF sequence motifs in convergent orientation, we can use the position and shape of ChIP-seq signals around CTCF motif pairs, together with motif orientation and genomic distance, to predict whether they interact or not. We applied 7C to all CTCF motif pairs within 1 Mb in the human genome and validated predicted interactions with high-resolution Hi-C and ChIA-PET. Known architectural proteins (CTCF, Rad21, Znf143) show the best prediction performance, but other TFs, such as TRIM22 or RUNX3, also predict loops accurately from a single ChIP-seq experiment. Importantly, the 7C model is general enough to predict loops in different cell types and for TF ChIP-seq datasets that were not used in training.
Conclusion: 7C predicts chromatin loops with base-pair resolution and can be used to associate TF binding sites with regulated genes in a condition-specific manner. Furthermore, profiling of hundreds of ChIP-seq datasets yields novel candidate factors functionally involved in chromatin looping. Our method is implemented as an R package and is publicly available: https://github.com/ibn-salem/sevenC.
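The features named in the abstract (similarity of ChIP-seq signal around two CTCF motifs, motif orientation, and genomic distance) can be illustrated with a minimal sketch. This is not the sevenC R package or its API: it is written in Python for illustration only, and the names `Motif`, `candidate_pairs`, `is_convergent`, and `loop_score`, the logistic form of the score, and the weights are all hypothetical assumptions rather than values from the publication.

```python
import math
from itertools import combinations
from typing import NamedTuple


class Motif(NamedTuple):
    chrom: str
    pos: int      # motif center in bp
    strand: str   # "+" or "-"


def pearson(x, y):
    """Pearson correlation of two equal-length signal vectors (0.0 if either is flat)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def candidate_pairs(motifs, max_dist=1_000_000):
    """Yield same-chromosome motif pairs at most max_dist apart, upstream motif first."""
    ordered = sorted(motifs, key=lambda m: (m.chrom, m.pos))
    for a, b in combinations(ordered, 2):
        if a.chrom == b.chrom and b.pos - a.pos <= max_dist:
            yield a, b


def is_convergent(a, b):
    """True when the upstream motif is on '+' and the downstream motif on '-'."""
    return a.strand == "+" and b.strand == "-"


def loop_score(a, b, cov_a, cov_b, weights=(-4.0, 5.0, 2.0, -0.5)):
    """Combine the three features into a score in (0, 1).

    ASSUMPTION: a logistic function with made-up placeholder weights;
    a real model would be fitted against Hi-C or ChIA-PET loops.
    """
    w0, w_cor, w_conv, w_dist = weights
    dist = max(1, b.pos - a.pos)  # guard against log10(0)
    z = (w0
         + w_cor * pearson(cov_a, cov_b)
         + w_conv * (1.0 if is_convergent(a, b) else 0.0)
         + w_dist * math.log10(dist))
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    # Toy coverage windows centered on each motif (hypothetical numbers).
    coverage = {
        Motif("chr1", 100_000, "+"): [0, 1, 5, 1, 0],
        Motif("chr1", 600_000, "-"): [0, 2, 9, 2, 0],
        Motif("chr1", 2_000_000, "-"): [3, 3, 3, 3, 3],
    }
    for a, b in candidate_pairs(coverage):
        print(a, b, round(loop_score(a, b, coverage[a], coverage[b]), 3))
```

In this sketch the correlation term rewards motif pairs whose surrounding coverage profiles have a similar shape, which is the intuition behind detecting the minor signal at the partner anchor; how the signal windows are extracted and how the model is trained are left out.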
year | journal | country | edition | language |
---|---|---|---|---|
2018-02-01 | | | | |