RESEARCH PRODUCT
A motif-independent metric for DNA sequence specificity
Bret Hanlon, Giosuè Lo Bosco, Guo-cheng Yuan, Luca Pinello
subject
Biology; lcsh:Computer applications to medicine. Medical informatics; DNA-binding protein; Genome; Biochemistry; DNA sequencing; Cell Line; 03 medical and health sciences; chemistry.chemical_compound; 0302 clinical medicine; Structural Biology; Humans; Transcription factor; Molecular Biology; lcsh:QH301-705.5; Sequence Specificity Epigenomics Bioinformatics; 030304 developmental biology; Epigenomics; Genetics; 0303 health sciences; Base Sequence; Settore INF/01 - Informatica; Genome Human; Applied Mathematics; Methodology Article; DNA; Computer Science Applications; DNA-Binding Proteins; chemistry; lcsh:Biology (General); lcsh:R858-859.7; Human genome; DNA microarray; 030217 neurology & neurosurgery; Algorithms; Software; Genome-Wide Association Study; Protein Binding; Transcription Factors
description
Abstract
Background: Genome-wide mapping of protein-DNA interactions has been widely used to investigate biological functions of the genome. An important question is to what extent such interactions are regulated at the DNA sequence level. However, current investigation is hampered by the lack of computational methods for systematically evaluating sequence specificity.
Results: We present a simple, unbiased quantitative measure for DNA sequence specificity called the Motif Independent Measure (MIM). By analyzing both simulated and real experimental data, we found that the MIM can be used to detect sequence specificity independent of the presence of transcription factor (TF) binding motifs. We also found that the level of specificity associated with H3K4me1 target sequences is highly cell-type specific and highest in embryonic stem (ES) cells. We predicted H3K4me1 target sequences by using the N-score model and found that the prediction accuracy is indeed high in ES cells. The software to compute the MIM is freely available at: https://github.com/lucapinello/mim.
Conclusions: Our method provides a unified framework for quantifying DNA sequence specificity and serves as a guide for the development of sequence-based prediction models.
year | journal | country | edition | language |
---|---|---|---|---|
2011-10-21 | BMC Bioinformatics | | | |