RESEARCH PRODUCT
Alignment Free Dissimilarities for Nucleosome Classification
Giosuè Lo Bosco
subject
0301 basic medicine; Nearest neighbour classifiers; Knn classifier; Settore INF/01 - Informatica; 030102 biochemistry & molecular biology; biology; Computer science; Speech recognition; Epigenetic; Context (language use); Computational biology; L-tuples; 03 medical and health sciences; 030104 developmental biology; Histone; Similarity (network science); DNA methylation; biology.protein; Nucleosome; Epigenetics; Alignment free DNA sequence dissimilarities; k-mers; Nucleosome classification; Epigenomics
description
Epigenetic mechanisms such as nucleosome positioning, histone modifications and DNA methylation play an important role in the regulation of cell type-specific gene activities, yet how epigenetic patterns are established and maintained remains poorly understood. Recent studies have shown that DNA sequences play a role in the recruitment of epigenetic regulators. For this reason, more suitable similarity or dissimilarity measures between DNA sequences could help in epigenetic studies. In particular, alignment-free dissimilarities have already been successfully applied to identify distinct sequence features that are associated with epigenetic patterns and to predict epigenomic profiles. In this work, we focus on the problem of nucleosome classification, providing a benchmark study of 6 alignment-free dissimilarity measures between sequences, belonging to the geometric-based, correlation-based, information-based and compression-based categories. They are compared against an alignment-based dissimilarity by measuring the performance of several nearest neighbour classifiers, each of which incorporates one of the considered dissimilarities. Results computed on three datasets of nucleosome forming and inhibiting sequences show that, among the alignment-free dissimilarities, the geometric-based and correlation-based ones are the most suitable for nucleosome classification, making them a more efficient alternative to the alignment-based similarity measures, which nevertheless remain the preferred choice when dealing with sequence similarity measurements.
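The general idea of a nearest neighbour classifier driven by an alignment-free dissimilarity can be sketched as follows. This is a minimal illustration, not the benchmark described above: it assumes k-mer relative frequency vectors as the sequence representation, the Euclidean distance as an example of a geometric-based measure, and one minus the Pearson correlation as an example of a correlation-based measure. The function names, the toy sequences and their labels are all hypothetical and were chosen only to make the example runnable.

```python
from collections import Counter
from itertools import product
import math

ALPHABET = "ACGT"


def kmer_profile(seq, k=3):
    """Relative frequencies of all 4**k k-mers over ACGT, in lexicographic order.

    k-mers containing other symbols (e.g. N) are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


def euclidean(u, v):
    """Geometric-based dissimilarity (assumed example): Euclidean distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def pearson_dissimilarity(u, v):
    """Correlation-based dissimilarity (assumed example): 1 - Pearson correlation."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    if su == 0 or sv == 0:
        return 1.0  # correlation undefined for a constant vector; treat as uncorrelated
    return 1.0 - cov / (su * sv)


def knn_predict(query, training, k_neighbours=1, k=3, dissimilarity=euclidean):
    """Label a query sequence by majority vote among its nearest training sequences."""
    q = kmer_profile(query, k)
    ranked = sorted(
        training,
        key=lambda item: dissimilarity(q, kmer_profile(item[0], k)),
    )
    votes = Counter(label for _, label in ranked[:k_neighbours])
    return votes.most_common(1)[0][0]


# Toy data: sequences and labels are made up for illustration only.
training = [
    ("AAAAAAAATTTTTTTT", "inhibiting"),
    ("GCGCGCGCGGCCGGCC", "forming"),
]

if __name__ == "__main__":
    print(knn_predict("AAAATTTTAAAATTTT", training, k=2))
    print(knn_predict("AAAATTTTAAAATTTT", training, k=2,
                      dissimilarity=pearson_dissimilarity))
```

Because the dissimilarity is passed in as a function, the same classifier can be rerun with a different measure without touching the rest of the code, which mirrors the comparison setup described in the abstract.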
year | journal | country | edition | language |
---|---|---|---|---|
2016-01-01 | | | | |