RESEARCH PRODUCT

MLML2R: an R package for maximum likelihood estimation of DNA methylation and hydroxymethylation proportions.

José D. Bermúdez, Arce Domingo-Relloso, Maria Tellez-Plaza, Samara Kiihl, Maria Jose Martinez-Garrido

subject

Statistics and Probability; DNA Hydroxymethylation; Epigenomics; Iterative method; Maximum likelihood; 03 medical and health sciences; 0302 clinical medicine; Genetics; Humans; Molecular Biology; 030304 developmental biology; Mathematics; 0303 health sciences; Likelihood Functions; Computational Biology; High-Throughput Nucleotide Sequencing; Probability and statistics; DNA Methylation; Computational Mathematics; R package; Lagrange multiplier; Iterative approximation; Algorithm; 030217 neurology & neurosurgery

description

Abstract: Accurately measuring epigenetic marks such as 5-methylcytosine (5-mC) and 5-hydroxymethylcytosine (5-hmC) at the single-nucleotide level requires combining data from several DNA processing methods, including traditional (BS), oxidative (oxBS), and Tet-assisted (TAB) bisulfite conversion. We introduce the R package MLML2R, which provides maximum likelihood estimates (MLEs) of 5-mC and 5-hmC proportions. While all other available R packages provide 5-mC and 5-hmC MLEs only for the oxBS+BS combination, MLML2R also provides MLEs for combinations involving TAB. For combinations of any two of the methods, we derived the exact constrained MLE given by the pool-adjacent-violators algorithm (PAVA) in analytical form. For the combination of all three methods, we implemented both the iterative method of Qu et al. [Qu, J., M. Zhou, Q. Song, E. E. Hong and A. D. Smith (2013): "MLML: consistent simultaneous estimates of DNA methylation and hydroxymethylation," Bioinformatics, 29, 2645–2646.] and a novel non-iterative approximation using Lagrange multipliers. The newly proposed non-iterative solutions greatly decrease computation time, a common bottleneck when processing high-throughput data. The MLML2R package is flexible: it accepts as input both preprocessed intensities from Infinium Methylation arrays and counts from Next Generation Sequencing technologies. The MLML2R package is freely available at https://CRAN.R-project.org/package=MLML2R.
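
To illustrate the two-method case described in the abstract, the following is a minimal Python sketch of a closed-form constrained estimate for the BS+oxBS combination at a single site. It is not the MLML2R implementation. It assumes a simple binomial model in which the BS methylated fraction estimates the sum of the 5-mC and 5-hmC proportions and the oxBS methylated fraction estimates the 5-mC proportion alone. The function name and argument names are hypothetical. When the two unconstrained estimates would imply a negative 5-hmC proportion, the sketch pools the two experiments, which is the PAVA step for two binomial proportions.

```python
def constrained_mle_bs_oxbs(bs_meth, bs_unmeth, oxbs_meth, oxbs_unmeth):
    """Return (p_mC, p_hmC, p_C) for one site from BS and oxBS counts.

    Assumed model (an illustration, not taken from the package source):
      bs_meth   ~ Binomial(n_bs, p_mC + p_hmC)
      oxbs_meth ~ Binomial(n_ox, p_mC)
    subject to p_hmC >= 0.
    """
    n_bs = bs_meth + bs_unmeth
    n_ox = oxbs_meth + oxbs_unmeth
    if n_bs == 0 or n_ox == 0:
        raise ValueError("each experiment needs at least one read")

    a = bs_meth / n_bs    # unconstrained estimate of p_mC + p_hmC
    b = oxbs_meth / n_ox  # unconstrained estimate of p_mC

    if b <= a:
        # The order constraint already holds: keep the unconstrained estimates.
        p_mc, p_hmc = b, a - b
    else:
        # Constraint violated: pool both experiments (PAVA step), so p_hmC = 0.
        pooled = (bs_meth + oxbs_meth) / (n_bs + n_ox)
        p_mc, p_hmc = pooled, 0.0

    p_c = 1.0 - p_mc - p_hmc  # unmodified cytosine proportion
    return p_mc, p_hmc, p_c
```

For example, with 80 methylated and 20 unmethylated BS reads and 50 methylated and 50 unmethylated oxBS reads, the constraint holds and the sketch returns roughly 0.5 for 5-mC, 0.3 for 5-hmC and 0.2 for unmodified cytosine. Because the estimate is available in closed form, no iteration is needed per site, which is the source of the speed-up the abstract refers to.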

10.1515/sagmb-2018-0031
https://pubmed.ncbi.nlm.nih.gov/30653470