
RESEARCH PRODUCT

Integrating genomic binding site predictions using real-valued meta classifiers

Rene Te Boekhorst, Neil Davey, Alistair G. Rust, Mark Robinson, Rod Adams, Yi Sun

subject

Artificial neural network, Computer science, Machine learning, DNA binding site, Support vector machine, Artificial intelligence, AdaBoost, Precision and recall, Classifier (UML), Software

description

Currently, even the best algorithms for predicting transcription factor binding sites in DNA sequences are severely limited in accuracy. There is good reason to believe that combining predictions from different classes of algorithms could improve prediction quality. In this paper, we apply single-layer networks, rule sets, support vector machines and the AdaBoost algorithm to the predictions of 12 key real-valued algorithms. Furthermore, we use a ‘window’ of consecutive results as the input vector, so that each prediction is classified in the context of its neighbouring results. We improve the classification results further with under- and over-sampling techniques. We find that support vector machines and the AdaBoost algorithm outperform both the original individual algorithms and the other classifiers employed in this work. In particular, they give a better trade-off between recall and precision.
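
The windowing and resampling steps mentioned in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `window_features` and `rebalance`, the zero-padding at the sequence ends, and the use of random discarding and plain duplication for under- and over-sampling are all assumptions made for illustration. The description does not state which padding or sampling schemes were actually used.

```python
import random


def window_features(scores, width):
    """Build one input vector per sequence position by concatenating the
    predictor scores of `width` consecutive positions centred on it.

    `scores` is a list of rows, one row per position, each row holding the
    real-valued outputs of the base algorithms (12 in the paper).
    """
    if width % 2 == 0:
        raise ValueError("width must be odd so the window has a centre")
    half = width // 2
    n_algos = len(scores[0])
    # Assumption: positions beyond the sequence ends are padded with zeros.
    pad = [[0.0] * n_algos for _ in range(half)]
    padded = pad + [list(row) for row in scores] + pad
    return [
        [value for row in padded[i:i + width] for value in row]
        for i in range(len(scores))
    ]


def rebalance(vectors, labels, neg_keep, pos_copies, seed=0):
    """Under-sample negatives and over-sample positives.

    Assumption: negatives (label 0) are kept with probability `neg_keep`,
    and each positive (label 1) is duplicated `pos_copies` times.
    """
    rng = random.Random(seed)
    out = []
    for vec, label in zip(vectors, labels):
        if label == 1:
            out.extend([(vec, 1)] * pos_copies)
        elif rng.random() < neg_keep:
            out.append((vec, 0))
    rng.shuffle(out)
    return out


if __name__ == "__main__":
    # Toy data: 3 positions scored by 2 hypothetical base algorithms.
    toy_scores = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
    toy_labels = [0, 1, 0]
    vectors = window_features(toy_scores, width=3)
    training_set = rebalance(vectors, toy_labels, neg_keep=0.5, pos_copies=2)
    print(len(vectors[0]), len(training_set))
```

With 12 base algorithms and a window of width w, each input vector has 12 × w components. The rebalanced pairs would then be passed to whichever meta classifier is being trained, such as a support vector machine or AdaBoost.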

https://doi.org/10.1007/s00521-008-0204-4