This paper presents the use of Support Vector Machines (SVMs) for prediction and analysis of antisense oligonucleotide (AO) efficacy. The collected database comprises 315 AO molecules, each described by 68 features, which yields a problem well suited to SVMs. Feature selection is crucial given the presence of noisy or redundant features and the well-known curse of dimensionality. We propose a two-stage strategy to develop an optimal model: (1) feature selection using correlation analysis, mutual information, and SVM-based recursive feature elimination (SVM-RFE), and (2) AO efficacy prediction using standard and profiled SVM formulations. A profiled SVM gives different weights to …
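As an illustration of the first stage only, the correlation-analysis filter can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation used in this work: the data are synthetic (only the sizes 315 and 68 are taken from the text), the helper names `pearson`, `rank_features_by_correlation`, and `make_synthetic_data` are hypothetical, and ranking by absolute Pearson correlation with the efficacy value is one plausible reading of "correlation analysis". The mutual information and SVM-RFE steps are not shown.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def rank_features_by_correlation(rows, target, top_k):
    """Return the indices of the top_k features by |Pearson r| with the target."""
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        scores.append((abs(pearson(column, target)), j))
    # Highest absolute correlation first; ties broken by feature index.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [j for _, j in scores[:top_k]]


def make_synthetic_data(n_samples=315, n_features=68, seed=0):
    """Synthetic stand-in data (assumption): only features 0 and 1 drive the target."""
    rng = random.Random(seed)
    rows = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_samples)]
    target = [2.0 * r[0] - 1.5 * r[1] + rng.gauss(0, 0.1) for r in rows]
    return rows, target


rows, target = make_synthetic_data()
selected = rank_features_by_correlation(rows, target, top_k=5)
print(selected)
```

A filter of this kind scores each feature independently, so it cannot detect redundancy between features; that limitation is one reason to combine it with a wrapper method such as SVM-RFE.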
Bioinformatics and Computational Biology
Bioinformatics is a new, rapidly expanding field that uses computational approaches to answer biological questions (Baxevanis, 2005). These questions are answered by analyzing and mining biological data. The field of bioinformatics or computational biology is a multidisciplinary research and development environment, in which a variety of techniques from computer science, applied mathematics, linguistics, physics, and statistics are used. The terms bioinformatics and computational biology are often used interchangeably (Baldi, 1998; Pevzner, 2000). This new area of research is driven by the wealth of data from high-throughput genome projects, such as the human genome sequencing pro…