Sequential Learning with LS-SVM for Large-Scale Data Sets

Daniel Polani, Tobias Jung

subject

Data stream, Support vector machine, Approximation error, Basis function, Sequence learning, Large scale data, Algorithm, Regularization (mathematics), Subspace topology, Mathematics

description

We present a subspace-based variant of LS-SVMs (i.e., regularization networks) that processes the data sequentially and is therefore especially suited to online learning tasks. The algorithm selects from the data set a small subset of basis functions, which is subsequently used to approximate the full kernel at arbitrary points. This subset is identified online from the data stream. We improve upon existing approaches (in particular the kernel recursive least squares algorithm) by proposing a new, supervised criterion for selecting the relevant basis functions; it takes into account both the error incurred by approximating the kernel and the reduction of the cost in the original learning task. We use the large-scale data set 'forest' to compare the performance and efficiency of our algorithm with greedy batch selection of the basis functions via orthogonal least squares. Using the same number of basis functions, we achieve comparable error rates at much lower cost in both CPU time and memory.
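
To make the idea of online basis selection concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a Gaussian kernel and uses only the unsupervised kernel approximation error test familiar from kernel recursive least squares; the supervised criterion proposed in the paper is not reproduced here. For simplicity the weights are fitted in a separate step once the basis is chosen, rather than updated sequentially. All function names and parameters (`rbf`, `select_basis`, `fit_subspace`, `predict`, `tol`, `lam`, `gamma`) are hypothetical.

```python
import numpy as np


def rbf(a, b, gamma=1.0):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    d2 = (a ** 2).sum(axis=1)[:, None] + (b ** 2).sum(axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-gamma * np.maximum(d2, 0.0))


def select_basis(X, tol=1e-3, gamma=1.0):
    """Single pass over the stream X; returns indices of the retained basis points.

    A point is added when the squared error of approximating its kernel
    feature by the current basis, delta = k(x,x) - k^T K^{-1} k, exceeds tol.
    (Assumption: unsupervised test only, as in kernel recursive least squares.)
    """
    idx = [0]
    K_inv = np.array([[1.0]])  # k(x, x) = 1 for the Gaussian kernel
    for t in range(1, len(X)):
        k = rbf(X[idx], X[t:t + 1], gamma)[:, 0]
        a = K_inv @ k
        delta = 1.0 - k @ a
        if delta > tol:
            # Grow the inverse kernel matrix with the block-inverse formula.
            m = len(idx)
            grown = np.empty((m + 1, m + 1))
            grown[:m, :m] = K_inv + np.outer(a, a) / delta
            grown[:m, m] = -a / delta
            grown[m, :m] = -a / delta
            grown[m, m] = 1.0 / delta
            K_inv = grown
            idx.append(t)
    return np.array(idx)


def fit_subspace(X, y, idx, lam=1e-3, gamma=1.0):
    """Regularized least squares restricted to the span of the selected basis."""
    B = X[idx]
    K_nm = rbf(X, B, gamma)
    K_mm = rbf(B, B, gamma)
    return np.linalg.solve(K_nm.T @ K_nm + lam * K_mm, K_nm.T @ y)


def predict(X_new, B, w, gamma=1.0):
    """Evaluate the fitted model at new points using basis points B."""
    return rbf(X_new, B, gamma) @ w


# Toy usage on synthetic data (not the 'forest' data set).
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(400, 1))
y = np.sin(X[:, 0]) + 0.05 * rng.standard_normal(400)
idx = select_basis(X)
w = fit_subspace(X, y, idx)
print(len(idx), "basis functions selected out of", len(X), "points")
```

In this sketch the threshold `tol` controls how many basis functions are kept: a larger value yields a smaller subset and a coarser kernel approximation. Memory and per-step cost depend on the number of retained basis functions rather than on the number of points seen so far, which is what makes this family of methods attractive for data streams.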

https://doi.org/10.1007/11840930_39