RESEARCH PRODUCT

A Feature Set Decomposition Method for the Construction of Multi-classifier Systems Trained with High-Dimensional Data

Yoisel Campos, Roberto Estrada, Francesc J. Ferri, Carlos Morell

subject

Clustering high-dimensional data, business.industry, Computer science, Pattern recognition, Information theory, computer.software_genre, Uncorrelated, Decomposition method (queueing theory), Data mining, Artificial intelligence, business, Feature set, computer, Classifier (UML), Curse of dimensionality

description

Data mining for the discovery of novel, useful patterns encounters obstacles when dealing with high-dimensional datasets, obstacles that have been documented as the "curse" of dimensionality. One strategy for dealing with this issue is to decompose the input feature set and build a multi-classifier system. Standalone decomposition methods are rare and generally based on random selection. We propose a decomposition method that uses information-theoretic tools to arrange input features into uncorrelated and relevant subsets. Experimental results show that this approach significantly outperforms three baseline decomposition methods in terms of classification accuracy.
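
The idea of arranging features into relevant, mutually uncorrelated subsets can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify. It assumes discrete features, uses mutual information as the information-theoretic measure, and applies a hypothetical greedy rule. The names `mutual_information`, `decompose`, `columns`, and `n_subsets` are illustrative only.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def decompose(columns, labels, n_subsets):
    """Greedily split feature indices into n_subsets groups.

    Assumed heuristic (not taken from the paper): visit features from
    most to least relevant (mutual information with the labels) and
    place each one in the subset whose current members share the least
    information with it, preferring the smaller subset on ties.
    """
    order = sorted(
        range(len(columns)),
        key=lambda j: mutual_information(columns[j], labels),
        reverse=True,
    )
    subsets = [[] for _ in range(n_subsets)]
    for j in order:
        def cost(subset):
            redundancy = sum(
                mutual_information(columns[j], columns[i]) for i in subset
            )
            return (redundancy, len(subset))

        min(subsets, key=cost).append(j)
    return subsets


# Toy data: features 0 and 1 duplicate the labels, features 2 and 3
# duplicate each other and carry no information about the labels.
labels = [0, 0, 1, 1, 0, 0, 1, 1]
columns = [
    [0, 0, 1, 1, 0, 0, 1, 1],
    [0, 0, 1, 1, 0, 0, 1, 1],
    [0, 1, 0, 1, 0, 1, 0, 1],
    [0, 1, 0, 1, 0, 1, 0, 1],
]
subsets = decompose(columns, labels, n_subsets=2)
print(subsets)
```

On this toy data the two redundant copies of each feature land in different subsets, so each subset could train one member of a multi-classifier system without duplicated inputs. Continuous features would need discretization or a different estimator before this sketch applies.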

https://doi.org/10.1007/978-3-642-41822-8_35