RESEARCH PRODUCT

Ensemble Feature Selection Based on Contextual Merit and Correlation Heuristics

Seppo Puuronen, Iryna Skrypnyk, Alexey Tsymbal

subject

Training set; business.industry; Computer science; Feature selection; Pattern recognition; Base (topology); Machine learning; computer.software_genre; Expert system; Random subspace method; ComputingMethodologies_PATTERNRECOGNITION; Ensembles of classifiers; Feature (machine learning); Artificial intelligence; business; Heuristics; computer; Cascading classifiers

description

Recent research has proven the benefits of using ensembles of classifiers for classification problems. Ensembles of diverse and accurate base classifiers are constructed by machine learning methods that manipulate the training sets. One way to manipulate the training set is to use feature selection heuristics to generate the base classifiers. In this paper we examine two such heuristics: correlation-based and contextual-merit-based. Both rely on quite similar assumptions concerning heterogeneous classification problems. Experiments are conducted on several data sets from the UCI Repository. We construct a fixed number of base classifiers over selected feature subsets and refine the ensemble iteratively, promoting diversity of the base classifiers and relying on global accuracy growth. According to the experimental results, the contextual-merit-based ensemble outperforms the correlation-based ensemble as well as C4.5. The correlation-based ensemble produces more diverse and simpler base classifiers, and the iterations promoting diversity have a less evident effect on it than on the contextual-merit-based ensemble.
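
The general scheme described above (a fixed number of base classifiers over feature subsets, refined iteratively for diversity without losing ensemble accuracy) might be sketched as follows. This is only an illustration under stated assumptions, not the authors' algorithm: the initial subsets are drawn at random rather than by the correlation-based or contextual merit heuristics, the base learner is a nearest-centroid classifier rather than a decision tree, and the acceptance rule (keep a one-feature change only if mean pairwise disagreement rises and validation accuracy does not fall) is a hypothetical stand-in. All function names are invented for the example.

```python
import random
from collections import Counter
from itertools import combinations

def fit(X, y, feats):
    """Nearest-centroid base classifier restricted to the features in `feats`."""
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(feats))
        for j, f in enumerate(feats):
            acc[j] += row[f]
    return {c: [s / counts[c] for s in acc] for c, acc in sums.items()}

def predict(model, feats, row):
    """Return the class whose centroid is closest to `row` on `feats`."""
    return min(model, key=lambda c: sum((row[f] - m) ** 2
                                        for f, m in zip(feats, model[c])))

def ensemble_predict(models, subsets, row):
    """Unweighted majority vote over the base classifiers."""
    votes = Counter(predict(m, s, row) for m, s in zip(models, subsets))
    return votes.most_common(1)[0][0]

def accuracy(models, subsets, X, y):
    hits = sum(ensemble_predict(models, subsets, r) == t for r, t in zip(X, y))
    return hits / len(y)

def diversity(models, subsets, X):
    """Mean pairwise disagreement rate between base classifiers (assumed measure)."""
    preds = [[predict(m, s, r) for r in X] for m, s in zip(models, subsets)]
    pairs = list(combinations(preds, 2))
    if not pairs:
        return 0.0
    return sum(sum(a != b for a, b in zip(p, q)) / len(X)
               for p, q in pairs) / len(pairs)

def build_and_refine(X_tr, y_tr, X_val, y_val, n_models=5, n_rounds=3, seed=0):
    rng = random.Random(seed)
    n_feats = len(X_tr[0])
    # Assumption: random initial subsets stand in for heuristic feature selection.
    subsets = [sorted(rng.sample(range(n_feats), max(1, n_feats // 2)))
               for _ in range(n_models)]
    models = [fit(X_tr, y_tr, s) for s in subsets]
    for _ in range(n_rounds):
        for i in range(n_models):
            f = rng.randrange(n_feats)
            cand = sorted(set(subsets[i]) ^ {f})  # toggle one feature in or out
            if not cand:
                continue
            new_subsets = subsets[:i] + [cand] + subsets[i + 1:]
            new_models = models[:i] + [fit(X_tr, y_tr, cand)] + models[i + 1:]
            # Hypothetical acceptance rule: more diverse, and no less accurate.
            if (accuracy(new_models, new_subsets, X_val, y_val)
                    >= accuracy(models, subsets, X_val, y_val)
                    and diversity(new_models, new_subsets, X_val)
                    > diversity(models, subsets, X_val)):
                subsets, models = new_subsets, new_models
    return models, subsets
```

Calling `build_and_refine` with training and validation data given as lists of numeric rows returns the fitted base models and their feature subsets; `accuracy` and `diversity` can then be evaluated on held-out data to compare the ensemble before and after refinement.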

https://doi.org/10.1007/3-540-44803-9_13