
RESEARCH PRODUCT

Local dimensionality reduction within natural clusters for medical data analysis

S. Puuronen, Mykola Pechenizkiy, Alexey Tsymbal

subject

Computer science, Feature vector, Dimensionality reduction, Feature extraction, Pattern recognition, Feature selection, Artificial intelligence, Data pre-processing, Data mining, Multidimensional systems, Cluster analysis, Curse of dimensionality

description

Inductive learning systems have been successfully applied in a number of medical domains. Nevertheless, using these systems effectively requires data preprocessing before a learning algorithm is applied. This is especially important for multidimensional heterogeneous data described by a large number of features of different types. Dimensionality reduction is one commonly applied approach. The goal of this paper is to study the impact of natural clustering on dimensionality reduction for classification. We compare several data mining strategies that apply dimensionality reduction by means of feature extraction or feature selection for subsequent classification. We show experimentally on microbiological data that local dimensionality reduction within natural clusters yields a better feature space for classification, in terms of generalization accuracy, than a global search.
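The contrast between global and local dimensionality reduction can be illustrated with a small sketch. It is not the authors' method: the abstract does not name a selection criterion, so the mean-gap score, the function names, and the toy data below are all assumptions made for illustration, and the natural clusters are taken as given labels rather than discovered.

```python
from statistics import mean, pstdev


def class_gap_score(values, labels):
    """Assumed criterion: gap between the two class means, scaled by the overall spread."""
    classes = sorted(set(labels))
    if len(classes) != 2:
        return 0.0  # this toy criterion is only defined for two classes
    first = [v for v, y in zip(values, labels) if y == classes[0]]
    second = [v for v, y in zip(values, labels) if y == classes[1]]
    spread = pstdev(values)
    return abs(mean(first) - mean(second)) / spread if spread > 0 else 0.0


def select_features(rows, labels, k):
    """Global search: indices of the k highest-scoring features over all rows."""
    n_features = len(rows[0])
    scores = [class_gap_score([r[j] for r in rows], labels) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:k]


def select_features_locally(rows, labels, clusters, k):
    """Local search: run the same selection separately inside each cluster."""
    chosen = {}
    for c in sorted(set(clusters)):
        members = [i for i, ci in enumerate(clusters) if ci == c]
        chosen[c] = select_features(
            [rows[i] for i in members], [labels[i] for i in members], k
        )
    return chosen


# Hypothetical toy data: feature 0 separates the classes only in cluster "A",
# feature 1 separates them only in cluster "B".
rows = [
    [0.0, 1.0], [0.2, 2.0], [1.0, 1.0], [1.2, 2.0],  # cluster A
    [1.0, 0.0], [2.0, 0.2], [1.0, 1.0], [2.0, 1.2],  # cluster B
]
labels = [0, 0, 1, 1, 0, 0, 1, 1]
clusters = ["A", "A", "A", "A", "B", "B", "B", "B"]

global_choice = select_features(rows, labels, k=1)
local_choice = select_features_locally(rows, labels, clusters, k=1)
print("global:", global_choice)
print("local:", local_choice)
```

In this toy setting a single globally chosen feature is uninformative in one of the two clusters, whereas the per-cluster selection keeps the feature that matters in each. A classifier would then be trained per cluster on its own reduced feature space.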

https://research.tue.nl/en/publications/a014d207-8e85-469f-9fdb-b24cfbaaf8b3