
RESEARCH PRODUCT

Scalable robust clustering method for large and sparse data

Joonas Hämäläinen, Tommi Kärkkäinen, Tuomo Rossi

subject

data, datasets, cluster analysis, clustering

description

Datasets for unsupervised clustering can be large and sparse, with a significant portion of missing values. We present a scalable version of a robust clustering method that uses the available data strategy. More precisely, a general algorithm is described, and the accuracy and scalability of a distributed implementation of the algorithm are tested. The obtained results allow us to conclude that the proposed approach is viable. Peer reviewed.
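The available data strategy mentioned above can be illustrated with a small sketch. This is not the authors' algorithm or their distributed implementation; it is a minimal single-machine example that assumes missing values are encoded as NaN, uses the L1 distance, and takes coordinate-wise medians as the robust cluster centers. The function names `available_data_distances` and `k_medians_available` are hypothetical and introduced only for illustration.

```python
import warnings

import numpy as np


def available_data_distances(X, C):
    """L1 distances from rows of X to rows of C over observed entries only.

    Assumption: missing values in X are encoded as NaN. Missing components
    contribute nothing to the distance (the available data strategy).
    """
    diff = np.abs(X[:, None, :] - C[None, :, :])  # NaN where X is missing
    return np.nansum(diff, axis=2)


def k_medians_available(X, k, n_iter=20, seed=0):
    """Illustrative k-medians on data with missing values (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    with warnings.catch_warnings():
        # nanmedian warns when a column has no observed values at all
        warnings.simplefilter("ignore", RuntimeWarning)
        col_median = np.nanmedian(X, axis=0)
        # Start from k random rows; fill their gaps with column medians
        centers = X[rng.choice(len(X), size=k, replace=False)].copy()
        centers = np.where(np.isnan(centers), col_median, centers)
        for _ in range(n_iter):
            labels = available_data_distances(X, centers).argmin(axis=1)
            for j in range(k):
                members = X[labels == j]
                if len(members) == 0:
                    continue  # keep the old center for an empty cluster
                median = np.nanmedian(members, axis=0)
                # Keep the old coordinate where the cluster has no observations
                centers[j] = np.where(np.isnan(median), centers[j], median)
    labels = available_data_distances(X, centers).argmin(axis=1)
    return labels, centers


X = np.array([
    [0.0, 0.0], [0.1, np.nan], [np.nan, 0.2],
    [10.0, 10.0], [10.1, np.nan], [np.nan, 9.9],
])
labels, centers = k_medians_available(X, k=2)
```

The sketch leaves out sparsity-aware storage and any distribution of the work across nodes, both of which the description names as central to the proposed method.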

http://urn.fi/URN:NBN:fi:jyu-201901281317