RESEARCH PRODUCT

Computation Cluster Validation in the Big Data Era

Filippo Utro, Raffaele Giancarlo

subject

Clustering high-dimensional data; Class (computer programming); Clustering validation measure; Settore INF/01 - Informatica; Computer science; business.industry; Big data; Inference; Microarrays data analysis; computer.software_genre; Gap statistic; Task (project management); ComputingMethodologies_PATTERNRECOGNITION; CURE data clustering algorithm; Consensus clustering; Hypothesis testing in statistic; Clustering; Class Discovery in Data; Algorithms; Clustering algorithm; Figure of merit; Data mining; Cluster analysis

description

Data-driven class discovery, i.e., the inference of cluster structure in a dataset, is a fundamental task in Data Analysis, in particular for the Life Sciences. We provide a tutorial on the most common approaches used for that task, focusing on methodologies for the prediction of the number of clusters in a dataset. Although the methods that we present are general in terms of the data for which they can be used, we offer a case study relevant for Microarray Data Analysis.

10.1016/b978-0-12-809633-8.20385-3
http://hdl.handle.net/10447/291370