Search results for "Clustering"

showing 10 items of 446 documents

Automatic detection of cervical cells in Pap-smear images using polar transform and k-means segmentation

2016

We introduce a novel method of cell detection and segmentation based on a polar transformation. The method assumes that the seed point of each candidate is placed inside the nucleus. The polar representation, built around the seed, is segmented using k-means clustering into one candidate-nucleus cluster, one candidate-cytoplasm cluster and up to three miscellaneous clusters, representing background or surrounding objects that are not part of the candidate cell. For assessing the natural number of clusters, the silhouette method is used. In the segmented polar representation, a number of parameters can be conveniently observed and evaluated as fuzzy memberships to the non-cell class, out of …

business.industry; k-means clustering; 02 engineering and technology; Image segmentation; Electronic mail; 030218 nuclear medicine & medical imaging; Silhouette; 03 medical and health sciences; 0302 clinical medicine; 0202 electrical engineering electronic engineering information engineering; Cluster (physics); Polar; 020201 artificial intelligence & image processing; Segmentation; Computer vision; Artificial intelligence; business; Cluster analysis; Mathematics; 2016 Sixth International Conference on Image Processing Theory, Tools and Applications (IPTA)
researchProduct
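
The abstract above combines k-means with the silhouette method to pick a natural number of clusters. The following is a minimal sketch of that general idea only, not the authors' polar-transform pipeline: the function names (`kmeans`, `silhouette`, `best_k`), the deterministic farthest-point initialisation, and the candidate range of 2 to 5 clusters are all assumptions made for illustration. It also assumes the input contains at least as many distinct points as the largest candidate k.

```python
import math


def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, iters=50):
    """Lloyd's k-means; returns one cluster label per point."""
    # Assumption: deterministic farthest-point initialisation, for reproducibility.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center (ties go to the lowest index).
        labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        # Update step: mean of the members; an empty cluster keeps its center.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def silhouette(points, labels):
    """Mean silhouette width; singleton clusters score 0 by convention."""
    scores = []
    for i, p in enumerate(points):
        own = [dist(p, q) for j, q in enumerate(points)
               if labels[j] == labels[i] and j != i]
        if not own:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)  # mean distance to the point's own cluster
        b = min(                 # mean distance to the nearest other cluster
            sum(dist(p, q) for q, lab in zip(points, labels) if lab == other)
            / labels.count(other)
            for other in set(labels) if other != labels[i]
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


def best_k(points, k_range=range(2, 6)):
    """Return the candidate k with the highest mean silhouette width."""
    return max(k_range, key=lambda k: silhouette(points, kmeans(points, k)))
```

Calling `best_k` on a list of coordinate tuples returns the candidate cluster count with the highest mean silhouette width. In the paper the clustered objects are pixels of a polar representation, which this sketch does not model.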

The Three Steps of Clustering in the Post-Genomic Era: A Synopsis

2011

Clustering is one of the best-known activities in scientific investigation and the object of research in many disciplines, ranging from Statistics to Computer Science. Following Handl et al., it can be summarized as a three-step process: (a) choice of a distance function; (b) choice of a clustering algorithm; (c) choice of a validation method. Although such a purist approach to clustering is hardly seen in many areas of science, genomic data require that level of attention if inferences made from cluster analysis are to be of some relevance to biomedical research. Unfortunately, the high dimensionality of the data and their noisy nature make cluster analysis of genomic data particul…

cluster validation indices; Settore INF/01 - Informatica; Process (engineering); Computer science; business.industry; Genomic data; distance function; Machine learning; computer.software_genre; Object (computer science); Clustering; Cluster algorithm; Predictive power; Relevance (information retrieval); Artificial intelligence; High dimensionality; business; Cluster analysis; computer
researchProduct
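
The three-step process named in the abstract above (distance function, clustering algorithm, validation method) can be illustrated with a small sketch. The concrete choices here are assumptions picked for brevity, not the ones the paper recommends: city block distance, single-linkage agglomerative clustering, and the Dunn index as the validation measure. All function names are hypothetical.

```python
from itertools import combinations


def manhattan(a, b):
    """Step (a): a distance function (city block distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def single_linkage(points, k, distance):
    """Step (b): a clustering algorithm. Returns k clusters as lists of point indices."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        # Merge the two clusters whose closest members are nearest to each other.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: min(distance(points[a], points[b])
                               for a in clusters[ij[0]] for b in clusters[ij[1]]),
        )
        clusters[i] += clusters[j]
        del clusters[j]  # safe: combinations() guarantees i < j
    return clusters


def dunn_index(points, clusters, distance):
    """Step (c): a validation measure. Higher means compact, well-separated clusters."""
    separation = min(distance(points[a], points[b])
                     for c1, c2 in combinations(clusters, 2) for a in c1 for b in c2)
    diameter = max((distance(points[a], points[b])
                    for c in clusters for a, b in combinations(c, 2)), default=0)
    return float("inf") if diameter == 0 else separation / diameter
```

Because the distance is passed in as a parameter, each of the three choices can be swapped independently, which is the point of treating them as separate steps.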

Statistical Indexes for Computational and Data Driven Class Discovery in Microarray Data

2009

clustering
researchProduct

Restricted Neighborhood search clustering revisited: an evolutionary computation perspective

2013

clustering analysis; protein-protein interaction network
researchProduct

Computational cluster validation for microarray data analysis: experimental assessment of Clest, Consensus Clustering, Figure of Merit, Gap Statistic…

2008

Background: Inferring cluster structure in microarray datasets is a fundamental task for the so-called -omic sciences. It is also a fundamental question in Statistics, Data Analysis and Classification, in particular with regard to the prediction of the number of clusters in a dataset, usually established via internal validation measures. Despite the wealth of internal measures available in the literature, new ones have been recently proposed, some of them specifically for microarray data. Results: We consider five such measures: Clest, Consensus (Consensus Clustering), FOM (Figure of Merit), Gap (Gap Statistics) and ME (Model Explorer), in addition to the classic WCSS (Within Cluster…

clustering microarray data; Microarray; Computer science; Statistics as Topic; computer.software_genre; lcsh:Computer applications to medicine. Medical informatics; Biochemistry; Structural Biology; Databases, Genetic; Consensus clustering; Statistics; Cluster (physics); Animals; Cluster Analysis; Humans; Cluster analysis; lcsh:QH301-705.5; Molecular Biology; Oligonucleotide Array Sequence Analysis; Structure (mathematical logic); Microarray analysis techniques; Applied Mathematics; Computational Biology; Computer Science Applications; Benchmarking; ComputingMethodologies_PATTERNRECOGNITION; lcsh:Biology (General); Gene chip analysis; lcsh:R858-859.7; Data mining; DNA microarray; computer; Algorithms; Software; Research Article; BMC Bioinformatics
researchProduct

A PARTITION TYPE METHOD FOR CLUSTERING MIXED DATA

1990

In this paper, we propose a method for clustering mixed data. The method is a nonhierarchical one, and deals simultaneously with variables of three main kinds: numerical, ordinal, and nominal. It is based on the minimization of a particular criterion f(G。) over all the partitions G。 of n entities into m distinct clusters. The criterion is founded on a peculiar kind of internal standardized mean diversity of the entities, according to the three types of variables. The algorithm to get the best partition is also presented: it starts from a non-random choice of the first partition; the results are compared with those obtained by a random assignment to a first partition. In order to show the usefu…

clustering mixed data partition type; Settore SECS-S/01 - Statistica
researchProduct

A new approach for clustering of effects in quantile regression

2017

In this paper we aim at finding similarities among the coefficients from a multivariate regression. Using quantile regression coefficients modeling, the effect of each covariate on a given response (which may itself be multivariate) is a curve in the multidimensional space of the percentiles. Collecting all the curves that describe the effect of each covariate on each response variable, we can assess whether one or more covariates have the same effect on different responses.

curves clustering; quantile regression coefficients modeling; multivariate analysis; functional data; Settore SECS-S/01 - Statistica
researchProduct

Scalable robust clustering method for large and sparse data

2018

Datasets for unsupervised clustering can be large and sparse, with a significant portion of missing values. We present here a scalable version of a robust clustering method with the available data strategy. More precisely, a general algorithm is described and the accuracy and scalability of a distributed implementation of the algorithm are tested. The obtained results allow us to conclude that the proposed approach is viable. Peer reviewed.

data; datasets; cluster analysis (klusterianalyysi); clustering
researchProduct
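
The abstract above refers to an available data strategy for missing values. One common reading of that term is to compute distances and location estimates from the observed coordinates only. The sketch below illustrates that reading; it is an assumption, not the paper's algorithm. Encoding missing values as `None`, the function names, and the coordinate-wise median as a robust location estimate are all illustrative choices.

```python
from statistics import median


def available_distance(x, y, p=1):
    """Distance over jointly observed coordinates only.

    p=1 gives the city block distance, p=2 the squared Euclidean distance.
    Missing values are encoded as None (an assumption of this sketch).
    """
    terms = [abs(a - b) ** p for a, b in zip(x, y)
             if a is not None and b is not None]
    if not terms:
        raise ValueError("no jointly observed coordinates")
    return sum(terms)


def available_median(rows):
    """Coordinate-wise median using only the observed values in each column."""
    return [median(v for v in col if v is not None) for col in zip(*rows)]
```

Whether the partial distance should also be rescaled by the share of observed coordinates is a design choice that this sketch leaves out.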

Comparison of cluster validation indices with missing data

2018

Clustering is an unsupervised machine learning technique that aims to divide a given set of data into subsets. The number of hidden groups in cluster analysis is not always obvious and, for this purpose, various cluster validation indices have been suggested. Some recent studies have reviewed validation indices, but no experiments with missing data are available yet. In this paper, the performance of ten well-known indices on ten synthetic data sets with various ratios of missing values is measured using clustering based on squared Euclidean and city block distances. The original indices are modified for a city block distance in a novel way. Experiments illustrate the di…

data; cluster analysis (klusterianalyysi); cluster validation; clustering
researchProduct

Data-Driven Interactive Multiobjective Optimization Using a Cluster-Based Surrogate in a Discrete Decision Space

2019

In this paper, a clustering-based surrogate is proposed for use in offline data-driven multiobjective optimization to reduce the size of the optimization problem in the decision space. The surrogate is combined with an interactive multiobjective optimization approach and applied to forest management planning with promising results. Peer reviewed.

data-driven optimization; Mathematical optimization; Optimization problem; Computer science; boreal forest management; Computer Science::Neural and Evolutionary Computation; decision making (päätöksenteko); 0211 other engineering and technologies; MathematicsofComputing_NUMERICALANALYSIS; decision maker; 02 engineering and technology; preference information; Space (commercial competition); Multi-objective optimization; ComputingMethodologies_ARTIFICIALINTELLIGENCE; Data-driven; clusters (klusterit); optimization (optimointi); 0202 electrical engineering electronic engineering information engineering; Cluster analysis; 021103 operations research; surrogates; ComputingMethodologies_PATTERNRECOGNITION; boreal zone (boreaalinen vyöhyke); 020201 artificial intelligence & image processing; forest management (metsänhoito); Cluster based; clustering
researchProduct