
RESEARCH PRODUCT

“Anti-Bayesian” flat and hierarchical clustering using symmetric quantiloids

B. John Oommen, Hugo Lewi Hammer, Anis Yazidi

subject

Scheme (programming language); Information Systems and Management; Theoretical computer science; Computer science; Bayesian principle; Bayesian probability; VDP::Matematikk og Naturvitenskap: 400::Matematikk: 410::Statistikk: 412; Multivariate normal distribution; 0102 computer and information sciences; 02 engineering and technology; 01 natural sciences; Domain (mathematical analysis); Clustering; Theoretical Computer Science; Artificial Intelligence; 0103 physical sciences; Cluster (physics); 0202 electrical engineering electronic engineering information engineering; 010306 general physics; Cluster analysis; computer.programming_language; Centroid; Computer Science Applications; Hierarchical clustering; 010201 computation theory & mathematics; Control and Systems Engineering; Anti-Bayesian classification; 020201 artificial intelligence & image processing; computer; Software; Quantiloids; Quantile

description

Many works have addressed data clustering based on the Bayesian paradigm, where the clustering sometimes resorts to Naive-Bayes decisions. Within the domain of clustering, the Bayesian principle corresponds to assigning the unlabelled samples to the cluster whose mean (or centroid) is the closest. Recently, Oommen and his co-authors have proposed a novel, counter-intuitive and pioneering Pattern Recognition (PR) scheme that is radically opposed to the Bayesian principle. The rationale for this paradigm, referred to as the “Anti-Bayesian” (AB) paradigm, involves classification based on the non-central quantiles of the distributions. The first reported work to achieve clustering using the AB paradigm was [1], where we proposed a flat clustering method for uni-dimensional and two-dimensional data that assigned unlabelled points to clusters based on the AB paradigm, with the distances to the respective learned clusters based on the clusters’ quantiles rather than their centroids. This paper extends the results of [1] in many directions. Firstly, we generalize our previous AB clustering [1], initially proposed for handling uni-dimensional and two-dimensional spaces, to arbitrary d-dimensional spaces using their so-called “quantiloids”. Secondly, we extend the AB paradigm to consider how the clustering can be achieved in hierarchical ways, where we analyze both the Top-Down and the Bottom-Up clustering options. Extensive experimentation demonstrates that our clustering achieves results competitive with state-of-the-art flat, Top-Down and Bottom-Up clustering approaches, demonstrating the power of the AB paradigm.
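The contrast between the centroid rule and a quantile-based rule can be illustrated with a minimal one-dimensional sketch. It is not the authors' algorithm: the function names, the choice of p = 1/3, and the decision to use the two quantiles that face each other (the upper quantile of the lower cluster and the lower quantile of the higher cluster) are assumptions made for illustration, since the abstract does not specify which non-central quantiles are used.

```python
import numpy as np

def centroid_assign_1d(x, low_cluster, high_cluster):
    """Bayesian-style rule: pick the cluster whose mean is closest to x."""
    d_low = abs(x - np.mean(low_cluster))
    d_high = abs(x - np.mean(high_cluster))
    return 0 if d_low <= d_high else 1

def ab_assign_1d(x, low_cluster, high_cluster, p=1 / 3):
    """Quantile-based rule (illustrative assumption, not the paper's exact method).

    Compares x with the (1 - p) quantile of the lower cluster and the
    p quantile of the higher cluster, i.e. non-central quantiles that
    lie between the two cluster centres.
    """
    q_low = np.quantile(low_cluster, 1 - p)
    q_high = np.quantile(high_cluster, p)
    return 0 if abs(x - q_low) <= abs(x - q_high) else 1

# Hypothetical toy data: two well-separated 1-D clusters.
low = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
high = np.array([10.0, 11.0, 12.0, 13.0, 14.0])

for x in (5.0, 9.0):
    print(x, centroid_assign_1d(x, low, high), ab_assign_1d(x, low, high))
```

On symmetric, well-separated toy data such as this, both rules place the decision boundary at the same point, so they agree; any difference would only appear on skewed or overlapping clusters, which this sketch does not explore.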

10.1016/j.ins.2017.08.017
http://hdl.handle.net/11250/2491681