RESEARCH PRODUCT
Preventing Overlaps in Agglomerative Hierarchical Conceptual Clustering
Quentin Brabant, Amira Mouakher, Aurélie Bertaux
subject
Structure (mathematical logic); Theoretical computer science; Computer science; Conceptual clustering; 02 engineering and technology; Disjoint sets; Hierarchical clustering; Set (abstract data type); Pattern language (formal languages); ComputingMethodologies_PATTERNRECOGNITION; Application domain; 020204 information systems; 0202 electrical engineering electronic engineering information engineering; Unsupervised learning; 020201 artificial intelligence & image processing
description
Hierarchical Clustering is an unsupervised learning task which seeks to build a set of clusters ordered by the inclusion relation. It is usually assumed that the result is a tree-like structure with no overlapping clusters, i.e., where clusters are either disjoint or nested. In Hierarchical Conceptual Clustering (HCC), each cluster is provided with a conceptual description which belongs to a predefined set called the pattern language. Depending on the application domain, the elements of the pattern language can be of different natures: logical formulas, graphs, tests on the attributes, etc. In this paper, we tackle the issue of overlapping concepts in the agglomerative approach to HCC. We provide a formal characterization of pattern languages that ensures a result without overlaps. Unfortunately, this characterization excludes many pattern languages which may be relevant for agglomerative HCC. We then propose two variants of the basic agglomerative HCC approach. Both of them guarantee a result without overlaps; the second one refines the given pattern language so that any two disjoint clusters have mutually exclusive descriptions.
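The no-overlap condition stated in the description (any two clusters are either disjoint or nested) can be illustrated with a minimal Python sketch. The function names `overlaps` and `is_overlap_free` and the sample data are assumptions made for illustration only; they are not taken from the paper and do not reproduce its algorithms.

```python
from itertools import combinations


def overlaps(a: frozenset, b: frozenset) -> bool:
    """Return True when a and b share elements but neither contains the other."""
    return bool(a & b) and not (a <= b or b <= a)


def is_overlap_free(clusters) -> bool:
    """Return True when every pair of clusters is either disjoint or nested."""
    sets = [frozenset(c) for c in clusters]
    return not any(overlaps(a, b) for a, b in combinations(sets, 2))


# Hypothetical data: a tree-like hierarchy and a pair of overlapping clusters.
tree_like = [{1, 2, 3, 4}, {1, 2}, {3, 4}, {1}]
overlapping = [{1, 2, 3}, {2, 3, 4}]

print(is_overlap_free(tree_like))
print(is_overlap_free(overlapping))
```

The pairwise check is quadratic in the number of clusters, which is adequate for illustrating the definition but is not meant as an efficient procedure.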
year | journal | country | edition | language |
---|---|---|---|---|
2020-01-01 | | | | |