
RESEARCH PRODUCT

Classification Similarity Learning Using Feature-Based and Distance-Based Representations: A Comparative Study

Francisco Grimaldo, Emilia López-Iñesta, Miguel Arevalillo-Herráez

subject

Computer science; Feature vector; Pattern recognition; Machine learning; Distance measures; Support vector machine; Artificial intelligence; Feature based; Image retrieval; Classifier (UML); Similarity learning; Distance based

description

Automatically measuring the similarity between a pair of objects is a common and important task in machine learning and pattern recognition. Although it has been studied for decades, it has recently attracted increasing interest from the scientific community. Proposed solutions have usually relied on either a feature-based or a distance-based representation to perform learning and classification tasks. This article presents the results of a comparative experimental study of these two approaches for computing similarity scores with a classification-based method. In particular, we use the Support Vector Machine as a flexible combiner, both for a high-dimensional feature space and for a family of distance measures, in order to learn similarity scores. The approaches have been tested in a content-based image retrieval context using three different repositories. We analyze the influence of both the input data format and the training set size on the performance of the classifier. We find that a low-dimensional, multidistance-based representation can be convenient for small to medium-sized training sets, whereas it becomes detrimental as the training set size grows.
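The difference between the two input formats can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how a pair is encoded, so concatenating the two feature vectors for the feature-based format is an assumption, as is the particular family of distances (Euclidean, Manhattan, Chebyshev). All function names are hypothetical.

```python
import math

def feature_based(x, y):
    """Feature-based pair representation.
    Assumption: the pair is encoded by concatenating both feature vectors,
    so the input dimension is twice that of a single object."""
    return list(x) + list(y)

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def manhattan(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))

def chebyshev(x, y):
    return max(abs(a - b) for a, b in zip(x, y))

# Assumed family of distance measures; the study's actual family may differ.
DISTANCES = (euclidean, manhattan, chebyshev)

def distance_based(x, y, distances=DISTANCES):
    """Multidistance pair representation: one value per distance measure,
    so the input dimension equals the number of distances."""
    return [d(x, y) for d in distances]

# Toy pair of 3-dimensional descriptors (illustrative values only).
x = [0.0, 3.0, 1.0]
y = [4.0, 0.0, 1.0]

fb = feature_based(x, y)
db = distance_based(x, y)
print(len(fb), fb)
print(len(db), db)
```

In a classification-based setup, either vector would be passed, together with a similar/dissimilar label for the pair, to a classifier such as a Support Vector Machine, whose output is then read as a similarity score. The sketch shows why the multidistance format is low-dimensional: its size depends on the number of distances, not on the length of the descriptors.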

https://doi.org/10.1080/08839514.2015.1026658