RESEARCH PRODUCT
The fundamental theory of optimal "Anti-Bayesian" parametric pattern classification using order statistics criteria
A. Thomas, B. John Oommen

subject
Mahalanobis distance; VDP::Mathematics and natural science: 400::Mathematics: 410::Statistics: 412; Feature vector; Order statistic; Bayesian probability; classification by moments of order statistics; 020206 networking & telecommunications; VDP::Technology: 500::Information and communication technology: 550; 02 engineering and technology; prototype reduction schemes; Naive Bayes classifier; Bayes' theorem; Exponential family; pattern classification; order statistics; Artificial Intelligence; Signal Processing; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Computer Vision and Pattern Recognition; Algorithm; Software; reduction of training patterns; Mathematics; Parametric statistics

description
Author's version of an article in the journal Pattern Recognition. Also available from the publisher at: http://dx.doi.org/10.1016/j.patcog.2012.07.004

The gold standard for a classifier is the condition of optimality attained by the Bayesian classifier. Within a Bayesian paradigm, if we are allowed to compare the testing sample with only a single point in the feature space from each class, the optimal Bayesian strategy is to do so based on the (Mahalanobis) distance from the corresponding means. The reader should observe that, in this context, the mean is, in one sense, the most central point of the respective distribution. In this paper, we show that we can obtain optimal results by operating in a diametrically opposite way, i.e., in a so-called "anti-Bayesian" manner. Indeed, we assert the completely counter-intuitive result that by working with a few points distant from the mean, sometimes as few as two, one can obtain remarkable classification accuracies. Further, if these points are determined by the order statistics of the distributions, the accuracy of our method, referred to as Classification by Moments of Order Statistics (CMOS), attains the optimal Bayes' bound. This claim has been proven for many uni-dimensional, and some multi-dimensional, distributions within the exponential family, and the theoretical results have been verified by rigorous experimental testing. Apart from being fascinating and pioneering in their own right, these results also provide a theoretical foundation for the families of Border Identification (BI) algorithms reported in the literature.
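To make the flavour of the decision rule concrete, the following is a minimal, hypothetical sketch (not the authors' implementation) for two uni-dimensional Gaussian classes with equal variance and equal priors. Each class contributes a single non-central comparison point, taken here to be the expected value of the larger (respectively smaller) of two i.i.d. samples, i.e., the 2-order-statistic moments mu ± sigma/sqrt(pi); the test sample is assigned to the class whose point is nearer. The class parameters and this particular choice of points are illustrative assumptions.

```python
import numpy as np

# Hypothetical illustration, not the authors' code: two 1-D Gaussian classes
# with equal variance and equal priors. The "anti-Bayesian" rule compares the
# test sample with one point per class that is distant from the mean. Here we
# use the expected values of the larger/smaller of two i.i.d. samples
# (2-order-statistic moments), which for N(mu, sigma^2) are mu +/- sigma/sqrt(pi).

rng = np.random.default_rng(0)
mu1, mu2, sigma = 0.0, 2.0, 1.0          # assumed class parameters
q = sigma / np.sqrt(np.pi)               # E[max(Z1, Z2)] for Z ~ N(0, sigma^2)

# Non-central comparison points: each class contributes the order-statistic
# point that faces the other class.
p1, p2 = mu1 + q, mu2 - q

x1 = rng.normal(mu1, sigma, 100_000)     # test samples from class 1
x2 = rng.normal(mu2, sigma, 100_000)     # test samples from class 2

def accuracy(point1, point2):
    """Nearest-point rule: assign x to the class whose comparison point is closer."""
    correct1 = np.abs(x1 - point1) < np.abs(x1 - point2)
    correct2 = np.abs(x2 - point2) < np.abs(x2 - point1)
    return 0.5 * (correct1.mean() + correct2.mean())

# Bayesian rule (nearest mean) versus the order-statistic-based rule.
# Since p1 and p2 are symmetric about (mu1 + mu2) / 2, both rules share the
# same decision boundary, so the estimated accuracies coincide.
print(f"Bayes (nearest mean)   : {accuracy(mu1, mu2):.4f}")
print(f"Order-statistic points : {accuracy(p1, p2):.4f}")
```

In this symmetric, equal-variance setting the two comparison points are mirror images about the Bayes boundary, which is why the nearest-point rule reproduces the Bayesian decision; the paper establishes when order-statistic moments yield such points for various distributions within the exponential family.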
year | journal | country | edition | language |
---|---|---|---|---|
2013-01-01 | Pattern Recognition | | | |