Optimal Bayesian linear classifiers have been studied in the literature for many decades. In this paper, we demonstrate that all the known results consider only the scenario in which the quadratic polynomial has coincident roots. We present a complete analysis of the case in which the optimal classifier between two normally distributed classes is pairwise linear. To the best of our knowledge, this is a pioneering work for the use of such classifiers in any area of statistical Pattern Recognition (PR). We focus on some special cases of the normal distribution with unequal covariance matrices, and determine the conditions that the mean vectors and covariance matrices must satisfy in order to obtain the optimal pairwise linear classifier. As opposed to the state of the art, in all the cases discussed here the linear classifier is given by a pair of straight lines, which is a particular case of the general equation of second degree. One of these cases arises when the two classes overlap and have equal means, which resolves the general case of Minsky's paradox for the perceptron. We also provide empirical results, using synthetic data for the Minsky's paradox case, and demonstrate that the linear classifier achieves very good performance. Finally, we have tested our approach on real-life data obtained from the UCI machine learning repository. The empirical results show the superiority of our scheme over the traditional Fisher's discriminant classifier.
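
The equal-means case can be illustrated with a minimal sketch. The covariance values below are assumptions chosen for illustration and are not taken from the paper: both classes are centered at the origin, and the second covariance matrix is the first with its two variances swapped. Under equal priors, the Bayes quadratic discriminant then reduces to a multiple of x² - y², so the decision boundary is the pair of lines y = x and y = -x. The function names are hypothetical.

```python
import numpy as np

# Illustrative parameters (assumed, not from the paper): equal means,
# diagonal covariances with the two variances swapped between classes.
MEAN = np.array([0.0, 0.0])
COV1 = np.diag([4.0, 1.0])   # class 1: larger variance along x
COV2 = np.diag([1.0, 4.0])   # class 2: larger variance along y


def quadratic_discriminant(x, mean1, cov1, mean2, cov2):
    """Return log p(x | class 1) - log p(x | class 2) for two Gaussians.

    With equal priors, a positive value means class 1 is more likely.
    """
    d1 = x - mean1
    d2 = x - mean2
    q1 = d1 @ np.linalg.inv(cov1) @ d1          # Mahalanobis term, class 1
    q2 = d2 @ np.linalg.inv(cov2) @ d2          # Mahalanobis term, class 2
    _, logdet1 = np.linalg.slogdet(cov1)
    _, logdet2 = np.linalg.slogdet(cov2)
    return -0.5 * (q1 - q2) - 0.5 * (logdet1 - logdet2)


def pairwise_linear_label(x):
    """Classify with the pair of lines y = x and y = -x.

    The product (x - y)(x + y) is positive exactly where |x| > |y|,
    which is the class 1 region for the parameters above.
    """
    return 1 if (x[0] - x[1]) * (x[0] + x[1]) > 0 else 2


def bayes_label(x):
    """Classify with the full quadratic discriminant (equal priors)."""
    g = quadratic_discriminant(x, MEAN, COV1, MEAN, COV2)
    return 1 if g > 0 else 2
```

In this sketch the two labeling functions agree everywhere off the boundary, which is the sense in which a pair of straight lines can be the optimal classifier. No single line separates the two regions, which is why the equal-means configuration is associated with Minsky's paradox.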

Linear classifiers, Optimal Bayesian classification, Pattern classification, Statistical pattern recognition
IEEE Transactions on Pattern Analysis and Machine Intelligence
School of Computer Science

Rueda, L. (Luis), & Oommen, J. (2002). On optimal pairwise linear classifiers for normal distributions: The two-dimensional case. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(2), 274–280. doi:10.1109/34.982905