Tailoring classifier hyperplanes to general metrics
Finding a hyperplane that separates two classes of data points with the minimum number of misclassifications is directly related to the following problem in linear programming: given an infeasible set of linear constraints, find the smallest number of constraints to remove such that the remaining constraints constitute a feasible set (the Maximum Feasible Subsystem problem). This relationship underlies an effective heuristic method for finding separating hyperplanes in classification problems [Chinneck 2001]. This paper shows how to tailor the maximum feasible subsystem hyperplane placement heuristic so that it can provide good values for metrics other than total accuracy. The concepts are demonstrated using accuracyrelated metrics such as precision and recall, balancing the population accuracies, and balancing the accuracies on each side of the hyperplane, but the principles also apply to other metrics such as the Gini index, entropy, etc. Customizations such as these may prove useful in developing better decision trees.
|Keywords||Classification, Infeasibility analysis, Separating hyperplanes|
Chinneck, J. (2009). Tailoring classifier hyperplanes to general metrics. doi:10.1007/978-0-387-88843-9_19