Placing a separating hyperplane for data classification is normally done separately from selecting the features to use. One approach to feature selection that is conceptually simple but computationally explosive is to apply the hyperplane placement process to every possible subset of features and select the smallest subset that provides reasonable classification accuracy. Two ways to speed up this process are (i) using a faster filtering criterion instead of a complete hyperplane placement, and (ii) using a greedy forward or backward sequential selection method. This paper introduces a new filtering criterion that is very fast: maximizing the drop in the sum of infeasibilities in a linear programming transformation of the problem. It also shows how the linear programming transformation can be applied to reduce the number of features after a separating hyperplane has already been placed, while maintaining the separation originally induced by the hyperplane. Finally, it introduces a new and highly effective integrated method that selects features while simultaneously placing the separating hyperplane.
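The contrast between exhaustive subset search and greedy forward selection can be illustrated with a minimal Python sketch. It is not the paper's implementation: the `score` callable is a hypothetical stand-in for any filtering criterion where lower is better (in the paper's setting this would be the sum of infeasibilities, which requires an LP solver and is not computed here), and `toy_score` with its `TOY_WEIGHTS` is invented purely to make the example runnable.

```python
from itertools import combinations
from typing import Callable, FrozenSet, List, Optional, Tuple

# Hypothetical criterion: maps a feature subset to a score, lower is better.
Score = Callable[[FrozenSet[int]], float]


def exhaustive_selection(
    n_features: int, score: Score, tolerance: float
) -> Tuple[Optional[List[int]], int]:
    """Try every subset in order of increasing size; return the first one
    whose score is within tolerance, plus the number of evaluations."""
    evaluations = 0
    for size in range(n_features + 1):
        for subset in combinations(range(n_features), size):
            evaluations += 1
            if score(frozenset(subset)) <= tolerance:
                return list(subset), evaluations
    return None, evaluations  # no subset met the tolerance


def greedy_forward_selection(
    n_features: int, score: Score, tolerance: float
) -> Tuple[List[int], int]:
    """Repeatedly add the single feature that gives the largest drop in
    the score, stopping once the score is within tolerance."""
    selected: List[int] = []
    current = score(frozenset())
    evaluations = 1
    while current > tolerance and len(selected) < n_features:
        best_feature, best_score = None, current
        for j in range(n_features):
            if j in selected:
                continue
            candidate = score(frozenset(selected + [j]))
            evaluations += 1
            if candidate < best_score:
                best_feature, best_score = j, candidate
        if best_feature is None:
            break  # no remaining feature reduces the score
        selected.append(best_feature)
        current = best_score
    return selected, evaluations


# Invented toy data: only features 1, 4 and 6 matter, and each one removes
# a fixed amount from the score when selected.
TOY_WEIGHTS = {1: 5.0, 4: 3.0, 6: 2.0}


def toy_score(subset: FrozenSet[int]) -> float:
    return sum(w for j, w in TOY_WEIGHTS.items() if j not in subset)


if __name__ == "__main__":
    print(exhaustive_selection(8, toy_score, 0.0))
    print(greedy_forward_selection(8, toy_score, 0.0))
```

On this toy criterion both searches pick the same three features, but the greedy search evaluates far fewer subsets. In general a greedy search is a heuristic and is not guaranteed to find the smallest adequate subset.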

Additional Metadata
Keywords: Classification, Feature selection, Linear programming, Maximum feasible subsystem
Persistent URL: dx.doi.org/10.1016/j.eswa.2012.01.148
Journal: Expert Systems with Applications
Citation
Chinneck, J. (2012). Integrated classifier hyperplane placement and feature selection. Expert Systems with Applications, 39(9), 8193–8203. doi:10.1016/j.eswa.2012.01.148