Abstract: We study strategies for feature selection with the sparse support vector machine (SVM). Recently, the so-called Lp-SVM (0 < p < 1) has attracted much attention because it can encourage better sparsity than the widely used L1-SVM. However, Lp-SVM is a non-convex and non-Lipschitz optimization problem, and solving it numerically is challenging. In this paper, we reformulate the Lp-SVM into an optimization model with a linear objective function and smooth constraints (LOSC-SVM) so that it can be solved by numerical methods for smooth constrained optimization. Our numerical experiments on artificial datasets show that LOSC-SVM (0 < p < 1) can improve performance in both feature selection and classification by choosing a suitable parameter p. We also apply it to several real-life datasets, and the experimental results show that it is superior to L1-SVM.
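For context, the Lp-SVM that the abstract refers to is usually written with an Lp penalty on the weight vector; the lifted form shown after it is one common way to obtain a linear objective with smooth constraints and is given here only as an illustrative sketch (the paper's exact LOSC-SVM formulation may differ).

\[
\min_{w,b,\xi}\;\; \sum_{j=1}^{n} |w_j|^{p} \;+\; C\sum_{i=1}^{m}\xi_i
\quad\text{s.t.}\quad y_i\bigl(w^{\top}x_i + b\bigr) \ge 1-\xi_i,\;\; \xi_i \ge 0,\;\; i=1,\dots,m,
\]

where 0 < p < 1 makes the penalty term non-convex and non-Lipschitz. Introducing auxiliary variables \(s_j \ge |w_j|^{p}\), equivalently \(s_j^{1/p} \ge \pm w_j\), gives a model of the advertised type:

\[
\min_{w,b,\xi,s}\;\; \sum_{j=1}^{n} s_j \;+\; C\sum_{i=1}^{m}\xi_i
\quad\text{s.t.}\quad y_i\bigl(w^{\top}x_i + b\bigr) \ge 1-\xi_i,\;\; \xi_i \ge 0,\;\;
s_j^{1/p} - w_j \ge 0,\;\; s_j^{1/p} + w_j \ge 0,\;\; s_j \ge 0.
\]

The objective is linear, and since 1/p > 1 the map \(s \mapsto s^{1/p}\) is continuously differentiable on \([0,\infty)\), so all constraints are smooth; at any optimum \(s_j = |w_j|^{p}\), recovering the original Lp penalty.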
Funding: This work is supported in part by the National Natural Science Foundation of China under Grant Nos. 61502159, 61379057, 11101081, and 11271069, and by the Research Foundation of Central South University of China under Grant No. 2014JSJJ019.