Abstract
The L2-norm penalized support vector machine (SVM) is one of the most widely used classification algorithms, and L1-norm and L0-norm penalized SVMs, which perform feature selection and classifier construction simultaneously, have also been proposed. In these methods, however, the order of the regularization norm is fixed in advance, typically at p = 2 or p = 1. Our experimental study shows that choosing a different regularization order for different data can improve the prediction accuracy of the classifier. This paper proposes a new design scheme for SVM classifiers based on p-norm regularization, in which the order p may take any value in 0 < p ≤ 2. Model parameters are selected by grid search, the classifier objective function is solved with an iterative reweighted method, and the parameter values that minimize the classification prediction error are identified. Experimental results on real datasets confirm that the proposed algorithm performs classification and feature selection simultaneously and outperforms the L2-norm, L1-norm, and L0-norm penalized SVMs.
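The procedure described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the objective (average hinge loss plus lam * sum |w_j|^p), the quadratic majorization used for the reweighting, the proximal subgradient inner solver, the leave-one-out selection criterion, and all names (`fit_pnorm_svm`, `loo_error`, `grid_search`, `lam`, `eps`) are assumptions made for illustration.

```python
def decision(w, b, x):
    """Linear decision value w.x + b."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def fit_pnorm_svm(X, y, p=1.0, lam=0.1, outer=5, inner=100, lr=0.05, eps=1e-6):
    """Assumed objective: mean hinge loss + lam * sum_j |w_j|**p, 0 < p <= 2.

    Each outer pass replaces |w_j|**p by a quadratic upper bound
    0.5 * scale_j * w_j**2 built at the current w (the reweighting step),
    then runs subgradient steps on the hinge loss with an implicit
    (proximal) step on the quadratic penalty, which stays stable even
    when scale_j becomes very large for weights near zero.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    scale = [1.0] * d  # assumption: start from an ordinary L2-like penalty
    for _ in range(outer):
        for _ in range(inner):
            gw, gb = [0.0] * d, 0.0
            for xi, yi in zip(X, y):
                if yi * decision(w, b, xi) < 1.0:  # margin violated
                    for j in range(d):
                        gw[j] -= yi * xi[j] / n
                    gb -= yi / n
            w = [(w[j] - lr * gw[j]) / (1.0 + lr * lam * scale[j])
                 for j in range(d)]
            b -= lr * gb
        # Reweighting: small weights get a large scale and are pushed to zero.
        scale = [p * (wj * wj + eps) ** (p / 2.0 - 1.0) for wj in w]
    return w, b


def loo_error(X, y, p, lam):
    """Leave-one-out error rate (an assumed stand-in for prediction error)."""
    wrong = 0
    for k in range(len(X)):
        w, b = fit_pnorm_svm(X[:k] + X[k + 1:], y[:k] + y[k + 1:], p=p, lam=lam)
        pred = 1 if decision(w, b, X[k]) >= 0 else -1
        wrong += pred != y[k]
    return wrong / len(X)


def grid_search(X, y, p_grid=(0.5, 1.0, 2.0), lam_grid=(0.01, 0.1)):
    """Return (error, p, lam) for the grid point with the lowest error."""
    return min((loo_error(X, y, p, lam), p, lam)
               for p in p_grid for lam in lam_grid)
```

Here `X` is a list of feature lists and `y` a list of labels in {+1, -1}. Calling `grid_search(X, y)` returns the grid point with the smallest leave-one-out error; the grids shown are arbitrary and would need to be adapted to the data.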
Source
Acta Automatica Sinica (自动化学报)
Indexed in: EI, CSCD, Peking University Core Journals
2012, Issue 1, pp. 76-87 (12 pages)
Funding
National Natural Science Foundation of China (21006127, 20976193)
Basic Disciplines Research Fund of China University of Petroleum (Beijing)
Keywords
Iterative reweighted method, p-norm (0 < p ≤ 2), support vector machine (SVM), feature selection, sparse model, high-dimensional small-sample data