Abstract
In multivariate calibration, model accuracy depends not only on the model structure and parameters but also, to a large extent, on the distribution of the training samples. In practice, training samples are often unevenly distributed, so a regression model built on the whole sample set predicts poorly. To address this problem, this paper proposes a hybrid modeling method that combines support vector classification and regression. First, a classification decision tree in binary-tree form is built from least-squares support vector machine (LS-SVM) classifiers; then a separate LS-SVM regression model is established for each class. For an unknown sample, the decision tree first determines its class, and the corresponding regression model is then selected for quantitative analysis. The method was applied to the Raman spectral analysis of gasoline octane number: the standard prediction error of the LS-SVM regression model built on the whole sample set is 0.54, whereas that of the proposed method is 0.22, so the former is approximately 2.5 times larger. The results show that the proposed method greatly improves analysis accuracy and demonstrates its potential for general-purpose analysis.
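The classify-then-regress procedure described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the class names `LSSVM` and `ClassifyThenRegress`, the RBF kernel, the values of `gamma` and `sigma`, the given class `labels`, and the synthetic one-dimensional data (not Raman spectra). The sketch uses a single two-class split, whereas the paper builds a binary decision tree of such classifiers. It also relies on the standard result that an LS-SVM classifier with targets in {-1, +1} can be obtained from the same linear system as LS-SVM regression.

```python
import numpy as np


def rbf_kernel(A, B, sigma):
    """Gaussian (RBF) kernel matrix between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))


class LSSVM:
    """Least-squares SVM in function-estimation form (hypothetical helper).

    With targets in {-1, +1} and a sign taken on the output, the same
    linear system yields an LS-SVM classifier.
    """

    def __init__(self, gamma=100.0, sigma=1.0):
        self.gamma = gamma  # regularization weight (assumed value)
        self.sigma = sigma  # RBF kernel width (assumed value)

    def fit(self, X, y):
        n = len(X)
        # Bordered system: [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]
        A = np.zeros((n + 1, n + 1))
        A[0, 1:] = 1.0
        A[1:, 0] = 1.0
        A[1:, 1:] = rbf_kernel(X, X, self.sigma) + np.eye(n) / self.gamma
        sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
        self.b, self.alpha, self.X = sol[0], sol[1:], X
        return self

    def predict(self, X):
        return rbf_kernel(X, self.X, self.sigma) @ self.alpha + self.b


class ClassifyThenRegress:
    """One classifier node plus one regression model per class."""

    def fit(self, X, y, labels):
        # labels must be -1 or +1; how classes are defined is an assumption.
        self.classifier = LSSVM().fit(X, labels.astype(float))
        self.regressors = {
            c: LSSVM().fit(X[labels == c], y[labels == c]) for c in (-1, 1)
        }
        return self

    def classify(self, X):
        return np.where(self.classifier.predict(X) >= 0.0, 1, -1)

    def predict(self, X):
        classes = self.classify(X)
        out = np.empty(len(X))
        for c, reg in self.regressors.items():
            mask = classes == c
            if mask.any():
                # Route each sample to the regression model of its class.
                out[mask] = reg.predict(X[mask])
        return out


# Synthetic toy data: two well-separated clusters with different trends.
x_a = np.linspace(0.0, 1.0, 20)
x_b = np.linspace(5.0, 6.0, 20)
X = np.concatenate([x_a, x_b])[:, None]
y = np.concatenate([1.0 + x_a, 10.0 - 2.0 * (x_b - 5.0)])
labels = np.concatenate([-np.ones(20, dtype=int), np.ones(20, dtype=int)])

model = ClassifyThenRegress().fit(X, y, labels)
new_samples = np.array([[0.5], [5.5]])
predictions = model.predict(new_samples)
print(model.classify(new_samples), predictions)
```

Extending this sketch to more than two classes would mean chaining several such classifier nodes into a binary tree, with a regression model attached to each leaf.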
Source
《仪器仪表学报》
EI
CAS
CSCD
PKU Core (Peking University Core Journals)
2010, No. 11, pp. 2440-2446 (7 pages)
Chinese Journal of Scientific Instrument
Funding
Supported by the National "863" Program of China (2009AA04Z123)
Keywords
least squares support vector machine
classification
regression
Raman spectroscopy
gasoline