Quantitative structure-activity prediction of the toxicity of PAHs using support vector machines (cited by: 2)
Abstract To predict the toxicity of polycyclic aromatic hydrocarbons (PAHs) effectively, this paper applies quantitative structure-activity relationship (QSAR) techniques to predict their octanol-air partition coefficient (KOA) and their carcinogenicity. The structure-activity relationships were established from molecular descriptors and experimental values. Using MATLAB, a regression model for the KOA of PAHs and a classification model for their carcinogenicity were each built with the support vector machine (SVM) algorithm and with the artificial neural network (ANN) algorithm. The SVM parameters were optimized by grid search (GS), a genetic algorithm (GA), and particle swarm optimization (PSO). The models were verified by internal and external validation: the regression models were evaluated by mean squared error (MSE) and the coefficient of determination (R2), and the classification models by classification accuracy. In regression, SVM gave a lower MSE and a higher R2 than ANN; in classification, SVM gave a higher accuracy than ANN. After parameter optimization, GS and PSO raised the R2 and lowered the MSE of the original regression model, whereas GA led to overfitting. The best regression model, GS-SVR, had an MSE of 0.0597 and an R2 of 0.9130. On average, the three optimized classification models were more accurate than the original model, and the best of them, GA-SVC, reached an accuracy of 95%. The results indicate that both SVM models are more stable and have better predictive ability than the corresponding ANN models, and that parameter optimization further improves the stability and predictive ability of the SVM models.
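As an illustration of the evaluation criteria and the grid search named in the abstract, the following is a minimal Python sketch, not the authors' MATLAB code. All function names are hypothetical. R2 is assumed to be the usual coefficient of determination (1 - SS_res/SS_tot), and the log2 grid over two SVM parameters (here called C and gamma) is an assumption, since the abstract gives neither the parameter names nor the search ranges. The `score` argument stands in for whatever cross-validated error a real SVM training routine would return.

```python
from itertools import product


def mse(y_true, y_pred):
    """Mean squared error between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def accuracy(labels_true, labels_pred):
    """Fraction of samples whose predicted class equals the true class."""
    hits = sum(1 for t, p in zip(labels_true, labels_pred) if t == p)
    return hits / len(labels_true)


def grid_search(score, c_exponents, gamma_exponents):
    """Exhaustively try C = 2**i, gamma = 2**j and keep the lowest score.

    `score(c, gamma)` is a caller-supplied function (hypothetical here) that
    returns an error to minimise, e.g. a cross-validated MSE.
    Returns a tuple (best_score, best_c, best_gamma).
    """
    best = None
    for ce, ge in product(c_exponents, gamma_exponents):
        c, gamma = 2.0 ** ce, 2.0 ** ge
        s = score(c, gamma)
        if best is None or s < best[0]:
            best = (s, c, gamma)
    return best
```

In practice `score` would train an SVM for each (C, gamma) pair and return its validation error; GA and PSO replace the exhaustive loop with a guided search over the same parameter space.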
Source: Journal of Safety and Environment (《安全与环境学报》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2017, No. 4, pp. 1600-1604 (5 pages)
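Funding: National Natural Science Foundation of China (51206038); Key Technology Science and Technology Project for Major Accident Prevention and Control, State Administration of Work Safety (heilongjiang-0001-2014AQ)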
Keywords: environmental science; polycyclic aromatic hydrocarbons (PAHs); quantitative structure-activity relationship (QSAR); support vector machine (SVM)