
Prediction of Protein O-Glycosylation Sites Based on KPCA and SVM (cited by: 4)

Prediction of the Protein O-glycosylation by Kernel Principal Component Analysis and Support Vector Machines
Abstract: To improve the prediction accuracy of protein O-glycosylation sites, a method that combines kernel principal component analysis (KPCA) with support vector machines (SVM) is proposed. The experimental samples are encoded by sparse coding with a window length of w = 21. First, KPCA extracts the kernel principal components (features) of the samples; then classification (prediction) is carried out in the feature space by an improved support vector machine (ISVM). In the ISVM, a bound coefficient αc is set to reduce computational complexity. The experimental results show that KPCA + ISVM performs better than PCA + SVM and SVM alone, with a prediction accuracy of about 87%. Furthermore, the same protein sequences were examined under different window lengths (w = 5, 7, 9, 11, 21, 31, 41, 51), and majority voting was used to combine the strengths of the sub-classifiers. The results indicate that the ensemble classifier is more accurate than the individual sub-classifiers, with a prediction accuracy of about 88%.
Authors: Yang Xuemei (杨雪梅), Su Zhen (苏祯)
Source: Science Technology and Engineering (《科学技术与工程》), Peking University Core Journal, 2013, No. 25, pp. 7371-7376 (6 pages)
Funding: Supported by the 2013 Scientific Research Program of the Shaanxi Provincial Department of Education (2013JK1125)
Keywords: prediction; protein; kernel principal component analysis (KPCA); improved support vector machines (ISVM); ensemble classifier
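
The abstract names two steps that are easy to illustrate without any machine-learning library: sparse coding of a residue window and majority voting over sub-classifiers. The following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes that sparse coding means one bit per symbol over the 20 standard amino acids plus one extra padding symbol, and that a tied vote falls back to a default label; the abstract specifies neither. All function names are hypothetical, and the KPCA and ISVM stages are not shown.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PAD = "-"  # assumed symbol for window positions outside the sequence
ALPHABET = AMINO_ACIDS + PAD
INDEX = {symbol: i for i, symbol in enumerate(ALPHABET)}


def extract_window(sequence, center, w):
    """Return the w symbols centred on index `center`, padded with PAD."""
    if w % 2 == 0:
        raise ValueError("window length must be odd")
    half = w // 2
    return "".join(
        sequence[i] if 0 <= i < len(sequence) else PAD
        for i in range(center - half, center + half + 1)
    )


def sparse_encode(window):
    """Encode each symbol as a 21-bit indicator vector and concatenate them."""
    vector = []
    for symbol in window:
        bits = [0] * len(ALPHABET)
        # Assumption: non-standard residues are treated like padding.
        bits[INDEX.get(symbol, INDEX[PAD])] = 1
        vector.extend(bits)
    return vector


def majority_vote(labels, tie_label=0):
    """Combine 0/1 labels from sub-classifiers; ties return tie_label (assumed)."""
    counts = Counter(labels)
    if counts[1] == counts[0]:
        return tie_label
    return 1 if counts[1] > counts[0] else 0


# Usage: encode the serine at index 3 with window length w = 21.
sequence = "MKTSAPTTS"  # made-up sequence, for illustration only
window = extract_window(sequence, center=3, w=21)
features = sparse_encode(window)

# Usage: combine made-up votes from eight sub-classifiers, one per window length.
votes = [1, 1, 0, 1, 0, 1, 1, 0]
decision = majority_vote(votes)
```

Under these assumptions, a window of length 21 yields a 441-dimensional binary vector with exactly one set bit per position. Because the abstract lists eight window lengths, an even number of sub-classifiers, ties are possible, which is why the sketch makes the tie rule an explicit parameter.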

