
Protein Structural Class Prediction Based on Multi-Feature Information and Ma-Ada Multi-Classifier Fusion (cited by: 1)
Abstract Protein sequence feature representation and the machine learning algorithm are two key factors that determine the performance of protein structural class prediction. In this study, two feature extraction methods, k-word statistical frequency and k-fragment position distribution, are used to extract amino acid sequence information and physicochemical property information, which are then fused with protein secondary structure information to build 17-dimensional and 57-dimensional feature information sets. By introducing the idea of Multi-Agent fusion into the Adaboost.M1 algorithm, a Ma-Ada multi-classifier fusion algorithm is proposed. Used as the prediction tool for protein structural class, the algorithm exploits both the measurement-level information of each single classifier and the interaction and fusion information among the single classifiers. Experiments on four protein datasets, Z277, Z498, 1189 and D640, show that Ma-Ada reaches classification accuracies of 91.3%, 96.8%, 85.3% and 87.2%, respectively, on the 57-dimensional feature set, and 90.6%, 95.8%, 84.8% and 88.3%, respectively, on the 17-dimensional feature set. Compared with the results of other protein structural class prediction methods, the proposed method achieves comparatively good classification accuracy.
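The k-word statistical frequency feature named in the abstract can be illustrated with a minimal sketch. It counts every overlapping length-k word in a symbol sequence and normalises by the number of windows. The function name `k_word_frequencies`, the three-letter H/E/C secondary-structure alphabet, and the example string are illustrative assumptions; the abstract does not say which alphabets, which values of k, or which normalisation the authors used.

```python
from collections import Counter
from itertools import product


def k_word_frequencies(sequence, k, alphabet):
    """Relative frequency of each length-k word over `alphabet`.

    Hypothetical helper for illustration only, not the paper's code.
    Windows overlap (stride 1). Words containing symbols outside
    `alphabet` still count toward the total but are not reported.
    """
    if k < 1:
        raise ValueError("k must be >= 1")
    # Enumerate all |alphabet|^k possible words in a fixed order.
    words = ["".join(p) for p in product(alphabet, repeat=k)]
    total = len(sequence) - k + 1  # number of overlapping windows
    if total <= 0:
        # Sequence shorter than k: no windows, so every frequency is 0.
        return {w: 0.0 for w in words}
    counts = Counter(sequence[i:i + k] for i in range(total))
    return {w: counts.get(w, 0) / total for w in words}


# Assumed example: a predicted secondary-structure string
# (H = helix, E = strand, C = coil).
ss = "CCHHHHEEEC"
freqs = k_word_frequencies(ss, 1, "HEC")
for word, value in freqs.items():
    print(word, round(value, 3))
```

Concatenating such frequency vectors for several values of k, or for several encodings of the same protein, is one plausible way to arrive at a fixed-length feature vector, but the exact composition of the 17-D and 57-D sets is not given in the abstract.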
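Authors Zheng Bin (郑斌), Li Lihua (厉力华)
Source Chinese Journal of Biomedical Engineering (《中国生物医学工程学报》), 2013, No. 5, pp. 580-587 (8 pages). Indexed in CAS, CSCD, and the Peking University Core Journals list.
Funding National Natural Science Foundation of China (61271063); National Key Basic Research Program of China (973 Program) (2013CB329502); National Science Fund for Distinguished Young Scholars (60788101)
Keywords protein structural class prediction; feature information set; Ma-Ada multi-classifier fusion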

