Abstract
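Considering the primary structure of polypeptides as a whole, we propose three new peptide descriptors that are based solely on the amino acid sequence, are simple to compute, apply to peptides of unequal length, and capture contextual correlation features of the sequence: the geostatistics association (GS-AA531) descriptor, the multi-scale component and correlation (MSCC) descriptor, and the geostatistics association with multi-scale component (GS-AA531-MSC) descriptor. The descriptors were used to characterize the structures of two antimicrobial peptide systems (one of equal-length and one of unequal-length peptides), and QSAM models were built by support vector regression. Fitting, leave-one-out, and independent-test results show that, combined with feature screening, the new descriptors GS-AA531 and GS-AA531-MSC give stable prediction accuracy that is clearly better than that of the other reference descriptors, and that they have broad application prospects in peptide QSAM studies.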
Primary structure characterization is the key to quantitative sequence-activity modeling (QSAM) of polypeptides. This paper reports three new descriptors, GS-AA531, MSCC and GS-AA531-MSC, derived by integrating the information in peptide or protein primary structure. The descriptors are simple to calculate, are based only on the amino acid sequence, are suitable for peptides of different lengths, and can capture context features. The new descriptors and other reference descriptors were applied to two antimicrobial peptide (AMP) systems (equal-length and unequal-length peptides) to construct QSAM models combined with feature screening. The accuracies of fitting, leave-one-out cross-validation, and extra-sample prediction for the models based on the GS-AA531 and GS-AA531-MSC descriptors improved significantly compared with those based on the other descriptors. Therefore, the new descriptors GS-AA531 and GS-AA531-MSC are promising for broad application in peptide or protein QSAM studies.
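The abstract names a geostatistics association descriptor but does not give its formula. As a rough illustration of how a lag-based statistic can turn peptides of different lengths into fixed-length vectors, the sketch below computes the classic geostatistical semivariogram, gamma(h) = sum of (z_i - z_{i+h})^2 over all pairs at lag h, divided by 2(N - h), over a single amino acid property. Everything here is an assumption for illustration and not the authors' definition: the function names `semivariogram` and `gs_descriptor` are hypothetical, the Kyte-Doolittle hydropathy scale stands in for one of the 531 properties that the name AA531 suggests, and returning 0.0 for lags longer than the sequence is an arbitrary padding choice.

```python
# Kyte-Doolittle hydropathy values, used here as a stand-in for a single
# amino acid property (assumption: the paper's descriptor uses many properties).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def semivariogram(values, lag):
    """Half the mean squared difference between values `lag` positions apart."""
    pairs = len(values) - lag
    if pairs <= 0:
        # Sequence too short for this lag; 0.0 is an assumed padding value.
        return 0.0
    total = sum((values[i] - values[i + lag]) ** 2 for i in range(pairs))
    return total / (2 * pairs)


def gs_descriptor(sequence, max_lag=3):
    """Map a peptide of any length to a vector of `max_lag` lag statistics."""
    values = [HYDROPATHY[residue] for residue in sequence.upper()]
    return [semivariogram(values, lag) for lag in range(1, max_lag + 1)]


if __name__ == "__main__":
    # Two peptides of unequal length yield vectors of the same length.
    print(gs_descriptor("KLAKLAK"))
    print(gs_descriptor("GIGKFLHSAKKFGKAFVGEIMNS"))
```

Because the vector length depends only on `max_lag`, peptides of unequal length can be fed to the same regression model, which is the property the abstract emphasizes. Repeating the calculation over many property scales would give a larger feature set on which feature screening could then operate.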
Source
《高等学校化学学报》
SCIE
EI
CAS
CSCD
PKU Core Journal (北大核心)
2012, No. 11, pp. 2526-2531 (6 pages)
Chemical Journal of Chinese Universities
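Funding
Hunan Provincial Science Fund for Distinguished Young Scholars (Grant No. 10JJ1005)
Special Fund for Agro-scientific Research in the Public Interest (Grant No. 201303029-8)
Hunan Provincial Department of Finance Project, 2011 (Grant No. 62020411074)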
Keywords
Structural characterization
Quantitative sequence-activity model
Antimicrobial peptide
Support vector regression
Feature screening