Parallel remote homology detection approach based on semi-supervised support vector machine
Abstract In bioinformatics, classifying proteins from their amino acid sequences and detecting subtle sequence similarity, or remote homology, are important for accurately predicting protein function and structure. This paper proposes a new remote homology detection method based on a semi-supervised support vector machine (SVM). The method defines probabilistic sequence profiles so that it can make full use of unlabeled data from large databases, constructs the SVM kernel function in parallel, and is combined with a nearest-neighbor classifier to achieve full coverage of any input data. Experiments show that the method can substantially improve the performance and efficiency of protein sequence classifiers. Parallel computation keeps the total computation time within a bounded range, which supports wider application of semi-supervised SVM classifiers.
Source Application Research of Computers (《计算机应用研究》), indexed in CSCD and the Peking University Core Journals list, 2009, no. 12, pp. 4624-4627 (4 pages)
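Funding Tianjin Key Science and Technology Support Project (09ZCKFGX00400); Henan Province Higher Education Informatization Engineering Project (2008xxh011)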
Keywords semi-supervised learning; support vector machine; parallel computing; classifier
