期刊文献+

基于关键字树的DNA多序列星比对算法 被引量:10

An Algorithm for DNA Multiple Sequence Alignment Based on Center Star Method and Keyword Tree
在线阅读 下载PDF
导出
摘要 在构建进化树、比较单体型序列等生物信息学研究中,需要比对多个相似程度很高的DNA序列.对于数量多、序列长的多序列比对问题,通常使用时间复杂度较低的星比对算法.然而在处理大规模数据时,星比对的平方时间复杂度依然不能满足需要.因此,在星比对思想的基础上,本文结合关键字树理论,先找出完全匹配的区域,然后比对剩余区域,以达到降低期望时间复杂度的目的.两组实验证明了本文算法的有效性,在取得相同比对效果的情况下,本文算法运行时间小于其他方法. Multiple sequence alignment is necessary and important for reconstructing evolutionary trees and comparing haplotype sequences. Center star method is always used to deal with lots of long sequences. However, square time complexity is a bottleneck for large data. In this paper, we propose a novel keyword tree based algorithm for improving the center star method. Aho-Corasick algorithm is employed to match a set of substrings and the rest regions are aligned by dynamic programming. Experiments show that the improved method rims faster than the initial center star method and clustalx.
出处 《电子学报》 EI CAS CSCD 北大核心 2009年第8期1746-1750,共5页 Acta Electronica Sinica
基金 国家自然科学基金(No.60671011 No.60741001 No.60871092) 黑龙江省杰出青年科学基金(No.JC200611) 黑龙江省自然科学重点项目基金(No.ZJG0705)
关键词 多序列比对 星比对 关键字树 AHO-CORASICK算法 生物信息学 multiple sequence alignment center star method keyword tree Aho-Corasick algorithm bioinformatics
  • 相关文献

参考文献10

  • 1L Wang, T Jiang. On the complexity of multiple sequence alignment[ J] .Journal of Computational Biology, 1994, 1 (4) : 337 - 34.
  • 2Robert C Edgar, Seraflm Batzoglou. Muitiple sequence alignment[ J]. Current opinion in slructural biology. 2006, 16( 3): 368 - 373.
  • 3Notredame C,Higgins D. G. SAGA:sequence alignment by genetic algorithm[ J]. Nucleic Acids Research. 1996,24 (8) : 1515 - 1524.
  • 4Serafim Batzoglou. The many faces of sequence alignment[ J]. Briefings in Bioinformatics. 2005,6( 1 ) : 6 - 22.
  • 5Aho A V, Corasick M J. Efficient string matching: an aid to bibliographic search [J]. Communications of ACM, 1975, 18 (6) :333 - 340.
  • 6D Gusfield. Algorithms on strings, trees and sequences: computer science and computational biology [ M ]. Cambridge: Cambridge University Press. 1997.
  • 7Quan Zou,Maozu Guo, Yang Liu, Taotao Zhang. An Improved Algorithm for Aligning DNA/RNA Sequences Quickly[ A]. International Conference on Computational Intelligence and Security[C]. Harbin: reEF, Press,2007. 825 - 828.
  • 8D Gusfield. Efficient methods for multiple sequence alignment with guaranteed error bounds[ J]. Bulletin of Mathematical Biology. 1993,55(1) : 141 - 154.
  • 9Brown J W. The Ribonuclease P database [ J ]. Nucleic Acids Research, 1999,27( 1 ) :314.
  • 10A Rambaut.Estimating the rate of molecular evolution:incorporating noncontemporaneous sequences into maximum likelihood phylogenies[ J]. Bioinformatics, 2000,16(4) : 395 - 399.

同被引文献100

引证文献10

二级引证文献13

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部