Abstract
[Objective] To investigate methods for clustering orthologous proteins and to support efficient, rapid ortholog clustering analysis. [Methods] A clustering method based on both protein sequence similarity and domain similarity was proposed; it clusters orthologs from multiple species automatically, quickly, and accurately. [Results] Clustering of orthologous proteins from six completely sequenced eukaryotic genomes (human, yeast, worm, fruit fly, Arabidopsis, and zebrafish) gave results markedly better than the clustering results of NCBI and TIGR. [Conclusion] Combining sequence similarity with domain similarity can effectively filter out false homology relationships and thereby significantly improve the accuracy and compactness of ortholog clustering.
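The abstract does not give the algorithm itself, so the following is only a minimal Python sketch of the general idea it describes: accept a cross-species protein pair as a candidate ortholog pair only when both its sequence similarity and its domain similarity are high enough, then group the accepted pairs into clusters. Everything specific here is an assumption for illustration, not the authors' method: the function names `jaccard` and `cluster_orthologs`, the use of Jaccard overlap for domain similarity, the thresholds `min_seq` and `min_dom`, the input layout, and single-linkage grouping via union-find.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap of two domain sets (assumed domain-similarity measure)."""
    a, b = set(a), set(b)
    if not a and not b:
        # Assumption: proteins without annotated domains give no domain evidence.
        return 0.0
    return len(a & b) / len(a | b)


def cluster_orthologs(proteins, seq_sim, min_seq=0.5, min_dom=0.5):
    """Group proteins into candidate ortholog clusters.

    proteins: dict mapping protein id -> (species, iterable of domain ids)
    seq_sim:  dict mapping frozenset({id_a, id_b}) -> similarity in [0, 1]
    Both thresholds are hypothetical illustration values.
    """
    parent = {p: p for p in proteins}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(sorted(proteins), 2):
        species_a, domains_a = proteins[a]
        species_b, domains_b = proteins[b]
        if species_a == species_b:
            continue  # orthologs are compared across species only
        if seq_sim.get(frozenset((a, b)), 0.0) < min_seq:
            continue  # sequence similarity too low
        if jaccard(domains_a, domains_b) < min_dom:
            continue  # domain architectures disagree: treat as false homology
        parent[find(a)] = find(b)  # merge the two clusters

    clusters = {}
    for p in proteins:
        clusters.setdefault(find(p), set()).add(p)
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: c[0])
```

In this sketch the domain check acts as the filter mentioned in the conclusion: a pair with a high sequence score but no shared domains is rejected rather than merged into a cluster.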
Source
《中国现代医学杂志》
CAS
CSCD
Peking University Core Journals
2012, Issue 27, pp. 15-18 (4 pages)
China Journal of Modern Medicine