
Research on Biological Sequence Data Mining Techniques (生物序列数据挖掘技术研究) Cited by: 3

Review of biological sequence data mining techniques
Abstract: Biological sequence data are an important part of bioinformatics data, and analyzing these sequences to uncover their underlying biological meaning is both a focus and a major challenge of bioinformatics research. Data mining, one of the effective tools currently available for analyzing large-scale data, has been widely applied to biological sequence data and has produced many research results. This paper reviews the key techniques of biological sequence data mining: sequence alignment algorithms; pattern mining, association, classification, and clustering analysis of DNA sequences; RNA secondary structure prediction; and classification and clustering analysis of protein sequences. It concludes with an outlook on future research directions.
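Author: 杨恒宇 (Yang Hengyu)
Source: Journal of Hefei University of Technology: Natural Science (《合肥工业大学学报(自然科学版)》), indexed in CAS, CSCD, and the Peking University Core Journals list (北大核心); 2012, No. 9, pp. 1212-1216 (5 pages)
Keywords: biological sequence; data mining; bioinformatics; sequence alignment (序列比对; given as "sequence similarity" in the English keyword list)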
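
The abstract names sequence alignment algorithms as the first of the key techniques reviewed. As a rough illustration only, the sketch below shows global pairwise alignment using the classic Needleman-Wunsch dynamic programming scheme. It is not taken from the paper: the function name `needleman_wunsch` and the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for the example.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Globally align a and b; return (score, aligned_a, aligned_b).

    Scoring values are illustrative assumptions, not taken from the paper.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap  # a[:i] aligned against gaps only
    for j in range(1, m + 1):
        score[0][j] = j * gap  # b[:j] aligned against gaps only

    def sub(i: int, j: int) -> int:
        # Substitution score for pairing a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # pair the two characters
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(needleman_wunsch("ACGT", "AGT"))
```

The table costs time and memory proportional to the product of the two sequence lengths, which is workable for short sequences but is one reason heuristic and index-based methods are used for database-scale searches.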

