
Mining Sequential Patterns with Wildcards and the One-Off Condition (cited by: 23)
Abstract Many application domains produce large volumes of sequence data, and discovering valuable patterns in such data has become the main task of sequential pattern mining. This paper studies the following problem: given a sequence S, a support threshold, and gap constraints, mine all frequent patterns whose number of occurrences in S is no less than the threshold, where the positions at which any two adjacent pattern elements occur in S must satisfy the user-defined gap constraints. A pattern P may contain flexible wildcards, and the number of wildcards between any two successive elements of P must fall within the gap constraints. The paper designs an efficient algorithm for mining patterns with wildcards, One-Off Mining, in which the occurrences of a pattern satisfy the One-Off condition: no two occurrences of a pattern share a character at the same position in the sequence, so each character of S is used at most once across all occurrences of that pattern. Experiments on DNA sequences show that One-Off Mining achieves better time performance and completeness than related sequential pattern mining algorithms.
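To make the One-Off condition and the gap constraints concrete, the following minimal Python sketch counts non-overlapping occurrences of one pattern in one sequence. It is an illustration only, not the One-Off Mining algorithm from the paper: it uses a simple leftmost-first greedy search with backtracking, which does not necessarily find the maximum number of One-Off occurrences. The function names and the `min_gap`/`max_gap` parameters are hypothetical and chosen for this example.

```python
from typing import List, Optional


def find_occurrence(seq: str, pattern: str, min_gap: int, max_gap: int,
                    used: List[bool], start: int) -> Optional[List[int]]:
    """Search for one occurrence of `pattern` whose first element sits at `start`.

    Returns the matched positions, or None if no occurrence exists that
    avoids already-used positions and respects the gap constraints.
    """
    if used[start] or seq[start] != pattern[0]:
        return None
    positions = [start]

    def extend(k: int) -> bool:
        # All pattern elements placed: a full occurrence was found.
        if k == len(pattern):
            return True
        prev = positions[-1]
        # The number of wildcards between neighbouring elements must lie
        # in [min_gap, max_gap], so the next position is prev + gap + 1.
        lo = prev + min_gap + 1
        hi = min(prev + max_gap + 1, len(seq) - 1)
        for pos in range(lo, hi + 1):
            if not used[pos] and seq[pos] == pattern[k]:
                positions.append(pos)
                if extend(k + 1):
                    return True
                positions.pop()  # backtrack and try a later position
        return False

    return positions if extend(1) else None


def one_off_occurrences(seq: str, pattern: str,
                        min_gap: int, max_gap: int) -> List[List[int]]:
    """Greedily collect occurrences that never reuse a sequence position."""
    if not pattern:
        return []
    used = [False] * len(seq)
    occurrences: List[List[int]] = []
    for start in range(len(seq)):
        occ = find_occurrence(seq, pattern, min_gap, max_gap, used, start)
        if occ is not None:
            for p in occ:
                used[p] = True  # One-Off condition: each position used once
            occurrences.append(occ)
    return occurrences


def is_frequent(seq: str, pattern: str, min_gap: int, max_gap: int,
                threshold: int) -> bool:
    """A pattern is frequent if its (greedy) support reaches the threshold."""
    return len(one_off_occurrences(seq, pattern, min_gap, max_gap)) >= threshold
```

For instance, `one_off_occurrences("AGAGTT", "AGT", 0, 2)` yields two occurrences that share no position, so under this greedy count the pattern would be frequent for a threshold of 2; tightening the gap to `[0, 0]` leaves a single occurrence.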
Source Journal of Software (《软件学报》), indexed in EI, CSCD, and the Peking University Core Journals list, 2013, Issue 8, pp. 1804-1815 (12 pages)
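Funding National Natural Science Foundation of China (61229301, 60828005, 61273292); US National Science Foundation (CCF-0905337, CCF-0514819); National High Technology Research and Development Program of China (863 Program) (2012AA011005); National Basic Research Program of China (973 Program) (2013CB329604)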
Keywords data mining; sequential pattern mining; frequent pattern; wildcard; One-Off condition
