
A directed tree based approach for mining maximum frequent access patterns in Web logs (一种基于有向树挖掘Web日志中最大频繁访问模式的方法)

Cited by: 9
Abstract: A novel Apriori-based algorithm named s-Tree is proposed for mining maximum frequent access patterns in Web logs. The main contributions are as follows. Firstly, a directed tree is used to represent each user session, which makes it possible to mine both the maximum forward reference transactions and the users' preferred access paths. Secondly, a support-counting method that gives priority to content pages is used, which discovers specific user access patterns that traditional algorithms cannot find. Thirdly, a frequent pattern tree is joined with layered frequent arcs rather than with another frequent pattern tree, which avoids the join-condition check that graph-structure data mining algorithms need when joining two frequent pattern trees directly; a pre-pruning strategy further reduces the overhead of the algorithm. Experimental results show that the s-Tree algorithm is scalable and more efficient than directly applying graph-structure pattern mining algorithms such as AGM (Apriori-based Graph Mining) and FSG (Frequent Subgraph Discovery).
Source: Journal of Computer Applications (《计算机应用》), indexed in CSCD and the PKU Core Journals list, 2006, No. 7, pp. 1662-1665 (4 pages)
Funding: National Natural Science Foundation of China (Grant No. 60373023)
Keywords: Web usage mining; maximum frequent access pattern; directed tree; Web logs
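
The abstract says that representing a user session as a directed tree makes the maximum forward reference transactions directly available. The sketch below illustrates that general idea only; it is not the s-Tree algorithm from the paper. The function name `session_to_tree`, the page labels, and the rule that a revisit of a page on the current path counts as a backward reference are assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def session_to_tree(
    session: Sequence[str],
) -> Tuple[Dict[str, List[str]], List[List[str]]]:
    """Build a directed tree (as child lists) from one session and
    collect its maximal forward references (root-to-leaf paths).

    Assumption: a page that is already on the current path is treated
    as a backward move (for example, the browser's Back button).
    """
    children: Dict[str, List[str]] = {}
    paths: List[List[str]] = []
    current: List[str] = []  # path from the root to the page in view
    moving_forward = False

    for page in session:
        if page in current:
            # Backward reference: emit the path once, then backtrack.
            if moving_forward:
                paths.append(list(current))
            del current[current.index(page) + 1:]
            moving_forward = False
        else:
            # Forward reference: add the arc parent -> page to the tree.
            if current:
                children.setdefault(current[-1], []).append(page)
            children.setdefault(page, [])
            current.append(page)
            moving_forward = True

    if moving_forward:
        paths.append(list(current))
    return children, paths


if __name__ == "__main__":
    tree, refs = session_to_tree(list("ABCDCBEGHGWAOUOV"))
    print(["".join(p) for p in refs])
    # prints ['ABCD', 'ABEGH', 'ABEGW', 'AOU', 'AOV']
```

Each emitted path is a root-to-leaf path of the tree held in `children`, so the same structure serves both as the session representation and as the source of forward-reference transactions.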

References (12)

  • [1] SRIVASTAVA J, COOLEY R, DESHPANDE M, et al. Web usage mining: Discovery and applications of usage patterns from Web data[J]. SIGKDD Explorations, 2000, 1(2): 12-23.
  • [2] KOSALA R, BLOCKEEL H. Web mining research: a survey[J]. ACM SIGKDD Explorations, 2000, 2(1).
  • [3] HAN Jiawei, MENG Xiaofeng, WANG Jing, LI Sheng'en. Research on Web mining (Web挖掘研究)[J]. Journal of Computer Research and Development (计算机研究与发展), 2001, 38(4): 405-414. Cited by: 356
  • [4] PEI J, HAN J, MORTAZAVI-ASL B, et al. Mining access patterns efficiently from Web logs[A]. Proceedings of 4th Pacific Asia Conference on Knowledge Discovery and Data Mining[C]. Tokyo, Japan, 2000.
  • [5] HAN J, PEI J, YIN Y. Mining frequent patterns without candidate generation[A]. SIGMOD2000[C]. 2000.
  • [6] SUN L, ZHANG X. Efficient Frequent Pattern Mining on Web Logs[A]. APWeb 2004[C]. 2004. 533-542.
  • [7] AGRAWAL R, SRIKANT R. Fast algorithms for mining association rules in large databases[A]. VLDB1994[C]. 1994. 487-499.
  • [8] EZEIFE C, LU Y. Mining Web Log Sequential Patterns with Position Coded Pre-Order Linked WAP-Tree[J]. Data Mining and Knowledge Discovery, 2005, 10(1): 5-38.
  • [9] INOKUCHI A, WASHIO T, MOTODA H. An apriori-based algorithm for mining frequent substructures from graph data[A]. PKDD2000[C]. Lyon, France, 2000.
  • [10] KURAMOCHI M, KARYPIS G. Frequent subgraph discovery[A]. ICDM2001[C]. San Jose, USA, 2001.

Secondary references (17)

  • [1] Rakesh Agrawal, Ramakrishnan Srikant. Fast algorithms for mining association rules in large databases. VLDB1994, Santiago, Chile, 1994.
  • [2] Heikki Mannila, et al. Search and borders of theories in knowledge discovery. Data Mining and Knowledge Discovery, 1997, 1(3): 241-258.
  • [3] Jong Soo Park, et al. An effective Hash based algorithm for mining association rules. SIGMOD1995, San Jose, USA, 1995.
  • [4] Sergey Brin, et al. Dynamic itemset counting and implication rules for market basket data. SIGMOD1997, Tucson, USA, 1997.
  • [5] Ramesh C. Agarwal, et al. Depth first generation of long patterns. KDD 2000, Boston, USA, 2000.
  • [6] Ramesh C. Agarwal, et al. A tree projection algorithm for generation of frequent itemsets. J. of Parallel and Distributed Computing, 2001, 61(3): 350-371.
  • [7] Jiawei Han, Jian Pei, Yiwen Yin. Mining frequent patterns without candidate generation. SIGMOD2000, Dallas, USA, 2000.
  • [8] J. Pei, et al. H-Mine: Hyper-structure mining of frequent patterns in large databases. ICDM'01, San Jose, CA, 2001.
  • [9] Mike Perkowitz, Oren Etzioni. Adaptive sites: Automatically learning from user access patterns. WWW'97, Santa Clara, 1997.
  • [10] J. Pei, et al. PrefixSpan: Mining sequential patterns efficiently by prefix-projected pattern growth. ICDE'01, Heidelberg, 2001.

Co-citing documents: 370

Co-cited documents: 65

Citing documents: 9

Secondary citing documents: 16
