Research on the Process of Data Preprocessing in Web Usage Mining (Web使用挖掘中数据预处理过程研究)

Cited by: 6
Abstract: Web usage mining applies data mining techniques to the secondary data generated by users' interactions while browsing the Web, in order to discover usage patterns and thereby better understand and serve the needs of Web-based applications. Several preprocessing tasks must be performed before data mining algorithms can be applied to the data collected from server logs. Data preprocessing converts the raw data into the data abstractions required by the subsequent data mining algorithms. Because the preprocessed data is the input to pattern discovery, its quality directly affects the final discovery results. This paper presents several data preparation techniques and methods that can improve the performance of data preprocessing in identifying unique users and user sessions. Experiments show these techniques and methods to be valid and efficient. Finally, the paper is summarized and directions for future research are proposed.
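The abstract does not spell out the paper's own techniques, so the following is only a minimal sketch of what server-log preprocessing commonly involves, not the authors' method. It assumes log lines in Common or Combined Log Format arriving in chronological order, treats each distinct (IP address, user agent) pair as one user, and starts a new session after a 30-minute gap. The function name `preprocess`, the ignored suffix list, and the timeout value are all illustrative assumptions.

```python
import re
from datetime import datetime, timedelta

# Common Log Format, optionally followed by the Combined Log Format
# referrer and user-agent fields.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+)[^"]*" (?P<status>\d{3}) \S+'
    r'(?: "[^"]*" "(?P<agent>[^"]*)")?'
)

# Assumption: embedded resources are noise for usage analysis.
IGNORED_SUFFIXES = (".gif", ".jpg", ".png", ".css", ".js")
# Assumption: a 30-minute gap between requests starts a new session.
SESSION_TIMEOUT = timedelta(minutes=30)


def preprocess(lines):
    """Clean log lines, then group page views by user and session.

    Returns a dict mapping (ip, agent) to a list of sessions, where
    each session is a list of (timestamp, url) tuples.
    """
    sessions = {}
    for line in lines:
        m = LOG_RE.match(line)
        if m is None:
            continue  # skip malformed lines
        # Data cleaning: keep successful GET requests for pages only.
        if m["method"] != "GET" or not m["status"].startswith("2"):
            continue
        if m["url"].lower().endswith(IGNORED_SUFFIXES):
            continue
        ts = datetime.strptime(m["ts"], "%d/%b/%Y:%H:%M:%S %z")
        # User identification heuristic: IP address plus user agent.
        user = (m["ip"], m["agent"] or "")
        user_sessions = sessions.setdefault(user, [])
        # Session identification: split on the inactivity timeout.
        if not user_sessions or ts - user_sessions[-1][-1][0] > SESSION_TIMEOUT:
            user_sessions.append([])
        user_sessions[-1].append((ts, m["url"]))
    return sessions
```

Heuristics of this kind are approximate: proxies can make several people share one IP address, and a fixed timeout can split a long visit or merge two short ones.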
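Source: Electronic Measurement Technology (《电子测量技术》), 2007, No. 3, pp. 3-5 (3 pages)
Funding: Hubei Province Key Science and Technology Project (2005101C18); Natural Science Foundation of South-Central University for Nationalities (中南民族大学)
Keywords: Web usage mining; Web log; data preprocessing; user session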