
Research on Application of Fuzzy C-Means Algorithm in Web Usage Mining

Cited by: 9
Abstract: Web logs contain a large amount of user browsing information, and clustering similar users and related pages from these logs is a necessary prerequisite for building adaptive websites. Basic preprocessing is applied to the logs to carry out data cleaning, user identification, session identification, and data reduction, producing a sequence database of the pages each user visited. A discretization technique is then used to compute how frequently each user visits each page. On the basis of this data preparation, a user-page association matrix is constructed and used as the input to an improved fuzzy C-means (FCM) clustering algorithm, which clusters similar users and related pages. Experiments show that the improved FCM algorithm is effective.
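The pipeline described in the abstract (sessions, then a user-page matrix, then fuzzy clustering) can be illustrated with a minimal sketch. The abstract does not say how the authors' FCM variant is improved, so the code below uses the standard fuzzy C-means updates as a stand-in, not the paper's algorithm. The function names, the session format (one list of page identifiers per user), and raw visit counts as matrix entries are all assumptions made for illustration.

```python
import math
import random


def build_user_page_matrix(sessions, pages):
    """Rows are users, columns are pages, entries are visit counts.

    `sessions` is a hypothetical input format: one list of visited
    page identifiers per user.
    """
    index = {page: j for j, page in enumerate(pages)}
    matrix = []
    for visited in sessions:
        row = [0.0] * len(pages)
        for page in visited:
            row[index[page]] += 1.0
        matrix.append(row)
    return matrix


def fuzzy_c_means(data, c, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Standard FCM (not the paper's improved variant).

    Returns (centers, memberships); memberships[i][k] is the degree to
    which row i belongs to cluster k, and each row sums to 1.
    """
    rng = random.Random(seed)
    n, d = len(data), len(data[0])
    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([value / total for value in row])
    centers = []
    exponent = 2.0 / (m - 1.0)
    for _ in range(max_iter):
        # Center update: mean of the data weighted by u_ik ** m.
        centers = []
        for k in range(c):
            weights = [u[i][k] ** m for i in range(n)]
            total = sum(weights)
            centers.append(
                [sum(weights[i] * data[i][j] for i in range(n)) / total
                 for j in range(d)]
            )
        # Membership update from relative distances to the centers.
        new_u = []
        for i in range(n):
            dist = [math.dist(data[i], centers[k]) for k in range(c)]
            if any(dk == 0.0 for dk in dist):
                # The point sits exactly on one or more centers.
                hits = [1.0 if dk == 0.0 else 0.0 for dk in dist]
                total = sum(hits)
                new_u.append([h / total for h in hits])
            else:
                new_u.append(
                    [1.0 / sum((dist[k] / dist[l]) ** exponent
                               for l in range(c))
                     for k in range(c)]
                )
        change = max(abs(new_u[i][k] - u[i][k])
                     for i in range(n) for k in range(c))
        u = new_u
        if change < tol:
            break
    return centers, u


if __name__ == "__main__":
    pages = ["A", "B", "C", "D"]
    sessions = [["A", "A", "B"], ["A", "B", "B"],
                ["C", "C", "D"], ["C", "D", "D"]]
    user_page = build_user_page_matrix(sessions, pages)
    _, user_membership = fuzzy_c_means(user_page, c=2)
    # Transposing the matrix gives one row per page, so the same
    # routine can group related pages.
    page_user = [list(column) for column in zip(*user_page)]
    _, page_membership = fuzzy_c_means(page_user, c=2)
    print(user_membership)
    print(page_membership)
```

Because the memberships are fuzzy, a user or page can belong partly to several clusters. Taking the largest membership in each row gives a hard assignment when one is needed.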
Authors: 吴瑛 (Wu Ying), 王秋生 (Wang Qiusheng)
Source: 《计算机技术与发展》 (Computer Technology and Development), 2008, No. 6, pp. 32-35 (4 pages)
Keywords: fuzzy C-means clustering; Web log preprocessing; association matrix; user clustering; page clustering
