
High-Precision Clustering Algorithm Based on Web Logs
Abstract: Analyzing Web log files can reveal groups of similar customers, related Web pages, and frequent access paths. This paper presents a Web log mining algorithm in three steps. First, based on a directed-graph definition of the Web site, a URL-UserID association matrix is built with the site's URLs as rows and UserIDs as columns; each element is the number of times the user visited the URL. Second, similarity is measured between row vectors to obtain rough user session clusters. Finally, a hierarchy comparison clustering algorithm processes the rough session clusters further to obtain clusters of higher precision. Experiments show that the algorithm is effective at improving clustering precision.
Source: Journal of Henan University of Science and Technology: Natural Science (CAS), 2006, No. 2, pp. 49-51 (3 pages)
Funding: Natural Science Foundation of Henan Province (0411010500)
Keywords: networks; Web log mining; session clustering; structure hierarchy
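
The first two steps described in the abstract (building the URL-UserID matrix and grouping similar row vectors) can be illustrated with a minimal sketch. The abstract does not state the similarity measure, the threshold, or the grouping procedure, so cosine similarity, the value 0.8, and the greedy single-pass grouping below are assumptions made for illustration only; the function names and the sample log are hypothetical. The hierarchy comparison refinement step is not shown, because the abstract gives no details of it.

```python
from collections import defaultdict
from math import sqrt


def build_matrix(log_records):
    """Build a sparse URL-UserID matrix: matrix[url][user_id] = visit count."""
    matrix = defaultdict(lambda: defaultdict(int))
    for user_id, url in log_records:
        matrix[url][user_id] += 1
    return matrix


def cosine(row_a, row_b):
    """Cosine similarity between two sparse row vectors (assumed measure)."""
    dot = sum(count * row_b.get(user, 0) for user, count in row_a.items())
    norm_a = sqrt(sum(c * c for c in row_a.values()))
    norm_b = sqrt(sum(c * c for c in row_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rough_clusters(matrix, threshold=0.8):
    """Greedy single-pass grouping of rows (assumed procedure).

    Each row joins the first cluster whose first member is at least
    `threshold` similar to it; otherwise it starts a new cluster.
    """
    clusters = []
    for url in sorted(matrix):
        for cluster in clusters:
            if cosine(matrix[url], matrix[cluster[0]]) >= threshold:
                cluster.append(url)
                break
        else:
            clusters.append([url])
    return clusters


# Hypothetical log: (UserID, URL) pairs, one per request.
sample_log = [
    ("u1", "/a"), ("u1", "/a"), ("u2", "/a"),
    ("u1", "/b"), ("u1", "/b"), ("u2", "/b"),
    ("u3", "/c"),
]
sample_matrix = build_matrix(sample_log)
sample_clusters = rough_clusters(sample_matrix)
```

In this sample, rows `/a` and `/b` have identical visit counts and fall into one rough cluster, while `/c` shares no visitors with them and forms its own. A refinement stage such as the paper's hierarchy comparison algorithm would then operate on these rough clusters.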