Abstract
To address the complexity of designing and managing Web sites and the need to optimize their topology, this paper proposes an algorithm for mining users' preferred browsing paths from Web logs. The algorithm introduces a user access matrix, which reflects how frequently pages are browsed, together with a support-preference measure. It computes the Hamming distance matrix between the row vectors of the access matrix and compares each element with a similarity threshold to obtain the candidate set of 2-item interest sub-paths. Sub-paths that fail the support-preference threshold are then removed from the sub-path set, and the remaining sub-paths are merged to generate the preferred browsing paths. Experimental results demonstrate the effectiveness of the algorithm.
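The Hamming-distance step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the matrix layout or the direction of the threshold comparison, so everything here is an assumption: each row of `access` is taken to be one page's access vector, two rows count as similar when their Hamming distance is at most the threshold, and the function names `hamming_distance_matrix` and `candidate_pairs` are hypothetical. The support-preference filter and the merging step are left out because the abstract does not define them.

```python
from itertools import combinations


def hamming_distance_matrix(access):
    """Pairwise Hamming distances between the rows of an access matrix.

    Assumption: each row is one page's access vector; the distance is the
    number of positions at which two rows differ.
    """
    n = len(access)
    dist = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d = sum(a != b for a, b in zip(access[i], access[j]))
        dist[i][j] = dist[j][i] = d
    return dist


def candidate_pairs(dist, threshold):
    """Row pairs (i, j), i < j, whose distance is at most the threshold.

    Assumption: a small distance means similar browsing behaviour, so such
    pairs are kept as candidate 2-item sub-paths.
    """
    n = len(dist)
    return [(i, j) for i, j in combinations(range(n), 2)
            if dist[i][j] <= threshold]


# Toy data, invented purely for illustration.
access = [
    [1, 1, 0, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]
dist = hamming_distance_matrix(access)
pairs = candidate_pairs(dist, threshold=1)
```

In this toy example only rows 0 and 1 differ in a single position, so they form the only candidate pair at a threshold of 1.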
Source
《重庆理工大学学报(自然科学)》
CAS
2012, No. 10, pp. 82-88 (7 pages)
Journal of Chongqing University of Technology:Natural Science
Funding
Supported by the Lüliang University Intramural Natural Science Foundation (ZRXN201215)