Abstract
In Web usage mining, many traditional session identification algorithms are inefficient, and the sessions they produce are often imprecise, which degrades the final mining results. To address this, the paper studies data preprocessing and session identification in Web usage mining and proposes a new session identification algorithm that reconstructs sessions from Web server logs by combining a Markov chain model with dynamic time thresholds. Experimental results show that this approach performs significantly better than other traditional methods.
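The abstract does not give the details of the proposed algorithm, so the following is only a minimal sketch of two standard building blocks it refers to: a fixed time-threshold heuristic for splitting a user's requests into sessions (the kind of traditional baseline the paper compares against), and a first-order Markov chain estimated from the resulting sessions. The function names, the 1800-second default threshold, and the sample log are assumptions made for illustration; the paper's actual rule for combining the Markov model with dynamic thresholds is not reproduced here.

```python
from collections import defaultdict


def split_sessions(requests, timeout=1800):
    """Split one user's time-ordered (timestamp, page) requests into sessions.

    A new session starts whenever the gap between two consecutive requests
    exceeds `timeout` seconds. This is the fixed-threshold baseline, not the
    dynamic-threshold method proposed in the paper.
    """
    sessions = []
    current = []
    last_time = None
    for timestamp, page in requests:
        if last_time is not None and timestamp - last_time > timeout:
            sessions.append(current)
            current = []
        current.append(page)
        last_time = timestamp
    if current:
        sessions.append(current)
    return sessions


def transition_probabilities(sessions):
    """Estimate first-order Markov transition probabilities P(next | current)."""
    counts = defaultdict(lambda: defaultdict(int))
    for session in sessions:
        # Count each consecutive page pair inside a session.
        for src, dst in zip(session, session[1:]):
            counts[src][dst] += 1
    probs = {}
    for src, dsts in counts.items():
        total = sum(dsts.values())
        probs[src] = {dst: n / total for dst, n in dsts.items()}
    return probs


# Hypothetical log for a single user: (seconds since first request, page).
log = [
    (0, "/home"), (40, "/list"), (95, "/item"),
    (4000, "/home"), (4030, "/list"), (4100, "/cart"),
]
sessions = split_sessions(log)
probs = transition_probabilities(sessions)
```

A dynamic-threshold variant would replace the constant `timeout` with a value that depends on the pages involved, for example one informed by the estimated transition probabilities; how the paper does this is not stated in the abstract.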
Source
《计算机工程与设计》
CSCD
Peking University Core Journal (北大核心)
2009, No. 7, pp. 1685-1687, 1693 (4 pages in total)
Computer Engineering and Design