Abstract
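Detecting and tracking hot topics on the Internet has become a frontier research topic in public opinion analysis, with broad application prospects. This paper studies hot topic tracking for online forums (BBS) based on a theme evolution graph. Building on hot topic detection in BBS using co-word analysis and the bisecting K-means clustering algorithm, it proposes a method for calculating hot topic attention degree that takes into account both the number of posts on a topic and the popularity of those posts. It then gives a relative-entropy-based method for calculating the semantic distance between hot topics. Finally, automatic tracking of BBS hot topics is achieved by constructing a theme evolution graph. Experiments on a test set built from real BBS forum data show that the proposed method is effective.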
Detecting and tracking hot topics in online public opinion has become an active research frontier in the Web mining community, with a wide range of application prospects. This paper studies BBS hot topic tracking using a theme evolution graph. First, we detect the hot topics of BBS threads automatically using co-word analysis and the bisecting K-means algorithm. We then present methods for calculating the attention degree of a hot topic and the semantic distance between hot topics. Finally, we propose an approach to BBS hot topic tracking based on the theme evolution graph. Experimental results on thousands of real BBS threads show that the proposed approach is effective.
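The abstract names relative entropy as the basis of the semantic distance between hot topics but does not give the formula. As a minimal sketch only, the Python below assumes each topic is represented by the word tokens of its posts, builds additively smoothed word distributions over the shared vocabulary, and averages the relative entropy in both directions so that the result is symmetric. The smoothing, the symmetrization, and the names `word_distribution`, `relative_entropy`, and `semantic_distance` are illustrative assumptions, not the paper's definitions.

```python
import math
from collections import Counter


def word_distribution(tokens, vocabulary, alpha=1.0):
    """Additively smoothed word distribution over a shared vocabulary.

    Smoothing (an assumption) keeps every probability positive so that
    the relative entropy below is always finite.
    """
    counts = Counter(tokens)
    total = sum(counts[w] for w in vocabulary) + alpha * len(vocabulary)
    return {w: (counts[w] + alpha) / total for w in vocabulary}


def relative_entropy(p, q):
    """Relative entropy (Kullback-Leibler divergence) D(p || q), in nats."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)


def semantic_distance(tokens_a, tokens_b):
    """Symmetrized relative entropy between two topics' word distributions."""
    vocabulary = sorted(set(tokens_a) | set(tokens_b))
    p = word_distribution(tokens_a, vocabulary)
    q = word_distribution(tokens_b, vocabulary)
    return 0.5 * (relative_entropy(p, q) + relative_entropy(q, p))


# Hypothetical topics, each given as the word tokens of its posts.
housing_week1 = ["housing", "price", "policy", "price"]
housing_week2 = ["housing", "price", "loan"]
football = ["football", "match", "goal"]

print(semantic_distance(housing_week1, housing_week2))
print(semantic_distance(housing_week1, football))
```

Under these assumptions, two topics that share vocabulary end up closer than two with disjoint vocabulary, which is the property a tracker would rely on when linking a topic in one time window to its successor in the next.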
Source
《情报科学》
CSSCI
PKU Core (Peking University Core Journals)
2013, Issue 3, pp. 147-150 (4 pages)
Information Science
Funding
Supported by the Zhejiang Provincial Natural Science Foundation (Y1100176)
Keywords
online forum (网络论坛)
hot topic tracking (热点跟踪)
theme evolution graph (主题演化图)
online public opinion (网络舆情)
BBS
hot topic tracking
theme evolution graph
internet public opinion