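Abstract
Traditional literature clustering algorithms rely on keyword analysis and ignore the citation relationships between documents, which leads to topic drift and low search precision. To address clustering in citation networks, and inspired by preferential attachment and the growth law, this paper proposes a hierarchical soft clustering algorithm for literature based on role division. The algorithm first constructs a citation matrix from the citation relationships between documents and performs structure mining; it then builds a clustering topic for each cluster from the mining results and applies keyword analysis to refine the clusters. Experimental results show that the algorithm effectively improves search precision and efficiency.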
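The first step named in the abstract, building a citation matrix from citation relationships, can be illustrated with a minimal sketch. This is not the authors' implementation: the toy data, the function names build_citation_matrix and coupling_strength, and the use of shared references (bibliographic coupling) as a structural similarity signal are all assumptions made for illustration.

```python
# Hypothetical toy data (not from the paper): each paper maps to the papers it cites.
citations = {
    "A": ["C", "D"],
    "B": ["C", "D"],
    "C": ["D"],
    "D": [],
}


def build_citation_matrix(citations):
    """Return (papers, matrix) where matrix[i][j] == 1 if papers[i] cites papers[j]."""
    # Collect every paper that appears either as a citing or a cited document.
    papers = sorted(set(citations) | {c for refs in citations.values() for c in refs})
    index = {p: i for i, p in enumerate(papers)}
    n = len(papers)
    matrix = [[0] * n for _ in range(n)]
    for src, refs in citations.items():
        for dst in refs:
            matrix[index[src]][index[dst]] = 1
    return papers, matrix


def coupling_strength(matrix, i, j):
    """Count the references shared by papers i and j (bibliographic coupling).

    Assumption: shared references stand in for the structural signal here;
    the abstract does not specify the structure-mining measure.
    """
    return sum(a * b for a, b in zip(matrix[i], matrix[j]))


papers, M = build_citation_matrix(citations)
```

In this toy network, papers A and B cite the same two documents, so their coupling strength is higher than that of A and C. A structure-mining step could group documents on such a signal before any keyword analysis is applied.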
Source
《计算机应用研究》
CSCD
PKU Core Journal (北大核心)
2012, No. 3, pp. 856-858 (3 pages)
Application Research of Computers
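Funding
Supported by the National Natural Science Foundation of China
Keywords
topic drift
preferential attachment
growth law
role division
clustering topic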