Abstract
In recent years the biomedical literature has grown rapidly, causing serious information overload. Biomedical literature mining can effectively alleviate this problem, and document clustering is one of its important research directions. Current document clustering algorithms rely mainly on document content and do not take into account the large amount of citation information that exists among documents. This paper introduces citation information into document clustering and proposes a clustering algorithm that combines citation information with content information. Experimental results demonstrate the effectiveness of the method.
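The abstract does not say how content and citation information are combined, so the following is only an illustrative sketch under stated assumptions, not the authors' algorithm. It assumes a weighted sum of a bag-of-words cosine similarity and a citation similarity (1.0 for a direct citation, otherwise the Jaccard overlap of reference lists), followed by average-link agglomerative clustering. The function names, the weight `alpha`, and the toy documents are all hypothetical.

```python
import math
from collections import Counter

def content_sim(a, b):
    """Cosine similarity between bag-of-words term counts of two texts."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = (math.sqrt(sum(v * v for v in ca.values()))
            * math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0

def citation_sim(i, j, refs):
    """Assumed citation measure: 1.0 if one document cites the other,
    otherwise the Jaccard overlap of their reference sets."""
    if j in refs[i] or i in refs[j]:
        return 1.0
    union = refs[i] | refs[j]
    return len(refs[i] & refs[j]) / len(union) if union else 0.0

def combined_sim(i, j, texts, refs, alpha=0.5):
    """Weighted sum of content and citation similarity (alpha is hypothetical)."""
    return (alpha * content_sim(texts[i], texts[j])
            + (1 - alpha) * citation_sim(i, j, refs))

def cluster(texts, refs, k, alpha=0.5):
    """Average-link agglomerative clustering down to k clusters."""
    clusters = [[i] for i in range(len(texts))]

    def link(a, b):
        total = sum(combined_sim(i, j, texts, refs, alpha) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > k:
        pairs = ((p, q) for p in range(len(clusters))
                 for q in range(p + 1, len(clusters)))
        x, y = max(pairs, key=lambda pq: link(clusters[pq[0]], clusters[pq[1]]))
        clusters[x] += clusters.pop(y)  # y > x, so index x stays valid
    return [sorted(c) for c in clusters]

# Toy collection: refs[i] holds what document i cites, either an index into
# `texts` or an external identifier (strings here are made up).
texts = [
    "gene expression cancer tumor",
    "tumor gene mutation cancer",
    "protein folding structure",
    "protein structure prediction folding",
]
refs = [{1, "extA"}, {"extA"}, {3}, {"extB"}]

print(cluster(texts, refs, k=2))  # prints [[0, 1], [2, 3]]
```

Setting `alpha=1.0` reduces this sketch to content-only clustering, which is the baseline the abstract contrasts against.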
Source
Computer Applications and Software (《计算机应用与软件》)
CSCD
Peking University Core Journals (北大核心)
2012, Issue 10, pp. 5-7 (3 pages)
Funding
National Natural Science Foundation of China (Grant No. 60903076)
Keywords
Document clustering
Citation information
Biomedical text mining