Abstract
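To help users search the Internet for online texts of interest, a clustering-based text filtering model is proposed. Its basic idea is as follows: under a predefined hierarchical directory, the filtering profiles supplied by users are dynamically expanded so that they reflect the users' information needs more fully. The expanded profiles are then clustered so that each class consists of user profiles expressing the same or similar interests. During matching, a text is first pushed to the relevant profile class, and its similarity to each individual profile is then computed to obtain the final matching result.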
This paper presents a clustering-based text filtering model to help users find texts related to their interests on the Internet. The main idea is as follows. Under hierarchical categories pre-arranged by the model, a query expansion approach based on a co-occurrence matrix is applied to the user profiles, and the expanded profiles are then divided into several classes by clustering analysis. When matching texts against user profiles, the model first pushes each text to the relevant profile classes and then ranks the texts by their similarity to the individual user profiles. Experiments show that the model markedly improves the efficiency of text filtering.
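The three steps named in the abstract (profile expansion, profile clustering, two-stage matching) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not specify the weighting scheme, the clustering algorithm, or the similarity measure, so the cosine similarity, the single-pass clustering, the `weight` and `threshold` parameters, and all function names here are assumptions.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def expand(profile, cooc, weight=0.5):
    """Add co-occurring terms to a profile (damping weight is an assumption)."""
    expanded = dict(profile)
    for term, w in profile.items():
        for other, strength in cooc.get(term, {}).items():
            if other not in profile:
                candidate = weight * w * strength
                expanded[other] = max(expanded.get(other, 0.0), candidate)
    return expanded

def cluster(profiles, threshold=0.3):
    """Single-pass clustering (assumed): join the closest class or open a new one."""
    clusters = []  # each entry: {"centroid": dict, "members": [profile names]}
    for name, vec in profiles.items():
        best, best_sim = None, threshold
        for c in clusters:
            sim = cosine(vec, c["centroid"])
            if sim >= best_sim:
                best, best_sim = c, sim
        if best is None:
            clusters.append({"centroid": dict(vec), "members": [name]})
        else:
            best["members"].append(name)
            # Summed vector serves as the centroid; cosine ignores its scale.
            for t, w in vec.items():
                best["centroid"][t] = best["centroid"].get(t, 0.0) + w
    return clusters

def match(text_vec, clusters, profiles):
    """Stage 1: pick the closest profile class. Stage 2: rank its profiles."""
    best = max(clusters, key=lambda c: cosine(text_vec, c["centroid"]))
    scores = [(n, cosine(text_vec, profiles[n])) for n in best["members"]]
    return sorted(scores, key=lambda s: s[1], reverse=True)

# Toy data, invented purely for illustration.
cooc = {"football": {"soccer": 0.8}, "stock": {"market": 0.9}}
raw = {
    "alice": {"football": 1.0},
    "bob": {"soccer": 1.0, "league": 0.5},
    "carol": {"stock": 1.0},
}
profiles = {name: expand(p, cooc) for name, p in raw.items()}
classes = cluster(profiles)
ranking = match({"soccer": 1.0, "league": 1.0}, classes, profiles)
```

In this toy data, the unexpanded profiles of `alice` and `bob` share no terms, so expansion is what allows them to fall into the same class. The two-stage match then compares an incoming text against only the profiles in one class rather than against every profile, which is the kind of saving the abstract attributes to the model.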
Source
《大连理工大学学报》
CAS
CSCD
PKU Core (Peking University Core Journals)
2002, Issue 2, pp. 249-252 (4 pages)
Journal of Dalian University of Technology