Abstract
To improve on Authorities and Hubs, the traditional Web graph-based vertical search strategy, this paper proposes a heuristic vertical search strategy that combines Web page content evaluation with the Web graph. In addition, the vector space model is introduced to judge the topic relevance of Web page content, further improving the precision of topic page downloads. Experiments show that the algorithm effectively increases the degree of aggregation of topic pages. As the number of downloaded pages grows, the precision of the vertical search engine rises gradually and then stabilizes once the number of downloaded pages reaches a certain level. The algorithm is robust and can be applied in related vertical search engine systems.
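The abstract does not say how the vector space model scores a page, so the following is only a minimal sketch of a common formulation: cosine similarity between term-frequency vectors of a page and a topic description. The function names, the whitespace tokenization, the raw term-frequency weighting, and the 0.3 threshold are all assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def term_vector(text):
    """Build a raw term-frequency vector (assumed weighting) from
    whitespace-separated, lower-cased tokens (assumed tokenization)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector shares nothing with any topic
    return dot / (norm_a * norm_b)


def is_on_topic(page_text, topic_text, threshold=0.3):
    """Hypothetical relevance test: the threshold is an illustrative
    value, not one reported by the paper."""
    score = cosine_similarity(term_vector(page_text), term_vector(topic_text))
    return score >= threshold
```

In a crawler, a check of this kind could be applied to each fetched page before its outgoing links are considered; how the paper combines the score with the Web graph is not described in the abstract.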
Source
Information Studies: Theory & Application (《情报理论与实践》)
CSSCI
Peking University Core Journals
2009, No. 9, pp. 121-124 (4 pages)
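Funding
Jiangxi Provincial Department of Education Fund Project (赣教技字[2006]177号)
One of the outcomes funded by the East China Jiaotong University Fund (Project No. 08xx05)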
Keywords
vertical search engine
Web page
content evaluation
Web graph