
Clustering of Web Search Results Using Link Analysis (cited by: 4)
Abstract: With the rapid growth of information on the Web, obtaining high-quality information from it has become an active research topic in fields such as databases and information retrieval (IR). Web search engines are the most commonly used retrieval tool, yet their current results are still far from satisfactory. This paper proposes a new approach that uses link analysis to cluster the results returned by a Web search engine. Unlike IR document clustering algorithms, which rely on keywords or phrases shared between documents, the approach applies co-citation and coupling analysis and is based on the common links shared and matched by two Web pages. It also extends the standard K-means clustering algorithm to handle noise pages more naturally and applies the extended algorithm to Web search result pages. Preliminary experiments were conducted to examine its effectiveness, and the results suggest that clustering Web search results via link analysis is promising.
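Source: Computer Engineering and Applications (《计算机工程与应用》), CSCD, Peking University Core Journal, 2005, No. 2, pp. 179-183 (5 pages)
Funding: Natural Science Foundation of Hunan Province (No. 03092); Key Research Project of the Ministry of Education of China
Keywords: link analysis; co-citation; coupling; hub page; authority page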
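
The abstract names the ingredients of the method (co-citation, coupling, and a K-means variant that tolerates noise pages) but does not give formulas. The following Python sketch is therefore only an illustration under stated assumptions: cosine similarity over link sets, an equal-weight blend controlled by a hypothetical `alpha` parameter, and a hypothetical similarity `threshold` below which a page is set aside as noise. None of these choices is confirmed by the paper, and the page and link names are made up.

```python
from math import sqrt

def cosine(a: set, b: set) -> float:
    """Cosine similarity between two link sets viewed as binary vectors."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))

def link_similarity(p, q, in_links, out_links, alpha=0.5):
    """Blend co-citation (shared in-links) and coupling (shared out-links).

    The weighting `alpha` is an assumption made for this sketch.
    """
    cocitation = cosine(in_links[p], in_links[q])
    coupling = cosine(out_links[p], out_links[q])
    return alpha * cocitation + (1 - alpha) * coupling

def assign_pages(pages, seeds, in_links, out_links, threshold=0.3):
    """One K-means-style assignment step with a noise cutoff (assumed).

    Each page joins its most similar seed page; pages whose best
    similarity falls below `threshold` are collected as noise.
    """
    clusters = {seed: [] for seed in seeds}
    noise = []
    for page in pages:
        best = max(seeds, key=lambda s: link_similarity(page, s, in_links, out_links))
        score = link_similarity(page, best, in_links, out_links)
        if score >= threshold:
            clusters[best].append(page)
        else:
            noise.append(page)
    return clusters, noise

# Toy link graph (made-up data): pages "a" and "b" share in-links and out-links.
in_links = {"a": {"h1", "h2"}, "b": {"h1", "h2"}, "c": {"h3"}, "d": set()}
out_links = {"a": {"x", "y"}, "b": {"x"}, "c": {"z"}, "d": {"w"}}

clusters, noise = assign_pages(["a", "b", "c", "d"], ["a", "c"], in_links, out_links)
print(clusters)  # {'a': ['a', 'b'], 'c': ['c']}
print(noise)     # ['d']
```

A full K-means loop would recompute cluster representatives and repeat the assignment step until it stabilizes. Only the assignment step is shown, because it is where a noise cutoff of this kind would plausibly act.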

References (25)

  • 1. J Kleinberg. Authoritative sources in a hyperlinked environment[C]. In: Proceedings of the 9th ACM-SIAM Symposium on Discrete Algorithms (SODA), 1998-01.
  • 2. Ravi Kumar et al. Trawling the Web for emerging cyber-communities[C]. In: Proceedings of the 8th WWW Conference, Toronto, Canada, 1999.
  • 3. Brin S, Page L. The anatomy of a large-scale hypertextual web search engine[C]. In: Proceedings of WWW7, Brisbane, Australia, 1998-04.
  • 4. Oren Zamir, Oren Etzioni. Grouper: A Dynamic Clustering Interface to Web Search Results[C]. In: Proceedings of the 8th WWW Conference, Toronto, Canada, 1999.
  • 5. Richard C Dubes, Anil K Jain. Algorithms for Clustering Data[M]. Prentice Hall, 1988.
  • 6. Oren Zamir, Oren Etzioni. Fast and Intuitive Clustering of Web Documents[C]. In: KDD'97, 1997: 287-290.
  • 7. Oren Zamir, Oren Etzioni. Web document clustering: A feasibility demonstration[C]. In: Proceedings of SIGIR'98, Melbourne, Australia, 1998.
  • 8. Zhihua Jiang et al. Retriever: Improving Web Search Engine Results Using Link Analysis. http://citeseer.nj.nec.com/275012.html.
  • 9. Ron Weiss et al. HyPursuit: A Hierarchical Network Search Engine that Exploits Content-Link Hypertext Clustering[C]. In: ACM Conference on Hypertext, Washington, USA, 1996.
  • 10. Michael Steinbach et al. A Comparison of Document Clustering Techniques[R]. Technical report, University of Minnesota, KDD'2000.

Co-cited documents: 71

Citing documents: 4

Secondary citing documents: 17
