
Inherit/Feedback: A New Web Topic-Specific Mining Method

Cited by: 4
Abstract: Classical link analysis methods such as PageRank and HITS measure the authority of a Web page rather than its relevance to a topic, so a crawler guided by them drifts off topic quickly during a topic-specific crawl. To address this, the paper builds a topology model of topic correlation and, on that basis, proposes the Inherit/Feedback method for Web topic mining. The key idea is that, along a crawl path, a page inherits topic relevance from its ancestors and receives topic-relevance feedback from its descendants. The method can enhance applications such as page ranking and topic-specific crawling. A topic-specific crawling algorithm based on Inherit/Feedback (IFC) is also proposed. Experiments show that IFC guides a topic-specific crawler effectively and can be applied to further discovery and mining of topic-specific websites.
Source: Journal of Computer Research and Development (《计算机研究与发展》), 2004, No. 5, pp. 807-811 (5 pages). Indexed in EI, CSCD, and the Peking University Core Journals list.
Funding: Guangdong Provincial Key Science and Technology Research Project Fund (C10 2 0 1 A10 2 0 10 3)
Keywords: hyperlink analysis; topic-specific crawling; Web mining
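
The inherit/feedback idea stated in the abstract can be illustrated with a small sketch. This is not the paper's IFC algorithm: the record above gives no formulas, so the function name `inherit_feedback`, the mixing weights `alpha` and `beta`, the breadth-first crawl order, and the precomputed `own_score` values are all assumptions made only for illustration.

```python
from collections import deque


def inherit_feedback(graph, own_score, seed, alpha=0.5, beta=0.3):
    """Toy inherit/feedback scoring over a crawl tree (illustrative only).

    graph:     dict mapping a page to the list of pages it links to
    own_score: dict mapping a page to its own topic relevance in [0, 1]
    alpha:     assumed share of the parent's score a child inherits
    beta:      assumed share of a child's score fed back to its parent
    """
    # Build a crawl tree by breadth-first search from the seed page.
    parent = {seed: None}
    order = []
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        order.append(node)
        for child in graph.get(node, []):
            if child not in parent:
                parent[child] = node
                queue.append(child)

    # Inherit pass: parents are scored before their children.
    score = {}
    for node in order:
        own = own_score.get(node, 0.0)
        p = parent[node]
        if p is None:
            score[node] = own
        else:
            score[node] = (1 - alpha) * own + alpha * score[p]

    # Feedback pass: deeper pages first, each passing a share to its parent.
    for node in reversed(order):
        p = parent[node]
        if p is not None:
            score[p] += beta * score[node]
    return score


if __name__ == "__main__":
    links = {"a": ["b"], "b": ["c"]}
    relevance = {"a": 1.0, "b": 0.0, "c": 1.0}
    print(inherit_feedback(links, relevance, "a", alpha=0.5, beta=0.5))
```

In this toy chain, page `b` has no on-topic content of its own but still receives a nonzero score, because it inherits from `a` and gets feedback from `c`. That is the kind of behavior that could keep a crawler following a path through an off-topic page toward on-topic ones.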

References (13)

  • 1. L Page, S Brin, R Motwani, et al. The PageRank citation ranking: Bringing order to the Web. Stanford University, Tech Rep: 1997-0072, 1997
  • 2. J Kleinberg. Authoritative sources in a hyperlinked environment. Journal of the ACM, 1999, 46(5): 604-632
  • 3. S Brin, L Page. The anatomy of a large-scale hypertextual Web search engine. The 7th Int'l World Wide Web Conf (WWW-98), Brisbane, Australia, 1998
  • 4. K Bharat, M R Henzinger. Improved algorithms for topic distillation in a hyperlinked environment. The 21st Int'l ACM SIGIR Conf on Research and Development in Information Retrieval (SIGIR-98), Melbourne, Australia, 1998
  • 5. P Srinivasan, G Pant, F Menczer. Target seeking crawlers and their topical performance. The 25th Annual Int'l ACM SIGIR Conf on Research and Development in Information Retrieval (SIGIR-02), Tampere, Finland, 2002
  • 6. T H Haveliwala. Topic-sensitive PageRank. The 11th Int'l World Wide Web Conf (WWW-02), Honolulu, Hawaii, USA, 2002
  • 7. E J Glover. Using Web structure for classifying and describing Web pages. The 11th Int'l World Wide Web Conf (WWW-02), Honolulu, Hawaii, USA, 2002
  • 8. F Menczer. Evaluating topic-driven Web crawlers. The 24th Annual Int'l ACM SIGIR Conf on Research and Development in Information Retrieval (SIGIR-01), New Orleans, Louisiana, USA, 2001
  • 9. G Pant, P Srinivasan, F Menczer. Exploration versus exploitation in topic driven crawlers. The WWW-02 Workshop on Web Dynamics, Honolulu, Hawaii, USA, 2002
  • 10. S Chakrabarti, M van den Berg, B Dom. Focused crawling: A new approach to topic-specific Web resource discovery. Computer Networks, 1999, 31(11): 1623-1640

