Knowledge Discovery Based on the Search Engine (基于搜索引擎的知识发现)
Cited by: 3
Abstract: Data mining is typically applied to large, highly structured databases to discover the knowledge they contain. As the volume of online text grows, so does the knowledge embedded in it, yet that text remains difficult to analyze and exploit. Developing effective methods for discovering knowledge in text is therefore an important current research topic. This paper uses the Google search engine to retrieve relevant Web pages, then filters and cleans them to obtain the relevant text. It then clusters the text, uses episodes for event recognition and information extraction, and applies data integration and data mining to achieve knowledge discovery. Finally, a prototype system is presented that tests the approach in practice, and the authors report good results.
Source: Computer Engineering and Applications (《计算机工程与应用》), CSCD, Peking University Core Journal, 2004, No. 30, pp. 178-180, 220 (4 pages)
Keywords: search engine; text clustering; episode; information extraction; knowledge discovery
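
The abstract names two text-processing steps, text clustering and episode-based event recognition, without saying which algorithms are used. The following is a minimal sketch of what such steps could look like, not the paper's implementation. It assumes a TF-IDF vector space representation, single-pass (leader) clustering with a cosine threshold, and a reading of "episode" as a sequence of words occurring in order within a small token window. All function names, the threshold, and the window size are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic runs only (a crude cleaning step)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Map each document to a sparse {term: weight} TF-IDF vector."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    return [
        {t: c * (math.log(n / df[t]) + 1.0) for t, c in Counter(toks).items()}
        for toks in tokenized
    ]


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def leader_cluster(docs, threshold):
    """Assumed clustering step: a document joins the first cluster whose
    leader (first member) is at least `threshold` similar, else starts a new one."""
    vectors = tfidf_vectors(docs)
    clusters = []  # lists of document indices; index 0 of each list is the leader
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def find_serial_episode(tokens, pattern, window):
    """Assumed episode matcher: return start indices where the words of
    `pattern` occur in order inside a span of `window` consecutive tokens."""
    hits = []
    for start, tok in enumerate(tokens):
        if tok != pattern[0]:
            continue
        matched = 1
        for i in range(start + 1, min(start + window, len(tokens))):
            if matched == len(pattern):
                break
            if tokens[i] == pattern[matched]:
                matched += 1
        if matched == len(pattern):
            hits.append(start)
    return hits
```

In a pipeline like the one the abstract outlines, `leader_cluster` would run on the cleaned page text, and `find_serial_episode` would then flag candidate events inside each cluster for the extraction step. Leader clustering is used here only because it is short; its result depends on document order.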
