Research on XML-Based Automatic Extraction of Deep Web Information
Abstract With the rapid growth of the Internet in recent years, the Deep Web has become an important part of online information resources. Users reach the vast amount of information it holds by submitting queries through a query interface to the back-end Web database, which returns the results as dynamically generated Web pages. Because Deep Web resources are spread across many Deep Web sites and are heterogeneous, dynamic, and large in volume, they are inconvenient to use directly, and Deep Web data integration systems have emerged in response. This paper studies the data extraction technology used in Deep Web data integration systems, proposes an XML-based method for automatically extracting Deep Web data, and analyzes the method in technical detail. The method extracts Deep Web resources quickly and effectively, with high extraction accuracy and fine extraction granularity.
Affiliation Changchun University of Technology
Source Science & Technology Information (《科技信息》), 2009, No. 33, pp. 85, 104 (2 pages in total)
Keywords information extraction; Deep Web; Deep Web data integration; XML

