A Method of Using Pattern Matching to Improve Entity Resolution
Abstract: A web information integration system generally comprises several distinct but closely related components: domain model construction, data source processing, data extraction, pattern matching (schema matching), entity resolution, and user application interface construction. Pattern matching and entity resolution are both important components, yet they are usually studied independently even though the two are interrelated. This article proposes a new approach in which pattern matching is used to improve entity resolution, and presents the SMPER algorithm based on that approach. By exploiting the relationship between pattern matching and entity resolution, the algorithm effectively improves precision and recall, which supports the correctness and feasibility of the approach.
Source: Journal of Shandong Normal University (Natural Science), CAS, 2012, No. 2, pp. 35-39, 44 (6 pages)
Keywords: web information integration; schema matching; entity resolution
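
The abstract's central idea, that matching the schemas of two sources first can make entity resolution more accurate, can be illustrated with a small sketch. This is not the SMPER algorithm, which this record does not describe. The attribute names, the mapping, the Jaccard token similarity, and the 0.5 threshold are all assumptions chosen only for illustration.

```python
# Illustrative sketch only: NOT the SMPER algorithm from the article.
# All attribute names, the mapping, and the threshold are hypothetical.

def tokens(value):
    """Lowercase a value and split it into a set of whitespace tokens."""
    return set(str(value).lower().split())

def jaccard(a, b):
    """Jaccard similarity between the token sets of two attribute values."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)

def apply_mapping(record, mapping):
    """Rename a record's attributes according to a schema mapping."""
    return {mapping.get(key, key): value for key, value in record.items()}

def record_similarity(r1, r2):
    """Average similarity over the attributes that both records share."""
    shared = set(r1) & set(r2)
    if not shared:
        return 0.0  # no comparable attributes, so nothing can match
    return sum(jaccard(r1[k], r2[k]) for k in shared) / len(shared)

def same_entity(r1, r2, threshold=0.5):
    """Decide whether two records refer to the same entity."""
    return record_similarity(r1, r2) >= threshold

# Two hypothetical sources describing the same book under different schemas.
source_a = {"title": "Data Integration Basics", "author": "Li Wei"}
source_b = {"book_name": "data integration basics", "writer": "Li Wei"}

# A hypothetical schema-matching result: source_b attributes -> source_a attributes.
mapping_b = {"book_name": "title", "writer": "author"}

before = record_similarity(source_a, source_b)
after = record_similarity(source_a, apply_mapping(source_b, mapping_b))
print(before, after)
```

Without the mapping the two records share no attribute names, so they cannot be compared at all; once the schemas are aligned, the same comparison recognizes them as one entity.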