Abstract
A web information integration system generally comprises several distinct but closely related components: domain model construction, data source processing, data extraction, schema matching, entity resolution, and user application interface construction. Schema matching and entity resolution are two important components of such a system, and although they are interrelated, they have so far been studied largely independently. This article proposes the idea of using schema matching to promote entity resolution, and presents the SMPER algorithm based on this idea. By fully exploiting the correlation between schema matching and entity resolution, the algorithm effectively improves precision and recall, which supports the correctness and feasibility of the idea.
Source
Journal of Shandong Normal University (Natural Science)
CAS
2012, No. 2, pp. 35-39, 44 (6 pages)
Keywords
web information integration
schema matching
entity resolution