
数据空间中时间为中心的集合实体识别策略 (cited by: 4)

Time-Centered Collective Entity Resolution Strategy in Dataspace
Abstract: A dataspace is a heterogeneous environment in which both data and schemas evolve over time. Existing entity resolution (ER) techniques seldom consider the role that temporal information plays in resolution, and they do not account for the way entities evolve over time. To address ER over data with temporal information in a dataspace, this paper proposes a four-stage time-centered collective entity resolution (T-CER) strategy. T-CER takes temporal information into account at each stage of the ER process: in the resolution stage it introduces a time-based clustering (T-Clustering) algorithm, and it then checks the resolution results against time-based constraints to obtain more accurate results. Extensive experiments on real-world data sets demonstrate the feasibility and effectiveness of T-CER.
Source: Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》), CSCD, 2012, No. 11, pp. 974-984 (11 pages)
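Funding: National Natural Science Foundation of China (60973021, 61003060); National Basic Research Program of China (973 Program) (2012CB316201); Fundamental Research Funds for the Central Universities (N100704001)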
Keywords: dataspace; collective entity resolution; temporal information

