Name Disambiguation Based on Dependency Feature in Web Page Text

Cited by: 6
Abstract: This paper studies person name disambiguation on the Internet. The method extracts dependency features related to the target person-name entity in Web page text, together with auxiliary features such as named entities. A two-stage clustering algorithm first clusters the documents with high confidence according to their dependency features, then uses the auxiliary features to add the remaining documents to the existing clusters, which resolves the name ambiguity. Experimental results show that the method outperforms other name disambiguation methods.
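Source: Computer Engineering (《计算机工程》), CAS, CSCD, 2012, No. 19, pp. 133-136 (4 pages)
Funding: National Natural Science Foundation of China (60970056, 61070123, 61003155); Natural Science Foundation of Jiangsu Province (BK2008160); Specialized Research Fund for the Doctoral Program of Higher Education (20093201110006); Open Project Fund of the National Laboratory of Pattern Recognition
Keywords: name ambiguity; dependency feature; name disambiguation; named entity; clustering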
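
The two-stage clustering described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's similarity measure, thresholds, or feature encoding, so everything below is an assumption made for illustration: each document is a pair of feature sets, similarity is Jaccard overlap, the first stage uses single-link grouping, and the names `two_stage_cluster`, `dep_threshold`, and `aux_threshold` are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap of two feature sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def two_stage_cluster(docs, dep_threshold=0.5, aux_threshold=0.3):
    """docs: list of (dependency_features, auxiliary_features) set pairs.

    Returns a list of clusters, each a sorted list of document indices.
    Thresholds and the Jaccard measure are illustrative assumptions.
    """
    # Stage 1: link documents whose dependency features overlap strongly
    # (single-link grouping with a small union-find).
    parent = list(range(len(docs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    linked = set()
    for i, j in combinations(range(len(docs)), 2):
        if jaccard(docs[i][0], docs[j][0]) >= dep_threshold:
            parent[find(j)] = find(i)
            linked.update((i, j))

    groups = {}
    for i in sorted(linked):
        groups.setdefault(find(i), []).append(i)
    clusters = list(groups.values())

    # Stage 2: attach each remaining document to the cluster whose pooled
    # auxiliary features (e.g. named entities) it matches best; if no
    # cluster reaches aux_threshold, the document starts its own cluster.
    for i in range(len(docs)):
        if i in linked:
            continue
        best, best_sim = None, aux_threshold
        for cluster in clusters:
            pooled = set().union(*(docs[k][1] for k in cluster))
            sim = jaccard(docs[i][1], pooled)
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            clusters.append([i])
        else:
            best.append(i)
    return [sorted(c) for c in clusters]


# Toy pages that all mention the same ambiguous name (made-up features).
docs = [
    ({"ceo_of:acme", "founded:acme"}, {"Acme", "Boston"}),
    ({"ceo_of:acme", "founded:acme", "spoke_at:expo"}, {"Acme", "New York"}),
    ({"plays_for:lakers"}, {"Lakers", "NBA"}),
    (set(), {"Acme", "Boston"}),  # no dependency features extracted
]
print(two_stage_cluster(docs))  # [[0, 1, 3], [2]]
```

In this toy run, documents 0 and 1 are joined in the first stage by their shared dependency features. Document 3 has no dependency features and is attached in the second stage through its named entities, while document 2 matches nothing and forms its own cluster.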