Chinese Phrase Tagging and Automated Annotation Based on CSSCI Corpus

Original title: CSSCI语料中短语结构标注与自动识别 (cited by: 2)
Abstract: This paper introduces phrase structure annotation into the analysis of bibliographic records of CSSCI journal articles. It examines, from a grammatical perspective, the relations among the words that make up keywords and terms, and aims to reveal the semantic knowledge they carry through analysis of their grammatical functions. After annotating a corpus of a certain size, the paper derives the structural features of phrases in academic literature from statistics on phrase vocabulary and parts of speech and from analysis of phrase grammatical functions. These features are then combined with phrase features extracted from the Tsinghua Treebank to improve the recognition rate of phrase structures in scientific and technical literature.
Source: New Technology of Library and Information Service (《现代图书情报技术》), indexed in CSSCI and the Peking University Core Journals list, 2012, No. 12, pp. 32-38 (7 pages)
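Funding: This work is one of the research outputs of the National Natural Science Foundation of China General Program project "Research on Knowledge Organization Models and Applications Oriented to Knowledge Services" (No. 71273126); the National High Technology Research and Development Program (863 Program) project "Development of a Search Engine Primarily Serving Scientific and Technical Literature" (No. 2011AA01A206); and the Jiangsu Provincial Department of Education University Philosophy and Social Science Research Fund project "Research on an Ontology-Based Monitoring and Early-Warning Model for Online Public Opinion on University Emergencies" (No. 2010SJB870003).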
Keywords: phrase annotation; CSSCI corpus; multi-feature; auto-identification