Abstract
Sentiment-object (opinion target) extraction from Chinese sentences identifies the object, or the attribute of an object, that an opinion in the sentence is directed at. Existing work in China has not been able to effectively identify compound-word opinion targets or unknown (out-of-vocabulary) opinion targets. To address both cases, this paper proposes an extraction method for Chinese sentences based on cascaded conditional random fields (CRFs). The method first obtains a set of candidate opinion targets from a lower-level CRF model. Middle-level models then refine this set: a noise reduction model filters out noise, a complement model adds missing candidates, and a merging model combines candidates that form compound phrases. Finally, a higher-level model takes the refined candidate set as input and extracts the opinion targets. Experiments show that, compared with a recognition method based on linear-chain CRFs, the method improves precision, recall, and F1 by 1.62%, 5.75%, and 4.17%, respectively. It effectively identifies compound-word and unknown opinion targets, and thereby improves the accuracy of opinion target identification in Chinese sentences.
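The middle-level refinement described in the abstract (filter noise, add missing candidates, merge compound phrases) can be illustrated with a minimal sketch. The abstract does not specify how these models are built, so everything here is an assumption made for illustration: candidates are represented as token spans, and the hypothetical `noise_words` and `lexicon` sets stand in for the paper's noise reduction and complement models. The lower-level and higher-level CRF models are not implemented; the lower-level output is simply passed in as a list of spans.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int]  # half-open token range [start, end)


def filter_noise(tokens: List[str], spans: List[Span], noise_words: Set[str]) -> List[Span]:
    """Drop candidates whose text appears in a (hypothetical) noise word list."""
    return [s for s in spans if "".join(tokens[s[0]:s[1]]) not in noise_words]


def complement(tokens: List[str], spans: List[Span], lexicon: Set[str]) -> List[Span]:
    """Add single-token candidates for (hypothetical) lexicon words not yet covered."""
    covered = {i for start, end in spans for i in range(start, end)}
    extra = [(i, i + 1) for i, tok in enumerate(tokens)
             if tok in lexicon and i not in covered]
    return sorted(spans + extra)


def merge_adjacent(spans: List[Span]) -> List[Span]:
    """Merge candidates that touch or overlap into one compound candidate."""
    merged: List[Span] = []
    for start, end in sorted(spans):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def middle_layer(tokens: List[str], low_level_spans: List[Span],
                 noise_words: Set[str], lexicon: Set[str]) -> List[Span]:
    """Refine lower-level candidates before they reach the higher-level model."""
    spans = filter_noise(tokens, low_level_spans, noise_words)
    spans = complement(tokens, spans, lexicon)
    return merge_adjacent(spans)


if __name__ == "__main__":
    tokens = ["这", "款", "手机", "的", "电池", "续航", "很", "好"]
    # Assumed lower-level output: one noisy candidate ("这") and one real one ("电池").
    refined = middle_layer(tokens, [(0, 1), (4, 5)], {"这"}, {"续航"})
    print(["".join(tokens[a:b]) for a, b in refined])  # prints ['电池续航']
```

In this toy run the noisy candidate is dropped, the missed word "续航" is added, and the two adjacent candidates are merged into the compound target "电池续航", which is the kind of compound-word target the abstract says a single linear-chain CRF tends to miss.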
Source
《中文信息学报》
CSCD
Peking University Core Journals (北大核心)
2013, Issue 3, pp. 69-76 (8 pages)
Journal of Chinese Information Processing
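Funding
Natural Science Foundation of Fujian Province (2010J05133)
Fujian Province Science and Technology Innovation Platform Program (2009J1007)
Fuzhou University Science and Technology Development Fund (2010-XQ-22)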
Keywords
sentiment objects (opinion targets)
cascaded conditional random fields
noise reduction model
complement model