Abstract
Sentiment analysis of Chinese text has attracted growing attention in academic research in recent years. This paper analyzes the sentiment orientation of Chinese text at the sentence level. Taking the sentiment word list published by HowNet as the seed set, it uses an iterative semi-supervised (bootstrapping) method based on a CRF model to obtain a large number of evaluation words. Opinion sentences are then identified from these evaluation words, and their polarity is judged with semantic rules designed over the relations between Chinese words. Applied to Task 2 of COAE2011 (opinion sentence identification), the method achieved good results in both evaluation word recognition and opinion sentence polarity judgment.
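The bootstrapping idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives neither the CRF features nor the semantic rules, so `context_tagger` is a hypothetical stand-in for training and applying a CRF model, the single negation rule in `sentence_polarity` is an assumed example of a semantic rule, and the English toy tokens replace real segmented Chinese text.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple

Labeled = List[List[Tuple[str, str]]]  # sentences of (token, tag) pairs


def bootstrap_lexicon(
    seeds: Iterable[str],
    sentences: List[List[str]],
    train_and_tag: Callable[[Labeled, List[List[str]]], Set[str]],
    max_rounds: int = 5,
) -> Set[str]:
    """Grow an evaluation-word lexicon by iterative self-training."""
    lexicon = set(seeds)
    for _ in range(max_rounds):
        # Auto-label the corpus with the current lexicon.
        labeled = [
            [(tok, "EVAL" if tok in lexicon else "O") for tok in sent]
            for sent in sentences
        ]
        # Train a tagger on the auto-labels and collect predicted words.
        new_words = train_and_tag(labeled, sentences) - lexicon
        if not new_words:
            break  # converged: nothing new was found
        lexicon |= new_words
    return lexicon


def context_tagger(labeled: Labeled, sentences: List[List[str]]) -> Set[str]:
    """Hypothetical stand-in for a CRF: learn which tokens precede
    known evaluation words, then tag any token after such a context."""
    contexts = {
        sent[i - 1][0]
        for sent in labeled
        for i in range(1, len(sent))
        if sent[i][1] == "EVAL"
    }
    return {
        sent[i]
        for sent in sentences
        for i in range(1, len(sent))
        if sent[i - 1] in contexts
    }


def sentence_polarity(
    tokens: List[str], polarity: Dict[str, int], negators: Set[str]
) -> int:
    """Assumed rule: a negator flips the next evaluation word's polarity.
    Returns 1 (positive), -1 (negative) or 0 (neutral / no opinion)."""
    score, flip = 0, False
    for tok in tokens:
        if tok in negators:
            flip = not flip
        elif tok in polarity:
            score += -polarity[tok] if flip else polarity[tok]
            flip = False
    return (score > 0) - (score < 0)


# Toy corpus (assumption: already word-segmented).
SENTENCES = [
    ["screen", "very", "good"],
    ["battery", "very", "weak"],
    ["price", "quite", "weak"],
    ["sound", "quite", "clear"],
]

lexicon = bootstrap_lexicon({"good"}, SENTENCES, context_tagger)
```

Starting from the single seed `good`, the loop first picks up `weak` through the shared context `very`, and then `clear` through the context `quite` that `weak` contributes. This is the sense in which each iteration enlarges the evaluation-word set. In a real system the stand-in tagger would be replaced by a trained CRF sequence labeler.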
Source
Journal of the China Society for Scientific and Technical Information (《情报学报》)
CSSCI
Peking University Core Journals (北大核心)
2012, No. 10, pp. 1071-1076 (6 pages)
Funding
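Supported by the National Natural Science Foundation of China project "Research on Credibility Analysis of Product Review Information Based on Text Semantic Mining" (71103085), the Ministry of Education Humanities and Social Sciences Research Planning Fund project "Research on Semantics-Based Extraction of Subjective/Objective Information on E-Commerce Products" (09YJA870015), and the Jiangsu Province Graduate Research and Innovation Program project "Domain Ontology-CRF-Based Product Subjective" (CXIOS-001R).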
Keywords
CRF
opinion sentence (sentiment sentence)
semi-supervised
sentiment orientation (sentiment analysis)