Multiple-classifiers Opinion Sentence Recognition in Chinese Micro-blog Based on D-S Theory (基于证据理论的多分类器中文微博观点句识别)

Cited by: 8
Abstract: As new technologies and social networks have developed and spread, the volume of micro-blog user data has grown sharply, and related research has drawn attention from both academia and industry. Targeting the characteristics of Chinese micro-blog sentences, this paper compares several feature selection methods and proposes a new statistical method for feature extraction. Using a word dictionary and a part-of-speech dictionary built for the task, it analyzes the classification performance of Support Vector Machine (SVM), Naive Bayes and K-Nearest Neighbour (KNN) models, and then combines the classifiers with D-S evidence theory to recognize opinion sentences in Chinese micro-blogs. Experiments use the data provided by the CCF Conference on Natural Language Processing and Chinese Computing (NLP&CC 2012). The method reaches 70.6% precision, 89.2% recall and 78.9% F-measure, whereas the averages of the published NLP&CC 2012 evaluation results are 72.7%, 61.5% and 64.7%, respectively. The method therefore exceeds the evaluation average in recall and F-measure (but not in precision), and its F-measure is 0.5% higher than the best result reported in the NLP&CC 2012 evaluation.
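The abstract does not say how the classifier outputs are turned into evidence, so the following is only a minimal sketch of the general idea, under stated assumptions. It applies Dempster's rule of combination over the two-element frame {opinion, non_opinion}. The helper `to_mass`, the `reliability` discount weights, and the example probabilities are hypothetical and are not taken from the paper.

```python
from itertools import product

# Frame of discernment: a sentence is either an opinion sentence or not.
OPINION = frozenset({"opinion"})
NON_OPINION = frozenset({"non_opinion"})
THETA = OPINION | NON_OPINION  # the whole frame, i.e. "undecided"


def to_mass(p_opinion, reliability):
    """Hypothetical helper: turn one classifier's P(opinion) into a mass function.

    The mass 1 - reliability is left on THETA (assumed discounting scheme).
    """
    return {
        OPINION: reliability * p_opinion,
        NON_OPINION: reliability * (1.0 - p_opinion),
        THETA: 1.0 - reliability,
    }


def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule."""
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass assigned to the empty set
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule is undefined")
    norm = 1.0 - conflict
    return {focal: w / norm for focal, w in combined.items()}


def fuse(masses):
    """Fold Dempster's rule over a list of mass functions."""
    result = masses[0]
    for m in masses[1:]:
        result = dempster_combine(result, m)
    return result


def classify(masses):
    """Pick the singleton hypothesis with the larger fused mass."""
    fused = fuse(masses)
    if fused.get(OPINION, 0.0) >= fused.get(NON_OPINION, 0.0):
        return "opinion"
    return "non_opinion"


# Illustrative values only: three classifiers (say SVM, Naive Bayes, KNN)
# with made-up probabilities and reliabilities.
example = [to_mass(0.8, 0.9), to_mass(0.6, 0.7), to_mass(0.3, 0.5)]
label = classify(example)
```

In this sketch a less reliable classifier leaves more mass on the undecided set, so it pulls the fused decision less strongly. The scheme the paper actually uses may differ.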
Source: Computer Engineering (《计算机工程》), CAS, CSCD, 2014, No. 4, pp. 159-163, 169 (6 pages)
Funding: National Natural Science Foundation of China (61170192)
Keywords: micro-blog; opinion sentence; Support Vector Machine (SVM); Naive Bayes; K-Nearest Neighbour (KNN); D-S theory
