Abstract
With the rise of the Web 2.0 era, research on microblogging has attracted wide attention from both academia and industry. This paper selects the verbs and adjectives in microblog texts as features and proposes a feature dimensionality reduction method based on a hierarchical structure. Feature polarity values are computed with a purpose-designed emoticon-based method. On this basis, a position weight calculation method based on feature polarity values is proposed, and an SVM is used as the machine learning model to classify microblog texts into three categories: positive, negative, and neutral. Experimental results show that the proposed approach classifies the sentiment of Chinese microblog texts fairly effectively.
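The abstract does not give the formula behind the emoticon-based polarity value, so the following is only a minimal sketch of one plausible reading: a feature word is scored by how often it co-occurs with positive versus negative emoticons. The emoticon sets, the function name `feature_polarity`, and the scoring rule `(p - n) / (p + n)` are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter

# Hypothetical emoticon lexicons; the abstract does not list the ones actually used.
POSITIVE_EMOTICONS = {"[laugh]", "[good]"}
NEGATIVE_EMOTICONS = {"[angry]", "[tears]"}


def feature_polarity(posts):
    """Estimate a polarity value in [-1, 1] for each feature word.

    `posts` is a list of (features, emoticons) pairs, where `features` are the
    verbs/adjectives already extracted from a post and `emoticons` are the
    emoticon tokens found in it. The scoring rule is an assumption.
    """
    pos_counts, neg_counts = Counter(), Counter()
    for features, emoticons in posts:
        has_pos = any(e in POSITIVE_EMOTICONS for e in emoticons)
        has_neg = any(e in NEGATIVE_EMOTICONS for e in emoticons)
        if has_pos == has_neg:
            # Skip posts with no sentiment-bearing emoticon or with mixed signals.
            continue
        target = pos_counts if has_pos else neg_counts
        # Count each feature once per post.
        target.update(set(features))

    polarity = {}
    for word in set(pos_counts) | set(neg_counts):
        p, n = pos_counts[word], neg_counts[word]
        # +1 means the word only appears with positive emoticons, -1 only with negative ones.
        polarity[word] = (p - n) / (p + n)
    return polarity
```

Under this assumed rule, a word that never appears alongside a sentiment-bearing emoticon receives no polarity value at all. A full pipeline would still need a fallback for such words, as well as the position weighting and SVM steps, none of which this sketch covers.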
Source
《计算机应用与软件》
CSCD
Peking University Core Journal
2014, No. 7, pp. 177-181 (5 pages)
Computer Applications and Software
Funding
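National Natural Science Foundation of China (61171159, 61271304)
Key Project of the Science and Technology Development Program of the Beijing Municipal Education Commission and Class B Key Project of the Beijing Natural Science Foundation (KZ201311232037)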
Keywords
Microblogging
Emoticon
Polarity value
Position weight
Sentiment classification