Abstract
Supervised learning methods perform poorly on cross-domain sentiment analysis of text. To address this problem, we propose an adaptive inter-domain sentiment classification method based on centroid transfer. The method uses labelled documents from the source domain to classify a large number of unlabelled documents in the target domain and adds a subset of the high-confidence documents to the training set. At the same time, it removes source-domain documents that lie far from the centroid of the target-domain test set. Through iteration, the centroid distance between the two domains gradually narrows, which reduces the difference between the domains. Experimental results show that the method improves the precision of cross-domain sentiment orientation analysis.
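The iterative procedure described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not name the base classifier, the confidence measure, or any thresholds, so the nearest-centroid classifier, the distance-margin confidence score, the per-class selection, and the parameters `n_iter`, `n_add`, and `n_drop` are all assumptions made here. Documents are assumed to be already converted to numeric feature vectors.

```python
import numpy as np

def class_distances(train_X, train_y, X):
    """Distance from each row of X to each class centroid of the training set."""
    classes = np.unique(train_y)
    centroids = np.stack([train_X[train_y == c].mean(axis=0) for c in classes])
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return classes, dists

def centroid_transfer(Xs, ys, Xt, n_iter=3, n_add=1, n_drop=1):
    """Hypothetical sketch of centroid-transfer self-training.

    Xs, ys: labelled source-domain vectors and labels.
    Xt: unlabelled target-domain vectors.
    Returns predicted labels for Xt and the domain centroid gap per iteration.
    """
    train_X, train_y = Xs.astype(float), ys.copy()
    is_source = np.ones(len(Xs), dtype=bool)   # rows that came from the source domain
    unlabeled = np.arange(len(Xt))             # target rows not yet pseudo-labelled
    target_centroid = Xt.mean(axis=0)
    gaps = []

    for _ in range(n_iter):
        # Track the distance between the training-set and target centroids.
        gaps.append(float(np.linalg.norm(train_X.mean(axis=0) - target_centroid)))
        if len(unlabeled) == 0:
            break
        classes, dists = class_distances(train_X, train_y, Xt[unlabeled])
        if len(classes) < 2:
            break
        pred = classes[dists.argmin(axis=1)]
        # Assumed confidence score: margin between the two nearest class centroids.
        ordered = np.sort(dists, axis=1)
        margin = ordered[:, 1] - ordered[:, 0]

        # Add the most confident target documents of each predicted class.
        chosen = []
        for c in classes:
            idx = np.flatnonzero(pred == c)
            chosen.extend(idx[np.argsort(-margin[idx])[:n_add]])
        chosen = np.array(chosen, dtype=int)
        train_X = np.vstack([train_X, Xt[unlabeled[chosen]]])
        train_y = np.concatenate([train_y, pred[chosen]])
        is_source = np.concatenate([is_source, np.zeros(len(chosen), dtype=bool)])
        unlabeled = np.delete(unlabeled, chosen)

        # Remove the source documents farthest from the target-domain centroid.
        src_idx = np.flatnonzero(is_source)
        if len(src_idx) > n_drop:
            src_dist = np.linalg.norm(train_X[src_idx] - target_centroid, axis=1)
            far = src_idx[np.argsort(-src_dist)[:n_drop]]
            keep = np.ones(len(train_X), dtype=bool)
            keep[far] = False
            train_X, train_y, is_source = train_X[keep], train_y[keep], is_source[keep]

    # Final classification of every target document with the adapted training set.
    classes, dists = class_distances(train_X, train_y, Xt)
    return classes[dists.argmin(axis=1)], gaps
```

The returned `gaps` list is there only so the centroid distance can be inspected across iterations. This simplified sketch does not guarantee that the distance shrinks at every step.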
Source
Computer Applications and Software (《计算机应用与软件》)
CSCD
2011, No. 12, pp. 26-28, 74 (4 pages in total)
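Funding
National Natural Science Foundation of China (90920004, 60970056, 60873150)
Natural Science Foundation of Jiangsu Province (BK2008160)
Major Basic Research Project of Natural Science for Higher Education Institutions of Jiangsu Province (08KJA520002)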
Keywords
Domain adaptation
Sentiment analysis
Centroid transfer
Opinion classification