Abstract
Traditional sentiment analysis methods mainly target documents from a single domain and perform poorly on documents from other domains. To address this problem, this paper proposes a feature-transformation-based method for cross-domain sentiment orientation analysis of product reviews. The method uses domain-independent words to link the domain-dependent words of the source domain with those of the target domain, transferring knowledge from the source domain to the target domain and thereby mitigating the drop in classifier performance caused by differing data distributions. Experiments on a corpus of product reviews show that the average accuracies over all corpora are 76.61% with Support Vector Machine (SVM) and 76.81% with logistic regression, both higher than the average results of the Baseline algorithm.
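The linking idea can be illustrated with a minimal sketch. It is not the paper's actual transformation, which the abstract does not specify. It assumes a simple document-level co-occurrence count, a hand-picked set of domain-independent words, and toy reviews. The names `cooccurrence` and `transform` are hypothetical, and the SVM or logistic regression step is omitted.

```python
from collections import Counter, defaultdict


def cooccurrence(docs, independent):
    """Count how often each domain-dependent word shares a document
    with each domain-independent word (assumed linking statistic)."""
    table = defaultdict(Counter)
    for doc in docs:
        words = set(doc)
        pivots = words & independent
        for word in words - independent:
            for pivot in pivots:
                table[word][pivot] += 1
    return table


def transform(doc, table, independent):
    """Re-express a document over domain-independent words only."""
    features = Counter()
    for word in doc:
        if word in independent:
            features[word] += 1.0
        elif word in table:
            total = sum(table[word].values())
            # Spread the word's weight over the pivots it co-occurs with.
            for pivot, count in table[word].items():
                features[pivot] += count / total
    return dict(features)


# Toy data (hypothetical): electronics reviews as source, book reviews as target.
independent = {"good", "bad"}
source_docs = [["sharp", "screen", "good"], ["blurry", "screen", "bad"]]
target_docs = [["gripping", "plot", "good"], ["boring", "plot", "bad"]]

table = cooccurrence(source_docs + target_docs, independent)
src = transform(["sharp"], table, independent)
tgt = transform(["gripping"], table, independent)
```

In this toy setting, "sharp" (source only) and "gripping" (target only) map to the same feature because both co-occur with "good", so a classifier trained on transformed source reviews could be applied to transformed target reviews.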
Source
《计算机工程》
CAS
CSCD
2013, No. 10, pp. 167-171 (5 pages)
Computer Engineering
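Funding
National Natural Science Foundation of China (61202254)
China Postdoctoral Science Foundation (2013M530918)
Fundamental Research Funds for the Central Universities (DC120101081, DC120101084)
General Scientific Research Project of the Education Department of Liaoning Province (L2012478)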
Keywords
feature transformation
opinion analysis
product review
source domain
target domain
domain-independent word
domain-dependent word