A Unified Framework for Cross-Domain Sentiment Classification

Cited by: 10

Abstract: Sentiment classification of documents aims to determine whether the opinion expressed in a given document is supportive or opposed (e.g., positive or negative). Existing studies have shown that supervised classification approaches perform well on this task. In most cases, however, the available labeled data and the data to be classified do not belong to the same domain, and the performance of a supervised classifier drops sharply when it is transferred from one domain to another. This gives rise to the cross-domain sentiment classification problem, which is attracting increasing attention. To address it, a unified framework is proposed that performs cross-domain sentiment classification in several stages. First, the accurate labels of the source-domain (training) documents are used to obtain initial labels for the target-domain (test) documents. Then the target domain is modeled as a weighted network, and some target-domain documents whose opinions have been determined more accurately are chosen as "source components" and "sink components". Finally, a heat conduction process is applied iteratively to the weighted network, with the help of these source and sink components, to improve the classification of the target-domain data. Experiments use data from three different domains, transferring between two of them at a time. The results indicate that the proposed framework substantially improves the accuracy of cross-domain sentiment classification.
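The abstract does not give the update rule, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a generic discrete heat-diffusion scheme: the most confident positive and negative documents are clamped at +1 and -1 as sources and sinks, and every other document repeatedly takes the similarity-weighted average of its neighbors' scores. The function names, the `n_anchor` parameter, and the toy similarity matrix are all hypothetical.

```python
from typing import List


def heat_conduction_scores(
    weights: List[List[float]],
    initial: List[float],
    n_anchor: int = 1,
    n_iter: int = 100,
) -> List[float]:
    """Diffuse sentiment "temperature" over a weighted document graph.

    weights  -- symmetric, non-negative document similarity matrix (assumed input)
    initial  -- initial scores in [-1, 1], e.g. from a source-domain classifier
    n_anchor -- how many of the most confident positive / negative documents
                are clamped as sources (+1) and sinks (-1); assumes
                2 * n_anchor <= len(initial)
    """
    n = len(initial)
    order = sorted(range(n), key=lambda i: initial[i])
    sinks = set(order[:n_anchor])      # most confidently negative documents
    sources = set(order[-n_anchor:])   # most confidently positive documents

    temp = list(initial)
    for i in sources:
        temp[i] = 1.0
    for i in sinks:
        temp[i] = -1.0

    for _ in range(n_iter):
        new_temp = list(temp)
        for i in range(n):
            if i in sources or i in sinks:
                continue  # boundary documents keep a fixed temperature
            total = sum(weights[i][j] for j in range(n) if j != i)
            if total > 0:
                # Interior document: weighted average of its neighbors.
                new_temp[i] = sum(
                    weights[i][j] * temp[j] for j in range(n) if j != i
                ) / total
        temp = new_temp
    return temp


def classify(scores: List[float]) -> List[str]:
    """Map final temperatures to sentiment labels (threshold at zero)."""
    return ["positive" if s >= 0 else "negative" for s in scores]


# Toy chain of four documents: 0 - 1 - 2 - 3.
toy_weights = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]
toy_initial = [0.9, -0.1, 0.2, -0.8]
toy_labels = classify(heat_conduction_scores(toy_weights, toy_initial))
```

In this toy chain, documents 1 and 2 start with weak, arguably wrong initial scores. After diffusion each one ends up with the sign of the confident anchor it is closest to, which is the kind of correction the weighted-network stage is meant to provide.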

Authors: Wu Qiong (吴琼), Liu Yue (刘悦), Shen Huawei (沈华伟), Zhang Jin (张瑾), Xu Hongbo (许洪波), Cheng Xueqi (程学旗)

Affiliation: Institute of Computing Technology, Chinese Academy of Sciences

Source: Journal of Computer Research and Development (《计算机研究与发展》), indexed in EI, CSCD, and the Peking University Core Journals list; 2013, No. 8, pp. 1683-1689 (7 pages)

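Funding: National Natural Science Foundation of China (61100083, 60903139, 61173064); Key Program of the National Natural Science Foundation of China (60933005); National "242" Security Special Fund (2011F65, 2011A001); National Basic Research Program of China ("973" Program) (2012CB316303)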

Keywords: cross-domain sentiment classification; heat conduction model; opinion analysis; transfer learning; weighted network

Classification code: TP391.1 [Automation and Computer Technology: Computer Application Technology]


