Abstract
To determine the sentiment orientation of each evaluation object in massive e-commerce review data, and to address mismatches between evaluation objects and evaluation words, this paper proposes SSMCRFs (syntactic semantic and multi-grained conditional random fields), a method that extracts evaluation units with a multi-grained conditional random field model combining syntactic and semantic relations. First, review data crawled from Jingdong Mall serve as the base data, and the review text is processed for syntactic and semantic relations. Next, the TF-IDF algorithm is applied to the preprocessed data set for statistical analysis, in order to determine what users pay attention to. Finally, a conditional random field model is used to identify the evaluation units. Experimental results show that, in evaluation unit identification, SSMCRFs achieves a precision of 92.92%, a recall of 93.25%, and an F value of 93.08%. Compared with the method of Ma et al. (2017), SSMCRFs improves considerably in precision, recall, and F value.
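The TF-IDF step and the reported F value can be illustrated with a minimal Python sketch. It rests on assumptions not stated in the abstract: the plain tf × log(N/df) weighting, the helper names `tf_idf` and `f_measure`, and the toy reviews are all hypothetical, and the paper's actual weighting variant and data are not given here. The F value is taken to be the harmonic mean of the two reported rates (precision, rendered as "accuracy rate" in the original English abstract, and recall).

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    docs is a list of token lists. Weight = term frequency * log(N / df),
    an assumed textbook variant, not necessarily the paper's.
    """
    n_docs = len(docs)
    # Document frequency: number of reviews that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


def f_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical, pre-segmented reviews (not taken from the paper's data set).
reviews = [
    ["screen", "clear", "battery", "weak"],
    ["battery", "lasting", "price", "fair"],
    ["screen", "bright", "battery", "good"],
]
weights = tf_idf(reviews)

# Consistency check on the figures reported in the abstract.
reported_f = round(f_measure(92.92, 93.25), 2)
```

Under this plain weighting, a term that occurs in every review (here "battery") receives weight zero, so a study of user attention would likely need a smoothed or otherwise adjusted variant. Computed this way, the reported rates of 92.92% and 93.25% give an F value that rounds to the reported 93.08%.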
Authors
CHEN Ping (陈苹)
FENG Lin (冯林)
YU You (余游)
XU Qifeng (徐其凤)
College of Computer Science, Sichuan Normal University, Chengdu 610101
Source
Journal of Systems Science and Mathematical Sciences (《系统科学与数学》)
CSCD
Peking University Core Journals (北大核心)
2020, No. 1, pp. 63-80 (18 pages)
Funding
Supported by the National Key Technology R&D Program of China (2014BAH11F01).
Keywords
evaluation unit identification
syntactic analysis
semantic analysis
conditional random field model