Abstract
A three-level filtering method is proposed for extracting opinion targets from product reviews on the Internet. In the first level, a bootstrapping algorithm extracts candidate opinion targets and opinion words from the review texts. In the second level, the association between opinion targets and opinion words is used to compute an association confidence for every candidate; candidates with high association confidence are extracted as opinion targets, which improves recognition accuracy. In the third level, an unrelated parallel domain is introduced to compute a domain confidence for each remaining candidate, which recovers low-frequency opinion targets. Experimental results on three public datasets show that the method significantly improves opinion target recognition.
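The abstract does not give the formulas behind the two confidence scores, so the following Python sketch only illustrates how the three levels could fit together. Everything in it is an assumption made for illustration: the function names, the input format (pre-extracted noun/modifier pairs), the association score (share of distinct candidate opinion words linked to a target), the domain score (relative-frequency ratio against the unrelated domain), and both thresholds. It is not the authors' implementation.

```python
from collections import Counter


def bootstrap(pairs, seed_opinion_words):
    """Level 1 (assumed form): grow candidates from (noun, modifier) pairs."""
    opinions, targets = set(seed_opinion_words), set()
    changed = True
    while changed:  # repeat until no new candidate is found
        changed = False
        for noun, modifier in pairs:
            if modifier in opinions and noun not in targets:
                targets.add(noun)
                changed = True
            if noun in targets and modifier not in opinions:
                opinions.add(modifier)
                changed = True
    return targets, opinions


def association_confidence(pairs, targets, opinions):
    """Level 2 (assumed score): share of distinct opinion words linked to each target."""
    linked = {t: set() for t in targets}
    for noun, modifier in pairs:
        if noun in linked and modifier in opinions:
            linked[noun].add(modifier)
    return {t: len(words) / len(opinions) for t, words in linked.items()}


def domain_confidence(word, in_counts, out_counts):
    """Level 3 (assumed score): relative frequency in-domain vs. an unrelated domain."""
    p_in = in_counts[word] / sum(in_counts.values())
    p_out = out_counts[word] / sum(out_counts.values())
    return p_in / (p_in + p_out) if p_in + p_out > 0 else 0.0


def extract_targets(pairs, seeds, in_counts, out_counts,
                    assoc_min=0.5, domain_min=0.8):
    """Chain the three levels; both thresholds are arbitrary illustrative values."""
    targets, opinions = bootstrap(pairs, seeds)
    assoc = association_confidence(pairs, targets, opinions)
    confident = {t for t in targets if assoc[t] >= assoc_min}
    # Low-association candidates get a second chance via the domain score.
    rescued = {t for t in targets - confident
               if domain_confidence(t, in_counts, out_counts) >= domain_min}
    return confident | rescued


# Toy data, invented for this example.
pairs = [("screen", "great"), ("screen", "sharp"), ("battery", "great"),
         ("battery", "weak"), ("day", "great"), ("stylus", "sharp")]
in_counts = Counter({"screen": 5, "battery": 4, "day": 3, "stylus": 1})
out_counts = Counter({"day": 6, "recipe": 7})
print(sorted(extract_targets(pairs, {"great"}, in_counts, out_counts)))
```

In this toy run, "stylus" has low association confidence but never appears in the unrelated domain, so the third level keeps it, while "day" is frequent in both domains and is dropped. That mirrors the role the abstract assigns to the domain confidence: recovering low-frequency targets.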
Source
Transactions of Beijing Institute of Technology (《北京理工大学学报》)
EI
CAS
CSCD
PKU Core (北大核心)
2016, No. 11, pp. 1154-1159 (6 pages)
Funding
Supported by the National Natural Science Foundation of China (61370137)
Keywords
opinion target extraction
opinion word
association confidence
domain confidence