Abstract
To mine product features from Chinese online customer reviews, this paper proposes an unsupervised method based on the Apriori algorithm. Apriori is used to extract a set of candidate features, and two pruning algorithms are designed: an adjacency-rule pruning algorithm (compactness pruning) and a minimum independent support pruning algorithm (redundancy pruning). The adjacency distance and the minimum independent support (p-support) are determined experimentally. The experimental results show that both pruning algorithms effectively improve the precision and recall of product feature mining.
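The pipeline summarized in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact pruning rules, so the definitions of compactness and independent support used here, the function names (`frequent_itemsets`, `is_compact`, `mine_features`), the parameter names, and all threshold values are hypothetical. The sketch also assumes that word segmentation and part-of-speech filtering have already been applied, so each review sentence arrives as a list of candidate words; the toy tokens are English for readability.

```python
from collections import Counter
from itertools import combinations, product


def frequent_itemsets(sentences, min_count, max_size=2):
    """Level-wise Apriori over token sets; returns {frozenset: sentence count}."""
    token_sets = [set(s) for s in sentences]
    counts = Counter(tok for ts in token_sets for tok in ts)
    current = {frozenset([t]): c for t, c in counts.items() if c >= min_count}
    frequent = dict(current)
    size = 2
    while current and size <= max_size:
        items = sorted({t for itemset in current for t in itemset})
        # Apriori property: every (size-1)-subset of a candidate must be frequent.
        candidates = [
            frozenset(combo)
            for combo in combinations(items, size)
            if all(frozenset(sub) in current for sub in combinations(combo, size - 1))
        ]
        counts = Counter()
        for ts in token_sets:
            for cand in candidates:
                if cand <= ts:
                    counts[cand] += 1
        current = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(current)
        size += 1
    return frequent


def is_compact(sentence, itemset, max_distance):
    """True if the itemset's words occur at most max_distance positions apart."""
    positions = [[i for i, tok in enumerate(sentence) if tok == w] for w in itemset]
    if any(not p for p in positions):
        return False
    return any(max(c) - min(c) <= max_distance for c in product(*positions))


def mine_features(sentences, min_count, max_distance, min_compact, min_psupport):
    """Apriori candidates, then adjacency pruning and independent-support pruning."""
    frequent = frequent_itemsets(sentences, min_count)
    # Assumed adjacency rule: keep a multi-word candidate only if its words
    # appear close together in at least min_compact sentences.
    phrases = {
        fs for fs in frequent
        if len(fs) > 1
        and sum(is_compact(s, fs, max_distance) for s in sentences) >= min_compact
    }
    # Assumed independent support: sentences that contain the single word
    # but none of the kept phrases that include it.
    singles = set()
    for fs in frequent:
        if len(fs) != 1:
            continue
        supersets = [p for p in phrases if fs < p]
        if not supersets:
            singles.add(fs)
            continue
        independent = sum(
            1 for s in sentences
            if fs <= set(s) and not any(p <= set(s) for p in supersets)
        )
        if independent >= min_psupport:
            singles.add(fs)
    return phrases | singles


# Toy input: each inner list stands for the candidate words of one review sentence.
sentences = [
    ["battery", "life", "screen"],
    ["battery", "life"],
    ["battery", "life", "price"],
    ["screen", "price"],
    ["screen", "battery"],
]
features = mine_features(
    sentences, min_count=2, max_distance=1, min_compact=2, min_psupport=2
)
```

In this toy run the pair of "battery" and "screen" is frequent but rarely adjacent, so the adjacency rule drops it, while "battery" and "life" on their own are dropped because they almost never occur outside the kept phrase. The abstract reports that the distance and independent-support thresholds were chosen experimentally; the values above are placeholders only.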
Source
《计算机工程》
CAS
CSCD
PKU Core Journals (北大核心)
2011, Issue 23, pp. 43-45 (3 pages)
Computer Engineering
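Funding
National Natural Science Foundation of China (71001023)
Scientific Research Fund of the Heilongjiang Provincial Education Department (11553023)
Fundamental Research Funds for the Central Universities (DL11BB25)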
Keywords
review mining
association rule
product feature
pruning
unstructured information
unsupervised learning