Abstract
In association rule mining of information management data, the generation of candidate frequent itemsets involves repeated computation and redundant candidates. These cause the association features of the data to drift and increase the number of times the transaction database must be scanned when computing support counts. To address this, the paper proposes a deep mining algorithm that resists feature drift. The algorithm first processes the data, computes the entropy of the data mining indices, builds a weighted normalized matrix, and computes the distances between data features. It then uses the concept of data closeness to carry out deep data mining, which improves the algorithm's efficiency. Experimental data show that the algorithm mines faster and more effectively than existing algorithms of the same kind.
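The steps named in the abstract (index entropy, a weighted normalized matrix, feature distances, closeness) resemble the standard entropy-weight TOPSIS procedure. The abstract gives no formulas, so the following is only a minimal sketch under that assumption, not the authors' algorithm. The function names, the vector normalization, the treatment of every index as "larger is better", and the sample values are all illustrative assumptions.

```python
import math


def entropy_weights(matrix):
    """Entropy weight of each column (index); assumes strictly positive entries
    and at least two rows."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    k = 1.0 / math.log(n_rows)  # scales entropy into [0, 1]
    divergences = []
    for j in range(n_cols):
        column = [row[j] for row in matrix]
        total = sum(column)
        entropy = -k * sum((v / total) * math.log(v / total) for v in column)
        divergences.append(1.0 - entropy)  # more spread means more weight
    s = sum(divergences)
    if s == 0:  # every column is uniform: fall back to equal weights
        return [1.0 / n_cols] * n_cols
    return [d / s for d in divergences]


def weighted_normalized(matrix, weights):
    """Vector-normalize each column, then scale it by its weight."""
    n_cols = len(matrix[0])
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_cols)]
    return [[weights[j] * row[j] / norms[j] for j in range(n_cols)] for row in matrix]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def closeness(matrix):
    """Relative closeness of each row to the ideal point (assumed benefit indices)."""
    v = weighted_normalized(matrix, entropy_weights(matrix))
    n_cols = len(v[0])
    best = [max(row[j] for row in v) for j in range(n_cols)]
    worst = [min(row[j] for row in v) for j in range(n_cols)]
    scores = []
    for row in v:
        d_best, d_worst = euclidean(row, best), euclidean(row, worst)
        denom = d_best + d_worst
        scores.append(d_worst / denom if denom > 0 else 0.0)
    return scores


# Hypothetical data: rows are candidate rules, columns are mining indices.
sample = [[0.1, 0.2, 0.3], [0.5, 0.4, 0.6], [0.9, 0.8, 0.7]]
print(closeness(sample))
```

Under this reading, rows with a closeness near 1 lie near the ideal point and would be ranked first, while rows near 0 lie near the worst point.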
Source
《科技通报》
Peking University Core Journal (北大核心)
2013, No. 12, pp. 58-60 (3 pages)
Bulletin of Science and Technology
Keywords
information management
data mining
feature drift
normalization matrix
closeness