Abstract
To balance classification accuracy against efficiency when classifying data streams with implicit concept drift, this paper proposes a framework based on clustering decision trees for handling rapidly arriving data streams. Data that cannot be classified in real time are pre-clustered into n clusters, and the clustering results are used to grow new VFDT branches or to replace existing ones. Experimental results show that the clustering decision tree framework achieves some improvement in both prediction accuracy and efficiency.
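The pre-clustering step can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm or say how clusters are attached to the VFDT, so everything here is an assumption: plain k-means stands in for the clustering method, nearest-centroid routing stands in for a new branch, and the names `kmeans`, `route`, and `pending` are hypothetical. No Hoeffding tree is implemented.

```python
from math import dist

def kmeans(points, n, iters=20):
    """Plain Lloyd's k-means (assumed method); seeds are the first n points."""
    centroids = [tuple(p) for p in points[:n]]
    for _ in range(iters):
        groups = [[] for _ in range(n)]
        for p in points:
            nearest = min(range(n), key=lambda i: dist(p, centroids[i]))
            groups[nearest].append(p)
        # Recompute each centroid; keep the old one if its group is empty.
        centroids = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids

def route(x, centroids):
    """Index of the cluster (stand-in for a candidate branch) nearest to x."""
    return min(range(len(centroids)), key=lambda i: dist(x, centroids[i]))

# Hypothetical buffer of instances that could not be classified on arrival.
pending = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.2, 4.9), (0.1, 0.2), (4.9, 5.0)]
centroids = kmeans(pending, n=2)
```

With the centroids in hand, `route((5.0, 5.0), centroids)` picks the cluster that a later instance would fall into. In the framework described above, each such cluster would inform a new or replacement branch rather than act as a classifier on its own.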
Source
Network & Computer Security (《计算机安全》)
2014, No. 3, pp. 18-22 (5 pages)
Keywords
data stream
concept drift
clustering
classification