Abstract
The decision tree classification algorithm C4.5 is one of the most widely used and classical classification algorithms in data mining. It has shortcomings, however: in particular, its processing of continuous attributes is time-consuming. This paper modifies the processing of continuous attributes to improve the computational efficiency of the algorithm. Compared with the original C4.5 algorithm, the improved algorithm builds decision trees with the same accuracy and at higher speed.
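For context on where the cost arises, the following is a minimal Python sketch of the usual way a C4.5-style learner picks a binary split threshold for one continuous attribute: sort the values, then score every midpoint between adjacent distinct values by information gain. It illustrates the baseline step only. The abstract does not describe the specific improvement proposed in the paper, so none is reproduced here. The function names `entropy` and `best_threshold` are hypothetical, and details of the real C4.5 (gain ratio, missing values, snapping the threshold to an observed value) are left out.

```python
from collections import Counter
from math import log2


def entropy(counts, total):
    """Shannon entropy in bits of a class-count mapping (zero counts are skipped)."""
    return -sum((c / total) * log2(c / total) for c in counts.values() if c > 0)


def best_threshold(values, labels):
    """Return (threshold, information_gain) for one continuous attribute.

    Hypothetical helper, not the paper's method: candidates are the midpoints
    between adjacent distinct sorted values, and class counts are updated
    incrementally so the scan after sorting is a single pass.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    right = Counter(label for _, label in pairs)  # class counts above the cut
    left = Counter()                              # class counts at or below the cut
    base = entropy(right, n)
    best = (None, 0.0)
    for i in range(n - 1):
        value, label = pairs[i]
        left[label] += 1
        right[label] -= 1
        next_value = pairs[i + 1][0]
        if value == next_value:
            continue  # no valid cut between equal values
        n_left = i + 1
        n_right = n - n_left
        gain = (base
                - (n_left / n) * entropy(left, n_left)
                - (n_right / n) * entropy(right, n_right))
        if gain > best[1]:
            best = ((value + next_value) / 2, gain)
    return best
```

Called as `best_threshold([1.0, 2.0, 3.0, 4.0], ["a", "a", "b", "b"])`, the sketch selects the midpoint 2.5, which separates the two classes completely. Because this search is repeated for every continuous attribute at every tree node, it is a natural target for the kind of speed-up the abstract mentions.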
Source
Computer and Modernization (《计算机与现代化》)
2010, No. 8, pp. 8-10 (3 pages)
Funding
Guizhou Provincial Governor's Fund project (200404)
Guizhou University Natural Science Youth Fund project (2009021)
Keywords
data mining
decision tree
C4.5 algorithm
continuous variables