Abstract
Application identification and traffic classification are prerequisites for network management, security, and research. As networks grow rapidly and new applications keep emerging, traditional classifiers based on transport-layer port numbers have become less effective, and deep packet inspection carries high overhead, so neither meets current needs. This paper verifies that the statistical properties of network traffic can effectively discriminate between applications, proposes a supervised traffic classification method based on the C4.5 decision tree classifier, and discusses two refinements: boosting and feature selection. Experimental results show that the C4.5 classifier has moderate training complexity, high accuracy, and fast classification. Boosting further improves accuracy at the cost of much longer training time and slightly slower classification, while feature selection speeds up classification at a slight cost in accuracy.
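As an illustration of how a C4.5-style tree chooses a split on a statistical flow feature, the following is a minimal sketch of the gain ratio criterion for a binary threshold split. It is not the authors' implementation: the feature (mean packet size), the application labels, the toy values, and the helper names `gain_ratio` and `best_threshold` are all hypothetical, and the threshold search is simplified relative to a full C4.5 implementation.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """Gain ratio of the binary split `value <= threshold` on one numeric feature."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # a split that leaves one side empty carries no information
    n = len(labels)
    weights = [len(left) / n, len(right) / n]
    # Information gain: entropy before the split minus weighted entropy after it.
    info_gain = entropy(labels) - (weights[0] * entropy(left) + weights[1] * entropy(right))
    # Split information penalizes very unbalanced partitions.
    split_info = -sum(w * math.log2(w) for w in weights)
    return info_gain / split_info


def best_threshold(values, labels):
    """Score midpoints between consecutive distinct values (a simplification).

    Returns (threshold, gain_ratio), or None if the feature has one distinct value.
    """
    distinct = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    if not candidates:
        return None
    return max(((t, gain_ratio(values, labels, t)) for t in candidates), key=lambda p: p[1])


# Hypothetical toy data: mean packet size in bytes per flow, with application labels.
mean_pkt_size = [80, 95, 110, 900, 1100, 1300]
app = ["dns", "dns", "dns", "web", "web", "web"]
threshold, ratio = best_threshold(mean_pkt_size, app)
print(threshold, ratio)
```

On this toy data the two classes are perfectly separated by packet size, so the selected threshold falls between 110 and 900 bytes. Real traffic would need many features and a full tree-growing and pruning procedure.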
Source
Journal of Chinese Computer Systems (《小型微型计算机系统》)
CSCD
Peking University Core Journals
2009, No. 11, pp. 2150-2156 (7 pages)
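Funding
Supported by the National Natural Science Foundation of China-Guangdong Joint Fund Key Project (U0735002)
Supported by the National High Technology Research and Development Program of China ("863" Program) (2007AA01Z449)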
Keywords
Internet traffic classification
machine learning
decision tree
boosting
feature selection