Abstract
To address the low accuracy, high overhead, and limited applicability of traditional network traffic classification methods, a semi-supervised network traffic classification method based on the Support Vector Machine (SVM) is proposed. During SVM training, the method uses incremental learning to determine the support vectors dynamically from the initial and newly added sample sets, which avoids unnecessary retraining and mitigates the loss of accuracy and the long classification time that an existing classifier suffers when new samples appear. An improved semi-supervised Tri-training method is used to train the classifiers cooperatively: a large number of unlabeled samples and a small number of labeled samples repeatedly refine the classifiers, which reduces the noisy data introduced by the auxiliary classifiers and overcomes the strict requirements that traditional co-verification places on the classification algorithm and the sample types. Experimental results show that the method clearly improves both the accuracy and the efficiency of network traffic classification.
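The cooperative training step described in the abstract can be illustrated with a minimal sketch of plain Tri-training. It is not the paper's improved variant: the error-rate checks of full Tri-training are omitted, a nearest-centroid learner stands in for the SVM to keep the example dependency-free, and bootstrap sampling is replaced by a deterministic split. The names `CentroidClassifier`, `tri_train`, and `predict`, the two-feature flow vectors, and the class labels are all hypothetical.

```python
from collections import Counter


class CentroidClassifier:
    """Stand-in base learner (NOT an SVM): assigns the class of the nearest centroid."""

    def fit(self, X, y):
        sums, counts = {}, Counter(y)
        for xi, yi in zip(X, y):
            acc = sums.setdefault(yi, [0.0] * len(xi))
            for j, v in enumerate(xi):
                acc[j] += v
        self.centroids = {c: [s / counts[c] for s in acc] for c, acc in sums.items()}
        return self

    def predict_one(self, x):
        def dist2(c):
            return sum((a - b) ** 2 for a, b in zip(x, self.centroids[c]))
        return min(self.centroids, key=dist2)


def tri_train(X_lab, y_lab, X_unlab, rounds=3):
    """Simplified Tri-training: each learner is retrained on its labeled subset
    plus the unlabeled samples on which the other two learners agree."""
    # Assumption: a deterministic substitute for bootstrap sampling. Learner i
    # drops every labeled sample whose index is congruent to i modulo 3.
    subsets = []
    for i in range(3):
        keep = [n for n in range(len(X_lab)) if n % 3 != i]
        subsets.append(([X_lab[n] for n in keep], [y_lab[n] for n in keep]))
    models = [CentroidClassifier().fit(Xs, ys) for Xs, ys in subsets]

    for _ in range(rounds):
        updated = []
        for i in range(3):
            j, k = [m for m in range(3) if m != i]
            Xs, ys = subsets[i]
            extra_X, extra_y = [], []
            for x in X_unlab:
                label_j = models[j].predict_one(x)
                # Pseudo-label x for learner i only if the other two agree.
                if label_j == models[k].predict_one(x):
                    extra_X.append(x)
                    extra_y.append(label_j)
            updated.append(CentroidClassifier().fit(Xs + extra_X, ys + extra_y))
        models = updated
    return models


def predict(models, x):
    """Majority vote of the three learners."""
    votes = Counter(m.predict_one(x) for m in models)
    return votes.most_common(1)[0][0]


# Hypothetical flow features: a few labeled flows and more unlabeled ones.
X_lab = [(1, 1), (9, 9), (2, 1), (8, 9), (1, 2), (9, 8)]
y_lab = ["web", "p2p", "web", "p2p", "web", "p2p"]
X_unlab = [(1.5, 1.5), (8.5, 8.5), (2, 2), (8, 8), (0.5, 1), (9.5, 9)]

models = tri_train(X_lab, y_lab, X_unlab)
print(predict(models, (0, 0)), predict(models, (10, 10)))
```

Swapping the stand-in learner for an SVM only requires an object with the same `fit` and `predict_one` methods. The incremental support-vector update mentioned in the abstract is a separate step that this sketch does not cover.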
Source
Journal of Computer Applications (《计算机应用》)
CSCD
Peking University Core Journals
2013, No. 6, pp. 1515-1518 (4 pages)
Funding
National Natural Science Foundation of China (61163058, 61172053)
Guangxi Natural Science Foundation (2011GXNSFB018076)
Keywords
network traffic classification
Support Vector Machine (SVM)
semi-supervised
incremental learning
co-training
Tri-training