
High-Structured Malicious Sample Detection Method Based on Deep Learning

Cited by: 2
Abstract: As security defenses such as attack detection and mitigation have improved, highly structured files (such as PDF and HTML) have become the main targets of vulnerability exploitation. Because such files have complex structures, diverse formats, and flexible user-defined rules, the patterns and rules of malicious samples are difficult to extract, so traditional pattern- and rule-based detection methods struggle to detect highly structured malicious samples. Operations such as boundary-value padding and malicious-code embedding change the byte-stream distribution of a malicious sample. Based on this difference in byte-stream distribution, this paper proposes a deep-learning-based method for detecting highly structured malicious samples (JLMethod). The method uses a convolutional neural network to classify the byte-stream features of sample files and can effectively detect malicious samples. In experiments on document-type PDF files, it achieves a false negative rate of 4.1‰ with 99.59% accuracy; in experiments on non-document HTML malicious samples (WebShell), it achieves a false negative rate of 8.5‰ with 98.89% accuracy. These results support the feasibility of the method for detecting highly structured malicious samples.
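The abstract's central observation, that padding or embedded code shifts a file's byte-stream distribution, can be illustrated with a minimal Python sketch. This is not the authors' JLMethod: the abstract does not give the preprocessing, input length, or network architecture, so the function names, the 128-byte input length, the `0x90` padding byte, and the hand-picked kernel below are all assumptions chosen for illustration. The last function mimics a single untrained convolution filter followed by global max pooling, which is the kind of operation a convolutional network would learn over such input.

```python
from collections import Counter


def byte_distribution(data: bytes) -> list:
    """Return the relative frequency of each of the 256 byte values."""
    if not data:
        return [0.0] * 256
    counts = Counter(data)
    total = len(data)
    return [counts.get(value, 0) / total for value in range(256)]


def to_fixed_length(data: bytes, length: int = 128) -> list:
    """Truncate or zero-pad a byte stream and scale each byte to [0, 1].

    The length of 128 is an arbitrary illustrative choice (assumption).
    """
    clipped = data[:length]
    padded = clipped + b"\x00" * (length - len(clipped))
    return [b / 255.0 for b in padded]


def conv1d_max(signal: list, kernel: list) -> float:
    """Slide one kernel over the signal (valid mode, stride 1) and
    return the largest response, i.e. global max pooling."""
    k = len(kernel)
    responses = [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]
    return max(responses)


# Hypothetical samples: a small HTML fragment, and the same fragment
# with 64 padding bytes appended (assumed padding value 0x90).
benign = b"<html><body>hello</body></html>"
padded = benign + b"\x90" * 64

dist_benign = byte_distribution(benign)
dist_padded = byte_distribution(padded)

# Hand-picked kernel (assumption): responds strongly to runs of high bytes.
kernel = [1.0, 1.0, 1.0, 1.0]
score_benign = conv1d_max(to_fixed_length(benign), kernel)
score_padded = conv1d_max(to_fixed_length(padded), kernel)

print("share of 0x90 bytes:", dist_benign[0x90], "vs", round(dist_padded[0x90], 3))
print("max filter response:", round(score_benign, 3), "vs", round(score_padded, 3))
```

In a trained network the kernels would be learned from labeled benign and malicious files rather than chosen by hand; the sketch only shows why a padded sample yields a different byte distribution and a different filter response.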
Authors: ZHAO Lei (赵磊); JIN Yinshan (金银山); LIU Qinliang (刘勤亮); ZHANG Yichen (张羿辰) (School of Cyber Science and Engineering, Wuhan University, Wuhan 430072, Hubei, China)
Source: Journal of Wuhan University: Natural Science Edition (《武汉大学学报(理学版)》), indexed in CAS, CSCD, and the Peking University Core list; 2019, No. 6, pp. 571-575 (5 pages)
Funding: National Natural Science Foundation of China (61672394, 61872273)
Keywords: malicious samples; deep learning; vulnerability; highly structured