An information hiding algorithm is proposed, which hides information by embedding secret data into the palette of bitmap resources of portable executable (PE) files. This algorithm has higher security than some trad...An information hiding algorithm is proposed, which hides information by embedding secret data into the palette of bitmap resources of portable executable (PE) files. This algorithm has higher security than some traditional ones because of integrating secret data and bitmap resources together. Through analyzing the principle of bitmap resources parsing in an operating system and the layer of resource data in PE files, a safe and useful solution is presented to solve two problems that bitmap resources are incorrectly analyzed and other resources data are confused in the process of data embedding. The feasibility and effectiveness of the proposed algorithm are confirmed through computer experiments.展开更多
One aspect of cybersecurity,incorporates the study of Portable Executables(PE)files maleficence.Artificial Intelligence(AI)can be employed in such studies,since AI has the ability to discriminate benign from malicious...One aspect of cybersecurity,incorporates the study of Portable Executables(PE)files maleficence.Artificial Intelligence(AI)can be employed in such studies,since AI has the ability to discriminate benign from malicious files.In this study,an exclusive set of 29 features was collected from trusted implementations,this set was used as a baseline to analyze the presented work in this research.A Decision Tree(DT)and Neural Network Multi-Layer Perceptron(NN-MLPC)algorithms were utilized during this work.Both algorithms were chosen after testing a few diverse procedures.This work implements a method of subgrouping features to answer questions such as,which feature has a positive impact on accuracy when added?Is it possible to determine a reliable feature set to distinguish a malicious PE file from a benign one?when combining features,would it have any effect on malware detection accuracy in a PE file?Results obtained using the proposed method were improved and carried few observations.Generally,the obtained results had practical and numerical parts,for the practical part,the number of features and which features included are the main factors impacting the calculated accuracy,also,the combination of features is as crucial in these calculations.Numerical results included,finding accuracies with enhanced values,for example,NN_MLPC attained 0.979 and 0.98;for DT an accuracy of 0.9825 and 0.986 was attained.展开更多
The continuous development of cyberattacks is threatening digital transformation endeavors worldwide and leadsto wide losses for various organizations. These dangers have proven that signature-based approaches are ins...The continuous development of cyberattacks is threatening digital transformation endeavors worldwide and leadsto wide losses for various organizations. These dangers have proven that signature-based approaches are insufficientto prevent emerging and polymorphic attacks. Therefore, this paper is proposing a Robust Malicious ExecutableDetection (RMED) using Host-based Machine Learning Classifier to discover malicious Portable Executable (PE)files in hosts using Windows operating systems through collecting PE headers and applying machine learningmechanisms to detect unknown infected files. The authors have collected a novel reliable dataset containing 116,031benign files and 179,071 malware samples from diverse sources to ensure the efficiency of RMED approach.The most effective PE headers that can highly differentiate between benign and malware files were selected totrain the model on 15 PE features to speed up the classification process and achieve real-time detection formalicious executables. The evaluation results showed that RMED succeeded in shrinking the classification timeto 91 milliseconds for each file while reaching an accuracy of 98.42% with a false positive rate equal to 1.58. Inconclusion, this paper contributes to the field of cybersecurity by presenting a comprehensive framework thatleverages Artificial Intelligence (AI) methods to proactively detect and prevent cyber-attacks.展开更多
针对基于深度学习的可移植执行(PE)恶意软件检测方法中,数据集存在的不平衡或不完整问题,以及神经网络结构过深或特征集庞大而导致的模型计算资源开销和耗时增加问题,提出一种基于浅层人工神经网络(SANN)的PE恶意软件静态检测模型。首先...针对基于深度学习的可移植执行(PE)恶意软件检测方法中,数据集存在的不平衡或不完整问题,以及神经网络结构过深或特征集庞大而导致的模型计算资源开销和耗时增加问题,提出一种基于浅层人工神经网络(SANN)的PE恶意软件静态检测模型。首先,利用LIEF(Library to Instrument Executable Formats)库创建PE特征提取器从EMBER数据集中提取PE文件样本,并提出一种特征组合,该特征集具备更少的PE文件特征,从而在减小特征空间和模型参数量的同时能够提高深度学习模型的性能;其次,生成特征向量,通过数据清洗去除未标记的样本;再次,对特征集内的不同特征值进行归一化处理;最后,将特征向量输入SANN中进行训练和测试。实验结果表明,SANN可达到95.64%的召回率和95.24%的准确率,相较于MalConv模型和LightGBM模型,SANN的准确率分别提高了1.19和1.57个百分点。SANN的总工作耗时约为用时最少的对比模型LightGBM的1/2。此外,SANN在面对未知攻击时具备较好的弹性,且仍能够保持较高的检测水平。展开更多
The growing threat of malware,particularly in the Portable Executable(PE)format,demands more effective methods for detection and classification.Machine learning-based approaches exhibit their potential but often negle...The growing threat of malware,particularly in the Portable Executable(PE)format,demands more effective methods for detection and classification.Machine learning-based approaches exhibit their potential but often neglect semantic segmentation of malware files that can improve classification performance.This research applies deep learning to malware detection,using Convolutional Neural Network(CNN)architectures adapted to work with semantically extracted data to classify malware into malware families.Starting from the Malconv model,this study introduces modifications to adapt it to multi-classification tasks and improve its performance.It proposes a new innovative method that focuses on byte extraction from Portable Executable(PE)malware files based on their semantic location,resulting in higher accuracy in malware classification than traditional methods using full-byte sequences.This novel approach evaluates the importance of each semantic segment to improve classification accuracy.The results revealed that the header segment of PE files provides the most valuable information for malware identification,outperforming the other sections,and achieving an average classification accuracy of 99.54%.The above reaffirms the effectiveness of the semantic segmentation approach and highlights the critical role header data plays in improving malware detection and classification accuracy.展开更多
基金supported by the Applied Basic Research Programs of Sichuan Province under Grant No. 2010JY0001the Fundamental Research Funds for the Central Universities under Grant No. ZYGX2010J068
文摘An information hiding algorithm is proposed, which hides information by embedding secret data into the palette of bitmap resources of portable executable (PE) files. This algorithm has higher security than some traditional ones because of integrating secret data and bitmap resources together. Through analyzing the principle of bitmap resources parsing in an operating system and the layer of resource data in PE files, a safe and useful solution is presented to solve two problems that bitmap resources are incorrectly analyzed and other resources data are confused in the process of data embedding. The feasibility and effectiveness of the proposed algorithm are confirmed through computer experiments.
文摘One aspect of cybersecurity,incorporates the study of Portable Executables(PE)files maleficence.Artificial Intelligence(AI)can be employed in such studies,since AI has the ability to discriminate benign from malicious files.In this study,an exclusive set of 29 features was collected from trusted implementations,this set was used as a baseline to analyze the presented work in this research.A Decision Tree(DT)and Neural Network Multi-Layer Perceptron(NN-MLPC)algorithms were utilized during this work.Both algorithms were chosen after testing a few diverse procedures.This work implements a method of subgrouping features to answer questions such as,which feature has a positive impact on accuracy when added?Is it possible to determine a reliable feature set to distinguish a malicious PE file from a benign one?when combining features,would it have any effect on malware detection accuracy in a PE file?Results obtained using the proposed method were improved and carried few observations.Generally,the obtained results had practical and numerical parts,for the practical part,the number of features and which features included are the main factors impacting the calculated accuracy,also,the combination of features is as crucial in these calculations.Numerical results included,finding accuracies with enhanced values,for example,NN_MLPC attained 0.979 and 0.98;for DT an accuracy of 0.9825 and 0.986 was attained.
文摘The continuous development of cyberattacks is threatening digital transformation endeavors worldwide and leadsto wide losses for various organizations. These dangers have proven that signature-based approaches are insufficientto prevent emerging and polymorphic attacks. Therefore, this paper is proposing a Robust Malicious ExecutableDetection (RMED) using Host-based Machine Learning Classifier to discover malicious Portable Executable (PE)files in hosts using Windows operating systems through collecting PE headers and applying machine learningmechanisms to detect unknown infected files. The authors have collected a novel reliable dataset containing 116,031benign files and 179,071 malware samples from diverse sources to ensure the efficiency of RMED approach.The most effective PE headers that can highly differentiate between benign and malware files were selected totrain the model on 15 PE features to speed up the classification process and achieve real-time detection formalicious executables. The evaluation results showed that RMED succeeded in shrinking the classification timeto 91 milliseconds for each file while reaching an accuracy of 98.42% with a false positive rate equal to 1.58. Inconclusion, this paper contributes to the field of cybersecurity by presenting a comprehensive framework thatleverages Artificial Intelligence (AI) methods to proactively detect and prevent cyber-attacks.
文摘针对基于深度学习的可移植执行(PE)恶意软件检测方法中,数据集存在的不平衡或不完整问题,以及神经网络结构过深或特征集庞大而导致的模型计算资源开销和耗时增加问题,提出一种基于浅层人工神经网络(SANN)的PE恶意软件静态检测模型。首先,利用LIEF(Library to Instrument Executable Formats)库创建PE特征提取器从EMBER数据集中提取PE文件样本,并提出一种特征组合,该特征集具备更少的PE文件特征,从而在减小特征空间和模型参数量的同时能够提高深度学习模型的性能;其次,生成特征向量,通过数据清洗去除未标记的样本;再次,对特征集内的不同特征值进行归一化处理;最后,将特征向量输入SANN中进行训练和测试。实验结果表明,SANN可达到95.64%的召回率和95.24%的准确率,相较于MalConv模型和LightGBM模型,SANN的准确率分别提高了1.19和1.57个百分点。SANN的总工作耗时约为用时最少的对比模型LightGBM的1/2。此外,SANN在面对未知攻击时具备较好的弹性,且仍能够保持较高的检测水平。
文摘The growing threat of malware,particularly in the Portable Executable(PE)format,demands more effective methods for detection and classification.Machine learning-based approaches exhibit their potential but often neglect semantic segmentation of malware files that can improve classification performance.This research applies deep learning to malware detection,using Convolutional Neural Network(CNN)architectures adapted to work with semantically extracted data to classify malware into malware families.Starting from the Malconv model,this study introduces modifications to adapt it to multi-classification tasks and improve its performance.It proposes a new innovative method that focuses on byte extraction from Portable Executable(PE)malware files based on their semantic location,resulting in higher accuracy in malware classification than traditional methods using full-byte sequences.This novel approach evaluates the importance of each semantic segment to improve classification accuracy.The results revealed that the header segment of PE files provides the most valuable information for malware identification,outperforming the other sections,and achieving an average classification accuracy of 99.54%.The above reaffirms the effectiveness of the semantic segmentation approach and highlights the critical role header data plays in improving malware detection and classification accuracy.