Mobile phones are an essential part of modern life. The two popular mobile phone platforms, Android and iPhone Operating System (iOS), have an immense impact on the lives of millions of people. Of the two, Android currently holds more than 84% market share, so any personal data stored on it is at great risk if not properly protected. At the same time, more than a million pieces of Android malware have been reported in 2021 alone to date. Detecting and mitigating all this malware is extremely difficult for any team of human experts. For this reason, machine learning, and specifically deep learning, has recently been applied to the problem. However, deep learning models have primarily been designed for image analysis. While this line of research has shown promising results, it has been difficult to understand what the features extracted by deep learning models represent in the malware domain. Moreover, because of the translation invariance property of popular models based on the Convolutional Neural Network (CNN), the true potential of deep learning for malware analysis is yet to be realized. To resolve this issue, we envision the use of Capsule Networks (CapsNets), a state-of-the-art model in deep learning. We argue that since CapsNets are orientation-based in terms of images, they can potentially be used to capture spatial relationships between different features at different locations within a sequence of opcodes. We design a deep learning-based architecture that efficiently and effectively handles very large-scale malware datasets to detect Android malware without resorting to very deep networks. This leads to much faster detection as well as increased accuracy. We achieve a state-of-the-art F1 score of 0.987 with a false positive rate (FPR) of just 0.002 on three very large, real-world malware datasets. Our code is made available as open source and can be used to further enhance our work with minimal effort.
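For readers unfamiliar with the two metrics reported above, the following minimal sketch shows how an F1 score and a false positive rate are computed from confusion-matrix counts. The function name and the example counts are hypothetical and are not taken from the paper; only the standard metric definitions are assumed.

```python
def f1_and_fpr(tp, fp, tn, fn):
    """Return (F1, FPR) from confusion-matrix counts.

    tp: malware correctly flagged      fp: benign apps wrongly flagged
    tn: benign apps correctly passed   fn: malware missed
    """
    # Precision: fraction of flagged samples that are really malware.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: fraction of malware samples that were flagged.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    # FPR: fraction of benign samples that were wrongly flagged.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return f1, fpr


# Hypothetical counts, chosen only to illustrate the arithmetic.
f1, fpr = f1_and_fpr(tp=90, fp=10, tn=990, fn=10)
print(f"F1={f1:.3f}, FPR={fpr:.3f}")
```

A low FPR matters in this setting because benign applications usually far outnumber malicious ones, so even a small false positive rate can produce a large number of false alarms.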
Windows malware is becoming an increasingly pressing problem as the amount of malware continues to grow and more sensitive information is stored on systems. One of the major challenges in tackling this problem is the complexity of malware analysis, which requires expertise from human analysts. Recent developments in machine learning have led to the creation of deep models for malware detection. However, these models often lack transparency, making it difficult to understand the reasoning behind a model's decisions; this is known as the black-box problem. To address these limitations, this paper presents a novel model for malware detection that uses vision transformers to analyze the Operation Code (OpCode) sequences of more than 350,000 Windows portable executable malware samples from real-world datasets. The model achieves a high accuracy of 0.9864, not only surpassing previous results but also providing valuable insights into the reasoning behind the classification. Our model is able to pinpoint specific instructions that lead to malicious behavior in malware samples, aiding human experts in their analysis and driving further advancements in the field. We report our findings and show how causality can be established between malicious code and the actual classification by a deep learning model, thus opening up this black-box problem for deeper analysis.
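The abstract does not say how OpCode sequences are presented to the vision transformer. As an illustration only, the sketch below shows one hypothetical preprocessing step: mapping opcode mnemonics to integer ids and splitting the sequence into fixed-length patches, a one-dimensional analogue of the image patches a vision transformer consumes. The vocabulary, the patch length, the reserved ids, and both function names are assumptions, not the authors' method.

```python
PAD_ID = 0  # assumed id used to fill the last, incomplete patch
UNK_ID = 1  # assumed id for opcodes missing from the vocabulary


def encode_opcodes(opcodes, vocab):
    """Map opcode mnemonics to integer ids, using UNK_ID for unknown ones."""
    return [vocab.get(op, UNK_ID) for op in opcodes]


def to_patches(ids, patch_len):
    """Pad ids to a multiple of patch_len and split into equal patches."""
    remainder = len(ids) % patch_len
    if remainder:
        ids = ids + [PAD_ID] * (patch_len - remainder)
    return [ids[i:i + patch_len] for i in range(0, len(ids), patch_len)]


# Hypothetical toy vocabulary; real vocabularies are far larger.
vocab = {"mov": 2, "push": 3, "call": 4, "xor": 5, "ret": 6}
sequence = ["push", "mov", "xor", "call", "ret"]

ids = encode_opcodes(sequence, vocab)
patches = to_patches(ids, patch_len=4)
print(patches)  # [[3, 2, 5, 4], [6, 0, 0, 0]]
```

Keeping each patch tied to a contiguous run of instructions is what would let per-patch attention weights be traced back to specific instructions, which is the kind of interpretability the abstract describes.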