With the development of anti-virus technology, malicious documents have gradually become the main pathway of Advanced Persistent Threat (APT) attacks, so the development of effective malicious document classifiers has become particularly urgent. Current detection methods based on document structure and behavioral features face challenges in feature engineering: they have limited accuracy, consume substantial resources, and can usually detect only documents in specific formats, so they lack versatility and adaptability. To address these problems, this paper proposes a novel malicious document detection method that visualizes documents as GGE (Grayscale, Grayscale matrix, Entropy) images. The GGE method visualizes the original byte sequence of the document as a grayscale image and the information entropy sequence of the document as an entropy image; at the same time, the gray-level co-occurrence matrix, together with the texture and spatial information stored in it, is converted into a grayscale matrix image. The three types of images are then fused to obtain the GGE color image. The Convolutional Block Attention Module-EfficientNet-B0 (CBAM-EfficientNet-B0) model is then used for classification, combined with transfer learning: a model pre-trained on the ImageNet dataset is applied to the feature extraction process of GGE images. The experimental results show that the GGE method outperforms other methods and is suitable for detecting malicious documents in different formats. It achieves accuracies of 99.44% and 97.39% on Portable Document Format (PDF) and Office datasets, respectively, and consumes less time during detection, so it can be effectively applied to real-time malicious document detection.
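The first two GGE channels described above (bytes as grayscale pixels, and a per-block entropy sequence) can be illustrated with a minimal sketch. The function names, the image width, and the block size are assumptions made for illustration and are not taken from the paper; the co-occurrence-matrix channel and the fusion step are not covered.

```python
from __future__ import annotations

import math
from collections import Counter


def bytes_to_grayscale_rows(data: bytes, width: int = 16) -> list[list[int]]:
    """Arrange raw bytes into rows of `width` pixels (values 0-255).

    The last row is padded with zero bytes; `width` is an illustrative choice.
    """
    padded = data + bytes(-len(data) % width)
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


def block_entropy(data: bytes, block_size: int = 256) -> list[float]:
    """Shannon entropy (bits per byte) of each consecutive block of the input."""
    entropies = []
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        total = len(block)
        counts = Counter(block)
        # H = -sum(p * log2(p)) over the byte values that occur in the block
        h = -sum((c / total) * math.log2(c / total) for c in counts.values())
        entropies.append(h)
    return entropies
```

A block of identical bytes has entropy 0, while a block containing every byte value exactly once reaches the maximum of 8 bits per byte, which is why packed or encrypted regions tend to stand out in an entropy image.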
With the widespread use of SMS (Short Message Service), the proliferation of malicious SMS has emerged as a pressing societal issue. While deep learning-based text classifiers offer promise, they often exhibit suboptimal performance in fine-grained detection tasks, primarily due to imbalanced datasets and insufficient model representation capabilities. To address this challenge, this paper proposes an LLMs-enhanced graph fusion dual-stream Transformer model for fine-grained Chinese malicious SMS detection. During the data processing stage, Large Language Models (LLMs) are employed for data augmentation, mitigating dataset imbalance. In the data input stage, both word-level and character-level features are utilized as model inputs, enhancing the richness of features and preventing information loss. A dual-stream Transformer serves as the backbone network in the representation learning stage, complemented by a graph-based feature fusion mechanism. At the output stage, both supervised classification cross-entropy loss and supervised contrastive learning loss are used as multi-task optimization objectives, further enhancing the model’s feature representation. Experimental results demonstrate that the proposed method significantly outperforms baselines on a publicly available Chinese malicious SMS dataset.
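One common way to write a joint objective of this kind is sketched below. The abstract does not give the exact form, so the weighting factor \lambda, the temperature \tau, the batch normalization by N, and the use of the widely used supervised contrastive formulation are all assumptions. Here p_{i,y_i} is the predicted probability of the true class of sample i, z_i is its normalized embedding, and P(i) is the set of other samples in the batch that share its label.

```latex
\mathcal{L} = \mathcal{L}_{\mathrm{CE}} + \lambda\,\mathcal{L}_{\mathrm{SCL}},
\qquad
\mathcal{L}_{\mathrm{CE}} = -\frac{1}{N}\sum_{i=1}^{N} \log p_{i,y_i},
\qquad
\mathcal{L}_{\mathrm{SCL}} = \frac{1}{N}\sum_{i=1}^{N} \frac{-1}{|P(i)|}
  \sum_{p \in P(i)} \log
  \frac{\exp(z_i \cdot z_p / \tau)}{\sum_{a \neq i} \exp(z_i \cdot z_a / \tau)}
```

The cross-entropy term drives classification, while the contrastive term pulls embeddings of same-class messages together and pushes other classes apart, which is plausibly useful for rare fine-grained classes.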
The increasingly complex and interconnected train control information network is vulnerable to a variety of malicious traffic attacks, and existing malicious traffic detection methods, which rely mainly on machine learning, suffer from problems such as poor robustness, weak generalization, and an inability to learn common features. Therefore, this paper proposes a malicious traffic identification method based on stacked sparse denoising autoencoders combined with a regularized extreme learning machine optimized by particle swarm optimization. First, a simulation environment of the Chinese Train Control System-3 was constructed for data acquisition. Then the Pearson coefficient and other methods are used for pre-processing, a stacked sparse denoising autoencoder is used to achieve nonlinear dimensionality reduction of the features, and finally a regularized extreme learning machine optimized by particle swarm optimization is used for classification. Experimental data show that the proposed method has good training performance, with an average accuracy of 97.57% and a false negative rate of 2.43%, which is better than the alternative methods. In addition, ablation experiments were performed to evaluate the contribution of each component, and the results showed that the combination of methods was superior to the individual methods. To further evaluate the generalization ability of the model in different scenarios, publicly available datasets of industrial control system networks were used. The results show that the model has robust detection capability across various types of network attacks.
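The abstract does not say how the Pearson coefficient is used in pre-processing. One common use, shown here purely as an assumed illustration, is to drop features that are almost perfectly correlated with a feature already kept. The function names and the 0.95 threshold are hypothetical.

```python
from __future__ import annotations

import math


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if std_x == 0 or std_y == 0:
        return 0.0  # a constant column carries no linear relationship
    return cov / (std_x * std_y)


def drop_redundant(features: dict[str, list[float]], threshold: float = 0.95) -> list[str]:
    """Keep a feature only if |r| with every already kept feature is below threshold."""
    kept: list[str] = []
    for name, column in features.items():
        if all(abs(pearson(column, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept
```

With columns `a = [1, 2, 3, 4]`, `b = [2, 4, 6, 8]`, and `c = [1, 0, 1, 0]`, the filter keeps `a` and `c` and drops `b`, because `b` is a linear rescaling of `a`.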
The proliferation of internet traffic encryption has become a double-edged sword. While it significantly enhances user privacy, it also inadvertently shields cyber-attacks from detection, presenting a formidable challenge to cybersecurity. Traditional machine learning and deep learning techniques often fall short in identifying encrypted malicious traffic due to their inability to fully extract and utilize the implicit relational and positional information embedded within data packets. This limitation has led to an unresolved challenge in the cybersecurity community: how to effectively extract valuable insights from the complex patterns of traffic packet transmission. Consequently, this paper introduces the TB-Graph model, an encrypted malicious traffic classification model based on a relational graph attention network. The model builds a heterogeneous traffic burst graph that embeds side-channel features, which are unaffected by encryption, into the graph nodes and connects them with three different types of burst edges. Subsequently, we design a relational positional encoding that prevents the loss of temporal relationships between the original traffic flows during graph transformation. Ultimately, TB-Graph leverages the powerful graph representation learning capabilities of the Relational Graph Attention Network (RGAT) to extract latent behavioral features from the burst graph nodes and edge relationships. Experimental results show that TB-Graph outperforms various state-of-the-art methods in fine-grained encrypted malicious traffic classification tasks on two public datasets, indicating its enhanced capability for identifying encrypted malicious traffic.
With the growth of the Internet of Things (IoT) comes a flood of malicious traffic in the IoT, intensifying the challenges of network security. Traditional models operate with independent layers, limiting their effectiveness in addressing these challenges. To address this issue, we propose a cross-layer cooperative Feature Subset-Based Malicious Traffic Detection (FSMMTD) model for detecting malicious traffic. Our approach begins by applying an enhanced random forest method to adaptively filter and retain highly discriminative first-layer features. These processed features are then input into an improved state-space model that integrates the strengths of recurrent neural networks (RNNs) and transformers, enabling superior processing of complex patterns and global information. This integration allows the FSMMTD model to enhance its capability in identifying intricate data relationships and capturing comprehensive contextual insights. The FSMMTD model monitors IoT data flows in real time, efficiently detecting anomalies and enabling rapid response to potential intrusions. We validate our approach using the publicly available ToN_IoT dataset for IoT traffic analysis. Experimental results demonstrate that our method achieves superior performance with an accuracy of 98.37%, precision of 96.28%, recall of 95.36%, and F1-score of 96.79%. These metrics indicate that the FSMMTD model outperforms existing methods in detecting malicious traffic, showcasing its effectiveness and reliability in enhancing IoT network security.
PowerShell has been widely deployed in fileless malware and advanced persistent threat (APT) attacks due to its high stealthiness and living-off-the-land techniques. However, existing works mainly focus on deobfuscation and malicious script detection, and lack classification of malicious PowerShell families and behavior analysis. Moreover, the state-of-the-art methods fail to capture fine-grained features and semantic relationships, resulting in low robustness and accuracy. To this end, we propose PowerDetector, a novel malicious PowerShell script detector based on multimodal semantic fusion and deep learning. Specifically, we design four feature extraction methods to extract key features from characters, tokens, the abstract syntax tree (AST), and a semantic knowledge graph. Then, we design four embeddings (i.e., Char2Vec, Token2Vec, AST2Vec, and Rela2Vec) and construct a multi-modal fusion algorithm to concatenate feature vectors from different views. Finally, we propose a combined model based on a transformer and CNN-BiLSTM to implement PowerShell family detection. Our experiments with five types of PowerShell attacks show that PowerDetector can accurately detect various obfuscated and stealthy PowerShell scripts, with a precision of 0.9402, a recall of 0.9358, and an F1-score of 0.9374. Furthermore, through single-modal and multi-modal comparison experiments, we demonstrate that PowerDetector’s multi-modal embedding and deep learning model can achieve better accuracy and even identify more unknown attacks.
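As a reminder of how the three reported metrics relate, the F1-score of a single class is the harmonic mean of its precision and recall. The helper below is a minimal sketch with an assumed function name. Applied to the aggregate precision and recall above it gives about 0.9380, close to but not identical to the reported 0.9374, which would be consistent with the reported F1 being averaged per class rather than computed from the aggregate values (the abstract does not say which).

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Because the harmonic mean never exceeds the larger of its two inputs, an F1 value computed this way always lies between the precision and the recall.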
In the upcoming large-scale Internet of Things (IoT), it is increasingly challenging to defend against malicious traffic, due to the heterogeneity of IoT devices and the diversity of IoT communication protocols. In this paper, we propose a semi-supervised learning-based approach to detect malicious traffic at the access side. It overcomes the resource-bottleneck problem of traditional malicious traffic defenders, which are deployed at the victim side, and requires little labeled traffic data for model training. Specifically, we design a coarse-grained behavior model of IoT devices by self-supervised learning with unlabeled traffic data. Then, we fine-tune this model to improve its accuracy in malicious traffic detection by adopting a transfer learning method that uses a small amount of labeled data. Experimental results show that our method can achieve an accuracy of 99.52% and an F1-score of 99.52% with only 1% of the labeled training data on the CICDDoS2019 dataset. Moreover, our method outperforms the state-of-the-art supervised learning-based methods in terms of accuracy, precision, recall and F1-score with 1% of the training data.
The limited amount of labeled sample data in the field of advanced security threat detection seriously restricts the development of research work. Learning sample labels from labeled and unlabeled data has received a lot of research attention, and various universal labeling methods have been proposed. However, the task of labeling malicious communication samples targeted at advanced threats faces two practical challenges: the difficulty of extracting effective features in advance and the complexity of the actual sample types. To address these problems, we propose a sample labeling method for malicious communication based on a semi-supervised deep neural network. This method supports continuous learning and optimization of the feature representation while labeling samples, and can handle uncertain samples that fall outside the sample types of interest. According to the experimental results, our proposed deep neural network can automatically learn an effective feature representation, and the validity of the learned features is close to or even higher than that of features extracted based on expert knowledge. Furthermore, our proposed method achieves a labeling accuracy of 97.64%~98.50%, which is more accurate than the train-then-detect, kNN and LPA methods under any labeled-sample proportion. Insufficient labeled samples are a problem in many network attack detection scenarios, and our work can serve as a reference for sample labeling tasks in similar real-world scenarios.
Background: In recent years, blockchain technology has attracted considerable attention. It records cryptographic transactions in a public ledger that is difficult to alter and compromise because of the distributed consensus. As a result, blockchain is believed to resist fraud and hacking. Results: This work explores the types of fraud and malicious activities that can be prevented by blockchain technology and identifies attacks to which blockchain remains vulnerable. Conclusions: This study recommends appropriate defensive measures and calls for further research into the techniques for fighting malicious activities related to blockchains.
While encryption technology safeguards the security of network communications, malicious traffic also uses encryption protocols to obscure its malicious behavior. To address the reliance of traditional machine learning methods on expert experience and the insufficient representation capabilities of existing deep learning methods for encrypted malicious traffic, we propose an encrypted malicious traffic classification method that integrates global semantic features with local spatiotemporal features, called BERT-based Spatio-Temporal Features Network (BSTFNet). At the packet-level granularity, the model captures the global semantic features of packets through the attention mechanism of the Bidirectional Encoder Representations from Transformers (BERT) model. At the byte-level granularity, we first employ the Bidirectional Gated Recurrent Unit (BiGRU) model to extract temporal features from bytes, and then use the Text Convolutional Neural Network (TextCNN) model with multi-sized convolution kernels to extract local multi-receptive field spatial features. The fusion of features from both granularities serves as the final multidimensional representation of malicious traffic. Our approach achieves an accuracy and F1-score of 99.39% and 99.40%, respectively, on the publicly available USTC-TFC2016 dataset, and effectively reduces sample confusion within the Neris and Virut categories. The experimental results demonstrate that our method has outstanding representation and classification capabilities for encrypted malicious traffic.
Spam is no longer just unsolicited commercial email that wastes our time; it also consumes network traffic and mail servers’ storage. Furthermore, spam has become a major component of several attack vectors, including phishing, cross-site scripting, cross-site request forgery and malware infection. Statistics show that the amount of spam containing malicious content has increased compared to spam advertising legitimate products and services. In this paper, the issue of spam detection is investigated with the aim of developing an efficient method to identify spam email based on analysis of the content of email messages. We identify a feature set that includes a considerable number of malicious-content-related features. Our goal is to study the effect of these features in helping classical classifiers identify spam emails. To make the problem more challenging, we developed spam classification models based on imbalanced data in which spam emails form the rare class, with only 16.5% of the total emails. Different metrics were utilized in the evaluation of the developed models. Results show a noticeable improvement in spam classification models when they are trained on a dataset that includes malicious-content-related features.
This paper introduces the background, illustrates the hardware structure and software features of a malicious base station, explains its working principle, presents a method of detecting malicious base stations, and analyzes the experiment and evaluates the experimental results to verify the reliability of this method. Finally, it proposes future work.
We study the detailed malicious code propagation process in scale-free networks with link weights that denote the traffic between two nodes. It is found that the propagation velocity reaches a peak rapidly and then decays in a power-law form, which is different from the well-known result for unweighted networks. Simulation results show that nodes with larger strength are preferentially infected, but hierarchical dynamics are not clearly found. The simulation results also show that a larger dispersion of the network weights leads to slower propagation, which indicates that malicious code propagates more quickly in unweighted scale-free networks than in weighted scale-free networks under the same conditions. These results show that not only the topology of networks but also the link weights affect the malicious code propagation process.
Malicious traffic detection over the internet is one of the challenging areas for researchers seeking to protect network infrastructures from malicious activity. Several shortcomings of a network system can be leveraged by an attacker to gain unauthorized access through malicious traffic. Safeguarding against such attacks requires an efficient automatic system that can detect malicious traffic in a timely manner and avoid system damage. Currently, many automated systems can detect malicious activity; however, their efficacy and accuracy need further improvement to detect malicious traffic from multi-domain systems. The present study focuses on the detection of malicious traffic with high accuracy using machine learning techniques. The proposed approach used two datasets, UNSW-NB15 and IoTID20, which contain data for local network traffic and IoT-based traffic, respectively. Both datasets were combined to increase the capability of the proposed approach in detecting malicious traffic from local and IoT networks with high accuracy. Horizontally merging the two datasets requires an equal number of features, which was achieved by reducing the feature count to 30 for each dataset using principal component analysis (PCA). The proposed model incorporates a stacked ensemble model, extra boosting forest (EBF), which combines tree-based models (an extra tree classifier, a gradient boosting classifier, and a random forest) using a stacked ensemble approach. Empirical results show that EBF performed significantly better and achieved the highest accuracy scores of 0.985 and 0.984 on the multi-domain dataset for two and four classes, respectively.
The continuous boom of information technology has spurred the development of a variety of communication network, multimedia, social network and Internet of Things applications. However, users inevitably suffer from the intrusion of malicious users. Some studies focus on static characteristics of malicious users, which are easily bypassed by camouflaged malicious users. In this paper, we present a malicious user detection method based on ensemble feature selection and adversarial training. Firstly, feature selection alleviates the curse of dimensionality and achieves more accurate classification performance. Secondly, we embed features into a multidimensional space and aggregate them into a feature map to encode the explicit content preference and implicit interaction preference. Thirdly, we use an effective ensemble learning method that avoids over-fitting and has good noise resistance. Finally, we propose a data-driven neural network detection model that uses adversarial training as a regularization technique to analyze the characteristics in depth. It simplifies the parameters and obtains more robust interaction features and pattern features. We demonstrate the effectiveness of our approach with numerical simulation results for malicious user detection, where robustness issues are notable concerns.
Wireless sensor networks are often used to monitor physical and environmental conditions in various regions where human access is limited. Due to limited resources and deployment in hostile environments, they are vulnerable to faults and malicious attacks. Affected or compromised sensor nodes can send erroneous data or misleading reports to the base station. Hence, identifying malicious and faulty nodes in an accurate and timely manner is important for the reliable functioning of the networks. In this paper, we present a malicious and malfunctioning node detection scheme using dual-weighted trust evaluation in a hierarchical sensor network. Malicious nodes are effectively detected in the presence of natural faults and noise without sacrificing fault-free nodes. Simulation results show that the proposed scheme outperforms some existing schemes in terms of mis-detection rate and event detection accuracy, while maintaining comparable performance in malicious node detection rate and false alarm rate.
Wireless sensor networks are extremely vulnerable to various security threats. Intrusion detection methods based on game theory can effectively balance the detection rate and energy consumption of the system. Accurate analysis of the attack behavior of malicious sensor nodes can help to configure the intrusion detection system, reduce unnecessary system consumption and improve detection efficiency. However, the assumption of complete rationality in the traditional game model causes the established model to be inconsistent with actual attack and defense scenarios. In order to formulate a reasonable and effective intrusion detection strategy, we introduce evolutionary game theory to establish an attack evolution game model based on optimal response dynamics, and then analyze the attack behavior of malicious sensor nodes. Theoretical analysis and simulation results show that the evolution trend of attacks is closely related to the number of malicious sensors in the network and the initial state of the strategy, and that the attacker can set the initial strategy so that all malicious sensor nodes will eventually launch attacks. Our work is of great significance in guiding the development of defense strategies for intrusion detection systems.
The primary function of wireless sensor networks is to gather sensor data from the monitored area. Due to faults or malicious nodes, however, the sensor data collected or reported might be wrong. Hence it is important to detect events in the presence of wrong sensor readings and misleading reports. In this paper, we present a neighbor-based malicious node detection scheme for wireless sensor networks. Malicious nodes are modeled as faulty nodes behaving intelligently to lead to an incorrect decision or energy depletion without being easily detected. Each sensor node makes a decision on the fault status of itself and its neighboring nodes based on the sensor readings. Most erroneous readings due to transient faults are corrected by filtering, while nodes with permanent faults are removed using confidence-level evaluation, to improve the malicious node detection rate and event detection accuracy. Each node maintains confidence levels for itself and its neighbors, indicating their track records in reporting past events correctly. Computer simulation shows that most of the malicious nodes reporting against their own readings are correctly detected unless they behave similarly to normal nodes. As a result, high event detection accuracy is also maintained while achieving a low false alarm rate.
In this paper, we present a malicious node detection scheme using confidence-level evaluation in a grid-based wireless sensor network. The sensor field is divided into square grids, where the sensor nodes in each grid form a cluster with a cluster head. Each cluster head maintains the confidence levels of its member nodes based on their readings and reflects them in decision-making. Two thresholds are used to distinguish between events and false alarms due to malicious nodes. In addition, the center of an event region is estimated, if necessary, to enhance the event and malicious node detection accuracy. Experimental results show that the scheme can achieve high malicious node detection accuracy without sacrificing normal sensor nodes.
In this paper, we propose a new online system that can quickly detect malicious spam emails and adapt to changes in the email contents and in the Uniform Resource Locator (URL) links leading to malicious websites by updating the system daily. We introduce an autonomous function for a server to generate training examples, in which double-bounce emails are automatically collected and their class labels are given by SPIKE, a crawler-type software that analyzes website maliciousness. In general, since spammers use botnets to spread numerous malicious emails within a short time, such distributed spam emails often have the same or similar contents. Therefore, it is not necessary to learn all spam emails. To adapt to new malicious campaigns quickly, only new types of spam emails should be selected for learning, and this can be realized by introducing an active learning scheme into a classifier model. For this purpose, we adopt the Resource Allocating Network with Locality Sensitive Hashing (RAN-LSH) as a classifier model with a data selection function. In RAN-LSH, spam emails that are the same as or similar to ones already learned are quickly looked up in a hash table using Locality Sensitive Hashing (LSH), and matched emails that fall in a “well-learned” region are discarded without being used as training data. To analyze email contents, we adopt the Bag of Words (BoW) approach and generate feature vectors whose attributes are transformed based on the normalized term frequency-inverse document frequency (TF-IDF). We use a dataset of double-bounce spam emails collected at the National Institute of Information and Communications Technology (NICT) in Japan from March 1st, 2013 until May 10th, 2013 to evaluate the performance of the proposed system. The results confirm that the proposed spam email detection system detects malicious spam emails with a high detection rate.
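The Bag of Words and normalized TF-IDF step mentioned above can be sketched as follows. This is a generic textbook formulation, not the authors' implementation: the function name, the plain logarithmic IDF, and the L2 normalization are assumptions, and tokenization is left to the caller.

```python
from __future__ import annotations

import math
from collections import Counter


def tfidf_vectors(docs: list[list[str]]) -> list[dict[str, float]]:
    """Return one L2-normalized TF-IDF vector (term -> weight) per tokenized document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs at least once
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        term_counts = Counter(doc)
        weights = {
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in term_counts.items()
        }
        norm = math.sqrt(sum(w * w for w in weights.values()))
        if norm > 0:
            weights = {term: w / norm for term, w in weights.items()}
        vectors.append(weights)
    return vectors
```

A term that appears in every email gets weight 0 under this IDF, so near-duplicate spam from the same campaign ends up with nearly identical vectors, which is what makes a similarity lookup such as LSH effective for skipping already learned messages.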
Funding (GGE malicious document detection paper): This work was supported by the Natural Science Foundation of Henan Province (Grant No. 242300420297), awarded to Yi Sun.
Funding: Supported by the Fundamental Research Funds for the Central Universities (2024JKF13) and the Beijing Municipal Education Commission General Program of Science and Technology (No. KM202414019003).
Abstract: With the widespread use of SMS (Short Message Service), the proliferation of malicious SMS has emerged as a pressing societal issue. While deep learning-based text classifiers offer promise, they often exhibit suboptimal performance in fine-grained detection tasks, primarily due to imbalanced datasets and insufficient model representation capabilities. To address this challenge, this paper proposes an LLMs-enhanced graph fusion dual-stream Transformer model for fine-grained Chinese malicious SMS detection. During the data processing stage, Large Language Models (LLMs) are employed for data augmentation, mitigating dataset imbalance. In the data input stage, both word-level and character-level features are used as model inputs, enriching the features and preventing information loss. A dual-stream Transformer serves as the backbone network in the representation learning stage, complemented by a graph-based feature fusion mechanism. At the output stage, both a supervised classification cross-entropy loss and a supervised contrastive learning loss are used as multi-task optimization objectives, further enhancing the model's feature representation. Experimental results demonstrate that the proposed method significantly outperforms baselines on a publicly available Chinese malicious SMS dataset.
Abstract: The increasingly complex and interconnected train control information network is vulnerable to a variety of malicious traffic attacks. Existing malicious traffic detection methods mainly rely on machine learning and suffer from problems such as poor robustness, weak generalization, and a lack of ability to learn common features. Therefore, this paper proposes a malicious traffic identification method based on stacked sparse denoising autoencoders combined with a regularized extreme learning machine optimized by particle swarm optimization. First, a simulation environment of the Chinese Train Control System-3 was constructed for data acquisition. Then, the Pearson coefficient and other methods were used for pre-processing, a stacked sparse denoising autoencoder was used to achieve nonlinear dimensionality reduction of the features, and finally a regularized extreme learning machine optimized by particle swarm optimization was used for classification. Experimental data show that the proposed method has good training performance, with an average accuracy of 97.57% and a false negative rate of 2.43%, which is better than the alternative methods. In addition, ablation experiments were performed to evaluate the contribution of each component, and the results showed that the combination of methods was superior to the individual methods. To further evaluate the generalization ability of the model in different scenarios, publicly available datasets of industrial control system networks were used. The results show that the model has robust detection capability across various types of network attacks.
Abstract: The proliferation of internet traffic encryption has become a double-edged sword. While it significantly enhances user privacy, it also inadvertently shields cyber-attacks from detection, presenting a formidable challenge to cybersecurity. Traditional machine learning and deep learning techniques often fall short in identifying encrypted malicious traffic because they cannot fully extract and utilize the implicit relational and positional information embedded within data packets. This limitation has led to an unresolved challenge in the cybersecurity community: how to effectively extract valuable insights from the complex patterns of traffic packet transmission. Consequently, this paper introduces the TB-Graph model, an encrypted malicious traffic classification model based on a relational graph attention network. The model constructs a heterogeneous traffic burst graph that embeds side-channel features, which are unaffected by encryption, into the graph nodes and connects them with three different types of burst edges. Subsequently, we design a relational positional coding that prevents the loss of temporal relationships between the original traffic flows during graph transformation. Ultimately, TB-Graph leverages the graph representation learning capabilities of the Relational Graph Attention Network (RGAT) to extract latent behavioral features from the burst graph nodes and edge relationships. Experimental results show that TB-Graph outperforms various state-of-the-art methods in fine-grained encrypted malicious traffic classification tasks on two public datasets, indicating its enhanced capability for identifying encrypted malicious traffic.
Funding: Funded by the National Natural Science Foundation of China, grant numbers 61876189, 61703426, and 61273275.
Abstract: With the growth of the Internet of Things (IoT) comes a flood of malicious traffic in the IoT, intensifying the challenges of network security. Traditional models operate with independent layers, limiting their effectiveness in addressing these challenges. To address this issue, we propose a cross-layer cooperative Feature Subset-Based Malicious Traffic Detection (FSMMTD) model for detecting malicious traffic. Our approach begins by applying an enhanced random forest method to adaptively filter and retain highly discriminative first-layer features. These processed features are then input into an improved state-space model that integrates the strengths of recurrent neural networks (RNNs) and transformers, enabling superior processing of complex patterns and global information. This integration allows the FSMMTD model to better identify intricate data relationships and capture comprehensive contextual insights. The FSMMTD model monitors IoT data flows in real time, efficiently detecting anomalies and enabling rapid response to potential intrusions. We validate our approach using the publicly available ToN_IoT dataset for IoT traffic analysis. Experimental results demonstrate that our method achieves superior performance with an accuracy of 98.37%, precision of 96.28%, recall of 95.36%, and F1-score of 96.79%. These metrics indicate that the FSMMTD model outperforms existing methods in detecting malicious traffic, showcasing its effectiveness and reliability in enhancing IoT network security.
Funding: This work was supported by the National Natural Science Foundation of China (No. 62172308, No. U1626107, No. 61972297, No. 62172144, and No. 62062019).
Abstract: PowerShell has been widely deployed in fileless malware and advanced persistent threat (APT) attacks due to its high stealthiness and living-off-the-land technique. However, existing works mainly focus on deobfuscation and malicious detection, and lack classification and behavior analysis of malicious PowerShell families. Moreover, the state-of-the-art methods fail to capture fine-grained features and semantic relationships, resulting in low robustness and accuracy. To this end, we propose PowerDetector, a novel malicious PowerShell script detector based on multi-modal semantic fusion and deep learning. Specifically, we design four feature extraction methods to extract key features from the character, token, abstract syntax tree (AST), and semantic knowledge graph views. Then, we design four embeddings (i.e., Char2Vec, Token2Vec, AST2Vec, and Rela2Vec) and construct a multi-modal fusion algorithm to concatenate feature vectors from the different views. Finally, we propose a combined model based on a transformer and CNN-BiLSTM to implement PowerShell family detection. Our experiments with five types of PowerShell attacks show that PowerDetector can accurately detect various obfuscated and stealthy PowerShell scripts, with a precision of 0.9402, a recall of 0.9358, and an F1-score of 0.9374. Furthermore, through single-modal and multi-modal comparison experiments, we demonstrate that PowerDetector's multi-modal embedding and deep learning model can achieve better accuracy and even identify more unknown attacks.
Funding: Supported in part by the National Key R&D Program of China under Grant 2018YFA0701601, in part by the National Natural Science Foundation of China (Grant No. U22A2002, 61941104, 62201605), and in part by the Tsinghua University-China Mobile Communications Group Co., Ltd. Joint Institute.
Abstract: In the upcoming large-scale Internet of Things (IoT), it is increasingly challenging to defend against malicious traffic due to the heterogeneity of IoT devices and the diversity of IoT communication protocols. In this paper, we propose a semi-supervised learning-based approach to detect malicious traffic at the access side. It overcomes the resource-bottleneck problem of traditional malicious traffic defenders, which are deployed at the victim side, and is also free of labeled traffic data in model training. Specifically, we design a coarse-grained behavior model of IoT devices by self-supervised learning with unlabeled traffic data. Then, we fine-tune this model to improve its accuracy in malicious traffic detection by adopting a transfer learning method using a small amount of labeled data. Experimental results show that our method can achieve an accuracy of 99.52% and an F1-score of 99.52% with only 1% of the labeled training data based on the CICDDoS2019 dataset. Moreover, our method outperforms the state-of-the-art supervised learning-based methods in terms of accuracy, precision, recall, and F1-score with 1% of the training data.
Funding: Partially funded by the National Natural Science Foundation of China (Grant No. 61272447), the National Entrepreneurship & Innovation Demonstration Base of China (Grant No. C700011), and the Key Research & Development Project of Sichuan Province of China (Grant No. 2018G20100).
Abstract: The limited amount of labeled sample data in the field of advanced security threat detection seriously restricts the development of research work. Learning sample labels from labeled and unlabeled data has received a lot of research attention, and various universal labeling methods have been proposed. However, the task of labeling malicious communication samples targeted at advanced threats faces two practical challenges: the difficulty of extracting effective features in advance and the complexity of the actual sample types. To address these problems, we propose a sample labeling method for malicious communication based on a semi-supervised deep neural network. This method supports continuous learning and optimization of the feature representation while labeling samples, and can handle uncertain samples that fall outside the sample types of interest. According to the experimental results, the proposed deep neural network can automatically learn an effective feature representation, and the validity of the learned features is close to or even higher than that of features extracted based on expert knowledge. Furthermore, the proposed method achieves a labeling accuracy of 97.64% to 98.50%, which is more accurate than the train-then-detect, kNN, and LPA methods under any labeled-sample proportion. The problem of insufficient labeled samples exists in many network attack detection scenarios, and our work can serve as a reference for sample labeling tasks in similar real-world scenarios.
Abstract: Background: In recent years, blockchain technology has attracted considerable attention. It records cryptographic transactions in a public ledger that is difficult to alter and compromise because of the distributed consensus. As a result, blockchain is believed to resist fraud and hacking. Results: This work explores the types of fraud and malicious activities that can be prevented by blockchain technology and identifies attacks to which blockchain remains vulnerable. Conclusions: This study recommends appropriate defensive measures and calls for further research into techniques for fighting malicious activities related to blockchains.
Funding: This research was funded by the National Natural Science Foundation of China under Grant No. 61806171; the Sichuan University of Science & Engineering Talent Project under Grant No. 2021RC15; the Open Fund Project of Key Laboratory for Non-Destructive Testing and Engineering Computer of Sichuan Province Universities on Bridge Inspection and Engineering under Grant No. 2022QYJ06; the Sichuan University of Science & Engineering Graduate Student Innovation Fund under Grant No. Y2023115; and the Scientific Research and Innovation Team Program of Sichuan University of Science and Technology under Grant No. SUSE652A006.
Abstract: While encryption technology safeguards the security of network communications, malicious traffic also uses encryption protocols to obscure its malicious behavior. To address the reliance of traditional machine learning methods on expert experience and the insufficient representation capabilities of existing deep learning methods for encrypted malicious traffic, we propose an encrypted malicious traffic classification method that integrates global semantic features with local spatiotemporal features, called the BERT-based Spatio-Temporal Features Network (BSTFNet). At the packet-level granularity, the model captures the global semantic features of packets through the attention mechanism of the Bidirectional Encoder Representations from Transformers (BERT) model. At the byte-level granularity, we first employ the Bidirectional Gated Recurrent Unit (BiGRU) model to extract temporal features from bytes, and then use the Text Convolutional Neural Network (TextCNN) model with multi-sized convolution kernels to extract local multi-receptive-field spatial features. The fusion of features from both granularities serves as the final multidimensional representation of malicious traffic. Our approach achieves an accuracy and F1-score of 99.39% and 99.40%, respectively, on the publicly available USTC-TFC2016 dataset, and effectively reduces sample confusion within the Neris and Virut categories. The experimental results demonstrate that our method has outstanding representation and classification capabilities for encrypted malicious traffic.
Abstract: Spam is no longer just unsolicited commercial email that wastes our time; it also consumes network traffic and mail servers' storage. Furthermore, spam has become a major component of several attack vectors, including phishing, cross-site scripting, cross-site request forgery, and malware infection. Statistics show that the amount of spam containing malicious content has increased compared to spam advertising legitimate products and services. In this paper, the issue of spam detection is investigated with the aim of developing an efficient method to identify spam email based on the analysis of the content of email messages. We identify a feature set that includes a considerable number of malicious-related features. Our goal is to study the effect of these features in helping classical classifiers identify spam emails. To make the problem more challenging, we developed spam classification models based on imbalanced data, where spam emails form the rare class with only 16.5% of the total emails. Different metrics were used in the evaluation of the developed models. Results show a noticeable improvement in spam classification models when they are trained on a dataset that includes malicious-related features.
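The reason several metrics are needed on imbalanced data can be shown with a small sketch. The 165/835 split below is a hypothetical data set that only mirrors the 16.5% spam share mentioned in the abstract; the function name and the trivial "always legitimate" classifier are assumptions for illustration and are not taken from the paper.

```python
def precision_recall_f1(y_true: list[int], y_pred: list[int]) -> tuple[float, float, float]:
    """Precision, recall and F1 for the positive class (spam = 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical data set: 165 spam (1) and 835 legitimate (0) emails.
y_true = [1] * 165 + [0] * 835
# A trivial classifier that labels every email as legitimate.
always_ham = [0] * 1000

accuracy = sum(t == p for t, p in zip(y_true, always_ham)) / len(y_true)
spam_metrics = precision_recall_f1(y_true, always_ham)
```

Here the trivial classifier reaches 83.5% accuracy while its recall on the spam class is zero, which is why accuracy alone says little when the class of interest is rare.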
Abstract: This paper introduces the background, illustrates the hardware structure and software features of a malicious base station, explains its working principle, presents a method for detecting malicious base stations, and analyzes the experiment and evaluates the experimental results to verify the reliability of the method. Finally, future work is proposed.
Funding: Supported by the National Natural Science Foundation of China (90204012, 60573036) and the Natural Science Foundation of Hebei Province (F2006000177).
Abstract: We study in detail the malicious code propagation process in scale-free networks with link weights that denote the traffic between two nodes. It is found that the propagation velocity reaches a peak rapidly and then decays in a power-law form, which differs from the well-known result for unweighted networks. Simulation results show that nodes with larger strength are preferentially infected, but hierarchical dynamics are not clearly observed. The simulation results also show that a larger dispersion of the network weights leads to slower propagation, which indicates that malicious code propagates more quickly in unweighted scale-free networks than in weighted scale-free networks under the same conditions. These results show that not only the topology of the network but also the link weights affect the malicious code propagation process.
Abstract: Malicious traffic detection over the internet is one of the challenging areas for researchers seeking to protect network infrastructures from malicious activity. Several shortcomings of a network system can be leveraged by an attacker to gain unauthorized access through malicious traffic. Safeguarding against such attacks requires an efficient automatic system that can detect malicious traffic in a timely manner and avoid system damage. Currently, many automated systems can detect malicious activity; however, their efficacy and accuracy need further improvement to detect malicious traffic from multi-domain systems. The present study focuses on detecting malicious traffic with high accuracy using machine learning techniques. The proposed approach uses two datasets, UNSW-NB15 and IoTID20, which contain data for local network traffic and IoT-based traffic, respectively. Both datasets were combined to increase the capability of the proposed approach in detecting malicious traffic from local and IoT networks with high accuracy. Horizontally merging the datasets requires an equal number of features, which was achieved by reducing the feature count to 30 for each dataset using principal component analysis (PCA). The proposed model is a stacked ensemble, the extra boosting forest (EBF), which combines tree-based models, namely an extra tree classifier, a gradient boosting classifier, and a random forest, using a stacked ensemble approach. Empirical results show that EBF performed significantly better and achieved the highest accuracy scores of 0.985 and 0.984 on the multi-domain dataset for two and four classes, respectively.
Funding: Supported in part by projects of the National Natural Science Foundation of China under Grant 61772406 and Grant 61941105, and in part by projects of the Fundamental Research Funds for the Central Universities and the Innovation Fund of Xidian University under Grant 500120109215456.
Abstract: The continuous boom of information technology has driven the development of a variety of communication networks, multimedia, social networks, and Internet of Things applications. However, users inevitably suffer from the intrusion of malicious users. Some studies focus on static characteristics of malicious users, which are easily bypassed by camouflaged malicious users. In this paper, we present a malicious user detection method based on ensemble feature selection and adversarial training. First, the feature selection alleviates the curse of dimensionality and achieves more accurate classification performance. Second, we embed features into a multidimensional space and aggregate them into a feature map to encode the explicit content preference and the implicit interaction preference. Third, we use an effective ensemble learning method that avoids over-fitting and has good noise resistance. Finally, we propose a data-driven neural network detection model with adversarial training as a regularization technique to analyze the characteristics in depth. It simplifies the parameters and obtains more robust interaction features and pattern features. We demonstrate the effectiveness of our approach with numerical simulation results for malicious user detection, where robustness issues are notable concerns.
Abstract: Wireless sensor networks are often used to monitor physical and environmental conditions in regions where human access is limited. Due to limited resources and deployment in hostile environments, they are vulnerable to faults and malicious attacks. Affected or compromised sensor nodes can send erroneous data or misleading reports to the base station. Hence, identifying malicious and faulty nodes in an accurate and timely manner is important for the reliable functioning of the network. In this paper, we present a malicious and malfunctioning node detection scheme using dual-weighted trust evaluation in a hierarchical sensor network. Malicious nodes are effectively detected in the presence of natural faults and noise without sacrificing fault-free nodes. Simulation results show that the proposed scheme outperforms some existing schemes in terms of mis-detection rate and event detection accuracy, while maintaining comparable performance in malicious node detection rate and false alarm rate.
Funding: National Natural Science Foundation of China (No. 61163009).
Abstract: Wireless sensor networks are extremely vulnerable to various security threats. Intrusion detection methods based on game theory can effectively balance the detection rate and energy consumption of the system. Accurate analysis of the attack behavior of malicious sensor nodes can help to configure the intrusion detection system, reduce unnecessary system consumption, and improve detection efficiency. However, the assumption of complete rationality in the traditional game model makes the established model inconsistent with actual attack and defense scenarios. In order to formulate a reasonable and effective intrusion detection strategy, we introduce evolutionary game theory to establish an attack evolution game model based on optimal response dynamics, and then analyze the attack behavior of malicious sensor nodes. Theoretical analysis and simulation results show that the evolution trend of attacks is closely related to the number of malicious sensors in the network and the initial state of the strategy, and that the attacker can set the initial strategy so that all malicious sensor nodes will eventually launch attacks. Our work is of great significance for guiding the development of defense strategies for intrusion detection systems.
Abstract: The primary function of wireless sensor networks is to gather sensor data from the monitored area. Due to faults or malicious nodes, however, the sensor data collected or reported might be wrong. Hence, it is important to detect events in the presence of wrong sensor readings and misleading reports. In this paper, we present a neighbor-based malicious node detection scheme for wireless sensor networks. Malicious nodes are modeled as faulty nodes behaving intelligently to cause an incorrect decision or energy depletion without being easily detected. Each sensor node makes a decision on the fault status of itself and its neighboring nodes based on the sensor readings. Most erroneous readings due to transient faults are corrected by filtering, while nodes with permanent faults are removed using confidence-level evaluation, to improve the malicious node detection rate and event detection accuracy. Each node maintains confidence levels for itself and its neighbors, indicating their track records in reporting past events correctly. Computer simulation shows that most of the malicious nodes reporting against their own readings are correctly detected unless they behave similarly to normal nodes. As a result, high event detection accuracy is maintained while achieving a low false alarm rate.
Abstract: In this paper, we present a malicious node detection scheme using confidence-level evaluation in a grid-based wireless sensor network. The sensor field is divided into square grids, where the sensor nodes in each grid form a cluster with a cluster head. Each cluster head maintains the confidence levels of its member nodes based on their readings and reflects them in decision-making. Two thresholds are used to distinguish between false alarms due to malicious nodes and events. In addition, the center of an event region is estimated, if necessary, to enhance the event and malicious node detection accuracy. Experimental results show that the scheme can achieve high malicious node detection accuracy without sacrificing normal sensor nodes.
Abstract: In this paper, we propose a new online system that can quickly detect malicious spam emails and, by being updated daily, adapt to changes in email contents and in the Uniform Resource Locator (URL) links leading to malicious websites. We introduce an autonomous function for a server to generate training examples, in which double-bounce emails are automatically collected and their class labels are assigned by SPIKE, a crawler-type software that analyzes website maliciousness. In general, since spammers use botnets to spread numerous malicious emails within a short time, such distributed spam emails often have the same or similar contents. Therefore, it is not necessary to learn all spam emails. To adapt to new malicious campaigns quickly, only new types of spam emails should be selected for learning, and this can be realized by introducing an active learning scheme into a classifier model. For this purpose, we adopt the Resource Allocating Network with Locality Sensitive Hashing (RAN-LSH) as a classifier model with a data selection function. In RAN-LSH, spam emails that are the same as or similar to those already learned are quickly looked up in a hash table built with Locality Sensitive Hashing (LSH), and matched emails located in "well-learned" regions are discarded without being used as training data. To analyze email contents, we adopt the Bag of Words (BoW) approach and generate feature vectors whose attributes are transformed based on the normalized term frequency-inverse document frequency (TF-IDF). We use a dataset of double-bounce spam emails collected at the National Institute of Information and Communications Technology (NICT) in Japan from March 1st, 2013 until May 10th, 2013 to evaluate the performance of the proposed system. The results confirm that the proposed spam email detection system is capable of detecting spam emails with a high detection rate.
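The Bag of Words and normalized TF-IDF step mentioned in the abstract above can be sketched as follows. The abstract does not give the exact weighting formula, so this sketch assumes relative term frequency, an inverse document frequency of log(N / df), and L2 normalization; the function name is hypothetical, and the RAN-LSH classifier and hash-table lookup are not shown.

```python
import math
from collections import Counter


def tfidf_vectors(docs: list[list[str]]) -> list[dict[str, float]]:
    """Return L2-normalized TF-IDF vectors for already tokenized documents.

    Assumed weighting: tf = count / document length, idf = log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        weights = {
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
        norm = math.sqrt(sum(w * w for w in weights.values()))
        vectors.append({t: (w / norm if norm else 0.0) for t, w in weights.items()})
    return vectors


# Two toy "emails"; the shared word "now" carries no discriminative weight.
toy_docs = [["free", "money", "now"], ["meeting", "now"]]
toy_vectors = tfidf_vectors(toy_docs)
```

Terms that occur in every document receive a weight of zero under this weighting, so near-duplicate spam emails end up with near-identical vectors, which is the property a similarity lookup relies on.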