Domain Generation Algorithms(DGAs)continue to pose a significant threat inmodernmalware infrastructures by enabling resilient and evasive communication with Command and Control(C&C)servers.Traditional detection me...Domain Generation Algorithms(DGAs)continue to pose a significant threat inmodernmalware infrastructures by enabling resilient and evasive communication with Command and Control(C&C)servers.Traditional detection methods-rooted in statistical heuristics,feature engineering,and shallow machine learning-struggle to adapt to the increasing sophistication,linguistic mimicry,and adversarial variability of DGA variants.The emergence of Large Language Models(LLMs)marks a transformative shift in this landscape.Leveraging deep contextual understanding,semantic generalization,and few-shot learning capabilities,LLMs such as BERT,GPT,and T5 have shown promising results in detecting both character-based and dictionary-based DGAs,including previously unseen(zeroday)variants.This paper provides a comprehensive and critical review of LLM-driven DGA detection,introducing a structured taxonomy of LLM architectures,evaluating the linguistic and behavioral properties of benchmark datasets,and comparing recent detection frameworks across accuracy,latency,robustness,and multilingual performance.We also highlight key limitations,including challenges in adversarial resilience,model interpretability,deployment scalability,and privacy risks.To address these gaps,we present a forward-looking research roadmap encompassing adversarial training,model compression,cross-lingual benchmarking,and real-time integration with SIEM/SOAR platforms.This survey aims to serve as a foundational resource for advancing the development of scalable,explainable,and operationally viable LLM-based DGA detection systems.展开更多
On the basis of the objective functions,dithering optimization techniques can be divided into the intensity-based optimization technique and the phase-based optimization technique.However,both types of techniques are ...On the basis of the objective functions,dithering optimization techniques can be divided into the intensity-based optimization technique and the phase-based optimization technique.However,both types of techniques are spatial-domain optimization techniques,while their measurement performances are essentially determined by the harmonic components in the frequency domain.In this paper,a novel genetic optimization technique in the frequency domain is proposed for highquality fringe generation.In addition,to handle the time-consuming difficulty of genetic algorithm(GA),we first optimize a binary patch,then join the optimal binary patches together according to periodicity and symmetry so as to generate a full-size pattern.It is verified that the proposed technique can significantly enhance the measured performance and ensure the robustness to various amounts of defocusing.展开更多
Command and control(C2)servers are used by attackers to operate communications.To perform attacks,attackers usually employee the Domain Generation Algorithm(DGA),with which to confirm rendezvous points to their C2 ser...Command and control(C2)servers are used by attackers to operate communications.To perform attacks,attackers usually employee the Domain Generation Algorithm(DGA),with which to confirm rendezvous points to their C2 servers by generating various network locations.The detection of DGA domain names is one of the important technologies for command and control communication detection.Considering the randomness of the DGA domain names,recent research in DGA detection applyed machine learning methods based on features extracting and deep learning architectures to classify domain names.However,these methods are insufficient to handle wordlist-based DGA threats,which generate domain names by randomly concatenating dictionary words according to a special set of rules.In this paper,we proposed a a deep learning framework ATT-CNN-BiLSTMfor identifying and detecting DGA domains to alleviate the threat.Firstly,the Convolutional Neural Network(CNN)and bidirectional Long Short-Term Memory(BiLSTM)neural network layer was used to extract the features of the domain sequences information;secondly,the attention layer was used to allocate the corresponding weight of the extracted deep information from the domain names.Finally,the different weights of features in domain names were put into the output layer to complete the tasks of detection and classification.Our extensive experimental results demonstrate the effectiveness of the proposed model,both on regular DGA domains and DGA that hard to detect such as wordlist-based and part-wordlist-based ones.To be precise,we got a F1 score of 98.79%for the detection and macro average precision and recall of 83%for the classification task of DGA domain names.展开更多
Command and control(C2)servers are used by attackers to operate communications.To perform attacks,attackers usually employee the Domain Generation Algorithm(DGA),with which to confirm rendezvous points to their C2 ser...Command and control(C2)servers are used by attackers to operate communications.To perform attacks,attackers usually employee the Domain Generation Algorithm(DGA),with which to confirm rendezvous points to their C2 servers by generating various network locations.The detection of DGA domain names is one of the important technologies for command and control communication detection.Considering the randomness of the DGA domain names,recent research in DGA detection applyed machine learning methods based on features extracting and deep learning architectures to classify domain names.However,these methods are insufficient to handle wordlist-based DGA threats,which generate domain names by randomly concatenating dictionary words according to a special set of rules.In this paper,we proposed a a deep learning framework ATT-CNN-BiLSTMfor identifying and detecting DGA domains to alleviate the threat.Firstly,the Convolutional Neural Network(CNN)and bidirectional Long Short-Term Memory(BiLSTM)neural network layer was used to extract the features of the domain sequences information;secondly,the attention layer was used to allocate the corresponding weight of the extracted deep information from the domain names.Finally,the different weights of features in domain names were put into the output layer to complete the tasks of detection and classification.Our extensive experimental results demonstrate the effectiveness of the proposed model,both on regular DGA domains and DGA that hard to detect such as wordlist-based and part-wordlist-based ones.To be precise,we got a F1 score of 98.79% for the detection and macro average precision and recall of 83% for the classification task of DGA domain names.展开更多
基金the Deanship of Scientific Research at King Khalid University for funding this work through large group under grant number(GRP.2/663/46).
文摘Domain Generation Algorithms(DGAs)continue to pose a significant threat inmodernmalware infrastructures by enabling resilient and evasive communication with Command and Control(C&C)servers.Traditional detection methods-rooted in statistical heuristics,feature engineering,and shallow machine learning-struggle to adapt to the increasing sophistication,linguistic mimicry,and adversarial variability of DGA variants.The emergence of Large Language Models(LLMs)marks a transformative shift in this landscape.Leveraging deep contextual understanding,semantic generalization,and few-shot learning capabilities,LLMs such as BERT,GPT,and T5 have shown promising results in detecting both character-based and dictionary-based DGAs,including previously unseen(zeroday)variants.This paper provides a comprehensive and critical review of LLM-driven DGA detection,introducing a structured taxonomy of LLM architectures,evaluating the linguistic and behavioral properties of benchmark datasets,and comparing recent detection frameworks across accuracy,latency,robustness,and multilingual performance.We also highlight key limitations,including challenges in adversarial resilience,model interpretability,deployment scalability,and privacy risks.To address these gaps,we present a forward-looking research roadmap encompassing adversarial training,model compression,cross-lingual benchmarking,and real-time integration with SIEM/SOAR platforms.This survey aims to serve as a foundational resource for advancing the development of scalable,explainable,and operationally viable LLM-based DGA detection systems.
基金Project supported by the Science and Technology Major Projects of Zhejiang Province,China(Grant No.2017C31080)
文摘On the basis of the objective functions,dithering optimization techniques can be divided into the intensity-based optimization technique and the phase-based optimization technique.However,both types of techniques are spatial-domain optimization techniques,while their measurement performances are essentially determined by the harmonic components in the frequency domain.In this paper,a novel genetic optimization technique in the frequency domain is proposed for highquality fringe generation.In addition,to handle the time-consuming difficulty of genetic algorithm(GA),we first optimize a binary patch,then join the optimal binary patches together according to periodicity and symmetry so as to generate a full-size pattern.It is verified that the proposed technique can significantly enhance the measured performance and ensure the robustness to various amounts of defocusing.
基金Our research was supported by the National Key Research and Development Program of China(Grant No.2016YFB0801004)the Strategic Priority Research Program of Chinese Academy of Sciences(Grant No.XDC02030200)the National Key Research and Development Program of China(Grant No.2018YFC0824801).
文摘Command and control(C2)servers are used by attackers to operate communications.To perform attacks,attackers usually employee the Domain Generation Algorithm(DGA),with which to confirm rendezvous points to their C2 servers by generating various network locations.The detection of DGA domain names is one of the important technologies for command and control communication detection.Considering the randomness of the DGA domain names,recent research in DGA detection applyed machine learning methods based on features extracting and deep learning architectures to classify domain names.However,these methods are insufficient to handle wordlist-based DGA threats,which generate domain names by randomly concatenating dictionary words according to a special set of rules.In this paper,we proposed a a deep learning framework ATT-CNN-BiLSTMfor identifying and detecting DGA domains to alleviate the threat.Firstly,the Convolutional Neural Network(CNN)and bidirectional Long Short-Term Memory(BiLSTM)neural network layer was used to extract the features of the domain sequences information;secondly,the attention layer was used to allocate the corresponding weight of the extracted deep information from the domain names.Finally,the different weights of features in domain names were put into the output layer to complete the tasks of detection and classification.Our extensive experimental results demonstrate the effectiveness of the proposed model,both on regular DGA domains and DGA that hard to detect such as wordlist-based and part-wordlist-based ones.To be precise,we got a F1 score of 98.79%for the detection and macro average precision and recall of 83%for the classification task of DGA domain names.
基金supported by the National Key Research and Development Program of China(Grant No.2016YFB0801004)the Strategic Priority Research Program of Chinese Academy of Sciences(Grant No.XDC02030200)the National Key Research and Development Program of China(Grant No.2018YFC0824801).
文摘Command and control(C2)servers are used by attackers to operate communications.To perform attacks,attackers usually employee the Domain Generation Algorithm(DGA),with which to confirm rendezvous points to their C2 servers by generating various network locations.The detection of DGA domain names is one of the important technologies for command and control communication detection.Considering the randomness of the DGA domain names,recent research in DGA detection applyed machine learning methods based on features extracting and deep learning architectures to classify domain names.However,these methods are insufficient to handle wordlist-based DGA threats,which generate domain names by randomly concatenating dictionary words according to a special set of rules.In this paper,we proposed a a deep learning framework ATT-CNN-BiLSTMfor identifying and detecting DGA domains to alleviate the threat.Firstly,the Convolutional Neural Network(CNN)and bidirectional Long Short-Term Memory(BiLSTM)neural network layer was used to extract the features of the domain sequences information;secondly,the attention layer was used to allocate the corresponding weight of the extracted deep information from the domain names.Finally,the different weights of features in domain names were put into the output layer to complete the tasks of detection and classification.Our extensive experimental results demonstrate the effectiveness of the proposed model,both on regular DGA domains and DGA that hard to detect such as wordlist-based and part-wordlist-based ones.To be precise,we got a F1 score of 98.79% for the detection and macro average precision and recall of 83% for the classification task of DGA domain names.