Abstract: A URL (Uniform Resource Locator) is used to locate a digital resource. With a URL, an attacker can perform a variety of attacks, which can lead to serious consequences for both individuals and organizations. Attackers therefore create malicious URLs to gain access to an organization's systems or sensitive information, and it is crucial to secure individuals and organizations against these malicious URLs. In this study, a combination of machine learning and deep learning was used to predict malicious URLs. The research contributes to the field of cybersecurity by proposing a model that integrates the accuracy of machine learning with the speed of deep learning. By combining Random Forest (RF) and Multilayer Perceptron (MLP), an efficient model was developed with an accuracy of 81% and a training time of 33.78 s, offering a balanced solution for robust cybersecurity.
Funding: Funded by the Deanship of Scientific Research at King Khalid University through the Large Groups Project under grant number (45/43) and by the Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2022R140), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. The authors would like to thank the Deanship of Scientific Research at Umm Al-Qura University for supporting this work by Grant Code: 22UQU4310373DSR21.
Abstract: Cybersecurity-related solutions have become familiar since they ensure security and privacy against cyberattacks in this digital era. Malicious Uniform Resource Locators (URLs) can be embedded in email or Twitter and used to lure vulnerable internet users into introducing malicious data into their systems. This may result in compromised system security, scams, and other such cyberattacks. These attacks hijack huge quantities of the available data, incurring heavy financial loss. At the same time, Machine Learning (ML) and Deep Learning (DL) models have paved the way for designing models that can accurately detect and classify malicious URLs. With this motivation, the current article develops an Artificial Fish Swarm Algorithm (AFSA) with Deep Learning Enabled Malicious URL Detection and Classification (AFSADL-MURLC) model. The presented AFSADL-MURLC model aims to differentiate malicious URLs from genuine URLs. To attain this, the AFSADL-MURLC model first carries out data preprocessing and applies a GloVe-based word embedding technique. The resulting vectors are then passed to a Gated Recurrent Unit (GRU) classifier to recognize the malicious URLs. Finally, AFSA is applied to the proposed model to enhance the efficiency of the GRU model. The proposed AFSADL-MURLC technique was experimentally validated using a benchmark dataset sourced from the Kaggle repository. The simulation results confirmed the superiority of the proposed AFSADL-MURLC model over recent approaches under distinct measures.
Funding: Supported by Korea Electric Power Corporation (Grant Number: R18XA02).
Abstract: Detecting malicious Uniform Resource Locators (URLs) is crucially important to prevent attackers from committing cybercrimes. Recent research has investigated the role of machine learning (ML) models in detecting malicious URLs. With ML algorithms, the features of URLs are first extracted, and then different ML models are trained. The limitation of this approach is that it requires manual feature engineering and does not consider the sequential patterns in the URL. Therefore, deep learning (DL) models are used to solve these issues, since they are able to perform featureless detection. Furthermore, DL models give better accuracy and generalization to newly designed URLs; however, the results of our study show that these models, like any other DL models, can be susceptible to adversarial attacks. In this paper, we examine the robustness of these models and demonstrate the importance of considering this susceptibility before applying such detection systems in real-world solutions. We propose and demonstrate a black-box attack based on scoring functions with greedy search for the minimum number of perturbations leading to a misclassification. The attack is examined against different types of convolutional neural network (CNN)-based URL classifiers, and it causes a tangible decrease in accuracy, with more than a 56% reduction in the accuracy of the best classifier (among the classifiers selected for this work). Moreover, adversarial training shows promising results in reducing the influence of the attack on the robustness of the model to less than 7% on average.
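To make the idea of a score-guided greedy search concrete, the following is a minimal sketch and not the authors' attack. It assumes a hypothetical toy scorer, `malicious_score`, standing in for the black-box classifier (the paper targets CNN-based classifiers), and a hypothetical `greedy_attack` helper that substitutes one character at a time, always taking the single substitution that lowers the score the most, until the score drops below an assumed decision threshold.

```python
import string

# Hypothetical stand-in for a black-box URL classifier: only a score is exposed.
SUSPICIOUS_TOKENS = ("login", "verify", "secure", "update")


def malicious_score(url: str) -> float:
    """Toy score in [0, 1]: fraction of suspicious tokens found in the URL."""
    hits = sum(token in url.lower() for token in SUSPICIOUS_TOKENS)
    return hits / len(SUSPICIOUS_TOKENS)


def greedy_attack(url, score_fn, threshold=0.5, max_edits=10,
                  alphabet=string.ascii_lowercase + string.digits):
    """Greedy character substitution guided only by the classifier's score.

    Each round tries every single-character substitution, keeps the one that
    lowers the score the most, and stops once the score is below `threshold`
    (the assumed decision boundary) or no substitution helps.
    """
    current, edits = url, 0
    # Leave the scheme ("http://") untouched so the result stays URL-shaped.
    start = url.index("://") + 3 if "://" in url else 0
    while score_fn(current) >= threshold and edits < max_edits:
        best_url, best_score = None, score_fn(current)
        for i in range(start, len(current)):
            if not current[i].isalnum():
                continue  # keep separators such as "/" and "." in place
            for ch in alphabet:
                if ch == current[i]:
                    continue
                candidate = current[:i] + ch + current[i + 1:]
                score = score_fn(candidate)
                if score < best_score:
                    best_url, best_score = candidate, score
        if best_url is None:
            break  # no single substitution lowers the score any further
        current, edits = best_url, edits + 1
    return current, edits


if __name__ == "__main__":
    adversarial, n_edits = greedy_attack(
        "http://example.test/login/verify/secure", malicious_score)
    print(adversarial, n_edits, malicious_score(adversarial))
```

The attacker here never inspects the model internals, only its output score, which is what makes the setting black-box. A real attack would also need to keep the perturbed URL functional, which this sketch does not check.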
Funding: Supported by a grant of the Ministry of Research, Innovation and Digitization, CNCS-UEFISCDI, Project Number PN-III-P4-PCE-2021-0334, within PNCDI III.
Abstract: The potential of text analytics is revealed by Machine Learning (ML) and Natural Language Processing (NLP) techniques. In this paper, we propose an NLP framework that is applied to multiple datasets to detect malicious Uniform Resource Locators (URLs). The proposed framework includes three categories of features, both ML and Deep Learning (DL) algorithms, and a ranking schema. To extract features from text, we apply frequency-based and prediction-based embeddings, such as the hash vectorizer, Term Frequency-Inverse Document Frequency (TF-IDF), and word2vec (continuous bag of words, skip-gram) from Google. Further, we apply more state-of-the-art methods, such as GloVe, to create vectorized features. Additionally, feature engineering that is specific to URL structure is deployed to detect scams and other threats. For framework assessment, four ranking indicators are weighted: computational time, and performance as accuracy, F1 score and type II error. For the computational time, we propose a new metric, Feature Building Time (FBT), as the cutting-edge feature builders (like doc2vec or GloVe) require more time. By applying the proposed assessment step, the skip-gram algorithm of word2vec surpasses other feature builders in performance. Additionally, eXtreme Gradient Boost (XGB) outperforms other classifiers. With this setup, we attain an accuracy of 99.5% and an F1 score of 0.99.
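As an illustration of what feature engineering specific to URL structure can look like, here is a minimal sketch using only the Python standard library. The abstract does not list the features used, so the particular features and the function name `url_features` are assumptions chosen for illustration.

```python
import re
from urllib.parse import parse_qsl, urlsplit


def url_features(url: str) -> dict:
    """Extract a few illustrative structural features from a URL.

    The feature set is hypothetical; it is not the one used in the paper.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    return {
        "length": len(url),
        "num_dots_in_host": host.count("."),
        "num_digits": sum(c.isdigit() for c in url),
        "num_path_segments": len([s for s in parts.path.split("/") if s]),
        # A raw IPv4 address in place of a domain name is a common warning sign.
        "has_ip_host": bool(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host)),
        "has_at_symbol": "@" in url,
        "uses_https": parts.scheme == "https",
        "num_query_params": len(parse_qsl(parts.query)),
    }


if __name__ == "__main__":
    for name, value in url_features(
            "http://192.168.0.1/login/update?id=7&x=1").items():
        print(f"{name}: {value}")
```

Features of this kind can be concatenated with embedding-based vectors before being passed to a classifier.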