In the evolving landscape of cyber threats,phishing attacks pose significant challenges,particularly through deceptive webpages designed to extract sensitive information under the guise of legitimacy.Conventional and ...In the evolving landscape of cyber threats,phishing attacks pose significant challenges,particularly through deceptive webpages designed to extract sensitive information under the guise of legitimacy.Conventional and machine learning(ML)-based detection systems struggle to detect phishing websites owing to their constantly changing tactics.Furthermore,newer phishing websites exhibit subtle and expertly concealed indicators that are not readily detectable.Hence,effective detection depends on identifying the most critical features.Traditional feature selection(FS)methods often struggle to enhance ML model performance and instead decrease it.To combat these issues,we propose an innovative method using explainable AI(XAI)to enhance FS in ML models and improve the identification of phishing websites.Specifically,we employ SHapley Additive exPlanations(SHAP)for global perspective and aggregated local interpretable model-agnostic explanations(LIME)to deter-mine specific localized patterns.The proposed SHAP and LIME-aggregated FS(SLA-FS)framework pinpoints the most informative features,enabling more precise,swift,and adaptable phishing detection.Applying this approach to an up-to-date web phishing dataset,we evaluate the performance of three ML models before and after FS to assess their effectiveness.Our findings reveal that random forest(RF),with an accuracy of 97.41%and XGBoost(XGB)at 97.21%significantly benefit from the SLA-FS framework,while k-nearest neighbors lags.Our framework increases the accuracy of RF and XGB by 0.65%and 0.41%,respectively,outperforming traditional filter or wrapper methods and any prior methods evaluated on this dataset,showcasing its potential.展开更多
In the digital age, phishing attacks have been a persistent security threat leveraged by traditional password management systems that are not able to verify the authenticity of websites. This paper presents an approac...In the digital age, phishing attacks have been a persistent security threat leveraged by traditional password management systems that are not able to verify the authenticity of websites. This paper presents an approach to embedding sophisticated phishing detection within a password manager’s framework, called PhishGuard. PhishGuard uses a Large Language Model (LLM), specifically a fine-tuned BERT algorithm that works in real time, where URLs fed by the user in the credentials are analyzed and authenticated. This approach enhances user security with its provision of real-time protection from phishing attempts. Through rigorous testing, this paper illustrates how PhishGuard has scored well in tests that measure accuracy, precision, recall, and false positive rates.展开更多
Onemust interact with a specific webpage or website in order to use the Internet for communication,teamwork,and other productive activities.However,because phishing websites look benign and not all website visitors ha...Onemust interact with a specific webpage or website in order to use the Internet for communication,teamwork,and other productive activities.However,because phishing websites look benign and not all website visitors have the same knowledge and skills to inspect the trustworthiness of visited websites,they are tricked into disclosing sensitive information and making them vulnerable to malicious software attacks like ransomware.It is impossible to stop attackers fromcreating phishingwebsites,which is one of the core challenges in combating them.However,this threat can be alleviated by detecting a specific website as phishing and alerting online users to take the necessary precautions before handing over sensitive information.In this study,five machine learning(ML)and DL algorithms—cat-boost(CATB),gradient boost(GB),random forest(RF),multilayer perceptron(MLP),and deep neural network(DNN)—were tested with three different reputable datasets and two useful feature selection techniques,to assess the scalability and consistency of each classifier’s performance on varied dataset sizes.The experimental findings reveal that the CATB classifier achieved the best accuracy across all datasets(DS-1,DS-2,and DS-3)with respective values of 97.9%,95.73%,and 98.83%.The GB classifier achieved the second-best accuracy across all datasets(DS-1,DS-2,and DS-3)with respective values of 97.16%,95.18%,and 98.58%.MLP achieved the best computational time across all datasets(DS-1,DS-2,and DS-3)with respective values of 2,7,and 3 seconds despite scoring the lowest accuracy across all datasets.展开更多
Phishing,an Internet fraudwhere individuals are deceived into revealing critical personal and account information,poses a significant risk to both consumers and web-based institutions.Data indicates a persistent rise ...Phishing,an Internet fraudwhere individuals are deceived into revealing critical personal and account information,poses a significant risk to both consumers and web-based institutions.Data indicates a persistent rise in phishing attacks.Moreover,these fraudulent schemes are progressively becoming more intricate,thereby rendering them more challenging to identify.Hence,it is imperative to utilize sophisticated algorithms to address this issue.Machine learning is a highly effective approach for identifying and uncovering these harmful behaviors.Machine learning(ML)approaches can identify common characteristics in most phishing assaults.In this paper,we propose an ensemble approach and compare it with six machine learning techniques to determine the type of website and whether it is normal or not based on two phishing datasets.After that,we used the normalization technique on the dataset to transform the range of all the features into the same range.The findings of this paper for all algorithms are as follows in the first dataset based on accuracy,precision,recall,and F1-score,respectively:Decision Tree(DT)(0.964,0.961,0.976,0.968),Random Forest(RF)(0.970,0.964,0.984,0.974),Gradient Boosting(GB)(0.960,0.959,0.971,0.965),XGBoost(XGB)(0.973,0.976,0.976,0.976),AdaBoost(0.934,0.934,0.950,0.942),Multi Layer Perceptron(MLP)(0.970,0.971,0.976,0.974)and Voting(0.978,0.975,0.987,0.981).So,the Voting classifier gave the best results.While in the second dataset,all the algorithms gave the same results in four evaluation metrics,which indicates that each of them can effectively accomplish the prediction process.Also,this approach outperformed the previous work in detecting phishing websites with high accuracy,a lower false negative rate,a shorter prediction time,and a lower false positive rate.展开更多
Phishing attacks present a serious threat to enterprise systems,requiring advanced detection techniques to protect sensitive data.This study introduces a phishing email detection framework that combines Bidirectional ...Phishing attacks present a serious threat to enterprise systems,requiring advanced detection techniques to protect sensitive data.This study introduces a phishing email detection framework that combines Bidirectional Encoder Representations from Transformers(BERT)for feature extraction and CNN for classification,specifically designed for enterprise information systems.BERT’s linguistic capabilities are used to extract key features from email content,which are then processed by a convolutional neural network(CNN)model optimized for phishing detection.Achieving an accuracy of 97.5%,our proposed model demonstrates strong proficiency in identifying phishing emails.This approach represents a significant advancement in applying deep learning to cybersecurity,setting a new benchmark for email security by effectively addressing the increasing complexity of phishing attacks.展开更多
Spear Phishing Attacks(SPAs)pose a significant threat to the healthcare sector,resulting in data breaches,financial losses,and compromised patient confidentiality.Traditional defenses,such as firewalls and antivirus s...Spear Phishing Attacks(SPAs)pose a significant threat to the healthcare sector,resulting in data breaches,financial losses,and compromised patient confidentiality.Traditional defenses,such as firewalls and antivirus software,often fail to counter these sophisticated attacks,which target human vulnerabilities.To strengthen defenses,healthcare organizations are increasingly adopting Machine Learning(ML)techniques.ML-based SPA defenses use advanced algorithms to analyze various features,including email content,sender behavior,and attachments,to detect potential threats.This capability enables proactive security measures that address risks in real-time.The interpretability of ML models fosters trust and allows security teams to continuously refine these algorithms as new attack methods emerge.Implementing ML techniques requires integrating diverse data sources,such as electronic health records,email logs,and incident reports,which enhance the algorithms’learning environment.Feedback from end-users further improves model performance.Among tested models,the hierarchical models,Convolutional Neural Network(CNN)achieved the highest accuracy at 99.99%,followed closely by the sequential Bidirectional Long Short-Term Memory(BiLSTM)model at 99.94%.In contrast,the traditional Multi-Layer Perceptron(MLP)model showed an accuracy of 98.46%.This difference underscores the superior performance of advanced sequential and hierarchical models in detecting SPAs compared to traditional approaches.展开更多
Phishing attacks are more than two-decade-old attacks that attackers use to steal passwords related to financial services.After the first reported incident in 1995,its impact keeps on increasing.Also,during COVID-19,d...Phishing attacks are more than two-decade-old attacks that attackers use to steal passwords related to financial services.After the first reported incident in 1995,its impact keeps on increasing.Also,during COVID-19,due to the increase in digitization,there is an exponential increase in the number of victims of phishing attacks.Many deep learning and machine learning techniques are available to detect phishing attacks.However,most of the techniques did not use efficient optimization techniques.In this context,our proposed model used random forest-based techniques to select the best features,and then the Brown-Bear optimization algorithm(BBOA)was used to fine-tune the hyper-parameters of the convolutional neural network(CNN)model.To test our model,we used a dataset from Kaggle comprising 11,000+websites.In addition to that,the dataset also consists of the 30 features that are extracted from the website uniform resource locator(URL).The target variable has two classes:“Safe”and“Phishing.”Due to the use of BBOA,our proposed model detects malicious URLs with an accuracy of 93%and a precision of 92%.In addition,comparing our model with standard techniques,such as GRU(Gated Recurrent Unit),LSTM(Long Short-Term Memory),RNN(Recurrent Neural Network),ANN(Artificial Neural Network),SVM(Support Vector Machine),and LR(Logistic Regression),presents the effectiveness of our proposed model.Also,the comparison with past literature showcases the contribution and novelty of our proposed model.展开更多
Cybercriminals often use fraudulent emails and fictitious email accounts to deceive individuals into disclosing confidential information,a practice known as phishing.This study utilizes three distinct methodologies,Te...Cybercriminals often use fraudulent emails and fictitious email accounts to deceive individuals into disclosing confidential information,a practice known as phishing.This study utilizes three distinct methodologies,Term Frequency-Inverse Document Frequency,Word2Vec,and Bidirectional Encoder Representations from Transform-ers,to evaluate the effectiveness of various machine learning algorithms in detecting phishing attacks.The study uses feature extraction methods to assess the performance of Logistic Regression,Decision Tree,Random Forest,and Multilayer Perceptron algorithms.The best results for each classifier using Term Frequency-Inverse Document Frequency were Multilayer Perceptron(Precision:0.98,Recall:0.98,F1-score:0.98,Accuracy:0.98).Word2Vec’s best results were Multilayer Perceptron(Precision:0.98,Recall:0.98,F1-score:0.98,Accuracy:0.98).The highest performance was achieved using the Bidirectional Encoder Representations from the Transformers model,with Precision,Recall,F1-score,and Accuracy all reaching 0.99.This study highlights how advanced pre-trained models,such as Bidirectional Encoder Representations from Transformers,can significantly enhance the accuracy and reliability of fraud detection systems.展开更多
Phishing attacks present a persistent and evolving threat in the cybersecurity land-scape,necessitating the development of more sophisticated detection methods.Traditional machine learning approaches to phishing detec...Phishing attacks present a persistent and evolving threat in the cybersecurity land-scape,necessitating the development of more sophisticated detection methods.Traditional machine learning approaches to phishing detection have relied heavily on feature engineering and have often fallen short in adapting to the dynamically changing patterns of phishingUniformResource Locator(URLs).Addressing these challenge,we introduce a framework that integrates the sequential data processing strengths of a Recurrent Neural Network(RNN)with the hyperparameter optimization prowess of theWhale Optimization Algorithm(WOA).Ourmodel capitalizes on an extensive Kaggle dataset,featuring over 11,000 URLs,each delineated by 30 attributes.The WOA’s hyperparameter optimization enhances the RNN’s performance,evidenced by a meticulous validation process.The results,encapsulated in precision,recall,and F1-score metrics,surpass baseline models,achieving an overall accuracy of 92%.This study not only demonstrates the RNN’s proficiency in learning complex patterns but also underscores the WOA’s effectiveness in refining machine learning models for the critical task of phishing detection.展开更多
The two-factor authentication mechanism is gaining popularity as more people are becoming aware of the need to secure their identities. In the current form, existing 2FA systems are defenseless against phishing attack...The two-factor authentication mechanism is gaining popularity as more people are becoming aware of the need to secure their identities. In the current form, existing 2FA systems are defenseless against phishing attacks. They do not provide any visual indicator to the user to check the website’s validity before logging in during phishing attacks. This exposes the user’s password during the phishing attack. Two-factor authentication needs to be enhanced to provide a mechanism to detect phishing attacks without adding a significant burden on the user. This research paper will propose a novel 2-FA TOTP mechanism to provide a subconscious indicator during a phishing attack. In comparison, the new proposed novel approach provides better security against phishing attack. Lastly, the mathematical analysis is performed to understand the TOTP variance and validate the security considerations against the existing 2FA systems with respect to adversary attack.展开更多
Phishing is one of the most common threats on the Internet. Traditionally, detection methods have relied on blacklists and heuristic rules, but these approaches are showing their limitations in the face of rapidly evo...Phishing is one of the most common threats on the Internet. Traditionally, detection methods have relied on blacklists and heuristic rules, but these approaches are showing their limitations in the face of rapidly evolving attack techniques. Artificial Intelligence (AI) offers promising solutions for improving phishing detection, prediction and prevention. In our study, we analyzed three supervised machine learning classifiers and one deep learning classifier for detecting and predicting phishing websites: Naive Bayes, Decision Tree, Gradient Boosting and Multi-Layer Perceptron. The results showed that the Gradient Boosting Classifier performed best, with a precision of 96.2%, a F1-score of 96.6%, recall and precision of 99.9% in all classes, and a mean absolute error (MAE) of just 0.002. Closely followed by the Gradient Boosting Classifier with a precision of 96.2% and a score of 96.6%. In contrast, Naive Bayes and the Decision Tree showed a lower accuracy rate. These results underline the importance of high accuracy in these models to reduce the risk associated with malicious attachments and reinforce security measures in this area of research.展开更多
Phishing emails have experienced a rapid surge in cyber threats globally,especially following the emergence of the COVID-19 pandemic.This form of attack has led to substantial financial losses for numerous organizatio...Phishing emails have experienced a rapid surge in cyber threats globally,especially following the emergence of the COVID-19 pandemic.This form of attack has led to substantial financial losses for numerous organizations.Although variousmodels have been constructed to differentiate legitimate emails fromphishing attempts,attackers continuously employ novel strategies to manipulate their targets into falling victim to their schemes.This form of attack has led to substantial financial losses for numerous organizations.While efforts are ongoing to create phishing detection models,their current level of accuracy and speed in identifying phishing emails is less than satisfactory.Additionally,there has been a concerning rise in the frequency of phished emails recently.Consequently,there is a pressing need for more efficient and high-performing phishing detection models to mitigate the adverse impact of such fraudulent messages.In the context of this research,a comprehensive analysis is conducted on both components of an email message—namely,the email header and body.Sentence-level characteristics are extracted and leveraged in the construction of a new phishing detection model.This model utilizes K Nearest Neighbor(KNN)introducing the novel dimension of sentence-level analysis.Established datasets fromKagglewere employed to train and validate the model.The evaluation of this model’s effectiveness relies on key performance metrics including accuracy of 0.97,precision,recall,and F1-measure.展开更多
In order to effectively detect malicious phishing behaviors, a phishing detection method based on the uniform resource locator (URL) features is proposed. First, the method compares the phishing URLs with legal ones...In order to effectively detect malicious phishing behaviors, a phishing detection method based on the uniform resource locator (URL) features is proposed. First, the method compares the phishing URLs with legal ones to extract the features of phishing URLs. Then a machine learning algorithm is applied to obtain the URL classification model from the sample data set training. In order to adapt to the change of a phishing URL, the classification model should be constantly updated according to the new samples. So, an incremental learning algorithm based on the feedback of the original sample data set is designed. The experiments verify that the combination of the URL features extracted in this paper and the support vector machine (SVM) classification algorithm can achieve a high phishing detection accuracy, and the incremental learning algorithm is also effective.展开更多
A phishing detection system, which comprises client-side filtering plug-in, analysis center and protected sites, is proposed. An image-based similarity detection algorithm is conceived to calculate the similarity of t...A phishing detection system, which comprises client-side filtering plug-in, analysis center and protected sites, is proposed. An image-based similarity detection algorithm is conceived to calculate the similarity of two web pages. The web pages are first converted into images, and then divided into sub-images with iterated dividing and shrinking. After that, the attributes of sub-images including color histograms, gray histograms and size parameters are computed to construct the attributed relational graph(ARG)of each page. In order to match two ARGs, the inner earth mover's distances(EMD)between every two nodes coming from each ARG respectively are first computed, and then the similarity of web pages by the outer EMD between two ARGs is worked out to detect phishing web pages. The experimental results show that the proposed architecture and algorithm has good robustness along with scalability, and can effectively detect phishing.展开更多
Cyber Attacks are critical and destructive to all industry sectors.They affect social engineering by allowing unapproved access to a Personal Computer(PC)that breaks the corrupted system and threatens humans.The defen...Cyber Attacks are critical and destructive to all industry sectors.They affect social engineering by allowing unapproved access to a Personal Computer(PC)that breaks the corrupted system and threatens humans.The defense of security requires understanding the nature of Cyber Attacks,so prevention becomes easy and accurate by acquiring sufficient knowledge about various features of Cyber Attacks.Cyber-Security proposes appropriate actions that can handle and block attacks.A phishing attack is one of the cybercrimes in which users follow a link to illegal websites that will persuade them to divulge their private information.One of the online security challenges is the enormous number of daily transactions done via phishing sites.As Cyber-Security have a priority for all organizations,Cyber-Security risks are considered part of an organization’s risk management process.This paper presents a survey of different modern machine-learning approaches that handle phishing problems and detect with high-quality accuracy different phishing attacks.A dataset consisting of more than 11000 websites from the Kaggle dataset was utilized and studying the effect of 30 website features and the resulting class label indicating whether or not it is a phishing website(1 or−1).Furthermore,we determined the confusion matrices of Machine Learning models:Neural Networks(NN),Na飗e Bayes,and Adaboost,and the results indicated that the accuracies achieved were 90.23%,92.97%,and 95.43%,respectively.展开更多
The data in the cloud is protected by various mechanisms to ensure security aspects and user’s privacy.But,deceptive attacks like phishing might obtain the user’s data and use it for malicious purposes.In Spite of m...The data in the cloud is protected by various mechanisms to ensure security aspects and user’s privacy.But,deceptive attacks like phishing might obtain the user’s data and use it for malicious purposes.In Spite of much techno-logical advancement,phishing acts as thefirst step in a series of attacks.With technological advancements,availability and access to the phishing kits has improved drastically,thus making it an ideal tool for the hackers to execute the attacks.The phishing cases indicate use of foreign characters to disguise the ori-ginal Uniform Resource Locator(URL),typosquatting the popular domain names,using reserved characters for re directions and multi-chain phishing.Such phishing URLs can be stored as a part of the document and uploaded in the cloud,providing a nudge to hackers in cloud storage.The cloud servers are becoming the trusted tool for executing these attacks.The prevailing software for blacklisting phishing URLs lacks the security for multi-level phishing and expects security from the client’s end(browser).At the same time,the avalanche effect and immut-ability of block-chain proves to be a strong source of security.Considering these trends in technology,a block-chain basedfiltering implementation for preserving the integrity of user data stored in the cloud is proposed.The proposed Phish Block detects the homographic phishing URLs with accuracy of 91%which assures the security in cloud storage.展开更多
The rapid evolution in mobile devices and communication technology has increased the number of mobile device users dramatically. The mobile device has replaced many other devices and is used to perform many tasks rang...The rapid evolution in mobile devices and communication technology has increased the number of mobile device users dramatically. The mobile device has replaced many other devices and is used to perform many tasks ranging from establishing a phone call to performing critical and sensitive tasks like money payments. Since the mobile device is accompanying a person most of his time, it is highly probably that it includes personal and sensitive data for that person. The increased use of mobile devices in daily life made mobile systems an excellent target for attacks. One of the most important attacks is phishing attack in which an attacker tries to get the credential of the victim and impersonate him. In this paper, analysis of different types of phishing attacks on mobile devices is provided. Mitigation techniques—anti-phishing techniques—are also analyzed. Assessment of each technique and a summary of its advantages and disadvantages is provided. At the end, important steps to guard against phishing attacks are provided. The aim of the work is to put phishing attacks on mobile systems in light, and to make people aware of these attacks and how to avoid them.展开更多
Phishing attacks pose a significant security threat by masquerading as trustworthy entities to steal sensitive information,a problem that persists despite user awareness.This study addresses the pressing issue of phis...Phishing attacks pose a significant security threat by masquerading as trustworthy entities to steal sensitive information,a problem that persists despite user awareness.This study addresses the pressing issue of phishing attacks on websites and assesses the performance of three prominent Machine Learning(ML)models—Artificial Neural Networks(ANN),Convolutional Neural Networks(CNN),and Long Short-Term Memory(LSTM)—utilizing authentic datasets sourced from Kaggle and Mendeley repositories.Extensive experimentation and analysis reveal that the CNN model achieves a better accuracy of 98%.On the other hand,LSTM shows the lowest accuracy of 96%.These findings underscore the potential of ML techniques in enhancing phishing detection systems and bolstering cybersecurity measures against evolving phishing tactics,offering a promising avenue for safeguarding sensitive information and online security.展开更多
文摘In the evolving landscape of cyber threats,phishing attacks pose significant challenges,particularly through deceptive webpages designed to extract sensitive information under the guise of legitimacy.Conventional and machine learning(ML)-based detection systems struggle to detect phishing websites owing to their constantly changing tactics.Furthermore,newer phishing websites exhibit subtle and expertly concealed indicators that are not readily detectable.Hence,effective detection depends on identifying the most critical features.Traditional feature selection(FS)methods often struggle to enhance ML model performance and instead decrease it.To combat these issues,we propose an innovative method using explainable AI(XAI)to enhance FS in ML models and improve the identification of phishing websites.Specifically,we employ SHapley Additive exPlanations(SHAP)for global perspective and aggregated local interpretable model-agnostic explanations(LIME)to deter-mine specific localized patterns.The proposed SHAP and LIME-aggregated FS(SLA-FS)framework pinpoints the most informative features,enabling more precise,swift,and adaptable phishing detection.Applying this approach to an up-to-date web phishing dataset,we evaluate the performance of three ML models before and after FS to assess their effectiveness.Our findings reveal that random forest(RF),with an accuracy of 97.41%and XGBoost(XGB)at 97.21%significantly benefit from the SLA-FS framework,while k-nearest neighbors lags.Our framework increases the accuracy of RF and XGB by 0.65%and 0.41%,respectively,outperforming traditional filter or wrapper methods and any prior methods evaluated on this dataset,showcasing its potential.
文摘In the digital age, phishing attacks have been a persistent security threat leveraged by traditional password management systems that are not able to verify the authenticity of websites. This paper presents an approach to embedding sophisticated phishing detection within a password manager’s framework, called PhishGuard. PhishGuard uses a Large Language Model (LLM), specifically a fine-tuned BERT algorithm that works in real time, where URLs fed by the user in the credentials are analyzed and authenticated. This approach enhances user security with its provision of real-time protection from phishing attempts. Through rigorous testing, this paper illustrates how PhishGuard has scored well in tests that measure accuracy, precision, recall, and false positive rates.
文摘Onemust interact with a specific webpage or website in order to use the Internet for communication,teamwork,and other productive activities.However,because phishing websites look benign and not all website visitors have the same knowledge and skills to inspect the trustworthiness of visited websites,they are tricked into disclosing sensitive information and making them vulnerable to malicious software attacks like ransomware.It is impossible to stop attackers fromcreating phishingwebsites,which is one of the core challenges in combating them.However,this threat can be alleviated by detecting a specific website as phishing and alerting online users to take the necessary precautions before handing over sensitive information.In this study,five machine learning(ML)and DL algorithms—cat-boost(CATB),gradient boost(GB),random forest(RF),multilayer perceptron(MLP),and deep neural network(DNN)—were tested with three different reputable datasets and two useful feature selection techniques,to assess the scalability and consistency of each classifier’s performance on varied dataset sizes.The experimental findings reveal that the CATB classifier achieved the best accuracy across all datasets(DS-1,DS-2,and DS-3)with respective values of 97.9%,95.73%,and 98.83%.The GB classifier achieved the second-best accuracy across all datasets(DS-1,DS-2,and DS-3)with respective values of 97.16%,95.18%,and 98.58%.MLP achieved the best computational time across all datasets(DS-1,DS-2,and DS-3)with respective values of 2,7,and 3 seconds despite scoring the lowest accuracy across all datasets.
基金funding from Deanship of Scientific Research in King Faisal University with Grant Number KFU 241085.
文摘Phishing,an Internet fraudwhere individuals are deceived into revealing critical personal and account information,poses a significant risk to both consumers and web-based institutions.Data indicates a persistent rise in phishing attacks.Moreover,these fraudulent schemes are progressively becoming more intricate,thereby rendering them more challenging to identify.Hence,it is imperative to utilize sophisticated algorithms to address this issue.Machine learning is a highly effective approach for identifying and uncovering these harmful behaviors.Machine learning(ML)approaches can identify common characteristics in most phishing assaults.In this paper,we propose an ensemble approach and compare it with six machine learning techniques to determine the type of website and whether it is normal or not based on two phishing datasets.After that,we used the normalization technique on the dataset to transform the range of all the features into the same range.The findings of this paper for all algorithms are as follows in the first dataset based on accuracy,precision,recall,and F1-score,respectively:Decision Tree(DT)(0.964,0.961,0.976,0.968),Random Forest(RF)(0.970,0.964,0.984,0.974),Gradient Boosting(GB)(0.960,0.959,0.971,0.965),XGBoost(XGB)(0.973,0.976,0.976,0.976),AdaBoost(0.934,0.934,0.950,0.942),Multi Layer Perceptron(MLP)(0.970,0.971,0.976,0.974)and Voting(0.978,0.975,0.987,0.981).So,the Voting classifier gave the best results.While in the second dataset,all the algorithms gave the same results in four evaluation metrics,which indicates that each of them can effectively accomplish the prediction process.Also,this approach outperformed the previous work in detecting phishing websites with high accuracy,a lower false negative rate,a shorter prediction time,and a lower false positive rate.
基金supported by a grant from Hong Kong Metropolitan University (RD/2023/2.3)supported Princess Nourah bint Abdulrah-man University Researchers Supporting Project number (PNURSP2024R 343)+1 种基金Princess Nourah bint Abdulrahman University,Riyadh,Saudi Arabiathe Deanship of Scientific Research at Northern Border University,Arar,Kingdom of Saudi Arabia for funding this research work through the project number“NBU-FFR-2024-1092-09”.
文摘Phishing attacks present a serious threat to enterprise systems,requiring advanced detection techniques to protect sensitive data.This study introduces a phishing email detection framework that combines Bidirectional Encoder Representations from Transformers(BERT)for feature extraction and CNN for classification,specifically designed for enterprise information systems.BERT’s linguistic capabilities are used to extract key features from email content,which are then processed by a convolutional neural network(CNN)model optimized for phishing detection.Achieving an accuracy of 97.5%,our proposed model demonstrates strong proficiency in identifying phishing emails.This approach represents a significant advancement in applying deep learning to cybersecurity,setting a new benchmark for email security by effectively addressing the increasing complexity of phishing attacks.
基金funded by the Deanship of Graduate Studies and Scientific Research at Jouf University under Grant Number(DGSSR-2023-02-02513).
文摘Spear Phishing Attacks(SPAs)pose a significant threat to the healthcare sector,resulting in data breaches,financial losses,and compromised patient confidentiality.Traditional defenses,such as firewalls and antivirus software,often fail to counter these sophisticated attacks,which target human vulnerabilities.To strengthen defenses,healthcare organizations are increasingly adopting Machine Learning(ML)techniques.ML-based SPA defenses use advanced algorithms to analyze various features,including email content,sender behavior,and attachments,to detect potential threats.This capability enables proactive security measures that address risks in real-time.The interpretability of ML models fosters trust and allows security teams to continuously refine these algorithms as new attack methods emerge.Implementing ML techniques requires integrating diverse data sources,such as electronic health records,email logs,and incident reports,which enhance the algorithms’learning environment.Feedback from end-users further improves model performance.Among tested models,the hierarchical models,Convolutional Neural Network(CNN)achieved the highest accuracy at 99.99%,followed closely by the sequential Bidirectional Long Short-Term Memory(BiLSTM)model at 99.94%.In contrast,the traditional Multi-Layer Perceptron(MLP)model showed an accuracy of 98.46%.This difference underscores the superior performance of advanced sequential and hierarchical models in detecting SPAs compared to traditional approaches.
基金Princess Nourah bint Abdulrahman University Researchers Supporting Project number(PNURSP2024R 343),Princess Nourah bint Abdulrahman University,Riyadh,Saudi ArabiaThe authors extend their appreciation to the Deanship of Scientific Research at Northern Border University,Arar,KSA for funding this research work through the project number NBU-FFR-2024-1092-18.
文摘Phishing attacks are more than two-decade-old attacks that attackers use to steal passwords related to financial services.After the first reported incident in 1995,its impact keeps on increasing.Also,during COVID-19,due to the increase in digitization,there is an exponential increase in the number of victims of phishing attacks.Many deep learning and machine learning techniques are available to detect phishing attacks.However,most of the techniques did not use efficient optimization techniques.In this context,our proposed model used random forest-based techniques to select the best features,and then the Brown-Bear optimization algorithm(BBOA)was used to fine-tune the hyper-parameters of the convolutional neural network(CNN)model.To test our model,we used a dataset from Kaggle comprising 11,000+websites.In addition to that,the dataset also consists of the 30 features that are extracted from the website uniform resource locator(URL).The target variable has two classes:“Safe”and“Phishing.”Due to the use of BBOA,our proposed model detects malicious URLs with an accuracy of 93%and a precision of 92%.In addition,comparing our model with standard techniques,such as GRU(Gated Recurrent Unit),LSTM(Long Short-Term Memory),RNN(Recurrent Neural Network),ANN(Artificial Neural Network),SVM(Support Vector Machine),and LR(Logistic Regression),presents the effectiveness of our proposed model.Also,the comparison with past literature showcases the contribution and novelty of our proposed model.
文摘Cybercriminals often use fraudulent emails and fictitious email accounts to deceive individuals into disclosing confidential information,a practice known as phishing.This study utilizes three distinct methodologies,Term Frequency-Inverse Document Frequency,Word2Vec,and Bidirectional Encoder Representations from Transform-ers,to evaluate the effectiveness of various machine learning algorithms in detecting phishing attacks.The study uses feature extraction methods to assess the performance of Logistic Regression,Decision Tree,Random Forest,and Multilayer Perceptron algorithms.The best results for each classifier using Term Frequency-Inverse Document Frequency were Multilayer Perceptron(Precision:0.98,Recall:0.98,F1-score:0.98,Accuracy:0.98).Word2Vec’s best results were Multilayer Perceptron(Precision:0.98,Recall:0.98,F1-score:0.98,Accuracy:0.98).The highest performance was achieved using the Bidirectional Encoder Representations from the Transformers model,with Precision,Recall,F1-score,and Accuracy all reaching 0.99.This study highlights how advanced pre-trained models,such as Bidirectional Encoder Representations from Transformers,can significantly enhance the accuracy and reliability of fraud detection systems.
基金Princess Nourah bint Abdulrahman University Researchers Supporting Project number(PNURSP2024R 343)PrincessNourah bint Abdulrahman University,Riyadh,Saudi ArabiaDeanship of Scientific Research at Northern Border University,Arar,Kingdom of Saudi Arabia,for funding this researchwork through the project number“NBU-FFR-2024-1092-02”.
文摘Phishing attacks present a persistent and evolving threat in the cybersecurity land-scape,necessitating the development of more sophisticated detection methods.Traditional machine learning approaches to phishing detection have relied heavily on feature engineering and have often fallen short in adapting to the dynamically changing patterns of phishingUniformResource Locator(URLs).Addressing these challenge,we introduce a framework that integrates the sequential data processing strengths of a Recurrent Neural Network(RNN)with the hyperparameter optimization prowess of theWhale Optimization Algorithm(WOA).Ourmodel capitalizes on an extensive Kaggle dataset,featuring over 11,000 URLs,each delineated by 30 attributes.The WOA’s hyperparameter optimization enhances the RNN’s performance,evidenced by a meticulous validation process.The results,encapsulated in precision,recall,and F1-score metrics,surpass baseline models,achieving an overall accuracy of 92%.This study not only demonstrates the RNN’s proficiency in learning complex patterns but also underscores the WOA’s effectiveness in refining machine learning models for the critical task of phishing detection.
文摘The two-factor authentication mechanism is gaining popularity as more people are becoming aware of the need to secure their identities. In the current form, existing 2FA systems are defenseless against phishing attacks. They do not provide any visual indicator to the user to check the website’s validity before logging in during phishing attacks. This exposes the user’s password during the phishing attack. Two-factor authentication needs to be enhanced to provide a mechanism to detect phishing attacks without adding a significant burden on the user. This research paper will propose a novel 2-FA TOTP mechanism to provide a subconscious indicator during a phishing attack. In comparison, the new proposed novel approach provides better security against phishing attack. Lastly, the mathematical analysis is performed to understand the TOTP variance and validate the security considerations against the existing 2FA systems with respect to adversary attack.
文摘Phishing is one of the most common threats on the Internet. Traditionally, detection methods have relied on blacklists and heuristic rules, but these approaches are showing their limitations in the face of rapidly evolving attack techniques. Artificial Intelligence (AI) offers promising solutions for improving phishing detection, prediction and prevention. In our study, we analyzed three supervised machine learning classifiers and one deep learning classifier for detecting and predicting phishing websites: Naive Bayes, Decision Tree, Gradient Boosting and Multi-Layer Perceptron. The results showed that the Gradient Boosting Classifier performed best, with a precision of 96.2%, a F1-score of 96.6%, recall and precision of 99.9% in all classes, and a mean absolute error (MAE) of just 0.002. Closely followed by the Gradient Boosting Classifier with a precision of 96.2% and a score of 96.6%. In contrast, Naive Bayes and the Decision Tree showed a lower accuracy rate. These results underline the importance of high accuracy in these models to reduce the risk associated with malicious attachments and reinforce security measures in this area of research.
文摘Phishing emails have experienced a rapid surge in cyber threats globally,especially following the emergence of the COVID-19 pandemic.This form of attack has led to substantial financial losses for numerous organizations.Although variousmodels have been constructed to differentiate legitimate emails fromphishing attempts,attackers continuously employ novel strategies to manipulate their targets into falling victim to their schemes.This form of attack has led to substantial financial losses for numerous organizations.While efforts are ongoing to create phishing detection models,their current level of accuracy and speed in identifying phishing emails is less than satisfactory.Additionally,there has been a concerning rise in the frequency of phished emails recently.Consequently,there is a pressing need for more efficient and high-performing phishing detection models to mitigate the adverse impact of such fraudulent messages.In the context of this research,a comprehensive analysis is conducted on both components of an email message—namely,the email header and body.Sentence-level characteristics are extracted and leveraged in the construction of a new phishing detection model.This model utilizes K Nearest Neighbor(KNN)introducing the novel dimension of sentence-level analysis.Established datasets fromKagglewere employed to train and validate the model.The evaluation of this model’s effectiveness relies on key performance metrics including accuracy of 0.97,precision,recall,and F1-measure.
基金The National Basic Research Program of China(973 Program)(No.2010CB328104,2009CB320501)the National Natural Science Foundation of China(No.61272531,61070158,61003257,61060161,61003311,41201486)+4 种基金the National Key Technology R&D Program during the11th Five-Year Plan Period(No.2010BAI88B03)Specialized Research Fund for the Doctoral Program of Higher Education(No.20110092130002)the National Science and Technology Major Project(No.2009ZX03004-004-04)the Foundation of the Key Laboratory of Netw ork and Information Security of Jiangsu Province(No.BM2003201)the Key Laboratory of Computer Netw ork and Information Integration of the Ministry of Education of China(No.93K-9)
文摘In order to effectively detect malicious phishing behaviors, a phishing detection method based on the uniform resource locator (URL) features is proposed. First, the method compares the phishing URLs with legal ones to extract the features of phishing URLs. Then a machine learning algorithm is applied to obtain the URL classification model from the sample data set training. In order to adapt to the change of a phishing URL, the classification model should be constantly updated according to the new samples. So, an incremental learning algorithm based on the feedback of the original sample data set is designed. The experiments verify that the combination of the URL features extracted in this paper and the support vector machine (SVM) classification algorithm can achieve a high phishing detection accuracy, and the incremental learning algorithm is also effective.
基金The National Basic Research Program of China (973Program)(2010CB328104,2009CB320501)the National Natural Science Foundation of China (No.60773103,90912002)+1 种基金Specialized Research Fund for the Doctoral Program of Higher Education(No.200802860031)Key Laboratory of Computer Network and Information Integration of Ministry of Education of China (No.93K-9)
文摘A phishing detection system, which comprises client-side filtering plug-in, analysis center and protected sites, is proposed. An image-based similarity detection algorithm is conceived to calculate the similarity of two web pages. The web pages are first converted into images, and then divided into sub-images with iterated dividing and shrinking. After that, the attributes of sub-images including color histograms, gray histograms and size parameters are computed to construct the attributed relational graph(ARG)of each page. In order to match two ARGs, the inner earth mover's distances(EMD)between every two nodes coming from each ARG respectively are first computed, and then the similarity of web pages by the outer EMD between two ARGs is worked out to detect phishing web pages. The experimental results show that the proposed architecture and algorithm has good robustness along with scalability, and can effectively detect phishing.
文摘Cyber Attacks are critical and destructive to all industry sectors.They affect social engineering by allowing unapproved access to a Personal Computer(PC)that breaks the corrupted system and threatens humans.The defense of security requires understanding the nature of Cyber Attacks,so prevention becomes easy and accurate by acquiring sufficient knowledge about various features of Cyber Attacks.Cyber-Security proposes appropriate actions that can handle and block attacks.A phishing attack is one of the cybercrimes in which users follow a link to illegal websites that will persuade them to divulge their private information.One of the online security challenges is the enormous number of daily transactions done via phishing sites.As Cyber-Security have a priority for all organizations,Cyber-Security risks are considered part of an organization’s risk management process.This paper presents a survey of different modern machine-learning approaches that handle phishing problems and detect with high-quality accuracy different phishing attacks.A dataset consisting of more than 11000 websites from the Kaggle dataset was utilized and studying the effect of 30 website features and the resulting class label indicating whether or not it is a phishing website(1 or−1).Furthermore,we determined the confusion matrices of Machine Learning models:Neural Networks(NN),Na飗e Bayes,and Adaboost,and the results indicated that the accuracies achieved were 90.23%,92.97%,and 95.43%,respectively.
文摘The data in the cloud is protected by various mechanisms to ensure security aspects and user’s privacy.But,deceptive attacks like phishing might obtain the user’s data and use it for malicious purposes.In Spite of much techno-logical advancement,phishing acts as thefirst step in a series of attacks.With technological advancements,availability and access to the phishing kits has improved drastically,thus making it an ideal tool for the hackers to execute the attacks.The phishing cases indicate use of foreign characters to disguise the ori-ginal Uniform Resource Locator(URL),typosquatting the popular domain names,using reserved characters for re directions and multi-chain phishing.Such phishing URLs can be stored as a part of the document and uploaded in the cloud,providing a nudge to hackers in cloud storage.The cloud servers are becoming the trusted tool for executing these attacks.The prevailing software for blacklisting phishing URLs lacks the security for multi-level phishing and expects security from the client’s end(browser).At the same time,the avalanche effect and immut-ability of block-chain proves to be a strong source of security.Considering these trends in technology,a block-chain basedfiltering implementation for preserving the integrity of user data stored in the cloud is proposed.The proposed Phish Block detects the homographic phishing URLs with accuracy of 91%which assures the security in cloud storage.
文摘The rapid evolution in mobile devices and communication technology has increased the number of mobile device users dramatically. The mobile device has replaced many other devices and is used to perform many tasks ranging from establishing a phone call to performing critical and sensitive tasks like money payments. Since the mobile device is accompanying a person most of his time, it is highly probably that it includes personal and sensitive data for that person. The increased use of mobile devices in daily life made mobile systems an excellent target for attacks. One of the most important attacks is phishing attack in which an attacker tries to get the credential of the victim and impersonate him. In this paper, analysis of different types of phishing attacks on mobile devices is provided. Mitigation techniques—anti-phishing techniques—are also analyzed. Assessment of each technique and a summary of its advantages and disadvantages is provided. At the end, important steps to guard against phishing attacks are provided. The aim of the work is to put phishing attacks on mobile systems in light, and to make people aware of these attacks and how to avoid them.
文摘Phishing attacks pose a significant security threat by masquerading as trustworthy entities to steal sensitive information,a problem that persists despite user awareness.This study addresses the pressing issue of phishing attacks on websites and assesses the performance of three prominent Machine Learning(ML)models—Artificial Neural Networks(ANN),Convolutional Neural Networks(CNN),and Long Short-Term Memory(LSTM)—utilizing authentic datasets sourced from Kaggle and Mendeley repositories.Extensive experimentation and analysis reveal that the CNN model achieves a better accuracy of 98%.On the other hand,LSTM shows the lowest accuracy of 96%.These findings underscore the potential of ML techniques in enhancing phishing detection systems and bolstering cybersecurity measures against evolving phishing tactics,offering a promising avenue for safeguarding sensitive information and online security.