Today,phishing is an online attack designed to obtain sensitive information such as credit card and bank account numbers,passwords,and usernames.We can find several anti-phishing solutions,such as heuristic detection,...Today,phishing is an online attack designed to obtain sensitive information such as credit card and bank account numbers,passwords,and usernames.We can find several anti-phishing solutions,such as heuristic detection,virtual similarity detection,black and white lists,and machine learning(ML).However,phishing attempts remain a problem,and establishing an effective anti-phishing strategy is a work in progress.Furthermore,while most antiphishing solutions achieve the highest levels of accuracy on a given dataset,their methods suffer from an increased number of false positives.These methods are ineffective against zero-hour attacks.Phishing sites with a high False Positive Rate(FPR)are considered genuine because they can cause people to lose a lot ofmoney by visiting them.Feature selection is critical when developing phishing detection strategies.Good feature selection helps improve accuracy;however,duplicate features can also increase noise in the dataset and reduce the accuracy of the algorithm.Therefore,a combination of filter-based feature selection methods is proposed to detect phishing attacks,including constant feature removal,duplicate feature removal,quasi-feature removal,correlated feature removal,mutual information extraction,and Analysis of Variance(ANOVA)testing.The technique has been tested with differentMachine Learning classifiers:Random Forest,Artificial Neural Network(ANN),Ada-Boost,Extreme Gradient Boosting(XGBoost),Logistic Regression,Decision Trees,Gradient Boosting Classifiers,Support Vector Machine(SVM),and two types of ensemble models,stacking and majority voting to gain A low false positive rate is achieved.Stacked ensemble classifiers(gradient boosting,randomforest,support vector machine)achieve 1.31%FPR and 98.17%accuracy on Dataset 1,2.81%FPR and Dataset 3 shows 2.81%FPR and 97.61%accuracy,while Dataset 2 shows 3.47%FPR and 96.47%accuracy.展开更多
Cognitive biases are commonly used by attackers to manipulate users’psychology in phishing emails.This study systematically analyzes the exploitation of cognitive biases in phishing emails and addresses the following...Cognitive biases are commonly used by attackers to manipulate users’psychology in phishing emails.This study systematically analyzes the exploitation of cognitive biases in phishing emails and addresses the following questions:(1)Which cognitive biases are frequently exploited in phishing emails?(2)How are cognitive biases exploited in phishing emails?(3)How effective are cognitive bias features in detecting phishing emails?(4)How can the exploitation of cognitive biases in phishing emails be modelled?To address these questions,this study constructed a cognitive processing model that explains how attackers manipulate users by leveraging cognitive biases at different cognitive stages.By annotating 482 phishing emails,this study identified 10 common types of cognitive biases and developed corresponding detection models to evaluate the effectiveness of these bias features in phishing email detection.The results show that models incorporating cognitive bias features significantly outperform baseline models in terms of accuracy,recall,and F1 score.This study provides crucial theoretical support for future anti-phishing methods,as a deeper understanding of cognitive biases offers key insights for designing more effective detection and prevention strategies.展开更多
In the evolving landscape of cyber threats,phishing attacks pose significant challenges,particularly through deceptive webpages designed to extract sensitive information under the guise of legitimacy.Conventional and ...In the evolving landscape of cyber threats,phishing attacks pose significant challenges,particularly through deceptive webpages designed to extract sensitive information under the guise of legitimacy.Conventional and machine learning(ML)-based detection systems struggle to detect phishing websites owing to their constantly changing tactics.Furthermore,newer phishing websites exhibit subtle and expertly concealed indicators that are not readily detectable.Hence,effective detection depends on identifying the most critical features.Traditional feature selection(FS)methods often struggle to enhance ML model performance and instead decrease it.To combat these issues,we propose an innovative method using explainable AI(XAI)to enhance FS in ML models and improve the identification of phishing websites.Specifically,we employ SHapley Additive exPlanations(SHAP)for global perspective and aggregated local interpretable model-agnostic explanations(LIME)to deter-mine specific localized patterns.The proposed SHAP and LIME-aggregated FS(SLA-FS)framework pinpoints the most informative features,enabling more precise,swift,and adaptable phishing detection.Applying this approach to an up-to-date web phishing dataset,we evaluate the performance of three ML models before and after FS to assess their effectiveness.Our findings reveal that random forest(RF),with an accuracy of 97.41%and XGBoost(XGB)at 97.21%significantly benefit from the SLA-FS framework,while k-nearest neighbors lags.Our framework increases the accuracy of RF and XGB by 0.65%and 0.41%,respectively,outperforming traditional filter or wrapper methods and any prior methods evaluated on this dataset,showcasing its potential.展开更多
A phishing detection system, which comprises client-side filtering plug-in, analysis center and protected sites, is proposed. An image-based similarity detection algorithm is conceived to calculate the similarity of t...A phishing detection system, which comprises client-side filtering plug-in, analysis center and protected sites, is proposed. An image-based similarity detection algorithm is conceived to calculate the similarity of two web pages. The web pages are first converted into images, and then divided into sub-images with iterated dividing and shrinking. After that, the attributes of sub-images including color histograms, gray histograms and size parameters are computed to construct the attributed relational graph(ARG)of each page. In order to match two ARGs, the inner earth mover's distances(EMD)between every two nodes coming from each ARG respectively are first computed, and then the similarity of web pages by the outer EMD between two ARGs is worked out to detect phishing web pages. The experimental results show that the proposed architecture and algorithm has good robustness along with scalability, and can effectively detect phishing.展开更多
Cyber Attacks are critical and destructive to all industry sectors.They affect social engineering by allowing unapproved access to a Personal Computer(PC)that breaks the corrupted system and threatens humans.The defen...Cyber Attacks are critical and destructive to all industry sectors.They affect social engineering by allowing unapproved access to a Personal Computer(PC)that breaks the corrupted system and threatens humans.The defense of security requires understanding the nature of Cyber Attacks,so prevention becomes easy and accurate by acquiring sufficient knowledge about various features of Cyber Attacks.Cyber-Security proposes appropriate actions that can handle and block attacks.A phishing attack is one of the cybercrimes in which users follow a link to illegal websites that will persuade them to divulge their private information.One of the online security challenges is the enormous number of daily transactions done via phishing sites.As Cyber-Security have a priority for all organizations,Cyber-Security risks are considered part of an organization’s risk management process.This paper presents a survey of different modern machine-learning approaches that handle phishing problems and detect with high-quality accuracy different phishing attacks.A dataset consisting of more than 11000 websites from the Kaggle dataset was utilized and studying the effect of 30 website features and the resulting class label indicating whether or not it is a phishing website(1 or−1).Furthermore,we determined the confusion matrices of Machine Learning models:Neural Networks(NN),Na飗e Bayes,and Adaboost,and the results indicated that the accuracies achieved were 90.23%,92.97%,and 95.43%,respectively.展开更多
The rapid evolution in mobile devices and communication technology has increased the number of mobile device users dramatically. The mobile device has replaced many other devices and is used to perform many tasks rang...The rapid evolution in mobile devices and communication technology has increased the number of mobile device users dramatically. The mobile device has replaced many other devices and is used to perform many tasks ranging from establishing a phone call to performing critical and sensitive tasks like money payments. Since the mobile device is accompanying a person most of his time, it is highly probably that it includes personal and sensitive data for that person. The increased use of mobile devices in daily life made mobile systems an excellent target for attacks. One of the most important attacks is phishing attack in which an attacker tries to get the credential of the victim and impersonate him. In this paper, analysis of different types of phishing attacks on mobile devices is provided. Mitigation techniques—anti-phishing techniques—are also analyzed. Assessment of each technique and a summary of its advantages and disadvantages is provided. At the end, important steps to guard against phishing attacks are provided. The aim of the work is to put phishing attacks on mobile systems in light, and to make people aware of these attacks and how to avoid them.展开更多
Phishing attacks pose a significant security threat by masquerading as trustworthy entities to steal sensitive information,a problem that persists despite user awareness.This study addresses the pressing issue of phis...Phishing attacks pose a significant security threat by masquerading as trustworthy entities to steal sensitive information,a problem that persists despite user awareness.This study addresses the pressing issue of phishing attacks on websites and assesses the performance of three prominent Machine Learning(ML)models—Artificial Neural Networks(ANN),Convolutional Neural Networks(CNN),and Long Short-Term Memory(LSTM)—utilizing authentic datasets sourced from Kaggle and Mendeley repositories.Extensive experimentation and analysis reveal that the CNN model achieves a better accuracy of 98%.On the other hand,LSTM shows the lowest accuracy of 96%.These findings underscore the potential of ML techniques in enhancing phishing detection systems and bolstering cybersecurity measures against evolving phishing tactics,offering a promising avenue for safeguarding sensitive information and online security.展开更多
Onemust interact with a specific webpage or website in order to use the Internet for communication,teamwork,and other productive activities.However,because phishing websites look benign and not all website visitors ha...Onemust interact with a specific webpage or website in order to use the Internet for communication,teamwork,and other productive activities.However,because phishing websites look benign and not all website visitors have the same knowledge and skills to inspect the trustworthiness of visited websites,they are tricked into disclosing sensitive information and making them vulnerable to malicious software attacks like ransomware.It is impossible to stop attackers fromcreating phishingwebsites,which is one of the core challenges in combating them.However,this threat can be alleviated by detecting a specific website as phishing and alerting online users to take the necessary precautions before handing over sensitive information.In this study,five machine learning(ML)and DL algorithms—cat-boost(CATB),gradient boost(GB),random forest(RF),multilayer perceptron(MLP),and deep neural network(DNN)—were tested with three different reputable datasets and two useful feature selection techniques,to assess the scalability and consistency of each classifier’s performance on varied dataset sizes.The experimental findings reveal that the CATB classifier achieved the best accuracy across all datasets(DS-1,DS-2,and DS-3)with respective values of 97.9%,95.73%,and 98.83%.The GB classifier achieved the second-best accuracy across all datasets(DS-1,DS-2,and DS-3)with respective values of 97.16%,95.18%,and 98.58%.MLP achieved the best computational time across all datasets(DS-1,DS-2,and DS-3)with respective values of 2,7,and 3 seconds despite scoring the lowest accuracy across all datasets.展开更多
Phishing is a type of cybercrime in which cyber-attackers pose themselves as authorized persons or entities and hack the victims’sensitive data.E-mails,instant messages and phone calls are some of the common modes us...Phishing is a type of cybercrime in which cyber-attackers pose themselves as authorized persons or entities and hack the victims’sensitive data.E-mails,instant messages and phone calls are some of the common modes used in cyberattacks.Though the security models are continuously upgraded to prevent cyberattacks,hackers find innovative ways to target the victims.In this background,there is a drastic increase observed in the number of phishing emails sent to potential targets.This scenario necessitates the importance of designing an effective classification model.Though numerous conventional models are available in the literature for proficient classification of phishing emails,the Machine Learning(ML)techniques and the Deep Learning(DL)models have been employed in the literature.The current study presents an Intelligent Cuckoo Search(CS)Optimization Algorithm with a Deep Learning-based Phishing Email Detection and Classification(ICSOA-DLPEC)model.The aim of the proposed ICSOA-DLPEC model is to effectually distinguish the emails as either legitimate or phishing ones.At the initial stage,the pre-processing is performed through three stages such as email cleaning,tokenization and stop-word elimination.Then,the N-gram approach is;moreover,the CS algorithm is applied to extract the useful feature vectors.Moreover,the CS algorithm is employed with the Gated Recurrent Unit(GRU)model to detect and classify phishing emails.Furthermore,the CS algorithm is used to fine-tune the parameters involved in the GRU model.The performance of the proposed ICSOA-DLPEC model was experimentally validated using a benchmark dataset,and the results were assessed under several dimensions.Extensive comparative studies were conducted,and the results confirmed the superior performance of the proposed ICSOA-DLPEC model over other existing approaches.The proposed model achieved a maximum accuracy of 99.72%.展开更多
The continuous destruction and frauds prevailing due to phishing URLs make it an indispensable area for research.Various techniques are adopted in the detection process,including neural networks,machine learning,or hy...The continuous destruction and frauds prevailing due to phishing URLs make it an indispensable area for research.Various techniques are adopted in the detection process,including neural networks,machine learning,or hybrid techniques.A novel detection model is proposed that uses data mining with the Particle Swarm Optimization technique(PSO)to increase and empower the method of detecting phishing URLs.Feature selection based on various techniques to identify the phishing candidates from the URL is conducted.In this approach,the features mined from the URL are extracted using data mining rules.The features are selected on the basis of URL structure.The classification of these features identified by the data mining rules is done using PSO techniques.The selection of features with PSO optimization makes it possible to identify phishing URLs.Using a large number of rule identifiers,the true positive rate for the identification of phishing URLs is maximized in this approach.The experiments show that feature selection using data mining and particle swarm optimization helps tremendously identify the phishing URLs based on the structure of the URL itself.Moreover,it can minimize processing time for identifying the phishing website instead.So,the approach can be beneficial to identify suchURLs over the existing contemporary detecting models proposed before.展开更多
This paper proposes a novel phishing web image segmentation algorithm which based on improving spectral clustering.Firstly,we construct a set of points which are composed of spatial location pixels and gray levels fro...This paper proposes a novel phishing web image segmentation algorithm which based on improving spectral clustering.Firstly,we construct a set of points which are composed of spatial location pixels and gray levels from a given image.Secondly,the data is clustered in spectral space of the similar matrix of the set points,in order to avoid the drawbacks of K-means algorithm in the conventional spectral clustering method that is sensitive to initial clustering centroids and convergence to local optimal solution,we introduce the clone operator,Cauthy mutation to enlarge the scale of clustering centers,quantum-inspired evolutionary algorithm to find the global optimal clustering centroids.Compared with phishing web image segmentation based on K-means,experimental results show that the segmentation performance of our method gains much improvement.Moreover,our method can convergence to global optimal solution and is better in accuracy of phishing web segmentation.展开更多
Phishing attacks are security attacks that do not affect only individuals’or organizations’websites but may affect Internet of Things(IoT)devices and net-works.IoT environment is an exposed environment for such atta...Phishing attacks are security attacks that do not affect only individuals’or organizations’websites but may affect Internet of Things(IoT)devices and net-works.IoT environment is an exposed environment for such attacks.Attackers may use thingbots software for the dispersal of hidden junk emails that are not noticed by users.Machine and deep learning and other methods were used to design detection methods for these attacks.However,there is still a need to enhance detection accuracy.Optimization of an ensemble classification method for phishing website(PW)detection is proposed in this study.A Genetic Algo-rithm(GA)was used for the proposed method optimization by tuning several ensemble Machine Learning(ML)methods parameters,including Random Forest(RF),AdaBoost(AB),XGBoost(XGB),Bagging(BA),GradientBoost(GB),and LightGBM(LGBM).These were accomplished by ranking the optimized classi-fiers to pick out the best classifiers as a base for the proposed method.A PW data-set that is made up of 4898 PWs and 6157 legitimate websites(LWs)was used for this study's experiments.As a result,detection accuracy was enhanced and reached 97.16 percent.展开更多
Phishing,an Internet fraudwhere individuals are deceived into revealing critical personal and account information,poses a significant risk to both consumers and web-based institutions.Data indicates a persistent rise ...Phishing,an Internet fraudwhere individuals are deceived into revealing critical personal and account information,poses a significant risk to both consumers and web-based institutions.Data indicates a persistent rise in phishing attacks.Moreover,these fraudulent schemes are progressively becoming more intricate,thereby rendering them more challenging to identify.Hence,it is imperative to utilize sophisticated algorithms to address this issue.Machine learning is a highly effective approach for identifying and uncovering these harmful behaviors.Machine learning(ML)approaches can identify common characteristics in most phishing assaults.In this paper,we propose an ensemble approach and compare it with six machine learning techniques to determine the type of website and whether it is normal or not based on two phishing datasets.After that,we used the normalization technique on the dataset to transform the range of all the features into the same range.The findings of this paper for all algorithms are as follows in the first dataset based on accuracy,precision,recall,and F1-score,respectively:Decision Tree(DT)(0.964,0.961,0.976,0.968),Random Forest(RF)(0.970,0.964,0.984,0.974),Gradient Boosting(GB)(0.960,0.959,0.971,0.965),XGBoost(XGB)(0.973,0.976,0.976,0.976),AdaBoost(0.934,0.934,0.950,0.942),Multi Layer Perceptron(MLP)(0.970,0.971,0.976,0.974)and Voting(0.978,0.975,0.987,0.981).So,the Voting classifier gave the best results.While in the second dataset,all the algorithms gave the same results in four evaluation metrics,which indicates that each of them can effectively accomplish the prediction process.Also,this approach outperformed the previous work in detecting phishing websites with high accuracy,a lower false negative rate,a shorter prediction time,and a lower false positive rate.展开更多
To secure web applications from Man-In-The-Middle(MITM)and phishing attacks is a challenging task nowadays.For this purpose,authen-tication protocol plays a vital role in web communication which securely transfers dat...To secure web applications from Man-In-The-Middle(MITM)and phishing attacks is a challenging task nowadays.For this purpose,authen-tication protocol plays a vital role in web communication which securely transfers data from one party to another.This authentication works via OpenID,Kerberos,password authentication protocols,etc.However,there are still some limitations present in the reported security protocols.In this paper,the presented anticipated strategy secures both Web-based attacks by leveraging encoded emails and a novel password form pattern method.The proposed OpenID-based encrypted Email’s Authentication,Authorization,and Accounting(EAAA)protocol ensure security by relying on the email authenticity and a Special Secret Encrypted Alphanumeric String(SSEAS).This string is deployed on both the relying party and the email server,which is unique and trustworthy.The first authentication,OpenID Uniform Resource Locator(URL)identity,is performed on the identity provider side.A second authentication is carried out by the hidden Email’s server side and receives a third authentication link.This Email’s third SSEAS authentication link manages on the relying party(RP).Compared to existing cryptographic single sign-on protocols,the EAAA protocol ensures that an OpenID URL’s identity is secured from MITM and phishing attacks.This study manages two attacks such as MITM and phishing attacks and gives 339 ms response time which is higher than the already reported methods,such as Single Sign-On(SSO)and OpenID.The experimental sites were examined by 72 information technology(IT)specialists,who found that 88.89%of respondents successfully validated the user authorization provided to them via Email.The proposed EAAA protocol minimizes the higher-level risk of MITM and phishing attacks in an OpenID-based atmosphere.展开更多
Phishing is one of the simplest ways in cybercrime to hack the reliable data of users such as passwords,account identifiers,bank details,etc.In general,these kinds of cyberattacks are made at users through phone calls...Phishing is one of the simplest ways in cybercrime to hack the reliable data of users such as passwords,account identifiers,bank details,etc.In general,these kinds of cyberattacks are made at users through phone calls,emails,or instant messages.The anti-phishing techniques,currently under use,aremainly based on source code features that need to scrape the webpage content.In third party services,these techniques check the classification procedure of phishing Uniform Resource Locators(URLs).Even thoughMachine Learning(ML)techniques have been lately utilized in the identification of phishing,they still need to undergo feature engineering since the techniques are not well-versed in identifying phishing offenses.The tremendous growth and evolution of Deep Learning(DL)techniques paved the way for increasing the accuracy of classification process.In this background,the current research article presents a Hunger Search Optimization with Hybrid Deep Learning enabled Phishing Detection and Classification(HSOHDL-PDC)model.The presented HSOHDL-PDC model focuses on effective recognition and classification of phishing based on website URLs.In addition,SOHDL-PDC model uses character-level embedding instead of word-level embedding since the URLs generally utilize words with no importance.Moreover,a hybrid Convolutional Neural Network-Long Short Term Memory(HCNN-LSTM)technique is also applied for identification and classification of phishing.The hyperparameters involved in HCNN-LSTM model are optimized with the help of HSO algorithm which in turn produced improved outcomes.The performance of the proposed HSOHDL-PDC model was validated using different datasets and the outcomes confirmed the supremacy of the proposed model over other recent approaches.展开更多
Spear Phishing Attacks(SPAs)pose a significant threat to the healthcare sector,resulting in data breaches,financial losses,and compromised patient confidentiality.Traditional defenses,such as firewalls and antivirus s...Spear Phishing Attacks(SPAs)pose a significant threat to the healthcare sector,resulting in data breaches,financial losses,and compromised patient confidentiality.Traditional defenses,such as firewalls and antivirus software,often fail to counter these sophisticated attacks,which target human vulnerabilities.To strengthen defenses,healthcare organizations are increasingly adopting Machine Learning(ML)techniques.ML-based SPA defenses use advanced algorithms to analyze various features,including email content,sender behavior,and attachments,to detect potential threats.This capability enables proactive security measures that address risks in real-time.The interpretability of ML models fosters trust and allows security teams to continuously refine these algorithms as new attack methods emerge.Implementing ML techniques requires integrating diverse data sources,such as electronic health records,email logs,and incident reports,which enhance the algorithms’learning environment.Feedback from end-users further improves model performance.Among tested models,the hierarchical models,Convolutional Neural Network(CNN)achieved the highest accuracy at 99.99%,followed closely by the sequential Bidirectional Long Short-Term Memory(BiLSTM)model at 99.94%.In contrast,the traditional Multi-Layer Perceptron(MLP)model showed an accuracy of 98.46%.This difference underscores the superior performance of advanced sequential and hierarchical models in detecting SPAs compared to traditional approaches.展开更多
Phishing attacks present a serious threat to enterprise systems,requiring advanced detection techniques to protect sensitive data.This study introduces a phishing email detection framework that combines Bidirectional ...Phishing attacks present a serious threat to enterprise systems,requiring advanced detection techniques to protect sensitive data.This study introduces a phishing email detection framework that combines Bidirectional Encoder Representations from Transformers(BERT)for feature extraction and CNN for classification,specifically designed for enterprise information systems.BERT’s linguistic capabilities are used to extract key features from email content,which are then processed by a convolutional neural network(CNN)model optimized for phishing detection.Achieving an accuracy of 97.5%,our proposed model demonstrates strong proficiency in identifying phishing emails.This approach represents a significant advancement in applying deep learning to cybersecurity,setting a new benchmark for email security by effectively addressing the increasing complexity of phishing attacks.展开更多
Cybercriminals often use fraudulent emails and fictitious email accounts to deceive individuals into disclosing confidential information,a practice known as phishing.This study utilizes three distinct methodologies,Te...Cybercriminals often use fraudulent emails and fictitious email accounts to deceive individuals into disclosing confidential information,a practice known as phishing.This study utilizes three distinct methodologies,Term Frequency-Inverse Document Frequency,Word2Vec,and Bidirectional Encoder Representations from Transform-ers,to evaluate the effectiveness of various machine learning algorithms in detecting phishing attacks.The study uses feature extraction methods to assess the performance of Logistic Regression,Decision Tree,Random Forest,and Multilayer Perceptron algorithms.The best results for each classifier using Term Frequency-Inverse Document Frequency were Multilayer Perceptron(Precision:0.98,Recall:0.98,F1-score:0.98,Accuracy:0.98).Word2Vec’s best results were Multilayer Perceptron(Precision:0.98,Recall:0.98,F1-score:0.98,Accuracy:0.98).The highest performance was achieved using the Bidirectional Encoder Representations from the Transformers model,with Precision,Recall,F1-score,and Accuracy all reaching 0.99.This study highlights how advanced pre-trained models,such as Bidirectional Encoder Representations from Transformers,can significantly enhance the accuracy and reliability of fraud detection systems.展开更多
基金financially supported by the Deanship of Scientific Research and Graduate Studies at King Khalid University under research grant number(R.G.P.2/21/46)in part by the Deanship of Scientific Research,Vice Presidency for Graduate Studies and Scientific Research,King Faisal University,Saudi Arabia,under Grant KFU253116.
文摘Today,phishing is an online attack designed to obtain sensitive information such as credit card and bank account numbers,passwords,and usernames.We can find several anti-phishing solutions,such as heuristic detection,virtual similarity detection,black and white lists,and machine learning(ML).However,phishing attempts remain a problem,and establishing an effective anti-phishing strategy is a work in progress.Furthermore,while most antiphishing solutions achieve the highest levels of accuracy on a given dataset,their methods suffer from an increased number of false positives.These methods are ineffective against zero-hour attacks.Phishing sites with a high False Positive Rate(FPR)are considered genuine because they can cause people to lose a lot ofmoney by visiting them.Feature selection is critical when developing phishing detection strategies.Good feature selection helps improve accuracy;however,duplicate features can also increase noise in the dataset and reduce the accuracy of the algorithm.Therefore,a combination of filter-based feature selection methods is proposed to detect phishing attacks,including constant feature removal,duplicate feature removal,quasi-feature removal,correlated feature removal,mutual information extraction,and Analysis of Variance(ANOVA)testing.The technique has been tested with differentMachine Learning classifiers:Random Forest,Artificial Neural Network(ANN),Ada-Boost,Extreme Gradient Boosting(XGBoost),Logistic Regression,Decision Trees,Gradient Boosting Classifiers,Support Vector Machine(SVM),and two types of ensemble models,stacking and majority voting to gain A low false positive rate is achieved.Stacked ensemble classifiers(gradient boosting,randomforest,support vector machine)achieve 1.31%FPR and 98.17%accuracy on Dataset 1,2.81%FPR and Dataset 3 shows 2.81%FPR and 97.61%accuracy,while Dataset 2 shows 3.47%FPR and 96.47%accuracy.
文摘Cognitive biases are commonly used by attackers to manipulate users’psychology in phishing emails.This study systematically analyzes the exploitation of cognitive biases in phishing emails and addresses the following questions:(1)Which cognitive biases are frequently exploited in phishing emails?(2)How are cognitive biases exploited in phishing emails?(3)How effective are cognitive bias features in detecting phishing emails?(4)How can the exploitation of cognitive biases in phishing emails be modelled?To address these questions,this study constructed a cognitive processing model that explains how attackers manipulate users by leveraging cognitive biases at different cognitive stages.By annotating 482 phishing emails,this study identified 10 common types of cognitive biases and developed corresponding detection models to evaluate the effectiveness of these bias features in phishing email detection.The results show that models incorporating cognitive bias features significantly outperform baseline models in terms of accuracy,recall,and F1 score.This study provides crucial theoretical support for future anti-phishing methods,as a deeper understanding of cognitive biases offers key insights for designing more effective detection and prevention strategies.
文摘In the evolving landscape of cyber threats,phishing attacks pose significant challenges,particularly through deceptive webpages designed to extract sensitive information under the guise of legitimacy.Conventional and machine learning(ML)-based detection systems struggle to detect phishing websites owing to their constantly changing tactics.Furthermore,newer phishing websites exhibit subtle and expertly concealed indicators that are not readily detectable.Hence,effective detection depends on identifying the most critical features.Traditional feature selection(FS)methods often struggle to enhance ML model performance and instead decrease it.To combat these issues,we propose an innovative method using explainable AI(XAI)to enhance FS in ML models and improve the identification of phishing websites.Specifically,we employ SHapley Additive exPlanations(SHAP)for global perspective and aggregated local interpretable model-agnostic explanations(LIME)to deter-mine specific localized patterns.The proposed SHAP and LIME-aggregated FS(SLA-FS)framework pinpoints the most informative features,enabling more precise,swift,and adaptable phishing detection.Applying this approach to an up-to-date web phishing dataset,we evaluate the performance of three ML models before and after FS to assess their effectiveness.Our findings reveal that random forest(RF),with an accuracy of 97.41%and XGBoost(XGB)at 97.21%significantly benefit from the SLA-FS framework,while k-nearest neighbors lags.Our framework increases the accuracy of RF and XGB by 0.65%and 0.41%,respectively,outperforming traditional filter or wrapper methods and any prior methods evaluated on this dataset,showcasing its potential.
基金The National Basic Research Program of China (973Program)(2010CB328104,2009CB320501)the National Natural Science Foundation of China (No.60773103,90912002)+1 种基金Specialized Research Fund for the Doctoral Program of Higher Education(No.200802860031)Key Laboratory of Computer Network and Information Integration of Ministry of Education of China (No.93K-9)
文摘A phishing detection system, which comprises client-side filtering plug-in, analysis center and protected sites, is proposed. An image-based similarity detection algorithm is conceived to calculate the similarity of two web pages. The web pages are first converted into images, and then divided into sub-images with iterated dividing and shrinking. After that, the attributes of sub-images including color histograms, gray histograms and size parameters are computed to construct the attributed relational graph(ARG)of each page. In order to match two ARGs, the inner earth mover's distances(EMD)between every two nodes coming from each ARG respectively are first computed, and then the similarity of web pages by the outer EMD between two ARGs is worked out to detect phishing web pages. The experimental results show that the proposed architecture and algorithm has good robustness along with scalability, and can effectively detect phishing.
文摘Cyber Attacks are critical and destructive to all industry sectors.They affect social engineering by allowing unapproved access to a Personal Computer(PC)that breaks the corrupted system and threatens humans.The defense of security requires understanding the nature of Cyber Attacks,so prevention becomes easy and accurate by acquiring sufficient knowledge about various features of Cyber Attacks.Cyber-Security proposes appropriate actions that can handle and block attacks.A phishing attack is one of the cybercrimes in which users follow a link to illegal websites that will persuade them to divulge their private information.One of the online security challenges is the enormous number of daily transactions done via phishing sites.As Cyber-Security have a priority for all organizations,Cyber-Security risks are considered part of an organization’s risk management process.This paper presents a survey of different modern machine-learning approaches that handle phishing problems and detect with high-quality accuracy different phishing attacks.A dataset consisting of more than 11000 websites from the Kaggle dataset was utilized and studying the effect of 30 website features and the resulting class label indicating whether or not it is a phishing website(1 or−1).Furthermore,we determined the confusion matrices of Machine Learning models:Neural Networks(NN),Na飗e Bayes,and Adaboost,and the results indicated that the accuracies achieved were 90.23%,92.97%,and 95.43%,respectively.
文摘The rapid evolution in mobile devices and communication technology has increased the number of mobile device users dramatically. The mobile device has replaced many other devices and is used to perform many tasks ranging from establishing a phone call to performing critical and sensitive tasks like money payments. Since the mobile device is accompanying a person most of his time, it is highly probably that it includes personal and sensitive data for that person. The increased use of mobile devices in daily life made mobile systems an excellent target for attacks. One of the most important attacks is phishing attack in which an attacker tries to get the credential of the victim and impersonate him. In this paper, analysis of different types of phishing attacks on mobile devices is provided. Mitigation techniques—anti-phishing techniques—are also analyzed. Assessment of each technique and a summary of its advantages and disadvantages is provided. At the end, important steps to guard against phishing attacks are provided. The aim of the work is to put phishing attacks on mobile systems in light, and to make people aware of these attacks and how to avoid them.
文摘Phishing attacks pose a significant security threat by masquerading as trustworthy entities to steal sensitive information,a problem that persists despite user awareness.This study addresses the pressing issue of phishing attacks on websites and assesses the performance of three prominent Machine Learning(ML)models—Artificial Neural Networks(ANN),Convolutional Neural Networks(CNN),and Long Short-Term Memory(LSTM)—utilizing authentic datasets sourced from Kaggle and Mendeley repositories.Extensive experimentation and analysis reveal that the CNN model achieves a better accuracy of 98%.On the other hand,LSTM shows the lowest accuracy of 96%.These findings underscore the potential of ML techniques in enhancing phishing detection systems and bolstering cybersecurity measures against evolving phishing tactics,offering a promising avenue for safeguarding sensitive information and online security.
文摘Onemust interact with a specific webpage or website in order to use the Internet for communication,teamwork,and other productive activities.However,because phishing websites look benign and not all website visitors have the same knowledge and skills to inspect the trustworthiness of visited websites,they are tricked into disclosing sensitive information and making them vulnerable to malicious software attacks like ransomware.It is impossible to stop attackers fromcreating phishingwebsites,which is one of the core challenges in combating them.However,this threat can be alleviated by detecting a specific website as phishing and alerting online users to take the necessary precautions before handing over sensitive information.In this study,five machine learning(ML)and DL algorithms—cat-boost(CATB),gradient boost(GB),random forest(RF),multilayer perceptron(MLP),and deep neural network(DNN)—were tested with three different reputable datasets and two useful feature selection techniques,to assess the scalability and consistency of each classifier’s performance on varied dataset sizes.The experimental findings reveal that the CATB classifier achieved the best accuracy across all datasets(DS-1,DS-2,and DS-3)with respective values of 97.9%,95.73%,and 98.83%.The GB classifier achieved the second-best accuracy across all datasets(DS-1,DS-2,and DS-3)with respective values of 97.16%,95.18%,and 98.58%.MLP achieved the best computational time across all datasets(DS-1,DS-2,and DS-3)with respective values of 2,7,and 3 seconds despite scoring the lowest accuracy across all datasets.
基金This research was supported in part by Basic Science Research Program through the National Research Foundation of Korea(NRF),funded by the Ministry of Education(NRF-2021R1A6A1A03039493)in part by the NRF grant funded by the Korea government(MSIT)(NRF-2022R1A2C1004401).
文摘Phishing is a type of cybercrime in which cyber-attackers pose themselves as authorized persons or entities and hack the victims’sensitive data.E-mails,instant messages and phone calls are some of the common modes used in cyberattacks.Though the security models are continuously upgraded to prevent cyberattacks,hackers find innovative ways to target the victims.In this background,there is a drastic increase observed in the number of phishing emails sent to potential targets.This scenario necessitates the importance of designing an effective classification model.Though numerous conventional models are available in the literature for proficient classification of phishing emails,the Machine Learning(ML)techniques and the Deep Learning(DL)models have been employed in the literature.The current study presents an Intelligent Cuckoo Search(CS)Optimization Algorithm with a Deep Learning-based Phishing Email Detection and Classification(ICSOA-DLPEC)model.The aim of the proposed ICSOA-DLPEC model is to effectually distinguish the emails as either legitimate or phishing ones.At the initial stage,the pre-processing is performed through three stages such as email cleaning,tokenization and stop-word elimination.Then,the N-gram approach is;moreover,the CS algorithm is applied to extract the useful feature vectors.Moreover,the CS algorithm is employed with the Gated Recurrent Unit(GRU)model to detect and classify phishing emails.Furthermore,the CS algorithm is used to fine-tune the parameters involved in the GRU model.The performance of the proposed ICSOA-DLPEC model was experimentally validated using a benchmark dataset,and the results were assessed under several dimensions.Extensive comparative studies were conducted,and the results confirmed the superior performance of the proposed ICSOA-DLPEC model over other existing approaches.The proposed model achieved a maximum accuracy of 99.72%.
基金The authors would like to thank the Deanship of Scientific Research at Shaqra University for supporting this work.
文摘The continuous destruction and frauds prevailing due to phishing URLs make it an indispensable area for research.Various techniques are adopted in the detection process,including neural networks,machine learning,or hybrid techniques.A novel detection model is proposed that uses data mining with the Particle Swarm Optimization technique(PSO)to increase and empower the method of detecting phishing URLs.Feature selection based on various techniques to identify the phishing candidates from the URL is conducted.In this approach,the features mined from the URL are extracted using data mining rules.The features are selected on the basis of URL structure.The classification of these features identified by the data mining rules is done using PSO techniques.The selection of features with PSO optimization makes it possible to identify phishing URLs.Using a large number of rule identifiers,the true positive rate for the identification of phishing URLs is maximized in this approach.The experiments show that feature selection using data mining and particle swarm optimization helps tremendously identify the phishing URLs based on the structure of the URL itself.Moreover,it can minimize processing time for identifying the phishing website instead.So,the approach can be beneficial to identify suchURLs over the existing contemporary detecting models proposed before.
基金Supported by the Fundamental Research Funds for the Central Universities in North China Electric Power University(11MG13)the Natural Science Foundation of Hebei Province(F2011502038)
文摘This paper proposes a novel phishing web image segmentation algorithm which based on improving spectral clustering.Firstly,we construct a set of points which are composed of spatial location pixels and gray levels from a given image.Secondly,the data is clustered in spectral space of the similar matrix of the set points,in order to avoid the drawbacks of K-means algorithm in the conventional spectral clustering method that is sensitive to initial clustering centroids and convergence to local optimal solution,we introduce the clone operator,Cauthy mutation to enlarge the scale of clustering centers,quantum-inspired evolutionary algorithm to find the global optimal clustering centroids.Compared with phishing web image segmentation based on K-means,experimental results show that the segmentation performance of our method gains much improvement.Moreover,our method can convergence to global optimal solution and is better in accuracy of phishing web segmentation.
基金This research has been funded by the Scientific Research Deanship at University of Ha'il-Saudi Arabia through Project Number RG-20023.
文摘Phishing attacks are security attacks that do not affect only individuals’or organizations’websites but may affect Internet of Things(IoT)devices and net-works.IoT environment is an exposed environment for such attacks.Attackers may use thingbots software for the dispersal of hidden junk emails that are not noticed by users.Machine and deep learning and other methods were used to design detection methods for these attacks.However,there is still a need to enhance detection accuracy.Optimization of an ensemble classification method for phishing website(PW)detection is proposed in this study.A Genetic Algo-rithm(GA)was used for the proposed method optimization by tuning several ensemble Machine Learning(ML)methods parameters,including Random Forest(RF),AdaBoost(AB),XGBoost(XGB),Bagging(BA),GradientBoost(GB),and LightGBM(LGBM).These were accomplished by ranking the optimized classi-fiers to pick out the best classifiers as a base for the proposed method.A PW data-set that is made up of 4898 PWs and 6157 legitimate websites(LWs)was used for this study's experiments.As a result,detection accuracy was enhanced and reached 97.16 percent.
基金funding from Deanship of Scientific Research in King Faisal University with Grant Number KFU 241085.
文摘Phishing,an Internet fraudwhere individuals are deceived into revealing critical personal and account information,poses a significant risk to both consumers and web-based institutions.Data indicates a persistent rise in phishing attacks.Moreover,these fraudulent schemes are progressively becoming more intricate,thereby rendering them more challenging to identify.Hence,it is imperative to utilize sophisticated algorithms to address this issue.Machine learning is a highly effective approach for identifying and uncovering these harmful behaviors.Machine learning(ML)approaches can identify common characteristics in most phishing assaults.In this paper,we propose an ensemble approach and compare it with six machine learning techniques to determine the type of website and whether it is normal or not based on two phishing datasets.After that,we used the normalization technique on the dataset to transform the range of all the features into the same range.The findings of this paper for all algorithms are as follows in the first dataset based on accuracy,precision,recall,and F1-score,respectively:Decision Tree(DT)(0.964,0.961,0.976,0.968),Random Forest(RF)(0.970,0.964,0.984,0.974),Gradient Boosting(GB)(0.960,0.959,0.971,0.965),XGBoost(XGB)(0.973,0.976,0.976,0.976),AdaBoost(0.934,0.934,0.950,0.942),Multi Layer Perceptron(MLP)(0.970,0.971,0.976,0.974)and Voting(0.978,0.975,0.987,0.981).So,the Voting classifier gave the best results.While in the second dataset,all the algorithms gave the same results in four evaluation metrics,which indicates that each of them can effectively accomplish the prediction process.Also,this approach outperformed the previous work in detecting phishing websites with high accuracy,a lower false negative rate,a shorter prediction time,and a lower false positive rate.
文摘To secure web applications from Man-In-The-Middle(MITM)and phishing attacks is a challenging task nowadays.For this purpose,authen-tication protocol plays a vital role in web communication which securely transfers data from one party to another.This authentication works via OpenID,Kerberos,password authentication protocols,etc.However,there are still some limitations present in the reported security protocols.In this paper,the presented anticipated strategy secures both Web-based attacks by leveraging encoded emails and a novel password form pattern method.The proposed OpenID-based encrypted Email’s Authentication,Authorization,and Accounting(EAAA)protocol ensure security by relying on the email authenticity and a Special Secret Encrypted Alphanumeric String(SSEAS).This string is deployed on both the relying party and the email server,which is unique and trustworthy.The first authentication,OpenID Uniform Resource Locator(URL)identity,is performed on the identity provider side.A second authentication is carried out by the hidden Email’s server side and receives a third authentication link.This Email’s third SSEAS authentication link manages on the relying party(RP).Compared to existing cryptographic single sign-on protocols,the EAAA protocol ensures that an OpenID URL’s identity is secured from MITM and phishing attacks.This study manages two attacks such as MITM and phishing attacks and gives 339 ms response time which is higher than the already reported methods,such as Single Sign-On(SSO)and OpenID.The experimental sites were examined by 72 information technology(IT)specialists,who found that 88.89%of respondents successfully validated the user authorization provided to them via Email.The proposed EAAA protocol minimizes the higher-level risk of MITM and phishing attacks in an OpenID-based atmosphere.
基金The authors extend their appreciation to the Deanship of Scientific Research at King Khalid University for funding this work through Large Groups Project under grant number(158/43)Princess Nourah bint Abdulrahman University Researchers Supporting Project number(PNURSP2022R135)Princess Nourah bint Abdulrahman University,Riyadh,Saudi Arabia.The authors would like to thank the Deanship of Scientific Research at Umm Al-Qura University for supporting this work by Grant Code:22UQU4340237DSR22.
文摘Phishing is one of the simplest ways in cybercrime to hack the reliable data of users such as passwords,account identifiers,bank details,etc.In general,these kinds of cyberattacks are made at users through phone calls,emails,or instant messages.The anti-phishing techniques,currently under use,aremainly based on source code features that need to scrape the webpage content.In third party services,these techniques check the classification procedure of phishing Uniform Resource Locators(URLs).Even thoughMachine Learning(ML)techniques have been lately utilized in the identification of phishing,they still need to undergo feature engineering since the techniques are not well-versed in identifying phishing offenses.The tremendous growth and evolution of Deep Learning(DL)techniques paved the way for increasing the accuracy of classification process.In this background,the current research article presents a Hunger Search Optimization with Hybrid Deep Learning enabled Phishing Detection and Classification(HSOHDL-PDC)model.The presented HSOHDL-PDC model focuses on effective recognition and classification of phishing based on website URLs.In addition,SOHDL-PDC model uses character-level embedding instead of word-level embedding since the URLs generally utilize words with no importance.Moreover,a hybrid Convolutional Neural Network-Long Short Term Memory(HCNN-LSTM)technique is also applied for identification and classification of phishing.The hyperparameters involved in HCNN-LSTM model are optimized with the help of HSO algorithm which in turn produced improved outcomes.The performance of the proposed HSOHDL-PDC model was validated using different datasets and the outcomes confirmed the supremacy of the proposed model over other recent approaches.
基金funded by the Deanship of Graduate Studies and Scientific Research at Jouf University under Grant Number(DGSSR-2023-02-02513).
文摘Spear Phishing Attacks(SPAs)pose a significant threat to the healthcare sector,resulting in data breaches,financial losses,and compromised patient confidentiality.Traditional defenses,such as firewalls and antivirus software,often fail to counter these sophisticated attacks,which target human vulnerabilities.To strengthen defenses,healthcare organizations are increasingly adopting Machine Learning(ML)techniques.ML-based SPA defenses use advanced algorithms to analyze various features,including email content,sender behavior,and attachments,to detect potential threats.This capability enables proactive security measures that address risks in real-time.The interpretability of ML models fosters trust and allows security teams to continuously refine these algorithms as new attack methods emerge.Implementing ML techniques requires integrating diverse data sources,such as electronic health records,email logs,and incident reports,which enhance the algorithms’learning environment.Feedback from end-users further improves model performance.Among tested models,the hierarchical models,Convolutional Neural Network(CNN)achieved the highest accuracy at 99.99%,followed closely by the sequential Bidirectional Long Short-Term Memory(BiLSTM)model at 99.94%.In contrast,the traditional Multi-Layer Perceptron(MLP)model showed an accuracy of 98.46%.This difference underscores the superior performance of advanced sequential and hierarchical models in detecting SPAs compared to traditional approaches.
基金supported by a grant from Hong Kong Metropolitan University (RD/2023/2.3)supported Princess Nourah bint Abdulrah-man University Researchers Supporting Project number (PNURSP2024R 343)+1 种基金Princess Nourah bint Abdulrahman University,Riyadh,Saudi Arabiathe Deanship of Scientific Research at Northern Border University,Arar,Kingdom of Saudi Arabia for funding this research work through the project number“NBU-FFR-2024-1092-09”.
文摘Phishing attacks present a serious threat to enterprise systems,requiring advanced detection techniques to protect sensitive data.This study introduces a phishing email detection framework that combines Bidirectional Encoder Representations from Transformers(BERT)for feature extraction and CNN for classification,specifically designed for enterprise information systems.BERT’s linguistic capabilities are used to extract key features from email content,which are then processed by a convolutional neural network(CNN)model optimized for phishing detection.Achieving an accuracy of 97.5%,our proposed model demonstrates strong proficiency in identifying phishing emails.This approach represents a significant advancement in applying deep learning to cybersecurity,setting a new benchmark for email security by effectively addressing the increasing complexity of phishing attacks.
文摘Cybercriminals often use fraudulent emails and fictitious email accounts to deceive individuals into disclosing confidential information,a practice known as phishing.This study utilizes three distinct methodologies,Term Frequency-Inverse Document Frequency,Word2Vec,and Bidirectional Encoder Representations from Transform-ers,to evaluate the effectiveness of various machine learning algorithms in detecting phishing attacks.The study uses feature extraction methods to assess the performance of Logistic Regression,Decision Tree,Random Forest,and Multilayer Perceptron algorithms.The best results for each classifier using Term Frequency-Inverse Document Frequency were Multilayer Perceptron(Precision:0.98,Recall:0.98,F1-score:0.98,Accuracy:0.98).Word2Vec’s best results were Multilayer Perceptron(Precision:0.98,Recall:0.98,F1-score:0.98,Accuracy:0.98).The highest performance was achieved using the Bidirectional Encoder Representations from the Transformers model,with Precision,Recall,F1-score,and Accuracy all reaching 0.99.This study highlights how advanced pre-trained models,such as Bidirectional Encoder Representations from Transformers,can significantly enhance the accuracy and reliability of fraud detection systems.