Abstract: As file transfers increase, the proliferation of maliciously coded documents has led to a rise in sophisticated attacks. Portable Document Format (PDF) files have emerged as a major attack vector for malware because of their adaptability and wide usage. Detecting malware in PDF files is challenging because such files can carry various harmful elements, such as embedded scripts, exploits, and malicious URLs. This paper presents a comparative analysis of machine learning (ML) techniques for PDF malware detection, including Naive Bayes (NB), K-Nearest Neighbor (KNN), Averaged One-Dependence Estimators (A1DE), Random Forest (RF), and Support Vector Machine (SVM). The study uses a dataset obtained from the Canadian Institute for Cybersecurity and applies two testing criteria: percentage splitting and 10-fold cross-validation. Performance is evaluated using accuracy, precision, recall, and F1-score. The results indicate that KNN outperforms the other models, achieving an accuracy of 99.8599% under 10-fold cross-validation. The findings highlight the effectiveness of ML models in accurately detecting PDF malware and provide insights for developing robust systems to protect against malicious activities.
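The evaluation protocol named in the abstract (KNN scored by accuracy, precision, recall, and F1 under 10-fold cross-validation) can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the two numeric features, the toy data generator, the choice of k=3, and all function names are hypothetical and are not taken from the paper or from the Canadian Institute for Cybersecurity dataset.

```python
import math
import random
from collections import Counter

def knn_predict(train, query, k=3):
    """Majority vote among the k training rows nearest to `query` (Euclidean)."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def cross_validate(data, folds=10, k=3, seed=0):
    """Return (accuracy, precision, recall, f1) pooled over all folds.

    `data` is a list of (features, label) pairs; label 1 means malicious.
    """
    rows = data[:]
    random.Random(seed).shuffle(rows)
    tp = fp = fn = tn = 0
    for i in range(folds):
        # Every `folds`-th row starting at i is held out for testing.
        test = rows[i::folds]
        train = [r for j, r in enumerate(rows) if j % folds != i]
        for features, label in test:
            pred = knn_predict(train, features, k)
            if pred == 1 and label == 1:
                tp += 1
            elif pred == 1 and label == 0:
                fp += 1
            elif pred == 0 and label == 1:
                fn += 1
            else:
                tn += 1
    accuracy = (tp + tn) / len(rows)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1

def make_toy_data(per_class=20, seed=1):
    """Hypothetical, well-separated data; NOT the dataset used in the paper."""
    rng = random.Random(seed)
    benign = [((rng.uniform(-1, 1), rng.uniform(-1, 1)), 0) for _ in range(per_class)]
    malicious = [((10 + rng.uniform(-1, 1), 10 + rng.uniform(-1, 1)), 1) for _ in range(per_class)]
    return benign + malicious

if __name__ == "__main__":
    print(cross_validate(make_toy_data()))
```

Pooling the confusion counts across folds is one of several reasonable ways to aggregate; averaging per-fold scores is another, and the abstract does not say which the authors used.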
Funding: This work was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education under Grant RS-2023-00237300, and by the Korea Institute of Planning and Evaluation for Technology in Food, Agriculture and Forestry (IPET) through the Agriculture and Food Convergence Technologies Program for Research Manpower Development, funded by the Ministry of Agriculture, Food and Rural Affairs (MAFRA) (Project No. RS-2024-00397026).
Abstract: The seamless integration of intelligent Internet of Things devices with conventional wireless sensor networks has revolutionized data communication for applications such as remote health monitoring, industrial monitoring, transportation, and smart agriculture. Efficient and reliable data routing is one of the major challenges in Internet of Things networks because of the heterogeneity of nodes. This paper presents a traffic-aware, cluster-based, and energy-efficient routing protocol that improves data delivery in such networks. The proposed protocol divides the network into clusters, in which optimal cluster heads are selected among super and normal nodes based on their residual energies. To select the next hop for data delivery towards the base station, the protocol considers multiple criteria, i.e., energy, traffic load, and distance. The performance of the proposed protocol is evaluated using the network simulator NS3.40. Across different traffic rates, numbers of nodes, and packet sizes, the proposed protocol outperformed LoRaWAN in terms of end-to-end packet delivery ratio, energy consumption, end-to-end delay, and network lifetime. For 100 nodes, the proposed protocol achieved a 13% improvement in packet delivery ratio, a 10 ms improvement in delay, and a 10 mJ improvement in average energy consumption over LoRaWAN.
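The two selection steps described in the abstract (cluster head by residual energy, next hop by energy, traffic load, and distance) could look roughly like the sketch below. The abstract does not give the scoring formula, so the weighted sum, the weights (0.4, 0.3, 0.3), the normalization, and all names here are assumptions made purely for illustration.

```python
from dataclasses import dataclass

@dataclass
class Node:
    node_id: str
    residual_energy: float  # joules remaining
    traffic_load: float     # assumed queue occupancy in [0, 1]
    distance_to_bs: float   # metres to the base station

def select_cluster_head(members):
    """Pick the member with the highest residual energy."""
    return max(members, key=lambda n: n.residual_energy)

def next_hop_score(node, max_energy, max_distance, weights=(0.4, 0.3, 0.3)):
    """Hypothetical weighted score: more energy, less load, shorter distance is better."""
    w_energy, w_load, w_dist = weights
    return (
        w_energy * node.residual_energy / max_energy
        + w_load * (1.0 - node.traffic_load)
        + w_dist * (1.0 - node.distance_to_bs / max_distance)
    )

def select_next_hop(candidates, weights=(0.4, 0.3, 0.3)):
    """Return the candidate with the best combined score."""
    # Fall back to 1.0 so an all-zero column cannot cause a division by zero.
    max_energy = max(n.residual_energy for n in candidates) or 1.0
    max_distance = max(n.distance_to_bs for n in candidates) or 1.0
    return max(candidates, key=lambda n: next_hop_score(n, max_energy, max_distance, weights))

if __name__ == "__main__":
    nodes = [
        Node("A", residual_energy=2.0, traffic_load=0.9, distance_to_bs=50.0),
        Node("B", residual_energy=1.8, traffic_load=0.2, distance_to_bs=60.0),
        Node("C", residual_energy=0.5, traffic_load=0.1, distance_to_bs=100.0),
    ]
    print(select_cluster_head(nodes).node_id, select_next_hop(nodes).node_id)
```

In this toy example node A has the most energy but is heavily loaded, so a multi-criteria score prefers node B as the next hop; that trade-off is the motivation for not routing on energy alone.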
Abstract: Retrieving information from an evolving digital data collection in response to a user's query is essential and requires efficient retrieval mechanisms that reduce the time needed to search such massive collections. Scanning and analyzing every document to find the most relevant textual item is time-consuming at scale, so querying the collection calls for a sophisticated technique. Achieving retrieval that is both accurate and fast over a large collection remains challenging. Text summarization is a dominant research field in information retrieval and text processing, used to locate the most appropriate data object, whether a single document or multiple documents, in the collection. Machine learning and knowledge-based techniques are the two families of query-based extractive text summarization techniques in Natural Language Processing (NLP); they can be used for precise retrieval and are considered the best option. NLP uses machine learning approaches, both supervised and unsupervised, to calculate probabilistic features. This study proposes a hybrid approach for query-based extractive text summarization. The TextRank algorithm serves as the core algorithm of the implementation. For query-based summarization of multiple documents, the hybrid approach, which combines K-Means clustering with Latent Dirichlet Allocation (LDA) as the topic modeling technique, achieves 0.288, 0.631, and 0.328 for precision, recall, and F-score, respectively. The results show that the proposed hybrid approach performs better than the independent graph-based approach and the sentence and word frequency-based approach.
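To make the role of TextRank in query-based extractive summarization concrete, here is a minimal sketch of a query-biased TextRank over sentences. It is not the authors' hybrid method: the K-Means and LDA stages are omitted, and the tokenizer, the query bias vector, the damping factor of 0.85, and every function name are assumptions chosen for illustration.

```python
import math
import re

def tokenize(text):
    """Lowercased set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def overlap(a, b):
    """TextRank-style similarity: shared words normalized by log sentence lengths."""
    if len(a) < 2 or len(b) < 2:
        return 0.0
    return len(a & b) / (math.log(len(a)) + math.log(len(b)))

def query_textrank(sentences, query, damping=0.85, iterations=50):
    """Score sentences by graph centrality, biased towards words in the query."""
    tokens = [tokenize(s) for s in sentences]
    q = tokenize(query)
    n = len(sentences)
    # Personalization vector: sentences sharing words with the query get more restart mass.
    bias = [len(t & q) + 1e-6 for t in tokens]
    total = sum(bias)
    bias = [b / total for b in bias]
    weights = [[overlap(tokens[i], tokens[j]) if i != j else 0.0 for j in range(n)]
               for i in range(n)]
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            rank = sum(weights[j][i] / out_sum[j] * scores[j]
                       for j in range(n) if out_sum[j] > 0)
            new_scores.append((1 - damping) * bias[i] + damping * rank)
        scores = new_scores
    return scores

def summarize(sentences, query, top_k=2):
    """Return the top_k highest-scoring sentences in their original order."""
    scores = query_textrank(sentences, query)
    best = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:top_k]
    return [sentences[i] for i in sorted(best)]

if __name__ == "__main__":
    docs = [
        "PDF malware hides in embedded scripts.",
        "Routing protocols save energy in sensor networks.",
        "Embedded scripts in PDF files can carry malware.",
        "The weather was pleasant yesterday.",
    ]
    print(summarize(docs, "PDF malware scripts"))
```

Sentences with no neighbours in the graph simply keep only their restart mass in this sketch; a production implementation would normally redistribute that dangling mass.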