Abstract: The boom of programming languages in the 1950s revolutionized how our digital world was constructed and accessed. Languages invented then, including Fortran, are still in use today because of their versatility and because they underpin many of the older portions of our digital infrastructure and applications. Fortran, short for Formula Translation, was a programming language developed by IBM that streamlined the mechanics of coding and improved the efficacy of language syntax. Fortran marked the beginning of a new era of efficient programming by reducing the number of statements needed to operate a machine several-fold. Since then, dozens more languages have come into regular use and have diversified over the years. Modern languages include Python, Java, JavaScript, C, C++, and PHP. These languages significantly improved efficiency and have a broad range of uses. Python is mainly used for website and software development, data analysis, task automation, image processing, and graphic design applications. Java, by contrast, is used primarily for enterprise, server-side, and Android application development. The expansion of programming languages increased accessibility but also exposed applications to security issues, which vary in prevalence from language to language. Previous research has focused narrowly on individual languages without evaluating security comparatively across them. This research paper investigates the severity and frequency of coding vulnerabilities across different languages and contextualizes their uses in a systematic literature review.
Funding: The research was supported in part by the National Natural Science Foundation of China (Nos. 62166004, U21A20474), the Guangxi Science and Technology Major Project (No. AA22068070), the Guangxi Natural Science Foundation (No. 2020GXNSFAA297075), the Center for Applied Mathematics of Guangxi, the Guangxi "Bagui Scholar" Teams for Innovation and Research Project, the Guangxi Talent Highland Project of Big Data Intelligence and Application, the Guangxi Collaborative Center of Multisource Information Integration and Intelligent Processing, and the Fundamental Research Funds for the Central Universities (No. 2021JKF06).
Abstract: Vulnerability reports are essential for improving software security because they record key information about vulnerabilities. In a report, the CWE entry denotes the weakness behind the vulnerability and thus helps readers quickly understand its cause. CWE assignment is therefore useful for categorizing newly discovered vulnerabilities. In this paper, we propose an automatic CWE assignment method based on graph neural networks. First, we prepare a dataset of 3394 real-world vulnerabilities from Linux, OpenSSL, Wireshark, and many other software programs. Then, we extract statements with vulnerability syntax features from these vulnerabilities and use program slicing to slice them according to the categories of syntax features. We then represent these slices as graphs that characterize the data dependencies and control dependencies between statements. Finally, we employ graph neural networks to learn the hidden information in these graphs and leverage a Siamese network to compute the similarity between vulnerable functions, thereby assigning CWE IDs to these vulnerabilities. The experimental results show that the proposed method is effective compared with existing methods.
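The final step described above, assigning a CWE ID by similarity between vulnerable functions, can be illustrated with a minimal sketch. It is not the paper's implementation: the embedding vectors are made-up placeholders standing in for the output of a graph neural network, cosine similarity is an assumed choice of metric (the abstract does not name one), and `cosine_similarity`, `assign_cwe`, and the reference set are hypothetical names introduced only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction; treat it as dissimilar
    return dot / (norm_a * norm_b)


def assign_cwe(query_embedding, labeled_embeddings):
    """Return the CWE ID of the most similar labeled function and its score.

    labeled_embeddings is a list of (cwe_id, embedding) pairs. In a
    Siamese-style setup both embeddings would come from the same encoder.
    """
    best_cwe, best_score = None, -1.0
    for cwe_id, embedding in labeled_embeddings:
        score = cosine_similarity(query_embedding, embedding)
        if score > best_score:
            best_cwe, best_score = cwe_id, score
    return best_cwe, best_score


# Hypothetical embeddings: placeholders, not values from the paper's dataset.
reference = [
    ("CWE-119", [0.9, 0.1, 0.0]),
    ("CWE-476", [0.1, 0.9, 0.2]),
    ("CWE-190", [0.0, 0.2, 0.9]),
]
query = [0.8, 0.2, 0.1]

cwe_id, score = assign_cwe(query, reference)
print(cwe_id, round(score, 3))
```

In this toy setting the query is labeled with the CWE ID of its nearest reference vector. A real system would learn the encoder and typically a similarity threshold as well, neither of which is modeled here.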