Funding: Supported by the National Key Research and Development Program "Strategic Technology Innovation Cooperation" Key Special Project (No. SQ2024YFE0201727).
Abstract: The reuse of third-party open source software (OSS) has become increasingly common, leading to the inadvertent introduction of external vulnerabilities and subsequent security issues. While security patches play a crucial role in software security, the lack of vulnerability patch datasets poses a challenge. In this paper, we introduce VPLocator, an automated approach that matches vulnerabilities with their patch commits using large language models (LLMs). VPLocator uses BERT to extract the deep semantic interaction between the vulnerability description and the commit message, and integrates these features with manually selected and extracted features. By training a prediction model based on a multilayer perceptron (MLP), the primary VPLocator achieves an average recall of 98.2%. Furthermore, the advanced VPLocator employs ChatGPT to analyze the difference between the code before and after the commit. By incorporating semantic features from the text in which ChatGPT describes that difference, VPLocator improves precision from 47.7% to 78.7%.
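The feature-fusion step described in the abstract (semantic features combined with manually selected features, then scored by an MLP) can be illustrated with a minimal sketch. This is not the authors' implementation. The two hand-picked features, the function names, the placeholder semantic vector (standing in for a BERT output) and the fixed weights are all assumptions made for illustration; a real model would learn its weights from labeled vulnerability-commit pairs.

```python
import math
from typing import List, Sequence


def handcrafted_features(cve_id: str, description: str, commit_message: str) -> List[float]:
    """Two hypothetical hand-picked features (the abstract does not list the real ones)."""
    desc_tokens = set(description.lower().split())
    msg_tokens = set(commit_message.lower().split())
    union = desc_tokens | msg_tokens
    # Token overlap (Jaccard similarity) between description and commit message.
    jaccard = len(desc_tokens & msg_tokens) / len(union) if union else 0.0
    # Whether the commit message mentions the vulnerability identifier.
    mentions_id = 1.0 if cve_id.lower() in commit_message.lower() else 0.0
    return [jaccard, mentions_id]


def fuse(semantic: Sequence[float], manual: Sequence[float]) -> List[float]:
    """Concatenate semantic and hand-picked features into one input vector."""
    return list(semantic) + list(manual)


def mlp_score(
    features: Sequence[float],
    w_hidden: Sequence[Sequence[float]],
    b_hidden: Sequence[float],
    w_out: Sequence[float],
    b_out: float,
) -> float:
    """One-hidden-layer MLP forward pass: ReLU hidden units, sigmoid output in (0, 1)."""
    hidden = [
        max(0.0, sum(w * x for w, x in zip(row, features)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    logit = sum(w * h for w, h in zip(w_out, hidden)) + b_out
    return 1.0 / (1.0 + math.exp(-logit))


if __name__ == "__main__":
    # Placeholder for a semantic embedding; in the described approach this comes from BERT.
    semantic = [0.2, -0.1, 0.7]
    manual = handcrafted_features(
        "CVE-2020-0001",
        "buffer overflow in parser",
        "Fix buffer overflow (CVE-2020-0001)",
    )
    x = fuse(semantic, manual)
    # Illustrative, untrained weights: 2 hidden units over 5 input features.
    w_hidden = [[0.5, 0.0, 0.5, 1.0, 1.0], [0.0, 1.0, 0.0, 0.5, 0.5]]
    score = mlp_score(x, w_hidden, [0.0, 0.0], [1.0, 1.0], -1.0)
    print(f"match score: {score:.3f}")
```

A candidate commit would be treated as the patch for a vulnerability when its score exceeds a chosen threshold. Raising the threshold generally trades recall for precision, the two metrics the abstract reports.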