Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 62072253, 62172258, 62302238, and 62372245; the CCF-Tencent Rhino-Bird Open Research Fund; the Major Science and Technology Demonstration Project of Jiangsu Provincial Key Research and Development Program under Grant No. BE2022798; and the Postgraduate Research and Practice Innovation Program of Jiangsu Province of China under Grant No. KYCX20_0829.
Abstract: With the rapid development of operating systems, attacks on system vulnerabilities are increasing. Dynamic link library (DLL) hijacking is prevalent in installers on freeware platforms and is highly susceptible to exploitation by malware attackers. However, existing studies are based solely on the load paths of DLLs, ignoring the attributes of installers and invocation modes, resulting in low accuracy and weak generality of vulnerability detection. In this paper, we propose a novel model, AB-DHD, which is based on an attention mechanism and a bi-directional gated recurrent unit (BiGRU) neural network for DLL hijacking vulnerability discovery. BiGRU is an enhancement of GRU that has been widely applied in sequence data processing; here, a double-layer BiGRU network is introduced to analyze the internal features of installers with DLL hijacking vulnerabilities. Additionally, an attention mechanism is incorporated to dynamically adjust feature weights, significantly enhancing the ability of our model to detect vulnerabilities in new installers. A comprehensive “List of Easily Hijacked DLLs” is developed to serve as a reference for future studies. We construct an EXEFul dataset and a DLLVul dataset, using data from two publicly available authoritative vulnerability databases, Common Vulnerabilities & Exposures (CVE) and China National Vulnerability Database (CNVD), and from mainstream installer distribution platforms. Experimental results show that our model outperforms popular automated tools like Rattler and DLLHSC, achieving an accuracy of 97.79% and a recall of 94.72%. Moreover, 17 previously unknown vulnerabilities have been identified, and corresponding vulnerability certifications have been assigned.
Funding: Supported in part by the National Natural Science Foundation of China (Grant Nos. 61772308, 61472209, and U1736209), the Young Elite Scientists Sponsorship Program by CAST (Grant No. 2016QNRC001), and an award from Tsinghua Information Science and Technology National Laboratory.
Abstract: Security vulnerability is one of the root causes of cyber-security threats. To discover vulnerabilities and fix them in advance, researchers have proposed several techniques, among which fuzzing is the most widely used. In recent years, fuzzing solutions such as AFL have made great improvements in vulnerability discovery. This paper presents a summary of the recent advances, analyzes how they improve the fuzzing process, and sheds light on future work in fuzzing. First, we discuss why fuzzing is popular by comparing different commonly used vulnerability discovery techniques. Then we present an overview of fuzzing solutions and discuss in detail one of the most popular types of fuzzing, i.e., coverage-based fuzzing. Next, we present other techniques that could make the fuzzing process smarter and more efficient. Finally, we show some applications of fuzzing and discuss new trends in fuzzing and potential future directions.
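To make the idea of coverage-based fuzzing concrete, the following is a minimal toy sketch, not AFL's implementation or anything taken from the survey. It assumes a hypothetical target (`toy_target`) that reports which branches an input exercised, and it keeps a mutated input in the queue only when that input reaches a branch not seen before. All function names and the single-byte mutation strategy are illustrative assumptions.

```python
import random


def toy_target(data: bytes) -> set:
    """Hypothetical instrumented target: returns the branch IDs it exercised."""
    covered = {"entry"}
    if len(data) > 0 and data[0] == ord("F"):
        covered.add("b1")
        if len(data) > 1 and data[1] == ord("U"):
            covered.add("b2")
    return covered


def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Overwrite one randomly chosen byte (a deliberately simple mutation)."""
    buf = bytearray(seed) or bytearray(b"\x00")
    pos = rng.randrange(len(buf))
    buf[pos] = rng.randrange(256)
    return bytes(buf)


def fuzz(seeds, iterations, rng):
    """Coverage-guided loop: keep inputs that reach previously unseen branches."""
    queue = list(seeds)
    seen = set()
    for seed in queue:
        seen |= toy_target(seed)
    for _ in range(iterations):
        candidate = mutate(rng.choice(queue), rng)
        coverage = toy_target(candidate)
        if not coverage <= seen:  # at least one new branch was reached
            seen |= coverage
            queue.append(candidate)
    return queue, seen


if __name__ == "__main__":
    queue, seen = fuzz([b"AAAA"], 20000, random.Random(0))
    print(len(queue), sorted(seen))
```

The feedback step (`coverage <= seen`) is what separates this from blind random testing: inputs that make progress become new seeds, so later mutations start from deeper program states.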
Abstract: Tackling binary program analysis problems has traditionally implied manually defining rules and heuristics, a tedious and time-consuming task for human analysts. To improve automation and scalability, we propose an alternative direction based on distributed representations of binary programs, with applicability to a number of downstream tasks. We introduce Bin2vec, a new approach leveraging Graph Convolutional Networks (GCN) along with computational program graphs to learn a high-dimensional representation of binary executable programs. We demonstrate the versatility of this approach by using our representations to solve two semantically different binary analysis tasks: functional algorithm classification and vulnerability discovery. We compare the proposed approach to our own strong baseline as well as published results, and demonstrate improvement over state-of-the-art methods for both tasks. We evaluated Bin2vec on 49191 binaries for the functional algorithm classification task, and on 30 different CWE-IDs including at least 100 CVE entries each for the vulnerability discovery task. We set a new state-of-the-art result by reducing the classification error by 40% compared to the source-code-based inst2vec approach, while working on binary code. For almost every vulnerability class in our dataset, our prediction accuracy is over 80% (and over 90% in multiple classes).
Funding: This research was supported by the National Key R&D Program of China (2022YFB3103900), the National Natural Science Foundation of China (62032010, 62202462), and the Strategic Priority Research Program of the CAS (XDC02030200).
Abstract: Mutation-based greybox fuzzing has been one of the most prevalent techniques for security vulnerability discovery, and a great deal of research has been proposed to improve both its efficiency and effectiveness. Mutation-based greybox fuzzing generates input cases by mutating the input seed, i.e., applying a sequence of mutation operators to randomly selected mutation positions of the seed. However, existing research focuses on scheduling mutation operators, leaving the scheduling of mutation positions as an overlooked aspect of fuzzing efficiency. This paper proposes a novel greybox fuzzing method, PosFuzz, that statistically schedules mutation positions based on their historical performance. PosFuzz uses the concept of an effective position distribution to represent the semantics of the input and to guide the mutations. PosFuzz first utilizes Good-Turing frequency estimation to calculate an effective position distribution for each mutation operator. It then leverages two sampling methods in different mutating stages to select positions from the distribution. We have implemented PosFuzz on top of AFL, AFLFast, and MOPT, called Pos-AFL, Pos-AFLFast, and Pos-MOPT, respectively, and evaluated them on the UNIFUZZ benchmark (20 widely used open-source programs) and the LAVA-M dataset. The results show that, under the same testing time budget, Pos-AFL, Pos-AFLFast, and Pos-MOPT outperform their counterparts in code coverage and vulnerability discovery ability. Compared with AFL, AFLFast, and MOPT, PosFuzz achieves 21% more edge coverage and finds 133% more paths on average. It also triggers 275% more unique bugs on average.
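As a rough illustration of how Good-Turing frequency estimation can turn per-position history into a sampling distribution, here is a small sketch. It is not PosFuzz's implementation: the bookkeeping (`hits`, a map from byte position to the number of mutations there that were judged effective), the fallback when no position has count r + 1, and the function names are all assumptions made for this example. The adjusted count follows the basic Good-Turing formula r* = (r + 1) · N_{r+1} / N_r, where N_r is the number of positions observed exactly r times.

```python
import random
from collections import Counter


def good_turing_distribution(hits, num_positions):
    """Return a sampling probability for each byte position.

    hits[i] counts how often a mutation at position i was effective
    (a hypothetical bookkeeping scheme used only for this sketch).
    """
    counts = [hits.get(i, 0) for i in range(num_positions)]
    if sum(counts) == 0:
        # No history yet: fall back to a uniform distribution.
        return [1.0 / num_positions] * num_positions

    freq_of_freq = Counter(counts)  # N_r: how many positions have count r
    adjusted = []
    for r in counts:
        n_r = freq_of_freq[r]
        n_next = freq_of_freq.get(r + 1, 0)
        if n_next > 0:
            # Good-Turing adjusted count r* = (r + 1) * N_{r+1} / N_r
            adjusted.append((r + 1) * n_next / n_r)
        else:
            # No position has count r + 1: keep the raw count (assumed fallback).
            adjusted.append(float(r))

    norm = sum(adjusted)
    return [a / norm for a in adjusted]


def sample_positions(probabilities, k, rng):
    """Draw k mutation positions according to the estimated distribution."""
    return rng.choices(range(len(probabilities)), weights=probabilities, k=k)


if __name__ == "__main__":
    probs = good_turing_distribution({0: 1, 1: 1, 2: 2}, num_positions=5)
    print(probs)
    print(sample_positions(probs, 4, random.Random(0)))
```

The useful property here is that positions with no recorded hits can still receive probability mass when some positions have been seen exactly once, so rarely tried positions are not starved while historically productive ones remain favored.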
Funding: Supported by the major project of Science and Technology Innovation 2030, "The next generation of Artificial Intelligence", under Grant Number 2021ZD0111400; the Open project of the State Key Laboratory of Computer Architecture, Neural Network Enhanced Symbolic Execution Algorithm Research, under Grant Number CARCH201910; and the Fundamental Research Funds for the Central Universities under Grant Numbers 3132018XNG1814 and 3132018XNG1815.
Abstract: The popularity of small office and home office routers has brought convenience, but it has also caused many security issues due to vulnerabilities. Black-box fuzzing through network protocols has become a viable option for discovering these vulnerabilities. The main drawbacks of state-of-the-art black-box fuzzers can be summarized as follows. First, the feedback process neglects to discover the missing fields in the raw message. Second, the guidance of the raw message content in the mutation process is aimless. Finally, the randomized validity of the test case structure can cause most fuzzing tests to end with an invalid response from the tested device. To address these challenges, we propose a novel black-box fuzzing framework called MSL Fuzzer. MSL Fuzzer infers the raw message structure according to the response from a tested device and generates a message segment list. Furthermore, MSL Fuzzer performs semantic, sequence, and stability analyses on each message segment to enhance the complementation of missing fields in the raw message and guide the mutation process. We construct a dataset of 35 real-world vulnerabilities and evaluate MSL Fuzzer. The evaluation results show that MSL Fuzzer can find more vulnerabilities and elicit more types of responses from fuzzing targets. Additionally, MSL Fuzzer successfully discovered 10 previously unknown vulnerabilities.