Internet communication protocols define the rules that network components follow when they communicate with each other. As network technologies continue to develop, many private or unknown protocols are emerging across a wide variety of network environments. In settings such as network security management and intrusion detection, the relevant protocol specifications are often unavailable or difficult to interpret. Although protocol reverse engineering has been investigated in recent years as a way to recover the specifications of unknown protocols, most existing methods are time-consuming and inefficient, especially when applied to unknown protocol state machines. This paper proposes a state merging algorithm based on EDSM (Evidence-Driven State Merging) that efficiently infers the transition rules of unknown protocols in the form of state machines. Compared with a classical state machine inference method based on the Exbar algorithm, the experimental results show that the proposed method runs faster, especially on massive training data sets. In addition, the inferred state machines are more similar to the reference state machines constructed from public specifications.
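To make the idea of evidence-driven state merging more concrete, the following is a minimal sketch and not the paper's algorithm. It builds a prefix tree acceptor from labeled traces and computes a simplified merge score for two states: the number of state pairs whose labels agree, or `None` when an accepting state would be merged with a rejecting one. The class `PTA`, the function `edsm_score`, and the toy traces over the symbols `a` and `b` are all hypothetical. A full EDSM implementation would also fold the merged states and repeat the scoring on the evolving hypothesis, which is omitted here.

```python
from typing import Dict, List, Optional, Tuple


class PTA:
    """Prefix tree acceptor: one state per distinct prefix of the sample."""

    def __init__(self) -> None:
        self.trans: List[Dict[str, int]] = [{}]      # trans[state][symbol] -> state
        self.label: List[Optional[bool]] = [None]    # True/False/None = accept/reject/unknown

    def add(self, trace: List[str], accepted: bool) -> None:
        state = 0
        for sym in trace:
            if sym not in self.trans[state]:
                self.trans.append({})
                self.label.append(None)
                self.trans[state][sym] = len(self.trans) - 1
            state = self.trans[state][sym]
        self.label[state] = accepted


def edsm_score(pta: PTA, red: int, blue: int) -> Optional[int]:
    """Simplified evidence score for merging `blue` into `red`.

    Walks the overlapping parts of both subtrees. Returns None on a
    label conflict, otherwise the number of agreeing labeled pairs.
    """
    score = 0
    stack: List[Tuple[int, int]] = [(red, blue)]
    while stack:
        a, b = stack.pop()
        la, lb = pta.label[a], pta.label[b]
        if la is not None and lb is not None:
            if la != lb:
                return None          # accept/reject conflict: merge is invalid
            score += 1               # one more piece of evidence for the merge
        for sym, b_next in pta.trans[b].items():
            a_next = pta.trans[a].get(sym)
            if a_next is not None:
                stack.append((a_next, b_next))
    return score


# Toy sample (hypothetical): states are numbered in order of creation.
pta = PTA()
pta.add(["a"], True)         # state 1
pta.add(["a", "a"], True)    # state 2
pta.add(["b"], False)        # state 3
pta.add(["a", "b"], False)   # state 4

score_ok = edsm_score(pta, 0, 1)        # root vs. state after "a"
score_conflict = edsm_score(pta, 1, 3)  # accepting vs. rejecting state
print(score_ok, score_conflict)
```

In an evidence-driven search, the candidate merge with the highest score would be applied first, since more agreeing labels means more evidence that the two states behave alike.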
Inferring protocol state machines from observable information is a significant challenge in protocol reverse engineering (PRE), especially when passively collected traffic suffers from message loss, which leaves the protocol state space incomplete. This paper introduces a method for actively inferring protocol state machines using the minimally adequate teacher (MAT) framework. By incorporating session completion and deterministic mutation techniques, the method broadens the range of protocol messages and thereby constructs a more comprehensive input space for the protocol state machine from an incomplete message domain. The efficiency of active inference is further improved through several optimizations of the $L_M^+$ algorithm: traffic deduplication, construction of an expanded prefix tree acceptor (EPTA), response-based query optimization, and random counterexample generation. Experiments on the real-time streaming protocol (RTSP) and the simple mail transfer protocol (SMTP), using multiple versions of the Live555 and Exim implementations, demonstrate that the method yields more comprehensive protocol state machines with better execution efficiency. Compared to the $L_M^+$ algorithm implemented by AALpy, Act_Infer reduces execution time by approximately 40.7% on average and reduces the number of connections and interactions by approximately 28.6% and 46.6%, respectively.
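The interplay between captured traffic and active queries can be illustrated with a small sketch. It is not the Act_Infer implementation and does not use the AALpy API. It assumes a deterministic system under learning, deduplicates captured sessions, preloads their request/response prefixes into a cache, and opens a live connection only when a query cannot be answered from that cache. The names `deduplicate`, `CachedTeacher`, and `toy_system`, as well as the SMTP-like messages and reply codes, are hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Session = Tuple[Tuple[str, str], ...]   # ((request, response), ...)


def deduplicate(sessions: Sequence[Session]) -> List[Session]:
    """Drop repeated sessions while keeping the original order."""
    seen = set()
    unique: List[Session] = []
    for session in sessions:
        if session not in seen:
            seen.add(session)
            unique.append(session)
    return unique


class CachedTeacher:
    """Answers output queries from observed prefixes before going live."""

    def __init__(self, system: Callable[[Sequence[str]], List[str]]) -> None:
        self.system = system
        self.cache: Dict[Tuple[str, ...], Tuple[str, ...]] = {}
        self.connections = 0             # number of live sessions opened

    def _store(self, inputs: Tuple[str, ...], outputs: Tuple[str, ...]) -> None:
        # A deterministic system answers every prefix consistently.
        for i in range(1, len(inputs) + 1):
            self.cache[inputs[:i]] = outputs[:i]

    def preload(self, sessions: Sequence[Session]) -> None:
        for session in sessions:
            self._store(tuple(req for req, _ in session),
                        tuple(resp for _, resp in session))

    def output_query(self, inputs: Sequence[str]) -> List[str]:
        key = tuple(inputs)
        if key not in self.cache:
            self.connections += 1
            self._store(key, tuple(self.system(key)))
        return list(self.cache[key])


def toy_system(inputs: Sequence[str]) -> List[str]:
    """Hypothetical stand-in for a real server (simplified SMTP-like dialogue)."""
    state, replies = "init", []
    for msg in inputs:
        if state == "closed":
            replies.append("closed")
        elif msg == "QUIT":
            state = "closed"
            replies.append("221")
        elif state == "init" and msg == "HELO":
            state = "ready"
            replies.append("250")
        elif state == "ready" and msg == "MAIL":
            replies.append("250")
        else:
            replies.append("503")
    return replies


captured = [
    (("HELO", "250"), ("QUIT", "221")),
    (("HELO", "250"), ("QUIT", "221")),   # duplicate capture
    (("HELO", "250"), ("MAIL", "250")),
]
unique = deduplicate(captured)
teacher = CachedTeacher(toy_system)
teacher.preload(unique)

first = teacher.output_query(["HELO", "QUIT"])   # answered from captured traffic
second = teacher.output_query(["MAIL"])          # needs one live connection
third = teacher.output_query(["MAIL"])           # answered from the cache
print(first, second, teacher.connections)
```

Counting `connections` separately from queries mirrors the way the abstract reports connections and interactions as distinct costs; a learner built on such a teacher pays only for queries the captured traffic cannot answer.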
Funding (EDSM-based state merging paper): This work is supported by the National Natural Science Foundation of China (Grant Numbers: 61471141, 61361166006, 61301099), the Basic Research Project of Shenzhen, China (Grant Number: JCYJ20150513151706561), and the National Defense Basic Scientific Research Program of China (Grant Number: JCKY2018603B006).
Funding (active state machine inference paper): Project supported by the Key JCJQ Program of China (Nos. 2020-JCJQ-ZD-021-00 and 2020-JCJQ-ZD-024-12).