With the rapid development of the Internet,a large number of private protocols emerge on the network.However,some of them are constructed by attackers to avoid being analyzed,posing a threat to computer network securi...With the rapid development of the Internet,a large number of private protocols emerge on the network.However,some of them are constructed by attackers to avoid being analyzed,posing a threat to computer network security.The blockchain uses the P2P protocol to implement various functions across the network.Furthermore,the P2P protocol format of blockchain may differ from the standard format specification,which leads to sniffing tools such as Wireshark and Fiddler not being able to recognize them.Therefore,the ability to distinguish different types of unknown network protocols is vital for network security.In this paper,we propose an unsupervised clustering algorithm based on maximum frequent sequences for binary protocols,which can distinguish various unknown protocols to provide support for analyzing unknown protocol formats.We mine the maximum frequent sequences of protocolmessage sets in bytes.Andwe calculate the fuzzymembership of the protocolmessage to each maximum frequent sequence,which is based on fuzzy set theory.Then we construct the fuzzy membership vector for each protocol message.Finally,we adopt K-means++to split different types of protocol messages into several clusters and evaluate the performance by calculating homogeneity,integrity,and Fowlkes and Mallows Index(FMI).Besides,the clustering algorithms based onNeedleman–Wunsch and the fixed-length prefix are compared with the algorithm presented in this paper.Compared with these traditional clustering methods,we demonstrate a certain improvement in the clustering performance of our work.展开更多
Protocol Reverse Engineering(PRE)is of great practical importance in Internet security-related fields such as intrusion detection,vulnerability mining,and protocol fuzzing.For unknown binary protocols having fixed-len...Protocol Reverse Engineering(PRE)is of great practical importance in Internet security-related fields such as intrusion detection,vulnerability mining,and protocol fuzzing.For unknown binary protocols having fixed-length fields,and the accurate identification of field boundaries has a great impact on the subsequent analysis and final performance.Hence,this paper proposes a new protocol segmentation method based on Information-theoretic statistical analysis for binary protocols by formulating the field segmentation of unsupervised binary protocols as a probabilistic inference problem and modeling its uncertainty.Specifically,we design four related constructions between entropy changes and protocol field segmentation,introduce random variables,and construct joint probability distributions with traffic sample observations.Probabilistic inference is then performed to identify the possible protocol segmentation points.Extensive trials on nine common public and industrial control protocols show that the proposed method yields higher-quality protocol segmentation results.展开更多
基金National Natural Science Foundation of China under Grant No.61872111Sichuan Science and Technology Program(No.2019YFSY0049)the“Project for the Development and Application of Safety Testing and Verification Platform for Industrial Robots”of the Ministry of Industry and Information Technology.
文摘With the rapid development of the Internet,a large number of private protocols emerge on the network.However,some of them are constructed by attackers to avoid being analyzed,posing a threat to computer network security.The blockchain uses the P2P protocol to implement various functions across the network.Furthermore,the P2P protocol format of blockchain may differ from the standard format specification,which leads to sniffing tools such as Wireshark and Fiddler not being able to recognize them.Therefore,the ability to distinguish different types of unknown network protocols is vital for network security.In this paper,we propose an unsupervised clustering algorithm based on maximum frequent sequences for binary protocols,which can distinguish various unknown protocols to provide support for analyzing unknown protocol formats.We mine the maximum frequent sequences of protocolmessage sets in bytes.Andwe calculate the fuzzymembership of the protocolmessage to each maximum frequent sequence,which is based on fuzzy set theory.Then we construct the fuzzy membership vector for each protocol message.Finally,we adopt K-means++to split different types of protocol messages into several clusters and evaluate the performance by calculating homogeneity,integrity,and Fowlkes and Mallows Index(FMI).Besides,the clustering algorithms based onNeedleman–Wunsch and the fixed-length prefix are compared with the algorithm presented in this paper.Compared with these traditional clustering methods,we demonstrate a certain improvement in the clustering performance of our work.
文摘Protocol Reverse Engineering(PRE)is of great practical importance in Internet security-related fields such as intrusion detection,vulnerability mining,and protocol fuzzing.For unknown binary protocols having fixed-length fields,and the accurate identification of field boundaries has a great impact on the subsequent analysis and final performance.Hence,this paper proposes a new protocol segmentation method based on Information-theoretic statistical analysis for binary protocols by formulating the field segmentation of unsupervised binary protocols as a probabilistic inference problem and modeling its uncertainty.Specifically,we design four related constructions between entropy changes and protocol field segmentation,introduce random variables,and construct joint probability distributions with traffic sample observations.Probabilistic inference is then performed to identify the possible protocol segmentation points.Extensive trials on nine common public and industrial control protocols show that the proposed method yields higher-quality protocol segmentation results.