Abstract
To address the low accuracy and poor generalization of existing honeypot contract detection methods, an Ethereum honeypot contract detection method based on source code structure and a graph attention network (CSGDetector) is proposed. First, to extract the structural information of a smart contract's Solidity source code, the source code is parsed and converted into an XML parse tree. Then, a set of feature words that express the structural and content characteristics of the contract is selected, and the contract source code structure graph is constructed. Finally, to avoid the impact of dataset imbalance, the concepts of teacher model and student model are introduced on the basis of ensemble learning theory: graph attention network models are trained from the global and local perspectives, respectively, and the outputs of all models are fused to obtain the final detection result. Experiments show that, compared with the existing method KOLSTM, CSGDetector improves the F1 score by 1.27% and 7.21% in the binary and multi-class classification experiments, respectively, which verifies its strong honeypot detection capability. Compared with the existing method XGB, CSGDetector improves the average recall over different types of honeypot contracts by 7.57% in the masked honeypot detection experiments, which verifies the effectiveness of the proposed method in improving generalization performance.
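The graph-construction and fusion steps described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the tag names in `FEATURE_WORDS`, the rule that a retained node links to its nearest retained ancestor, and the use of plain probability averaging for fusion. The abstract does not state the actual feature-word set, edge rule, or fusion rule used by CSGDetector.

```python
import xml.etree.ElementTree as ET

# Hypothetical feature-word set (assumption): the real set is not given in the abstract.
FEATURE_WORDS = {"contractDefinition", "functionDefinition", "ifStatement", "functionCall"}


def build_structure_graph(xml_text, feature_words=FEATURE_WORDS):
    """Turn an XML parse tree into (nodes, edges).

    nodes: list of tags kept because they appear in feature_words.
    edges: (parent_index, child_index) pairs linking each kept node to its
           nearest kept ancestor (an assumed rule, for illustration only).
    """
    root = ET.fromstring(xml_text.strip())
    nodes, edges = [], []

    def visit(elem, parent_idx):
        idx = parent_idx
        if elem.tag in feature_words:
            idx = len(nodes)
            nodes.append(elem.tag)
            if parent_idx is not None:
                edges.append((parent_idx, idx))
        for child in elem:
            visit(child, idx)

    visit(root, None)
    return nodes, edges


def fuse_outputs(prob_vectors):
    """Soft voting (assumed fusion rule): average class probabilities over models."""
    n = len(prob_vectors)
    return [sum(column) / n for column in zip(*prob_vectors)]


# Toy parse tree standing in for the XML produced from a Solidity contract.
SAMPLE = """
<sourceUnit>
  <contractDefinition>
    <functionDefinition>
      <block>
        <ifStatement>
          <functionCall/>
        </ifStatement>
      </block>
    </functionDefinition>
  </contractDefinition>
</sourceUnit>
"""

nodes, edges = build_structure_graph(SAMPLE)
# One "teacher" and two "student" probability vectors (made-up numbers).
fused = fuse_outputs([[0.9, 0.1], [0.6, 0.4], [0.3, 0.7]])
```

In this sketch the `nodes` and `edges` lists would serve as the input graph for a graph attention network, and `fused` stands in for the combined prediction of the teacher and student models.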
Authors
WANG Youwei (王友卫)
HOU Yudong (侯玉栋)
FENG Lizhou (凤丽洲)
Affiliations: School of Information, Central University of Finance and Economics, Beijing 102206, China; School of Statistics, Tianjin University of Finance and Economics, Tianjin 300222, China
Source
Journal on Communications (《通信学报》)
Indexed in: EI, CSCD, Peking University Core Journals
2023, No. 9, pp. 161-172 (12 pages)
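Funding
Humanities and Social Sciences Foundation of the Ministry of Education of China (No.19YJCZH178)
National Natural Science Foundation of China (No.61906220)
National Social Science Foundation of China (No.18CTJ008)
Emerging Interdisciplinary Project of Central University of Finance and Economics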
Keywords
Ethereum
honeypot contract
source code structure
graph attention network
ensemble learning