Funding: Project (60873081) supported by the National Natural Science Foundation of China; Project (NCET-10-0787) supported by the Program for New Century Excellent Talents in University, China; Project (11JJ1012) supported by the Natural Science Foundation of Hunan Province, China.
Abstract: To quickly find highly similar documents in an existing document set, a fingerprint group merging retrieval algorithm is proposed. It addresses two sides of the problem: the given similarity threshold cannot be too low, and using fewer fingerprints can lead to low accuracy. It can be proved that the fingerprint group merging retrieval algorithm improves the efficiency of similarity retrieval at lower similarity thresholds. Experiments with a low similarity threshold r = 0.7 and a large number of fingerprint bits k = 400 show that the CPU time decreases from 1 921 s to 273 s. Theoretical analysis and experimental results verify the effectiveness of the method.
Funding: Supported by the National Natural Science Foundation of China (General Program, No. 62176264).
Abstract: Malicious MS-Office documents have become one of the most effective attack vectors in APT attacks. Although many protection mechanisms exist, they have proved easy to bypass, and existing detection methods perform poorly on malicious documents that exploit unknown vulnerabilities or exhibit few malicious behaviors. In this paper, we first define im-documents: vulnerable documents that show implicitly malicious behaviors and evade most public antivirus engines. We then present GLDOC, a GCN-based framework that aims to detect im-documents effectively through dynamic analysis and to cover the blind spots of earlier detection methods. Besides system calls, which are the only focus of most prior research, we capture all dynamic behaviors in a sandbox, take the process tree into consideration, and reconstruct both into graphs. With one channel learning each graph, GLDOC trains a 2-channel network together with a classifier, formulating malicious document detection as a graph learning and classification problem. Experiments show that GLDOC achieves a good balance between accuracy and false alarm rate, 95.33% and 4.33% respectively, outperforming other detection methods. In further testing under a simulated 5-day attack scenario, the proposed framework maintains stable, high detection accuracy on unknown vulnerabilities.