1. Near-duplicate document detection with improved similarity measurement (cited by: 2)
Authors: 袁鑫攀, 龙军, 张祖平, 桂卫华. Journal of Central South University (SCIE, EI, CAS), 2012, Issue 8, pp. 2231-2237 (7 pages)
To quickly find highly similar documents in an existing document collection, a fingerprint group merging retrieval algorithm is proposed. It addresses two sides of the problem: the given similarity threshold cannot be too low, and using fewer fingerprints can lead to low accuracy. It can be proved that the algorithm improves the efficiency of similarity retrieval at lower similarity thresholds. Experiments with a lower similarity threshold r=0.7 and a large number of fingerprint bits k=400 show that CPU time decreases from 1921 s to 273 s. Theoretical analysis and experimental results verify the effectiveness of the method.
Keywords: similarity estimation; near-duplicate document detection; fingerprint group; Hamming distance; minwise hashing
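The abstract above builds on minwise hashing and Hamming distance between fingerprints. As a rough illustration of that baseline only, the sketch below estimates the Jaccard similarity of two documents from the fraction of agreeing positions in their minhash signatures. It is not the paper's fingerprint group merging algorithm; the function names, the 3-word shingling, the default k=128, and the use of salted BLAKE2b as the hash family are all assumptions made for this example.

```python
import hashlib


def shingles(text, n=3):
    """Split text into a set of overlapping n-word shingles (assumed tokenization)."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(max(1, len(words) - n + 1))}


def minhash_signature(items, k=128):
    """Return k minimum hash values, one per salted hash function."""
    signature = []
    for seed in range(k):
        salt = seed.to_bytes(8, "little")  # a distinct salt acts as a distinct hash function
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(item.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "big",
            )
            for item in items
        ))
    return signature


def hamming_distance(sig_a, sig_b):
    """Count the positions at which two equal-length signatures differ."""
    return sum(1 for a, b in zip(sig_a, sig_b) if a != b)


def estimate_similarity(sig_a, sig_b):
    """Estimate Jaccard similarity as the fraction of agreeing positions."""
    return 1.0 - hamming_distance(sig_a, sig_b) / len(sig_a)


doc_a = "the quick brown fox jumps over the lazy dog near the river bank today"
doc_b = "the quick brown fox jumps over the lazy dog near the river bank yesterday"
sig_a = minhash_signature(shingles(doc_a))
sig_b = minhash_signature(shingles(doc_b))
print(estimate_similarity(sig_a, sig_b))
```

A pair would be flagged as near-duplicate when the estimate exceeds a chosen threshold, such as the r=0.7 used in the experiments above. Larger k reduces the variance of the estimate at the cost of more hashing work.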
2. GLDOC: detection of implicitly malicious MS-Office documents using graph convolutional networks
Authors: Wenbo Wang, Peng Yi, Taotao Kou, Weitao Han, Chengyu Wang. Cybersecurity, 2025, Issue 3, pp. 61-74 (14 pages)
Malicious MS-Office documents have become one of the most effective attack vectors in APT attacks. Although many protection mechanisms exist, they have proved easy to bypass, and existing detection methods perform poorly on malicious documents that exploit unknown vulnerabilities or exhibit few malicious behaviors. In this paper, we first define im-documents: vulnerable documents that show implicitly malicious behaviors and evade most public antivirus engines. We then present GLDOC, a GCN-based framework that aims to detect im-documents effectively through dynamic analysis and to address possible blind spots of past detection methods. Besides system calls, which are the only focus of most research, we capture all dynamic behaviors in a sandbox, take the process tree into consideration, and reconstruct both into graphs. With one channel learning each graph, GLDOC trains a 2-channel network together with a classifier, formulating malicious document detection as a graph learning and classification problem. Experiments show that GLDOC achieves a good balance between accuracy and false alarm rate, 95.33% and 4.33% respectively, outperforming other detection methods. In further testing on a simulated 5-day attack scenario, the proposed framework maintains stable and high detection accuracy on unknown vulnerabilities.
Keywords: Im-document; APT attack; GCN; Dynamic analysis; Malicious document detection
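To make the idea of learning on a process-tree graph concrete, the sketch below applies one generic graph convolution step, ReLU(D^-1/2 (A + I) D^-1/2 H W), to a tiny three-node tree. This is the standard GCN propagation rule, not GLDOC's actual architecture; the graph, the feature matrix, the weights, and the function names are hypothetical values chosen only for illustration.

```python
import math


def matmul(left, right):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(left[i][k] * right[k][j] for k in range(len(right)))
         for j in range(len(right[0]))]
        for i in range(len(left))
    ]


def gcn_layer(adjacency, features, weights):
    """One graph convolution step with self-loops and symmetric normalization."""
    n = len(adjacency)
    # Add self-loops so each node keeps its own features.
    a_hat = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    degree = [sum(row) for row in a_hat]
    # Symmetric normalization: D^-1/2 (A + I) D^-1/2.
    norm = [[a_hat[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
            for i in range(n)]
    out = matmul(matmul(norm, features), weights)
    # ReLU activation.
    return [[max(0.0, value) for value in row] for row in out]


# Hypothetical process tree: node 0 is the parent of nodes 1 and 2 (undirected edges).
adjacency = [
    [0.0, 1.0, 1.0],
    [1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
]
# Hypothetical one-hot node features and identity weights, so the output equals
# the normalized adjacency matrix and is easy to inspect.
features = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
weights = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

hidden = gcn_layer(adjacency, features, weights)
for row in hidden:
    print(row)
```

In a trained model the weights would be learned, and a 2-channel design like the one described above would run a separate stack of such layers on each graph before a classifier combines them.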