Abstract: The Amazon Web Services (AWS) CloudTrail auditing service provides detailed records of operational and security events, enabling cloud administrators to monitor user activity and manage compliance. Although signature-based threat detection methods have been enhanced with machine learning and Large Language Models (LLMs), these approaches remain limited in addressing emerging threats. This study evaluates a two-step Retrieval Augmented Generation (RAG) approach using Gemini 2.5 Pro to enhance threat detection accuracy and contextual relevance. The RAG system integrates external cybersecurity knowledge sources, including the MITRE ATT&CK framework, the AWS Threat Technique Catalogue, and threat reports, to overcome the limitations of static pre-trained LLMs. We constructed an evaluation dataset of 200 unique CloudTrail events (122 malicious, 78 benign) using the Stratus Red Team adversary emulation framework, covering 9 MITRE ATT&CK techniques across 8 tactics. The events were drawn from 1724 total events using stratified sampling. Ground truth labels were created through systematic expert annotation with 90% inter-annotator agreement. The RAG-enabled model achieved an estimated 78% accuracy, 85% precision, and 79% F1-score, a relative improvement of 70.5% in accuracy and 76.4% in F1-score over the baseline Gemini 2.5 Pro (46% accuracy, 45% F1-score). These performance figures are based on evaluation results on the 200-event dataset. Cost-latency analysis revealed a processing time of 4.1 s and a cost of $0.00376 per event, comparable to commercial SIEM solutions while providing superior MITRE ATT&CK attribution. The findings demonstrate that RAG substantially enhances context-aware threat detection, providing actionable insights for cloud security operations.
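The improvement and cost figures quoted in the abstract can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, the inputs are the rounded values stated in the abstract, and events are assumed to be processed one after another. Because the inputs are rounded, the computed relative gains (about 69.6% for accuracy and 75.6% for F1-score) come out slightly lower than the reported 70.5% and 76.4%, which were presumably derived from unrounded scores.

```python
# Rounded figures taken from the abstract.
BASELINE_ACCURACY = 46.0      # percent, baseline Gemini 2.5 Pro
RAG_ACCURACY = 78.0           # percent, RAG-enabled model
BASELINE_F1 = 45.0            # percent
RAG_F1 = 79.0                 # percent
COST_PER_EVENT_USD = 0.00376  # reported cost per event
SECONDS_PER_EVENT = 4.1       # reported processing time per event


def relative_improvement(baseline: float, improved: float) -> float:
    """Gain of `improved` over `baseline`, as a percentage of the baseline."""
    return (improved - baseline) / baseline * 100.0


def batch_cost_usd(num_events: int) -> float:
    """Total cost of processing `num_events` at the reported per-event cost."""
    return num_events * COST_PER_EVENT_USD


def batch_latency_seconds(num_events: int) -> float:
    """Total time for `num_events`, assuming strictly sequential processing."""
    return num_events * SECONDS_PER_EVENT


if __name__ == "__main__":
    acc_gain = relative_improvement(BASELINE_ACCURACY, RAG_ACCURACY)
    f1_gain = relative_improvement(BASELINE_F1, RAG_F1)
    print(f"Relative accuracy gain: {acc_gain:.1f}%")
    print(f"Relative F1 gain: {f1_gain:.1f}%")
    print(f"Cost for 200 events: ${batch_cost_usd(200):.3f}")
    print(f"Sequential time for 200 events: {batch_latency_seconds(200):.0f} s")
```

Distinguishing relative gains from percentage-point gains matters here: the accuracy difference is 32 percentage points (78% versus 46%), while the roughly 70% figure expresses that difference as a fraction of the baseline.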