Retrieval-Augmented LLMs for Security Incident Analysis

Xavier Cadet (Dartmouth College), Aditya Vikram Singh (Northeastern University), Harsh Mamania (Northeastern University), Edward Koh (Dartmouth College), Alex Fitts (Punch Cyber), Dirk Van Bruggen (Punch Cyber), Simona Boboila (Northeastern University), Peter Chin (Dartmouth College), Alina Oprea (Northeastern University)

Architectural Patterns & Composition, Evaluation & Benchmarking

Abstract

Investigating cybersecurity incidents requires collecting and analyzing evidence from multiple log sources, including intrusion detection alerts, network traffic records, and authentication events. This process is labor-intensive: analysts must sift through large volumes of data to identify relevant indicators and piece together what happened. We present a RAG-based system that performs security incident analysis through targeted query-based filtering and LLM semantic reasoning. The system uses a query library with associated MITRE ATT&CK techniques to extract indicators from raw logs, then retrieves relevant context to answer forensic questions and reconstruct attack sequences. We evaluate the system with five LLM providers on malware traffic incidents and multi-stage Active Directory attacks. Models differ in accuracy and cost: Claude Sonnet 4 and DeepSeek V3 both achieve 100% recall across all four malware scenarios, but DeepSeek costs 15× less ($0.008 vs. $0.12 per analysis). Attack step detection on Active Directory scenarios reaches 100% precision and 82% recall. Ablation studies confirm that a RAG architecture is essential: LLM baselines without RAG-enhanced context correctly identify victim hosts but miss all attack infrastructure, including malicious domains and command-and-control servers. These results demonstrate that combining targeted query-based filtering with RAG-based retrieval enables accurate, cost-effective security analysis within LLM context limits.
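The filtering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the query names, record fields, allowlist, and character budget are all hypothetical, and the pairing of each query with an ATT&CK technique ID (T1110 Brute Force, T1071.004 Application Layer Protocol: DNS) is chosen only for illustration. The sketch shows how a small library of technique-tagged queries might reduce raw log records to indicators, and how those indicators might be packed into a bounded context string for an LLM prompt. The retrieval and LLM reasoning stages are omitted.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, str]

# Hypothetical allowlist; a real deployment would load this from configuration.
ALLOWED_DOMAINS = {"updates.example.com"}


@dataclass(frozen=True)
class LibraryQuery:
    name: str                          # hypothetical query identifier
    technique: str                     # MITRE ATT&CK technique ID tagged on hits
    source: str                        # log source the query applies to
    matches: Callable[[Record], bool]  # predicate over a single log record


# Illustrative two-entry library (assumed, not taken from the paper).
QUERY_LIBRARY: List[LibraryQuery] = [
    LibraryQuery("failed_login", "T1110", "auth",
                 lambda r: r.get("outcome") == "failure"),
    LibraryQuery("dns_outside_allowlist", "T1071.004", "dns",
                 lambda r: r.get("query") not in ALLOWED_DOMAINS),
]


def extract_indicators(records: List[Record],
                       library: List[LibraryQuery]) -> List[dict]:
    """Return one hit per (record, query) pair whose source and predicate match."""
    hits = []
    for record in records:
        for query in library:
            if record.get("source") == query.source and query.matches(record):
                hits.append({"query": query.name,
                             "technique": query.technique,
                             "record": record})
    return hits


def build_context(indicators: List[dict], max_chars: int = 2000) -> str:
    """Render hits as one line each, stopping before the character budget is exceeded."""
    lines: List[str] = []
    used = 0
    for hit in indicators:
        fields = ", ".join(f"{k}={v}" for k, v in sorted(hit["record"].items()))
        line = f"[{hit['technique']}] {hit['query']}: {fields}"
        if used + len(line) + 1 > max_chars:
            break
        lines.append(line)
        used += len(line) + 1
    return "\n".join(lines)
```

Filtering before prompting is what keeps the input within LLM context limits: only records matched by some library query reach the model, each labeled with the technique that motivated the query.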
