SREGym: A Live Training Ground for AI SRE Agents with High-Fidelity Failure Drills

Jackson Clark (University of Illinois Urbana-Champaign), Yiming Su (University of Illinois Urbana-Champaign), Saad Mohammad Rafid Pial (Bangladesh University of Engineering and Technology), Lily Gniedziejko (University of Illinois Urbana-Champaign), Tianyin Xu (University of Illinois Urbana-Champaign)

Evaluation & Benchmarking

A live benchmark for AI SRE agents featuring high-fidelity failure drills with fault injection across OS kernels, hardware, and compound multi-event scenarios.

Presentation

Demo session

Thursday, May 28 · 4:30 PM – 6:00 PM

San Jose

Description

SREGym is a new benchmark for AI-driven Site Reliability Engineering (SRE) techniques that diagnose and mitigate production failures. It provides a live training ground in which high-fidelity failure drills are emulated through fault injectors. SREGym differs from existing SRE benchmarks such as AIOpsLab and ITBench in that it realizes comprehensive, high-fidelity failure drills. Its extensible software architecture orchestrates fault injectors and simulators across the system stack and adds three new capabilities: (1) simulating low-level faults in OS kernels and hardware, (2) coordinating multiple concurrent events into compound drills, and (3) composing noise to model production environments. We demonstrate how to use and extend SREGym and present three representative cases of how AI agents tackle SREGym problems.
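
To make the idea of a compound drill concrete, here is a minimal Python sketch of coordinating several fault injectors, plus a noise source, as one drill. It is an illustration only and does not reflect SREGym's actual API: the names `Injector`, `CompoundDrill`, and `make_injector`, the layer labels, and the fault names are all hypothetical, and the injectors only append to a log instead of touching a real system.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical sketch: none of these names come from SREGym itself.


@dataclass
class Injector:
    """One fault (or noise) source at a given layer of the stack."""
    name: str
    layer: str  # assumed labels, e.g. "kernel", "hardware", "noise"
    inject: Callable[[List[str]], None]
    recover: Callable[[List[str]], None]


@dataclass
class CompoundDrill:
    """Starts several injectors together and undoes them in reverse order."""
    injectors: List[Injector]
    log: List[str] = field(default_factory=list)

    def start(self) -> None:
        for inj in self.injectors:
            inj.inject(self.log)

    def stop(self) -> None:
        # Recover in reverse so later faults are cleared before earlier ones.
        for inj in reversed(self.injectors):
            inj.recover(self.log)


def make_injector(name: str, layer: str) -> Injector:
    """Build a stand-in injector that only records what it would do."""
    return Injector(
        name=name,
        layer=layer,
        inject=lambda log: log.append(f"inject {layer}:{name}"),
        recover=lambda log: log.append(f"recover {layer}:{name}"),
    )


drill = CompoundDrill([
    make_injector("disk_latency", "hardware"),
    make_injector("syscall_error", "kernel"),
    make_injector("background_traffic", "noise"),
])
drill.start()
drill.stop()
for entry in drill.log:
    print(entry)
```

In this sketch the noise source is treated as just another injector, so a drill with or without noise uses the same start and stop path. Whether SREGym models noise this way is an assumption.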
