optimize_anything: Unified Text Optimization can Outperform Specialized Systems
Lakshya A Agrawal (UC Berkeley), Donghyun Lee (UC Berkeley), Shangyin Tan (UC Berkeley), Wenjie Ma (UC Berkeley), Karim Elmaaroufi (UC Berkeley), Rohit Sandadi (UC Berkeley), Sanjit A. Seshia (UC Berkeley), Koushik Sen (UC Berkeley), Dan Klein (UC Berkeley), Ion Stoica (UC Berkeley), Joseph E. Gonzalez (UC Berkeley), Omar Khattab (MIT), Alexandros G. Dimakis (UC Berkeley), Matei Zaharia (UC Berkeley)
Architectural Patterns & Composition
optimize_anything is a single LLM-based optimization system that achieves state-of-the-art results across six diverse tasks by framing every problem as improving a text artifact evaluated by a scoring function. It nearly triples Gemini Flash's ARC-AGI accuracy (32.5% → 89.5%), cuts cloud scheduling costs by 40%, and outperforms AlphaEvolve's reported circle packing solution (n=26). The results challenge the assumption that domain-specific optimization tools are necessary.
Presentation
Talk
Paper Session 8: AI Systems in Practice
Friday, May 29 · 1:10 PM – 1:20 PM
Bayshore Ballroom
Poster
Friday, May 29 · 1:45 PM – 3:15 PM
Carmel / Monterey
Abstract
Can a single LLM-based optimization system match specialized tools across fundamentally different domains? We show that when optimization problems are formulated as improving a text artifact evaluated by a scoring function, a single AI-based optimization system—supporting single-task search, multi-task search with cross-problem transfer, and generalization to unseen inputs—achieves state-of-the-art results across six diverse tasks. Our system discovers agent architectures that nearly triple Gemini Flash's ARC-AGI accuracy (32.5% → 89.5%), finds scheduling algorithms that cut cloud costs by 40%, generates CUDA kernels where 87% match or beat PyTorch, and outperforms AlphaEvolve's reported circle packing solution (n=26). Ablations across three domains reveal that structured diagnostic feedback (side information) yields faster convergence and substantially higher final scores than score-only feedback, and that multi-task search outperforms independent optimization given equivalent per-problem budget through cross-task transfer, with benefits scaling with the number of related tasks. Together, we show for the first time that text optimization with LLM-based search is a general-purpose problem-solving paradigm, unifying tasks traditionally requiring domain-specific algorithms under a single framework. We open-source optimize_anything with support for multiple backends as part of the GEPA project at https://github.com/gepa-ai/gepa.
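The formulation in the abstract (a text artifact, a scoring function, and optional side information fed back to the search) can be illustrated with a minimal sketch. This is not the optimize_anything API: the names `optimize_text`, `toy_evaluate`, and `toy_propose` are hypothetical, the search is a plain greedy loop rather than the system's actual algorithm, and the proposer is a deterministic stand-in for an LLM call so that the example runs offline.

```python
from typing import Callable, Tuple

# Hypothetical types for illustration only; not the optimize_anything API.
Evaluator = Callable[[str], Tuple[float, str]]  # artifact -> (score, side information)
Proposer = Callable[[str, float, str], str]     # (artifact, score, side info) -> new artifact


def optimize_text(seed: str, evaluate: Evaluator, propose: Proposer,
                  budget: int) -> Tuple[str, float]:
    """Greedy search: accept a candidate only if it scores strictly higher."""
    best = seed
    best_score, feedback = evaluate(seed)
    for _ in range(budget):
        candidate = propose(best, best_score, feedback)
        score, candidate_feedback = evaluate(candidate)
        if score > best_score:
            best, best_score, feedback = candidate, score, candidate_feedback
    return best, best_score


TARGET = "hello world"  # toy objective: reproduce this string


def _pad(artifact: str) -> str:
    """Pad or truncate the artifact to the length of TARGET."""
    return artifact.ljust(len(TARGET))[:len(TARGET)]


def toy_evaluate(artifact: str) -> Tuple[float, str]:
    """Score is the fraction of matching positions; side info names the first mismatch."""
    padded = _pad(artifact)
    score = sum(a == b for a, b in zip(padded, TARGET)) / len(TARGET)
    for index, (a, b) in enumerate(zip(padded, TARGET)):
        if a != b:
            return score, f"{index}:{b}"  # diagnostic: position and expected character
    return score, ""


def toy_propose(artifact: str, score: float, feedback: str) -> str:
    """Stand-in for an LLM: apply the single fix described by the side information."""
    if not feedback:
        return artifact
    index, char = feedback.split(":", 1)
    chars = list(_pad(artifact))
    chars[int(index)] = char
    return "".join(chars)


if __name__ == "__main__":
    print(optimize_text("", toy_evaluate, toy_propose, budget=20))
    # prints ('hello world', 1.0)
```

In this toy setting the diagnostic string tells the proposer exactly what to change, so every step improves the score. A score-only evaluator would leave the proposer guessing, which is the intuition behind the side-information ablation described in the abstract.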