Who Decides the Trade-off? Resolution Policy as Delegation Governance in Autonomous Agents
Koji YAMAZAKI (Docomo Innovations, Inc.)
Security & Privacy Architectural Patterns & Composition
Abstract
When an autonomous AI agent's delegated constraints cannot be simultaneously satisfied, someone must decide which constraint to sacrifice. In current LLM-based agent systems, this decision is made probabilistically by the model's sampling process, producing outcomes that are unpredictable, unreproducible, and unauditable. We term this the \emph{Trust Gap}. Through 2,956 experimental probes across two frontier LLMs, we demonstrate that a single fallback instruction reduces deviation from 84\% to 0\%, establishing that \emph{behavioral compliance} is achievable. However, behavioral compliance is fundamentally distinct from \emph{structural guarantee}: a single adversarial override undoes compliance, driving deviation from 0\% back to 100\% (R5), and this pattern generalizes across resolution strategies (R7). We formalize the missing element---\emph{Resolution Policy}---through the \emph{Deterministic Delegation Model (DDM)}: a principal's deterministic, pre-committed trade-off strategy that structurally binds intent to execution outcome. Evaluation across complete $2\times2$ factorial designs confirms that DDM operates independently of prompt content, injection content, and resolution strategy type. Concurrent work has advanced authorization enforcement; the complementary question---\emph{what to do when authorized actions conflict, and by whose authority}---is the problem that Resolution Policy resolves.