Dossier: Deep Research via Ledger-Driven Branching Search and Query Encoding Learning
Om Chabra (MIT), Noah Ziems (University of Notre Dame), Meng Jiang (University of Notre Dame), Omar Khattab (MIT), Hari Balakrishnan (MIT)
Architectural Patterns & Composition
Abstract
Deep research requires synthesizing information across fragmented sources. Existing ReAct-style agents typically rely on long, linear search trajectories, where early retrieval failures propagate and compound through the reasoning chain. We introduce Dossier, a deep research agent that replaces linear paths with locally parallel, branching search managed by a persistent Research Ledger. The Ledger explicitly tracks claims, contradictions, and information gaps, and is updated continuously as new evidence is synthesized. To improve the quality of each retrieval step, we introduce Evidence-Aligned Query Learning (EAQL), a training mechanism that fine-tunes query encoders to condition on the Research Ledger. This ensures that generated queries are grounded in the agent’s evolving state rather than in isolated prompts. Our evaluation demonstrates that Dossier’s branching architecture improves end-to-end accuracy by 27 percentage points on BrowseComp-Plus and by 29 percentage points on HoVer, with the Ledger and EAQL contributing 10 and 13 percentage points, respectively.
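To make the ledger idea concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a ledger that stores claims, contradictions, and open gaps, and a single expansion step that searches several gaps in parallel. All names (`ResearchLedger`, `expand`, `toy_search`) and the in-memory corpus are hypothetical; contradiction detection and EAQL are not modeled.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class ResearchLedger:
    """Hypothetical ledger: persistent state shared by all search branches."""
    claims: list = field(default_factory=list)
    contradictions: list = field(default_factory=list)  # left unpopulated in this sketch
    gaps: list = field(default_factory=list)

    def update(self, gap: str, evidence: Optional[str]) -> None:
        # A gap is closed only if its branch returned evidence;
        # a failed branch leaves the gap open for a later attempt.
        if evidence is None:
            return
        self.claims.append(evidence)
        self.gaps.remove(gap)


def toy_search(query: str) -> Optional[str]:
    # Stand-in retriever over a tiny in-memory corpus (an assumption).
    corpus = {"founder of X": "X was founded by A"}
    return corpus.get(query)


def expand(ledger: ResearchLedger,
           search: Callable[[str], Optional[str]],
           max_branches: int = 4) -> None:
    # Locally parallel step: one branch per open gap, up to max_branches.
    gaps = list(ledger.gaps[:max_branches])
    with ThreadPoolExecutor(max_workers=max_branches) as pool:
        results = list(pool.map(search, gaps))
    # Fold results back into the ledger sequentially, in gap order.
    for gap, evidence in zip(gaps, results):
        ledger.update(gap, evidence)


ledger = ResearchLedger(gaps=["founder of X", "year Y opened"])
expand(ledger, toy_search)
print(ledger.claims, ledger.gaps)
```

In this toy run, the branch that finds evidence records a claim and closes its gap, while the failed branch simply leaves its gap open, so one retrieval miss does not derail the other branches.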