{"ID":2827608,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.20660","arxiv_id":"2512.20660","title":"The Dual-State Architecture for Reliable LLM Agents","abstract":"Large Language Models deployed as code generation agents exhibit stochastic behavior incompatible with the deterministic guarantees required by software engineering. We formalize the Dual-State Action Pair (DSAP), an execution primitive that couples stochastic generation with deterministic post-condition verification. Guard functions act as sensing actions that project opaque LLM outputs onto observable workflow state, enabling a dual-state decomposition: finite, deterministic S_workflow paired with infinite, stochastic S_env. We prove that for epsilon-capable generators, failure probability P(fail) \u003c= (1-epsilon)^R_max -\u003e 0. To prevent naive O(R^K) retry explosion across multi-step workflows, we introduce a three-level recovery hierarchy: context refinement (retry within step), informed backtracking (stagnation detection with cascade invalidation and context injection to upstream steps), and human escalation. Experimental validation across 13 LLMs (1.3B-15B parameters) on three diagnostic probes demonstrates reliability gains of up to 66 percentage points at 1.2-2.1x baseline cost. Recovery mechanism evaluation on 99 SWE-Bench Pro instance-arm pairs (Qwen3-Coder-Next) demonstrates 100% context injection effectiveness (upstream output changed in all 71 escalation events) with step-specific recovery asymmetry -- 37.5% for test generation vs. 0% for patch generation -- and 0% end-to-end patch production, establishing the boundary between execution architecture and plan synthesis: execution recovery is necessary but not sufficient for autonomous software engineering.","short_abstract":"Large Language Models deployed as code generation agents exhibit stochastic behavior incompatible with the deterministic guarantees required by software engineering. We formalize the Dual-State Action Pair (DSAP), an execution primitive that couples stochastic generation with deterministic post-condition verification....","url_abs":"https://arxiv.org/abs/2512.20660","url_pdf":"https://arxiv.org/pdf/2512.20660v2","authors":"[\"Matthew Thompson\"]","published":"2025-12-18T15:28:21Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"cs.AI\",\"cs.SE\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false}
