Robust, Observable, and Evolvable Agentic Systems Engineering: A Principled Framework Validated via the Fairy GUI Agent

cs.AI arXiv:2509.20729
View PDF arXiv JSON

Abstract

The Agentic Paradigm faces a significant Software Engineering Absence, yielding Agentic systems commonly lacking robustness, observability, and evolvability. To address these deficiencies, we propose a principled engineering framework comprising Runtime Goal Refinement (RGR), Observable Cognitive Architecture (OCA), and Evolutionary Memory Architecture (EMA). In this framework, RGR ensures robustness and intent alignment via knowledge-constrained refinement and human-in-the-loop clarification; OCA builds an observable and maintainable white-box architecture using component decoupling, logic layering, and state-control separation; and EMA employs an execution-evolution dual-loop for evolvability. We implemented and empirically validated Fairy, a mobile GUI agent based on this framework. On RealMobile-Eval, our novel benchmark for ambiguous and complex tasks, Fairy outperformed the best SoTA baseline in user requirement completion by 33.7%. Subsequent controlled experiments, human-subject studies, and ablation studies further confirmed that the RGR enhances refinement accuracy and prevents intent deviation; the OCA improves maintainability; and the EMA is crucial for long-term performance. This research provides empirically validated specifications and a practical blueprint for building reliable, observable, and evolvable Agentic AI systems.

PDF Viewer