{"ID":2859919,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.04886","arxiv_id":"2510.04886","title":"Where Did It All Go Wrong? A Hierarchical Look into Multi-Agent Error Attribution","abstract":"Error attribution in Large Language Model (LLM) multi-agent systems presents a significant challenge in debugging and improving collaborative AI systems. Current approaches to pinpointing agent and step level failures in interaction traces - whether using all-at-once evaluation, step-by-step analysis, or binary search - fall short when analyzing complex patterns, struggling with both accuracy and consistency. We present ECHO (Error attribution through Contextual Hierarchy and Objective consensus analysis), a novel algorithm that combines hierarchical context representation, objective analysis-based evaluation, and consensus voting to improve error attribution accuracy. Our approach leverages a positional-based leveling of contextual understanding while maintaining objective evaluation criteria, ultimately reaching conclusions through a consensus mechanism. Experimental results demonstrate that ECHO outperforms existing methods across various multi-agent interaction scenarios, showing particular strength in cases involving subtle reasoning errors and complex interdependencies. Our findings suggest that leveraging these concepts of structured, hierarchical context representation combined with consensus-based objective decision-making, provides a more robust framework for error attribution in multi-agent systems.","short_abstract":"Error attribution in Large Language Model (LLM) multi-agent systems presents a significant challenge in debugging and improving collaborative AI systems. Current approaches to pinpointing agent and step level failures in interaction traces - whether using all-at-once evaluation, step-by-step analysis, or binary search...","url_abs":"https://arxiv.org/abs/2510.04886","url_pdf":"https://arxiv.org/pdf/2510.04886v2","authors":"[\"Adi Banerjee\",\"Anirudh Nair\",\"Tarik Borogovac\"]","published":"2025-10-06T15:07:13Z","proceeding":"cs.AI","tasks":"[\"cs.AI\",\"cs.MA\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false}
