{"ID":2855114,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.13363","arxiv_id":"2510.13363","title":"D-SMART: Enhancing LLM Dialogue Consistency via Dynamic Structured Memory And Reasoning Tree","abstract":"Large Language Models (LLMs) often exhibit factual inconsistencies and logical decay in extended, multi-turn dialogues, a challenge stemming from their reliance on static, pre-trained knowledge and an inability to reason adaptively over the dialogue history. Prevailing mitigation strategies, such as Retrieval-Augmented Generation (RAG) and agentic working memories, improve information recall but still engage with fundamentally static knowledge sources and follow pre-defined single reasoning path. This hinders their ability to preserve factual and logical consistency of their responses in multi-turn dialogues while the context evolves over time. To address this issue, we propose D-SMART, a model-agnostic framework designed to maintain multi-turn dialogue consistency by enabling LLMs to build and reason over a dynamic, structured representation of the conversational context. This is achieved via two synergistic components: (1) a Dynamic Structured Memory (DSM), which incrementally constructs and maintains an authoritative, OWL-compliant knowledge graph of the conversation; and (2) a Reasoning Tree (RT), which executes inferences as an explicit and traceable multi-step search over the graph. As the popular-used quality score (judged by GPT-4) can overlook logical flaws, we introduce new NLI-based metrics to better measure multi-turn dialogue consistency. Comprehensive experiments on the MT-Bench-101 benchmark show that D-SMART significantly outperforms state-of-the-art baselines, elevating the dialogue consistency score by over 48\\% for both proprietary and open-source models, and notably improves the quality score of the latter by up to 10.1\\%.","short_abstract":"Large Language Models (LLMs) often exhibit factual inconsistencies and logical decay in extended, multi-turn dialogues, a challenge stemming from their reliance on static, pre-trained knowledge and an inability to reason adaptively over the dialogue history. Prevailing mitigation strategies, such as Retrieval-Augmented...","url_abs":"https://arxiv.org/abs/2510.13363","url_pdf":"https://arxiv.org/pdf/2510.13363v1","authors":"[\"Xiang Lei\",\"Qin Li\",\"Min Zhang\",\"Min Zhang\"]","published":"2025-10-15T09:53:11Z","proceeding":"cs.CL","tasks":"[\"cs.CL\"]","methods":"[\"RAG\",\"Large Language Model\",\"Language Model\"]","has_code":false}