{"ID":2856956,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.09965","arxiv_id":"2510.09965","title":"Homomorphic Mappings for Value-Preserving State Aggregation in Markov Decision Processes","abstract":"State aggregation aims to reduce the computational complexity of solving Markov Decision Processes (MDPs) while preserving the performance of the original system. A fundamental challenge lies in optimizing policies within the aggregated, or abstract, space such that the performance remains optimal in the ground MDP-a property referred to as {\"}optimal policy equivalence {\"}. This paper presents an abstraction framework based on the notion of homomorphism, in which two Markov chains are deemed homomorphic if their value functions exhibit a linear relationship. Within this theoretical framework, we establish a sufficient condition for the equivalence of optimal policy. We further examine scenarios where the sufficient condition is not met and derive an upper bound on the approximation error and a performance lower bound for the objective function under the ground MDP. We propose Homomorphic Policy Gradient (HPG), which guarantees optimal policy equivalence under sufficient conditions, and its extension, Error-Bounded HPG (EBHPG), which balances computational efficiency and the performance loss induced by aggregation. In the experiments, we validated the theoretical results and conducted comparative evaluations against seven algorithms.","short_abstract":"State aggregation aims to reduce the computational complexity of solving Markov Decision Processes (MDPs) while preserving the performance of the original system. A fundamental challenge lies in optimizing policies within the aggregated, or abstract, space such that the performance remains optimal in the ground MDP-a p...","url_abs":"https://arxiv.org/abs/2510.09965","url_pdf":"https://arxiv.org/pdf/2510.09965v1","authors":"[\"Shuo Zhao\",\"Yongqiang Li\",\"Yu Feng\",\"Zhongsheng Hou\",\"Yuanjing Feng\"]","published":"2025-10-11T02:40:03Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"cs.AI\",\"stat.ML\"]","methods":"[]","has_code":false}
