{"ID":2845867,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.03595","arxiv_id":"2511.03595","title":"Tensor-Efficient High-Dimensional Q-learning","abstract":"High-dimensional reinforcement learning (RL) faces challenges with complex calculations and low sample efficiency in large state-action spaces. Q-learning algorithms struggle particularly with the curse of dimensionality, where the number of state-action pairs grows exponentially with problem size. While neural network-based approaches like Deep Q-Networks have shown success, they do not explicitly exploit problem structure. Many high-dimensional control tasks exhibit low-rank structure in their value functions, and tensor-based methods using low-rank decomposition offer parameter-efficient representations. However, existing tensor-based Q-learning methods focus on representation fidelity without leveraging this structure for exploration. We propose Tensor-Efficient Q-Learning (TEQL), which represents the Q-function as a low-rank CP tensor over discretized state-action spaces and exploits the tensor structure for uncertainty-aware exploration. TEQL incorporates Error-Uncertainty Guided Exploration (EUGE), which combines tensor approximation error with visit counts to guide action selection, along with frequency-aware regularization to stabilize updates. Under matched parameter budgets, experiments on classic control tasks demonstrate that TEQL outperforms both matrix-based low-rank methods and deep RL baselines in sample efficiency, making it suitable for resource-constrained applications where sampling costs are high.","short_abstract":"High-dimensional reinforcement learning (RL) faces challenges with complex calculations and low sample efficiency in large state-action spaces. Q-learning algorithms struggle particularly with the curse of dimensionality, where the number of state-action pairs grows exponentially with problem size. While neural network-...","url_abs":"https://arxiv.org/abs/2511.03595","url_pdf":"https://arxiv.org/pdf/2511.03595v2","authors":"[\"Junyi Wu\",\"Dan Li\"]","published":"2025-11-05T16:16:31Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"eess.SY\"]","methods":"[\"Reinforcement Learning\",\"LoRA\"]","has_code":false}
