{"ID":2888646,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.16206","arxiv_id":"2509.16206","title":"Deep Reinforcement Learning in Factor Investment","abstract":"Deep reinforcement learning has shown promise in trade execution, yet its use in low-frequency factor portfolio construction remains under-explored. A key obstacle is the high-dimensional, unbalanced state space created by stocks that enter and exit the investable universe. We introduce Conditional Auto-encoded Factor-based Portfolio Optimisation (CAFPO), which compresses stock-level returns into a small set of latent factors conditioned on 94 firm-specific characteristics. The factors feed a DRL agent implemented with both PPO and DDPG to generate continuous long-short weights. On 20 years of U.S. equity data (2000--2020), CAFPO outperforms equal-weight, value-weight, Markowitz, vanilla DRL, and Fama--French-driven DRL, delivering a 24.6\\% compound return and a Sharpe ratio of 0.94 out of sample. SHAP analysis further reveals economically intuitive factor attributions. Our results demonstrate that factor-aware representation learning can make DRL practical for institutional, low-turnover portfolio management.","short_abstract":"Deep reinforcement learning has shown promise in trade execution, yet its use in low-frequency factor portfolio construction remains under-explored. A key obstacle is the high-dimensional, unbalanced state space created by stocks that enter and exit the investable universe. We introduce Conditional Auto-encoded Factor-...","url_abs":"https://arxiv.org/abs/2509.16206","url_pdf":"https://arxiv.org/pdf/2509.16206v1","authors":"[\"Junlin Liu\"]","published":"2025-07-30T04:59:52Z","proceeding":"cs.CE","tasks":"[\"cs.CE\",\"cs.LG\"]","methods":"[\"Reinforcement Learning\"]","has_code":false}