{"ID":2898204,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2507.04168","arxiv_id":"2507.04168","title":"Generative Regression with IQ-BART","abstract":"Implicit Quantile BART (IQ-BART) posits a non-parametric Bayesian model on the conditional quantile function, acting as a model over a conditional model for $Y$ given $X$. One of the key ingredients is augmenting the observed data $\\{(Y_i,X_i)\\}_{i=1}^n$ with uniformly sampled values $τ_i$ for $1\\leq i\\leq n$ which serve as training data for quantile function estimation. Using the fact that the location parameter $μ$ in a $τ$-tilted asymmetric Laplace distribution corresponds to the $τ^{th}$ quantile, we build a check-loss likelihood targeting $μ$ as the parameter of interest. We equip the check-loss likelihood parametrized by $μ=f(X,τ)$ with a BART prior on $f(\\cdot)$, allowing the conditional quantile function to vary both in $X$ and $τ$. The posterior distribution over $μ(τ,X)$ can be then distilled for estimation of the {\\em entire quantile function} as well as for assessing uncertainty through the variation of posterior draws. Simulation-based predictive inference is immediately available through inverse transform sampling using the learned quantile function. The sum-of-trees structure over the conditional quantile function enables flexible distribution-free regression with theoretical guarantees. As a byproduct, we investigate posterior mean quantile estimator as an alternative to the routine sample (posterior mode) quantile estimator. We demonstrate the power of IQ-BART on time series forecasting datasets where IQ-BART can capture multimodality in predictive distributions that might be otherwise missed using traditional parametric approaches.","short_abstract":"Implicit Quantile BART (IQ-BART) posits a non-parametric Bayesian model on the conditional quantile function, acting as a model over a conditional model for $Y$ given $X$. One of the key ingredients is augmenting the observed data $\\{(Y_i,X_i)\\}_{i=1}^n$ with uniformly sampled values $τ_i$ for $1\\leq i\\leq n$ which ser...","url_abs":"https://arxiv.org/abs/2507.04168","url_pdf":"https://arxiv.org/pdf/2507.04168v1","authors":"[\"Sean O'Hagan\",\"Veronika Ročková\"]","published":"2025-07-05T21:42:08Z","proceeding":"stat.ME","tasks":"[\"stat.ME\",\"stat.ML\"]","methods":"[]","has_code":false}
