{"ID":2854967,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.13134","arxiv_id":"2510.13134","title":"Convergence, design and training of continuous-time dropout as a random batch method","abstract":"We study dropout regularization in continuous-time models through the lens of random-batch methods -- a family of stochastic sampling schemes originally devised to reduce the computational cost of interacting particle systems. We construct an unbiased, well-posed estimator that mimics dropout by sampling neuron batches over time intervals of length $h$. Trajectory-wise convergence is established with linear rate in $h$ for the expected uniform error. At the distribution level, we establish stability for the associated continuity equation, with total-variation error of order $h^{1/2}$ under mild moment assumptions. During training with fixed batch sampling across epochs, a Pontryagin-based adjoint analysis bounds deviations in the optimal cost and control, as well as in gradient-descent iterates. On the design side, we compare convergence rates for canonical batch sampling schemes, recover standard Bernoulli dropout as a special case, and derive a cost--accuracy trade-off yielding a closed-form optimal $h$. We then specialize to a single-layer neural ODE and validate the theory on classification and flow matching, observing the predicted rates, regularization effects, and favorable runtime and memory profiles.","short_abstract":"We study dropout regularization in continuous-time models through the lens of random-batch methods -- a family of stochastic sampling schemes originally devised to reduce the computational cost of interacting particle systems. We construct an unbiased, well-posed estimator that mimics dropout by sampling neuron batches...","url_abs":"https://arxiv.org/abs/2510.13134","url_pdf":"https://arxiv.org/pdf/2510.13134v1","authors":"[\"Antonio Álvarez-López\",\"Martín Hernández\"]","published":"2025-10-15T04:19:01Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"math.OC\"]","methods":"[]","has_code":false}
