{"ID":2833639,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.04051","arxiv_id":"2512.04051","title":"Convergence for Discrete Parameter Update Schemes","abstract":"Modern deep learning models require immense computational resources, motivating research into low-precision training. Quantised training addresses this by representing training components in low-bit integers, but typically relies on discretising real-valued updates. We introduce an alternative approach where the update rule itself is discrete, avoiding the quantisation of continuous updates by design. We establish convergence guarantees for a general class of such discrete schemes, and present a multinomial update rule as a concrete example, supported by empirical evaluation. This perspective opens new avenues for efficient training, particularly for models with inherently discrete structure.","short_abstract":"Modern deep learning models require immense computational resources, motivating research into low-precision training. Quantised training addresses this by representing training components in low-bit integers, but typically relies on discretising real-valued updates. We introduce an alternative approach where the update...","url_abs":"https://arxiv.org/abs/2512.04051","url_pdf":"https://arxiv.org/pdf/2512.04051v2","authors":"[\"Paul Wilson\",\"Fabio Zanasi\",\"George Constantinides\"]","published":"2025-12-03T18:34:26Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"math.OC\"]","methods":"[]","has_code":false}
