{"ID":2829450,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.12132","arxiv_id":"2512.12132","title":"Approximation with SiLU Networks: Constant Depth and Exponential Rates for Basic Operations","abstract":"We present SiLU network constructions whose approximation efficiency depends critically on proper hyperparameter tuning. For the square function $x^2$, with optimally chosen shift $a$ and scale $β$, we achieve approximation error $\\varepsilon$ using a two-layer network of constant width, where weights scale as $β^{\\pm k}$ with $k = \\mathcal{O}(\\ln(1/\\varepsilon))$. We then extend this approach through functional composition to Sobolev spaces, obtaining networks with depth $\\mathcal{O}(1)$ and $\\mathcal{O}(\\varepsilon^{-d/n})$ parameters under optimal hyperparameter settings. Our work highlights the trade-off between architectural depth and activation parameter optimization in neural network approximation theory.","short_abstract":"We present SiLU network constructions whose approximation efficiency depends critically on proper hyperparameter tuning. For the square function $x^2$, with optimally chosen shift $a$ and scale $β$, we achieve approximation error $\\varepsilon$ using a two-layer network of constant width, where weights scale as $β^{\\pm...","url_abs":"https://arxiv.org/abs/2512.12132","url_pdf":"https://arxiv.org/pdf/2512.12132v2","authors":"[\"Koffi O. Ayena\"]","published":"2025-12-13T01:56:34Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"math.NA\"]","methods":"[]","has_code":false}
