{"ID":2869662,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.18161","arxiv_id":"2509.18161","title":"Developing Training Procedures for Piecewise-linear Spline Activation Functions in Neural Networks","abstract":"Activation functions in neural networks are typically selected from a set of empirically validated, commonly used static functions such as ReLU, tanh, or sigmoid. However, by optimizing the shapes of a network's activation functions, we can train models that are more parameter-efficient and accurate by assigning more optimal activations to the neurons. In this paper, I present and compare 9 training methodologies to explore dual-optimization dynamics in neural networks with parameterized linear B-spline activation functions. The experiments realize up to 94% lower end model error rates in FNNs and 51% lower rates in CNNs compared to traditional ReLU-based models. These gains come at the cost of additional development and training complexity as well as end model latency.","short_abstract":"Activation functions in neural networks are typically selected from a set of empirically validated, commonly used static functions such as ReLU, tanh, or sigmoid. However, by optimizing the shapes of a network's activation functions, we can train models that are more parameter-efficient and accurate by assigning more o...","url_abs":"https://arxiv.org/abs/2509.18161","url_pdf":"https://arxiv.org/pdf/2509.18161v1","authors":"[\"William H Patty\"]","published":"2025-09-17T03:51:16Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"cs.AI\"]","methods":"[\"Convolutional Neural Network\"]","has_code":false}
