{"ID":2896086,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2507.07675","arxiv_id":"2507.07675","title":"Some Theoretical Results on Layerwise Effective Dimension Oscillations in Finite Width ReLU Networks","abstract":"We analyze the layerwise effective dimension (rank of the feature matrix) in fully-connected ReLU networks of finite width. Specifically, for a fixed batch of $m$ inputs and random Gaussian weights, we derive closed-form expressions for the expected rank of the $m\\times n$ hidden activation matrices. Our main result shows that $\\mathbb{E}[EDim(\\ell)]=m[1-(1-2/π)^\\ell]+O(e^{-c m})$ so that the rank deficit decays geometrically with ratio $1-2 / π\\approx 0.3634$. We also prove a sub-Gaussian concentration bound, and identify the \"revival\" depths at which the expected rank attains local maxima. In particular, these peaks occur at depths $\\ell_k^*\\approx(k+1/2)π/\\log(1/ρ)$ with height $\\approx (1-e^{-π/2}) m \\approx 0.79m$. We further show that this oscillatory rank behavior is a finite-width phenomenon: under orthogonal weight initialization or strong negative-slope leaky-ReLU, the rank remains (nearly) full. These results provide a precise characterization of how random ReLU layers alternately collapse and partially revive the subspace of input variations, adding nuance to prior work on expressivity of deep networks.","short_abstract":"We analyze the layerwise effective dimension (rank of the feature matrix) in fully-connected ReLU networks of finite width. Specifically, for a fixed batch of $m$ inputs and random Gaussian weights, we derive closed-form expressions for the expected rank of the $m\\times n$ hidden activation matrices. 
Our main result...","url_abs":"https://arxiv.org/abs/2507.07675","url_pdf":"https://arxiv.org/pdf/2507.07675v2","authors":"[\"Darshan Makwana\"]","published":"2025-07-10T11:54:18Z","proceeding":"cs.LG","tasks":"[\"cs.LG\"]","methods":"[]","has_code":false}
