{"ID":2822694,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2601.02455","arxiv_id":"2601.02455","title":"Diagnostic-Driven Layer-Wise Compensation for Post-Training Quantization of Encoder-Decoder ASR Models","abstract":"Deploying Automatic Speech Recognition (ASR) models on memory-constrained edge devices requires aggressive low-bit weight quantization. Layer-wise post-training quantization is practical and effective, but it suffers from cross-layer error accumulation. Existing compensation methods typically use a single global strength for all layers, which is ill-suited to encoder-decoder ASR models whose acoustic encoder and linguistic decoder exhibit markedly different sensitivities to quantization noise. We propose FADE, a diagnostic-driven framework that assigns each layer an adaptive compensation coefficient by combining two complementary signals: an intrinsic vulnerability score from weight geometry and a calibration reliability score from the data-driven solution. The resulting layer-wise coefficient balances local quantization fidelity against cross-layer error correction, enabling tailored compensation without retraining or hyperparameter search. Experiments on Whisper, Moonshine, and Qwen3-ASR across four benchmarks show that FADE consistently improves mean Word Error Rate over strong baselines at both 3- and 4-bit precision while substantially reducing run-to-run variance.","short_abstract":"Deploying Automatic Speech Recognition (ASR) models on memory-constrained edge devices requires aggressive low-bit weight quantization. Layer-wise post-training quantization is practical and effective, but it suffers from cross-layer error accumulation. Existing compensation methods typically use a single global streng...","url_abs":"https://arxiv.org/abs/2601.02455","url_pdf":"https://arxiv.org/pdf/2601.02455v2","authors":"[\"Xinyu Wang\",\"Ziyu Zhao\",\"Yajie Luo\",\"Yihong Wu\",\"Liheng Ma\",\"Jingrui Tian\",\"Lei Ding\",\"Xiao-Wen Chang\",\"Peng Lu\"]","published":"2026-01-05T18:47:16Z","proceeding":"cs.SD","tasks":"[\"cs.SD\",\"cs.CL\",\"eess.AS\"]","methods":"[]","has_code":false}
