{"ID":541426,"CreatedAt":"2026-03-04T20:59:09Z","UpdatedAt":"2026-03-04T20:59:09Z","DeletedAt":null,"paper_url":"https://paperswithcode.com/paper/prompting-and-adapter-tuning-for-self","arxiv_id":"2310.02971","title":"Prompting and Adapter Tuning for Self-supervised Encoder-Decoder Speech Model","abstract":"Prompting and adapter tuning have emerged as efficient alternatives to fine-tuning (FT) methods. However, existing studies on speech prompting focused on classification tasks and failed on more complex sequence generation tasks. Besides, adapter tuning is primarily applied with a focus on encoder-only self-supervised models. Our experiments show that prompting on Wav2Seq, a self-supervised encoder-decoder model, surpasses previous works in sequence generation tasks. It achieves a remarkable 53% relative improvement in word error rate for ASR and a 27% in F1 score for slot filling. Additionally, prompting competes with the FT method in the low-resource scenario. Moreover, we show the transferability of prompting and adapter tuning on Wav2Seq in cross-lingual ASR. When limited trainable parameters are involved, prompting and adapter tuning consistently outperform conventional FT across 7 languages. Notably, in the low-resource scenario, prompting consistently outperforms adapter tuning.","url_abs":"https://arxiv.org/abs/2310.02971v3","url_pdf":"https://arxiv.org/pdf/2310.02971v3.pdf","authors":"[\"Kai-Wei Chang\", \"Ming-Hsin Chen\", \"Yun-Ping Lin\", \"Jing Neng Hsu\", \"Paul Kuo-Ming Huang\", \"Chien-yu Huang\", \"Shang-Wen Li\", \"Hung-Yi Lee\"]","published":"2023-10-04T00:00:00Z","tasks":"[\"Cross-Lingual ASR\", \"Decoder\", \"slot-filling\", \"Slot Filling\"]","methods":"[\"Adapter\", \"Focus\"]","has_code":false}
