{"ID":2842030,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.09915","arxiv_id":"2511.09915","title":"HI-TransPA: Hearing Impairments Translation Personal Assistant","abstract":"Hearing-impaired individuals often face significant barriers in daily communication due to the inherent challenges of producing clear speech. To address this, we introduce the Omni-Model paradigm into assistive technology and present HI-TransPA, an instruction-driven audio-visual personal assistant. The model fuses indistinct speech with lip dynamics, enabling both translation and dialogue within a single multimodal framework. To address the distinctive pronunciation patterns of hearing-impaired speech and the limited adaptability of existing models, we develop a multimodal preprocessing and curation pipeline that detects facial landmarks, stabilizes the lip region, and quantitatively evaluates sample quality. These quality scores guide a curriculum learning strategy that first trains on clean, high-confidence samples and progressively incorporates harder cases to strengthen model robustness. Architecturally, we employ a novel unified 3D-Resampler to efficiently encode the lip dynamics, which is critical for accurate interpretation. Experiments on the purpose-built HI-Dialogue dataset show that HI-TransPA achieves state-of-the-art performance in both literal accuracy and semantic fidelity. Our work establishes a foundation for applying Omni-Models to assistive communication technology, providing an end-to-end modeling framework and essential processing tools for future research.","short_abstract":"Hearing-impaired individuals often face significant barriers in daily communication due to the inherent challenges of producing clear speech. To address this, we introduce the Omni-Model paradigm into assistive technology and present HI-TransPA, an instruction-driven audio-visual personal assistant. 
The model fuses ind...","url_abs":"https://arxiv.org/abs/2511.09915","url_pdf":"https://arxiv.org/pdf/2511.09915v2","authors":"[\"Zhiming Ma\",\"Shiyu Gan\",\"Junhao Zhao\",\"Xianming Li\",\"Qingyun Pan\",\"Peidong Wang\",\"Mingjun Pan\",\"Yuhao Mo\",\"Jiajie Cheng\",\"Chengxin Chen\",\"Zhonglun Cao\",\"Chonghan Liu\",\"Shi Cheng\"]","published":"2025-11-13T03:27:39Z","proceeding":"cs.CL","tasks":"[\"cs.CL\",\"cs.MM\",\"cs.SD\"]","methods":"[]","has_code":false}
