{"ID":2872990,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.19312","arxiv_id":"2509.19312","title":"E2E Learning Massive MIMO for Multimodal Semantic Non-Orthogonal Transmission and Fusion","abstract":"This paper investigates multimodal semantic non-orthogonal transmission and fusion in hybrid analog-digital massive multiple-input multiple-output (MIMO). A Transformer-based cross-modal source-channel semantic-aware network (CSC-SA-Net) framework is conceived, where channel state information (CSI) reference signal (RS), feedback, analog-beamforming/combining, and baseband semantic processing are data-driven end-to-end (E2E) optimized at the base station (BS) and user equipments (UEs). CSC-SA-Net comprises five sub-networks: BS-side CSI-RS network (BS-CSIRS-Net), UE-side channel semantic-aware network (UE-CSANet), BS-CSANet, UE-side multimodal semantic fusion network (UE-MSFNet), and BS-MSFNet. Specifically, we firstly E2E train BS-CSIRS-Net, UE-CSANet, and BS-CSANet to jointly design CSI-RS, feedback, analog-beamforming/combining with maximum {\\emph{physical-layer's}} spectral-efficiency. Meanwhile, we E2E train UE-MSFNet and BS-MSFNet for optimizing {\\emph{application-layer's}} source semantic downstream tasks. On these pre-trained models, we further integrate application-layer semantic processing with physical-layer tasks to E2E train five subnetworks. Extensive simulations show that the proposed CSC-SA-Net outperforms traditional separated designs, revealing the advantage of cross-modal channel-source semantic fusion.","short_abstract":"This paper investigates multimodal semantic non-orthogonal transmission and fusion in hybrid analog-digital massive multiple-input multiple-output (MIMO). A Transformer-based cross-modal source-channel semantic-aware network (CSC-SA-Net) framework is conceived, where channel state information (CSI) reference signal (RS...","url_abs":"https://arxiv.org/abs/2509.19312","url_pdf":"https://arxiv.org/pdf/2509.19312v2","authors":"[\"Minghui Wu\",\"Zhen Gao\"]","published":"2025-09-09T11:25:51Z","proceeding":"eess.SP","tasks":"[\"eess.SP\",\"cs.AI\",\"cs.IT\",\"cs.LG\"]","methods":"[\"Transformer\"]","has_code":false}