{"ID":3050037,"CreatedAt":"2026-06-04T02:13:16.786527022Z","UpdatedAt":"2026-06-06T11:59:53.540122282Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2606.04857","arxiv_id":"2606.04857","title":"Rethinking Incompleteness: Formalizing Protocol Divergence and Train-Once Learning for Robust IMVC","abstract":"Standard IMVC evaluation retrains separate models for different missing-data configurations. We show that this paradigm obscures a fundamental vulnerability: missing rate alone is insufficient to characterize data incompleteness. Specifically, we show that protocols with identical nominal missing rates can differ by up to $50\\times$ in their proportion of fully observed samples, inducing drastically different learning regimes. We formalize this phenomenon as incompleteness divergence, providing measures that capture structural disparities across missing-data protocols. We further prove that for a broad class of reconstruction-based objectives, learning becomes structurally ill-posed when the proportion of complete samples falls below a critical threshold, leading to near-random performance. To bypass this theoretical bound, we propose CRAFT (Complete-data Robust Attention-masked Fusion Transformer). CRAFT shifts the burden of robustness from the loss function to the architecture via two key properties: (i) per-sample independence, which removes reliance on complete-sample co-occurrence, and (ii) mask-aware variable-length fusion, which aggregates only observed views through attention masking. This design allows a single model, trained once on complete data, to generalize to diverse missing patterns at inference time without retraining. Extensive experiments on seven benchmarks show that CRAFT matches or outperforms per-configuration baselines while reducing training overhead by $8.8\\times$, demonstrating that robustness to missing data can be achieved as an inherent architectural property.\nCode (CRAFT) and our imvc-audit toolkit are available at https://anonymous.4open.science/r/CRAFT-BF80/ and https://anonymous.4open.science/r/imvc-audit-8263/.","short_abstract":"Standard IMVC evaluation retrains separate models for different missing-data configurations. We show that this paradigm obscures a fundamental vulnerability: missing rate alone is insufficient to characterize data incompleteness. Specifically, we show that protocols with identical nominal missing rates can differ by up...","url_abs":"https://arxiv.org/abs/2606.04857","url_pdf":"https://arxiv.org/pdf/2606.04857v1","authors":"[\"Haolu Liu\",\"Xiyue Wang\",\"Xuanting Xie\",\"Liangjian Wen\",\"Zhao Kang\"]","published":"2026-06-03T13:24:09Z","proceeding":"cs.LG","tasks":"[\"cs.LG\"]","methods":"[\"Transformer\"]","project_urls":"[\"https://anonymous.4open.science/r/CRAFT-BF80/\",\"https://anonymous.4open.science/r/imvc-audit-8263/\"]","has_code":false}
