{"ID":2857718,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.15953","arxiv_id":"2510.15953","title":"Hierarchical Multi-Modal Threat Intelligence Fusion Without Aligned Data: A Practical Framework for Real-World Security Operations","abstract":"Multi-modal threat detection faces a fundamental challenge that involves security tools operating in isolation, and this creates streams of network, email, and system data with no natural alignment or correlation. We present Hierarchical Multi-Modal Threat Intelligence Fusion (HM-TIF), a framework explicitly designed for this realistic scenario where naturally aligned multi-modal attack data does not exist. Unlike prior work that assumes or creates artificial alignment, we develop principled methods for correlating independent security data streams while maintaining operational validity. Our architecture employs hierarchical cross-attention with dynamic weighting that adapts to data availability and threat context, coupled with a novel temporal correlation protocol that preserves statistical independence. Evaluation on UNSW-NB15, CSE-CIC-IDS2018, and CICBell-DNS2021 datasets demonstrates that HM-TIF achieves 88.7% accuracy with a critical 32% reduction in false positive rates, even without true multi-modal training data. The framework maintains robustness when modalities are missing, making it immediately deployable in real security operations where data streams frequently have gaps. Our contributions include: (i) the first multi-modal security framework explicitly designed for non-aligned data, (ii) a temporal correlation protocol that avoids common data leakage pitfalls, (iii) empirical validation that multi-modal fusion provides operational benefits even without perfect alignment, and (iv) practical deployment guidelines for security teams facing heterogeneous, uncoordinated data sources. Index Terms: multi-modal learning, threat intelligence, non-aligned data, operational security, cross-attention mechanisms, practical deployment","short_abstract":"Multi-modal threat detection faces a fundamental challenge that involves security tools operating in isolation, and this creates streams of network, email, and system data with no natural alignment or correlation. We present Hierarchical Multi-Modal Threat Intelligence Fusion (HM-TIF), a framework explicitly designed f...","url_abs":"https://arxiv.org/abs/2510.15953","url_pdf":"https://arxiv.org/pdf/2510.15953v1","authors":"[\"Sisir Doppalapudi\"]","published":"2025-10-10T18:21:46Z","proceeding":"cs.CR","tasks":"[\"cs.CR\"]","methods":"[]","has_code":false}