{"ID":2879992,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2508.15928","arxiv_id":"2508.15928","title":"Transforming Causality: Transformer-Based Temporal Causal Discovery with Prior Knowledge Integration","abstract":"We introduce a novel framework for temporal causal discovery and inference that addresses two key challenges: complex nonlinear dependencies and spurious correlations. Our approach employs a multi-layer Transformer-based time-series forecaster to capture long-range, nonlinear temporal relationships among variables. After training, we extract the underlying causal structure and associated time lags from the forecaster using gradient-based analysis, enabling the construction of a causal graph. To mitigate the impact of spurious causal relationships, we introduce a prior knowledge integration mechanism based on attention masking, which consistently enforces user-excluded causal links across multiple Transformer layers. Extensive experiments show that our method significantly outperforms other state-of-the-art approaches, achieving a 12.8% improvement in F1-score for causal discovery and 98.9% accuracy in estimating causal lags.","short_abstract":"We introduce a novel framework for temporal causal discovery and inference that addresses two key challenges: complex nonlinear dependencies and spurious correlations. Our approach employs a multi-layer Transformer-based time-series forecaster to capture long-range, nonlinear temporal relationships among variables. Aft...","url_abs":"https://arxiv.org/abs/2508.15928","url_pdf":"https://arxiv.org/pdf/2508.15928v1","authors":"[\"Jihua Huang\",\"Yi Yao\",\"Ajay Divakaran\"]","published":"2025-08-21T19:19:11Z","proceeding":"cs.LG","tasks":"[\"cs.LG\",\"stat.ML\"]","methods":"[\"Transformer\"]","has_code":false}
