{"ID":2830193,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.10451","arxiv_id":"2512.10451","title":"Metacognitive Sensitivity for Test-Time Dynamic Model Selection","abstract":"A key aspect of human cognition is metacognition - the ability to assess one's own knowledge and judgment reliability. While deep learning models can express confidence in their predictions, they often suffer from poor calibration, a cognitive bias where expressed confidence does not reflect true competence. Do models truly know what they know? Drawing from human cognitive science, we propose a new framework for evaluating and leveraging AI metacognition. We introduce meta-d', a psychologically-grounded measure of metacognitive sensitivity, to characterise how reliably a model's confidence predicts its own accuracy. We then use this dynamic sensitivity score as context for a bandit-based arbiter that performs test-time model selection, learning which of several expert models to trust for a given task. Our experiments across multiple datasets and deep learning model combinations (including CNNs and VLMs) demonstrate that this metacognitive approach improves joint-inference accuracy over constituent models. This work provides a novel behavioural account of AI models, recasting ensemble selection as a problem of evaluating both short-term signals (confidence prediction scores) and medium-term traits (metacognitive sensitivity).","short_abstract":"A key aspect of human cognition is metacognition - the ability to assess one's own knowledge and judgment reliability. While deep learning models can express confidence in their predictions, they often suffer from poor calibration, a cognitive bias where expressed confidence does not reflect true competence. Do models...","url_abs":"https://arxiv.org/abs/2512.10451","url_pdf":"https://arxiv.org/pdf/2512.10451v1","authors":"[\"Le Tuan Minh Trinh\",\"Le Minh Vu Pham\",\"Thi Minh Anh Pham\",\"An Duc Nguyen\"]","published":"2025-12-11T09:15:05Z","proceeding":"cs.LG","tasks":"[\"cs.LG\"]","methods":"[\"Convolutional Neural Network\"]","has_code":false}
