{"ID":2828856,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.13102","arxiv_id":"2512.13102","title":"Socratic Students: Teaching Language Models to Learn by Asking Questions","abstract":"Large Language Models (LLMs) are usually used to answer questions, but many high-stakes applications (e.g., tutoring, clinical support) require the complementary skill of asking questions: detecting missing information, requesting clarifications, and using them to solve tasks. We study this skill in reasoning-heavy domains where progress depends on inquiry rather than factual recall. We define an interactive protocol where a student model engages a stronger teacher under a small turn budget. After each teacher reply, we evaluate the student on the original task with Pass@k. We propose Outcome-Driven Question optimization Strategy (ODQS), a training framework that learns a questioning policy from downstream task outcomes. At each turn, we sample multiple candidate questions, query the teacher with each, and then score the student's resulting performance. Using these scores, we train the student via supervised fine-tuning followed by Direct Preference Optimization (DPO), without any human labels. On GSM8K, HumanEval, and OpenCoder, ODQS produces large gains over interactive baselines, boosting Pass@5 by up to 54.7% (absolute) on math and 22.9% (absolute) on coding, and matching baseline performance in three fewer turns. Thus, question asking can be explicitly trained from task outcomes, improving both accuracy and efficiency in interactive reasoning.","short_abstract":"Large Language Models (LLMs) are usually used to answer questions, but many high-stakes applications (e.g., tutoring, clinical support) require the complementary skill of asking questions: detecting missing information, requesting clarifications, and using them to solve tasks. We study this skill in reasoning-heavy dom...","url_abs":"https://arxiv.org/abs/2512.13102","url_pdf":"https://arxiv.org/pdf/2512.13102v4","authors":"[\"Rajeev Bhatt Ambati\",\"Tianyi Niu\",\"Aashu Singh\",\"Shlok Mishra\",\"Snigdha Chaturvedi\",\"Shashank Srivastava\"]","published":"2025-12-15T08:59:19Z","proceeding":"cs.AI","tasks":"[\"cs.AI\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false}
