{"ID":2836368,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.00087","arxiv_id":"2512.00087","title":"Exploring Automated Recognition of Instructional Activity and Discourse from Multimodal Classroom Data","abstract":"Observation of classroom interactions can provide concrete feedback to teachers, but current methods rely on manual annotation, which is resource-intensive and hard to scale. This work explores AI-driven analysis of classroom recordings, focusing on multimodal instructional activity and discourse recognition as a foundation for actionable feedback. Using a densely annotated dataset of 164 hours of video and 68 lesson transcripts, we design parallel, modality-specific pipelines. For video, we evaluate zero-shot multimodal LLMs, fine-tuned vision-language models, and self-supervised video transformers on 24 activity labels. For transcripts, we fine-tune a transformer-based classifier with contextualized inputs and compare it against prompting-based LLMs on 19 discourse labels. To handle class imbalance and multi-label complexity, we apply per-label thresholding, context windows, and imbalance-aware loss functions. The results show that fine-tuned models consistently outperform prompting-based approaches, achieving macro-F1 scores of 0.577 for video and 0.460 for transcripts. These results demonstrate the feasibility of automated classroom analysis and establish a foundation for scalable teacher feedback systems.","short_abstract":"Observation of classroom interactions can provide concrete feedback to teachers, but current methods rely on manual annotation, which is resource-intensive and hard to scale. \nThis work explores AI-driven analysis of classroom recordings, focusing on multimodal instructional activity and discourse recognition as a found...","url_abs":"https://arxiv.org/abs/2512.00087","url_pdf":"https://arxiv.org/pdf/2512.00087v2","authors":"[\"Ivo Bueno\",\"Ruikun Hou\",\"Babette Bühler\",\"Tim Fütterer\",\"James Drimalla\",\"Jonathan Kyle Foster\",\"Peter Youngs\",\"Peter Gerjets\",\"Ulrich Trautwein\",\"Enkelejda Kasneci\"]","published":"2025-11-26T11:57:22Z","proceeding":"cs.CV","tasks":"[\"cs.CV\"]","methods":"[\"Transformer\",\"Large Language Model\",\"Language Model\"]","has_code":false}