AIED course
This is the webpage for the Generative AI-powered Educational Applications course, a PhD-level elective course that I taught at MBZUAI in Spring 2026.
Contents:
- Overview
- Course outline
- Reading list
- Material: Week 1, Week 2, Week 3, Week 4, Week 5, Week 6, Week 7
Overview
This course covers a range of applications empowered by AI – from writing assistants to dialogue-based intelligent tutoring systems – across a range of subject domains, including but not limited to language learning and STEM subjects. We will cover topics surrounding content and feedback generation using generative AI, adaptation and personalization of AI-driven educational systems, multi-modal interactive approaches (including not only text-based but also speech and visual systems), agentic AI approaches to educational applications, generative AI model alignment with educational, age- and subject-specific aspects, and novel human-computer interaction opportunities in this domain. In addition to such novel opportunities, the course will delve into emerging challenges, focusing on ethical issues, societal impact and real-world integration of this technology, and evaluation.
Course outline
- Week 1: Introduction, core tasks, fundamental concepts [go to Week1]
- Week 2: AI for writing assistance and language learning [go to Week2]
- Week 3: Intelligent Tutoring Systems [go to Week3]
- Week 4: Learner analytics and personalization [go to Week4]
- Week 5: LLM alignment for educational applications [go to Week5]
- Week 6: Agentic AI for educational applications [go to Week6]
- Week 7: Human-computer interaction and real-life applications [go to Week7]
Reading list
- Opportunities for natural language processing research in education (Burstein, 2009)
- Practical and ethical challenges of large language models in education: A systematic scoping review (Yan et al., 2023)
- COLING 2025 tutorial
- BEA 2025 tutorial on LLMs for Education: Understanding the Needs of Stakeholders, Current Capabilities and the Path Forward
- NeurIPS 2024 Workshop on Large Foundation Models for Educational Assessment
- AAAI 2024 Workshop on AI in Education
- uWaterloo workshop on Generative AI in K-12 Education
- EDM 2024 Workshop: Leveraging Large Language Models for Next Generation Educational Technologies
- KDD 2024 Workshop on AI for Education (AI4EDU): Advancing Personalized Education with LLM and Adaptive Learning
- Grammatical Error Correction: A Survey of the State of the Art (Bryant et al., 2023)
- Exploring Effectiveness of GPT-3 in Grammatical Error Correction: A Study on Performance and Controllability in Prompt-Based Methods (Loem et al., 2023)
- Analyzing the Performance of GPT-3.5 and GPT-4 in Grammatical Error Correction (Coyne et al., 2023)
- GPT-3.5 for Grammatical Error Correction (Katinskaia et al., 2024)
- Prompting open-source and commercial language models for grammatical error correction of English learner text (Davis et al., 2024)
- Pillars of Grammatical Error Correction: Comprehensive Inspection Of Contemporary Approaches In The Era of Large Language Models (Omelianchuk et al., 2024)
- Controlled Generation with Prompt Insertion for Natural Language Explanations in Grammatical Error Correction (Kaneko et al., 2024)
- Large Language Models Are State-of-the-Art Evaluator for Grammatical Error Correction (Kobayashi et al., 2024)
- Unraveling Downstream Gender Bias from Large Language Models: A Study on AI Educational Writing Assistance (Wambsganss et al., 2023)
- SALMONN: Towards generic hearing abilities for large language models (Tang et al., 2024)
- Can GPT-4 do L2 analytic assessment? (Bannò et al., 2024)
- Automatic Pronunciation Assessment using Self-Supervised Speech Representation Learning (Kim et al., 2022)
- A Study on Fine-Tuning wav2vec2.0 Model for the Task of Mispronunciation Detection and Diagnosis (Peng et al., 2021)
- Incorporating uncertainty into deep learning for spoken language assessment (Malinin et al., 2017)
- Automated speaking assessment: Using language technologies to score spontaneous speech (Zechner and Evanini, 2019)
- The ‘communicative’ legacy in language testing (Fulcher, 2000)
- MATHDIAL: A Dialogue Tutoring Dataset with Rich Pedagogical Properties Grounded in Math Reasoning Problems (Macina et al., 2023)
- Is ChatGPT a Good Teacher Coach? Measuring Zero-Shot Performance For Scoring and Providing Actionable Insights on Classroom Instruction (Wang and Demszky, 2023)
- Are We There Yet? - A Systematic Literature Review on Chatbots in Education (Wollny et al., 2021)
- Teaching the science of learning (Weinstein et al., 2018)
- Intelligent tutoring systems with conversational dialogue (Graesser et al., 2001)
- The 2 sigma problem: The search for methods of group instruction as effective as one-to-one tutoring (Bloom, 1984)
Week 1: Introduction, core tasks, fundamental concepts
Overview:
- Introduction and overview of the field and the core tasks
- Introduction to fundamental concepts (including Bloom’s taxonomy, scaffolding, etc.), techniques and theories (including knowledge tracing and item response theory, among others) from the learning sciences
- Overview of the key AI techniques used in education
- Overview of the core tasks
- AI in education in academic and industrial contexts (including OpenAI’s educational models, Google’s LearnLM, Khan Academy’s Khanmigo, etc.)
Learning materials:
- Slides
- Reading list:
- LLMs in education: Novel perspectives, challenges, and opportunities. Bashar Alhafni, Sowmya Vajjala, Stefano Bannò, Kaushal Kumar Maurya, Ekaterina Kochmar. COLING 2025 tutorial
- Opportunities and Challenges of LLMs in Education: An NLP Perspective. Sowmya Vajjala, Bashar Alhafni, Stefano Bannò, Kaushal Kumar Maurya, Ekaterina Kochmar, 2025
- Large language models for education: A survey and outlook. Shen Wang, Tianlong Xu, Hang Li, Chaoli Zhang, Joleen Liang, Jiliang Tang, Philip S. Yu, and Qingsong Wen, 2024
Week 2: AI for writing assistance and language learning
Overview:
- Overview of the core tasks: writing assistants, grammatical error detection (GED) and correction (GEC)
- LLM-empowered writing assistance and assessment
- State-of-the-art AI-based approaches to GEC, GED, and grammatical error explanation (GEE)
- Language learning across modalities (from text to speech) and languages
Learning materials:
- Slides
- Reading list:
- Grammatical Error Correction: A Survey of the State of the Art (Bryant et al., 2023)
- Is ChatGPT a Highly Fluent Grammatical Error Correction System? A Comprehensive Evaluation. Tao Fang, Shu Yang, Kaixin Lan, Derek F. Wong, Jinpeng Hu, Lidia S. Chao, Yue Zhang. ArXiv preprint, 2023
- GEE! Grammar Error Explanation with Large Language Models. Yixiao Song, Kalpesh Krishna, Rajesh Bhatt, Kevin Gimpel, Mohit Iyyer. Findings of NAACL 2024
- Towards End-to-End Spoken Grammatical Error Correction. Stefano Bannò, Rao Ma, Mengjie Qian, Kate M. Knill, Mark J.F. Gales. ICASSP 2024
Week 3: Intelligent Tutoring Systems
Overview:
- Introduction to the theory and practice of building Intelligent Tutoring Systems (ITSs)
- From traditional ITSs to modern, AI-powered systems: what can generative AI do for us?
- ITSs across domains and subject areas
- Evaluation of ITSs – from purely intrinsic metrics to extrinsic (i.e., learner-based) evaluation
Learning materials:
- Slides
- Reading list:
- AutoTutor and Family: A Review of 17 Years of Natural Language Tutoring. Benjamin D Nye, Arthur C Graesser and Xiangen Hu. International Journal of Artificial Intelligence in Education, 24(4):427–469. 2014
- AutoTutor meets Large Language Models: A Language Model Tutor with Rich Pedagogy and Guardrails. Sankalan Pal Chowdhury, Vilém Zouhar, Mrinmaya Sachan. Learning@Scale, 2024
- Improving the Validity of Automatically Generated Feedback via Reinforcement Learning. Alexander Scarlatos, Digory Smith, Simon Woodhead, and Andrew Lan. AIED 2024
- From Text to Visuals: Using LLMs to Generate Math Diagrams with Vector Graphics. Jaewook Lee, Jeongah Lee, Wanyong Feng, Andrew Lan. AIED 2025
Week 4: Learner analytics and personalization
Overview:
- Tracking learner knowledge via Bayesian Knowledge Tracing (BKT), its variants, and neural alternatives (e.g., Deep Knowledge Tracing)
- Testing learning material appropriateness via Item Response Theory (IRT)
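As a rough illustration of the two techniques named above, the sketch below shows a single Bayesian Knowledge Tracing update and a two-parameter logistic (2PL) IRT response curve. It is not taken from the course slides: the function names (`bkt_update`, `trace_mastery`, `irt_2pl`) and all default parameter values (slip, guess, learning rate, prior) are illustrative assumptions, and real systems would fit these parameters to learner data.

```python
import math


def bkt_update(p_know, correct, p_slip=0.1, p_guess=0.2, p_learn=0.15):
    """One standard BKT step for a single skill (parameter values are assumed).

    First compute the posterior probability that the skill is known given
    the observed response, then apply the learning transition.
    """
    if correct:
        numerator = p_know * (1 - p_slip)
        denominator = numerator + (1 - p_know) * p_guess
    else:
        numerator = p_know * p_slip
        denominator = numerator + (1 - p_know) * (1 - p_guess)
    posterior = numerator / denominator
    # The learner may acquire the skill after this practice opportunity.
    return posterior + (1 - posterior) * p_learn


def trace_mastery(responses, p_init=0.3):
    """Return the mastery estimate after each response in a sequence."""
    estimates = []
    p_know = p_init
    for correct in responses:
        p_know = bkt_update(p_know, correct)
        estimates.append(p_know)
    return estimates


def irt_2pl(theta, discrimination, difficulty):
    """2PL IRT: probability that a learner of ability theta answers correctly."""
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))


if __name__ == "__main__":
    # Hypothetical response sequence: True = correct, False = incorrect.
    for step, p in enumerate(trace_mastery([True, False, True, True]), start=1):
        print(f"after response {step}: P(known) = {p:.3f}")
    # An item is most informative where ability is close to its difficulty.
    print(f"P(correct) = {irt_2pl(theta=0.5, discrimination=1.2, difficulty=0.0):.3f}")
```

In this sketch, BKT tracks a per-skill mastery probability over time, whereas the IRT curve characterizes a single item (via its difficulty and discrimination) relative to a learner's ability, which is one way to judge whether material is appropriate for a given learner.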
Learning materials:
- Slides
- Reading list:
- Deep Knowledge Tracing. Chris Piech, Jonathan Bassen, Jonathan Huang, Surya Ganguli, Mehran Sahami, Leonidas Guibas, Jascha Sohl-Dickstein. NeurIPS 2015
- Can LLMs Reliably Simulate Real Students’ Abilities in Mathematics and Reading Comprehension? KV Aditya Srivatsa, Kaushal Kumar Maurya, Ekaterina Kochmar. BEA 2025
Week 5: LLM alignment for educational applications
Overview:
- Overview of learning sciences principles
- Pedagogical alignment of LLMs
- Techniques applied in educational contexts
- Evaluation of pedagogical properties of educationally oriented models
Learning materials:
- Slides
- Reading list:
- Unifying AI Tutor Evaluation: An Evaluation Taxonomy for Pedagogical Ability Assessment of LLM-Powered AI Tutors. Kaushal Kumar Maurya, KV Aditya Srivatsa, Kseniia Petukhova, Ekaterina Kochmar. NAACL 2025
- KidLM: Advancing Language Models for Children – Early Insights and Future Directions. Mir Tafseer Nayeem and Davood Rafiei. EMNLP 2024
- CLASS: A Design Framework for Building Intelligent Tutoring Systems Based on Learning Science principles. Shashank Sonkar, Naiming Liu, Debshila Mallick, Richard Baraniuk. EMNLP 2023
- Efficient RL for optimizing conversation level outcomes with an LLM-based tutor. Hyunji Nam, Omer Gottesman, Amy Zhang, Dean Foster, Emma Brunskill, Lyle Ungar. arXiv preprint arXiv:2507.16252, 2025
Week 6: Agentic AI for educational applications
Overview:
- Applications of agentic AI to education
- Mechanisms of multi-agent collaboration in educational contexts
Learning materials:
- Slides
- Reading list:
- Architecture for building conversational agents that support collaborative learning. Rohit Kumar and Carolyn P. Rosé. IEEE Transactions on Learning Technologies, 4(1), 21–34. 2010
- AI agents and education: Simulated practice at scale. Ethan Mollick, Lilach Mollick, Natalie Bach, LJ Ciccarelli, Ben Przystanski, Daniel Ravipinto. arXiv preprint arXiv:2407.12796, 2024
- Content Knowledge Identification with Multi-Agent Large Language Models (LLMs). Kaiqi Yang, Yucheng Chu, Taylor Darwin, Ahreum Han, Hang Li, Hongzhi Wen, Yasemin Copur-Gencturk, Jiliang Tang, and Hui Liu. In International Conference on Artificial Intelligence in Education (pp. 284-292), 2024
- MEDCO: Medical Education Copilots Based on A Multi-Agent Framework. Hao Wei, Jianing Qiu, Haibao Yu, and Wu Yuan. ECCV, 2024
- KELE: A Multi-Agent Framework for Structured Socratic Teaching with Large Language Models. Xian Peng, Pan Yuan, Dong Li, Junlong Cheng, Qin Fang, Zhi Liu. EMNLP 2025
Week 7: Human-computer interaction and real-life applications
Overview:
- Human-computer interaction (HCI) aspects in AI-for-education
- Real-life integration
- Ethical considerations
Learning materials:
- Slides
- Reading list:
- GPTeach: Interactive TA Training with GPT Based Students. Julia Markel, Steven Opferman, James Landay, and Chris Piech. Learning@Scale 2023
- Bridging the novice-expert gap via models of decision-making: A case study on remediating math mistakes. Rose Wang, Qingyang Zhang, Carly Robinson, Susanna Loeb, Dorottya Demszky. NAACL 2024