Teaching and Learning with Generative AI

Building and studying open-source AI tools for undergraduate biology education — from formative feedback chatbots to automated assessment systems.

Since the release of ChatGPT in 2022, generative AI has fundamentally changed the landscape of higher education. My research and tool-building work focuses on undergraduate biology, a context whose challenges and opportunities differ from those of other disciplines. I build tools, run classroom studies, and contribute to the broader community of educators working to use AI thoughtfully.


Schema Study

An LLM-powered formative feedback tool for asynchronous student learning.

Schema Study is a no-code, open-source Streamlit app that converts a course term list into a Socratic AI study coach. Students select a term and engage in a guided dialogue — the app withholds direct answers and instead asks scenario-grounded follow-up questions that push for mechanistic reasoning. Instructors set it up by uploading a CSV; no programming required.
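The flow above (a CSV term list becomes a per-term Socratic system prompt) can be sketched roughly as follows. This is an illustrative sketch only: the function names, the `term` column header, and the prompt wording are assumptions, not Schema Study's actual code.

```python
import csv
import io

# Template for the answer-withholding coach; the wording is a hypothetical
# stand-in for Schema Study's real system prompt.
SOCRATIC_TEMPLATE = (
    "You are a Socratic study coach for an undergraduate biology course. "
    "The student has chosen the term: {term}. "
    "Never give the definition directly. Instead, ask scenario-grounded "
    "follow-up questions that push the student toward mechanistic reasoning."
)

def load_terms(csv_text: str) -> list[str]:
    """Read the instructor's uploaded term list from a CSV with a 'term' column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row["term"].strip() for row in reader if row.get("term")]

def build_socratic_prompt(term: str) -> str:
    """Assemble the system prompt that withholds answers for the chosen term."""
    return SOCRATIC_TEMPLATE.format(term=term)

terms = load_terms("term\nosmosis\nallele\n")
print(build_socratic_prompt(terms[0]))
```

In the real app this prompt would seed a chat loop (e.g., a Streamlit chat interface backed by an LLM API); only the deterministic setup is shown here.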

Research findings (Winter 2025, N=225):

  • 72% of students would reuse Schema Study in a future biology course
  • Each additional day per week of use more than doubled the likelihood of recommending it to peers
  • Increased student AI self-efficacy and beliefs about AI relevance to their education and careers

Published: Reuther, K., Mueller, L. O., Constantian, G., & Nguyen, A. (2026). Schema Study: A Large Language Model (LLM) Application for Asynchronous Student Learning and Inquiry. CourseSource Teaching Tools and Strategies.

Launch Schema Study · Source Code


AI-Based Assessment of Student Writing

A multi-agent LLM system for automated short-answer grading in biology.

This in-progress research project examines whether LLMs can reliably grade open-ended constructed responses in biology, and how to configure them to do so accurately and fairly. The system uses a multi-agent pipeline that mirrors how expert human graders approach scoring, and benchmarks LLM performance against those graders using the Rater Performance Evaluation (RPE) framework.

Key design questions include: what prompt structures, temperatures, and model configurations produce the most consistent and accurate grading? Does performance differ across student demographics? The project draws on Canvas course data and addresses practical questions for instructors considering AI-assisted grading at scale.
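One way to picture the multi-agent idea is below: several independent grader runs (e.g., different prompts or temperatures) each score a response, the scores are combined, and the combined scores are checked against expert human scores. Everything here is a hypothetical sketch; the aggregation rule (median) and the exact-agreement metric are assumptions standing in for the project's actual pipeline and the RPE framework.

```python
from statistics import median

def aggregate_scores(agent_scores: list[int]) -> int:
    """Combine independent agent scores; the median damps outlier graders."""
    return int(median(agent_scores))

def exact_agreement(ai: list[int], human: list[int]) -> float:
    """Fraction of responses where the AI score matches the expert score."""
    matches = sum(a == h for a, h in zip(ai, human))
    return matches / len(human)

# Three hypothetical agent runs (in practice, LLM calls under different
# configurations) scoring four student responses on a 0-3 rubric.
runs = [
    [3, 1, 2, 0],
    [3, 2, 2, 0],
    [2, 1, 2, 1],
]
ai_scores = [aggregate_scores(list(scores)) for scores in zip(*runs)]
human_scores = [3, 1, 2, 0]
print(ai_scores, exact_agreement(ai_scores, human_scores))
```

A production benchmark would use chance-corrected agreement statistics (e.g., weighted kappa) rather than raw exact agreement, but the comparison structure is the same.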

Ongoing research project.


AI Personas: Practice with Prompting

Pre-built AI personas for learning to work with AI.

A collection of ready-to-use AI personas (Socratic Tutor, AI Coach, AI Mentor, AI Teammate, AI Simulator) that students and instructors can interact with immediately. The app also supports custom system instructions, making it a useful tool for exploring prompt engineering and AI behavior in an educational context. Supported models: GPT-5.4 Mini, GPT-5.4, GPT-4.1, and Claude Sonnet.

Launch AI Personas