Reimagining OSCE Grading and Medical Education at UT Southwestern

Since the end of USMLE Step 2 CS in 2021, medical schools have carried greater responsibility for assessing students’ clinical skills, at scale and with limited faculty time. On the new episode of Healthcare AI Pioneers, “Reimagining OSCE Grading and Medical Education at UT Southwestern,” clinician-educator Thomas Dalton, MD, and AI scientist Andrew Jamieson, PhD, join host Jesse Pines, MD, to discuss UT Southwestern’s approach to faster, more consistent, faculty-governed assessment.

Thomas Dalton, MD (left) and Andrew Jamieson, PhD (right). Photos: UT Southwestern.

The production deployment they describe helps inform UT-REAL — the UT System initiative behind this site, Scaling and Validating AI-Enabled Simulation Assessment Across University of Texas Medical Schools. UT Southwestern is the consortium’s lead site, and the model described in this episode is the kind of workflow UT-REAL is adapting — site by site, with local rubrics and governance — across its participating institutions.

Why Listen

OSCEs are central to medical education, but grading them well takes faculty time, clear rubrics, and consistent review. When USMLE Step 2 CS ended, accountability for assessing clinical skills shifted squarely to medical schools, raising the stakes on how dependably those skills are assessed.

Dalton and Jamieson walk through the UTSW approach: multi-camera simulation-center capture, structured rubrics, post-encounter notes, transcripts, and multimodal video analysis. The goal is not to replace clinical educators, but to support them — with faster turnaround, more consistent scoring, and more specific feedback for learners. It is a deliberately governed deployment, built by a team that pairs clinical-education expertise with AI and platform engineering.

A central theme is that AI deployment exposes process quality. Rubrics need to be clear enough that both faculty reviewers and AI systems can apply them consistently.

What the Episode Covers

Why OSCE grading is a practical deployment problem, not only a modeling problem.
How rubric clarity, faculty consensus, and data quality shape the reliability of AI-enabled assessment.
How the work progressed from conventional machine learning to zero-shot large language model grading and multimodal video review.
How UTSW built safeguards around rollout, including human review and adjudication for lower-performing encounters.
How UT-REAL builds on this work — validating and adapting these workflows across University of Texas medical schools.
Where the field may go next — competency-based progression, procedural assessment, team dynamics, and precision education.

What This Means for UT-REAL

UT Southwestern’s track record is the starting point, not the finish line. UT-REAL’s purpose is to turn a single-institution deployment into a shared, multi-school capability — adapting the workflow to each partner site’s simulation environment, rubrics, and governance, and validating it along the way. Schools interested in AI-enabled OSCE assessment can watch a MAPLES walkthrough, learn more about the project, or get in touch with the team.

Related Posts

MAPLES Walkthrough: From Rubric Upload to Faculty Review

SAIL 2026: MAPLES Multimodal OSCE Grading

Project MAPLES Holds All-Site Kickoff

Award Setup Complete — Partner Site Planning Begins
