· Dissemination · 3 min read
Reimagining OSCE Grading and Medical Education at UT Southwestern
On Healthcare AI Pioneers, Thomas Dalton and Andrew Jamieson discuss UT Southwestern’s AI-enabled OSCE assessment work: rubrics, multimodal evidence, staged rollout, human review, and precision education.

Since the end of USMLE Step 2 CS in 2021, medical schools have carried greater responsibility for assessing students’ clinical skills — at scale, with limited faculty time. On the new episode of Healthcare AI Pioneers, “Reimagining OSCE Grading and Medical Education at UT Southwestern,” clinician-educator Thomas Dalton, MD and AI scientist Andrew Jamieson, PhD join host Jesse Pines, MD to discuss UT Southwestern’s approach to faster, more consistent, faculty-governed assessment.

Thomas Dalton, MD (left) and Andrew Jamieson, PhD (right). Photos: UT Southwestern.
The production deployment they describe helps inform UT-REAL — the UT System initiative behind this site, Scaling and Validating AI-Enabled Simulation Assessment Across University of Texas Medical Schools. UT Southwestern is the consortium’s lead site, and the model described in this episode is the kind of workflow UT-REAL is adapting — site by site, with local rubrics and governance — across its participating institutions.
Why Listen
OSCEs are central to medical education, but grading them well takes faculty time, clear rubrics, and consistent review. When USMLE Step 2 CS ended, that accountability shifted squarely to medical schools — raising the stakes on how dependably clinical skills are assessed.
Dalton and Jamieson walk through the UTSW approach: multi-camera simulation-center capture, structured rubrics, post-encounter notes, transcripts, and multimodal video analysis. The goal is not to replace clinical educators, but to support them — with faster turnaround, more consistent scoring, and more specific feedback for learners. It is a deliberately governed deployment, built by a team that pairs clinical-education expertise with AI and platform engineering.
A central theme is that AI deployment exposes process quality. Rubrics need to be clear enough that both faculty reviewers and AI systems can apply them consistently.
What the Episode Covers
- Why OSCE grading is a practical deployment problem, not only a modeling problem.
- How rubric clarity, faculty consensus, and data quality shape the reliability of AI-enabled assessment.
- How the work progressed from conventional machine learning to zero-shot large language model grading and multimodal video review.
- How UTSW built safeguards around rollout, including human review and adjudication for lower-performing encounters.
- How UT-REAL builds on this work — validating and adapting these workflows across University of Texas medical schools.
- Where the field may go next — competency-based progression, procedural assessment, team dynamics, and precision education.
What This Means for UT-REAL
UT Southwestern’s track record is the starting point, not the finish line. UT-REAL’s purpose is to turn a single-institution deployment into a shared, multi-school capability — adapting the workflow to each partner site’s simulation environment, rubrics, and governance, and validating it along the way. Schools interested in AI-enabled OSCE assessment can watch a MAPLES walkthrough, learn more about the project, or get in touch with the team.
Listen and Subscribe
- Episode page: “Reimagining OSCE Grading and Medical Education at UT Southwestern”
- Healthcare AI Pioneers podcast site
- Apple Podcasts
- Spotify
- RSS feed


