Evaluating ChatGPT-4o's accuracy in answering American Board of Dermatology practice questions: an analysis of AI in dermatology residency education

Authors

  • Lauren McGrath, Center for Dermatology Research, Department of Dermatology, Wake Forest School of Medicine, Winston-Salem, North Carolina, United States
  • Nathan Schedler, Wake Forest School of Medicine, Winston-Salem, North Carolina, United States
  • Sarah Martin, Wake Forest School of Medicine, Winston-Salem, North Carolina, United States
  • Matthew Hrin, Department of Dermatology, Wake Forest University School of Medicine, Winston-Salem, North Carolina, United States
  • Maria Mariencheck, Department of Dermatology, Wake Forest University School of Medicine, Winston-Salem, North Carolina, United States
  • Steven Feldman, Department of Dermatology, Wake Forest University School of Medicine, Winston-Salem, North Carolina, United States; Department of Pathology, Wake Forest School of Medicine, Winston-Salem, North Carolina, United States; Department of Social Sciences & Health Policy, Wake Forest School of Medicine, Winston-Salem, North Carolina, United States
  • Zeynep Akkurt, Department of Dermatology, Wake Forest University School of Medicine, Winston-Salem, North Carolina, United States

DOI:

https://doi.org/10.25251/bvjddn46

Keywords:

artificial intelligence, board examinations, ChatGPT-4o, education

Abstract

Artificial intelligence may enhance medical education. This study evaluated ChatGPT-4o’s accuracy in answering sample questions from the American Board of Dermatology BASIC, CORE, and APPLIED examinations. Fifty publicly available questions, with and without images, were analyzed for accuracy and for performance across difficulty levels and categories. ChatGPT-4o’s performance differed significantly between text-only and image-based questions, with lower accuracy on image-based questions (47%). Artificial intelligence tools need improvement for use in dermatology residency education, as limitations in visual diagnostic skills were evident.

Published

09/08/2025

Section

Scholarly Commentary