Performance of ChatGPT on dermatology Specialty Certificate Examination multiple choice questions
Affiliation
University Hospitals Birmingham NHS Foundation Trust; The Royal Orthopaedic Hospital NHS Foundation Trust; Leicester Royal Infirmary; Walsall Healthcare NHS Trust
Publication date
2023-06-25
Subject
Dermatology
Abstract
ChatGPT is a large language model trained by OpenAI on increasingly large datasets to perform language-based tasks. It is capable of answering multiple-choice questions, such as those posed by the dermatology Specialty Certificate Examination (SCE). We asked two iterations of ChatGPT, ChatGPT-3.5 and ChatGPT-4, 84 multiple-choice questions from the sample dermatology SCE question bank. ChatGPT-3.5 achieved an overall score of 63.1%, and ChatGPT-4 scored 90.5%, a significant improvement in performance (p<0.001). The typical pass mark for the dermatology SCE is 70-72%. ChatGPT-4 is therefore capable of answering clinical questions and achieving a passing grade in these sample questions. There are many possible educational and clinical implications for increasingly advanced artificial intelligence (AI) and its use in medicine, including in the diagnosis of dermatological conditions. Such advances should be embraced provided that patient safety is a core tenet and the limitations of AI in the nuances of complex clinical cases are recognised.
Citation
Passby L, Jenko N, Wernham A. Performance of ChatGPT on Specialty Certificate Examination in Dermatology multiple-choice questions. Clin Exp Dermatol. 2024 Jun 25;49(7):722-727. doi: 10.1093/ced/llad197.
Type
Article
PMID
37264670
Publisher
Oxford University Press
DOI
10.1093/ced/llad197
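
Worked check of the reported scores
The percentages in the abstract can be turned back into whole-question counts, which gives a rough check on the reported significance. The sketch below is illustrative only. The counts of correct answers (53 and 76 of 84) are derived from the rounded percentages and are not stated in this record. The record also does not say which statistical test the authors used, so the unpaired two-proportion z-test here is an assumption. Both models answered the same questions, so a paired test such as McNemar's would be more appropriate, but it needs per-question results that are not available here.

```python
import math

N_QUESTIONS = 84  # number of sample questions, from the abstract


def implied_correct(percent, n=N_QUESTIONS):
    """Find the whole number of correct answers that rounds to `percent` (1 d.p.)."""
    matches = [k for k in range(n + 1) if abs(100 * k / n - percent) < 0.05]
    if len(matches) != 1:
        raise ValueError(f"{percent}% does not map to a unique count out of {n}")
    return matches[0]


def two_proportion_z_test(k1, k2, n):
    """Unpaired two-sided z-test for two proportions with equal sample size n.

    Assumption: this is NOT necessarily the test used in the article.
    """
    pooled = (k1 + k2) / (2 * n)
    std_err = math.sqrt(pooled * (1 - pooled) * (2 / n))
    z = (k2 / n - k1 / n) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


correct_gpt35 = implied_correct(63.1)  # derived count, not stated in the record
correct_gpt4 = implied_correct(90.5)   # derived count, not stated in the record
z, p_value = two_proportion_z_test(correct_gpt35, correct_gpt4, N_QUESTIONS)

print(f"ChatGPT-3.5: {correct_gpt35}/{N_QUESTIONS} correct")
print(f"ChatGPT-4:   {correct_gpt4}/{N_QUESTIONS} correct")
print(f"z = {z:.2f}, p below 0.001: {p_value < 0.001}")
```

Under these assumptions the difference remains significant at the p<0.001 level, consistent with the abstract, and only ChatGPT-4 clears the 70-72% pass mark.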