Comparative readability analysis of AI-generated versus evidence-based educational content on atrial fibrillation management for medical professionals
Konduru, Thejas Swaroop; Medapuram Belagu, Sharada; Momtaz, Areefa; Maghamifar, Jasmin
Affiliation
Sandwell and West Birmingham NHS Trust; Walsall Healthcare NHS Trust
Publication date
2025-09-17
Abstract
Introduction
Atrial fibrillation (AF) is the most common sustained arrhythmia and is associated with increased risks of stroke, heart failure, and healthcare burden. Access to clear and up-to-date educational content is essential for effective decision-making in complex cases such as AF. Evidence-based resources like UpToDate are often time-consuming to read, and clinicians frequently face time constraints in fast-paced clinical settings. With the growing role of artificial intelligence in healthcare, tools like ChatGPT-3.5 (OpenAI, San Francisco, CA, USA) offer fast and accessible medical summaries. However, their suitability in professional education remains inadequately studied, particularly in comparison with evidence-based resources like UpToDate.

Methodology
A cross-sectional study was conducted in June 2025. Educational content was generated using ChatGPT-3.5 based on structured prompts and retrieved from UpToDate. Non-textual elements were excluded. Readability was assessed using the Flesch-Kincaid Reading Ease (FRE) score, the Flesch-Kincaid Grade Level (FKGL), the Simple Measure of Gobbledygook (SMOG) Index, word count, sentence count, average words per sentence, and both count and percentage of difficult words. Statistical comparison was performed using the Mann-Whitney U test (p < 0.05), analysed with R software (v4.3.2; R Foundation for Statistical Computing, Vienna, Austria).

Results
ChatGPT content was significantly shorter (median 495 vs. 3381 words; p = 0.029) and had shorter sentences (14.3 vs. 19.3 words; p = 0.029), but a higher percentage of difficult words (29.6% vs. 23.3%; p = 0.029). Other differences were not statistically significant.

Conclusions
ChatGPT provides concise educational content with readability scores comparable to UpToDate but with a higher proportion of complex vocabulary. While promising as a supplementary resource, its integration into clinical decision-making should be guided by expert review and validation.
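The readability metrics named in the Methodology are standard published formulas. As a rough illustration of how such scores are derived (not the study's actual pipeline, which used R for statistics), here is a minimal Python sketch of the FRE, FKGL, and SMOG formulas; the function names and the example counts are illustrative assumptions.

```python
import math


def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease: higher scores (0-100) indicate easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level: approximate US school grade required."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def smog_index(polysyllables: int, sentences: int) -> float:
    """SMOG Index: grade estimate from the count of 3+ syllable words."""
    return 1.0430 * math.sqrt(polysyllables * (30 / sentences)) + 3.1291


# Hypothetical passage: 100 words, 5 sentences, 130 syllables, 12 polysyllables
print(f"FRE:  {flesch_reading_ease(100, 5, 130):.1f}")
print(f"FKGL: {flesch_kincaid_grade(100, 5, 130):.1f}")
print(f"SMOG: {smog_index(12, 5):.1f}")
```

In practice the word, sentence, and syllable counts come from a tokenizer and syllable estimator (readability tools differ slightly in how they count these), which is one reason published scores for the same text can vary between tools.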
Citation
Konduru TS, Medapuram Belagu S, Momtaz A, Maghamifar J. Comparative Readability Analysis of AI-Generated Versus Evidence-Based Educational Content on Atrial Fibrillation Management for Medical Professionals. Cureus. 2025 Sep 17;17(9):e92506. doi: 10.7759/cureus.92506. PMID: 41069886; PMCID: PMC12507389.
Type
Article
