Deep learning for diagnosing and grading pterygium: a systematic review and meta-analysis

Tiong, Ethan W W
Soon, Carine Y S
Ong, Zun Zheng
Liu, Su-Hsun
Qureshi, Riaz
Rauz, Saaeha
Ting, Darren Shu Jeng
Affiliation
University of Manchester; Sandwell and West Birmingham NHS Trust; University of Colorado Anschutz Medical Campus
Publication date
2025-07-11
Abstract
Topic: A systematic review and meta-analysis evaluating the accuracy of DL models in pterygium detection and severity assessment against clinical experts.
Clinical relevance: Deep learning (DL) has made significant progress in diagnosing various ophthalmological conditions. However, its diagnostic performance for pterygium remains elusive and requires further investigation.
Methods: We systematically searched EMBASE, MEDLINE and other clinical registries between 1974 and February 2025. Relevant peer-reviewed clinical studies examining the diagnostic performance of AI algorithms for diagnosing or grading pterygium were deemed eligible. Risk of bias was assessed using QUADAS-2, and diagnostic accuracy was analyzed using bivariate random-effects models. The protocol is publicly available at the Open Science Framework (https://osf.io/ytpb3).
Results: Out of 406 studies identified, 20 studies (45,913 anterior segment photographs from >4460 patients) were included, of which 13 were case-control and seven were cross-sectional. No studies provided external validation data, and one study directly compared DL performance with expert ophthalmologists, which showed similar diagnostic accuracy for diagnosis and grading of pterygium. Most studies had high risk of bias in patient selection (65%) and index test (100%) domains. Based on 14 internal validation studies (8094 images), the summary estimates (95% CI) of sensitivity and specificity for diagnosing pterygium were 98.1% (96.4-99.1) and 99.1% (98.0-99.6). For severity grading (8 studies; 1631 images), summary sensitivity and specificity estimates were 91.2% (87.7-93.7) and 92.9% (88.4-95.8).
Conclusions: DL models appear to have high accuracy for diagnosing and grading pterygium and may have comparable performance to clinical experts. However, these findings need to be interpreted with caution due to methodological limitations, including lack of external validation, case-control study designs, lack of pre-specified decision thresholds, and image-based analysis that did not account for within-individual correlation. Future studies should prioritize transparent reporting, prospective designs, external validation, and direct comparisons with clinicians to facilitate the translation of DL technologies for pterygium into clinical practice.
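As an illustration of the accuracy measures reported in the abstract, the following minimal Python sketch computes sensitivity and specificity with 95% confidence intervals from a single study's 2x2 table. It is not the bivariate random-effects model used in the review, which pools both measures jointly across studies. The Wilson score interval is a swapped-in, simpler method; the function names and the counts are hypothetical and are not taken from any included study.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


def sensitivity_specificity(tp, fn, tn, fp):
    """Per-study sensitivity and specificity, each with a Wilson interval.

    tp/fn: diseased eyes classified correctly/incorrectly.
    tn/fp: healthy eyes classified correctly/incorrectly.
    """
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": (sens, wilson_interval(tp, tp + fn)),
        "specificity": (spec, wilson_interval(tn, tn + fp)),
    }


# Hypothetical counts for one study, for illustration only.
result = sensitivity_specificity(tp=95, fn=5, tn=190, fp=10)
for name, (estimate, (low, high)) in result.items():
    print(f"{name}: {estimate:.3f} (95% CI {low:.3f} to {high:.3f})")
```

A per-study calculation like this treats each image as independent, which is the limitation the abstract raises about within-individual correlation.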
Citation
Tiong EWW, Soon CYS, Ong ZZ, Liu SH, Qureshi R, Rauz S, Ting DSJ. Deep learning for diagnosing and grading pterygium: A systematic review and meta-analysis. Comput Biol Med. 2025 Sep;196(Pt A):110743. doi: 10.1016/j.compbiomed.2025.110743. Epub 2025 Jul 11.
Type
Article