Development and validation of open-source deep neural networks for comprehensive chest x-ray reading: a retrospective, multicentre study
Author
Cid, Yashin Dicente
Macpherson, Matthew
Gervais-Andre, Louise
Zhu, Yuanyi
Franco, Giuseppe
Santeramo, Ruggiero
Lim, Chee
Selby, Ian
Muthuswamy, Keerthini
Amlani, Ashik
Hopewell, Heath
Indrajeet, Das
Liakata, Maria
Hutchinson, Charles E
Goh, Vicky
Montana, Giovanni
Publication date
2023-12-08
Abstract
Background: Artificial intelligence (AI) systems for automated chest x-ray interpretation hold promise for standardising reporting and reducing delays in health systems with shortages of trained radiologists. Yet, there are few freely accessible AI systems trained on large datasets for practitioners to use with their own data with a view to accelerating clinical deployment of AI systems in radiology. We aimed to contribute an AI system for comprehensive chest x-ray abnormality detection. Methods: In this retrospective cohort study, we developed open-source neural networks, X-Raydar and X-Raydar-NLP, for classifying common chest x-ray findings from images and their free-text reports. Our networks were developed using data from six UK hospitals from three National Health Service (NHS) Trusts (University Hospitals Coventry and Warwickshire NHS Trust, University Hospitals Birmingham NHS Foundation Trust, and University Hospitals Leicester NHS Trust) collectively contributing 2 513 546 chest x-ray studies taken from a 13-year period (2006-19), which yielded 1 940 508 usable free-text radiological reports written by the contemporary assessing radiologist (collectively referred to as the "historic reporters") and 1 896 034 frontal images. Chest x-rays were labelled using a taxonomy of 37 findings by a custom-trained natural language processing (NLP) algorithm, X-Raydar-NLP, from the original free-text reports. X-Raydar-NLP was trained on 23 230 manually annotated reports and tested on 4551 reports from all hospitals. 1 694 921 labelled images from the training set and 89 238 from the validation set were then used to train a multi-label image classifier. 
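The evaluation that follows summarises classifier performance as a mean per-finding AUC with standard deviation across the 37-finding taxonomy. As an illustrative sketch only (toy data and finding names, not taken from the study), per-finding AUC can be computed from scores and binary labels via the Mann-Whitney U statistic and then averaged across findings:

```python
import statistics

def auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney U statistic:
    the fraction of (positive, negative) pairs ranked correctly,
    counting ties as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy multi-label evaluation: one (labels, scores) pair per finding.
# A real evaluation would iterate over all 37 taxonomy findings.
per_finding = {
    "pneumothorax": ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),
    "mass_or_nodule": ([0, 1, 0, 1], [0.2, 0.9, 0.3, 0.7]),
}
aucs = [auc(y, s) for y, s in per_finding.values()]
mean_auc = statistics.mean(aucs)  # headline figure, as in the abstract
sd_auc = statistics.stdev(aucs)   # spread across findings
```

This mirrors how a single summary number such as "mean AUC 0·919 (SD 0·039)" condenses one AUC per finding; per-finding breakdowns remain necessary for clinically critical labels.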
Our algorithms were evaluated on three retrospective datasets: a set of exams sampled randomly from the full NHS dataset reported during clinical practice and annotated using NLP (n=103 328); a consensus set sampled from all six hospitals annotated by three expert radiologists (two independent annotators for each image and a third consultant to facilitate disagreement resolution) under research conditions (n=1427); and an independent dataset, MIMIC-CXR, consisting of NLP-annotated exams (n=252 374). Findings: X-Raydar achieved a mean AUC of 0·919 (SD 0·039) on the auto-labelled set, 0·864 (0·102) on the consensus set, and 0·842 (0·074) on the MIMIC-CXR test, demonstrating similar performance to the historic clinical radiologist reporters, as assessed on the consensus set, for multiple clinically important findings, including pneumothorax, parenchymal opacification, and parenchymal mass or nodules. On the consensus set, X-Raydar outperformed historical reporter balanced accuracy with significance on 27 of 37 findings, was non-inferior on nine, and inferior on one finding, resulting in an average improvement of 13·3% (SD 13·1) to 0·763 (0·110), including a mean 5·6% (13·2) improvement in critical findings to 0·826 (0·119). Interpretation: Our study shows that automated classification of chest x-rays under a comprehensive taxonomy can achieve performance levels similar to those of historical reporters and exhibit robust generalisation to external data. The open-sourced neural networks can serve as foundation models for further research and are freely available to the research community. Funding: Wellcome Trust.
Citation
Cid YD, Macpherson M, Gervais-Andre L, Zhu Y, Franco G, Santeramo R, Lim C, Selby I, Muthuswamy K, Amlani A, Hopewell H, Indrajeet D, Liakata M, Hutchinson CE, Goh V, Montana G. Development and validation of open-source deep neural networks for comprehensive chest x-ray reading: a retrospective, multicentre study. Lancet Digit Health. 2024 Jan;6(1):e44-e57. doi: 10.1016/S2589-7500(23)00218-2. Epub 2023 Dec 8.
Type
Article
Additional Links
https://www.sciencedirect.com/journal/the-lancet-digital-health
PMID
38071118
Journal
The Lancet Digital Health
Publisher
Elsevier
DOI
10.1016/S2589-7500(23)00218-2