Revealing transparency gaps in publicly available COVID-19 datasets used for medical artificial intelligence development-a systematic review
dc.contributor.affiliation | University Hospitals Birmingham NHS Foundation Trust; University of Birmingham; University Hospital Southampton NHS Foundation Trust; The Royal Wolverhampton NHS Trust; King's College London; Birmingham Women's and Children's NHS Foundation Trust; University Hospitals of Leicester NHS Trust; NIHR Blood and Transplant Research Unit (BTRU); Massachusetts Institute of Technology; The Hospital for Sick Children; SickKids Research Institute; University of Cambridge; Roche Diagnostics; University College London; PATH; Wellcome Trust; Independent Cancer Patients Voice; Oxford University Hospitals NHS Foundation Trust; Moorfields Eye Hospital; NIHR Biomedical Research Centre | en_US |
dc.contributor.author | Alderman, Joseph E | |
dc.contributor.author | Charalambides, Maria | |
dc.contributor.author | Sachdeva, Gagandeep | |
dc.contributor.author | Laws, Elinor | |
dc.contributor.author | Palmer, Joanne | |
dc.contributor.author | Lee, Elsa | |
dc.contributor.author | Menon, Vaishnavi | |
dc.contributor.author | Malik, Qasim | |
dc.contributor.author | Vadera, Sonam | |
dc.contributor.author | Calvert, Melanie | |
dc.contributor.author | Ghassemi, Marzyeh | |
dc.contributor.author | McCradden, Melissa D | |
dc.contributor.author | Ordish, Johan | |
dc.contributor.author | Mateen, Bilal | |
dc.contributor.author | Summers, Charlotte | |
dc.contributor.author | Gath, Jacqui | |
dc.contributor.author | Matin, Rubeta N | |
dc.contributor.author | Denniston, Alastair K | |
dc.contributor.author | Liu, Xiaoxuan | |
dc.contributor.department | Research and Development | en_US |
dc.contributor.department | Ophthalmology | en_US |
dc.contributor.role | Admin and Clerical | en_US |
dc.contributor.role | Medical and Dental | en_US |
dc.contributor.trustauthor | Vadera, Sonam | |
dc.contributor.trustauthor | Denniston, Alastair | |
dc.date.accessioned | 2024-12-04T12:57:34Z | |
dc.date.available | 2024-12-04T12:57:34Z | |
dc.date.issued | 2024-10-23 | |
dc.description.abstract | During the COVID-19 pandemic, artificial intelligence (AI) models were created to address health-care resource constraints. Previous research shows that health-care datasets often have limitations, leading to biased AI technologies. This systematic review assessed datasets used for AI development during the pandemic, identifying several deficiencies. Datasets were identified by screening articles from MEDLINE and using Google Dataset Search. 192 datasets were analysed for metadata completeness, composition, data accessibility, and ethical considerations. Findings revealed substantial gaps: only 48% of datasets documented individuals' country of origin, 43% reported age, and under 25% included sex, gender, race, or ethnicity. Information on data labelling, ethical review, or consent was frequently missing. Many datasets reused data with inadequate traceability. Notably, historical paediatric chest x-rays appeared in some datasets without acknowledgment. These deficiencies highlight the need for better data quality and transparent documentation to lessen the risk that biased AI models are developed in future health emergencies. | en_US |
dc.identifier.citation | Alderman JE, Charalambides M, Sachdeva G, Laws E, Palmer J, Lee E, Menon V, Malik Q, Vadera S, Calvert M, Ghassemi M, McCradden MD, Ordish J, Mateen B, Summers C, Gath J, Matin RN, Denniston AK, Liu X. Revealing transparency gaps in publicly available COVID-19 datasets used for medical artificial intelligence development-a systematic review. Lancet Digit Health. 2024 Nov;6(11):e827-e847. doi: 10.1016/S2589-7500(24)00146-8. | en_US |
dc.identifier.doi | 10.1016/S2589-7500(24)00146-8 | |
dc.identifier.eissn | 2589-7500 | |
dc.identifier.pmid | 39455195 | |
dc.identifier.uri | http://hdl.handle.net/20.500.14200/6680 | |
dc.language.iso | en | en_US |
dc.publisher | Elsevier | en_US |
dc.relation.url | https://www.thelancet.com/journals/landig/home | en_US |
dc.rights | Copyright © 2024 The Author(s). Published by Elsevier Ltd. This is an Open Access article under the CC BY 4.0 license. All rights reserved. | |
dc.source.beginpage | e827 | |
dc.source.country | England | |
dc.source.endpage | e847 | |
dc.source.issue | 11 | |
dc.source.journaltitle | The Lancet Digital Health | en_US |
dc.source.volume | 6 | |
dc.subject | Patients. Primary care. Medical profession. Forensic medicine | en_US |
dc.subject | Public health. Health statistics. Occupational health. Health education | en_US |
dc.subject | Health services. Management | en_US |
dc.title | Revealing transparency gaps in publicly available COVID-19 datasets used for medical artificial intelligence development-a systematic review | en_US |
dc.type | Article | en_US |
dspace.entity.type | Publication | |
oa.grant.openaccess | na | en_US |
rioxxterms.version | NA | en_US |
Files
License bundle
- Name: license.txt
- Size: 1.7 KB
- Format: Item-specific license agreed upon to submission