Inter-rater agreement of HER2-low scores between expert breast pathologists and the Visiopharm digital image analysis application (HER2 APP, CE2797)
Parry, Suzanne ; Zabaglo, Lila ; Shaaban, Abeer M ; Dodson, Andrew
Affiliation
UK National External Quality Assessment Scheme for Immunocytochemistry and In-Situ Hybridisation; University of Birmingham; University Hospitals Birmingham NHS Foundation Trust
Publication date
2025-10-16
Abstract
Inter-observer concordance data for the HER2 category as assessed by a group of 16 specialist breast pathologists on 50 diagnostic core biopsies were compared with those produced by digital image analysis (DIA) using the HER2 APP, CE2797 (VP APP; Visiopharm, Hoersholm, Denmark). Comparing pathologists' consensus scores and DIA scores, 36 cases (73.5%) agreed. Fleiss' kappa statistic was 0.433 (indicative of moderate agreement). Cohen's weighted kappa was used to compare the scores of individual raters to consensus scores; for all 50 cases the kappa scores ranged from 0.412 to 0.854; the VP APP was ranked 12th of 17 raters (kappa score 0.638, indicating substantial agreement). Results for HER2-low cases (N = 44) showed a kappa score range of 0.295 to 0.823; the VP APP ranked 12th of 17 (score 0.535, indicating moderate agreement). For high-agreement cases the kappa score range was 0.664 to 1.000 for all HER2 scores (N = 24) and the VP APP scored 0.916 (indicating almost perfect agreement). For the HER2-low scores (N = 20), the kappa score range was 0.506 to 1.000 and the VP APP scored 0.860 (almost perfect agreement). DIA of the proportions of tumour cells showing expression within each of the HER2 categories demonstrated that the majority of cases with a low level of agreement between pathologists showed heterogeneity and/or a level of expression close to a cut-point for decision making. This study demonstrates that the VP APP produces results that are extremely well aligned with those of expert pathologists in cases with good overall agreement, and in difficult cases its reproducibility will outperform that of the visual scorer. The results also suggest that use of the VP APP has the potential to reduce the proportion of cases referred for gene amplification testing by reducing the number of cases incorrectly classified as HER2 2+.
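As a rough illustration of the rater-versus-consensus comparison described in the abstract, the sketch below computes Cohen's weighted kappa for two sets of ordinal scores and maps the result to a verbal label. It is not the authors' analysis code. The category list, the choice of linear weights (the abstract does not state the weighting scheme), the function names and the example scores are all assumptions made for illustration. The verbal bands follow the commonly used Landis and Koch scale, which is consistent with the labels quoted in the abstract but is not named there.

```python
# Illustrative sketch only; not the study's analysis code.
# Assumed ordinal category order; the abstract does not list the exact categories used.
HER2_CATEGORIES = ["0", "1+", "2+", "3+"]


def weighted_kappa(rater_a, rater_b, categories=HER2_CATEGORIES, weighting="linear"):
    """Cohen's weighted kappa for two raters scoring the same cases on an ordinal scale."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same, non-empty set of cases")
    if weighting not in ("linear", "quadratic"):
        raise ValueError("weighting must be 'linear' or 'quadratic'")
    k = len(categories)
    if k < 2:
        raise ValueError("at least two categories are required")

    index = {category: i for i, category in enumerate(categories)}
    n = len(rater_a)

    # Observed joint proportions: observed[i][j] is the share of cases
    # scored as category i by rater A and category j by rater B.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1.0 / n

    marginal_a = [sum(row) for row in observed]
    marginal_b = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        # Disagreement weight: 0 on the diagonal, 1 for the most distant categories.
        d = abs(i - j) / (k - 1)
        return d if weighting == "linear" else d * d

    observed_dis = sum(disagreement(i, j) * observed[i][j]
                       for i in range(k) for j in range(k))
    expected_dis = sum(disagreement(i, j) * marginal_a[i] * marginal_b[j]
                       for i in range(k) for j in range(k))
    if expected_dis == 0:
        raise ValueError("kappa is undefined when both raters use a single identical category")
    return 1.0 - observed_dis / expected_dis


def interpret(kappa):
    """Verbal label on the Landis and Koch scale (assumed to be the scale behind the abstract's labels)."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


if __name__ == "__main__":
    # Hypothetical scores for six cases; not data from the study.
    consensus = ["0", "1+", "1+", "2+", "2+", "3+"]
    one_rater = ["0", "1+", "2+", "2+", "1+", "3+"]
    kappa = weighted_kappa(one_rater, consensus)
    print(round(kappa, 3), interpret(kappa))
```

Weighted kappa gives partial credit when two scores fall in adjacent categories, which suits an ordinal scale such as HER2 scoring. Applied to the values reported in the abstract, `interpret` returns "moderate" for 0.433 and 0.535, "substantial" for 0.638, and "almost perfect" for 0.860 and 0.916.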
Citation
Parry S, Zabaglo L, Shaaban AM, Dodson A. Inter-rater agreement of HER2-low scores between expert breast pathologists and the Visiopharm digital image analysis application (HER2 APP, CE2797). J Pathol Clin Res. 2025 Nov;11(6):e70051. doi: 10.1002/2056-4538.70051.
Type
Article
