Deep learning for predicting invasive recurrence of ductal carcinoma in situ: leveraging histopathology images and clinical features
Doyle S, Lips EH, Marcus E, Mulder L, Liu YH, Canton FD, Kootstra T, van Seijen MM, Bouybayoune I, Sawyer EJ, Thompson AM, Pinder SE, Sánchez CI, Teuwen J, Wesseling J,
BACKGROUND: Ductal carcinoma in situ (DCIS) can progress to ipsilateral invasive breast cancer (IBC), but over 75% of DCIS lesions do not progress if left untreated. Currently, DCIS lesions that might progress to IBC cannot be reliably identified, so most patients with DCIS receive treatment resembling that for IBC. To facilitate identification of low-risk DCIS, we developed deep learning models using histology whole-slide images (WSIs) and clinico-pathological data.
METHODS: We predicted invasive recurrence in patients with primary, pure DCIS treated with breast-conserving surgery, using clinical Cox proportional hazards models and deep learning. Deep learning models were trained end-to-end either on WSIs alone or on WSIs combined with clinical data (integrative models). We employed nested k-fold cross-validation (k = 5) on a Dutch multicentre dataset (n = 558). Models were also tested on the UK-based Sloane dataset (n = 94).
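The nested cross-validation scheme can be illustrated with a minimal sketch. It assumes that the inner loop also uses k = 5 and that patients are split at random with a fixed seed; the abstract does not state either detail, nor whether folds were stratified by outcome or centre. The function names `k_fold_indices` and `nested_cv_splits` are hypothetical.

```python
import random


def k_fold_indices(indices, k, seed):
    """Shuffle the given indices and split them into k nearly equal folds."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def nested_cv_splits(n_samples, k=5, seed=0):
    """Yield (outer_fold, inner_fold, train, val, test) index lists.

    Assumption: the inner loop reuses the same k as the outer loop.
    """
    outer_folds = k_fold_indices(range(n_samples), k, seed)
    for o, test in enumerate(outer_folds):
        # Everything outside the held-out test fold is available for development.
        development = [i for j, fold in enumerate(outer_folds) if j != o for i in fold]
        inner_folds = k_fold_indices(development, k, seed + o + 1)
        for v, val in enumerate(inner_folds):
            # The inner validation fold is used for model selection only.
            train = [i for j, fold in enumerate(inner_folds) if j != v for i in fold]
            yield o, v, train, val, test


# 558 matches the size of the Dutch dataset reported above.
splits = list(nested_cv_splits(558, k=5))
```

Each outer test fold is never seen during training or model selection, which is what lets the outer-loop metrics serve as an estimate of performance on unseen patients.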
FINDINGS: Evaluated over 20 years on the Dutch dataset, deep learning models using only WSIs effectively stratified patients into low-risk (no recurrence) and high-risk (invasive recurrence) groups (negative predictive value (NPV) = 0.79 (95% CI: 0.74-0.83); hazard ratio (HR) = 4.48 (95% CI: 3.41-5.88, p < 0.0001); area under the receiver operating characteristic curve (AUC) = 0.75 (95% CI: 0.70-0.79)). Integrative models achieved similar results, with slightly higher hazard ratios than the image-only models (NPV = 0.77 (95% CI: 0.73-0.82); HR = 4.85 (95% CI: 3.65-6.45, p < 0.0001); AUC = 0.75 (95% CI: 0.70-0.79)). In contrast, clinical models reached only borderline significance (NPV = 0.64 (95% CI: 0.59-0.69); HR = 1.37 (95% CI: 1.03-1.81, p = 0.041); AUC = 0.57 (95% CI: 0.52-0.62)). External validation of the models was unsuccessful, limited by the small size of our external dataset and its low number of cases (22/94), by WSI quality, and by the lack of well-annotated datasets that allow robust validation.
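For readers unfamiliar with the NPV reported above, the following minimal sketch shows how it is defined: the fraction of patients predicted to be low-risk who did not have an invasive recurrence. It is a simplification that ignores censoring, which a 20-year time-to-event evaluation would need to handle, and the abstract does not describe how the study computed the metric or its confidence intervals. The function name and the toy labels are hypothetical and are not study data.

```python
def negative_predictive_value(predicted_high_risk, recurred):
    """Return TN / (TN + FN), computed over patients predicted low-risk."""
    low_risk_outcomes = [r for p, r in zip(predicted_high_risk, recurred) if not p]
    if not low_risk_outcomes:
        raise ValueError("no patients were predicted low-risk")
    true_negatives = sum(1 for r in low_risk_outcomes if not r)
    return true_negatives / len(low_risk_outcomes)


# Toy labels for illustration only (not taken from the study).
toy_predicted_high_risk = [False, False, False, False, True, True, True, False]
toy_recurred = [False, False, True, False, True, False, True, False]

# Five patients are predicted low-risk and four of them did not recur.
toy_npv = negative_predictive_value(toy_predicted_high_risk, toy_recurred)
print(f"NPV = {toy_npv:.2f}")  # prints "NPV = 0.80"
```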
INTERPRETATION: Deep learning models using routinely processed WSIs hold promise for DCIS risk stratification, while the benefits of integrating clinical data merit further investigation. Obtaining a larger, high-quality external multicentre dataset would be highly valuable, as successful generalisation of these models could demonstrate their potential to reduce overtreatment in DCIS by enabling active surveillance for women at low risk.
FUNDING: Cancer Research UK, the Dutch Cancer Society (KWF), and the Dutch Ministry of Health, Welfare and Sport.
Copyright © 2025 The Author(s). Published by Elsevier B.V. All rights reserved.
EBioMedicine, 2025-05-31