Impact of harmonization on predicting complications in head and neck cancer after radiotherapy using MRI radiomics and machine learning techniques
Khajetash B, Hajianfar G, Talebi A, Mahdavi SR, Ghavidel B, Kalati FA, Molana SH, Lei Y, Tavakoli M
BACKGROUND: Variations in medical images specific to individual scanners restrict the use of radiomics in both clinical practice and research. To create reproducible and generalizable radiomics-based models for outcome prediction and assessment, data harmonization is essential.
PURPOSE: This study investigates the impact of harmonization on the performance of machine learning-based radiomics models for predicting radiotherapy-induced toxicity (early and late sticky saliva and xerostomia) in head and neck cancer (HNC) patients after radiation therapy, using $T_1$- and $T_2$-weighted magnetic resonance (MR) images.
METHODS: A total of 85 HNC patients who underwent radiotherapy were studied. Radiomic features were extracted from $T_1$- and $T_2$-weighted MR images with standardized protocols. Data harmonization was performed using the ComBat algorithm to reduce inter-center variability. In addition to imaging features, dosimetric and demographic features were extracted and used in the model. Recursive feature elimination was employed for feature selection to identify the most important variables. Ten classification algorithms were used and compared to develop predictive models: eXtreme Gradient Boosting (XGBoost), multilayer perceptron (MLP), support vector machine (SVM), random forest (RF), k-nearest neighbor (KNN), Naive Bayes (NB), logistic regression (LR), decision tree (DT), boosted generalized linear model (GLMB), and stack learning (SL). Comparisons were performed before and after harmonization to assess its impact.
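As an illustration of what harmonization does to a single radiomic feature, the following is a minimal sketch of a per-center location-scale adjustment. It is a simplification and not the ComBat algorithm itself: ComBat additionally applies empirical Bayes shrinkage to the per-center estimates and can preserve covariate effects, both omitted here. The function name harmonize_location_scale and the example values are hypothetical.

```python
from statistics import mean, pstdev


def harmonize_location_scale(values, batches):
    """Align each batch's mean and SD of one feature to the pooled mean and SD.

    Simplified stand-in for ComBat (assumption): no empirical Bayes
    shrinkage and no covariate preservation.
    """
    grand_mean = mean(values)
    grand_sd = pstdev(values)
    harmonized = [0.0] * len(values)
    for batch in set(batches):
        idx = [i for i, label in enumerate(batches) if label == batch]
        batch_vals = [values[i] for i in idx]
        b_mean = mean(batch_vals)
        b_sd = pstdev(batch_vals) or 1.0  # guard against a constant batch
        for i in idx:
            # Standardize within the batch, then rescale to the pooled distribution.
            harmonized[i] = (values[i] - b_mean) / b_sd * grand_sd + grand_mean
    return harmonized


# Hypothetical feature values from two centers with a systematic offset.
feature = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]
center = ["A", "A", "A", "B", "B", "B"]
adjusted = harmonize_location_scale(feature, center)
```

After the adjustment both centers share the same mean and standard deviation for this feature, so a classifier can no longer separate samples by scanner offset alone.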
RESULTS: Our results indicate that harmonization consistently enhances predictive performance across various complications and imaging modalities. In early and late sticky saliva prediction using $T_1$-weighted images, the SVM and RF models achieved areas under the curve (AUC) of 0.88 ± 0.09 and 0.97 ± 0.05 with harmonization versus 0.42 ± 0.12 and 0.83 ± 0.08 without harmonization, respectively. Similarly, in early and late xerostomia prediction, the model attained AUCs of 0.79 ± 0.15 and 0.61 ± 0.14 with harmonization versus 0.55 ± 0.17 and 0.46 ± 0.14 without harmonization, respectively.
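For readers unfamiliar with the metric, the sketch below shows one way an AUC and a "mean ± SD" summary could be computed. It assumes the reported ± values are standard deviations over repeated resampling folds, which the abstract does not state; the function names roc_auc and summarize and the fold values are hypothetical.

```python
from statistics import mean, stdev


def roc_auc(labels, scores):
    """AUC as the probability that a random positive case outscores
    a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def summarize(aucs):
    """Mean and sample standard deviation of per-fold AUCs (assumed reporting convention)."""
    return mean(aucs), stdev(aucs)


# Hypothetical per-fold AUCs, not values from the study.
fold_mean, fold_sd = summarize([0.8, 0.9, 1.0])
```

An AUC near 0.5 corresponds to chance-level ranking, which is why values such as 0.42 or 0.46 without harmonization indicate no useful discrimination.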
CONCLUSION: Our study highlights the importance of harmonization techniques in improving the performance of predictive models based on magnetic resonance imaging radiomics features. While harmonization consistently enhanced performance for sticky saliva and early xerostomia using $T_1$-weighted features, the prediction of early and late xerostomia using $T_2$-weighted features remains challenging. These findings support the development of accurate and reliable predictive models in medical imaging that may contribute to improved patient care and treatment outcomes.
© 2025 American Association of Physicists in Medicine.
Medical physics, 2025-04-02