Incorporation of explainable artificial intelligence in ensemble machine learning-driven pancreatic cancer diagnosis
Almisned FA, Usanase N, Ozsahin DU, Ozsahin I
Despite advances in medical science, pancreatic cancer remains a serious threat, highlighting the urgent need for new strategies to address it. One approach that has recently attracted significant attention is the use of machine learning in clinical decision-making. This research aims to analyze six machine learning algorithms and an ensemble voting classifier, to develop hybrid models for the early detection of pancreatic cancer based on several clinical characteristics, and to interpret their performance with Shapley Additive Explanations (SHAP). A publicly available dataset of 590 patient urine samples was used to develop six conventional models that classify cases as cancerous or non-cancerous from specific attributes. An ensemble voting classifier was developed from the best-performing single models, which were later hybridized to form six novel hybrid models. The ensemble voting classifier outperformed all stand-alone models, with an accuracy of 96.61% and a precision of 98.72%. The six hybrid models performed better than the single models; the voting classifier-random forest hybrid model outperformed the others, with an AUC of 99.05% (95% confidence interval (CI): 0.93-1.00). SHAP was used to interpret the results, showing that the most influential features in pancreatic cancer diagnosis were those with the greatest positive SHAP values. Rapid, sophisticated models with high accuracy and precision hold significant promise for the effective detection of various diseases, including pancreatic cancer.
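To make the idea of an ensemble voting classifier concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say whether hard or soft voting was used, so soft voting (averaging predicted probabilities) is assumed here, and the three base models, their probabilities, and the labels are hypothetical toy values rather than data from the study.

```python
# Minimal soft-voting sketch. All numbers below are hypothetical toy values,
# not results or data from the study.

def soft_vote(probabilities, threshold=0.5):
    """Average each sample's positive-class probability across models,
    then label the sample 1 (cancer) if the mean reaches the threshold."""
    n_models = len(probabilities)
    voted = []
    for sample_probs in zip(*probabilities):
        mean_p = sum(sample_probs) / n_models
        voted.append(1 if mean_p >= threshold else 0)
    return voted

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def precision(y_true, y_pred):
    """True positives divided by all predicted positives (0.0 if none)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fp) if (tp + fp) else 0.0

# Hypothetical ground truth: 1 = cancerous, 0 = non-cancerous.
y_true = [1, 0, 1, 1, 0, 0]

# Hypothetical positive-class probabilities from three base models.
model_a = [0.9, 0.4, 0.4, 0.8, 0.2, 0.6]
model_b = [0.7, 0.3, 0.7, 0.6, 0.6, 0.3]
model_c = [0.8, 0.6, 0.6, 0.3, 0.1, 0.2]

ensemble_pred = soft_vote([model_a, model_b, model_c])

# Compare each single model with the ensemble on the toy data.
for name, probs in [("A", model_a), ("B", model_b), ("C", model_c)]:
    single_pred = soft_vote([probs])
    print(name, accuracy(y_true, single_pred), precision(y_true, single_pred))
print("ensemble", accuracy(y_true, ensemble_pred), precision(y_true, ensemble_pred))
```

In this toy setup each base model misclassifies a different sample, so averaging their probabilities corrects the individual errors. That is the general reason a voting ensemble can outperform its stand-alone members, although the size of the gain on real data depends on how correlated the base models' errors are.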
© 2025. The Author(s).
Scientific Reports, 2025-04-25