Assessing the value of artificial intelligence-based image analysis for pre-operative surgical planning of neck dissections and iENE detection in head and neck cancer patients
Schmidl B, Hoch CC, Walter R, Wirth M, Wollenberg B, Hussain T
OBJECTIVES: Accurate preoperative detection and analysis of lymph node metastasis (LNM) in head and neck squamous cell carcinoma (HNSCC) is essential for the surgical planning and execution of a neck dissection and may directly affect the morbidity and prognosis of patients. Additionally, predicting extranodal extension (ENE) from preoperative imaging could be particularly valuable in HPV-positive oropharyngeal squamous cell carcinoma, enabling more accurate patient counseling and supporting the decision to favor primary chemoradiotherapy over immediate neck dissection when appropriate. Currently, radiological images are evaluated by radiologists and head and neck oncologists, and automated image interpretation is not part of the standard of care. Therefore, this exploratory study evaluated the value of preoperative image recognition by artificial intelligence (AI), using the large language model (LLM) ChatGPT-4 V, on neck computed tomography (CT) images of HNSCC patients with cervical LNM and corresponding images without LNM. The objectives were, first, to assess preoperative rater accuracy by comparing clinician assessments of imaging-detected extranodal extension (iENE) and of the extent of neck dissection with AI predictions and, second, to evaluate pathology-based accuracy by comparing AI predictions with final histopathological outcomes.
MATERIALS AND METHODS: A total of 45 preoperative CT scans were retrospectively analyzed: 15 cases in which a selective neck dissection (sND) was performed, 15 cases with a subsequent modified radical neck dissection (mrND), and 15 cases without LNM (sND). Of note, image analysis was based on three single images provided to both ChatGPT-4 V and the head and neck surgeons acting as reviewers. Final pathological characteristics were available in all cases because all HNSCC patients had undergone surgery. ChatGPT-4 V was tasked with describing the extent of LNM in the preoperative CT scans, recommending the extent of neck dissection, and detecting iENE. The diagnostic performance of ChatGPT-4 V was reviewed independently by two head and neck surgeons, and its accuracy, sensitivity, and specificity were assessed.
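For readers less familiar with the metrics named above, the following is a minimal sketch of how accuracy, sensitivity, and specificity are conventionally derived from binary predictions scored against a reference standard such as histopathology. It is not the authors' analysis code; the function names and the example labels are hypothetical and are not study data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # predicted positive, reference positive
    fp: int  # predicted positive, reference negative
    tn: int  # predicted negative, reference negative
    fn: int  # predicted negative, reference positive


def count_outcomes(predicted, reference):
    """Tally binary predictions against a reference standard."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have equal length")
    tp = fp = tn = fn = 0
    for pred, ref in zip(predicted, reference):
        if pred and ref:
            tp += 1
        elif pred and not ref:
            fp += 1
        elif not pred and not ref:
            tn += 1
        else:
            fn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def sensitivity(c):
    """Share of reference-positive cases that were predicted positive."""
    positives = c.tp + c.fn
    return c.tp / positives if positives else float("nan")


def specificity(c):
    """Share of reference-negative cases that were predicted negative."""
    negatives = c.tn + c.fp
    return c.tn / negatives if negatives else float("nan")


def accuracy(c):
    """Share of all cases that were classified correctly."""
    total = c.tp + c.fp + c.tn + c.fn
    return (c.tp + c.tn) / total if total else float("nan")


# Hypothetical labels for illustration only (True = finding present).
predicted = [True, True, True, True, False, False]
reference = [True, True, False, False, False, False]
counts = count_outcomes(predicted, reference)
print(counts, sensitivity(counts), specificity(counts), accuracy(counts))
```

A rater that flags nearly every case as positive will reach high sensitivity at the cost of low specificity, which is why both measures are reported together.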
RESULTS: In this study, ChatGPT-4 V reached a sensitivity of 100% and a specificity of 34.09% in identifying the need for a radical neck dissection based on neck CT images. The sensitivity and specificity of detecting iENE were 100% and 34.15%, respectively. Both human reviewers achieved higher specificity. Notably, ChatGPT-4 V also recommended an mrND and detected iENE on CT images without any cervical LNM.
DISCUSSION: In this exploratory study of 45 preoperative neck CT scans obtained before a neck dissection, ChatGPT-4 V substantially overestimated the degree and severity of lymph node metastasis in head and neck cancer. While these results suggest that ChatGPT-4 V may not yet provide added value for surgical planning in head and neck cancer, the unparalleled speed of analysis and the well-founded reasoning it provided suggest that AI tools may provide added value in the future.
© 2025. The Author(s).
Discover oncology, 2025-06-01