Multi-cancer early detection based on serum surface-enhanced Raman spectroscopy with deep learning: a large-scale case-control study

BACKGROUND: Early detection of cancer enables more effective treatment and improves prognosis. However, established cancer screening technologies have limited applicability, especially for multi-cancer early detection. In this study, we describe a serum-based platform that integrates surface-enhanced Raman spectroscopy (SERS) with a resampling strategy, feature dimensionality enhancement, deep learning, and interpretability analysis for sensitive and accurate pan-cancer screening.
METHODS: In total, 1655 early-stage cancer patients, comprising breast cancer (BC, n = 569), lung cancer (LC, n = 513), thyroid cancer (TC, n = 220), colorectal cancer (CC, n = 215), gastric cancer (GC, n = 100), and esophageal cancer (EC, n = 38), and 1896 healthy controls (HC) were enrolled. Serum SERS spectra were obtained from each participant. Data dimension enhancement was performed by heatmap transformation and continuous wavelet transform (CWT). The dimension-enhanced SERS spectral data were subsequently analyzed with a residual neural network (ResNet), a convolutional neural network (CNN) architecture. Class activation mapping (CAM) was applied to elucidate the potential biological significance of the spectral data classification.
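To illustrate the dimension-enhancement step named above, the following is a minimal sketch of how a continuous wavelet transform can turn a one-dimensional spectrum into a two-dimensional scale-by-wavenumber matrix that an image-style CNN could consume. The abstract does not specify the wavelet, the scales, or the implementation, so the Ricker (Mexican hat) wavelet, the `widths` values, and the synthetic single-peak spectrum are all assumptions chosen for illustration only.

```python
import math


def ricker(points, a):
    """Ricker (Mexican hat) wavelet of width `a`, sampled at `points` positions."""
    amp = 2.0 / (math.sqrt(3.0 * a) * math.pi ** 0.25)
    half = (points - 1) / 2.0
    out = []
    for i in range(points):
        t = (i - half) / a
        out.append(amp * (1.0 - t * t) * math.exp(-t * t / 2.0))
    return out


def cwt(signal, widths):
    """Return a len(widths) x len(signal) matrix of wavelet coefficients.

    Each row is the signal correlated with a Ricker wavelet of one width;
    samples outside the signal are treated as zero.
    """
    n = len(signal)
    rows = []
    for a in widths:
        half = min(5 * a, n - 1)      # kernel half-length, capped by signal size
        kernel = ricker(2 * half + 1, a)
        row = []
        for i in range(n):
            acc = 0.0
            for k, w in enumerate(kernel):
                j = i + k - half
                if 0 <= j < n:
                    acc += signal[j] * w
            row.append(acc)
        rows.append(row)
    return rows


# Hypothetical toy spectrum: one Gaussian band centred at index 50.
spectrum = [math.exp(-((i - 50) ** 2) / (2 * 3.0 ** 2)) for i in range(101)]
widths = [2, 4, 8]                    # assumed scales, not taken from the study
scalogram = cwt(spectrum, widths)     # 3 rows (scales) x 101 columns (positions)
```

Each row of `scalogram` responds most strongly where the spectrum contains a band of matching width, which is why such a matrix can expose features at several resolutions at once.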
RESULTS: All participants were divided into a training set and a test set at a ratio of 7:3. The BorderlineSMOTE method was selected as the most appropriate resampling strategy, and the deep neural network (DNN) model achieved good performance across all groups (accuracy: 93.15%, precision: 88.46%, recall: 85.68%, and F1-score: 86.98%), with AUC values of 0.991 for HC, 0.995 for BC, 0.979 for LC, 0.996 for TC, 0.994 for CC, 0.982 for GC, and 0.941 for EC. Furthermore, the combination of SERS spectral data (in heatmap form) and ResNet was also capable of effectively distinguishing the categories and making accurate predictions (accuracy: 94.75%, precision: 89.02%, recall: 86.97%, and F1-score: 87.88%), with AUC values of 0.996 for HC, 0.995 for BC, 0.988 for LC, 0.999 for TC, 0.993 for CC, 0.985 for GC, and 0.940 for EC. Additionally, CAM analysis identified wavenumber ranges of the spectral data with strong activation.
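As a reading aid for the multi-class figures above, the sketch below computes accuracy together with macro-averaged precision, recall, and F1-score from true and predicted labels. The abstract does not state which averaging scheme was used, so macro-averaging is an assumption here, and the eight-sample label lists are hypothetical toy data, not study results.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1 over all classes."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1   # predicted class gains a false positive
            fn[t] += 1   # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for c in labels:
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical labels using three of the study's class abbreviations.
y_true = ["HC", "HC", "HC", "HC", "BC", "BC", "LC", "LC"]
y_pred = ["HC", "HC", "HC", "BC", "BC", "BC", "LC", "HC"]
metrics = macro_metrics(y_true, y_pred)
```

Under macro-averaging, the F1-score is the mean of the per-class F1 values, so it need not equal the harmonic mean of the averaged precision and recall.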
CONCLUSIONS: Our study offers a highly effective serum SERS-based approach for multi-cancer early detection, which may shed new light on cancer screening in clinical practice.

© 2025. The Author(s).
BMC Medicine, 2025-02-23