Deep learning-based model shows promise for personalizing ESCC treatment


A retrospective analysis shows the ability of Vision-Mamba (Vim), a novel deep-learning (DL) model, to predict pathologic complete response (pCR) in individuals with esophageal squamous cell carcinoma (ESCC) following neoadjuvant immunotherapy and chemotherapy (nICT).
“Accurate prediction of pCR following nICT is crucial for tailoring patient care in ESCC … [Hence,] we developed a DL-based model for the early assessment of pCR in patients with ESCC who received nICT,” said the researchers.
“By integrating voxel-level radiomics feature maps and CT images, our model accurately predicted pCR, achieving favourable area under the curve (AUC), high accuracy, sensitivity, and specificity in one internal independent validation cohort and two external independent validation cohorts,” they continued.
The study included 741 patients (median age 65 years, 92 percent women) with ESCC who underwent nICT followed by radical esophagectomy at three institutions in China. Patients from one centre were divided into a training set (n=469) and an internal validation set (n=118), while data from the other two centres served as external validation sets (n=120 and n=34, respectively).
Immunotherapy consisted of standard doses (200 mg Q3W per cycle) of PD-1/PD-L1 monoclonal antibodies, and chemotherapy regimens were platinum-based. pCR was defined as the absence of residual viable tumour at the primary tumour site. [J Immunother Cancer 2025;13:e011149]
In the training set, Vim achieved an accuracy of 0.91, an AUC of 0.92, a sensitivity of 0.94, and a specificity of 0.91. Performance in the three validation sets was comparable, with an accuracy of 0.83–0.91, an AUC of 0.83–0.86, a sensitivity of 0.73–0.82, and a specificity of 0.84–1.00.
“[B]y ensuring high specificity, the model helps accurately identify patients who do not achieve pCR and need timely surgical intervention. This avoids the risk of misclassifying non-pCR patients as pCR, which could delay necessary treatment and compromise long-term outcomes,” the researchers noted.
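For readers less familiar with these metrics, the short Python sketch below shows how accuracy, sensitivity, specificity, and AUC are conventionally computed for a binary pCR classifier. It is illustrative only and is not the study's code: the function names, the 0.55 decision threshold, and the ten patient labels and scores are hypothetical, with pCR coded as 1 and non-pCR as 0.

```python
import math


def classification_metrics(y_true, scores, threshold=0.55):
    """Accuracy, sensitivity and specificity at a given threshold.

    y_true: 1 = pCR, 0 = non-pCR (hypothetical coding).
    scores: predicted probability of pCR for each patient.
    """
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(num, den):
        # Return NaN rather than raising when a class is absent.
        return num / den if den else math.nan

    return {
        "accuracy": ratio(tp + tn, len(y_true)),
        "sensitivity": ratio(tp, tp + fn),  # share of pCR patients detected
        "specificity": ratio(tn, tn + fp),  # share of non-pCR patients detected
    }


def auc(y_true, scores):
    """AUC as the probability that a randomly chosen pCR patient
    scores higher than a randomly chosen non-pCR patient (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        return math.nan
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical cohort of 10 patients (4 pCR, 6 non-pCR); not study data.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
probs = [0.9, 0.8, 0.7, 0.4, 0.1, 0.2, 0.3, 0.35, 0.5, 0.6]

metrics = classification_metrics(labels, probs)
metrics["auc"] = auc(labels, probs)
print(metrics)
```

In this coding, a false positive is a non-pCR patient labelled as pCR, which is the misclassification the researchers describe as risking delayed surgery. High specificity keeps that error rare.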
Furthermore, the model’s ability to provide accurate risk stratification may allow for individualized treatment approaches. For instance, a more aggressive consolidation therapy may be recommended for high-risk patients, whereas those at low risk can avoid overtreatment, they added.
Vim also outperformed other DL models and traditional radiomics methods, showing higher accuracy, sensitivity, and specificity across multiple validation cohorts, they said.
Clinical implications
Neoadjuvant chemoradiotherapy has been the standard of care for locally advanced esophageal cancer. [J Clin Oncol 2021;39:1995-2004] However, evidence on omitting radiotherapy (RT) from the neoadjuvant regimen has shown better outcomes. [J Clin Oncol 2024;42:17]
“[A]lthough the optimal neoadjuvant treatment strategy remains uncertain, current evidence and our clinical experience suggest that nICT holds significant potential. This promising approach underpins the rationale for our study,” the investigators said.
They noted that the study underlines the potential of voxel-level radiomics plus DL for enhancing clinical decision-making in the treatment of ESCC. “By combining voxel-level radiomics, the model provides a more detailed and robust prediction of tumour response, offering a promising tool for personalized treatment strategies.”
The ability of Vim to predict pCR may guide clinicians in selecting candidates for organ-preserving strategies (eg, watch-and-wait approach), which, in turn, could minimize unnecessary surgeries and improve quality of life.
“Future research may focus on validating this model in larger, prospective trials and exploring its integration with other predictive biomarkers, potentially influencing clinical practice guidelines for the treatment of ESCC,” they concluded.