Prediction of tumor origin in cancers of unknown primary origin with cytology-based deep learning

Framework of the TORCH Model

Background

Cancer of unknown primary (CUP) is a malignant disease that is confirmed as metastatic by histopathology but whose primary site cannot be identified with conventional baseline diagnostic methods. CUP poses significant diagnostic and therapeutic challenges in clinical practice and is estimated to account for 3-5% of all human cancers. Adenocarcinoma is the most common pathological type, followed by squamous cell carcinoma and undifferentiated carcinoma. Despite a range of combination chemotherapy regimens, the overall prognosis remains extremely poor, with only 20% of patients reaching a median survival time of 10 months. CUP is characterized by early spread, aggressive clinical presentation, and multi-organ involvement.

Immunohistochemistry (IHC) is usually the key method for predicting the likely primary site of CUP. However, even with a panel of approximately 20 different immunostains, the primary site can be accurately located in fewer than 30% of CUP cases. More accurate prediction of the primary site is therefore crucial for effective, personalized treatment.

Paper Source

This paper was authored by Fei Tian and others from institutions including the Cancer Institute and First Affiliated Hospital of Tianjin Medical University, the First Affiliated Hospital of Soochow University, the First Affiliated Hospital of Zhengzhou University, and Harvard University in the United States. The study was submitted in May 2023, accepted in March 2024, and published online in April 2024 in “Nature Medicine”.

Research Purpose

This study aims to develop a deep learning model that predicts the primary site of tumors from cytology images. Building on traditional cytological analysis, the research team developed TORCH (tumor origin differentiation using cytological histology), a model that diagnoses malignancy and predicts the primary site, using 57,220 cytology images from four tertiary hospitals.

Research Procedures

Data Collection and Processing

The study collected 90,572 cytology smear images from 76,183 patients at four large hospitals between June 2010 and October 2023. After excluding 24,808 malignant images for which the primary site could not be determined, the final dataset consisted of 57,220 images from 43,688 patients. The training set included 29,883 images from 20,638 patients and covered 12 tumor types; it comprised 19,406 tumor images and 10,477 benign disease images. The three internal test sets, from hospitals in Tianjin, Zhengzhou, and Suzhou, contained 12,799 images in total, and the two external test sets, from hospitals in Tianjin and Yantai, contained 14,538 images in total.
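As a quick consistency check, the split sizes reported above can be tallied in a few lines of Python. This is only an illustrative sketch: the dictionary keys are names chosen here, not identifiers from the study.

```python
# Image counts as reported in the text; the key names are illustrative only.
splits = {
    "training": 29_883,
    "internal_test": 12_799,  # Tianjin, Zhengzhou, Suzhou
    "external_test": 14_538,  # Tianjin, Yantai
}
training_breakdown = {"tumor": 19_406, "benign": 10_477}

total_images = sum(splits.values())
test_images = splits["internal_test"] + splits["external_test"]

# The three splits add up to the final dataset, and the tumor and benign
# images add up to the training set.
assert total_images == 57_220
assert sum(training_breakdown.values()) == splits["training"]

for name, count in splits.items():
    print(f"{name:>14}: {count:>6} images ({count / total_images:.1%})")
print(f"{'all test sets':>14}: {test_images:>6} images")
```

The combined test-set size computed here is the same 27,337 figure that appears in the validation results.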

Model Development and Validation

The team trained four different deep neural networks on three different input types, yielding twelve models (see the Methods section of the paper). These models were combined through model ensembling to enhance the generalization capability and interoperability of TORCH. Across the three internal and two external test sets, comprising 27,337 test images in total, TORCH achieved an average AUROC of 0.969. Specifically, the values were 0.953, 0.962, and 0.979 for the internal test sets from Tianjin, Zhengzhou, and Suzhou, respectively, and 0.958 and 0.978 for the two external test sets.
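The idea of combining twelve models can be illustrated with a minimal sketch. It assumes simple probability averaging (soft voting), which is one common ensembling scheme; this summary does not state which scheme the authors used. The network names, input-type names, and toy probabilities below are all hypothetical placeholders.

```python
from itertools import product

# Hypothetical labels: the summary does not name the networks or input types.
NETWORKS = ["net_a", "net_b", "net_c", "net_d"]
INPUT_TYPES = ["input_1", "input_2", "input_3"]


def average_ensemble(model_probs):
    """Average per-class probabilities over all models (soft voting)."""
    n_models = len(model_probs)
    n_classes = len(model_probs[0])
    return [sum(p[c] for p in model_probs) / n_models for c in range(n_classes)]


def rank_classes(probs):
    """Return class indices ordered from most to least probable."""
    return sorted(range(len(probs)), key=lambda c: probs[c], reverse=True)


# One ensemble member per (network, input type) pair: 4 x 3 = 12 models.
members = list(product(NETWORKS, INPUT_TYPES))

# Toy three-class outputs for a single image (not data from the study).
toy_outputs = [
    [0.7, 0.2, 0.1] if i % 2 == 0 else [0.3, 0.5, 0.2]
    for i in range(len(members))
]

ensemble = average_ensemble(toy_outputs)
print(len(members), "models")
print("averaged probabilities:", [round(p, 3) for p in ensemble])
print("ranked classes:", rank_classes(ensemble))
```

Averaging probabilities tends to smooth out the errors of individual members, which is the usual motivation for ensembling when generalization across hospitals matters.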

Experimental Results

  • Cancer Diagnosis Performance: The TORCH model achieved an AUROC of 0.974, accuracy of 92.6%, sensitivity of 92.8%, and specificity of 92.4% across the five test sets.
  • Tumor Primary Site Prediction Performance: The overall AUROC was 0.969 (95% CI: 0.967–0.970), with a Top-1 accuracy of 82.6% and a Top-3 accuracy of 98.9%. The model achieved the highest AUROC values of 0.930, 0.962, and 0.960, and the second-highest AUROC values of 0.799, 0.905, and 0.947.
  • Comparison with Pathologists: TORCH’s predictive performance was significantly better than that of pathologists, and TORCH assistance notably improved the diagnostic scores of junior pathologists.
  • Survival Analysis: CUP patients receiving treatment consistent with TORCH-predicted primary sites had longer survival times compared to those receiving inconsistent treatment (27 months vs. 17 months, p=0.006).
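The metrics quoted in the results above (Top-1 and Top-3 accuracy, sensitivity, specificity) can be made concrete with a short sketch. The function names, site labels, and toy predictions are hypothetical and are not taken from the study; the code only shows how such metrics are conventionally computed.

```python
def top_k_accuracy(ranked_predictions, true_labels, k):
    """Fraction of cases whose true label is among the k highest-ranked classes."""
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)


def binary_metrics(predicted, actual):
    """Accuracy, sensitivity, and specificity (1 = malignant, 0 = benign)."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p == 1 and a == 1)
    tn = sum(1 for p, a in pairs if p == 0 and a == 0)
    fp = sum(1 for p, a in pairs if p == 1 and a == 0)
    fn = sum(1 for p, a in pairs if p == 0 and a == 1)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Toy ranked site predictions for four cases (hypothetical labels).
ranked = [
    ["lung", "breast", "ovary"],
    ["breast", "lung", "stomach"],
    ["stomach", "ovary", "lung"],
    ["ovary", "stomach", "breast"],
]
truth = ["lung", "lung", "lung", "colon"]

top1 = top_k_accuracy(ranked, truth, k=1)
top3 = top_k_accuracy(ranked, truth, k=3)
metrics = binary_metrics(predicted=[1, 1, 0, 0, 1, 0], actual=[1, 1, 1, 0, 0, 0])

print("top-1:", top1, "top-3:", top3)
print(metrics)
```

Top-3 accuracy counts a case as correct when the true primary site appears anywhere among the three most probable predictions, which is why it is always at least as high as Top-1 accuracy.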

Research Conclusion

By diagnosing cancer and locating primary sites from cytology images, the TORCH model shows potential as an auxiliary tool in clinical practice. Although validation in future randomized trials is still needed, the study provides support for the clinical management and personalized treatment of CUP patients.

Research Highlights

  1. Innovation and Application Value: This study is the first to apply deep learning image analysis to efficient, cytology-based CUP diagnosis, significantly improving diagnostic and predictive accuracy.
  2. Large Data Volume and Extensive Coverage: The use of a large dataset of cytology images from four medical centers strengthens the reliability and generalizability of the results.
  3. Intelligent Auxiliary Diagnosis: The TORCH model helps junior pathologists achieve diagnostic accuracy close to that of senior pathologists, reducing diagnosis time and costs.
  4. Longer Patient Survival: CUP patients whose treatment was consistent with the TORCH-predicted primary site survived longer than those whose treatment was inconsistent (27 months vs. 17 months), which suggests practical clinical value.

Further Research Directions

Although the TORCH model has shown good preliminary results, future research could integrate more types of clinical data (such as genetic data and radiographic images) to further improve model performance. Its generalizability should also be validated in patients from different ethnic groups and regions.

Summary

This study proposes and validates a deep learning-based tumor primary site prediction model, providing new ideas and methods for the diagnosis and treatment of CUP. It also serves as an important case for the application of artificial intelligence in medical diagnosis.