A computed tomography-based deep learning model for non-invasively predicting World Health Organization (WHO)/International Society of Urological Pathology (ISUP) pathological grades of clear cell renal cell carcinoma (ccRCC): a multicenter cohort study
Original Article


Ting Huang1, Mang Ke2, Qing Liu3, Mingliang Ying4, Meiling Hu4, Xiaodan Fu5, Yang Hu1, Min Xu1

1Department of Urology, Affiliated Jinhua Hospital, Zhejiang University School of Medicine, Jinhua, China; 2Department of Urology, Affiliated Taizhou Hospital, Wenzhou Medical University, Taizhou, China; 3Department of Urology, Affiliated Jinhua Hospital, Wenzhou Medical University, Jinhua, China; 4Department of Radiology, Affiliated Jinhua Hospital, Zhejiang University School of Medicine, Jinhua, China; 5Department of Pathology, Affiliated Jinhua Hospital, Zhejiang University School of Medicine, Jinhua, China

Contributions: (I) Conception and design: T Huang, M Ke; (II) Administrative support: M Xu; (III) Provision of study materials or patients: T Huang, M Ke, Q Liu; (IV) Collection and assembly of data: T Huang, M Ke, Q Liu; (V) Data analysis and interpretation: M Ying, M Hu, X Fu; (VI) Manuscript writing: All authors; (VII) Final approval of manuscript: All authors.

Correspondence to: Min Xu, BM, BCh. Department of Urology, Affiliated Jinhua Hospital, Zhejiang University School of Medicine, 365 East Renmin Rd., Jinhua, China. Email: xumintongxun@163.com.

Background: Clear cell renal cell carcinoma (ccRCC) is the most common and aggressive subtype of kidney cancer, commonly exhibiting significant morphological heterogeneity in its pathological characteristics. The objective of this study is to develop a deep learning (DL) model for predicting pathological grades of ccRCC based on contrast-enhanced computed tomography (CECT).

Methods: Retrospective data were collected from 483 ccRCC patients across three medical centers. Arterial phase and portal venous phase computed tomography (CT) images from the dataset were segmented for renal tumors and kidneys. Three convolutional neural networks (CNNs) were employed to extract features from the regions of interest (ROIs) in the CT images across multiple dimensions, including three-dimensional (3D), two-and-a-half-dimensional (2.5D), and two-dimensional (2D). Least absolute shrinkage and selection operator (LASSO) regression was used for feature selection. The models were evaluated using receiver operating characteristic (ROC) curves and decision curve analysis (DCA).

Results: Two 2.5D tumor DL models, based on ResNet-34 and ShuffleNet_v2, were selected; both had areas under the curve (AUCs) greater than 0.72 in the training set as well as in the internal and external test sets. The best model, resulting from the fusion of tumor and kidney models, achieved an AUC of 0.777 (95% confidence interval: 0.704–0.839, P<0.001) in the total test set, showing improved predictive ability compared to the tumor-alone models. DCA demonstrated the clinical utility of the model.

Conclusions: The DL model based on CT achieved satisfactory results in predicting the pathological grades of ccRCC.

Keywords: Tumor grading; deep learning (DL); clear cell renal cell carcinoma (ccRCC)


Submitted Mar 20, 2025. Accepted for publication Jun 18, 2025. Published online Jul 25, 2025.

doi: 10.21037/tau-2025-222


Highlight box

Key findings

• Our study developed a deep learning (DL) model that combines tumor and kidney features for predicting the pathological grade of clear cell renal cell carcinoma (ccRCC). This model predicted ccRCC grade better than models using tumor features alone [area under the curve (AUC): 0.777 vs. 0.748].

What is known and what is new?

• Some researchers have used convolutional neural network (CNN) models for the automatic segmentation and qualitative prediction of renal tumors, but results have been variable, and there is a lack of comprehensive exploration of multidimensional models using various types of CNNs for renal tumors and kidney tissues.

• During the investigation, we employed DL models based on multiple CNNs to explore and extract features from regions of interest (ROIs) in computed tomography (CT) images across various dimensions, including three-dimensional (3D), two-and-a-half-dimensional (2.5D), 2.5D image fusion, two-dimensional (2D), and traditional radiomics. A comprehensive analysis and comparison were conducted to obtain the best model with the optimal dimension. This study provides insights for future ccRCC radiomics research.

What is the implication, and what should change now?

• This study demonstrates that, in the context of selecting appropriate model types, a DL model combining 2.5D image features of renal tumors with 3D kidney features from contrast-enhanced CT scans shows significantly superior performance in predicting renal tumor pathological grading compared to traditional models relying solely on 2D or 3D tumor images. A larger dataset would improve model performance, thereby facilitating routine clinical implementation.


Introduction

Clear cell renal cell carcinoma (ccRCC) is the most common type of renal tumor, accounting for about 70–75% of all renal tumors (1,2). Different histopathological grades of ccRCC require different treatment strategies and have varying prognoses (3). However, fine needle aspiration, the gold standard for preoperative pathological diagnosis of renal tumors, carries risks of tumor tract seeding and tumor rupture (4), and is not widely recommended for preoperative diagnosis. Therefore, a non-invasive and reliable method for preoperative prediction of ccRCC pathological grades is needed to inform individualized treatment plans (5).

In recent years, deep learning (DL) has gained significant attention in the field of medical imaging. Various DL models based on convolutional neural networks (CNNs) have been developed and widely used for the automatic diagnosis and analysis of medical images (6,7). Some researchers have used CNN models for the automatic segmentation and qualitative prediction of renal tumors (8,9), but results have been variable, and there is a lack of comprehensive exploration of multidimensional models using various types of CNNs for renal tumors and kidney tissues. This study aims to develop and validate the best DL model based on a multi-center dataset and assess its performance. We present this article in accordance with the TRIPOD reporting checklist (available at https://tau.amegroups.com/article/view/10.21037/tau-2025-222/rc).


Methods

Study patients

This study recruited three cohorts of patients with pathologically diagnosed ccRCC, including a total of 483 patients who underwent nephrectomy or partial nephrectomy. The detailed inclusion and exclusion criteria are shown in Figure 1. Notably, to avoid the impact of abnormal kidney morphology on feature extraction, patients with special renal conditions such as horseshoe kidney, polycystic kidney, and duplex kidney were excluded. The basic clinical information of the included patients is shown in Table 1. Pathological data were obtained from postoperative pathology reports in the hospital information system. All ccRCC grades were reconfirmed by an independent pathologist with 10 years of experience in urogenital pathology according to the 2016 World Health Organization (WHO) grading criteria (10). For tumors with mixed pathological grades, the grade was defined by the area of highest pathological grade within the tumor (11). WHO/International Society of Urological Pathology (ISUP) grades I and II were classified as the low-grade group, and grades III and IV as the high-grade group. The study protocol was registered retrospectively with ClinicalTrials.gov (https://clinicaltrials.gov; Clinical Trial Number: NCT06559046). The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The study was approved by the Institutional Review Board of Affiliated Jinhua Hospital, Zhejiang University School of Medicine (approval No. 2024-107), and informed consent was obtained from all individual participants. The Affiliated Taizhou Hospital, Wenzhou Medical University, and the Affiliated Jinhua Hospital, Wenzhou Medical University were also informed of and agreed to the study.

Figure 1 Flowchart of patient recruitment. Center 1: Affiliated Jinhua Hospital, Zhejiang University School of Medicine; Center 2: Affiliated Taizhou Hospital, Wenzhou Medical University; Center 3: Affiliated Jinhua Hospital, Wenzhou Medical University. ccRCC, clear cell renal cell carcinoma; CT, computed tomography; HG, high grade; LG, low grade.

Table 1

Baseline characteristics of patients with clear cell renal cell carcinoma

Characteristics Training set (n=326) Internal test set (n=37) External test set 1 (n=89) External test set 2 (n=31)
Age (years) 59.1±11.6 61.6±10.9 59.2±10.7 60.0±11.8
Sex
   Male 211 (64.7) 23 (62.2) 53 (59.6) 22 (71.0)
   Female 115 (35.3) 14 (37.8) 36 (40.4) 9 (29.0)
Tumor size (cm) 4.0±2.1 4.0±2.8 3.9±2.4 2.7±1.8
T stage
   T1 287 (88.0) 33 (89.2) 81 (91.0) 30 (96.8)
   T2 20 (6.1) 1 (2.7) 7 (7.9) 1 (3.2)
   T3 14 (4.3) 2 (5.4) 1 (1.1) 0
   T4 5 (1.5) 1 (2.7) 0 0
WHO/ISUP grade
   Low grade 259 (79.4) 29 (78.4) 80 (89.9) 27 (87.1)
    G1 43 (13.2) 4 (10.8) 13 (14.6) 23 (74.2)
    G2 216 (66.3) 25 (67.6) 67 (75.3) 4 (12.9)
   High grade 67 (20.6) 8 (21.6) 9 (10.1) 4 (12.9)
    G3 46 (14.1) 6 (16.2) 7 (7.9) 3 (9.7)
    G4 21 (6.4) 2 (5.4) 2 (2.2) 1 (3.2)

Data are presented as mean ± standard deviation or n (%). ISUP, International Society of Urological Pathology; T, tumor; WHO, World Health Organization.

CT examination and image preprocessing

This study used the arterial phase and portal venous phase images of contrast-enhanced renal CT to establish DL models and perform internal and external testing. The CT scanning parameters for both phases were as follows: tube voltage, 120 kV; slice thickness, 5 mm. All DICOM image files were imported into 3D Slicer (http://www.slicer.org) for annotation. After registering the images of the two phases, two radiologists, each with 10 years of experience (M.Y. and M.H.), independently annotated the tumors and kidneys. During annotation, renal cysts were excluded from the kidney annotation area. After annotation, the images and masks were resampled to a fixed resolution of 1 mm × 1 mm × 1 mm. The intensity distribution was then standardized according to the window level/window width, and all images were normalized using z-score normalization.
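The resampling and normalization steps described above can be sketched as follows. This is a minimal illustration with numpy/scipy; the study's actual tooling, window level/width values, and interpolation order are not reported, so those settings are assumptions.

```python
import numpy as np
from scipy.ndimage import zoom

def preprocess_ct(volume, spacing, target_spacing=(1.0, 1.0, 1.0),
                  window_level=40.0, window_width=400.0):
    """Resample a CT volume to isotropic 1 mm spacing, clip intensities
    to a display window, and z-score normalize (illustrative sketch)."""
    # Resample: scale factors map the original voxel grid to 1 mm^3.
    factors = [s / t for s, t in zip(spacing, target_spacing)]
    resampled = zoom(volume.astype(np.float32), factors, order=1)
    # Window the intensities (level/width -> [lo, hi] HU range).
    lo = window_level - window_width / 2
    hi = window_level + window_width / 2
    clipped = np.clip(resampled, lo, hi)
    # Z-score normalization.
    return (clipped - clipped.mean()) / (clipped.std() + 1e-8)

vol = np.random.default_rng(0).normal(40, 100, size=(20, 32, 32))
out = preprocess_ct(vol, spacing=(5.0, 0.7, 0.7))
print(out.shape)  # (100, 22, 22): 5 mm slices resampled to 1 mm
print(out.mean(), out.std())  # approximately 0 and 1 after z-scoring
```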

Model construction and validation

Before constructing the model, all patients were stratified into cohorts by medical center (Figure 1). Patients from the Affiliated Jinhua Hospital, Zhejiang University School of Medicine were divided into a training cohort and an internal validation cohort in a 9:1 ratio, while all patients from the Affiliated Taizhou Hospital and the Affiliated Jinhua Hospital of Wenzhou Medical University formed the external validation cohorts.
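The 9:1 split can be illustrated with a small numpy sketch that preserves the high-grade/low-grade ratio in each subset. The exact splitting tool used in the study is not stated; the per-class counts below follow Table 1 for center 1, and the function name is ours.

```python
import numpy as np

def stratified_split(labels, test_frac=0.1, seed=42):
    """Split patient indices 9:1 while preserving the class ratio
    (a sketch; the study's actual splitting procedure is unstated)."""
    rng = np.random.default_rng(seed)
    train_idx, test_idx = [], []
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        rng.shuffle(idx)
        n_test = max(1, int(round(len(idx) * test_frac)))
        test_idx.extend(idx[:n_test].tolist())
        train_idx.extend(idx[n_test:].tolist())
    return sorted(train_idx), sorted(test_idx)

# 363 center-1 patients: 288 low-grade (0) and 75 high-grade (1), per Table 1.
labels = np.array([0] * 288 + [1] * 75)
train, test = stratified_split(labels)
print(len(train), len(test))  # 326 37, matching the paper's cohort sizes
```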

For tumors, we constructed a radiomics model (a) and four DL models of different dimensions: a three-dimensional (3D) model (b); a two-and-a-half-dimensional (2.5D) model (c), which simultaneously extracts features from the largest ROI slice and the slices 2 mm above and below it (12); a 2.5D Image_fusion model (d), formed by fusing the three slices of the 2.5D model into a single .npy file and then extracting features from the fused file; and a two-dimensional (2D) model (e). For kidneys, because the largest tumor ROI slice contains no kidney tissue in many patients with large tumor volumes, making 2D and 2.5D models unsuitable, we constructed a 3D model only. We initially attempted to integrate clinical information (sex, age, tumor diameter, and tumor-node-metastasis [TNM] stage) into the feature extraction phase; however, the resulting model performed worse on the test set than the imaging-only model (Figure S1), so clinical features were excluded. The workflow of our study is illustrated in Figure 2.
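The 2.5D input construction described above can be sketched as follows, assuming 1 mm isotropic volumes so that a 2 mm offset corresponds to two slices. Function and variable names are illustrative, not from the study's code.

```python
import numpy as np
import os, tempfile

def extract_25d(volume, mask, offset_mm=2, spacing_mm=1.0):
    """Build a 3-channel 2.5D input: the axial slice with the largest
    tumor ROI plus the slices offset_mm above and below it (sketch)."""
    areas = mask.sum(axis=(1, 2))         # tumor area on each axial slice
    k = int(np.argmax(areas))             # index of the largest-ROI slice
    d = max(1, int(round(offset_mm / spacing_mm)))
    lo, hi = max(k - d, 0), min(k + d, volume.shape[0] - 1)
    return np.stack([volume[lo], volume[k], volume[hi]], axis=0)

rng = np.random.default_rng(1)
vol = rng.normal(size=(40, 64, 64))
mask = np.zeros_like(vol)
mask[18:25, 20:40, 20:40] = 1
mask[21, 25:35, 25:35] += 1               # slice 21 has the largest ROI area
x = extract_25d(vol, mask)
print(x.shape)                            # (3, 64, 64)

# The "2.5D Image_fusion" variant simply stores this stack as one .npy file
# and extracts features from the fused array instead of per-slice:
np.save(os.path.join(tempfile.gettempdir(), "fused_25d.npy"), x)
```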

Figure 2 The overall pipeline of this study. (A) ROI segmentation. (B) Training procedure. Model a: construction process for the traditional radiomics machine learning model. Models b, c, d, e: construction process for the deep learning models, using 3D-, 2.5D-, and 2D-based CT images as inputs to different models, including three convolutional neural networks; the predicted probabilities of high grade or low grade are the outputs. Additionally, Grad-CAM was used to visualize the decision-making process of the model. (C) Model test. (D) Combined model test. 2D, two-dimensional; 2.5D, two-and-a-half-dimensional; 3D, three-dimensional; CT, computed tomography; Grad-CAM, gradient-weighted class activation mapping.

Model construction workflow

The model construction workflow comprises three steps: feature extraction, feature selection using least absolute shrinkage and selection operator (LASSO) regression, and building the WHO/ISUP grading model. Our study utilized three CNNs: two ResNet models (ResNet-34 and ResNet-50) and ShuffleNet_v2. To enhance model interpretability, the gradient information of the last convolutional layer of each CNN was weighted and fused to obtain a gradient-weighted class activation mapping (Grad-CAM) (Figure S2), which highlights the regions of the image most important to the classification (13). The tumor radiomics model and the 12 DL models constructed using three CNNs across four tumor dimensions were evaluated, and the best model was selected from these 13 tumor models. Similarly, the three kidney models constructed using the three CNNs were evaluated, and the best kidney model was selected. The features of the best kidney and tumor models were then fused to form the final model, which was evaluated using receiver operating characteristic (ROC) curves and decision curve analysis (DCA). Various algorithms, including logistic regression (LR), support vector machine (SVM), ExtraTrees, LightGBM, AdaBoost, and multilayer perceptron (MLP), were used in the model evaluation process, with 10-fold cross-validation.
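The LASSO feature-selection step might look like the following scikit-learn sketch. The data here are synthetic (the real 512 deep features and tuning details are not reproduced), and only features receiving non-zero coefficients survive.

```python
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the deep-feature matrix: 326 patients x 512
# features, with the binary grade label driven by the first two features.
rng = np.random.default_rng(0)
X = rng.normal(size=(326, 512))
y = (X[:, 0] - 0.8 * X[:, 1] + rng.normal(scale=0.5, size=326) > 0).astype(float)

# Standardize, then fit LASSO with 10-fold cross-validation to pick lambda.
Xs = StandardScaler().fit_transform(X)
lasso = LassoCV(cv=10, random_state=0).fit(Xs, y)

# Selected features = those with non-zero coefficients at the chosen lambda.
selected = np.flatnonzero(lasso.coef_)
print(selected.size, "features retained out of", X.shape[1])
```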

Statistical analysis

The categorical and continuous baseline characteristics of the patients were described using frequency (percentage) and mean ± standard deviation, respectively. A sample size estimation was conducted in R before the study (Appendix 1). The receiver operating characteristic (ROC) curve, AUC, and decision curve analysis (DCA) were primarily used to evaluate the efficiency of the models. Python 3.11.4 was used for statistical analysis and data visualization.
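Decision curve analysis reduces to the net-benefit formula NB(t) = TP/n - (FP/n) * t/(1 - t), evaluated over threshold probabilities t. A minimal sketch with toy data (plotting omitted; function name is ours):

```python
import numpy as np

def net_benefit(y_true, y_prob, thresholds):
    """Net benefit of treating patients whose predicted probability
    exceeds each threshold t: NB(t) = TP/n - FP/n * t/(1-t)."""
    y_true = np.asarray(y_true)
    y_prob = np.asarray(y_prob)
    n = len(y_true)
    nb = []
    for t in thresholds:
        pred = y_prob >= t                     # treat if prob >= threshold
        tp = np.sum(pred & (y_true == 1))      # true positives
        fp = np.sum(pred & (y_true == 0))      # false positives
        nb.append(tp / n - fp / n * t / (1 - t))
    return np.array(nb)

y = np.array([1, 1, 0, 0, 0])
p = np.array([0.9, 0.6, 0.55, 0.2, 0.1])
ts = np.array([0.1, 0.3, 0.5])
print(net_benefit(y, p, ts))  # net benefit falls as the threshold rises
```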


Results

Patient characteristics

Baseline information is summarized in Table 1. A total of 483 patients were included in the study, divided into a training set, internal test set, external test set 1, and external test set 2 (326, 37, 89, and 31 patients, respectively). The mean age of the entire cohort was 59.34±11.4 years, and the mean maximum tumor diameter was 3.87±2.22 cm. Of the patients, 64% (n=309) were male, and 89.2% (n=431) had stage I tumors. High-grade pathology was observed in 18.2% (n=88) of patients overall and in 20.6% (n=67) of the training set, mitigating overfitting toward the low-grade group caused by class imbalance (14).

Development and testing of tumor models

During training of the radiomics model, variance analysis and Student's t-test were performed on the training data, retaining 512 tumor features as candidates for LASSO analysis, which narrowed them down to 16 features. A total of 12 DL models for the tumor were constructed using three CNNs (ResNet-34, ResNet-50, and ShuffleNet_v2) based on tumor images of four dimensions (3D, 2.5D, 2.5D Image_fusion, and 2D). By comparing the ROC curves of the training set and the internal and external test sets (Figure S3), two superior models were identified: a 2.5D tumor model based on ResNet-34 and a 2.5D tumor model based on ShuffleNet_v2. Both had AUCs greater than 0.72 in the training set as well as in the internal test set and two external test sets. The AUCs for the 2.5D ResNet-34 tumor model in the training set, internal test set, external test set 1, and external test set 2 were 0.857, 0.853, 0.736, and 0.722, respectively; those for the 2.5D ShuffleNet_v2 tumor model were 0.938, 0.756, 0.753, and 0.722, respectively.

Development and testing of kidney models

Three kidney models were constructed using three CNNs (ResNet-34, ResNet-50, and ShuffleNet_v2) based on 3D kidney images. By comparing the prediction results of the training set and test sets (Table 2), the kidney model constructed based on ResNet-50 was identified as the superior model.

Table 2

Diagnostic performance of three different models constructed for the kidney

Model Accuracy Sensitivity Specificity AUC (95% CI)
Training set
   ResNet-34 0.794 0.582 0.849 0.774 (0.724–0.818)
   ResNet-50 0.791 0.746 0.803 0.869 (0.827–0.904)
   ShuffleNet 0.773 0.731 0.784 0.832 (0.462–0.573)
Test set
   ResNet-34 0.764 0.524 0.801 0.669 (0.589–0.741)
   ResNet-50 0.605 0.714 0.588 0.697 (0.619–0.768)
   ShuffleNet 0.624 0.619 0.625 0.636 (0.540–0.697)

AUC, area under the curve; CI, confidence interval.

Development and testing of combined models

The features extracted by the two superior tumor models were each combined with the features of the kidney ResNet-50 model. After fusion, feature selection was performed again, yielding two models: the combined1 model, obtained by merging the 2.5D tumor ResNet-34 model with the kidney ResNet-50 model, and the combined2 model, obtained by merging the 2.5D tumor ShuffleNet_v2 model with the kidney ResNet-50 model. The ROC curves of the training set and the internal and external test sets for the two tumor models, the kidney model, the combined1 model, and the combined2 model were plotted (Figure S4). In the total test set, the AUC of the combined1 model reached 0.777, showing a diagnostic advantage over the combined2 model and the other models. Different algorithms were used to evaluate the models, all yielding good results (Figure 3). The combined1 model included 14 features, comprising 9 tumor features and 5 kidney features (Figure 4); automated prediction of pathological grade is thus achievable through quantitative analysis of these 14 CT-derived features. DCA curves were drawn to evaluate the combined1 model (Figure 5), indicating that more than 40% of patients in the test set might benefit from this model.

Figure 3 ROC curves plotted based on different algorithms for the combined1 model’s performance on the test set. AUC, area under the curve; CI, confidence interval; LightGBM, light gradient boosting machine; LR, logistic regression; MLP, multilayer perceptron; ROC, receiver operating characteristic; SVM, support vector machine.
Figure 4 Texture feature selection using the LASSO. Selection of the tuning parameter (λ) in the LASSO model via 10-fold cross-validation based on minimum criteria. The optimal λ value of 0.045 was selected. The 14 resulting features with the highest absolute coefficients were indicated in the plot, and the x-axis represents the coefficient of features. +2: plane 2 millimeters above the ROI plane; −2: plane 2 millimeters below the ROI plane. A, arterial phase features; DL, deep learning; LASSO, least absolute shrinkage and selection operator; MSE, mean squared error; ROI, region of interest; V, portal venous phase features.
Figure 5 DCA curves of two combined models and individual models for tumor and kidney in the training and testing sets. The combined1 model shows a benefit starting point around 0.07, reaching approximately 0.5, suggesting that more than 40% of patients may benefit from this model. DCA, decision curve analysis.

Discussion

Artificial intelligence is rapidly advancing in many fields, including medicine. With the ongoing development of methods and algorithms, the capabilities of DL have improved rapidly. Since the WHO/ISUP pathological grade of ccRCC is closely related to prognosis (15), non-invasive preoperative prediction of grade can provide valuable information for determining treatment plans and choosing surgical methods. Several studies have achieved promising results in predicting renal tumor pathological grade using DL models based on 2D CT images. For instance, Xu et al. (16) developed a model trained on corticomedullary-phase CT images from 592 patients, achieving an AUC of 0.864 in a test set of 114 patients. Other researchers applied augmentation strategies such as random cropping, affine transformation, Gaussian blur, and Gaussian noise to 2D CT images (17), with the resulting DL model attaining an AUC of 0.854 in external validation. Models based on multiparametric MRI radiomics have also demonstrated satisfactory performance in distinguishing high-grade from low-grade renal tumors (18). With advancing research in tumor radiomics, scholars aim to extract more effective information from imaging data through diverse strategies, such as leveraging multidimensional image data or integrating multimodal clinical and imaging features. Kim et al. developed 2.5D and 3D DL models based on CT images to differentiate invasive pulmonary adenocarcinomas among subsolid nodules (SSNs) (19), finding that the AUC of the 2.5D DenseNet (0.921) was significantly higher than that of the 3D DenseNet (0.835; P=0.037). Wang et al. compared 3D and 2D DL models based on CT imaging for predicting occult lymph node metastasis (LNM) of laryngeal squamous cell carcinoma (LSCC), finding that 3D DL features discriminated better than 2D DL and radiomics features (20). However, current studies on AI- and CT-based prediction of kidney tumor pathological grade mostly use 2D (21,22) or machine learning (23) approaches, with no published work, to our knowledge, on 2.5D.

This retrospective study, based on arterial and portal venous phase CT images, comprehensively explored the use of DL models with different tumor dimensions to distinguish between low-grade and high-grade ccRCC. Through comprehensive comparison, superior tumor and kidney models were selected and fused, resulting in a CNN-based DL model that achieved satisfactory performance.

In recent years, some scholars have explored using radiomics and DL for CT modeling to predict ccRCC. Our study has three highlights. First, in the DL process for tumors, besides the traditional radiomics model, we extracted features from different tumor dimensions, including 3D, 2D, and 2.5D. The 2.5D approach includes the largest cross-section of the tumor and the two cross-sections that are 2 mm away from the largest section. We divided this feature extraction method into two types: one extracts features from each of the three cross-sections separately and then fuses the features (24), and the other merges the three cross-sections into an .npy format file to store multi-dimensional array data and then extracts features. We found through comparison that the 2.5D DL model without image fusion performed better and was more stable than other dimensional DL models and the radiomics model. The second advantage of our study is the use of three types of CNNs, including the ResNet series and the ShuffleNet series. Lin et al. compared the performance of three ResNet series models in distinguishing low-grade and high-grade ccRCC, finding that, in cases of limited training samples, simpler models in the ResNet series performed better in distinguishing low-grade and high-grade ccRCC (25), which is consistent with our finding that the ResNet-34 model outperformed the ResNet-50 model. The ShuffleNet series focuses on the architecture of lightweight deep neural networks, aiming to provide good computational efficiency and accuracy while maintaining high efficiency of the model, making it suitable for models derived from clinical information with limited training samples. Our comparison of the three CNNs showed that both the 2.5D ResNet-34 model and the 2.5D ShuffleNet_v2 model had AUCs greater than 0.72 in the training set as well as in the internal and external test sets. We also compared the three CNN models of kidney tissue. 
As mentioned earlier, in many patients with large kidney tumors, the largest tumor cross-section contains no kidney tissue, while the largest kidney cross-section lies far from the tumor and cannot capture the peritumoral kidney tissue; extracting kidney features from these cross-sections is therefore unsuitable, so we chose a 3D model for kidney tissue. Finally, we combined the 2.5D ResNet-34 and 2.5D ShuffleNet_v2 tumor models with the 3D ResNet-50 kidney model. After further feature screening, we obtained the combined1 and combined2 models. The combined1 model showed the best diagnostic performance in the training set and across the test sets, with AUCs of 0.841 (95% CI: 0.683–0.940), 0.767 (95% CI: 0.664–0.849), 0.787 (95% CI: 0.603–0.913), and 0.777 (95% CI: 0.704–0.839) in the internal test set, external test sets 1 and 2, and the total test set, respectively. Overall, diagnostic accuracy improved compared with the tumor-alone models, indicating that the peritumoral area also provides useful information (26). The DCA curves demonstrated the clinical utility of this model (27), with the combined1 model's benefit range starting at about 0.07 and ending at about 0.5. Its threshold range is the widest of all models, and its net benefit is the highest across most of that range, making it the optimal model.

This study is limited by the sample size and the insufficient number of external validations. The efficacy of machine learning-based predictive models in clinical medicine is influenced by the sample size of the training dataset (28). A larger dataset would improve model performance, thereby facilitating routine clinical implementation. Studies such as this establish the foundation for future algorithmic developments that may enhance treatment outcomes for patients. Furthermore, due to sample size limitations—particularly the small cohort of G4 patients—this study only differentiates between high-grade and low-grade tumors. However, we must acknowledge the clinically significant differences between G1 vs. G2 and G3 vs. G4 tumors. In future work, we intend to achieve more granular tumor classification.

In the selection of 2.5D images, since the minimum tumor diameter of the included patients was 5 mm, we selected cross-sections 2 mm away from the ROI to form three channels. More channels may be more advantageous (19), so our next research direction could be to increase the information provided by adding sagittal and coronal images. Another limitation of this study is the manual segmentation of the ROI, which increases labor and time costs. In future research, we will attempt automatic segmentation of the ROI (29,30).


Conclusions

This study demonstrates that, in the context of selecting appropriate model types, a DL model combining 2.5D image features of renal tumors with 3D kidney features from contrast-enhanced CT scans shows significantly superior performance in predicting renal tumor pathological grading compared to traditional models relying solely on 2D or 3D tumor images. These findings provide valuable insights for the development of DL models in clinical applications.

In summary, the DL-assisted preoperative prediction model for ccRCC pathological grading proposed in this paper shows excellent performance and can provide some reference for the individualized precise treatment of clinical ccRCC patients.


Acknowledgments

We would like to thank KetengEdit (www.ketengedit.com) for its linguistic assistance during the preparation of this manuscript. We also thank platform “One-key AI” for code consultation and statistical consultation of this study.


Footnote

Reporting Checklist: The authors have completed the TRIPOD reporting checklist. Available at https://tau.amegroups.com/article/view/10.21037/tau-2025-222/rc

Data Sharing Statement: Available at https://tau.amegroups.com/article/view/10.21037/tau-2025-222/dss

Peer Review File: Available at https://tau.amegroups.com/article/view/10.21037/tau-2025-222/prf

Funding: This work was supported by Project of the Jinhua Science and Technology Plan (No. 2022-3-106 to T.H.).

Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://tau.amegroups.com/article/view/10.21037/tau-2025-222/coif). The authors have no conflicts of interest to declare.

Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The study was approved by the Institutional Review Board of Affiliated Jinhua Hospital, Zhejiang University School of Medicine (approval No. 2024-107) and informed consent was obtained from all individual participants. The Affiliated Taizhou Hospital, Wenzhou Medical University, and Affiliated Jinhua Hospital, Wenzhou Medical University were also informed and agreed on the study.

Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.


References

  1. Rose TL, Kim WY. Renal Cell Carcinoma: A Review. JAMA 2024;332:1001-10. [Crossref] [PubMed]
  2. Bukavina L, Bensalah K, Bray F, et al. Epidemiology of Renal Cell Carcinoma: 2022 Update. Eur Urol 2022;82:529-42. [Crossref] [PubMed]
  3. Verine J, Colin D, Nheb M, et al. Architectural Patterns are a Relevant Morphologic Grading System for Clear Cell Renal Cell Carcinoma Prognosis Assessment: Comparisons With WHO/ISUP Grade and Integrated Staging Systems. Am J Surg Pathol 2018;42:423-41. [Crossref] [PubMed]
  4. Choy B, Nayar R, Lin X. Role of renal mass biopsy for diagnosis and management: Review of current trends and future directions. Cancer Cytopathol 2023;131:480-94. [Crossref] [PubMed]
  5. Jiang Y, Yang M, Wang S, et al. Emerging role of deep learning-based artificial intelligence in tumor pathology. Cancer Commun (Lond) 2020;40:154-66. [Crossref] [PubMed]
  6. Singh SP, Wang L, Gupta S, et al. 3D Deep Learning on Medical Images: A Review. Sensors (Basel) 2020;20:5097. [Crossref] [PubMed]
  7. Serghiou S, Rough K. Deep Learning for Epidemiologists: An Introduction to Neural Networks. Am J Epidemiol 2023;192:1904-16. [Crossref] [PubMed]
  8. Lee H, Hong H, Kim J, et al. Deep feature classification of angiomyolipoma without visible fat and renal cell carcinoma in abdominal contrast-enhanced CT images with texture image patches and hand-crafted feature concatenation. Med Phys 2018;45:1550-61. [Crossref] [PubMed]
  9. Hsiao CH, Lin PC, Chung LA, et al. A deep learning-based precision and automatic kidney segmentation system using efficient feature pyramid networks in computed tomography images. Comput Methods Programs Biomed 2022;221:106854. [Crossref] [PubMed]
  10. Moch H, Cubilla AL, Humphrey PA, et al. The 2016 WHO Classification of Tumours of the Urinary System and Male Genital Organs-Part A: Renal, Penile, and Testicular Tumours. Eur Urol 2016;70:93-105. [Crossref] [PubMed]
  11. Delahunt B, Cheville JC, Martignoni G, et al. The International Society of Urological Pathology (ISUP) grading system for renal cell carcinoma and other prognostic parameters. Am J Surg Pathol 2013;37:1490-504. [Crossref] [PubMed]
  12. Zeng Y, Zhang X, Kawasumi Y, et al. A 2.5D Deep Learning-Based Method for Drowning Diagnosis Using Post-Mortem Computed Tomography. IEEE J Biomed Health Inform 2023;27:1026-35. [Crossref] [PubMed]
  13. Selvaraju RR, Cogswell M, Das A, et al. Grad-CAM: visual explanations from deep networks via gradient-based localization. Int J Comput Vis 2020;128:336-59.
  14. Jiang X, Hu Z, Wang S, et al. Deep Learning for Medical Image-Based Cancer Diagnosis. Cancers (Basel) 2023;15:3608. [Crossref] [PubMed]
  15. Xiao Q, Yi X, Guan X, et al. Validation of the World Health Organization/International Society of Urological Pathology grading for Chinese patients with clear cell renal cell carcinoma. Transl Androl Urol 2020;9:2665-74. [Crossref] [PubMed]
  16. Xu L, Yang C, Zhang F, et al. Deep Learning Using CT Images to Grade Clear Cell Renal Cell Carcinoma: Development and Validation of a Prediction Model. Cancers (Basel) 2022;14:2574. [Crossref] [PubMed]
  17. Yang M, He X, Xu L, et al. CT-based transformer model for non-invasively predicting the Fuhrman nuclear grade of clear cell renal cell carcinoma. Front Oncol 2022;12:961779. [Crossref] [PubMed]
  18. Li Q, Liu YJ, Dong D, et al. Multiparametric MRI Radiomic Model for Preoperative Predicting WHO/ISUP Nuclear Grade of Clear Cell Renal Cell Carcinoma. J Magn Reson Imaging 2020;52:1557-66. [Crossref] [PubMed]
  19. Kim H, Lee D, Cho WS, et al. CT-based deep learning model to differentiate invasive pulmonary adenocarcinomas appearing as subsolid nodules among surgical candidates: comparison of the diagnostic performance with a size-based logistic model and radiologists. Eur Radiol 2020;30:3295-305. [Crossref] [PubMed]
  20. Wang W, Liang H, Zhang Z, et al. Comparing three-dimensional and two-dimensional deep-learning, radiomics, and fusion models for predicting occult lymph node metastasis in laryngeal squamous cell carcinoma based on CT imaging: a multicentre, retrospective, diagnostic study. EClinicalMedicine 2024;67:102385. [Crossref] [PubMed]
  21. Nie P, Liu S, Zhou R, et al. A preoperative CT-based deep learning radiomics model in predicting the stage, size, grade and necrosis score and outcome in localized clear cell renal cell carcinoma: A multicenter study. Eur J Radiol 2023;166:111018. [Crossref] [PubMed]
  22. Li S, Zhou Z, Gao M, et al. Incremental value of automatically segmented perirenal adipose tissue for pathological grading of clear cell renal cell carcinoma: a multicenter cohort study. Int J Surg 2024;110:4221-30. [Crossref] [PubMed]
  23. Luo S, Wei R, Lu S, et al. Fuhrman nuclear grade prediction of clear cell renal cell carcinoma: influence of volume of interest delineation strategies on machine learning-based dynamic enhanced CT radiomics analysis. Eur Radiol 2022;32:2340-50. [Crossref] [PubMed]
  24. Avesta A, Hossain S, Lin M, et al. Comparing 3D, 2.5D, and 2D Approaches to Brain Image Auto-Segmentation. Bioengineering (Basel) 2023;10:181. [Crossref] [PubMed]
  25. Lin F, Ma C, Xu J, et al. A CT-based deep learning model for predicting the nuclear grade of clear cell renal cell carcinoma. Eur J Radiol 2020;129:109079. [Crossref] [PubMed]
  26. Zhou J, Zhang Y, Chang KT, et al. Diagnosis of Benign and Malignant Breast Lesions on DCE-MRI by Using Radiomics and Deep Learning With Consideration of Peritumor Tissue. J Magn Reson Imaging 2020;51:798-809. [Crossref] [PubMed]
  27. Fenlon C, O’Grady L, Doherty ML, et al. A discussion of calibration techniques for evaluating binary and categorical predictive models. Prev Vet Med 2018;149:107-14. [Crossref] [PubMed]
  28. Goldenholz DM, Sun H, Ganglberger W, et al. Sample Size Analysis for Machine Learning Clinical Validation Studies. Biomedicines 2023;11:685. [Crossref] [PubMed]
  29. Sun P, Mo Z, Hu F, et al. Segmentation of kidney mass using AgDenseU-Net 2.5D model. Comput Biol Med 2022;150:106223. [Crossref] [PubMed]
  30. Lin Z, Cui Y, Liu J, et al. Automated segmentation of kidney and renal mass and automated detection of renal mass in CT urography using 3D U-Net-based deep convolutional neural network. Eur Radiol 2021;31:5021-31. [Crossref] [PubMed]
Cite this article as: Huang T, Ke M, Liu Q, Ying M, Hu M, Fu X, Hu Y, Xu M. A computed tomography-based deep learning model for non-invasively predicting World Health Organization (WHO)/International Society of Urological Pathology (ISUP) pathological grades of clear cell renal cell carcinoma (ccRCC): a multicenter cohort study. Transl Androl Urol 2025;14(7):2018-2028. doi: 10.21037/tau-2025-222