Prostate imaging reporting and data system version 2.1 as a predictor of clinically significant and aggressive prostate cancer: a radical prostatectomy-validated study
Highlight box
Key findings
• Higher PI-RADS v2.1 scores are strongly associated with higher radical prostatectomy Gleason scores (rpGS) and adverse pathological features, including extraprostatic extension, seminal vesicle invasion, lymphovascular invasion, and perineural invasion.
• Each one-category increase in PI-RADS yielded an 8- to 10-fold higher odds of elevated rpGS and a 3-fold increase in the odds of pathological T3 disease.
• PI-RADS ≥4 provided the best balance for diagnostic threshold, while PI-RADS 5 showed the strongest association with aggressive disease features, including grade group ≥3 and pT3 status.
• Integrating PI-RADS scores into standard clinical models significantly improved risk-prediction performance.
What is known and what is new?
• Multiparametric MRI and the PI-RADS scoring system are established tools for detecting clinically significant prostate cancer prior to biopsy.
• Validated against radical prostatectomy pathology, this study shows that PI-RADS v2.1 is an independent predictor of tumor grade and adverse features.
What is the implication, and what should change now?
• PI-RADS v2.1 provides crucial prognostic information beyond simple tumor detection.
• PI-RADS scores should be formally incorporated into preoperative risk stratification.
• Utilizing PI-RADS ≥4 as an optimal threshold for identifying clinically significant cancer, and PI-RADS 5 as a strong marker for aggressive pathology, can help clinicians better select appropriate candidates for active surveillance and tailor personalized prostate cancer management.
Introduction
Prostate cancer remains among the most common cancers in men worldwide, with more than 1.4 million new cases each year and incidence expected to double over the next two decades (1-3). Recent declines in mortality are likely attributable to ongoing improvements in cancer care (4). Over the past decade, the use of multiparametric magnetic resonance imaging (mpMRI) has increased substantially—from 36.4% to 92.1% between 2014 and 2023—accompanied by a reduction in low-risk cancer diagnoses from 22.7% to 10.5%, reflecting decreased overdetection and improved case selection (5).
In the era of personalized cancer care, identifying patients with clinically significant prostate cancer (csPCa) and accurately estimating risk are essential for treatment planning. The Gleason score remains the most widely used prognostic indicator for prostate cancer outcomes. The current grading system, developed by the International Society of Urological Pathology (ISUP), classifies tumors into grade groups (GGs) 1 through 5, with GG 1 considered clinically insignificant and associated with only a 2% risk of cancer-related death or metastasis. The risk of adverse outcomes increases with each successive grade (6,7). Nonetheless, substantial discordance between biopsy and radical prostatectomy (RP) exists, with one recent analysis showing discordance in nearly half (48.6%) of patients, most commonly due to upgrading (8).
Beyond the Gleason score, several pathological features may influence treatment outcomes. Extraprostatic extension (EPE) is a well-established adverse prognostic factor and is associated with biochemical recurrence following RP (9-11). The presence of EPE on prostatectomy specimens frequently prompts consideration of adjuvant therapy (11-14). Seminal vesicle invasion (SVI, pT3b) represents an even more adverse pathological feature, with bilateral involvement carrying a worse prognosis than unilateral invasion (15-17). Novel evidence also identifies peri-seminal vesicle soft-tissue invasion—present in approximately 74% of pT3b cases—as a distinct adverse feature associated with nodal metastasis rates exceeding 40% (18). Contemporary salvage guidelines emphasize SVI together with GG 4–5 disease as high-risk characteristics that support early salvage radiotherapy at low prostate-specific antigen (PSA) levels and more intensive systemic treatment integration (19). A recent meta-analysis showed that MRI can predict EPE with high accuracy (20). For SVI specifically, pooled analyses demonstrate that MRI achieves high specificity (95%) but only moderate sensitivity (57%), with an area under the curve (AUC) of 0.87, reinforcing its role in confirming rather than excluding locally advanced disease (21). Although less extensively studied, other pathological features—such as lymphovascular invasion (LVI), perineural invasion (PNI), and lymph node involvement—are also potential indicators of poor prognosis and may influence post-surgical management (14,22-27). Recent validation in contemporary cohorts confirms that LVI detected at RP is an independent predictor of metastasis and cancer-specific mortality, further informing adjuvant therapy decisions (28).
Magnetic resonance imaging (MRI) has become an integral component of prostate cancer management. Current guidelines recommend mpMRI prior to biopsy to assist in diagnosis of csPCa (29,30). The use of MRI improves detection rates, optimizes biopsy targeting, and enhances staging accuracy. MRI is also valuable in monitoring patients with low-risk prostate cancer undergoing active surveillance (AS) (14,31-43). Recent validation studies confirm that mpMRI achieves good diagnostic accuracy (AUC values 0.78–0.81) for detecting csPCa (44).
The Prostate Imaging Reporting and Data System (PI-RADS), developed in 2012 and revised to v2 and v2.1, standardizes prostate MRI interpretation and enhances diagnostic consistency. PI-RADS v2 assigns a score from 1 to 5, estimating csPCa likelihood, and subsequent updates addressed limitations and improved key criteria (30,45-47). Recent prospective studies have demonstrated progressive increases in csPCa with higher PI-RADS categories, from 0% in PI-RADS 1 to 77% in PI-RADS 5 (48). Furthermore, integrating PI-RADS v2.1 with clinical features can potentially reduce unnecessary biopsies by up to 27% compared to clinical parameters alone (49).
Although many studies have examined the diagnostic and prognostic role of PI-RADS, most have focused on interobserver variability, detection rates of clinically significant cancer, or comparisons between biparametric and multiparametric MRI (47-49). In contrast, this study evaluates PI-RADS v2.1 across a broader range of clinically relevant outcomes, including detection of csPCa, identification of high-risk patients, prediction of adverse pathological features, and correlation with comprehensive pathological parameters—aiming to inform risk-adapted, more individualized management of prostate cancer. In doing so, we assess PI-RADS v2.1 both as a diagnostic test for adverse pathology and as a key predictor within multivariable risk models. We present this article in accordance with the TRIPOD reporting checklist (available at https://tau.amegroups.com/article/view/10.21037/tau-2025-1-910/rc).
Methods
Ethical considerations
This study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments, and complied with the Health Insurance Portability and Accountability Act (HIPAA). This study was approved by the Institutional Review Board of Massachusetts General Hospital (IRB No. 2019P002618). Given the retrospective design and use of de-identified clinical data, the requirement for informed consent was waived. All data were anonymized and managed in accordance with institutional policies and applicable data protection standards.
Study design and setting
This single-center retrospective cohort study included consecutive patients who underwent preoperative multiparametric MRI followed by RP at a large tertiary referral center in the United States between April 1, 2016, and February 28, 2017. The study had two complementary objectives: (I) to assess the diagnostic accuracy of PI-RADS v2.1 for detecting clinically significant and aggressive prostate cancer, following STARD recommendations, and (II) to externally validate multivariable prediction models incorporating PI-RADS v2.1 alongside clinical predictors [age, PSA, biopsy Gleason score (bxGS)], following TRIPOD guidance for prediction model validation (Type 2b study).
Participants
Eligibility criteria
Eligible patients were those who underwent RP during the study period (April 1, 2016, to February 28, 2017) and had a preoperative multiparametric MRI (mpMRI) performed at any time before surgery. Exclusion criteria were: (I) inadequate image quality for assessment of at least two out of three mpMRI parameters, or (II) receipt of neoadjuvant chemotherapy prior to surgery. For patients with multiple preoperative MRI exams, the scan closest to RP was analyzed. To minimize selection bias, all consecutive eligible patients in the study period were included. No a priori sample size calculation was performed, consistent with the retrospective, hypothesis-generating nature of the study; the sample size was determined by the number of eligible patients during the study period.
Variables and data sources
Clinical and pathological data
Demographic characteristics, serum PSA levels, and preoperative bxGSs were extracted from the electronic medical record. RP pathology reports provided final Gleason score, presence of EPE, SVI, LVI, PNI, pathological tumor (pT) stage, and regional lymph node status.
Pathological assessment was performed as part of routine clinical care at the time of surgery [2016–2017]. In accordance with standard practice during that period, pathologists had access to the original clinical MRI reports generated contemporaneously for patient management. Importantly, these clinical MRI reports predated PI-RADS version 2.1, which had not yet been released, and were not structured according to its standardized criteria.
The retrospective imaging review that formed the basis of the present study was conducted independently by two radiologists between 2019 and 2020, after the release of PI-RADS version 2.1. Both radiologists re-evaluated all MRI examinations and assigned PI-RADS v2.1 categories strictly according to published criteria. Pathologists were therefore completely blinded to the PI-RADS v2.1 classifications and interpretations used in the analysis, ensuring independence between the index test evaluated in this study and the reference standard.
Pathological staging harmonization
At the time of RP [2016–2017], pathological staging was reported according to the American Joint Committee on Cancer (AJCC) 7th edition, which was the prevailing staging system during that period. For the purposes of analysis and reporting consistency, pathological T staging was subsequently harmonized to the AJCC 8th edition.
This conversion was performed retrospectively based on explicit pathological descriptors documented in the original reports—specifically the presence or absence of EPE and SVI—using a rule-based mapping without reinterpretation of histopathologic findings. This approach ensured alignment with contemporary staging definitions while preserving the integrity of the original pathological assessment. The AJCC 8th edition was the active staging system during the period of data analysis and manuscript preparation.
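The rule-based mapping described above can be sketched as follows. This is a minimal illustration, assuming that only the two descriptors named in the text (EPE and SVI) determine the pT category; the function name `harmonize_pt_stage` and the string labels are hypothetical and are not taken from the study's own code. Other staging criteria, such as bladder neck invasion or pT4 disease, are deliberately not modeled.

```python
def harmonize_pt_stage(epe: bool, svi: bool) -> str:
    """Map explicit pathology descriptors to an AJCC 8th edition pT category.

    Assumption: only EPE and SVI are considered, mirroring the text above.
    """
    if svi:
        return "pT3b"  # seminal vesicle invasion
    if epe:
        return "pT3a"  # extraprostatic extension without SVI
    return "pT2"       # organ-confined; the 8th edition does not subdivide pT2


# Hypothetical report descriptors (not study data): (EPE, SVI)
examples = [(False, False), (True, False), (True, True)]
stages = [harmonize_pt_stage(epe, svi) for epe, svi in examples]
print(stages)
```

Because the function reads only documented descriptors, it re-labels the original report rather than reinterpreting the histopathology.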
MRI acquisition protocol
All imaging was performed on a 3-Tesla MR scanner (Discovery MR750, GE Medical Systems) without an endorectal coil. The protocol comprised axial T2-weighted fast recovery fast spin echo (FRFSE) sequences (slice thickness 3 mm, spacing 3.5 mm, echo time 129 ms, repetition time 2724 ms, flip angle 111°, field of view 200 mm × 200 mm), and T2-weighted PROPELLER sequences in axial, coronal, and sagittal planes. Diffusion-weighted imaging (DWI) sequences used b-values ranging from 0 to 1,500 s/mm²; apparent diffusion coefficient (ADC) maps were generated. Three-dimensional dynamic contrast-enhanced (DCE) imaging was acquired per protocol.
Image interpretation
Two fellowship-trained abdominal radiologists (three and five years post-fellowship experience, respectively) with extensive prior use of PI-RADS v2/v2.1 in routine clinical prostate MRI reporting independently reviewed all preoperative MRI scans. They did not participate in a dedicated joint training session or perform case-by-case consensus reading before or during the study; instead, each reader applied PI-RADS v2.1 according to the standard manual and their subspecialty expertise. PI-RADS scores were assigned per version 2.1 guidelines. For each case, T2-weighted and diffusion-weighted images were scored [1–5], DCE imaging was recorded as positive or negative, and an overall PI-RADS category was assigned.
Although images were acquired before release of PI-RADS v2.1, both radiologists retrospectively re-evaluated all scans and assigned scores according to v2.1 criteria. To reduce observer bias, readers were blinded to all clinical, pathological, and prior radiological information, as well as to each other’s scores.
Statistical analysis
Continuous variables are reported as median [interquartile range (IQR)] and compared using nonparametric tests. Categorical variables were compared using the chi-squared test or Fisher’s exact test, as appropriate. Binary logistic regression was used to evaluate the association between PI-RADS category and pathological T3 disease, adjusting for age, serum PSA, and bxGS; results are reported as odds ratios (ORs) with 95% confidence intervals (CIs). Ordinal logistic regression was used to assess the association between PI-RADS category and RP Gleason score, adjusting for the same covariates. The proportional-odds assumption was evaluated using the Brant test. No model updating or recalibration of existing models was performed.
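As an illustration of how a regression coefficient translates into the ORs and 95% CIs reported here, the sketch below exponentiates a log-odds coefficient and its Wald interval. The coefficient and standard error are hypothetical, and the use of a Wald interval is an assumption; statistical software may instead report profile-likelihood intervals, which can differ slightly.

```python
import math


def odds_ratio_with_ci(beta: float, se: float, z: float = 1.96):
    """Return (OR, lower, upper) from a log-odds coefficient and its standard error."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Hypothetical coefficient and standard error (not study estimates)
or_, lower, upper = odds_ratio_with_ci(beta=1.2, se=0.4)
print(f"OR {or_:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```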
Diagnostic performance metrics, including sensitivity, specificity, accuracy, positive and negative likelihood ratios, and AUC with 95% CIs, were calculated. AUCs were compared using the DeLong test. Model fit was assessed using likelihood ratio tests and pseudo-R² statistics (McFadden’s and Nagelkerke’s R²).
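For readers less familiar with the two pseudo-R² statistics, the following sketch shows the standard definitions computed from model log-likelihoods. The log-likelihood values and the helper names are hypothetical and serve only to illustrate the formulas; they are not outputs of the study models.

```python
import math


def mcfadden_r2(ll_model: float, ll_null: float) -> float:
    """McFadden's pseudo-R2: 1 - LL(model) / LL(null)."""
    return 1.0 - ll_model / ll_null


def nagelkerke_r2(ll_model: float, ll_null: float, n: int) -> float:
    """Nagelkerke's pseudo-R2: Cox-Snell R2 rescaled by its maximum attainable value."""
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell / max_cox_snell


# Hypothetical log-likelihoods for a sample of 190 patients (not study values)
mcfadden = mcfadden_r2(ll_model=-90.0, ll_null=-120.0)
nagelkerke = nagelkerke_r2(ll_model=-90.0, ll_null=-120.0, n=190)
print(round(mcfadden, 3), round(nagelkerke, 3))
```

The two statistics are on different scales, so a Nagelkerke value that is larger than the McFadden value for the same model is expected rather than contradictory.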
Internal validation was performed using bootstrap resampling (500 iterations) to quantify model optimism and assess the stability of discrimination and calibration. Optimism-corrected estimates of discrimination (C-index/AUC) and calibration slopes were derived from bootstrap samples. Model calibration was further evaluated using calibration-in-the-large, calibration slope, Brier score, and expected-to-observed ratios, with visual inspection of calibration plots.
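The optimism correction can be summarized in a few lines. The sketch below assumes the commonly used bootstrap procedure in which each resample yields a pair of performance values: the refitted model evaluated on its own bootstrap sample and the same model evaluated on the original data. The function name and the example numbers are hypothetical; resamples in which a model fails to converge would simply be omitted from the list of pairs.

```python
from statistics import mean


def optimism_corrected(apparent: float, bootstrap_pairs) -> float:
    """Subtract the mean bootstrap optimism from the apparent performance.

    bootstrap_pairs: iterable of (performance on bootstrap sample,
                                  performance of the same model on original data)
    """
    optimism = mean(boot - orig for boot, orig in bootstrap_pairs)
    return apparent - optimism


# Hypothetical AUC pairs from three resamples (a real analysis would use hundreds)
pairs = [(0.82, 0.78), (0.80, 0.77), (0.81, 0.76)]
corrected = optimism_corrected(apparent=0.80, bootstrap_pairs=pairs)
print(round(corrected, 3))
```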
Clinical utility was assessed using decision curve analysis (DCA), which quantified net benefit across a range of clinically relevant threshold probabilities by comparing MRI-integrated models with biopsy-only clinical models and default “treat all” and “treat none” strategies.
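The net benefit quantity underlying DCA can be illustrated as follows. This is a minimal sketch using the standard definition (true positives minus false positives weighted by the odds of the threshold probability, per patient); the outcome labels and predicted probabilities are invented for illustration and the function names are hypothetical.

```python
def net_benefit(y_true, y_prob, threshold: float) -> float:
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold: float) -> float:
    """Net benefit of the default strategy that treats every patient."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical outcomes (1 = adverse pathology) and predicted risks
y_true = [1, 1, 1, 0, 0, 0, 0, 1, 0, 0]
y_prob = [0.9, 0.8, 0.6, 0.4, 0.3, 0.2, 0.1, 0.35, 0.7, 0.05]

nb_model = net_benefit(y_true, y_prob, threshold=0.5)
nb_all = net_benefit_treat_all(y_true, threshold=0.5)
nb_none = 0.0  # treating nobody yields zero net benefit by definition
print(nb_model, nb_all, nb_none)
```

Repeating the calculation over a grid of thresholds and plotting the three strategies gives the decision curve.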
Interobserver agreement for PI-RADS scoring was assessed using weighted Cohen’s kappa. Missing data were handled using complete-case analysis. All statistical analyses were performed using R version 4.5.0 (R Foundation for Statistical Computing, Vienna, Austria) with RStudio. Statistical significance was defined as a two-sided P value ≤ 0.05.
Results
General data, laboratory investigation, and pre-operative Gleason score
A total of 195 patients met the inclusion criteria. Two were excluded due to prior treatment, two were excluded due to severe image degradation from hip prostheses, and one case was excluded due to incomplete imaging assessments (cases evaluated by only one reader). This resulted in 190 patients eligible for imaging analysis.
The median age was 62 (IQR, 57–67) years. The median interval between MRI and surgery was 89 days (IQR, 28–162 days). The median total serum PSA level was 6.30 (IQR, 4.51–9.15) ng/mL. Table 1 displays baseline patient characteristics and stratification by Gleason score group.
Table 1
| Characteristics | Data, N=190 |
|---|---|
| Age (years) | 62 [57, 67] |
| Serum PSA (ng/mL) | 6.30 [4.51, 9.15] |
| Gleason score (biopsy) | |
| 6 | 55 (30.1) |
| 3+4 | 72 (39.3) |
| 4+3 | 21 (11.5) |
| 8 | 21 (11.5) |
| 9 | 14 (7.7) |
| Missing | 7 |
| Gleason score (surgery) | |
| 6 | 36 (18.9) |
| 3+4 | 93 (48.9) |
| 4+3 | 38 (20.0) |
| 8 | 12 (6.3) |
| 9 | 11 (5.8) |
Continuous data are presented as median [IQR]; categorical data are presented as n (%). Percentages are calculated among patients with available data (biopsy Gleason score, n=183; surgical Gleason score, n=190). Normality (Shapiro-Wilk): age, P=0.04; PSA, P≤0.001. IQR, interquartile range; PSA, prostate-specific antigen.
bxGSs were available for 183 patients. Median age differed significantly across Gleason score groups (P=0.002), with patients in the 3+3 group tending to be younger. There was no significant difference in serum PSA levels across Gleason score groups (P=0.92).
Pathological diagnosis
Table 2 presents the distribution of EPE, SVI, LVI, PNI, and regional lymph node metastasis in relation to the final Gleason score from the radical prostatectomy specimen (rpGS). The prevalence of all these pathological features was significantly higher in patients with higher rpGS (P<0.001 for all factors). Similarly, the incidence of higher pathological T stage (T3 vs. T2) increased with higher rpGS, rising from 0% in GS 6 to 90.9% in GS 9 (P<0.001).
Table 2
| Pathological features | Gleason score 6 | Gleason score 3+4 | Gleason score 4+3 | Gleason score 8 | Gleason score 9 | P |
|---|---|---|---|---|---|---|
| Extraprostatic extension | 0 (0) | 27 (29) | 16 (42.1) | 9 (75) | 10 (90.9) | <0.001 |
| Seminal vesicle invasion | 0 (0) | 1 (1.1) | 5 (13.2) | 3 (25) | 5 (45.5) | <0.001 |
| Lymphovascular invasion | 0 (0) | 3 (3.2) | 2 (5.3) | 3 (25) | 4 (36.4) | <0.001 |
| Perineural invasion | 18 (50) | 66 (71) | 34 (89.5) | 11 (91.7) | 11 (100) | <0.001 |
| Pathological lymph node | 0 (0) | 0 (0) | 3 (9.4) | 3 (25) | 4 (40) | <0.001 |
Data are presented as n (%).
MRI findings and pathologic outcomes
Overall PI-RADS classification
There was a significant difference in the distribution of rpGS among patients with different overall PI-RADS scores. Higher PI-RADS scores were associated with a greater likelihood of higher rpGS. Additionally, patients with higher PI-RADS scores had significantly increased rates of EPE, SVI, LVI, PNI, regional lymph node metastasis, and higher pathological T stage (Table 3).
Table 3
| Features | Reader 1: PI-RADS 1+2 | Reader 1: PI-RADS 3 | Reader 1: PI-RADS 4 | Reader 1: PI-RADS 5 | Reader 1: P | Reader 2: PI-RADS 1+2 | Reader 2: PI-RADS 3 | Reader 2: PI-RADS 4 | Reader 2: PI-RADS 5 | Reader 2: P |
|---|---|---|---|---|---|---|---|---|---|---|
| Gleason score | | | | | <0.001 | | | | | <0.001 |
| 6 | 17 (47.2) | 2 (5.6) | 14 (38.9) | 3 (8.3) | | 18 (50.0) | 3 (8.3) | 13 (36.1) | 2 (5.6) | |
| 3+4 | 17 (18.3) | 5 (5.4) | 44 (47.3) | 27 (29.0) | | 16 (17.2) | 2 (2.2) | 44 (47.3) | 31 (33.3) | |
| 4+3 | 1 (2.6) | 1 (2.6) | 14 (36.8) | 22 (57.9) | | 1 (2.6) | 1 (2.6) | 12 (31.6) | 24 (63.2) | |
| 8 | – | – | 4 (33.3) | 8 (66.7) | | – | – | 4 (33.3) | 8 (66.7) | |
| 9 | 1 (9.1) | – | 1 (9.1) | 9 (81.8) | | 1 (9.1) | – | 1 (9.1) | 9 (81.8) | |
| Pathological T stage | | | | | <0.001 | | | | | <0.001 |
| T2 | 31 (24.4) | 7 (5.5) | 61 (48.0) | 28 (22.0) | | 31 (24.4) | 5 (3.9) | 59 (46.5) | 32 (25.2) | |
| T3a | 5 (10.0) | 1 (2.0) | 15 (30.0) | 29 (58.0) | | 5 (10.0) | 1 (2.0) | 14 (28.0) | 30 (60.0) | |
| T3b | 0 (0.0) | 0 (0.0) | 1 (7.7) | 12 (92.3) | | 0 (0.0) | 0 (0.0) | 1 (7.7) | 12 (92.3) | |
| Extraprostatic extension | 5/36 (13.9) | 1/8 (12.5) | 15/77 (19.5) | 41/69 (59.4) | <0.001 | 5/36 (13.9) | 1/6 (16.7) | 15/74 (20.3) | 41/74 (55.4) | <0.001 |
| Seminal vesicle invasion | 0/35 (0.0) | 0/8 (0.0) | 2/77 (2.6) | 12/69 (17.4) | 0.002 | 0/35 (0.0) | 0/6 (0.0) | 2/74 (2.7) | 12/74 (16.2) | 0.004 |
| Lymphovascular invasion | 0/36 (0.0) | 0/8 (0.0) | 2/77 (2.6) | 10/68 (14.7) | 0.009 | 0/36 (0.0) | 0/6 (0.0) | 1/74 (1.4) | 11/73 (15.1) | 0.003 |
| Perineural invasion | 19/36 (52.8) | 3/8 (37.5) | 53/76 (69.7) | 65/69 (94.2) | <0.001 | 19/36 (52.8) | 4/6 (66.7) | 49/74 (66.2) | 68/73 (93.2) | <0.001 |
| Regional lymph node | 0/36 (0.0) | 0/8 (0.0) | 1/77 (1.3) | 9/69 (13.0) | 0.008 | 0/36 (0.0) | 0/6 (0.0) | 1/74 (1.4) | 9/74 (12.2) | 0.016 |
Data are presented as n (%) or present/total (%). PI-RADS, Prostate Imaging Reporting and Data System.
Association between PI-RADS scores and high-risk pathologic features
Notably, for Reader 1, high-risk Gleason scores (≥8) were identified in 22 (15.1%) of patients with PI-RADS 4–5, compared with 1 (2.3%) among those with PI-RADS 1–3 (P=0.02). Similarly, for Reader 2, high-risk Gleason scores (≥8) were observed in 22 (14.9%) of patients with PI-RADS 4–5 and 1 (2.4%) of those with PI-RADS 1–3 (P=0.03).
Additionally, rare adverse pathologic features clustered almost exclusively in higher PI-RADS categories. SVI was present in 14 (9.6%) patients with PI-RADS 4–5 versus 0 (0%) with PI-RADS 1–3 for Reader 1 (P=0.04), and in 14 (9.5%) versus 0 (0%) for Reader 2 (P=0.04).
Regional lymph-node metastasis (N1) occurred in 10 (9.7%) patients with PI-RADS ≥4 versus 0 (0%) with PI-RADS 1–3 for Reader 1 (P=0.21), and in 10 (9.4%) versus 0 (0%) for Reader 2 (P=0.36). Notably, 65 cases had no lymph nodes submitted for pathological evaluation and were therefore excluded from this analysis.
Logistic regression analysis
Ordinal logistic regression
The association between PI-RADS scores and rpGS was further evaluated using ordinal logistic regression models. For Reader 1, each one-category increase in PI-RADS score was associated with an 8.17-fold increase in the odds of having a higher rpGS (95% CI: 4.20–16.44, P<0.001).
Similarly, for Reader 2, each one-category increase in PI-RADS score corresponded to a 10.76-fold increase in the odds of a higher rpGS (95% CI: 5.25–22.91, P<0.001).
After adjusting for age, serum PSA, and bxGS, the association between PI-RADS score and rpGS remained statistically significant. Each one-category increase in PI-RADS score was associated with an adjusted OR of 5.54 (95% CI: 2.52–12.61, P<0.001) for Reader 1 and adjusted OR of 8.19 (95% CI: 3.55–19.91, P<0.001) for Reader 2.
Other significant predictors included serum PSA and bxGS, both of which showed strong associations with higher rpGS, whereas age was not a significant predictor in the multivariable model (Table 4). The proportional-odds assumption was evaluated using the Brant test and was not violated for either model (Reader 1: P>0.99; Reader 2: P>0.99).
Table 4
| Variable | Adjusted OR | 95% CI (lower) | 95% CI (upper) | P value |
|---|---|---|---|---|
| Reader 1 | ||||
| PI-RADS category (Reader 1, linear trend) | 5.54 | 2.52 | 12.61 | <0.001 |
| Age (years) | 0.97 | 0.93 | 1.02 | 0.29 |
| Serum PSA (ng/mL) | 1.16 | 1.08 | 1.25 | <0.001 |
| Biopsy Gleason score (linear trend) | 105.15 | 34.64 | 352.06 | <0.001 |
| Reader 2 | ||||
| PI-RADS category (Reader 2, linear trend) | 8.19 | 3.55 | 19.91 | <0.001 |
| Age (years) | 0.96 | 0.91 | 1.01 | 0.12 |
| Serum PSA (ng/mL) | 1.16 | 1.08 | 1.25 | <0.001 |
| Biopsy Gleason score (linear trend) | 130.78 | 41.57 | 457.82 | <0.001 |
Values are adjusted OR for higher radical prostatectomy Gleason score category, derived from ordinal logistic regression. Association between PI-RADS score and radical prostatectomy Gleason category after adjustment for age, PSA, and biopsy Gleason score. CI, confidence interval; OR, odds ratio; PI-RADS, Prostate Imaging Reporting and Data System; PSA, prostate-specific antigen.
Binary logistic regression
In the MRI-integrated logistic regression model evaluating predictors of pathological T3 disease, neither age nor PSA demonstrated significant associations with final pathological stage (age: OR 1.02, P=0.50; PSA: OR 1.01, P=0.77). As expected, bxGS remained a strong predictor, with higher GGs showing significantly increased odds of pT3 disease (OR 4.98, 95% CI: 1.83–14.84, P=0.002).
Importantly, the PI-RADS category demonstrated a significant and independent linear association with pathological T3 disease. Each one-step increase in PI-RADS (for example, from 2→3 or 3→4) was associated with an approximately threefold increase in the odds of pathological T3 disease (OR 3.02, 95% CI: 1.26–8.28, P=0.02).
Diagnostic performance of PI-RADS cutoffs
The performance of different PI-RADS cutoffs for predicting adverse pathological outcomes—including csPCa (rpGS ≥3+4), pathological GG ≥3 (rpGS ≥4+3), high-risk Gleason category (rpGS ≥8), and pathological T3 and N1 stages—is summarized in Table 5.
Table 5
| PI-RADS cutoffs | Reader 1: Sens | Reader 1: Spec | Reader 1: Acc | Reader 1: PLR | Reader 1: NLR | Reader 1: AUC (95% CI) | Reader 2: Sens | Reader 2: Spec | Reader 2: Acc | Reader 2: PLR | Reader 2: NLR | Reader 2: AUC (95% CI) | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Clinically significant PCa (GS ≥3+4) | |||||||||||||
| PI-RADS 5 | 0.43 | 0.92 | 0.52 | 5.14 | 0.62 | 0.67 (0.61–0.73) | 0.47 | 0.94 | 0.56 | 8.42 | 0.56 | 0.71 (0.65–0.76) | |
| PI-RADS ≥4 | 0.84 | 0.53 | 0.78 | 1.77 | 0.31 | 0.68 (0.60–0.77) | 0.86 | 0.58 | 0.81 | 2.07 | 0.23 | 0.72 (0.64–0.81) | |
| PI-RADS ≥3 | 0.88 | 0.47 | 0.80 | 1.66 | 0.26 | 0.67 (0.59–0.76) | 0.88 | 0.50 | 0.81 | 1.77 | 0.23 | 0.69 (0.60–0.78) | |
| Grade group ≥3 (GS ≥4+3) | |||||||||||||
| PI-RADS 5 | 0.64 | 0.77 | 0.73 | 2.75 | 0.47 | 0.70 (0.63–0.77) | 0.67 | 0.74 | 0.72 | 2.63 | 0.44 | 0.71 (0.64–0.78) | |
| PI-RADS ≥4 | 0.95 | 0.32 | 0.52 | 1.39 | 0.15 | 0.63 (0.59–0.68) | 0.95 | 0.30 | 0.51 | 1.36 | 0.16 | 0.63 (0.58–0.67) | |
| PI-RADS ≥3 | 0.97 | 0.26 | 0.49 | 1.31 | 0.12 | 0.62 (0.57–0.66) | 0.97 | 0.26 | 0.49 | 1.31 | 0.12 | 0.62 (0.57–0.66) | |
| High-risk Gleason (GS ≥8) | |||||||||||||
| PI-RADS 5 | 0.74 | 0.69 | 0.69 | 2.37 | 0.38 | 0.71 (0.62–0.81) | 0.74 | 0.66 | 0.67 | 2.17 | 0.40 | 0.70 (0.60–0.80) | |
| PI-RADS ≥4 | 0.96 | 0.26 | 0.34 | 1.29 | 0.17 | 0.61 (0.55–0.66) | 0.96 | 0.25 | 0.33 | 1.27 | 0.18 | 0.60 (0.55–0.65) | |
| PI-RADS ≥3 | 0.96 | 0.21 | 0.30 | 1.21 | 0.21 | 0.58 (0.53–0.64) | 0.96 | 0.21 | 0.30 | 1.21 | 0.21 | 0.58 (0.53–0.64) | |
| Pathological T3 stage | |||||||||||||
| PI-RADS 5 | 0.65 | 0.78 | 0.74 | 2.95 | 0.45 | 0.72 (0.65–0.78) | 0.67 | 0.75 | 0.72 | 2.65 | 0.45 | 0.71 (0.64–0.78) | |
| PI-RADS ≥4 | 0.90 | 0.30 | 0.50 | 1.29 | 0.32 | 0.60 (0.55–0.66) | 0.90 | 0.28 | 0.49 | 1.26 | 0.34 | 0.59 (0.54–0.65) | |
| PI-RADS ≥3 | 0.92 | 0.24 | 0.47 | 1.22 | 0.33 | 0.58 (0.53–0.63) | 0.92 | 0.24 | 0.47 | 1.22 | 0.33 | 0.58 (0.53–0.63) | |
| Regional lymph node metastasis | |||||||||||||
| PI-RADS 5 | 0.90 | 0.67 | 0.68 | 2.70 | 0.15 | 0.78 (0.68–0.89) | 0.90 | 0.64 | 0.65 | 2.49 | 0.16 | 0.77 (0.67–0.87) | |
| PI-RADS ≥4 | 0.95 | 0.25 | 0.29 | 1.27 | 0.18 | 0.62 (0.59–0.65) | 0.95 | 0.23 | 0.28 | 1.25 | 0.19 | 0.62 (0.59–0.65) | |
| PI-RADS ≥3 | 0.95 | 0.20 | 0.24 | 1.20 | 0.23 | 0.60 (0.57–0.63) | 0.95 | 0.20 | 0.24 | 1.20 | 0.23 | 0.60 (0.57–0.63) | |
Acc, accuracy; AUC, area under the curve; CI, confidence interval; GS, Gleason score; NLR, negative likelihood ratio; PCa, prostate cancer; PI-RADS, Prostate Imaging Reporting and Data System; PLR, positive likelihood ratio; Sens, sensitivity; Spec, specificity.
For detecting csPCa, both PI-RADS ≥3 and PI-RADS ≥4 thresholds demonstrated comparable diagnostic accuracies (0.80–0.81 and 0.78–0.81, respectively), with PI-RADS ≥ 4 offering higher specificity (0.53–0.58 vs. 0.47–0.50). In contrast, the PI-RADS 5 cutoff yielded the lowest overall accuracy (0.52–0.56) and only modest AUCs (0.67–0.71) for both readers.
Although PI-RADS 5 showed lower sensitivity for identifying any clinically significant cancer, it achieved the highest accuracy for detecting aggressive disease features—namely, Gleason score ≥4+3 (accuracy: 0.72–0.73, AUC: 0.70–0.71), high-risk (GS ≥8) tumors (accuracy: 0.67–0.69, AUC: 0.70–0.71), and advanced pathological stage (pT3; accuracy: 0.72–0.74, AUC: 0.71–0.72). These findings highlight the superior performance of the PI-RADS 5 threshold for identifying adverse prognostic pathology.
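The cutoff metrics for csPCa can be re-derived from the Reader 1 cross-tabulation in Table 3. The sketch below transcribes those counts and computes sensitivity, specificity, accuracy, and likelihood ratios for a chosen cutoff; the variable and function names are hypothetical. The AUC line assumes that, for a single dichotomous cutoff, the ROC area reduces to the mean of sensitivity and specificity, which appears consistent with the tabulated values but is not stated in the text.

```python
# Reader 1 counts transcribed from Table 3.
# Rows: RP Gleason score; columns: PI-RADS 1+2, 3, 4, 5.
COUNTS = {
    "6":   [17, 2, 14, 3],
    "3+4": [17, 5, 44, 27],
    "4+3": [1, 1, 14, 22],
    "8":   [0, 0, 4, 8],
    "9":   [1, 0, 1, 9],
}
CS_PCA = {"3+4", "4+3", "8", "9"}  # clinically significant: GS >= 3+4


def cutoff_metrics(min_col: int) -> dict:
    """Diagnostic metrics when columns >= min_col count as a positive test."""
    tp = fn = fp = tn = 0
    for gleason, row in COUNTS.items():
        test_pos, test_neg = sum(row[min_col:]), sum(row[:min_col])
        if gleason in CS_PCA:
            tp, fn = tp + test_pos, fn + test_neg
        else:
            fp, tn = fp + test_pos, tn + test_neg
    sens, spec = tp / (tp + fn), tn / (tn + fp)
    return {
        "sens": sens,
        "spec": spec,
        "acc": (tp + tn) / (tp + fn + fp + tn),
        "plr": sens / (1 - spec),
        "nlr": (1 - sens) / spec,
        "auc": (sens + spec) / 2,  # assumption: single-cutoff ROC area
    }


pirads_ge4 = cutoff_metrics(2)  # PI-RADS >= 4
pirads_5 = cutoff_metrics(3)    # PI-RADS 5 only
print({k: round(v, 2) for k, v in pirads_ge4.items()})
print({k: round(v, 2) for k, v in pirads_5.items()})
```

Moving the cutoff from PI-RADS ≥4 to PI-RADS 5 trades sensitivity for specificity, which is the pattern described in the two paragraphs above.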
Interobserver agreement
Interobserver agreement between the two radiologists for PI-RADS scoring was assessed using the weighted Cohen’s kappa statistic. The weighted kappa value was 0.96 (95% CI: 0.93–1.00), indicating almost perfect agreement between the readers.
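For reference, weighted kappa can be computed from a reader-by-reader cross-tabulation as sketched below. The cross-tabulation shown is hypothetical (the per-case agreement table is not reported here), and the weighting scheme is an assumption, since the text does not state whether linear or quadratic weights were used.

```python
def weighted_kappa(matrix, weights: str = "linear") -> float:
    """Weighted Cohen's kappa from a square cross-tabulation of two readers."""
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    row_tot = [sum(row) for row in matrix]
    col_tot = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    power = 1 if weights == "linear" else 2
    observed = expected = 0.0
    for i in range(k):
        for j in range(k):
            w = (abs(i - j) / (k - 1)) ** power  # disagreement weight
            observed += w * matrix[i][j] / n
            expected += w * row_tot[i] * col_tot[j] / n ** 2
    return 1.0 - observed / expected


# Hypothetical cross-tabulation (rows: Reader 1, columns: Reader 2;
# categories PI-RADS 1+2, 3, 4, 5). Not study data.
table = [
    [10, 1, 0, 0],
    [1, 5, 1, 0],
    [0, 1, 20, 2],
    [0, 0, 1, 18],
]
kappa_linear = weighted_kappa(table, weights="linear")
kappa_quadratic = weighted_kappa(table, weights="quadratic")
print(round(kappa_linear, 3), round(kappa_quadratic, 3))
```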
Model evaluation
The MRI-integrated model demonstrated good overall discrimination, with C-index values of 0.883 for Reader 1 and 0.894 for Reader 2. Model fit was further supported by the pseudo-R² statistics, showing McFadden’s R²=0.327 and 0.353, and Nagelkerke’s R²=0.666 and 0.692 for Readers 1 and 2, respectively. These findings indicate that both models achieved a strong balance between discrimination and explanatory power.
Incremental value of MRI over biopsy-only models
MRI integration modestly improved model discrimination compared with biopsy-only models across all pathological outcomes. For predicting the final Gleason score category, the MRI-integrated models demonstrated superior performance: the concordance index (C-index) increased from 0.864 to 0.883 for Reader 1 and from 0.864 to 0.894 for Reader 2, reflecting stronger rank-ordering of pathological severity. Similarly, McFadden’s R² rose from 0.284 to 0.327 for Reader 1 and from 0.284 to 0.353 for Reader 2, indicating improved explanatory power.
For clinically significant disease (≥ GG2), the AUC increased from 0.88 to 0.91 for Reader 1 and from 0.88 to 0.92 for Reader 2, indicating a more favorable sensitivity-specificity balance with MRI. For high-risk cancer (≥ GG4), AUC values were maintained or slightly improved (from 0.90 to 0.90 for Reader 1 and from 0.90 to 0.91 for Reader 2), suggesting that MRI contributed additional value for identifying more aggressive pathology without loss of overall accuracy (Figure 1).
A similar trend was observed for pathological T3 disease. Relative to the biopsy-only model (AUC 0.75), integration of MRI increased the AUC to 0.80 for Reader 1 and 0.79 for Reader 2. Likelihood ratio tests confirmed that the MRI-integrated models had a significantly better fit (χ²=16.5, df=3, P=0.001; χ²=16.4, df=3, P=0.001), whereas the AUC improvement did not reach statistical significance on DeLong testing [Reader 1: P(DeLong)=0.052; Reader 2: P(DeLong)=0.099].
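The likelihood ratio P values can be checked directly from the reported test statistics. The sketch below uses the closed-form upper-tail probability of a chi-square distribution with 3 degrees of freedom, taking the reported df at face value; the function name is hypothetical.

```python
import math


def chi2_sf_df3(x: float) -> float:
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)


# Reported likelihood ratio statistics for the two MRI-integrated pT3 models
p_first = chi2_sf_df3(16.5)
p_second = chi2_sf_df3(16.4)
print(f"{p_first:.4f} {p_second:.4f}")
```

Both values fall just below 0.001, in line with the rounded P values given in the text.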
Pseudo-R² values also increased, with McFadden’s R² rising from 0.131 to 0.201 and Nagelkerke’s R² from 0.213 to 0.314, highlighting improved overall model fit and explanatory capacity. Full ROC curve overlays for all models (GG2, GG4, and pT3) are provided in Figure 1.
Internal validation and model calibration
Ordinal logistic regression bootstrap results for Gleason GG
Internal validation was performed using 500 bootstrap resamples to quantify model optimism and assess the stability of the MRI-integrated ordinal logistic regression models. A total of 394 bootstrap iterations completed successfully and were included in the estimation of optimism.
After optimism correction, the bootstrap-adjusted C-index was 0.871 for Reader 1 and 0.883 for Reader 2, indicating that both models retained excellent ability to rank-order patients by pathological Gleason GG with minimal loss of performance relative to their apparent estimates.
The bootstrap-adjusted calibration slopes were 0.923 for Reader 1 and 0.921 for Reader 2, values that remain close to the ideal slope of 1. These findings demonstrate stable probability scaling and minimal overfitting across the full range of predicted Gleason categories.
Overall, the optimism-corrected discrimination and calibration metrics confirm that both MRI-integrated multivariable models exhibit strong internal validity and reliable generalizability for predicting RP Gleason GG.
Internal validation for predicting pathological T3 disease
Internal validation of the logistic regression models predicting pathological T3 disease was performed using 500 bootstrap resamples, of which 500 successfully converged and were included in the optimism estimates. After correction, the biopsy-only clinical model showed a decrease in discrimination from an apparent AUC of 0.747 to a bootstrap-adjusted AUC of 0.712. Both MRI-integrated models retained higher discriminatory performance even after optimism adjustment. The MRI Reader 1 model demonstrated an apparent AUC of 0.796 with a corrected AUC of 0.750, while the MRI Reader 2 model yielded comparable values (apparent 0.789, corrected 0.742), confirming a consistent incremental gain from MRI.
Calibration slopes, initially close to 1.0 in all models, decreased after bootstrap correction to 0.796 for the clinical model and to 0.651 and 0.665 for the MRI-based models. Although the MRI models showed greater slope shrinkage—expected when incorporating a strong imaging predictor within a moderately sized dataset—they continued to outperform the clinical model in discrimination. Taken together, the bootstrap-adjusted results indicate that MRI provides meaningful incremental value for predicting pathological T3 disease, and the models demonstrate stable performance with acceptable levels of overfitting.
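The optimism correction described in this section can be sketched as follows. This is an illustrative outline only, not the authors' code: it assumes a Harrell-style procedure (refit on each resample, subtract the mean of resample performance minus original-data performance), uses synthetic data, and swaps in a simple class-centroid linear score as a stand-in for the logistic regression models. All function names are hypothetical. Resamples in which the model cannot be fitted are skipped, which is one way the number of usable iterations can fall below the number requested.

```python
import random


def auc(scores, labels):
    """Probability that a random event outranks a random non-event (ties = 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def fit_centroid_model(rows):
    """Stand-in model: linear score along the difference of class means."""
    pos = [x for x, y in rows if y == 1]
    neg = [x for x, y in rows if y == 0]
    dim = len(rows[0][0])
    w = [sum(x[j] for x in pos) / len(pos) - sum(x[j] for x in neg) / len(neg)
         for j in range(dim)]
    return lambda x: sum(wj * xj for wj, xj in zip(w, x))


def model_auc(model, rows):
    return auc([model(x) for x, _ in rows], [y for _, y in rows])


def optimism_corrected_auc(rows, n_boot=500, seed=0):
    rng = random.Random(seed)
    apparent = model_auc(fit_centroid_model(rows), rows)
    optimism, used = 0.0, 0
    for _ in range(n_boot):
        sample = [rng.choice(rows) for _ in rows]  # resample with replacement
        if len({y for _, y in sample}) < 2:
            continue  # resample lacks one class, so the model cannot be fitted
        model = fit_centroid_model(sample)
        # Optimism: performance on the resample minus performance on the original data.
        optimism += model_auc(model, sample) - model_auc(model, rows)
        used += 1
    return apparent, apparent - optimism / used, used


# Synthetic data: one informative feature and one pure-noise feature.
data_rng = random.Random(42)
rows = []
for _ in range(120):
    y = 1 if data_rng.random() < 0.4 else 0
    rows.append(((data_rng.gauss(0.8 * y, 1.0), data_rng.gauss(0.0, 1.0)), y))

apparent, corrected, used = optimism_corrected_auc(rows, n_boot=200)
print(f"apparent={apparent:.3f} corrected={corrected:.3f} resamples used={used}")
```

The same loop applies to any performance measure (for example, the calibration slope) by replacing `model_auc` with the corresponding metric.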
Model calibration
Calibration was evaluated at clinically relevant thresholds (≥ GG2, ≥ GG3, and ≥ GG4) and for prediction of pathological T3 disease. Across all outcomes, MRI-integrated models demonstrated superior overall probabilistic accuracy, reflected by lower Brier scores, compared with biopsy-only models, without evidence of compromised calibration stability (Figure 2).
csPCa (≥ GG2)
All models demonstrated excellent agreement between predicted and observed risks, with calibration-in-the-large values close to zero (intercept −0.01 in all models). Calibration slopes exceeded 1.0 (1.26–1.34), indicating mildly conservative predictions with some compression of predicted probabilities. MRI-integrated models showed lower Brier scores (0.09, 0.10) than biopsy-only models (0.11, 0.11), while expected-to-observed ratios remained close to unity (1, 1), supporting good global calibration at this clinically important threshold.
Unfavorable intermediate-risk disease (≥ GG3)
Calibration performance at this threshold—distinguishing favorable from unfavorable intermediate-risk disease—was near-ideal. Calibration slopes were close to unity (0.95, 0.97), with minimal systematic bias as reflected by small calibration-in-the-large values (0.05, 0.05). MRI-integrated models demonstrated consistently lower Brier scores (0.11, 0.11 vs. 0.12, 0.12), without deterioration of calibration.
High-grade disease (≥ GG4)
For prediction of high-grade disease, calibration slopes were lower (0.76, 0.79), and calibration-in-the-large values were more negative (−0.13, −0.12), indicating a tendency toward risk overestimation. This pattern is consistent with the lower prevalence of ≥ GG4 disease and is commonly observed in multivariable prediction models for rarer outcomes. Nevertheless, Brier scores remained low (0.07, 0.08), and expected-to-observed ratios remained acceptable (1.06, 1.07), indicating preserved global calibration.
Pathological T3 disease
All models demonstrated excellent calibration for predicting pathological T3 disease, with calibration-in-the-large values near zero (intercept 0 in all models) and calibration slopes approximating 1.0 (1, 1, 1). MRI-integrated models showed lower Brier scores (0.17, 0.17) than the clinical model (0.19), consistent with improved overall probabilistic accuracy while maintaining appropriate calibration.
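The calibration measures reported in this section (Brier score, expected-to-observed ratio, calibration-in-the-large, and calibration slope) can be computed from predicted probabilities and observed binary outcomes as sketched below. This is a minimal illustration under common textbook definitions, with hypothetical function names and toy data; it is not the study's analysis code. The slope is estimated by logistic regression of the outcome on the logit of the predicted probability, and calibration-in-the-large is the intercept when that logit is entered as a fixed offset.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def brier_score(probs, outcomes):
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def expected_observed_ratio(probs, outcomes):
    return sum(probs) / sum(outcomes)


def calibration_in_the_large(probs, outcomes, iters=50):
    """Intercept of a logistic model with logit(p) as a fixed offset (slope = 1)."""
    lp = [logit(p) for p in probs]
    a = 0.0
    for _ in range(iters):
        fitted = [sigmoid(a + x) for x in lp]
        grad = sum(y - f for y, f in zip(outcomes, fitted))
        hess = sum(f * (1.0 - f) for f in fitted)
        a += grad / hess  # one-dimensional Newton step
    return a


def calibration_slope(probs, outcomes, iters=50):
    """Slope from logistic regression of the outcome on logit(p), via Newton's method."""
    lp = [logit(p) for p in probs]
    a, b = 0.0, 1.0
    for _ in range(iters):
        fitted = [sigmoid(a + b * x) for x in lp]
        w = [f * (1.0 - f) for f in fitted]
        g0 = sum(y - f for y, f in zip(outcomes, fitted))
        g1 = sum((y - f) * x for y, f, x in zip(outcomes, fitted, lp))
        h00 = sum(w)
        h01 = sum(wi * x for wi, x in zip(w, lp))
        h11 = sum(wi * x * x for wi, x in zip(w, lp))
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return b


# Toy outcomes: 2 of 10 events in a low-risk group, 8 of 10 in a high-risk group.
outcomes = [1, 1] + [0] * 8 + [1] * 8 + [0, 0]
well_calibrated = [0.2] * 10 + [0.8] * 10   # predictions match observed rates
too_extreme = [0.1] * 10 + [0.9] * 10       # predictions more extreme than observed

for name, probs in [("well calibrated", well_calibrated), ("too extreme", too_extreme)]:
    print(name,
          round(brier_score(probs, outcomes), 3),
          round(expected_observed_ratio(probs, outcomes), 3),
          round(calibration_in_the_large(probs, outcomes), 3),
          round(calibration_slope(probs, outcomes), 3))
```

In the second toy case the slope falls below 1 because the predictions are more extreme than the observed event rates; a slope above 1, as reported for the ≥ GG2 threshold, reflects the opposite pattern of compressed predictions.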
DCA
DCA demonstrated that MRI-integrated models provided greater clinical net benefit than the biopsy-only model across a wide range of clinically relevant threshold probabilities (Figure 3).
For csPCa (≥ GG2), both MRI-integrated models consistently achieved higher net benefit than the biopsy-only model throughout most threshold probabilities between approximately 0.15 and 0.45. Across this range, the MRI-integrated curves remained above both the biopsy-only model and the “treat all” and “treat none” strategies, indicating improved decision-making efficiency when MRI information was incorporated. The near-overlapping performance of Reader 1 and Reader 2 models suggests robustness of the MRI contribution across readers.
For pathological T3 disease, MRI-integrated models again demonstrated superior net benefit compared with the biopsy-only model across clinically plausible thresholds (approximately 0.20–0.45). The biopsy-only model showed lower net benefit and approached the “treat none” strategy at higher thresholds, whereas MRI-integrated models maintained positive net benefit, indicating improved identification of patients at risk for EPE. The “treat all” strategy showed rapid decline in net benefit as threshold probability increased, reinforcing the clinical advantage of risk stratification using MRI-based models.
Overall, DCA confirms that integrating MRI into multivariable prediction models yields greater net clinical benefit across a broad range of decision thresholds, supporting the clinical utility of MRI beyond improvements in discrimination and calibration alone.
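As a reminder of what the decision curves plot, the sketch below computes net benefit at a given threshold probability using the standard definition (true positives per patient minus false positives per patient weighted by the odds of the threshold), together with the "treat all" reference strategy; "treat none" has a net benefit of zero by definition. The function names and the four-patient example are hypothetical and serve only to make the arithmetic concrete.

```python
def net_benefit(probs, outcomes, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    n = len(outcomes)
    tp = sum(1 for p, y in zip(probs, outcomes) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, outcomes) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(outcomes, threshold):
    """Net benefit of treating every patient regardless of predicted risk."""
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical predicted risks and outcomes for four patients (illustration only).
probs = [0.9, 0.6, 0.4, 0.1]
outcomes = [1, 1, 0, 0]

for pt in (0.2, 0.5):
    print(pt,
          round(net_benefit(probs, outcomes, pt), 4),
          round(net_benefit_treat_all(outcomes, pt), 4))
```

A model adds clinical value at a given threshold when its net benefit exceeds both reference strategies, which is the comparison summarized above.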
Exploratory analysis: Gleason upgrade in biopsy GG1–2 subgroup
Exploratory analyses of MRI effects on biopsy-to-surgery Gleason discordance did not demonstrate a meaningful or independent association when evaluated in the entire cohort. However, when the analysis was restricted to men with biopsy Gleason GG1–2 (a clinically relevant subgroup for AS, focal treatment, or deferred intervention), higher PI-RADS categories remained associated with pathological upgrading after adjustment for age and PSA, reaching statistical significance for Reader 2 and borderline significance for Reader 1.
For Reader 1, each one-category increase in PI-RADS score was associated with an adjusted OR of 2.16 (95% CI: 1.00–4.84, P=0.055). For Reader 2, the adjusted OR was 2.68 (95% CI: 1.22–6.17, P=0.02).
Given the exploratory nature of this subgroup analysis and the limited sample size, these findings were not incorporated into the main predictive models but may provide additional context for patient counseling and risk stratification in lower-risk disease.
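To make the per-category odds ratios above more concrete, the following sketch shows how an odds ratio compounds over more than one PI-RADS category and how it shifts an absolute probability. It assumes the effect is constant on the log-odds scale across categories, which is what a single per-category OR implies. The baseline upgrade probability of 10% is a hypothetical value chosen for illustration and is not a result from this study; the function names are likewise illustrative.

```python
def odds_ratio_over_steps(or_per_category: float, steps: int) -> float:
    """Odds ratio implied by moving `steps` categories, assuming a constant per-category OR."""
    return or_per_category ** steps


def apply_odds_ratio(baseline_prob: float, odds_ratio: float) -> float:
    """Convert a baseline probability to the probability after applying an odds ratio."""
    odds = baseline_prob / (1.0 - baseline_prob) * odds_ratio
    return odds / (1.0 + odds)


reader2_or = 2.68        # adjusted OR per one-category increase (Reader 2, reported above)
baseline = 0.10          # hypothetical baseline upgrade probability (assumption)

print(round(odds_ratio_over_steps(reader2_or, 2), 2))   # two-category increase
print(round(apply_odds_ratio(baseline, reader2_or), 3))  # one-category increase
```

Because the confidence intervals are wide, any such conversion carries the same uncertainty as the underlying estimates.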
DCA in the biopsy GG1–2 subgroup
To further explore the potential clinical implications of MRI in this lower-risk population, DCA was performed in patients with biopsy Gleason GG1–2. Across clinically relevant threshold probabilities—approximately between 0.10 and 0.35—MRI-integrated models demonstrated greater net benefit than the clinical model incorporating age, PSA, and bxGS, as well as the default “treat all” and “treat none” strategies (Figure 4).
Within this threshold range, the MRI-integrated models provided incremental net benefit over the clinical model alone, indicating that MRI contributed additional risk stratification beyond information already available from biopsy and clinical variables. This suggests improved identification of patients at higher risk of pathological upgrading while reducing unnecessary intervention among those with truly low-risk disease.
At higher threshold probabilities (above approximately 0.40), net benefit for all models converged toward zero, reflecting increasing uncertainty and the relatively small number of upgrade events in this subgroup. This pattern is expected in exploratory subgroup analyses and highlights that the potential clinical value of MRI is most pronounced at moderate risk thresholds relevant to shared decision-making in AS.
Taken together, these exploratory findings suggest that, among men with biopsy GG1–2 disease, MRI may provide clinically meaningful incremental utility beyond biopsy and standard clinical parameters when estimating upgrade risk. Given the limited sample size and post hoc nature of this analysis, these results should be interpreted cautiously and warrant validation in larger, dedicated AS cohorts.
Discussion
MRI has become an integral component of prostate cancer diagnostic pathways and is now recommended in multiple clinical guidelines, particularly in biopsy-naive individuals where MRI can safely help determine the need for biopsy (50-52). MRI is also valuable for detecting lesions after a negative transrectal ultrasound-guided biopsy and in evaluating candidates for AS (51).
csPCa (Gleason score ≥7 or ISUP GG ≥2) carries a higher likelihood of progression, metastasis, or cancer-related mortality and is commonly used as a threshold for definitive treatment (53). Using RP specimens as the reference standard, our study reaffirms the robust diagnostic validity of multiparametric MRI in detecting csPCa. Rates of csPCa increased progressively across PI-RADS categories, and nearly all high-risk grade groups (ISUP ≥4) clustered within PI-RADS 4–5. We observed a strong linear relationship between PI-RADS category and final rpGS, with each one-category increase in PI-RADS associated with an 8–10-fold increase in the odds of higher rpGS. Importantly, PI-RADS remained an independent predictor of rpGS even after adjustment for age, PSA, and bxGS, indicating that MRI captures morphologic features of tumor aggressiveness that may be under-sampled or missed by systematic biopsy alone.
Although bxGS remains the dominant determinant of clinical management, our findings demonstrate that MRI provides meaningful incremental value. Incorporation of PI-RADS into a standard clinical model (age, PSA, and bxGS) improved discrimination and explanatory power for final pathological GG. These gains persisted after internal validation, with optimism-corrected C-indices and calibration slopes remaining close to ideal values, supporting the robustness and generalizability of MRI-integrated models.
Determining the optimal threshold for defining a “positive MRI” remains a key clinical challenge. Prior studies generally support PI-RADS ≥4 as a threshold for substantial risk of csPCa, whereas PI-RADS 3 demonstrates highly variable malignancy rates and often requires adjunctive parameters such as PSA density or lesion volume (54). Our results align with this evidence: PI-RADS 4 offered comparable accuracy but higher specificity than PI-RADS 3.
PI-RADS 3, by design, represents a heterogeneous category. Consistent with prior reports (54,55), we observed notable inter-reader variability in csPCa prevalence within PI-RADS 3, whereas PI-RADS 4–5 demonstrated high consistency across readers. These findings reinforce PI-RADS 4 as a more reliable clinical threshold for risk stratification.
While csPCa detection remains a benchmark for MRI performance, reliance on a binary definition oversimplifies contemporary decision-making. Expanding evidence supports broader eligibility for AS, including selected patients with favorable intermediate-risk disease (GG2) (56-59). Contemporary European cohorts utilizing pre-biopsy MRI selection for GG2 disease report 3-year metastasis-free survival rates of 98.1%, supporting the safety of MRI-informed AS expansion beyond traditional ISUP GG1 criteria (60). Nevertheless, current clinical staging still relies primarily on Gleason score, digital rectal examination, and PSA, with MRI not yet formally integrated into staging algorithms (61).
This context highlights the prognostic relevance of PI-RADS 5 observed in our cohort. Patients with PI-RADS 5 demonstrated substantially higher prevalence of adverse pathological features, including GG ≥3 disease, EPE, SVI, LVI, PNI, and nodal metastasis. PI-RADS was also an independent predictor of pathological T3 disease, with each one-category increase associated with approximately a threefold increase in odds. Correspondingly, PI-RADS 5 yielded the highest specificity and accuracy for identifying aggressive pathology. These observations are consistent with prior literature showing that MRI-integrated models can improve discrimination for EPE and SVI (62). Current AUA/ASTRO/SUO guidelines emphasize that the decision between adjuvant and early salvage radiotherapy should consider not only pathological stage but also PSA kinetics and comorbidities, with consolidative therapy increasingly recognized as a distinct clinical entity requiring specific timing considerations (19).
Importantly, MRI-integrated models not only improved discrimination but also demonstrated stable calibration across clinically relevant thresholds and superior net clinical benefit on DCA. This indicates that MRI contributes actionable risk stratification beyond statistical performance, particularly in guiding decisions where overtreatment and undertreatment must be carefully balanced.
The distinction is clinically meaningful for patients with intermediate biopsy findings. A patient with biopsy GG2 and PI-RADS 5 represents a markedly different risk profile from one with PI-RADS 3, despite identical biopsy classification. This differentiation becomes especially relevant when considering nerve-sparing surgery, extent of lymph node dissection, or candidacy for AS.
Our exploratory subgroup analysis among men with biopsy GG1–2—those most likely to be considered for AS—further supports this concept. Within this lower-risk cohort, pathological upgrading increased approximately two- to three-fold with each incremental increase in PI-RADS score. DCA in this subgroup demonstrated that MRI-integrated models provided incremental net benefit over a clinical model incorporating age, PSA, and bxGS, particularly at moderate risk thresholds relevant to shared decision-making. These findings suggest that MRI may improve identification of patients at higher risk of harboring aggressive disease while reducing unnecessary intervention among truly low-risk individuals.
Although exploratory, these results align with emerging evidence supporting selective MRI use prior to AS enrollment, particularly in settings where MRI is not routinely incorporated into eligibility assessment (53,63-65). However, the 95% confidence intervals were wide because of the limited number of patients in this subgroup. These findings therefore require validation in larger cohorts and should be interpreted as hypothesis-generating and with appropriate caution. While our subgroup analysis was underpowered, external MRI-selected cohorts demonstrate that MRI may identify GG2 patients with more indolent disease biology who remain suitable for surveillance, consistent with our observation of differential upgrading rates across PI-RADS categories (60).
Beyond local T staging, mpMRI and PI-RADS should be considered within a broader multimodality staging framework. mpMRI provides strong performance for local assessment, particularly EPE and SVI, and can enhance established clinical nomograms when incorporated into risk models (62,66). Recent advances in MRI-targeted biopsy techniques have further improved Gleason score concordance between biopsy and RP specimens, reducing upgrading rates and supporting more accurate preoperative risk assessment (67-69). In contrast, prostate-specific membrane antigen positron emission tomography/computed tomography (PSMA PET/CT) generally offers superior sensitivity for regional and distant nodal metastases compared with MRI or CT, and is increasingly recommended for N and M staging (70,71), whereas mpMRI remains the primary tool for detailed local T staging. In our cohort, nodal involvement was not evaluated by direct MRI lymph node assessment; instead, we examined the association between PI-RADS category, especially PI-RADS 5, and pathological N status. Thus, while PI-RADS 5 can reasonably inform local T staging, its relationship with nodal disease should be interpreted as an indirect association rather than direct nodal detection, and our findings are not directly comparable to staging systems or imaging modalities that explicitly target lymph node metastases.
These integrated imaging approaches—combining MRI-targeted biopsy, advanced local staging, and functional imaging with PSMA PET/CT—suggest that future risk stratification will increasingly rely on multimodal assessment rather than any single imaging technique. Although our study focused specifically on PI-RADS performance for morphologic risk assessment, correlating MRI findings with complementary imaging modalities represents an important direction for optimizing treatment planning and patient selection.
Interobserver agreement for PI-RADS scoring in our study was near-perfect, presumably reflecting comparable subspecialty training and clinical experience. While reader expertise is known to influence MRI performance (72-75), the consistency observed here supports the reproducibility of PI-RADS when applied by experienced practitioners. Nonetheless, agreement was evaluated between two subspecialty abdominal radiologists in a tertiary-referral prostatectomy cohort. Exclusion of poor-quality examinations, use of similarly experienced readers, and the spectrum-enriched cohort likely reduced the proportion of indeterminate or borderline lesions and may have contributed to the very high kappa. As a result, interobserver agreement and diagnostic performance in community or less-specialized settings may be lower, and our findings should be generalized with caution.
Strengths and limitations
Our study benefits from the use of RP histopathology as the reference standard, minimizing verification bias inherent to biopsy-based studies (8). We further strengthened methodological rigor through internal bootstrap validation, calibration assessment, and DCA, providing a comprehensive evaluation of model performance beyond traditional accuracy metrics.
Limitations include the retrospective, single-center design and the spectrum-enriched population inherent to RP cohorts. Evaluation of nodal disease was underpowered due to limited lymph node dissection, and PSA density was not incorporated, although it could further refine risk stratification. Whole-mount prostatectomy correlation was unavailable, reflecting institutional practice. The single tertiary referral-center setting may also have led to an overrepresentation of more advanced cases, and the relatively modest sample size (190 patients) increases the risk of overfitting in multivariable analyses, despite our efforts to perform bootstrap internal validation and maintain parsimonious models. Finally, exclusion of patients who had received prior systemic therapy or had non-interpretable MRI examinations, although limited in number, may introduce some selection bias and slightly overestimate performance compared with fully unselected populations.
Conclusions
Our study validates PI-RADS v2.1 as a robust and reproducible tool that provides clinically meaningful value beyond tumor detection alone. PI-RADS 4 emerges as an optimal threshold for diagnosing csPCa, while PI-RADS 5 functions as a distinct prognostic marker strongly associated with adverse pathological features. MRI demonstrates incremental value over standard clinical models, retaining discrimination, calibration, and net clinical benefit after internal validation. These findings support integrating MRI into post-biopsy risk stratification and clinical decision-making, particularly for patients being considered for AS or tailored surgical approaches.
Acknowledgments
Artificial intelligence tools, including large language models, were used to assist with proofreading, editing for clarity, and debugging during statistical analysis. All initial drafts and substantive revisions were written by the authors. The authors confirm that all data used in the study are authentic, accurately represented, and fully derived from the original dataset. AI assistance did not generate, manipulate, or fabricate any clinical or statistical data. The authors gratefully acknowledge Nabih Nakrour, M.D. for his valuable assistance in securing and managing the necessary IRB documentation for this study. The authors also thank Dr. Dylan Southard for his valuable assistance with English language refinement.
Footnote
Reporting Checklist: The authors have completed the TRIPOD reporting checklist. Available at https://tau.amegroups.com/article/view/10.21037/tau-2025-1-910/rc
Data Sharing Statement: Available at https://tau.amegroups.com/article/view/10.21037/tau-2025-1-910/dss
Peer Review File: Available at https://tau.amegroups.com/article/view/10.21037/tau-2025-1-910/prf
Funding: None.
Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://tau.amegroups.com/article/view/10.21037/tau-2025-1-910/coif). The authors have no conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. This study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments, and complied with the Health Insurance Portability and Accountability Act (HIPAA). This study was approved by the Institutional Review Board (IRB) of Massachusetts General Hospital (IRB No. 2019P002618). Given the retrospective design and use of de-identified clinical data, the requirement for informed consent was waived.
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
References
- Gandaglia G, Leni R, Bray F, et al. Epidemiology and Prevention of Prostate Cancer. Eur Urol Oncol 2021;4:877-92.
- Schafer EJ, Laversanne M, Sung H, et al. Recent Patterns and Trends in Global Prostate Cancer Incidence and Mortality: An Update. Eur Urol 2025;87:302-13.
- James ND, Tannock I, N'Dow J, et al. The Lancet Commission on prostate cancer: planning for the surge in cases. Lancet 2024;403:1683-722.
- Culp MB, Soerjomataram I, Efstathiou JA, et al. Recent Global Patterns in Prostate Cancer Incidence and Mortality Rates. Eur Urol 2020;77:38-52.
- Tohi Y, Kato T, Fujiwara K, et al. Shifts in Diagnostic Approaches for Prostate Cancer: Impact of MRI-Informed Biopsies on Low-Risk Cancer Detection. Int J Urol 2025;32:1622-7.
- Srigley JR, Delahunt B, Egevad L, et al. One is the new six: The International Society of Urological Pathology (ISUP) patient-focused approach to Gleason grading. Can Urol Assoc J 2016;10:339-41.
- Swanson GP, Trevathan S, Hammonds KAP, et al. Gleason Score Evolution and the Effect on Prostate Cancer Outcomes. Am J Clin Pathol 2021;155:711-7.
- Pasecinic V, Novacescu D, Zara F, et al. Predictors of ISUP Grade Group Discrepancies Between Biopsy and Radical Prostatectomy: A Single-Center Analysis of Clinical, Imaging, and Histopathological Parameters. Cancers (Basel) 2025;17:2595.
- Danneman D, Wiklund F, Wiklund NP, et al. Prognostic significance of histopathological features of extraprostatic extension of prostate cancer. Histopathology 2013;63:580-9.
- Faisal FA, Tosoian JJ, Han M, et al. Clinical, Pathological and Oncologic Findings of Radical Prostatectomy with Extraprostatic Extension Diagnosed on Preoperative Prostate Biopsy. J Urol 2019;201:937-42.
- Park CK, Chung YS, Choi YD, et al. Revisiting extraprostatic extension based on invasion depth and number for new algorithm for substaging of pT3a prostate cancer. Sci Rep 2021;11:13952.
- Bronkema C, Rakic N, Abdollah F. Adjuvant radiotherapy in prostate cancer patients with positive margins or extracapsular extension. Ann Transl Med 2019;7:S291.
- Renzulli JF 2nd, Brito J 3rd, Kim IY, et al. A meta-analysis on the use of radiotherapy after prostatectomy: adjuvant versus early salvage radiation. Prostate Int 2022;10:80-4.
- Labra A, González F, Silva C, et al. MRI/TRUS fusion vs. systematic biopsy: intra-patient comparison of diagnostic accuracy for prostate cancer using PI-RADS v2. Abdom Radiol (NY) 2020;45:2235-43.
- Teramoto Y, Numbere N, Wang Y, et al. The Clinical Significance of pT3a Lesions as Well as Unilateral Versus Bilateral Invasion Into the Seminal Vesicle in Men With pT3b Prostate Cancer: A Proposal for a New pT3b Subclassification. Arch Pathol Lab Med 2023;147:1261-7.
- Numbere N, Teramoto Y, Gurung PMS, et al. The Clinical Impact of Unilateral Versus Bilateral Invasion Into the Seminal Vesicle in Patients With Prostate Cancer Undergoing Radical Prostatectomy. Arch Pathol Lab Med 2022;146:855-61.
- Suh J, Jeong IG, Jeon HG, et al. Bilateral Seminal Vesicle Invasion as a Strong Prognostic Indicator in T3b Prostate Cancer Patients Following Radical Prostatectomy: A Comprehensive, Multicenter, Long-term Follow-up Study. Cancer Res Treat 2024;56:885-92.
- Bang S, Shin SJ, Kim DK, et al. Clinical implication of peri-seminal vesicle soft-tissue invasion in patients with pT3b prostate cancer. Prostate Int 2025;13:246-52.
- Morgan TM, Boorjian SA, Buyyounouski MK, et al. Salvage Therapy for Prostate Cancer: AUA/ASTRO/SUO Guideline Part III: Salvage Therapy After Radiotherapy or Focal Therapy, Pelvic Nodal Recurrence and Oligometastasis, and Future Directions. J Urol 2024;211:526-32.
- Choi MH, Kim DH, Lee YJ, et al. Imaging features of the PI-RADS for predicting extraprostatic extension of prostate cancer: systematic review and meta-analysis. Insights Imaging 2023;14:77.
- Li T, Graham PL, Cao B, et al. Accuracy of MRI in detecting seminal vesicle invasion in prostate cancer: a systematic review and meta-analysis. BJU Int 2025;135:17-28.
- Saeter T, Vlatkovic L, Waaler G, et al. Combining lymphovascular invasion with reactive stromal grade predicts prostate cancer mortality. Prostate 2016;76:1088-94.
- Jeong JU, Nam TK, Song JY, et al. Prognostic significance of lymphovascular invasion in patients with prostate cancer treated with postoperative radiotherapy. Radiat Oncol J 2019;37:215-23.
- Park YH, Kim Y, Yu H, et al. Is lymphovascular invasion a powerful predictor for biochemical recurrence in pT3 N0 prostate cancer? Results from the K-CaP database. Sci Rep 2016;6:25419.
- Jiang W, Zhang L, Wu B, et al. The impact of lymphovascular invasion in patients with prostate cancer following radical prostatectomy and its association with their clinicopathological features: An updated PRISMA-compliant systematic review and meta-analysis. Medicine (Baltimore) 2018;97:e13537.
- Suresh N, Teramoto Y, Goto T, et al. Clinical significance of perineural invasion by prostate cancer on magnetic resonance imaging-targeted biopsy. Hum Pathol 2022;121:65-72.
- Zhang LJ, Wu B, Zha ZL, et al. Perineural invasion as an independent predictor of biochemical recurrence in prostate cancer following radical prostatectomy or radiotherapy: a systematic review and meta-analysis. BMC Urol 2018;18:5.
- Sathianathen NJ, Furrer MA, Mulholland CJ, et al. Lymphovascular Invasion at the Time of Radical Prostatectomy Adversely Impacts Oncological Outcomes. Cancers (Basel) 2023;16:123.
- Bjurlin MA, Carroll PR, Eggener S, et al. Update of the Standard Operating Procedure on the Use of Multiparametric Magnetic Resonance Imaging for the Diagnosis, Staging and Management of Prostate Cancer. J Urol 2020;203:706-12.
- Mottet N, Bellmunt J, Bolla M, et al. EAU-ESTRO-SIOG Guidelines on Prostate Cancer. Part 1: Screening, Diagnosis, and Local Treatment with Curative Intent. Eur Urol 2017;71:618-29.
- Lee CH, Tan TW, Tan CH. Multiparametric MRI in Active Surveillance of Prostate Cancer: An Overview and a Practical Approach. Korean J Radiol 2021;22:1087-99.
- Fernandes MC, Yildirim O, Woo S, et al. The role of MRI in prostate cancer: current and future directions. MAGMA 2022;35:503-21.
- Nam R, Patel C, Milot L, et al. Prostate MRI versus PSA screening for prostate cancer detection (the MVP Study): a randomised clinical trial. BMJ Open 2022;12:e059482.
- Klotz L, Chin J, Black PC, et al. Comparison of Multiparametric Magnetic Resonance Imaging-Targeted Biopsy With Systematic Transrectal Ultrasonography Biopsy for Biopsy-Naive Men at Risk for Prostate Cancer: A Phase 3 Randomized Clinical Trial. JAMA Oncol 2021;7:534-42.
- Eklund M, Jäderling F, Discacciati A, et al. MRI-Targeted or Standard Biopsy in Prostate Cancer Screening. N Engl J Med 2021;385:908-20.
- Siddiqui MM, Rais-Bahrami S, Turkbey B, et al. Comparison of MR/ultrasound fusion-guided biopsy with ultrasound-guided biopsy for the diagnosis of prostate cancer. JAMA 2015;313:390-7.
- Kasivisvanathan V, Rannikko AS, Borghi M, et al. MRI-Targeted or Standard Biopsy for Prostate-Cancer Diagnosis. N Engl J Med 2018;378:1767-77.
- Urase Y, Ueno Y, Tamada T, et al. Comparison of prostate imaging reporting and data system v2.1 and 2 in transition and peripheral zones: evaluation of interreader agreement and diagnostic performance in detecting clinically significant prostate cancer. Br J Radiol 2022;95:20201434.
- Turkbey B, Rosenkrantz AB, Haider MA, et al. Prostate Imaging Reporting and Data System Version 2.1: 2019 Update of Prostate Imaging Reporting and Data System Version 2. Eur Urol 2019;76:340-51.
- Scott R, Misser SK, Cioni D, et al. PI-RADS v2.1: What has changed and how to report. SA J Radiol 2021;25:2062.
- Purysko AS, Baroni RH, Giganti F, et al. PI-RADS Version 2.1: A Critical Review, From the AJR Special Series on Radiology Reporting and Data Systems. AJR Am J Roentgenol 2021;216:20-32.
- Szempliński S, Kamecki H, Dębowska M, et al. Predictors of Clinically Significant Prostate Cancer in Patients with PIRADS Categories 3-5 Undergoing Magnetic Resonance Imaging-Ultrasound Fusion Biopsy of the Prostate. J Clin Med 2022;12:156.
- Nowier A, Mazhar H, Salah R, et al. Performance of multi-parametric magnetic resonance imaging through PIRADS scoring system in biopsy naïve patients with suspicious prostate cancer. Arab J Urol 2022;20:121-5.
- Nakai H, Takahashi H, LeGout JD, et al. Estimated diagnostic performance of prostate MRI performed with clinical suspicion of prostate cancer. Insights Imaging 2024;15:271.
- Bastian-Jordan M. Magnetic resonance imaging of the prostate and targeted biopsy, Comparison of PIRADS and Gleason grading. J Med Imaging Radiat Oncol 2018;62:183-7.
- Wenzel M, Hoeh B, Mandel P, et al. Diagnosis of Clinically Significant Prostate Cancer Diagnosis Without Histological Proof in the Prostate-specific Membrane Antigen Era: The Jury Is Still Out. Eur Urol Open Sci 2022;45:50-1. [Crossref] [PubMed]
- Magi-Galluzzi C, Montironi R, Epstein JI. Contemporary Gleason grading and novel Grade Groups in clinical practice. Curr Opin Urol 2016;26:488-92. [Crossref] [PubMed]
- Yilmaz EC, Shih JH, Belue MJ, et al. Prospective Evaluation of PI-RADS Version 2.1 for Prostate Cancer Detection and Investigation of Multiparametric MRI-derived Markers. Radiology 2023;307:e221309. [Crossref] [PubMed]
- Gelikman DG, Azar WS, Yilmaz EC, et al. A Prostate Imaging-Reporting and Data System version 2.1-based predictive model for clinically significant prostate cancer diagnosis. BJU Int 2025;135:751-9. [Crossref] [PubMed]
- Cornford P, van den Bergh RCN, Briers E, et al. EAU-EANM-ESTRO-ESUR-ISUP-SIOG Guidelines on Prostate Cancer-2024 Update. Part I: Screening, Diagnosis, and Local Treatment with Curative Intent. Eur Urol 2024;86:148-63. [Crossref] [PubMed]
- Mew A, Chau E, Bera K, et al. Recommendations from Imaging, Oncology, and Radiology Organizations to Guide Management in Prostate Cancer: Summary of Current Recommendations. Radiol Imaging Cancer 2025;7:e240091.
- Hamm CA, Asbach P, Pöhlmann A, et al. Oncological Safety of MRI-Informed Biopsy Decision-Making in Men With Suspected Prostate Cancer. JAMA Oncol 2025;11:145-53.
- Drost FH, Osses DF, Nieboer D, et al. Prostate MRI, with or without MRI-targeted biopsy, and systematic biopsy for detecting prostate cancer. Cochrane Database Syst Rev 2019;4:CD012663.
- Pellegrino F, Stabile A, Sorce G, et al. Added Value of Prostate-specific Antigen Density in Selecting Prostate Biopsy Candidates Among Men with Elevated Prostate-specific Antigen and PI-RADS ≥3 Lesions on Multiparametric Magnetic Resonance Imaging of the Prostate: A Systematic Assessment by PI-RADS Score. Eur Urol Focus 2024;10:634-40.
- Scialpi M, Martorana E, Aisa MC, et al. Score 3 prostate lesions: a gray zone for PI-RADS v2. Turk J Urol 2017;43:237-40.
- Sherer MV, Leonard AJ, Nelson TJ, et al. Prognostic Value of the Intermediate-risk Feature in Men with Favorable Intermediate-risk Prostate Cancer: Implications for Active Surveillance. Eur Urol Open Sci 2023;50:61-7.
- Pekala KR, Bergengren O, Eastham JA, et al. Active surveillance should be considered for select men with Grade Group 2 prostate cancer. BMC Urol 2023;23:152.
- Bernardino R, Sayyid RK, Leão R, et al. Using active surveillance for Gleason 7 (3+4) prostate cancer: A narrative review. Can Urol Assoc J 2024;18:135-44.
- Katelaris A, Amin A, Blazevski A, et al. Outcomes for active surveillance are similar for men with favourable risk ISUP-2 to those with ISUP-1 prostate cancer: A pair matched cohort study. J Clin Urol 2025;18:121-8.
- Baboudjian M, Leni R, Oderda M, et al. Active Surveillance of Grade Group 2 Prostate Cancer: Oncological Outcomes from a Contemporary European Cohort. Eur Urol Oncol 2025;8:1253-9.
- Oliveira T, Amaral Ferreira L, Marto CM, et al. The Role of Multiparametric MRI in the Local Staging of Prostate Cancer. Front Biosci (Elite Ed) 2023;15:21.
- Barletta F, Gandaglia G, Ploussard G, et al. SC127 - Added value of mpMRI, MRI-targeted and systematic biopsy in the prediction of adverse pathologic features in contemporary prostate cancer patients undergoing radical prostatectomy. European Urology Supplements 2019;18:e3232.
- Chung Y, Hong SK. Evaluating prostate cancer diagnostic methods: The role and relevance of digital rectal examination in modern era. Investig Clin Urol 2025;66:181-7.
- Carneiro A, Racy D, Bacchi CE, et al. Consensus on Screening, Diagnosis, and Staging Tools for Prostate Cancer in Developing Countries: A Report From the First Prostate Cancer Consensus Conference for Developing Countries (PCCCDC). JCO Glob Oncol 2021;7:516-22.
- Artiles Medina A, Rodríguez-Patrón Rodríguez R, Ruiz Hernández M, et al. Identifying Risk Factors for MRI-Invisible Prostate Cancer in Patients Undergoing Transperineal Saturation Biopsy. Res Rep Urol 2021;13:723-31.
- Zhu M, Gao J, Han F, et al. Diagnostic performance of prediction models for extraprostatic extension in prostate cancer: a systematic review and meta-analysis. Insights Imaging 2023;14:140.
- Pepe P, Pepe L, Fiorentino V, et al. Multiparametric MRI targeted prostate biopsy: When omit systematic biopsy? Arch Ital Urol Androl 2024;96:12992.
- Fiorentino V, Martini M, Dell'Aquila M, et al. Histopathological Ratios to Predict Gleason Score Agreement between Biopsy and Radical Prostatectomy. Diagnostics (Basel) 2020;11:10.
- Fiorentino V, Pepe L, Zuccalà V, et al. Gleason score down and upgrading at radical prostatectomy in targeted vs. systematic prostate biopsy: Findings from an institutional cohort. Pathol Res Pract 2025;271:156040.
- Tayara OM, Pełka K, Kunikowska J, et al. Comparison of Multiparametric MRI, [68Ga]Ga-PSMA-11 PET-CT, and Clinical Nomograms for Primary T and N Staging of Intermediate-to-High-Risk Prostate Cancer. Cancers (Basel) 2023;15:5838.
- Pepe P, Pepe L, Fiorentino V, et al. PSMA PET/CT Accuracy in Diagnosing Prostate Cancer Nodes Metastases. In Vivo 2024;38:2880-5.
- Oguzdogan GY, Adibelli ZH, Şefik E, et al. Accuracy and interobserver agreement of the correlation between prostate imaging reporting and data system version 2.1 and international society of urological pathology scores. Hong Kong J Radiol 2023;26:100-10.
- Ahmed HM, Ebeed AE, Hamdy A, et al. Interobserver agreement of Prostate Imaging–Reporting and Data System (PI-RADS–V2). Egyptian Journal of Radiology and Nuclear Medicine 2021;52:5.
- Jóźwiak R, Sobecki P, Lorenc T. Intraobserver and Interobserver Agreement between Six Radiologists Describing mpMRI Features of Prostate Cancer Using a PI-RADS 2.1 Structured Reporting Scheme. Life (Basel) 2023;13:580.
- Wei CG, Zhang YY, Pan P, et al. Diagnostic Accuracy and Interobserver Agreement of PI-RADS Version 2 and Version 2.1 for the Detection of Transition Zone Prostate Cancers. AJR Am J Roentgenol 2021;216:1247-56.

