Identification and validation of a 9-RBPs-related gene signature associated with prognosis and immune infiltration in bladder cancer based on bioinformatics analysis and machine learning
Highlight box
Key findings
• A novel gene signature was developed and validated based on nine RNA-binding proteins (RBPs), which may enhance prognostic evaluation and prediction for patients with bladder cancer (BLCA).
What is known and what is new?
• RBPs are essential in the developmental processes of cancer and have been linked to disease progression and prognosis. RBPs are implicated in the initiation, progression, and metastasis of BLCA, as well as in predicting patient survival.
• In this study, we identified nine RBPs associated with the prognosis of BLCA and constructed a signature for predicting the clinical prognosis of BLCA patients. These results may provide new insights into the mechanisms underlying the occurrence and progression of BLCA and contribute to developing new clinical therapeutic targets and prognostic biomarkers.
What is the implication, and what should change now?
• These differentially expressed RBPs are likely to play a significant role in the tumorigenesis, progression, invasion, and metastasis of BLCA. Identifying the RBP signature could enhance our understanding of the molecular mechanisms involved in BLCA progression.
• DERBPs should be incorporated into the prognostic assessment of BLCA patients. Additionally, the effectiveness of the RBP signature should be validated in larger, prospective, multicenter cohorts to support its further application.
Introduction
Bladder cancer (BLCA) ranks among the most prevalent malignant tumors of the urinary system (1). According to statistical data, approximately 613,000 new cases and 220,000 deaths occurred globally in 2022, positioning BLCA as the ninth most common cancer in terms of incidence and thirteenth in mortality among malignant tumors (1). BLCA is classified into two categories—non-muscle-invasive bladder cancer (NMIBC) and muscle-invasive bladder cancer (MIBC)—based on clinical symptoms and the extent of tumor infiltration, with NMIBC constituting 75% to 85% of cases, while MIBC accounts for 15% to 25% (2). Currently, the treatment of BLCA primarily involves surgical intervention, complemented by radiotherapy and chemotherapy (3). Most patients present with NMIBC upon initial diagnosis (4). Although NMIBC is not life-threatening, up to 70% of these patients experience tumor recurrence and postoperative deterioration (5), with 10% to 25% progressing to MIBC based on staging and grading (6,7). MIBC often leads to a poor prognosis, characterized by the invasion of blood vessels and lymph nodes, as well as the development of distant metastases. Both MIBC and metastatic BLCA are significant contributors to mortality and poor prognosis in BLCA patients, with 5-year survival rates estimated at only 36% to 48% and 5% to 36%, respectively (8). Furthermore, BLCA patients require repeated medical evaluations and treatments, rendering it one of the most costly malignancies to address (9). Despite advancements in the diagnosis and treatment of BLCA, the prognosis remains variable, underscoring the necessity for more accurate prognostic biomarkers and therapeutic targets. Improving long-term survival rates for BLCA patients relies on early detection, accurate risk assessment, timely interventions, and comprehensive monitoring (10). Identifying biomarkers specific to BLCA is essential not only for facilitating early and accurate diagnosis but also for effective tumor classification. 
Therefore, it is crucial to conduct systematic research and clinical studies focused on the accurate classification of BLCA and the discovery of novel potential biomarkers. Such efforts are vital for developing personalized, effective, and reliable therapeutic strategies aimed at enhancing the quality of life and improving the long-term survival of patients with BLCA.
RNA-binding proteins (RBPs) are proteins capable of binding to various types of RNA molecules, forming ribonucleoprotein complexes (11). These complexes are essential for maintaining gene expression homeostasis and regulating post-transcriptional RNA processes (12). RBPs are integral to numerous biological processes, and their regulatory mechanisms in tumor cells have been elucidated, encompassing alternative splicing, polyadenylation, stability, subcellular localization, and translation (13). As components of ribonucleoprotein complexes, RBPs can be affected by genomic and RNA editing-induced amino acid substitutions, which may affect their functional interactions (14). Consequently, any alterations in their expression levels or functional states may contribute to the pathophysiological changes associated with malignant tumors. Research has shown that RBPs are vital in the initiation and progression of several malignancies, including BLCA (14-16). Specifically, RBPs are implicated in the initiation, progression, and metastasis of BLCA (17), as well as in predicting patient survival (18). Thus, this study aims to explore the relationship between RBPs and BLCA prognosis to identify accurate and reliable prognostic biomarkers.
The biological process of malignant tumor development disrupts gene expression (19), leaving behind features that may contain valuable information. In recent years, bioinformatics and machine learning techniques have been extensively used to analyze gene and protein expression profiles, facilitating rapid and precise biomarker screening and the construction of prognostic models. Available datasets contain a large amount of information on prognostic genes in BLCA, but the relevant features are complex, nonlinear combinations embedded in the expression of many genes. Machine learning has become the primary method for uncovering these patterns and building predictive models from gene expression data (20). Several classical machine learning algorithms, such as linear regression analysis (17) and logistic regression analysis (15,21), have been employed in studies involving RBPs. While these approaches have yielded promising results, challenges remain in constructing models that are both sufficiently accurate and robust for practical clinical applications. Therefore, this study aims to apply bioinformatics and machine learning techniques to identify RBPs that are strongly associated with BLCA prognosis and to investigate their roles in biological processes. Moreover, the accuracy of the models will be validated through multiple machine learning algorithms to identify prognostic markers capable of effectively predicting BLCA prognosis. These findings may provide novel insights into the diagnosis, prognostic assessment, and therapeutic targets of BLCA, ultimately improving clinical decision-making, facilitating early diagnosis, enhancing disease monitoring, and optimizing treatment strategies. We present this article in accordance with the TRIPOD reporting checklist (available at https://tau.amegroups.com/article/view/10.21037/tau-2024-688/rc).
Methods
Data acquisition and preprocessing
The BLCA dataset (TCGA-BLCA) was obtained from the TCGA database (https://portal.gdc.cancer.gov/) using the “TCGAbiolinks” package in R (https://www.r-project.org/), and the GSE31684 dataset of BLCA patients was obtained from the Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/geo/) using the “GEOquery” package. Both downloads provided raw gene expression profiles and clinical characteristics of BLCA patients. The TCGA dataset contained 412 BLCA samples and 19 normal tissue samples and served as the training cohort. The GEO dataset contained 93 BLCA samples and was used as the validation cohort for follow-up analysis and model validation. The “limma” package in R was employed for background correction, median normalization and gene symbol conversion for both datasets. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments.
Identification of differentially expressed RBPs (DERBPs)
Differential expression analysis was performed using the “limma” package in R (version 4.4.1) to identify and screen for DERBPs between BLCA and normal tissue samples. DERBPs were defined by an absolute log2 fold change [log2(fold change)] greater than 1 and a false discovery rate (FDR)-adjusted P value below 0.05. To visualize the results of the differential expression analysis, heatmaps of genes associated with DERBPs were generated using the “pheatmap” package in R (version 4.4.1).
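The two selection thresholds above can be illustrated with a minimal sketch. The study itself used limma in R; the Python below is only an illustration and assumes the FDR adjustment is Benjamini-Hochberg. The gene names and values are hypothetical toy inputs, not data from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_derbps(genes, log2_fc, p_values, lfc_cutoff=1.0, fdr_cutoff=0.05):
    """Keep genes with |log2FC| > lfc_cutoff and adjusted P < fdr_cutoff."""
    fdr = benjamini_hochberg(p_values)
    return [
        gene
        for gene, lfc, q in zip(genes, log2_fc, fdr)
        if abs(lfc) > lfc_cutoff and q < fdr_cutoff
    ]


# Hypothetical toy values for illustration only.
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
log2_fc = [2.1, -1.6, 0.4, 1.3]
p_values = [0.001, 0.004, 0.0005, 0.30]
selected = select_derbps(genes, log2_fc, p_values)
print(selected)
```

Here GENE_C fails the fold-change criterion despite a small P value, and GENE_D fails the FDR criterion despite a large fold change, so both filters are needed.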
Screening hub RBPs-related genes through machine learning and determining their predictive ability
The TCGA dataset served as the training cohort for identifying RBPs-related genes significantly associated with BLCA prognosis. Candidate genes for the prognostic model were selected by univariate Cox regression analysis with a P value threshold of 0.05. Risk models were created based on gene expression values and the corresponding regression coefficients. Within the training cohort, we constructed models using combinations of multiple machine learning algorithms and calculated the average concordance index (C-index) of each model to assess the consistency and predictive power of the identified signature genes across different algorithms. Ultimately, a risk score prediction model was established. The risk value for each patient was determined using the following formula:
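$$\text{RiskScore} = \sum_{i=1}^{n} \beta_i \times \text{Exp}_i$$
In this equation, $\beta_i$ denotes the coefficient of the $i$-th RBP, $\text{Exp}_i$ represents the expression level of that RBP, and $n$ is the number of RBPs in the signature.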
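The risk score described above is a weighted sum of expression values, which a short sketch can make concrete. The coefficients, gene names, and patients below are hypothetical, and the median is used only as a stand-in for the cut-off; the study selected a best cut-off that is not reproduced here.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Weighted sum of expression values over the signature genes."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)


def stratify(scores, cutoff):
    """Label each patient 'high' if the score exceeds the cutoff, else 'low'."""
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical coefficients and expression values for illustration only.
coefficients = {"GENE_A": 0.5, "GENE_B": -0.25}
patients = {
    "P1": {"GENE_A": 2.0, "GENE_B": 4.0},
    "P2": {"GENE_A": 4.0, "GENE_B": 2.0},
    "P3": {"GENE_A": 1.0, "GENE_B": 6.0},
}

scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}
cutoff = median(scores.values())  # assumption: median instead of the study's best cut-off
groups = stratify(scores, cutoff)
print(scores, groups)
```

A gene with a negative coefficient (a protective gene) lowers the score as its expression rises, while a gene with a positive coefficient raises it.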
Each patient’s risk score was calculated accordingly, and the cohort was subsequently categorized into high-risk and low-risk groups based on the best cut-off risk score. Kaplan-Meier survival curves were generated using the “survival” and “survminer” packages in R, with the log-rank test employed to compare overall survival (OS) between the high- and low-risk groups, where a P value of less than 0.05 indicated statistical significance.
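As a rough illustration of what a Kaplan-Meier curve computes for one risk group, the sketch below implements the product-limit estimator in plain Python. It is not the “survival” or “survminer” implementation, it omits confidence intervals and the log-rank test, and the follow-up times are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of survival.

    times  : follow-up time for each patient
    events : 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each observed event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set.
        n_at_risk -= removed
    return curve


# Hypothetical follow-up data for illustration only.
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
print(curve)
```

Censored patients contribute to the risk set until their last follow-up but never trigger a drop in the curve, which is why the estimate only changes at observed event times.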
Independent analysis of the prognostic model
To assess whether the prognostic signature developed in this study could function as an independent predictor of BLCA patient outcomes, we performed both univariate and multivariate Cox regression analyses that integrated clinical characteristics such as age, gender, and cancer stage, with P<0.05 as the threshold for significant associations. The analysis was conducted using the “survival” package in R. Additionally, the “timeROC” and “pROC” packages were used to generate time-dependent receiver operating characteristic (ROC) curves at 1, 3, and 5 years to evaluate the sensitivity and specificity of the prognostic model, along with calibration curves for the same time points.
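The idea behind an AUC at a fixed time horizon can be sketched as follows. This is a deliberately simplified version: patients with an observed event by the horizon are treated as cases, patients followed beyond the horizon as controls, and patients censored before the horizon are excluded. Dedicated packages handle censoring more carefully, so this sketch will not reproduce their values. All inputs are hypothetical.

```python
def auc_at_horizon(scores, times, events, horizon):
    """Probability that a case has a higher risk score than a control.

    Cases    : event observed at or before the horizon.
    Controls : still under follow-up after the horizon.
    Patients censored before the horizon are ignored (a simplification).
    """
    cases = [s for s, t, e in zip(scores, times, events) if t <= horizon and e == 1]
    controls = [s for s, t in zip(scores, times) if t > horizon]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5  # ties count as half
    return wins / (len(cases) * len(controls))


# Hypothetical risk scores and follow-up (years) for illustration only.
scores = [2.5, 1.8, 0.4, 1.0, 0.2, 1.5]
times = [1.0, 2.0, 6.0, 4.0, 7.0, 8.0]
events = [1, 1, 0, 1, 0, 0]
auc_5y = auc_at_horizon(scores, times, events, horizon=5.0)
print(round(auc_5y, 3))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of patients who die before the horizon from those who survive past it.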
Construction and verification of prognostic nomogram
To further improve the prediction of prognosis for BLCA patients, a nomogram was developed. Using the “rms” package in R (version 4.4.1), we constructed a nomogram that integrates the risk score with clinical characteristics, including gender, age, and stage, to forecast 1-, 3-, and 5-year survival for BLCA patients. The accuracy and reliability of the nomogram were assessed by plotting calibration curves with the “calibrate” function.
Analysis of immune features and immune subtypes in high- and low-risk groups
First, the “corrplot” package was used to assess the correlation between immune cell infiltration and the RiskScore. Next, we applied single-sample gene set enrichment analysis (ssGSEA) within the “GSVA” package of R software (version 4.4.1) to compare the levels of immune cell infiltration, the expression of immune checkpoints, and the activity of immune-related pathways between tumor samples from the high- and low-risk groups. A boxplot was generated to visually represent the differences in immune cell infiltration between these two groups. Furthermore, the “TCGAbiolinks” package in R (version 4.4.1) was used to retrieve immune subtyping data for BLCA from the TCGA database. Immune subtyping analysis was then conducted on tumor samples from both the high-risk and low-risk groups.
Gene set enrichment analysis (GSEA) of DERBPs
GSEA is a method for identifying significant biological functions and signaling pathways that are enriched within a predefined set of genes, typically based on differences observed between two biological conditions (22). In this study, genes were ranked for GSEA according to their expression correlation with DERBPs. The analysis was used to explore the differences between the high-expression and low-expression groups of DERBPs that may underlie their survival differences, and to examine the potential biological functions and signaling pathways associated with the hub DERBPs in the context of BLCA pathogenesis. An FDR threshold of less than 0.25 and a P value threshold of less than 0.05 were applied for statistical significance. The results were visualized using the “GOplot” package.
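The core quantity in GSEA is a running-sum enrichment score computed over a ranked gene list. The sketch below shows that calculation only; it omits the permutation step that yields P values and FDR, and it is not the implementation used in the study. The gene names, ranking metric, and gene sets are hypothetical.

```python
def enrichment_score(ranked_genes, metric, gene_set, weight=1.0):
    """Weighted running-sum enrichment score for one gene set.

    ranked_genes : genes ordered from most positive to most negative metric
    metric       : dict mapping gene -> ranking metric (e.g. a correlation)
    gene_set     : set of genes whose enrichment is being tested
    Returns the running-sum value with the largest absolute deviation from zero.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranked list only partially")
    hit_total = sum(
        abs(metric[g]) ** weight for g, is_hit in zip(ranked_genes, hits) if is_hit
    )
    running = 0.0
    best = 0.0
    for gene, is_hit in zip(ranked_genes, hits):
        if is_hit:
            running += abs(metric[gene]) ** weight / hit_total  # step up on a hit
        else:
            running -= 1.0 / n_miss  # step down on a miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranking metric for illustration only.
metric = {"G1": 3.0, "G2": 2.0, "G3": 1.0, "G4": -1.0, "G5": -2.0, "G6": -3.0}
ranked = sorted(metric, key=metric.get, reverse=True)
es_top = enrichment_score(ranked, metric, {"G1", "G2"})
es_bottom = enrichment_score(ranked, metric, {"G5", "G6"})
print(es_top, es_bottom)
```

A gene set concentrated at the top of the ranking produces a positive score, and one concentrated at the bottom produces a negative score.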
Statistical analysis
All data processing and statistical analyses were conducted using R software version 4.4.1. For comparisons between two sets of continuous data, independent-samples Student’s t-tests were used to assess statistical significance for normally distributed variables. For non-normally distributed data, the Mann-Whitney U test was applied. Spearman’s correlation analysis was performed to calculate correlation coefficients between genes, as well as between genes and immune cells, and to assess module-trait associations. Statistical significance was set at a P value of less than 0.05.
Results
Identification of differentially expressed genes
The bioinformatics analysis process of this study is summarized in Figure 1. The heatmap shows the expression of RBPs in BLCA samples and normal tissues, with a total of 116 DERBPs obtained (Figure 2A). These 116 DERBPs were then analyzed by univariate Cox regression analysis, yielding 34 RBPs associated with prognosis (P<0.05). Twenty-three of these DERBPs (PABPN1, TRMU, MRPL38, OAS1, CLK2, DUS4L, MTG1, MOV10, C2orf15, FASTKD3, GEMIN6, MRPS26, HEXIM2, SNRPF, PPARGC1B, RPS10, NOL12, SNRPA1, TRMT2A, EEF1E1, PABPC1L, POLR2H, GTPBP2) were identified as protective genes with a hazard ratio (HR) less than 1, and the other eleven (DARS2, CALR, IGF2BP3, ZC3HAV1L, ATXN1, XPOT, CSDC2, RBM24, ENOX1, RBPMS2, ZNF106) were identified as risk genes with an HR greater than 1. To visualize these findings, we generated a forest plot (Figure 2B).


Constructing the RBPs-related gene signature using machine learning methods
A hybrid approach incorporating 13 machine learning algorithms was used to analyze the 34 DERBPs associated with the prognosis of BLCA and to develop a prognostic signature. In the training set, 101 predictive models were generated by combining different algorithms and hyperparameter configurations. Model performance was evaluated using the C-index in both the training and validation sets, which quantifies the ability to rank survival outcomes accurately. Although the plsRcox model achieved a marginally higher C-index, it retained over 30 genes, introducing high complexity and susceptibility to overfitting. In contrast, RSF + Enet [alpha=0.2] retained sparsity through the L1 component of its penalty, yielding a concise 9-gene signature. This adherence to the principle of parsimony enhances interpretability and generalizability by minimizing redundant variables while retaining predictive power. A smaller gene set reduces noise from redundant signals and facilitates functional validation. The alpha parameter in Elastic Net balances the L1 (sparsity) and L2 (multicollinearity control) penalties; in the usual parameterization, alpha =1 corresponds to lasso and alpha =0 to ridge. Setting alpha =0.2 therefore weights the penalty toward L2 while keeping an L1 component that still performs variable selection, without sacrificing predictive accuracy. RSF + Enet [alpha=0.2] demonstrated superior stability and clinical utility (Figure 3A). By adhering to the bias-variance tradeoff, this model reduced the risk of overfitting while maintaining competitive discrimination. Based on this model, nine genes (OAS1, MTG1, DUS4L, IGF2BP3, NOL12, PABPC1L, ZC3HAV1L, TRMT2A, and TRMU) were identified as key prognostic markers, and BLCA patients in the dataset were classified into high- and low-risk groups. Kaplan-Meier survival curve analysis showed that OS was worse in the high-risk group than in the low-risk group in both cohorts; the difference was statistically significant in the TCGA cohort (P<0.001; Figure 3C) but did not reach the 0.05 threshold in the GEO cohort (P=0.06; Figure 3B).
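For reference, the Elastic Net penalty discussed above can be written out explicitly. The form below assumes the parameterization used by common implementations such as glmnet, in which alpha = 1 gives the lasso and alpha = 0 gives ridge regression. The source does not state which implementation was used, and the symbols $\lambda$ (overall penalty strength) and $p$ (number of candidate genes) are introduced here for illustration.

$$P_{\alpha,\lambda}(\beta) = \lambda\left[\alpha \sum_{j=1}^{p} |\beta_j| + \frac{1-\alpha}{2} \sum_{j=1}^{p} \beta_j^{2}\right]$$

Under this assumption, alpha = 0.2 gives $\lambda\left[0.2 \sum_j |\beta_j| + 0.4 \sum_j \beta_j^{2}\right]$. The L1 term is what drives some coefficients exactly to zero and so determines how many genes remain in the signature, while the L2 term stabilizes the estimates of correlated genes.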

Performance evaluation of the prognostic signature
To assess whether the RiskScore could serve as an independent prognostic factor for BLCA, both univariate and multivariate Cox regression analyses were performed, incorporating clinical characteristics such as stage, gender, and age. The results indicated that RiskScore, stage, and age were all significantly associated with survival (P<0.001) (Figure 4A,4B). Specifically, univariate analysis yielded the following hazard ratios (HRs): stage, HR =1.696; age, HR =1.035; RiskScore, HR =2.438 (Figure 4A). Multivariate analysis yielded: stage, HR =1.534; age, HR =1.029; RiskScore, HR =2.073 (Figure 4B). In the 5-year ROC of the model, the AUCs of RiskScore, stage, gender, and age were 0.661, 0.629, 0.479, and 0.674, respectively, indicating moderate predictive performance of the RiskScore (Figure 4C). Moreover, the model’s ability to predict survival at 1, 3, and 5 years was evaluated, with the RiskScore showing AUCs of 0.661, 0.655, and 0.676, respectively (Figure 4D). These findings support the use of the RiskScore, derived from the RBPs-related gene signature, as an independent prognostic factor for BLCA, highlighting the potential clinical value of DERBPs in BLCA prognosis.

Construction and validation of the nomogram model
A nomogram integrating clinical characteristics (gender, age, BLCA stage) and the RiskScore was developed for predicting 1-, 3-, and 5-year OS in BLCA patients (Figure 5A). The nomogram assigns weighted points to each variable based on their prognostic contributions. As shown in Figure 5A, BLCA stage contributed the highest points, reflecting its strong association with poor survival. To validate the RBPs-related gene signature and assess the accuracy of the nomogram, a prognostic calibration analysis was conducted. The calibration curves showed that the model-predicted survival (nomogram-predicted OS) was highly consistent with the observed outcomes (observed OS), especially at the 1- and 3-year time points (curves close to the diagonal line). The 5-year predicted values were slightly biased, which might be related to the missing data from long-term follow-up. The model achieved a C-index of 0.680 [95% confidence interval (CI): 0.639–0.721], suggesting that the nomogram is a reliable tool for predicting survival in BLCA patients (Figure 5B).
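The C-index reported for the nomogram measures how often the model ranks pairs of patients in the correct order. A minimal sketch of Harrell's C-index follows. It is an illustration rather than the routine used in the study, it does not compute the confidence interval, and the follow-up data are hypothetical.

```python
def concordance_index(times, events, scores):
    """Harrell's C-index for right-censored survival data.

    A pair (i, j) is comparable when patient i had an observed event
    strictly before patient j's follow-up time. The pair is concordant
    when the patient with the earlier event has the higher risk score.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up times, event flags, and risk scores.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
scores = [0.9, 0.3, 0.5, 0.1]
c_index = concordance_index(times, events, scores)
print(c_index)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives context for interpreting a value such as 0.680.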

Immune status varies among patients in different risk groups
The association between risk scores and immune status was examined using various algorithms. The immune cell bubble plots indicated a significant correlation between risk scores and the majority of immune cells, predominantly a positive one (P<0.05) (Figure 6A). Patients in the high-risk group exhibited higher expression levels of numerous immune checkpoints, suggesting a potential link to poorer survival outcomes in this group (Figure 6B). Additionally, ssGSEA was employed to assess immune-related pathways and cell types across the low- and high-risk groups. Results indicated that the high-risk group was associated with several immune-regulatory signaling pathways. Moreover, infiltration levels of various immune cells, including activated dendritic cells (aDCs), dendritic cells (DCs), and plasmacytoid dendritic cells (pDCs), were notably higher in the high-risk group than in the low-risk group. In terms of immune function, scores such as antigen-presenting cell (APC)_co_inhibition, APC_co_stimulation, chemokine receptor (CCR), check-point and cytolytic_activity were all elevated in the high-risk group relative to the low-risk group (Figure 6C). Previous research has categorized tumors in the TCGA database into six distinct subtypes based on their immune status: C1 (wound healing), C2 (IFN-γ dominant), C3 (inflammatory), C4 (lymphocyte depleted), C5 (immunologically quiet), and C6 (TGF-β dominant) (23). The subtype distribution observed in this study indicated that the proportions of C1 (43%), C3 (4%), and C4 (3%) in the low-risk group were lower than the corresponding proportions of C1 (44%), C3 (7%), and C4 (15%) in the high-risk group. Conversely, the proportion of subtype C2 was greater in the low-risk group (50%) than in the high-risk group (34%) (Figure 6D).

Signal pathway enrichment analysis of characteristic gene sets
To investigate the biological functions and pathways linked to DERBPs associated with poor survival outcomes, GSEA was conducted to elucidate the functional differences between the high- and low-risk groups identified through the prognostic signature (Figure 7A-7D). In the TCGA-BLCA cohort, GSEA results demonstrated that the high-risk group exhibited significant enrichment in pathways related to extracellular matrix (ECM) components and interactions, as well as pathways involving cytokine and receptor interactions.

Discussion
Currently, there are several clinical therapies for BLCA, including surgical resection, radiotherapy, chemotherapy and immunotherapy. Although each treatment modality has its own strengths, patients often have a poor prognosis because of the high recurrence rate, the propensity for progression and metastasis, and the multidrug resistance of BLCA (3,24,25). With the development of precision medicine for cancer and advances in biotechnology and bioinformatics, a diverse set of biomarkers has been established to predict disease prognosis and treatment response through genomic analysis and various bioinformatics tools (26-28). Therefore, the identification of reliable biomarkers is important for improving survival prediction in patients with BLCA.
RBPs are essential in the regulation of RNA. Dysregulation of RBPs has been implicated in the progression of various malignancies (29-31). However, the precise functional roles and underlying mechanisms of many RBPs in human cancers remain unclear (13,32). While several prognostic signatures for BLCA have recently been developed (15,21,28,33-35), limited research has focused on the specific role of RBPs in predicting BLCA survival, leaving a significant gap in understanding their relevance in this context. Consequently, the clinical value and potential molecular mechanisms of RBP-associated genes in BLCA warrant further investigation.
In this study, 116 differentially expressed RBPs between normal and BLCA tissues were identified through bioinformatics analysis using TCGA and GEO datasets. Bioinformatics methods combined with machine learning algorithms led to the identification of nine key RBP-related genes, and a prognostic signature was developed. The signature demonstrated prognostic utility for predicting OS in BLCA patients, as shown by ROC analysis. Additionally, a nomogram was constructed to predict OS at 1, 3, and 5 years, showing good agreement between predicted and observed survival. These findings suggest that the prognostic signature established in this study holds potential for predicting BLCA patient prognosis.
By combining machine learning algorithms, we screened nine RBPs-related genes that are closely associated with BLCA prognosis (OAS1, MTG1, DUS4L, IGF2BP3, NOL12, PABPC1L, ZC3HAV1L, TRMT2A and TRMU). Previous studies have highlighted the significant role of RBPs in tumorigenesis, yet the molecular mechanisms underlying their involvement in cancer pathogenesis remain poorly understood. Oligoadenylate synthetase 1 (OAS1) is a key interferon-stimulated gene effector protein essential in antiviral defense (36). OAS1 has been shown to be differentially expressed across cancer types and to correlate with patient prognosis, suggesting its potential as a prognostic marker (37). Additionally, OAS1 may regulate the anti-tumor immune response and has been identified as a valuable predictor for immunotherapy outcomes in BLCA, offering insights into BLCA development and potential individualized treatment protocols (33,38). MTG1, a conserved ribosomal guanosine triphosphatase, is a critical cofactor for mitochondrial translation. Studies have demonstrated that MTG1 is significantly overexpressed in BLCA tumor tissues, positioning it as a potential prognostic biomarker for BLCA (15). Although its involvement in tumor initiation and progression is acknowledged (39), the exact mechanisms by which MTG1 contributes to BLCA development remain unclear. Dihydrouridine synthase 4-like (DUS4L) has been implicated in transcriptional and genetic alterations in several cancers, including gastric and prostate cancers (40,41). Moreover, Li et al. reported that DUS4L is significantly upregulated in lung adenocarcinoma (LUAD) tissues and that its knockdown inhibits cell proliferation and promotes apoptosis in LUAD A549 cells (42). Further research has shown that DUS4L interacts with the signaling molecule GRB2, triggering epithelial-mesenchymal transition through the PI3K/AKT and ERK/MAPK pathways, thereby enhancing LUAD progression and metastasis (43).
Insulin-like growth factor II mRNA binding protein 3 (IGF2BP3) has been identified as a driver of BLCA progression, with studies revealing that PM2.5-induced m6A modifications regulate the stability of BIRC5 mRNA through METTL3/IGF2BP3, promoting BLCA proliferation and metastasis (44). Furthermore, IGF2BP3 has been shown to regulate programmed death ligand 1 (PD-L1) expression in BLCA, suggesting its potential as a prognostic marker for BLCA (45). NOL12 is a multifunctional RBP that acts as a link between RNA and DNA metabolism (46). NOL12 may serve as an independent prognostic predictor in renal clear cell carcinoma and hepatitis B virus (HBV)-related hepatocellular carcinoma (HCC) (47,48), with significant overexpression reported in HCC tissues (49). Its upregulation correlates with patient prognosis, making it a potential therapeutic target. PABPC1-like (PABPC1L), an important paralog of PABPC1, regulates mRNA translation and stability. Overexpression of PABPC1L has been linked to multiple cancers, including prostate cancer, renal cell carcinoma, and colorectal cancer (47,50-52). In colorectal cancer, PABPC1L overexpression correlates with poor prognosis, and its reduction inhibits the activation of p-AKT and p-PI3K, leading to suppressed cell proliferation and migration. These findings suggest PABPC1L as a potential therapeutic target for colorectal cancer (53). Zinc finger CCCH-type containing, antiviral 1-like (ZC3HAV1L) has been shown to limit the replication of specific viruses, playing a protective role against virus-associated cancers such as liver cancer and leukemia (54,55). TRMT2A has been linked to the risk of recurrence in breast cancer and can serve as a predictor of treatment response (56).
TRMU, a nuclear gene encoding a mitochondrial protein involved in tRNA modifications, has been found to be upregulated in BLCA tumor tissues and is significantly associated with the tumor immune microenvironment and immune status (57,58). Therefore, TRMU may serve as a potential biomarker for BLCA prognosis. Overall, our findings provide new insights into the role of RBPs in BLCA development and could offer valuable perspectives for future cancer diagnosis and therapy. However, the precise biological roles and mechanisms of these biomarkers in BLCA require further investigation to fully elucidate their potential clinical applications.
Recent studies have increasingly highlighted the crucial role of tumor immune infiltration in both tumorigenesis and cancer progression, with significant implications for immunotherapy (59-61). In this study, we aimed to explore the relationship between the constructed signature and tumor immune infiltration. Our findings revealed that the high-risk group exhibited higher expression of numerous immune checkpoint genes, which is consistent with a worse prognosis. Furthermore, ssGSEA immune infiltration analysis identified significant differences between the two subgroups in terms of immune-related pathways and cell types. Specifically, immune cells such as aDCs, DCs, and pDCs were more abundant in the high-risk group than in the low-risk group. Additionally, immune function scores, including APC co-inhibition, APC co-stimulation, CCR, checkpoint, and cytolytic activity, were elevated in the high-risk group, indicating that increased immune checkpoint activity may contribute to tumor immune evasion. High expression of immune checkpoints is typically associated with poorer patient prognosis in cancer (62,63). These findings suggest that DERBPs may be relevant to immunotherapy for BLCA patients. Immune subtypes are closely linked to tumor prognosis, with previous research showing that the C3 subtype has the best prognosis, while the C6 and C4 subtypes correlate with worse outcomes (64). Compared to the low-risk group, the high-risk group had higher proportions of subtypes C1, C3, and C4, most notably C4 (15% vs. 3%), and a lower proportion of subtype C2, which is consistent with a poorer prognosis for high-risk patients. These results imply that the disparity in prognosis between the high- and low-risk groups may partially arise from differences in the patients’ immune status.
Finally, GSEA was conducted to examine the differences in DERBPs-associated biological functions and signaling pathways between the high- and low-risk groups. The results revealed that DERBPs-related genes in the high-risk group were predominantly enriched in pathways involved in ECM composition and interaction, as well as cytokine and receptor interactions. Consequently, the poor survival outcomes observed in the high-risk group may be partly attributable to dysregulated ECM signaling. During cancer development, tumor cells tend to alter the surrounding microenvironment by secreting specific ECM components. Oncogenes can drive cancer progression by remodeling the ECM to enhance tumor cell viability, migration, and invasiveness (65). Alterations in the ECM can lead to remodeling of the tumor microenvironment, affecting cell signaling and promoting tumor growth and metastasis. For example, changes in ECM composition may affect tumor angiogenesis and enhance the supply of nutrients to the tumor (66). Moreover, oncogenes can modify the interaction between cells and the ECM by influencing the expression of ECM components and their receptors. Such modifications can enhance the interactions between tumor cells and the surrounding microenvironment, thereby facilitating the migration and invasiveness of these cells (65). Research has demonstrated a significant correlation between tumor-associated pathways and poor prognosis (27), indicating that the 9-RBPs-related gene signature warrants further investigation to uncover its potential mechanisms in BLCA. Overall, our analysis suggests that the 9-RBPs-related gene signature may serve as a prognostic biomarker that is also related to the tumor immune status, and could assist in tailoring personalized therapeutic strategies for patients with BLCA.
In conclusion, this study successfully developed a robust RBPs-related signature by systematically integrating data from TCGA and GEO, employing various widely recognized machine learning algorithms instead of relying on a single method. This approach allowed us to identify the model with the best average C-index performance, thereby minimizing the potential influence of overfitting among features on our results. Our predictive signature demonstrated commendable efficacy in forecasting the survival of patients with BLCA. While our research offers valuable insights into the role of RBPs in BLCA and their potential prognostic and therapeutic implications, several limitations must be acknowledged. First, our analysis is primarily based on publicly available datasets, which, although they provide a substantial amount of data, may carry inherent biases and limitations. The samples within these datasets might not fully represent the entire BLCA population, and variations in data quality and experimental methodologies across different studies could affect the results. Consequently, caution is warranted when generalizing our findings to the wider BLCA patient population. Second, due to the retrospective nature of this study, larger sample sizes and more comprehensive mechanistic investigations are necessary to validate the 9-RBPs-related gene signature. Future research should focus on evaluating these effects in larger, prospective, multicenter cohorts. Lastly, while our study identified nine prognostic RBPs significantly associated with OS in BLCA, the analysis was conducted solely through data mining. We primarily concentrated on the prognostic and therapeutic implications of DERBPs in BLCA; however, the specific molecular mechanisms and pathways through which these genes exert their effects remain inadequately explored. 
Further functional studies—including gene knockdown or overexpression experiments, pathway analyses, and mechanistic investigations—are crucial to fully elucidate the biological significance of DERBPs in BLCA and to examine the impact of other potential biological processes and environmental factors on this association. Despite these limitations, the predictive value of this signature in BLCA patients is noteworthy. Future well-designed, multi-institutional studies will be essential to further validate our findings.
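To make the model-selection criterion mentioned in this discussion concrete (choosing the candidate model with the highest average C-index across cohorts), the following is a minimal Python sketch. It is an illustration under stated assumptions, not the pipeline used in this study: the function names, model names, cohort names, and all numbers are hypothetical toy values, and the C-index is a simple implementation of Harrell's concordance index that skips pairs with tied follow-up times.

```python
from itertools import combinations
from statistics import mean


def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    times: follow-up times; events: 1 if death was observed, 0 if censored;
    risk_scores: higher score means higher predicted risk.
    Simplification (assumption): pairs with tied follow-up times are skipped.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that subject a has the shorter follow-up.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the times differ and the
        # subject with the shorter follow-up had an observed event.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # ties in predicted risk count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def best_model_by_mean_c_index(predictions, cohorts):
    """Return the model with the highest mean C-index across cohorts.

    predictions[model][cohort] -> list of risk scores
    cohorts[cohort] -> (times, events)
    """
    mean_c = {
        model: mean(
            harrell_c_index(*cohorts[name], scores)
            for name, scores in per_cohort.items()
        )
        for model, per_cohort in predictions.items()
    }
    best = max(mean_c, key=mean_c.get)
    return best, mean_c


# Hypothetical toy data: (follow-up times, event indicators) per cohort.
cohorts = {
    "training": ([5, 8, 12, 20, 30], [1, 1, 0, 1, 0]),
    "validation": ([3, 9, 15, 24], [1, 0, 1, 1]),
}
# Hypothetical risk scores from two candidate models.
predictions = {
    "model_A": {"training": [0.9, 0.7, 0.4, 0.5, 0.1],
                "validation": [0.8, 0.3, 0.2, 0.6]},
    "model_B": {"training": [0.2, 0.9, 0.5, 0.4, 0.6],
                "validation": [0.1, 0.5, 0.9, 0.4]},
}

best, mean_c = best_model_by_mean_c_index(predictions, cohorts)
print(best, mean_c)
```

Averaging over cohorts rather than ranking models on the training cohort alone favors models whose discrimination carries over to independent data, which is the rationale given above for comparing several algorithms.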
Conclusions
In conclusion, this study systematically analyzed the biological functions and prognostic significance of differentially expressed RBPs in BLCA using bioinformatics techniques and machine learning algorithms. We identified nine DERBPs that are associated with the prognosis of BLCA. These RBPs are likely to play a critical role in the tumorigenesis, progression, invasion, and metastasis of BLCA. A novel prognostic signature was developed and validated based on these nine DERBPs, demonstrating its potential as an independent prognostic factor for BLCA. Furthermore, analyses of immune cell infiltration and immune checkpoint expression highlighted the prognostic signature’s potential utility in forecasting responses to immunotherapy, and enrichment analyses indicated that the high-risk group was associated with tumor-related pathways. Collectively, these findings offer valuable insights into the mechanisms that underlie the development and progression of BLCA. They may also aid in the identification of new clinical therapeutic targets or prognostic markers, which are essential for tailoring personalized treatment strategies for BLCA patients.
Acknowledgments
We thank the organizers of the TCGA and GEO projects, as well as all the participants involved, for making the RBP-related and clinical data used in this study publicly available.
Footnote
Reporting Checklist: The authors have completed the TRIPOD reporting checklist. Available at https://tau.amegroups.com/article/view/10.21037/tau-2024-688/rc
Peer Review File: Available at https://tau.amegroups.com/article/view/10.21037/tau-2024-688/prf
Funding: None.
Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://tau.amegroups.com/article/view/10.21037/tau-2024-688/coif). The authors have no conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments.
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
References
1. Bray F, Laversanne M, Sung H, et al. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin 2024;74:229-63.
2. Compérat E, Varinot J, Moroch J, et al. A practical guide to bladder cancer pathology. Nat Rev Urol 2018;15:143-54.
3. Witjes JA, Bruins HM, Cathomas R, et al. European Association of Urology Guidelines on Muscle-invasive and Metastatic Bladder Cancer: Summary of the 2020 Guidelines. Eur Urol 2021;79:82-104.
4. Krafft U, Tschirdewahn S, Hess J, et al. STIP1 Tissue Expression Is Associated with Survival in Chemotherapy-Treated Bladder Cancer Patients. Pathol Oncol Res 2020;26:1243-9.
5. Burger M, Catto JW, Dalbagni G, et al. Epidemiology and risk factors of urothelial bladder cancer. Eur Urol 2013;63:234-41.
6. Sanli O, Dobruch J, Knowles MA, et al. Bladder cancer. Nat Rev Dis Primers 2017;3:17022.
7. Dyrskjøt L, Ingersoll MA. Biology of nonmuscle-invasive bladder cancer: pathology, genomic implications, and immunology. Curr Opin Urol 2018;28:598-603.
8. Lenis AT, Lec PM, Chamie K, et al. Bladder Cancer: A Review. JAMA 2020;324:1980-91.
9. Leal J, Luengo-Fernandez R, Sullivan R, et al. Economic Burden of Bladder Cancer Across the European Union. Eur Urol 2016;69:438-47.
10. Kural S, Jain G, Agarwal S, et al. Urinary extracellular vesicles-encapsulated miRNA signatures: A new paradigm for urinary bladder cancer diagnosis and classification. Urol Oncol 2024;42:179-90.
11. Zhao Y, Mir C, Garcia-Mayea Y, et al. RNA-binding proteins: Underestimated contributors in tumorigenesis. Semin Cancer Biol 2022;86:431-44.
12. Matia-González AM, Laing EE, Gerber AP. Conserved mRNA-binding proteomes in eukaryotic organisms. Nat Struct Mol Biol 2015;22:1027-33.
13. Pereira B, Billaud M, Almeida R. RNA-Binding Proteins in Cancer: Old Players and New Actors. Trends Cancer 2017;3:506-28.
14. Fierro-Monti I. RBPs: an RNA editor's choice. Front Mol Biosci 2024;11:1454241.
15. Wu Y, Liu Z, Wei X, et al. Identification of the Functions and Prognostic Values of RNA Binding Proteins in Bladder Cancer. Front Genet 2021;12:574196.
16. Neelamraju Y, Gonzalez-Perez A, Bhat-Nakshatri P, et al. Mutational landscape of RNA-binding proteins in human cancers. RNA Biol 2018;15:115-29.
17. Gu L, Chen Y, Li X, et al. Integrated Analysis and Identification of Critical RNA-Binding Proteins in Bladder Cancer. Cancers (Basel) 2022;14:3739.
18. Guo C, Shao T, Jiang X, et al. Comprehensive analysis of the functions and prognostic significance of RNA-binding proteins in bladder urothelial carcinoma. Am J Transl Res 2020;12:7160-73.
19. Terekhanova NV, Karpova A, Liang WW, et al. Epigenetic regulation during cancer transitions across 11 tumour types. Nature 2023;623:432-41.
20. Hanczar B, Bourgeais V, Zehraoui F. Assessment of deep learning and transfer learning for cancer prediction based on gene expression data. BMC Bioinformatics 2022;23:262.
21. Wu Y, Liu Y, He A, et al. Identification of the Six-RNA-Binding Protein Signature for Prognosis Prediction in Bladder Cancer. Front Genet 2020;11:992.
22. Subramanian A, Kuehn H, Gould J, et al. GSEA-P: a desktop application for Gene Set Enrichment Analysis. Bioinformatics 2007;23:3251-3.
23. Thorsson V, Gibbs DL, Brown SD, et al. The immune landscape of cancer. Immunity 2018;48:812-830.e14.
24. Zhao F, Vakhrusheva O, Markowitsch SD, et al. Artesunate Impairs Growth in Cisplatin-Resistant Bladder Cancer Cells by Cell Cycle Arrest, Apoptosis and Autophagy Induction. Cells 2020;9:2643.
25. Roquette R, Painho M, Nunes B. Spatial epidemiology of cancer: a review of data sources, methods and risk factors. Geospat Health 2017;12:504.
26. Chu G, Ji X, Wang Y, et al. Integrated multiomics analysis and machine learning refine molecular subtypes and prognosis for muscle-invasive urothelial cancer. Mol Ther Nucleic Acids 2023;33:110-26.
27. Chen X, Dong X, Li H, et al. RNA-binding proteins signature is a favorable biomarker of prognosis, immunotherapy and chemotherapy response for cervical cancer. Cancer Cell Int 2024;24:80.
28. Hao S, Yang Z, Wang G, et al. Development of prognostic model incorporating a ferroptosis/cuproptosis-related signature and mutational landscape analysis in muscle-invasive bladder cancer. BMC Cancer 2024;24:958.
29. Mohibi S, Chen X, Zhang J. Cancer the'RBP'eutics-RNA-binding proteins as therapeutic targets for cancer. Pharmacol Ther 2019;203:107390.
30. Wang S, Sun Z, Lei Z, et al. RNA-binding proteins and cancer metastasis. Semin Cancer Biol 2022;86:748-68.
31. Nag S, Goswami B, Das Mandal S, et al. Cooperation and competition by RNA-binding proteins in cancer. Semin Cancer Biol 2022;86:286-97.
32. Gerstberger S, Hafner M, Tuschl T. A census of human RNA-binding proteins. Nat Rev Genet 2014;15:829-45.
33. Zhou J, Zhou R, Zhu Y, et al. Investigating the impact of regulatory B cells and regulatory B cell-related genes on bladder cancer progression and immunotherapeutic sensitivity. J Exp Clin Cancer Res 2024;43:101.
34. Wu J, Zhang F, Zheng X, et al. Identification of bladder cancer subtypes and predictive signature for prognosis, immune features, and immunotherapy based on immune checkpoint genes. Sci Rep 2024;14:14431.
35. Xiao H, Huang X, Chen H, et al. Establishment of a SUMO pathway related gene signature for predicting prognosis, chemotherapy response and investigating the role of EGR2 in bladder cancer. J Cancer 2024;15:3841-56.
36. Li XL, Blackford JA, Judge CS, et al. RNase-L-dependent destabilization of interferon-induced mRNAs. A role for the 2-5A system in attenuation of the interferon response. J Biol Chem 2000;275:8880-8.
37. Jiang S, Deng X, Luo M, et al. Pan-cancer analysis identified OAS1 as a potential prognostic biomarker for multiple tumor types. Front Oncol 2023;13:1207081.
38. Gao L, Ren R, Shen J, et al. Values of OAS gene family in the expression signature, immune cell infiltration and prognosis of human bladder cancer. BMC Cancer 2022;22:1016.
39. Liu X, Pan L. Predicating candidate cancer-associated genes in the human signaling network using centrality. Curr Bioinform 2016;11:87-92.
40. Kim HP, Cho GA, Han SW, et al. Novel fusion transcripts in human gastric cancer revealed by transcriptome analysis. Oncogene 2014;33:5434-41.
41. Nacu S, Yuan W, Kan Z, et al. Deep RNA sequencing analysis of readthrough gene fusions in human prostate adenocarcinoma and reference samples. BMC Med Genomics 2011;4:11.
42. Li Z, Yin C, Li B, et al. DUS4L Silencing Suppresses Cell Proliferation and Promotes Apoptosis in Human Lung Adenocarcinoma Cell Line A549. Cancer Manag Res 2020;12:9905-13.
43. Li Z, Zhao PL, Gao X, et al. DUS4L suppresses invasion and metastasis in LUAD via modulation of PI3K/AKT and ERK/MAPK signaling through GRB2. Int Immunopharmacol 2024;142:113043.
44. Liu H, Gu J, Huang Z, et al. Fine particulate matter induces METTL3-mediated m(6)A modification of BIRC5 mRNA in bladder cancer. J Hazard Mater 2022;437:129310.
45. Cui J, Zhu Y, Liu X, et al. Comprehensive analysis of N(6)-methyladenosine regulators with the tumor immune landscape and correlation between the insulin-like growth factor 2 mRNA-binding protein 3 and programmed death ligand 1 in bladder cancer. Cancer Cell Int 2022;22:72.
46. Scott DD, Trahan C, Zindy PJ, et al. Nol12 is a multifunctional RNA binding protein at the nexus of RNA and DNA metabolism. Nucleic Acids Res 2017;45:12509-28.
47. Xiang Y, Zhou S, Hao J, et al. Development and validation of a prognostic model for kidney renal clear cell carcinoma based on RNA binding protein expression. Aging (Albany NY) 2020;12:25356-72.
48. Li M, Liu Z, Wang J, et al. Systematic Analysis Identifies a Specific RNA-Binding Protein-Related Gene Model for Prognostication and Risk-Adjustment in HBV-Related Hepatocellular Carcinoma. Front Genet 2021;12:707305.
49. Huang J, Kang W, Pan S, et al. NOL12 as an Oncogenic Biomarker Promotes Hepatocellular Carcinoma Growth and Metastasis. Oxid Med Cell Longev 2022;2022:6891155.
50. Quan Y, Zhang H, Wang M, et al. Visium spatial transcriptomics reveals intratumor heterogeneity and profiles of Gleason score progression in prostate cancer. iScience 2023;26:108429.
51. Gao L, Meng J, Zhang Y, et al. Development and validation of a six-RNA binding proteins prognostic signature and candidate drugs for prostate cancer. Genomics 2020;112:4980-92.
52. Xing Q, Luan J, Liu S, et al. Six RNA binding proteins (RBPs) related prognostic model predicts overall survival for clear cell renal cell carcinoma and is associated with immune infiltration. Bosn J Basic Med Sci 2022;22:435-52.
53. Wu YQ, Ju CL, Wang BJ, et al. PABPC1L depletion inhibits proliferation and migration via blockage of AKT pathway in human colorectal cancer cells. Oncol Lett 2019;17:3439-45.
54. Mao R, Nie H, Cai D, et al. Inhibition of hepatitis B virus replication by the host zinc finger antiviral protein. PLoS Pathog 2013;9:e1003494.
55. Gao G, Guo X, Goff SP. Inhibition of retroviral RNA production by ZAP, a CCCH-type zinc finger protein. Science 2002;297:1703-6.
56. Hicks DG, Janarthanan BR, Vardarajan R, et al. The expression of TRMT2A, a novel cell cycle regulated protein, identifies a subset of breast cancer patients with HER2 over-expression that are at an increased risk of recurrence. BMC Cancer 2010;10:108.
57. He Z, Sun S, Waqas M, et al. Reduced TRMU expression increases the sensitivity of hair-cell-like HEI-OC-1 cells to neomycin damage in vitro. Sci Rep 2016;6:29621.
58. Yu X, Luo B, Lin J, et al. Alternative splicing event associated with immunological features in bladder cancer. Front Oncol 2022;12:966088.
59. Wang Y, Chen Y, Zhu B, et al. A Novel Nine Apoptosis-Related Genes Signature Predicting Overall Survival for Kidney Renal Clear Cell Carcinoma and its Associations with Immune Infiltration. Front Mol Biosci 2021;8:567730.
60. Nishikawa H, Koyama S. Mechanisms of regulatory T cell infiltration in tumors: implications for innovative immune precision therapies. J Immunother Cancer 2021;9:e002591.
61. Cheng W, Xu B, Zhang H, et al. Lung adenocarcinoma patients with KEAP1 mutation harboring low immune cell infiltration and low activity of immune environment. Thorac Cancer 2021;12:2458-67.
62. Ding Y, Chu L, Cao Q, et al. A meta-validated immune infiltration-related gene model predicts prognosis and immunotherapy sensitivity in HNSCC. BMC Cancer 2023;23:45.
63. Zhang J, Zhao Q, Huang H, et al. Establishment and validation of a novel peroxisome-related gene prognostic risk model in kidney clear cell carcinoma. BMC Urol 2024;24:26.
64. Li SC, Jia ZK, Yang JJ, et al. Telomere-related gene risk model for prognosis and drug treatment efficiency prediction in kidney cancer. Front Immunol 2022;13:975057.
65. Yuan Z, Li Y, Zhang S, et al. Extracellular matrix remodeling in tumor progression and immune escape: from mechanisms to treatments. Mol Cancer 2023;22:48.
66. Yamamoto Y, Kasashima H, Fukui Y, et al. The heterogeneity of cancer-associated fibroblast subpopulations: Their origins, biomarkers, and roles in the tumor microenvironment. Cancer Sci 2023;114:16-24.