Integration of genetic, proteomic, and transcriptomic data identifies therapeutic targets and prognostic biomarkers in bladder cancer
Highlight box
Key findings
• Multi-omics integration reveals causative genes and protein markers in bladder cancer (BC).
• This study explored the prediction and prognosis biomarkers tied to BC risk, and constructed predictive models.
• Drug predictions and molecular docking prioritize targets and small-molecule binders.
• Multi-level cross-omics quantitative trait loci analysis uncovers complex gene regulation in BC.
• Findings aid early diagnosis, precise treatment, and personalized approaches in BC.
What is known and what is new?
• Biomarker discovery for bladder cancer is constrained by biological heterogeneity and inconsistent single-omics findings. Few protein markers have been prospectively evaluated for incident disease, and prior work rarely links genetic instruments to transcripts, proteins, and risk through rigorous colocalization. Clinical models seldom leverage such genetically anchored proteomics.
• This study employs a prospective approach to establish associations between gene regulatory proteins and newly emerging diseases. It validates protein expression quantitative trait loci, expression quantitative trait loci and methylation quantitative trait loci evidence through colocalization analysis and verifies findings at the single-cell level, ultimately identifying actionable targets with computer-simulated druggability.
What is the implication, and what should change now?
• The protein panel may support early detection and prognostic stratification. WFDC1 inhibition and modulation of GSTM4/RHOC warrant experimental testing. The framework is transferable to other cancers but requires external replication, functional validation, and evaluation in non-European populations.
Introduction
Bladder cancer (BC) is a major public health concern globally, ranking among the most common and lethal urologic malignancies. Despite recent advancements in treatment, early diagnosis and precision treatment remain significant challenges (1-3). The predominant pathological type of BC is urothelial carcinoma (UC), characterized by diverse clinical manifestations and prognoses. Based on the clinical stage of the tumor, BC is categorized into non-muscle invasive bladder cancer (NMIBC) and muscle invasive bladder cancer (MIBC). NMIBC exhibits a notably high incidence due to its indolent nature and elevated recurrence rate, comprising approximately 75% of cases. In contrast, MIBC, which constitutes the remaining quarter, typically necessitates curative therapy, often in conjunction with systemic chemotherapy or radiation therapy, and is associated with a poor prognosis (4-6). Despite significant advancements in surgical and chemotherapy interventions, challenges persist in the early diagnosis and precision treatment strategies for BC, particularly in the effective identification of high-risk patients and the implementation of individualized treatment approaches (7).
The occurrence and development of tumors is an exceedingly complex process. Data analysis focused on a single omics layer is often insufficient to fully elucidate the underlying pathogenesis. Recent advances in multi-omics integration have revolutionized cancer research by combining genetic, transcriptomic, and proteomic data, providing a more comprehensive understanding of tumor biology. The integration of multiple omics layers, such as expression quantitative trait loci (eQTLs), methylation quantitative trait loci (mQTLs), and protein quantitative trait loci (pQTLs), allows for a deeper exploration of gene regulation and protein expression at different biological levels. This comprehensive approach enables the identification of novel therapeutic targets and biomarkers, overcoming the limitations of single-omics studies. Importantly, multi-omics integration facilitates the identification of complex molecular interactions and regulatory networks that drive cancer progression, providing new opportunities for personalized treatment strategies (8,9).
High-throughput proteomics, scross-omics quantitative trait loci (xQTL), and transcriptomic analysis have become pivotal tools in elucidating disease mechanisms. By revealing genetic associations among different omics datasets, xQTLs offer new insights into the intricate genetic context of diseases (10,11). The OLINK platform effectively bridges the gap between genomic information and disease phenotypes through the high-throughput measurement of plasma proteins (12). Additionally, transcriptomics uncovers cancer-related molecular mechanisms through a comprehensive analysis of gene expression patterns (13). Collectively, these technologies not only provide innovative perspectives for tumor research but also establish a foundation for the implementation of precision medicine.
Although previous studies have applied multi-omics approaches to BC, many have been limited by the isolated use of individual omics datasets without proper integration or correlation analysis. This study aims to address these gaps by integrating cross-omics data, providing a more holistic view of the molecular landscape of BC and uncovering novel biomarkers and therapeutic targets. Furthermore, existing studies often emphasize the statistical description of clinical features or explore pathogenesis and prognostic factors solely from a genomic perspective, neglecting the critical roles of protein function and gene expression regulation in the progression of BC. Consequently, there is an urgent need for a more integrated analytical approach that synthesizes different omics data to comprehensively elucidate the molecular mechanisms and identify potential therapeutic targets for BC (7,14-16).
This study aims to explore potential molecular targets and therapeutic agents associated with the occurrence and progression of BC to enhance early diagnosis and precision treatment. By integrating xQTLs, OLINK proteomics technology, and transcriptome data, we seek to identify molecular targets related to BC. Specifically, we constructed a plasma protein prediction model linked to BC development using Olink data from UK Biobank cohort. This model aims to facilitate the early diagnosis of BC. Additionally, through an in-depth analysis of pQTLs and eQTLs, we identified four genes closely associated with the occurrence and progression of BC, further validating their critical roles through multi-level xQTLs. By combining drug prediction and molecular docking technologies, we identified three drug targets for BC and discovered potential therapeutic drugs for each target. In this study, we aim to apply a cross-omics approach to identify potential therapeutic targets and prognostic biomarkers in BC. By integrating genetic tools (such as mQTL, eQTL, and pQTL), transcriptomic, and proteomic data, we seek to provide a more comprehensive understanding of BC’s molecular mechanisms, ultimately facilitating the development of precision medicine strategies for improved diagnosis and treatment of BC. We present this article in accordance with the STREGA reporting checklist (available at https://tau.amegroups.com/article/view/10.21037/tau-2025-443/rc).
Methods
xQTLs data sources
Genetic variation in plasma proteins was determined by pooling results from three large-scale proteomic studies [deCODE Genetics study, 4,719 proteins; Atherosclerosis Risk in Communities study (ARIC), 4,657 proteins; UK Biobank Pharma Proteomics Project (UKB-PPP), 2,923 proteins]. It is assumed that cis-pQTL directly controls protein expression levels; the selection of cis-pQTL was based on several criteria: (I) a significant association with proteins at the genome-wide scale (P<5×10−8); (II) single nucleotide polymorphisms (SNPs) situated within the major histocompatibility complex (MHC) region and outside the protein coding area (chr6: 25.5–34.0 Mb); (III) independence of association [linkage disequilibrium (LD) cluster with r2<0.001]; (IV) the maximum cumulative variance accounted for by genetic tools at the protein level; (V) SNPs located within a 1 Mb radius of the transcription start site of the protein. The blood eQTL dataset was sourced from the eQTLGen consortium, including data from 31,684 individuals. mQTLs data are from a study by McRae et al. This work focuses on the analysis of SNP-CpG associations in European blood. This study identified 52,916 cis-mQTLs and 2,025 trans-mQTLs, providing valuable information on the relationship between methylation and genotype (17,18). Figure 1 lists the process of research and information of included studies and consortia is shown in Table S1.
Genome-wide association studies (GWAS) outcome data sources
The methods for MR analysis were utilized to uncover potential therapeutic targets with BC serving as the outcome. Summary data related to BC were obtained from the FinnGen R11 GWAS consortium (https://www.finngen.fi/en/access_results) for the ensuing MR analysis (19). The summary GWAS data concerning BC comprised 2,574 patients with BC and 345,118 controls.
UK Biobank Olink data sources
The UK Biobank represents a cohort study based on the population that includes more than 500,000 participants aged from 40 to 69 years old, who were recruited from 2006 to 2010. The dataset includes population characteristics, biological samples, whole genome genotyping, exome sequencing, and multiple physical and anthropometric measurements (https://biobank.ndph.ox.ac.uk/showcase/). All participants provided informed consent. In this study, the UK Biobank Olink data were derived from the UKB-PPP (https://metabolomips.org/ukbbpgwas/). This project, a collaboration between the UK Biobank and thirteen biopharmaceutical firms, had the goal of analyzing the plasma proteome of 54,219 participants from the UK Biobank and proposed an extensive mapping of pQTLs for 2,923 different proteins (20). This study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. This study was carried out in accordance with UK Biobank application (No. 304841). The North West Multicenter Research Ethics Committee and the National Health and Social Care Information Management Board gave its approval to the UK Biobank project (No. 21/NW/0157).
The Cancer Genome Atlas (TCGA) data sources
TCGA database is a project provided by the American Cancer Research Institute and is currently a widely used bioinformatics database (https://portal.gdc.cancer.gov/). It integrates genetic variation, gene expression, and DNA methylation data of 7,648 patients and 33 types of cancer. The BC-related transcriptomics and methylation data utilized in this study were sourced from the Cancer Genome Atlas Bladder Urothelial Carcinoma (TCGA-BLCA). Transcriptomics data were generated using RNA-Seq technology, and gene expression matrices in HTSeq-FPKM format were downloaded, encompassing 20,000 genes, including 400 BC tissue samples and 50 normal control samples. Methylation data were acquired through the Illumina Infinium HumanMethylation450 BeadChip platform. Data retrieval was conducted via the Genomic Data Commons (GDC) portal (https://portal.gdc.cancer.gov/) in compliance with the TCGA data use policy. Preprocessing of transcriptomics data involved the removal of genes with expression levels below 1 FPKM in more than 50% of the samples, standardization using DESeq2, and correction of batch effects employing the ComBat method. For mQTLs data, CpG sites with methylation values below the detection limit and samples with over 20% missing values were excluded, and standardization was achieved through β-value conversion (21).
Summary-data-based Mendelian Randomization (SMR) analysis
Subsequent analysis using Summary Mendelian Randomization was performed to confirm the causal link between the proteins and BC. For cases involving more than three SNPs, the Heterogeneity in Dependent Instruments (HEIDI) test was utilized to determine whether the causal proteins resulted from shared genetic variation instead of genetic linkage (22). Both SMR analysis and HEIDI testing were executed through the use of SMR software (Version: 1.3.1). A threshold for significance was established at P<0.05 for the SMR analysis. A P value >0.01 in the HEIDI test suggested that the causal relationship between the exposure and outcome was unaffected by linkage disequilibrium.
Bayesian colocalization analysis
To examine whether the identified causal proteins exhibited overlapping causal variants of BC within the genomic region, thereby excluding the influence of linkage disequilibrium, a Bayesian colocalization analysis was conducted using the coloc package in R. This analysis was structured around five primary hypotheses: (I) H0: there is no causal variant for exposure or outcome within the genomic region; (II) H1: a singular causal variant exhibits a significant association with exposure only; (III) H2: a singular causal variant shows a significant association with outcome only; (IV) H3: causal variants exist that are significantly associated with either exposure or outcome within the genomic region, but are driven by distinct causal variants; (V) H4: the same causal variant is responsible for both the exposure and outcome (23). In this study, PH4 >0.5 was interpreted as strong evidence supporting colocalization.
Drug prediction and molecular docking
Drug Signatures Database (DSigDB, http://tanlab.ucdenver.edu/DSigDB) is a database containing a large number of molecular structures of drugs and compounds. This database can be used to predict drugs and small molecules associated with target genes, and to know whether some drugs and small molecules have inhibitory or promoting effects on their target genes. In this study, we use the DSigDB through the Enrichr online platform (https://maayanlab.cloud/Enrichr/) to predict potential drugs for WFDC1, RHOC and GSTM4 (24,25). After identifying the drugs that can interact with the target protein, we perform molecular docking to understand how the receptor and ligand bind to each other, and to evaluate the binding affinity of these interactions to screen for high-affinity ligands that bind to the target protein. We obtained the structures of the proteins and ligands. The protein structures of GSTM4 and RHOC were retrieved from the Protein Data Bank (PDB, https://www.rcsb.org/), while the structure of WFDC1 was obtained from the AlphaFold Database (AlphaFoldDB, https://alphafold.com/) (26,27). The structural information for all drug ligands utilized in this study was obtained from the PubChem database, which is accessible at https://pubchem.ncbi.nlm.nih.gov/ (28). The related IDs are presented in Table 1. The docking process was performed using AutoDock4.2.6 (https://autodock.scripps.edu/), a suite of automated docking tools.
Table 1
| Target | ID | Drug | PubChem ID | P value | Binding energy | Effect of action |
|---|---|---|---|---|---|---|
| GSTM4 | 4GTU (PDB) | 0175029-0000 | 18375167 | 0.02 | −4.67 | Up |
| 4GTU (PDB) | Camptothecin | 24360 | 0.02 | −6.18 | Up | |
| 4GTU (PDB) | Vorinostat | 5311 | 0.04 | −3.81 | Up | |
| RHOC | 2GCN (PDB) | Podophyllotoxin | 10607 | 0.02 | −5.79 | Up |
| 2GCN (PDB) | PNU-0293363 | 22176864 | 0.01 | −5.9 | Up | |
| 2GCN (PDB) | Mebendazole | 4030 | 0.02 | −6.29 | Up | |
| 2GCN (PDB) | Lycorine | 72378 | 0.04 | −6.93 | Up | |
| 2GCN (PDB) | Anisomycin | 253602 | 0.057 | −3.99 | Up | |
| WFDC1 | Q9HC57 (AlphaFoldDB) | Temozolomide | 5394 | 0.01 | −4.09 | Unknown |
| Q9HC57 (AlphaFoldDB) | Camptothecin | 24360 | 0.02 | −2.93 | Unknown |
The lower the Binding Energy, the better the binding effect and the higher the affinity. AlphaFoldDB, AlphaFold Database; DSigDB, Drug Signatures Database; Docking predicts affinity only; PDB, Protein Data Bank.
Single-cell RNA sequencing (scRNA-seq) data processing, cell type identification, and pseudo-time analysis
We calculated the proportions of mitochondrial and red blood cell content using the Percentage FeatureSet function. Then, we filtered cells based on the following criteria: (I) each cell’s unique molecular identifiers (UMI) were greater than 1,000, and we removed the top 3% largest cells; (II) excluded cells with <300 or >7,000 measured genes; (III) excluded cells with mitochondrial contamination >10% or red blood cell contamination greater than 3%. A total of 65,523 cells were obtained. Subsequently, we normalized the data from the 9 samples using the Normalize Data function (29). We identified highly variable genes using the Find Variable Features function, and additionally, we scaled all genes to eliminate the influence of mitochondrial genes using the ScaleData function and performed principal component analysis (PCA) dimensionality reduction to find anchors. Finally, we integrated batch correction using the harmony method. We selected dim =20 and clustered the cells using the FindNeighbors and FindClusters functions (resolution =0.2), resulting in 12 clusters (30). Furthermore, we downloaded human cell marker genes from the CellMarker website (http://biocc.hrbmu.edu.cn/CellMarker/) and selected a list of cell marker genes related to bladder tissue. The FindCluster function was used for cell clustering. To characterize the potential processes of gene expression changes and determine the potential differentiation between different cells, pseudo-time analysis of cellular evolution was performed using the “Monocle” software package (31).
Statistical analysis
Correlation analysis used Spearman based on packages of “corrplot”. Nonlinearity test was investigated using restricted cubic spline models based on package of “rms”. Cox regression was employed to examine the relationship between proteins and BC. Package of “survival” was used construct Kaplan-Meier curves. Explainable machine learning in explaining Cox and random Forest was used package of “Survex” (32). All statistical analyses were carried out using R software (Version 4.0.2).
Results
Identifying causal plasm-pQTLs associated with BC risk
In order to identify the causal effect of circulating plasm protein contributed to BC risk, we selected three large plasma protein sequencing projects as exposure variables among European and American population, including UKB-PPP, DECODE and ARIC. As shown in (Figure 2A-2C), we have identified 102 plasm-pQTLs in UKB-PPP sequencing cohort, 97 plasm-pQTLs in DECODE sequencing cohort and 81 plasm-pQTLs in ARIC sequencing cohort using SMR. Then, we intersected the results, 17 of plasm-pQTLs were significant in the three sequencing cohorts (Figure 2D). We further mapped the Manhattan map showing the distribution of genes for plasma proteins across chromosomes (Figure 2E-2G). After HEIDI test, we excluded SCGB3A1 due to HEIDI value <0.01 in the ARIC sequencing cohort (Figure 2H). The results of the three sequencing data are presented in tables online https://cdn.amegroups.cn/static/public/tau-2025-443-1.zip.
Risk proteins validation and prediction models construction
In order to validate the association of selected proteins and BC risk, we used the UK Biobank Olink proteomics cohort to conduct further analysis. Firstly, we conducted Spearman correlation analysis of protein expression in the cohort as shown in the Figure 3A. We found that 16 proteins had no significant correlation and were relatively independent. Next, we used cox regression models to analyze the relationship between 16 protein and BC risk in a prospective cohort (Figure 3B). 7 proteins were found to be associated with BC, including MFGE8 [hazard ratio (HR): 1.43, 95% confidence interval (CI): 1.19–1.71], ROR1 (HR: 1.35, 95% CI: 1.04–1.75), CTSS (HR: 1.63, 95% CI: 1.07–2.46), CLEC4G (HR: 1.33, 95% CI: 1.03–1.70), PLAT (HR: 1.53, 95% CI: 1.32–1.76), CD59 (HR: 2.27, 95% CI: 1.72–3.00) and TNFRSF19 (HR: 1.82, 95% CI: 1.46–2.26), which indicated that the risk of BC increases with increased protein expression. RSC curves were conducted to identify the linear relationship between exposure and outcome (Figure 3C). Results showed that ROR1, CLEC4G, PLAT and TNFRSF19 had linear relationship with BC (Pnon-linear >0.05). MFGE8 and CD59 had non-linear relationship with BC (Pnon-linear <0.05). RSC curves showed CTSS only has an upward trend, and its HR value is not greater than 1. We further mapped Kaplan-Meier survival to identify the association between high and low risk groups and BC (Figure 3D) and exclude ROR1 due to no statistical difference. Based on the 5 proteins screened above, we constructed cox model and RSF model. Next, we performed a visual analysis of the two prediction models. We visualized the permutation-based feature importance including Brier score and area under curve (AUC) (Figure 3E,3F). Results showed that CD59 had relative higher importance. We further used SurvSHAP(t) to explain the predictions for specific individuals which showed the survival probability prediction of a single sample at different time points is affected by each feature (Figure 3G). Finally, we evaluated the performance of two models. Figure 3H showed the accuracy of the model in predicting prediction time, usually including indicators such as Concordance Index or Brier Score. Figure 3I provides a simplified representation of performance metrics, which include things like mean prediction error.
Investigation on drug target, drug prediction and molecular docking
In order to predict the potential drug target, we conducted the colocalization analysis between the 16 protein pQTLs and GWAS of BC in three plasma protein sequencing projects. Colocalization analysis was shown in the Figure 4A-4C as stacked column chart. We found WFDC1, GSTM4 and RHOC had higher PH4 value (>0.5) which indicate phenotype 1 (GWAS) and phenotype 2 (pQTL) are significantly associated with SNP sites in a genomic region and are driven by the same causal mutation site. GAWS colocalization visualization results are listed in the Figure 4D-4F. These three proteins are regarded as potential drug targets. We obtained the proteins 3D construction listed in the Table S2. From SMR results, we found WFDC1 was positive associated with BC, meanwhile GSTM4 and RHOC were negative associated with BC (Figure 4H,4I). Immunohistochemistry results from HPA showed that WFDC1 was high expression in BC while GSTM4 and RHOC were not detect expression in BC (Figure 4J). Thus, we aimed to explore the WFDC1 inhibitor, GSTM4 and RHOC agonist as potential drug. Through the DSigDB database, we predict the effective intervention drugs of GSTM4, RHOC, WFDC1. And select drugs based on their functional interaction with the target protein. Three drugs that promote GSTM4 were selected: 0175029-0000, camptothecin, and vorinostat; four drugs that promote RHOC were selected: podophyllotoxin, PUN-0293363, mebendazole, and lycorine; because the DSigDB database does not specify whether the drugs promote or inhibit WFDC1, two drugs with the smallest P values (less than 0.05) were selected: Temozolomide and Carmustine. Next, we import the structure files of the target proteins and predicted drugs into AutoDock 4.2.6 for docking analysis to obtain the docked complex of each drug-protein pair, and further evaluate the binding affinity of the candidate drugs to their respective targets. The docking results of ten protein-drug complexes are shown in Figure 5 and Table 1. The drug candidates are all linked to their target proteins by visible hydrogen bonds. The binding energies for all docking results are displayed in Table 1. The smaller the binding energy, the more stable the binding of the drug and protein. Among the drugs for GSTM4, camptothecin exhibited the most stable binding, with a binding energy of −6.18 kcal/mol. Among the drugs for RHOC, lycorine showed the most stable binding, with a binding energy of −6.93 kcal/mol. Notably, WFDC1 was considered a target for inhibition, whereas GSTM4 and RHOC were considered for activation based on risk directionality, docking evaluates binding only and does not imply agonism/antagonism.
Causal relationship of plasm-eQTLs and BC risk
To further explore the causal relationship between gene expression and BC risk, we to analyzed eQTLs from eQTLGene and GWAS of BC using SMR (https://cdn.amegroups.cn/static/public/tau-2025-443-2.csv). We identified a total of 782 genes significantly associated with BC, of which 402 were positive association and 380 were negative association (Figure 6A). We intersected the 16 screened proteins with the meaningful eQTLs and found four significant genes, including MFGE8, CTSS, TNFRSF19 and IL1RAP (Figure 6B). These four genes pass the HEIDI test (P>0.01) (Figure 6C). We also mapped SMR effect plot and SMR locus plot as shown in the Figure 6D,6E which show eQTLs of MFGE8, TNFRSF19 and IL1RAP were positive associated with BC while CTSS was negative association. The results of eQTLs showed the same trend as pQTLs. Then, we used TCGA-BLCA cohort to explore these 4 gene expression impact on prognosis using Kaplan-Meier curves. We found higher expression of MFGE8 (HR: 1.67, 95% CI: 1.21–2.31), TNFRSF19 (HR: 1.88, 95% CI: 1.25–2.82) and IL1RAP (HR: 1.50, 95% CI: 1.09–2.05) had worse prognosis and higher expression of CTSS (HR: 1.88, 95% CI: 1.25–2.82) had better prognosis (Figure 6F). Thus, MFGE8, CTSS, TNFRSF19 and IL1RAP were identified as risk factors as well as prognosis predictors.
Single-cell RNA-seq analysis, clustering and pseudo-time series analysis
After reading the scRNA-seq dataset GSE222315 from the GEO (Gene Expression Omnibus) database, to obtain high-quality RNA expression data from single cells, a preliminary quality control assessment was conducted using the following filtering criteria. A total of 65,523 high-quality cells were isolated from the BC tissues of 9 patients for further analysis. After batch correction and integration using Harmony, we utilized PCA for initial dimensionality reduction and clustering, followed by the application of Uniform Manifold Approximation and Projection (UMAP) and t-distributed stochastic neighbor embedding (tSNE) algorithms on the top 30 principal components to visualize the high-dimensional scRNA-seq data (Figure 7A). We successfully divided the cells into 12 clusters using the UMAP algorithm and further annotated them into seven recognized cell types (T cells, B cells, monocyte cells, mast cells, fibroblasts, endothelial cells, epithelial cells) based on corresponding cell markers (Figure 7B). It can be observed that the four genes “MFGE8”, “CTSS”, “TNFRSF19”, and “IL1RAP” are widely expressed in various cells of BC. To reveal the developmental trajectories of cellular states during the progression of BC, we performed pseudo-time and trajectory analysis using the “monocle” package (v 2.24.0). We can see that BC cells start from the lower left corner, progress through a branching point, and move towards two different end outcomes, with the entire process being divided into three distinct cellular states (Figure 7D). Fibroblasts and endothelial cells are mainly in the early developmental stages. Monocyte cells and epithelial cells are in the late developmental stages of different end outcomes (Figure 7E). We further analyzed the expression of the four genes along the developmental trajectory and observed that “MFGE8” expression decreases with tumor progression, while “CTSS” expression gradually increases with tumor progression, and “TNFRSF19” and “IL1RAP” show no significant expression changes along the developmental trajectory. In different cellular states, “MFGE8” is further validated to be highly expressed in the early state1, and “CTSS” expression gradually increases from state1 to state3 (Figure 7F). In different cell types, “MFGE8” and “TNFRSF19” tend to be expressed in Fibroblast cells, “CTSS” tends to be expressed in epithelial cells, T cells, and Monocyte cells, and “IL1RAP” shows no significant expression differences among different cells (Figure 7G).
Genetical analysis of mQTLs and gene methylation cohort analysis using TCGA
For a deeper understanding of gene methylation and BC risk, we used SMR to analyze the mQTLs of 4 selected gene and BC risk (Figure 8A and tables available online: https://cdn.amegroups.cn/static/public/tau-2025-443-3.zip). SMR results showed that CTSS had two methylation sites associated with BC, including cg26891210 [odds ratio (OR): 0.83, 95% CI: 0.70–0.97] and cg18892169 (OR: 1.39, 95% CI: 1.12–1.66) which had reversed effect. MFGE8 had two methylation sites negative associated with BC, including cg08896420 (OR: 0.83, 95% CI: 0.71–0.95) and cg26213453 (OR: 0.92, 95% CI: 0.86–0.98). TNFRSF19 had six methylation sites negative associated with BC, including cg24350296 (OR: 0.87, 95% CI: 0.77–0.97), cg20334079 (OR: 0.84, 95% CI: 0.72–0.96), cg22125926 (OR: 0.81, 95% CI: 0.67–0.96), cg07720188 (OR: 0.85, 95% CI: 0.74–0.97), cg00504595 (OR: 0.83, 95% CI: 0.66–1.00) and cg12217560 (OR: 0.77, 95% CI: 0.53–1.00). Then, we used TCGA cohort to explore the methylation sites prognosis value. We found CTSS of cg18892169 was a poor prognostic index and MFGE8 of cg08896420, cg26213453 methylation had better prognosis which consistent with risk indicators. However, TNFRSF19 methylation sites had reversed effect on morbidity and prognosis, including cg00504595, cg12217560, cg22125926 and cg24350296 which was adverse prognostic factors (Figure 8B-8H). Furthermore, cg08896420 had higher colocalization PPH4 value with higher level of evidence compared other methylation sites (Figure 8I). Figure 8J showed regional plot of colocalization evidence of cg08896420.
Interactive analysis of xQTLs and multi-omics sequencing
Finally, we aim to explore the pathogenesis and prognosis of BC through multi-omics interaction. Through TCGA cohort, we found cg18892169 was negative associated with CTSS expression (r=−0.523) (Figure 9A). cg08896420 (r=−0.128) and cg26213453 (r=−0.172) were also negative associated with MFGE8 expression (Figure 9B). These results were consistent with GWAS results. For TNFRSF19, we found cg07720188 (r=0.149), cg12217560 (r=0.220) and cg00504595 (r=0.157) were positive associated with TNFRSF19 expression (Figure 9C). However, GWAS results showed that mQTLs of TNFRSF19 were negative associated with eQTLs of TNFRSF19 which indicated that there are differences in tumorigenesis and prognosis of BC. To explain the opposite direction of effect between TNFRSF19 mQTLs and eQTLs, we noted that the xQTL layers used in this study are derived from different biological compartments (blood mQTL/eQTL, plasma pQTLs, and tumor tissue expression/prognosis). Differences across tissues and disease states can lead to inconsistencies in the direction of instrument variables. In addition, the location of the CpG site is critical: promoter methylation typically inhibits transcription, while genosome/enhancer methylation may be positively correlated with gene expression. In TCGA, the three CpGs we examined (cg07720188, cg12217560, cg00504595) were positively correlated with methylation-expression of TNFRSF19, which is consistent with non-promoter regulation. Allelic heterogeneity with linkage imbalance may also cause mQTL to be labeled with eQTL into different causal variations, so opposite SMR orientation does not necessarily mean biological contradictions. Finally, we showed multi-omics concordance and effect directions for key genes in Table 2 and mapped the xQTLs interactive network as shown in the Figure 9D.
Table 2
| Gene | pQTL | eQTL | mQTL | Cross-omics concordance |
|---|---|---|---|---|
| MFGE8 | Positive (+) | Positive (+) | Negative (−) | pQTL & eQTL concordant; mQTL discordant |
| CTSS | Negative (−) | Negative (−) | Mixed† | pQTL & eQTL concordant; mQTL discordant |
| TNFRSF19 | Positive (+) | Positive (+) | Negative (−) | pQTL & eQTL concordant; mQTL discordant |
| IL1RAP | Positive (+) | Positive (+) | NA | pQTL & eQTL concordant; mQTL NA |
†, cg26891210 negative (−), cg18892169 positive (+). eQTL, expression quantitative trait loci; mQTL, methylation quantitative trait loci; NA, not applicable; pQTL, protein quantitative trait loci.
Discussion
In the study of drug targets, we identified three potential intervention targets for BC: WFDC1, GSTM4, and RHOC. The results from SMR analysis and protein expression data in BC tissues indicated that high expression levels of WFDC1 were associated with an increased risk of BC, whereas the expression of GSTM4 and RHOC correlated with a decreased risk. Utilizing the DSigDB database and molecular docking analysis, we identified nine drugs and small molecules that could effectively dock with these intervention targets. Notably, the docking results of lycorine with RHOC demonstrated the highest binding affinity, but docking does not determine whether lycorine activates or inhibits RHOC. Previous studies have shown that lycorine inhibits the progression of BC by down-regulating NEDD4 gene expression and inducing apoptosis in BC cells through the inhibition of the Akt pathway and the activation of the intrinsic apoptosis signaling pathway (33,34). We therefore regard lycorine as a docking-prioritized binder that warrants orthogonal biochemical and cellular validation (target engagement, mechanism, and efficacy). In our study, we found that lycorine inhibited BC by promoting RHOC expression. This finding offers a novel mechanism of action, providing insights into how lycorine may inhibit the development of BC and establishing a foundation for further experimental validation and clinical translational studies. And pharmacologic activation/inhibition requires orthogonal biochemical and cellular validation given context-specific biology.
Through the analysis of systemic xQTLs, CTSS and TNFRSF19 exhibited a complex regulatory pattern during the development of BC. TNFRSF19, also known as TROY, is a member of the tumor necrosis factor receptor superfamily (TNFRSF), which comprises 29 receptors with similar structural features (35). Previous studies have demonstrated that TNFRSF19 is involved in the regulation of tumors in the digestive system, including lung, liver, and colorectal cancers. Its function varies across different cancer types (36-38). Yet its role in BC remains inadequately explored. In our study, the expression of TNFRSF19 was positively correlated with BC, as evidenced by pQTLs and mQTLs analyses. However, findings from SMR analysis at the mQTL level indicated that all six of its methylation sites were negatively correlated with BC, suggesting that the methylation of TNFRSF19 may inhibit its cancer-promoting effects. Research has shown that methylation typically occurs at CpG islands, a molecular process that targets CpG dinucleotides. Most genes possess unmethylated CpGs in their 5’-regulatory region, with approximately 29,000 CpGs identified in the human genome (8,9). Current studies indicate that DNA methylation plays a significant role in the occurrence and progression of BC, with certain methylated sites serving as potential biomarkers for diagnosis and prognosis (39,40). Consequently, investigating methylation is essential for understanding the mechanisms underlying BC. In normal cells, a substantial portion of the genome is methylated, which helps maintain genomic stability and reduces the likelihood of mutations and chromosomal instability. In contrast, cancer cells exhibit a decrease in overall methylation levels, accompanied by the demethylation of repetitive sequences. This alteration leads to genomic instability, chromosomal rearrangements, and breaks, ultimately triggering the activation of aberrant genes and increasing the mutation rate, thereby promoting tumor development (41). In our study, all six methylation sites of TNFRSF19 were found to inhibit its pro-tumorigenic effects, potentially due to the suppression of its activation and mutation levels through methylation.
The regulation of CTSS exhibits a more complex characterization. Research indicates that CTSS contributes to tumorigenesis and progression by promoting tumor angiogenesis, extracellular matrix degradation, and genomic alterations (42). Furthermore, within the BC microenvironment, CTSS enhances the survival of BC patients by facilitating macrophage M1 polarization. This finding aligns with our single-cell sequencing results, which show that CTSS expression gradually increases during tumor progression and is predominantly expressed in monocytes (43). Additionally, CTSS was found to be significantly and positively correlated with the infiltration levels of CD4+ and CD8+ T cells in BC. Consequently, high CTSS expression is associated with improved prognosis for BC patients. Collectively, these studies support the notion that elevated CTSS expression enhances the prognosis of BC patients (44). These results are consistent with the pQTLs and eQTLs findings in our study. However, the mQTLs results indicate that different methylation sites of CTSS exert opposing regulatory effects on tumors. Baylin et al. demonstrated that in tumors, 5-10% of normally unmethylated CpG islands become abnormally hypermethylated (41). However, for oncogenes, their promoters tend to be unmethylated or hypomethylated, which helps maintain the inhibition of aberrant cell proliferation, DNA repair, and apoptosis. When the promoters of oncogenes become silenced through methylation, tumor invasion, metastasis, and angiogenesis are promoted (45). In our study, the methylation site cg18892169 is located at the CpG position of the oncogene CTSS at chr1:150,765,286. We further examined this coordinate in the UCSC Genome Browser (https://genome.ucsc.edu/) (46). The results for Candidate Cis-Regulatory Elements (cCREs) and H3K27Ac mark suggest that this locus may reside within an active cis-acting regulatory element of the CTSS gene. Additionally, the mQTLs results indicated that cg18892169 has a promoting effect on BC. In summary, we hypothesize that the same gene can exhibit different regulatory effects on BC, potentially due to the methylation site being situated in various functional compartments of the gene. Further basic research and validation are necessary to support this hypothesis. With the adoption of enfortumab vedotin (EV) combined with pembrolizumab as a first-line regimen, the treatment landscape for advanced BC (mainly UC) has recently changed (47,48). However, most advanced or metastatic patients are still incurable, and drug resistance and limited long-term survival are still common. These realities highlight the need for mechanism-based informatized targets and biomarkers to guide patient stratification, develop rational combination therapies, and mitigate drug resistance strategies. Our integrated multi-omics results fill this gap by placing WFDC1 as a priority inhibitory target and GSTM4 with RHOC as a priority upregulation target, providing a testable hypothesis for combination therapy with EV+ pembrolizumab or a post-line treatment regimen.
There are several limitations in this study. First, the database utilized in this research was composed primarily of individuals of European ancestry, necessitating caution when extrapolating the findings to other ethnic groups. Second, BC is a highly heterogeneous solid tumor, characterized by variations in pathogenesis across different pathological types and significant discrepancies in treatment modalities. Thirdly, our multi-omics and docking findings are hypothesis-generating, and their functional importance remains to be established. These findings require in-vitro and in-vivo preclinical studies to validate mechanism, on-target activity, and efficacy.
Furthermore, investigations that focus on the underlying molecular mechanisms and the functions of drugs will facilitate the translation of these findings into clinically applicable diagnostic and therapeutic options. Lastly, our tissue-level datasets predominantly represent muscle-invasive, high-grade UC. Accordingly, the mechanistic inferences and target/biomarker signals reported here should be interpreted as primarily applicable to MIBC. Extrapolation to non–muscle-invasive disease (NMIBC; Ta/T1/CIS) or variant histologies is uncertain and requires stage-specific validation.
Conclusions
Based on a comprehensive analysis of xQTLs, OLINK, and transcriptomics data, our multi-omics findings suggest that WFDC1, RHOC, and GSTM4 emerge as genetically and transcriptomically prioritized targets and require experimental validation in vitro and in vivo for any therapeutic inference. Additionally, MFGE8, CTSS, TNFRSF19, and IL1RAP have been identified as disease risk factors and prognostic predictors. Notably, CTSS and TNFRSF19 exhibit complex regulatory patterns at various stages of bladder carcinogenesis and prognosis.
Acknowledgments
None.
Footnote
Reporting Checklist: The authors have completed the STREGA reporting checklist. Available at https://tau.amegroups.com/article/view/10.21037/tau-2025-443/rc
Peer Review File: Available at https://tau.amegroups.com/article/view/10.21037/tau-2025-443/prf
Funding: None.
Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://tau.amegroups.com/article/view/10.21037/tau-2025-443/coif). The authors have no conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. This study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments.
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
References
- Siegel RL, Kratzer TB, Giaquinto AN, et al. Cancer statistics, 2025. CA Cancer J Clin 2025;75:10-45. [Crossref] [PubMed]
- Dyrskjøt L, Hansel DE, Efstathiou JA, et al. Bladder cancer. Nat Rev Dis Primers 2023;9:58. [Crossref] [PubMed]
- Saginala K, Barsouk A, Aluru JS, et al. Epidemiology of Bladder Cancer. Med Sci (Basel) 2020;8:15. [Crossref] [PubMed]
- Catto JWF, Gordon K, Collinson M, et al. Radical Cystectomy Against Intravesical BCG for High-Risk High-Grade Nonmuscle Invasive Bladder Cancer: Results From the Randomized Controlled BRAVO-Feasibility Study. J Clin Oncol 2021;39:202-14. [Crossref] [PubMed]
- Witjes JA, Bruins HM, Cathomas R, et al. European Association of Urology Guidelines on Muscle-invasive and Metastatic Bladder Cancer: Summary of the 2020 Guidelines. Eur Urol 2021;79:82-104. [Crossref] [PubMed]
- Jubber I, Ong S, Bukavina L, et al. Epidemiology of Bladder Cancer in 2023: A Systematic Review of Risk Factors. Eur Urol 2023;84:176-90. [Crossref] [PubMed]
- Lopez-Beltran A, Cookson MS, Guercio BJ, et al. Advances in diagnosis and treatment of bladder cancer. BMJ 2024;384:e076743. [Crossref] [PubMed]
- Zhang SL, Cheng LS, Zhang ZY, et al. Untangling determinants of gut microbiota and tumor immunologic status through a multi-omics approach in colorectal cancer. Pharmacol Res 2023;188:106633. [Crossref] [PubMed]
- Gong J, Qin Z, Xiao Y, et al. Multi-Omics Analysis and Validation of Cell Senescence-Related Genes Associated with Non-Alcoholic Fatty Liver Disease. J Inflamm Res 2025;18:8821-33. [Crossref] [PubMed]
- Zhang Z, Fang T, Chen L, et al. Multi-omics Mendelian randomization integrating GWAS, eQTL, and mQTL data identified genes associated with breast cancer. Am J Cancer Res 2024;14:1433-45. [Crossref] [PubMed]
- Wu W, Kaicen W, Bian X, et al. Akkermansia muciniphila alleviates high-fat-diet-related metabolic-associated fatty liver disease by modulating gut microbiota and bile acids. Microb Biotechnol 2023;16:1924-39. [Crossref] [PubMed]
- Eldjarn GH, Ferkingstad E, Lund SH, et al. Large-scale plasma proteomics comparisons through genetics and disease associations. Nature 2023;622:348-58. [Crossref] [PubMed]
- Wang X, Almet AA, Nie Q. The promising application of cell-cell interaction analysis in cancer from single-cell and spatial transcriptomics. Semin Cancer Biol 2023;95:42-51. [Crossref] [PubMed]
- Wei C, Deng C, Dong R, et al. Multi-omics analysis reveals critical metabolic regulators in bladder cancer. Int Urol Nephrol 2024;56:923-34. [Crossref] [PubMed]
- Yang C, Ou Y, Zhou Q, et al. Methionine orchestrates the metabolism vulnerability in cisplatin resistant bladder cancer microenvironment. Cell Death Dis 2023;14:525. [Crossref] [PubMed]
- Yao Z, Xu N, Shang G, et al. Proteogenomics of different urothelial bladder cancer stages reveals distinct molecular features for papillary cancer and carcinoma in situ. Nat Commun 2023;14:5670. [Crossref] [PubMed]
- Jia Y, Lin F, Li R, et al. Multiomics genetic insights into potential molecular targets for intracranial aneurysm. Stroke Vasc Neurol 2025;svn. [Crossref] [PubMed]
- Bai Y, Wang J, Feng X, et al. Identification of drug targets for Sjögren’s syndrome: multi-omics Mendelian randomization and colocalization analyses. Front Immunol 2024;15:1419363. [Crossref] [PubMed]
- Qin S, Jiang Y, Ou Y, et al. Mendelian randomization of circulating proteome identifies IFN-γ as a druggable target in aplastic anemia. Ann Hematol 2024;103:2245-56. [Crossref] [PubMed]
- Zhang X, Zhao L, Ngo LH, et al. Prediagnostic plasma proteomics profile for hepatocellular carcinoma. J Natl Cancer Inst 2024;116:1343-55. [Crossref] [PubMed]
- Jeon S, Park C, Kim J, et al. Comparing variants related to chronic diseases from genome-wide association study (GWAS) and the cancer genome atlas (TCGA). BMC Med Genomics 2023;16:332. [Crossref] [PubMed]
- Su Z, Wan Q. Potential therapeutic targets for membranous nephropathy: proteome-wide Mendelian randomization and colocalization analysis. Front Immunol 2024;15:1342912. [Crossref] [PubMed]
- Yang H, Shi P, Li M, et al. Mendelian-randomization study reveals causal relationships between nitrogen dioxide and gut microbiota. Ecotoxicol Environ Saf 2023;267:115660. [Crossref] [PubMed]
- Liu S, Cho M, Huang YN, et al. Multi-omics analysis for identifying cell-type-specific and bulk-level druggable targets in Alzheimer’s disease. J Transl Med 2025;23:788. [Crossref] [PubMed]
- Xie M, Li Z, Li X, et al. Identifying crucial biomarkers in peripheral blood of schizophrenia and screening therapeutic agents by comprehensive bioinformatics analysis. J Psychiatr Res 2022;152:86-96. [Crossref] [PubMed]
- Jumper J, Evans R, Pritzel A, et al. Highly accurate protein structure prediction with AlphaFold. Nature 2021;596:583-9. [Crossref] [PubMed]
- Cao Y, Yang Y, Hu Q, et al. Identification of potential drug targets for rheumatoid arthritis from genetic insights: a Mendelian randomization study. J Transl Med 2023;21:616. [Crossref] [PubMed]
- Kim S, Chen J, Cheng T, et al. PubChem 2025 update. Nucleic Acids Res 2025;53:D1516-25. [Crossref] [PubMed]
- Stuart T, Butler A, Hoffman P, et al. Comprehensive Integration of Single-Cell Data. Cell 2019;177:1888-1902.e21. [Crossref] [PubMed]
- Zhang H, Yang Y, Cao Y, et al. IPF-related new macrophage subpopulations and diagnostic biomarker identification - combine machine learning with single-cell analysis. Respir Res 2024;25:241. [Crossref] [PubMed]
- Liu Y, Xu B, Fan C. Single-Cell RNA Sequencing and Microarray Analysis Reveal the Role of Lipid-Metabolism-Related Genes and Cellular Immune Infiltration in Pre-Eclampsia and Identify Novel Biomarkers for Pre-Eclampsia. Biomedicines 2023;11:2328. [Crossref] [PubMed]
- Liu RJ, Guo XG, Zhao CF, et al. An Ecological Survey of Chiggers (Acariformes: Trombiculidae) Associated with Small Mammals in an Epidemic Focus of Scrub Typhus on the China-Myanmar Border in Southwest China. Insects 2024;15:812. [Crossref] [PubMed]
- Yang C, Xiang H, Fu K, et al. Lycorine suppresses cell growth and invasion via down-regulation of NEDD4 ligase in bladder cancer. Am J Cancer Res 2022;12:4708-20.
- Wang C, Wang Q, Li X, et al. Lycorine induces apoptosis of bladder cancer T24 cells by inhibiting phospho-Akt and activating the intrinsic apoptotic cascade. Biochem Biophys Res Commun 2017;483:197-202. [Crossref] [PubMed]
- Liu S, Tian Y, Liu C, et al. TNFRSF19 promotes endoplasmic reticulum stress-induced paraptosis via the activation of the MAPK pathway in triple-negative breast cancer cells. Cancer Gene Ther 2024;31:217-27. [Crossref] [PubMed]
- Wong AM, Ding X, Wong AM, et al. Unique molecular characteristics of NAFLD-associated liver cancer accentuate β-catenin/TNFRSF19-mediated immune evasion. J Hepatol 2022;77:410-23. [Crossref] [PubMed]
- Zuo X, Wang X, Ma T, et al. TNFRSF19 within the 13q12.12 Risk Locus Functions as a Lung Cancer Suppressor by Binding Wnt3a to Inhibit Wnt/β-Catenin Signaling. Mol Cancer Res 2024;22:227-39. [Crossref] [PubMed]
- Shao L, Zuo X, Yang Y, et al. The inherited variations of a p53-responsive enhancer in 13q12.12 confer lung cancer risk by attenuating TNFRSF19 expression. Genome Biol 2019;20:103. [Crossref] [PubMed]
- Liu X, Ding G, Liu Y, et al. Epigenetic regulation of bladder cancer in the context of aging. Front Pharmacol 2025;16:1617452. [Crossref] [PubMed]
- Larsen LK, Lind GE, Guldberg P, et al. DNA-Methylation-Based Detection of Urological Cancer in Urine: Overview of Biomarkers and Considerations on Biomarker Design, Source of DNA, and Detection Technologies. Int J Mol Sci 2019;20:2657. [Crossref] [PubMed]
- Gilyazova I, Enikeeva K, Rafikova G, et al. Epigenetic and Immunological Features of Bladder Cancer. Int J Mol Sci 2023;24:9854. [Crossref] [PubMed]
- Choi E, Jeon KH, Lee H, et al. Radiosensitizing effect of a novel CTSS inhibitor by enhancing BRCA1 protein stability in triple-negative breast cancer cells. Cancer Sci 2024;115:2036-48. [Crossref] [PubMed]
- Mao C, Xu N. Single-cell Sequencing Data Reveals Aggressive CD68-type Macrophages and Prognostic Models in Bladder Cancer. Curr Med Chem 2024;31:1523-38. [Crossref] [PubMed]
- Riether C, Ochsenbein AF. Genetic Alterations Impact Immune Microenvironment Interactions in Follicular Lymphoma. Cancer Cell 2020;37:621-2. [Crossref] [PubMed]
- Nunes SP, Henrique R, Jerónimo C, et al. DNA Methylation as a Therapeutic Target for Bladder Cancer. Cells 2020;9:1850. [Crossref] [PubMed]
- Rigden DJ, Fernández XM. The 27th annual Nucleic Acids Research database issue and molecular biology database collection. Nucleic Acids Res 2020;48:D1-8. [Crossref] [PubMed]
- Jang A, Tripathi N, Lanka SM, et al. Advanced urothelial carcinoma treated with enfortumab vedotin and pembrolizumab: a path to cure? Oncologist 2025;30:oyaf259. [Crossref] [PubMed]
- van der Heijden MS, Powles T, Gupta S, et al. Exploratory subgroup analyses of EV-302: a phase III global study to evaluate enfortumab vedotin in combination with pembrolizumab versus chemotherapy in previously untreated locally advanced or metastatic urothelial carcinoma. ESMO Open 2025;10:105544. [Crossref] [PubMed]

