Identification of M2 macrophage-related biomarkers for a predictive model of interstitial fibrosis and tubular atrophy after kidney transplantation by machine learning algorithms
Highlight box
Key findings
• This study identified ALOX5, ARL4C, and MS4A6A as M2 macrophage (Mφ2)-associated hub genes driving interstitial fibrosis and tubular atrophy (IFTA) in kidney transplants using bioinformatics and machine learning.
• We developed a diagnostic nomogram model with strong predictive accuracy for IFTA (area under the curve: 0.738 in training cohort; 0.78–0.88 in external validation cohorts).
• Consensus clustering stratified patients into high-risk (cluster 1) and low-risk (cluster 2) groups, with higher hub gene expression linked to faster graft loss (P<0.001).
• High-risk patients showed dysregulation of TNFα/NF-κB and TGF-β signaling, while low-risk patients displayed metabolic pathway activation.
What is known and what is new?
• Chronic Mφ2 polarization promotes IFTA fibrosis, but reliable biomarkers for early diagnosis and risk assessment are lacking. Current IFTA diagnosis relies on biopsy, which is limited by inter-observer variability and insensitivity to early fibrotic changes.
• This study uses transcriptomics and machine learning to uncover Mφ2 heterogeneity in human kidney allografts. Our multi-cohort validation bridges computational discovery with clinical prognosis.
What is the implication, and what should change now?
• The identified hub genes (ALOX5, ARL4C, and MS4A6A) offer potential as biomarkers for IFTA diagnosis, prognosis, and therapy, improving kidney transplant outcomes.
Introduction
Globally, solid organ transplantation remains a critical therapeutic intervention for end-stage organ failure, with kidney transplants accounting for around two-thirds of all procedures (1,2). Despite advancements in immunosuppressive regimens and post-transplant management, long-term graft outcomes remain suboptimal, with approximately 40% of kidney allografts failing within a decade (3). Chronic allograft injury, primarily mediated by alloimmune responses, typically manifests histologically as interstitial fibrosis and tubular atrophy (IFTA) and glomerulosclerosis (4-6). These pathological changes are the primary drivers of progressive graft dysfunction and remain the leading cause of long-term graft loss (2,7). While the immune-mediated mechanisms underlying chronic graft injury have been extensively studied, there is an urgent need to elucidate the molecular pathways driving irreversible fibrosis to develop more effective prognostic and therapeutic strategies.
The plasticity of macrophages (Mφ) within the tissue microenvironment plays a critical role in the pathogenesis of fibrosis (8,9). Through differential cytokine stimulation, these innate immune cells adopt context-dependent polarization states: classically activated [M1 macrophage (Mφ1); pro-inflammatory] or alternatively activated [M2 macrophage (Mφ2); resolution/repair phenotypes] (8,10). While Mφ1 polarization is induced by interferon-γ (IFN-γ) and lipopolysaccharide, Mφ2 differentiation is primarily driven by interleukin-4 (IL-4) and IL-13 (11). Although transient Mφ2 activation is beneficial for tissue repair, persistent Mφ2 polarization in kidney allografts contributes to maladaptive fibrotic processes (12). Dysregulated Mφ2 secretes elevated levels of TGF-β, which promotes extracellular matrix (ECM) deposition through SMAD3-dependent pathways and facilitates macrophage-to-myofibroblast transition (MMT), a key event in fibrogenesis (13). Emerging evidence further implicates the ATF6/TGF-β/SMAD3 and JAK/STAT6 pathways in the progression of MMT (14). Despite these insights, a comprehensive characterization of fibrosis-associated Mφ2 subpopulations and their transcriptional signatures in human kidney allografts remains limited, a critical gap this study aims to address.
Current clinical diagnosis of IFTA relies heavily on invasive pathological biopsy results, but limitations such as inconsistent physician evaluations, vague tissue patterns, and unreliable guidelines reduce accuracy (7,15,16). Objective molecular measurements could supplement traditional methods, while advancements in gene analysis tools and artificial intelligence (AI) are transforming disease marker discovery (17). Advances in high-throughput sequencing and computational approaches, such as machine learning, now enable the systematic identification of diagnostic gene signatures across various pathologies (18). Among these methods, weighted gene co-expression network analysis (WGCNA) has emerged as a powerful tool for constructing scale-free gene networks that preserve continuous expression relationships (19). By correlating gene modules with clinical phenotypes, WGCNA identifies hub genes that are mechanistically linked to disease progression (19). When combined with advanced machine learning techniques such as least absolute shrinkage and selection operator (LASSO) regression and random forest (RF) classifiers, WGCNA enhances the precision and generalizability of biomarker discovery in high-dimensional datasets (20).
In this study, we utilized the GSE98320 dataset, obtained from the Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/geo/), to identify potential Mφ2-related differentially expressed genes (DEGs) through differential expression analysis and WGCNA. Using three machine learning algorithms [LASSO, eXtreme Gradient Boosting (XGBoost), and RF], we identified three key Mφ2-associated hub biomarkers—ALOX5, ARL4C, and MS4A6A—as central drivers of IFTA. A diagnostic model based on these genes demonstrated robust predictive performance [area under the curve (AUC): 0.738 in the derivation cohort; 0.78–0.88 in external validation cohorts], establishing them as reliable biomarkers for kidney deterioration. Prognostically, consensus clustering stratified kidney transplant recipients into high-risk (cluster 1) and low-risk (cluster 2) groups, with elevated hub gene expression levels strongly correlating with accelerated graft loss (P<0.001). Functional enrichment analysis revealed significant immune dysregulation, particularly involving TNFα/NF-κB and TGF-β signaling pathways, in high-risk patients. Our findings uncover novel Mφ2-related biomarkers in post-transplant fibrosis, providing valuable diagnostic, prognostic, and therapeutic targets to mitigate IFTA progression and enhance graft survival. We present this article in accordance with the TRIPOD reporting checklist (available at https://tau.amegroups.com/article/view/10.21037/tau-2025-198/rc).
Methods
Data collection and identification of DEGs
Figure 1 presents the workflow of this study. RNA sequencing (RNA-seq) datasets were retrieved from the NCBI GEO (https://www.ncbi.nlm.nih.gov/geo/). Five distinct datasets were utilized: GSE98320, GSE76882, GSE22459, GSE65326, and GSE21374. Detailed information regarding these datasets is provided in Table 1. Differential expression analysis was performed on the GSE98320 dataset to compare non-IFTA and IFTA groups using the “limma” package in R (21). The analysis applied the following criteria: |log fold change (FC)| >0.5 and adjusted P value <0.05. DEGs were visualized with volcano plots and heatmaps, constructed using the “ggplot2” and “pheatmap” packages in R, respectively. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments.
Table 1
| GSE series | Platform | Organism | Source types | Sample size | Use for |
|---|---|---|---|---|---|
| GSE98320 | GPL15207 | Homo sapiens | Kidney transplant biopsies | 274 non-IFTA vs. 145 IFTA | (I) Getting the DEGs; (II) getting Mφ2-related genes; (III) identifying the hub genes by machine learning algorithms; (IV) construction of the IFTA diagnostic model |
| GSE76882 | GPL13158 | Homo sapiens | Kidney transplant biopsies | 99 non-IFTA vs. 135 IFTA | Validating IFTA diagnostic model |
| GSE22459 | GPL570 | Homo sapiens | Kidney transplant biopsies | 25 non-IFTA vs. 40 IFTA | Validating IFTA diagnostic model |
| GSE65326 | GPL10558 | Homo sapiens | Kidney transplant biopsies | 6 non-IFTA vs. 16 IFTA | Validating IFTA diagnostic model |
| GSE21374 | GPL570 | Homo sapiens | Kidney transplant biopsies | 282 kidney transplant samples | Investigating the prognostic implications of three hub genes in kidney transplantation |
DEGs, differentially expressed genes; GEO, Gene Expression Omnibus; IFTA, interstitial fibrosis and tubular atrophy; Mφ2, M2 macrophage.
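The DEG criteria described in this section (|log FC| >0.5 and adjusted P value <0.05) can be illustrated with a minimal Python sketch. The study itself used the “limma” package in R; the code below only shows multiple-testing adjustment followed by thresholding. Benjamini-Hochberg adjustment is an assumption (the text does not name the adjustment method), and the gene names, log FC values, and P values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(genes, log_fc, pvalues, fc_cutoff=0.5, alpha=0.05):
    """Keep genes with |log FC| > fc_cutoff and adjusted P < alpha."""
    adj = benjamini_hochberg(pvalues)
    return [g for g, fc, p in zip(genes, log_fc, adj)
            if abs(fc) > fc_cutoff and p < alpha]


# Hypothetical values for illustration only (not taken from GSE98320).
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
log_fc = [0.9, -0.7, 0.2, 0.6]
pvalues = [0.001, 0.01, 0.02, 0.5]
degs = select_degs(genes, log_fc, pvalues)
print(degs)
```

In this toy input, GENE_C fails the fold-change cutoff and GENE_D fails the adjusted P value cutoff, so only the first two genes are retained.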
Immune cell profiling using CIBERSORT algorithm
The CIBERSORT algorithm was employed to quantify immune cell composition within the microenvironment. Using the LM22 leukocyte gene signature matrix, which identifies 22 human hematopoietic cell subtypes (including T cells, B cells, plasma cells, NK cells, and myeloid subsets), this study estimated the relative abundance of these immune cell types. Specifically, Mφ composition was assessed by applying CIBERSORT to the gene expression matrix derived from the GSE98320 dataset.
WGCNA and integration with DEGs
The WGCNA R package was utilized to construct weighted gene co-expression networks (19). Initially, we applied hierarchical clustering analysis to exclude outlier samples. The optimal soft threshold β was determined using the “pickSoftThreshold” function in the WGCNA package, facilitating the construction of an adjacency matrix. This matrix was subsequently converted into a topological overlap matrix (TOM). Average linkage hierarchical clustering was performed based on TOM measures to group genes with comparable expression profiles into distinct modules. Lastly, we evaluated the relationship between these modules, macrophage subtypes, and clinical characteristics. To elucidate genes associated with Mφ2, the intersection of the DEGs and the critical module genes identified via WGCNA was analyzed, yielding a set of the common genes potentially linked to Mφ2.
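The two matrix steps in this paragraph, soft-thresholded adjacency and conversion to a TOM, can be sketched in plain Python. This is an illustration under stated assumptions, not the WGCNA implementation: it assumes an unsigned network, where adjacency is |Pearson r| raised to the power β, and the standard unsigned topological overlap formula. The expression vectors are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def adjacency(expr, beta):
    """Unsigned adjacency a_ij = |cor(i, j)|^beta; diagonal kept at 0."""
    g = len(expr)
    adj = [[0.0] * g for _ in range(g)]
    for i in range(g):
        for j in range(i + 1, g):
            adj[i][j] = adj[j][i] = abs(pearson(expr[i], expr[j])) ** beta
    return adj


def topological_overlap(adj):
    """TOM_ij = (sum_u a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij)."""
    g = len(adj)
    k = [sum(row) for row in adj]  # connectivity; the diagonal is 0
    tom = [[1.0 if i == j else 0.0 for j in range(g)] for i in range(g)]
    for i in range(g):
        for j in range(i + 1, g):
            shared = sum(adj[i][u] * adj[u][j]
                         for u in range(g) if u not in (i, j))
            tom[i][j] = tom[j][i] = (shared + adj[i][j]) / (
                min(k[i], k[j]) + 1.0 - adj[i][j])
    return tom


# Hypothetical expression vectors for three genes across four samples.
expr = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 1, 3, 2]]
adj = adjacency(expr, beta=10)
tom = topological_overlap(adj)
```

Raising correlations to a high power suppresses weak correlations, which is why genes 0 and 1 (perfectly correlated here) keep a high overlap while gene 2 is effectively disconnected. Hierarchical clustering would then operate on the dissimilarity 1 − TOM.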
Identification of hub genes by machine learning
To identify hub genes, we employed a three-step machine learning approach. First, the LASSO logistic regression algorithm was implemented using the “glmnet” package in R, enabling the selection of potential hub genes from the common gene pool (22). The regularization parameter (λ) was optimized through 10-fold cross-validation, with “lambda.min” chosen as the optimal value. Regression coefficients were visualized using path diagrams and cross-validation curves. Next, the RF algorithm was applied via the “randomForest” package. Gene importance was quantified using the mean decrease in Gini index, which measures how much each feature contributes to node purity. Additionally, the XGBoost algorithm, an advanced gradient boosting method known for its efficacy in classification tasks, was employed to rank features by importance using the “xgboost” package in R (18). By intersecting the top 5 genes identified by XGBoost, the top 5 genes from the RF algorithm, and the significant genes derived from LASSO regression, we ultimately identified three hub genes. Requiring agreement across three algorithms is intended to improve the reliability of hub gene identification.
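The final intersection step can be sketched as follows. This is a minimal illustration of combining the three feature lists; the placeholder gene names (G1 to G8) and importance scores are hypothetical and are not the study's results.

```python
def top_n(importance, n=5):
    """Return the n feature names with the highest importance scores."""
    ranked = sorted(importance.items(), key=lambda kv: kv[1], reverse=True)
    return [gene for gene, _ in ranked[:n]]


def intersect_hub_genes(lasso_selected, rf_importance, xgb_importance, n=5):
    """Genes kept by LASSO that are also in the top n of both RF and XGBoost."""
    shared = (set(lasso_selected)
              & set(top_n(rf_importance, n))
              & set(top_n(xgb_importance, n)))
    return sorted(shared)


# Hypothetical inputs for illustration only.
lasso_selected = ["G1", "G2", "G3", "G4", "G5"]
rf_importance = {"G1": 0.9, "G2": 0.8, "G3": 0.7, "G6": 0.6, "G7": 0.5, "G4": 0.1}
xgb_importance = {"G1": 0.30, "G2": 0.25, "G3": 0.20, "G5": 0.10, "G8": 0.05, "G4": 0.01}

hubs = intersect_hub_genes(lasso_selected, rf_importance, xgb_importance)
print(hubs)
```

Taking the intersection rather than the union is a conservative choice: a gene survives only if all three algorithms rank it highly.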
Functional and pathway enrichment analysis of DEGs
Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were conducted to explore the functional roles and pathways of DEGs in the GSE98320 dataset, using the “clusterProfiler” package (23). A significance threshold of P<0.05 was applied. Gene set variation analysis (GSVA) was performed using the R package “GSVA” and the H: hallmark gene sets (h.all.v2023.1.Hs.symbols) from the Molecular Signatures Database (MSigDB; http://www.gsea-msigdb.org/gsea/msigdb/index.jsp) (24), to identify pathway differences between two cluster subtypes in GSE21374. Results were considered significant at false discovery rate (FDR) <0.05. Single-gene gene set enrichment analysis (GSEA) of three hub genes was conducted using the “GSEABase” package to elucidate their biological functions in IFTA.
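Over-representation analyses of the kind used for GO and KEGG terms are commonly based on a hypergeometric test. The sketch below shows that calculation in plain Python as an illustration of the statistic, not a reproduction of the “clusterProfiler” workflow. The function name and the counts in the example are hypothetical.

```python
from math import comb


def enrichment_p_value(population, annotated, selected, overlap):
    """Upper-tail hypergeometric P value.

    population: number of background genes
    annotated:  background genes annotated to the term
    selected:   size of the gene list being tested
    overlap:    genes in the list that are annotated to the term
    """
    total = comb(population, selected)
    upper = min(selected, annotated)
    tail = sum(comb(annotated, i) * comb(population - annotated, selected - i)
               for i in range(overlap, upper + 1))
    return tail / total


# Hypothetical counts: 20 background genes, 5 in the term,
# 4 genes tested, 2 of which fall in the term.
p = enrichment_p_value(population=20, annotated=5, selected=4, overlap=2)
print(round(p, 4))
```

The returned value is the probability of seeing at least the observed overlap by chance; in practice it would be compared with the P<0.05 threshold stated above, usually after multiple-testing adjustment.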
Identification of molecular subtype
To elucidate the molecular mechanisms of IFTA, we employed the consensus clustering algorithm, a robust method for identifying subgroups, within the GSE21374 dataset. An unsupervised clustering analysis was performed using the R package “ConsensusClusterPlus”. The optimal number of clusters was determined through consensus matrix plots, cumulative distribution function (CDF) plots, and relative changes in the area under the CDF curve. Principal component analysis (PCA) was subsequently conducted to assess inter-cluster variability. Furthermore, expression patterns of the 3 hub genes were visualized using violin plots, highlighting significant disparities between clusters.
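The consensus matrix at the heart of this procedure records, for each pair of samples, the fraction of resampling runs in which the two samples were clustered together among the runs in which both were drawn. A minimal Python sketch follows; it assumes the per-run cluster labels are already available (the resampling and base clustering are omitted), and the sample names and labels are hypothetical.

```python
def consensus_matrix(samples, runs):
    """Pairwise co-clustering frequency across resampled clustering runs.

    runs: list of dicts mapping each sampled item to its cluster label;
          items not drawn in a given run are simply absent from that dict.
    """
    index = {s: i for i, s in enumerate(samples)}
    n = len(samples)
    together = [[0] * n for _ in range(n)]
    cosampled = [[0] * n for _ in range(n)]
    for labels in runs:
        present = [s for s in samples if s in labels]
        for a in present:
            for b in present:
                i, j = index[a], index[b]
                cosampled[i][j] += 1
                if labels[a] == labels[b]:
                    together[i][j] += 1
    return [[together[i][j] / cosampled[i][j] if cosampled[i][j] else 0.0
             for j in range(n)] for i in range(n)]


# Hypothetical labels from four resampling runs (label values are arbitrary).
samples = ["S1", "S2", "S3", "S4"]
runs = [
    {"S1": 0, "S2": 0, "S3": 1},
    {"S1": 1, "S2": 1, "S4": 0},
    {"S2": 0, "S3": 1, "S4": 1},
    {"S1": 0, "S3": 1, "S4": 1},
]
matrix = consensus_matrix(samples, runs)
```

Entries close to 0 or 1 indicate stable assignments. The CDF of these entries, computed for each candidate number of clusters, is what the CDF and delta-area plots summarize.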
Analysis of immune cell infiltration
CIBERSORT, a deconvolution algorithm, was used to quantify 22 immune cell types in the GSE21374 dataset, and the resulting immune infiltration profiles of the two subtypes identified by unsupervised clustering were compared and visualized using the R package “ggpubr”. To validate these findings, single-sample GSEA (ssGSEA) was performed on four subgroups to assess immune function enrichment.
Development and validation of an IFTA predictive nomogram model
We constructed a nomogram model using three hub genes from the GSE98320 dataset, employing the “rms” package in R to predict IFTA incidence. Calibration curves were generated to assess the model’s accuracy, while receiver operating characteristic (ROC) curves evaluated its efficacy. External validation was performed using the GSE76882, GSE22459, and GSE65326 datasets. Kaplan-Meier (KM) survival curves were used to analyze survival differences among kidney transplantation patient subtypes in the GSE21374 dataset.
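A nomogram of this kind is a graphical form of a logistic regression model, and the ROC AUC measures how well its predicted probabilities rank IFTA above non-IFTA samples. The sketch below shows both pieces in plain Python. The function names, coefficients, scores, and labels are hypothetical; the fitted model from the “rms” package is not reproduced here.

```python
import math


def predict_probability(expression, intercept, coefficients):
    """Logistic model: P(IFTA) = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    z = intercept + sum(b * x for b, x in zip(coefficients, expression))
    return 1.0 / (1.0 + math.exp(-z))


def roc_auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair (Mann-Whitney formulation).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predicted probabilities and true labels (1 = IFTA).
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.4]
labels = [1, 1, 1, 0, 0, 0, 0]
auc = roc_auc(scores, labels)
print(auc)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives context for the values reported for the training and validation cohorts.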
Statistical analysis
Statistical analyses were conducted using R software (version 4.2.2). Group differences were evaluated using Student’s t-test for normally distributed variables and the Mann-Whitney U test for non-normally distributed data. Survival outcomes were analyzed through KM survival curves, with comparisons performed via the log-rank test and the “survminer” R package. Correlation analysis was conducted using the Pearson correlation test. All statistical tests were two-sided, and a P value <0.05 was deemed statistically significant.
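For readers unfamiliar with KM curves, the product-limit estimator behind them can be sketched in a few lines of Python. This is an illustration only; the study used R, and the follow-up times and event indicators below are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate at each distinct event time.

    times:  follow-up time for each graft
    events: 1 if graft loss was observed at that time, 0 if censored
    Returns a list of (time, survival probability) pairs.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_at_t = sum(1 for tt, _ in data if tt == t)
        losses = sum(1 for tt, e in data if tt == t and e)
        if losses:
            survival *= 1.0 - losses / at_risk
            curve.append((t, survival))
        # Everyone observed at time t (events and censored) leaves the risk set.
        at_risk -= n_at_t
        i += n_at_t
    return curve


# Hypothetical follow-up data for five grafts.
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
```

Censored observations lower the number at risk without producing a step in the curve, which is how the estimator uses partial follow-up. The log-rank test then compares the observed and expected numbers of events between groups over these same event times.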
Results
Identification of DEGs
The research flowchart for this study is illustrated in Figure 1. We obtained the GSE98320 dataset from the GEO database, which included a total of 419 samples: 145 from patients with IFTA and 274 from the non-IFTA group. Differential gene expression analysis identified 197 DEGs, including 139 upregulated genes and 58 downregulated genes in the IFTA group (Figure 2A,2B; available online: https://cdn.amegroups.cn/static/public/tau-2025-198-1.xlsx).
Estimation of the Mφ infiltration level in kidney fibrosis
The infiltration levels of Mφ across all samples in the GSE98320 dataset were quantified by applying the CIBERSORT algorithm to the gene expression matrix. The clinical features and the proportions of the three Mφ phenotypes were integrated for WGCNA (available online: https://cdn.amegroups.cn/static/public/tau-2025-198-2.xlsx).
Identification of key Mφ2-related biomarker modules using WGCNA
Employing WGCNA, we identified critical gene modules correlated with Mφ. After preprocessing the GSE98320 dataset to eliminate duplicates and missing values, the top 25% of genes were selected for WGCNA. The optimal soft threshold β of 10 was determined based on achieving a scale-free R² of 0.8 (Figure 2C). With β=10 and a minimum module size of 100, six gene co-expression modules were established using average linkage hierarchical clustering and dynamic tree cutting (Figure 2D). The turquoise module showed the strongest negative correlation with Mφ2 infiltration levels (R=−0.42, P=2e−19), while the blue module demonstrated the strongest positive correlation (R=0.28, P=9e−9) (Figure 2E). Both modules were also significantly associated with IFTA, suggesting a potential Mφ2-related role in the transition from non-IFTA to IFTA. They were thus designated as Mφ2-related modules for further investigation. To identify hub genes in these modules, we calculated module membership (MM) and gene significance (GS) for each gene, focusing on genes strongly correlated with Mφ2 (MM >0.8, GS >0.3). The blue and turquoise modules exhibited the highest correlation coefficients with Mφ2 (blue module =0.51, P value =2.4e−52; turquoise module =0.33, P value =8.3e−42) (Figure 2F). A total of 24 key Mφ2-related biomarkers were identified through “network screening” based on GS and MM (Table S1).
Gene co-expression and functional analysis in IFTA
The overlap of DEGs and Mφ2-related biomarkers was visualized in a Venn diagram, yielding 12 common genes (Figure 3A). To elucidate their roles in IFTA, we conducted GO and KEGG analyses. Biological process (BP) results revealed significant enrichment in TNF signaling, B cell activation, and humoral immune response (Figure 3B). Cellular component (CC) results showed significant enrichment in the IPAF inflammasome complex, NLRP3 inflammasome complex, and trans-Golgi network (Figure 3C). Molecular function (MF) results highlighted modulation of cysteine endopeptidase and arachidonate metabolism (Figure 3D). KEGG analysis indicated that the common genes were most enriched in the efferocytosis, necroptosis, and NOD-like receptor signaling pathways (Figure 3E).
Identification of the diagnostic markers for kidney fibrosis via machine learning
Employing three machine-learning algorithms, we identified potential biomarkers for kidney fibrosis. LASSO regression identified 5 genes (Figure 4A,4B, Table S2). XGBoost ranked the common genes by their importance (Figure 4C, Table S3). The RF algorithm quantified gene importance via mean decrease Gini scores (Figure 4D,4E, Table S4). Intersecting the top 5 genes from each machine learning method yielded 3 hub genes: ALOX5, ARL4C, and MS4A6A (Figure 4F).
Expression levels and correlation analysis of the hub genes
The differential expression of ALOX5, ARL4C, and MS4A6A between the non-IFTA and IFTA groups in the GSE98320 dataset is shown in Figure 5A. The expression of these 3 hub genes was elevated in IFTA (P<0.05). Consistent expression trends in the GSE76882 dataset reinforced these findings (Figure 5B). Correlation analysis among the hub genes revealed strong positive correlations between ARL4C and ALOX5 (r=0.818, P<0.001), ALOX5 and MS4A6A (r=0.816, P<0.001), and ARL4C and MS4A6A (r=0.755, P<0.001), underscoring their shared characteristics in IFTA (Figure 5C-5E).
Establishment and validation of the IFTA diagnostic model
In the GSE98320 dataset, the three hub genes were used to construct an IFTA diagnostic model via logistic regression. This model was then represented graphically as a nomogram integrating the three predictors of IFTA occurrence (Figure 6A). Calibration curves showed close agreement between observed and predicted IFTA incidence (Figure 6B). With an AUC of 0.738, the model demonstrated adequate discriminative power (Figure 6C). Individual analyses revealed AUC values of 0.720 for ALOX5, 0.730 for ARL4C, and 0.700 for MS4A6A (Figure 6D-6F). External validation on GSE22459, GSE65326, and GSE76882 affirmed the model’s accuracy with AUCs of 0.782, 0.833, and 0.884, respectively (Figure 6G-6I).
The role of three hub genes in kidney transplant prognosis
To investigate the prognostic implications of the three hub genes in kidney transplantation, we applied consensus clustering to the GSE21374 dataset. Clustering was most stable with two clusters, as validated through CDF and CDF delta curves (Figure 7A,7B). The 282 kidney transplant samples were distinctly partitioned into cluster 1 (n=118) and cluster 2 (n=164), as shown in the consensus matrix plot (Figure 7C) (available online: https://cdn.amegroups.cn/static/public/tau-2025-198-3.xlsx). The PCA plot delineated clear distinctions between the clusters (Figure 7D). The expression patterns of hub genes were visualized using violin plots (Figure 7E). Furthermore, we compared their expression levels between non-rejection and rejection groups, as well as between survival and loss groups (Figure S1). Comparative KM survival curves revealed that cluster 1 had a higher rate of kidney graft loss over time post-transplantation (Figure 7F). Notably, cluster 1 had a significantly greater proportion of rejection episodes (P<0.001) (Figure 7G). Moreover, cluster 1 showed a notably elevated incidence of graft loss compared to cluster 2 (P<0.001) (Figure 7H). These results suggest that consensus clustering stratified patients into high-risk (cluster 1) and low-risk (cluster 2) groups.
Functional enrichment and immune cell infiltrations between the two clusters
Using GSVA with the hallmark gene sets from the MSigDB database, we observed that cluster 1 exhibited enrichment in inflammatory responses and allograft rejection pathways, including IL-6-JAK-STAT3 signaling, TNFα/NF-κB signaling, and TGF-β signaling. Conversely, cluster 2 showed metabolic pathway activation, such as xenobiotic metabolism and oxidative phosphorylation (Figure 8A). GSEA linked the hub genes (ALOX5, ARL4C, and MS4A6A) to allograft rejection, graft-versus-host disease, and primary immunodeficiency (Figure 8B-8D). Employing CIBERSORT, we found significant differences in immune cell infiltration, predominantly within T cells and macrophages (Figure 8E). Subsequent ssGSEA confirmed these differences in immune cell functions, with nearly all cell types showing significant disparity (Figure 8F).
Discussion
IFTA represents not only a prevalent histopathological manifestation of chronic kidney disease (CKD) but also a significant contributor to long-term kidney failure in transplanted kidneys (2,24). Emerging in the early post-transplant period as a consequence of chronic fibrosis, IFTA progressively leads to kidney dysfunction (25,26). While early diagnostic models for IFTA have demonstrated utility in prognosis assessment, effective detection methods remain scarce. The predominant understanding of fibrotic progression posits that the graft undergoes irreversible damage, maintaining structural integrity through a non-specific healing response, ultimately manifesting as interstitial fibrosis (27). Within this complex pathological process, Mφ emerge as pivotal immune regulators in kidney homeostasis, orchestrating inflammation, tissue regeneration, and fibrosis (14). Shinoda et al. elucidated that tissue transglutaminase (TG2) activity exacerbates kidney fibrosis through ALOX15-mediated polarization of monocytes into Mφ2 (28). Furthermore, the process of MMT has been identified as a significant contributor to ECM accumulation, with transitional cells co-expressing myofibroblast marker α-SMA and macrophage marker CD68. Notably, most α-SMA+CD68+ cells in fibrotic regions also express CD206, a characteristic Mφ2 marker (14). The advent of AI has revolutionized medical diagnostics, with machine learning emerging as a powerful computational tool for outcome prediction through advanced data mining and algorithmic analysis (29). This technology has shown particular promise in anticipating post-transplantation complications (30).
In our current investigation, we employed a comprehensive approach to identify key molecular players in IFTA pathogenesis. Through differential gene expression analysis and WGCNA of the GSE98320 dataset, we identified 197 DEGs and 24 key modular genes. Their intersection yielded 12 common genes, which were subsequently subjected to functional enrichment analysis. GO and KEGG analyses revealed their involvement in critical immune processes, including efferocytosis, necroptosis, and the NOD-like receptor signaling pathway.
Employing advanced machine learning algorithms (XGBoost, LASSO, and RF algorithms), we pinpointed three hub genes: ALOX5, ARL4C, and MS4A6A. A nomogram model incorporating these genes demonstrated promising efficacy in IFTA onset prediction, with AUC values consistently exceeding 0.7 across multiple validation datasets (GSE76882, GSE22459, and GSE65326). Consensus clustering analysis of GSE21374 delineated two distinct clusters, with subsequent prognosis analysis, functional enrichment, and immune cell infiltration studies revealing significant differences between these subgroups.
The identified hub genes warrant detailed examination. ALOX5 is a critical mediator in lipid metabolism and inflammation (31), significantly influencing Mφ polarization and fibrotic processes. In cancer, ALOX5 regulates the tumor microenvironment (TME) by promoting the infiltration and polarization of Mφ2, a subset of macrophages associated with immune suppression and tumor progression. In intrahepatic cholangiocarcinoma (ICC), the ALOX5 metabolite LTB4 recruits Mφ2 via BLT1/BLT2 receptors and activates the PI3K pathway, driving tumor growth (31). Similarly, in pancreatic cancer, ALOX5 enhances Mφ2 polarization through the JAK/STAT pathway, while Zileuton, an ALOX5 inhibitor, effectively counteracts this effect (32). In gliomas, ALOX5 mediates immunosuppressive Mφ2 polarization and upregulates programmed death-ligand 1 (PD-L1) expression via 5-hydroxyeicosatetraenoic acid (5-HETE), contributing to tumor immune evasion (33). Beyond oncology, ALOX5 plays a role in fibrotic diseases such as encapsulating peritoneal sclerosis, where its overexpression suggests it as a potential therapeutic target (34). In diabetic nephropathy (DN), ALOX5 inhibition reduces NF-κB signaling and mitigates kidney cell injury, offering new avenues for treatment (35). These diverse roles highlight the central role of ALOX5 in linking Mφ2 polarization and fibrotic mechanisms across diverse pathological conditions, emphasizing its potential as a therapeutic target in both cancer and fibrosis.
ARL4C, a membrane-localized GTP-binding protein, plays a significant role in Mφ polarization and fibrosis, particularly in inflammatory and fibrotic diseases (36). In rheumatoid arthritis (RA), ARL4C activation in fibroblast-like synoviocytes (FLSs) promotes synovial inflammation, cartilage degradation, and bone erosion through PI3K/AKT and MAPK signaling pathways. Importantly, silencing ARL4C disrupts the polarization of monocytes to the pro-inflammatory Mφ1 phenotype and inhibits the repolarization of Mφ2 to Mφ1, highlighting its regulatory role in macrophage dynamics (37). Beyond autoimmune diseases, ARL4C contributes to cancer-related fibrosis by driving epithelial-to-mesenchymal transition (EMT) (38) and facilitating invasion in cancers such as pancreatic and colorectal cancer through specific signaling pathways like ARL4C-IQGAP1-MMP14 (38-40). Additionally, therapeutic agents like ursolic acid show promise by modulating AKT signaling to promote ARL4C degradation, thus inhibiting fibrosis and cancer metastasis (41). These findings collectively underscore ARL4C as a key regulator connecting Mφ polarization, fibrotic processes, and disease progression, offering potential therapeutic targets.
MS4A6A plays a critical role in Mφ function (42) and fibrosis progression (43). In gliomas, it serves as a prognostic biomarker produced by Mφ, linked to poor outcomes and tumor aggressiveness (42). Additionally, MS4A6A-high Mφ with an Mφ2 phenotype drive inflammatory responses in fibrotic hypersensitivity pneumonitis, highlighting the involvement of MS4A6A in immune dysregulation and fibrotic processes (43). Beyond fibrosis, MS4A6A contributes to autoimmune pathology, such as in lupus nephritis, further underscoring its significance in immune modulation and disease progression (44). These findings suggest MS4A6A as a key regulator of Mφ activity and fibrotic mechanisms.
To further elucidate the role of the three hub genes in the long-term outcomes of kidney transplantation, we performed unsupervised clustering on the prognosis set based on their expression, identifying two distinct subgroups. Notably, cluster 1 exhibited significantly poorer prognosis than cluster 2. GSVA enrichment analysis revealed that cluster 1 was characterized by the upregulation of pathways associated with allograft rejection, including IL-6/JAK/STAT3 signaling and TGF-β signaling, both of which are closely linked to fibrosis (45,46). In contrast, cluster 2 showed activation of metabolic processes such as xenobiotic metabolism, adipogenesis, fatty acid metabolism, and oxidative phosphorylation. These findings suggest that cluster 1, identified through unsupervised clustering with the IFTA diagnostic model, represents a high-risk population with advanced fibrosis and poor clinical outcomes. This classification provides a valuable tool for risk stratification in the early post-transplantation period, enabling targeted high-frequency screening and early interventions for high-risk patients. Such an approach may mitigate adverse events and reduce the risk of graft loss, ultimately improving transplant survival rates.
While our study provides valuable insights into Mφ2-related biomarkers for IFTA, several limitations should be acknowledged. First, reliance on retrospective transcriptomic data introduces biases from batch effects and platform variability. Although normalized, RNA-seq data may still be influenced by technical factors like sequencing depth and RNA quality. Second, the machine learning workflow prioritized computational robustness, but the limited sample size (n=419 in GSE98320) restricts generalizability. Prospective, multi-center cohorts with standardized histology are needed for validation. Third, the findings remain correlative. While TNFα/NF-κB and TGF-β pathways were implicated, experimental validation (e.g., Mφ2 polarization assays or knockout models) is required to confirm a causal role of ALOX5, ARL4C, and MS4A6A in fibrosis. Finally, the nomogram model focuses on diagnostic accuracy but lacks integration of clinical parameters like donor-specific antibodies, estimated glomerular filtration rate (eGFR), or proteinuria, which could enhance predictive value. Future studies should address these gaps through multi-omics approaches and longitudinal monitoring to improve personalized management of kidney fibrosis.
Conclusions
This study identifies ALOX5, ARL4C, and MS4A6A as hub genes associated with M2 macrophage-driven kidney fibrosis in IFTA. Using bioinformatics and machine learning, we developed a diagnostic nomogram model with robust predictive accuracy (AUC 0.738–0.88). Consensus clustering stratified patients into high- and low-risk groups, where elevated hub gene expression correlated with accelerated graft loss (P<0.001) and dysregulation of fibrosis-related pathways (TGF-β, TNFα/NF-κB). These findings provide critical diagnostic/prognostic biomarkers and therapeutic targets, advancing personalized strategies for mitigating IFTA progression in kidney transplantation.
Acknowledgments
The authors sincerely acknowledge the Gene Expression Omnibus (GEO) database for providing the invaluable datasets used in this study. Additionally, the authors would like to express their gratitude to jvenn (https://www.bioinformatics.com.cn/static/others/jvenn/example.html) for the Venn diagrams provided.
Footnote
Reporting Checklist: The authors have completed the TRIPOD reporting checklist. Available at https://tau.amegroups.com/article/view/10.21037/tau-2025-198/rc
Peer Review File: Available at https://tau.amegroups.com/article/view/10.21037/tau-2025-198/prf
Funding: This work was supported by
Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://tau.amegroups.com/article/view/10.21037/tau-2025-198/coif). The authors have no conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments.
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
References
- Zheng X, Zhang W, Zhou H, et al. A randomized controlled trial to evaluate efficacy and safety of early conversion to a low-dose calcineurin inhibitor combined with sirolimus in renal transplant patients. Chin Med J (Engl) 2022;135:1597-603.
- Yin Y, Chen C, Zhang D, et al. Construction of predictive model of interstitial fibrosis and tubular atrophy after kidney transplantation with machine learning algorithms. Front Genet 2023;14:1276963.
- Lai X, Zheng X, Mathew JM, et al. Tackling Chronic Kidney Transplant Rejection: Challenges and Promises. Front Immunol 2021;12:661643.
- Zheng X, Li M, Wang P, et al. Assessment of chronic allograft injury in renal transplantation using diffusional kurtosis imaging. BMC Med Imaging 2021;21:63.
- Ahuja HK, Azim S, Maluf D, et al. Immune landscape of the kidney allograft in response to rejection. Clin Sci (Lond) 2023;137:1823-38.
- Wang M, Zeng F, Ning F, et al. Ceria nanoparticles ameliorate renal fibrosis by modulating the balance between oxidative phosphorylation and aerobic glycolysis. J Nanobiotechnology 2022;20:3.
- Guo Y, Cen K, Hong K, et al. Construction of a neural network diagnostic model for renal fibrosis and investigation of immune infiltration characteristics. Front Immunol 2023;14:1183088.
- Zhang Y, Liu Y, Luo S, et al. An adoptive cell therapy with TREM2-overexpressing macrophages mitigates the transition from acute kidney injury to chronic kidney disease. Clin Transl Med 2025;15:e70252.
- Froom ZSCS, Callaghan NI, Davenport Huyer L. Cellular crosstalk in fibrosis: Insights into macrophage and fibroblast dynamics. J Biol Chem 2025;301:110203.
- Fonseca AC, Colavite PM, Azevedo MCS, et al. Inhibition of MEK1/2 Signaling Pathway Limits M2 Macrophage Polarization and Interferes in the Dental Socket Repair Process in Mice. Biology (Basel) 2025;14:107.
- Wang H, Ye X, Spanos M, et al. Exosomal Non-Coding RNA Mediates Macrophage Polarization: Roles in Cardiovascular Diseases. Biology (Basel) 2023;12:745.
- Setten E, Castagna A, Nava-Sedeño JM, et al. Understanding fibrosis pathogenesis via modeling macrophage-fibroblast interplay in immune-metabolic context. Nat Commun 2022;13:6499.
- Tang PM, Zhang YY, Xiao J, et al. Neural transcription factor Pou4f1 promotes renal fibrosis via macrophage-myofibroblast transition. Proc Natl Acad Sci U S A 2020;117:20741-52.
- Li G, Yang H, Zhang D, et al. The role of macrophages in fibrosis of chronic kidney disease. Biomed Pharmacother 2024;177:117079.
- St Jeor JD, Reisenauer CJ, Andrews JC, et al. Transjugular Renal Biopsy Bleeding Risk and Diagnostic Yield: A Systematic Review. J Vasc Interv Radiol 2020;31:2106-12.
- Pang Q, Chen H, Wu H, et al. N6-methyladenosine regulators-related immune genes enable predict graft loss and discriminate T-cell mediate rejection in kidney transplantation biopsies for cause. Front Immunol 2022;13:1039013.
- Greener JG, Kandathil SM, Moffat L, et al. A guide to machine learning for biologists. Nat Rev Mol Cell Biol 2022;23:40-55.
- Mao K, Lin F, Pan Y, et al. Identification of glycosyltransferase genes for diagnosis of T-cell mediated rejection and prediction of graft loss in kidney transplantation. Transpl Immunol 2024;87:102114.
- Peng S, Yan W, Yan Y, et al. AP2M1 as the potential biomarker for prediction of the response of atopic dermatitis to Dupilumab therapy: Multi-omics analysis and evidence. Int J Biol Macromol 2025;297:139757.
- Jiang H, Zhang X, Wu Y, et al. Bioinformatics identification and validation of biomarkers and infiltrating immune cells in endometriosis. Front Immunol 2022;13:944683.
- Huang L, Zhang J, Songyang Z, et al. Identification and Validation of eRNA as a Prognostic Indicator for Cervical Cancer. Biology (Basel) 2024;13:227.
- Yao X, Qi X, Wang Y, et al. Identification and Validation of an Annexin-Related Prognostic Signature and Therapeutic Targets for Bladder Cancer: Integrative Analysis. Biology (Basel) 2022;11:259.
- Mao K, Lin F, Pan Y, et al. Identification of mitophagy-related gene signatures for predicting delayed graft function and renal allograft loss post-kidney transplantation. Transpl Immunol 2024;87:102148.
- Djudjaj S, Boor P. Cellular and molecular mechanisms of kidney fibrosis. Mol Aspects Med 2019;65:16-36.
- Feng YL, Wang WB, Ning Y, et al. Small molecules against the origin and activation of myofibroblast for renal interstitial fibrosis therapy. Biomed Pharmacother 2021;139:111386.
- Zhang H, Yang Y, Liu Z, et al. Significance of methylation-related genes in diagnosis and subtype classification of renal interstitial fibrosis. Hereditas 2023;160:32.
- Romagnani P, Remuzzi G, Glassock R, et al. Chronic kidney disease. Nat Rev Dis Primers 2017;3:17088.
- Shinoda Y, Tatsukawa H, Yonaga A, et al. Tissue transglutaminase exacerbates renal fibrosis via alternative activation of monocyte-derived macrophages. Cell Death Dis 2023;14:136.
- Lin A, Qi C, Li M, et al. Deep Learning Analysis of the Adipose Tissue and the Prediction of Prognosis in Colorectal Cancer. Front Nutr 2022;9:869263.
- Senanayake S, White N, Graves N, et al. Machine learning in predicting graft failure following kidney transplantation: A systematic review of published predictive models. Int J Med Inform 2019;130:103957.
- Chen J, Tang Y, Qin D, et al. ALOX5 acts as a key role in regulating the immune microenvironment in intrahepatic cholangiocarcinoma, recruiting tumor-associated macrophages through PI3K pathway. J Transl Med 2023;21:923.
- Hu WM, Liu SQ, Zhu KF, et al. The ALOX5 inhibitor Zileuton regulates tumor-associated macrophage M2 polarization by JAK/STAT and inhibits pancreatic cancer invasion and metastasis. Int Immunopharmacol 2023;121:110505.
- Chen T, Liu J, Wang C, et al. ALOX5 contributes to glioma progression by promoting 5-HETE-mediated immunosuppressive M2 polarization and PD-L1 expression of glioma-associated microglia/macrophages. J Immunother Cancer 2024;12:e009492.
- Lu X, Wu K, Jiang S, et al. Therapeutic mechanism of baicalein in peritoneal dialysis-associated peritoneal fibrosis based on network pharmacology and experimental validation. Front Pharmacol 2023;14:1153503.
- Chen X, Xie H, Liu Y, et al. Interference of ALOX5 alleviates inflammation and fibrosis in high glucose induced renal mesangial cells. Exp Ther Med 2023;25:34.
- Sztul E, Chen PW, Casanova JE, et al. ARF GTPases and their GEFs and GAPs: concepts and challenges. Mol Biol Cell 2019;30:1249-71.
- Tang N, Luo X, Ding Z, et al. Single-Cell Multi-Dimensional data analysis reveals the role of ARL4C in driving rheumatoid arthritis progression and Macrophage polarization dynamics. Int Immunopharmacol 2024;141:112987.
- Kanai R, Uehara T, Yoshizawa T, et al. ARL4C is associated with epithelial-to-mesenchymal transition in colorectal cancer. BMC Cancer 2023;23:478.
- Harada A, Matsumoto S, Yasumizu Y, et al. Localization of KRAS downstream target ARL4C to invasive pseudopods accelerates pancreatic cancer cell invasion. Elife 2021;10:e66721.
- Hu Q, Masuda T, Sato K, et al. Identification of ARL4C as a Peritoneal Dissemination-Associated Gene and Its Clinical Significance in Gastric Cancer. Ann Surg Oncol 2018;25:745-53.
- Zhang M, Xiang F, Sun Y, et al. Ursolic acid inhibits the metastasis of colon cancer by downregulating ARL4C expression. Oncol Rep 2024;51:27.
- Zhang C, Liu H, Tan Y, et al. MS4A6A is a new prognostic biomarker produced by macrophages in glioma patients. Front Immunol 2022;13:865020.
- Wang J, Zhang L, Luo L, et al. Characterizing cellular heterogeneity in fibrotic hypersensitivity pneumonitis by single-cell transcriptional analysis. Cell Death Discov 2022;8:38.
- Wang Z, Hu D, Pei G, et al. Identification of driver genes in lupus nephritis based on comprehensive bioinformatics and machine learning. Front Immunol 2023;14:1288699.
- Chen QY, Jiang YN, Guan X, et al. Aerobic Exercise Attenuates Pressure Overload-Induced Myocardial Remodeling and Myocardial Inflammation via Upregulating miR-574-3p in Mice. Circ Heart Fail 2024;17:e010569.
- Lee YI, Shim JE, Kim J, et al. WNT5A drives interleukin-6-dependent epithelial-mesenchymal transition via the JAK/STAT pathway in keloid pathogenesis. Burns Trauma 2022;10:tkac023.

