Your new experience awaits. Try the new design now and help us make it even better

ORIGINAL RESEARCH article

Front. Immunol., 12 September 2025

Sec. Cancer Immunity and Immunotherapy

Volume 16 - 2025 | https://doi.org/10.3389/fimmu.2025.1638445

Air pollution-related immune gene prognostic signature for hepatocellular carcinoma: network toxicology, machine learning and multi-omics analysis

Lei Pu&#x;Lei Pu1†Xiaoyan Zhang&#x;Xiaoyan Zhang1†Cheng PuCheng Pu2Peng Sun*Peng Sun1*
  • 1Key Laboratory of Adolescent Health Assessment and Exercise Intervention of the Ministry of Education, East China Normal University, Shanghai, China
  • 2College of Martial Arts, Shanghai University of Sport, Shanghai, China

Background: Air pollution may crosstalk with immune system to promote hepatocellular carcinoma (HCC) development, but its precise mechanisms and prognostic significance remain unclear.

Objective: This study aims to construct a prognostic signature for HCC based on air pollutant-related immune genes (APIGs).

Methods: We obtained mRNA-seq and scRNA of HCC from GEO, TCGA and ICGC. AP-related target genes were retrieved from several online databases. APIGs were obtained using WGCNA, differential gene expression analysis and immune infiltration analysis. Molecular subtypes were conducted based on APIG expression to characterize immune features. A total of 101 combinations of 10 machine learning algorithms were used to construct an APIG-based prognostic signature (APIGPS). Furthermore, we performed qRT-PCR, survival analyses, functional enrichment, immune infiltration and single-cell analyses. Subsequently, LASSO, RF, and RFE-SVM were employed to identify diagnostic genes, followed by pan-cancer analysis.

Results: We identified 19 APIGs. HCC samples were divided into 3 subtypes, with C1 exhibiting a pro-tumor immune microenvironment and poorer prognosis. APIGPS constructed by 7 APIGs (CDC25C, MELK, ATG4B, SLC2A1, CDC25B, APEX1, GLS), demonstrated robust predictive ability independent of clinical features. The biological pathway differences between APIGPS-based high- and low-risk groups involved immune responses and cell proliferation and migration. APIGPS genes had stable binding to 7 APs and were mainly expressed in macrophages, with HRG exhibiting higher macrophage abundance. CDC25C was identified as the hub gene after intersecting diagnostic genes and APIGPS genes. CDC25C was associated with survival of 10 cancers, MSI in 10 cancers, TMB in 21 cancers, and immune cell abundance in 13 cancers.

Conclusions: We identified key APIGs and constructed a robust APIG-based prognostic signature for HCC. CDC25C was a key target through which APs impact HCC and multiple other cancers.

Introduction

The acceleration of global industrialization has exacerbated air pollution, contributing to the onset and progression of various cancers. Air pollutants (APs) and other extrinsic factors account for 70% to 90% of lifetime risk of cancers (1). In 2019, air pollution-related neoplasms caused an estimated 387.45 million death cases globally. And the number of death cases were expected to rise to 559.02 million by 2025 (2). Exposure to pollutants such as PM2.5, nitrogen dioxide (NO2), SO2 and ozone (O3) has been strongly associated with the incidence and mortality of multiple cancers, including lung, kidney, colon, bladder cancers, and hepatocellular carcinoma (HCC). However, their exact mechanisms, especially for HCC, remain incompletely understood (35).

Air pollution may influence HCC development and prognosis by modulating immune system function. The prognosis for HCC remains poor, with a 5-year survival rate ranging from 13% to 36% from early to advanced stages (6). Among chronic hepatitis B patients treated with nucleotide/nucleoside analogue therapy, air pollution has been associated with an increased risk of developing HCC (7). For every 5.0 μg/m3 increase in PM2.5, the hazard ratio for HCC all-cause mortality increased by 1.18 (8). Studies in both animal models and humans suggest that inhalation of APs—such as nitrogen oxides and volatile organic compounds—is associated with hepatic dysfunction. Air pollution may also promote inflammatory responses through immune system crosstalk, enhancing the secretion of pro-inflammatory cytokines such as TNF-α, IL - 6, and IL - 1β (9). However, the precise mechanisms by which air pollution drives HCC via immune modulation remains elusive Based on these studies, we hypothesize that air pollution contributes to HCC progression by modulating tumor immune responses, and that specific air pollutant-related immune genes (APIGs) could serve as valuable diagnostic and prognostic biomarkers. To investigate this, we sought to identify key APIGs and elucidate their biological functions, prognostic significance, and potential therapeutic relevance in HCC.

In this study, we obtained AP-related target genes through network toxicology, and obtained APIGs using weighted gene co-expression network analysis (WGCNA), differential gene expression analysis and immune infiltration analysis. Machine learning algorithms were used to construct a prognostic signature (APIGPS) and validate its predictive performance. Based on the APIGPS, we performed nomogram construction, qRT-PCR, immune infiltration, tumor mutation burden, drug sensitivity, and single-cell analyses. To further explore the potential link between APs and HCC pathogenesis, we further identified diagnostic biomarkers to obtain potential hub genes with both diagnostic and prognostic utility, and performed pan-cancer analysis.

Methods

Data collection and processing

Figure 1 presents the research flowchart. We obtained HCC RNA-seq, scRNA-seq, and clinical information from The Cancer Genome Atlas (TCGA; 374 HCC cases; https://portal.gdc.cancer.gov/v1) and the Gene Expression Omnibus (GEO)as the training set. The International Cancer Genome Consortium (ICGC; 233 HCC cases; https://docs.icgc-argo.org/docs/data-access/icgc-25k-data) was used as an external validation set(). We obtained 16206 immune-related genes from GeneCards (www.genecards.org; search term: immune system, score: 1, type: coding; Supplementary Table S1).

Figure 1
A detailed scientific workflow diagram illustrating the analysis of gene expression data. It includes steps involving data sources like PubChem and GeneCards, heatmaps, venn diagrams for feature selection methods (RF, SVM-RFE, LASSO, APIGPS), and visualizations like ROC curves, bar charts, and Kaplan-Meier survival plots. Colored nodes and pathways indicate various analytical processes related to gene interactions and risk stratification. The bottom right corner features a 3D protein structure and bar graph comparison of expression levels in different cell types.

Figure 1. Research flowchart. We utilized multiple databases to obtain AP targets and immune-related targets. Through WGCNA, differential gene expression analysis, and immune infiltration analysis, we identified APIG. Based on these APIG, we constructed the APIGPS and conducted qRT-PCR validation (*=P < 0.05; **=P < 0.01; ***=P < 0.001), survival analysis, immune infiltration assessment, drug sensitivity analysis, tumor mutation burden, nomogram, single-cell analysis, and potential functional enrichment analysis. Subsequently, diagnostic biomarkers were screened based on AP targets and intersected with APIGPS to identify hub gene, followed by pan-cancer analysis.

Chemical structures of 7 APs were retrieved from the PubChem database (https://pubchem.ncbi.nlm.nih.gov/). First, we assessed their carcinogenicity using ADMETLAB 3.0 (https://admetlab3.scbdd.com) and ProTox3 (https://tox.charite.de/protox3). If the result from either database was above 0.5, we considered it indicative of carcinogenicity. Then we obtained human target genes for these compounds using four databases: the SuperPred database (https://prediction.charite.de/; Species: Human; Probability > 50%), Swiss Target Prediction database (http://www.swisstargetprediction.ch/; Species: Human), STITCH database (http://stitch.embl.de/; Organism: Human; score > 0.5), and the SEA database (https://sea.bkslab.org/; Species: Human; P < 0.05).

Weighted gene co-expression network analysis

Weighted Gene Co-expression Network Analysis (WGCNA) was used to construct gene co-expression networks, identify gene modules with highly coordinated expression patterns, and evaluate module-phenotype relationships. To identify immune genes significantly associated with HCC, we first removed genes with a standard deviation of expression level < 0.5. The optimal soft threshold was determined with a scale-free R2 of 0.9. We then converted the adjacency matrix into a topological overlap matrix, and calculated the corresponding dissimilarity (1-TOM). Subsequently, gene modules were identified using dynamic tree cutting and hierarchical clustering (cut height = 0.1, minimum module size = 100).

Identification of APIGs

We conducted differential gene expression analysis on the 16206 immune-related genes to obtain differentially expressed genes (DEGs) with |log2 fold change (FC)| > 1 and FDRq < 0.05. APIGs were obtained by intersecting the identified module genes, DEGs, and AP target genes. To evaluate their correlations with the immune features of HCC, we used the CIBERSORT algorithm (https://CIBERSORT.stanford.edu/) to estimate the infiltration abundance of 22 immune cells in HCC. CIBERSORT uses a deconvolution algorithm to infer cell-type abundances from a complex tissue based on a signature matrix of gene expression profiles. Subsequently, Pearson test was used to assess the correlation between APIGs and the abundance of 22 immune cells. |r| > 0.3 and p < 0.05 denotes a potential correlation.

Clustering analysis

Based on the expression of APIGs, clustering analysis was conducted to identify molecular subtypes. We used the “ConsensusClusterPlus” R package with Partitioning Around Medoids (PAM) algorithm and Euclidean distance; 80% of the samples were resampled for 10 repetitions. The optimal number of clusters (k) was determined by cumulative distribution function (CDF) plot and average cluster consensus. Principal component analysis (PCA) was conducted to visualize the spatial distribution of subtypes by projecting gene expression data onto principal components (PCs) for dimensionality reduction.

Functional and pathway enrichment analysis

Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses were conducted using the “clusterprofiler” R package. We also performed gene set variation analysis (GSVA) to evaluate differentially enriched pathways across samples. Additionally, GeneMANIA (https://genemania.org/) was used for functional network-based enrichment analysis.

Construction and validation of APIGPS

We used the “sva” R package to remove batch effects in the TCGA and ICGC sets, and then performed univariate COX analysis on the TCGA set to obtain APIGs significantly associated with HCC survival.

Using 10-fold cross-validation, 10 machine learning algorithms—least absolute shrinkage and selection operator (LASSO), Ridge, StepCox, CoxBoost, random survival forest (RSF), generalized boosted regression modeling (GBM), survival support vector machine (survival-SVM), supervised principal components (SuperPC), elastic net (Enet), and partial least squares regression for Cox (plsRcox)—were applied to screen variables to construct 101 APIGPS models. HCC samples with fewer than 20 days of survival and models with fewer than 5 genes were excluded. The optimal model was determined based on the average C-index value. HCC patients were then stratified into high-risk group (HRG) and low-risk group (LRG) according to risk scores calculated through linear combination formula. Subsequently, we performed univariate and multivariate COX analysis, K-M survival analysis, clinical ROC curve, temporal ROC curve, and risk curve using “survival”, “survminer”, and “timeROC” R package to further assess the predicative performance of APIGPS. p < 0.05 and the area under the curve (AUC) > 0.5 were considered significant.

Correlations with clinical features and nomogram construction

To evaluate the associations between APIGPS and clinical features, we performed chi-square tests followed by K-M survival analysis. Subsequently, we constructed a nomogram using the “rms” R package and evaluated the associations between the nomogram score and HCC survival. C-index value was used to evaluate the consistency of the predicted values and the observed values. Univariate and multivariate COX analyses were used to assess the independent predictive performance of the nomogram, ultimately providing a tool for individualized survival prediction in different HCC patients.

Immune-related and drug sensitivity analyses

We used CIBERSORT to compare the differences in the infiltration abundance of 22 immune cells between the HRG and LRG. To further validate these differences and evaluate the correlation between APIGPS and immune cell infiltration, we applied 6 additional algorithms, including QUANTISEQ, TIMER, MCPCOUNTER, CIBERSORT−ABS, EPIC, and XCELL. To evaluate immune microenvironment differences between the HRG and LRG, we performed single-sample gene set enrichment analysis (ssGSEA) using the “GSVA” R package, assessing the enrichment scores of 29 immune traits (16 immune cells and 13 immune functions). Based on predefined gene sets, ssGSEA ranks genes according to their expression levels and calculates enrichment scores for individual samples by evaluating the cumulative distribution and weighted enrichment of genes within the set, allowing quantification of the relative activity of specific immune cells in samples.

Next, we utilized the Tumor Immune Dysfunction and Exclusion (TIDE) (http://tide.dfci.harvard.edu/) to assess the immune escape ability of HRG and LRG, where higher scores correlate with poorer response to immunotherapy. Additionally, we obtained the Immuno-Phenoscore (IPS) for HCC patients from the TCIA database, with higher score indicating greater immunotherapy sensitivity. Finally, we leveraged TISIDB (https://cis.hku.hk/TISIDB/index.php) to obtain immune checkpoint genes and assess their differential expression between HRG and LRG, as well as their association with APIGPS in the context of immunotherapy response.

To identify highly sensitive drugs in the HRG and LRG, we used the “oncoppredict” R package to evaluate their sensitivity to 198 FDA-approved drugs. Drug sensitivity was quantified by half maximal inhibitory concentration (IC50), with lower values indicating higher sensitivity.

Tumor mutational burden analysis

To investigate the link between AP target mutations and APIGPS, we conducted mutation analysis on the 7 AP target genes. Based on the cutoff value, the patients were divided into TMB-high and TMB-low groups, which were then combined with APIGPS for survival analysis. The “maftools” package was used to visualize the top 20 genes with the highest mutation frequencies.

Single cell analysis

We obtained scRNA-seq data from GEO (GSE210679 and GSE149614) for 11 HCC cases. Data preprocessing was performed using the “Seurat” R package; the percentage of mitochondrial genes was calculated using PercentageFeatureSet. Quality control criteria were: nFeature = 500 - 6000, nCount = 2000 - 40000, and percent.mt < 25. Batch effects were corrected using the “Harmony” package, and PCA were used for dimension reduction, respectively. The top 10 PCs were used for cell clustering via “FindNeighbors” and “FindClusters” functions. Visualization was realized by Uniform Manifold Approximation and Projection (UMAP). Cell type annotation was performed using the “SingleR” package with reference to five human hematopoietic datasets from the “celldex” package (BlueprintEncodeData, MonacoImmuneData, and NovershternHematopoieticData). Furthermore, we performed cell-cell communication analysis using the “CellChat” package, based on ligand-receptor pairs from the CellChat database. Finally, we performed pseudotime trajectory analysis of macrophages using the “monocle” R package, and calculated gene set scores using the AddModuleScore function from the Seurat package.

Molecular docking

We obtained protein structures of APIGPS genes from the Alphafold (https://alphafold.ebi.ac.uk/) and PDB (https://www.rcsb.org/) databases. Molecular docking analysis was performed using the CB-Dock2 (https://cadd.labshare.cn/), which predicts binding affinity based on structural complementarity. Lower binding energy indicates stronger, more stable binding affinity.

Identification of hub genes

To further identify key markers with dual diagnostic and prognostic values, we used LASSO, SVM-RFE and RF to screened APIGPS genes based on 10-fold cross-validation. AUC was used to evaluate the diagnostic performance of these genes. LASSO incorporates L1 regularization and the tuning parameter λ (lambda) to control penalty strength. LASSO can shrink the coefficients of redundant features to zero to produce sparse modeling in high-dimensional datasets, which can help identify the most predictive features in disease prediction. SVM-RFE is a feature selection algorithm that combines Support Vector Machine and Recursive Feature Elimination. Its core idea is to use the weight coefficients in the SVM model to rank features and eliminate the least important features iteratively, thus progressively optimizing the feature subset and ultimately the most critical features for improving classification performance. By combining random data sampling and random feature selection, RF constructs multiple decision trees and combines their predictions to select feature genes for disease prediction.

Pan-cancer analysis of the hub gene

Given the widespread impact of APs on the survival of various cancers, we performed pan-cancer analysis using mRNA-seq data of 33 cancers retrieved from the UCSC database (https://xena.ucsc.edu/). We examined the expression level of the hub gene in 33 cancers and the association with immune infiltration, pan-cancer survival, tumor mutational burden (TMB), and microsatellite instability (MSI).

qRT-PCR and immunohistochemistry

The cell lines (THLE - 2, HepG2) were purchased from the Institute of Cell Research, Chinese Academy of Sciences. Total RNA was isolated using Trizol reagent following the manufacturer’s instructions (Invitrogen, 1596 - 026). cDNA was synthesized using a reverse transcription kit (Fermentas, #K1622). Quantitative real-time PCR (qRT-PCR) was conducted with the SYBR Green kit (Thermo, #K0223). GAPDH served as the internal control for normalization. The primer sequences are provided in Supplementary Table S6.

Immunohistochemical (IHC) results for normal people and HCC patients were obtained from the Human Protein Atlas (HPA) database (https://www.proteinatlas.org/).

Statistical analysis

All statistical analyses were performed using R software (Version 4.3.2; the R Foundation, St. Louis, MO, USA). Chi-square test was used for categorical data; t-test or Wilcox test was used for continuous data. Cytoscape (v3.8.2) was used to visualize the interaction network. HR > 1 was considered a risk factor; HR < 1 indicated a protective factor. Unless otherwise specified, p < 0.05 with a confidence interval of 95% was considered significant.

Results

Toxicity and target genes of air pollutants

The 7 Aps analyzed exhibited high carcinogenicity in both toxicity prediction platforms. Target genes were identified from four databases as follows: 82 for benzene, 70 for carbon monoxide (CO), 92 for nitric oxide (NO), 87 for nitrogen dioxide (NO2), 88 for ozone (O3), 92 for sulfur dioxide (SO2), and 102 for toluene. After intersection, a total of 257 AP-related target genes were identified (Supplementary Tables S1, S2).

Identification of APIGs associated with HCC

We performed WGCNA using the TCGA set to identify immune genes most associated with HCC. Based on the optimal power value of 11 (Figures 2A, B), we identified 9 co-expression modules (Figure 2C). Among these modules, the turquoise module (containing 1636 genes) had the highest correlation with HCC (r=0.59, p=1e-41; Figure 2D). And the gene significance and module membership for the turquoise module showed a significant correlation (r=0.47, p=1.1e-90; Supplementary Table S3; Figure 2E).

Figure 2
Panel of scientific charts and diagrams analyzing genetic data:  A: Line graph titled “Scale independence” shows scale-free topology model fit against soft threshold power.  B: Scatter plot titled “Mean connectivity” displays mean connectivity against soft threshold power.  C: Dendrogram with colored modules representing gene clusters.  D: Heatmap titled “Module-trait relationships (TCGA)” illustrates correlations between modules and traits.  E: Scatter plot of module membership versus gene significance for tumors.  F: Volcano plot showing log fold change versus -log P values.  G: Venn diagram comparing WGCNA, DEGs, and AP-target.  H: Network diagram linking HCC and AP with gene interactions.  I: Heatmap of immune cell correlation with genes, marked by significance levels.

Figure 2. Identification of APIG. (A) The selection of soft threshold β. (B) Mean connectivity for HCC. (C) Cluster dendrogram of WGCNA analysis. (D) Module-trait heatmap indicating the correlation between modules and HCC. (E) The correlation between module membership and gene significance in the turquoise module for HCC. (F) Volcano plot of DEGs. (G) Venn diagram of DEGs, module genes, and AP-target. (H) Correlation plot of APIG. (I) Correlation analysis between genes and immune infiltration (*=P < 0.05; **=P < 0.01; ***=P < 0.001).

Subsequently, differential gene expression analysis of 16206 immune genes identified 4118 DEGs (Supplementary Table S4; Figure 2F). Intersecting these DEGs with 1636 genes in the turquoise module yielded1197 immune genes associated with HCC.

These 1197 immune genes were further intersected with the 257 AP target genes, resulting in 25 genes (Figures 2G, H). We then assessed their correlation with immune cell infiltration and finally obtained 19 APIGs (Supplementary Table S5; Figure 2I).

Molecular subtyping based on APIGs

Based on the expression of APIGs and CDF curve, HCC samples were divided into 3 subtypes (Supplementary Figures S1A, B). PCA showed distinct separation among the subtypes (Supplementary Figure S1C). Among them, C1 subtype exhibited a tumor-promoting immune phenotype, characterized by significantly higher macrophage (Mφ) abundance as confirmed by both ssGSEA and CIBERSORT analyses (Figures 3A, B). In contrast, C2 and C3 subtypes exhibited features of an immunosuppressive microenvironment, such as higher abundance of resting memory CD4+ T cells and monocytes, yet displayed a more favorable immune microenvironment than C1. Consistently, all 7 immune infiltration algorithms confirmed the significant differences in Mφ abundance between C1 and C2/C3 subtypes (Figure 3C). As expected, the K-M curve indicated that the C1 subtype had significantly lower overall survival compared to C2 and C3 subtypes (Figure 3D). These findings suggest that AP-related immune alterations may drive distinct immune subtypes in HCC.

Figure 3
Chart A shows a box plot comparing immune infiltration across clusters C1, C2, and C3, with significant differences indicated. Chart B displays a box plot of scores by risk and cluster groups. Chart C presents a detailed heatmap of immune cell types clustered by methods, with variations visualized through color gradients. Chart D is a Kaplan-Meier survival curve for clusters C1, C2, and C3, showing survival probabilities over a ten-year period, with statistical significance noted by p<0.001.

Figure 3. Molecular subtyping and immune traits. (A) Differences in the infiltration abundance of 22 immune cells between the three subtypes (*P < 0.05; **P < 0.01; ***P < 0.001). (B) Differences in the infiltration abundance of 29 immune traits between the three subtypes (*P < 0.05; **P < 0.01; ***P < 0.001). (C) 7 immune infiltration algorithms for the three subtypes. (D) Survival analysis of the three subtypes.

Construction of APIGPS through machine learning

Univariate COX analysis identified 18 APIGs significantly associated with HCC prognosis (Figure 4A; Supplementary Table S6). Subsequently, based on 10-fold cross validation, 101 combinations of 10 machine learning algorithms were employed to construct APIGPS models. Among them, APIGPS constructed by Lasso+CoxBoost had the highest average C-index (0.729; Figure 4B), and incorporated 7 APIGs (CDC25C, MELK, ATG4B, SLC2A1, CDC25B, APEX1, GLS). K-M survival analysis indicated that low expression of these genes was associated with better HCC survival (Supplementary Figures S2A-G). Additionally, qRT-PCR confirmed their significant upregulation in HCC samples (Figure 4C; Supplementary Table S7), which was further supported by IHC results (Supplementary Figures S3A-G).

Figure 4
A multi-panel scientific figure showing various analyses:   A) A forest plot of hazard ratios for different proteins, with values and significance levels.  B) A table listing predictive models and their C-index scores, with color-coded performance levels.  C) A bar graph comparing relative mRNA expression for several genes in two cell types, with significance indicators.  D) A scatter plot of principal component analysis differentiating high and low risk groups.  E) A Kaplan-Meier survival curve showing significant differences in survival between risk groups.  F) A ROC curve for different time points with AUC values for model prediction.  G) A ROC curve comparing predictive factors like risk, age, gender, grade, and stage with AUC values.

Figure 4. Construction and Validation of APIGPS. (A) Univariate COX analyses associated with survival. (B) Construction of APIGPS using integrated machine learning. (C) qRT-PCR of the signature genes (*=P < 0.05; **=P < 0.01; ***=P < 0.001). (D) PCA based on EIGPS. (E) K-M survival curves based on APIGPS. (F) ROC curves predicting 1-, 3-, and 5-year survival. (G) ROC curves of risk scores and clinical features.

Evaluation and validation of APIGPS

To assess the performance of the constructed APIGPS, we calculated risk scores for patients and divided them into HRG and LRG. PCA demonstrated clear separation between HRG and LRG (Figure 4D). Notably, increasing APIGPS risk scores correlated with progressively higher mortality rates (Supplementary Figures S4A, B). K-M curve showed that HRG had significantly worse survival than LRG (Figure 4E). Time-dependent ROC curves suggested that the APIGPS had good predictive accuracy for 1-year (AUC = 0.791), 3-year (AUC = 0.676), and 5-year (AUC = 0.646) survival (Figure 4F). Importantly, APIGPS outperformed other clinical features in prognostic accuracy (Figure 4G). Univariate (HR = 3.958, 95%CI: 2.527 - 6.193, P<0.001; Supplementary Figure S4C) and multivariate COX regression analyses (HR = 3.113, 95%CI:1.904-5.091, P<0.001; Supplementary Figure S4D) confirmed the independent prognostic value of APIGPS. Furthermore, APIGPS exhibited potential associations with molecular subtyping (Supplementary Figure S5).

We further validated the robustness and generalizability of APIGPS in the external validation set. The results of the ROC curve (Supplementary Figures S6A, B), K-M analysis (Supplementary Figure S6H), risk curve (Supplementary Figure S6C), PCA (Supplementary Figure S6G), scatter plot (Supplementary Figure S6F), univariate (Supplementary Figure S6E) and multivariate (Supplementary Figure S6B) COX regression analyses, and signature genes expression (Supplementary Figure S6I) in the validation set were all consistent with the training set.

Clinical correlations and the nomogram based on APIGPS

We evaluated the correlations between APIGPS and clinical features and observed significant differences in T-stage, Stage, and Grade between HRG and LRG (Supplementary Figure S7A). K-M analysis revealed that the prognostic value of APIGPS was independent of clinical features: patients in the LRG consistently had better survival across early and late stage (Supplementary Figures S7B-D).

To enable personalized prognostic prediction for HCC patients, we constructed a nomogram based on APIGPS (Supplementary Figure S8A). The nomogram accurately predicted 1-, 3-, and 5-year survival (C-index = 0.736, 95% CI: 0.685 - 0.787; Supplementary Figure S8B). Mortality gradually increased with higher nomogram scores (Supplementary Figure S8C). Both univariate (HR = 2.061, 95%CI:1.712-2.48, P<0.001; Supplementary Figure S8D) and multivariate (HR = 1.9, 95%CI:1.428-2.53, P<0.001; Supplementary Figure S8E) COX regression analyses confirmed the nomogram’s independent prognostic utility (AUC = 0.832; Supplementary Figure S8F).

Immune infiltration analysis based on APIGPS

CIBERSORT analysis indicated that the abundance of Mφ and activated memory CD4+ T cells were significantly higher in HRG than in LRG, while the abundance of resting memory CD4+ T cells, monocytes, and M1-like macrophage (M1)were significantly higher in LRG than in HRG (Figures 5A, B). Consistently, ssGSEA showed higher activity or abundance of immature dendritic cells (iDCs), Mφ, MHC class I, CD4+ T helper cells, and tumor-infiltrating lymphocytes (TILs)in the HRG than in LRG (Figure 5C). Results from 7 immune infiltration algorithms further supported an elevated abundance of multiple tumor-promoting immune features in the HRG (Figure 5D). For instance, Mφ abundance was significantly elevated across multiple algorithms, and was significantly positively correlated with APIGPS (Figure 5E). Furthermore, all of these algorithms showed that Mφ abundance was elevated in HRG. Survival analysis demonstrated that higher Mφ abundance was associated with worse survival (Figure 5F).

Figure 5
A series of graphs and charts analyzing immune cell interactions and survival data. Panel A shows a stacked bar chart of immune cell percentages across samples. Panel B features a box plot comparing immune cell fractions in low and high-risk groups. Panel C presents a bar chart with scores for different immune-related features, contrasting low and high-risk groups. Panel D is a heatmap of immune cell scores from various methods. Panel E is a scatter plot of correlation coefficients for immune cell interactions. Panel F displays a Kaplan-Meier survival curve for macrophages, contrasting high and low-risk groups with p-value 0.009.

Figure 5. Immune infiltration analysis of APIGPS. (A) Heatmap of immune infiltration in HRG and LRG. (B) Differences in 22 immune cells in HRG and LRG (*P < 0.05; **P < 0.01; ***P < 0.001). (C) Differences in 16 immune cells and 13 immune functions in HRG and LRG (*P < 0.05; **P < 0.01; ***P < 0.001). (D) Heatmap of the differences in the immune infiltration abundance between HRG and LRG based on 7 immune infiltration algorithms. (E) Correlation between risk scores and immune infiltration abundance based on 7 immune infiltration algorithms. (F) Survival analysis of macrophages.

Single cell analysis based on APIGPS

After quality control, we obtained 33694 genes and 43918 cells; after outlier removal, we retained 25190 genes and 34202 cells for normalization and dimensionality reduction (Supplementary Figure S9A). The top 10 PCs were selected according to the elbow plot (Supplementary Figure S9B). We identified 18 clusters with 0.5 as the best resolution (Figure 6A; Supplementary Figures S9C, D). Cell type annotation revealed cell types, including adipocytes, B cells, CD8+ T cells, endothelial cells, fibroblasts, hepatocytes, Mφ, monocytes and T cells (Figure 6B). APIGPS genes were mainly expressed in Mφ (Figure 6C). Cell–cell communication analysis showed that interactions between Mφ and endothelial cells exhibited the highest interaction number and strength (Figures 6D, E; Supplementary Tables S8 and S9), with PPIA-BSG ligand-receptor pairs playing a mediating role (Figure 6F). Subsequently, we extracted macrophage subsets and identified distinct macrophage clusters (Figure 6G). Using the AddModuleScore function, we calculated APIGPS scores and found that cluster 4 exhibited the highest scores (Supplementary Figure S9E). Pseudotime trajectory analysis revealed three distinct differentiation paths (Figure 6H), with APIGPS+ macrophages located at the terminal end of the first trajectory (Figure 6I). Notably, along this trajectory, the APIGPS+ cells showed a phenotypic shift from high to low M1 polarization scores (Figure 6J), suggesting a potential inhibitory effect of APIGPS on M1 polarization.

Figure 6
Panel A shows a hierarchical tree diagram with node sizes representing count. Panel B is a UMAP plot with colored cluster annotations for various cell types. Panel C contains UMAP expression plots for genes APEX1, ATG4B, CDC25B, CDC25C, GLS, MELK, and SLC2A1. Panels D and E depict circular network diagrams illustrating interactions among cell types. Panel F is a dot plot showing communication probabilities and P-values. Panel G is another UMAP with cluster annotations for macrophages. Panel H is a pseudotime plot, and Panel I is a seurat cluster plot. Panel J presents a box plot comparing M1 scores.

Figure 6. Single cell analysis. (A) Annotated cell types. (B) Selection of clustering resolutions. (C) APIGPS genes expression in cells. (D) Circle plot of interaction number in cell-cell communication. (E) Circle plot of interaction strength in cell-cell communication. (F) Receptor-ligand for cell communication. (G) Annotation of macrophages. (H) Macrophage pseudotime trajectory. (I) The pseudotime trajectory of macrophage classification. (J) The score of M1.

Macrophage-related immune pathway enrichment analysis based on APIGPS

To delineate differences in Mφ-related pathways between HRG and LRG, we performed GO and KEGG analyses. The most significant differences in immunological functions and pathways were leukocyte mediated immunity and chemokine signaling pathway, respectively (Supplementary Figures S10A, B; Supplementary Tables S10, S11). GSVA further confirmed dysregulated Mφ-related pathways in HRG, such as complement and coagulation cascades, and Fc gamma receptor (FcγR) mediated phagocytosis (Supplementary Figure S10C; Supplementary Table S12).

Immunotherapy-related and drug sensitivity analyses based on APIGPS

Immunotherapy represents a promising strategy for the treatment of HCC. The HRG had a significantly higher TIDE score than the LRG, suggesting its greater immune escape ability and poorer predicted response to immunotherapy (Figure 7A). The LRG had a higher IPS for CTLA4-/PD - 1- treatment, suggesting its better response to PD - 1 inhibitor and CTLA4 inhibitor therapy (Figure 7B). Immune checkpoint-related genes are closely associated with the efficacy of immunotherapy, and were more highly expressed in the HRG (Figure 7C). The expression of immune checkpoint-related genes can reflect T-cell activation or exhaustion, and higher expression can indicate either immune activation or immune suppression. Given our immune infiltration findings, this likely reflects an immune-suppressive phenotype in the HRG. Moreover, APIGPS risk scores were positively correlated with the expression of core clinical immunotherapy targets, implying that HCC patients with higher risk scores may derive less benefit from immunotherapy (Figure 7D). Drug sensitivity analysis indicated that the LRG was more sensitive to Axitinib and Ribociclib (Figures 7E, F), whereas the HRG showed higher sensitivity to Afatinib and Cediranib (Figures 7G, H).

Figure 7
Graphical panel comparing low-risk (blue) and high-risk (red) groups across different metrics. (A) and (B) present violin plots illustrating TIDE and tumor mutation burden, respectively, with significant differences indicated. (C) shows box plots of gene expression for various genes, highlighting differences between risk groups. (D) is a correlation matrix of immune checkpoint gene expressions. (E), (F), (G), and (H) consist of box plots comparing azathioprine, ribociclib, axitinib, and crizotinib sensitivity across risk groups, showcasing significant differences.

Figure 7. Immunotherapy related analysis. (A) TIDE scores (***P < 0.001). (B) IPS scores. (C) Differential expression of immune checkpoint genes (*P < 0.05; **P < 0.01; ***P < 0.001). (D) Correlation of risk scores with clinically common immune checkpoints. (E, F) Drugs that are more sensitive in the LRG. (G, H) Drugs that are more sensitive in the HRG.

Tumor mutational burden analysis based on APIGPS

After dividing HCC patients into TMB-high and TMB-low groups, K-M curves showed that TMB-high group had significantly lower survival rate than TMB-low group (Supplementary Figure S11A). Mutation rates differed markedly between risk groups, with 55.8% in HRG and 32.35% in LRG (Supplementary Figures S11B, C). Among the 257 AP-related target genes, TP53 was the most frequently mutated gene in both HRG and LRG. When combining TMB status with APIGPS risk, the TMB-high + HRG group had the worst survival (Supplementary Figure S11D).

Identification of a hub gene with dual diagnostic and prognostic value

We employed LASSO, SVM-RFE, and RF to screen key genes from 19 APIGs. LASSO selected 11 genes based on the minimum λ (Figures 8A, B); RF identified 8 genes with an importance score > 5 (Figures 8C, D); SVM-RFE identified 9 genes according to the minimum error (Figures 8E, F; Supplementary Table S13). After intersecting these genes with APIGPS genes (Figure 8G), CDC25C was identified as the hub gene with good diagnostic value (AUC = 0.98; Figure 8H).

Figure 8
Panel A shows a binomial deviance plot against Log Lambda. Panel B presents a coefficients path plot for L1 norm. Panel C displays error rates over various numbers of trees in a random forest. Panel D is a bar plot showing feature importance with CDK1 being most significant. Panel E illustrates the 10-fold cross-validation accuracy across several feature numbers. Panel F displays the 10-fold cross-validation error. Panel G is a Venn diagram showing feature selection overlap among methods: RF, LASSO, SVM-RFE, APIGPS. Panel H is an ROC curve for CDC25C, indicating an AUC of 0.980.

Figure 8. Identification of hub gene using machine learning. (A) LASSO lambda. (B) LASSO coefficient profiles. (C) RF error rate. (D) RF importance ranking of AP-related targets. (E) SVM-RFE accuracy of AP-related targets. (F) SVM-RFE error of AP-related targets. (F) RF importance ranking of AP-related targets. (G) Venn diagram of genes identified by 3 machine learning methods. (H) ROC curve of CDC25C.

Pan-cancer analysis of the hub gene

To explore broader oncogenic relevance, we conducted a pan-cancer assessment of CDC25C. CDC25C was highly expressed in all cancers and was involved in cell cycle-related functions (Supplementary Figures S12A, B). In addition, CDC25C was significantly associated with immune cell abundance in 13 cancers (Supplementary Figure S12C; Supplementary Table S14), TMB in 21 cancers (Supplementary Figure S12D; Supplementary Table S15), and microsatellite instability in 10 cancers (Supplementary Figure S12E; Supplementary Table S16). High CDC25C expression was associated with low survival in 9 cancers besides HCC (Supplementary Figures S13A-I).

Molecular docking analysis based on APIGPS

Molecular docking analysis suggested that all AP–APIGPS interactions had binding energies between -1 to -5, indicating good binding affinity between 7 APs and 7 APIGPS genes (Figure 9A; Supplementary Table S17). Among them (Top 3), CDC25C formed more unstable bindings with carbon monoxide (Figure 9B), nitric oxide (Figure 9C), and sulfur dioxide (Figure 9D). These findings further supported the hypothesis that APs may crosstalk with APIGPS expression, thereby influencing the progression of HCC.

Figure 9
Heatmap and 3D models of protein-ligand interactions. The heatmap on the left shows interactions between pollutants and proteins with a color scale from red to blue, indicating the range of values from -1.5 to -4.5. The 3D structures on the right, labeled B, C, and D, depict molecular interactions highlighting specific residues and bonds, with proteins represented in various colors, including yellow, white, and pink.

Figure 9. Molecular docking. (A) Heatmap of binding energies of molecular docking analysis (lower energy indicates stronger binding affinity). (B) CDC25C and carbon monoxide. (C) CDC25C and nitric oxide. (D) CDC25C and sulfur dioxide.

Discussion

The rising mortality and morbidity of air pollution-induced HCC underscore an urgent public health concern. Recent epidemiological evidence suggests that air pollution may interact with the immune system to promote HCC progression; however, the exact mechanisms remain unclear. In this study, we intersected AP-related target genes (derived from network toxicology) with HCC-related immune genes (identified via WGCNA and differential gene expression analysis), followed by immune infiltration analysis, and ultimately identified 19 APIGs for molecular subtyping. Subsequently, we constructed an APIGPS containing 7 APIGs, which demonstrated robust predictive performance. Further analysis revealed that Mφ were a major immune cell phenotype distinguishing HRG and LRG, with HRG exhibiting poorer response to immunotherapy. Afatinib and Cediranib were identified as potential sensitive drugs for HRG patients. Single-cell RNA sequencing showed that APIGPS genes were predominantly expressed in Mφ. TP53 was identified as the most frequently mutated gene. Using 3 machine learning methods, we identified diagnostic genes and intersected them with APIGPS, identifying CDC25C as the hub gene. Finally, molecular docking analysis confirmed strong binding affinity between 7 APs and 7 APIGPS genes, which further supported the hypothesis that AP may drive the occurrence and development of HCC through molecular interactions with APIGPS.

To the best of our knowledge, this study is the first to identify APIGs highly correlated with HCC and constructed a clinically applicable APIGPS. HCC patients were classified into three subtypes based on the 19 APIGs, with C1 subtype exhibiting tumor-promoting immune phenotype, particularly elevated Mφ abundance, and the poorest prognosis. These findings suggest that AP exposure may induce distinct HCC subtypes by modulating the immune microenvironment, thus complicating personalized treatment strategies. APIGPS demonstrated good predictive performance; the HRG had poorer prognosis and overlapped mostly with C1 subtype, suggesting that APIGPS may reflect subtype-specific molecular and immunological features. Multivariate COX analysis and ROC curves validated the independence and robustness of APIGPS in both the training and external validation set. To aid clinical application, we constructed a nomogram for better individualized prognosis prediction. Furthermore, CDC25C showed good diagnostic value and may be involved in the progression of multiple cancers, making it an important pan-cancer target for APs.

CDC25C is a cell cycle–regulating phosphatase that primarily promotes G2/M transition by activating the CDK1/cyclin B complex (10). Although not traditionally considered an immune regulator per se, CDC25C is involved in immune modulation by inducing genomic instability. It is highly expressed in multiple cancers and correlates with TMB, microsatellite instability, immune infiltration, and low survival in at least 10 cancers. Overexpression of CDC25C can counteract the suppression by the ATM/ATR-CHK1/2 pathway, leading to sustained CDK1 activation. This enables DNA-damaged cells to bypass the checkpoints and enter M phase, promoting genomic instability and mutations in tumor suppressor genes; it also disrupts the DNA mismatch repair system, ultimately accelerating tumor cell proliferation (10, 11). Additionally, such instability produces cytosolic DNA fragments, which are recognized by cGAS and catalyze the synthesis of the second messenger cGAMP. cGAMP binds to and activates STING protein on the endoplasmic reticulum membrane, inducing its conformational change and recruiting TBK1 kinase. Activated TBK1 phosphorylates IRF3, promoting its nuclear translocation and initiating IFN-α/β transcription, which rapidly recruits CD8+ T cells and NK cells to elicit anti-tumor effects (12, 13). However, chronic overexpression of CDC25C upregulates PD-L1 in tumor cells through the STAT3/IRF1 pathway, and promotes the recruitment of Tregs, MDSCs and M2-like macrophages (M2) via chemokines such as CXCL10. These immunosuppressive cells directly inhibit T cell activity by secreting IL - 10, Arg1, and TGF-β, and enhance angiogenesis, tumor invasion, and immune escape by secreting factors like VEGF and MMP9 (1417). Furthermore, persistent activation of the ATM/ATR signaling pathway caused by DNA damage repair defects can also upregulate the expression of immune checkpoints such as PD-L1 and VISTA via the NF-κB or HIF - 1α pathways (17). CDC25C overexpression has been associated with shorter progression-free survival in LUAD patients treated with nivolumab (18). Targeted inhibition of CDC25C induces immunogenic cell death, leading to the release of damage-associated molecular patterns (19, 20). Extracellular ATP activates the P2X7 receptor on Mφ surface, inducing NLRP3 inflammasome-dependent IL - 1β release and upregulating the expression of MHC-II molecules and costimulatory molecules CD80/CD86, thereby promoting the cross-presentation of tumor antigens to CD4+ T cells (21, 22). HMGB1, on the other hand, activates the NF-κB pathway in Mφ via TLR4/RAGE signaling, promoting the secretion of chemokines such as CXCL9/CXCL10 and recruitment of effector T cells into the tumor microenvironment, while inhibiting Mφ polarization towards the M2 phenotype (20, 23).

The other 6 APIGs in the APIGPS also act as poor prognostic factors. ATG4B, a member of C54 peptidases, primarily cleaves the C-terminus of microtubule-associated protein 1 light chain 3 to promote the extension and closure of autophagosome membrane (24). Targeted inhibition of ATG4B can block ATG4B-mediated autophagic degradation of TBK1 (a crucial kinase for antiviral immunity), enhancing antiviral immune responses and CD8+ T cell infiltration, thereby delaying HCC progression (25, 26). Additionally, ATG4B closely interacts with SLC2A1, promoting the Warburg effect in tumor and increasing L-lactate production and glucose uptake (27). SLC2A1, in turn, facilitates glucose transport to maintain the Warburg effect, and promotes Mφ polarization toward the M2 by enhancing efferocytosis and inflammatory factors secretion such as CD206 and IL - 10 (28). In liver metastatic lesions, SLC2A1 fosters an immunosuppressive microenvironment by increasing the proportion of Mφ and their inhibitory interactions with T cells (29). Furthermore, SLC2A1 interacts with GLS, mediating the transport of glutamine into the cell. GLS catalyzes the hydrolysis of glutamine to glutamate, the rate-limiting step in glutamine metabolism, to generate α-ketoglutarate for the TCA cycle (30). GLS-driven glutaminolysis increases production of α-ketoglutarate and reactive oxygen species, which activate the Wnt/β-catenin signaling pathway and maintain the stemness and survival of cancer stem-like cells (31). This process contributes to immune suppression by reducing T cell infiltration and promoting the expression of immune checkpoint molecules (32). Moreover, excessive glutamine metabolism mediated by GLS1 drives the polarization of Mφ toward the M2 immunosuppressive phenotype and inhibits Th1 and CD8+ T cell differentiation (33). APEX1 is a DNA repair enzyme with apurinic/apyrimidinic activity. Its redox domain regulates the activity of several transcription factors such as NF-κB, AP - 1, and STAT3, enhancing cytokine and chemokine secretion such as TNFα, IL - 6, and IL - 8, ultimately creating a pro-inflammatory and immunosuppressive microenvironment (34, 35). Targeted inhibition of APEX1 or its redox function can enhance the IFNγ-producing Th1 response and suppress HCC cell migration and proliferation (36, 37). MELK, a serine/threonine protein kinase, is a cell-cycle modulator essential for mitotic progression (38). MELK promotes HCC cell migration by upregulating MMP7 expression and regulates G2/M phase progression via PLK1-CDC25-CDK signaling, thereby inhibiting apoptosis (38). This process may further suppress CD8+ T cell infiltration by activating downstream cascades such as CDK4/6 activation, facilitating tumor immune escape (39). Knockdown of MELK inhibits cell viability by inducing apoptosis and mitosis in HCC cells, promotes M1polarization, hinders M2 polarization, induces CD8+ T cell recruitment, and improves sensitivity to radiotherapy (40). CDC25B and CDC25C belong to the CDC25 phosphatase family. Unlike CDC25C, CDC25B promotes G2/M transition by dephosphorylating and activating CDK1 during the G2 phase (10). CDC25B participates in immune pathways similar to CDC25C. CDC25B expression correlates with infiltration of B cells, CD8+ T cells, CD4+ T cells, Mφ, neutrophils, and dendritic cells in HCC (41).

Immunological features determine the prognosis and therapeutic response in HCC. We found that APIGPS was predominantly expressed in Mφ, with high Mφ abundance in the C1 subtype and HRG. High expression of Mφ aws significantly associated with poor prognosis, and this immune microenvironment remodeling may contribute to immunotherapy resistance, which may also explain the lower abundance of resting memory CD4+ T cells in the C1 subtype. Mφ can be categorized into tissue-resident Mφ and monocyte-derived Mφ. During early HCC, tissue-resident Mφ accumulate close to tumor cells, promoting epithelial-mesenchymal transition and tumor invasiveness, while inducing myeloid-derived suppressor cells and Tregs response to form an immunosuppressive microenvironment (42). During tumor progression, tissue-resident Mφ were redistributed at the periphery of the tumor microenvironment, whereas monocyte-derived Mφ dominate the TME. The latter highly express PD-L1, PD-L2, CD80, and CD86, resulting in resistance to radiotherapy, chemotherapy, and immune checkpoint blockade (ICB) therapy (42, 43). Research indicates that Mφ suppress endogenous STAT3 in T cells by releasing CSF - 1, chemokines or exosomes, thereby regulating Treg/Th17 cell balance and continuously recruiting monocyte-derived Mφ into the TME to form a drug-resistant positive feedback loop (44, 45). Additionally, Mφ can capture PD - 1 antibodies through the Fc-FcγR pathway, leading to resistance to ICB, which aligns with our GSVA results that the FcγR pathway was highly expressed in HRG (46). In vivo studies showed that Mφ and Tregs are colocalized in tumor tissues. Depletion of Mφ can reverse the immunosuppression in the TME, restore ICB efficacy, and promot CD8+ T cell infiltration into HCC cells (47, 48). In addition, Our study revealed that Mφ communicated with endothelial cells via the PPIA/BSG axis. This process activates NF-κB through the IL - 6/STAT3 pathway, upregulates PD-L1 expression, and suppresses T cell activity (49). Targeting the PPIA-CD147 axis can block this cascade, inhibit angiogenesis, and reverse immune escape in HCC. However, it is noteworthy that there are alternative pathways to PPIA/BSG-mediated Mφ-endothelial cell communication, as evidenced by persistent PPIA-induced IL - 8 expression even after BSG knockdown (50). TP53 was the most frequently mutated gene in both HRG and LRG. TP53 mutation frequency was associated with abundance of NK cells, Mφ, and follicular helper T cells in HCC (51). TP53 gain-of-function mutation promote Mφ recruitment via BRD4-dependent CSF - 1 expression (52). Loss of TP53 increases PD-L1 expression and reduces CD8+ T cell infiltration in HCC samples and mouse models (53).

GO and KEGG analyses showed that the main differences in Mφ-related pathways between HRG and LRG were leukocyte mediated immunity and chemokine signaling pathway, which suggests Mφ polarization and reprograming of the immune microenvironment. Mφ belong to myeloid-derived leukocytes and exhibit high plasticity: classically activated M1have pro-inflammatory and anti-tumor effects, while alternatively activated M2exert anti-inflammatory and pro-tumor effects. These polarized states can interconvert under specific conditions. Cytokines like IL12, TNF, and IFNG, MAMPs such as LPS, or other Toll-like receptor agonists can induce polarization toward M1 state. Conversely, IL4, IL5, IL10, IL13, CSF1, TFGB1, and PGE2 all promote Mφ polarization toward M2 state. It is worth noting that although targeting M1 polarization is beneficial, the role of M1 polarization appears to be double-edged. M1have been shown to promote HCC cell motility by secreting IL - 1β, and induce PD-L1 expression via IRF1 and p65, thereby contributing to adaptive resistance to immunotherapy in HCC (54). However, this manipulation of Mφ phenotypic plasticity raises several concerns. For instance, can M1acquire M2-associated properties? Will the converted cells exhibit stronger tumor-promoting effects? Moreover, is the simplistic M1/M2 binary classification sufficient to describe macrophage states in the TME? These questions remain to be elucidated through further investigation.

Our study first identified APIG and their potential pathways and mechanisms affecting HCC, constructed a prognostic signature based on them, and further determined the hub gene. However, this study also has several critical limitations that must be acknowledged. The most significant limitation is the lack of comprehensive experimental validation. Although we performed qRT-PCR to confirm the expression levels of target genes, this alone is insufficient to establish causal relationships or mechanistic insights. In vitro functional experiments, such as siRNA-mediated gene knockdown, CCK - 8 assays to evaluate cell proliferation, flow cytometry or immunofluorescence to assess Mφ polarization, and Transwell migration assays are essential to verify the functional roles of the identified APIGs. Moreover, in vivo studies would provide important mechanistic and translational validation. The absence of these validations may currently weaken the biological conclusions of the study. Second, the specificity of the effect of APs on HCC remains largely unclear. Clinically, multiple APs often coexist and may interact synergistically (e.g., ozone, benzene/toluene), making it challenging to determine the individual contribution of each pollutant. Further studies are needed to clarify the pollutant-specific mechanisms driving HCC. Third, the affinity between APs and Mφ and the polarization direction (M1/M2) remain unclear. More importantly, it is unknown whether AP exposure can actively trigger Mφ repolarization, and how the temporal dynamics of such polarization evolve. Single-cell and transcriptomics analyses are recommended to dissect spatiotemporal changes in the immune microenvironment. Furthermore, although molecular docking analyses suggested potential binding between APs and target proteins, these predictions do not confirm causal regulatory relationships. Future studies should incorporate in vitro or in vivo AP exposure experiments to examine the dynamic regulation of APIG expression, and perform transcriptomic profiling under controlled exposure conditions to elucidate regulatory mechanisms. In addition, while the APIGPS demonstrates robust performance in the current datasets, there remains a risk of overfitting due to reliance on retrospective public cohorts. The generalizability of the model to broader clinical populations or real-world settings is yet to be established. Prospective validation in large, multi-center cohorts is necessary to assess its clinical applicability. Finally, while this study suggests APs may impact immunotherapy efficacy in HCC, and green environment or pollution levels correlate with patient prognosis, the mechanistic links between APs and immunotherapy warrants further exploration.

Conclusions

Our study identified 19 APIGs from 7 APs, and selected 7 of them to construct an APIGPS with good predictive performance. Among them, CDC25C was the hub gene with diagnostic and prognostic value, and was associated with the survival outcomes in 10 cancers. Future studies should include experimental validation using AP extracts, employing techniques such as siRNA, CCK - 8, flow cytometry, immunofluorescence, Western blotting, and Transwell assays to further elucidate the molecular mechanisms and immune regulatory effects of APIGs. Overall, this study underscores the importance of integrating environmental exposure factors into cancer research, particularly in understanding HCC progression and improving individualized prognosis and treatment strategies.

Data availability statement

The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding author.

Author contributions

LP: Visualization, Writing – original draft, Software, Formal Analysis, Methodology, Conceptualization, Writing – review & editing. XZ: Writing – review & editing, Software, Writing – original draft, Data curation, Visualization. CP: Visualization, Writing – original draft, Writing – review & editing. PS: Project administration, Formal Analysis, Supervision, Writing – original draft, Writing – review & editing, Methodology.

Funding

The author(s) declare financial support was received for the research and/or publication of this article. This work was supported by grants from the National Natural Science Foundation of China (32171130 to PS).

Acknowledgments

We sincerely thank TCGA, ICGC, PubChem, ADMETLAB, SuperPred, GEO, Swiss Target Prediction, STITCH, SEA and GeneCards for providing transcriptomic and clinicopathological data and the researchers who uploaded the data.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declare that no Generative AI was used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fimmu.2025.1638445/full#supplementary-material

References

1. Wu S, Powers S, Zhu W, and Hannun YA. Substantial contribution of extrinsic risk factors to cancer development. Nature. (2016) 529:43–7. doi: 10.1038/nature16166

PubMed Abstract | Crossref Full Text | Google Scholar

2. Ren F and Liu G. Global, regional, and national burden and trends of air pollution-related neoplasms from 1990 to 2019: an observational trend study from the global burden of disease study 2019. Ecotoxicol Environ Saf. (2024) 285:117068. doi: 10.1016/j.ecoenv.2024.117068

PubMed Abstract | Crossref Full Text | Google Scholar

3. Xue Y, Wang L, Zhang Y, Zhao Y, and Liu Y. Air pollution: a culprit of lung cancer. J Hazard Mater. (2022) 434:128937. doi: 10.1016/j.jhazmat.2022.128937

PubMed Abstract | Crossref Full Text | Google Scholar

4. Turner MC, Krewski D, Diver WR, Pope CR, Burnett RT, Jerrett M, et al. Ambient air pollution and cancer mortality in the cancer prevention study ii. Environ Health Perspect. (2017) 125:87013. doi: 10.1289/EHP1249

PubMed Abstract | Crossref Full Text | Google Scholar

5. Chin WS, Pan SC, Huang CC, Chen PJ, and Guo YL. Exposure to air pollution and survival in follow-up after hepatocellular carcinoma. Liver Cancer. (2022) 11:474–82. doi: 10.1159/000525346

PubMed Abstract | Crossref Full Text | Google Scholar

6. Siegel RL, Miller KD, Wagle NS, and Jemal A. Cancer statistics, 2023. Ca Cancer J Clin. (2023) 73:17–48. doi: 10.3322/caac.21763

PubMed Abstract | Crossref Full Text | Google Scholar

7. Jang TY, Zeng YT, Liang PC, Wu CD, Wei YJ, Tsai PC, et al. Role of air pollution in development of hepatocellular carcinoma among chronic hepatitis b patients treated with nucleotide/nucleoside analogues. Liver Int. (2025) 45:e16149. doi: 10.1111/liv.16149

PubMed Abstract | Crossref Full Text | Google Scholar

8. Deng H, Eckel SP, Liu L, Lurmann FW, Cockburn MG, and Gilliland FD. Particulate matter air pollution and liver cancer survival. Int J Cancer. (2017) 141:744–9. doi: 10.1002/ijc.30779

PubMed Abstract | Crossref Full Text | Google Scholar

9. Wu X, Zhang X, Yu X, Liang H, Tang S, and Wang Y. Exploring the association between air pollution and the incidence of liver cancers. Ecotoxicol Environ Saf. (2025) 290:117437. doi: 10.1016/j.ecoenv.2024.117437

PubMed Abstract | Crossref Full Text | Google Scholar

10. Boutros R, Lobjois V, and Ducommun B. Cdc25 phosphatases in cancer cells: key players? Good targets? Nat Rev Cancer. (2007) 7:495–507. doi: 10.1038/nrc2169

PubMed Abstract | Crossref Full Text | Google Scholar

11. Thanasoula M, Escandell JM, Suwaki N, and Tarsounas M. Atm/atr checkpoint activation downregulates cdc25c to prevent mitotic entry with uncapped telomeres. Embo J. (2012) 31:3398–410. doi: 10.1038/emboj.2012.191

PubMed Abstract | Crossref Full Text | Google Scholar

12. Deng L, Liang H, Xu M, Yang X, Burnette B, Arina A, et al. Sting-dependent cytosolic dna sensing promotes radiation-induced type i interferon-dependent antitumor immunity in immunogenic tumors. Immunity. (2014) 41:843–52. doi: 10.1016/j.immuni.2014.10.019

PubMed Abstract | Crossref Full Text | Google Scholar

13. Mackenzie KJ, Carroll P, Martin CA, Murina O, Fluteau A, Simpson DJ, et al. Cgas surveillance of micronuclei links genome instability to innate immunity. Nature. (2017) 548:461–5. doi: 10.1038/nature23449

PubMed Abstract | Crossref Full Text | Google Scholar

14. Garcia-Diaz A, Shin DS, Moreno BH, Saco J, Escuin-Ordinas H, Rodriguez GA, et al. Interferon receptor signaling pathways regulating pd-l1 and pd-l2 expression. Cell Rep. (2017) 19:1189–201. doi: 10.1016/j.celrep.2017.04.031

PubMed Abstract | Crossref Full Text | Google Scholar

15. Curiel TJ, Coukos G, Zou L, Alvarez X, Cheng P, Mottram P, et al. Specific recruitment of regulatory t cells in ovarian carcinoma fosters immune privilege and predicts reduced survival. Nat Med. (2004) 10:942–9. doi: 10.1038/nm1093

PubMed Abstract | Crossref Full Text | Google Scholar

16. Jahangiri A, Dadmanesh M, and Ghorban K. Stat3 inhibition reduced pd-l1 expression and enhanced antitumor immune responses. J Cell Physiol. (2020) 235:9457–63. doi: 10.1002/jcp.29750

PubMed Abstract | Crossref Full Text | Google Scholar

17. Karlsson-Rosenthal C and Millar JB. Cdc25: mechanisms of checkpoint inhibition and recovery. Trends Cell Biol. (2006) 16:285–92. doi: 10.1016/j.tcb.2006.04.002

PubMed Abstract | Crossref Full Text | Google Scholar

18. Zhang W, Shang X, Yang F, Han W, Xia H, Liu N, et al. Cdc25c as a predictive biomarker for immune checkpoint inhibitors in patients with lung adenocarcinoma. Front Oncol. (2022) 12:867788. doi: 10.3389/fonc.2022.867788

PubMed Abstract | Crossref Full Text | Google Scholar

19. Zhang H, Hu K, Lu Y, Xu Z, Chen G, Yu D, et al. A novel pterostilbene compound dcz0825 induces macrophage m1 differentiation and th1 polarization to exert anti-myeloma and immunomodulatory. Int Immunopharmacol. (2024) 127:111446. doi: 10.1016/j.intimp.2023.111446

PubMed Abstract | Crossref Full Text | Google Scholar

20. Ahmed A and Tait S. Targeting immunogenic cell death in cancer. Mol Oncol. (2020) 14:2994–3006. doi: 10.1002/1878-0261.12851

PubMed Abstract | Crossref Full Text | Google Scholar

21. Mantovani A, Sozzani S, Locati M, Allavena P, and Sica A. Macrophage polarization: tumor-associated macrophages as a paradigm for polarized m2 mononuclear phagocytes. Trends Immunol. (2002) 23:549–55. doi: 10.1016/S1471-4906(02)02302-5

PubMed Abstract | Crossref Full Text | Google Scholar

22. Denardo DG and Ruffell B. Macrophages as regulators of tumour immunity and immunotherapy. Nat Rev Immunol. (2019) 19:369–82. doi: 10.1038/s41577-019-0127-6

PubMed Abstract | Crossref Full Text | Google Scholar

23. Meier P, Legrand AJ, Adam D, and Silke J. Immunogenic cell death in cancer: targeting necroptosis to induce antitumour immunity. Nat Rev Cancer. (2024) 24:299–315. doi: 10.1038/s41568-024-00674-x

PubMed Abstract | Crossref Full Text | Google Scholar

24. Sun C, Chen Y, Gu Q, Fu Y, Wang Y, Liu C, et al. Ube3c tunes autophagy via atg4b ubiquitination. Autophagy. (2024) 20:645–58. doi: 10.1080/15548627.2023.2299514

PubMed Abstract | Crossref Full Text | Google Scholar

25. Xie W, Zhang C, Wang Z, Chen H, Gu T, Zhou T, et al. Atg4b antagonizes antiviral immunity by gabarap-directed autophagic degradation of tbk1. Autophagy. (2023) 19:2853–68. doi: 10.1080/15548627.2023.2233846

PubMed Abstract | Crossref Full Text | Google Scholar

26. Jiang Y, Chen S, Li Q, Liang J, Lin W, Li J, et al. Tank-binding kinase 1 (tbk1) serves as a potential target for hepatocellular carcinoma by enhancing tumor immune infiltration. Front Immunol. (2021) 12:612139. doi: 10.3389/fimmu.2021.612139

PubMed Abstract | Crossref Full Text | Google Scholar

27. Ni Z, He J, Wu Y, Hu C, Dai X, Yan X, et al. Akt-mediated phosphorylation of atg4b impairs mitochondrial activity and enhances the warburg effect in hepatocellular carcinoma cells. Autophagy. (2018) 14:685–701. doi: 10.1080/15548627.2017.1407887

PubMed Abstract | Crossref Full Text | Google Scholar

28. Morioka S, Perry J, Raymond MH, Medina CB, Zhu Y, Zhao L, et al. Efferocytosis induces a novel slc program to promote glucose uptake and lactate release. Nature. (2018) 563:714–8. doi: 10.1038/s41586-018-0735-5

PubMed Abstract | Crossref Full Text | Google Scholar

29. Yang S, Qian L, Li Z, Li Y, Bai J, Zheng B, et al. Integrated multi-omics landscape of liver metastases. Gastroenterology. (2023) 164:407–23. doi: 10.1053/j.gastro.2022.11.029

PubMed Abstract | Crossref Full Text | Google Scholar

30. Song W, Li D, Tao L, Luo Q, and Chen L. Solute carrier transporters: the metabolic gatekeepers of immune cells. Acta Pharm Sin B. (2020) 10:61–78. doi: 10.1016/j.apsb.2019.12.006

PubMed Abstract | Crossref Full Text | Google Scholar

31. Grobben Y. Targeting amino acid-metabolizing enzymes for cancer immunotherapy. Front Immunol. (2024) 15:1440269. doi: 10.3389/fimmu.2024.1440269

PubMed Abstract | Crossref Full Text | Google Scholar

32. Yang WH, Qiu Y, Stamatatos O, Janowitz T, and Lukey MJ. Enhancing the efficacy of glutamine metabolism inhibitors in cancer therapy. Trends Cancer. (2021) 7:790–804. doi: 10.1016/j.trecan.2021.04.003

PubMed Abstract | Crossref Full Text | Google Scholar

33. Johnson MO, Wolf MM, Madden MZ, Andrejeva G, Sugiura A, Contreras DC, et al. Distinct regulation of th17 and th1 cell differentiation by glutaminase-dependent metabolism. Cell. (2018) 175:1780–95. doi: 10.1016/j.cell.2018.10.001

PubMed Abstract | Crossref Full Text | Google Scholar

34. Oliveira TT, Coutinho LG, de Oliveira L, Timoteo A, Farias GC, and Agnez-Lima LF. Ape1/ref-1 role in inflammation and immune response. Front Immunol. (2022) 13:793096. doi: 10.3389/fimmu.2022.793096

PubMed Abstract | Crossref Full Text | Google Scholar

35. Cesaratto L, Codarin E, Vascotto C, Leonardi A, Kelley MR, Tiribelli C, et al. Specific inhibition of the redox activity of ape1/ref-1 by e3330 blocks tnf-alpha-induced activation of il-8 production in liver cancer cell lines. PloS One. (2013) 8:e70909. doi: 10.1371/journal.pone.0070909

PubMed Abstract | Crossref Full Text | Google Scholar

36. Akhter N, Takeda Y, Nara H, Araki A, Ishii N, Asao N, et al. Apurinic/apyrimidinic endonuclease 1/redox factor-1 (ape1/ref-1) modulates antigen presenting cell-mediated t helper cell type 1 responses. J Biol Chem. (2016) 291:23672–80. doi: 10.1074/jbc.M116.742353

PubMed Abstract | Crossref Full Text | Google Scholar

37. Lu X, Zhao H, Yuan H, Chu Y, and Zhu X. High nuclear expression of ape1 correlates with unfavorable prognosis and promotes tumor growth in hepatocellular carcinoma. J Mol Histol. (2021) 52:219–31. doi: 10.1007/s10735-020-09939-9

PubMed Abstract | Crossref Full Text | Google Scholar

38. Tang Q, Li W, Zheng X, Ren L, Liu J, Li S, et al. Melk is an oncogenic kinase essential for metastasis, mitotic progression, and programmed death in lung carcinoma, Signal Transduct. Targeting Ther. (2020) 5:279. doi: 10.1038/s41392-020-00288-3

PubMed Abstract | Crossref Full Text | Google Scholar

39. Laguette N, Bregnard C, Hue P, Basbous J, Yatim A, Larroque M, et al. Premature activation of the slx4 complex by vpr promotes g2/m arrest and escape from innate immune sensing. Cell. (2014) 156:134–45. doi: 10.1016/j.cell.2013.12.011

PubMed Abstract | Crossref Full Text | Google Scholar

40. Tang B, Zhu J, Shi Y, Wang Y, Zhang X, Chen B, et al. Tumor cell-intrinsic melk enhanced ccl2-dependent immunosuppression to exacerbate hepatocarcinogenesis and confer resistance of hcc to radiotherapy. Mol Cancer. (2024) 23:137. doi: 10.1186/s12943-024-02049-0

PubMed Abstract | Crossref Full Text | Google Scholar

41. Huang Z, Xu L, Wu Z, Xiong X, Luo L, and Wen Z. Cdc25b is a prognostic biomarker associated with immune infiltration and drug sensitivity in hepatocellular carcinoma. Int J Genomics. (2024) 2024:8922878. doi: 10.1155/2024/8922878

PubMed Abstract | Crossref Full Text | Google Scholar

42. Casanova-Acebes M, Dalla E, Leader AM, Leberichel J, Nikolic J, Morales BM, et al. Tissue-resident macrophages provide a pro-tumorigenic niche to early nsclc cells. Nature. (2021) 595:578–84. doi: 10.1038/s41586-021-03651-8

PubMed Abstract | Crossref Full Text | Google Scholar

43. Munn DH, Sharma MD, and Johnson TS. Treg destabilization and reprogramming: implications for cancer immunotherapy. Cancer Res. (2018) 78:5191–9. doi: 10.1158/0008-5472.CAN-18-1351

PubMed Abstract | Crossref Full Text | Google Scholar

44. Zhou J, Li X, Wu X, Zhang T, Zhu Q, Wang X, et al. Exosomes released from tumor-associated macrophages transfer mirnas that induce a treg/th17 cell imbalance in epithelial ovarian cancer. Cancer Immunol Res. (2018) 6:1578–92. doi: 10.1158/2326-6066.CIR-17-0479

PubMed Abstract | Crossref Full Text | Google Scholar

45. Sun W, Wei FQ, Li WJ, Wei JW, Zhong H, Wen YH, et al. A positive-feedback loop between tumour infiltrating activated treg cells and type 2-skewed macrophages is essential for progression of laryngeal squamous cell carcinoma. Br J Cancer. (2017) 117:1631–43. doi: 10.1038/bjc.2017.329

PubMed Abstract | Crossref Full Text | Google Scholar

46. Arlauckas SP, Garris CS, Kohler RH, Kitaoka M, Cuccarese MF, Yang KS, et al. In vivo imaging reveals a tumor-associated macrophage-mediated resistance pathway in anti-pd-1 therapy. Sci Transl Med. (2017) 9(389):eaal3604. doi: 10.1126/scitranslmed.aal3604

PubMed Abstract | Crossref Full Text | Google Scholar

47. Peranzoni E, Lemoine J, Vimeux L, Feuillet V, Barrin S, Kantari-Mimoun C, et al. Macrophages impede cd8 t cells from reaching tumor cells and limit the efficacy of anti-pd-1 treatment, Proc. Natl Acad Sci U S A. (2018) 115:E4041–50. doi: 10.1073/pnas.1720948115

PubMed Abstract | Crossref Full Text | Google Scholar

48. Li X, Yao W, Yuan Y, Chen P, Li B, Li J, et al. Targeting of tumour-infiltrating macrophages via ccl2/ccr2 signalling as a therapeutic strategy against hepatocellular carcinoma. Gut. (2017) 66:157–67. doi: 10.1136/gutjnl-2015-310514

PubMed Abstract | Crossref Full Text | Google Scholar

49. Han JM and Jung HJ. Cyclophilin a/cd147 interaction: a promising target for anticancer therapy. Int J Mol Sci. (2022) 23(16):9341. doi: 10.3390/ijms23169341

PubMed Abstract | Crossref Full Text | Google Scholar

50. Bahmed K, Henry C, Holliday M, Redzic J, Ciobanu M, Zhang F, et al. Extracellular cyclophilin-a stimulates erk1/2 phosphorylation in a cell-dependent manner but broadly stimulates nuclear factor kappa b. Cancer Cell Int. (2012) 12:19. doi: 10.1186/1475-2867-12-19

PubMed Abstract | Crossref Full Text | Google Scholar

51. Duan X, Cai Y, He T, Shi X, Zhao J, Zhang H, et al. The effect of the tp53 and rb1 mutations on the survival of hepatocellular carcinoma patients with different racial backgrounds. J Gastrointest Oncol. (2021) 12:1786–96. doi: 10.21037/jgo-21-312

PubMed Abstract | Crossref Full Text | Google Scholar

52. Efe G, Dunbar KJ, Sugiura K, Cunningham K, Carcamo S, Karaiskos S, et al. P53 gain-of-function mutation induces metastasis via brd4-dependent csf-1 expression. Cancer Discov. (2023) 13:2632–51. doi: 10.1158/2159-8290.CD-23-0601

PubMed Abstract | Crossref Full Text | Google Scholar

53. Yu J, Ling S, Hong J, Zhang L, Zhou W, Yin L, et al. Tp53/mtorc1-mediated bidirectional regulation of pd-l1 modulates immune evasion in hepatocellular carcinoma. J Immunother Cancer. (2023) 11:e007479. doi: 10.1136/jitc-2023-007479

PubMed Abstract | Crossref Full Text | Google Scholar

54. Zong Z, Zou J, Mao R, Ma C, Li N, Wang J, et al. M1 macrophages induce pd-l1 expression in hepatocellular carcinoma cells through il-1beta signaling. Front Immunol. (2019) 10:1643. doi: 10.3389/fimmu.2019.01643

PubMed Abstract | Crossref Full Text | Google Scholar

Keywords: air pollutants, hepatocellular carcinoma, immune, network toxicology, machine learning, multi-omics analysis

Citation: Pu L, Zhang X, Pu C and Sun P (2025) Air pollution-related immune gene prognostic signature for hepatocellular carcinoma: network toxicology, machine learning and multi-omics analysis. Front. Immunol. 16:1638445. doi: 10.3389/fimmu.2025.1638445

Received: 30 May 2025; Accepted: 01 September 2025;
Published: 12 September 2025.

Edited by:

Petar Ozretić, Rudjer Boskovic Institute, Croatia

Reviewed by:

Yongqiang Zhang, Henan Academy of Innovations in Medical Science, China
Hanqi Li, The First Affiliated Hospital of Xi’an JiaoTong University, China

Copyright © 2025 Pu, Zhang, Pu and Sun. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Peng Sun, cHN1bkB0eXh4LmVjbnUuZWR1LmNu

†These authors share first authorship

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.