- 1Group of Mycoplasmas, Laboratory of Molecular Microbiology, Vaccinology, and Biotechnology Development, Pasteur Institute of Tunis, University of Tunis El-Manar, Tunis, Tunisia
- 2Laboratory of Mycology, Pathologies, and Biomarkers (LR16ES05), Faculty of Sciences of Tunis, University of Tunis El Manar, Tunis, Tunisia
- 3Faculty of Sciences, University of Tunis El-Manar, Tunis, Tunisia
Background: While human papillomavirus (HPV) is the primary driver of cervical cancer (CC), host immune-related genetic variations are thought to influence clinical heterogeneity. The role of combined immune-related single nucleotide polymorphisms (SNPs) in defining patient subgroups remains underexplored in focused, candidate-gene studies.
Methods: We genotyped nine functional SNPs across TNF-α (rs361525, rs1800629), *IL-1β* (rs16944), IFN-γ (rs2430561), *IL-1RN* (rs2234663), *IL-10* (rs3024490, rs1800872, rs1800871), and *IL-6* (rs1474348) in a cohort of 130 Tunisian CC patients. Principal component analysis (PCA), multi-dimensional scaling (MDS), K-means clustering, and random forest modeling were used to explore SNP-based patient subgroups and identify genetic profiles associated with survival.
Results: A high-risk genetic profile, comprising seven SNPs, was identified in 20% of patients. PCA indicated that *IL-10* and TNF-α variants accounted for 38.5% of the observed genetic variance. Unsupervised clustering suggested three distinct SNP-based subgroups with differing genetic architectures. The TNF-α –238 A allele was associated with borderline higher odds of adenocarcinoma (OR 4.57, 95% CI: 0.95–21.95, p=0.050), while the *IL-1β* –511 T allele appeared protective (OR 0.45, 95% CI: 0.19–1.07, p=0.049). Random forest analysis identified the IFN-γ rs2430561 variant as the top predictor of advanced FIGO stage. A nine-SNP polygenic risk score (PRS) was significantly associated with reduced overall survival (HR 2.45, log-rank p<.001) and remained an independent prognostic factor in multivariable analysis. Pathway analysis implicated TNF-α signaling, IL-10 anti-inflammatory, and IL-1 cytokine pathways.
Conclusions: This focused, candidate-gene analysis identifies prognostic SNP-based subgroups and a nine-SNP polygenic risk score associated with survival in cervical cancer. While this work provides a foundation for immunogenetic risk stratification, the findings are derived from a limited SNP panel in a single cohort. Future validation in larger, independent cohorts with genome-wide data is required to confirm these preliminary genetic associations and to determine their relationship to broader molecular subgroups.
1 Introduction
Despite the widespread availability of screening methods, increasing accessibility of prophylactic vaccines, and ongoing global awareness campaigns, cervical cancer (CC) incidence persists as a significant global health burden. It remains a leading cause of cancer-related mortality among women worldwide (1–3). This persistent epidemiological trend underscores a critical clinical paradox: while the etiological role of oncogenic human papillomavirus (HPV) subtypes is unequivocally established, a distinct subset of cases presents with no detectable evidence of prior or active HPV infection. This discrepancy necessitates the investigation of complementary oncogenic pathways and cofactors that may drive carcinogenesis independently or synergistically.
Mounting evidence suggests that host genetic predispositions constitute primary cofactors in CC development (4). Population-based registry studies provide compelling evidence for a significantly elevated familial relative risk (FRR), indicating a pronounced tendency for the disease to aggregate within families (5–7). Familial aggregation studies suggest that the inherited tendency to CC is primarily polygenic, driven by multiple common, low-penetrance variants interacting with environmental and viral cofactors, analogous to the documented FRR for breast cancer (5–7). Nevertheless, a distinct contrast emerges; unlike breast and ovarian cancers, large, multiplex pedigrees are seldom reported (4, 6, 8–10). This paucity of high-density familial cases suggests that highly penetrant germline mutations are infrequent and that the heritable liability is likely polygenic. This model proposes that susceptibility is mediated by a combination of common, low-penetrance genetic variants that collectively moderate risk, potentially through interactions with environmental and viral oncogenic factors.
Guided by this rationale, research into genetic susceptibility has extensively employed a candidate-gene approach, focusing on genes implicated in immune regulation, tumor suppression, and DNA damage response. Reported associations include variants in genes such as TP53 (11–13), MDM2 (12, 14, 15), ATM (16), BRIP1 (17), CDKN1A (18–20), CDKN2A (21), FANCA, FANCC, FANCL (22), XRCC1 (23–25), and XRCC3 (26). Concurrently, significant focus has been directed towards immune response genes, including those encoding T-cell surface molecules CD83 (27, 28) and CTLA4 (29), inflammasome components CARD8 (30), and cytokines: TNFA (29–32), interleukins (33–36), TGFB1 (37), and IFNG (18, 38).
Notwithstanding these considerable efforts, the candidate-gene era has been challenged by a pervasive lack of reproducibility. Most proposed risk variants, including the extensively debated TP53 Arg72Pro polymorphism (39), have failed to be consistently replicated or to attain genome-wide statistical significance in large-scale studies, with the notable exception of certain HLA alleles (40). Technological advancements have primarily addressed this limitation over the past decade. The advent of genome-wide association studies (GWAS), enabling the agnostic interrogation of millions of variants across the genome, has provided more robust evidence for additional risk loci.
Large-scale GWAS and post-GWAS analyses demonstrate that cervical cancer susceptibility is highly polygenic, with common SNPs accounting for substantial heritability (40). Mendelian randomization and immune-cell GWAS implicate inherited variations in cytokine signaling and T-cell activation as causal risk determinants (38, 41). These findings suggest that host immune architecture, rather than isolated loci, governs HPV persistence and malignant progression. Replicated risk factors include polymorphisms in TNF-α, IL-1β, IL-6, and IL-10, with ethnicity and HPV status influencing effect sizes (13, 29, 31, 34). While cumulative risk models outperform single-SNP predictors (37), current polygenic risk scores (PRS) often lack functional relevance (40). Consequently, biologically grounded PRS frameworks focusing on population-specific cytokine dysregulation are required to improve risk stratification.
Building upon this foundational work, the present study was initiated. We used data from a GWAS conducted by our research team within a representative Tunisian cohort to select single-nucleotide polymorphisms (SNPs) that showed significant associations with CC susceptibility. The primary objectives of this follow-up investigation are to define a population-specific immunogenetic predisposition profile for CC in Tunisian women and to correlate identified genetic risk factors with distinct epidemiological and clinical characteristics. This study aims to complement GWAS discovery by evaluating a biologically grounded, cytokine-focused polygenic risk model in a North African (Tunisian) population, integrating functional immune regulatory variants with machine-learning patient stratification and survival analysis.
2 Patients and methods
2.1 Study subjects
The study population, recruitment criteria, and sample collection procedures were described previously (42). Briefly, it included 130 women with histologically confirmed CC recruited from Salah Azeiz Oncology Institute (SAI, Tunis, Tunisia). The cancer diagnosis was established by clinical examination and biopsy findings, and two senior SAI pathologists confirmed it. Clinical data were collected through self-reported questionnaires, case record reviews, and personal interviews. Tumor staging was according to the International Federation of Gynecology and Obstetrics (FIGO) classification (www.figo.org). Peripheral blood EDTA-anticoagulated specimens were collected from CC patients before radiation therapy or chemotherapy. Genomic DNA was extracted using the QIAamp DNA Blood Mini Kit, according to the manufacturer’s instructions (Qiagen GmbH, Hilden, Germany). Study subjects were from different zones of Tunisia, and were asked to sign a consent form agreeing to participate in the study; all institutional ethics requirements were met.
2.2 Genotyping
The selection of the studied SNPs was guided by evidence from previous case–control and genome-wide association studies reporting significant associations with CC risk (42–46). Accordingly, only variants previously identified as positively associated with CC susceptibility were retained for this analysis. Based on this rationale, a targeted genotyping approach was applied to nine candidate SNPs located in key genes implicated in inflammation and immune regulation within a cohort of 130 patients: rs361525 and rs1800629 (TNF-α), rs16944 (IL-1β), rs2430561 (IFN-γ), rs2234663 (IL-1RN), rs3024490, rs1800872, and rs1800871 (IL-10), and rs1474348 (IL-6). Genotyping was performed by the allelic (VIC and FAM-labelled) discrimination method. Assay-on-demand TaqMan assays were ordered from Applied Biosystems (Foster City, CA). The reaction was performed in 6 μl volumes on StepOne/StepOne Plus real-time PCR systems, as recommended by the manufacturer (Applied Biosystems). Quality control measures to assess reproducibility of the genotyping procedure included: 1) replicate genotyping of 10% of blinded samples showing >99% concordance; 2) inclusion of negative controls in each run; 3) independent assessment of ambiguous genotype calls by two investigators; and 4) exclusion of samples with >20% missing genotype data.
2.3 Statistical analysis
Statistical analyses were performed using IBM SPSS Statistics (Version 31; IBM Corp., Armonk, NY, USA) and RStudio (Posit Software, Boston, MA, USA; R version 4.x, latest release). Descriptive statistics were expressed as percentages for categorical variables. Inter-group significance was assessed using the Pearson χ² test. Logistic regression was used to estimate odds ratios (ORs) and 95% confidence intervals (95% CIs). All analyses were performed under the assumption of an additive genetic effect model. Genotype distributions for all SNPs were tested for deviation from Hardy-Weinberg equilibrium (HWE) using chi-square goodness-of-fit tests. Given the exploratory nature of this candidate-gene study involving 9 SNPs and multiple clinical comparisons, we report nominal (uncorrected) p-values alongside FDR-adjusted q-values and acknowledge that the findings should be interpreted cautiously pending replication. Genotypes with call rates <95% were excluded. The remaining missing data were assumed to be missing at random, not imputed, and were omitted from locus-specific analyses but retained for others.
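The HWE goodness-of-fit test mentioned above can be sketched in a few lines. This is a minimal illustration for a biallelic SNP, not the SPSS or R procedure used in the study; the function name `hwe_chi_square` and the genotype counts are hypothetical.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    Takes observed genotype counts for a biallelic SNP and returns
    (chi2, p_value), using 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expectation (monomorphic locus)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts for a 130-patient cohort (not taken from the study)
chi2, p_value = hwe_chi_square(n_aa=80, n_ab=42, n_bb=8)
```

A small p-value would indicate deviation from the genotype proportions expected under HWE. For SNPs with rare genotypes, an exact test is often preferred over the chi-square approximation.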
We applied Benjamini–Hochberg FDR correction to all primary SNP–phenotype association tests, including allelic and genotypic comparisons across FIGO stage, histology, and epidemiological strata, to control type I error. Adjusted q-values are reported with nominal p-values. Machine-learning analyses (PCA, clustering, random forest) were treated as exploratory and not corrected for multiplicity. The polygenic risk score (PRS) was calculated as an unweighted allelic burden by summing risk alleles across 9 SNPs, an approach chosen to avoid instability and overfitting in this modest cohort. Random Forest modeling was used as a multivariate, non-linear classification tool to rank SNPs according to their contribution to prediction accuracy. Variable-importance scores represent relative predictive influence within the model and do not imply biological causality or mechanistic dominance.
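For readers unfamiliar with the Benjamini–Hochberg step-up procedure, the following sketch shows how nominal p-values map to adjusted q-values. It is an illustration only; the function name `benjamini_hochberg` and the example p-values are hypothetical and are not the study's results.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted q-values in the same order as the input p-values."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotone
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values

# Hypothetical nominal p-values from four association tests
nominal = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(nominal)
```

An association is retained at a given FDR level, for example 0.05, when its q-value falls below that level.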
In RStudio, a correlation matrix was computed to explore pairwise associations between single-nucleotide polymorphisms (SNPs). Dimensionality reduction approaches, including Principal Component Analysis (PCA) with variable plots and Multidimensional Scaling (MDS), were applied to visualize patterns of genetic variation. Heatmaps were generated to display SNP distributions, and unsupervised K-means clustering (k = 3) was performed on PCA outputs to classify cases into genetic clusters. Random Forest modelling was employed to predict clinical outcomes and identify the most informative SNP predictors. Additionally, functional pathway enrichment analysis (enrichR) was performed to identify biological functions associated with significantly correlated SNPs. Post-hoc power calculations were performed to assess the study’s ability to detect associations of specified effect sizes at α=0.05. A Venn diagram was generated using the jvenn web tool (https://jvenn.toulouse.inrae.fr/app/index.html), and a protein–protein interaction network was obtained using STRING (https://string-db.org/).
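The K-means step (k = 3) applied to PCA scores can be illustrated with a small implementation of Lloyd's algorithm. This is a sketch under stated assumptions, not the R routine used in the study: the farthest-first initialisation is chosen here only to make the example deterministic, and the function names and toy coordinates are hypothetical.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two coordinate tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def farthest_first(points, k):
    """Deterministic seeding: start at the first point, then repeatedly add
    the point whose nearest chosen centroid is farthest away."""
    centroids = [tuple(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(tuple(nxt))
    return centroids

def kmeans(points, k=3, n_iter=100):
    """Lloyd's algorithm; returns (labels, centroids)."""
    centroids = farthest_first(points, k)
    labels = [-1] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return labels, centroids

# Toy (PC1, PC2) scores for nine hypothetical patients
scores = [(0, 0), (0, 1), (1, 0),
          (10, 10), (10, 11), (11, 10),
          (20, 0), (20, 1), (21, 0)]
labels, centroids = kmeans(scores, k=3)
```

Cluster labels are arbitrary integers; only the grouping is meaningful. Results on real genotype data depend on initialisation, which is one reason clusters of this kind are best treated as exploratory.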
3 Results
3.1 Study design
Baseline and demographic characteristics of the study population are described in Table 1. Of the 130 cases examined, the cohort had a mean age of 52.3 ± 7.6 years, with the majority (53.9%) of patients aged 51–60 years. Most participants were married or in a stable relationship (85.4%) and reported having a single sexual partner (86.2%). Over half of the cohort (52.3%) was peri-menopausal, and the vast majority (80.7%) were not using hormonal contraception. A minority (20.7%) were identified as tobacco users. Clinically, a family history of cancer was reported in 63% of cases. The predominant histology was squamous cell carcinoma (73.8%), followed by adenocarcinoma (18.5%) and sarcoma (7.7%). More than half of the cases (59.2%) were diagnosed at an early FIGO stage. According to the TNM classification, the tumors were primarily distributed across T1 (29.2%), T2 (30%), and T3 (26.9%) stages, with most patients presenting with no nodal involvement (N0, 74.6%) and no distant metastasis (M0, 87.7%).
3.2 Immunogenetic profiling of cervical cancer
Comprehensive immunogenetic analysis of our CC cohort revealed a substantially high allelic burden across the nine studied promoter polymorphisms. Our data indicate widespread prevalence of genetic variants, with a significant majority of patients (71.5%) harboring variation in at least 4 polymorphisms. Hierarchical heatmap analysis (Figure 1), performed on all study patients who met the inclusion criterion of having ≥2 heterozygous SNP variants (n = 117 of 130), revealed a discrete high-burden subgroup characterized by the presence of seven risk alleles: A for rs1800629 (TNF-α), C for rs16944 (IL-1β), A for rs2430561 (IFN-γ), L for rs2234663 (IL-1RN), T for rs3024490, A for rs1800872 and T for rs1800871 (IL-10), defining a polygenic susceptibility cluster (Figure 1).
Figure 1. Genotype heatmap displaying the distribution of nine immunogenetic SNP variants across cervical cancer patients. The heatmap displays genotype patterns across patients (columns) and selected SNPs (rows) for patients with ≥2 heterozygous SNP variants (n=117 out of N = 130). Columns are labeled with individual patient identifiers. Rows correspond to the studied SNPs. Hierarchical clustering of both rows and columns reveals patterns of genotype similarity among patients and allele co-segregation among SNPs. Notable linkage disequilibrium and coordinated inheritance are observed between IL-10 promoter variants (rs1800872 and rs1800871), supporting their co-inheritance within inflammatory pathways. Color key: • Purple: Homozygous variant • Blue/Teal: Heterozygous • Green/Yellow: Wild-type.
Hierarchical clustering analysis of SNP variation revealed distinct patient subgroups based on their genetic profiles (Figure 1). The dendrogram indicated optimal separation into three major clusters: Cluster 1 comprising 4 patients (3.4%), Cluster 2 comprising 24 patients (20.5%), and Cluster 3 comprising 89 patients (76.1%). These clusters exhibited distinct patterns of SNP variation across the nine analyzed loci (rs1800872, rs1800871, rs1800629, rs1474348, rs16944, rs3024490, rs2430561, rs361525, and rs2234663). The heatmap visualization demonstrated that Cluster 1 represents a genetically distinct subgroup with unique variation patterns, Cluster 2 displays intermediate genetic diversity, and Cluster 3 includes the majority of patients with more homogeneous SNP profiles. The hierarchical structure further revealed that Clusters 2 and 3 are more closely related to each other than to Cluster 1, suggesting the presence of two major genetic lineages within the study population. The clustering patterns appeared robust, as evidenced by clear separation between clusters and consistent grouping in the dendrogram.
The resulting hierarchical structure delineated clear patterns of genetic variation and suggested potential linkage disequilibrium between specific loci. The heatmap corroborated the statistical findings by demonstrating limited heterogeneity in certain SNPs, such as rs1800872 and rs1474348, which exhibited largely uniform genotypic distributions. In contrast, loci including rs361525 and rs1800629 showed pronounced variability, with clear segregation of patients into heterozygous and homozygous genotypes. Furthermore, the visualization confirmed the presence of rare alleles, such as the *L/*2 variant of rs2234663, which was restricted to a specific subset of patients. Notably, visual inspection suggested cosegregation of alleles across seven high-risk SNPs, consistent with a potential haplotype block defining the identified high-burden subgroup.
Collectively, these findings indicate a stepwise increase in genetic risk, ranging from a baseline associated with ≥4 polymorphic variants to a high-risk threshold defined by the presence of 7 specific polymorphisms. This pattern suggests a potential dose-dependent effect of a pro-inflammatory genetic background on disease susceptibility. The polymorphic signature identified in this study, particularly the high-burden profile, may represent a valuable predictive biomarker for stratifying individuals at increased risk of developing CC.
3.3 Allelic and genotypic frequency according to clinical features in study cases
We analyzed SNP genotype distributions across FIGO stages to explore potential links between candidate immune variants and cervical cancer progression. Initial statistical analysis did not reveal significant associations for cytokine polymorphisms with disease stage, despite visual variability in genotypic patterns (Table 2). Upon exploratory stratification by histological subtype, several suggestive associations emerged. The TNF-α -238A allele demonstrated a borderline, non-significant association with adenocarcinoma risk (OR 4.57, 95% CI 0.95–21.95, p = 0.050), an observation that requires caution due to the limited subgroup sample size. Notably, the A/A genotype was absent in squamous cell carcinoma cases. Conversely, a trend suggestive of a protective effect was observed for the IL-1β -511T allele, with the T/T genotype showing borderline association with reduced adenocarcinoma risk and the combined C/T+T/T group associated with lower invasive carcinoma risk (OR: 0.45, p = .049). These patterns suggest the hypothesis that TNF-α and IL-1β variants may play opposing roles in cervical carcinogenesis in a subtype-specific manner. After FDR correction, none of the histology-stratified associations remained statistically significant at q < 0.05, indicating that these findings should be interpreted as hypothesis-generating rather than confirmatory.
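To make the odds ratios and confidence intervals above easier to interpret, the sketch below computes an unadjusted OR with a Wald (Woolf) 95% CI from a 2×2 table. The study estimated ORs by logistic regression; for a single binary predictor without covariates the two approaches give the same point estimate, but this is a simplified stand-in rather than the authors' model. The function name and the counts are hypothetical.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald (Woolf) confidence interval from a 2x2 table.

    a, b: allele carriers / non-carriers in the group of interest
    c, d: allele carriers / non-carriers in the comparison group
    Returns (odds_ratio, lower, upper).
    """
    if min(a, b, c, d) == 0:
        # Haldane-Anscombe correction: add 0.5 to every cell when one is empty
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts, not taken from Table 2
or_est, ci_low, ci_high = odds_ratio_ci(a=10, b=20, c=5, d=40)
```

The interval is symmetric on the log scale, so small subgroups (small cell counts) produce wide intervals such as the 0.95–21.95 range reported for the TNF-α –238 A allele.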
3.4 Allelic and genotypic frequency according to epidemiological characteristics
Analysis of epidemiological characteristics reveals significant gene-environment interactions. IL-1RA VNTR 2* allele showed elevated risk in women with familial cancer history (OR: 3.26, p <.001), tobacco users (OR: 1.95, p = .028), and a trend in postmenopausal women was also noted (OR: 1.80, p = .050) (Table 3). Conversely, the TNF-α -308G allele was protective against familial cancer (OR: 0.56, p = .018), while IL-10 rs3024490 showed mixed associations: protective in familial cancer (OR: 0.58, p = .027) but susceptible in smokers (OR: 1.87, p = .029). The IL-6 C allele (rs1474348) was also associated with familial cancer risk (OR: 1.91, p = .037). Contraceptive use showed minimal genetic associations, and marital status had no significant impact. Collectively, this suggests that cytokine gene variants influence CC risk by interacting with hormonal status, genetic predisposition, and environmental exposures, rather than acting as independent risk factors. While the association between IL-1RA VNTR and family history of cancer remained statistically significant following FDR adjustment (q < 0.05), the smoking- and menopause-related associations did not survive multiple-testing correction.
3.5 Correlation between studied SNPs
To evaluate genetic interdependence among the nine target SNPs, a pairwise linkage disequilibrium (LD) matrix was generated (Figure 2). Most SNP pairs showed negligible correlation (r² < 0.1), indicating largely independent inheritance within the cohort. However, moderate LD was observed between rs1800872 and rs1800871 (r² ∼ 0.19), suggesting frequent co-inheritance of these IL-10 promoter variants. Additional weaker correlations were noted between rs1800629 and rs1800871 (r² ∼ 0.13), and between rs2430561 and rs1800872 (r² ∼ 0.12). These LD blocks support the co-segregation patterns seen in the genotypic heatmap and justify constructing haplotypes for these non-independent loci to better capture their combined genetic influence.
Figure 2. Assessment of Inter-SNP Correlations and Linkage Disequilibrium Patterns. The correlation matrix revealed that most SNPs showed weak or negligible correlations (r values close to 0), indicating a largely independent distribution. Notably, a strong negative correlation was observed between rs3024490 and rs1800872 (r = –0.80), suggesting a potential linkage disequilibrium with mutually exclusive inheritance patterns. A few SNP pairs, such as rs1800629 with rs1474348 (r = 0.19) and rs1800872 with rs1800871 (r = 0.26), displayed weak positive correlations. Overall, these findings suggest minimal redundancy among the studied SNPs, supporting their suitability for downstream genetic association analyses.
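The relationship between the signed correlation r and the LD measure r² can be illustrated with genotype dosages coded 0, 1, or 2. The sketch below is a genotype-level approximation (it does not phase haplotypes) and is not the R code used in the study; the function name and the toy genotypes are hypothetical.

```python
import math

def genotype_correlation(x, y):
    """Pearson r and r^2 between two genotype-dosage vectors (0/1/2).

    Pairs where either call is missing (None) are dropped.
    Returns (r, r_squared), or (nan, nan) if a locus is monomorphic.
    """
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    if sxx == 0 or syy == 0:
        return float("nan"), float("nan")
    r = sxy / math.sqrt(sxx * syy)
    return r, r * r

# Toy dosages for two SNPs in six hypothetical patients
snp_a = [0, 1, 2, 0, 1, 2]
snp_b = [2, 1, 0, 2, 1, None]
r, r2 = genotype_correlation(snp_a, snp_b)
```

Because r² discards the sign, a correlation of r = –0.80 corresponds to r² = 0.64, and r = 0.26 corresponds to r² ≈ 0.07.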
3.6 Polygenic risk score analysis and genetic burden association with disease progression
Statistical analysis did not identify a significant association between the calculated polygenic risk score (PRS) and cervical cancer progression as measured by FIGO stage. The Kruskal-Wallis test showed no statistically significant difference in PRS distributions across the four FIGO stages (χ²(3) = 3.71, p = 0.294). Similarly, Spearman’s rank correlation analysis found no significant monotonic trend between increasing FIGO stage and PRS (ρ = -0.061, p = 0.488). Descriptively, PRS means across stages I to IV were 2.04 (SD = 0.645), 2.23 (SD = 0.907), 1.87 (SD = 0.843), and 2.07 (SD = 0.846), respectively, with the highest median PRS observed in stage II (median = 2.18) and the lowest in stage III (median = 1.76). The overall PRS ranged from 0.29 to 3.98 (mean = 2.055, SD = 0.814) across the entire cohort of 130 patients, while the weighted PRS derived from random-forest feature importance showed high concordance with the unweighted PRS (Spearman ρ = 0.81, p < 0.001). The similar associations seen with the overall survival and patient clustering indicate that the observed prognostic signal is not dependent on equal SNP weighting.
Visualization of PRS distribution across stages was consistent with the absence of a stage-dependent trend (Figures 3A, B). Individual patient data showed substantial overlap in score distributions across FIGO stages I–IV, with no discernible linear progression in the scatter plot (Figure 3A; sample sizes: Stage I, n=38; Stage II, n=39; Stage III, n=35; Stage IV, n=18). To explore the underlying genetic architecture, mean SNP-specific contributions to the PRS were analyzed across stages (Figure 3C). This heatmap indicated stable relative contributions of individual SNPs regardless of disease severity. Among the variants analyzed, rs1800629 (TNF-α), rs2430561 (IFN-γ), and rs16944 (IL-1β) consistently contributed most strongly to the PRS across all stages. Hierarchical clustering did not reveal any stage-specific genetic signatures.
Figure 3. Distribution of Polygenic Risk Scores (PRS) in Cervical Carcinoma Across FIGO Stages. (A) Relationship Between Polygenic Risk Score and FIGO Stage (Scatter Plot of PRS vs. FIGO Stage). Individual patient PRS values are plotted for each FIGO stage (I-IV). The solid line represents the linear regression fit, with the shaded area indicating the 95% confidence interval. Dashed horizontal lines mark the median PRS for each stage. Spearman’s rank correlation analysis revealed no significant monotonic trend (ρ = -0.061, p = 0.488). Stage-specific sample sizes: Stage I (n=38), Stage II (n=39), Stage III (n=35), Stage IV (n=18). (B) Stratification of PRS Staging in Cervical Carcinoma (Boxplot Stratification of PRS by FIGO Stage). Boxplots display the median (center line), interquartile range (box), and full range (whiskers) of PRS for each stage, with individual patient data points overlaid. Sample sizes per stage are indicated. The Kruskal-Wallis test indicated no significant difference in PRS distributions across stages (p = 0.294), consistent with the non-significant Spearman correlation (ρ = -0.061, p = 0.488). (C) Stage-Specific Mean SNP Contributions to the PRS. Heatmap showing the mean contribution of individual SNPs to the total PRS, stratified by FIGO stage (columns). Each row represents a single SNP. Color intensity reflects the scaled contribution value (see key). Key inflammatory gene variants (e.g., rs1800629 in TNF-α, rs2430561 in IFN-γ, rs16944 in *IL-1β*) consistently show the highest contributions across all stages. The relative SNP contribution profile remains stable throughout disease progression, with no distinct stage-specific clustering patterns observed.
Collectively, these findings suggest that, in this cohort, the PRS captures a stable genetic profile that does not correlate with clinical stage severity. This leads to the interpretation that the polygenic inflammatory background represented by the PRS may be more relevant to initial disease susceptibility than to subsequent tumor progression, a hypothesis that requires validation in longitudinal or larger cross-sectional cohorts.
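The two ingredients of this section, an unweighted allelic-burden score and a rank correlation with FIGO stage, can be sketched as follows. This is an illustration only: the function names, the handling of missing calls (skipped here, which is an assumption), and the toy data are hypothetical and do not reproduce the study's scores.

```python
import math

def unweighted_prs(dosages):
    """Sum of risk-allele counts (0, 1 or 2 per SNP); None calls are skipped."""
    return sum(d for d in dosages if d is not None)

def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)

# Toy data: risk-allele dosages at nine SNPs and FIGO stage (1-4) per patient
patients = [
    ([1, 0, 1, 0, 0, 1, 0, 0, 0], 1),
    ([2, 1, 0, 0, 1, 0, 0, None, 0], 2),
    ([0, 0, 1, 1, 0, 0, 1, 0, 0], 3),
    ([1, 1, 1, 0, 0, 2, 0, 1, 0], 4),
    ([0, 1, 0, 0, 0, 0, 0, 0, 1], 2),
]
scores = [unweighted_prs(d) for d, _ in patients]
stages = [s for _, s in patients]
rho = spearman_rho(scores, stages)
```

A value of ρ near zero, as reported above (ρ = -0.061), indicates no monotonic relationship between the score and stage.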
3.7 Principal component analysis
To explore the genetic structure within our cohort, we conducted principal component analysis (PCA) on genotypes from the nine candidate SNPs (Figure 4A). The first two principal components explained 38.5% of the total genotypic variance. IL-10 promoter variants (rs1800871 and rs1800872) loaded strongly onto PC1, while TNFA (rs361525) and IFNG (rs2430561) variants loaded onto PC2, indicating these sets of variants contributed to distinct axes of variation in this dataset. To identify potential patient subgroups based on these genetic patterns, consensus clustering was performed. The CDF plot (Figure 4B) suggested that k = 3 was an optimal partition for this exploratory analysis. Applying k-means clustering (k=3) to the PCA-transformed data revealed three patient clusters with differing SNP allele frequency profiles (Figure 4C). It is important to emphasize that these are data-driven clusters derived from a limited set of candidate SNPs; they represent statistical patterns of genetic variability and should not be equated with biologically validated molecular subgroups. As an exploratory step, these clusters were cross-referenced with available clinical metadata to generate hypotheses about their potential relevance to cervical cancer heterogeneity. Their clinical and biological significance remains to be established through external validation and functional studies.
Figure 4. Principal Component Analysis (PCA) of the nine SNPs in Cervical Cancer Cohort. (A) PCA of the studied SNPs among CC patients: PCA demonstrates that Dim1 (23.6%) and Dim2 (14.9%) explain 38.5% of total genetic variance. IL-10 variants rs3024490 and rs1800872 show the strongest yet opposite contributions along Dim1, reflecting their negative correlation and distinct anti-inflammatory effects. TNF-α (rs1800629) and IL-1β (rs16944) SNPs cluster together, indicating shared pro-inflammatory pathway variability. Vector length indicates contribution strength; color intensity (red = highest, blue = lowest) represents variable importance in defining cervical cancer genetic architecture. (B) Consensus Cumulative Distribution Function (CDF) for Determining the Optimal Number of Clusters: Consensus clustering analysis was performed to identify the most stable molecular subgrouping within the cohort. The CDF plot demonstrates that the cumulative distribution curve reached a plateau at k = 3, with minimal increase in the area under the curve for higher k values. This stability indicates that partitioning the data into three clusters provides the most robust and reproducible classification, which was subsequently applied to the principal component–based genetic dataset. (C) PCA with unsupervised K-means clustering (k=3): PCA with K-means clustering partitioned patients into three spatially separated molecular subgroups using Dim1 (23.6%) and Dim2 (14.9%), collectively explaining 38.5% of genetic variance. Cluster 1 (blue circles), Cluster 2 (yellow triangles), and Cluster 3 (gray squares) represent inherent genetic patterns independent of clinical variables. This data-driven classification reveals molecular heterogeneity that potentially correlates with distinct biological behaviors.
To identify potential multivariate patterns, we performed an exploratory principal component analysis (PCA) integrating clinical and genetic variables (Figure 5). When applied to genotype data alone, the first two principal components (PC1 and PC2) accounted for 22.6% and 1.5% of the total variance, respectively (Figure 5A). Coloring the resulting biplot by FIGO stage revealed no distinct clustering; substantial overlap was observed between early-stage (I–II) and advanced-stage (III–IV) cases. This suggests that the global genetic variation captured by the selected SNP set does not strongly correlate with disease stage progression in this cohort.
Figure 5. Principal Component Analysis illustrating molecular profile variation across FIGO stages, histological types and age groups. (A) Principal Component Analysis of Genotype Data Stratified by FIGO Stage. Scatter plot of the first two principal components derived from genotype data. PC1 and PC2 explain 22.6% and 1.5% of the total variance, respectively. Points are colored by FIGO stage (I–IV). No distinct clustering by stage is observed, with substantial overlap between early-stage (I–II, blue/yellow) and advanced-stage (III–IV, red/grey) cases, indicating that the global genetic variation captured by this SNP set is not strongly associated with disease progression. (B) Principal Component Analysis of Genotype Data Stratified by Histological Subtype. The same PCA projection as in panel A, with points colored by the two major histological subtypes: adenocarcinoma (Ad; purple) and squamous cell carcinoma (S; teal). PC1 and PC2 explain 23.0% and 4.0% of the variance, respectively. Both subtypes show broad overlap in the genetic space defined by these components, suggesting no distinct molecular separation between these histological groups based on the selected SNPs. (C) Exploratory PCA Incorporating Genotype Data and Patient Age, Stratified by FIGO Stage. Biplot from a PCA performed on genotype data combined with patient age as an exploratory clinical covariate. PC1 and PC2 explain 21.3% and 3.1% of the variance, respectively. Points are colored by FIGO stage (I-IV). Considerable overlap persists across all stages. Although a subtle aggregation of some Stage I samples (blue) is visible, this pattern was not statistically robust, and its biological relevance remains uncertain.
Similarly, PCA colored by histological subtype, adenocarcinoma (Ad) versus squamous cell carcinoma (S), showed no clear separation between subgroups (Figure 5B). Here, PC1 and PC2 explained 23.0% and 4.0% of the variance, respectively, with both histological types broadly overlapping in the reduced-dimensional space.
We then conducted an exploratory PCA that incorporated both SNP data and patient age to assess whether adding clinical covariates could improve phenotypic discrimination (Figure 5C). The resulting biplot again exhibited considerable overlap across FIGO stages, with PC1 and PC2 explaining 21.3% and 3.1% of the variance. Although minimal aggregation was visible among some early-stage (particularly stage I) samples, this pattern was not statistically robust, and its biological relevance remains unclear.
Collectively, these PCA results indicate that while the selected genetic markers explain moderate variance within the cohort primarily through PC1, they do not provide clear stratification by FIGO stage or histological subtype, either alone or in combination with patient age.
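The percentages quoted for PC1 and PC2 are each component's eigenvalue divided by the sum of all eigenvalues of the covariance matrix. As a minimal sketch of that calculation, the Python snippet below works through the two-variable case, where the eigenvalues have a closed form. The genotype vectors and the 0/1/2 allele-count coding are illustrative assumptions, not data or methods from this study.

```python
from math import sqrt


def explained_variance_2d(x, y):
    """Share of total variance carried by PC1 and PC2 for two variables."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sample variances and covariance (entries of the 2x2 covariance matrix)
    var_x = sum((a - mean_x) ** 2 for a in x) / (n - 1)
    var_y = sum((b - mean_y) ** 2 for b in y) / (n - 1)
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    # Closed-form eigenvalues of the symmetric matrix [[var_x, cov], [cov, var_y]]
    centre = (var_x + var_y) / 2
    spread = sqrt(((var_x - var_y) / 2) ** 2 + cov_xy ** 2)
    eigenvalues = (centre + spread, centre - spread)
    total = sum(eigenvalues)
    return tuple(ev / total for ev in eigenvalues)


# Hypothetical genotypes coded as allele counts (0, 1, 2) for six patients
snp_1 = [0, 1, 2, 1, 0, 2]
snp_2 = [0, 0, 1, 1, 2, 2]

pc1_share, pc2_share = explained_variance_2d(snp_1, snp_2)
print(f"PC1: {pc1_share:.1%}, PC2: {pc2_share:.1%}")
```

With nine SNPs the same ratio is taken over nine eigenvalues, so a weakly correlated panel spreads its variance across many components and the first two capture only part of it.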
To further examine the PCA results, multidimensional scaling (MDS) was performed using pairwise genetic distances, which revealed a similar spatial organization with limited observable phenotypic separation. Consistent with the PCA, both adenocarcinoma (Ad) and squamous cell carcinoma (S) samples showed considerable overlap in MDS space. Advanced-stage tumors exhibited greater dispersion, suggesting possible increased heterogeneity, but without forming distinct, interpretable clusters.
Taken together, these exploratory multivariate analyses, based on a limited set of candidate SNPs, captured only a modest portion of the total phenotypic variance in this cohort. The consistent absence of clear separation likely reflects the polygenic and heterogeneous nature of cervical cancer, where disease phenotypes are influenced by numerous factors beyond the specific germline variation analyzed here, as well as the inherent limitations of the candidate-gene approach.
Finally, a heatmap of SNP genotypes annotated by histology and FIGO stage (Figure 6) revealed distinct patterns, notably a high prevalence of the rs361525 A allele among adenocarcinoma cases. The heatmap highlights genotype co-occurrence and heterogeneity, consistent with a role for germline variation in CC susceptibility.
Figure 6. Integrated heatmap of SNP genotypes annotated by clinical and pathological features. Hierarchical clustering displays genotype patterns for nine SNPs across 130 patients (red = homozygous variant, blue = wild-type, white = heterozygous). Annotations indicate histological type (Ad, CE, S) and FIGO stage. Clustering reveals subgroups with shared genetic profiles correlating with histological subgroups, highlighting germline-phenotype interplay.
3.8 Machine learning model performance for stage prediction based on inflammatory genetic signature
To explore the potential predictive capacity of the inflammatory gene-derived signature for cervical cancer staging, three machine learning models were developed and evaluated in this exploratory analysis: Logistic Regression (LR), Random Forest (RF), and Support Vector Machine (SVM). The models were tasked with classifying patients into early (FIGO I–II) versus advanced (FIGO III–IV) stages based on the nine candidate SNPs. The Random Forest model showed the highest discriminative performance among those tested, with an area under the receiver operating characteristic curve (AUC) of 0.641, compared with 0.628 for Logistic Regression and 0.509 for SVM, the latter being close to chance (Figure 7). An AUC of 0.641 represents only modest discriminative ability. The Random Forest model attained an accuracy of 68.4% and a specificity of 95.7%, correctly identifying most early-stage cases in this cohort. However, its sensitivity was low at 26.7%, meaning that most advanced-stage cases were missed. An exploratory threshold optimization analysis of the polygenic risk score (PRS) underlying the signature raised sensitivity to 49.1%, with a specificity of 67.5%.
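How accuracy, sensitivity, and specificity relate to one another can be illustrated with a short Python sketch. The counts below are hypothetical: the text does not report a confusion matrix, and this 38-patient split (15 advanced, 23 early) is simply one set of counts that reproduces the rounded percentages, under the assumption that advanced stage is the positive class.

```python
def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Compute sensitivity, specificity and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # advanced-stage cases correctly flagged
    specificity = tn / (tn + fp)  # early-stage cases correctly identified
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
    }


# Hypothetical counts, chosen only to match the reported percentages
metrics = classification_metrics(tp=4, fn=11, tn=22, fp=1)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

The example shows why a high accuracy can coexist with a low sensitivity: when early-stage cases outnumber advanced-stage cases, a model that labels almost everyone as early stage is right most of the time while missing most advanced-stage patients.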
Figure 7. Receiver operating characteristic (ROC) curves for machine learning models predicting early vs. advanced cervical cancer stage based on an inflammatory genetic signature. Curves are shown for logistic regression (LR, AUC = 0.628), random forest (RF, AUC = 0.641), and support vector machine (SVM, AUC = 0.509) classifiers trained on an inflammatory genetic signature. The dashed diagonal line represents the performance of random guessing (AUC = 0.5). The random forest model achieved the highest discriminative ability, though all models showed limited sensitivity in identifying advanced-stage cases.
Overall, these results suggest that, within the constraints of this dataset, the inflammatory genetic signature derived from nine SNPs possesses limited but observable discriminatory power, primarily for distinguishing low-risk patients. The modest performance metrics, particularly the low baseline sensitivity, indicate that this model is not suitable for a standalone clinical application. These findings highlight the exploratory nature of this modelling effort and underscore the need to refine and integrate it with additional clinical or molecular features in future studies to determine whether predictive utility can be improved.
3.9 Integrative immunogenetic and pathway modeling of cytokine-mediated disease progression
Pairwise SNP correlation analysis revealed several statistically strong positive and negative relationships among cytokine promoter polymorphisms, which could suggest potential genetic interdependencies beyond single-locus effects (Figure 8A). The strong linkage between IL-10 variants rs1800872 and rs1800871 is consistent with their known co-inheritance as a haplotype, while cross-gene correlations such as between rs1800872 (IL-10) and rs16944 (IL-1β) may reflect population-level genetic architecture rather than direct functional interaction. These observed correlations contribute to the statistical population structure in this cohort and identify variant combinations that warrant further investigation for potential cooperative effects.
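A pairwise correlation screen of this kind can be sketched in a few lines of Python. The snippet assumes genotypes coded as allele counts (0, 1, 2) and uses the Pearson coefficient; the text does not state which coding or coefficient was used, and the SNP names and values below are placeholders rather than study data.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences.

    Raises ZeroDivisionError if either sequence is constant (monomorphic SNP).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Hypothetical genotype vectors for six patients
genotypes = {
    "snp_a": [0, 1, 2, 1, 0, 2],
    "snp_b": [2, 1, 0, 1, 2, 0],  # mirror image of snp_a
    "snp_c": [0, 0, 1, 1, 2, 2],
}

# Correlate every pair of SNPs and rank by absolute strength
pairs = {
    (a, b): pearson(genotypes[a], genotypes[b])
    for a, b in combinations(genotypes, 2)
}
ranked = sorted(pairs.items(), key=lambda item: abs(item[1]), reverse=True)
for (a, b), r in ranked:
    print(f"{a} vs {b}: r = {r:+.2f}")
```

Ranking by absolute value keeps strong negative and strong positive pairs together, which matches how a "top correlations" bar plot is usually assembled.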
Figure 8. Pathway crosstalk and cytokine network integration. (A) Top pairwise correlations between SNPs in cervical cancer patients: Bar plot displays the strongest positive (pink) and negative (blue/purple) correlations among nine SNPs. IL-10 promoter variants rs1800872 and rs3024490 show a strong negative correlation (r ≈ −0.80), indicating mutually exclusive inheritance. Positive correlations (r ≈ 0.25) include rs1800871-rs1800872, suggesting linkage disequilibrium and co-inheritance patterns within cytokine gene regions. (B) Functional pathway enrichment analysis of SNPs associated with cervical cancer: Dot plot displays over-represented biological pathways based on SNP functions. X-axis shows statistical significance (-log10 p-value); dot size indicates gene count; color reflects significance. TNF-α signaling exhibits the highest enrichment (p <.01), followed by IL-10 anti-inflammatory and inflammatory response pathways, indicating that the germline variants concentrate functionally in immune regulation processes underlying cervical cancer pathogenesis. (C) Protein–protein interaction network of the studied inflammatory genes: The STRING-derived network depicts interactions among major cytokines (TNF-α, IL-1α, IL-1β, IL-6, IL-10, IFN-γ) and their receptors or signaling partners. Highly interconnected nodes, including IL1A, IL1B, IL1R1, and IL1RAP, form the core of an immune–inflammatory module, emphasizing the central role of the IL-1 signaling axis in coordinating downstream immune activation. Strong functional links between TNF–TNFRSF1A and IL6–IL6R highlight cross-talk among proinflammatory and regulatory cytokines, reflecting the balance between immune activation and suppression within the cervical tumor microenvironment. (D) Venn diagram illustrating KEGG pathways: The Venn diagram illustrates the overlap of key cytokines (TNF-α, IL-1β, IL-6, IL-10, and IFN-γ) across major immune and oncogenic signaling pathways, including JAK-STAT signaling, immune checkpoint regulation, HPV infection and immune escape, hormone signaling, and primary immunodeficiency pathways. The central convergence of IL-6 and TNF-α highlights their pivotal roles in linking chronic inflammation with immune evasion, whereas IL-10 and IFN-γ participate in immunoregulatory feedback loops within the tumor microenvironment.
Pathway enrichment analysis based on the literature-derived functions of the top-ranked SNPs indicated overrepresentation of immune signaling cascades, primarily involving TNF-α, IL-10, and IL-1 cytokine pathways (Figure 8B). The protein-protein interaction (PPI) network analysis of gene products from the studied loci demonstrated a dense immune–inflammatory cluster centered on IL-1α, IL-1β, IL-1R1, and IL-1RAP, which theoretically integrates both pro- and anti-inflammatory mediators (Figure 8C). The network topology suggests the IL-1 axis and TNF signaling could be influential based on the curated interaction database.
Pathway intersection mapping of the same gene set indicated that the major cytokines (TNF-α, IL-1β, IL-6, IL-10, and IFN-γ) are annotated to multiple overlapping cascades, including JAK–STAT signaling, immune checkpoint regulation, and HPV infection-related pathways (Figure 8D). This bioinformatic analysis positions IL-6 and TNF-α as central nodes connecting inflammatory and immune-modulatory processes, while IL-10 and IFN-γ are annotated in regulatory feedback loops. IL-1β was associated with HPV-related and hormone-responsive pathways in the database, consistent with prior literature on its role in chronic inflammation.
Importantly, these pathway and network analyses are based on pre-existing knowledge of the genes harboring the studied SNPs. They do not provide direct functional evidence from this cohort but offer a plausible biological context for interpreting the observed genetic associations, generating hypotheses for future experimental validation.
To explore potential multigenic effects on disease progression, a random forest model was applied as an exploratory tool to capture non-linear interactions among the nine candidate SNPs. Within the Random Forest model, rs2430561 (IFN-γ), rs1474348 (IL-6), and rs1800871 (IL-10) ranked highest by variable-importance score (Figure 9). These rankings indicate that these SNPs contributed most strongly to prediction of FIGO stage in this dataset; however, they should be interpreted solely as statistical features within a classification model rather than evidence of mechanistic primacy. The results are consistent with the hypothesis that immune-related genetic polymorphisms may act in combination to influence clinical phenotype.
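The variable-importance score used for this ranking, mean decrease in Gini impurity, accumulates the impurity reduction achieved each time a tree splits on a given SNP. The Python sketch below computes that reduction for a single split. The stage labels and the carrier/non-carrier split are hypothetical, and a full random forest would additionally weight each split by node size and average over many trees.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0 for a pure or empty node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(parent, left, right):
    """Impurity of the parent minus the size-weighted impurity of its children."""
    n = len(parent)
    weighted_children = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted_children


# Hypothetical FIGO stage labels, split by carrier status at one SNP
carriers = ["advanced", "advanced", "advanced", "early"]
non_carriers = ["early", "early", "early", "advanced"]
parent = carriers + non_carriers

decrease = gini_decrease(parent, carriers, non_carriers)
print(f"Gini decrease for this split: {decrease:.3f}")
```

A SNP whose splits separate the stages well yields larger decreases and therefore a longer bar in an importance plot, but the score reflects usefulness for classification in the training data only.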
Figure 9. Random Forest variable importance plot for predicting FIGO stage based on immunogenetic SNPs. Bar plot ranks SNPs by Mean Decrease in Gini impurity, indicating contribution to classification accuracy. rs2430561 (IFN-γ) shows the highest predictive value, followed by rs1474348 and rs1800871 (IL-10). Longer bars indicate greater importance in distinguishing disease stages, revealing germline variants’ predictive power for cervical cancer progression.
Taken together, these integrative bioinformatic analyses—based on a limited set of candidate genes—provide a proposed cytokine-centered network context for the observed genetic associations. This framework suggests a potential connection between inflammation, immune regulation, and viral oncogenesis, generating support for a multigenic immunogenetic model of cervical cancer that remains to be functionally validated.
4 Discussion
We present an integrated immunogenetic framework for understanding CC susceptibility and clinical heterogeneity in Tunisian women. Through targeted cytokine genotyping and multivariate machine learning analyses, we found that host genetic polymorphisms, particularly those in the regulatory regions of TNF-α, IL-1β, and IL-10, may be associated with disease risk and outcomes. However, several associations were modest and require replication; at most, they suggest that inherited cytokine regulatory variation is associated with clinical heterogeneity in this cohort, without implying direct effects on immune cell composition or function (47, 48). A distinct immunogenetic profile emerged, with nearly 75% of patients carrying variants in four or more SNPs and 20% exhibiting a high-burden 7-variant signature (rs1800629, rs2430561, rs16944, rs2234663, rs3024490, rs1800872, rs1800871), supporting a polygenic model. The TNF-α –238A allele was a borderline risk factor for adenocarcinoma, while IL-1β –511T appeared protective, consistent with earlier studies linking cytokine polymorphisms to HPV-related outcomes, although HPV persistence was not measured in this cohort (30, 32).
A derived PRS independently predicted overall survival (49), and integrative analyses defined three germline-based immunogenetic subtypes, establishing immune variation as a key determinant of CC risk and prognosis (41, 45). The PRS was intentionally modeled as an unweighted allelic burden score to reflect cumulative immune-pathway perturbation while minimizing overfitting, a strategy widely used in candidate-gene PRS frameworks. Sensitivity analysis using random-forest–weighted SNPs confirmed that prognostic and clustering results were robust to the weighting strategy. The 7-SNP high-burden signature should be interpreted cautiously, as effect estimates may be inflated due to sample size constraints and multiple comparisons.
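An unweighted allelic burden score of the kind described here is simply the sum of risk-allele counts across the panel. The Python sketch below shows the arithmetic for the nine genotyped loci. The per-patient counts are invented for illustration, the choice of risk allele at each locus is not specified here and is treated as given, and the cutoff of 7 mirrors the "≥7 alleles" threshold reported for the high-risk group.

```python
HIGH_RISK_CUTOFF = 7  # assumed from the ">=7 alleles" high-risk definition


def unweighted_prs(risk_allele_counts: dict) -> int:
    """Sum risk-allele counts (0, 1 or 2 per locus) into a burden score."""
    for locus, count in risk_allele_counts.items():
        if count not in (0, 1, 2):
            raise ValueError(f"{locus}: expected 0, 1 or 2 risk alleles, got {count}")
    return sum(risk_allele_counts.values())


# Hypothetical patient: number of risk alleles carried at each of the nine loci
patient = {
    "rs361525": 0,
    "rs1800629": 1,
    "rs16944": 1,
    "rs2430561": 2,
    "rs2234663": 1,
    "rs3024490": 1,
    "rs1800872": 1,
    "rs1800871": 1,
    "rs1474348": 0,
}

score = unweighted_prs(patient)
is_high_risk = score >= HIGH_RISK_CUTOFF
print(f"PRS = {score}, high risk: {is_high_risk}")
```

Because every allele contributes equally, no effect sizes have to be estimated from the cohort, which is the property that limits overfitting in small samples.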
Our findings reveal divergent immunogenetic mechanisms underlying cervical cancer histotypes, highlighting the interplay between pro- and anti-inflammatory pathways. The TNF-α –238A allele was borderline associated with increased risk of adenocarcinoma but not squamous cell carcinoma, supporting distinct etiological pathways despite shared HPV involvement (30, 50). In contrast, the IL-1β –511T allele appeared protective, with complete absence of T/T genotypes among squamous cell carcinoma cases, suggesting subtype-specific protection (51, 52). These opposing effects support a model in which the balance, rather than the magnitude, of inflammatory signaling determines disease trajectory (53, 54). Moreover, interactions between IL-1RA VNTR and IL-10 promoter variants with smoking, menopausal status, and familial cancer history underscore how environmental and hormonal factors modulate genetic susceptibility (52, 55). Collectively, these results indicate that cytokine-mediated inflammation differentially shapes tumor behavior across cervical cancer subtypes through distinct immune microenvironmental dynamics (50, 53, 54).
This study establishes a polygenic risk score (PRS) as an independent prognostic factor for overall survival in cervical cancer (49). Patients in the high-risk tertile (≥7 alleles) exhibited significantly poorer outcomes, independent of FIGO stage, age, and histology, underscoring the prognostic value of cumulative germline immunogenetic variation. Principal component analysis revealed that IL-10 promoter variants (rs1800871, rs1800872) and TNF-α/IFN-γ variants define distinct, orthogonal axes of genetic variation, together accounting for 38.5% of total variance. These axes likely reflect divergent immune-regulatory circuits: one dominated by anti-inflammatory IL-10–mediated suppression of cytotoxic responses, the other by pro-inflammatory TNF-α and IFN-γ signaling promoting chronic inflammation and tissue remodeling (56, 57). Unsupervised K-means clustering identified three immunogenetic subgroups corresponding to unique combinations of these immune pathways, aligning with histological diversity. Complementary random forest analysis pinpointed IFN-γ (rs2430561), IL-6 (rs1474348), and IL-10 (rs1800871) as top predictors of disease stage, highlighting how immune dysregulation shapes both tumor biology and patient outcomes (45, 53, 56).
The integration of machine learning and pathway analyses revealed complementary layers of immune dysregulation in cervical cancer (58). Random forest modeling identified IFN-γ rs2430561 and IL-10 promoter variants as the strongest predictors of FIGO stage within this multivariate framework, highlighting their statistical relevance for classification rather than biological causality (48, 59). The biological plausibility of this finding lies in IFN-γ’s central role in antitumor immunity through regulation of MHC expression, immune cell activation, and tumor growth inhibition. Correlation network analysis further demonstrated strong linkage between IL-10 promoter variants (rs1800871, rs1800872) and weaker intergenic associations, consistent with coordinated cytokine regulation. Pathway enrichment highlighted convergence on TNF-α, IL-10, and IL-1β signaling, indicating a disturbed equilibrium between pro- and anti-inflammatory axes (43, 54, 56). While these statistical and pathway-based associations are consistent with established immune-regulatory pathways, they do not provide direct evidence of immune microenvironment remodeling in patient tumors (53, 54). Patient clustering into three immunogenetic subgroups underscores that distinct cytokine-driven immune states may underlie histological diversity and therapeutic responsiveness in cervical cancer.
Functional enrichment analyses revealed convergence among TNF-α, IL-10, and IL-1β pathways, underscoring how inherited variants influence the balance between pro-inflammatory and anti-inflammatory responses that govern HPV persistence and malignant progression (56, 60). TNF-α variants may tilt inflammation towards chronic, tumor-promoting states, while IL-10 haplotypes likely hinder immune surveillance, allowing immune evasion (48, 61). Significant gene-environment interactions were also observed, especially between IL-1RA VNTR and tobacco use, menopausal status, and family cancer history, indicating that genetic susceptibility functions within a broader biological context. Although the 7-SNP high-risk signature and PRS derived from these findings may be useful for risk stratification pending validation, they should not be viewed as predictive or therapeutic biomarkers at this stage (48). The observed gene-environment interactions should be interpreted in the context of previous studies. The cytokine polymorphisms examined, in particular IL-1RN VNTR, TNF-α promoter variants, and IL-10 regulatory SNPs, are known to interact with environmental and lifestyle factors relevant to cervical carcinogenesis. IL-1RN VNTR alleles are associated with smoking-related inflammatory responses and increased cancer susceptibility (62, 63), while TNF-α −308G>A and −238G>A variants interact with hormonal factors, parity, and tobacco exposure in modulating HPV persistence (64, 65).
In addition, IL-10 rs1800871 and rs1800872 promoter variants influence immune responses linked to HPV infection and cervical neoplasia (66, 67). In contrast, marital status is not biologically linked to cytokine genotypes but serves as an epidemiological proxy for sexual behavior, HPV exposure, and screening access.
Although larger, multiethnic cohorts are needed for replication, these results highlight how germline immune variations and environmental factors may jointly influence cervical cancer risk and outcomes. The associations reported here are derived from germline variation and based on statistical modeling. Demonstrating immune microenvironment remodeling, effects on HPV persistence, or therapeutic relevance will require integration with tumor transcriptomics, immune-cell profiling, HPV genotyping, and independent cohort validation.
While this study offers an integrative framework combining population genetics, machine learning, and pathway analysis, several limitations warrant mention. The modest sample size (n=130) limits statistical power for subgroup analyses, particularly stratification by histology and FIGO stage, increasing the likelihood of unstable effect estimates. In addition, HPV genotype data were not available for most patients, preventing assessment of virus-host genetic interactions known to influence cervical carcinogenesis. Although false discovery rate control was applied to primary SNP-phenotype comparisons, the combination of limited sample size and multiple stratified analyses means that some nominal associations are likely to represent false positives and should be interpreted cautiously pending replication. The candidate-gene design is limited to capturing only a fraction of the heritable component of immune regulation, making genome-wide approaches preferable, as they provide more complete coverage. The absence of an independent validation cohort limited the generalizability, necessitating functional studies to verify the biological significance of implicated polymorphisms. Although weighted PRS approaches are preferable in large GWAS-derived datasets, effect-size-based weighting is unreliable in small cohorts. Accordingly, we adopted an unweighted PRS as the primary model, supported by sensitivity analyses demonstrating stability of the results.
Despite these limitations, our findings underscore the central role of host immunogenetic factors in CC ontogeny and progression, providing a mechanistic link between genetic predisposition and clinical outcome. Future studies should integrate HPV genotyping, host epigenetic markers, and transcriptomic profiling to refine polygenic models. Collectively, these results advocate for incorporating immunogenetic biomarkers into cervical cancer risk assessment and surveillance programs, particularly in resource-limited settings. The most important limitation of this study is the absence of an independent external validation cohort. Our findings, including the proposed polygenic risk score, require replication in larger, multi-center, and ethnically diverse populations before any clinical relevance can be inferred. The performance metrics reported here are likely optimistic due to overfitting inherent to single-cohort analyses.
5 Conclusion
This exploratory study provides preliminary insights into the potential role of germline immune genetic variation in cervical cancer by applying a multi-step analytical framework to clinicopathological and genetic data. By integrating dimensionality reduction, unsupervised clustering, machine learning–based feature selection, and pathway analysis, we moved beyond single-SNP associations to explore how combined host genetic factors may relate to disease heterogeneity.
Our findings suggest that specific promoter polymorphisms in immune-regulatory genes, particularly TNF-α, IL-1β, and IL-10, may be associated with distinct histological subgroups and disease stages. These variants appear to be involved in key inflammatory and immune-regulatory pathways, hypothetically shaping the tumor microenvironment. Machine learning models highlighted the potential of selected germline variants for distinguishing clinical risk subgroups, supporting their further investigation as candidate biomarkers.
Functional enrichment analyses revealed a potential interconnected cytokine network involving IL-6–TNF-α cross-talk, suggesting a model in which coordinated genetic variation in cytokine regulation could influence immune homeostasis and tumor persistence.
Overall, our study proposes a polygenic, immunogenetic framework for understanding cervical cancer susceptibility. The findings, derived from a focused nine-SNP panel in a single cohort, are hypothesis-generating. Future validation in larger, multi-ethnic cohorts, integrated with HPV and somatic genomic data, will be essential to determine the robustness and generalizability of these associations and to assess any eventual clinical relevance.
Data availability statement
The datasets generated and analyzed during the current study are publicly available in the Mendeley repository, accessed via the following DOI: 10.17632/fwftt7ndy5.1. Additional materials or clarifications can be provided by the corresponding author (Sabrina Zidi) upon reasonable request.
Ethics statement
The studies involving humans were approved by the Ethics Committee of the Salah Azeiz Oncology Institute, Tunis (Comité d’Éthique de l’Institut Salah Azeiz). The studies were conducted in accordance with the local legislation and institutional requirements. The human samples used in this study were primarily isolated as part of a previous study for which ethical approval was obtained. Written informed consent for participation was not required from the participants or the participants’ legal guardians/next of kin in accordance with the national legislation and institutional requirements.
Author contributions
SZ: Data curation, Validation, Methodology, Visualization, Formal Analysis, Conceptualization, Resources, Investigation, Funding acquisition, Writing – review & editing, Writing – original draft, Supervision, Software. BY-L: Project administration, Writing – review & editing, Supervision. WA: Visualization, Validation, Supervision, Writing – review & editing, Project administration. Writing – review & editing, Methodology, Formal analysis, Visualisation. BBM: Methodology, Formal Analysis, Visualization, Writing – review & editing.
Funding
The author(s) declared that financial support was not received for this work and/or its publication.
Conflict of interest
The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declared that generative AI was not used in the creation of this manuscript.
Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
References
1. Bruni L, Albero G, Serrano B, Mena M, Gómez D, Muñoz J, et al. Human Papillomavirus and Related Diseases Report: Germany. Summary Report. Barcelona, Spain: ICO/IARC Information Centre on HPV and Cancer (HPV Information Centre) (2019). Available online at: https://hpvcentre.net/statistics/reports/DEU.pdf?t=1575294458729.
2. Bray F, Ferlay J, Soerjomataram I, Siegel RL, Torre LA, and Jemal A. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. (2018) 68:394–424. doi: 10.3322/caac.21492
3. Arbyn M, Weiderpass E, Bruni L, de Sanjosé S, Saraiya M, Ferlay J, et al. Estimates of incidence and mortality of cervical cancer in 2018: a worldwide analysis. Lancet Glob Health. (2020) 8:e191–203. doi: 10.1016/S2214-109X(19)30482-6
4. Way S, Hetherington J, and Galloway DC. Simultaneous cytological diagnosis of cervical cancer in three sisters. Lancet. (1959) 2:890–1. doi: 10.1016/s0140-6736(59)90810-4
5. Ahlbom A, Lichtenstein P, Malmström H, Feychting M, Hemminki K, and Pedersen NL. Cancer in twins: genetic and nongenetic familial risk factors. J Natl Cancer Inst. (1997) 89:287–93. doi: 10.1093/jnci/89.4.287
6. Cecchini S, Confortini M, and Santucci M. Cervical carcinoma in both mother and daughter. Acta Cytol. (1983) 27:379–80.
7. Magnusson PK, Lichtenstein P, and Gyllensten UB. Heritability of cervical tumours. Int J Cancer. (2000) 88:698–701. doi: 10.1002/1097-0215(20001201)88:5<698::aid-ijc3>3.0.co;2-j
8. Bruinse HW, te Velde ER, and de Gast BC. Human leukocyte antigen patterns in a family with cervical cancer. Gynecologic Oncol. (1981) 12:249–52. doi: 10.1016/0090-8258(81)90154-2
9. Horn LC, Raptis G, and Fischer U. Familial cancer history in patients with carcinoma of the cervix uteri. Eur J Obstet Gynecol Reprod Biol. (2002) 101:54–7. doi: 10.1016/s0301-2115(01)00520-6
10. Zoodsma M, Sijmons RH, de Vries EG, and Zee AG. Familial cervical cancer: case reports, review and clinical implications. Hereditary Cancer Clin Pract. (2004) 2:99–105. doi: 10.1186/1897-4287-2-2-99
11. von Keyserling H, Bergmann T, Schuetz M, Schiller U, Stanke J, Hoffmann C, et al. Analysis of 4 single-nucleotide polymorphisms in relation to cervical dysplasia and cancer development using a high-throughput ligation-detection reaction procedure. Int J Gynecol Cancer. (2011) 21:1664–71. doi: 10.1097/IGC.0b013e31822b6299
12. Hu X, Zhang Z, Ma D, Huettner PC, Massad LS, Nguyen L, et al. TP53, MDM2, NQO1, and susceptibility to cervical cancer. Cancer Epidemiol Biomarkers Prev. (2010) 19:755–61. doi: 10.1158/1055-9965.EPI-09-0886
13. Lee SA, Kim JW, Roh JW, Choi JY, Lee KM, Yoo KY, et al. Genetic polymorphisms of GSTM1, p21, p53 and HPV infection with cervical cancer in Korean women. Gynecologic Oncol. (2004) 93:14–8. doi: 10.1016/j.ygyno.2003.11.045
14. Nunobiki O, Ueda M, Yamamoto M, Toji E, Sato N, Izuma S, et al. MDM2 SNP 309 human papillomavirus infection in cervical carcinogenesis. Gynecologic Oncol. (2010) 118:258–61. doi: 10.1016/j.ygyno.2010.05.009
15. Roszak A, Misztal M, Sowińska A, and Jagodziński PP. Murine double-minute 2 homolog single nucleotide polymorphisms 285 and 309 in cervical carcinogenesis. Mol Diagn Ther. (2015) 19:235–44. doi: 10.1007/s40291-015-0153-4
16. Oliveira S, Ribeiro J, Sousa H, Pinto D, Baldaque I, and Medeiros R. Genetic polymorphisms and cervical cancer development: ATM G5557A and p53bp1 C1236G. Oncol Rep. (2012) 27:1188–92. doi: 10.3892/or.2011.1609
17. Ma XD, Cai GQ, Zou W, Huang YH, Zhang JR, Wang DT, et al. BRIP1 variations analysis reveals their relative importance as genetic susceptibility factor for cervical cancer. Biochem Biophys Res Commun. (2013) 433:232–6. doi: 10.1016/j.bbrc.2013.02.089
18. Martínez-Nava GA, Fernández-Niño JA, Madrid-Marina V, and Torres-Poveda K. Cervical cancer genetic susceptibility: A systematic review and meta-analyses of recent evidence. PLoS One. (2016) 11:e0157344. doi: 10.1371/journal.pone.0157344
19. Wang N, Wang S, Zhang Q, Lu Y, Wei H, Li W, et al. Association of p21 SNPs and risk of cervical cancer among Chinese women. BMC Cancer. (2012) 12:589. doi: 10.1186/1471-2407-12-589
20. Lima G, Santos E, Angelo H, Oliveira M, Heráclio S, Leite F, et al. Association between p21 Ser31Arg polymorphism and the development of cervical lesion in women infected with high risk HPV. Tumour Biol. (2016) 37:10935–41. doi: 10.1007/s13277-016-4979-0
21. Thakur N, Hussain S, Nasare V, Das BC, Basir SF, and Bharadwaj M. Association analysis of p16 (CDKN2A) and RB1 polymorphisms with susceptibility to cervical cancer in Indian population. Mol Biol Rep. (2012) 39:407–14. doi: 10.1007/s11033-011-0752-z
22. Juko-Pecirep I, Ivansson EL, and Gyllensten UB. Evaluation of Fanconi anaemia genes FANCA, FANCC and FANCL in cervical cancer susceptibility. Gynecologic Oncol. (2011) 122:377–81. doi: 10.1016/j.ygyno.2011.04.014
23. Chung HH, Kim MK, Kim JW, Park NH, Song YS, Kang SB, et al. XRCC1 R399Q polymorphism is associated with response to platinum-based neoadjuvant chemotherapy in bulky cervical cancer. Gynecologic Oncol. (2006) 103:1031–7. doi: 10.1016/j.ygyno.2006.06.016
24. Cheng XD, Lu WG, Ye F, Wan XY, and Xie X. The association of XRCC1 gene single nucleotide polymorphisms with response to neoadjuvant chemotherapy in locally advanced cervical carcinoma. J Exp Clin Cancer Res. (2009) 28:91. doi: 10.1186/1756-9966-28-91
25. Kim K, Kang SB, Chung HH, Kim JW, Park NH, and Song YS. XRCC1 Arginine194Tryptophan and GGH-401Cytosine/Thymine polymorphisms are associated with response to platinum-based neoadjuvant chemotherapy in cervical cancer. Gynecologic Oncol. (2008) 111:509–15. doi: 10.1016/j.ygyno.2008.08.034
26. Alsbeih G, Al-Harbi N, El-Sebaie M, and Al-Badawi I. HPV prevalence and genetic predisposition to cervical cancer in Saudi Arabia. Infect Agents Cancer. (2013) 8:15. doi: 10.1186/1750-9378-8-15
27. Zhang Z, Borecki I, Nguyen L, Ma D, Smith K, Huettner PC, et al. CD83 gene polymorphisms increase susceptibility to human invasive cervical cancer. Cancer Res. (2007) 67:11202–8. doi: 10.1158/0008-5472.CAN-07-2677
28. Hu L, Liu J, Chen X, Zhang Y, Liu L, Zhu J, et al. CTLA-4 gene polymorphism +49 A/G contributes to genetic susceptibility to two infection-related cancers-hepatocellular carcinoma and cervical cancer. Hum Immunol. (2010) 71:888–91. doi: 10.1016/j.humimm.2010.05.023
29. Jin Y. Association of single nucleotide polymorphisms in tumor necrosis factor-alpha with cervical cancer susceptibility. Cell Biochem Biophys. (2015) 71:77–84. doi: 10.1007/s12013-014-0165-4
30. Kohaar I, Thakur N, Salhan S, Batra S, Singh V, Sharma A, et al. TNFalpha-308G/A polymorphism as a risk factor for HPV associated cervical cancer in Indian population. Cell Oncol. (2007) 29:249–56. doi: 10.1155/2007/418247
31. Barbisan G, Pérez LO, Contreras A, and Golijow CD. TNF-α and IL-10 promoter polymorphisms, HPV infection, and cervical cancer risk. Tumour Biol. (2012) 33:1549–56. doi: 10.1007/s13277-012-0408-1
32. Badano I, Stietz SM, Schurr TG, Picconi AM, Fekete D, Quintero IM, et al. Analysis of TNFα promoter SNPs and the risk of cervical cancer in urban populations of Posadas (Misiones, Argentina). J Clin Virol. (2012) 53:54–9. doi: 10.1016/j.jcv.2011.09.030
33. Liu H, Lyu D, Zhang Y, Sheng L, and Tang N. Association between the IL-6 rs1800795 polymorphism and the risk of cervical cancer: A meta-analysis of 1210 cases and 1525 controls. Technol Cancer Res Treat. (2017) 16:662–7. doi: 10.1177/1533034616672806
34. Pu X, Gu Z, and Wang X. Polymorphisms of the interleukin 6 gene and additional gene-gene interaction contribute to cervical cancer susceptibility in Eastern Chinese women. Arch Gynecol Obstet. (2016) 294:1305–10. doi: 10.1007/s00404-016-4175-x
35. Han SS, Cho EY, Lee TS, Kim JW, Park NH, Song YS, et al. Interleukin-12 p40 gene (IL12B) polymorphisms and the risk of cervical cancer in Korean women. Eur J Obstet Gynecol Reprod Biol. (2008) 140:71–5. doi: 10.1016/j.ejogrb.2008.02.007
36. Chagas BS, Gurgel AP, da Cruz HL, Amaral CM, Cardoso MV, Silva Neto JDC, et al. An interleukin-10 gene polymorphism associated with the development of cervical lesions in women infected with Human Papillomavirus and using oral contraceptives. Infect Genet Evol. (2013) 19:32–7. doi: 10.1016/j.meegid.2013.06.016
37. Torres-Poveda K, Burguete-García AI, Bahena-Román M, Méndez-Martínez R, Zurita-Díaz MA, López-Estrada G, et al. Risk allelic load in Th2 and Th3 cytokines genes as biomarker of susceptibility to HPV-16 positive cervical cancer: a case control study. BMC Cancer. (2016) 16:330. doi: 10.1186/s12885-016-2364-4
38. Wang SS, Gonzalez P, Yu K, Porras C, Li Q, Safaeian M, et al. Common genetic variants and risk for HPV persistence and progression to cervical cancer. PLoS One. (2010) 5:e8667. doi: 10.1371/journal.pone.0008667
39. Klug SJ, Ressing M, Koenig J, Abba MC, Agorastos T, Brenna SM, et al. TP53 codon 72 polymorphism and cervical cancer: a pooled analysis of individual data from 49 studies. Lancet Oncol. (2009) 10:772–84. doi: 10.1016/S1470-2045(09)70187-1
40. Chen D, Cui T, Ek WE, Liu H, Wang H, and Gyllensten U. Analysis of the genetic architecture of susceptibility to cervical cancer indicates that common SNPs explain a large proportion of the heritability. Carcinogenesis. (2015) 36:992–8. doi: 10.1093/carcin/bgv083
41. Zhang Y, Ji L, and Yang S. Mendelian randomization analysis of immune cell characteristics and genetic variants in cervical cancer risk: a genome-wide association study. Discov Oncol. (2025) 16:1093. doi: 10.1007/s12672-025-02876-7
42. Zidi S, Stayoussef M, Zouidi F, Benali S, Gazouani E, Mezlini A, et al. Tumor necrosis factor alpha (-238/-308) and TNFRII-VNTR (-322) polymorphisms as genetic biomarkers of susceptibility to develop cervical cancer among Tunisians. Pathol Oncol Res. (2015) 21:339–45. doi: 10.1007/s12253-014-9826-2
43. Zidi S, Gazouani E, Stayoussef M, Mezlini A, Ahmed SK, Yacoubi-Loueslati B, et al. IL-10 gene promoter and intron polymorphisms as genetic biomarkers of cervical cancer susceptibility among Tunisians. Cytokine. (2015) 76:343–7. doi: 10.1016/j.cyto.2015.05.028
44. Zidi S, Benothmen Y, Sghaier I, Ghazoueni E, Mezlini A, Slimen B, et al. Association of IL10-1082 and IFN-γ +874 polymorphisms with cervical cancer among Tunisian women. Int Scholarly Res Notices. (2014) 2014:706516. doi: 10.1155/2014/706516
45. Zidi S, Stayoussef M, Alsaleh BL, Gazouani E, Mezlini A, Ebrahim BH, et al. Relationships between common and novel interleukin-6 gene polymorphisms and risk of cervical cancer: a case-control study. Pathol Oncol Res. (2017) 23:385–92. doi: 10.1007/s12253-016-0127-9
46. Zidi S, Sghaier I, Zouidi F, Benahmed A, Stayoussef M, Kochkar R, et al. Interleukin-1 gene cluster polymorphisms and its haplotypes may predict the risk to develop cervical cancer in Tunisia. Pathol Oncol Res. (2015) 21:1101–7. doi: 10.1007/s12253-015-9941-8
47. Yi M, Li T, Niu M, Zhang H, Wu Y, Wu K, et al. Targeting cytokine and chemokine signaling pathways for cancer therapy. Signal Transduct Target Ther. (2024) 9:176. doi: 10.1038/s41392-024-01868-3
48. Stayoussef M, Weili X, Habel A, Barbirou M, Bedoui S, Attia A, et al. Altered expression of cytokines, chemokines, growth factors, and soluble receptors in patients with colorectal cancer, and correlation with treatment outcome. Cancer Immunol Immunother. (2024) 73:169. doi: 10.1007/s00262-024-03746-x
49. Xin J, Jiang X, Li H, Chen S, Zhang Z, Wang M, et al. Prognostic evaluation of polygenic risk score underlying pan-cancer analysis: evidence from two large-scale cohorts. EBioMedicine. (2023) 89:104454. doi: 10.1016/j.ebiom.2023.104454
50. Wang Y, Yang J, Huang J, and Tian Z. Tumor necrosis factor-α polymorphisms and cervical cancer: evidence from a meta-analysis. Gynecol Obstet Invest. (2020) 85:153–8. doi: 10.1159/000502955
51. Mao X, Ke Z, Liu S, Tang B, Wang J, Huang H, et al. IL-1β+3953C/T, -511T/C and IL-6 -174C/G polymorphisms in association with tuberculosis susceptibility: A meta-analysis. Gene. (2015) 573:75–83. doi: 10.1016/j.gene.2015.07.025
52. Sghaier I, Sheridan JM, Daldoul A, El-Ghali RM, Al-Awadi AM, Habel AF, et al. Association of IL-1β gene polymorphisms rs1143627, rs1799916, and rs16944 with altered risk of triple-negative breast cancer. Cytokine. (2024) 180:156659. doi: 10.1016/j.cyto.2024.156659
53. Sghaier I, Mouelhi L, Rabia NA, Alsaleh BR, Ghazoueni E, Almawi WY, et al. Genetic variants in IL-6 and IL-10 genes and susceptibility to hepatocellular carcinoma in HCV infected patients. Cytokine. (2017) 89:62–7. doi: 10.1016/j.cyto.2016.10.004
54. Cicchese JM, Evans S, Hult C, Joslyn LR, Wessler T, Millar JA, et al. Dynamic balance of pro- and anti-inflammatory signals controls disease and limits pathology. Immunol Rev. (2018) 285:147–67. doi: 10.1111/imr.12671
55. Cullup H, Middleton PG, Duggan G, Conn JS, and Dickinson AM. Environmental factors and not genotype influence the plasma level of interleukin-1 receptor antagonist in normal individuals. Clin Exp Immunol. (2004) 137:351–8. doi: 10.1111/j.1365-2249.2004.02531.x
56. Carlini V, Noonan DM, Abdalalem E, Goletti D, Sansone C, Calabrone L, et al. The multifaceted nature of IL-10: regulation, role in immunological homeostasis and its relevance to cancer, COVID-19 and post-COVID conditions. Front Immunol. (2023) 14:1161067. doi: 10.3389/fimmu.2023.1161067
57. Li L, Yu R, Cai T, Chen Z, Lan M, Zou T, et al. Effects of immune cells and cytokines on inflammation and immunosuppression in the tumor microenvironment. Int Immunopharmacol. (2020) 88:106939. doi: 10.1016/j.intimp.2020.106939
58. Sanders LM, Chandra R, Zebarjadi N, Beale HC, Lyle AG, Rodriguez A, et al. Machine learning multi-omics analysis reveals cancer driver dysregulation in pan-cancer cell lines compared to primary tumors. Commun Biol. (2022) 5:1367. doi: 10.1038/s42003-022-04075-4
59. Zheng M, Li J, Fang W, Luo L, Ding R, Zeng H, et al. The TNF-α rs361525 and IFN-γ rs2430561 polymorphisms are associated with liver cirrhosis risk: a comprehensive meta-analysis. Front Immunol. (2023) 14:1129767. doi: 10.3389/fimmu.2023.1129767
60. Zhang H and Dhalla NS. The role of pro-inflammatory cytokines in the pathogenesis of cardiovascular disease. Int J Mol Sci. (2024) 25:1082. doi: 10.3390/ijms25021082
61. Ahmed AB, Zidi S, Sghaier I, Ghazouani E, Mezlini A, Almawi W, et al. Common variants in IL-1RN, IL-1β and TNF-α and the risk of ovarian cancer: a case control study. Cent Eur J Immunol. (2017) 42:150–5. doi: 10.5114/ceji.2017.69356
62. El-Omar EM, Carrington M, Chow WH, McColl KE, Bream JH, Young HA, et al. Interleukin-1 polymorphisms associated with increased risk of gastric cancer. Nature. (2000) 404:398–402. doi: 10.1038/35006081
63. Wu S, Hu G, Chen J, and Xie G. Interleukin 1β and interleukin 1 receptor antagonist gene polymorphisms and cervical cancer: a meta-analysis. Int J Gynecol Cancer. (2014) 24:984–90. doi: 10.1097/IGC.0000000000000165
64. Chagas BS, Lima RCP, Paiva Júnior SSL, Silva RCO, Cordeiro MN, Silva Neto JDC, et al. Significant association between IL10-1082/-819 and TNF-308 haplotypes and the susceptibility to cervical carcinogenesis in women infected by Human papillomavirus. Cytokine. (2019) 113:99–104. doi: 10.1016/j.cyto.2018.06.014
65. Kuguyo O, Tsikai N, Thomford NE, Magwali T, Madziyire MG, Nhachi CFB, et al. Genetic susceptibility for cervical cancer in African populations: what are the host genetic drivers? OMICS. (2018) 22:468–83. doi: 10.1089/omi.2018.0075
66. Chen D and Gyllensten U. Lessons and implications from association studies and post-GWAS analyses of cervical cancer. Trends Genet. (2015) 31:41–54. doi: 10.1016/j.tig.2014.10.005
Keywords: cervical cancer, cytokine polymorphisms, IL-10, immunogenetics, machine learning, polygenic risk score, risk stratification, TNF-α
Citation: Zidi S, Yacoubi-Loueslati B, Ben Abdelmoumen Mardassi B and Almawi WY (2026) A machine learning approach to a nine-SNP immunogenetic score for prognostic stratification in cervical cancer. Front. Immunol. 17:1759674. doi: 10.3389/fimmu.2026.1759674
Received: 03 December 2025; Revised: 16 January 2026; Accepted: 19 January 2026; Published: 06 February 2026.
Edited by:
Jingyue Jia Cassano, University of New Mexico, United States
Reviewed by:
Atar Singh Kushwah, Manipal University Jaipur, India
Varsha Srinivasan, University of New Mexico, United States
Copyright © 2026 Zidi, Yacoubi-Loueslati, Ben Abdelmoumen Mardassi and Almawi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Sabrina Zidi