Decoding the tumor microenvironment and molecular mechanism: unraveling cervical cancer subpopulations and prognostic signatures through scRNA-Seq and bulk RNA-seq analyses

Background Cervical carcinoma (CC) represents a prevalent gynecological neoplasm, with a discernible rise in prevalence among younger cohorts observed in recent years. Nonetheless, the intrinsic cellular heterogeneity of CC remains inadequately investigated. Methods We utilized single-cell RNA sequencing (scRNA-seq) transcriptomic analysis to scrutinize the tumor epithelial cells derived from four specimens of cervical carcinoma (CC) patients. This method enabled the identification of pivotal subpopulations of tumor epithelial cells and elucidation of their contributions to CC progression. Subsequently, we assessed the influence of associated molecules in bulk RNA sequencing (Bulk RNA-seq) cohorts and performed cellular experiments for validation purposes. Results Through our analysis, we have discerned C3 PLP2+ Tumor Epithelial Progenitor Cells as a noteworthy subpopulation in cervical carcinoma (CC), exerting a pivotal influence on the differentiation and progression of CC. We have established an independent prognostic indicator—the PLP2+ Tumor EPCs score. By stratifying patients into high and low score groups based on the median score, we have observed that the high-score group exhibits diminished survival rates compared to the low-score group. The correlations observed between these groups and immune infiltration, enriched pathways, single-nucleotide polymorphisms (SNPs), drug sensitivity, among other factors, further underscore their impact on CC prognosis. Cellular experiments have validated the significant impact of ATF6 on the proliferation and migration of CC cell lines. Conclusion This study enriches our comprehension of the determinants shaping the progression of CC, elevates cognizance of the tumor microenvironment in CC, and offers valuable insights for prospective CC therapies. These discoveries contribute to the refinement of CC diagnostics and the formulation of optimal therapeutic approaches.


Introduction
Cervical carcinoma (CC) ranks as the fourth most prevalent malignancy and represents a leading contributor to global female cancer-related mortality (1).In both incidence and mortality profiles, CC consistently maintains a prominent status among the most prevalent cancer types worldwide in women, trailing behind only breast cancer, colorectal cancer, and lung cancer (2).Despite the substantial preventive potential provided by efficient screening and vaccination initiatives against CC, vaccination uptake remains notably deficient, with only a minority of women accessing comprehensive healthcare services (3).Consequently, the mortality rate for advanced-stage CC patients remains elevated, with a median survival period of a mere 16.8 months (4), and an average 5-year survival rate of 72% (5).Due to chemotherapy resistance, the current efficacy of chemotherapeutic agents is limited, exhibiting a response rate ranging from 29% to 63% (6).Standard care for CC presently encompasses radiotherapy, chemotherapy, or surgical resection.However, these approaches entail significant side effects, possess limited efficacy for advancedstage diseases, and offer few treatment alternatives for cases of recurrence or metastasis (7).Although some treatments targeting advanced or recurrent CC, such as anti-angiogenesis and immunotherapy, exist, their response rates remain suboptimal (8).Given the low cure rates for advanced-stage diseases and the adverse effects associated with current therapies, there is an urgent need to provide new treatment options for CC patients.
To comprehend CC, it is imperative to initially comprehend the tumor microenvironment (TME) within which CC tumors arise and proliferate.The tumor microenvironment, consisting of diverse cellular components such as fibroblasts, endothelial cells (ECs), and immune cells, in conjunction with extracellular constituents including cytokines, hormones, extracellular matrix, and growth factors, constitutes an intricate network enveloping CC cells.The TME plays a pivotal role not only in the initiation, progression, and metastasis of CC but also profoundly influences therapeutic outcomes (9).Chemoresistance mediated by the TME is a result of intricate crosstalk between CC cells and their surrounding environment.In previous studies, investigations into the molecular mechanisms and chemoresistance of CC patients have predominantly focused on bulk genomic or transcriptomic analysis methods and in situ hybridization techniques (10).Consequently, research on chemoresistance mechanisms based on distinct characteristics of cell populations remains ambiguous.
Today, single-cell RNA sequencing (scRNA-seq) technology has emerged as a powerful tool for analyzing cell population spectra within tissues.This technique has been widely employed to elucidate complex subpopulations in organ tissues, such as the lungs (11), heart (12), and brain (13), as well as various cancers, including melanoma (14), ovarian (15), and colorectal (16) cancers (17).Single-cell sequencing stands as the latest tool for revealing tumor cell heterogeneity and the microenvironment.However, its application in clinical samples of CC remains limited.
In 2021, Hua et al. utilized scRNA-seq to investigate the intratumoral heterogeneity of CSCC based on tumor tissue and adjacent normal tissue from a single patient (18).While there are some comparative studies on TME changes before and after chemotherapy, relevant reports are concise and lack detailed exploration of scRNA-seq data.Currently, the use of scRNA-seq for mapping studies has been widely embraced by the scientific community (19).
Therefore, this study employed single-cell RNA sequencing (scRNA-seq) on CC samples to decipher the immunological microenvironment of CC.Unraveling the immune landscape of CC may offer novel insights into the treatment of advanced-stage CC, potentially expediting the eradication of cervical cancer.The paper extensively discusses and summarizes the functional roles and clinical relevance of tumor epithelial cell subpopulations during the progression of CC.This study provides a valuable resource and deeper insights into CC initiation and progression, which is helpful in refining CC diagnosis and for the design of optimal treatment strategies.number of patients is four.Each patient had a sample sequenced.The samples included: GSM5236544, GSM5236545, GSM5236546, GSM5236547.The bulk data was from the The Cancer Genome Atlas (TCGA) official website (https://portal.gdc.cancer.gov/).

Processing of scRNA-seq data
The raw gene expression substrates were processed with the Seurat software package (version 4.3.0)(20).High-quality cells were obtained according to the following criteria: removed cells with extreme nFeature, nCount values; mitochondrial gene expression below 20% of the total number in one cell, and erythrocyte gene expression below 5% of the total number in one cell.Potential double cells was removed by using the DoubletFinder package.The samples were normalized to find the top 2000 highly variable genes, and their data were normalized (21,22).Subsequently, we further analyzed the data by PCA and we used the method of harmony in order to remove the batch effect between samples.The top 30 significant principal components (PCs) were selected for uniform manifold approximation and projection (UMAP) downscaling and gene expression visualization (23,24).Cell clusters were annotated by cell markers obtained from previous literature and according to the CellMarker database (http:// xteam.xbio.top/CellMarker/).Subsequently, we also observed the proportion of different cell types (25)(26)(27).

DEGs and GESA
Differentially Expressed Genes (DEGs) of per cell type were identified by performing the FindAllMarker function on the normalized expression data in the Seurat software package (28, 29), and genes expressed in more than 25% of the cells in clusters with a logFC value greater than 0.25 were selected.Genes with adjusted p-values <0.05 were considered statistically significant in KEGG and GO enrichment analysis.The ClusterProfiler software package (version 0.1.1)was used to enrich and analyze clusterspecific biomarker genes (30).

Tumor cell and non-malignant epithelial cell differentiation based on inferred CNVs
To differentiate tumor cells and non-malignant epithelial cells, the initial copy number variation (CNV) signal was estimated for each region by using the inferCNV package (https://github.com/broadinstitute/inferCNV/wiki).Where NK/T cells were used as a reference, while we defined subpopulations with a high copy number variation profile as tumor epithelial cells.

Determination of cell subpopulations
All tumor epithelial cells were extracted and renormalized to find the first 2000 highly variable genes, and their data were normalized.Subsequently, we further analyzed them by PCA and we used the method of harmony in order to remove the batch effect among samples.The expression of known typical marker genes for the respective cell types was annotated according to cell subtype.Cell subclusters with similar gene expression patterns were annotated to the same cell type and projected onto a 2D map by using the UMAP approach.

Differential and enrichment analysis of cell subpopulations
We further used the "FindAllMarkers" function of the Wilcoxon rank-sum test to identify differential genes in each subpopulation of tumor epithelial cell subgroups and performed GO-BP enrichment analysis by using ClusterProfiler.

Trajectory analysis
To further investigate CC tumorigenesis, trajectory analysis of tumor epithelial cell subpopulations was performed by using three software packages.
First, the algorithm of cytoTRACE was used to assess the cell stemness of each subpopulation of cells.Then, we further used the Monocle software toolkit to reconstruct cell differentiation trajectories, used DDRTree to downscale and to observe the development of the subpopulation cells under the newly created trajectories.Finally, we further analyzed the cell trajectories during tumor epithelial cell differentiation using the slingshot method, which was used to infer cell lineages and to estimate the cell expression level of each lineage over the pseudotime.

Analysis of cell-cell interactions
To investigate the cell-cell interaction network between epithelial cell subpopulations and other microenvironmental cells, ligand-receptor pairs between ecotone cell subtypes and malignant cells were explored using the "CellChat" package (version 1.6.1)(31).We inferred cell-cell communication at the signaling pathway and receptor-ligand levels and explored how signaling pathways were coordinating across multiple cell types.

Constructed novel immunerelated features
To focus on the role of cervical cancer in predicting patient survival, we used key subgroups of cervical cancer marker genes, which were used to predict genetic signatures.We used univariate COX regression analysis based on the "survival" package to explore the correlation between the expression of these important genes and the overall survival (OS) of cervical cancer patients (32).LASSO regression analysis was performed by using the R package "survival" to find the more important prognostic genes.The risk score for each cervical cancer patient was calculated based on the expression level of each gene and the corresponding LASSO regression coefficient.Riskscore = Expression of gene 1 * coefficient 1 + Expression of gene 2 * coefficient 2 +… + Expression of gene n * coefficient n.And groups were divided based on the median, above the median was high score group and below the median was low score group (33,34).We further did survival analysis to observe the prognosis of patients in different groups.The accuracy was assessed by plotting the subjects' work characteristics (ROC) curves at 1, 3, and 5 years by using the timeROC software package (version 0.4.0)(35,36).In addition, we further explored the relationship between modeled genes and risk scores and OS.

Construction and validation of nomogram plots
Nomogram plot was constructed based on the risk score, Mstage, N-stage and etc.of the new characteristics to predict the 1year, 3-year, and 5-year overall survival probabilities of cervical cancer patients in the TCGA cohort (37,38).The predictive ability of the model was also evaluated by using the C-index score.

Estimation of immune cell infiltration
The immune cell infiltration in each CESC sample of the TCGA dataset was estimated by the computational analysis tools CIBERSORT (http://cibersort.stanford.edu/)(39), ESTIMATE, and Xcell.Subsequently, we further observed the high or low level of immune cells in different groups under the CIBERSORT algorithm, and further observed their correlation with risk score, modeling genes, and OS.In addition, we also observed the high and low situation of Stromal Score, Immune Score,EATIMATE Score and TumorPurity in different groups.

Differential and enrichment analysis of bulk data
We used the "DESeq2" package to analyze the differences between the high and low risk group, with a threshold of |logFC| >2 and a p-value of less than 0.05, and used the ClusterProfiler package to analyze the GO, KEGG, and GSEA enrichment of the differential genes (40)(41)(42).

Somatic mutation analysis
The mutation data for somatic mutation analysis were obtained from the TCGA database, and we observed the mutation distribution of highly mutated genes and modeled genes.The tumor mutation load (TMB) of each tumor epithelial cell sample was calculated using the "maftools" software package, and the cervical cancer samples were classified into high and low TMB groups according to the median tumor mutation load (TMB).And Kaplan-Meier was used to observe the survival differences among different groups.In addition, we further observed the CNV profiles of the modeled genes.

Immunotherapy effect in predicting chemotherapy response
We used the "pRRophetic" package (version 0.5) to infer the half-maximal inhibitory concentration (IC50) of chemotherapeutic drugs (43), and assessed the drug sensitivity of chemotherapeutic drugs in different groups (44,45).

Cell culture
SiHa and Hela cells came from the American type culture collection (ATCC).Both cell lines were cultured with DMEM medium (Gibco BRL, USA) supplemented with 10% fetal bovine serum (Gibco BRL, USA) in a 5% CO2 incubator at 37°C.

Cell transfection and RT-qPCR
Two small interfering RNAs (siRNAs) targeting ATF6 genes and their corresponding negative controls (si-NC) were synthesized by Ribobio (Guangzhou, China), The transfection regimen was performed according to the protocol for Lipofectamine 3000 (Invitrogen, USA).
Total RNA was extracted from cell lines using TRIzol reagent (15596018, Thermo) and the RNA concentration was standardized.Subsequently, cDNA was synthesized using PrimeScript ™ RT kit (R232-01, Vazyme).SYBR Green Kit (TaKaRa Biotechnology, Dalian, China) is used for real-time quantitative PCR (qRT-PCR).GAPDH was used as an internal reference.Supplementary Table 1 contains sequences of primers and siRNAs.

Wound healing
The transfected cells were cultured in a 6-well plate (Corning, USA, 3516).When the cell density reached about 95%, a sterile pipette with a volume of 200 mL was used to cut the cell layer along a straight line.Rinse gently with PBS to remove unattached cells and debris.Subsequently, the serum-free cell medium was replaced to maintain cell growth.The photos were taken at the same location at 0 and 48 hours respectively.

Transwell
Cells (corning, USA, 3412-01) with or without Matrigel matrix (BD Biosciences, USA) were placed in 24-well plates for transwell experiments.1×104 cells were placed in each upper chamber and cultured with 200 microliters of serum-free DMEM medium.A complete DMEM medium with 700 µl 10% serum was added to the lower chamber.Culture in the incubator for 36 hours, after the cells of the upper chamber penetrate the lower chamber, the chamber is cleaned, fixed, and stained.

Results
scRNA sequencing revealed the main cell types in the progress of CC In order to obtain the main cell types in the progress of cervical cancer, we collected CC samples from 4 patients with cervical cancer to obtain scRNA-seq.After initial quality control and batch effect removal, a total of 13770 cells were retained.We clustered these 13,770 cells by dimensionality reduction, and the analysis revealed 23 unique tissue states (upper left of Figure 1A).Based on the typical tissue type-specific markers defined in the literature, we divided the cell clusters into six main cell types: NK_T cells(6975), Epithelial cells(5434), Fibroblasts(59), pDCs(139), B_Plasma cells(707) and Myeloid cells(456).The proportion of each cell type in different patients was very different, among which Epithelial cells were the most abundant structural cells, accounting for the largest proportion of all cells.UMAP diagram was used to show the phase situation (lower left in Figure 1A) and HPV infection (lower right in Figure 1A) in six cell types.
The expression of marker genes in six cell types were different, and the marker genes (top10) in each cell type were displayed by bubble diagram (Figure 1B).Bar charts were used to show the proportion of six cell types in HPV+ group and HPV-group.Although the proportion of cells in each sample was different, NK_T cells and Epithelial cells were the cell types with high proportion in all samples (Figure 1C).However, the results of box chart showed that there was no statistical difference in the proportion of these six cell types between groups (Figure 1D).UMAP diagrams were used to display nCount_RNA, nFeature_RNA, S. Score and G2m.Score of all cells (Figure 1E).The word cloud diagrams (Figure 1F) and volcano diagrams (Figure 1G) were used to describe the differential genes of these six cell types.The results of GO-BP enrichment analysis of six cell types were displayed by thermogram (Figure 1H).It could be seen that NK_T cells were related to biological processes such as positive regulation of leukocyte activation, positive regulation of cell activation, T cell receptor signaling pathway, etc. Epithelial cells were related to biological processes such as antimicrobial humoral response, negative regulation of anoikis, regulation of endopeptidase activity, etc. Fibroblasts were related to biological processes such as collagen fibril organization, cellular response to amino acid stimulus, cellular response to acid chemical, etc. PDCs were related to negative regulation of reproductive process, defense response to bacterium, response to glucocorticoid and other biological processes.B_Plasma cells were related to biological processes such as B cell receptor signaling pathway, antigen receptor−mediated signaling pathway, immune response −activating cell surface receptor signaling pathway and other biological processes.Myeloid cells were related to positive regulation of interleukin−6 production, neutrophil activation, interleukin−6 production and other biological processes.

Visualization of cervical cancer tumor epithelial cell subgroups
Next, we used inferCNV (Supplementary Figure 1) to explore the single-cell RNA-seq data from tumor, distinguished tumor epithelial cells, and made further sub-clustering, resulting in five cell subgroups and marking their cell numbers: C0 TMPRSS2+ Tumor EPCs(1266), C1 ANKRD36C+ Tumor EPCs(919), C2 HK2 + Tumor EPCs(489), C3 PLP2+ Tumor EPCs(440), C4 MKI67+ Tumor EPCs(39) (Figure 2A, upper left), and showed the relationship between five cell subgroups and cell cycle stages (Figure 2A, upper right), HPV infection (Figure 2A, lower left) and sample source (Figure 2A, lower right).The cell subsets of the HPV+ group account for a higher proportion of all cell subsets, which proved that the occurrence of cervical cancer may be related to HPV infection.As previously reported, HPV infection was the main cause of cervical cancer development, which could be seen in 95% of cases (46,47).The proportion of five cell subgroups in HPV+ group and HPV-group, the proportion of HPV+ and HPV-in five cell subgroups, the proportion of five cell subgroups in cell cycle stages and the proportion of cells in five cell subgroups in cell cycle stages were displayed (Figure 2B).It was found that C0 existed only in HPV+ group and accounted for the most in G1 phase, while C3 was mostly in HPV-group, and the proportion of cells in S phase was much higher than that in other two phases.Several related features(CNV score, ncount _ RNA, S. score and G2M.score) of five cell subgroups were visualized (Figure 2C).The differential genes of five cell subgroups were displayed by volcano diagrams (Figure 2D), the enrichment pathways were displayed by word cloud diagrams (Figure 2E), and the marker genes (top10) in each cell subgroup were displayed by bubble diagram (Figure 2F).The results of enrichment analysis of GO-BP, the differential genes of five cell subgroups, were displayed by thermogram (Figure 2G).C0 TMPRSS2+ Tumor EPCs were related to biological processes such as Vesicle organization, Macroautophagy, Endosome organization,etc.C1 ANKRD36C+ Tumor EPCs are related to biological processes such as ATP metabolic process, Oxidative phosphorylation, Purine ribonucleoside triphosphate metabolic process,etc.C2 HK2+ Tumor EPCs are related to biological processes such as Skin development, Establishment of skin barrier, Regulation of water loss via skin, etc. C3 PLP2+ Tumor EPCs are related to biological processes, such as Cytoplasmic translation, Ribonucleoprotein complex biogenesis, Ribosome biogenesis, etc.C4 MKI67+ Tumor EPCs is related to biological processes such as Chromosome segregation, Nuclear division, Mitotic nuclear division, etc.

Visualization of pseudo-sequential analysis of tumor epithelial cells by CytoTRACE and monocle
In order to explore the differentiation and development relationship among the five cell subgroups of tumor epithelial cells, the differentiation of tumor epithelial cells was analyzed and visualized by CytoTRACE (Figure 3A).It could be seen that the five cell subgroups differentiated along the direction of C4-C3-C2-C0-C1 (Figure 3B).Pseudo-sequential analysis of cancer development process was carried out to explore the differentiation process of tumor epithelial cells.The distribution of epithelial cells from four patients was shown in the pseudotime-series trajectory, and the distribution of cell subgroups in pseudotime-series was shown by using UMAP diagram, violin diagram and ridge diagram respectively.It could be seen that five cell subgroups were continuously differentiated in pseudotime-series (Figure 3C).At the same time, five kinds of states were identified.In state1, state2 and state3, the proportion of C0 TMPRSS2+ Tumor EPCs subgroup was the highest, while in state4, the proportion of C1 ANKRD36C+ Tumor EPCs subgroup was the highest, and in state5, the proportion of C3 PLP2+ Tumor EPCs subgroup was the highest, and C4 MKI67+ Tumor EPCs subgroup only existed in state5.In HPV+ and HPV-groups, the proportion of C0 TMPRSS2+ Tumor EPCs subgroup in HPV+ group was the highest, and C3 PLP2+ Tumor EPCs subgroup in HPV-group was the highest (Figures 3D,  E).In order to study the origin of tumor epithelial cells, the pseudotime sequence trajectory of five cell subgroups was further analyzed.Starting from state1 at the lower right of the trajectory, two trajectories are divided upward, one is state2 downward, the other is continuously divided upward to the second branch point of state3, and the second branch point is divided into two branches, one is state4 upward, and the other is state5 downward to the left.The C0 TMPRSS2+ Tumor EPCs subgroup was mainly displayed at the beginning of the trajectory and in the branch with the first branch point down (corresponding to state1 and state2), the C1 ANKRD36C+ Tumor EPCs subgroup was mainly in the branch with the second branch point up (corresponding to state4), and the C3 PLP2+ Tumor EPCs subgroup mainly existed at the end of the whole pseudotime sequence trajectory (corresponding to state5).Pseudotime series analysis showed that C0 TMPRSS2+ Tumor EPCs subgroup may be the starting point of tumor cells, and gradually differentiated into other subgroups during the progression of cervical cancer (Figure 3F).
The named genes of five cell subgroups were selected and their changes with pseudotime series were shown by scatter plots and pseudotime series UMAP plots respectively.It could be seen that the C0 subgroup represented by the gene TMPRSS2 was mostly in the initial state of pseudotime series.The C1 subgroup represented by ANKRD36C and the C2 subgroup represented by HK2 basically run through the pseudotime series.However, the C3 subgroup represented by gene PLP2 and C4 subgroup represented by gene MKI67 were mostly at the end of pseudotime series.The sectional views of each subgroup also confirmed this conclusion (Figures 3G-H).

Slingshot analysis of pseudotime sequence trajectory of tumor epithelial cell subgroups
Slingshot is a uniquely robust and flexible tool which combines the highly stable techniques necessary for noisy single-cell data with the ability to identify multiple trajectories.In order to infer the continuous branching pedigree structure in tumor epithelial cell data, the pseudotime series trajectories of five cell subgroups were analyzed by using slingshot, and two lineages were obtained: lineage1 and lineage2.The two lineages have similar trajectories, but the final footholds are different, lineage1 finally reaches the C3 PLP2+ Tumor EPCs, and lineage2 finally reaches the C4 MKI67+ Tumor EPCs (Figure 4A).Next, the relationship between two lineages and pseudotime-series differentiation trajectory was displayed respectively, and the two pseudotime-series trajectories were visualized by GO-BP enrichment analysis.It was found that C1 in lineage1 was related to biological processes such as silencing gene, C2 was related to biological processes such as humoral, C3 was related to biological processes such as keratinocyte and transport, and C4 was related to biological processes such as hormone and biosynthetic.In lineage2, C1 was related to biological processes such as silencing gene, C2 was related to biological processes such as humoral and activity, C3 was related to biological processes such as ensheathment, humoral, collagen and proteincoupled, and C4 was related to biological processes such as mitotic and cycle (Figures 4B-D).Finally, the distribution status of different subpopulations on lineage1 and lineage2 and the differentiation curves with the pseudotime series were shown in scatter plots (Figure 4E).

CellChat analysis between cells
In order to systematically elucidate the complex cellular responses, we attempted to probe the cell-to-cell relationships and ligand-receptor communication networks to better understand the interactions between cells.Through CellChat analysis, first, we established intercellular communication networks between most cells, including Myeloid cells, Fibroblasts, NK_T cells and various subgroups of tumor epithelial cells, etc.Then calculated the number of interactions (indicated by the "line" connection between two cell types, the thicker the line, the higher the number of interaction pathways) and the strength of interactions (indicated by the "line" weight, the thicker the line, the stronger the interaction strength) (Figure 5A).
We used gene expression pattern analysis methods available on CellChat to explore how cells and signaling pathways interact.First, we determined the correspondence between inferred potential communication patterns and groups of secreted cells to decipher the outgoing communication patterns.Three signaling patterns were identified: pattern 1 (tumor epithelial cells), pattern 2 (NK/T cells, B_Plasma cells, Myeloid cells, pDCs) and pattern 3 (Fibroblasts) (Figure 5B).Then, to identify the key incoming and outgoing signals associated with the five tumor epithelial cell subpopulations, the ligand receptor network was quantitatively measured using CellChat to predict its key incoming and outgoing signals by utilizing pattern recognition methods.For example, in cervical cancer, each cell type could be a secreting cell (signal sender) that releases different cytokines or ligands, and each cell type could also be a targeting cell (signal receiver), and ligand-receptor-mediated communication between different cell types should contribute to cervical cancer when receptors on these cells were targeted by ligands released from the same type of cell or from other cells development (Figure 5C).In addition to exploring the detailed communication of individual pathways, an important question was how multiple cell populations and signaling pathways coordinate their functions.To address this question, CellChat used a pattern recognition method based on nonnegative matrix decomposition to identify global communication patterns, as well as key signals in different cell groups.The application of this analysis revealed three incoming signaling patterns and three outgoing signaling patterns.For example, this output showed that the majority of outgoing tumor epithelial cell signaling was characterized by mode 1, which represented multiple pathways, including but not limited to APP, CD99, CDH, ANNEXIN, etc.All output NK_T cells, B_Plasma cells, Myeloid cells, pDCs signalings were characterized by mode 2, which represented pathways such as CD22, CD45, ICAM, CESC L and TNF.On the other hand, the communication patterns of the target cells suggested that incoming tumor epithelial cell signalings were dominated by mode 1, which included signaling pathways such as CD99 and MK, as well as PTN, CEACAM, CD96, and GRN.The majority of incoming NK_T cells, B_Plasma cells, Myeloid cells, and pDCs signalings were characterized by mode 2, driven by pathways such as APP and ANNEXIN (Figure 5D).We found that CD99 can be secreted by almost all types of cells in cervical cancer, and its main target cells (receptors) were tumor epithelial cell subgroups, Fibroblasts and pDCs, among which C3 PLP2+ Tumor EPCs subgroups were most significantly expressed (Figure 5E).

Analysis of CD99 signal pathway
In order to explore the function way of CD99 signal pathway, the CD99 signal pathway was visually analyzed.The cell communication pattern of CD99 signaling pathway was displayed by scatter plot, and it could be seen that the tumor epithelial cell subgroup C3 PLP2+ Tumor EPCs had a large number and the highest intensity on CD99 signaling pathway (Figure 6A).
Besides the sender and receiver of CD99 signaling, we also identified the cell types as the medium and influencer of CD99 signaling-mediated intercellular communication according to the relative importance of each cell type based on the algorithm, which was called "centrality measurement".As could be seen from the figure, the tumor epithelial cell subgroup C3 PLP2+ Tumor EPCs had the highest expression on the CD99 signaling pathway (Figure 6B).Violin diagram showed the interaction between cells, and it was found that the epithelial cell subgroup C3 PLP2+ Tumor EPCs was highly expressed on CD99.Combined with the previous experimental results in this paper, it could be concluded that the epithelial cell subgroup C3 PLP2+ Tumor EPCs was an important subgroup of tumor epithelial cells (Figure 6C).The ligand-receptor between tumor epithelial cells and other cells was displayed by chord diagram (Figure 6D).If we set all 10 identified cell types in cervical cancer as the source cells of CD99, and set the five cell types listed on the left in Figure 6E as potential target cells, Then the hierarchical diagram showed that CD99 released by all 10 cell types could target C0 TMPRSS2+ Tumor EPCs subgroup, C2 HK2+ Tumor EPCs subgroup and C3 PLP2+ Tumor EPCs subgroup, and if the other five cell types listed on the right side of Figure 6E were set as potential target cells, Then the layered map showed that CD99 released by all 10 cell types could target C1 ANKRD36C+ Tumor EPCs subgroup and C2 HK2+ Tumor EPCs subgroup (Figure 6E).These results indicated that all cell types in cervical cancer may be the source of CD99, but only C0 TMPRS2+ Tumor EPCs subgroup, C1 ANKRD36C+ Tumor EPCs subgroup, C2 HK2+ Tumor EPCs subgroup and C3 PLP2+ Tumor EPCs subgroup had different targeting intensities (line width between cells) for CD99 (Figure 6E).The specific situation of cellto-cell interaction in CD99 signaling pathway was shown in the figure (Figure 6F).

Screening the genes that constitute the risk score and performing correlation analysis
In order to study the clinical effect of the cell types identified in this study, we analyzed the marker gene (top100) of C3 tumor epithelial cell subgroup by univariate COX analysis.The results showed that there were 12 genes related to the prognosis of patients, among which PLAGL1, HIF1A, ERG, ELF1, ATF6 and ATF1 were risk factors, while TBX21, SPIB, LHX2, JUND, ETV7 and ATF5 were protective factors.In order to avoid multicollinearity of these genes, lasso regression was used for further screening, and nine genes constituting PLP2+ Tumor EPCs score were selected.Lambda diagram verified the above conclusion (Figure 7B).Next, the nine PLP2+ Tumor EPCs score genes were divided into high PLP2+ Tumor EPCs score group and low PLP2+ Tumor EPCs score group for survival analysis (Figure 7C).Compared with low PLP2+ Tumor EPCs score group, the prognosis of high PLP2+ Tumor EPCs score group was worse.As we expected, high PLP2+ Tumor EPCs score was associated with worse clinical outcome, while low PLP2+ Tumor EPCs score was associated with better clinical outcome (P < 0.0001).
In addition, we also analyzed the survival of genes which constituted PLP2+ Tumor EPCs score (Figure 7D), and the results showed that only four of them (SPIB, ATF6, PLAGL1, ERG) were statistically significant (P<0.05).Among them, in the survival analysis chart corresponding to three genes, ATF6, PLAGL1 and ERG, all the groups with high gene expression were associated with worse clinical results.The group with low gene expression was associated with better clinical outcome, while the result of SPIB corresponding to survival analysis chart was just the opposite, which could prove the result in Figure 7A, that was, ATF6, PLAGL1 and ERG were risk factors and SPIB was protective factors.
Through the above analysis, two groups of high PLP2+ Tumor EPCs score group and low PLP2+ Tumor EPCs score group had been obtained, and then these two groups were analyzed.The PLP2 + Tumor EPCs score of each patient in TCGA-CESC data set was calculated according to the expression level and regression In addition, the distribution of survival time showed that the higher PLP2+ Tumor EPCs score, the worse the prognosis.The corresponding expression levels of nine modeling-related genes were also shown.high PLP2+ Tumor EPCs score group highly expressed ATF6, ELF1, ERG and PLAGL1 genes, while low PLP2+ Tumor EPCs score group highly expressed ATF5, TBX21, SPIB, LHX2 and JUND genes (Figure 7E).The correlation among survival days, risk score and genes constituting the model was studied.OS was negatively correlated with PLP2+ Tumor EPCs score, ERG was significantly negatively correlated with JUND, and most other modeling genes were positively correlated.The scatter plot further showed the correlation between nine modeling genes and risk score and OS (Figure 7F).According to the predicted AUC scores of survival ROC curve in 1 year, 3 years and 5 years, the AUC(1 year): 0.762, AUC (3 years): 0.733 and AUC(5 years): 0.781 were obtained (Figure 7G).The relationship between four statistically significant modeling genes and PLP2+Tumor EPCs Score was shown by scatter plot (Figure 7H), and the difference of expression levels of four statistically significant modeling genes between high PLP2+ Tumor EPCs score group and low PLP2+ Tumor EPCs score group was shown (Figure 7I).It was found that ATT6, ERG and PLAGL1 were more expressed in high PLP2+ Tumor EPCs score group, while SPIB was more expressed in low PLP2+ Tumor EPCs score group (P < 0.001).The differential expression of genes in high PLP2+ Tumor EPCs score group and low PLP2+ Tumor EPCs score group, and the differential expression in different ages, different T, N, M stages and different races were demonstrated by box plots (Supplementary Figure 2).
In order to verify the independence of risk factors, a gene-cell clinical prediction model was constructed, and multi-factor Cox regression was carried out by combining age, race, T stage, N stage and M stage clinicopathological factors with high PLP2+ Tumor EPCs score group and low PLP2+ Tumor EPCs score group, with a p value of 0.001, indicating that PLP2+Tumor EPCs Score could be independent risk factor (Figure 7J).Age, race, T, N and M stages were selected to construct the nomogram, which showed the 1-year survival rate, 3-year survival rate and 5-year survival rate (Figure 7K).The estimated survival rates of one of these patients in 1, 3 and 5 years are 0.865, 0.556 and 0.389, respectively.In order to further evaluate the accuracy of the nomogram, the box diagram was used to show the internal cross-validation results (Figure 7L).

Analysis of the difference of immune infiltration between high PLP2+ Tumor
EPCs score group and low PLP2+ Tumor EPCs score group.
In order to explore the immune infiltration in high PLP2+ Tumor EPCs score group and low PLP2+ Tumor EPCs score group, and to observe the relationship between immune infiltrating cells and the two groups, we showed the differential expression of immune infiltration between the two groups by thermogram (Figure 8A).In order to explore the immune infiltration of high PLP2+ Tumor EPCs score group and low PLP2+ Tumor EPCs score group, the predicted abundance of different immune cells was displayed (upper Figure 8B).We used CIBERSORT algorithm to determine the immune cell infiltration of cervical cancer patients from TCGA database, and show the predicted abundance of 12 immune cells with differences between the two groups.High PLP2+ Tumor EPCs score group was more common in Mast cells activated, Macrophages M0, Dendritic cells activated, etc., while low PLP2+ Tumor EPCs score group was more common in T cells CD8, T cells regulatory, B cells naive, etc. (under Figure 8B).The correlation between immune infiltrating cells and PLP 2+Tumor EPCs score was demonstrated by bar chart.PLP2+ Tumor EPCs score was positively correlated with Mast cells activated, Macrophages M0, Dendritic cells activated, and negatively correlated with T cells CD8, T cells regulatory, B cells naive, etc. (Figure 8C).Through a variety of methods for evaluating the content of immune cells, the relationship between the nine genes, OS and PLP2+ Tumor EPCs score and immune cells was compared and summarized, and displayed by thermal diagram.The closer the color was to red, the higher the positive correlation, and the closer the color was to blue, the higher the negative correlation (Figure 8D).
Stromal Score, Immune Score and EATIMATE Score of high PLP2+ Tumor EPCs score group and low PLP2+ Tumor EPCs score group were displayed, and it could be obtained that the results of Stromal Score Group were not statistically significant.Immune Score and EATIMATE Score were both low PLP2+ Tumor EPCs score group higher than high PLP2+ Tumor EPCs score group (Figure 8E).Visualizing the Tumor Purity of two groups, the value of Tumor Purity of high PLP2+ Tumor EPCs score group was higher (Figure 8F).

Difference analysis and enrichment analysis
In order to explore the differences between high PLP2+ Tumor EPCs score group and low PLP2+ Tumor EPCs score group, the Screening genes which constituted PLP2+ Tumor EPCs score, and grouped into two groups of high and low PLP2+ Tumor EPCs score group and took correlation analysis.(A) Forest plot showed univariate COX analysis of genes constituting PLP2+ Tumor EPCs score.HR < 1,protective factor, HR > 1,risk factor.(B) LASSO regression profiled nine genes in the TCGA cohort: ERG, ELF1, ATF6, PLAGL1, ATF5, TBX21, SPIB, LHX2, and JUND (top); coefficient profiled were generated based on the logarithmic (lambda) sequence.Selected the optimal parameters (lambda) in the LASSO model (bottom).(C) Based on the median PLP2+ Tumor EPCs score, all datas were divided into two groups: high and low PLP2+ Tumor EPCs score group, and Kaplan Meier curve showed the survival analysis of the two groups.(D) Kaplan Meier curves showed the overall survival (OS) of cervical cancer (CC) patients with high and low expression of four statistically significant genes (SPIB, ATF6, PLAGL1, and ERG) in all genes which constructed the PLP2+ Tumor EPCs score (p<0.05).(E) Curve plots showed hazard scores of high and low PLP2+ Tumor EPCs score groups (top, middle); heatmap showed differential gene expression of high and low PLP2+ Tumor EPCs score group.The color scale was based on normalized data (bottom).Green indicated low PLP2+ Tumor EPCs score group and orange indicated high PLP2+ Tumor EPCs score group.(F) Heatmap and scatterplot showed the correlation analysis of genes constituting PLP2+ Tumor EPCs score.Red indicated positive correlation, blue indicated negative correlation, and color shades indicated high or low correlation.(G) The ROC curve of the survival plot.The area under the curve (AUC) was 0.762, 0.733 and 0.781 for 1, 3 and 5 years, respectively.(H) Scatter plots showed the correlation analysis of modeled genes (4 statistically significant genes) with PLP2+ Tumor EPCs score.(I) Peak and box plot showed the different expression among the 4 statistically significant genes which constituted PLP2+ Tumor EPCs score in high and low PLP2+ Tumor EPCs score groups.(J) Forest plot showed the multivariate Cox analysis of genes constituting PLP2+ Tumor EPCs score.HR < 1,protective factor, HR > 1, risk factor.(K) Column line plot was constructed according to TCGA patient race, T-stage, N-stage, M-stage and etc. (L) Box-and-line plot for internal cross-validation of AUC scores at 1, 3, and 5 years.
following analysis was made.First, the volcano diagram and thermal diagram were used to show the expression of differential genes between the two groups (Figures 9A, B).In order to further determine the potential role of each subgroup in the initiation and progression of cervical cancer, functional enrichment was carried out, and the GO-BP enrichment analysis results of differential genes were displayed with bar charts.The results showed that the differential genes were mainly related to digestion, modulation of process of another organism, defense response to gram negative bacteria and odontogenesis of dentition containing tooth (Figure 9C).KEGG enrichment analysis was carried out on the differential genes, and the enrichment results of different pathways were displayed by bar graphs, and it was found that the differential genes were enriched with Ras signaling pathway, carbohydrate digestion and absorption, PPAR signaling pathway and other pathways (Figure 9D).Through GSEA scoring of GO-BP enrichment items of different genes, the enrichment scores of genes in different pathways were displayed (Figure 9E).

Mutation analysis
To determine the correlation between gene mutations and immune components in TME, we initiated further studies, first showing the top 30 most frequently mutated genes in two groups of somatic cells.The upper bars indicated the mutation load for each sample, and the right bars indicated the total percentage of mutations in that gene in those samples (Figure 9G).The cellular mutation data from the two groups were analyzed and visualized to show the mutations in the nine genes that were modeled (Figure 9F), and the chromosomal gains and losses were demonstrated by using bar graphs to show that the most CNV gain was seen in ATF6, whereas the most CNV loss was seen in ELF1 (Figure 9H).Heatmaps were used to show the correlation of mutation profiles among the genes constituting the risk score (Figure 9I).
And lollipop plots were used to visualize the mutation profiles of different genes (Figure 9J).To explore the difference situation of mutation load between high PLP2+ Tumor EPCs score group and low PLP2+ Tumor EPCs score group, the results were visualized and analyzed by using violin plots, which were not statistically significant (Figure 9K), and the correlation between mutation load and risk scores was demonstrated by using scatter plots, which were not statistically significant (Figure 9L).According to the tumor mutation load score, divided into four groups: high-risk high mutation load, high-risk low mutation load, low-risk high mutation load, and low-risk low mutation load, and the curves showed the results of the survival analysis of the four groups: the low-risk high mutation load group had the best survival, while the high-risk high mutation load group had the worst survival (p < 0.0001) (Figure 9M).

Drug sensitivity analysis
Finally, the differences in different drug sensitivities in the high PLP2+ Tumor EPCs score group versus the low PLP2+ Tumor EPCs score group were shown by violin plots (Figure 9N).For example, CI.1040, Nutlin.3a,PF.02341066, Metformin, EHT.1864, and LFM.A13 had higher drug sensitivities in high PLP2+ Tumor EPCs group than in low PLP2+ Tumor EPCs score group.

Experimental verification
To further elucidate the function of ATF6, we conducted in vitro functional experiments.As shown in Figures 10A, B, the CCK8 experiment showed that compared with the control group, the proliferation capacity of the two cell lines in the ATF6 knockdown group was significantly decreased.The results of plate cloning showed that the number and size of colony formation in both cell lines were significantly inhibited after ATF6 gene knockdown (Figures 10C, D).Further, the inhibitory effect of ATF6 knockdown on the proliferation of two cervical cancer cell lines was demonstrated.The subsequent wound healing experiment results showed that the cell migration rate was slower in the ATF6 knockdown group, and the results were statistically significant (Figures 10E, F).The results of the Transwell experiment showed that the number of cells penetrating the lower chamber was significantly reduced after ATF6 gene knockdown, indicating that ATF6 gene knockdown significantly inhibited the migration and invasion ability of cervical cancer cells (Figures 10G-I).

Discussion
In recent years, the rapid advancement of bioinformatics has profoundly accelerated the diagnosis and prognosis of diseases (48)(49)(50).In this investigation, we employed single-cell RNA sequencing (scRNA-seq) to comprehensively delineate the cellular heterogeneity of human CC.Leveraging scRNA-seq technology, we identified all cellular phenotypes present in CC, including NK_T cells, epithelial cells, fibroblasts, plasmacytoid dendritic cells (pDCs), B plasma cells, and myeloid cells.Among these, epithelial cells emerged as the predominant cellular population.Drawing upon prior research indicating the pivotal role of squamous epithelial cell dysfunction in the initiation of cervical cancer (51), our focus shifted towards the investigation of epithelial cells.Through inferCNV analysis, we characterized tumor epithelial cells and conducted dimensionality reduction clustering, revealing five distinct cellular subgroups: C0 TMPRSS2+ Tumor Epithelial Progenitor Cells, C1 ANKRD36C+ Tumor Epithelial Progenitor Cells, C2 HK2+ Tumor Epithelial Progenitor Cells, C3 PLP2+ Tumor Epithelial Progenitor Cells, and C4 MKI67+ Tumor Epithelial Progenitor Cells.
Through the application of slingshot, monocle, and cytoTRACE, the differentiation trajectory of tumor epithelial cells along a pseudo-temporal sequence was demonstrated.This analysis identified our targeted subgroup for this study: the C3 PLP2+ Tumor EPCs subgroup.Regarding the nomenclature of this subgroup gene, previous research suggests that PLP2 is a transmembrane protein located in the endoplasmic reticulum (52,53).It is recognized as an oncogenic inducer in various cancers, including melanoma, osteosarcoma, breast cancer, hepatocellular carcinoma, and acute lymphoblastic leukemia (54).PLP2 has been implicated in accelerating UCEC cell proliferation, the epithelialmesenchymal transition (EMT) process, invasion, and metastasis, thereby promoting tumor progression (55).It is associated with the persistence and metastasis of tumor cells in melanoma and hematologic malignancies (53).Additionally, PLP2 is known to enhance tumor sphere-forming ability and cell proliferation (56, 57).It can be seen that there is an inevitable link between PLP2 and tumor progression.Therefore, we hypothesize that the C3 PLP2+ Tumor EPCs subgroup is intricately linked to the progression of the tumor.
To explore the interactions between the C3 PLP2+ Tumor EPCs subgroup and other cell types, CellChat communication pattern analysis was employed to unveil coordinated responses among different cell types.Distinct cell types can simultaneously activate either common cell type-independent signaling transduction pathways or distinct cell type-specific signaling transduction pathways.This methodology is utilized for inferring, analyzing, and visualizing intercellular communication from given scRNA-seq data (31).Through the application of CellChat to depict the relationships between the tumor epithelial cell subgroup and other cell types, three patterns were identified along with their corresponding signal pathway expressions.Notably, the CD99 signaling pathway corresponds to Tumor EPCs pattern 1 and is secreted by almost all cell types in cervical cancer, signifying its crucial role as a significant signaling pathway.The presentation of the subgroups on the CD99 signaling pathway reveals that the C3 PLP2+ Tumor EPCs subgroup has the highest quantity and centralization score, providing evidence of a robust association between this pathway and the C3 PLP2+ Tumor EPCs subgroup.This confirmation underscores the significance of the C3 PLP2+ Tumor EPCs subgroup as the focal point in this study.
To substantiate the role of the C3 PLP2+ Tumor Epithelial Progenitor Cells subgroup in tumor advancement, we proceeded with the development of a novel prognostic model using LASSO regression analysis and COX risk regression analysis.This model was designed to elucidate the association between this subgroup and prognosis, with a focus on nine selected genes to establish the PLP2 + Tumor EPCs score.Among the nine genes in the model, PLAG1 is acknowledged as an oncogene with significant DNA binding affinity and overlapping functions.Its involvement in promoter swapping and subsequent activation plays a pivotal role in the pathogenesis of pleomorphic adenomas of the salivary gland, lipoblastomas, and hepatoblastomas (58-60).ERG, in concert with co-repressive proteins such as HDAC and EZH2, governs AR transcriptional activity, suppressing epithelial differentiation and fostering tumor progression (61).ELF1 exhibits heightened expression in glioma tissues and exhibits a close correlation with WHO grading and patient Karnofsky Performance Status (KPS) scores, suggesting its potential role as a tumor-promoting factor (62). ATF6 serves as an inducer of genes that augment protein folding and restore protein homeostasis (63), while also promoting inflammation during the course of chronic pancreatitis (64).SPIB is upregulated in various malignant tumors, including colorectal cancer, hepatocellular carcinoma, and gastric cancer (65).
The constructed PLP2+ Tumor EPCs score has been demonstrated as an independent prognostic factor.Based on the median risk score, it has been divided into two distinct prognostic groups: the high PLP2+ Tumor EPCs score group and the low PLP2 + Tumor EPCs score group.Subsequent construction of ROC curves and column charts, along with comprehensive analysis, indicates that a low PLP2+ Tumor EPCs score is associated with a better prognosis, while a high PLP2+ Tumor EPCs score is conversely related to a poorer prognosis.Therefore, the PLP2+ Tumor EPCs score can serve as a theoretical basis for clinical decision-making.
The immune system plays a pivotal role in carcinogenesis (66).It is widely acknowledged that the migratory capacity of tumor cells is closely associated with poor prognosis and recurrence (67).Immune cells form the cellular foundation for immunotherapy, and gaining in-depth insights into immune infiltration is crucial for uncovering potential molecular mechanisms and providing new immunotherapeutic strategies to enhance clinical outcomes (68).Therefore, to further discuss the role of the PLP2+ Tumor EPCs score in the tumor process, we analyzed the immune infiltration in two groups based on the PLP2+ Tumor EPCs score (high PLP2+ Tumor EPCs score group and low PLP2+ Tumor EPCs score group).The high PLP2+ Tumor EPCs score group was found to have a higher prevalence of activated Mast cells, Macrophages M0, and activated Dendritic cells.In contrast, the low PLP2+ Tumor EPCs score group exhibited a higher prevalence of T cells CD8, regulatory T cells, and naive B cells, mostly associated with favorable prognoses in various cancers (69).Visualization of the analysis results through Stromal Score, Immune Score, ESTIMATE Score, and Tumor Purity revealed that the Stromal Score had no statistical significance.A higher estimated score in Immune Score indicated a greater abundance of immune components in the tumor microenvironment (TME), while the ESTIMATE Score, representing the sum of Immune Score and Stromal Score, reflected the comprehensive proportion of these two components in the TME (70).Tumor purity refers to the proportion of tumor cells in tumor tissue (71).Consequently, the conclusion can be drawn that the low PLP2+ Tumor EPCs score group has a higher total of immune and stromal components, with a lower proportion of tumor cells, potentially correlating with better prognostic outcomes in this group.
The presentation includes differential gene expression, enriched pathways, mutation profiles, and variances in drug sensitivity between the two groups.Notably, drugs such as CI.1040, Nutlin.3a,PF.02341066, Metformin, EHT.1864, and LFM.A13 have undergone investigation in various malignancies, encompassing gastric cancer, breast cancer, endometrial cancer, and prostate cancer (72)(73)(74)(75).Metformin has emerged as a promising anti-tumor agent, leading to significant advancements in the management of breast cancer and colorectal cancer (72).Nutlin-3a exhibits immune-modulating properties, rendering it a viable option for tumor therapy (76).PF.02341066 serves as a multitarget inhibitor of anaplastic lymphoma kinase (ALK), ROS1, and MET proto-oncogene receptor tyrosine kinase, and is the first agent approved by the U.S. Food and Drug Administration (FDA) and the European Medicines Agency (EMA) for the treatment of advanced ROS1 fusion-positive lung cancer (77,78).Based on the analysis of drug sensitivity, it is suggested that heightened sensitivity to drugs may correlate with a high PLP2+ Tumor EPCs score.Our findings from drug sensitivity analysis furnish a foundation for the election of targeted therapies for cervical cancer patients and provide novel insights for the development of innovative targeted therapeutic agents.Furthermore, our cellular experiments have provided additional evidence regarding the functions of key molecules.However, this study still has certain limitations.Firstly, the sample size selected is relatively small, focusing only on a small subset of CC patients.Secondly, we conducted only transcriptomic studies and in vitro experiments.Thirdly, in previous studies, it has been found that there is a link between cervical cancer and HPV infection.In cases of cervical squamous lesions, the hrHPV positivity rate was 55.6% (79).The relationship between the C3 PLP2+ Tumor EPCs and HPV infection can be further explored in the following studies.Fourth, the article mentions the cellular immune infiltration in different groups.Studies have proposed that T lymphocytes can improve the tumor microenvironment of cervical cancer, improve treatment efficacy, and improve prognosis (80,81), and may continue to explore in this direction in the future.In subsequent research, we plan to validate the role of the PLP2+ Tumor EPCs score in CC across a larger sample size.Additionally, we aim to explore the functions of relevant subgroups and key molecules in various omics data, such as metabolomics, proteomics, and ATACseq.We will integrate in vivo and in vitro experiments to provide more comprehensive validation.

Conclusion
In summary, our investigation elucidated alterations in the tumor microenvironment and highlighted the pivotal role of the C3 PLP2+ Tumor Epithelial Progenitor Cells subpopulation in the onset and progression of cervical carcinoma (CC).Additionally, we identified an independent prognostic factor, the PLP2+ Tumor EPCs score.Our findings unveil prospective therapeutic targets for CC, offering valuable resources and enhanced understanding of the etiology and progression of the disease.

Funding
The author(s) declare financial support was received for the research, authorship, and/or publication of this article.This research was supported by the Scientific Research Foundation of Qilu Hospital of Shandong University (Qingdao)(grant number QDKY2023BS15).

2
FIGURE 1 scRNA sequencing revealed major cell types during CC progression, (A) UMAP plot showed the 23 clusters of cells in cervical cancer patients and the number of cells in each cluster (top left); UMAP plot showed the six major cell types obtained by down clustering of cells in cervical cancer (top right); UMAP plot showed the distribution of phases in the six cell types (bottom left) and the infection of HPV (bottom right).Each point corresponded to a single cell colored according to cell cluster or cell type.(B) Bubble plot showed differential expression of Top10 maker genes in cervical cancer cells across cell types (NK/T cells, Epithelial cells, Fibroblasts, pDCs, B_Plasma cells, Myeloid cells).Bubble colors were based on normalized data and sizes indicated the percentage of genes expressed in the subpopulation.(C) Histograms depicted the percentage of the 6 major cell types in each of the 4 cervical cancer samples.(D) Box line plot depicted the percentage of the 6 cell types in the HPV+ group versus the HPV-group.The colors of the dots represented the HPV-and HPV+ groups, respectively.p-values corresponded to paired Wilcoxon tests.(E) UMAP plots visualized the relevant features of 6 cell types: nCount_RNA, nFeature_RNA, S.score, G2M.score.(F) Word cloud plots showed gene enrichment in the 6 cell types.The size of the letter indicated the number of genes and the color indicated the high or low enrichment score of the cell type.(G) Volcano plots demonstrated the expression of differential genes in the 6 cell types.(H) GO-BP enrichment analysis demonstrated the biological processes associated with the 6 cell types.

3
FIGURE 3 Visualization of pseudotime-series analysis of tumor epithelial cells by CytoTRACE and monocle.(A) The left figure represented the analysis of the differentiation of tumor epithelial cells by using CytoTRACE and it was shown in 2D.The color could represent the level of differentiation.The right figure represented the CytoTRACE results displayed according to different tumor epithelial cell subpopulations.The colors represented different tumor epithelial cell subpopulations.(B) Box line plot demonstrated the predicted ordering by CytoTRACE of tumor epithelial cell subpopulations.(C) UMAP plot, violin plot and ridge plot showed the pseudotime distribution of tumor epithelial cell subpopulations.*, p ≤ 0.05, ****, p < 0.0001 indicated a significant difference, ns indicated a non-significant difference.(D) The occupancy of the relevant features of the five tumor epithelial cell subpopulations at different pseudotime stages was visualized: the occupancy of the five tumor epithelial cell subpopulations in different states (state1-state5) (top) and the occupancy of the five cell subpopulations in the HPV+ group and HPV-group (bottom).(E) Bar graph showed the occupancy of different states (state1-state5) in 5 tumor epithelial cell subpopulations (left) and the occupancy of 5 cell subpopulations in HPV+ group and HPV-group (right).(F) The derivation process of tumor epithelial cells.Left : The figure showed the pseudotime trajectory of tumor epithelial cells; middle: the pseudotime trajectory graph showed the distribution of STATE; right: the pseudotime trajectory graph showed the distribution of tumor epithelial cell subpopulations.(G) Scatter plots showed the changes of named genes of 5 cell subpopulations of tumor epithelial cells with the pseudotime sequence (top); pseudotime trajectory plots showed the distribution of named genes of 5 cell subpopulations of tumor epithelial cells on the pseudotime trajectory (bottom).(H) Split-plane diagrams of tumor epithelial cell pseudotime sequence trajectories showed the distribution of different cell subpopulations on the pseudotime sequence, respectively.

4
FIGURE 4 Slingshot analysis of tumor epithelial cell subpopulations on pseudotime trajectories.(A) UMAP plot showed the distribution of two differentiation trajectories of tumor epithelial cells fitted by the pseudotime order in all tumor epithelial cells.(B) UMAP plot demonstrated the change of Lineage1 with the fitted pseudotime order (left); UMAP plot demonstrated the differentiation trajectory of Lineage1 on the fitted pseudotime order (right).(C) UMAP plot demonstrated the change of Lineage2 with the fitted pseudotime order (left); UMAP plot demonstrated the differentiation trajectory of Lineage2 on the fitted pseudotime order (right).(D) GO-BP enrichment analysis demonstrated the biological processes corresponding to the two pseudotime trajectories of tumor epithelial cell subpopulations.Left: Lineage1; Right: Lineage2.(E) Scatter plots demonstrated the trajectories of named genes of five cell subpopulations of tumor epithelial cells changing on two lineages obtained after slingshot visualization.

5
FIGURE 5 CellChat analysis among all cells.(A) Circle plots showed the number (left) and strength (right) of interactions between all cells.(B) Heatmap showed pattern recognition of incoming cells (left), and outgoing cells (right) among all cells.(C) Outgoing contribution bubble plot and incoming contribution bubble plot showed cellular communication patterns among various cell subpopulations of tumor epithelial cells and other cells.(D) Sankey diagrams showed inferred outgoing communication patterns of secreting cells, showed correspondence between inferred potential patterns, cell populations, and signaling pathways.Left: incoming Sankey diagram, right: outgoing Sankey diagram.(E) Heatmap showed incoming and outgoing signaling intensities for the all cellular interactions.

6
FIGURE 6 Visual analysis of CD99 signaling pathway.(A) Scatter plot showed the cellular communication patterns of CD99 signaling pathway.Each dot represented the communication network of a signaling pathway.The size of the dots indicated the number of signaling pathways.Different colors represented different signaling pathway groups.(B) Heatmap demonstrated the centrality score of the CD99 signaling pathway network, showing the relative importance of each cell group.(C) Violin plot showed the cellular interactions of the CD99 signaling pathway.(D) Circle plot showed the cellular interactions of the CD99 signaling pathway when tumor epithelial cells were selected as the RECEIVER.(E) Hierarchical diagram showed the interactions between tumor epithelial cells and other cells in the CD99 signaling pathway.Solid and hollow circles indicated source and target cell types, respectively.The edge color of the middle circle was consistent with the signal source.(F) Heatmap showed the cell interactions of the CD99 signaling pathway.

8 9
FIGURE 8 Differential analysis of immune infiltration in high and low PLP2+ Tumor EPCs score group (A) Heatmap showed differential expression of immune infiltration in high and low PLP2+ Tumor EPCs score group.(B) Stacked bar graph of immune infiltration (top); box-and-line plot showed the differential immune infiltration of 12 immune cells which had significant differences in high and low PLP2+ Tumor EPCs score groups (bottom).(C, D) Bar graph and heatmap showed the correlation analysis of immune cells with genes constituting PLP2+ Tumor EPCs score.(E) Box line plot showed the difference between high and low PLP2+ Tumor EPCs score groupin StromalScore, ImmuneScore, and ESTIMATEScore.(F) Violin plot showed the differences of TumorPurity in high and low PLP2+ Tumor EPCs score group.*, p < 0.05; **, p < 0.01; ***, p < 0.001; ****, p < 0.0001 indicated a significant difference and ns indicated a nonsignificant difference.

10 ATF6
FIGURE 10 ATF6 significantly affects the proliferation and migration of cervical cancer cell lines.(A, B) CCK-8 experiment.After ATF6 knockdown, the proliferation ability of SiHa and Hela cell lines decreased significantly.(C, D) Plate cloning experiment.After ATF6 knockdown, the colony formation ability of SiHa and Hela cell lines decreased significantly.(E, F) wound healing test.After ATF6 knockdown, the migration ability of SiHa and Hela cell lines decreased significantly.(G-I) Transwell experiment.After ATF6 knockdown, the migration and invasion ability of SiHa and Hela cell lines were significantly reduced.(**P<0.01;***P< 0.001).