A Pan-Cancer Analysis of the Oncogenic and Immunogenic Role of m6Am Methyltransferase PCIF1

Background Phosphorylated CTD-interacting factor 1 (PCIF1) is identified as the only known methyltransferase of N6,2′-O-dimethyladenosine (m6Am) in mRNA. However, its oncogenic and immunogenic role in cancer research is at an initial stage. Methods Herein, we carried out a pan-cancer analysis of PCIF1, with a series of datasets (e.g., TIMER2.0, GEPIA2, cBioPortal). Results PCIF1 expression was higher in most cancers than normal tissues and was discrepant across pathological stages. Highly expressed PCIF1 was positively correlated with overall survival (OS) or disease-free survival (DFS) of some tumors. PCIF1 expression had a positive correlation with CD4+ T-cell infiltration in kidney renal clear cell carcinoma (KIRC), CD8+ T cells, macrophages, and B cells in thyroid carcinoma (THCA), and immune checkpoint genes (ICGs) in LIHC but a negative correlation with CD4+ T cells, neutrophils, myeloid dendritic cells, and ICGs in THCA. It also affected tumor mutational burden (TMB) and microsatellite instability (MSI) of most tumors. Conclusion PCIF1 expression was correlated with cancer prognosis and immune infiltration, suggesting it to be a potential target for cancer therapy.


INTRODUCTION
Epitranscriptomics has been an attractive field of recent cancer researches. Posttranslational modification (PTM) of mRNA includes N4-acetylcytidine (ac4C), N1-methyladenosine(m1A), N6-methyladenosine (m6A), 5-methylcytosine (m5C), 5-hydroxymethylcytidine (hm5C), N6,2′-Odimethyladenosine (m6Am), 7-methylguanosine (m7G), inosine (I), and pseudouridine (Y). Among which, m6A is one of the most studied PTM in cancer researches. Accumulating evidence has suggested the participation of other mRNA modification in oncogenesis or cancer development. Zhang et al., for example, have found NAT10-mediated mRNA ac4C modification in promotion of gastric cancer (GC) metastasis and epithelial-to-mesenchymal transition (EMT) (1). The Y box binding protein 1 (YBX1)-mediated m5C modification promoted mRNA stability, which activated oncogenesis of urothelial carcinoma of the bladder (2). The m6Am was another established reversible RNA modification, conducive to mRNA translation, expression, and stability. Previous studies have identified phosphorylated CTD-interacting factor 1 (PCIF1) as the only known m6Am writer protein (3)(4)(5)(6)(7) and fat mass and obesity-associated protein (FTO) as an eraser protein for mRNA (7)(8)(9). METTL4 was a m6Am writer protein for U2 small nuclear RNA (snRNA), which regulated RNA splicing (10). PCIF1 was involved in glucose homeostasis through regulating the function or survival of pancreatic beta cell (11) and viral infectivity (12). PCIF1-mediated m6Am RNA modification was associated with reduced mice weight (13). Recently, Relier and colleagues have shown that inhibition of nuclear PCIF1-mediated m6Am modification in mRNA led to reduced colorectal cancer stem cell abilities, which was the first and the only study so far that revealed the role of m6Am of mRNA in cancer (14). However, biological and functional role of m6Am in cancer research remains obscure. We attempt to primarily explore the oncogenic and immunogenic role of PCIF1 on the basis of the pan-cancer analysis.

Expression Analysis
We obtained expression difference of PCIF1 between tumor and normal tissues based on Cancer Genome Atlas (TCGA) database. For those tumors that lack data for normal tissues, we used genotype-tissue expression (GTEx) database as a Supplementary Material. We also obtained violin plots of the PCIF1 expression difference in different pathological stages of TCGA tumors via the "Pathological Stage Plot" module of GEPIA2 (http://gepia2. cancer-pku.cn/) (15). The log2 (transcripts per million (TPM) +1) were applied as log-scale for violin plots. We also examined specificity of PCIF1 RNA expression in different tissues, blood cell lineages, and single cell types via the consensus dataset composed of Human Protein Atlas (HPA), GTEx, and FANTOM5 datasets on the Human Protein Atlas website (https://www.proteinatlas.org/). We applied p-value as the measurable indicator for statistical hypothesis during the whole study and suggested a significant difference for a result of p < 0.01, a statistical difference for a result of p < 0.05, and no statistical difference for p > 0.05 herein and in the rest of the content (16).

Genetic Alteration Analysis
We utilized the cBioPortal web (https://www.cbioportal.org/) to explore genetic alteration characteristics of SND1 by pressing "quick search" and entering PCIF1 (17,18). The expected results including a three-dimensional structure diagram of PCIF1, alteration frequency, mutation type, copy number alteration (CNA), and PTM sites via "Cancer Types Summary", "Plots", and "Mutations" module. We also obtained the differences of OS, DFS, progression-free survival (PFS), and disease-specific survival (DSS) for pan-cancer with or without PCIF1 alteration via the "Comparison/Survival" module and generated Kaplan-Meier plots with log-rank p-value.

PCIF1-Related Gene Enrichment Analysis
To figure out interplays of PCIF1 protein, we carried out a series of gene-enrichment analysis. We first obtained top 100 PCIF1binding proteins in Homo sapiens via the STRING database (http://string-db.org). Parameters we used were "evidence" for meaning of network edges ("evidence"), "low confluence (0.150)" for minimum required interaction score, "100" for max interactors, and no limitations/"experiment" for active interaction sources to get predicted functional partners of PCIF1. We used "Similar Gene Detection" module of GEPIA2 get top 100 PCIF1-correlated genes. Top 5 correlated genes were selected for further analysis. We applied the "Correlation Analysis" module of GEPIA2 to Pearson's test of the top 5 PCIF1correlated genes. Venn diagram was provided to obtain an intersection of 100 PCIF1 correlated genes from GEPIA2 and 100 interacted proteins from STRING with Evenn (http://www. ehbio.com/test/venn/). We also conducted global percentage and relation network diagram to depict pathway activity of PCIF1, its interacted proteins, and correlated genes via GSCALite, a powerful and widely applied tool for analyzing cancer multiomics and drug sensitivity (http://bioinfo.life.hust.edu.cn/ web/GSCALite/) (21).

PCIF1 Expression Analysis
In our study, we first assessed differential expression of PCIF1 among tumor and adjacent normal tissues in 28 types of tumors based on data collected from the TCGA database and GTEx database as Supplementary Materials. Expression of PCIF1 had a significant difference in 25 of 28 tumor types like adrenocortical carcinoma (ACC), breast invasive carcinoma (BRCA), colon adenocarcinoma (COAD), glioblastoma multiforme (GBM), kidney chromophobe (KICH), and prostate adenocarcinoma (PRAD) (p < 0.01), while no statistic difference in the rest of tumor types (p > 0.05) ( Figures 1A, B). Violin plots showed that the difference expression of PCIF1 had statistical significance in each pathological stages of COAD, KICH, kidney renal clear cell carcinoma (KIRC), liver hepatocellular carcinoma (LIHC), pancreatic adenocarcinoma (PAAD), and skin cutaneous melanoma (SKCM) ( Figure 1C). Additionally, we conducted total protein expression distribution of the PCIF1 among primary tumor and normal tissues from clinical proteomic tumor analysis consortium (CPTAC) dataset. Six available tumors were chosen including breast cancer, clear cell renal cell carcinoma (RCC), colon cancer, ovarian cancer (OV), lung adenocarcinoma (LUAD), and uterine corpus endometrial carcinoma (UCEC). The PCIF1 protein expression in UCEC and LUAD was much more higher in normal tissue than primary tumor (p < 0.01) on the contrary to that in clear cell RCC (p < 0.01). There was no significant difference in the PCIF1 protein expression between normal and tumor tissues in the rest of the three cancers including breast cancer, colon cancer, and OV (p > 0.05) ( Figure 1D). PCIF1 RNA specificity was low in different tissues, blood cell lineages, single cell types ( Figure 1E), and cell lines, which was not enclosed.

PCIF1 Survival Prognosis Analysis
The OS and RFS significance map of those who had statistical difference, namely, ACC, KIRC, kidney renal papillary cell carcinoma (KIRP), mesothelioma (MESO), SKCM, and thyroid carcinoma (THCA) were demonstrated ( Figure 2). As shown in Figure 2A, the high expression of PCIF1 was correlated with higher OS of ACC (p = 0.031), KIRC (p = 0.0079), KIRP (p = 0.02), and MESO (p = 0.014), while only high expression of PCIF1 was correlated with a better RFS (p = 0.023) in KIRC. Highly expressed PCIF1 was positively linked with RFS of THCA (p = 0.038) but had no statistic difference in OS of THCA (p > 0.05). Additionally, we have analyzed the correlation of PCIF1 expression and PFS and DSS data on TCGA datasets (Supplementary Figure S1). Interestingly, there is a correlation between highly expressed PCIF1 and poorer DSS of cervical squamous cell carcinoma and endocervical adenocarcinoma (CESC) (p = 0.0012) and COAD (p = 0.0482). The above data suggested that PCIF1 expression was differentially associated with the prognosis of specific tumor types and mostly presented a positive correlation with OS and RFS.

PCIF1 Genetic Alteration Analysis
Furthermore, genetic alteration status of PCIF1 were tested and shown ( Figure 3). COAD possessed the highest gene alteration frequency of PCIF1 (9.43%), with 11 of 594 (1.85%) "mutation" cases and 45 of 594 (7.58%) "amplification" cases. The highest mutation frequency (3.07%) of PCIF1 appeared in endometrial carcinoma. There was a fairly low frequency of structural variant in esophagogastric adenocarcinoma, nonsmall-cell lung cancer (NSCLC) and deep deletion in leukemia, NSCLC, and prostate adenocarcinoma ( Figures 3A, B). The total somatic mutation frequency was 0.9%, in which missense mutations were in the majority. R265H/C was detected in two cases of UCEC and one case of COAD ( Figure 3C), while the role of amino acid motifs remained unknown. However, altered PCIF1 had no correlation with OS, RFS, PFS, and DSS of pan-cancer and each tumor type (p > 0.05) ( Figure 3E).

PCIF1 Protein Phosphorylation Analysis
We also analyzed differences in phosphorylation of PCIF1 among normal tissues and tumor lesions (OV, colon cancer, clear cell RCC, UCEC, breast cancer, and lung adenocarcinoma) based on CPTAC dataset (Figure 4). PCIF1 phosphorylation level in OV had no significant difference among normal tissues and tumor sites ( Figure 4A, p > 0.05). The S116, S159, and S144  loci of PCIF1 possessed a higher phosphorylation level in colon cancer tissues, but the phosphorylation level at the T150 locus was higher in normal tissues with statistical difference ( Figure 4B, p < 0.05). An increased phosphorylation level was also seen in clear cell RCC and breast cancer tissues, with statistical difference, compared with normal tissues (Figures 4C, E, p < 0.05) while the situation was opposite in UCEC ( Figure 4D, p < 0.05). PCIF1 phosphorylation of the T158 locus was elevated in normal tissues compared with lung adenocarcinoma tissues; PCIF1 phosphorylation of the S116 locus had no statistical difference between the two groups. PCIF1 was predicted to be located intracellularly (nucleoplasm, microtubules, cytokinetic bridge). Whether the phosphorylation of PCIF1 has an effect on its location or its function (e.g., catalytic activity) remains unknown, requiring more investigations.

PCIF1 Immune Infiltration Analysis
We applied a series of analytical tool on TIMER2.0 including TIMER, CIBERSORT, EPIC, MCP-counter, quanTIseq, and xCell to investigate immune infiltration of PCIF1 ( Figure 5 and Supplementary Figure S2). Figure 5 displays the expression level of PCIF1 in CD8+ T cell, CD4+ T cell, neutrophil, myeloid dendritic cell (MDC), macrophage, and B cell in 33 tumor types, among which PCIF1 expression harbored a significant difference in immune cells in liver hepatocellular carcinoma (LIHC) and prostate adenocarcinoma (PRAD). The xCell had revealed that the significant difference of PCIF1 expression in monocytes and macrophages ("macrophage," "macrophage M1," and "macrophage M2") of GBM and sarcoma (SARC). Furthermore, we created linear regressions in ACC, KIRC, KIRP, and THCA to see how two variables fit. The selected tumors are those who had statistical difference not only in expression among normal tissue and tumor sites but also in OS or RFS ( Figure 5 and Supplementary Figure  S3). Six immune cell types were chosen according to xCell.

PCIF1 Expression and TMB or MSI
As TMB (23) and MSI (24) are considered biomarkers for immunotherapy, we also depicted the correlation between PCIF1 expression and TMB or MSI with bubble diagram to further evaluate its immunogenicity (Figure 7). The expression of PCIF1 was positively correlated with TMB of MESO, GBM, LUAD, and brain lower-grade glioma (LGG) but negatively correlated with TMB of COAD, THYM, PRAD, THCA, BRCA, and UCEC (p < 0.05). The expression of PCIF1 was positively correlated with MSI of KICH, KIRC, LIHC, and lung squamous cell carcinoma (LUSC) but negatively correlated with MSI of rectum adenocarcinoma (READ) (p < 0.05). Interestingly, there were no overlaps between TMB and MSI groups.

PCIF1-Related Gene Enrichment Analysis
A few progresses have been made in 2019 revealing the role of PCIF1 role as the only known m6Am methyltransferase, but still blank to cancer so far. Thus, we carried out a series of enrichment analysis of PCIF1 to explore its oncogenic role via STRING, GEPIA2, and GSCALite (21). Meanwhile, we gained a list of top 100 genes that have correlations with PCIF1 via the module of "similar gene detection" of GEPIA2 (Supplementary Table S1). Thirty-four proteins had an interaction with PCIF1 in Homo sapiens based on the STRING database; eight interacted proteins were identified by experiments, namely, PDX1, MNAT1, HDAC3, POLR2A, POLR2I, CAMSAP2, C19of43, and FANCM ( Figures 8A, B and Supplementary Table S2). Pearson's test displayed correlation of PCIF1 expression and top 5 correlated genes including ZSWIM1 (R = 0.55, p < 0.01), UBOX5 (R = 0.5, p < 0.01), SLC35C2 (R = 0.5, p < 0.01), IFT52 (R = 0.48, p < 0.01), and EHMT2 (R = 0.47, p < 0.01) ( Figure 8C). We also obtained global percentage of PCIF1, the abovementioned eight PCIF1-binding proteins (PDX1, MNAT1, HDAC3, POLR2A, POLR2I, CAMSAP2, C19orf43, FANCM), and top 5 correlated genes (ZSWIM1, UBOX5, SLC35C2, IFT52, EHMT2) of TCGA tumors, as well as interaction map of gene and pathway (Figures 9A, B). PCIF1 had an activation on DNA damage response and hormone androgen receptor (AR), while an inhibition on apoptosis ( Figure 9A). However, the Venn diagram suggested that there was no overlap between the interacted and correlated proteins ( Figure 9). Furthermore, we obtained heatplots of negative or positive differentially expressed genes and Spearman's correlation test of PCIF1-binding proteins in ACC, KIRC, KIRP, and TYHA ( Figures 10A, B). These four tumors were those with statistical difference in normal and tumor tissues and OS and RFS as mentioned before. We defined absolute statistic value (r value): strong correlation for 0.7-1.0, moderate correlation for 0.4-0.7, weak correlation for 0.2-0.4, and extreme weak correlation for 0.0-0.2 (25). A positive r value indicated a positive correlation and vice versa. We found that in ACC, PCIF1 had a positive moderate correlation with PHF20 (r = 0.649), C20orf43 (r = 0.640), and  metabolism, necroptosis in ACC, spliceosome, Notch signaling pathway in KIRC, ribosome, and metabolism (e.g., carbon, glycine, serine, threonine) in TYHA while negatively correlated with pathways associated with ribosome, DNA replication, cell cycle and mismatch repair in ACC, protein export, ferroptosis, proteasome in KIRC, natural killer cell-mediated cytotoxicity, antigen processing and presentation in KIRP, TNF, JAK-STAT, and NOD-like receptor signaling pathway in TYHA. However, the interaction mechanisms required more investigations.

DISCUSSION
Input of "PCIF1" on PubMed (https://pubmed.ncbi.nlm.nih.gov/? term=PCIF1) was quite limited, especially in cancer. Our study first analyzed PCIF1 expression and its correlation. It was shown that PCIF1 RNA expressions in most tumors was higher than that in normal tissues with significant differences but had a low specificity in tissue, blood cell lineage, each cell type, and each cell lines. Relier et al. have shown that the expression of PCIF1 and FTO was lower in metastatic cell lines, which may partially explain comparative expression in different cell lines (14). PCIF1 protein expression was different in clear cell RCC, LUAD, and UCEC. The expression of PCIF1 was correlated with OS or DFS of ACC, KIRC, KIRP, MESO, and THCA, suggesting its potential application as prognostic biomarker in the abovementioned tumors. Uncovering oncogenic or immunogenic role of PCIF1 may be of great beneficial for cancer therapy, especially for immunotherapy or targeted therapy. Thus, we provided a primary investigation of the function of PCIF1 in tumor and immune infiltration, accompanied by limitations of this study which can only be solved by experiments. First, we had found that phosphorylation of PCIF1 at different sites resulted in differential expression of protein. However, the role of PCIF1 phosphorylation remained unclear. Secondly, we carried out a set of immune analysis to show the correlation of PCIF1 expression and TMB, MSI, immune infiltrations, and ICG expressions. These results suggested that PCIF1 might be involved in the process of immune infiltration to some extent, providing theoretic basis for the tumor immune research of PCIF1.
Taking KIRC as an example, previous studies have shown that KIRC could be classified into C3 (inflammatory) immune subtype, characterized by an elevation of helper T cell 1 (Th1) and Th17 infiltration, lower proliferation capacity, and lower level of aneuploidy and somatic CNA (23). Our bioinformatics results demonstrated that PCIF1 expression was positively correlated with OS and DFS of KIRC (Figure 2, p < 0.05) and negatively correlated with KIRC Th1 infiltration (Supplementary Figure S2, p < 0.001). Furthermore, the study of Ko et al. has shown that ZSWIM1, the top correlated genes of PCIF1, was selectively downregulated in Th1 and could be considered a biomarker for activation or differentiation of CD4+ Th cells (26). These findings suggested a potential participation of PCIF1 in regulating Th1mediated immune modulatory effect in KIRC, while requiring experimental verifications.
Actually, though the studies of PCIF1 were quite limited, the past 4 years did witnessed great progress in this area. From 2018 to 2019, four critical studies have suggested PCIF1 as m6Am  (27). The study by Relier et al. showed that PCIF1 silence reduced colonosphere formation and m6Am/m6A ratio and caused resistance to chemotherapy in CRC1 colorectal cancer cell line (14). It remained obscure whether the participation of PCIF1 in oncogenesis and cancer development is dependent on m6Am modification. The mechanism of the oncogenic and immunogenic role of PCIF1 was a complicated puzzle. Our study tended to portray that the interaction, correlation network, and potential pathway activity were obtained from the TCGA databases with diverse formats like Pearson's test, Spearman's test, and KEGG analysis, hoping to guide the way for future mechanisms and functional studies of PCIF1 in cancer. However, more experimental studies are required to verify these bioinformatics results.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding authors.

AUTHOR CONTRIBUTIONS
W-LJ and X-PW conceived and edited the paper. M-ZJ and Y-GZ did the analysis. M-ZJ wrote the paper. All authors contributed to the article and approved the submitted version.