Pan-Cancer Analyses Reveal Oncogenic and Immunological Role of PLOD2

Some previous studies have shown that PLOD2 has some value in tumorigenesis. However, the broad significance of PLOD2 has not been discussed in depth. This study was aimed at elaborated and summarized the value of PLOD2 in various tumors. First, we integrated GTEx, The Cancer Genome Atlas and Cancer Cell Line Encyclopedia databases to analyze the expression of PLOD2, and found that it was expressed differently in normal tissues and significantly highly expressed in most tumors compared with normal tissues. Second, our analysis revealed that PLOD2 expression was negatively correlated with the prognosis of several tumors. For gastric cancer, the median overall survival time was significantly higher in the PLOD2 low expression group [HR 0.616 (95%CI 0.442–0.858), p = 0.004]. Third, for tumor immunity, PLOD2 was significantly associated with tumor infiltration, including immune infiltrating cells; immune checkpoint expression; immune microenvironment scores (immune score, stromal score and estimate scores); immunotherapy-related scores (tumor mutational burden, microsatellite instability, tumor neoantigen burden); expression of DNA repair genes Mismatch Repairs and methyltransferase; and enrichment analyses identified PLOD2-associated terms and pathways. Lastly, twenty pairs of gastric cancer and adjacent immunohistochemistry showed that PLOD2 was significantly overexpressed in gastric cancer (p < 0.001). Collectively, PLOD2 played a significant role in tumorigenesis and maybe serve as a potential biomarker for diagnosis and prognosis in cancers.


INTRODUCTION
The incidence of malignant neoplasms has increased at an alarming rate over the past few decades (Bray et al., 2021;Sung et al., 2021). Pan-cancer analysis aims to examine the similarities and differences between genomic and cellular changes found in different tumor types (Chang et al., 2013;Gentles et al., 2015). Pan-cancer analysis projects, such as the Cancer Cell Line Encyclopedia (CCLE) and The Cancer Genome Atlas (TCGA), were created based on the evaluation of different human cancer cell lines and tissues at the epigenomic, genomic, proteomic, and transcriptomic levels. TCGA provides medical researchers with irreplaceable genomic, epigenomic, transcriptomic, and clinical data (Hutter and Zenklusen, 2018). What's more, it has boosted the study of tumor immunology and immunotherapy (Thorsson et al., 2018). Pan-cancer analysis has made an important contribution to the development of life science and medicine. For example, on 4 February 2020, Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and TCGA published six articles in Nature (Author Anonyms, 2020), proposing the most comprehensive cancer genome analysis so far. Different from the previous focus on protein coding regions, this time is to analyze the whole genome of cancer. This program covers six aspects: pan-cancer analysis of whole genomes (ICGC/TCGA Pan-Cancer Analysis of Whole Genomes Consortium, 2020); analyses of non-coding somatic drivers in 2,658 cancer whole genomes (Rheinbay et al., 2020); the repertoire of mutational signatures in human cancer (Alexandrov et al., 2020); patterns of somatic structural variation in human cancer genomes ; the evolutionary history of 2,658 cancers (Gerstung et al., 2020); genomic basis for RNA alterations in cancer (Calabrese et al., 2020).
PLOD2 (Procollagen-Lysine,2-Oxoglutarate 5-Dioxygenase 2) was a member of PLOD family (PLOD1, PLOD2, PLOD3), which encodes a special protein (also known as LH2, TLH2 and BRKS2) mediating the formation of stabilized collagen cross-links (Genecards, 2021). Collagen crosslinking played a key role in extracellular matrix (Du et al., 2017). Various studies have shown the extracellular matrix (ECM) to be closely to tumor cell growth and metastasis (Gilkes et al., 2014;Tadeo et al., 2016). Upregulation of PLOD2 has been observed in several malignancies such as bladder cancer (Miyamoto et al., 2016), lung cancer (Kocher et al., 2021), gastric cancer (Song et al., 2021), head and neck squamous cell cancer (Xin et al., 2021), breast cancer (Gilkes et al., 2013), etc. Kiyozumi et al. showed that PLOD2 was significantly associated with peritoneal dissemination in gastric cancer (Kiyozumi et al., 2018). In the metastatic group, PLOD2 was significantly highly expressed, both at the mRNA and protein level. Silencing PLOD2 significantly reduced cell invasiveness and migration in vitro. Further experiments showed that this was mainly regulated by HIF-1a in hypoxia condition. Another study on PLOD2 and 5-FU resistance in gastric cancer showed that PLOD2 could enhance 5-FU resistance by regulating BCRP and inhibit cell apoptosis by affecting the expression of Bax and Bcl2 (Wang et al., 2020). Downregulation of PLOD2 facilitated the sensitivity of gastric cancer to 5-FU in vivo. Generally, PLOD2 plays an important role in both tumor growth and closely related to the prognosis of patients. Although a number of studies have been carried out on PLOD2, no single study exists which could overall evaluate its effects on considerable types of cancers. To understand the functions (especially cancer immunity) of PLOD2 in different tumors, a comprehensive pan-cancer analysis was necessary.
To that end, we will elucidate the expression of PLOD2 in 33 different malignant tumors in the following aspects and focus on gastric cancer. All in all, the results of our study provide information regarding the role of PLOD2 in tumors, reveal the relationship between PLOD2 and tumor-immune interactions, and clarify the potential underlying mechanisms.

Patient Datasets and Processes
This study has been approved by the Ethics Committee of Beijing Jishuitan Hospital (202004-58). First, we analyzed the PLOD2 gene expression levels in each normal tissue using the GTEx (Genotype-Tissue Expression) database (https://xena.ucsc.edu/). Second, the data of each tumor cell line were downloaded from the CCLE database (https://sites.broadinstitute.org/ccle), and the expression levels of 21 tissues were analyzed according to the tissue source. Third, we obtained gene expression differences between cancer and para-cancer tissues in individual tumor samples from the TCGA database (https://portal.gdc.cancer. gov/). Fourth, considering the small number of normal samples in TCGA database, we integrated data from GTEx and TCGA database to analyze the differences expression in multiple tumors.
Procollagen-Lysine,2-Oxoglutarate 5-Dioxygenase 2 Expression and its Survival-Associated Cancers The differences of PLOD2 gene expression were compared according to TNM stages of different tumors (data from TCGA database). Next, univariable and multivariable Cox regression analysis was used to compare the relationship between different PLOD2 expression (divided into high and low expression groups with the median cutoff value) and prognosis. Prognosis includes OS (overall survival; period from the start of treatment to death from any cause), DSS (disease specific survival; cancer survival in the absence of other causes of death), and PFI (progression free interval; period from the start of treatment to disease progression or death from any cause). Subsequently, our findings were verified in the GSE84433 cohort.
Procollagen-Lysine,2-Oxoglutarate 5-Dioxygenase 2 and Tumor Immunity We used CIBERSORT (Newman et al., 2019) to explore the association of PLOD2 gene expression with the level of immune infiltration in different types of cancer. CIBERSORT (https://CIBERSORT.stanford.edu/) is a tool for deconvolution of expression matrices of immune cell subtypes based on the principle of linear support vector regression, and can be used to estimate immune cell infiltration with data from RNA-Seq. Then we used xCell algorithm (Aran et al., 2017) and MCP-Counter algorithm (Becht et al., 2016) to verify the result of CIBERSORT.
Next, we analyzed the relationship between PLOD2 expression and TMB, MSI, and TNB. TMB is usually defined as the number of somatic nonsynonymous mutations or all mutations occurring per MB in the gene region detected by whole-exome sequencing or targeted sequencing in one tumor sample (Passaro et al., 2020). Somatic mutations calculated by TMB include point mutations and insertion/deletion mutations (Valero et al., 2021). TNB is an indicator of the total number of neoantigens in tumor cells, usually expressed as the number of tumor neoantigens per million bases of tumor genomic region . The combination of TMB and TNB can better predict the efficacy of immunotherapy. MSI, the insertion or loss of base pairs in microsatellite regions due to replication errors, was first identified in colorectal cancer and is thought to be a feature of hereditary non-polyposis colorectal cancer (Lynch syndrome) (Vilar and Gruber, 2010) and has since been found in a variety of sporadic tumors.
We downloaded the PLOD2 genetic mutation data, transcriptome data, and clinical data from the TCGA database. To identify the somatic mutations of the patients with PLOD2 in the TCGA database, mutation data were downloaded and visualized using the "maftools" package in R software. Horizontal histogram showed the genes have the higher mutation frequency in patients with PLOD2. Finally, we evaluated the relationship between the expression of PLOD2 and 5 DNA repair genes (MMRs: MLH1, MSH2, MSH6, PMS2, EPCAM) and 4 methyltransferases (DNMT1, DNMT2, DNMT3A, DNMT3B) genes.

Gene Set Enrichment Analysis
Using JAVA (http://software.broadinstitute.org/gsea/index.jsp), we conducted GSEA to assess for possible underlying mechanisms based on the 'Molecular Signatures Database' of c5.all.v7.1.symbols and c2.cp.kegg.v7.1.symbols. When the number of random sample arrangements was 100 and the significance threshold was p < 0.05, R software and Bioconductor (http://bioconductor.org/) were applied to visualize our results.

Immunohistochemical Staining
Tissue sections were prepared from the paraffin-embedded tissue samples. Then PLOD2 immunostaining was performed according to the instructions (proteintech 21214-1-AP, China). Immunohistochemical scoring was performed by semiquantitative analysis (20 pairs of gastric cancer and adjacent tissues). Two pathologists analyzed and scored the immunohistochemistry of gastric tissue. Each slice was randomly observed for 5 high-power visual fields, and scored according to the percentage of positive cells (0-5%, 6-25%, 26-50%, 51-75%, 76-100% were recorded as 0, 1, 2, 3, and 4 points respectively) and the intensity of staining (0, 1, 2, and 3 points respectively for non-staining, light, medium, and deep). The total score was the sum of staining intensity and percentage of positive cells. Next, we validated the expression of PLOD2 in STAD and normal tissues in the Human Protein Atlas (HPA) database (Uhlén et al., 2015).

Statistical Methods
The Wilcoxon log-rank test was used to determine the presence or absence of a markedly increased sum of gene expression z-scores for tumor tissues, as compared to adjacent normal tissues. The difference in PLOD2 expression between different tumor stages was compared using the Kruskal-Wallis H test. Survival was analyzed using the K-M curves, log-rank test, and Cox proportional hazards regression model. Spearman's test was used for correlation analysis. R language (version 3.6.0; R Foundation) was used for all analyses. A two-sided p < 0.05 indicated a statistically significant difference.

Pan-Cancer Expression Landscape of Procollagen-Lysine,2-Oxoglutarate 5-Dioxygenase 2
Firstly, we analyzed the expression levels of PLOD2 in 7858 normal samples using the GTEx dataset. As shown in Figure 1A, the differences in PLOD2 gene expression were significant (p < 0.001) in 31 tissues. Subsequently, we analyzed data downloaded from the CCLE database for each tumor cell line. There were significant differences in expression among the 21 tumor cell lines ( Figure 1B), with the highest in renal tumors. Further, we obtained the differences in PLOD2 from the TCGA database between cancer and para-cancer in individual tumor samples; and as shown in Figure 1C, PLOD2 was highly expressed in 11 (ESCA, GBM, HNSC, KIRC, KIRP, LGG, LIHC, LUAD, LUSC, STAD, UCEC) of 20 different tumors, lowly expressed in 4 tumors (COAD, KICH, PRAD, READ). Finally, considering the small number of normal samples in TCGA database, we integrated data from GTEx and TCGA database to analyze the differences expression in multiple tumors (Supplementary Table S1). As shown in Figure 1D  We further analyzed the relationship between PLOD2 gene expression and prognosis in gastric cancer patients in detail ( Figure 2). Figures 2A-C showed the relationship between PLOD2 gene expression and OS, DSS and PFI, respectively. In the OS analysis, as illustrated in Figure 2A, we split cases into high-risk and low-risk groups according to the median expression. The median OS time was significantly higher in The correlation coefficient with p value less than 0.05 is expressed in bold.  Figure 2D). In DSS analysis ( Figure 2B) and PFI analysis ( Figure 2C), hazard ratio was 1.77 (95% CI 1.16-2.69, p = 0.008) and 1.64 (95% CI 1.15-2.34, p = 0.006), respectively. AUC values of the classifier to predict 1-, 3-, and 5-year DSS were 0.608, 0.622, and 0.728, respectively ( Figure 2E). AUC values of the classifier to predict 1-, 3-, and 5-year PFI were 0.611, 0.629, and 0.663, respectively ( Figure 2F). As shown in Figure 3A, in the GSE84433 validation cohort (355 patients remained after deleting 2 patients who survived less than 1 month), the overall survival time of the PLOD2 low expression group was also significantly longer than that of the high expression group [HR 0.73, 95% CI (0.54-0.99), p = 0.041]. As shown in Table 1, univariate analysis showed that age (p = 0.005), T stage, N stage, M stage, pathologic stage and PLOD2 expression were significantly correlated with OS (all p < 0.05). However, in multivariate analysis, only age [HR: 1.731 (95% CI  analysis, the overall survival of PLOD2 high expression group was significantly worse than that of low expression group ( Figure 3B).
Next, we analyzed the relationship between PLOD2 expression and infiltrating immune cells in gastric cancer based on the xCELL algorithm. As shown in Figure 4B, the proportion of T cell CD4 + Th1, Macrophage, Macrophage M1, Plasmacytoid dendritic cell, B cell, Monocyte, Neutrophil, and Endothelial cell were significantly higher in the PLOD2 high expression group than low expression group. Contrarily, the proportion of B cell plasma, microenvironment score, T cell CD8 + effector memory, T cell CD8 + central memory, Class-switched memory B cell, B cell memory, Granulocyte-monocyte progenitor, Hematopoietic stem cell and stroma score were higher in the PLOD2 low expression group. we also used MCP-Counter deconvolution Frontiers in Genetics | www.frontiersin.org May 2022 | Volume 13 | Article 864655 9 methods to verify our results (Supplementary Figures S7A,B). In the GSE84433 validation cohort, macrophage M0 (p < 0.05) and macrophage M2 (p < 0.05) increased significantly in the PLOD2 high expression group, while T cell CD4 + memory activated (p < 0.001) decreased significantly ( Figure 3C).
Numerous studies indicated that the tumor immune microenvironment has an important role in tumor development. As shown in Table 2 Figure S3).

Relationship Between Procollagen-Lysine,2-Oxoglutarate 5-Dioxygenase 2 Somatic Mutation, Mismatch Repairs and DNA Methyltransferase
We downloaded mutect-processed mutation data from TCGA to analyze the mutation of PLOD2 gene in these tumors. As Supplementary Figure S6A demonstrated, the proportion of PLOD2 mutations in each tumor, which ranged from 5.09% (UCEC) to 0.23% (OV). Supplementary Figure S6B shown the distribution of mutations in the top 3 tumors, UCEC (5.09%), COAD (2.76%) and STAD (2.75%). In STAD, the most common of the was Missense Mutation, followed by Frame Shift Del and Nonsense Mutation.
MMRs (Mismatch Repairs) was the repair of nucleotide sequences to normal in DNA molecules containing mismatched bases. Thus, the MMR system was a safety and security system in vivo that maintains the integrity and stability of genetic material. As shown in Figure 6A, the expression of PLOD2 was positively correlated with MLH1, MSH2, and MSH6 in a variety of tumors. In STAD, PLOD2 expression was positively correlated with MLH1, MLH6, and PSM2.

Immunohistochemical Staining
As shown in Table 3, the average age of the 20 gastric cancer patients was 62 years old, and women accounted for 35%. Among the clinical stages, stage III or IV accounted for 55%, signet ring cell carcinoma accounted for 25%, and diffuse, intestinal and mixed accounted for 55%, 15%, and 30% respectively. We note that clinical stage III or IV in PLOD2-high group accounted for 75%, and Lauren's type was mainly diffuse and mixed; while clinical stage III or IV in PLOD2-high group accounted for 25%. Lauren's classification was mainly diffuse and intestinal. However, due to the small sample size, there was no significant difference. Figure 8 showed the immunohistochemical staining results of 20 pairs of gastric cancer and corresponding adjacent tissues. In the gastric cancer group, PLOD2 was significantly overexpressed, while the expression in adjacent tissues was low. The median values of the two groups were 4 and 2 respectively. There was significant difference in staining score (p < 0.001, Figure 8E). Figures  8A,B were cancer tissues, and Figures 8C,D were adjacent tissues. The results of HPA database also showed that the expression of PLOD2 in STAD was higher than that normal gastric tissue (Figures 8F,G).

DISCUSSION
The present study aimed to demonstrate a comprehensive workflow for pan-cancer analysis and to extensively investigate the role of PLOD2 as it related to various cancers. Based on our results, we found that PLOD2 overexpression was associated with prognosis in a variety of tumors (CHOL, HNSC, KIRC, KIRP, LAML, LUAD, MESO, PAAD, SARC, SKCM, and STAD) based on Cox proportional risk models and KM survival analysis. We focused on the relationship between PLOD2 expression and STAD. Univariate and multivariate analysis showed that the overall survival time of PLOD2 high expression group was significantly less than low group. Immunohistochemical results also showed that the PLOD2 expression in gastric cancer was significantly higher than normal tissues.
In order to investigate the research status of PLOD2 and gastric cancer, we conducted a search on the PubMed using the following search strategy: ["stomach neoplasms" (Title/ Abstract) OR "stomach neoplasms" (MeSH Terms) OR "gastric adenocarcinoma" (Title/Abstract)] AND ["PLOD2" (Title/Abstract)]. Finally, seven studies were found (Kiyozumi et al., 2018;Luo et al., 2020;Wang et al., 2020;Dai et al., 2021;Li et al., 2021;Song et al., 2021). Dai et al. constructed a prognostic model for five genes including PLOD2, which was subsequently validated by RT-PCR in normal tissue and gastric cancer cell lines (Dai et al., 2021). However, their study did not perform analyses related to tumor immunity (including infiltrating immune cells, TMB, etc.). Similarly, Li J et al. , Li SS et al. , Luo et al. (Luo et al., 2020), andSong et al.(Song et al., 2021) were also constructed multiple genes (including PLOD2) prognostic model, but all lacked tumor immune-related analysis or only had comparisons of different immune cell classifications. Kiyozumi et al. study showed that PLOD2 promotes cell invasion and migration in gastric cancer under hypoxic conditions and leads to dissemination to the peritoneum, in vitro (Kiyozumi et al., 2018). This might be even better when coupled with a PLOD2 knockout or overexpression mouse model. Wang et al. conducted a study on the relationship between PLOD2 gene expression and gastric cancer chemotherapy (Wang et al., 2020). Their results showed that PLOD2 knockdown in BGC823 cells significantly reduced the IC 50 value of 5-FU, which contributed to the reduction of migration and invasion and promoted apoptosis of gastric cancer cells. The opposite result appeared in PLOD2 overexpressing MGC803 cells. In vivo experiments showed that knockdown of PLOD2 gene enhanced the inhibitory effect of 5-FU on the growth of transplanted tumors in nude mice. It is particularly unfortunate that the study was only cellular and animal-based, and extrapolation to human gastric cancer requires further validation. In brief, all of the above studies have their own merits and there were many areas for further improvement also.
Our study showed that both OS, DSS, and PFI suggested that the prognosis of PLOD2 high group of was significantly worse than that low group. This may be related to the following reasons. Firstly, the immune cell infiltration in the low group was more abundant (DC, M1 macrophages, CD4 + T cells, CD8 + T cells higher than PLOD2 low group ( Figure 4B). Secondly, immune checkpoint gene was also significantly overexpressed in the high expression group. Thirdly, the tumor stroma score in the high group was significantly higher than low group (shown in Figure 4B). This indicates that the proportion of non-immune cells [e.g., cancer associated fibroblasts (Chen and Song, 2019)] was aplenty in the high group. Derks et al. showed that the infiltration of non-immune cells (such as fibroblasts and stromal cells) was associated with poor prognosis in gastroesophageal adenocarcinomas (Derks et al., 2020). In addition, immunohistochemical results showed that in the high PLOD2 group, the clinical stages were mainly stage III and stage IV, and the proportion of signet ring cell carcinoma was also higher. Signet ring cell carcinomas was usually "cold tumor" (i.e., lack of immune infiltration) (Garcia-Pelaez et al., 2021;Monster et al., 2022). Therefore, we speculated that the high expression of PLOD2 and poor prognosis may be related to immune infiltration and pathological types. However, further animal experiments were needed to prove it.
In conclusion, we have found that PLOD2 can serve as a valuable prognostic biomarker for some tumors, especially gastric cancer. We believe that these findings may lay the groundwork for prospective functional experiments and eventually have an impact in clinical work.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/ Supplementary Material.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Ethics Committee of Beijing Jishuitan Hospital. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.

AUTHOR CONTRIBUTIONS
QX and PG designed the manuscript. QX wrote and completed the manuscript. QX, NK, YZ, QW, XW, and XX completed the data download and analysis. All authors contributed to the article and approved the submitted version.

FUNDING
This study was supported by Beijing JST Research Funding (ZR-201919).