A cell cycle–related lncRNA signature predicts the progression-free interval in papillary thyroid carcinoma

The cell cycle plays a vital role in tumorigenesis and progression. Long non-coding RNAs (lncRNAs) are key regulators of cell cycle processes. Therefore, understanding cell cycle–related lncRNAs (CCR-lncRNAs) is crucial for determining the prognosis of papillary thyroid carcinoma (PTC). RNA-seq and clinical data of PTC were acquired from The Cancer Genome Atlas, and CCR-lncRNAs were selected based on Pearson’s correlation coefficients. According to univariate Cox regression, least absolute shrinkage and selection operator (LASSO), and multivariate Cox regression analyses, a five-CCR-lncRNA signature (FOXD2-AS1, LOC100507156, BSG-AS1, EGOT, and TMEM105) was established to predict the progression-free interval (PFI) in PTC. Kaplan–Meier survival, time-dependent receiver operating characteristic curve, and multivariate Cox regression analyses proved that the signature had a reliable prognostic capability. A nomogram consisting of the risk signature and clinical characteristics was constructed that effectively predicted the PFI in PTC. Functional enrichment analyses indicted that the signature was involved in cell cycle– and immune-related pathways. Furthermore, we also analyzed the correlation between the signature and immune cell infiltration. Finally, we verified the differential expression of CCR-lncRNAs in vitro using quantitative real-time polymerase chain reaction. Overall, the newly developed prognostic risk signature based on five CCR-lncRNAs may become a marker for predicting the PFI in PTC.


Introduction
Thyroid cancer (TC) is the most common endocrine system malignancy and accounts for 3.4% of all cancers diagnosed globally each year (1), of which approximately 85% of TC cases are papillary thyroid carcinoma (PTC) (2). Although PTC has a good outcome, with a five-year survival rate of more than 97% (3), approximately 30% of patients exhibit 2 Materials and methods

Data collection
HTseq-FPKM and HTseq-count data of TC samples were obtained from TCGA (https://cancergenome.nih.gov/, accessed January 24, 2022). After removing TC samples of other pathological types, a total of 502 PTC samples and 58 adjacent normal samples as well as the corresponding clinical information were obtained from TCGA.

Construction and validation of a CCR-lncRNA prognostic signature
Post-operative recurrence and metastasis are the main factors leading to the poor prognosis of PTC; accordingly, the PFI was chosen as the endpoint of this study, instead of overall survival. A total of 498 PTC samples with complete PFI information were included after removing samples with incomplete clinical information and prognosis less than 30 days. The samples were categorized into training (249) and test (249) cohorts at a 1:1 ratio by the "caret" package. Univariate Cox regression analyses were primarily used to screen CCR-lncRNAs that were associated with PFI in the training cohort (p< 0.05). To reduce noise caused by gene interactions and co-expression patterns, we applied LASSO-Cox regression to filter lncRNAs; the formula is as follows: where the b is the regression coefficient, x is the gene expression level, and p is an adjusted parameter decided by 10-fold cross validation. Parameter optimization was repeated 1000 times for all lncRNAs. These lncRNAs were utilized to construct a Cox proportional hazards model. The Akaike information criterion was adopted to select the appropriate model. The risk score for each patient (P RS ) was calculated with the following formula: Thereafter, the cut-off value for the risk score was determined using the "survminer" package and patients were separated into high-and low-risk groups. The optimal cut-off values for the training, test and entire cohort were 2.026, 1.998 and 2.038, respectively. Kaplan-Meier (K-M) survival curves was plotted using the "survival" package. Time-dependent receiver operating characteristic (ROC) curves was generated using the "timeROC" package to assess the predictive accuracy of the model. In addition, clinicopathological differences between the high-and low-risk groups were analyzed using the Wilcoxon rank sum test.

Independent prognostic value determination and nomogram construction
To validate the CCRLSig as an independent prognostic indicator for PFI in PTC, the signature risk score and clinical parameters (age, gender, tumor size, T stage, N stage, AJCC stage, multifocality, aggressiveness, anatomic site of the tumor, and BRAFV600E mutation status) were subjected to univariate and multivariate Cox regression analyses.
A nomogram incorporating clinical and pathological factors and the risk score was established using the "survival" and "rms" R packages to predict 1-, 3-and 5-year PFI in the entire cohort. The calibration curves and ROC curves were used to estimate the predictive accuracy of the nomogram.

Functional enrichment analysis
DEGs between the high-and low-risk groups based on the CCRLSig were identified using DEseq2 with a |log FC| > 0.5 and p.adjust< 0.05 as thresholds, after which Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway, and gene set enrichment analysis (GSEA) of these DEGs were conducted using the R "clusterProfiler" package.

Tumor mutation burden (TMB) and immune cell infiltration
The somatic mutation data of PTC patients were acquired from TCGA. The TMB was compared between the high-and low-risk groups using the "maftools" package. The TMB was measured as follows: (total mutations/total covered bases) × 10^6 for each patient with PTC.
To further explore if the risk signature was associated with immune cell infiltration in PTC, immunity-associated data were extracted from xCell (http://xCell.ucsf.edu/) and parameters were compared between the high-and low-risk groups by the Wilcoxon rank sum test. In addition, correlations between the CCR-lncRNAs and immune cell fractions were evaluated.

Cell culture
The human normal thyroid cell line (Nthy-ori3-1) was provided by Professor Hongmei Shen (

Quantitative real-time PCR validation
Total RNA was extracted from the cell lines using TRIzol reagent (Thermo Fisher Scientific, Waltham, MA, USA), and reverse transcribed into cDNA using the Roche Reverse Transcription Kit (Roche, Basel, Switzerland). QRT-PCR was performed using a real-time PCR instrument (Q5). Glyceraldehyde 3-phosphate dehydrogenase (GAPDH) served as a control. The primers for lncRNAs were provided by Sangon Biotech (Shanghai, China) and are listed in Table S1. The relative expression level of each lncRNA was calculated by the 2 -△△CT method.

Statistical analysis
The R statistical environment (V4.1.2) and GraphPad Prism 8.0 were used for statistical analyses. Analysis of variance, Wilcoxon rank sum test, and Student's t-tests were applied to evaluate differences between groups. The K-M analysis and log-rank test were adopted for comparing PFIs between groups. Values of p< 0.05 were considered statistically significant difference for all analyses.

Construction of a CCRLSig predicting PFI in PTC
The 498 PTC samples with complete clinical information were divided into training and test cohorts at a ratio of 1:1. In the training cohort, 31 CCR-lncRNAs associated with PFI were selected by univariate Cox analyses. Subsequently, we applied LASSO-Cox regression analysis and established a prognostic signature consisting of five CCR-lncRNAs (FOXD2-AS1, LOC100507156, BSG-AS1, EGOT, and TMEM105; Figures 1A, B). The K-M survival curves for the five CCR-lncRNAs relating to the PFI are shown in Supplementary Figures S1A-E. Next, the risk score for each patient in the training cohort was calculated using the Next, according to the optimal cut-off value of the risk score, patients were grouped into high-and low-risk groups. As the risk score increased, patients' prognosis worsened (Figures 2A, B). The expression levels of TMEM105, EGOT, FOXD2-AS1, and BSG-AS1 in the CCRLSig increased and the expression level of LOC100507156 decreased as the risk score increased ( Figure 2C). K-M analysis showed that PTC patients in the high-risk group had a shorter PFI than that of patients in the low-risk group ( Figure 2D). The area under the ROC curve (AUC) values for the risk score for 1-, 3-, and 5-year PFI were 0.784, 0.722, and 0.681, respectively ( Figure 2E).

Validation of the CCRLSig
To validate the accuracy of the signature, patients in the test cohort and the entire cohort were separated into high-and low-risk groups according to the optimal cut-off value for each dataset. The risk curves, PFI status, and heatmaps of risk lncRNA expression profiles in the test cohort and entire cohort were consistent with those of the training cohort ( Figures 2F-H, K-M). Similarly, a K-M curve analysis of the test cohort and entire cohort indicated that the high-risk groups had a shorter PFI than the low-risk groups (Figures 2I, N). In the test cohort, the AUC values for the risk score for 1-, 3-, and 5-year PFI were 0.723, 0.677, and 0.668, separately ( Figure 2J). In the entire cohort, the AUC values for the risk score for 1-, 3-, and 5-year PFI were 0.757, 0.693, and 0.673, separately ( Figure 2O). These results suggest that the five-CCR-lncRNA risk signature has good predictive performance for PFI in PTC.

Independent prognostic value of the CCRLSig
Univariate and multivariate Cox regression analyses were employed to explore whether the risk score based on the CCRLSig predicts PFI in PTC. Univariate Cox analyses showed that the risk score was notably correlated to the PFI in the training, test, and entire cohorts ( Figures 3A-C). A multivariate Cox analysis proved that the risk score was an independent predictor for the PFI of PTC in the training, test, and entire cohorts ( Figures 3D-F).

Correlations between the CCRLSig and clinicopathological characteristics
In correlation analyses, the risk scores based on the CCRLSig were higher for patients older than 55 years than for patients younger than 55 years ( Figure 4B). The risk scores of tumor size >2 cm were higher than those for tumor size ≤ 2 ( Figure 4C). The risk scores for N1 were higher than those for N0 ( Figure 4D). We also found that the risk score tended to increase with T stage and AJCC stage (Figures 4E, F), implying the critical role of the signature in the progression of PTC. Additionally, the risk score was significantly higher in patients with the BRAFV600E mutation than in those without the mutation ( Figure 4H). Nevertheless, there were no significant correlations between the risk score and gender or M stage (p > 0.05, Figures 4A, G).

Construction and assessment of a nomogram
In the entire cohort, we built a nomogram for the prediction of the 1-, 3-, and 5-year PFI of PTC based on the signature's risk score and significant clinicopathological features (P< 0.05) identified in univariate Cox regression analyses, including the pathological T stage, N stage, AJCC stage, tumor size, age, and aggressiveness ( Figure 5A). The calibration curve indicated that the nomogram-predicted PFI at one, three, and five years was highly consistent with the practically observed PFI (Figures 5B-D). Furthermore, the AUCs of the nomogram for evaluation of 1-, 3-, and 5-year PFI were 0.796, 0.711, and 0.681, respectively, and the predictive performances were superior to those of other clinical characteristics (age and N stage; Figures 5E-G). These results suggest that the nomogram reliably predicts the 1-, 3-, and 5-year PFI in PTC.

Functional enrichment analysis
To investigate the molecular mechanisms and pathways by which the signature is related to the risk of PTC progression, we carried out GO and KEGG enrichment analyses and GSEA of DEGs between the two risk groups. The GO enrichment analysis demonstrated that the DEGs were enriched in multiple biological processes and molecular functions, including cellular calcium ion homeostasis, positive regulation of MAPK cascade, cell-substrate adhesion, positive regulation of cell-cell adhesion, humoral immune response, T cell mediated immunity, and chemokine activity ( Figure 6A). The KEGG analysis showed that the DEGs were involved in the PI3K-Akt, MAPK signaling pathway, cytokine-cytokine receptor interaction, cell adhesion molecules, antigen processing and presentation, and IL-17 signaling pathway ( Figure 6B). The GSEA revealed that the DEGs were enriched in the cell cycle pathway and several immune-related biological processes ( Figure 6C), including the cell cycle, P53 signaling pathway, cell cycle checkpoints, innate immune system, antigen response, and MHC class II antigen presentation.

Analysis of TMB
To reveal genetic variation in risk score subtypes, we compared the TMB between the high-and low-risk groups. Compared with the low-risk group, the high-risk group had a markedly higher TMB (p< 0.01, Supplementary Figure S2A). The top 20 mutated genes in two risk group are shown in Supplementary Figures S2B, C. We observed that the mutation rate of BRAF was markedly higher in the high-risk group (84%) than in the low-risk group (56%). Supplementary Figures S2D, E show a complete view of the somatic mutations in the high-and low-risk groups.

Relationship between the signature and immune cell infiltration
The functional enrichment analysis displayed that the CCRLSig may be associated with immunity. Hence, we further analyzed the relationships between the signature and immune cell infiltration. The relative frequencies of infiltrating immune cells in all PTC patients are shown in Figure 7A. The high-risk group exhibited higher immune scores than the lowrisk group ( Figure 7B). The fractions of B-cells, CD4+ memory T cells, class-switched memory B-cells, DC, macrophages, NKT, Th2 cells, and Tregs in the high-risk group were markedly higher than those in the low-risk group (p< 0.05). In contrast, the fractions of CD4+ Tcm, CD8+ T cells, and CD8 + Tcm cells in the high-risk group were lower than those in the low-risk group ( Figure 7C). Additionally, we explored correlations between the expression levels of the five lncRNAs in the signature and the infiltration of multiple immune cells in PTC ( Figure 7D). These results implied that the signature is linked to immune cell infiltration and may regulate immune processes in PTC.

Validation of the expression levels of five CCR-lncRNAs in cell lines
We analyzed the differential expression of the five lncRNAs between normal and PTC tissues in TCGA data, as illustrated in Figure 8A, and thereafter verified the results in cell lines. The expression levels of FOXD2-AS1, LOC100507156, BSG-AS1, EGOT, and TMEM105 were notably higher in TPC-1 cells than in Nthy-ori3-1 cells (Figures 8B-F), which was in line with the results of the bioinformatics analysis, thereby supporting the accuracy of our analysis.

Discussion
The cell cycle is closely related to the growth and proliferation of cancer cells (21), and numerous lncRNAs related to the progression of cancers via cell cycle regulation have been identified (22). Understanding the expression levels of these lncRNAs and their combined regulatory patterns is crucial for determining patient outcomes and prognosis. Therefore, we constructed a cell cycle- Frontiers in Endocrinology frontiersin.org related lncRNA signature and explored its prognostic capability in PTC patients. We screened out CCR-lncRNAs and divided PTC samples into training and test cohorts to establish and validate a prognostic signature. Univariate Cox and LASSO-Cox regression analyses were employed to construct a CCRLSig for predicting the PFI of PTC in the training cohort. The prognostic value of the CCRLSig was supported by K-M curve analysis, ROC curve analyses, and multivariate Cox analysis. Furthermore, a nomogram illustrated that the signature has excellent predictive power. To further understand the clinical application of the CCRLSig, we investigated its association with clinicopathological characteristics and observed that a high risk score was positively correlated with age, tumor size, BRAFV600E mutation, AJCC stage, N stage, and T stage. These results indicated that the CCRLSig effectively predicts outcome and can better guide risk stratification for PTC management.
To better understand the underlying mechanisms by which this CCRLSig affects the prognosis of PTC, we performed a functional enrichment analysis of DEGs between the two risk groups. GO and KEGG analyses indicated that these DEGs were enriched in the following terms and pathways: cell-substrate adhesion, cell adhesion molecules, PI3K-Akt signaling pathway, MAPK signaling pathway, humoral immune response, T cell mediated immunity, and the IL-17 signaling pathway, all of which are associated with tumor proliferation, migration and immunity. GSEA also demonstrated that these DEGs were primarily engaged in cell cycle-and immune-related signaling pathways. These results imply that CCR-lncRNAs can affect the progression of PTC by regulating cell cycle-and immune-related signaling pathways, providing new directions for the treatment of PTC.
Recent studies had indicated the key roles of lncRNAs in the regulation of cancer immunity, including lncRNAs involved in immune cell differentiation, proliferation, trafficking, and infiltration (23). For example, the lncRNA HOXA-AS2 promotes Treg proliferation and immune tolerance in glioma (24), and LINC00887 promotes clear cell renal cell carcinoma progression by inhibiting the infiltration of CD8+ T cells (25). CCRG signatures are potential indicators of immune cell infiltration, immune evasion, and immune responses (26)(27)(28). Our previous functional enrichment analysis has shown that the CCRLSig was involved in immune processes. Thus, we further analyzed immune cell infiltration in the two risk groups. The high-risk group had infiltrates with higher proportions of B-cells, CD4+ memory T cell, DC, macrophages, NKT, Th2, and Tregs than the low-risk group, and the low-risk group primarily showed the infiltration of CD8+T cells, CD4+ Tcm, and CD8+Tcm cells. Studies have shown that Tregs and DCs play crucial roles in tumor immune escape (29-31) and promote tumor progression. However, CD8+ T cells exert an antitumor effect in PTC (32). Our results demonstrated that lowrisk patients had a lower risk of immune evasion and may be more responsive to immunotherapy. Additionally, in PTC, DCs are significantly related to tumor T stage (T3/T4) and lymph node metastasis (33). Tregs show elevated infiltration in the thyroid tissue of PTC patients and were positively correlated with an advanced disease stage (34). CD8+ T cell infiltration is correlated with a lower incidence of lymph node metastasis and favorable prognosis in TC (35,36). These findings further suggest that the low-risk group has a better prognosis. In summary, our findings suggested that the CCRLSig is associated with tumor immunity and can predict the immune landscape in PTC patients. In addition, these CCR-lncRNAs may be targets for immunotherapy.
Most of the lncRNAs in our signature have been previously reported to be implicated in cancers. For example, lncRNA FOXD2-AS1 has been discovered to be upregulated in PTC and correlated with a poor prognosis (37), consistent with our results. In addition, lncRNA FOXD2-AS1 promotes the progression of multiple cancers by participating in several biological processes, such as chemoresistance, proliferation, migration and invasion (38)(39)(40). The lncRNA EGOT may play different roles in different types of cancers. It promotes the progression of hepatocellular carcinoma (41), colon cancer (42), and gastric cancer (43). However, another study has shown that the lncRNA EGOT inhibits the progression of breast carcinoma (44) and renal cell carcinoma (45). The lncRNA TMEM105, a ferroptosis and immune-related lncRNA, serves as prognostic and diagnostic biomarker for patients with breastinfiltrating duct and lobular carcinoma (46). The lncRNA BSG-AS1 contributes to the proliferation and metastasis of hepatocellular carcinoma via maintaining BSG mRNA stability (47). However, LOC100507156 has not yet been reported in cancer and requires further investigations. These previous findings indicate that CCR-lncRNAs participate in the progression of multiple types of tumors, further indicating that it is reasonable to develop a risk signature based on CCR-lncRNAs to determine prognosis in PTC. In addition, the expression differences of the five CCR-lncRNAs were verified at the cellular level.
Although the newly constructed CCRLSig may be applied to predict the outcome of PTC, our study had some deficiencies. First, the dataset used to construct and validate the prognostic signature based on CCR-lncRNAs was obtained only from TCGA. Additional external data from other public databases are needed to evaluate the reliability of the signature. Second, we conducted a preliminary expression study of five CCR-lncRNAs in the signature at the cellular level. However, further functional analyses and mechanistic studies are needed. We will conduct more in-depth studies to verify the performance of our CCRLSig.
In summary, we developed a new CCRLSig that can reliably predict the PFI of PTC, providing a new direction for the prognostic management and treatment of PTC.

Data availability statement
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/ Supplementary Material.

Author contributions
SL and HQ conceived and designed the study, SL wrote the manuscript, SL and MR collected the data and performed bioinformatics analysis. All authors contributed to the article and approved the submitted version

Funding
The present study was supported by the National Natural Science Foundation of China (Nos:82073491 and 81872560).