Construction of the miRNA-mRNA Regulatory Networks and Explore Their Role in the Development of Lung Squamous Cell Carcinoma

Purpose: MicroRNA (miRNA) binds to target mRNA and inhibit post-transcriptional gene expression. It plays an essential role in regulating gene expression, cell cycle, and biological development. This study aims to identify potential miRNA-mRNA regulatory networks that contribute to the pathogenesis of lung squamous cell carcinoma (LUSC). Patients and Methods: MiRNA microarray and RNA-Seq datasets were obtained from the gene expression omnibus (GEO) databases, the cancer genome atlas (TCGA), miRcancer, and dbDEMC. The GEO2R tool, “limma” and “DEseq” R packages were used to perform differential expression analysis. Gene enrichment analysis was conducted using the DAVID, DIANA, and Hiplot tools. The miRNA-mRNA regulatory networks were screened from the experimentally validated miRNA-target interactions databases (miRTarBase and TarBase). External validation was carried out in 30 pairs of LUSC tissues by Real-Time Quantitative Reverse Transcription PCR (qRT-PCR). Receiver operating characteristic curve (ROC) and decision curve analysis (DCA) were conducted to evaluate the diagnostic value. Clinical, survival and phenotypic analysis of miRNA-mRNA regulatory networks were further explored. Results: We screened 5 miRNA and 10 mRNA expression datasets from GEO and identified 7 DE-miRNAs and 270 DE-mRNAs. After databases screening and correlation analysis, four pairs of miRNA-mRNA regulatory networks were screened out. The miRNA-mRNA network of miR-205-5p (up) and PTPRM (down) was validated in 30 pairs of LUSC tissues. MiR-205-5p and PTPRM have good diagnostic efficacy and are expressed differently in different clinical features and are related to tumor immunity. Conclusion: The research identified a potential miRNA-mRNA regulatory network, providing a new way to explore the genesis and development of LUSC.


INTRODUCTION
Lung cancer is the leading cause of cancer death worldwide, accounting for approximately 18% of all cancer deaths (Bray et al., 2018). Lung cancer has the highest morbidity and mortality rate in China, accounting for 24.7% and 29.4%, respectively. There are four main histological types of lung cancer: adenocarcinoma, squamous cell carcinoma, small cell carcinoma, and large cell carcinoma, of which lung squamous cell carcinoma (LUSC) is one of the major histological subtypes (Travis et al., 2015). Survival rates for lung squamous cell carcinoma remain low, with 5-year survival rates of only 10-20% in most parts of the world. The diagnostic stage is a significant determinant of the prognosis of lung squamous cell carcinoma, and the detection of molecular markers for early diagnosis, prognosis, and therapeutic targets of LUSC has become very urgent (Allemani et al., 2015).
MicroRNA (miRNA) is a kind of single-stranded, non-coding small RNA molecule (containing about 22 nucleotides) found in plants, animals, and viruses. miRNA binds target mRNA to inhibit post-transcriptional gene expression and plays an essential role in regulating gene expression, cell cycle, and the timing of biological development (Bartel 2009). The expression patterns of miRNAs have been reported to have the potential to identify various types of cancers (Slattery et al., 2016;Liu et al., 2018;Zhang et al., 2019). Even though numerous studies have been conducted on miRNA expression and function in LUSC, systematic and comprehensive analysis of the miRNA-mRNA regulatory network based on clinical samples from LUSC are still lacking. The construction of a potential miRNA-mRNA regulatory network will help reveal the relatively comprehensive molecular mechanism of LUSC.
Here, we downloaded data from the GEO, TCGA-LUSC, miRcancer, and dbDEMC databases to screen for differential miRNAs (De-miRNAs) and mRNA (DE-mRNAs) between LUSC and normal tissue. The interactions between DE-miRNAs and DE-mRNAs were determined using Tarbase and miRTarbase databases, which are experimentally validated miRNA-target interaction databases. We further validated the DE-miRNAs and DE-mRNAs in 30 pairs of LUSC tissues by qRT-PCR. In summary, the interaction of the miRNA-mRNA regulatory networks have been researched in detail, providing new ideas or strategies for further development and application in clinical setting, for early diagnosis and better treatment.

Data Acquisition and Processing of miRNA and Gene Expression Profiles
We downloaded the miRNA and mRNA sequencing expression profiles and associated clinicopathological data of TCGA-LUSC from the GDC data portal at the National Cancer Institute (https://portal.gdc.cancer.gov/). We searched lung squamous cell carcinoma relevant gene microarray expression datasets from the Gene Expression Omnibus (GEO) database (http:// www.ncbi.nlm.nih.gov/geo/) with the following keywords: "lung squamous cell carcinoma". Filters were set to "series" and "Expression profiling by array", "Non-coding RNA profiling by array" and "Homo sapiens". The inclusive criteria of datasets is that the datasets have miRNA or mRNA expression values in lung squamous cell carcinoma tumors and normal tissues. Then we screened 5 eligible miRNA datasets and 10 eligible mRNA datasets. First, we screened differentially expressed miRNA (DE-miRNAs) or mRNA (DE-mRNAs) in each dataset. The DE-miRNAs and DE-mRNAs were obtained from microarray expression profiles using the web analysis tool GEO2R, which is used to compare groups of samples by the GEOquery and limma R packages from the Bioconductor project in the GEO database (http://www.ncbi.nlm.nih.gov/geo/geo2r/). The cut-off criteria were p-value < 0.05 and |log2 (fold change) | ≥ 1. Then, we summarized the differential miRNAs or mRNAs screened out from each dataset. We also collected differentially expressed miRNAs from miRCancer and Database of Differentially Expressed MiRNAs in human Cancers (dbDEMC) (Xie et al., 2013;Yang et al., 2017). An overview of the workflow steps is shown in Figure 1.

Identification and Function Analysis of miRNA-mRNA Networks
TarBase (Karagkouni et al., 2018) (http://www.microrna.gr/ tarbase) is a reference database, specifically for index experiments support the miRNA targets. It integrates information on cell-type specific miRNA-gene regulation, while hundreds of thousands of miRNA-binding locations are reported. miRTarBase (Chou et al., 2018) is a comprehensive annotated and experimentally validated miRNA-target interaction database for miRNA-related researches. Tarbase and miRTarBase are used to construct the miRNA-mRNA regulatory networks. We analyzed the correlation between miRNA and mRNA in TCGA-LUSC, and screened the miRNA-mRNA regulatory networks according to the results. The online program DAVID (http://david.abcc.ncifcrf.gov/), a database for annotation, visualization and integrated discovery, is a comprehensive tool for researchers and scholars to understand the biological meaning behind multiple genes. We used DAVID, DIANA-MirPath (Vlachos et al., 2015), a miRNA pathway analysis web-server, and Hiplot (https://hiplot.com.cn), a comprehensive web platform for scientific data visualization, to perform Gene Ontology (GO) functional analysis and Kyoto Encyclopedia of Genes Genomes (KEGG) pathways analysis. Single sample gene set Enrichment analysis (ssGSEA) is an extension of the GSEA method that allows the definition of an enrichment score that represents the absolute degree of enrichment of the gene set in each sample within a given dataset (Hoadley et al., 2018;Xiao et al., 2020). The ssGSEA data of TCGA-LUSC were downloaded from UCSC Xena (https://xena.ucsc.edu/) to analyze the possible enrichment pathways of DE-miRNAs and DE-mRNAs.  Table 1. Total RNA was extracted from FFPE specimens using the TIANGEN RNAprep Pure FFPE kit (Tiangen, Beijing, China). The acquired RNA from each sample was lysed in 40 μl RNase-free water and stored at −80°C until use. The concentration and purity of RNA were analyzed by the Nanodrop 2000 spectrophotometer (NanoDrop Technologies, Wilmington, DE, United States ) (A260/A280 = 1.8-2, A260/ A230 > 1.7).

Quantitative Reverse Transcription PCR (qRT-PCR) Assay
According to the manufacturer's instructions, the external validation was verified by qRT-PCR using PrimeScript RT reagent Kit (Takara) and SYBR Premix Ex Taq II (Takara) after adding a poly (A) tail to RNA by Poly (A) Polymerase Kit (Takara). The sequences of PCR primers are listed in Supplementary Table S1. This process was run on qTOWER³ 84 (Analytik Jena) at 95°C for 20 s, followed by 40 cycles of 10 s at 95°C, 20 s at 60°C. The expression levels of miRNAs and mRNAs in tissue were calculated using the 2 −ΔΔCt method (Livak and Schmittgen 2001). (U6 as endogenous reference miRNA for sample normalization; 18s rRNA as endogenous reference mRNA for sample normalization; ΔCt = Ct miRNA/mRNA − Ct normalizer ; Ct: the threshold cycle).

Evaluation of Interactions of miRNA-mRNA Networks and Tumor-Relative Phenotypes
We downloaded the infiltrating immune cell types data from the TCGA website and calculated using CIBERSORT (https:// cibersort.stanford.edu/index.php/), a general calculation method used to quantify cell fractions from a large number of tissue gene expression profiles (GEP). Combining support vector  regression with prior knowledge of the expression profile of purified white blood cell subsets, CIBERSORT was able to accurately estimate the immune component of tumor biopsy (Chen et al., 2018). The stromal and immune levels of TCGA-LUSC samples were evaluated using ESTIMATE (Estimation of STromal and Immune cells in MAlignant Tumour tissues using Expression data) software. This method uses gene expression signatures to infer the fraction of stromal and immune cells in tumour samples (Yoshihara et al., 2013). We used the UCSC Xena platform (https://xena.ucsc.edu/) to obtain the methylation level of CpG sites in TCGA-LUSC samples, which is detected by the Illumina Infinium HumanMethylation450 BeadChips platform. The amount of somatic variants per megabase (MB) of the genome was measured by tumor mutation burden (TMB).

Statistical Analysis
We used GraphPad Prism software, IBM SPSS Statistics v.26 software (IBM Corporation, Armonk, NY, United States ), and R language v3.6.3 (https://cran.r-project.org/) for data analysis. The t-test was performed to analyze DE-miRNAs and DE-mRNAs. The statistically significant criteria of DE-miRNAs and DE-mRNAs are |log2FC| >0.58 and p < 0.05. The area under the ROC curve (AUC) and decision curve analysis (DCA) based on logistic regression were used to evaluate the diagnostic efficacy of miRNA-mRNA networks. The Pearson correlation method was used to calculate the correlation between DE-mRNAs or DE-miRNAs and tumor-related phenotypes. Prognostic analysis was performed using the "survival" package.

Identification of Differentially Expressed miRNAs and mRNAs in LUSC
We screened 5 miRNA and 10 mRNA expression datasets from GEO, and the information of 15 GEO datasets is shown in Table 2. The log2 fold change (LUSC vs. NC) was used to screen the DE-mRNAs and DE-miRNAs. As shown in

Screening of miRNA-RNA Regulatory Networks Associated With LUSC
We further screened miRtarbase and Tarbase database to establish potential miRNA-mRNA regulatory networks, and screened out 5 miRNA-mRNA regulatory networks ( Figure 3A). The complete prediction network of miR-205-5p is shown in Supplementary Table S3. Then, we filtered out 4 miRNA-mRNA networks with significant correlation (adjusted p-value < 0.05) in TCGA-LUSC listed in Figure 3B and full statistical results listed in Supplementary Table S4.

Validation of the Expression of miRNAs and mRNAs in LUSC Tissue
In order to prove the differential expression of the DE-miRNAs and DE-mRNAs, we further validated them in 30 pairs of matched tumors and adjacent normal tissues by qRT-PCR. As shown in Figure 4, the expression of miR-205-5p (p < 0.0001) and UBE2C (p = 0.0093) were upregulated in tumor tissues, while the expressions of miR-140-3p (p = 0.0293), PTPRM (p < 0.0001), GPD1L (p = 0.0002), and FOXF2 (p < 0.0001) were downregulated in tumor tissues. At the same time, miR-182-5p (p = 0.2286) and miR-210-3p (p = 0.0879) showed no significant difference in tumor tissues compared with normal tissues. Spearman correlation analysis of the interaction between DE-miRNAs and DE-mRNAs, showed that miR-205-5p was significantly correlated with PTPRM (p = 0.0186, r = −0.3031). IHC images from the HPA database showed that PTPRM expression in LUSC was lower than in normal control shown in Supplementary Figure S2.

Correlation Analysis of LUSC Clinical-Pathological Features and Prognosis With the Expression of miRNA and mRNA Expression Levels
Based on the analysis of FIGO stages, there was no statistically significant difference in the expression of miR-205-5p and PTPRM in the early stage (I) and advanced stage (II + III + IV) (Supplementary Figures S3A,B). The expression of miR-205-5p in male patients was higher than that in female patients    Figures S3C,D), and PTPRM was higher in age over 68 years (Supplementary Figures S3E,F). Univariate and multivariate cox regression analyses were used to estimate the Hazard ratio (HR) of different clinical features in the TCGA-LUSC. K-M survival analysis and cox regression analysis showed that miR-205-5p and PTPRM were not associated with prognosis (Supplementary Figure S4).

Analysis of Tumor-Related Phenotypes Associated With miRNA-mRNA Network
We downloaded the ssGSEA enrichment score of TCGA-LUSC data from UCSC Xena, and then analyzed the correlation between expression values of miR-205-5p and PTPRM and enrichment score. The results showed that miR-205-5p and PTPRM were highly correlated with immune-related pathways, such as yaci and bcma stimulation of b cell immune responses and Translocation of ZAP-70 to Immunological synapse ( Figure 6A). Therefore, we further analyzed its correlation with immune cells to explore its role in tumor immunity. We conducted a differential analysis of the immune cell data in TCGA-LUSC and found that 13 types of immune cells differed between tumor and normal tissues listed in Supplementary Table S5. Then, the correlation analysis of differential immune cells with miR-205-5p and PTPRM were carried out in the following research. As shown in Figure 6B, the expression of miR-205-5p and PTPRM were correlated with dendritic cells activate. We used CIBERSORT to calculate the proportion of various immune cells in each TCGA-LUSC sample, ESTIMATE to estimate the stroma and immunity levels of the samples, and obtained the methylation levels of CpG sites in TCGA-LUSC samples from the UCSC Xena platform. In conclusion, miR-205-5p and PTPRM interact with DNA methylation, tumor immunity and inflammation in the tumor microenvironment. MiR-205-5p and PTPRM had no correlation with TMB ( Figure 6C).

DISCUSSION
The functional pattern of the miRNA-mRNA regulatory network has been shown in the occurrence and progression of various Frontiers in Molecular Biosciences | www.frontiersin.org May 2022 | Volume 9 | Article 888020 human diseases, including cancer (Muller and Nowak 2014; Wang et al., 2017;Cheng et al., 2018;Chen et al., 2020). In the current work, we aim to construct a potential miRNA-mRNA regulatory network in LUSC. Firstly, we researched the GEO database to screen the DE-miRNAs and DE-mRNAs preliminarily. The DE-miRNAs and DE-mRNAs showed consistent differential expression in 5 miRNA and 10 mRNA datasets. Then, the DE-miRNAs and DE-mRNAs preliminarily screened from GEO were further verified by TCGA-LUSC, miRcancer, and dbDEMC databases. Ultimately, 7 De-miRNAs (3 upregulated and 4 downregulated) and 270 DE-mRNAs (101 upregulated and 169 downregulated) showed consistent differential expression across the four databases. The miRNA-mRNA regulatory networks were verified by experimental screening from miRTarBase and Tarbase database, and 5 miRNA-mRNA regulatory networks were screened out. We used TCGA-LUSC data to conduct correlation analysis on the regulatory networks initially screened, and finally identified 4 miRNA-mRNA regulatory networks. We know that the DE-miRNAs are involved in various pathways related to fatty acid metabolism and degradation. Fatty acid metabolism and fatty acid degradation have recently been recognized as an essential aberration of metabolism required for carcinogenesis (Swierczynski, Hebanowska, and Sledzinski 2014). Energy metabolism reprogramming, which fuels fast cell growth and proliferation by adjustments of energy metabolism, has been considered as an emerging hallmark of cancer (Hanahan and Weinberg 2011). In cancer cells, the biosynthesis of fatty acids often increases to meet the needs of lipid synthesis membranes and signaling molecules. Cancer cells usually obtain higher lipid accumulation in lipid droplets than normal cells (Catalina-Rodriguez et al., 2012).
After a series of bioinformatics analyses and external experimental verification, upregulated miR-205-5p and downregulated PTPRM in tumor tissues were finally screened out. The expression of miR-205-5p is closely associated with the incidence and development of tumors, such as head and neck cancer, ovarian cancer and breast cancer (Iorio et al., 2007;Tran et al., 2007;Wu and Mo 2009). It has been reported that the expression of miR-205 in non-small cell lung cancer (NSCLC) tissues is significantly increased and is related to the degree of tumor differentiation, which may lead to increased proliferation and invasion of lung cancer cells, thus leading to cancer progression (Lebanony et al., 2009;Jiang et al., 2013;Duan et al., 2017;Jiang et al., 2017). PTPRM is involved in cell-cell adhesion through same-sex interactions and may play a key role in signal transduction and growth regulation (Aricescu et al., 2006). PTPRM is associated with the prognosis of cervical cancer and promotes tumor growth and lymph node metastasis .
In this study, a panel combined with the network was built using logistic regression analysis. The expression of miR-205-5p and PTPRM has a good diagnostic efficacy in distinguishing LUSC patients from normal controls. The correlation analysis results of miR-205-5p and PTPRM with ssGSEA and tumorassociated phenotypes showed that miR-205-5p and PTPRM seemed to have a certain correlation with tumor immunity.
Tumor-infiltrating immune cells, such as T cells, macrophages, and neutrophils, are critical elements in tumor microenvironment and have shown close association with the clinical outcomes of various cancers (Gonzalez, Hagerling, and Werb 2018). T cells are composed of different subtypes with complicated phenotypes and functions, and tumor-infiltrating T cells play an extremely important role in the immune response system. Thymic epithelial cells (TECs) are essential regulators of T cell development and selection. miR-205-5p inhibits thymic epithelial cell proliferation via FA2H-TFAP2A feedback regulation in age-associated thymus involution (Gong et al., 2020). In glioma cells, miR-205-5p regulates TGF-β1 by targeting SMAD2, thereby influencing the immune response of TGF-β1 in tumors (Flavell et al., 2010;Meulmeester and Ten Dijke 2011;Duan and Chen 2016). Smad1 mediates BMP signaling and is involved in cell growth, apoptosis, morphogenesis and immune response (Bharathy et al., 2008). In esophageal squamous cell carcinoma, Mir-205-5p affects tumor immune response by targeting SMAD1 (Liang et al., 2014). PTPRM can activate STAT3 signaling pathway to enhance the anti-cancer immune responses and rescuing the suppressed immunologic microenvironment in tumors Im et al., 2020).
CpG sites were identified in a region immediately upstream of the first exon of MIR205HG and in the miR-205 locus (Ferrari and Gandellini 2020). DNA methylation maintains the silencing of miR-205, thus playing a role in the occurrence and development of lung cancer (Tellez et al., 2011). The upregulation of FN1 reduced PTPRM by increasing its methylation, resulting in an increase of STAT3 phosphorylation and promoting GBM cell proliferation (Song et al., 2021). In general, miR-205-5p and PTPRM have a certain correlation with tumor immunity and global methylation, and we will further explore their role in tumor immunity in future experiments.
Although we have conducted a comprehensive analysis and experimental verification of the miRNA-mRNA regulatory networks involved in LUSC, the internal threats to the validity of this study be different platforms, study time, study population and experimental methods across different datasets, and limited sample size of external validation. Based on the above reasons, we selected DE-miRNAs or DE-mRNAs in each dataset to eliminate the influence caused by different datasets as much as possible. We will also expand the sample size and increase multi-center samples for further research in future experimental studies. External threats to the validity may require more experiments and clinical validation before they can be truly applied to clinical practice. In future research, we will try to cooperate with different centers and further evaluate the miRNA-mRNA regulatory networks based on larger sample sizes for more generalizable findings.