Validating HMMR Expression and Its Prognostic Significance in Lung Adenocarcinoma Based on Data Mining and Bioinformatics Methods

Hyaluronic acid-mediated motility receptor (HMMR), a tumor-related gene, plays a vital role in the occurrence and progression of various cancers. This research aimed to reveal the role of HMMR in lung adenocarcinoma (LUAD). We first obtained the gene expression profiles and clinical data of patients with LUAD from The Cancer Genome Atlas (TCGA) database. Then, based on the TCGA cohort, the difference in HMMR expression between LUAD tissues and nontumor tissues was detected and verified with public tissue microarrays (TMAs), a clinical LUAD specimen cohort, and a Gene Expression Omnibus (GEO) cohort. Logistic regression analysis and the chi-square test were adopted to study the correlation between HMMR expression and clinicopathological parameters. The effect of HMMR expression on survival was evaluated by Kaplan–Meier survival analysis and the Cox regression model. Furthermore, Gene Set Enrichment Analysis (GSEA) was utilized to screen out signaling pathways related to LUAD, and co-expression analysis was employed to build the protein–protein interaction (PPI) network. The HMMR expression level in LUAD tissues was markedly higher than that in nontumor tissues. Logistic regression analysis and the chi-square test demonstrated that high HMMR expression in LUAD was associated with gender, pathological stage, T classification, lymph node metastasis, and distant metastasis. The Kaplan–Meier curve suggested a poor prognosis for LUAD patients with high HMMR expression. Multivariate analysis implied that high HMMR expression was an independent predictor of poor overall survival (OS). GSEA indicated that a total of 15 signaling pathways were enriched in samples with the high HMMR expression phenotype. The PPI network gave 10 genes co-expressed with HMMR. HMMR may be an oncogene in LUAD and is expected to become a potential prognostic indicator and therapeutic target for LUAD.


BACKGROUND
According to the GLOBOCAN 2018 estimation, there were more than 2,000,000 new cases of lung cancer worldwide, accounting for 11.6% of new cancer cases, which makes lung cancer the most common malignant tumor in the world (1). Approximately 1,700,000 patients died of lung cancer, accounting for 18.4% of total cancer-related deaths (2). Thus, lung cancer is considered the leading cause of cancer-related deaths worldwide. Non-small cell lung cancer (NSCLC) patients account for nearly 85% of all lung cancer cases, and almost 50% of them suffer from LUAD, which is the most common lung malignancy (3). In the past two decades, significant advancements have been achieved in the treatment of NSCLC, including tyrosine kinase inhibitors and immunotherapy (4). However, durable responses are rare and the prognosis is still very poor, with a 19% overall 5-year survival rate in the United States and a worldwide lung cancer mortality-to-incidence ratio of 0.87 (5). Consequently, it is essential to identify a biomarker that can predict the prognosis of LUAD with high sensitivity and specificity and act as a target for LUAD treatment.
Hyaluronan, an extracellular matrix component, not only absorbs water in tissues but also regulates the proliferation of stem cell populations (6). It was reported that hyaluronan receptors were highly expressed in stem cells isolated from normal tissues (7). HMMR, a protein related to centrosomes and microtubules, is one of several well-defined receptors for hyaluronan (8). On the one hand, the central region of human HMMR is a coiled-coil stem, which can serve as a potential dimerization domain and binding region for other proteins (9). Specifically, the interaction between HMMR and CHICA/FAM83D is based on amino acids 365-546 (10), and amino acids 574-602 can act as a calcium-dependent calmodulin-binding domain (11). On the other hand, the two microtubule-binding subdomains of human HMMR are located at amino acids 40-59 and 76-90, respectively (12). Besides, the microtubule-binding subdomain at amino acids 76-90 is encoded by exon 4 (12). On the basis of these domains, HMMR can directly bind to microtubules (12). Moreover, the conserved C-terminal bZip motif in HMMR overlaps with the designated B-X7-B motif, and the two motifs bind to hyaluronic acid and heparin in an ionic manner (13). Because of these structural characteristics and chemical properties, HMMR is capable of binding to microtubules via its N-terminus and localizing to the centrosome by virtue of its C-terminal bZip motif. Also, serving as a binding partner of different spindle assembly factors, HMMR can regulate the assembly, stability, and position of spindle microtubules during mitosis and meiosis (14).
In recent years, a growing number of researchers have begun to study the mechanism of HMMR in the occurrence and development of LUAD. Several representative studies are as follows. With the aim of understanding potential mechanisms of LUAD recurrence, Stevens et al. (15) studied the network of extracellular matrix (ECM) molecules by analyzing Affymetrix data and RNA-seq data. It turned out that the overexpression of HMMR in primary LUAD was associated with a poor prognosis (15). Further, reducing HMMR in LUAD cells could decrease their ability to induce lung cancers and distant metastases (15). Afterward, by utilizing step-wise multivariate Cox analysis, He et al. (16) developed a novel eight-gene prognostic signature (CDCP1, HMMR, TPX2, CIRBP, HLF, KBTBD7, SEC24B-AS1, and SH2B1) for early-stage NSCLC. This signature might be helpful for personalized treatment decisions. Furthermore, applying a data-mining method, Liu et al. (17) designed a risk score staging system to predict the prognosis of LUAD patients. They identified a four-gene signature (AGRN, AKR1A1, DDIT4, and HMMR) to divide LUAD patients into a high-risk group and a low-risk group. Analogously, utilizing an mRNA-mining method to analyze mRNA expression profiles in large LUAD cohorts, Zhang et al. (18) confirmed a total of nine genes (HMMR, B4GALT1, SLC16A3, ANGPTL4, EXT1, GPC1, RBCK1, SOD1, and AGRN) that were significantly associated with metastasis and OS in LUAD patients.
Although some studies have explored potential associations between HMMR and LUAD, evidence for HMMR as a biomarker is still insufficient and the specific mechanism of HMMR in LUAD is still unclear. Based on the TCGA and GEO databases, this research investigated the associations between the HMMR expression level and the clinicopathological characteristics of LUAD, as well as the prognostic significance of HMMR, in order to provide more evidence for the potential role of HMMR in LUAD. Then, GSEA was implemented to deepen the understanding of the signaling pathways involved in HMMR regulatory networks related to LUAD. Meanwhile, the PPI network was constructed to predict genes co-expressed with HMMR.

RNA-Sequencing and Clinical Data From TCGA
On or before October 7, 2020, the raw gene expression data for 497 LUAD tissues and 54 adjacent nontumor tissues were obtained from the TCGA database. The workflow type of each LUAD case was HTSeq-FPKM. In the same way, we obtained the raw expression data of other common cancers. The corresponding clinical data of LUAD patients were also obtained. These clinical data contain information on age, gender, pathological stage, T stage, N stage, M stage, and vital status (Table 1). This research is in full compliance with the guidelines of the National Institutes of Health (NIH) TCGA human subject protection and data access policies.

Online Bioinformatics Analysis
Two publicly available online databases, TIMER2.0 (http://timer.cistrome.org/) and UALCAN (http://ualcan.path.uab.edu/), were utilized to observe the mRNA and protein levels of HMMR in human pan-cancer. The PPI network for HMMR was constructed and visualized based on the STRING database (http://string-db.org/). The correlation of mRNA levels between HMMR and its co-expressed genes in the PPI network was analyzed with the GEPIA online database (http://gepia.cancer-pku.cn/).

HMMR Expression and Survival Analysis
First, Perl was applied to process raw gene expression data from the TCGA database. HMMR expression data were then extracted from the processed data with the limma package. To visualize HMMR expression data, the limma package and beeswarm package were utilized to draw a scatter difference chart and a paired difference chart. The survival data of patients with LUAD were extracted from the downloaded clinical information, and samples without survival time or survival status information were filtered out. We matched the complete survival information of each sample with its HMMR expression data and obtained 458 LUAD samples that met the requirements. According to the median HMMR expression, the 458 LUAD patients were divided into two groups (a high HMMR expression group and a low HMMR expression group). Based on the two groups, we drew the Kaplan-Meier survival curve by utilizing the survival package.
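The median split and the Kaplan-Meier estimate described above can be illustrated with a minimal Python sketch. It is not the authors' R workflow (which used the survival package); the function names `split_by_median` and `kaplan_meier` and the toy data are hypothetical, and samples with expression exactly equal to the median are assigned to the low group by assumption.

```python
from statistics import median


def split_by_median(expression):
    """Label a sample 'high' if its expression exceeds the cohort median, else 'low'."""
    cutoff = median(expression)
    return ["high" if value > cutoff else "low" for value in expression]


def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  : follow-up time of each patient
    events : 1 if death was observed, 0 if the patient was censored
    """
    curve = []
    survival = 1.0
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        at_risk = sum(1 for ti in times if ti >= t)
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        survival *= 1.0 - deaths / at_risk  # product-limit update
        curve.append((t, survival))
    return curve


# Toy data (hypothetical values, for illustration only)
expression = [2.1, 8.4, 3.3, 9.0, 1.2, 7.7]
months = [50, 12, 44, 20, 60, 30]
died = [0, 1, 1, 1, 0, 1]

groups = split_by_median(expression)
for label in ("high", "low"):
    idx = [i for i, g in enumerate(groups) if g == label]
    curve = kaplan_meier([months[i] for i in idx], [died[i] for i in idx])
    print(label, curve)
```

Each group's curve is a step function that drops only at observed death times; censored patients leave the risk set without changing the estimate.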

RNA Extraction and qRT-PCR
According to the manufacturer's instructions, we utilized TRIzol reagent (Invitrogen) to extract total RNA from fresh frozen LUAD tissues and adjacent nontumor tissues. Then, TransScript One-Step gDNA Removal and cDNA Synthesis SuperMix was used for the reverse transcription reaction. The primer sequences for PCR amplification were as follows: HMMR, forward: 5′-AACAAGCTGAAAGGCTGGTCA-3′, reverse: 5′-GGGTATGAGCAGCACTACTTTT-3′.

Verification of HMMR From GEO and Human Protein Atlas
Adopting "lung" and "adenocarcinoma" as search terms, and "Homo sapiens" as the qualifier, we searched for microarrays that met experimental requirements in the GEO database. After excluding datasets with a sample size of less than 40, seven eligible datasets remained (GSE101929, GSE11969, GSE18842, GSE21933, GSE27262, GSE32863, and GSE75037). As shown in Table 2, the selected datasets included 359 LUAD tissue samples and 271 nontumor tissue samples. Using Review Manager 5.3 software, a meta-analysis was applied to evaluate the differences in HMMR expression between LUAD samples and nontumor samples. We calculated the combined value on the basis of the standardized mean difference (SMD) with a 95% confidence interval (CI). Meanwhile, the heterogeneity between the seven selected datasets was assessed by the chi-square (χ²) and I² statistical tests. If p > 0.05 or I² < 50%, a fixed-effect model was chosen to calculate the combined effect; otherwise, a random-effect model was selected (p < 0.05 or I² > 50%). In addition, taking "HMMR," "lung," and "adenocarcinoma" as search terms, we also obtained immunohistochemical results of patients with LUAD from the Human Protein Atlas database (http://www.proteinatlas.org).
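As a rough illustration of the pooling step, the sketch below combines per-study SMDs by inverse-variance weighting and switches to a DerSimonian-Laird random-effects model when heterogeneity is high. It is a simplification under stated assumptions, not a reproduction of Review Manager 5.3: the function name `pool_smd` is hypothetical, the per-study SMDs and variances are taken as given inputs, and only the I² criterion (not the χ² p-value) is used to choose the model.

```python
import math


def pool_smd(effects, variances, i2_threshold=50.0):
    """Pool per-study SMDs with inverse-variance weights.

    A fixed-effect model is used when I^2 is below the threshold
    (assumption: the p-value criterion is ignored in this sketch);
    otherwise a DerSimonian-Laird random-effects model is used.
    """
    k = len(effects)
    w = [1.0 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)

    # Cochran's Q and the I^2 heterogeneity statistic (in percent)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    i2 = max(0.0, (q - (k - 1)) / q) * 100.0 if q > 0 else 0.0

    if i2 < i2_threshold:
        model, tau2 = "fixed", 0.0
    else:
        # Between-study variance estimated by the method of moments
        c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
        tau2 = max(0.0, (q - (k - 1)) / c)
        model = "random"

    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return {
        "model": model,
        "smd": pooled,
        "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
        "i2": i2,
    }


# Hypothetical inputs: two strongly disagreeing studies trigger the random model
print(pool_smd([0.0, 2.0], [0.01, 0.01]))
```

With a between-study variance of zero the random-effects weights reduce to the fixed-effect weights, so the same pooling code serves both models.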

Univariate and Multivariate Cox Regression Analyses
Based on the TCGA dataset and the Cox regression model, we conducted univariate and multivariate analyses and calculated the hazard ratio (HR) and 95% CI. Simultaneously, we performed a quantitative assessment of the predictive value of clinicopathological parameters and HMMR expression on survival. By adjusting for confounding factors, the independent prognostic effect of HMMR on survival was estimated. Concretely, Perl was utilized to process raw clinical data of LUAD patients and delete samples with incomplete clinical information. Then, the processed clinical data were matched with the HMMR expression data, and LUAD patients were divided into the high HMMR expression group or the low HMMR expression group according to the HMMR expression value. Finally, the data of 316 LUAD patients were used to perform univariate and multivariate Cox regression analyses.

Gene Set Enrichment Analysis
GSEA is used to determine whether a given set of genes displays statistically significant and consistent differences between two biological states (19). The GSEA software was adopted to seek signaling pathways associated with HMMR by comparing datasets with low and high HMMR expression. In the GSEA software, the annotated gene set (c2.cp.kegg.v6.2.symbols.gmt) was regarded as the reference gene set. In each analysis, we set 1,000 genes to identify distinct pathways and gene-set permutations were implemented 1,000 times. The normalized enrichment score (NES), nominal p-value, and false discovery rate (FDR) q-value were given to denote the importance of associations between gene sets and pathways.
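The statistic at the core of GSEA is a weighted running sum over a ranked gene list. The sketch below shows that running-sum enrichment score only; it does not perform the permutations needed for the NES, nominal p-value, or FDR q-value, and it is not the GSEA software itself. The function name `enrichment_score`, the gene labels, and the ranking scores are hypothetical.

```python
def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Running-sum enrichment score of one gene set over a ranked gene list.

    ranked_genes : genes ordered from most to least associated with the phenotype
    scores       : ranking metric of each gene, in the same order
    gene_set     : collection of gene names forming the pathway
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_miss = len(ranked_genes) - sum(hits)
    hit_norm = sum(abs(s) ** weight for s, h in zip(scores, hits) if h)
    if n_miss == 0 or hit_norm == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")

    running, best = 0.0, 0.0
    for s, h in zip(scores, hits):
        if h:
            running += abs(s) ** weight / hit_norm  # step up on a set member
        else:
            running -= 1.0 / n_miss                 # step down otherwise
        if abs(running) > abs(best):
            best = running                          # keep the maximum deviation
    return best


# Hypothetical ranked list: the set {"A", "B"} sits at the top of the ranking
genes = ["A", "B", "C", "D"]
metric = [4.0, 3.0, -1.0, -2.0]
print(enrichment_score(genes, metric, {"A", "B"}))
```

A positive score means the gene set is concentrated near the top of the ranking (here, the high-expression phenotype), and a negative score means it is concentrated near the bottom.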

Statistical Analysis
A variety of statistical methods were adopted to achieve a comprehensive analysis. The Mann-Whitney U test was utilized to detect HMMR expression differences between LUAD tissue samples and nontumor tissue samples. The Kruskal-Wallis test was applied to inspect HMMR expression differences among multiple groups. The correlation between HMMR expression and each clinicopathological parameter was evaluated through the χ² test. The log-rank test was performed to compare the survival rate difference between the high HMMR expression group and the low HMMR expression group. The Cox regression model was employed to carry out univariate and multivariate survival analyses. In all of these analyses, p < 0.05 was considered statistically significant.
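To make the tumor versus nontumor comparison concrete, here is a minimal sketch of a two-sided Mann-Whitney U test. It uses the normal approximation without continuity or tie corrections, which is an assumption of this sketch and may differ from the routine the authors ran; the function name `mann_whitney_u` and the input values are hypothetical.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, no tie correction).

    Returns (U, p_value), where U is the smaller of the two U statistics.
    """
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])

    # Assign average ranks to tied values
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        j = i
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    n1, n2 = len(x), len(y)
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, combined) if group == 0)
    u1 = rank_sum_x - n1 * (n1 + 1) / 2.0
    u = min(u1, n1 * n2 - u1)

    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return u, p_value


# Hypothetical expression values for nontumor and tumor samples
print(mann_whitney_u([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))
```

For small samples an exact test is preferable to the normal approximation used here.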

HMMR Is Upregulated in Pan-Cancer and LUAD According to Public Databases
To determine whether HMMR is involved in human cancers, we first analyzed the expression of HMMR in different types of cancers using the UALCAN database. We observed that HMMR mRNA expression was significantly higher in most human cancers compared with the corresponding normal tissues (Figure 1A). Consistent with this result, HMMR was also found to be notably upregulated in numerous human solid tumors according to the TIMER2.0 database (Figure 1B). Then, further analysis of HMMR mRNA expression was carried out for 551 tissues (497 LUAD tissues and 54 adjacent nontumor tissues) from the TCGA database. The LUAD tissues displayed a significantly higher HMMR mRNA expression level than the normal tissues (Figure 1C). This result was also validated by data from 54 tumor tissues and paired adjacent nontumor tissues of LUAD patients in the TCGA database (Figure 1D), and by a CPTAC cohort containing 111 cancer tissues and 111 adjacent nontumor tissues (Figure 1E). Moreover, as seen in Figure 1A, HMMR is overexpressed in various cancers. To further analyze the specificity of HMMR for LUAD diagnosis, we calculated the fold change of HMMR expression between normal samples and tumor samples for common cancers. After filtering out cancers for which samples could not be obtained or for which fewer than 10 normal samples were available, we obtained the fold change of HMMR in 12 different cancers. As listed in Table S1, the log2 fold change of HMMR expression was greater than 3 in only three cancers (LUAD: log2FC = 3.22, LUSC: log2FC = 3.32, LIHC: log2FC = 3.99). From the perspective of HMMR expression, HMMR therefore has a certain specificity for LUAD diagnosis. The specificity of HMMR for LUAD diagnosis at the level of regulatory mechanisms will be explored in our future work.
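The fold-change screen described above can be sketched as follows. The sketch assumes that the fold change is the ratio of mean tumor expression to mean normal expression, which the text does not state explicitly; the function names, the cohort labels, and the expression values are hypothetical.

```python
import math
from statistics import mean


def log2_fold_change(tumor, normal):
    """log2 of the ratio of mean tumor expression to mean normal expression."""
    return math.log2(mean(tumor) / mean(normal))


def fold_changes(cohorts, min_normal=10):
    """Compute log2FC per cancer, skipping cohorts with too few normal samples.

    cohorts : {cancer_name: (tumor_values, normal_values)}
    """
    result = {}
    for name, (tumor, normal) in cohorts.items():
        if len(tumor) == 0 or len(normal) < min_normal:
            continue  # mirrors the filter on unavailable or small normal groups
        result[name] = log2_fold_change(tumor, normal)
    return result


# Hypothetical cohorts: "Y" is dropped because it has only 5 normal samples
cohorts = {
    "X": ([8.0] * 3, [1.0] * 10),
    "Y": ([8.0] * 3, [1.0] * 5),
}
print(fold_changes(cohorts))
```

A log2FC of 3 corresponds to an eightfold difference in mean expression, which helps when reading the values reported in Table S1.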

Validation of HMMR Upregulation in LUAD by Public TMAs, qRT-PCR, and SMD
To characterize the HMMR expression status, we analyzed the HMMR protein expression in clinical specimens from the Human Protein Atlas database (http://www.proteinatlas.org). We found that HMMR showed positive expression in LUAD tissues and negative expression in normal lung tissues (Figure 2A). To further verify the difference in HMMR expression observed in the TCGA database, HMMR mRNA expression was validated in a clinical LUAD cohort containing 32 pairs of fresh frozen tissue specimens, revealing that HMMR was upregulated in LUAD tissues compared with adjacent nontumor tissues (Figure 2B, p < 0.001). Moreover, a comprehensive meta-analysis of HMMR expression data for LUAD patients in the GEO datasets (Table 2) was performed. As depicted in Figure 2C, the I² value was 85% (p < 0.001) and the combined SMD of HMMR was 2.28 under the random-effect model (95% CI: 1.68-2.89). These results indicated that HMMR was highly expressed in LUAD.

HMMR Is Associated With Malignant Progression in Patients With LUAD
Through an in-depth study of HMMR mRNA expression data in the TCGA database, we found that the expression of HMMR differed among groups classified according to pathological stage (p < 0.001, Figure 3A and p < 0.05, Figure 3E), T classification (p < 0.001, Figure 3B), N classification (p < 0.001, Figure 3C), M classification (p < 0.05, Figure 3D), and histological grade (p < 0.05, Figure 3F). To further explore the relationship between HMMR expression and clinicopathological parameters, the clinical data of 316 LUAD patients were obtained from the TCGA database. As depicted in Table 3, the high HMMR expression level was significantly correlated with gender (p = 0.043), pathological stage (p = 0.003), T stage (p = 0.033), lymph node metastasis (p = 0.001), and distant metastasis (p = 0.042). In Table 4, by adopting logistic regression analysis, we observed that the upregulated expression of HMMR mRNA in LUAD was significantly related to gender (OR = 1.594 for male vs. female, p = 0.012), pathological stage (OR = 2.139 for stage II vs. stage I, p = 0.008; OR = 2.316 for stage III vs. stage I, p = 0.005; OR = 2.574 for stage IV vs. stage I, p = 0.014), T classification (OR = 1.845 for T2 vs. T1, p < 0.001), and lymph node metastasis (OR = 2.195 for positive vs. negative, p < 0.001).
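For a single binary clinical parameter, the odds ratio from an unadjusted logistic regression equals the cross-product ratio of the 2x2 table, and the χ² test can be computed from the same four counts. The sketch below illustrates both under that assumption; the function names and the counts are hypothetical and are not taken from Tables 3 or 4.

```python
import math


def odds_ratio_2x2(a, b, c, d):
    """Odds ratio with a 95% Wald confidence interval.

    a : group 1 with high expression    b : group 1 with low expression
    c : group 2 with high expression    d : group 2 with low expression
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    lower = math.exp(math.log(odds_ratio) - 1.96 * se_log)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se_log)
    return odds_ratio, (lower, upper)


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) and p-value, 1 df."""
    n = a + b + c + d
    statistic = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with one degree of freedom
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts, e.g. node-positive vs. node-negative patients
print(odds_ratio_2x2(20, 10, 10, 20))
print(chi_square_2x2(20, 10, 10, 20))
```

An odds ratio above 1 with a confidence interval that excludes 1 indicates that high expression is more frequent in the first group.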

High HMMR Expression Is Related to Poor Survival in Patients With LUAD
The Kaplan-Meier estimate was applied to evaluate the prognostic role of HMMR in LUAD. Compared with the low HMMR expression group, the high HMMR expression group had a poorer OS (Figures 4A, D and S3). Besides, in Figure 4A, the median OS of the high HMMR expression group was 34.77 months, whereas that of the low HMMR expression group was 49.93 months. The 5-year survival rate of the high HMMR expression group (29.7%) was also lower than that of the low HMMR expression group (37.5%) in Figure 4A. Figure 4B and Table 5).
To further evaluate the accuracy of HMMR as a prognostic gene, we drew the receiver operating characteristic (ROC) curve of each prognostic gene (HMMR, ANGPTL4, CDCP1, DDIT4, and SLC16A3) and calculated the area under the curve (AUC) value of each gene. The process of obtaining the other genes with prognostic potential (ANGPTL4, CDCP1, DDIT4, and SLC16A3) is elaborated in the supplementary material (Tables S2 and S3, and Figure S1). As described in Figure 4C, the performance of HMMR is better (AUC = 0.743).
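The AUC has a simple rank interpretation that can be sketched directly: it is the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half. The function name `roc_auc` and the example scores are hypothetical; how the authors defined the positive class and the score is not stated in this section.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    scores : predictor value per sample (e.g. gene expression)
    labels : 1 for the positive class (e.g. death), 0 for the negative class
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties contribute half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical scores: three of the four positive/negative pairs are ordered correctly
print(roc_auc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0]))
```

An AUC of 0.5 corresponds to a predictor that ranks cases no better than chance, and 1.0 to perfect separation.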

Identification of HMMR-Related Signaling Pathways and 10 Genes Co-Expressed With HMMR
GSEA was performed to investigate the possible mechanism of HMMR in promoting LUAD progression. We first divided TCGA samples into high and low HMMR expression groups. According to the normalized enrichment score (NES), false discovery rate (FDR) q-value, and nominal (NOM) p-value, 15 signaling pathways significantly enriched in the high HMMR expression phenotype were identified: cell cycle, oocyte meiosis, ubiquitin-mediated proteolysis, RNA degradation, basal transcription factors, progesterone-mediated oocyte maturation, pyrimidine metabolism, nucleotide excision repair (NER), spliceosome, p53 signaling pathway, DNA replication, protein export, small cell lung cancer, RNA polymerase, and regulation of autophagy (Figures 5A, S2, and Table 6). Based on the STRING database, we also identified 10 genes co-expressed with HMMR (TOP2A, PTTG1, DLGAP5, ASPM, CEP55, CENPF, NCAPG, BUB1, PBK, and CDK1) according to the confidence score and built the corresponding PPI network (p-value = 8.03e-10) (Figure 5B). Moreover, a correlation analysis by GEPIA demonstrated that the mRNA levels of TOP2A, PTTG1, DLGAP5, ASPM, CEP55, CENPF, NCAPG, BUB1, PBK, and CDK1 were significantly associated with HMMR, indicating that HMMR may work with these genes to promote LUAD progression (Figure 5C).
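The pairwise mRNA correlation reported by GEPIA can be illustrated with a plain Pearson coefficient. Treating the analysis as a Pearson correlation is an assumption of this sketch, since the section does not name the coefficient used; the function name `pearson_r` and the expression vectors are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long expression vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical expression of HMMR and a co-expressed gene across three samples
hmmr = [1.0, 2.0, 3.0]
partner = [2.0, 4.0, 6.0]
print(pearson_r(hmmr, partner))
```

A coefficient close to 1 indicates that the two genes rise and fall together across samples, which is what co-expression means in this context.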

DISCUSSION
Research on the role of HMMR in the occurrence and progression of tumors has recently attracted widespread attention. Many studies have reported that HMMR is abnormally expressed in different types of cancers. With the development of high-throughput sequencing technology, a variety of gene expression data for LUAD samples have been uploaded to public databases, such as GEO and TCGA, which provides opportunities for biomarker discovery and validation. In this study, by making full use of the publicly available databases TIMER2.0 and UALCAN, we first analyzed the HMMR expression level among various human tumors. These results implied that HMMR gene expression was upregulated in lung adenocarcinoma, bladder urothelial carcinoma, breast invasive carcinoma, clear cell renal cell carcinoma, cholangiocarcinoma, esophageal carcinoma, stomach adenocarcinoma, and other cancers compared with their matched adjacent normal tissues. Then, we focused on the expression of HMMR in lung adenocarcinoma.
In the current study, we sought to determine the role of HMMR expression in LUAD progression, especially as a prognostic biomarker for LUAD. Additionally, we screened the signaling pathways associated with HMMR in LUAD to reveal the potential mechanism by which HMMR regulates LUAD development. First, on the basis of the RNA-seq data from the TCGA cohort, we observed that the HMMR expression level in LUAD tissues was significantly higher than that in nontumor tissues, and the result was confirmed at the mRNA and protein levels by qRT-PCR and public TMAs. Also, a meta-analysis was implemented on the GEO cohort to compare the HMMR expression level between LUAD tissues and nontumor tissues, which reached the same conclusion (18). These results indicated that HMMR may serve as an oncogene and play an important role in LUAD initiation and progression. Moreover, it was observed that the HMMR expression level differed among groups classified by pathological stage, T stage, N stage, and M stage. Further analysis of the relationship between HMMR expression and clinicopathological parameters showed that the high HMMR expression level was significantly correlated with pathological stage, T classification, lymph node metastasis, and distant metastasis. In 2018, Song et al. (20) proposed that the overexpression of HMMR in LUAD was correlated with late pathological stage and reduced OS. Later, He et al. (16) put forward that HMMR could be regarded as part of a prognostic signature in early-stage NSCLC. All in all, these studies implied that HMMR expression at the mRNA level is related to some important clinicopathological parameters. Kaplan-Meier survival analysis indicated that the prognosis of the high HMMR expression group was worse than that of the low HMMR expression group. Besides, the univariate analysis showed that high HMMR expression was related to poor OS. Pathological stage, T stage, and N stage were also associated with the prognosis of LUAD patients. In general, we found that HMMR could be treated as an independent prognostic factor for the OS of LUAD patients, which supports its potential as a biomarker for LUAD.
The GSEA method was utilized to analyze signaling pathways of HMMR in LUAD. The results showed that cell cycle, oocyte meiosis, ubiquitin-mediated proteolysis, RNA degradation, basal transcription factors, progesterone-mediated oocyte maturation, pyrimidine metabolism, NER, spliceosome, p53 signaling pathway, DNA replication, protein export, small cell lung cancer, RNA polymerase, and regulation of autophagy were correlated with the progression of LUAD. The operation of the cell cycle is precisely regulated by various factors inside and outside the cell, and internal factors are the basis for regulation. Disorders of the cell cycle and abnormal cell proliferation may lead to carcinogenesis, and each aspect of the cell cycle regulation system may be the main factor leading to carcinogenesis (21). The ubiquitin pathway plays a key role in regulating cell growth and proliferation by controlling the abundance of cyclins. Besides, unscheduled proteolysis of many cell cycle regulators contributes to tumorigenesis (22). Abnormal degradation of RNA may inhibit gene expression, thereby inducing cancer or promoting cancer progression (23). Basal transcription factors are necessary for the initiation of RNA polymerase II transcription and can maintain the basal level of transcription. Once the basal level of transcription is imbalanced, it will affect the function of RNAs and then induce the occurrence and progression of tumors (24). Moreover, abnormal pyrimidine metabolism plays a certain role in the process of tumor invasion and metastasis (25). NER can eliminate structurally unrelated DNA lesions through a multistep "cut and patch" reaction. Further, the global genome NER sub-pathway prevents mutagenesis by detecting helix-distorting damage in the genome. Therefore, defects in global genome NER may cause cancers (26). Splicing is an important regulatory step in the process of gene expression, and abnormal splicing regulation is a common feature among various cancers. Specifically, these cancers may be caused by mutations that disrupt the splicing of specific genes or by the general loss of spliceosomal function, thereby affecting many gene targets (27). p53 stabilizes the genome by interacting with different signal transduction pathways in cells, thereby regulating various cellular processes. Mutations in p53 are associated with genomic instability and an increased susceptibility to cancers (28). Furthermore, we constructed the PPI network by utilizing the STRING database and identified 10 genes that are likely to be co-expressed with HMMR.
This study also has some limitations. First of all, the clinical data lack some important information, such as tumor size. Specific details that are critical to the prognosis of patients, including surgical treatments and surgical details, are not provided. In addition, because the analysis is based on publicly available databases including GEO and TCGA, we are unable to clarify the direct functional mechanism of HMMR in LUAD.

CONCLUSION
In conclusion, by analyzing HMMR expression data of LUAD patients, we observed that HMMR expression in LUAD tissues is higher than that in nontumor tissues. The upregulation of HMMR is closely correlated with some clinicopathological features of LUAD. We infer that the upregulation of HMMR promotes the occurrence and progression of LUAD. According to univariate and multivariate survival analyses, increased HMMR expression in LUAD was identified as an independent risk factor for shorter OS. All in all, we believe that the HMMR expression level can become a promising marker for the diagnosis and prognosis of LUAD.

DATA AVAILABILITY STATEMENT
Publicly available datasets were analyzed in this study. They can be downloaded from The Cancer Genome Atlas (https://portal.gdc.cancer.gov/) and the NCBI Gene Expression Omnibus (GSE101929, GSE11969, GSE18842, GSE21933, GSE27262, GSE32863, and GSE75037).

ETHICS STATEMENT
The project was granted approval by the Ethics Committee of the Affiliated Hospital of Xuzhou Medical University. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
XL and LiZ designed the overall idea of this study, conceived the experiments, analyzed the data, prepared the figures and tables, and authored the drafts of the paper. QS and YX collected the data from the TCGA and GEO datasets and performed the experiments. LoZ and HZ guided and supervised this study and reviewed the drafts of the paper. All authors contributed to the article and approved the submitted version.

FUNDING
This study was supported by the Talented Scientific Research Foundation of Xuzhou Medical University (No. D2018018).