Identifying Immune Cell Infiltration and Effective Diagnostic Biomarkers in Lung Adenocarcinoma by Comprehensive Bioinformatics Analysis and In Vitro Study

Family with sequence similarity 72B (FAM72B) has been characterized in the regulation of neuronal development. Nevertheless, the prognostic value of FAM72B expression and its function in the immune microenvironment of lung adenocarcinoma (LUAD) remain elusive. In this study, by combining bioinformatics analysis with experimental verification, we found that FAM72B was upregulated in lung cancer tissues and cell lines, and that a higher FAM72B level predicted an unfavorable clinical outcome in LUAD patients. The knockdown of FAM72B significantly inhibited cell proliferation and migration and induced apoptosis in LUAD cells. Receiver operating characteristic curve analysis suggested that FAM72B had a high predictive accuracy for the outcomes of LUAD. Kyoto Encyclopedia of Genes and Genomes and Gene Set Enrichment Analyses confirmed that FAM72B-related genes were involved in cell proliferation and immune-response signaling pathways. Moreover, upregulated FAM72B expression was significantly associated with immune cell infiltration in the LUAD tumor microenvironment. In addition, a potential ceRNA network was constructed to identify the lncRNA (AL360270.2/TMPO-AS1/AC125807.2)/hsa-let-7a/7b/7c/7e/7f/FAM72B regulatory axis that regulates FAM72B overexpression in LUAD and is associated with a poor prognosis. We also confirmed that AL360270.2, TMPO-AS1, and AC125807.2 were significantly upregulated in LUAD cell lines compared with human bronchial epithelial cells. In conclusion, FAM72B may serve as a novel biomarker for predicting the clinical prognosis and immune status of lung adenocarcinoma.


INTRODUCTION
Lung cancer is the leading cause of cancer-related deaths worldwide: according to Cancer Statistics 2020, its incidence rate ranks second, while its death rate ranks first (1). Lung cancer includes small cell lung carcinoma (SCLC) and non-small cell lung carcinoma (NSCLC). NSCLC, which comprises lung adenocarcinoma (LUAD), lung squamous cell carcinoma (LUSC), and large-cell lung carcinoma, accounts for approximately 85% of all cases (2). Although the treatment of LUAD has improved, noninvasive diagnostic biomarkers with high sensitivity and specificity are still needed. Therefore, the discovery of key prognostic markers will help in the early prediction and treatment of LUAD at the molecular level.
Previous studies showed that family with sequence similarity 72B (FAM72B) was upregulated in the nervous system, neuroblastoma, and breast adenocarcinoma (3). FAM72D was reported as a specific proliferation marker in myelomas (4). FAM72 (A-D) expression was increased during neural stem cell (NSC) and cancer cell proliferation and is present in the G2/M phase of the cell cycle (5). It has been confirmed that the depletion of FAM72A inhibited NSC proliferation and promoted cell differentiation (6). However, the prognostic value, diagnostic value, underlying function, and mechanisms of FAM72B in LUAD progression remain unclear.
Therefore, the aim of this study was to determine the effect of FAM72B on the progression of LUAD. We used The Cancer Genome Atlas (TCGA), Genotype-Tissue Expression (GTEx), and the Kaplan-Meier plotter web tool to examine FAM72B expression and its correlation with prognosis. Furthermore, the association between FAM72B expression and immune infiltration was determined using the TIMER database and the single-sample Gene Set Enrichment Analysis (ssGSEA) method. The FAM72B-miRNA-lncRNA network was constructed using starBase. Finally, immunohistochemistry (IHC), qPCR, growth curve, transwell, wound healing, and flow cytometry experiments were performed to examine the biological function of FAM72B in LUAD cell lines. This study may provide evidence for prognostic biomarkers and therapeutic targets for LUAD.

TCGA Datasets
We acquired the gene expression profiles and clinical survival data of the LUAD samples from the TCGA database (https://portal.gdc.cancer.gov/) (7). We used these data to analyze the correlation between FAM72B expression and relevant clinical information, including pathological stage and TNM stage. Because the sequencing data of normal tissues included in TCGA are very limited and many patients lack transcriptome sequencing results for their normal tissues, we obtained data for normal tissues from the GTEx database. Statistical analyses were performed in R (v4.0.3), and the results were visualized with the R package ggplot2 (v3.3.3).

LinkedOmics Database
LinkedOmics (http://www.linkedomics.org/login.php) is a publicly available portal that includes multi-omics data from all 32 TCGA cancer types and 10 Clinical Proteomic Tumor Analysis Consortium cancer cohorts. In this study, LinkedOmics was used to obtain the genes that were significantly positively correlated with FAM72B expression in TCGA-LUAD.

Kyoto Encyclopedia of Genes and Genomes and Gene Set Enrichment Analysis
The Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways and related gene information were acquired from the Gene Set Enrichment Analysis (GSEA) database. GSEA was conducted to examine the biological and molecular functions of FAM72B across different cancer types using a total of 300 genes that were positively correlated with FAM72B. These analyses were performed using the R package clusterProfiler. GSEA was also used to estimate the enrichment of various biological processes in each sample.
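Over-representation of a pathway in a gene list, such as the genes positively correlated with FAM72B, is commonly scored with a hypergeometric tail probability. The following Python sketch illustrates that calculation only; it is not the R implementation used in this study, and the function name and all counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(universe_size, pathway_size, list_size, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    X is the number of pathway genes found when `list_size` genes are drawn
    without replacement from a universe of `universe_size` genes, of which
    `pathway_size` belong to the pathway.
    """
    if not (0 <= pathway_size <= universe_size and 0 <= list_size <= universe_size):
        raise ValueError("pathway_size and list_size must lie within the universe")
    max_overlap = min(pathway_size, list_size)
    total = comb(universe_size, list_size)
    # Sum the probabilities of every overlap at least as large as the observed one.
    tail = sum(
        comb(pathway_size, k) * comb(universe_size - pathway_size, list_size - k)
        for k in range(max(overlap, 0), max_overlap + 1)
    )
    return tail / total


# Hypothetical counts, for illustration only: 20,000 background genes,
# a 120-gene pathway, a 300-gene list, and 15 genes shared between them.
p_value = hypergeom_enrichment_p(20000, 120, 300, 15)
print(f"enrichment p-value: {p_value:.3g}")
```

In practice, such p-values would be corrected for multiple testing across all pathways examined.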

Generation of Prognostic Risk Prediction Model
Univariate and multivariate Cox regression analyses were performed using an R (v3.6.1) package (version 2.41-1); the independent prognostic clinical factors of LUAD samples in the TCGA datasets were then identified at P < 0.05. Based on these independent prognostic factors and the risk information discriminated by the prognostic prediction model, nomograms predicting the 1-, 3-, and 5-year prognostic risk were built using the R (v3.6.1) package "rms". The Kaplan-Meier method was used to examine the prognostic values of FAM72B, miRNA, and lncRNA expression, using the R packages survminer and survival.
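The Kaplan-Meier (product-limit) estimate multiplies, at each observed event time, the fraction of at-risk patients who survive that time. The Python sketch below shows this calculation as an illustration only; the analyses in this study were run in R, and the function name and follow-up data here are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times:  follow-up time for each patient.
    events: 1 if the event (e.g., death) was observed, 0 if censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Events and censored patients both leave the risk set after time t.
        n_at_risk -= leaving
    return curve


# Hypothetical follow-up times (months) and event indicators.
example_curve = kaplan_meier([5, 8, 12, 12, 20, 24], [1, 0, 1, 1, 0, 1])
for time_point, prob in example_curve:
    print(f"t = {time_point}: S(t) = {prob:.3f}")
```

Comparing two such curves (for example, high versus low expression groups) would additionally require a test such as the log-rank test, which this sketch does not implement.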

Immune Infiltration Analysis by TIMER Database and ssGSEA
The TIMER web server is a comprehensive resource for the systematic analysis of immune infiltrates across diverse cancer types (8). In this study, we used the TIMER database to determine the association between FAM72B expression and immune infiltrates (B cells, CD4+ T cells, CD8+ T cells, neutrophils, macrophages, and dendritic cells). We also used ssGSEA to examine the correlation between FAM72B expression and the infiltration of 24 tumor-infiltrating immune cell types in LUAD tumor samples (9).
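As a rough illustration of how a single-sample enrichment score can be computed, the Python sketch below follows the rank-based running-sum formulation commonly described for ssGSEA (weighting exponent 0.25, no normalization). It is a simplified sketch under those assumptions, not the implementation used in this study; the gene names, expression values, and function name are hypothetical.

```python
def ssgsea_score(expression, gene_set, alpha=0.25):
    """Unnormalized single-sample enrichment score for one gene set.

    expression: dict mapping gene name -> expression value in one sample.
    gene_set:   iterable of gene names (genes absent from the sample are ignored).
    """
    # Order genes from highest to lowest expression; the top gene gets rank N.
    ordered = sorted(expression, key=expression.get, reverse=True)
    n_genes = len(ordered)
    members = set(gene_set) & set(ordered)
    if not members or len(members) == n_genes:
        raise ValueError("gene set must contain some, but not all, measured genes")

    weights = {gene: (n_genes - i) ** alpha for i, gene in enumerate(ordered)}
    hit_total = sum(weights[gene] for gene in members)
    miss_step = 1.0 / (n_genes - len(members))

    hit_cdf = miss_cdf = score = 0.0
    for gene in ordered:
        if gene in members:
            hit_cdf += weights[gene] / hit_total
        else:
            miss_cdf += miss_step
        # Accumulate the gap between the two running distributions.
        score += hit_cdf - miss_cdf
    return score


# Hypothetical sample with six genes; G1 is the most highly expressed.
sample = {"G1": 6.0, "G2": 5.0, "G3": 4.0, "G4": 3.0, "G5": 2.0, "G6": 1.0}
print(ssgsea_score(sample, ["G1", "G2"]))  # marker genes near the top of the ranking
print(ssgsea_score(sample, ["G5", "G6"]))  # marker genes near the bottom
```

A gene set whose members rank near the top of the sample's expression profile yields a positive score, and one whose members rank near the bottom yields a negative score.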

Starbase Database
starBase v2.0 (http://starbase.sysu.edu.cn/) is a database of RNA-RNA and protein-RNA interaction networks derived from CLIP-Seq data sets generated by 37 independent studies (10). In this study, starBase was used to predict the potential non-coding RNA regulators of FAM72B and to determine the correlation between miRNAs and FAM72B in LUAD. Furthermore, Pearson's correlation analysis was used to determine the relationship between lncRNA and FAM72B expression in TCGA-LUAD.
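Pearson's correlation coefficient is the covariance of two variables divided by the product of their standard deviations. A minimal Python sketch of that formula follows; the expression values and the function name are hypothetical and serve only to illustrate the calculation.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical log-scale expression values of a lncRNA and a target gene
# measured in the same five samples.
lncrna_expr = [1.2, 2.4, 3.1, 4.8, 5.0]
target_expr = [0.9, 2.0, 2.7, 4.1, 4.9]
print(f"r = {pearson_r(lncrna_expr, target_expr):.3f}")
```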

Cell Culture and Transfection
The BEAS-2B cell line was purchased from the Chinese Academy of Sciences Cell Bank (CASCB, China) and cultured in Bronchial Epithelial Cell Growth Medium (Lonza, CC-3170). The lung cancer cell lines, including HCC827, H1650, A549, and H1975, were purchased from the CASCB (China) with STR documents and were cultured in RPMI-1640 medium (Corning) supplemented with 10% fetal bovine serum (FBS) and 1% penicillin/streptomycin.

Cell Migration Assay
Cell migration and invasion assays were conducted to explore the biological function of FAM72B in LUAD cells. For the transwell migration assay, 2.5 × 10^4 cells/well in 100 µl serum-free medium were plated in a 24-well plate chamber insert, and the lower chamber was filled with 10% FBS. After incubation for 24 h, the cells were fixed with 4% paraformaldehyde, washed, and then stained with 0.5% crystal violet before images were captured.

CCK8 Assay
We seeded the cells in 96-well plates at 2.5 × 10^3 cells per well in 100 µl of complete medium and incubated them with 10 µl of CCK-8 reagent (RiboBio, Guangzhou, China) for 1 h each day after 3 days of culture. We then used a microplate reader to measure the absorbance of each well at 450 nm. Each sample was evaluated in triplicate.

Immunohistochemical Staining
For immunohistochemical staining, the sections were deparaffinized in xylene and rehydrated through graded ethanol. Antigen retrieval was performed for 20 min at 95°C with sodium citrate buffer (pH 6.0). After quenching the endogenous peroxidase activity with 3% H2O2 and blocking the non-specific binding with 1% bovine serum albumin buffer, the sections were incubated overnight at 4°C with the indicated primary antibodies. Following several washes, the sections were treated with horseradish peroxidase-conjugated secondary antibody for 40 min at room temperature and stained with 3,3'-diaminobenzidine tetrahydrochloride. The slides were photographed with a microscope (Olympus BX43F, Japan). The photographs were analyzed based on the ratio of the staining with the Image-Pro Plus 7.0 software (Media Cybernetics, Inc., Silver Spring, MD, USA).

Statistical Analyses
All statistical analyses were performed using R software, and receiver operating characteristic (ROC) curves were used to determine FAM72B cutoff values with the pROC package. For the data on the function of FAM72B, GraphPad Prism 7.0 was used for statistical analyses.
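To illustrate how a ROC analysis yields both an area under the curve (AUC) and an expression cutoff, the Python sketch below computes the AUC as the probability that a tumor sample scores higher than a normal sample and picks the cutoff that maximizes Youden's J (sensitivity + specificity - 1). The cutoff criterion is an assumption, since the criterion used in this study is not stated; the function names and expression values are hypothetical.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as P(positive score > negative score); ties count as one half."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def youden_cutoff(scores_pos, scores_neg):
    """Return (cutoff, J) maximizing Youden's J, calling a sample positive if score >= cutoff."""
    best_cutoff, best_j = None, float("-inf")
    for cutoff in sorted(set(scores_pos) | set(scores_neg)):
        sensitivity = sum(s >= cutoff for s in scores_pos) / len(scores_pos)
        specificity = sum(s < cutoff for s in scores_neg) / len(scores_neg)
        j = sensitivity + specificity - 1
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical expression values in tumor (positive) and normal (negative) samples.
tumor = [3.1, 4.2, 4.8, 5.5, 6.0]
normal = [1.0, 1.9, 2.5, 3.3, 2.2]
print(f"AUC = {roc_auc(tumor, normal):.3f}")
print("cutoff, J =", youden_cutoff(tumor, normal))
```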

Expression Pattern of FAM72B in Human Cancers
To determine the mRNA expression pattern of FAM72B in diverse cancer types, we analyzed the TCGA and GTEx datasets. The results indicated that FAM72B was highly expressed in 25 of the 33 cancers compared with normal tissue (Figure 1A). We further examined FAM72B expression in paired cancer tissues and adjacent normal tissues using the TCGA datasets. We found that FAM72B expression was significantly higher in 15 of the 18 cancers compared with normal tissue (Figure 1B). These results show that FAM72B is highly expressed in various human cancers.

FAM72B Was Upregulated in Lung Adenocarcinoma
To examine the FAM72B expression level in LUAD, we analyzed FAM72B expression based on the TCGA and Human Protein Atlas databases. We found that FAM72B was upregulated in both LUAD and LUSC compared with normal tissues (Figures 2A, B). Consistent with these results, the Gene Expression Omnibus (GEO) dataset also demonstrated that the FAM72B mRNA level was markedly increased in lung cancer tissues (Figure 2C). Furthermore, we showed that FAM72B RNA was upregulated in LUAD cell lines, especially in H1975 cells (Figure 2D). Finally, to verify these findings, immunohistochemical staining was conducted to examine the FAM72B protein in lung cancer tissues. The results confirmed its upregulation in lung cancer tissues compared with normal lung tissues (Figures 2E, F).
Given that the biological function of FAM72B in LUAD remains unclear, we further examined the potential function of FAM72B in LUAD cell proliferation and migration. The qRT-PCR assay showed that the expression of FAM72B mRNA was significantly decreased in H1975 cells after treatment with the targeted siRNA (Figure 2G). The growth curve assays demonstrated that FAM72B depletion significantly inhibited the proliferation ability of LUAD cells (Figure 2H). Moreover, the knockdown of FAM72B promoted cell apoptosis (Figure 2I). Furthermore, to validate whether FAM72B is critical for cell migration, we performed transwell and wound healing assays, which revealed that FAM72B knockdown significantly inhibited the migration ability of LUAD cells (Figures 2J, K).

FAM72B Expression and Clinicopathological Characteristics of Lung Adenocarcinoma
We also assessed the correlation between FAM72B expression and the clinicopathological characteristics of LUAD samples. As shown in Figures 3A-J, FAM72B expression was significantly associated with pathological stage, TNM stage, primary therapy outcome, gender, and age, as well as with OS, DSS, and progression-free survival (PFS) in LUAD (Figures 3H-J). The logistic regression analysis also suggested that increased FAM72B expression was associated with T stage (T2 and T3 and T4 vs. T1; P < 0.001), N stage (N1 and N2 and N3 vs. N0; P = 0.040), pathologic stage (stage III and stage IV vs. stage I and stage II; P = 0.025), and gender (male vs. female; P < 0.001) (Table 1).
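For a single binary covariate, the odds ratio estimated by univariate logistic regression equals the cross-product ratio of the corresponding 2 × 2 table. The Python sketch below computes that odds ratio with a Wald 95% confidence interval as an illustration of the idea; the counts are hypothetical and are not taken from Table 1, and the function name is an assumption.

```python
from math import exp, log, sqrt


def odds_ratio_with_ci(group1_high, group1_low, group0_high, group0_low, z=1.96):
    """Odds ratio of high expression (group 1 vs. group 0) with a Wald confidence interval."""
    cells = (group1_high, group1_low, group0_high, group0_low)
    if min(cells) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (group1_high * group0_low) / (group1_low * group0_high)
    # Standard error of the log odds ratio.
    se_log_or = sqrt(sum(1.0 / count for count in cells))
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: patients with high/low expression in a
# higher-stage group (group 1) and a lower-stage group (group 0).
or_value, ci_low, ci_high = odds_ratio_with_ci(60, 40, 35, 65)
print(f"OR = {or_value:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

An interval that excludes 1 corresponds to a significant association at roughly the 0.05 level under the Wald approximation.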

Analysis of the Diagnostic and Prognostic Value of FAM72B in LUAD
The relationship between FAM72B expression and OS, DSS, and PFS in LUAD patients was examined using Kaplan-Meier curves. We found that increased FAM72B expression was correlated with poor OS, DSS, and PFS in patients with LUAD (Figures 4A-C). According to the time-dependent ROC analysis, the FAM72B expression level performed relatively well in predicting 1-year (C statistic, 1.0), 3-year (C statistic, 0.749), and 5-year (C statistic, 0.8363) overall survival in LUAD patients (Figure 4D); performed better in predicting 1-year (C statistic, 1.00), 3-year (C statistic, 0.929), and 5-year (C statistic, 0.965) disease-free survival (Figure 4E); and performed relatively well in predicting 1-year (C statistic, 0.864), 3-year (C statistic, 0.901), and 5-year (C statistic, 0.900) progression-free survival (Figure 4F). We also used the GEO datasets to validate these results and showed that the upregulation of FAM72B expression was related to adverse clinical outcomes in patients with lung cancer (Figures 5A-D). To further explore the diagnostic significance of FAM72B in lung cancer, a ROC curve analysis was performed. The area under the ROC curve values of FAM72B were 0.914, 0.914, 0.878, and 0.884 in four GEO datasets, respectively (Figures 5E-H). These results suggest that FAM72B may be a promising biomarker for differentiating LUAD.

Validation of the Prognostic Value of FAM72B Based on Various Subgroups
We further determined the prognostic value of FAM72B in various clinical subgroups, including pathological stage, TNM stage, gender, primary therapy outcome, age, residual tumor, race, and smoking status. The results suggested that an upregulated FAM72B level was associated with a poor clinical outcome in patients with lung cancer (Figures 6A-C).

Univariate and Multivariate Cox Regression Analyses of Different Parameters on Overall Survival
We performed univariate Cox regression analysis in the TCGA-LUAD cohort to determine whether the FAM72B expression level and the pathologic stage might be valuable prognostic biomarkers. The univariate Cox analysis suggested that higher FAM72B expression, pathologic stage, and TNM stage were each correlated with a poor clinical outcome in LUAD patients. To ascertain whether the FAM72B expression level could be an independent prognostic factor for patients with LUAD, multivariate Cox regression analysis was performed. The multivariate Cox analysis showed that higher FAM72B expression, as well as pathologic stage and TNM stage, was a significant independent prognostic factor in the TCGA-LUAD cohort that directly correlated with poor overall survival (Figures 7A, B).

Construction and Validation of FAM72B-Based Nomogram
The multivariate analysis confirmed that FAM72B is an independent prognostic factor in LUAD. We then constructed a prediction model for overall survival and progression-free survival by integrating FAM72B expression, and established a nomogram incorporating FAM72B as a LUAD biomarker. Higher total points on the nomograms for OS, DSS, and PFS each indicated a worse prognosis (Figures 8A-F).

KEGG Enrichment Analysis
To determine the potential function of FAM72B in LUAD progression, the LinkedOmics database was used to obtain the top 100 genes that were significantly positively correlated with FAM72B expression (Figures 9A, B). The correlation analysis of FAM72B expression and the top 8 co-expressed genes in TCGA LUAD is shown in Figures 9C, D. In terms of biological process, FAM72B is mainly involved in nuclear division, chromosome segregation, regulation of cell cycle phase transition, DNA replication, regulation of mitotic cell cycle phase transition, mitotic nuclear division, and cell cycle G2/M phase transition (Figure 9E). The KEGG enrichment analysis suggested that these genes participate in the cell cycle, RNA transport, DNA replication, cellular senescence, spliceosome, and p53 signaling pathways (Figure 9F).
To explore the possible mechanism of FAM72B in LUAD, GSEA was carried out on the differentially expressed genes. The GSEA showed that pathways including the PI3K AKT MTOR signaling pathway, TNFA signaling pathway, IL2 STAT5 signaling pathway, KRAS signaling pathway, glycolysis, G2M checkpoint, epithelial-to-mesenchymal transition (EMT), and apoptosis were significantly enriched in the high-FAM72B-expression group (Figures 10A, B).

Correlation Between FAM72B Expression and Immune Infiltration
Given that the GSEA indicated that FAM72B may be correlated with immune response regulation, we subsequently examined the relationship between FAM72B expression and immune cell infiltration. We found that the somatic copy number alterations of FAM72B significantly affected the infiltration levels of B cells, CD4+ T cells, CD8+ T cells, neutrophils, macrophages, and dendritic cells in LUAD (Figure 11A). Furthermore, our results confirmed that most immune cells in the tumor microenvironment, including B cells, CD4+ T cells, CD8+ T cells, neutrophils, macrophages, and dendritic cells, were negatively associated with FAM72B expression in LUAD (Figure 11B).
Additionally, to validate these results, we used the ssGSEA method to determine the association between FAM72B expression and 24 tumor-infiltrating immune cell types in LUAD. The results suggested that FAM72B was positively associated with the infiltration of Th2 cells, Tgd, and NK CD56dim cells but negatively associated with the infiltration of mast cells, eosinophils, TFH, iDC, and DC in LUAD (Figures 11C, D).

DISCUSSION
LUAD remains a highly burdensome cancer worldwide, and the 5-year survival rate of lung cancer is only 10-15% in many countries (12). Previous studies have found that FAM72B plays a crucial role in nervous system development (5). Nevertheless, few studies have comprehensively examined FAM72B in LUAD. In this study, we analyzed, for the first time, the expression, prognostic value, diagnostic value, and ceRNA network of FAM72B and its correlation with tumor immune cell infiltration in LUAD.
In this study, we found a high level of FAM72B in various human cancers by analyzing the GTEx and TCGA cohorts. Moreover, in vitro experiments and IHC staining showed that the mRNA and protein levels of FAM72B in the LUAD samples were markedly higher than those in the normal control group, consistent with the database analyses. The elevated FAM72B expression was associated with a more advanced pathological stage and TNM stage. The Kaplan-Meier curve analysis suggested that FAM72B expression was correlated with OS, disease-free survival, and PFS in the LUAD patients of the TCGA data. We also analyzed the potential of FAM72B expression to predict LUAD using ROC curves, which suggested that FAM72B has a high accuracy in distinguishing LUAD from normal tissues. Our findings are consistent with those of previous research: FAM72B was increased in glioblastoma multiforme (GBM) and correlated with poorer patient survival (5).
The logistic regression analysis also suggested that increased FAM72B expression was associated with T stage (T2 and T3 and T4 vs. T1) (P < 0.001), N stage (N1 and N2 and N3 vs. N0) (P = 0.040), pathologic stage (stage III and stage IV vs. stage I and stage II) (P = 0.025), and gender (male vs. female) (P < 0.001). Next, the univariate and multivariate analysis results suggested that FAM72B expression was an independent factor associated with the survival of patients.
Given that FAM72B was highly expressed in LUAD tissues and cell lines, we examined its function and found that the knockdown of FAM72B significantly reduced the proliferation and migration abilities of LUAD cells. Apoptosis plays a crucial role in controlling cell growth. In this study, we found that the depletion of FAM72B significantly promoted cell apoptosis in LUAD.
Previous studies reported that FAM72B was upregulated in the nervous system, neuroblastoma, and breast adenocarcinoma (3). FAM72B was identified as a member of a 7-gene signature in prostate cancer and correlated with poor prognosis in patients with prostate cancer (13). It has been shown that FAM72B promotes NSC and cancer cell proliferation and is present in the G2/M phase of the cell cycle (6). Another study confirmed that the knockdown of FAM72B inhibited the cell proliferation of human fibroblasts (14). In this study, we investigated the underlying mechanisms through which FAM72B was involved in the progression of LUAD. The GSEA enrichment suggested that FAM72B was significantly associated with the PI3K AKT MTOR signaling pathway, TNFA signaling pathway, IL2 STAT5 signaling pathway, KRAS signaling pathway, glycolysis, G2M checkpoint, EMT, and apoptosis.
Using the TIMER database, we discovered that FAM72B expression in LUAD was negatively associated with the infiltration levels of B cells, CD4+ T cells, CD8+ T cells, neutrophils, macrophages, and dendritic cells but positively associated with tumor purity. Moreover, FAM72B CNV was markedly correlated with the infiltration of B cells, CD4+ T cells, CD8+ T cells, neutrophils, macrophages, and dendritic cells. These analyses indicate that FAM72B may participate in the immune response in the LUAD tumor microenvironment, particularly that involving B cells and CD4+ T cells.
This study improves our understanding of the correlation between FAM72B and LUAD, but some limitations remain. First, although we explored the correlation between FAM72B and immune infiltration in LUAD patients, experiments validating the function of FAM72B in regulating the LUAD tumor microenvironment are lacking. Second, we found that the knockdown of FAM72B inhibits the proliferation and migration of LUAD cells. However, the molecular mechanisms of FAM72B in cancer progression need to be explored in further studies.

CONCLUSION
This study demonstrated, for the first time, the clinical significance and biological function of FAM72B in lung adenocarcinoma. The lncRNA (AL360270.2, TMPO-AS1, and AC125807.2)/hsa-let-7a/7b/7c/7e/7f-5p/FAM72B regulatory network may serve as a novel prognostic biomarker and potential therapeutic target for LUAD treatment. In summary, FAM72B may serve as a promising diagnostic and prognostic biomarker for LUAD.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.