Multi-Omics Analysis of the Effects of Smoking on Human Tumors

Comprehensive studies of cancer patients with different smoking histories, including non-smokers, former smokers, and current smokers, are lacking. Therefore, we conducted a multi-omics analysis to explore the effect of smoking history on cancer patients. Patients with a recorded smoking history were screened from The Cancer Genome Atlas database, and their multi-omics data and clinical information were downloaded. A total of 2,317 patients were included in this study. Current smokers had the worst prognosis, followed by former smokers, while non-smokers had the best prognosis. More importantly, smoking history was an independent prognostic factor. Patients with different smoking histories exhibited different immune content, and former smokers had the highest levels of immune cells and tumor immune microenvironment scores. Smokers had a higher incidence of genomic instability, some of which can be reversed following smoking cessation. We also noted that smoking reduced the sensitivity of patients to chemotherapeutic drugs, whereas smoking cessation can reverse this effect. A competing endogenous RNA network revealed that mir-193b-3p, mir-301b, mir-205-5p, mir-132-3p, mir-212-3p, mir-1271-5p, and mir-137 may contribute significantly to tobacco-mediated tumor formation. We identified 11 methylation driver genes (EIF5A2, GBP6, HGD, HS6ST1, ITGA5, NR2F2, PLS1, PPP1R18, PTHLH, SLC6A15, and YEATS2), and methylation modifications of some of these genes have not previously been reported to be associated with tumors. We constructed a 46-gene model that predicted overall survival with good predictive power. We next drew nomograms for each cancer type. Calibration diagrams and concordance indexes verified that the nomograms were highly accurate for the prognosis of patients. Meanwhile, we found that the 46-gene model has good applicability to overall survival as well as to disease-specific survival and progression-free interval.
The results of this research provide new and valuable insights for the diagnosis, treatment, and follow-up of cancer patients with different smoking histories.


INTRODUCTION
Smoking is extremely harmful to human health and contributes to cardiovascular diseases, chronic obstructive pulmonary disease, and different types of cancers (Kulhánová et al., 2020;Stang et al., 2021). It has been reported that smoking can increase the incidence of many diseases, lead to a poor prognosis, and even increase the recurrence risk of cancer patients (Parsons et al., 2010;Rieken et al., 2015;Foerster et al., 2018). Indeed, according to the global burden of disease in 2019, smoking ranks second among the most important risk factors for attributable deaths in the whole population, accounting for 8.71 million deaths, behind only high systolic blood pressure (GBDRF Collaborators, 2020). Studies have shown that smoking is detrimental not only to active smokers but also to passive smokers (Eriksen et al., 1988;Dugué et al., 2020). The carcinogenic effects of smoking mainly include the following aspects. First, the free radicals produced by smoking directly damage cell components, leading to DNA damage and tumor formation. Second, smoking can induce mutations in a variety of genes, causing continuous proliferation of cells and malignant transformation.
Third, it can drive the progression of the inflammatory response to malignant transformation (Husgafvel-Pursiainen et al., 2000;Vähäkangas et al., 2001;Taioli, 2008;Takahashi et al., 2010). Previous studies have established that cancer patients who had smoked had shorter survival than those who had never smoked, while smoking cessation can increase the survival time of smokers, and the earlier they quit, the longer they live (Doll et al., 2004). Elsewhere, a meta-analysis of smoking status and colorectal cancer (CRC) prognosis showed a poorer overall survival (OS) rate for current smokers than for non-smokers, and Cox regression analysis revealed that current smokers had a higher risk of poor prognosis. In comparison, former smokers had improved OS compared with current smokers, thus exhibiting a higher specific survival (Ordóñez-Mena et al., 2018). Given this background, this study integrated multi-omics data from The Cancer Genome Atlas (TCGA) database to explore possible underlying molecular mechanisms among non-smokers, former smokers, and current smokers. A flow chart of this study is provided in Figure 1.
Frontiers in Molecular Biosciences | www.frontiersin.org November 2021 | Volume 8 | Article 704910

Data Downloading and Processing
Patients with a recorded smoking history (non-smokers, former smokers, and current smokers) were screened from the TCGA database and included in our study. We downloaded their level 3 RNA-seq expression, DNA methylation (Illumina Human Methylation 450), miRNA expression, somatic copy number variations (CNVs, masked copy number segment), and simple nucleotide variations (SNVs, VarScan2 Variant Aggregation and Masking) from the TCGA-GDC portal (https://portal.gdc.cancer.gov/). We also downloaded prognostic indicators such as OS, disease-specific survival (DSS), and progression-free interval (PFI), as well as clinical characteristics, including gender, age, grade, and stage, among others. However, we did not include the prognostic marker disease-free interval in this study because it had too many missing values. Strawberry Perl (version: 5.30.0.1) (http://strawberryperl.com/) was used to annotate gene expression profiles. If multiple probes corresponded to one gene, the average value was taken as the expression level of the gene, and expression was converted by log2(x + 1), where x is the transcripts per kilobase of exon model per million mapped reads. We followed the TCGA usage rules, and thus approval from the ethics committee was not required for this work. All databases were up-to-date as of December 15, 2020.
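The probe averaging and log2 transformation described above can be illustrated with a minimal Python sketch. It is not the Perl pipeline used in this study; the function name `collapse_and_log2` and the input layout (pairs of gene symbol and expression value) are assumptions made only for illustration.

```python
import math
from collections import defaultdict


def collapse_and_log2(rows):
    """Average all values that map to the same gene, then apply log2(x + 1).

    `rows` is an iterable of (gene_symbol, expression_value) pairs; this
    layout is a hypothetical stand-in for an annotated expression table.
    """
    grouped = defaultdict(list)
    for gene, value in rows:
        grouped[gene].append(value)
    # The mean is taken first and the log transform is applied afterwards,
    # following the order given in the text.
    return {
        gene: math.log2(sum(values) / len(values) + 1)
        for gene, values in grouped.items()
    }


# Toy input: two probes for TP53 and one for EGFR (values are invented).
example = collapse_and_log2([("TP53", 3.0), ("TP53", 5.0), ("EGFR", 0.0)])
```

Averaging before the log transform gives a different result from averaging log-transformed values, so the order of the two steps matters when reproducing the preprocessing.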

Survival and Risk Analyses
Prognostic differences and hazard ratios (HRs) were evaluated among the three smoking-history groups for different survival indicators (OS, DSS, and PFI). We employed Cox regression analysis to evaluate the effects of smoking history (non-smoker, former smoker, and current smoker were coded as 0, 1, and 2, respectively), together with other clinical characteristics such as gender (females and males were coded as 0 and 1, respectively), stage, and age, on the prognosis of patients.

Immunological Content of Patients With Different Smoking Histories
To quantify immune-related cells, functions, and pathways of each patient, we used a single-sample gene set enrichment analysis (ssGSEA), which was implemented using the R-project packages GSVA and GSEA (Barbie et al., 2009). Next, the ESTIMATE method was applied to quantify the tumor immune microenvironment, including stromal score, immune score, estimate score, and tumor purity (Yoshihara et al., 2013). Additionally, the B-cell receptor (BCR) diversity (BCR Richness, BCR Shannon), leucocyte infiltration, neoantigens, homologous recombination defects (HRD), cancer testis antigen (CTA), and intratumor heterogeneity of each patient were obtained from a study by Thorsson et al. (2018). We subsequently explored the differences in these indicators among patients with different past smoking histories.

Analysis of the Difference of Stemness Indices in Patients With Different Smoking Histories
The mRNA stemness indices (mRNAsi), DNA methylation stemness indices (mDNAsi), differentially methylated probes-based stemness index (DMPsi), enhancer-based stemness index (ENHsi), RNA expression-based epigenetically regulated stemness index (EREG-mRNAsi), and DNA methylation-based epigenetically regulated stemness index (EREG-mDNAsi) of each patient were obtained from the study by Malta et al. (2018), and we compared these indicators among patients with different smoking histories. The stemness indices range from 0 to 1; the closer an index is to 1, the lower the level of tumor cell differentiation and the stronger the stemness characteristics of the tumor cells.

Somatic Simple Nucleotide and Copy Number Variations Analyses
The incidence of SNV events was compared among patients with different smoking histories. Tumor mutation burden (TMB), usually quantified as the number of mutations per megabase, was defined as the total number of nonsynonymous mutations in the coding region of the tumor genome. CNV events (loss or gain) in each sample were integrated and analyzed using the Genomic Identification of Significant Targets in Cancer (GISTIC) 2.0 (https://cloud.genepattern.org/) (Mermel et al., 2011). The reference genome file was BSgenome.Hsapiens.UCSC.hg38, and the segment mean was defined as log2(copy number/2). In particular, a segment mean greater than 0.2 was defined as gain (recorded as 1), a segment mean less than −0.2 was defined as loss (recorded as −1), and a segment mean between −0.2 and 0.2 was defined as no CNV (recorded as 0). Finally, the total number of genes with CNV loss or gain at the focal or arm levels was defined as the CNV loss or gain burden (Shen et al., 2019). We used the R-project maftools package to visualize the results (Mayakonda et al., 2018).
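The thresholding rule above can be written as a short Python sketch. It only illustrates the ±0.2 cut-offs and the burden count and is not the GISTIC 2.0 workflow itself; the function names are hypothetical, and a value exactly equal to ±0.2 is treated as no CNV, which is an assumption because the text does not state how boundary values were handled.

```python
import math


def segment_mean(copy_number):
    """Segment mean as defined in the text: log2(copy number / 2)."""
    return math.log2(copy_number / 2)


def classify_segment(seg_mean, threshold=0.2):
    """Return 1 for gain, -1 for loss, and 0 for no CNV."""
    if seg_mean > threshold:
        return 1
    if seg_mean < -threshold:
        return -1
    return 0  # includes the boundary values (assumption)


def cnv_burden(gene_segment_means):
    """Count genes called as gain or loss in one sample.

    `gene_segment_means` maps a gene symbol to its segment mean; this
    per-gene layout is a hypothetical simplification.
    """
    calls = [classify_segment(value) for value in gene_segment_means.values()]
    return {"gain": calls.count(1), "loss": calls.count(-1)}


# Toy sample with invented values: one gain, one loss, one neutral gene.
burden = cnv_burden({"GENE_A": 0.6, "GENE_B": -0.9, "GENE_C": 0.05})
```

With this definition, a single-copy gain in a diploid region (three copies) gives a segment mean of about 0.58 and is called a gain, while a single-copy loss gives −1 and is called a loss.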

Chemotherapeutic Response Prediction
The response to chemotherapy in each patient was predicted with the Genomics of Drug Sensitivity in Cancer (GDSC) (https://www.cancerrxgene.org/). The R-project pRRophetic package was used to perform the prediction, and the half-maximal inhibitory concentration (IC50) of each patient was predicted through ridge regression. Based on the GDSC training set, the precision of prediction was verified by 10-fold cross-validation (Geeleher et al., 2014).

Pathways Enrichment of Patients With Different Smoking Histories
Gene Set Enrichment Analysis (GSEA) was applied to identify the enrichment of oncogenic signature in patients with different smoking histories, which was performed using the GSEA software (http://software.broadinstitute.org/gsea/downloads.jsp) (version: 4.0.3). The criteria for enrichment difference included: |normalized enrichment score (NES)| > 1, the nominal p-value (NOM p-value) < 0.05, and the false discovery rate q-value (FDR q-value) ≤ 0.25.
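The three enrichment criteria can be expressed as a simple filter. The Python sketch below assumes that the GSEA report has already been parsed into dictionaries with the hypothetical keys `nes`, `nom_p`, and `fdr_q`; the gene set names and values are invented placeholders.

```python
def is_enriched(nes, nom_p, fdr_q):
    """Apply the criteria from the text: |NES| > 1, NOM p < 0.05, FDR q <= 0.25."""
    return abs(nes) > 1 and nom_p < 0.05 and fdr_q <= 0.25


def filter_gene_sets(report_rows):
    """Keep the names of the gene sets that pass all three criteria."""
    return [
        row["name"]
        for row in report_rows
        if is_enriched(row["nes"], row["nom_p"], row["fdr_q"])
    ]


# Invented rows standing in for a parsed GSEA report.
rows = [
    {"name": "SIGNATURE_A", "nes": 1.8, "nom_p": 0.01, "fdr_q": 0.10},
    {"name": "SIGNATURE_B", "nes": -1.4, "nom_p": 0.03, "fdr_q": 0.25},
    {"name": "SIGNATURE_C", "nes": 0.9, "nom_p": 0.01, "fdr_q": 0.05},
    {"name": "SIGNATURE_D", "nes": 2.1, "nom_p": 0.20, "fdr_q": 0.30},
]
selected = filter_gene_sets(rows)
```

The absolute value of NES is used so that both positively and negatively enriched signatures are retained, and the FDR criterion is inclusive of 0.25 as stated in the text.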

DNA Methylation Driver Gene Analyses
Differentially expressed methylated genes between any two of the three smoking-history groups were screened using the criterion p-value < 0.05. Spearman correlation analysis was applied to calculate the correlation coefficient (R) between the methylation level of a gene and its mRNA expression. A gene with R < −0.4 and p < 0.05 was considered a DNA methylation driver gene.
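As an illustration of the driver gene criterion, the following Python sketch computes the Spearman coefficient as the Pearson correlation of average ranks and then applies the two thresholds. The function names are hypothetical, and the p-value is taken as an input because computing it requires a statistics library that is not assumed here.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = math.sqrt(sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry))
    return num / den


def is_methylation_driver(rho, p_value):
    """Criterion from the text: R < -0.4 and p < 0.05."""
    return rho < -0.4 and p_value < 0.05


# Invented methylation and expression values with a perfectly inverse ranking.
rho = spearman_rho([0.1, 0.3, 0.5, 0.7, 0.9], [10.0, 8.0, 6.0, 4.0, 2.0])
```

A negative coefficient is required because a driver gene in this sense is one whose expression falls as its methylation rises.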

Establishing and Assessing a Smoking-Related Prognostic Model for Patients
Univariate Cox analysis was used to identify differentially expressed genes associated with the OS of patients (p-value < 0.05). The R-project glmnet package was used for lasso regression analysis to prevent overfitting of the model (alpha = 1) (Friedman et al., 2010). Upon constructing a multivariate Cox proportional hazards regression model, we obtained the risk score of each sample using the regression coefficient and expression of each gene as follows: $\text{Risk score} = \sum_{i=1}^{n} \text{Coefficient}(\text{gene}_i) \times \text{expression}(\text{gene}_i)$. Moreover, we used the R-project survival and survminer packages to examine the difference in OS between different risk score groups. Then, Cox regression analysis was employed to assess the effects of the model along with other clinical characteristics on OS. The R-project survivalROC package was used to draw the receiver operating characteristic (ROC) curve and to calculate the area under the curve (AUC) to determine the predictive ability of the model (Heagerty et al., 2000).
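The risk score is a weighted sum of gene expression values, which the following Python sketch makes explicit. The gene names, coefficients, and expression values are invented placeholders and are not the coefficients of the 46-gene model.

```python
def risk_score(coefficients, expression):
    """Sum of coefficient(gene_i) * expression(gene_i) over the model genes.

    `coefficients` maps each model gene to its Cox regression coefficient and
    `expression` maps gene symbols to one sample's expression values.
    A KeyError is raised if a model gene is missing from the sample.
    """
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


# Hypothetical two-gene model and one sample (all values invented).
toy_coefficients = {"GENE_A": 0.5, "GENE_B": -0.25}
toy_sample = {"GENE_A": 2.0, "GENE_B": 2.0, "GENE_C": 7.0}
score = risk_score(toy_coefficients, toy_sample)
```

Genes outside the model (GENE_C here) do not contribute, and a negative coefficient lowers the score as the expression of that gene increases.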
A nomogram was established to predict the progression or prognosis of the disease by combining multiple clinical indicators. Harrell's concordance index (C-index) and calibration diagrams were used to estimate the consistency between the predictions of the nomogram and the actual occurrence of events. Further, the R-project rms, survival, and survcomp packages were used to draw the calibration diagrams and calculate the C-index (Schröder et al., 2011). Lastly, we assessed the applicability of the model to other clinical outcome indicators (DSS and PFI).
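To make the meaning of the C-index concrete, the sketch below gives a simplified pure-Python version of Harrell's concordance index. It is not the implementation in the rms or survcomp packages: pairs with tied survival times are skipped and tied risk values count as half concordant, both of which are simplifying assumptions.

```python
def harrell_c_index(times, events, risks):
    """Fraction of comparable patient pairs ordered correctly by risk.

    A pair (i, j) is comparable when patient i had an observed event
    (events[i] == 1) and a shorter follow-up time than patient j.
    The pair is concordant when patient i also has the higher risk.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Invented follow-up times, event indicators (1 = death), and risk scores.
c_index = harrell_c_index(
    times=[2, 4, 6, 8], events=[1, 1, 0, 1], risks=[0.9, 0.7, 0.4, 0.1]
)
```

A value of 1 means that every comparable pair is ordered correctly, 0.5 corresponds to random ordering, and values below 0.5 indicate that higher predicted risk tends to go with longer survival.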

Statistical Analysis
The R-project (version: 3.6.3) (https://www.r-project.org/) and Bioconductor packages (http://bioconductor.org/) were used for statistical analysis and visualization. χ2 and p-values were calculated using the Chi-square or Fisher's exact tests. The correlation between two variables was analyzed using the Spearman correlation coefficient. Differences between two groups were compared using the Wilcoxon rank-sum test, while those among multiple groups were compared using the Kruskal-Wallis test. All statistical tests were two-sided, and a value of p < 0.05 was considered statistically significant. "*," "**," "***" and "ns" indicate p < 0.05, p < 0.01, p < 0.001, and p > 0.05, respectively.

Smoking Exerts a Negative Effect on the Prognosis of Patients
In this study, we collected 2,317 patients with various smoking histories from seven cancer types, namely, bladder urothelial carcinoma (BLCA), cervical squamous cell carcinoma and endocervical adenocarcinoma (CESC), esophageal carcinoma (ESCA), head and neck squamous cell carcinoma (HNSC), kidney renal papillary cell carcinoma (KIRP), lung adenocarcinoma (LUAD), and lung squamous cell carcinoma (LUSC). The detailed information is summarized in Supplementary Table S1. The Sankey diagram of cancer type, smoking history, and survival status was depicted in Figure 2A.
The survival analysis of patients with different smoking histories revealed that non-smokers had the best OS and DSS, followed by former smokers, while current smokers had the worst OS and DSS (p < 0.05). In addition, current smokers had the highest risk of poor OS and DSS, followed by former smokers, whereas non-smokers had the lowest risk (Figures 2B,C).
For PFI, we observed no significant difference (p > 0.05) among patients with different smoking histories, but the 10- and 15-year PFI was the highest for non-smokers, followed by former smokers, while that of current smokers was the lowest. The Cox regression analysis results demonstrated that smoking history was an independent factor for OS and DSS in patients. Notably, current smoking was an independent risk factor for OS and DSS.

Differences in Immune Indicators Among Patients With Different Smoking Histories
We found higher immune levels as well as lower tumor purity in former smokers (Figure 3). Antigen-presenting cell co-inhibition, antigen-presenting cell co-stimulation, B cells, CD8+ T cells, checkpoint, follicular helper T cells, natural killer (NK) cells, T cell co-inhibition, and tumor-infiltrating lymphocytes were the highest in former smokers, lowest in current smokers, and intermediate in non-smokers. In addition, the levels of dendritic cells (DCs), chemokine receptor (CCR), cytolytic activity, immature dendritic cells, inflammation-promoting, macrophages, mast cells, neutrophils, para-inflammation, plasmacytoid dendritic cells, regulatory T cells, T cell co-stimulation, type 1 T helper cells, type 2 T helper cells, type I interferon (IFN) response, and type II IFN response were higher in former smokers and current smokers than in non-smokers, while there was no difference between former smokers and current smokers. The immune score and ESTIMATE score were higher in former smokers than in non-smokers and current smokers, while there was no difference between non-smokers and current smokers. The stromal score was the highest in former smokers, followed by current smokers, and the lowest in non-smokers.
For BCR diversity (BCR Shannon and BCR Richness), leukocyte fraction, neoantigens, and intratumoral heterogeneity, values were markedly higher in smokers than in non-smokers, but there was no significant difference between former smokers and current smokers. Next, we observed that the HRD and CTA scores were the highest in current smokers, followed by former smokers, and lowest in non-smokers (Figures 4A-G).

Smoking can Induce Tumor Cell Stemness Formation
Compared with former smokers and non-smokers, current smokers exhibited higher mRNAsi, and there was no difference between former smokers and non-smokers. DMPsi, ENHsi, and EREG-mRNAsi were considerably higher in smokers than in non-smokers. In addition, mDNAsi and EREG-mDNAsi were the highest in current smokers, followed by former smokers, and the lowest in non-smokers. These findings indicate that smoking can induce tumor cell stemness, which, for some indicators, may be reversed following smoking cessation (Figures 4H-M).

Smoking Induces More Somatic Mutations and Copy Number Variations That Remain the Same After Smoking Cessation
We found that the TMB of non-smokers was lower than that of smokers, but there was no difference in TMB between former smokers and current smokers (Figure 4N). Figure 5 and Supplementary Figure S1A show the genes with higher SNV incidence in patients with different smoking histories. Importantly, the SNV incidence of multiple genes, including TP53, TTN, MUC16, CSMD3, RYR2, LRP1B, USH2A, SYNE1, ZFHX4, FLG, XIRP2, and PCLO, among others, was markedly increased in smokers compared with non-smokers. Except for DNAH5 and NAV3, there was no statistical difference in the SNV incidence of these genes between current smokers and former smokers.
In comparison, non-smokers had a lower burden of gain and loss at the focal and arm levels than smokers. The arm level gain burden was higher in former smokers than in current smokers. Nonetheless, there were no statistical differences in the focal level gain and loss burden, or in the arm level loss burden, between former smokers and current smokers (Figure 6E). We also identified the top 30 genes characterized by a higher incidence of CNV events (Figure 7 and Supplementary Figures S1B,C). The CNV gain predominantly occurred in FNDC3B (3q26.31), GHSR (3q26.31), TNFSF10 (3q26.31), PLD1 (3q26.31), ECT2 (3q26.31), TNIK (3q26.2−q26.31), NCEH1 (3q26.31), TP63 (3q28), and SLC2A2 (3q26.2) (Figure 7A). On the other hand, CNV loss mainly occurred in CDKN2A … We observed that the incidence of CNV gain in multiple genes was the highest in current smokers, followed by former smokers, and the lowest in non-smokers (Supplementary Figure S1B). In terms of CNV loss, the incidence in multiple genes, including CDKN2A, CDKN2B, RP11-145E5.5, MTAP, and DMRTA1, was the highest in current smokers, followed by former smokers, and lowest in non-smokers (Supplementary Figure S1C), suggesting that smoking cessation could reverse CNV events in some genes.

Gene Set Enrichment Analysis
For oncogenic signatures, VEGF_A_UP.V1_DN, E2F3_UP.V1_DN, and MTOR_UP.V1_UP were significantly more enriched in smokers than in non-smokers. These signatures were also largely enriched in current smokers relative to former smokers (Supplementary Table S2, Sheet 1).

Multiple DNA Methylation Driver Genes Associated With Smoking Were Identified
Here, we identified 67 up-regulated and 416 down-regulated differentially expressed methylated genes (Supplementary Table S2, Sheet 4). Using the Spearman correlation analysis, we obtained 11 methylation driver genes, including EIF5A2, GBP6, HGD, HS6ST1, ITGA5, NR2F2, PLS1, PPP1R18, PTHLH, SLC6A15, and YEATS2.

Establishment and Validation of a Smoking-Related Prognostic Model
Using the univariate Cox regression analysis, we found that 1,404 genes were associated with OS (Supplementary Table S2, Sheet 5). As a result, a 46-gene smoking-related prognostic model was constructed (the 46 genes included in the smoking-related prognostic model are outlined in Table 1). Risk scores for all samples obtained via the prognostic model are detailed in Supplementary Table S2, Sheet 6. We found that current smokers had the highest risk scores, non-smokers had the lowest, and former smokers had intermediate scores (Figure 10A). Patients with advanced staging (stage III and IV) had higher risk scores compared with patients with earlier staging (stage I and II) for all cancer types except for ESCA (Figure 10B). Moreover, KM curves demonstrated that the OS of patients in the high-risk score group was worse compared with the low-risk score group in each cancer type (Figure 10C and Supplementary Figure S2). Notably, the independent prognostic analysis showed that the 46-gene smoking-related model was an independent risk factor for the OS of patients in each cancer type (Figures 10D,E and Supplementary Figure S2). ROC analysis showed that the 46-gene smoking-related model exhibited effective predictive power for 1-, 3-, and 5-year OS of patients in each cancer type. Furthermore, compared with other clinical indicators, the 46-gene smoking-related model possessed superior predictive ability for OS (Figures 10F-H and Supplementary Figure S3). To predict the OS for each type of cancer, we then constructed nomograms by combining the 46-gene smoking-related model with other clinical characteristics, including age, sex, stage, and so on (Figure 10I and Supplementary Figure S4). Both the calibration diagrams (Figures 10J-L and Supplementary Figure S4) and C-indexes indicated that the nomograms were highly accurate. Next, we explored the applicability of the 46-gene smoking-related model for DSS and PFI.
Our findings demonstrated that the model has good applicability for other cancer types except for ESCA. Patients with high-risk scores displayed poor DSS and PFI, while risk scores were independent risk factors for DSS and PFI (Supplementary Figures S5, S6). Remarkably, ROC analysis shows that the 46-gene smoking-related model had better predictive power for 1-, 3-, and 5-year DSS and PFI of patients in each cancer type (Supplementary Figure S7). Similarly, we constructed nomograms to predict the DSS and PFI of patients. The calibration diagrams and C-indexes show that the nomograms have high accuracy (Supplementary Figures S8, S9).

DISCUSSION
Smoking has been shown to cause a variety of diseases. Currently, there is a dearth of information comparing non-smokers, former smokers, and current smokers. In the present investigation, we explored the possible molecular mechanisms underlying the effect of smoking history on the occurrence and development of tumors based on multi-omics analysis of smoking-related tumors in the TCGA database. We found that smoking history exerted an effect on the prognosis of cancer patients. Additionally, patients with different smoking histories showed differences in immune content, tumor cell stemness, genome stability, and sensitivity to chemotherapy drugs. Multiple miRNAs that may be associated with the pathogenesis of smoking-related tumors were identified. We also built a smoking-related model to predict the prognosis of patients, which was characterized by high accuracy and wide clinical applicability.
We studied the difference in prognosis among patients with different smoking histories. The results showed that non-smokers had better OS and DSS than smokers. Importantly, quitting smoking could improve the prognosis and prolong survival time. Smoothed hazard estimates demonstrated that current smokers exhibited the highest risk for poor OS and DSS, followed by former smokers, while non-smokers had the lowest risk. Notably, the Cox regression analysis identified current smoking as an independent risk factor for OS and DSS, which agrees with previous findings that smokers have a poorer prognosis than non-smokers, that smoking cessation can improve the prognosis of smokers, and that smoking history is an independent prognostic factor for cancer patients (Vineis et al., 1988;Manjer et al., 2000;Sardari Nia et al., 2005;Tsao et al., 2006;Zhou et al., 2006;Vladimirov and Schiodt, 2009;Kenfield et al., 2011).
It has been reported that smoking can lead to changes in innate and acquired immune systems, as well as changes in the number and function of immune indicators, promote the production of a variety of pro-inflammatory factors, and inhibit the production of anti-inflammatory factors (Arnson et al., 2010). In terms of leukocyte infiltration, intratumoral heterogeneity, and neoantigens, smokers were significantly higher relative to non-smokers. Therefore, it is apparent that smoking cessation does not lead to any immediate change in this status. This is probably because tobacco can induce more leukocyte infiltration as well as neoantigens in the body (Inamura et al., 2017;Ahmad et al., 2019). Another reason for drug resistance is intratumor heterogeneity, which can be caused by tobacco (Salk et al., 2010;Alexandrov et al., 2016). Through ssGSEA, we found considerable differences in multiple immune indicators among patients with different smoking histories. Inflammation and immune regulation due to smoking are potentially important mechanisms in the development of cancer. Tobacco smoke contains a wide variety of mutagenic and carcinogenic compounds, including carbon monoxide, nicotine, nitrogen oxides, and cadmium, among others. Smoking has been established to cause many systemic immune changes, changes in the number of macrophages, neutrophils, eosinophils, mast cells, and DCs, as well as alterations in the function of macrophages and neutrophils (Shiels et al., 2014). Li et al. (2018) found that the abnormal activation of mast cells plays a vital role in abnormal pulmonary immune function caused by smoking, which may lead to tumorigenesis and development. Smoking may increase inflammation by increasing the number and function of DCs and can lead to a sharp increase in the number of DCs and Langerhans cells (Soler et al., 1989). 
Smoking can upregulate the expression of CCR7 and CD86, and significantly promote the transport and response of DCs in the airway of mice to promote allergic airway inflammation (Robays et al., 2009). Tobacco did not induce inflammation or an immune response in CD8 knockout mice (Maeno et al., 2007). The IFN-γ-inducible protein-10 derived from CD8+ T cells promotes the production of elastin in macrophages, leading to elastin fragmentation and lung damage. In addition to activating CD8+ T cells, cigarette smoke also induces CD8+ T cells to produce more toll-like receptor proteins, thereby increasing the expression of cytokines (Nadigel et al., 2011). Current smokers have a significantly higher risk of acute prostatitis than former smokers and non-smokers (Moreira et al., 2015), and smoking can induce B-cell signatures of prostatitis and prostate cancer in current smokers, leading to immunoglobulin expression (Prueitt et al., 2016). Tobacco exposure increased the expression of IFN-γ and CD107a in the NK cells of mice and enhanced the NK cell response (Motz et al., 2010). Cigarette smoke also caused mouse NK cells to express more T helper cell-17 cytokine (Bozinovski et al., 2015). Levels of a variety of immune indicators were higher in former smokers than in current smokers, possibly because continuous tobacco stimulation results in a decrease in immune function, which can be reversed after quitting smoking (Cui and Li, 2010).
In this work, we found that smokers presented higher tumor cell stemness relative to non-smokers. Tobacco is well known to promote tumor resilience through Akt-mediated ABCG2 activity, which increases the proportion of tumor stem-like cells in lung cancer and HNSC. Besides, it can also trigger the activation of the Sonic hedgehog pathway, which contributes significantly to the maintenance of stemness in kidney cancer and BLCA cells (Qian et al., 2018;Sun et al., 2020). Nicotine induces the expression of the embryonic stem cell factor SOX2 via the nAChR-YAP1-E2F1 signaling axis, which maintains the characteristics of NSCLC tumor stem cells (Schaal et al., 2018). Moreover, tobacco can induce a higher incidence of SNVs in genes including TP53, TTN, MUC16, SYNE1, CSMD3, RYR2, USH2A, and FLG, which may play a pivotal role in mediating cell malignancy and tumor progression; for certain genes, this cannot be reversed even after quitting smoking. We noted that non-smokers were substantially more sensitive to multiple chemotherapeutic drugs than smokers. Similarly, previous studies have found that smoking can induce mutations in multiple genes, which in turn can cause patients to become resistant to chemotherapy drugs (Alexandrov et al., 2016;Shang et al., 2020). For instance, mutant TP53 can interact with BCAR1 to promote tumor cell invasion, leading to poor prognosis (Guo et al., 2020). In comparison, the incidence of TP53 mutation was higher in smokers than in non-smokers. It was also observed that the frequency of TP53 mutation increased with the amount smoked (Halvorsen et al., 2016). A study by Yu et al. (2019) analyzed somatic mutations in 100 cases of NSCLC and revealed that mutations in a variety of genes, such as TTN, CSMD3, RYR2, USH2A, and ZFHX4, differed among patients with different smoking histories, with a higher mutation incidence in smokers than in non-smokers. Likewise, Shang et al. 
found that the mutation frequency of CDKN2A, FAT1, FGFR1, NFE2L1, CCNE1, CCND1, SMARCA4, KEAP1, KMT2C, and STK11 was higher in smokers compared with non-smokers (Shang et al., 2020). According to the literature, chromosome instability can lead to CNV and genetic heterogeneity, which may trigger the occurrence of cancer (Myllykangas et al., 2008). In this study, we found that tobacco causes a higher incidence of CNV, mainly occurring on chromosomes 3, 8, 1, 5, 9, 19, and 4. The gain of the 3q26 locus was remarkably related to the occurrence of human squamous cell carcinoma, including LUSC and HNSC, and was significantly associated with smoking (Li et al., 2020a). CDKN2A (9p21.3) encodes the P16 protein, which is a tumor suppressor. Studies have shown that CDKN2A loss and abnormal expression of P16 are associated with the occurrence of various malignant tumors (Kettunen et al., 2019). Smoking-related HNSC tumors showed a large number of CDKN2A losses, suggesting that smoking may induce CDKN2A CNV loss (Cancer Genome Atlas, 2015). In oral cancer, TP63 CNV was significantly associated with smoking history, and the incidence of TP63 CNV gain was considerably increased among smokers (Pattle et al., 2017). A recent study reported that loss of 19q13.42 occurred at a significantly higher rate in recurrent anal cancer than in primary tumors, implying that a loss of 19q13.42 may promote tumor recurrence (Cacheux et al., 2018). The loss of chromosome segment 8p23.3 is markedly associated with the development and progression of BLCA, resulting in poor tumor staging (Muscheck et al., 2000). In another study, Joseph et al. (2020) found loss of ARHGEF10 in more than 30% of pancreatic cancers (PC); this loss led to enhanced subcutaneous tumor growth in a mouse model, increased proliferation, invasion, and motility of PC cell lines in vitro, and enhanced metastatic spread in the mouse model. 
The gain of PLD1, which is prevalent in LUSC, is considered as a new biomarker for LUSC (Mendez and Ramirez, 2013).
Smoking is associated with the induction of chemoresistance in different types of cancers, namely, CRC (Lee et al., 2016), HNSC (Shen et al., 2010), PC (Lee et al., 2016), and BLCA (Chen et al., 2010). In particular, smoking promotes chemotherapy-resistant and anti-apoptotic effects on breast cancer cells through signaling cascades involving STAT3, galectin-3, and nicotinic acetylcholine receptors (Guha et al., 2014). Tobacco may also alter the pharmacodynamics of anti-cancer drugs (Willey et al., 1997;Villard et al., 1998). For example, in lung cancer, smokers who were treated with chemotherapeutic drugs showed more rapid drug elimination than non-smokers, requiring an increase in dose to achieve the same therapeutic effect (O'Malley et al., 2014). Nicotine has been established to promote XIAP protein stabilization and survivin transcriptional induction through the Akt pathway, thereby inhibiting the apoptotic effects of chemotherapeutic drugs on NSCLC tumor cells (Dasgupta et al., 2006). The tobacco components pyrazine, 2-ethylpyridine, and 3-ethylpyridine can induce multi-drug resistance in LUAD tumor cells, and this induction is enhanced in hypoxia (Liu et al., 2015). In BLCA, smoking induces tumor growth and mTOR inhibitor resistance through activation of the PI3K/Akt/mTOR signaling pathway (Yuge et al., 2015). Jin et al. (2004) demonstrated that nicotine-induced multisite phosphorylation of BAD may be the cause of resistance to PKC and MEK inhibitors in human lung cancer.
Nicotine, the major component of cigarette smoke, can stimulate the expression of VEGF in endothelial cells, thereby promoting endothelial cell proliferation, migration, and angiogenesis (Zhang et al., 2015). In addition, smoking can promote the binding of MZF1 to the VEGF promoter to induce VEGF expression (Krüger et al., 2020). Yuge et al. (2015) reported that smoking activates the PI3K/Akt/mTOR signaling pathway in BLCA, promoting tumor cell growth and resistance to chemotherapeutic drugs.
Epigenetic disorders are associated with the occurrence and progression of cancer. In gastric cancer, GBP6 methylation levels are negatively correlated with GBP6 mRNA expression, and high levels of GBP6 methylation are significantly associated with poor prognosis. GBP6 is also associated with oral cancer and HNSC and can therefore be used as a prognostic marker (Wu et al., 2020). ITGA5 has been reported to mediate the initial adhesion process in ovarian cancer and CRC (Ohyagi-Hara et al., 2013; Yoo et al., 2016). In breast cancer, increased ITGA5 promoter methylation decreased ITGA5 expression, thus inhibiting the growth, development, and migration of breast cancer cells (Fang et al., 2010). However, high ITGA5 expression was associated with malignant characteristics of BLCA and HNSC (Deng et al., 2019; Yan and Ye, 2019). In oral cancer, NR2F2 is hypermethylated in cancer tissue compared with normal tissue, which may play a critical role in the disease. In glioma, NR2F2 is one of the hypermethylated genes that differ between patients with better and poorer prognosis, and it is enriched in disease- and disorder-related molecular and cellular functions (Shinawi et al., 2013). Additionally, PPP1R18 encodes a defense protein that is tightly localized to cytoskeleton proteins. Significant reductions in PPP1R18 methylation have been recorded in patients with severe liver fibrosis, suggesting that epigenetic disorders are involved in the progression of the disease (Zeybel et al., 2016). The SLC6A15 methylation level was significantly higher in CRC, although it was not significantly correlated with the SLC6A15 mRNA expression level (Kim et al., 2011; Mitchell et al., 2014). Meanwhile, SLC6A15 mRNA expression was significantly reduced in ovarian cancer cell lines with a chemotherapy-resistant phenotype.
To date, few reports have addressed the association between methylation of EIF5A2, HGD, HS6ST1, PLS1, PTHLH, and YEATS2 and tumors. Overexpression of EIF5A2 was associated with invasion, metastasis, and other malignant phenotypes of various cancers, suggesting that EIF5A2 may be a potential therapeutic target (Chen et al., 2018; Ba et al., 2019; Dong et al., 2019). In addition, EIF5A2 regulates the resistance of gastric cancer cells to cisplatin by mediating epithelial-mesenchymal transition (Sun et al., 2018). Downregulation of HGD expression was associated with less metastasis, as well as better prognosis, pathological grade, and clinical stage in cholangiocarcinoma patients (Aukkanimart et al., 2015). PLS1 was overexpressed in CRC patients and associated with lymph node metastasis and poor prognosis. PLS1 can also induce the migration and invasion of CRC cells as well as metastasis to the liver and lung. In addition, PLS1 enhanced the expression of matrix metalloproteinases 9 and 2, which are key factors in CRC metastasis (Zhang et al., 2020). PTHLH was significantly overexpressed in HNSC compared with normal tissue and was associated with poor prognosis. Increased PTHLH expression can also induce cell cycle progression in tumor cells and actively regulate the expression of core proteins (Chang et al., 2017). The increased expression of HS6ST1 mRNA during the progression of cartilage tumors suggests that HS6ST1 may promote the formation of malignant phenotypes of cartilage tumors (Waaijer et al., 2012). Inhibition of YEATS2 mRNA expression can reduce the proliferation and migration of PC cells. Hypoxia-inducible factor 1α (HIF1α) regulates the expression of YEATS2 mRNA by binding to the hypoxia response element of YEATS2, and HIF1α was co-expressed with YEATS2 in PC. In turn, overexpressed YEATS2 can block the inhibitory effect of HIF1α silencing on PC cell proliferation and migration under hypoxia (Zeng et al., 2021).
Next, we developed a prognostic model for tumor patients. Patients in the high-risk score group had poorer OS than patients in the low-risk score group. Notably, a high risk score was an independent risk factor for OS. ROC analysis showed that the model predicts OS well. We further constructed a nomogram for each cancer type, combining the prognostic model with clinical features, to predict OS. Both calibration diagrams and C-indexes confirmed that the nomograms were reliable and highly accurate. Moreover, the prognostic model has wide clinical applicability for predicting the DSS and PFI of cancer patients. We also drew the corresponding nomograms and verified that they predicted the DSS and PFI of patients well.
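To illustrate the kind of computation that underlies a risk-score model and its C-index, the following minimal Python sketch may be helpful. It assumes that the risk score is a Cox-style linear predictor (the sum of coefficient times expression over the model genes) and that patients are split into high- and low-risk groups at the cohort median; both are assumptions for illustration, not details stated here. The gene names, coefficients, and survival data are hypothetical and are not taken from the 46-gene model.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Linear predictor: sum of coefficient * expression over the model genes."""
    return sum(coefficients[g] * expression[g] for g in coefficients)


def split_by_median(scores):
    """Label a score 'high' if it exceeds the cohort median, otherwise 'low'."""
    cutoff = median(scores)
    return ["high" if s > cutoff else "low" for s in scores]


def concordance_index(times, events, scores):
    """Harrell's C-index for right-censored survival data.

    A pair (i, j) is comparable when patient i had an event and a shorter
    observed time than patient j. The pair is concordant when patient i
    also has the higher risk score; tied scores count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical two-gene model and toy cohort (illustration only).
coefficients = {"GENE_A": 0.8, "GENE_B": -0.5}
patients = [
    {"GENE_A": 2.0, "GENE_B": 1.0},
    {"GENE_A": 1.0, "GENE_B": 2.0},
    {"GENE_A": 3.0, "GENE_B": 0.5},
    {"GENE_A": 0.5, "GENE_B": 3.0},
]
times = [20, 40, 10, 60]   # follow-up in months
events = [1, 1, 1, 0]      # 1 = death observed, 0 = censored

scores = [risk_score(p, coefficients) for p in patients]
groups = split_by_median(scores)
c_index = concordance_index(times, events, scores)
print(groups, c_index)
```

A C-index of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk. In practice, an established survival analysis package would normally be used instead of this hand-written loop.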
Despite these intriguing results, this study has some shortcomings. First, we did not analyze the influence of smoking duration or the number of cigarettes smoked. Second, we did not evaluate the influence of the time elapsed since smoking cessation. Finally, we did not validate the results with in vivo or in vitro experiments.
In summary, we systematically studied the molecular differences among non-smokers, former smokers, and current smokers. We found that smoking cessation can reduce the risk of poor prognosis in patients. Tobacco induces SNVs and CNVs, some of which can be reversed by smoking cessation. Furthermore, smoking can activate the immune function of patients, whereas continued smoking may lead to a decline in immune indicators. Further functional experiments are needed to verify our findings.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author.

ETHICS STATEMENT
Ethical review and approval was not required for the study on human participants in accordance with the local legislation and institutional requirements. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.

AUTHOR CONTRIBUTIONS
RW, SL, and JZ conceived and designed the experiments. RW, SL, and WW collected and analyzed the data. RW, SL, and WW prepared the figures and tables. RW, SL, WW, and JZ wrote the manuscript. JZ reviewed the manuscript. All authors read and approved the final manuscript.

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmolb.2021.704910/full#supplementary-material

Supplementary Figure 9 | Construction of nomograms for predicting PFI in each cancer type and evaluation of their accuracy. Nomograms were constructed using the smoking-related model and other clinical indicators to predict the PFI of patients. Calibration curves were used to evaluate the agreement between nomogram-predicted and observed PFI.

Supplementary Table 1 | Information about the included samples.
Supplementary Table 2 | GSEA results for each pairwise comparison of smoking histories (Sheet 1). Identification and intersection of genes differentially expressed between any two smoking-history groups (lncRNA and mRNA: Sheet 2; miRNA: Sheet 3; methylated genes: Sheet 4). Genes associated with OS, obtained through Cox regression analysis (Sheet 5). Risk scores for all samples, obtained via the prognostic model (Sheet 6).