Multiomics Landscape Uncovers the Molecular Mechanism of the Malignant Evolution of Lung Adenocarcinoma Cells to Chronic Low Dose Cadmium Exposure

Cadmium (Cd) from cigarette smoke and polluted air can lead to lung adenocarcinoma after long-term inhalation. However, most studies are based on short-term exposure to this toxic metal at high concentrations. Here, we investigate the effects of long-term exposure of A549 cells (lung adenocarcinoma) to cadmium at low concentrations using morphological and multiomics analyses. First, we treated A549 cells continuously with CdCl2 at 1 μM for 8 months and found that CdCl2 promoted cellular migration and invasion. After that, we applied transmission electron and fluorescence microscopy and did not observe significant morphological changes in the Golgi apparatus, endoplasmic reticulum, lysosomes, or mitochondria of Cd-treated cells; microfilaments, in contrast, accumulated in lamellipodia and adhesion plaques, which suggested that Cd enhanced cellular activity. Second, by using whole-exome sequencing (WES) we detected 4222 unique SNPs in Cd-treated cells, which included 382 unique non-synonymous mutation sites. The corresponding mutated genes, after GO and KEGG enrichments, were involved mainly in cell adhesion, movement, and metabolic pathways. Third, by RNA-seq analysis, we showed that 1250 genes (784 up and 466 down), 1623 mRNAs (1023 up and 591 down), and 679 lncRNAs (375 up and 304 down) were differentially expressed. Furthermore, GO enrichment of these RNA-seq results suggested that most differentially expressed genes were related to cell adhesion and organization of the extracellular matrix in biological process terms; KEGG enrichment revealed that the differentially expressed genes took part in 26 pathways, among which the metabolic pathway was the most significant. These findings could be important for unveiling mechanisms of Cd-related cancers and for developing cancer therapies in the future.


INTRODUCTION
Cadmium (Cd) is a known toxic and carcinogenic transition metal that is distributed widely in the environment. It was classified as a class I human carcinogen by the International Agency for Research on Cancer (IARC) in 1993 (1). The lung is a primary target organ of exposure to cadmium because this metal is mainly absorbed through inhalation (2,3). Cadmium has a very long biological half-life, which results in cumulative toxic and carcinogenic effects (4). Therefore, the harm to the human body also shows long-term and multifaceted characteristics (3). However, most previous studies described the potential of cadmium to cause carcinogenic effects using short-term exposure to this toxic metal at high concentrations (5,6). To mimic conditions closer to occupational and biologically relevant cadmium exposure (7), we exposed the human lung adenocarcinoma cell line A549 to cadmium at a concentration of 1 μM for 8 months in our study.
Pulmonary cancer is one of the leading malignant tumors in the world. Despite the existence of various chemical, physical, and immunological therapies, pulmonary cancer in many countries still has a low 5-year survival rate (<15%) (8). Indeed, epidemiological investigations (using specific pollution) and experimental studies (using laboratory animals) have both demonstrated that Cd exposure increased the risk of pulmonary adenocarcinomas (3,9). However, the specific mechanism by which cadmium exposure promotes the development of lung cancer has not been documented. Carcinogenesis is a multi-stage process that involves a multitude of alterations to the cell. Several studies have focused on how cadmium exposure induces malignant degeneration of bronchial or alveolar epithelial cells, but cadmium may promote cancer at all stages of its development. There is still a lack of research on how cadmium exposure promotes the malignant progression of lung cancer. In our study, human lung A549 cells were chosen because they display many differentiated features of lung alveolar cells and have been used by many researchers in cadmium toxicity studies (10)(11)(12).
With the advent of next-generation sequencing technologies, recent carcinogenic investigations have focused on studying global exome and transcriptome changes to understand the molecular basis of cancer (13,14). Omics data, such as exome, transcriptome, and epigenome data, provide an integrated global functional view of cellular responses after exposure to environmental substances. These responses can be linked to the generation and progression of disease. Genome/exome-wide data sets, a compilation of large, curated, mechanistic databases, and bioinformatic predictions can deepen our understanding of the sequence of events; this understanding can help us to propose hypotheses, which can be validated using experimental approaches (15). In this study, we aim to investigate somatic genetic alterations of adenocarcinoma (A549 cells) after long-term, low dose cadmium exposure by using whole-exome sequencing (WES) and RNA-sequencing analyses.

Effects of Chronic Low Dose Cadmium Exposure on Cytoskeleton, Mitochondria, Endoplasmic Reticulum, Golgi Apparatus, and Lysosomes of A549 Cells
To determine the influence of cadmium (Cd) on A549 cells, we performed the following experiments: 1) verification of the identity of the A549 cell line; and 2) imaging of the cellular ultrastructure and organelle morphology. DNA fingerprinting using short tandem repeat (STR) profiling is an easy and reliable tool that can be used to verify cell lines. The STR profile does not change significantly with cell passage number (16). For verification, we confirmed the identity of A549 cells and excluded the possibility of cross-contamination in long-term cell culture by STR-based DNA fingerprinting (Supplementary Figure 1). The results showed that Cd-treated (A549+Cd) cells and Cd-untreated (A549+H0) cells had essentially identical profiles, which indicated no cross-contamination in these two cell lines.
A great deal of evidence suggests that the biphasic nature of cadmium is characterized by low-dose stimulation and high-dose inhibition (17,18). To evaluate this hypothesis in our experimental system, we treated A549 cells with a low concentration (1 µM) of cadmium chloride (CdCl2) for 8 months. For ultrastructure imaging, we used a transmission electron microscope (TEM) and found that long-term exposure to CdCl2 did not change the ultrastructure of A549 cells noticeably (Figures 1A, B), which suggested Cd did not exhibit a biphasic nature on A549 cells in this setting.
For imaging of organelle morphology, we probed the mitochondria, the endoplasmic reticulum (ER), the Golgi, and lysosomes with fluorescent dyes. No significant morphological change was found in the Golgi apparatus, endoplasmic reticulum, lysosomes, or mitochondria after long-term exposure to Cd (Figures 1C-F). However, when we probed the actin cytoskeleton, we found that actin was apparently more disordered after Cd treatment (Figure 1G). The cytoskeleton is a complex of detergent-insoluble components of the cytoplasm that play critical roles in cell motility, shape generation, and the mechanical properties of a cell. These results implied that long-term exposure to low doses of CdCl2 might affect the organization of the actin cytoskeleton in A549 cells.

Promoting Adenocarcinoma A549 Migration and Invasion
To examine the effects of chronic low dose CdCl2 on migration and invasion of A549 cells, we performed the Transwell invasion assay and the Scratch test migration assay. Our invasion assay showed that Cd exposure increased the number of invading cells; the cell numbers of the A549+Cd group and the A549+H0 group were 144.3 ± 7.41 and 98.5 ± 12.86, respectively (t=4.965, p=0.0157) (Figure 2A). The migration assay showed that A549+Cd cells exhibited increased migration ability compared with A549+H0 cells, as demonstrated by the narrower scratch gap (Figure 2B). These results indicated that chronic cadmium exposure, even at a low dose, may affect cell invasion and migration.
Genomic Variations in A549 Cells Due to Chronic Low Dose Cadmium Exposure
As a significant development of next-generation sequencing, WES is a powerful tool for evaluating genomic variation. To identify genomic variation between the A549+Cd cells and A549+H0 cells, we performed high-throughput WES, which generated a total of ~33.0 Gb of data (>20× depth, 87× depth max) (Table 1). Based on the sequencing data, we characterized a total of 58,289 SNPs in A549+Cd cells and 50,887 SNPs in A549+H0 cells (Figure 3 and Supplemental Data 1). For further biological analyses, we selected 6876 nonsynonymous mutation SNP sites, 125 non-frameshift mutations, 64 frameshift mutations, 52 stop-gain (terminator acquisition) loci, and six stop-loss (terminator loss) loci from both cell lines. Subsequently, 382 unique nonsynonymous mutation SNPs were detected in the A549+Cd cells (Supplemental Data 2) compared with A549+H0 cells. Among these 382 SNPs, 78 affected genes were associated with cell migration and invasion ability. However, none of them harbored a specific pathogenic mutation based on rigorous evaluation (Pathogenicity Calculator, http://calculator.clinicalgenome.org/site/cg-calculator) (Supplemental Data 3). To explore the 78 affected genes in human lung adenocarcinoma, we downloaded human lung adenocarcinoma SNP mutation data and transcriptome data (normal: 54, tumor: 497) (TCGA-LUAD) from the TCGA database.
We found that the mutation rates of TTN, PCLO, RYR3, and LRP2 in human lung adenocarcinoma were 41%, 16%, 14% and 11%, respectively, but the mutation rates of the other mutant genes were <10% (Figure 4A and Table 2). From the DESeq2 differential analysis, 1826 up-regulated genes and 1295 down-regulated genes were obtained from the transcriptome data of TCGA lung adenocarcinoma. By taking the intersection as shown in Figure 4B, we found that, among the 78 affected genes, CKAP2L, FAT1, GSDMC, MUC4, KRT15, KIF18B and SKA1 were up-regulated in the TCGA database, MYH11, SPTBN1, LAMC3 and CD33 were down-regulated, and the other genes were not differentially expressed in TCGA lung adenocarcinoma.
Functional annotation analysis of genes in the associated regions was performed using different databases, which included NCBI non-redundant (NR), The Gene Ontology (GO), Clusters of Orthologous Groups (COG), and Kyoto Encyclopedia of Genes and Genomes (KEGG) Pathway analysis. According to GO enrichment analysis, the unique mutant genes were enriched in 30 GO terms, which included 10 biological process (BP) terms, 10 molecular function (MF) terms, and 10 cellular component (CC) terms. In BP terms, most of the mutated gene enrichments involved adhesion and cell movement, such as biological adhesion, cell adhesion, homophilic cell adhesion, and actin filament-based movement (Table 3 and Figure 5). In MF terms, mutated genes were largely related to metal ion binding, cation binding, and calcium ion binding. In CC terms, major mutations were involved in extracellular regions and the cytoskeleton. The KEGG enrichment revealed that the unique mutant genes in A549+Cd cells were enriched in nine pathways, in which the metabolic pathway had the majority of enriched genes (Figure 5A). The enrichment analysis showed that unique mutant genes in A549+Cd cells were associated with protein digestion and absorption (four mutated genes), metabolism of xenobiotics by cytochrome P450 (five mutated genes), starch and sucrose metabolism (three mutated genes), one carbon pool by folate (three mutated genes), amino sugar and nucleotide sugar metabolism (three mutated genes), carbon metabolism (four mutated genes), glycosaminoglycan degradation (two mutated genes), drug metabolism-cytochrome P450 (five mutated genes), and chemical carcinogenesis (five mutated genes) (Figure 5B).
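The identification of variants unique to the A549+Cd cells can be illustrated as a set difference between the two call sets. The following is only a minimal sketch under stated assumptions: the `Variant` record, the function name `unique_to_treated`, and the toy calls are hypothetical and are not the pipeline or data used in this study.

```python
from typing import NamedTuple, Set


class Variant(NamedTuple):
    """Hypothetical minimal variant key: position plus reference/alternate allele."""
    chrom: str
    pos: int
    ref: str
    alt: str


def unique_to_treated(treated: Set[Variant], control: Set[Variant]) -> Set[Variant]:
    """Return variants called in the treated sample but absent from the control."""
    return treated - control


# Toy call sets for illustration only (not data from this study).
control_calls = {
    Variant("chr1", 100, "A", "G"),
    Variant("chr2", 200, "C", "T"),
}
treated_calls = control_calls | {Variant("chr3", 300, "G", "A")}

unique_calls = unique_to_treated(treated_calls, control_calls)
print(sorted(unique_calls))
```

Keying each variant on chromosome, position, and both alleles means that a different substitution at the same position is still counted as unique to the treated sample.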

Profiles of Transcriptomes of A549 Cells Due to Chronic Low Dose Cadmium Exposure
To characterize transcriptomic differences after long-term exposure to Cd, we conducted next-generation RNA sequencing. By comparing the RNA sequencing data of A549+Cd and A549+H0 cells, we obtained the differential transcriptome profiles that were caused by chronic, low dose cadmium exposure. We identified 1250 differentially expressed genes, among which 784 genes were significantly up-regulated and 466 genes were down-regulated in A549+Cd cells, compared with A549+H0 cells. Among them, 132 differentially expressed genes were related to migration: 85 up-regulated genes (Table 4) and 47 down-regulated genes (Table 5). We also identified differentially expressed mRNAs, which included 1023 increased and 591 decreased mRNAs. Transcriptional data also indicated that there were 679 differentially expressed lncRNAs (375 increased and 304 decreased).
Functional enrichment analysis based on GO annotations showed that the differentially expressed genes of A549+Cd cells were enriched in 30 GO terms, which included 10 biological process (BP) terms, 10 molecular function (MF) terms, and 10 cellular component (CC) terms (Supplementary Data 5 and Figure 5). Most differentially expressed genes that were enriched in BP terms may be related to cell adhesion and extracellular matrix organization, such as biological adhesion and cell adhesion, which was similar to the GO annotations based on WES. Most differentially expressed genes enriched in MF terms were related to receptor binding and protein heterodimerization activity. In addition, most differentially expressed genes enriched in CC terms were related to extracellular regions and integral/intrinsic components of the plasma membrane. These results from the transcriptomes were similar to the results of the entire exome, which indicated that long-term Cd exposure may affect the development of carcinoma.
Apart from GO annotations, we also performed KEGG enrichment analysis and found that the DEGs were involved in 26 pathways (Supplementary Data 6); the metabolic pathway was the most significant (Figure 6). Other significant pathways were chemical carcinogenesis, transcriptional mis-regulation in cancer, cytokine-cytokine receptor interaction, calcium signal pathway, Rap1 signal pathway and Hedgehog pathway.

DISCUSSION
Cadmium (Cd) is a heavy element that is known to accumulate in the body and to have strong biological effects. The main routes of entry for cadmium into the body include the respiratory tract, digestive tract, and skin contact. As a non-essential element with a half-life of 20-30 years in vivo, Cd can remain in the human kidney and liver for almost a lifetime. Therefore, the harm to the human body also shows long-term and multifaceted characteristics. Since the relationship between cadmium and itai-itai disease ("pain disease") was put forward in the 1960s, there have been many studies on the relationship between cadmium-related environmental pollution and human health. Due to daily-life and industrial activities, 4000 to 13000 tons of Cd are discharged into the environment every year (19). These toxic pollutants can then accumulate in the human body through direct inhalation of air, drinking water, or by eating crops that grew in polluted soil (19).
In addition, occupational exposure and smoking are also important avenues (20). Unlike the kidney and liver, the lung is a primary target organ, directly exposed to Cd pollutants, especially for those people with occupational exposure or for those who smoke tobacco. The affected lung can suffer acute inflammation, chronic edema, bronchitis, and even cancers (21). For the underlying molecular mechanisms of these diseases, Cd may be a strong inducer for deletion of multiple loci in the genes (or gene mutations). Because it cannot be bound to DNA stably, Cd might lead to gene mutations by indirect genotoxic mechanisms, such as oxidative stress, inhibition of DNA repair, stimulation of cell proliferation, blockage of apoptosis, and epigenetic mechanisms (4,22). Although Cd can induce gene mutation indirectly and promote the development of cancer, a global picture of the progression of adenocarcinoma after long-term exposure to low-dose Cd is still vague. Previous studies have reported that cadmium affected the biological behavior of cells through Wnt (23,24), PI3K/AKT (25,26) or JNK (5,27) signaling pathways, but there is still a lack of studies on the effect of long-term, low dose cadmium exposure on the whole exome and transcriptome of lung cancer cells. Cadmium can induce gene mutation indirectly (28), for example by inhibiting the DNA repair system, which results in gene damage and accumulation of mutations. To provide a clear picture, we sequenced the whole exome and transcriptome of A549 cells after a low dose, long-term treatment with Cd and then analyzed the function of the mutated genes by GO and KEGG enrichments.
Our exome results showed that 78 mutated genes were related to biological processes of cell movement and migration, such as calmodulin binding, microfilament movement function, calcium binding function, cytoskeleton composition, actin-related biological processes, and cell adhesion. Our transcriptomic results showed that Cd affected the expression of genes related to calcium binding, cell adhesion, DNA, and protein synthesis. These influences are signs of transcriptomic disorders in tumor progression. The transcriptomic disorders may lead further to cytoskeleton disorders in A549 cells and then affect biological functions, such as adhesion and migration. Because Cd was shown previously to regulate Wnt, PI3K/AKT and JNK signaling pathways, we speculated that these pathways, together with the disorders found in our studies, rendered the progression of adenocarcinoma more malignant. Other tumor-related pathways (e.g., Hedgehog pathway, Rap1 signal pathway, calcium signal pathway, and chemical carcinogenesis) might also crosstalk with those disorders. Reactive oxygen species (ROS) can be induced by Cd-related cellular disorders, such as decreased activity of antioxidant enzymes, imbalance of intracellular calcium homeostasis, interference with cellular calcium metabolism, and damage to the DNA repair mechanism (29,30). Over the last decade, many investigators have highlighted the involvement of ROS-stimulated signaling in metal-induced carcinogenesis. Unlike general concepts on the relationship between metals and ROS, a direct role of ROS is not considered likely in cadmium-induced carcinogenesis because cadmium does not participate in Fenton-type chemical reactions. Although cadmium is not a direct mutagenic inducer, proposed mechanisms of cadmium-induced carcinogenesis include formation of ROS, alteration of antioxidant enzymes, inhibition of DNA repair enzymes, and an imbalance between pro- and anti-apoptotic proteins.
However, based on our sequencing data, we did not detect obvious alteration of ROS-stimulated signaling, apoptotic signaling, or DNA repair signaling. Thus, we believe that with respect to chronic low dose cadmium exposure, ROS may not play an important role in the malignant progression of cancer. Although ROS might not be involved, we found that metabolic pathways played a critical role in Cd-induced lung cancer progression. Several studies have identified extensive differences in metabolic profiles between cancerous and normal lung tissue (31,32), which are consistent with our observations. Alterations in various metabolic pathways conferred selective advantages to cancer cells, such as producing energy and substrates for biosynthesis, increasing redox imbalance, and promoting progression of cancer (e.g., growth, proliferation, and migration) (33). The metabolic toxicity of Cd was rarely discussed in previous studies; only a few studies indicated that Cd disrupted multiple metabolic pathways in adipocytes and progenitor cells (34).
Several limitations of this study are acknowledged here. First, Cd-induced progression of adenocarcinoma was investigated in A549 cells, which might be different from that measured in lung cancer tissues. Although most of the target genes screened in our cell experiments were validated in a lung cancer cohort, applying the results from surrogate tissue to humans should be done with caution, and more advanced models or human tissues are needed to confirm these alterations in the responding genome and transcriptome. Second, the Cd dose used in this study might not represent the ordinary exposure of human lung tissue. It is difficult to determine the appropriate dose because the reported cadmium levels in lung tissue were largely inconsistent (35)(36)(37). To address this issue, we carried out preliminary experiments and determined suitable conditions that affected the biological behavior of cells significantly without decreasing cell viability. These conditions were intended to mimic chronic Cd exposure and accumulation in lung tissue. In addition, the absence of in vivo environmental pressure, especially the tumor surveillance mechanism that removes morbid cells, may also bias the results. Finally, we only carried out experiments using A549 cells, which acted as surrogate tissue for KRAS mutant/EGFR wild type lung carcinoma. Because the A549 cell line is already a KRAS-mutated lung cancer cell line, the variants may have been generated spontaneously during in vitro culture. We did not analyze the changes between the long-term cultured A549 cells and the baseline cells or the low dose exposed A549 cells, so the impact of long-term culture on the cell line may be underestimated. To better understand the evolution of genomic, transcriptomic and behavioral events at different time points, a short-term, high dose cadmium exposure model should be included in future studies.
Given that lung cancer is highly heterogeneous, various lung cancer cell lines or tissues should be utilized to explore the association between Cd-induced genomic/transcriptomic alterations and lung cancer more thoroughly.
In conclusion, Cd, as a toxic heavy metal from polluted air or cigarette smoke, can target human lungs directly and lead to progression of adenocarcinoma. To investigate systematically the physiological influence of long-term Cd exposure on lung cancer, we established a chronic model by using A549 cells and provided a global view of genomic/transcriptomic regulation, which was linked with our phenotypic observations (i.e., cancer cell migration and invasion). Thus, our studies based on a long-term Cd exposure model not only deepened our understanding of the influence of Cd on adenocarcinoma, but also could be a useful indication for future therapeutic development.

Cell Culture and Treatment
Lung adenocarcinoma A549 cells were purchased from the Cell Bank of Type Culture Collection of Chinese Academy of Sciences (Shanghai, China). A549 cells were cultured in DMEM medium (Invitrogen, Carlsbad, CA) that contained 10% fetal bovine serum, 100 U/ml penicillin, and 100 μg/ml streptomycin (Invitrogen). Cells were authenticated by short tandem repeat (STR) and amelogenin testing against the reference profile of A549 (ATCC CCL-185) (Supplementary Figure 1).
For cadmium exposures, cadmium chloride hemipentahydrate (Acros Organics, Geel, Belgium) was added to the media and applied evenly to the cultured cells. Cadmium chloride at 1 μM was added to the cultured cells continually for 8 months to build the chronic low-dose Cd model.

Examination of Cell Ultrastructure
The ultrastructure of A549+Cd cells and A549+H0 cells was observed using a transmission electron microscope (TEM). Cells were fixed with 2.5% glutaraldehyde (v/v) in 100 mM phosphate buffer (PBS, pH 7.0) for 2 h. Cells were washed three times with PBS and post-fixed in 1% osmium tetroxide (OsO4) for 1 h. Cells were dehydrated with an ethanol series, infiltrated, embedded in araldite and sectioned at 70 nm thickness using an ultramicrotome. Ultrathin sections were stained with 2% uranyl acetate and 0.2% lead citrate. Sections were examined using a TEM (JEOL JEM-1230 EX, Japan) at 80 kV, and images were recorded.

Scratch Test Migration Assay and Matrigel Transwell Invasive Assay
A two-chambered culture insert (Ibidi, Wisconsin, US) was placed in a 35 mm dish. The cells (2.5×10^5 cells/chamber) were then seeded into the insert for 24 h until they were attached and fully confluent. After that, the inserts were removed using sterile forceps to create an even 500 μm cell-free gap. Cells were washed carefully with PBS once to remove any floating cells. Cell migration was analyzed at different time points (0 h, 6 h, 12 h, and 24 h) with a Nikon TMS-F phase contrast microscope (Tokyo, Japan).
For the Matrigel Transwell invasive assays, A549+Cd cells and A549+H0 cells were collected and resuspended at a density of 1×10^5 cells in 100 μl of serum-free medium and then seeded into the upper chamber (Corning, USA) with a Matrigel-coated membrane (24-well insert; 8-μm pore size) according to the manufacturer's protocol. Afterwards, we filled the lower chambers with DMEM that contained 10% FBS. After incubation for the indicated time, non-invasive cells on the upper surface of the upper chamber were removed mechanically with cotton swabs; the cells on the lower surface of the filters were fixed with methanol for 30 min and stained with 0.1% crystal violet for 30 min. The number of invasive cells was counted in five random 200× fields using an inverted microscope (Olympus IX51; Olympus America Inc., Melville, NY, USA).

WES Analysis and Exome Data Analyses
For exome sequencing, DNA of A549 cells (with/without Cd treatment) was extracted with a Qiagen genomic DNA extraction kit (Qiagen GmbH, Germany). DNA libraries were then prepared using a NEXTflex™ Rapid DNA Sequencing Kit (5144-02). The libraries were tested for enrichment by qPCR; their size distribution and concentration were determined with an Agilent Bioanalyzer 2100. After the library construction, WES was performed on an Illumina HiSeq3000 sequencer (version 3, Illumina, Inc., California, USA) in high-output mode with 150 bp paired-end reads. The resulting WES data were analyzed by a pipeline with three major steps: 1) quality control (QC) of raw data; 2) sequence read mapping; and 3) single-nucleotide polymorphism (SNP) analyses. In detail, QC was performed on the NGS raw data (FASTQ format) by using FastQC (38), which provides a thorough examination of the reads. Different attributes of reads (e.g., Phred score, GC ratio, read coverage, adapter/primer influences, and sequence duplicates) were checked, and raw reads were cleaned by removal of adapter sequences and low-quality reads (Phred quality <20) to provide high-quality reads for genome mapping. The sequences were then mapped to the human genome (hg19) using the Burrows-Wheeler Aligner (0.7.12) (39). After mapping, GATK/Picard (40) software was employed to detect SNP loci with Phred score >30 (i.e., error rate < 1/1000). Subsequently, ANNOVAR (41) was used to annotate SNP loci to determine whether the detected SNPs were in known databases (i.e., dbSNP138SNP database, 1000 Genome database and ESP6500 human exon database), and to provide gene functional annotation, exonic variant annotation, and heterozygosity.
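The relationship between the Phred thresholds quoted above and the corresponding error rates follows the standard definition Q = -10 log10(P). The short sketch below only illustrates this conversion; the function names are hypothetical and are not part of the pipeline used here.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score Q to an error probability P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Convert an error probability P to a Phred quality score Q = -10*log10(P)."""
    return -10 * math.log10(p)


# Thresholds mentioned in the text: read filtering at Q20, SNP calling at Q30.
for q in (20, 30):
    print(f"Q{q}: error probability = {phred_to_error_prob(q):.4g}")
```

Under this definition, Q20 corresponds to an error probability of 1/100 and Q30 to 1/1000.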
GO and KEGG analyses were performed through the DAVID (https://david.ncifcrf.gov) database (42). In addition, to evaluate further the pathogenicity of the SNPs in the 78 mutated genes associated with cell migration and invasion, we checked them with the Pathogenicity Calculator (http://calculator.clinicalgenome.org/site/cg-calculator) (43). To analyze the 78 mutated genes associated with cell migration and invasion, the human lung adenocarcinoma SNP mutation data and transcriptome data (normal: 54, tumor: 497) (TCGA-LUAD) were downloaded from The Cancer Genome Atlas (TCGA) database, and differential expression was analyzed with the DESeq2 package (44). The overlap of up-regulated mutated genes and down-regulated mutated genes was visualized using the R package "maftools".
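As a rough illustration of what an over-representation (enrichment) test computes, the sketch below evaluates a one-sided hypergeometric tail probability. This is a generic textbook formulation and an assumption on our part; DAVID applies its own modified Fisher exact score, so the values would not match DAVID output exactly. The function name and all counts in the example are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population: int, annotated: int, drawn: int, hits: int) -> float:
    """One-sided P(X >= hits) for X ~ Hypergeometric(population, annotated, drawn).

    population: number of background genes
    annotated:  background genes carrying the term (e.g., one GO term)
    drawn:      size of the gene list being tested
    hits:       genes in the list that carry the term
    """
    upper = min(annotated, drawn)
    total = comb(population, drawn)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical counts for illustration only (not values from this study).
p_value = hypergeom_enrichment_p(population=20000, annotated=500, drawn=382, hits=25)
print(f"enrichment p-value = {p_value:.3g}")
```

When many terms are tested at once, the resulting p-values would still need multiple-testing correction before interpretation.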

RNA Sequencing and Data Analysis
Total RNA was extracted and purified by using a Qiagen RNeasy Mini Kit (Hilden, Germany). The purity, concentration and integrity of total RNA were determined with a NanoDrop 2000 spectrophotometer, a Qubit 3.0 Fluorometer and an Agilent 2100 Bioanalyzer, respectively. To retain all lncRNAs, which included those with or without a poly(A) tail, ribosomal RNA was removed with an Epicentre Ribo-zero™ rRNA Removal Kit (Epicentre, USA). The purified RNA was fragmented, converted into cDNA with adenylation of 3' ends, and then amplified by PCR. RNA-seq was then performed by Bai Hao Biological Technologies (Liaoning, China), using an Illumina HiSeq3000 platform (50 cycles with 150-base paired-end reads). Totals of 61,156,328 and 47,872,203 paired-end reads remained in A549+Cd and A549+H0 cells, respectively, after the QC process, which included the removal of adapter sequences, contaminated sequences, and low-quality sequences (Phred quality < 20 or N base > 10%). We also used the RNAcentral database (45) to remove the rRNAs that would affect the lncRNA data analysis, and 121,308,770 and 94,864,580 reads remained for A549+Cd and A549+H0 cells, respectively, for the mapping process. Tophat2 (46) was used to map reads to the Ensembl GRCh37/hg19 (iGenome version) reference with the default parameters (--read-mismatches = 2 and --read-gap-length = 2). Subsequently, we obtained 75,968,484 mapped reads in A549+Cd that were distributed in exonic (80.9%), intergenic (3.4%), intronic (15.3%) and splicing (0.4%) regions. A549+H0 had a similar result, with 69,272,540 mapped reads distributed in exonic (78.6%), intergenic (4.0%), intronic (17.0%) and splicing (0.4%) regions. The uniquely mapped reads were subjected to subsequent processing, such as removal of PCR duplicates, before counting transcripts. Differential expression analysis (Cd-treated compared with untreated cells) was conducted by a standard workflow, using AudicS (47) for genes, mRNA and lncRNA.
The Benjamini-Hochberg multiple test correction was enabled by default during the analysis. All thresholds for significant differential expression were then set as q-value < 0.05 & |log2 (Fold change)| > 1.5.
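The significance criteria above can be sketched as a Benjamini-Hochberg adjustment followed by a joint filter on q-value and fold change. The code below is a generic, minimal implementation written for illustration; it is not the AudicS or DESeq2 code, and the function names and example p-values are hypothetical.

```python
from typing import List


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values (q-values) in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so q-values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues


def is_differentially_expressed(q: float, log2_fc: float,
                                q_cut: float = 0.05, lfc_cut: float = 1.5) -> bool:
    """Apply the thresholds stated in the text: q < 0.05 and |log2(fold change)| > 1.5."""
    return q < q_cut and abs(log2_fc) > lfc_cut


# Hypothetical p-values for four transcripts (illustration only).
example_q = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(example_q)
```

Requiring both criteria excludes transcripts whose change is statistically detectable but small in magnitude.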

Statistical Analysis
The results of the Scratch test migration assay and the Matrigel Transwell invasive assay were expressed as mean ± standard deviation calculated from three independent experiments. The data were analyzed with an independent samples t-test using SPSS 17.0 software. Values of p < 0.05 were taken as statistically significant. The DESeq2 and AudicS workflows used the Benjamini-Hochberg multiple test correction (FDR) to convert p-values to q-values.
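For readers unfamiliar with the independent samples t-test, the sketch below computes the equal-variance (Student) t statistic and its degrees of freedom from two groups of replicates. It is an illustration only: the replicate values are hypothetical, SPSS also reports an unequal-variance variant, and no p-value is computed because that requires the t distribution, which is not in the Python standard library.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence, Tuple


def student_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, int]:
    """Return the pooled-variance t statistic and degrees of freedom for two groups."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pooled sample variance, weighting each group by its degrees of freedom.
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t_stat = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t_stat, df


# Hypothetical replicate values for two groups (not data from this study).
group_a = [2.0, 4.0, 6.0]
group_b = [1.0, 2.0, 3.0]
t_stat, df = student_t(group_a, group_b)
print(f"t = {t_stat:.3f}, df = {df}")
```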

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.

AUTHOR CONTRIBUTIONS
SDD contributed to the conception of the work, designed the work and drafted the manuscript. SW participated in the acquisition and analysis of the data. JCZ and YNQ interpreted the data and revised the manuscript. All authors contributed to the article and approved the submitted version.