Multi-Omics-Based Identification and Functional Characterization of Gh_A06G1257 Proves Its Potential Role in Drought Stress Tolerance in Gossypium hirsutum

Cotton is one of the most important fiber crops globally. Despite this, various abiotic stresses, including drought, cause yield losses. We used transcriptome profiles to investigate the co-expression patterns of gene networks associated with drought stress tolerance. We identified three gene modules containing 3,567 genes highly associated with drought stress tolerance. Within these modules, we identified 13 hub genes based on intramodular significance, for further validation. The yellow module has five hub genes (Gh_A07G0563, Gh_D05G0221, Gh_A05G3716, Gh_D12G1438, and Gh_D05G0697), the brown module contains three hub genes belonging to the aldehyde dehydrogenase (ALDH) gene family (Gh_A06G1257, Gh_A06G1256, and Gh_D06G1578), and the pink module has five hub genes (Gh_A02G1616, Gh_D12G2599, Gh_D07G2232, Gh_A02G0527, and Gh_D07G0629). Based on RT-qPCR results, the Gh_A06G1257 gene has the highest expression under drought stress in different plant tissues and it might be the true candidate gene linked to drought stress tolerance in cotton. Silencing of Gh_A06G1257 in cotton leaves conferred significant sensitivity in response to drought stress treatments. Overexpression of Gh_A06G1257 in Arabidopsis also confirms its role in drought stress tolerance. L-valine, Glutaric acid, L-proline, L-Glutamic acid, and L-Tryptophan were found to be the most significant metabolites playing roles in drought stress tolerance. These findings add significantly to existing knowledge of drought stress tolerance mechanisms in cotton.


INTRODUCTION
Cotton (Gossypium spp.) has been cultivated for many years by human beings (Fang et al., 2017). Today, G. hirsutum accounts for 95% of the yearly cotton production globally  with India, China, United States, Pakistan, and Brazil being the leading five cotton-growing countries in the world. They produce 76% of all cotton on the globe (Baytar et al., 2018).
China's cotton industry has particularly grown considerably in the last 60 years. It produces 30% of the world's cotton despite only having 15% of acreage for cotton at present (Dai and Dong, 2016).
Drought stress causes extensive crop loss and is predicted to intensify in the future. As a result, a global movement is underway to promote drought-tolerant crops (Shadakshari and Shanthakumar, 2015). Drought tolerance occurs as a result of a chain of molecular, cellular, and physiological developments including the induction and/or repression of a variety of genes that are the basis of the buildup of numerous osmolytes, enhanced antioxidant system, decreased transpiration, repressed shoot growth, and decreased tillering (Joshi et al., 2016). Plants have several mechanisms to conquer this abiotic stress such as drought avoidance (i.e., reduction of transpiration by closing stomata and thus bearing inner water potential), drought escape (i.e., rapid maturation), and drought tolerance (i.e., dealing with water stress without changing physiological features) (Iqbal et al., 2013). Plants change the transcriptional activity of stress response genes at the molecular level in response to abiotic stimuli.
Omics techniques have been shown to account for the majority of relevant and prospective biotechnological tools for enhancing plant abiotic stress tolerance (Bagati et al., 2018). Transcriptome has been broadly employed to investigate how stress factors impact the transcriptome of crops. As a result, the ability of a crop to maintain photosynthesis in the face of drought is a substantial measure of drought tolerance. However, the expressions of these genes in response to drought stress tolerance remains poorly studied . It is a decisive tool that offers a combined task of the genes along with information on gene abundance and expression levels in various plant tissues when crops are faced with numerous abiotic and biotic stress elicitors .
In addition, metabolomics displays crucial secondary metabolites of tolerant varieties for fighting abiotic stress (Fahimirad and Ghorbanpour, 2019). Metabolites are believed to be signaling molecules because they are correlated with physiological practices and are distributed from each organelle to the cytoplasm in the form of retrograde signals. Plant responses to drought are influenced by interactions between genes, metabolites, proteins, and the drought-responsive transcriptome. As a result, by integrating transcriptome and metabolomics methods, we can better recognize the processes underlying fundamental plant resistance to drought stress. Many studies on the integration of transcriptome and metabolome have been reported in response to salt stress in Astragalus membranaceous, foxtail millets, and sesame (Jia et al., 2016;Shi et al., 2018;Zhang et al., 2019), whereas some used integrated approach in Nicotiana tabacum and rice under cold stress Jin et al., 2017).
Progress in functional genomics of cotton will depend on enlarging high-throughput technologies and integrating multidisciplinary action toward future cotton enhancement programs (Ashraf et al., 2018). Moreover, Abdelrahman et al. (2015) stated that for a long time, breeding and continuous cultivation for desired agronomic features has had a negative impact on the diversity of current cotton genotypes, resulting in a reduction in genetic diversity. As a result, wild accessions became valuable pools of natural genetic diversity that can be used to expand cultivar genetic bases . Cotton production, on the other hand, is hampered by numerous biotic and abiotic stresses. Drought stress has emerged as the most serious threat to major cotton crop loss due to the global shortage of water . Understanding the biochemical and genetic bases of cotton's drought response and developing drought-tolerant cotton varieties, is critical.
Several genes from wild resources improve tolerance to abiotic stress by indirectly detoxifying cellular reactive oxygen species (ROS) (Guo et al., 2017). Aldehyde dehydrogenase (ALDH) is a family of enzymes that catalyze the permanent conversion of aldehydes to acids to reduce the damage caused by abiotic stressors. Abiotic stress including drought, salinity, and high temperatures also cause a buildup of ROS which stimulates endogenous aldehyde formation via a lipid peroxidation chain reaction . Aldehydes and their detoxifying roles in plants are poorly understood. As a result, it is of immense importance to distinguish the approaches underlying the detoxification of aldehydes throughout abiotic stresses and to mine endogenous resistance genes (Guo et al., 2017). Thus, drought tolerance is complex and mutagenic. Therefore, an integrated transcriptomic, metabolomic, and functional analysis approach is required to make progress in drought tolerance (Oladosu et al., 2019). It is imperative to reveal this complex mechanism or to explore the relevant genes and pathways related to drought stress tolerance in cotton. Hence, understanding the molecular mechanisms and metabolic regulatory networks will help to improve the drought tolerance in cotton. Therefore, in the current study, integration of transcriptome and metabolome was performed at various time intervals in different plant parts to investigate the genetic and molecular networks and pathways underlying drought tolerance in three cotton semi-wild lines, namely, Marie-galantie85 (MG85) tolerant, Lattifolium40 (LT40) sensitive, and Upland cotton (CRI12) as a standard check. Virus induced gene silencing (VIGS) and overexpression-based functional characterization of candidate gene was also performed to validate its part in response to drought stress. The current study offers a comprehensive understanding and additional knowledge to our existing knowledge of both genetic and drought tolerance in cotton is mediated by molecular processes.

Planting Material, Plant Growth, and Drought Stress Treatment
Two semi-wild accessions of G. hirsutum, namely, Marie-galantie85 and Lattifolium40 that were originated from Mexico and distributed in Guadeloupe and Guatemala and a released variety CRI12 for drought tolerant, sensitive, and regular checks respectively were used for transcriptome and metabolome analysis. Marie-galantie85 was specifically used for functional validation of the genes associated with the key metabolites identified in this study through gene knockout, also referred to as Virus-Induced gene silencing. A variety of CRI12 was also used for drought tolerant, sensitive, and regular checks .
The seeds of these lines were sterilized and aseptically kept immersed in water for one day at 30 • C before being placed for germination in an absorbent paper under 28 • C of day and night temperature. They were placed in 16/8 h of alternating light and dark periods . The experiment was set up in a greenhouse with three biological replications in a completely random design. The properly germinated seeds on paper rafts were transferred to a hydroponic setup of Hoagland solution (Hoagland and Arnon, 1950).
The sampling parts of the materials were the roots and leaves. Sampling was done at 0, 24, and 48 h for transcriptome profiling sequence and 0, 3, 6, 9, 12, 24, and 48 for VIGS trials in three replications. A total of 54 samples were done for RNAsequencing. The samples were then rapidly submerged in liquid nitrogen and kept at −80 • C until RNA extraction was completed .

Data Collection on Phenotypic, Physiological, and Biochemical Parameters
Phenotypic and physiological data including excised leaf water loss (ELWL), relative leaf water content (RLWC), chlorophyll content, and cell membrane stability (CMS) were recorded from three randomly sampled plants. We took leaf tissue samples for the determination of biochemical analysis. In the biochemical analysis, the antioxidant catalase (CAT), superoxide dismutase (SOD), hydrogen peroxide (H 2 O 2 ), and malondialdehyde (MDA) enzyme activities were measured. All the antioxidants and oxidant enzymes CAT, SOD, H 2 O 2 , and MDA were evaluated using the Solarbio life sciences kit (www.solarbio.com) using the instructions of the manufacturer. For physiological traits, the following formulas were used to calculate the values. ELWL = (FW -WW/DW) (Clarke and McCaig, 1982). RLWC = (FW -DW)/ (SW -DW) × 100 (Barrs and Weatherley, 1962). (Blum and Ebercon, 1981).

cDNA Library Preparation, and RNA Sequencing
To isolate RNA samples from both leaf and root tissues for RNA-Seq analysis, Trizol R reagent (Invitrogen, Waltham, MA, USA) was used. Using an agarose gel electrophoresis and a NanoDrop 2000 spectrophotometer, the quality and concentration of RNA were assessed (Thermo Fisher Scientific, USA) (Yu et al., 2017). The Agilent 2100 Bioanalyzer RNA Nanochip was used to complete a more precise RNA quantification (Agilent Technologies, Waldbronn, Germany) . The cDNA fragments were then cleaned using a Qia Quick PCR extraction kit that was corrected at the ends with added poly(A) and ligated to Illumina sequencing adapters. Gene Denovo Biotechnology Co. used the Illumina HiSeqTM 2500 to amplify and sequence PCR products. The size of the ligation products was determined using agarose gel electrophoresis from Gene Denovo Biotechnology Co (Guangzhou, China). PCR amplified and sequenced the PCR using the Illumina HiSeqTM 2500. Tophat2 (V.2.0.13) was used to extract clean reads from raw reads, which were then associated to the reference genome of G. hirsutum (http://cottonfgd.org) to generate mapped reads (Kim et al., 2013). Cufflinks v2.2.1 was used to analyze gene expression and to calculate the differences between the treatment and control fragments per kilobase of transcript per million fragments (FPKM) values. Readings from the transcriptome were measured and adjusted to fragments per kilobase of transcript per million fragments linked to the reference genome (Trapnell et al., 2014).

DEGs Identification
R software (http://www.r-project.org/) was used to find differentially expressed genes between samples. The false discovery rate (FDR) and log2 Fold Change thresholds for differentially expressed genes (DEGs) were set at <.05 and > 1, respectively. The DEGs were then submitted to a GO function and KEGG pathway enrichment analysis. First, through the AgriGO web tool (http://systemsbiology.cau.edu.cn/agriGOv2/), all DEGs were mapped to GO (Gene and Consortium, 2000). Pathway enrichment analysis revealed considerably enhanced metabolic transduction pathways when DEGs were compared to the entire genomic background.

WGCNA Analysis
To transfer genes into co-expressed modules, weighted gene co-expression network analysis (WGCNA) was done in the R package (Zhang and Horvath, 2005;Langfelder and Horvath, 2008). Before generating an adjacency matrix, the FPKM values were normalized. WGCNA software was used to import the phenotypic data, and a correlation-based relationship between accessions, time points, and gene modules was done using the default settings. The WCGNA software was used to transform the adjacency matrix into TOM (topological overlap matrix). Transcripts with comparable expression patterns were clustered into one module once the network was created, and eigengenes were determined for these modules. Using Cytoscape's default parameters, each module's genes were exported.

Co-expression Network and Phylogenetic Analysis of Key Genes
Leaf and root tissues of the three semi-wild cotton species Marie-galantie85, Lattifolium40, and Upland cotton at 0-, 24-, and 48-h time points were used as phenotype data in the WGCNA analysis. Thirteen unique gene modules were recognized from RNA Seq data by entering WGCNA using the FPKM values. Coexpression network analysis of the three highly and positively correlated to drought modules namely brown, yellow, and pink was established by Cytoscape v.3.7.2 (Shannon et al., 2003). Cotton functional genomics database website (www.cottonfgd.org/) was used to download the protein sequence of the candidate gene for G. hirsutum and G. arboreum, while G.raimondii, A. thaliana, Theobroma cacao, Brassica rapa, Glycine max, Medicago truncatula, Oryza sativa, Triticum aestivum, Zea mays, and Sorghum bicolor were downloaded from phytozome (https:// phytozome.jgi.doe.gov). The ClustalX tool was used to align the full-length ALDH protein sequences (Larkin et al., 2007) and MEGA 7 was used to create the phylogenetic tree using the neighbor-joining method with 1000 bootstrap replications (Tamura et al., 2011).

Metabolite Profiling, and Data Analysis
The tender leaves and roots of cotton at the three-leaf seedling stage were used for metabolite profiling by the liquid chromatography-mass spectrometry (LC-MS) approach. To detect reproducibility under the same treatment, the sample extracts were mixed to prepare a quality control sample (QC). During the LC-MS analysis, QC samples are arranged by mixing sample extracts to analyze the recurrence ability of samples under the same processing method. In the process of instrument analysis, a QC sample is usually inserted into every 10 test analysis samples to investigate the repeatability of the analysis process. Three biological replications were maintained in sampling and metabolome analysis (Zhou et al., 2012).
Cloning and Transformation of Gh_A06G1257 in A. thaliana RNA sequencing of G. hirsutum yielded a substantially upregulated gene, which was then transformed into Arabidopsis thaliana (Colombia-0). The PCR study with forward (CGGCCATTTAAATAGTGGATTCGG) and reverse (GCCACCATCTATCCTCAACGA) pair of primer sequences of Gh_A06G1257 synthesized from Invitrogen, Beijing, China, verified the pWM101-35S: Gh_A06G1257 build in Agrobacterium tumefaciens GV3101. The floral dip approach was used to transform wild-type A. thaliana plants.
The infiltration media was prepared as prescribed previously (Lu et al., 2019). The seedlings were moved to a growing environment with a temperature of 25 • C and a 16 light/8 dark hours cycle from the selection medium at three-leaf stages. The little plastic pots were filled with a 1:1 mix of vermiculite and humus (Sadau et al., 2021).
The first-generation seeds were collected after the seedlings from generation T0 had been grown to set seeds (T1). True lines were found by estimating the antibiotic-selectable marker segregation ratio of 3:1 after T1 seeds were sown in antibiotic media. Only the lines with a 100% success rate were chosen for T3 generation growth. After RT-qPCR, T3 homozygous lines were chosen from a T2 generation. Three of the six ALDH transgenic lines that were successfully transformed (OE-1, OE-9, and OE-10) were chosen from a T2 generation. The RT-qPCR prototype was Gh_A06G1257 forward primer sequence and Gh_A06G1257 reverse primer sequence with complete complementary DNA (cDNA). T3 homozygous generation was used to conduct the phenotypic studies.

Determination of Drought Tolerance in the Transgenic Lines
The plants were subjected to drought stress treatments after 21 days of growth. Samples from leaves were collected from transgenic lines and control. The pots were then watered with water containing 15% PEG-6000 treatment was used to alleviate drought tension. After 8 days, physiological and phenotypic traits were observed. All measurements were replicated three times biologically and three times technically .

Germination Rate and Root Elongation Determination
Drought simulated stress conditions were used to assess the germination percentage and root length of transgenic lines and wild form. OE-1, OE-9, OE-10, and wild-type seeds were sown in.5 MS plates complemented with 0, 100, 200, and 300 mM mannitol concentrations to simulate drought. After 10 days, the germination rate was assessed. Transgenic and wild-type seeds were seeded for 6 days in 0.5 MS media before being moved to 0.5 MS supplemented with different amounts of 0, 100, 200, and 300 mM mannitol for the root length assay. All measurements were replicated three times biologically and three times technically .

Drought Stress Treatment for VIGS in Cotton
The functional characterization of substantially upregulated gene expression in G. hirsutum was investigated using this approach. A gene Gh_A06G1257 with 296 bp fragments was transformed by Forward (CTGTGAGTAAGGTTACCGAATTCTCTAGAGAGA TGTGGAATCCCCTTGGAA) and Reverse (TCGAGACGCG TGAGCTCGGTACCGGATCCACCTTCGAACTCCCCGTGA) primer sequences into pTRV vector using enzymes XbaI and BamHI to develop a 35S promoter-driven pTRV2: ALDH7B4. The promoter cells of A. tumefaciens LBA4404 were deformed with the recombinant vector by freezing and thawing (Velásquez et al., 2009). Wild, pTRV: 00 were used positively with Phytoene desaturase (PDS) as a negative control.

RT-qPCR Analysis
To validate RNA-Seq data, DEGs were verified by RT-qPCR using (Schmittgen and Livak, 2008) procedures. Primers were created using NCBI (https://www.ncbi.nlm.nih.gov) and gene sequences from G. hirsutum. Quantitative PCR was performed using an SYBR R Green PCR Master Mix Kit (Applied Biosystems, Foster City, CA, USA) and an ABI-7900 system. The 2-C T was used to determine the relative gene expression (Schmittgen and Livak, 2008). Internally, GhActin was employed as a reference. The RT-qPCR experiment was carried out in three biological and technical replications .

Statistical Analysis
Chenomx NMR Suite 7.7 was used to do the multivariate statistical analysis, principal component analysis (PCA), partial least squares determinant analysis (PLS-DA), and orthogonal projections to latent structures (OPLS-DA). The variance in the data matrix was presented using PCA. PLS-DA can maximize the differentiation between groups, which helps find differential Frontiers in Plant Science | www.frontiersin.org metabolites. OPLS-DA data was used to examine subsequent model tests and differential metabolite screening. Fisher's exact test in R was used to calculate statistical significance. The Benjamini-Hochberg correction was used to adjust the false recovery rate (FDR) in the transcriptome analysis (Noble, 2009). Coexpression network analysis was established by Cytoscape v.3.7.2 (Shannon et al., 2003). The SPSS program was used to perform both t-tests and ANOVA. Significant differences were declared at the p < 0.05 probability level (Worley and Powers, 2015).

Phenotypical and Biochemical Response of Cotton Lines Under Drought Stress
Wilting was observed in the plants under drought stress with reduced evaporation. There were no obvious differences in the phenotype of three semi-wild cotton lines in response to drought stress ( Figure 1I). Clear and significant differences (p < 0.05) were recorded in the case of physiological and biochemical parameters ( Figure 1II). Chlorophyll and relative water contents in MG85 and CRI12 were higher as compared to LT40. A lower ion leakage with a higher excised leaf water loss was observed in LT40 as compared to CRI12 and MG85 ( Figure 1III).
Higher activities of CAT and SOD were observed in MG85 and CRI12 under drought stress conditions, however lower CAT and SOD were observed in the case of LT40. Moreover, we also measured the H 2 O 2 and MDA activities. We found out that they were higher in LT40 as compared to MG85 and CRI12. Thus, suggesting that MG85 and CRI12 were more resistant to drought as compared to LT40 ( Figure 1IV).

Transcriptome Profiling of Semi-wild Cotton Accessions
We selected semi-wild cotton lines, namely, Marie-galantie85, Lattifolium40, and Upland cotton for the transcriptome sequencing in response to drought treatment. After sequencing, 54 transcriptome libraries were obtained. For the accuracy of the subsequent analysis of the new data, reads with lower, linker sequences, sequences having >10% unknown N were deleted. After filtering the new data, we finally acquired high-quality clean reads of 2.66 Gigabytes in all the 54 libraries ( Table 1). The average length of each clean-read was 150 bp. We, therefore, acquired over 95% high-quality clean reads. The Q20 and Q30 base percentages were all above 96 and 90%, respectively, and the GC content was higher than 43% (Table 1). Bowtie2 was used for the screening of high-quality reads and then comparing them with G. hirsutum reference genome by TopHat2 software. The samples were tested for Pairwise repeatability and the correlation coefficient was all above 84%, indicating that the samples were reproducible.
The reason for higher DGEs number in roots at different time points of drought application could be that they are the first to be in contact with PEG-6000, thus triggering the drought response. The amount of differently expressed genes in root tissues grew over time, owing to the roots' direct interaction with PEG-6000, which resulted in more sustained drought response in the roots. Intersection and union analysis of differentially expressed genes at different time points in the same material was performed by applying Venny online software ( Figure 2I).

Analysis of Differential Gene Expression Trends
The expression patterns of genes change with conditions such as specific environments and time. The trend analysis of genes that were differentially expressed was performed by STEM software for the expression of leaf and root regulatory genes at different time points. The differentially expressed genes of the experimental materials were divided into 8 gene expression trends. According to the standard of (p < 0.05), the gene expression trend showed the same significant trend (trend 1, trend 6, and trend 7) with time in the three materials in both leaves and roots ( Table 2).

Analysis of Metabolites in Cotton Semi-wild Lines Under Drought Stress
According to the principal component analysis of the six quality control samples gathered together, the dispersion is very small, indicating that the stability of the instrument is better in the analysis of metabolites, and the results of metabolite analysis are reliable. Thirty-six samples were divided into four components by the main components PCA1 (39.9%) and PCA2 (12.8%). The main component PCA1 (39.9%) separated the leaves and root samples, indicating that under drought stress, leaves and roots respond differently ( Figure 2III). The main component PCA2 (12.8%) separates the samples at different treatment time points of the material, and the difference in the roots is small. It is indicated that the changes in leaves are more obvious at different treatment time points of drought stress. This may be due to the sensitivity of leaf tissues to stress as compared to root tissues. LC-MS analysis was helpful to identify 445 metabolites totally (Supplementary Table 4). According to the analysis of the differentially expressed metabolites, the highest upregulated metabolites were recorded in the leaves at MGL0 vs. MGL48 (67 metabolites), MRL0 vs. MRL48 (61 metabolites), and MSL0 vs. MSL48 (56 metabolites), respectively, whereas the highest downregulation of metabolites was found in roots from MRR0 vs. MRR48 (54 metabolites), MSR0 vs. MRR48 (47 metabolites) and MSR0 vs. MSR48 (36 metabolites) successively ( Figure 2IV).

GO Function Enrichment Analysis
The identified differentially expressed genes in both the roots and leaves tissues were mainly enriched in the molecular functions, cell components, and the biological processes of functional categories (Supplementary Figure 1). In MG85, the leaf tissues were enriched in biological and molecular functions, whereas the root tissues were enriched only in biological processes. In CRI12, both tissues are enriched in two functions, namely, biological and molecular, and biological and cellular, respectively. Differently from the above lines LT40 enriched in all the three GO functions biological, cellular, and molecular in leaf tissues only, there is no significant enrichment in the root tissues (Supplementary Figure 1)    them, we found that most of the DEGs were enriched in the metabolic processes in the tissues of different materials. Thus revealing that primary metabolites and secondary metabolites play an essential part in response to drought stress.

WGCNA Analysis for the Identification of Hub Genes
Here we used the FPKM values of commonly expressed DEGs to perform a weighted gene coexpression network analysis for the identification of hub genes associated with drought stress  tolerance. A cluster dendograms was generated to see the number of modules along with DEGs, a network heatmap was also generated from the genes located in each of the identified module (Figures 3A,B). Using WGCNA, thirteen different modules were detected. Out of these thirteen modules, three modules (Yellow, brown, and pink) were found to be highly and positively associated with our phenotype. The yellow module contains 1,568 genes, the brown module contains 1,234 genes whereas 765 genes were present in the pink module. The yellow module has significant associations with MG85 0 h, CRI12 0 h, and  Network Visualization, RT-qPCR Validations, and Phylogenetic Analysis for Key Genes Linked to Drought Stress WGCNA helped us to import the top 30 genes for each of the significant modules to build the coexpression networks. The gene lists were further used for the network visualizations. We used Cytoscape software for this purpose. To identify the hub genes, a built-in Cytoscape extension by the name of "cytohubba" was used (Shannon et al., 2003). Five hub genes from the yellow module include Gh_A07G0563, Gh_D05G0221, Gh_A05G3716, Gh_D12G1438, and Gh_D05G0697, the brown module contains three hub genes (Gh_A06G1257, Gh_A06G1256, and Gh_D06G1578), and the pink module has five hub genes, namely, Gh_A02G1616, Gh_D12G2599, Gh_D07G2232, Gh_A02G0527, and Gh_D07G0629. The most interesting thing is that all the thirteen hub genes belong to the same ALDH family. Furthermore, the expression profiles of all hub genes in different plant parts under drought stress were validated via RT-qPCR. Results from RT-qPCR suggest that the Gh_A06G1257 gene, having the highest expression under drought stress, might be the true candidate responsible for drought stress tolerance. We further validate this gene on a functional basis (Figure 4). Two distinct clusters were formed as a result of phylogenetic analysis. The candidate gene Gh_A06G1257 together with the two Gossypium species together with A. thaliana, T. cacao, and B. rapa were grouped in one cluster. G. max, M. truncatula, O. sativa, T. aestivum, Z. mays, and S. bicolor were also grouped in the second cluster.

Overexpression of Gh_A06G1257 in Arabidopsis Increases Tolerance to Drought Stress
We performed PCR and RT-qPCR analysis to see the expression level of six positive lines. OE-1, OE-9, and OE-10 overexpressed lines conformed with the highest expression level ( Figure 8I). Seedlings of overexpressed lines were exposed to drought for studying the functions of Gh_A06G1257 during drought stress. The wild-type and transgenic seedlings grew and germinated uniformly in the MS medium. On the other hand, the MS medium containing 300 mM mannitol seedlings of OE lines (OE-1, OE-9, and OE-10) showed better germination, grew faster than wild-type seedlings, and demonstrated a drought-resistant phenotype. Increased mannitol concentration resulted in a steady decrease in the rate of germination. In normal conditions, germination was calculated to be >90% in both wild and transgenic lines Figures 5A,B. However, under drought stress treatments, wild-type seedlings showed a decrease in germination rate to <25%, while a significantly higher germination rate of 60% was observed in transgenic lines in 300 mM mannitol. Under mannitol treatment, longer roots (13 mm) were observed in the lines with over-expressed ALDH gene while in wildtype a shorter root length of 6 mm, suggesting overexpressed lines are more resistant to drought stress (Figures 5C,D).

Physiological Parameters Evaluation in Arabidopsis Under Drought Stress
In response to drought stress, we compared the physiological responses of the three transgenic lines with the wildtype. Under normal conditions, there were no significant differences between the transgenic lines and the wild type in any of the measurements, but when the plants were stressed, the transgenic lines showed greater stress tolerance than the wild type. RLWC was much higher in transgenic lines than in control lines ( Figure 6I). Wildtype plants displayed more stress-induced ion leakage when they were stressed by drought. Under regulated conditions, the ELWL did not differ statistically between wild-type and transgenic plants. However, under stress conditions, the wild-type leaves lost more water than the transgenic plant leaves Figure 6II. The fact that the transformed gene improved drought stress tolerance in the transgenic Arabidopsis lines was demonstrated by the increased degree of tolerance between the transgenic lines.

Measurements of Oxidants and Antioxidants in Transgenic and Wild-Type Arabidopsis Plants
Activities of oxidants and antioxidants were measured in the transgenic and wild-type plants after applying drought treatment. Under control circumstances, there were no significant variations in SOD and CAT activity between transgenic lines and the wild type. However, when seedlings were exposed to drought stress after 8 days, the transgenic lines OE-1, OE-9, and OE-11 showed significant changes from the wild type, with increased SOD and CAT activity, lower MDA, and lower H 2 O 2 levels (Figures 7A-D). Our results suggest that Gh_A06G1257 is playing a critical role in drought stress tolerance.

Expression Analysis of Stress-Responsive Genes
We choose four abiotic responsive genes ABF4, SOS1, RAB18, and RD22 to measure the expressions by qRT-PCR in overexpressed and wild-type seedlings for a better understanding of the role Gh_A06G1257 is playing when plants are exposed to FIGURE 6 | (I) Physiological parameters evaluation in ALDH overexpressed lines under drought stress. (II) A Transgenic lines and wild type under control and treatment. A Relative leaf water content, B excised leaf water loss, C chlorophyll content, and D Ion leakage after drought stress treatment. The error bars reflect the SE. Significant differences were observed by different letters above the graphs (ANOVA, p < 0.05). WT, wild type; overexpressed lines (OE-1, OE-9, and OE-10) after 8-day stress treatment. Three replications were maintained in the experiment. ANOVA, analysis of variance; SE, standard error. abiotic stress. Results from the RT-qPCR results revealed that upon exposure of transgenic lines to drought, the expressions of ABF4, SOS1, RAB18, and RD22 were recorded higher in the transgenic lines ( Figure 8II). Thus, suggesting that Gh_A06G1257 has a key role in coping with drought stress.

Effects on Physio-Morphological Traits Under Drought Stress Conditions
Significant variations in the physio-morphological traits of cotton were observed under drought conditions. The VIGS plants, positive controlled plants, and the wild types significantly (p < 0.05) differs in their response to drought stress in both morphological and physiological traits (Figure 9). Plant height and root length showed no significant differences, while VIGS plants had a considerably lower shoot and root fresh weights than wild-type pes and positive controls. Moreover, VIGS plants have a higher excised leaf water loss, and ion leakage but significantly lower relative leaf water contents than positive control and wildtype plants under drought stress.

Measurement of Oxidants and Antioxidants in VIGS and Wild-Type Plants
We observed significant differences (p < 0.001) in the oxidants and antioxidants activities among VIGS and wild-type plants.
Overall, lower antioxidant activities and higher oxidant activities were measured in the case of VIGS plants (Figure 10II). CAT and SOD contents before treatment were almost the same, while after being treated with PEG-6000, the wild plants showed an increase over the VIGS plants. An increase in the contents of the oxidant enzymes activities showed that the VIGS plants experience higher oxidative stress than wild-type and positive controls. Activities of MDA and H 2 O 2 were similar in normal conditions but recorded higher under drought stress mainly in the VIGS plants. Thus, our results suggest that the knockdown of the Gh_A06G1257 gene has a significant influence on coping with drought stress.

Relative Expression of the Knocked Down Gene and Wild-Type Plants
Samples from the leaves and roots of wild-type, positive control, and VIGS plants were collected. To clarify the role of the ALDH gene (Gh_A06G1257) under drought, we collected samples from different tissues to check the expression. The expression of Gh_A06G1257 in VIGS plants was lower as compared to the wild-type and the positive control plants. Thus, indicating that Gh_A06G1257 is playing a critical role in coping with drought ( Figure 10IIE,F).

Multi Pathway Enrichment Analysis
In the KEGG pathway annotation, the metabolites were enriched in metabolism, genetic information processing, and environmental processing information groups. A total of 25  Table 1B). Within this, the ALDH genes were enriched in amino acids metabolism, carbohydrates and lipids metabolism, and the replication and repair activities. The metabolism enrichment analysis found that the major enriched pathways during drought stress were arginine and proline metabolisms, tryptophan metabolism, valine and leucine degradation, and lysine degradation (Figure 11). In general, higher numbers of metabolites were upregulated in the leaves than roots under drought. In arginine and proline metabolism pathway out of nine metabolites agmatine and spermidine, in tryptophan metabolism out of ten metabolites, tryptamine, N-Acetyl-5hydroxytryptamine, and kynurenic acid, in valine and leucine degradation mainly L-valine showed upregulation in leaf tissue. On the contrary, L-ascorbic acid in glycine, serine, and threonine metabolism, glutaric acid and N6-Acetyl-L-Lysine in lysine degradation, L-valine in valine and leucine degradation, and Lproline in arginine and proline metabolism showed upregulation in root tissues during drought stress. The rest of the metabolites showed a similar trend in both tissues.

DISCUSSION
Cotton, an essential industrial crop worldwide, and its production are affected by numerous stresses including both biotic and abiotic. Drought is among the key threats contributing to the significant yield losses in cotton (Hou et al., 2018). Several processes, i.e., molecular, cellular, and physiological contributes, in causing drought to result in the variations in the expression levels of genes involved in osmolyte production and their roles in improving antioxidant systems (Joshi et al., 2016). Wild relatives are believed to be a source of prominent genetic assets linked to abiotic stress tolerance (Iseki et al., 2018). Narrow genetic diversity among crops, including wild wheat, leads the cultivated species to lose tolerance to drought (Budak et al., 2013). Similarly, wild species of rice offer an extensive range of adaptive traits and can serve as potential contributors of biotic and abiotic stress tolerance (Neelam et al., 2018). In the current study, MG85 was used, which has a higher tolerance to drought and salt stresses than CRI12  while LT40 is highly vulnerable to both the aforementioned stresses (Yang et al., 2019).
In total, 70,478 coding genes were present in the allotetraploid cotton G. hirsutum genome (TM-1) . The tissues of MG85, LT40, and CRI12 were subjected to transcriptome sequencing and obtained 54 transcriptome libraries. After filtering the new data, a total of 64,617 genes were found. In a similar study, a total of 64,737 genes from 48 cDNA libraries were identified from the same test accessions under salt stress conditions . In distinct comparison groups, upregulated genes were considerably higher than downregulated FIGURE 8 | Expression analysis of Gh_A06G1257 by RT-qPCR. (I) A The 771 bp CDS sequence transformation to T2 generation was checked using polymerase chain reaction (PCR), 1-11 overexpressed lines, WT, wild type. B RT-qPCR was used to examine the transcript levels of the Gh A06G1257 (ALDH) of T2 overexpressed lines in three biological replications. (II) Expression analysis of abiotic stress-responsive genes. A APF4, B SOS1, C RAb18, D RD22 in transgenic lines (OE-1, OE-9, and OE-10) and wild type, Atactin2 gene used as an internal reference, and each experiment was done three times. CDS, coding sequence; RTqPCR, Real time quantitative polymerase chain reaction. genes in both leaves and roots. More DEGs in roots than in leaves showed that the roots are the most affected tissues under salt stress and, thus, have a multifaceted gene regulation to decrease the effect of salt stress . WGCNA proved to be a highly useful analysis to identify hub genes linked to drought stress tolerance in cotton.
In metabolite analysis, the principal component analysis separates the leaves and root samples, which meant that after drought treatment, the metabolites in leaves and roots respond to drought stress differently at different treatment time points. Differentially expressed metabolites analysis showed that the highest upregulated metabolites were recorded in leaves, whereas the highest downregulation of metabolites occurred in the roots. Kang et al. (2019) reported that leaves of tolerant wheat genotype changed 45 and 20 metabolites more than the roots, whereas the leaves and roots of sensitive wheat genotype changed 38 and 28 metabolites, respectively, indicating that plants dedicated more resources to the leaves of tolerant genotype. Arabidopsis plants upon grown on MS medium having supplemented mannitol, seedlings with over-expressed ALDH possess drought tolerance where wild type shows sensitivity.
Our results indicate that Gh_A06G1257 might play a key role in drought improvement during seed germination and root elongation. Moreover, overexpressed lines showed stable relative water contents, chlorophyll contents and, low water loss and ion damage as compared to the wild-type plants. We also employed pTRV-VIGS in cotton seedlings to study the part Gh_A06G1257 is playing in drought stress. Plant height and root length showed no significant differences, while VIGS plants had a considerably lower shoot and root fresh weights than wild-type pes and positive controls. Moreover, VIGS plants have a higher excised leaf water loss, and ion leakage but significantly lower relative leaf water contents than positive control and wild-type plants under drought stress. Duan et al. (2015) studied the co-expression of the introduced bar and CsALDH genes in transgenic plants of Alfalfa and found that the transgenic plants retain higher relative water content levels, higher shoot biomasses, little changes in the photosystem, and reduced membrane damage.
Overexpressed plants have higher antioxidant enzymes activities that regulate the contents of ROS within the cell (Hasanuzzaman et al., 2019). Under mannitol treatment, Gh_A06G1257 plants have higher antioxidants activities including CAT and Peroxidase (POD) whereas, lower activities of oxidant enzymes (MDA and H 2 O 2 ) than wild-type (WT) plants. Under stress treatment, GhMPK3 overexpressed plants were shown to have higher levels of antioxidants and lower contents of oxidant enzymes than WT plants (Sadau et al., 2021).
The activities of oxidant and antioxidant enzymes before the treatment were the same while after being treated with PEG-6000 solution, the wild and positive control plants showed variations over the VIGS plants. Under drought stress circumstances, the VIGS plants experienced more oxidative stress than the WT and positively controlled plants, as evidenced by the elevated levels of oxidant enzymes (Yang et al., 2019). These results were following Kirungu et al. (2020), who stated that knockdown of Dehydrin gene member meaningfully effects in reduction of plant height, root length, shoot fresh weight, and root fresh weight compared to the positive control and the wild-type plants. Similarly, contents of MDA and H 2 O 2 were similar in normal conditions but reached high under drought stress, mainly in the VIGS plants. Plants pay well-organized detoxifying systems in response to excess-initiation of ROS under abiotic stress, containing improved activities of antioxidant enzymes such as superoxide dismutase, catalase, peroxidase, ascorbate peroxidase, and a controlled amount of non-enzymatic antioxidants such as glutathione and ascorbic acid (Shi et al., 2015;Ding et al., 2018). Therefore, crops must preserve the amount of ROS at a proper level under abiotic stresses.
According to our results, Gh_A06G1257 was found to be a candidate gene responsible for tolerance to drought stress. Results from VIGS in cotton and overexpression in Arabidopsis also confirm the role this gene is playing to cope with drought.
An over-expression research in A. thaliana of the VvALDH2B4 gene boosted protection against high salt and pathogenic bacteria and bring about lower MDA levels (Wen et al., 2012). When ScALDH21 was introduced into tobacco plants, it resulted in greater germination ratios, root lengths, proline concentrations, antioxidant enzyme actions, and decreased MDA levels (Yang et al., 2015). CaALDH1 gene silencing in pepper disturbed phenolic compound buildup, H 2 O 2 production, protection response gene expression, and cell death, while in transgenic A. thaliana, overexpressing CaALDH1 revealed improved defense response to infection (Kim and Hwang, 2015).
BrALDH7B2 has the potential to cope in both abiotic stress and hormonal treatments in Brassica rapa. Over-expressing BrALDH7B2 in E. coli and yeast cells resulted in a substantial tolerance to abiotic stress. Hence, BrALDH genes need to be explored more as they play a key role to cope with abiotic stress in B. rapa (Gautam et al., 2019). Drought stress in soybean leaves strongly supported the functions of GmALDH genes, including GmALDH3H2, GmALDH12A2, and GmALDH18B3 . The expression of "ALDH3I1" and "ALDH7B4" genes was adequate to boost tolerance against drought, salinity, and oxidative stress in crops (Kotchoni et al., 2010). Plant ALDH7B expression has been found to be responsive to a wide range of stressors in studies, and expression is assumed to be a part of general stress-response pathways. UV exposure, dehydration, high salinity, low temperature, heat shock, and applied behavior analysis (ABA) therapy are just some of the stressors that cause ALDH7B overexpression (Brocker et al., 2013).
In the KEGG pathway annotation, ALDH genes play key roles in different pathways that have been identified as the home of metabolites during drought stress. L-ascorbic acid, agmatine, spermidine, myoinositol, ectoine, tryptamine, N-Acetyl-5-hydroxytryptamine, and kynurenic acid in leaf tissue, glutaric acid, L-ascorbic acid, N6-Acetyl-L-Lysine, L-tryptophan, and L-proline in root tissues showed upregulation. The ALDH gene family, which is involved in many pathway regulations and enzymes, appears to play a primary role in abiotic stress-response pathways. These metabolites could be an important target for increasing plant resistance to stressful conditions like elevated soil salinity or dehydration, which is especially important when developing stress-tolerant crops (Brocker et al., 2013). Metabolites like proline, L-arginine, L-histidine, L-isoleucine, and tryptophan exhibited increased after drought stress, which was likely the sign of acclimation in response to drought stress in chickpeas . Similarly, in maize research, an increased amount of proline was observed under drought stress and tryptophan addition increased Indole-3-acetic acid (IAA) and Gibberellic acid (GA) production, and further increased Abscisic acid (ABA) production in water shortage areas, demonstrating that tryptophan addition may help enhance drought tolerance (Yasmin et al., 2017).
Plants produce large amounts of ROS under drought stress. Most of these substances are antioxidants that can increase the drought resistance of plants under drought stress. Excessive amounts of these ROS can cause damage to plant cells and can lead to plant death (Suzuki et al., 2012;Baxter et al., 2014;Nakabayashi et al., 2014;An et al., 2015). Plants, sensing the osmotic stress through their receptor, will induce the regulation and expression of genes in the body and generate a large amount of metabolism through various amino acids, lipids, carbohydrates, and secondary metabolism. Antioxidant substances such as glutathione, lysine, proline, ascorbic acid, terpenoids, and flavonoids respond to drought stress (Nakabayashi et al., 2014;Wang et al., 2016). Similarly, a prominent group of metabolites affected by drought stress is amino acids, organic acids, sugars, sugar alcohol, and fatty alcohol. Tolerant genotypes distributed more resources to leaves than roots, whereas sensitive genotypes allocated resources to roots and leaves in a similar way (Kang et al., 2019). Tryptophan is an osmolyte that regulates ion transport, modulates stomatal opening, and maintains water balance (Rahman et al., 2017). Proline has been positively correlated with stress tolerance, serves as a compatible solute, and helps plants to avoid oxidative stress by keeping reduced levels of ROS (Hayat et al., 2012). Raised tryptophan and proline was crucial in drought stress condition (Kang et al., 2019). Similarly, an increase in lysine concentration encourages the transcriptional upregulation of genes enhancing lysine-to-a-aminoadipate metabolic instability and using glutamate to yield proline to respond to abiotic and biotic stress (Arruda and Barreto, 2020). In the RNA-Seq and RT-qPCR analysis, most genes were upregulated thus, demonstrating that these genes could be playing a significant role in enhancing drought stress tolerance in cotton. Similarly, Gh_A06G1257 was found to be upregulated in all species and time points according to RNA-Sequencing as well as RT-qPCR analysis, which revealed that this is a true potential candidate for drought stress tolerance in cotton. Cai et al. (2019) stated that RT-qPCR results were significantly correlated to the RNA-Seq data both at 6 and 12 h points under cold stress (r 2 = 0.88 and 0.95) in G. thurberi.

CONCLUSION
In the current research work, an integrated transcriptome and metabolome approach to investigate gene/s to cope with drought in cotton was utilized. Thirteen hub genes were initially identified by WGCNA. Further expression analysis in both leaves and root tissues revealed that Gh_A06G1257 (GhALDH7B4) belonging to the ALDH family might be the candidate gene. We opted for VIGS and overexpression approaches for functionally validating the candidate gene and current results suggest that Gh_A06G1257 is involved in drought stress tolerance in cotton. L-valine, Glutaric acid, L-proline, L-Glutamic acid, and L-Tryptophan were found to be the upregulated metabolites in the pathway annotation. This gene is newly identified. No previous reports regarding its role or functions in cotton are available. The results of this study may add to a profound understanding of the highly multifaceted genes and regulatory mechanisms that function in plants during drought stress.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: NCBI (accession: PRJNA663204).