Biochemical and Taxonomic Characterization of Novel Haloarchaeal Strains and Purification of the Recombinant Halotolerant α-Amylase Discovered in the Isolate

Haloarchaea are salt-loving archaea and a potential source of industrially relevant halotolerant enzymes. In the present study, three reddish-pink, extremely halophilic archaeal strains, namely wsp1 (wsp: water sample Pondicherry), wsp3, and wsp4, were isolated from an Indian solar saltern. The phylogenetic analysis based on 16S rRNA gene sequences suggests that both wsp3 and wsp4 strains belong to Halogeometricum borinquense while wsp1 is closely related to Haloferax volcanii. Comparative genomics revealed an open pangenome for both genera investigated here. Whole-genome sequence analysis revealed that these isolates have multiple copies of industrially/biotechnologically important unique genes and enzymes. Among these unique enzymes, we selected four putative α-amylases identified in these three isolates for recombinant expression and purification. We successfully purified functional halotolerant recombinant Amy2 from wsp1 using a pelB signal sequence-based secretion strategy with Escherichia coli as the expression host. This method may prove useful for producing functional haloarchaeal secretory recombinant proteins suitable for commercial or research applications. Biochemical analysis of Amy2 suggests that the enzyme is halotolerant, with maximum enzymatic activity observed at 1 M NaCl. We also report the isolation and characterization of carotenoids purified from these isolates. This study highlights the presence of several industrially important enzymes in the haloarchaeal strains, which may have improved features such as stability and salt tolerance suitable for industrial applications.


INTRODUCTION
The extremely halophilic archaea have adapted to bloom even in harsh environmental conditions such as high salinity, desiccation, and intense solar radiation (Mormile et al., 2003;Gunde-Cimerman et al., 2005;Schubert et al., 2010;Stan-Lotter and Fendrihan, 2015;Winters et al., 2015). These microorganisms require at least 1.5-2.5 M NaCl for viability and typically grow optimally in 3.5 M NaCl (Ollivier et al., 1994). Haloarchaea commonly reside in hypersaline environments such as salt lakes and salterns, and in heavily salted hides, meats, fish, and sauces (Radax et al., 2001;Gruber et al., 2004;Stan-Lotter and Fendrihan, 2015). Adaptation to such extreme and diverse environments makes their genomes highly rich in multiple essential genes that are absent in other microorganisms (Papke et al., 2015). Analysis of this haloarchaeal gene pool has the potential to uncover many industrially important proteins and enzymes. It is therefore imperative to perform pangenome or comparative genomics analyses to understand the genetic evolution and distribution of the unique and conserved genes that help these microbes survive in harsh conditions (Kim et al., 2018). As aerophilic mesophiles, many haloarchaea are easy to grow under laboratory conditions, making them one of the most extensively studied archaeal groups and leading to the development of a variety of biochemical, genetic, and genomic tools for a better understanding of diverse haloarchaeal species (Soppa, 2006).
Besides having the ability to thrive in high salt conditions, haloarchaea possess diverse physiologies including alkaliphiles, facultative thermophiles, thermoalkaliphiles, and psychrotolerant species (Bowers et al., 2009;Bowers and Wiegel, 2011) and diverse metabolic strategies. These interesting features make them ideal organisms for understanding archaeal biology (Falb et al., 2008). In addition, the genetic basis of these microbes to flourish in hypersaline environments may provide crucial insights to develop salt-tolerant plants for growth in currently non-arable land (Flowers and Colmer, 2015).
Halophiles produce a range of unique and stable biomolecules with commercial applications, including (1) hydrolytic enzymes such as gelatinases, proteases, lipases, DNases, xylanases, and amylases. Exo-enzymes from these organisms with polymer-degrading ability are of great interest in many industrial processes where high salt concentrations would otherwise cause enzymatic inhibition (Oren, 2010). These unique features make haloarchaeal enzymes very useful in commercial sectors such as baking (Oren, 2010), starch liquefaction (Chi et al., 2010), detergents, maltose production, etc. (Singh and Singh, 2017). (2) Membrane proteins such as bacteriorhodopsin, commercially recognized for its use in artificial retinas, holograms, photoelectrical devices, optical computing, etc. (3) Biodegradable polymers such as polyhydroxyalkanoates (PHAs), which are produced by many haloarchaea and can be used as an alternative to non-degradable plastics (Schiraldi et al., 2002;Oren, 2010). (4) Carotenoids: under certain growth conditions, microbial cells are known to accumulate different pigments that have several commercial applications (Britton, 1995;Vershinin, 1999). Carotenoids are one such class of pigments, produced by microbes and plants, which play a major role in protecting cells against photo-oxidative damage and hence have vital applications in the environment (Zhang et al., 2014), food and nutrition (Vilchez et al., 2011), disease control (Fiedor and Burda, 2014), and as potent antimicrobial agents (Narsing Rao et al., 2017). Haloarchaea are one of the richest sources of carotenoids compared to other microorganisms (Yatsunami et al., 2014;Montero-Lobato et al., 2018;Giani et al., 2019). The main component of the haloarchaeal carotenoid pool is bacterioruberin, which reportedly has stronger antioxidant properties than plant β-carotenes (Yatsunami et al., 2014).
Other than bacterioruberin, they also contain isopentenyldehydrorhodopin, lycopene, and phytoene in trace amounts (Yatsunami et al., 2014).
In the present study, we isolated and characterized three haloarchaeal strains namely wsp1, wsp3, and wsp4 from the Pondicherry solar lakes, one of the high salt containing areas of the Indian solar salterns. We performed a polyphasic taxonomic classification of these isolated strains using wholegenome sequencing and biochemical assays. Using comparative genomic analysis, we identified the core, accessory, and unique gene pool of haloarchaeal proteins. We identified several industrially important enzymes encoded in the genomes of these strains. We report the expression, and purification of a halotolerant recombinant starch degrading enzyme i.e., α-amylase Amy2 isolated from Haloferax strain wsp1 having distinct sequence and structural features. All the strains reported here were colored and genome analysis suggested the presence of carotene biosynthesis genes in their genomes. We further report the isolation and characterization of carotenoids produced by these strains.

Isolation and Taxonomic Characterization of Haloarchaeal Isolates
All three strains were isolated from samples collected from the Marakkanam solar salterns, India (12°11′13.0272″ N and 79°55′40.4220″ E) using the dilution-plate technique on Halovibrio agar medium as described previously (Verma et al., 2019). Briefly, reddish-pink, opaque, convex colonies, 1-2 mm in diameter, appeared on agar plates in 7 days. The growth of the wsp isolates was screened from 12 to 42 °C, which suggested that they grow optimally at 37 °C. Growth was observed on media containing 3-5 M NaCl (Figure 1A), whereas no growth was observed at 0.3-2 M NaCl. Transmission electron microscopic images suggested that wsp1, wsp3, and wsp4 have pleomorphic morphology and that, compared to wsp1, both wsp3 and wsp4 isolates are highly vacuolated (Figure 1B and Supplementary Figure S1).

Comparative Genome Analyses of wsp1, wsp3, and wsp4
Comparison of both 16S rRNA and whole-genome sequences suggested that all three isolates are novel strains that belong to the Haloferax and Halogeometricum genera. These results were further confirmed by calculating ANI (average nucleotide identity) and digital DNA-DNA hybridization scores, where the observed values were higher than the accepted cut-off values (ANI >95% and DNA-DNA hybridization >91%) for novel species (Supplementary Tables S1, S2). The phylogenetic analysis based on 16S rRNA sequencing data suggested that wsp1 is closely related to Haloferax volcanii, while wsp3 and wsp4 are closely related to Halogeometricum borinquense (Figures 1C,D).
FIGURE 1 | Morphological and taxonomic characterization of the haloarchaeal isolates. (A) Wild type culture of haloarchaeal isolates in light reddish-pink color suggesting high carotenoid production. (B) TEM images suggest that wsp3 and wsp4 are highly vacuolated (shown by the blue arrow) compared with wsp1. (C,D) The 16S rRNA-based phylogenetic analysis suggested that the wsp1 strain is closely related to H. volcanii DS2 while both wsp3 and wsp4 are closely related to H. borinquense DSM 11551. (E,F) Genomic maps of the isolates and their reference species constructed using the BRIG tool. The inner black rings show the coordinates in scale and the total genome size of the reference sequence, with the black histogram representing GC content and the purple/green histograms representing GC deviations. In panel (E), the outermost ring in magenta represents the genome map of the wsp1 strain; in panel (F), blue represents the wsp3 genome map while the wsp4 genome map is shown in magenta.

Pangenome Analysis
During evolution, microbes acquired several genes that facilitate their growth and survival. Important examples include genes for cell signaling, metabolic regulators, antibacterial proteins, ion transporters, etc. To understand the environmental effect on gene pool variation, we performed a pangenome analysis of the wsp isolates with their closely related species using the Bacterial Pan Genome Analysis (BPGA) pipeline (Chaudhari et al., 2016). The pangenome concept was originally developed by Tettelin et al. (2005) and describes the total pool of genetic material across all members of a species. The pangenome consists of three groups known as the core, accessory, and unique genomes. The core genome consists of genes that are present in all individuals, the accessory or dispensable genome contains shell genes present in only some individuals, and the unique genome contains genes that are present in only a single member (Tettelin et al., 2005;Wolf et al., 2012;Vernikos et al., 2015). The outcomes of the pangenome analysis are discussed in the following sections.
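The core/accessory/unique partition described above can be illustrated with a minimal Python sketch. It assumes that gene families have already been clustered and that each genome is represented simply as a set of family labels; the function name, genome names, and family labels are hypothetical and are not part of the BPGA pipeline or of the data in this study.

```python
from typing import Dict, Set


def partition_pangenome(genomes: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Split gene families into core, accessory, and unique sets.

    Assumes at least two genomes; families are given as plain labels.
    """
    n_genomes = len(genomes)
    # Count in how many genomes each gene family occurs.
    counts: Dict[str, int] = {}
    for families in genomes.values():
        for family in families:
            counts[family] = counts.get(family, 0) + 1
    core = {f for f, c in counts.items() if c == n_genomes}   # present in all
    unique = {f for f, c in counts.items() if c == 1}         # present in one
    accessory = set(counts) - core - unique                   # everything else
    return {"core": core, "accessory": accessory, "unique": unique}


# Hypothetical toy input (made-up labels, not data from the isolates).
toy_genomes = {
    "genome_A": {"fam1", "fam2", "fam3", "fam5"},
    "genome_B": {"fam1", "fam2", "fam4"},
    "genome_C": {"fam1", "fam3"},
}

result = partition_pangenome(toy_genomes)
for group in ("core", "accessory", "unique"):
    print(group, sorted(result[group]))
```

In this toy case, the pangenome is the union of all five families, of which one is core, two are accessory, and two are unique.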

Clusters of Orthologous Groups (COG) Distribution Plots
The whole-genome sequences of Haloferax and Halogeometricum members were retrieved from NCBI (National Center for Biotechnology Information) database. The functional annotation was carried out using Rapid Annotation using Subsystem Technology webserver (Aziz et al., 2008). The distribution of the archaeal clusters of orthologous groups (arCOG) of Haloferax and Halogeometricum genera along with our isolates with their biological functions are shown in Figures 2A,B. The potential functions of the unique genes from both genera appear to be widespread and linked with many different cellular functions such as cell motility, post-translational modifications, chaperones, signal transduction mechanisms, and many genes with unknown function. On the other hand, the accessory genes identified are mainly associated with carbohydrate transport, inorganic ion transport, and several genes with unknown function. The core genome consists of the genes involved in translation regulation, ribosomal structure and biogenesis, replication, recombination and repair proteins, energy production and conversion, coenzyme transport and metabolism, lipid transport and metabolism and nucleotide transport and metabolism.

Pan and Core Genome Plots
The comparative genome analysis revealed an open pangenome for both genera investigated here (Figures 2C,D), as the number of accessory and unique genes increases with the addition of new genomes. For the genus Haloferax, the pangenome and core genome contain 7950 and 1885 genes, respectively (Figure 2C). For the genus Halogeometricum, the pangenome and core genome contain 7197 and 1874 genes, respectively (Figure 2D). Similar findings were reported in other pangenome-based studies (Capes et al., 2012;Becker et al., 2014;Borchert et al., 2017). The pangenome analysis revealed that wsp1, wsp3, and wsp4 contain 246, 530, and 106 unique genes, respectively. These unique genes were further annotated using the online server Web Services for Metagenomic Analysis (WEBMGA) (Wu et al., 2011). We found multiple unique clusters, including genes potentially encoding cation efflux proteins, integrases, and proteins potentially involved in multidrug resistance. The annotated genes also included commercially useful salt-tolerant enzymes such as amylases, alcohol dehydrogenases, sulfatases, esterases, etc.
The pangenome is larger than the core genome and grows with the addition of new genomes, suggesting that both genera investigated here have open pangenomes. The current study is important for understanding the genomic variations and the distribution of accessory genes that provide a survival advantage to the haloarchaeal strains in adverse conditions.
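As a small worked calculation, the reported pangenome and core genome sizes can be turned into the share of the pangenome that is conserved across all genomes of each genus. The sketch below uses only the four gene counts stated in the text; the function and variable names are illustrative. A low core share does not by itself demonstrate an open pangenome, which rests on the growth curves in Figures 2C,D.

```python
# Pangenome and core genome sizes as reported in the text.
reported_sizes = {
    "Haloferax": {"pan": 7950, "core": 1885},
    "Halogeometricum": {"pan": 7197, "core": 1874},
}


def core_fraction(pan: int, core: int) -> float:
    """Return the core genome as a percentage of the pangenome."""
    return 100.0 * core / pan


for genus, sizes in reported_sizes.items():
    pct = core_fraction(sizes["pan"], sizes["core"])
    non_core = sizes["pan"] - sizes["core"]  # accessory + unique genes
    print(f"{genus}: core = {pct:.1f}% of pangenome, non-core genes = {non_core}")
```

With these numbers the core genome makes up roughly 24% (Haloferax) and 26% (Halogeometricum) of the respective pangenomes, so about three quarters of the gene families in each genus are accessory or unique.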

Biochemical Characterization of Haloarchaeal Isolates
Polyphasic taxonomic characterization was performed according to the proposed minimal standards for the description of new taxa in the order Halobacteriales (Table 1A; Grant and Larsen, 1989;Oren et al., 1997). Biochemical characterization suggested that all three isolates were positive for the catalase test and negative for urease, lipase, and Voges-Proskauer tests. All three strains were also negative for acid production, casein hydrolysis, and hydrogen sulfide production. We also screened for enzymatic activities involved in amino-acid biosynthesis pathways and found that all three isolates were negative for lysine decarboxylase, arginine decarboxylase, aesculin hydrolysis, and ornithine decarboxylase.
Wsp3 and wsp4 strains shared many enzymatic features; for example, both were positive for citrate utilization, the oxidase test, and the methyl red test. Besides these similarities, some variations were also observed, e.g., both H. borinquense DSM 11551 and wsp3 are positive for gelatin liquefaction and indole production while wsp4 is negative. We also found that wsp3 is unable to reduce nitrate, unlike H. borinquense DSM 11551 and wsp4. Genome analysis suggests that both H. borinquense DSM 11551 and wsp4 encode a nitrate reductase gene, which is missing in the wsp3 genome. Similarly, we also found a copy of the nitrate reductase gene in the genome of wsp1, which reduces nitrate, whereas the gene is missing in H. volcanii DS2, which is negative for nitrate utilization.
FIGURE 2 | Archaeal clusters of orthologous groups (arCOG) analysis of the genomes. Comparison of the COG distribution between the core, accessory, and unique genes of (A) thirteen species belonging to the genus Haloferax, including wsp1, and (B) four species belonging to the genus Halogeometricum, including wsp3 and wsp4, analyzed using default parameters of the BPGA pipeline. The arCOG categories are indicated on the X-axis and the percentage of genes present in each arCOG category is indicated on the Y-axis. (C,D) The core and pangenome plots of the genera Haloferax and Halogeometricum, respectively. Red and blue colored lines indicate the change in the number of pangenome and core genome gene groups added sequentially to the genome.

Carbon Source Utilization
Carbon source preferences of the haloarchaeal isolates were identified by monitoring their growth in the presence of different carbon sources. Optical density-based assay results suggest that all three strains have similar preferences for dextrose while they were unable to utilize arabinose, rhamnose, adonitol, inositol, inulin, and melibiose (Table 1B). As in the enzyme-based biochemical tests, wsp3 and wsp4 showed both similar and dissimilar preferences for carbon source utilization. Both wsp3 and wsp4 were positive for pyruvate and maltose utilization while negative for lactose and galactose utilization. Surprisingly, wsp3 alone was positive for three different carbon sources, i.e., sucrose, cellobiose, and dulcitol, while both H. borinquense DSM 11551 and wsp4 were negative. Similarly, wsp1 is not able to utilize sucrose and arabinose while its type strain H. volcanii DS2 is capable of utilizing both sugars.
Quantitative analysis of carbon source utilization shows that wsp1 and wsp3 efficiently utilize galactose and maltose, respectively. Genome analysis suggested that both wsp1 and H. volcanii DS2 encode the galactokinase enzyme, which helps them to utilize galactose as a carbon source (Anderson et al., 2011), while this enzyme is absent in both wsp3 and wsp4 strains. For maltose utilization, two alternative pathways have been proposed (Cibrario et al., 2016). The two essential enzymes of these pathways are MalZ (α-1,4-glucosidase) and MalA (α-amylase). Genomic data analysis suggested that only the wsp3 genome encodes two different copies of the α-1,4-glucosidase enzyme while wsp1 and wsp4 only have an α-amylase gene in their genomes. This probably explains why wsp3 can utilize maltose more efficiently than both wsp1 and wsp4.

Antibiotic Susceptibility Profile
Culture isolates were also screened for antibiotic resistance (Table 2). The analysis suggested that all three isolates were sensitive to novobiocin and resistant to penicillin, vancomycin, chloramphenicol, cephadroxil, lincomycin, cephalexin, and ceftazidime. Surprisingly, wsp3 was sensitive to cefazolin and kanamycin while wsp1 and wsp4 were resistant to both antibiotics, like their type strains. The possible reason for kanamycin resistance in wsp1 and wsp4 is the presence of an aminoglycoside phosphotransferase; this enzyme is absent in wsp3, which probably explains the differences in kanamycin resistance observed among these isolates.

Carotenoid Isolation and Spectroscopic Characterization
In the present study, we isolated carotenoids from the three haloarchaeal strains (Figures 3A,B). The visible cell pellet color of wsp1 differed from that of both wsp3 and wsp4, suggesting variations in either the content or the molecular structure of the carotenoids produced by these strains (Figure 3A). For further analysis, we purified carotenoids following the protocol of Yatsunami et al. (2014) and characterized them using UV-visible spectroscopy. UV-visible spectra suggested that wsp1, wsp3, and wsp4 have similar absorption spectra with major absorption peaks at 476, 502, and 535 nm; however, there were differences in the peak heights (Figure 3B). These differences may explain why carotenoids isolated from wsp1 differ in color from those of wsp3 and wsp4. These three major peaks correspond to bacterioruberin (535 nm), all-trans-lycopene (502 nm), and 13-cis-lycopene (476 nm). Bacterioruberin is a major component of haloarchaeal carotenoids and is synthesized from lycopene (Yang et al., 2015). We took the ratio of peak1/peak3, which may help us understand the bacterioruberin and lycopene contents in these strains. Lycopene has specific absorption at 502 and 476 nm while bacterioruberin absorbs at 535 nm. A slight absorption of bacterioruberin is also reported at 466 nm. The peak1/peak3 ratio was 1.0, 0.9, and 0.8 for wsp1, wsp3, and wsp4, respectively, suggesting that wsp1 has the highest bacterioruberin content among the three strains.

Recombinant Expression and Purification of Halotolerant α-Amylase From the Isolates
Haloarchaea live in hypersaline environments such as marine salterns, saline soils, soda lakes, salted foods, etc. The lysis of cells of marine organisms such as shrimps, diatoms, algae, fish, purple and green bacteria, etc. releases different biomolecules including starch, cellulose, proteins, chitin, etc. To utilize these biopolymers as substrates, haloarchaea need to produce different digestive enzymes. The whole-genome sequencing and annotation results suggested that the wsp1, wsp3, and wsp4 genomes code for two, one, and three putative α-amylase genes, respectively. We successfully cloned all four α-amylase genes, which share <40% sequence identity among themselves. However, only one of them, named amy2 (locus tag G3A49-11660) and isolated from wsp1, expressed well. Multiple domain analysis of Amy2 (protein ID QIB80089.1) suggested that it has a conserved α-amylase domain in the C-terminal region (residues 227 to 635). A PSI-BLAST search of the NCBI database was performed for the additional N-terminal region (residues 1 to 226). However, we did not find any significant hits; therefore, the function of this N-terminal domain remains unclear. Initially, we attempted to purify recombinant His-tagged Amy2 from the cytosolic fraction using E. coli Rosetta DE3 as the expression host. We successfully purified Amy2; however, the purified enzyme was inactive, as suggested by the starch agar plate assay. Since most amylases are secretory proteins, we cloned amy2 into the pET22b vector, which carries a pelB secretion signal sequence at the N-terminus to aid secretion/expression in the periplasmic space and a C-terminal 6X His-tag to aid purification of the recombinant Amy2. We used this construct for purification and further biochemical characterization of Amy2.

Biochemical Characterization of Amy2
Upon induction, we observed amylase activity even in the cell-free culture supernatant (Figure 4A). E. coli cells alone or cells carrying the pET-22b vector without amylase genes were used as negative controls in these experiments (Figure 4A).
We successfully purified Amy2 from the periplasmic space using Ni-NTA-based affinity chromatography (Figure 4B). The gel-filtration profile suggested that Amy2 is predominantly monomeric in solution (Figure 4C), and the purified enzyme was active on starch agar plates. We initially tested the salt tolerance of Amy2 by incubating the purified enzyme for 12 h with salt concentrations ranging from 0.25 to 4 M on a starch agar plate (Figure 4D). We observed amylase activity in all the samples, suggesting that Amy2 is halotolerant. We further performed biochemical characterization of Amy2 to study its stability and activity at various salt concentrations, pH values, and temperatures.

Effect of Salt Concentration on Enzyme Activity
The effect of NaCl concentrations ranging from 0 to 4 M on enzyme activity was examined. The data suggest that Amy2 is active at all the NaCl concentrations studied, with maximum activity at 1 M NaCl. These results suggest that Amy2 has wide-range salt tolerance and retains moderate activity even at 3 M NaCl (Figure 4E).

Effect of pH on Enzyme Activity
The effect of pH on activity was examined by performing the enzyme assay at pH values ranging from 3.0 to 11.0, at 1-unit intervals, for 30 min at 37 °C. The maximum enzymatic activity was observed at pH 6.0, while ∼30% activity was observed at pH 5.0 and pH 9.0, suggesting that the enzyme is active over a wide pH range (Figure 4F).

Effect of Temperature on Enzyme Activity
The thermal effect on enzymatic activity was studied over the temperature range of 5-75 °C. Our findings revealed that Amy2 showed maximum activity at 35 °C and retained 40% activity at 55 °C, whereas negligible activity was observed at 75 °C (Figure 4G).
FIGURE 4 | (C) The analytical gel filtration profile suggests that Amy2 is predominantly monomeric in solution. A small fraction, corresponding to large aggregates, was also observed in the void volume (8.47 ml). (D) Evaluating salt tolerance of Amy2 using the starch plate assay suggests that the enzyme is halotolerant. (E) The enzymatic activity of Amy2 against soluble starch was assayed by the iodine-starch method at various concentrations of NaCl. Relative activity was defined as the percentage of maximum activity at 1 M NaCl. Plots showing the effect of pH (F) and temperature (G) on the activity of Amy2; the relative activity was defined as the percentage of maximum activity observed at pH 6.0 or 35 °C, respectively. All activity assays were performed three times in triplicate, and mean values with standard deviations are plotted.
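The relative activity used for the Figure 4 plots (activity expressed as a percentage of the maximum observed activity, with mean and standard deviation over replicates) can be sketched as follows. This is only an illustration of the normalization: the function name, condition labels, and replicate readings are hypothetical placeholders and not measurements from this study, and the authors' exact processing of the iodine-starch readings is not described here.

```python
from statistics import mean, stdev
from typing import Dict, List, Tuple


def relative_activity(raw: Dict[str, List[float]]) -> Dict[str, Tuple[float, float]]:
    """Return {condition: (mean %, SD %)} relative to the most active condition.

    Each condition needs at least two replicate readings for the SD.
    """
    means = {condition: mean(values) for condition, values in raw.items()}
    top = max(means.values())  # condition with the highest mean defines 100%
    return {
        condition: (100.0 * means[condition] / top, 100.0 * stdev(values) / top)
        for condition, values in raw.items()
    }


# Hypothetical replicate readings in arbitrary units (placeholders only).
example_readings = {
    "condition_A": [4.0, 5.0, 6.0],
    "condition_B": [9.0, 10.0, 11.0],
}

for condition, (pct, sd) in relative_activity(example_readings).items():
    print(f"{condition}: {pct:.1f}% +/- {sd:.1f}%")
```

Normalizing to the most active condition is what makes 1 M NaCl, pH 6.0, and 35 °C appear as 100% in the respective panels.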
The biochemical characterization reveals attractive features of Amy2, including halotolerance, moderate thermostability, and activity over a wide pH range, that are suitable for various industrial applications. The biochemical characteristics reported here are comparable to those of α-amylases reported from other halophiles (Onodera et al., 2013;Bajpai et al., 2015).

Structure Prediction and Analysis of Amy2
Most archaeal α-amylases belong to the glycosyl hydrolase (GH) families GH13, GH70, and GH77 (Dong et al., 1997;Bowers et al., 2009). In recent years, several potential α-amylases from different halophilic archaea have also been added to these families (Lombard et al., 2014;Santorelli et al., 2016). Enzymes belonging to these families should have three distinct domains: a central catalytic domain harboring a (β/α)8 TIM barrel (domain A), an irregular loop domain (domain B) that usually protrudes as a long loop out of the barrel between the third β-strand and the third α-helix, and a domain with the typical structure of an antiparallel β-sandwich (domain C) (Sarian et al., 2017).
Multiple sequence alignment using the NCBI BLAST portal revealed that Amy2 has a catalytic α-amylase domain encompassing residues 267-600 (Figure 5A). This catalytic domain is mainly present in archaeal and bacterial species and is known to hydrolyze α-(1,4) glycosidic linkages of glycogen, starch, related polysaccharides, and some oligosaccharides (Antranikian, 1992;Gupta et al., 2003;Sivaramakrishnan et al., 2006). The structural model of the catalytic domain of Amy2 (R269-D625), built using the PHYRE2 webserver (Kelley et al., 2015), suggested that it may have a structural architecture similar to that found in other members of the GH family of amylases, i.e., an eight-stranded alpha/beta-barrel that contains the active site, a calcium-binding domain located between the third beta-strand and the third alpha-helix, and a carboxyl-terminal Greek key beta-barrel domain (Abe et al., 2005). A BLAST search of the Amy2 sequence against the RCSB database (Bernstein et al., 1977) suggested that Halothermothrix orenii α-amylase is the closest structural homolog, sharing 28% sequence identity over 426 residues (ranging from 3 to 429 aa) (Figure 5B). The majority of α-amylases are calcium-dependent metalloenzymes, in which the metal ion is required for both stability and enzymatic activity (Hsiu et al., 1964;Boel et al., 1990). H. orenii α-amylase (PDB ID 1WZA) harbors two different calcium-binding loops, loop 1 (Asp44 to Ile52) and loop 2 (Asp65 to Asp73) (Sivakumar et al., 2006). The sequence alignment and the predicted structural model suggest that calcium-binding loop 2 is missing in Amy2 (Figures 5B,C). Amy2 also has smaller B and C domains than H. orenii α-amylase due to deletions in the loop regions (Figures 5B,C).
Amylase activity requires a catalytic triad consisting of Asp-Glu-Asp, in which the first aspartic acid attacks the sugar anomeric center with its nucleophilic side chain and the reaction is assisted and governed by the other two residues, Glu and Asp (Hsiu et al., 1964;Machius et al., 1995). In H. orenii this triad is composed of Asp224-Glu260-Asp330. Multiple sequence alignment-based analysis suggests that Amy2 has a similar triad consisting of Asp436-Glu463-Asp529 (Figure 5D).

Screening of CRISPR Sequences and Secondary Metabolites
The clustered regularly interspaced short palindromic repeats (CRISPR)-Cas system is used by bacteria and archaea to fend off foreign genetic elements (Horvath and Barrangou, 2010). Since its discovery, it has been developed into numerous applications such as genome editing and regulation of transcription in eukaryotes and bacteria (Gilbert et al., 2013). The genomes of the strains wsp1, wsp3, and wsp4 were submitted to CRISPR Finder (Grissa et al., 2007), which suggested that all three genomes have multiple CRISPR-DR (CRISPR-direct repeat) as well as spacer sequences. In the wsp1, wsp3, and wsp4 genomes, six, one, and four CRISPR-DR sequences were observed, respectively (Supplementary Table S3). Whole-genome data of wsp1, wsp3, and wsp4 were further analyzed for the presence of CRISPR-associated (Cas) genes. Our analysis suggested the presence of several Cas sequences in both the wsp1 and wsp4 genomes, although no Cas-related genes were identified in the wsp3 genome. In haloarchaea, the CRISPR-Cas system type I, subtype I-B, is predominant (Maier et al., 2017). A characteristic feature of type I systems is the presence of the nuclease Cas3, and subtype I-B is further characterized by the presence of Cas8b (Makarova and Koonin, 2015). The function of the type I-B system mainly depends on the Cas proteins Cas5, Cas7, and Cas8b, which combine to form a Cascade (CRISPR-associated complex for antiviral defense)-like complex in different Haloferax species such as H. volcanii (Maier et al., 2019). Genome analysis of both wsp1 and wsp4 suggested that they encode all the genes required for the type I CRISPR system, i.e., Cas1, Cas2, Cas3, Cas4, Cas5, Cas6, and Cas8b. The presence of both Cas3 and Cas8b suggests that wsp1 and wsp4 harbor a CRISPR-Cas type I-B system similar to other haloarchaeal species.
In addition to enzymes, recent studies indicate that archaea can also produce a wide range of small peptides and secondary metabolites (Pietra, 1997;Pettit, 2011;Wang et al., 2019). These small peptides and secondary metabolites could be of considerable interest in biotechnology. To screen for secondary metabolite production, all three genome sequences were uploaded to the online antiSMASH server (Medema et al., 2011). The output predicted multiple copies of gene clusters for two important classes of secondary metabolites, i.e., siderophores and terpenes (Supplementary Table S4); in the case of wsp3, however, no siderophore gene was predicted. Siderophores play an important role in iron transport (Neilands, 1995), while the functions of terpenes in haloarchaea are not yet known.
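The signature-gene reasoning used above (Cas3 indicates a type I system, and Cas8b further indicates subtype I-B) can be written as a small decision rule. The sketch below is a deliberate simplification for illustration: the function name and return strings are hypothetical, and full CRISPR-Cas classification relies on more signature genes and locus architecture than these two markers.

```python
from typing import Iterable


def classify_crispr(cas_genes: Iterable[str]) -> str:
    """Apply the two-marker rule from the text: Cas3 -> type I, Cas8b -> I-B."""
    genes = {gene.lower() for gene in cas_genes}
    if "cas3" not in genes:
        return "no type I signature"
    if "cas8b" in genes:
        return "type I, subtype I-B"
    return "type I, subtype unresolved"


# Cas gene set reported in the text for wsp1 and wsp4.
wsp1_wsp4_cas = ["Cas1", "Cas2", "Cas3", "Cas4", "Cas5", "Cas6", "Cas8b"]
# No Cas-related genes were identified in wsp3.
wsp3_cas: list = []

print("wsp1/wsp4:", classify_crispr(wsp1_wsp4_cas))
print("wsp3:", classify_crispr(wsp3_cas))
```

Under this rule, wsp1 and wsp4 are assigned to type I-B, while wsp3 carries CRISPR repeats but no recognizable type I Cas signature.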

DISCUSSION
Haloarchaeal genera show huge variations in the diversity within different geographical locations (Oren, 2002;Naghoni et al., 2017). There are several reports which suggest that different haloarchaeal strains were routinely isolated from the different salt areas like solar salterns, brines, salt lakes, salt pans, rock salts, etc. (Oren, 2002;Javor, 2012). In the present study, three haloarchaeal strains, namely wsp1, wsp3, and wsp4, were isolated from the high salinity environment and studied in detail.
Haloarchaeal isolates adopt diverse morphological shapes such as rods, pleomorphic rods, squares, cocci, triangles, and disks (Burns et al., 2007;Minegishi et al., 2010). A few haloarchaeal strains grow as motile rods in liquid culture but form nonmotile cocci on solid agar medium (Grant, 2001). All three isolates in this study, i.e., wsp1, wsp3, and wsp4, were highly pleomorphic. In addition, wsp3 and wsp4 were highly vacuolated, which may help these microbes to float on the surface of the water (Oren, 2012).
The pangenome comparison of the four Halogeometricum and twelve Haloferax genomes, including the three isolates reported in this study, has revealed genomic variations resulting from horizontal gene transfer, gene duplication, gene loss events, etc., which is in line with results from other studies (Vos and Didelot, 2009;DeMaere et al., 2013). These data also suggest that the pangenomes of these genera are open and that each genome contains about 3 to 20% unique genes. The pangenome analysis further suggested that the unique, accessory, and core genes are distributed across different biological functions, with no obvious pattern.
Our data and past studies (Nagata et al., 1989;Littlechild, 2015;Karray et al., 2018;Amoozegar et al., 2019) suggest that haloarchaea code for several enzymes suitable for industrial and research applications. In particular, their salt-tolerant and thermostable nature makes them attractive for specific applications such as food processing, biofuel production, and detergent additives. However, the potential of these enzymes has not yet been commercially exploited. Amylase enzymes are typically involved in essential processes in microorganisms, such as hydrolysis of starch and glycogen, yielding glucose and maltose (Werries and Muller, 1986). There is an increased interest in exploiting amylases for the conversion of grain starch into fermentable sugars for ethanol production (Shigechi et al., 2004;Prakash and Jaiswal, 2010). However, purification of enzymes from the natural haloarchaeal source will not be commercially viable owing to the poor yield and slow growth rate (doubling time ∼4-10 h). Hence, there is a need to develop processes based on the recombinant expression of the target enzymes in fast-growing microbial hosts like E. coli. Only a limited number of studies report recombinant haloarchaeal α-amylase production (Perez-Pomares et al., 2003;Hutcheon et al., 2005;Bajpai et al., 2015;Santorelli et al., 2016). Onodera et al. produced recombinant Haloarcula japonica α-amylase using an E. coli host and found that malA is a cytosolic halotolerant α-amylase which is active even at 4 M salt concentration (Onodera et al., 2013). Our whole-genome data analysis of the wsp isolates suggested that they all code for one or more types of α-amylase genes in their genomes. We were able to successfully clone, express, and purify one of the annotated α-amylase genes, named amy2, from wsp1. Interestingly, Amy2 purified from the cytosolic fraction was soluble but not active.
α-amylases are usually secretory proteins, and there are five cysteine residues in Amy2. So, to direct the protein to the periplasmic space for proper folding and functional expression, we cloned amy2 with the pelB signal sequence at the N-terminus. Using this strategy, we were successful in purifying active Amy2. In the future, this strategy may aid the purification of other functional α-amylases as well. The purified Amy2 was studied for its activity under varying salt concentration, pH, and temperature conditions. The data suggest that Amy2 was active over a wide range of salt concentrations, temperatures, and pH conditions. However, detailed studies are required to further investigate its suitability for various industrial applications.
All three isolates were characterized using a battery of biochemical tests, and their antibiotic susceptibility was also evaluated. The results of these assays for wsp1, wsp3, and wsp4 were similar to those of their corresponding type strains. However, some differences were also observed, which we were able to explain based on the whole-genome sequencing data analysis. These discrepancies correlated with the presence or absence of the corresponding enzyme(s) in the genome, as described in the results section. Haloarchaea are known for producing high levels of carotenoids like bacterioruberin and its C50-related pigments, which have high commercial potential in various industries including cosmetics, food, poultry, and health (Vilchez et al., 2011;Mata-Gomez et al., 2014;Naziri et al., 2014;Rodrigo-Banos et al., 2015). Interestingly, haloarchaeal carotenoids have also been shown to have dose-dependent cytotoxicity against human liver cancer cell lines, suggesting anti-cancer activity (Rodrigo-Banos et al., 2015). The carotenoids isolated from all three strains in this study showed a characteristic absorption profile similar to that of other haloarchaeal strains (Yatsunami et al., 2014;Yang et al., 2015). The differences in the UV-Vis spectrum observed among the carotenoids purified from these strains suggest differences in the contents of bacterioruberin and other carotenoids.
To summarize, we have characterized three haloarchaeal strains isolated from Indian solar salterns. We identified several commercially important enzymes encoded in these three haloarchaeal strains and successfully purified a halotolerant recombinant α-amylase using E. coli as an expression host. Haloarchaea are known to survive harsh environmental and nutrient-depleted conditions. Detailed biochemical and biophysical characterization, in the future, may help explore the feasibility of these enzymes for commercial applications. Future studies aimed at deciphering the function of hypothetical unique and accessory genes may also help in identifying the mechanisms of stress adaptation that aid survival of haloarchaea in harsh conditions.

The Whole-Genome Sequencing and 16S rRNA Comparison
Genomic DNA extractions were performed using the Zymogen DNA isolation kit (Cat. No. D6105). The 16S rRNA gene sequences of strains wsp1, wsp3, and wsp4 were PCR-amplified using the standard primers 21 forward (TCCGGTTGATCCYGCCGG) and 1453 reverse (GGGCYGCACGCGYRCTACA). The amplified 16S rRNA gene sequences were then compared with those of representative related taxa using the EzTaxon server (Yoon et al., 2017). The 16S rRNA gene sequences of species closely related to wsp1, wsp3, and wsp4 were retrieved from the EzTaxon server (Chun et al., 2007) and aligned using MEGA version X (Tamura et al., 2004;Kumar et al., 2018). Phylogenetic trees were constructed using the neighbor-joining method (Saitou and Nei, 1987), and the percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (1500 replicates) is shown next to the branches (Felsenstein, 1985). Whole-genome sequencing of all three strains was performed using Illumina NextSeq, and CLC NGS Cell ver 9 (CGWB) was used for assembly. The draft genomes were annotated using the RAST online server (Aziz et al., 2008). The H. borinquense DSM 11551 and H. volcanii DS2 genomes were used as references for comparison and annotation. Pangenome analysis was carried out using the BPGA pipeline (Chaudhari et al., 2016). The genomes were screened for secondary metabolite gene clusters using the antiSMASH online server (Medema et al., 2011).

Genome-Sequence Submission
The genome sequences of wsp1, wsp3, and wsp4 have been submitted to NCBI under accession numbers NZ_CP048738, NZ_CP050274, and NZ_CP048739.

Transmission Electron Microscopy
Aliquots of 50 µl from samples containing wild-type culture (basal medium supplemented with 25.0% NaCl) were pipetted onto a carbon-coated 300-mesh copper grid (Polysciences, United States). Excess sample was blotted off and the grids were air-dried. The grids were then imaged using a JEM 2100 electron microscope (JEOL) operated at 200 keV.

Cloning, Expression and Purification of Amy2
The full-length amy2 gene was amplified and cloned into the pET22b (Novagen) overexpression vector between the NdeI-XhoI or NcoI-XhoI sites to yield recombinant protein with a C-terminal 6X His-tag. The first pair of restriction sites (NdeI-XhoI) was used for cytosolic expression, while the second pair (NcoI-XhoI) was used to express Amy2 fused with the pelB secretion signal sequence at the N-terminus. Forward primers 5′-AGCTCATATGGCTGTCGGCGAGTCAGTA and 5′-AGTACCATGGCTGTCGGCGAGTCAGTA, having NdeI and NcoI sites, respectively, and reverse primer 5′-AGTCCTCGAGGGCGAGCGGGCAGACGAC, having an XhoI site, were used for gene amplification. The amplified PCR products and vector were digested with NdeI-XhoI or NcoI-XhoI, followed by overnight ligation at 4 °C. The ligated products were transformed into Top10 chemically competent cells (Thermo Scientific, United States). Positive clones were screened by colony PCR and further confirmed by DNA sequencing. The protein was expressed using BL21-DE3 Rosetta cells (Thermo Scientific, United States). Transformed cells were inoculated into 10 ml of medium as a primary culture and then into 1000 ml of medium supplemented with 100 µg/ml ampicillin. The cultures were induced at an OD600 of 0.6 by adding 0.3 mM IPTG and incubated overnight at 18 °C. Amy2 was purified by Ni-NTA (Merck-Sigma, United States) affinity chromatography following the manufacturer's standard instructions. Cell pellets were resuspended in buffer A (20 mM Tris pH 7.4, 250 mM NaCl, and 10 mM CaCl2) and lysed by sonication. The sample was centrifuged at 12,000 g for 30 min. After centrifugation, the pellet was discarded and the cell-free supernatant was mixed with Ni-NTA resin. Protein was eluted by increasing the concentration of imidazole (20-500 mM) in buffer A. The eluted fractions were loaded on 15% SDS-PAGE to check the purity and quality of the protein samples.
The elution fractions were pooled and concentrated using 30 K Amicon centrifugal ultrafiltration devices (Merck Millipore, United States). The protein was further purified by gel filtration chromatography on a Superdex 200 Increase column. The purified Amy2 was used for further biochemical experiments.
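As a quick sanity check on the cloning design, the sketch below scans the three primers for the recognition sequences of the enzymes named above. The recognition sequences (NdeI CATATG, NcoI CCATGG, XhoI CTCGAG) are standard; the primer strings are taken from the text with extraction spaces removed, and the dictionary keys and function name are illustrative only.

```python
# Standard recognition sequences of the three enzymes used for cloning.
SITES = {"NdeI": "CATATG", "NcoI": "CCATGG", "XhoI": "CTCGAG"}

# Primer sequences as given in the text (5' to 3'); the keys are
# hypothetical labels introduced for this sketch.
PRIMERS = {
    "forward_NdeI": "AGCTCATATGGCTGTCGGCGAGTCAGTA",
    "forward_NcoI": "AGTACCATGGCTGTCGGCGAGTCAGTA",
    "reverse_XhoI": "AGTCCTCGAGGGCGAGCGGGCAGACGAC",
}


def find_sites(seq):
    """Return {enzyme: 0-based start index} for each site present in seq."""
    seq = seq.upper().replace(" ", "")
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


for label, primer in PRIMERS.items():
    print(label, find_sites(primer))
```

Each primer carries exactly one of the three sites, preceded by four extra 5′ bases, which is consistent with the NdeI-XhoI and NcoI-XhoI cloning strategies described.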

Analytical Gel Filtration
A 0.5 ml aliquot of the purified protein sample was injected into a Superdex 200 Increase (GE Healthcare, United States) analytical gel filtration column pre-equilibrated with 20 mM Tris pH 7.4, 250 mM NaCl, and 10 mM CaCl2. The flow rate was kept at 0.5 ml per min. The chromatogram was recorded at both 280 and 220 nm.

Carotenoid Isolation
The carotenoids were extracted following the protocol published by Yatsunami et al. (2014). Briefly, 20 ml of haloarchaeal culture was centrifuged at 12,000 g for 30 min at 25 °C. The supernatant was discarded and the pellets were resuspended in 5 ml of acetone and methanol solution (7:3). This suspension was incubated at room temperature for 30 min in the dark, followed by centrifugation at 12,000 g for 30 min. The reddish-pink supernatant was collected and the pellets were discarded. The colored supernatants were centrifuged at 3000 g in a speed-vac (Eppendorf concentrator plus) to remove methanol and acetone. The reddish-pink pellets were finally dissolved in 200 µl of 100% methanol. UV spectra of the extracted solutions were recorded over the 200-700 nm range using a spectrophotometer (CECIL CE7500).

Enzymatic Assay on a Starch Agar Plate
The α-amylase activity of the heterologously expressed Amy2 was measured using a starch agar plate assay. 1 g of starch agar was dissolved in 100 ml of distilled water and poured into Petri plates. 50 µl of 5 µM purified Amy2 was loaded into starch agar wells and incubated overnight at 37 °C. The activity was checked by staining the plates with an iodine solution [15% solution containing 5% (wt/v) I2 and 10% (wt/v) KI].

Biochemical Characterization of Amy2
Biochemical characterization of the enzyme was performed by the iodine-starch method. The 50 µl reaction solution contained 1.0 ISU/ml enzyme and 0.2% (w/v) starch in reaction buffer (20 mM Tris-HCl, 250 mM NaCl, and 10 mM CaCl2, pH 7.4). A660 was measured using the standard enzyme assay. The effect of temperature on enzyme activity was determined by incubating the enzyme over the temperature range of 5-75 °C at 10 °C intervals for 30 min. The optimum pH for enzyme activity was determined by measuring the activity at 37 °C for 30 min under different pH conditions at intervals of 1.0 pH unit. Sodium acetate buffer was used for pH 3-5, phosphate buffer for pH 6-7, Tris buffer for pH 8-9, and sodium bicarbonate buffer for pH 10-11. The effect of salt concentration was assessed by measuring the activity at different NaCl concentrations (0-4 M) at 37 °C for 30 min. The data presented are averages of three independent experiments.
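The assay conditions above can be laid out programmatically, as in the following minimal sketch. The temperature points and the pH-to-buffer mapping follow the text. Normalizing the mean of triplicates to the best condition (relative activity, %) is an assumption about how such data are commonly reported, and the function names and example numbers are hypothetical placeholders rather than measured values.

```python
from statistics import mean

# Temperature points: 5-75 °C at 10 °C intervals, as stated in the text.
TEMPERATURES_C = list(range(5, 76, 10))
# pH points: 3-11 at intervals of 1.0 pH unit.
PH_VALUES = list(range(3, 12))


def buffer_for_ph(ph):
    """Return the buffer system used at a given pH, following the text."""
    if 3 <= ph <= 5:
        return "sodium acetate"
    if 6 <= ph <= 7:
        return "phosphate"
    if 8 <= ph <= 9:
        return "Tris"
    if 10 <= ph <= 11:
        return "sodium bicarbonate"
    raise ValueError(f"pH {ph} is outside the tested range 3-11")


def relative_activity(replicates_by_condition):
    """Average replicate activities and express them as % of the maximum.

    The normalization to the best condition is an assumption of this sketch.
    """
    means = {cond: mean(vals) for cond, vals in replicates_by_condition.items()}
    top = max(means.values())
    if top <= 0:
        raise ValueError("no positive activity to normalize against")
    return {cond: 100.0 * m / top for cond, m in means.items()}


# Placeholder triplicates in arbitrary units, keyed by NaCl molarity;
# these are NOT measurements from the study.
example = {0.0: [2, 2, 2], 1.0: [4, 4, 4], 2.0: [3, 3, 3]}
rel = relative_activity(example)
```

With these placeholder numbers the 1.0 M condition is set to 100% and the others scale against it, which is the form in which an optimum salt concentration, temperature, or pH is usually read off.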

Evolutionary Relationships of Taxa
The evolutionary history was inferred using the Neighbor-Joining method (Saitou and Nei, 1987). The optimal tree with the sum of branch length = 0.23099238 is shown. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (1500 replicates) is shown next to the branches. The tree is drawn to scale, with branch lengths in the same units as those of the evolutionary distances used to infer the phylogenetic tree. The evolutionary distances were computed using the Maximum Composite Likelihood method and are in the units of the number of base substitutions per site. This analysis involved 16 nucleotide sequences. All ambiguous positions were removed for each sequence pair (pairwise deletion option). There were a total of 1483 positions in the final dataset. Evolutionary analyses were conducted in software suite MEGA X (Kumar et al., 2018).
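The pairwise deletion option mentioned above can be illustrated with a small sketch. For simplicity it computes an uncorrected p-distance (the proportion of differing sites) rather than the Maximum Composite Likelihood distance used in the analysis, so it demonstrates only the site-exclusion rule. The function name and the example sequences are hypothetical.

```python
VALID_BASES = set("ACGT")


def p_distance_pairwise_deletion(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are dropped
    for this pair only (pairwise deletion). This is an uncorrected
    p-distance, not the Maximum Composite Likelihood distance.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in VALID_BASES or b not in VALID_BASES:
            continue  # skip this site for this pair only
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    return mismatches / compared


# Toy aligned sequences (illustrative only): two sites are skipped,
# and two of the six remaining sites differ.
d = p_distance_pairwise_deletion("ACGTN-AC", "ACGAACTC")
```

Because sites are excluded per pair rather than across the whole alignment, different sequence pairs can be compared over different numbers of positions, which is the trade-off the pairwise deletion option accepts.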

AUTHOR CONTRIBUTIONS
KT, SP, and DV conceived the study. KT, SP, and AP coordinated the study. SP provided the strains. KT and DV designed experiments, analyzed the data, and wrote the manuscript with inputs from other co-authors. DV, GV, and CS performed experiments. All authors reviewed the results and approved the final version of the manuscript.

FUNDING
This work was supported by grants to KT, SP, and AP from the Council of Scientific and Industrial Research, India. DV and GV are recipients of senior research fellowships from the Council of Scientific and Industrial Research and the University Grants Commission, India, respectively.