Soil Microbiomes With the Genetic Capacity for Atmospheric Chemosynthesis Are Widespread Across the Poles and Are Associated With Moisture, Carbon, and Nitrogen Limitation

Soil microbiomes within oligotrophic cold deserts are extraordinarily diverse. Increasingly, oligotrophic sites with low levels of phototrophic primary producers are reported, leading researchers to question their carbon and energy sources. A novel microbial carbon fixation process termed atmospheric chemosynthesis recently filled this gap as it was shown to be supporting primary production at two Eastern Antarctic deserts. Atmospheric chemosynthesis uses energy liberated from the oxidation of atmospheric hydrogen to drive the Calvin-Benson-Bassham (CBB) cycle through a new chemotrophic form of ribulose-1,5-bisphosphate carboxylase/oxygenase (RuBisCO), designated IE. Here, we propose that the genetic determinants of this process, RuBisCO type IE (rbcL1E) and high affinity group 1h-[NiFe]-hydrogenase (hhyL), are widespread across cold desert soils and that this process is linked to dry and nutrient-poor environments. We used quantitative PCR (qPCR) to quantify these genes in 122 soil microbiomes across the three poles, spanning the Tibetan Plateau, 10 Antarctic and three high Arctic sites. Both genes were ubiquitous, being present at variable abundances in all 122 soils examined (rbcL1E, 6.25 × 10^3–1.66 × 10^9 copies/g soil; hhyL, 6.84 × 10^3–5.07 × 10^8 copies/g soil). For the Antarctic and Arctic sites, random forest and correlation analysis against 26 measured soil physicochemical parameters revealed that rbcL1E and hhyL genes were associated with lower soil moisture, carbon and nitrogen content. While further studies are required to quantify the rates of trace gas carbon fixation and the organisms involved, we highlight the global potential of desert soil microbiomes to be supported by this new minimalistic mode of carbon fixation, particularly throughout dry oligotrophic environments, which encompass more than 35% of the Earth’s surface.

In a recent study by Ji et al. (2017), a novel form of light-independent autotrophy termed atmospheric chemosynthesis was discovered in soils across two oligotrophic East Antarctic deserts: the arid Robinson Ridge (average organic carbon 0.17%, moisture 4.4%) in the Windmill Islands and the hyper-arid Adams Flat (average organic carbon 0.09%, moisture 0.42%) in the Vestfold Hills region. At these sites, H2-oxidizing bacteria were proposed to employ high-affinity type 1h-[NiFe]-hydrogenases to scavenge and oxidize hydrogen gas that has diffused into the subsurface soil from the atmosphere. The energy liberated from this oxidation process was proposed to support cell maintenance as well as carbon fixation via the Calvin-Benson-Bassham (CBB) cycle, and was linked to a novel chemotrophic form of ribulose-1,5-bisphosphate carboxylase/oxygenase (RuBisCO), type IE (Grostern and Alvarez-Cohen, 2013; Tebo et al., 2015; Ji et al., 2017). RuBisCO type IE (rbcL1E) is phylogenetically distinct from the photoautotrophic RuBisCO types IA and IB and is notably also distinct from the other chemoautotrophic RuBisCO red-types IC and ID, diverging from these clades prior to their own separation (Park et al., 2009; Tebo et al., 2015). Despite this discovery, the broader ecological role and significance of this novel RuBisCO has still not been determined.
Here, we propose that terrestrial microbiomes inhabiting cold oligotrophic deserts throughout the world may be genetically capable of supporting cell growth through atmospheric chemosynthesis, particularly in environments where photoautotrophs are limited. We used quantitative PCR (qPCR) targeting the rbcL1E and the 1h-[NiFe]-hydrogenase large subunit (hhyL) genes to survey 122 desert soils spanning the Tibetan Plateau and 13 Antarctic and high Arctic sites. The taxonomic composition of each soil was analyzed using amplicon sequencing. We also aimed to identify the abiotic parameters associated with the genetic capacity for atmospheric chemosynthesis within each region, by correlating the relative abundances of rbcL1E and hhyL against 26 measured soil physicochemical parameters. We hypothesize that atmospheric chemosynthesis, as a new form of chemoautotrophy, is associated with low moisture and nutrient limitation in cold desert soils, under the general exclusion of phototrophs.

Soil Sampling
Soil samples were obtained from the 13 high Arctic and Antarctic sites by Australian Antarctic Program (AAP) expeditioners, while the Tibetan Plateau soil samples were obtained by expeditioners from the Chinese Academy of Sciences. In total, 122 samples were collected from the top 10 cm of soil, as previously described (Siciliano et al., 2014; Ji et al., 2020; Zhang et al., 2020). All soil samples were stored at −80°C until downstream analysis. Laser scatter was used to quantify soil particle size, while pH and conductivity were measured using a 1:5 soil to distilled water suspension. Global positioning system (GPS), geographic information system (GIS), and digital elevation models (DEMs) were implemented to measure physical parameters, including location, elevation, and aspect. Combustion and nondispersive infrared (NDIR) gas analysis were used to quantify total carbon (TC), total phosphorus (TP), and total nitrogen (TN) content (Rayment and Lyons, 2011). As comprehensive soil physicochemical analysis was not performed on the Tibetan Plateau soils, these samples were excluded from downstream correlation analysis.

Community Genomic DNA (gDNA) Extraction
DNA was extracted in triplicate from 0.25 to 0.3 g of each soil sample using the FastDNA SPIN kit for soil (MP Biomedicals, NSW, Australia) as per the manufacturer's instructions. DNA was quantified with Picogreen (Life Technologies, Vic, Australia) and fluorescence measured on a fluorescence plate reader (SpectraMax M3 Multi-Mode Microplate Reader; Molecular Devices, CA), prior to being stored at −80°C until used.

Bacterial 16S rRNA Gene Sequencing and Data Analysis
Barcode tag amplification of the bacterial 16S ribosomal RNA (rRNA) gene was previously performed on soil gDNA using primers 28F and 519R (Bissett et al., 2016; Ji et al., 2020; Zhang et al., 2020). ARISA analysis confirmed that each set of triplicate gDNA extractions was significantly correlated (data not shown; van Dorst et al., 2014b; Ferrari et al., 2015). Therefore, 16S rRNA sequencing and all downstream analyses were performed using a single gDNA extract from each of the soil samples. Paired-end amplicon sequencing was performed using the Illumina MiSeq platform (Illumina, California, US) in accordance with protocols from the Biome of Australia Soil Environments (BASE) project by Bioplatforms Australia (Bissett et al., 2016). The Antarctic and high Arctic data were downloaded together from the Australian Antarctic Datacentre and the BASE repository. Amplicon sequencing data for the Tibetan Plateau soils were obtained from Ji et al. (2020) and analyzed separately. Open operational taxonomic unit (OTU) picking, assignment and classification were performed according to previously described methods (Zhang et al., 2020). In brief, USEARCH v10.0.240 (Edgar et al., 2011) and VSEARCH v2.8.0 (Torbjørn et al., 2016) were employed according to the UPARSE-OTU algorithm (Edgar, 2013). Sequences were quality filtered, trimmed, and clustered de novo to classify OTUs at 97% identity, and assigned to separate sample-by-OTU matrices where singletons were discarded manually. Sequences were then taxonomically classified against the SILVA v3.2.1323 SSU rRNA database (Quast et al., 2013).

Validation of the RuBisCO Type IE qPCR Primer Set
The RuBisCO type IE qPCR primer set (rbcL1Ef/rbcL1Er) designed in Ji et al. (2017) was validated for use in polar soils by amplicon sequencing DNA lysates. PCR was performed in reaction mixtures composed of 2 μl template gDNA, 5 μl GoTaq Flexi Buffer, pH 8.5 (Promega Corporation, USA), 1 μl 25 mM MgCl2, 0.5 μl 10 mM dNTPs (Bioline), 13.75 μl UltraPure DNase/RNase-free distilled water (Invitrogen, Scotland), 0.312 μl of each 40 μM primer (rbcL1Ef/rbcL1Er; Table 1; Integrated DNA Technologies), 0.126 μl GoTaq polymerase (Promega Corporation, USA), and 2 μl of 1 mg/μl Bovine Serum Albumin (BSA). Amplifications were conducted using a Mastercycler nexus X2 (Eppendorf, NSW, Australia) under the following conditions: 95°C for 5 min; 35 cycles of denaturing at 95°C for 30 s, annealing at 55°C for 30 s, and extension at 72°C for 30 s; and a final extension at 72°C for 5 min. PCR products underwent amplicon sequencing using the Illumina platform at the Australian Centre for Ecogenomics (University of Queensland). Sequences that lacked both the forward and reverse primer binding sites were discarded, as were those with average quality scores <35. The remaining sequences were matched to reference RuBisCO subtype sequences using the National Centre for Biotechnology Information (NCBI) Basic Local Alignment Search Tool (BLAST; Ye et al., 2012).
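As a quick arithmetic check on the reaction mixture described above, the following minimal Python sketch sums the listed component volumes and applies the standard dilution relation C1·V1 = C2·V2 to obtain final concentrations. The dictionary keys and function names are illustrative only and are not taken from the study; the volumes and stock concentrations are those stated in the text.

```python
# Component volumes (in microlitres) as listed in the PCR protocol text.
components_ul = {
    "template gDNA": 2.0,
    "GoTaq Flexi Buffer": 5.0,
    "MgCl2 (25 mM stock)": 1.0,
    "dNTPs (10 mM stock)": 0.5,
    "water": 13.75,
    "forward primer (40 uM stock)": 0.312,
    "reverse primer (40 uM stock)": 0.312,
    "GoTaq polymerase": 0.126,
    "BSA": 2.0,
}


def total_volume(components):
    """Total reaction volume in microlitres."""
    return sum(components.values())


def final_concentration(stock_conc, stock_volume_ul, total_ul):
    """Dilution relation C1*V1 = C2*V2, solved for C2 (same unit as stock_conc)."""
    return stock_conc * stock_volume_ul / total_ul


total_ul = total_volume(components_ul)
primer_um = final_concentration(40.0, 0.312, total_ul)  # each primer, in uM
mgcl2_mm = final_concentration(25.0, 1.0, total_ul)     # in mM
dntp_mm = final_concentration(10.0, 0.5, total_ul)      # in mM

print(f"total volume: {total_ul:.2f} ul")
print(f"primer: {primer_um:.3f} uM, MgCl2: {mgcl2_mm:.2f} mM, dNTPs: {dntp_mm:.2f} mM")
```

Under these stated volumes the components add up to a 25 μl reaction, with each primer at roughly 0.5 μM.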

qPCR Data Analysis
CFX manager software (Bio-Rad Laboratories) was used for data analysis. Melt peak analysis confirmed amplification specificity. Amplification within the most dilute standard was detected at least 5 Ct before the negative template controls. Average Ct values across replicates were determined, standard curve efficiencies were calculated, and copy numbers were converted into copies/g of soil. Here, the average efficiencies for the 16S rRNA, rbcL1E and hhyL gene qPCR reactions were 86.9% (±3.55), 91.1% (±1.48), and 89.9% (±3.06), respectively. The R2 value for each qPCR was equal to or greater than 0.99. In accordance with Jones and Hallin (2019), the genetic copy numbers of rbcL1E quantified were corrected against the proportion of rbcL1E target reads observed during site-specific primer validation [section "Validation of the RuBisCO Type IE qPCR Primer Set"]. Next, rbcL1E and hhyL copy numbers were normalized against 16S rRNA gene copy numbers and expressed as a percentage. Beanplots comparing the relative abundances of rbcL1E and the hhyL gene were produced using BoxPlotR (Kampstra, 2008; RStudio and Inc, 2013; R Core Team, 2018).
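The quantification steps described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the CFX manager implementation: it assumes the conventional linear standard curve Ct = slope × log10(copies) + intercept, treats the specificity correction as a simple multiplication by the fraction of on-target reads, and uses hypothetical elution and template volumes and example values throughout.

```python
import math


def amplification_efficiency(slope):
    """Efficiency from the standard-curve slope; 1.0 means 100% (perfect doubling)."""
    return 10 ** (-1.0 / slope) - 1.0


def copies_from_ct(ct, slope, intercept):
    """Invert Ct = slope * log10(copies) + intercept to get copies per reaction."""
    return 10 ** ((ct - intercept) / slope)


def copies_per_gram(copies_per_reaction, template_ul, elution_ul, soil_g, dilution=1.0):
    """Scale copies per reaction up to the whole extract, then divide by soil mass."""
    return copies_per_reaction * dilution * (elution_ul / template_ul) / soil_g


def specificity_corrected(copies, target_fraction):
    """Assumed form of the correction: keep only the on-target share of copies."""
    return copies * target_fraction


def relative_abundance_percent(gene_copies, ssu_copies):
    """Gene copies expressed as a percentage of 16S rRNA gene copies."""
    return 100.0 * gene_copies / ssu_copies


# Hypothetical example values, not measurements from the study.
perfect_slope = -1.0 / math.log10(2)              # slope giving 100% efficiency
example_efficiency = amplification_efficiency(perfect_slope)
example_copies = copies_from_ct(ct=26.0, slope=-4.0, intercept=38.0)
example_per_gram = copies_per_gram(example_copies, template_ul=2.0,
                                   elution_ul=100.0, soil_g=0.25)
example_relative = relative_abundance_percent(
    specificity_corrected(1.0e8, 0.738), 1.0e9)
print(example_efficiency, example_copies, example_per_gram, example_relative)
```

Applying the specificity correction before normalization means the reported percentage reflects only the estimated on-target rbcL1E copies relative to 16S rRNA gene copies.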

Multivariate Data Analysis
Skewness was eliminated from the 26 measured physicochemical factors using log or square root transformations within the PRIMER v7 + PERMANOVA package (Clarke and Warwick, 2001), and the factors were then normalized (mean/standard deviation; van Dorst et al., 2014a). Subsequent multivariate data analysis of the physicochemical, 16S rRNA amplicon and qPCR data was carried out in the R v3.5.1 environment (R Core Team, 2018). Non-metric multi-dimensional scaling (NMDS) ordination plots were generated using the R package "vegan" (Oksanen et al., 2015) to visualize the ordering of samples in reduced two-dimensional space. The Euclidean dissimilarity index was applied to the physicochemical parameters, while the Bray-Curtis dissimilarity index was applied to the bacterial taxonomic abundance dataset. Ellipse paths were calculated using the "veganCovEllipse" function developed by Torondel et al. (2016). The vegdist() and anosim() functions from the "vegan" R package were used to conduct one-way analysis of similarity (ANOSIM) on dissimilarity matrices of our OTU and physicochemical datasets, grouped on a regional and site level (permutations = 999, α = 0.05). Subsequently, the qPCR data were analyzed against the 26 measured physicochemical factors. To identify the most appropriate correlation method to apply to this analysis, the linearity between the genetic abundance of rbcL1E and hhyL with each physicochemical parameter was tested through the generation of scatterplots using the "ggscatter" function, which builds on the "ggplot2" R package (Wickham, 2016). Spearman correlations between rbcL1E and hhyL genetic abundances and each physicochemical condition were displayed using the R package "corrplot" (Wei et al., 2017) to determine the direction of correlations observed. Multivariate random forest regression analysis was subsequently conducted using the "rfPermute" package (Archer, 2013) under the R environment (http://cran.r-project.org/). The relative importance and the significance of environmental factors in explaining the relative abundance of rbcL1E, hhyL, and the total bacterial abundance were determined.
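To make the ANOSIM statistic concrete, the sketch below computes R = (mean between-group rank − mean within-group rank) / (M/2), where M is the number of sample pairs, on Bray-Curtis dissimilarities, together with a permutation p-value. It is a plain-Python illustration of the general technique, not the "vegan" implementation used in the study, and the toy abundance table and group labels are hypothetical.

```python
import itertools
import random


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    return num / den if den else 0.0


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def anosim_r(ranks, pairs, groups):
    """ANOSIM R from ranked pairwise dissimilarities and group labels."""
    within = [r for r, (i, j) in zip(ranks, pairs) if groups[i] == groups[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if groups[i] != groups[j]]
    mean_within = sum(within) / len(within)
    mean_between = sum(between) / len(between)
    return (mean_between - mean_within) / (len(pairs) / 2)


def anosim(samples, groups, permutations=999, seed=0):
    """Return (R, p) where p comes from shuffling the group labels."""
    pairs = list(itertools.combinations(range(len(samples)), 2))
    ranks = average_ranks([bray_curtis(samples[i], samples[j]) for i, j in pairs])
    observed = anosim_r(ranks, pairs, groups)
    rng = random.Random(seed)
    shuffled = list(groups)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if anosim_r(ranks, pairs, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical toy table: three samples from each of two well-separated regions.
toy_samples = [[10, 0, 1], [9, 1, 0], [10, 1, 1], [0, 10, 5], [1, 9, 6], [0, 11, 5]]
toy_groups = ["A", "A", "A", "B", "B", "B"]
r_stat, p_value = anosim(toy_samples, toy_groups)
print(r_stat, p_value)
```

R values near 1 indicate that samples are more similar within groups than between them, while values near 0 indicate no grouping structure; with very few samples, the smallest attainable permutation p-value is limited by the number of distinct label arrangements.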

Validation of the rbcL1E qPCR Primer Set
Following quality control, amplicon sequencing using the rbcL1E primer set resulted in a total of 760,855 sequence reads. In these polar soils, the primers were on average 73.8% specific toward rbcL1E (Supplementary Material 4). The specificity of the rbcL1E primer pair ranged between 61.5 and 86.9%, with most non-target sequences retrieved classified as RuBisCO type IC. Genetic copies of rbcL1E obtained during later qPCR analysis were corrected against the site-specific primer specificities obtained here.

Relative Abundances of rbcL1E and hhyL Genes in Cold Desert Soils
The genetic determinants for atmospheric chemosynthesis were detected in high abundances across all 122 polar soils analyzed (Figure 2). As a percentage of 16S rRNA gene copies/g soil, average relative abundances of rbcL1E were highest within the Vestfold Hills (58.1%), followed by the Tibetan Plateau (42.1%), the Windmill Islands (31.0%), and the high Arctic (10.0%).
The relative abundances of the hhyL gene were also variable, with the highest relative abundance observed in the Vestfold Hills (7.73%), followed by the Windmill Islands (3.95%), the high Arctic (3.86%), and Tibetan Plateau (1.21%).
The average copy numbers of rbcL1E and hhyL genes per site were highest within the Vestfold Hills at 3.42 × 10^8 and 6.05 × 10^7 copies/g soil, respectively. Within the Vestfold Hills, Adams Flat exhibited the highest relative abundance of both target genes (rbcL1E 7.24 × 10^8 and hhyL 1.33 × 10^8), while the lowest abundances were found at The Ridge (rbcL1E 5.68 × 10^7 and hhyL 4.61 × 10^6). In comparison to the Vestfold Hills, the relative abundances of rbcL1E and hhyL were lower on average within the Windmill Islands with 1.11 × 10^8 and 1.20 × 10^7 copies, respectively. Within the Windmill Islands, the rbcL1E gene abundances ranged between 2.32 × 10^7 copies/g soil at Browning Peninsula and 2.72 × 10^8 copies/g soil at Mitchell Peninsula. The lowest hhyL gene abundances in the Windmill Islands were observed for soils from the human impacted Casey station (4.18 × 10^6 copies/g soil), with the highest numbers detected at Herring Island (3.39 × 10^7 copies/g soil).

Bacterial Communities Across the Poles
Soil bacterial communities were dominated by Actinobacteria at most sites, with relative abundances ranging from 23.57% at the Ridge to 58.63% at Herring Island (Figure 3, Supplementary Material 5). While Proteobacteria, Chloroflexi, Bacteroidetes, and Acidobacteria were also highly abundant, Cyanobacteria were present in low relative abundances (<1%) at 8 of the 13 sites (Spitsbergen Vestpynten, Spitsbergen Slijeringa, Heidemann Valley, Old Wallow, Adams Flat, Mitchell Peninsula, Robinson Ridge, and the Tibetan Plateau). In contrast, the relative abundances of Cyanobacteria were higher at 5-9% within the Alexandra Fjord Highlands, Rookery Lake, Casey station, and Browning Peninsula. Candidate phyla accounted for substantial proportions of the bacterial communities particularly within the Windmill Islands, dominating microbial communities at Mitchell Peninsula (12.71%), Robinson Ridge (8.71%), and Casey station (4.52%). Of the candidate phyla residing in the Windmill Islands, Candidatus Eremiobacterota (WPS-2) was the most abundant comprising 7.42% relative abundances on average at Mitchell Peninsula, followed by Candidatus Dormibacterota (AD3) at an average relative abundance of 4.06% at Mitchell Peninsula.

Soil Physicochemistry and Bacterial Community Similarity Across East Antarctica and the High Arctic
At the global scale, the bacterial communities were more similar within, rather than between, regions, as indicated by the distinct formation of regional clusters when viewed using an NMDS plot (Figure 4B). Accordingly, ANOSIM results indicated that bacterial community composition varied significantly between the Antarctic and the high Arctic samples (ANOSIM R = 0.520, p = 0.001), with significant differences also observed on a regional level between Norway, Canada, the Windmill Islands, and Vestfold Hills (ANOSIM R = 0.872, p = 0.001). Site-level bacterial community similarities have also been visualized, with the greatest variations occurring within Mitchell Peninsula and Alexandra Fjord Highlands (Figure 4D). Bacterial communities within soils from the three high Arctic sites were significantly similar to each other (ANOSIM R = 0.328, p = 0.002), as were soils sampled from within the Vestfold Hills (ANOSIM R = 0.276, p = 0.001). Comparatively, a significant and more substantial variation in bacterial composition was observed within the Windmill Island sites (ANOSIM R = 0.476, p = 0.001).
The measured soil physicochemical properties mirrored the bacterial community data, with distinct regional clusters showing that soils were more similar within than between regions (Figure 4A). The associated ANOSIM results indicated that physicochemical conditions varied significantly between the Antarctic and the high Arctic samples (ANOSIM R = 0.626, p = 0.001), with differences also observed on a regional level between Norway, Canada, the Windmill Islands, and Vestfold Hills (ANOSIM R = 0.730, p = 0.001). Site-level variations in physicochemical conditions have also been visualized (Figure 4C). On this more localized level, soil samples from within the Vestfold Hills demonstrated a high degree of physicochemical similarity to each other compared to those from other regions (ANOSIM R = 0.302, p = 0.001). High Arctic soils were also more physicochemically similar to each other than to soils from other regions (ANOSIM R = 0.302, p = 0.001). In contrast, the Windmill Island soil samples demonstrated greater dissimilarity within sites (ANOSIM R = 0.617, p = 0.001).
FIGURE 2 | Relative abundances of target genes associated with atmospheric chemosynthesis within polar desert soils. The proposed genetic determinants of this process were widely distributed across all 14 sites spanning Antarctica, the high Arctic, and Tibetan Plateau. (A) Ribulose-1,5-bisphosphate carboxylase/oxygenase (RuBisCO) type IE (rbcL1E). (B) 1h-[NiFe]-hydrogenase large subunit gene (hhyL). Large solid black lines indicate average relative abundances per site; the dotted black line indicates the mean relative abundance of all 14 sites (36.68% for rbcL1E and 5.22% for hhyL); the small, solid black lines represent individual data points; polygons represent the estimated density of the data.
The East Antarctic soils (n = 90) analyzed in this study were low in moisture (0.0023-0.26%) and nutrients, particularly TC (0.008-2.58%), TN (0.0065-0.22%), and TP (0.027-0.23%; Supplementary Materials 1, 2). Within the high Arctic (n = 27), these values were higher (TN was 0.022-0.080% in Canada and 0.073-0.43% in Norway; TC was 0.29-6.89% in Canada and 1.15-6.55% in Norway; moisture was 0.042-0.11% in Canada and 0.11-0.42% in Norway; Supplementary Material 3). Soils obtained from the Windmill Islands and Norway were acidic (average pH 6.08 and 5.97, respectively), while soils from the Vestfold Hills and Canada were more alkaline (average pH 8.49 and 7.81, respectively; Supplementary Materials 1-3).
Actinobacteria dominated all soils, with average site-level relative abundances ranging from 23.6% at The Ridge to 58.6% at Herring Island. Proteobacteria, Chloroflexi, Bacteroidetes, and Acidobacteria were also highly abundant. In contrast, Cyanobacteria were present in low relative abundances accounting for <1% of the bacterial communities in eight sites, increasing to 5-9% within the Alexandra Fjord Highlands, Rookery Lake, Casey station, and Browning Peninsula. Within the Windmill Islands region, Candidatus Eremiobacterota was present in high relative abundances, with site-level averages ranging between 0.02% at Herring Island and 7.42% at Mitchell Peninsula. Similarly, Candidatus Dormibacterota was also present in high relative abundances in the Windmill Islands only.
Multivariate random forest analysis revealed strong and significant relationships between the genetic abundances of rbcL1E, hhyL, and multiple environmental parameters, while Spearman correlations showed the direction of these relationships. The relative abundances of rbcL1E and hhyL were most significantly explained by soil moisture (IncMSE = 69.35 and 24.38, respectively), TC (IncMSE = 16.45 and 17.48, respectively), and mud composition (IncMSE = 28.73 and 15.61, respectively), with significantly greater (p < 0.05; Table 2) abundances of both genes occurring under low moisture, carbon and mud composition (Figure 5). Additionally, rbcL1E was significantly explained by various oxides including NO3 (IncMSE = 32.83), Na2O (IncMSE = 24.40), MgO (IncMSE = 19.47), CaO (IncMSE = 26.44), and MnO (IncMSE = 16.09; p < 0.05; Table 2). Spearman analysis revealed that in these cases, greater rbcL1E was associated with lower NO3 content and greater levels of the other oxides (Figure 5).
FIGURE 4 | NMDS plots showing the relationships among samples on a regional and local scale for (A,C) measured soil physicochemical parameters and (B,D) bacterial community composition at phylum level. Soil samples displayed greater environmental and bacterial community similarities within rather than between regions, as indicated by the formation of regional-level clustering. Bacterial communities within the high Arctic samples (Norway and Canada) were highly similar to each other with the clusters overlapping. Soil samples clustered according to site, indicating that soils from the same location share similar environmental conditions and bacterial community structures. For both environmental and bacterial communities, the greatest dissimilarity was observed within the Windmill Islands although variation in physicochemical parameters was also observed across the high Arctic sites, predominantly Alexandra Fjord Highlands.
Random forest analysis revealed that multiple environmental factors significantly influenced microbial community structure, including pH (IncMSE = 31.35), conductivity (IncMSE = 23.86), TN (IncMSE = 26.91), K2O (IncMSE = 27.04), Al2O3 (IncMSE = 20.59), SiO2 (IncMSE = 16.87), PO4 (IncMSE = 17.23), and NO2 (IncMSE = 12.83; p < 0.05); however, these factors did not significantly explain either genetic determinant of atmospheric chemosynthesis (Table 2).
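The direction of the associations reported in this section rests on Spearman rank correlation, which is the Pearson correlation computed on ranked values and therefore captures monotonic rather than strictly linear relationships. The following minimal Python sketch illustrates the calculation; the moisture and relative abundance values are hypothetical and are not data from the study.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values: soil moisture (%) and rbcL1E relative abundance (%).
moisture = [0.02, 0.05, 0.11, 0.20, 0.42]
rbcl1e_relative = [58.0, 40.0, 31.0, 12.0, 10.0]
rho = spearman_rho(moisture, rbcl1e_relative)
print(rho)
```

A negative rho, as in this toy example, corresponds to the pattern described above in which the gene is relatively more abundant in drier soils.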

DISCUSSION
Atmospheric hydrogen oxidation has only recently been identified as an energetic driver of microbial autotrophic CO2 fixation through the CBB cycle. Until now, atmospheric chemosynthesis has been overlooked as a niche process with unknown global significance. Here, we confirm that the genetic determinants of this new form of chemoautotrophy (rbcL1E and hhyL) are widespread and abundant throughout soil microbiomes of geographically distinct polar regions throughout Antarctica, the high Arctic, and the Tibetan Plateau. These findings support the hypothesis that this minimalistic carbon fixation strategy may be considered a globally occurring phenomenon and an important widespread survival adaptation in oligotrophic desert soil ecosystems.
While the role of hydrogen oxidation in contributing to microbial primary production is newly discovered, the role of high affinity hydrogenases (hhyL) in fulfilling the energy requirements of dormant soil bacteria is well established (Constant et al., 2008, 2010, 2011; Berney and Cook, 2010; Berney et al., 2014; Greening et al., 2015; Islam et al., 2019; Piche-Choquette and Constant, 2019). During periods of extreme environmental stress, H2-oxidizers can reversibly lower their metabolic activity and thereby, their energy requirements (Lennon and Jones, 2011). Under these conditions, the aerobic oxidation of atmospheric hydrogen provides bacteria with a ubiquitous and reliable source of energy (Morita, 1999; Smith-Downey et al., 2008; Constant et al., 2011). The process is indeed widespread, with hhyL reported at abundances of 10^6-10^8 genetic copies per gram of soil in both oligotrophic and copiotrophic ecosystems (Constant et al., 2011). Moreover, greater hhyL expression and hydrogen oxidizing activity have been linked to environments with lower organic carbon content (King, 2003; Greening et al., 2015), with H2-oxidizers reported to be among the earliest colonizers of volcanic deposits, despite the negligible amounts of organic matter present (King, 2003; Sato et al., 2004). Here, we also revealed the presence of high numbers of hhyL genes (4.18 × 10^6-3.39 × 10^7 copies/g soil) in cold desert soils from across the three poles, many of which contained extremely low levels of carbon and nitrogen. We note here that although the qPCR primers are widely implemented (Constant et al., 2010, 2011; Meredith et al., 2014; Khdhiri et al., 2015; Piché-Choquette et al., 2016), the discovery of high-affinity hydrogenases beyond group 1h (D. Cowan, personal communication; Greening et al., 2014; Islam et al., 2020) suggests that the high-affinity hydrogenase gene abundances quantified here are underestimations.
Despite the widespread and abundant distribution of hhyL, the global co-occurrence of hhyL and rbcL1E has been unknown. The high and widespread co-occurrence of hhyL and rbcL1E across all 122 soils analyzed here indicates that the energy liberated from atmospheric hydrogen oxidation may be directed toward bacterial cell growth and primary production more pervasively than anticipated. Previous studies have indicated that trace gas chemosynthetic bacteria belong to the phyla Actinobacteria, Candidatus Eremiobacterota, and Candidatus Dormibacterota (Park et al., 2009; Ji et al., 2017). The rbcL1E gene has also been detected within Chloroflexota, Firmicutes, and Verrucomicrobiota (Tebo et al., 2015). In this study, these taxa dominated soil communities from across the three poles, together accounting for up to 76.2% of the microbial community composition (Figure 3; Supplementary Material 5). Thus, in cold nutrient-starved deserts, trace gas chemoautotrophs appear to have a selective advantage for survival.
Frontiers in Microbiology | www.frontiersin.org
It has been proposed that atmospheric chemosynthesis and photosynthesis are both contributors to microbial primary production in oligotrophic environments, with contributions likely to vary along an aridity gradient (Bay et al., 2018). Indeed, variability in photo- and chemoautotrophic potential was observed here, with abundances of rbcL1E being particularly low in the high Arctic (10.0%), Casey station (4.8%), and Browning Peninsula (9.0%) soils (Figure 2). Each of these sites also contained greater photosynthetic potential than the other sites due to higher abundances of Cyanobacteria (5.7-8.6%; Kleinteich et al., 2017; Pudasaini et al., 2017; Zhang et al., 2020). rbcL1E gene abundances were also more variable than hhyL, reflecting the more widespread role of the high affinity hydrogenases in supplying maintenance energy to dormant microbial communities (Berney and Cook, 2010; Constant et al., 2010, 2011; Berney et al., 2014; Greening et al., 2015), as well as for reproduction.
It has been proposed that atmospheric chemosynthesis occurs increasingly within drier, more nutrient-starved soils (Bay et al., 2018), in part due to the exclusion of phototrophic microorganisms under moisture limitation (Warren-Rhodes et al., 2006; McKay, 2016). We found that the genetic capacity for atmospheric chemosynthesis was associated with increasingly drier, more nutrient-limited soils (Bay et al., 2018). Random forest and Spearman correlations revealed that rbcL1E and hhyL, relative to 16S rRNA, increased significantly across Antarctic and high Arctic soils that were increasingly limited in moisture and TC (Table 2 and Figure 5). Additionally, the relative abundance of rbcL1E also significantly increased in soils limited in NO3− (Table 2 and Figure 5). Neither genetic determinant formed a significant positive correlation with bioavailable substances that are widely utilized by geothermal chemoautotrophic bacteria such as NO2− and PO4− (Figure 5; Engel, 2012). This lack of correlative data supports our current understanding that rbcL1E catalyzed primary production is not driven by geochemical energy sources. As a result, atmospheric chemosynthesis may occur within environments where soil nutrients are limited. Positive correlations were detected between rbcL1E and multiple trace oxides measured by X-Ray fluorescence elemental analysis (MnO, MgO, CaO, and Na2O; Table 2; Figure 5), suggesting a potential metabolic significance that requires further investigation.
It is recommended that additional studies are conducted to focus upon the isolation and characterization of trace gas chemosynthetic bacteria. Additionally, metagenomic and biochemical studies, including hydrogen oxidation and 14CO2 assimilation assays, should be performed on a broader range of environments where atmospheric chemosynthesis is likely to occur. We suggest that sites should be targeted where organic carbon, water, and photoautotrophs are limited and the utilization of atmospheric gases by microbial communities is well documented (King, 2003; Lynch et al., 2012, 2014). This includes volcanic deposits as well as additional cold and hot deserts, such as the McMurdo Dry Valleys (Babalola et al., 2009; Van Goethem et al., 2016), Namib (van der Walt et al., 2016; Gunnigle et al., 2017), Thar (Rao et al., 2016), and Atacama (Lynch et al., 2014; Schulze-Makuch et al., 2018). Finally, this study highlights the genetic potential of microbial communities residing in cold oligotrophic deserts across the globe to conduct atmospheric chemosynthesis and their propensity for survival in regions with highly limited water and nutrient availability.
FIGURE 5 | Spearman correlations between the relative abundance of the two target genes, rbcL1E and hhyL, against all 26 measured physicochemical parameters for the 117 Antarctic and high Arctic soils. rbcL1E and hhyL each produced positive correlations with soil conductivity, pH, and sand composition as well as with greater abundances of various oxides. rbcL1E and hhyL both occurred in higher abundance within samples of low moisture, total carbon (TC), and total nitrogen (TN). rbcL1E also occurred in higher abundance within samples of low NO3− and mud composition.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories described in the article. Original sequencing data is publicly available through NCBI under the accession number PRJNA645753. All other original contributions presented are included in the article/Supplementary Material.

AUTHOR CONTRIBUTIONS
BF determined the research objective with input from AR, MJ, and AT. AR conducted tag sequencing and qPCR. Soil parameter data for the Vestfold Hills was led by AT. Tag sequencing analysis was performed by AR and MJ. MJ and WK provided the Tibetan Plateau soil samples. AR, EZ, and MJ conducted multivariate data analysis, with input from BF. AR and BF wrote the manuscript with input from remaining authors. All authors contributed to the article and approved the submitted version.