Identification of Major Organisms Involved in Nutritional Ecosystem in the Acidic Soil From Pennsylvania, USA

Microorganisms play a critical role in the structure and functioning of soil ecosystems. For acidic soil across the northeastern United States and Canada, we have little understanding of the microbial diversity present and its relationship to the biochemical cycles. The current study is aimed at understanding the taxonomical and functional diversities in acidic soil obtained from near various types of trees, how the diversities change as a function of depth, and the linkage between taxonomical and functional diversities. From eight sampling locations, soil samples were collected from three horizons (depths). The three depths were 0–10 cm (A), 11–25 cm (B), and 26–40 cm (C). Results indicate that across all the samples analyzed, Bradyrhizobium and Candidatus Solibacter are the most abundant bacteria in the soil microbiome. The differences in the soil microbiome across the samples were attributed to the abundance of individual organisms present in the soil and not to the presence or absence of individual organisms. Subsystem-level analysis of the soil microbiome sequences indicates a higher abundance of genes attributed to regulation and cell signaling. Few sequences were detected for sulfur metabolism, potassium metabolism, iron acquisition and metabolism, and phosphorus metabolism. Structure-function analysis indicates that Bradyrhizobium, Rhodopseudomonas, and Burkholderia are the major organisms involved in nutritional ecosystem functioning within acidic soil. Based on the results, we propose utilizing a consortium of these organisms as an environmentally friendly alternative to the use of chemicals to maintain soil fertility and ecosystem functioning.


INTRODUCTION
Soil microorganisms are the largest biodiversity pool on earth, with more than 10^30 microbial cells, 10^4 to 10^6 species, and nearly 1,000 Gbp of microbial genome per gram of soil (Vogel et al., 2009; Mendes and Tsai, 2018). They are the primary factors that affect soil ecosystem functioning and play key roles in forming and maintaining a multitude of soil characteristics including integrity, fertility, ecology, and overall soil function. Soil microorganisms are also vital for decomposition, pollutant removal, recycling of essential elements, suppressing plant diseases found in soil, and promoting the growth of vegetation (Garbeva et al., 2004). Much is known of the microbial taxa present in soils from across the planet and the impact of perturbation of soil conditions. A Google Scholar search for the term "soil microbial diversity" returns over 1.6 million hits. Nevertheless, our understanding of how microbial diversity and ecosystem functions are linked, and how each of the microbial taxa present in the soil is linked to individual ecological functions, remains limited.
Increased use of 16S rDNA metagenomic methodology, using pyrosequencing and the Illumina MiSeq and HiSeq techniques, has increased our understanding of the taxonomy of soil microorganisms by orders of magnitude. However, the 16S rDNA sequencing method has numerous limitations, including difficulty differentiating closely related species (Hasan et al., 2014), nonuniform distribution of sequence dissimilarity among taxa, presence of multiple copies of the 16S rRNA gene (Garrity et al., 2009), failure of polymerase chain reaction (PCR) primers to amplify the target (Venter et al., 2004), and generation of chimeric sequences (Quince et al., 2009). Further, in the majority of these studies, the role of individual microorganisms in the soil remains at the level of hypothesis based on prior literature (examples include our own prior research: Kumar et al., 2011; Collins et al., 2012). Methods such as Biolog, Fungilog, and soil enzyme activity are often used as indicators of ecosystem functioning and correlated to the taxonomic data (Nannipieri et al., 2002; Sobek and Zak, 2003; Rutgers et al., 2016). While a step forward, these methods are primarily predictive of soil microbial functional dynamics (Bell et al., 2009).
Whole genome shotgun metagenomics provides a better approach for obtaining the taxonomic and functional aspects of the entire soil microbial genome. This method yields millions to billions of short reads, providing the necessary sequencing depth. It also offers an opportunity to identify the organisms present in the microbiome and the biochemical pathway information present at the genomic level in each of the identified organisms. Shotgun metagenomics has been used to elucidate the microbiome present in soil (Enagbonma et al., 2019; Edwards et al., 2013), rhizosphere (Akinola et al., 2021), waste water and sludge samples (Delforno et al., 2017; Ibarbalz et al., 2016), and samples from the International Space Station (Singh et al., 2018). In this study, we employ a shotgun metagenomic approach to identify and quantitate bacterial species present in acidic soil and to elucidate the major ecological functions of the major organisms.
Acidic soil typically has a pH range of 4.0-4.5, is high in iron and aluminum, and is often considered nutrient-poor. Across the eastern United States and southeastern Canada, soil is primarily acidic (Bruulsema, 2006). The acidic conditions in the soil of the region are primarily attributed to the parent materials of the soil and increased precipitation that leaches cations from the soil. The soil is optimal for the growth of trees such as apple, beech, dogwood, oak, magnolia, and pear. In addition, plants from the heath family (huckleberries, blueberries, and cranberries) do well in acidic soil. A literature search indicates that no reports are available on the structure-function relationship in the natural acidic soil of the region. Such information is critical to understanding how the ecosystem functions within such soil and how to ensure that it remains functional for optimal agricultural productivity. The current study focused on understanding the taxonomical and functional diversities in acidic soil obtained from near various types of trees, how the diversities change as a function of depth, and the linkage between taxonomical and functional diversities. We address three questions in the study: 1) What are the major microorganisms that are common across acidic soil types and depths? 2) What are the biochemical pathways that can be generalized across all acidic soil types and depths? 3) What role does each of these organisms play in ecosystem functioning in the soil?

METHODOLOGY
Sample Collection
Samples were collected initially from eight sampling locations across the West Chester University Campus in West Chester, PA. No permit was required to obtain samples. Table 1 describes the eight locations analyzed, the types of vegetation present at each sampling location, and the sampling coordinates. Sampling coordinates were recorded using the Google Maps app on an Android smartphone. The sampling locations were selected based on the vegetation present and an initial sampling of 25 different locations. The final locations were selected based on the similarity of the vegetation between sampling locations, pH levels, and the quantity and quality of the DNA isolated. The protocols for the safety of data collection recommended by the U.S. Fish and Wildlife Service and the Foundation for Ecological Research in the Northeast (Batcher, 2005) were strictly followed. At each location, soil samples were collected from three horizons: 0-5 cm (Horizon A), 6-15 cm (Horizon B), and 16-30 cm (Horizon C). All distinguishable debris and pebbles were removed using sterile forceps, and the soil was mixed thoroughly prior to analysis to yield a homogenized sample.
pH Measurement
Soil samples (5 g) were mixed with 10 ml distilled water and vortexed for 10 min. The suspension was allowed to settle for 1 h, and the pH of the settled solution was measured. All measurements were done in triplicate.

DNA Extraction and Shotgun Metagenomics
DNA extraction from each soil sample was carried out using the Qiagen DNeasy PowerSoil DNA Isolation Kit, according to the manufacturer's protocol. DNA isolation was performed in triplicate for each sampling location, and the samples were pooled prior to analysis. DNA concentration in all samples was determined using the Qubit 3 Fluorometer (Invitrogen Technologies). All the samples were diluted to 100 ng/μL and used for the library preparation, using the ... Taxonomic and functional profiles were generated using the normalized abundance of sequence matches to the RefSeq and Subsystems databases, respectively. All settings were left at their default values (maximum e-value cutoff, 1e-5; minimum % identity cutoff, 60%; minimum alignment length cutoff, 15). The sequences have been deposited and are available through the NCBI BioProject Database ID: PRJNA719140.
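The annotation cutoffs quoted above can be illustrated with a minimal sketch. It is not the annotation pipeline used in the study: the `Hit` record layout, the function names, the example hits, and the choice of inclusive comparisons are all assumptions made for illustration; only the three threshold values come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One read-to-database match (hypothetical record layout)."""
    read_id: str
    annotation: str
    evalue: float
    percent_identity: float
    alignment_length: int


# Threshold values quoted in the text; treating them as inclusive is an assumption.
MAX_EVALUE = 1e-5
MIN_IDENTITY = 60.0
MIN_ALIGNMENT_LENGTH = 15


def passes_cutoffs(hit: Hit) -> bool:
    """Return True when a hit satisfies all three cutoffs."""
    return (
        hit.evalue <= MAX_EVALUE
        and hit.percent_identity >= MIN_IDENTITY
        and hit.alignment_length >= MIN_ALIGNMENT_LENGTH
    )


def filter_hits(hits):
    """Keep only the hits that pass every cutoff."""
    return [h for h in hits if passes_cutoffs(h)]


# Invented toy hits, one failing each criterion.
example_hits = [
    Hit("r1", "Bradyrhizobium", 1e-12, 92.0, 48),
    Hit("r2", "Burkholderia", 1e-3, 88.0, 40),      # e-value too high
    Hit("r3", "Streptomyces", 1e-8, 55.0, 33),      # identity too low
    Hit("r4", "Rhodopseudomonas", 1e-6, 61.5, 12),  # alignment too short
]
kept = filter_hits(example_hits)
print([h.read_id for h in kept])
```

Only hits that clear all three thresholds contribute to the abundance profiles, so tightening any one cutoff trades annotation coverage for confidence.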

Clustering Analysis
Clustering analysis was performed using Statistica (release 14.0) software. Tree clustering was performed using Ward's method as the amalgamation rule and Euclidean distance as the distance measure. The method was chosen because it assesses cluster membership by calculating the total sum of squared deviations from the mean of a cluster. Prior to clustering analysis, data obtained from MG-RAST were log2 transformed and DSeq normalized.
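The idea behind Ward's amalgamation rule can be sketched in a few lines of plain Python. This is an illustrative re-implementation under stated assumptions, not the Statistica procedure: the sample names and counts are invented, the pseudocount of 1 in the log2 transform is an assumption, and the normalization step is omitted. At each step the pair of clusters whose merger adds the least to the total within-cluster sum of squared deviations is joined.

```python
import math
from itertools import combinations


def log2_transform(row, pseudocount=1.0):
    """log2(x + pseudocount); the pseudocount is an assumption to handle zeros."""
    return [math.log2(x + pseudocount) for x in row]


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a["centroid"], b["centroid"]))
    return a["n"] * b["n"] / (a["n"] + b["n"]) * sq_dist


def ward_clustering(samples):
    """Agglomerate samples (name -> vector); return merges as (members_a, members_b, cost)."""
    clusters = {
        i: {"members": (name,), "n": 1, "centroid": list(vec)}
        for i, (name, vec) in enumerate(samples.items())
    }
    next_id = len(clusters)
    merges = []
    while len(clusters) > 1:
        # Pick the pair whose merger is cheapest under Ward's criterion.
        i, j = min(
            combinations(clusters, 2),
            key=lambda pair: ward_cost(clusters[pair[0]], clusters[pair[1]]),
        )
        a, b = clusters[i], clusters[j]
        cost = ward_cost(a, b)
        n = a["n"] + b["n"]
        centroid = [
            (a["n"] * x + b["n"] * y) / n
            for x, y in zip(a["centroid"], b["centroid"])
        ]
        del clusters[i], clusters[j]
        clusters[next_id] = {
            "members": a["members"] + b["members"],
            "n": n,
            "centroid": centroid,
        }
        next_id += 1
        merges.append((a["members"], b["members"], cost))
    return merges


# Invented genus-level counts for four hypothetical samples.
counts = {
    "S1_A": [120, 30, 5],
    "S1_B": [110, 35, 6],
    "S2_A": [10, 200, 50],
    "S2_B": [12, 180, 60],
}
transformed = {name: log2_transform(row) for name, row in counts.items()}
merges = ward_clustering(transformed)
for members_a, members_b, cost in merges:
    print(members_a, "+", members_b, "cost", round(cost, 3))
```

A sharp rise in the merge cost from one step to the next marks a natural place to cut the tree, which is the usual way of reading off the number of major clusters from a linkage plot.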

RESULTS
Soil pH
The pH values of all the soil samples analyzed in this study were in the acidic range of 4.1-6.3 (Table 1). Results show that the type of tree influences the soil pH, with soil around Douglas Fir being the most acidic among all the types studied. No significant difference in pH was observed across the depths, except for soil obtained around the Oak tree (Table 1), where a stark drop in pH was observed from depth A (pH 5.7) to depth C (pH 4.4).

Sequencing Analyses and Microbial Community Diversity
A total of 22,745,412 raw sequence reads were generated for the 24 samples using the Illumina MiSeq sequencing platform (Supplementary Table S1). Over 99% of the sequences were annotated, with an almost equal distribution of known proteins (4,073,051) and proteins of unknown function (4,073,868) (Supplementary Table S1).
The rarefaction curves indicate high genetic diversity, with no complete saturation observed even after almost 8 million sequences (Supplementary Figure S1). For all the samples, the curves have slowly begun to flatten, indicating that a reasonable number of species have been sampled. The mean alpha diversity observed was 479, with a range from 417 to 547 species (Supplementary Table S1).
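For readers unfamiliar with rarefaction, the sketch below shows one standard way such a curve can be computed: the expected number of species in a random subsample of n reads, drawn without replacement from a sample with known per-species read counts. It is not the routine used to produce Supplementary Figure S1; the function names and the toy counts are assumptions for illustration.

```python
from math import comb


def expected_richness(counts, n):
    """Expected species count in a random subsample of n reads (without replacement)."""
    total = sum(counts)
    if not 0 <= n <= total:
        raise ValueError("n must be between 0 and the total read count")
    denom = comb(total, n)
    # For each species: 1 minus the probability that none of its reads is drawn.
    return sum(1 - comb(total - c, n) / denom for c in counts if c > 0)


def rarefaction_curve(counts, step):
    """Return (depth, expected richness) pairs from 0 reads up to the full sample."""
    total = sum(counts)
    depths = list(range(0, total + 1, step))
    if depths[-1] != total:
        depths.append(total)
    return [(n, expected_richness(counts, n)) for n in depths]


# Invented read counts for five hypothetical species (100 reads in total).
toy_counts = [50, 30, 15, 4, 1]
curve = rarefaction_curve(toy_counts, 25)
for depth, richness in curve:
    print(depth, round(richness, 2))
```

The curve rises quickly while abundant species are being picked up and flattens as only rare species remain undiscovered, which is the flattening behavior described above.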

Taxonomic Characterization of Soil Microbiome
Taxonomically, all soil samples had bacterial populations from 50 to 57 phyla. Bacteria belonging to Proteobacteria and Actinobacteria were the most predominant, comprising over 60% of the total microbial community in each of the samples analyzed (Figure 1).
FIGURE 3 | Heat map showing the differential abundance of functional categories (subsystem Level 1) between different soil samples.
Frontiers in Environmental Science | www.frontiersin.org March 2022 | Volume 10 | Article 766302
Hierarchical structure analysis was performed on the normalized genus-level abundance data using Ward's linkage method to investigate the link between soil microbiota and plant type, soil pH, and depth. Results indicate that the soil samples analyzed can be divided into six major clusters, after which the linkage distance separating the sub-clusters is small (Figure 2). Table 2 describes the members of each cluster, and K-means clustering confirms the results. While overall the samples from the different horizons of an individual location cluster together or fall in close clusters, the samples from horizon B of location 10 (Pine tree vegetation) and horizon A of location 12 (Tulip tree) have unique microbiota, and each forms its own cluster. ANOVA indicates that the mean abundance of each genus is statistically different between clusters (p < 0.05), except for seven genera (Supplementary Table S2). The seven genera whose abundances are not statistically different between clusters (p > 0.05) are Nitrosopumilus, Carboxydothermus, unclassified genera derived from Deltaproteobacteria, Pelotomaculum, Oceanicola, Thermotoga, and Bdellovibrio (Supplementary Table S2). Table 3 shows the abundance of the top 30 microbial genera in the representative samples from each of the clusters and the average abundance of the organisms across all 24 samples analyzed. Results show that Bradyrhizobium and Candidatus Solibacter, both Gram-negative bacteria, are the most abundant microorganisms in the soil samples analyzed. Streptomyces and Mycobacterium are the two most abundant Gram-positive bacteria found in the soil samples.
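The between-cluster comparison can be illustrated with a one-way ANOVA F statistic for a single genus. The sketch below is a generic textbook calculation, not the authors' analysis: the function name and the abundance values are invented, and the p-value step is deliberately left out because it needs the F distribution, which a statistics library (for example `scipy.stats.f_oneway`) would supply.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Variation of cluster means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of samples around their own cluster mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented normalized abundances of one genus in three hypothetical clusters.
cluster_abundances = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_b, df_w = one_way_anova_f(cluster_abundances)
print(f_stat, df_b, df_w)
```

A large F means the cluster means differ by more than the within-cluster scatter would explain. Note that the within-cluster degrees of freedom must be positive, so a grouping made up entirely of single-sample clusters cannot be tested this way.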

Functional Characterization of Soil Microbiome
A heat map illustrating the functional annotation of sequence reads containing predicted proteins of known function across all 24 soil samples is shown in Figure 3. Variation between samples was observed primarily for proteins involved in virulence, disease, and defense; cell wall and capsule; membrane transport; DNA metabolism; and respiration. Among the functional categories identified by MG-RAST, the five most dominant categories based on the relative abundance of assigned reads were carbohydrates (13.3 ± 0.4%), clustering-based subsystems (functional coupling evidence but unknown function; 12.9 ± 0.2%), amino acids and derivatives (9.6 ± 0.3%), protein metabolism (7.7 ± 0.3%), and miscellaneous (6.8 ± 0.2%).
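Relative abundances of the kind reported above can be derived from per-sample read counts as sketched below. The category counts are invented, the helper names are hypothetical, and interpreting the ± value as a sample standard deviation across samples is an assumption, since the text does not state which dispersion measure was used.

```python
from statistics import mean, stdev


def relative_abundance(counts):
    """Convert category read counts for one sample into percentages."""
    total = sum(counts.values())
    return {category: 100.0 * n / total for category, n in counts.items()}


def summarize(samples):
    """Mean and sample standard deviation of each category's percentage across samples."""
    per_sample = [relative_abundance(s) for s in samples]
    categories = per_sample[0].keys()
    return {
        c: (mean([p[c] for p in per_sample]), stdev([p[c] for p in per_sample]))
        for c in categories
    }


# Invented read counts for two hypothetical samples.
toy_samples = [
    {"Carbohydrates": 130, "Protein metabolism": 70, "Other": 800},
    {"Carbohydrates": 140, "Protein metabolism": 80, "Other": 780},
]
summary = summarize(toy_samples)
for category, (avg, sd) in summary.items():
    print(f"{category}: {avg:.1f} ± {sd:.1f}%")
```

Converting to percentages before averaging keeps samples with different sequencing depths on an equal footing.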
The relative abundance of the predicted proteins annotated at subsystem level 2 for each of the soil samples is presented in Supplementary Table S3. Hierarchical structure analysis was performed on the normalized values, similar to that performed for the taxonomic data. Results indicate that the soil samples can be divided into five clusters, after which the linkage distance separating the sub-clusters is small (Figure 4). Table 4 describes the members of each cluster, and K-means clustering confirms the results. As with the taxonomic clustering, the samples from horizon B of location 10 (Pine tree vegetation) and horizon A of location 12 (Tulip tree) have a unique composition of functional proteins, and each forms its own cluster. Table 5 shows the abundance of the top 30 predicted proteins in the representative samples from each of the clusters. Results show that unidentified proteins involved in regulation and cell signaling comprise nearly 1 in 5 of the proteins predicted from the sequences. Nearly 6% of the predicted proteins are from the miscellaneous SEED category, which comprises a diverse set of genes identified during investigation of plant-prokaryote interactions by a project at the Department of Energy (DOE), USA (Thureborn et al., 2016). Protein biosynthesis, central carbohydrate metabolism, and resistance to antibiotics and toxic compounds were the other top predicted functions of the proteins.

Linking Diversity to Function
To identify the key microorganisms playing a significant role in the biochemistry of soil, RefSeq and Subsystems analyses were performed together on the MG-RAST platform. The Subsystems analysis was performed at level 3 wherever possible. The top 5 genera having the largest number of annotated reads within each metabolic class were identified (Supplementary Table S4). Data indicate that Bradyrhizobium, Rhodopseudomonas, and Burkholderia are the key bacteria within the soil microbiota. Both Bradyrhizobium and Rhodopseudomonas are top contributors in 24 of the 44 metabolic classes analyzed (Figure 5). Burkholderia is a top organism in 16 of the metabolic classes (Figure 5).
From an agricultural perspective, the nitrogen, phosphorus, sulfur, and iron metabolic pathways are significant. In the nitrogen, iron, and sulfur pathways, beyond the three genera identified, Mycobacterium also plays a significant role. Organisms from the Anaeromyxobacter and Aromatoleum genera are key contributors in the nitrosative stress and dissimilatory nitrite reductase pathways, respectively (Supplementary Table S4). Organisms from the genus Sorangium have the most genes coding for sulfate reduction-associated complexes. Similarly, organisms from the Cupriavidus and Pseudomonas genera are other top bacteria involved in phosphate pathways (Supplementary Table S4). In iron pathways, Bacillus, Frankia, and Pseudomonas were the top genera involved (Supplementary Table S4). The catabolic genes related to the degradation of xenobiotics were also annotated and linked to the microbial genera. Beyond Bradyrhizobium, Rhodopseudomonas, and Burkholderia, bacteria from Pseudomonas and Cupriavidus play a key role in the degradation of xenobiotic compounds (Supplementary Table S4). Results indicate that for each of the biochemical functions, there is redundancy within the soil microbiome.
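The structure-function linkage described in this section amounts to ranking genera by annotated read count within each metabolic class and then counting how often each genus appears among the top contributors. A minimal sketch follows; the function names, the input layout, and every read count are invented for illustration and do not reproduce Supplementary Table S4.

```python
from collections import Counter, defaultdict


def top_genera_per_class(read_counts, n=5):
    """Rank genera within each metabolic class by annotated read count.

    read_counts maps (metabolic_class, genus) to a read count (hypothetical layout).
    """
    by_class = defaultdict(list)
    for (metabolic_class, genus), count in read_counts.items():
        by_class[metabolic_class].append((genus, count))
    return {
        metabolic_class: [
            genus
            # Highest count first; ties broken alphabetically for determinism.
            for genus, _ in sorted(items, key=lambda gc: (-gc[1], gc[0]))[:n]
        ]
        for metabolic_class, items in by_class.items()
    }


def count_top_appearances(top_by_class):
    """Count in how many metabolic classes each genus is a top contributor."""
    return Counter(g for genera in top_by_class.values() for g in genera)


# Invented read counts for two hypothetical metabolic classes.
toy_counts = {
    ("Nitrogen fixation", "Bradyrhizobium"): 900,
    ("Nitrogen fixation", "Rhodopseudomonas"): 700,
    ("Nitrogen fixation", "Frankia"): 100,
    ("Sulfate reduction", "Sorangium"): 400,
    ("Sulfate reduction", "Bradyrhizobium"): 350,
    ("Sulfate reduction", "Bacillus"): 50,
}
top = top_genera_per_class(toy_counts, n=2)
appearances = count_top_appearances(top)
print(top)
print(appearances.most_common())
```

A genus that recurs among the top contributors across many classes is a candidate key organism, while several genera sharing one class reflects the functional redundancy noted above.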

DISCUSSION
We investigated the microbial structural and functional diversity within the top acidic soil associated with a wide variety of plants.
Results indicate that irrespective of the level of acidity in the soil, the types of microorganisms associated with the soil generally remain the same. The differences observed between soil samples could be attributed to the abundance of individual organisms present in the soil. Soil chemistry and the vegetation present guide the abundance of individual organisms. The change in microbial abundance results in a change in the abundance of functional genes within the soil microbiome. The literature is replete with studies showing that the soil microbiome changes with the structure of the soil (e.g., Fierer and Jackson, 2006; Fierer et al., 2012; Mendes and Tsai, 2018; Shah et al., 2021). Based on our results, we suggest that one needs to consider whether the types of organisms present in the soil are different or whether the abundance of individual organisms is different before reaching conclusions about microbiome differences among soil samples. Further, current methods of calculating alpha and beta diversity may not capture the true similarities in the microbiome from different soil types. As further advances are made in next-generation sequencing techniques, we believe similarities in the microbiome across soil types could become more evident. Taxonomically, prior research has shown that Gram-negative organisms are the predominant organisms present in the soil (Shah and Subramaniam, 2018). Results obtained in the current study support the prior observation. When one considers similar observations in microbiome studies conducted in marine environments, and even in humans, fish, and animals, a theme starts to emerge: in microbial communities across these matrices, Gram-negative bacteria are the predominant organisms.
Functionally, high levels of genes attributed to regulation and cell signaling (level 1) appear to be an identifying indicator for acidic soils. cAMP-related genes are a major component annotated to this category. Delmont et al. (2012) reported an abundance of cAMP-related annotation within the soil metagenome. Considering that acidic soil is poor in nutrition and the northeast region of the United States has varying weather patterns, soil bacteria might be required to deal with constantly fluctuating substrates and environmental conditions. cAMP is a universal regulator of cell energy and metabolism. The higher level of this and other genes involved in regulation and cell signaling can be attributed to the need for bacteria to adapt to the changing soil chemistry. Surprisingly, we noticed a low abundance of genes related to nutrient cycling (sulfur metabolism, potassium metabolism, iron acquisition and metabolism, and phosphorus metabolism). Genes annotated to virulence, disease, and defense were significantly prevalent in the soil samples analyzed. The cluster-based subsystems contain diverse functions, such as resistance to antibiotics and toxic compounds, and pathogenicity islands.
Results of our study indicate that in acidic soil, Bradyrhizobium, Rhodopseudomonas, and Burkholderia are the major organisms involved in nutritional ecosystem functioning. Bradyrhizobium and Candidatus Solibacter are taxonomically the most abundant organisms in the soil samples analyzed. Collectively, it is evident from the taxonomic and functional analysis of the soil microbiome that bacteria from Bradyrhizobium are highly critical to maintaining soil fertility, irrespective of soil type. Analyzing the microbiota present in 52 soil samples from different countries, Shah and Subramaniam (2018) found that bacteria from the genus Bradyrhizobium were the most abundant organisms in the microbiota. The structure-function linkage results indicate that the organism is not only responsible for nitrogen fixation and other pathways in the N cycle, but also plays a key role in the S and Fe cycles and in the degradation pathways of xenobiotic compounds. Bradyrhizobium bacteria are present as symbiotic and non-symbiotic organisms in the soil, and the literature is replete with reports of the importance of the organism in the nitrogen cycle (Ormeño-Orrillo and Martínez-Romero, 2019). Many strains of Bradyrhizobium are used commercially to improve crop production (Environmental Protection Agency, n.d.). We suggest that the beneficial impact of the organism in improving soil fertility could also be attributed to its role in other biochemical pathways.
Acidic soils provide a unique environment for soil microorganisms due to iron, manganese, and aluminum toxicity; low nitrogen, phosphorus, and molybdenum levels; toxic levels of phenolic acids; and hydrogen ion toxicity (Kidd and Proctor, 2001; Shah et al., 2011). To overcome these issues, nitrogen fertilizers and other chemicals are often used to improve soil fertility, but these methods can cause other environmental issues, including an increase in nitrous oxide emissions (Xu et al., 2014). As a substitute for the use of chemicals to improve soil fertility and crop production, we suggest that the scientific community study the possibility of using consortia of organisms including Bradyrhizobium, Rhodopseudomonas, and Burkholderia. Considering the importance and ubiquity of these organisms in the soil, the consortia could be used by farmers across the globe, irrespective of soil chemistry and geographical location.
Next-generation sequencing methods are increasingly used to study how the soil microbiome responds to changes in environmental conditions or to the addition of contaminants to the soil. We suggest that in addition to analyzing general community-based diversity changes, scientists should specifically look for changes in the Bradyrhizobium, Rhodopseudomonas, and Burkholderia populations to understand the impact. Our results suggest that changes in the abundance of these organisms may greatly impact soil fertility.
Considering that the soil samples analyzed were from the West Chester, PA region only, further studies using acidic soil samples from across the globe are warranted to validate the observations. Nevertheless, the metagenomic data reported here further our knowledge of acidic soil microbial communities at the structural and functional levels. There is a large degree of similarity in the soil microbiome associated with different vegetation and soil pH. Increasing our attention to similarities in the soil microbiome may allow us to further the biotechnological potential of microbial-based products to improve soil fertility in the future.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: https://www.ncbi.nlm.nih.gov/bioproject/719140.

AUTHOR CONTRIBUTIONS
VS conceived the idea; MJ, SF, SS, and VS were responsible for experimentation, data analysis, and manuscript writing.