Primer and Database Choice Affect Fungal Functional but Not Biological Diversity Findings in a National Soil Survey

The internal transcribed spacer (ITS) region is the accepted DNA barcode of fungi. Its use has led to a step-change in the assessment and characterisation of fungal communities from environmental samples by precluding the need to isolate, culture, and identify individuals. However, certain functionally important groups, such as the arbuscular mycorrhizas (Glomeromycetes), are better characterised by alternative markers such as the 18S rRNA region. Previous use of an ITS primer set in a nationwide metabarcoding soil biodiversity survey revealed that fungal richness declined along a gradient of productivity and management intensity. Here, we wanted to discern whether this trend was also present in data generated from universal 18S primers. Furthermore, we wanted to extend this comparison to include measures of functional diversity and establish trends with soil types and soil organic matter (SOM) content. Over the 413 individual sites examined (arable, grassland, woodland, moorland, heathland), we found congruent trends of total fungal richness and β -diversity across land uses, SOM class, and soil type with both ITS and 18S primer sets. A total of 24 fungal classes were shared between datasets, in addition to 15 unique to ITS1 and 12 unique to 18S. However, using FUNGUILD, divergent trends of functional group richness became apparent, especially for symbiotrophic fungi, likely driven by an increased detection rate of Glomeromycetes in the 18S dataset. The disparate trends were also apparent when richness and β -diversity were compared to soil properties. Additionally, we found SOM class to be a more meaningful variable than soil type biodiversity for predicting biodiversity analyses because organic matter was calculated for each sample whereas soil type was assigned from a national soil map. We advocate that a combination of fungal primers should be used in large-scale soil biodiversity surveys to capture important groups that can be underrepresented by universal barcodes. Utilising such an approach can prevent the oversight of ubiquitous but poorly described species as well as critically important functional groups.

The internal transcribed spacer (ITS) region is the accepted DNA barcode of fungi. Its use has led to a step-change in the assessment and characterisation of fungal communities from environmental samples by precluding the need to isolate, culture, and identify individuals. However, certain functionally important groups, such as the arbuscular mycorrhizas (Glomeromycetes), are better characterised by alternative markers such as the 18S rRNA region. Previous use of an ITS primer set in a nationwide metabarcoding soil biodiversity survey revealed that fungal richness declined along a gradient of productivity and management intensity. Here, we wanted to discern whether this trend was also present in data generated from universal 18S primers. Furthermore, we wanted to extend this comparison to include measures of functional diversity and establish trends with soil types and soil organic matter (SOM) content. Over the 413 individual sites examined (arable, grassland, woodland, moorland, heathland), we found congruent trends of total fungal richness and β-diversity across land uses, SOM class, and soil type with both ITS and 18S primer sets. A total of 24 fungal classes were shared between datasets, in addition to 15 unique to ITS1 and 12 unique to 18S. However, using FUNGUILD, divergent trends of functional group richness became apparent, especially for symbiotrophic fungi, likely driven by an increased detection rate of Glomeromycetes in the 18S dataset. The disparate trends were also apparent when richness and β-diversity were compared to soil properties. Additionally, we found SOM class to be a more meaningful variable than soil type biodiversity for predicting biodiversity analyses because organic matter was calculated for each sample whereas soil type was assigned from a national soil map. We advocate that a combination of fungal primers should be used in large-scale soil biodiversity surveys to capture important groups that can be underrepresented by universal barcodes. Utilising such an approach can prevent the oversight of ubiquitous but poorly described species as well as critically important functional groups.

INTRODUCTION
Soil fungi are the dominant eukaryotic component of soil communities and are known to perform crucial ecosystem functions (Peay et al., 2008). Characterising the diversity of fungi within the landscape and their response to anthropogenic perturbation therefore represents an important topic within ecology. High-throughput sequencing has allowed the rapid estimation and identification of fungi by overcoming historical limitations of culture isolation and classifying fruiting bodies (Tedersoo et al., 2015). Using these DNA-based approaches it has been estimated that global fungal diversity in soil ranges from 3.5 to 5 million species. Yet at the beginning of the present decade, only around one-tenth of fungal diversity was thought to have been described (Rosling et al., 2011). In terms of ecosystem function, the majority of fungi are important in organic matter turnover and nutrient recycling as they facilitate the conversion of complex organic polymers into forms more readily accessible to other organisms (Peay et al., 2008;Nguyen et al., 2016). Consequently, they play a crucial role in regulating both belowand above-ground productivity (Peay et al., 2008). Many soil fungi also form important interactions with plants. Some form mutualistic relationships, best exemplified by the wide range of mycorrhizas (Wang and Qui, 2006;Smith and Read, 2008;Nguyen et al., 2016), whereas others are pathogens, responsible for numerous plant and animal diseases within agriculture and forestry (Fisher et al., 2012;Nguyen et al., 2016). Depending on environmental conditions or life stage, fungi are capable of taking on some or all of these roles (i.e., saprotroph, symbiotroph, pathotroph; Fisher et al., 2012). Despite the recognition that fungi are extremely important in soil ecosystems, characterising fungal communities has remained a challenge, exemplified by the numerous studies on soil bacteria in comparison to fungi.
Fungal barcode sequences are found within the ubiquitous, multicopy ribosomal RNA gene. Within this, the internal transcribed spacer (ITS) region has been accepted as a universal barcode for fungi (Schoch et al., 2012). Recent development of ITS-based databases such as UNITE (Kõljalg et al., 2013) and Warcup (Deshpande et al., 2016) have overcome limitations in collecting and assigning taxonomic identities to unknown sequences, though database selection may introduce bias into results (Tedersoo et al., 2015;Xue et al., 2019). Yet ITS barcodes exhibit some limitations when dealing with unknown or environmental samples. Generally, the ITS region cannot be aligned above the family-level (Cavender-Bares et al., 2009), making phylogenies based on ITS sequence data unreliable. Importantly, the ITS region has proven unreliable at distinguishing certain fungal groups at the species-level, such as Glomeromycetes (Stockinger et al., 2010). Such inconsistencies mean that ITS primers may not accurately detect target organisms. For instance, Berruti et al. (2017), found that ITS primers underestimated Glomeromycetes in bulk soil. Such uncertainty may confound experimental results and lead to erroneous conclusions.
Despite the widespread use of ITS barcodes, other markers may better capture the diversity of some fungal taxa. Primers targeting the small and large subunits as well as the ITS regions of the rRNA gene have all been applied to fungi (Tedersoo et al., 2015;Xue et al., 2019). For example, early diverging lineages such as Chytridiomycota (Schoch et al., 2012;Tedersoo et al., 2015) and Glomeromycetes (Tedersoo et al., 2015) are poorly represented in ITS sequencing. Additionally, advancements in classification have highlighted the shortcomings of environmental DNA barcoding. For example, the Archaeorhizomycetes are a poorly understood but ubiquitous class of soil fungi and their previously unidentifiable sequences have been major components of past soil biodiversity assessments (Anderson et al., 2003;Rosling et al., 2011). Overlooking these lineages may potentially lead to erroneous assumptions of biological and functional diversity in soils.
Underrepresentation of Glomeromycetes in particular exemplifies this issue. Arbuscular mycorrhizal fungi (AMF) form symbiotic relationships with more than 80% of vascular plant families and have been categorised into the monophyletic Glomeromycetes (Schüβler et al., 2001). Unlike most fungi, the ITS region has consistently demonstrated poor resolution in some closely related AMF species (Stockinger et al., 2010) as it is too hyper-variable (Thiéry et al., 2016). As mentioned previously, the ITS region underestimates Glomeromycetes in bulk soil (Berruti et al., 2017). Instead, the 18S region is more commonly used for barcoding AMF, especially in ecological studies (Öpik et al., 2014). Therefore, it is important to recognise biases inherent even in supposedly universal barcodes.
We previously undertook a nation-wide assessment of soil biodiversity across Wales, representing a breadth of heterogeneous land uses, which included agricultural land, grasslands, woodlands, and upland bogs. In this case, fungal richness and β-diversity were assessed using soil environmental DNA, utilising ITS1 primers (George et al., 2019). Yet, from the earliest stages of experimental design, we were cognisant that the ITS1 universal primer choice may not account for numerous functionally important fungal groups, particularly AMF. Thus, the primary objective of the present study was to assess whether observed fungal biodiversity (richness and β-diversity) across contrasting land uses from the ITS1 dataset would differ when compared to a dataset derived from an alternative choice of primer and database. We therefore sought to assess if primer choice influenced fungal biodiversity across land use, soil type, and soil organic matter (SOM) class. Our next aim was to critically evaluate the influence of climatic and edaphic factors [e.g., soil pH, total carbon (C), nitrogen (N), phosphorus (P)] on fungal diversity arising from the use of the two different primer sets. Our final aim was to look for differences in coverage of taxonomic and functional diversity between the two primer sets across the broad range of land uses and soil types evaluated.

Study Design
Data were collected as part of the Glastir Monitoring & Evaluation Programme (GMEP). The GMEP initiative was established by Welsh Government to monitor their most recent agri-environment scheme, Glastir, which involved 4,911 FIGURE 1 | Map of sites selected for GMEP monitoring. To protect landowner anonymity, each triangle gives an approximate location of every 1 km 2 plot from which samples were taken.
landowners over an area of 3,263 km 2 (Figure 1). Through the GMEP framework, survey teams collected samples in 2013 and 2014 between April and October in each year (Emmett and the GMEP Team, 2017). Sampling protocols were based on those of the UK-wide ecosystem monitoring programme, Countryside Survey (Emmett et al., 2010). The survey design randomly located 300, 1 km squares across 26 land classes in Wales which survey teams sampled with 5 plots in each square. A subset of samples were then randomly chosen from squares with a maximum of 3 selected in an individual square. A total of 437 samples were collected for biodiversity analyses.
At each sampling location, 2 cores were collected. One was a 15 cm deep by 4 cm diameter core from which measurements of soil physical and chemical properties were taken, including total C (%), N (%), P (mg/kg), organic matter (% loss-on-ignition), pH (measured in 0.01 M CaCl 2 ), mean soil water repellency (water drop penetration time in seconds), bulk density (g/cm 3 ), volume of rocks (cm 3 ), volumetric water content (m 3 /m 3 ), as well as percentage sand and clay. For complete details on chemical analyses methodology, see Emmett et al. (2010). Soil texture data were measured by laser granulometry with a LS320 13 analyser (Beckman-Coulter) as described in George et al. (2019). The cut-off points for clay, silt, and sand were: 2.2, 63, and 2,000 µm, respectively. Clay and sand percentages were selected for subsequent analyses and normalised using Aitchison's log 10 -ratio transformation. Further geographic data including grid eastings, northings, and elevation were also collected. Mean temperature ( • C) on date of sample collection and annual precipitation (mL) data were extracted from the Climate Hydrology and Ecology research Support System dataset (Robinson et al., 2017). Environmental variables were normalised (by log 10 or square root transformation) where appropriate (see Table 1).
Each sampling site was assigned to a land use category, soil type, and SOM class (based on percentage organic matter). The land use classification used in this study was originally developed for the UK Countryside Survey in 1990 (Bunce et al., 1999). Briefly, vegetation was recorded by surveyors and used to classify each site into one of the 8 Aggregate Vegetation Classes (AVCs) as described in Bunce et al. (1999; for further details please see Supplementary Material). The AVCs have been shown to follow a gradient of soil nutrient content from which productivity and management intensity can also be inferred (see Supplementary Material and Bunce et al., 1999). There were 7 AVCs identified in the present study. The AVCs in descending order of productivity are: Crops/weeds (including arable land), Fertile grassland, Infertile grassland, Lowland woodland, Upland woodland, Moorland grass-mosaic, Heath/bog (Supplementary Table 1). Soil type based on the predominant major soil group classification was extracted from the National Soil Map (Supplementary Material; Avery, 1980). Additionally, we classified soils on a per sample basis by organic matter content. Each sample was grouped into one of four organic matter classes based on percent loss-on-ignition (LOI) following the protocols of the 2007 Countryside Survey (Emmett et al., 2010): mineral (0-8% LOI), humus-mineral (8-30% LOI), organo-mineral (30-60% LOI), and organic (60-100% LOI). Mean values for each environmental variable were recorded for each land use, soil organic matter class, and soil type.

DNA Extraction
Soils used in DNA extraction were collected from 15 cm deep by 8 cm diameter cores. Soil samples were transported in refrigerated boxes; samples were received at Environment Centre Wales, Bangor within an average of 48 h post-extraction and frozen at −80 • C upon arrival. Soils were then thawed and homogenised as they passed through a sterilised 2 mm stainless steel sieve after which they were returned to a −80 • C freezer until DNA extraction. Sieves were sterilised between samples by rinsing with tap water at high pressure and an application of Vircon R laboratory disinfectant followed by UV-treating each side for 5 min. DNA was extracted by mechanical lysis from 0.25 g of soil per sample using a PowerLyzer PowerSoil DNA Isolation Kit (MO-BIO Inc.). Soils were pre-treated with 750 µL of a suspension of CaCO 3 (1 M) following Sagova-Mareckova et al. (2008) to improve PCR performances, especially for acidic soils. Extracted DNA was stored at −20 • C until amplicon library preparation began. The extractions and homogenisation steps were performed in triplicate. To check for contamination in sieves, 3 negative control DNA extractions were completed as well as 2 negative control kit extractions using the same technique but without the CaCO 3 pre-treatment. Aliquots of the resultant DNA were used to create amplicon libraries for sequencing with each primer set.

Primer Selection and PCR Protocols for Library Preparation
Amplicon libraries were created using primers for the ITS1 (ITS5/5.8S_fungi) area to specifically target fungi (Epp et al., 2012) and the V4 region of the 18S gene (TAReuk454FWD1/TAReukREV3; Behnke et al., 2011) targeting a wide range of, but not all, eukaryotic organisms, including fungi. A two-step PCR following protocols devised in conjunction with the Liverpool Centre for Genome Research was used as described in George et al. (2019). Amplification of amplicon libraries was run in triplicate on DNA Engine Tetrad R 2 Peltier Thermal Cycler (BIO-RAD Laboratories Inc.) and thermocycling parameters for both PCR protocols started with 98 • C for 30 s and terminated with 72 • C for 10 min for final extension and held at 4 • C for a final 10 min. For the ITS1 locus, there were 15 cycles of 98 • C for 10 s; 58 • C for 30 s; 72 • C for 30 s. For the 18S locus there were 15 cycles at 98 • C for 10 s; 50 • C for 30 s; 72 • C for 30 s. Twelve microliters of each first-round PCR product were mixed with 0.1 µL of exonuclease I, 0.2 of µL thermosensitive alkaline phosphatase, and 0.7 µL of water and cleaned in the thermocycler with a programme of 37 • C for 15 min and 74 • C for 15 min and held at 4 • C. Addition of Illumina Nextera XT 384-way indexing primers to the cleaned first round PCR products were amplified following a single protocol which started with initial denaturation at 98 • C for 3 min; 15 cycles of 95 • C for 30 s; 55 • C for 30 s; 72 • C for 30 s; final extension at 72 • C for 5 min and held at 4 • C. Twenty-five microliters of second-round PCR products were purified with an equal amount of AMPure XP beads (Beckman Coulter). Library preparation for the 2013 samples was conducted at Bangor University. Illumina sequencing for both years and library preparation for 2014 samples were conducted at the Liverpool Centre for Genome Research.

Bioinformatics
Bioinformatics analyses were performed on the Supercomputing Wales cluster as previously described in George et al. (2019). A total of 104,276,828, and 98,999,009 raw reads were recovered from the ITS1 and 18S sequences, respectively. Illumina adapters were trimmed from sequences using Cutadapt (Martin, 2011) with 10% level mismatch for removal. Sequences were then de-multiplexed, filtered, quality-checked, and clustered using a combination of USEARCH v. 7.0 (Edgar, 2010) and VSEARCH v. 2.3.2 (Rognes et al., 2016). Open-reference clustering (97% sequence similarity) of operational taxonomic units (OTUs) was performed using VSEARCH; all other steps were conducted with USEARCH. Sequences with a maximum error greater than 1 and shorter than 200 bp were removed following the merging of forward and reverse reads for ITS1 sequences. A cut-off of 250 bp was used for 18S sequences, according to higher quality scores. There were 7,242,508 (ITS1) and 9,163,754 (18S) cleaned reads following these steps. Sequences were sorted and those that only appeared once in each dataset were removed.
Remaining sequences were matched first against the UNITE 7.2 (Kõljalg et al., 2013) and SILVA 128 (Quast et al., 2013) databases for the ITS1 and 18S sequences, respectively. Ten per cent of sequences that failed to match were clustered de novo and used as a new reference database for failed sequences. Sequences that failed to match with the de novo database were subsequently also clustered de novo. All clusters were collated and chimeras were removed using the uchime_ref command in VSEARCH. Chimera-free clusters and taxonomy assignment summarised in an OTU table with QIIME v. 1.9.1  using RDP (Wang et al., 2007) methodology with the UNITE database for ITS1 data. Taxonomy was assigned to the 18S OTU table using BLAST (Altschul et al., 1990) against the SILVA database and OTUs appearing only once or in only 1 sample were removed from each OTU table. Based on DNA quality and read counts, 413 samples were used for analyses of the ITS1 data and 422 for 18S data (from the total of 438).
A Newick tree was constructed for the 18S tables using 80% identity thresholds and was paired with the 18S OTU table as part of analyses using the R package phyloseq (McMurdie and Holmes, 2013). Non-fungi OTUs were removed from both OTU tables. Read counts from each group were rarefied 100 times using phyloseq (as justified by Weiss et al., 2017) and the resulting mean richness was calculated for each sample. The ITS1 table was rarefied at a depth of 4,000 reads whereas the 18S table was rarefied to 10,000 reads. A subset of the 18S data was rarefied to 400 reads across 398 samples to analyse Glomeromycetes OTUs separately. Samples with observed lower read counts were removed before rarefaction. To assess functional diversity, both OTU tables were processed using FUNGUILD (Nguyen et al., 2016) and the resulting matched OTU tables were used to investigate functional roles based on trophic mode. Sequences have been uploaded to The European Nucleotide Archive and can be accessed with the following primary accession codes after the end of the data embargo: PRJEB28028 (ITS1), and PRJEB28067 (18S).

Statistical Analysis
All statistical analyses were run using R v. 3.3.3 (R Core Team, 2017) following rarefaction. For each data set, NMDS ordinations using Bray-Curtis dissimilarity were created with the vegan package (Oksanen et al., 2016) to assess β-diversity. Environmental data was fitted linearly onto each ordination of AVCs using the envfit function. NMDS scores were plotted against these values for each variable to determine the direction of associations. Differences in β-diversity amongst AVCs were calculated with PERMANOVA and homogeneity of dispersion was also assessed.
Linear mixed models were constructed using package nlme (Pinheiro et al., 2016) to show the differences in α-diversity amongst AVCs, soil types, and LOI classification, for both ITS1 and 18S fungal data sets. Sample year as fixed factors; sample square identity was the random factor. This methodology was also used for the subsets of data that matched to the FUNGUILD database. For each model, significant differences were assessed by ANOVA and pairwise differences were identified using Tukey's post-hoc tests from the multcomp package (Hothorn et al., 2008).
Partial least squares regressions from the pls package (Mevik et al., 2016) were used with the variable importance in projection (VIP) approach (Chong and Jun, 2005) to sort the original explanatory variables by order of importance to identify the most important environmental variables for richness. Such analysis is ideal for data where there are many more explanatory variables than sample numbers or where extreme multicollinearity is present (Lallias et al., 2015;George et al., 2019). Variables with VIP values > 1 were considered most important. Relationships between important variables and richness values for each group of organisms were investigated by linear regression. Richness was normalised before regression when necessary.

Soil Properties
Soil properties displayed a range of changes across land uses (  [F (6, 429) = 4.4, p < 0.001], though significant, were less clear across land uses however. These findings were also apparent when samples were grouped from low-to-high organic matter content by organic matter class (Supplementary Table 2). Overall, no clear trends were evident across the different soil types (Supplementary Table 3).

Sequencing Data
A total of 7,582 and 4,408 fungal OTUs were recovered using the ITS1 and 18S primer sets, respectively. Of these, 5,666 were assigned an identifier at the class-level in the ITS1 dataset while 4,367 were assigned an identifier in the 18S dataset. There were 15 classes that were only found in the ITS1 dataset and 12 unique to the 18S data. Endogonomycetes was the most abundant class found only in the ITS dataset (19 OTUs), whereas Laboulbeniomycetes (17 OTUs) was the most abundant fungal class unique to the 18S data. A total of 24 classes were present in both ITS1 and 18S data (Figure 2A).
As reported in George et al. (2019), Agaricomycetes were the most abundant class of fungi in the ITS1 dataset overall. There were also a large proportion of Sordariomycetes ( Figure 2B). Archaeorhizomycetes was the most abundant class in the 18S dataset ( Figure 2C). Proportionate abundances of Sordariomycetes and Agaricomycetes followed contrasting trends, with the dominance of the former replaced by the latter in lower productivity AVCs in the ITS1 data, as described previously ( Figure 3A). Although Agaricomycetes and Sordariomycetes comprised smaller fractions of the 18S dataset ( Figure 2C), this trend was still apparent ( Figure 3B). Additionally, the Archaeorhiozmycetes from 18S data generally followed the same trend as the Sordariomycetes (Figure 3B). The preceding trends observed across land uses are also evident across organic matter classes ( Figure S1) but are not as clear across soil types ( Figure S2).
When a class was present in both datasets, it was usually much more prevalent in one than the other (Supplementary Table 4

Fungal Richness and β-Diversity From ITS1 and 18S Data
We found that fungal richness followed the same trends across land use, irrespective of primer set. As previously demonstrated in George et al. (2019), fungal OTU richness from ITS1 metabarcoding significantly declined [F (6, 258) = 39.87, p < 0.001; Figure 4A] from high to low productivity/management intensity. Richness in Fertile grasslands was significantly greater than all other AVCs (p < 0.001) except Crops/weeds. In the 18S dataset, richness was also significantly higher [F (6, 267) = 82.73, p < 0.001] in more productive/managed land uses and declined along this gradient. However, richness in grasslands was highest in this dataset ( Figure 4B). For complete pairwise differences between land uses see Supplementary Material.
The trend of declining richness with productivity was also apparent when samples were categorised by organic matter content (Figure 5). In both datasets, richness was significantly greater [F (3, 259) = 48.13, p < 0.001; F (3, 269) = 46.71, p < 0.001; for ITS1 and 18S, respectively] in mineral and humus-mineral than all other classifications (ITS1, Figure 5A; 18S, Figure 5B). There was no consistent pattern of richness when soils were categorised by soil type (Figure S3). Again pairwise differences between organic matter classes and soil types are described in the Supplementary Material.
Community composition based on non-metric multidimensional scaling of Bray-Curtis distances also showed consistent trends between the datasets. Plots demonstrate tight clustering of Crops/weeds, and grassland AVCs in both ITS1 ( Figure 6A) and 18S ( Figure 6B)  When these results are visualised by organic matter classification, the tight clusters are populated by mineral and humus-mineral samples, whereas organo-mineral and organic samples are more common in the widely dispersed areas of the plots (Figures S4, S5). Soil types are more widely dispersed but Brown and Surface-water gley soils are more common in the tightly grouped area (Figures S6, S7). Again, significant results were observed for both PERMANOVA and dispersion of variance across organic matter classes and soil types in both datasets.

Relationships Between Soil Properties and Fungal Biodiversity
Fungal richness showed similar relationships to soil properties in both datasets. Across samples, PLS and VIP analyses highlighted strong correlations between fungal richness and soil properties. There were significant, positive relationships of richness with pH and bulk density; and significant, negative correlations between richness and C:N ratio, organic matter, elevation, and mean annual precipitation ( Table 2). Although these results followed the same trend in ITS1 and 18S data, however, their    relative rankings varied. For example, fungal richness from ITS1 data was most strongly correlated with bulk density and organic matter, while richness from 18S data was more strongly correlated to C:N ratio and elevation in addition to bulk density ( Table 2). Furthermore, there were some relationships unique to each dataset. Significant negative relationships were observed between richness and soil water repellency. Similarly, richness derived from 18S data was negatively related to total C and sand content of soil but also positively related to clay content. We found pH was the best predictor of β-diversity from linear fitting for fungi no matter what gene region is amplified (Tables 3, 4). All fitted variables were significantly correlated to β-diversity, though most of these only weakly. It is likely that they did not strongly influence the fungal communities. Variables followed similar rankings in both the ITS1 and 18S data. Elevation, annual precipitation, soil moisture, C:N ratio, organic matter, and bulk density all had R 2 values greater than 0.35, but their relative order differed between datasets (Tables 3, 4).

Effect of Land Use on Functional Diversity
There was a distinct difference in trophic modes of OTUs that were successfully matched to the FUNGUILD database between ITS1 and 18S datasets. In total, 3,402 and 1,783 OTUs from the ITS1 and 18S datasets, respectively were matched to the FUNGUILD database. Overall, saprotrophs were the most abundant trophic mode in both datasets ( Figure 6); however, pathotrophs ranked second in ITS1 ( Figure 6A) data while the pathotroph-saprotroph-symbiotroph multi-trophic group was second-most abundant in 18S data ( Figure 6B). Across land uses, proportions of pathotrophs and pathotroph-saprotrophsymbiotrophs fell with declining productivity (Figure 7). In matches from the ITS1 data, pathotroph-saprotrophs increased across the productivity gradient (Figure 7A), as did saprotrophs in the 18S data ( Figure 7B). The aforementioned trend in proportional abundance of pathotrophs and pathotrophsaprotroph-symbiotrophs was also present across organic matter classes ( Figure S8). Symbiotrophs appeared to follow an opposite trend, increasing as productivity fell. Interestingly, this was    the case for saprotrophs in the 18S (Figure S8B) but not the ITS1 (Figure S8A) dataset. Proportional abundances of fungal OTUs grouped by trophic modes did not follow a discernable pattern across changing soil types ( Figure S9). For simplicity, we focused further analyses only on the broadly defined saprotroph, pathotroph, and symbiotroph groups, ignoring all combination groups; pairwise differences for all of the following comparisons are described in the Supplementary Material. Across land uses, significant differences were observed in the richness of saprotrophic fungi in both the ITS1 [F (6, 258) = 25.14, p < 0.001] and 18S [F (6, 267) = 31.10, p < 0.001] data; however, there were differences between datasets (Figure 8). In the ITS1 dataset, richness followed the same trend as overall fungal richness, with the highest and lowest values in the Crops/weeds and Heath/bog AVCs respectively ( Figure 8A). Although this pattern was preserved in the 18S data ( Figure 8B), richness of saprotrophs was much more even across AVCs in this case. Indeed, rather than the linear decline of richness along the productivity gradient, there appeared to be 3 distinct levels in the data affiliated with (i) grassland/agricultural sites, (ii) woodlands, and (iii) bogs.
In the case of pathotrophic fungi, richness also followed a similar trend to the saprotrophs across both datasets. In the ITS1 data, significantly [F (6, 258) = 26.11, p < 0.001] greater richness values were observed in Crops/weeds and grassland samples ( Figure 8A). Richness of pathotrophs was significantly highest in Crops/weeds sites. Again, this trend was present, though not as clear, in the 18S dataset ( Figure 8B). Significant differences [F (6, 267) = 52.26, p < 0.001] were observed between AVCs, with the highest richness of pathotrophs occurring in the Fertile grassland and Crop/weeds land uses.
Across organic matter classes, significant differences were also observed in pathotroph richness in the ITS1 [F (3, 250) = 24.91, p < 0.001] and 18S [F (3, 269) = 30.49, p < 0.001] datasets. However, in this case the trends were more apparent in the 18S data than the ITS1 data (Figure 9). Pathotroph richness was highest in mineral soils and lowest in organic soils when compared to all other classes in the ITS1 data ( Figure 9A). However, all organic matter classifications were statistically different from each other in the 18S data (Figure 9B), in descending order from mineral to peat soils. Again, trends were less clear across soil types ( Figure S10). Significant differences were observed in the ITS1 data [F (5, 259) = 6.93, p < 0.001] with the lowest pathotroph richness found in peat soils (Figure S10A). In the 18S data, differences between pathotrophic fungi across soil types were more similar to those observed in other groups ( Figure S10B). datasets plotted against Aggregate Vegetation Class. Aggregate Vegetation Classes are ordered from most (Crops/weeds) to least (Heath/bog) productive. Boxes cover the first and third quartiles and horizontal lines denote the median. Black dots represent outliers beyond the whiskers, which cover 1.5X the interquartile range. Notches indicate confidence interval around the median. Overlapping notches are a proxy for non-significant differences between medians. Black dots are outliers.
FIGURE 9 | Boxplots of richness of fungal OTUs matched to the pathotrophic, saprotroph, and symbiotroph trophic modes in FUNGuild for (A) ITS1 and (B) 18S datasets plotted against organic matter class. Organic matter classes are listed in order of increasing percent organic matter. Boxes cover the first and third quartiles and horizontal lines denote the median. Black dots represent outliers beyond the whiskers, which cover 1.5X the interquartile range. Notches indicate confidence interval around the median. Overlapping notches are a proxy for non-significant differences between medians. Black dots are outliers. Vegetation Classes are ordered from most (Crops/weeds) to least (Heath/bog) productive. Organic matter classes are listed in order of increasing percent organic matter. Soils are listed in increasing order of moisture retention. Boxes cover the first and third quartiles and horizontal lines denote the median. Black dots represent outliers beyond the whiskers, which cover 1.5X the interquartile range. Notches indicate confidence interval around the median. Overlapping notches are a proxy for non-significant differences between medians. Black dots are outliers.
Pathotroph richness was significantly [F (5, 268) = 13.6, p < 0.001] different across soil types with the highest values found in brown soils and the lowest in peats.
The previously described trend of declining richness across the land use productivity gradient (i.e., Figure 4) was not apparent when considering symbiotrophs. Furthermore, although significant differences were apparent in both the ITS1 [F (6, 258) = 14.88, p < 0.001] and 18S [F (6, 267) = 55.13, p < 0.001] datasets they were by no means identical (Figure 8). Symbiotroph richness was highest in Lowland wood sites followed by Upland wood. This trend was not apparent in the 18S dataset, however ( Figure 8B). Here richness of symbiotrophs was greatest in grassland AVCs and lowest in Heath/bog sites much like the overarching trend of total fungal OTU richness.
We suspected that the differences in functional diversity observed between datasets might be a result of differential coverage of important groups. We were able to confirm this when we analysed the richness of OTUs identified as Glomeromycetes present in the 18S dataset (Figure 10). All of the 162 Glomeromycetes OTUs were assigned as highlyprobable symbiotrophs through FUNGUILD. Across land uses, richness of Glomeromycetes followed similar trends to those of symbiotrophs and saprotrophs from 18S data. There were significant [F (6, 244) = 33.47, p < 0.001] differences across land uses, though they appeared, like the saprotroph richness to be tiered between grasslands, woods, and bogs ( Figure 10A). Richness of Glomeromycetes was higher in grasslands than all other AVCs except Crops/weeds and lowest in Heath/bog sites. Again, when grouped by organic matter class ( Figure 10B) and soil type (Figure 10C), Glomeromycetes richness followed the same trend as saprotrophs and symbiotrophs from the 18S dataset. Richness was significantly [F (3, 246) = 37.65, p < 0.001] greater in mineral and humus-mineral soils than all others. Across soil types, richness of Glomeromycetes was significantly [F (5, 245) = 8.65, p < 0.001] lower in peat soils when compared to most other soil types.

Relationships Between Soil Properties and Fungal Functional Diversity
Across all samples, PLS and VIP analyses highlighted strong correlations between fungal richness and soil properties by trophic groups. Richness of pathotrophs showed similar relationships to soil properties in both datasets. There were significant, positive relationships of richness with pH and bulk density; and significant negative correlations between richness and total C, C:N ratio, organic matter, elevation, and mean annual precipitation ( Table 5). As with the total fungal data, the relative rankings of the strength of relationships between pathotroph and each property varied between datasets. Organic matter was most strongly correlated with pathotroph richness from ITS1 data whereas pH was most strongly correlated with pathotroph richness in the 18S data ( Table 5). Also soil moisture content was also negatively correlated with pathotroph richness in the ITS1 dataset only.
Organic matter, elevation (both negative), pH, and bulk density (both positive) all showed significant relationships with saprotroph richness in both datasets ( Table 5). The correlations between richness of saprotrophs and both bulk density and pH were the strongest observed in the ITS1 data. There were also negative correlations between saprotroph richness and total C, mean annual precipitation, soil moisture, soil water repellency, and mite abundance in the ITS1 data. However, it again should be noted that the correlation with mites was extremely weak. C:N ratio was strongly and positively correlated with saprotroph richness in the 18S data. Similarly, richness from 18S data was negatively related to total C and sand content of soil but also positively related to clay content. In addition, there was a significant, positive, but weak correlation between sand content and saprotroph richness.
In both datasets, symbiotroph richness was significantly correlated with pH and C:N ratio (Table 5). Interestingly, the relationships were positive in the case of C:N ratio and negative for pH in ITS1 data but the opposite was apparent in the 18S data. There were also many more relationships unique to each dataset. Weak but significant positive relationships were observed between symbiotroph richness and rock volume, Collembola abundance, and temperature as well as a negative correlation to soil moisture. In the 18S data, stronger relationships were observed between symbiotroph richness and bulk density (positive) and elevation (negative). Furthermore, a weakly negative correlation was observed with sand content in addition to weak positive correlations with clay content and total P.

Primer Choice and the Total Fungal Community
We observed congruent patterns in total fungal OTU richness across land uses, organic matter classes and soil type when measured with either ITS1 or 18S primer sets. Richness was greater in arable and grassland land uses, which are highly productive, intensively managed and declined in the less productive, largely unmanaged bogs. Although these findings had been previously known from the ITS1 dataset (George et al., 2019), it is important to note that the trend was also present in the fungal OTUs identified from 18S sequencing. A similar trend was observed across organic matter classes. Here, fungal richness fell as organic matter increased. Fungal α-diversity is known to be greater in arable soils than in grasslands or  forests (Szoboszlay et al., 2017). Potential mechanisms for this include: (i) increased nutrient availability due to fertiliser input (Szoboszlay et al., 2017), and (ii) beneficial disturbance from tillage and other standard agricultural practices. The latter is consistent with the intermediate disturbance hypothesis whereby high levels of diversity are maintained by consistent interruption of successional processes (Connell, 1978).
Soils rich in organic matter, especially peats, found in upland moors, bogs, and other wetlands across harbour distinct fungal communities from neighbouring habitats (Anderson et al., 2003). Fungi dominate microbial communities in bogs (Thormann and Rice, 2007) although their proportional abundance drops sharply below the first 5 cm of bog habitats (Potter et al., 2017). Yet, richness in bogs is consistently low, perhaps due to environmental pressures such as high acidity, highly recalcitrant SOM, low nutrients, and oxygen levels (Rousk et al., 2010;Tedersoo et al., 2014) or reduced competition within the fungal community.
In comparison to AVC and SOM levels, differences in fungal communities were not as clear across soil types as defined by the National Soil Map (Avery, 1980), which is inline with previous work on microbial activity across the UK (Jones et al., 2014). Richness was highest in brown soils and was lowest in peats. Brown soils commonly support grassland communities across Wales (Avery, 1980;Rudeforth et al., 1984). Nearly half of the Fertile and Infertile grasslands surveyed in GMEP were categorised as brown soils. The absence of other major trends besides these may be due to the use of the dominant soil type and lack of resolution for the soil classification. The soils map used in this study simply does not provide enough resolution (1:63, 360; Avery, 1980) for soil type to be an effective category. Furthermore, this system heavily uses subsoil properties to determine soil type (Avery, 1980), while our work only involved the upper 15 cm. However, it is our opinion that the use of organic matter classification is more effective and simple metric that can be easily implemented in large-scale studies in lieu of fine-scale maps.
Results of PLS analyses demonstrates that soil properties and associated environmental factors influencing fungal richness are consistent across ITS1 and 18S datasets. Major drivers included pH, bulk density, C:N ratio, organic matter, elevation, and mean annual temperature ( Table 2). Such results from 18S data are consistent with previous findings from the ITS1 data (George et al., 2019). However, there were certain properties that were significant in only one of the datasets and the relative importance of these properties does vary between the two datasets. There are several possible explanations for this. Firstly, 9 more samples were used in the 18S dataset (n = 422) than the ITS1 data (n = 413), which may have introduced the discrepancy in relative importance of the data. However, it is much more likely that a differential coverage of fungal groups between the two datasets caused these discrepancies.
Community composition showed consistent clustering across land uses, organic matter classes, and soil types in both data sets. As in George et al. (2019), communities were most similar in the grassland and arable sites and more spread out across woodlands and upland habitats. This was likely driven by environmental factors across Wales. In both datasets, pH was the most important environmental variable influencing community composition and although the remaining properties followed similar patterns, their relative importance again differed in the dataset. The importance of pH, elevation, C:N ratio, and precipitation in determining fungal community composition fits well in the wider context of soil fungi biogeography. Tedersoo et al. (2014) previously highlighted the importance of these variables in the distribution of fungi at the global scale. Furthermore, the strong positive correlation with C:N ratio is indicative of the expected fungal dominance (de Vries et al., 2006) of nutrient-poor, acidic soils (Bloem et al., 1997).

Primer Choice and Fungal Functional Diversity
Differences between richness of trophic modes of fungi, used here as a proxy for functional diversity, showed some discrepancies across land uses and soil classification between data sets. Saprotrophs made up the largest proportion of the 3 functional groups studied and generally exhibited the same trends as total richness across soils and land uses. This was also the case for pathotrophs. Indeed, correlations between environmental variables with pathotroph and saprotroph richness were largely consistent across datasets. However, we observed divergent trends in symbiotroph richness across land uses and soils. Symbiotroph richness was highest in woodlands in the ITS1 dataset whereas it was highest in grasslands according to the 18S data (Figures 7A,B). A similar increase in richness within grasslands in the 18S data is repeated when Glomeromycetes were considered on their own (Figure 9); AMF are the predominant mycorrhizal fungi in grassland systems (Smith and Read, 2008). The symbiotroph peak in the ITS1 data may be explained by an increase in coverage of ectomycorrhizas which are the most common group to associate with trees and shrubs (Smith and Read, 2008). Despite these differences, both datasets suggest that symbiotroph richness was low in arable land, which is in line with previous findings demonstrating high susceptibility of mycorrhizal fungi to disturbance, for example tillage (Schnoor et al., 2011;Säle et al., 2015), and the addition of fertilizers, which decreases the receptiveness of many agricultural plants to mycorrhizal infection (Smith and Read, 2008).
The divergent trend in symbiotroph richness and discrepancies in relationships between functional groups and environmental variables likely stem from primer biases. Primer biases have been well-recognised as a confounding factor in categorising communities from environmental DNA (Cai et al., 2013;Elbrecht and Leese, 2015;Tedersoo et al., 2015). Tedersoo et al. (2015) assessed the effectiveness of fungal barcodes from the ITS, 18S, and 28S rDNA regions and found that primer choice did not affect richness or β-diversity results of soil fungi communities from Papua New Guinea, although fewer OTUs were recovered by 18S primers than ITS primers. In silico analyses suggests such findings are the result of lumping of sequences in the 18S that may predominantly affect rare sequences, thereby strengthening community matrices. Similarly, results were similar enough for all primers to be suitable for analyses at the class-level (Tedersoo et al., 2015). Although the 18S primers used here were designed to cover the breadth of eukaryotes and may lack specificity to fungi (Behnke et al., 2011), our results show strong congruence to the ITS1 data across total richness and indeed most functional groups.
Unlike Tedersoo et al. (2015) we observed considerable differences in the proportions of fungal classes between the ITS1 and 18S data sets. We suspect that such differences stem from the need to use appropriate databases to assign taxonomy to OTUs to each dataset (Xue et al., 2019). Perhaps only 30-35% of Glomeromycetes are present in 18S and ITS databases, respectively (Hart et al., 2015), and although sequences are continuously being uploaded to such repositories, it is likely the majority of AMF are not identifiable from environmental samples (but see Öpik et al., 2014). Similarly we suspect that, although not studied in detail, primer choice may lead to biases in other groups. Archaeorhizomycetes accounted for nearly 25% of the 18S sequences but less than 1% from the ITS1 data ( Figure 2B). Primer bias has been recognised for Archaeorhizomycetes even before the class' formal description; ∼19% of 18S sequences collected from Anderson et al. (2003), have been matched to Archaeorhizomycetes, whereas none were recovered from the same samples using ITS primers. Despite its recent description, Archaeorhizomycetes are ubiquitous components of soil communities. Strong associations have been observed with trees, yet precise functional roles of these fungi have yet to be determined (Rosling et al., 2011). Subsequently, such biases likely account for divergent relationships between functional group richness and environmental properties.

CONCLUSIONS
Our comparison of the use of ITS1 and 18S primers and their respective databases in a nationwide metabarcoding survey of fungi yielded 3 major findings. First, the congruent findings of total richness and β-diversity across land use and their relationships to environmental variables confirmed our previous research (George et al., 2019). Second, soil organic matter was found to be a more sensitive metric than soil type in our survey design. Third, biases from the combination of primer and database choice became apparent for certain classes of fungi, including Glomeromycetes and Archaeorhizomycetes, which strongly influenced functional group richness across land uses as well as their relationships with environmental variables. It is therefore important to recognise the sensitivity of metabarcoding to primer choice, even when using universal primers. Without simultaneous analyses of environmental DNA using both primers and databases, the presence of AM fungi as well as the newly characterised Archaeorhizomycetes would have been overlooked and unquantified in this survey. Furthermore, since the majority of soil biodiversity is undescribed (Ramirez et al., 2015), utilising multiple primers will elucidate a more complete picture of belowground biodiversity by revealing shortcomings in existing probes and revealing the presence of as yet undescribed organisms. We therefore advocate that future nation-wide surveys included both a sample-based metric of soil type (i.e., organic matter classification) and multiple primers for fungal biodiversity. Such measures should not be arduous to implement, especially if researchers can identify specific fungal groups of particular interest to accommodate.

DATA AVAILABILITY STATEMENT
Data associated with this paper will be publically published in the National Environment Research Council (NERC) Environmental Information Data Centre (EIDC). Currently, pH, bulk density, C, N, P, moisture, and water repellency data are available (Robinson et al., 2019). Data are also available from the authors upon reasonable request with permission from the Welsh Government. Sequences with limited sample metadata have been uploaded to the European Nucleotide Archive and can be accessed with the following primary accession codes: PRJEB28028 (ITS1) and PRJEB28067 (18S).

AUTHOR CONTRIBUTIONS
PG, DJ, DR, and SC conceived this project. BE and DR facilitated the use of GMEP data. Bioinformatics and statistical analyses were led by PG with assistance from SC and RG. PG wrote the first draft of the manuscript. SC, DR, and DJ contributed to subsequent revisions. All authors read and approved the final draft of the manuscript.

ACKNOWLEDGMENTS
We thank the GMEP Team for their contribution in collecting the data and the laboratory staff of Environment Centre Wales, especially Delphine Lallias, for sample processing and DNA extraction. Thanks also to Fiona M. Seaton for assistance with statistical analyses. We thank Supercomputing Wales for support and training on their system for bioinformatics analyses. We thank John G. Kenny and Richard M. Eccles for generating sequence data at the Centre for Genomic Research, University of Liverpool.