Diverse Microbial Composition of Sourdoughs From Different Origins

Hundreds of sourdoughs have been investigated in the last decades. However, many studies used a culture-dependent and/or culture-independent microbiological approach [mainly based on denaturing gradient gel electrophoresis (DGGE) of PCR amplicons], seldomly combined with a metabolite target analysis, to characterize the microbial species communities of the sourdoughs examined. Moreover, attention was mainly paid on lactic acid bacteria (LAB) and yeast species. In the present study, distinct household-scale (including an artisan lambic brewery) and artisan bakery-scale backslopped sourdoughs (17 in total), obtained from different regions (Belgium, France, United Kingdom, and USA), were examined through a multiphasic approach, encompassing a culture-dependent analysis [targeting LAB, acetic acid bacteria (AAB), and yeasts], different culture-independent techniques [rRNA-PCR-DGGE, metagenetics, and metagenomics (four bakery sourdoughs)], and metabolite target analysis. It turned out that the microbial species diversity of the sourdoughs was influenced by the house microbiota of the producer. Further, when the producer made use of different flours, the sourdoughs harbored similar microbial communities, independent of the flour used. AAB were only present in the Belgian sourdoughs, which might again be related to the processing environment. Fructilactobacillus sanfranciscensis (formerly known as Lactobacillus sanfranciscensis) was the prevalent LAB species of the eight sourdoughs produced by two of the three bakeries of different countries analyzed. These sourdoughs were characterized by the presence of either Saccharomyces cerevisiae or Kazachstania humilis. Moreover, the presence of Fl. sanfranciscensis was positively correlated with the production of mannitol and negatively correlated with the presence of other LAB or AAB species. Sourdoughs produced in an artisan lambic brewery were characterized by the presence of the yeast species Dekkera anomala and Pichia membranifaciens. One household sourdough was characterized by the presence of uncommon species, such as Pediococcus parvulus and Pichia fermentans. Metagenomic sequencing allowed the detection of many more LAB and AAB species than the other methods applied, which opened new frontiers for the understanding of the microbial communities involved during sourdough production processes.

Hundreds of sourdoughs have been investigated in the last decades. However, many studies used a culture-dependent and/or culture-independent microbiological approach [mainly based on denaturing gradient gel electrophoresis (DGGE) of PCR amplicons], seldomly combined with a metabolite target analysis, to characterize the microbial species communities of the sourdoughs examined. Moreover, attention was mainly paid on lactic acid bacteria (LAB) and yeast species. In the present study, distinct household-scale (including an artisan lambic brewery) and artisan bakeryscale backslopped sourdoughs (17 in total), obtained from different regions (Belgium, France, United Kingdom, and USA), were examined through a multiphasic approach, encompassing a culture-dependent analysis [targeting LAB, acetic acid bacteria (AAB), and yeasts], different culture-independent techniques [rRNA- PCR-DGGE, metagenetics, and metagenomics (four bakery sourdoughs)], and metabolite target analysis. It turned out that the microbial species diversity of the sourdoughs was influenced by the house microbiota of the producer. Further, when the producer made use of different flours, the sourdoughs harbored similar microbial communities, independent of the flour used. AAB were only present in the Belgian sourdoughs, which might again be related to the processing environment. Fructilactobacillus sanfranciscensis (formerly known as Lactobacillus sanfranciscensis) was the prevalent LAB species of the eight sourdoughs produced by two of the three bakeries of different countries analyzed. These sourdoughs were characterized by the presence of either Saccharomyces cerevisiae or Kazachstania humilis. Moreover, the presence of Fl. sanfranciscensis was positively correlated with the production of mannitol and negatively correlated with the presence of other LAB or AAB species. Sourdoughs produced in an artisan lambic brewery were characterized by the presence of the yeast species Dekkera anomala and Pichia membranifaciens. One household sourdough was characterized by the presence of uncommon species, such as Pediococcus parvulus and Pichia fermentans. Metagenomic sequencing allowed the detection of many more LAB and AAB species than the other methods applied, which opened new frontiers for the understanding of the microbial communities involved during sourdough production processes.

INTRODUCTION
Sourdoughs are matrices of mainly cereal flour and water that are fermented by means of lactic acid bacteria (LAB; mainly heterofermentative LAB species) and yeasts Settanni, 2017;Van Kerrebroeck et al., 2017;Gobbetti et al., 2019). The LAB to yeast ratio is mostly 10:1 to 100:1 (Gobbetti, 1998;Hammes et al., 2005;De Vuyst et al., 2014. Not only cereal flours from wheat, rye, spelt, and barley but also flours from pseudocereals, legumes, and seeds are used (Coda et al., 2014;De Vuyst et al., 2014. Sourdough production is generally carried out through backslopping, whereby each backslopping step is characterized by a spontaneous fermentation of the flour-water mixture, thanks to the wild microorganisms present in the flour, other ingredients, or the environment; alternatively, it can be initiated with starter cultures . Backslopped sourdoughs often harbor particular consortia of LAB and/or yeasts. For instance, San Francisco sourdough harbors a microbial consortium of Fructilactobacillus sanfranciscensis (formerly known as Lactobacillus sanfranciscensis; Zheng et al., 2020) and Kazachstania humilis (formerly Candida humilis) that is the result of a nutritional mutualism and thanks to mutual stress responses (Gänzle et al., 1998;Jacques et al., 2016;De Vuyst et al., 2017). However, other consortia of maltose-positive LAB and maltose-negative yeasts occur (e.g., Fl. sanfranciscensis and other Kazachstania species) as well as consortia of LAB species with a glucoserepressed maltose metabolism and maltose-positive yeasts [e.g., Lactiplantibacillus plantarum (formerly known as Lactobacillus plantarum) and Saccharomyces cerevisiae], also supporting on mutual relationships (Guerzoni et al., 2007;De Vuyst et al., 2017;Sieuwerts et al., 2018).
Up to now, more than 90 different LAB species and more than 40 different yeast species have been isolated from sourdoughs worldwide ). Yet, a certain backslopped sourdough usually harbors three or less LAB species and one to two yeast species, underlining the competitive ecosystem's mutual relationships and/or matrix-specific adaptations Van Kerrebroeck et al., 2017). Alternatively, the microbial composition of sourdoughs depends on the process technology or way of inoculation applied (Gänzle and Ripari, 2016;Van Kerrebroeck et al., 2017). Consequently, a wide variety of traditional sourdoughs exists, some of which have got the annotations of protected designation of origin (PDO; e.g., Italian Altamura bread and Tuscan bread; Ricciardi et al., 2005;Minervini et al., 2012a;Palla et al., 2017) or protected geographical indication (PGI; e.g., Matera bread and Coppia ferrarese bread; Vernocchi et al., 2004;Minervini et al., 2012a). Besides tens of studies on Italian sourdough varieties, spontaneous sourdoughs from Belgium, China, France, and Germany have been studied extensively Van Kerrebroeck et al., 2017). Most wheat sourdoughs encountered in these studies harbor at least the LAB species Lacp. plantarum and the yeast species S. cerevisiae. However, both complex and restricted microbial species diversities have been reported in wheat sourdoughs (Scheirlinck et al., 2008;Minervini et al., 2012a;De Vuyst et al., 2014Lhomme et al., 2015a,b). For example, sourdoughs characterized by Fl. sanfranciscensis and K. humilis solely and sourdoughs containing multiple microbial species, some of which are rarely reported in sourdoughs, such as Companilactobacillus heilongjiangensis (formerly known as Lactobacillus heilongjiangensis), Levilactobacillus koreensis (formerly known as Lactobacillus koreensis), and Lactiplantibacillus xiangfangensis (formerly known as Lactobacillus xiangfangensis), exist (Zhang and He, 2013;Michel et al., 2016;De Vuyst et al., 2017). However, these species are relatively new; therefore, it is possible that they have been misidentified and thus are underrepresented in sourdoughs.
During the last decade, it became clear that also acetic acid bacteria (AAB) are often part of the microbial consortium of spontaneous sourdoughs (Minervini et al., 2012b;Zhang and He, 2013;Lhomme et al., 2015a;Liu et al., 2016;Ripari et al., 2016b;Comasio et al., 2019). However, as they have not been searched for systematically, it is not clear how widespread they are in sourdoughs. Yet, they do not dominate the final sourdoughs, possibly due to their aerobic metabolism. Alternatively, functional starter culture strains of AAB have been tested successfully regarding their exopolysaccharide production in the presence of sucrose (Hermann et al., 2015;Ua-Arak et al., 2016 or aroma formation potential (Ripari et al., 2016a;Ua-Arak et al., 2017). Finally, alternative yeasts are being exploited to produce innovative sourdoughs .
The present study aimed to explore the species diversity of sourdoughs from different origins multiphasically, obtained through spontaneous fermentation and maintained through backslopping on a household or bakery scale, to get insight into the communality or uniqueness of the microbial communities involved in sourdough production.

Collection of Sourdoughs
Seventeen sourdoughs, which were propagated using a backslopping procedure, were collected randomly from private persons as well as from artisan bakeries and a lambic brewery. A specific code was allocated to the different samples (

Microbial Community Enumeration
To determine the counts of presumptive LAB, AAB, yeasts, and/or cycloheximide-resistant yeasts (only for the sourdoughs G- B-W and G-B-WL) in the sourdough samples, a culturedependent plating analysis was performed. Therefore, appropriate decimal dilutions of fresh sourdough samples were made and 100 µL of each dilution was plated on (i) modified de Man-  agar medium , supplemented with 0.1 g of cycloheximide (Sigma-Aldrich, Saint-Louis, Missouri, USA) and 0.005 g of amphotericin B (Sigma-Aldrich); (ii) modified deoxycholatemannitol-sorbitol (mDMS) agar medium (Papalexandratou et al., 2013), containing 0.1 g of cycloheximide (Sigma-Aldrich) and 0.005 g of amphotericin B (Sigma-Aldrich); (iii) yeast extractpeptone-dextrose (YPD) agar medium, supplemented with 0.1 g of chloramphenicol (Sigma-Aldrich) (Spitaels et al., 2014c); and (iv) YPD agar medium containing 0.1 g of chloramphenicol (Sigma-Aldrich) and 0.05 g of cycloheximide (Sigma-Aldrich) (YPDc; Spitaels et al., 2014c) for sourdough samples of the lambic brewery, respectively. All plates were incubated at 30 • C for 72 h (up to 8 days for YPDc). All platings were performed in triplicate. The average counts are expressed as colony forming units (CFU) per g of sourdough.

Bacterial identifications
Bacterial DNA extraction was performed on cell pellets of all overnight cultures mentioned above, which were obtained through microcentrifugation (14,000 × g, 5 min), as described previously (Comasio et al., 2019).
To classify and identify the bacteria (GTG) 5 -PCR fingerprinting analysis of genomic DNA was performed, as described previously . Therefore, the oligonucleotide (GTG) 5 primer was used (Versalovic et al., 1994;Gevers et al., 2001). PCR assays were performed as described previously . All fingerprint images were analyzed using Bionumerics v5.10 software (Applied Maths, Sint-Martens-Latem, Belgium). A cluster analysis of the (GTG) 5 -PCR fingerprints was performed using the Pearson correlation coefficient. To calculate a similarity matrix, the intensity of each band of the fingerprints was taken into account. The dendrograms were obtained by means of the unweighted pair group method with arithmetic average (UPGMA) clustering algorithm. Representative isolates (10%) of each cluster were identified by 16S rRNA gene sequencing, after amplification of this gene using the primers pA and pH (Edwards et al., 1989). In the case of AAB isolates, the dnaK gene was amplified too, using the primers dnaK- 01-F and dnaK-02-R (Cleenwerck et al., 2010). The PCR amplicons obtained (circa 1,500 and 750 bp for the 16S rRNA gene and the dnaK gene, respectively) were sequenced in a commercial facility using capillary technology (Macrogen, Amsterdam, The Netherlands; VIB Genetic Service Facility, Antwerp, Belgium). A basic local alignment search tool (BLAST) analysis was performed to evaluate the sequencing results and to determine the closest known relatives (type strains) of the partial gene sequences obtained via the National Center for Biotechnology Information (NCBI) non-redundant nucleotide (nt) database (http://www.ncbi.nlm.nih.gov/BLAST/; Altschul et al., 1990). Sequence identities of ≥98% were taken into account. Percentages of identity and accession numbers of hits (GenBank) are reported below.

Yeast identifications
Yeast DNA extraction was performed on cell pellets of all overnight yeast cultures mentioned above, which were obtained through microcentrifugation (14,000 × g, 5 min), as described previously (Comasio et al., 2019).
To classify and identify the yeasts, M13-PCR fingerprinting analysis of genomic DNA was performed, as described previously (Laureys and De Vuyst, 2014). Therefore, the oligonucleotide primer M13 was used (Vassart et al., 1987). PCR assays were performed as described previously . M13-PCR fingerprint cluster analysis was performed as described above. At least 10% of representative isolates from each cluster were identified by sequencing of the internal transcribed spacer (ITS) region of the rDNA. Therefore, the primers ITS1 and ITS4 were used (White et al., 1990). The PCR amplicons obtained (variable lengths) were processed as described above. Sequence identities of ≥98% were taken into account. Percentages of identity and accession numbers of hits (GenBank) are reported.

Culture-Independent Microbial Community Dynamics and Identifications
To determine the prevailing LAB, AAB, and yeast communities present in the different sourdough samples and to compare those results with the culture-dependent data, denaturing gradient gel electrophoresis (DGGE) and high-throughput sequencing (both metagenetics and shotgun metagenomics) were performed.

Sample Preparation
Sample preparation was performed as described previously . For the isolation of total DNA from the sourdough samples, 10 g of sourdough and 90 mL of peptonephysiological solution [0.1%, m/v, bacteriological peptone (Oxoid) and 0.85%, m/v, NaCl (Merck)] were mixed in a stomacher bag (Stomacher 400; Seward, Worthing, West Sussex, UK) for 5 min. These suspensions (50 mL) were centrifuged (1,000 × g for 5 min at 4 • C) to remove solid flour particles. The supernatants were collected and a second centrifugation (4,600 × g for 20 min at 4 • C) yielded cell pellets that were stored at −20 • C.

DGGE Analysis
Total DNA extraction was performed as described previously (Laureys and De Vuyst, 2014;Harth et al., 2016). DGGE analysis of PCR amplicons of these total DNA extracts was performed as described previously . Amplification of the V3 region of the bacterial 16S rRNA gene was done with the universal primers F357-GC and 518R (Muyzer et al., 1993). GC stands for the attached clamp. To amplify fungal DNA from the D1 region of the fungal 26S rRNA gene, the eukaryotic universal primers NL1-GC and LS2 were used (Cocolin et al., 2000). These PCR amplifications were performed as described previously (Comasio et al., 2019).
The PCR amplicons were analyzed by means of a polyacrylamide gel [polyacrylamide, 8% (v/v) in 1 × Trisacetate-ethylenediaminetetraacetic acid (EDTA) buffer; Bio-Rad, Hercules, California, USA] through DGGE, as described previously (Comasio et al., 2019). The gels were normalized by using ladders of known bacterial and fungal DNA. Therefore, a mixture of PCR products originating from pure cultures of the strains Lactobacillus amylovorus DCE 471, Companilactobacillus crustorum LMG 23699 (formerly known as Lactobacillus crustorum), Limosilactobacillus fermentum IMDO 130101 (formerly known as Lactobacillus fermentum), Levilactobacillus namurensis LMG 23584 (formerly known as Lactobacillus namurensis), Lacp. plantarum IMDO 130201, Latilactobacillus sakei IMDO CG1 (formerly known as Lactobacillus sakei), and Fl. sanfranciscensis IMDO 150101 (laboratory collection of the research group IMDO) was used to analyze the LAB community profiles. To analyze the fungal community profiles, a ladder was constructed based on PCR products from pure cultures of the strains S. cerevisiae DIV/07-125X, Candida glabrata DIV/07-076BZ, Wickerhamomyces anomalus DIV/07-076BY, and Kazachstania unispora DIV/07-125CR (laboratory collection of the research group IMDO). DNA bands of interest were excised from the gels with a sterile blade, resuspended in 30 µL of ultrapure water (MilliQ; Merck Millipore, Burlington, Massachusetts, USA), and incubated at 4 • C for 48 h for DNA elution. Five µL of these DNA solutions was used to reamplify the PCR products using the F357-518R primers for the bacteria and the NL1-LS2 primers for the yeasts (both with a M13-tag and no GC clamp; Integrated DNA Technologies, Leuven, Belgium). The M13-tag was used to increase the quality of the sequences of the short PCR products. The amplicons were purified with the Wizard R SV Gel and PCR Clean up system (Promega, Madison, Wisconsin, USA) and sequenced in a commercial facility by means of capillary sequencing technology (Macrogen; VIB Genetic Service Facility). A BLAST analysis was performed to evaluate the sequencing results and to determine the closest known relatives (type strains) of the partial sequences obtained in the NCBI non-redundant nt database (http://www.ncbi.nlm. nih.gov/BLAST/). Sequence identities of ≥98% were taken into account. Percentages of identity and accession numbers of hits (GenBank) are reported below.

Metagenetic Sequencing Total DNA extraction, PCR amplification, and DNA sequencing
The same total DNA extracted using the protocol described above was used for metagenetic sequencing. PCR amplifications of group-specific loci of both bacterial and fungal DNA sequences in the total DNA extracted were performed, as described before (De Bruyn et al., 2017;De Roos et al., 2018a). The hypervariable V4 region of the bacterial 16S rRNA gene was amplified using the primers F515 (5 ′ -TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG GTG TGC CAG CMG CCG CGG TAA-3 ′ ) and R806 (5 ′ -GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA G GG ACT ACH VGG GTW TCT AAT-3 ′ ) with an Illumina platform-specific 5 ′ -tag (underlined) (Caporaso et al., 2011). The fungal ITS1 region was amplified using the primers BITS (5 ′ -TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG ACC TGC GGA RGG ATC A-3 ′ ) and B58S3 (5 ′ -GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA G GA GAT CCR TTG YTR AAA GTT-3 ′ ) with an Illumina platform-specific 5 ′ -tag (underlined) (Bokulich and Mills, 2013). The PCR amplicons obtained were purified using the Wizard SV Gel and PCR Clean up system (Promega), eluted in 50 µL of nuclease-free water (Promega), and the primer dimers were removed using Agencourt AMPure XP PCR Purification magnetic beads (Beckman Coulter, Brea, California, USA), following the manufacturers' instructions. The amplicon size distribution was checked qualitatively by means of a 2100 Bioanalyzer instrument (Agilent Technologies, Santa Clara, California, USA). Finally, DNA concentrations were quantified using the fluorometric Qubit 2.0 quantitation assay (Thermo Fisher Scientific, Waltham, Massachusetts, USA). Next, the bacterial and fungal DNA template libraries of each sample were combined and sequenced under the same index (De Bruyn et al., 2017). Therefore, bacterial V4 and fungal ITS1 amplicons originating from the same sample were pooled equimolarly in a final volume of 30 µL and barcoded with the same index before being sequenced using an Illumina MiSeq platform (Illumina, San Diego, California, USA) in the interuniversity VUB-ULB sequencing facility BRIGHTcore (Jette, Belgium). All sequences generated were submitted to the European Nucleotide Archive of the European Bioinformatics Institute (ENA/EBI) under accession number PRJEB35796 (experiments ERX3762279-95 for V4 sequences and ERX3762296-312 for ITS1 sequences).

Bioinformatic analysis
A trimming of the sequences was performed using Cutadapt software 1.9.6. to remove the adapters (Martin, 2011). Both the bacterial and fungal diversities were processed with DADA2 software v1.6.0, following the pipeline described before (Callahan et al., 2017), yielding amplicon sequence variants (ASVs). The unique bacterial ASVs were aligned against the bacterial 16S rRNA SILVA database (http://www.arb-silva.de; version 128; Quast et al., 2013), whereas the fungal ASVs were aligned against the UNITE_ITS1 database (https://unite.ut.ee; version 6, sh 99; Kõljalg et al., 2005). The resulting sequences of the V4 and ITS1 regions were also aligned using the BLAST tool to determine the closest known relatives (type strains) of the partial sequences obtained in the NCBI non-redundant nt database (http://www. ncbi.nlm.nih.gov/BLAST/). Sequence identities of ≥98% were considered for species classification. When the ASVs resulted in an identical taxonomic assignment, the sequences were merged to determine the relative abundance.

Shotgun Metagenomic Sequencing
Shotgun metagenomic sequencing was performed on the four sourdough samples (E-B-W, E-B-M, E-B-R, and E-B-S) from bakery producer E only. High-quality DNA extraction and further processing, as well as taxonomic analysis, were carried out as described below (Laureys and De Vuyst, 2014;Verce et al., 2019).

DNA extraction for shotgun metagenomic analysis
For DNA extraction from these four sourdough samples, cell pellets obtained as described above (Section Sample Preparation) were resuspended in 500 µL of lysis buffer (sucrose, 80 g/L; EDTA, 50 mM; Tris-base, 50 mM; lysozyme, 20 mg/mL; pH 8.0), containing 10 µL of mutanolysin (12.5 U/mL), and βmercaptoethanol (30 mM). These mixtures were incubated at 37 • C for 45 min. Afterwards, lyticase (200 U; Sigma-Aldrich) and Zymolyase (15 U; G Biosciences, Saint-Louis, Missouri, USA) were added and the mixtures were incubated at 37 • C for another 45 min. Then, 0.2 g of acid-washed 0.2-mm glass beads, 50 µL of SDS solution (20%, m/v), and 50 µL of proteinase K solution (5 mg/mL) were added to the mixtures and the tubes were vortexed for 1 min, followed by an incubation at 56 • C for 45 min. One volume of phenol:chloroform:isoamylalcohol (49.5:49.5:1.0) was added to these suspensions, which were then mixed for 1 min and centrifuged at 6,000 × g for 15 min. The aqueous phase was transferred to 15-mL centrifuge tubes, after which one volume of AL buffer (Qiagen, Hilden, Germany) and one volume of absolute ethanol (VWR International, Darmstadt, Germany) were added. The mixtures were repeatedly applied to a DNeasy Blood and Tissue column in aliquots of 600 µL, until the whole mixtures were used, followed by DNA purification with the DNeasy Blood and Tissue Kit (Qiagen), and final elution in 200 µL of elution buffer. An RNase treatment was performed by the addition of 4 µL of RNase (Fermentas, St. Leon-Roth, Germany) to 180 µL of DNA solution and a second purification using the DNeasy Blood and Tissue Kit. The resulting DNA concentrations were measured as described above.

Preparation of libraries, shotgun metagenomic sequencing, and data preprocessing
The metagenomic DNA of the four extractions was processed as described previously (Verce et al., 2019). Briefly, they were enzymatically sheared to produce library fragments of the desired length, using an Ion Xpress Plus gDNA Fragment Library Preparation Kit (Thermo Fisher Scientific). First, the shearing time was optimized (5, 8, or 12 min) by following the shearing protocol of the manufacturer. After optimization of the shearing time, three shearing reactions were performed, using an Ion Xpress Plus Fragment Library Kit (Thermo Fisher Scientific), with 100 ng of DNA as input, and the sheared DNA was purified using an Agencourt AMPure XP Kit (Beckman Coulter) according to the manufacturers' instructions. Adapters were ligated to the fragments and the nicks were repaired, followed by another purification of the DNA fragments, using an Agencourt AMPure XP Kit (Beckman Coulter). The three shearing reaction products were pooled during the elution step to ensure a sufficient DNA library concentration. The unamplified DNA library was size-selected using an E-Gel SizeSelect 2.0% (m/v) agarose gel (Thermo Fisher Scientific) to produce library fragments of ∼400 bp. The size-selected library was qualified and quantified using a Bioanalyzer 2100 with a High Sensitivity DNA Kit (Agilent Technologies). As such, four 350-bp libraries were obtained for sequencing.
The size-selected libraries were used as template for emulsion PCR onto Ion Sphere Particles (ISPs) using an Ion PGM Template OT2 HiQ View and the Ion OneTouch 2 Instrument (Thermo Fisher Scientific). The template-positive ISPs were measured using the Ion Sphere Quality control kit and were enriched using the Ion PGM Enrichment Beads and an Ion OneTouch ES (Thermo Fisher Scientific). Template-positive ISPs were loaded on an Ion 316 Chip and sequencing was performed using the Ion Hi-Q Sequencing Kit on an Ion PGM (Thermo Fisher Scientific).
The four metagenomic sequence datasets were subjected to quality checks and quality trimming using FastQC v0.10.1 (Andrews, 2012) and PRINSEQ 0.20.2 (Schmieder and Edwards, 2011) with the appropriate settings. All four metagenomic data sets were submitted to the ENA/EBI under accession number PRJEB35796 (experiments ERX3762343-46).
The BLAST algorithm megablast was used to compare the metagenomic reads to sequences in the non-redundant nt database of NCBI (accessed February 2017). DIAMOND was used to compare the metagenomic reads to sequences in the NCBI non-redundant protein (nr) database (accessed September 2017). The outputs were parsed with MEGAN 6.7.11 (Huson et al., 2016), using the following settings: MinScore, 100 (50 for nr); MaxExpected, 0.01; TopPercent, 10.0; MinSupport, 0.01% of all reads; and lowest common ancestor (LCA) percentage, 100.
A database constructed from complete genomes of bacteria, archaea, and fungi, available in GenBank (accessed September 2017) was used for sequence classification with Kraken. A database consisting of protein sequences from the NCBI non-redundant nr database, including microbial eukaryotic sequences, was used for sequence classification with Kaiju.
To construct metagenomic recruitment plots for specieslevel taxonomic analysis, genera represented by more than 0.1% of all reads in any of the four metagenomes with any of the aforementioned methods were selected, together with the yeast genera detected with culture-dependent methods.
Of all species and subspecies of these genera, the genome sequences of the sequenced type strains were obtained from the NCBI RefSeq assembly database (Tatusova et al., 2014). If the genome sequence of the type strain of a species was not available, another representative for that species was chosen, preferably a strain with a complete genome. A BLAST search was performed using blastn, whereby the metagenomic sequence reads were used as query sequences and the genome sequences as database. The minimum identity threshold was set at 60%. Only the top hit for each sequence was retained, using 50 bp as the minimum length. The result was used as a basis for the metagenomic recruitment plotting, using R (R Core Team, 2017), RStudio (RStudio Team, 2015), and the R packages ggplot2 (Wickham, 2009), reshape2 (Wickham, 2017), and scales (Wickham, 2016).

Sample Preparation
To measure the concentrations of residual substrates and metabolites produced in the different sourdough samples, these samples were diluted 5 to 10 times in ultrapure water (MilliQ) and mixed by means of a rotator Stuart SB3 (Bibby Scientific, Stone, Staffordshire, UK) at 25 rpm for 20 min, before being centrifuged (4,600 × g for 20 min) to store the cell-free supernatants at −20 • C until further analysis.

Determination of Substrate and Metabolite Concentrations
Concentrations of erythritol, fructose, glucose, glycerol, maltose, mannitol, sorbitol, and sucrose were determined by high-performance anion exchange chromatography with pulsed amperometric detection (HPAEC-PAD); those of acetic acid, acetoin, diacetyl, ethanol, and ethyl acetate by gas chromatography with flame ionization detection (GC-FID); and those of L-lactic acid and D-lactic acid by high-performance liquid chromatography with ultraviolet detection (HPLC-UV), as described previously (Comasio et al., 2019).

Statistical Analysis
Intra-species diversity (alpha-diversity; based on the metagenetic analysis) of both bacteria and yeasts was evaluated by calculating the Simpson (diversity) and Pielou (evenness) indexes. A principal component analysis (PCA) was performed on the quantitative normalized data of the culture-dependent LAB, AAB, and yeast species identifications. A Spearman correlation matrix was calculated based on the microbial species diversity (metagenetic analysis of species with a relative abundance of > 5.0% in at least one of the samples) and metabolites of the sourdough samples. Analysis and visualization were performed using R (R Core Team, 2017), RStudio (RStudio Team, 2015), and the R packages corrplot (Wei and Simko, 2017), factoextra (Kassambara and Mundt, 2020), ggplot2 (Wickham, 2009), and Hmisc (Harrell, 2018).

Culture-Dependent Microbial Community Enumerations and Identifications
Enumerations The 17 sourdough samples from different origins contained presumptive LAB (mMRS-5) and yeasts (YPD); all Belgian sourdough samples, except for sourdough sample C-B-R, contained presumptive AAB (mDMS) too. Presumptive LAB represented the most abundant microbial group; their numbers varied between 8.7 log (CFU/g) (sourdough sample E-B-W) and 9.7 log (CFU/g) (D-F-S). The numbers of the presumptive AAB were highest in the samples from the sourdoughs made in the lambic brewery and varied between 6.2 log (CFU/g) (G-B-WL) and 7.9 log (CFU/g) (G-B-W). The numbers of the presumptive yeasts were more variable. Their counts ranged from 4.5 log (CFU/g) (A-UK-R) to 8.1 log (CFU/g) (G-B-W). The sourdoughs made in the lambic brewery also contained presumptive cycloheximide-resistant yeasts (YPDc), namely 7.6 log (CFU/g) (G-B-W) and 5.6 log (CFU/g) (G-B-WL).

AAB identifications
Species of AAB were isolated from the mMRS-5 agar media as well [Acetobacter cerevisiae (sourdough sample G-B-W), Acetobacter oryzifermentans (E-B-M), and Acetobacter senegalensis (E-B-M and E-B-S)] ( Figure 1A). In the seven sourdough samples that harbored presumptive AAB, based on mDMS agar plating, different AAB species were identified through (GTG) 5 -PCR fingerprinting and 16S rRNA gene and dnaK gene sequencing of colonies picked ( Figure 1B). Acetobacter cerevisiae was the only AAB species isolated from the two sourdough samples of bakery producer G. The same species (29.2%) was found in the sourdough sample B-B-R, together with Acetobacter fabarum (54.2%) and Komagataeibacter xylinus (4.2%). Acetobacter pasteurianus subsp. pasteurianus (75.0-83.3%) and A. oryzifermentans (4.2-50.0%) were the main AAB species isolated from the four sourdough samples of bakery producer E, whereas Acetobacter sicerae (12.5%) and A. senegalensis (4.2-12.5%) were present in all sourdough samples of this producer, except for sample E-B-R. In the latter case, Komagataeibacter sucrofermentans (20.8%) was isolated.

Yeast identifications
The yeast species diversity based on M13-PCR fingerprinting and ITS1/ITS2 sequencing of genomic DNA of colonies picked from the YPD agar media was limited to one to two species per sourdough ( Figure 1C). Saccharomyces cerevisiae (16.7-50.0%) and Kazachstania bulderi (50.0-83.3%) were isolated from three sourdough samples of bakery producer E (E-B-W, E-B-M, and E-B-S); the fourth sourdough sample (E-B-R) contained only K. bulderi. Saccharomyces cerevisiae was the only yeast species found in the sourdough samples from household producer A and bakery producer D and it co-occurred with Pichia fermentans (41.7%) in the sourdough sample C-B-R. Kazachstania unispora was found in sourdough sample B-B-R, whereas K. humilis was the only yeast species isolated from the three sourdough samples of bakery producer F. Dekkera anomala was the only yeast species found in the sourdough samples of lambic brewery producer G, both on the YPD and YPDc agar media.
In the case of producers D, E, F, and G, these identification data indicated that the microbial species diversity seemed to be independent of the flour used but was rather similar in sourdoughs coming from the same producer, as further confirmed through a PCA (Figure 2).

Yeast community identifications
The 26S rRNA-PCR-DGGE profiles of the yeast communities present in the sourdough samples analyzed confirmed the low species diversity that was shown culture-dependently (Figure 4). Pichia myanmarensis, a yeast species closely related to Wickerhamomyces anomalus, was present in the sourdough samples A-UK-R and G-B-WL. Saccharomyces cariocanus/paradoxus was also found in the former sourdough sample and was the only yeast species found in the sourdough samples from bakery producer D. The brewery sourdough samples G-B-W and G-B-WL harbored the species D. anomala (also found culture-dependently) and Dekkera bruxellensis. Kazachstania solicola, Kazachstania exigua/pseudohumilis, and K. humilis were present in sourdough samples from household producer B and bakery producers E and F, respectively, the latter species being found culture-dependently as well. From sourdough sample C-B-R, Pi. fermentans could be detected through 16S rRNA-PCR-DGGE analysis.
As for the culture-dependent analysis, the microbial species diversity seemed to be independent of the flour used in the case of the producers D, E, F, and G, but was rather similar in sourdough samples coming from the same producer.

Bacterial species diversity
Amplicon sequencing applied on total DNA extracted from all sourdough samples analyzed generally resulted in a greater LAB and AAB species diversity (Tables 2, 3) compared to culture-dependent plating/identification and rRNA-PCR-DGGE community profiling ( Table 4). The highest intra-species diversity, based on the Simpson and Pielou indexes, was found in the samples of sourdoughs from household producer A, bakery producer C, and lambic brewery producer G ( Table 2). For instance, concerning LAB, ASVs belonging to Latilactobacillus graminis/fuchuensis (formerly known as Lactobacillus  graminis/fuchuensis), the Companilactobacillus genus (formerly the Lb. alimentarius group), and Fl. sanfranciscensis were found in the sourdough sample A-UK-R, next to those found culture-dependently and through 16S rRNA-PCR-DGGE analysis. The latter species was also found in the sourdough samples of bakery producer F and as the sole LAB species in those of bakery producer D, as shown culture-dependently and through 16S rRNA-PCR-DGGE analysis. It represented a minority of 7.0% in the sourdough sample C-B-R and <1.0% in the sourdough samples B-B-R, E-B-W, E-B-M, E-B-S, and G-B-W. In the sourdough samples of lambic brewery producer G, mainly Lacc. paracasei/casei (as shown culturedependently and through 16S rRNA-PCR-DGGE analysis), as well as Latl. sakei/curvatus, were found. In the rye sourdough sample B-B-R, mainly the Companilactobacillus genus and Levl. brevis were present, as shown culture-dependently. The Companilactobacillus genus was also present in all sourdough samples from bakery producer E, as shown culture-dependently and through 16S rRNA-PCR-DGGE analysis, followed by Levl. parabrevis/hammesii (low relative abundance in sourdough sample E-B-W). Reads belonging to the Lentilactobacillus genus (formerly the Lactobacillus buchneri group) were present in the sourdough samples from bakery producer E too (most in sourdough sample E-B-S). Sourdough sample C-B-R mainly harbored Levl. parabrevis/hammesii, followed by Levl. brevis and P. parvulus, as shown culture-dependently and through 16S rRNA-PCR-DGGE analysis.
Different AAB species were found in the Belgian sourdough samples, except for the C-B-R sourdough sample. The brewery sourdoughs harbored Acetobacter malorum/cerevisiae and Acetobacter lambici/okinawensis/indonesiensis. The four sourdough samples of bakery producer E contained mainly Acetobacter pasteurianus/ghanensis/oryzifermentans, as shown culture-dependently. Also Komagataeibacter was present in some sourdough samples, but with low and decreasing relative abundances in sourdough samples B-B-R, E-B-M, E-B-W, and E-B-R. Low relative abundances of ASVs belonging to the genera Asaia, Bacillus, Clostridium, Gluconobacter, Pantoea, and Pseudomonas were found as well, especially in the sourdough samples G-B-W and G-B-WL.
Simpson and Pielou indexes), which contained different yeast species (in particular K. humilis, which was also present in the other sourdough samples of bakery producer F, as shown culturedependently and through 26S rRNA-PCR-DGGE analysis) as well as filamentous fungi (Tables 2, 5, 6). Filamentous fungi were also present in some other sourdough samples (B-B-R, C-B-R, D-F-B, E-B-W, and E-B-R). In sourdough sample A-UK-R, a high relative abundance of W. anomalus, next to S. cerevisiae (shown culture-dependently), was found. The producer D sourdoughs mainly contained S. cerevisiae, as shown culturedependently. ASVs belonging to D. anomala (shown culturedependently and through 26S rRNA-PCR-DGGE analysis) and Pichia membranifaciens were mainly present in the sourdough samples of lambic brewery producer G. Kazachstania unispora was unique for sourdough sample B-B-R, as shown culturedependently. The sourdough samples from bakery producer E contained mainly K. bulderi and S. cerevisiae (lower relative abundance), as shown culture-dependently for the respective sourdough samples. Sourdough sample C-B-R mainly harbored Pi. fermentans, confirming the culture-dependent and 26S rRNA-PCR-DGGE data.
Spearman correlation analysis showed that Fl. sanfranciscensis negatively correlated with other LAB and AAB species but did positively correlate with S. cerevisiae and K. humilis (Figure 5).

Shotgun Metagenomic Sequencing
The sequencing of the four metagenomic libraries derived from the sourdough samples of bakery producer E resulted in four datasets with a combined size of 4.28 Gbp of metagenomic sequence data after quality checks and trimming. The lower and upper limits of the read lengths were 25 and 372 bp, respectively; the median read length was 283 bp.
Taxonomic analysis of the four metagenomic sequence datasets assigned the reads to the family Lactobacillaceae (up to circa 70% of the total reads using Kraken and Kaiju), and the genus Acetobacter (up to circa 15% using Kraken and Kaiju) and Komagataeibacter (circa 1-2% with all tools applied) as main bacterial taxa. Less than 1% of the total reads were assigned to yeast genera using these taxonomy profiling tools. A low number of reads (<3%) were assigned to fruit flies (Drosophila) and nematodes (Trichinella) using BLAST and DIAMOND, respectively. Also, depending on the flour used, less than 10 or 2% of the reads were assigned to cereals (Secale and Triticum) using BLAST and DIAMOND, respectively.
The metagenomic recruitment plots of the reads of the four datasets showed a high inter-and intra-species diversity (Figure 6 and Supplementary Table 1). However, sourdough sample E-B-R carried fewer microbial species compared to the three other samples of bakery producer E. For the four bakery E sourdough samples, many reads were assigned to the LAB species Coml. paralimentarius (42.9,9.0,33.2,and 24.4% of the total reads for E-B-W, E-B-M, E-B-R, and E-B-S, respectively), Coml. crustorum (3.8, 1.1, 0.6, and 1.7%, respectively), Levl. parabrevis (2.4, 25.9, 5.8, and 18.2%, respectively), Lacp. xiangfangensis (1.4, 11.0, 1.1, and 4.0%, respectively), and Lentilactobacillus kefiri (formerly known as The sample codes are as described in Table 1. The Simpson (1-D) and Pielou (Je) indexes were calculated to measure the sourdough alpha diversity and evenness, respectively.
Different Pantoea and Pseudomonas spp. were also found, albeit at very low read numbers, with Pantoea agglomerans (0.07, 0.03, 0.22, and 0.11% of the total reads for sourdough samples E- B-W, E-B-M, E-B-R, and E-B-S, respectively) and Pseudomonas poae (0.01, < 0.01, 0.03, and 0.01%, respectively) as the most prevalent species of these genera.
A yeast species related to Kazachstania turicensis (0.1, 0.6, 0.4, and 0.3% of the total reads for sourdough samples E-B-W, E-B-M, E-B-R, and E-B-S, respectively) was present in the four sourdough samples, albeit representing low read numbers. Instead, culture-dependent analysis retrieved K. bulderi as the most prevalent yeast species, but due to a lack of its complete genome in the NCBI database, it could not be included during the metagenomic recruitment plotting analysis. Saccharomyces cerevisiae (0.02, 0.01, and 0.01% for sourdough samples E-B-W, E-B-M, and E-B-S, respectively) was found in three of the four sourdough samples. It was not found in sourdough sample E-B-R, confirming the culture-dependent and metagenetic analyses. Reads belonging to a species related to the arbuscular mycorrhizal fungus Rhizophagus irregularis were present in the four sourdough samples as well.

Substrate and Metabolite Profiles
No residual carbohydrates (glucose, fructose, sucrose, or maltose) were found in the sourdough samples from household producer The highest concentrations of glycerol and mannitol were found in sourdough sample F-NY-R (47.91 ± 2.28 mM and 43.39 ± 6.56 mM, respectively). Based on a Spearman correlation analysis, these two sugar alcohols positively correlated with the presence of S. cerevisiae and/or Fl. sanfranciscensis (Figure 5). High concentrations of glycerol were also found in the sourdough samples of household producer A and bakery producers C and D, lower concentrations in those of household producer B and bakery producer E as well as in sourdough samples F-NY-WR and G-B-WL, and glycerol was absent in sourdough samples F-NY-WW and G-B-W (Figure 7B). High concentrations of 3 | Relative abundance (%) of bacterial amplicon sequence variants (ASVs) obtained through metagenetic analysis of 17 sourdough samples from backslopped sourdoughs of different origins. The sample codes are as described in Table 1. ASVs with an occurrence under 0.1 % were grouped as "Others." The following abbreviations are used: Fl., Fructilactobacillus; Ff., Furfurilactobacillus; Lacc., Lacticaseibacillus; Lacp., Lactiplantibacillus; Latl., Latilactobacillus; Lenl., Lentilactobacillus; Levl., Levilactobacillus; Loil., Loigolactobacillus; P., Pediococcus; A., Acetobacter; G., Gluconobacter. The intensity of the blue color is based on the percentage of reads assigned to a specific species/genus. Frontiers in Microbiology | www.frontiersin.org TABLE 4 | Comparison of the culture-dependent and culture-independent identifications of lactic acid bacterial and acetic acid bacterial genera in 17 sourdough samples from backslopped sourdoughs of different origins, indicated as presence (+) or absence (-) through colony identification, rRNA-PCR-DGGE, metagenetics, and metagenomics (only producer E), respectively. The sample codes are as described in Table 1. When a genus was identified in a sample by at least one of the different methods, a green color was used.

Companilactobacillus
Frontiers in Microbiology | www.frontiersin.org  The sample codes are as described in Table 1. ASVs with an occurrence below 0.25% (0.5% for F-NY-WW) were grouped as "Others". ASVs of non-yeast fungal communities were grouped together. The intensity of the blue color is based on the percentage of reads assigned to a specific species/genus. Frontiers in Microbiology | www.frontiersin.org TABLE 6 | Comparison of the culture-dependent and culture-independent identifications of yeast genera in 17 sourdough samples from backslopped sourdoughs of different origins, indicated as presence (+) or absence (-) through colony identification, rRNA- PCR-DGGE, metagenetics, and metagenomics (only producer E). of the house microbiota (Minervini et al., 2015) or the process conditions applied . For instance, the sourdoughs of the French artisan bakery producer D, although prepared from different flours, harbored only Fl. sanfranciscensis and S. cerevisiae, those of the Belgian artisan bakery producer E mainly Coml. paralimentarius, Lacp. xiangfangensis, Levl. brevis, S. cerevisiae, K. bulderi, and several AAB species, those of the American artisan bakery producer F mainly Fl. sanfranciscensis and K. humilis, and those of the Belgian artisan lambic brewery producer G mainly Lacc. paracasei, D. anomala, and Acetobacter species. It has been shown before that French sourdoughs often contain Fl. sanfranciscensis (Ferchichi et al., 2007(Ferchichi et al., , 2008; Robert  , 2009;Lhomme et al., 2014Lhomme et al., , 2015aLhomme et al., ,b, 2016Michel et al., 2016) and that artisan sourdoughs may contain either a restricted or a wide microbial consortium, whether or not depending on the house microbiota (Scheirlinck et al., 2009;Minervini et al., 2015). The presence of D. anomala and diverse AAB species in the sourdoughs of the artisan lambic brewery producer G most likely originated from the lambic beer or brewery environment (Spitaels et al., 2014a(Spitaels et al., ,b,c, 2015(Spitaels et al., , 2017De Vuyst, 2018, 2019;De Roos et al., 2018a).
Levilactobacillus brevis and Lacp. plantarum were detected culture-dependently and/or culture-independently in all household sourdoughs examined, showing a possible adaptation of this species to the household environment, as it has been found in household sourdoughs of different origins before (Gänzle  Table 1. and Zheng, 2019). The backslopped household sourdough of United Kingdom origin harbored not only other different sourdough-common microorganisms, such as S. cerevisiae, but also the sourdough-uncommon LAB species Lacc. paracasei subsp. tolerans/paracasei. Moreover, Lacc. paracasei subsp. paracasei was prevalent in the lambic brewery sourdoughs. However, this LAB species is not typical for sourdough nor for lambic beer production De Roos and De Vuyst, 2019). Its active growth was likely reflected in the high percentage of the L-lactate enantiomer, which is typical for this dairy LAB species (Maragkoudakis et al., 2006;Moon et al., 2012). In general, differences in the lactic acid enantiomer ratio depended on the LAB species present in the sourdoughs (Dicks and Endo, 2009). For instance, the growth of Coml. crustorum is also responsible for a high production of the L-lactate enantiomer (Scheirlinck et al., 2007;Comasio et al., 2019). Also the occurrence of Fl. sanfranciscensis is responsible for a high production of L-lactic acid (Maruyama and Okada, 2006).
Species of the Levilactobacillus genus (formerly the Lactobacillus brevis group) were isolated from household sourdoughs solely, although they commonly occur in backslopped sourdoughs . One of the Belgian sourdoughs from household origin harbored the sourdough-uncommon P. parvulus (likely responsible for the high production of D-lactic acid) and Pi. fermentans as the prevailing LAB and yeast species, respectively. Both microbial species are usually associated with other fermented food matrices. Although Pi. fermentans has been isolated once from an artisan bakery sourdough, this yeast species is mainly found in cheese and as contaminant in fruit juices (Arias et al., 2002;Succi et al., 2003;Kurtzman et al., 2011). In contrast, P. parvulus has never been found in sourdoughs; this LAB species is mainly retrieved as contaminant from minimally processed vegetables and fermented beverages, such as cider and wine (Davis et al., 1986;Fernández et al., 1996;Bennik et al., 1997).
AAB species were found only in the Belgian sourdoughs. This may be associated with cross-contamination in the production places examined, as the artisan bakery involved uses a lot of fruits in the production room and the lambic brewery obviously houses AAB De Vuyst, 2018, 2019). Also, fruit flies may be more present in the bakery and act as carrier of AAB. Fruits are a natural habitat of several AAB species (Matsushita et al., 2016). Lambic beer is a source of AAB (Spitaels et al., 2014a(Spitaels et al., ,b,c, 2015(Spitaels et al., , 2017De Vuyst, 2018, 2019;De Roos et al., 2018a). Moreover, the lowest concentrations of ethanol and acetic acid were present in the brewery sourdough with the highest number of AAB, whose species not only grew but also oxidized ethanol to acetic acid that was further over-oxidized to carbon dioxide and water. Acetobacter species can oxidize acetic acid through the tricarboxylic acid cycle, but only when ethanol is completely depleted (De Roos and De Vuyst, 2019). Also, glycerol (that was absent in this brewery sourdough) can be oxidized by AAB to dihydroxyacetone phosphate (when ethanol is depleted), which can be further converted into acetic acid and carbon dioxide via the Embden-Meyerhof-Parnas pathway and gluconeogenesis (Mamlouk and Gullo, 2013;Komagata et al., 2014;Matsushita et al., 2016). Further, D. anomala was the prevailing yeast species in the lambic brewery sourdoughs, at least as shown culture-dependently, whereas 26S rRNA-PCR-DGGE community profiling identified D. bruxellensis, indicating survival and activity of these lambic beer yeasts in the lambic brewery sourdoughs. Both Dekkera species are indeed typical for the maturation of lambic beer (De Roos and De Vuyst, 2019). Dekkera bruxellensis was detected before through 26S rRNA-PCR-DGGE community profiling in a rye sourdough made with a mixed starter culture, encompassing different LAB species, K. humilis and S. cerevisiae, even after several backslopping steps (Meroth et al., 2003). Dekkera anomala was never found in a spontaneously fermented sourdough up to now. The present study showed its possible adaptation to a sourdough matrix, at least under the environmental and fermentation conditions applied. Although not monitored, D. anomala has been used as non-conventional yeast in dough making, causing a low carbon dioxide production because of a slow metabolism or its non-survival (Aslankoohi et al., 2016). Finally, Pi. membranifaciens could be retrieved only by ampliconbased metagenetic sequencing, although it is also present during lambic beer maturation (Spitaels et al., 2014c).
The sourdough-specific LAB species Fl. sanfranciscensis occurred in the French and American sourdoughs analyzed, in association with S. cerevisiae and K. humilis, respectively. This strictly heterofermentative LAB species uses fructose as alternative external electron acceptor, converting it into mannitol, as reflected in the high concentrations of the latter sugar alcohol found in those sourdoughs (Vogel et al., 2011). Its occurrence may be linked with the presence of insects, as it has been isolated from the insect gut and insect frass that may contaminate stored cereals (Ripari et al., 2016b;Boiocchi et al., 2017), provided the right technological conditions are applied for sourdough making, such as fast backslopping procedures, ambient temperature, and moderate acidic pH . Moreover, Fl. sanfranciscensis is able to survive in sourdoughs made from different flours, as it has been shown in, not only wheat, rye, buckwheat, and teff sourdoughs , but also in a gluten-free sourdough (prepared from a mixture of buckwheat, rice, and whole-meal rice flours) that was initiated from a wheat sourdough (Lhomme et al., 2014). The sourdoughs of the present study that were dominated by Fl. sanfranciscensis displayed lower bacterial diversities than the other sourdoughs examined, confirming meta-analysis data on the occurrence of this sourdough-specific LAB species . Furthermore, in sourdoughs dominated by Fl. sanfranciscensis, the yeasts K. humilis or S. cerevisiae were prevalent, confirming the stable association between this maltose-positive LAB species and the maltose-negative K. humilis as well the possible co-existence of both the maltose-positive Fl. sanfranciscensis and the maltose-positive S. cerevisiae , 2016Vigentini et al., 2014;Van Kerrebroeck et al., 2017).
The sourdoughs of the Belgian artisan bakery displayed a rather wide LAB, AAB, and yeast species diversity, which may be due to cross-contamination, as mentioned above for the AAB species. However, the sourdough based on rye differed from these other bakery sourdoughs in that S. cerevisiae and some LAB and AAB species as well as residual carbohydrates were absent, whereas the acetic acid concentration was the highest. This may indicate an influence of the rye flour used.
Finally, the present study showed that different techniques used to unravel the microbial composition of different sourdough samples may help to find out their actual taxonomic structure, as they can be both confirmative and complementary. For instance, the use of both rRNA-PCR-DGGE community profiling and metagenetics to analyze the microbial composition of sourdough samples showed that the latter technique detected more species than the former one and microbiological plating, as subdominant species can be detected, sometimes quantitatively (Ercolini, 2013;Viiard et al., 2016). Moreover, whereas AAB species can often hardly be distinguished through rRNA-PCR-DGGE community profiling (Papalexandratou et al., 2011), different AAB species could be identified through metagenetics, in most cases confirming the culture-dependent analysis. However, due to the limited lengths of the DNA fragments amplified as well as the specificity of the DNA region targeted for amplification, species identification was not always possible. Indeed, multiple species were often found after BLAST analysis of DNA fragments obtained through rRNA-PCR-DGGE analysis, making it difficult to distinguish closely related LAB species (e.g., Coml. paralimentarius and Coml. alimentarius, P. parvulus and P. inopinatus). Also, 26S rRNA-PCR-DGGE community profiling based on amplification of the D1 region of the fungal 26S rRNA gene, colony identification based on amplification of the ITS region of the fungal rRNA transcribed unit, and metagenetics based on the ITS1 region could not uniformly identify S. cerevisiae or S. cariocanus due to their close relatedness (Kurtzman et al., 2011). Further, metagenetics and metagenomics unraveled the presence of more or different species, as these techniques may include subdominant background microorganisms and non-cultured microorganisms, thanks to the use of different taxonomy analysis tools. To the best of our knowledge, this was the first study that applied shotgun metagenomics on sourdough samples. Of particular help was the taxonomical analysis of the metagenomic sequence data, based on metagenomic recruitment plotting of the whole genome of a particular species, which allows also the detection of species that were not found with the other techniques (e.g., Lenl. kefiri in the four Belgian artisan bakery sourdoughs) and even new species, such as Oenococcus sicerae in water kefir (Verce et al., 2019(Verce et al., , 2020). Yet, the finding of A. tropicalis through metagenomic recruitment plotting was not in accordance with the culture-dependently isolated and identified A. senegalensis (both 16S rRNA and dnaK gene sequencing). However, both species differ only 4 and 10 nucleotides in their 16S rRNA and dnaK genes, respectively (this study; Li et al., 2014). Hence, the present study showed not only the complementarity of the combined use of culture-dependent and culture-independent methods regarding species detection and identification, which has been commonly employed since more than 20 years, but also the provision of both qualitative and quantitative data regarding species distribution. Moreover, shotgun metagenomic sequencing gives a deeper insight into the cultivable minor and non-cultivable species diversity, in particular when metagenomic recruitment plotting is applied. Yet, amplicon-based sequencing approaches are straightforward when an ASV instead of an operational taxonomic unit (OTU) approach is applied, as the latter is not always able to detect down to species level (Callahan et al., 2017). Furthermore, relative to metagenetic and DGGE analyses, shotgun metagenomic sequencing is of utmost importance for further inclusion in multiphasic approaches.
In conclusion, the present study showed that the producers of the sourdoughs were the main factors to drive the species diversity, which was mainly reflected in their house microbiota and likely in the process parameters applied, although different relative abundances were found with different flours used by producers that made use of different flours. The combined use of different culture-independent approaches showed the advantage of metagenomics to study sourdough ecosystems. Moreover, the decreasing prices for sequencing and the growing availability of bioinformatics tools and databanks further merits the incorporation of shotgun metagenomics into groundbreaking research on the microbial composition of (spontaneously) fermented foods and beverages.

DATA AVAILABILITY STATEMENT
The datasets generated for this study can be found in the ENA/EBI: PRJEB35796.

AUTHOR CONTRIBUTIONS
AC contributed to the experimental work, the acquisition, processing, interpretation of the data, and drafting of the manuscript. MV contributed in the metagenomic analysis. SV reviewed and edited the manuscript. LD supervised the work, interpreted the data, reviewed, and edited the manuscript. All authors contributed to the article and approved the submitted version.

FUNDING
This work was financially supported by the Research Council of the Vrije Universiteit Brussel (SRP7 and IOF342 projects), the Hercules Foundation (projects UABR09004 and UAB13002), and the Flanders' FOOD project Innocereal II. The funder bodies were not involved in the study design, collection, analysis, interpretation of data, the writing of this article or the decision to submit it for publication.

ACKNOWLEDGMENTS
AC was the recipient of a Ph.D. fellowship of the Vrije Universiteit Brussel. MV was the recipient of a Ph.D. fellowship of the Research Foundation Flanders (FWO). We thank Wim Borremans for providing technical assistance with the metabolite target analysis.