<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" article-type="research-article" dtd-version="2.3">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Microbiol.</journal-id>
<journal-title>Frontiers in Microbiology</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Microbiol.</abbrev-journal-title>
<issn pub-type="epub">1664-302X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fmicb.2021.720513</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Microbiology</subject>
<subj-group>
<subject>Technology and Code</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>DAnIEL: A User-Friendly Web Server for Fungal ITS Amplicon Sequencing Data</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Loos</surname>
<given-names>Daniel</given-names>
</name>
<xref rid="aff1" ref-type="aff"><sup>1</sup></xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Zhang</surname>
<given-names>Lu</given-names>
</name>
<xref rid="aff1" ref-type="aff"><sup>1</sup></xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Beemelmanns</surname>
<given-names>Christine</given-names>
</name>
<xref rid="aff2" ref-type="aff"><sup>2</sup></xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Kurzai</surname>
<given-names>Oliver</given-names>
</name>
<xref rid="aff3" ref-type="aff"><sup>3</sup></xref>
<xref rid="aff4" ref-type="aff"><sup>4</sup></xref>
<uri xlink:href="https://loop.frontiersin.org/people/1374404/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Panagiotou</surname>
<given-names>Gianni</given-names>
</name>
<xref rid="aff1" ref-type="aff"><sup>1</sup></xref>
<xref rid="aff5" ref-type="aff"><sup>5</sup></xref>
<xref rid="c001" ref-type="corresp"><sup>&#x002A;</sup></xref>
<uri xlink:href="https://loop.frontiersin.org/people/1361790/overview"/>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>Systems Biology and Bioinformatics Group, Leibniz Institute for Natural Product Research and Infection Biology</institution>, <addr-line>Jena</addr-line>, <country>Germany</country></aff>
<aff id="aff2"><sup>2</sup><institution>Chemical Biology of Microbe-Host Interactions Group, Leibniz Institute for Natural Product Research and Infection Biology</institution>, <addr-line>Jena</addr-line>, <country>Germany</country></aff>
<aff id="aff3"><sup>3</sup><institution>Institute for Hygiene and Microbiology, University of W&#x00FC;rzburg</institution>, <addr-line>W&#x00FC;rzburg</addr-line>, <country>Germany</country></aff>
<aff id="aff4"><sup>4</sup><institution>National Reference Center for Invasive Fungal Infections NRZMyk, Leibniz Institute for Natural Product Research and Infection Biology</institution>, <addr-line>Jena</addr-line>, <country>Germany</country></aff>
<aff id="aff5"><sup>5</sup><institution>Systems Biology and Bioinformatics Group, School of Biological Sciences, Faculty of Science, The University of Hong Kong</institution>, <addr-line>Pokfulam, China</addr-line></aff>
<author-notes>
<fn id="fn1" fn-type="edited-by"><p>Edited by: Jana Seifert, University of Hohenheim, Germany</p></fn>
<fn id="fn2" fn-type="edited-by"><p>Reviewed by: Dominik Heider, University of Marburg, Germany; Marcus H. Y. Leung, City University of Hong Kong, SAR China</p></fn>
<corresp id="c001">&#x002A;Correspondence: Gianni Panagiotou, <email>gianni.panagiotou@leibniz-hki.de</email></corresp>
<fn id="fn3" fn-type="other"><p>This article was submitted to Systems Microbiology, a section of the journal Frontiers in Microbiology</p></fn>
</author-notes>
<pub-date pub-type="epub">
<day>17</day>
<month>08</month>
<year>2021</year>
</pub-date>
<pub-date pub-type="collection">
<year>2021</year>
</pub-date>
<volume>12</volume>
<elocation-id>720513</elocation-id>
<history>
<date date-type="received">
<day>04</day>
<month>06</month>
<year>2021</year>
</date>
<date date-type="accepted">
<day>26</day>
<month>07</month>
<year>2021</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x00A9; 2021 Loos, Zhang, Beemelmanns, Kurzai and Panagiotou.</copyright-statement>
<copyright-year>2021</copyright-year>
<copyright-holder>Loos, Zhang, Beemelmanns, Kurzai and Panagiotou</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p>
</license>
</permissions>
<abstract>
<p>Trillions of microbes representing all kingdoms of life reside in and on humans and hold essential roles in host development and physiology. Over the last decade, more than a dozen online tools and servers, accessible <italic>via</italic> the public domain, have been developed for the analysis of bacterial sequences; however, the analysis of fungi is still in its infancy. Here, we present a web server dedicated to the comprehensive analysis of the human mycobiome for (i) translating raw sequencing reads to data tables and high-standard figures, (ii) integrating statistical analysis and machine learning with a manually curated relational database and (iii) comparing the user&#x2019;s uploaded datasets with publicly available datasets from the Sequence Read Archive. Using 1,266 publicly available internal transcribed spacer (ITS) samples, we demonstrate the utility of the DAnIEL web server on large-scale datasets and show the differences in fungal communities between human skin and soil sites.</p>
</abstract>
<kwd-group>
<kwd>metagenomics</kwd>
<kwd>fungi</kwd>
<kwd>mycobiome</kwd>
<kwd>web server</kwd>
<kwd>ITS</kwd>
</kwd-group>
<contract-sponsor id="cn1">Deutsche Forschungsgemeinschaft (DFG)<named-content content-type="fundref-id">10.13039/501100001659</named-content>
</contract-sponsor>
<counts>
<fig-count count="3"/>
<table-count count="1"/>
<equation-count count="0"/>
<ref-count count="39"/>
<page-count count="8"/>
<word-count count="5023"/>
</counts>
</article-meta>
</front>
<body>
<sec id="sec1" sec-type="intro">
<title>Introduction</title>
<p>Metagenomics provides a comprehensive view of microbial community structure. Previous studies have revealed many insights into the diversity, composition, and interaction patterns of bacterial communities. Fungi are a neglected kingdom despite the important role they play in many human diseases (<xref ref-type="bibr" rid="ref24">Mukherjee et al., 2014</xref>). The number of publications in PubMed related to the mycobiome is growing exponentially and has increased more than 17-fold in the past 5 years. Fungal metagenomics is becoming an essential part of comprehensive human host studies and should be accessible to the whole scientific community without the need for laborious and time-consuming efforts. We present DAnIEL (Describing, Analyzing and Integrating fungal Ecology to effectively study the systems of Life), to the best of our knowledge the only web server that covers the whole workflow of ITS analysis from raw reads to publication-ready figures and tables, contains a relational database for the biological evaluation of statistical findings and allows comparative analysis with publicly available mycobiome datasets. For all steps, a summary of methods and results, including citations, is provided, and tailor-made interactive plots can be created. The web server is optimised to account for the properties of typical ITS datasets, such as high sparsity of the abundance profile and amplicon length variability. Although the web server can be used with ITS samples from all kinds of environments, we started to build the manually curated database with fungal species relevant for humans; however, many of these species are found in other environmental niches as well. DAnIEL is freely available at <ext-link xlink:href="https://sbi.hki-jena.de/daniel" ext-link-type="uri">https://sbi.hki-jena.de/daniel</ext-link>.</p>
</sec>
<sec id="sec2">
<title>Design and Implementation</title>
<sec id="sec3">
<title>Overview</title>
<p>The workflow is illustrated in <xref rid="fig1" ref-type="fig">Figure 1</xref>. Raw reads can be uploaded in compressed FASTQ format. Optionally, read runs from the NCBI Sequence Read Archive (SRA) can be added either by selecting from the 700 existing cohorts of the DAnIEL database or by entering their accessions directly. Sample metadata can be uploaded in CSV or Excel format if statistical analysis is needed. Parameter sets for tweaking the workflow can be created, e.g., to filter features by abundance or to trim custom primer sequences. Comprehensive documentation of these parameters and a tutorial are available on the DAnIEL web server. To facilitate biological insights into significant features, we constructed a manually curated database containing 1,669 fungal interactions with diseases, bacterial species and immune components retrieved from 761 published papers. This database is used by the web server for the biological interpretation of taxa that are significantly differentially abundant or significantly correlated. Furthermore, we incorporated a list of clinical samples of species involved in fungal infections from the German National Reference Center for Invasive Fungal Infections (NRZMyk).</p>
<fig position="float" id="fig1">
<label>Figure 1</label>
<caption><p>Overview of the workflow of the DAnIEL web server. Methods and tools are shown in dark and light blue, respectively. Databases are shown in red. Feature generation and analysis are part of the back end. Taxa are augmented using our relational database of clinical samples (DAnIEL clinical) and interactions reported in the literature (DAnIEL interact).</p></caption>
<graphic xlink:href="fmicb-12-720513-g001.tif"/>
</fig>
</sec>
<sec id="sec4">
<title>Feature Generation</title>
<p>Features are generated from the raw reads provided. Samples are demultiplexed, if necessary, according to the barcode mapping provided in the metadata table. External samples are downloaded from the NCBI Sequence Read Archive using grabseqs (<xref ref-type="bibr" rid="ref32">Taylor et al., 2020</xref>). Quality control (QC) is performed afterwards. FastQC and MultiQC are used to monitor sequencing errors (<xref ref-type="bibr" rid="ref10">Ewels et al., 2016</xref>). Cutadapt is used to trim primer and adapter sequences (<xref ref-type="bibr" rid="ref20">Martin, 2011</xref>). Samples can be excluded from downstream analysis using various criteria, such as a minimum number of quality-controlled reads or the base quality tests of FastQC. Representative biological sequences are created from quality-controlled reads <italic>via</italic> denoising. Either operational taxonomic units (OTUs) or amplicon sequence variants (ASVs) can be called using PIPITS (<xref ref-type="bibr" rid="ref14">Gweon et al., 2015</xref>) or DADA2 (<xref ref-type="bibr" rid="ref5">Callahan et al., 2016</xref>), respectively. Taxonomy is assigned to the denoised sequences using either the Naive Bayes or the BLAST consensus approach of QIIME2 (<xref ref-type="bibr" rid="ref3">Bolyen et al., 2019</xref>). Abundance counts are pooled at any given taxonomic rank and filtered by abundance and prevalence. Lastly, pooled counts are normalised using methods that account for differing library sizes, such as rarefaction or cumulative sum scaling (CSS), as implemented in the R packages vegan and metagenomeSeq, respectively (<xref ref-type="bibr" rid="ref7">Dixon, 2003</xref>; <xref ref-type="bibr" rid="ref27">Paulson et al., 2013</xref>). Centered log-ratio (CLR) normalisation is used by default to account for the compositionality of the data. The generated features are used in downstream analysis to infer biological insights.</p>
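<p>The CLR transformation mentioned above can be illustrated with a short Python sketch. It is not the code used by DAnIEL, which relies on R packages; the function name <monospace>clr_transform</monospace> and the pseudocount of 0.5 added to handle zero counts are assumptions made purely for illustration.</p>
<preformat><![CDATA[
```python
import math


def clr_transform(counts, pseudocount=0.5):
    """Centered log-ratio transform of the taxon counts of one sample.

    The pseudocount is an illustrative assumption to avoid log(0) in
    sparse ITS profiles; other zero-handling strategies exist.
    """
    if not counts:
        raise ValueError("counts must not be empty")
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean
    # Subtracting the mean log expresses each taxon relative to the
    # geometric mean of the sample, independent of library size.
    return [value - mean_log for value in logs]


# Hypothetical genus-level counts of a single sample
sample = [120, 0, 30, 0, 850]
clr_values = clr_transform(sample)
```
]]></preformat>
<p>By construction, the transformed values of each sample sum to zero, so downstream statistics operate on abundances relative to the sample-wise geometric mean rather than on raw counts.</p>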
</sec>
<sec id="sec5">
<title>Relational Database Generation</title>
<p>DAnIEL was initially run on three cohorts to retrieve a list of fungal species relevant for the analysis of human samples: faecal samples from mycobiome datasets of cancer patients (<italic>N</italic>=71, ITS2, PRJEB33756; <xref ref-type="bibr" rid="ref23">Mirhakkak et al., 2021</xref>) and of an antibiotics intervention (<italic>N</italic>=59, ITS2, PRJNA579284; <xref ref-type="bibr" rid="ref31">Seelbinder et al., 2020</xref>), and human skin swab samples (<italic>N</italic>=203, ITS1, PRJNA286273; <xref ref-type="bibr" rid="ref19">Leung et al., 2016</xref>). For each species found, we constructed an NCBI Entrez query to search for PubMed abstracts. The terms &#x201C;disease&#x201D;, &#x201C;cytokine&#x201D;, &#x201C;immune system&#x201D; and &#x201C;prokaryote&#x201D; and a limit of 20 papers per species were used to narrow down the focus of our subsequent manual curation. In total, 1,337 abstracts from these papers were reviewed to create a manually curated database of fungal interactions. Medical Subject Headings (MeSH) were used for annotations whenever applicable. In addition, FUNGuild was integrated to provide information about trophic modes in an ecological context (<xref ref-type="bibr" rid="ref25">Nguyen et al., 2016</xref>).</p>
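<p>A per-species literature search of this kind could be set up as in the following Python sketch, which only builds the request URL for the public NCBI E-utilities ESearch endpoint and does not contact the server. The exact query syntax used for DAnIEL is not stated here, so the way the four terms are combined (species name AND any of the terms), the helper names and the example species are assumptions.</p>
<preformat><![CDATA[
```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
# Topic terms named in the text
TOPIC_TERMS = ("disease", "cytokine", "immune system", "prokaryote")


def build_pubmed_query(species, terms=TOPIC_TERMS):
    """Combine a species name with the topic terms (assumed logic)."""
    topics = " OR ".join(f'"{term}"' for term in terms)
    return f'"{species}" AND ({topics})'


def build_esearch_url(species, max_papers=20):
    """Return an ESearch URL limited to max_papers PubMed records."""
    params = {
        "db": "pubmed",
        "term": build_pubmed_query(species),
        "retmax": max_papers,  # limit of 20 papers per species
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


# Illustrative species; any species found in the cohorts could be used
query = build_pubmed_query("Candida albicans")
url = build_esearch_url("Candida albicans")
```
]]></preformat>
<p>The resulting URL can be requested with any HTTP client; the returned PubMed identifiers would then be the starting point for manual curation.</p>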
</sec>
<sec id="sec6">
<title>Feature Analysis</title>
<p>Diversity is calculated using the R packages vegan (<xref ref-type="bibr" rid="ref7">Dixon, 2003</xref>) and phyloseq (<xref ref-type="bibr" rid="ref22">McMurdie and Holmes, 2013</xref>). Various methods, including principal coordinates analysis (PCoA) and non-metric multidimensional scaling (NMDS), can be used to generate ordination plots. The FastSpar implementation of the SparCC algorithm can be used to create correlation networks of co-abundant taxa (<xref ref-type="bibr" rid="ref12">Friedman and Alm, 2012</xref>; <xref ref-type="bibr" rid="ref34">Watts et al., 2019</xref>). Alternatively, BAnOCC can be chosen to account for the compositionality of NGS abundance data (<xref ref-type="bibr" rid="ref30">Schwager et al., 2017</xref>). The correlation analysis can be executed for each sample group individually, e.g., to compare networks of &#x201C;case&#x201D; and &#x201C;control&#x201D; samples. If a metadata table is provided, group-wise statistics are performed using the Mann&#x2013;Whitney U test for binary response variables and the Kruskal&#x2013;Wallis one-way analysis of variance in combination with Dunn&#x2019;s <italic>post hoc</italic> test (<xref ref-type="bibr" rid="ref9">Dunn, 1964</xref>) for other nominal responses. For continuous responses, Spearman&#x2019;s rank correlation is used instead. Features significant in any of these tests are annotated with our manually curated database of fungal interactions and clinical samples. Machine learning is applied to categorical response variables using the R package caret (<xref ref-type="bibr" rid="ref18">Kuhn, 2008</xref>). Both random forest (RF) and support vector machines (SVMs) are used in combination with an ANOVA filter and recursive feature selection. The best-performing models according to the area under the receiver operating characteristic curve (AUC) in 5-fold cross-validation are reported together with their feature importance scores.</p>
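<p>The rule for choosing a statistical test based on the type of the response variable can be summarised in a small Python sketch. It illustrates the decision logic only and does not compute any test statistic; the function name, the returned labels and the way the variable type is detected (numeric values are treated as continuous) are assumptions.</p>
<preformat><![CDATA[
```python
from numbers import Real


def choose_test(response):
    """Return a label for the test matching a metadata column.

    Assumption: numeric values are treated as continuous, everything
    else as nominal. A binary variable coded as 0/1 would therefore
    have to be recoded as text labels first.
    """
    values = [v for v in response if v is not None]
    if not values:
        raise ValueError("response has no observed values")
    if all(isinstance(v, Real) and not isinstance(v, bool) for v in values):
        return "spearman"  # continuous response
    levels = set(values)
    if len(levels) < 2:
        raise ValueError("need at least two groups to compare")
    if len(levels) == 2:
        return "mann_whitney_u"  # binary response
    return "kruskal_wallis_with_dunn"  # nominal response, 3 or more levels


binary = choose_test(["case", "control", "case", "control"])
nominal = choose_test(["gut", "skin", "soil", "gut"])
continuous = choose_test([23.5, 31.0, 27.2])
```
]]></preformat>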
</sec>
<sec id="sec7">
<title>Technical Design</title>
<p>The overall pipeline of the DAnIEL web server consists of two parts: a front end, with which the user interacts to upload and visualise the data, and a back-end workflow responsible for processing the uploaded data. The front end of the DAnIEL web server is implemented as an R Shiny app. ggplot2 is used for visualisation (<xref ref-type="bibr" rid="ref37">Wickham et al., 2019</xref>) and R Markdown for creating summary reports. The back end is built as a Snakemake workflow (<xref ref-type="bibr" rid="ref17">Koster and Rahmann, 2012</xref>). This allows the workflow to be run separately on any Linux system, including computing clusters. Conda is used to create reproducible environments for installing and running scripts and individual tools. A unique identifier is assigned to each project to access the results later on. This identifier also acts as a token for authentication. The tutorial dataset, consisting of 38 samples, usually takes approximately half an hour of wall time to be fully processed using 10 threads. Reports and visualisations can be accessed at the front end once the corresponding step in the workflow has finished. This includes interactive plots and a summary consisting of findings, annotations, methods and references in a single HTML file.</p>
</sec>
</sec>
<sec id="sec8" sec-type="results">
<title>Results</title>
<sec id="sec9">
<title>Comparison to Relevant Software</title>
<p>An overview of related software packages for analysing fungal amplicon sequencing data is given in <xref rid="tab1" ref-type="table">Table 1</xref>. QIIME2 is a command-line focused tool and is therefore not ideal for researchers without programming skills (<xref ref-type="bibr" rid="ref3">Bolyen et al., 2019</xref>). ITScan covers profiling of operational taxonomic units (OTUs); however, it does not cover quality control of raw reads (<xref ref-type="bibr" rid="ref11">Ferro et al., 2014</xref>). CloVR-ITS was designed for pyrosequencing data, whereas DAnIEL is built for Illumina paired-end data (<xref ref-type="bibr" rid="ref35">White et al., 2013</xref>). Most tools lack the ability to calculate correlation networks, especially those that account for the compositional nature of taxon counts, which is crucial in most analyses (<xref ref-type="bibr" rid="ref13">Gloor et al., 2017</xref>). Tools like PipeCraft and LotuS focus on calculating the OTU table (<xref ref-type="bibr" rid="ref15">Hildebrand et al., 2014</xref>; <xref ref-type="bibr" rid="ref2">Anslan et al., 2017</xref>). Many existing tools are general and do not account for the properties of a typical ITS dataset by default. For example, the length of ITS1 can range from 9 to 1,181 bp (<xref ref-type="bibr" rid="ref38">Yang et al., 2018</xref>). We chose 50 bp as the default minimal QC read length as a trade-off: it allows fungi with a short ITS region to be detected while leaving enough bases for an accurate taxonomic classification. To the best of our knowledge, DAnIEL is the only web server available that covers the whole workflow of ITS analysis from raw reads to publication-ready figures and tables, integrates a relational database for the biological evaluation of statistical findings and supports comparative analysis with publicly available mycobiome datasets.</p>
<table-wrap position="float" id="tab1">
<label>Table 1</label>
<caption><p>Functionality of software for ITS analysis.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="center" valign="top"/>
<th align="center" valign="top"/>
<th align="center" valign="top">DAnIEL</th>
<th align="center" valign="top">QIIME2</th>
<th align="center" valign="top">mothur</th>
<th align="center" valign="top">CloVR-ITS</th>
<th align="center" valign="top">ITScan</th>
<th align="center" valign="top">SEED2</th>
<th align="center" valign="top">PipeCraft</th>
<th align="center" valign="top">LotuS</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="top">Usability</td>
<td align="left" valign="top">Web server</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">&#x2212;</td>
</tr>
<tr>
<td align="left" valign="top"/>
<td align="left" valign="top">GUI</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">&#x2212;</td>
</tr>
<tr>
<td align="left" valign="top"/>
<td align="left" valign="top">HTML report</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">&#x2212;</td>
</tr>
<tr>
<td align="left" valign="top">Data</td>
<td align="left" valign="top">Additional cohorts</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">&#x2212;</td>
</tr>
<tr>
<td align="left" valign="top"/>
<td align="left" valign="top">ITS tailored</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">&#x2212;</td>
</tr>
<tr>
<td align="left" valign="top">Profiling</td>
<td align="left" valign="top">Quality control</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">+</td>
</tr>
<tr>
<td align="left" valign="top"/>
<td align="left" valign="top">OTU profiling</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">+</td>
</tr>
<tr>
<td align="left" valign="top"/>
<td align="left" valign="top">ASV profiling</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">+</td>
</tr>
<tr>
<td align="left" valign="top">Analysis</td>
<td align="left" valign="top">Diversity</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">&#x2212;</td>
</tr>
<tr>
<td align="left" valign="top"/>
<td align="left" valign="top">SparCC correlation</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">&#x2212;</td>
</tr>
<tr>
<td align="left" valign="top"/>
<td align="left" valign="top">BAnOCC correlation</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">&#x2212;</td>
</tr>
<tr>
<td align="left" valign="top"/>
<td align="left" valign="top">Machine Learning</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">&#x2212;</td>
</tr>
<tr>
<td align="left" valign="top"/>
<td align="left" valign="top">Knowledge base</td>
<td align="center" valign="top">+</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">&#x2212;</td>
<td align="center" valign="top">&#x2212;</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec id="sec10">
<title>Case Studies</title>
<p>We demonstrated the functionality of the DAnIEL web server by running it on publicly available cohorts of soil and human mycobiomes. The first cohort investigated the effects of wildfire on the soil fungi in northwestern Canadian boreal forest (<italic>N</italic>=300, NCBI Accession PRJNA564811; <xref ref-type="bibr" rid="ref36">Whitman et al., 2019</xref>). The cohort was selected from the integrated database of fungal projects. DAnIEL was run with default parameters. The results are shown in <xref rid="fig2" ref-type="fig">Figures 2</xref>, <xref rid="fig3" ref-type="fig">3</xref>. All subfigures were directly generated by the web server. We confirmed that fungal communities were strongly dissimilar between burned and unburned sites. Burned sites showed significantly decreased Shannon and Chao1 alpha diversity metrics (Wilcoxon rank sum test, <italic>p</italic>&#x003C;0.01). Fungal Bray-Curtis dissimilarities also differed significantly (Adonis PERMANOVA, <italic>p</italic>&#x003C;10<sup>&#x2212;4</sup>). Furthermore, a disrupted co-abundance pattern was observed in burned sites using SparCC correlation networks. Genus node degree and betweenness centrality were significantly decreased (Wilcoxon rank sum test, <italic>p</italic>&#x003C;0.01). We increased the minimal absolute correlation coefficient to <inline-formula><mml:math id="M1"><mml:mrow><mml:msub><mml:mrow><mml:mfenced close="|" open="|"><mml:mi>r</mml:mi></mml:mfenced></mml:mrow><mml:mrow><mml:mi>min</mml:mi><mml:mo>,</mml:mo><mml:mi>S</mml:mi><mml:mi>p</mml:mi><mml:mi>a</mml:mi><mml:mi>r</mml:mi><mml:mi>C</mml:mi><mml:mi>C</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>0.3</mml:mn></mml:mrow></mml:math></inline-formula> in the interactive network exploration of DAnIEL to emphasise this co-abundance fragmentation. A random forest was selected as the best model for predicting the state (burned vs. unburned) based on the fungal abundance profile (AUC=98% in 5-fold cross-validation).</p>
<fig position="float" id="fig2">
<label>Figure 2</label>
<caption><p>Mycobiome comparison of burned and unburned soil samples. All figures were directly generated by the web server. <bold>(A)</bold> Alpha diversity. <bold>(B)</bold> Beta diversity: Ordination of Bray-Curtis dissimilarities. <bold>(C)</bold> Area under the ROC curve for predicting the burn status from the abundance profile (best model, random forest). <sup>&#x002A;&#x002A;</sup><italic>p</italic> &#x003C; 0.01, <sup>&#x002A;&#x002A;&#x002A;</sup><italic>p</italic> &#x003C; 0.001.</p></caption>
<graphic xlink:href="fmicb-12-720513-g002.tif"/>
</fig>
<fig position="float" id="fig3">
<label>Figure 3</label>
<caption><p>Correlation networks of burned and unburned soil samples. SparCC correlation networks per sample group using <bold>(A)</bold> the default threshold <inline-formula><mml:math id="M2"><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mfenced close="|" open="|"><mml:mi>r</mml:mi></mml:mfenced><mml:mo>&#x003E;</mml:mo><mml:mn>0.2</mml:mn><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula> and <bold>(B)</bold> an increased threshold (<inline-formula><mml:math id="M3"><mml:mrow><mml:mfenced close="|" open="|"><mml:mi>r</mml:mi></mml:mfenced><mml:mo>&#x003E;</mml:mo><mml:mn>0.3</mml:mn></mml:mrow></mml:math></inline-formula>) set using the interactive GUI. <bold>(C)</bold> Distribution of network topology metrics over genera in the correlation network. <sup>&#x002A;&#x002A;&#x002A;</sup><italic>p</italic> &#x003C; 0.001.</p></caption>
<graphic xlink:href="fmicb-12-720513-g003.tif"/>
</fig>
<p>Secondly, we performed a meta-analysis of two publicly available human skin mycobiome datasets on dandruff (<italic>N</italic>=966, ITS1, PRJNA415710; <xref ref-type="bibr" rid="ref28">Saxena et al., 2021</xref>) and chronic wounds (<italic>N</italic>=384, ITS1, PRJNA324668; <xref ref-type="bibr" rid="ref16">Kalan et al., 2018</xref>). Results of the human case study are shown in <xref ref-type="supplementary-material" rid="SM1">Supplementary Figures 1</xref>, <xref ref-type="supplementary-material" rid="SM1">2</xref>. Wound sites showed significantly decreased Shannon and Chao1 alpha diversity metrics (Wilcoxon, <italic>p</italic>&#x003C;0.001). Fungal Bray-Curtis dissimilarities also differed significantly (Adonis PERMANOVA, <italic>p</italic>&#x003C;10<sup>&#x2212;4</sup>). Wound samples showed increased abundances of <italic>Malassezia</italic> and <italic>Saccharomyces</italic>. After filtering with default parameters, a fungal co-abundance network could only be constructed for the dandruff samples. The minimum absolute SparCC correlation coefficient was lowered to 0.1 to check the robustness of the networks. This revealed three co-abundant genera pairs in the wound samples, but this network was still much sparser than the network obtained from dandruff samples. A random forest model was selected as the best one for predicting the skin type based on the fungal abundance profile (AUC=99% in 5-fold cross-validation), with <italic>Saccharomyces</italic> and <italic>Ascomycota</italic> spp. showing high Gini feature importance. The higher AUC value compared to the soil example suggests a stronger non-linear biological signal discriminating the sample groups based on the fungal abundance profile.</p>
<p>Taxa found significant in either differential abundance or co-abundance analysis were annotated with FUNGuild (<xref ref-type="bibr" rid="ref25">Nguyen et al., 2016</xref>) and our integrated relational databases. The genera found significantly different in the human cohort were annotated with 867 interactions with bacteria or cytokines and 141 infection-related samples from our manually curated database (<xref ref-type="supplementary-material" rid="SM1">Supplementary Figure 3</xref>). The number of annotations varied across the clades according to their coverage in the published literature. For the soil cohort, however, the genera found significant showed only 214 interactions reported in the literature and 25 infection-related samples, confirming the human focus of our database.</p>
</sec>
<sec id="sec11">
<title>Benchmarking</title>
<p>Processing durations were benchmarked in a Docker container provided with 10 cores and 100 GB of RAM. It took 2.9 h to process the soil samples and 62.9 h to process the human samples (see <xref ref-type="supplementary-material" rid="SM1">Supplementary Figure 4</xref>). The rate-limiting step of the pipeline for large cohorts is the denoising process, which can be parallelised much better for ASV profiling than for OTU profiling. Downloading and quality control usually take less than a minute per sample.</p>
<p>The performance of taxonomic profiling was evaluated using simulated reads. Grinder was used to simulate 10 samples, each consisting of <inline-formula><mml:math id="M4"><mml:mrow><mml:mn>500</mml:mn><mml:mo>&#x00D7;</mml:mo><mml:msup><mml:mrow><mml:mn>10</mml:mn></mml:mrow><mml:mn>3</mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> 150 bp paired-end (PE) reads (<xref ref-type="bibr" rid="ref1">Angly et al., 2012</xref>). Primers ITS1 and ITS2 targeting the ITS1 subregion were used to simulate abundances from 100 different reference sequences following an exponential distribution. The same database, UNITE 8.2 dynamic, was used for both simulation and classifier training to enable a fair comparison (<xref ref-type="bibr" rid="ref26">Nilsson et al., 2019</xref>). To simulate biological variability, a uniform mutation rate of 1% was incorporated, in which substitutions were four times more likely than insertions or deletions. DAnIEL was run on the data with different methods for denoising (DADA2 and PIPITS) and taxonomic classification (BLAST consensus and Naive Bayes). Abundance profiling performance was benchmarked following <xref ref-type="bibr" rid="ref39">Ye et al. (2019)</xref> using counts pooled at genus rank. Briefly, the Euclidean distance (L2 norm) between the measured and the true abundance profile was calculated for each sample. Furthermore, differences in abundance were calculated for each sample and taxon separately. Taxon occurrence alone was evaluated by counting the samples in which a taxon was both measured and simulated. Precision, sensitivity, specificity and F1 score were calculated from the resulting contingency table. Taxon occurrence profiling was highly specific but less sensitive (98 and 33% on average, respectively, see <xref ref-type="supplementary-material" rid="SM1">Supplementary Figure 5</xref>). 
DADA2 outperformed PIPITS in all metrics. The superior performance of ASV over OTU profiling is consistent with the literature (<xref ref-type="bibr" rid="ref4">Callahan et al., 2017</xref>; <xref ref-type="bibr" rid="ref6">Caruso et al., 2019</xref>). ASVs are more accurate and allow a more detailed analysis of the mycobiome studied. However, this level of detail can also be a disadvantage. In taxonomically diverse studies such as environmental samples, the detailed taxonomic profile of ASVs can make manual curation of sequence alignments and downstream statistical analyses more difficult. Furthermore, intragenomic variation can result in multiple ASVs originating from the same fungal cell, which leads to an overestimation of the true diversity (<xref ref-type="bibr" rid="ref29">Schoch et al., 2012</xref>). On the other hand, OTU profiling is less sensitive to potentially unwanted details, and we still consider it a necessary function of DAnIEL to keep results comparable with older studies.</p>
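<p>The benchmarking metrics described above can be sketched in Python as follows. This is a minimal illustration and not the evaluation code of the study: the function names and the example profiles are hypothetical, and we assume that presence/absence is scored for one sample over an explicitly given set of candidate genera, which is needed to count true negatives for specificity.</p>
<preformat>
```python
import math


def l2_distance(measured, truth):
    """Euclidean (L2) distance between two genus-level abundance profiles.

    Both profiles are dicts mapping genus name to relative abundance;
    a genus missing from one profile counts as abundance 0.
    """
    genera = set(measured) | set(truth)
    return math.sqrt(
        sum((measured.get(g, 0.0) - truth.get(g, 0.0)) ** 2 for g in genera)
    )


def occurrence_metrics(measured, truth, universe):
    """Precision, sensitivity, specificity and F1 from presence/absence calls."""
    tp = fp = fn = tn = 0
    for genus in universe:
        found = measured.get(genus, 0.0) > 0
        present = truth.get(genus, 0.0) > 0
        if found and present:
            tp += 1
        elif found:
            fp += 1
        elif present:
            fn += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        # Report 0.0 instead of failing when a denominator is empty
        return numerator / denominator if denominator else 0.0

    precision = ratio(tp, tp + fp)
    sensitivity = ratio(tp, tp + fn)
    return {
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": ratio(tn, tn + fp),
        "f1": ratio(2 * precision * sensitivity, precision + sensitivity),
    }


# Hypothetical simulated (true) and measured profiles of one sample
truth = {"Candida": 0.5, "Malassezia": 0.3, "Aspergillus": 0.2}
measured = {"Candida": 0.6, "Malassezia": 0.3, "Penicillium": 0.1}
universe = ["Candida", "Malassezia", "Aspergillus", "Penicillium", "Saccharomyces"]

print(l2_distance(measured, truth))
print(occurrence_metrics(measured, truth, universe))
```
</preformat>
<p>In this toy sample, one simulated genus is missed and one genus is reported that was not simulated, so precision and sensitivity fall below 1 while the L2 distance summarises the abundance error.</p>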
<p>Naive Bayes classification outperformed the BLAST consensus approach in terms of specificity and precision but not in sensitivity. The most accurate abundance profiles were generated using DADA2 (see <xref ref-type="supplementary-material" rid="SM1">Supplementary Figure 6</xref>). PIPITS underestimated many abundances, which led to increased distances in some samples, especially in combination with the Naive Bayes classifier. The values were very similar for different phyla. Therefore, taxonomy appeared to have little influence on the classification performance.</p>
<p>Furthermore, we compared the results of the case studies using both ASV and OTU profiling. The results are shown in <xref ref-type="supplementary-material" rid="SM1">Supplementary Figures 7</xref>, <xref ref-type="supplementary-material" rid="SM1">8</xref>. Most of the highly abundant taxa were found with either denoising method, with similar abundance values and correlation networks. The alpha diversity was significantly higher using OTU profiling (Wilcoxon rank sum test, <italic>p</italic>&#x003C;0.001).</p>
</sec>
<sec id="sec12">
<title>Relational Database Generalisability</title>
<p>We evaluated the generalisability of the DAnIEL interactions database on 30 other ITS studies from various habitats. The results are shown in <xref ref-type="supplementary-material" rid="SM1">Supplementary Figure 9</xref>. Only genera prevalent in at least 10% of the samples in any habitat were considered for this analysis. On average, 29% of the prevalent genera in host habitats, 28% in aquatic and 15% in soil samples were already found in our manually curated database.</p>
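<p>The prevalence filter and the coverage calculation described above can be sketched as follows. This is a simplified illustration under our own assumptions, not the analysis code of the study: the function names and the toy data are hypothetical, a genus counts as detected in a sample when its abundance is greater than zero, and the filter is applied to the samples of a single habitat.</p>
<preformat>
```python
def prevalent_genera(samples, min_prevalence=0.10):
    """Genera detected in at least min_prevalence of the given samples.

    samples is a list of dicts mapping genus name to abundance.
    """
    if not samples:
        return set()
    detections = {}
    for profile in samples:
        for genus, abundance in profile.items():
            if abundance > 0:
                detections[genus] = detections.get(genus, 0) + 1
    return {
        genus
        for genus, n in detections.items()
        if n / len(samples) >= min_prevalence
    }


def database_coverage(prevalent, database_genera):
    """Fraction of the prevalent genera that are present in the database."""
    if not prevalent:
        return 0.0
    return len(prevalent.intersection(database_genera)) / len(prevalent)


# Hypothetical habitat with four samples and a toy database
habitat_samples = [
    {"Candida": 10, "Malassezia": 0},
    {"Candida": 3, "Fusarium": 1},
    {"Malassezia": 5},
    {"Candida": 7, "Malassezia": 2},
]
curated = {"Candida", "Aspergillus"}

kept = prevalent_genera(habitat_samples, min_prevalence=0.5)
print(sorted(kept), database_coverage(kept, curated))
```
</preformat>
<p>Running the filter separately per habitat and averaging the resulting fractions would give habitat-level coverage values comparable in spirit to the percentages reported above.</p>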
</sec>
</sec>
<sec id="sec13">
<title>Availability and Future Directions</title>
<p>DAnIEL is freely available as a web service at <ext-link xlink:href="https://sbi.hki-jena.de/daniel" ext-link-type="uri">https://sbi.hki-jena.de/daniel</ext-link>. No registration is required. Instead, an ID token is assigned to each project. Results remain available for 30 days. The source code is hosted at <ext-link xlink:href="https://github.com/bioinformatics-leibniz-hki/DAnIEL" ext-link-type="uri">https://github.com/bioinformatics-leibniz-hki/DAnIEL</ext-link>, together with several tests and a usage example, and is distributed under the BSD-2-Clause license. All databases, including reference sequences, existing cohorts and fungal interactions, can be downloaded at <ext-link xlink:href="https://doi.org/10.5281/zenodo.4073125" ext-link-type="uri">https://doi.org/10.5281/zenodo.4073125</ext-link>.</p>
<p>As sequencing costs have dropped drastically over the past decade, whole metagenome sequencing (WMS) has become increasingly popular. This web server for amplicon sequencing, however, will remain relevant, because many large reference cohorts, including the American Gut Project and the Earth Microbiome Project, are based on amplicon sequencing, and a very high sample size is needed to conduct machine learning and correlation network analyses (<xref ref-type="bibr" rid="ref33">Thompson et al., 2017</xref>; <xref ref-type="bibr" rid="ref21">McDonald et al., 2018</xref>). Furthermore, amplicon sequencing can be helpful in identifying low-abundance fungi.</p>
<p>Because DAnIEL is built on the Snakemake workflow engine, it can easily be extended with additional steps (<xref ref-type="bibr" rid="ref17">Koster and Rahmann, 2012</xref>). For instance, PICRUSt2 could be integrated to predict fungal functional profiles (<xref ref-type="bibr" rid="ref8">Douglas et al., 2020</xref>). Fungal taxonomic profiling benefits from sequencing longer amplicons. Therefore, quality control tools specifically designed for third-generation long-read sequencing data could be added to improve classification performance. Furthermore, updating the manually curated database and augmenting it with text mining approaches will improve the biological interpretation of significant taxa.</p>
</sec>
<sec id="sec14">
<title>Data Availability Statement</title>
<p>The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/<xref rid="sec17" ref-type="sec">Supplementary Material</xref>.</p>
</sec>
<sec id="sec15">
<title>Author Contributions</title>
<p>DL and GP conceived the study, designed the web server, and wrote the manuscript. DL implemented the web server. LZ and CB curated the relational database of fungal interactions. OK developed the NRZMyk database. DL processed the existing projects. All authors contributed to the article and approved the submitted version.</p>
</sec>
<sec id="conf1" sec-type="COI-statement">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec id="sec40" sec-type="disclaimer">
<title>Publisher&#x2019;s Note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
</body>
<back>
<ack>
<p>We thank the members of NRZMyk for providing data about clinical samples with fungal infections.</p>
</ack>
<sec id="sec17" sec-type="supplementary-material">
<title>Supplementary Material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link xlink:href="https://www.frontiersin.org/articles/10.3389/fmicb.2021.720513/full#supplementary-material" ext-link-type="uri">https://www.frontiersin.org/articles/10.3389/fmicb.2021.720513/full#supplementary-material</ext-link></p>
<supplementary-material xlink:href="Presentation_1.pdf" id="SM1" mimetype="application/pdf" xmlns:xlink="http://www.w3.org/1999/xlink"/>
</sec>
<ref-list>
<title>References</title>
<ref id="ref1"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Angly</surname> <given-names>F. E.</given-names></name> <name><surname>Willner</surname> <given-names>D.</given-names></name> <name><surname>Rohwer</surname> <given-names>F.</given-names></name> <name><surname>Hugenholtz</surname> <given-names>P.</given-names></name> <name><surname>Tyson</surname> <given-names>G. W.</given-names></name></person-group> (<year>2012</year>). <article-title>Grinder: a versatile amplicon and shotgun sequence simulator</article-title>. <source>Nucleic Acids Res.</source> <volume>40</volume>:<fpage>e94</fpage>. doi: <pub-id pub-id-type="doi">10.1093/nar/gks251</pub-id>, PMID: <pub-id pub-id-type="pmid">22434876</pub-id></citation></ref>
<ref id="ref2"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Anslan</surname> <given-names>S.</given-names></name> <name><surname>Bahram</surname> <given-names>M.</given-names></name> <name><surname>Hiiesalu</surname> <given-names>I.</given-names></name> <name><surname>Tedersoo</surname> <given-names>L.</given-names></name></person-group> (<year>2017</year>). <article-title>PipeCraft: flexible open-source toolkit for bioinformatics analysis of custom high-throughput amplicon sequencing data</article-title>. <source>Mol. Ecol. Resour.</source> <volume>17</volume>, <fpage>e234</fpage>&#x2013;<lpage>e240</lpage>. doi: <pub-id pub-id-type="doi">10.1111/1755-0998.12692</pub-id>, PMID: <pub-id pub-id-type="pmid">28544559</pub-id></citation></ref>
<ref id="ref3"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bolyen</surname> <given-names>E.</given-names></name> <name><surname>Rideout</surname> <given-names>J. R.</given-names></name> <name><surname>Dillon</surname> <given-names>M. R.</given-names></name> <name><surname>Bokulich</surname> <given-names>N. A.</given-names></name> <name><surname>Abnet</surname> <given-names>C. C.</given-names></name> <name><surname>Al-Ghalith</surname> <given-names>G. A.</given-names></name> <etal/></person-group>. (<year>2019</year>). <article-title>Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2</article-title>. <source>Nat. Biotechnol.</source> <volume>37</volume>, <fpage>852</fpage>&#x2013;<lpage>857</lpage>. doi: <pub-id pub-id-type="doi">10.1038/s41587-019-0209-9</pub-id>, PMID: <pub-id pub-id-type="pmid">31341288</pub-id></citation></ref>
<ref id="ref4"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Callahan</surname> <given-names>B. J.</given-names></name> <name><surname>McMurdie</surname> <given-names>P. J.</given-names></name> <name><surname>Holmes</surname> <given-names>S. P.</given-names></name></person-group> (<year>2017</year>). <article-title>Exact sequence variants should replace operational taxonomic units in marker-gene data analysis</article-title>. <source>ISME J.</source> <volume>11</volume>, <fpage>2639</fpage>&#x2013;<lpage>2643</lpage>. doi: <pub-id pub-id-type="doi">10.1038/ismej.2017.119</pub-id>, PMID: <pub-id pub-id-type="pmid">28731476</pub-id></citation></ref>
<ref id="ref5"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Callahan</surname> <given-names>B. J.</given-names></name> <name><surname>McMurdie</surname> <given-names>P. J.</given-names></name> <name><surname>Rosen</surname> <given-names>M. J.</given-names></name> <name><surname>Han</surname> <given-names>A. W.</given-names></name> <name><surname>Johnson</surname> <given-names>A. J. A.</given-names></name> <name><surname>Holmes</surname> <given-names>S. P.</given-names></name></person-group> (<year>2016</year>). <article-title>DADA2: high-resolution sample inference from illumina amplicon data</article-title>. <source>Nat. Methods</source> <volume>13</volume>, <fpage>581</fpage>&#x2013;<lpage>583</lpage>. doi: <pub-id pub-id-type="doi">10.1038/nmeth.3869</pub-id>, PMID: <pub-id pub-id-type="pmid">27214047</pub-id></citation></ref>
<ref id="ref6"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Caruso</surname> <given-names>V.</given-names></name> <name><surname>Song</surname> <given-names>X.</given-names></name> <name><surname>Asquith</surname> <given-names>M.</given-names></name> <name><surname>Karstens</surname> <given-names>L.</given-names></name></person-group> (<year>2019</year>). <article-title>Performance of microbiome sequence inference methods in environments with varying biomass</article-title>. <source>mSystems</source> <volume>4</volume>:<fpage>e00163</fpage>&#x2013;<lpage>18</lpage>. doi: <pub-id pub-id-type="doi">10.1128/mSystems.00163-18</pub-id>, PMID: <pub-id pub-id-type="pmid">30801029</pub-id></citation></ref>
<ref id="ref7"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dixon</surname> <given-names>P.</given-names></name></person-group> (<year>2003</year>). <article-title>VEGAN, a package of r functions for community ecology</article-title>. <source>J. Veg. Sci.</source> <volume>14</volume>, <fpage>927</fpage>&#x2013;<lpage>930</lpage>. doi: <pub-id pub-id-type="doi">10.1111/j.1654-1103.2003.tb02228.x</pub-id></citation></ref>
<ref id="ref8"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Douglas</surname> <given-names>G. M.</given-names></name> <name><surname>Maffei</surname> <given-names>V. J.</given-names></name> <name><surname>Zaneveld</surname> <given-names>J. R.</given-names></name> <name><surname>Yurgel</surname> <given-names>S. N.</given-names></name> <name><surname>Brown</surname> <given-names>J. R.</given-names></name> <name><surname>Taylor</surname> <given-names>C. M.</given-names></name> <etal/></person-group>. (<year>2020</year>). <article-title>PICRUSt2 for prediction of metagenome functions</article-title>. <source>Nat. Biotechnol.</source> <volume>38</volume>, <fpage>685</fpage>&#x2013;<lpage>688</lpage>. doi: <pub-id pub-id-type="doi">10.1038/s41587-020-0548-6</pub-id>, PMID: <pub-id pub-id-type="pmid">32483366</pub-id></citation></ref>
<ref id="ref9"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dunn</surname> <given-names>O. J.</given-names></name></person-group> (<year>1964</year>). <article-title>Multiple comparisons using rank sums</article-title>. <source>Technometrics</source> <volume>6</volume>, <fpage>241</fpage>&#x2013;<lpage>252</lpage>. doi: <pub-id pub-id-type="doi">10.1080/00401706.1964.10490181</pub-id></citation></ref>
<ref id="ref10"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ewels</surname> <given-names>P.</given-names></name> <name><surname>Magnusson</surname> <given-names>M.</given-names></name> <name><surname>Lundin</surname> <given-names>S.</given-names></name> <name><surname>K&#x00E4;ller</surname> <given-names>M.</given-names></name></person-group> (<year>2016</year>). <article-title>MultiQC: summarize analysis results for multiple tools and samples in a single report</article-title>. <source>Bioinformatics</source> <volume>32</volume>, <fpage>3047</fpage>&#x2013;<lpage>3048</lpage>. doi: <pub-id pub-id-type="doi">10.1093/bioinformatics/btw354</pub-id>, PMID: <pub-id pub-id-type="pmid">27312411</pub-id></citation></ref>
<ref id="ref11"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ferro</surname> <given-names>M.</given-names></name> <name><surname>Antonio</surname> <given-names>E. A.</given-names></name> <name><surname>Souza</surname> <given-names>W.</given-names></name> <name><surname>Bacci</surname> <given-names>M.</given-names></name></person-group> (<year>2014</year>). <article-title>ITScan: a web-based analysis tool for Internal Transcribed Spacer (ITS) sequences</article-title>. <source>BMC Res. Notes</source> <volume>7</volume>:<fpage>857</fpage>. doi: <pub-id pub-id-type="doi">10.1186/1756-0500-7-857</pub-id>, PMID: <pub-id pub-id-type="pmid">25430816</pub-id></citation></ref>
<ref id="ref12"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Friedman</surname> <given-names>J.</given-names></name> <name><surname>Alm</surname> <given-names>E. J.</given-names></name></person-group> (<year>2012</year>). <article-title>Inferring correlation networks from genomic survey data</article-title>. <source>PLoS Comput. Biol.</source> <volume>8</volume>:<fpage>e1002687</fpage>. doi: <pub-id pub-id-type="doi">10.1371/journal.pcbi.1002687</pub-id>, PMID: <pub-id pub-id-type="pmid">23028285</pub-id></citation></ref>
<ref id="ref13"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gloor</surname> <given-names>G. B.</given-names></name> <name><surname>Macklaim</surname> <given-names>J. M.</given-names></name> <name><surname>Pawlowsky-Glahn</surname> <given-names>V.</given-names></name> <name><surname>Egozcue</surname> <given-names>J. J.</given-names></name></person-group> (<year>2017</year>). <article-title>Microbiome datasets are compositional: And This is not optional</article-title>. <source>Front. Microbiol.</source> <volume>8</volume>:<fpage>2224</fpage>. doi: <pub-id pub-id-type="doi">10.3389/fmicb.2017.02224</pub-id>, PMID: <pub-id pub-id-type="pmid">29187837</pub-id></citation></ref>
<ref id="ref14"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gweon</surname> <given-names>H. S.</given-names></name> <name><surname>Oliver</surname> <given-names>A.</given-names></name> <name><surname>Taylor</surname> <given-names>J.</given-names></name> <name><surname>Booth</surname> <given-names>T.</given-names></name> <name><surname>Gibbs</surname> <given-names>M.</given-names></name> <name><surname>Read</surname> <given-names>D. S.</given-names></name> <etal/></person-group>. (<year>2015</year>). <article-title>PIPITS: an automated pipeline for analyses of fungal internal transcribed spacer sequences from the illumina sequencing platform</article-title>. <source>Methods Ecol. Evol.</source> <volume>6</volume>, <fpage>973</fpage>&#x2013;<lpage>980</lpage>. doi: <pub-id pub-id-type="doi">10.1111/2041-210X.12399</pub-id>, PMID: <pub-id pub-id-type="pmid">27570615</pub-id></citation></ref>
<ref id="ref15"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hildebrand</surname> <given-names>F.</given-names></name> <name><surname>Tadeo</surname> <given-names>R.</given-names></name> <name><surname>Voigt</surname> <given-names>A. Y.</given-names></name> <name><surname>Bork</surname> <given-names>P.</given-names></name> <name><surname>Raes</surname> <given-names>J.</given-names></name></person-group> (<year>2014</year>). <article-title>LotuS: an efficient and user-friendly OTU processing pipeline</article-title>. <source>Microbiome</source> <volume>2</volume>:<fpage>30</fpage>. doi: <pub-id pub-id-type="doi">10.1186/2049-2618-2-30</pub-id>, PMID: <pub-id pub-id-type="pmid">27367037</pub-id></citation></ref>
<ref id="ref16"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kalan</surname> <given-names>L.</given-names></name> <name><surname>Meisel</surname> <given-names>J. S.</given-names></name> <name><surname>Loesche</surname> <given-names>M. A.</given-names></name> <name><surname>Horwinski</surname> <given-names>J.</given-names></name> <name><surname>Soaita</surname> <given-names>I.</given-names></name> <name><surname>Chen</surname> <given-names>X.</given-names></name> <etal/></person-group>. (<year>2018</year>). <article-title>The microbial basis of impaired wound healing: differential roles for pathogens, &#x201C;bystanders&#x201D;, and strain-level diversification in clinical outcomes</article-title>. <source>bioRxiv</source> <comment>[Preprint]</comment>. doi: <pub-id pub-id-type="doi">10.1101/427567</pub-id></citation></ref>
<ref id="ref17"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Koster</surname> <given-names>J.</given-names></name> <name><surname>Rahmann</surname> <given-names>S.</given-names></name></person-group> (<year>2012</year>). <article-title>Snakemake&#x2014;a scalable bioinformatics workflow engine</article-title>. <source>Bioinformatics</source> <volume>28</volume>, <fpage>2520</fpage>&#x2013;<lpage>2522</lpage>. doi: <pub-id pub-id-type="doi">10.1093/bioinformatics/bts480</pub-id>, PMID: <pub-id pub-id-type="pmid">22908215</pub-id></citation></ref>
<ref id="ref18"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kuhn</surname> <given-names>M.</given-names></name></person-group> (<year>2008</year>). <article-title>Building predictive models in r using the caret package</article-title>. <source>J. Stat. Softw.</source> <volume>28</volume>, <fpage>1</fpage>&#x2013;<lpage>26</lpage>. doi: <pub-id pub-id-type="doi">10.18637/jss.v028.i05</pub-id></citation></ref>
<ref id="ref19"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Leung</surname> <given-names>M. H. Y.</given-names></name> <name><surname>Chan</surname> <given-names>K. C. K.</given-names></name> <name><surname>Lee</surname> <given-names>P. K. H.</given-names></name></person-group> (<year>2016</year>). <article-title>Skin fungal community and its correlation with bacterial community of urban chinese individuals</article-title>. <source>Microbiome</source> <volume>4</volume>:<fpage>46</fpage>. doi: <pub-id pub-id-type="doi">10.1186/s40168-016-0192-z</pub-id>, PMID: <pub-id pub-id-type="pmid">27558504</pub-id></citation></ref>
<ref id="ref20"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Martin</surname> <given-names>M.</given-names></name></person-group> (<year>2011</year>). <article-title>Cutadapt removes adapter sequences from high-throughput sequencing reads</article-title>. <source>EMBnet. J.</source> <volume>17</volume>, <fpage>10</fpage>&#x2013;<lpage>12</lpage>. doi: <pub-id pub-id-type="doi">10.14806/ej.17.1.200</pub-id></citation></ref>
<ref id="ref21"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>McDonald</surname> <given-names>D.</given-names></name> <name><surname>Hyde</surname> <given-names>E.</given-names></name> <name><surname>Debelius</surname> <given-names>J. W.</given-names></name> <name><surname>Morton</surname> <given-names>J. T.</given-names></name> <name><surname>Gonzalez</surname> <given-names>A.</given-names></name> <name><surname>Ackermann</surname> <given-names>G.</given-names></name> <etal/></person-group>. (<year>2018</year>). <article-title>American gut: an open platform for citizen science microbiome research</article-title>. <source>mSystems</source> <volume>3</volume>:<fpage>e00031</fpage>&#x2013;<lpage>18</lpage>. doi: <pub-id pub-id-type="doi">10.1128/mSystems.00031-18</pub-id>, PMID: <pub-id pub-id-type="pmid">29795809</pub-id></citation></ref>
<ref id="ref22"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>McMurdie</surname> <given-names>P. J.</given-names></name> <name><surname>Holmes</surname> <given-names>S.</given-names></name></person-group> (<year>2013</year>). <article-title>Phyloseq: an R package for reproducible interactive analysis and graphics of microbiome census data</article-title>. <source>PLoS One</source> <volume>8</volume>:<fpage>e61217</fpage>. doi: <pub-id pub-id-type="doi">10.1371/journal.pone.0061217</pub-id>, PMID: <pub-id pub-id-type="pmid">23630581</pub-id></citation></ref>
<ref id="ref23"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mirhakkak</surname> <given-names>M. H.</given-names></name> <name><surname>Sch&#x00E4;uble</surname> <given-names>S.</given-names></name> <name><surname>Klassert</surname> <given-names>T. E.</given-names></name> <name><surname>Brunke</surname> <given-names>S.</given-names></name> <name><surname>Brandt</surname> <given-names>P.</given-names></name> <name><surname>Loos</surname> <given-names>D.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title>Metabolic modeling predicts specific gut bacteria as key determinants for Candida albicans colonization levels</article-title>. <source>ISME J.</source> <volume>15</volume>, <fpage>1257</fpage>&#x2013;<lpage>1270</lpage>. doi: <pub-id pub-id-type="doi">10.1038/s41396-020-00848-z</pub-id>, PMID: <pub-id pub-id-type="pmid">33323978</pub-id></citation></ref>
<ref id="ref24"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mukherjee</surname> <given-names>P. K.</given-names></name> <name><surname>Chandra</surname> <given-names>J.</given-names></name> <etal/></person-group>. (<year>2014</year>). <article-title>Oral mycobiome analysis of HIV-infected patients: identification of pichia as an antagonist of opportunistic fungi</article-title>. <source>PLoS Pathog.</source> <volume>10</volume>:<fpage>e1003996</fpage>. doi: <pub-id pub-id-type="doi">10.1371/journal.ppat.1003996</pub-id>, PMID: <pub-id pub-id-type="pmid">24626467</pub-id></citation></ref>
<ref id="ref25"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nguyen</surname> <given-names>N. H.</given-names></name> <name><surname>Song</surname> <given-names>Z.</given-names></name> <name><surname>Bates</surname> <given-names>S. T.</given-names></name> <name><surname>Branco</surname> <given-names>S.</given-names></name> <name><surname>Tedersoo</surname> <given-names>L.</given-names></name> <name><surname>Menke</surname> <given-names>J.</given-names></name> <etal/></person-group>. (<year>2016</year>). <article-title>FUNGuild: an open annotation tool for parsing fungal community datasets by ecological guild</article-title>. <source>Fungal Ecol.</source> <volume>20</volume>, <fpage>241</fpage>&#x2013;<lpage>248</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.funeco.2015.06.006</pub-id></citation></ref>
<ref id="ref26"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nilsson</surname> <given-names>R. H.</given-names></name> <name><surname>Larsson</surname> <given-names>K.-H.</given-names></name> <name><surname>Taylor</surname> <given-names>A. F. S.</given-names></name> <name><surname>Bengtsson-Palme</surname> <given-names>J.</given-names></name> <name><surname>Jeppesen</surname> <given-names>T. S.</given-names></name> <name><surname>Schigel</surname> <given-names>D.</given-names></name> <etal/></person-group>. (<year>2019</year>). <article-title>The UNITE database for molecular identification of fungi: handling dark taxa and parallel taxonomic classifications</article-title>. <source>Nucleic Acids Res.</source> <volume>47</volume>, <fpage>D259</fpage>&#x2013;<lpage>D264</lpage>. doi: <pub-id pub-id-type="doi">10.1093/nar/gky1022</pub-id>, PMID: <pub-id pub-id-type="pmid">30371820</pub-id></citation></ref>
<ref id="ref27"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Paulson</surname> <given-names>J. N.</given-names></name> <name><surname>Stine</surname> <given-names>O. C.</given-names></name> <name><surname>Bravo</surname> <given-names>H. C.</given-names></name> <name><surname>Pop</surname> <given-names>M.</given-names></name></person-group> (<year>2013</year>). <article-title>Differential abundance analysis for microbial marker-gene surveys</article-title>. <source>Nat. Methods</source> <volume>10</volume>, <fpage>1200</fpage>&#x2013;<lpage>1202</lpage>. doi: <pub-id pub-id-type="doi">10.1038/nmeth.2658</pub-id>, PMID: <pub-id pub-id-type="pmid">24076764</pub-id></citation></ref>
<ref id="ref28"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Saxena</surname> <given-names>R.</given-names></name> <name><surname>Mittal</surname> <given-names>P.</given-names></name> <name><surname>Clavaud</surname> <given-names>C.</given-names></name> <name><surname>Dhakan</surname> <given-names>D. B.</given-names></name> <name><surname>Roy</surname> <given-names>N.</given-names></name> <name><surname>Breton</surname> <given-names>L.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title>Longitudinal study of the scalp microbiome suggests coconut oil to enrich healthy scalp commensals</article-title>. <source>Sci. Rep.</source> <volume>11</volume>:<fpage>7220</fpage>. doi: <pub-id pub-id-type="doi">10.1038/s41598-021-86454-1</pub-id>, PMID: <pub-id pub-id-type="pmid">33790324</pub-id></citation></ref>
<ref id="ref29"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schoch</surname> <given-names>C. L.</given-names></name> <name><surname>Seifert</surname> <given-names>K. A.</given-names></name> <name><surname>Huhndorf</surname> <given-names>S.</given-names></name> <name><surname>Robert</surname> <given-names>V.</given-names></name> <name><surname>Spouge</surname> <given-names>J. L.</given-names></name> <name><surname>Levesque</surname> <given-names>C. A.</given-names></name> <etal/></person-group>. (<year>2012</year>). <article-title>Nuclear ribosomal internal transcribed spacer (ITS) region as a universal DNA barcode marker for fungi</article-title>. <source>Proc. Natl. Acad. Sci.</source> <volume>109</volume>, <fpage>6241</fpage>&#x2013;<lpage>6246</lpage>. doi: <pub-id pub-id-type="doi">10.1073/pnas.1117018109</pub-id>, PMID: <pub-id pub-id-type="pmid">22454494</pub-id></citation></ref>
<ref id="ref30"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schwager</surname> <given-names>E.</given-names></name> <name><surname>Mallick</surname> <given-names>H.</given-names></name> <name><surname>Ventz</surname> <given-names>S.</given-names></name> <name><surname>Huttenhower</surname> <given-names>C.</given-names></name></person-group> (<year>2017</year>). <article-title>A bayesian method for detecting pairwise associations in compositional data</article-title>. <source>PLoS Comput. Biol.</source> <volume>13</volume>:<fpage>e1005852</fpage>. doi: <pub-id pub-id-type="doi">10.1371/journal.pcbi.1005852</pub-id>, PMID: <pub-id pub-id-type="pmid">29140991</pub-id></citation></ref>
<ref id="ref31"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Seelbinder</surname> <given-names>B.</given-names></name> <name><surname>Chen</surname> <given-names>J.</given-names></name> <name><surname>Brunke</surname> <given-names>S.</given-names></name> <name><surname>Vazquez-Uribe</surname> <given-names>R.</given-names></name> <name><surname>Santhaman</surname> <given-names>R.</given-names></name> <name><surname>Meyer</surname> <given-names>A. C.</given-names></name> <etal/></person-group>. (<year>2020</year>). <article-title>Antibiotics create a shift from mutualism to competition in human gut communities with a longer-lasting impact on fungi than bacteria</article-title>. <source>Microbiome</source> <volume>8</volume>:<fpage>133</fpage>. doi: <pub-id pub-id-type="doi">10.1186/s40168-020-00899-6</pub-id>, PMID: <pub-id pub-id-type="pmid">32919472</pub-id></citation></ref>
<ref id="ref32"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Taylor</surname> <given-names>L. J.</given-names></name> <name><surname>Abbas</surname> <given-names>A.</given-names></name> <name><surname>Bushman</surname> <given-names>F. D.</given-names></name></person-group> (<year>2020</year>). <article-title>Grabseqs: simple downloading of reads and metadata from multiple next-generation sequencing data repositories</article-title>. <source>Bioinformatics</source> <volume>36</volume>, <fpage>3607</fpage>&#x2013;<lpage>3609</lpage>. doi: <pub-id pub-id-type="doi">10.1093/bioinformatics/btaa167</pub-id>, PMID: <pub-id pub-id-type="pmid">32154830</pub-id></citation></ref>
<ref id="ref33"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Thompson</surname> <given-names>L. R.</given-names></name> <name><surname>Sanders</surname> <given-names>J. G.</given-names></name> <name><surname>McDonald</surname> <given-names>D.</given-names></name> <name><surname>Amir</surname> <given-names>A.</given-names></name> <name><surname>Ladau</surname> <given-names>J.</given-names></name> <name><surname>Locey</surname> <given-names>K. J.</given-names></name> <etal/></person-group>. (<year>2017</year>). <article-title>A communal catalogue reveals Earth&#x2019;s multiscale microbial diversity</article-title>. <source>Nature</source> <volume>551</volume>, <fpage>457</fpage>&#x2013;<lpage>463</lpage>. doi: <pub-id pub-id-type="doi">10.1038/nature24621</pub-id>, PMID: <pub-id pub-id-type="pmid">29088705</pub-id></citation></ref>
<ref id="ref34"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Watts</surname> <given-names>S. C.</given-names></name> <name><surname>Ritchie</surname> <given-names>S. C.</given-names></name> <name><surname>Inouye</surname> <given-names>M.</given-names></name> <name><surname>Holt</surname> <given-names>K. E.</given-names></name></person-group> (<year>2019</year>). <article-title>FastSpar: rapid and scalable correlation estimation for compositional data</article-title>. <source>Bioinformatics</source> <volume>35</volume>, <fpage>1064</fpage>&#x2013;<lpage>1066</lpage>. doi: <pub-id pub-id-type="doi">10.1093/bioinformatics/bty734</pub-id>, PMID: <pub-id pub-id-type="pmid">30169561</pub-id></citation></ref>
<ref id="ref35"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>White</surname> <given-names>J. R.</given-names></name> <name><surname>Maddox</surname> <given-names>C.</given-names></name> <name><surname>White</surname> <given-names>O.</given-names></name> <name><surname>Angiuoli</surname> <given-names>S. V.</given-names></name> <name><surname>Fricke</surname> <given-names>W. F.</given-names></name></person-group> (<year>2013</year>). <article-title>CloVR-ITS: automated internal transcribed spacer amplicon sequence analysis pipeline for the characterization of fungal microbiota</article-title>. <source>Microbiome</source> <volume>1</volume>:<fpage>6</fpage>. doi: <pub-id pub-id-type="doi">10.1186/2049-2618-1-6</pub-id>, PMID: <pub-id pub-id-type="pmid">24451270</pub-id></citation></ref>
<ref id="ref36"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Whitman</surname> <given-names>T.</given-names></name> <name><surname>Whitman</surname> <given-names>E.</given-names></name> <name><surname>Woolet</surname> <given-names>J.</given-names></name> <name><surname>Flannigan</surname> <given-names>M. D.</given-names></name> <name><surname>Thompson</surname> <given-names>D. K.</given-names></name> <name><surname>Parisien</surname> <given-names>M.-A.</given-names></name></person-group> (<year>2019</year>). <article-title>Soil bacterial and fungal response to wildfires in the Canadian boreal forest across a burn severity gradient</article-title>. <source>Soil Biol. Biochem.</source> <volume>138</volume>:<fpage>107571</fpage>. doi: <pub-id pub-id-type="doi">10.1016/j.soilbio.2019.107571</pub-id></citation></ref>
<ref id="ref37"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wickham</surname> <given-names>H.</given-names></name> <name><surname>Averick</surname> <given-names>M.</given-names></name> <name><surname>Bryan</surname> <given-names>J.</given-names></name> <name><surname>Chang</surname> <given-names>W.</given-names></name> <name><surname>McGowan</surname> <given-names>L.</given-names></name> <name><surname>Fran&#x00E7;ois</surname> <given-names>R.</given-names></name> <etal/></person-group>. (<year>2019</year>). <article-title>Welcome to the tidyverse</article-title>. <source>J. Open Source Softw.</source> <volume>4</volume>:<fpage>1686</fpage>. doi: <pub-id pub-id-type="doi">10.21105/joss.01686</pub-id></citation></ref>
<ref id="ref38"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yang</surname> <given-names>R. H.</given-names></name> <name><surname>Su</surname> <given-names>J. H.</given-names></name> <name><surname>Shang</surname> <given-names>J. J.</given-names></name> <name><surname>Wu</surname> <given-names>Y. Y.</given-names></name> <name><surname>Li</surname> <given-names>Y.</given-names></name> <name><surname>Bao</surname> <given-names>D. P.</given-names></name> <etal/></person-group>. (<year>2018</year>). <article-title>Evaluation of the ribosomal DNA internal transcribed spacer (ITS), specifically ITS1 and ITS2, for the analysis of fungal diversity by deep sequencing</article-title>. <source>PLoS One</source> <volume>13</volume>:<fpage>e0206428</fpage>. doi: <pub-id pub-id-type="doi">10.1371/journal.pone.0206428</pub-id>, PMID: <pub-id pub-id-type="pmid">30596740</pub-id></citation></ref>
<ref id="ref39"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ye</surname> <given-names>S. H.</given-names></name> <name><surname>Siddle</surname> <given-names>K. J.</given-names></name> <name><surname>Park</surname> <given-names>D. J.</given-names></name> <name><surname>Sabeti</surname> <given-names>P. C.</given-names></name></person-group> (<year>2019</year>). <article-title>Benchmarking metagenomics tools for taxonomic classification</article-title>. <source>Cell</source> <volume>178</volume>, <fpage>779</fpage>&#x2013;<lpage>794</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.cell.2019.07.010</pub-id>, PMID: <pub-id pub-id-type="pmid">31398336</pub-id></citation></ref>
</ref-list>
<fn-group>
<fn fn-type="financial-disclosure"><p><bold>Funding.</bold> This work was supported by the Deutsche Forschungsgemeinschaft (DFG) CRC/Transregio 124 &#x201C;Pathogenic Fungi and Their Human Host: Networks of Interaction,&#x201D; subprojects B5 and INF (FungiNet; number 210879364). CB and LZ gratefully acknowledge funding from the ERC (ERC Starting Grant Project 802736 MORPHEUS).</p></fn>
</fn-group>
</back>
</article>