Fungi Contribute Critical but Spatially Varying Roles in Nitrogen and Carbon Cycling in Acid Mine Drainage

The ecosystem roles of fungi have been extensively studied by targeting one organism and/or biological process at a time, but the full metabolic potential of fungi has rarely been captured in an environmental context. We hypothesized that fungal genome sequences could be assembled directly from the environment using metagenomics and that transcriptomics and proteomics could simultaneously reveal metabolic differentiation across habitats. We reconstructed the near-complete 27 Mbp genome of a filamentous fungus, Acidomyces richmondensis, and evaluated transcript and protein expression in floating and streamer biofilms from an acid mine drainage (AMD) system. A. richmondensis transcripts involved in denitrification and in the degradation of complex carbon sources (including cellulose) were up-regulated in floating biofilms, whereas central carbon metabolism and stress-related transcripts were significantly up-regulated in streamer biofilms. These findings suggest that the biofilm niches are distinguished by distinct carbon and nitrogen resource utilization, oxygen availability, and environmental challenges. An isolated A. richmondensis strain from this environment was used to validate the metagenomics-derived genome and confirm nitrous oxide production at pH 1. Overall, our analyses defined mechanisms of fungal adaptation and identified a functional shift related to different roles in carbon and nitrogen turnover for the same species of fungi growing in closely located but distinct biofilm niches.

The ecosystem roles of fungi have been extensively studied by targeting one organism and/or biological process at a time, but the full metabolic potential of fungi has rarely been captured in an environmental context. We hypothesized that fungal genome sequences could be assembled directly from the environment using metagenomics and that transcriptomics and proteomics could simultaneously reveal metabolic differentiation across habitats. We reconstructed the near-complete 27 Mbp genome of a filamentous fungus, Acidomyces richmondensis, and evaluated transcript and protein expression in floating and streamer biofilms from an acid mine drainage (AMD) system. A. richmondensis transcripts involved in denitrification and in the degradation of complex carbon sources (including cellulose) were up-regulated in floating biofilms, whereas central carbon metabolism and stress-related transcripts were significantly up-regulated in streamer biofilms. These findings suggest that the biofilm niches are distinguished by distinct carbon and nitrogen resource utilization, oxygen availability, and environmental challenges. An isolated A. richmondensis strain from this environment was used to validate the metagenomics-derived genome and confirm nitrous oxide production at pH 1. Overall, our analyses defined mechanisms of fungal adaptation and identified a functional shift related to different roles in carbon and nitrogen turnover for the same species of fungi growing in closely located but distinct biofilm niches.

INTRODUCTION
Fungi are extremely diverse and abundant. They perform many critical ecosystem functions including the decomposition of organic matter, carbon storage, nutrient transfer, metal transformation and accumulation, and soil formation. The role of fungi in these ecosystem processes has been extensively studied using techniques such as cultivation, gene surveys, and enzyme or rate measurements. These approaches commonly target individual biological processes (e.g., carbon degradation). However, the full metabolic potential of fungi-across all of their genes, pathways, and processes-is rarely captured in an environmental context. The full metabolic potential of environmentally-relevant organisms can be accessed by reconstructing genomes directly from environmental DNA sequences. The genome-centric metagenomics approach (involving sequence assembly) simultaneously recovers intact genomic sequences from many community members, thereby enabling analysis of organismal interactions. Specific functions can be connected to individual community members from reconstructed genomes. In fact, it is possible to recover complete and essentially complete bacterial and archaeal genomes from metagenomics datasets (e.g., Iverson et al., 2012;Wrighton et al., 2012;Albertsen et al., 2013;Castelle et al., 2013;Di Rienzi et al., 2013;Kantor et al., 2013;. However, genome-centric metagenomics has rarely been applied to obtain intact eukaryotic genomes from environmental samples (Cantu et al., 2011;Cissé et al., 2012;Quandt et al., 2015), in part due to their larger, more complex, and often repeat-rich genomes.
Other genomic approaches have been successfully applied to microbial eukaryotes, but each of these approaches has limitations. Approximately 600 fungal genomes have been sequenced, but most of these are cultured representatives (which are not always environmentally relevant) and they only represent about half of known fungal diversity. Targeted metagenomics approaches (which use cell sorting coupled with multiple displacement amplification, MDA), have successfully sequenced partial eukaryotic genomes (Cuvelier et al., 2010;Yoon et al., 2011;Monier et al., 2012;Vaulot et al., 2012); however, MDA reactions lead to inherent biases in genome coverage (Woyke et al., 2010). Read-based genomic sequencing (which maps reads to known genomes using similarity searches such as BLAST) has provided some functional information from root-associated fungi (Liao et al., 2014), but this approach is limited by genome availability and incomplete genome annotations. It remains to be seen if recent advances in sequencing technologies and assembly approaches make metagenomics more feasible for recovery of fungal genomes from mixed communities.
Expression of predicted genes can be measured at the level of mRNAs and proteins as a proxy for cellular activity in the environment. Metatranscriptomic and metaproteomic techniques allow for the quantification of thousands of transcripts and proteins from individual taxonomic groups within mixed communities. Combining transcriptomic and proteomic measurements allows for the integration of diverse molecular-level processes to predict system-level function.
Here, we applied a combination of genome-centric metagenomics, transcriptomics and proteomics methodologies to test for differentiation of fungal metabolism as a function of microbial community context in a well-characterized microbial ecosystem, acid mine drainage (AMD) biofilms from the Richmond Mine (Iron Mountain near Redding, California). Fungi are found in two different types of biofilms within the mine: (1) the widespread floating biofilms (characterized extensively with metagenomics and proteomics in prior studies), and (2) the less common streamer biofilms. Floating biofilms are found at the air-AMD solution interface and are dominated by Leptospirillum Group II bacteria, which fix carbon into the system and support the growth of other heterotrophic microbes (e.g.,  and references therein). Many complex sugars have been previously shown to be constituents of the extracellular matrix in the floating biofilms (Jiao et al., 2010). Streamer biofilms, dominated by fungi, grow as long filaments submerged within the flowing AMD solution and are anchored to pyrite sediments within the stream (Baker and Banfield, 2003). Very little is known about streamer biofilm communities, or about the physiology and ecology of the prolific fungal members.
Floating and streamer biofilm are found in close proximity to each other and experience essentially the same bulk AMD solution (extremely acidic, pH <1; metal-rich, ∼200 mM Fe; and thermophilic, 40-50 • C) and underlying pyrite sediment. Nonetheless, fungal abundance is very different in the two biofilm types: fungi dominate streamer biofilm communities, but are found in low abundance in floating biofilms (Baker and Banfield, 2003;Baker et al., 2004Baker et al., , 2009. We hypothesize that variations in fungal metabolism between the biofilm types influence the observed distribution patterns. For instance, fungi likely play a role in degrading organic carbon within both biofilms, but the carbon sources may vary between the floating biofilms and the submerged, streamer biofilms.

Biofilm Sample Collection
AMD biofilms were collected from the Richmond Mine (Iron Mountain near Redding, California) into sterile tubes that were flash-frozen in an ethanol/dry ice bath on-site before transferring to a −80 • C freezer upon return to the laboratory. The streamer biofilms used for community genome sequencing were sampled at the 5-way site (28Feb08). Streamer biofilms used for initial mRNA sequencing for gene model prediction and annotation were sampled from the 4-Way site (09Jun09). For expression studies, two different types of AMD biofilms were collected: (1) floating biofilms, which are found at the air-AMD solution interface, and (2) streamer biofilms, which are submerged within the AMD solution and are anchored to pyrite sediments within the stream. Floating biofilms (characterized as mature, growth stage two biofilms based on biofilm thickness) were sampled from the AB-Muck site on three different dates: 17Sept10 (F1); 21Oct11 (F2); and 28Dec11 (F3). Streamer biofilms were sampled on three different dates at two sites: 4-Way site, 17Sept10 (S1); 4-Way site, 21Oct11 (S3); AB+10m site, 02Nov12 (S4). All of the sampling locations are within ∼100 meters of each other, with essentially the same underlying pyrite sediment and AMD solution with low pH (typically 0.5-1.2), elevated temperature (30-56 • C), and millimolar concentrations of sulfate, iron, zinc, copper, and arsenic (Druschel et al., 2004).

Microscopy and Fluorescence In situ Hybridization
Differential interference contrast (DIC) and epifluorescent microscopy was carried out at 630X magnification on a standard epifluorescence microscope. Fluorescence in situ hybridization (FISH) was carried out on fixed (4% paraformaldehyde) AMD biofilm samples as described previously (Amann et al., 1995;Bond and Banfield, 2001;Baker et al., 2009). Oligonucleotide probes used in this study for identification of individual species and groups were as follows: EUBMIX (all Bacteria); ARC915 (all Archaea); EUKMIX (all Eukaryotes); DOH299 (Dothideomycetes Class); LF655 (all Leptospirillum bacteria); LF1252 (Leptospirillum group III bacteria); L2UBA353 (Leptospirillum group II UBA genotype); L2CG353 (Leptospirillum group II 5-way genotype); and SUL230 (Sulfobacillus spp.). A total of 1045-1506 cells were counted for each biofilm sample, from 3 to 6 fields of view per probe (with an average of 313 cells counted per probe per sample). Counts were converted to a percentage of the total cell count found using the general nucleic acid stain 4 ′ ,6-diamidino-2-phenylindole (DAPI). Images for figures were adjusted for contrast and combined with GIMP (www.gimp.org).

A. richmondensis Culture Growth Conditions
Acidomyces richmondensis strain BFW was isolated from streamer biofilms collected at the 5-way location of the Richmond Mine on 28Feb08 that were frozen on dry ice at the site and stored at -80 • C. The biofilms were thawed and homogenized (in a glass tube by using several vigorous strokes of a tight-fitting, round, glass pestle), and then inoculated into minimal media (pH 1) containing glucose and ampicillin, as previously described . After growth was observed, the mycelia were spread on potato dextrose agar plates (Becton Dickinson, Sparks, MD). Individual fungal colonies were re-streaked after 1 week of growth. One of these colonies was then regrown in the liquid medium. Culture purity was verified by microscopy and PCR amplification and sequencing of 16S and 18S ribosomal RNA genes. This A. richmondensis isolate has been maintained at 4 • C for ∼4 years.
To test for the production of nitrous oxide (N 2 O) gas, A. richmondensis was grown in sealed serum vials with a modified M9 minimal medium at pH ∼1 (6.78 g/L Na 2 HPO 4 ; 3 g/L KH 2 PO 4 ; 0.5 g/L NaCl; 1 g/L NH 4 Cl; 2 mM MgSO 4 ; 0.1 mM CaCl 2; 0.4% glucose; 2 g/L NaNO 3 ; 0.034 mg/ml chloramphenicol). Triplicate cultures were grown under high and low oxygen conditions. For high oxygen conditions, the media and headspace were aerobic but the supply of air was discontinued after the serum vials were sealed. For low oxygen conditions, the aerobic culture was inoculated into anaerobic vials (flushed with ultrapure nitrogen gas) at a 0.01 volume/volume ratio. After 6 days of growth, gas samples from the headspace were analyzed for N 2 O on a Shimadzu GC-14A gas chromatograph fitted with a 63 Ni electron capture detector.

DNA Extraction
Samples for genomic DNA sequencing (both from biofilms and cultures) were thawed in cold 0.9% NaCl, pH 1, by manual homogenization with a sterile, cut-off 1000 µL pipette tip. Homogenized samples were centrifuged at 7000 × g for 5 min at 4 • C, decanted, and resuspended in 1 mL 0.9% NaCl, pH 1. A 16-gauge needle was used to further homogenize the biofilm. The homogenate was centrifuged at 7000 × g for 5 min at 4 • C, decanted, and resuspended in 1 mL 1X PBS, pH 7. A 16-gauge needle was used to further homogenize the biofilm, which was then centrifuged at 7000 × g for 5 min at 4 • C. The supernatant was decanted, and the remaining pellet was flash frozen in liquid nitrogen, and transferred to a pre-chilled, sterile mortar and pestle. The cell pellet was ground to a fine powder, adding liquid N 2 as necessary to keep the powder frozen. Powder was optionally stored for up to a month in liquid nitrogen before proceeding. Frozen powder was thawed in 1 mL pre-warmed CTAB buffer, pH 8, with 1% β-mercaptoethanol and 1% 0.5M EDTA. Samples were incubated at 70 • C for 1 h, with occasional mixing by hand and three freeze/thaw cycles in liquid N 2 and a 70 • C heat block after 30 min. An equal volume (1 mL) of phenol:chloroform:isoamyl alcohol (25:24:1) was added to a pre-spun phase lock gel heavy 15 mL tube (5 PRIME, Gaithersburg, MD), and samples were centrifuged for 5 min at 1500 × g at room temperature after hand mixing. A second 1 mL of phenol:chloroform:isoamyl alcohol was added to the tube, mixed, and centrifuged as before. A final equal volume of chloroform:isoamyl alcohol (24:1) was added and centrifuged as before. The aqueous supernatant was removed, and DNA was precipitated with 1X volume of isopropanol and 0.1X volume 3M NaOAc (pH 5.8) at −20 • C for at least 1 h, pelleted and washed with 70% EtOH, and resuspended in 50 µL TE pH 8.

A. richmondensis Culture Genomic Sequencing and Assembly
The A. richmondensis BFW isolate genome was sequenced using 454/Roche pyrosequencing (454-Rapid, 1 full run, 590.7 Mbp) and Solexa/Illumina sequencing (GAIIx, 3 lanes, "2 × 76, 300 bp" insert, 6.4 Gbp total sequencing). DNA (500 ng) was fragmented by nebulization to an average size of 500-800 bp and size selected using SPRI beads (Beckman Coulter, Indianapolis, IN). The fragments were treated with end repair and ligated with adapters using the 454 TM Rapid Library preparation kit (Roche, Basel, Switzerland) and then sequenced. DNA (1 µg) was sheared using the Covaris E210 (Covaris, Woburn, MA) and gel-size selected for 300 bp. The fragments were treated with end-repair, A-tailing, and ligation of Illumina compatible adapters (IDT, Coralville, IA) followed by 10 cycles of PCR (NEB, Ipswich, MA). qPCR was used to determine the concentration of the libraries, according to JGI standard operating procedures using the KAPA Complete Library Quantification Kit (Kapa Biosystems, Wilmington, MA). Libraries were sequenced on the Illumina Hiseq. Data were quality-control filtered for artifacts and process contamination. Illumina reads were first assembled with velvet (Zerbino and Birney, 2008) version 0.7.55 (k = 61, -min_contig_lgth 100 -exp_cov 59 -cov_cutoff 15). The resulting contigs were shredded to 1000 bp fragments with 800 bp overlaps. Additionally, 400 bp reverse complement fragments were created from each contig terminus. These shredded contigs and the 454 reads were subsequently assembled with Newbler (www.454.com) version 2.5 (-info -consed -nrm -finish -rip -sio -a 50 -l 350 -g -ml 20 -mi 97 -e 17). Mitochondrial sequences were screened out and assembled separately.

Metagenomic-Based Sequencing and Assembly of A. richmondensis
The A. richmondensis genome was also assembled using metagenomic sequencing reads (Aciri1_meta assembly). One library (mean insert size 193 bp) was prepared and sequenced on three lanes (2×76 bp) of an Illumina GAIIx flowcell and one lane (2 × 150 bp) of an Illumina HiSeq2000 flowcell using standard protocols at the US Department of Energy Joint Genome Institute. Raw reads are deposited at the National Center for Biotechnology Short Read Archive under study ID SRP006615. For the GAIIx runs, reads were trimmed with an in-house script that removed bases from the 3 ′ end of reads until a base with a PHRED quality score of ≥3 was found. Only read pairs with both reads ≥43 bp after trimming were kept for further processing. Overlapping HiSeq reads were merged and adapter sequences removed using SeqPrep (https://github.com/jstjohn/SeqPrep) with the following parameters: -6 -m 0.3 -n 0.7 -o 12 -Z 100 000 -N 1.
Assembly was performed in stages designed to gradually select and assemble reads likely to be from A. richmondensis. Initially, the merged reads from the HiSeq library above were assembled using velvet (Zerbino and Birney, 2008) version 1.1 with the columbus module (k = 99, -cov_cutoff 5.0 -exp_cov 19.7 -ins_length 193 -ins_length_sd 27.4) using a bowtie v.1 (Langmead et al., 2009) mapping of reads to the isolate genome as a reference. Contigs greater than three kbp from this assembly were binned based on tetranucleotide content using emergent self-organizing maps (ESOM) (Dick et al., 2009). Contigs 2-3 kbp in length were then added to the ESOM by calculating their tetranucleotide frequencies and projecting these 2-3 kb contigs onto best match positions on the fixed ESOM learned from the set of contigs greater than three kbp in length, using the Project command in Databionics ESOM Tools (http://databionic-esom. sourceforge.net). Contigs belonging to a large, obvious fungal bin in the ESOM were manually selected and exported. Putative fungal contigs were initially identified based on membership in one or more of the following contig sets. Set (1) contained contigs in the manually identified fungal ESOM bin. Set (2) contained contigs that were nearby to contigs in set (1) in the genome assembly graph (≤two edges away in velvet's final de Bruijn graph, defined using velvet's LastGraph file). That is, these contigs showed some read-based evidence of contiguity with contigs in set (1) or evidence of contiguity with contigs that were first neighbors to contigs in set (1). Set (3) contained contigs with a best blast hit of any putative protein on the contig to a fungal protein with e ≤ 1e-10. All reads from all sequencing runs were mapped against this set of putative fungal contigs with bowtie as unpaired reads, and all mapping reads (and paired reads, if available) were collected. Velvet was again used to assemble this subset of reads using the isolate genome as a mapped reference (k = 75, -cov_cutoff 5.0 -exp_cov 20.5 -ins_length 193 -ins_length_sd 27.4). Contigs were split at any run of ambiguous N bases (there are no scaffolds in this assembly), and used as input to a new ESOM-based binning. Contigs were selected from the resulting large fungal bin, and searched against an in-house database of known AMD bacteria and archaea from the study site, as well as draft Sulfobacillus genomes assembled from this data. Based on this, 12 contigs were removed from the assembly as potential non-fungal contaminants. After data analysis, one additional contig (01615) was identified as having similarity to a new Sulfobacillus genome. Finally, contigs were broken at potential mis-assemblies using an in-house script that looks for regions with zero paired-read support, as previously described .
Assemblies were annotated using a standard pipeline implemented in the MycoCosm fungal web portal at the JGI (Grigoriev et al., 2012(Grigoriev et al., , 2014. Prediction of filtered gene models was aided by 24.4 Gbp of initial cDNA sequencing (Illumina HiSeq) derived from poly-A selected streamer community RNA.
For comparison, we also assembled the metagenomic reads without using information from the isolate assembly, following an analogous iterative protocol to that above. Velvet was used to assemble the merged HiSeq library reads as above, without the aid of the reference, and contigs ≥3 kb from the resulting assembly was binned via ESOM-based binning of tetranucleotide frequencies. Contigs between 1000 and 2999 bp were added to the ESOM map by projection. The ESOM structure, BLAST hits to an in-house database of known AMD bacteria and archaea from the study site, and BLAST hits to Dothideomycete genomes were used to identify the large putative fungal bin in the assembly. All reads from all sequencing runs were mapped to contigs from this selected bin and contigs in the DeBruijn assembly graph within two edges of these contigs. Mapped reads and their read pairs were used as input to a new velvet assembly (k = 75, -cov_cutoff 5.0 -exp_cov 24.5 -ins_length 193 -ins_length_sd 27.4). Contigs were split at any run of N bases, and a new ESOM was prepared with contigs ≥3 kb, with contigs 1000-2999 bp then projected on the map. The fungal bin was manually selected based on ESOM topography and BLAST hits as above. Selected contigs were manually screened via BLAST against the in-house database of AMD organisms and against the NCBI nr database for nonfungal contigs, and 38 contigs were removed. Lastly, contigs were quality-screened as above for misassemblies.

Genome Completeness
Genome completeness was estimated based on the presence of conserved, low-copy-number genes in both the isolate and metagenome assemblies of A. richmondensis. CEGMA (Parra et al., 2009) version 2.5 was run with default parameters, and completeness was estimated based on the presence of partial or complete hits to 248 "core eukaryotic genes." We also considered hits only to "group 4" genes, which represent the most conserved KOGs. In addition, the "clusters" tool in MycoCosm (Grigoriev et al., 2012(Grigoriev et al., , 2014 was used to identify all gene family clusters (based on Markov clustering of predicted proteins; Enright et al., 2002) that exist in single copy in the 50 other available Dothideomycete genome assemblies available (May 2014). We then counted the number of these clusters that contained one or more representative proteins from the isolate or metagenome assemblies of A. richmondensis.
Carbohydrate Active Enzymes (CAZymes) were predicted using sequence libraries that were built with full length and constitutive modules (glycoside hydrolase, GH; polysaccharide lyase, PL; carbohydrate esterase, CE; glycosyltransferase, GT; and carbohydrate-binding module, CBM) isolated from the collection of carbohydrate-active enzymes of the CAZy database (www.cazy.org) (Cantarel et al., 2009). A series of profile hidden Markov models (HMMs) were built from each of the module families (and subfamilies in a number of cases) described by the CAZy database. Assignment of the protein models to a CAZy family (or to several families in the case of multimodular proteins) was performed by a two-step procedure that involved first a BLAST (Altschul et al., 1997) search against the full length CAZy proteins, keeping for further analysis all proteins that gave a e-value smaller than 0.1. The selected proteins were then subjected to a combination of BLAST and HMMer3 (Eddy, 2009) searches against the libraries of sequence made with the CAZy modules and the collection of HMM profiles, respectively. All positive results giving significant scores with both BLAST and HMMs were manually inspected, checked for the presence of catalytic residues or sequence motifs characteristic of the family, annotated for possible problems (for instance, if the model was a fragment), and assigned to one or several CAZy families.

Molecular Phylogeny
The intergenic transcribed spacer region (ITS) of the A. richmondensis genome was identified via BLAST (Altschul et al., 1997) against a set of previously identified Dothideomycetes fungi (Yamazaki et al., 2010), followed by manual adjustment. Fungal strains with identical ITS sequences were removed from the set, and sequences were aligned with mafft using default nucleotide parameters (mafft-linsi) (Katoh and Standley, 2013). Gblocks (Talavera and Castresana, 2007) was used to trim the alignment to conserved positions in minimum block-lengths of five nucleotides that were deemed informative for phylogenetic analyses. This resulted in an alignment of 424 positions in 19 conserved blocks. FastTree (Price et al., 2010) was used for approximate maximum likelihood tree inference with support values under the discrete gamma model with 20 rate categories (-gamma). All Bispora sp. MEY-1 coding sequences (HM003045.1, HM003044.1, HM003043.1, GU351880.1, GU074519.1, FJ695140.1, FJ492963.1, FJ472925.1, FJ212324.1, EU919724.1) were downloaded from the NCBI nucleotide database, and searched against both assemblies with default blastn parameters.

RNA Extractions
Biofilm samples were prepared for RNA extractions by resuspending 1 mL of frozen biofilm (from a 2 mL microcentrifuge tube) in 1 mL 1X PBS (pH 7), centrifuging at 15,000 rpm for 2 min at 4 • C, resuspending in 1 mL 1X PBS (pH 7), and then splitting the resulting suspension into separate aliquots of ∼100 µL. Aliquots were pelleted by centrifugation, cleaned of supernatant and immediately flash frozen on liquid nitrogen and stored at −80 • C until processing, within 1 month of preparation. Prior to RNA extraction, 1 mL of RLC Buffer (from the RNeasy Plant Mini Kit, QIAGEN, Valencia, CA) with 10% (v/v) 2-mercaptoethanol (2-ME) was added to each pellet. The resulting mixture was transferred to Lysing Matrix E bead beating tubes (MP Biomedicals, Santa Ana, CA) and immediately flash frozen in liquid nitrogen. Tubes were then thawed and bead beaten for 30 s using a Fast Prep-24 Automated Homogenizer (MP Biomedicals, Santa Ana, CA) at 6.0 m/s. Following bead beating, tubes were flash frozen again. Tubes were then heated for 1 min at 65 • C, vortexed for 30 s, heated at 65 • C for 2 min, vortexed for 30 s and briefly centrifuged to settle cell debris. Five hundred microliters of the resulting supernatant was processed using the QIAGEN RNeasy Plant Mini Kit (QIAGEN, Valencia, CA) following the manufacturer's protocol.
RNA extraction replicates were pooled (10-40 replicates per biofilm sample) and then treated with DNase using the QIAGEN RNase-free DNase Set, following the manufacturer's protocol (QIAGEN, Valencia, CA). Samples were further cleaned and concentrated with the RNeasy MinElute Cleanup Kit (QIAGEN, Valencia, CA). Pooled, cleaned and concentrated samples were assessed for purity, quality, and concentration using a NanoDrop Spectrophotometer (NanoDrop Products, Wilmington, DE), Qubit Broad Range RNA Assay (Life Technologies, Grand Island, NY), and an Agilent Bioanalyzer with the RNA 6000 Nano Kit (Agilent Technologies, Santa Clara, CA), yielding 23-187 µg RNA per biofilm sample.

RNA Sequencing
Total RNA was processed and sequenced by the Joint Genome Institute. Two separate approaches were used to enrich mRNA in the biofilm samples: (1) poly-A selection to target eukaryotic transcripts, and (2) ribosomal RNA (rRNA) depletion using the RiboZero human and bacteria kits (Epicentre) for targeting transcripts from the entire community. Transcripts were converted to cDNA and sequenced with an Illumina HiSeq 2000 instrument using 150 base pair (bp), paired-end reads.
Reads were trimmed to 36 bp and then aligned with BWA [seed length = 25, maximum hits = 1; (Li and Durbin, 2009)] to an in-house reference database containing nearly 80,000 predicted transcripts derived from previous genomic characterizations of cultures and biofilms from the Richmond Mine AMD system (Denef et al., 2010 and references within). Reads from the poly-A selected biofilm samples were aligned to the A. richmondensis (Aciri1_meta) transcriptome available in MycoCosm (Grigoriev et al., 2012(Grigoriev et al., , 2014. The number of reads that aligned to each gene were counted for each sample, only considering uniquely mapping reads and those mapping to one strand.

Protein Extraction
AMD biofilm samples were prepared for protein extractions by removing 100-200 mg of frozen biomass and centrifuging at 15,000 rpm at 4 • C for 2 min to pellet biofilm and remove any residual fluid. Proteins were extracted using a protocol modified from previously reported methodology (Chourey et al., 2010). Briefly, cells were lysed using three approaches. First, cell pellets were flash frozen with liquid nitrogen and ground with a micropestle. Ground pellets were then resuspended in 1 mL SDS cell lysis buffer (5% SDS; 50 mM tris-HCl, pH 8; 150 mM NaCl; 0.1 mM EDTA; 1 mM MgCl 2 ) with 12 µL 4 M dithiothrietol. This solution was transferred to a Lysing Matrix E bead beating tube (MP Biomedicals, Santa Ana, CA) and processed for 30 s using a Fast Prep-24 Automated Homogenizer (MP Biomedicals, Santa Ana, CA) at 6.0 m/s. Tubes were centrifuged briefly and the supernatant was transferred to 15 mL tubes. This solution was then sonicated using an Ultrasonic Processor (QSonica, Newtown, CT) for three intervals of 1 min at 100 amps followed by 1 min of rest. Next, the lysed cell solution was incubated for 15 min at 99 • C, cooled, vortexed for 3 min and centrifuged at 10,000 rpm for 10 min at 4 • C. Pelleted cell debris was discarded and 300 µL 100% trichloroacetic acid (TCA) was added to the supernatant cell lysates which were then incubated overnight at 4 • C to precipitate proteins. Following overnight incubation, precipitated proteins were pelleted by centrifuging at 14,000 rpm for 20 min at 4 • C. Pellets were washed three times with 1 mL acetone, centrifuging at 14,000 rpm for 10 min at 4 • C between each wash. Once washed, the pellet was allowed to air dry and then resuspended in guanidine HCl buffer (6 M guanidine HCl; 50 mM Tris, pH = 7.6; 10 mM CaCl 2 ). Total protein concentrations were estimated with the bicinchoninic acid (BCA) assay (Pierce BCA Protein Assay Kit; Thermo Scientific, Waltham, MA).

Proteomics Methods
Each protein sample was reduced with DTT (10 mM, final concentration). Fifty micrograms of proteins from each AMD biofilm sample was further processed with the Filter-Aided Sample Preparation (FASP) method (Wiśniewski et al., 2009) following the manufacturer's protocol (Expedeon, San Diego, CA). Proteins were digested overnight with sequencing grade trypsin (Promega, Madison, WI) in an enzyme:substrate ratio of 1:100 (wt:wt) at room temperature with gentle shaking, followed by a second digestion for 4 h with the same amount of trypsin (i.e., 0.5 µg). The peptide samples were eluted from the filter by centrifugation and then stored at −80 • C.

2D-LC-MS/MS Proteomic Measurements
The multi-dimensional protein identification technology (MudPIT) (Washburn et al., 2001) was used in our analytical workflow. In each MudPIT run, 50 µg of peptides were loaded offline into a 150-µm-I.D. 2D back column (Polymicro Technologies, Phoenix, AZ) packed with 3 cm of C18 reverse phase (RP) resin (Luna, Phenomenex, Torrance, CA) and 3 cm of strong cation exchange (SCX) resin (Luna, Phenomenex, Torrance, CA). The back column was connected to a 100µm-I.D. front column (New Objective, Woburn, MA) packed in-house with 15 cm of C18 RP resin. The back column and front column were placed in-line with a U3000 quaternary HPLC pump (Dionex, Sunnyvale, CA). Prior to the measurement, the back column loaded with peptides was de-salted offline with 100% Solvent A (95% H 2 O, 5% CH3CN, and 0.1% formic acid), and washed with a 1 h gradient from 100% Solvent A to 100% Solvent B (30% H 2 O, 70% CH3CN, and 0.1% formic acid) to move peptides from RP resin to SCX resin. Each MudPIT run was configured with the 11 SCX-RP separations in 22 h. 5, 7, 10, 12, 15, 17, 20, 25, 35, 50, and 100% of Solvent D (500 mM ammonium acetate dissolved in Solvent A) were used in the 11 SCX fractionations. Each SCX fraction was separated by a 110 min RP gradient from 100% Solvent A to 60% Solvent B.
The AMD biofilm samples were measured on an LTQ Orbitrap Velos Pro mass spectrometer (Thermo Scientific, Waltham, MA). MS scans were acquired in Orbitrap with the resolution of 30,000. After each MS scan, the top 20 most abundant precursor ions were selected for fragmentation by collisional-induced dissociation with 35% normalized collision energy (NCE) and 10 ms activation time. Fragment ions were measured in ion trap. Precursor ions with unassigned charge states and +1 were rejected for MS/MS analysis. Dynamic exclusion was enabled with ±10 parts-per-million (ppm) exclusion window width and 60 s of exclusion duration.

Protein Identification and Quantification
All MS/MS spectra were searched with Sipros Hyatt and Pan, 2012;Wang et al., 2013) against a database containing ∼80,000 proteins derived from ∼80 Gb of genomic information obtained from previous genomic characterizations of biofilms sampled from the Richmond Mine AMD system. The database was concatenated with reverse sequences for estimation of false discovery rate (FDR), using the following search parameters: parent mass offsets of −1, 0, +1, +2, +3 Da; 0.03 Da and 0.5 Da of mass tolerances for parent ions and fragment ions, respectively; up to three missed cleavages; full enzyme specificity required; dynamic modifications included oxidation of methionine and deamidation of asparagine and glutamine; static modification included carbamidomethylation of cysteine.
The raw search result of each MudPIT run was filtered individually to achieve 1% FDR at the peptide level estimated by concatenated reverse peptide sequence identifications. Proteins were inferred from the identified peptides using parsimony rules. Briefly, indistinguishable proteins were combined into protein groups and subset proteins and subsumable proteins were removed. A minimum of two peptides, at least one of which must be unique, was required for each inferred protein or protein group, which resulted in <1% FDR at the protein level, estimated by concatenated reverse protein sequence identifications.
Protein quantification was carried out with ProRata (Pan et al., 2006;Wang et al., 2013). For protein quantification via spectrum counting, spectrum counts of proteins were balanced by taking the sum of the spectrum counts uniquely mapping to a gene plus a fraction of the non-unique spectra split between matching proteins.

Proteome-and Transcriptome-Based Community Structure Analyses
For the community-level analysis of the biofilms, transcript read counts, and protein spectral counts were normalized by making the total "library-size" of each sample identical. Transcript read counts from the ribosomal RNA-depleted libraries were normalized by the sum of all mapped pairedend reads in the sample and multiplied by 1,000,000. Balanced protein spectrum counts were normalized by the sum of all balanced spectrum counts in the sample and multiplied by 1,000,000. After normalization, only uniquely mapping proteins (to one protein from one organism) were considered for further analyses.
For organism-level analysis of the biofilms, protein spectral counts were further normalized by making the total sum of an individual organism (e.g., A. richmondensis) identical across all samples. Poly-A selected RNA was used for A. richmondensis organism-level transcriptome analyses, normalized in the same manner.
Hierarchical clustering was performed on transcript and protein abundance values normalized at the communitylevel and organism-level (for A. richmondensis) with absolute intensities converted to percentages for each transcript/protein (the sum of the percentages for each transcript/protein is equal to 100%). The clustering method used an uncentered Pearson correlation distance matrix and average linkage clustering (using Multi-experiment Viewer; MeV_4_8; www.tm4.org/mev/) (Saeed et al., 2003). For clustering analyses only, 0.000001 was added to each number to avoid software adjustments of zero values.
Community structure was evaluated by summing the transcript read counts and protein spectral counts for each organismal group (e.g., Leptospirillum group III, archaea, etc.) and then dividing by the total sum of all transcripts/proteins in the sample (using community-level normalized values).

Differential Expression Analyses
For differential expression analyses, transcripts and proteins were only considered if they were quantified in all five biofilm replicates. Differential expression of individual transcripts and proteins were identified as those with abundance ratios (e.g., floating biofilm:streamer biofilm) >2 or <0.5 combined with a Rank Product p = 0.01, similar to methods used in other studies (Williamson et al., 2008;Dobbin et al., 2010;Soares et al., 2010;Zhao et al., 2010;Han et al., 2011;Jain et al., 2011;Muthukrishnan et al., 2011). Rank Product, commonly used in microarray experiments, is a non-parametric statistical method based on ranks of fold changes (Breitling et al., 2004). In the significance analyses, transcripts and proteins that are considered as up-regulated in one condition are concomitantly considered as down-regulated under the other condition. Counts were not normalized by gene/transcript/protein length in order to minimize data transformations, thus, only transcripts/proteins with the largest fold change or greatest significance between treatments were considered, rather than the largest overall abundance in a sample.
Gene Set Enrichment Analysis (GSEA) (Subramanian et al., 2005) was used to determine whether expression of defined set of genes (e.g., genes involved in amino acid biosynthesis) showed statistically significant differences between conditions (floating vs. streamer biofilms). Gene set analyses can illuminate important effects on overall pathways that might be missed by singlegene analyses. GSEA settings were: gene set permutation; classic enrichment analysis; and log 2 ratio of classes metric for ranking genes.

Accession Numbers
The A. richmondensis genomic sequence assemblies have been deposited at DDBJ/EMBL/GenBank under the accession numbers JOOL00000000 (metagenome assembly) and JPDO00000000 (isolate assembly). Genomic and transcriptomic data is available in the MycoCosm web-portal (http://jgi.doe.gov/ fungi; Project ID: 402991). The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE (Vizcaíno et al., 2016) partner repository with the dataset identifier PXD003541.

Morphology of A. richmondensis in Biofilms and in Culture
Floating and streamer biofilms were collected from the Richmond Mine (Iron Mountain near Redding, California). In the context of prior molecular surveys , microscopy indicated that A. richmondensis was likely the only eukaryote in the biofilms sampled, based on fluorescence in situ hybridization with probes specific to Eukaryotes and to Dothideomycetes (Figure 1). A. richmondensis strains were previously observed in other biofilms from the mine, based on microscopy and phylogeny ), but no prior studies have evaluated its genomic potential, physiology, or ecology.
In the streamer biofilms, A. richmondensis grew as dark hyphae in microscopic bundles, with very few branchings (Figure 1). Hyphae were ∼2.7 µm wide, with septa spaced approximately every 16.5 µm. In floating biofilms, A. richmondensis grew as filaments interspersed within the biofilm matrix and did not form thick bundles as seen in the streamer biofilms.

Assembly of A. richmondensis Genome from Metagenomic and Genomic Sequence Datasets
Genome assembly was performed from Illumina and Roche/454 sequencing reads derived from an isolate of A. richmondensis. The A. richmondensis genome was also assembled using Illumina sequencing reads derived from the metagenome of a natural streamer biofilm community containing fungi, bacteria, archaea, and viruses. The metagenome reads were assembled in two ways: (1) using the assembled genome from the culture isolate as a reference, and (2) as a completely separate dataset independent of the culture isolate reads. All three approaches produced assemblies of comparable length and number of contigs ( Table 1). The reference-guided metagenome assembly had the best overall contiguity so it was used for all downstream analyses. The assembly was 26.8 Mbp in length with 10,352 predicted genes, compared to the average Dothideomycetes genome of 38.9 Mbp with 11,955 genes (Supplemental Dataset) (Ohm et al., 2012).  Two different approaches estimate high genome and predicted proteome completeness based on the presence of conserved genes (Table 1). First, we used the CEGMA pipeline (Parra et al., 2009) to estimate completeness based on the presence of 248 eukaryotic orthologous groups (KOGs) (Koonin et al., 2004) identified as present in single or low copy number across six diverse eukaryotic genomes. This CEGMA approach yielded estimates of 96.8% completeness, and 100% completeness when only a subset of the most conserved KOGs is considered. Second, genome completeness was estimated considering 1007 gene families (based on protein clustering) present in single copy within all of the other Dothideomycete genomes available in the MycoCosm fungal genome portal (Grigoriev et al., 2012(Grigoriev et al., , 2014. In the A. richmondensis metagenome assembly, 99.4% of these families were represented (97.7% of these were present as a single protein).

Phylogenetic Context for A. richmondensis
As closely related Dothideomycetes have not yet been sampled genomically, the assembled A. richmondensis internal transcribed spacer (ITS) region  was used to place the fungus within a broader phylogenetic context (Supplemental Figure S1). A. richmondensis belonged to the Dothideomycetes class of the Ascomycota phylum. The assembled A. richmondensis ITS was 95% identical to Acidomyces acidophilum, an organism previously suggested to be identical to A. richmondensis that was isolated from acidic soils and mine drainage in Canada (as Scytalidium acidophilum) (Sigler and Carmichael, 1974), Northern Europe (Selbmann et al., 2008), and the Czech Republic (Hujslová et al., 2009). The assembled ITS region was identical to the recently described Teratosphaeria acidotherma, isolated from acidic Japanese hot springs (Yamazaki et al., 2010), and to Bispora sp. MEY-1, isolated from acidic uranium AMD in China (Luo et al., 2008). T. acidotherma (renamed as Acidomyces acidothermus) was also isolated from acidic soils in the Czech Republic and Iceland (Hujslová et al., 2013). Ten Bispora sp. MEY-1 carbohydrate-active enzyme genes each aligned to A. richmondensis contigs with 99-100% identity. Thus, A. richmondensis and closely related fungi appeared to be globally distributed in diverse acidic environments, and future genome sequencing will help resolve the molecular phylogeny of these organisms.

Biofilm Community Transcriptome and Proteome
Quantitative transcriptomics and shotgun proteomics were used to determine transcript and protein abundance and to infer function in three replicates each of floating and streamer biofilms.
The floating biofilm sample F3 was removed from subsequent analyses because a low number of transcripts and proteins were recovered (45-46%) compared to the other five biofilm samples. Additionally, the community transcriptome was uncorrelated to any other sample (based on hierarchical clustering and Pearson correlation values ≤ 0.33). Taken together, this suggests that there may have been a technical error in the RNA and proteins extractions from the F3 biofilm, or that the biofilm matrix differed in some way, possibly resulting in extraction biases.
Across the five remaining biofilms, 55,829 different transcribed genes were quantified from 80,044,780 total paired-end reads (Supplemental Dataset). Poly-A selected transcriptomics libraries generated another 49,570,388 pairedend reads (Supplemental Dataset). A total of 7791 different proteins were quantified from the biofilm communities, 7004 of which could be uniquely assigned to a single protein in a single organism (Supplemental Dataset). Most of these proteins were quantified across multiple biofilm samples. From the 10,352 genes predicted in the A. richmondensis genome, a total of 10,260 transcripts and 2497 proteins were quantified in the biofilms.
Community transcriptomes and proteomes were strongly correlated within the same condition (floating or streamer biofilms) but not between different conditions (Pearson correlation coefficients: within floating 0.78-0.92; within streamer 0.85-0.93; between floating and streamer 0.14-0.35). Hierarchical clustering of transcript and protein abundance showed that floating and streamer biofilms clustered into separate groups (Figure 2).

Community Expression Profile of Floating and Streamer Biofilms
We evaluated the community expression profile within the biofilms using transcripts and proteins quantified from 25 different bacterial, archaeal, and eukaryal organisms (Figure 3; Supplemental Figure S2). Rank abundance curves based on transcript and protein counts indicated that transcripts and proteins were unevenly distributed among organisms in streamer biofilms (Supplemental Figure S3). A. richmondensis transcripts and proteins dominated the streamer biofilms, averaging 85% of the community, relative to 7% of the floating community (based on transcript and protein counts). Overall, only three organisms consistently had more transcripts and proteins in the streamer biofilms than in the floating biofilms: A. richmondensis, Sulfobacillus I, and Ferroplasma II (Figure 3). Both Sulfobacillus and Ferroplasma have closely related genotypes that had more abundant transcripts and proteins in floating biofilms.
Transcript and protein abundance was more evenly distributed in floating biofilms (based on rank abundance curves; Supplemental Figure S3). In the floating biofilms, the Leptospirillum bacterial genotypes together accounted for 41-52% of the transcriptome and 72-80% of the proteome. Transcripts and proteins from the Leptospirillum bacterial genotypes were all more abundant in floating biofilms than in streamer biofilms, however, the ratio of group III to group II was lower in floating biofilms (i.e., group III were less abundant than group II in floating biofilms but more abundant in streamer biofilms).
Overall, detected transcript and protein counts are influenced by a complex combination of factors (including variations in genome copy number, gene copy number, genome size, gene size, cell count, cell volume, activity levels per cell, and inherent methodological biases) and should not be interpreted as a measure of cellular abundance.

Expression of A. richmondensis Genes in Floating and Streamer Biofilms
We evaluated the predicted metabolism and in situ expression of A. richmondensis in floating and streamer biofilms. Transcript FIGURE 3 | Relative abundance of detected transcripts and proteins per organism in floating and streamer biofilms. The log ratio of transcript (dark gray) and protein (light gray) abundance between floating and streamer biofilms is based on the summed transcript and protein abundance for each organism averaged across replicates. Blue-and purple-shaded bars indicate relative levels of overall transcript abundance on a linear scale. and protein abundance was evaluated at the organism-level to adjust for differences in organism abundance across samples (i.e., by making the total sum of all A. richmondensis transcript and protein counts identical across all samples; organism-level transcripts originated from the poly-A selected RNA sequencing libraries). A total of 9867 A. richmondensis transcripts quantified in all of the five biofilm replicates were considered for gene set analyses of the biofilms. More than 60 different gene sets were evaluated covering a broad range of functionality, as described below. A. richmondensis proteins were not considered for gene set analyses of the biofilms because only 357 A. richmondensis proteins were quantified in floating biofilms, due to the organism's low abundance in the floating biofilm community. Detected A. richmondensis proteins were, however, used to confirm expression of certain genes and pathways (see below).

General Metabolism
Transcripts involved in amino acid metabolism (including 9 KEGG pathways, such as histidine metabolism) were significantly up-regulated in streamer biofilms. Conversely, transcripts of amino acid transporters were significantly up-regulated in floating biofilms. Transcript abundance in broad metabolic pathways varied between biofilm conditions: replication, recombination, and repair; RNA processing and modification; and transcription were significantly more abundant in floating biofilms, whereas translation; ribosomal proteins; energy metabolism; and lipid metabolism were all significantly more abundant in streamer biofilms. Cytochrome c oxidase genes (aerobic respiration) were detected at the transcript and protein level in both types of biofilms. Nearly 100 vesicle-related genes were predicted in the A. richmondensis genome, and these transcripts were significantly more abundant in streamer biofilms. More than half of the vesicle-related genes were detected as proteins within the streamer biofilms, and three were detected in floating biofilms.

Heavy Metal Transport and Detoxification
A. richmondensis encoded and expressed many genes related to heavy metal transport and detoxification. The A. richmondensis genome had a wide range of metal transporters specific to iron, copper, zinc, magnesium, calcium, and nickel/cobalt. Several genes potentially involved in chelation were identified in the A. richmondensis genome, including a ferrochelatase, a siderophore dependent iron transporter, sideroflexin, and several transcripts for ferric-chelate reductase. Cyanide hydratase/nitrilase and cyanate lyase genes may act to detoxify exogenous cyanide or cyanide byproducts of other cellular metabolic reactions.
When considering all of the metal transport and detoxification genes together (i.e., as one gene set), metalrelated transcripts were significantly up-regulated in streamer biofilms. Additionally, an individual arsenate reductase was significantly up-regulated at the protein level in streamer biofilms. Overall, 24 metal transport and detoxification proteins were measured only in streamer biofilms, and an additional eight proteins were measured in both floating and streamer biofilms.

Compatible Solute Metabolism
The A. richmondensis genome encoded genes involved in the metabolism of compatible solutes that may be particularly useful in the high-ionic-strength waters within the Richmond Mine (Druschel et al., 2004). The A. richmondensis genome contained genes involved in the biosynthesis and degradation of taurine, and prior metabolomic analyses have identified taurine in AMD biofilms (Mosier et al., 2013). A. richmondensis also had genes involved in the biosynthesis and degradation of trehalose, as well as a betaine-aldehyde dehydrogenase gene. Overall, transcripts involved in compatible solute metabolism were significantly upregulated in streamer biofilms. More than half of the compatible solute-related genes were measured as proteins within the streamer biofilms, and two were measured in floating biofilms.

Carbon Transformations
The A. richmondensis genome encoded complete TCA, glycolysis, and pentose phosphate pathways. Additionally, A. richmondensis contained many genes involved in the metabolism of fructose, mannose, galactose, starch, and sucrose.
The central carbon metabolism pathways and carbohydrate transporters were significantly up-regulated in streamer biofilms (Figure 4). However, CAZymes and starch and sucrose metabolism were up-regulated in floating biofilms. CAZymes were also significantly more abundant in floating biofilms when broken down by each class (GTs, CBMs, CEs, and GHs), or when only considering cellulases, GHs acting on alpha-linked sugars, or GHs acting on beta-linked sugars. When analyzing each CAZyme family (e.g., the sum of all expressed GH88 transcripts per biofilm replicate), 75% of the families were more abundant in floating biofilms (Figure 5). At the level of individual transcripts, 16 GHs were significantly more abundant in floating biofilms, a majority of which act on beta-linked sugars ( Table 2).
A. richmondensis encoded genes to produce acetate, ethanol, lactate, and propionate as fermentation end products. The fermentative pathway transcripts were significantly up-regulated in streamer biofilms (Figure 4). Nearly 75% of the fermentationrelated genes were measured as proteins within either the floating or streamer biofilms.

Nitrogen Transformations
The A. richmondensis genome had genes for denitrification (the reduction of nitrate to nitrogen gasses) including a nitrate transporter, two nitrate reductases (nitrate to nitrite), a coppercontaining nitrite reductase (NirK; nitrite to nitric oxide), and a cytochrome-P450nor nitric oxide reductase (nitric oxide to nitrous oxide). Each of these genes were expressed in the biofilms at the transcript level, as well as nitrite reductase proteins. Nitrous oxide reductase genes (nitrous oxide to N 2 ) were not evident in the genome, similar to all known denitrifying fungi (Shoun et al., 2012). Nitrous oxide production (∼6 ppm) was measured in pure cultures of A. richmondensis grown at pH 1 under oxic conditions where the media and headspace were aerobic but the supply of air was discontinued after the serum vials were sealed. No nitrous oxide production was detected in anaerobic cultures or in abiotic controls. Denitrification transcripts were up-regulated in fungi growing in the floating biofilms, relative to streamer biofilms.
The nitrate reductase and an assimilatory nitrite reductase (nitrite to ammonia) may be used for assimilatory or dissimilatory purposes via ammonia fermentation as seen with the fungus Fusarium oxysporum (Zhou et al., 2002). A. richmondensis also had ammonium transporters and urease genes, which facilitate the hydrolysis of urea into carbon dioxide and ammonia.

Reconstruction of Eukaryotic Genomes from Environmental Samples
Reconstructing eukaryotic genomes from metagenomic sequences has historically been a significant challenge and has precluded genome-based expression studies of eukaryotes within their natural habitats. Here, we reconstructed the nearcomplete genome of A. richmondensis from AMD biofilms. We applied genome assembly approaches (e.g., stepwise assembly and ESOM-based binning) that have been shown to facilitate reconstruction of complete and essentially-complete bacterial and archaeal genomes from complex samples (e.g., Dick et al., 2009;. Our genome assembly efforts were feasible due in part to the low diversity of the eukaryotic AMD community, the small size of the targeted genome (27 Mbp), and the sequencing of the isolate genome. We hypothesize that assembly approaches such as these, together with continued developments in sequencing technologies and binning and assembly algorithms, may make eukaryotic genome reconstruction more tractable in complex systems.

The Overall Distribution of Transcripts and Proteins among Fungal, Bacterial, and Archaeal Community Members
The overall distribution of transcripts and proteins among microbial community members differed greatly between floating and streamer biofilms. A. richmondensis transcripts and proteins dominated the streamer biofilms (Figure 3) and comprised a significant portion of the total biomass (Figure 1). Conversely, A. richmondensis transcripts and proteins were at low abundance (7%) in the floating biofilms, which were dominated by Leptospirillum bacterial transcripts and proteins. Consistent with many studies of floating biofilms in this mine (e.g., Mueller et al., 2010;Justice et al., 2012), Leptospirillum group II transcripts and proteins were more abundant than group III in floating biofilms. Conversely, Leptospirillum group III, which can be quite abundant in some laboratory-cultivated biofilms (Belnap et al., 2011;Mosier et al., 2013Mosier et al., , 2015, transcripts and proteins were more abundant than Group II in streamer biofilms. Other than A. richmondensis, only two organisms consistently had more abundant transcripts and proteins in the streamer biofilms than in the floating biofilms: Sulfobacillus I and Ferroplasma II (Figure 3). Interestingly, both Sulfobacillus and Ferroplasma had closely related genotypes that had more abundant transcripts and proteins in floating biofilms (Sulfobacillus III and Ferroplasma I), which may be suggestive of niche differentiation within these groups. Sulfobacillus species are facultative anaerobic autotrophs, and genomic differences between types I and III include genes involved in sulfur metabolism, carbon monoxide oxidation, and nitrate reduction (Justice et al., 2014). Prior genomic analyses indicated that both Ferroplasma type I and II are facultative aerobic heterotrophs and that differences between the strains included genes involved in carbon monoxide oxidation, mercury resistance, and S-layer structure (Yelton et al., 2013). It is possible that these genomic differences play a role in determining species distribution in floating and streamer biofilms. Sulfobacillus and Ferroplasma distribution may also be linked to fungal abundance.
FIGURE 5 | Relative transcript abundance of carbohydrate active enzyme (CAZyme) families in floating and streamer biofilms. Individual transcript abundances were summed for each CAZyme family across the biofilm replicates (e.g., transcripts from two genes encoding GH7s were summed for each biofilm replicate). Summed abundances were then transformed to a log ratio of the replicate average (floating biofilm average:streamer biofilm average). Only CAZyme families with transcripts measured in all five biofilm replicates are shown.

Response to Environmental Stresses
The chemical and physical conditions of the AMD are expected to be similar in the floating and streamer biofilm habitats, with low pH, high metal concentrations and high ionic strength. However, A. richmondensis genes involved in responding to these harsh conditions (i.e., metal transport and detoxification and compatible solute metabolism) were more highly expressed in streamer biofilms than in floating biofilms. Thus, A. richmondensis may experience greater stress in streamer biofilms. Mature biofilms floating at the air-solution interface, possibly buoyed by accumulated gases, may exchange solution slowly with underlying AMD. This may protect organisms encased in the biofilm matrix from environmental challenges. In contrast, submerged streamer biofilms are likely permeated by the flowing

Denitrification
Denitrification by fungi has been shown in pure cultures growing at pH values ranging from ∼5-7 (Shoun et al., 2012;Jasrotia et al., 2014;Rohe et al., 2014a,b). Denitrification attributed to fungi and/or archaea (i.e., denitrification as measured in the presence of bacterial inhibitors) was measured in groundwater sediments with pH as low as 3.67 (Jasrotia et al., 2014). Here, we show that A. richmondensis may denitrify at pH 1 (possibly the lowest recorded pH for denitrification by any bacteria, archaea, or eukaryote): A. richmondensis has denitrification genes for converting nitrate to nitrous oxide gas and expresses these genes as transcripts and proteins in natural biofilms. Nitrous oxide was produced in A. richmondensis cultures, but not in abiotic controls with nitrate. These results do not rule out the possibility of chemodenitrification (the abiotic conversion of nitrite or nitric oxide to nitrous oxide), but do show that A. richmondensis may play a role in nitrogen gas production (either by coupling biotic nitrate reduction to abiotic chemodenitrification; or by biotic denitrification). Denitrification is predominantly performed by facultative anaerobes in anoxic environments such as saturated soils, estuary sediments, and wetlands (e.g., Prieme et al., 2002;Henry et al., 2004;Seitzinger et al., 2006;Mosier and Francis, 2010). For A. richmondensis, nitrous oxide production occurred in aerobic cultures, but not in anaerobic cultures. Aerobic cultures were grown in sealed vials where the media and headspace were aerobic but no additional air was supplied during the growth experiment. Aerobic respiration likely decreased oxygen concentrations in the culture vials over time. Other fungal cultures have also shown that denitrifying activity requires a minimal amount of oxygen (Zhou et al., 2001;Jasrotia et al., 2014). This finding is consistent with the observation that A. richmondensis denitrification transcripts were up-regulated in floating biofilms, which have been shown to have low oxygen concentrations in the interior biofilm regions due to diffusion limitation (Ma and Banfield, 2011). Taken together, we infer that denitrification by A. richmondensis may be most important in microaerophilic conditions.
Interestingly, a few organisms are able to simultaneously use both oxygen and nitrate, though with lower productivity than when using oxygen alone (Robertson and Kuenen, 1984;Tanimoto et al., 1992). A. richmondensis denitrification and cytochrome c oxidase (aerobic respiration) transcripts and proteins were simultaneously quantified in floating and streamer biofilms. This may suggest that A. richmondensis constitutively expresses transcripts for both aerobic respiration and for denitrification in order to adapt to fluctuating conditions or that micro-niches are present within each habitat.

Carbon Transformations
A. richmondensis CAZymes were up-regulated in floating biofilms, including those that hydrolyze polymers of the sugars glucose, mannose, xylose, rhamnose, and galactose. Many of these sugars have been previously shown to be constituents of the extracellular matrix in the AMD floating biofilms (Jiao et al., 2010). These sugars may be synthesized by the bacterial and archaeal biofilm community members. For instance, Leptospirillum bacteria (the dominant members of floating biofilms) have genes for cellulose biosynthesis (Aliaga Goltsman et al., 2009 and Ferroplasma type II have genes for generating galactose. Some GHs involved in the degradation of peptidoglycan (a major membrane component of bacteria) were also up-regulated in floating biofilms. Overall, this may suggest that A. richmondensis utilizes complex carbon sources accessed from its close interactions with other community members in the floating biofilms.
In streamer biofilms, central carbon metabolism and carbohydrate transporters were significantly up-regulated relative to floating biofilms. This might suggest that A. richmondensis utilizes simple carbon sources dissolved within the stream, since it does not have access to the reservoir of complex carbon within the floating biofilms.

CONCLUSIONS
We reconstructed the near-complete genome of a microbial eukaryote, A. richmondensis, from a metagenomics dataset and confirmed the genome by comparison with that of an isolate. By examining the differential gene expression of this fungus in two distinct types of biofilm communities, we identified differences in its metabolic functioning, particularly with regard to carbon and nitrogen substrate utilization. The functional differences may be due in part to the differences in the composition and metabolism of the associated microbial community and they may reflect distinct physical and chemical conditions in the two niches. The results of this study further establish the value of linked metagenomic, transcriptomic, and proteomic approaches for prediction and functional analysis of the roles of microbes in natural ecosystems.

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fmicb. 2016.00238