Metatranscriptomics Reveals the Functions and Enzyme Profiles of the Microbial Community in Chinese Nong-Flavor Liquor Starter

Chinese liquor is one of the world's best-known distilled spirits and is the largest spirit category by sales. The unique and traditional solid-state fermentation technology used to produce Chinese liquor has been in continuous use for several thousand years. The diverse and dynamic microbial community in a liquor starter is the main contributor to liquor brewing. However, little is known about the ecological distribution and functional importance of these community members. In this study, metatranscriptomics was used to comprehensively explore the active microbial community members and key transcripts with significant functions in the liquor starter production process. Fungi were found to be the most abundant and active community members. A total of 932 carbohydrate-active enzymes, including highly expressed auxiliary activity family 9 and 10 proteins, were identified at 62°C under aerobic conditions. Some potential thermostable enzymes were identified at 50, 62, and 25°C (mature stage). Increased content and overexpressed key enzymes involved in glycolysis and starch, pyruvate and ethanol metabolism were detected at 50 and 62°C. The key enzymes of the citrate cycle were up-regulated at 62°C, and their abundant derivatives are crucial for flavor generation. Here, the metabolism and functional enzymes of the active microbial communities in NF liquor starter were studied, which could pave the way to initiate improvements in liquor quality and to discover microbes that produce novel enzymes or high-value added products.


INTRODUCTION
Chinese liquor is one of the world's four best-known distilled spirits. It accounts for more than one-third of all spirits consumed (Sweeney, 2013) and is the largest spirit category by sales in the world (Molon, 2013). The unique and traditional Chinese solid-state simultaneous saccharification and fermentation (SSF) and liquor brewing technologies have been in continuous use for several thousand years (Xiao et al., 2011;Yao et al., 2015;Xu et al., 2017). Nong-flavor (NF) liquor accounts for more than 70% of the Chinese liquor market. The main fermentation process includes two stages: liquor starter production and alcohol fermentation. The production process of NF liquor starter, which is aerobically produced, usually includes approximately 1 month of spontaneous incubation in a fermentation room and 3 months of drying in a storage room to mature (Zheng et al., 2011;Chen et al., 2014). NF liquor starters use wheat as feedstock, and the wetted wheat is shaped into bricks, each weighing approximately 1.5-4.5 kg (Zheng et al., 2011;Zheng and Han, 2016). As NF liquor starter is characterized by a moderately high temperature (62 • C), the liquor starter fermenting bricks must be maintained at a moderately high temperature for 8 days in the fermentation room. For the alcohol fermentation process, first, raw sorghum, wheat, corn, rice, and sticky rice are crushed, steamed, cooled, and mixed evenly with liquor starter. Then, the mixture undergoes a solidstate SSF process for approximately 40-45 days in a pit. Finally, the fermented mixture is distilled to produce the liquor (Tao et al., 2014;Yan et al., 2015). The fermentation processes that occur during SSF are mainly attributed to the metabolism and interactions of the microorganisms from the liquor starter, Zaopei and pit mud (Chen et al., 2014). Liquor starter is the most important and essential component for liquor fermentation. During the production of liquor starter, no microorganisms are intentionally inoculated; thus, most of the microbes are enriched from naturally occurring ecosystems, such as feedstock, water, air and the working environment, with high balance and stability. These Chinese NF liquor starter microbial communities have evolved for more than several thousand years and have greatly influenced liquor properties, such as their distinctive flavor and taste.
As the NF liquor starter production process is subjected to extremely severe conditions (50-62 • C), the special microbial community enriched in the liquor starter may produce efficient and diverse thermophilic carbohydrate-degrading enzymes. Recently, great efforts have been made to discover novel thermophilic lignocellulases with excellent performance, including high activity and marked stability (McClendon et al., 2012;Balasubramanian and Simões, 2014). The carbohydratedegrading enzymes from the liquor starter under aerobic and thermophilic conditions are different from those in termite and other herbivore-associated gut communities, which are dominated by anaerobic bacteria. Thus, enzymes from liquor starter may have great potential for industrial applications because lignocellulose decomposition has mainly been demonstrated under aerobic conditions (Robinson et al., 2001). Therefore, global and comprehensive technologies are needed to retrieve multiple thermophilic and synergistic carbohydrate-degrading enzymes from the NF liquor starter system. Elucidating the saccharification capability of liquor starters and identifying other attractive enzymes for industrial applications would also be of great value.
The microbial community of liquor starters has been studied using culture-dependent and denaturing gradient gel electrophoresis methods (Zheng et al., 2012;Yan et al., 2013;Chen et al., 2014;Zhang L. et al., 2014;Wang and Xu, 2015) as well as pyrosequencing techniques (Li et al., 2013;Wang et al., 2017). Moreover, Huang et al. (unpublished data) compared the dominant microbial community of Jiang-flavor (JF) and NF liquor starters and provided a more complete picture of the microbial composition in liquor starters. These studies have increased the understanding of the microbial community structure of liquor starters. However, not all of these methods are ideal for assessing community functions, and little is known concerning the active microbial community compositions and their metabolic functions in liquor starter. Metatranscriptomics, the direct analysis of mRNA from environmental samples, offers a powerful tool to study the active microbial community composition as well as their active genes and changes in transcriptional regulation when microbes respond to temporal variation (Mitra et al., 2011). Although, several studies have demonstrated the great advantage of metatranscriptomic technology (Bashir et al., 2013;Sanders et al., 2013;Wu et al., 2013), high-quality RNA from complex and difficult environmental samples severely challenges metatranscriptomics projects. The high content of starch and other polysaccharides, the complex fermentation products and the strong colored biomass during fermentation make RNA extraction of the microbial community in liquor starter difficult.
In the present study, we first successfully extracted total RNA from complex liquor starter samples and then applied high-throughput metatranscriptomic technology to globally, comprehensively and functionally analyze the actual microbial composition and metabolic characteristics of the most widely consumed NF liquor starter during the production process. The efforts of this study provide the first step in understanding the metabolism and function of the active microbial communities in liquor starters and pave the way toward the optimization of liquor production, improvement of liquor quality and discovery of microbes that produce valuable and novel enzymes with great potential for industrial applications.

Sample Collection
NF liquor starter was sampled from a fermentation workshop of the Yibin Hongloumeng Distillery Group Co., Ltd. in Yibing, Sichuan, China, in July 2013. The liquor starter was sampled at different time points. The samples were harvested from three locations in the same liquor starter fermentation room at each time point. N1 was collected at the beginning of liquor starter fermentation (30 • C); N2 was collected after 3 days of liquor starter fermentation (50 • C); N3 was collected after 9 days of liquor starter fermentation (62 • C); and N4 was collected from the mature liquor starter after fermentation for 20 days (25 • C). The samples were frozen in liquid nitrogen when they were harvested in the fermentation workshop and were then immediately transferred to 50-ml RNase free Corning CentriStar TM centrifuge tubes (430828, Corning, NY, USA) and kept on dry ice. Finally, all the samples were transferred to the Chengdu Biology Institute at the Chinese Academy of Sciences on the day of sampling and stored in a −80 • C freezer until analyses. The liquor starter samples for enzyme analysis were prepared as follows: 5 g of liquor starter was suspended in 20 ml of 0.1% (v/v) Tween 80 solution and transferred to the Chengdu Biology Institute, Chinese Academy of Sciences at room temperature.

Enzyme Profile of the NF Liquor Starter
After the liquor starter in the Tween 80 solution was transferred to the laboratory, the samples were incubated at 25 • C with shaking at 100 rpm overnight. The enzyme profile of the supernatant was investigated using insoluble chromogenic AZurine Cross-Linked (AZCL) polysaccharides according to the manufacturer's protocol (Megazyme, Ireland). After incubation at 35, 45, or 55 • C for 22 h, the diameter of the blue haloes were measured and recorded in millimeters.

RNA Extraction
Total RNA was extracted from liquor starter according to a previously reported method (Kumar et al., 2011) with some modifications. Briefly, 1 g of liquor starter was homogenized into fine powder in a precooled mortar with liquid nitrogen. Next, 4 ml of a hot (80 • C) borate buffer [200 mM sodium borate (pH 9.0), 30 mM ethyleneglycotetraacetic acid (EGTA), 1% (w/v) sodium dodecyl sulfate (SDS), 2% (w/v) polyvinylpyrrolidone (PVP), and 0.5% (v/v) Nonidet-40 (NP-40) combined in 0.1% diethyl pyrocarbonate (DEPC)-treated water and then autoclaved; after cooling and just before use, 10 mM β-mercaptoethanol and 0.03% (v/v) RNase inhibitor were added] and 280 µl of proteinase K (20 mg/ml) were added, and the mixture was incubated at 80 • C for 2 min. The lysate was centrifuged for 10 min at 5,000 × g. The supernatant was mixed thoroughly with an equal volume of 70% ethanol by shaking vigorously. The sample was applied to an RNeasy midi column and centrifuged for 5 min at 5,000 × g; this step was repeated for the residual sample. The sample was cleaned following the RNeasy Midi Kit protocol (Qiagen, 75142) and treated with DNase I (Fermentas, USA) according to the manufacturer's protocol. The purity, concentration and RNA integrity number (RIN) were measured using an Agilent 2100 Bioanalyzer. Qualified total RNA was submitted to the Beijing Genomics Institute (BGI)-Shenzhen, China, for RNA sequencing.
cDNA Illumina Library Construction, RNA Sequencing and De novo Assembly More than 20 µg of qualified total RNA from each sample (N1, N2, N3, and N4) was used for RNA sequencing using the HiSeq TM 2000 platform. For eukaryotes, poly (A) mRNA was purified using poly-T oligo-attached magnetic beads. For prokaryotes, rRNA was removed before subsequent library construction steps. The mRNA was mixed with fragmentation buffer and then fragmented. Fragmented mRNAs were synthesized into first-strand cDNA using reverse transcriptase and random primers. This step was followed by second-strand cDNA synthesis. Short fragments were purified and resolved with EB buffer for end reparation and poly (A)-tailing. Thereafter, the short fragments were connected with sequencing adapters, and then 200-bp cDNA fragments were purified for further template enrichment by PCR. The validated 200-bp fragment cDNA libraries were submitted for paired-end (PE) RNA sequencing using the HiSeq TM 2000 platform. Known bacterial, fungal and archaeal sequences were extracted from the National Center for Biotechnology Information (NCBI) Nucleotide (NT) database using in-house scripts, and the filtered reads were mapped to these sequences using the SOAP aligner (version 2.21) (Li et al., 2009). Next, the transcriptome data were assembled de novo using Trinity (http://trinityrnaseq.sourceforge.net/; Grabherr et al., 2011). The raw and assembled sequencing data have been deposited in the DDBJ/EMBL/GenBank database under the accession numbers SRR5384077 and GFMA00000000, respectively.

Functional Annotation and Cluster Analysis
The software TransDecoder (http://sourceforge.net/projects/ transdecoder/) was used to predict open reading frames (ORFs) based on the assembly results. The predicted amino acid sequences were aligned to diverse databases through BLAST (version 2.2.23), and related information was extracted and summarized using custom scripts. Gene Ontology (GO) classification (Ashburner et al., 2000) was achieved using WEGO (http://wego.genomics.org.cn/cgi-bin/wego/index.pl) (Ye et al., 2006). Enzyme codes were extracted, and the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways were retrieved from the KEGG web server (Kanehisa, 1997;Kanehisa et al., 2004Kanehisa et al., , 2006. Carbohydrate-active enzymes were annotated according to the Carbohydrate-Active enZymes database (CAZy) (version: 2011-9-20) (Cantarel et al., 2009). The evolutionary genealogy of genes was extracted from Non-supervised Orthologous Groups (eggNOG) (version 3.0) (Powell et al., 2012). KEGG, GO CAZy and eggNOG function cluster analyses were conducted using custom scripts. A heat map was constructed with the R package (Ihaka and Gentleman, 1996) using custom scripts.

Expression Profiling and Differential Expression Identification
To investigate the expression level of each unigene in the different samples, all the predicted ORFs were removed for redundancy using cd-hit (Version 4.6.1, http://weizhong-lab. ucsd.edu/cdhit_suite/cgi-bin/index.cgi). Unigene expression was calculated according to the fragments per kilobase of transcripts per mapped million fragments method (FPKM) (Ali et al., 2008). The P-values and log2-fold-changes (log 2 FCs) were calculated, and then the significantly differentially expressed transcripts (DETs) between the two samples were identified using p ≤ 0.05 and log 2 FC ≥ 1. Because thousands of hypothesis tests were performed using the transcriptome data, a suitable p-value for an individual test is not sufficient to guarantee a low rate of false discovery. Thus, multiple testing corrections for each individual hypothesis were performed to guarantee an overall low false discovery rate. The false discovery rate (FDR) control is a statistical method used in multiple hypothesis testing to correct for the p-value as described previously (Benjamini and Yekutieli, 2001). When the FDR was obtained, the FPKM ratio of the two samples was used at the same time. In this analysis, the values were identified as follows: FDR ≤ 0.001 and FPKM ratio ≥ 2.

Metatranscriptome Sequencing and De novo Assembly of the NF Liquor Starter Samples
After sequencing, we obtained 21.789 Gbp of data in total (Table S1). The raw reads were cleaned, pooled and assembled de novo. The assembly metrics can be found in Table S2. The maximum contig lengths were 13,340, 17,819, 12,496, and 12,933 bp, and the N 50 lengths were 404, 538, 1,084, and 1,105 bp for N1, N2, N3, and N4, respectively. The coding DNA and protein sequences were predicted and translated based on the assembled transcripts. Scanning the ORFs of all the contigs identified 25387, 58884, 56927, and 28618 ORFs with lengths longer than 600 bp for N1, N2, N3, and N4, respectively (Table S3).

Functional Profiling and Characterization of the NF Liquor Starter Metatranscriptome
The high-quality sequences were aligned to the bacterial and fungal sequences in the NT database. The composition of active species is presented in Table 1. The identified active fungal community was more prevalent than the bacterial community during the entire liquor starter production process as more active mRNA can be detected. The fungal component even increased up to 66.22% for sample N4. However, the bacterial component increased to the highest value of 15.30% for sample N3 and dropped to 12.33% for sample N4. GO, CAZy, eggNOG, and KEGG annotation combined with BLASTX was performed to profile the active genes and related pathways in the NF liquor starter. A total of 29418 unigenes from all the NF samples were annotated with the GO database, accounting for 17.32% of all the unigenes. The annotation was mainly clustered into three general sections: biological processes, cellular components and molecular functions ( Figure S1). Primary metabolic processes and catalytic activities were the most enriched GO terms in the biological processes and molecular functions sections, respectively. This result was further confirmed by KEGG analysis. For the NF liquor starter samples N1, N2, N3, and N4, 5047, 16419, 17034 and 8662 unigenes were found in 254, 264, 260, and 250 reference pathways, accounting for 19. 88%, 27.88, 29.92, and 30.27% of the total unigenes, respectively.
The 20 most abundant KEGG pathways of the four samples are shown in Figure 1. Notably, oxidative phosphorylation was ranked as the first most abundant pathway for sample N2. Among the 20 pathways, glycolysis, pyruvate metabolism, the citrate cycle, fructose and mannose metabolism, starch and sucrose metabolism and the pentose phosphate pathway were highly represented, particularly in samples N2 and N3. Additionally, butanoate metabolism and propanoate metabolism were found among the 20 pathways. Metabolism of 10 amino acids, i.e., lysine, alanine, aspartate, glutamate, valine, leucine, isoleucine, cysteine, methionine and glutathione, were also ranked in the top 20 and were dominant in samples N2 and N3.
The specific pathways related to saccharification, ethanol fermentation and flavor generation in the NF liquor starter metatranscriptome are schematically presented in Figure 2. First, polymers such as cellulose, hemicellulose, starch and protein are converted to monomers by diverse carbohydrate-active enzymes and proteases with high activity. Next, the monomers are taken up and further utilized by the microbial community. The products and intermediates of primary metabolism, such as glycolysis and the citrate cycle, as well as by-products of metabolism, mainly contribute to ethanol production and flavor development.
In this study, the most abundant and diverse auxiliary activity 9 (AA9) genes (up to 12 unique encoding genes) were found FIGURE 1 | The 20 most abundant KEGG pathways in the metatranscriptome of Nong-flavor liquor starter samples (N1, N2, N3, and N4). N1 was sampled at the beginning of liquor starter production, N2 was sampled after 3 days of liquor starter fermentation, N3 was sampled after 9 days of liquor starter fermentation, and N4 was the mature liquor starter. The temperatures of N1, N2, N3, and N4 were 30, 50, 62 and 25 • C, respectively.
in the N3 sample ( Table 3). The AA9 genes were mainly from Thermoascus aurantiacus, Trichocomaceae and Emericella nidulans. Only one AA10 protein from Bacillales was found in the N3 sample. All of these AA9 and AA10 proteins secreted by thermophilic fungi and bacteria were first identified in the NF liquor starter with low identity (41-75%) to the reported protein sequences. Additionally, the phylogenetic tree ( Figure S2) showed that the AA9 proteins are diverse in sequence.

Starch Metabolism, Glycolysis, Ethanol Metabolism, and Pyruvate Metabolism
Enzymes related to starch metabolism were analyzed among the four samples ( Figure 4A). Most of the enzymes were more abundant in the N2 and N3 samples, especially the important enzymes related to the conversion of starch to glucose, such as α-amylase, starch phosphorylase, maltose phosphorylase, β-glucosidase, glucoamylase, glucan 1,3-β-glucosidase and endoglucanase (colored in red in Figure 4A). Moreover, αamylase had the highest expression level, with an RPKM value of up to 1398.5 in the N2 sample.
We then comparatively analyzed the glycolysis metabolism of the NF liquor starter samples. The results indicated that there were various expression levels for the 9 enzymecatalyzed reactions that convert hexose to pyruvate in all the samples ( Figure S3). The N2 sample had the highest total numbers and expression levels of the glycolysis genes (Figure 1 and Figure 4B). Among these genes, glyceraldehyde 3-phosphate dehydrogenase had the highest expression level with an RPKM value of 6004.1. Many of the enzyme genes also exhibited comparable expression levels in sample N3 ( Figure 4B). Furthermore, three key enzymes in glycolysis, hexokinase, 6-phosphofructokinase and pyruvate kinase (colored in red in Figure 4B), were significantly up-regulated, with log 2 Ratio values (N2/N1) ranging from 3.6 to 16.6 ( Figure S3 and Table S5). In total, 8 hexokinases, 10 6-phosphofructokinases and 16 pyruvate kinases were up-regulated, and the upregulated genes were mainly from Saccharomycetales and Mucorales.
Pyruvate can be converted to ethanol in conditions of insufficient oxygen, which possibly occurs inside the liquor starter brick. As shown in Figure S3, alcohol dehydrogenase, aldehyde dehydrogenase (NAD+) and aldehyde dehydrogenase (NAD(P)+) are three key enzymes in ethanol metabolism. The highest expression levels of alcohol dehydrogenase and aldehyde dehydrogenase (NAD(P)+) were in N3, and the highest expression level of aldehyde dehydrogenase (NAD+) (RPKM: 1878.7) was in N2, with a comparable level (RPKM: 1300.4) in N3 (colored in blue in Figure 4B). Furthermore, compared with the N1 sample, the alcohol dehydrogenase levels in the N2, N3, and N4 samples from different microbial community members were up-regulated. In particular, in the N2 sample, 10 types of alcohol dehydrogenases were up-regulated from diverse microorganisms, such as Leuconostocaceae, Millerozyma farinose, Weissella thailandensis, Saccharomycetales, Dikarya, Rhizopus oryzae, and Lactobacillales ( Table 4). The log 2 Ratio (N2/N1) values of the alcohol dehydrogenases varied from 9.5 to 14.2. When the temperature was increased to 62 • C, some aldehyde dehydrogenases (NAD+) and aldehyde dehydrogenases (NAD(P)+) were also up-regulated in the N3 sample. The RPKM value of one aldehyde dehydrogenase (NAD(P)+) produced by Trichocomaceae increased from 9.7 to 237.6, and the log 2 Ratio (N3/N2) value of aldehyde dehydrogenase (NAD+) produced by bacteria was the highest value of 15.6 ( Table 5).
Another important intermediate of pyruvate metabolism is acetyl-CoA, which is produced by pyruvate dehydrogenase E1 and pyruvate dehydrogenase E2 (Figures S3, S4). Pyruvate dehydrogenases E1 and E2 (colored in green in Figure 4B) both showed their highest expression levels in the N2 sample, which was followed by N3. Meanwhile, other enzymes involved in the complicated metabolism of pyruvate were also compared among the four samples. Most of them showed relatively high expression levels in the N2 and N3 samples ( Figure 4C and Table S6). For the 19 most highly expressed pyruvate metabolism enzymes, 8 enzymes exhibited their highest RPKM values in the N2 sample, and 7 enzymes exhibited their highest RPKM values in N3. Malate dehydrogenase, D-lactate dehydrogenase/Dlactate dehydrogenase (cytochrome), L-lactate dehydrogenase and pyruvate oxidase (colored in red in Figure 4C) are responsible for reversibly converting pyruvate to L-malate, Dlactate, L-lactate and acetyl phosphate, respectively. Meanwhile, phosphate acetyltransferase, malate synthase and acetyl-CoA hydrolase (colored in blue in Figure 4C) are responsible for irreversibly converting acetyl-CoA to acetyl phosphate, Lmalate and acetate, respectively. More importantly, acetyl-CoA carboxylase (ACAC/accA), acetyl-CoA C-acetyltransferase and homocitrate synthase (colored in green in Figure 4C) are responsible for producing intermediates from acetyl-CoA for fatty acid biosynthesis, butanoate metabolism and leucine biosynthesis, respectively ( Figure S4). In all, the N2 and N3 samples exhibited large capacities for converting pyruvate to pivotal intermediates for carbohydrates, fatty acids and amino acids, which further contribute to specific flavor. Pyruvate can also be converted to lactic acid in conditions of insufficient oxygen, which likely occurs inside the liquor starter brick. In the NF liquor starter, low numbers of Llactate dehydrogenase (cytochrome), D-lactate dehydrogenase, D-lactate dehydrogenase (cytochrome) and L-lactate dehydrogenase were detected in the N1 (total RPKM: 80.5) and N4 (165.0) samples, but higher numbers were found in N2 (RPKM: 611.2) and N3 (RPKM: 600.2) (Figures 4B,C and Table S7).

The Citrate Cycle and Flavor Generation
A high total number of citrate cycle genes were found in the N2 sample (Figure 1). However, the number of key enzymes increased in N3 ( Figure S5). Most of the top 24 most highly expressed citrate cycle enzymes had definitively higher expression levels in N2 and N3, with 13 enzyme genes having their highest expression levels in N3 and 6 genes in N2 ( Figure 4D and Table S8). Meanwhile, isocitrate dehydrogenase (NAD+) (IDH3) and isocitrate dehydrogenase (IDH1) showed comparably high expression levels in N2 and N3. Furthermore, a comparison between the N2 and N3 samples was deeply analyzed, and most of the key enzyme genes were up-regulated in N3. The RPKM and log 2 Ratio (N3/N2) values of the key enzymes, citrate synthase, isocitrate dehydrogenase, 2-oxoglutarate dehydrogenase, dihydrolipoamide succinyl transferase and dihydrolipoamide dehydrogenase, are shown in Table 6. These key enzymes were mainly produced by bacterial and fungal community members such as Eurotiomycetidae, Bacillales, Trichocomaceae, Firmicutes, and E. nidulans.

DISCUSSION
This study globally and comprehensively explored the active microbial community and its function in the highly consumed NF liquor starter. A promising number of thermophilic enzymes were also discovered. The results showed that the fungal communities were much more diverse and the enzymes produced by them were more abundant than that of bacteria communities during the process of making the liquor starter, especially during the high temperature period. This interesting FIGURE 3 | Matched numbers of carbohydrate-active enzymes from the Nong-flavor liquor starter samples. CBM, Carbohydrate-Binding Module; GT, Glycosyl Transferase; PL, Polysaccharide Lyase; GH, Glycoside Hydrolase; and CE, Carbohydrate Esterase. N1 was sampled at the beginning of liquor starter production, N2 was sampled after 3 days of liquor starter fermentation, N3 was sampled after 9 days of liquor starter fermentation, and N4 was the mature liquor starter. The temperatures of N1, N2, N3, and N4 were 30, 50, 62, and 25 • C, respectively.
Frontiers in Microbiology | www.frontiersin.org finding is complementary to the results of 16S rRNA and ITS sequencing study (Huang et al., unpublished data), which indicated that the diversity and richness of the total bacterial community was much higher than that of the total fungal community. Thus, metatranscriptomics offered an important and excellent platform to actually understand the dynamics of microbial metabolism at the transcript level in liquor starter.
In this study, we discovered diverse and abundant carbohydrate-active enzymes from the NF liquor starter, especially in the N2 and N3 samples, which were characterized by high temperature and an aerobic environment. As mentioned above ( Table 2), some thermostable carbohydrate-active enzymes were only detected in special stages, such as endo-1,3-β-D-glucanase and endo-1,5-α-L-arabinanase in N2, rhamnogalacturonanase in N3 and N4, endo-proteases and endo-β-1,3-1,4-glucanase in N2, N3, and N4, and endo-1,4-β-D-xylanase in N4. Thus, this study highlights the benefits of specifically mining for thermostable enzymes from one special stage (N2, N3, or N4) and not just from the mature starter (N4). The liquor starter production system is markedly different from other types of environmental systems, such as the microbes in cow rumens (Hess et al., 2011), wood-feeding termite hindguts (Warnecke et al., 2007), leaf-cutter ant fungal gardens (Aylward et al., 2012), and panda guts (Zhu et al., 2011), which have been well studied using metagenomic and metaproteomic strategies. Relatively high numbers of carbohydrate-active enzymes have been found in these systems by metagenomic technologies (Table S9). However, these metagenomic analyses could not reflect the genes that are actively expressed at any given time and in response to external environmental conditions. Additionally, the microbial communities associated with these various gut systems were dominated by anaerobic bacterial taxa. Notably, industrial-scale lignocellulose degradation has mostly been demonstrated under aerobic conditions (Robinson et al., 2001). By contrast, liquor starter is made in an aerobic environment, and both bacteria and fungi were enriched, with the microbial composition dynamically changing during the production process. The highest number of carbohydrateactive enzymes was found at 62 • C and thus potentially offers thermophilic enzymes for lignocellulosic biomass degradation. The enzymatic conversion of polysaccharides in agricultural waste is a promising technology. However, it is still limited by the heterogeneity of the plant cell wall and recalcitrant biomass (Himmel et al., 2007). Previous studies have found that AA9 and AA10 can act synergistically with cellulose, hemicellulose, starch and chitin (Harris et al., 2010;Horn et al., 2012;Lo Leggio et al., 2015;Paspaliari et al., 2015;Kojima et al., 2016) because they have flat substrate-binding surfaces and have an oxidative mechanism to cleave polysaccharide chains in the crystalline context (Vaaje-Kolstad et al., 2010). The AA9 genes in the N3 sample were produced by T. aurantiacus, Trichocomaceae, and Emericella nidulans. Thermoascus aurantiacus is a promising thermophilic fungus for enzyme production and biomass degradation (McClendon et al., 2012). One of the T. aurantiacus AA9 enzymes can reduce commercial enzyme loads and is part of a well-understood synergistic system (Rosgaard et al., 2006;Harris et al., 2010). The Trichocomaceae family has been reported to have diverse physiological properties and can grow under extreme conditions. Some members of this family have been exploited in biotechnology for the production of enzymes (Houbraken, 2013). The thermotolerant E. nidulans (also called Aspergillus nidulans) can also utilize a broad spectrum of biomass to produce enzymes with high specific activities (Kango et al., 2003). The present study was the first to identify an AA10 protein from Bacillales in liquor starter, which might boost cellulose degradation. These highly expressed AA9 and AA10 members might contribute to the robust degradation capabilities of NF liquor starter, and made themselves potential candidates for industrial application. Based on these results, approximately 60 complete carbohydrate-active enzyme genes, including several AA9 proteins sequences, have been amplified from the cDNA of sample N3. They will be further cloned, expressed and characterized in future work.
Functional profiling and comparative analysis of the 4 NF liquor starter samples showed that oxidative phosphorylation was the most abundant pathway in sample N2, indicating that the microbial community quickly metabolized and released ample energy to drive energy-requiring reactions, as well as producing considerable heat that increased the room temperature to 50 • C in 3 days. Among the 20 abundant pathways, most of them were closely related to energy and sugar metabolism, indicating that the microbial community has a great capability to degrade sugars and convert them to important products, such as ethanol. Additionally, butanoate metabolism and propanoate metabolism were also active in FIGURE 4 | Genes related to carbohydrate and energy metabolism were relatively highly expressed in the liquor starter samples (N1, N2, N3, and N4). Four abundant carbohydrate and energy metabolisms were analyzed here, i.e., starch and sucrose metabolism (A), glycolysis (B), pyruvate metabolism (C), and the citrate cycle (D). For each metabolism, relatively high gene expression levels were presented by function, EC number and total RPKM. Relative expression (log 2 RPKM)) is shown between the high (red) and low (blue) expression levels. The key enzymes are highlighted with color.  (Tao et al., 2014). More interestingly, amino acid metabolism was robust in the NF liquor starter. The metabolisms of 10 amino acids were found to be dominant   (Zhuang, 2007). Thus, these highly expressed genes involved in butanoate, propanoate and amino acid metabolism indicate that the liquor starter has great potential for flavor development.
We further analyzed starch metabolism, glycolysis and ethanol and pyruvate metabolism because they are important for ethanol production and are related to flavor generation. The results showed that the microbial community had a high capability for degrading starch with different functional enzymes throughout the liquor starter production process, especially during the high temperature period. The key enzymes of glycolysis metabolism in the N2 and N3 samples were highly expressed. All the up-regulated genes in the glycolysis pathways were mainly from Saccharomycetales and Mucorales. Saccharomycetales are multifunctional microorganisms that saccharify sugar polymers, improve esterification, contribute to aroma precursors, utilize feedstocks efficiently and affect flavor. Mucorales belongs to the Zygomycetes family of filamentous fungi. They are robust fungi that show great promise for ethanol fermentation (Abedinifar et al., 2009) and the production of efficient and diverse carbohydrate-active enzymes (Battaglia et al., 2011). Therefore, the extremely highly expressed key glycolytic enzymes from Saccharomycetales and Mucorales were important for saccharification and ethanol fermentation. Additionally, more diverse microbial community members, including both fungi and bacteria, mainly contributed to pyruvate and ethanol metabolism at 50 and 62 • C. Analysis of the important enzymes of pyruvate metabolism further showed that the NF liquor starter exhibited large capacities for converting pyruvate to intermediates, i.e., acetate and acetyl-CoA, which further contribute to other metabolic functions and specific flavor. The enzymes involved in the conversion of pyruvate to lactic acid were higher in the N2 and N3 samples and reduced in N4 sample. To some extent, this result was consistent with finding in the study by Huang et al. (unpublished data); Lactobacillus increased quickly when the liquor starter had incubated for 3 days (N2) and decreased markedly when the temperature reached its maximum of 62 • C (N3), and fewer was found in N4. A markedly low level of lactic acid in the mature NF liquor starter is associated with high quality, and such starter can be further used for ethanol fermentation (Li, 2000;Lai, 2007).
The intermediates of the citrate cycle also have important functions in specific flavor generation. Thus, the citrate cycle is essential for many biochemical pathways in the liquor starter microbial community and it is necessary to understand the multiple functions of this cycle in the liquor starter process.
The key enzymes of the citrate cycle were produced by both bacterial and fungal community members in the N2 and N3 samples. Bacillales species were the dominant bacterial community members when the room temperature increased to 62 • C (Huang et al., unpublished data). The other relevant microbial community members were fungi. The high expression levels of the fungal community members also confirmed the high abundance of the active fungal community at the highest temperature point. All of these fungi have been reported to have high capabilities for degrading carbohydrates, fats and amino acids. Therefore, the N2 and N3 samples were more active and would have large capacities for generating energy and producing intermediates for liquor flavor generation. However, liquor flavor development is determined by a much more complex metabolic process; it is determined not only by the microbial community from the liquor starter but also by the Zaopei and pit mud. Thus, the mechanism of liquor flavor generation requires further comprehensive analysis of the microbial communities from the liquor starter, Zaopei and pit mud.

CONCLUSIONS
Chinese liquor starter is produced in a thermophilic and aerobic system. The present study used metatranscriptomics to globally and comprehensively explore the active microbial communities and their functional transcripts in NF liquor starter. The results demonstrated that fungi were the most abundant active community members during the liquor starter production process. The identified abundant pathways, diverse thermophilic carbohydrate-active enzymes, and up-regulated key enzyme genes that are involved in glycolysis, ethanol metabolism, pyruvate metabolism and the citrate cycle at 50 and 62 • C implied that the liquor starter is capable of robust saccharification, fermentation and production of flavorgenerating agents. A breakthrough occurred during this study regarding the understanding of microbial metabolism and the function of Chinese liquor starter, paving the way for the optimization of liquor production and for the discovery of special and scarce microbial resources and thermophilic enzymes. To obtain encompassing insights into Chinese liquor, which has been produced for several thousand years and involves a complex and dynamic ecosystem, further temporal and spatial studies are needed concerning the microbial communities involved throughout the entire liquor brewing process.

AUTHOR CONTRIBUTIONS
HZ, YH, KH, DL, HL, YJ, and YF designed the experiment; YH and ZY wrote the main manuscript, performed the experiment and analyzed data; YH, DZ, and HH collected samples and communicated with the liquor factory; YH, ZY, YF, and HZ revised the manuscript. All authors revised and approved the final version of the manuscript.