# ADVANCES IN THE REGULATION AND PRODUCTION OF FUNGAL ENZYMES BY TRANSCRIPTOMICS, PROTEOMICS AND RECOMBINANT STRAINS DESIGN

EDITED BY: André Damasio, Gustavo H. Goldman, Roberto N. Silva and Fernando Segato

PUBLISHED IN: Frontiers in Bioengineering and Biotechnology and Frontiers in Microbiology

#### Frontiers Copyright Statement

© Copyright 2007-2019 Frontiers Media SA. All rights reserved. All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.

The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.

Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.

Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.

As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.

All copyright, and all rights therein, are protected by national and international copyright laws.

The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use. ISSN 1664-8714 ISBN 978-2-88963-053-0 DOI 10.3389/978-2-88963-053-0

#### About Frontiers

Frontiers is more than just an open-access publisher of scholarly articles: it is a pioneering approach to the world of academia, radically improving the way scholarly research is managed. The grand vision of Frontiers is a world where all people have an equal opportunity to seek, share and generate knowledge. Frontiers provides immediate and permanent online open access to all its publications, but this alone is not enough to realize our grand goals.

#### Frontiers Journal Series

The Frontiers Journal Series is a multi-tier and interdisciplinary set of open-access, online journals, promising a paradigm shift from the current review, selection and dissemination processes in academic publishing. All Frontiers journals are driven by researchers for researchers; therefore, they constitute a service to the scholarly community. At the same time, the Frontiers Journal Series operates on a revolutionary invention, the tiered publishing system, initially addressing specific communities of scholars, and gradually climbing up to broader public understanding, thus serving the interests of the lay society, too.

#### Dedication to Quality

Each Frontiers article is a landmark of the highest quality, thanks to genuinely collaborative interactions between authors and review editors, who include some of the world's best academicians. Research must be certified by peers before entering a stream of knowledge that may eventually reach the public - and shape society; therefore, Frontiers only applies the most rigorous and unbiased reviews.

Frontiers revolutionizes research publishing by freely delivering the most outstanding research, evaluated with no bias from both the academic and social point of view. By applying the most advanced information technologies, Frontiers is catapulting scholarly publishing into a new generation.

#### What are Frontiers Research Topics?

Frontiers Research Topics are very popular trademarks of the Frontiers Journals Series: they are collections of at least ten articles, all centered on a particular subject. With their unique mix of varied contributions from Original Research to Review Articles, Frontiers Research Topics unify the most influential researchers, the latest key findings and historical advances in a hot research area! Find out more on how to host your own Frontiers Research Topic or contribute to one as an author by contacting the Frontiers Editorial Office: researchtopics@frontiersin.org

# ADVANCES IN THE REGULATION AND PRODUCTION OF FUNGAL ENZYMES BY TRANSCRIPTOMICS, PROTEOMICS AND RECOMBINANT STRAINS DESIGN

Topic Editors:

André Damasio, University of Campinas (UNICAMP), Brazil Gustavo H. Goldman, University of São Paulo (USP), Brazil Roberto N. Silva, University of São Paulo (USP), Brazil Fernando Segato, University of São Paulo (USP), Brazil

Image: Marcelo V Rubio; Jaqueline A Gerhardt and Cesar RF Terrasan

Several efforts have been made to develop strategies for supplying the enzyme market and for reducing enzyme costs. These include the selection of an appropriate enzyme source and the optimization of enzyme properties and secretion. Carbohydrate-Active Enzymes (CAZymes) are industrially relevant biocatalysts that are capable of degrading plant cell wall biomass. The most important secreted enzymes related to plant cell wall decomposition are cellulases, hemicellulases, and auxiliary enzymes. These enzymes have been applied in the hydrolysis of plant biomass for the production of second-generation (2G) ethanol and several other high added-value products.

One of the bottlenecks for 2G ethanol production is the cost of the enzymes applied in plant biomass hydrolysis. Improving protein production by fungi through systems biology and genetic engineering is a promising strategy to reduce enzyme costs and make 2G ethanol production viable.

Fungi play an important role in plant biomass degradation and biotechnology by producing and secreting high yields of enzymes. Although filamentous fungi present several advantages over other microorganisms because of their high level of protein production, heterologous protein production is far from optimal and still needs to be improved. Currently, heterologous production of certain proteins is generally considerably lower than the levels obtained for homologous production. Many strategies have been studied to improve heterologous protein production by filamentous fungi, including the deletion of genes that encode proteases, the deletion of lectin-like ER-Golgi cargo receptors, and the co-expression of specific chaperones.

It has been shown that the main bottleneck in the production of heterologous proteins is not low expression of the target gene. Experimental evidence suggests that most target proteins produced in filamentous fungi are lost or stuck in the secretory pathway because of errors in processing, modification, or folding that result in their elimination by endoplasmic reticulum (ER) quality control. Misfolded proteins alter homeostasis and proper ER functioning, resulting in a state known as ER stress. ER stress activates a conserved signaling pathway called the unfolded protein response (UPR), together with ER-associated protein degradation (ERAD); these upregulate genes responsible for restoring protein folding homeostasis in the cell and degrade misfolded proteins in the cytosol via the ubiquitin-proteasome system.

The genetic manipulation of individual genes and changes in the genome do not seem to be the best way to overcome the main bottlenecks in heterologous protein secretion. Understanding the complex interactions of important proteins and genes, and how they are regulated, is more promising.

Citation: Damasio, A., Goldman, G. H., Silva, R. N., Segato, F., eds. (2019). Advances in the Regulation and Production of Fungal Enzymes by Transcriptomics, Proteomics and Recombinant Strains Design. Lausanne: Frontiers Media. doi: 10.3389/978-2-88963-053-0

# Table of Contents


Hui Wei, Wei Wang, Hal S. Alper, Qi Xu, Eric P. Knoshaug, Stefanie Van Wychen, Chien-Yuan Lin, Yonghua Luo, Stephen R. Decker, Michael E. Himmel and Min Zhang


Elisabeth Fitz, Franziska Wanka and Bernhard Seiboth


Leonardo Martins-Santana, Luisa C. Nora, Ananda Sanches-Medeiros, Gabriel L. Lovate, Murilo H. A. Cassiano and Rafael Silva-Rocha


Eva Stappler, Jonathan D. Walton, Sabrina Beier and Monika Schmoll

and Robert Neil Gerard Miller

# Editorial: Advances in the Regulation and Production of Fungal Enzymes by Transcriptomics, Proteomics and Recombinant Strains Design

André Damasio<sup>1</sup> \*, Gustavo H. Goldman<sup>2</sup> , Roberto N. Silva<sup>3</sup> and Fernando Segato<sup>4</sup>

<sup>1</sup> Department of Biochemistry and Tissue Biology, Institute of Biology, University of Campinas, Campinas, Brazil, <sup>2</sup> School of Pharmaceutical Sciences of Ribeirão Preto, University of São Paulo, Ribeirão Preto, Brazil, <sup>3</sup> Ribeirão Preto Medical School, University of São Paulo, Ribeirão Preto, Brazil, <sup>4</sup> Department of Biotechnology, Engineering School of Lorena, University of São Paulo, Lorena, Brazil

Keywords: fungal enzymes, cell factories, genetic engineering, omics, rational design

#### **Editorial on the Research Topic**

#### **Advances in the Regulation and Production of Fungal Enzymes by Transcriptomics, Proteomics and Recombinant Strains Design**

#### Edited by:

Rongxin Su, Tianjin University, China

#### Reviewed by:

Alok Satlewal, Indian Oil Corporation, India Maria Carolina Quecine, University of São Paulo, Brazil

> \*Correspondence: André Damasio adamasio@unicamp.br

#### Specialty section:

This article was submitted to Bioenergy and Biofuels, a section of the journal Frontiers in Bioengineering and Biotechnology

> Received: 03 May 2019 Accepted: 12 June 2019 Published: 28 June 2019

#### Citation:

Damasio A, Goldman GH, Silva RN and Segato F (2019) Editorial: Advances in the Regulation and Production of Fungal Enzymes by Transcriptomics, Proteomics and Recombinant Strains Design. Front. Bioeng. Biotechnol. 7:157. doi: 10.3389/fbioe.2019.00157

Many studies have reported the importance of biological processes related to protein secretion in filamentous fungi, including the unfolded protein response (UPR), endoplasmic reticulum-associated protein degradation (ERAD), lipid biosynthesis, cell wall integrity, vesicle transport, autophagy, and kinases, among others (Kwon et al., 2014; Malavazi et al., 2014; de Assis et al., 2015; Burggraaf and Ram, 2016; Schalén et al., 2016; Zhang et al., 2016; Yokota et al., 2017).

However, the secretion level of target proteins has generally been low, which makes the production of some enzymes economically unviable. For this reason, more robust host strains with higher secretion yields are needed to broaden the range of commercial enzymes.

Essentially, the basis of many of the changes underlying strain improvement is either undefined or not in the public domain. After decades of strain improvement programs, details on the genomic evolution of production strains have not been available in scientific literature due to industrial confidentiality.

In general, improving the production of a specific protein by genetic engineering does not necessarily have the same effect on the production of other proteins. Moreover, such attempts have improved the secretion of one target protein rather than total protein secretion. How, then, can we obtain industrial strains such as T. reesei that produce 120 g/L of protein? We believe we are far from reaching these industrial levels by rational design.

We sincerely thank all researchers who contributed to this Research Topic. Three reviews and five original research articles were published.

The ascomycete Trichoderma reesei is one of the main fungal producers of cellulases because of its high production capacity. Fitz et al. reviewed advances in the promoter toolbox for recombinant gene expression in T. reesei. The authors discussed established constitutive promoters for gene expression in T. reesei, such as the promoters of the eno1, gpd1, and tef1 genes, as well as tunable cellulase promoters from the cel6a, cel7a, cel5a, and cel12a genes. Moreover, potential new tunable promoters discovered by transcriptomic studies were explored. Expression from the promoter of the copper transporter gene tcu1 was abolished by the presence of a certain amount of copper in the medium, and this repression can be relieved by the addition of a Cu<sup>2+</sup> chelator. Another promising promoter, induced by low amounts of pantothenic acid, was validated by fusion with the T. reesei β-glucosidase BGL1.

Emphasizing the importance of transcription factor manipulation in the microbial cell factories field, Alazi and Ram presented an elegant review on the transcriptional regulation of plant biomass-degrading enzymes. They provided a complete survey of the rational design of industrial fungal strains with increased or constitutive production of Carbohydrate-Active Enzymes (CAZymes). Essentially, the authors described the constitutive activation of transcription factors and the deletion or down-regulation of specific repressors in order to regulate the expression of CAZyme genes. Finally, recent data on chromatin remodeling and CAZyme overexpression, as well as the design of synthetic promoters, were also reported. Martins-Santana et al. organized an excellent review on systems and synthetic biology tools to redesign metabolic and secondary metabolite pathways in fungal strains.

Two reports focused on yeast manipulation. Wei et al. described the potential of a high lipid-accumulating strain of Yarrowia lipolytica to express a core set of cellulase genes. The recombinant strains successfully co-expressed cbhI, cbhII, and egII from T. reesei. In addition, the Y. lipolytica recombinant strain showed a higher glucose utilization rate that led to ∼2-fold more cell mass and 3-fold more lipid production per liter of culture compared to the parental control strain when grown in media with a high C/N ratio. Similarly, Yang et al. expressed cellulases in Saccharomyces cerevisiae using the CRISPR-Cas9 tool, resulting in improved saccharification of orange peels.

Midorikawa et al. described the expression of CAZymes by Aspergillus tamarii strain BLU37 cultivated on steam-exploded sugarcane bagasse. The results showed high expression of several genes encoding CAZymes, such as Glycoside Hydrolases (GH), Carbohydrate Esterases (CE), and Auxiliary Activities (AA). Moreover, transcription factors involved in the regulation of cellulases and hemicellulases, such as XlnR and ClrA, were overexpressed. Exploring new fungal chassis is fundamental to finding microbial cell factories with potential applications in industrial enzyme production.

Finally, two reports in this topic used T. reesei as a model organism. Borin et al. identified a range of new potential targets for improving the cellulolytic potential of T. reesei RUT-C30 strains. Several genes encoding cellulases, sugar transporters, and hypothetical proteins that were upregulated in sugarcane bagasse were grouped into different highly connected gene modules. Stappler et al. used a proteomics approach to investigate the influence of different light intensities on cellulase activity and protein secretion. Several glycoside hydrolases showed light-dependent regulation, and the photoreceptor genes blr1 and blr2 are important for regulating protein abundance between light and darkness.

#### AUTHOR CONTRIBUTIONS

AD designed and wrote the Editorial with contributions from GG, RS, and FS.

#### ACKNOWLEDGMENTS

We would like to thank the authors and reviewers for their valuable contributions to this special topic.

#### REFERENCES

secretory proteins in the filamentous fungus Aspergillus oryzae. Appl. Microbiol. Biotechnol. 101, 2437–2446. doi: 10.1007/s00253-016-8086-3

Zhang, H., Wang, S., Zhang, X. X., Ji, W., Song, F., Zhao, Y., et al. (2016). The amyR-deletion strain of Aspergillus niger CICC2462 is a suitable host strain to express secreted protein with a low background. Microb. Cell Fact. 15:68. doi: 10.1186/s12934-016-0463-1

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer MQ declared a shared affiliation, though no other collaboration, with several of the authors, GG, RS, and FS, to the handling Editor.

Copyright © 2019 Damasio, Goldman, Silva and Segato. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Ameliorating the Metabolic Burden of the Co-expression of Secreted Fungal Cellulases in a High Lipid-Accumulating Yarrowia lipolytica Strain by Medium C/N Ratio and a Chemical Chaperone

Hui Wei<sup>1</sup> \*, Wei Wang<sup>1</sup>, Hal S. Alper<sup>2</sup>, Qi Xu<sup>1</sup>, Eric P. Knoshaug<sup>3</sup>, Stefanie Van Wychen<sup>1,3</sup>, Chien-Yuan Lin<sup>1</sup>, Yonghua Luo<sup>1</sup>, Stephen R. Decker<sup>1</sup>, Michael E. Himmel<sup>1</sup> and Min Zhang<sup>1,3</sup> \*

#### Edited by:

Marie-Joelle Virolle, The National Center for Scientific Research (CNRS), France

#### Reviewed by:

Rodrigo Ledesma-Amaro, Imperial College London, United Kingdom

Aleksandra Maria Mirończuk, Wrocław University of Environmental and Life Sciences, Poland

#### \*Correspondence:

Hui Wei Hui.Wei@nrel.gov Min Zhang Min.Zhang@nrel.gov

#### Specialty section:

This article was submitted to Systems Microbiology, a section of the journal Frontiers in Microbiology

Received: 21 August 2018 Accepted: 17 December 2018 Published: 09 January 2019

#### Citation:

Wei H, Wang W, Alper HS, Xu Q, Knoshaug EP, Van Wychen S, Lin C-Y, Luo Y, Decker SR, Himmel ME and Zhang M (2019) Ameliorating the Metabolic Burden of the Co-expression of Secreted Fungal Cellulases in a High Lipid-Accumulating Yarrowia lipolytica Strain by Medium C/N Ratio and a Chemical Chaperone. Front. Microbiol. 9:3276. doi: 10.3389/fmicb.2018.03276

<sup>1</sup> Biosciences Center, National Renewable Energy Laboratory, Golden, CO, United States, <sup>2</sup> Department of Chemical Engineering, The University of Texas at Austin, Austin, TX, United States, <sup>3</sup> National Bioenergy Center, National Renewable Energy Laboratory, Golden, CO, United States

Yarrowia lipolytica, known to accumulate lipids intracellularly, lacks the cellulolytic enzymes needed to break down solid biomass directly. This study aimed to evaluate the potential metabolic burden of expressing core cellulolytic enzymes in an engineered high lipid-accumulating strain of Y. lipolytica. Three fungal cellulases, Talaromyces emersonii-Trichoderma reesei chimeric cellobiohydrolase I (chimeric-CBH I), T. reesei cellobiohydrolase II (CBH II), and T. reesei endoglucanase II (EG II), were expressed using three strong constitutive promoters as a single integrative expression block in a recently engineered lipid hyper-accumulating strain of Y. lipolytica (HA1). In yeast extract-peptone-dextrose (YPD) medium, the resulting cellulase co-expressing transformant YL165-1 secreted chimeric-CBH I, CBH II, and EG II at titers of 26, 17, and 132 mg L<sup>−1</sup>, respectively. Cellulase co-expression in YL165-1 in culture media with a moderate C/N ratio of ∼4.5 unexpectedly resulted in a nearly twofold reduction in cellular lipid accumulation compared to the parental control strain, a sign of cellular metabolic drain. This metabolic drain was ameliorated when the transformant was grown in media with a high C/N ratio of 59, where a higher glucose utilization rate led to approximately twofold more cell mass and threefold more lipid production per liter of culture compared to the parental control strain, suggesting cross-talk between cellulase and lipid production, both of which involve the endoplasmic reticulum (ER). Most importantly, we found that the chemical chaperone trimethylamine N-oxide dihydrate increased glucose utilization, cell mass, and total lipid titer in the transformants, suggesting further amelioration of the metabolic drain. This is the first study examining lipid production in cellulase-expressing Y. lipolytica strains under media with various C/N ratios and with a chemical chaperone, highlighting the metabolic complexity of developing robust, cellulolytic, and lipogenic yeast strains.

Keywords: fungal cellulolytic enzymes, Yarrowia lipolytica, cellulosic biofuel, cellobiohydrolase I, endoglucanase II, lipid metabolism, endoplasmic reticulum stress, chemical chaperone

### INTRODUCTION

fmicb-09-03276 January 5, 2019 Time: 10:56 # 2

As one of the most abundant renewable resources today, lignocellulosic biomass is under development worldwide to produce biofuels and chemicals. One key bottleneck hindering biofuels development is the high cost of conversion of feedstocks to sugars due to the general recalcitrance of plant cell walls (Himmel et al., 2007). To overcome this hurdle, a process strategy of cellulase production, cell wall polymer hydrolysis, and sugar fermentation in a single step termed consolidated bioprocessing (CBP) has been proposed (Lynd et al., 2005). So far, Saccharomyces cerevisiae and Kluyveromyces marxianus (Fan et al., 2012; Olson et al., 2012; Sun et al., 2012; Chang et al., 2013; Yamada et al., 2013; Kricka et al., 2014), and recently Yarrowia lipolytica (Wei et al., 2014; Guo et al., 2017) have been explored as potential CBP microorganisms.

Yarrowia lipolytica has long been known to be oleaginous and recently, has been engineered to utilize xylose for lipid production (Ledesma-Amaro et al., 2016; Li and Alper, 2016; Markham et al., 2016; Rodriguez et al., 2016). The accumulated lipids can be extracted and upgraded for biodiesel production (Ratledge and Wynn, 2002; Sitepu et al., 2014). Because Y. lipolytica lacks the cellulolytic enzymes needed to break down cellulosic biomass directly (Ryu et al., 2015), efforts have been made recently to express cellulases in this yeast, specifically CBH I, CBH II, endoglucanase II (EG II), β-D-glucosidase, and xylanase (Boonvitthya et al., 2013; Wang et al., 2014a; Wei et al., 2014; Guo et al., 2015). We note that a consortium co-culture of these cellulase transformants demonstrated synergy in utilizing cellulose (Wei et al., 2014). Most recently, progress has been made in co-expressing CBHs, EGs, and BGLs in Y. lipolytica to mimic the ratio of the main cellulases in the secretome of T. reesei. The resultant strains were shown to grow efficiently on industrial cellulose pulp, which is mostly amorphous, but limited growth on recalcitrant crystalline cellulose was also noted (Guo et al., 2017).

Another recent accomplishment is the engineering of Y. lipolytica to achieve a high yield of lipids or other hydrocarbons; see major reviews focused on the metabolic engineering of Y. lipolytica in the past 5 years (Abghari and Chen, 2014; Gonçalves et al., 2014; Zhu and Jackson, 2015; Ledesma-Amaro and Nicaud, 2016; Xie, 2017; Abdel-Mawgoud et al., 2018; Carsanba et al., 2018). Notably, the mutant pex10 mfe1 leu<sup>−</sup> ura<sup>+</sup> DGA1 was generated as a lipid hyper-accumulator (strain HA1), which can accumulate total lipids up to ∼90% on a cell dry-weight basis (Blazeck et al., 2014).

With the overall goal of developing a CBP platform to produce lipids as a drop-in fuel precursor from cellulosic biomass feedstocks, the objectives of this study are twofold: First, to co-express and evaluate the secretion efficiency and cellulolytic functionality of fungal CBH I, CBH II, and EG II in the abovementioned Y. lipolytica HA1 strain. These fungal enzymes include the Te-Tr chimeric CBH I (Ilmen et al., 2011; Wei et al., 2014), T. reesei CBH II, and T. reesei endoglucanase II (EG II). For simplicity, the co-expression of chimeric CBH I-CBH II-EG II (triplet cassette of cellulases) in this study is generally referred to as CBH I-CBH II-EG II in the text. Secondly, to investigate whether the co-expression of the core cellulolytic enzymes is a metabolic burden to the high lipid-accumulating strain of Y. lipolytica and whether adjusting the C/N ratio of the growth medium and supplementing the medium with a chemical chaperone can alleviate this metabolic stress. This work provides a new level of rigor for CBP strain development in yeast by exploring the relationship between cellulase production and lipid accumulation.

#### MATERIALS AND METHODS

#### Strains, Plasmids, and Culture Medium

Yarrowia lipolytica strains and plasmids used in this study are described in **Table 1**. The pedigree of transformants expressing either a single or multiple cellulases is illustrated together with an overview of the experimental characterization of these transformants (**Figure 1**). Yeast strains were maintained at 28°C on YPD agar medium that contains 10 g L<sup>−1</sup> yeast extract, 20 g L<sup>−1</sup> peptone, 20 g L<sup>−1</sup> dextrose and 15 g L<sup>−1</sup> agar.

#### Promoters, Signal Peptide, and Terminators for Cellulase Expression

Promoters were selected to mimic as closely as possible the optimal CBH I: CBH II: EG II ratio of 60:10:30, or the optimal CBH I: EG II ratio of 90:10, to achieve maximal cellulose degradation (Kallioinen et al., 2014; Wei et al., 2014). Thus, the promoters TEFin, GPD, and EXP1, with expression levels of approximately 17-, 0.8-, and 1.2-fold that of TEF (Damude et al., 2011; Tai and Stephanopoulos, 2013), were chosen to control the expression of the individual cellulases.
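The comparison between the promoter strengths and the target enzyme ratio can be sketched as simple arithmetic. The snippet below is only an illustration: it assumes that the share of each secreted cellulase scales linearly with the reported promoter strength relative to TEF, which the text does not claim, and it uses the promoter-to-enzyme pairing given in this section (TEFin for CBH I, GPD for CBH II, EXP1 for EG II). The function and variable names are hypothetical.

```python
# Reported promoter strengths relative to TEF (fold values from the text),
# keyed by the cellulase each promoter drives in this study.
PROMOTER_STRENGTH = {"CBH I": 17.0, "CBH II": 0.8, "EG II": 1.2}

# Target CBH I : CBH II : EG II ratio quoted in the text.
TARGET_RATIO = {"CBH I": 60.0, "CBH II": 10.0, "EG II": 30.0}


def expression_shares(strengths):
    """Normalize relative strengths so that they sum to 1.

    Assumption (not from the source): protein share is proportional
    to promoter strength.
    """
    total = sum(strengths.values())
    return {name: value / total for name, value in strengths.items()}


if __name__ == "__main__":
    predicted = expression_shares(PROMOTER_STRENGTH)
    target = expression_shares(TARGET_RATIO)
    for enzyme in PROMOTER_STRENGTH:
        print(f"{enzyme}: predicted {predicted[enzyme]:.1%}, target {target[enzyme]:.1%}")
```

Under this linear assumption the predicted shares lean much more heavily toward CBH I than the 60:10:30 target does, which is why measured secretion titers, rather than promoter strengths alone, are needed to judge the final ratio.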

Details of the promoters, signal peptide, and terminators used to express the individual cellulases are as follows. Chimeric CBH I was expressed by the TEFin promoter and the XPR2 terminator (GenBank accession no. M23353) (Davidow et al., 1987). CBH II was expressed by the GPD promoter (YALI0C06369p; -931 to -1; GenBank nucleotide 158467892) and the Lip2 terminator (GenBank accession no. AJ012632). EG II was expressed by the EXP1 promoter and EXP1 terminator (Damude and Zhu, 2007; Ye et al., 2012). Each of the three cellulase genes contains an XPR2 signal peptide (AAGCTCGCTACCGCCTTTACTATTCTCACGGCCGTTCTGGCC) and was codon optimized according to the codon optimization table for Y. lipolytica. The cellulase expression cassette (i.e., CBH I-CBH II-EG II cassette; named construct 162) was synthesized by GenScript Inc. (Piscataway, NJ, United States) with SalI-PmlI sites on the 5′ end and a KpnI site on the 3′ end (see sequence in the additional file). This expression cassette was cloned into the vector pUC57 at the SalI and EcoRV sites. The sequence of the complete expression cassette of construct 162 is provided in Supplementary Materials and Methods. Finally, to build the Y. lipolytica cellulase secretion construct (named construct 165), construct 162 and vector pYLEX1 were digested with SalI and KpnI and ligated together (**Figures 2A,B**).

**Abbreviations:** BGL, β-glucosidase; CBH, cellobiohydrolase; DCW, dry cell weight; DGA, diacylglycerolacyltransferase; ER, endoplasmic reticulum; FAME, fatty acid methyl esters; hp4d, hybrid promoter derived from pXPR2; Te, Talaromyces emersonii; TMAO, trimethylamine N-oxide dihydrate; Tr, Trichoderma reesei; Te-Tr, Talaromyces emersonii-Trichoderma reesei; XPR2, alkaline extracellular protease 2; YP, yeast extract-peptone (medium); YPD, yeast extract-peptone-dextrose (medium).

#### TABLE 1 | Yarrowia lipolytica strains and plasmids.

The theoretical molecular weight (MW) was calculated based on amino acid sequence (without a signal peptide). CBH, cellobiohydrolase; EG, endoglucanase; Te, Talaromyces emersonii; Tr, T. reesei.
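As a quick sanity check on the XPR2 signal peptide sequence quoted in this section, the nucleotide string can be translated in frame. The sketch below assumes the standard genetic code and that the quoted 42 nucleotides start at a codon boundary; the helper name `translate` is hypothetical and the table is built by hand rather than taken from any library.

```python
from itertools import product

# Standard genetic code, built in TCAG order (assumption: the standard
# code applies to this sequence).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Signal peptide coding sequence as quoted in the text, with the
# line-break space removed.
XPR2_SIGNAL_DNA = "AAGCTCGCTACCGCCTTTACTATTCTCACGGCCGTTCTGGCC"


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acids."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


if __name__ == "__main__":
    print(translate(XPR2_SIGNAL_DNA))
```

The quoted sequence is 42 nucleotides long, so it encodes 14 residues; it contains no start codon, which suggests the initiating ATG sits immediately upstream in the construct.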

#### Transformation and Selection

Random integrative transformation of Y. lipolytica strain HA1 with NotI-linearized plasmid 165 DNA was conducted using YLOS One Step Transformation system and the YLEX expression kit (Yeastern Biotech Co., Taipei, Taiwan) as described previously (Wang et al., 2014a). The transformation mixture was spread on YNB selection plates lacking leucine for the appearance of transformant (Leu+) colonies.

#### RNA Extraction and Real-Time RT-PCR

Total RNA was extracted from 60 to 80 mg (wet weight) cell pellets of transformants using Qiagen RNeasy Mini Kit (Valencia, CA, United States). The procedure for cDNA synthesis was similar to the protocol described by Wei and coworkers (Wei et al., 2012). One microgram of purified total RNA was reverse-transcribed using High-Capacity cDNA Reverse Transcription Kit (Cat. no. 4368814, Applied Biosystems, Grand Island, NY, United States) with random hexamers according to the manufacturer's instructions.

Primers were designed for real-time RT-PCR targeting the codon-optimized sequences for individual cbh1, cbh2, and eg2 genes expressed in Y. lipolytica transformants (**Supplementary Table S1**). Primers for the reference gene encoding actin (YALI0D08272g) were previously described (Dulermo et al., 2015). Real-time RT-PCR was performed using ABI 7500 Real-Time PCR System (Thermo Fisher Scientific, Applied Biosystems, Grand Island, NY, United States) and Power SYBR Green PCR Master Mix (Cat. no. 4367659, Applied Biosystems, Grand Island, NY, United States). PCR reactions were performed in triplicate. The relative transcription level of genes was calculated from the Ct value of reference gene and gene-of-interest (Schmittgen and Livak, 2008).
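The calculation of relative transcription levels from Ct values can be illustrated with a short sketch. It assumes the 2^(−ΔCt) form of the comparative Ct method and roughly 100% amplification efficiency; the text cites Schmittgen and Livak (2008) but does not state which variant was applied, so this is an illustration rather than the authors' exact procedure. The function names and any Ct values passed in are hypothetical.

```python
from statistics import mean


def relative_expression(ct_target, ct_reference):
    """Return 2^-(Ct_target - Ct_reference).

    Assumes each PCR cycle doubles the product (about 100% efficiency).
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


def relative_expression_from_replicates(target_cts, reference_cts):
    """Average technical replicates (e.g., triplicates) before taking ΔCt."""
    return relative_expression(mean(target_cts), mean(reference_cts))


if __name__ == "__main__":
    # Hypothetical Ct values for a cellulase gene and the actin reference.
    level = relative_expression_from_replicates([20.0, 20.0, 20.0], [18.0, 18.0, 18.0])
    print(f"relative transcription level: {level}")
```

A target gene that crosses the threshold two cycles later than the reference gene is reported as one quarter of the reference level under these assumptions.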

### SDS–PAGE and Western Blot

The supernatant from each strain was collected from YPD pH 4.0 medium when the culture reached an OD600 value of 10. The loading amount per well was 22.5 µL mixed with 7.5 µL 4X loading buffer, following the procedures described previously for the SDS–PAGE and western blot analyses (Wei et al., 2014), for which the custom mouse monoclonal anti-CBH I, mouse monoclonal anti-CBH II, and rabbit polyclonal anti-EG II were used. Densitometric analysis of the detected bands was performed in accordance with literature (Savina et al., 2002; Olszanecki et al., 2007; Rashid et al., 2015) using the Quantity One software (Bio-Rad, CA, United States). The band intensities, relative to the corresponding bands in the protein samples of strains expressing individual cellulases, are presented as the average values from three separate experiments.
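The normalization of band intensities described above can be sketched as follows. The snippet assumes that, within each experiment, the band intensity from the co-expressing strain is divided by the intensity of the corresponding band from the strain expressing that cellulase alone, and that the ratios are then averaged over the three experiments; the text does not specify the order of these two steps. All names and input values are hypothetical.

```python
from statistics import mean


def relative_band_intensity(coexpression_bands, single_enzyme_bands):
    """Mean of per-experiment intensity ratios (co-expression / single enzyme)."""
    if len(coexpression_bands) != len(single_enzyme_bands):
        raise ValueError("each experiment needs one band from each strain")
    ratios = [c / s for c, s in zip(coexpression_bands, single_enzyme_bands)]
    return mean(ratios)


if __name__ == "__main__":
    # Hypothetical densitometry readings (arbitrary units) from three experiments.
    coexpressed = [50.0, 100.0, 150.0]
    single = [100.0, 100.0, 100.0]
    print(relative_band_intensity(coexpressed, single))
```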

### Cellulosic Substrates for Enzymatic Activity Assays, Screening, and Culturing of Transformants

Two types of cellulosic substrates were used in this study:


### Screening CBH I-CBH II-EG II Co-expressing Transformants on YPD-PASC Plates

The capacity of transformants to utilize cellulose was assessed by clearing zones on 0.5% PASC-YPD agar plates (0.5% PASC, 1.0% yeast extract, 2.0% peptone, 2.0% dextrose, 1.5% agar). The plates were inoculated with strains and then incubated at 28°C for 6 days before Congo Red staining (Wood and Mahalingeshwara Bhat, 1988).

### Enzyme Activity Assay


The combined enzymatic activity of co-expressed CBH I-CBH II-EG II was measured using unconcentrated or concentrated supernatants of yeast cell cultures. The supernatants were collected from YPD cultures after 5 days of shaking at 30°C and 200 rpm. The concentrated crude enzymes were prepared by concentrating the supernatants 35-fold by ultrafiltration with a molecular weight cut-off of 10,000 Da. For enzyme activity measurements, 0.5 mL of the supernatant or concentrated crude enzyme solution was mixed with 0.5 mL of 1% Avicel suspended in 50 mM acetate buffer at pH 4.8. As controls, 0.5 mL of the concentrated crude enzyme was mixed with 0.5 mL of acetate buffer without the substrate; in parallel, Avicel without enzyme (using ddH2O instead) was also included as a control. The replicate vials containing enzyme-Avicel mixtures or controls were incubated at 50°C for 1 h, 24 h, and 5 days. The samples were centrifuged at 12,000 rpm for 3 min and the supernatant was filtered through a 0.45-µm filter. The released sugars were measured by high-performance liquid chromatography (HPLC). The "% Avicel to glucose" conversion rate was calculated as the measured "Total glucose equivalent released (g L<sup>−1</sup>)" divided by 5 g L<sup>−1</sup>, which was the amount of total glucose equivalent released for theoretical 100% Avicel conversion.
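The conversion calculation described above amounts to a single division. The sketch below follows the text's denominator of 5 g L<sup>−1</sup> and also shows how that figure arises from the assay setup (0.5 mL of 1% Avicel diluted to a final volume of 1.0 mL). The function names are hypothetical and the example HPLC value is made up for illustration.

```python
# Denominator stated in the text for theoretical 100% Avicel conversion.
THEORETICAL_GLUCOSE_EQ_G_PER_L = 5.0


def final_avicel_g_per_l(stock_percent_w_v, stock_ml, enzyme_ml):
    """Avicel concentration in the assay after mixing stock and enzyme."""
    stock_g_per_l = stock_percent_w_v * 10.0  # 1% (w/v) equals 10 g per liter
    return stock_g_per_l * stock_ml / (stock_ml + enzyme_ml)


def avicel_to_glucose_percent(glucose_eq_g_per_l,
                              theoretical_g_per_l=THEORETICAL_GLUCOSE_EQ_G_PER_L):
    """Percent conversion = measured glucose equivalent / theoretical maximum."""
    return 100.0 * glucose_eq_g_per_l / theoretical_g_per_l


if __name__ == "__main__":
    print(final_avicel_g_per_l(1.0, 0.5, 0.5))   # assay loading in g per liter
    print(avicel_to_glucose_percent(1.25))       # hypothetical HPLC reading
```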

### Cellulose Utilization by Cellulase Co-expressing Transformants in Mineral Medium

Basal mineral media have been used to assess the cellulose utilization by filamentous fungi and yeast (Wei et al., 2013, 2014; Guo et al., 2017). The mineral medium described by Guo et al. (2017) contained higher levels of essential nutrients and was used in this study. In addition to CBH and EG, a functionally potent BGL is also needed for the conversion of cellulose to glucose (Dashtban and Qin, 2012). In general, Y. lipolytica is viewed as limited in its endogenous ability to digest cellobiose, although some cellobiose-consuming wild-type or substrate-adapted strains have been reported (Kurtzman et al., 2011; Mirbagheri et al., 2012; Lane et al., 2015; Ryu et al., 2016). A strategy of adding exogenous BGL was used in previous studies to boost the low BGL activity in T. reesei's cellulase preparations (Xin et al., 1993; Berlin et al., 2007; Chen et al., 2008). This study follows these previous examples by adding exogenous BGL to the medium so that BGL would not be a limiting factor for cellulose degradation.

Briefly, a seed culture was grown in 20 mL YPD medium in a 125-mL baffled flask overnight at 28°C, followed by centrifugation for cell collection. The cells were washed with sterile ddH2O and used to inoculate 100 mL of mineral medium in 500-mL baffled flasks to reach an initial OD of 1.6 (equivalent to approximately 1 g DCW L<sup>−1</sup>), which was comparable to the inoculation rate reported in the literature (Guo et al., 2017). The mineral medium was supplemented with 2.7 g Avicel (i.e., 2.7% w/v), as well as with BGL (Aspergillus niger BGL: Cat. no. E-BGLUC, Megazyme International Ireland Ltd., Wicklow, Ireland), at a concentration of 2 mg BGL per gram cellulose substrate based on the literature (Selig et al., 2014; Wei et al., 2014; Xu et al., 2017). The BGL used was chromatographically purified prior to use. The flasks that contained the cell-medium-Avicel-BGL mixtures were incubated in a rotary shaker at 200 rpm and 28°C for 5 days. The cells were centrifuged, freeze-dried, and subjected to Avicel residue analysis, as described below. Three biological replicate cultures were run.

### Quantification of Avicel Residues and Cell Weight in Avicel-Yeast Cell Pellets

The determination of Avicel residues and cell dry weight was conducted as previously described (Wei et al., 2014; Guo et al., 2017). Briefly, the Avicel-yeast cells from the transformant culture growing on Avicel were centrifuged, washed with ddH2O, freeze-dried, and weighed. The amount of Avicel contained in the pellet was determined by enzymatic digestion using the Cellic CTec2 cellulase enzyme product (Novozymes, Franklinton, NC, United States) and cross-checked by dilute acid (2.5% sulfuric acid) hydrolysis of the residues at 121°C for 1 h. The total glucose released was measured by HPLC and taken as the corresponding amount of Avicel contained in the Avicel-yeast cell mix. Cell dry weight was calculated by subtracting the amount of Avicel from the weight of the Avicel-yeast cell pellet.

### Growth Curves and Validation

Growth curves were obtained by using a Bioscreen C analyzer (Growth Curves USA, Piscataway, NJ, United States) and a modified protocol for cell inoculation, growth conditions, and turbidity measurements (Franden et al., 2013; Wang et al., 2014b). In brief, log phase cultures of Y. lipolytica strains were used to inoculate 20 mL YPD pH 4.0 medium (as a cellulase production medium), which had an initial C/N ratio of approximately 4.5 (Martinez-Force and Benitez, 1995; Josefsen et al., 2012), in a 150-mL flask for overnight growth at 30°C and 210 rpm. After overnight growth, the culture reached an OD600 of ∼10. Cells were then diluted into fresh YPD pH 4.0 medium at an initial OD600 of 0.25 and distributed into Bioscreen C microplates (three wells per cell line; 300 µL per well). The Bioscreen C microplates were incubated at 30°C for 5 days, with absorbance readings taken every 15 min. The turbidity measurements with a wide band filter (420–580 nm, which is relatively insensitive to color changes) were computer operated with EZ Experiment software. The collected data were exported to Microsoft Excel spreadsheets, and the turbidity data were averaged from three replicates of cell samples.

Cell mass dry weight measurements in shake flask cultures were used to validate the growth curves generated using the Bioscreen C analyzer. Overnight cultures were used to inoculate 200 mL YPD pH 4.0 medium in a 1000-mL baffled flask to reach an initial OD600 of 0.25. The flasks were incubated in a rotary shaker at 200 rpm and 30°C for 5 days, during which 10 mL of the cell cultures were collected after 6, 12, 24, 48, 72, 96, and 120 h and centrifuged at 4000 g for 5 min. The collected cell pellets were washed three times with sterile distilled-deionized water (ddH2O) to remove medium residues and collected by centrifugation at 4000 g for 5 min. The cell pellets obtained were freeze-dried and weighed. Cell weight data were averaged from three replicate samples.
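As a minimal sketch of how triplicate measurements like these are typically summarized, the snippet below computes the mean and the standard error of the mean (SEM, the statistic reported in the table notes of this study). The replicate values are hypothetical and serve only to show the arithmetic.

```python
from math import sqrt
from statistics import mean, stdev


def mean_and_sem(values):
    """Return (mean, SEM) where SEM = sample standard deviation / sqrt(n)."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two replicates are needed for an SEM")
    return mean(values), stdev(values) / sqrt(n)


# Hypothetical dry cell weights (g/L) from three replicate flasks at one time point.
replicate_dcw = [10.2, 10.6, 10.7]
avg, sem = mean_and_sem(replicate_dcw)
print(f"DCW = {avg:.2f} +/- {sem:.2f} g/L (mean +/- SEM, n = {len(replicate_dcw)})")
```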

### Glucose Consumption and Lipid Accumulation in Moderate and High C/N Media

Two types of media were used to grow the cellulase co-expressing transformants for investigating their glucose consumption and lipid accumulation. One was a moderate C/N ratio medium prepared from the YPD medium supplemented with 10 g L<sup>−1</sup> extra glucose (with a final glucose concentration of 30 g L<sup>−1</sup>, thus referred to as YPD-3% Glu), with unadjusted pH and a C/N ratio > 4.5 (Martinez-Force and Benitez, 1995; Josefsen et al., 2012). The other was a high C/N ratio medium adapted from the literature (Blazeck et al., 2014), which contained 80 g L<sup>−1</sup> glucose, 6.7 g L<sup>−1</sup> Yeast Nitrogen Base w/o amino acids (containing 5 g L<sup>−1</sup> ammonium sulfate, which corresponds to 1.365 g L<sup>−1</sup> ammonium, or approximately 38 mM nitrogen), and 0.79 g L<sup>−1</sup> CSM supplement (Cat. no. 114500012; MP Biomedicals). The C/N ratio (g/g) of this high C/N ratio medium was approximately 59, with a pH value of 4.7. Both media were sterilized by 0.2-µm filtration.

Briefly, an overnight culture of transformants and the parent control strains grown in YPD medium was used to inoculate 50 mL of either YPD-3% Glu medium or the high C/N ratio medium in 250-mL baffled flasks to reach an initial OD of 0.25. The flasks were incubated in a rotary shaker at 200 rpm and 28°C for 6 days, followed by centrifugation for cell harvest. The cell pellets were freeze-dried and subjected to FAME analysis, as described previously (Wei et al., 2013).

### Treatment of Yeast Cells With a Chemical Chaperone

The chemical chaperone trimethylamine N-oxide dihydrate (TMAO; Cat. no. T0514, Sigma) was used to examine its effect on the cell growth and lipid accumulation of the Y. lipolytica control strain and the cellulase-expressing transformants. A stock solution of 3 M TMAO was prepared by dissolving the chemical in the culture medium and sterilized using a 0.22-µm filter.

Briefly, an overnight culture of transformants and the parent control strains grown in YPD was used to inoculate 50 mL of high C/N ratio medium in 250-mL baffled flasks to reach an initial OD of 0.25. The flasks were incubated in a rotary shaker at 200 rpm and 28°C for 4 days, followed by adding 2.63 mL of sterile 3 M TMAO stock solution to reach a working concentration of 150 mM. The flasks were incubated in a rotary shaker at 200 rpm and 28°C for another 2 days, then centrifuged for cell harvest. The dose and duration of the chemical chaperone treatment were based on the literature (Thanonkeo et al., 2007; Sootsuwan et al., 2013; Lamont and Sargent, 2017). Untreated parallel cultures (without TMAO) were used as controls. The cell pellets were freeze-dried and subjected to FAME analysis, as described previously (Wei et al., 2013).
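The TMAO dosing above follows from a simple dilution balance, sketched below. The function names are illustrative, and the calculation assumes the culture volume is still 50 mL on day 4 (i.e., evaporation and sampling losses are ignored), which the text does not state explicitly.

```python
def final_concentration_mM(stock_M: float, stock_mL: float, culture_mL: float) -> float:
    """Concentration (mM) after adding stock_mL of stock to culture_mL of culture."""
    return 1000.0 * stock_M * stock_mL / (culture_mL + stock_mL)


def stock_volume_mL(stock_M: float, target_mM: float, culture_mL: float) -> float:
    """Stock volume (mL) needed to reach target_mM, accounting for the added volume."""
    target_M = target_mM / 1000.0
    if target_M >= stock_M:
        raise ValueError("target concentration must be below the stock concentration")
    return target_M * culture_mL / (stock_M - target_M)


# Values from the text: 3 M stock, 50 mL culture, 2.63 mL added, 150 mM target.
print(f"Required stock volume: {stock_volume_mL(3.0, 150.0, 50.0):.2f} mL")
print(f"Resulting concentration: {final_concentration_mM(3.0, 2.63, 50.0):.1f} mM")
```

Solving for the added volume rather than using a plain C1V1 = C2V2 on 50 mL matters here because the stock addition itself increases the total volume by about 5%.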

## RESULTS AND DISCUSSION

## Co-expression of CBH I-CBH II-EG II in Y. lipolytica HA1

Prior to the present work, the enzymes chimeric CBH I, CBH II, and EG II had been expressed in Y. lipolytica individually (Wei et al., 2014). To co-express these three enzymes, plasmid construct 165 was built in the backbone of the pYLEX1 vector, and the resultant plasmid was named pYLEX1-CBH I-CBH II-EG II, with LEU2 as the selection marker (**Figure 2**). The plasmid construct 165 was linearized with NotI and transformed into Y. lipolytica HA1.

Approximately 2000 Leu<sup>+</sup> transformants were recovered on YNB plates lacking leucine. Since the cells we used for transformation were not synchronized, they were likely in different stages of their cell cycle. In addition, random insertion-associated positional effects have the potential to lead to a range of size variation among the resulting positive colonies of transformants on the selection plates. Out of these transformants, 20 visually larger colonies were selected because, in general, large colonies are more likely to be stable transformants from transformation of Y. lipolytica (Orr-Weaver and Szostak, 1983) and colony size has been used as a proxy for fitness in yeast and other microorganisms (Baryshnikova et al., 2010; Wagih et al., 2013). These 20 transformants were further narrowed down to eight transformants based on their higher OD600 values in YNB liquid medium after overnight growth. The single-colony purified transformants were tested for their ability to hydrolyze cellulose by growth on PASC-YPD agar for 6 days followed by Congo Red staining. The results showed that five out of the eight transformants (YL165-1, YL165-5, YL165-6, YL165-7, and YL165-8) produced relatively large clearance zones on the PASC-YPD plates after Congo Red staining (**Supplementary Figures S1A,B**). These transformants were confirmed to express CBH I-CBH II-EG II by real-time RT-PCR (**Supplementary Figure S2**).

It is noteworthy that in this study, two colony sizes appeared on the selection plates: large colonies appeared at day 3 after plating, while smaller colonies became visible at day 5 to 6 after plating. In addition to picking 20 larger colonies as described above, we also picked 20 of the later, smaller colonies. However, further culturing of the smaller colonies indicated that all of them were false positives, as they did not grow after re-streaking on selection plates. This is not an unusual phenomenon in fungal and yeast transformation and is caused by transient expression or unstable insertion of target genes and marker into the host genome (Singh A. et al., 2015; Schwarzhans et al., 2016).

### Enzyme Activity Screening of Transformants Co-expressing CBH I-CBH II-EG II

Further investigation was conducted to measure the enzyme activity of the culture supernatant or concentrated crude enzymes using wet ball-dispersed Avicel as a substrate. Enzymatic conversion of Avicel to cellobiose and glucose was measured after 1 h, 24 h, and 5 days of digestion. For unconcentrated supernatants of transformants YL165-1, YL165-5, and YL165-7, the enzyme activities (expressed as released sugars) were only measurable after 24 h and 5 days of hydrolysis, with up to 11% of Avicel being converted to cellobiose and glucose (**Table 2**, columns 3–4). For transformants YL165-6 and YL165-8, the enzyme activities of their unconcentrated supernatants were the lowest, only converting 3–4% of Avicel to cellobiose and glucose after 5 days of incubation with the Avicel substrates. Thus, these transformants were eliminated from further testing of their supernatants for enzymatic analysis.

Given the low percent conversion in the culture supernatants, and to better understand the potential enzymatic activity present in them, the culture supernatants of YL165-1, YL165-5, and YL165-7 were concentrated 35×. In the concentrated supernatants, significant conversion of Avicel, 13% to 31%, was observed after 1 h. Conversion increased to 56% and 69% for transformant YL165-1 and to 50% and 66% for YL165-5 after 24 h and 5 days, respectively (**Table 2**, columns 5–6), indicating that 35× concentrated cellulases can significantly enhance the degradation of cellulose.

Microscopic imaging analysis of the Avicel-crude enzyme mixture after 5 days of incubation with 35× concentrated crude enzymes of transformant YL165-1 confirmed the effects of cellulases on the size of Avicel particles. Compared to Avicel granules incubated with no enzymes, the particles of Avicel incubated with 35× concentrated crude enzymes of transformant YL165-1 were found to be much finer (**Supplementary Figure S3**), confirming deconstruction of the crystalline cellulose particles.

### Comparing Cellulase Levels in YL165-1 vs. Single Cellulase Expressing Strains

The best performing transformant from those described above in converting Avicel was YL165-1, which was further subjected to detailed analysis of the co-expressed cellulases. Strains YL151, YL102, and YL101, which were previously generated to express individual chimeric CBH I, CBH II, and EG II cellulases (Wei et al., 2014), were used as references (see **Figure 1** for the strain pedigree). Experimentally, these strains were arranged into three subsets and cultured in parallel in YPD media for 5 days. The supernatants were collected from the cultures when the OD600 value reached 10, followed by SDS–PAGE and western blot analyses (**Figures 3A–C**).

A western blot using anti-Tr CBH I antibody, which recognizes the T. reesei CBM and linker in the chimeric CBH I, detected a single band with the expected size for chimeric CBH I (**Figure 3A**, middle panel, lane 4), confirming successful expression of chimeric CBH I in transformant YL165-1, with a protein titer 0.8-fold of that for transformant YL151 expressing the single chimeric CBH I (**Figure 3A**, bottom panel, lanes 4 vs. 2′). The western blot using anti-CBH II antibody also showed a single band of the size expected for CBH II (**Figure 3B**, middle panel, lane 4), indicating the expression of CBH II in transformant YL165-1, with a titer 0.7-fold of that for transformant YL102 expressing single CBH II (**Figure 3B**, bottom panel, lanes 4 vs. 2″). The western blot using anti-EG II antibody detected a single band with the expected size for EG II (**Figure 3C**, middle panel, lane 4), validating the successful expression of EG II in transformant YL165-1, with a titer 3.3-fold of that in transformant YL101 expressing single EG II (**Figure 3C**, bottom panel, lanes 4 vs. 2‴).

Data presented were the average of three biological replicates and the SEM (standard error of the mean) was less than 10%. Glu equiv., glucose equivalent. HA1 (EV), parent control strain HA1 transformed with empty vector.

cellulase expressing transformants. (A) Supernatant samples for western blot using anti-Tr CBH I antibody. (B) Supernatant samples for western blot using anti-CBH II antibody. (C) Supernatant samples for western blot using anti-EG II antibody. In (A–C), the upper panels show SDS–PAGE gels after staining, and the middle panels show the identically loaded gels used for the western blot, while the bottom panels show the densitometric analysis of the western blots, for which error bars indicate the SEM for three biological replicates; <sup>∗</sup> and <sup>∗∗</sup> indicate significantly different from the reference strains (those expressing single CBH I, CBH II, or EG II) with p < 0.05 and p < 0.01, respectively. Lane 1, strain Po1g (transformed with empty vector) as the parent strain control for YL151, YL102, and YL101. Lane 2′, YL151 expressing chimeric CBH I; lane 2″, YL102 expressing CBH II; lane 2‴, YL101 expressing EG II. Lane 3, strain HA1 (transformed with empty vector). Lane 4, YL165-1. Loading amount was 22.5 µL supernatant per well. Cellulase titers (g L<sup>−1</sup>) are indicated by numbers in the western blot images.

It is noteworthy that recombinant CBH I enzymes in yeast were reported to exhibit variable levels of glycosylation, for which hyper-glycosylation leads to smeared bands in SDS–PAGE imaging (Godbole et al., 1999; Boer et al., 2000; Den Haan et al., 2007; Ilmen et al., 2011). The smeared band pattern of TeTrCBH I in YL151 (**Figure 3A**) was consistent with our observation in another recent study, which showed that when TeTrCBH I was expressed in the yeast L. starkeyi, it had a relatively high degree of glycosylation that led to an apparently smeared band of purified TeTrCBH I in SDS–PAGE analysis (Xu et al., 2017). Our observation here further showed that the TeTrCBH I expressed in YL165-1 vs. YL151 had different extents of glycosylation, as reflected by the different western blot bands in their respective lanes in **Figure 3A**.

### Low Correlation Between Transcriptional and Protein Levels of Cellulase Genes: Intrinsic Nature of Individual Cellulases Affecting Their Co-expression Ratio

The above data permitted a quantitation of the co-expressed cellulases in YL165-1. Previously, the titers of single chimeric CBH I, CBH II, and EG II proteins expressed were estimated to be 32, 24, and 40 mg L<sup>−1</sup> in strains YL151, YL102, and YL101, respectively, in flask cultures (Wei et al., 2014). Accordingly, the titers of co-expressed CBH I, CBH II, and EG II in YL165-1 can be calculated by multiplying each single protein's titer (in YL151, YL102, and YL101) by its respective densitometric fold-change (**Figures 3A–C**, bottom panels). This gives a protein titer ratio of:

$$\text{Protein CBH I}: \text{CBH II}: \text{EG II} = 26:17:132\ \text{mg L}^{-1} = 1.0:0.7:5.1 \tag{1}$$

The total titer of these cellulases is 175 mg L<sup>−1</sup>. Among them, EG II appears to be highly efficient for synthesis and secretion as it was the dominant cellulase based on its titer (132 mg L<sup>−1</sup>); in contrast, CBH I appears to be less efficient for synthesis and secretion in the obtained transformant (with a titer of 26 mg L<sup>−1</sup>).
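The arithmetic behind Eq. 1 and the total titer can be reproduced with the short sketch below. It uses only the single-enzyme titers and densitometric fold-changes quoted in the text; the variable names are illustrative, and rounding each titer to the nearest mg L<sup>−1</sup> before summing is an assumption made to match the reported values.

```python
# Titers (mg/L) of the single-cellulase strains YL151, YL102, YL101 (Wei et al., 2014).
single_titers = {"CBH I": 32.0, "CBH II": 24.0, "EG II": 40.0}
# Densitometric fold-changes of YL165-1 relative to those strains (Figure 3, bottom panels).
fold_changes = {"CBH I": 0.8, "CBH II": 0.7, "EG II": 3.3}

# Co-expressed titer = single-strain titer x fold-change, rounded to whole mg/L.
co_titers = {name: round(single_titers[name] * fold_changes[name]) for name in single_titers}
total_titer = sum(co_titers.values())

# Ratio normalized to CBH I, rounded to one decimal place.
ratio = {name: round(titer / co_titers["CBH I"], 1) for name, titer in co_titers.items()}

print("Co-expressed titers (mg/L):", co_titers)
print("Total titer (mg/L):", total_titer)
print("Ratio relative to CBH I:", ratio)
```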

Meanwhile, the real-time RT PCR analysis of cDNA samples revealed that:

$$\text{Transcript CBH I}: \text{CBH II}: \text{EG II} = 15.6:1:1.4 \tag{2}$$

To obtain a direct, visual illustration for the transcriptional and protein levels of these cellulases, the data in Eqns. 1 and 2 were re-plotted in **Figure 4**, which reveals a substantial discrepancy (i.e., lack of correlation) between the transcriptional and protein levels for chimeric CBH I vs. EG II. Future studies on the co-secretion of chimeric CBH I and EG II under the same promoters are warranted.

An optimal ratio of CBH I, CBH II, and EG II is crucial for an efficient conversion of cellulose to simple sugars. This study achieved a high titer for EG II but a moderate titer for chimeric CBH I, which indicates that further efforts are needed to boost the expression level of CBH I for an optimal CBH I and EG II ratio. CBH I proteins from different fungal species vary in their expression levels and specific activity when expressed in the yeast S. cerevisiae (Ilmen et al., 2011) and Y. lipolytica (Wei et al., 2014; Guo et al., 2017). While the Te-Tr chimeric CBH I showed a likely intrinsic property for its expression limitation (26 mg L<sup>−1</sup> under the control of the TEFin promoter, which is a TEF promoter combined with its intron) in this study, a higher expression level of CBH I from Neurospora crassa (24 and 95 mg L<sup>−1</sup> under control of TEF and a hybrid promoter HTEF, respectively) (Guo et al., 2017) raises hope for further tuning the ratio of CBH I and EG II by including N. crassa CBH I into this platform in future studies.

#### Cellulose Utilization and Lipid Production by Transformant YL165-1 in Avicel Medium

The capability of Y. lipolytica transformant YL165-1 for utilizing cellulose and producing lipids was evaluated by growing the transformant in mineral medium with Avicel as a carbon source. The cultures were harvested at 120 h and the Avicel residues were analyzed. The results showed that YL165-1 consumed 22.8% of the original Avicel content and produced 0.28 g DCW cell biomass per g Avicel consumed (**Table 3**), while the control strain HA1 (EV) showed no detectable cell growth. Note that the Avicel utilization rate of 22.8% by YL165-1 is on the low end of that observed for Y. lipolytica strains expressing a different set of cellulase enzymes (in the range of 22–30%) (Guo et al., 2017), which can be explained by a suboptimal CBH I/EG II titer ratio of the expressed enzymes in this study.

Furthermore, FAME analysis showed that the total FAME produced by transformant YL165-1 was 0.19 g L<sup>−1</sup>, while the FAME yield was 31 mg g<sup>−1</sup> Avicel consumed (**Table 3**). Compared with lipid production by other oleaginous microorganisms on cellulose or glucose medium, this FAME yield by YL165-1 is in a similar range. For example, it was reported that the FAME yield of the oleaginous, filamentous fungus Mucor circinelloides was 57 mg FAME per g Avicel consumed when supplemented with exogenous CBH I and cultured on Avicel substrates (with 33% Avicel being utilized) (Wei et al., 2013). Another report showed that the yield of total lipids (which is usually higher than that of FAME) of the oleaginous yeast Rhodosporidium toruloides was 80 mg lipids per g glucose consumed when cultured in a medium (C/N ratio of 5) using glucose as the substrate (Andrade et al., 2012). Note that the FAME profile of YL165-1 in Avicel medium is presented and discussed in a later section. Taken together, the transformant YL165-1 was able to maintain a considerable capability for lipid accumulation using cellulose-containing medium, while the cellulose conversion efficiency remains the limiting factor and a future direction for strain engineering.
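As a consistency check, the reported FAME yield can be recovered from the other figures given in this section. The sketch below assumes that the 0.19 g L<sup>−1</sup> FAME value is the numerator of the yield and uses the initial Avicel concentration of 27 g L<sup>−1</sup> stated in the Table 3 notes; the variable names are illustrative.

```python
initial_avicel_g_per_l = 27.0   # initial cellulose concentration (Table 3 notes)
fraction_consumed = 0.228       # 22.8% of the Avicel consumed by YL165-1
fame_g_per_l = 0.19             # total FAME produced by YL165-1

# Avicel consumed (g/L) and FAME yield per gram of Avicel consumed (mg/g).
avicel_consumed_g_per_l = initial_avicel_g_per_l * fraction_consumed
fame_yield_mg_per_g = 1000.0 * fame_g_per_l / avicel_consumed_g_per_l

print(f"Avicel consumed: {avicel_consumed_g_per_l:.2f} g/L")
print(f"FAME yield: {fame_yield_mg_per_g:.0f} mg per g Avicel consumed")
```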

#### Impacts of Cellulase Expression on the Growth of Transformants in YPD Medium

The effects of simultaneous expression of multiple cellulases on the growth of Y. lipolytica were investigated by plotting the turbidity obtained from a Bioscreen C growth assay. Previously, the Bioscreen C instrument has been used to monitor microbial growth curves in terms of optical density of bacteria (Franden et al., 2009, 2013; Wang et al., 2014b) and yeast (Bom et al., 2001; Jung et al., 2015; Gientka et al., 2016). Y. lipolytica transformants expressing single cellulases were similar to their parent strain Po1g (EV) during the exponential growth phase and only showed slight differences during the late growth phase (**Figure 5**). The growth curve of YL165-1 was similar to that of its parent control strain HA1 (EV) during exponential growth and was only 11% lower in optical density during the late growth phase (p < 0.05). Meanwhile, there was no significant difference between YL165-1 and HA1 (EV) in the mean size of cells based on the measurement by the Cellometer Vision (Nexcelom Bioscience, Lawrence, MA, United States). These data indicate that the co-expression of CBH I-CBH II-EG II comes at a small cost for the cells in terms of slightly reduced growth rate in acquiring the capacity to degrade and utilize cellulosic substrates.

The Bioscreen C has been used to determine the growth curve of Y. lipolytica in recent years (Lazar et al., 2011; Dobrowolski et al., 2016; Mirończuk et al., 2016, 2017; Rzechonek et al., 2018). Under the culture condition we used for Bioscreen C analysis, a relatively small proportion of cells changed from typical yeast form to filamentous form (as examined using Cellometer), which was not significant enough to disrupt the turbidity readings. Nevertheless, a cell dry weight method for shake-flask culture was used to validate the results of growth curves obtained via the Bioscreen C, and the data are presented in **Supplementary Figure S4**. The results indicate that the time-course profiles of cell mass weight of the parent control strains and cellulase-expressing transformants follow a similar trend as their growth curves illustrated in **Figure 5**, confirming that the co-expression of CBH I-CBH II-EG II in YL165-1 comes at a small cost (i.e., mild metabolic burden) for the cells in terms of slightly reduced growth rate in the late growth phase.

relative transcriptional data (in blue color), the transcriptional level for CBH II-encoding gene had the lowest mRNA level and was set at 1. The presented data were collected from three biological replicates, for which error bars indicate the SEM.

TABLE 3 | Cell mass and FAME analyses of Y. lipolytica transformant YL165-1 grown on mineral medium containing Avicel as the sole carbon source.

Data presented as the average of triplicate biological samples with SEM. DCW, dry cell weight. [1] Avicel consumed % was measured as described in the Methods section; the initial concentration of cellulose in the medium was 27 g L<sup>−1</sup>. [2] Total DCW was calculated as Dry weight of cell-Avicel pellet – Amount of Avicel consumed. [3] DCW of newly grown cells = Total DCW – Dry weight of initial inoculated cells; the dry weight of initial inoculated cells was 1 g L<sup>−1</sup>. [4] Cell mass yield was calculated as g dry weight of newly grown cells per g Avicel consumed. [5] "Total newly formed FAME" = "FAME of total cells" – "FAME of initial inoculated cells."

FIGURE 5 | Comparison of growth curves between Y. lipolytica transformants expressing individual or multiple cellulases grown in YPD medium. Turbidity data were obtained from Bioscreen C, by which absorbance readings were taken every 15 min. The relative turbidity percentage values for HA1 (EV) and YL165-1 were calculated from the turbidity of the strain cultures at the end-point of the growth curve, with the turbidity of HA1 (EV) being set as 100%. HA1 (EV), strain HA1 transformed with empty vector; Po1g (EV), strain Po1g transformed with empty vector. The bars for SEM from triplicates at time points of 60, 70, 80, 90, 100, and 110 h are shown; <sup>∗</sup> indicates significantly different from the HA1 (EV) strain with p < 0.05.

This relatively mild metabolic burden of cellulase co-expression on the growth of Y. lipolytica is in contrast to the relatively larger burden of cellulase expression on the growth of S. cerevisiae, in which the CBH-expressing S. cerevisiae Y294[Y118p] strain had a 1.4-fold lower maximum specific growth rate than the reference strain (van Rensburg et al., 2012).

#### Effects of Cellulase Co-expression on Lipid Production: Metabolic Drain and Its Alleviation

#### Scenario 1. Drain of Cellulase Co-expression on Lipid Production of Cells in Medium With Moderate C/N Ratio

The purpose of using a moderate C/N ratio medium (YPD-3% Glu, with a C/N ratio > 4.5) was to investigate whether cellulase co-expression causes a drain on lipid production in carbon-nitrogen balanced medium. After 6 days of culturing, most glucose in the medium was consumed by both the parent control strain HA1 (EV) and the transformant YL165-1 cells (**Table 4**, rows 1–2). Consistent with our above observation that the turbidity of YL165-1 cells was only slightly lower than that of its parent control strain HA1 (EV) at the late growth phase, the cell mass yield of YL165-1 was only slightly lower than that of HA1 (EV) (10.5 vs. 11.9 g DCW L<sup>−1</sup>; **Table 4**).

However, the FAME content in transformant YL165-1 was significantly lower than that in parent control strain HA1 (EV) (15 vs. 27% FAME, cell dry weight basis; **Table 4**, rows 1–2). Accordingly, the FAME yield of YL165-1 was half of that in HA1 (EV) (52 vs. 113 mg FAME per g sugar; **Table 4**), suggesting that co-expression of CBH I-CBH II-EG II can compromise lipid accumulation when cultured in carbon-nitrogen balanced medium.

Fatty acid methyl ester analyses show that the major types of fatty acids in both types of cells were C18:1n9, C18:0, C18:2n6, and C16:0 (in order of prominence) with relatively moderate contributions from C16:1n7, C24:0, and C22:0 (**Figure 6A**). Such a fatty acid pattern is not only consistent with a recent observation for the parent strain HA1, in which C18:1, C18:0, and C16:0 were the predominant types (Blazeck et al., 2014), but also consistent with the reported very long-chain fatty acid (VLCFA) content in the wild-type (Katre et al., 2012; Pomraning et al., 2015) and recombinant Y. lipolytica strains (Faure et al., 2010; Tsigie et al., 2012; Zhang et al., 2013). Overall, there is no dramatic difference between YL165-1 and HA1 (EV) in their FAME profiles despite a slight difference in the ratio of saturated fatty acids (SFA) vs. unsaturated fatty acids (UFA) (**Figures 6A,D**).

#### Scenario 2. Improved Lipid Production in Cellulase Co-expressing Cells by High C/N Ratio Medium

The lipid production capability of Y. lipolytica YL165-1 was also examined in a high C/N ratio medium, with an initial concentration of 80 g L<sup>−1</sup> glucose and an initial C/N ratio of 59. For glucose consumption, after 6 days of culturing, the control strain HA1 (EV) and transformant YL165-1 cells consumed 12.6 and 48.0 g L<sup>−1</sup> glucose in the medium, respectively (**Table 4**, rows 3–4). The control strain HA1 (EV) had a relatively low OD600 of 5.1 but a high FAME content of 28% (DCW basis) (**Table 4**), which is consistent with previous reports showing that while a high C/N ratio medium benefited lipid accumulation, it limited the cell growth of Y. lipolytica (Back et al., 2016; Yang et al., 2016).

In contrast, cellulase co-expressing transformant YL165-1 grown in high C/N ratio medium showed a completely different pattern of enhanced glucose utilization (i.e., 48.0 g L<sup>−1</sup> vs. 12.6 g L<sup>−1</sup> glucose for the control strain). This led to a twofold improvement in cell mass and a threefold increase in total FAME amount relative to the control strain (i.e., 10.1 vs. 3.1 g DCW L<sup>−1</sup>, and 3.4 vs. 0.9 g FAME L<sup>−1</sup> in the control strain) (**Table 4**, rows 3–4).

TABLE 4 | Cell mass and FAME content in Y. lipolytica transformant YL165-1 cells cultured in moderate and high C/N media with or without a chemical chaperone.

Data presented as the average of 3 biological replicate samples with SEM. <sup>∗</sup> and <sup>∗∗</sup> indicate statistical significance of p < 0.05 and p < 0.01, respectively. DCW, dry cell weight; Glu, glucose. HA1 (EV), parent control strain HA1 transformed with empty vector; TMAO, trimethylamine N-oxide dihydrate.

Significant shifts between SFA and UFA were found in transformant YL165-1, with significantly more SFA (C16:0 and C18:0) but much less UFA (C18:1n9 and C18:2n6; **Figure 6B**), leading to a higher SFA/UFA ratio of 1.03 compared with the control strain's SFA/UFA ratio of 0.30 (**Figure 6D**). These data suggest that cellulase co-expression can alter not only the amount but also the composition of the accumulated lipids. Our observation of increased SFA in YL165-1 grown in high C/N ratio medium is consistent with a report that in the green alga Ankistrodesmus falcatus grown under combined nutrient stress conditions (N, P, and Fe), both saturated fatty acid and lipid accumulation were significantly enhanced (Singh P. et al., 2015). The proposed underlying mechanism was that under stress conditions, unsaturated fatty acids tend to undergo oxidative cleavage, which leads to a higher proportion of saturated fatty acids that provide oxidative stability (Sitepu et al., 2013; Singh et al., 2016).

#### Scenario 3. Lipid Production of Cellulase Co-expressing Cells Was Further Enhanced by the Addition of a Chemical Chaperone

It has been reported that chemical chaperones, a group of small-molecular-weight compounds, stabilize the conformation and structure of proteins, enhance the protein folding capacity of the ER, and help the trafficking of mutated proteins (Vallejo-Becerra et al., 2008). They are capable of reducing ER and oxidative stresses (Sootsuwan et al., 2013; Lamont and Sargent, 2017). Accordingly, we treated cellulase co-expressing transformants and control cells with the chemical chaperone TMAO, and compared the results with those from untreated cells.

The TMAO-facilitated lipid production capability of Y. lipolytica YL165-1 was examined in its day 4 culture grown in a high C/N ratio medium. The day 4 cells were treated with 150 mM TMAO for two additional days. For glucose consumption, TMAO treatment increased that of control strain HA1 (EV) cells by 28%, from 12.6 to 16.1 g glucose L−<sup>1</sup> (**Table 4**, row 3 vs. row 5), whereas it increased that of transformant YL165-1 cells by a much larger magnitude of 66%, from 48.0 to 79.9 g glucose L−<sup>1</sup> (**Table 4**, row 4 vs. row 6). Accordingly, for cell mass yield, TMAO treatment increased that of control strain HA1 (EV) cells by 32%, from 3.1 to 4.1 g DCW L−<sup>1</sup> (**Table 4**, row 3 vs. row 5), whereas it increased that of transformant YL165-1 by a much larger magnitude of 117%, from 10.1 to 21.9 g DCW L−<sup>1</sup> (**Table 4**, row 4 vs. row 6).

For lipid production, TMAO treatment increased the total FAME production of control strain HA1 (EV) cells by 11%, from 0.9 to 1.0 g FAME L−<sup>1</sup> (**Table 4**, row 3 vs. row 5) and increased that of transformant YL165-1 cells by a much larger magnitude of 74%, from 3.4 to 5.9 g FAME L−<sup>1</sup> (**Table 4**, row 4 vs. row 6). In summary, as demonstrated in **Table 4**, TMAO significantly enhanced the glucose consumption and cell mass yield of YL165-1, which led to a higher production yield of lipids (i.e., g total lipid per liter). TMAO has been found to stimulate the cell growth of Escherichia coli (Ishimoto and Shimokawa, 1978), Salmonella typhimurium (Kim and Chang, 1974) and Proteus sp. strain NTHC153 (Strom et al., 1979), likely by acting as a terminal electron acceptor in the respiratory chain (Strom et al., 1979), thus mediating the cell's energy metabolism and redox balance. This study observed the stimulation of cell growth of cellulase-expressing Y. lipolytica strains; we propose that in addition to its well-documented role as a chemical chaperone in facilitating protein folding and alleviating ER stress, TMAO may also act as an electron acceptor, enhance the cell's energy metabolism, and thus mediate the redox status in the cellulase-expressing Y. lipolytica cells. Future studies are needed to explore this complex mechanism.
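The percentage increases quoted for the TMAO treatment follow directly from the Table 4 values cited in the text. As a minimal sketch of that arithmetic (the helper name `percent_increase` and the dictionary layout are illustrative assumptions, not part of the original analysis):

```python
# Illustrative check of the percentage increases quoted in the text.
# Each pair is (untreated, TMAO-treated), taken from the Table 4 values cited above.
# Function and variable names are hypothetical, chosen only for this sketch.

def percent_increase(before: float, after: float) -> float:
    """Return the relative increase from `before` to `after`, in percent."""
    return (after - before) / before * 100.0

table4 = {
    ("glucose consumed (g/L)", "HA1 (EV)"): (12.6, 16.1),
    ("glucose consumed (g/L)", "YL165-1"): (48.0, 79.9),
    ("cell mass (g DCW/L)", "HA1 (EV)"): (3.1, 4.1),
    ("cell mass (g DCW/L)", "YL165-1"): (10.1, 21.9),
    ("total FAME (g/L)", "HA1 (EV)"): (0.9, 1.0),
    ("total FAME (g/L)", "YL165-1"): (3.4, 5.9),
}

# Round to whole percentages, as reported in the text.
increases = {
    key: round(percent_increase(before, after))
    for key, (before, after) in table4.items()
}

for (metric, strain), pct in increases.items():
    print(f"{metric:24s} {strain:9s} +{pct}%")
```

In every metric the relative gain for YL165-1 exceeds that for the control strain, which is the comparison the paragraph draws.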

For the SFA and UFA profiling, TMAO-treated transformant YL165-1 cells in scenario 3 showed a pattern similar to that observed in scenario 2, with significantly more SFA (C16:0 and C18:0) and less UFA (C18:1n9 and C18:2n6; **Figure 6C**), leading to a much higher SFA/UFA ratio of 0.74 compared with the TMAO-treated control strain's SFA/UFA ratio of 0.31 (**Figure 6D**). Nevertheless, it is noteworthy that the SFA/UFA ratio of 0.74 in scenario 3 was significantly lower than that of 1.03 in scenario 2, supporting the literature-reported role of TMAO in reducing oxidative stress in cells (Sootsuwan et al., 2013; Lamont and Sargent, 2017), which can lead to less oxidative cleavage and less conversion of UFA to SFA.

It is noteworthy that since high C/N ratio medium has been broadly used as a lipid production medium to maximize fatty acids synthesis and lipid accumulation by severely restricting protein synthesis in oleaginous organisms (Back et al., 2016; Yang et al., 2016), the cellulase production of YL165-1 grown in high C/N plus TMAO medium was not assessed in this study. Future studies are needed to investigate any potential effects of TMAO on the cellulase production in engineered Y. lipolytica strains.

### Effects of Carbon Source and C/N Ratio on the Fatty Acid Composition Profile of YL165-1

The fatty acid profile of YL165-1 cultured in four types of media demonstrates that MM-Avicel medium-grown YL165-1 cells have a unique fatty acid composition, different from that of cells grown in either the moderate C/N medium or the high C/N medium (with or without TMAO) (**Figure 7**). MM-Avicel medium-grown YL165-1 cells have the highest proportion of UFA (C16:1n7, C16:3 and C18:2n6) and the lowest proportion of SFA (C18:0 and C24:0) among the cells grown in the four media (**Figure 7**). Accordingly, MM-Avicel medium-grown YL165-1 cells have the lowest SFA/UFA ratio. This observation is consistent with a recent study by Dobrowolski et al., which showed that the fatty acid profiles of Y. lipolytica strain A101 cultured on various sources of crude glycerol differed depending on the type of substrate applied (Dobrowolski et al., 2016).

### Factors Affecting the Lipid Contents of Y. lipolytica and Their Estimation

It should be noted that since the focus of this study was to co-express fungal cellulases in Yarrowia, we deviated from the optimal culture conditions (such as fermentation in a bioreactor with controlled pH and improved aeration) used in the previous studies with this strain (Blazeck et al., 2014). As a result, the measured FAME content for the parent control strain HA1 (EV), which was close to 30% FAME on a DCW basis in shake flasks in this study, cannot be directly compared with the previously reported values, which were in the range of 71% (Liu et al., 2015) to 90% (Blazeck et al., 2014) and were generated during fermentation in pH-controlled, more efficiently aerated bioreactors.

Difference between the media and shaking speeds used in flask culture in this vs. previous studies. The previous study used three types of containers (rotary drum, shake flasks, and bioreactor) in assessing lipid accumulation in Y. lipolytica strain HA1 (Blazeck et al., 2014). For shake flask culturing of HA1, the different shaking speeds used in the previous study (225 rpm) vs. this study (200 rpm) should partially account for the different levels of lipid accumulation between the two studies, as better agitation/aeration for microbial growth can significantly impact performance in producing metabolites (Liu et al., 2015). Meanwhile, the different media used in flask culture in this vs. the previous study would likely account for a larger portion of the reported difference in lipid accumulation. In addition to the modified YPD-based medium (YPD-3% Glu, which contains yeast extract 10 g L−<sup>1</sup>, peptone 20 g L−<sup>1</sup>, and glucose 30 g L−<sup>1</sup>, and which led to a FAME level of 27% ± 2% on a DCW basis for the strain HA1 cells), this study also used a YNB-based medium that contained 80 g L−<sup>1</sup> glucose and 1.365 g L−<sup>1</sup> ammonium (which led to a FAME level of 28% ± 1% on a DCW basis for the strain HA1 cells). Both media were significantly different from the YNB-based medium used in the previous study (Blazeck et al., 2014), which contained up to 160 g L−<sup>1</sup> glucose and only 0.055 g L−<sup>1</sup> ammonium and led to a total lipid level of up to 90% on a DCW basis for the strain HA1 cells. The significantly higher C/N ratio of the YNB-based medium used in the previous study would likely lead to higher lipid accumulation for the same strain.
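To make the difference in C/N ratio between the two YNB-based media concrete, the following is a rough sketch of an elemental carbon-to-nitrogen mass ratio computed from the glucose and ammonium concentrations quoted above. It rests on assumptions that the text does not state: glucose is treated as the only carbon source, "ammonium" is taken to be the mass of the NH4+ ion, and all other medium components are ignored. The function name is hypothetical.

```python
# Rough elemental C/N mass ratio of a defined medium.
# ASSUMPTIONS (not stated in the paper): glucose is the only carbon source counted,
# the "ammonium" concentration is the mass of the NH4+ ion, and every other
# medium component (YNB salts, vitamins, etc.) is ignored.

C_FRACTION_GLUCOSE = 6 * 12.011 / 180.156  # mass fraction of carbon in C6H12O6
N_FRACTION_AMMONIUM = 14.007 / 18.039      # mass fraction of nitrogen in NH4+

def cn_mass_ratio(glucose_g_per_l: float, ammonium_g_per_l: float) -> float:
    """Return grams of carbon per gram of nitrogen for the given concentrations."""
    carbon = glucose_g_per_l * C_FRACTION_GLUCOSE
    nitrogen = ammonium_g_per_l * N_FRACTION_AMMONIUM
    return carbon / nitrogen

# Concentrations quoted in the text for the two YNB-based media.
cn_this_study = cn_mass_ratio(80.0, 1.365)
cn_previous_study = cn_mass_ratio(160.0, 0.055)

print(f"This study:     C/N ~ {cn_this_study:.0f}")
print(f"Previous study: C/N ~ {cn_previous_study:.0f}")
print(f"Fold difference: ~{cn_previous_study / cn_this_study:.0f}x")
```

Under these assumptions the previous study's medium has a C/N ratio on the order of 1,500 versus roughly 30 here, which illustrates why the two sets of lipid contents are not directly comparable.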

Growth conditions (flask vs. bioreactor). Growth conditions play a significant role in lipid accumulation in Y. lipolytica, as exemplified by the wide range in lipid accumulation, which spans a 74-fold improvement over strain Po1f, the parent strain of HA1, as illustrated in **Figure 1** (Blazeck et al., 2014). Another study reported that bioreactor fermentation increased the lipid content of Y. lipolytica cells by 50% compared with shake flask culture (Tai and Stephanopoulos, 2013).

Buoyancy of Y. lipolytica cells. As listed in **Table 1**, both strains Po1f (the parent strain for HA1 used in the study) and Po1g were derived from the wild-type strain W29 (Madzak et al., 2000). For Po1g, its cell mats were found to float in water after being scraped off the surface of agar medium, as illustrated in **Supplementary Figure S1** of our previous report (Wei et al., 2014). For HA1, a more recent study showed that a subpopulation of cells had very high buoyancy. The cells floating on top of the medium were full of lipids, as determined by fluorescence microscopy with Nile Red staining, whereas normal cells that settled to the bottom of the tube did not contain as much lipid (Liu et al., 2015). In our study, as illustrated in **Supplementary Figure S5**, the cellulase co-expressing transformant YL165-1 showed more floating cell mass than the control strain HA1 (EV) in shake flask culture. We speculate that the buoyancy of YL165-1 cells likely also affects oxygen diffusion into the medium, providing another rationale for optimizing culture conditions by using a higher shaking speed in flasks or bioreactor conditions to achieve increased lipid accumulation.

### Mechanisms for Relieving the Metabolic Burden of Cellulase Expression by a High C/N Ratio Medium and the Addition of Chemical Chaperone

#### Proposed Role of ER for the Efficient Co-expression of Cellulases and Lipid Biosynthesis

One goal of this study was to examine the dynamic relationship between the production and secretion of heterologous cellulases and the accumulation of lipids. The ER is an important organelle for protein synthesis, folding, and secretion as well as for lipid metabolism. In fact, the synthesis of secretory proteins starts in the ER, which integrates nascent proteins and ensures their correct post-translational modification and folding, after which the proteins are captured into ER-derived transport vesicles and delivered to the early Golgi (Barlowe and Miller, 2013; den Haan et al., 2013).

The ER also plays a critical role in lipid metabolism, as reflected by literature reports that (1) a list of 493 candidate proteins (accounting for approximately 9% of the proteome in the yeast S. cerevisiae) were known or predicted to be involved in lipid metabolism and its regulation (Natter et al., 2005), and (2) when green fluorescent protein (GFP) tagging coupled with confocal laser scanning microscopy was used to localize these proteins, the majority of the tagged, lipid metabolism-related enzymes localized to the ER (92), followed by vesicles (53), mitochondria (27), lipid droplets (23), peroxisomes (17), plasma membrane (14), and Golgi (7) (Natter et al., 2005). Furthermore, lipid droplets (LDs) in yeast are not only functionally connected to the ER (Jacquier et al., 2011); de novo LD biogenesis also occurs in the ER, from which LDs eventually bud (Jacquier et al., 2013; Choudhary et al., 2015). All these considerations support the ER as the central organelle for protein synthesis, lipid biosynthesis, and lipid droplet formation.

Our results demonstrated that the overall high level of cellulase secretion in transformant YL165-1 compromises the lipid accumulation of these cells. This observation, together with the above literature analysis, prompts us to propose an intrinsic link between cellulase co-expression/secretion and lipid accumulation: enforced overexpression of secreted cellulases places a drain on the ER of yeast cells, leading to competition between the synthesis and secretion of the co-expressed cellulases and lipid production.

#### Possible Mechanisms for Relieving the Metabolic Burden by High C/N Ratio Medium and the Addition of Chemical Chaperone

High-level expression of heterologous proteins in yeast has previously been found to induce significant cellular changes, including a decrease in growth rate and alterations in nitrogen and redox metabolism, and to pose a metabolic burden on the host cells (Ruijter et al., 2017). In the case of cellulase expression, expression of heterologous A. aculeatus and Saccharomycopsis fibuligera BGLs had an increasingly negative effect on the cell growth of S. cerevisiae as the expressed gene dose increased, until the cells finally failed to grow (Ding et al., 2017). Equally relevant, it has been reported that high-level expression of endogenous and heterologous secreted cellulases can cause ER stress, which subsequently induces the unfolded protein response (UPR), activating related genes that relieve stress in the secretory pathway and improve protein folding efficiency and capacity in filamentous fungi and yeast (Collen et al., 2005; Ilmen et al., 2011; Fan et al., 2015; Van Zyl et al., 2016; Ruijter et al., 2017). The link between ER stress, the UPR, and lipid accumulation has been shown in the literature, which reported that ER stress stimulates the formation of lipid droplets and protects yeast cells against the effects of misfolded proteins (Czabany et al., 2008; Fei et al., 2009; Hapala et al., 2011).

High C/N ratio, nitrogen-limited media have been used to cultivate a range of yeasts and algae to achieve high levels of lipid accumulation (Li et al., 2008; Chaisawang et al., 2012; Lamers et al., 2012; Sharma et al., 2012; Braunwald et al., 2013; Sitepu et al., 2013; Blazeck et al., 2014; Hirooka et al., 2014; Zhang et al., 2014; Yu et al., 2015; Yang et al., 2016). As observed in this study, the cellulase co-expression route had an unexpected benefit in enhancing lipid accumulation when cells were grown in medium with a high C/N ratio. We propose that the likely underlying mechanism is that the high C/N ratio medium effectively lowers the level of protein synthesis (because of nitrogen limitation) and thus ER stress, which boosts nitrogen-limitation-induced lipid production, as shown in **Figure 6E** (scenario 2). Supplementing the high C/N ratio medium with TMAO likely facilitates protein folding and lowers ER and oxidative stresses. TMAO may also help mediate the cell's energy metabolism and redox balance by acting as an electron acceptor, as suggested by its enhancement of cell growth and cell mass yield. Future studies are warranted to further explore the mechanisms underlying the delicate balance between cellulase synthesis/secretion and fatty acid-based biofuel production.

## CONCLUSION

We have successfully demonstrated the co-expression and secretion of three core fungal cellulases in a high lipid-accumulating engineered strain of the oleaginous yeast Y. lipolytica, enabling nearly 23% conversion of Avicel. To the best of our knowledge, this represents the first case of exploring the relationship between secreted cellulase expression, cell growth, and lipid production in Y. lipolytica strains. The transformant YL165-1 expressed a relatively high titer of EG II and a moderate titer of CBH I. Remarkably, when grown in medium with a high C/N ratio and supplemented with a chemical chaperone, the cellulase co-expressing transformant showed a pattern in which the metabolic drain caused by cellulase co-expression and secretion was relieved, and both cell growth and lipid productivity were significantly increased, highlighting the effectiveness of these approaches in rebalancing protein synthesis and lipid production in Y. lipolytica.

#### AUTHOR CONTRIBUTIONS

MH, MZ, HA, and HW led the project and coordinated the study. HW conceived and designed the experiments. HA provided the lipid hyperaccumulating yeast strains and the genetic background information. HW executed, and WW and QX assisted with DNA construct building, yeast transformation and screening, SDS–PAGE, western blotting analysis, and enzymatic analyses. SVW conducted the FAME analysis. C-YL, YL, and HW designed the primers and conducted the RNA extraction and real-time RT-PCR. SD provided CBH I, CBH II and EG II antibodies, and provided guidance on enzymatic assays. EK provided guidance on yeast culturing techniques and media. HW prepared the initial draft of the manuscript. SD, EK, MH, HA, and MZ edited the manuscript. HW coordinated the manuscript submission. All authors read and approved the final manuscript.


#### FUNDING

This work was authored by Alliance for Sustainable Energy, LLC, the Manager and Operator of the National Renewable Energy Laboratory for the U.S. Department of Energy (DOE) under contract no. DE-AC36-08GO28308. Funding provided by U.S. Department of Energy Office of Energy Efficiency and Renewable Energy Bioenergy Technologies Office. The views expressed in the article do not necessarily represent the views of the DOE or the U.S. Government. The U.S. Government retains and the publisher, by accepting the article for publication, acknowledges that the U.S. Government retains a non-exclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this work, or allow others to do so, for U.S. Government purposes.

#### ACKNOWLEDGMENTS

The authors thank Dr. Andrew B. Hill for technical assistance.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmicb.2018.03276/full#supplementary-material



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Wei, Wang, Alper, Xu, Knoshaug, Van Wychen, Lin, Luo, Decker, Himmel and Zhang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Gene Co-expression Network Reveals Potential New Genes Related to Sugarcane Bagasse Degradation in Trichoderma reesei RUT-30

Gustavo Pagotto Borin<sup>1,2</sup>, Marcelo Falsarella Carazzolle<sup>3</sup>, Renato Augusto Corrêa dos Santos<sup>4</sup>, Diego Mauricio Riaño-Pachón<sup>5</sup> and Juliana Velasco de Castro Oliveira<sup>1,2</sup>\*

<sup>1</sup> Laboratório Nacional de Ciência e Tecnologia do Bioetanol (CTBE), Centro Nacional de Pesquisa em Energia e Materiais (CNPEM), Campinas, Brazil, <sup>2</sup> Programa de Pós-Graduação em Genética e Biologia Molecular, Instituto de Biologia, Universidade de Campinas (UNICAMP), Campinas, Brazil, <sup>3</sup> Laboratório de Genômica e Expressão (LGE), Departamento de Genética, Evolução, Microbiologia e Imunologia, Instituto de Biologia, Universidade Estadual de Campinas (UNICAMP), Campinas, Brazil, <sup>4</sup> Faculdade de Ciências Farmacêuticas de Ribeirão Preto, Universidade de São Paulo (USP), Ribeirão Preto, Brazil, <sup>5</sup> Centro de Energia Nuclear na Agricultura, Universidade de São Paulo (USP), Piracicaba, Brazil

#### Edited by:

Roberto Silva, Universidade de São Paulo, Brazil

#### Reviewed by:

Kentaro Inokuma, Kobe University, Japan

Andrei Steindorff, Joint Genome Institute (JGI), United States

#### \*Correspondence:

Juliana Velasco de Castro Oliveira juliana.velasco@ctbe.cnpem.br

#### Specialty section:

This article was submitted to Bioenergy and Biofuels, a section of the journal Frontiers in Bioengineering and Biotechnology

Received: 10 July 2018; Accepted: 03 October 2018; Published: 22 October 2018

#### Citation:

Borin GP, Carazzolle MF, dos Santos RAC, Riaño-Pachón DM and Oliveira JVdC (2018) Gene Co-expression Network Reveals Potential New Genes Related to Sugarcane Bagasse Degradation in Trichoderma reesei RUT-30. Front. Bioeng. Biotechnol. 6:151. doi: 10.3389/fbioe.2018.00151

The biomass-degrading fungus Trichoderma reesei has been considered a model for cellulose degradation, and it is the primary source of the industrial enzymatic cocktails used in second-generation (2G) ethanol production. However, although various studies and advances have been conducted to understand the cellulolytic system and the transcriptional regulation of T. reesei, the whole set of genes related to lignocellulose degradation has not been completely elucidated. In this study, we inferred a weighted gene co-expression network analysis based on the transcriptome dataset of the T. reesei RUT-C30 strain aiming to identify new target genes involved in sugarcane bagasse breakdown. In total, ∼70% of all the differentially expressed genes were found in 28 highly connected gene modules. Several cellulases, sugar transporters, and hypothetical proteins coding genes upregulated in bagasse were grouped into the same modules. Among them, a single module contained the most representative core of cellulolytic enzymes (cellobiohydrolase, endoglucanase, β-glucosidase, and lytic polysaccharide monooxygenase). In addition, functional analysis using Gene Ontology (GO) revealed various classes of hydrolytic activity, cellulase activity, carbohydrate binding and cation:sugar symporter activity enriched in these modules. Several modules also showed GO enrichment for transcription factor activity, indicating the presence of transcriptional regulators along with the genes involved in cellulose breakdown and sugar transport as well as other genes encoding proteins with unknown functions. Highly connected genes (hubs) were also identified within each module, such as predicted transcription factors and genes encoding hypothetical proteins. In addition, various hubs contained at least one DNA binding site for the master activator Xyr1 according to our in silico analysis. The prediction of Xyr1 binding sites and the co-expression with genes encoding carbohydrate active enzymes and sugar transporters suggest a putative role of these hubs in bagasse cell wall deconstruction. Our results demonstrate a vast range of new promising targets that merit additional studies to improve the cellulolytic potential of T. reesei strains and to decrease the production costs of 2G ethanol.

Keywords: Trichoderma reesei, sugarcane bagasse, 2G ethanol, enzymatic cocktail, gene co-expression network, Xyr1-binding site

### INTRODUCTION

Over the previous years, the development of new sustainable alternatives to fossil fuels has become critical to mitigate greenhouse gas emissions and to avoid the exhaustion of natural sources. Among the biofuels, second generation (2G) ethanol has emerged as one of the most promising substitutes for gasoline since it can be produced from lignocellulosic feedstocks, including agroindustrial residues and municipal waste (Lynd et al., 2017).

The United States and Brazil are the world's largest 1G ethanol producers and are responsible for 57 and 27% of its global production, respectively (Renewable Fuels Association, 2017). In Brazil, the 1G ethanol production system is well established and is based on sugarcane milling and sugar-rich juice fermentation with generation of bagasse and straw as byproducts. Currently bagasse is used to produce steam and energy in the sugar-ethanol biorefineries, and the straw is left on the soil to prevent erosion and enhance the organic carbon content (da Rosa, 2013; Pereira et al., 2015; Lisboa et al., 2017). These residues have already been used as feedstocks for 2G ethanol in Brazilian industrial plants (GranBio and Raízen), and they have a high potential to be explored more thoroughly in an integrated process with 1G ethanol technology (Junqueira et al., 2017; Lynd et al., 2017).

2G ethanol technology primarily consists of three major steps: pretreatment, enzymatic hydrolysis and sugar fermentation, with pretreatment and enzymatic hydrolysis being the major limitations for the economic feasibility of 2G ethanol. In this context, a cost-effective process is required to make this biofuel more attractive and competitive (Bornscheuer et al., 2014; Gupta and Verma, 2015).

Trichoderma reesei is the primary fungal source of the industrial cellulases present in enzymatic cocktails for lignocellulose degradation and the production of 2G ethanol. Due to its remarkable capacity to produce and secrete enzymes that are active on carbohydrates, T. reesei has been used as a microbial factory for the breakdown of lignocellulose and as a host for heterologous expression (Schmoll et al., 2016). These enzymes and other accessory proteins are collectively termed Carbohydrate-Active enZymes (CAZymes) (Lombard et al., 2014), and their genes are under the control of several transcriptional regulators, which are activated or repressed according to sugar (xyr1 and cre1, for example) and nitrogen (areA) availability, pH alteration (pacC), light (env1), and other factors (Stricker et al., 2006; Rassinger et al., 2018; Schmoll, 2018a). In this context, some of these regulators have been the targets of genetic manipulation studies involving their deletion or overexpression in order to enhance the cellulolytic phenotype of T. reesei (Meng et al., 2018; Rassinger et al., 2018; Zhang et al., 2018).

In the last few years, several advances have been achieved in different research fields and contributed to a better understanding of the physiology and key features behind the T. reesei hypercellulolytic capacity. Such advances include the development of new tools for genetic manipulation (Derntl et al., 2015; Liu et al., 2015), transcriptomic and proteomic studies using lignocellulosic residues (Dos Santos Castro et al., 2014; Borin et al., 2015, 2017; Daly et al., 2017; Ellilä et al., 2017; Cologna et al., 2018), the discovery of new transcription factors (TFs) and regulatory elements (Derntl et al., 2016, 2017; Stappler et al., 2017; Zheng et al., 2017b; Benocci et al., 2018), promoter characterization (Zheng et al., 2017a; Kiesenhofer et al., 2018) and structural studies of cellulases (Li et al., 2015; Bodenheimer and Meilleur, 2016; Eibinger et al., 2016; Ma et al., 2017; Borisova et al., 2018).

However, despite all of the studies conducted and the knowledge acquired, the biotechnological potential of T. reesei has not been completely explored, since various genes identified by different "omic" approaches have still not been characterized. These genes represent interesting new targets because several were found to be either activated or repressed in culture media containing complex carbon sources (Häkkinen et al., 2012; Borin et al., 2017; Daly et al., 2017; Horta et al., 2018). In this context, gene co-expression network analysis has become a valuable bioinformatic toolkit for data integration and the identification of new candidate genes related to a biological process of interest (Gonzalez-Valbuena and Treviño, 2017). Only a few studies have inferred gene networks from the transcriptome of T. reesei.

T. reesei RUT-C30 is a well-known industrial strain able to abundantly produce and secrete cellulases, and it has been used as a genetic background to develop other industrial strains. RUT-C30 was isolated following three rounds of mutagenesis and screening from the ancestral QM6a strain, and its hypercellulolytic phenotype is attributed in part to the truncation of cre1, the main player in carbon catabolite repression. Since then, the RUT-C30 strain has been the target of numerous studies on the conversion of biomass into biofuels and other high-value products (Marx et al., 2013; Mello-De-Sousa et al., 2014; Druzhinina and Kubicek, 2017).

Previously, Borin et al. (2017) investigated the transcriptome of T. reesei RUT-C30 grown on steam-exploded sugarcane bagasse (referred to as "bagasse" from this point on) in a time course of 6, 12, and 24 h as well as on fructose after 24 h. Interestingly, a set of cellulase, hemicellulase, TF, and sugar transporter coding genes were activated in bagasse along with genes encoding hypothetical or uncharacterized proteins. Based on the assumption that co-expressed genes tend to share similar expression patterns and that they could be co-regulated by the same elements, such as the carbon source and pH (van Dam et al., 2017), this study attempts to identify new genes related to the T. reesei lignocellulose degradation response using a Weighted Correlation Network Analysis (WGCNA) (Langfelder and Horvath, 2008). WGCNA estimates the co-expression similarity between the genes and constructs a weighted correlation matrix following a scale-free topology. In scale-free networks, a few nodes (genes) have a high degree (links), while most nodes have a small number of interactions (edges). These highly connected genes (hubs) play a central role in the network stability against perturbations, and they are very important in diverse cellular processes (Han et al., 2004; Luscombe et al., 2004).

In addition, an in silico prediction of the DNA binding sites for Xyr1, the master activator of cellulases (Stricker et al., 2006), was performed on gene promoters to find genes of unknown function that could be regulated by this TF. To our knowledge, this is the first report of the use of a network approach combined with regulatory motif analyses to reveal new genes of biotechnological interest in T. reesei RUT-C30. In this study, an extensive number of genes were found to be co-expressed in bagasse, and they could be the targets of new studies to evaluate their role in lignocellulose degradation.

### MATERIALS AND METHODS

#### Fungal Strain and Culture Conditions

Gene co-expression network analysis was performed based on an RNA-Seq dataset from a previous study conducted by our research group (Borin et al., 2017). Briefly, T. reesei RUT-C30 spores were first cultivated on potato dextrose agar for 7–10 days at 29°C and harvested in sterile distilled water. The spore suspensions were inoculated to a final concentration of 1 × 10<sup>6</sup> spores per 30 mL of basic culture medium (BCM) (pH 5.5) composed of 0.05% yeast extract (w/v), 50 mL/L salt solution (6 g/L NaNO3, 1.5 g/L KH2PO4, 0.5 g/L KCl, and 0.5 g/L MgSO4), and 200 µL/L trace elements (10 g/L ethylenediaminetetraacetic acid, 4.4 g/L ZnSO4·7H2O, 1.0 g/L MnCl2·4H2O, 0.32 g/L CoCl2·6H2O, 0.315 g/L CuSO4·5H2O, 0.22 g/L (NH4)6Mo7O24·4H2O, 1.47 g/L CaCl2·2H2O, and 1 g/L FeSO4·7H2O), with 1% fructose (w/v) as the carbon source, and grown at 29°C and 200 rpm for 48 h. The pre-grown mycelia were transferred to fresh BCM (without yeast extract) with 0.5% bagasse (w/v) as the carbon source for 6, 12, and 24 h, and to 1% fructose (w/v) for 24 h. Fructose was used as the control condition in the RNA-Seq experiment because it is an inert sugar, which neither induces nor suppresses the overall expression of lignocellulolytic enzymes (Amore et al., 2013). The cultures were grown under continuous light exposure, as light positively influences cellulase gene expression (Schmoll, 2018b). Mycelia were harvested by filtration, washed with sterile water and immediately ground into powder in liquid nitrogen. The frozen material was then used for RNA extraction.

### Gene Co-expression Network Analysis

The reads obtained by Borin et al. (2017) (BioProject accession PRJNA350272) were size-filtered (minimum of 40 bp) and selected by quality (Q > 20) using AlienTrimmer software (Criscuolo and Brisse, 2013). The filtered reads were mapped to the T. reesei RUT-C30 v1.0 genome available in the JGI Genome Portal (9,852 gene models) (Le Crom et al., 2009; Nordberg et al., 2014) using TopHat2 (Kim et al., 2013). The mapped reads were counted with the featureCounts function from the Rsubread v1.12.6 package (Liao et al., 2014). Low-abundance genes were filtered out, keeping only genes with a cpm ≥ 1 in at least three samples. RPKM values were calculated following TMM normalization using the edgeR package v3.12.1 (Bioconductor) (Robinson et al., 2010) within the R environment (R Core Team, 2015). Only differentially expressed genes (DEGs) passing a twofold change cutoff (steam-exploded bagasse [SEB] at 6, 12 or 24 h vs. fructose at 24 h) were considered, i.e., log2-fold change ≥ 1 (upregulated) or ≤ −1 (downregulated). Gene identification was based on the T. reesei RUT-C30 annotation retrieved from the JGI database.
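The two selection rules described above (an abundance filter of cpm ≥ 1 in at least three samples, and a log2-fold change cutoff of ±1) can be sketched as follows. The authors' pipeline used the R packages Rsubread and edgeR; this plain-Python sketch is only an illustration of the thresholds, it does not reproduce TMM normalization or the statistical testing, and all gene names, values, and function names are hypothetical.

```python
import math

# Illustration of the two thresholds described in the text. This is NOT the
# authors' edgeR workflow; gene names and numbers below are made-up toy data.

def passes_abundance_filter(cpm_values, min_cpm=1.0, min_samples=3):
    """Keep a gene if its CPM is >= min_cpm in at least min_samples samples."""
    return sum(value >= min_cpm for value in cpm_values) >= min_samples

def classify_log2fc(log2fc, cutoff=1.0):
    """Label a gene by its log2-fold change (bagasse vs. fructose)."""
    if log2fc >= cutoff:
        return "up"
    if log2fc <= -cutoff:
        return "down"
    return "unchanged"

# Hypothetical CPM values across six samples.
toy_cpm = {
    "geneA": [0.2, 0.0, 1.5, 0.4, 0.9, 0.1],        # >= 1 CPM in one sample only
    "geneB": [5.0, 7.1, 6.3, 0.8, 1.2, 0.5],        # >= 1 CPM in four samples
    "geneC": [12.0, 11.0, 13.5, 12.2, 11.8, 12.9],  # expressed everywhere
}
kept = [gene for gene, values in toy_cpm.items() if passes_abundance_filter(values)]

# Hypothetical mean expression values: (bagasse, fructose).
toy_expression = {"geneB": (24.0, 3.0), "geneC": (10.0, 11.0)}
calls = {
    gene: classify_log2fc(math.log2(bagasse / fructose))
    for gene, (bagasse, fructose) in toy_expression.items()
}

print("kept after abundance filter:", kept)
print("fold-change calls:", calls)
```

A log2-fold change of exactly 1 corresponds to a twofold increase, so the ±1 cutoff is the same criterion as the twofold change stated in the text.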

Gene co-expression network analysis was then performed using the RPKM values and the WGCNA package v1.51 (Langfelder and Horvath, 2008). Briefly, a softpower β was chosen using the function pickSoftThreshold to fit the signed network to a scale-free topology. Next, an adjacency matrix was generated as follows: adj = (0.5 × (1 + cor))<sup>β</sup>, where adj, cor and β are adjacency, pairwise Pearson correlation and softpower value, respectively. The Topological Overlap Matrix (TOM) was used as an input in the function hclust ("average" method) to construct a hierarchical clustering tree (dendrogram). TOM is a measure that quantifies the topological similarity between the genes within a network, i.e., it evaluates whether two or more nodes share links within the network and groups them into the same module (Ravasz et al., 2002; Langfelder, 2013). A threshold of 0.15 (correlation > 85%) was chosen to merge similar modules, and only modules having at least 30 genes were kept. Network visualization and analyses for the highly connected genes (absolute Pearson correlation > 0.8, adj = 0.064) were carried out in Cytoscape v3.3.0. The entire R script used in the WGCNA analysis is available in **Data Sheet 1**.
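The signed adjacency transformation can be checked numerically. The sketch below is plain Python rather than the WGCNA R code referenced in Data Sheet 1, and the helper names are hypothetical. It shows that a Pearson correlation of 0.8 with a softpower of 26 gives an adjacency of about 0.0646, in line with the adjacency cutoff reported for the Cytoscape visualization.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    r = cov / math.sqrt(var_x * var_y)
    # Clamp to [-1, 1] to absorb floating-point rounding.
    return max(-1.0, min(1.0, r))


def signed_adjacency(cor, beta):
    """Signed-network adjacency: (0.5 * (1 + cor)) ** beta."""
    if not -1.0 <= cor <= 1.0:
        raise ValueError("correlation must lie in [-1, 1]")
    return (0.5 * (1.0 + cor)) ** beta


# Correlation 0.8 with softpower 26 gives 0.9 ** 26, roughly 0.0646.
adj_cutoff = signed_adjacency(0.8, 26)
```

Because the signed transformation maps a correlation of −1 to 0 and +1 to 1, strongly anti-correlated genes are treated as unconnected rather than as strongly connected.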

The MCODE (v1.4.2) plugin (Bader and Hogue, 2003) of Cytoscape was used to identify subclusters of genes densely connected within each module with the parameters set to default (degree cutoff = 2, node score cutoff = 0.2, K-core = 2, maximum depth = 100, haircut method). Genes with high connectivity within the modules, i.e., nodes with degree values higher than 90% of the entire degree distribution, were considered to be hubs (Liang et al., 2012; Bi et al., 2015). The individual betweenness centrality of the nodes generated by Cytoscape was also compared to the corresponding degree value in order to confirm the centrality of the hubs. Hub genes tend to demonstrate a positive correlation between these two parameters (Potapov et al., 2005; Lee, 2006; Li et al., 2014). Biological process and molecular function annotation from the Gene Ontology (GO) consortium was obtained from the JGI database (https://genome.jgi.doe.gov/TrireRUTC30_1/TrireRUTC30_1.home.html) and used for the enrichment analysis using Cytoscape's plugin BiNGO v3.0.3 (Maere et al., 2005) configured to perform a hypergeometric test and adjust p-values for multiple testing using the Benjamini & Hochberg false discovery rate (FDR) method (p ≤ 0.05). To completely annotate the function of the T. reesei proteins, KEGG and KOG annotations were retrieved from the JGI database. In addition, manually curated annotation of the QM6a strain based on the trichoCODE pipeline (Druzhinina et al., 2016) was also transferred to the RUT-C30 strain using clusters of 1:1 orthologs generated by applying the OrthoMCL pipeline (Li et al., 2003).
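The degree-based hub criterion can be illustrated with a short sketch. It assumes an undirected edge list and a nearest-rank definition of the 90th percentile, and the function names are hypothetical. The study derived degree and betweenness centrality in Cytoscape, so this only shows the selection rule, not the original workflow.

```python
from collections import Counter


def node_degrees(edges):
    """Count the number of edges touching each node of an undirected network."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return degree


def select_hubs(edges, top_percent=10):
    """Return nodes whose degree exceeds the (100 - top_percent)th percentile.

    The percentile uses the nearest-rank method (an assumption made here).
    """
    degree = node_degrees(edges)
    if not degree:
        return set()
    ranked = sorted(degree.values())
    # Integer ceiling of n * (100 - top_percent) / 100, as a 1-based rank.
    rank = (len(ranked) * (100 - top_percent) + 99) // 100
    cutoff = ranked[max(rank, 1) - 1]
    return {node for node, d in degree.items() if d > cutoff}
```

Because the comparison is strict, nodes tied with the percentile cutoff are not reported as hubs.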

### Prediction of the Xyr1-Binding Sites

The promoter region of the T. reesei RUT-C30 genes was searched for Xyr1-binding sites (XBS) according to a modified pipeline developed by Silva-Rocha et al. (2014). The 1.5 kb sequences immediately upstream from the start codon ATG of 22 cellulase genes regulated directly by Xyr1 (Castro et al., 2014) were retrieved from the JGI database according to an in-house Biopython script available online (https://github.com/SantosRAC/UNICAMP_RACSMaster/tree/master/GFFTools).
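Strand-aware retrieval of the upstream region can be sketched as below. This is not the in-house Biopython script cited above. It is a plain-Python illustration with hypothetical names, and it assumes 0-based forward-strand coordinates in which `atg_index` is the position of the base that corresponds to the A of the start codon.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_region(scaffold, atg_index, strand, length=1500):
    """Return up to `length` bases immediately upstream of the start codon.

    The result is always written 5'->3' on the gene's own strand, so it ends
    right before the ATG. Regions are truncated at scaffold ends.
    """
    if strand == "+":
        start = max(0, atg_index - length)
        return scaffold[start:atg_index]
    if strand == "-":
        begin = atg_index + 1
        return reverse_complement(scaffold[begin:begin + length])
    raise ValueError("strand must be '+' or '-'")
```

Returning every promoter in the gene's own orientation means that downstream motif positions can be expressed as distances from the ATG regardless of strand.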

For motif discovery, these sequences were used as input in the MEME program (Bailey et al., 2009) using the following parameters: (i) zero or one occurrence per sequence at the forward and reverse strand, (ii) a minimum of 10 motifs, and (iii) a minimum and maximum motif width of 6 and 10, respectively. The frequency matrix of the motif most similar to the Xyr1 consensus sequence 5′-GGC(A/T)<sub>3</sub>-3′ (Rauscher et al., 2006) was chosen to be sought in the promoter of the network genes using the matrix-scan tool from the RSAT server (p-value ≤ 1.00E-04) (http://rsat-tagc.univ-mrs.fr/rsat/matrix-scan_form.cgi). KOG annotation was used to identify the functional groups of genes that have the predicted XBS, and a hypergeometric test was applied to identify the significantly enriched KOG groups (p-value ≤ 1.00E-03) using the KOG annotation of all the genes as background. P-values were adjusted for multiple testing using the Benjamini & Hochberg false discovery rate (FDR) method (p ≤ 0.05).
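The enrichment statistics named above can be illustrated with a small standard-library sketch. It assumes a one-sided (over-representation) hypergeometric test and the standard step-up Benjamini & Hochberg adjustment. The function names are hypothetical, and the study itself used BiNGO and R rather than this code.

```python
from math import comb


def hypergeom_enrichment_p(k, population, successes, draws):
    """P(X >= k) for X ~ Hypergeometric(population, successes, draws).

    k: genes in the tested set that carry the annotation
    population: all annotated background genes
    successes: background genes carrying the annotation
    draws: size of the tested gene set
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(max(k, 0), upper + 1)
    )
    return tail / total


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A functional class would then be reported as enriched when its adjusted p-value falls at or below the chosen FDR threshold.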

### RESULTS

### Construction of a Weighted Gene Co-expression Network

Recently, the transcriptome of T. reesei RUT-C30 grown on bagasse after a time course of 6, 12 and 24 h as well as on fructose after 24 h was investigated (Borin et al., 2017). Several DEGs were identified, including CAZymes, genes encoding sugar transporters and uncharacterized TFs and proteins. Using this dataset, a gene co-expression network was inferred to identify new targets related to the cell wall degradation of bagasse.

From 9,852 genes, 8,402 were kept for further analyses after gene expression filtering and normalization. The data were imported into the WGCNA package, and a softpower β of 26 (R² = 0.85) was chosen so that the network approximated a scale-free topology while taking into account the connectivity between the genes based on their expression (**Data Sheet 2A**). Genes having similar expression patterns were grouped into modules, and highly connected modules were merged (**Data Sheet 2B**).

In total, 28 different modules were formed with the genes highly co-expressed (Pearson correlation > 0.8). The DEGs identified and annotated by Borin et al. (2017) were sought within each module generated using the WGCNA package, and ∼70% of all of the up and downregulated genes were identified in the T. reesei network (**Table 1**, **Table S1**). Genes encoding CAZymes, TFs, sugar transporters and uncharacterized proteins with secretion signal peptides were gathered in a few modules. Most of the upregulated genes of these functional classes were found in four modules: coral1, darkorange, black, and darkred (**Figure 1A**, **Table S1**), while most of the downregulated genes were in additional four modules: brown, darkolivegreen4, grey60, and lightcyan1 (**Figure 1B**, **Table S1**). The first four modules will be designated the "up set" and the latter four the "down set." These different classes of DEGs are of major importance to the identification of new targets related to bagasse deconstruction and other processes associated, as they cover enzymes for the lignocellulose breakdown, transcriptional regulators of the gene expression, transport of inducer-acting molecules and other unknown players. Taken together, these modules represented more than half of all the DEGs in the T. reesei transcriptome. For this reason, further analyses focused only on them.

### Identification of Subclusters and GO Enrichment

Subclusters of co-expressed genes were classified using the MCODE algorithm to find protein coding genes acting together during the process of lignocellulose deconstruction from bagasse. Each of the modules previously identified was partitioned, and a few subclusters were formed for each module from the up and down sets (**Table S2**). Next, a GO enrichment was conducted with the genes within each subcluster. Only subclusters having at least 10 genes were considered for GO enrichment (**Table S3**).

Overall, subclusters of the coral1, black, darkorange, and darkred modules presented 26 enriched GO terms, including cellulose binding (GO:0030248), hydrolase activity (GO:0004553, 0016798), transferase activity (GO:0016758, 0016740), carbohydrate transport (GO:0008643), regulation of transcription (GO:0006355), and transcription factor activity (GO:0003700) (**Table S3**). As already described, these modules had the largest number of upregulated genes of the whole T. reesei network and gathered a significant share of the CAZyme and predicted sugar transporter genes (**Figure 1A**). Therefore, it is not surprising that these GO terms were enriched in these modules. This result shows that our analyses were directed toward the identification of genes of unknown function that were co-expressed with known players in lignocellulose degradation and/or sugar transport.

In contrast, the down set showed only 15 enriched GO terms, such as threonine-type endopeptidase activity (GO:0004298) (module brown), carbohydrate biosynthetic process (GO:0016051) (darkolivegreen4), DNA conformation change (GO:0071103) and electron carrier activity (GO:0009055) (grey60), and anion binding (GO:0043168) (lightcyan1) (**Table S3**). The difference between the GO terms enriched in the up and down sets could be explained by the carbon sources used in the culture media. Bagasse is a complex recalcitrant structure that must be broken down by the combined action of CAZymes. The sugars released in this process are transported to the intracellular environment, where they are utilized in carbohydrate catabolism to produce energy. These sugars, including cellobiose and xylose, can also act as inducers of cellulase and hemicellulase expression (Mach-Aigner et al., 2010; Zhou et al., 2012; Zhang et al., 2013). In contrast, the monosaccharide fructose is a readily assimilable sugar that does not require cellulase enzymes to be utilized by the fungus. Therefore, it is evident that T. reesei adapts its metabolism according to the available carbon source.

Genes were up or downregulated in at least one time point of T. reesei grown on sugarcane bagasse. Nodes and edges represent genes and pairwise interactions, respectively. <sup>8</sup> The percentage was calculated based on the total number of up (1475) and downregulated (1500) genes expressed in the transcriptome of T. reesei (Borin et al., 2017).

#### Hub Genes Identification

Important genes for a biological process tend to be central in a gene co-expression network and share a high number of co-expressed neighbors having a lower degree (Villa-Vialaneix et al., 2013). In this study, hubs were defined as the genes at the top of the degree distribution and those that demonstrated a positive correlation between the degree and betweenness centrality (**Table S4**). In total, 321 and 294 hubs were identified in the up and down sets, respectively (**Table S5**). Among the 321 hubs, 129 (40%) were upregulated in bagasse in at least one time point (6, 12 or 24 h), and 191 (65%) out of 294 genes were downregulated in the down set. Various genes encoding proteins that have different functions were found to be hubs in all eight modules investigated, including sugar, ion, and amino acid transporters; CAZymes; TFs; proteins of chromatin remodeling; enzymes from carbohydrate, amino acid, lipid and nucleotide metabolism; chaperones and hypothetical proteins without annotated function (**Table S5**).

For simplification purposes, only a few hubs are shown in **Table 2**. In the up set, 3 CAZymes, 9 sugar transporters, 6 TFs and 5 genes encoding unknown proteins with a predicted secretion signal peptide were found to be hub genes. A few TFs and other proteins having a putative regulatory function in gene transcription were identified as hub genes, and most were only characterized in their corresponding fungal orthologs, including the PRO1 Zn2Cys6 transcriptional regulator (jgi|136533, module darkorange) (Masloff et al., 1999), the PRO41 protein (jgi|8730, module coral1) (Nowrousian et al., 2007) and the AMA1 activator (jgi|114362, module coral1) (Diamond et al., 2009) (**Table 2**). Interestingly, two genes encoding methyltransferases (jgi|79832, module darkred; jgi|72465, module coral1) and one gene encoding a GCN5-acetyltransferase (jgi|133861, module coral1) were also identified as hub genes. Among the hub genes encoding proteins of unknown function with signal peptides, the genes jgi|124417 and jgi|128655 were co-expressed in the modules coral1 and darkorange, respectively, and were upregulated in bagasse. The trichoCODE annotation indicated that their orthologous genes in T. reesei QM6a encode SSCRPs, and therefore they could be acting in the extracellular environment (**Table 2**, **Table S5**). However, whether they are pivotal to the T. reesei response when grown on bagasse still needs to be evaluated experimentally.

Similar to the up set, various hub genes encoding different proteins were found in the down set. In total, 7 CAZymes, 5 sugar transporters, 8 TFs and 4 proteins of unknown functions that have predicted secretion signal peptide were found to be downregulated in bagasse. In addition to the identification of the hub genes that have a putative regulatory role in the gene expression, such as TFs, chromatin remodeling and signal transduction proteins, several genes encoding unknown proteins significantly downregulated were also found, especially in the brown module (**Table 2**, **Table S5**).

### Prediction of the Xyr1-Binding Sites in the Promoter Region

Previously, Castro et al. (2014) showed that 22 genes encoding cellulases and hemicellulases of the T. reesei QM9414 strain are regulated directly by the master activator Xyr1 (jgi|98788). This TF is essential for the induction of cellulase and hemicellulase genes, and it has been the target of various studies aiming to improve the hypercellulolytic phenotype of T. reesei using genetic manipulation (Wang et al., 2013; Lv et al., 2015; Zhang et al., 2017).

To predict the presence of a specific DNA binding motif for the RUT-C30 strain, the promoter regions (1.5 kb immediately upstream of the start codon ATG) of the orthologous genes between the RUT-C30 and QM9414 strains were used to seek motifs that were similar to the Xyr1 consensus sequence 5′-GGC(A/T)<sub>3</sub>-3′ (Rauscher et al., 2006) and the motif used in the study of Silva-Rocha et al. (2014). As a result, only one putative XBS of 10 nucleotides resembling the Xyr1 motifs was chosen for further analyses (**Figure 2**). The frequency matrix of the motif chosen was then retrieved from the MEME server and used as a model in the identification of XBS predicted within the promoter of the genes from the up and down sets. The complete list of the motifs predicted in the promoter of the 22 CAZymes and the frequency matrix is available in **Table S6**.
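For orientation, a plain consensus search for the Xyr1 motif can be sketched as below. This is a deliberate simplification of the matrix-based scan used in the study. It assumes that the consensus 5′-GGC(A/T)3-3′ reads as GGC followed by three A-or-T bases (GGCWWW in IUPAC notation), that the input promoter ends immediately before the ATG, and that positions are reported as the negative distance of the leftmost matched base from the start codon. The function name is hypothetical.

```python
import re

# Lookaheads allow overlapping matches to be reported.
_FORWARD = re.compile(r"(?=(GGC[AT]{3}))")
# Reverse complement of GGCWWW as seen on the forward strand.
_REVERSE = re.compile(r"(?=([AT]{3}GCC))")


def scan_consensus_xbs(promoter):
    """Return sorted (position, strand, forward-strand sequence) tuples.

    `position` is relative to the start codon: a site that begins 18 bases
    upstream of the ATG is reported at -18.
    """
    promoter = promoter.upper()
    length = len(promoter)
    hits = []
    for match in _FORWARD.finditer(promoter):
        hits.append((match.start() - length, "+", match.group(1)))
    for match in _REVERSE.finditer(promoter):
        hits.append((match.start() - length, "-", match.group(1)))
    return sorted(hits)
```

A position frequency matrix, as used with matrix-scan, tolerates degenerate positions that an exact consensus search misses, so the two approaches are not expected to return identical site lists.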

To validate the motifs found in silico, they were compared with the motifs already characterized in the promoters of the CAZyme genes (cbh1 and xyn1) of other T. reesei strains. Cellobiohydrolase 1 (Cbh1/Cel7a) is one of the most produced and secreted enzymes from T. reesei (Kiesenhofer et al., 2018), and endo-β-1,4-xylanase 1 (Xyn1) is a hemicellulase with an important role in xylan deconstruction (Liu et al., 2017). The comparison of the motifs showed that our pipeline could predict XBS that had been previously described and characterized (**Figures S3**, **S4**) (Rauscher et al., 2006; Furukawa et al., 2009; Ries et al., 2014; Kiesenhofer et al., 2018), similarly to the prediction of Silva-Rocha et al. (2014).

Next, the predicted XBS found in the promoter of the genes from the up and down sets were investigated (**Table S7**). In total, 245 upregulated and 210 downregulated genes had at least one XBS predicted in the promoter region. This represented 24 and 22% of the entire set of DEGs present in the up and down sets, respectively. Considering only the number of DEGs in each module, coral1 (45.5%) and grey60 (40.4%) were the modules having the largest percentage of genes with predicted XBS (**Table S1**).

KOG functional annotation of the genes having predicted XBS from up and down sets was also investigated (**Figure 3**). The hypergeometric test and multiple testing correction showed that the KOG classes metabolism, carbohydrate transport and metabolism, and information storage and processing were statistically significant in the up set, while secondary metabolites biosynthesis, transport and catabolism was the only class enriched in the down set (**Figure 3**). This difference stresses the diversity of genes encoding proteins with several functions that could be modulated by the activator Xyr1, in addition to other elements.

TABLE 2 | Hub genes found in the up and down sets that were differently expressed in sugarcane bagasse (Borin et al., 2017).

<sup>1</sup>Trichoderma reesei RUT C30 v1.0 database from JGI was used to recover the T. reesei proteins ID; <sup>2</sup>QM6a ortholog genes; <sup>3</sup>Functional annotation according to KEGG, KOG and Druzhinina et al.'s work (2016); <sup>4</sup>Log<sub>2</sub> fold change (FC), B: sugarcane bagasse. Red and blue colors indicate gene expression of genes up and downregulated in sugarcane bagasse, respectively.

Analyzing the hub genes of the up set, 30 genes (23.3%) demonstrated at least one XBS in their promoter, including one Zn2Cys6 transcriptional regulator, transporters from the Major Facilitator Superfamily (MFS) and one putative calcium transporter, one Ras GTPase, one putative SWI-SNF chromatin-remodeling complex protein, one polyketide synthase (PKS) and various hypothetical proteins with unknown function (**Table S5**). In the down set, 33 (17.3%) hub genes had XBS predicted in this study, including two CAZymes (chitin synthase and glucan endo-1,3-β-glucosidase), one ABC transporter family protein, one putative nonribosomal peptide synthase (NRPS), one putative SWI-SNF chromatin-remodeling complex protein and several uncharacterized proteins (**Table S5**).

#### Xyr1 and Co-expressed Genes in the Coral1 Module

As already described, Xyr1 is critical for the activation of cellulases and hemicellulases, and it was expressed more highly in bagasse than in fructose during the entire time course (Log2FC: 6 h: 0.48; 12 h: 0.99; 24 h: 1.65). Along with the other 385 upregulated genes, xyr1 was grouped into the coral1 module, which had the largest number of CAZyme genes upregulated in bagasse (**Figure 1A**). Thus, it is worth investigating the genes that were co-expressed with this activator, since they could be involved in the fungal response to the lignocellulosic biomass.

From the pairwise edges of the coral1 module, 858 neighbor nodes to Xyr1 were retrieved. A total of 338 out of 858 genes were upregulated in bagasse, and only 115 had at least one XBS in the promoter. Among these genes, 35 CAZymes were found to be co-expressed with xyr1, including cellobiohydrolases (cbh1 and cbh2), endoglucanases (egl1, egl2, egl3, and egl5), β-glucosidases (bgl1 and bgl2), endo-β-1,4-xylanases (xyn2, xyn3, xyn4, and xyn5), and monooxygenases from the AA9 family (cel61a and cel61b) (**Figure 4A**, **Table S8**).

Interestingly, swo1 (jgi|104220) and cip1 (jgi|121449) were found as neighbor nodes to xyr1, and they were under strong activation in bagasse during the entire time course (6, 12, and 24 h) (**Table S8**). Swo1 acts synergistically with xylanases to remove the hemicellulosic fraction of the lignocellulose (Gourlay et al., 2013), and Cip1 appears to be important to the hydrolysis of the lignocellulosic biomass (Lehmann et al., 2016). Both proteins have a carbohydrate-binding domain that belongs to family 1 (CBM1), and one XBS was predicted in each gene promoter. These observations support a role for both genes in bagasse deconstruction and are consistent with their transcriptional regulation by Xyr1, as reported previously (Reithner et al., 2014; Ma et al., 2016). In addition, 14 putative sugar transporters, three Zn2Cys6 transcriptional regulators, one putative GCN5-acetyltransferase, one methyltransferase and several other genes were also co-expressed with xyr1 and had XBS in their promoters (**Figure 4A**, **Table S8**).

Finally, hub genes that were neighbor nodes to the master activator Xyr1 (represented in the coral1 module) and that had XBS in their promoter were also identified. The list of hubs included 14 genes encoding one MFS transporter, one Zn2Cys6 transcriptional regulator, one protein with an ankyrin repeat, and several other genes still not characterized (**Figure 4B**, **Table S8**).
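The successive filters applied to the xyr1 neighborhood (neighbor nodes, then upregulated genes, then genes with a predicted XBS) amount to set intersections over the module's edge list. The sketch below uses hypothetical function and gene names to show that logic. It is not the Cytoscape workflow used in the study.

```python
def neighbors(edges, node):
    """Return all nodes sharing an edge with `node` in an undirected network."""
    found = set()
    for a, b in edges:
        if a == node:
            found.add(b)
        elif b == node:
            found.add(a)
    found.discard(node)
    return found


def candidate_targets(edges, regulator, upregulated, with_xbs):
    """Neighbors of the regulator that are upregulated and carry an XBS."""
    return neighbors(edges, regulator) & set(upregulated) & set(with_xbs)


# Toy network with placeholder gene names.
toy_edges = [
    ("xyr1", "cbh1"),
    ("xyr1", "swo1"),
    ("cbh1", "swo1"),
    ("geneX", "xyr1"),
    ("geneY", "geneZ"),
]
targets = candidate_targets(
    toy_edges, "xyr1", {"cbh1", "swo1", "geneY"}, {"swo1", "geneX"}
)
```

Intersecting the result with a set of hub genes would give the final subset of hubs that neighbor the regulator and carry a predicted binding site.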

### DISCUSSION

The construction of a gene co-expression network allows us to identify clusters of genes that have a similar expression pattern and to assess the biological information that is relevant to a specific phenotype. To date, only a few studies have investigated gene co-expression networks in T. reesei. Dos Santos Castro et al. (2014), for instance, performed RNA-Seq of the T. reesei QM9414 strain grown on sophorose, cellulose and glucose, and their analysis revealed specific differentially expressed genes in sophorose and cellulose. More recently, Horta et al. (2018) examined the transcriptome and exoproteome of T. reesei, T. harzianum, and T. atroviride grown on cellulose and glucose. Based on their co-expression network, they found a set of 80 genes shared between the three Trichoderma species that could represent a common cellulose degradation system. However, no research has been reported exploring the gene co-expression network of T. reesei RUT-C30 grown on sugarcane bagasse. In this study, the recently published transcriptome of T. reesei RUT-C30 grown on bagasse after 6, 12 and 24 h was used to determine the modules of co-expressed genes. As a result, 28 modules of co-expressed genes were formed, and only eight were chosen for further analysis because they contained the largest number of up and downregulated genes encoding CAZymes, TFs, sugar transporters and other genes of unknown function that were co-expressed (**Figure 1**).

After the MCODE clustering, the GO enrichment showed that several terms related to lignocellulose breakdown, including polysaccharide (GO:0030247) and cellulose binding (GO:0030248), and hydrolase activities (GO:0004553 and GO:0016798), were enriched in subcluster 1 of the coral1 module (**Table S3**), where the master activator Xyr1 was also found. The presence of 50 upregulated CAZyme genes encoding cellulases, hemicellulases and oxidative enzymes in the coral1 module (**Table S1**), along with the co-expression of xyr1, allows us to hypothesize that soon after xyr1 is expressed in bagasse, its protein product regulates the transcription of the (hemi)cellulolytic genes. In addition, 36 out of the 50 CAZyme genes demonstrated at least one XBS in their promoter, which is consistent with previous studies (Castro et al., 2014; Silva-Rocha et al., 2014) and highlights the ability of our approach to identify the Xyr1 target genes. In addition to coral1, the darkorange module had the carbohydrate transport (GO:0008643) term enriched (**Table S3**), revealing that several genes encoding transporters were grouped in this module. Most of them belonged to the MFS family, whose members are thought to transport a vast array of small molecules, such as sugars, inorganic ions, siderophores and amino acids (Yan, 2015; Quistgaard et al., 2016). Some studies have also shown that MFS transporters can mediate the induction of cellulases in T. reesei (Zhang et al., 2013; Huang et al., 2015; Nogueira et al., 2018), underscoring the importance of these putative sugar transporters in the induction of CAZymes and lignocellulose degradation.

Distinct GO terms were enriched in the down set, such as threonine-type endopeptidase activity (GO:0004298) and carbohydrate biosynthetic process (GO:0016051) (**Table S3**). The latter was enriched in the darkolivegreen4, and interestingly, this GO term consisted of seven enzymes possibly involved in fungal cell wall biosynthesis, including three glycosyl transferases, one α-amylase, one rhamnose reductase, one pyruvate carboxylase, and one glucose 6-phosphate isomerase (GPI) (**Table S3**). Fungal cell wall is primarily composed of β-1,3-glucan, β-1,6 glucan, mixed β-1,3-/ β-1,4-glucan, α-1,3-glucan, chitin, and glycoproteins. They are organized in a complex backbone, and it is thought that glycosyl hydrolases and glycosyl transferases are fundamentally important in cell wall biosynthesis (Free, 2013). Genes encoding enzymes from the gluconeogenesis pathway, such as pyruvate carboxylase and GPI, could also provide hexose phosphates as building blocks for the cell wall biosynthesis (Ene et al., 2012). Considering that fructose is a simpler sugar than the bagasse lignocellulose, the fungus must be utilizing this readily assimilable carbon source to grow, and consequently it has to remodel its cell wall. Although most of those genes were downregulated and highly co-expressed, the engagement of these genes in this process remains elusive.

Hub genes were also sought out within the main modules, and a large number of new targets that were co-expressed were discovered (**Table 2**, **Table S5**). For example, the Pro1 coding gene (jgi|136533) was found as a hub node within the darkorange module, and it was upregulated in the three time points analyzed (6, 12, and 24 h). Pro1 is a member of the Zn2Cys6 transcription factors, and several orthologs had already been characterized in ascomycetes. In Neurospora crassa, the ortholog Adv-1 (NCU07392; identity: 66%) regulates cell-to-cell fusion and sexual development (Chinnici et al., 2014; Dekhang et al., 2017), and in Sordaria macrospora (CAB52588.2; identity: 66%), Pro1 has a pivotal role in various different processes, including the regulation of genes involved in cell wall integrity, the NADH oxidase pathway and pheromone signaling (Masloff et al., 2002; Steffens et al., 2016).

Steffens et al. (2016) demonstrated that, in addition to Adv-1, N. crassa requires the pH response transcription regulator PacC to activate its female development. Intriguingly, its ortholog gene pac1 (jgi|95791) in T. reesei RUT-C30 was also identified to be a hub node (**Table 2**, **Table S5**). pac1 was upregulated in bagasse and co-expressed with the pro1 gene in the same module. He et al. (2014) functionally characterized pac1 in T. reesei, and they observed increased cellulase transcription and production in the Δpac1 mutants at neutral pH. They also noted that the pac1 deletion impaired fungal growth and development. Taken together, the results point to possible crosstalk between Pro1, Pac1 and other genes involved in an intricate regulatory network.

As already described, two putative methyltransferase coding genes (jgi|79832, module darkred; jgi|72465, module coral1) and a GCN5-acetyltransferase coding gene (jgi|133861, module coral1) were also identified as hubs in the up set (**Table S5**). These classes of enzymes are thought to be responsible for DNA methylation and histone modification, respectively, and epigenetically regulate various important cellular processes, such as the silencing of transposable elements and the activation of gene transcription (Xin et al., 2013; Su et al., 2016; Lyko, 2017). Intriguingly, recent studies have suggested that methyltransferases are also implicated in lignin degradation in the basidiomycete Phanerochaete chrysosporium, broadening the functions of these enzymes (Korripally et al., 2015; Thanh Mai Pham and Kim, 2016; Kameshwar and Qin, 2017). In addition, five genes encoding unknown proteins that have a secretion signal peptide were considered to be hubs in the up set (**Table S5**). Among them, two (jgi|124417, 128655) had their QM6a orthologs annotated as small secreted cysteine-rich proteins (SSCRPs) using trichoCODE. In general, SSCRPs are related to interactions between the microorganisms and the environment, including biocontrol, the induction of plant resistance and adhesion (Shcherbakova et al., 2015; Qi et al., 2016). The presence of a predicted secretion signal peptide allows us to hypothesize that these proteins could be secreted to support the bagasse-fungal adhesion. Overall, the identification of these hub genes suggests that they could be acting as regulatory players or accessory proteins in the bagasse degradation response.

Other putative regulators were also identified as hub nodes in the up and down sets, and some of them were not characterized in T. reesei, only in its homologs (**Table S5**). For example, the gene encoding the Ama1 activator (jgi|114362) was found in the coral1 module having an XBS in the promoter, and its putative ortholog in Saccharomyces cerevisiae (YGR225W; identity: 33%) encodes a meiosis-specific activator related to spore morphogenesis (Okaz et al., 2012; Schmoll et al., 2016). One gene encoding the developmental regulatory protein WetA (jgi|9281) was found in the same module being upregulated during the entire time course. In Fusarium graminearum, the putative ortholog WetA (I1S0E2.2; identity: 43%) is necessary for conidiogenesis and conidial maturation, and the wetA mutants produced conidia that were more sensitive to oxidative and heat stress (Son et al., 2014). Intriguingly, Wu et al. (2017) suggested that the Aspergillus flavus ortholog WetA (EED47149.1; identity: 57%) could be a global player in the regulation of conidial development, acting during the fungal cell wall biogenesis and regulating the secondary metabolic pathways. In addition, the putative TF (jgi|98900) was one of the hub genes of the brown module, and its corresponding orthologs encode the heat shock transcription factor 1 (Hsf1) (Schmoll et al., 2016), the master stress response regulator in eukaryotes (Zheng et al., 2016) and the activator of virulence in the pathogen Candida albicans (Nicholls et al., 2011) (**Table 2**, **Table S5**). Another four and seven putative TF coding genes were also found to be hubs in the up and down set, respectively (**Table S5**). The search for their orthologous genes in other ascomycetes revealed that none had been characterized, and therefore, their function remains to be elucidated. 
The relevant role of the characterized ortholog hub genes identified in this study highlights the central position of these nodes in the up and down sets and supports the necessity of investigating their putative regulatory influence on the T. reesei response in bagasse.

Finally, other genes encoding proteins of different functions were also identified as hub nodes in the up and down sets (**Table S5**). In the up set, three CAZymes (jgi|97768, 75420, 128705), two SSCRPs (jgi|124417, 128655), the translocon-associated protein (TRAP) (jgi|103979) and the Sec61 beta subunit (jgi|23456) from the endoplasmic reticulum (ER) translocon, ion and amino acid transporters, and various unknown proteins were found. In the down set, other hubs were discovered, including seven CAZymes (jgi|24326, 104519, 103899, 124897, 97721, 139518, 104242), six proteases (jgi|90298, 135507, 75405, 82006, 138263, 87936), a SWI-SNF chromatin-remodeling complex protein (jgi|124027) and unknown proteins (**Table 2**, **Table S5**). This vast array of hub genes demonstrates that a great diversity of genes is central in the co-expressed modules. However, most of them are still not characterized, and efforts are required to elucidate their function in fungal physiology.

The prediction of XBS based on the promoter of cellulase and hemicellulase coding genes was used in this study to verify the genes possibly regulated by Xyr1 in the presence of a complex lignocellulosic biomass. Xyr1 is the key transcriptional regulator of the CAZymes involved in cell wall deconstruction, and it is regulated by the carbon catabolite repressor Cre1, which is truncated and confers a hypercellulolytic phenotype in the RUT-C30 strain (Ilmén et al., 1996; Silva-Rocha et al., 2014). To validate our pipeline of regulatory motif predictions, the XBS predicted in the cbh1 and xyn1 promoters were compared with those from other studies.

According to Ries et al. (2014), there are two XBS in the positions −733 and −320 in relation to the start codon ATG of the cbh1 gene (**Data Sheet S3**). The −733 motif (5′-TTTGCC-3′) was predicted by the pipeline of Silva-Rocha et al. (2014). However, it was not predicted in this study. Instead, we identified four XBS in the promoter region of cbh1 at positions −508, −748, −771, and −1376 that were not identified in the study by Silva-Rocha et al. (2014). This was likely due to the differences between the strains and pipelines. Using a labeled Xyr1<sup>55−195</sup> probe, Furukawa et al. (2009) found that two of these predicted sequences (positions −508 and −748) showed strong binding to the probe. In addition, Kiesenhofer et al. (2018) replaced the promoter of the glucose oxidase (goxA) reporter gene with the cbh1 promoter containing deletions in three different regions and observed a significant decrease in the GoxA activity when the mutant strains were grown on different carbon sources, including lactose, carboxymethylcellulose (CMC) and pretreated wheat straw. Two of these regions spanned the XBS predicted in this study at positions −748 and −771 (**Data Sheet S3**).

In addition to cbh1, the XBS predicted in the xyn1 promoter were also compared with the ones previously reported (Rauscher et al., 2006; Furukawa et al., 2009; Kiesenhofer et al., 2018). Two XBS were predicted at positions −417 and −621, the first of which has been identified and characterized in other studies (**Data Sheet S4**). For example, Rauscher et al. (2006) investigated a 217-bp region (−321 to −538) inside the xyn1 promoter and discovered that Xyr1 was able to bind to two sequences at positions −404 and −420. Point mutations in each of these two motifs caused a substantial decrease in the activity of the glucose oxidase reporter and showed that they are critical for the transcriptional activation of xyn1 in the presence of xylan, one of the primary sugar inducers of hemicellulases (Rauscher et al., 2006). Some years later, Furukawa et al. (2009) analyzed three XBS (−80, −420 and −887) in the xyn1 promoter and discovered that only the XBS at position −420 showed strong binding to the Xyr1 probe. Finally, Kiesenhofer et al. (2018) recently demonstrated that the deletion of the XBS at positions −404 and −420 abolished the GoxA activity under control of the xyn1 promoter even in the presence of 0.5 mM xylose (**Data Sheet S4**). In summary, the previous studies indicate that some of the XBS predicted in this study could be functional in T. reesei RUT-C30 and are therefore important for the transcriptional regulation of CAZyme genes and probably of additional genes. In addition to cbh1 and xyn1, the promoter of the cbh2 gene also had two XBS predicted in this study that were shared with its homologs in other T. reesei strains (data not shown).

Based on the assumption that Xyr1 is essential to the induction of the CAZymes and sugar transporters, it was not surprising that genes related to carbohydrate transport and metabolism, such as CAZymes and putative sugar transporters, were the most abundant upregulated genes that had XBS in the KOG analyses. In contrast, in the down set the XBS were enriched in genes of the KOG class secondary metabolites biosynthesis, transport, and catabolism (**Figure 3**), which suggests that Xyr1 participates in the regulation of secondary metabolism.

To verify the putative targets of Xyr1, the direct co-expressed neighbors of the xyr1 node were retrieved from module1, and we searched for the XBS in the gene promoters (**Table S8**). From a total of 115 Xyr1-partner genes having a predicted XBS, almost half encode CAZymes and putative sugar transporters. Three Zn2Cys6 transcription factors (jgi|139402, 77124, 141251) were also found, and one (jgi|141251) demonstrated 55 and 72% identity to the NirA ortholog in F. fujikuroi (KLO85312.1) and A. nidulans (AN0098), respectively (**Table S8**). NirA is a nitrate-specific transcription factor that modulates nitrogen metabolite repression (NMR), and this TF accumulates in the nucleus in the presence of nitrate or nitrite and a low concentration of assimilable nitrogen sources, such as ammonium. The nitrogen metabolism regulator AreA interacts physically with NirA, and the complex formed activates the genes for nitrate assimilation (Gallmetzer et al., 2015; Pfannmüller et al., 2017). The T. reesei ortholog areA (jgi|140814) was not differentially expressed in bagasse, and therefore it is curious to note the upregulation of the T. reesei ortholog nirA in this carbon source and its co-expression with xyr1.
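The retrieval of direct neighbors followed by an XBS filter can be pictured as a simple graph query. The sketch below is only an illustration under stated assumptions: the network is taken to be a list of undirected edges, the XBS predictions are a set of gene identifiers, and the function name `xbs_neighbors` and the toy identifiers (`geneA` to `geneD`) are hypothetical and not taken from the study's data or pipeline.

```python
def xbs_neighbors(edges, node, genes_with_xbs):
    """Return the direct neighbors of `node` that also carry a predicted XBS.

    edges          -- iterable of (gene_a, gene_b) pairs, treated as undirected
    node           -- identifier of the node of interest (e.g. "xyr1")
    genes_with_xbs -- collection of gene identifiers with a predicted XBS
    """
    neighbors = set()
    for gene_a, gene_b in edges:
        if gene_a == node:
            neighbors.add(gene_b)
        elif gene_b == node:
            neighbors.add(gene_a)
    # Keep only the neighbors whose promoter has a predicted XBS.
    return sorted(neighbors & set(genes_with_xbs))


# Toy network (hypothetical identifiers).
toy_edges = [
    ("xyr1", "geneA"),
    ("geneB", "xyr1"),
    ("geneA", "geneC"),  # geneC is not a direct neighbor of xyr1
    ("xyr1", "geneD"),
]
toy_xbs = {"geneA", "geneC", "geneD"}
print(xbs_neighbors(toy_edges, "xyr1", toy_xbs))
```

Only first-degree neighbors are considered here, which mirrors the "direct co-expressed neighbors" wording; genes connected to xyr1 through an intermediate node are deliberately excluded.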

Three genes (jgi|126063, 73493, 92240) encoding putative transporters of non-sugar solutes were co-expressed with xyr1 and had XBS in their promoters (**Table S8**). The gene jgi|73493 encodes a putative calcium transporter with 10 transmembrane domains, and it was found as a hub in the coral1 module (**Table S8**). It has already been reported that metal ions, such as Ca<sup>2+</sup> and Mn<sup>2+</sup>, have a positive effect on the mycelial growth of T. reesei and cellulase production, and this molecular signaling mechanism is mediated by the Mn<sup>2+</sup> transporters TPHO84-1 and TPHO82-2, a Ca<sup>2+</sup>/Mn<sup>2+</sup> ATPase (TPMR1) and the components of the Ca<sup>2+</sup>/calmodulin signal transduction, including the TF Crz1 (Chen et al., 2016, 2018). Therefore, it is worth investigating the role of this putative calcium transporter in the induction of the genes responsive to lignocellulose degradation, since it could transport cations that activate gene expression.

In addition to the non-sugar transporters, 15 putative sugar transporter coding genes were co-expressed with xyr1 and had at least one XBS predicted in their promoter region (**Table S7**). One of them was considered to be a hub gene encoding a protein annotated as allantoate permease (jgi|93149), but its participation in the fungal response toward the biomass deconstruction remains unclear. Most of these putative transporters are members of the MFS superfamily, and therefore, could transport a variety of solutes, including sugars and amino acids. Sloothaak et al. (2016) developed a Hidden Markov Model (HMM) to identify new xylose transporters in T. reesei and A. niger. Several candidates were identified, and one (RUT-C30: jgi|7811; QM6a ortholog: jgi|106330) was found upregulated and co-expressed with xyr1 in this study (**Table S7**). Unexpectedly, this putative xylose transporter coding gene had one XBS predicted in the promoter and showed an increasing expression profile in bagasse. The prediction of an XBS in the promoter region of this gene suggests that it could be regulated by Xyr1 and could be involved in the xylose assimilation after the hemicellulose breakdown of bagasse.

## CONCLUSION

The T. reesei gene co-expression network analysis grouped several differentially expressed genes into modules based on their expression patterns in steam-exploded sugarcane bagasse. A large number of interesting genes encoding CAZymes, putative sugar and ion transporters, as well as TFs and proteins with a putative regulatory role were highly co-expressed within some modules. The prediction of XBS in the promoters confirmed the influence of Xyr1 on the regulation of CAZyme-coding genes and enabled the identification of new putative targets of this master regulator. Hub nodes were also found within the modules, and many of them had not been characterized. Several CAZymes, accessory proteins and uncharacterized protein coding genes were co-expressed with xyr1. Finally, this study provided an extensive number of genes that were co-expressed in bagasse. These genes have the potential to contribute to the lignocellulose degradation and to the development of T. reesei hypercellulolytic strains, and they should be studied in more detail.

## AUTHOR CONTRIBUTIONS

GB performed the analyses, and RdS carried out the data processing. DR-P and MC supervised the study and performed the analyses. JO supervised the study and planned the analyses. All the authors wrote the draft and approved its final version.

## FUNDING

This study received financial support from the Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP process numbers 2014/15799-7, 2014/11766-7, 2015/08222-8, and 2017/18987-7), from the Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq process number 141574/2015-1), and also in part from the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES).

### ACKNOWLEDGMENTS

We would like to thank the Laboratório Nacional de Ciência e Tecnologia do Bioetanol (CTBE) and the Centro Nacional de Pesquisa em Energia e Materiais (CNPEM) for the computational facility.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fbioe.2018.00151/full#supplementary-material




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The handling editor declared a shared affiliation, though no other collaboration, with several of the authors, RS and DR, at the time of the review.

Copyright © 2018 Borin, Carazzolle, dos Santos, Riaño-Pachón and Oliveira. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Promoter Toolbox for Recombinant Gene Expression in *Trichoderma reesei*

#### Elisabeth Fitz<sup>1,2</sup>, Franziska Wanka<sup>2</sup> and Bernhard Seiboth<sup>1,2</sup>\*

<sup>1</sup> Research Division Biochemical Technology, Institute of Chemical, Environmental and Bioscience Engineering, TU Wien, Vienna, Austria, <sup>2</sup> Austrian Centre of Industrial Biotechnology (ACIB) GmbH, Institute of Chemical, Environmental and Bioscience Engineering, TU Wien, Vienna, Austria

The ascomycete Trichoderma reesei is one of the main fungal producers of cellulases and xylanases based on its high production capacity. Its enzymes are applied in the food, feed, and textile industries or in lignocellulose hydrolysis in the biofuel and biorefinery industries. Over the last years, the demand to expand the molecular toolbox for T. reesei to facilitate genetic engineering and improve the production of heterologous proteins has grown. Promoters, which initiate and control transcription, are an important instrument to modify the expression of key genes. To date, the most commonly used promoter for T. reesei is the strong inducible promoter of the main cellobiohydrolase cel7a. Besides this one, there are a number of alternative inducible promoters derived from other cellulase- and xylanase-encoding genes and a few constitutive promoters. With the advances in genomics and transcriptomics, the identification of new constitutive and tunable promoters with different expression strengths has been simplified. In this review, we will discuss new developments in the field of promoters and compare their advantages and disadvantages. Synthetic expression systems constitute a new option to control gene expression and build up complex gene circuits. Therefore, we will address common structural features of promoters and describe options for promoter engineering and synthetic design of promoters. The availability of well-characterized gene expression control tools is essential for the analysis of gene function, detection of bottlenecks in gene networks and yield increase for biotechnology applications.

Keywords: constitutive promoter, inducible promoter, cellulase, promoter engineering, recombinant protein production, synthetic biology, strain engineering, *Trichoderma reesei*

*Edited by:*

Roberto Silva, Universidade de São Paulo, Brazil

#### *Reviewed by:*

Humberto Prieto, Instituto de Investigaciones Agropecuarias (INIA), Chile

Wenjing Cui, Jiangnan University, China

*\*Correspondence:* Bernhard Seiboth bernhard.seiboth@tuwien.ac.at

#### *Specialty section:*

This article was submitted to Bioenergy and Biofuels, a section of the journal Frontiers in Bioengineering and Biotechnology

*Received:* 10 July 2018 *Accepted:* 12 September 2018 *Published:* 11 October 2018

#### *Citation:*

Fitz E, Wanka F and Seiboth B (2018) The Promoter Toolbox for Recombinant Gene Expression in Trichoderma reesei. Front. Bioeng. Biotechnol. 6:135. doi: 10.3389/fbioe.2018.00135

### INTRODUCTION

Trichoderma reesei is a model organism for plant biomass degradation and a production platform for proteins and enzymes (Bischof et al., 2016). The T. reesei reference strain QM6a was isolated during the Second World War on the Solomon Islands due to its ability to degrade US army tent canvas. Initial efforts to understand the biochemical mechanisms of its extracellular cellulases were performed by Mary Mandels and Elwyn T. Reese in the Natick Army Research Laboratories (Reese, 1976; Allen et al., 2009). The isolated wild-type strain T. reesei QM6a is, like many naturally occurring strains, a poor cellulase producer but showed a high potential for enhancement of secretion capacity. Therefore, it was necessary to run several strain improvement programs to convert QM6a into an industrial workhorse for cellulase production. These efforts resulted in many different mutagenized strains with the two main lineages from Rutgers and Natick (Reese, 1975; Montenecourt and Eveleigh, 1979). In the 1980s the Rutgers lineage strain Rut-C30 (Peterson and Nevalainen, 2012) produced and secreted 30 g protein per liter fermentation medium using lactose as inducing carbon source (Durand et al., 1988). Further advancements by classical and genetic strain improvement led to industrial hyperproducer strains which are able to secrete more than 100 g of protein per liter in industrial fermentation settings, the new gold standard for cellulase production (Cherry and Fidantsef, 2003). Since all improved strains are derived from the single Solomon Islands isolate QM6a, genomic tracking of the changes that accompany cellulase hyperproduction is possible (Le Crom et al., 2009).

The potential to secrete huge amounts of protein promoted efforts to use T. reesei as expression host for recombinant proteins. Following the establishment of transformation techniques (Penttilä et al., 1987; Gruber et al., 1990), numerous fungal and non-fungal proteins were successfully produced by T. reesei (Nevalainen and Peterson, 2014; Paloheimo et al., 2016). Among these, calf chymosin was one of the first mammalian proteins produced in fungi (Harkki et al., 1989). Another emerging application is the biosynthesis of human proteins. Although T. reesei is not able to produce proteins with a human-like glycosylation pattern, one advantage is that it does not hyperglycosylate proteins. Such high-mannose glycans are often observed in proteins from other fungal expression hosts, including Saccharomyces cerevisiae or Pichia pastoris, and affect protein function or lead to immunogenic reactions in patients when used as therapeutic proteins (van Arsdell et al., 1987; Penttilä et al., 1988; Godbole et al., 1999; Boer et al., 2000; Jeoh et al., 2008; Ward, 2012). Promising results were recently achieved for a number of human proteins including antibodies, interferon alpha 2b, and insulin-like growth factor when expressed in multiple protease deficient strains (Landowski et al., 2015). Besides its excellent yield of extracellular proteins, some of the secreted enzymes of T. reesei received the GRAS status (Generally Recognized as Safe) by the US Food and Drug Administration (U.S. Department of Health and Human Services, www.fda.gov). Another advantage is the cultivation on inexpensive media and the possibility for upscaling to reactor volumes larger than 100 m<sup>3</sup> without compromising productivity (Paloheimo et al., 2016). T. reesei can grow on cheap lignocellulosic waste materials from food and non-food crops as a readily available carbon source. The lignocellulolytic enzymes produced under these conditions are then used to convert plant cell wall polysaccharides into simple fermentable sugars which are microbiologically transformed into bioethanol or other biorefinery products with higher value (Belal, 2013; Saravanakumar and Kathiresan, 2014; Xu et al., 2015).

For recombinant protein expression, the existence of a well-developed genetic toolbox is essential for efficient engineering of the production host. Its main parts have been extensively reviewed (e.g., Keränen and Penttilä, 1995; Steiger, 2013; Bischof and Seiboth, 2014). The basic requirements are suitable host strains and preferentially a large set of different gene expression cassettes. One important element of an expression cassette is the promoter, which initiates and controls gene expression.

In this review, we will describe established promoters for T. reesei with emphasis on their advantages and disadvantages and discuss promoters recently discovered using transcriptomic approaches. Subsequently, we will give examples of other useful promoters that could be adapted for T. reesei and describe synthetic expression systems that have recently been or are about to be established for T. reesei. Finally, we will touch on the emerging field of promoter engineering, where we will address the promoter architecture in eukaryotes, specific features of promoters and transcription factors in T. reesei and give an outlook on the in silico design of promoters.

#### ESTABLISHED PROMOTERS FOR GENE EXPRESSION IN *T. REESEI*

Promoters are regulatory regions upstream of the transcription start site controlling the transcription of genes. They provide information for the binding of RNA polymerase and factors necessary for recruitment of the RNA polymerase. Initiation of transcription is regulated by a diverse set of activators and repressors, often with further coactivators or corepressors involved (Wei et al., 2011). Although our knowledge of fungal promoters is steadily increasing, it is still considerably more limited than that of prokaryotic promoters, and the exact composition of regulatory motifs found in promoters of eukaryotes is still not sufficiently characterized. Some of the known features are summarized in section Structure and Regulation of Promoters.

Promoters can be classified into constitutive and tunable promoters. Constitutive promoters are expressed independently of environmentally induced transcription factors. **Figure 1A** shows such an independent activation or repression. In contrast to constitutive promoters, tunable promoters react to presence or absence of biotic or abiotic factors as shown in **Figures 1B–D**. Bidirectional promoters are a special case and able to regulate two adjacent genes oriented in opposite directions. They can be employed to express two genes simultaneously but will not be further addressed in this review, since they are not yet broadly used for T. reesei.

#### Constitutive Promoters

Constitutive promoters regulate expression of basal genes, like housekeeping genes or genes of the glycolytic pathway. They drive expression at a constant rate independent of the employed carbon source, resulting in about the same amount of gene product over time. However, their activity is not very flexible, since they lack an on/off option. Constitutive promoters are independent of environmental factors. Nevertheless, their activity is often correlated with the growth of the fungus, so the term constitutive is not completely correct in that context. The promoters described here can also be considered auto-inducible, since their independence from growth factors has not yet been shown. The term auto-inducible is often used with bacterial expression hosts such as Escherichia coli (Briand et al., 2016; Anilionyte et al., 2018). In this review, we will stick to the term "constitutive" as it is still generally used in the literature. The most commonly used constitutive promoters for T. reesei were usually only tested under specific conditions for protein production. D-glucose based media in particular are cheap and therefore attractive for economic protein production. Thus, further investigations are needed to determine whether these promoters are truly independent of media components and whether they are active during all growth phases.

Frequently used homologous constitutive promoters are listed in **Table 1**. The promoters of cDNA1 (Nakari-Setälä and Penttilä, 1995) and tef1 (Nakari et al., 1993) were isolated by screening of cDNA libraries for genes highly expressed during growth on D-glucose and were used for overexpression in D-glucose containing media. The promoter of the uncharacterized cDNA1 is generally regarded as one of the strongest among the constitutive promoters, while Ptef1 is medium strong (Nakari-Setälä and Penttilä, 1995; Uzbas et al., 2012). The promoters of eno1, gpd1 and pdc1 are also well expressed during cultivation on D-glucose (Chambergo et al., 2002; Li et al., 2012). Li et al. (2012) compared their activities by expression of the T. reesei xylanase xyn2. They found that Ppdc1 and Peno1 are stronger than Pgpd1, but that Pgpd1 showed a very stable expression, whereas Ppdc1 and Peno1 activities increased at high D-glucose concentrations in the medium (Li et al., 2012). The homologous cellobiohydrolase cel7a was overexpressed under Peno1 control in a Δcel7a strain resulting in suitable product levels (Linger et al., 2015). The promoter of pki1 was used, e.g., for xylanase production, exhibiting low to medium strength (Kurzatkowski et al., 1996).

TABLE 1 | Examples of constitutive promoters for T. reesei.

Overall, the comparison of the expression strength of these different promoters is difficult as the cultivation conditions, media compositions or strain backgrounds vary considerably in the different publications. So far, no results are available that compare these promoters under industrial production conditions. Although constitutive promoters are simple to use and independent of media requirements, their major disadvantage in comparison to inducible promoters is their lower expression strength. The strong constitutive cDNA1 promoter produced significantly less protein in a cultivation using D-glucose compared to the strong inducible cel7a promoter under cellulase inducing conditions (Penttilä et al., 1987).

#### Tunable Promoters

Tunable promoters are either inducible or repressible and dependent on the presence or absence of activating or repressing agents. These can be substances like sugars, amino acids, vitamins, metals or physical stimuli like light or different temperatures. The different modes of action of substance related activation and repression are depicted in **Figures 1B,C**. The effect of these modulating substances can be, e.g., concentration dependent or competitive. Preferably, the strength of the promoter can be fine-tuned by the addition of different quantities of the inducing/repressing substance. For protein production, an inducible promoter should have no or only a low basal expression which is considerably enhanced by addition of the inducer. The induction should be strong; in contrast to gene function studies, full tightness is not the main concern, but it is certainly advantageous for separating the growth and production phases (Meyer et al., 2011; Huang et al., 2015). In both cases, the induction can be the result of either a direct activation or the neutralization of a repressor. The expression level of a repressible promoter should be high to moderate and significantly lowered by addition of a repressing substance. Since T. reesei is a production host for cellulases and xylanases, the promoters of the genes of its main secreted enzymes are attractive targets for use in overexpression cassettes.
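The requirements listed above (low basal expression, strong induction, and a strength that can be fine-tuned by the amount of the modulating substance) can be made concrete with a generic dose-response curve. The sketch below uses a Hill-type function, which is a common textbook description and is an assumption made here for illustration only; it is not a model proposed in this review, and the function names, parameters and numerical values are hypothetical.

```python
def induced_expression(inducer, basal, maximal, k_half, hill=1.0):
    """Expression of an inducible promoter under an assumed Hill-type response.

    inducer -- inducer concentration (arbitrary units, must be >= 0)
    basal   -- expression without inducer (the "leakiness" of the promoter)
    maximal -- expression at saturating inducer concentration
    k_half  -- inducer concentration giving a half-maximal response (> 0)
    hill    -- Hill coefficient controlling the steepness of the response
    """
    if inducer < 0 or k_half <= 0:
        raise ValueError("inducer must be >= 0 and k_half must be > 0")
    response = inducer**hill / (k_half**hill + inducer**hill)
    return basal + (maximal - basal) * response


def repressed_expression(repressor, basal, maximal, k_half, hill=1.0):
    """Expression of a repressible promoter: high without repressor, low with it."""
    if repressor < 0 or k_half <= 0:
        raise ValueError("repressor must be >= 0 and k_half must be > 0")
    response = k_half**hill / (k_half**hill + repressor**hill)
    return basal + (maximal - basal) * response


# Hypothetical parameters: basal level 1, maximal level 11, half-maximal dose 2.
for dose in (0.0, 2.0, 20.0):
    print(dose, induced_expression(dose, 1.0, 11.0, 2.0),
          repressed_expression(dose, 1.0, 11.0, 2.0))
```

In this picture, a tight promoter corresponds to a small `basal` value, and a promoter that can be fine-tuned corresponds to a shallow curve (small `hill`), where intermediate doses give intermediate expression rather than an abrupt on/off switch.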

#### Cellulase Promoters

The cellobiohydrolase CBH1, or CEL7A according to the Carbohydrate-active Enzymes (CAZyme) annotation (Lombard et al., 2014), is the dominant protein in the T. reesei secretome produced under cellulase inducing conditions, accounting for about 60% of it. Its expression strength has made the cel7a promoter the first choice to drive recombinant protein production (Gritzali and Brown, 1979; Nummi et al., 1983). Expression constructs based on the cel7a promoter can be introduced in multiple copies. It is estimated that up to four copies of the expression cassette can still increase protein production but that at higher copy numbers a saturation effect occurs which is assumed to be caused by a depletion of transcriptional activators (Karhunen et al., 1993; Margolles-Clark et al., 1996). Besides that of cel7a, there are a few other frequently used cellulase promoters listed in **Table 2**, including Pcel6a or Pegl2. Apart from their lower expression strength, all of them exhibit similar advantages and limitations as the promoter of cel7a.

Cellulase genes have to be induced by sugars, which need to be present in the fermentation medium. Consequently, the application of those promoters is limited by the media requirements. Induction of cellulase promoters can be achieved by a broad variety of carbohydrates, ranging from insoluble polymeric carbon sources like cellulose and cellulose containing raw materials, to soluble disaccharides. Some of the best explored and most used inducing carbon sources can be found in **Table 3**. Cellulose and related carbon sources can be disadvantageous for industrial processes due to the insolubility of the substrate, which can affect downstream processing. The soluble disaccharide sophorose is too expensive for industrial use and lactose leads to lower enzyme production than cellulose in most strains. The monosaccharide L-sorbose affects growth of the fungus and is usually not used in industrial protein production. A more detailed discussion on further aspects of these inducing carbon sources was reviewed before (Stricker et al., 2008a; Kubicek et al., 2009; Amore et al., 2013).

Cellulase promoters, including Pcel7a, are usually inactive during growth on D-glucose and other easily metabolizable carbon sources, such as D-xylose, due to a mechanism termed carbon catabolite repression (Ruijter and Visser, 1997; Mach-Aigner et al., 2010). In the cellulase hyperproducer strain T. reesei Rut-C30, ancestor of many industrial strains, carbon catabolite repression was abolished and therefore the strain can express cellulase and xylanase genes on D-glucose and other noninducing carbon sources. This is the result of a mutation in the carbon catabolite repressor gene cre1 (Ilmén et al., 1996b; Seidl et al., 2008). However, these derepressed cellulase levels are considerably lower compared to the induced expression levels (Nakari-Setälä et al., 2009).

A side effect when using cellulase promoters for protein expression is that induction leads to the expression of other native cellulases. This does not constitute a problem when the whole protein mixture is used, e.g., for plant biomass degradation, but energy is lost to the production and secretion of these byproducts when a pure product is desired. In addition, they lead to increased costs in the downstream processing during protein purification. These problems are partially prevented by knocking out the most prominent cellulases (Landowski et al., 2015) or targeting the expression cassette to the cel7a locus to eliminate formation of the major cellobiohydrolase CEL7A. As an alternative, Uzbas et al. (2012) established a cellulase- and xylanase-free production platform in T. reesei by using a strain deleted in the main cellulase activator XYR1. By doing so, the enzyme activities can be determined directly in the supernatant without interference from native cellulases or xylanases, avoiding laborious purification steps. This system was already used to express various proteins, for example the T. reesei swollenin SWO1 and a Corynascus thermophilus cellobiose dehydrogenase (Eibinger et al., 2016; Ma et al., 2017).

#### Alternative Tunable Promoters

Whereas most cellulases are coordinately regulated, the expression of the xylanases can differ to some extent. There are several xylanase promoters in use (**Table 2**). Xylanases are induced by sugars, but in contrast to cellulases, not all of them are induced and repressed by the same substances. XYN1 and XYN3 are repressed by CRE1 in the presence of D-glucose, while XYN2 retains a low constitutive expression on D-glucose (Mach-Aigner et al., 2010; Herold et al., 2013). XYN2 is also induced by cellulose and related inducers (Amore et al., 2013; Herold et al., 2013). The most commonly used xylanase promoters are the ones of xyn1 and xyn2.

Whereas cellulase and xylanase promoters are repressed on D-glucose, the promoter of the sugar transporter stp1 is turned on when D-glucose is used as carbon source. Promising studies have been made for this alternative promoter, which can use cheap D-glucose but also other carbon sources as activating agent (Ward, 2011; Zhang et al., 2013, 2015).

### NEW DEVELOPMENTS TO EXPAND THE PROMOTER TOOLBOX

Rational strain engineering demands novel regulatory tools to understand the complexity, limitations and driving forces of protein production. Therefore, it is necessary to regulate genes in a new manner, as simple overexpression or deletion will not be sufficient to grasp the complexity of protein production. Nowadays, whole gene networks need to be adjusted in order to optimize product formation. Common obstacles can for example be metabolite pool drownings, unfavorable changes in the redox state of the cell, protein misfolding or protease formation caused by overexpression of recombinant proteins. Furthermore, complex gene systems often compensate for simple genetic modifications and cannot be characterized by only silencing or activating a single gene. All these drawbacks can be overcome by directed changes in expression or by construction of new gene circuits.

TABLE 3 | Inducing substrates for cellulase and xylanase expression.

As detailed in the previous chapters, there is an adequate selection of promoters available for recombinant protein expression derived from cellulase or xylanase genes, as well as a few constitutive promoters with medium strength. However, they have a number of limitations. One of the problems of these inducible promoters is that they are controlled by a similar set of transcription regulators including the activator XYR1 and repressor CRE1 and that the inducing substrates lead to the expression of other coregulated enzymes and proteins. In addition, introduction of multiple copies of the cel7a promoter influences the expression of cel7a and other members of this regulon (Karhunen et al., 1993). Consequently, only very limited options for rational strain engineering apart from cellulase induction are possible. Therefore, it is necessary to establish new promoters and develop synthetic expression systems, which are regulated independently from the carbon source and in a cellulase and xylanase independent fashion. Furthermore, the industrial processes in the bioethanol or biorefinery industry increasingly demand the utilization of lignocellulosic waste material from food and non-food crops as cheap carbon sources for the production of bioethanol or biorefinery products (Seidl and Seiboth, 2010). Ideally, the respective gene modulating substances should not be present in the fermentation medium, which is often difficult to achieve when complex carbon sources such as wheat straw are used.

The advent of genomics and transcriptomics facilitates the discovery of new promoters with new characteristics and qualities. In theory, there are hundreds of native tunable promoters available for each production host, whose expression strength can vary with biotic and abiotic substances available in the medium, growth state or developmental phase. The state-of-the-art approach to identify new potential promoters and their inducers or repressors is comparative transcriptomic analysis using either microarrays or the generally more accurate RNA-Seq (Wang et al., 2009; Nazar et al., 2010). A further advantage of this approach is that it is not only possible to identify new suitable promoters but also to estimate how strong the transcriptomic response is upon addition of the inducing or repressing substance. Preferentially only a minor set of genes should be affected to avoid changes in product formation or growth. Therefore, some of the gene expression modulating substances described below were specifically tested for their effect on transcript levels during growth on the raw plant biomass wheat straw as carbon source. With this approach, published expression data can also be analyzed, as shown for example for Aspergillus niger to identify constitutive promoters (Blumhoff et al., 2013). Likewise, comparative genomics enhances the identification of orthologous sequences and thereby accelerates the transferability of promoters and the adaptation of expression systems from related fungi to T. reesei, as a further source of new genetic tools.

### New Tunable Promoters Discovered by Transcriptomic Studies

The assimilation of sulfur in microorganisms comprises genes that are also sensitive toward different sulfur containing substances such as L-methionine. MET/met genes offer a set of promoters of medium strength that are tightly repressed in the presence of L-methionine. The efficient inactivation of the most commonly used promoter of the L-methionine repressible ATP sulphurylase MET3 was demonstrated for different yeasts including Saccharomyces cerevisiae (Cherest et al., 1985; Mao et al., 2002), P. pastoris (Delic et al., 2013), Candida albicans (Care et al., 1999), and Ashbya gossypii (Dünkler and Wendland, 2007). Addition of L-methionine to the medium does not interfere with T. reesei biomass formation or native cellulase production, making this amino acid a suitable substance to repress gene expression (Gremel et al., 2008; Bischof et al., 2015). The use of different orthologues of these met genes as a source of repressible promoters was therefore also tested for T. reesei. Although the MET3 orthologue in T. reesei is highly repressible when simple cellulase inducing carbon sources such as lactose were used, met3 repression was absent in the presence of the complex carbon source wheat straw in the medium. Therefore, it was necessary to identify new L-methionine repressible genes. Transcriptomic comparison of cultures grown on wheat straw with and without L-methionine addition revealed that only 50 genes were differentially regulated upon L-methionine addition. Among the down-regulated genes, a promoter of a tauD-like dioxygenase was successfully tested with the invertase sucA from Aspergillus niger as reporter. A strong repression of invertase expression was found in wheat straw cultures, even hours after addition of L-methionine. This repressible promoter is not restricted to wheat straw cultures and was also functional during growth on other carbon sources including D-glucose and glycerol (Bischof et al., 2015).

Analogous to L-methionine repressible promoters, new inducible promoters were discovered which do not interfere with cellulase production or growth of T. reesei. Among different amino acids and vitamins tested, pantothenic acid was identified as a suitable inducer at low substrate concentrations. Comparative microarray analysis between a pantothenic acid-induced and a mock-treated wheat straw culture identified only a small number of genes that were differentially regulated. Six of the most highly inducible genes were found in a gene cluster including a putative pantothenic acid transporter. Fusion of promoters of these pantothenic acid inducible genes to the T. reesei β-glucosidase BGL1 showed a clear induction of expression at low amounts of 0.1 and 1 mM pantothenic acid (Gamauf et al., 2018).

Another example for tuning of gene expression is the promoter of the copper transporter tcu1. Copper transporters are tightly repressed by environmental copper levels to efficiently regulate the uptake of copper. Copper responsive promoters were already successfully applied in other fungi (Ory et al., 2004; Lamb et al., 2013) and were recently adapted for T. reesei (Lv et al., 2015). T. reesei tcu1 is the orthologue of the N. crassa copper transporter and its activity depends on the copper concentration in the medium but is independent of the carbon source. Its expression is abolished if a certain amount of copper is present in the medium and can be relieved by the addition of a Cu<sup>2+</sup> chelator. The function of the tcu1 promoter was tested by expressing the cellulase and xylanase activator XYR1. Using this promoter, a derepression of cellulase production in T. reesei was achieved by xyr1 expression (Lv et al., 2015). The promoter was further engineered to be more sensitive toward lower copper concentrations in the µM scale and was established in the context of an unmarked genetic modification strategy (Lv et al., 2015; Zhang et al., 2016b). The tcu1 promoter was also applied to selectively silence genes without risking a lethal phenotype (Zheng et al., 2017; Wang et al., 2018), which qualifies the system for cost-effectively exploring functions of essential genes. An interesting question in the context of cellulase production is whether low copper concentrations affect the activity of cellulase cocktails, as these contain copper-dependent oxidases besides the canonical glycoside hydrolases.

An interesting option to control gene expression are developmentally regulated promoters. A set of genes active only under sporulation conditions was identified in T. reesei by a transcriptomic approach (Metz et al., 2011). These promoters could be used to express the proteins exclusively during the onset of sporulation and target the proteins to the conidiospores. In this case, the promoters do not need to be induced by addition of an agent to the medium, but are regulated by the developmental phase of the fungus.

### Examples for Potential Adaptation of Promoters From Other Fungi to *T. reesei*

A number of promoters are found in other well-studied ascomycetes including Neurospora and Aspergillus spp. and could represent useful additions to the genetic toolbox of T. reesei.

#### Promoters From Neurospora crassa

An inducible promoter often used for N. crassa is the qa-2 promoter which can be activated by quinic acid and which is repressed by D-glucose (Baum and Giles, 1985; Geever et al., 1987). The promoter works similarly in T. reesei, but is not yet optimized for gene expression, since it is highly sensitive to sugar concentrations in the culture (Zheng et al., 2017).

Besides chemically induced promoters, there are promoters regulated by a physical stimulus such as light. Light-sensitive promoters are available but were tested only for a few organisms, for example N. crassa (Fuller et al., 2018). The ability of T. reesei to react to light stimuli was already the subject of a number of transcriptomic studies (Tisch et al., 2011), but a systematic search for light-sensitive promoters under different light conditions is still missing.

#### Promoters From Aspergillus

Several inducible promoters established for Aspergillus sp. show a great potential as alternatives for T. reesei. For example, the recently described promoter of bphA (benzoate parahydroxylase) from A. niger can be induced by the presence of benzoic acid. It is tightly repressed in the absence of benzoic acid and induced within 10 min upon its addition to the medium (Antunes et al., 2016). That the fungus can grow on benzoic acid as the sole carbon source indicates that toxicity is a minimal concern in this system.

There are also thiamine repressible thiA promoters available, which were tested successfully in A. oryzae and A. nidulans. One drawback is that they are not repressible under alkaline conditions (Shoji et al., 2005).

A challenge for protein production processes is to minimize the amount of carbon source needed for biomass formation and maintenance, so that the carbon source is used instead to sustain high productivity. One possibility to uncouple product formation from biomass formation is to reach a specific growth rate close to zero. Whole genome transcriptomic analysis has identified a number of promoters active under these harsh conditions of zero growth for Aspergillus (Jørgensen et al., 2010; Wanka et al., 2016a). Some of them are also available in T. reesei.

### Synthetic Expression Systems

In order to realize a metabolism-independent expression and accurate regulation of genes, synthetic expression systems are recommended. Such a well-engineered expression system should respond in a controlled manner to an input, for example to an inducing agent, be tightly regulated, cover different expression strengths and be non-toxic. In most cases, the controlled gene expression is achieved by the use of a synthetic transactivator, which consists of a fusion of different protein domains. Synthetic transactivators comprise a DNA-binding domain, a heterologous regulatory domain and an activation domain, e.g., derived from the Herpes simplex virus protein 16 (VP16). This transactivator usually requires a chemical ligand for binding to the responsive operator sequence. To increase gene expression, multiple copies of the operator are usually introduced into the promoter. The goal is to establish a variety of regulatory systems in T. reesei to simultaneously but independently regulate or balance the expression of different genes. **Figure 2** shows the general mode of action of such an expression system. Examples for such expression systems are the estrogen receptor system, a light regulated expression system and the Tet-on/off system.

The inducible estrogen receptor system (hERα) was successfully applied for S. cerevisiae. It is based on the human estrogen receptor hERα, a member of a family of nuclear receptors for small hydrophobic ligands. The hERα was fused to the DNA-binding domain of LexA or of S. cerevisiae Gal4 and the VP16 activation domain (Quintero et al., 2007; Ottoz et al., 2014; Dossani et al., 2018). The expression system was successfully tested in Aspergillus nidulans and A. niger, in which diethylstilbestrol or 17-β-estradiol regulated hERα to activate reporter gene transcription via estrogen-responsive elements (Pierrat et al., 1992; Pachlinger et al., 2005). A downside of the system is that it is either highly inducible but exhibits a high basal expression level or it is tightly regulated but only weakly inducible (Pachlinger et al., 2005; Meyer et al., 2011). A respective system from plants (Zuo et al., 2000) was adapted to T. reesei with similar results (Derntl, 2018).

FIGURE 2 | A schematic presentation of a synthetic expression system for gene activation. A promoter drives the expression of a synthetic transcription activator (TA) containing a DNA binding domain (e.g., LexA or GAL4), a regulatory domain (e.g., an estrogen receptor) and a transcription activation domain (e.g., VP16). Upon binding of an inducer, the conformation of the transactivator changes and it interacts with the operator, which initiates the transcription of the gene of interest.

Controlling the expression strength of genes with light promises a variety of advantages over regulation with chemical agents: (i) light can be applied and removed in an instant, (ii) it is cheap and simple to obtain, (iii) its intensity and duration are easy to control, which means it is fully tunable, and (iv) no specific media composition is required. A main drawback, however, is the implementation of this system on an industrial scale, where high biomass concentrations clearly pose a challenge. In S. cerevisiae, light sensitivity toward blue or red light was introduced by fusing photoreceptors of Arabidopsis to the S. cerevisiae GAL1 promoter (Hughes et al., 2012). Wang et al. (2014) developed a light-sensitive transactivator for T. reesei by fusing the S. cerevisiae Gal4 DNA-binding domain to the N. crassa blue-light photoreceptor Vivid and the H. simplex VP16 activation domain (Zoltowski et al., 2007). The transactivator was then set under the control of the pki1 promoter and the expression of two reporter genes was successfully induced by light pulses (Wang et al., 2014). This "expression-switch" can be inserted into commonly used promoters to make them light-sensitive (Zhang et al., 2016a).

An almost universally applied expression system is the Tet-on/off system (Gossen and Bujard, 1992). In the case of the Tet-on system, transcription is dose-dependently activated by the addition of the synthetic tetracycline derivative doxycycline, which binds to the reverse transactivator rtTA. This complex then binds to the tetO operator sequence, activating the transcription of the gene of interest. The Tet-off system consists of the tetracycline-controlled transactivator tTA, which allows permanent expression of the gene of interest in the absence of tetracycline. Addition of tetracycline prevents tTA binding to the tetO operator and thereby represses expression. These systems have been adapted for fungi including A. fumigatus (Vogt et al., 2005; Helmschrott et al., 2013) and A. niger (Meyer et al., 2011; Wanka et al., 2016b), but obstacles have been observed in their adaptation to T. reesei (Zheng et al., 2017).
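The opposite logic of the two variants can be illustrated with a small, purely conceptual dose-response sketch in Python. The Hill-type curve and all parameter values (basal and maximal expression, half-maximal dose, steepness) are assumptions chosen for illustration; they are not measured properties of any Tet system described in the cited work.

```python
def hill(dose, half_max, steepness):
    """Fraction of transactivator bound by ligand (assumed Hill-type curve)."""
    if dose < 0 or half_max <= 0:
        raise ValueError("dose must be >= 0 and half_max must be > 0")
    return dose ** steepness / (half_max ** steepness + dose ** steepness)


def tet_on_expression(doxycycline, basal=0.02, maximal=1.0,
                      half_max=1.0, steepness=2.0):
    """Tet-on: ligand-bound rtTA binds tetO, so expression rises with the dose."""
    bound = hill(doxycycline, half_max, steepness)
    return basal + (maximal - basal) * bound


def tet_off_expression(tetracycline, basal=0.02, maximal=1.0,
                       half_max=1.0, steepness=2.0):
    """Tet-off: ligand-bound tTA cannot bind tetO, so expression falls with the dose."""
    bound = hill(tetracycline, half_max, steepness)
    return maximal - (maximal - basal) * bound


# Arbitrary dose units; the curve shapes, not the numbers, are the point.
doses = [0.0, 0.5, 1.0, 2.0, 10.0]
on_curve = [tet_on_expression(d) for d in doses]
off_curve = [tet_off_expression(d) for d in doses]
```

With these assumed parameters both curves cross at the half-maximal dose, and the nonzero `basal` term stands in for the leakiness that any real system would have to be characterized for.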

Overall, the selection of established artificial expression systems for T. reesei is very limited. However, there are multiple systems available, which show high potential. Examples include the temperature inducible gene regulation (TIGR) system (Weber et al., 2003), cumate gene-switch (4-isopropylbenzoic acid) (Mullick et al., 2006), biotin triggered genetic switch (Weber et al., 2009), pristinamycin on/off system (Fussenegger et al., 2000) or erythromycin on/off system (Reeves et al., 2002).

Expression systems are usually only adapted for one specific host organism and can be transferred effectively only to a limited number of closely related species. Therefore, Rantasalo et al. (2018) developed a universal synthetic expression system for recombinant protein production for six different yeasts including S. cerevisiae and P. pastoris and the two fungi T. reesei and A. niger. The system comprises two different expression cassettes: one provides a weak constitutive level of a synthetic transcription factor (BM3R1-NLS-VP16), and a second one provides strong tunable expression of the target gene via a synthetic transcription factor dependent promoter consisting of eight BM3R1 binding sites (Rantasalo et al., 2018).

### ENGINEERING OF PROMOTERS

The rational design of promoters is one of the newest fields within synthetic biology. The advantages are clear: a promoter with the needed characteristics can be designed independently of the natural spectrum of the production host. Ideally, the expression strength can be modulated, promoter switches can be added and inducibility or repression properties can be added or deleted. In the age of modern molecular biology, in which the synthesis of artificial DNA sequences is quick and cheap, the modification of primary sequences of promoters is a convenient way to optimize expression. The changes can alter different levels of regulation, as depicted in **Figure 3**, including transcription factor binding sites, secondary structure stabilities or chromatin accessibility. This sounds easy, but prior to the design of new promoters it is crucial to retrieve information about the sequences that lead to the desired characteristics. One drawback to date is that no adequate and reliable prediction software for promoter function is available, neither for the modification of natural backbones nor for the rational design of completely synthetic promoters. Today, modified and synthetic promoters are still tested via trial and error.

#### Structure and Regulation of Promoters

Regulatory elements of fungal promoters are still poorly characterized. However, a detailed knowledge of the binding motifs in the promoters is necessary for rational engineering of promoters for biotechnological application.

activating or repressing transcription factors (TF). They can bind in different ways to motifs in the sequence. (B) The activity is influenced by the secondary structure of the DNA. It can support or weaken the activity by stabilizing the primary sequence. (C) The tertiary structure of the chromatin regulates the accessibility of the promoter for DNA binding molecules.

The promoter consists of two merging parts: the core promoter, which is responsible for its basal activity (Roeder, 1996; Roy and Singer, 2015), and the proximal promoter. The region near the start codon of the coding sequence contains the transcription start site for the RNA polymerase II (Smale and Kadonaga, 2003) and other enhancing elements (Novina and Roy, 1996). Usually these comprise a TFIIB recognition site (Gelev et al., 2014), initiator (Inr), motif ten element (MTE) (Lim et al., 2004) and downstream promoter element (DPE) (Juven-Gershon and Kadonaga, 2010). The regulation of expression is determined by the accessibility of the DNA (Svejstrup, 2004; Müller and Tora, 2014), the affinity of the RNA polymerase or the stability of the initiation complex, and transcription factors (Lemon and Tjian, 2000; Juven-Gershon and Kadonaga, 2010). The chromatin structure is another layer of regulation for the promoter. In eukaryotes, DNA can be tightly packaged with the help of histones forming the chromatin structure. This condensed state can prevent the RNA polymerase, activators and repressors from interacting with the DNA. To allow access to the otherwise inactive DNA, chromatin remodeling is necessary to alter the architecture of these regions, allowing transcriptional regulation by these factors. The relation between chromatin status and induction of cellulase production in T. reesei was already addressed in a number of publications (Seiboth et al., 2012; Mello-de-Sousa et al., 2015, 2016) but will not be pursued further in this review.

#### Modification of Natural Promoters

An obvious approach to alter the characteristics of a promoter is to add or delete transcription factor binding sites or genetic switches of the promoter (Mach and Zeilinger, 2003). A prerequisite for this approach is that the respective cis-acting sequences and mechanisms of activation/repression are already known. We will therefore shed light on the essential parts of eukaryotic promoters and on some of the known transcription factors and their binding sites specific for T. reesei.

A highly engineered promoter is that of cel7a. By deleting CRE1 repression sites but leaving induction sites intact, Pcel7a can be used to express a recombinant gene on D-glucose, on which all native cellulases are usually repressed (Ilmén et al., 1996a). In several studies, CRE1 binding sites were replaced by activator binding sites, e.g., for ACE2 and HAP2/3/5, to abolish carbon catabolite repression and enhance the activity of the engineered cel7a promoter (Liu et al., 2008; Zou et al., 2012). Other examples were already discussed above, like the insertion of sequences into the native promoter to add characteristics like light-inducibility or copper sensitivity. Multiple insertions of activating sites do not necessarily lead to increased promoter activity, as the general promoter architecture has to be considered: the spacing between certain elements in the promoter can play an important role, as exemplified for the cel7a promoter (Kiesenhofer et al., 2017).

Another option to engineer promoters is the assembly of active sequences. This method is especially useful when the binding motifs and their interacting transcription factors are unknown. Natural promoter sequences with known expression characteristics or strength can be aligned, and similar parts found among the sequences in these promoters can be divided into building blocks. These blocks can then be arranged in various ways to form synthetic promoters. Once a functional promoter is assembled, the building blocks can be further modified, duplicated or deleted to improve the overall activity of the new promoter (Hartner et al., 2008; Xuan et al., 2009; Mellitzer et al., 2014; Vogl et al., 2014). While this approach was already extensively used to optimize S. cerevisiae and P. pastoris promoters, it has not yet been applied to T. reesei. An advantage of this approach is that first hints for new transcription factor binding sites and their respective transcription factors can be found without prior knowledge.
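The idea of extracting shared building blocks from promoters can be sketched in a few lines of Python. Note that this sketch deliberately swaps the sequence alignment described above for a much cruder, alignment-free stand-in: it simply reports short subsequences (k-mers) that occur in every input sequence. The promoter fragments and the block length are invented for illustration.

```python
def kmers(seq, k):
    """Set of all subsequences of length k in seq (case-insensitive)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_blocks(promoters, k=6):
    """k-mers present in every promoter: crude candidates for building blocks."""
    if not promoters:
        return []
    sets = [kmers(seq, k) for seq in promoters]
    return sorted(set.intersection(*sets))


# Invented sequence fragments, for illustration only.
toy_promoters = [
    "AAATTGGCCAATCC",
    "GGTTGGCCAAGG",
    "CTTGGCCAATT",
]

blocks = shared_blocks(toy_promoters, k=6)
```

Overlapping hits such as those returned here would normally be merged into one longer block; a real analysis would also have to account for mismatches, strand orientation and the position of each block relative to the transcription start site.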

#### General Eukaryotic Promoter Elements

Although only present in 5–7% of all eukaryotic promoters, the TATA-box is one of the best known and best characterized core promoter elements (Roy and Singer, 2015). The TATA-box and the TATA-binding protein are necessary for assembly of the TATA-initiation complex (Smale and Kadonaga, 2003). For a detailed summary see Yella and Bansal (2017).

The HAP2/3/5-complex binding site or CAAT-box can be found in about 30% of eukaryotic promoters, more frequently in TATA-less promoters but also in promoters with a TATA-site (Mantovani, 1998). The HAP2/3/5-complex acts as an enhancer and the presence of its binding motif CCAAT is necessary for high expression of cbh2 (Zeilinger et al., 1998, 2001). The complex may play a role in chromatin remodeling and is therefore involved in cellulase induction (Zeilinger et al., 2003).

CpG islands are mostly unmethylated, long stretches of CG-rich regions and are involved in the epigenetic regulation of transcription (Deaton and Bird, 2011). In such an island the GC content of a region of 200–2,000 bases (depending on the definition) is higher than 55%, and it is thought that CpG islands can initiate or silence gene expression, especially in promoters without a TATA-box (Delgado et al., 1998; Chatterjee and Vinson, 2012). Several promoters can lie within one island. Most of the studies concerning the role of CpG islands were done in mammals, but CpG islands can also be found in intergenic regions of the T. reesei genome. The regulatory role of such an element has not been investigated yet.
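A window-based screen following the GC-content criterion given above might look like the following Python sketch. The 200-base window and the 55% threshold are taken from the text; the observed/expected CpG ratio is an additional criterion that many island definitions also use and is included here as an assumption, not as part of the definition quoted above. The test sequence is artificial.

```python
def gc_content(seq):
    """Fraction of G and C bases in seq."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_observed_expected(seq):
    """Observed CpG dinucleotides relative to the number expected from base counts."""
    seq = seq.upper()
    c_count, g_count = seq.count("C"), seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c_count * g_count)


def gc_rich_windows(seq, window=200, step=1, min_gc=0.55):
    """Return (start, gc) for every window whose GC content exceeds min_gc."""
    hits = []
    for start in range(0, len(seq) - window + 1, step):
        gc = gc_content(seq[start:start + window])
        if gc > min_gc:
            hits.append((start, gc))
    return hits


# Artificial sequence: an AT-only stretch followed by a GC-only stretch.
toy_sequence = "AT" * 100 + "GC" * 100
windows = gc_rich_windows(toy_sequence, window=200, step=100)
```

Adjacent or overlapping windows that pass the threshold would then be merged into one candidate island before checking its total length against the chosen definition.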

The GC-box with its consensus sequence of GGGCGG is preferentially found in TATA-less promoters and is involved in the correct positioning of the RNA polymerase II (Weis and Reinberg, 1997).

#### T. reesei Transcription Factors and Their Binding Sites

In T. reesei the best-studied transcription factors are involved in the regulation of cellulases. Recent research efforts demonstrate that cellulase regulation is controlled by a highly adapted regulatory network involving multiple transcription factors, which can directly or indirectly regulate cellulase expression. CRE1 is the major regulator of carbon catabolite repression. The transcription factor impairs the native cellulase production on glucose, which is mainly an indirect effect. During growth on D-glucose as carbon source, CRE1 mainly accounts for the repression of genes responsible for the entry of cellulase inducers into the cell. Studies have shown that the cre1 gene is truncated in the industrial ancestor strain RUT-C30, which leads to cellulase hyperproduction and derepression in that strain (Strauss et al., 1995; Ilmén et al., 1996b; Takashima et al., 1996; Le Crom et al., 2009; Nakari-Setälä et al., 2009; Portnoy et al., 2011a). CRE1 binds to the SYGGRG sequence and it is believed that the functional binding site consists of two closely spaced repeats (Cubero and Scazzocchio, 1994; Mach et al., 1996).

XYR1 is the master activator for cellulases and xylanases and is involved in D-xylose and L-arabinose catabolism. Lack of XYR1 completely eliminates cellulase expression (Stricker et al., 2008a; Akel et al., 2009; Furukawa et al., 2009; Portnoy et al., 2011b; Lichius et al., 2015). XYR1 binds to a GGCTAA sequence arranged as an inverted repeat (Stricker et al., 2006; Kiesenhofer et al., 2017) and seems to be regulated by a long non-coding RNA (Till et al., 2018).
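Consensus motifs such as SYGGRG (CRE1) and GGCTAA (XYR1) can be located in a promoter sequence by translating the IUPAC ambiguity codes into a regular expression and scanning both strands. The following Python sketch shows one way to do this; the example sequences are invented, and a hit only indicates a sequence match, not a functional binding site.

```python
import re

# Standard IUPAC nucleotide ambiguity codes (subset sufficient for these motifs).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_motif(seq, consensus):
    """Return sorted (position, strand) hits; positions are 0-based on the forward strand."""
    seq = seq.upper()
    core = "".join(IUPAC[base] for base in consensus.upper())
    pattern = re.compile("(?=(" + core + "))")  # lookahead keeps overlapping hits
    size = len(consensus)
    hits = [(m.start(), "+") for m in pattern.finditer(seq)]
    hits += [(len(seq) - m.start() - size, "-")
             for m in pattern.finditer(reverse_complement(seq))]
    return sorted(hits)


# Invented sequences: one GGCTAA site on each strand, and a single SYGGRG match.
xyr1_hits = find_motif("TTGGCTAACCCCTTAGCCTT", "GGCTAA")
cre1_hits = find_motif("AACTGGGGAA", "SYGGRG")
```

In the first toy sequence the two hits lie on opposite strands, which is the pattern one would look for when screening for an inverted-repeat arrangement; the spacing between hits can be read directly from the returned positions.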

ACE1 is a Cys2His2-type zinc finger protein and binds to different AGGCA motifs (Saloheimo et al., 2000). A deletion of the transcription factor leads to an increased production of all (hemi-)cellulases, suggesting that it acts as a repressor for this class of enzymes (Aro et al., 2003).

ACE2 is a zinc binuclear cluster protein that binds to a GGCTAA sequence, which is the same binding motif as for XYR1 (Aro et al., 2001). A deletion of ace2 leads to a significantly reduced expression of cellulases and biomass formation on cellulose. The induction time of xyn1 and xyn2 was reduced, but the overall xylanase production was lower in Δace2 strains (Aro et al., 2001; Stricker et al., 2008b).

ACE3 is another activator. Its overexpression resulted in enhanced cellulase production, whereas the deletion impaired cellulase expression and severely decreased xylanase activity (Häkkinen et al., 2014). The exact binding motif is not yet determined.

RCE1 is a zinc binuclear cluster protein and acts as a transcriptional repressor. Lack of RCE1 facilitates cellulase induction and delays the termination of cellulase expression. RCE1 seems to antagonize the binding of XYR1 to the cellulase promoters (Cao et al., 2017).

Besides these, a number of other transcription factors have been characterized, including ARA1, which is involved in the utilization of D-galactose and L-arabinose and regulates different CAZymes in response to D-galactose (Benocci et al., 2018). The xylanase regulator SxlR seems to be a repressor for GH11 family xylanases, as its overexpression in RUT-C30 resulted in reduced xylanase activity and its deletion in increased xylanase activity (Liu et al., 2017). The deletion of the transcription factor Trpac1 led to a significantly higher cellulase production at pH 7, but not at other pH values (He et al., 2014). For more detailed reviews concerning the transcriptional regulation of cellulases see Kubicek et al. (2009), Bischof et al. (2016), Kunitake and Kobayashi (2017) and Benocci et al. (2017).

#### Synthetic Design of Promoters

Growing insight into the regulation of expression paves the way for the rational design of synthetic promoters optimized for the respective purpose. Several studies in this direction were carried out in mammals (Juven-Gershon et al., 2006; Schlabach et al., 2010), but there are still many unknown factors in gene expression and a lack of reliable prediction methods for an everyday application of rationally designed promoters. Due to this lack of deeper knowledge, a first approach for the design could be based on Multivariate Data Analysis, in which the sheer number of potentially relevant factors can be statistically analyzed. In the next step, in silico assumptions have to be confirmed by experimental evaluation. The number of experiments can be drastically reduced by the Design of Experiment (DoE) approach. For example, a number of activating short sequences are identified and these elements are then arranged in different combinations in a backbone. DoE will then help to reduce the number of actual sequences that have to be synthesized and tested. The activity of the new promoter sequences will then be determined, and parts with higher or lower impact can be analyzed by Multivariate Data Analysis tools including principal component analysis (PCA) or principal component regression (PCR) (Gagniuc et al., 2012). The designed promoters can further be optimized by additional rounds of engineering.
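How a DoE approach reduces the number of constructs can be illustrated with a two-level design in Python. The sketch below compares a full factorial design (every presence/absence combination of the candidate elements) with a half-fraction in which the level of the last element is fixed by the others. The element names E1 to E4 are hypothetical, and a half-fraction is only one of several possible DoE schemes; the text above does not prescribe a specific one.

```python
from itertools import product


def full_factorial(elements):
    """All presence (1) / absence (0) combinations: 2**k promoter variants."""
    return [dict(zip(elements, levels))
            for levels in product((0, 1), repeat=len(elements))]


def half_fraction(elements):
    """Half-fraction design: 2**(k-1) variants, still balanced for every element.

    In -1/+1 coding the last factor equals the product of all other factors,
    which in 0/1 coding means: present when an even number of others is absent.
    """
    if len(elements) < 2:
        raise ValueError("need at least two elements")
    *free, last = elements
    designs = []
    for levels in product((0, 1), repeat=len(free)):
        design = dict(zip(free, levels))
        absent = len(free) - sum(levels)
        design[last] = 1 if absent % 2 == 0 else 0
        designs.append(design)
    return designs


# Hypothetical activating elements to be combined in one promoter backbone.
elements = ["E1", "E2", "E3", "E4"]
all_variants = full_factorial(elements)
reduced_variants = half_fraction(elements)
```

For four elements this halves the number of sequences to synthesize from 16 to 8 while each element is still present in exactly half of the tested variants. The price is that some interaction effects can no longer be separated from main effects, which is the usual trade-off of fractional designs.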

## CONCLUSION

For a long time, the naturally low cellulase and xylanase production of T. reesei was enhanced for industry by random mutagenesis. To establish T. reesei as a recombinant production host, it is more desirable to rationally engineer and optimize the strain for fermentation processes. Many tools are necessary for an efficient modification of the organism, one of them being a broad selection of promoters and expression systems. The choice of the respective promoter depends on the aim of the manipulation: for a simple overexpression, a strong inducible promoter is sufficient. For complex engineering issues, for example the manipulation of metabolite pools or the alteration of growth parameters toward more favorable characteristics in the fermenter, different promoters and expression systems are needed that do not interfere with the protein production itself. Finally, synthetic biology offers new avenues and opens the possibility of de novo design of context-specific, customizable promoters. However, to realize these approaches in T. reesei and other fungi, it will be necessary to further increase our understanding of the regulatory networks governing gene expression.

### AUTHOR CONTRIBUTIONS

EF and BS prepared the outline of the manuscript. EF and FW wrote the main part of the manuscript. BS wrote the introduction, read and commented on the ms.

### FUNDING

This work has been supported by the Austrian BMWD, BMVIT, SFG, Standortagentur Tirol, Government of Lower Austria and Business Agency Vienna through the Austrian FFG-COMET-Funding Program.

#### REFERENCES


and cbh2 glycosyl hydrolase gene promoters. N. Biotechnol. 30, 523–530. doi: 10.1016/j.nbt.2013.02.005


reesei: a master regulator of carbon assimilation. BMC Genomics 12:269. doi: 10.1186/1471-2164-12-269


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Fitz, Wanka and Seiboth. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# CRISPR-Cas9 Approach Constructing Cellulase sestc-Engineered Saccharomyces cerevisiae for the Production of Orange Peel Ethanol

Peizhou Yang\*, Yun Wu, Zhi Zheng, Lili Cao, Xingxing Zhu, Dongdong Mu and Shaotong Jiang

College of Food and Biological Engineering, Anhui Key Laboratory of Intensive Processing of Agricultural Products, Hefei University of Technology, Hefei, China

#### Edited by:

Fernando Segato, Universidade de São Paulo, Brazil

#### Reviewed by:

Tikam Chand Dakal, Manipal University Jaipur, India Fabiano Jares Contesini, Universidade Estadual de Campinas, Brazil

\*Correspondence:

Peizhou Yang yangpeizhou@163.com; yangpeizhou@hfut.edu.cn

#### Specialty section:

This article was submitted to Microbiotechnology, Ecotoxicology and Bioremediation, a section of the journal Frontiers in Microbiology

Received: 04 July 2018 Accepted: 24 September 2018 Published: 10 October 2018

#### Citation:

Yang P, Wu Y, Zheng Z, Cao L, Zhu X, Mu D and Jiang S (2018) CRISPR-Cas9 Approach Constructing Cellulase sestc-Engineered Saccharomyces cerevisiae for the Production of Orange Peel Ethanol. Front. Microbiol. 9:2436. doi: 10.3389/fmicb.2018.02436

The development of lignocellulosic bioethanol plays an important role in the substitution of petrochemical energy and high-value utilization of agricultural wastes. The safe and stable expression of cellulase gene sestc was achieved by applying the clustered regularly interspaced short palindromic repeats-Cas9 approach to the integration of a sestc expression cassette containing the Agaricus bisporus glyceraldehyde-3-phosphate dehydrogenase gene (gpd) promoter in the Saccharomyces cerevisiae chromosome. The target insertion site was found to be located in the S. cerevisiae hexokinase 2 by designing a gRNA expression vector. The recombinant SESTC protein exhibited a size of approximately 44 kDa in the engineered S. cerevisiae. By using orange peel as the fermentation substrate, the filter paper, endo-1,4-β-glucanase, and exo-1,4-β-glucanase activities of the transformants were 1.06, 337.42, and 1.36 U/mL, which were 35.3-fold, 23.03-fold, and 17-fold higher than those from wild-type S. cerevisiae, respectively. After 6 h treatment, approximately 20 g/L glucose was obtained. Under anaerobic conditions the highest ethanol concentration reached 7.53 g/L after 48 h fermentation and was 37.7-fold higher than that of wild-type S. cerevisiae (0.2 g/L). The engineered strains may provide a valuable material for the development of lignocellulosic ethanol.

Keywords: biomass ethanol, cellulase, Saccharomyces cerevisiae, CRISPR-Cas9, orange peel, sestc

### INTRODUCTION

Bioethanol, as a promising alternative to petroleum resources, can be produced from lignocellulosic biomass and starch-rich plants (Kou et al., 2017). The development of lignocellulosic ethanol could be more promising in the future than food crop ethanol considering its sustainable, renewable, and environmentally friendly features (Cai et al., 2016). The conversion of lignocellulosic materials requires three key steps, namely, biomass pretreatment, saccharification, and fermentation (Chang et al., 2013). Pretreatment changes the physical and chemical properties of the raw materials. Saccharification produces fermentable sugars from cellulosic materials via enzymatic degradation, acidolysis, and ionic hydrolysis (Zhao et al., 2018). Fermentation converts D-glucose and other monosaccharides into ethanol by using microorganisms. A highly efficient transformation of lignocellulosic materials requires the integration of these three processing technologies (Koppram et al., 2014).

Orange is an important fruit crop worldwide. During orange processing, the peel is removed from the flesh and discarded as waste (Pandiarajan et al., 2018). The discarded peel generally decays and severely pollutes the atmosphere, soil, and water, so its rational use has important value (Rehan et al., 2018). Dry orange peel contains (w/w) 17.5% cellulose, 8.6% hemicellulose, 25.4% pectin, 6.73% protein, 7.7% moisture, 4.4% ash (Miran et al., 2016), and 0.85% lignin (Kantar et al., 2018). The cellulose can be converted into D-glucose through hydrolysis by cellulase. Because the lignin content is low, the cellulose is more readily accessible to cellulase (Gao et al., 2014). In this study, orange peel was used to produce ethanol with cellulase-engineered Saccharomyces cerevisiae, and ethanol production by in situ saccharification and fermentation was investigated. Orange peel is only one representative lignocellulosic material; others, such as corn straw and rice straw, will be investigated further for ethanol production.

As an engineered host strain, S. cerevisiae can effectively produce ethanol and express heterologous proteins (Romanos et al., 1992; Ruohonen et al., 1997). However, the extremely low activities of cellulase produced by S. cerevisiae severely limit the conversion of lignocellulosic materials into fermentable sugars (Yang et al., 2018). Currently, the high cost of commercial cellulase considerably reduces the competitiveness of bioethanol in the market compared with fossil energy (Aswathy et al., 2010). The construction of engineered S. cerevisiae expressing cellulase is an important approach to degrading lignocellulosic materials (Kroukamp et al., 2018).

Previous reports on strains engineered with cellulase genes mainly focused on transformation with expression vectors carrying antibiotic resistance genes (Yang et al., 2016a). These expression vectors are located in the cytoplasm and are easily lost during cell proliferation. The addition of antibiotics raises the production cost, contaminates the broth, and reduces the product purity (Solange et al., 2010). Therefore, the integration of the cellulase gene into the S. cerevisiae genome without the addition of antibiotics should be an economic, safe, and feasible solution to producing bioethanol (Lee et al., 2017). As an RNA-mediated adaptive immune system, clustered regularly interspaced short palindromic repeats-Cas9 (CRISPR-Cas9) has been developed to achieve gene editing through its RNA-guided endonuclease activity (Singh et al., 2017). A segment of DNA is integrated into the host genome by an RNA-mediated approach (Mali et al., 2013). The integrated heterologous DNA can be stably maintained in the engineered strain without antibiotic resistance genes (Jinek et al., 2014). Therefore, CRISPR-Cas9 is an effective approach for constructing engineered polyploid industrial yeast and fungal strains (Liu et al., 2017; Lian et al., 2018).

The cellulase gene used in this study was isolated from the stomach tissue of Ampullaria gigas Spix (Yang et al., 2018). The encoded protein possesses three kinds of cellulase activity, namely, endo-1,4-β-glucanase (EG), exo-1,4-β-glucanase (CBH), and β-xylanase, and the gene was named single-enzyme-system-three-cellulase (sestc). In this study, the expression cassette carrying the cellulase gene from A. gigas Spix was integrated into the S. cerevisiae chromosome by using a CRISPR-Cas9 based approach. The novel engineered strains can simultaneously express SESTC cellulase and produce ethanol by consuming the glucose obtained from the SESTC enzymatic hydrolysis. All the processes of SESTC expression, lignocellulosic saccharification, and ethanol production were investigated without the addition of any antibiotics. This study is valuable for developing lignocellulosic ethanol with a lignocellulosic material as the carbon source.

#### MATERIALS AND METHODS

#### Strains, HXK2-gRNA, Cas9-NAT, and sestc Expression Cassette

The S. cerevisiae used in this study was derived from diploid industrial yeast. The Cas9 expression cassette was carried by a yeast single-copy episomal plasmid. The Cas9-NAT plasmid, which carries a nourseothricin selectable marker and a bacterial ampicillin resistance marker, was obtained from Addgene and was reconstructed from the vector backbone pRS414-TEF1p-Cas9-CYC1t (Zhang et al., 2014; Chin et al., 2016). The gRNA expression vector for targeting the S. cerevisiae hexokinase 2 gene (HXK2-gRNA) was constructed by amplifying gRNA-trp-HYB, which is preserved in Addgene, with the primers shown in **Table 1**.

The sestc expression cassette for cellulase gene expression was prepared as follows. (1) The Lentinula edodes glyceraldehyde-3-phosphate dehydrogenase (gpd) gene promoter was isolated from the L. edodes genome by using L. edodes gpd promoter primers. The upstream and downstream regions of the L. edodes gpd promoter contained the restriction enzyme cutting sites of Sac I and Spe I. The L. edodes gpd promoter was inserted into the pBluescript II KS(-) plasmid by double digestion with Sac I and Spe I followed by ligation. (2) The sestc fragment containing Spe I and BsrG I cutting sites was integrated into the plasmid containing the L. edodes gpd promoter. (3) A pair of primers was designed to amplify the complete sestc cassette carried on the pBluescript II KS(-) backbone. The isolated sestc cassette contained the L. edodes gpd promoter, sestc, and the terminator.
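The cloning steps above depend on each fragment carrying the expected restriction sites in the expected order. A minimal sketch of such a check is given here, assuming the standard recognition sequences for Sac I, Spe I, and BsrG I (these are not stated in the text) and a placeholder fragment that is purely hypothetical. All three sites are palindromic, so scanning one strand is sufficient.

```python
# Standard recognition sequences for the enzymes named in the text
# (assumed here; the source does not list them).
RECOGNITION_SITES = {
    "SacI": "GAGCTC",
    "SpeI": "ACTAGT",
    "BsrGI": "TGTACA",
}


def find_sites(sequence: str, enzyme: str) -> list[int]:
    """Return 0-based start positions of every recognition site for one enzyme."""
    site = RECOGNITION_SITES[enzyme]
    seq = sequence.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def is_flanked(sequence: str, left_enzyme: str, right_enzyme: str) -> bool:
    """True if each enzyme has exactly one site and the left site precedes the right one."""
    left = find_sites(sequence, left_enzyme)
    right = find_sites(sequence, right_enzyme)
    return len(left) == 1 and len(right) == 1 and left[0] < right[0]


# Hypothetical placeholder fragment: SacI site, arbitrary filler, SpeI site.
promoter_fragment = "GAGCTC" + "ATGCATGCAT" + "ACTAGT"
print(is_flanked(promoter_fragment, "SacI", "SpeI"))
```

A real check would run on the actual promoter and sestc sequences and would also confirm that neither enzyme cuts inside the insert.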

### CRISPR-Cas9 Approach Integrating the sestc Expression Cassette

The sestc expression cassette was integrated into the S. cerevisiae genome by knocking out the hxk2 gene via CRISPR-Cas9 directed gene disruption (**Figure 1**). A double-strand break was formed at the gRNA target sequence. As a donor DNA, the sestc cassette was inserted into the S. cerevisiae chromosome at the double-strand break. The specific target sequence of the sgRNA (5′-ctcattttggaacaagtcatcgg-3′) for hxk2 deletion was designed using an online software<sup>1</sup>. The sestc gene cassette was inserted into the S. cerevisiae chromosome by using a two-step CRISPR-Cas9 approach. (1) The Cas9 plasmid was introduced into S. cerevisiae. The Cas9 plasmid, which contains a nourseothricin resistance gene, was transformed by the LiAc-PEG method (Schiestl, 2007). Approximately 50 µL of the transformation solution was cultivated on YPD agar plates containing 100 µg/mL nourseothricin at 30 °C for 2 days. The true transformants were named S. cerevisiae-Cas9. (2) The HXK2-gRNA expression plasmid containing a hygromycin B resistance gene and the sestc expression cassette were simultaneously transformed into S. cerevisiae-Cas9 by the LiAc-PEG-mediated method (Schiestl, 2007). The transformation solution was cultivated on YPD agar plates containing 80 µg/mL nourseothricin and 200 µg/mL hygromycin B at 30 °C for 2 days. The putative transformants were further identified at molecular and protein levels.

<sup>1</sup>http://chopchop.cbu.uib.no/

#### TABLE 1 | Primers and target sequences of S. cerevisiae HXK2-gRNA.

The underline indicates the gRNA target sequences of the S. cerevisiae hexokinase 2 gene based on the sequence of S. cerevisiae hexokinase 2 (NCBI GenBank ID: 852639).
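The sgRNA target quoted in the text (5′-ctcattttggaacaagtcatcgg-3′) is 23 nt long. A minimal sketch of a sanity check follows, under the assumption that it is laid out as a 20-nt protospacer followed by a 3-nt NGG protospacer-adjacent motif (PAM); the text does not state this layout, and the function names are illustrative.

```python
TARGET = "ctcattttggaacaagtcatcgg"  # sgRNA target sequence quoted in the text


def split_target(target: str) -> tuple[str, str]:
    """Split a 23-nt target into a 20-nt protospacer and a 3-nt PAM (assumed layout)."""
    seq = target.upper()
    if len(seq) != 23:
        raise ValueError("expected a 23-nt target (20-nt protospacer + 3-nt PAM)")
    return seq[:20], seq[20:]


def is_ngg(pam: str) -> bool:
    """True if the PAM matches the NGG pattern."""
    return len(pam) == 3 and pam[1:] == "GG"


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


protospacer, pam = split_target(TARGET)
print(protospacer, pam, is_ngg(pam), gc_fraction(protospacer))
```

Under this assumed layout the last three bases (cgg) fit the NGG pattern, which is consistent with the target having been picked by a Cas9 design tool.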

### RT-PCR Identification and Protein Analysis of Recombinant S. cerevisiae

Before the identification, the putative transformants were cultured in liquid YPD media at 30 °C and a rotational speed of 280 rpm for 3 days. The monoclonal colonies without nourseothricin and hygromycin resistance were selected for further identification. The S. cerevisiae transformant RNA was extracted using a TransZol Plant extraction kit from TransGen Biotech Co., Ltd. EasyScript First-Strand cDNA Synthesis SuperMix from TransGen Biotech Co., Ltd. was used to synthesize the complementary DNA, with the transformant RNA as the template. The designed primers based on the sestc sequences were used to amplify sestc, with wild-type S. cerevisiae as the control. The total proteins of the engineered and wild-type S. cerevisiae were analyzed by the sodium dodecyl sulfate–polyacrylamide gel electrophoresis (SDS-PAGE) approach.

### Cellulase Expression, Saccharification, and Ethanol Production

Exactly 20 mL of 20 OD<sub>600</sub> yeast cells/mL was inoculated into a 500 mL Erlenmeyer flask loaded with 200 mL of YP medium and 10 g of oven-dried orange peel powders. After fermentation at 30 °C and 180 rpm for 60 h, the cellulase activities were measured using the dinitrosalicylic acid method (Ghose, 1987). The total cellulase activity (filter paper activity, FPA), endo-β-1,4-glucanase (CMCase, EG), and exo-β-1,4-glucanase (cellobiohydrolase, CBH) were measured using Whatman grade 1 filter paper (50 mg, 1 cm × 6 cm), 0.51% CMC–Na (w/v), and 1% microcrystalline cellulose (w/v) as catalytic substrates, respectively. The saccharification process was conducted by incubating the fermentation solution at 37 °C for 6 h. Subsequently, the temperature was reduced to 30 °C for ethanol production under anaerobic conditions. The ethanol concentration was measured using gas chromatography with the following parameters: Agilent DB-624, FID detector, 50 °C column temperature, 250 °C detector temperature, 175 °C injection temperature, and 1 µL injection volume. High-performance liquid chromatography (HPLC) was used to measure the content of glucose (Zhang et al., 2016).

### RESULTS

#### Cotransformation of Cas9-NAT, HXK2-gRNA, and sestc Expression Cassette

Cas9-NAT, HXK2-gRNA, and the sestc expression cassette were successively transformed into wild-type S. cerevisiae. After the transformation, the solution mixture was cultured on solid YPD media, which contained two antibiotics. The transformation rate of the sestc expression cassette was 37.32 ± 4.23 colonies/µg sestc expression cassettes. No colony was observed on the plates of the two negative controls. The screened colonies were used for further identification.

### PCR, RT-PCR, and SDS-PAGE Identification

The colonies that had lost resistance to both antibiotics were used for PCR identification by using the primers for sestc amplification (**Table 1**). The sizes of the putative transformant DNA bands were similar to those of the positive control. No band existed in the negative control lane (figure omitted). The positive transformants identified by PCR were further identified by the RT-PCR approach. The sizes of the DNA bands from the transformants were as expected. No DNA band existed in the wild-type S. cerevisiae lane (**Figure 2**). The SDS-PAGE approach was used to analyze the protein profiles of the identified transformants in comparison with the wild-type S. cerevisiae (**Figure 3**). SESTC was approximately 44 kDa in size. Therefore, the sestc gene was effectively expressed in the engineered S. cerevisiae.

### Activity Profiles of Cellulase and Saccharification for Glucose Release

Figure legend (partial): Lane M, marker; lane 1, wild-type S. cerevisiae; lane 2, positive control; lanes 3 and 4, putative transformants.

The cellulase activities in wild-type S. cerevisiae were lower than those in the transformants. After fermentation for 48 h, the cellulase activities reached their peaks. The highest FPA, EG, and CBH were 1.06, 337.42, and 1.36 U/mL, which were 35.3-fold, 23.03-fold, and 17-fold higher than those in wild-type S. cerevisiae, respectively. During incubation at 37 °C for up to 18 h, the content of the released glucose was measured. After 6 h treatment, the concentration of glucose reached its peak (20 g/L) (**Figure 4**), and it barely increased between 6 and 18 h. In the control, almost no glucose was released from the orange peel, as determined by the HPLC method.
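The text reports transformant activities together with fold increases but does not list the wild-type values. As a rough illustration only, the wild-type baselines implied by those figures can be back-calculated by dividing each activity by its fold increase. The results are derived estimates, not measurements reported by the authors.

```python
# Transformant activity (U/mL) and fold increase over wild type, as reported in the text.
reported = {
    "FPA": (1.06, 35.3),
    "EG": (337.42, 23.03),
    "CBH": (1.36, 17.0),
}


def implied_baseline(activity: float, fold: float) -> float:
    """Wild-type activity implied by a transformant activity and its fold increase."""
    return activity / fold


# Derived estimates only; the source does not report these values directly.
baselines = {name: implied_baseline(a, f) for name, (a, f) in reported.items()}
for name, value in baselines.items():
    print(f"{name}: about {value:.2f} U/mL implied for wild type")
```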

#### Anaerobic Fermentation for Ethanol Production

After saccharification, the fermentation mixture containing 15 OD<sub>600</sub> of the recombinant S. cerevisiae cells, 20 g/L glucose, residual lignocellulose, and culture medium was treated at 30 °C under anaerobic conditions (**Figure 5**). After 24 h fermentation, the ethanol concentration reached 7.19 g/L. Subsequently, the rate of ethanol accumulation gradually decreased. The highest ethanol concentration reached 7.53 g/L after 48 h fermentation and was 37.7-fold higher than that of wild-type S. cerevisiae (0.2 g/L). The conversion rate of ethanol was 0.377 g/g glucose and 0.151 g/g dry orange peel.
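The reported conversion rates can be reproduced from the concentrations given in the text. The sketch below assumes a peel loading of 50 g/L (10 g of dry peel in 200 mL of medium, ignoring the inoculum volume), which is an inference from the methods rather than a stated value. The theoretical yield of about 0.511 g ethanol per g glucose is the standard stoichiometric ceiling and is not taken from the source.

```python
# Values reported in the text.
ETHANOL_G_PER_L = 7.53      # peak ethanol after 48 h
WILD_TYPE_G_PER_L = 0.2     # wild-type ethanol
GLUCOSE_G_PER_L = 20.0      # glucose after saccharification

# Assumption: 10 g of dry peel in 200 mL of medium, inoculum volume ignored.
PEEL_G_PER_L = 10.0 / 0.2

# Stoichiometric ceiling for glucose -> 2 ethanol + 2 CO2 (not from the source).
THEORETICAL_G_PER_G = 2 * 46.07 / 180.16

yield_on_glucose = ETHANOL_G_PER_L / GLUCOSE_G_PER_L
yield_on_peel = ETHANOL_G_PER_L / PEEL_G_PER_L
fold_over_wild_type = ETHANOL_G_PER_L / WILD_TYPE_G_PER_L
fraction_of_theoretical = yield_on_glucose / THEORETICAL_G_PER_G

print(f"yield on glucose: {yield_on_glucose:.4f} g/g")
print(f"yield on dry peel: {yield_on_peel:.4f} g/g")
print(f"fold over wild type: {fold_over_wild_type:.2f}")
print(f"fraction of theoretical yield: {fraction_of_theoretical:.2f}")
```

Under these assumptions the computed values agree with the reported 0.377 g/g glucose, 0.151 g/g dry peel, and 37.7-fold figures to within rounding.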

#### DISCUSSION

Lignocellulosic ethanol is an ideal alternative to petrochemical energy. The direct application of lignocellulosic materials can reduce the product cost. The reported materials were mainly corn stover (Zhao et al., 2017), rice straw (Yu and Li, 2015), spruce (Stenberg et al., 2000), switchgrass (Wu et al., 2018), and Miscanthus × giganteus (Scordia et al., 2013; **Table 2**). Compared with these lignocellulosic materials, orange peel showed benefits, such as low lignin content, easy decomposition, loose structure, and high sugar content. In addition, ethanol production from orange peel did not require other treatments, such as steam-, acid-, and alkali-based treatments. The construction of cellulase gene-engineered S. cerevisiae was an effective approach to achieve in situ saccharification and ethanol production. S. cerevisiae coexpressing cellulase/cellodextrin and Trichoderma viride EG3/BGL1 produced ethanol contents of 4.3 g/L (Yamada et al., 2013) and 4.63 g/L (Gong et al., 2014), respectively. S. cerevisiae expressing Aspergillus aculeatus β-glucosidase achieved an ethanol content of 15 g/L (Treebupachatsakul et al., 2016). S. cerevisiae expressing Cel3A, Cel7A, and Cel5A achieved an ethanol content of 10% (w/v) (Davison et al., 2016). The current study indicated that the final ethanol concentration reached 7.53 g/L, and the conversion rates of ethanol were 0.377 g/g glucose and 0.151 g/g dry orange peel. These indices approached or exceeded those previously reported (Yamada et al., 2013; Gong et al., 2014; Yu and Li, 2015; Wu et al., 2018).

The high-value utilization of wastes from agricultural products reduces environmental pollution (Tetsuya et al., 2013). Previous studies on cellulase-engineered S. cerevisiae focused on gene expression in the cytoplasm by using vectors that carry antibiotic resistance genes, such as nourseothricin and hygromycin B resistance genes (Koppram et al., 2014; Yang et al., 2016b). In this study, the cellulase sestc cassette was integrated into the S. cerevisiae genome by using CRISPR-Cas9 technology to achieve the stable expression of the lignocellulolytic enzyme in S. cerevisiae. The genome-integrated sestc gene can be expressed from the gpd promoter with orange peel as the fermentation substrate. This study provided a valuable reference on cellulase sestc, particularly its expression and its role in ethanol production with lignocellulosic materials. However, the cellulase sestc-engineered S. cerevisiae still exhibited problems. The lack of β-glucosidase activity in the sestc-engineered strains caused insufficient substrate hydrolysis. The heterologous expression of a β-glucosidase gene in the sestc-engineered strains is a reasonable solution; in addition, supplementing the solution containing SESTC with β-glucosidase is also an effective strategy.

#### CONCLUSION

Crushed orange peel powder was used to produce ethanol via saccharification and fermentation by sestc-engineered S. cerevisiae. The highest ethanol concentration was 7.53 g/L after 48 h fermentation. This value was 37.7-fold higher than that of wild-type S. cerevisiae. The conversion rates of ethanol were 0.377 g/g glucose and 0.151 g/g dry orange peel. This study provided an effective approach to integrate cellulase sestc cassettes into the S. cerevisiae genome via the CRISPR-Cas9 approach. The constructed engineered S. cerevisiae can be applied to biomass ethanol production.

### AUTHOR CONTRIBUTIONS

PY proposed the research and wrote the manuscript. YW performed the experiments. ZZ analyzed the data. XZ made the maps. SJ designed the scheme. DM wrote the discussion section. LC measured the ethanol content.

#### REFERENCES

for extraction of polyphenols and fermentable sugars from orange peels. Food Res. Int. 107, 755–762. doi: 10.1016/j.foodres.2018.01.070


Romanos, M., Scorer, C., and Clarke, J. (1992). Foreign gene expression in yeast: a review. Yeast 8, 423–488. doi: 10.1002/yea.320080602


transporter-co-expressing Saccharomyces cerevisiae. AMB Express 3:34. doi: 10.1186/2191-0855-3-34


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Yang, Wu, Zheng, Cao, Zhu, Mu and Jiang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Systems and Synthetic Biology Approaches to Engineer Fungi for Fine Chemical Production

Leonardo Martins-Santana, Luisa C. Nora, Ananda Sanches-Medeiros, Gabriel L. Lovate, Murilo H. A. Cassiano and Rafael Silva-Rocha\*

Systems and Synthetic Biology Laboratory, Cell and Molecular Biology Department, Ribeirão Preto Medical School, São Paulo University (FMRP-USP), Ribeirão Preto, Brazil

Since the advent of systems and synthetic biology, many studies have sought to harness microbes as cell factories through genetic and metabolic engineering approaches. Yeast and filamentous fungi have been successfully harnessed to produce fine and high value-added chemical products. In this review, we present some of the most promising advances from recent years in the use of fungi for this purpose, focusing on the manipulation of fungal strains using systems and synthetic biology tools to improve metabolic flow and the flow of secondary metabolites by pathway redesign. We also review the roles of bioinformatics analysis and predictions in synthetic circuits, highlighting in silico systemic approaches to improve the efficiency of synthetic modules.

#### Edited by:

André Ricardo Lima Damasio, Universidade Estadual de Campinas, Brazil

#### Reviewed by:

Elizabeth Bilsland, Universidade Estadual de Campinas, Brazil Jean Marie François, UMR5504 Laboratoire d'Ingénierie des Systèmes Biologiques et des Procédés (LISBP), France

\*Correspondence:

Rafael Silva-Rocha silvarochar@gmail.com

#### Specialty section:

This article was submitted to Bioenergy and Biofuels, a section of the journal Frontiers in Bioengineering and Biotechnology

Received: 09 April 2018 Accepted: 02 August 2018 Published: 03 October 2018

#### Citation:

Martins-Santana L, Nora LC, Sanches-Medeiros A, Lovate GL, Cassiano MHA and Silva-Rocha R (2018) Systems and Synthetic Biology Approaches to Engineer Fungi for Fine Chemical Production. Front. Bioeng. Biotechnol. 6:117. doi: 10.3389/fbioe.2018.00117

Keywords: synthetic biology, yeast, filamentous fungi, genetic engineering, bioinformatics, biotechnology

## OVERVIEW

Systems biology has emerged as a relevant approach to encompass the phenomena that occur concomitantly in a given environment. Since its advent, research has been modeled on a global view of how biological systems operate. Systems biology presents scientific possibilities to understand processes as a whole and to integrate new data and new analytical tools that are intended to solve biological problems by integrating parts and modules. Specifically, systems biology offers the possibility to study combinations of individual parts, which can play different roles in the behavior of the biological system depending on the context.

On the other hand, synthetic biology refers to the use of rational approaches that allow a single part of a module or system to be studied in a minimalist manner, in order to understand how the creation of nonexistent parts could enable the engineering of biological circuits and of microbes as cell factories. These approaches seek to address the growing industrial demand for biotechnological processes that manufacture products at the required scale in an economically efficient manner. Many efforts in the genetic and metabolic engineering fields have explored the use of viable fungal cells to generate fine chemicals.

Synthetic biology plays a central role in these efforts by innovating the use of the components of biological circuits, pathways, or processes and ultimately by systematically harnessing these modules to achieve the goals of industrial-scale production. The advent of the bioinformatics era has allowed the redesigning of living information in the cell environment. This has significantly contributed to the development of new strategies based on systems biology approaches, mainly computational and rational ones. Since all these advances in single parts of a system and integrative modules direct efforts toward a common goal, systems and synthetic biology permit the deeper exploration and elucidation of cellular mechanisms, mainly the genetic and metabolic engineering of fungal cells in an integrative way.

Here, we review and discuss the most promising recent advances in this field, focusing on the use of single module, genetic and metabolic designs to engineer filamentous fungi and yeasts for industrial biotechnological processes. We describe some of the most promising industrially relevant tools for this purpose. Also, a list of significant studies that have used synthetic biology approaches to engineer fungi is provided in **Table 1**.

#### MOLECULAR ENGINEERING TOOLS

#### Synthetic Promoters

Transcriptional control redesign is a general term for the most intensively studied processes used to engineer fungi. Many of the single modules involved are pivotal for driving heterologous protein expression. Advances in synthetic promoter design have significantly contributed to the creation of engineered strains. In this sense, Ata et al. (2017) recently reported the construction of a synthetic promoter library based on the glyceraldehyde-3-phosphate dehydrogenase (G3PDH) promoter in Pichia pastoris, a well-known methylotrophic yeast model. The study adopted a robust strategy of simultaneous deletions and duplications of transcription factor binding sites (TFBSs) to control expression of transcription factor (TF) genes to understand the transcriptional dynamics in these cells. This is a promising approach to unravel the mechanisms of transcriptional dynamics that could be potentially expanded to further investigations of the transcription logic in yeast, with the ultimate goal of robust and improved strain engineering.

This strategy allows the creation of promoter variants with different strengths in the presence or in the absence of critical regulators. It relies on the premise that these dynamics directly influence the transcriptional response and the regulation logic. Approaches like these are the most common and robust tools for understanding the dynamics of regulation in yeast cells. The design of a set of promoters that rely on different architectural compositions is crucial for the systemic understanding of regulation dynamics and is a convenient application for more detailed studies to engineer strains for industrial purposes (**Figure 1A**). Although the potential of the approach in yeast engineering is clear, further studies are required for its successful application in the engineering of filamentous fungi.
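The idea of building a promoter library by deleting and duplicating TFBSs can be illustrated with a small enumeration. This is a simplified sketch, not the procedure of Ata et al. (2017): it applies only one edit per variant, and the TFBS names and the `core` element are hypothetical placeholders.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    name: str
    elements: tuple[str, ...]  # ordered upstream elements followed by the core promoter


def single_edit_variants(tfbs: list[str], core: str = "core") -> list[Variant]:
    """Enumerate variants made by deleting or duplicating one TFBS at a time."""
    variants = [Variant("wild_type", (*tfbs, core))]
    for i, site in enumerate(tfbs):
        # Deletion: drop the i-th binding site.
        deleted = tfbs[:i] + tfbs[i + 1:]
        variants.append(Variant(f"del_{site}", (*deleted, core)))
        # Duplication: insert a second copy directly after the i-th site.
        duplicated = tfbs[:i + 1] + [site] + tfbs[i + 1:]
        variants.append(Variant(f"dup_{site}", (*duplicated, core)))
    return variants


# Hypothetical binding-site labels, for illustration only.
library = single_edit_variants(["TFBS_A", "TFBS_B", "TFBS_C"])
for variant in library:
    print(variant.name, "-".join(variant.elements))
```

Combining several edits per variant, as the simultaneous strategy in the text implies, would enlarge the library combinatorially; the single-edit case is shown only to keep the example short.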

Another elegant strategy to engineer fungal promoters consists of a series of molecular designs to obtain hybrid promoters using bacterial modules. Standard tools from bacteria could be easily implemented to investigate transcriptional behavior in eukaryotic cells. This approach was proposed in a recent study that reported the construction of synthetic promoters based on the Ashbya gossypii translation elongation factor (TEF) promoter architecture (Hector and Mertens, 2017). In this study, the authors used the negative regulator xylR from the gram-negative bacterium Caulobacter crescentus to construct a xylose-sensitive transcriptional circuit. For this, the authors replaced specific TF recognition regions of the promoter with bacterial DNA, which involved the modification of the regions flanking the TATA box upstream and downstream. This strategy proved to be relevant for the engineering of yeast for use in fine chemical production, mainly in biomass conversion processes (Hector and Mertens, 2017).

TABLE 1 | Synthetic biology approaches for fine chemical production with filamentous fungi and yeast as cell biofactories.

| Approach | Organism | Strategy | References |
| --- | --- | --- | --- |
| Promoter library construction | P. pastoris | Synthetic promoter engineering | Ata et al., 2017 |
| Transcriptional circuit sensitive to xylose | A. gossypii | Synthetic promoter engineering | Hector and Mertens, 2017 |
| Cellulase optimization promoter dynamics | T. reesei | Synthetic promoter engineering | Kiesenhofer et al., 2017 |
| Improvement of promoter strength | S. cerevisiae | Intronic sequences in promoters | Hoshida et al., 2017 |
| Evaluation of terminators function improvement | S. cerevisiae | Nucleosome occupancy arrangement predictions | Morse et al., 2017 |
| Galactaric acid production improvement | A. niger | CRISPR/Cas9 | Kuivanen et al., 2016 |
| Yeast genome engineering | P. pastoris | Optimized codons for Cas9 and RNA polymerases promoter sequences | Weninger et al., 2016 |
| Synthetic biopathway control | S. cerevisiae | CRISPR/dCas9 | Jensen et al., 2017 |
| Improvement of production and tolerance to ethanol | S. cerevisiae | Polymerase engineering | Qiu and Jiang, 2017 |
| Expression of cellulase genes through a copper responsive promoter | T. reesei | RNA interference | Wang et al., 2018 |
| Stable segregation of vectors | S. stipitis | Episomal vector optimization | Cao et al., 2017 |
| Tolerant acetic acid mutant yeast | S. cerevisiae | Direct evolution approach | González-Ramos et al., 2016 |
| Responsiveness to low-pH conditions | S. cerevisiae | Synthetic promoter engineering | Rajkumar et al., 2016 |
| Redirected carbon flux from acetyl-CoA to β-carotene production | Y. lipolytica | Fine-tuning expression of synthetic genes | Gao et al., 2017 |
| Production of terpenes | R. toruloides | Codon optimization of biosynthetic enzyme coding genes | Yaegashi et al., 2017 |
| Isobutanol production | S. cerevisiae | Mitochondrial compartmentalization pathway | Park et al., 2016 |
| Controlling accumulation of free fatty acids | S. cerevisiae | Dynamic regulatory circuits | Teixeira et al., 2017 |
| Production of alkaloids | S. cerevisiae | Proof-of-concept synthetic circuit | Galanie et al., 2015 |

The replacement and new insertion of TFBSs are elegant ways to engineer promoters, as are combinatorial fusions of regulatory elements that create novel promoter architectures. Both approaches could optimize transcription in fungal cells. One of their greatest benefits is the possibility to supply the industrial demand for metabolites or fine chemicals, which is currently limited by extreme chemical conditions, such as low pH. A systematic and expansive molecular strategy to overcome the limits of transcriptional regulation is needed to increase the level of production of high value-added products. To this end, a construct based on the core architecture of the G3PDH gene promoter combined with the upstream activating sequence of the guanylate kinase gene promoter was explored to guarantee transcriptional responsiveness to pH oscillations in Candida glycerinogenes. The resultant promoter was used for the expression of an industrially valuable enzyme, xylose dehydrogenase, in low pH conditions, heralding the approach as a relevant strategy for metabolic bioprocesses in such conditions (Ji et al., 2017). The same logic could be applied for other purposes in processes that take place in a basic pH environment or in the presence of various stress conditions.

Although promoter engineering is a promising approach, new engineering technologies and strategies are required, especially for filamentous fungi. Some studies have explored the optimization of transcriptional regulation systems to understand how the disposition and repetition of TFBSs in DNA can influence the dynamics of regulation. A recent study reported the construction of engineered promoters from two wild-type promoters of Trichoderma reesei, the fungus most widely used for the production of cellulases (Kiesenhofer et al., 2017). The findings of this work indicated that the number of repeated TFBSs is relevant for transcription regulation and that the disposition of the recognition sites plays a major role in the final response of the system. This strategy could be further explored to engineer fungi with redesigned metabolic flow and to investigate regulatory complexity in these organisms.

#### Effects of Introns in Transcriptional Regulation

Synthetic promoters can also be engineered with the use of portions of a gene. Insertion of introns upstream of specific promoter sequences is a typical example of this form of engineering (**Figure 1A**). Molecular recognition by the transcriptional cell machinery has been recently used to promote a transcriptional response following the integration of intron sequences upstream of a promoter sequence. One example is the insertion of introns in promoter regions of relevant genes involved in lipid biosynthesis in Rhodosporidium toruloides (Liu et al., 2016). With a vector containing a luciferase gene under the transcriptional control of the promoters in the study, the authors successfully harnessed the intronic sequences to modulate the promoter strength. As a result, the promoter activity of synthetic promoters was 3-fold higher when compared with some wild type sequences. The improvement in the strength of the modified promoters of perilipin/lipid droplet protein 1 gene, acetyl-CoA carboxylase gene, and fatty acid synthase subunit β gene, makes this strategy feasible for the engineering of yeast promoters for industrial applications (Liu et al., 2016).

Another recent study investigated the influence of introns in promoter sequences in Saccharomyces cerevisiae. In this case, the authors demonstrated that the presence of 5′ untranslated regions (UTRs) of intronic sequences in the promoter of genes encoding 40S ribosomal proteins (RPs) increased the strength of the promoter. The luciferase activity was 17-fold higher when an intronic promoter was used to drive expression than with the strong wild-type TDH3 promoter in S. cerevisiae (Hoshida et al., 2017). Furthermore, when the RPS25A intronic promoter and the TDH3 promoter were merged, the luciferase activity increased by approximately 50-fold compared with the TDH3 promoter alone (Hoshida et al., 2017). Further studies on these combinations may alleviate the remaining bottlenecks due to intron off-effects and expand the strategy to the engineering of filamentous fungi.

#### Synthetic Terminators

Terminator sequences constitute an important part of molecular regulatory modules in a circuit. These sequences are pivotal in the final steps of transcription and are required for the complete and successful generation of the mRNA machinery, owing to their involvement in mRNA stability, and even as insulators in genetic circuits (Curran et al., 2013; Geisberg et al., 2014; Song et al., 2016). In synthetic biology, terminators are significant modules that guarantee precise and controlled regulation. These functions have heightened the research focus on terminator sequences. Decades ago, scientists had already focused their efforts on creating synthetic and minimal terminators to study how these elements could influence transcription and mRNA half-life (Guo and Sherman, 1996). In the intervening decades, further studies have demonstrated the use of engineered terminators as suitable tools for synthetic biology applications. In this context, the direct relationship of a 3′ UTR of S. cerevisiae dityrosine-deficient 1 (DIT1) terminator to increased protein expression was described as a process mediated by NAB6p and PAP1p trans-acting RNA-binding proteins. This relationship was demonstrated by the analysis of mutations of the DIT1 terminator sequence. The mutations that were introduced improved the terminator activity by 500% when compared with an internal terminator used as the control (Ito et al., 2016). This approach is relevant for the study of post-transcriptional control mechanisms, and it demonstrates the influence of terminators on mRNA stability and the levels of protein production in vivo.

With evidence from multiple studies that the half-life of mRNA could be affected by terminator sequences, the idea that molecular arrangements in DNA strands may reflect the distribution of terminators has gained acceptance in the scientific community. Recently, Morse et al. (2017) described that the function of S. cerevisiae terminators could be modulated based on the predicted nucleosome occupancy arrangements. The study exploited the fluorescence emission that occurred under the control of designated promoters to demonstrate that the engineered minimal terminators of the cyc1 and adh1 genes may positively influence protein production via decreased nucleosome affinity (Morse et al., 2017). The authors also explored the hypothesis that terminators behave as antisense promoters in the generation of noncoding RNA molecules. To investigate this, the authors reversed the orientation of the yECitrine fluorescence reporter gene to evaluate a possible transcriptional control of the synthetic terminator in question. However, the results that were obtained failed to confirm the hypothesis of a terminator acting as an antisense promoter in this case. The authors suggested alternative mechanisms for the absence of differential trends of the described synthetic terminators in the study, and the data supporting these alternatives can be found in the literature (Morse et al., 2017).

Fewer studies have addressed the activity and influence of terminators on transcriptional and post-transcriptional regulation when compared to the studies describing promoter engineering. Therefore, further investigations are necessary to gain a full understanding of the properties of terminators. However, the present knowledge indicates that the intrinsic properties of terminators make them a suitable tool for synthetic biology approaches that will be useful for industrial applications in the not too distant future (**Figure 1B**).

#### CRISPR/Cas9 Tools

Clustered regularly interspaced short palindromic repeats (CRISPR) tools are an efficient and reliable strategy in synthetic biology for engineering microbes as cell factories. Fungal cells, even filamentous species, have already been modified for their successful use in industrial bioprocesses. The CRISPR/CRISPR-associated protein 9 (Cas9) technique essentially consists of using an RNA-guided Cas9 endonuclease to target a DNA sequence, which allows the precisely targeted disruption of this sequence and the activation of repair mechanisms in cells. This strategy fits a chimeric single guide RNA (sgRNA) and a Cas9 into a single, easily manipulated plasmid vector. Since the appearance of CRISPR/Cas9, numerous efforts have aimed to engineer fungal cells for industrial purposes (**Figure 1C**). Recent advances deserve special attention and are addressed in this review. For instance, the filamentous fungus Aspergillus niger was engineered with CRISPR/Cas9 to create a strain capable of producing galactaric acid from galacturonic acid, a representative molecule present in pectin fibers. In this study, the authors brought to light the fact that galactaric acid can be used as a carbon source by A. niger. To create a strain capable of retaining this molecule, the CRISPR/Cas9 system was used to efficiently disrupt the metabolic pathway for this compound (Kuivanen et al., 2016). This is a remarkable study in the field, considering that fungi from the Aspergillus genus have a considerable genetic capability to produce enzymes for hydrolysis of biomass content such as pectin, as well as established raw material for industrial applications.

A major advantage of using the CRISPR/Cas9 system to edit genomes is the system's plasticity for repairing double-strand DNA breaks. Led by sgRNA, Cas9 cleaves the target DNA, which activates a repair mechanism in eukaryotic cells. The nonhomologous end joining system of repair is considered an error-prone method for fixing broken sequences and may generate disruptions in these sequences by deleting or inserting a few nucleotides at the site of damage. This type of repair system is a suitable alternative to generate knockout strains or to disrupt an open reading frame (ORF) of the gene of interest. Additionally, the repair promoted by homologous recombination provides an error-free system through the incorporation of a faithful donor DNA sequence, which is a valuable strategy when the goal is to replace a sequence of interest.
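The targeting and disruption logic described above can be illustrated with a minimal Python sketch. It assumes the commonly used Streptococcus pyogenes Cas9, which requires an NGG protospacer-adjacent motif (PAM) immediately 3′ of a 20-nt target; the PAM rule, the function names, and the toy sequence are assumptions for illustration and are not taken from the studies cited here. The sketch scans only the forward strand and does not score off-targets.

```python
import re


def find_protospacers(seq, guide_len=20):
    """Return (start, protospacer, pam) for forward-strand sites followed by NGG.

    Assumes an SpCas9-style NGG PAM (an assumption, not stated in the review).
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


def causes_frameshift(indel_len):
    """An indel whose length is not a multiple of 3 shifts the reading frame."""
    return indel_len % 3 != 0


if __name__ == "__main__":
    toy_orf = "A" * 20 + "TGG" + "CCC"  # hypothetical sequence
    for start, protospacer, pam in find_protospacers(toy_orf):
        print(start, protospacer, pam)
    print(causes_frameshift(1), causes_frameshift(3))
```

The `causes_frameshift` helper reflects why small insertions or deletions left by error-prone repair often disrupt an ORF, whereas an indel of exactly three nucleotides keeps the downstream codons in frame.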

The CRISPR/Cas9 system can be applied to both yeast and filamentous fungal strains, even with the considerable genetic and morphological differences between the two groups. Despite the limitations of recombination, this technology has been used to create a series of functional constructs of P. pastoris in a precise and efficient manner (Weninger et al., 2016). In this case, the authors tested combinations of single modules in a robust system featuring optimized codons for Cas9, as well as optimized RNA polymerase II and III promoter sequences (Weninger et al., 2016). This approach is promising for future applications in yeast genome editing and is anticipated to allow the creation of efficient standardized strategies for the engineering of filamentous fungi. In addition to this case, recent studies have demonstrated the application of this technology to Yarrowia lipolytica (Schwartz et al., 2017), A. oryzae (Katayama et al., 2016), A. fumigatus (Zhang et al., 2016), C. albicans (Shapiro et al., 2017), Penicillium chrysogenum (Pohl et al., 2016), for the hyper-expression of cellulase using Myceliophthora species (Liu et al., 2017), and even for the creation of broad spectrum promoters (Yang et al., 2017). It is therefore clear that this tool will be increasingly important for the generation of new strains with improvements in the functions that are required for biotechnological applications.

#### CRISPR/Cas9 and Transcription Regulation

As addressed before, breakthroughs using the CRISPR/Cas9 technique have been achieved even with the inherent limitations of the method, such as the occurrence of low recombination rates and off-targets in the genome. One of the most important recent improvements has been the coupling of a transcription factor to a mutated/dead Cas9, which is generally termed dCas9 (**Figure 1D**). This technique retains the targeting of the CRISPR/Cas9 system but, instead of cleaving DNA, inhibits transcription when dCas9 blocks the sites for RNA polymerase anchoring in promoters, reducing the number of transcripts in a system. Additionally, molecular approaches have also been employed to fuse transcription regulatory domains of proteins to promote the enhancement or repression of transcription.

Singular approaches have also been developed that combine a dCas9 and engineered guide RNAs (gRNAs), as discussed by Jensen et al. (2017). In this case, the authors described the creation of different systems of transcriptional response that were modulated using dCas9 to control synthetic pathways in S. cerevisiae. They also proposed that differently designed gRNAs could influence the reprogramming regulation in different promoters of the yeast (Jensen et al., 2017). The results indicated that multiplex reprogramming strategies can be used to engineer yeast for the production of triacylglycerols (TAGs) and isoprenoid compounds, but further studies are required for the expansion and optimization of industrial applications.

### Molecular Tools for Transcriptional Regulation

A wide array of possible molecular fungal engineering tools is available in the literature. Here, we briefly review the more notable and the newest synthetic biology approaches to engineer fungi for fine chemical production.

#### RNA Interference (RNAi)

Post-transcriptional control is also a point of study for synthetic biologists to understand how mRNA or non-coding RNAs may behave as knockdown agents for the regulation of protein expression (**Figure 2A**). In this sense, Wang et al. (2018) reported the use of RNAi in T. reesei to successfully create a system for the inhibition and derepression of cellulase genes under transcriptional control of the copper responsive tcu1 promoter. In this system, the absence of copper in the medium promoted the transcription of a hairpin RNA for the genes of interest. The presence of copper in the medium also repressed transcription in a system of toggle modulation for post-transcriptional control (Wang et al., 2018).

#### Polymerase Engineering

Protein engineering is a reliable strategy for industrial synthetic biology (**Figure 2B**). With this method, it is possible to create a library of mutant protein domains in order to improve the catalytic potential of any given enzyme to be further applied in industrial processes. Recently, it was reported that mutations in subunit Rpb7 of RNA polymerase II (generated by error-prone PCR) in S. cerevisiae improve the production of and tolerance to ethanol in bioprocesses (Qiu and Jiang, 2017). This study highlighted a simple but very efficient technique for strain engineering. This was the first study to describe the engineering of eukaryotic polymerases to modulate transcriptional responses in yeast (Qiu and Jiang, 2017). The study additionally highlights the elegance of simple approaches for synthetic biology and paves the way for more investigations that will make it possible to apply the same procedure in filamentous fungi for industrial purposes.

#### Episomal Optimization

Cao et al. (2017) used fine engineering to improve episomal expression. The authors identified minimal regions in centromeres that were capable of promoting the stable segregation of vectors in Scheffersomyces stipitis, an unconventional yeast that the authors studied for the production of shikimate pathway-derived compounds owing to the native capacity of the yeast to metabolize xylose. This is a remarkable milestone in the field of synthetic biology, both for the identification of centromere functional minimal regions and for the successful application of CRISPR/Cas9 for an unconventional yeast species, paving the way for novel strategies to engineer novel strains for industrial purposes (Cao et al., 2017).

#### Orthogonality by Engineered TFs

Engineered TFs can be used to improve the transcriptional response in living cells. A recent and innovative study used engineered TFs with yeast, virus, and plant activation domains that were fused to a nuclear localization signal and a full-length TF or its DNA binding domain. In this case, the authors noticed that transcription of a yeast-enhanced green fluorescent protein reporter gene under the control of a yeast promoter was differentially modulated by each domain (Naseri et al., 2017). The authors also discussed the influence of the number of binding sites embedded in such promoters and noticed that some artificial TFs derived from Arabidopsis thaliana could also modulate the transcription of the fluorescence pattern in the presence/absence of an inducer (Naseri et al., 2017). This approach expands the possibilities for the future application of synthetic biology in other eukaryotic cells.

### METABOLIC ENGINEERING

Metabolic engineering is an ongoing challenge in the microbial production of fine chemicals from sustainable biomass. For many years, metabolic engineers had to rely on random mutagenesis and screening of strains, meaning that any adverse characteristics could not be easily detected (Campbell et al., 2017). However, with the advent of synthetic biology, it has become possible to circumvent most of these constraints and improve bioprocesses, allowing the enhanced manipulation of carbon flux in fungal strains for the production of chemicals, food additives, pharmaceutical products, and other molecules of interest (Wang et al., 2016). A relevant example of metabolic engineering applied to fungal strains is the production of fine chemicals through plant biomass hydrolysis. Plant biomass is mainly composed of lignin, cellulose, and hemicellulose. These compounds, frequently present in agro-industrial wastes, can be hydrolyzed to fermentable sugars through the action of a number of specific enzymes. For this purpose, fungal strains can be engineered to produce those enzymes that can break specific bonds in such polymers and generate sugars, such as glucose and xylose, which can be directed to the production of ethanol, butanol, fatty acids, and aromatic compounds.

The basis of biotechnological production from biomass is cellulose and hemicellulose. When hydrolyzed, they generate pentoses and hexoses, which can be converted to biosustainable commodities (Guerriero et al., 2016; Gupta et al., 2016). There are two different approaches to obtain these products from fungi. One approach is for the production of non-oleaginous biofuels and chemicals and the other is for the production of oleaginous compounds. For the non-oleaginous bioproducts, synthetic biology approaches focus on the glycolysis of C5 and C6 sugars and their further conversion to the product of interest, in which the pyruvate flux is redirected to alternative pathways (Chen and Nielsen, 2016). Examples include lactic acid, succinic acid, cis-muconic acid, and ethanol, which are molecules important for polymer production and for applications in the cosmetic, food, chemical, and biofuel industries (for further details, please refer to Chen and Nielsen, 2016).

Metabolic engineering for the production of oleaginous compounds is focused on the central carbon metabolism, with the aim of increasing the molecular levels of acetyl-CoA, malonyl-CoA, and acyl-CoA. This generates precursors for the production of fatty acids that are used for the manufacture of detergents, lubricants, biodiesel, plastics, and coatings (Marella et al., 2018). Although some fungal species, such as S. cerevisiae, fail to produce high levels of cytosolic acetyl-CoA and require metabolic manipulation to circumvent this absence, there are species (Y. lipolytica, for example) that naturally produce higher levels of this molecule. Therefore, strategies to manipulate both species have been reported (Jin et al., 2015; Marella et al., 2018) and subsequently reviewed in detail. Nevertheless, studies have reported the use of S. cerevisiae as a chassis for the production of fatty acids, in which biomass is used as a carbon source for animal feed supplementation through the addition of acetyl-CoA carboxylase and thioesterase genes from Corynebacterium glutamicum (You et al., 2017). Another application is the production of oil with reduced viscosity (composed of acetyl-TAGs) with the introduction of a diacylglycerol acetyltransferase from Euonymus alatus (Tran et al., 2017). In addition, Wei et al. (2017a) successfully engineered S. cerevisiae with six cocoa genes to produce a cocoa butter-like (CBL) lipid.

Many studies concerning waste biomass utilization have focused primarily on hemicellulose as the main carbon source. However, lignin, a rich aromatic resource and an underutilized product from biomass hydrolysis, is another lignocellulosic feedstock that could be successfully employed for biochemical production. A major challenge in the utilization of lignin is the variety of aromatic compounds that are present in this material. Thus, engineering the metabolism of fungi is required to depolymerize and convert complex aromatic macromolecules of lignin to a utilizable resource (Beckham et al., 2016; Xie et al., 2016). In this sense, Yaegashi et al. (2017) described the capacity of R. toruloides to consume aromatic compounds as lignin components and suggested that this organism has the metabolic potential to convert depolymerized lignin into lignocellulosic sugars. Moreover, Mahan et al. (2017) and Liu Z. H. et al. (2018) described ways to improve lignin utilization by Rhodococcus opacus for the production of oleaginous compounds, with the use of pretreatments and fermentation that favor lignin hydrolysis. It would be interesting to apply these techniques in fungal engineering with the goal of lignin utilization.

#### Construction of Resistant Strains

One of the first steps in the creation of cell factories to produce fine chemicals from biomass is to overcome metabolite inhibition. Several metabolites generated by hydrolysis and further assimilation of substrates can inhibit the metabolism of the microorganisms and even block the production of the desired substances (Zhang et al., 2015). Especially, regarding lignocellulosic biomass, even more challenges exist because the biomass is a complex of carbon backbones whose hydrolysis could generate furan derivatives, phenolic compounds, and weak organic acids that could behave as fermentation inhibitors. Furthermore, inhibitor substances could also be generated as intermediates of the reaction or as products (Ling et al., 2014). This highlights the need to create resistant strains as an indispensable strategy to improve the entire biotechnological process.

Another problem during biomass fermentation is the acidification of the growth medium due to the acidic pretreatment of the biomass (Fletcher et al., 2017). In this regard, Chen et al. (2016) examined transcriptional responses from S. cerevisiae during fermentation stress and acidic environments, given that lignocellulosic material often undergoes acidic pretreatment. By identifying genes that are induced under acidic conditions, the authors reported that the simultaneous overexpression of a protein related to acid resistance (Sfp1) and a protein related to the general stress response (Whi2) increased yeast performance and ethanol productivity, even in highly acidic media.

As another example, González-Ramos et al. (2016) used a directed evolution approach to identify S. cerevisiae mutants that tolerated different concentrations of acetic acid. Additionally, whole-genome sequencing of the tolerant strains was performed to identify the mutations contributing to this phenotype, with the aim of identifying genes worthy of further study with regard to acid resistance. In a different approach, Ma et al. (2015) developed an S. cerevisiae mutant strain tolerant to acetic acid by transforming it with a synthetic zinc finger protein transcription factor (ZFP-TF) library, and they identified genes that could be related to this acetic acid resistance phenotype. The authors also observed that the glucose consumption rates and ethanol productivity in media containing acetic acid were increased in the mutants when compared with the wild type. In another study, Rajkumar et al. (2016) engineered a yeast synthetic promoter that was responsive to low pH. For this, they improved the existing low-pH responsiveness of the YGP1 promoter by altering its TFBSs and selecting the best responsive system. As a result, the engineered promoter was associated with a 10-fold improvement in the transcription rate when compared with a commonly utilized promoter. Therefore, the intersection between this synthetic promoter technique and the aforementioned directed evolution and engineering strategies could assist in the identification of a microbe that can thrive in highly acidic conditions.

As mentioned before, the sensitivity to the product generated by a metabolic process is another bottleneck that needs to be circumvented for industrial purposes. For example, the production of fatty acids by S. cerevisiae leads to toxic effects that include membrane and mitochondria disruption and oxidative stress. Besada-Lombana et al. (2017) engineered S. cerevisiae strains with increased tolerance to octanoic acid, whose presence compromises plasma membrane integrity. For this, the authors cloned a copy of the acetyl-CoA carboxylase gene carrying the S1157A mutation to increase the concentration of oleic acid in the plasma membrane. The engineered strains displayed increased tolerance to octanoic acid, n-butanol, 2-propanol, and hexanoic acid, indicating that this strategy can be a promising tool to improve tolerance to products from metabolic processes.

Seeking novel means for engineering fungal metabolism has always been a challenge, and emerging techniques are trying to explore the stress response in the organisms of interest to improve metabolic tolerance. In this sense, Li et al. (2017) investigated the ethanol stress response mechanism of S. cerevisiae by RNA-seq to understand how the cells can be manipulated to overcome ethanol sensitivity. In the landscape of unconventional organisms, this understanding could allow the exploitation of the yeast biodiversity to generate new tools for metabolic engineering of industrial domesticated strains. Mukherjee et al. (2017) have described some unconventional yeasts, such as P. kudriavzevii and Wickerhamomyces anomalus, which have an elevated tolerance to stress factors present in fermentation processes. Studies like this are extremely relevant to the development of new tools for the metabolic engineering of fungi, and they will contribute to the improvement of biomass usage and generation of high value-added products in the diverse scope of biotechnological processes.

### COMPARTMENTALIZATION OF PATHWAYS AND PROTEIN SCAFFOLDS

As proof that compartmentalization is important to optimize metabolic pathways, an efficient strategy to improve the use of acetyl-CoA focused on targeting the pathway proteins to the mitochondrial matrix. In this context, Yuan and Ching (2016) exploited the compartmentalization of acetyl-CoA utilization pathways by taking advantage of the subcellular metabolism of S. cerevisiae to avoid competing intermediates. They were able to produce amorphadiene from the mitochondrial acetyl-CoA pool by using a plasmid to overexpress amorphadiene synthase in the mitochondria. The production was considerably more efficient in this case (about 80% higher) than when the same enzyme was expressed in the cytosol, demonstrating that the mitochondrial matrix can be a desirable environment to redirect metabolic pathways. Furthermore, a pathway for isobutanol production was also improved by localizing biosynthetic enzymes to the mitochondria of S. cerevisiae. Park et al. (2016) increased the pool of mitochondrial pyruvate by overexpressing the subunits of the hetero-oligomeric mitochondrial pyruvate carrier. Higher titers of isobutanol were evident in this case when compared with the wild type and with previous reports.

As another important example, DeLoache et al. (2016) proposed a redesign of the metabolic flux in the yeast peroxisome. The authors engineered mutant yeast strains in which the peroxisomes became "synthetic organelles," meaning they could receive heterologous proteins and control the influx of substrates. This study not only developed novel in vivo assays to test cargo import and to measure membrane permeability, but it also identified a modular tag for peroxisome localization. The findings showed that this methodology has the potential to redirect metabolic flux in yeast species, including S. cerevisiae and P. pastoris. For an extensive review on the localization of heterologous proteins to different yeast cell compartments, please refer to Hammer and Avalos (2017).

Synthetic protein scaffolds, on the other hand, interact with the enzymes of natural pathways via peptide ligands, colocalizing them, and assembling them into organized clusters. This enables the direct transfer of substrates from one active site to another, which is termed substrate channeling (Wheeldon et al., 2016). In this sense, Wang and Yu (2012) used this concept to construct nine synthetic scaffold proteins to enhance the overall cascade catalysis of the resveratrol biosynthesis pathway in S. cerevisiae. Resveratrol is an antioxidant that is of interest to pharmaceutical companies. The authors successfully produced this compound using p-coumaric acid that was derived from lignin. The study revealed that the number of binding domains can affect the flux through the biosynthetic pathway.

In an even more sagacious study, Lin et al. (2017) expressed synthetic scaffold proteins from the ethyl acetate biosynthesis pathway in S. cerevisiae with a localization tag to lipid droplets, based on the idea that co-localization of enzymes would provide kinetic advantages, and balance the metabolic flux and substrate channeling. In fact, there was an increase in the metabolic rates due to the nanometer spacing between the co-localized enzymes, even in the presence of competing substrates. These findings indicated that compartmentalization along with organization and spatial distribution must be considered while developing tools to engineer fungal metabolism.

### SYNTHETIC BIOLOGY TOOLS FOR REWIRING CARBON METABOLISM

The flow of metabolites through metabolic pathways is termed metabolic flux (Venayak et al., 2015). The flux must be tightly controlled to achieve the three most important characteristics of an industrial strain: yield, titer, and productivity (Venayak et al., 2015; Campbell et al., 2017). Three strategies are being applied to organize the flux of metabolites and to improve host productivity. The first is the rewiring of metabolic flux. The second is the compartmentalization of pathways. The third is the construction of synthetic protein scaffolds.
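For readers less familiar with the three strain characteristics named above, the following minimal Python sketch shows how yield, titer, and productivity are conventionally computed from end-point fermentation data. The function name and the numbers in the usage example are hypothetical and are not taken from any study cited in this review.

```python
def fermentation_metrics(product_g, substrate_consumed_g, volume_l, hours):
    """Compute titer, yield, and volumetric productivity from end-point data."""
    titer = product_g / volume_l                    # g of product per liter of broth
    product_yield = product_g / substrate_consumed_g  # g product per g substrate
    productivity = titer / hours                    # g per liter per hour
    return {
        "titer_g_per_l": titer,
        "yield_g_per_g": product_yield,
        "productivity_g_per_l_h": productivity,
    }


if __name__ == "__main__":
    # Hypothetical run: 20 g product from 100 g sugar in 2 L over 40 h.
    print(fermentation_metrics(20.0, 100.0, 2.0, 40.0))
```

The three values can move independently of one another; for example, a slow-growing strain may reach a high yield while showing poor productivity, which is one reason flux control strategies target them jointly.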

Rewiring of the metabolic flux can be achieved by deleting genes from the host that, somehow, inhibit the production or accumulation of the desired compound and/or by overexpressing the required pathway enzymes, aiming at the accumulation of a specific product. The advantage of engineering protein scaffolds is the presence of modular protein-protein interaction domains that can be used to adjust the stoichiometry of a given complex in a pathway, thus, allowing the fine-tuning of the metabolic flux (Dueber et al., 2009). With regard to compartmentalization, even though bacteria use the cytoplasm for most of their metabolic reactions, while working with fungal strains, we can take advantage of the numerous subcellular compartments available in their organelles and the cell wall (Hammer and Avalos, 2017). Thus, the metabolic flux toward subcellular compartments can also be engineered to redirect chemical production to beneficial conditions, using several strategies that include the engineering of protein localization tags (Campbell et al., 2017). Another issue that can be overcome by both compartmentalization and engineered protein scaffolds is the loss of intermediates to competing pathways (Wheeldon et al., 2016).

Acetyl-CoA metabolism has been extensively studied, since a number of compounds can be derived from this substrate. These include fatty acids (for biodiesel), polyketides (for antibiotics), and terpenoids, such as ß-carotene and amorphadiene (Yuan and Ching, 2016; Campbell et al., 2017). The significantly regulated central carbon metabolism has been targeted, recently, to increase the metabolic flux toward these metabolites (Campbell et al., 2017; Marella et al., 2018). An interesting design to optimize the use of acetyl-CoA in S. cerevisiae cells was adopted by Zhou et al. (2016). The authors constructed a cell chassis in which the fatty acid reactivation pathway was disrupted to stop the inhibition of fatty acid biosynthesis. Subsequently, they built a synthetic chimeric citrate lyase pathway to improve the supply of acetyl-CoA. Using this approach, very high yields of free fatty acids were obtained, with amounts reaching up to 10.4 g/L in a fed-batch fermenter. The free fatty acids were further converted to important oleochemicals, such as alkanes and fatty alcohols. An interesting aspect of the construction of this cell factory is that all constructs were integrated into the genome with no plasmid expression, which is appealing for industrial purposes.

Y. lipolytica is a distinctive oleaginous yeast strain whose metabolism is also being widely studied for its high capacity to produce cytosolic acetyl-CoA (Marella et al., 2018). Xu et al. (2016) rewired the central carbon metabolic pathway by overexpressing some alternative routes for the formation of acetyl-CoA, using plasmids containing the ePathBrick technology, a synthetic biology tool that is being used widely in the design of biosynthetic pathways. The authors were able to decouple nitrogen starvation from lipogenesis, enabling the biosynthesis of oleochemicals during the exponential growth phase. Interestingly, overexpression of a protein that exports acetyl-CoA from the mitochondria to the cytosol markedly improved lipid accumulation. Friedlander et al. (2016) further explored the Y. lipolytica lipid pathway by examining when acetyl-CoA is converted to acyl-CoA. The authors overexpressed two heterologous enzymes (DGA1 and DGA2) responsible for the incorporation of acyl-CoA onto the diacylglycerol backbone to synthesize TAGs. This, along with the deletion of a lipase regulator, increased the lipid content to 77%. Both studies reported the efficiency of using synthetic pathways to increase the production of fatty acids by Y. lipolytica, which can later be converted to oleochemicals that are important for industrial applications, including biodiesel, detergents, and bioplastics, among others (Marella et al., 2018).

Another industrially important molecule produced by Y. lipolytica is ß-carotene, which is widely used as a color additive and nutritional supplement (Gao et al., 2017). In this study, the authors redirected the carbon flux from acetyl-CoA to ß-carotene by fine-tuning the expression of 11 synthetic genes modulated by strong promoters. For this, a multiple-copy integration strategy was used, and the final product yield was surprisingly higher when compared with the organisms that are commonly used to produce ß-carotene industrially.

R. toruloides is a yeast strain that is growing in importance in the field of metabolic engineering. Remarkable results have been obtained in the production of fine chemicals through the process of degradation of complex substrates, such as xylose and aromatics, derived from lignin. Yaegashi et al. (2017) reported the high-level production of two terpenes by R. toruloides. Bisabolene is the precursor of bisabolane, which is a biosynthetic alternative to D2 diesel. Amorphadiene is the precursor of the antimalarial drug artemisinin. The production of these compounds was achieved by taking advantage of the large amounts of acetyl-CoA present in the cytosol of this organism and by expressing codon-optimized versions of the genes encoding bisabolene synthase (bis) and amorphadiene synthase (ads). Remarkably, the production of these chemicals increased when the organism grew on lignocellulose hydrolysates compared with purified substrates. A brief overview of the systems and synthetic biology strategies used for rewiring and dynamic control of metabolism in fungi is provided in **Figure 3**.

### DYNAMIC CONTROL THROUGH GENETIC CIRCUITS

Even after considering the aforementioned accomplishments, it is important to note that there is still a huge difference between static and dynamic metabolism. Biological systems are much more complex than a physical, steady system. Results are hard to reproduce and, thus, hard to predict. This is especially true if we only consider a limited part of the system, such as enzymes in a metabolic pathway (Chubukov et al., 2016). Those components are not static, as their levels always change in response to changes in the cellular environment. Synthetic biology can explore the fine-tuning of key metabolic steps through genetic circuits, which can optimize cell factories and bypass any given limitation. This can also be seen in the light of a systems biology approach, not only for coupling chemical production with growth but also because one can use stoichiometric modeling to help in host engineering (Venayak et al., 2015; Chubukov et al., 2016; Teixeira et al., 2017).

One constraint that can be circumvented by the use of genetic circuits is the trade-off between the rate of fungal biomass production and the amount of the desired products generated (and also of all undesired by-products). One study that took advantage of genetic circuits was performed by Williams et al. (2015). The authors constructed an ON-OFF circuit to overcome the metabolic burden by separating growth and compound production. This system was based on quorum sensing, using a pheromone and RNAi methodology. In this system, genes related to the production of the compound of interest were initially kept OFF, while genes related to growth were ON. When the population reached an optimal growth, genes related to biomass production were knocked down by RNAi while those related to the metabolic pathway of interest were activated. The authors used this strategy to increase the yield of p-hydroxybenzoic acid (PHBA) in S. cerevisiae and reported the highest PHBA yield ever achieved in yeast. ON-OFF genetic circuits can also be built by adding responses to temperature or to other inducers, such as isopropyl β-D-1-thiogalactopyranoside (Venayak et al., 2015).
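The logic of such a two-phase ON-OFF circuit can be illustrated with a toy simulation. This is only a minimal sketch and not the model used by Williams et al. (2015): the logistic growth law, the density-proportional quorum signal, the function name `simulate_on_off`, and all parameter values are assumptions chosen for illustration.

```python
def simulate_on_off(steps=200, dt=0.1, threshold=0.8):
    """Toy two-phase culture: grow first, then switch to production.

    All rates and units are arbitrary, assumed values (not measured data).
    """
    biomass, product = 0.01, 0.0
    capacity = 1.0          # carrying capacity of the culture (assumed)
    growth_rate = 1.0       # specific growth rate (assumed)
    production_rate = 0.5   # product formed per unit biomass and time (assumed)
    switched_at = None      # step at which the circuit flipped, if ever

    for step in range(steps):
        # The quorum signal is assumed to be proportional to cell density.
        signal = biomass / capacity
        if switched_at is None and signal >= threshold:
            switched_at = step  # growth genes OFF, production genes ON
        if switched_at is None:
            # Growth phase: logistic growth, no product formation.
            biomass += dt * growth_rate * biomass * (1 - biomass / capacity)
        else:
            # Production phase: biomass is held constant, product accumulates.
            product += dt * production_rate * biomass
    return biomass, product, switched_at
```

With the default values the culture switches once the signal crosses the threshold, whereas a threshold that is never reached (for example `threshold=2.0`) leaves the production genes OFF and yields no product.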

Continuous genetic circuits can also be constructed for the dynamic control of target metabolic pathways, coupling gene expression to the sensing of a specific metabolite. Metabolite sensors allow the circuit to respond in accordance with the cellular environment so that they become sensitive to variations. As an example, Xu et al. (2014) developed a continuous circuit triggered by malonyl-CoA concentrations. The promoters that were activated by malonyl-CoA induced a consumption pathway, while the same substrate repressed a production pathway. Thus, this intermediate compound was used to regulate the entire fatty acid biosynthetic pathway.
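The feedback principle behind a metabolite-responsive circuit can be sketched with Hill functions. The example below is a hedged illustration and not the model of Xu et al. (2014): it assumes that the pathway supplying the sensed metabolite is repressed by it, that the pathway consuming it is activated by it, and that both follow Hill kinetics with the hypothetical parameters `k`, `n`, `v_source`, and `v_sink`.

```python
def hill_activation(x, k, n):
    """Fraction of maximal expression for a promoter activated by x."""
    return x**n / (k**n + x**n)


def hill_repression(x, k, n):
    """Fraction of maximal expression for a promoter repressed by x."""
    return k**n / (k**n + x**n)


def simulate_sensor_circuit(m0, steps=2000, dt=0.01,
                            v_source=1.0, v_sink=1.0, k=1.0, n=2):
    """Euler integration of the metabolite level m under sensor feedback.

    The source pathway is repressed by m and the sink pathway is
    activated by m; all parameter values are assumed, arbitrary units.
    """
    m = m0
    for _ in range(steps):
        source = v_source * hill_repression(m, k, n)
        sink = v_sink * hill_activation(m, k, n)
        m = max(m + dt * (source - sink), 0.0)  # concentrations stay non-negative
    return m
```

In this sketch the metabolite level settles at the same set point whether the simulation starts from a low or a high concentration, which is the buffering behavior expected from a continuous sensor-based circuit.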

Another strategy for the dynamic control of metabolic pathways was employed by Teixeira et al. (2017). In this case, the authors studied the production of fatty alcohols from free fatty acids. For this, they dynamically expressed the fatty acyl-CoA synthase gene faa1 under the control of different promoters to prevent the accumulation of free fatty acids in industrial mutant strains, thus avoiding the loss of precursors to the extracellular medium. This approach enhanced the production of fatty alcohols and expanded the knowledge regarding the control of metabolite flux of this pathway. This is an excellent example of the use of dynamic control to increase industrial production. Venayak et al. (2015) supplied other examples to explain the benefits and drawbacks of this approach. Algorithms for stoichiometric metabolic models are currently available to help with the understanding of how the cell network contributes to the yield of the final product. Some examples are OptStrain and OptForce, which can be used to find additional reactions that can be targeted and to identify the pathways that need to be engineered. In this sense, these algorithms can be applied to guide host engineering approaches and to enhance the benefits of building genetic circuits (Chubukov et al., 2016). In general, dynamic regulatory circuits can be combined with all the previously cited tools and strategies to provide a more refined and productive mutant fungal strain to address the biotechnological demand in industrial processes.

### SYNTHETIC BIOLOGY APPLIED TO SECONDARY METABOLISM

In the last 15 years, the research community has used two approaches to produce fine chemicals in fungi. One approach is the implementation of new metabolic pathways through recombinant DNA techniques. The second approach is the engineering of existing pathways to enhance the yield and purity of existing metabolites. Although we focus on the production of fine chemicals in this review, molecules with low aggregated value and high-volume production have also been a point of interest in recent studies, as summarized in **Figure 4**.

The bioproduction of lipid molecules in fungi can provide novel renewable and sustainable material for the production of food ingredients independent of plant cultivation, climate changes, and seasonal availability. Lipids extracted from plants, such as the cocoa tree (Theobroma cacao), can be utilized for industrial purposes. The lipids extracted from T. cacao beans are the basic component of cocoa butter, a valuable ingredient that is becoming increasingly scarce in the market. In this sense, Wei et al. (2018) reported the application of rational metabolic engineering to produce a cocoa butter-like (CBL) product in yeast. To obtain a strain producing a CBL product, the authors precisely determined the genes that are responsible for the production of the triacylglycerols (TAGs) that compose cocoa butter, so that these genes could be functionally expressed in S. cerevisiae. The main components of this triacylglycerol product are 1,3-dipalmitoyl-2-oleoylglycerol, 1-palmitoyl-3-stearoyl-2-oleoylglycerol, and 1,3-distearoyl-2-oleoylglycerol. These TAGs do not naturally accumulate in large amounts in S. cerevisiae. Wei et al. (2018) proposed that the enzymes responsible for the synthesis of TAGs from T. cacao could be cloned into S. cerevisiae, favoring the production of the adequate TAGs to obtain the CBL product.

To investigate this hypothesis, the authors cloned cocoa glycerol-3-phosphate acyltransferase, lysophospholipid acyltransferase, and diacylglycerol acyltransferase in S. cerevisiae through Gibson assembly into the pBS01A plasmid (Wei et al., 2018). Interestingly, some of those genes were amplified from cocoa cDNA and showed different sequences from the previously annotated genes. Additionally, a valuable use of this approach is the possibility of directly characterizing and testing the activity of plant genes in yeast. As a means to optimize the production of CBL, the authors combined the expression of the cocoa genes with two previously reported genes for the production of CBL compounds (Wei et al., 2017b). The resulting strains were able to produce significantly higher amounts of the TAGs of interest than the control strain, with an increase in total fatty acid production of up to 84%.

#### Alkaloids

Alkaloids play a significant role in human health, and these are structurally and functionally diverse molecules. Their health applications are diverse, ranging from the treatment of pain (opioids) to cancer (vinblastine and vincristine). Owing to their structural complexity, most of those important chemicals are produced through extraction or semi-synthesis from plant species. Addressing problems like seasonal variations in crop yields and batch variation of Papaver somniferum, a plant that produces morphine and other opioids, Galanie et al. (2015) developed a proof-of-concept synthetic circuit composed of more than 20 genes that can produce opioids in S. cerevisiae. Similar advances were made by Li and Smolke (2016) for the production of noscapine, an anticancer drug isolated from P. somniferum.

The assembly and production of these molecules in yeast allowed the establishment of new sources for these key molecules. They also enable researchers to use yeast as a platform for enzyme engineering, to generate new tools that can be used in drug discovery. The study conducted by Li and Smolke (2016) also raised important concerns regarding the safety and ethics of producing psychoactive drugs in an easy manner in organisms such as S. cerevisiae. Although the yield obtained by Galanie et al. (2015) was low (which makes licit and illicit applications of this technology unlikely), advances in the yeast biosynthesis of other non-alkaloid psychoactive compounds such as Δ9-tetrahydrocannabinolic acid (THCA), a precursor of the main psychoactive Cannabis constituent tetrahydrocannabinol (THC) (Zirpel et al., 2015), have raised concerns about the safety of microbial production of addictive active pharmaceutical ingredients.

#### Nonribosomal Peptides

The production of nonribosomal peptides (NRP) is of interest for their use in synthetic biology approaches, since the modularity of NRP synthetases (NRPSs) allows enhanced compound production or the synthesis of entirely new structures. A milestone in the application of synthetic biology to produce NRP using S. cerevisiae was reported by Awan et al. (2017). They managed to express the whole biosynthetic pathway of benzylpenicillin in this yeast, thus, validating a screening method for antibiotic NRP production. The benzylpenicillin biosynthetic pathway consists of five enzymes (encoded by pcbAB, npgA, pcbC, pclA, and penDE). pcbAB and npgA are responsible for the synthesis of the benzylpenicillin (PEN) precursor, amino-adipylcysteine-valine (ACV), while the products of pcbC, pclA, and penDE are responsible for the conversion of ACV to PEN.

Although the heterologous production of ACV in S. cerevisiae and of PEN in Hansenula polymorpha was previously reported (Gidijala et al., 2009; Siewers et al., 2009), the approach used to express the whole PEN pathway in S. cerevisiae provided a simple, inexpensive, yet powerful platform to screen relevant NRP pathways. This approach enabled the development of new NRP through the combinatorial assembly of NRPS pathways and construction of chimeras (Awan et al., 2017).

#### Flavonoids

Flavonoids are a class of polyphenolic compounds produced by plants. They have a variety of uses in modern and traditional health practices, with benefits including antimicrobial and antioxidant activities (Marín et al., 2015; Skrovankova et al., 2015). Breviscapine is a flavonoid extract that is used in Chinese medicine. This extract is composed mainly of two molecules, scutellarin and apigenin 7-O-glucuronide. It is obtained by extraction from plant tissues of Erigeron breviscapus, and supply has become scarce as its popularity has increased over the past 30 years. Therefore, new methodologies to produce the active molecule are needed to ensure the supply of this active pharmaceutical ingredient. In an attempt to produce the main components of breviscapine, Liu X. et al. (2018) identified and expressed components of the breviscapine pathway in yeast, and they were able to produce the flavonoids scutellarin and apigenin 7-O-glucuronide from glucose. Their success exemplifies the potential of synthetic biology as a metabolic pathway elucidation tool, which, in this example, helped the researchers to elucidate the breviscapine biosynthetic pathway. This example shows how synthetic biology can have a huge impact on the biosynthesis of flavonoids, thereby establishing a constant supply chain of such natural products through fungal bioproduction.

#### Glycosides

The burgeoning prevalence of metabolic syndrome, obesity, and diabetes is increasing the need for alternatives to sugar. Steviosides are safe sweeteners that are extracted from the leaves of Stevia rebaudiana. They are glucosides composed of diterpenoids that are covalently bonded to three glucose molecules. A side effect of those sweetener molecules is a bitter off-flavor. In this sense, researchers from the biotechnology company Evolva (Olsson et al., 2016) developed strains of S. cerevisiae that produce novel next-generation steviosides with a reduced bitter taste. Through homology modeling and identification of key target amino acids present in the glucosyltransferase UGT76G1, the authors were able to obtain S. cerevisiae strains with increased accumulation of rebaudioside D and M, steviosides with a less bitter taste and enhanced sweetness. Their success indicates the potential of this approach for yeast metabolic engineering of sugar replacements.

### EUKARYOTIC PROMOTERS AND TRANSCRIPTION FACTORS: THE BLOCKS TO CONTROL GENE EXPRESSION

TFs and TF motifs can directly or indirectly drive transcription. These proteins control transcription by transforming physiological and environmental signals into patterns of gene expression (Weingarten-Gabbay and Segal, 2014), thus acting as biosensors to turn transcription ON or OFF (D'Ambrosio and Jensen, 2017). Transcription factors recognize specific sequences in the DNA, collectively known as TF binding sites (TFBSs). With the aid of computational approaches, TFBSs can be represented as position weight matrices (PWMs), which describe the statistical preference or the binding energy of the DNA-protein interaction, depending on the type of data from which the PWM was derived (Schipper and Gordân, 2016). Additionally, more comprehensive models that include Bayesian networks or support vector machine-based models have been reported recently, which are reviewed in Boeva (2016).
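As an illustration of the PWM representation, the sketch below builds a log-odds matrix from a handful of aligned binding sites and uses it to scan a longer sequence. It is a minimal example under stated assumptions: the sites are invented for illustration, the background is assumed to be uniform (0.25 per base), a pseudocount of 1 is added to every count, and the function names `build_pwm`, `score`, and `best_hit` are hypothetical.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds PWM from equally long, aligned binding sites."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + 4 * pseudocount
    pwm = []
    for pos in range(length):
        column = [site[pos] for site in sites]
        # Log-odds of the smoothed base frequency versus the background.
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in BASES
        })
    return pwm


def score(pwm, seq):
    """Sum the per-position log-odds for a sequence as wide as the PWM."""
    return sum(column[base] for column, base in zip(pwm, seq))


def best_hit(pwm, sequence):
    """Return (score, start) of the best-scoring window in the sequence."""
    width = len(pwm)
    windows = [
        (score(pwm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    ]
    return max(windows)


# Invented TATA-like sites, used only to demonstrate the calculation.
example_sites = ["TATAAA", "TATAAA", "TATATA", "TATAAG"]
example_pwm = build_pwm(example_sites)
```

Calling `best_hit(example_pwm, "GGGCTATAAAGGC")` returns the window that matches the consensus of the example sites. Real PWMs are typically derived from many more sites and from a genome-specific background model.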

The discovery of new TF motifs is a step forward in understanding gene regulation, which can serve as a basis for new bioengineering applications. However, finding TFBS motifs is a difficult task, since the primary nucleotide sequence is not the only characteristic that specifies a TF target. Additional factors, such as multiple modes of DNA binding, DNA modification, DNA shape, genomic context, and coding and noncoding (genetic) variation can change TF nucleotide sequence preferences, as reviewed in Inukai et al. (2017) and Siggers and Gordân (2014).

Studies using high-throughput data regarding TF binding specificity are another source of valuable information on parameters affecting gene regulators. In this sense, Gordân et al. (2011) suggested that several TFs of S. cerevisiae have a primary and secondary binding motif, which may even perform distinct regulatory functions. These findings indicate a property that could be further explored in motif search algorithms that focus on yeasts. Even with many experimental approaches that generate high-throughput data, it is impossible to test all the environmental conditions that a natural regulatory network is able to respond to. Despite this limitation, several methods and approaches have been used to identify motifs in high-throughput data. In this regard, computational motif discovery tools currently used to deal with large datasets (like those generated from ChIP-seq, SELEX, or ChIP-chip assays) include HMS (Hu et al., 2010), cERMIT (Georgiev et al., 2010), HOMER (Heinz et al., 2010), diChIPMunk (Kulakovskiy et al., 2013), MEME-ChIP (Machanick and Bailey, 2011), rGADEM (Mercier et al., 2011), POSMO (Ma et al., 2012), XXmotif (Hartmann et al., 2013), FMotif (Jia et al., 2014), Dimont (Grau et al., 2013), and DeepBind (Alipanahi et al., 2015). In particular, rGADEM, HOMER, POSMO, Dimont, and ChIPMunk are tools with good performance (Boeva, 2016; Jayaram et al., 2016).

#### Promoters

Bioinformatics research is being allied with synthetic biology for the discovery of new modules and their interactions. Yeast promoters have motifs/sequences that are necessary for promoter function and for assembly of the preinitiation complex, which leads to the recruitment of the entire cellular transcription machinery (Thomas and Chiang, 2006). The core elements in yeast promoters are nucleosome-free regions located approximately 140 base pairs (bp) at the beginning and at the end of the genes that are rich in adenine (A) or thymine (T). Additionally, transcription start sites (TSSs), TATA-box, and upstream activation sites located several hundred bp upstream of the TSS are required. Analogously, upstream repressing sequences can also be present in natural promoters (Venters and Pugh, 2009; Sesma and von der Haar, 2014).

Promoter prediction is a hurdle. Transcriptional initiation is the first step in gene expression and is therefore an important control point. Despite its importance, eukaryotic promoter prediction is not simple because of the structural complexity of natural cis-regulatory elements (Pedersen et al., 1999; Yella and Bansal, 2017). In recent years, promoter prediction tools have improved, driven by new high-throughput data generated by next-generation sequencing and by the application of machine learning methods, such as support vector machines, neural networks, and naïve Bayesian classifiers (Singh et al., 2015). However, no precise prediction tools are available to date. Moreover, several software packages rely on eukaryotic promoter characteristics that are specific to a given species or to animals, so each tool works with its own methodology and precision.

Owing to the complexity of promoter prediction in general, we present below some useful prediction tools that have been applied in fungi/yeast or that could be adapted to these organisms. For example, Shahmuradov et al. (2017) attained better performance than previous programs by using a novel prediction tool for plant promoters, which was named TSSPlant. This tool was created by using large promoter collections from plant promoter databases to identify the most relevant promoter cis-regulatory elements. This was followed by the use of a neural network with back propagation to create a promoter classifier, thus allowing an improvement in accuracy. Another notable approach to solve the prediction bottleneck was described by Umarov and Solovyev (2017). The authors developed a general method for the recognition of promoters by constructing a predictive model with convolutional neural networks. To test the method and to prove its universality, the authors used promoter sequences from distinct organisms (bacteria, human, mouse, and plant). For each organism, the method achieved the best accuracy when compared with the existing tools.

In addition to the aforementioned prediction methods that deal with pre-existing data, several studies have used transcriptomics to find putative TSSs. With specific RNA-seq protocols to obtain 5′ regions (Gowda et al., 2007; de Hoon and Hayashizaki, 2008), the reads can be compared with the available genome annotation data. Analysis of nearby regions could then be performed using alignment and conventional motif discovery tools (discussed in detail below). This approach provided useful information about the core promoters of Schizosaccharomyces pombe and A. nidulans (Sibthorp et al., 2013; Li et al., 2015). These studies highlight a relevant experimental/computational strategy to obtain data regarding regulatory elements in fungal promoters. Yet, since these studies, no recent reports about fungal promoter prediction tools have been published. Hence, we suggest that these aforementioned notable approaches could be integrated to aid the creation of new software and new pipelines of promoter discovery in these biotechnologically relevant organisms (**Figure 5**).

### Bioinformatic Approaches for the Identification of TFBS

Considering current efforts in analyzing large amounts of data, Yu et al. (2016) proposed a new algorithm, PairMotifChIP, for this purpose. This tool identifies motifs by extracting and combining pairs of l-mers (substrings of width l) in the input sequences that have a small Hamming distance, distinguishing the motifs from randomly overrepresented sequences by probabilistic analysis, and then combining the remaining sequences to form motifs. This tool runs very fast and does not require prior information from the user (Yu et al., 2016). Caldonazzo Garbelini et al. (2018) created a new approach for motif discovery by making use of a genetic algorithm to escape from local optima. The algorithm was designated as Memetic Framework for Motif Discovery (MFMD). The study made use of a version of the semi-greedy heuristic to build the initial solution population and genetic algorithms (as a global optimizer) to develop these initial solutions (Caldonazzo Garbelini et al., 2018). Another tool that was also built to escape from local optima is weakly-supervised motif discovery (WSMD). The WSMD algorithm uses a latent support vector machine optimization strategy and learns PWMs in a continuous space, which reduces the loss of information. Thus, the quality of the motifs is improved (Zhang et al., 2017). All of these studies feature a comparative analysis with the currently used motif discovery tools and seem to have better performance.
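The pairing step described for PairMotifChIP can be illustrated with a short sketch that collects pairs of l-mers from different input sequences whose Hamming distance is small. This is not the published implementation: the brute-force search, the function names `hamming`, `lmers`, and `close_pairs`, and the distance cutoff `max_dist` are assumptions, and the probabilistic filtering and merging steps are omitted.

```python
from itertools import combinations


def hamming(a, b):
    """Number of mismatching positions between two equally long strings."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def lmers(seq, l):
    """All substrings of width l in seq, in order of occurrence."""
    return [seq[i:i + l] for i in range(len(seq) - l + 1)]


def close_pairs(sequences, l, max_dist):
    """Pairs of l-mers from different sequences within max_dist mismatches.

    Returns tuples (index of sequence 1, l-mer 1, index of sequence 2, l-mer 2).
    """
    pairs = []
    for (i, s1), (j, s2) in combinations(enumerate(sequences), 2):
        for a in lmers(s1, l):
            for b in lmers(s2, l):
                if hamming(a, b) <= max_dist:
                    pairs.append((i, a, j, b))
    return pairs
```

For two invented sequences such as `"TTTACGTACTT"` and `"GGACGAACGG"`, `close_pairs(..., l=6, max_dist=1)` reports the pair `ACGTAC`/`ACGAAC`, which differs at a single position. The exhaustive comparison shown here scales quadratically with the number of l-mers and serves only to make the idea concrete.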

While these tools may help in studies to overcome limitations in motif discovery, working with high-throughput data remains a bottleneck. To deal with this, Guzman and D'Orso (2017) developed CIPHER, a framework that integrates complex bioinformatics tools to analyze different types of next-generation sequencing data (e.g., ChIP-seq, RNA-seq, DNase-seq). This tool provides several types of analyses that include differential gene expression, peak annotation, and reasoning (with HOMER) in an easy-to-use way. This tool also allows parallelization of processing for the better use of local hardware, as well as a quality control module to identify possible errors, contaminations, and bias in input sequences, which generates accurate results (Guzman and D'Orso, 2017).

#### CONCLUSIONS

Systems and synthetic biology are playing a pivotal role in the development of tools for engineering yeast and filamentous fungi. Studies based on them have shed light on how single modules can influence an entire and robust system. Beyond the particularities of each tool presented in this review, the approaches to engineer fungi as cell factories for biotechnological industrial processes can be successfully combined to guarantee a viable and cost-efficient strategy. The tools described here can cover a wide area of applications, substantially improving the already existing methodologies to engineer these organisms. Still, despite the great promises of synthetic biology, the genetic manipulation of filamentous fungi remains one of the major challenges for industrial applications. The difficulty of manipulating their genomes still hinders the use of metabolic engineering for filamentous species. However, synthetic biology approaches used for yeast are significantly contributing to the spread of methodologies and tools for engineering microbes to produce high value-added products at an achievable cost and to benefit equipoise. In general, synthetic biology approaches have presented successful examples of management and the acquisition of mutant strains that meet industrial demand. Finally, we anticipate that novel computational tools, especially for the investigation and design of regulatory elements, will play a pivotal role in future engineering attempts of these remarkable organisms.

#### AUTHOR CONTRIBUTIONS

LM-S and RS-R conceived the work. LM-S, LN, AS-M, GL, MC, and RS-R assembled the first draft of the manuscript. LM-S and RS-R revised the final version of the work. All the authors have read and approved the final version of the manuscript.

#### FUNDING

RS-R was supported by the São Paulo Research Foundation (FAPESP, award number 2012/22921-8). LM-S was supported by FAPESP PhD Fellowship (award number 2017/17924-1). LN was supported by FAPESP Master Fellowship (award number 2016/03763-3). AS-M, GL, and MC were supported by FAPESP Scientific Initiation Fellowship (award numbers 2015/22386-3, 2016/11093-8, and 2017/04217-5).

#### ACKNOWLEDGMENTS

The authors would like to thank their lab colleagues for their insightful discussion about this work.

#### REFERENCES

expression, and network inference. BioTechniques 44(Suppl.), 627–632. doi: 10.2144/000112802

cerevisiae by expression of selected cocoa genes. AMB Express 7, 1–11. doi: 10.1186/s13568-017-0333-1


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer EB and handling Editor declared their shared affiliation.

Copyright © 2018 Martins-Santana, Nora, Sanches-Medeiros, Lovate, Cassiano and Silva-Rocha. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Modulating Transcriptional Regulation of Plant Biomass Degrading Enzyme Networks for Rational Design of Industrial Fungal Strains

#### Ebru Alazi and Arthur F. J. Ram\*

Molecular Microbiology and Biotechnology, Institute of Biology Leiden, Leiden University, Leiden, Netherlands

#### Edited by:

Gustavo Henrique Goldman, Universidade de São Paulo, Brazil

#### Reviewed by:

Ramón Alberto Batista-García, Universidad Autónoma del Estado de Morelos, Mexico

Bernhard Seiboth, Technische Universität Wien, Austria

\*Correspondence:

Arthur F. J. Ram a.f.j.ram@biology.leidenuniv.nl

#### Specialty section:

This article was submitted to Bioenergy and Biofuels, a section of the journal Frontiers in Bioengineering and Biotechnology

Received: 28 March 2018 Accepted: 05 September 2018 Published: 25 September 2018

#### Citation:

Alazi E and Ram AFJ (2018) Modulating Transcriptional Regulation of Plant Biomass Degrading Enzyme Networks for Rational Design of Industrial Fungal Strains. Front. Bioeng. Biotechnol. 6:133. doi: 10.3389/fbioe.2018.00133

Filamentous fungi are the most important microorganisms for the industrial production of plant polysaccharide degrading enzymes due to their unique ability to secrete these proteins efficiently. These carbohydrate active enzymes (CAZymes) are utilized industrially for the hydrolysis of plant biomass for the subsequent production of biofuels and high-value biochemicals. The expression of the genes encoding plant biomass degrading enzymes is tightly controlled. Naturally, large amounts of CAZymes are produced and secreted only in the presence of the plant polysaccharide they specifically act on. The signal to produce is conveyed via so-called inducer molecules which are di- or mono-saccharides (or derivatives thereof) released from the specific plant polysaccharides. The presence of the inducer results in the activation of a substrate-specific transcription factor (TF), which is required not only for the controlled expression of the genes encoding the CAZymes, but often also for the regulation of the expression of the genes encoding sugar transporters and catabolic pathway enzymes needed to utilize the released monosaccharide. Over the years, several substrate-specific TFs involved in the degradation of cellulose, hemicellulose, pectin, starch and inulin have been identified in several fungal species and systems biology approaches have made it possible to uncover the enzyme networks controlled by these TFs. The requirement for specific inducers for TF activation and subsequently the expression of particular enzyme networks determines the choice of feedstock to produce enzyme cocktails for industrial use. It also results in batch-to-batch variation in the composition and amounts of enzymes due to variations in sugar composition and polysaccharide decorations of the feedstock which hampers the use of cheap feedstocks for constant quality of enzyme cocktails. It is therefore of industrial interest to produce specific enzyme cocktails constitutively and independently of inducers. In this review, we focus on the methods to modulate TF activities for inducer-independent production of CAZymes and highlight various approaches that are used to construct strains displaying constitutive expression of plant biomass degrading enzyme networks. These approaches and combinations thereof are also used to construct strains displaying increased expression of CAZymes under inducing conditions, and make it possible to design strains in which different enzyme mixtures are simultaneously produced independently of the carbon source.

Keywords: CAZyme, plant biomass degradation, strain design, industrial fungi, transcriptional regulation, overexpression of transcription factors, constitutively active transcription factors, inducer accumulation

### INTRODUCTION

Plant biomass is the most abundant renewable carbon source in the world and represents the major natural substrate for fungi (Kowalczyk et al., 2014). Plant biomass mainly consists of plant cell wall material, which contains polysaccharides (i.e., cellulose, hemicellulose and pectin), lignin, and structural proteins (Loqué et al., 2015). Cellulose, the most abundant plant cell wall polysaccharide, is a linear chain of glucose (Kolpak and Blackwell, 1976). Hemicelluloses are complex heteropolysaccharides with xylan- (linear chain of xylose), glucan- (linear chain of glucose), glucomannan- (linear chain of glucose and mannose), or mannan- (linear chain of mannose) backbones, with several different types of monomers attached to the backbone to give e.g., arabinoxylan or xyloglucan (Scheller and Ulvskov, 2010). Pectins are other complex polysaccharides containing D-galacturonic acid (GA) as the main sugar acid in their backbones. Polygalacturonic acid (PGA) is a linear chain of GA and the most abundant pectin substructure. Other pectin substructures are rhamnogalacturonan I (linear chain of alternating GA and rhamnose residues), rhamnogalacturonan II and xylogalacturonan, and contain several different types of monomers or polymers attached to their backbones (Caffall and Mohnen, 2009). Plant biomass also contains the plant cell storage polysaccharides starch, a linear (amylose) or branched (amylopectin) polymer of glucose, and inulin, a polymer of fructose residues with a terminal glucose residue (Gidley, 2001; Ritsema and Smeekens, 2003).

Polysaccharides present in plant biomass are the target of the degrading enzymes secreted abundantly by filamentous fungi. Fungal carbohydrate active enzymes (CAZymes) are also utilized industrially for the hydrolysis of plant biomass for the subsequent production of mainly bioethanol and high-value biochemicals (Kowalczyk et al., 2014; Gupta et al., 2016; Benocci et al., 2017). Plant biomass degrading enzymes are classified into families based on their sequence, such as glycoside hydrolases, polysaccharide lyases and carbohydrate esterases, abbreviated as GH, PL and CE, respectively (Carbohydrate Active Enzymes database, http://www.cazy.org/) (Lombard et al., 2014). Filamentous fungi secrete large amounts of CAZymes only in the presence of the plant polysaccharide they specifically act on. For a review on the substrate specificity of CAZymes we refer to Kowalczyk et al. (2014) and de Vries et al. (2017). The signal for tailor-made production of CAZymes is conveyed via so-called inducer molecules which are di- or mono-saccharides (or derivatives thereof) released from the specific plant polysaccharides. The presence of the inducer results in the activation of a substrate-specific transcription factor (TF), which is required for the controlled expression of the genes encoding not only the CAZymes, but usually also the transporters and catabolic pathway enzymes needed to utilize the released monosaccharide (Culleton et al., 2013; Benocci et al., 2017).

Several substrate-specific TFs involved in plant biomass degradation have been identified in filamentous fungal species. For a review on the TFs involved in plant biomass degradation we refer to Huberman et al. (2016) and Benocci et al. (2017). For example, it has been shown that induction of expression of the genes encoding cellulases when cellulose is present requires the presence of the transcriptional activators Clr-1 and Clr-2 in Neurospora crassa, ClrB in Aspergillus nidulans and Penicillium oxalicum, ManR in A. oryzae and Xyr1 in Trichoderma reesei (Stricker et al., 2006; Coradetti et al., 2012; Ogawa et al., 2013; Li et al., 2015). Clr-2, ClrB and ManR are orthologs (Kunitake and Kobayashi, 2017), while Xyr1 shows significant sequence similarity to the TFs involved in xylan degradation, XlnR in A. niger and Xlr-1 in N. crassa (Rauscher et al., 2006). The enzyme families controlled by orthologous transcription factors can vary. Although Xyr1 in T. reesei is essential for the expression of cellulases and xylanases, Xyr1 orthologs (named XlnR or Xlr-1 in other species) are essential only for the expression of xylanases in filamentous fungal species, such as A. oryzae (on xylose), A. niger (on xylan), A. nidulans (on xylan), Talaromyces cellulolyticus (on xylan), P. oxalicum (on cellulose), N. crassa (on xylan), Fusarium oxysporum (on xylan or wheat cell walls), and Myceliophthora thermophila (van Peij et al., 1998; Marui et al., 2002b; Stricker et al., 2006; Brunner et al., 2007; Tamayo et al., 2008; Sun et al., 2012; Fujii et al., 2014; Li et al., 2015; Wang et al., 2015). In A. oryzae, A. niger, T. cellulolyticus, and P. oxalicum, contribution of XlnR to the expression of cellulases during growth on cellulose and/or xylose/xylan has also been reported (Gielkens et al., 1999; Marui et al., 2002; Li et al., 2015; Okuda et al., 2016). 
Arabinan, present in the side chains of some forms of hemicellulose, such as arabinogalactan, or some pectins, is degraded via CAZymes that are transcriptionally regulated by AraR in A. niger (Battaglia et al., 2011; Kowalczyk et al., 2017). Arabinan degradation in T. reesei is partially regulated via Xyr1 (Akel et al., 2009) and via Ara1 (Benocci et al., 2018). The TF GaaR, required for the expression of CAZymes degrading pectin, especially PGA, has been identified in both Botrytis cinerea and A. niger (Alazi et al., 2016; Zhang et al., 2016). Recently, Pdr-1 in N. crassa was described as a TF required for the expression of the genes encoding CAZymes that degrade several pectin substructures including PGA and rhamnogalacturonan I (Thieme et al., 2017). Transcriptional control of the genes involved in starch degradation is conducted by AmyR in A. niger and A. nidulans (Tani et al., 2001; vanKuyk et al., 2012).

**Abbreviations:** CAZyme, carbohydrate active enzyme; GA, D-galacturonic acid; PGA, polygalacturonic acid; TF, transcription factor; CCR, carbon catabolite repression.

The transcriptional activators involved in plant biomass degradation mentioned above were identified either via classical approaches, such as mutant complementation (xlnR in A. niger) and gene cloning (xlnR and amyR in A. nidulans and amyR in A. niger), or via post-genomic approaches, such as yeast one-hybrid screening (gaaR in B. cinerea), screening TF deletion mutant collections (pdr-1 and clr-2 in N. crassa, manR in A. oryzae, and clrB and xlnR in P. oxalicum), and also based on sequence similarity (clrB in A. nidulans, araR and gaaR in A. niger, xyr1 in T. reesei and M. thermophila, and xlnR in F. oxysporum, T. cellulolyticus and A. oryzae) or their expression levels in the transcriptomics data (xlr-1 in N. crassa) (van Peij et al., 1998; Tani et al., 2001; Marui et al., 2002b; Rauscher et al., 2006; Brunner et al., 2007; Battaglia et al., 2011; Coradetti et al., 2012; Ogawa et al., 2012; Sun et al., 2012; vanKuyk et al., 2012; Fujii et al., 2014; Li et al., 2015; Wang et al., 2015; Alazi et al., 2016; Zhang et al., 2016; Thieme et al., 2017). Positively acting TFs involved in controlling expression of plant cell wall degrading enzymes belong to the Zn<sub>2</sub>Cys<sub>6</sub> type family of TFs. TFs belonging to this family are found specifically in fungi (both yeasts and filamentous fungi) and contain a DNA-binding domain with six cysteine residues bound to two zinc atoms, usually close to their NH<sub>2</sub>-terminal end. Most of the Zn<sub>2</sub>Cys<sub>6</sub> type TFs also contain a fungal-specific TF domain, known as the middle homology region, with a proposed role in regulating the activity of the TF (MacPherson et al., 2006). Apart from activators, several repressor proteins involved in plant biomass degradation have been identified, including wide domain repressors, such as the carbon catabolite repressor protein CreA/Cre1, or substrate specific repressors, which are discussed in paragraphs 1.2 and 2.3, respectively.

### Diversity in the Control of Transcription Factor Activities

The mechanisms by which the activity of TFs is regulated in response to stimuli can be diverse. Firstly, the protein level of the TFs might be controlled. This can be realized by regulation at the level of transcription, mRNA stability or protein stability, resulting in low protein levels under non-inducing conditions and higher levels under inducing conditions. Secondly, the subcellular localization (nuclear import/export), DNA binding activity, and/or transcriptional activity of the TFs might be regulated by post-translational modifications, such as phosphorylation, and/or via protein-protein or protein-metabolite interactions (MacPherson et al., 2006; Chang and Ehrlich, 2013; Tani et al., 2014). Finally, DNA site occupancy of TFs might depend on their cooperation or competition with other proteins and the chromatin accessibility (Granek and Clarke, 2005; Biggin, 2011).

Our knowledge about regulation of the activity of the TFs involved in biomass degradation is limited, yet growing. The amount of Clr-2 in N. crassa is regulated at the level of transcription by Clr-1, which gets activated and positively regulates clr-2 expression only under inducing conditions (on insoluble crystalline cellulose Avicel or cellobiose). The expression of the clr-2 ortholog in A. nidulans, clrB, on the other hand, is not drastically induced under inducing conditions (on Avicel) and the Clr-1 ortholog ClrA is not essential but only contributes to cellulase gene expression (Coradetti et al., 2012). This comparison indicates that activation mechanisms of even orthologous transcription factors differ between fungal species.

The expression of xyr1 in T. reesei is subject to Cre1-dependent carbon catabolite repression (CCR) (see below) and induced on carbon sources inducing cellulase expression (e.g., lactose, sophorose, or cellulose), but not on carbon sources inducing xylanase expression (e.g., xylose) (Mach-Aigner et al., 2008; Portnoy et al., 2011a; Lichius et al., 2014). Xyr1 was shown to accumulate in the nucleus during growth on an inducing carbon source (i.e., sophorose or low concentration of xylose), whereas it is degraded in the nucleus during growth on a repressing carbon source (i.e., glucose or a high concentration of xylose). The increased amount of nuclear Xyr1 correlates with the increased expression level of cellulase gene cbh1 on sophorose, and that of xylanase gene xyn2 on a low concentration of xylose (Lichius et al., 2014). Similar to T. reesei xyr1, the expression of xlnR in A. nidulans and P. oxalicum is also subject to CreA-dependent CCR (see below), whereas xlnR in A. niger is not regulated at the level of transcription, but constitutively transcribed at low levels (Tamayo et al., 2008; Mach-Aigner et al., 2012; Li et al., 2015). Moreover, XlnR was found to be localized in the nucleus regardless of the presence of inducer in A. niger (Hasper, 2004). In F. oxysporum, xlnR is transcriptionally regulated by both CCR and induction on xylose/xylan (Calero-Nieto et al., 2007). XlnR in A. oryzae is reversibly phosphorylated in response to xylose, which does not affect its protein stability and correlates with the expression of XlnR target genes (Noguchi et al., 2011). Recently, Kunitake and Kobayashi (2017) proposed that a conserved sequence in XlnR is involved in the xylose-mediated phosphorylation of XlnR in A. oryzae, implying a conserved mechanism regulating XlnR activity among Ascomycete fungi. Moreover, the observation that XlnR in A. oryzae is not phosphorylated in response to cellobiose, but required for cellulase expression, indicates that its activity is regulated by a different mechanism on cellobiose (Noguchi et al., 2011; Kunitake and Kobayashi, 2017).

The activity of the GA-responsive transcriptional activator GaaR in A. niger is regulated by a different mechanism. It is suggested to be inhibited by the repressor protein GaaX via protein-protein interaction under non-inducing conditions. Under inducing conditions (in the presence of GA), the inducer is proposed to bind to GaaX, resulting in a free and active form of GaaR (Niu et al., 2017). The amount of GaaR is not significantly regulated at the level of transcription (Alazi et al., 2016, 2018). In addition, GaaR-eGFP was shown to be constitutively localized in the nucleus (Alazi et al., 2018). Transcription of pdr-1 in N. crassa is induced under inducing conditions (on rhamnose) and repressed under repressing conditions. Moreover, the activity of Pdr-1 is suggested to be post-transcriptionally regulated, as its nuclear accumulation and transcriptional activity require the presence of rhamnose in a strain overexpressing pdr-1 (Thieme et al., 2017).

The transcription of amyR is upregulated on inducing carbon sources (e.g., starch or maltose) and is subject to CreA-dependent CCR (see below) in A. nidulans (Tani et al., 2001). Further, AmyR was shown to localize in the nucleus in an inducer (i.e., isomaltose)-dependent manner and activate the expression of its target genes (Makita et al., 2009).

### The Role of the Carbon Catabolite Repressor in the Production of Plant Biomass Degrading Enzymes

Apart from being upregulated under inducing conditions, the expression of CAZymes might also be controlled via CCR when a carbon source that is energetically more favorable than plant biomass polysaccharides, such as glucose, is available to the fungus. The Cys<sub>2</sub>His<sub>2</sub> type TF CreA/Cre1/Cre-1 (in Aspergillus species, T. reesei, and N. crassa, respectively) is the main transcriptional repressor contributing to CCR. Regulation of CreA activity is not yet fully understood, but several studies have indicated that post-translational modifications, such as ubiquitination and phosphorylation, affect CreA protein stability, subcellular localization and/or DNA binding activity (Cziferszky et al., 2002; Ries et al., 2016; see Adnan et al., 2017 for a recent review). CreA not only represses the expression of genes encoding CAZymes, but it might also repress the expression of some transcriptional activators, such as clrB and xlnR in P. oxalicum, xyr1/xlnR in T. reesei, A. nidulans and F. oxysporum, and amyR in A. nidulans, that are required for the expression of CAZymes (Tani et al., 2001; Calero-Nieto et al., 2007; Mach-Aigner et al., 2008; Tamayo et al., 2008; Li et al., 2015). Furthermore, CreA might repress the expression of CAZymes under high xylose concentrations, too, as was shown to be the case for XlnR and its target genes in A. nidulans, and for XlnR itself in A. oryzae (Tamayo et al., 2008; Ichinose et al., 2017). Elimination of CCR due to a lack of function of CreA and/or CreB, a ubiquitin-specific protease involved in CCR, or their orthologs has been reported to result in an increased expression of CAZymes degrading cellulose, hemicellulose, pectin or starch, under inducing and/or non-inducing conditions in filamentous fungi (Prathumpai et al., 2004; Mach-Aigner et al., 2008; Nakari-Setälä et al., 2009; Denton and Kelly, 2011; Sun and Glass, 2011; Fujii et al., 2013; Ichinose et al., 2014, 2017; Niu et al., 2015; Yang et al., 2015).
However, even when CreA-dependent CCR is circumvented, the expression of CAZymes might require the presence of active transcriptional activators, which is normally achieved in the presence of inducing metabolites. For example, in a Cre1-negative T. reesei strain (Rut-C30), the full expression of Xyr1 target genes requires the presence of the inducer (i.e., xylose) (Mach-Aigner et al., 2008). Another study revealed that deletion of cre1 in T. reesei results in elevated production of cellulases and xylanases under inducing (on lactose) and, to a lesser extent, under non-inducing conditions (on glucose) (Nakari-Setälä et al., 2009). Genome-wide studies in T. reesei have also indicated that only a few of the cellulolytic genes were upregulated in a Δcre1 strain during growth on glucose (Portnoy et al., 2011b; Antoniêto et al., 2014). This indicates that the majority of these cellulolytic genes are not induced simply by derepression, but require additional inducing conditions for expression. Similarly, although the expression of pectinases is subject to CreA-dependent CCR in A. niger, deletion of creA is not sufficient for an increased production of polygalacturonases under non-inducing conditions. Polygalacturonases are only produced in the presence of the inducer (i.e., GA) or when gaaX is deleted, showing that GA-responsive gene expression requires the presence of active GaaR relieved from GaaX inhibition even in a ΔcreA strain (Niu et al., 2015, 2017).

## APPROACHES FOR INCREASED OR CONSTITUTIVE EXPRESSION OF PLANT BIOMASS DEGRADING ENZYMES BY MODULATING THEIR TRANSCRIPTIONAL REGULATION

Plant biomass with varying sugar composition present in waste streams from agriculture, forestry and food industries represents a cheap, sustainable and renewable feedstock for the production of the CAZymes, and subsequently, valuable chemicals (Sweeney and Xu, 2012; Meyer et al., 2015). However, the composition of the enzyme cocktail will vary because of variation in the composition of the plant biomass. It is therefore of industrial interest to produce specific fungal enzyme cocktails constitutively, independently of inducers and the substrate used. Fungal production strains, such as A. niger CBS513.88, T. reesei Rut-C30, and P. oxalicum JU-A10-T, have long been improved via multiple rounds of classical mutagenesis and screening, approaches that can be time-consuming (Montenecourt and Eveleigh, 1979; Pel et al., 2007; Fang et al., 2010; van Hanh et al., 2010; Ho and Ho, 2015; Yao et al., 2015). However, the emergence of the -omics era (genomics, transcriptomics etc.) and advances in recombinant technologies now allow efficient strain improvement via genetic engineering approaches with minimal alterations in the genome (Meyer et al., 2010; Liu et al., 2013). In the remaining part of this review, we discuss the approaches to modulate transcriptional regulation in order to rationally design fungal strains with increased or constitutive production of plant biomass degrading enzymes, examples of which are given in **Table 1**. As will become clear in the following sections, the success of the approach used highly depends on the mechanism regulating the activity of the targeted TF.

#### Overexpression of TFs

Increasing the amount of a TF at the level of transcription can be achieved by expressing multiple copies of the TF via its endogenous promoter, or (multiple copies of) the TF via a strong inducible/constitutive promoter. Overexpression of several TFs in Saccharomyces cerevisiae has been shown to result in an increased expression of the TF target genes, even under non-inducing conditions (Chua et al., 2006).

#### TABLE 1 | Examples of rational design of industrial fungal strains with increased or constitutive production of CAZymes.


Inducer-independent production of cellulases in N. crassa was achieved by constitutively overexpressing clr-2 via the ccg-1 promoter (Coradetti et al., 2013). While deletion of clr-2 resulted in a drastic decrease in the expression of most of the genes encoding cellulases under inducing conditions [i.e., on insoluble crystalline cellulose (Avicel)], overexpression of clr-2 resulted in an increased expression of cellulases under inducing conditions and constitutive expression under non-inducing conditions. Expression of the genes encoding cellulases was higher under starvation (no carbon) conditions than on sucrose highlighting the effect of CCR on cellulase encoding genes under repressing conditions (i.e., on sucrose) even when clr-2 was overexpressed (Coradetti et al., 2013).

Unlike overexpression of clr-2 in N. crassa, overexpression of the clr-2 ortholog clrB in A. nidulans or in P. oxalicum, both via the constitutive A. nidulans gpdA promoter, did not result in an increased expression of CAZyme-encoding genes under non-inducing conditions, but only under inducing conditions (on Avicel or cellulose, respectively) (Coradetti et al., 2013; Li et al., 2015). It was shown that deletion of creA in combination with clrB overexpression allows increased expression of cellulases under non-inducing conditions in P. oxalicum, indicating that the strong CreA-dependent repression on cellulase genes overrules ClrB-dependent induction (Li et al., 2015).

ManR in A. oryzae is the N. crassa clr-2 ortholog, regulating the expression of genes encoding cellulose and hemicellulose (including mannan, but not xylan) degrading enzymes in A. oryzae (Ogawa et al., 2012, 2013). Overexpression of ManR via the tef1 promoter resulted in an increased expression of cellulases and hemicellulases involved in mannan degradation under inducing conditions (i.e., on Avicel and mannan, respectively) (Ogawa et al., 2012, 2013). The effect of overexpression of ManR under non-inducing conditions has not been reported.

As mentioned before, in T. reesei, Xyr1 controls the transcriptional regulation of both cellulase encoding and xylanase encoding genes. Expression of xyr1 is subject to CCR and xyr1 expression is upregulated on carbon sources inducing cellulase production. Overexpression of xyr1 via the tcu1 promoter resulted in an inducer-independent production of cellulases, even under repressing conditions (i.e., on glucose) (Lv et al., 2015). All reported T. reesei Xyr1 orthologs in other filamentous fungal species regulate mainly the transcription of hemicellulase encoding genes and contribute less to the transcriptional regulation of cellulase encoding genes. Overexpression of xlnR via the gpdA promoter in T. cellulolyticus resulted in an increased production of cellulases on cellulose, but not that of xylanases (Okuda et al., 2016). In A. niger, increased expression of cellulases and xylanases on xylose was observed in a strain carrying multiple copies of xlnR (Gielkens et al., 1999). Similarly, an A. oryzae strain overexpressing xlnR via the tef1 promoter showed an increased production of both cellulases and xylanases when grown on carbon sources known to induce cellulase (i.e., Avicel or cellobiose) or xylanase (i.e., xylose or xylan) production (Marui et al., 2002; Noguchi et al., 2009).

The studies mentioned above reported on the increased expression of xylanases in fungal strains overexpressing xlnR or its orthologs under inducing conditions. Only a few studies have reported the effect of overexpression of xlnR on the expression of xylanases under non-inducing conditions. For instance, xlnR overexpression via the constitutive gpdA promoter in A. nidulans or in F. oxysporum leads to increased expression of xylanase-encoding genes only in the presence of inducing carbon sources (e.g., xylose or xylan) (Calero-Nieto et al., 2007; Tamayo et al., 2008). It was also reported that when GFP-XlnR is overproduced in A. oryzae, it localizes constitutively in the nucleus, but its target genes are expressed only in the presence of xylose (Noguchi et al., 2011). These results indicate that XlnR in these organisms gets activated by an inducer-mediated post-transcriptional mechanism and that XlnR activation under non-inducing conditions cannot simply be achieved by overexpression. In M. thermophila, overexpression of xyr1 via the pdc promoter resulted in an increased expression of xylanases under both inducing (i.e., corncob) and non-inducing conditions (i.e., glucose) (Wang et al., 2015). This indicates that the mechanism of Xyr1 activation in M. thermophila differs from that in other filamentous fungi.

Another example of inducer-independent production of CAZymes by overexpression of TFs was given by a recent study in A. niger. Here it was shown that overexpression of gaaR in A. niger via the A. nidulans gpdA promoter leads to constitutive expression of the genes encoding pectinases, as well as GA transporters and GA catabolic pathway enzymes (Alazi et al., 2018). Deletion of creA further enhanced pectinase production under mildly repressing conditions (i.e., on fructose), indicating competing roles of GaaR and CreA in controlling the expression of GA-induced genes (Alazi et al., 2018). The effect of overexpression of gaaR on the expression of its target genes under non-inducing conditions is likely caused by its specific activation mechanism. Regulation of the activity of GaaR includes a specific repressor protein (GaaX) (Niu et al., 2017). It has been proposed that under non-inducing conditions the activity of GaaR is controlled through direct interaction with GaaX, which prevents GaaR from being active (Niu et al., 2017). Modulating the amount of GaaR by overexpression affects the balance of GaaR-GaaX and results in the presence of uncomplexed active GaaR even under non-inducing conditions (Alazi et al., 2018).

The regulation of GA-induced gene expression shows some striking similarities with the regulation of genes involved in quinic acid catabolism. Quinic acid is present as an aromatic compound in the plant cell and can be released from tannins, which are water-soluble polyphenols, by the action of tannases known to be secreted by filamentous fungi (Wagh, 2010). Similar to regulation of GA utilization in A. niger, regulation of quinic acid utilization in A. nidulans involves a Zn<sub>2</sub>Cys<sub>6</sub> type transcriptional activator (QutA) and a repressor (QutR), and strains expressing multiple copies of qutA displayed constitutive expression of the genes encoding quinic acid catabolic pathway enzymes (Lamb et al., 1996). The QutR and GaaX repressor proteins are both multidomain proteins with sequence similarity to the three C-terminal domains of AROM, indicating that these repressors share a common evolutionary origin (Niu et al., 2017).

On the other hand, overexpression of pdr-1, the pectin degradation regulator in N. crassa, via the gpd promoter of M. thermophila resulted in elevated expression of its target genes only under inducing conditions (on rhamnose). Therefore, it was proposed that Pdr-1 activity is regulated post-transcriptionally in a manner depending on the presence of the inducer but not on the amount of Pdr-1 (Thieme et al., 2017).

An A. niger strain carrying multiple copies of amyR displayed increased expression of AmyR target genes, such as CAZymes acting on starch as well as on cellulose and hemicellulose, under inducing conditions (i.e., on maltose, starch, and low concentration of glucose), but not under non-inducing conditions (vanKuyk et al., 2012). This result is in line with the observation that in A. nidulans, nuclear localization of AmyR and thereby activation of its target genes is inducer-dependent (Makita et al., 2009).

In conclusion, we have seen that overexpression of TFs involved in plant biomass degradation can result in increased expression of their target genes under inducing conditions. However, in most cases overexpression of TFs, such as XlnR, Pdr-1 and AmyR, does not result in inducer-independent expression of their target genes. In these cases it is likely that an inducer molecule is required to activate the TF. The exact activation mechanisms of XlnR, Pdr-1, and AmyR are currently unknown, and could involve direct interaction of the TF with its inducer, or could be related to post-translational modifications connected to the presence of inducers. Overexpression of TFs like GaaR and QutA results in inducer-independent expression of GaaR and QutA target genes, most likely by titrating away the corresponding repressor proteins (GaaX and QutR, respectively). This illustrates that different mechanisms controlling TF activities result in different outcomes of overexpression of TFs under non-inducing conditions.
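
The titration argument for activator/repressor pairs such as GaaR/GaaX can be made concrete with a toy stoichiometric sketch. It assumes, purely for illustration, that the repressor binds the activator 1:1 and very tightly, that inducer-bound repressor no longer binds the activator, and that amounts are in arbitrary units. The function name `free_activator` and all numbers are hypothetical and do not come from the cited studies.

```python
def free_activator(activator_total: float,
                   repressor_total: float,
                   inducer_bound_fraction: float = 0.0) -> float:
    """Amount of activator not complexed with repressor (arbitrary units).

    Illustrative assumptions only: 1:1 tight binding between activator and
    repressor; repressor bound to inducer cannot sequester the activator.
    """
    if not 0.0 <= inducer_bound_fraction <= 1.0:
        raise ValueError("inducer_bound_fraction must be between 0 and 1")
    # Repressor molecules still able to bind the activator
    available_repressor = repressor_total * (1.0 - inducer_bound_fraction)
    # Tight 1:1 binding: each available repressor removes one activator
    return max(0.0, activator_total - available_repressor)


# Hypothetical amounts, chosen only to show the qualitative behaviour
wild_type_no_inducer = free_activator(1.0, 2.0)         # repressor in excess
wild_type_with_inducer = free_activator(1.0, 2.0, 1.0)  # repressor neutralised
overexpressed_no_inducer = free_activator(5.0, 2.0)     # activator in excess
repressor_deleted = free_activator(1.0, 0.0)            # e.g., a gaaX deletion
```

Under these assumptions, free activator appears without inducer only when the activator outnumbers the repressor or when the repressor is absent, which matches the qualitative picture proposed for GaaR/GaaX and QutA/QutR. A TF that needs the inducer for its own activation, as suggested for XlnR, Pdr-1 and AmyR, is not captured by this sketch.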

#### Constitutive Activation of TFs

The activity of a TF can be controlled in different ways, including post-translational modifications affecting its localization, DNA binding or interaction with repressor protein(s). Over the past 15 years, mutations resulting in changes in amino acid sequences and thereby in constitutively active TFs have been identified via classical mutagenesis and screening approaches, or via predesigned amino acid substitutions, domain removal or protein truncation analyses.

The first reported constitutively active form of XlnR (XlnR<sup>V756F</sup>) was identified in A. niger via a forward genetic screen. Expression of xlnR<sup>V756F</sup> resulted in a constitutive expression of xylanases even under repressing conditions (i.e., on fructose or glucose; Hasper, 2004; Hasper et al., 2004). Later, in T. reesei, a point mutation in Xyr1 (Xyr1<sup>A824V</sup>) introduced via UV mutagenesis was found to result in a constitutively active Xyr1 and constitutive expression of cellulases and xylanases even in the presence of a repressing carbon source (Derntl et al., 2013). In addition, overexpression of xyr1<sup>A824V</sup> in T. reesei was shown to be more effective in increasing cellulase production than overexpression of xyr1 under inducing conditions (e.g., Avicel) (Jiang et al., 2016). Both amino acid changes (V756F in XlnR and A824V in Xyr1) are located within the same predicted α-helix in the fungal-specific TF domain of XlnR/Xyr1 (Derntl et al., 2013). Although overexpression of xlr-1 in N. crassa via the ccg-1 promoter did not lead to constitutive expression of xylanases, overexpression of xlr-1<sup>A828V</sup>, which carries the mutation homologous to that in xyr1<sup>A824V</sup>, resulted in a constitutive and increased production of xylanases under both inducing (on xylan) and non-inducing conditions (Craig et al., 2015). Similarly, overexpression of xlnR carrying the homologous point mutation (xlnR<sup>A871V</sup>) using the PDE\_02864 promoter in a P. oxalicum strain that lacks creA and overexpresses clrB (see above) enabled a further increase in the production of cellulases and xylanases under inducing conditions (on wheat bran) (Gao et al., 2017a).

Recently, via a forward genetic screen, several different point mutations throughout AraR were found to give rise to a constitutively active AraR and constitutive expression of AraR target genes (abfA, abfB, and abfC). Unlike the mutations in XlnR that gave rise to inducer-independent and constitutive expression of its target genes, the mutations in AraR lead to constitutive expression of AraR target genes only under derepressed conditions. Deletion of creA improved the production of CAZymes that degrade arabinan to a large extent, indicating the strong CCR on the genes encoding these CAZymes under repressing conditions on fructose (Reijngoud et al., manuscript in preparation).

We recently conducted a large forward genetic screen for mutants with constitutive expression of pectinases (Niu et al., 2017). Apart from identifying GaaX as a repressor of GA-induced gene expression, it also led to the identification of a constitutive allele of GaaR (GaaR<sup>W361R</sup>) resulting in constitutive expression of pectinases even under repressing conditions (on glucose or fructose). Within GaaR, W361 is situated in the fungal-specific TF domain and is highly conserved among Aspergillus species (Alazi et al., in press).

Deletion of the C-terminal regions of several TFs involved in plant biomass degradation was shown to result in constitutive activation of the TFs, indicating that these C-terminal parts contain an inhibitory domain. For example, in A. niger, truncation of XlnR from L668 resulted in a constitutive expression of xylanases (Hasper et al., 2004), and truncation of AraR from P646 was reported to cause constitutive activation of AraR (Jiang et al., 2016). Expression of the C-terminally truncated AmyR<sup>1–514</sup> and AmyR<sup>1–511</sup> resulted in constitutive localization of AmyR in the nucleus in A. nidulans and A. oryzae, respectively. However, while constitutive amylase production was observed in A. nidulans (Makita et al., 2009), loss of expression was observed in A. oryzae (Suzuki et al., 2015), indicating that the effect of the truncation can also be species-specific.

#### Deletion or Down-Regulation of Specific Repressors

It has been shown that besides the general CCR, specific repressors might play a role in the transcriptional regulation of the genes encoding CAZymes. For instance, the Cys<sub>2</sub>His<sub>2</sub> type TF Ace1 represses the expression of both cellulase and hemicellulase (i.e., xylanase) encoding genes, as well as the expression of xyr1 in T. reesei (Aro et al., 2003; Wang et al., 2013). Furthermore, Ace1 was shown to compete with Xyr1 to bind to the same sequence in the promoter of the xylanase gene xyn1 (Rauscher et al., 2006). Deletion of ace1 in the wild type background resulted in an increased expression of the genes encoding cellulases and xylanases only under inducing conditions (on cellulose), and gene silencing of ace1 led to an increase in constitutive expression of these genes when combined with the overexpression of xyr1 via the pdc promoter in a Cre1-negative background (Rut-C30) (Aro et al., 2003; Wang et al., 2013). Recently, another repressor, the Zn<sub>2</sub>Cys<sub>6</sub> type TF Rce1, was identified in T. reesei that is involved in the repression of the genes encoding cellulases, but not xylanases. It was shown that Rce1 is constitutively localized in the nucleus and competes with Xyr1 to bind to the same sequence in the promoter of the cellulase gene cbh1 (Cao et al., 2017). Deletion of rce1 resulted in an increased production of cellulases under inducing conditions (on cellulose) (Cao et al., 2017). Deletion of a basic helix-loop-helix TF xpp1, xylanase promoter-binding protein 1, in T. reesei resulted in an increased expression of hemicellulase (i.e., xylanase) encoding genes, but not cellulase encoding genes, at late stages of cultivation under inducing conditions (on xylan) (Derntl et al., 2015). However, Xpp1 was later described as a general regulator of both primary and secondary metabolism and therefore not a factor specifically controlling xylanase expression (Derntl et al., 2017).
In addition, the Zn<sub>2</sub>Cys<sub>6</sub> type transcriptional repressor SxlR was found to bind to the promoters of specific (GH11 family) xylanase genes, and deletion of sxlR resulted in an increased expression of these xylanase genes under inducing conditions (on Avicel, xylan or lactose) (Liu et al., 2017).

Similarly, the Cys<sub>2</sub>His<sub>2</sub> type TF Hcr-1 in N. crassa was shown to repress the expression of the genes encoding xylanases, but not the ones encoding cellulases. A Δhcr-1 strain exhibited an increased xylanase production under inducing conditions (on xylan or Avicel) (Li et al., 2014).

Recently a new TF, MhR1, was identified in M. thermophila as the regulator of cellulase and xylanase genes. Gene silencing of mhr1 resulted in an increased production of cellulases and xylanases, as well as increased expression of xyr1 and the genes encoding cellulases under inducing conditions (on wheat straw) (Wang et al., 2018).

As mentioned before, the activity of GaaR in A. niger is inhibited by the repressor protein GaaX, the amount of which is regulated at the level of transcription by induction on GA. GaaX was proposed to bind to and inhibit GaaR under non-inducing conditions, and to bind to the inducer molecule and release GaaR under inducing conditions. Deletion of gaaX resulted in a constitutive expression of pectinases, providing additional evidence for the proposed model of regulation of GA-responsive gene expression in A. niger (Niu et al., 2017). Similarly, deletion of QutR, the repressor protein involved in controlling the expression of quinic acid-responsive genes, also results in constitutive expression of at least eight genes involved in quinic acid uptake and metabolism (Levett et al., 2000).

#### Accumulation of Inducers

The intracellular accumulation of inducers is another effective method to boost the production of CAZymes by fungi. Cellulase production by N. crassa is greatly induced on insoluble, crystalline cellulose (Avicel). However, cellulase production is not observed on cellobiose, which is the soluble degradation product of cellulose, possibly due to rapid action of β-glucosidases converting cellobiose to glucose and subsequent glucose-mediated CCR. As first shown in N. crassa, deletion of three genes encoding the major β-glucosidases (intracellular enzyme Gh1-1, and extracellular enzymes Gh3-3 and Gh3-4) disrupted the hydrolysis of the inducer, cellobiose, and resulted in higher levels of expression of cellulases on cellobiose compared to the wild type strain grown on cellobiose, and similar to the wild type strain grown on Avicel. Moreover, deletion of cre-1 further increased cellulase production on cellobiose (Znameroski et al., 2012). Similarly, deletion of the major intracellular β-glucosidase bgl2 in a carbon catabolite de-repressed (ΔcreA) P. oxalicum strain that overexpresses clrB using the A. nidulans gpdA promoter gave rise to higher levels of cellulase and hemicellulase (i.e., xylanase) production on cellulose compared to the wild type strain. These were similar to the levels of the industrial strain JU-A10-T grown on wheat bran (Yao et al., 2015). A similar effect of deleting bgl2 on cellulase and hemicellulase production was observed in P. oxalicum grown on cellulose (Chen et al., 2013).

Expression of hemicellulases, specifically xylanases, in T. reesei is induced both on xylose and arabinose. Although the catabolic pathways assimilating xylose and arabinose are interconnected, the physiological inducers triggering hemicellulase production appear to be different and were suggested to be xylose and L-arabitol, respectively. Expression of xylanases increased dramatically in single or double xylose/arabinose catabolic pathway enzyme deletion mutants, indicating the effect of accumulation of physiological inducers in these mutants (Seiboth et al., 2012a; Herold et al., 2013).

One of the GA catabolic pathway intermediates, 2-keto-3-deoxy-L-galactonate, was recently shown to be the physiological inducer of the genes encoding pectinases in A. niger. It was demonstrated that deletion of gaaC, the gene encoding 2-keto-3-deoxy-L-galactonate aldolase, results in accumulation of 2-keto-3-deoxy-L-galactonate and thereby elevated expression levels of pectinase encoding genes (Alazi et al., 2018). Inducer molecules that are responsible for the activation of TFs therefore represent another interesting target to enhance expression of CAZymes, by constructing strains in which the inducer accumulates either due to prevention of rapid hydrolysis of the inducer or due to inactivation of the metabolic genes that function downstream of the enzyme that forms the inducer.

## FUTURE PERSPECTIVES AND CONCLUDING REMARKS

Increased expression of plant cell wall degrading enzymes can be achieved by overexpression of specific transcription factors or by identifying mutations in TFs leading to constitutive activation, and combinations thereof. It is also well established that CreA-dependent CCR has in many cases a negative effect on the production of enzymes, and CreA is therefore an important target for improving enzyme production under both inducing and non-inducing conditions. Apart from the approaches described above, modulating chromatin accessibility is an as yet under-utilized approach to increase the expression of CAZyme encoding genes in filamentous fungi. Chromatin remodeling of the promoters of CAZyme encoding genes through the actions of the histone acetyltransferase Gcn5, the CCAAT-binding complex, and possibly the putative protein methyltransferase Lae1 is required for the full expression of these CAZymes in T. reesei (Zeilinger et al., 1998; Xin et al., 2013; Aghcheh and Kubicek, 2015; Li et al., 2016). Overexpression of a putative GCN5-related N-acetyltransferase (gene ID 123668) via the A. nidulans gpdA promoter, or of lae1 via the tef1 promoter, resulted in an increased expression of cellulase encoding genes under inducing conditions (on lactose) (Seiboth et al., 2012b; Häkkinen et al., 2014). More recently, Cre1 was shown to be involved in chromatin accessibility of the xyr1 promoter (Mello-de-Sousa et al., 2016).

As our knowledge about regulation of TF activities accumulates, various possibilities for rational design of industrial fungal strains emerge, such as combinations of different approaches as already mentioned above, or the use of synthetic TFs. Recently, inducer-independent production of CAZymes by filamentous fungi was achieved via synthetic TFs. For instance, overexpression via the pdc1 promoter of a synthetic TF (xyr1-cre1<sup>b</sup>) consisting of the Xyr1 DNA-binding and effector domains and the Cre1 DNA-binding domain resulted in inducer-independent production of cellulases and hemicellulases (i.e., xylanases) in the Cre1-negative T. reesei strain Rut-C30 (Zhang et al., 2017). Another example of a synthetic TF (CXC-S) was shown in P. oxalicum, where the DNA-binding domain of the constitutively active XlnR<sup>A871V</sup> was replaced with that of ClrB. This synthetic TF was overexpressed using the A. nidulans gpdA promoter, yielding inducer-independent, but still glucose-repressed, expression of cellulases and xylanases (Gao et al., 2017a,b).

To conclude, fungal strains with increased or constitutive production of plant biomass degrading enzymes can be rationally designed by tuning the transcriptional regulatory systems. Mutations that lead to constitutive expression of enzymes are difficult to predict and have so far been identified only via genetic screens. Once identified, these mutations can be successfully transferred to industrial strains or related species. In this review, we also highlighted that regulation of the activities of orthologous TFs, or the set of genes regulated by orthologous TFs, might be species-specific. It is therefore important that detailed studies on how TFs are activated are not performed in a single species only, which is subsequently used as a blueprint to predict the activation mechanism in other fungal species, but in several representative species across the filamentous fungi.

#### AUTHOR CONTRIBUTIONS

EA and AR designed the content of the manuscript and collected literature. EA wrote the manuscript and AR critically commented on the manuscript.

#### ACKNOWLEDGMENTS

EA was supported by a grant from BE-Basic (Flagship 10). We acknowledge the helpful comments of Dr. Jaap Visser and Prof. Dr. Peter J. Punt on the manuscript.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Alazi and Ram. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Analysis of the Transcriptome in Aspergillus tamarii During Enzymatic Degradation of Sugarcane Bagasse

Glaucia Emy Okida Midorikawa<sup>1</sup> , Camila Louly Correa<sup>1</sup> , Eliane Ferreira Noronha<sup>1</sup> , Edivaldo Ximenes Ferreira Filho<sup>1</sup> , Roberto Coiti Togawa<sup>2</sup> , Marcos Mota do Carmo Costa<sup>2</sup> , Orzenil Bonfim Silva-Junior <sup>2</sup> , Priscila Grynberg<sup>2</sup> and Robert Neil Gerard Miller <sup>1</sup> \*

<sup>1</sup> Departamento de Biologia Celular, Universidade de Brasília, Brasília, Brazil, <sup>2</sup> Embrapa Recursos Genéticos e Biotecnologia, Parque Estação Biológica, Brasília, Brazil

#### Edited by:

Roberto Silva, Universidade de São Paulo, Brazil

#### Reviewed by:

Baskar Gurunathan, St. Joseph's College of Engineering, India Ching Man Wai, Michigan State University, United States

> \*Correspondence: Robert Neil Gerard Miller robertmiller@unb.br

#### Specialty section:

This article was submitted to Bioenergy and Biofuels, a section of the journal Frontiers in Bioengineering and Biotechnology

Received: 06 July 2018 Accepted: 20 August 2018 Published: 18 September 2018

#### Citation:

Midorikawa GEO, Correa CL, Noronha EF, Filho EXF, Togawa RC, Costa MMdC, Silva-Junior OB, Grynberg P and Miller RNG (2018) Analysis of the Transcriptome in Aspergillus tamarii During Enzymatic Degradation of Sugarcane Bagasse. Front. Bioeng. Biotechnol. 6:123. doi: 10.3389/fbioe.2018.00123

The production of bioethanol from non-food agricultural residues represents an alternative energy source to fossil fuels for incorporation into the world's economy. Within the context of bioconversion of plant biomass into renewable energy using improved enzymatic cocktails, Illumina RNA-seq transcriptome profiling was conducted on a strain of Aspergillus tamarii, efficient in biomass polysaccharide degradation, in order to identify genes encoding proteins involved in plant biomass saccharification. Enzyme production and gene expression were compared following growth in liquid and semi-solid culture with steam-exploded sugarcane bagasse (SB) (1% w/v) and glucose (1% w/v) employed as contrasting sole carbon sources. Growth in liquid minimal medium supplemented with SB resulted in xylanase activities of 0.626 and 0.711 UI mL<sup>−1</sup> after 24 and 48 h incubation, respectively. Transcriptome profiling revealed expression of over 7120 genes, with groups of genes modulated according to liquid or semi-solid culture, as well as according to carbon source. Gene ontology analysis of genes expressed following SB hydrolysis revealed enrichment in xyloglucan metabolic process and xylan, pectin and glucan catabolic process, indicating up-regulation of genes involved in xylanase secretion. According to carbohydrate-active enzyme (CAZy) classification, 209 CAZyme-encoding genes were identified with significant differential expression on liquid or semi-solid SB, in comparison to equivalent growth on glucose as carbon source. Up-regulated CAZyme-encoding genes related to cellulases (CelA, CelB, CelC, CelD) and hemicellulases (XynG1, XynG2, XynF1, XylA, AxeA, arabinofuranosidase) showed log2 fold changes in expression of up to 10. Five genes from the AA9 (GH61) family, related to lytic polysaccharide monooxygenase (LPMO), were also identified with significant expression up-regulation. The transcription factor gene XlnR, involved in induction of hemicellulases, showed up-regulation on liquid and semi-solid SB culture. Similarly, the gene ClrA, responsible for regulation of cellulases, showed increased expression on liquid SB culture. Over 150 potential transporter genes were also identified with increased expression on liquid and semi-solid SB culture. This first comprehensive analysis of the transcriptome of A. tamarii contributes to our understanding of genes and regulatory systems involved in cellulose and hemicellulose degradation in this fungus, offering potential for application in improved enzymatic cocktail development for plant biomass degradation in biorefinery applications.

Keywords: Aspergillus tamarii, transcriptome, carbohydrate-active enzymes, XlnR, ClrA, sugar transporters, lignocellulose, bioethanol

#### INTRODUCTION

The production of renewable energy is one of the greatest challenges of the twenty-first century. Whilst dependency upon fossil fuels is associated with depleting oil reserves and greenhouse gas emissions, plant biomass, by contrast, with its global abundance, represents a sustainable and environmentally clean energy source (Goldemberg, 2007; Tan et al., 2016).

The production of bioethanol from non-food agricultural residues such as lignocellulosic trash, grasses and woods, is known as second-generation (2G) ethanol (Alvira et al., 2010), and is considered a promising alternative energy source to fossil fuels for incorporation into the world's economy. Brazil is currently one of the principal agricultural producers, as an important supplier of both food and industrial crops. Sugarcane is planted over an area of almost 9 million hectares, with an annual production of over 620 million tons (CONAB, 2018). Whilst around 45% of the crop production is employed for sugar extraction, the majority is used in the bioethanol industry, with estimates of production of 28 billion liters of anhydrous and hydrated ethanol for 2018/2019 (CONAB, 2018). Bioethanol production in Brazil is based almost exclusively on first-generation technologies, whereby the sucrose content of the plant is converted into ethanol. In this process, sugarcane bagasse will accumulate as an agricultural residue (Goldemberg, 2008). Whilst the burning of bagasse currently serves as an energy source in bioethanol mills, as this biomass represents approximately one-third of the energy content of the crop, the conversion of the lignocellulose component of the cell wall into fermentable hexose (glucose) and pentose (e.g., D-xylose and L-arabinose) sugars offers considerable potential for increased 2G ethanol production, potentially by up to 40% (Amorim et al., 2011). Two Brazilian cellulosic ethanol plants came into operation in 2014, with capacities planned for production of up to 1 billion liters of ethanol per year from bagasse (Silva et al., 2017).

For economically viable 2G ethanol production, complete hydrolysis, or saccharification, of plant biomass is required. Such plant material is composed mainly of polysaccharide crystalline microfibers of cellulose (40–50%), followed in abundance by a matrix of various hemicelluloses and pectins (25–35%), in addition to the polyaromatic lignin (15–20%) (Lin and Tanaka, 2006; Ragauskas et al., 2006; Jordan et al., 2012; Guerriero et al., 2016). Efficient biorefinery-based conversion of this material is hampered by the recalcitrance of lignocelluloses (Chundawat et al., 2011). In the case of sugarcane bagasse, lignocellulose sugars vary in terms of identity and branching, comprising residues of glucose (60%), xylose (13%), arabinose (6%), mannose (3%), galactose (1.5%), and less than 1% fructose and rhamnose (Häkkinen et al., 2012).

As cellulases and hemicellulases remain costly, increasing the costs of 2G bioethanol production, a continued characterization of sources of such enzymes, together with an improved understanding of the mechanisms involved in enzyme secretion and enzyme efficiency, is of fundamental importance for the biofuel industry. Hydrolytic enzymes appropriate for the release of fermentable sugars from lignocelluloses are known to be secreted by a wide variety of bacteria and filamentous fungi, with the latter often not only producing a diverse array of extracellular lignocellulolytic enzymes, but also secreting such enzymes efficiently and in high quantities (Phitsuwan et al., 2013). For this reason, fungi are today the principal source of hydrolytic enzymes for this industrial application (Sims et al., 2010; Couturier et al., 2012).

Lignocellulolytic fungi typically produce extensive sets of carbohydrate-active enzymes (CAZymes) that correlate with their habitat of origin (Van Den Brink and De Vries, 2011). A number of ascomycete fungi, notably members of the genera Trichoderma and Aspergillus, produce a range of cellulases and hemicellulases, which are today applied across numerous industries for production of food, feed, paper, textiles and pharmaceuticals (Archer, 2000; de Souza et al., 2011). Whilst Aspergillus niger and Trichoderma reesei are currently employed in the production of commercial enzymatic cocktails for lignocellulosic biomass deconstruction (Singhania, 2011; Mohanram et al., 2013), the identification of additional sources of carbohydrate-active enzymes will likely increase efficiency in the deconstruction of this biomass. As such, additional species have recently been screened as potential sources of cellulases, hemicellulases and accessory proteins for optimized industrial enzyme production (Brown et al., 2016; Cong et al., 2017; de Gouvêa et al., 2018).

The availability of whole genome sequences for fungi has improved understanding of fungal biodiversity with respect to plant cell wall degradation. Aspergillus nidulans, considered the model species of the genus given its well-elucidated sexual cycle, possesses a genome sequence of 30.06 Mb, with 9396 predicted genes (Galagan et al., 2005). Other characterized Aspergillus species of importance for the food, textile, pulp and paper industries, and potentially in 2G ethanol production, include A. oryzae and A. niger. A. oryzae has a total genome size of 37.12 Mb, with 12336 predicted genes (Machida et al., 2005). Similarly, A. niger possesses a genome of 37.2 Mb, with 14600 predicted genes (Pel et al., 2007). Comparison of gene sequences against the Carbohydrate-Active Enzymes Database (http://www.cazy.org/) (Cantarel et al., 2009) has revealed 186 genes related to polysaccharide hydrolysis in A. nidulans, 217 in A. oryzae and 171 in A. niger (Delmas et al., 2012).

In addition to gene discovery, the annotated genome sequences for these species serve as resources for analysis of the transcriptome in additional Aspergillus species without available genome sequences. Transcriptional regulation of genes encoding hydrolytic enzymes in Aspergillus has been studied in relation to growth on different sugar carbon sources (Andersen et al., 2008; Jørgensen et al., 2009; Salazar et al., 2009). In relation to fermentation of sugarcane bagasse, microarray analysis provided information on gene expression modulation in A. niger (Guillemette et al., 2007; de Souza et al., 2011), with cellulases, hemicellulases and transporters identified with increased expression during growth on sugarcane bagasse in comparison to fructose. Subsequent RNAseq analysis of the A. niger transcriptome, following growth on wheat straw compared to simple sugars, revealed a CAZy gene representation change from 3% of total mRNA on 1% glucose to 19% on wheat straw, representing numerous enzymes from the classes of Glycoside Hydrolases (GH), Carbohydrate Esterases (CE), and Polysaccharide Lyases (PL) (Delmas et al., 2012). Further RNAseq-based analysis of gene expression in Aspergillus species following growth on sugarcane bagasse as carbon source has also revealed important information regarding regulatory mechanisms and genes encoding plant cell wall degrading enzymes, accessory proteins and transporters (Pullan et al., 2014; Brown et al., 2016; Borin et al., 2017; Cong et al., 2017; de Gouvêa et al., 2018).

The continued characterization of hydrolases, accessory proteins and the regulation of their expression in Aspergillus species that display efficiency in degradation of lignocellulose will further our understanding of their roles in saccharification. Given the importance of Aspergillus tamarii as an efficient producer of enzymes such as xylanases (El-Gindy et al., 2015; Monclaro et al., 2016), we utilized an Illumina RNA-seq approach to analyze the transcriptome in this fungus following semi-solid and liquid cultivation on steam-exploded bagasse (SB), compared with gene expression following growth on glucose (G). Genes encoding cellulases and hemicellulases, transcription factors and transporters are characterized in relation to their differential expression following fungal growth on each carbon source. The data will benefit the development of improved fungal strains with increased ability to deconstruct lignocellulose and generate value-added bioproducts.

#### MATERIALS AND METHODS

#### Strain and Culture Conditions

A stock culture of a strain of A. tamarii, code BLU37, was provided by the fungal culture collection at the Enzymology Laboratory, University of Brasilia, Brazil (genetic heritage number 010237/2015-1). The strain was originally isolated into pure culture from natural composting cotton textile waste material in the Vale do Itajaí, Santa Catarina, Brazil (Siqueira et al., 2009) and maintained in the culture collection at −80 °C in 50% glycerol.

Species identity reconfirmation was conducted by sequence analysis of the nuclear ribosomal DNA (rDNA) ITS1-5.8S-ITS2 region, together with specific regions of the β-tubulin and calmodulin genes. Genomic DNA was extracted according to Raeder and Broda (1985) from a 3-day-old liquid culture in Czapek Yeast Extract medium (CYA) (Pitt and Hocking, 2009) incubated on an orbital shaker at 28 °C. Each PCR reaction contained 10 ng genomic DNA, 2.5 mmol<sup>−1</sup> of each primer, 1 mmol<sup>−1</sup> dNTPs, 4 mmol<sup>−1</sup> MgCl<sub>2</sub>, 1 U of Taq Platinum® polymerase (Invitrogen) and 1× Taq Platinum® polymerase buffer (Invitrogen). Ribosomal DNA ITS regions were amplified using primers ITS5 and ITS4 (White et al., 1990), a β-tubulin gene region with primers Bt2a and Bt2b (Glass and Donaldson, 1995), and a calmodulin gene region with primers Cmd5 and Cmd6 (Hong et al., 2006). PCR cycling was performed with the following programs: initial denaturation at 94 °C for 4 min, 30 cycles of denaturation at 94 °C for 1 min, primer annealing for 1 min (at 50 °C for primers ITS5 and ITS4, and at 60 °C for primers Bt2a, Bt2b, Cmd5 and Cmd6) and extension at 72 °C for 1 min, and a final extension period at 72 °C for 5 min. PCR products were purified using ExoSAP-IT® (USB, Cleveland, Ohio, USA) and sequenced using Big Dye® Terminator v3.1 Cycle Sequencing chemistry (Applied Biosystems, Foster City, CA, USA) on an ABI 3700 DNA sequencer (Applied Biosystems, Foster City, CA, USA). For molecular identification, sequences were compared against the NCBI nucleotide database using the BLASTn algorithm (Altschul et al., 1990). Ribosomal DNA ITS, β-tubulin and calmodulin gene sequences were deposited in GenBank under accession numbers MH540359, MH544272 and MH544273, respectively.

For analysis of gene expression in BLU37 following exposure to SB or glucose as carbon source, the strain was grown in either liquid or semi-solid minimal medium (KH<sub>2</sub>PO<sub>4</sub> 7 g; K<sub>2</sub>HPO<sub>4</sub> 2 g; MgSO<sub>4</sub> 0.4 g; (NH<sub>4</sub>)<sub>2</sub>SO<sub>4</sub> 1.6 g, pH 7.0, per liter of distilled water), containing SB (1% w/v) or glucose (1% w/v) (Sigma Aldrich) as exclusive carbon source. In order to guarantee elimination of reducing sugars prior to fungal inoculation, SB was repeatedly washed with deionized water until reducing sugars were no longer detectable by the colorimetric dinitrosalicylic acid (DNS) assay (Miller, 1959). Liquid cultures were grown in 100 mL of media in Erlenmeyer flasks, whilst semi-solid cultures were grown on petri plates with media supplemented with agar (15 g L<sup>−1</sup>). Fungal spores at a concentration of 1 × 10<sup>8</sup> conidiospores mL<sup>−1</sup> were used as inocula, with cultures then incubated at 28 °C and 150 rpm for 36 and 48 h. Fungal cultures were arranged in a randomized block design, with three replicates for each treatment and time point. Growth treatments were labeled as follows: liquid medium, SB carbon source, 36 h incubation (LB36); liquid medium, glucose carbon source, 36 h incubation (LG36); liquid medium, SB carbon source, 48 h incubation (LB48); liquid medium, glucose carbon source, 48 h incubation (LG48); semi-solid medium, SB carbon source, 36 h incubation (SB36); semi-solid medium, glucose carbon source, 36 h incubation (SG36); semi-solid medium, SB carbon source, 48 h incubation (SB48); semi-solid medium, glucose carbon source, 48 h incubation (SG48).
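The eight treatment labels follow from a 2 × 2 × 2 factorial layout (culture format, carbon source, incubation time). As a minimal sketch, the labels and the total number of cultures can be enumerated as shown here; the variable names are illustrative and are not taken from the study.

```python
from itertools import product

# Factor levels and the codes used to build the treatment labels.
formats = {"L": "liquid", "S": "semi-solid"}
carbon_sources = {"B": "SB", "G": "glucose"}
times_h = [36, 48]
replicates = 3  # three replicates per treatment and time point

# Label = format code + carbon source code + incubation time, e.g. "LB36".
treatments = {
    f"{f}{c}{t}": (formats[f], carbon_sources[c], t)
    for f, c, t in product(formats, carbon_sources, times_h)
}

for label, (fmt, carbon, t) in sorted(treatments.items()):
    print(f"{label}: {fmt} medium, {carbon} carbon source, {t} h")

total_cultures = len(treatments) * replicates
print("cultures in total:", total_cultures)
```

Note that the label SB36 (semi-solid medium, bagasse, 36 h) begins with the same letters as the abbreviation SB for steam-exploded bagasse, so the two should not be confused when reading the figures.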

#### Analysis of Enzymatic Activities

Analysis of hydrolytic enzyme secretion was conducted following fungal growth in the liquid minimal medium with SB (1% w/v) or glucose (1% w/v) as carbon source. Enzyme activities were evaluated over a 10 day period at 24 h intervals. Xylanase, CMCase, pectinase and FPase activities were determined using the DNS assay at pH 5.0. Each assay comprised 10 µL of the fungal secretome, together with xylan (1% w/v), carboxymethylcellulose (1% w/v), or pectin (1% w/v) as substrate. An assay for FPase activity, as a measurement of total cellulases, was conducted according to Ghose (1987). All assays were conducted with at least three replicates. Quantification of reducing sugars from the assays was conducted using a spectrophotometer at an absorbance of 540 nm (Spectra Max), calibrated using standard curves of glucose, xylose and galacturonic acid. Absorbance values for each assay were converted to international units (UI), where 1 UI was defined as the amount of enzyme necessary to release 1 µmol of reducing sugars per minute per liter by hydrolysis of each crude substrate.
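The conversion from absorbance to activity can be illustrated with a short sketch: a linear standard curve is fitted, the absorbance of an assay is converted to µmol of reducing sugar, and the result is divided by the incubation time and the sample volume. All numbers below are hypothetical, the incubation time is an assumption (it is not stated in the text), and the function names are illustrative only.

```python
def fit_standard_curve(amounts_umol, absorbances):
    """Least-squares line: A540 = slope * amount + intercept."""
    n = len(amounts_umol)
    mean_x = sum(amounts_umol) / n
    mean_y = sum(absorbances) / n
    sxx = sum((x - mean_x) ** 2 for x in amounts_umol)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(amounts_umol, absorbances))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def activity_per_ml(a540, slope, intercept, minutes, enzyme_volume_ml):
    """µmol of reducing sugar released per minute per mL of sample."""
    sugar_umol = (a540 - intercept) / slope
    return sugar_umol / (minutes * enzyme_volume_ml)


# Hypothetical glucose standards (µmol) and their A540 readings.
std_amounts = [0.0, 0.5, 1.0, 1.5, 2.0]
std_abs = [0.0, 0.2, 0.4, 0.6, 0.8]
slope, intercept = fit_standard_curve(std_amounts, std_abs)

# Hypothetical assay: A540 of 0.2, 30 min incubation (assumed),
# 10 µL (0.01 mL) of secretome as in the assay description.
activity = activity_per_ml(0.2, slope, intercept,
                           minutes=30, enzyme_volume_ml=0.01)
print(round(activity, 3))
```

A separate standard curve would be fitted for each reference sugar (glucose, xylose, galacturonic acid), since each substrate releases a different product.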

#### Visualization of Fungal Colonization of SB by Scanning Electron Microscopy

Following 36 and 48 h mycelial growth, liquid minimal medium culture supplemented with SB was filtered with Whatman® No. 1 filter paper and washed with Karnovsky buffer (0.05 M; pH 7.2). Samples were fixed for 4 h in a fresh solution of 0.05 M cacodylate buffer at pH 7.4. Following dehydration with acetone, samples were postfixed for 1 h using 1% osmium tetroxide. Samples were washed in liquid CO<sub>2</sub> at 4 °C, dried in a critical point drier (Emitech K850, Kent, UK), mounted on copper stubs, then sputter coated with 20 nm gold particles. Prepared samples were observed using a Zeiss DSM 962 scanning electron microscope.

#### Total RNA Extraction and Illumina RNA-seq

Following 36 and 48 h incubation, total RNA was extracted from each culture according to a standard phenol/chloroform method (Brasileiro et al., 2015). Mycelia from liquid media were collected by filtration with Whatman® No. 1 filter paper, whilst mycelia from semi-solid cultures were collected manually from the surface of each agar plate using a sterilized spatula. Extracted total RNA was quantified and its integrity determined using an Agilent 2100 Bioanalyzer and RNA LabChip® kit system (Agilent Technologies, Santa Clara, CA, USA). Isolation of mRNA, cDNA library construction and Illumina RNA-seq were conducted by Eurofins MWG Operon (Louisville, KY, USA). All treatments from the replicate bioassays were paired-end sequenced (2 × 100 bases) using TruSeq RNA Chemistry v3 on two flow cell channels of an Illumina Hiseq2000 system (Illumina Inc., San Diego, CA, USA).

#### Bioinformatics Analysis

#### Read Mapping and Assembly

Quality was determined for sequence reads from each cDNA library using ea-utils (Aronesty, 2011). For assignment of reads to gene models for the genus, high quality sequences (Fastq QC > 30) were mapped against the reference annotated genome sequence for the phylogenetically related species A. oryzae, strain RIB40 (National Research Institute of Brewing Stock Culture and ATCC-42149) (Machida et al., 2005), publicly available at DOGAN (http://www.bio.nite.go.jp/dogan/Top). Alignment and assembly were conducted using the programs TopHat (Trapnell et al., 2009) and Cufflinks (Trapnell et al., 2010).

#### Analysis of Gene Expression Levels for Normalized Data

Genes with statistically significant differential expression between the evaluated growth conditions were identified on the basis of comparisons of in silico data. Read counts aligned to each gene were determined using the Python script HTseq-count (Anders et al., 2015). Differences in gene expression levels between treatments were calculated using the DEGseq program (Wang et al., 2010a). Differentially expressed genes (DEGs) between evaluated treatments were considered to be significant if a log2 fold change (FC) of ≥2-fold was observed at a probability level of p ≤ 0.01.
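The significance criterion can be sketched as a simple filter. This is an illustration only and not the DEGseq procedure: the p-values are taken as given inputs, the pseudocount is an assumption added to avoid division by zero, the threshold is interpreted here on the log2 scale (adjust `min_abs_log2fc` if a 2-fold change on the linear scale is intended), and the gene names and counts are hypothetical.

```python
import math


def log2_fold_change(treatment_count, control_count, pseudocount=1.0):
    """log2 ratio of read counts; the pseudocount is an assumption."""
    return math.log2((treatment_count + pseudocount)
                     / (control_count + pseudocount))


def is_deg(log2fc, p_value, min_abs_log2fc=2.0, max_p=0.01):
    """True if the gene passes both the fold change and p-value cut-offs."""
    return abs(log2fc) >= min_abs_log2fc and p_value <= max_p


# Hypothetical genes: (reads on SB, reads on glucose, p-value).
genes = {
    "geneA": (799, 99, 0.001),   # strongly up on SB
    "geneB": (150, 99, 0.001),   # small change
    "geneC": (799, 99, 0.05),    # large change, weak support
}

degs = [
    name for name, (sb, glc, p) in genes.items()
    if is_deg(log2_fold_change(sb, glc), p)
]
print(degs)
```

Using the absolute value of the log2 fold change keeps both up- and down-regulated genes, which matches the later analysis of over- and under-representation.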

#### Gene Ontology and Analysis of Enrichment

Using the program FUNC (Prüfer et al., 2007), a hypergeometric test enabled analysis of both over- and under-representation of DEGs according to gene function classification within gene ontology categories (GO). Redundancy in category terms was eliminated using the program REVIGO (http://revigo.irb.hr/).
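For readers unfamiliar with the hypergeometric test, the following sketch shows how over- and under-representation p-values for a single GO category can be computed. It illustrates the statistical idea only and is not the FUNC implementation; the function and parameter names are hypothetical.

```python
from math import comb


def over_representation_p(k, category_size, n_degs, n_genes):
    """P(X >= k): chance of seeing at least k category genes among the DEGs."""
    upper = min(category_size, n_degs)
    hits = sum(
        comb(category_size, i) * comb(n_genes - category_size, n_degs - i)
        for i in range(k, upper + 1)
    )
    return hits / comb(n_genes, n_degs)


def under_representation_p(k, category_size, n_degs, n_genes):
    """P(X <= k): chance of seeing at most k category genes among the DEGs."""
    upper = min(k, category_size, n_degs)
    hits = sum(
        comb(category_size, i) * comb(n_genes - category_size, n_degs - i)
        for i in range(0, upper + 1)
    )
    return hits / comb(n_genes, n_degs)


# Toy example: 10 genes in total, 4 annotated to the category,
# 3 DEGs drawn, 2 of which fall in the category.
p_over = over_representation_p(2, category_size=4, n_degs=3, n_genes=10)
p_under = under_representation_p(1, category_size=4, n_degs=3, n_genes=10)
print(p_over, p_under)
```

In practice one such test is run per GO category, so the resulting p-values would normally be corrected for multiple testing.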

#### Carbohydrate-Active Enzymes

Identification of genes encoding hydrolytic enzymes was conducted through alignment against the Carbohydrate-Active Enzymes (CAZymes) database (http://www.cazy.org/) (Cantarel et al., 2009), to enable classification of glycosyl hydrolases, glycosyltransferases, carbohydrate-binding modules and carbohydrate esterases.

#### Transcription Factors

Transcription factors were identified following alignment of gene sequences against the Fungal Transcription Factor Database (FTFD) (http://ftfd.snu.ac.kr/index.php?a=view).

### Validation of RNAseq-Derived DEGs by RT-qPCR

To validate DEGs identified on the basis of in silico transcriptome data, a RT-qPCR analysis was conducted according to MIQE guidelines (Bustin et al., 2009). All specific primers were designed with Primer Express® software (Applied Biosystems) and evaluated at OligoAnalyzer 3.1 – IDT (https://www.idtdna.com/calc/analyzer). All cDNA libraries were synthesized using the same RNA samples employed for RNAseq analysis. Three independent biological replicates were analyzed for each carbon source and time point treatment, with three technical replicates per amplification. Total RNA was treated with 2 U of Amplification Grade DNase I (Invitrogen, Carlsbad, CA, USA) and cDNA was synthesized with Oligo(dT)20 primers (Invitrogen, Carlsbad, CA, USA) and SuperScript® II Reverse Transcriptase (Invitrogen, Carlsbad, CA, USA). PCR was carried out on a Step One Plus Real Time PCR System (Applied Biosystems) using a Platinum® SYBR® Green qPCR Super Mix-UDG w/ROX kit (Invitrogen, Carlsbad, CA, USA) according to the manufacturer's recommendations and 1 µL of template cDNA. Thermocycling was performed with 40 cycles of denaturation at 95 °C for 15 s, followed by primer annealing and extension at 60 °C for 30 s. GAPDH (Wang et al., 2010b) and β-tubulin (Mckelvey and Murphy, 2010) were employed as stable reference genes. Cycle threshold (Ct) values were determined using the program SDS 2.2.2 (Applied Biosystems, Foster City, USA), with specificity of PCR products for each primer set verified according to the Tm (dissociation) of the amplified products. Individual amplification efficiencies were calculated with the program LinRegPCR, version 2013.0, using a window-of-linearity. Gene expression values were calculated according to the 2<sup>−ΔΔCT</sup> method (Livak and Schmittgen, 2001).
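The 2<sup>−ΔΔCT</sup> calculation can be written out as a short sketch. It assumes 100% amplification efficiency (a doubling per cycle) and averages the Ct values of the reference genes, which is one common way of combining two references; both are assumptions for illustration, and the Ct values are hypothetical.

```python
def relative_expression(ct_target_treated, ct_refs_treated,
                        ct_target_control, ct_refs_control):
    """Fold change of a target gene by the 2^(-ddCt) method.

    Reference gene Ct values are averaged (an assumption); efficiency
    is taken as exactly 2 per cycle.
    """
    ref_treated = sum(ct_refs_treated) / len(ct_refs_treated)
    ref_control = sum(ct_refs_control) / len(ct_refs_control)
    delta_treated = ct_target_treated - ref_treated   # dCt on SB
    delta_control = ct_target_control - ref_control   # dCt on glucose
    return 2.0 ** -(delta_treated - delta_control)


# Hypothetical Ct values: target gene on SB versus glucose,
# with two reference genes (e.g., GAPDH and β-tubulin).
fold_change = relative_expression(
    ct_target_treated=20.0, ct_refs_treated=[18.0, 18.0],
    ct_target_control=25.0, ct_refs_control=[18.0, 18.0],
)
print(fold_change)
```

A lower Ct on SB than on glucose, after normalization to the reference genes, gives a fold change above 1, i.e. up-regulation on SB.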

### RESULTS

### Fungal Growth on Steam-Exploded Sugarcane Bagasse

Scanning electron microscopy-based observation of liquid minimal medium culture supplemented with SB following inoculation with A. tamarii conidia revealed evidence for degradation of sugarcane bagasse after incubation with the fungus (**Figure 1**). Homogeneous SB parenchyma fragments prior to inoculation can be observed in **Figure 1A**, with mycelial growth and resultant disruption of the parenchyma surface clearly visible in **Figures 1B,C**, following 36 and 48 h fungal colonization.

#### Evaluation of Enzyme Production

In order to gain an understanding of the genes and pathways involved in xylanase and cellulase induction following exposure to SB, analysis of enzyme secretion in A. tamarii BLU37 was first conducted over a 10 day period following growth on liquid minimal medium culture supplemented with SB, as well as on the same medium with a glucose carbon source as control (**Figure 2**). Xylanase activity was shown to rapidly increase during the first 24 h on SB (0.626 ± 0.001), in contrast to growth on glucose (0.000 ± 0.004), with activity remaining relatively constant over the 10 day period. The activities of CMCases, pectinases and FPases at 24 h on SB were lower than that of xylanases (0.096 ± 0.002, 0.039 ± 0.004, and 0.159 ± 0.037, respectively), although they were clearly induced by SB when compared to values for these enzymes following growth in glucose (0.000 ± 0.002, 0.006 ± 0.007, and 0.000 ± 0.088, respectively). Again, the activities of these enzymes remained relatively constant throughout the 10 day evaluation time course. Given the kinetics of xylanase production on SB, together with evidence for carbon catabolite repression during the first 48 h of growth on glucose, 36 and 48 h were selected as time points for analysis of the transcriptome of A. tamarii BLU37 in response to SB or glucose as carbon source.

FIGURE 1 | Scanning electron microscopy images of (A) non-inoculated steam-exploded sugarcane bagasse parenchyma in liquid minimal medium culture; (B) steam-exploded sugarcane bagasse parenchyma in liquid minimal medium culture following 36 h incubation with A. tamarii BLU37; (C) steam-exploded sugarcane bagasse parenchyma in liquid minimal medium culture following 48 h incubation with A. tamarii BLU37. Bars = 100 µm; 200× magnification.

#### Transcriptome of Aspergillus tamarii BLU37 in Response to Sugarcane Bagasse

#### Sequence Metrics

Illumina Hiseq2000-sequenced cDNA libraries resulted in a total of 634,255,527 reads with lengths between 75 and 85 bp, totaling 83.82 Gb of raw data. Illumina quality filtering indicated no information loss on the basis of passfilter data. High quality sequences (Fastq QC>30) averaged 80.34% after adapter trimming (**Supplementary Table 1**). The total of paired and unpaired reads mapping to the reference genome A. oryzae RIB40 was 94 and 84%, respectively. An indel percentage of approximately 0.002% was observed in quality filtered sequences, with a low total of chimeric reads, at 0.000949%. All Illumina RNAseq data was deposited at the NCBI Sequence Read Archive (SRA) database (BioProject ID PRJNA479954, SRA accession: SRP152413).

#### Gene Expression Modulation

Sequence data generated from each of the cDNA libraries successfully aligned to over 7120 of the 12074 gene models in the reference A. oryzae RIB40 genome. Such homogeneity in characterized genes across cDNA libraries indicates a high quality and coverage of sequenced mRNA.

Analysis of fold change in gene expression was conducted on read counts aligned to each gene model in the reference A. oryzae RIB40 genome. DEGs with significant fold change (≥2-fold, at a probability level of p ≤ 0.01) between treatments were identified by comparing mapped read counts for genes expressed in cultures grown on SB as carbon source with read counts following growth on the respective glucose controls. Differential gene expression analysis was also conducted in relation to culture format (liquid or semi-solid) and growth period (36 or 48 h).
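The selection criteria described above can be illustrated with a minimal Python sketch. The gene identifiers and values below are hypothetical, and the sketch assumes that the ≥2-fold threshold is applied symmetrically to up- and down-regulation (that is, |log2FC| ≥ 1); it is not the analysis pipeline used in the study.

```python
import math
from dataclasses import dataclass


@dataclass
class GeneResult:
    gene_id: str             # hypothetical identifier
    log2_fold_change: float  # SB versus glucose control
    p_value: float


def is_deg(result: GeneResult, min_fold: float = 2.0, max_p: float = 0.01) -> bool:
    """Return True when a gene passes both the fold-change and p-value cut-offs."""
    passes_fold = abs(result.log2_fold_change) >= math.log2(min_fold)
    passes_p = result.p_value <= max_p
    return passes_fold and passes_p


# Hypothetical results for four genes
results = [
    GeneResult("gene_a", 3.2, 0.0004),   # strongly induced, significant
    GeneResult("gene_b", 0.6, 0.0001),   # significant, but below 2-fold
    GeneResult("gene_c", -1.5, 0.008),   # repressed more than 2-fold, significant
    GeneResult("gene_d", 2.4, 0.2),      # above 2-fold, but not significant
]

degs = [r.gene_id for r in results if is_deg(r)]
print(degs)
```

Only genes passing both cut-offs are retained, so a large fold change with weak statistical support (gene_d) is excluded, as is a significant but small change (gene_b).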

A global heatmap representation of gene expression profiles in A. tamarii BLU37 following growth on SB in comparison to glucose revealed considerable modulation of gene expression following growth on this source of plant cell wall polysaccharides (**Figure 3**). Modulated genes on SB, in comparison to glucose, that grouped exclusively according to culture format (liquid or semi-solid culture), and independent of the growth period, can be observed in **Figure 3**, region 1. Modulated genes after growth on SB were also identified that grouped according to both culture format and growth period (**Figure 3**, region 2), according to SB as carbon source, independent of culture format or growth period (**Figure 3**, region 3), or exclusively according to growth period (**Figure 3**, region 4). Comparison of expression levels in the predicted genes identified in each of the treatments, whether in growth media supplemented with SB or glucose, revealed that gene expression in A. tamarii can also be influenced exclusively according to growth period or culture format, regardless of carbon source (**Supplementary Figure 1**).

Numerous genes were detected with significant expression fold change on SB as sole carbon source in comparison to glucose at an equivalent growth time point. For LB36, a total of 621 DEGs were detected with significant expression up-regulation, plus a total of 919 DEGs showing down-regulation. For LB48, numbers were greater, with 1187 DEGs up-regulated and 1229 down-regulated. The overall number of DEGs observed following growth on semi-solid media was generally lower than observed on liquid media. For SB36, a total of 755 DEGs were significantly up-regulated, with 449 down-regulated. Similarly for SB48, a total of 914 DEGs were detected as up-regulated and 445 as down-regulated.
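The counts reported above can be tallied in a short Python sketch. The numbers are taken directly from the text; the grouping of LB36/LB48 as liquid and SB36/SB48 as semi-solid treatments follows our reading of the paragraph and should be treated as an assumption.

```python
# Up- and down-regulated DEG counts as reported in the text
deg_counts = {
    "LB36": {"up": 621, "down": 919},
    "LB48": {"up": 1187, "down": 1229},
    "SB36": {"up": 755, "down": 449},
    "SB48": {"up": 914, "down": 445},
}

# Total DEGs per treatment
totals = {name: c["up"] + c["down"] for name, c in deg_counts.items()}

# Totals per culture format (assumed mapping: LB = liquid, SB = semi-solid)
liquid_total = totals["LB36"] + totals["LB48"]
semi_solid_total = totals["SB36"] + totals["SB48"]

for name, total in totals.items():
    print(f"{name}: {total} DEGs")
print(f"liquid: {liquid_total}, semi-solid: {semi_solid_total}")
```

Under this reading, the liquid treatments account for 3956 DEGs in total and the semi-solid treatments for 2563, consistent with the statement that fewer DEGs were observed on semi-solid media.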

#### Gene Ontology Enrichment Analysis of Differentially Expressed Genes

Differentially expressed genes were analyzed for over- and under-representation according to gene function classification within gene ontology categories (GO) (**Figure 4**). The majority of annotations for DEGs were classified in biological process subcategories, followed by molecular function and cellular component. Biological process GO terms enriched in DEGs up-regulated following SB hydrolysis included those involved in xylan (GO:0045493), pectin (GO:0045490) and glucan (GO:0009251) catabolic process, with enrichment of up to 4.2 times in all treatments supplemented with SB. Xyloglucan metabolic process (GO:0010411) was the most pronounced of such GO terms, with an enrichment of 8.4 times for the treatment SB48. By contrast, glycolytic process (GO:0006096), glucose catabolic process (GO:0006007), fatty acid (GO:0006633) and ergosterol biosynthetic process (GO:0006696) were abundant terms for down-regulated DEGs following growth in SB-supplemented treatments, indicating that genes classified under these GO terms are more highly expressed in the treatments supplemented with glucose. With regard to molecular function, subcategories cellulase (GO:0008810), galactosidase (GO:0015925), pectate lyase (GO:0030570), alpha-L-arabinofuranosidase (GO:0046556), carboxylic ester hydrolase (GO:0052689) and glucosidase activity (GO:0015926) were all enriched in up-regulated DEGs following growth on SB-supplemented treatments, providing further evidence for A. tamarii as a promising fungal species for secretion of cellulases and hemicellulases.
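As an illustration of how an enrichment value such as "4.2 times" can be derived, the sketch below computes a fold enrichment and a one-sided hypergeometric p-value for a single GO term. The counts are hypothetical, and the hypergeometric test is one common choice for over-representation analysis; the statistical procedure actually used in the study is not restated here.

```python
from math import comb


def fold_enrichment(hits_in_degs: int, n_degs: int,
                    hits_in_background: int, n_background: int) -> float:
    """Ratio of the term's frequency among DEGs to its frequency in the background."""
    return (hits_in_degs / n_degs) / (hits_in_background / n_background)


def hypergeom_upper_tail(k: int, n: int, K: int, N: int) -> float:
    """P(X >= k) when n genes are drawn from N, of which K carry the GO term."""
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


# Hypothetical counts: 8 of 100 DEGs carry the term, versus 20 of 1000 background genes
k, n, K, N = 8, 100, 20, 1000
enrichment = fold_enrichment(k, n, K, N)
p_value = hypergeom_upper_tail(k, n, K, N)
print(f"fold enrichment: {enrichment:.1f}, p-value: {p_value:.2e}")
```

In this hypothetical case the term is four times more frequent among DEGs than in the background; the p-value indicates how likely such an excess would be by chance alone.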

#### CAZyme-Encoding Genes

With regard to plant cell wall degradation in SB, global analysis of gene expression following fungal growth in the four treatments LB36, LB48, SB36, and SB48 revealed the presence of a total of 311, 314, 337, and 324 expressed CAZyme-encoding genes, respectively, after growth in treatments with this complex carbon source. Differential expression modulation in relation to equivalent treatments with glucose is summarized in **Figure 5**. A. tamarii BLU37 clearly expressed important genes related to SB degradation, with a total of 209 CAZy-encoding genes with statistically significant differential expression on SB, either in liquid or semi-solid treatments, in comparison to equivalent growth treatments with glucose as carbon source (**Table 1**; **Supplementary Table 2**). In terms of those with significant modulation of gene expression, genes encoding GH family proteins were the most abundant, with a total of 141 DEGs observed, followed by 15 DEGs encoding CEs, 15 DEGs encoding PLs, four DEGs encoding AA proteins and 32 DEGs encoding GTs. A Venn representation of total numbers of CAZyme-encoding DEGs in treatments LB36, LB48, SB36 and SB48 revealed 50 genes that were common to all four treatments and likely representing those essential in hydrolysis of the complex carbon source SB (**Figure 6**).
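The Venn comparison described above amounts to set operations on the DEG lists of each treatment. The following Python sketch uses small hypothetical gene sets (not the study's data) to show how genes common to all treatments, and genes exclusive to a single treatment, can be identified.

```python
# Hypothetical CAZyme-encoding DEG sets per treatment
degs_by_treatment = {
    "LB36": {"g1", "g2", "g3", "g5"},
    "LB48": {"g1", "g2", "g4"},
    "SB36": {"g1", "g2", "g3", "g6"},
    "SB48": {"g1", "g2", "g4"},
}

# Genes differentially expressed in every treatment (the Venn core)
core = set.intersection(*degs_by_treatment.values())

# Genes differentially expressed in one treatment only
exclusive = {}
for name, genes in degs_by_treatment.items():
    others = set().union(*(g for other, g in degs_by_treatment.items() if other != name))
    exclusive[name] = genes - others

print("core:", sorted(core))
for name, genes in exclusive.items():
    print(f"exclusive to {name}:", sorted(genes))
```

Applied to the real DEG lists, the core set would correspond to the 50 genes shared by all four treatments.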

DEGs involved in cellulose depolymerization included β-1,4-endoglucanases, cellobiohydrolases, β-glucosidases, and lytic polysaccharide mono-oxygenases (LPMO). Genes with expression modulation of up to 10-fold were also observed amongst the DEGs, including numerous genes encoding hemicellulases related to xylan depolymerization such as xylanase, endoxylanase, endo-β-1,4-xylanase, α-L-arabinofuranosidase, arabinofuranosidase, and feruloyl esterase. Genes related to pectin depolymerization were also highly expressed, such as those encoding polygalacturonase, pectinesterase, pectin lyase and pectate lyase. A total of 40 CAZyme-encoding DEGs were exclusive to specific treatments with sugarcane. Interestingly, most were expressed positively in the treatment SB36 and were related to pectin hydrolysis. These included, for example, orthologs for genes encoding a polygalacturonase (AO090009000470), a rhamnogalacturonan hydrolase (AO090102000139), a pectin lyase (AO090010000030), and a β-glucosidase (AO090701000841).

#### Differentially Expressed Transcription Factor and Transporter-Encoding Genes

A total of 80 transcription factor genes were differentially expressed in A. tamarii BLU37 following growth on sugarcane in comparison with equivalent treatments with glucose (padj < 0.01) (**Supplementary Table 3**). These comprised genes with transcription factor activity domains in families such as Zn2/Cys6 DNA-binding domain, Homeodomain, C2H2 zinc fingers, helix-loop-helix DNA-binding domain, GATA zinc finger and Basic-leucine zipper (bZIP). An ortholog for the transcription factor gene XlnR (AO090012000267), known to be responsible for activation of genes involved in xylan and cellulose degradation, showed an increased expression of up to 1-fold in LB48 and 1.46-fold in SB36. Additionally, the XlnR gene ortholog AO090003001292 showed increased expression of 2.06-fold in LB48 and 1.75-fold in SB48. Similarly, an ortholog of the ClrA transcription factor gene AO090011000944, which is positively regulated by XlnR, was differentially modulated 2.33-fold in LB36, 2.34-fold in LB48 and 2.07-fold in SB36. Two transcription factors controlling pectinase-encoding genes, namely RhaR (AO090005000121) and AraR (AO090003001292), were also up-regulated, 1.80-fold in LB48, 2.60-fold in SB36, 2.05-fold in LB48 and 1.74-fold in SB48. Interestingly, an ortholog of the BlrA gene (AO090005001041), a DNA binding transcription factor involved in asexual sporulation in fungi, displayed an early up-regulation in the treatments on sugarcane bagasse, with a 1.11-fold increase in expression in LB36 followed by a 0.73-fold decrease in LB48, together with a 7.40-fold increase in expression in SB36 and subsequent 1.70-fold decrease in gene expression in SB48. Transcription factor genes that are known to act in repression of cellulases and hemicellulases, such as the zinc-finger carbon catabolite repressor transcription factor CreA (AO090026000464) (Ruijter and Visser, 1997), together with Ace1 (AO090005001502) and Ace2 (AO090003000678), whilst observed amongst the unigenes identified in the A. tamarii transcriptome, were not significantly differentially expressed on sugarcane bagasse in comparison to growth on glucose as carbon source.

Transcription Factor-encoding genes; Gray dots, un-annotated genes.

A total of 155 transporter genes were differentially expressed across the four treatments on sugarcane, in comparison to glucose (**Supplementary Table 4**). Highly expressed genes related to sugar transport were, in the majority, those known to be positively regulated by the transcription factor XlnR and coding for Major Facilitator Superfamily (MFS) proteins. Orthologs of the genes AO090003000782, AO090001000069, and AO090003001277 were the most positively differentially expressed of such genes in all treatments on sugarcane, in comparison to glucose. The putative D-xylose transmembrane transporter XtrD (AO090001000069) was also highly expressed on sugarcane bagasse (log2FC 5.02 in LB36, 5.57 in LB48, 7.16 in SB36 and 4.68 in SB48), as was the putative cellobiose transporter cdt-2 gene ortholog (AO090003001277), expressed at log2FC levels 6.53 in LB36, 9.10 in LB48, 8.48 in SB36 and 6.07 in SB48.
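Because the transporter values above are given as log2 fold changes, it can help to convert them to linear fold changes. The short Python sketch below does this for the reported xtrD and cdt-2 ortholog values; the function name is ours and purely illustrative.

```python
def fold_change(log2_fc: float) -> float:
    """Convert a log2 fold change into a linear fold change."""
    return 2.0 ** log2_fc


# log2FC values as reported in the text
reported = {
    "xtrD ortholog (AO090001000069)": {"LB36": 5.02, "LB48": 5.57, "SB36": 7.16, "SB48": 4.68},
    "cdt-2 ortholog (AO090003001277)": {"LB36": 6.53, "LB48": 9.10, "SB36": 8.48, "SB48": 6.07},
}

for gene, by_treatment in reported.items():
    for treatment, log2_fc in by_treatment.items():
        print(f"{gene} {treatment}: log2FC {log2_fc:.2f} = {fold_change(log2_fc):.1f}-fold")
```

A log2FC of 5.02, for example, corresponds to a roughly 32-fold increase, and a log2FC of 9.10 to a roughly 550-fold increase in expression on sugarcane bagasse relative to glucose.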

#### Validation of RNAseq Analysis by RT-qPCR

In order to validate Illumina RNAseq-derived gene expression data, a total of 14 highly expressed CAZyme-encoding genes were selected for expression profile analysis by RT-qPCR. These gene orthologs, which are known to be positively regulated by XlnR, comprised cellobiohydrolases (AO090001000348, AO090012000941, AO090038000439), endoxylanases (AO090103000423, AO090120000026, AO090001000111), feruloyl esterase (AO090701000884), pectinesterase (AO090102000010), polygalacturonase (AO090102000011), arabinofuranosidase (AO090103000120), β-xylosidase (AO090005000986), exoarabinase (AO090011000141), endoglucanase (AO090026000102), and a Lytic Polysaccharide Mono-Oxygenase (LPMO) (AO090023000787). Both RNAseq and RT-qPCR gene expression values are represented in log2FoldChange, showing broad agreement in expression pattern tendencies to up- or down-regulation in each of the four treatment comparisons (**Figure 7**). Primer sequences for target genes selected for validation via RT-qPCR are available in **Supplementary Table 5**.
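For readers unfamiliar with how RT-qPCR data can be expressed as log2FoldChange, the sketch below uses the widely applied comparative Ct (ΔΔCt) approach. This is an assumption for illustration only: the quantification method and reference genes are not restated in this section, the Ct values are hypothetical, and the calculation assumes 100% amplification efficiency.

```python
def log2fc_from_ct(ct_target_treated: float, ct_reference_treated: float,
                   ct_target_control: float, ct_reference_control: float) -> float:
    """log2 fold change by the comparative Ct method (log2FC = -ddCt)."""
    delta_treated = ct_target_treated - ct_reference_treated
    delta_control = ct_target_control - ct_reference_control
    return -(delta_treated - delta_control)


def same_direction(log2fc_a: float, log2fc_b: float) -> bool:
    """True when two estimates agree on up- or down-regulation."""
    return log2fc_a * log2fc_b > 0


# Hypothetical Ct values: target gene on SB versus glucose, normalized to a reference gene
qpcr_log2fc = log2fc_from_ct(
    ct_target_treated=20.0, ct_reference_treated=18.0,
    ct_target_control=26.0, ct_reference_control=18.0,
)
rnaseq_log2fc = 5.4  # hypothetical RNAseq estimate for the same gene

print(f"RT-qPCR log2FC: {qpcr_log2fc:.1f}")
print("agreement in direction:", same_direction(qpcr_log2fc, rnaseq_log2fc))
```

Comparing the sign of the two estimates gene by gene gives a simple check of the "broad agreement in expression pattern tendencies" between the two methods.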

## DISCUSSION

Efficient 2G bioethanol production systems require the complete hydrolysis of hexose and pentose sugars present in plant biomass. In this study, a global analysis of the transcriptome revealed considerable ligninolytic potential in the ascomycete fungus A. tamarii BLU37. With an abundance of expressed CAZyme-encoding genes, data on the transcriptional response to contrasting carbon sources of sugarcane bagasse and glucose both increase understanding of the mechanisms involved in enzyme secretion and efficiency, and serve as a resource for exploitation of genes encoding transcription factors, transporters and enzymes involved in the degradation of plant biomass for engineering of improved microorganisms.

As plant pathogens or saprophytes, many filamentous fungi are efficient in hydrolytic extracellular enzyme secretion for plant biomass depolymerization. A number of species belonging to the genus Aspergillus are efficient in secretion of a large repertoire of glycosyl hydrolases appropriate for industrial application in lignocellulosic biomass degradation (Machida et al., 2005; Duarte et al., 2012; Jaramillo et al., 2013; Pirota et al., 2014). The strain examined in this study was originally isolated from composting cotton textile waste material (Siqueira et al., 2009). Typically, this species is associated with plant biomass degradation as a spoilage fungus, with reports of isolation from food and feed products (Rodrigues et al., 2011; Midorikawa et al., 2014; Martins et al., 2017; Prencipe et al., 2018). Previous studies have also highlighted the potential in this species in depolymerization of xylan, through secretion of β-1,4-endoxylanases (Gouda and Abdel-Naby, 2002; da Silva et al., 2014). Efficient secretion of such xylanases has also been reported in the strain A. tamarii BLU37 (Duarte et al., 2012; Monclaro et al., 2016). Whilst the transcriptome in different Aspergillus species during degradation of plant biomass has been the subject of recent investigation (de Souza et al., 2011; Delmas et al., 2012; Pullan et al., 2014; van Munster et al., 2014; Miao et al., 2015; Borin et al., 2017), despite the clear potential of A. tamarii as a xylanolytic species, there has been no investigation, prior to the current study, of the transcriptome of this species during saccharification of plant biomass.

The A. tamarii BLU37 transcriptome in response to SB was mapped to the annotated reference A. oryzae RIB40 genome, with the total number of expressed genes detected (7126 genes) similar to that previously observed in A. niger (7359 genes) when grown on this same lignocellulosic substrate as sole carbon source (Borin et al., 2017). Global analysis of gene expression revealed DEGs not only induced or repressed according to culture treatments with SB or C, but also according to semi-solid versus liquid culture (**Supplementary Figure 1**). Culture format has previously also been reported to influence enzyme secretion in Aspergillus species, with greater protein production in A. oryzae following solid-state fermentation (Wang et al., 2010b). Similarly, in the case of A. niger, solid-state fermentation of pretreated sugarcane bagasse was reported to favor endoglucanase and xylanase secretion, with submerged fermentation favoring β-glucosidase production (Vasconcellos et al., 2015).

Enzymes involved in the degradation of plant cell wall polysaccharides are distributed in different classes based on their primary amino acid sequence and related catalytic modules. Currently, these enzymes are classified as carbohydrate esterases (CEs), polysaccharide lyases (PLs), glycoside hydrolases (GHs), glycosyltransferases (GTs), enzymes with auxiliary activities (AAs), and carbohydrate-binding modules (CBMs). As lignocellulose deconstruction requires the activities of a large number of these CAZymes, analysis of DEGs included a comparison of CAZyme-encoding genes. Considering that Aspergillus spp. are able to colonize a variety of plant biomass polysaccharides, and given that sugarcane bagasse has been shown to comprise approximately 35% cellulose, 24% hemicellulose and 22% lignin (Rezende et al., 2011), it was expected that A. tamarii BLU37 would secrete many enzymes from different CAZy families when growing on SB as carbon source. The majority of the CAZyme-encoding genes observed amongst DEGs on SB were those encoding proteins classified across 36 GH families. The total of 141 GH family DEGs is of a similar order to the numbers identified in the genomes of A. oryzae, A. terreus, A. niger and A. nidulans, with respective totals of 194, 186, 157, and 172 predicted GH family CAZyme-encoding genes. The greatest numbers of genes with differential expression on SB were from GH families 3, 5, 31, 43, and 92. CAZyme-encoding DEGs classified in CE families (15 genes) and PL families (15 genes) were also similar in numbers to those observed in the A. oryzae genome, where 14 CE and 20 PL genes have been annotated (Benoit et al., 2015). DEGs encoding GTs were generally less expressed on SB in comparison to glucose, indicating a probable lack of involvement in degradation of this biomass. GTs represent a diverse family of enzymes that function in the cell in many activities relating to structure, storage and signaling.
Numerous GTs can transfer sugar residues from an activated sugar donor to specific acceptor molecules, resulting in the formation of glycosidic bonds. As such, this family of enzymes can enable the biosynthesis of an infinite number of oligosaccharides, polysaccharides and glycoconjugates (Taniguchi et al., 2002; Coutinho et al., 2003). Therefore, in contrast to sugarcane bagasse, where enzyme activity is involved in the degradation of plant cell wall polysaccharides, an increased expression of GTs on glucose may be expected, likely resulting in the biosynthesis of oligosaccharides and polysaccharides required for fungal metabolism. Such observations have also been reported in A. fumigatus on different lignocellulosic biomass sources (Miao et al., 2015; de Gouvêa et al., 2018), with many such GTs involved in the biosynthesis of fungal cell wall chitins and ergosterol glycosylation (Klutts et al., 2006; Castell-Miller et al., 2016). Additional DEGs unlikely to be involved in plant cell wall degradation included six GH18 putative chitinases, with activity likely associated with cell wall dynamics during fungal growth, as also observed in the ascomycete fungus Malbranchea cinnamomea (Hüttner et al., 2017).

TABLE 1 | CAZyme-encoding genes with significant increased expression after growth in treatments with steam exploded bagasse, in comparison with glucose.

down-regulated in each treatment.

Genes related to cellulose and hemicellulose deconstruction were highly expressed during liquid and semi-solid culture on SB (**Supplementary Table 2**), including cellobiohydrolases (CBHs) (celC - AO090001000348, celD - AO090012000941 and chbC - AO090038000439), endoglucanases (EGs) (celA - AO090026000102, celB - AO090010000314 and eglA - AO090005001553), and β-glucosidases (BGLs) (bgl3 - AO090003000497, bglD - AO090701000274 and bgl5 - AO090001000544). Two putative β-glucosidases (AO090701000244 and AO090005000337) and one putative exoglucanase (AO090005000423) were also observed with high levels of expression during liquid and semi-solid culture. In the case of the CAZy family AA9, whose members are copper-dependent lytic polysaccharide monooxygenases (LPMOs) responsible for cellulose cleavage through an oxidative reaction, two AA9 genes (AO090023000787 and AO090023000159) and two AA9 genes linked to a carbohydrate-binding module (CBM1) (AO090103000087 and AO090005000531) were observed amongst the highly expressed DEGs on SB, as also reported in A. niger (Borin et al., 2017). Cellulose-binding CBMs are frequently observed amongst fungal enzymes, playing roles in anchoring enzymes to crystalline cellulose (Igarashi et al., 2009), conferring enzyme and substrate specificity and enhancing enzyme activity (Crouch et al., 2016). CBM1s have been estimated to represent over 30% of CBMs in Ascomycetes and Basidiomycetes (Várnai et al., 2014).

Whilst sugarcane bagasse hemicellulose and pectin have structures composed of xylan, galactan and arabinan polymers, many of the highly expressed DEGs from liquid and semi-solid cultures grown on sugarcane bagasse were related to xylan saccharification, corroborating enzyme activity data and previous studies on A. tamarii. The endoxylanases (xynG1 - AO090001000111, xynF1 - AO090103000423) and xylanase (xynG2 - AO090120000026), for example, were up-regulated more than 10-fold in SB. Xylosidases involved in xylooligomer hydrolysis into xylose, such as α-xylosidases (AO090005000768 and AO090005000767) and β-xylosidases (xylA - AO090005000986 and xylB - AO090005000698), also showed high expression levels.

Genes that encode accessory enzymes involved in galactan breakdown, such as β-galactosidase and α-galactosidase (AO090012000135 and AO090011000063), as well as those involved in xylan side chain removal, such as arabinofuranosidases (AO090701000886 and AO090701000885) and feruloyl esterases (AO090701000884 and AO090023000158), were all also highly expressed in liquid and semi-solid SB culture.

Fungi from the genus Aspergillus possess a variety of genes encoding enzymes related to pectin degradation. A. niger possesses over 60 such genes and A. oryzae over 90 genes encoding for pectinolytic enzymes (Coutinho et al., 2009; Martens-Uzunova and Schaap, 2009). In our study, we detected numerous pectin-related DEGs with increased expression on SB. Genes with increased expression on semi-solid SB included a number from the PL1 family (AO090010000087, AO090011000673, and AO090102000072), the PL9 family (AO090038000131), pectinase-encoding genes from the GH28 family (exopolygalacturonase - AO090005001400; polygalacturonase - AO090009000470 and AO090138000086; and rhamnogalacturonan hydrolase - AO090102000139), as well as one gene from the PL1 family, a pectin lyase (AO090010000030). In contrast, a number of PL1 family DEGs displayed increased expression exclusively in liquid SB culture, namely AO090012000451, AO090701000321, and AO090010000706. Such data reveal that culture format of this carbon source, in addition to cultivation time, can influence gene expression. Although pectin is less abundant in sugarcane bagasse in comparison to xylan, in terms of chemical composition (de Souza et al., 2013), the presence of PL family members amongst the DEGs also indicates pectin breakdown in SB. Similar increased expression has previously been observed in PL4 family member genes in A. niger and A. fumigatus grown on SB (Borin et al., 2017; de Gouvêa et al., 2018).

Transcription factors perform major roles in the regulation of gene expression. Our data revealed a total of 80 differentially expressed transcription factor genes, from the families Zn2/Cys6 DNA-binding domain, Homeodomain-like, C2H2 zinc fingers, helix-loop-helix DNA-binding domain, GATA zinc finger and Basic-leucine zipper (bZIP) transcription factor. Amongst the DEGs, a total of 42 transcription factors were identified from Zn clusters, a family commonly found across fungal species (Shelest, 2017). Of these, the transcription factor XlnR is known to play an important role in xylanolytic transcriptional activation in fungi (van Peij et al., 1998), controlling expression of a wide range of genes, including those encoding xylanases and enzymes in the D-xylose metabolic pathway (Hasper et al., 2000; de Groot et al., 2007). This TF is also involved in regulation of expression of genes encoding endocellulases (Gielkens et al., 1999). In the case of Aspergilli specifically, XlnR is known, in the presence of D-xylose, to induce xylanases, β-xylosidases, cellobiohydrolases, endoglucanases, galactosidases, arabinofuranosidases and carbohydrate esterases (Mach-Aigner et al., 2012). In this study, XlnR was up-regulated according to SB substrate, as observed previously in A. niger in response to SB, over a similar time course (Borin et al., 2017), as well as in A. fumigatus on rice straw (Miao et al., 2015). Up-regulation was also observed in the ClrA gene following growth in LB36 and LB48. Together with XlnR, this gene is also known to act positively in cellulase production in A. niger (Raulo et al., 2016). Two transcription factors controlling pectinase-encoding genes, namely RhaR and AraR, were also up-regulated in LB and SB. The major pectinase-encoding genes in A. niger have been shown to be under the control of the transcription factors RhaR, AraR, and GaaR (Kowalczyk et al., 2017), suggesting that the up-regulation of RhaR and AraR observed in our study also indicates a role in induction of pectinase-encoding gene expression in A. tamarii, releasing lignin associated with pectin in the sugarcane bagasse cell wall.

Following the breakdown of lignocellulosic biomass into mono- and disaccharide sugars, transport into the cell will involve numerous sugar transporters. Although genome sequence data has revealed that filamentous fungi indeed harbor many genes encoding sugar transporters, relatively few have been functionally validated to date (dos Reis et al., 2016). MFS transporter proteins are known to transport small soluble molecules such as sugars across ion gradients (Pao et al., 1998). Following comparison with all predicted MFS proteins in the annotated reference genome sequences for A. oryzae (508), A. nidulans (357), and A. fumigatus (278) (Ferreira et al., 2005), our data revealed over 150 potential transporter genes differentially expressed in A. tamarii following growth on sugarcane bagasse in comparison to glucose. The relatively large number of transporters identified in the genus Aspergillus, in comparison with T. reesei and N. crassa, may explain their greater ability to transport small solutes (Miao et al., 2015). Previously, de Souza et al. (2011) characterized seven genes encoding predicted transporters in A. niger, with increased expression in response to sugarcane bagasse and repression in the presence of glucose, suggesting a role in xylose transport. From our data, numerous differentially expressed transporter genes potentially involved in SB degradation were identified. Increased expression of the putative D-xylose transmembrane transporter gene xtrD, for example, likely indicates involvement in transport into the cell of free D-xylose released from xylan following activity of highly expressed xylanases, endoxylanases and xylosidases. In accord with our findings, the xtrD gene from A. nidulans, whilst able to accept multiple sugars (xylose, glucose, galactose, and mannose), shows high affinity for xylose, being induced by xylose in an XlnR-dependent manner, and repressed by glucose in a CreA-dependent manner (Colabardini et al., 2014). 
Similarly, significant increased expression of the cdt-2 gene ortholog, a putative cellobiose transporter, was observed across all four treatments with SB as carbon source, likely associated with cellobiose transport into the fungal cell. Given the inability of S. cerevisiae to transport sugars other than glucose (Young et al., 2010), the identification of transporters for pentose sugars and cellodextrins in A. tamarii will offer potential in genetic modification of Saccharomyces yeasts, facilitating transport and fermentation of D-xylose.

Whilst there is evidence for conservation of the genomic potential for plant biomass degradation across certain Aspergillus species (Benoit et al., 2015), investigation into gene expression and enzyme secretion has shown considerable variation across species and during cultivation on different carbon sources. This first analysis of the repertoire of CAZyme-, transcription factor- and sugar transporter-encoding genes in A. tamarii modulated in response to SB increases our understanding of enzymatic saccharification of this lignocellulosic biomass. The transcriptome data serve as a resource for economically viable biorefinery applications, with potential for improving the enzymatic conversion of biomass to value-added products through genetic improvement of both lignocellulolytic filamentous fungi and the yeasts employed in the fermentation of hexose and pentose sugars in hydrolysates in industrial 2G ethanol production.

### DATA AVAILABILITY

The Illumina RNAseq datasets generated for this study can be found in the NCBI Sequence Read Archive (SRA) database (BioProject ID PRJNA479954, SRA accession: SRP152413).

## AUTHOR CONTRIBUTIONS

RM, EN, and EF planned the experiments. GM and CC performed the bioassays, enzyme analyses, RNA and cDNA preparation, and sequence data analysis. RT, MC, OS, and PG participated in sequence data analysis and editing of the manuscript. RM conceived the study, participated in bioassays, RNA preparation for cDNA library construction, sequence data analysis, and drafted the manuscript. All authors have contributed to, read and approved the final version of the manuscript.

### FUNDING

This work was partially funded by the Fundação de Amparo à Pesquisa do Distrito Federal (FAPDF) (project 193.000.584/2009). GM and CC were supported by scholarships from CAPES. RM, EN, and EF were supported by fellowships from the CNPq.

### ACKNOWLEDGMENTS

We thank the reviewers for their useful comments on the manuscript.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fbioe.2018.00123/full#supplementary-material

#### REFERENCES


degradation potential of Aspergillus nidulans and comparison to Aspergillus niger and Aspergillus oryzae. Fungal Genet. Biol. 46(Suppl. 1), S161–S169. doi: 10.1016/j.fgb.2008.07.020


aflatoxins in the Brazilian peanut production chain. Food Res. Int. 94, 101–107. doi: 10.1016/j.foodres.2017.02.006


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Midorikawa, Correa, Noronha, Filho, Togawa, Costa, Silva-Junior, Grynberg and Miller. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Abundance of Secreted Proteins of *Trichoderma reesei* Is Regulated by Light of Different Intensities

#### Eva Stappler <sup>1</sup> , Jonathan D. Walton<sup>2</sup> , Sabrina Beier <sup>1</sup> and Monika Schmoll <sup>1</sup> \*

<sup>1</sup> Center for Health and Bioresources, AIT Austrian Institute of Technology GmbH, Tulln, Austria, <sup>2</sup> MSU-DOE Plant Research Laboratory, Department of Plant Biology, Michigan State University, East Lansing, MI, United States

In Trichoderma reesei light is an important factor in the regulation of glycoside hydrolase gene expression. We therefore investigated the influence of different light intensities on cellulase activity and protein secretion. Differentially secreted proteins in light and darkness as identified by mass spectrometry included members of different glycoside hydrolase families, such as CBH1, Cel3A, Cel61B, XYN2, and XYN4. Several of the associated genes showed light-dependent regulation on the transcript level. Deletion of the photoreceptor genes blr1 and blr2 resulted in a diminished difference of protein abundance between light and darkness. The amount of secreted proteins including that of the major exo-acting beta-1,4-glucanases CBH1 and CBH2 was generally lower in light-grown cultures than in darkness. In contrast, cbh1 transcript levels increased with increasing light intensity from 700 to 2,000 lux but dropped at high light intensity (5,000 lux). In the photoreceptor mutants Δblr1 and Δblr2 cellulase activity in light was reduced compared to activity in darkness, showing a discrepancy between transcript levels and secreted cellulase activity. Furthermore, evaluation of different light sensitivities revealed an increased light tolerance with respect to cellulase expression of QM9414 compared to its parental strain QM6a. Investigation of one of the differentially expressed proteins between light and darkness, CLF1, revealed its function as a factor involved in regulation of secreted protease activity. T. reesei secretes a different set of proteins in light compared to darkness, this difference being mainly due to the function of the major known photoreceptors. Moreover, cellulase regulation is adjusted to light intensity and improved light tolerance was correlated with increased cellulase production. Our findings further support the hypothesis of a light intensity dependent post-transcriptional regulation of cellulase gene expression in T. reesei.

Keywords: *Trichoderma reesei, Hypocrea jecorina*, cellulase gene expression, secretion, light tolerance, protease

## INTRODUCTION

The filamentous ascomycete Trichoderma reesei (syn. Hypocrea jecorina) is an important producer of industrial enzymes, especially cellulases for conversion of cellulosic biomass (Bischof et al., 2016; Paloheimo et al., 2016; Schmoll et al., 2016). Especially for the production of biofuels from cellulosic waste material the enzymes of T. reesei are very important (Kumar et al., 2008). Additionally T. reesei is a frequently used host for the production of heterologous proteins (Nevalainen and Peterson, 2014; Singh et al., 2015). For T. reesei, glycoside hydrolase gene expression is regulated by light at the transcriptional level (Schmoll et al., 2005; Tisch et al., 2011b; Tisch and Schmoll, 2013).

#### *Edited by:*

André Ricardo Lima Damásio, Universidade Estadual de Campinas, Brazil

#### *Reviewed by:*

Fernando Segato, University of São Paulo, Brazil Helena Nevalainen, Macquarie University, Australia João Paulo Lourenço Franco Cairo, Brazilian Laboratory for Science and Technology of Bioethanol, Brazil

> *\*Correspondence:* Monika Schmoll monika.schmoll@ait.ac.at

#### *Specialty section:*

This article was submitted to Microbiotechnology, Ecotoxicology and Bioremediation, a section of the journal Frontiers in Microbiology

*Received:* 16 August 2017 *Accepted:* 12 December 2017 *Published:* 22 December 2017

#### *Citation:*

Stappler E, Walton JD, Beier S and Schmoll M (2017) Abundance of Secreted Proteins of Trichoderma reesei Is Regulated by Light of Different Intensities. Front. Microbiol. 8:2586. doi: 10.3389/fmicb.2017.02586

Light is an important environmental cue for most living organisms (Dunlap and Loros, 2017). Changes in light conditions, as a result of diurnal cycles or of growth on the surface compared to within a substrate, lead to considerably altered physiological processes in fungi. Light influences diverse functions like sexual development, conidiation, intracellular levels of ATP and cyclic adenosine monophosphate (cAMP), and many metabolic processes (Corrochano, 2007; Rodriguez-Romero et al., 2010; Tisch and Schmoll, 2010; Schmoll, 2011). In ascomycetes, the perception of light signals is largely conserved, with detection of blue, red and green light depending on the species (Idnurm and Heitman, 2005). Thereby, two GATA-type transcription factors containing PER Arnt Sim (PAS) domains are crucial for light perception (Schafmeier and Diernfellner, 2011). They act on a flat hierarchy, targeting transcriptional regulators, which in turn act on downstream pathways (Smith et al., 2010). The light perception machinery of T. reesei consists of BLR1, BLR2 and ENV1, which all contain PAS domains and have functions predominantly in light, but also in darkness (Schuster et al., 2007; Schmoll et al., 2010; Tisch and Schmoll, 2013; Schmoll, in press). BLR1 and BLR2 (blue light regulator 1 and 2) are homologs of the N. crassa photoreceptors White Collar-1 (WC-1) and White Collar-2 (WC-2), two GATA zinc-finger transcription factors that together form the White Collar Complex (WCC) to transfer signals to their target genes (Brunner and Kaldi, 2008; Chen et al., 2010). BLR1 and BLR2 regulate growth in light conditions and modulate cellulase gene transcription (Castellanos et al., 2010; Gyalai-Korpos et al., 2010; Schmoll et al., 2010). Although they are expected to act as a complex, BLR1 and BLR2 as well as their homologs in N. crassa also have individual functions (Schmoll et al., 2012; Tisch and Schmoll, 2013).

A third photoreceptor in N. crassa is VIVID, a PAS/LOV domain protein that is involved in detecting changes in light intensity as well as in adaptation to constant light (Heintzen et al., 2001; Hunt et al., 2010). VIVID is assumed to sense the difference between changes in light intensity during the day and night (moonlight) and is essential for photoadaptation, during which it binds to the WCC and acts as a universal brake for photoresponses (Chen et al., 2009; Malzahn et al., 2010). Its ortholog in T. reesei, ENV1 (Schmoll, in press), is not a functional homolog, although it shares similar functions (Schmoll et al., 2005; Castellanos et al., 2010). Particularly, deletion phenotypes are different, while the photoreceptors BLR1 and BLR2 are essential for env1 induction as in N. crassa (Schmoll et al., 2005; Castellanos et al., 2010). One further important difference is in the cysteine residue at position 96, which integrates oxidative stress signaling with light response in Hypocreales (Lokhandwala et al., 2015). ENV1 is necessary for normal growth in light and for photoadaptation, as well as for responding to different light intensities (Schmoll et al., 2005; Schuster et al., 2007; Castellanos et al., 2010) and has functions in sexual development (Seibel et al., 2012) and regulation of the heterotrimeric G-protein pathway (Tisch et al., 2011a).

The proteome of T. reesei has been analyzed previously under different conditions. The types and abundance of secreted and intracellular proteins strongly depend on the carbon source for growth (Adav et al., 2012; Jun et al., 2013; Peciulyte et al., 2014). The most efficient protein secretion rate occurs at low specific growth rates (Pakula et al., 2005; Arvas et al., 2011). Analysis of proteome-wide phosphorylation revealed a complex signaling network for cellulase induction that includes components of carbon sensing, osmoregulation and light signaling (Nguyen et al., 2016).

Due to the altered regulation of cellulase gene expression and phenotypic characteristics in light in T. reesei, we became interested in whether light intensity is crucial for regulation. Here we studied the secreted proteins of T. reesei under four different light conditions, ranging from 700 to 5,000 lux, during growth on cellulose. Traditionally, besides the original isolate QM6a, different strains of T. reesei are used for functional genomics analysis (Guangtao et al., 2009; Schuster et al., 2012a), which raised the question of whether the genetic background is relevant for light tolerance. Hence, we compared the effect of light and light sensitivity on both the ancestral isolate QM6a and the cellulase high producer QM9414, and also evaluated the influence of the photoreceptors BLR1, BLR2, and ENV1 on the secretome in light and dark. Analysis of one of the differentially regulated proteins revealed that it had a function in regulating protease activity in T. reesei.

## MATERIALS AND METHODS

### Fungal Strains and Culture Conditions

Trichoderma reesei wild-type strain QM6a, its derivative QM9414 (ATCC 26921), QM9414Δblr1, Δblr2, and Δenv1 (Castellanos et al., 2010), and QM6aΔku80 were used throughout this study. Strains were maintained on 3% (w/v) malt extract-agar (malt extract: Merck, Darmstadt, Germany; agar-agar: Roth, Karlsruhe, Germany). For quantitative reverse transcription-PCR (qRT-PCR) analysis, biomass determination, SDS-PAGE, and secreted cellulase activity, T. reesei was grown in liquid culture in 100 ml Mandels-Andreotti minimal medium (Mandels and Andreotti, 1978) supplemented with 0.1% (w/v) peptone (Roth, Karlsruhe, Germany) and with 1% (w/v) microcrystalline cellulose (Alfa Aesar, Karlsruhe, Germany) as a carbon source for 72 h at 28°C on a rotary shaker (200 rpm). Strains were grown either in the presence of constant illumination (Osram L 18W/835; daylight-simulating wavelength distribution) with different light intensities ranging from 700 to 5,000 lux or in constant darkness. In the latter case, cultures were harvested under red safety light (Fischer Photolamp 230V 15W<sup>∗</sup> 5F, Diez, Germany) using Miracloth filtration material (Calbiochem/Merck, Darmstadt, Germany) to separate the culture from the supernatant.

### Construction of *T. reesei* Deletion Strains

The deletion vector for TR\_111915 was constructed by yeast recombination as described earlier (Schuster et al., 2012a) using primers pdel111915\_5F 5′ GTAACGCCAGGGTTTTCCCAGTCACGACGTCTCTTGAAGCCATGAAAGC 3′ and pdel111915\_5R 5′ ATCCACTTAACGTTACTGAAATCTCCAACGGAGGAGGTAGATTAAAGGC 3′ (956 bp fragment) for PCR amplification of the 5′ region and pdel111915\_3F 5′ CTCCTTCAATATCATCTTCTGTCTCCGACGAATGTAAAGAGCTGGACAC 3′ and pdel111915\_3R 5′ GCGGATAACAATTTCACACAGGAAACAGCACAGAATCCAGCATAATGGC 3′ (994 bp fragment) for the 3′ region and the hph deletion cassette described therein. The deletion cassette was PCR-amplified with primers pdel111915\_5F and pdel111915\_3R (3,455 bp fragment). Ten micrograms of the purified fragment were used for protoplast transformation of QM6aΔku80 as described (Gruber et al., 1990). Transformants were selected on plates containing 100 µg/ml hygromycin B (InvivoGen, USA). Deletion of the open reading frame was tested by PCR using primers binding within the deleted region (RT\_111915\_F 5′ GACATGAAGTGCGTCCCCGACA 3′ and RT\_111915\_R 5′ CCTTCGGACAAGCCAACCCCAT 3′; 253 bp fragment). No amplicon was detectable in the mutant strain, confirming removal of the region to be deleted. Integration of the cassette at the correct location was confirmed by PCR using primers pdel111915\_SC 5′ ACATGTGGCCAAGGGAAATCGC 3′, binding outside the deletion cassette, and hph\_SC\_R 5′ GATGATGCAGCTTGGGCGCAG 3′, binding inside the marker gene (1,292 bp fragment).

### RNA Isolation and cDNA Synthesis

Extraction of total RNA was carried out as described (Tisch et al., 2011a) using the RNeasy Plant Mini Kit (Qiagen, Hilden, Germany). The RNA concentration was measured with a Nanodrop spectrophotometer. The quality of total RNA was evaluated by agarose gel electrophoresis and the RNA 6000 Nano Kit with the Agilent 2100 Bioanalyzer. The threshold for minimum quality was set to RIN > 9 (Tisch et al., 2011a). A 1 µg portion of each total RNA sample was treated with DNase I (Thermo Fisher, Waltham, MA, USA) and then reverse-transcribed using the RevertAID H minus first-strand cDNA synthesis kit (Thermo Fisher, Waltham, MA, USA) using oligo(dT)18 primers.

### Quantitative RT-PCR

Quantitative RT-PCR was performed as described (Tisch et al., 2011a). All reactions were performed on a CFX96 Real-Time system machine (Bio-Rad) with the GoTaq qPCR Master Mix (Promega, Madison, WI, USA) and primers for cbh1 and L6e (reference gene) (Tisch et al., 2011a). At least two biological replicates were analyzed with three technical replicates. Data was analyzed with qbase+ (Biogazelle) and the CFX Maestro (Bio-Rad) software (statistics including ANOVA).
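As a rough illustration of relative quantification against a reference gene, the sketch below applies the widely used 2^-ΔΔCt calculation. This is an assumption made for illustration only: the analysis described above was done in qbase+ and CFX Maestro, whose models (for example, efficiency correction) may differ, and the function name and Ct values are hypothetical.

```python
from statistics import mean


def fold_change(ct_target_sample, ct_ref_sample, ct_target_control, ct_ref_control):
    """Relative expression of a target gene by the 2^-ddCt method.

    Each argument is a list of Ct values from technical replicates.
    Assumes roughly 100% amplification efficiency for both primer pairs.
    """
    # Normalize the target gene to the reference gene within each condition.
    d_ct_sample = mean(ct_target_sample) - mean(ct_ref_sample)
    d_ct_control = mean(ct_target_control) - mean(ct_ref_control)
    # Compare the sample condition with the control condition.
    dd_ct = d_ct_sample - d_ct_control
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values (not from the study): target gene vs. reference gene,
# light-grown sample compared with a dark-grown control.
example = fold_change(
    ct_target_sample=[20.0, 20.0, 20.0],
    ct_ref_sample=[18.0, 18.0, 18.0],
    ct_target_control=[21.0, 21.0, 21.0],
    ct_ref_control=[18.0, 18.0, 18.0],
)
print(f"fold change: {example:.2f}")
```

With these made-up values the target gene's Ct is one cycle lower in the sample than in the control after normalization, which corresponds to a two-fold higher transcript level.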

### Biomass Determination

Biomass in the presence of insoluble cellulose was analyzed as described (Schuster et al., 2011). Because insoluble cellulose prevents direct measurement of biomass, the protein content of the mycelium was determined instead as a proxy for biomass. Briefly, strains were grown in liquid medium with cellulose as the carbon source as described above. Mycelia were harvested by filtration, frozen in liquid nitrogen, and ground in precooled grinding jars in a Retsch Mill MM301 (Retsch, Haan, Germany) for 30 s with an oscillation frequency of 30 Hz. The powder was suspended in 0.1 M NaOH. This suspension was sonicated three times for 30 s and incubated for 3 h at room temperature. Samples were centrifuged for 10 min at 3,220 × g, and the supernatants were transferred to new tubes. The protein concentration, used here as a measure of biomass, was measured by the Bradford Protein Assay (Bio-Rad, Hercules, USA) with bovine serum albumin (BSA) as standard according to the manufacturer's instructions. For each strain three biological replicates with two technical replicates were analyzed. Statistical analysis was done using PSPP 1.0.1 (version August 2017).

### Determination of CMCase Activity in Culture Filtrates

Strains were grown in liquid medium as described above. Endo-1,4-β-D-glucanase activity in culture filtrates was assayed with azo-CM-cellulose (S-ACMC-L, Megazyme, Wicklow, Ireland). CMCase activity was determined using five biological replicates. To determine specific CMCase activity, measured activity was normalized to the amount of biomass as reflected by the protein content of mycelia. Statistical analysis was done using PSPP 1.0.1 (version August 2017).
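The relation between measured activity and biomass can be sketched as a simple ratio. The snippet below is a minimal sketch that assumes replicate means are divided; the function name, units, and numbers are hypothetical, and how replicates were actually combined is not stated above.

```python
from statistics import mean


def specific_activity(activities, biomass_protein):
    """Mean secreted activity divided by mean biomass (mycelial protein).

    activities      -- activity readings per culture (arbitrary units)
    biomass_protein -- mycelial protein per culture (e.g. mg), used as biomass proxy
    """
    if not activities or not biomass_protein:
        raise ValueError("need at least one replicate of each measurement")
    biomass = mean(biomass_protein)
    if biomass <= 0:
        raise ValueError("biomass must be positive")
    return mean(activities) / biomass


# Hypothetical readings, for illustration only.
dark = specific_activity([1.0, 2.0, 3.0], [0.5, 0.5])
light = specific_activity([0.2, 0.2, 0.2], [0.4, 0.4])
print(dark, light)
```

Normalizing in this way keeps a strain with poor growth from appearing to be a poor producer merely because less mycelium is present.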

### Protein Recovery from Culture Filtrates, SDS-PAGE, and Western Blotting

Aliquots of liquid culture filtrates were precipitated with 0.25 volume of 100% (w/v) trichloroacetic acid (Sigma-Aldrich, Germany), mixed well and incubated on ice for 30 min. After centrifugation for 30 min at 16,000 × g, the pellet was washed twice with acetone (Roth, Karlsruhe, Germany). The pellet was dried at room temperature and redissolved in 2x SDS-PAGE loading buffer. Before loading, samples were heated for 10 min at 99°C. SDS-PAGE, Coomassie staining, and Western blotting were performed according to standard protocols (Ausubel et al., 2007). Relative quantitation of Coomassie-stained bands was done using the Image Lab software (Bio-Rad, Hercules, CA, USA). For analysis of cellulases in culture filtrate, proteins were blotted on a nitrocellulose membrane (RPN303D, Amersham™ Hybond™-ECL, GE Healthcare, Little Chalfont, United Kingdom) by wet electroblotting. Antibodies against the major cellulases CBH1 and CBH2 (Mischak et al., 1989) and a horseradish peroxidase-conjugated anti-mouse IgG (W4021, Promega, Madison, WI) were used for detection. For visualization, Clarity Western ECL Substrate (170-5061, Bio-Rad, Hercules, CA, USA) was used. Photographs were taken with the ChemiDoc (Bio-Rad, Hercules, CA, USA).
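As a quick check of the precipitation step, and assuming that the volumes are simply additive, adding 0.25 volumes of 100% (w/v) trichloroacetic acid to a filtrate of volume $V$ gives a final acid concentration of about 20% (w/v):

```latex
c_{\text{final}}
  = \frac{0.25\,V \times 100\%}{V + 0.25\,V}
  = \frac{25\%}{1.25}
  = 20\%\ \text{(w/v)}
```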

### Experimental LC/MS/MS

Gel bands were digested in-gel according to Shevchenko et al. (1996) with modifications. Briefly, gel bands were dehydrated using 100% acetonitrile and incubated with 10 mM dithiothreitol in 100 mM ammonium bicarbonate, pH 8, at 56°C for 45 min, dehydrated again and incubated in the dark with 50 mM iodoacetamide in 100 mM ammonium bicarbonate for 20 min. Gel bands were then washed with ammonium bicarbonate and dehydrated again. Sequencing grade modified trypsin was prepared to 0.01 µg/µL in 50 mM ammonium bicarbonate and ∼50 µL of this was added to each gel band so that the gel was completely submerged. Bands were incubated at 37°C overnight. Peptides were extracted from the gel by water bath sonication in a solution of 60% acetonitrile/1% TCA and vacuum dried to ∼2 µL. Peptides were then re-suspended in 2% acetonitrile/0.1% trifluoroacetic acid to 20 µL. From this, 10 µL were injected by a Waters nanoAcquity Sample Manager (www.waters.com) and loaded for 5 min onto a Waters Symmetry C18 peptide trap (5 µm, 180 µm × 20 mm) at 4 µL/min in 5% acetonitrile/0.1% formic acid. The bound peptides were eluted onto a Waters BH130 C18 column (1.7 µm, 100 µm × 150 mm) and eluted over 16 min with a gradient of 5–30% B in 8 min, ramped up to 90% B at 9 min and held for 1 min, then dropped back to 5% B at 10.1 min using a Waters nanoAcquity UPLC. Buffer A was 99.9% water/0.1% formic acid and buffer B was 99.9% acetonitrile/0.1% formic acid. Flow rate was 1 µL/min.

Eluted peptides were sprayed into a ThermoFisher LTQ Linear Ion trap mass spectrometer outfitted with a MICHROM Bioresources ADVANCE nano-spray source. The top five ions in each survey scan were subjected to data-dependent zoom scans followed by low energy collision induced dissociation (CID) and the resulting MS/MS spectra were converted to peak lists in Mascot Distiller, v2.4.3.3 (www.matrixscience.com) using the default LTQ instrument parameters. Peak lists were searched against all sequences available in the T. reesei protein database (downloaded from the Joint Genome Institute, http://www. jgi.doe.gov/) appended with common laboratory contaminants (downloaded from www.thegpm.org, cRAP project) using the Mascot searching algorithm, v2.4 (www.matrixscience.com). The Mascot output was then analyzed using Scaffold Q+S, v4.3.0 (www.proteomesoftware.com) to probabilistically validate protein identifications. Assignments validated in Scaffold with <1% false discovery rate were considered true. In addition, the minimum criteria for positive identification were at least two peptides and >95% probability as determined by Scaffold.
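The minimum identification criteria stated above (at least two peptides and more than 95% probability) amount to a simple filter. The following is a minimal sketch with a hypothetical record type and made-up hits; it does not reproduce Scaffold's probability model or the separate 1% false discovery rate threshold.

```python
from dataclasses import dataclass


@dataclass
class ProteinHit:
    protein_id: str
    n_peptides: int      # number of distinct peptides assigned to the protein
    probability: float   # identification probability on a 0-1 scale


def is_positive_identification(hit, min_peptides=2, min_probability=0.95):
    """Apply the minimum criteria: >= 2 peptides and > 95% probability."""
    return hit.n_peptides >= min_peptides and hit.probability > min_probability


# Hypothetical hits, for illustration only (not data from the study).
hits = [
    ProteinHit("protein_A", n_peptides=5, probability=0.99),
    ProteinHit("protein_B", n_peptides=1, probability=0.99),  # too few peptides
    ProteinHit("protein_C", n_peptides=3, probability=0.90),  # probability too low
]
accepted = [h.protein_id for h in hits if is_positive_identification(h)]
print(accepted)
```

Requiring two peptides guards against single-spectrum matches, which is why a hit can fail even with a high reported probability.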

Mascot parameters for all databases were as follows: (1) up to two missed tryptic sites allowed, (2) fixed modification of carbamidomethyl cysteine, (3) variable modification of oxidation of methionine, (4) peptide tolerance of ±200 ppm, (5) MS/MS tolerance of 0.6 Da and (6) peptide charge state limited to +2 and +3.
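To give a feeling for what a ±200 ppm peptide tolerance means in absolute terms, the sketch below converts a ppm tolerance into a mass window and checks whether an observed mass falls inside it. The function names and example masses are hypothetical, and this is not how Mascot itself is invoked.

```python
def mass_error_ppm(observed_mass, theoretical_mass):
    """Relative mass error in parts per million."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6


def tolerance_window_da(theoretical_mass, tol_ppm=200.0):
    """Half-width of the accepted mass window in Da for a given ppm tolerance."""
    return theoretical_mass * tol_ppm / 1e6


def within_tolerance(observed_mass, theoretical_mass, tol_ppm=200.0):
    """True if the observed mass lies within +/- tol_ppm of the theoretical mass."""
    return abs(mass_error_ppm(observed_mass, theoretical_mass)) <= tol_ppm


# Hypothetical peptide of 1500 Da: the +/-200 ppm window is about +/-0.3 Da.
window = tolerance_window_da(1500.0)
close_match = within_tolerance(1500.15, 1500.0)   # about 100 ppm off
far_match = within_tolerance(1500.45, 1500.0)     # about 300 ppm off
print(window, close_match, far_match)
```

Because the tolerance is relative, the absolute window widens with peptide mass, whereas the 0.6 Da MS/MS tolerance is a fixed absolute value.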

Data are deposited in the MassIVE database under accession number MSV000081684.

### Protease Activity

To determine protease activity, strains were grown on TSA plates (3 g/l tryptone (Merck, Darmstadt, Germany), 1 g/l soytone (Merck, Darmstadt, Germany), 1 g/l NaCl, 20 g/l agar) supplemented with 1.5% milk powder (Roth, Karlsruhe, Germany) at 28°C in constant darkness or constant light (1,800 lux). Protease activity was manifested by halo formation. The sizes of the cleared zones and hyphal extension were recorded and the ratio of the diameter of the halo to the diameter of the colony was calculated. Three biological and two technical replicates were analyzed.
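The halo-to-colony ratio described above can be sketched as follows. The function name and the diameters are hypothetical and serve only to show how the ratio separates protease activity from differences in colony growth.

```python
def halo_to_colony_ratio(halo_diameter_mm, colony_diameter_mm):
    """Ratio of cleared-zone diameter to colony diameter on milk plates."""
    if colony_diameter_mm <= 0:
        raise ValueError("colony diameter must be positive")
    return halo_diameter_mm / colony_diameter_mm


# Hypothetical measurements in mm, for illustration only.
reference_ratio = halo_to_colony_ratio(30.0, 20.0)
mutant_ratio = halo_to_colony_ratio(24.0, 20.0)
print(reference_ratio, mutant_ratio)
```

In this made-up case both colonies have the same diameter, so the lower ratio of the second strain reflects a smaller cleared zone rather than slower growth.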

## RESULTS

### The Secretome of *Trichoderma reesei* Is Influenced by Light

By SDS-PAGE we found considerable differences in the protein composition of culture filtrates of T. reesei during growth on cellulose in constant light or constant darkness (**Figure 1**, Figure S1 in Data sheet 1). This is consistent with the reported differences in the transcriptome in light and darkness (Tisch et al., 2011b; Tisch and Schmoll, 2013). Mass spectrometry-based proteomics identified the proteins with altered presence or abundance in light and darkness (**Table 1**).

Several members of different glycoside hydrolase families were found to be secreted in light and in dark conditions, often in clearly different amounts. In light we could identify TR\_123456, a candidate α,α-trehalase belonging to the glycoside hydrolase family 65, GLUC78 (TR\_121746), a candidate exo-1,3-β-glucanase and member of the glycoside hydrolase family 55 and RGX1 (TR\_122780), a candidate polygalacturonase from the glycoside hydrolase family 28. In darkness BXL1 (TR\_121127), a β-xylosidase of the glycoside hydrolase family 3, EGL6 (syn. Cel74A; TR\_49081), a xyloglucanase belonging to the glycoside hydrolase family 74, GLR1 (TR\_72526), an α-glucuronidase of the glycoside hydrolase family 67, BGL1 (syn. Cel3A; TR\_76672), β-glucosidase 1 belonging to the glycoside hydrolase family 3, CBH1 (syn. Cel7A; TR\_123989), the major cellulase in T. reesei, member of the glycoside hydrolase family 7, XYN4 (TR\_111849), a xylanase of the glycoside hydrolase family 30, TR\_65406, member of the glycoside hydrolase family 16 and XYN2 (TR\_123818), another xylanase, belonging to the glycoside hydrolase family 11 were found. Additionally, Cel61B (TR\_120961), a candidate lytic polysaccharide monooxygenase of glycoside hydrolase family 61, which was re-classified to auxiliary activity family 9 (AA9; Hemsworth et al., 2013) was detected. Moreover, several proteases and other proteins, some with unknown functions, were identified (**Table 1**). Of those proteins, XYN2, BGL1, EGL6, and XYN4 are among the top ranking proteins for limiting hydrolysis capacity of corn stover in T. reesei (Lehmann et al., 2016).

We used available microarray data (Tisch and Schmoll, 2013; Stappler et al., 2017) to gain information on light- and carbon dependent regulation of the genes encoding the detected proteins. Most of the genes encoding the identified proteins show upregulation of their transcripts in a cellulase specific manner (**Table 1**, Data sheet 2). In contrast, only 6 genes of those encoding the 26 identified proteins showed significant light dependent regulation in the microarrays, although the protein pattern showed considerable differences between samples in light and darkness. For example, the gene encoding Cel61B (GH61, band Q) exhibited no significant differences between light and darkness in the microarray, albeit the protein band was only visible in samples from darkness. Also XYN4 (GH30, band L) was only found to be secreted in darkness although no light dependent alteration in transcript abundance was detected (**Table 1**).

### Light Intensity Influences the Secretome and Cellulase Activity

In order to expand our knowledge of light-dependent differences in gene expression and protein production, we examined whether light intensity would result in altered expression patterns or protein abundance. We used four light intensities between 700 lux (low light) and 5,000 lux (high light) and analyzed the protein pattern of secreted proteins by SDS-PAGE. In the wild-type, the striking difference between light and darkness was obvious, but no major differences were seen when the light intensity was varied from 700 to 5,000 lux (**Figure 2A**).

Five protein bands (A, B, K, Q, and U, **Figure 2A**) showed particularly interesting responses to light. Band A, identified as TR\_123456, a member of the GH65, slightly increased at increasing light intensities. The intensity of protein band B, which contains the hypothetical ceramidase TR\_64397 and the hypothetical protease TR\_51365, was considerably increased to similar levels in all light conditions compared to darkness. Bands K (BGL1, CBH1), Q (Cel61B), and U (putative protease inhibitor TR\_111915) were barely detectable in light-grown cultures, but showed a strong signal in darkness (**Figure 2A**).

### Deletion of the Photoreceptors BLR1 and BLR2 Leads to Loss of Light-Specific Protein Pattern

To investigate the relevance of the photoreceptors ENV1, BLR1, and BLR2 to the light-dependent changes in the secretome pattern, we analyzed secreted proteins of the corresponding deletion strains. Protein patterns in Δenv1 were similar to the wild-type in darkness (**Figure 2B**). There was a clear difference between proteins secreted under dark conditions and proteins secreted in light in Δenv1 but no large influence of light intensity. Interestingly, band B, which showed a stronger signal in light in the wild-type QM6a, was decreased in Δenv1 in light compared to darkness (**Figure 2B**). In Δblr1 and Δblr2, the dark and light patterns were more similar than in the wild-type (**Figures 2C,D**). These results indicate that BLR1 and BLR2 play a major role in regulation of protein abundance in light. The influence of ENV1 is limited to altered regulation of individual proteins, but the overall decrease in protein secretion in light is not dependent on ENV1.

### Cellulase Transcription Drops to Basal Levels in High Light Intensities

Expression of the major cellulase cbh1 is influenced by light in T. reesei (Schmoll et al., 2005; Castellanos et al., 2010). To investigate the light effect in more detail, we analyzed cbh1 RNA levels at different light intensities and in photoreceptor deletion strains. In the wild-type QM9414 cbh1 transcript levels were increased by up to 50% at light intensities up to 2,000 lux compared to levels in darkness (**Figure 3A**) in agreement with earlier studies (Schmoll et al., 2005). At a higher light intensity (5,000 lux) cbh1 levels dropped to basal expression, which is not reflected by biomass production, as the strain did not show a major growth defect under these conditions (**Figure 3B**). Surprisingly, even though cbh1 transcript levels were upregulated at moderate light intensities, specific CMCase activity in these samples was severely decreased in QM9414 under all light conditions compared to activity in darkness (**Figure 3C**). Western blot analysis of culture filtrates revealed that decreased cellobiohydrolase levels were present at higher light intensities (**Figure 4A**). The signal for CBH1, as well as for CBH2 was the strongest in darkness. At 700 lux less CBH1 and CBH2 was present and the signal decreased further at higher light intensities. At 5,000 lux neither CBH1 nor CBH2 was detectable any more. The discrepancy between data for cbh1 transcript levels, protein abundance and actual activity strongly suggests a level of post-transcriptional regulation of cellulase biosynthesis in light, which is not likely to be limited to cbh1 and cbh2.

**Table 1** (continued), abbreviations: NR, no regulation; U, upregulation; D, downregulation.

### Photoreceptor Mutants can Still Respond to Light

Deletion of env1 leads to decreased cbh1 transcript levels at all light intensities (p < 0.01; **Figure 3A**), which is in agreement with previous findings for cultivation on cellulose for 72 h (Schmoll et al., 2005; Castellanos et al., 2010). In light Δenv1 exhibited severely decreased growth (p < 0.01; **Figure 3B**) and cellulase activity was below detection limits due to the low biomass formation (p < 0.01; **Figure 3C**). Moreover, no CBH2 could be detected in culture supernatants from light cultures (**Figure 4B**). In strains lacking the photoreceptors blr1 and blr2, cbh1 transcript levels were not strongly altered by the light condition and they did not show the drop in transcript levels that occurred with the wild-type at the high light intensity of 5,000 lux (**Figure 3A**). Hence light sensitivity with respect to cellulase transcription is alleviated in Δblr1 and Δblr2. Expression levels of cbh1 in Δblr1 were generally lower than in the wild-type but stayed at the same level independent of the light condition. In Δblr2, cbh1 levels were increased compared to the wild-type at high light conditions (5,000 lux; p < 0.01) and thereby largely resembled the increase of transcript levels seen in the wild-type, albeit this increase was only reached at 5,000 lux instead of 2,000 lux in the wild-type (**Figure 3A**). In contrast to transcription data, both Δblr1 and Δblr2 exhibited decreased specific cellulase activity in light compared to darkness, irrespective of the light intensity (p < 0.01; **Figure 3C**). Nevertheless, cellulase levels in Δblr2 in light were still higher than in the wild-type under the same conditions (p < 0.01). Detection of cellulase abundance in western blot analysis was in agreement with these results (**Figure 4B**). Consequently, strains lacking one of the photoreceptors were still responsive to light and influenced cellulase gene expression, hence confirming earlier results (Castellanos et al., 2010; Gyalai-Korpos et al., 2010; Schmoll et al., 2012; Tisch and Schmoll, 2013).

### QM9414 Is More Light Tolerant than QM6a

QM9414 represents a derivative of the natural isolate of T. reesei QM6a with improved cellulase gene expression (Vitikainen et al., 2010; Seiboth et al., 2011). We investigated the difference in sensitivity to light of these two strains. In QM6a cbh1 transcript levels in light were dramatically downregulated and hardly detectable at all light intensities whereas in QM9414 downregulation of cbh1 occurred only at the highest tested light intensity of 5,000 lux (**Figure 3A**), even though comparable amounts of biomass are produced in the two strains at 1,500–5,000 lux (**Figure 3B**). In darkness transcript levels of cbh1 showed only a slightly negative trend in QM6a compared to QM9414 (**Figure 3A**), but cellulase activity levels were considerably lower in QM6a (**Figure 3C**). Data for specific cellulase activity of QM6a in light, however, showed no detectable activity and western blotting did not reveal presence of CBH1 or CBH2 in light, whereas in QM9414 both were present at least for low to moderate light intensities (**Figures 3C**, **4A**). Also, the coregulation of CBH1 and CBH2 was not broken by light, independent of the intensities. This indicates that light tolerance with respect to cellulase production of the QM6a derivative QM9414 is improved compared to its parental strain. It remains to be investigated whether light tolerance generally correlates with improved enzyme production.

### Transcript Levels of *env1* Increase with Rising Light Intensity

It was shown previously that env1 transcript levels are upregulated in response to light (Schmoll et al., 2005). Here we show that light intensity directly affects the magnitude of env1 upregulation also in T. reesei (**Figure 3D**). In QM9414 env1 transcript levels were about 25 times higher at 700 lux than in darkness. Transcript levels increased further with increasing light intensity, reaching 94-fold upregulation at 5,000 lux. In the more light sensitive strain QM6a env1 transcripts are already 88-fold upregulated at 700 lux compared to levels in darkness. At 5,000 lux env1 is expressed 186 times more than in darkness. In darkness env1 transcript levels show a negative trend in QM6a compared to QM9414. In the photoreceptor deletion strain Δblr1 transcript levels of env1 are no longer influenced by light and remain at the low darkness levels when exposed to light. Lack of blr2 results in consistently decreased transcript abundance of env1 in light and darkness (p < 0.01) compared to dark levels in QM9414 (data not shown). These findings are in agreement with earlier studies (Castellanos et al., 2010) showing that the presence of BLR1 and BLR2 is necessary to activate env1 transcription upon exposure to light. Additionally, the positive regulation of env1 in light is in agreement with a positive effect on cellulase transcription in light (**Figures 3A,D**).

### Deletion of *clf1* Leads to Decreased Protease Activity

Analysis of the secretome in light and darkness revealed several interesting proteins. One band clearly visible in dark-grown cultures, but absent in light, was band U (**Figure 1**). We identified this band by mass spectrometry as TR\_111915. This gene shares homology with protease inhibitors that contain Kazal domains (Interpro domain IPR011497) (e-value 7e-48). Transcriptome data showed that TR\_111915 transcript levels were increased upon growth on cellulase-inducing carbon sources like cellulose, lactose or sophorose (Stappler et al., 2017). Furthermore, TR\_111915 was downregulated in response to light and was a target of ENV1, as well as of the adenylate cyclase (ACY1) in light (Schuster et al., 2012b; Tisch and Schmoll, 2013; Tisch et al., 2014). We tested the regulation of TR\_111915 upon growth on cellulose in darkness and in light of 1,500 and 5,000 lux by RT-qPCR. This analysis confirmed the downregulation in light with a more severe effect with the higher light intensity (**Figure 5A**). In QM6a, transcript abundance was below that of QM9414 under all conditions. In strains lacking BLR1 or BLR2, dark levels were around those in the wild-type and did not decrease in light (**Figure 5A**). As these data point to a light- and photoreceptor-dependent relevance of TR\_111915 for enzyme abundance, we designated this gene clf1 (cellulase and light associated factor 1) and prepared a knock-out strain of this gene in the QM6aΔku80 background.

Δclf1 did not show growth or sporulation defects. Since CLF1 contains a putative protease inhibitor domain we analyzed protease activity. In cultures grown in light, secreted protease activity of Δclf1 was significantly decreased compared to the wild-type. Also in darkness protease activity was decreased, but to a smaller extent than in light (**Figure 5B**). Consequently, CLF1 influences protease activity, but it does not act as a protease inhibitor; rather, it has a positive effect on protease activity and hence is not responsible for the difference in enzyme/protein abundance between light and darkness. Analysis of transcript levels of cbh1 and specific cellulase activity in liquid cultures of the wild-type QM6a, QM6aΔku80, and Δclf1 grown in constant light and constant darkness on MA media with cellulose as carbon source showed no significant differences between the deletion strain and the wild-type (data not shown). Hence, CLF1 is not involved in regulation of cellulase gene expression or potential degradation of secreted cellulases in light.

#### FIGURE 5 | Transcriptional regulation of clf1 and protease activity in Δclf1. (A) Transcript abundance of clf1 in constant darkness (DD) or in constant light with 1,500 lux ("1,500") or 5,000 lux ("5,000"). Strains were grown for 72 h on MA-media supplemented with 1% (w/v) cellulose in constant darkness (DD) or constant light at different light intensities. Two replicates were used for determination of cbh1 and env1 transcript abundance. (B) For determination of protease activity, strains were grown on milk TSA agar at 28°C in constant darkness (DD) or constant light (LL). Halo formation by proteases and hyphal extension was determined at the indicated time points. Halo diameter was related to the respective diameter of the mycelium under the respective condition. Ratio of halo to mycelia is given for the wild-type QM6a (dark gray) and Δclf1 (light gray). Three biological replicates and two technical replicates were used. Error bars show standard deviations. \* indicates values significantly different from QM9414 in darkness (p < 0.05). Statistical significance of other comparisons is given with p-values in the text.

#### DISCUSSION

In this study we investigated the effect of light on the secretome of T. reesei. Light strongly influences the composition of the secretome in that the amount of secreted proteins decreases considerably in the wild-type and differential regulation in light and darkness was observed. Additionally, light intensity influences the secretome, albeit the effect is minor compared to the difference in protein abundance between light and darkness.

We found several members of different glycoside hydrolase families to be differentially regulated in light and in dark conditions. Many of them show different intensities of the band on the SDS-PAGE gel, indicating that their amount present in the media differs depending on the light condition. These results are in agreement with previous studies showing that glycoside hydrolases are an important target of light signaling (Tisch et al., 2011b; Tisch and Schmoll, 2013). Proteins with other functions, for instance proteases or proteins with unknown functions, were also present in different amounts after growth in constant light or constant darkness. One of the proteins identified from the culture grown in light was the hydrophobin HFB2. It has been shown that hfb2 is regulated by the G-alpha protein GNA1 and in response to light (Nakari-Setälä et al., 1997; Seibel et al., 2009). Interestingly, hfb2 transcript levels are downregulated in light on lactose, sophorose and glycerol, but not on cellulose. Furthermore, hfb2 shows regulation by cellulase-inducing conditions only in darkness, not in light (Stappler et al., 2017). Moreover, hfb2 is involved in sporulation (Askolin et al., 2005) and it is highly expressed upon growth in media containing complex plant polysaccharides, cellulose, xylan, cellobiose, or lactose, as well as in response to N and C starvation (Nakari-Setälä et al., 1997). The lower amount of plant cell wall degrading enzymes secreted in light may explain the presence of HFB2 in the secretome due to an effect of starvation.

Surprisingly, microarray data from previous studies (Tisch and Schmoll, 2013) revealed that for several proteins which showed different secretion levels in light and in darkness, the corresponding genes are not regulated at the transcriptional level in response to light. Although it was previously reported that genes of predicted secreted proteins have a positive correlation to the extracellular specific protein production rate (Arvas et al., 2011), our data indicate that in addition to the transcriptional regulation, regulation at another level occurs. This is in agreement with other studies which show that in T. reesei a strict correlation of transcription and secretion of proteins is not always present and that already 2 min after induction changes in the phosphoproteomic profile can be detected that indicate a complex regulation pathway (Schuster et al., 2011; Nguyen et al., 2016). Also in N. crassa extensive post-transcriptional regulation was reported (Xiong et al., 2014). The pathway from gene transcription to protein secretion contains many steps, all of which are potential targets for additional regulation (for a review see Conesa et al., 2001). Recently, we showed that a G-protein coupled receptor (CSG1) impacts post-transcriptional regulation of cellulase gene expression. In the absence of CSG1, cellulase transcript levels remain almost unaltered, while secreted cellulase activity drops dramatically on cellulose and lactose (Stappler et al., 2017). Consequently, cellulase regulation comprises a transcriptional and a post-transcriptional section, for which we found support also in this study.

Analysis of cbh1 transcript levels and the specific cellulase activity also indicates that another regulation step is present. Transcript levels of cbh1 increase in light whereas the specific cellulase activity is dramatically decreased in light in QM9414. A similar discrepancy can be seen for the blr1 and blr2 deletion strains. Transcript levels of cbh1 are not regulated in response to light, but specific cellulase activity is strongly decreased in light. Although transcript levels of only one cellulase were determined and cumulative activity of all cellulases was measured, a correlation between the results would be expected, as it has been postulated previously that cellulases are transcriptionally co-regulated (Ilmen et al., 1997; Foreman et al., 2003).

We showed that the presence of light is an important signal for T. reesei but also that the intensity of light is relevant. Even though differences between light and darkness are of a much bigger magnitude, the intensity of light still affects expression of genes. ENV1, one of the photoreceptors, has been shown to be necessary for photoadaptation and for the response to increased light intensities (Schmoll et al., 2005; Schuster et al., 2007; Castellanos et al., 2010; Tisch et al., 2011b). Our study confirms that it clearly responds to different levels of light. Transcript levels of env1 are strongly upregulated even at a low light intensity of 700 lux. With increasing light intensity, env1 transcript levels also increase, almost proportionally to the intensity.

In our study we analyzed both the original isolate of T. reesei, QM6a, and its derivative, the enhanced cellulase producer QM9414. Our results show some interesting differences between these two strains with respect to light sensitivity. In QM6a in light, transcription of cbh1 is strongly decreased compared to levels in darkness and only at basal levels. In contrast, in QM9414 cbh1 levels increase at low to moderate light intensities, but at 5,000 lux cbh1 transcript levels drop dramatically. These results indicate that T. reesei in nature decreases its cellulase production upon encountering light. The laboratory strain QM9414 was mutagenized to obtain a mutant that produces high levels of cellulase (Mandels et al., 1971). As a result it seems to also be less sensitive to light. Genomic alterations between QM6a and QM9414 were analyzed by Vitikainen et al. (2010) by comparative genomic hybridization analysis, but no mutations that could explain the differences in light sensitivity were found. As mentioned above, transcript levels of env1 increase with rising light intensity. At the lowest tested light intensity of 700 lux, QM6a env1 transcript levels are almost twice as high as in QM9414. Much stronger light (2,000 lux) is needed for QM9414 to reach this level. As ENV1 is involved in adaptation to light, it seems that the more light-tolerant strain QM9414 upregulates env1 less than the more light-sensitive strain QM6a.

Analysis of the secreted proteins revealed an interesting unknown protein, TR\_111915. The band containing this protein (**Figure 1**, band U) is clearly present upon growth in darkness, but absent in light conditions. It is homologous to known protease inhibitors containing Kazal domains (Interpro domain IPR011497), which are known to specifically inhibit S1 serine proteases (Schmoll et al., 2016). TR\_111915 is one of only two putative proteinase inhibitors found in T. reesei (Schmoll et al., 2016). For industrial production of enzymes, proteases naturally produced by the fungus often constitute a problem, as they can degrade the desired product and diminish production rates and enzyme stability. Therefore, proteases are often missing in commercial enzyme preparations due to selection against proteases or targeted deletion (Nagendran et al., 2009). Because of this relevance of proteases in industry, the presence of a potential protease inhibitor that is regulated in response to cellulase-inducing conditions as well as to light and ENV1 (Tisch and Schmoll, 2013; Tisch et al., 2014; Stappler et al., 2017) was of interest. To test the function of TR\_111915, a deletion strain was constructed. Even though TR\_111915 shares homology with protease inhibitors, our data indicate that in T. reesei it serves a different function. Contrary to expectations, deletion of this gene leads to decreased protease activity under the tested conditions, indicating a not yet detected role and possibly an involvement in enhancing protease activity.

In summary, we show that light influences the secretome of T. reesei and that its regulation takes place not only at the transcriptional level, which is in agreement with recent studies. We found that the original isolate of T. reesei QM6a is more light sensitive than its derivative QM9414. These results show the importance of controlled light conditions for optimization of industrial strains. Even though they might not be as light sensitive as the original isolate QM6a anymore, they still clearly respond to light as do currently applied industrial production strains (unpublished results). Additionally, the light intensity should be taken into account since also in the light tolerant strain QM9414 a strong reaction and drop in cbh1 transcription was present at high light intensity.

#### AUTHOR CONTRIBUTIONS

ES performed light response experiments with T. reesei, strain construction and analysis, and drafted the paper. JW performed mass spectrometric analysis and participated in drafting the manuscript. SB contributed to statistical analysis. MS conceived the study and wrote the final version of the manuscript.

#### FUNDING

Work of ES was funded by the Austrian Science Fund (FWF), project P24350 to MS. Work in JW's lab was funded by the U.S. Department of Energy Great Lakes Bioenergy Research Center (DOE Office of Science BER DE-FC02-07ER64494). Additional funding was provided by grant DE-FG02-91ER200021 to the MSU DOE Plant Research Laboratory from the U.S. Department of Energy, Office of Basic Energy Sciences. The funding bodies had no influence on the design of the study and collection, analysis, and interpretation of data and on writing the manuscript.

#### ACKNOWLEDGMENTS

We thank Doug Whitten, MSU Research Technology Support Facility, for the proteomics analyses.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmicb.2017.02586/full#supplementary-material

#### REFERENCES

I and II from Trichoderma reesei. Biochim. Biophys. Acta 990, 1–7. doi: 10.1016/S0304-4165(89)80003-0


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Stappler, Walton, Beier and Schmoll. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.