Unraveling the dynamic transcriptomic changes during the dimorphic transition of Talaromyces marneffei through time-course analysis

Introduction Systemic dimorphic fungi pose a significant public health challenge, causing over one million new infections annually. The dimorphic transition between saprophytic mycelia and pathogenic yeasts is strongly associated with the pathogenesis of dimorphic fungi. However, despite the dynamic nature of dimorphic transition, the current omics studies focused on dimorphic transition primarily employ static strategies, partly due to the lack of suitable dynamic analytical methods. Methods We conducted time-course transcriptional profiling during the dimorphic transition of Talaromyces marneffei, a model organism for thermally dimorphic fungi. To capture non-uniform and nonlinear transcriptional changes, we developed DyGAM-NS (dynamic optimized generalized additive model with natural cubic smoothing). The performance of DyGAM-NS was evaluated by comparison with seven other commonly used time-course analysis methods. Based on dimorphic transition induced genes (DTIGs) identified by DyGAM-NS, cluster analysis was utilized to discern distinct gene expression patterns throughout dimorphic transitions of T. marneffei. Simultaneously, a gene expression regulatory network was constructed to probe pivotal regulatory elements governing the dimorphic transitions. Results By using DyGAM-NS, model, we identified 5,223 DTIGs of T. marneffei. Notably, the DyGAM-NS model showcases performance on par with or superior to other commonly used models, achieving the highest F1 score in our assessment. Moreover, the DyGAM-NS model also demonstrates potential in predicting gene expression levels throughout temporal processes. The cluster analysis of DTIGs suggests divergent gene expression patterns between mycelium-to-yeast and yeast-to-mycelium transitions, indicating the asymmetrical nature of two transition directions. Additionally, leveraging the identified DTIGs, we constructed a regulatory network for the dimorphic transition and identified two zinc finger-containing transcription factors that potentially regulate dimorphic transition in T. marneffei. Discussion Our study elucidates the dynamic transcriptional profile changes during the dimorphic transition of T. marneffei. Furthermore, it offers a novel perspective for unraveling the underlying mechanisms of fungal dimorphism, emphasizing the importance of dynamic analytical methods in understanding complex biological processes.

Unraveling the dynamic transcriptomic changes during the dimorphic transition of Talaromyces marne ei through time-course analysis Introduction Systemic dimorphic fungi are significant human pathogens that cause millions of new infections annually worldwide (Klein and Tebbets, 2007).The ability of these fungi to transition between unicellular yeast and multicellular mycelial forms, known as dimorphic transition, enables them to adapt from environmental saprophytes to pathogens within hosts.In details, the myceliumto-yeast (M-to-Y) transition is crucial for their pathogenicity as the yeast form evades the host immune system, while the yeastto-mycelium (Y-to-M) transition is important for maintaining an environmental reservoir (Nemecek et al., 2006;Gauthier, 2017).As the highly association between fungal dimorphic transition and their pathogenicity, elucidating the genetic mechanisms of dimorphic transition will not only shed light on understanding the pathogenic mechanisms of human fungal pathogens, but also provide theoretical guidance for molecular-based antifungal therapy.
The rapid advancement of high-throughput sequencing technologies, especially in the field of transcriptomics, has spurred numerous investigations aiming to employ more efficient genomic techniques for a comprehensive understanding of the regulatory mechanisms underlying fungal dimorphic transition.However, many studies have predominantly relied on static transcriptomic analysis approaches, focusing on contrasting transcriptomic differences between the mycelial and yeast phases to identify regulatory genes involved in the dimorphic transition process.Given the complex regulatory network governing fungal dimorphic transition, such static transcriptomic studies often fall short in capturing the full spectrum of genes that actively regulate the transition.While some studies have made efforts to elucidate the regulatory mechanisms by utilizing transcriptomic data obtained at different time points during the dimorphic transition, the limited number of sampling time points poses constraints on the comprehensive analysis of the intricate transcriptional changes that occur throughout the transition.Consequently, there is an urgent demand for innovative research strategies that can enable efficient and accurate deciphering of fungal dimorphic transition.
Temporal transcriptome analysis is a commonly employed approach for investigating dynamic gene expression changes in complex biological processes (Bar-Joseph et al., 2012).It allows for the identification of transient gene expression patterns, response times, and gene regulatory relationships, thus holding significant potential for unraveling the regulatory mechanisms underlying dimorphic transition.However, the proper interpretation of timecourse transcriptome data during dimorphic transition remains challenging.In recent years, various methods, such as polynomial regression (Nueda et al., 2014), Gaussian process regression (Äijö et al., 2014), autoregressive models (Leng et al., 2015), and natural cubic spline smoothing (Michna et al., 2016), have been developed to address the complexities associated with time-course transcriptome analysis.However, the dynamics of gene expression can differ greatly across species, particularly in microorganisms like fungi, where gene expression changes exhibit non-uniform and nonlinear patterns.This variability limits the effectiveness of existing models in capturing the complex temporal dependencies inherent in dimorphic transition.Therefore, it is essential to develop precise and adaptable methodologies that can effectively analyze time-course transcriptome data and uncover the dynamic gene expression changes occurring during dimorphic transition.
Talaromyces marneffei (formerly Penicillium marneffei) is a thermally dimorphic pathogenic fungus that undergoes a switch from a mycelial form at ambient temperature (25 • C) to a yeast form at the host temperature (37 • C).This temperature-mediated dimorphic transition is intricately linked to the pathogenesis of T. marneffei (Boyce and Andrianopoulos, 2015).Once inside the host, T. marneffei transforms into yeast cells and adeptly navigates the harsh environment of macrophages by modulating metabolic pathways, adjusting nitrogen source preferences, and expressing peroxidases to counter nutrient deprivation and oxidative stress (Boyce and Andrianopoulos, 2015;Pruksaphon et al., 2022).This enables the fungus to thrive and multiply within macrophages.Subsequently, as macrophages migrate within the host, T. marneffei achieves systemic dissemination, leading to disseminated systemic infection.Given its propensity to infect immunocompromised individuals, such as HIV-positive patients, T. marneffei's lower biosafety level and its susceptibility to dimorphic transition induction under laboratory conditions render it an exemplary model organism for investigating fungal dimorphism.
In this study, we employed a DyGAM-NS (dynamic optimized generalized additive model with natural cubic smoothing) approach to analyze the time-course transcriptome data during the dimorphic transition of T. marneffei.The DyGAM-NS model combines the flexibility of generalized additive models, which enable capturing non-linear and complex changes, with dynamic optimization using natural cubic spline smoothing.By dynamically optimizing the model parameters for each gene, we identified the best-fitted models and determined the dimorphic transition induced genes (DTIGs) in T. marneffei.Cluster analysis of these DTIGs revealed dynamic expression patterns, providing insights into the temporal characteristics of the transcriptome during the dimorphic transition.Furthermore, by integrating the DTIGs with the transcription factors of T. marneffei, we constructed a gene expression regulatory network for the dimorphic transition, uncovering potential regulators involved in dimorphic transition of T. marneffei.In summary, the DyGAM-NS model not only enables efficient and accurate deciphering of the regulatory mechanisms underlying the dimorphic transition of T. marneffei, but also provides a powerful tool for unraveling complex biological processes in other microorganisms.

Strains media, and culture conditions
Talaromyces marneffei strain PM1, isolated from a talaromycosis patient without HIV infection in Hong Kong (Yang et al., 2014), served as the basis for our research.Previously, we generated madsA overexpression and knockout mutant strains (Yang et al., 2014;Wang et al., 2018) derived from the wild-type PM1.All strains were maintained at a temperature of 25 • C on Sabouraud Dextrose Agar (SDA) (Becton, Dickinson and Company, USA), which was also utilized for preparing conidial inocula.Sabouraud Dextrose Broth (SDB) (Becton, Dickinson and Company, USA) was employed for liquid cultures of the wild-type strain.To ensure long-term storage, mycelia from each strain were suspended in 25% (w/v) sterile glycerol and subsequently frozen at −80 • C.

Whole-genome sequencing of T. marne ei PM
Initially, a conidia suspension of PM1, containing ∼1 × 10 6 conidia, was introduced into a 50 ml SDB culture.This culture was incubated at 37 • C with shaking at 200 rpm for 7 days.Subsequently, 5 ml of the culture was transferred to a fresh 45 ml SDB and incubated overnight at 37 • C. The resulting yeast cells were harvested through centrifugation at 3000 × g and washed twice with 1 × phosphate-buffered saline (PBS).Genomic DNA extraction was carried out using the Epicenter MasterPure Yeast DNA Purification Kit (Cat.MPY80020) according to the manufacturer instructions, with a minor modification: RNA was digested using RNase A, RNase T1, and RNase I instead of solely RNase A. The quantity and quality of the extracted genomic DNA were evaluated using the Qubit 4.0 fluorometer (Thermo Scientific, USA) and NanoDrop spectrophotometer (Thermo Scientific, USA).Whole genome sequencing was performed by utilizing the Illumina HiSeq X platform (350 bp insert size with 150 bp pairedend) and the PacBio Sequel platform (10 kb inserts library).

De novo genome assembly and evaluation of quality
Subreads were obtained from the PacBio sequencing data using the SMRT analysis pipeline v5.1.0(Jayakumar and Sakakibara, 2019).The Canu v1.6 algorithm (Koren et al., 2017) was utilized to generate a draft genome through an overlapping layout consensus (OLC) assembly approach, employing PacBio subreads.In this process, a genome size of 30 Mb was specified, as previously reported that the genome size of T. marneffei is about 30 Mb (Yang et al., 2014), and the longest 40× of subreads were selected as seed reads to generate error-corrected long reads.The draft assembly underwent further processing, including the filtration of bubble sequences and the identification of the mitochondria sequence by comparing it to the known T. marneffei sequence using BLASTN v2.12.0+ (Camacho et al., 2009).To enhance the accuracy of the assembly, Arrow v2.0.2 (Jayakumar and Sakakibara, 2019) was employed based on PacBio reads, while Pilon v1.24 (Walker et al., 2014) was used based on Illumina reads for additional polishing.The completeness of the assembly was assessed by performing BUSCO v4 analysis (Manni et al., 2021) against the Eurotiales database.

Temporal RNA preparation and sequencing
In the initial phase of the experiment, conidia of strain PM1 were inoculated onto SDA plates and cultivated at two different temperatures, 25 • C and 37 • C, for a duration of 1 week.Subsequently, the SDA plates incubated at 25 • C were transferred to 37 • C, while those cultivated at 37 • C were transferred to 25 • C.This process facilitated the growth of colonies representing the Mto-Y or Y-to-M transition growth forms, respectively.Total RNA samples were extracted at specific time points after the temperature switch, namely at 0, 1, 3, 6, 9, 12, 24, 36, 48, and 72 hours.In the case of the madsA overexpression and knockout strains, total RNA samples were extracted at 0 and 48 h for the M-to-Y transition, and at 0 and 6 h for the Y-to-M transition.For each time point, three independent biological replicates were used to extract total RNA using the E.Z.N.A. fungal RNA kit (Omega Bio-Tek).The quantification of RNA was performed using a Qubit 4.0 fluorometer (Thermo Scientific, USA).A total of 84 strandspecific libraries, with RNA integrity numbers (RIN) exceeding 6.5, were constructed and sequenced on an Illumina HiSeq X platform, resulting in ∼20 million 150-bp paired-end reads for each sample.Prior to analysis, the reads were trimmed using Trimmomatic v0.38 (Bolger et al., 2014) based on quality assessment carried out by FastQC v0.11.5 (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/).

Annotation of repeat sequences and protein-coding genes
Repeat regions in the assembly were masked using RepeatMaker v4.0.7 (http://repeatmasker.org) with a T. marneffeispecific repeat library generated by RepeatModeler v4.0.7 (http:// repeatmasker.org/RepeatModeler/). Based on the annotation results obtained from RepeatMasker, we identified the presence of tandem repeat sequences (with a repeat fragment length of 6 bp) at the ends of each genomic sequence to ascertain telomeric sequences in T. marneffei.To locate centromeres, we scanned the genome sequences using 50 kb windows, and identified regions characterized by low GC content and high transposon proportion as potential centromeric regions.Gene annotations were conducted using three distinct types of evidence.Firstly, the assembly was aligned to a protein database that combined the UniProt/Swiss-Prot protein database and all sequences of Talaromyces marneffei from the NCBI protein database, using Exonrate v2.2.0 (Slater and Birney, 2005).Additionally, temporal RNA-seq reads of T. marneffei PM1 from various time points during the dimorphic transition were de novo assembled with Trinity v2.3.2 (Grabherr et al., 2011).The resulting Trinity assembly was then passed through the PASA v2.5.2 pipeline (Haas, 2003) to generate transcript evidence.Furthermore, ab initio gene predictions were carried out using FGENESH (Solovyev et al., 2006) andBRAKER2 v2.1.6 (Br • una et al., 2021).For FGENESH, the genome-specific parameters of Penicillium were employed, while BRAKER2 utilized RNA-seq reads to enhance the accuracy of gene predictions.The EVidenceModeler (EVM) v1.1.1 (Haas et al., 2008) was employed to integrate the three types of evidence into consensus gene structures, assigning a weight of 10 to transcript alignments, 5 to protein alignments, and 1 to ab initio gene predictions.The final gene sets were subjected to a BLASTP v2.12.0+ (Camacho et al., 2009) search against the NCBI non-redundant (NR) protein database.The domains of these genes were annotated using InterProScan v5.56-89.0(Jones et al., 2014), which utilizes publicly available databases.Functional annotations were then performed using BLAST2GO v5.2.5 (Conesa and Götz, 2008), incorporating the results obtained from both BLASTP and InterProScan analyses.

RNA-seq data analysis
RNA-seq short reads were aligned to the annotated genomes using STAR v2.7.9a (Dobin et al., 2013) to obtain read mappings.The read count for each gene was calculated using FeatureCount v2.0.3 (Liao et al., 2014), specifying a library by using "-s 2".The expression level of genes was quantified using TPM (transcripts per million).Genes with a mean TPM >1 at least once during the dimorphic transition were selected for subsequent analysis.The raw counts were then normalized using the median of ratios method implemented in DESeq2 (Love et al., 2014).For the time-course RNA-seq data, principal component analysis (PCA) was performed using the prcomp function in R v4.2.2.Visualization of the PCA results was achieved using the ggplot2 package.The heatmap depicting gene expression levels during the dimorphic transition was generated using the pheatmap package.In the case of RNA-seq data from madsA overexpression and knockout strains, differentially expressed genes were identified using the DESeq2 method, applying a false discovery rate (FDR) threshold of <0.05 and a fold change threshold >2.

Identification of dimorphic transition induced genes
We utilized a dynamic optimized generalized additive model with natural cubic smoothing (DyGAM-NS) approach to capture the non-linear relationship between gene expression profiles and time during the dimorphic transition (Equation 1).2).
The flexibility of the DyGAM-NS method allowed us to adjust the number and placement of knots (Equation 3).To determine the optimal parameters for each gene, we evaluated models with all possible combinations of knots using the corrected Akaike information criterion (AICc).
The DyGAM-NS model was fitted using the gam function in the mgcv package (Wood, 2011), and AICc values were calculated using the AICc function in the AICcmodavg package.Finally, we filtered the dimorphic transition induced genes (DTIGs) based on criteria of FDR <0.05, deviance explained >0.7.The code for fitting the DyGAM-NS model can be downloaded from GitHub (https:// github.com/yangence/DyGAM-NS).

Predicting gene expression levels during dimorphic transitions using DyGAM-NS model
Initially, we extracted expression data of all DTIGs identified during the M-to-Y and Y-to-M transitions.For each DTIG, a set of data points at an intermediate time point (1, 3, 6, 9, 12, 24, 36, 48 h) was randomly removed.Subsequently, leveraging data from the remaining 9 time points, we fitted the DyGAM-NS model for the responding gene.Then, the removed time point was provided to the DyGAM-NS model to yield its predicted gene expression level at that time point.Employing the model's estimated standard error, we derived the 95% confidence interval of the predicted expression level.Ultimately, by contrasting the original gene expression level with the confidence interval, we evaluated the predictive performance of the DyGAM-NS model.

Clustering gene expression patterns of DTIGs
In order to select more credible DTIGs, we filtered genes with expression changes exceeding a 2-fold threshold during the dimorphic transitions, designating them as highly variable DTIGs for subsequent clustering analysis.To analyze the expression pattern of DTIGs, we employed the trimmed k-means method from the tclust package (Fritz et al., 2012).This method utilized the fitted values of the DyGAM-NS model for each gene, where the fitted values were standardized to have a zero mean and unit variance across samples for each gene.The optimal number of clusters was determined based on the within sum of squares.The R package clusterProfiler v4.6.0 was employed, based on functional annotations specific to T. marneffei, for Gene Ontology (GO) functional enrichment analysis of genes within diverse expression patterns.

Construction of gene regulatory networks
Gene regulatory networks were constructed using GRNboost2 v0.1.6(Moerman et al., 2019) based on the time-course expression data sets.The list of transcription factors (TFs) used as input for GRNboost2 was identified through the InterPro database (IPR) and compiled from previous studies (Shelest, 2008).The final gene regulatory network was obtained by retaining edges with the top 5% cumulative weight.The betweenness centrality of each gene was calculated using the igraph package in R. Visualization of the gene regulatory network was performed using the NetworkD3 package in R. The motifs associated with each regulatory gene in the network were identified using MEME (Bailey et al., 2009), utilizing the upstream 500-bp regions of the corresponding target genes.DDBJ/ENA/GenBank under the accession JPOX02000000.The sequencing dataset for time-course RNA-seq of T. marneffei PM1 during M-to-Y and Y-to-M transitions have been deposited in the Sequence Read Archive under the accession PRJNA970557.The RNA-seq dataset used for identification of high variable genes during dimorphic transitions are available for download from the SRA under the accession PRJNA431116.

Improving the reference genome of Talaromyces marne ei strain PM to near-chromosome level
To improve the quality of the reference genome for Talaromyces marneffei strain PM1 (Yang et al., 2014), we conducted highcoverage whole-genome sequencing using a combination of PacBio SMRT (3.3 Gb, 115×) and Illumina (6.0 Gb, 210×) sequencing technologies.Firstly, we employed PacBio data to construct the initial draft assembly, resulting in 26 contigs.Subsequently, we identified and eliminated 12 bubble sequences and one mitochondria sequence from the draft assembly.Finally, we polished the genome sequences using both PacBio raw data and high-coverage Illumina data.The final assembly consisted of 13 contiguous sequences, with a total size of 29.0 Mb, an N50 of 3.3 Mb, and an L95 of 8 (Table 1).Based on the Eurotiales database (n = 4,191), we successfully identified 4,067 complete BUSCOs (97.3%), indicating a significant improvement in the completeness and continuity of the PM1 genome sequences.Furthermore, we detected 16 telomere sequences (TCCTAA) and 8 centromere regions in the final assembly (Figure 1), consistent with previous findings (Gifford and Cooper, 2009;Cuomo et al., 2020), thereby suggesting that the genome sequences have achieved near-chromosomal level.

Illustrating transcriptional dynamics of dimorphic transition by using time-course RNA-seq data
To explore the dynamic gene activity during the dimorphic transition of T. marneffei, we conducted RNA-seq analysis on samples collected at various stages of the time-course, encompassing both the mycelium-to-yeast (M-to-Y) and yeastto-mycelium (Y-to-M) transitions.Recognizing the non-uniform nature of the dimorphic transition, we strategically selected 10 time points distributed unevenly over the 0-72 h period (0, 1, 3, 6, 9, 12, 24, 36, 48, 72 h) for both transitions.With three biological replicates for each time point, a total of 60 samples were subjected to RNA-seq, yielding an average of 24 million read pairs per sample.
Initially, these transcriptome data were employed to facilitate the genome annotation of PM1, resulting in the identification of 10,378 protein-coding genes (Table 1).Among these genes, 77.6% of the proteins were annotated with at least one InterPro domain, and 56.1% were assigned at least one Gene Ontology (GO) term.Subsequently, the RNA-seq data were mapped to the PM1 reference genome, and uniquely mapped read pairs were utilized to estimate the expression levels of protein-coding genes using the transcripts per million (TPM) method.To mitigate the impact of transcriptional noise arising from low-expression genes, we filtered out 9,541 genes with a mean TPM <1 in at least one time point during either the M-to-Y or Y-to-M transition.

FIGURE
Circos plot of genome assembly and annotation of T. marne ei PM .Each track, from outside to inside, represents: the coordinate of sequences of the genome assembly of PM , the GC contents (range from % to %), the proportion of repeat elements (range from % to %), and the gene density, respectively.All these genomic characteristics were analyzed using a window size of kb.The telomere sequences are depicted by orange circles, while the centromere regions are represented by black diamonds.
An initial analysis of the transcriptomic dynamics during the dimorphic transition was performed using principal component analysis (PCA).In the M-to-Y transition, PC1 accounted for 64.8% of the variance between samples.Arranging the samples in the order of sampling time revealed a distinct trajectory on the PC1-PC2 plane, indicating clear differentiation among the sample groups (Figure 2A).Notably, during the early stage of the M-to-Y transition (0-3 h), the distances between sample groups were notably larger than those at other time points, suggesting a rapid transcriptional change during the initial phase of the M-to-Y transition.Similarly, in the Y-to-M transition, PC1 explained 61.1% of the variance between samples.A discernible trajectory was observed on the PC1-PC2 plane among the sample groups in the order of transition time (Figure 2B).The inter-group distances indicated that the magnitude of transcriptional changes during the early stage of the Y-to-M transition was not as pronounced as in the M-to-Y transition.Importantly, when all samples were subjected to PCA, a circular differentiation trajectory was formed between sample groups, aligning with the reversible cyclic nature of the dimorphic transition (Figure 2C).On the PC1-PC2 plane in the PCA of all samples, PC1 effectively distinguished between the M-to-Y and Y-to-M transitions, indicating distinct transcriptional regulation patterns during the dimorphic transition.Heatmap analysis based on gene expression levels across all samples further demonstrated that a considerable proportion of genes exhibited differential expression patterns during the dimorphic transition (Figure 2D), highlighting the complexity and nonlinearity of transcriptional dynamics during this process.

Identification of dimorphic transition induced genes by using generalized additive model and natural cubic spline
In this investigation, we employed an unevenly spaced sampling strategy that better aligns with the biological changes observed during the dimorphic transition of T. marneffei.To effectively capture the non-uniform and non-linear dynamics of the time-course data, we proposed a dynamic fitting method, referred to as dynamic optimized generalized additive model with natural cubic smoothing (DyGAM-NS), to examine the dynamic alterations in gene expression induced by the dimorphic transition.By incorporating natural cubic splines as smoothing terms, DyGAM-NS allows for non-uniform and customizable partition knots, thereby accommodating the changing characteristics of gene expression during the dimorphic transition.The read counts for each gene during transition was modeled using a negative binomial distribution with gene-specific dispersion parameters.To explore the optimal parameters for the DyGAM-NS model, we randomly selected 100 genes without replacement and fitted their expression changes over time using various NS partition numbers, NS knots, and dispersion coefficients.Model performance was evaluated using the corrected Akaike information criterion (AICc).The results demonstrated that genespecific optimization of NS partition numbers, NS knots, and dispersion coefficients enhanced the performance of the DyGAM-NS model (Supplementary Figure S1).Consequently, for each gene, we exhaustively explored combinations of NS smoothing term partition numbers and knots and selected the optimal parameter combination based on AICc.Compared to the default parameters, most genes (92.3% in M-to-Y and 96.4% in Y-to-M) exhibited improved model performance after parameter optimization (Supplementary Figures S2A, B).Summarizing the optimal model parameters for each gene revealed that most of genes had an optimal partition number of 5 (Supplementary Figures S2C, D), indicating the complex and diverse changes in the transcriptome during the dimorphic transition.In the M-to-Y transition, there was a noticeable presence of genes with partition knots at 1 and 3 h compared to the Y-to-M transition (Supplementary Figure S2E,  F), further suggesting the more rapid early transcriptome changes in the M-to-Y transition.
Using DyGAM-NS model, we identified 6,167 and 8,085 DTIGs in M-to-Y and Y-to-M transitions, respectively.Due to the lack of known data on true DTIGs of T. marneffei, direct evaluation of the precision and recall of the DyGAM-NS model is challenging.Thus, we alternatively estimated precision and recall using two different approaches.Given that genes involved in regulating the dimorphic transition should display significant expression changes during the process, DTIGs that do not exhibit differential expression between any two time points in the time-course data are likely false positives.Applying this criterion, we estimated the precision of M-to-Y DTIGs to be 96.1% (5,926/6,167) and Y-to-M DTIGs to be 96.1% (7,770/8085) (Table 2).To assess the recall of the model, we identified highly variable genes (HVGs) based on the biological coefficient of variation (BCV) during the dimorphic transition.Genes involved in regulating the dimorphic transition are expected to have higher BCV in temporal data compared to a static state.We employed a chi-square test to identify genes with significantly higher BCV (Brennecke et al., 2013) (FDR < 0.05) during the dimorphic transition than in a static state, considering them as HVGs.Using the HVG dataset as the criterion to evaluate the DyGAM-NS model, we estimated the recall of M-to-Y to be 98.1% (152/155) and Y-to-M to be 98.2% (280/285) (Table 2).Finally, based on the estimated precision and recall, the estimated F 1 -scores of the DyGAM-NS model in the M-to-Y and Y-to-M transitions were both 97.3% (Table 2).
To comprehensively assess the effectiveness of the DyGAM-NS model in identifying DTIGs, we conducted a comparison with seven commonly employed time-course analysis methods (McCarthy et al., 2012;Law et al., 2014;Nueda et al., 2014;Leng et al., 2015;Fischer et al., 2018) (Table 2).Comparison results revealed that the DyGAM-NS model demonstrates performance on par with other methods in identifying DTIGs.The estimated F 1 -score of DyGAM-NS slightly surpasses that of the other seven methods.Furthermore, comparing the DTIGs identified by DyGAM-NS with those of the other seven methods, DTIGs identified by the DyGAM-NS model can be validated by at least two other methods in both M-to-Y and Y-to-M transitions, suggesting the robustness of our approach.
Of particular interest is the DyGAM-NS model's potential in predicting gene expression levels of DTIGs at various time points during dimorphic transition.To validate this, for all identified DTIGs genes, we randomly excluded samples from any one set of time points during the dimorphic transition.We then utilized the expression data from the remaining 9 time points to fit the DyGAM-NS model.Next, we used the newly fitted model to predict the gene expression levels for the randomly excluded time points and compared them with the original data to assess their accuracy.The results indicate that during the M-to-Y transition process, 51.5% (3174/6167) of DTIGs' actual expression levels fall within the 95% confidence interval of the predicted values.
Similarly, during the Y-to-M transition process, 51.9% (4194/8085) of DTIGs' actual expression levels fall within the 95% confidence interval of the predicted values.In comparison to other methods, DyGAM-NS not only exhibits comparable DTIG identification capabilities but also demonstrates potential in predicting temporal gene expression levels.

Clustering of DTIGs indicate that di erent gene expression pattern for M-to-Y and Y-to-M transitions of T. marne ei
To explore the dynamic changes of DTIGs during the dimorphic transition of T. marneffei, we further applied stringent filtering criteria (fold change of expression levels during dimorphic transitions > 2), resulting in the identification of 3,084 and 4,156 highly variable DTIGs in M-to-Y and Y-to-M transitions, respectively.By using a modified trimmed k-means clustering algorithm (Fritz et al., 2012), we discovered 11 distinct expression patterns in both M-to-Y and Y-to-M transitions.In the Mto-Y transition, we discovered 11 distinct expression patterns (Figure 3), encompassing three primary types of changes: 4 upregulated patterns, 5 down-regulated patterns, and 2 pulselike patterns.Among the up-regulated expression patterns, we observed critical genes involved in the survival of T. marneffei within host macrophages, such as iron affinity-related genes (Pasricha et al., 2016) (ftrA, ftrC, and sidA) in pattern Mto-Y-up-1, pepA (Payne et al., 2019) in pattern M-to-Y-up-3, and cytochrome P450 monoxygenases (Boyce et al., 2018) (simA) and the tyrosine metabolic regulator (Boyce et al., 2015) (hmgR) in pattern M-to-Y-up-4.These findings suggest significant transcriptional changes occurring during the early stages of the M-to-Y transition, facilitating rapid adaptation to the harsh intracellular environment.Concerning the down-regulated expression patterns, we identified genes associated with conidiation and pigment synthesis (Borneman et al., 2002a;Todd et al., 2003;Woo et al., 2010Woo et al., , 2012;;Sapmak et al., 2015) (alb1, pbrB, pks11, pks12, brlA, and stuA), which are closely linked to the morphological characteristics of the mycelia.Additionally, the pulse-like expression patterns exhibited an immediate transient decrease in gene expression at the early stage of the M-to-Y transition, potentially indicating a transient response regulation mechanism during this transition.However, the underlying mechanism governing these genes remains unknown and warrants further investigation.
DTIGs associated with the Y-to-M transition can be classified into 11 distinct expression patterns (Figure 4), which can further be grouped into three types of changes: 5 up-regulated patterns, 2 down-regulated patterns, and 4 pulse-like patterns.In the upregulated expression pattern Y-to-M-up-1, we identified two basic leucine-zipper (bZIP) transcription factors (Nimmanee et al., 2014;Dankai et al., 2016) (atfA and yapA), known to regulate the response of T. marneffei to environmental stress.Another upregulated pattern (Y-to-M-up-3) contains the C2H2 transcription factor (Bugeja et al., 2013) (Borneman et al., 2002a;Zuber et al., 2003;Woo et al., 2012) (gasC, pks11, pks12, and stuA) and Y-to-M-up-4 (Todd et al., 2003;Woo et al., 2010;Sapmak et al., 2015) (pbrB, alb1, and brlA) exhibit genes related to the morphological characteristics of the mycelia, including conidiation and pigment synthesis.Notably, the dynamic changes in the transcriptome during the Y-to-M transition are not as pronounced as those observed in the M-to-Y transition.For instance, the mycelia phase-related genes found in the pattern Y-to-M-up-4 gradually upregulate in the later stage of the Y-to-M transition (Figure 4).In terms of the down-regulated expression patterns, functional enrichment analysis reveals a significant enrichment of translation-related functions, indicating a decreased demand for protein synthesis and related processes during the Y-to-M transition.Moreover, the continuously downregulated pattern Y-to-M-down-2 exhibits a significant enrichment of mitochondrial-related components and genes associated with the tricarboxylic acid (TCA) cycle.This finding is consistent with the fact that T. marneffei relies primarily on the TCA cycle for energy metabolism in the yeast phase, whereas glycolysis is prominent in the mycelia phase (Pasricha et al., 2017).Additionally, we observed genes related to intracellular survival (Pasricha et al., 2016;Payne et al., 2019) (sidA, pepA, and ftrC) in the two down-regulated expression patterns.Within the pulse-like expression patterns, Yto-M-pulse-1 includes the RFX transcription factor (Bugeja et al., 2010) rfxA, which plays a role in regulating cell division.Functional enrichment analysis also indicates a significant enrichment of cell division and cycle-related functions in pattern Y-to-M-pulse-1, suggesting that cell division is inhibited in the early stage of the Y-to-M transition and reactivated in the later stage.

Gene regulation network underlying potential regulators of dimorphic transition of T. marne ei
To elucidate the regulatory relationships among DTIGs during the dimorphic transition, we employed the GRNboost2 (Moerman et al., 2019) method to construct a directed gene regulatory network.To enhance the accuracy of the GRNboost2 approach, we utilized 5,223 DTIGs identified by the DyGAM-NS model in both the M-to-Y and Y-to-M transitions as target genes.Additionally, we incorporated 401 candidate transcription factors as prior information, identified through a combination of functional annotation of T. marneffei and known fungal transcription factor domains (Shelest, 2008).To ensure the focus on robust regulatory relationships, we retained the top 5% cumulative weight of strong gene-gene regulatory interactions to form the final gene regulatory network.Following the filtering process, the dimorphic transition gene regulatory network comprised 2,243 pairs of robust regulatory relationships involving 1,664 genes, of which 325 were transcription factors (Figure 5A).Notably, within these 1,664 genes, TM060272, TM021494, and TM030196 emerged as the most significant genes based on their betweenness centrality within the network (Figure 5B).TM060272, possessing a Zn( 2 transition, suggesting its involvement in the regulation of the Y-to-M transition.The TM030196 gene (abaA) exhibited an immediate transient increase in expression during the early stage of the M-to-Y transition (0-9 h) but showed minimal changes during the Y-to-M transition, supporting its regulatory role in the M-to-Y transition, consistent with previous research (Borneman et al., 2002b).
To assess the precision of the gene regulatory network governing the dimorphic transition, we focused on the MADSbox transcription factor madsA gene (TM030606), which has been experimentally verified to play a role in regulating the dimorphic transition (Yang et al., 2014;Wang et al., 2018) (Figure 5B).To validate the downstream regulatory genes within the dimorphic transition regulatory network, we employed knockout and overexpression strains of madsA.RNA-seq analysis was conducted on these strains at four time points during the dimorphic transition (0 and 48 h for M-to-Y, and 0 and 6 h for Y-to-M).By integrating the transcriptome data from the wildtype strain at corresponding time points, we identified 1,788 genes regulated by madsA during the dimorphic transition, of which 1,066 were identified as DTIGs by DyGAM-NS.Notably, in the gene regulatory network constructed using GRNboost2, over 50% (7/12) of the downstream target genes of madsA were validated (Supplementary Figure S4).

Discussion
Dimorphic transition represents a sophisticated adaptive mechanism in thermally dimorphic fungi, enabling their survival and propagation across diverse environments.Understanding the regulatory mechanisms underlying fungal dimorphic transition is a key scientific pursuit, given its close connection to fungal pathogenicity.Our study focused on the model organism of are significantly enriched within these distinct expression patterns, suggesting that different biological functions play prominent roles at different stages of dimorphic transition.Consequently, our study underscores the criticality of temporal transcriptomic data as an indispensable tool for dissecting the intricate nature of fungal dimorphic transitions.Despite the decreasing cost of sequencing technologies, enabling high-resolution time-course transcriptome sequencing, challenges persist in deciphering temporal data for complex biological processes, particularly in microorganisms like fungi.In this study, we developed the DyGAM-NS model, which integrates a flexible generalized additive model with dynamic gene-specific natural cubic spline fitting optimization.We benchmarked the DyGAM-NS model against seven other commonly used timecourse analysis methods.To assess its performance, we used differentially expressed gene sets identified through pairwise comparisons at each time point and highly variable gene sets based on biological coefficient of variation, given the absence of a standardized dataset for DTIGs in T. marneffei.Through this hybrid estimation approach, the DyGAM-NS model exhibited slightly higher F 1 -scores for DTIG identification in both M-to-Y and Y-to-M transitions compared to the other seven methods.Additionally, our analysis involved gene-specific parameter optimization for each gene, yielding a unique DyGAM-NS model for every DTIG.Leveraging the fitted model, our method can predict gene expression levels at other time points during dimorphic transitions, a capability currently lacking in other methods.Cross-validation using random sampling demonstrated that over 50% of gene expression levels fell within the 95% confidence interval predicted by the DyGAM-NS model in both Mto-Y and Y-to-M transitions.This predictive potential for temporal gene expression levels also supports downstream DTIG expression pattern clustering and gene regulatory network construction.In summary, the DyGAM-NS model, as an innovative approach for identifying dynamic temporal gene regulation, provides fresh insights into unraveling mechanisms of dimorphic transition and other complex biological processes, particularly in fungi and other microorganisms.
In this study, we utilized temporal transcriptomic data to investigate the dimorphic transition of T. marneffei, focusing on its distinct gene expression patterns at different stages of dimorphic transition.During the early stage of M-to-Y transition, we observed a substantial number of rapid upregulation genes associated with environmental adaptation, such as pepA, ftrA, ftrC, sidA, and hmgR.This observation suggests that T. marneffei undergoes a rapid adaptive response to the harsh environment within host macrophages.In contrast, we also identified genes with delayed response patterns, which are linked to characteristics specific to mycelial or yeast phase growth but unrelated to environmental stimuli.For instance, the gene hgrA, involved in cell wall regulation, remained unchanged until 48 h into the Yto-M transition.Functional analysis revealed that although hgrA is crucial for maintaining hyphal cell wall integrity, it does not directly participate in stress response, providing valuable insights into its upregulation during the later stage of the Y-to-M transition.Furthermore, we constructed a gene regulatory network for the dimorphic transition of T. marneffei using the DyGAM-NS model to identify key transcription factors.Interestingly, two zinc fingercontaining transcription factors were found to potentially regulate the M-to-Y and Y-to-M transitions, respectively.These findings present new avenues for future research aimed at unraveling the regulatory mechanisms underlying the dimorphic transition of T. marneffei.

FIGURE
FIGURE Transcriptional dynamics during M-to-Y and Y-to-M transitions of T. marne ei.The scatter charts show the di erentiation trajectory during dimorphic transition of (A) M-to-Y, (B) Y-to-M, and (C) M-to-Y together with Y-to-M based on the scores of the first two principal components.(D) Heatmap of gene expression levels, measured with TPM, during M-to-Y and Y-to-M transition.The expression levels are standardized to have zero mean and unit variance across samples for each gene.
FIGURE Gene expression levels and enriched gene ontologies of the expression patterns in M-to-Y transition.The line charts represent the expression tendency of the expression patterns in M-to-Y transition.Each light gray line represents a gene, and the black lines are drawn based on the mean values at each time point.The gene expressions levels are measured with TPM and standardized to have zero mean and unit variance across samples.The bar chart shows enriched gene ontologies for each expression pattern, which are presented in di erent colors.

FIGURE
FIGURE Gene expression levels and enriched gene ontologies of the expression patterns in Y-to-M transition.The line charts represent the expression tendency of the expression patterns in Y-to-M transition.Each light gray line represents a gene, and the black lines are drawn based on the mean values at each time point.The gene expressions levels are measured with TPM and standardized to have zero mean and unit variance across samples.The bar chart shows enriched gene ontologies for each expression pattern, which are presented in di erent colors.

TABLE Summary of
T. marne ei PM genome assembly and annotation.
(hgrA), which governs the formation of the hyphal cell wall.Similarly, up-regulated patterns Y-to-M-up-2 TABLE Performance of DyGAM-NS and other commonly used models in identification of DTIGs.