Intracellular and Intercellular Gene Regulatory Network Inference From Time-Course Individual RNA-Seq

Gene regulatory network (GRN) inference is an effective approach to understand the molecular mechanisms underlying biological events. Generally, GRN inference mainly targets intracellular regulatory relationships such as transcription factors and their associated targets. In multicellular organisms, there are both intracellular and intercellular regulatory mechanisms. Thus, we hypothesize that GRNs inferred from time-course individual (whole embryo) RNA-Seq during development can reveal intercellular regulatory relationships (signaling pathways) underlying the development. Here, we conducted time-course bulk RNA-Seq of individual mouse embryos during early development, followed by pseudo-time analysis and GRN inference. The results demonstrated that GRN inference from RNA-Seq with pseudo-time can be applied for individual bulk RNA-Seq similar to scRNA-Seq. Validation using an experimental-source-based database showed that our approach could significantly infer GRN for all transcription factors in the database. Furthermore, the inferred ligand-related and receptor-related downstream genes were significantly overlapped. Thus, the inferred GRN based on whole organism could include intercellular regulatory relationships, which cannot be inferred from scRNA-Seq based only on gene expression data. Overall, inferring GRN from time-course bulk RNA-Seq is an effective approach to understand the regulatory relationships underlying biological events in multicellular organisms.


INTRODUCTION
Regulation of gene expression is a fundamental factor that controls cellular events such as proliferation and differentiation. Understanding gene regulatory networks is important to elucidate the molecular mechanisms underlying cellular events. Recently, gene regulatory network (GRN) inference based on time-course data has garnered considerable attention in single-cell RNA-Seq (scRNA-Seq). State-of-the-art scRNA-Seq analysis techniques can generate transcriptome information from thousands of cells (Sasagawa et al., 2013;Klein et al., 2015;Hayashi et al., 2018;Sasagawa et al., 2018;Gao et al., 2020). Transcriptomic heterogeneity of cells due to asynchronous progression of cellular events enables us to infer regulatory relationships of genes. During inference, first, dimensional reduction of scRNA-Seq data provides a trajectory of cellular events such as differentiation and proliferation (Treutlein et al., 2014;Haghverdi et al., 2016). Then, assignment of pseudo-time can place cells along the trajectory. As scRNA-Seq with pseudo-time is a dense timecourse observation of cellular events, gene regulatory networks can be inferred by comparing the timing of gene upregulation and downregulation along pseudo-time (Matsumoto et al., 2017;Aalto et al., 2020).
Originally, GRN inference was applied for gene expression data from tissue and pooled cells (bulk samples) generated using DNA microarray and RNA-Seq (Fernandez-Valverde et al., 2018;Ko and Brandizzi, 2020). Compared to steadystate data, time-course data enables GRN inference by comparing the timing of gene upregulation and downregulation (Krouk et al., 2010;Ogami et al., 2012;Iglesias-Martinez et al., 2016;Zhang et al., 2019). However, GRN inference from time-course data of bulk samples is not popular due to the following drawbacks: 1) RNA extraction and library preparation of a large number of bulk samples before sequencing are expensive and timeconsuming (Yoshino et al., 2020), and 2) biological variances may result in inconsistencies between actual sampling time and transcriptome status (Kilfoil et al., 2009). Recent technical advances related to bulk RNA-Seq have overcome these limitations. Advances in sequencing platforms (Muir et al., 2016), RNA extraction method (Yoshino et al., 2020;Ujibe et al., 2021), and bulk 3′ RNA-Seq library preparation methods (Alpern et al., 2019;Kamitani et al., 2019;Li et al., 2020) have enabled a costeffective time-course individual RNA-Seq (Kashima et al., 2020) (a time-series RNA-Seq targeting an entire embryo or tissue of each individual). Pseudo-time analysis for individual RNA-Seq might capture individual differences in the progression speed of biological events such as development. Following the assignment of pseudo-time to each individual RNA-Seq data, GRN can be inferred similar to an inference based on scRNA-Seq.
Theoretically, GRN inferred from whole-body and tissue RNA-Seq are different from those inferred from scRNA-Seq. scRNA-Seq provides transcriptomic information at the cellular level that enables inference of intracellular GRN involved in proliferation and differentiation (Lam et al., 2016). In contrast, bulk RNA-Seq of the entire body and tissues could contain transcriptomic information at the cell population level. Time course individual RNA-Seq during development would enable inference of both intracellular and intercellular GRN (cell-cell communications) involved in the developmental process. For instance, during embryonic development, the upregulation of ligand-related genes can be followed by the upregulation of downstream genes in cells expressing receptor genes (Basson, 2012).
Thus, we hypothesized that GRNs inferred from timecourse individual RNA-Seq during embryonic development would include intercellular regulatory relationships between ligand genes and downstream genes of related signaling pathways. To test this hypothesis, we conducted timecourse bulk RNA-Seq of individual mouse embryos during early development, followed by pseudo-time analysis and GRN inference.

Maintenance of Mice
Embryos were collected from pregnant female Institute of Cancer Research (ICR) mice (CLEA, Tokyo, Japan) at different stages (E7.5, E8.5, E9.5, E10.5, E11.5, E12.5, and E.13.5). The number of replicates (embryos) was 10 at E7.5 and 11 in the remaining stages. All sacrificed female mice were housed under a 12-h dark-light cycle, with the light phase starting from 8 am. All animal experiments were performed in accordance with the guidelines of the Animal Care and Use Committee of Osaka University Graduate School of Dentistry, Osaka, Japan. All experimental protocols were approved by Animal Care and Use Committee of Osaka University Graduate School of Dentistry. All methods are reported in accordance with the ARRIVE guidelines (Percie du Sert et al., 2020).

RNA-Seq Extraction
The total RNA was extracted using the RNeasy ® kit (QIAGEN, Hilden, Germany) according to manufacturer's protocol. The total RNA concentration was measured using the Qubit ™ RNA HS Assay Kit (Thermo Fisher Scientific, Waltham, MA, United States) and was adjusted to 5 ng/μL and stored at −80°C until the subsequent analysis.

RNA-Seq Library Preparation and Sequencing
Non targeted RNA-Seq was conducted according to the Lasy-Seq ver. 1.1 protocol (https://sites.google.com/view/lasy-seq/) (Kamitani et al., 2019;Kashima et al., 2020). Briefly, 50 ng of total RNA was reverse transcribed using an reverse transcription (RT) primer with index and SuperScript IV reverse transcriptase (Thermo Fisher Scientific). Thereafter, all RT mixtures were pooled and purified using an equal volume of AMpure XP beads (Beckman Coulter, Brea, CA, United States) according to the manufacturer's instructions. Second strand synthesis was conducted with the pooled samples using RNaseH (5 U/μL; Enzymatics, Beverly, MA, United States) and DNA polymerase I (10 U/μL; Enzymatics). To avoid the carryover of large amounts of rRNAs, the mixture was subjected to RNase treatment using RNase T1 (Thermo Fisher Scientific). Subsequently, the samples were purified using 0.8× volume of AMpure XP beads. Fragmentation, end-repair, and A-tailing were conducted using 5× WGS Fragmentation Mix (Enzymatics, Beverly, MA, United States). The Adapter for Lasy-Seq was ligated using 5× Ligation Mix (Enzymatics, Beverly, MA, United States), and the adapter-ligated DNA was purified twice with 0.8× volume of AMpure XP beads. After optimizing the PCR cycles for library amplification by qPCR using EvaGreen, 20× in water (Biotium, Fremont, CA, United States)  the ProFlex PCR System (Applied Biosystems, Waltham, MA, United States). The amplified library was purified with an equal volume of AMpure XP beads. One microliter of the library was subjected to electrophoresis using Bioanalyzer 2,100 with the Agilent High Sensitivity DNA kit (Agilent Technologies, Santa Clara, CA, United States) to assess quality. Subsequently, sequencing of 150-bp paired-end reads was performed using HiSeq X Ten (Illumina, San Diego, CA, United States).

Mapping and Gene Quantification
Read 1 reads were processed with fastp (version 0.21.0)  using the following parameters: --trim_poly_x -w 20 --adapter_sequence AGATCGGAAGAGCACACGTCTGAA CTCCAGTCA --adapter_sequence_r2 AGATCGGAAGAG CGTCGTGTAGGGAAAGAGTGT -l 31. The trimmed reads were then mapped to the mouse reference sequences of Mus_musculus.GRCm38.cdna.all.fa, using BWA mem (version 0.7.17-r1188) (Li and Durbin, 2009) with the default parameters. The read count for each gene was calculated with salmon using -l IU, which specifies the library type (version v0.12.0) (Patro et al., 2017). Thereafter, using R (version 4.0.1) (ore Team. R (2015). A, 2015), the sum of read counts per gene was calculated. Genes with read counts greater than zero were used in the subsequent analysis.

Pseudo-Time Analysis
Read counts were normalized using the "NormalizeData" function with the default parameters in Seurat (version 4.0.0) (Hao et al., 2020), which produces natural-log transformed (read per 10,000 + 1). For principal component analysis (PCA), the normalized read counts were centered but not scaled using the "ScaleData" function with the default parameters except for do.scale F. PCA was then performed using the "RunPCA" function for genes with high dispersion, which were selected using the "FindVariableFeatures" function with default parameters except for selection.method "mvp". Finally, SingleCellExperiment (version 1.10.1) (Amezquita et al., 2020) and slingshot (version 1.6.1) (Street et al., 2018) were used to calculate the pseudo-time for each sample.

Evaluation Considering Pseudo-Time Instead of Stage
The "smooth.spline" function in R (version 4.0.1) (ore Team. R (2015). A, 2015) with the default parameters except for "all.knots T, lambda 0.001" was used to obtain smoothed curve for normalized expression of each gene in the "data" slot of the Seurat object, and stage or pseudo-time. Then sum of squared residuals (SSR) between the observed and fitted values was calculated for each gene. The mean of SSRs was calculated against all genes and the high variable genes obtained with the "FindVariableFeatures" function.

Gene Regulatory Network Inference
In the SCODE algorithm (Matsumoto et al., 2017), normalized expression data in the "data" slot of the Seurat object, and pseudo-time were used to infer GRN. A was optimized 20 times with 100 iterations and D 4. Pearson's correlation coefficients between values of each A from the 20 optimizations and the meanA (the average of each value of A) were calculated. In the following analysis, we used the average values of the top 10 A showing higher correlations with the meanA from 20 optimizations. To define the thresholds for downstream gene selection for each gene, the linear function was regressed using the "nls" function in R for the scatter plot of absolute values of A for downstream genes in the decreasing order; the X and Y axes represented the integers from 1 to 28,117, and the absolute values of A, respectively. Genes with larger absolute values of A than the Y values of the regressed line were defined as the inferred gene downstream of each gene.

Evaluation of Inferred GRN by Comparing With a TF-Downstream Gene Database and Analyzing Downstream Genes of Ligandand Receptor-Related Genes
To validate the inferred GRN, information regarding the binding motifs and targets for 438 mouse TFs in the TF2DNA database was used (Berger et al., 2006;Matys et al., 2006;Berger et al., 2008;Badis et al., 2009;Wei et al., 2010;Chen et al., 2011;Jolma et al., 2013;Sebé-Pedrós et al., 2013;Weirauch et al., 2013;Mathelier et al., 2014;Pujato et al., 2014;Weirauch et al., 2014). The Area Under the Curve (AUC) values for downstream prediction based on the absolute values of A were calculated using the "performance" function in ROCR (version 1.0-11) (Sing et al., 2005). Statistical analysis of enrichment of the validated target genes among the inferred genes was conducted using the "fisher.test" function in R. Statistical analysis of overlapping of the inferred downstream genes of ligand-and receptor-related genes was conducted using the "enrichment_test" function in Rvenn (version 1.1.0) (Akyol, 2019). For all statistical tests, Benjamini-Hochberg (BH) correction was performed using the "p.adjust" function. The upset plots were drawn using the "upset" function in UpSetR (version 1.4.0) (Conway et al., 2017).

Evaluation of Intracellular Co-Expression of Upstream and Downstream Genes
Based on the batch-corrected scRNA-Seq data of approximately 60,000 cells of high quality (Han et al., 2018), the average expression of each gene of 98 cell types was calculated with the "AverageExpression" function in Seurat. Thereafter, Pearson's correlation coefficient (PCC) of normalized expression levels of upstream and downstream genes was calculated with the "cor" function in R. The inferred upstream and downstream genes showing PCC more than 0.4 were defined as co-expressed genes.

Pseudo-Time Analysis of the Time-Course Bulk RNA-Seq of Mouse Embryos
To determine whether GRN inference based on time-course bulk RNA-Seq is effective, we conducted time-course bulk RNA-Seq for individual mouse embryos. RNA was extracted from each individual embryo (n 10 or 11) at seven time points: E7.5, E8.5, E9.5, E10.5, E11.5, E12.5, and E.13.5, followed by 3′ RNA-Seq using the Lasy-Seq method (Kamitani et al., 2019). As a result, we obtained 76 RNA-Seq datasets with an average of 8.5 million reads per sample. The reads were mapped onto the mouse reference sequence, and then the read counts of each gene in each sample were calculated. We then used Seurat (Stuart et al., 2019), an R package for single cell omics analysis, for the normalization of read counts, detection of highly variable genes, and dimension reduction of omics date. On the PC1 and PC2 planes obtained with Seurat, samples in the same stage were close to each other ( Figure 1A). As expected, clusters of each stage were ordered according to the developmental process ( Figure 1B). Using the R package "slingshot" (Street et al., 2018), we inferred the developmental trajectory of mouse embryos and calculated the pseudo-time for each sample ( Figures 1A,B). The pseudo-time analysis revealed individual differences among embryos in the speed of their developmental processes. For example, the pseudo-time of a sample at E10.5 and that at E11.5 were close to each other ( Figures 1A,B). Using the pseudo-time analysis, the timecourse RNA-Seq data of the seven time points could be converted into those of 76 time points. Using pseudo-time as temporal information instead of stage (real sampling time) improved the sum of squared residuals (SSRs) between the observed and fitted values ( Figures 1C,D). The SSRs along the pseudo-time were decreased by 0.745% (all genes) and 3.835% (high variable genes), on an average, compared with the SSRs along stage. These results indicate that integrating pseudotime into the analysis, instead of the actual sampled stage, could improve the capture of temporal expression dynamics, by considering individual differences in the progression speed of biological events during early embryonic development in mice. Gene Regulatory Network Inference Based on Individual RNA-Seq of Entire Mouse Embryos Next, we inferred a GRN from the dataset of time-course individual RNA-Seq of entire mouse embryos. We used SCODE, which solves linear ordinary differential equations to infer GRN (Matsumoto et al., 2017). To avoid incorrect emphasis on the technical noise of RNA-Seq, we used a non-scaled normalized gene expression matrix as the input for SCODE. Selection of the D size, a parameter that affects the number of assumed basic patterns of expression dynamics of the dataset, is important for robust GRN inference, as an unnecessarily large D causes an unstable inferred GRN (Matsumoto et al., 2017). In this study, we used D 4 similar to that in a previous study (Matsumoto et al., 2017), with which the SSR was relatively small (Figure 2A). For 28,117 genes whose read counts were greater than zero, SCODE produced A (28,117 × 28,117 matrix) corresponding to the inferred gene regulatory network, in which the value of A i,j indicates regulatory effects on the downstream gene i from the regulator j. A i,j > 0 indicates that the regulator j positively regulates gene i, whereas A i,j < 0 indicates the opposite. Because SCODE optimizes A by random sampling, we optimized A 20 times to check for reproducibility. Thereafter, PCCs between the values of each A from the 20 optimizations and the meanA were calculated ( Figure 2B). Almost all optimizations produced a similar A with high correlation coefficients ( Figure 2B). In the subsequent analysis, we used the average values of the top 10 A showing higher correlations with the meanA from 20 optimizations. We then tried to define the thresholds for significant regulatory relationships between regulators and downstream genes. Because the average expression level of the downstream genes showed higher correlation with the absolute values of A compared to the regulators ( Figures 2C,D), we independently defined the threshold for each regulator instead of a constant threshold previously used (Matsumoto et al., 2017). For example, the absolute values of A indicating regulatory relationships between Sox8 and its regulators showed a positive correlation (PCC 0.76) ( Figure 2E), whereas the absolute values of A showing regulatory relationships between Sox8 and its downstream genes showed smaller correlations (PCC 0.54) ( Figure 2F). Although most of the inferred values of A for regulatory relationships between Sox8 and its downstream genes were around zero, some values were outliers ( Figure 2G). To define the threshold for the downstream genes of Sox8, we regressed the linear function for the scatter plot of absolute A values for the downstream genes in the decreasing order and defined the threshold ( Figure 2G). Finally, in the subsequent analysis, genes with larger absolute values of A than threshold were defined as the inferred genes downstream of Sox8 ( Figure 2G).

Validation of the Inferred Network
Next, we evaluated the inferred GRN by comparing the inferred genes downstream of the TFs with the information in the TF2DNA database, which is an experimental-source-based database of the binding motifs and downstream genes for 438 mouse TFs (Pujato et al., 2014;Badis et al., 2009;Wei et al., 2010;Weirauch et al., 2013;Berger et al., 2006;Mathelier et al., 2014;Matys et al., 2006;Weirauch et al., 2014;Berger et al., 2008;Chen et al., 2011;Jolma et al., 2013;Sebé-Pedrós et al., 2013). First, we evaluated the effectiveness of target prediction based on the absolute values of A by calculating the AUC ( Figure 3A), and obtained an average AUC of 0.704, suggesting that the inferred regulatory relationships could reflect the actual regulatory network. Furthermore, we calculated the AUC for inferred regulatory relationships using dynGENIE3 (Huynh-Thu and Geurts, 2018), an algorithm that uses sampled stage information as temporal information to infer GRN. The average AUC was 0.50 ( Figure 3A), suggesting the adequacy of GRN inference based on time-course individual RNA-Seq. Second, we examined the validity of the defined thresholds by assessing the differences between the validated target gene rate (number of validated target genes/number of all inferred downstream genes) above the defined thresholds and the best validated rate ( Figure 3B and Supplementary Table S1). Below the threshold, the validated target gene rates for the 438 TFs were 0.69% smaller than the best validated target gene rates on an average. The validated target genes were 67.5% of the inferred downstream genes on an average (Supplementary Table S1, Figures 3C,D). Compared with the background rate (number of known target genes/number of all genes), the validated target gene rates of the inferred downstream genes of all TFs were statistically high (adjusted p-value < 0.01) ( Figure 3D). These results suggest that our approach could infer the GRN underlying early development in mice.

Inferred Network Contained Regulatory Relationships Involved in Cell-Cell Interaction
Our GRN inference was based on bulk RNA-Seq containing the information of all cells in the body. We thus hypothesized that the inferred GRN also included the intercellular regulatory network. To examine this possibility, we checked the overlaps of inferred downstream genes of genes related to the ligands and receptors of nine major signaling pathways ( Figure 4A): Wnt/βcatenin, TNF, TGF-β, Hedgehog, FGF, EGF, Delta/Notch, BMP, and retinoic acid (RA) signaling pathways (Supplementary Table S2). As expected, the inferred downstream genes of all pairs of ligand and receptor genes were significantly overlapped (adjusted p-value < 0.01) ( Figure 4A). On an average, 94.2% of the inferred downstream genes of the ligand-and receptor-related genes were overlapped ( Figure 4A). For example, 4,510 genes were inferred as downstream of Wnt genes and 4,545 genes were downstream of Fzd genes. However, 97.5% of the inferred downstream of Wnt genes were also inferred as the downstream of Fzd genes ( Figure 4A). As SCODE can infer whether each downstream gene is positively or negatively regulated (Matsumoto et al., 2017), we assessed the overlaps of positively and negatively regulated gene downstream of the representative ligand-related and receptor-related genes of each signaling pathway, that is, with the highest number of inferred downstream genes among each gene family (Figures  (Gilbert and Barresi, 2017), the regulatory directions of ligand-and receptor-related genes for most inferred downstream genes were the same ( Figure 4B and Supplementary Figure S1). In contrast, in case of the RA signaling pathway, wherein the RA receptors function as transcriptional repressors without RA binding (Supplementary Figure S2A) (Glass and Rosenfeld, 2000), the regulatory directions of a ligand-related gene, Aldh1a3, which encodes a protein involved in RA synthesis, and a receptorrelated gene, Rara, for the downstream genes were opposite ( Figure 4C). In case of the Hedgehog signaling pathway, the regulatory directions of Ssh2 and Ptchd4 tended to be opposite ( Figure 4D). The regulatory directions of Ssh2 and Gli3 for the downstream genes were also opposite ( Figure 4D). In the absence of Hedgehog ligands, the full-length Gli family proteins are  Figure  S2B) (Skoda et al., 2018). With the binding of Hedgehog ligands, PATCHED inhibits the degradation of the Gli family proteins and allows them to function as transcriptional activators (Supplementary Figure S2A) (Skoda et al., 2018). Thus, the overlap of downstream genes that are positively and negatively regulated by Ssh2, Ptchd4, and Gli3 is reasonable. In case of the Wnt/βcatenin signaling pathway, the regulatory directions of Wnt7a and Fzd5 for the downstream were opposite (Supplementary Figure S3A). Our time-course RNA-Seq revealed that among the Fzd and Wnt genes, Fzd5 and Wnt8a, but not Wnt7a, were the only genes expressed dominantly in the early developmental stage among the protein families ( Supplementary Figures S4, S5). Similar regulatory directions were found for most of the inferred downstream genes of Wnt8a-Fzd5, which could be a functional pair in the early developmental stages. Furthermore, the regulatory directions for most of the inferred downstream genes of Wnt7a-Fzd6 (Fzd genes with the second most inferred downstream genes) were also the same (Supplementary Figure S3). Next, we evaluated the co-expression of inferred downstream and upstream genes. Based on a publicly available mouse cell atlas (Han et al., 2018), PCCs of normalized expression levels of upstream and downstream genes were calculated (Supplementary Figure S6A), and the pairs of PPC >0.4 was defined as co-expressed genes. Although the ratios of coexpression were overall low due to limited characteristic of scRNA-Seq, the ratio of co-expression of ligand-related genes and their inferred downstream was significantly lower than that of receptor-related genes and TFs (Supplementary Figure S6B). This suggests that the inferred downstream genes of ligand-related genes tended to be expressed in cells that do not express ligand-related genes.

Validation of Time-Course Individual RNA-Seq-Based Gene Regulatory Network Inference With a Publicly Available RNA-Seq Data
Finally, we also inferred GRNs from a publicly available dataset of time-course organ-level individual RNA-Seq of mouse (Cardoso-Moreira et al., 2019). This dataset contained data of five somatic tissues (brain, cerebellum, hear, kidney, and liver) and two germline tissues (ovary and testis). The inferred relationships between TFs and downstream genes from each organ data tended to show higher AUC values for the TF2DNA database than those based on our time-course individual whole-embryo RNA-Seq ( Figures 3A, 5A). In addition, same as the inferred GRN based on our whole-embryo RNA-Seq data, the inferred downstream genes of all pairs of ligand-and receptorrelated genes were significantly overlapped (adjusted p < 0.01) (Supplementary Figure S7). The AUC values for several TFs from organ-level RNA-Seq were considerably worse than those from our RNA-Seq data ( Figure 5A), suggesting organ-specificity of GRNs. As expected, hierarchical clustering revealed differences of the inferred GRNs between somatic and germline organs ( Figure 5B).
In conclusion, our approach would allow successful inference of the intercellular regulatory relationships related to the major signaling pathways as well as the intracellular pathways related to TFs.

DISCUSSION
Recently, GRN inference based on the combination of scRNA-Seq and pseudo-time analysis has garnered considerable attention (Dai et al., 2020). However, to our knowledge, this study is the first to report GRN inference based on the combination of individual bulk RNA-Seq and pseudo-time analysis. Here, pseudo-time explained variances of gene expression during early development in mice better than the actual sampled stage ( Figures 1C,D), suggesting that the variance of gene expression observed using our time-course individual RNA-Seq was not mere stochastic variability in gene expression but individual difference in the progression speed of development. Thus, pseudo-time uses the correlation of gene expression dynamics more effectively than the actual temporal information in time-course individual RNA-Seq. Unlike scRNA-Seq, which can elucidate transcriptomic dynamics in a certain cellular event such as proliferation and differentiation, bulk RNA-Seq could provide a mixture of various transcriptomic dynamics regarding the cellular events occurring in an embryo (Chasman and Roy, 2017). Theoretically, GRN inference based on bulk RNA-Seq cannot provide multiple lineage-specific regulatory relationships. This may explain why GRN inference based on bulk RNA-Seq has not been attempted as with scRNA-Seq. In this study, we succeeded in inferring the known regulatory relationships of the TFs in the TF2DNA database with a high AUC compared with the GRN inferred from scRNA-Seq (Chen and Mar, 2018) (Figure 3). This suggests that GRN inference from bulk RNA-Seq provides insights into the regulatory interconnections of genes, even though GRN inferred from bulk RNA-Seq would have limitations in the comprehensiveness of inferred regulatory relationships compared with those from scRNA-Seq. Furthermore, substantial changes in the cell population occur during early development stages. It is possible that our GRN inferred from bulk RNA-Seq merely reflects the changes in cell populations but not the interconnections of genes. Considering the results of inference for the Wnt and Fzd genes (Supplementary Figure S2), these points should be considered when interpreting the inferred GRN based on the time-course bulk RNA-Seq.
Assignment of pseudo-time is an important step in GRN inference from time-course data. In case of scRNA-Seq, the accuracy of pseudo-time assignment is controversial (Tritschler et al., 2019). In contrast, in the case of time-course individual bulk RNA-Seq, the accuracy of pseudo-time assignment could be assured by the actual sampling time, which can be an advantage compared with GRN inference from scRNA-Seq. Herein, we proposed a new strategy for the threshold of significant gene regulatory relationships inferred by SCODE (Figure 2). In the case of E2f3, the number of predicted target genes is only ∼10% of the actual targets with our strategy. Shifting the threshold to approximately 17,000 targets for E2f3 may still result in a validated target rate of approximately 80% ( Figure 3B). As decreasing the threshold may produce false positive inferred relationships for some genes, the determination of the threshold remains a challenge. Considering high AUCs for the TFs in the database ( Figure 3A), the threshold should change depending on the aim of the study.
GRN indicates the intracellular interconnections of genes in a narrow sense; intercellular regulation of genes via cell-cell communication is also a key factor to understand the regulatory mechanisms underlying multicellular organisms. Several studies have attempted to systematically identify cell-cell communications based on single cell gene expression profiles and information regarding ligand-receptor pairs (Kumar et al., 2018;Wang et al., 2019;Cabello-Aguilar et al., 2021;Jin et al., 2021). As these approaches require prior knowledge, they could only be applied for the major model organisms and cannot reveal novel signaling pathways. We demonstrated that GRN inference based on time-course individual RNA-Seq could infer intercellular regulatory relationships related to cell-cell communication via cell signaling pathways. This approach only requires time-course RNA-Seq results and is applicable for non-model organisms without ligand-receptor information. Theoretically, the GRN inferred based on scRNA-Seq cannot include intercellular interconnections of genes. Intracellular co-expression knowledge of upstream and downstream genes, which could be obtained form scRNA-Seq experiments, would be useful for systematic identification of genes involved in cell-cell communications during cellular events of interest. Taken together, our approach is a powerful tool to understand intracellular and intercellular regulatory relationships of genes, which cannot be achieved using the existing GRN inferences based on scRNA-Seq alone. As discussed above, bulk RNA-Seq is limited in the comprehensiveness of inferred regulatory relationship as multiple lineage-specific regulatory relationships cannot be deduced from GRN inference based on bulk RNA-Seq. Higher AUC values of the inferred GRNs from organ-level RNA-Seq than whole-embryo RNA-Seq would reflect this limitation ( Figures 3A, 5A). A future novel bioinformatic approach that can deconvolute gene expression in each tissue and cell lineage from bulk RNA-Seq will help overcome this limitation.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: https://www.ncbi. nlm.nih.gov/, PRJNA725414.

ETHICS STATEMENT
The animal study was reviewed and approved by Animal Care and Use Committee of Osaka University Graduate School of Dentistry.

AUTHOR CONTRIBUTIONS
MK, YS, and HK conceived and conducted the experiments, and MK analyzed the results and wrote the manuscript. All authors reviewed and approved the final manuscript.

FUNDING
This work was supported by Japan Society for the Promotion of Science (grant number JP19H03858 to HK).