Understanding gene regulation during the development of the sea cucumber Apostichopus japonicus using comparative transcriptomics

Embryonic development, especially metamorphosis and settlement, has a major impact on the life history of marine invertebrates. Apostichopus japonicus is an economically important species of sea cucumber. In this study, we performed RNA sequencing on six key stages of A. japonicas development: fertilized eggs, blastula, gastrula, auricularia, doliolaria, and pentactula. A total of 32,353 genes were identified and annotated as a reference gene set for subsequent pairwise comparison analysis. After filtering out low-quality genes, the dynamic molecular responses to development were revealed by WGCNA. The results showed that of the 20 modules, genes in the blue, yellow, and darkslateblue modules were highly correlated with the gastrula, auricularia, and blastula stages, respectively. GO terms for “RNA” and “proteasome complex” were most significantly enriched in the blue module. In the darkslateblue and yellow module, receptors of signaling pathways and metabolic processes were significantly enriched, respectively. All DEGs were categorized into 34 terms, mainly associated with signal transduction and cellular immunity. The expression pattern of genes associated with adhesion, cell cycle, signal, transcription factor, extracellular matrix (ECM), and cytoskeleton was analyzed according to gene function. The results of this study facilitated a more comprehensive understanding of the molecular characteristics of sea cucumber embryonic development and will provide theoretical guidance for larva rearing in sea cucumber culture.


Introduction
Apostichopus japonicus (Selenka) is distributed in shallow temperate and temperate-cold waters along the coasts of China, Japan, Korea, and Russia. This species is considered to be one of the most valuable sea foods, due to its extraordinary nutritional and medicinal properties . A. japonicus also displays some unique biological characteristics, such as evisceration, regeneration, aestivation, and autolysis . It has been extensively studied from many perspectives, including feeding behavior and movement patterns Sun et al., 2015), reproduction and breeding (Kato et al., 2009;Ahmed et al., 2011;Soliman et al., 2013), responses to environmental changes (temperature, salt, light, and dissolved oxygen) (Dong et al., 2010;Meng et al., 2011;Xu et al., 2016;Huo et al., 2017), disease and immunity (Gu et al., 2010;Zhang et al., 2014), and physiology and behavior of regeneration or aestivation (Chen et al., 2015;Miao et al., 2017;Sun et al., 2017a;Sun et al., 2017b). In 2017, we published the A. japonicus genome, the first fine sea cucumber genome, making an important contribution to sea cucumber and echinoderm research . A. japonicus, as a representative species for sea cucumbers, is becoming a good model species for research. Development is the most important biological process for species. In recent years, there have been a number of studies on sea cucumber development, promoting our understanding of these species. Researchers have elucidated the development and growth morphogenesis patterns of different kinds of sea cucumbers, including Cucumaria frondose (Gianasi et al., 2018), Stichopus sp. (Curry fish) (Hu et al., 2010), Isostichopus fuscus (Hamel et al., 2003), and A. japonicus . A. japonicus has received the most attention from researchers due to its high economic value in China and Asia. The life cycle of A. japonicus can be divided into eight major developmental stages. Embryonic and larval development before the juvenile phase can be divided into six key stages as follows: fertilization of eggs, blastula, gastrula, auricularia, doliolaria, and pentactula . To date, there have been numerous studies on the effect of environmental factors (such as density, temperature, salinity, and pH) on the development of A. japonicus (Li and Li, 2009;Liu et al., 2010;Yuan et al., 2015), mainly focusing on survival rate and metamorphosis. However, there has been a lack of studies on genetic mechanisms and gene regulation associated with development.
Morphological changes in the development of fertilized eggs to juvenile have been reported in the literature. When the fertilization membrane is lifted, it indicates that the ovum is successfully fertilized. After multiple cell divisions, the number of cells increased and the cell volume became smaller. The blastula is a hollow sphere surrounded by a thin layer of cells, and the egg membrane is clearly visible. The embryo is detachable by rotation and then rotate freely with the oscillation of surface cilia in a finger ring shape. As development proceeds, the vegetal pole invades and forms mesenchymal cells. The digestive tract is not fully differentiated, and morphological distinctions are not obvious.
The gastrula further develops into auricular larvae, and the digestive system begins to improve. The food particles enter the mouth with the water flow formed by the oscillation of the cilia. Calcareous ossicle begin to form and continue to elongate towards the posterior end. According to the development morphology of the water cavity, the auricular larvae are divided into three stages: earlyauricularia larva, mid-auricularia larva, late-auricularia larva (Qiu, 2013). At the late-auricularia larva stage, five small vesicles grow outward the water cavity, resembling finger-like branches (Wang et al., 2019). The body shrinks to a cylindrical shape and the larvae metamorphose into doliolaria. Most of the cilia fall off, and the movement ability and swimming ability of the larvae decrease. At the pentactula stage, the cilia rings degenerate and disappear completely. The larvae change from planktonic life to benthic life, with five tentacles protruding. The larvae sink to the attachment base and crawl by the first tube foot and tentacles. However, the global gene regulation patterns of A. japonicus development and key regulatory gene families are still not well understood.
With the development of high-throughput sequencing techniques, a number of A. japonicus transcriptomes have been successfully constructed, including transcriptomes of intestine regeneration (Sun et al., 2011), intestines, respiratory trees and coelomic fluid (Du et al., 2012), response to pathogen infection (Zhou et al., 2014;Gao et al., 2015), and albinism biogenesis (Xing et al., 2018). However, comparative transcriptomes of all key stages of A. japonicus development have not yet been constructed, as it is very hard to assemble 18 transcriptomes de novo with mega data. Fortunately, the fine A. japonicus genome provides a good reference, which makes this possible . In the current study, we applied transcriptomic techniques to identify differentially expressed genes (DEGs) of six different developmental stages of A. japonicus. The results of this study provided a better understanding of the mechanisms of A. japonicus development. Furthermore, these data will increase the amount of genetic information for A. japonicus, providing a good foundation for investigating the molecular/genetic mechanisms of sea cucumbers.

Sample collection
Parent culture, spawning, and larva rearing were performed in our breeding base in Qingdao. The embryos and larvae of A. japonicus were cultivated in 2 m×5 m×1.5 m cement pools at 20 ± 1°C. Samples from six major developmental stages were collected: fertilized eggs (F,~20 min after fertilization), blastula (B,~10 h), gastrula (G,~26 h), auricularia (A,~8 d), doliolaria (D,~10 d), and pentactula (P,~12 d). The samples were monitored under a microscope to ensure developmental synchrony. Three samples at each developmental stage were collected as biological replicates. Each sample was washed with ddH 2 O, frozen, and stored in liquid nitrogen. Finally, a total of 18 samples were prepared for sequencing to obtain 18 transcriptomes.

RNA extraction, library preparation, and illumina sequencing
Total RNA was extracted from 18 samples as mentioned above using the RNAeasy mini kit (Qiagen, USA) according to the manufacturer's instructions. The quality and concentration of RNA was measured by NanoDrop 1000 (Thermo). A total of 18 RNA samples were used for isolation, fragmentation, and construction of the cDNA library. Extracted total RNA was treated with DNase I, Oligo (dT) and used for RNA isolation. A total of 3 mg RNA per sample was used as input material for the RNA sample reparations. Fragmentation was carried out using divalent cations under elevated temperatures in NEBNext First Strand Synthesis Reaction Buffer (5×). Then cDNA was synthesized using the mRNA fragments as templates. Short fragments were purified and resolved with EB buffer (10 mM Tris-HCl, pH 8.5) for end reparation and single nucleotide A (adenine) addition. Subsequently, the short fragments were connected with adapters. Suitable fragments were selected for PCR amplification. During the QC steps, an Agilent 2100 Bioanalyzer and an ABI StepOnePlus Real-Time PCR System (Applied Biosystems, Foster City, CA, USA) were used in the quantification and qualification of the sample library. Finally, sequencing was carried out using Illumina HiSeq 4000 by the Beijing Genome Institute (BGI) (Shenzhen, China). Table S1 (supplementary material) gives a brief overview of the condition of the samples and summarizes the key characteristics of the transcriptome sequencing.

Transcriptome assembly, SNP and indel detection, novel transcript prediction and gene expression analysis
The sequencing reads which were of low quality, adaptorpolluted, or had a high number of unknown base (N) reads, were removed prior to further analyses. Filtering was performed using internal software of BGI. All sequencing data and quality information can be found in Table S1 in the supplementary material. HISAT (v0.1.6-beta) (Kim et al., 2015) was used to map all clean reads to the sea cucumber genome (NCBI accession number: PRJNA354676) . After genome mapping, we used GATK to call SNP and INDEL variants for each sample (McKenna et al., 2010). After that, StringTie (v1.0.4) was used to reconstruct novel transcripts (Pertea et al., 2015), cuffcompare (v2.2.1) was used to compare reconstructed transcripts to draft genome annotation, and CPC (v0.9-r2) was used to predict the coding potential of novel transcripts (Kong et al., 2007). After novel transcript detection, we merged novel coding transcripts with reference transcripts to get complete reference, then we mapped clean reads to it using Bowtie2 (V2.25) (Langmead and Salzberg, 2012), then calculated gene expression level for each sample with RSEM (V1.2.12) (Li and Dewey, 2011). The gene expression summary was shown in Table S1 in the supplementary material. With gene expression results, NOIseq based on a noise distribution model was chosen to detect DEGs (Parameters: Fold Change ≥ 2.00 and Probability ≤0.8).

Gene functional annotation
The obtained genes were annotated according to the sea cucumber genome annotation . The novel transcripts were annotated using BLAST programs with SwissProt, NCBI non-redundant (Nr) protein, and NCBI nonredundant nucleotide sequences (Nt) databases. The GO (http:// www.geneontology.org/) and KEGG databases (http:// www.genome.jp/kegg/) were used to classify and group the identified proteins. A hypergeometric test was used to define significantly enriched GO terms and pathways of differentially expressed genes. The terms which FDR not larger than 0.001 were defined as significant enriched.

WGCNA analysis
Co-expression networks were constructed using the WGCNA (v1.47) package in R (Langfelder and Horvath, 2008) to identify specific modules associated with larva development at different stages. The fragments per kilobase of transcript per million mapped reads (FPKM) of each gene was calculated as the gene expression level. Genes were filtered out when FPKM < 2 in more than 70% of samples and outlier samples were deleted. The rest genes in the 17 libraries were imported into WGCNA to construct co-expression modules using the automatic network construction function block wise Modules with default settings. To find biologically or clinically significant modules, module eigengenes were used to calculate the correlation coefficient with samples. Intramodular connectivity (K.in) and module correlation degree (MM) of each gene was calculated with the R package WGCNA. Genes with high connectivity tended to be hub genes which might have important functions. To identify modules significantly associated with different developmental stages, correlation analysis was performed using the module eigengene with gene expression levels in different groups (F, B, G, A, D, P). Modules with high gene expression were considered as target modules. To further analyze the biological functions of genes in each target module, GO and KEGG pathway enrichment analysis were performed. GO terms and pathways with p-value corrected by FDR ≤ 0.05 were considered to be significantly enriched.

Real-time PCR validation
The same RNA as used for Illumina sequencing was used in this experiment. The first strand cDNA was synthesized in a 25-mL reaction system as follows: 1) 4 mL RNA and 1 mL oligodT18 were denatured at 70°C for 5 min; 2) 1 mL M-MLV reverse transcriptase (Promega), 5 mL M-MLV buffer (25 mM KCl, 10 mM Tris-HCl, 0.6 mM MgCl 2 , and 2 nM DTT, pH 8.3), 5 mL dNTP, 1 mL ribonuclease inhibitor, and 8 mL RNase-free water were added at 42°C for 1 h.
The synthesized cDNA was diluted with RNase-free water and stored at -80°C for subsequent quantitative real-time PCR. All gene mRNA expression levels were determined using the SYBR Green ® real-time PCR assay with an Eppendorf Mastercycler ® ep realplex (Eppendorf, Hamburg, Germany). The primers were designed using Primer3 (v0.4.0; http://bioinfo.ut.ee/primer3-0.4.0/primer3/) and are listed in Table S2 in the supplementary material. The amplification volume was 25 mL, and it contained 12.5 mL SYBR Green Master Mix (Takara), 0.5 mL (each) forward and reverse primer (10 mM), 1 mL diluted cDNA, and 10.5 mL RNase-free water. Thermal cycling was as follows: 1) 95°C for 5 s and 2) 40 cycles at 95°C for 10 s, 60°C for 20 s, and 72°C for 30 s. Melting curve analysis of the amplification products was performed to demonstrate the specificity of the PCR products. NADH dehydrogenase was used as a housekeeping gene for internal standardization (Sun et al., 2011;Sun et al., 2017b). The 2 -DDCT method was used to analyze mRNA expression levels. All data are reported as means ± SE (N = 5), and the level of statistical significance was set at P < 0.05. Analysis was carried out using SPSS18 software.

Overview of the transcriptomes
In the current study, 8.25~8.98 Gb × 18 data from sea cucumbers at six key developmental stages were generated by high-throughput sequencing techniques, and have been submitted to NCBI (accession NO. PRJNA889469). After read filtering, 53.20~59.96 Mb× 18 clean reads were obtained to construct 81,146 transcripts with contig N50 of 3910bp, which were further grouped into 32,353 genes. Among all transcripts, 48,098 transcripts were novel transcripts by reconstruction after mapping sequenced reads to the reference genome. Of these, 24,275 were previously unknown splicing events for known genes, 5,494 were novel coding transcripts without any known features, and the remaining 18,329 were long noncoding RNAs. The mRNA expression levels of all genes can be found in Table S3 in the supplementary material.
Thanks to the development of high-throughput sequencing techniques, many transcriptomes have been constructed to understand global gene regulation or response mechanisms to different types of biological processes in sea cucumbers (Sun et al., 2011;Du et al., 2012;Zhang et al., 2013;Gao et al., 2015;Xing et al., 2018). However, the gene expression patterns of development, one of most important biological processes, are still not known. In 2012, Du et al. (Du et al., 2012). constructed the transcriptomes of embryos and larvae. However, these data could only provide relatively complete gene sets and did not reveal the gene regulation patterns of development of sea cucumbers, because the embryo and larva samples were pooled to construct two libraries and the sampling times did not coincide with five key developmental stages. In this study, we revealed the global gene expression patterns of six developmental stages, which detailed all gene regulation patterns during sea cucumber development.

Construction and analysis of weighted gene co-expression network
To better understand the gene co-expression network during sea cucumber development, WGCNA analysis was carried out to obtain 20 modules based on hierarchical clustering of calculated dissimilarity after filtering low-quality samples and genes ( Figure 1A; Supplementary Table S4). Genes with similar B A expression patterns are represented on the same branch. Each branch represents a co-expression module. According to the similarity of the module eigengene, the modules with similar expression patterns are merged to obtain the final merged dynamic modules. Genes that do not fit into any module are shown in gray. The numbers of genes in modules ranged from 56 to 2,925 (Supplementary Table S5), and a heat map of sample expression patterns based on module eigengene was generated to visualize the correlation matrix between modules and groups ( Figure 1B; Supplementary Table S6). Some modules were found to relate to specific developmental stages. The genes in the greenyellow, lightcyan, yellow, and lavenderblush3 modules were highly expressed in the auricularia, doliolaria, and pentactula stages, but were expressed at low levels in fertilized eggs, blastula, and gastrula stages. However, the darkmagenta, darkorange2, floralwhite, indianred4, and lightpink4 modules showed a completely opposite correlation. Furthermore, the results showed that the blue and yellow modules were positively correlated with the gastrula and auricularia stages, respectively. Additionally, the darkslateblue module had the highest correlation coefficient with the blastula stage. Therefore, we treated the three modules that were highly associated with these developmental stages as the target modules.
To further explore the function of module genes, we performed GO and KEGG enrichment analysis for each target module (Figure 2; Supplementary Tables S7, S8). GO terms for "RNA" and "proteasome complex", such as "translation factor activity, RNA binding (GO:0008135)", "RNA binding (GO:0003723)", "RNA processing (GO:0006396)" and "proteasome complex (GO:0000502)", were most significantly enriched in the blue module. In the darkslateblue module, receptors of signaling pathways were significantly enriched. The representatives were "immune response-activating cell surface receptor signaling pathway (GO:0002429)", "immune response-regulating cell surface receptor signaling pathway (GO:0002768)" and "antigen receptor-mediated signaling pathway (GO:0050851)". Additionally, genes grouped in the yellow module were significantly enriched in metabolic processes, such as "neurotransmitter metabolic process (GO:0042133)", "single-organism carbohydrate catabolic process (GO:0044724)", and "carbohydrate catabolic process (GO:0016052)". According to the KEGG pathway enrichment results, "proteasome (ko03050)", "spliceosome (ko03040)", and "protein export (ko03060)" pathways related to gene information processes were enriched in the blue module. In the darkslate and yellow modules, metabolic pathways were significantly enriched. Overall, GO and KEGG enrichment analysis demonstrated the role of modular genes in the larval development of A. japonicus.
The correlation network of significantly enriched genes in the three target modules is illustrated in Figure 3. According to the target module size, the top five genes with the highest All.kWithin values were identified as hub genes (Table S9 in the supplementary material). In the correlation network of the blue module, protein mago nashi homolog (MAGOH), zinc finger C4H2 domaincontaining protein-like isoform X1(AC4H2), N-acetyltransferase 10 (NAT10), nuclear pore complex protein Nup107 isoform X1 (Nup107 complex), and eukaryotic translation initiation factor 3 subunit F (eIF3f) were identified as hub genes ( Figure 3A; Supplementary Table S9). MAGOH is one of the three core proteins of the exon junction complex (EJC), and is essential for the normal development of embryos and nerves (Asthana et al., 2022). AC4H2 plays a vital role in regulating nervous system development, and has already been identified in multiple species Gene Ontology (GO) enrichment analysis of genes in blue module (A), darkslateblue module (B) and yellow module (C). Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis of genes in blue module (D), darkslateblue module (E) and yellow module (F). (May et al., 2015;Ma et al., 2017). NAT10 promotes cell proliferation and thereby participates in malignant transformation of various cancers. In addition, NAT10 is expressed in the central nervous system of the embryo (Guo et al., 2020). Nup107 complex is part of the Nuclear Pore Complex (NPC) structural scaffold and the role of the Nup107 complex in gene expression is not restricted to nucleo-cytoplasmic transport, but also participates in neural differentiation in vertebrates (Gonzalez-Aguilera and Askjaer, 2012). The first nerve cells were detected in the apical region of the epidermis in the late gastrula stage (Nakano et al., 2006). Thus, the blue module was closely related to neurological genesis. Genes for enzymes associated with digestion and immunity, such as lactase-phlorizin hydrolase-like, aminopeptidase N (APN), dipeptidase 1-like (DPEP1), mucin-17 isoform X2 (MUC17), and cAMP-regulated D2 protein-like (cAMP-D2) were identified as hub genes of the yellow module ( Figure 3B, Table S9 in the supplementary material). During the auricularia stage, the digestive system begins to improve. For the darkslateblue module, tyrosine-protein kinase RYK isoform X4 (RYK), sodium-dependent multivitamin transporter (SMVT), lysosomal amino acid transporter 1 (LYAAT-1), cyclin-J, and ELAV-like protein 3 isoform X2 (elavl3) were identified as hub genes ( Figure 3C, Table S9 in the supplementary material). RYK may be a coreceptor along with FZD8 of Wnt proteins, such as WNT1, WNT3, WNT3A and WNT5A. LYAAT-1 defines a subgroup of lysosomal transporters in the amino acid/auxin permease family (Sagne et al., 2001). These genes were involved in the transportation of substances and the cell cycle.

Differential expression genes and realtime PCR validation
Using 32,353 genes as a reference, five comparison groups: F Vs B, B Vs G, G Vs A, A Vs D, and D Vs P, were used to analyze the differential expression of genes during A. japonicus development using NOIseq arithmetic (fold-change≥2.0 and probability≥0.8). The number of up-regulated genes and down-regulated genes was 20 and 112, 1,781 and 1,535, 535 and 97, 98 and 272, and 12 and 29, respectively ( Figure 4). The highest number of genes showing The number of differentially expressed genes (DEGs) in five comparison groups.

FIGURE 3
Correlation network visualization of the interactions between the top 5 genes with the highest connectivity in three target modules. Each node labelled with the abbreviation of gene name. Node colors represent connectivity, and darker nodes represent higher connectivity. (A-C) represent blue, darkslateblue and yellow module, respectively. differential expression were between the blastula to gastrula stage (B Vs G). All DEGs are listed in Supplementary Table S10. For validation purposes, RT-PCR analysis in four genes including cyclinB, ferritin, SoxB1, and cubilin-like protein was conducted in the six development stages ( Figure 5). The results showed that the overall trend and variation of RT-PCR based expression patterns among these selected genes was coincident with those obtained by RNA-Seq based detection.

Gene function and KEGG pathway enrichment analysis of DEGs
All DEGs could be categorized into 34 terms. "Signal transduction", "Global and overview maps", "Cancers", "Endocrine system", "Transport and catabolism", "Infectious diseases: Viral", "cellular community", "Translation", "Digestive system", and "Signaling molecules and interaction" were the 10 most enriched terms during the entire development process of sea cucumbers (Supplementary Figure S1). Among almost all terms, the number of DEGs in the B Vs G comparison group was the largest, followed by the G Vs A comparison group (Figure 6), thus, there was strong gene expression regulation from blastula to auricularia. Compared with other pathways, the AVsD comparison group had the largest proportion in "cell growth and death".
GO enrichment analysis showed significant enrichment of DEGs in three categories: molecular function (MF), cellular component (CC), and biological process (BP) (Table S11 in the supplementary material). In the MF category, 1, 14, 2, and 1 terms were enriched in the F Vs B, B Vs G, G Vs A, and D Vs P groups, respectively. In the CC category, 3, 23, 1, and 5 terms were enriched in the F Vs B, B Vs G, G Vs A, and A Vs D groups, respectively. In the BP category, 12, 17, 1, and 1 terms were enriched in the F Vs B, B Vs G, G Vs A, and A Vs D groups, respectively.
KEGG pathway enrichment analysis showed two pathways were enriched in F Vs B, including Ribosome and Vibrio cholera infection. Three pathways were enriched in B Vs G, including Ribosome, Protein export and Proteasome, while 16 pathways were enriched in G Vs A, including Protein digestion and absorption, Renin-angiotensin system as well as Pancreatic secretion. Six pathways were enriched in A Vs D, including cell cycle, oocyte meiosis, and p53 signaling pathway. Two pathways were enriched in D Vs P ( Figure S2; Supplementary Table S12).

Gene regulation during the development of fertilized eggs to blastula
The transition from fertilized eggs to blastula is the first stage in sea cucumber development. It involves the expulsion of polar bodies, rapid cell division, and the expansion and break down of the fertilization envelope . About 1 hour after fertilization, the fertilized egg enters the cleavage stage. The cells gradually decrease in size after division and enter the blastocyst stage at about 7 to 9 h. The whole blastocyst stage lasts for about 13 h and is divided into a stationary stage and a rotational stage. Compared with fertilized eggs, blastocysts are multicellular embryos. We analyzed DEGs and found 20 up-regulated genes and 112 down-regulated genes during this stage. To evaluate the function of DEGs, we conducted GO and KEGG enrichment analysis. The significant GO cellular component terms were related to ribosome components, such as "ribosomal subunit" and "ribosome". The biological process ontology includes terms associated with viral reproduction (Supplementary Table S11). The KEGG enrichment analysis results showed that DEGs are enriched in "Ribosome", "Vibrio cholerae infection" and "Amoebiasis" ( Figure S2). "Vibrio cholerae infection" and "Amoebiasis" are the common pathways during the comparative transcriptomic analysis of the early development in pacific white shrimp Litopenaeus vannamei (Wei et al., 2014). This suggests that the two pathways are involved in the embryonic development process. The ribosome as a complicated organelle mainly catalyzed the amino acids into peptide chains (translation). "Ribosome" is a basic cellular biology pathway involved in cell division and other cellular activities. Among the DEGs, lamin B receptor (LBR) was the most significantly upregulated gene. LBR is an integral membrane protein of the interphase nuclear envelope involved in mitotic nuclear envelope breakdown (Margalit et al., 2005;Olins et al., 2010), which called open mitosis. Thus, we speculated that LBR promoted NE breakdown to regulate open mitosis of sea cucumber cells. The results implied that two major events in this developmental process were DNA rapid replication for cell proliferation and immune response for protecting organisms from environmental stress.

Gene regulation during development from blastula to gastrula
The highest number of genes (up-regulated genes: 1,781; downregulated genes: 1,535) showed differential expression during development from the blastula to gastrula stage (B Vs G), suggesting that this is a key period during sea cucumber development. It is worth noting that of the top 30 known upregulated DEGs listed in Table S10 according to probability values and mRNA expression, a total of seven DEGs were associated with metalloproteinase, including three genes coding matrix metalloproteinase and four genes coding zinc metalloproteinase. In addition, many transcription factors, key genes in conservative signaling pathways as well as cell cycle regulation genes were significantly up-regulated during this stage, such as FoxQ1 (forkhead box protein Q1-like), FoxA, Wnt-8a, and frizzled 9-10. These results are consistent with the fact that during gastrulation the germ layers of the embryo are formed and the body plan of the organism is established (Tam and Loebel, 2007).

Gene regulation during development from gastrula to auricularia
In this process, the digestive tract of sea cucumbers gradually starts to open and they start to feed. A total of 535 up-regulated genes and 97 down-regulated genes were screened during this stage. Among them, Proteinase T/proprotein convertase subtilisin/kexin type 9 preproprotein was up-regulated over 2,000-fold from the gastrula stage to the auricularia stage in sea cucumbers. This gene was the most highly expressed gene in the intestine of healthy A. japonicus, However, it was not highly expressed in the body wall or respiratory tree (Yang et al., 2009), indicating that the expression of this gene was very advantageous for intestine digestion.
At this stage, the majority of DEGs were categorized into "Immune and Infectious diseases" (122 DEGs), "Digestive system" (102 DEGs), and "Endocrine system" (94 DEGs) ( Figure 6). The most enriched pathways were "Protein digestion and absorption", "Renin-angiotensin system", "Pancreatic secretion", "Carbohydrate digestion and absorption", and "Fat digestion and absorption", which were responsible for both the digestive endocrine systems ( Figure S2C). In light of these findings, we speculated that the formation of the digestive tract and the start of feeding were the key events in this process.

Gene regulation during development from auricularia to doliolaria
During the development of auricularia to doliolaria, the planktotrophic larva, the auricularia, undergoes a dramatic morphological transformation at the onset of metamorphosis (Lacalli and West, 2000). At this stage, 98 up-regulated genes and 272 down-regulated genes were found. The majority of DEGs were categorized into "Immune and Infectious diseases" and "Signal transduction", accounting for 99 and 60 DEGs, respectively ( Figure 6). MAP kinase-interacting serine/threonine-protein kinase 1 isoform X1 and PTSP-like peptide neurotransmitter precursor were most significantly up-regulated during the development from auricularia to doliolaria, and have been found to play important roles in the metamorphosis process. As mentioned above, nerve cells begin to form in the late gastrula stage and the adult nervous system develops during the doliolaria stage (Nakano et al., 2006). This indicates that metamorphosis is a multigenic synergistically involved process regulated by endogenous neurotransmitters The percentage of DEGs enriched in each term at different developmental stages. Su et al. 10.3389/fmars.2023.1087339 Frontiers in Marine Science frontiersin.org (Matsuura et al., 2009). Among all the down-regulated genes, the two genes associated with DNA replication, zinc finger RAD18 domaincontaining protein C1orf124 homolog (C1orf124) and spalt-like transcription factor (SALL), were the most significant. C1orf124 is a regulator of translesion synthesis and participates in polymerase switching (Ghosal et al., 2012). SALLs are evolutionarily conserved proteins that participate in embryonic development. In oncology studies, SALLs are involved in cellular apoptosis, angiogenesis, invasion, and metastasis (Ma et al., 2021). This is consistent with the most significantly enriched signaling pathway being the 'cell cycle', suggesting active cellular events during metamorphosis ( Figure S2D).

Gene regulation during development from doliolaria to pentactula
Sea cucumber larva develop from doliolaria to pentactula, and use their tentacles to attach to the substrate at~12 days postfertilization, which was a key part of substrate detection/selection and settlement 1. By this stage, the morphological structure of most of organs has taken shape. Hence, relatively few genes showed differential expression during this process. Four up-regulated genes and 10 down-regulated genes were screened. "Salivary secretion" and "Prion diseases" were the most enriched pathways ( Figure S2E).

Individual key DEGs
Key genes related to sea cucumber development were classified into seven groups according to gene function: adhesion, cell cycle, signal, transcription factor, extracellular matrix (ECM), and cytoskeleton. The detailed information of gene expression can be seen in Supplementary Table S13.

Adhesion
The expression profiles of 9 cadherin genes and 14 protocadherin Fat genes, as well as 18 integrin genes, are summarized in Figure 7A. Genes from the same family showed different expression patterns. The expression levels of Cadherin Ap showed an increasing trend during the developmental process, especially from doliolaria to pentactula. The expression trend of the remaining 7 cadherin genes was opposite to Cadherin Ap. The expression levels of most protocadherin Fat genes were gradually down-regulated. The integrin genes exhibited low expression levels from doliolaria to pentactula.

Cell cycle
The expression profiles of 8 cell division control protein (Cdc) family genes, 19 cyclin family genes (Cyc), 7 DNA replication licensing factor genes (MCM), and 9 origin recognition complex genes (Orc) are summarized in Figure 7B. Cdc family genes exhibited low expression levels from auricularia to pentactula. Cdc6 is essential for the formation of pre-replication complexes at the origins of DNA replication (Davis et al., 2008). The ORC1 gene, which plays an important role in the initiation of DNA replication, was also highly expressed at the fertilized egg and blastula stages. Cyclin B accumulates and binds to Cdk1 at the G2-phase, playing a vital role in G2-to-M transition (Lemonnier et al., 2020). The expression level of MCM genes were up-regulated from fertilized egg to auricularia. During the development from fertilized eggs to blastula, cells proliferate through the mitotic cell cycle. Therefore, these genes regulate the cell division cycle.

Signal
Two family genes (Notch and Wnt) related to key signal pathways were screened ( Figure 7C). There was no obvious pattern in Notch gene expression, and there were high expression genes in each stage. The expression levels of Wnt16 and WntA were up-regulated from auricularia to pentactula. Wnt4 and Wnt5 showed opposite expression levels.

Transcription factor
A total of 23 forkhead transcription factor (Fox) genes were expressed during development. The expression patterns of eight Sox genes, five T-box transcription factors, six Krueppel-like factors, and eight paired box proteins are shown in Figure 7D. SoxB1, a multifunctional gene involved in maintaining stem cell multipotency, was highly expressed from blastula to auricularia stages. Tbx2/3 showed high expression during development from gastrula to pentactula. Interestingly, Pax5 was highly expressed from the fertilized egg to the gastrula stage and Pax9 was highly expressed from the auricularia to pentactula stages. The expression time of these two genes covers the entire larval development process.

ECM
Many genes associated with ECM were expressed during sea cucumber development, including 40 Matrix metalloproteinase (MMPs), 41 collagen, six tenascin, and 24 laminin ( Figures 7E-G). The expression levels of most MMPs were up-regulated at doliolaria and pentactula stages. Most collagen genes were up-regulated from the auricularia to pentactula stage. However, laminin genes were highly expressed at the early stages of development.

Cytoskeleton
During sea cucumber development, many cytoskeleton genes were expressed, including 14 actin, 46 myosin, and 30 tubulin ( Figures 7H, I). The expression of many actin and tubulin genes were highly expressed from the auricularia to pentactula stages, while many myosin genes exhibited high expression levels at the early stages of development.
It has been clear for many years that matrix metalloproteinase hydrolyze ECM components, play a central role in embryogenesis, normal tissue remodeling, wound healing, and angiogenesis (Dolmatov et al., 2021). Wnt-8 and frizzled 9-10, the key genes of the Wnt signaling pathway, which controls the embryonic processes involved in body axis patterning, cell fate specification, cell proliferation, and cell migration (Clevers, 2006), were upregulated over 200-fold at gastrula stage. The nervous system is developed from the ectoderm formed during gastrulation, so key genes, regulating neural development such as dorsal root ganglia, homeobox protein-like was significantly up-regulated. In addition, Cdc20-like expression was up-regulated to regulate cell division from the blastula to the gastrula stage (Yu, 2007), Cdc family genes were also up-regulated during blastula and gastrula stage. It is worth noting that TNF receptor-associated factor 6-like (TRAF6) was down-regulated over 250-fold, which may due to the fact that TRAF6 negatively regulates NF-kB.

Conclusion
In the present study, RNA-Seq analysis was conducted on six embryonic and larval stages of sea cucumber development as follows: fertilized eggs, blastula, gastrula, auricularia, doliolaria, and pentactula. A total of 53.20~59.96 Mb× 18 clean reads were obtained, and 32,353 genes were identified and annotated. After filtering out low-quality genes, the dynamic molecular responses to development were revealed by WGCNA. The blue module and the yellow module were positively correlated with the gastrula and auricularia stages, respectively. The darkslateblue module had the highest correlation coefficient with the blastula stage. GO terms for "RNA" and "proteasome complex" were most significantly enriched in the blue module. In the darkslateblue and yellow module, receptors of signaling pathways and metabolic processes were significantly enriched, respectively. Five comparison groups, F Vs B, B Vs G, G Vs A, A Vs D, and D Vs P, were used to analyze differential expression genes during the development of A. japonicus. All DEGs could be categorized into 34 terms, with "Immune and Infectious diseases" being the most enriched. There was strong gene expression regulation from blastula to Auricularia. Functional enrichment analysis was performed on DEGs from each comparison group. The behavioral events and molecular features Heatmap diagram showing the expression levels of transcripts annotated as adhesion genes (A), cell cycle genes (B), signal genes (C), transcription factor (D), ECM genes (E-G) and cytoskeleton genes (H, I). Su et al. 10.3389/fmars.2023.1087339 Frontiers in Marine Science frontiersin.org are summarized in Figure 8. The expression pattern of genes associated with adhesion, cell cycle, signal, transcription factor, extracellular matrix (ECM) and cytoskeleton was analyzed according to gene function. These results suggested that development and metamorphosis relied on complex molecular mechanisms in which many genes were involved. The present study provided useful transcriptomic resources for further research on A. japonicus development.

Data availability statement
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: BioProject, PRJNA889469.