Specific Transcriptomic Signatures and Dual Regulation of Steroidogenesis Between Fetal and Adult Mouse Leydig Cells

Leydig cells (LC) are the main testicular androgen-producing cells. In eutherian mammals, two types of LCs emerge successively during testicular development, fetal Leydig cells (FLCs) and adult Leydig cells (ALCs). Both display significant differences in androgen production and regulation. Using bulk RNA sequencing, we compared the transcriptomes of both LC populations to characterize their specific transcriptional and functional features. Despite similar transcriptomic profiles, a quarter of the genes show significant variations in expression between FLCs and ALCs. Non-transcriptional events, such as alternative splicing was also observed, including a high rate of intron retention in FLCs compared to ALCs. The use of single-cell RNA sequencing data also allowed the identification of nine FLC-specific genes and 50 ALC-specific genes. Expression of the corticotropin-releasing hormone 1 (Crhr1) receptor and the ACTH receptor melanocortin type 2 receptor (Mc2r) specifically in FLCs suggests a dual regulation of steroidogenesis. The androstenedione synthesis by FLCs is stimulated by luteinizing hormone (LH), corticotrophin-releasing hormone (CRH), and adrenocorticotropic hormone (ACTH) whereas the testosterone synthesis by ALCs is dependent exclusively on LH. Overall, our study provides a useful database to explore LC development and functions.


INTRODUCTION
Leydig cells (LC) are the main steroidogenic cells of the testes. They synthesize androgens that are essential for both masculinisation of the organism and spermatogenesis. In mice, two populations of Leydig cells arise sequentially, one during embryonic development referred as the fetal Leydig cells (FLCs) and the other postnatally referred as the adult Leydig cells (ALCs) (Baker et al., 1999;Habert et al., 2001;O'Shaughnessy et al., 2002;Haider, 2004;Chen et al., 2009). The mouse FLCs appear in the interstitial compartment of the testis shortly after sex determination at embryonic day (E)12,5. The FLC population expands considerably during fetal testis development through the recruitment and differentiation of Leydig progenitor cells rather than by mitotic division of differentiated FLCs (Byskov, 1986;Kerr et al., 1988;Migrenne et al., 2001;Brennan et al., 2003;Barsoum and Yao, 2010;Ademi et al., 2020). The maximum number of FLCs is reached around birth and regresses over the first 2 weeks of postnatal life (Kerr et al., 1988;Shima, 2019). The ALCs appear around one week after birth and increase in number during puberty. They arise from LC progenitors located in the testicular interstitium (Davidoff et al., 2004;Barsoum et al., 2013;Shima et al., 2013;Kilcoyne et al., 2014;Ademi et al., 2020). Two recent studies showed that both fetal and adult Leydig cells derive from a common pool of progenitor cells originating from the gonadal surface epithelium and mesonephric mesenchymal cells present from fetal life (Ademi et al., 2020;Shen et al., 2020). Evidence also shows that a subset of FLCs dedifferentiates at fetal stages to serve as potential ALC stem cells (Shima et al., 2018).
The rodent FLCs and ALCs have distinct morphological and functional differences. The FLCs display a high proportion of lipid droplets, while mostly absent in the ALCs (Huhtaniemi and Pelliniemi, 1992;Shima, 2019). Unlike ALCs, the FLCs are not capable of fully synthesizing testosterone on their own. They express all the enzymes necessary for androgen synthesis except HSD17B3, which converts androstenedione to testosterone. The conversion of androstenedione produced by the FLCs is achieved by the adjacent fetal Sertoli cells that express HSD17B3 (O'Shaughnessy et al., 2000;Shima et al., 2013). Another notable difference between fetal and adult LCs is their regulation by the pituitary gonadotropins. Although the luteinizing hormone (LH) receptor is expressed from E16.5 in FLCs and later in ALCs (O'Shaughnessy et al., 1998), LH signaling is dispensable for FLCs development, but prove to be essential for ALCs development and testosterone production. Neonatal mouse mutants for LH/CG receptors display testes indistinguishable from control mice. In contrast, testes from adult mutants for LH/CG receptors are reduced in size, with fewer and hypoplastic ALCs, and show impaired testosterone production (Lei et al., 2001;Zhang et al., 2001;O'Shaughnessy and Fowler, 2011;Teerds and Huhtaniemi, 2015). FLC function is normal in the absence of endogenous circulating gonadotropins (O'Shaughnessy et al., 1998) but markedly reduced in late gestation in T/ebp/Nkx2.1 null mice lacking a pituitary gland (Pakarinen et al., 2002). This suggests that additional hypothalamo/pituitary hormones, other than LH, may be required for FLC function and androgen production. Interestingly, two additional hormones have been reported to stimulate testosterone production in fetal testis. Adrenocorticotropic hormone (ACTH) has been reported to stimulate in vitro testosterone production in fetal and neonatal testes (O'Shaughnessy et al., 2003). In parallel, corticotropinreleasing hormone (CRH) has been reported to stimulate steroidogenesis by direct activation of FLCs in fetal rat and mouse testes ex vivo and in MA-10 mouse Leydig cells (McDowell et al., 2012), but not in primary ALCs (Huang et al., 1995;McDowell et al., 2012).
While testosterone synthesis is subjected to intensive studies, our knowledge of FLCs and ALCs origins, development, and in particular similarities and differences is still incomplete. Multiple transcriptomic studies including either mouse fetal or adult Leydig populations have been performed on whole gonads or purified cell populations in a given context (Nef et al., 2005;Beverdam and Koopman, 2006;Stanley et al., 2011;Jameson et al., 2012;Munger et al., 2013;McClelland et al., 2015;Inoue et al., 2016;Miyabayashi et al., 2017). To date, no comprehensive comparison of FLCs and ALCs has been yet performed, and the identification of discriminant transcriptional signatures would be useful in distinguishing the two LC populations. In the present study, we employed a combination of bulk and single-cell RNA sequencing (RNA-seq and scRNA-seq) analyses to compare the transcriptome of FLCs and ALCs. While deep RNA-sequencing on purified Leydig cell populations allows the exploration of the transcriptomic landscape of FLCs and ALCs, including low expressed genes and alternative splicing; the high resolution of single cell RNA-sequencing allows us to bypass contamination issues inherent to cell population purification methods, and to identify specific marker genes that discriminate FLCs and ALCs amongst the other testicular cells. Our results provide a comprehensive view of the FLCs and ALCs transcriptional similarities and differences, unveiling important variations in terms of gene expression level and alternative splicing between the two populations of LCs. Furthermore, our analyses uncovered FLC-and ALC-specific markers that represent useful tools to study these two steroidogenic populations. Finally, we observed the expression of corticotropin-releasing hormone receptor 1 (Crhr1) and melanocortin 2 receptor (Mc2r) exclusively in FLCs, which corroborates previous studies suggesting that androgen synthesis is influenced by CRH and ACTH in FLCs.

Mouse Strains
Embryos were collected from timed pregnant female CD-1 outbred mice (Charles River) and heterozygous Tg(Nr5a1-GFP) transgenic male mice (Stallings, 2002). The mating plug observed the next morning is designated as E0.5 were used in this study. Animals were housed and cared according to the ethical guidelines of the Direction Générale de la Santé of the Canton de Genève (GE/57/18 30080).

Fetal and Adult Leydig Cell Purification
The purification of fetal Leydig cells was carried out in seven independent experiments based on a previously described experimental protocol (Nef et al., 2005;Pitetti et al., 2013;Stévant et al., 2018). The resulting cells were pooled to achieve the amount of RNA required for the preparation of the RNA sequence library (Supplementary Table 1). In short, adult CD-1 females were time-mated with heterozygous Tg(Nr5a1-GFP) transgenic male and checked for the presence of vaginal plugs the next morning (E0.5). On the relevant days of gestation (E18.5), females were sacrificed by CO 2 inhalation and the embryos collected in PBS. The sex and the presence of the Nr5a1-GFP transgene in the embryos were assessed under a fluorescent binocular microscope. Testes were isolated and incubated 20 min with Trypsin-EDTA 0.05%, mechanically dissociated with gentle pipetting, and filtered through a 70 µm cell strainer to obtain single cell suspension.
Adult Leydig cells purification has been performed in four independent experiments (Supplementary Table 1). Hundredday-old (P100) heterozygous Tg(Nr5a1-GFP) transgenic male mice were used for this experiment. Mice were sacrificed with Esconarkon injection and ∼1 mL of blood was collected by intracardiac puncture for serum extraction. Tunica albuginea of the testes were delicately removed and testes were incubated in DMEM supplemented with collagenase (1 mg/mL C0130; Sigma-Aldrich, St. Louis, MO, United States), hyaluronidase (2 mg/mL H3506; Sigma-Aldrich), and DnaseI (0.8 mg/mL dN25; Sigma-Aldrich) at 37 • C for 20 min with gentle agitation. After two rounds of seminiferous tubules sedimentation, the supernatants enriched in interstitial cells were collected and incubated 10 min with Trypsine-EDTA 0.05%. Cells were centrifuged and filtered through a 70 µm cell strainer to obtain single cell suspension.
Nr5a1 is expressed in several populations of testicular cells including both the Sertoli cells and Leydig cells, the latter expressing Nr5a1 very highly. Using this difference in the level of expression, strong Nr5a1-GFP positive cells from E18.5 and P100 testes were then sorted by fluorescent-active cell sorting (BD FACS ARIA II), excluding cell doublets, and the dead cells with Draq7 TM dye staining (Supplementary Figures 1A,A ). Cells were collected directly into RLT buffer from Qiagen RNeasy Mini kit for RNA extraction.

Bulk RNA-Sequencing Library Preparation and Sequencing
RNA was extracted from strong positive Nr5a1-GFP + cells with the RNeasy Mini kit (Qiagen) to obtain a minimum of 260 ng of total RNA. The composition of the different samples is detailed in Supplementary Table 1. Sequencing libraries were prepared from 150 ng of DNA with the TruSeq Stranded Total RNA Library Prep Gold (Ribo-Zero) and sequenced on an Illumina HiSeq 2500 (50 bp, paired end, ∼35 million reads expected) at the Genomics Platform of the University of Geneva.

RNA-Sequencing Analysis
Reads were demultiplexed with Casava (v1.8.2), mapped with GemTools (v1.7.1) (Marco-Sola et al., 2012) and read counts and RPKM gene expression quantifications were calculated with an in-house pipeline based on Gencode annotation GRCm38 (v4). Globally, over 88% of the reads mapped to exonic regions. Preranked gene set enrichment analysis was performed using GSEA (v4.1.0) with the mean RPKM expression as ranking, and using the genes with a mean RPKM > 1 (Subramanian et al., 2005). Spearman correlation and principal component analysis (PCA) using the R base stats package, revealed a very high correlation for both biological triplicates of ALCs and FLCs (Spearman correlation score > 98%, see Supplementary Figures 1C,D). Similarly, the correlation between the two conditions, ALCs and FLCs, is also very high (Spearman correlation score > 88%). We used the R package DESeq2 (v1.24.0) (Love et al., 2014) for the differential expression analysis. Genes with fewer than 10 reads were not taken into account. GO terms enrichment analysis was computed on the selected genes enriched in FLCs and in ALCs with the R packages ClusterProfiler (v3.12.0) (Yu et al., 2012).
To calculate the proportion of each biotype of the detected genes, we selected the genes expressed (RPKM > 0), we looked at the transcript_type field in the reference genome annotation (GTF file), and calculate the percentage of expressed genes for each biotype 1 . Calculation were performed using R.
Differential splicing analysis was performed using rMATS (v3.2.5) (Shen et al., 2012(Shen et al., , 2014 with default parameters, using the bam files and the GTF file used for the mapping as input. Splicing event with a FDR ≤ 0.05 were considered as significant. Single-Cell RNA-Sequencing Library Preparation and Sequencing Computations were performed at the Vital-IT Centre for high-performance computing of the SIB (Swiss Institute of Bioinformatics) 3 . Demultiplexing, alignment, barcode filtering and UMI counting were performed with the Cell Ranger v2.1 pipeline (10X Genomics). Reference genome has been modified to include the eGFP transgene with the mkref function. Data were mapped to the mouse reference genome GRCm38.p5 in which the eGFP and the bovine GH 3 splice/polyadenylation signals (NM_180996.1) (Stallings, 2002) sequences have been added, and annotated with Gencode vM15. Only protein coding genes and long non-coding RNAs were retained for further analysis.
To set a threshold between cell containing barcodes and empty ones, we computed the knee point and the inflection point of the ranked barcode distribution plot. Then, we detected the local minimum between these points on the density curve (density base R function, bw = 500, n = 4,096) of the UMI number per barcode using the quantmod R package (v0.4-16). This local minimum was used as a threshold for cell selection.

Adult Mouse Single-Cell RNA-Sequencing Data
We selected the libraries do15983, do15984, do17622, do17623, do17815, do17816, do18197, do18198, and do18199 of mice older than 60 days old from Ernst et al. (2019). These libraries were analyzed with Cell Ranger v1.3.1 software using the default threshold to obtain high-quality cells with large numbers of UMIs. We filtered out cells with <500 UMI and we excluded cells with more than 5% of reads mapping to the mitochondrial genome. We selected only protein-coding genes. Then, we inferred cell labels with the annotation furnished by Ernst et al. (2019), and removed cells labeled as "Outliers." So, 24,672 cells were used. We confirmed the cell type classification of Ernst et al. (2019) with genes known from the literature (Cyp11a1, Star, Insl3, Cyp17a1, and Fabp3) (Stévant et al., 2018).

Merge of Fetal and Adult Single-Cell RNA Sequencing Data
In order to compare the Leydig cells present at fetal and adult stages, we merged the single-cell RNA sequencing datasets with the MergeSeurat function (Seurat, v2.3.4). We normalized and computed the Principal Component Analysis (PCA) using the genes expressed in more than 50 cells. The 10 first components of this PCA were used to compute the corrected neighbor graph with BBKNN (balanced batch KNN, v1.3.8) (Polañski et al., 2019) and then, the UMAP representation. We made use of the previous cell annotation to distinguish fetal and adult populations and we ensured the cell identity using marker gene expression. This list includes the genes Nr5a1, Star, Cyp11a1, Insl3, Hsd17b3, Amh, Sox9, Arx, Nr2f2, Pecam1, Cdh5, Esam, Pou5f1, Ddx4, Dmrt1, Piwil1, Pex21, and Tnp1. We performed a differential expression analysis (Mann-Whitney-Wilcoxon test) between FLCs and ALCs clusters using Seurat FindMarkers function (only positive markers, min.pct = 0.25, thresh.use = 0.25). We identified genes showing a high expression in one population of Leydig cells and a low expression in the other population (adj. p-valuej < 0.01, avg logFC > 0.5, pct.1 > 0.5, pct.2 < 0.25), and overexpressed as well in the same population in the DESeq2 analysis.
From this selection, in order to select the genes with specific expression in one Leydig cell population and with a low expression in all other populations in the testis, we used an additional differential expression analysis between all cell types using Seurat FindAllMarkers function (only positive markers, p_val_adj < 0.01 and avg_logFC > 0.5 and pct.1 > 0.5 and pct.2 < 0.25) was used to compute marker genes for every cluster. The intersect of the two differential expression analysis with Seurat is used to get a list of the marker genes of FLCs and ALCs specifically.

RNAscope R Analyses (in situ Hybridization)
Adult (P100) and embryonic (E16.5) Nr5a1-eGFP samples from timed mated females were collected and fixed overnight in 4% paraformaldehyde, dehydrated and embedded in paraffin. Five µm thick sections were examined histologically via haematoxylin and eosin staining. We performed the RNAscope R 2.5 HD DuplexAssay protocol following the recommendation of BioTechne. The Star probes (C2) to label Leydig cells, and the probes for the candidates Crhr1 (C1), Ren1 (C1), Bhmt (C1), and Sult1e1 (C1) were tested. Slides were imaged using an Axioskop 2 plus confocal microscope and ZEN 2009 software (Carl Zeiss Ltd., Hertfordshire, United Kingdom). For reproducibility purpose, at least three different animals of each group were tested.

Immunostaining
Animals were bred and maintained in strict compliance with the Animals (Scientific Procedures) Act, 1986. All procedures were conducted in accordance with United Kingdom Home Office regulations under project licenses 60/4200 and 70/8804 held by Lee B. Smith.
Neonatal and adult tissues were fixed in Bouins for 6 h, stored in ethanol 70% and embed in paraffin. Sections of 5 µm were dewaxed in xylene, rehydrated in graded ethanol solutions. For the double immunostaining, slides were antigenretrieved in pressure cook with 0.01 M citrate buffer (pH 6.0). To quench endogenous peroxidases activity, slides were incubated in 0.3% hydrogen peroxide (v/v) in TBS for 30 min at room temperature (RT). The non-specific activity was blocked using the appropriate normal blocking serum for 30 min at RT followed by incubation overnight at 4 • C with the first primary antibody diluted in blocking serum. After washing, slides were incubated for 30 min at RT with the appropriate secondary antibody conjugated to peroxidase and diluted 1/200 in blocking serum and left on the slides for 30 min at RT. Sections were then incubated with Tyramide Signal Amplification system ('TSA, ' TM PerkinElmer) diluted 1/50 for 10 min at RT according to the manufacturer's instructions. Slides were then stained with the second Primary antibody and washed as an incubated as described above with secondary and Tyramide. Sections were then counterstained in Sytox Green (Molecular Probes, Life Technologies, Paisley, United Kingdom) for 10 min at RT and mounted in PermaFluor mounting medium (Thermo Scientific, United Kingdom). Slides were scanned using an LSM 710 confocal microscope and ZEN 2009 software (Carl Zeiss Ltd., Hertfordshire, United Kingdom). The primary and adequate secondary antibodies used in this study are detailed in Supplementary Table 2. To assure the specificity of the stained tissue, sections incubated with no primary antibody were used as negative controls. For reproducibility purpose, at least three different animals of each group were tested.
We show that the ALCs express in excess the Insl3 gene with an average of 21,113 RPKM, followed by Aldh1a1 (8,050 RPKM) and Cyp17a1 (5,742 RPKM). Together the RNA abundance of three genes represents 9.4% of the whole transcriptome. The FLCs do not display such an extreme over-representation of the same genes, but the top three expressed genes are mt-Co1 (7,501 RPKM), Hsd3b1 (5,357 RPKM), and Insl3 (3,947 RPKM), which represent 5.4% of the total transcriptome. To appreciate the biological functions enriched amongst the most expressed genes of FLCs and ALCs, we performed a pre-ranked gene set enrichment analysis of the transcriptomes, weighting genes by their level of expression. While in FLCs we observed a statistical enrichment of 345 GO terms (FDR < 25%), including "glucocorticoid biosynthetic process, " "C21 steroid hormone metabolic process" and "regulation of systemic arterial blood pressure by renin angiotensin" in the top terms, no statistical enrichment was observed for the ALCs. Although not statistically enriched, we could find in the top terms "glucocorticoid metabolic process, " "C21 steroid hormone metabolic process", and "circadian sleep wake cycle" [of note, the P100 mice were euthanized in the morning, and we know testosterone synthesis is sensitive to the circadian rhythm (Chen et al., 2017) Table 3). These results indicate that both fetal and adult Leydig cells are highly specialized cells dedicated to steroid production. For all subsequent analyses, we retained only the genes coding for proteins and long noncoding RNA (lncRNA).

Wide Variations in Gene Expression Levels Were Observed Between FLCs and ALCs
We then thought to evaluate the extent to which the transcriptomes of the two Leydig cell populations are comparable and what genes and biological pathways are differentially expressed. Among the 21,083 protein coding and long-non-coding genes expressed in either FLCs or ALCs, 15,330 genes have no significant difference of expression ( Figure 1B and GSE171746). Interestingly, a large proportion of genes (5,753 genes, i.e., 27% of the total) display significant variations of expression levels between ALCs and FLCs (adjusted p-value < 0.01). Of these genes, 2,357 are overexpressed in FLCs (FC < 0.5) (Figures 1B,C). The vast majority of genes overexpressed in FLCs have never been identified as such. This is particularly the case for the top 30 genes with the lowest adjusted p-value such as Cdkn1c, Gpc3, Cbln1, Sfrp1, Myadm, Glis2, Peg3, and Smoc2. In addition, we found also Thbs2 and Mc2r, two genes known to be specifically expressed in FLCs (O'Shaughnessy et al., 2002(O'Shaughnessy et al., , 2003 as well as 50 genes already reported to be expressed -although not specifically -in FLCs (Jameson et al., 2012;McDowell et al., 2012;McClelland et al., 2015;Inoue et al., 2016). We grouped the genes differentially expressed into 15 clusters according to their expression profile using a hierarchical clustering ( Figure 1D). Genes enriched in FLCs are grouped in clusters 3, 4, 7, 8, 9, and 13. Gene Ontology (GO) analysis of these clusters revealed an association with the development of the urogenital system (cluster 3), cell division and differentiation (cluster 4), various metabolic processes (clusters 7, 9, and 13), and response to reactive oxygen species (ROS) (cluster 13) (Supplementary Table 3). On the other side, 3,396 genes were found overexpressed in ALCs (FC > 2) (Figures 1B,C). Again, the large majority of genes overexpressed in ALCs have never been identified as such. This is particularly the case for genes with the lowest adjusted p-value such as Gstm2, Gstm1, Amy1, Csf1, Timp2. As expected, we have found genes known to be specifically expressed in ALCs such as Hsd3b6, Hsd17b3, Vcam1, Sult1e1, and Hpgds (O'Shaughnessy et al., 2002), as well as 750 genes already reported to be expressed -although not specifically -in ALCs (Sanz et al., 2013;O'Shaughnessy et al., 2014). Genes enriched in ALCs are grouped in clusters 2, 5, 6, 10, 11, 12, 14, and 15. This time, GO analysis of these clusters indicated a link with fertilization (clusters 2 and 5), regulation of cellular response (cluster 10), cell-substrate adhesion (cluster 11), regulation of ROS (cluster 14), and various metabolic processes (clusters 12, 14, and 15) (Supplementary Table 3). The cluster 1 regroups genes with low expression in both LC populations that are involved in stress and immune response. This association  with the immune system is consistent with the role of cytokines secreted by testicular macrophages in the regulation of Leydig cell functions (Hales, 2002). Overall, we observed significant variations in the level of expression of thousands of genes, with only a handful of genes exhibiting specific expression in one of the two Leydig cell populations.

Differential Splicing Between FLCs and ALCs
Alternative splicing is a ubiquitous regulatory mechanism that allows the generation of multiple transcript isoforms from a single gene, thus expanding the complexity of the proteome. However, the extent of alternative splicing occurring in FLCs and ALCs and its functional relevance remain unclear. To investigate whether these two cell populations exhibit different alternative splicing profiles, we performed a multivariate analysis of transcript splicing. We found 1,971 splicing events that are statistically different between the two LC populations (FDR < 0.05) (1,380 events in FLCs and 591 in ALCs) (Figure 2A and Supplementary Table 4). These splicing events occur in 1,437 genes, including 1,036 genes in FLCs, and 509 genes in ALCs (with 31 genes having an alternative splicing in both cell populations). We also examined if the genes detected as alternatively spliced correspond to differentially expressed genes. Of the 2,357 FLCs overexpressed genes, 86 of them show an alternative splicing. In the ALCs, 56 genes out of the 3,396 overexpressed genes show an alternative splicing. It appeared that the genes involved in steroidogenesis are not subject to alternative splicing in both FLCs and ALCs.
As shown in Figure 2A, intron retention is the most represented type of alternative splicing in the FLCs, with 742 events found in FLCs but only 34 in ALCs. We investigated if the genes presenting the different type of alternative splicing in both populations are enriched in a particular biological function by performing a GO enrichment analysis (Supplementary Table 4). In FLCs, the genes presenting exon skipping are strongly enriched in RNA splicing functions (Figure 2B), while genes showing intron retention are involved in mRNA processing and chromatin rearrangements ( Figure 2C). Regarding ALCs, genes showing exon skipping are involved in various processes such as cellular organization, cellular projection, or muscle system process ( Figure 2D). No GO enrichment was found in the other types of alternative splicing due to the small number of genes.
Overall, we showed that intron retention is a landmark of the FLC transcriptome. It is known that alternative splicing is frequent during embryonic development, usually cell/organ specific and plays a role in gene expression regulation and protein diversity (Revil et al., 2010;Kalsotra and Cooper, 2011;Grabski et al., 2021).

Characterization of Mutually Exclusive Marker Genes/Signatures of FLCs and ALCs
A thorough identification of the genes specifically expressed in FLCs and ALCs has never been achieved. Although our comparative analysis identified many genes that are differentially expressed between FLCs and ALCs, there is no evidence that these are specific to FLCs or ALCs. Indeed, many of them may also be expressed in other testicular cell types. To identify FLCand ALC-marker genes, we took advantage of existing single-cell RNA sequencing data from E16.5 testes (3,781 cells) (Neirijnck et al., 2019) and adult testes (24,672 cells) (Ernst et al., 2019). We combined these fetal and adult datasets to evaluate the gene expression specificity. Using five well-established Leydig cell markers, namely Hsd3b1, Star, Cyp11a1, Cyp17a1, and Fabp3, we identified 151 FLCs and 148 ALCs (Figure 3) (Ernst et al., 2019;Rebourcet et al., 2019). By comparing the sets of genes specifically enriched in ALCs and FLCs obtained with the single-cell RNA sequencing and the bulk RNA sequencing approaches described above, we generated a high-confidence selection of 62 genes enriched in FLCs showing a link with cell proliferation and differentiation, and hormone secretion (Supplementary Table 5) and 120 genes enriched in ALCs related to protein processing, spermatid development, and metabolic processes (Supplementary Table 6).

Identification of Nine FLC-Specific Marker Genes
Although we have identified 62 genes showing an expression in FLCs and low or no expression in ALCs, there is no evidence that these genes are specific to FLCs. Using singlecell transcriptomic data, we excluded the ones that were    Figure 4G) seems specific to FLCs but is not retained due to its low expression in single-cell transcriptomic data. The specific expression of FLC marker genes Crhr1 and Ren1 was validated by in situ hybridization and compared with Star expression, a known marker of Leydig cells (Figures 4H-K). We found that both Crhr1 and Ren1 are co-expressed with Star in E16.5 testis but not in adult P100 testis, confirming that these genes are specifically expressed in FLCs and not in ALCs or any other testicular cells (Figures 4H-K).

Identification of 50 ALC-Specific Marker Genes
We used the same approach to identify testicular marker genes specific to ALCs. Among the 120 genes specifically enriched in ALCs (compared to FLCs), 50 genes were considered as specific markers of ALCs (Figures 5A-G and (Figures 5H-K).
We found that Bhmt and Sult1e1 are co-expressed with Star in adult P100 testis but not in fetal E16.5 testis confirming that these genes are specifically expressed in ALCs and not in FLCs or any other testicular cells.
Overall, our analysis combining both bulk RNA sequencing and single-cell RNA sequencing resulted in the identification of 9 and 50 specific markers for FLCs and ALCs, respectively, most of which are newly identified.

DISCUSSION
The main purpose of our study was to characterize at transcriptomic level the similarities and differences between FLCs and ALCs, the major androgenic cells of the testis, using both bulk and single-cell RNA sequencing. Significant differences were observed both in terms of expression level, with 2,357 genes overexpressed in FLCs (11.2% of the total) and 3,396 genes overexpressed in ALCs (16.1%); and in terms of alternative splicing, with an over-representation of intron retention events in FLCs compared to ALCs. Our study also identified many specific markers for each Leydig cell populations, with 9 genes for FLCs and 50 genes for ALCs, most of them newly described.

Identification of FLC-and ALC-Specific Genes
The purity of the Leydig cell population is critical for the identification of Leydig cell markers using microarray and bulk RNA sequencing analyses. We have multiple indications that support the assertion that the 9 FLC-specific markers -and 50 ALC-specific markers -identified in this study are robust and specific. First, our transcriptomic analysis combines two independent sources of data, namely those from the bulk RNA  sequencing data of FLCs and ALCs, in which Leydig cells were sorted according to the level of GFP expression, but also a singlecell RNA sequencing of the testes of fetal and adult mice. In addition, the few markers already known, in particular Hsd17b3 and Hsd3b6 were also identified in the list of markers specific for ALCs. Finally, an independent validation by RNAscope R of Ren1 and Crhr1 as FLC-specific marker genes, and Bhmt and Sult1e1 as ALC-specific marker genes, confirmed their specific expression. Genome-wide expression studies using microarray technology or bulk RNA sequencing have investigated the transcriptome of FLCs (Jameson et al., 2012;McDowell et al., 2012;McClelland et al., 2015). In these studies, they isolated and evaluated the transcriptome of several populations present in the fetal testis, such as germ cells, Sertoli cells, Leydig cells and interstitial cells. Differential analysis of expression among these cell populations led to the identification of 166 overexpressed genes in FLCs. However, since the analyzed cell populations represent only a fraction of the cell types present in the fetal testis, this list of FLC overexpressed genes is overestimated. In contrast, our analysis combining bulk and single cell RNAsequencing identified nine FLC-specific candidates of which three were already described in these previous studies. We found that the other genes initially described as enriched in FLCs are mostly non-specific, either expressed in ALCs or in other testicular cell types. Among the three genes specifically enriched in FLCs are the Crhr1 (McDowell et al., 2012), Vsnl1 (Jameson et al., 2012;McClelland et al., 2015), and Ren1 (Jameson et al., 2012) genes. The corticotropin releasing hormone receptor 1 (Crhr1) is of particular interest as its ligand CRH (Corticotropin Releasing Hormone) is known to stimulate testosterone production in the fetal testes (McDowell et al., 2012) (see last paragraph of the discussion for more details and Figure 6). Moreover, our analysis revealed six new FLC-specific candidates including Cyp26b1, a gene coding for an enzyme degrading retinoic acid (RA), an active metabolite of retinol involved in meiosis regulation (Bowles et al., 2006;Koubova et al., 2006). It has recently been reported that in Cyp26b1-/-mutant mice, Leydig cell differentiation is impaired and steroidogenesis is decreased (Bowles et al., 2018).
Previous transcriptomics studies have used indirect methods to identify ALC-specific mRNA transcripts, such as the response of Leydig cells to hormones (Sanz et al., 2013), or cell ablation model using ethane dimethane sulphonate (EDS) to ablate LCs in adult male rats . These studies have resulted in a combined list of over 2,000 genes whose expression is enriched in ALCs. Although ingenious, these approaches due to their technical bias and limitations do not guarantee an ALC-specific expression. Here, we have identified 50 genes with ALC-specific expression ( Table 2). Confirming our results, three genes known to be specific for ALCs, Hsd3b6, Hsd17b3, and Vcam1 are also present in our list (O'Shaughnessy et al., 2000;Shima et al., 2013;Wen et al., 2014). Among these 50 genes, six were not described in the previous studies, namely Ces1d, Serpina3g, Rarres1, Bpifb5, Eppin, and Espn. Hundreds of genes initially described as enriched in ALCs by previous studies were excluded because their expression was not specific. This is particularly the case for Ptgds and Hsd11b1, two genes often described as specific to ALCs (Baker and O'Shaughnessy, 2001;Wen et al., 2014). In this study, we validated by RNAscope R the ALC-specific expression of two genes: Betaine-Homocysteine S-Methyltransferase (Bhmt) and Sulfotransferase Family 1E Member 1 (Sult1e1). We also proved the specific expression of BHMT at the protein level by immunohistochemistry (Supplementary Figure 2). BHMT plays a key role in regulating betaine concentration, that can be stored to control cellular osmolarity or metabolized to provide a methyl group for homocysteine methylation (Alirezaei et al., 2012). It has been shown that the testes are among the organs that contains the most betaine (Slow et al., 2009). SULT1E1, for its part, plays a protective role for Leydig cells and seminiferous tubules against estrogen overstimulation by catalyzing the sulfoconjugation and inactivation of estrogens (Song, 2007). It is also noteworthy to find several members of the SERPIN family among our ALC-specific genes (Serpina3c, Serpina3g, Serpina3n, and Serpina5) as most of them have been found in Leydig cells and seem to be sensitive to hCG (human chorionic gonadotropin) (Odet et al., 2006).
Most of the previous transcriptomic studies on Leydig cells have been carried out in rodents (rat or mouse). Although the two species are closely related, their Leydig cells show many differences. For this reason, we cannot guarantee the specificity of the markers in this study in other species than mice.

Alternative Splicing: High Intron Retention in FLCs
The main types of alternative splicing are alternative exon usage, alternative 5 or 3 splice sites, mutually exclusive exons, and intron retention. Intron retention is characterized by the inclusion of one or more introns in mature mRNA transcripts and has been previously considered to be an artifact of a dysfunctional spliceosome. It is known that alternative splicing, including intron retention, is frequent during embryonic development and contribute not only to the plasticity of the transcriptome but also the regulation of gene expression and protein diversity (Revil et al., 2010;Kalsotra and Cooper, 2011;Grabski et al., 2021). Here, we showed that intron retention is a landmark of the FLC transcriptome. Messenger RNA displaying intron retention are generally restricted from exiting the nucleus. This was proposed as a mechanism to downregulate gene expression (Grabski et al., 2021). In FLCs, we also showed that genes presenting alternative exon skipping are involved in splicing regulation itself. The control expression levels and activities of RNA binding proteins (RBPs) that regulate RNA splicing is mediated by auto-regulatory feedbacks by directly influencing the splicing of their own mRNAs (Müller-McNicoll et al., 2019). In particular, the regulation of the splicing factors of the SR (Serine/arginine rich) family regulate their activity by modulating the inclusion of a cassette exon containing a premature termination codon to produce or nor a functional protein (Müller-McNicoll et al., 2019). Tight regulation of the splicing factors is necessary for post-transcriptional gene expression regulation. The intron retention events observed in FLCs might subsequently result from the auto-regulation feedback of the splicing factors. Post-transcriptional gene expression regulation through alternative splicing have been identified as key player in the differentiation of mesenchymal stem cells (Park et al., 2020). We can postulate that the regulation of the FLC differentiation might also be mediated by alternative splicing.

Differences in FLCs and ALCs Transcriptomes Affect Steroidogenesis and Its Regulation
Fetal Leydig cells and ALCs display significant differences in both steroidogenic regulation and the type of androgen produced (androstenedione vs. testosterone) (O'Shaughnessy et al., 2000(O'Shaughnessy et al., , 2002Shima et al., 2013). Our transcriptomic data confirmed the differences in androgen production, the expression of the Hsd17b3 gene encoding the enzyme responsible for the conversion of androstenedione to testosterone is not expressed in FLCs but only in ALCs (Table 2 and Figure 6), which explains why FLCs synthesize mainly androstenedione and ALCs are capable of producing testosterone (O'Shaughnessy et al., 2000;Rebourcet et al., 2020). Regarding the differences in steroidogenesis regulation, LH appears not to be essential for FLC function since androgen production and masculinisation of the fetus occurs normally in LH/CG receptor knockout mice (Kendall et al., 1995;Lei et al., 2001;Zhang et al., 2001;Ma et al., 2004;O'Shaughnessy and Fowler, 2011;Teerds and Huhtaniemi, 2015). However, in T/ebp/Nkx2.1 null mice, which lack a pituitary gland, testicular androgen levels are markedly reduced in late gestation, suggesting that additional hypothalamo/pituitary hormones may be required for Leydig cell function and androgen production. Interestingly, our transcriptomic analysis confirmed that the ACTH receptor, melanocortin type 2 receptor (Mc2r), and the corticotropin releasing hormone receptor 1 (Crhr1) are both specifically expressed in FLCs and absent from ALCs (Table 1 and Figure 6). ACTH has been reported to stimulate in vitro testosterone production in fetal but not in adult testes suggesting that FLCs, but not ALCs, are sensitive to ACTH stimulation (O'Shaughnessy et al., 2003). However, fetal testosterone levels were normal in Proopiomelanocortin (POMC)-deficient mice that lack circulating ACTH, indicating that ACTH, like LH, is not essential for FLC function. Corticotropin-releasing hormone (CRH) has been also reported to stimulate steroidogenesis by direct activation of FLCs in fetal rat and mouse testes ex vivo and in MA-10 mouse Leydig cells (McDowell et al., 2012). In contrast, CRH does not enhance steroidogenesis in primary ALCs (Huang et al., 1995;McDowell et al., 2012). Combined together, these results support a sequential regulation of steroidogenesis in LCs. In this model, androgen production by FLCs is stimulated by three potentially redundant hypothalamo/pituitary hormones, namely LH, CRH and ACTH. Fetal androgen production can occur in the absence of any of these hormones with the two other hormones able to maintain FLC steroidogenic activity. Conversely, activation of steroidogenesis in ALCs is LHdependent and CRH-and ACTH-independent, since Crhr1 and Mc2r are not expressed in these cells (Figure 6). Although this model of steroid regulation by Leydig cells needs to be confirmed by further studies, such a mechanism may have evolved to ensure the production of adequate levels of androgens during fetal development.

DATA AVAILABILITY STATEMENT
The fetal and adult Leydig cell bulk RNA-sequencing data are available on GEO (NCBI) under accession number GSE171746.

ETHICS STATEMENT
The animal study was reviewed and approved by Direction Générale de la Santé of the Canton de Genève.

AUTHOR CONTRIBUTIONS
IS, LS, and SN contributed to conception and design of the study. IS, YN, and FK collected mouse samples and prepared sequencing libraries. PS and IS performed the bioinformatics analysis. PS, DR, AD, and MC carried out the experimental validation. PS wrote the first draft of the manuscript. IS and SN wrote sections of the manuscript. All authors contributed to manuscript revision, read, and approved the submitted version.

ACKNOWLEDGMENTS
The authors thank Violaine Regard from the NEF Laboratory, Cécile Gameiro, and Gregory Schneiter from the Flow Cytometry Facility, the Genomics Platform of iGE3, and the Animal Facility of the Faculty of Medicine of the University of Geneva.