Evidences of Z- and W-Linked Regions on the Genome of Fenneropenaeus chinensis

Fenneropenaeus chinensis is a commercially cultured shrimp in China. F. chinensis adults show significant sexual dimorphism, with larger females than males. However, sex determination (SD) of F. chinensis has not yet been elucidated. Clarification of the sex-determining system of F. chinensis could enrich our knowledge of the sex differentiation mechanism in crustaceans and facilitate the study of sex-controlling technologies. Here, we studied the sex-determining system of F. chinensis using the fixation index (FST) between the sexes to detect the genetic differentiation in resequencing data of multiple males and females. We located the candidate sex chromosome in the genome of F. chinensis and concluded the female heterogametic (ZW) SD system. We also assembled female-specific sequences, which could be used as molecular markers to identify the sex of F. chinensis. However, the differentiation of the F. chinensis Z and W chromosome is limited. RNA-seq data detected many genes with male-biased expression in the Z-specific region, which possibly could further intensify the divergency between the Z and W chromosomes.


INTRODUCTION
The evolution of chromosomes carrying the sex determination (SD) factor is an interesting form of genomic evolution (Charlesworth and Charlesworth, 2000;Bachtrog et al., 2011). In some cases, selection for close linkage may promote suppressed recombination in the SD region, because the region includes a sexually antagonistic polymorphism (Bull, 1985;Bergero and Charlesworth, 2009;Wright et al., 2016). Once recombination becomes suppressed, sex-specific evolutionary pressures act on the evolving sex chromosomes (Lahn and Page, 1997;Moghadam et al., 2012), leading to adaptive and non-adaptive processes that produce distinct differences between the X and Y (or Z and W) chromosomes (Muller, 1918;Bergero and Charlesworth, 2009;Bachtrog, 2013;Li et al., 2021). However, in many species, loss of recombination near the SD locus does not spread across the sex chromosomes, which may remain heteromorphic, and display only limited differentiation (Telgmann-Rauber et al., 2007;Spigler et al., 2008;Tennessen et al., 2016;Pucholt et al., 2017) (Figure 1A). Fenneropenaeus chinensis is a commercially cultured shrimp in China, whose adults are sexually dimorphic. Female shrimps tend to be blue, whereas the males are yellow; the adult females are larger and heavier than the adult males (Wang et al., 2019) (Figure 1B). Studies on the sex of F. chinensis are valuable. In Crustacea, both XX/XY and ZZ/ZW system existed (Shi et al., 2018;Fang et al., 2020). The sex-determining region of a predominant aquaculture shrimp species, the Pacific white shrimp (Litopenaeus vannamei), was mapped by the integrating linkage and association analyses (Yu et al., 2017). The results supported female heterogamety. The sex-determining regions of two other Penaeus shrimps, Penaeus monodon and Penaeus japonicus, have been mapped using high-density linkage analysis (Li et al., 2003;Robinson et al., 2014), but not finemapped. However, even some closely related species have different SD loci (Phillip et al., 2001;Miura, 2007;Mank and Avise, 2009;Pucholt et al., 2017). For F. chinensis, several sexrelated markers have been identified by QTL mapping, but no direct evidence of their connection with SD was found (Meng et al., 2021). We therefore studied the SD mechanism of F. chinensis to improve our knowledge of the sex differentiation mechanism in crustaceans and facilitate the study of sexcontrolling technologies, and provide sex-linked markers to identify the sexes at early developmental stages. The genome of F. chinensis released this year  makes it realizable to study the sex chromosome. In this study, we used resequencing data to detect sex-linked variants, including femalespecific sequences, and preliminarily conclude that the species had a female heterogametic (ZW) SD system.

Sample Collection and Sequencing
We randomly selected 10 female and 11 male F. chinensis "Huanghai No. 1" shrimps (age, 4 months) from the conservation base of Haifeng Aquaculture Co., Ltd. (Weifang, Shandong Province, China). The DNA samples of these 21 shrimps were obtained from the muscle and sequenced individually using the BGISEQ platform (BGI, Shenzhen, China), with paired ends of 150 bp. We obtained 219.9 Gb of clean DNA data (female: 104.5 Gb, male: 115.4 Gb). In addition, one female and one male were deeply sequenced, yielding 51.2 and 50.9 Gb clean data, respectively, approximately 35x cover depth.
The genome sequencing data were mapped to the improved reference genome of F. chinensis  using BWA (v0.7.15) (Li and Durbin, 2010) with default parameters. The bam files of 10 females and 11 males were merged into two pools by sex, and further convert to pileup file by using SAMtools (v1.9) (Li et al., 2009).

Analysis of Genetic Differentiation Between the Sexes
To evaluate the genetic differentiation between the sexes across the genome, we used PoPoolation2 software (Kofler et al., 2011) to convert the pileup file into sync file with a minimum base quality of 20. F ST between the sexes and nucleotide diversity values, π, were estimated for all site types, using PoPoolation2 with the following set parameters: window size of 10 kb, step size of 5 kb, minimum allele count of 4, minimum coverage of 10, and maximum coverage of 200. F ST was calculated using the merged data of all females and males, whereas π values were estimated using the two deepsequenced individuals.
Information on mapping depth along the genome was extracted with bedtools (v2.25.0) (Quinlan and Hall, 2010) in 1 kb windows. The depth was normalized by the total read count. The ratio of the depth of male to female was log 2 -transformed.

Female-Specific Sequence Assembly
Besides the 21 "Huanghai No. 1" F. chinensis shrimps, we also sequenced 15 female and six male F. chinensis shrimps captured from the wild, again using the BGISEQ platform, with paired ends of 150 bp. In total, we obtained 436.3 Gb of clean resequencing data (female: 251.9 Gb, male: 184.4).
We expected a female heterogametic, or ZX system. Candidate W-linked sequences can be detected because they are femalespecific. Therefore, we attempted to assemble female-specific sequences according to the method used for the snakehead (Channa argus) (Ou et al., 2017), with some adjustments. Briefly, the pooled sequencing data from the females were mapped to the reference genome, which was constructed from a male. The unmapped reads were assembled using SOAPdenovo (Luo et al., 2012), with Kmer = 31, and then aligned back to the assembled sequence to assess the accuracy of the assembly. Homozygous SNPs could reflect assembly error. The assembled contigs were mapped back to the reference genome using NCBI BLAST (v2.2.29 +) (Johnson et al., 2008) with evalue 1e-1. We deleted contigs for which more than 60% of the fragments that could be mapped to the reference genome. The resequencing data from both the males and the females were mapped to the assembled femalespecific contigs. We retained only contigs with male mapping depth = 0 and female mapping depth > 30, representing W-linked candidates.

Validation With PCR Amplification
The candidate W-linked fragments based on the "Huanghai No. 1" strain and captured wild shrimps used for the female-specific sequence assembly were validated by PCR amplification in samples from four populations, containing two other population, "Huanghai No. 3" and wild shrimp bred for a generation. Primers were designed for the female-specific sequences with the web tool Primer3 1 and synthesized by Sangon Biotech (China). We designed a pair of reference primers that could amplify the sequence in both sexes, but with a different sequence length in the females.
Female and male DNA pools of the four populations were used as PCR templates for the preliminary primer screening. Primers that generated PCR product only in female pools were selected for further validation in the individual samples of the four populations (10 females and 10 males of "Huanghai No. 1, " 10 females and 10 males of "Huanghai No. 3, " 12 females and 12 males of wild shrimp bred for a generation, and 10 females and six males of captured wild shrimp).

RNA-Seq Data Processing
The RNA-seq data have been described in a previous study (NCBI BioProject accession number: PRJNA591354) (Wang et al., 2019). Briefly, the gonad and muscle of female and male F. chinensis 1 https://primer3.ut.ee/ shrimps at 5 months of age were collected, and total RNA was extracted. We used three biological replicates for each tissue and sex. The 12 libraries were sequenced using the Illumina NovaSeq S4 platform, with paired ends of 150 bp. Clean reads from each RNA-seq library were aligned to the reference genome using HISAT (Kim et al., 2015). The gene expression of each sample was analyzed with HTSEQ (Anders et al., 2015).

Resequencing Data Identified a Candidate Sex-Linked Region
The fixation index (F ST ), which measures population differentiation (Weir and Cockerham, 1984;Holsinger and Weir, 2009), was calculated across the genome. Increased F ST was observed on chromosome 7 (Chr7), spanning a region of approximately 3 Mb, from 34 to 37 Mb (Figures 2A,B). In the female, nucleotide diversity (π) increased in the same interval, as expected in a system of female heterogamety (Figure 2C), whereas in males, it was similar to the values in the rest of the genome. There are 60 genes located on this 3 Mb interval according to the genomic annotation file of F. chinensis.
The mapping depth was estimated across Chr7. More reads aligned to the 34-37 Mb interval in the males than in the females (Supplementary Figure 1), and the ratio of male to female coverage in some zones was close to 2:1 (Figure 3).

Assembly and Validation of Female-Specific Sequences
Among the 1.68 billion clean reads in the female pool, 92.01% mapped to the reference genome (Supplementary Table 1), and 134.12 million unmapped reads were also extracted and used in the assembly. In total, we obtained 103.25 Mb of fragments, consisting 435,866 contigs, with 70 contigs longer than 2,000 bp (Supplementary Table 2). Alignment of the unmapped reads showed a high coverage and low error rate (Supplementary Table 3). After further screening, we obtained 363 candidate female-specific contigs.
We selected 16 of the longest candidate female-specific contigs for validation. Sequences from three contigs amplified only with female DNA pools ( Figure 4A). Further validation was performed using individual DNA templates (Figures 4B-D).

RNA-Seq Data Detected Male-Biased Expression Genes in the Interval on Chr7
To obtain information on the expression of genes on Chr7, we re-analyzed the RNA-seq data reported in a previous study (Wang et al., 2019) (Supplementary Figure 2). In muscle and gonad tissue of 5-month-old animals, we detected abundant differentially expressed genes (DEGs). In the muscle, 743 genes had male-biased expression and 2,291 genes had femalebiased expression (Supplementary Figure 3). In the gonad, there were 1,713 male-biased and 1,352 female-biased genes (Supplementary Figure 4). DEGs in the gonad showed a higher proportion (71.21%) of male-biased expression on Chr7 than other chromosomes (Supplementary Table 4).
In the 34-37 Mb interval of Chr7, we found five genes with male-biased and 11 with female-biased expression in the muscle (Figure 5A and Supplementary Table 5). In the gonad, the expression of all 13 DEGs in this region was male-biased; even expression of the adjacent genes was male-biased ( Figure 5B and Supplementary Table 6).

DISCUSSION
The F ST analysis located the sex-differentiation region on Chr7, which was regarded as the candidate sex chromosome in this study. In the differentiation region, the higher read mapping depth in males than females suggests that the sex-linked region is hemizygous in females, and the increased π in the females indicating that the SD system of F. chinensis is female heterogametic (female: ZW, male: ZZ). However, this species does not have differentiated Z and W chromosomes. Most fragments of the candidate "Z chromosome" were indistinguishable from autosomal sequences, according to the F ST values and gene expression analyses. The sex chromosome formation of this species may only stay a primary stage.
During sex chromosome divergence, the suppression of recombination leads to the accumulation of mutations, which can result in highly heteromorphic sex chromosomes (Wright et al., 2016). At the primary stage of the divergence, the loss of recombination near the SD locus does not spread across the sex chromosomes. The sex chromosomes display only limited levels of differentiation. There are 60 genes annotated in the interval of 34-37 Mb of Chr7; compared to the 790 genes on Chr7 (46 Mb), and 25,026 genes on the whole genome (1.45 Gb), the gene density is slightly higher, which might be caused by the intergenic region loss in the initial process of recombination suppression.
However, if recombination becomes suppressed in a large genome region carrying a sex-determining locus, transcriptional degeneration of the W (or Y) allele can occur quickly (Bachtrog et al., 2008;Papadopulos et al., 2015). The enrichment of sex-biased expression genes in such an SD region may occur before loss of genes from the sex-limited chromosome (Bergero and Charlesworth, 2011;Chibalina and Filatov, 2011;Muyle et al., 2012;Pucholt et al., 2017). All the DEGs in and near the candidate Z-linked region exhibited male-biased expression, suggesting that many mutations causing male-biased expression have accumulated. These DEGs may be related to sex development and responsible for sexual dimorphism or reproduction. However, further studies need to be performed to analyze the functions of the DEGs.
The female-specific sequences detected in our analysis may have been derived from a W chromosome-like region. However, the Z and W regions appear to carry highly similar sequences, and some of our primers were designed for both of them (Supplementary Figure 5).

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: NCBI (accession: PRJNA591354).

AUTHOR CONTRIBUTIONS
QW collected the samples and carried out genetic differentiation analysis. XR extracted DNA and RNA for sequencing. JJL carried out female-specific fragment assembly. XR processed the RNA-seq data. JW and SJ performed verification experiment. JL conceived this project. YH supervised the work. QW and JJL wrote the manuscript. All authors contributed to the final manuscript.

ACKNOWLEDGMENTS
We would like to appreciate our colleagues of the Key Laboratory for Sustainable Utilization of Marine Fisheries Resources of Yellow Sea Fisheries Research Institute, for their assistance on sample collection and helpful comments on the manuscript.