Identifying Novel Copy Number Variants in Azoospermia Factor Regions and Evaluating Their Effects on Spermatogenic Impairment

Microdeletions in the Y-chromosomal azoospermia factor (AZF) regions are regarded as a risk factor for spermatogenic failure (SF). However, AZF-linked duplications and complex copy number variants (CNVs) (deletion + duplication) have rarely been studied. In this study, we performed multiplex ligation-dependent probe amplification (MLPA) analysis on 402 fertile healthy male controls and 423 idiopathic infertile SF patients (197 with azoospermia and 226 with oligozoospermia) in a Han Chinese population. In total, twenty-four types of AZF-linked CNVs were identified, including eleven novel CNVs (one deletion, seven duplications, and three complex CNVs). Our study revealed that AZFc-linked duplications and the instability of the Y chromosome might be associated with spermatogenic impairment. In addition, the complex CNV (b2/b3 deletion + DAZ1/2 duplication) was confirmed to increase the genetic risk for SF in the Han Chinese population. This study illustrates a spectrum of AZF-linked CNVs and presents valuable information for understanding the clinical significance of AZF-linked CNVs in male infertility.


INTRODUCTION
Infertility affects an estimated 15% of couples of childbearing age globally, and male infertility is directly or indirectly responsible for about 50% of cases involving reproductive-age couples with fertility-related issues (Agarwal et al., 2015). The majority of infertile males are diagnosed with spermatogenic failure (SF) (Lo Giacco et al., 2014).
Owing to frequent non-allelic homologous recombination (NAHR) in the AZF regions, the Y chromosome carries the majority of CNVs among all human chromosomes (Freeman et al., 2006). Identifying AZF-linked CNVs has been shown to be valuable for finding the genetic etiology of SF, as well as for evaluating the rate of sperm recovery after testicular sperm extraction (TESE) (Krausz et al., 2014). Traditional methods to identify AZF-linked CNVs were based on polymerase chain reaction (PCR) of sequence-tagged site (STS) markers (STS-PCR). However, STS-PCR can only detect AZF-linked deletions. To date, some AZF-linked duplications or complex CNVs have been reported (Giachini et al., 2008; Lu et al., 2011; Saito et al., 2015), but the precise frequency and clinical significance of these CNVs in patients with SF have remained unclear.
With the development of molecular techniques, such as multiplex ligation-dependent probe amplification (MLPA) and array comparative genomic hybridization (aCGH), researchers were able to simultaneously identify multiple CNVs. Compared with aCGH, the MLPA method was relatively simple and inexpensive, and it has been suggested as a valuable technique for diagnosing genetic diseases (Quarello et al., 2012;Massalska et al., 2013;Mutlu et al., 2018).
To identify novel AZF-linked CNVs (especially duplications and complex CNVs) and assess their significance in spermatogenesis, we performed MLPA analysis on 402 fertile healthy male controls and 423 idiopathic infertile SF patients (197 with azoospermia and 226 with oligozoospermia) in a Han Chinese population.

Study Subjects
Subjects were recruited from the Nanjing Maternity and Child Health Care Hospital (Nanjing, Jiangsu province, China) between July 2016 and September 2017. This study was approved by the Ethics Committee of Nanjing Maternity and Child Health Care Hospital. All procedures involved in this study were conducted in accordance with the Declaration of Helsinki.
Semen analyses were performed based on World Health Organization criteria (2010), with reference values of ≥ 1.5 mL for semen volume, ≥ 7.2 for pH, 39 × 10^6 per ejaculate for total sperm count, 15 × 10^6/mL for sperm concentration, 40% for total sperm motility, and 4% for sperm with normal morphology (Cooper et al., 2010). Each subject was examined twice to ensure the reliability of the results. In addition to semen analyses, comprehensive andrological examinations including a series of physical examinations, scrotal ultrasound, hormone analysis, and karyotype analysis were also performed. Those with varicocele, cryptorchidism, orchitis, or an abnormal karyotype (such as 47, XXY) were excluded from this study.
The male infertile cases, who sought treatment in the infertility clinic, were recruited into this study using a retrospective design. These cases had no children and were diagnosed with idiopathic non-obstructive azoospermia (NOA) (no sperm in the ejaculate) or oligozoospermia (total sperm count < 39 × 10^6 per ejaculate). To distinguish NOA from obstructive azoospermia (OA), only those azoospermic cases with soft and small testes (total testicular volume < 30 mL) and elevated plasma follicle-stimulating hormone were included. The controls, recruited from the same hospital during the same period, were fertile males who had a normal total sperm count per ejaculate, sperm concentration, motility, and morphology, and who had fathered at least one child without assisted reproductive technology. Twenty-nine infertile subjects (out of 452) were excluded, including 4 with OA, 5 with cryptorchidism, 8 with varicocele, 2 with orchitis, and 10 with an abnormal karyotype (7 of them with 47, XXY). Finally, 402 fertile male controls and 423 infertile patients were included in this study.
All the participants were genetically unrelated ethnic Han Chinese. Potential confounding factors [e.g., age, body mass index (BMI), and smoking] were compared between patients and controls to rule out their influence. All participants were informed about the purpose of the study and signed written informed consent, including consent for publication of this original research.

MLPA Analysis
MLPA analysis was performed using the SALSA MLPA probe-mix kit P360-B1 (MRC-Holland, Amsterdam, the Netherlands), which contains 42 specific probes for the AZF regions (Supplementary Figure S1). Briefly, 7 µL of sample DNA (20 ng/µL) was denatured for 5 min at 98 °C and subsequently cooled to 25 °C. Next, 5 µL of denatured DNA was mixed with 3 µL of hybridization master mix and heated for 1 min at 95 °C, then for 18 h at 60 °C. After the hybridization reaction, 32 µL of ligase-65 master mix was added and incubated for 15 min at 54 °C and 5 min at 98 °C, then held at 20 °C. The ligation products were then mixed with polymerase master mix for the PCR reaction. The PCR products were analyzed on an ABI 3500 using 50 cm capillaries and POP-7 polymer (Applied Biosystems, Foster City, CA, USA), with an injection mixture containing 0.9 µL PCR product, 0.1 µL LIZ 500 size standard, and 9 µL HiDi formamide. The ABI 3500 run conditions were as follows: injection voltage = 1.6 kV, injection time = 8 s, oven temperature = 60 °C, run voltage = 19.5 kV, and run time = 1330 s. The relative peak area of each probe was calculated by dividing the actual peak area of the subject by the average of that of five reference samples using Coffalyser.Net software (MRC-Holland, Amsterdam, the Netherlands). A 30% increase or decrease in the relative peak area of a probe was taken to indicate duplication or deletion of the targeted region, respectively.
Supplementary Table S2. The ACTB gene was employed as a reference. The qPCR was performed in a total volume of 20 µL consisting of 10 µL AceQ Universal SYBR qPCR Master Mix (Vazyme, China), 0.4 µL of each 10 µM primer (forward and reverse), 2 µL genomic DNA (10 ng/µL), and 7.2 µL nuclease-free water.
The qPCR reactions were run on an ABI StepOnePlus™ (Life Technologies, USA) with the following program: 95 °C for 5 min for initial denaturation, followed by 40 cycles (95 °C for 10 s and 60 °C for 30 s) for amplification and fluorescence detection, and one cycle (95 °C for 15 s, 60 °C for 60 s, and 95 °C for 15 s) for melt curve analysis. Data analysis was performed using StepOne™ Software based on the CT method. All samples were run in triplicate for qPCR analysis.
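The MLPA scoring rule described earlier in this section (relative peak area against the mean of five reference samples, with a 30% change as the cut-off) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and peak-area values are hypothetical, the within-sample normalization that Coffalyser.Net performs before comparison is omitted, and whether the cut-off is inclusive is an assumption.

```python
from statistics import mean

# Cut-off stated in the text: a 30% change in relative peak area.
THRESHOLD = 0.30


def relative_peak_area(sample_area, reference_areas):
    """Divide the subject's peak area by the mean area of the reference samples."""
    return sample_area / mean(reference_areas)


def classify_probe(ratio, threshold=THRESHOLD):
    """Label one probe from its relative peak area (inclusive cut-off is an assumption)."""
    if ratio >= 1 + threshold:
        return "duplication"
    if ratio <= 1 - threshold:
        return "deletion"
    return "normal"


if __name__ == "__main__":
    # Hypothetical peak areas for one probe in five reference samples.
    references = [1000.0, 1100.0, 900.0, 1000.0, 1000.0]
    for area in (0.0, 1020.0, 1500.0):
        ratio = relative_peak_area(area, references)
        print(f"area={area:.0f} ratio={ratio:.2f} call={classify_probe(ratio)}")
```

A complete CNV call would combine the labels of neighboring probes across a region (for example, all probes within the gr/gr interval), which this sketch does not attempt.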

Statistical Analysis
In this study, χ² and Fisher's exact tests were applied to compare the frequencies of CNVs between the patients and controls using SPSS 20.0 software (IBM, Armonk, NY, USA). A P-value < 0.05 was considered statistically significant. We did not correct for multiple testing, considering that the sample sizes for the novel AZF-linked CNVs were relatively small and that adjustment for multiple comparisons might miss potentially important findings.
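For readers without SPSS, the Fisher's exact test mentioned above can be reproduced with the Python standard library. The sketch below is not the authors' procedure; it is a minimal two-sided implementation that sums the hypergeometric probabilities of all tables no more likely than the observed one, and the example table is the classic textbook "tea tasting" layout rather than data from this study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is at most as probable as the observed one;
    # the small relative tolerance guards against floating-point ties.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


if __name__ == "__main__":
    # Textbook example table, not study data: P = 34/70.
    print(round(fisher_exact_two_sided(3, 1, 1, 3), 4))  # 0.4857
```

For a CNV comparison, the four cells would be carriers and non-carriers among patients and among controls.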

Characteristics of the Study Population
In this study, we recruited 402 fertile healthy male controls and 423 infertile SF patients (197 with azoospermia and 226 with oligozoospermia). There was no significant difference between the control group and the SF group in selected characteristics including age, BMI, and smoking (Table 1).

Distribution of Known and Novel Deletions in AZF Regions and Their Effects on SF
Overall, twenty-four types of AZF-linked CNVs (eight deletions, eleven duplications, and five complex CNVs) were identified in our study population. Among the eight deletions, four classical deletions [AZFa deletion, AZFb deletion, AZFc complete deletion (b2/b4 deletion), and AZFb + c deletion] recommended by EAA/EMQN practice guidelines (Krausz et al., 2014) were identified by MLPA analysis and verified by STS-PCR (Figures 1, 4). These four types of deletions were identified in SF patients but not observed in controls (Table 2). Consistent with previous reports, the classical deletions were genetic causes of SF (Krausz et al., 2014).
The frequencies of three AZFc partial deletions (gr/gr, b2/b3, and b1/b3), identified by MLPA and validated by STS-PCR, did not differ significantly between the patients and the controls (Table 2), suggesting that these partial deletions might not be genetic risk factors for SF in our study population.
In addition, we found a novel de novo AZFb partial deletion (RBMY1J deletion) in a patient with oligozoospermia whose father has a normal RBMY1J copy number in the same region (Table 2). This de novo deletion might be associated with oligozoospermia, and further study is required to confirm whether the RBMY1J deletion can cause the disease.

Distribution of Known and Novel Duplications in AZF Regions and Their Effects on SF
The traditional STS-PCR method can only detect AZF-linked deletions, whereas the MLPA method can detect both deletions and duplications. In this study, we identified eleven types of AZF-linked duplications in 3.98% of controls (16/402) and 7.09% of SF patients (30/423). Most of the duplication types (72.7%, 8/11) were located in the AZFc region, likely owing to the frequent non-allelic homologous recombination (NAHR) in this region (Skaletsky et al., 2003). Seven novel duplications, four in the AZFc region, one in the AZFb region, and two in the AZFa region, were identified in our study (Figure 2, Table 3).
Two types of BPY2 duplication (Types I and II) and "AZFa partial dup Type I" were found in both patients and controls, suggesting that they might be benign CNVs. The CNVs named "BPY2 dup Type I + gr/gr dup," "b2/b4 dup + gr/gr dup," "RBMY1J dup," and "AZFa partial dup Type II" were all de novo CNVs and were identified only in the patient group (Table 3). Additional cases are needed to investigate the clinical relevance of these findings.
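The carrier frequencies quoted in this section follow directly from the reported counts (16/402 controls and 30/423 patients with any AZF-linked duplication). The Python sketch below recomputes them and also derives a crude odds ratio with a Wald 95% confidence interval. The odds ratio and interval are plain arithmetic on those two counts, shown here as an illustration of the calculation; they are not values reported by the authors, and the function names are hypothetical.

```python
from math import exp, log, sqrt


def frequency_percent(carriers, total):
    """Carrier frequency as a percentage."""
    return 100.0 * carriers / total


def odds_ratio_wald(case_pos, case_total, ctrl_pos, ctrl_total, z=1.96):
    """Crude odds ratio with a Wald confidence interval (z = 1.96 for 95%)."""
    a, b = case_pos, case_total - case_pos   # patients: carriers, non-carriers
    c, d = ctrl_pos, ctrl_total - ctrl_pos   # controls: carriers, non-carriers
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    print(round(frequency_percent(16, 402), 2))  # 3.98
    print(round(frequency_percent(30, 423), 2))  # 7.09
    or_, lo, hi = odds_ratio_wald(30, 423, 16, 402)
    print(f"OR={or_:.2f}, 95% CI {lo:.2f}-{hi:.2f}")
```

The Wald interval requires all four cells to be non-zero, so it cannot be applied to CNVs seen only in patients without a continuity correction.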

Distribution of Known and Novel Complex CNVs (Deletion + Duplication) in AZF Regions and Their Effects on SF
Five types of complex CNVs involving both deletions and duplications were identified in our study (Figure 3, Table 4). Among these complex CNVs, the "b2/b3 deletion + DAZ 1/2 duplication" was confirmed by analyzing the restriction enzyme digestion site of a single-nucleotide variation (SNV) at sY587 to discriminate DAZ 1/2 from DAZ 3/4 (Machev et al., 2004; Supplementary Table S1). Consistent with the result reported by Lu et al. (2011), the frequency of this CNV was significantly higher in patients than in controls (OR = 5.34, P = 3.00 × 10^−2) (Table 4). The CNV named "two gr/gr deletions + b2/b4 duplications" was previously identified in one SF patient by Saito et al. (2015), and this CNV was considered to be benign in our patients (Table 4).
In addition, three novel complex CNVs were identified in our study, including "b2/b3 deletion + gr/gr duplication," "b2/b3 deletion + gr/gr triplication," and "gr/gr deletion + b2/b3 triplication." The frequency of each of these complex CNVs did not differ significantly between the patient group and the control group (Table 4).

The qPCR Validation of CNVs Not Confirmed by STS-PCR
To validate the CNVs not verified by STS-PCR, especially the AZF-linked duplications and complex CNVs, we performed real-time quantitative PCR (qPCR).
For each CNV to be tested, two reference samples (with normal MLPA results) and one randomly selected CNV-positive sample were examined. The qPCR results were all consistent with the MLPA results (Figure 5 and Supplementary Figure S2). These results indicate the accuracy of the MLPA method for detecting AZF-linked CNVs (both deletions and duplications).
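The qPCR comparison above (one CNV-positive sample against two reference samples, with ACTB as the reference gene and triplicate wells) can be illustrated with a short Python sketch. The text only says "the CT method", so the comparative 2^−ΔΔCT calculation used here is an assumption, as is 100% amplification efficiency. All CT values and function names are hypothetical.

```python
from statistics import mean


def delta_ct(target_cts, reference_gene_cts):
    """Mean CT of the target amplicon minus mean CT of the reference gene (e.g. ACTB)."""
    return mean(target_cts) - mean(reference_gene_cts)


def relative_copy_number(sample_dct, calibrator_dcts):
    """2^-ddCT of a test sample against the mean dCT of the calibrator samples."""
    ddct = sample_dct - mean(calibrator_dcts)
    return 2 ** (-ddct)


if __name__ == "__main__":
    # Hypothetical triplicate CT values, for illustration only.
    calibrators = [
        delta_ct([25.0, 25.0, 25.0], [20.0, 20.0, 20.0]),  # reference sample 1
        delta_ct([25.1, 24.9, 25.0], [20.0, 20.0, 20.0]),  # reference sample 2
    ]
    test_sample = delta_ct([24.0, 24.0, 24.0], [20.0, 20.0, 20.0])
    # A value near 2 would be compatible with a duplication of the target.
    print(round(relative_copy_number(test_sample, calibrators), 2))
```

A ratio near 1 would agree with a normal MLPA result, while a ratio near 0 (or no amplification) would point to a deletion of a single-copy target.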

DISCUSSION
Y chromosome microdeletion is the second most frequent genetic etiology of male infertility after Klinefelter syndrome (Krausz et al., 2014). AZFa, AZFb, and AZFc (b2/b4) deletions have been identified as pathogenic CNVs in spermatogenesis (Krausz et al., 2014). However, AZF-linked duplications and complex CNVs have rarely been studied because of technical limitations. In this study, we determined the frequency and clinical significance of AZF-linked CNVs in a Han Chinese population more precisely by using the MLPA method.
For AZFc partial deletions (gr/gr del, b2/b3 del, and b1/b3 del), we did not observe significantly different distributions between SF cases and control subjects. The clinical significance of these partial deletions was reported to be variable in previous studies (Lu et al., 2009;Bansal et al., 2016;Krausz and Casamonti, 2017). Hence, further study should be performed in diverse racial or ethnic groups to confirm the respective results.
Interestingly, two de novo RBMY1J CNVs, a deletion and a duplication, were identified in our oligozoospermic patients. RBMY1J, a member of RBMY1 (Y-linked RNA binding motif protein family 1), is expressed in all stages of spermatogenesis and acts as a splicing factor during spermatogenesis (Elliott et al., 1997; Dreumont et al., 2010). Yan et al. revealed that RBMY1 is associated with sperm motility (Yan et al., 2017). Given that sperm motility was normal in both patients, our findings suggest that RBMY1J CNVs may instead be associated with decreased sperm numbers. Further studies are required to determine the function of RBMY1 in spermatogenesis.
In addition, the Y chromosome is thought to be "fragile" owing to its complex, NAHR-prone structure. In the present study, "any CNVs in AZF region" showed a significantly higher frequency in SF patients than in controls, and we speculate that the instability of the Y chromosome might be associated with spermatogenic impairment. Additionally, the frequency of "any duplication in AZFc region" was also significantly higher in SF patients, suggesting that AZFc-linked duplications might play vital roles in spermatogenesis. Similar results were reported in previous studies (Lin et al., 2007; Ye et al., 2013; Saito et al., 2015).
Furthermore, the three novel complex CNVs identified in the present study expand our knowledge of AZF-linked CNVs. Notably, all five complex CNVs were detected as simple deletions (gr/gr or b2/b3 deletion) by the traditional STS-PCR method. This technical limitation of STS-PCR is a potential confounding factor for risk assessment in studies of gr/gr or b2/b3 deletions, and it may explain why previous research has reported conflicting findings.
In 2012, Bunyan et al. performed MLPA in 100 subjects (50 SF patients and 50 controls) and identified four types of simple deletions that cannot be detected by STS-PCR (Bunyan et al., 2012). In 2014, Liu et al. performed MLPA in 199 fathers and their 228 sons, and found that assisted reproductive technology did not increase the risk of Y-chromosome microdeletions in male offspring (Liu et al., 2014). Both studies used MLPA analysis to detect Y-chromosome microdeletions. In 2015, Saito et al. performed MLPA in 56 SF patients and 65 control individuals, all of whom were Japanese, and identified both AZF-linked deletions and duplications (Saito et al., 2015). Compared with previous studies, we applied MLPA analysis to more subjects and identified more types of AZF-linked CNVs. Moreover, for the novel CNVs detected only in SF patients, we further tested whether they were inherited from the patients' fathers and found that all were de novo CNVs. In addition, we classified the patients with SF into an azoospermia group and an oligozoospermia group, which helps to clarify the clinical significance of AZF-linked CNVs in different SF types.
However, there are some limitations in the current study. Although a considerable number of subjects were tested in this retrospective case-control study, the sample size is not sufficient to draw conclusions on the role of the rare AZF-linked CNVs. In addition, the P-values presented were not corrected for multiple testing, which may lead to rejecting the null hypothesis too readily and thus increase the risk of false positives. In the future, we plan to establish the roles of rare AZF-linked CNVs in a larger prospective study.
In summary, this study comprehensively identified AZF-linked CNVs, especially duplications, in the Han Chinese population. It provides valuable findings for understanding the roles of AZF-linked CNVs in spermatogenesis and will be helpful in the diagnosis and treatment of SF in clinical practice.
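To make the multiple-testing limitation concrete, the sketch below shows how Bonferroni and Holm adjustments would be applied to a set of P-values. It is an illustration only: the study did not apply either correction, the choice of these two methods is an assumption, and the P-values in the example are hypothetical placeholders rather than results from the tables.

```python
def bonferroni(p_values):
    """Multiply each P-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def holm(p_values):
    """Holm step-down adjustment, returned in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, index in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[index])
        # Enforce monotonicity: an adjusted value never drops below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[index] = running_max
    return adjusted


if __name__ == "__main__":
    hypothetical_p = [0.01, 0.04, 0.03]  # placeholders, not study results
    print(bonferroni(hypothetical_p))
    print(holm(hypothetical_p))
```

Holm's procedure is never less powerful than Bonferroni while still controlling the family-wise error rate, which is why it is often preferred when many rare variants are tested.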