Male Infertility Diagnosis: Improvement of Genetic Analysis Performance by the Introduction of Pre-Diagnostic Genes in a Next-Generation Sequencing Custom-Made Panel

Background Infertility affects about 7% of the general male population. The underlying cause of male infertility is undefined in about 50% of cases (idiopathic infertility). The number of genes involved in human spermatogenesis is over two thousand. Therefore, it is essential to analyze a large number of genes that may be involved in male infertility. This study aimed to test patients with idiopathic male infertility, who were negative for a validated panel of “diagnostic” genes, using a wide panel of genes that we have defined as “pre-diagnostic.” Methods We developed a next-generation sequencing (NGS) gene panel including 65 pre-diagnostic genes, which was used in 12 patients who were negative to a diagnostic genetic test for male infertility disorders, including primary spermatogenic failure and central hypogonadism, consisting of 110 genes. Results After NGS sequencing, variants in pre-diagnostic genes were identified in 10/12 patients who were negative to a diagnostic test for primary spermatogenic failure (n = 9) or central hypogonadism (n = 1) due to mutations of single genes. Two pathogenic variants in the DNAH5 and CFTR genes and three variants of uncertain significance in the DNAI1, DNAH11, and CCDC40 genes were found. Moreover, three variants with high impact were found in the AMELY, CATSPER2, and ADCY10 genes. Conclusion This study suggests that searching for pre-diagnostic genes may be of relevance to find the cause of infertility in patients with apparently idiopathic primary spermatogenic failure due to mutations of single genes and central hypogonadism.


INTRODUCTION
Increasing knowledge of male reproductive physiology and of fertilization, together with the advent of increasingly effective assisted reproductive techniques, has led to a profound change in the management of male infertility. Currently, the diagnostic workflow offered to male infertile patients includes medical history collection and physical examination, followed by a combination of laboratory tests tailored to each case, including an in-depth genetic laboratory analysis (1)(2)(3). Diagnostic tests should be performed after at least 1 year of infertility. Accordingly, a couple can be defined as infertile if they do not achieve pregnancy after a year of regular unprotected sexual intercourse (4).
Genetic factors are found in about 15% of male infertile patients. They include chromosomal abnormalities and single-gene mutations (5,6). Over 200 genetic disorders related to male infertility are reported in the Online Mendelian Inheritance in Man (OMIM) database (7,8). The genetics of male infertility is highly complex because semen and testis histological phenotypes are very heterogeneous and up to 2,300 genes are involved in spermatogenesis (1,9). Moreover, studies in male infertility are challenging because genetic infertility eliminates the causative mutations from the gene pool, since they are not transmitted. Furthermore, genetic and epigenetic changes accumulate in spermatozoa with aging, and rare single nucleotide polymorphisms and copy number variants can contribute to idiopathic male infertility (1). It is important to trace both the non-genetic and the genetic causes of male infertility, since male infertility accounts for half of the cases of non-conception (4). Notably, the identification of new genetic biomarkers of infertility deserves investigation, because the standard clinical evaluation of infertile patients and karyotype analysis can identify the cause of infertility only in about 50% of the cases (10). The combination of genetic and epigenetic testing seems to identify genetic variations and differential expression of specific genes, providing information on the true ability of a man to reproduce. In contrast, a semen analysis may fail to reveal even a partial impairment of sperm parameters (9).
There are two general approaches for finding genes involved in infertility: the candidate gene approach in animal models, and whole-genome studies based on single-nucleotide polymorphism microarrays or next-generation sequencing (NGS) technologies, such as exome or whole-genome sequencing (11,12). Despite a thorough diagnostic workup, conventional genetic tests largely fail to reach a diagnosis (13) and the cause of male infertility remains elusive in up to ∼70% of cases (14). Recent research has addressed the role of NGS technology in raising the rate of diagnosis in male infertility (15,16). Accordingly, several diagnostic genes have already been shown to be involved in the pathogenesis of male infertility (15). Pre-diagnostic genes, including those reported in association with male infertility but with no definitive evidence of a causative role, may help to reach a diagnosis. To this end, the present study was undertaken to evaluate a series of pre-diagnostic genes by comparing the results with those obtained with our usual NGS custom-made gene panel for the diagnosis of male infertility, which includes 110 genes.

Patients and Samples
Twelve patients with a clinical diagnosis of male infertility and negative to diagnostic genetic testing were selected for this study. Eleven were suspected to have primary spermatogenic failure and one was suspected to have central hypogonadism. In more detail, primary spermatogenic failure was suspected on the basis of a history of couple infertility longer than 2 years, after the exclusion of female factor infertility and of acquired causes of male infertility (e.g., male accessory gland infection, varicocele, testicular trauma, etc.). In addition, patients enrolled in this study were negative on first-step genetic analysis, namely for karyotype abnormalities, Y chromosome AZF microdeletions, and conventional CFTR gene mutations.
Informed written consent was obtained from each patient. The study was carried out following the tenets of the Declaration of Helsinki and was approved by the local Ethics Committee. An EDTA blood sample was collected from each subject. Genomic DNA was extracted from the peripheral blood of all subjects using a commercial kit (SAMAG 120 BLOOD DNA Extraction Kit). DNA was quantified using the Quant-iT Picogreen dsDNA Assay Kit (Life Sciences) and a Varioskan LUX (Thermo Scientific).

Gene Panel Design
A single NGS panel related to male infertility disorders, comprising a total of 175 genes, was designed. Of these, 110 genes were analyzed in a diagnostic setting, and 65 pre-diagnostic or informative genes were analyzed in patients who tested negative on the diagnostic panel. Genes were selected for the panel based on their correlation with male infertility as described in Online Mendelian Inheritance in Man (OMIM) (7), GeneReviews (17), and the primary literature. Genes were classified as "diagnostic" when they and their genetic variants were clearly correlated with male infertility in the literature. Genes were instead classified as "informative or pre-diagnostic" when they were reported to be associated with male infertility but a causal link had not been unequivocally established. The list of genes associated with male infertility that are related to the diagnostic suspicion for the subjects considered, and that are included in the two NGS panels, is shown in Table 1.
The custom Illumina Nextera panel included genomic targets comprising the coding exons and 15 bp flanking regions of each gene. The target length of the diagnostic panel was 314,814 bp, whereas that of the pre-diagnostic panel was 188,074 bp. Figure 1 describes the laboratory and analysis workflow.
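The two-tier design described in this section can be summarized in a short sketch. The gene counts and target lengths are those reported in the text; the `Tier` class, the `tiers_to_analyze` function, and its boolean argument are hypothetical names introduced only for illustration and are not part of the authors' pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    n_genes: int
    target_bp: int  # coding exons plus 15 bp flanking regions


# Values reported in the text; the variable names are illustrative.
DIAGNOSTIC = Tier("diagnostic", 110, 314_814)
PRE_DIAGNOSTIC = Tier("pre-diagnostic", 65, 188_074)


def tiers_to_analyze(diagnostic_negative: bool) -> list:
    """Return the tiers to analyze for one patient.

    Assumption: the pre-diagnostic tier is examined only when the
    diagnostic tier yields no causative variant, as in the study design.
    """
    tiers = [DIAGNOSTIC]
    if diagnostic_negative:
        tiers.append(PRE_DIAGNOSTIC)
    return tiers


total_genes = DIAGNOSTIC.n_genes + PRE_DIAGNOSTIC.n_genes
total_target_bp = DIAGNOSTIC.target_bp + PRE_DIAGNOSTIC.target_bp
print(total_genes, total_target_bp)
```

The gene counts add up to the 175 genes of the single panel design, and the combined target length is simply the sum of the two reported lengths.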

Genetic Analysis and Variant Detection
DNA samples were sequenced on a MiSeq personal sequencer (Illumina, San Diego, CA, USA) using a paired-end protocol with 150 bp reads, following the laboratory methods described elsewhere (18,19). Fastq (forward-reverse) files were obtained after sequencing. Read alignment was performed with the BWA (0.7.17-r1188) software. Duplicates were removed using the SAMBAMBA (0.6.7) program, and GATK (4.0.0.0) was used for re-alignment. We used international databases dbSNP (www.  (20). Coding genomic regions (CDS) that were sequenced with coverage less than 15X were subsequently re-sequenced using Sanger technology.
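The Sanger rescue rule at the end of this paragraph can be illustrated with a minimal sketch. Only the 15X cut-off comes from the text. The `CdsRegion` structure, the gene labels, and the depth values are made up, and applying the cut-off to the minimum per-base depth is an assumption, since the text does not say whether minimum or mean depth was used.

```python
from typing import NamedTuple

SANGER_THRESHOLD = 15  # coverage cut-off (X) stated in the text


class CdsRegion(NamedTuple):
    gene: str
    exon: int
    depths: list  # per-base read depth (hypothetical input format)


def needs_sanger(region: CdsRegion, threshold: int = SANGER_THRESHOLD) -> bool:
    """True if any base of the region is covered by fewer than `threshold` reads.

    Assumption: the cut-off is applied to the minimum per-base depth.
    """
    return min(region.depths) < threshold


# Made-up depths for illustration only; not data from the study.
regions = [
    CdsRegion("GENE_A", 1, [120, 98, 77, 64]),
    CdsRegion("GENE_B", 3, [40, 22, 14, 31]),
]
to_resequence = [(r.gene, r.exon) for r in regions if needs_sanger(r)]
print(to_resequence)
```

In practice the per-base depths would come from the aligned reads; here they are hard-coded so that the rule itself stays visible.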

RESULTS
Twelve infertile patients were analyzed with two NGS custom-made panels. They had a median age of 38 years (range 24-55). Clinical details, including testicular histology and responsiveness to FSH therapy (when available), are reported in Table 2. Unexpectedly, after genetic testing and a history of couple infertility longer than 2 years, patients 5 (despite mild oligozoospermia) and 8 (despite oligozoospermia and testicular hypotrophy) spontaneously impregnated their wives, fathering healthy children. Our gene panel design generated a mean sequencing depth of 359X, and 98% of the target regions had a sequencing depth of at least 25X. Variants in the pre-diagnostic genes were identified in 10/12 subjects (83%) who were negative to diagnostic testing with suspected defects of primary spermatogenesis. Seventeen filtered variants were detected in 12 of the 65 genes analyzed (18%): DNAH11, DNAH10, DNAH5, DNAI1, CCDC40, CFTR, GALNTL5, AMELY, KLK4, KLK14, CATSPER2, and ADCY10. In particular, two heterozygous variants already reported as pathogenic were detected (p.Lys1853*, rs748618094, in DNAH5 and p.Asp1152His, rs75541969, in CFTR). Three variants of uncertain significance were also found: p.Arg654Cys, rs140820295, in DNAI1 (heterozygous); p.Pro3935Leu, rs72658814, in DNAH11 (homozygous); and p.Asp284His, rs201042940, in CCDC40 (heterozygous). All of them were predicted to be disease-causing by MutationTaster, Damaging by SIFT, and Probably Damaging by PolyPhen-2.
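The proportions quoted in this paragraph can be checked with a few lines of arithmetic. The counts (10/12 patients, 12/65 genes) are taken from the text. The `breadth_at` helper and its example depths are hypothetical and only show how a statistic such as "98% of target regions at 25X or more" is typically computed.

```python
def percent(count: int, total: int) -> int:
    """Percentage rounded to the nearest integer."""
    return round(100 * count / total)


def breadth_at(depths: list, min_depth: int) -> float:
    """Fraction of target bases covered at or above `min_depth`."""
    return sum(d >= min_depth for d in depths) / len(depths)


patients_with_variants = percent(10, 12)  # counts taken from the text
genes_with_variants = percent(12, 65)     # counts taken from the text
print(patients_with_variants, genes_with_variants)

# Made-up per-base depths, only to show how a ">= 25X" breadth is computed.
example_depths = [30, 40, 10, 25]
print(breadth_at(example_depths, 25))
```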
Moreover, three variants with high impact were identified. The hemizygous splice variant c.574-1G>A (rs760519968) in AMELY affects the acceptor splice site of the last exon and may cause the activation of a cryptic splice site and consequently a stop-loss mutation. This variant is predicted to be disease-causing by MutationTaster. The heterozygous variant c.842+1G>C (rs199516208) in CATSPER2 affects a donor splice site; this may cause the activation of a cryptic splice site and the introduction of a premature stop codon, and the variant is considered disease-causing by MutationTaster. The third is the heterozygous truncating variant c.90T>A; p.Cys30* in ADCY10, which is considered pathogenic for the autosomal dominant inherited condition of susceptibility to absorptive hypercalciuria (OMIM #143870).
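The acceptor/donor assignment of the two splice variants follows from the coding-DNA notation: an intronic offset written with a minus sign lies just upstream of an exon (acceptor side), and one written with a plus sign lies just downstream (donor side). The following is a minimal sketch of that rule. The function name and the regular expression are illustrative assumptions, and the pattern deliberately handles only simple substitutions at positive coding positions.

```python
import re
from typing import Optional

# Matches intronic substitutions written as c.<exon pos><+|-><offset><ref>><alt>
_INTRONIC = re.compile(r"^c\.(\d+)([+-])(\d+)([ACGT])>([ACGT])$")


def canonical_splice_site(hgvs_c: str) -> Optional[str]:
    """Return 'acceptor' or 'donor' for a substitution at intronic offset 1 or 2.

    Returns None for exonic variants and for deeper intronic positions.
    """
    match = _INTRONIC.match(hgvs_c)
    if match is None:
        return None
    sign, offset = match.group(2), int(match.group(3))
    if offset > 2:
        return None
    return "donor" if sign == "+" else "acceptor"


for variant in ("c.574-1G>A", "c.842+1G>C", "c.90T>A"):
    print(variant, canonical_splice_site(variant))
```

The third variant is exonic, so the sketch returns `None` for it; its effect is described by the protein-level notation p.Cys30* instead.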
The genetic variants identified with the NGS pre-diagnostic gene panel in the 12 infertile patients enrolled in this study are reported in Table 3. Almost half of the variants identified by NGS in these patients belong to the cytoplasmic dynein genes. The distribution of pre-diagnostic gene variants is shown in Figure 2.

Table 2 footnotes: Evaluated by ultrasound (ml). 3: FSH responsiveness was defined by the doubling of sperm concentration or total sperm count vs. pre-treatment values. *The patient was diagnosed with reversal central hypogonadism. The values shown were measured 5 months after treatment withdrawal.

DISCUSSION
Male infertility is a condition with a highly heterogeneous phenotypic presentation and a complex multifactorial etiology including environmental and genetic factors. The large number of candidate genes makes it hard to find a genetic cause of infertility in the majority of cases (22)(23)(24). Nevertheless, a multi-disease gene panel can improve the identification of the etiology of male infertility (3,25,26). In several cases, idiopathic infertility has a genetic origin; therefore, correct phenotyping and an accurate medical history of the infertile patient may represent an initial basis for the genetic interpretation of the disorder (27), especially for genetic variants of uncertain significance (VUS). To classify genetic variants, a prior likelihood of pathogenicity, based on in silico analysis, can be combined with the available genetic and epidemiological data to calculate the probability that a variant is pathogenic, in a multifactorial likelihood model. Based on the recommendations of the American College of Medical Genetics and Genomics, genetic variants can be assigned to five classes: pathogenic, likely pathogenic, variant of uncertain significance, likely benign, or benign (28). A VUS is a genetic change with unclear implications for gene function. Interpretation of VUS is a difficult challenge for genetic counseling and clinical management of infertile male patients. It is fundamental to identify and evaluate VUS since, at the moment, they are not clearly associated with a phenotype but may be classified as pathogenic in the future (29)(30)(31).
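The multifactorial likelihood idea mentioned in this paragraph can be sketched as Bayes' rule in odds form: the prior probability of pathogenicity is converted to odds, multiplied by a likelihood ratio summarizing the genetic and epidemiological evidence, and converted back to a probability. The five class names are those listed in the text. The numeric cut-offs below are illustrative values of the kind used in published multifactorial models; they are an assumption and are not values used in this study.

```python
def posterior_probability(prior: float, likelihood_ratio: float) -> float:
    """Posterior odds = prior odds x likelihood ratio, returned as a probability."""
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    if likelihood_ratio < 0:
        raise ValueError("likelihood ratio must be non-negative")
    posterior_odds = prior / (1.0 - prior) * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


# Assumed cut-offs (NOT from the source): lower bound of each class.
CLASS_LOWER_BOUNDS = [
    (0.99, "pathogenic"),
    (0.95, "likely pathogenic"),
    (0.05, "variant of uncertain significance"),
    (0.001, "likely benign"),
    (0.0, "benign"),
]


def classify(posterior: float) -> str:
    """Map a posterior probability to one of the five classes."""
    if not 0.0 <= posterior <= 1.0:
        raise ValueError("posterior must be between 0 and 1")
    for lower_bound, label in CLASS_LOWER_BOUNDS:
        if posterior >= lower_bound:
            return label
    raise AssertionError("unreachable: the last lower bound is 0.0")


# Hypothetical example: an uninformative prior and modest supporting evidence.
p = posterior_probability(prior=0.5, likelihood_ratio=3.0)
print(p, classify(p))
```

With modest evidence the variant stays in the uncertain class, which mirrors the point made in the text that a VUS may only be reclassified once more data accumulate.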
We have previously developed a genetic test based on NGS that covers the main male infertility indications (9,32,33). In the present study, we developed a custom-made panel of 65 additional pre-diagnostic genes that we tested in 12 infertile patients who were negative to a diagnostic panel consisting of 110 genes. Eleven patients had primary spermatogenic failure and one patient had central hypogonadism.
Almost half of the variants identified by NGS belong to the cytoplasmic dynein genes (Figure 2). Dynein genes are known to be involved in the syndromic forms of asthenozoospermia, including primary ciliary dyskinesia/Kartagener syndrome (38)(39)(40). A possible association between variants of dynein genes and isolated non-syndromic asthenozoospermia has also been reported (41).
Two pathogenic variants were identified in two patients with primary spermatogenic failure: p.Lys1853*, rs748618094 in DNAH5, and p.Asp1152His, rs75541969 in CFTR (42). DNAH5 (Dynein Axonemal Heavy Chain 5), mapping to chromosome 5p15.2, encodes an axonemal heavy chain dynein protein. Variations in this gene mainly cause primary ciliary dyskinesia type 3 and Kartagener syndrome, which are diseases due to ciliary defects. Truncating variants in DNAH5 result in the absence of the outer dynein arm of the cilia, leading to abnormal ciliary structure and motor function (43,44). In this specific case, Subject 2 carries this variant in a heterozygous state and has azoospermia, a trait that may be associated with mutations in DNAH5. However, the pathological phenotype associated with mutations in DNAH5 is inherited in a recessive manner. We cannot exclude the presence of a large deletion/insertion in the other allele or the contribution of other genes. CFTR (CF Transmembrane Conductance Regulator), mapping to chromosome 7q31.2, encodes a membrane protein and chloride channel. Mutations in this gene are well known to cause cystic fibrosis (45). CFTR is important for spermatogenesis (46). Genetic variants of the CFTR gene are a relatively frequent cause of male infertility, due to obstructive azoospermia, or in atypical forms of CF such as the congenital absence of the vas deferens, bilateral ejaculatory duct obstruction, or bilateral obstructions (47,48). However, the patient studied here (Subject 8) has oligo-astheno-teratozoospermia, a trait never associated with this gene. We cannot exclude the presence of a large deletion/insertion in the other allele or the contribution of other genes.
DNAI1 (Dynein Axonemal Intermediate Chain 1), mapping to chromosome 9p13.3, and DNAH11 (Dynein Axonemal Heavy Chain 11), mapping to chromosome 7p15.3, are other genes of the dynein family related to primary ciliary dyskinesia and involved in male infertility (48), especially in isolated non-syndromic asthenozoospermia (32). The variant in DNAI1 is heterozygous; however, primary ciliary dyskinesia caused by mutations in DNAI1 is inherited in an autosomal recessive manner. We cannot exclude that heterozygous variants in DNAI1 may cause a milder phenotype characterized only by infertility. In this specific case, Subject 1 showed oligo-astheno-teratozoospermia. Variants of DNAH11 are also found in primary ciliary dyskinesia patients with normal ciliary ultrastructure. Interestingly, we found a patient (Subject 7) who carries the p.Pro3935Leu variant in a homozygous state. In gnomAD this variant is always reported in a heterozygous state. CCDC40 (Coiled-Coil Domain Containing 40), mapping to chromosome 17q25.3, is another gene associated with ciliary dyskinesia. The coiled-coil domain-containing protein CCDC40 is essential for motile cilia function and left-right axis formation (49). The variant p.Asp284His was found in compound heterozygosity with p.Phe649Leu; we may therefore speculate that the two variants do not cause major developmental defects such as primary ciliary dyskinesia but may cause oligo-astheno-teratozoospermia, as observed in Subject 3.
Interestingly, other variants with high impact, requiring further functional and family segregation studies, were identified. These include the splice variants rs760519968 in AMELY and rs199516208 in CATSPER2, and the stop-gained variant p.Cys30* in ADCY10. To date, no loss-of-function mutations have been reported in the AMELY (Amelogenin Y-linked) gene in association with infertility. Structural rearrangements involving AMELY, mapping to chromosome Yp11.2, have been found in patients with hypogonadism (50), although a direct link between the phenotype and the rearrangement has not been proven. CATSPER2 (Cation Channel Sperm Associated 2), mapping to chromosome 15q15.3, is the main Ca2+ channel mediating extracellular Ca2+ influx into spermatozoa. CATSPER-related infertility is associated with azoospermia. This is consistent with the phenotype reported in Subject 9 (51). ADCY10 (Adenylate Cyclase 10), mapping to chromosome 1q24.2, encodes soluble adenylyl cyclase, which is the predominant adenylate cyclase in sperm and is crucial to the regulation of sperm motility; it is associated with severe recessive asthenozoospermia (52). Subject 10 shows oligo-astheno-teratozoospermia; therefore, his phenotype partially overlaps with asthenozoospermia. Although truncating variants in ADCY10 are recessively inherited when associated with infertility, we cannot exclude the presence of a large insertion/deletion in the other allele that was not detected with NGS.
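A recurring argument in the Discussion is that a single heterozygous variant does not by itself satisfy a recessive inheritance model, so a second, undetected event on the other allele cannot be excluded. The sketch below encodes that check for the three genes for which the text states both the zygosity and the recessive mode. The function name, the labels, and the data layout are hypothetical.

```python
# (gene, zygosity) for findings that the Discussion describes as recessive.
RECESSIVE_FINDINGS = [
    ("DNAH5", "heterozygous"),
    ("DNAI1", "heterozygous"),
    ("ADCY10", "heterozygous"),
]

# Genotypes in which both alleles carry a variant.
BIALLELIC = {"homozygous", "compound heterozygous"}


def fits_recessive_model(zygosity: str) -> bool:
    """A recessive condition is expected only when both alleles are affected."""
    return zygosity in BIALLELIC


# Genes for which the detected genotype alone does not explain the phenotype,
# so a second hit (for example a large deletion/insertion) remains possible.
follow_up = [
    gene for gene, zygosity in RECESSIVE_FINDINGS
    if not fits_recessive_model(zygosity)
]
print(follow_up)
```

All three findings end up in the follow-up list, which matches the caution expressed in the text about large insertions/deletions that NGS may have missed.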
Therefore, an NGS custom-made panel test including pre-diagnostic genes can improve genetic diagnostic testing and can influence the clinical management of male infertility. The precise prevalence of male infertility is not known and, at present, there are no complete systematic reviews or meta-analyses on the epidemiology of male infertility (53,54). Making the diagnosis of genetic infertility is of relevance, also because the available epidemiological observations indicate lower life expectancy and higher morbidity in infertile patients (55,56).
In conclusion, we showed the efficacy of NGS-based approaches that also employ pre-diagnostic genes. This panel of genes may help to identify the etiology underlying the disorder and guide clinical management.

DATA AVAILABILITY STATEMENT
The dataset presented in this study can be found in online repositories. The names of the repository/repositories and accession numbers can be found in the article/supplementary material.

ETHICS STATEMENT
The experimental protocol was performed in the Division of Andrology and Endocrinology of the Teaching Hospital "G. Rodolico," University of Catania, Catania, Italy. The internal Institutional Review Board approved the study protocol. An exhaustive explanation of the study purpose was given to each participant and informed written consent was obtained in compliance with the Declaration of Helsinki. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
VP wrote the article. RC collected clinical data and critically revised the article. SP, GMB, TB, LS, GT, and AZ analyzed the data and critically revised the article. GM performed the bioinformatic analysis and critically revised the article. AEC conceived the study, collected clinical data, supervised the