Excess of Rare Missense Variants in Hearing Loss Genes in Sporadic Meniere Disease

Meniere's disease (MD) is a clinical spectrum of rare disorders characterized by vertigo attacks, associated with sensorineural hearing loss (SNHL) and tinnitus involving low to medium frequencies. Although it shows familial aggregation with incomplete phenotypic forms and variable expressivity, most cases are considered sporadic. The aim of this study was to investigate the burden for rare variation in SNHL genes in patients with sporadic MD. We conducted a targeted-sequencing study including SNHL and familial MD genes in 890 MD patients to compare the frequency of rare variants in cases using three independent public datasets as controls. Patients with sporadic MD showed a significant enrichment of missense variants in SNHL genes that was not found in the controls. The list of genes includes GJB2, USH1G, SLC26A4, ESRRB, and CLDN14. A rare synonymous variant with unknown significance was found in the MARVELD2 gene in several unrelated patients with MD. There is a burden of rare variation in certain SNHL genes in sporadic MD. Furthermore, the interaction of common and rare variants in SNHL genes may have an additive effect on MD phenotype. This study will contribute to design a gene panel for the genetic diagnosis of MD.


INTRODUCTION
Meniere's disease (MD, MIM 156000) is a chronic disorder of the inner ear characterized by episodes of vertigo, associated with low to middle frequency sensorineural hearing loss (SNHL), tinnitus and aural fullness . The disorder produces an accumulation of endolymph in the membranous labyrinth, and it may affect both ears in 25-40% of patients (termed bilateral MD) and most of cases are considered sporadic (Paparella and Griebie, 1984). However, heterogeneity in the phenotype is observed and some patients may have co-morbid conditions such as migraine or systemic autoimmune disorders (Gazquez et al., 2011;Caulley et al., 2017). This phenotypic spectrum can make the clinical diagnosis challenging considering that some of the symptoms overlap with other vestibular disorders such vestibular migraine (VM) or autoimmune inner ear disease (AIED) (Hietikko et al., 2011;Lempert et al., 2012;Mijovic et al., 2013;Requena et al., 2014b).
Epidemiological evidence showing a genetic contribution in MD is based on familial aggregation studies with a high siblings recurrence risk ratio (λ s = 16-48) (Requena et al., 2014a) and the description of multiple familial cases in European and Asian descendant populations (Arweiler-Harbeck et al., 2011;Hietikko et al., 2013). Exome sequencing has identified private variants in the FAM136A, DTNA, PRKCB, SEMA3D, and DPT genes in 4 families with autosomal dominant MD, showing incomplete penetrance and variable expressivity (Requena et al., 2015;Martín-Sierra et al., 2016. Moreover, some relatives in familial MD show partial syndromes, either with SNHL or episodic vertigo, increasing the granularity in the phenotype in a given family. However, the genetic contribution of familial genes in sporadic cases has not been investigated and the occurrence of recessive and novel variants is not known. More than 110 genes and ≈6,000 variants have been related to hereditary nonsyndromic hearing loss, making gene sequencing panels an essential tool for genetic diagnosis of hearing loss (The Molecular Otolaryngology and Renal Research Laboratories, The University of Iowa, 0000; Sloan-Heggen et al., 2016). Those genes include 45 genes associated to autosomal dominant SNHL and 71 genes related to recessive hearing loss (Shearer et al., 1999;Van Camp and Smith, 2018).
Targeted gene sequencing panels have been demonstrated to be an excellent tool for molecular diagnosis of rare variants in known genes with allelic heterogeneity (Brownstein et al., 2011;Lionel et al., 2018), as well as in sporadic cases of hearing impairment in specific populations (Gu et al., 2015;Dallol et al., 2016). So, we selected SNHL genes and designed a custom gene panel to search for rare and novel variants in sporadic MD.
In the present study, we describe the genetic variation found in a custom exon-sequencing panel of 69 genes in a large cohort of MD sporadic cases. We report that certain genes such as GJB2, USH1G, SLC26A4, and CLDN14 show an excess of missense variants in sporadic MD cases when compared to controls in the Iberian population, suggesting that several rare variants in these genes may contribute to the SNHL phenotype in sporadic MD.

Editorial Policies and Ethical Considerations
This study protocol was approved by the Institutional Review Board for Clinical Research (MS/2014/02), and a written informed consent to donate biological samples was obtained from all subjects.

Sample Selection
A total of 890 Spanish and Portuguese patients with MD were recruited. All patients were diagnosed by neurotology experienced clinicians from the Meniere's Disease Consortium (MeDiC), according to the diagnostic criteria for MD formulated by the International Classification Committee for Vestibular Disorders of the Barany Society in 2015 . Among them, 830 were considered sporadic MD cases while 60 were familial MD cases. Details from the selected cases are described in Table 1. As controls, 40 healthy individuals were selected from the same population.

Selection of Target Genes
Target genes were selected from a literature search attending to human phenotype (hearing profile, comorbid vestibular symptoms) and phenotype observations in mouse and zebrafish models. Most of them were selected from HearingLoss.org website gene list for monogenic SNHL. Additional genes were added because they have been previously found in familial MD (Requena et al., 2015;Martín-Sierra et al., 2016, or allelic variations associated with hearing outcome in MD had been described, such as NFKB1 or TLR10 genes (Requena et al., 2013;Cabrera et al., 2014). Mitochondrial genes were added since maternal inheritance is suspected in several families with MD (Requena et al., 2014b). Relevant information about the location, size, bibliography, and other characteristics about each gene included in the panel is presented in Tables S1, S2.
The custom panel (Panel ID: 39351-1430751809) were designed by the Suredesign webtool (Agilent) to cover the exons and 50 bp in the flanking regions (5 ′ and 3 ′ UTR). This allowed the sequencing around 533.380 kb with more than 98.46% coverage.

Sample Pooling
Enrichment technology allows the selective amplification of targeted sequences and DNA sample pooling, reducing the costs of reagents and increasing sample size. We decided to pool patient samples according to their geographical origin. Each pool consisted of 10 DNA samples from the same hospital for a total of 93 pools (930 samples).
DNA concentration and quality were measured on each sample using two methods: Qubit dsDNA BR Assay kit (TermoFisher Scientific) and Nanodrop 2000C (ThermoFisher Scientific). All samples had quality ratios ranging 1.8 and 2.0 in 280/260 and 1.6 to 2.0 in 260/230.

Libraries Preparation Protocol
The HaloPlex Target Enrichment System (Agilent, Santa Clara, CA) was used to prepare the DNA libraries, according to the manufacturer protocol. Validation of the protocol and library performance was analyzed with a 2,100 Bioanalyzer High Sensitivity DNA Assay kit. Expected concentrations were between 1 and 10 ng/ul. Higher concentrations than 10 ng/ul were diluted 1:10 in 10 mM TRIS, 1 mM EDTA. Targetedsequencing was performed in an Illumina Nextseq500 platform.

Data Generation Pipelines
Raw data downloaded and sequencing adapters were trimmed following manufacturer indications. The requested depth of coverage for the sequencing panel was 250X. The minimum coverage considered was 30X mean depth for nuclear genes, however mitochondrial sequences reached higher coverages with the enrichment technology. Bioinformatic analyses were performed according to the Good Practices recommended by Genome Analysis ToolKit (https://software.broadinstitute. org/gatk/). Mitochondrial genes were analyzed using the same pipeline.
Two methods were used to find differences in how UnifiedGenotyper and HaplotypeCaller (the old and the most recent tools for variant calling in GATK suite) address sequenced pools. Both custom pipelines use BWA-mem aligner and GATK suite tools following the GATK protocol for Variant Calling against GRCh37/hg19 human reference genome. Left normalization for multi-allelic variants were addressed by separated. Calling was made in the first pipeline with UnifiedGenotyper modifying number of chromosomes per sample (per pool, there are 20 chromosomes). The second pipeline used HaplotypeCaller, which cannot allow the same approach, but can automatically address high number of calls with a different approach. Variants with read depth (RD) <10 and genotype quality (GQ) <20 were excluded in all the calling pipelines following recommended hard filtering steps by GATK suite.
A third caller tool, VarScan, was used to filter and annotate quality strand data per variant to compare its output with GATK-based callers. VarScan allows the variant filtering using the information obtained according to each strand polarity. The method retrieves those variants that were only called in one strand, but not in the reverse strand, leading to false positive calls. This step was used as internal quality control to avoid strand bias usually generated in Haloplex data, as it has been reported in other studies (Collet et al., 2015).

Positive Control SNV Validation
Positive control testing was addressed using samples from patients with familial MD with known variants on certain genes. These individuals come from previous familial studies with independently validated variants by Sanger sequencing. Known variants were also sequenced and validated by Sanger (Table S3).
Coverage and mapping quality after each pipeline were annotated and measured. Representative chromatographs from validated SNVs are detailed in Figure S1.

Selection and Prioritization of Pathogenic SNV
In order to obtain more information of each SNV, we annotated the merged files using the ANNOVAR tool. Minor allele frequencies (MAF) were retrieved for each candidate variant from gnomAD and ExAC database (total individuals and non-Finnish European (NFE) individuals). Since the estimated prevalence of sporadic MD in Spain is 0.75/1,000 individuals (Morales Angulo et al., 2003), we selected variants with MAF <0.001 for single rare variant analysis and prioritized them according to Combined Annotation Dependent Depletion (CADD) phred score. For burden analysis of common and rare variants, we chose a higher MAF value <0.1. The Collaborative Spanish Variant Server (CVCS) database including 1,644 unrelated individuals was also used for annotation of exonic rare variants and to retrieve MAF in Spanish population (dataset fully accessible from http://csvs.babelomics.org/) (Dopazo et al., 2016).
KGGseq suite (grass.cgs.hku.hk/limx/kggseq) was used for the selection of rare variants to prioritize the most pathogenic variants according to the integrated model trained algorithm with known pathogenic variants and neutral control variants.
Enrichment analysis for each gene was made with all the exonic variants found with a MAF <0.1. This analysis required to divide the total amount of variants into three groups: those described in global ExAC population, those found in NFE population, and finally those included in CVCS Spanish population. These three reference datasets were used for enrichment analysis comparison.

Validation of Candidate Pathogenic SNV
Candidate SNV were visually revised in the BAM files using Integrative Genomics Viewer (https://software.broadinstitute. org/software/igv/) and validated in the different pools where they were called using Sanger sequencing.

Population Statistics
Statistical analysis was performed with IBM SPSS v.20 program, Microsoft Excel suite tools, and diverse python and java encoded public scripts. Due to the overrepresentation of Spanish population in our dataset, most of the selected variants were filtered through exome sequencing data from Spanish controls of CSVS database. The MAF was calculated for each variant in our dataset and rare and previously unreported variants on MD patients were identified in our gene panel. Odd ratios with 95% confidence interval were calculated for each variant using MAF obtained from Spanish population (N = 1,579), ExAC (N = 60,706), and ExAC NFE (N = 33,370) populations as controls.
Gene burden analysis was addressed using 2 × 2 contingency tables counting total exonic alternate allele counts per gene in our cases against total and NFE controls in ExAC and CSVS controls. Odds ratios with 95% confidence intervals were calculated using Fisher's exact test and obtaining one-sided pvalues. P-values were also corrected for multiple testing by the total amount of variants found for each gene following Bonferroni approach.

Position of Variants in Significant Enriched Genes
Several models were generated for rare variant-enriched domains in significant enriched genes by using the INSIDER modeling tool (Meyer et al., 2018). The selected variants per gene are detailed in results. Prediction values were annotated with their calculated p values.

Single Rare Variant Analysis
We achieved an average capture efficiency rate (percentage of total on-target reads in total sequenced reads) of 69% on the target regions above 30X (minimum depth considered for quality filtering). The mean coverage percentage can be found in Table S4. A total of 2,770 SNV in nuclear genes were selected from the raw merged dataset (18,961 SNVs) after filtering by quality controls. The analysis workflow is summarized in Figure 1. For rare variants analysis, SNV that were found in more than one pool were selected, remaining 1,239 variants. After that, we filtered by variants observed in the control pools, leaving only 392 exonic SNV in cases (278 missense, 111 synonymous, 2 stopgain, and 1 stoploss).
A final set of 162 SNV with a MAF <0.001 were retrieved (143 missense, 18 synonymous, 1 stoploss, 1 stopgain). All the exonic variants were annotated and scored using different priorization tools. Of them, 136 SNVs were not previously described in any population database and we considered them as potential novel variants.
After prioritizing the exonic variants by CADD phred, 31 rare variants remained ( Table 2). Six of them were validated by Sanger sequencing in more than two individuals in the following genes: GJB2, ESRRB, USH1G, SLC26A4 ( Table S3). The rest of the variants were considered benign or likely benign since they did not reach the pathogenicity threshold predicted for KGGSeq. However, a novel synonymous variant in the MARVELD2 gene was found and validated in three unrelated individuals.
The minor allelic frequencies in SNV of the 24 mitochondrial genes included in the panel were compared with the reference data obtained from MITOmap through its automated mtDNA sequence analysis system Mitomaster (Ruiz-Pesini et al., 2007). However, the candidate variants observed do not belong to the genes targeted in the mitochondrial genome, since they were not validated by Sanger sequencing. We did not found any SNV associated with MD (data not shown).

Gene Burden Analysis
To analyse the interaction of multiple variants, we considered SNV with a MAF<0.1 for the gene burden analysis. A total of 957 exonic variants were retrieved and their frequencies were compared with the global and NFE frequencies from ExAC, and with the Spanish population frequencies from CSVS.
A gene burden analysis using our gene set was performed using these three reference datasets. After Bonferroni correction, some genes showed a significant enrichment of rare variants in the three comparisons, making them candidate genes to be selected for a diagnosis panel for MD (Table 3). Moreover, 6 genes (FAM136A, ADD1, SLC12A2, POU4F3, RDX, and PRKCB) presented some novel variants that were validated by Sanger, but they have not been described in global ExAC or CSVS datasets. Although these previously unreported variants could not be sequenced in all the parents of these patients, we considered them as potential de novo variants.
A second variant analysis using the missense variants described in CSVS Spanish population database was made Minor allele frequency for each SNV is detailed as annotated by ExAC and gnomAD (exomes). Pathogenicity prediction is detailed according to CADD phred score.
( Table 4). Eighteen genes showed an excess of missense variants (a total of 46 variants, detailed in Table S5). Of note, five genes causing autosomal recessive SNHL showed the highest accumulation of missense variants when they were compared with NFE and Spanish population datasets: SLC26A4, GJB2, CLDN14, ESRRB, and USH1G. The variants in these five genes were validated through Sanger sequencing and considered Spanish population-specific variants.

Excess of Rare Variants in Hearing Loss Genes in Familial Cases
We used exome sequencing datasets from familial MD cases previously reported to search for rare variants identified in our panel in the sporadic cases. Although no single missense variant was found segregated in all the cases in the same family, we found several rare missense variants in at least one case per family in genes such as GJB2, GRHL2, TRIOBP, RDX, KCNQ4, WFS1, and ADD1. These MD families show phenotypic differences in terms of age of onset, hearing profile and disease progression and the presence of rare variants can be addressed as potential modulators of the phenotype in each familial case ( Table 5).

Effect of Rare Variant Interaction
We selected exonic variants from the gene burden analysis to analyze their potential additive effect at the protein-protein interaction interfaces by the tool INSIDER for our selected five genes. However, protein interfaces for ESRRB, CLDN14 and SLC26A4 genes could not be loaded and processed on the database (lacking predicted interfaces on ÉCLAIR database or crystalized protein structures on Protein Data Bank (PDB) database). Of note, most relevant affected interaction is observed in the self-interaction GJB2-GJB2 by the known variants observed in the burden analysis (significant spatial clustering with 4 SNV, p = 0.0009) rs111033218:G>C (p.Phe83Leu), rs80338945:A>G(p.Leu90Pro), rs374625633:T>C(p.Ile30Val), and rs2274084:C>T(p.Val27Ile) (Figure 2A).
Other interactions of interest were founded between the USH1G-USH1C genes, but the involved variants were not located in the known interaction surface of USH1G (Figure 2B).    Variants were retrieved from familial cases segregating a partial phenotype in different families. To assess if the SNHL genes showing enrichment of missense variants were located in genomic regions with a higher recombination rates, we retrieved recombination rates from deCODE genetics maps for the ESRRB, GJB2, USH1G, CLDN14, and SLC26A4 genes and calculated linkage disequilibrium correlations for candidate missense variants in these five genes.
USH1G and ESRRB genes have the highest recombination rates and they seem to be in genomic regions considered as hotspots (Table S6). However, most of the rare missense variants found were not clustered and showed a scattered distribution along the different exons with a low recombination rate (Table S7).

DISCUSSION
This study shows that patients with sporadic MD have an enrichment of few rare variants in certain hearing loss genes such as GJB2, SLC26A, or USH1G. This excess of missense variants in some genes may increase the risk to develop hearing loss in MD and may contribute to explain the heterogeneity observed in the phenotype (Francioli et al., 2015). To understand the relevance of population frequencies in our cohort, we performed the association analysis between variants observed in MD cases against their respective frequencies on a healthy population for each gene of the panel (Lek et al., 2016). From the total amount of variants, we selected rare coding variants for all the targeted genes. We applied a stronger filter for the selection of missense variants by choosing previously described variants for each gene significantly overrepresented in MD cases.
Since many missense variants were not found in the Spanish population from CSVS (Dopazo et al., 2016), a third comparison limited to previously reported variants in CSVS database was carried out. We followed this conservative approach to reduce false positive findings in the readings. This third restrictive gene analysis was limited to 132 variants observed at least once in the Spanish reference population.
From the final analysis, we found that some genes such as SLC26A4, ESRRB, CLDN14, GJB2, and USH1G retained the higher number of missense variants among Spanish MD patients. We also found one novel synonymous variant in the MARVELD2 gene in 3 unrelated patients. Besides from its functional implications, it may also generate a cryptic splice site. However, more testing is needed to confirm this finding.

Multiallelic Model for MD
The excess of missense variants in SNHL genes may point to core gene for hearing loss in MD. Our hypothesis is that common cisregulatory variants and rare variants in one or more genes will contribute to the phenotype in MD. The model will need the additive effect of at least a common and a rare variant in the same gene in a given individual (Castel et al., 2018). In the simplest bi-allelic hypothesis, we will have: Ind 1 = cv a + rv z (geneA) Ind 3 = cv a + rv x (geneA) Where cv is a common variant and rv represents a rare variant; however, this model could be more complex for a single gene: Ind 1 = cv a + cv c + rv z (geneA) So, several rare variants will be targeting the core genes (rv z, rv x for gene A; rv y, rv w for gene B) and common variants in the same genes will explain variable expressivity of the MD phenotype. Finally, in a more complex scenario, it could involve several genes (oligogenic multiallelic hypothesis): Ind n = cv a + rv z gene A + cv b + rv y gene B + . . . + cv n + rv m (gene N)

Gene Panel for Familial MD
The Genomics England project (https://www.genomicsengland. co.uk/) has designed gene panels for the diagnosis of many genetic disorders including familial MD (https://panelapp. genomicsengland.co.uk/panels/394/). This panel is in an early stage of development because it only considers 130 genes with limited evidence to few families. The results of this study can be used to improve the design of panels for the diagnosis of MD.
For the design of our panel, we chose a total of 69 genes. Most of the genes were selected according to the hearing loss profile (low frequency or pantonal hearing loss). However, more than 90 genes have been related to hearing loss, so more hearing loss genes could be involved in the phenotype (fully accessible from Hereditary Hearing Loss Homepage: http:// hereditaryhearingloss.org/). Genetic evidence of hearing loss has been obtained from linkage analyses until the emergence of NGS techniques (Shearer et al., 1999), that have facilitated the clinical development of genetic diagnosis in hearing loss. Custom panels and microarrays have been the flags of a new age of discovery of novel and rare variants for genetic diagnostic of hearing loss (Brownstein et al., 2011;Shearer and Smith, 2015).
Our panel was designed considering hearing loss as the main symptom shared by all patients with MD, since the vestibular phenotype and other associated co-morbidities such as migraine or autoimmune disorders are more variable. To improve the diagnostic yield of MD and to decrease this granularity in the phenotype, it will be recommendable to select sporadic patients with an early age of onset for future studies.

Rare Missense Variants in Hearing Loss Genes in Sporadic MD
The frequency of hearing loss related genes is population-specific (Sloan-Heggen et al., 2016). Herein, we present a study for MD patients in the Spanish population. As a part of the study, we consider a panel of genes related to hearing loss and other symptoms. Besides from the validated variants in singletons, only a few rare variants such as ESRRB rs201448899:C>T, MARVELD2 rs369265136:G>A, SLC26A4 rs200511789:A>C, and USH1G rs151242039:C>T have been validated in more than one sporadic case in the entire cohort. All these genes had been previously considered as pathogenic for hearing loss, but they have never been involved with MD.
ESRRB encodes the estrogen-related receptor beta, also known as nuclear receptor subfamily 3, group B, member 2 or NR3B2. This gene encodes for a protein like the estrogen receptor but with a different and unknown role. Mutations in the mouse ortholog have been involved in the placental development and autosomal recessive SNHL (Collin et al., 2008;Weber et al., 2014).
MARVELD2 encodes a protein found in the tight junctions, between epithelial cells. The encoded protein seems to forge barriers between epithelial cells such the ones in the organ of Corti, Defects in this gene are associated with DFNB49 (Mašindová et al., 2015).
SLC26A4 gene encodes pendrin, a protein extensively studied in hearing loss. Its alteration is one of the most common causes of syndromic deafness and autosomal recessive SNHL. It is also associated with enlarged vestibular aqueduct syndrome (EVAS) [ (Yang et al., 2007), 36].
USH1G is a gene translating to a protein that contains three ankyrin domains, a class I PDZ-binding motif and a sterile alpha motif. This protein is well-known to interact with harmonin (USH1C) in the stereocilia of hair cells, a protein associated with Usher syndrome type 1C (Weil et al., 2003). This protein plays a role in the development and maintenance of the auditory and visual systems and functions in the cohesion of hair bundles formed by inner ear sensory cells. Alterations in the integrity of the protein seem to be the cause of Usher syndrome type 1G (Weil et al., 2003;Miyasaka et al., 2016).
However, ESRRB rs201448899:C>T has been observed in more Spanish controls than in global or NFE in ExAC. This increased frequency on the Iberian population when compared with other known largest frequencies as NFE, suggests that this is a population specific variant rather than a MD disease variant. Only the MARVELD2 rs369265136:G>A variant remains as a proper novel related to MD cases. However, the functional effect of a synonymous variant is unknown and functional studies will be required to decipher the relevancy of this variant in MD cases in the future.

Burden Analysis of Rare Missense Variants in Sporadic MD
Our results demonstrate a burden of rare missense variants in few SNHL genes, including GJB2, ESRRB, CLDN14, SLC26A4, and USH1G. We speculate that the additive effect of several missense variants in the same gene could interact with the same or other genes at the protein level resulting in the hearing loss phenotype.
Population analysis was addressed in order to obtain a better image of our cohort. Despite the limitation that represents the small number of genes considered in our panel, we have found a significant increase of missense variants on several hearing loss genes in the Iberian population (Table 4). These findings suggest the involvement of multiple missense variants in the same gene and may explain several clinical findings in MD. So, incomplete phenotype found in relatives of patients with familial MD or even the variable expressivity observed could be explained on the differences found in multiple rare variants with additive effect among individuals of the same family (Requena et al., 2015;Martín-Sierra et al., 2016. In addition, some sporadic cases where a single rare variant with unknown significance cannot explain the phenotype could be singletons individuals with low frequency variants probably following a compound heterozygous recessive pattern of inheritance. Our results start to decipher the complex interaction between rare and ultrarare variations (MAF <0.0001) with common variants in the same or different genes in sporadic MD, adding more evidence to understand the genetic architecture of MD. However, one of the limitations of this study is the lack of availability of a replication cohort with different ethnicity in which to validate these findings.
Another limitation of our dataset is that the method used for resequencing mitochondrial genes may not be able to distinguish mitochondrial from nuclear sequences, as capture panels such as those based in the Haloplex technology may sequence all mitochondrial genome fragment replicas that are dispersed throughout the nuclear genome. Hence, variants observed may not belong to the genes targeted in the mitochondrial genome, but to their pseudogenes in the nuclear genome.
Several hypotheses could explain the excess of missense variants in SNHL genes in MD. First, the variable expressivity of SNHL in MD phenotype, could be the result of additive effect of low frequency or rare variants in the same gene. The combination of low frequency variants in the same gene can be a rare situation, as rare as the disease. As much changes are added to the protein, its integrity could be affected, showing a suboptimal functioning and finally, a loss of function. In our case, GJB2, that forms a hexamer with a transmembrane channel function, has been determined as possible affected by these changes in their interactions. Previous studies have determined how certain changes in the monomer can affect to the develop of the hexamer hemichannel (Bicego et al., 2006;Jara et al., 2012). Here, bioinformatics models show how the interaction of low frequency variants found in MD patients can impact the interaction between two connexins monomers, but this effect could be amplified in a model including the 6 connexins that form the connexon. However, this hypothesis is difficult to reconcile with the fact that for some small genes such as GJB2, complex alleles with several point mutations are exceedingly rare.
A second hypothesis points to the interaction of common and rare variants in one or several genes in the disease phenotype, following its complex disease definition (Becker, 2004;Mitchell, 2012). So, the excess of rare variants will be targeting core genes for hearing loss in MD. In this case, high significant genes in our study could be added to the panel of candidate targets of the disease, although a single variant could not be enough to explain the disease phenotype. So, the interaction between cis-regulatory variants with rare variant in some of our candidate genes and other, a priori, not related SNHL genes could be relevant in the expressivity. USH1G interacts with USH1C, a known gene involved in Usher syndrome. USH1G has been observed to have a minor role in Usher syndrome in Spanish population (Aller et al., 2007), but not in MD, even though they share similar hearing loss profile. Although no one of the missense variants in USH1G were in an interaction domain, this could be of interest when considering interaction between different proteins as a main factor to develop a mild phenotype. This hypothesis was reinforced through the data found in familial cases. For instance, the variant rs748718975 in DPT gene was only associated with the SNHL phenotype in the family where it was described, but these cases showed different characteristics in the age of onset or hearing loss outcome. These differences between the cases can be explained with other variants found in KCNQ4 (rs574794136:G>A) and ADD1 (rs372777117:A>G) genes, although these variants were previously described as variants of unknown significant. So, the variant rs574794136:G>A was found in two sisters with MD, but not in the third one, that was carrier of rs372777117:A>G. This excess of rare variants in certain genes observed in familial cases could explain the differences in expressivity in a given family.
This panel was made as an early screening diagnostic panel. Here we have found that certain SNHL gene variants can be related to MD in the Iberian population and the results show that multiple rare allelic variants in the same gene should be consider as likely pathogenic. Although there are large differences in the coverage for some genes between the MD panel and the WES databases, these are not the ones with excess of missense variants. Our results will contribute to design a novel gene panel for the genetic diagnosis of MD.

DATA AVAILABILITY
The raw data supporting the conclusions of this manuscript will be made available by the authors, without undue reservation, to any qualified researcher.

AUTHOR CONTRIBUTIONS
TR and JL-E as project managers conceived the main idea of the work. Sample preparation and protocol were carry out by TR and AG-M. AG-M performed the bioinformatics and statistical analysis. Validations of tested SNV were done by AG-M and PR-N. AG-M, and JL-E took the lead in writing the manuscript. All authors provided critical feedback and helped shape the research, analysis, and manuscript.