Integrating Population Variants and Protein Structural Analysis to Improve Clinical Genetic Diagnosis and Treatment in Nephrogenic Diabetes Insipidus

Congenital nephrogenic diabetes insipidus (NDI) is a rare genetic disorder characterized by renal inability to concentrate urine. We utilized a multicenter strategy to investigate the genotype and phenotype in a cohort of Chinese children clinically diagnosed with NDI from 2014 to 2019. Ten boys from nine families were identified with mutations in AVPR2 or AQP2 along with dehydration, polyuria–polydipsia, and severe hypernatremia. Genetic screening confirmed the diagnosis of seven additional relatives with partial or subclinical NDI. Protein structural analysis revealed a notable clustering of diagnostic mutations in the transmembrane region of AVPR2 and an enrichment of diagnostic mutations in the C-terminal region of AQP2. The pathogenic variants were significantly more likely than population variants to be located inside the domain. Through the structural analysis and in silico prediction, the eight mutations identified in this study were presumed to be disease-causing. The most common treatments were thiazide diuretics and non-steroidal anti-inflammatory drugs (NSAIDs). Emergency treatment for hypernatremic dehydration in neonates should not use isotonic saline as a rehydration fluid. Genetic analysis presumably confirmed the diagnosis of NDI in each patient in our study. We outlined methods for the early identification of NDI through phenotype and genotype, along with optimized treatment strategies.


INTRODUCTION
The congenital form of nephrogenic diabetes insipidus (NDI), a rare inherited disorder, is characterized by insensitivity of the distal nephron to the antidiuretic action of arginine-vasopressin (AVP) and the reduced ability of the kidney to concentrate the urine, leading to severe dehydration and electrolyte imbalance (hypernatremia and hyperchloremia) (1). In 90% of patients, NDI arises from mutations in the X-linked gene coding for the vasopressin type 2 receptor (AVPR2) (OMIM #304800) (1,2). The remaining patients display autosomal recessive or dominant forms of inheritance due to mutations in the gene coding for the aquaporin 2 (AQP2) water channel (OMIM #222000; OMIM #125800, respectively) (1,3). The main clinical hallmarks of NDI are polyuria and compensatory polydipsia. When encountering an inadequate water supply, a hot environment, or episodic losses of free water, patients suffering from NDI do not properly compensate for water loss and are at risk of severe dehydration. The defect in the ability to concentrate urine is present at birth and is accompanied by additional symptoms that arise during the first week of life, such as irritability, poor feeding, and failure to thrive (4). Persistent polyuria can lead to the development of kidney megacystis, hydroureter, and hydronephrosis. Repeated episodes of dehydration can cause cognitive difficulties (5), one of the most serious complications of NDI, probably secondary to hypoxic episodes (6). Establishing the genetic diagnosis is particularly important for NDI in order to enable early detection and a more efficient differential diagnosis in view of its unique associated features and long-term complications.
Few studies have described the spectrum and prognosis of NDI in pediatric patients in China. In the current study, we utilized a multicenter strategy to investigate the genotype and phenotype in a cohort of Chinese children with NDI to explore the diagnosis and treatment of NDI in China. The description of the clinical and genetic spectrum of NDI will substantially help to devise a population-specific strategy for gene analysis.

Study Design and Participants
The national multicenter registry (Chinese Children Genetic Kidney Disease Database, CCGKDD) has genetically screened a cohort of pediatric renal disease patients in China to date (7). Participants in this study were solicited via clinicians collaborating with CCGKDD. Participants were asked to provide information concerning their presenting clinical features, genetic diagnosis, monitoring, medical management, age at each clinical event, and clinical status at their latest follow-up. The responding clinicians were contacted to report details of clinical features, treatment, and follow-up of patients with NDI. No identifying information was collected about patients or respondents; only the names of the reporting centers were collected, to allow the comparison of entries and avoid duplicates. Among the children with renal disease consecutively enrolled in the national multicenter registry CCGKDD from 2014 to 2019, patients were diagnosed with NDI based on symptoms, biochemical disturbances, genetic testing, and water restriction/desmopressin (DDAVP) loading tests.
Abbreviations: 3D, three dimensions; AD, autosomal dominant inherited; AR, autosomal recessive inherited; AVP, arginine-vasopressin; AVPR2, vasopressin type 2 receptor; AQP2, aquaporin 2; CCGKDD, Chinese Children Genetic Kidney Disease Database; CKD, chronic kidney disease; DDAVP, desmopressin; GPCR, G protein-coupled receptor; HGMD, Human Gene Mutation Database; IQR, interquartile range; NDI, nephrogenic diabetes insipidus; NSAIDs, non-steroidal anti-inflammatory drugs; OMIM, Online Mendelian Inheritance in Man; PCR, polymerase chain reaction; SNPs, single-nucleotide polymorphisms; Trio-WES, trio whole exome sequencing; WES, whole exome sequencing.

Genetic Analysis
In order to further identify and confirm the diagnosis of NDI, we performed trio whole exome sequencing (Trio-WES) on the probands and their parents concurrently. Informed consent was obtained from the parents prior to genetic analysis. WES was outsourced, and raw data were then transferred to our lab for bioinformatics analysis. FastQC software was used to examine FastQ raw data quality in terms of length of reads, GC content of reads, quality of nucleotides within the reads, and over-represented sequences, among others. Next, sequences were aligned to the hg19 reference genome, and variants were called using the HaplotypeCaller tool of GATK software. Lastly, variants were annotated for their predicted effects on protein function and for allele frequency using the public database gnomAD (https://gnomad.broadinstitute.org/), and pathogenicity was predicted using the in silico algorithms SIFT (https://sift.bii.a-star.edu.sg/sift-bin/), PolyPhen-2 (http://genetics.bwh.harvard.edu/cgi-bin/pph2), and MutationTaster (http://www.mutationtaster.org/cgi-bin/). Diagnostic variants were defined as pathogenic or likely pathogenic according to the ACMG guidelines and also included variants of uncertain significance (VUS) in known disease-causing genes after discussion combining genotype and phenotype. Evidence for disease causality was assessed using ClinVar (https://www.ncbi.nlm.nih.gov/clinvar/), the Human Gene Mutation Database (HGMD; http://www.hgmd.cf.ac.uk/ac/all.php), and a manual review of primary literature. After AVPR2 or AQP2 mutations were identified, appropriate genetic testing and counseling were offered to symptomatic individuals, high-risk siblings, and offspring.
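The annotation and filtering step described above can be illustrated with a minimal sketch. It is not the pipeline used in the study: the `Variant` record, the `is_candidate` helper, the allele-frequency ceiling, and the SIFT and PolyPhen-2 cutoffs are all assumptions chosen for illustration (SIFT scores below 0.05 and PolyPhen-2 scores above 0.85 are commonly used conventions, not values stated in the text).

```python
from dataclasses import dataclass
from typing import Optional

# Genes of interest for congenital NDI, as named in the text.
NDI_GENES = {"AVPR2", "AQP2"}


@dataclass
class Variant:
    """Hypothetical annotated variant record (field names are assumptions)."""
    gene: str
    protein_change: str
    gnomad_af: Optional[float]   # None when the variant is absent from gnomAD
    sift: Optional[float]        # lower score = more likely deleterious
    polyphen2: Optional[float]   # higher score = more likely damaging


def is_candidate(v: Variant,
                 max_af: float = 1e-4,          # assumed rarity ceiling
                 sift_cutoff: float = 0.05,     # assumed conventional cutoff
                 polyphen_cutoff: float = 0.85  # assumed conventional cutoff
                 ) -> bool:
    """Keep rare variants in NDI genes that at least one predictor flags."""
    if v.gene not in NDI_GENES:
        return False
    if v.gnomad_af is not None and v.gnomad_af > max_af:
        return False
    if v.sift is None and v.polyphen2 is None:
        # No missense scores (e.g. truncating variants): keep for manual review.
        return True
    sift_hit = v.sift is not None and v.sift < sift_cutoff
    pp2_hit = v.polyphen2 is not None and v.polyphen2 > polyphen_cutoff
    return sift_hit or pp2_hit


if __name__ == "__main__":
    # Entirely made-up records, used only to exercise the filter.
    demo = [
        Variant("AVPR2", "p.Example1", None, 0.01, 0.99),
        Variant("AQP2", "p.Example2", 0.02, 0.40, 0.10),
    ]
    for v in demo:
        print(v.gene, v.protein_change, is_candidate(v))
```

A filter of this kind only shortlists variants; in the study, classification still relied on the ACMG guidelines, ClinVar, HGMD, and manual literature review.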

Protein Structural Analysis
The structural analysis of AVPR2 and AQP2 was performed using PDB accessions 6U1N and 4NEF, respectively. The Protter online software (https://wlab.ethz.ch/protter/start/) was used for the integrated visual analysis of membrane proteins (8), and residue depth was calculated using the DEPTH server (http://cospi.iiserpune.ac.in/depth/) (9). Population variation in AVPR2 and AQP2 was investigated using gnomAD and was presumed to be largely depleted of pathogenic variants for rare childhood disease. According to the estimated prevalence of pediatric diabetes insipidus in males, variants with an allele frequency above 8.8/1,000,000 were classified as population single-nucleotide polymorphisms (SNPs) (2). The number of residues involved in variants within the functional domains was compared between population SNPs and disease-causing mutations reported in HGMD and ClinVar (Supplementary Table 2).
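The allele-frequency rule above can be written as a short sketch. The threshold of 8.8 per 1,000,000 comes from the text; the function name and the use of allele count and allele number as inputs are assumptions for illustration, as are the example counts.

```python
# Threshold from the text: allele frequency above 8.8 per 1,000,000.
POPULATION_AF_THRESHOLD = 8.8 / 1_000_000


def is_population_snp(allele_count: int, allele_number: int) -> bool:
    """Return True when the observed allele frequency exceeds the threshold.

    allele_count  -- number of alternate alleles observed (hypothetical input)
    allele_number -- total number of alleles genotyped at the site
    """
    if allele_number <= 0:
        raise ValueError("allele_number must be positive")
    return allele_count / allele_number > POPULATION_AF_THRESHOLD


if __name__ == "__main__":
    # Made-up counts: 1/200,000 = 5e-6 (below), 3/200,000 = 1.5e-5 (above).
    print(is_population_snp(1, 200_000))
    print(is_population_snp(3, 200_000))
```

Variants below the threshold are simply not labeled as population SNPs by this rule; their interpretation still depends on the other evidence described in this section.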

Patient Characteristics
A total of 10 boys from nine families with registry information in the CCGKDD database from 2014 to 2019 were enrolled in this study. Patient characteristics are shown in Table 1. The median age at clinical diagnosis was 1.0 months (IQR, 0.16-18). Five patients were clinically diagnosed during the neonatal period, whereas three children had delayed diagnoses and were clinically diagnosed only after 2 years of age. In one of these three children, the AQP2 mutation was not identified until he had developed stage 3 chronic kidney disease (CKD) at 14 years old. Polyuria-polydipsia and hypernatremia were present in all 10 probands. Fever, vomiting, and anorexia were part of the initial presentation in four of the patients. Five families had multiple affected members, and only one family applied for early genetic screening of a newborn baby because of siblings affected with NDI. Five of the patients underwent water restriction tests prior to genetic analysis, and eight patients underwent cranial MRI as part of their diagnostic evaluation, with no reported abnormalities. Hydronephrosis was found in four children. None of the patients had hypercalciuria or abnormal renal function on serum chemistry analysis at initial presentation.
We also detected seven female relatives with heterozygous mutations and two maternal uncles with hemizygous mutations in the AVPR2 gene. All relatives reported multiple episodes of mild or moderate fever during childhood, without details that would distinguish dehydration fever from infectious disease. Among them, three had suffered from polydipsia and polyuria since childhood, and four female individuals presented with partial or subclinical NDI. However, no further examination has been done. Additionally, one maternal uncle was aware that he had a lack of thirst despite a large fluid intake.

Variant Analysis and Comparative Protein Modeling
The AVPR2 protein is a typical seven-membrane-spanning helix G protein-coupled receptor (GPCR) localized at the basolateral plasma membrane of the principal cells of the kidney collecting duct (14). In the available literature, we found 169 reported missense mutations in AVPR2 in patients with NDI, and a further 109 population missense variants were retrieved from the gnomAD database. Disease-causing and population variants were scattered throughout the different structural domains of the protein, with significant enrichment of disease-causing variants in the transmembrane region (Fisher test, p = 0.001, Figures 1A,C). The residue depth of the disease-causing mutations was significantly higher than that of the population SNPs and that of all variants from gnomAD (Wilcoxon-Mann-Whitney test, p = 2e-8 and p = 2e-5, respectively, Figures 2A,B).
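As an illustration of the enrichment comparison, the sketch below implements a two-sided Fisher exact test on a 2x2 table (variant class by location inside or outside the transmembrane region). It is a generic textbook implementation, not the authors' code, and the table in the usage example is a toy table, not the study's counts, which are not given in this section.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact test for the 2x2 table [[a, b], [c, d]].

    Rows could be disease-causing vs population variants, and columns
    inside vs outside the transmembrane region (an assumed layout).
    The p-value sums the hypergeometric probabilities of all tables with
    the same margins that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x: int) -> float:
        # Probability of x counts in the top-left cell given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))  # tolerance for float ties
    return min(1.0, total)


if __name__ == "__main__":
    # Toy table only: 3 of 4 "cases" and 1 of 4 "controls" in the region.
    print(round(fisher_exact_two_sided(3, 1, 1, 3), 4))  # 0.4857
```

With real data, the four cells would be filled from the per-domain variant counts summarized in the figures and supplementary tables.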
Structural analysis of the AVPR2 protein revealed the presence of hemizygous mutations in eight patients (Figure 1A). The three missense mutations of AVPR2 (p.L62P, p.A165D, and p.S167L) produce full-length misfolded proteins with the affected residues located within the transmembrane domain. The protein is mostly retained in the endoplasmic reticulum (ER) by the ER quality control machinery and is targeted for proteasomal degradation. The deletion mutation p.F77del removes a residue in the first intracellular loop (located in a well-conserved region), likely causing a defective protein. The small deletion p.S331Rfs*25 caused a frameshift with a premature stop codon. This deficiency of the C-terminal tail of AVPR2 results in the loss of β-arrestin binding to the phosphorylated tail of AVPR2 and of subsequent receptor internalization, which would normally allow for the engagement of G protein with the core of AVPR2. The formation of the AVPR2-G protein-β-arrestin megaplex is required for the active signals leading to sustained endosomal cAMP generation (16). The gross deletion of exon 2 led to truncated proteins, which are often rapidly degraded.
The AQP2 gene is located in the 12q13 region and codes for the 271-amino-acid AQP2 protein, a type IV-A transmembrane protein characterized by six transmembrane domains connected by five loops and intracellular N- and C-termini (15). A total of 53 missense mutations and 9 small deletions in AQP2 have been reported, and a further 70 population missense variants were retrieved from the gnomAD database. Disease-causing and population variants were scattered throughout the different structural domains of the protein, with no significant enrichment of pathogenic variants in the transmembrane regions (Fisher test, p = 0.079, Figure 1B). However, five of the nine small deletions in AQP2 affected residues 223 to 271 in the C-terminal cytoplasmic region. There was a significant enrichment of pathogenic variants with autosomal dominant (AD) inheritance in the C-terminal region (Pearson test, p = 0.0001, Figure 1D). The residue depth of the pathogenic variants was significantly higher than that of the population SNPs and that of all variants from gnomAD (Wilcoxon-Mann-Whitney test, p = 2e-7 and p = 2e-5, respectively, Figures 2C,D).
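The residue-depth comparison can be sketched as a Wilcoxon-Mann-Whitney (rank-sum) test. The version below is a generic implementation using average ranks for ties and a normal approximation without continuity correction; it is an illustrative assumption, not the software used in the study, and the depth values in the usage example are invented.

```python
from math import erfc, sqrt
from typing import Sequence, Tuple


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (U statistic for x, two-sided p-value by normal approximation)."""
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(combined)
    tie_term = 0
    i = 0
    while i < len(combined):
        # Find the run of equal values starting at position i.
        j = i
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    rank_sum_x = sum(r for r, (_, grp) in zip(ranks, combined) if grp == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    n = n1 + n2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u <= 0:
        return u_x, 1.0  # all values tied: no evidence of a shift
    z = (u_x - mean_u) / sqrt(var_u)
    return u_x, erfc(abs(z) / sqrt(2))


if __name__ == "__main__":
    # Invented depths (in angstroms) for two small groups of residues.
    pathogenic_depth = [7.9, 8.4, 9.1, 6.8, 10.2]
    population_depth = [4.1, 5.0, 3.8, 6.1, 4.6]
    u, p = mann_whitney_u(pathogenic_depth, population_depth)
    print(u, p)
```

For small samples an exact p-value would be preferable to the normal approximation; the approximation is used here only to keep the sketch short.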
Structural analysis of the AQP2 protein revealed the presence of mutations in two patients (Figure 1B). The homozygous mutation p.R187C affected an amino acid in the selectivity filter region of the water conduction pore, which determines the transport specificity (17). ER accumulation of these AQP2 mutants has been shown in another study (4). The C-terminal truncating mutation p.R267fs*66 could result in heterotetramers formed by wild-type and mutated AQP2 monomers that are either retained in the Golgi apparatus or misrouted to late endosomes, lysosomes, or the basolateral membrane. This type of mutation is inherited as a dominant trait in NDI (18).

Treatment Regimens
All the patients diagnosed with NDI as neonates required emergency treatment of hypernatremic dehydration in this study. Most patients received isotonic or hypotonic rehydration before being clinically diagnosed with NDI. For example, in case #3, the plasma sodium level increased from 153 to 161 mmol/L after the neonate was rehydrated with intravenous 0.19% saline. A tonicity balance can easily demonstrate the excess NaCl administered with 0.19% saline (Figures 3A,B). In this case, the appropriate fluid, 5% glucose, was given to replace the free water loss; it was prescribed at maintenance fluid rates appropriate to his age and size and adjusted according to the plasma sodium. Breastfeeding was preferable to formula feeding for its hypotonic advantage.
Most of the patients in this study started conventional treatment with thiazides after the clinical diagnosis of NDI, followed by non-steroidal anti-inflammatory drugs (NSAIDs). Thiazides were prescribed in five patients, whereas thiazides and NSAIDs were prescribed in four of the patients. During the follow-up period, NSAIDs were discontinued because of general concerns about long-term use in three of the patients and increased serum creatinine levels in one of the patients. The median serum sodium at initial treatment was 154 mmol/L (IQR, 145-161) and was reduced to 144 mmol/L (IQR, 140-145) at the last follow-up, at a median age of 5 years (IQR, 2.6-8.3) (Figure 3C). At the time of last follow-up, only one case (#2), with an AQP2 mutation, had developed stage 3 CKD. It is noteworthy that the father and maternal grandfather of case #2 passed away with renal failure.
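The tonicity balance mentioned above can be sketched numerically. The code below is a simplified illustration, not the calculation behind Figure 3: it assumes that plasma sodium tracks total body cations divided by total body water, ignores potassium in the infusate and insensible losses, and uses an invented neonate (body water, volumes, and urine cation concentration are all hypothetical).

```python
NACL_MOLAR_MASS_G_PER_MOL = 58.44


def saline_sodium_mmol_per_l(percent_w_v: float) -> float:
    """Sodium concentration (mmol/L) of a NaCl solution given as % w/v."""
    grams_per_litre = percent_w_v * 10.0
    return grams_per_litre / NACL_MOLAR_MASS_G_PER_MOL * 1000.0


def predicted_plasma_sodium(p_na: float, tbw_l: float,
                            in_vol_l: float, in_cation_mmol_l: float,
                            out_vol_l: float, out_cation_mmol_l: float) -> float:
    """Predict plasma sodium after a period of known input and urine output.

    p_na              -- starting plasma sodium (mmol/L)
    tbw_l             -- total body water (L), assumed known
    in_*              -- infused volume and its Na+ (plus K+) concentration
    out_*             -- urine volume and its Na+ plus K+ concentration
    """
    net_cations = in_vol_l * in_cation_mmol_l - out_vol_l * out_cation_mmol_l
    net_water = in_vol_l - out_vol_l
    return (p_na * tbw_l + net_cations) / (tbw_l + net_water)


if __name__ == "__main__":
    # Hypothetical neonate: 2.25 L body water, plasma sodium 153 mmol/L,
    # 0.5 L infused and 0.5 L of very dilute urine (10 mmol/L Na+ plus K+).
    na_019 = saline_sodium_mmol_per_l(0.19)
    with_saline = predicted_plasma_sodium(153, 2.25, 0.5, na_019, 0.5, 10)
    with_dextrose = predicted_plasma_sodium(153, 2.25, 0.5, 0.0, 0.5, 10)
    print(round(na_019, 1), round(with_saline, 1), round(with_dextrose, 1))
```

Under these assumptions, even 0.19% saline delivers more sodium than the dilute urine removes, so the predicted plasma sodium rises, whereas an electrolyte-free fluid lets it fall; this is the qualitative point the case illustrates.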

DISCUSSION
This study reported the genetic spectrum and treatment approaches in a cohort of children with NDI registered in the Chinese multicenter database. We reported eight cases from seven families presenting with X-linked NDI (AVPR2), one case presenting with autosomal recessive NDI (AQP2), and one case presenting with autosomal dominant NDI (AQP2).
Although the pathophysiology and molecular diagnosis of congenital polyuric states have been well-established (19), we still encountered cases where the diagnoses were late and where inappropriate diagnostic testing and treatments were performed. Optimizing the diagnostic strategy, especially with regard to genetic analysis, was one of our top priorities. It has been well-established that the heterozygous loss-of-function variants of AVPR2 or AQP2 are common causes of congenital NDI (2,3,20). In order to elucidate the pathogenicity of variants of AVPR2 or AQP2, we compared population SNPs with pathogenic mutations from the HGMD. Although the existence of a single population variant does not rule out pathogenicity, it is unlikely that the observed population variants of AVPR2 or AQP2 are pathogenic, since severe early-onset childhood disorders have specifically been excluded from gnomAD. Therefore, we evaluated the variants in the 3D domain structure encoded by AVPR2 or AQP2 to determine positional correlation with pathogenicity. There was notable clustering in the AVPR2 3D structure, with pathogenic mutations more likely to be within the transmembrane region. Such clustering in the transmembrane region was not shown for AQP2. An enrichment of diagnostic mutations with autosomal dominant (AD) inheritance was found in the C-terminal region of AQP2. Nonetheless, the pathogenic missense mutations of AVPR2 or AQP2 were significantly more likely to be located within the domain. Systematic analysis of the protein structure and variants allowed us to make strong predictions about likely pathogenic variations in both AVPR2 and AQP2. One of the limitations of our study is the lack of functional studies, especially in the case of the novel variants detected in the NDI cohort. Thus, the mutations identified in our study were considered presumably disease-causing on the basis of the protein structural analysis and in silico predictions.
As next-generation sequencing is increasingly applied in both research and clinical settings, more and more variants will be discovered in known disease-causative genes as well as in novel genes. Although in silico predictions alone should not be relied on as the sole basis to determine the clinical significance of variants in proteins, we hope that the findings of this study provide useful structural evidence for variant interpretation. Moreover, combining clinical and population genetics with protein structural analysis offers widely applicable in silico methods for improving the clinical interpretation of novel missense variations.
In patients with NDI, high fluid intake is necessary to avoid hypernatremic dehydration, which can otherwise result in permanent neurologic complications. Most emergency protocols suggest an initial treatment with 0.9% saline (21). However, the situation with NDI is different because of the ongoing loss of pure water into urine. We described a clinical case in which an infusion of 0.19% saline resulted in excess sodium chloride administration and thus worsened the hypernatremia. Thus, children with NDI should be treated with hypotonic fluids, either enterally with water or milk as shown in case #3, or, if need be, intravenously with 5% dextrose in water (6,22). The diagnosis of congenital NDI must be neither missed nor misunderstood, as this would lead to dangerous mistreatment. The early identification of NDI through phenotypes and genotypes is important, as is treatment optimization.
Hypernatremia in children with NDI will induce a strong thirst behavior. When asked, the parents of patients with NDI often described the typical attributes of an extremely thirsty child, one who avidly drinks large amounts of water and often vomits afterward. In our cohort, five neonates were diagnosed early with symptoms including fever, vomiting, and anorexia. Five families had multiple members previously diagnosed with NDI; however, only one family had applied for early genetic screening for their newborn due to having siblings already diagnosed with NDI. In addition to the probands, we identified a further three males diagnosed with NDI and four female individuals who presented with partial or subclinical NDI by sequencing AVPR2 and AQP2. Further examination found that those affected had frequent episodes of mild or moderate fever during childhood and a lack of thirst in adulthood. Confirming the clinical diagnosis of NDI with genetic screening allows for the early diagnosis and management of at-risk members of families with identified mutations (6,19,23). NDI is a rare disease and does not prominently feature on the diagnostic radar of frontline medical staff. Medical attention to undiagnosed children with a fever of unknown etiology or large urine output, combined with the finding of hypernatremia with inappropriately dilute urine, is sufficient to trigger the diagnosis of NDI. For families with suspected NDI, appropriate genetic testing and counseling should be offered to symptomatic individuals, high-risk siblings and offspring, and pregnant women.
Our study had several limitations. First, skewed X-chromosome inactivation was not checked in the female carriers screened in our study. It has been shown that NDI occurs in about 25-50% of female carriers (2,3). Additionally, data were collected retrospectively from the registry system, and we were not able to obtain complete data on all of the patients.
We described the complications and treatment approaches of NDI during a median follow-up period of 5 years. Urological complications such as hydronephrosis and urinary incontinence were noted in 4 of 10 individuals. Unfortunately, there were no details on nutrition, growth, and mental development in our registry. It has been reported that the long-term morbidities caused by NDI include primary nocturnal enuresis (44%), persistent small stature (38%), urologic complications (37%), persistent failure to thrive (29%), and stage 2 or greater CKD (30%) (23). Here, we reported a single patient with delayed diagnosis who later developed stage 3 CKD.
In conclusion, newborns and young children with polyuria symptoms should be immediately referred to specialized centers with experience in treating hypernatremic dehydration and the ability to rapidly obtain genetic analyses and provide a clinical diagnosis.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available from the corresponding author upon reasonable request. The datasets presented in this article are not readily available because of a regulation on the management of human genetic resources from the State Council of China. Requests to access the datasets should be directed to the database for Chinese children with renal disease, which is publicly available in Chinese (https://www.ccgkdd.com.cn/).

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by The Institutional Review Board (IRB) of Children's Hospital of Fudan University (No. 2018_286). Written informed consent to participate in this study was provided by the participants' legal guardian/next of kin. The parents of the children and adult patients described in this article provided consent for participation in the study and for publishing the obtained results.

AUTHOR CONTRIBUTIONS
JR designed and supervised the study and wrote the manuscript. PLL and HXL performed clinical examinations, collected blood samples, and wrote the clinical part of the manuscript. TCX, YF, and JR performed the bioinformatic evaluation and protein structural analysis. JR, HX, and DM critically revised the manuscript. All the authors contributed to the clinical information and registry database.