Genetic and Phenotypic Characteristics of Congenital Hypothyroidism in a Chinese Cohort

Background: The molecular etiology and the genotype–phenotype correlation of congenital hypothyroidism (CH) remain unclear.
Methods: We performed genetic analysis in 42 newborns with CH using whole-exome sequencing. Patients were divided into a single-gene group and a multi-gene group according to the number of affected genes, and separately into a monoallelic group, a biallelic group, and an oligogenic group according to the pattern of the detected variants. The clinical characteristics were compared between groups.
Results: Thyroid dysgenesis (TD) was observed in 10 patients and goiter in 5 patients, whereas 27 patients had normal-sized gland-in-situ (GIS). We identified 58 variants in five genes in 29 patients. The most frequently affected gene was DUOX2 (70.7%), followed by TSHR (12.1%), DUOXA2 (10.3%), and TPO (5.2%). Variants in the genes causing dyshormonogenesis (DH) were more common than those in the genes causing TD (87.9% versus 12.1%). Among the patients with detected variants, 26 (89.7%) harbored variants in a single gene (single-gene group), including 22 patients with biallelic variants (biallelic group) and four patients with monoallelic variants (monoallelic group). Three (10.3%) patients harbored variants in two or three genes (multi-gene group or oligogenic group). Compared with the single-gene group, the levothyroxine (L-T4) dose at 1 year of age was higher in the multi-gene group (p = 0.018). A controllable reduction in the L-T4 dose was observed in 25% of patients in the monoallelic group and 59.1% of patients in the biallelic group; however, no patient in the oligogenic group showed such a reduction.
Conclusions: Patients with normal-sized GIS accounted for the majority of our cohort. Genetic defects in the genes causing DH were more common than those in the genes causing TD, with biallelic variants in DUOX2 being dominant. DH might be the leading pathophysiology of CH in Chinese individuals.


INTRODUCTION
Congenital hypothyroidism (CH) is the most common neonatal endocrine disorder, with an incidence of approximately 1/2,000 to 1/4,000 newborns (1). Delayed treatment of CH might result in profound neurodevelopmental delay. Newborn screening (NBS) programs have been implemented to facilitate the prompt diagnosis of CH, and levothyroxine (L-T4) is used for its treatment. However, the etiology of CH remains unclear.
To date, more than 20 genes have been reported to be involved in the pathogenesis of primary CH (2), and new genetic defects continue to be identified through efficient genetic approaches, such as target region sequencing and whole-exome sequencing (WES) (6). However, these reported genes cannot fully explain the molecular etiology of CH. In the present study, we screened for variants using WES and collected clinical data in a cohort of Chinese patients with CH in order to analyze the relationship between genotype and phenotype. By applying WES across the whole cohort, the study evaluates genotype-phenotype correlations without restricting the analysis to a predefined gene panel.

MATERIALS AND METHODS
Patients
Changzhou Maternal and Child Health Care Hospital is the only NBS institution in Changzhou. Participants were patients who were diagnosed and treated at this hospital. All participated voluntarily after informed consent was obtained. CH was diagnosed based on elevated serum thyroid-stimulating hormone (TSH; ≥9 mIU/L) and low free thyroxine (FT4; <7.77 pmol/L) in newborns who were positive in the NBS program (heel blood TSH, ≥9.0 mIU/L). The levels of serum TSH and FT4 were determined by electrochemiluminescence immunoassay using the COBAS e601 analyzer (Roche Diagnostics, Mannheim, Germany). The levels of heel blood TSH were measured by time-resolved fluoroimmunoassay using the Wallac 1235 AutoDELFIA (Perkin Elmer, Waltham, MA, USA). Following confirmation of the diagnosis, the patients were immediately administered L-T4 at an initial dose of 10-15 μg/kg/day. Patients with other congenital diseases or those whose mothers were diagnosed with Graves' disease were excluded from the study. The study design and protocol were approved by the Ethics Committee of the Nanjing Medical University (approval no. 2019-258). Written informed consent for participation in this study was provided by the participants' legal guardians.
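The diagnostic thresholds described above can be written as a simple decision rule. The following Python sketch is illustrative only: the function and constant names are hypothetical, and it assumes that a positive heel-blood screen must be followed by both serum criteria, as the text describes.

```python
# Illustrative sketch; thresholds are taken from the text, names are hypothetical.
HEEL_TSH_CUTOFF = 9.0    # mIU/L, heel-blood TSH at newborn screening
SERUM_TSH_CUTOFF = 9.0   # mIU/L, confirmatory serum TSH
SERUM_FT4_CUTOFF = 7.77  # pmol/L, confirmatory serum FT4


def screen_positive(heel_tsh: float) -> bool:
    """A newborn is screen-positive when heel-blood TSH is at or above the cutoff."""
    return heel_tsh >= HEEL_TSH_CUTOFF


def meets_ch_criteria(heel_tsh: float, serum_tsh: float, serum_ft4: float) -> bool:
    """Positive screen, plus elevated serum TSH and low serum FT4."""
    return (
        screen_positive(heel_tsh)
        and serum_tsh >= SERUM_TSH_CUTOFF
        and serum_ft4 < SERUM_FT4_CUTOFF
    )
```

Keeping the cutoffs as named constants makes it easy to see that the TSH criteria are inclusive (≥) whereas the FT4 criterion is strict (<).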

Collection of Clinical Data and Blood Samples
All newborns with CH were treated and followed up at the Department of Medical Genetics of Changzhou Maternal and Child Health Care Hospital. The levels of heel blood TSH at screening and those of serum TSH and FT4 at diagnosis were recorded in the electronic NBS information system. All participants were followed up until the conclusion of the present study. The administered dose of L-T4 during treatment was managed and recorded by a clinician. Patients with a controllable reduction of the L-T4 dose were defined as those whose thyroid hormone levels were normal within 2 months of treatment and whose L-T4 doses were reduced or unchanged compared with the previous follow-up. The reference ranges for the treatment of CH are shown in Table 1. Thyroid morphology was determined using ultrasound scanning. The methods and reference values of thyroid volumes were based on a published study on Chinese newborns (7). Venous blood was sampled from the proband and the proband's parents and then stored in an ultra-low-temperature refrigerator.
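The definition of a controllable reduction can be sketched as a small predicate. This is one possible reading, not the authors' implementation: it assumes that "within 2 months" can be approximated as 60 days and that the dose criterion means the dose never increased from one follow-up to the next. The function and parameter names are hypothetical.

```python
from typing import Sequence


def has_controllable_reduction(
    days_to_normal_hormones: float,
    doses_by_visit: Sequence[float],
    max_days: float = 60.0,  # assumption: "2 months" approximated as 60 days
) -> bool:
    """Return True when hormones normalized in time and the dose never rose.

    doses_by_visit holds the L-T4 dose recorded at consecutive follow-ups,
    oldest first. Each dose must be lower than or equal to the one before it.
    """
    if days_to_normal_hormones > max_days:
        return False
    return all(
        later <= earlier
        for earlier, later in zip(doses_by_visit, doses_by_visit[1:])
    )
```

Under this reading, a dose series such as 37.5, 25.0, 25.0 qualifies, whereas any increase between consecutive visits does not.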

Whole-Exome Sequencing
WES and variant analysis were performed in the Department of Medical Genetics and Molecular Diagnostic Laboratory of Shanghai Children's Medical Center. Briefly, genomic DNA was extracted from blood samples of patients and their parents using the Gentra Puregene Blood Kit (Qiagen, Hilden, Germany) according to the manufacturer's protocol. Whole-exome capture was performed using an Agilent SureSelect V6 enrichment capture kit (Agilent Technologies, Inc., Woburn, MA, USA) according to the manufacturer's instructions. The captured library was sequenced using an Illumina HiSeq 2500 System (Illumina, Inc., San Diego, CA, USA). Raw sequencing data were assessed using FastQC (version 0.11.2) for quality control. The Burrows-Wheeler Alignment (BWA) tool (v.0.2.10) was used to align the sequencing data to the Genome Reference Consortium Human Build 37 (GRCh37/hg19). Single nucleotide variants and small indels were identified using the Genome Analysis Toolkit (GATK). All variants were saved in VCF format and uploaded to Ingenuity Variant Analysis (Ingenuity Systems, Redwood City, CA, USA) and the TGex (Translational Genomics Expert) platform for biological analysis and interpretation. Variants were classified according to the guidelines of the American College of Medical Genetics and Genomics (ACMG) (8). Variants classified as benign or likely benign were excluded, with the exception of DUOX2 p.His678Arg, which was retained because it has been reported as a functional SNP (9). Variants detected by WES were confirmed by Sanger sequencing in each patient and the parents.
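The final filtering rule (drop benign and likely benign variants, but always keep DUOX2 p.His678Arg) can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the `Variant` record and its field names are hypothetical, and the ACMG class is assumed to be available as a plain text label.

```python
from dataclasses import dataclass

# ACMG classes that the filter excludes.
EXCLUDED_CLASSES = {"benign", "likely benign"}
# Functional SNP retained regardless of its classification (from the text).
ALWAYS_KEEP = {("DUOX2", "p.His678Arg")}


@dataclass(frozen=True)
class Variant:
    gene: str
    protein_change: str
    acmg_class: str  # hypothetical label, e.g. "pathogenic" or "benign"


def keep_variant(v: Variant) -> bool:
    """Apply the exception first, then the benign/likely benign exclusion."""
    if (v.gene, v.protein_change) in ALWAYS_KEEP:
        return True
    return v.acmg_class.strip().lower() not in EXCLUDED_CLASSES


def filter_variants(variants):
    """Return only the variants that pass the filter, preserving order."""
    return [v for v in variants if keep_variant(v)]
```

Checking the exception before the exclusion is the important design point: a retained functional SNP would otherwise be removed if its formal class were benign or likely benign.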

Patient Classification and Comparison of Characteristics
Patients were assigned to groups using two different strategies: 1) based on the number of affected genes, the single-gene group comprised cases with variants in a single gene, while the multi-gene group comprised cases with variants in multiple genes; 2) based on the pattern of the detected variants, the monoallelic group comprised cases with a single variant in a single gene, the biallelic group comprised cases with homozygous or compound heterozygous variants in a single gene (as determined by analysis of the parents), and the oligogenic group comprised cases with two or more variants in different genes. The clinical characteristics, including heel blood TSH at NBS, serum TSH and FT4 at diagnosis, the initial L-T4 dose, the L-T4 dose at 1 year of age, the current L-T4 dose, and thyroid morphology, were compared among the groups.
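The two grouping strategies can be expressed as a single function that returns both labels for a patient. This is a sketch under stated assumptions: each patient is represented by a hypothetical mapping from gene symbol to the number of alleles (1 or 2) carrying variants after parental phasing, and several variants on the same allele of one gene are counted as one affected allele.

```python
from typing import Dict, Tuple


def classify_patient(alleles_by_gene: Dict[str, int]) -> Tuple[str, str]:
    """Return (gene-count group, variant-pattern group) for one patient.

    alleles_by_gene maps a gene symbol to the number of alleles carrying
    variants (1 or 2), as resolved by analysis of the parents.
    """
    affected = {gene: n for gene, n in alleles_by_gene.items() if n > 0}
    if not affected:
        raise ValueError("no variants detected; patient is not grouped")
    if any(n not in (1, 2) for n in affected.values()):
        raise ValueError("allele count per gene must be 1 or 2")
    if len(affected) > 1:
        # Variants in different genes: multi-gene and oligogenic coincide.
        return ("multi-gene", "oligogenic")
    (n_alleles,) = affected.values()
    pattern = "biallelic" if n_alleles == 2 else "monoallelic"
    return ("single-gene", pattern)
```

For example, `classify_patient({"DUOX2": 2})` yields `("single-gene", "biallelic")`, and `classify_patient({"DUOX2": 1, "TSHR": 1})` yields `("multi-gene", "oligogenic")`.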

Statistical Analysis
Non-normally distributed quantitative variables were summarized as the median and interquartile range (IQR). Categorical data were summarized as numbers and percentages. The Mann-Whitney U test was used for comparisons between two groups and the Kruskal-Wallis test for comparisons among more than two groups, with Dunn's test for pairwise comparisons between subgroups. The chi-square test was used for categorical variables, whereas Fisher's exact test was used if an expected cell count was less than five. Differences were considered statistically significant at a two-sided p-value of <0.05. All statistical tests were performed using R version 3.6.3.
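The rule for choosing between the chi-square test and Fisher's exact test can be illustrated for a 2 x 2 table. The study used R; the Python sketch below is only an illustration of the rule, with hypothetical function names, and it uses the common two-sided definition of Fisher's test (summing the probabilities of all tables no more likely than the observed one). The example table uses the counts of patients with and without a controllable dose reduction that follow from the percentages in the Results (14 of 26 in the single-gene group, 0 of 3 in the multi-gene group); any p-value computed from it is a demonstration, not a result reported by the study.

```python
from math import comb


def expected_counts(table):
    """Expected cell counts of a 2 x 2 table under independence."""
    (a, b), (c, d) = table
    n = a + b + c + d
    rows, cols = (a + b, c + d), (a + c, b + d)
    return [[r * k / n for k in cols] for r in rows]


def choose_test(table):
    """Fisher's exact test if any expected count is below five, else chi-square."""
    smallest = min(min(row) for row in expected_counts(table))
    return "fisher" if smallest < 5 else "chi-square"


def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact p-value for a 2 x 2 table."""
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Small tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Rows: single-gene, multi-gene; columns: reduction, no reduction.
example = [[14, 12], [0, 3]]
print(choose_test(example), round(fisher_exact_two_sided(example), 3))
```

For this table the smallest expected count is well below five, so the rule selects Fisher's exact test.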

RESULTS
Variant Frequencies in Participants
After consent was obtained from the guardians, 42 newborns with CH (21 males and 21 females) were enrolled in our study, including 40 non-consanguineous individuals and two siblings. The birth years of these participants ranged from 2011 to 2019. During this period, there were a total of 295,650 newborns, and 133 of them were diagnosed with CH. The average birth weight of the enrolled newborns was 3,346 g (range, 2,350 to 4,350 g), while the average gestational age was 39 weeks (range, 34+6 to 42 weeks). We observed TD in 10 patients and goiter in five patients, whereas 27 patients had normal-sized gland-in-situ (GIS).
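The screening figures above imply a local incidence that can be checked with a short calculation. The sketch below uses only the counts stated in the text; the function name is hypothetical.

```python
def incidence_one_in(cases: int, births: int) -> int:
    """Express incidence as '1 in N' births, rounded to the nearest integer."""
    return round(births / cases)


births, cases, enrolled = 295_650, 133, 42

print(f"CH incidence: 1 in {incidence_one_in(cases, births):,}")  # 1 in 2,223
print(f"Enrolled share of diagnosed cases: {100 * enrolled / cases:.1f}%")  # 31.6%
```

An incidence of about 1 in 2,223 lies within the range of 1/2,000 to 1/4,000 cited in the Introduction.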
In total, we identified 58 variants in five genes (DUOX2, TSHR, DUOXA2, TPO, and SLC26A4) in 29 of the participants (29/42, 69.0%). The positive rate of the variants was 75.0% (24/32) in patients with DH, whereas it was 50% (5/10) in patients with TD. The clinical characteristics and all variants are shown in Supplementary Table S1. Most of the identified variants were heterozygous, except for two homozygous DUOX2 mutations (i.e., p.Lys530* and p.Arg1110Gln) detected in two patients. Most variants were included in the databases or reported in previous studies, except for one novel heterozygous variant in TSHR (i.e., p.Ala579Val). Variants in the genes causing DH were more common than those in the genes causing TD (87.9% versus 12.1%). Among the 58 identified variants, the most frequently affected gene was DUOX2 (70.7%), followed by TSHR (12.1%), DUOXA2 (10.3%), and TPO (5.2%) (Figure 1). In addition, seven of these variants were detected in more than one patient: five DUOX2 variants, one DUOXA2 variant, and one TSHR variant (Table 2). These recurrent variants accounted for 53.4% (31/58) of the total variants, with the p.Lys530* and p.Arg1110Gln mutations in DUOX2 being the predominant sites in the present cohort.
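The reported percentages are mutually consistent, which can be verified by converting them back to counts. In the sketch below, the per-gene counts are not stated in the text; they are back-calculated from the reported percentages of 58 variants and should be read as an assumption. Treating TSHR as the only TD-related gene among the five follows from the 87.9% versus 12.1% split.

```python
# Assumption: counts inferred from the reported percentages of 58 variants.
variant_counts = {"DUOX2": 41, "TSHR": 7, "DUOXA2": 6, "TPO": 3, "SLC26A4": 1}
TD_GENES = {"TSHR"}  # the remaining four genes are treated as DH-related

total = sum(variant_counts.values())
share = {gene: round(100 * n / total, 1) for gene, n in variant_counts.items()}

td_count = sum(n for gene, n in variant_counts.items() if gene in TD_GENES)
dh_pct = round(100 * (total - td_count) / total, 1)
td_pct = round(100 * td_count / total, 1)

print(total, share)
print(f"DH genes: {dh_pct}%  TD genes: {td_pct}%")
```

With these counts the shares reproduce the reported 70.7%, 12.1%, 10.3%, and 5.2%, and the single SLC26A4 variant accounts for the remaining 1.7%.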

Characteristics in the Single-Gene and Multi-Gene Groups
Among the patients with detected variants, 65.5% (19/29) harbored two variants, whereas 20.7% (6/29) harbored three variants. Although multisite variants are common in patients with CH, we noticed that 87.5% (21/24) of our patients harbored variants in a single gene, whereas only two patients and one patient harbored variants in two and three genes, respectively. Based on the number of mutated genes, we classified the patients into two distinct groups: the single-gene group comprised 26 patients and the multi-gene group comprised three patients (Figure 2A and Table 3). More specifically, in the single-gene group, 20 cases had DUOX2 variants, three cases had DUOXA2 variants, and three cases had TSHR variants. The multi-gene group comprised three GIS cases: one case harbored variants in DUOX2 and TSHR, one case harbored variants in SLC26A4 and TPO, and one case harbored variants in DUOXA2, TPO, and TSHR (Table 3).
We subsequently analyzed the differences in the clinical data between the single-gene and multi-gene groups (Table 4). We did not observe any differences in thyroid morphology, heel blood TSH at screening, or TSH and FT4 at diagnosis (p > 0.05). We further collected data on the L-T4 dose, including the initial dose, the dose at 1 year of age, and the current dose. The dose at 1 year of age was significantly higher in the multi-gene group than in the single-gene group (p = 0.018). However, neither the initial dose nor the current dose differed significantly between the two groups. In addition, we identified a controllable reduction in the L-T4 dose in 53.8% of the patients in the single-gene group; however, no patient in the multi-gene group showed such a reduction.

Characteristics in the Monoallelic, Biallelic, and Oligogenic Groups
Our aforementioned results suggested that multisite variants in a single gene prevailed in patients with CH. Therefore, we analyzed the pattern of the detected variants in their parents using Sanger sequencing. Based on the pattern of the detected variants, we classified the patients into three groups. Patients with variants in a single gene were subdivided into a monoallelic and a biallelic group. As shown in Figure 2B, the biallelic group comprised 22 cases with several variants in a single gene, including homozygous or compound heterozygous variants. The monoallelic group included four cases with a single variant in a single gene. In the biallelic group, we found that 81.8% (18/22) of cases (16 cases of GIS, 4 cases of goiter, and 2 cases of TD) harbored variants in DUOX2, two GIS cases had DUOXA2 variants, and two TD cases had TSHR variants. In the monoallelic group, we detected two cases with DUOX2 variants, one GIS case with a DUOXA2 variant, and one TD case with a TSHR variant. Of note, the cases in the oligogenic group were the same as those in the multi-gene group (Table 3).
The levels of heel blood TSH at screening, TSH and FT4 at diagnosis, and the initial, 1-year, and current L-T4 doses in the biallelic, oligogenic, and monoallelic groups are shown in Table 5. Considering the small sample size of each group, we did not perform comparisons among the three groups. We observed a controllable reduction in the L-T4 dose in 25% (1/4) of patients in the monoallelic group and in 59.1% (13/22) of patients in the biallelic group; however, we did not observe such a reduction in any patient in the oligogenic group.

DISCUSSION
Our study demonstrated the existence of distinctive characteristics of genetic defects in different patients with CH. Historically, 75%-85% of CH cases have been attributed to TD, with the remainder occurring due to DH (10). However, patients with normal-sized GIS accounted for the majority of the cases in our cohort. Meanwhile, the frequency of genetic defects in the genes causing DH was higher than that in the genes causing TD, which was in agreement with previous studies in China and other Asian countries (11)(12)(13). These studies suggested that DH might be the leading pathophysiology of CH in Chinese populations, in contrast to what has been reported in other populations. We assumed that this difference might be a result of the genetic backgrounds of different ethnic populations.
In the present cohort, variants were most frequently identified in DUOX2, followed by TSHR and DUOXA2. The higher frequency of variants in DUOX2 was consistent with the observation of a higher proportion of DH cases in our participants. Variants in DUOX2 have been frequently reported, especially in East Asia (11)(12)(13)(14). Several studies have suggested that the most reported variants among Chinese, Japanese, and Thai patients with CH have been identified in DUOX2 (12,13,15), suggesting that DUOX2 variants are an even more frequent causative factor for CH than previously recognized. In particular, the p.Lys530* and p.Arg1110Gln variants have been reported in Asian populations, including Chinese (16)(17)(18), Japanese (19,20), Korean (21), and Malaysian (22) patients. The p.Lys530* and p.Arg1110Gln variants in DUOX2 were also identified as the most frequent sites with a rate of 19% in this study population.
Although multisite variants are known to be common in patients with CH, in our cohort, most multisite variants were detected in a single gene, i.e., DUOX2. By detecting the respective variants in the parents of probands, we identified both monoallelic and biallelic DUOX2 variants; the proportion of biallelic DUOX2 variants was overwhelmingly higher. Monoallelic mutations are thought to confer phenotypes due to haploinsufficiency (23). Monoallelic DUOX2 mutations have been reported to result in mild transient CH (16,24,25), whereas biallelic DUOX2 mutations cause severe permanent CH. However, subsequent studies have reported biallelic mutations in patients with transient CH (19,26) and monoallelic mutations in patients with permanent CH (17,27). Moreover, patients with DUOX2 mutations have been gradually recognized as exhibiting phenotypic heterogeneity (28,29). In our previous study, we confirmed that the number of DUOX2 variants could not be used to predict the transient or permanent outcomes (30). However, the correlation between variants and hormone characteristics or L-T4 dose remains unclear. Therefore, in the present study, the levels of hormones and the dose of L-T4 were used to analyze the correlation between variants and the phenotypic characteristics.
Application of next-generation sequencing (NGS) revealed that a significant proportion of patients harbored variants in more than a single gene (31)(32)(33). Based on these findings, the oligogenic model, in which combinations of mutant genes contribute to the pathogenesis of CH, was proposed. In particular, oligogenicity implies that the pathogenesis of CH could be attributed to the sum of mutations. We detected multi-gene variants, namely, oligogenic variants, in 7.1% of patients in our cohort. Among these patients, patient 14 harbored oligogenic variants on one allele. Because the mother of patient 14, who carries the same variants, showed symptoms of hypothyroidism, we kept patient 14 in the follow-up analysis. To date, data on the detection rates of oligogenic variants have rarely been reported. In a study in the Japanese population, oligogenic variants were identified in 18.0% of patients with CH using targeted NGS of 24 causative genes (15). Another study in the Italian population reported that oligogenic defects were found in 26.2% of the cohort using targeted NGS of 11 causative genes (34). Given that differences in ethnic populations have been shown to result in differences in the mutational spectrum, we speculated that the oligogenicity characteristics might be distinctive among different populations.
The oligogenic model has been suggested as the genetic etiology of CH, on the basis that it provides a suitable explanation for the complex forms of inheritance and the variable expressivity of mutations in CH (31,35). More specifically, the coexistence of mutations in multiple genes might contribute to the severity of the hypothyroid condition and lead to great genotype-phenotype variability (15,36). However, this hypothesis has not been fully clarified yet. To address this, we compared the clinical data between the single-gene and multi-gene groups. We did not observe any differences in the levels of heel blood TSH at NBS or those of serum TSH and FT4 at diagnosis. We also analyzed the differences in the L-T4 treatment doses at different stages (initial, 1 year of age, and current). Only the L-T4 treatment dose at 1 year of age was significantly higher in the multi-gene group. Furthermore, we divided the single-gene group into monoallelic and biallelic groups according to the pattern of variants. Considering the small sample size of each group, we did not perform statistical analysis among the three groups. During treatment, the L-T4 dose could be easily adjusted in some patients to keep the thyroid hormones in the normal range, whereas this was difficult in others. We therefore asked whether this difference is related to the genotype. Regarding the change in the treatment dose, we noticed that no patient in the oligogenic group had a controllable reduction. However, because of the small sample size in this study, it is still uncertain whether patients with oligogenic variants require relatively high doses of L-T4. We will continue to focus on the study of CH.
Previous studies have suggested a detection rate of approximately 20% for CH candidate genes in Japanese and Czech cohorts with CH (15,37). Using efficient sequencing methods, such as multiplex PCR and targeted NGS, recent studies have reported an increased detection rate of CH candidate genes (14,16,33), especially in patients with DH (38). However, the detection rate of variants in TD cases has been shown to be only approximately 5% (3,39). Unlike TD, DH appears to have a detectable genetic basis in many cases (40,41). Using WES, the variant detection rate was 69.0% in our cohort. This was consistent with the results of previous studies in the Chinese population using targeted NGS (ranging from 52% to 65%) (14,33). Efficient sequencing methods and ethnic differences might have contributed to the high detection rates in Chinese cohorts (41). In addition, some ambiguous variants were confirmed as disease-causing variants. Advances in the study of CH candidate genes help to increase the detection rate of variants in later research.
Our results might help elucidate the mechanisms underlying the pathogenesis of CH. However, certain limitations were noted in our study. Firstly, the small sample size of the oligogenic and monoallelic groups might have limited the statistical performance. Secondly, we were unable to detect intronic variants and verify the function of the variants of uncertain significance. Therefore, further studies are required to clarify the molecular etiology and genotype-phenotype correlations in CH.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding authors.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the ethics committee of Nanjing Medical University. Written informed consent to participate in this study was provided by the participants' legal guardian/next of kin.

AUTHOR CONTRIBUTIONS
WL, FG, and YW participated in the diagnostic workup. HW, YW, and PX participated in the interpretation of clinical and biochemical data. RY, WL, and FG performed and interpreted the genetic analysis. WL and FG performed the statistical analysis. WL and BY wrote the manuscript. All authors contributed to the article and approved the submitted version.

ACKNOWLEDGMENTS
We gratefully thank RY and his team for the help in interpreting the variants found in this study.