Characterization of a Cohort of Patients With LIG4 Deficiency Reveals the Founder Effect of p.R278L, Unique to the Chinese Population

DNA ligase IV (LIG4) deficiency is an extremely rare autosomal recessive primary immunodeficiency disease caused by mutations in LIG4. Patients suffer from a broad spectrum of clinical problems, including microcephaly, growth retardation, developmental delay, dysmorphic facial features, combined immunodeficiency, and a predisposition to autoimmune diseases and malignancy. In this study, the clinical, molecular, and immunological characteristics of 15 Chinese patients with LIG4 deficiency are summarized in detail. p.R278L (c.833G>T) is a unique mutation site present in the majority of Chinese cases. We conducted pedigree and haplotype analyses to examine the founder effect of this mutation site in China. The results suggest that implementation of protocols for genetic diagnosis and for genetic counseling of affected pedigrees is essential. Tracing this mutation might also help determine the migration pathways of populations with Asian ancestry.


INTRODUCTION
DNA double-strand breaks (DSBs) are a deleterious form of DNA damage that can result in loss or rearrangement of genomic material, both of which lead to cell death or carcinogenesis (1-3). DSBs are induced by ionizing radiation, but they also arise as intermediates during normal endogenous processes such as DNA replication, and meiotic and V(D)J recombination. In mammalian cells, non-homologous DNA end joining (NHEJ) is the major mechanism for repairing DSBs (4). NHEJ involves at least six proteins: Ku70, Ku80, DNA-PKcs, DNA ligase IV, XRCC4, and Artemis. DNA ligase IV (LIG4) associates with XRCC4 during the final rejoining step of NHEJ (5).
LIG4 deficiency (OMIM 606593) is an extremely rare autosomal recessive disorder caused by mutations in the LIG4 gene. It is characterized by microcephaly, growth retardation, developmental delay, dysmorphic facial features, variable immunodeficiency, pancytopenia, a predisposition to malignancy, and pronounced clinical and cellular radiosensitivity (6).
The LIG4 gene maps to chromosome 13q33-q34 and contains two exons; the encoded protein comprises four domains: the DNA-binding domain (DBD), the nucleotidyltransferase domain (NTD), the oligo-binding domain (OBD), and the XRCC4-binding domain (XBD) (7). Our previous report identified p.R278L (c.833G>T) as a unique mutation site present in the majority of Chinese cases. We predicted that the p.R278L mutation is either a mutational hot spot or a founder mutation in the Chinese population (8).
The founder effect is the reduction in genetic variation that occurs when a new population is established by a small number of individuals from a larger population (9). Ideally, alleles are distributed randomly among progeny; however, selection, variation, or inbreeding can alter allele frequencies in the progeny. A series of genetic markers that are inherited together through generations is called a haplotype; the markers in a haplotype are tightly linked and therefore show little or no separation during meiotic recombination (10). Usually, the most common haplotype represents the polymorphisms of most individuals within a population and tends to be inherited as a whole by the offspring. Haplotype analysis can help confirm whether a mutation has spread through a founder effect (11).
Here, we describe the clinical, immunological, and genetic characteristics of patients with LIG4 deficiency and investigate the phenotypic and mutation spectrum of LIG4 deficiency to determine whether the p.R278L mutation is descended from a common ancestor via the founder effect.

Patients
Initially, all patients enrolled in the study were suspected of having combined immunodeficiency based on clinical manifestations, examination findings, and clinical laboratory results. Eight patients with LIG4 deficiency (P8 to P15) were recruited from the Children's Hospital of Chongqing Medical University between 2016 and 2020. Seven patients (P1 to P7) that harbored the p.R278L mutation in the previous study were included (8); therefore, 15 patients were enrolled. The relevant clinical data are summarized in Table 1A. The assessment criteria for child growth and development published by WHO were used as a reference (12). Permission to participate in the study was provided by the patients' families (all of whom provided informed consent), and the study was approved by the Medical Ethics Committee of Children's Hospital of Chongqing Medical University.

Cell Preparation
Peripheral blood mononuclear cells (PBMCs) were isolated from freshly drawn heparin-treated blood by Ficoll density gradient centrifugation, as described previously (8).

Lymphocyte Subsets
Whole blood was used for standard flow cytometry multicolor analysis; staining of lymphocyte surface markers was performed after red cell lysis, as described previously (13). A total of 20 subpopulations were examined to analyze T and B lymphocyte subsets.

T Cell Receptor Excision Circles and Kappa-Deleting Recombination Excision Circles
During T cell receptor rearrangement, excised DNA fragments form T cell receptor excision circles (TRECs). During B cell maturation, kappa-deleting recombination excision circles (KRECs) are generated by kappa-deleting recombination, which contributes to allelic and isotypic exclusion of the light chain. Both TRECs and KRECs are circular DNA fragments excised from genomic DNA. TRECs and KRECs were quantified in DNA samples extracted from peripheral blood by nested and quantitative real-time polymerase chain reaction (PCR) (14, 15).

CDR3 Spectratyping
Each T cell receptor (TCR) Vβ fragment was amplified using one of 23 Vβ-specific primers and a 5'FAM-labeled Cβ primer (16). The PCR products were sequenced by Sangon Biotech Company (Shanghai, China). The data were analyzed using Gene Mapper V3.5, and a scoring system was used to evaluate TCR Vβ diversity: a score <4 indicated a skewed subfamily (17, 18).

Assessment of Maternofetal T Cell Engraftment
To detect the presence of maternofetal T cell transfusion, DNA samples obtained from each patient and their mother were subjected to short tandem repeat (STR) analysis by Kindstar Global Gene Technology Company (Wuhan, China).

LIG4 Mutation Analysis
Genomic DNA was extracted from peripheral blood leukocytes using a Gentra Puregene blood kit (Qiagen, Hilden, Germany). In some patients (P1-P4, P7, P10-P15), next-generation sequencing (NGS) of the family was performed first. The whole-exome sequencing (WES) workflow comprised DNA library preparation, enrichment and sequencing of targeted genes, and bioinformatics analysis (MyGenostics, Beijing, China), as previously described (8). The list of genes associated with primary immunodeficiency diseases and other immune-related diseases was updated according to the IUIS PID Classification Committee. In downstream analysis, four steps were used to select potentially pathogenic mutations: (i) mutations had to be supported by more than 5 reads, with a mutation ratio of no less than 30%; (ii) mutations with a frequency of more than 5% in the 1000 Genomes, ESP6500, and in-house databases were removed; (iii) mutations present in the InNormal database (MyGenostics) were removed; and (iv) synonymous mutations not listed in the HGMD database were removed. The remaining mutations were considered potentially pathogenic and were further evaluated against clinical phenotypes and phenotypic databases (OMIM and ClinVar). All mutations identified by NextSeq 500 sequencing were confirmed by Sanger sequencing. The coding exons and exon-intron boundaries of LIG4 were amplified by PCR. Both strands of the amplified PCR products were sequenced by Sangon Biotech Company. Whole-exome sequencing of P13 suggested a possible exon-level copy number variation. Therefore, samples from normal controls, the proband, and family members were analyzed by fluorescence quantitative PCR, and the copy number of the second exon of LIG4 was determined using the ALB gene as the internal reference.
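The four filtering steps described above can be sketched as a single predicate. This is a minimal illustration only, not the MyGenostics pipeline: the `Variant` record and all of its field names are hypothetical, and the sketch assumes that a variant is removed when its frequency exceeds 5% in any one of the population databases (the text does not say whether "any" or "all" was intended).

```python
from dataclasses import dataclass


@dataclass
class Variant:
    # All field names are hypothetical and chosen for illustration only.
    alt_reads: int        # number of reads supporting the mutation
    alt_ratio: float      # fraction of reads supporting the mutation (0-1)
    max_pop_freq: float   # highest frequency across 1000 Genomes, ESP6500, in-house
    in_innormal: bool     # present in the InNormal database
    is_synonymous: bool   # synonymous change
    in_hgmd: bool         # listed in HGMD


def passes_filters(v: Variant) -> bool:
    """Return True if the variant survives filtering steps (i)-(iv)."""
    # (i) more than 5 supporting reads and a mutation ratio of at least 30%
    if v.alt_reads <= 5 or v.alt_ratio < 0.30:
        return False
    # (ii) remove variants with frequency above 5% (assumption: in any database)
    if v.max_pop_freq > 0.05:
        return False
    # (iii) remove variants present in the InNormal database
    if v.in_innormal:
        return False
    # (iv) remove synonymous variants that are not listed in HGMD
    if v.is_synonymous and not v.in_hgmd:
        return False
    return True


candidate = Variant(alt_reads=12, alt_ratio=0.45, max_pop_freq=0.001,
                    in_innormal=False, is_synonymous=False, in_hgmd=False)
keep = passes_filters(candidate)
```

Variants that pass would still require the manual review against clinical phenotype, OMIM, and ClinVar that the text describes; the predicate only narrows the candidate list.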

The Expression of the Mutant Protein
Full-length wild-type (WT) LIG4 cDNA was ordered from Youbio Biotech Company (Changsha, China). Mutant LIG4 cDNAs (p.R278L and p.R278H) were constructed by PCR mutagenesis and then subcloned into the p3xFlag-CMV-7.1 vector. For overexpression, HEK293T cells were transfected with 0.5 μg of each plasmid (WT, p.R278L, or p.R278H). Cells were harvested 24 h after transfection, and LIG4 expression was assessed by Western blot with an anti-LIG4 antibody (EPR16531, Abcam) or an anti-Flag antibody (2B3C4, Proteintech).

Haplotype Analysis for LIG4 p.R278L Mutation
Haplotype analysis was performed to determine whether the LIG4 p.R278L mutation represents a founder mutation. Single-nucleotide polymorphisms (SNPs) were analyzed by Guoke Biotechnology Company (Beijing, China) using an Illumina [...] internationalgenome.org/) and used as "normal sample haplotype" information. The frequencies of the identified haplotypes were analyzed using the Haploview 4.2 program (https://www.broadinstitute.org/haploview/). A binomial probability formula was used to calculate the probability of a mutation occurring recurrently as a de novo event in the same haplotype.
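The binomial calculation mentioned above can be sketched as follows. The text does not give the exact formula used, so this is one plausible reading under stated assumptions: if the mutation arose independently each time, each mutant allele would land on a given haplotype with probability equal to that haplotype's background frequency `p` in the control population, and the quantity of interest is the chance that at least `k` of `n` mutant alleles share that haplotype. The function name and the example values are hypothetical.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float) -> float:
    """Binomial upper tail: P(X >= k) for X ~ Binomial(n, p).

    n: number of mutant alleles examined
    k: number of those alleles observed on the same haplotype
    p: background frequency of that haplotype in controls
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Placeholder values for illustration only; they are not the study's data.
chance_of_recurrence = prob_at_least(k=7, n=13, p=0.2)
```

A small tail probability would argue against independent recurrent de novo events on the same haplotype and in favor of descent from a common ancestor.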

Estimating the Age of the Mutation
To better understand its history, the age of the LIG4 p.R278L mutation was estimated using the DMLE+ 2.3 program (http://www.dmle.org/). This software uses a Markov chain Monte Carlo algorithm for Bayesian inference of mutation age based on the observed linkage disequilibrium (LD) at multiple genetic markers. The population growth rate was set to 0.025, with a generation time of 25 years. The disease sample ratio was 0.002, which is the software default parameter.

Clinical Characteristics of Chinese Patients With LIG4 Deficiency
All 15 patients were from different families. Eight were male and seven were female. There was no evidence of potential skewing in the sex ratio. The clinical characteristics are listed in Table 1A. P6, P9, and P15 had a family history of early death or failed pregnancy (the older sister of P6 died of pneumonia at the age of 1 year, the older sister of P9 died of recurrent fever and diarrhea at the age of 8 months, and the mother of P15 had a previous pregnancy with embryo growth arrest). All were full-term infants, although five were small for their gestational age (SGA; i.e., birth weight 2 standard deviations below average). The average age of symptom onset was 8 months (range, 1-23 months), and the median time of diagnosis was 18 months. The most common onset manifestations (soon after birth or later in life) included diarrhea, pneumonia, thrush, and BCG infection; two cases (P9 and P14) presented initially with hemolytic anemia.
Every patient except P14 suffered from chronic diarrhea, which aggravated their nutritional status and potentially contributed to growth failure. Pneumonia was the second most common and deadly type of infection. Salmonella was cultured from the stools of seven patients with diarrhea. The main etiological agents of respiratory tract infections were bacteria, including Streptococcus pneumoniae, Enterobacter aerogenes, Moraxella catarrhalis, Klebsiella oxytoca, and Haemophilus influenzae. The cytomegalovirus (CMV) copy number in blood was high in four patients (P1, P7, P12, and P13). BCG infection was suspected in P10 and P15; the BCG vaccination site was ulcerated, and adjacent lymph nodes were enlarged.
Growth failure was evident in all patients. Occipitofrontal circumference, weight, and height were significantly below normal values (p < 0.05). Data on head circumference at birth were scarce; however, we estimate that it was low, because almost every patient presented at our hospital with a head circumference >3 SD below the population mean. A previous study in mice shows that LIG4 is essential for neuronal cell development. Consistent with this, most patients presented with short stature and microcephaly. Ten cases (P1-P9 and P13) showed developmental delay; all failed to achieve the developmental milestones for their age group. P12 and P13 had facial dysmorphism, namely a large nose with a prominent nasal bridge (Figure 1).
All patients exhibited cytopenia (Table 1B). Leukocytes were the most affected cell type (11/15), followed by red blood cells. Three patients presented with pancytopenia. Bone marrow aspiration revealed active bone marrow hyperplasia with no morphological abnormalities, whereas biopsy showed bone marrow failure. Notably, P14 manifested with cytopenia; initially, the bone marrow results led us to suspect that this patient had "myelodysplastic syndrome" (MDS) or "aplastic anemia" (AA). There was some evidence of autoimmunity. The Coombs test was positive in four patients during the early course of the disease (P5, P6, P7, and P9). Two patients were positive for thyroid peroxidase antibodies (P9 and P10).
P12 presented with cytopenia, chronic diarrhea, and pneumonia. Abdominal imaging and biopsy (of the sigmoid colon) at the age of 4 years confirmed EBV-positive diffuse large B cell lymphoma (non-germinal center origin). Chemotherapy was not a treatment option due to pulmonary fungal infection and poor nutritional status.
Prior to diagnosis of LIG4 deficiency, P7 and P9 received steroids and rituximab to treat autoimmune hemolytic anemia (AIHA) and thrombocytopenia. P14 received cyclosporine for cytopenia; this patient was refractory to immunosuppressive therapy. Due to repeated infections, most patients received a variety of antibacterial or antiviral drugs, and some received antifungal or antituberculous drugs. However, P1-P8 died or stopped treatment without transplantation due to marked exacerbation of pulmonary infections and respiratory failure. Hematopoietic stem cell transplantation (HCT), the only effective radical cure for AIHA, was performed for P9 and P10. The donors were their respective parents (haploidentical donors) due to HLA matching difficulties; both engrafted successfully. After transplantation, mycophenolate mofetil, tacrolimus, methotrexate (MTX), and steroids were used for graft-versus-host disease (GVHD) prophylaxis. Unfortunately, P9 developed recurrent fever 3 months after transplantation, thought to be due to EBV-driven posttransplantation lymphoproliferative disease and hemophagocytic lymphohistiocytosis (HLH). She received the HLH-2008 chemotherapy regimen (including two doses of VP16) but died of sepsis 3 months later. Notably, HCT did not cure microcephaly and neurodevelopmental delay in P10. The remaining patients (P11-P15) have not received HCT but continue to survive; all receive intravenous immunoglobulin (IVIG) and oral co-trimoxazole to prevent infection (standard treatments after a diagnosis of LIG4 deficiency).

Immune Characteristics
Immunological function analyses were also performed (Table 1B). Flow cytometry analysis of peripheral blood showed a significant reduction in the absolute numbers of CD19+ B cells and CD3+ T cells; however, NK cell counts were near-normal (T-B-NK+ phenotype). Seven patients (P8, P9, P10, P11, P12, P14, and P15) were analyzed in detail. Two patients (P8 and P15) exhibited a marked increase in the number of γδ T cells compared with healthy children. IgG levels in six of the 14 patients tested (all except P11) were significantly lower than the normal reference value (p < 0.05). The others had normal levels of IgG; however, they were tested after receiving intravenous immunoglobulin. TREC and KREC counts were significantly below the detection limit, although the lymphocyte counts in P8 and P15 were not that low (19). STR analysis of eight patients ruled out maternofetal transfusion (P8-P15).
Analysis of TCR-Vβ diversity was performed in five newly enrolled patients (P8, P12, P13, P14, and P15). Most TCR-Vβ subfamilies exhibited monoclonal or oligoclonal peaks, and TCR repertoire complexity was limited, a finding similar to that in our previous report (8). P14 and P15 exhibited less skewed TCR diversity than the other patients, suggestive of less severe impairment of V(D)J recombination and TCR function. In age-matched healthy controls, the majority of the 23 TCR-Vβ subfamilies exhibited a Gaussian curve with 6-9 peaks, reflecting a polyclonal Vβ repertoire. The frequency of skewed TCR-Vβ subfamilies in the patients was higher than that in the healthy controls (Figure 2). When PBMCs from patients were stimulated with PHA for 3 days, the percentages of proliferating CD4+ and CD8+ T lymphocytes were 0.2 and 1.9% in P8; 2.78 and 4.33% in P11; 3.79 and 1.67% in P12; and 1.1 and 8.3% in P14, respectively. CFSE fluorescence histograms generated by flow cytometry revealed no obvious peak with respect to cell division. The percentages of proliferating CD4+ and CD8+ T lymphocytes in the PBMC population from the normal control were 71.9 and 65.6%, respectively, with obvious peaks in cell division. Taken together, these data suggest that T cell proliferation in the LIG4 deficiency patients was severely impaired. B cell proliferation was analyzed only in P11. Proliferation of CD19+ B cells after stimulation with PWM was 4.93% in the patient and 47.8% in the normal control.

Genetic Characteristics
Patients with LIG4 deficiency came from 11 different provinces in China. Mutations in the LIG4 gene were detected by either Sanger or next-generation sequencing. Based on the clinical manifestations of microcephaly and immune deficiency and the autosomal recessive inheritance pattern, LIG4 was identified as the sole pathogenic gene. No pathogenic variants in other PID genes listed in the 2019 update of the IUIS IEI classification were found in any patient. Fourteen different mutations were identified, three of which have not been described previously. Most were compound heterozygous mutations, while P3 and P8 harbored a homozygous p.R278L. Most mutations were predicted to be "Deleterious," "Damaging," "Disease causing," or "Prediction disease causing" by different algorithms. Frameshift mutations and copy number variants cannot be evaluated by CADD. The CADD PHRED scores of all missense and nonsense mutations except p.T9I were >20, and these mutations were regarded as deleterious. The allele frequency of p.T9I was about 20% in ChinaMap, and it was predicted to be a SNP by Mutation Taster. Mutation p.R278L was found in ChinaMap (6/21176) and gnomAD (1/1558, East Asian) but not in DDBJ or other populations (Table 2).
Since recurrent infection is the predominant feature in these patients, and antibody deficiency and lymphocytopenia are present as a combined immunodeficient immunophenotype, we hypothesized that these mutations are loss of function. Because it was difficult to obtain enough primary cells from patients, we attempted to generate a LIG4-knockout (KO) fibroblast cell line. However, the cells did not proliferate after LIG4 knockout, and we failed to obtain a KO clone. We therefore transfected mutant LIG4 (p.R278L and p.R278H) into HEK293 cells as an overexpression system and found that the mutant proteins were expressed at levels comparable to wild type. To visualize the function of R278 at the amino acid level, we constructed a structural diagram of the DBD of LIG4, in which the main role of R278 is to bind ATP (Figure 3). The diagram indicates that mutations at R278 affect enzyme activity by altering the spatial conformation and the ability to bind ATP. LIG4 deficiency patients carrying the p.R278L mutation were concentrated in the Yangtze River valley of China. All mutation sites are shown in the simulation diagram (Figure 4) (20-37). Mutation p.R278L is a hot spot found only in Chinese patients. It is worth mentioning that the p.R278L mutation was also detected in P13; however, the ratio of the copy number of exon 2 of the LIG4 gene to that of the normal control was about 0.5 (Supplementary Figure), suggesting a large heterozygous deletion involving exon 2 of LIG4 on the other allele. Another common genotype is p.K424RfsX20, which occurs in non-Chinese cases.
[...] Table and Figure 5). Using the Haploview program, a block of LD was located beside the LIG4 gene; three haplotypes were identified in the patients and five in the healthy control population. Haplotype analysis shows that haplotype GGACTACT was the most common in the patient group (53.8%), but occurred in only 19.5% of the CHS and 17.5% of the CHB population (Figure 6).
The haplotype of the p.R278L mutation site is significantly different from that of the normal allele. DMLE+ 2.3 was used to estimate the age of the mutation; it provides the posterior probability distribution of the age of the mutated haplotype (38). Based on a generation time of 25 years, the age of the mutation was estimated at 353 generations (95% credible set: 217-454 generations), i.e., 8,825 years (95% credible set: 5,425-11,350 years) (Figure 7).
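The conversion from generations to calendar years reported above is a simple multiplication by the assumed generation time. The short sketch below reproduces that arithmetic; the constant and function names are hypothetical, and the 25-year generation time is the value stated in the text.

```python
GENERATION_YEARS = 25  # generation time stated in the text


def generations_to_years(generations: int) -> int:
    """Convert an age expressed in generations to years."""
    return generations * GENERATION_YEARS


point_estimate = generations_to_years(353)   # 8825 years
lower_bound = generations_to_years(217)      # 5425 years
upper_bound = generations_to_years(454)      # 11350 years
```

Because the result scales linearly with the generation time, a different assumption (for example, 20 or 30 years per generation) would shift the estimated age proportionally.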

DISCUSSION
According to the IEI classification standard updated by IUIS in 2019, LIG4 deficiency is a type of severe combined immunodeficiency (SCID) defined by CD3/CD19 lymphopenia, which affects cellular and humoral immunity (39). As such, LIG4 deficiency has a broad-spectrum phenotype that includes immunodeficiency, microcephaly, growth failure, facial dysmorphism, malignancy predisposition, and cellular sensitivity to ionizing radiation. Additional features include bony deformations such as syndactyly and congenital hip dysplasia (6). Due to combined immunodeficiency, approximately three quarters of LIG4 deficiency patients suffer from recurrent [...] (20). By contrast, the most common manifestation in our patient cohort was severe gastrointestinal tract infection. Chronic diarrhea and recurrent pneumonia were common at onset in almost all patients, leading to multiple hospital admissions and failure to thrive. Causative pathogens included bacteria, viruses, and fungi. Intestinal infection by Salmonella typhimurium is a unique manifestation in Chinese patients, indicating that gastrointestinal prophylaxis is of great importance. BCG is a unique pathogen associated with live attenuated vaccination. Thus, delayed BCG vaccination, or an IPV vaccination plan within 1 month after birth, should be considered as a replacement for the live attenuated vaccination plan. LIG4 deficiency is classified as a severe immunodeficiency, and most of our patients presented with severe immunodeficiency due to loss of LIG4 function; P14, however, had no history of recurrent or severe infection, and the only manifestation was cytopenia. Initially, we suspected "MDS" or "AA" and so treated the patient with cyclosporine for 2 years, even though two compound heterozygous mutation sites in the LIG4 gene had been detected before he came to our hospital.
The p.I327S mutation in P14 was also detected in another LIG4 deficiency patient with pancytopenia and lymphoma, whereas the other mutation (p.H862RfsX6) has not been reported previously. After hospitalization at our center, laboratory tests supported the pathogenicity of these mutations by showing a significant decrease in TRECs/KRECs and T lymphocyte proliferation, although TCR diversity was not as severely impaired as in the other patients. The p.H862RfsX6 mutation, located in the XBD, introduces a premature stop codon six amino acids downstream, near the C-terminus of the LIG4 protein. Previous studies show a genotype-phenotype correlation in which the position of truncating mutations corresponds to disease severity. Transcripts encoding truncated proteins are predicted to be expressed at normal levels (21). We speculate that P14 has a hypomorphic phenotype, as suggested by the mild clinical manifestations, less skewed TCR diversity, and higher lymphocyte counts and immunoglobulin levels, owing to the heterozygous p.I327S missense mutation.
Extreme growth failure and microcephaly are common and early presentations in LIG4 deficiency patients. Murray et al. even screened some LIG4 deficiency patients for microcephalic primordial dwarfism before making a diagnosis of CID or SCID (21). This may be due to accumulation of DNA DSBs over time resulting in reduced lymphocyte production. Our patients were full-term infants, yet several were SGA, which suggests intrauterine growth retardation. A previous study in mice shows that LIG4 is essential for neuronal cell development (40). LIG4 deficiency may lead to impaired prenatal differentiation of neuronal cells and result in microcephaly and developmental delay. Thus, significant microcephaly and growth restriction are regarded as the most prominent features of LIG4 deficiency patients. Developmental delay also occurred in ten of our patients (P1-P9 and P13). "Bird-like" or "Seckel syndrome-like" traits are always observed in LIG4 deficiency patients (20). In this study, P12 and P13 had a large nose with a prominent nasal bridge; however, there were no other facial dysmorphisms such as a low anterior hairline, bilateral epicanthic folds, or upslanting palpebral fissures (6).
The symptomatic treatment of LIG4 deficiency patients includes long-term antibiotics, antiviral and antifungal chemoprophylaxis, immunoglobulin infusion, transfusion support, and avoidance of unnecessary exposure to ionizing radiation. HCT is a curative treatment for CID and SCID immunophenotypes and might reduce the risk of developing lymphoid malignancy. Of note, reduced-intensity conditioning regimens with low-dose Cyclosporin A should be considered due to radiosensitivity (6). However, HCT does not cure microcephaly or neurodevelopmental delay. Short stature and mild to moderate intellectual disability may remain. Extra social care is required to maintain a good quality of life, including attendance at a special school for intellectual disability (21). In our study, nine of the 15 patients died. In another cohort of LIG4 deficiency cases in China, four of seven died and two were lost to follow-up (37). Another four patients reported in other papers were also lost to follow-up, and the likelihood of death was high (35,36). Thus, in China, earlier diagnosis and adequate treatment are necessary to improve the poor prognosis of LIG4 deficiency.
Due to disruption of V(D)J recombination, rearrangement of T and B lymphocyte receptors is aberrant, resulting in combined immunodeficiency and a skewed TCR-Vβ repertoire, as confirmed in the new cohort patients (P8-P15). Therefore, most LIG4 deficiency patients present with T-B-NK+ phenotype SCID, along with reduced antibody production. Flow cytometry analysis showed that the reduction in B cell counts was more pronounced than the reduction in T cell counts, possibly because B cell development is more reliant on V(D)J recombination, or because of maternofetal transfusion or a spontaneous somatic reversion (8). However, STR analysis confirmed no maternal engraftment in our eight patients (P8-P15), so maternofetal transfusion is not as common as in X-SCID patients. Also, there was no reduction in γδ T cells. Previous studies argue that limited V(D)J recombination activity may provide γδ T cells with a developmental advantage (22,41). LIG4 has also been shown to be involved in immunoglobulin class switch recombination (6). Thirteen of 14 (93%) and nine of 14 (64%) patients had low IgG and IgA levels, respectively, whereas eight of 14 patients (57%) had normal IgM levels.
FIGURE 7 | The age of the mutation was simulated 100,000 times by DMLE+ 2.3. The abscissa represents the age of the mutation in generations (about 25 years each), and the ordinate represents the frequency (green, within the 95% confidence interval of the posterior distribution; red, outside the 95% confidence interval).
Genetic analysis revealed that most Chinese LIG4 deficiency cases harbor the p.R278L mutation, which is unique and represents a mutational hot spot in China. There is clinical evidence for the pathogenicity of this mutation. Another mutation in the same codon, which causes a different amino acid substitution (p.R278H) (40,42), has been reported only in non-Chinese populations. According to the pathogenicity prediction software, both of these mutations cause loss of function by affecting ATP binding, although this needs to be confirmed. P3 and P8, who were homozygous for p.R278L, manifested severe infection, very low T/B cell counts, and low TRECs, together with normal protein expression (Figure 3F), indicating that p.R278L is a loss-of-function mutation.
The high frequency and geographic clustering of the LIG4 p.R278L mutation suggest that it may be a hot spot or even a founder mutation. According to the principle of LD, haplotypes reflect the genetic information at the mutation site since they tend to be passed on to offspring as a whole. For example, haplotype analysis revealed three founder mutations (BRCA1 c.68_69delAG, c.5266dupC, and BRCA2 c.5946delT) in the Ashkenazi Jewish population, and a recurrent F8 mutation (c.6046C>T) causing hemophilia A in a northern Italian population (43,44). In this study, we constructed five different haplotypes to explore the origin of the alleles carrying the p.R278L mutation in our LIG4 deficiency patients. Haplotype analysis identified only three haplotypes in LIG4 deficiency patients, the most common being haplotype GGACTACT (53.8%); this haplotype was much less common in the controls (CHS, 19.5%; CHB, 17.5%). Differing haplotype frequencies, along with reduced genetic diversity, are typical features of isolated and stable populations. Thus, it is plausible that the LIG4 p.R278L mutation frequency increased in China due to a local founder effect.
There has been much debate about the origin of mankind. In recent years, a large amount of genomic data has been obtained and accumulated. The completion of the Human Genome Project in 2003, the launch of the 1000 Genomes Project in 2008, and the study of ancient DNA have provided new clues that explain the evolution of the genetic structure of populations (45,46). The recent African origin model hypothesis states that Homo erectus, the common ancestor of early Homo sapiens or Archaic humans, originated in Africa and then spread from the continent to other parts of the world about 2 million years ago. India is one of the many crossroads in the history of mankind (47,48). Interestingly, when we analyzed this haplotype (GGACTACT) among other ethnic groups from the 1000 Genomes Project, we found that Bangladeshi and Pakistani (PJL), as well as the Indian populations in the United States (GIH) and England (ITU), also had the highest frequency of this haplotype. Therefore, we cannot exclude that this mutation might have spread from South Asia via migration (49-51). However, no LIG4 deficiency patients of Bangladeshi, Pakistani, or Indian origin have been reported.
Regarding the age of the common ancestor, DMLE+ estimates that the LIG4 p.R278L mutation is approximately 8,825 years old. This corresponds to the Neolithic age, the period during which agriculture began in settled communities. The distribution patterns of such founder mutations might help determine the migration pathways of populations with Asian ancestry (48,49). However, estimating the age of founder mutations will always be an inexact endeavor, because the true recombination and mutation history of the relevant chromosomal segments is unknown. Thus, further studies on the prevalence of shared mutations in Asian countries, and haplotype analyses in other ethnic groups, would be helpful if we are to trace the common ancestor.
This study has some limitations. We did not confirm the DNA-repair defect in primary cells because of a lack of PBMCs from patients. Nor could we confirm it by other strategies, because the fibroblast cell line did not proliferate after LIG4 gene knockout and because analysis of recombinant LIG4 enzyme activity is technically difficult. These limitations could be overcome in the future by using primary cells from a newly diagnosed patient.

CONCLUSION
In summary, LIG4 deficiency is a rare disease with a broad spectrum of presentations, and its severity varies greatly. In China, earlier diagnosis and adequate treatment are necessary to improve the poor prognosis of LIG4 deficiency, and timely HCT is urgently needed. Pedigree analysis and haplotype construction revealed conservation of a single haplotype surrounding the p.R278L mutation, suggesting that this allele has a common ancestor. The finding of a founder effect in a highly recurrent mutation in a rare disease suggests that implementation of protocols for genetic diagnosis and for genetic counseling of affected pedigrees is essential. Also, the search for new targeted therapies such as base editing should be prioritized.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in GenBank with the following accession numbers:

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Medical Ethics Committee of Children's Hospital of Chongqing Medical University. Written informed consent to participate in this study was provided by the participants' legal guardian/next of kin. Written informed consent was obtained from the individual(s), and minor(s)' legal guardian/next of kin, for the publication of any potentially identifiable images or data included in this article.

AUTHOR CONTRIBUTIONS
XL, YA, and XZ designed experiments and analyzed the data. XL wrote the first draft of the manuscript and performed the experiments. YD provided control specimens from normal healthy children. QL, JJ, WT, LZ, JY, and XT contributed to scientific discussion, data interpretation, and revision of the manuscript. XZ designed the research, supervised the study, and revised the manuscript. All authors contributed to the article and approved the submitted version.

FUNDING
This work was supported by the Natural Science Foundation of China (81471619, 82070135) and by Chongqing Technology Innovation and Application Demonstration (cstc2018jscx-msybX0005).