Whole Exome Sequencing Confirms Molecular Diagnostics of Three Pakhtun Families With Autosomal Recessive Epidermolysis Bullosa

Epidermolysis bullosa (EB) is a genetic skin disorder characterised by skin fragility with a heterogeneous clinical presentation. Patients develop skin blisters at the dermo-epithelial junction, congenitally or in the early years of life, together with erosions and hyperkeratosis over the palms and soles. Other associated features are hypotrichosis of the scalp, absent or dystrophic nails, and dental anomalies. Molecular diagnosis through whole-exome sequencing (WES) has become a successful tool in clinical setups. In this study, three Pakhtun families from the Khyber Pakhtunkhwa province of Pakistan were ascertained. WES analysis of a proband in each family revealed two novel variants (COL17A1: NM_000494.4: c.4041T>G: p.Y1347* and PLEC: NM_201380.3: c.1283_1285delGCT: p.L426del) and one previously known variant (COL17A1: NM_000494.4: c.3067C>T: p.Q1023*) in homozygous form. Sanger sequencing of the identified variants confirmed the heterozygous genotypes of the obligate carriers. The identified variants not only expand the mutation spectrum of COL17A1 and PLEC but also confirm their vital role in the morphogenesis of skin and its associated appendages. WES can be used as a first-line diagnostic tool in genetic testing and counselling of families from Khyber Pakhtunkhwa, Pakistan.


INTRODUCTION
Epidermolysis bullosa (EB) is a group of skin fragility disorders characterised by blisters caused by mechanical trauma that disrupts the dermo-epithelial junction. The phenotype of EB, including blistering and scarring of skin and mucosa, depends on the location of the defect within the skin and its molecular cause. The overall prevalence of EB worldwide is about 19.6 per million live births (1). According to the latest classification of EB, four clinical types have been defined: (a) EB simplex (EBS), (b) junctional EB (JEB), (c) dystrophic EB (DEB), and (d) Kindler syndrome (2). Clinical diagnosis is based on skin biopsies examined by immunohistochemistry and electron microscopy to understand the histopathology of each EB type (3,4).
In this study, we applied the WES approach for the genetic diagnosis of EB patients in three families from the Khyber Pakhtunkhwa province of Pakistan.

Ethical Approval
The study design was planned according to the Declaration of Helsinki. Approval for this research project and the enrolment of the members for the study was obtained from the Ethical Review Committee (ERC) of the Institute of Basic Medical Sciences (IBMS), Khyber Medical University (KMU), Peshawar, and the Department of Biotechnology and Genetic Engineering, Kohat University of Science and Technology (KUST), Kohat, Khyber Pakhtunkhwa, Pakistan. Signed informed consent was also obtained from the legal guardians of the patients.

Study Subjects
Three unrelated consanguineous families from different regions of Khyber Pakhtunkhwa (KP) were recruited. In total, 13 individuals (Figures 1a-c) were available for this study. An expert dermatologist performed detailed clinical examinations of all the affected individuals. Peripheral blood samples were collected in EDTA tubes (BD Vacutainer K3, Franklin Lakes, NJ, USA). Genomic DNA (gDNA) was extracted from whole blood using QIAamp kits (Qiagen, Valencia, CA, USA). The concentration of gDNA was measured on a NanoDrop 2000 spectrophotometer (Thermo Scientific, Schaumburg, IL, USA).

Whole Exome Sequencing
WES analysis of a proband was performed in each family. Genomic DNA (2 µg) was used for WES target capture and 51 Mb library construction using the SureSelect v7 kit (Agilent Technologies, Santa Clara, CA, USA). Sequencing was performed on the Illumina platform (HiSeq 2500, San Diego, CA, USA). Reads of 130 bases were generated, and the average sequencing depth was more than 120×. Causative variants were filtered and prioritised as described earlier (8). In addition, an affected sibling from each family was analysed for molecular diagnosis by 3billion.io, a commercial rare-disease company in South Korea.
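The filtering criteria themselves are those of reference (8) and are not reproduced in this text. Purely as an illustration of the general idea, the Python sketch below keeps rare, homozygous, potentially protein-altering calls, a strategy commonly used for recessive disease in consanguineous families. The `Variant` fields, the frequency cut-off, the consequence labels, and the placeholder genes (GENE_X, GENE_Y, GENE_Z) are assumptions made for this example and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str
    hgvs_c: str
    consequence: str      # e.g. "stop_gained", "missense", "synonymous"
    zygosity: str         # "hom" or "het"
    population_af: float  # allele frequency in a reference cohort, 0.0 if absent


# Hypothetical settings chosen for illustration only.
MAX_AF = 0.001
PROTEIN_ALTERING = {
    "stop_gained", "frameshift", "inframe_deletion", "missense", "splice_site",
}


def prioritise(variants, max_af=MAX_AF):
    """Keep rare, homozygous, potentially protein-altering variants, rarest first."""
    kept = [
        v for v in variants
        if v.zygosity == "hom"
        and v.population_af <= max_af
        and v.consequence in PROTEIN_ALTERING
    ]
    return sorted(kept, key=lambda v: v.population_af)


# Toy call set: one variant from the text plus invented placeholders.
calls = [
    Variant("COL17A1", "c.4041T>G", "stop_gained", "hom", 0.0),
    Variant("GENE_X", "c.100A>G", "synonymous", "hom", 0.0),   # not protein-altering
    Variant("GENE_Y", "c.200C>T", "missense", "het", 0.0),     # heterozygous
    Variant("GENE_Z", "c.300G>A", "missense", "hom", 0.12),    # too common
]

for v in prioritise(calls):
    print(v.gene, v.hgvs_c)
```

In practice the cut-off and the consequence classes would be set according to the published pipeline rather than the values shown here.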

Sanger Sequencing
Sanger sequencing was performed by Macrogen Inc., South Korea, to confirm the causative variants. Forward and reverse primers (Supplementary Table 1) were used to amplify the target regions. The sequencing data were analysed by alignment against the genomic reference sequence obtained from the Ensembl Genome Browser (https://asia.ensembl.org/index.html).

Three-Dimensional Protein Models of COL17A1 and PLEC1
Homology modelling of COL17A1 was performed using the I-TASSER server (9), and the structures were validated using the MolProbity server (10). The predicted COL17A1 model was subjected to geometry optimisation and refinement. Subsequently, the stereochemistry and validity of the predicted COL17A1 model were assessed with a Ramachandran plot, which showed 97.8% of residues in favoured conformations, supporting the stereochemical quality of the model. Next, the functional domains of COL17A1 were predicted using InterPro (https://www.ebi.ac.uk/interpro), SMART (smart.embl-heidelberg.de/), and Pfam (http://pfam.xfam.org/). The three-dimensional structure of the PLEC plakin domain was retrieved from the PDB (PDB ID: 2ODU) and refined via GROMACS available in Chimera. Molecular dynamics (MD) simulation was performed using standard parameters (11). An appropriate number of sodium and chloride ions was added to all the systems to neutralise them. Energy minimisation was conducted using the steepest descent method for 5,000 steps to obtain stable conformations. All MD simulations were run for 30 ns under constant temperature (300 K) and pressure (1 atm), with the particle-mesh Ewald method (12) used to treat electrostatic interactions. Root mean square deviation (RMSD) and root mean square fluctuation (RMSF) plots of the resulting trajectories were calculated to assess stability and fluctuations.
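For readers unfamiliar with the two stability measures, the following Python sketch shows how RMSD and RMSF are defined. It is not the GROMACS analysis used in the study: it assumes that every frame has already been superposed onto the reference structure, and the two-atom, two-frame trajectory is invented toy data.

```python
import math


def rmsd(frame, reference):
    """RMSD between two conformations given as lists of (x, y, z) tuples.

    Assumes the frame is already superposed onto the reference.
    """
    if len(frame) != len(reference):
        raise ValueError("frame and reference must contain the same atoms")
    squared = sum(
        (a - b) ** 2
        for atom, ref_atom in zip(frame, reference)
        for a, b in zip(atom, ref_atom)
    )
    return math.sqrt(squared / len(frame))


def rmsf(trajectory):
    """Per-atom RMSF: fluctuation of each atom about its own mean position."""
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    values = []
    for i in range(n_atoms):
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames for k in range(3)]
        mean_sq_dev = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        values.append(math.sqrt(mean_sq_dev))
    return values


# Toy trajectory: atom 0 is fixed, atom 1 moves along the x axis.
toy_trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
]

print(rmsd(toy_trajectory[1], toy_trajectory[0]))
print(rmsf(toy_trajectory))
```

RMSD summarises how far a whole structure has drifted from the reference at one time point, whereas RMSF reports, per atom or residue, how much that position varies over the whole trajectory.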

Family A
Family A (Figure 1a) had two affected girls (V-1 and V-2), aged 6 and 2 years. They were born to a cousin marriage, with no history of the same disease symptoms in previous generations. The younger sister (V-2) had milder symptoms of junctional epidermolysis bullosa (JEB) than her elder sister (V-1). Patient V-1 had generalised growth retardation, below the 4th percentile for her age. Her skin showed trauma-induced blistering and erythematous bullae associated with crusting and scarring, which diminished with age, along with sparse scalp hair, dystrophic nails, plantar hyperkeratosis, oral mucosal blisters, and enamel hypoplasia (Figures 1d,e,h,i). She was anaemic and had dysphagia and muscle weakness, with a history of recurrent skin, respiratory, and urinary tract infections. WES analysis identified a novel non-sense variant in the collagen 17A1 gene (COL17A1: NM_000494.4: c.4041T>G: p.Y1347*). Sanger sequencing confirmed autosomal recessive inheritance of the affected allele in the family (Figures 2A-C).
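The logic of such a segregation check can be sketched in a few lines of Python: affected individuals are expected to be homozygous for the variant, obligate carriers heterozygous, and no unaffected relative homozygous. The function below is a hypothetical helper, not the authors' procedure, and the parental identifiers (IV-1, IV-2) and all genotype values are assumed for illustration, since individual genotypes are not listed in the text.

```python
def consistent_with_recessive(genotypes, affected, obligate_carriers):
    """Check genotypes against fully penetrant autosomal recessive segregation.

    genotypes maps an individual ID to the number of variant alleles (0, 1 or 2).
    """
    for person, n_alt in genotypes.items():
        if person in affected:
            if n_alt != 2:        # affected individuals must be homozygous
                return False
        elif person in obligate_carriers:
            if n_alt != 1:        # parents of affected children must be heterozygous
                return False
        elif n_alt == 2:          # an unaffected relative must not be homozygous
            return False
    return True


# Assumed genotypes for a family shaped like Family A (parent IDs are hypothetical).
family_a = {"IV-1": 1, "IV-2": 1, "V-1": 2, "V-2": 2}

print(consistent_with_recessive(
    family_a, affected={"V-1", "V-2"}, obligate_carriers={"IV-1", "IV-2"}
))
```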

Family B
Family B (Figure 1b) had two affected members (a 15-year-old girl, IV-3, and a 13-year-old boy, IV-5), clinically diagnosed with the intermediate form of epidermolysis bullosa. The patients had fragile blisters and peeling skin, especially on the groin and leg regions (Figures 1f,j). Their skin showed spontaneous blistering on the hands, trunk, and extremities, which healed leaving scars that became less prominent with time. They also showed hoarseness of voice, dystrophic nails, and plantar hyperkeratosis, along with recurrent skin and respiratory tract infections. WES analysis and subsequent Sanger sequencing confirmed a novel variant in the plectin gene (PLEC: NM_201380.3: c.1283_1285delTGC: p.L426del) (Figures 2D-F). As PLEC variants are known to cause features other than EB, the family members were also assessed for muscular dystrophy. Developmental motor milestones such as sitting, standing, walking, and speaking were achieved at the appropriate age. There was no muscular atrophy or fasciculation on inspection or on measurement of limb girth and mid-arm circumference. Active and passive movements and reflexes were normal. There were no tremors in the hands, and the gait was also normal. Serum creatine phosphokinase (CPK) levels were within the normal range.

Family C
Family C (Figure 1c) had an affected boy (IV-1), aged 6 years at the time of this study. Another affected child (IV-3) died in early infancy without any recorded clinical history. The proband IV-1 was clinically diagnosed with an intermediate form of JEB. He had generalised skin blisters affecting the face, hands, trunk, and legs (Figures 1g,k), which decreased in intensity and healed with minimal scarring. The patient also exhibited nail dystrophy, plantar hyperkeratosis, enamel hypoplasia, oral mucosal blister formation, anaemia, and dysphagia. WES analysis and Sanger sequencing identified a homozygous variant (COL17A1: NM_000494.4: c.3067C>T: p.Q1023*), which creates a premature stop codon in exon 45 (Figures 2G-I).

Effect of the Three Identified Variations
The variant c.4041T>G creates a premature termination codon at amino acid position 1,347, which is expected to cause loss of normal protein function. A synonymous variant at this position (c.4041T>C; p.Y1347Y) has been reported at a very low frequency (2.48 × 10⁻⁵) in large population cohorts (gnomAD: https://gnomad.broadinstitute.org/). However, the c.4041T>G variant of COL17A1 is novel and has not been reported before. The second homozygous variant, c.1283_1285delTGC in PLEC, deletes three nucleotides, altering the typical reading frame and creating a novel premature termination codon 30 amino acids downstream of the point of deletion. This deletion lies in the non-repeat region and may result in incomplete formation of the protein, leading to aberrant protein function. Similarly, the variant c.3067C>T in COL17A1 creates a stop codon in exon 52. All three homozygous premature termination codons identified in this study may lead to complete loss of function through the mechanism of non-sense-mediated mRNA decay (13).
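The codon arithmetic behind the two COL17A1 substitutions can be checked with a short Python sketch. The reference codons are inferred from the standard genetic code rather than read from the transcript: a tyrosine codon with T in the third position must be TAT, whereas the glutamine codon at position 1,023 could be CAA or CAG, so both are tried. The codon table is deliberately partial and covers only the codons needed here.

```python
# Partial codon table (standard genetic code); "*" marks a stop codon.
CODON_TO_AA = {
    "TAT": "Tyr", "TAC": "Tyr",
    "CAA": "Gln", "CAG": "Gln",
    "TAA": "*", "TAG": "*", "TGA": "*",
}


def codon_of(cdna_pos):
    """Map a 1-based coding position to (codon number, position within codon)."""
    return (cdna_pos - 1) // 3 + 1, (cdna_pos - 1) % 3 + 1


def consequence(ref_codon, pos_in_codon, alt_base):
    """Classify a single-base substitution within one codon."""
    alt_codon = ref_codon[:pos_in_codon - 1] + alt_base + ref_codon[pos_in_codon:]
    ref_aa, alt_aa = CODON_TO_AA[ref_codon], CODON_TO_AA[alt_codon]
    if alt_aa == "*":
        return "stop_gained"
    return "synonymous" if alt_aa == ref_aa else "missense"


# c.4041 is the third base of codon 1347; c.3067 is the first base of codon 1023.
print(codon_of(4041))                 # (1347, 3)
print(codon_of(3067))                 # (1023, 1)

print(consequence("TAT", 3, "G"))     # c.4041T>G: TAT -> TAG, stop_gained
print(consequence("TAT", 3, "C"))     # c.4041T>C: TAT -> TAC, synonymous
for gln_codon in ("CAA", "CAG"):      # c.3067C>T: either Gln codon becomes a stop
    print(gln_codon, consequence(gln_codon, 1, "T"))
```

This agrees with the protein-level annotations p.Y1347*, p.Y1347Y, and p.Q1023* given in the text.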

In-silico Modelling of COL17A1 and PLEC1 Proteins
A full-length three-dimensional model of COL17A1 was predicted with a C-score of 0.9 and a TM-score of 0.7. The p.Y1347* variant was mapped to the ectodomain at the C-terminus of collagen XVII (Figure 3). The premature termination resulted in the deletion of 151 amino acids of the ectodomain and the loss of the active N-glycosylation site at position 1,421 (Figure 3). N-glycosylation of the ectodomain is a vital step in the correct plasma membrane trafficking of collagen XVII (14), and deletion of the glycosylation site may lead to pathogenic consequences.
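The figure of 151 lost residues follows from simple arithmetic, sketched here in Python. The full-length protein size of 1,497 residues is an assumption implied by the numbers above (1,347 + 151 - 1) and is not stated in the text; the function name is hypothetical.

```python
def truncation_summary(stop_position, protein_length, sites):
    """Return the number of residues lost and the annotated sites removed
    when a stop codon replaces the residue at stop_position (1-based)."""
    lost_residues = protein_length - stop_position + 1
    lost_sites = [site for site in sites if site >= stop_position]
    return lost_residues, lost_sites


COL17A1_LENGTH = 1497          # assumed full-length size, see lead-in
N_GLYCOSYLATION_SITES = [1421] # position given in the text

lost, sites = truncation_summary(1347, COL17A1_LENGTH, N_GLYCOSYLATION_SITES)
print(lost, sites)
```

The residue at position 1,347 is counted as lost because the stop codon replaces it, which is why one is added to the difference.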
The PLEC variant (p.L426del) affects the reading frame in the N-terminal globular domain of the PLEC1 protein, which harbours actin- and integrin β4-binding domains. p.Leu426 lies in the plakin domain of PLEC (15), and in-silico analysis predicted p.Leu426del to be "deleterious." The identified non-frameshift deletion might affect the binding of PLEC to its binding partners, thus leading to a pathological phenotype (Figure 4). MD simulations revealed high fluctuation among the amino acids of the wild-type and altered PLEC proteins, with reduced hydrophobicity compromising the altered protein (Figure 5).

DISCUSSION
Patients with rare skin malformations, especially those affected by epidermolysis bullosa (EB), face a lifetime of suffering and management complications, particularly in the populations of Khyber Pakhtunkhwa, owing to the lack of genetic testing and the non-availability of health facilities. WES has become the first-line genetic test for diagnosing most monogenic diseases worldwide; in this study, we used it for the successful molecular diagnosis of EB patients.
The COL17A1 gene, consisting of 56 exons, encodes a transmembrane protein, collagen XVII, which is a trimer of three 180 kDa α1(XVII) chains, with a 466-amino-acid intracellular N-terminal domain facing the cytosol, a 23-amino-acid transmembrane stretch, and an extracellular domain that extends to the lamina densa of the basement membrane (15). Detachment of the ectodomain releases the cells from their binding partners and makes them motile during wound healing and tissue regeneration. COL17A1 is also responsible for follicular stem cell maintenance, cellular migration, and cellular polarity (16).
Clinically, the patients in families A and C presented moderate to severe blistering tendencies associated with extra-cutaneous manifestations, including dystrophic nails, patchy scarring scalp alopecia, dental anomalies, and compromised quality of life, as stated in previous studies (17,18). The substitutions generate non-sense variants (c.4041T>G; p.Y1347*) and (c.3067C>T; p.Q1023*) in the homozygous state, which are expected to cause loss of normal protein function through non-sense-mediated mRNA decay. The variant c.3067C>T was first reported in the heterozygous state (17). Here we report the homozygous genotype of this variant for the first time globally, in a family from the South Asian population.
In Family B, we present a consanguineous Pakistani family with epidermolysis bullosa simplex in two out of seven siblings. Using whole-exome sequencing (WES), we identified a novel homozygous variant (c.1283_1285delTGC; p.Leu426del) in PLEC. PLEC encodes the cytolinker protein plectin, a large multi-domain protein (>500 kDa) of the plakin family expressed mainly in the skin, muscle, mucous membranes, and stratified and simple squamous epithelia (19). The phenotypic presentation of PLEC genetic variations includes mucocutaneous blistering, muscular dystrophy, pyloric atresia, and cardiomyopathy (19). Plectin is located in the inner plaque of the hemidesmosomes, at the site of interaction with intermediate filaments (20).
In general, the affected individuals exhibit varying degrees of clinical severity with respect to blistering of the skin. In addition, nail deformities, tooth decay, erosive lesions on the oral and laryngeal mucosa, and recurrent respiratory and urinary tract infections during infancy have also been reported in EBS patients (22). Features such as nail and tooth dystrophies, muscle weakness, and recurrent infections were also observed in our patients. In contrast, other features such as oral and laryngeal blisters were not detected in the affected individuals reported in this study. Plec-deficient (-/-) mice display skin blistering due to keratinocyte degeneration, a reduction in the number and stability of hemidesmosomes, skeletal muscle myopathy, and disintegration of the intercalated discs of cardiac muscle, and die 2-3 days after birth due to failure to thrive (23).
This study adds novel variants to the existing pool of COL17A1 and PLEC variants. The high frequency of cousin marriages is the main driver of disease risk in upcoming generations. However, thorough genetic screening helps to reduce the disease burden through counselling and premarital planning. Furthermore, the latest gene-editing technologies (CRISPR/Cas9) have the potential to improve therapeutics for EB patients (24,25).

DATA AVAILABILITY STATEMENT
The data presented in the study are deposited in the ClinVar repository under accession numbers SCV001739268 and SCV001739269.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Ethical Review Committee (ERC) of Institute of Basic Medical Sciences (IBMS), Khyber Medical University (KMU), Peshawar, Pakistan. Written informed consent to participate in this study was provided by the participants' legal guardian/next of kin. Written informed consent was obtained from the individual(s), and minor(s)' legal guardian/next of kin, for the publication of any potentially identifiable images or data included in this article.

AUTHOR CONTRIBUTIONS
FF, RN, and NW contributed to study design. NB, SAK, and NM contributed to data analysis. NS, SK, and MJ contributed to data generation and analysis. MJ, SK, and NW contributed to writing and finalising the draft. All authors contributed to the article and approved the submitted version.

ACKNOWLEDGMENTS
We are thankful to all the members of the families for their voluntary participation in and contribution to the study.