# APPLYING NEXT GENERATION SEQUENCING AND TRANSGENIC MODELS TO RARE DISEASE RESEARCH

EDITED BY: Arvin M. Gouw, Amritha Jaishankar, and George A. Brooks

PUBLISHED IN: Frontiers in Genetics

#### Frontiers eBook Copyright Statement

The copyright in the text of individual articles in this eBook is the property of their respective authors or their respective institutions or funders. The copyright in graphics and images within each article may be subject to copyright of other parties. In both cases this is subject to a license granted to Frontiers. The compilation of articles constituting this eBook is the property of Frontiers.

Each article within this eBook, and the eBook itself, are published under the most recent version of the Creative Commons CC-BY licence. The version current at the date of publication of this eBook is CC-BY 4.0. If the CC-BY licence is updated, the licence granted by Frontiers is automatically updated to the new version.

When exercising any right under the CC-BY licence, Frontiers must be attributed as the original publisher of the article or eBook, as applicable.

Authors have the responsibility of ensuring that any graphics or other materials which are the property of others may be included in the CC-BY licence, but this should be checked before relying on the CC-BY licence to reproduce those materials. Any copyright notices relating to those materials must be complied with.

Copyright and source acknowledgement notices may not be removed and must be displayed in any copy, derivative work or partial copy which includes the elements in question.

All copyright, and all rights therein, are protected by national and international copyright laws. The above represents a summary only. For further information please read Frontiers' Conditions for Website Use and Copyright Statement, and the applicable CC-BY licence.

ISSN 1664-8714 ISBN 978-2-88963-524-5 DOI 10.3389/978-2-88963-524-5

#### About Frontiers

Frontiers is more than just an open-access publisher of scholarly articles: it is a pioneering approach to the world of academia, radically improving the way scholarly research is managed. The grand vision of Frontiers is a world where all people have an equal opportunity to seek, share and generate knowledge. Frontiers provides immediate and permanent online open access to all its publications, but this alone is not enough to realize our grand goals.

#### Frontiers Journal Series

The Frontiers Journal Series is a multi-tier and interdisciplinary set of open-access, online journals, promising a paradigm shift from the current review, selection and dissemination processes in academic publishing. All Frontiers journals are driven by researchers for researchers; therefore, they constitute a service to the scholarly community. At the same time, the Frontiers Journal Series operates on a revolutionary invention, the tiered publishing system, initially addressing specific communities of scholars, and gradually climbing up to broader public understanding, thus serving the interests of the lay society, too.

#### Dedication to Quality

Each Frontiers article is a landmark of the highest quality, thanks to genuinely collaborative interactions between authors and review editors, who include some of the world's best academicians. Research must be certified by peers before entering a stream of knowledge that may eventually reach the public - and shape society; therefore, Frontiers only applies the most rigorous and unbiased reviews.

Frontiers revolutionizes research publishing by freely delivering the most outstanding research, evaluated with no bias from both the academic and social point of view. By applying the most advanced information technologies, Frontiers is catapulting scholarly publishing into a new generation.

#### What are Frontiers Research Topics?

Frontiers Research Topics are very popular trademarks of the Frontiers Journals Series: they are collections of at least ten articles, all centered on a particular subject. With their unique mix of varied contributions from Original Research to Review Articles, Frontiers Research Topics unify the most influential researchers, the latest key findings and historical advances in a hot research area! Find out more on how to host your own Frontiers Research Topic or contribute to one as an author by contacting the Frontiers Editorial Office: researchtopics@frontiersin.org

# APPLYING NEXT GENERATION SEQUENCING AND TRANSGENIC MODELS TO RARE DISEASE RESEARCH

Topic Editors:

Arvin M. Gouw, Stanford University, United States

Amritha Jaishankar, Rare Genomics Institute, United States

George A. Brooks, University of California, Berkeley, United States

A rare disease is a disease that occurs infrequently in the general population, typically affecting fewer than 200,000 Americans at any given time. More than 30 million people in the United States of America (USA) and 350 million people globally suffer from rare diseases. Out of the 7000+ known rare diseases, fewer than 5% have approved treatments. Rare diseases are frequently chronic, progressive, degenerative, and life-threatening, compromising the lives of patients through loss of autonomy. In the USA, it can take years for a rare disease patient to receive a correct diagnosis. The socioeconomic burden of rare disease is substantial. For those living with diagnosed rare diseases, there is no support system or resource bank for navigating financial, educational, or other aspects of having a rare disease.

The purpose of this Research Topic is to bring together leading researchers, non-profit organizations, healthcare providers/diagnostic companies, and pharma/biotech/CROs in the field to provide a broad perspective on the latest advances, challenges, and opportunities in rare disease research.

A genomic approach to rare disease research is becoming the key to discovering unknown causes behind these syndromes. Genomic rare disease research has attracted not only academic researchers but also researchers from biotech/pharma and non-profit organizations. The breadth and depth of current genomic approaches in rare disease remain largely unexplored. While the creation of novel CRISPR mouse models and the use of NGS (ChIP-Seq, RNA-Seq, etc.) have become more routine for fields such as oncology, rare disease researchers are now making advances in modifying and applying these approaches for rare diseases. This Research Topic provides a fruitful platform for rare disease researchers to share their findings and advance the field of genomics research in the rare disease space.

Citation: Gouw, A. M., Jaishankar, A., Brooks, G. A., eds. (2020). Applying Next Generation Sequencing and Transgenic Models to Rare Disease Research. Lausanne: Frontiers Media SA. doi: 10.3389/978-2-88963-524-5

# Table of Contents

*Individual Clinically Diagnosed With CHARGE Syndrome but With a Mutation in* KMT2D*, a Gene Associated With Kabuki Syndrome: A Case Report*

Sonoko Sakata, Satoshi Okada, Kohei Aoyama, Keiichi Hara, Chihiro Tani, Reiko Kagawa, Akari Utsunomiya-Nakamura, Shinichiro Miyagawa, Tsutomu Ogata, Haruo Mizuno and Masao Kobayashi

*A Comprehensive Atlas of E3 Ubiquitin Ligase Mutations in Neurological Disorders*

Arlene J. George, Yarely C. Hoffiz, Antoinette J. Charles, Ying Zhu and Angela M. Mabb

*Loss of the Intellectual Disability and Autism Gene* Cc2d1a *and its Homolog* Cc2d1b *Differentially Affect Spatial Memory, Anxiety, and Hyperactivity*

Marta Zamarbide, Adam W. Oaks, Heather L. Pond, Julia S. Adelman and M. Chiara Manzini

*Neurodevelopmental Genetic Diseases Associated With Microdeletions and Microduplications of Chromosome 17p13.3*

Sara M. Blazejewski, Sarah A. Bennison, Trevor H. Smith and Kazuhito Toyo-oka

*Rare Compound Heterozygous Frameshift Mutations in* ALMS1 *Gene Identified Through Exome Sequencing in a Taiwanese Patient With Alström Syndrome*

Meng-Che Tsai, Hui-Wen Yu, Tsunglin Liu, Yen-Yin Chou, Yuan-Yow Chiou and Peng-Chieh Chen


# Individual Clinically Diagnosed with CHARGE Syndrome but with a Mutation in KMT2D, a Gene Associated with Kabuki Syndrome: A Case Report

Sonoko Sakata<sup>1</sup>, Satoshi Okada<sup>1</sup>\*, Kohei Aoyama<sup>2</sup>, Keiichi Hara<sup>3</sup>, Chihiro Tani<sup>4</sup>, Reiko Kagawa<sup>1</sup>, Akari Utsunomiya-Nakamura<sup>1</sup>, Shinichiro Miyagawa<sup>1,5</sup>, Tsutomu Ogata<sup>6</sup>, Haruo Mizuno<sup>2,7</sup> and Masao Kobayashi<sup>1</sup>

<sup>1</sup> Department of Pediatrics, Hiroshima University Graduate School of Biomedical and Health Sciences, Hiroshima, Japan, <sup>2</sup> Department of Pediatrics and Neonatology, Nagoya City University Graduate School of Medical Sciences, Nagoya, Japan, <sup>3</sup> Department of Pediatrics, National Hospital Organization Kure Medical Center, Kure, Japan, <sup>4</sup> Department of Diagnostic Radiology, Hiroshima University Graduate School of Biomedical and Health Sciences, Hiroshima, Japan, <sup>5</sup> Miyagawa Kid's Clinic, Hiroshima, Japan, <sup>6</sup> Department of Pediatrics, Hamamatsu University School of Medicine, Hamamatsu, Japan, <sup>7</sup> Department of Pediatrics, International University of Health and Welfare School of Medicine, Chiba, Japan

#### Edited by:

George A. Brooks, University of California, Berkeley, United States

#### Reviewed by:

Michael L. Raff, MultiCare Health System, United States

Feodora Stipoljev, University Hospital Sveti Duh, Croatia

> \*Correspondence: Satoshi Okada saok969@gmail.com

#### Specialty section:

This article was submitted to Genetic Disorders, a section of the journal Frontiers in Genetics

Received: 25 August 2017 Accepted: 28 November 2017 Published: 11 December 2017

#### Citation:

Sakata S, Okada S, Aoyama K, Hara K, Tani C, Kagawa R, Utsunomiya-Nakamura A, Miyagawa S, Ogata T, Mizuno H and Kobayashi M (2017) Individual Clinically Diagnosed with CHARGE Syndrome but with a Mutation in KMT2D, a Gene Associated with Kabuki Syndrome: A Case Report. Front. Genet. 8:210. doi: 10.3389/fgene.2017.00210

We report a Japanese female patient presenting with classic features of CHARGE syndrome, including choanal atresia, growth and development retardation, ear malformations, genital anomalies, multiple endocrine deficiency, and unilateral facial nerve palsy. She was clinically diagnosed with typical CHARGE syndrome, but genetic analysis using the TruSight One Sequence Panel revealed a germline heterozygous mutation in KMT2D with no pathogenic CHD7 alterations associated with CHARGE syndrome. Kabuki syndrome is a rare multisystem disorder characterized by five cardinal manifestations including typical facial features, skeletal anomalies, dermatoglyphic abnormalities, mild to moderate intellectual disability, and postnatal growth deficiency. Germline mutations in KMT2D underlie the molecular pathogenesis of 52–76% of patients with Kabuki syndrome. This is an instructive case that clearly represents a phenotypic overlap between Kabuki syndrome and CHARGE syndrome. It suggests the importance of considering the possibility of a diagnosis of Kabuki syndrome even if patients present with typical symptoms and meet diagnostic criteria of CHARGE syndrome. The case also emphasizes the impact of non-biased exhaustive genetic analysis by next-generation sequencing in the genetic diagnosis of rare congenital disorders with atypical manifestations.

Keywords: CHARGE syndrome, Kabuki syndrome, KMT2D, CHD7, phenotypic overlap

## INTRODUCTION

CHARGE syndrome (OMIM #214800) is an autosomal dominant genetic disorder that was first reported by Pagon et al. (1981). Its characteristic features are coloboma, heart malformations, choanal atresia, growth and/or development retardation, genital anomalies, and ear malformations. Germline mutations in CHD7 (OMIM <sup>∗</sup> 608892), encoding chromodomain helicase DNA-binding protein 7, have been identified in 58–67% of patients with CHARGE features (Vissers et al., 2004; Lalani et al., 2006; Zentner et al., 2010). Furthermore, the CHD7 mutation detection rate rises to around 90% when patients meet the full CHARGE diagnostic criteria advocated by Blake or Verloes (Blake et al., 1998; Verloes, 2005; Jongmans et al., 2006; Janssen et al., 2012). However, CHD7 mutations have also been identified in patients with Kallmann syndrome, idiopathic hypogonadotropic hypogonadism, autism spectrum disorder, and T cell immunodeficiencies such as complete DiGeorge syndrome and Omenn-like syndrome (Ogata et al., 2006; Sanka et al., 2007; Gennery et al., 2008; Kim et al., 2008; Jiang et al., 2013). This suggests that germline CHD7 mutations are associated with a broad clinical spectrum.

Kabuki syndrome (KS) (OMIM #147920 and #300867) is a rare multiple malformation disorder that was originally reported by Niikawa et al. (1981). Subsequently, the five cardinal manifestations of KS were defined by Niikawa et al. (1988) as typical facial features, skeletal anomalies, dermatoglyphic abnormalities, mild to moderate intellectual disability, and postnatal growth deficiency. However, consensus clinical diagnostic criteria for KS have not been established.

Two causative genes have thus far been identified in patients with KS (Ng et al., 2010; Jiang et al., 2013; Miyake et al., 2013). Germline mutations in KMT2D (OMIM <sup>∗</sup> 602113), encoding lysine-specific methyltransferase 2D, are responsible for the major molecular pathogenesis of KS, explaining the diagnosis of 52–76% of patients (Adam et al., 1993; Ng et al., 2010; Hannibal et al., 2011; Li et al., 2011; Micale et al., 2011; Paulussen et al., 2011; Banka et al., 2012; Makrythanasis et al., 2013). Most of those patients have de novo mutations in KMT2D, whereas obvious autosomal dominant inheritance has been identified in only a few familial cases (Ng et al., 2010).

The second causative gene of KS is KDM6A (OMIM <sup>∗</sup> 300128), encoding lysine-specific demethylase 6A, which causes X-linked KS subtype 2 when mutated (OMIM #300867). Germline mutations in KDM6A are relatively rare, and are responsible for fewer than 5% of patients with KS (Lederer et al., 2012; Miyake et al., 2013; Banka et al., 2015). Most KS patients, especially those with typical facial dysmorphism, carry KMT2D mutations (Banka et al., 2012). However, phenotypical variability has been documented in individuals with KMT2D mutations, indicating that such mutations can be detected in KS patients with atypical manifestations.

Kabuki syndrome and CHARGE syndrome are distinct congenital disorders, although phenotypic and molecular links between them have been reported previously (Ming et al., 2003; Genevieve et al., 2004; Schulz et al., 2014; Verhagen et al., 2014; Badalato et al., 2017; Butcher et al., 2017). However, to our knowledge, there are no reports of KS cases that have met both Blake and Verloes diagnostic criteria for CHARGE syndrome (Blake et al., 1998; Verloes, 2005). We herein report a patient who was clinically diagnosed with typical CHARGE syndrome and fulfilled both Blake and Verloes criteria (Blake et al., 1998; Verloes, 2005), but who was genetically diagnosed with atypical KS based on the presence of a de novo KMT2D mutation and the absence of pathogenic variation in CHD7. This case demonstrates the phenotypic overlap between CHARGE syndrome and KS.

## CASE REPORT

The 24-year-old Japanese female patient was born to non-consanguineous parents (**Figure 1A**) at 36 weeks and 6 days of gestational age by Cesarean section because of fetal distress. At birth, she had a low body weight (2,100 g) and a short stature (44 cm). She presented with multiple dysmorphic features including choanal atresia, cleft palate, micrognathia, a hypoplastic cupped auricle with atresia of the external auditory meatus, and right facial nerve palsy. Echocardiography revealed no structural abnormalities of the heart, and no coloboma was observed in either eye, together with no abnormalities in the iris, retina, choroid, or optic disk. Chromosome analysis revealed a normal karyotype (46, XX). Mechanical ventilation was required for respiratory distress for 2 weeks, after which an oropharyngeal tube was used to maintain the airways until she was 7 months old. Her choanal atresia was treated surgically at age 1 year. She had tooth hypoplasia with tooth malalignment. She also had entropion of the right upper eyelid, and an epicanthal fold of the left eye that were surgically treated at age 11 years.

She presented at our hospital at age 11 years with severe short stature (111.5 cm, −5.87 standard deviation). At the initial visit, she showed deafness, right facial nerve palsy, cleft palate, malformation of the auricle with atresia of the external auditory meatus, and bilateral hypoplastic nipples. Computed tomography revealed hypoplasia of the pancreatic body and tail (**Figure 2A**), and severe uterine hypoplasia with the presence of both ovaries (**Figure 2B**). It also revealed bilateral atresia of the external auditory meatus, hypoplasia of the right vestibular aqueduct, bilateral hypoplasia of the long limb of incus, and bilateral agenesis of stapes and the posterior semicircular canal (**Figure 2C**). Based on these findings, she was clinically diagnosed with typical CHARGE syndrome.

Laboratory examination revealed primary hypoparathyroidism [calcium 6.8 mg/dL (reference range: 8.6–11.0 mg/dL), phosphorus 7.1 mg/dL (reference range: 2.5–4.5 mg/dL), and intact parathyroid hormone 20.0 pg/mL (reference range: 10–65 pg/mL)], and low serum levels of insulin-like growth factor 1 [14.1 ng/mL (reference range: 175–638 ng/mL)] with suspected mild primary hypothyroidism [free thyroxine 0.9 ng/dL (reference range: 1.1–1.9 ng/dL), free triiodothyronine 3.1 pg/mL (reference range: 2.3–4.7 pg/mL), and thyroid-stimulating hormone (TSH) 5.52 µU/mL (reference range: 0.48–4.82 µU/mL)] (Supplementary Table 1). The triple stimulation test (insulin, luteinizing hormone-releasing hormone, and thyrotropin-releasing hormone) revealed complete growth hormone (GH) deficiency (peak GH response to insulin and arginine: 1.6 and 2.1 ng/mL, respectively; cut-off point to define severe GH deficiency: <3 ng/mL), hypogonadotropic hypogonadism (peak luteinizing hormone and follicle-stimulating hormone response: 0.1 and 1.0 mIU/mL, respectively), mild primary hypothyroidism [peak TSH: 44.62 µIU/mL (reference < 35 µIU/mL)], and subclinical hyperprolactinemia [peak prolactin: 110.6 ng/mL (reference < 70 ng/mL)] (Supplementary Table 2).

Magnetic resonance imaging of the brain showed an empty sella (**Figure 2D**). The oral glucose tolerance test revealed borderline diabetes with impaired insulin secretion (Supplementary Table 3). The patient started treatment with levothyroxine, alfacalcidol, and GH. Her growth velocity dramatically improved after starting GH therapy (**Figure 1C**). Seven months after starting GH therapy, the repeated oral glucose tolerance test showed borderline diabetes with impaired insulin secretion [insulinogenic index: 0.09 (reference: 1.34 ± 0.66)] without insulin resistance [Homeostasis Model Assessment of Insulin Resistance (HOMA-R): 0.2 (reference: < 1.6)] (Supplementary Table 3). Serum C-peptide levels were low [0.4 ng/mL (reference range: 1.1–3.3 ng/mL)], but fasting blood glucose and glycated hemoglobin (HbA1c) were normal, at 102 mg/dL (reference: <126) and 5.3% (reference range: 4.3–5.8%), respectively. However, about 3 years after starting GH therapy, the patient developed diabetes mellitus. She showed elevated fasting blood glucose (301 mg/dL) and HbA1c (9.9%), low C-peptide levels (1.7 ng/mL), but was negative for anti-glutamic acid decarboxylase antibodies. She therefore started insulin therapy.
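For readers unfamiliar with the two glucose-tolerance indices quoted above, the following minimal sketch shows how they are conventionally computed: HOMA-R as fasting insulin (µU/mL) × fasting glucose (mg/dL) / 405, and the insulinogenic index as the 0–30 min rise in insulin divided by the 0–30 min rise in glucose. The function names and all input values except the fasting glucose of 102 mg/dL are illustrative assumptions; the measured values are in Supplementary Table 3 and are not reproduced here.

```python
def homa_r(fasting_insulin_uU_ml: float, fasting_glucose_mg_dl: float) -> float:
    """HOMA-R = fasting insulin (uU/mL) x fasting glucose (mg/dL) / 405."""
    return fasting_insulin_uU_ml * fasting_glucose_mg_dl / 405.0


def insulinogenic_index(ins0: float, ins30: float, glu0: float, glu30: float) -> float:
    """Rise in insulin (uU/mL) over rise in glucose (mg/dL), 0 to 30 min of an OGTT."""
    delta_glucose = glu30 - glu0
    if delta_glucose <= 0:
        raise ValueError("glucose must rise between 0 and 30 min")
    return (ins30 - ins0) / delta_glucose


# Hypothetical inputs chosen only to illustrate the arithmetic;
# 102 mg/dL is the fasting glucose reported in the text.
example_homa = homa_r(fasting_insulin_uU_ml=0.8, fasting_glucose_mg_dl=102.0)
example_igi = insulinogenic_index(ins0=0.8, ins30=7.1, glu0=102.0, glu30=172.0)
print(round(example_homa, 2), round(example_igi, 2))
```

A low insulinogenic index together with a low HOMA-R, as reported for this patient, points to impaired insulin secretion rather than insulin resistance.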

She underwent plastic surgery to correct the malformation of the auricle five times from age 14 years. She also started estrogen replacement therapy from age 15 years. The patient is currently 24 years old and has multiple dysmorphic features (**Figure 1D**). She has neither fingertip pads nor hockey-stick palmar creases. Genetic testing was performed at age 23 years. Comprehensive DNA sequencing using the TruSight One sequencing panel (Illumina, San Diego, CA, United States) revealed a novel heterozygous mutation, c.10690C>G (p.L3564V), in KMT2D. The mutation was confirmed by Sanger sequencing (**Figure 1B**). It was absent from both of her parents, suggesting that it was de novo, and was not found in the public databases (NCBI, Ensembl, dbSNP, or ExAC). It was predicted to be damaging by PolyPhen-2 (probably damaging) and SIFT (damaging) prediction tools, suggesting that it was not an irrelevant polymorphism. A microarray-based comparative genomic hybridization assay revealed no obvious pathogenic DNA copy number aberrations.
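The coding-DNA and protein descriptions of the variant can be cross-checked with simple arithmetic. The sketch below maps coding position 10690 to its codon and confirms that a C-to-G change at that position in a leucine codon yields valine under the standard genetic code. The reference codon itself is not given in the report, so the check runs over every leucine codon carrying C at the affected position; the helper names are hypothetical.

```python
# Subset of the standard genetic code needed for this check.
LEU_CODONS = {"CTT", "CTC", "CTA", "CTG", "TTA", "TTG"}
VAL_CODONS = {"GTT", "GTC", "GTA", "GTG"}


def codon_coordinates(cdna_pos: int) -> tuple:
    """Map a 1-based coding-DNA position to (codon number, position in codon 1-3)."""
    codon_number = (cdna_pos - 1) // 3 + 1
    offset = (cdna_pos - 1) % 3 + 1
    return codon_number, offset


def substitute(codon: str, offset: int, alt: str) -> str:
    """Return the codon with the base at 1-based `offset` replaced by `alt`."""
    return codon[: offset - 1] + alt + codon[offset:]


codon_number, offset = codon_coordinates(10690)

# Leucine codons that carry the reference base C at the affected position.
leu_with_ref = [c for c in sorted(LEU_CODONS) if c[offset - 1] == "C"]
mutated_codons = [substitute(c, offset, "G") for c in leu_with_ref]

print(codon_number, offset)                            # codon 3564, first base
print(all(m in VAL_CODONS for m in mutated_codons))    # every outcome is valine
```

Position 10690 falls on the first base of codon 3564, and every CTN leucine codon becomes a GTN valine codon, so c.10690C>G and p.L3564V are mutually consistent.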

#### MATERIALS AND METHODS

Genomic DNA was extracted from peripheral blood samples using the QIAamp Blood Midi Kit (QIAGEN, Hilden, Germany). We performed trio sequencing using a TruSight One sequencing panel consisting of 4813 genes associated with known Mendelian genetic disorders on a MiSeq instrument (Illumina). Sequence data were analyzed using CLC Genomics Workbench version 8.0 (CLC bio, Aarhus, Denmark). Variants detected by MiSeq were validated by conventional Sanger sequencing. Microarray comparative genomic hybridization was performed with the SurePrint G3 Human CGH Microarray Kit 8 × 60K, Reference DNA Female, and the SureScan Microarray Scanner (Agilent Technologies, Santa Clara, CA, United States). Results were analyzed by CytoGenomics Software version 4.0 (Agilent).
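The logic by which trio sequencing flags a candidate de novo variant can be sketched as follows. This is an illustration only, not the CLC Genomics Workbench workflow used in the study: the record layout, the genotype strings, and the coordinates are all hypothetical, and a variant missing from a parent's call set is assumed to be homozygous reference, which in practice would also require adequate read depth at that site.

```python
from typing import Dict, List, Tuple

Variant = Tuple[str, int, str, str]   # (chromosome, position, ref, alt), hypothetical layout
Genotypes = Dict[Variant, str]        # "0/0", "0/1" or "1/1"


def candidate_de_novo(proband: Genotypes, mother: Genotypes, father: Genotypes) -> List[Variant]:
    """Return heterozygous proband variants that neither parent carries."""
    candidates = []
    for variant, genotype in proband.items():
        if genotype != "0/1":
            continue  # a dominant de novo event is expected to be heterozygous
        # Assumption: absence from a parental call set means homozygous reference.
        if mother.get(variant, "0/0") == "0/0" and father.get(variant, "0/0") == "0/0":
            candidates.append(variant)
    return sorted(candidates)


# Placeholder calls; "chrN" and the positions are not real coordinates.
proband = {("chrN", 100, "A", "T"): "0/1",
           ("chrN", 200, "C", "G"): "0/1",
           ("chrN", 300, "G", "A"): "1/1"}
mother = {("chrN", 100, "A", "T"): "0/1"}
father = {("chrN", 300, "G", "A"): "0/1"}

print(candidate_de_novo(proband, mother, father))
```

Only the second placeholder variant survives the filter: the first is inherited from the mother and the third is homozygous in the proband. Candidates found this way still need orthogonal confirmation, which the authors obtained by Sanger sequencing.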

#### DISCUSSION

The diagnostic criteria of CHARGE syndrome are defined by two representative papers (Blake et al., 1998; Verloes, 2005). The current patient met the diagnostic criteria of typical CHARGE syndrome defined by Blake et al. by presenting with three major criteria: choanal atresia, characteristic ear abnormalities and cranial nerve dysfunction, and four minor criteria: developmental delay, growth deficiency, an orofacial cleft and genital hypoplasia (based on the finding of hypogonadotropic hypogonadism). She met the diagnostic criteria of typical CHARGE syndrome defined by Verloes by fulfilling two major signs: choanal atresia and hypoplastic semi-circular canals, and four minor signs: rhombencephalic dysfunction, hypothalamohypophyseal dysfunction, abnormalities of the middle and external ear, and mental retardation. The patient was therefore clinically diagnosed with typical CHARGE syndrome. In contrast, she did not show coloboma or heart defects, which are frequently identified in patients with CHARGE syndrome (Pagon et al., 1981; Blake et al., 1998; Zentner et al., 2010). Moreover, comprehensive genetic analysis identified a de novo germline mutation, L3564V, in KMT2D, which is a gene associated with KS. No mutations or variants were found in CHD7. Therefore, the patient was genetically diagnosed with KS despite presenting with typical symptoms of CHARGE syndrome.
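The way the patient's findings map onto the two sets of criteria can be sketched as a simple counting rule. The thresholds below are an assumption: they follow how the Blake et al. (1998) and Verloes (2005) criteria are commonly summarized (Blake: four major, or three major plus three minor; Verloes, typical form: three major, or two major plus two minor) and are not stated in this report, so they should be checked against the original papers. The findings listed are those given in the paragraph above.

```python
def meets_blake_typical(n_major: int, n_minor: int) -> bool:
    """Assumed rule: 4 major, or 3 major plus at least 3 minor criteria."""
    return n_major >= 4 or (n_major >= 3 and n_minor >= 3)


def meets_verloes_typical(n_major: int, n_minor: int) -> bool:
    """Assumed rule: 3 major, or 2 major plus at least 2 minor signs."""
    return n_major >= 3 or (n_major >= 2 and n_minor >= 2)


# Findings as listed in the Discussion for this patient.
blake_major = ["choanal atresia", "characteristic ear abnormalities", "cranial nerve dysfunction"]
blake_minor = ["developmental delay", "growth deficiency", "orofacial cleft", "genital hypoplasia"]
verloes_major = ["choanal atresia", "hypoplastic semicircular canals"]
verloes_minor = ["rhombencephalic dysfunction", "hypothalamo-hypophyseal dysfunction",
                 "middle/external ear abnormalities", "mental retardation"]

blake_ok = meets_blake_typical(len(blake_major), len(blake_minor))
verloes_ok = meets_verloes_typical(len(verloes_major), len(verloes_minor))
print(blake_ok, verloes_ok)
```

Under these assumed thresholds both checks pass, which matches the authors' statement that the patient fulfilled both sets of criteria.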

A phenotypic overlap between CHARGE syndrome and KS has been described in previous studies (Ming et al., 2003; Genevieve et al., 2004; Schulz et al., 2014; Verhagen et al., 2014; Badalato et al., 2017; Butcher et al., 2017). For example, coloboma is a major symptom, found in 65–90% of patients with CHARGE syndrome (Blake et al., 1998; Zentner et al., 2010). However, coloboma is also found in patients with KS who show a phenotypic overlap between KS and CHARGE syndrome (Ming et al., 2003). Three cases of genetically confirmed KS with KMT2D mutations who also met typical CHARGE diagnostic criteria, as defined by either Blake et al. (Schulz et al., 2014; Verhagen et al., 2014) or Verloes (Patel and Alkuraya, 2015), have been previously reported. A detailed comparison of clinical symptoms is shown in Supplementary Tables 4, 5. In contrast, the current case is the first molecularly diagnosed KS patient who simultaneously met two representative diagnostic criteria of typical CHARGE syndrome as defined by Blake et al. and Verloes. It is therefore considered to be an instructive case that clearly indicates the phenotypic overlap between CHARGE syndrome and KS.

The current case presented with several rare KS symptoms. Cranial nerve dysfunction is a typical symptom defined as a major criterion found in 70–92% of patients with CHARGE syndrome (Byerly and Pauli, 1993; Blake et al., 1998, 2008). Specifically, the involvement of cranial nerves I, VII, VIII, IX, and/or X is frequently observed in patients with CHARGE syndrome (Blake et al., 2008). The current case presented with right facial nerve palsy, which occurs in 32–50% of patients with CHARGE syndrome (Blake et al., 2008), but has only been reported in one previous study of a patient with KS (Iida et al., 2006). Choanal atresia is another typical symptom identified in 50–60% of patients with CHARGE syndrome (Blake et al., 1998). However, it is rarely seen in patients with KS, having been reported in four previous studies (Powell et al., 2003; Teissier et al., 2008; Schulz et al., 2014; Badalato et al., 2017). Among those patients, KMT2D mutations were only identified in one familial case (Badalato et al., 2017) of autosomal dominant inheritance associated with the Q3575H mutation in exon 38. Surprisingly, the L3564V mutation of the current patient is located in the same exon and affects an amino acid close to that mutated by Q3575H. Further studies are required to determine whether these two missense mutations affect the development of choanal atresia.

Recent studies revealed the presence of a molecular link between CHD7 and KMT2D proteins (Schulz et al., 2014; Butcher et al., 2017). Both CHD7 and KMT2D interact with members of the WAR complex, suggesting that these two molecules function as part of the same chromatin modification machinery (Schulz et al., 2014). Butcher et al. (2017) investigated genome-wide DNA methylation profiles in patients with CHD7 or KMT2D mutations, and found that they showed distinct patterns of epigenetic dysregulation. They also identified common DNA methylation signatures, including a gain of DNA methylation at homeobox A5 (HOXA5), which is shared by the two genetic disorders and may account for some of the clinical overlap between CHARGE syndrome and KS. Therefore, both phenotypic and molecular links are observed in patients with CHARGE syndrome and KS.

#### CONCLUDING REMARKS

We report an atypical case of KS showing clear phenotypic overlap with CHARGE syndrome. This case highlights the importance of considering a diagnosis of KS even if patients fully meet the diagnostic criteria of typical CHARGE syndrome. Therefore, molecular testing of KMT2D should be considered in patients clinically diagnosed with CHARGE syndrome without CHD7 mutations. It also emphasizes the impact of non-biased exhaustive genetic analysis by next-generation sequencing in the molecular diagnosis of rare congenital disorders with atypical manifestations.

## ETHICS STATEMENT

We obtained written informed consent for genomic analysis of the patient and her parents in accordance with the Declaration of Helsinki. The genetic study was approved by the Institutional Review Board of Nagoya City University Graduate School of Medical Sciences (approval no. 130). The mother of the patient provided written informed consent for the publication of the patient's identifiable information.

#### AUTHOR CONTRIBUTIONS

Patient workup: SS, SO, KH, CT, RK, AU-N, SM, and TO. Genetic analysis: KA and HM. Drafted the manuscript: SS, SO, and MK. Final approval of the version to be published: SS, SO, KA, KH, CT, RK, AU-N, SM, TO, HM, and MK. Agreement to be accountable for all aspects of the work: SS, SO, KA, KH, CT, RK, AU-N, SM, TO, HM, and MK.

## FUNDING

This study was supported in part by the Practical Research Project for Rare/Intractable Diseases from the Japan Agency for Medical Research and Development, AMED.

## ACKNOWLEDGMENTS

The authors would like to thank Ohnishi Hidenori (Department of Pediatrics, Graduate School of Medicine, Gifu University) for helpful discussion. They thank Sarah Williams, Ph.D., from Edanz Group (https://www.edanzediting.com) for editing a draft of this manuscript.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2017.00210/full#supplementary-material

#### REFERENCES

Adam, M. P., Hudgins, L., and Hannibal, M. (1993). "Kabuki syndrome," in GeneReviews®, eds R. A. Pagon, M. P. Adam, H. H. Ardinger, S. E. Wallace, A. Amemiya, L. J. H. Bean, et al. (Seattle, WA: University of Washington).

Badalato, L., Farhan, S. M., Dilliott, A. A., Care4Rare Canada Consortium, Bulman, D. E., Hegele, R. A., et al. (2017). KMT2D p.Gln3575His segregating in a family with autosomal dominant choanal atresia strengthens the Kabuki/CHARGE connection. Am. J. Med. Genet. A 173, 183–189. doi: 10.1002/ajmg.a.38010


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Sakata, Okada, Aoyama, Hara, Tani, Kagawa, Utsunomiya-Nakamura, Miyagawa, Ogata, Mizuno and Kobayashi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# A Comprehensive Atlas of E3 Ubiquitin Ligase Mutations in Neurological Disorders

Arlene J. George<sup>1</sup> , Yarely C. Hoffiz <sup>1</sup> , Antoinette J. Charles <sup>1</sup> , Ying Zhu<sup>2</sup> and Angela M. Mabb<sup>1</sup> \*

*<sup>1</sup> Neuroscience Institute, Georgia State University, Atlanta, GA, United States, <sup>2</sup> Creative Media Industries Institute & Department of Computer Science, Georgia State University, Atlanta, GA, United States*

Protein ubiquitination is a posttranslational modification that plays an integral part in mediating diverse cellular functions. The process of protein ubiquitination requires an enzymatic cascade that consists of a ubiquitin activating enzyme (E1), ubiquitin conjugating enzyme (E2) and an E3 ubiquitin ligase (E3). There are an estimated 600–700 E3 ligase genes representing ∼5% of the human genome. Not surprisingly, mutations in E3 ligase genes have been observed in multiple neurological conditions. We constructed a comprehensive atlas of disrupted E3 ligase genes in common (CND) and rare neurological diseases (RND). Of the predicted and known human E3 ligase genes, we found ∼13% were mutated in a neurological disorder with 83 total genes representing 70 different types of neurological diseases. Of the E3 ligase genes identified, 51 were associated with an RND. Here, we provide an updated list of neurological disorders associated with E3 ligase gene disruption. We further highlight research in these neurological disorders and discuss the advanced technologies used to support these findings.

#### Edited by:

*Amritha Jaishankar, Rare Genomics Institute, United States*

#### Reviewed by:

*Nelson L. S. Tang, The Chinese University of Hong Kong, Hong Kong*

*Musharraf Jelani, Department of Genetic Medicine, King Abdulaziz University, Saudi Arabia*

> \*Correspondence: *Angela M. Mabb amabb@gsu.edu*

#### Specialty section:

*This article was submitted to Genetic Disorders, a section of the journal Frontiers in Genetics*

Received: *01 September 2017* Accepted: *22 January 2018* Published: *14 February 2018*

#### Citation:

*George AJ, Hoffiz YC, Charles AJ, Zhu Y and Mabb AM (2018) A Comprehensive Atlas of E3 Ubiquitin Ligase Mutations in Neurological Disorders. Front. Genet. 9:29. doi: 10.3389/fgene.2018.00029*

Keywords: ubiquitin, neurological, rare diseases, Angelman syndrome, transgenic

## INTRODUCTION

Protein ubiquitination is a posttranslational modification that involves the covalent tethering of a small 76 amino acid protein called ubiquitin to target proteins (Hershko and Ciechanover, 1998). Ubiquitination mediates many cellular functions, which include signal transduction and the removal of proteins by the ubiquitin proteasome system (UPS) (Hershko, 1996). The initiation of protein ubiquitination typically requires an ATP-dependent enzymatic cascade that is initiated with the priming of a ubiquitin onto a ubiquitin activating enzyme (E1) and the transfer to a ubiquitin conjugating enzyme (E2) (Komander and Rape, 2012; Zheng and Shabek, 2017). Ubiquitin is then covalently attached to a lysine residue on the target protein by an E3 ubiquitin ligase (E3) and this process can be repeated to create a series of ubiquitin chains (Hershko and Ciechanover, 1998). Ubiquitin chains can take various forms in length and configuration. The fate of these chains leads to multiple cellular functions, one of which provides a signal for the protein to undergo degradation by the UPS (Swatek and Komander, 2016; Yau and Rape, 2016).

Although there are only 2 E1 and 30–50 E2 genes, there are over 600 human E3 ligase genes whose diversity is accounted for by three different types of catalytic domains: Really Interesting New Gene (RING), Homologous to E6-AP Carboxyl Terminus (HECT), or RING-Between-RING (RBR) (Zheng and Shabek, 2017). While both RING and HECT E3 ligases transfer ubiquitin to a lysine residue on the substrate, RING E3s act as a platform to allow direct transfer of ubiquitin from the E2 to the substrate (Riley et al., 2013). HECT E3s, on the other hand, contain a catalytic cysteine residue that can form a thioester bond directly with ubiquitin. The RBR acts as a hybrid of 2 domains, RING and HECT, with each family having various domains leading to the ubiquitination of numerous substrates (Marín et al., 2004; Zheng and Shabek, 2017). Proteins that are part of the RBR family have both a canonical RING domain and a catalytic cysteine residue similar to the HECT domain (Riley et al., 2013).

E3 ligases have been linked to neurological disorders that include neurodegeneration, neurodevelopmental disorders, and intellectual disability (Hegde and Upadhya, 2011; Upadhyay et al., 2017), many of which have no known effective therapies. Neurological disorders are a heterogeneous group of disorders that result from the impairment of the central and peripheral nervous system, affect 1 in 6 individuals, and contribute to 12% of total deaths worldwide (WHO, 2006). Rare neurological disorders (RNDs) are a subtype of neurological diseases that represent 50% of all rare diseases, affecting fewer than 200,000 people in the United States, and are often overlooked due to a limited understanding of their potential causative factors (Han et al., 2014; Jiang et al., 2014; NCATS, 2016). Although neurological disorders encompass a large array of genetic defects, next-generation sequencing (NGS) has enabled researchers to identify constituents of the ubiquitin pathway, namely E3 ligases, as causative factors for neurological disease (Krystal and State, 2014; McCarroll et al., 2014; Brown and Meloche, 2016). We speculated that recent advances in NGS resulted in a massive expansion of the list of E3 ligases mutated in neurological disease. To test this, we performed an unbiased manual database search of ∼660 predicted and known human E3 ligase genes specifically mutated in neurological disease (Li et al., 2008; Hou et al., 2012). Strikingly, we found ∼13% of E3 ligase genes were mutated in a neurological disorder with 83 total genes representing 70 different types of neurological diseases (**Supplementary Tables 1**, **2**). Of the E3 ligase genes identified, 19 were associated with a CND (**Figures 1**, **2** and **Supplementary Table 1**), while 51 were associated with an RND (**Figures 3**, **4** and **Supplementary Table 2**). Thus, understanding how E3 ligase disruption is a causative factor for neurological disease may contribute to a strategy for therapeutic interventions for both CNDs and especially RNDs (Upadhyay et al., 2017).
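As a quick arithmetic check on the proportion reported above, and assuming the denominator is the ∼660 genes searched (itself an approximate figure), the 83 mutated genes correspond to:

$$\frac{83}{660} \approx 0.126 \approx 13\%$$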

The critical role of E3 ligases in neuropathology has been well documented in CNDs and is the result of both classic methodologies and innovative technologies that parsed out the consequences of E3 ligase disruption. However, very little is known about E3 ligase functions in RNDs, including the identity of E3 ligase substrates, their definitive mechanisms, their long-term effects on a cellular and systemic level, and how to ameliorate these effects (Mabb and Ehlers, 2010; Atkin and Paulson, 2014). In the nervous system, E3 ligases are an integral part of the ubiquitin proteasome pathway involved in the turnover of proteins (Tai and Schuman, 2008; Mabb and Ehlers, 2010; Yamada et al., 2013). They localize to multiple cellular regions, which include the Golgi apparatus, centrosome, nucleus, cytoskeleton and synapse (Yamada et al., 2013). Indeed, disruptions in ubiquitin pathway components have been identified in numerous human disorders which include those related to generalized inflammation and cancer (Hegde and Upadhya, 2011; Upadhyay et al., 2017). Notably, ubiquitination mediates many forms of synaptic plasticity, which ultimately affect learning and memory (Mabb and Ehlers, 2010; Hegde et al., 2014). Below, we discuss how advanced technologies in CNDs and RNDs have been used to broaden the understanding of E3 ligases in neurological disease and have allowed researchers to exploit avenues for effective therapies. Although this review cannot encompass a thorough analysis of each disorder, we will focus on the technologies that were used to study E3 ligase disruptions in RNDs, highlighting their importance in increasing our generalized understanding of rare disease.

## E3 UBIQUITIN LIGASES AND COMMON NEUROLOGICAL DISEASE

#### Parkinson Disease

Parkinson disease (PD) is one of the most well-studied neurological diseases related to E3 ligase dysfunction. PD is characterized by dystonia, rigidity, tremors, hyperreflexia, bradykinesia, postural instability, substantia nigra gliosis and dopamine depletion, and Lewy body dementia (Halliday et al., 2014; Biundo et al., 2016). PD affects 0.3% of the general population in the United States and 0.1–0.2% in European countries, with rates that increase with aging (Kowal et al., 2013; Tysnes and Storstein, 2017). Genetic mutations in the E3 ligases LRSAM1, FBXO7 (PARK15), and PARK2 (**Figure 1** and **Supplementary Table 1**) have been identified in Parkinson disease (Wu et al., 2005; Choi et al., 2008; Lohmann et al., 2015; Aerts et al., 2016). PARKIN/PARK2 belongs to the RBR family of E3 ligases, and mutations in this gene occur in 50% of familial cases and 10–20% of sporadic cases with high penetrance in early-onset PD (Lill, 2016; Zhang et al., 2016).

FIGURE 1 | Common neurological disorders (CNDs) and E3 ligase gene associations. Diagram of CNDs correlated with E3 ligase genes that are mutated in specific disorders. Diseases shaded in blue indicate multiple genes linked to that disorder. Genes highlighted in dark gray are shared between several diseases. Figures were generated using Graphviz (www.graphviz.org).

A series of advanced technologies have elucidated the functional consequences of deletion and missense mutations in the PARKIN gene (Wu et al., 2005; Choi et al., 2008) and their use has increased the capacity for PD therapeutics (Hattori et al., 1998; Kitada et al., 1998; Hedrich et al., 2001). For example, proteomic profiling demonstrated PARKIN's role in mitochondrial autophagy (Narendra and Youle, 2011). These findings were further supported by liquid chromatography–mass spectrometry and ubiquitin Absolute Quantification (UB-AQUA) proteomics. UB-AQUA is a mass spectrometry-based method that uses isotopically labeled internal standard peptides to quantify peptides from digested mono- and poly-ubiquitinated chains attached to substrates (Kirkpatrick et al., 2006; Phu et al., 2011). UB-AQUA was used to identify and quantify subtypes of mitochondrial ubiquitin chain linkages. Using this method, PTEN-induced putative kinase 1 (PINK1) was found to phosphorylate PARKIN, leading to its activation and the formation of canonical and non-canonical ubiquitin chains on mitochondria, which were PARKIN-dependent (Ordureau et al., 2014). This served as a feed-forward mechanism to promote PARKIN recruitment and mitochondrial ubiquitination in response to mitochondrial damage. Collectively, these findings were critical to understanding potential causative factors leading to PD pathogenesis (Ordureau et al., 2014).
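The quantification principle behind AQUA-style methods can be illustrated with a short sketch. It assumes that a known amount of an isotopically labeled (heavy) standard peptide is spiked into the digest and that the amount of the endogenous (light) peptide is estimated from the light-to-heavy signal ratio. The function name, the two linkage labels, and every intensity value are hypothetical and are not taken from the studies cited above.

```python
def aqua_amount(light_intensity, heavy_intensity, heavy_spiked_fmol):
    """Estimate endogenous peptide amount (fmol) from the light/heavy signal ratio."""
    if heavy_intensity <= 0:
        raise ValueError("heavy standard signal must be positive")
    return light_intensity / heavy_intensity * heavy_spiked_fmol


# Hypothetical signals for two ubiquitin linkage-specific peptides.
measurements = {
    "K48": {"light": 2.0e6, "heavy": 1.0e6, "spiked_fmol": 50.0},
    "K63": {"light": 5.0e5, "heavy": 1.0e6, "spiked_fmol": 50.0},
}

# Absolute amount per linkage type, then each linkage as a fraction of the total.
amounts = {
    name: aqua_amount(m["light"], m["heavy"], m["spiked_fmol"])
    for name, m in measurements.items()
}
total = sum(amounts.values())
fractions = {name: amount / total for name, amount in amounts.items()}
```

Because each linkage type is measured against its own spiked standard, the resulting amounts can be compared across linkage types and conditions, which is what makes a per-linkage breakdown possible.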

The impairment of mitochondria in PD due to deletions of mitochondrial DNA (mtDNA) leads to respiratory-chain deficiencies especially in dopamine (DA) neurons of the substantia nigra (Bender et al., 2006). To study the effects of mtDNA deletions and other genetic mutations, conditional knock-out or knock-in mouse models using CRE recombinase and loxP sites were used (Soriano, 1999). The CRE-loxP system allows for conditional loss-of-function or gain-of-function studies in specific tissues and circumvents early life lethality and unwanted phenotypes in later stages of life by controlling when genes are expressed temporally and spatially (Sauer, 1998). A MitoPark reporter mouse line was created first by inserting a loxP-flanked stop-cassette upstream of the mitochondrial targeting presequence lox and yellow fluorescent protein (YFP) transgene to target it to the mitochondrial matrix. The stop cassette blocks YFP expression, so YFP is expressed only in cells in which CRE excises the stop cassette.
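The logic of a CRE-dependent reporter can be summarized in a few lines. This is a minimal sketch, assuming a simple on/off model in which the reporter is expressed only when a cell carries the loxP-flanked stop reporter and also expresses CRE. The `Cell` class, the function name, and the example cell types are illustrative and do not model recombination efficiency or expression levels.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    name: str
    expresses_cre: bool  # e.g. CRE driven by a cell type-specific promoter


def reporter_expressed(cell, has_floxed_stop_reporter=True):
    """Return True if the reporter (e.g. YFP) is expressed in this cell."""
    if not has_floxed_stop_reporter:
        return False  # no reporter transgene, nothing to express
    # The stop cassette stays in place unless CRE is present to excise it.
    stop_cassette_excised = cell.expresses_cre
    return stop_cassette_excised


cells = [
    Cell("midbrain DA neuron", expresses_cre=True),
    Cell("cortical neuron", expresses_cre=False),
]
labeled = [c.name for c in cells if reporter_expressed(c)]
```

In this toy model only the CRE-expressing cell type is labeled, which mirrors how a cell type-specific CRE driver restricts reporter expression.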

Using the MitoPark mouse model, the consequences of respiratory chain dysfunction on the properties of mitochondria and DA neurons were examined after DA neuron-specific knockout of the mitochondrial transcription factor A (TFAM) (Sterky et al., 2011). In order to study the DA neurons, ROSA26+/SmY mice were crossed with dopamine transporter (DAT)-CRE mice that express CRE under a DA transporter locus, so the offspring from these parents express YFP precisely in the mitochondria of DA neurons in the midbrain (Sterky et al., 2011). Using this model, Sterky et al. demonstrated increased fragmented and aggregated mitochondria in aged PARKIN knockout mice (Sterky et al., 2011). Specifically, striatal DA neurons displayed a reduction in mitochondria and tyrosine hydroxylase density. Surprisingly, PARKIN knockout MitoPark mice presented no difference in morphology or number of mitochondria with or without PARKIN. There was also no indication of PARKIN recruitment to defective mitochondria suggesting PARKIN did not have an effect on the progression of neurodegeneration in PD (Sterky et al., 2011).

Although limitations exist in using the PD mouse model, there have been advancements in studying the role of PARKIN in PD in a pig model. Large animals such as pigs serve as great models to study pathological phenotypes for human neurological diseases due to their physiological similarity to humans (Prather et al., 2013). Wang et al. successfully implemented the clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9 system into the Bama miniature pig genome to concurrently target three distinct loci by co-injecting Cas9 mRNA and single-guide RNAs (sgRNA) which target PARKIN, DJ-1, and PINK1 genes into pronuclear embryos (Wang et al., 2016). Immunofluorescence, western blotting and reverse transcription-polymerase chain reaction (RT-PCR) confirmed a significant reduction in expression of these genes compared to wild-type with a low incidence of off-target mutations via whole-genome sequencing. Despite the drawbacks to using large animals as a means of genetic modification, including the fact that it is a time-consuming and expensive procedure due to lack of embryonic stem cell (ESC) lines, the effective and specific biallelic knockouts of genes make this a valuable tool to study neurological disorders where other animal models have failed.

Along with studying large model organisms, the PD field has also taken advantage of using patient-derived induced pluripotent stem cells (iPSCs). iPSCs are a special type of stem cell in which human somatic cells are engineered and genetically altered to be differentiated into other types of cells in the body such as neuronal, cardiac or hepatic via distinct transcription factors (Bellin et al., 2012). iPSCs make an extraordinary model for studying human diseases because they can reveal phenotypic defects and are a renewable source. Chung et al. differentiated PARKIN/PINK1 mutant and normal iPSC and ESC lines into midbrain DA neurons such that all cell lines demonstrated properties consistent with development of midbrain DA neurons (Chung et al., 2016). Although PARKIN and PINK1-derived iPSCs showed abnormal mitochondria (enlarged and enhanced oxidative stress), they were not prone to cell death. When given an oxidative stress inducer, carbonyl cyanide m-chlorophenyl hydrazone (CCCP), PD iPSC cell lines were more susceptible to cell death and displayed atypical neurotransmitter homeostasis (Chung et al., 2016).

In summary, upon mitochondrial depolarization, PINK1 recruits PARKIN to the mitochondrial membrane and its phosphorylation permits proteasome-dependent degradation of damaged mitochondria and enhances cell survival by suppressing apoptosis. If whole exome or whole genome sequencing is used to identify PD patients with specific PARKIN mutations that demonstrate dysfunctional mitophagy, therapeutic interventions could be targeted specifically to DA neurons of the midbrain to prevent oxidative stress or to promote regeneration of these cells via stem cell technologies.

## E3 UBIQUITIN LIGASES AND RARE NEUROLOGICAL DISORDERS

Extensive examination of RNDs is difficult to accomplish because few individuals are affected by any given RND; however, the use of in vitro and in vivo models along with the genomic tools described above has made it possible to identify mutations of genes, particularly E3 ligases, and to recapitulate mutations observed in RNDs.

#### Angelman Syndrome

Angelman syndrome (AS) is a neurodevelopmental disorder that is one of the most well-studied RNDs. Although specific symptoms vary in individual cases, AS is characterized by intellectual disability, developmental delay, distinct behavioral patterns such as a happy demeanor with prolonged, inappropriate laughter and smiling, speech impairment, seizures, abnormal sleep patterns, and ataxia (Williams et al., 2006; Tan and Bird, 2016). AS occurs in about one out of every 12,000 births (Steffenburg et al., 1996). Over 90% of AS cases are caused by mutations in the E3 ligase, UBE3A, or deletions in the 15q11-13 maternal region containing UBE3A (**Figure 3** and **Supplementary Table 2**; Kishino et al., 1997; Matsuura et al., 1997; Sutcliffe et al., 1997). Similar to UBE3A, mutations in another E3 ligase, HERC2, also lead to AS-like phenotypes (Puffenberger et al., 2012).

FIGURE 3 | Rare neurological disorders (RNDs) and E3 ligase gene associations. Diagram of RNDs correlated with E3 ligase genes that are mutated in specific disorders. Diseases shaded in blue indicate multiple genes linked to that disorder. Genes highlighted in dark gray are shared between several diseases. Figures were generated using Graphviz (www.graphviz.org).
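A disease-gene diagram of the kind described in the figure captions can be produced from Graphviz DOT text. The sketch below does not reproduce the authors' figures or their scripts. It assumes a small, simplified subset of associations named in this review (for example, HERC2 is attached to Angelman syndrome although the text describes AS-like phenotypes), and the shading rules are only one reading of the captions: diseases with several genes in blue, genes shared by several diseases in dark gray.

```python
def to_dot(associations):
    """Build Graphviz DOT text for an undirected disease-gene graph."""
    # Count how many diseases each gene is linked to.
    gene_counts = {}
    for genes in associations.values():
        for gene in genes:
            gene_counts[gene] = gene_counts.get(gene, 0) + 1

    lines = ["graph e3_atlas {", "  node [style=filled, fillcolor=white];"]
    # Disease nodes: blue when more than one gene is linked to the disease.
    for disease, genes in associations.items():
        color = "lightblue" if len(genes) > 1 else "white"
        lines.append(f'  "{disease}" [shape=box, fillcolor={color}];')
    # Gene nodes: dark gray when the gene is shared between diseases.
    for gene, count in sorted(gene_counts.items()):
        color = "gray40" if count > 1 else "white"
        lines.append(f'  "{gene}" [shape=ellipse, fillcolor={color}];')
    # One edge per disease-gene association.
    for disease, genes in associations.items():
        for gene in genes:
            lines.append(f'  "{disease}" -- "{gene}";')
    lines.append("}")
    return "\n".join(lines)


# Illustrative subset of associations mentioned in the text.
associations = {
    "Angelman syndrome": ["UBE3A", "HERC2"],
    "Autism spectrum disorder": ["UBE3A"],
    "Infantile spasms": ["KLHL17"],
    "Adult myoclonic epilepsy": ["UBR5"],
    "Gordon Holmes syndrome": ["STUB1", "RNF216"],
}
dot_text = to_dot(associations)
```

Saving `dot_text` to a file such as `atlas.dot` (a hypothetical name) and running `dot -Tpng atlas.dot -o atlas.png` would render the graph with the Graphviz command-line tool.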

UBE3A encodes a HECT E3 ligase that is imprinted specifically in neurons of the central nervous system. Imprinting is established and maintained through expression of a long noncoding RNA on the paternal allele called the UBE3A antisense (UBE3A-ATS) transcript. As a result, only maternal UBE3A is expressed in neurons (Albrecht et al., 1997; Rougeulle et al., 1998; Runte et al., 2001, 2004; Yamasaki et al., 2003; Landers et al., 2004). Epigenetic regulation of UBE3A is similarly conserved in rodents, which has allowed researchers to generate AS murine models to study this disorder (Jiang et al., 1998). Using genetic engineering techniques, mouse models for AS were created that carry a viable maternally inherited Ube3a null mutation, and a conditional reinstatement model has been created to restore Ube3a expression during development (Jiang et al., 1998, 2010; Miura et al., 2002; Silva-Santos et al., 2015). AS mice display many AS-relevant phenotypes which include motor deficits, seizure susceptibility, learning impairments, and altered sleep homeostasis (Jiang et al., 1998, 2010; Miura et al., 2002; Cheron et al., 2005; Colas et al., 2005; van Woerden et al., 2007; Mulherkar and Jana, 2010; Ehlen et al., 2015). AS mice also exhibit deficits in multiple forms of synaptic plasticity such as long-term potentiation (LTP), long-term depression (LTD), metabotropic glutamate receptor (mGluR)-dependent LTD, homeostatic scaling, and ocular dominance plasticity (**Figure 5**; Jiang et al., 1998; Weeber et al., 2003; Dindot et al., 2008; Yashiro et al., 2009; Sato and Stryker, 2010; Pastuzyn and Shepherd, 2017).

The anatomical changes in AS have also been studied in great detail. In vertebrates, changes in dendritic spine dynamics are associated with alterations in learning, plasticity, and behavior throughout development (Dindot et al., 2008; Silva-Santos et al., 2015; Valluy et al., 2015; Pastuzyn and Shepherd, 2017). Notably, the density of dendritic spines was found to be reduced in a postmortem human AS patient using a traditional method of Golgi staining (Jay et al., 1991). Similarly, Ube3a<sup>m−/p+</sup> (AS) mice have a decrease in dendritic spines, an absence of normal induction of LTD and LTP in the visual cortex, and have defective ocular dominance plasticity (Dindot et al., 2008; Yashiro et al., 2009; Sato and Stryker, 2010). The use of two-photon microscopy has allowed researchers to increase the depth of imaging in living tissue and provide longitudinal changes in functional connectivity, cortical response, and neural activity with limited phototoxicity (Yang and Yuste, 2017). To observe changes in dendritic spine dynamics, AS mice were crossed with Thy1-GFP males (Feng et al., 2000; Kim et al., 2016). The resulting offspring of this cross allowed GFP expression in Layer 5 pyramidal neurons in both wild type (WT) and AS mice (Kim et al., 2016). Although dendritic spine density was not altered in AS mice, dendritic spine elimination was significantly increased during the end of the first month of postnatal life. However, enhanced spine elimination could be rescued when AS mice were deprived of visual experience by dark rearing (Kim et al., 2016).

Gross anatomical differences have been observed in multiple brain regions in the AS mouse model (Judson et al., 2017). AS mice exhibit microcephaly and have significant reductions in white matter tracts. These microstructural abnormalities were investigated using diffusion tensor imaging (DTI), a tool that provides unique information of the preferred orientation, myelination, and density in white matter specifically in axon bundles in vivo (Basser and Pierpaoli, 1996; Goodlett et al., 2009). Using electron microscopy, a decrease in axon caliber (reduced cross-sectional diameter) in myelinated axons was identified within the corpus callosum and sciatic nerves, which later revealed slower action potential rise kinetics compared to controls (Judson et al., 2017). Generally, microcephaly is linked to early neurological phenotypes such as hypotonia and seizures in infants (Fryburg et al., 1991). Therefore, the relationship between the deficits in postnatal brain growth and pathophysiology can be understood by examining the mechanism of this phenotype and the developmental consequences due to loss of UBE3A.

Optogenetics is another tool used to measure how neural populations affect the circuitry and function of the brain leading to behavioral phenotypes. This technique typically uses light to manipulate the activity of light-sensitive ion channels to spatially and temporally control cells in select brain regions (Klapoetke et al., 2014; Deisseroth, 2015; Yang and Yuste, 2017). Previously, a loss of UBE3A was found to enhance dopamine release in the mesoaccumbal pathway (Riday et al., 2012; Berrios et al., 2016). To evaluate the role of dopamine release in consummatory behavior, optogenetics was used to evaluate motivational behavior in a conditional AS model (Ube3a<sup>FLOX/p+</sup>) by crossing these mice to those that express CRE recombinase specifically in tyrosine hydroxylase neurons (TH<sup>CRE</sup>). These mice were then transduced with a CRE-dependent adeno-associated virus (AAV5)-channelrhodopsin-2 (H134R) fused to an enhanced yellow fluorescent protein (ChR2-eYFP) into the ventral tegmental area and an optical fiber was placed above the nucleus accumbens (Berrios et al., 2016). Mice were then trained on a specific schedule to nose-poke during optical stimulation. Mice with a loss of UBE3A in tyrosine-hydroxylase neurons demonstrated increased reward-seeking behavior via optical self-stimulation by suppressing the co-release of gamma-aminobutyric acid (GABA), an inhibitory neurotransmitter, in a non-canonical pathway (Berrios et al., 2016).

The generation of the UBE3A reinstatement model has allowed researchers to define neurodevelopmental windows that may rescue AS-related phenotypes. A conditional reinstatement mouse model of Ube3a was created using a CRE-dependent reinstatement of maternal Ube3a (Ube3a<sup>STOP/p+</sup>). Mice lacking maternal Ube3a displayed consistently impaired behavioral performance in a battery of behavioral tests (rotarod, marble burying, open field, nest building, forced swim test, and epilepsy) similar to the traditional AS mouse model (Silva-Santos et al., 2015). CRE-dependent reinstatement of UBE3A rescued motor deficits in adolescent mice. However, other AS behaviors such as anxiety, repetitive behavior, and epilepsy could only be rescued during early development (Silva-Santos et al., 2015).

Electrophysiology, specifically local field potential recording, allows researchers to understand dynamic neural networks by measuring action potentials and graded potentials that reflect synaptic activity in the neural network (Herreras, 2016). Using this method, full recovery of hippocampal LTP was found at every time point of UBE3A reinstatement (Silva-Santos et al., 2015).

Using in vivo patch-clamp electrophysiology in the same mouse model described above, AS mice were found to have increased excitability and reduced orientation tuning in regular-spiking GABA-ergic pyramidal neurons. However, crossing Ube3a<sup>STOP/p+</sup> with Gad2-CRE mice to specifically reinstate UBE3A in interneurons could rescue orientation tuning (Wallace et al., 2017).

A UBE3A reporter mouse was initially developed to assess regions in which UBE3A was imprinted (Dindot et al., 2008). The UBE3A-Yellow fluorescent protein knock-in mouse (Ube3a<sup>m+/pYFP</sup>) was used to identify compounds to unsilence the paternal Ube3a allele. The rationale behind this work was that the majority of AS individuals have a maternally inherited disruption of UBE3A, whereas the paternal copy is normal but is not expressed due to epigenetic modifications (Lalande and Calciano, 2007). Using the Ube3a<sup>m+/pYFP</sup> knock-in mouse in a high-content drug screen, topoisomerase inhibitors were found to unsilence the paternal Ube3a allele (Huang et al., 2012). Notably, the topoisomerase I inhibitor, topotecan, upregulated neuronal UBE3A expression in an AS mouse and downregulated Ube3a-ATS and Snrpn (Small Nuclear Ribonucleoprotein Polypeptide N) paternal gene expression in the brain in vivo (Huang et al., 2012). Another corresponding study also used Ube3a<sup>m+/pYFP</sup> knock-in mice to test the efficiency of antisense oligonucleotides (ASOs) in depleting the Ube3a-ATS as another strategy to unsilence the paternal allele of UBE3A (Meng et al., 2015). The targeting of the paternal dormant allele provides a potential treatment option for AS by focusing on epigenetic modifications.

The development of AS patient-derived iPSCs has allowed researchers to study AS in a more human-relevant context (Chamberlain et al., 2010; Russo et al., 2015). These lines have been useful in studying the exact genomic disruption afflicted in AS patients and to understand human epigenetic UBE3A regulation to assist in identifying novel therapeutic strategies (Stanurova et al., 2016; Takahashi et al., 2017). AS-derived iPSCs have an increase in resting membrane potentials, decreased spontaneous synaptic currents, and a loss of LTP induction (Fink et al., 2017). These phenotypes could be rescued by treating AS-derived iPSCs with topotecan to unsilence the UBE3A paternal allele. Treatment with topotecan resulted in an increase in UBE3A mRNA expression which led to a shift to a more hyperpolarized resting membrane potential and restoration of action potential firing to control levels promoting normal neuronal excitability (Fink et al., 2017). Importantly, targeting the silenced paternal allele in AS patients might alleviate AS-related phenotypes such as intellectual impairments and developmental delays by increasing synaptic events (Fink et al., 2017). In another study, targeting CGI methylation de novo via introducing a CpG-free cassette into AS patient-derived iPSCs was able to correct abnormal DNA methylation and result in normal UBE3A expression in DA neurons (Takahashi et al., 2017).

In parallel to the use of transgenic mice and human iPSCs, clinicians have sought to discover indicators specific to AS patients. One of the methods used in human cases was electroencephalography (EEG), in which AS patients tend to display theta rhythmicity, epileptiform spike-wave changes, and increased delta rhythmicity (Vendrame et al., 2012). Delta power in the AS mouse and in AS children was compared. Results from these studies demonstrated an increase in delta power during wakefulness and sleep in both AS mice and children with AS compared to matched controls (Sidorov et al., 2017). These studies reveal that, in AS, loss of UBE3A results in large-scale disruptions in rhythmic neural activity and show that EEG measures may serve as a potential biomarker not just for AS, but also for other RNDs that have seizure phenotypes.
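To make the notion of delta power concrete, the sketch below estimates the share of spectral power in a low-frequency band using a plain discrete Fourier transform. It is an illustration on synthetic signals only and is not the analysis used in the cited EEG studies. The 1–4 Hz delta band, the 1–30 Hz reference band, the sampling rate, and the signal amplitudes are all assumptions chosen for the example.

```python
import cmath
import math


def band_power(signal, fs, f_lo, f_hi):
    """Sum normalized squared DFT magnitudes over bins within [f_lo, f_hi] Hz."""
    n = len(signal)
    power = 0.0
    for k in range(1, n // 2):
        freq = k * fs / n  # frequency of DFT bin k in Hz
        if f_lo <= freq <= f_hi:
            coeff = sum(
                x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(signal)
            )
            power += abs(coeff) ** 2 / n**2
    return power


def relative_delta_power(signal, fs, delta=(1.0, 4.0), total=(1.0, 30.0)):
    """Fraction of power in the assumed delta band relative to a broader band."""
    return band_power(signal, fs, *delta) / band_power(signal, fs, *total)


fs = 64   # sampling rate in Hz (hypothetical)
n = 256   # 4 seconds of samples
times = [i / fs for i in range(n)]

# Synthetic traces: a 2 Hz (delta-range) component plus a 10 Hz component.
control = [0.5 * math.sin(2 * math.pi * 2 * t) + math.sin(2 * math.pi * 10 * t) for t in times]
high_delta = [1.5 * math.sin(2 * math.pi * 2 * t) + math.sin(2 * math.pi * 10 * t) for t in times]

rel_control = relative_delta_power(control, fs)
rel_high = relative_delta_power(high_delta, fs)
```

In practice a windowed estimator and artifact rejection would be needed. The naive transform is used here only to keep the example dependency-free.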

It is worth noting that duplication of the 15q11-q13 region of the maternal chromosome harboring the UBE3A gene is also a common and highly penetrant factor of autism spectrum disorder (ASD) pathogenesis (**Figure 1** and **Supplementary Table 1**), and an increased dosage of the UBE3A gene is associated with developmental delay and neuropsychiatric phenotypes (Cook et al., 1997; Glessner et al., 2009; Noor et al., 2015). UBE3A can function as both an E3 ligase and a transcriptional co-activator (Scheffner et al., 1993; Dindot et al., 2008). It is expressed monoallelically in neurons and is involved in many previously stated functions such as maintaining the proper level of dendritic branching, synapse formation, and controlling the frequency of mEPSCs (Lu et al., 2009; Greer et al., 2010; Margolis et al., 2010; Khatri et al., 2017). Because previous research has parsed out the importance of expression sensitivity of UBE3A in two disorders, developing drugs that can control the expression of either the paternal or maternal allele or those that modulate neuronal activity would be beneficial to this patient population. Moreover, due to the emergence of early symptoms in AS and ASD, individuals with UBE3A disruptions can be monitored throughout their lifetime to prevent or alleviate ongoing symptoms such as seizures or ataxia.

#### Other Rare Neurological Disorders

Mutations in E3 ligase genes are associated with a vast multitude of RNDs (**Figure 3** and **Supplementary Table 2**). For the majority of RNDs, there is no strong supporting evidence for a causal link between the disorder and the E3 ligase gene. For many of the RNDs discussed in detail below, a possible relationship between the RND and an E3 ligase has been implied through familial case studies that examine probable critical regions on chromosomes to screen out less important genes (Sekine and Makino, 2017).

#### Epilepsy

Disorders that encompass seizure-like phenotypes have links to neurodegeneration (Wong, 2013; Dingledine et al., 2014). Epilepsy is a neurological disorder involving a long-term susceptibility to seizures caused by atypical neuronal activity in the brain (Fisher et al., 2005). About 3.4 million people, both adults and children, in the U.S. have active epilepsy (Zach and Kobau, 2017). Although there are many classifications of epileptic disorders, for the purpose of this review, we will only focus on specific rare epileptic types associated with E3 ligases.

Infantile spasms (IS) are characterized by the onset of seizures that occur in clusters during the first year of life. IS patients display irregular EEG readings known as hypsarrhythmia, which is thought to cause developmental dysfunction (Lux and Osborne, 2004). IS occurs in about 0.025–0.05% of live births (Taghdiri and Nemati, 2014). IS has been linked to a deletion in the chromosome region 1p36. This region was identified using fluorescence in situ hybridization (FISH), a cytogenetic technique that uses fluorescence microscopy to visualize fluorescent probes designed to detect complementary nucleic acid sequences on chromosomes (Ratan et al., 2017). The fluorescent probe is RNA or single-stranded DNA that is labeled with fluorophores through nick translation or PCR and hybridizes to its target with antibodies or biotin (Levsky and Singer, 2003). In human case studies, one subject was identified to have a chromosome deletion with copy number variation in the KLHL17 (Kelch-like family member 17) gene (**Figure 3** and **Supplementary Table 2**). This gene encodes an E3 ligase that is thought to play a role in actin-based neuronal function (Paciorkowski et al., 2011). In this study, the use of FISH was helpful in screening for E3 ligases linked to epileptic disorders. Even so, more studies are needed to define a more critical region for IS that includes KLHL17 and to confirm that KLHL17 is a causative gene for IS.

Adult myoclonic epilepsy (AME) is associated with myoclonic jerks and twitches as well as finger shaking movements. The worldwide prevalence of AME remains unknown, but there are geographic variations in the genes associated with this disorder (Delgado-Escueta et al., 2003). AME is linked to missense mutations in the HECT E3 ligase, UBR5 (**Figure 3** and **Supplementary Table 2**; Kato et al., 2012). UBR5 was identified using whole exome enrichment and sequencing (WES) via NGS, in which RNA probes are used to find single nucleotide variants (SNVs) and single nucleotide polymorphisms (SNPs) (Chen et al., 2015). While NGS is a technology that allows for high throughput sequencing, WES is a type of NGS that focuses on sequencing the protein-coding regions of a genome that contain mutations (Seleman et al., 2017). UBR5 mutations were identified in affected family members with AME but not in unaffected groups or unaffected family members (Kato et al., 2012). UBR5 has many functional roles including maturation and transcriptional regulation of mRNA, cell cycle, extraembryonic development, tumor suppression and regulation of the DNA topoisomerase II binding protein (TDP2). Other functions include suppression of another E3 ligase, RNF168, in response to DNA damage and prevention of growth of ubiquitinated chromatin in response to chromosomal damage (Gudjonsson et al., 2012). As with IS, further studies are needed to verify the importance of mutations in the UBR5 gene and their association with AME. Researchers have only begun to scratch the surface of altered E3 ligase function in epilepsy. Because the role of ubiquitinated proteins in epilepsy is still poorly understood, it is currently difficult to manage epileptic patients who carry these specific gene mutations. Along with further investigation of E3 ligases, animal models with knock-out genes such as KLHL17 or UBR5, or even knock-in mouse models, would be beneficial in determining the functionality of these genes in both in vitro and in vivo experiments. The isolation of iPSCs from individuals with epilepsy and brain imaging tools such as EEGs might be useful to identify biomarkers.
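The familial logic described above, in which a candidate variant is found in affected relatives but not in unaffected ones, can be sketched as a simple segregation filter. This is a minimal illustration under stated assumptions, not the pipeline used by Kato et al. (2012): the function name, the variant labels, and the family members in the example are all hypothetical.

```python
from typing import Dict, List, Set


def segregating_variants(
    carriers: Dict[str, Set[str]],
    affected: Set[str],
    unaffected: Set[str],
) -> List[str]:
    """Return variants carried by every affected member and no unaffected member.

    `carriers` maps a variant label to the set of family members carrying it.
    All labels are hypothetical placeholders, not real variant calls.
    """
    hits = []
    for variant, members in carriers.items():
        in_all_affected = affected <= members          # every affected member carries it
        in_no_unaffected = not (members & unaffected)  # no unaffected member carries it
        if in_all_affected and in_no_unaffected:
            hits.append(variant)
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical pedigree: two affected and two unaffected relatives.
    affected = {"II-1", "III-2"}
    unaffected = {"II-2", "III-1"}
    carriers = {
        "variant_A": {"II-1", "III-2"},          # segregates with disease
        "variant_B": {"II-1", "II-2", "III-2"},  # also present in an unaffected relative
        "variant_C": {"II-1"},                   # missing from one affected relative
    }
    print(segregating_variants(carriers, affected, unaffected))
```

A filter of this kind only narrows the candidate list. As the text notes, functional studies are still needed to confirm that a segregating variant is causative.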

#### Gordon Holmes Syndrome

Gordon Holmes syndrome (GHS) is another RND that has recently gained more attention. The clinical symptoms include cerebellar ataxia, hypogonadotropic hypogonadism, cognitive impairment, dysarthria, and in some cases dementia (Haines et al., 2007; Margolin et al., 2013; Alqwaifly and Bohlega, 2016). GHS is part of a subset of disorders called autosomal recessive hereditary cerebellar ataxias (ARCA) with extracerebellar symptoms such as dementia (Heimdal et al., 2014). The prevalence of GHS remains unknown. Whole exome sequencing studies have established that homozygous mutations in the E3 ligase STIP1 homology and U-Box containing protein 1 (STUB1), also known as carboxy terminus of Hsp70-interacting protein (CHIP), result in ataxia and hypogonadism with a frequency of 2.3% in GHS patients (Shi et al., 2014; Hayer et al., 2017; **Figure 3** and **Supplementary Table 2**). Evidence from familial clinical cases has also demonstrated that mutations in STUB1/CHIP are present in patients with ARCA along with cognitive impairment (Heimdal et al., 2014). Functionally, STUB1/CHIP encodes the protein CHIP, a U-box dependent E3 ligase involved in chaperoning proteins (Jiang et al., 2001). In assessments of neurological behavior, Chip knockout mice have decreased motor, sensory and cognitive function. These impairments were also associated with abnormal cellular morphology of Purkinje cells and other cortical cell layers, resulting in cerebellar dysfunction (Shi et al., 2014).

Notably, GHS has been associated with both missense and nonsense mutations in the RBR E3 ligase, RNF216/TRIAD3 (Margolin et al., 2013) (**Figure 3** and **Supplementary Table 2**). In familial genetic studies, patients diagnosed with ataxia and hypogonadotropic hypogonadism had compound heterozygous mutations in RNF216 whose variants were predicted to be deleterious compared to controls (Margolin et al., 2013). This implicated RNF216 as a causative gene for this disorder (Margolin et al., 2013). In one deceased patient, histopathological examination revealed atrophy of the cerebellum, gliosis, loss of inferior olivary neurons and cerebellar Purkinje cells, and loss of neurons in hippocampal regions CA3 and CA4. Moreover, ubiquitin-immunoreactive nuclear inclusions were found in the CA1, CA2, and dentate gyrus of the hippocampus, further providing an anatomical basis for dementia. Longitudinal studies of the clinical symptoms of GHS patients identified dysarthria in early stages of life, while ataxia and dementia developed later in adulthood, as indicated by neuroimaging results that revealed cerebellar and cortical atrophy (Margolin et al., 2013).

Because hypogonadotropic hypogonadism is another feature of GHS, the endocrine system in GHS individuals was examined. Decreased levels of luteinizing hormone (LH) and pituitary dysfunction were detected, indicating gonadotropin-releasing hormone (GnRH) secretion deficiencies; indeed, when robust pulses of GnRH were administered, gonadotropin levels and reproductive function were restored (Margolin et al., 2013). Another clinical case study showed cerebellar and cortical atrophy through the use of fMRIs and fluid-attenuated inversion recovery (FLAIR) brain imaging in two patients with homozygous mutations of a splice variant of RNF216 (Alqwaifly and Bohlega, 2016). FLAIR is an MRI technique that contrasts the tissue T2 prolongation and the cerebrospinal fluid signal so that lesions near the cerebrospinal fluid are revealed (Saranathan et al., 2017). These patients also had low levels of LH and, additionally, of testosterone, but testosterone treatment normalized secondary sexual characteristics (Alqwaifly and Bohlega, 2016). A recent case study identified additional mutations in RNF216 in a patient with GHS who had progressive cognitive decline that correlated with high signal intensity within the white matter of both cerebral hemispheres with gray matter lesions in the thalami, cerebellar atrophy, and high T2 signals in the midbrain (Mehmood et al., 2017). These clinical case studies support the strong relationship between behavioral phenotypes and their corresponding biological insults.

Rather than using mouse models to support the role of RNF216 in GHS, zebrafish were used to test the functionality of the gene by injecting morpholino oligonucleotides (MO) to silence rnf216. This resulted in decreased size of the eye cup, optic tecta, and head, along with disorganization of the cerebellum. These phenotypes were rescued with co-injection of human RNF216 mRNA (Margolin et al., 2013). Complementing these data with transgenic mouse lines would provide a strong foundation to support the genetic studies of familial variability both in vitro and in vivo.

RNF216 encodes multiple RING finger E3 ligase isoforms (TRIAD3A-TRIAD3E) and plays a major role in inflammation (Chen et al., 2002; Chuang and Ulevitch, 2004; Fearns et al., 2006; Nakhaei et al., 2009; Shahjahan Miah et al., 2011; Xu et al., 2014). The first neuronal function for RNF216 emerged from the identification of the immediate early gene, activity-regulated cytoskeletal-associated protein (Arc), as a substrate of TRIAD3A. TRIAD3A was found to directly ubiquitinate Arc and mediate its turnover by the UPS in mouse primary neurons (Mabb et al., 2014). Using a technique called total internal reflection fluorescence microscopy (TIRFM), TRIAD3A was also found to localize at clathrin-coated pits, resulting in altered trafficking of α-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid (AMPA) receptors, principal excitatory receptors that mediate the majority of fast excitatory synaptic transmission in the nervous system (Mabb et al., 2014). Using shRNA to reduce levels of TRIAD3A, Arc-dependent forms of synaptic plasticity were found to be altered, most likely due to disruptions in AMPA receptor trafficking (Mabb et al., 2014). Additionally, viral transduction of Triad3 shRNA in the hippocampus of mice led to deficits in learning in the Morris water maze, a spatial-dependent learning task (Husain et al., 2017).

Because STUB1/CHIP directs chaperone proteins for proteasomal degradation, patients with this mutation could benefit from treatment options that target these substrates to promote the growth and maintenance of Purkinje cells and other cortical cell layers. Regarding RNF216, particularly TRIAD3A, therapeutic interventions to induce Arc-dependent forms of synaptic plasticity could possibly prevent or at least delay the dementia phenotype. In conjunction, targeting substrates of the inflammatory pathway could prevent abnormal cell death. Research to elucidate the roles of RNF216 isoforms would be helpful in creating viable treatment options for patients. The generation of additional transgenic animal models and the use of iPSCs would be beneficial in determining the effects of both STUB1 and RNF216 disruptions on a cellular level and would aid in establishing additional substrates and pathways leading to behavioral phenotypes.

#### Louis Bar Syndrome

Louis Bar syndrome, commonly referred to as ataxia-telangiectasia (AT), is identified by its symptoms of ataxia, telangiectasia, elevated alpha-fetoprotein, microcephaly, pulmonary failure, radiosensitivity, immunodeficiency, dysmorphic features, and learning difficulties, with a prevalence of 0.001–0.0025% in live births (Boder and Sedgwick, 1970; Swift et al., 1986; Richard and Susan, 1999). Clinical studies showed an occurrence of homozygous nonsense mutations in RNF168 in patients with AT and radiosensitivity (**Figure 3** and **Supplementary Table 2**). In addition, when screening for irradiation-induced nuclear foci containing 53BP1, AT lymphoblastoid cells showed a deficiency in the RNF168 pathway (Devgan et al., 2011). In vitro studies using shRNA showed that knock-down of RNF168, RNF8 and 53BP1 in a CH12F3-2 mouse B cell line resulted in a significant reduction in class-switch recombination (CSR) (Ramachandran et al., 2010). RNF168 plays a pivotal role following DNA double strand breaks (DSBs). During DNA damage, RNF168 is recruited to H2A-type histones and amplifies H2A lysine 63-linked ubiquitin conjugates mediated by another E3 ligase, RNF8 (Stewart et al., 2009). This results in the accumulation of 53BP1, a protein important for double-stranded break repair, and BRCA1, a tumor suppressor protein, both of which are recruited to the sites of DNA damage and are critical for mediating cell cycle checkpoints and DNA repair (Stewart et al., 2009). While studying the immunological and radiological aspects of Louis Bar syndrome is valuable, a useful complement would be to determine whether mutations in RNF168 cause other symptoms such as ataxia, telangiectasia or pulmonary failure. This would involve the use of transgenic animal models to study behavioral phenotypes. It would also be worthwhile to examine knock-out or overexpression of RNF168 in other cell types, such as neurons or glial cells, to determine whether mutations in these cell types result in impaired behavioral phenotypes.

#### Moyamoya Disease

Moyamoya Disease (MMD) is an RND that shows unique symptoms of steno-occlusion of the terminal side of the internal carotid artery causing abnormal vascular networks at the base of the brain (Ma et al., 2016). This disease has recently gained attention because it is thought to be a causative factor of stroke in both adults and children (Veeravagu et al., 2008). Although the progression of pathogenesis and prevalence is still unclear, missense mutations in RNF213 have been identified in 95% of familial cases and 73% of sporadic clinical cases (Kamada et al., 2011) (**Figure 3** and **Supplementary Table 2**). Among the 30 RNF213 variants listed from the Human Gene Mutation Database (HGMD), R4810K is the only variant that is strongly associated with MMD (Jang et al., 2017; **Supplementary Table 2**). A familial clinical case showed one patient with the R4810K mutation did not show any abnormalities with neuroimaging tools during childhood, but was diagnosed with MMD 10 years later after showing symptoms (Aoyama et al., 2017). This suggests that determining the time course of disease progression is an important factor in diagnosis and treatment.

The role of RNF213 in MMD was examined by subjecting WT mice to transient middle cerebral artery occlusion (tMCAO) and measuring mRNA expression of RNF213 by both in situ hybridization and RT-PCR. RNF213 was upregulated compared to controls and was expressed predominantly in neurons (Sato-Maeda et al., 2016). An MMD mouse model was created to produce homozygous recessive RNF213 (RNF213<sup>−/−</sup>) animals whose cervical and cranial arteries were examined using magnetic resonance angiography (MRA) (Sonobe et al., 2014). Although there was no difference in MRA readings between WT and transgenic mice, common carotid artery ligation, which induces vascular hyperplasia, resulted in thinner intima and medial layers in RNF213<sup>−/−</sup> mice compared to WT controls that exhibited hyperplasia (Sonobe et al., 2014). This supports a role of RNF213 in brain ischemia, a symptom of MMD, but further studies are needed to understand the mechanism of RNF213 action in MMD. Notably, iPSCs and iPSC-derived vascular endothelial cells (iPESCs) obtained from MMD patients had reduced angiogenic activity. Microarrays confirmed that many mitotic-phase associated genes and securin, an inducer of angiogenesis and inhibitor of premature sister chromatid separation, were downregulated with this RNF213 genotype (Hitomi et al., 2013). RNAi-mediated depletion of securin also impaired tube formation without affecting proliferation of iPSCs (Hitomi et al., 2013). iPSCs can therefore be useful as an in vitro model, not only to study the pathophysiology of RNF213 but also to investigate specific drug targets for therapeutic intervention.

#### Juberg-Marsidi Syndrome

Juberg-Marsidi syndrome is a rare congenital X-linked disorder that specifically affects males. Symptoms consist of mental retardation, delay in developmental milestones, muscle weakness, hypotonia, growth retardation, deafness, microgenitalism, microcephaly, and additional physical abnormalities (Villard et al., 1996). The prevalence remains unknown. Using NGS, Juberg-Marsidi syndrome was found to be associated with mutations in the HECT E3 ubiquitin ligase, HUWE1, implicating it as a possible candidate gene for this X-linked disorder (Nava et al., 2012; Friez et al., 2016; **Figure 3** and **Supplementary Table 2**). To evaluate the function of HUWE1 and its substrates, a conditional knock-out mouse model was generated to delete the HECT domain of HUWE1. Mice were crossed with GFAP-CRE deleter mice to specifically target HUWE1 deletion in cerebellar granule neuron precursors (CGNPs) and radial glia (D'Arca et al., 2010). These mice were found to have high lethality around postnatal day 21 and cerebellar abnormalities including defects in cell cycle exit and granule cell differentiation, with an ataxic phenotype caused by uncontrolled proliferation of CGNPs. This was associated with an increase in the abundance of a HUWE1 substrate, N-Myc (D'Arca et al., 2010). HUWE1 has many different functions, including regulating neural differentiation and proliferation by catalyzing the polyubiquitination and degradation of MYCN, which encodes the N-myc oncoprotein; ubiquitination of the tumor suppressor protein, p53; and regulation of CDC6 levels, essential for DNA replication, after DNA damage (Yoon et al., 2004; Hall et al., 2007; Zhao et al., 2008). In parallel with the work mentioned above, it would be useful to determine how the overexpression of HUWE1 alters differentiation of the cerebellum. In addition, disruption of HUWE1 in human iPSCs would assist in establishing critical developmental time points related to postnatal lethality.

#### Opitz G/BBB Syndrome

The Opitz G/BBB syndrome (OS) is another congenital disorder that involves two forms: an X-linked form and an autosomal dominant form found on chromosome 22. Both forms have similar abnormalities due to defects of the midline structures, which include growth delay, microcephaly, polydactyly, cleft palate, mental retardation, seizures, heart defects, hypertelorism, and deafness (Robin et al., 1996). The prevalence of this disease is unknown. In particular, the X-linked form is caused by a mutation in the MID1 gene, an E3 ligase that is a member of the B-box family of zinc finger proteins with a RING-finger motif involved in anchoring proteins to microtubules (Quaderi et al., 1997; **Figure 3** and **Supplementary Table 2**). In vivo studies using a Mid1-null mouse line demonstrated OS phenotypes observed in affected humans. For example, prenatal cerebellar defects lead to dysfunction of primitive fissures and definitive boundaries, resulting in motor discoordination and motor learning deficiencies (Lancioni et al., 2010). This work provides insight into the genetic causes underlying the behavior observed in OS. Using primary neuron cultures and stem cells to understand how differentiation of the midline is affected, and elucidating the pathway that underlies this mechanism, would be informative in understanding the origin of these OS symptomologies.

## CONCLUSIONS

The study of E3 ligases and their implication in neurological disorders is still an open field that would benefit from research using diverse, emerging technologies. Molecular diagnosis of neurological disorders requires accurate, efficient, and cost-effective methods. Traditionally, standard PCR was helpful in detecting short sequences of repeat expansions. The emergence of Sanger sequencing allowed for sequencing of the entire human genome but was considerably time-consuming and cost prohibitive (Goldfeder et al., 2017). The studies discussed above show the benefits of using rapid and cost-effective NGS platforms in identifying gene variants and novel disease genes. Narrowing down the genetic causes of neurological diseases will allow clinicians and health care professionals to advise and administer specialized treatments at appropriate times to assist in the reduction of disease burden over time.

The prevalence of E3 ligase disruptions in such a broad array of neurological diseases suggests that disruption of ubiquitin pathways may be a major driving force. The abundance of E3 ligase genes mutated in neurological disease (**Supplementary Tables 1**, **2**) indicates that targeting the ubiquitin pathway might have utility for a range of neurological disorders. However, this poses a great challenge to researchers given our lack of a comprehensive understanding of E3 ligases, their role in neurodevelopment and neuronal maintenance, and their substrates. This is further complicated by difficulties in developing intervention therapies due to the diversity and complexity of the ubiquitin pathway (Huang and Dixit, 2016). It is worth noting that of the few drugs that have been developed to target the ubiquitin pathway, most are meant to inhibit or disrupt function. Considering that many of the E3 ligase mutations in neurological disease are related to a loss of enzyme function, inhibitors targeting these enzymes would not be beneficial (Bondeson et al., 2015; Galdeano, 2017). However, the finding that multiple E3 ligases are disrupted in similar disease subsets (e.g., ASD) indicates a potential nexus of biological and functional convergence (**Figures 1**, **3**). Intriguingly, with the exception of GHS, we found that RNDs appear to be associated with one type of E3 ligase domain class (**Figure 4**), whereas CNDs tend to share E3 ligase domain classes (**Figure 2**).

The emergence of methods such as hydrophobic tagging might be useful for the targeting of E3 ligase substrates that undergo protein degradation (Neklesa et al., 2011; Huang and Dixit, 2016), but this requires the identification of substrates. Moreover, E3 ligase substrates may undergo ubiquitination that does not lead to subsequent degradation by the UPS. In order to maximize the potential for therapeutic treatments, especially for RNDs, future studies that answer the following questions are warranted: What is the full list of mutated genes that encode E3 ligases in neurological disease? NGS and access to patient populations for RNDs and CNDs would assist in this endeavor. What is the full range of substrates that are targeted by disease-associated E3 ligases? One of the most extensive lists exists for the E3 ligase Parkin, but other E3 ligase substrates remain elusive (Panicker et al., 2017). What are the functions of the E3 ligases that are disrupted in neurological disorders? Note that for many RNDs, E3 ligase function in the nervous system is undefined. Do E3 ligases that are mutated in similar disorders have overlapping biological functions? Very little is known about how these enzymes function in the nervous system. Finally, for the ever-growing list of CNDs and RNDs that exhibit symptomatic heterogeneity, how does one select drug targets and develop viable therapeutic treatments? This, of course, will be the greatest of challenges for researchers.

#### AUTHOR CONTRIBUTIONS

YH and AC: Generated the list of E3 ligases implicated in neurological disorders found in **Supplementary Tables 1**, **2**; AG and AM: Validated the list of E3 ligases and wrote the manuscript; AG and YZ: Created the figures.

#### ACKNOWLEDGMENTS

We would like to thank the Rare Genomics Institute for the opportunity to submit this review on E3 ligases in Rare Neurological Disorders, and Mohammad Ghane, Jason Yi and Jun Yin for critical review of the manuscript. This work was supported by The Whitehall Foundation (Grant 2017-05-35) and Georgia State University laboratory startup funds to AM.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2018.00029/full#supplementary-material

Supplementary Table 1 | List of E3 ubiquitin ligases mutated in common neurological disorders (CNDs).

Supplementary Table 2 | List of E3 ubiquitin ligases mutated in rare neurological disorders (RNDs).

#### REFERENCES

Necrosis Factor (TNF)- and IL1-induced NF-κB activation. J. Biol. Chem. 277, 15985–15991. doi: 10.1074/jbc.M108675200

of Hsp90 binding. J. Biol. Chem. 281, 34592–34600. doi: 10.1074/jbc.M604019200

Ephexin5 relieves a developmental brake on excitatory synapse formation. Cell 143, 442–455. doi: 10.1016/j.cell.2010.09.038

global developmental delay and autism spectrum disorder. Hum. Mutat. 33, 1639–1646. doi: 10.1002/humu.22237

model for Angelman syndrome by reduction of [alpha]CaMKII inhibitory phosphorylation. Nat. Neurosci. 10, 280–282. doi: 10.1038/nn1845

transcripts of Ube3a. Hum. Mol. Genet. 12, 837–847. doi: 10.1093/hmg/ddg106


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 George, Hoffiz, Charles, Zhu and Mabb. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

## GLOSSARY

**ataxia**: inability to coordinate voluntary muscle movements due to dysfunction of the central nervous system, not due to muscle weakness

**bradykinesia**: extreme slowness in movements and reflexes

**dysarthria**: slurred speech due to dysfunction of the central nervous system

**dystonia**: abnormality of movement and muscle tone

**gliosis**: abnormal production of glial cells

**homeostatic scaling**: form of synaptic plasticity in which single neurons can regulate their own excitability in relation to network activity

**hyperplasia**: abnormal production of cells in tissues

**hyperreflexia**: overactivity of physiological reflexes

**hypertelorism**: abnormal increase in distance between two body parts (e.g. between the eyes)

**hypotonia**: having deficient muscle tone

**hypogonadotropic hypogonadism**: decreased activity of the gonads (testes and ovaries) due to deficiency of gonadotropins (LH and FSH) and diminished levels of sex steroid hormones

**hypsarrhythmia**: abnormal electroencephalogram characterized by a disorganized arrangement of spikes

**long term depression (LTD)**: a form of activity-dependent plasticity that weakens a specific set of synapses due to a patterned stimulus that reduces the excitatory postsynaptic potential

**long term potentiation (LTP)**: a form of activity-dependent plasticity that persistently strengthens a set of synapses due to a patterned stimulus

**microcephaly**: abnormal smallness in circumference of the head that occurs at birth or within the first few years of life

**myoclonic jerk**: brief, involuntary twitching due to rapid contraction and relaxation of muscles

**nuclear inclusion**: clusters of a substance (e.g. proteins) that occur in the nucleus of a cell

**ocular dominance plasticity**: stripes of neurons located in the visual cortex that span many cortical layers that preferentially respond from one eye or another and are modified by activity-dependent changes in neuronal connections during a critical period

**radiosensitivity**: quality of being quick to respond to slight changes in radiant energy

**steno-occlusion**: proximal narrowing or blockage of a blood vessel

**telangiectasia**: abnormal dilation of the subcutaneous vascular system, seen as threadlike red patterns on the skin caused by widened venules

# Loss of the Intellectual Disability and Autism Gene Cc2d1a and Its Homolog Cc2d1b Differentially Affect Spatial Memory, Anxiety, and Hyperactivity

Marta Zamarbide<sup>1</sup>, Adam W. Oaks<sup>1</sup>, Heather L. Pond<sup>1</sup>, Julia S. Adelman<sup>1</sup> and M. Chiara Manzini<sup>1,2</sup>\*

<sup>1</sup> GW Institute for Neurosciences, Department of Pharmacology and Physiology, The George Washington University School of Medicine and Health Sciences, Washington, DC, United States, <sup>2</sup> Autism and Neurodevelopmental Disorders Institute, The George Washington University, Washington, DC, United States

#### Edited by:

Amritha Jaishankar, Rare Genomics Institute, United States

#### Reviewed by:

Theodora Katsila, University of Patras, Greece

Nelson L. S. Tang, The Chinese University of Hong Kong, Hong Kong

\*Correspondence: M. Chiara Manzini, cmanzini@gwu.edu; chiara.manzini@gmail.com

#### Specialty section:

This article was submitted to Genetic Disorders, a section of the journal Frontiers in Genetics

Received: 11 October 2017 Accepted: 15 February 2018 Published: 02 March 2018

#### Citation:

Zamarbide M, Oaks AW, Pond HL, Adelman JS and Manzini MC (2018) Loss of the Intellectual Disability and Autism Gene Cc2d1a and Its Homolog Cc2d1b Differentially Affect Spatial Memory, Anxiety, and Hyperactivity. Front. Genet. 9:65. doi: 10.3389/fgene.2018.00065

Hundreds of genes are mutated in non-syndromic intellectual disability (ID) and autism spectrum disorder (ASD), with each gene often involved in only a handful of cases. Such heterogeneity can be daunting, but rare recessive loss of function (LOF) mutations can be a good starting point to provide insight into the mechanisms of neurodevelopmental disease. Biallelic LOF mutations in the signaling scaffold CC2D1A cause a rare form of autosomal recessive ID, sometimes associated with ASD and seizures. In parallel, we recently reported that Cc2d1a-deficient mice present with cognitive and social deficits, hyperactivity and anxiety. In Drosophila, loss of the only ortholog of Cc2d1a, lgd, is embryonically lethal, while in vertebrates, Cc2d1a has a homolog Cc2d1b which appears to be compensating, indicating that Cc2d1a and Cc2d1b have a redundant function in humans and mice. Here, we generate an allelic series of Cc2d1a and Cc2d1b LOF to determine the relative role of these genes during behavioral development. We generated Cc2d1b knockout (KO), Cc2d1a/1b double heterozygous and double KO mice, then performed behavioral studies to analyze learning and memory, social interactions, anxiety, and hyperactivity. We found that Cc2d1a and Cc2d1b have partially overlapping roles. Overall, loss of Cc2d1b is less severe than loss of Cc2d1a, only leading to cognitive deficits, while Cc2d1a/1b double heterozygous animals are similar to Cc2d1a-deficient mice. These results will help us better understand the deficits in individuals with CC2D1A mutations, suggesting that recessive CC2D1B mutations and trans-heterozygous CC2D1A and CC2D1B mutations could also contribute to the genetics of ID.

Keywords: intellectual disability, learning, social function, anxiety, hyperactivity, rare diseases, mouse models

## INTRODUCTION

Autosomal recessive loss of function (LOF) of the signaling scaffold Coiled-coil and C2 Domain containing 1A (CC2D1A) causes a spectrum of neurodevelopmental conditions including fully penetrant intellectual disability (ID), and variably penetrant autism spectrum disorder (ASD), seizures, and aggressive behavior (Basel-Vanagaite et al., 2006; Manzini et al., 2014; Reuter et al., 2017). In Drosophila, where only one CC2D1 homolog, lethal giant discs (lgd), is present, removal of lgd is lethal during the larval stage (Gallagher and Knoblich, 2006; Jaekel and Klein, 2006). Expression of either human CC2D1A or CC2D1B can rescue the phenotypes observed in Drosophila (Drusenheimer et al., 2015), suggesting that CC2D1A and CC2D1B act redundantly. Despite wide expression of CC2D1A and its binding to multiple proteins involved in the immune response (Chang et al., 2011; Chen et al., 2012), CC2D1A LOF in humans appears to only affect the brain, leading to a spectrum of behavioral deficits. While this indicates that CC2D1B is not fully able to compensate in the brain leading to the human presentation, it is unclear whether CC2D1B itself could have a role in neurodevelopmental disorders.

Studies on the genetic causes of ID and ASD, in particular, are identifying a large contribution of de novo and hypomorphic mutations to these diseases (Sanders et al., 2012; Lim et al., 2013; Yu et al., 2013; Musante and Ropers, 2014). Many of the mutated genes would have greater impact on development if completely lost, leading to multi-system disorders and/or brain malformations, while the heterozygous and hypomorphic mutations found in ASD/ID affect neurons more mildly, leading to a grossly normal brain, but with cognitive and social deficits (Yu et al., 2013). We wondered whether a similar mechanism is at play in patients with CC2D1A LOF mutations, where CC2D1B can only partially compensate. If this was the case, removal of both CC2D1 genes would be incompatible with embryogenesis, indicating that these proteins together have a critical developmental role. Nothing is known about the role of CC2D1B in brain development. By comparing how individual loss of each gene affects cognitive, social, and affective function we have studied the relative role of CC2D1A and CC2D1B in the brain and defined whether CC2D1B should also be considered as a candidate gene for ID.

Mice deficient for Cc2d1a develop normally in utero, but die soon after birth because of breathing and swallowing deficits (Zhao et al., 2011; Al-Tawashi et al., 2012; Oaks et al., 2017). By conditionally removing Cc2d1a in the forebrain, we have previously shown that Cc2d1a LOF recapitulates features of ID and ASD in adult animals (Oaks et al., 2017). Cc2d1a conditional knockout (1a-cKO) mice show learning and memory deficits, social deficits, hyperactivity, anxiety, and repetitive behaviors (Oaks et al., 2017).

To define how CC2D1B compensates for loss of CC2D1A and contributes to these phenotypes, we generated a Cc2d1b knockout (1b-KO) line and developed an allelic series of Cc2d1a and Cc2d1b LOF, including Cc2d1a/1b double heterozygote (1a/1b-dHET) and double KO (1a/1b-KO) animals. Removal of both CC2D1 proteins causes early embryonic lethality, showing that CC2D1 function has an essential developmental role as in Drosophila. 1b-KO and 1a/1b-dHET animals are viable and fertile suggesting that Cc2d1a and Cc2d1b are not fully redundant, and that Cc2d1a has a critical role in respiration in the mouse.

When we tested the behavioral performance of 1b-KOs we found that Cc2d1b LOF caused only cognitive deficits, which are partially overlapping with those observed in Cc2d1a conditional LOF. Since direct comparison with a global Cc2d1a KO is not possible because of postnatal mortality, we also tested 1a/1b-dHETs which showed a combination of deficits with features of both 1b-KO and 1a-cKO animals, including delayed memory acquisition and retention, as well as increased anxiety and hyperactivity, mostly in males. Our findings indicate that CC2D1 function is critical for embryonic development and that the CC2D1 proteins regulate multiple behaviors with some sex-specificity for males. Both CC2D1A and CC2D1B are involved in learning and memory, while CC2D1A alone appears to contribute to anxiety and hyperactivity.

## MATERIALS AND METHODS

#### Animals

This study was carried out in accordance with the recommendations of the Institutional Animal Care and Use Committee of The George Washington University. A Cc2d1b null mouse line (1b-KO) was generated by the Knockout Mouse Project Repository (Project ID CDS 34981) at the University of California Davis, with the allele Cc2d1b<sup>tm1a(KOMP)Wtsi</sup>. Cc2d1b null mice carry an engrailed 2 splice acceptor (En2SA) gene-trap allele with bicistronic expression of β-galactosidase as well as a neomycin resistance cassette, flanked by FRT (flippase recombinase target) recombination sites, in the genomic region between exons 2 and 3 of Cc2d1b (**Figure 1A**). Cc2d1a/1b double heterozygous (1a/1b-dHET) mice were generated by crossing Cc2d1a heterozygotes (1a-HET) with Cc2d1b heterozygotes (1b-HET). 1a-HET mice were bred from a Cc2d1a null mouse line (KO) generated by the Knockout Mouse Project Repository (Project Design ID 49663) at the University of California as was previously described by Oaks et al. (2017). All lines are maintained on a C57BL/6 background. For genotyping, polymerase chain reaction (PCR) amplifications were performed on 1 µL of proteinase K (New England Biolabs, Ipswich, MA, United States) digested tail DNA samples. PCR reactions (50 µL) consisted of GoTaq Flexi buffer (Promega, Madison, WI, United States), 100 µM dNTPs, 50 µM each of forward and reverse primers (sequence available upon request), 1 mM MgCl<sub>2</sub>, and 1.25 U GoTaq Flexi DNA polymerase (Promega, Madison, WI, United States), and were run with optimized reaction profiles determined for each genotype. A 25-µL aliquot from each reaction was analyzed by gel electrophoresis on a 1.0% agarose gel for the presence of the desired band.
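The expected genotype proportions from the crosses described above can be sketched with a short enumeration of gametes. This is an illustration only, assuming simple Mendelian inheritance and independent assortment of the two loci (an assumption of ours, not a statement from the methods); the function name and the "+"/"-" allele notation are likewise hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from typing import Dict, Tuple

Genotype = Tuple[Tuple[str, str], ...]  # one allele pair per locus, e.g. (("+", "-"), ("+", "+"))


def offspring_distribution(parent1: Genotype, parent2: Genotype) -> Dict[Tuple[str, ...], Fraction]:
    """Expected offspring genotype fractions for a cross of two parents.

    Assumes each parent passes one allele per locus with equal probability
    and that loci assort independently (illustrative assumption).
    """
    gametes1 = list(product(*parent1))  # every combination of one allele per locus
    gametes2 = list(product(*parent2))
    counts: Counter = Counter()
    for g1, g2 in product(gametes1, gametes2):
        # Sort alleles within each locus so "+-" and "-+" count as the same genotype.
        genotype = tuple("".join(sorted(pair)) for pair in zip(g1, g2))
        counts[genotype] += 1
    total = len(gametes1) * len(gametes2)
    return {g: Fraction(n, total) for g, n in counts.items()}


if __name__ == "__main__":
    # Locus order: (Cc2d1a, Cc2d1b); "+" is wild-type, "-" is the null allele.
    het_1a = (("+", "-"), ("+", "+"))
    het_1b = (("+", "+"), ("+", "-"))
    dhet = (("+", "-"), ("+", "-"))

    for name, dist in [("1a-HET x 1b-HET", offspring_distribution(het_1a, het_1b)),
                       ("dHET x dHET", offspring_distribution(dhet, dhet))]:
        print(name)
        for genotype, fraction in sorted(dist.items()):
            print("  ", genotype, fraction)
```

Under these assumptions, one quarter of the offspring of a 1a-HET × 1b-HET cross are double heterozygotes, and one sixteenth of the conceptuses from a double heterozygote intercross are double knockouts. Observed ratios among live pups would differ wherever a genotype is lethal.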

#### Histological Preparation and Microscopy

To prepare tissue for histological analysis, deeply anesthetized mice were transcardially perfused with phosphate buffered saline (PBS) followed by 4% paraformaldehyde (PFA). Brains were removed and post-fixed in PFA. Cryosections from adult mouse brains were prepared by mounting in Neg-50 (Thermo Fisher Scientific, Waltham, MA, United States) and cut at 40 µm on a Cryostar NX50 cryostat (Thermo Fisher Scientific, Waltham, MA, United States), then stained with Hematoxylin and Eosin (H&E, VWR International, Radnor, PA, United States) to visualize tissue architecture. Imaging of H&E stained sections was performed on a Leica M165 FC stereo microscope (Leica Microsystems, Buffalo Grove, IL, United States).

### Behavioral Tests

A standardized battery of behavioral testing was applied to each cohort of animals, 1b-KO and 1a/1b-dHET male and female mice, at 3–4 months of age. As both 1b-KOs and 1a/1b-dHETs were generated from the same 1a/1b-dHET breeding, the wild-type (WT) controls were littermates shared by both cohorts, and all behavioral tests were performed at the same time for WT, 1b-KOs, and 1a/1b-dHETs. Behavioral tests were performed in the Manzini lab behavioral suite in the George Washington University Animal Research Facility following a 60 min period of acclimatization. Initial characterization to detect any neurological abnormalities, including the analysis of basic motor and somatosensory function, was performed on a subset of the behavioral cohort as described by Rogers et al. (1997): righting reflex, wire hang, gait analysis, tail pinch, and visual reach. Cognitive and social function and other behaviors were tested in the open field test, novel object recognition test (NORT) (Bevins and Besheer, 2006), Morris water maze (MWM) (Vorhees and Williams, 2006), and 3-chamber social interaction test (Nadler et al., 2004). Behavioral analysis was performed via automated animal tracking using ANY-maze (Stoelting, Wood Dale, IL, United States).

#### Righting Reflex

Coordination, motor strength, and vestibular function were tested by placing each mouse on its back and timing its ability to return to an upright position.

#### Wire Hang

Motor strength was tested by timing the latency to fall to a mouse cage containing bedding while the mouse was hanging from a wire cage-top not higher than 18 cm.

#### Gait Analysis

Motor coordination and strength were assessed by painting the paws of each mouse with red non-toxic tempera paint and having it walk through a narrow tunnel over white paper. Abnormalities of paw placement and stride length were noted, or the gait was recorded as normal.

#### Tail Pinch

The ability of each mouse to respond to mild pain was tested by pinching the tip of the tail with fine, ethanol-cleaned forceps. Reactions were categorized as either response or no response.

#### Visual Reach

Vision was tested by measuring the latency to the first attempt to reach for a nearby wire cage-top while the mouse was being held by the base of the tail at a height of 18 cm over an open cage.

#### Open Field Test

The open field test was performed in an unfamiliar 50 cm × 50 cm plastic box (Stoelting, Wood Dale, IL, United States). Animals were placed in the center of the arena and ambulatory activity was monitored by digital video for 15 min. The arena was divided into two areas, an outer zone and a center zone (25 cm × 25 cm; 25% of total area). Total distance traveled and time spent in each area were measured.
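The two open field measures can be illustrated with a minimal sketch. It is not the ANY-maze implementation: it assumes a hypothetical track of (x, y) positions in cm sampled at a fixed interval, with the origin at one corner of the arena, and it assumes the 25 cm × 25 cm center zone is centered in the 50 cm × 50 cm box. The function and constant names are illustrative only.

```python
from math import hypot

ARENA_SIDE_CM = 50.0    # arena side length given in the text
CENTER_SIDE_CM = 25.0   # center zone side length given in the text


def center_fraction():
    """Fraction of the arena area occupied by the center zone."""
    return CENTER_SIDE_CM ** 2 / ARENA_SIDE_CM ** 2


def in_center(x_cm, y_cm):
    """True if a position falls in the center zone (assumed to be centered)."""
    margin = (ARENA_SIDE_CM - CENTER_SIDE_CM) / 2
    upper = ARENA_SIDE_CM - margin
    return margin <= x_cm <= upper and margin <= y_cm <= upper


def summarize_track(points_cm, dt_s):
    """Return (distance traveled in m, time in center in s) for a sampled track."""
    distance_cm = sum(
        hypot(x1 - x0, y1 - y0)
        for (x0, y0), (x1, y1) in zip(points_cm, points_cm[1:])
    )
    center_s = sum(dt_s for x, y in points_cm if in_center(x, y))
    return distance_cm / 100.0, center_s
```

Calling `center_fraction()` reproduces the 25% stated above, and `summarize_track` returns the pair of values reported per animal (distance in meters, time in center in seconds).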

#### Novel Object Recognition Test


The NORT (Bevins and Besheer, 2006; Oaks et al., 2017) was performed in the same apparatus described for the open field test. The test consisted of three different phases: habituation, training, and test. The habituation phase lasted for 30 min, during which the animals were exposed to the box; they were then returned to the home cage while the box was cleaned. During the training phase, the animal was placed in the same box with two identical objects located in opposite corners, at a distance of 5 cm from the walls. To assess short-term memory, the animal was returned to the home cage for an interval of 15 min. During the test, a familiar object, identical to those used in the training phase, was placed in one corner, while an unfamiliar object was placed in the opposite corner. Exploration activity was monitored for 10 min at each phase, with exploration defined as time spent actively observing or touching the object from within a radius of 5 cm. Cumulative time spent with each object was measured by video analysis using ANY-maze to determine the location of the animal's nose relative to the objects in the enclosure. Preference for the novel object was defined as the ratio of the time spent with the novel object to the time spent with the familiar object. Animals that did not interact with the objects and stopped in a corner of the cage were removed from the analysis.
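The preference measure defined above is a simple ratio, sketched below under stated assumptions. The function name is hypothetical, and the handling of edge cases (no exploration at all, or no time with the familiar object) is an illustrative choice rather than the authors' documented rule.

```python
def novelty_preference(novel_s, familiar_s):
    """Ratio of time with the novel object to time with the familiar object.

    Returns None when the animal explored neither object, mirroring the
    exclusion of non-interacting animals (an assumption about how such
    animals would be flagged).
    """
    if novel_s < 0 or familiar_s < 0:
        raise ValueError("exploration times must be non-negative")
    if novel_s == 0 and familiar_s == 0:
        return None
    if familiar_s == 0:
        return float("inf")  # ratio is undefined; treated here as unbounded preference
    return novel_s / familiar_s
```

A ratio near 1 indicates no preference, while values well above 1 indicate that the animal spent more time with the novel object.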

#### Morris Water Maze

The MWM (Vorhees and Williams, 2006; Oaks et al., 2017) apparatus was a 120 cm × 120 cm round metal tub (Stoelting, Wood Dale, IL, United States) where distinct visual cues were placed at the cardinal points. White non-toxic paint was added to the water, which was maintained at 24 °C, to make the surface opaque for the hidden trials. Each trial consisted of four independent drops, one at each cardinal point around the tub, with the mouse facing the wall of the tub. Each drop lasted 60 s, or until the mouse found the platform, whichever occurred first. Each animal completed two trials (four drops each) with a visible platform, five trials with a platform hidden under the water surface, and two reversal trials where the location of the hidden platform (HP) was changed. The sequence of nine trials was performed over 9 days, with one trial per day. A 60-s probe trial was also performed the day after the HP series was completed, by removing the platform from the water before proceeding to the reversal phase on the following day.
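The trial sequence described above can be laid out as a short sketch. It assumes, based on the wording of the last sentence, that the probe trial occupies its own day between the hidden platform and reversal phases, so that the nine training trials plus the probe span ten calendar days. The phase labels (VP, HP, RV) follow the abbreviations used in the Figure 4 legend; the function name and tuple layout are hypothetical.

```python
DROPS_PER_TRIAL = 4   # one drop at each cardinal point
MAX_DROP_S = 60       # each drop ends at 60 s or when the platform is found

# (phase label, number of daily sessions); probe placement is an assumption
PHASES = [("VP", 2), ("HP", 5), ("PROBE", 1), ("RV", 2)]


def build_schedule():
    """Return a list of (day, session label, number of drops)."""
    schedule = []
    day = 1
    for phase, sessions in PHASES:
        for i in range(1, sessions + 1):
            if phase == "PROBE":
                schedule.append((day, "PROBE", 1))  # single 60-s swim, no platform
            else:
                schedule.append((day, f"{phase}{i}", DROPS_PER_TRIAL))
            day += 1
    return schedule


schedule = build_schedule()
training_drops = sum(drops for _, label, drops in schedule if label != "PROBE")
```

Under these assumptions each animal completes 36 training drops (nine trials of four drops) and one probe swim.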

#### Three-Chamber Social Interaction Test

The social interaction test (Nadler et al., 2004; Kaidanovich-Beilin et al., 2011) was performed in a clear rectangular acrylic box (60 cm × 40 cm) divided into three chambers (40 cm × 20 cm) with small openings (10 cm × 5 cm) in the adjoining walls (Everything Plastic, Philadelphia, PA, United States). The test consisted of two phases, the habituation phase and the sociability phase. During the habituation phase, empty inverted wire cups (10 cm in diameter) were placed in the center of the chambers at the ends. Each mouse was placed in the center chamber of the apparatus and allowed to explore the different chambers for 5 min. During the second phase, an unfamiliar mouse of the same sex as the tested mouse was placed under the wire cup in one of the side chambers. The experimental mice were allowed to explore for 10 min during the sociability phase. Total time spent in the Object (containing empty cup) and Mouse (with unfamiliar mouse under the cup) chambers was used to determine the social preference of each mouse tested, while the time spent sniffing within a 2-cm radius of each cup was recorded as a measure of social approach and social interaction.

## RESULTS

#### CC2D1A and CC2D1B Have Partially Redundant Function in Development

Loss of CC2D1A in humans causes a variable spectrum of ID, ASD, and seizures, and the removal of Cc2d1a in the murine forebrain leads to several cognitive, social, and affective behavioral phenotypes (Manzini et al., 2014; Oaks et al., 2017). As no human mutations in CC2D1B have been identified to date, we asked whether loss of Cc2d1b in the mouse would lead to phenotypes similar to those caused by loss of Cc2d1a. A Cc2d1b-deficient line (1b-KO) had been generated by the Knockout Mouse Project (KOMP) as a gene-trap allele inserted in intron 2 of Cc2d1b (**Figure 1A**). We obtained heterozygous animals and bred them to homozygosity, finding that 1b-KO mice are born in Mendelian ratios (**Figure 1B**). Unlike Cc2d1a KO (1a-KO) pups, which die shortly after birth (Zhao et al., 2011; Al-Tawashi et al., 2012; Drusenheimer et al., 2015; Oaks et al., 2017), 1b-KO mice are viable, fertile, and indistinguishable from WT littermates (**Figure 1C**). Basic behavioral functions were tested in adult WT and 1b-KO males and females: coordination (righting reflex), strength (wire hang), locomotion (stride and gait), pain sensitivity (tail pinch), and vision (visual reach). No differences were observed in basic sensory and motor function (**Table 1**). We confirmed via western blot analysis of cortical protein lysates that CC2D1B was completely absent in these animals and that CC2D1A was expressed at normal levels (**Figure 1D**). Cryosections generated from the adult brain of 1b-KO animals and stained using hematoxylin and eosin (H&E) showed no differences in brain size and organization from WT littermates (**Figure 1E**). In summary, loss of Cc2d1b does not cause the neonatal deficits in respiratory function and deglutition observed in 1a-KOs, and 1b-KO adult mice are indistinguishable from WT littermates.

TABLE 1 | Analysis of basic motor and sensory function in 1b-KO and 1a/1b-dHET mice.

<sup>a</sup>Animals responding to tail pinch. <sup>b</sup>Animals responding to visual stimulus.

CC2D1A and CC2D1B contain very similar protein domains and are thought to have redundant functions in endocytic traffic and gene transcription (Hadjighassem et al., 2009; Usami et al., 2012; Drusenheimer et al., 2015). Because CC2D1B LOF did not result in postnatal lethality, we wondered whether the two proteins would only be partially redundant. To test this hypothesis, we crossed 1b-KOs and 1a-KOs to generate Cc2d1a/Cc2d1b double heterozygous (1a/1b-dHET) and double KO (1a/1b-dKO) mice. As 1a-KO pups die soon after birth (Zhao et al., 2011; Al-Tawashi et al., 2012; Oaks et al., 2017), we did not expect 1a/1b-dKO animals to survive and we genotyped litters at postnatal day (P)0, collecting tissue from both live and dead pups. However, while dead 1a-KO and 1a-KO/1b-HET pups were found in the expected ratios, 1a/1b-dKO pups were never retrieved (**Figure 2A**), suggesting that double knockouts may die earlier during embryonic development. Examination of prenatal litters only identified 1a/1b-dKO tissue mid-gestation at E11.5, but the embryo was almost entirely absent, leaving only a hypomorphic and largely empty yolk sac (**Figure 2B**). These results indicate that removal of both CC2D1 proteins leads to early embryonic lethality.
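To make "expected ratios" concrete, the sketch below enumerates Mendelian expectations for a two-locus cross. It assumes, purely for illustration, an intercross of double heterozygotes with independent segregation of the two loci; the text does not state the exact cross used for Figure 2A, and the names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Offspring genotype probabilities at one locus for a HET x HET cross
SINGLE_LOCUS = {
    "WT": Fraction(1, 4),
    "HET": Fraction(1, 2),
    "KO": Fraction(1, 4),
}


def dihybrid_expectations():
    """Expected frequency of each (Cc2d1a, Cc2d1b) genotype pair.

    Assumes the two loci segregate independently, so joint probabilities
    are products of the single-locus probabilities.
    """
    return {
        (a, b): pa * pb
        for (a, pa), (b, pb) in product(SINGLE_LOCUS.items(), repeat=2)
    }


expected = dihybrid_expectations()
```

Under these assumptions, `expected[("KO", "KO")]` is 1/16, `expected[("KO", "HET")]` is 1/8, and `expected[("HET", "HET")]` is 1/4, so a complete absence of double knockouts at P0 becomes increasingly unlikely by chance as more pups are genotyped.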

1a/1b-dHETs were viable, fertile, and indistinguishable from WT littermates with normal gross brain anatomy (**Figure 2C**) and normal basic motor and sensory function (**Table 1**). We tested the expression levels of CC2D1A and CC2D1B in 1a/1b-dHET mice and found that, as expected, only a half dose of each CC2D1 protein was present (**Figure 2D**). Thus, combined CC2D1 function is necessary for embryonic morphogenesis, but 1b-KO or 1a/1b-dHET animals develop normally, indicating that CC2D1A and CC2D1B have similar functions as they pertain to gross anatomical development and survival.

pups were never found at P0. (B) Representative images of normal embryonic day 11.5 (E11.5) embryo with a single intact Cc2d1 allele (Left) and a double KO embryo (Right; arrow indicates empty yolk sac). Scale bars: 1 mm. (C) The size and organization of the adult 1a/1b-dHET brain is indistinguishable from wild-type mice stained with hematoxylin and eosin. Scale bar: 1 mm. (D) Immunoblot analysis of CC2D1A and CC2D1B expression in wild-type and 1a/1b-dHET mice. A half dose of each CC2D1 protein was found. Results expressed as mean ± SEM. <sup>∗</sup>p < 0.05 (two tailed t-test).

#### Both CC2D1A and CC2D1B Are Important for Cognitive Function


We have previously found that loss of Cc2d1a leads to a constellation of behavioral deficits: cognitive and social impairment, anxiety, hyperactivity, and repetitive behaviors (Oaks et al., 2017). We generated a cohort of 1b-KO and 1a/1b-dHET male and female mice for behavioral analysis by crossing 1a/1b-dHETs, so that we could compare behavioral performance in both lines at the same time. In the short-term memory version of the NORT (Bevins and Besheer, 2006), mice are placed in an arena with two identical objects that they are free to explore. After being returned to their cages for 15 min, they are put back in the arena, where one of the now-familiar objects has been replaced with a novel object (**Figure 3A**). In this test, WT male and female mice spend roughly four times longer exploring the novel object, while 1b-KOs and 1a/1b-dHETs show no significant preference (**Figures 3B,C**) (Males: WT, T2/T1 = 1.21 ± 0.32, T4/T3 = 3.90 ± 0.75, n = 10, p = 0.004 ∗∗; 1b-KO, T2/T1 = 1.05 ± 0.24, T4/T3 = 1.60 ± 0.46, n = 11, p = 0.309; 1a/1b-dHET, T2/T1 = 1.08 ± 0.23, T4/T3 = 1.62 ± 0.46, n = 12, p = 0.307. Females: WT, T2/T1 = 1.20 ± 0.25, T4/T3 = 4.39 ± 1.40, n = 10, p = 0.038 ∗; 1b-KO, T2/T1 = 0.84 ± 0.16, T4/T3 = 0.93 ± 0.24, n = 10, p = 0.757; 1a/1b-dHET, T2/T1 = 1.34 ± 0.48, T4/T3 = 1.46 ± 0.28, n = 10, p = 0.824). This deficit was not due to reduced interest in the objects, as animals spent similar amounts of time in exploratory behaviors, with 1a/1b-dHET males showing significantly more exploration (**Figure 3D** T1+T2 – Males: WT, t = 26.97 ± 5.75s, n = 10; 1b-KO, t = 23.17 ± 3.65s, n = 11, p = 0.999; 1a/1b-dHET, t = 65.27 ± 15.93s, n = 12, p = 0.167; Females: WT, t = 40.56 ± 5.19s, n = 10; 1b-KO, t = 71.67 ± 17.47s, n = 10, p = 0.423; 1a/1b-dHET, t = 54.30 ± 10.56s, n = 10, p = 0.960. **Figure 3E** T3+T4 – Males: WT, t = 21.93 ± 5.54s; 1b-KO, t = 17.91 ± 3.57s, p = 0.999; 1a/1b-dHET, t = 50.83 ± 16.0s, p = 0.640. Females: WT, t = 15.39 ± 2.12s; 1b-KO, t = 68.38 ± 26.04s, p = 0.090; 1a/1b-dHET, t = 31.86 ± 10.61s, p = 0.959. **Figure 3F** SUM T1,2,3,4 – Males: WT, t = 48.90 ± 9.35s; 1b-KO, t = 41.08 ± 6.20s, p = 0.942; 1a/1b-dHET, t = 116.1 ± 28.24s, p = 0.033 ∗; Females: WT, t = 55.95 ± 6.62s; 1b-KO, t = 140.1 ± 42.64s, p = 0.073; 1a/1b-dHET, t = 86.16 ± 20.52s, p = 0.660).

To further assess cognitive function, the 1b-KO mice were tested using the MWM paradigm, which probes spatial memory acquisition, retention, and flexibility by testing the ability of a mouse to learn, remember, and relearn the location of a platform hidden under opaque water (Morris, 1984). After the mice are trained using a visible platform to escape from the water, the platform is hidden under the surface in a different location and the animals undergo training on five consecutive days to learn the location of the platform. On the following day, memory retention is tested by removing the platform and measuring the amount of time the mouse spends in the area where the platform was previously located (probe trial). Finally, the position of the platform is changed and the animal must display flexibility by learning a new location (reversal). 1a-cKO animals show a delay in initial acquisition of the location of the HP, but after they learn, they can retain the memory in the probe trial and learn a new location in the reversal (Oaks et al., 2017). Both 1b-KO and 1a/1b-dHET males and females presented deficits in this test (**Figure 4**). 1b-KO males and females and 1a/1b-dHET males were delayed in the HP acquisition, showing significant differences on day 2 or 3 of the test (HP2 and HP3 in **Figures 4B,F**) (Males HP3: WT, t = 6.82 ± 0.69s, n = 11; 1b-KO, t = 10.97 ± 1.85s, n = 10, p = 0.042 ∗; 1a/1b-dHET, t = 11.99 ± 1.28s, n = 13, p = 0.0027 ∗∗. Females HP2: WT, t = 12.30 ± 1.32s, n = 13; 1b-KO, t = 19.62 ± 1.74s, n = 10, p = 0.0025 ∗∗; 1a/1b-dHET, t = 14.66 ± 1.64s, n = 11, p = 0.247). 1a/1b-dHET males and females were also affected in the probe trial, where they spent less time in the platform quadrant during the first 15 s of the 60-s trial (**Figures 4D,H**) (Probe 15 s – Males: WT, t = 9.51 ± 0.83s, n = 11; 1b-KO, t = 6.13 ± 0.50s, n = 10, p = 0.0029 ∗∗; 1a/1b-dHET, t = 5.48 ± 0.80s, n = 13, p = 0.0021 ∗∗. Females: WT, t = 7.18 ± 0.80s, n = 13; 1b-KO, t = 5.77 ± 0.65s, n = 10, p = 0.203; 1a/1b-dHET, t = 4.36 ± 0.82s, n = 11, p = 0.022 ∗). Finally, 1b-KO males, but not females, were affected throughout the 60-s probe trial, spending less time exploring the correct quadrant (**Figure 4D**) (Probe 60 s – Males: WT, t = 25.40 ± 1.78s, n = 11; 1b-KO, t = 19.58 ± 1.30s, n = 10, p = 0.018 ∗; 1a/1b-dHET, t = 22.74 ± 2.63s, n = 13, p = 0.428. Females: WT, t = 21.19 ± 1.85s, n = 13; 1b-KO, t = 20.57 ± 1.54s, n = 10, p = 0.809; 1a/1b-dHET, t = 18.43 ± 2.62s, n = 11, p = 0.389). Animals heterozygous for loss of Cc2d1a or Cc2d1b alone showed normal behavioral performance (Supplementary Figures 1, 2 and Supplementary Table 1). In summary, loss of CC2D1B leads to cognitive deficits in both memory acquisition and retention. In general, males appear more severely affected than females in both 1b-KO and 1a/1b-dHET lines, suggesting that CC2D1A and CC2D1B have overlapping roles in cognitive function.
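As a rough cross-check of how such comparisons can be derived from summary statistics, the sketch below runs a two-sample Student's t-test with pooled (equal) variance, the test type named in the Figure 4 legend. It assumes the reported values are mean ± SEM (as stated in the Figure 2 legend) and uses only the Python standard library; the function names and the numerical integration of the t density are illustrative choices, not the authors' analysis pipeline.

```python
import math


def pooled_t(mean1, sem1, n1, mean2, sem2, n2):
    """Student's t statistic and degrees of freedom from mean, SEM, and n."""
    var1 = (sem1 * math.sqrt(n1)) ** 2   # convert SEM back to variance
    var2 = (sem2 * math.sqrt(n2)) ** 2
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean2 - mean1) / se_diff, df


def t_pdf(x, df):
    """Probability density of Student's t distribution."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value via Simpson's rule on the density from 0 to |t|."""
    upper = abs(t)
    h = upper / steps                     # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_mass = total * h / 3          # probability mass between 0 and |t|
    return max(0.0, 1.0 - 2.0 * central_mass)


# Males, HP3: WT 6.82 ± 0.69 s (n = 11) vs 1b-KO 10.97 ± 1.85 s (n = 10)
t_stat, dof = pooled_t(6.82, 0.69, 11, 10.97, 1.85, 10)
p_value = two_tailed_p(t_stat, dof)
```

With these inputs the statistic has 19 degrees of freedom and the p-value should land close to the reported p = 0.042, allowing for rounding of the published means and SEMs.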

#### Only CC2D1A Is Involved in Anxiety and Hyperactivity

1a-cKO animals showed increased mobility and reduced entry into the center of the open field arena, indicating hyperactivity and anxiety (Oaks et al., 2017). In addition, removal of Cc2d1a in the forebrain leads to social interaction deficits and to ulcerative dermatitis due to obsessive grooming (Oaks et al., 2017). 1b-KO males and females performed similarly to WT littermates in the open field test and showed no signs of hyperactivity or anxiety (**Figure 5**) (Distance – Males: WT, d = 25.16 ± 2.29m, n = 11; 1b-KO, d = 29.63 ± 1.96m, n = 11, p = 0.498; Females: WT, d = 34.65 ± 1.36m, n = 13; 1b-KO, d = 42.37 ± 3.28m, n = 11, p = 0.097. Time in center – Males: WT, t = 78.13 ± 5.23s, n = 11; 1b-KO, t = 83.17 ± 14.26s, n = 11, p = 0.988; Females: WT, t = 77.45 ± 11.78s, n = 10; 1b-KO, t = 87.75 ± 17.65s, n = 10, p = 0.969). Interestingly, 1a/1b-dHETs showed increased locomotion and avoidance of open spaces, as previously observed for the 1a-cKOs, but only in males, similar to the exploration in the NORT where increased exploratory behavior was only observed in 1a/1b-dHET males (**Figures 5A,B**) (Distance – Males: WT, d = 25.16 ± 2.29m, n = 11; 1a/1b-dHET, d = 35.85 ± 2.94m, n = 13, p = 0.0076 ∗∗; Females: WT, d = 34.65 ± 1.36m, n = 13; 1a/1b-dHET, d = 35.35 ± 2.51m, n = 11, p = 0.999. Time in center – Males: WT, t = 78.13 ± 5.23s, n = 11; 1a/1b-dHET, t = 41.32 ± 3.71s, n = 13, p = 0.0198 ∗; Females: WT, t = 77.45 ± 11.78s, n = 10; 1a/1b-dHET, t = 121.90 ± 15.19s, n = 11, p = 0.1225). No ulcerative dermatitis or obsessive grooming was observed in any of these mouse lines.

FIGURE 4 | CC2D1B is involved in spatial memory formation and retention with mild male-specificity. Hippocampus-dependent spatial memory was assessed in 1b-KO and 1a/1b-dHET mice via the Morris Water Maze test. Spatial learning was measured as latency to escape in three different stages: visible platform (VP), hidden platform (HP), and reversal (RV) of the HP position. No deficits were shown by males (A) or females (E) of any genotype in identifying the platform in the VP trial. (B) Both 1b-KO and 1a/1b-dHET males showed a delay in learning the location of the HP, and a similar deficit was present in 1b-KO females (F). (C,G) No differences were found in the RV during the test. (D,H) Spatial memory retention was measured between the HP and RV trials by the time spent swimming in the quadrant where the platform was previously located. Significant spatial memory impairment was found in the 1b-KO male mice compared to WT both during the first 15 s and at the end of the trial after 60 s, while female 1b-KO mice showed no deficit. 1a/1b-dHET males and females spent less time looking for the platform during the first 15 s, but subsequently recovered. Two-way ANOVA with repeated measures was used for analysis of the HP phase. Multiple t-tests with equal variance were used for individual timepoints and probe analysis. <sup>∗</sup>p < 0.05, ∗∗p < 0.01.

Finally, all mice were tested in the social approach version of the three-chambered test. In this test, the mouse is placed in an apparatus with three communicating chambers. In the left chamber, there is a novel mouse of the same sex under a wire cup, while in the right chamber there is an empty wire cup. Mice typically spend more time exploring and sniffing the stranger mouse than the object, and this is considered a social action (Nadler et al., 2004; Kaidanovich-Beilin et al., 2011). The 1a-cKO showed no preference for the conspecific, both in the time spent around the mouse enclosure and in the time spent sniffing the stranger mouse (Oaks et al., 2017). 1a/1b-dHET males and females and 1b-KO females behaved like WT mice in this test (**Figures 5E–H**). 1b-KO males were moderately affected, showing a non-significant difference between the empty cup and the stranger (**Figure 5E**) [Males: WT, time with mouse (tm) = 287.65 ± 26.81s, time with object (to) = 162.74 ± 18.15s, n = 11, p = 0.00098 ∗∗∗; 1b-KO, tm = 282.37 ± 34.83s, to = 187.98 ± 28.63s, n = 13, p = 0.082; 1a/1b-dHET, tm = 312.05 ± 39.03s, to = 159.39 ± 28.11s, n = 10, p = 0.0052 ∗∗. Females: WT, tm = 331.50 ± 19.14s, to = 152.70 ± 31.59s, n = 8, p = 0.00026 ∗∗∗; 1b-KO, tm = 362.93 ± 29.06s, to = 151.53 ± 29.39s, n = 8, p = 0.00016 ∗∗∗; 1a/1b-dHET, tm = 317.62 ± 20.89s, to = 172.47 ± 12.91s, n = 9, p = 0.00002 ∗∗∗]. The deficit in 1b-KO males was primarily due to a subset of animals showing preference for the object (Supplementary Figure 3). All genotypes showed significantly increased time spent sniffing the stranger mouse, indicating that once in the chamber the 1b-KO animals interact with the other animal (**Figures 5F,H**) [Males: WT, time sniffing mouse (tsm) = 66.47 ± 7.44s, time sniffing object (tso) = 36.61 ± 7.51s, n = 11, p = 0.0105 ∗; 1b-KO, tsm = 56.04 ± 10.78s, tso = 21.47 ± 5.78s, n = 13, p = 0.009 ∗∗; 1a/1b-dHET, tsm = 58.40 ± 8.65s, tso = 31.11 ± 8.95s, n = 10, p = 0.042 ∗. Females: WT, tsm = 60.11 ± 10.60s, tso = 31.15 ± 7.71s, n = 8, p = 0.044 ∗; 1b-KO, tsm = 96.68 ± 13.00s, tso = 29.93 ± 5.55s, n = 8, p = 0.00033 ∗∗∗; 1a/1b-dHET, tsm = 55.80 ± 5.66s, tso = 18.26 ± 4.02s, n = 9, p = 0.00005 ∗∗∗].

In conclusion, 1b-KO and 1a/1b-dHET animals show only partially overlapping behavioral profiles in anxiety, hyperactivity, and sociability. 1b-KO mice of either sex do not appear anxious or hyperactive and only males show a mild sociability deficit in the three-chamber test. 1a/1b-dHET males are more similar to 1a-cKO mice, with increased locomotion and decreased time in the center of the open field. These results show that CC2D1A and CC2D1B have only partially redundant roles in cognitive and social function. Each of the Cc2d1 genes contributes to aspects of learning and memory and sociability, but Cc2d1a appears to be more critical for hyperactivity and anxiety. Interestingly, both lines display sexually dimorphic phenotypes with males being mildly more affected than females.

## DISCUSSION

Cognitive development is controlled by a multitude of mechanisms regulating synaptic transmission and neuronal function. Hundreds of genes have been found mutated in patients with ID and ASD and the generation of mouse models has deepened our understanding of how each gene contributes to disease and behavior (Nestler and Hyman, 2010; Ey et al., 2011; Kazdoba et al., 2015). Mutations in the gene encoding CC2D1A cause a rare form of ID and ASD in humans, and this protein is emerging as a critical regulator of intracellular signaling with roles in cognitive function (Basel-Vanagaite et al., 2006; Manzini et al., 2014), immunity (Zhao et al., 2010; Chang et al., 2011) and cancer (Yamada et al., 2015). Removal of the only CC2D1 homolog in Drosophila, lgd, causes early lethality and severe deficits in morphogenesis, and both human proteins can rescue lgd LOF phenotypes, suggesting that the vertebrate CC2D1 proteins have redundant functions (Drusenheimer et al., 2015). In fact, deficits in lgd mutant flies are more severe than in 1a-KO and 1b-KO mice (Drusenheimer et al., 2015). We hypothesized that the neuropsychiatric phenotypes observed in humans carrying CC2D1A LOF mutations are likely due to the inability of CC2D1B to fully substitute for CC2D1A.

Initial evidence to support our hypothesis was provided by the fact that 1a-KO mice are anatomically normal but die soon after birth due to breathing and swallowing deficits (Zhao et al., 2011; Al-Tawashi et al., 2012; Chen et al., 2012; Oaks et al., 2017), while 1b-KOs are viable and fertile (Drusenheimer et al., 2015). No respiratory deficits have been reported in humans with CC2D1A mutations and these findings indicated that Cc2d1a has an essential role in breathing regulation in the brain stem in the mouse where CC2D1B cannot complement CC2D1A function. We do not know whether this difference between mice and humans is due to the timing of birth which is at an earlier stage of neural development in mice, or to differences in CC2D1A and CC2D1B expression in the brain stem in the two species.

The current study provides further evidence, through behavioral studies, that Cc2d1a LOF is more severe than Cc2d1b LOF. Forebrain-specific Cc2d1a-deficient mice (1a-cKO) display an array of cognitive and social deficits, in addition to anxiety and hyperactivity (Oaks et al., 2017). 1b-KO mice only display cognitive deficits, with object-recognition impairment in the NORT and reduced memory acquisition and retention in the MWM test, but no other phenotypes. Interestingly, the MWM test results reveal different roles for the CC2D1 proteins in spatial learning and memory. 1a-cKO animals showed delayed learning, but no deficit in remembering the location of the platform once it was learned (Oaks et al., 2017), while 1b-KO mice also displayed reduced memory retention in the probe, especially in males. Parallel studies in the 1a/1b-dHET line confirm this difference, with deficits observed in both spatial memory acquisition and retention. In comparing cognitive performance in 1b-KOs with 1a/1b-dHET and previously published 1a-cKOs, all lines were equally deficient in the NORT, indicating that object recognition circuits in the cortex and hippocampus are affected (Antunes and Biala, 2012).

Cc2d1b also differs from Cc2d1a, as it appears to have no role in social behavior, hyperactivity, and anxiety. Results from the 1a/1b-dHETs suggest that partial loss of Cc2d1a in combination with a half dosage of Cc2d1b is sufficient to generate hyperactivity and anxiety. Interestingly, only complete loss of Cc2d1a leads to social deficits. Taken together, our results indicate that Cc2d1a and Cc2d1b have roles in behavioral function that are only partially redundant. Behavior is regulated by a multitude of molecular and cellular mechanisms, but it is interesting to note how each of these two homologous proteins may contribute to specific sets of behaviors. These effects could be due to their role in controlling a variety of intracellular signaling processes and thereby affecting multiple cellular functions.

CC2D1A and CC2D1B were reported to regulate endocytosis and gene transcription (Hadjighassem et al., 2011; Martinelli et al., 2012; Usami et al., 2012; Drusenheimer et al., 2015), but CC2D1A has been the most studied to date. Many of the pathways regulated by CC2D1A, such as Akt, CREB, and NFκB, are important for learning and memory (Bourtchuladze et al., 1994; Meffert et al., 2003; Lai et al., 2006; Majumdar et al., 2011). Initial findings in Cc2d1a-deficient cells showed an imbalance in signaling activation (Al-Tawashi et al., 2012; Manzini et al., 2014) and mild disruptions in endosome size (Drusenheimer et al., 2015), again demonstrating how CC2D1B is not fully able to compensate for CC2D1A. Our results in the 1a/1b-dHET also imply that there is a balance in CC2D1A and CC2D1B activity, and experiments in Drosophila and mammalian cells suggest that Cc2d1a and lgd expression and subcellular localization must be finely regulated to control endosomal trafficking and signaling through recruitment to specific signaling complexes (Gallagher and Knoblich, 2006; Jaekel and Klein, 2006; Manzini et al., 2014; Drusenheimer et al., 2015). This could be explained by a critical role for the CC2D1 proteins in the maintenance of signaling homeostasis. Homeostasis is broadly defined as the ability of a cell to return to a set point and maintain equilibrium. Many genes mutated in ASD and ID control homeostatic mechanisms in synaptic transmission, transcription, and signaling (De Rubeis et al., 2014; Pinto et al., 2014), and genomic deletions and duplications may show similar neurodevelopmental phenotypes leading to the hypothesis that the pathogenesis of neurodevelopmental disorders is linked to homeostatic imbalance (Ramocki and Zoghbi, 2008). 
Behavioral impairments in cognitive and social function could then be caused by subtle disruptions in multiple cellular processes limiting the ability of individual neurons and/or neuronal circuits to respond to stimuli, including environmental changes or stressors. In this respect, defining the role of CC2D1A and CC2D1B in homeostatic signaling of multiple pathways disrupted in ASD and ID could be important to dissect whether different signaling pathways contribute to distinct behavioral deficits.

Finally, in light of the cognitive defects in the 1b-KO mice, it may be worthwhile to search for CC2D1B mutations in patients with cognitive deficits, and to also consider the possibility of trans-heterozygous cases where CC2D1A and CC2D1B mutations are both present in heterozygosity. While complete loss of both CC2D1 genes is embryonic lethal, haploinsufficiency of both CC2D1A and CC2D1B may lead to ID and ASD as CC2D1A LOF does. In the Genome Aggregation Database browser, which collects allele frequency data from more than 100,000 individuals in different populations, there are 43 likely gene disrupting (stop codon, frameshift, or splice site) alleles for CC2D1A and 89 for CC2D1B. These variants alone or in combination may further contribute to the genetic burden of ID.

#### AUTHOR CONTRIBUTIONS

MZ and MM designed the study and wrote the manuscript. MZ, AO, HP, and JA conducted the experiments and analyzed the data.

#### FUNDING

This work was supported by NIH grant R00HD067379, a Pilot Award from the Intellectual and Developmental Disabilities Research Center (IDDRC) at The Children's Research Institute (P30HD040677), and institutional start-up funds from The George Washington University to MM. NIH grants to VelociGene at Regeneron Inc. (U01HG004085) and the CSD Consortium (U01HG004080) funded the generation of gene-targeted ES cells for 8500 genes in the KOMP Program and archived and distributed by the KOMP Repository at UC Davis and CHORI (U42RR024244). For more information or to obtain KOMP products go to www.komp.org or email service@komp.org.

#### ACKNOWLEDGMENTS

The authors are grateful to Tom Maynard and Irene Zohn for advice on mouse line generation and the analysis of the double knockouts; Anthony LaMantia, Judy Liu, Maria Chahrour, Emanuela Santini, and Sally Till for their helpful discussion on experimental design. The Cc2d1a and Cc2d1b KO mouse strains used for this research project were generated by the trans-NIH Knock-Out Mouse Project (KOMP) and obtained from the KOMP Repository (www.komp.org).

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2018.00065/full#supplementary-material

## REFERENCES

function as well as NF-κB signaling homeostasis. Cell Rep. 8, 647–655. doi: 10.1016/j.celrep.2014.06.039


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Zamarbide, Oaks, Pond, Adelman and Manzini. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Neurodevelopmental Genetic Diseases Associated With Microdeletions and Microduplications of Chromosome 17p13.3

Sara M. Blazejewski, Sarah A. Bennison, Trevor H. Smith and Kazuhito Toyo-oka\*

*Department of Neurobiology and Anatomy, Drexel University College of Medicine, Philadelphia, PA, United States*

#### Edited by:

*Arvin Gouw, Rare Genomics Institute, United States*

#### Reviewed by:

*Xusheng Wang, St. Jude Children's Research Hospital, United States Nelson L. S. Tang, The Chinese University of Hong Kong, Hong Kong*

\*Correspondence: *Kazuhito Toyo-oka kt469@drexel.edu*

#### Specialty section:

*This article was submitted to Genetic Disorders, a section of the journal Frontiers in Genetics*

Received: *08 November 2017* Accepted: *26 February 2018* Published: *23 March 2018*

#### Citation:

*Blazejewski SM, Bennison SA, Smith TH and Toyo-oka K (2018) Neurodevelopmental Genetic Diseases Associated With Microdeletions and Microduplications of Chromosome 17p13.3. Front. Genet. 9:80. doi: 10.3389/fgene.2018.00080*

Chromosome 17p13.3 is a region of genomic instability that is linked to different rare neurodevelopmental genetic diseases, depending on whether a deletion or duplication of the region has occurred. Chromosome microdeletions within 17p13.3 can result in either isolated lissencephaly sequence (ILS) or Miller-Dieker syndrome (MDS). Both conditions are associated with a smooth cerebral cortex, or lissencephaly, which leads to developmental delay, intellectual disability, and seizures. However, patients with MDS have larger deletions than patients with ILS, resulting in additional symptoms such as poor muscle tone, congenital anomalies, abnormal spasticity, and craniofacial dysmorphisms. In contrast to microdeletions in 17p13.3, recent studies have drawn considerable attention to a condition known as 17p13.3 microduplication syndrome. Depending on the genes involved in their microduplication, patients with 17p13.3 microduplication syndrome may be categorized into either class I or class II. Individuals in class I have microduplications of the *YWHAE* gene encoding 14-3-3ε, as well as other genes in the region. However, the *PAFAH1B1* gene encoding LIS1 is never duplicated in these patients. Class I microduplications generally result in learning disabilities, autism, and developmental delays, among other disorders. Individuals in class II always have microduplications of the *PAFAH1B1* gene, which may include *YWHAE* and other genetic microduplications. Class II microduplications generally result in smaller body size, developmental delays, microcephaly, and other brain malformations.
Here, we review the phenotypes associated with copy number variations (CNVs) of chromosome 17p13.3 and detail their developmental connection to particular microdeletions or microduplications. We also focus on existing single and double knockout mouse models that have been used to study human phenotypes, since the highly limited number of patients makes a study of these conditions difficult in humans. These models are also crucial for the study of brain development at a mechanistic level since this cannot be accomplished in humans. Finally, we emphasize the usefulness of the CRISPR/Cas9 system and next generation sequencing in the study of neurodevelopmental diseases.

Keywords: 17p13.3, microdeletion, microduplication, lissencephaly, autism spectrum disorder, neurodevelopmental disorder, CRISPR, next generation sequence

## GENERAL INTRODUCTION

Chromosome 17 boasts the third-highest density of segmental duplications among human chromosomes and has the second-highest gene density. More than 23% of the short arm of chromosome 17 consists of low-copy repeats (LCRs), creating the opportunity for non-allelic homologous recombination (NAHR) to occur (Komoike et al., 2010; Shimojima et al., 2010). This differs from allelic homologous recombination (AHR), which serves to separate haplotypes and results in the interchange of homologous sequences to contribute to nonpathologic genetic variation. NAHR results in the recombination of sequences on different chromosomes, different positions of homologous chromosomes, or between two sequences on the same chromosome. This can lead to deletions and duplications (Clancy and Shaw, 2008). The high density of LCRs found within chromosome 17p13.3 makes it a "recombination hotspot." LCRs are made up of the same or very similar repetitive sequences and are ∼95% identical (Stankiewicz et al., 2003). Many deletion breakpoints have been identified within LCRs (Stankiewicz et al., 2003). All of these factors contribute to the association of chromosome 17p13.3 with rare genetic diseases caused by haploinsufficiency or duplication.
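The way NAHR between misaligned LCRs produces reciprocal deletion and duplication products can be illustrated with a toy model. The Python sketch below is only a schematic under simplifying assumptions: the chromosome is reduced to an ordered list of labeled segments, the segment names (`geneA`, `LCR1`, and so on) are hypothetical placeholders rather than real loci, and the crossover is assumed to occur between two direct repeats on homologous chromosomes.

```python
# Toy model of NAHR between two direct low-copy repeats (LCRs).
# Segment names are hypothetical placeholders, not real 17p13.3 loci.

def nahr_unequal_crossover(chromosome, lcr_a, lcr_b):
    """Return (deletion_product, duplication_product) for a crossover in which
    lcr_a on one homolog misaligns with lcr_b on the other homolog."""
    i = chromosome.index(lcr_a)
    j = chromosome.index(lcr_b)
    if i >= j:
        raise ValueError("lcr_a must lie upstream of lcr_b")
    # One product joins the sequence up to lcr_a with the sequence after lcr_b,
    # so the interval between the repeats is lost.
    deletion = chromosome[: i + 1] + chromosome[j + 1 :]
    # The reciprocal product joins the sequence through lcr_b with the sequence
    # after lcr_a, so the interval between the repeats is present twice.
    duplication = chromosome[: j + 1] + chromosome[i + 1 :]
    return deletion, duplication


chromosome = ["geneA", "LCR1", "geneB", "geneC", "LCR2", "geneD"]
deletion, duplication = nahr_unequal_crossover(chromosome, "LCR1", "LCR2")
print("deletion product:   ", deletion)
print("duplication product:", duplication)
```

In this sketch the genes flanked by the two repeats are lost from one product and doubled in the other, which mirrors the reciprocal nature of the microdeletions and microduplications discussed in this review.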

The instability seen in chromosome 17 contributes to the development of a wide variety of diseases including morphological brain disorders, mental illnesses, epilepsy, and tumors (De Smaele et al., 2004; Shimojima et al., 2010; Schnaiter and Stilgenbauer, 2013; Gazzellone et al., 2014). In this review, the four major phenotypes we will discuss are Miller-Dieker syndrome (MDS), isolated lissencephaly sequence (ILS), class I 17p13.3 microduplication syndrome, and class II 17p13.3 microduplication syndrome. The CRK, PAFAH1B1, and YWHAE genes located at chromosome 17p13.3 all have crucial roles in neuronal migration and contribute to each of these genetic disorders when microdeletions or microduplications arise. MDS and ILS are associated with 17p13.3 microdeletions. Earl Walker first described cases of lissencephaly in 1942 in reference to the "smooth brain" observed. Miller and Dieker contributed to the identification of MDS in 1963 and 1969 respectively, resulting in the nomenclature of the disease (Walker, 1942; Miller, 1963; Pilz and Quarrell, 1996). MDS results from contiguous gene deletion within 17p13.3 and mainly features cerebral agyria (absence of gyri), cerebral pachygyria (broad gyri), craniofacial deformities, and seizures (Barros Fontes et al., 2017). ILS phenotypes lack the craniofacial deformities and the most severe grade of lissencephaly only observed in MDS phenotypes. These features of MDS patients result from larger deletions within chromosome 17p13.3 as compared to ILS patients. MDS is caused by microdeletions containing PAFAH1B1 and YWHAE, at minimum, while ILS can result from heterozygous mutation or deletion of PAFAH1B1 (Dobyns et al., 1993; Reiner et al., 1993). The thickening and simplification of cortical layers associated with lissencephaly seen in both ILS and MDS are due to a defect of neural migration during cortical development. 
This results in a variety of defects, including mental retardation (Dobyns et al., 1993; Cardoso et al., 2003).

Patients with microduplications of 17p13.3 were first reported in 2009, and this condition is now referred to as 17p13.3 microduplication syndrome (Bi et al., 2009; Roos et al., 2009). Although there are currently only about 40 reported cases, the number of 17p13.3 microduplication syndrome patients has been increasing. The affected can be categorized into two classes based on the size of chromosome duplication (Bruno et al., 2010). Patients with class I microduplications never have PAFAH1B1 duplications. They typically display autistic and other behavioral symptoms, delay in speech and motor abilities, craniofacial deformities, hand and foot deformities, and postnatal overgrowth (Bruno et al., 2010). In contrast, PAFAH1B1 is always duplicated in patients with class II, who may also have duplications of CRK and YWHAE (Bruno et al., 2010). Microduplications in class II result in psychomotor and developmental delay, as well as hypotonia (Bruno et al., 2010). Microdeletions and microduplications around the MDS critical region, which is the region of chromosome 17 spanning from PAFAH1B1 to YWHAE, have distinct phenotypes. Even so, both share similarities and affect the same genes (**Table 1**).

It is difficult to study these conditions with human patients due to the small number of affected individuals. Fortunately, the areas of clinical interest within the MDS region are largely conserved between the human and mouse genome (**Figure 1**). The short arm of chromosome 17 in humans coincides approximately with the center of chromosome 11 in mice and the 26 human genes (**Table 2**) known in 17p13.3 have homologs in the mouse chromosome. The genes in the MDS critical region are in the same order in humans and mice; however, some neighboring genes are in a different order. These genetic similarities and the parallels between human and mouse development make the mouse model an excellent analog (Yingling et al., 2003). Previous studies have used knockout and transgenic mice to study genes within the MDS critical region. There are currently some single and double knockout mice available, which will be discussed. However, mouse models for the complete deletion or duplication of chromosome 17p13.3 have yet to be developed. In this review, we discuss microdeletions and microduplications of chromosome 17p13.3 with a focus on patient phenotypes, phenotypes seen in single and double knockout and transgenic mice, and limitations on current studies. We also propose a combined CRISPR/Cas9-Cre-loxP approach to create mouse models with microdeletions and microduplications of the complete chromosome 17p13.3 region, which may advance current studies by providing a more clinically relevant model. We then discuss the potential for applied analysis of next generation sequencing to improve identification of copy number variants.

#### CHROMOSOME 17P13.3 DELETIONS

#### Introduction

Deletion of the 17p13.3 chromosome region results in an array of phenotypes in humans, depending on the severity of the deletion. Well-characterized phenotypes that may result from such a deletion include isolated lissencephaly sequence (ILS) and Miller-Dieker syndrome (MDS) (Ledbetter et al., 1992; Cardoso et al., 2003; Bruno et al., 2010; Barros Fontes et al., 2017). A region of clinical interest within chromosome 17p13.3 is book-ended by two genes: PAFAH1B1 and YWHAE. Deletion of PAFAH1B1 results in classical lissencephaly (LIS), while larger deletions between PAFAH1B1 and YWHAE, or in other words the MDS critical region, result in MDS (Cardoso et al., 2003; Nagamani et al., 2009). Deletions within the 17p13.3 region are associated with features such as intellectual disability, craniofacial defects, epilepsy, and a few rare symptoms.

Although both ILS and MDS are associated with LIS, they are different conditions. Patients with ILS lack many of the facial dysmorphic features seen with MDS (Ledbetter et al., 1992; Cardoso et al., 2003; Kim et al., 2011). Other clinical features not found in ILS patients, such as cardiac defects, cystic kidney, and polydactyly, may also occur in MDS patients (Cardoso et al., 2003). MDS is associated with the most severe grade of LIS (grade 1), while individuals with ILS have less severe LIS (grades 2–4). The difference between the MDS and ILS phenotypes results from the deletion of specific genes within the 17p13.3 region. Deletion of PAFAH1B1 along with YWHAE and/or CRK may cause the more severe grade of LIS seen in MDS patients (Cardoso et al., 2003). YWHAE encodes 14-3-3ε, while CRK encodes a signaling protein that functions downstream of Reelin and is involved in cell proliferation, differentiation, migration, and axonal growth (Huang et al., 2004; Matsuki et al., 2008; Park and Curran, 2008; Barros Fontes et al., 2017).

Previously, MDS was incorrectly thought to be an autosomal recessive disorder because it was occasionally reported in individuals within the same family (Ledbetter et al., 1989). It has been shown that MDS results from a de novo microdeletion of chromosome 17p13.3 that may have a paternal bias in origin (Ledbetter et al., 1989). It is generally accepted that MDS is a contiguous-gene deletion syndrome, meaning that it results from the deletion of multiple genes found in close proximity to each other on the chromosome. This was challenged by a study involving five MDS patients using polymerase chain reaction (PCR) and fluorescence in situ hybridization (FISH) to check for microdeletions in chromosome 17p13.3. A deletion was only found in three of the patients. Because all five patients had nearly identical clinical symptoms, the results suggest that MDS is not a contiguous gene-deletion syndrome (Kohler et al., 1995). These data contradict the widely accepted belief that MDS is a contiguous gene-deletion syndrome, but should not be ignored. It is estimated that there is a visible or submicroscopic deletion of chromosome 17p13.3 in 90% of MDS patients, but the cause of the other 10% of cases is unclear (Ledbetter et al., 1992).

#### TABLE 2 | Summary of genes involved in microdeletions and microduplications of human chromosome 17p13.3.

*Genes are listed in the approximate order they are found in the sequence. NCBI Gene Database: Gene [Internet]. Bethesda (MD): National Library of Medicine (US), National Center for Biotechnology Information; 2004- [cited 2017 10 19] (Available from: https://www.ncbi.nlm.nih.gov/gene/).* \**Located within chromosome 17p13.3, but immediately outside the MDS critical region.*

ILS may occur when a smaller deletion of chromosome 17p13.3 occurs. About 60% of ILS cases are associated with a deletion or intragenic variation of PAFAH1B1 (Takahashi et al., 2015). One study found 17p13.3 deletions in only 6 out of 45 ILS patients, which suggests that intragenic variations such as point mutations may account for the majority of cases (Ledbetter et al., 1992).

Deletions of chromosome 17p13.3 result in LIS, which is associated with mental retardation and epilepsy seen in both ILS and MDS patients. Individuals with deletions of PAFAH1B1 often have ILS, while individuals with larger deletions of the MDS critical region have MDS. More severe lissencephaly and craniofacial dysmorphic features are seen in MDS patients. The relationship between critical genes and human phenotypes will be described next.
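The gene-content rule summarized in this section can be written as a small lookup. The following Python sketch is illustrative only: the function name and the returned labels are our own assumptions, and it encodes nothing beyond the rule stated in the text (deletion of PAFAH1B1 alone versus deletion of PAFAH1B1 together with YWHAE). It is not a diagnostic tool, since a minority of MDS and ILS cases are not explained by a detectable deletion.

```python
# Hypothetical helper: maps the set of deleted 17p13.3 genes to the phenotype
# category described in the text. Labels are illustrative, not clinical terms.

def classify_17p13_3_deletion(deleted_genes):
    genes = set(deleted_genes)
    # MDS deletions contain PAFAH1B1 and YWHAE at minimum.
    if {"PAFAH1B1", "YWHAE"} <= genes:
        return "MDS-consistent"
    # Deletion of PAFAH1B1 without YWHAE is associated with ILS.
    if "PAFAH1B1" in genes:
        return "ILS-consistent"
    # Deletions that spare PAFAH1B1 fall outside both categories here.
    return "other 17p13.3 deletion"


print(classify_17p13_3_deletion(["PAFAH1B1", "CRK", "YWHAE"]))
print(classify_17p13_3_deletion(["PAFAH1B1"]))
print(classify_17p13_3_deletion(["CRK", "YWHAE"]))
```

The last call corresponds to deletions of YWHAE and CRK that spare PAFAH1B1, which the sketch deliberately leaves outside the ILS and MDS categories.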

### Phenotypes and Genotypes in Human Patients

Almost all of our knowledge about the various phenotypes seen in patients with MDS comes from clinical case studies. Lissencephaly and craniofacial dysmorphism are well-characterized features of MDS. Epilepsy is another major feature of MDS, though this is also seen in ILS. Deletions of certain genes are strongly associated with some phenotypes we will describe.

#### Phenotypes: Lissencephaly

Classical lissencephaly (LIS) largely results from arrest or defect of neuronal migration occurring between 10 and 14 weeks of gestation, causing abnormal cortical layering and a "smooth brain", meaning that patients with LIS have an absence or reduction of gyri and sulci in the cerebral cortex (Ledbetter et al., 1992; Wynshaw-Boris and Gambello, 2001; Wynshaw-Boris et al., 2010; Moon and Wynshaw-Boris, 2013). LIS is typically associated with mental retardation, intractable epilepsy, spasticity, and reduced longevity (Dobyns et al., 1993). The severity of LIS ranges from agyria (LIS grade 1) through mixed agyria-pachygyria (LIS grades 2 and 3) to pachygyria (LIS grade 4). Clinical symptoms correlate with the degree of agyria (Dobyns et al., 1999). LIS may also be caused by variations of the DCX gene on chromosome Xq23, but these cases may be distinguished from those associated with PAFAH1B1 because of differences in the gyral patterns and more common hypoplasia of the cerebellar vermis in individuals with DCX variations (Dobyns et al., 1999; Takahashi et al., 2015; Ayanlaja et al., 2017). Variations of PAFAH1B1 are linked to a posterior-to-anterior gradient of lissencephaly, while mutations of DCX are associated with an anterior-to-posterior gradient (Dobyns et al., 1999). Additionally, DCX mutations are associated with lissencephaly in males and a subcortical band heterotopia pattern in females, since DCX is located on the X chromosome (Gleeson et al., 1998; Poduri et al., 2013).

It has been established that LIS results from a defect in radial migration. However, previous reports indicate the involvement of interneurons, which have been shown to migrate non-radially (Pancoast et al., 2005). LIS is associated with mental retardation and epilepsy, which can often be linked to interneuron defects (Pancoast et al., 2005). A study involving two fetuses and two children with MDS found a non-radial migration defect in the calretinin-expressing interneuron subpopulation (Pancoast et al., 2005). Pancoast et al. hypothesize that the clinically distinct forms of LIS have differences in interneurons, which may be useful for diagnosis and treatment once better understood (Pancoast et al., 2005). The idea that the migration of different cell types may be defective in different forms of LIS is supported by work from Marcorelles et al. (2010). A defect in tangential migration of GABAergic neurons has been associated with MDS, while defects in other aspects of tangential migration were reported for two other types of LIS (Marcorelles et al., 2010). Clearly identifying patterns in structural differences, like variations of gyral patterns and migration patterns, will aid in distinguishing between the types of LIS and could have some clinical applications.

While abnormal neuronal migration has been shown to contribute significantly to the lissencephaly phenotype, defects in neurogenesis may also play a role. LIS1, which is encoded by PAFAH1B1, influences cell proliferation at various neurodevelopmental stages (Tsai et al., 2005; Bi et al., 2009; Pramparo et al., 2010, 2011; Reiner and Sapir, 2013). A study where LIS1 levels were reduced throughout cortical development showed that LIS1 is involved in generating neuroblasts and postmitotic neurons and also has an influence on cell survival (Gambello et al., 2003). Lissencephaly in patients is analogous to the cortical disorganization observed in LIS1 deficient mice, which can be attributed to a combination of migration defects and LIS1-mediated reduction of cell numbers in the ventricular zone via modulation of cell proliferation and neuroblast survival (Gambello et al., 2003). It seems logical that defects in neurogenesis would impact subsequent stages of neurodevelopment since neurogenesis precedes and could potentially influence many events in development. Additionally, an increased time period in which neurogenesis occurs in a species has been evolutionarily correlated with the degree of convolution observed in the cortex. For example, neurogenesis is three times longer in ferrets and ten times longer in primates, as compared with rodents (Poluch and Juliano, 2015). A longer period of neurogenesis is also correlated with a slower depletion of the cortical progenitor pool (Kornack and Rakic, 1998; Calegari et al., 2005). This provides further support for neurogenesis as a contributing factor in the LIS phenotype, in addition to neuronal migration arrests or defects.

#### Phenotypes: Epilepsy

Lissencephaly is closely linked with epilepsy. Since LIS occurs in patients with both ILS and MDS, epilepsy is a symptom common to both conditions. As with lissencephaly, YWHAE, CRK, and PAFAH1B1 have complicated roles in epilepsy. Two patients with deletions of YWHAE and CRK, but not PAFAH1B1, were described as having macrocephaly, epilepsy, and non-specific changes in white matter, among other clinical features (Tenney et al., 2011). Tenney et al. suggest that deletions of YWHAE and/or CRK indicate that a patient should be monitored carefully for seizure development (Tenney et al., 2011). Shimojima et al. support this with similar findings regarding one of the three patients involved in their study (Shimojima et al., 2010). A different patient in the same case study had a partial deletion of PAFAH1B1 resulting in isolated grade 3 lissencephaly and epilepsy (Shimojima et al., 2010). Thus, YWHAE, CRK, and PAFAH1B1 are critical genes involved in epilepsy.

#### Phenotypes: Craniofacial Dysmorphisms

Individuals with MDS have a characteristic facial appearance including a tall square forehead, flattened midface, low-set posteriorly rotated ears, short upturned nose, and prominent lateral nasal folds (Cardoso et al., 2003; Nagamani et al., 2009; Bruno et al., 2010; Yu et al., 2014). One study involved a patient with a 284 kb deletion within the MDS critical region where CRK and MYO1C were deleted, but not YWHAE (Ostergaard et al., 2012). The patient had mental retardation in addition to slight facial and limb abnormalities, suggesting that CRK, but not YWHAE, may have a role in craniofacial defects and limb malformations (Ostergaard et al., 2012).

Additionally, deletion of the HIC1 and OVCA1 genes has a role in producing these craniofacial dysmorphisms (Barros Fontes et al., 2017). A study involving 19 patients with a 17p13.3 microdeletion reported that 13 of those individuals had haploinsufficiency of HIC1 and OVCA1 (Tenney et al., 2011). Cleft palate was observed in 4 of these patients, while craniofacial dysmorphisms were found in 11. These data implicate HIC1 and OVCA1 in craniofacial defects and provide evidence for their role in palatogenesis (Barros Fontes et al., 2017). Cardoso et al. identified a 400 kb critical region that differentiates ILS from MDS (Cardoso et al., 2003). Since craniofacial dysmorphisms are unique to MDS, the study of this critical region could help identify more genes involved in these characteristic craniofacial defects.

### Rare and Miscellaneous Characteristics of MDS

Approximately 90% of MDS patients have a visible or submicroscopic deletion of chromosome 17p13.3 (Ledbetter et al., 1992). About 1 in 50,000–100,000 patients has a ring chromosome disorder (Kim et al., 2014). It is possible for deletions to occur during the formation of ring chromosomes, since they result from the fusion of the short and long arms (Kim et al., 2014). Ring chromosomes have been linked to birth defects and mental disabilities (Kim et al., 2014). Several individuals have been described as having a ring chromosome 17 (Ono et al., 1974; Qazi et al., 1979; Chudley et al., 1982; Sharief et al., 1991). Of those patients, a few have had clinical features consistent with MDS (Sharief et al., 1991). One case report described a MDS patient who had 46 chromosomes with a ring structure taking the place of one of the chromosomes 17. There was also a deletion equivalent to what is typically seen in MDS (Sharief et al., 1991). Recent developments may eventually lead to therapeutics for such individuals. When fibroblast cells derived from patients with ring chromosomes are reprogrammed into induced pluripotent stem cells (iPSCs), the cells lose the ring chromosome and duplicate the wild-type homolog through compensatory uniparental disomy (UPD) (Bershteyn et al., 2014). These results could have important clinical implications and may eventually lead to the development of an approach to correct large-scale chromosomal aberrations (Kim et al., 2017).

While lissencephaly, craniofacial dysmorphism, and epilepsy are seen in the vast majority of MDS patients, there are also some less common and minor symptoms associated with MDS. An example is spinal manifestations. A 6-month-old infant with MDS was reported to have a tethered spinal cord, with the conus medullaris terminating abnormally low at the upper level of L4 (Hsieh et al., 2013). An inflamed lumbosacral sinus dermal tract was also described (Hsieh et al., 2013). The low incidence of MDS makes it difficult to determine if the rare and minor symptoms reported from an individual case study are a direct result of MDS or if they are unrelated. Mouse models with gene mutations will help overcome this difficulty and provide a better understanding of the relationship between specific genes and minor symptoms.

### Mouse Models

The region of chromosome 17p13.3 that is frequently deleted in patients with MDS is homologous to a region within the mouse chromosome 11B5. Mnt, Hic1, and Ovca1 are localized at this region and have been studied in mice (Carter et al., 2000; Toyo-oka et al., 2004; Yu et al., 2014). Mnt is essential for embryonic development and survival and may play a role in craniofacial defects associated with MDS, since Mnt knockout mice have cleft palate and retardation of skull development (Toyo-oka et al., 2004). Hic1 knockout mice show no neuronal migration defects or disorganization of the cerebral cortex, but do show gross developmental defects ranging from exencephaly to limb abnormalities (Carter et al., 2000). Ovca1 encodes a protein involved in diphthamide biosynthesis, which is critical for craniofacial development. Ovca1 deficiency in tissue derived from the neural crest has been shown to contribute to the craniofacial defects associated with MDS (Yu et al., 2014). Aberrant craniofacial development has been best observed in Mnt, Hic1, and Ovca1 knockout mice (Yu et al., 2014), so the study of the facial dysmorphic features associated with MDS currently relies on these models.

Heterozygous Mnt+/−, Hic1+/−, and Ovca1+/− mice do not have distinctive developmental phenotypes (Carter et al., 2000; Yu et al., 2014). This raises a question about the representativeness of these models to the human population with MDS, since these individuals only have deletions in one of their chromosome 17p13.3 regions. In most cases, we are currently analyzing craniofacial dysmorphisms resulting from haploinsufficiency of a relatively large region by use of homozygous mouse model knockouts of individual genes. If deletion of these genes results in phenotypes in a dose-dependent manner, developmental phenotypes not seen in the heterozygous single knockout models may emerge when heterozygous knockouts of multiple genes are created. Study of the craniofacial dysmorphisms associated with MDS could be improved by creating Mnt/Hic1, Hic1/Ovca1, and/or Mnt/Ovca1 double heterozygous knockout mice. A Mnt/Hic1/Ovca1 triple heterozygous knockout model may show a more dramatic phenotype.

There are Pafah1b1, Ywhae, and Crk single knockout models and a Pafah1b1/Ywhae double knockout model available to study the neurodevelopmental defects associated with deletion of chromosome 17p13.3. Pafah1b1 encodes Lis1, which is involved in neuronal migration and has a role in dendritic filopodia dynamics and spine turnover (Hirotsune et al., 1998; Yamada et al., 2009; Sudarov et al., 2013). Ywhae encodes 14-3-3ε and is always deleted in patients with MDS. 14-3-3ε binds Ndel1 (also known as Nudel) once it has been phosphorylated by CDK5/p35, and this binding keeps Ndel1 phosphorylated (Toyo-oka et al., 2003). Single knockout models for Pafah1b1 and Ywhae are associated with similar defects in brain development and neuronal migration, while a double knockout of Pafah1b1 and Ywhae leads to a more severe phenotype that may be attributed to the non-sustained phosphorylation of Ndel1 (Toyo-oka et al., 2003).

The roles of Crk and Crk-like (Crkl) genes have been studied using mouse models. Crk encodes a signaling protein that functions downstream of Reelin and has roles in cell proliferation, differentiation, migration, and axonal growth (Huang et al., 2004; Tanaka et al., 2004; Matsuki et al., 2008; Park and Curran, 2008; Barros Fontes et al., 2017). Mutation of only Crk or Crkl does not compromise Reelin signaling and does not produce any obvious anatomical abnormalities (Park and Curran, 2008). However, double knockout of Crk and Crkl resulted in mice with grossly abnormal brain appearance (Park and Curran, 2008). This suggests overlapping functions for Crk and Crkl in Reelin signaling. Deletion of Crk, along with Ywhae, is thought to contribute to the more severe phenotype seen in MDS, as opposed to ILS (Tenney et al., 2011; Barros Fontes et al., 2017). At this time, Crk/Pafah1b1 and Crk/Ywhae double heterozygous knockout mice are not available. Creation of a Crk/Pafah1b1 double knockout could help determine if deletion of these genes contributes to the lissencephaly phenotype in a dose-dependent manner, as this seems to be the case when CRK and YWHAE are deleted together in humans (Tenney et al., 2011; Barros Fontes et al., 2017). There is a need for a Crk/Ywhae double knockout to be created, since combined deletion of these two genes is clinically important and has yet to be analyzed with a mouse model.

#### Limitations to Current Studies and Future Directions

Conventional cytogenetic analysis, or karyotyping, is not sufficient to detect microdeletion syndromes. Several case studies have used FISH to identify microdeletions of chromosome 17p13.3 in MDS and ILS patients (Kuwano et al., 1991; Kohler et al., 1995; Pilz et al., 1995; Cho et al., 2009). A relatively new alternative method is mental retardation syndrome multiplex ligation-dependent probe amplification (MRS-MLPA), which is capable of testing for multiple microdeletions at once and is less time consuming and less technically complicated than FISH. Another advantage of MRS-MLPA is its ability to detect smaller deletions that could be missed using FISH (Cho et al., 2009). The importance of this is illustrated by a case involving a MDS patient who had a partial deletion of the PAFAH1B1 locus (Izumi et al., 2007). A commercially available PAFAH1B1 FISH probe was used in an attempt to diagnose this patient, but due to the nature of the deletion, the patient's test results were normal (Izumi et al., 2007). The partial deletion was later discovered using a smaller sized FISH probe, drawing attention to the inability of FISH to detect relatively small deletions (Izumi et al., 2007). Other case studies have reported similar incidents (Shimojima et al., 2010). Use of FISH for diagnostic purposes may result in misdiagnosis for individuals with partial deletions of PAFAH1B1. Thus, the method that is considered the standard laboratory diagnostic tool for MDS has some shortcomings (Izumi et al., 2007). Unfortunately, the same is true of the mouse models currently being used to study MDS.

To date, there is no murine model for deletion of the complete MDS critical region. The lack of such a model is problematic because single or double knockout mice are not representative of the full range of phenotypes seen in MDS patients, who have relatively large deletions. While it would provide a more clinically relevant model, there are technical challenges in creating a mouse model for deletion of the complete MDS critical region. Later, in the conclusion of this paper, we describe these challenges and propose a strategy to address them.

#### CHROMOSOME 17P13.3 DUPLICATIONS

#### Introduction

Chromosome 17p13.3 is a region of genomic instability that is prone to submicroscopic rearrangements due to a high density of LCRs (Roos et al., 2009; Capra et al., 2012). These submicroscopic rearrangements can lead to microduplications (Roos et al., 2009; Capra et al., 2012). Duplications that cause 17p13.3 microduplication syndrome have diverse mechanisms, come in various sizes, and include a variety of genes. It follows that these microduplications are associated with a diverse array of phenotypes. The phenotypes generally associated with microduplications of 17p13.3 include developmental and psychomotor delay, behavioral problems and autism spectrum disorder (ASD), structural brain abnormalities and malformations, and distinct physical features (Curry et al., 2013). Microduplications have also been associated with limb malformations and cleft lip and palate (Curry et al., 2013). Microduplications in 17p13.3 occur in the same region that, when deleted, causes MDS; this region is therefore sometimes referred to as the MDS critical region. The microduplication minimal region has been defined as a 72 kb region containing only the YWHAE gene, which encodes the 14-3-3ε protein and is strongly associated with ASD (Bruno et al., 2010; Curry et al., 2013).

Diagnosis of 17p13.3 microduplication syndrome divides duplications into two categories: class I and class II (Bruno et al., 2010). Class I duplications contain the gene YWHAE but not PAFAH1B1, which encodes protein LIS1 (Bi et al., 2009). Class I duplications can also involve other genes, such as CRK. Patients with class I duplications usually display phenotypes characterized by learning disabilities and ASD (Bi et al., 2009; Curry et al., 2013). These individuals may have congenital defects, macrosomia, dysmorphic facial features, mild to severe developmental delay, and behavioral problems such as increased aggression (Bi et al., 2009; Curry et al., 2013). Class II microduplications are duplications in chromosome 17p13.3 that contain PAFAH1B1. Class II duplications can also involve other genes such as YWHAE and CRK (Bi et al., 2009; Roos et al., 2009; Bruno et al., 2010). Phenotypes generally associated with class II duplications are mild to severe developmental and/or psychomotor delay, hypotonia, and mild brain malformations including microcephaly (Bi et al., 2009; Bruno et al., 2010). In contrast to infants with class I microduplications, infants with class II microduplications may also have small body size at birth. There are some reports of major internal organ abnormalities, such as structural congenital heart disease, in patients with class II duplications (Bi et al., 2009). Seizures were also noted, but were not a common phenotype in either class of duplications (Bi et al., 2009; Curry et al., 2013; Gazzellone et al., 2014). Overall, there are overlapping but distinct phenotypes observed in class I and class II microduplication syndrome patients.
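The class I/class II rule described above can be written as a short decision function. The following is a minimal sketch, not a diagnostic tool: the function name `classify_17p13_3_duplication` and the `"unclassified"` label for duplications containing neither gene are assumptions introduced here for illustration, and gene content is assumed to have already been determined by a method such as array CGH.

```python
def classify_17p13_3_duplication(duplicated_genes):
    """Classify a 17p13.3 duplication from the set of genes it contains.

    Rule taken from the text: class II if PAFAH1B1 is duplicated
    (YWHAE and CRK may or may not be included); class I if YWHAE is
    duplicated without PAFAH1B1. The 'unclassified' fallback is an
    assumption of this sketch, not part of the published scheme.
    """
    genes = {g.upper() for g in duplicated_genes}
    if "PAFAH1B1" in genes:
        return "class II"
    if "YWHAE" in genes:
        return "class I"
    return "unclassified"


# Hypothetical example inputs.
print(classify_17p13_3_duplication(["YWHAE", "CRK"]))
print(classify_17p13_3_duplication(["YWHAE", "CRK", "PAFAH1B1"]))
```

Checking PAFAH1B1 first reflects the fact that a class II duplication may also include YWHAE, so the presence of YWHAE alone cannot decide the class.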

Popular techniques for diagnosing and classifying 17p13.3 microduplications in patients are FISH, array comparative genomic hybridization (Array CGH), multiplex ligation-dependent probe amplification (MLPA), RT-PCR analysis, gene expression analysis, and chromosomal microarray (Bi et al., 2009; Roos et al., 2009; Bruno et al., 2010; Luk et al., 2014; Nagata et al., 2014; Petit et al., 2014; Eriksson et al., 2015). Array CGH has been used to detect gain of copy number, whereas FISH has been used to confirm duplications and regions (Bi et al., 2009). Case studies have analyzed the parental genomes of patients with 17p13.3 microduplication syndrome to determine potential inheritance patterns. In case studies where both parents were present and available for sequencing, duplications were largely classified as de novo. Only a few reported cases were maternally inherited (Bi et al., 2009; Curry et al., 2013). Simple and complex rearrangements have been observed. Suggested mechanisms for these rearrangements include non-homologous end joining (NHEJ) and replication fork stalling and template switching (Gu et al., 2008; Bi et al., 2009; Bruno et al., 2010). Bruno et al. also identified an individual with a microduplication that seemed to be caused by NAHR due to the observation of breakpoints lying within repetitive elements (Bruno et al., 2010). DNA sequencing of junction points was used to assess these hypothesized mechanisms (Bi et al., 2009; Bruno et al., 2010). It has been suggested that the microduplications present in 17p13.3 microduplication syndrome do not arise from stochastic events, but instead from intrinsic architectural features of the genome (Vissers et al., 2007).

## Phenotypes: Developmental/Psychomotor Delay and Behavior Problems

All patients present in case studies experienced some form of developmental delay, developmental disorder, cognitive impairment, speech abnormality, and/or behavior problems. Developmental delay is seen in both class I and class II duplications, whereas psychomotor delay is generally a characteristic unique to class II duplications (Bruno et al., 2010). Severity of the delay is highly diverse and variable. Case studies have shown developmental delay to vary greatly both between and within families (Curry et al., 2013). There was no correlation between size and location of duplication and degree of developmental delay or intellectual impairment. Curry et al. observed intellectual disability in 66% of patients who took part in their large-scale case studies (Curry et al., 2013). Cases consisted of individuals ranging from those who completed a high school education, to others who had average IQs accompanied by significant behavioral abnormalities, to those who were severely impaired (Curry et al., 2013). Delayed developmental milestones have been observed across patient populations. In a series of case studies performed by Roos et al. one patient with mild delays was observed to sit independently at 11 months, walk at 27 months, and have a vocabulary containing four words at 24 months (Roos et al., 2009). Other patients showed similar delays, with some more severe than others. Specifically, one more severely delayed patient was assessed at 22 months old and was just beginning to sit independently, had no recorded ability to walk, and had no meaningful vocabulary (Roos et al., 2009). In contrast, Bi et al. reported on one patient whose only impairment was fine motor delays by age 15 (Bi et al., 2009). Curry et al. observed behavioral issues in 100% of cases, but failed to observe a consistent behavioral phenotype for the patient population (Curry et al., 2013). 
Attention deficit hyperactivity disorder (ADHD) was observed in all individuals with duplications involving YWHAE (Bi et al., 2009). Behavioral phenotypes observed across case studies and classes of duplications included: poor social relationships, communication impairment, persistent/repetitive behaviors, attention deficit disorder, obsessive compulsive disorder or tendencies, obsessive food seeking, and mild to significant depression in older patients (Bi et al., 2009; Bruno et al., 2010; Hyon et al., 2011; Curry et al., 2013). As with developmental and psychomotor delay phenotypes, behavioral phenotypes were also variable and spanned duplication classes. Mechanistic etiological studies can help shed light on different microduplication region contributions to different variable delays and behavioral phenotypes.

## Phenotypes: Autism Spectrum Disorder (ASD)

Studies have found rare penetrative genetic copy number variations (CNVs) to account for between 5 and 15% of ASD cases. Copy number variations have been found in 8.6% of children diagnosed with autism, all of which were de novo (Gazzellone et al., 2014; Eriksson et al., 2015). It is important to note that <1% of autistic patients studied by Eriksson et al. presented with microduplications in 17p13.3, but when looking specifically at case studies of patients with 17p13.3 microduplication syndrome, ∼32% presented with some form of ASD diagnosis or tendencies (Curry et al., 2013; Eriksson et al., 2015). Although this specific duplication is rare in the ASD patient population as a whole, ASD comorbidity is somewhat common in the 17p13.3 microduplication patient population. According to Curry et al., there are one or more autism loci in the 17p13.3 region (Curry et al., 2013). YWHAE and CRK are seen as candidate genes for ASD that arises in patients with 17p13.3 microduplication syndrome. This is due to the observance of autistic-like behaviors and diagnosis of ASD primarily in patients with class I duplications, and overlap of the same genes being duplicated in multiple patients (Curry et al., 2013).

## Phenotypes: Brain Abnormalities and Characteristic Physical Features

Brain abnormalities and characteristic physical features vary slightly across classes of microduplications, but are subtle across groups. Brain abnormalities have been assessed using MRI technology on few patients involved in case studies (Bi et al., 2009; Roos et al., 2009; Curry et al., 2013). Brain abnormalities are generally observed in individuals with class II duplications, which result in LIS1 overexpression, but the abnormality phenotype is usually minimal (Bi et al., 2009). One patient in a case study performed by Bi et al. had a triplication of PAFAH1B1 and showed mild volume loss in the cerebellum, dysgenesis of the corpus callosum, and cerebellar atrophy. The brain also appeared smaller, especially in the occipital cortex (Bi et al., 2009). Another subject with a duplication of PAFAH1B1 showed mild cerebellar volume loss, thinning of the splenium of the corpus callosum, and a smaller brain, once again most notably in the occipital cortex (Bi et al., 2009). Other case studies performed MRI imaging of six patients with 17p13.3 microduplication syndrome and found that three patients showed variable abnormalities of the corpus callosum and cerebellum, and the remaining three patients had normal MRI scans (Curry et al., 2013). Roos et al. performed case studies of patients with 17p13.3 microduplication syndrome and provided brain imaging data for two patients (Roos et al., 2009). MRI imaging of one patient revealed hypoplasia of the corpus callosum and dilation of lateral ventricles. Thinning of white matter may have occurred. The MRI showed a possible increase of the signal intensity paraventricularly. A small pituitary gland and enlargement of the cisterna magna were also observed (Roos et al., 2009). A brain axial CT scan was performed on one patient at two years old and revealed potential delayed myelination, but in a second axial CT scan at age four myelination appeared normal (Roos et al., 2009). 
A novel inverted 1.4 Mb microduplication that disrupted PAFAH1B1 in a patient has been reported (Classen et al., 2013). An MRI was performed on the patient presenting with this duplication and results showed diffuse pachygyria with a moderately thick cortex, smooth white-gray border and no microgyri, and mild posterior-predominant lissencephaly (Classen et al., 2013). Results also showed a thin, short corpus callosum and a mildly thin, flat brainstem (Classen et al., 2013). Brain abnormalities observed in this patient are more characteristic of 17p13.3 microdeletions than microduplications, and the authors hypothesized that this phenotype was due to insertion of the 1.4 Mb duplication into intron 1 of PAFAH1B1, which disrupted normal splicing and effectively inactivated one copy of PAFAH1B1 (Classen et al., 2013). One case study observed a patient with a novel class I microduplication with slight brain malformations (Capra et al., 2012). An MRI was performed at 5 years of age and revealed a reduction in the volume and thickness of the isthmus of the cingulate gyrus and the splenium of the corpus callosum (Capra et al., 2012). A dysmorphic aspect of the rostrum of the corpus callosum was also observed (Capra et al., 2012). This is a somewhat rare case in which a patient with a class I microduplication also presented with brain malformations. Patients with class I microduplications usually present with dysmorphic facial features, as opposed to brain malformations. Facial features include asymmetric cranium, flat occipital region, dysmorphic appearance, frontal bossing, low set ears, broad nasal bridge, small nose, and hypertelorism (Bi et al., 2009; Roos et al., 2009; Bruno et al., 2010; Curry et al., 2013). Throughout the rest of the body, other general phenotypes have been observed across duplication classes such as abnormal body proportions, long limbs, and anisomelia (Roos et al., 2009; Bruno et al., 2010).

## Phenotypes: Overgrowth and Related Growth Factors

Case studies analyzed by Bi et al. observed macrosomia in all but one patient with YWHAE duplication, which contrasts with the severe growth restriction observed in individuals with duplications in PAFAH1B1 (Bi et al., 2009). Patients with class I duplications and macrosomia had duplications extending into CRK, which is known to be involved in regulation of growth and cell differentiation. One case study reports a patient with a 1.58 Mb terminal gain of 17p13.3 (Henry et al., 2016). This gain is a class I duplication including YWHAE and CRK. The patient showed increased growth factors, pathologic tall stature, and entered puberty at 7 years old. This case study is unique in that it provided a detailed endocrinologic evaluation and reported involvement of the anterior pituitary gland. This suggests a potential hormonal mechanism for overgrowth associated with 17p13.3 microduplication syndrome (Henry et al., 2016). Future case studies of patients with class I microduplications in addition to macrosomia, accelerated growth, and/or elevated growth factors should aim to perform detailed endocrinologic analysis of the patient. This will further elucidate the hypothesized hormonal mechanism and allow for a better understanding of the consequences of CRK duplication.

#### Phenotypes: Limb Malformations

The BHLHA9 gene lies in the 17p13.3 microduplication syndrome region, just outside of the MDS critical region in humans, and has been associated with limb malformations (Nagata et al., 2014). Curry et al. hypothesized that duplication of BHLHA9 was necessary, but not sufficient, for limb malformation and that a complex mechanism including disruption or separation of nearby regulatory elements underlies the development of split hand/foot malformation with long bone deficiency (SHFLD) (Curry et al., 2013). Class I microduplications have been observed in patients with limb malformations such as SHFLD (Curry et al., 2013). For example, a case study reported a triplication involving BHLHA9 and a segment of YWHAE (Luk et al., 2014). A fetus was observed with bilateral split-hand malformation, and the triplication was maternally inherited. In a study of 51 Japanese families, BHLHA9 duplications were found to be the largest predictor of a range of different limb malformations (Nagata et al., 2014). A dosage effect was also observed in this sample, with the larger duplication, or in some cases triplication, correlating to more dramatic limb malformations. The limb malformations seen in this cohort were SHFLD, split hand/foot malformation (SHFM), and Gollop-Wolfgang complex (GWC) (Nagata et al., 2014). Like Curry et al., Nagata et al. also suggested that BHLHA9 copy number gains were necessary, but not sufficient, to cause limb malformations (Curry et al., 2013; Nagata et al., 2014). Case studies on 13 families with both BHLHA9 copy number gains and limb malformations found hand malformations in 75% of patients, foot malformations in 38% of patients, and long bone deficiency in 43% of patients (Petit et al., 2014). It has been hypothesized that duplications involving a 173 kb critical region correlate with SHFLD, due to the presence of overlapping microduplications in 17p13.3 in three patients involved in a case study (Armour et al., 2011). 
This 173 kb critical region contains the gene BHLHA9 and exons 1-3 of ABR, another gene that lies just outside the MDS critical region (Armour et al., 2011). It has been proposed that duplication of this 173 kb region could potentially alter the dosage of a regulatory element involved in limb development or disrupt the interaction between a nearby regulatory element and its gene target or targets (Armour et al., 2011). These hypothesized mechanisms support the previously suggested mechanism that disruption or separation of nearby regulatory elements from their gene targets in the 17p13.3 microduplication region underlies the development of SHFLD (Curry et al., 2013).

#### Phenotypes: Cleft Lip and Palate

Syndromic and non-syndromic cleft lip and palate have been observed in patients with both class I and class II 17p13.3 microduplication syndrome. YWHAE has been implicated in the etiology of this phenotype, because it is specifically evidenced in the development of midline craniofacial structures (Tucker and Escobar, 2014). It has been hypothesized that interactions between other genes in the duplication region and YWHAE contribute to the variations seen in this phenotype (Tucker and Escobar, 2014). Tucker and Escobar report on two cases of 17p13.3 class I microduplications involving YWHAE with cleft lip and palate phenotypes. These cases were the first reported cases with a strictly class I microduplication categorization (Tucker and Escobar, 2014). It was hypothesized that YWHAE duplications played a potentially causative role in the cleft lip and palate phenotype (Tucker and Escobar, 2014). A case study also reported a patient with 17p13.3 microduplication syndrome with a non-syndromic cleft lip and palate phenotype (Ibitoye et al., 2015). Curry et al. reported on two families with multiple members who had the 17p13.3 microduplication syndrome and cleft lip and palate with or without accompanying intellectual disability (Curry et al., 2013). Overall, cleft lip and palate is generally accepted as a rare phenotype observed in 17p13.3 microduplication syndrome patients (Curry et al., 2013).

## Limitations to Current Studies and Future Directions

The majority of information about 17p13.3 microduplication syndrome has been obtained through case studies in human populations. Because 17p13.3 microduplication syndrome is classified as a rare disease, there is a very limited number of patients available for study. The largest review of case studies was authored by Curry et al. in collaboration with researchers, clinicians, and hospitals across the country to review 21 families (Curry et al., 2013). There are also no studies to date that have analyzed brains post-mortem, due to the rarity of the disease and the difficulty of obtaining samples, so we must rely on techniques such as MRI to observe whole brain malformations. Additionally, gene expression patterns specific to the brain tissue of 17p13.3 microduplication syndrome patients have not been studied due to the limitations of obtaining post-mortem samples. Because participant numbers are small and phenotypes are highly variable, it is difficult to find distinct genotype-phenotype patterns or to establish disease etiology. Case studies allow for determining the presence and location of duplications, but difficulties arise when trying to determine mechanistic information.

A transgenic mouse model to conditionally overexpress Lis1 in the developing brain has been created (Bi et al., 2009). This model utilized the Cre-loxP system to increase expression of Pafah1b1, which caused a ∼20% increase of Lis1 protein levels (Bi et al., 2009). Results showed that increased Lis1 expression in the developing brain may lead to neuronal migration defects as well as smaller brain size (Bi et al., 2009). They observed disorganization in the ventricular zone and disrupted cell polarity, which is critical for migration. Bi et al. also observed both radial and tangential migration disruptions in the cortex and an increase of apoptotic cells (Bi et al., 2009). This Lis1 overexpression mouse model was effective for investigating the effects of Lis1 overexpression in the developing mouse brain. However, in order to create a disease model for the 17p13.3 microduplication syndrome, other genes and proteins must also be considered, as almost all human patients have duplications that span multiple genes.

Cornell et al. used the technique of in utero electroporation to observe the effects of 14-3-3ε overexpression in the developing murine cortex (Cornell et al., 2016). Plasmids containing 14-3-3ε overexpression vectors were injected into the lateral ventricle of E14.5 or E16.5 embryonic brains. Electrodes were then positioned to direct plasmids into cells near the ventricular zone, which later migrate to their positions in the cortex. Embryos were then placed back into the uterus and allowed to develop uninterrupted. Brains were harvested and analyzed at P15. Results show that overexpression of 14-3-3ε results in a decrease of neurite formation through interactions with doublecortin (Dcx) (Cornell et al., 2016). This interaction decreases Dcx degradation and results in the failure of microtubule invasion into lamellipodia, which is a key step in neurite formation. Failure of microtubule invasion into the lamellipodia resulted in severe neuronal morphological defects. This was a useful technique for analyzing phenotypic differences in the developing mouse brain between 14-3-3ε overexpressing mice and control mice, but this system does not provide an exact mimicry of 14-3-3ε overexpression in human patients with the 17p13.3 microduplication syndrome. As was the case with Bi et al., this 14-3-3ε overexpression model proved effective for analyzing the effects of a single protein overexpression on neuronal development (Bi et al., 2009; Cornell et al., 2016). However, because both class I and class II duplications involve an interaction between multiple duplicated genes to create a phenotype, a mouse model must be created that encompasses multiple genes in chromosome 17p13.3 that have been identified as critical regions in humans.

Similar difficulties arise when studying deletions in 17p13.3. Models have been created to assess specific genes, but a successful mouse model has not yet been created to observe either 17p13.3 microdeletions or microduplications as they exist in human populations. Future directions include modification of available transgenic technology to create mice expressing certain 17p13.3 microdeletions or microduplications that correlate with specific known human phenotypes for mechanistic and/or etiological studies.

### MOUSE MODEL WITH 17P13.3 DELETION OR DUPLICATION

#### Proposed Cre-LoxP and CRISPR/Cas9 Combinatorial Strategy for Creating 17p13.3 Deletion or Duplication Mouse Models

To better understand both 17p13.3 microduplications and microdeletions, a new disease model must be created that encompasses critical regions containing both Pafah1b1 and Ywhae. There is a noted interaction between LIS1 and 14-3-3ε in all discussed disorders and both proteins are essential for correct neuronal migration (Toyo-oka et al., 2003; Bi et al., 2009; Bruno et al., 2010). Complications arise when trying to create models containing deletions or duplications of both Pafah1b1 and Ywhae because of the relatively large distance of ∼1.3 Mb between the two genes. This region is too large to either delete or duplicate using traditional Cre-loxP methods or the CRISPR/Cas9 system alone. We propose a methodology to create double transgenic mouse models using both Cre-loxP and CRISPR/Cas9 in combination to delete or duplicate a region of 17p13.3 containing both Pafah1b1 and Ywhae.

In theory, traditional Cre-loxP methods could be used to make a mouse model for deletion of the complete MDS critical region, but in practice this process would be quite laborious and likely require multiple attempts before mice with the correct genotype could be generated. Mice with loxP sites on the same DNA strand at both Pafah1b1 and Ywhae would need to be produced. A wide range of Cre transgenic mice could then be used to spatially and temporally delete the MDS critical region in mice. It would be possible to produce the Pafah1b1+/flox;Ywhae+/flox (in cis) mice by crossing Pafah1b1+/flox;Ywhae+/flox (in trans) mice with wild-type mice (**Figure 2**). This approach would rely on a meiotic crossover between alleles to place both loxP sites on the same chromosome (in cis). If there is no recombination, offspring would be positive for a loxP site at either the Pafah1b1 or the Ywhae allele, but not both. If crossover does occur, then the offspring will be positive for both alleles. The crossover could be confirmed by mating the Pafah1b1+/flox;Ywhae+/flox (in cis) mice to wild-type mice. We would expect ∼1% embryonic recombinants from trans to cis (Merscher et al., 2001). This approach would require genotyping of hundreds of pups and relies heavily on chance, but has been used successfully in the past (Merscher et al., 2001). An additional concern is that the Cre-mediated deletion would likely have an extremely low efficiency. The farther apart the loxP sites are, the more inefficient the deletion (Coppoolse et al., 2005).
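The statement that this approach would require genotyping hundreds of pups can be checked with a short calculation. The sketch below assumes that each pup is independently a trans-to-cis recombinant with probability ∼1%, the figure quoted above; the function name `pups_needed` and the 95% confidence target are illustrative choices, not values from the cited studies. The probability of finding at least one recombinant among n pups is then 1 − (1 − p)^n.

```python
import math


def pups_needed(recombination_rate: float, confidence: float) -> int:
    """Smallest n with P(at least one recombinant among n pups) >= confidence.

    Assumes each pup is an independent trial with the same recombination
    probability (a simplifying assumption of this sketch).
    """
    if not (0.0 < recombination_rate < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("both arguments must lie strictly between 0 and 1")
    # Solve 1 - (1 - p)^n >= confidence for n.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - recombination_rate))


rate = 0.01  # ~1% recombinants, as quoted in the text
print("Expected pups per recombinant:", round(1.0 / rate))
print("Pups for a 95% chance of at least one recombinant:", pups_needed(rate, 0.95))
```

Under these assumptions, about 100 pups are expected per recombinant, and roughly 300 pups must be genotyped for a 95% chance of recovering at least one, which is consistent with the estimate of hundreds of pups.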

The approach for creating a mouse model for duplication of chromosome 17p13.3 is similar to that for creating the deletion model, with a few crucial differences. To create the duplication, the loxP sites should be on opposite DNA strands (**Figure 3**). Pafah1b1+/flox;Ywhae+/flox (in trans) mice would be crossed with Cre transgenic mice. This will result in the production of two alleles at the same time. One allele will have a duplication of the MDS critical region, while the other will have a deletion of the MDS critical region. This technique may also be used as an alternative approach for producing the deletion model. The two alleles can be segregated by crossing the mice with the deletion and duplication alleles with wild-type mice. However, a concern is that the recombination efficiency by Cre in trans is extremely low (Liu et al., 1998; Zheng et al., 2000). Therefore, although this technique is feasible in principle, it would be labor-intensive.

Use of the CRISPR/Cas9 system alone would only allow the deletion of a single gene for each gRNA construct. If two gRNAs were designed to target both Pafah1b1 and Ywhae, mutation would likely occur close to the target sites, leaving most of the genes in the MDS critical region intact. A combined approach using CRISPR/Cas9 to insert the loxP sites near Pafah1b1 and Ywhae would be the most efficient way to create a mouse model with this relatively large deletion (**Figure 4**). This would allow researchers to bypass the laborious task described above of crossing the Pafah1b1+/flox;Ywhae+/flox (in trans) mice with wild-type mice. The loxP sites inserted by CRISPR/Cas9 should have the same directional orientation, so that the floxed sequence will be deleted. One concern is ensuring that both loxP sites are inserted on the same chromosome and on the same DNA strand. To partially address this, a mutated nickase version of the Cas9 enzyme could be used. This would create a single-stranded break, as opposed to the double-stranded break created by the wild-type enzyme, which would make it more likely that the loxP sites would be inserted in the same DNA strand to create cis mice. Multiple attempts may be necessary before the loxP sites are inserted, since the efficiency of DNA cleavage by the mutated nickase Cas9 is lower than that of wild-type Cas9.

While the use of the CRISPR/Cas9 system would make insertion of the loxP sites significantly easier, the issue of extremely low Cre-mediated deletion efficiency remains due to the large size of the region we are proposing to delete. To address this, several loxP sites could be inserted to allow for a serial deletion of the complete MDS critical region (**Figure 4**). This would be done according to the method proposed above, but instead of only inserting loxP sites near Pafah1b1 and Ywhae, multiple loxP sites would span the ∼1.3 Mb region. This should increase the efficiency of the Cre-mediated deletion.

Embryonic stem (ES) cells or fertilized eggs may be used to employ our strategy which combines Cre-loxP and CRISPR/Cas9 techniques. Use of ES cells is preferred because it will be simpler to inject the gRNA constructs in cells in culture and confirm that the loxP sites have been incorporated as desired. Once ES cells of the appropriate genotype have been generated, they may be used to create chimera mice. The chimeras should then be used for testing the germline transmission. These mice with the multiple pairs of loxP sites would finally be crossed with Cre transgenic mice. There are a few considerations that should be made when selecting which of the several varieties of Cre recombinase to use. EIIa-Cre causes recombination very early in development (Lakso et al., 1996). This would be most representative of human patients. However, this phenotype may be embryonic lethal. Nestin-Cre is neuron-specific and would cause recombination later in development (Giusti et al., 2014). This would be less representative of what occurs in human development, but may be the better option if inducing recombination at earlier time points proves to be lethal. Although numerous technical aspects would need to be considered throughout the process, generation of mouse models that have either a deletion or duplication of the region of 17p13.3 that contains both Pafah1b1 and Ywhae would likely provide the most clinically relevant models of the associated diseases to date.

#### Advantages and Disadvantages of the Proposed Mouse Model

Creation of mouse models for deletion and duplication of the MDS critical region will present unique technical challenges. There are many advantages associated with these models, but they are not without some disadvantages. Mouse models are a classical method for the study of neurodevelopmental disorders, since ethical concerns prevent the study of neurodevelopment using human subjects. Post-mortem analysis of brains and other techniques traditionally used to study phenotypes that emerge later in life do not allow developmental defects to be studied in real time, so these methods will not greatly advance our understanding of neurodevelopmental diseases. For these reasons, use of a mouse model that allows for the observation of neurodevelopment is highly advantageous. With recent advances in the use of iPSCs and organoids, it is possible to observe and analyze cellular, molecular, and simple morphogenic and migrational phenotypes associated with disease. Use of a mouse model would also allow for observation of these phenomena, but would additionally allow for analysis of accompanying behavioral phenotypes and craniofacial defects, which are homologous to those seen in human patients.

Although mice and humans differ in brain structure, chromosome structure, and behavior, creation of a mouse model is still useful to draw conclusions about brain and behavioral abnormalities in humans (Lui et al., 2011; Watson and Platt, 2012; Geschwind and Rakic, 2013; Florio and Huttner, 2014). There is a slight variation in the arrangement of genes surrounding the MDS critical region. The BHLHA9 gene that is located immediately outside the MDS critical region in humans is separated from the MDS critical region in mice by several other genes. This means that a frequently duplicated gene in humans may not factor into the mouse model for the 17p13.3 microduplication syndrome. Although this slight difference may cause variation, the entire MDS critical region does exist in the same order in humans and mice.

The most relevant behavioral difference between mice and humans is that it is impossible to draw conclusions regarding language development in mice, which would be of interest since language development is delayed in autistic human patients with 17p13.3 microduplication syndrome and developmentally delayed patients with 17p13.3 microdeletion syndrome (Silverman et al., 2010; Watson and Platt, 2012). However, there are other established measures of autistic behavior in mice that are homologous to humans such as stereotypic and repetitive behaviors, social behavior, and social communications (Crawley, 2004; Silverman et al., 2010; Watson and Platt, 2012). Social behavior is an especially important consideration for the study of ASD and related phenotypes in which patients may have characteristically poor social relationships.

Mouse models are also useful for analyzing typical phenotypes observed in patients with 17p13.3 microdeletion syndrome such as spasticity, epileptic seizures, decreased longevity, and intellectual disability (Dobyns et al., 1993). In addition, the craniofacial defects observed in human populations with 17p13.3 microdeletion syndrome are manifested in mice (Carter et al., 2000; Toyo-oka et al., 2004; Yu et al., 2014). Use of a mouse model with a large deletion or duplication, as opposed to single gene deletions and duplications, will be much more representative of the genotypes that occur in patients with ILS, MDS, and 17p13.3 microduplication syndrome. While there are drawbacks associated with any animal model, the creation of the proposed mouse models would advance the study of the rare neurodevelopmental diseases discussed in this review and is a goal worthy of effort.

## Next Generation Sequencing and Rare Disease

Although the incidence of each individual rare disease is low, rare diseases collectively affect ∼30 million people in the United States (Shen et al., 2015). Next generation sequencing (NGS) has accelerated the rate at which genes responsible for rare monogenic diseases are being identified (Boycott et al., 2013). Whole-exome sequencing (WES) is a popular technique that is currently favored over whole-genome sequencing (WGS) due to its lower cost and complexity for detecting genetic variations (Boycott et al., 2013). WES is an NGS technique that may be applied to identify novel rare disease causing genes. NGS techniques are not yet being used as a typical method for diagnosis, but this may become a common practice once more gene-phenotype relationships are identified (Shen et al., 2015). NGS has had a significant positive impact on rare disease research. However, this impact is rather limited to diseases that are caused by a single gene.

Progress in research has not been as rapid on rare diseases that involve large regions of the genome and multiple genes, such as MDS and 17p13.3 microduplication syndrome, due to the focus of previous research on identifying mutations in single genes. The NGS techniques that have been used to identify about half of the genes responsible for the approximated 7,000 known rare monogenic diseases need to be modified before they may be applied to the study of rare diseases involving multiple genes (Boycott et al., 2013). Specifically, data interpretation should be modified. To help identify the specific genes that have been deleted or duplicated in patients with MDS or 17p13.3 microduplication syndrome, respectively, studies should be looking at CNVs of genes within critical regions for these diseases. Such an approach may help identify additional genes of importance for these conditions. Applying this strategy to other diseases of unknown genetic origin may help determine if multiple genes are deleted or duplicated in other syndromes.

Study of CNVs using targeted NGS sequencing data, such as WES, is challenging because deletions and duplications may not begin or end within the exome, making them difficult to identify. However, a statistical method known as SeqCNV has recently been developed that can robustly identify CNVs using capture NGS data (Chen et al., 2017). First, a dataset is generated using WES. Then, the analysis is done using SeqCNV to identify the copy number ratio and CNV boundary through use of read depth information and maximum penalized likelihood estimation (MPLE) (Chen et al., 2017). CNVs can also be identified from targeted NGS data using a popular software package called ExomeDepth. The sensitivity and specificity of ExomeDepth v1.1.16 were determined to be 100% and 99.8%, respectively, validating this technique as an appropriate method for detection of CNVs (Ellingford et al., 2017). Methods such as SeqCNV and ExomeDepth are also advantageous because they can identify CNVs anywhere in the genome by using WES data, unlike other methods, such as FISH, where a region of interest must be identified to carry out the technique. Development of NGS techniques has already greatly improved our knowledge of rare monogenic diseases. A shift in attention to more complex genotypes involving the deletion or duplication of multiple genes may result in better approaches for identifying CNVs.
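The read-depth principle that these tools build on can be illustrated with a deliberately simplified sketch. It is not the SeqCNV algorithm (which relies on maximum penalized likelihood estimation) and it is not ExomeDepth's statistical model; it only compares per-exon depth in a case sample against a control after scaling for library size. The depth values, the pseudocount, and the log2 thresholds are illustrative assumptions, not values taken from either publication.

```python
import math

def log2_depth_ratios(case_depth, control_depth, pseudocount=0.5):
    """Per-exon log2(case/control) after scaling each sample by its total depth.

    The pseudocount (an assumption) avoids log(0) for exons without reads.
    """
    case_total = sum(case_depth)
    control_total = sum(control_depth)
    ratios = []
    for case, control in zip(case_depth, control_depth):
        scaled_case = (case + pseudocount) / case_total
        scaled_control = (control + pseudocount) / control_total
        ratios.append(math.log2(scaled_case / scaled_control))
    return ratios

def call_exons(ratios, loss_threshold=-0.6, gain_threshold=0.4):
    """Label each exon using fixed thresholds (hypothetical cut-offs)."""
    calls = []
    for ratio in ratios:
        if ratio <= loss_threshold:
            calls.append("deletion")
        elif ratio >= gain_threshold:
            calls.append("duplication")
        else:
            calls.append("normal")
    return calls

# Toy data: ten exons; exons 3-5 have half the depth, exon 8 has 1.5x the depth.
control = [100] * 10
case = [100, 100, 100, 50, 50, 50, 100, 100, 150, 100]

exon_calls = call_exons(log2_depth_ratios(case, control))
print(exon_calls)
```

Real callers additionally model GC content, capture efficiency, and sample-to-sample noise, and they merge adjacent exons into segments to estimate CNV boundaries; none of that is attempted here.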

## CONCLUSIONS

Microdeletions and microduplications of chromosome 17p13.3 lead to rare and complex diseases. The advent of NGS techniques and the refinement of CRISPR/Cas9 mouse genetics open new possibilities for the study of rare diseases, such as the creation of new mouse models that previously would have been difficult or impossible to create, advanced analysis of WES data to better identify CNVs, and the potential for more accurate diagnostic tools. These techniques also offer alternative tools for studying rare diseases that do not rely heavily on case study data, which are difficult to obtain. Study of neurodevelopmental disorders can also advance the field of neurodevelopmental research in general by contributing to our knowledge of fundamental processes involved in normal brain development, including neurogenesis, neuronal migration, and neurite formation.

## AUTHOR CONTRIBUTIONS

SMB, SAB, and TS wrote the initial draft of the manuscript. SMB and SAB created the figures. KT edited and finalized the manuscript.

## FUNDING

This review has been supported by a research grant from the NINDS (NS096098) and BeHEARD Technology Prizes.

## ACKNOWLEDGMENTS

We would like to acknowledge the National Institute of Neurological Disorders and Stroke (NINDS) and Rare Genomics Institute for their support. We also thank Dr. Anthony Wynshaw-Boris, Case Western Reserve University, for critical reading and his comments and Dr. Masahito Ikawa, Osaka University, for his comments.

#### REFERENCES

Bodoy, S., Fotiadis, D., Stoeger, C., Kanai, Y., and Palacin, M. (2013). The small SLC43 family: facilitator system l amino acid transporters and the orphan EEG1. Mol. Aspects Med. 34, 638–645. doi: 10.1016/j.mam.2012.12.006


the LIS1 gene located at chromosome 17p13. JAMA 270, 2838–2842. doi: 10.1001/jama.1993.03510230076039


family adapter proteins. Biochem. Biophys. Res. Commun. 318, 204–212. doi: 10.1016/j.bbrc.2004.04.023


the region of the Miller-Dieker (17p13 deletion) syndrome. J. Med. Genet. 46, 703–710. doi: 10.1136/jmg.2008.065094


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Blazejewski, Bennison, Smith and Toyo-oka. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Rare Compound Heterozygous Frameshift Mutations in ALMS1 Gene Identified Through Exome Sequencing in a Taiwanese Patient With Alström Syndrome

Meng-Che Tsai<sup>1,2</sup>, Hui-Wen Yu<sup>2,3</sup>, Tsunglin Liu<sup>4</sup>, Yen-Yin Chou<sup>1</sup>, Yuan-Yow Chiou<sup>1</sup> and Peng-Chieh Chen<sup>2,3</sup>\*

<sup>1</sup> Department of Pediatrics, National Cheng Kung University Hospital, College of Medicine, National Cheng Kung University, Tainan, Taiwan, <sup>2</sup> Institute of Clinical Medicine, College of Medicine, National Cheng Kung University, Tainan, Taiwan, <sup>3</sup> Center of Clinical Medicine, National Cheng Kung University Hospital, College of Medicine, National Cheng Kung University, Tainan, Taiwan, <sup>4</sup> Department of Biotechnology and Bioindustry Sciences, National Cheng Kung University, Tainan, Taiwan

#### Edited by:

Arvin Gouw, Rare Genomics Institute, United States

#### Reviewed by:

Thomy J. L. de Ravel, University Hospitals Leuven, Belgium Nelson L. S. Tang, The Chinese University of Hong Kong, China

> \*Correspondence: Peng-Chieh Chen pengchic@mail.ncku.edu.tw

#### Specialty section:

This article was submitted to Genetic Disorders, a section of the journal Frontiers in Genetics

Received: 14 December 2017 Accepted: 21 March 2018 Published: 18 April 2018

#### Citation:

Tsai M-C, Yu H-W, Liu T, Chou Y-Y, Chiou Y-Y and Chen P-C (2018) Rare Compound Heterozygous Frameshift Mutations in ALMS1 Gene Identified Through Exome Sequencing in a Taiwanese Patient With Alström Syndrome. Front. Genet. 9:110. doi: 10.3389/fgene.2018.00110

Alström syndrome (AS) is a rare autosomal recessive disorder that shares clinical features with other ciliopathy-related diseases. Genetic mutation analysis is often required for differential diagnosis but is usually costly in time and effort when conventional Sanger sequencing is used. Herein we describe a Taiwanese patient presenting with cone-rod dystrophy and early-onset obesity that progressed to diabetes mellitus with marked insulin resistance during adolescence. Whole exome sequencing of the patient's genomic DNA identified a novel frameshift mutation in exon 15 (c.10290\_10291delTA, p.Lys3431Serfs∗10) and a rare mutation in exon 16 (c.10823\_10824delAG, p.Arg3609Alafs∗6) of the ALMS1 gene. The compound heterozygous mutations were predicted to produce truncated proteins. This report highlights the clinical utility of exome sequencing and extends the knowledge of the mutation spectrum in AS patients.

Keywords: Alström syndrome, ALMS1 gene, ciliopathy, whole exome sequencing, childhood obesity, retinitis pigmentosa

## INTRODUCTION

Alström syndrome (AS; OMIM 203800) is a rare autosomal recessive disorder characterized by early-onset blindness due to cone-rod dystrophy, juvenile obesity followed by marked insulin resistance and type 2 diabetes mellitus, and progressive sensorineural hearing impairment that usually begins within the first year of life (Marshall et al., 2007a). Moreover, a proportion of AS patients also present with liver, kidney, neurological, cardiac, and pulmonary diseases (Marshall et al., 2005). No disease-specific treatment is yet available, and early morbidity and mortality are expected in affected patients.

The causative gene of AS has been identified as ALMS1, which comprises 23 exons and encodes a 461-kDa protein (ALMS1, centrosome and basal body associated protein) involved in ciliary functions (Hearn et al., 2002). Although the multi-organ pathogenesis of AS has not been fully delineated, it has been suggested that truncated or dysfunctional proteins disrupt cell cycle regulation and intracellular trafficking (Girard and Petrovsky, 2011). Clinical diagnosis is based on observation of cardinal features. However, the age of onset and the progression of manifestations vary, possibly owing to the allelic and expressive heterogeneity of AS, and this can delay the diagnosis, particularly in patients who show only some of the clinical features, which are also shared by other ciliopathies such as Bardet-Biedl syndrome (Marshall et al., 2007a). Under such circumstances, confirmatory mutation analysis of ALMS1 is required, but it is usually laborious by conventional Sanger sequencing, given the size and mutation spectrum of the ALMS1 gene along with the other disease-associated genes involved in the ciliopathies (Marshall et al., 2007b). This difficulty can be resolved by applying next generation sequencing techniques, which now allow comprehensive identification of genetic variants (Bamshad et al., 2011). Here, we report a Taiwanese patient with AS features in whom whole exome sequencing (WES) identified compound heterozygous mutations in ALMS1, one of which is novel.

## CLINICAL REPORT

A 16-year-old Taiwanese boy, born uneventfully to Taiwanese parents, measured 147 cm in height and 55 kg in weight, with a body mass index of 25.2 kg/m<sup>2</sup>. He presented with hyperphagia and rapid weight gain at the age of 5 months, when he weighed 10 kg (>97th percentile) and measured 69 cm (>75th percentile). Around the same time, he was diagnosed with bilateral nystagmus, which progressed to near blindness by the age of 4. Funduscopic examinations revealed retinitis pigmentosa. He was hospitalized at the age of 11 because of diabetic ketoacidosis and acute pancreatitis. Examinations also revealed hypertriglyceridemia, severe fatty liver, and renal insufficiency. The initial metabolic survey found marked insulin resistance and pancreatic insufficiency: the homeostatic model assessment for insulin resistance (HOMA-IR) index was 83.5 [a value greater than 2.6 may indicate insulin resistance in adolescents (Burrows et al., 2015)] and HOMA-β was 78.2% [a value greater than 100% may indicate preserved insulin secretion (Matthews et al., 1988)]. Insulin therapy was started at that time, and within 5 years the daily requirement progressively exceeded 260 IU/day, accompanied by remarkable acanthosis nigricans. An updated endocrine investigation showed an optimal increase of C-peptide secretion in response to intravenous glucagon stimulation. Echocardiography revealed borderline left ventricular dilatation. Intellectual and hearing abilities were both unaffected.
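For readers unfamiliar with the two indices, the sketch below shows how HOMA-IR and HOMA-β are conventionally computed from a fasting glucose and a fasting insulin measurement (the original homeostatic model assessment formulas of Matthews et al., 1988). The patient's raw glucose and insulin values are not reported here, so the inputs in the example are hypothetical and serve only to show the arithmetic; the cut-offs of 2.6 and 100% are the ones quoted in the clinical report.

```python
def homa_ir(fasting_glucose_mmol_l, fasting_insulin_uU_ml):
    """HOMA-IR = glucose (mmol/L) x insulin (microU/mL) / 22.5."""
    return fasting_glucose_mmol_l * fasting_insulin_uU_ml / 22.5

def homa_beta(fasting_glucose_mmol_l, fasting_insulin_uU_ml):
    """HOMA-beta (%) = 20 x insulin (microU/mL) / (glucose (mmol/L) - 3.5)."""
    if fasting_glucose_mmol_l <= 3.5:
        raise ValueError("HOMA-beta is undefined for glucose <= 3.5 mmol/L")
    return 20.0 * fasting_insulin_uU_ml / (fasting_glucose_mmol_l - 3.5)

def suggests_insulin_resistance(homa_ir_value, cutoff=2.6):
    """Cut-off for adolescents as cited in the text (Burrows et al., 2015)."""
    return homa_ir_value > cutoff

# Hypothetical inputs, NOT the patient's measurements.
example_ir = homa_ir(5.0, 10.0)
example_beta = homa_beta(5.0, 10.0)
print(round(example_ir, 2), round(example_beta, 1))

# The reported index of 83.5 lies far above the 2.6 cut-off.
print(suggests_insulin_resistance(83.5))
```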

## METHODS

Written informed consent was obtained from the patient and his parents, and the entire study procedure was approved by the Institutional Review Board of the National Cheng Kung University Hospital (B-BR-104-063). The proband's genomic DNA was extracted from peripheral blood collected in EDTA-containing tubes. SureSelect QXT All human exon V6 (Agilent), which targets 60 Mb of exonic regions, was used to construct the exome library, which was then sequenced on the Illumina NextSeq500 platform. We aligned the sequence reads to the human reference genome hg19 using Novoalign (www.novocraft.com) and identified single nucleotide variants and small insertions and deletions using Genome Analysis Toolkit 3.4 (GATK; www.broadinstitute.org/gatk). The sequence variants were annotated with SeattleSeqAnnotation (snp.gs.washington.edu/SeattleSeqAnnotation138), and novel variants were identified by filtering against 1000 Genomes, dbSNP, and the Genome Aggregation Database (gnomad.broadinstitute.org). Sanger sequencing was finally used to confirm the mutations.
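The filtering step can be pictured with a small sketch. It is not the pipeline used in this study: the `Variant` record, the `candidate_genes` helper, the 0.001 allele-frequency ceiling, and the `GENE_X`/`GENE_Y` entries are all hypothetical. Only the two ALMS1 changes and the 2.53 × 10<sup>−5</sup> frequency come from this report. The idea is to keep variants that are absent from, or rare in, population databases, keep those with a damaging predicted consequence, and flag genes carrying at least two such variants as candidates for a recessive (possibly compound heterozygous) cause.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Variant:
    gene: str
    change: str
    consequence: str
    population_af: Optional[float]  # None = not found in the reference databases

DAMAGING = {"frameshift", "stop_gained", "splice_site"}

def is_rare(variant, max_af=0.001):
    """Novel variants (no database entry) count as rare; the ceiling is an assumption."""
    return variant.population_af is None or variant.population_af < max_af

def candidate_genes(variants, max_af=0.001):
    """Genes with two or more rare, damaging variants."""
    by_gene = defaultdict(list)
    for variant in variants:
        if is_rare(variant, max_af) and variant.consequence in DAMAGING:
            by_gene[variant.gene].append(variant.change)
    return {gene: changes for gene, changes in by_gene.items() if len(changes) >= 2}

calls = [
    Variant("ALMS1", "c.10290_10291delTA", "frameshift", None),
    Variant("ALMS1", "c.10823_10824delAG", "frameshift", 2.53e-5),
    Variant("GENE_X", "c.100A>G", "missense", 0.12),      # hypothetical common variant
    Variant("GENE_Y", "c.250delC", "frameshift", None),   # hypothetical single hit
]

candidates = candidate_genes(calls)
print(candidates)
```

Two rare variants in one gene do not by themselves prove compound heterozygosity; as in this study, parental testing is needed to show that the variants lie on different alleles.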

## RESULTS

With an average coverage of 40.1X on targeted regions, we identified 28,967 novel genetic variants by exome sequencing. A summary of the variants identified is listed in **Table 1**. Two pathological mutations in ALMS1 were identified: chr2:73786171delTA (c.10290\_10291delTA) in exon 15 (**Figure 1A**, upper panel) and chr2:73799829delAG (c.10823\_10824delAG) in exon 16 (**Figure 1A**, lower panel). These mutations were further confirmed by Sanger sequencing (**Figure 1B**), and the results showed that c.10290\_10291delTA (p.Lys3431Serfs∗10) was maternally inherited and c.10823\_10824delAG (p.Arg3609Alafs∗6) was paternally inherited. These mutations resided in a conserved stretch of amino acids and led to truncated proteins lacking the ALMS motif (**Figures 2A,B**).

## DISCUSSION

In this report, we described a Taiwanese patient presenting clinical features compatible with AS, except for the absence of hearing impairment. Using WES, we identified compound heterozygous mutations in exons 15 and 16 of ALMS1. Both affected alleles carried frameshift mutations that were predicted to introduce a premature stop codon downstream and produce a truncated ALMS1 protein.
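Why a two-base deletion truncates the protein can be shown with a toy example. The sequence below is an invented 27-nucleotide coding sequence, not part of ALMS1, and `delete_bases` and `translate` are illustrative helpers. Removing two bases shifts the reading frame, so every downstream codon is read differently until a stop codon appears in the new frame.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(cds):
    """Translate from the first base until a stop codon or the end of the sequence."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

def delete_bases(cds, start, length):
    """Remove `length` bases beginning at 0-based position `start`."""
    return cds[:start] + cds[start + length:]

# Invented sequence: ATG GCC AAG TAT CGA CCT GAG GAA TGA
wild_type = "ATGGCCAAGTATCGACCTGAGGAATGA"
mutant = delete_bases(wild_type, start=9, length=2)  # deletes the "TA" of codon 4

wild_type_protein = translate(wild_type)
mutant_protein = translate(mutant)
print(wild_type_protein, mutant_protein)
```

In the toy case the fourth residue changes and a stop codon follows two codons later, which mirrors on a small scale what the "fs∗10" and "fs∗6" suffixes describe for the two ALMS1 variants.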

The ALMS1 protein consists of 4,169 amino acids and contains several domains including a potential signal peptide at residues 211–223, a leucine zipper at residues 2480–2501, and an ALMS motif at residues 4035–4167 (Collin et al., 2002; Hearn et al., 2002). The exact molecular role of these motifs is not completely elucidated, although truncated ALMS1 proteins have exhibited perturbed effects on intracellular localization, microtubular organization, and cell cycle regulation (Hearn et al., 2002, 2005; Girard and Petrovsky, 2011). In our patient, truncated proteins, if expressed from the alleles carrying the observed mutations p.Lys3431Serfs∗10 and p.Arg3609Alafs∗6, are presumed to lack the ALMS motif that is essential for co-localization with centrioles and basal bodies (Knorz et al., 2010). Given that ALMS1 is ubiquitously expressed in human tissues, low abundance or functionality of transcripts may contribute to the multiorgan pathogenesis in AS, such as metabolic and neurosensory disorders (Collin et al., 2002; Hearn et al., 2002; Girard and Petrovsky, 2011). However, phenotypic variability exists among patients with mutations at the same ALMS1 sites, even within the same family (Titomanlio et al., 2004). Our patient did not present the typical sensorineural hearing impairment, which may be explained by the earlier proposal that the genotype-phenotype correlation of ALMS1 mutations is modified by other genes or environmental factors (Collin et al., 2002; Marshall et al., 2007b).

[Table 1 abbreviations: SNP, single nucleotide polymorphism; INDEL, insertions and deletions; GV, genetic variants.]

[Figure 1 caption, partially recovered: father (I-1). WT, wild type.]

A wide array of nonsense and frameshift mutations has been identified in the coding regions of the ALMS1 gene, with potential hotspots preferentially located in exons 8, 10, and 16 (**Figure 2A**) (Marshall et al., 2007b; Ozantürk et al., 2015). Ethnicity remains a strong contributor to the distribution of genetic variants. However, the skewed clustering of mutations may be affected by genetic founder effects and consanguinity. In individuals of East Asian descent, most reported mutations were dispersed within the aforementioned cluster of exons but were distinctive in location and structure (Liang et al., 2013; Marshall et al., 2015). Only a few variants have been reported more than once in East Asian populations, such as c.11116\_11134del recurring in Taiwanese families (Marshall et al., 2007b; Lee et al., 2009). The rare variant c.10823\_10824delAG has been reported exclusively in individuals of East Asian descent, with an estimated allele frequency of 2.53 × 10<sup>−5</sup> in the up-to-date Genome Aggregation Database<sup>1</sup>. Therefore, we assumed a genetic founder effect on this rare pathological variant. Obtaining a genetic diagnosis for clinical patients in undergenotyped populations can expand our knowledge about this disease entity and the relevant functionality of the ALMS1 gene. From a practical perspective, screening for these mutations in suspected cases will be a feasible diagnostic step if there are sufficient data regarding the prevalence of ALMS1 mutations in the local population. Otherwise, whole gene sequencing is usually required to fully capture the genotypes.

<sup>1</sup>The Genome Aggregation Database. Available online at: http://gnomad.broadinstitute.org/about (Accessed January 18, 2018).
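To give a sense of scale for the allele frequency quoted above, the following sketch converts it into an expected carrier frequency. It assumes Hardy-Weinberg equilibrium, which is a simplification that a founder effect or consanguinity would violate, so the figures are rough orientation only and are not estimates made in this report.

```python
def carrier_frequency(allele_frequency):
    """Expected heterozygote frequency 2pq under Hardy-Weinberg equilibrium."""
    q = allele_frequency
    return 2.0 * q * (1.0 - q)

def homozygote_frequency(allele_frequency):
    """Expected frequency q^2 of individuals homozygous for this one variant."""
    return allele_frequency ** 2

# Allele frequency of c.10823_10824delAG as quoted in the text.
q = 2.53e-5
carriers = carrier_frequency(q)
print(f"carrier frequency ~ {carriers:.3g} (about 1 in {1 / carriers:,.0f})")
print(f"homozygote frequency ~ {homozygote_frequency(q):.3g}")
```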

Technological advancement in high throughput sequencing has met clinical needs and thus hastened the identification of novel pathological variants in the ALMS1 gene (Bamshad et al., 2011; Ozantürk et al., 2015). Compared with conventional Sanger sequencing, WES provides an effective and efficient alternative that aids genetic diagnosis, especially in cases whose features overlap among several differential diagnoses within the spectrum of ciliopathies. Moreover, its clinical utility is likely to increase significantly in the near future, as the cost of WES is expected to drop below that of Sanger sequencing when a sizable coding region of the target gene must be read. Our report highlights the value of WES in providing genetic diagnosis in rare diseases.

#### AUTHOR CONTRIBUTIONS

M-CT and P-CC conceived the study; M-CT, Ye-YC, and Yu-YC collected clinical information; H-WY, TL, and P-CC conducted the bioinformatics analysis; M-CT drafted the manuscript and P-CC supervised the entire study. All the authors approved the final version of the manuscript.

#### FUNDING

This study was supported by a research grant from National Cheng Kung University Hospital (NCKUH-10507026).


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Tsai, Yu, Liu, Chou, Chiou and Chen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# RNA-Sequencing of Primary Retinoblastoma Tumors Provides New Insights and Challenges Into Tumor Development

Sailaja V. Elchuri<sup>1</sup>, Swetha Rajasekaran<sup>2,3,4</sup> and Wayne O. Miles<sup>2,3,4</sup>\*

<sup>1</sup> Department of Nanotechnology, Vision Research Foundation, Sankara Nethralaya, Chennai, India, <sup>2</sup> The Ohio State University Comprehensive Cancer Center, Columbus, OH, United States, <sup>3</sup> Center for RNA Biology, The Ohio State University, Columbus, OH, United States, <sup>4</sup> Department of Molecular Genetics, The Ohio State University, Columbus, OH, United States

#### Edited by:

Arvin Gouw, Rare Genomics Institute, United States

#### Reviewed by:

Yanfeng Zhang, HudsonAlpha Institute for Biotechnology, United States Nelson L. S. Tang, The Chinese University of Hong Kong, China

> \*Correspondence: Wayne O. Miles wayne.miles@osumc.edu

#### Specialty section:

This article was submitted to Genetic Disorders, a section of the journal Frontiers in Genetics

Received: 15 December 2017 Accepted: 26 April 2018 Published: 17 May 2018

#### Citation:

Elchuri SV, Rajasekaran S and Miles WO (2018) RNA-Sequencing of Primary Retinoblastoma Tumors Provides New Insights and Challenges Into Tumor Development. Front. Genet. 9:170. doi: 10.3389/fgene.2018.00170

Retinoblastoma is a rare tumor of the retina caused by the homozygous loss of the Retinoblastoma 1 tumor suppressor gene (RB1). Loss of the RB1 protein, pRB, results in de-regulated activity of the E2F transcription factors, chromatin changes and developmental defects leading to tumor development. Extensive microarray profiles of these tumors have enabled the identification of genes sensitive to pRB disruption; however, this technology has a number of limitations in the RNA profiles that it generates. The advent of RNA-sequencing has enabled the global profiling of all of the RNA within the cell, including both coding and non-coding features, and the detection of aberrant RNA processing events. In this perspective, we discuss how RNA-sequencing of rare Retinoblastoma tumors will build on existing data and open up new areas to improve our understanding of the biology of these tumors. In particular, we discuss how the RB-research field may be able to use these data to determine how RB1 loss results in the expression of non-coding RNAs and causes aberrant RNA processing events, and how a deeper analysis of metabolic RNA changes can be utilized to model tumor-specific shifts in metabolism. Each section discusses new opportunities and challenges associated with these types of analyses and aims to provide an honest assessment of how understanding these different processes may contribute to the treatment of Retinoblastoma.

Keywords: Retinoblastoma, RNA-sequencing, non-coding RNA (LINC-RNA), tumor, RB1, metabolic profiling

#### INTRODUCTION

Tight control over proliferation and differentiation processes is fundamentally important for modulating organismal development and preventing oncogenic growth. In eukaryotes, the Retinoblastoma 1 (RB1: gene, pRB: protein) gene and the E2 promoter binding Factors (E2Fs) function to control both of these pathways (Hiebert et al., 1992; Du et al., 1996). pRB is the sole protein that can bind to and repress the activity of the activator E2Fs, E2F1-E2F3 (Hiebert et al., 1992), and this has made inactivation of the pRB pathway in cancer almost ubiquitous. Loss of pRB function permits the constitutive activity of the activator E2Fs and the uncontrolled proliferation of cells. Additionally, pRB-deficient cells have developmental defects which commonly prevent these cells from terminally differentiating (Xu et al., 2009, 2014). pRB binding to the activator E2Fs mediates the recruitment of numerous transcriptional repressor complexes, including histone deacetylases (HDAC) and the mating-type switching/Sucrose Non Fermenting (SWI/SNF) complex, promoting repressive chromatin modifications on E2F target genes (Brehm et al., 1998). In cancer cells the pRB pathway is disabled by numerous mechanisms, including inactivating mutations within the RB1 locus (Friend et al., 1986), the E7 oncoprotein produced by the Human Papilloma Virus (Dyson et al., 1989), the amplification or overexpression of Cyclin D and/or Cyclin Dependent Kinase 4 or 6 (CDK4 and CDK6) (Nobori et al., 1994; Connell-Crowley et al., 1997; Harbour et al., 1999), or the deletion or silencing of the CDK inhibitors CDKN2A-D (Cairns et al., 1994). Inactivation of the pRB/E2F pathway is one of the hallmarks of cancer, and the pathway is disrupted in most tumor types (Sherr, 2000; Classon and Harlow, 2002). New research focusing on how pRB loss changes cells has identified novel pathways sensitive to pRB function, including genome stability (Manning et al., 2010), metabolism and mitochondrial biogenesis (Nicolay et al., 2015; Varaljai et al., 2015). The characterization of these unexpected biological processes, coupled with the identification of new pRB interacting partners and the improved quality of transcriptional profiling, has inspired the pRB field to probe how pRB function modulates every aspect of cellular biology.

The RB1 gene was originally cloned as the genetic cause of the pediatric tumor of the retina, Retinoblastoma (Friend et al., 1986; Lee et al., 1987). Homozygous loss of the RB1 gene within these cells causes their aberrant proliferation and the incorrect specification of their cell fate (Xu et al., 2009, 2014). A small subset (3%) of Retinoblastoma tumors retain pRB function and contain MYCN amplifications (Bowles et al., 2007). RB1-deficient Retinoblastoma tumors are found in both heritable and sporadic forms (Friend et al., 1986; Lee et al., 1987) and occur at an incidence of 1 in 15,000–20,000 births (Dimaras et al., 2012). These patients are also frequently affected during adolescence by secondary tumors that are commonly sarcomas but can also affect a number of organs including the lung, bladder, and brain (Eng et al., 1993; Nahum et al., 2001). These secondary tumors are frequently metastatic, and these patients have a poor long-term prognosis.

Genomic studies and microarray data have identified rod and cone retinal pigment epithelial (RPE) cells as the major cells of origin of Retinoblastoma (Xu et al., 2009, 2014; Kapatai et al., 2013; Cobrinik, 2015). High-resolution single nucleotide polymorphism (SNP) arrays from both hereditary and non-hereditary tumors show large genetic variation between these groups and suggest that sub-categorization is possible (Kooi et al., 2015). In particular, hereditary Retinoblastoma tumors have fewer genomic rearrangements, whilst non-hereditary tumors exhibit higher levels of genomic instability (Corson and Gallie, 2007; Mol et al., 2014). These genomic changes predominantly result in the gain of a number of chromosomal regions, including 1q, 2p, 6p, and 13q, and the loss of the 16q region (Kooi et al., 2016). Multiple studies have investigated these genomic aberrations, and several candidate genes have been identified within these loci, including KIF14, MDM4, MYCN, E2F3, DEK, and CDH11. Additional whole genome sampling arrays (WGSA) have found regions of genomic change in Retinoblastoma tumors and identified genes involved in differentiation regulation such as CEP170, SIX1, and SIX4 (Ganguly et al., 2009). This analysis led to new genomic and epigenetic mapping experiments and to the identification of Spleen tyrosine kinase (SYK) as a novel therapeutic target (Zhang et al., 2012; Pritchard et al., 2014). Additional studies identified the Mouse double minute 2 homolog (MDM2) and Tumor Protein 53 (p53) signaling node as dysregulated in Retinoblastomas (de Oliveira Reis et al., 2012; Qi and Cobrinik, 2017), and therapeutic intervention targeting this pathway resulted in apoptosis of RB cells. These technologies have enabled the identification of new and potentially therapeutically targetable pathways in Retinoblastoma (Pritchard et al., 2014; Assayag et al., 2016; reviewed by Theriault et al., 2014); however, a number of additional key areas remain unexplored. These include: How do non-coding RNAs change in these tumors? Can metabolic gene expression be used to predict targetable metabolic defects? And how do RNA processing events change in tumors lacking the RB1 gene? In this perspective, we will discuss the importance of addressing these areas and then highlight some of the challenges that accompany them. Information is power, and this is particularly important when trying to improve the clinical outcomes for patients with pediatric tumors.

## RNA-SEQUENCING TECHNOLOGY

Pioneering studies into the transcriptome of Retinoblastoma tumors using microarrays have provided critical foundations for our understanding of the disease. The advent of new and unbiased profiling of the RNA content of cells by RNA-sequencing (RNA-seq) significantly increases the resolution of RNA measurements and can be expanded to include non-coding expression and the detection of rare RNA fusion products. RNA-seq provides enhanced transcriptomic profiling by directly sequencing the RNA, or DNA copies of the RNA, and comparing the sequences to the genome rather than relying on probe detection as on traditional microarrays. This technology therefore circumvents the need to test RNA against pre-determined gene sets and can interrogate the entire transcriptome. RNA-seq platforms can measure all the RNA of cells; however, they require the removal of very abundant RNA species that do not contribute to the transcriptome, including ribosomal RNA (rRNA) and transfer RNA (tRNA). This critical experimental step depletes these specific RNA species from the cellular pool of RNA before library preparation and increases the overall coverage and depth of RNA-seq reads on the remaining genes. Currently, RNA-seq datasets of Retinoblastoma tumors and normal controls have not been published; however, several independent teams are known to be working to generate these exciting datasets. In this perspective, we discuss three exciting opportunities that RNA-seq datasets of Retinoblastoma tumors may provide to the research community (**Figure 1**).
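The effect of the depletion step on read depth can be made concrete with a back-of-the-envelope sketch. The numbers are assumptions chosen for illustration (a 90% share of abundant RNA, 99% depletion efficiency, 50 million reads) and are not measurements from Retinoblastoma samples; the function name is likewise hypothetical.

```python
def informative_fraction(abundant_fraction, depletion_efficiency):
    """Share of the remaining RNA pool that comes from non-abundant species.

    abundant_fraction: fraction of total RNA that is rRNA/tRNA before depletion.
    depletion_efficiency: fraction of that abundant RNA removed (0 = no depletion).
    """
    remaining_abundant = abundant_fraction * (1.0 - depletion_efficiency)
    other = 1.0 - abundant_fraction
    return other / (other + remaining_abundant)

total_reads = 50_000_000  # assumed sequencing depth

without_depletion = informative_fraction(0.90, 0.0)
with_depletion = informative_fraction(0.90, 0.99)

print(f"informative reads without depletion: {total_reads * without_depletion:,.0f}")
print(f"informative reads with depletion:    {total_reads * with_depletion:,.0f}")
```

Under these assumptions the same sequencing run yields roughly nine times more reads on the remaining transcripts, which is the gain in coverage and depth described above.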

## LONG INTERGENIC NON-CODING RNA CHANGES

A number of leading laboratories around the globe have used microarray technology to determine the transcriptional changes in RB1−/− Retinoblastoma samples compared to normal retinal controls (Ganguly and Shields, 2010; Zhang et al., 2012). These data have been invaluable for understanding how changes in the mRNA composition of these tumors contribute to the underlying biology driving the aberrant proliferation and differentiation of RB1−/− retinal cells. These tools provide excellent coverage of the coding transcriptome; however, their capacity to detect non-coding RNA changes is limited to a small number of highly abundant long intergenic non-coding RNAs (LINC-RNAs). A growing body of literature has highlighted that changes in non-coding RNAs, including microRNAs (miRNAs) and LINC-RNAs, may have important and clinically predictive roles in tumorigenesis. In particular, LINC-RNA integrated maps have enabled researchers to divide Head and Neck Squamous Cell Carcinoma patients into sub-groups based upon their LINC-RNA profile, and this has enabled the identification of LINC-RNAs associated with poor prognosis (de Lena et al., 2017). In Retinoblastoma tumors only a limited number of studies of LINC-RNAs have been possible because of their poor representation on microarrays, and this work has focused on LINC-RNAs identified in other tumor types, including CCAT (Zhang et al., 2017), BANCR (Su et al., 2015), H19 (Zhang et al., 2018), MALAT1 (Liu et al., 2017), and HOTAIR (Dong et al., 2016). Therefore, determining how the spectrum of LINC-RNA expression changes in Retinoblastoma tumors is an area of growing interest. This is particularly important because a subset of Retinoblastoma patients will develop life-threatening sarcomas of the muscle, connective, or bone tissue during early adulthood, and examining new approaches to identify these patients may be of clinical benefit.

#### RB1-Loss and LINC Expression

Homozygous inactivation of the RB1 gene is the initial event triggering the development of Retinoblastoma. As discussed in the introduction, pRB acts as a transcriptional repressor to inhibit the function of the activator E2F transcription factors (E2F1-3). E2F1-3 are potent inducers of transcription, in particular of genes required for cell cycle progression and apoptosis; however, very little is known about how these factors control LINC-RNA expression. Preliminary studies from cell lines mapping activator E2F-mediated regulation of LINCs have suggested that a significant number of these RNAs may show transcriptional changes upon RB loss via both direct and indirect mechanisms (Feldstein et al., 2013; Bida et al., 2015; Gasri-Plotnitsky et al., 2017). In addition, genome-wide E2F Chromatin Immunoprecipitation (ChIP) experiments have identified both activator and repressor E2F binding in genomic loci that contain LINC-RNAs (Xu et al., 2007). These preliminary studies hint that E2F and RB modulation of LINC-RNAs is widespread and potentially important for cells. Much more work is needed to determine how the entire family of E2F transcription factors (both the activators and repressors) and the RB-like proteins (RBL1 and RBL2) modulate the levels of non-coding RNAs during both developmental and oncogenic processes.

## LINC Expression in Retinoblastoma Tumors

Given that our understanding of how RB1 loss in vitro or in model organisms changes the LINC expression profile is limited, determining how homozygous RB1 mutation affects LINC levels in a complex and developmentally disorganized Retinoblastoma tumor represents a real challenge. With the development of RNA sequencing, which measures RNA levels in cells in an unbiased manner without the need to purify polyadenylated mRNAs or to design probes against the transcriptome, we can now, for the first time, interrogate the entire transcriptome of these tumors. This new approach will enable the development of Retinoblastoma-specific LINC-RNA profiles and may generate clinically predictive biomarkers for identifying patients likely to develop secondary tumors. However, significant technical challenges have arisen in other tumor types when profiling LINC-RNAs. These are generally due to the lower expression levels of most LINC-RNAs compared to coding mRNAs and the high variability between patients. Transcripts at the lower end of the expression range are covered by few RNA-seq reads and therefore tend to have poorer signal-to-noise ratios than better expressed mRNAs, which can make generating predictive LINC profiles from small numbers of tumors difficult. This is a potential issue in Retinoblastoma studies, as these tumors are rare and both normal control and tumor cohorts tend to be smaller than for more common adult malignancies.
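A rough way to see why weakly expressed LINC-RNAs are noisier is to treat read counts as Poisson-distributed. This is an assumption made here only for illustration: real RNA-seq counts are typically overdispersed, so the figure below is closer to a lower bound on the noise. Under that model, the coefficient of variation of a count with mean n is 1/√n. The read counts used are hypothetical.

```python
import math

def poisson_cv(mean_reads: float) -> float:
    """Coefficient of variation (sd / mean) of a Poisson count with the given mean."""
    if mean_reads <= 0:
        raise ValueError("mean_reads must be positive")
    return 1.0 / math.sqrt(mean_reads)

# Hypothetical mean read counts for a weakly and a well expressed transcript.
for label, reads in [("low-abundance LINC-RNA", 10), ("abundant mRNA", 1000)]:
    print(f"{label}: {reads} reads, CV = {poisson_cv(reads):.1%}")
```

With small cohorts, this sampling noise adds to the patient-to-patient variability discussed above, which is why low-count transcripts are harder to turn into stable profiles.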

Although there are currently significant gaps in our understanding of how RB1 and the E2F pathway modulate the expression of the LINC-RNA transcriptome in cells and in tumors, there seems to be little doubt that determining how RB1 loss changes the entire transcriptome is an exciting and clinically relevant question (**Figure 2**). Profiling Retinoblastoma by RNA-seq will help fill some of these gaps and provide new insights into how RB1 loss may change the LINC-RNA transcriptome, both in these aggressive tumors and in more genetically complex adult tumors.

#### GENE OR RNA FUSIONS

In addition to documenting LINC-RNA and other non-coding RNA changes caused by RB1 loss, RNA-seq also enables the identification and measurement of novel or fusion RNAs in samples. The pRB protein has important roles in modulating genomic integrity and stability, and loss of pRB has been shown to contribute to accelerated genome manipulations and enhanced drug sensitivity. RNA-seq of Retinoblastoma tumors may also enable the identification of novel RNA species caused by genomic rearrangements or aberrant RNA processing events (**Figure 2**). Such events would normally be excluded or undetectable with traditional microarray technology. Finding and mapping these rare events is an area of growing interest, as the aberrant proteins are potential neo-antigens that can be recognized by the patient's immune system. A number of groups have highlighted the utility of these "abnormal" products for the clearance of tumors by immunotherapy, although as yet not in Retinoblastoma. This is an emerging clinical avenue and could have significant advantages in the treatment of Retinoblastoma, as it would reduce the exposure of pediatric patients to developmentally harmful chemotoxic treatments. The detection and mapping of these events is possible with deep RNA-seq coverage and may provide insights into how splicing and/or RNA metabolism is changed in Retinoblastoma tumors.

FIGURE 2 | The workflows of RNA microarray and RNA sequencing. RNA sequencing detects a higher number of targets, including targets of unknown sequence and low-abundance targets. To be detected by RNA microarrays, transcripts must be of known sequence and abundantly expressed in the cell.

#### ALTERNATE SPLICING EVENTS IN RETINOBLASTOMA

Global profiles of splicing changes in Retinoblastoma tumors have yet to be published. Analysis of alternate splicing events in the Retinoblastoma cell line Y79 using vector capping methodology revealed several variants in 57 eye-related genes, including transcription factors, signal transduction proteins, and membrane and secretory proteins (Oshikawa et al., 2011). Several discrepancies were observed in the transcriptional start sites of some of these genes, and splice variants were produced by a number of aberrant splicing events, including lack of exons, insertion of exons, shifting of splice sites, and non-splicing events. Additional studies in the same cell line found that an alternatively spliced form of Disabled-1 (Dab1) changed the phosphorylation levels of the Src family of kinases, resulting in aberrant signaling (Katyal et al., 2011). When Y79 cells were differentiated into neuronal cells, different splice variants of Ca(v)3.1 were observed, indicating tissue-specific alterations in Ca(2+) signaling (Bertolesi et al., 2006). Despite these studies, much more work is required to determine the global mRNA alternative splice variants in Retinoblastoma (McEvoy et al., 2012; Rodriguez-Martin et al., 2016; Cygan et al., 2017). There is therefore a significant need to study these events in Retinoblastoma tumors using the latest RNA-seq methods.

## METABOLIC PATHWAY PREDICTION

In-depth profiles of transcriptional changes in Retinoblastoma tumors would be a significant resource to the community. In this section, we detail how these datasets could be utilized to make testable predictions about the molecular reprogramming of Retinoblastoma cells. Microarray transcriptional profiling and proteomic analysis of Retinoblastoma tumors have identified a number of putative metabolic changes within the tumor cells (Kooi et al., 2015; Danda et al., 2016). From these studies, dysregulated lipid metabolism, mitochondrial energy metabolism, and photoreceptor metabolism were implicated as key processes linked to Retinoblastoma progression. These findings are supported by similar findings in RB1 mutant animal models and cell lines (Nicolay et al., 2015) that identified changes in lipid, glycolysis, amino acid, and TCA cycle metabolism (Nicolay and Dyson, 2013; Nicolay et al., 2013, 2015; Kohe et al., 2015; Kohno et al., 2016; Muranaka et al., 2017). Collectively, these results suggest that understanding metabolic pathway changes or reprogramming events in Retinoblastomas may provide new therapeutic opportunities.

Compared to the high-throughput methods available for studying global genomic, epigenetic, and transcriptional changes in cancer progression, metabolomics is an evolving discipline in cancer research. The major challenge in determining metabolic defects in primary tumor samples is the inability to measure metabolic flux using radioactive carbon or nitrogen molecules. To circumvent these limitations, new computational systems biology approaches could be used to understand metabolic reprogramming in Retinoblastoma (Medina-Cleghorn and Nomura, 2013). Systems biology employs constraint-based modeling, which requires reconstruction of the metabolic reactions occurring in a cell followed by computational simulation and experimental validation of the models (Hernandez Patino et al., 2012). Among these models, flux balance analysis has gained importance and has been successfully employed to predict metabolic state alterations in cultured cancer cells (Duarte et al., 2007; Orth et al., 2010; Resendis-Antonio et al., 2010; Hu et al., 2013). Using integrated systems biology approaches for metabolic modeling of Retinoblastoma tumors may provide new opportunities to develop personalized treatment options for Retinoblastoma patients (Filipp, 2017).
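For orientation, flux balance analysis is usually posed as a linear program. The standard textbook formulation is sketched below; the symbols (stoichiometric matrix S, flux vector v, objective weights c, and flux bounds) are notation introduced here for illustration and are not taken from the studies cited above.

$$
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad & S\,v = 0, \\
& v^{\min}_{j} \le v_{j} \le v^{\max}_{j} \quad \text{for every reaction } j
\end{aligned}
$$

S has one row per metabolite and one column per reaction, so the constraint Sv = 0 enforces a steady state in which no internal metabolite accumulates. The vector c typically selects a single reaction, such as biomass or ATP production, as the objective to be maximized.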

Utilizing the excellent coverage and depth of RNA-seq data from Retinoblastoma tumors and normal tissue, we will be able to generate system-level metabolic models, termed genome-scale metabolic models (GEMs). This approach has previously been employed to assay model organisms, tissue, and drug treatment network maps (Agren et al., 2014; Bordbar et al., 2014; Mardinoglu et al., 2014; Varemo et al., 2016; Hinder et al., 2017). One limitation of this approach is the assumption that gene expression levels correlate with protein levels (Fagerberg et al., 2014); variation in protein abundance due to post-transcriptional regulation and/or post-translational modification is not included in this model building. Despite this limitation, RNA-seq data have previously been used with Task-driven Integrative Network Inference for Tissues (tINIT) algorithms to develop draft GEM profiles. Once generated, these networks can be compared with other models from both experimental results and the scientific literature to determine the overall accuracy of the predictions. From this, 65 draft cell type-specific metabolic models consisting of 2,426 ± 467 reactions (± s.d.) and 1,262 ± 204 transcripts have been established (Shlomi et al., 2008; Schellenberger et al., 2011). By constructing these profiles, we can begin to build Retinoblastoma-specific predictions of the metabolic changes in these tumors that can be experimentally tested.

Additionally, several layers and cell types in retinal tissue perform unique functions, necessitating the development of sub-networks in a cell-dependent manner. Research generating a cell-specific GEM in retinal cells may enable modeling of other ocular diseases, including Age-related Macular Degeneration, Glaucoma, Diabetic retinopathy, Uveal melanoma, and Retinoblastoma. RNA-seq analysis of these ocular diseases has identified several unique retinal genes (Farkas et al., 2013; Li et al., 2014). Utilizing existing approaches and developing RNA-seq datasets from Retinoblastoma tumors may allow the development of Retinoblastoma-specific disease GEMs that may open avenues for new therapeutic opportunities. The predictive tools are very sensitive to biological variations that occur in different cohorts, necessitating measurements of metabolites using additional high-throughput techniques. Alternatively, experimental metabolite data can be integrated into modeled metabolic reaction networks to understand the diseased condition. A similar approach led to the generation of a sub-network responsible for impaired glucose tolerance in patients compared to the normal condition (Deo et al., 2010). Utilizing high-coverage RNA-seq data from Retinoblastoma tumors may enable the development of metabolic pathway predictions in this rare pediatric tumor.
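To make the expression-to-reaction step concrete, the sketch below scores reactions from transcript abundances through gene-protein-reaction (GPR) rules, taking the minimum over the subunits of a complex (AND) and the maximum over alternative isozymes (OR). This min/max convention is a common simplification and is an assumption here; it is not the tINIT scoring scheme. The TPM values, the threshold, and the reaction identifiers are invented for illustration.

```python
from typing import Dict, List

# A GPR rule in disjunctive form: a list of alternative complexes (OR),
# each complex being the list of genes that must all be expressed (AND).
GprRule = List[List[str]]

def reaction_score(rule: GprRule, expression: Dict[str, float]) -> float:
    """Score a reaction: min over genes in a complex, max over alternative complexes."""
    complex_scores = [
        min(expression.get(gene, 0.0) for gene in complex_genes)
        for complex_genes in rule
        if complex_genes
    ]
    return max(complex_scores, default=0.0)

def active_reactions(rules: Dict[str, GprRule],
                     expression: Dict[str, float],
                     threshold: float = 1.0) -> List[str]:
    """Return reaction ids whose score meets the (hypothetical) expression threshold."""
    return sorted(rid for rid, rule in rules.items()
                  if reaction_score(rule, expression) >= threshold)

# Hypothetical expression values (TPM) and GPR rules.
tpm = {"HK1": 35.0, "HK2": 0.4, "SDHA": 12.0, "SDHB": 0.2}
gpr = {
    "hexokinase": [["HK1"], ["HK2"]],                # HK1 OR HK2
    "succinate_dehydrogenase": [["SDHA", "SDHB"]],   # SDHA AND SDHB
}
print(active_reactions(gpr, tpm))
```

A score of this kind inherits the limitation noted above: it treats transcript abundance as a proxy for enzyme abundance and ignores post-transcriptional and post-translational regulation.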

## CONCLUSION

Microarray profiling of Retinoblastoma tumors has been instrumental in identifying the aberrant pathways driving the growth and invasion of the tumors. These approaches have built an immense foundation that has enabled the Retinoblastoma research field to test and develop new ways to therapeutically target these tumors. This technology, which has served us so well, has a number of limitations that can be largely circumvented by the use of RNA sequencing. In particular, RNA-seq measures changes in both coding and non-coding RNAs, which may be invaluable not only for understanding the biology of these tumors but also for stratifying patients into different treatment outcome or relapse groups. Collectively, this tool enables us to interrogate the biology sustaining Retinoblastoma tumor growth and may lead to the discovery of new therapeutic approaches to treat these patients.

#### ETHICS STATEMENT

The protocol was approved by the institutional review board on Ethical practices for research at Sankara Nethralaya.

## AUTHOR CONTRIBUTIONS

SE and WM contributed to the conception, preparation, and revision of the manuscript. SR built the figures. All of the authors approved and agreed on the manuscript.

#### FUNDING

This research was supported by K22 CA204352 (WM), a DBT-COE grant (BT/01/CEIB/11/V/16) (SE), and in part by grant P30 CA016058 from the National Cancer Institute (WM).

#### REFERENCES

MDM4 on development and survival in hereditary retinoblastoma. Pediatr. Blood Cancer 59, 39–43. doi: 10.1002/pbc.24014



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Elchuri, Rajasekaran and Miles. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Gene-Expression Analysis Identifies IGFBP2 Dysregulation in Dental Pulp Cells From Human Cleidocranial Dysplasia

Stephen L. Greene<sup>1,2</sup>, Olga Mamaeva<sup>2</sup>, David K. Crossman<sup>3</sup>, Changming Lu<sup>2</sup>\* and Mary MacDougall<sup>4</sup>\*

<sup>1</sup> Department of Pediatric Dentistry, School of Dentistry, The University of Alabama at Birmingham, Birmingham, AL, United States, <sup>2</sup> Institute of Oral Health Research, School of Dentistry, The University of Alabama at Birmingham, Birmingham, AL, United States, <sup>3</sup> Department of Genetics, School of Medicine, The University of Alabama at Birmingham, Birmingham, AL, United States, <sup>4</sup> Faculty of Dentistry, University of British Columbia, Vancouver, BC, Canada

#### Edited by:

Babajan Banganapalli, King Abdulaziz University, Saudi Arabia

#### Reviewed by:

Theodora Katsila, University of Patras, Greece Nelson L. S. Tang, The Chinese University of Hong Kong, Hong Kong

#### \*Correspondence:

Changming Lu luxxx323@uab.edu Mary MacDougall macdougall@dentistry.ubc.ca

#### Specialty section:

This article was submitted to Genetic Disorders, a section of the journal Frontiers in Genetics

Received: 03 October 2017 Accepted: 30 April 2018 Published: 23 May 2018

#### Citation:

Greene SL, Mamaeva O, Crossman DK, Lu C and MacDougall M (2018) Gene-Expression Analysis Identifies IGFBP2 Dysregulation in Dental Pulp Cells From Human Cleidocranial Dysplasia. Front. Genet. 9:178. doi: 10.3389/fgene.2018.00178

Cleidocranial dysplasia (CCD) is an autosomal dominant disorder affecting osteoblast differentiation, chondrocyte maturation, skeletal morphogenesis, and tooth formation. Dental phenotypes in CCD include over-retained primary teeth, failed eruption of permanent teeth, and supernumerary teeth. The underlying mechanism is unclear. We previously reported one CCD patient with an allelic RUNX2 deletion (CCD-011). In the current study, we determined the transcriptomic profile of dental pulp cells from this patient compared to one sex- and age-matched non-affected individual. Next generation RNA sequencing revealed that 60 genes were significantly dysregulated (63% upregulated and 27% downregulated). Among them, IGFBP2 (insulin-like growth factor binding protein-2) was found to be upregulated more than twofold in comparison to control cells. Stable overexpression of RUNX2 in CCD-011 pulp cells resulted in the reduction of IGFBP2. Moreover, ALPL expression was upregulated in CCD-011 pulp cells after introduction of normal RUNX2. Promoter analysis revealed four proximal putative RUNX2 binding sites in the −1.5 kb IGFBP2 promoter region. A relative luciferase assay confirmed that IGFBP2 is a direct target of RUNX2. Immunohistochemistry demonstrated that IGFBP2 was expressed in odontoblasts but not ameloblasts. This report demonstrates the importance of RUNX2 in the regulation of the gene profile of dental pulp cells and provides novel insight into the negative regulation of IGFBP2 by RUNX2.

Keywords: cleidocranial dysplasia, RUNX2, IGFBP2, IGF, dentinogenesis

#### INTRODUCTION

Cleidocranial dysplasia (CCD, OMIM #119600) is a rare autosomal dominant genetic disorder characterized by bone and tooth abnormalities. Classical skeletal abnormalities seen in CCD include wormian bones, underdeveloped or completely absent clavicles, frontal and/or parietal bossing, flat nasal bridge, hypertelorism, late closure of the sutures and fontanelles of the skull, and short stature (Mundlos, 1999). CCD dental abnormalities include delayed tooth eruption, over-retained primary teeth, and supernumerary teeth.


CCD is mainly caused by mutations in the transcription factor RUNX2 (runt-related transcription factor 2): ∼70% of CCD cases are due to heterozygous RUNX2 mutations, less than 10% are due to RUNX2 copy number variations, and the remaining cases are of unknown etiology. RUNX2, one of three members of the RUNX gene family, is a master regulator important for osteoblast differentiation, chondrocyte maturation, skeletal morphogenesis, and odontogenesis (Cohen, 2009, 2013). RUNX2 possesses several active domains, including the transactivation domains, glutamine/alanine-rich domain, runt homology domain, nuclear localization signal, proline/serine/threonine-rich domain, nuclear matrix targeting signal, repression domain, and VWRPY region (Vimalraj et al., 2015). In response to various stimuli, RUNX2 interacts with different proteins, resulting in positive or negative regulation of its target genes. Mutations in different domains may distinctly affect RUNX2 function in the transcriptional regulation of its target genes, which is reflected in the different phenotypes seen in CCD.

Tooth development is a very complex process that involves many transcription factors and signaling networks to ensure an ordered and controlled development of tooth germs and dentition. RUNX2 is reported to be important for tooth formation and to be involved in the regulation of dentin sialophosphoprotein (Chen et al., 2005), one of the principal proteins of the dentin extracellular matrix. Furthermore, RUNX2 regulates the alveolar remodeling process essential for tooth eruption and may play a role in the maintenance of the periodontal ligament (Camilleri and McDonald, 2006). Dental abnormalities seen in CCD patients may be a direct result of RUNX2 dysfunction in tooth-forming cells. Thus, it is necessary to comprehensively identify the targets of RUNX2 in dental cells. In this study, we analyzed the global transcriptomic profile of dental pulp cells isolated from a patient (CCD-011) with a novel heterozygous microdeletion encompassing the entire RUNX2 locus and a segment of SUPT3H in comparison with age- and sex-matched pulp cells (Greene et al., 2018). Over 25,000 genes involved in important biological pathways were evaluated using next-generation RNA sequencing in order to identify novel RUNX2 target genes. For the first time, we identified IGFBP2 as a direct target of RUNX2 in dental pulp cells.

#### MATERIALS AND METHODS

#### Cell Culture

Human study protocols and patient consents were reviewed and approved by the Institutional Review Board at the University of Alabama at Birmingham. CCD-011 dental pulp cells and age- and sex-matched control dental pulp cells from a non-affected healthy individual were grown in αMEM supplemented with 10% FBS, ascorbic acid (50 µg/ml), penicillin (100 U/ml), and streptomycin (100 µg/ml) at 37°C with 5% CO<sub>2</sub>. For cell differentiation, cells were cultured in the growth medium above supplemented with 10 mM β-glycerophosphate for the indicated time.

## Next Generation RNA Sequencing

Total RNA from CCD-011 and control pulp cells was isolated using the RNeasy Mini Kit (Qiagen) according to the manufacturer's protocols. Next generation RNA sequencing (RNA-Seq) was performed at the Heflin Center for Genomic Sciences Genomic Core. Briefly, mRNA sequencing was performed on the Illumina HiSeq2000 platform. Total RNA was assessed using the Agilent 2100 Bioanalyzer, followed by two rounds of poly A+ selection and conversion to cDNA. Libraries were prepared with the TruSeq RNA Library Prep Kit according to the manufacturer's instructions (Illumina, San Diego, CA, United States). Library construction consisted of random fragmentation of the polyA mRNA, followed by cDNA production using random primers. The ends of the cDNA were repaired and A-tailed, and adaptors were ligated for indexing (up to 12 different barcodes per lane) during the sequencing runs. The cDNA libraries were quantitated using quantitative PCR in a Roche LightCycler 480 with the Kapa Biosystems kit for library quantitation (Kapa Biosystems, Woburn, MA, United States) prior to cluster generation. Cluster generation yielded approximately 725K–825K clusters/mm<sup>2</sup>. Cluster density and quality were determined during the run after the first base addition parameters were analyzed. Paired-end 2 × 50 bp sequencing runs were performed, and the resulting cDNA sequences were aligned to the reference genome. Because CCD samples with allelic RUNX2 deletion are very rare, only one sample (CCD-011) has been collected so far, and it was used in this experiment. The data have been deposited in the sequence read archive<sup>1</sup>.

## RNA-Seq Data Analysis

Image files from the sequencer were converted to raw sequence FASTQ files using the Illumina compute server running CASAVA version 1.8.2. The quality of the FASTQ files was checked with FastQC version 0.10.1, which showed that no trimming or removal of poor-quality sequences was needed. Alignment of the raw sequence reads to the UCSC human hg19 reference genome was performed using TopHat version 2.0.9 with the following parameters: --library-type fr-unstranded; -r 150. Transcript abundances were calculated using Cufflinks version 2.1.1 with the following parameters: -g; -b; -u. Cuffmerge was then used to merge the two transcript abundance files, followed by pairwise differential expression analysis with Cuffdiff (parameters used: -u; -b).
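The alignment and quantification steps above can be sketched as command lines. The snippet below only assembles and prints the commands using the parameters reported in this section; it does not run them. All file and directory names (the Bowtie index prefix, FASTQ, GTF, and FASTA paths, and the output folders) are placeholders, and it is an assumption that -g and -b were given a reference annotation and a genome FASTA, respectively, since the text does not list their arguments.

```python
import shlex

# Placeholder paths; none of these names come from the paper.
INDEX = "hg19/genome"        # Bowtie2 index prefix for UCSC hg19
GTF = "hg19/genes.gtf"       # reference annotation (assumed argument of -g)
FASTA = "hg19/genome.fa"     # reference sequence (assumed argument of -b)
SAMPLES = {
    "control": ("control_R1.fastq", "control_R2.fastq"),
    "CCD011": ("ccd011_R1.fastq", "ccd011_R2.fastq"),
}

def build_commands(samples=SAMPLES):
    """Return the TopHat, Cufflinks, Cuffmerge and Cuffdiff commands as argument lists."""
    cmds = []
    for name, (r1, r2) in samples.items():
        # Align paired-end reads with the reported TopHat parameters.
        cmds.append(["tophat", "--library-type", "fr-unstranded", "-r", "150",
                     "-o", f"tophat_{name}", INDEX, r1, r2])
        # Estimate transcript abundances with the reported Cufflinks parameters.
        cmds.append(["cufflinks", "-g", GTF, "-b", FASTA, "-u",
                     "-o", f"cufflinks_{name}", f"tophat_{name}/accepted_hits.bam"])
    # assemblies.txt would list each cufflinks_<sample>/transcripts.gtf, one per line.
    cmds.append(["cuffmerge", "-o", "merged", "assemblies.txt"])
    bams = [f"tophat_{name}/accepted_hits.bam" for name in samples]
    # Pairwise differential expression with the reported Cuffdiff parameters.
    cmds.append(["cuffdiff", "-u", "-b", FASTA, "-o", "cuffdiff_out",
                 "merged/merged.gtf", *bams])
    return cmds

for cmd in build_commands():
    print(shlex.join(cmd))
```

Keeping the commands as argument lists makes it straightforward to pass them to a job scheduler or to `subprocess.run` once real paths are substituted.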

#### Ingenuity Pathway Analysis

The gene lists from Cuffdiff were uploaded to Ingenuity Pathway Analysis and the Core Analysis was used to identify significant interactions, downstream effects, and pathways.

#### Real-Time PCR

Total RNA was isolated from the indicated cells as described above. Single-strand cDNA was synthesized using the High-Capacity cDNA Reverse Transcription Kit (Applied Biosystems, Foster City, CA, United States). Primer sets for IGFBP2, GREM1 (Gremlin 1, DAN Family BMP Antagonist), BARX1 (BarH-like homeobox 1), and ALPL (alkaline phosphatase, liver/bone/kidney) were purchased from Integrated DNA Technologies (IDT, Coralville, IA, United States). Other primer sets were synthesized by IDT as follows: TFAP2A (transcription factor AP-2 alpha)-For: 5′-GAGCCATGGCACGCACGAGACGGTATCTA-3′, TFAP2A-Rev: 5′-GAGCTCGAGCTCGCAGTCCTCGTACTTGA-3′; LYPD6B (LY6/PLAUR Domain Containing 6B)-For: 5′-GTTTCCTGACCCGTGAAATG-3′, LYPD6B-Rev: 5′-GTCCCGTCCAGATGTTGG-3′; RUNX2-For: 5′-TTACTTACACCCCCCAGTC-3′, RUNX2-Rev: 5′-CACTCTGGCTTTGGGAAGAG-3′; and endogenous control GAPDH-For: 5′-AGGTCGGAGTCAACGGATTTG-3′, GAPDH-Rev: 5′-GGGGTAACTGTGCCTATTCG-3′. Quantitative PCR using SYBR Green SuperMix (Qiagen) was performed as previously described (Lu et al., 2014). The level of mRNA expression was measured using the threshold cycle (CT) according to the ΔΔCT method (Livak and Schmittgen, 2001).

<sup>1</sup>https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE104527
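As a worked illustration of the threshold-cycle calculation of Livak and Schmittgen (2001), the sketch below normalizes a target gene to GAPDH in each sample and expresses the CCD-011 level relative to control as 2 to the power of minus ΔΔCT. The CT values are invented for the example, and the function assumes roughly 100% amplification efficiency for both primer pairs.

```python
def fold_change_ddct(ct_target_test: float, ct_ref_test: float,
                     ct_target_ctrl: float, ct_ref_ctrl: float) -> float:
    """Relative expression (test vs. control) by the 2^-ddCT method."""
    d_ct_test = ct_target_test - ct_ref_test   # normalize test sample to the reference gene
    d_ct_ctrl = ct_target_ctrl - ct_ref_ctrl   # normalize control sample to the reference gene
    dd_ct = d_ct_test - d_ct_ctrl
    return 2.0 ** (-dd_ct)

# Hypothetical mean CT values: target gene and GAPDH in CCD-011 and control cells.
fc = fold_change_ddct(ct_target_test=24.0, ct_ref_test=18.0,
                      ct_target_ctrl=26.0, ct_ref_ctrl=18.0)
print(f"Relative expression: {fc:.1f}-fold")
```

A target that crosses threshold two cycles earlier in the test sample, at equal GAPDH CT, therefore corresponds to a fourfold higher expression.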

## Establishment of CCD-011 Dental Pulp Cells With Stable Over-Expression of RUNX2

Recombinant pLenti-RUNX2 was custom generated by VectorBuilder (Santa Clara, CA) by inserting full-length normal human RUNX2 into an empty lentiviral expression vector. Lentiviruses were prepared by transfecting the expression vector, either pLenti-RUNX2 or the empty lentiviral vector, together with packaging vectors (pMD2.G and psPAX2; Addgene, Cambridge, MA, United States) into HEK-293T cells. The viruses were concentrated and titered. CCD-011 dental pulp cells were infected with empty lentivirus (lenti-Con) or recombinant lentivirus (lenti-RUNX2) at an MOI of 0.5. Cells were further selected using G418 (20 µg/ml) for 1 week.

#### Luciferase Reporter Assay

A 1.5 kb fragment of the proximal IGFBP2 promoter was amplified with the following primer set: IGFBP2-For: 5′-CAGGGTACCCTGTGCCCTTGCTAACCGCCCATTTC-3′, IGFBP2-Rev: 5′-CAGGCTAGCCGGGTCCTAAGGGCCGGCTTCTCC-3′ (containing the KpnI site GGTACC and the NheI site GCTAGC, respectively). The DNA fragment was inserted into the luciferase reporter vector pGL4.18 [luc2P/Neo] (Promega Corporation, Madison, WI, United States) using the KpnI and NheI sites (IGFBP2-pGL4.18). For the luciferase reporter assay, 5 × 10<sup>4</sup> HEK293T cells were cultured in 12-well plates overnight. The next day, cells were transfected with 1 µg of lentiviral empty vector or lenti-RUNX2 together with 20 ng of IGFBP2-pGL4.18 and Renilla luciferase pGL4.74 [hRluc/TK] plasmid (Promega, Madison, WI, United States) using PolyJet™ In Vitro DNA Transfection Reagent (SignaGen Laboratories, Rockville, MD, United States). At 48 h post transfection, luciferase activity in each well was measured with the Dual-Luciferase® Reporter Assay System (Promega) and normalized to Renilla.
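The normalization described above amounts to dividing firefly by Renilla luminescence in each well and then scaling to the mean of the empty-vector wells. A minimal sketch, with made-up luminescence readings and hypothetical function and variable names:

```python
from statistics import mean

def relative_luciferase(wells, control_wells):
    """Firefly/Renilla ratio per well, expressed relative to the mean control ratio."""
    control_mean = mean(firefly / renilla for firefly, renilla in control_wells)
    return [(firefly / renilla) / control_mean for firefly, renilla in wells]

# Hypothetical (firefly, Renilla) readings in arbitrary luminescence units.
empty_vector = [(1000.0, 500.0), (1200.0, 600.0)]
lenti_runx2 = [(450.0, 450.0), (500.0, 500.0)]

print(relative_luciferase(lenti_runx2, empty_vector))
```

Dividing by Renilla corrects for well-to-well differences in transfection efficiency and cell number before the two conditions are compared.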

#### Immunohistochemistry

Histological sections (5 µm) of postnatal day 5 mouse tissue samples were prepared for immunohistochemistry as previously described (MacDougall et al., 1998). Immunostaining was performed using primary antibodies against IGFBP2 (Cell Signaling Technologies, Danvers, MA, United States). After deparaffinization, heat-induced citrate antigen retrieval, and blocking with 3% goat serum, 1% bovine serum albumin, and 0.5% Tween in phosphate-buffered saline for 20 min at room temperature, mouse tooth sections were incubated with primary antibody for 1 h at room temperature, followed by horseradish peroxidase (HRP) poly conjugate for 10 min and 3,3′-diaminobenzidine for the optimal time. Images were visualized and captured with a Nikon Eclipse TE2000-E microscope (Nikon Instruments, Melville, NY, United States).

## RESULTS

## Dysregulation in Gene Expression in CCD-011 Dental Pulp Cells

In order to identify novel RUNX2 target genes involved in tooth formation and signaling pathways that potentially contribute to the dental phenotypes seen in CCD, next-generation RNA sequencing was performed on CCD-011 dental pulp cells carrying a total RUNX2 deletion in one allele and compared to age- and sex-matched control pulp cells. Of the 25,643 genes analyzed, 11,039 genes had no detectable signal in both the CCD and control samples, leaving 14,604 genes that were evaluated for differential gene expression. Among the detectable genes, 60 transcripts (0.41%) were found to be statistically significantly dysregulated, with 63% upregulated and 27% downregulated (fold change ≥2; q-value < 0.05). The top 10 genes with differential change are listed in **Table 1**. Analysis of isoform differential expression revealed 35% downregulated and 65% upregulated transcripts. The isoforms with a twofold statistically significant change are listed in Supplementary Table 1. Ingenuity Pathway Analysis revealed top up- and downregulated genes delineating multiple putative RUNX2 target genes both upstream and downstream (Supplementary Figures 1, 2).
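The selection criteria above (fold change ≥ 2 in either direction, q-value < 0.05) can be expressed as a simple filter. The records below are invented and the field names are hypothetical; they do not reproduce the Cuffdiff output format. Skipping genes that are undetected in either sample is a simplification made for this sketch, because the ratio is undefined in that case.

```python
import math

def significant_genes(records, min_fold=2.0, max_q=0.05):
    """Split genes into up- and downregulated lists using fold-change and q-value cutoffs."""
    up, down = [], []
    cutoff = math.log2(min_fold)
    for rec in records:
        # Skip genes undetected in either sample; the ratio is undefined there.
        if rec["control"] <= 0 or rec["ccd"] <= 0:
            continue
        log2_fc = math.log2(rec["ccd"] / rec["control"])
        if rec["q"] < max_q and abs(log2_fc) >= cutoff:
            (up if log2_fc > 0 else down).append(rec["gene"])
    return up, down

# Hypothetical expression values and q-values.
example = [
    {"gene": "GENE_A", "control": 10.0, "ccd": 40.0, "q": 0.001},  # 4-fold up
    {"gene": "GENE_B", "control": 20.0, "ccd": 5.0, "q": 0.01},    # 4-fold down
    {"gene": "GENE_C", "control": 10.0, "ccd": 15.0, "q": 0.001},  # below fold cutoff
    {"gene": "GENE_D", "control": 10.0, "ccd": 40.0, "q": 0.20},   # not significant
    {"gene": "GENE_E", "control": 0.0, "ccd": 0.0, "q": 1.0},      # undetected
]
print(significant_genes(example))
```

Working on the log2 scale treats a twofold increase and a twofold decrease symmetrically, which a raw ratio cutoff would not.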

Among the 60 genes identified by RNA-Seq, six genes (**Figure 1A**), TFAP2A, GREM1, BARX1, ALPL, LYPD6B, and IGFBP2, have been previously reported to be associated with osteogenesis and/or craniofacial development (Eckstein et al., 2002; Stoetzel et al., 2009; Tekin et al., 2009; Chung et al., 2012; Leijten et al., 2013; Nichols et al., 2013; Gasque et al., 2015). Their dysregulation in CCD-011 pulp cells was further confirmed by qPCR (**Figure 1B**). Four genes, TFAP2A, GREM1, BARX1, and ALPL, have been reported to be involved in craniofacial and/or tooth development (Milunsky et al., 2008; Mitsiadis and Drouin, 2008; Nagatomo et al., 2008; Li et al., 2013; Liu et al., 2014). However, the remaining two genes, LYPD6B and IGFBP2, have not been described in tooth development.

## RUNX2 Introduction Partially Rescued the Dysregulated Genes in CCD-011 Pulp Cells

To investigate the role of RUNX2 in the regulation of target genes involved in CCD, CCD-011 pulp cells were transduced with lenti-RUNX2 to introduce full-length human RUNX2 (**Figure 2A**). Cells transduced with empty lentivirus were used as a control. As seen in **Figure 2B**, the level of RUNX2 was almost fivefold higher in CCD-011 pulp cells transduced with lenti-RUNX2 than in cells transduced with empty lentivirus. To investigate whether RUNX2 introduction could rescue the dysregulation of the six genes identified by RNA-Seq in CCD-011 pulp cells, their expression was quantitatively analyzed. As seen in **Figure 2C**, the expression of TFAP2A, LYPD6B, and ALPL was increased in CCD-011 pulp cells after introducing normal human RUNX2, indicating positive regulation of their expression by RUNX2. This finding also suggests that their upregulation in CCD-011 dental cells seen in **Figure 1** may be due to individual variation. However, the expression of IGFBP2, GREM1, and BARX1 was downregulated in CCD-011 pulp cells after normal RUNX2 introduction. Under differentiation conditions, IGFBP2 expression was further downregulated by RUNX2 overexpression, in contrast to the upregulation of ALPL (**Figure 2D**). Taken together, these findings indicate that introduction of the normal human RUNX2 gene into CCD-011 pulp cells can partially rescue the dysregulation of gene expression in these cells.

TABLE 1 | Top 10 up- or downregulated genes in CCD-011 dental pulp cells in comparison to control cells.

## IGFBP2 Is a Direct Target of RUNX2

IGFBP2 is a highly conserved family of six IGFBPs that circulate in serum and local biological fluid at relatively high concentrations. IGFBP2 serves as carrier proteins through high binding to IGFs and regulates their bioactivity (Ehrenborg et al., 1991; DeMambro et al., 2008). To determine if RUNX2 directly regulates IGFBP2, the human IGFBP2 promoter was first analyzed for putative RUNX2 binding sites based on previous reports (Ducy et al., 1997; Chen et al., 2002). As shown in **Figure 3A**, in the 1.45 kb segment of the proximal promoter, there are total of four potential RUNX2 binding sites. This promoter fragment was then amplified and inserted into luciferase reporter vector (IGFBP2-pGL4.18). HEK293T cells were transfected with lentiviral empty vector or lenti-RUNX2 together with luciferase IGFBP2-pGL4.18 and Renilla vector for

luciferase assay. As shown in **Figure 3B**, luciferase activity was significantly lower in HEK293T cells with lenti-RUNX2 than in cells with lentiviral empty vector, demonstrating that IGFBP2 is a direct target of RUNX2.
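The reporter readout described above can be illustrated with a minimal sketch. It assumes the common dual-reporter convention in which the firefly signal of each well is divided by the Renilla signal of the same well and the result is expressed relative to the empty-vector control; this section does not spell out the calculation, and all readings, function names, and variable names below are hypothetical.

```python
from statistics import mean


def relative_luciferase(firefly, renilla):
    """Normalize each well's firefly signal to its Renilla signal."""
    if len(firefly) != len(renilla):
        raise ValueError("firefly and renilla need one reading per well")
    return [f / r for f, r in zip(firefly, renilla)]


def fold_change(treated_ratios, control_ratios):
    """Mean normalized activity of treated wells relative to control wells."""
    return mean(treated_ratios) / mean(control_ratios)


# Hypothetical luminescence readings (arbitrary units), triplicate wells.
control_firefly = [1200.0, 1100.0, 1300.0]   # empty lentiviral vector
control_renilla = [400.0, 400.0, 400.0]
runx2_firefly = [600.0, 500.0, 700.0]        # lenti-RUNX2
runx2_renilla = [400.0, 400.0, 400.0]

control = relative_luciferase(control_firefly, control_renilla)
runx2 = relative_luciferase(runx2_firefly, runx2_renilla)
fc = fold_change(runx2, control)
print(f"Reporter activity, lenti-RUNX2 relative to empty vector: {fc:.2f}")
```

A value below 1 in such a calculation corresponds to repression of the reporter, which is the direction of effect reported for the IGFBP2 promoter construct.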

cells with Lenti-Empty. Data were presented as mean+SD from one representative of two independent experiments.

## Localization and Expression of IGFBP2 in Mouse Molars

To determine the protein expression of IGFBP2 in the dental tissue during tooth development, postnatal day-1 and -5 mouse tooth sections were analyzed by immunohistochemistry. IGFBP2 was detected in skeletal muscle, alveolar bone, the odontoblasts, throughout the dental pulp, and within the oral epithelium (**Figure 4**). Staining was also seen within the stellate reticulum region associated with blood vessels. No staining was detectable within the epithelial components of the tooth including the ameloblasts, stellate reticulum, and outer enamel epithelium. These findings strongly suggest IGFBP2 may play a functionally important role during odontogenesis associated with dentinogenesis.

## DISCUSSION

CCD is primarily caused by RUNX2 functional alteration due to different mutations or copy number variations. Although extensive efforts have been made in the past decades, it remains unclear how RUNX2 is involved in the pathogenesis of CCD. In this study, using RNA-Seq, we systematically analyzed the gene profiles of pulp cells from one rare CCD patient with an allelic loss of total RUNX2 and pulp cells from one sex- and age-matched non-affected individual. We found that many genes associated with osteogenesis or dentinogenesis are dysregulated in CCD cells due to RUNX2 haploinsufficiency. For the first time, we found that IGFBP2 was a direct target of RUNX2 and was increased in CCD pulp cells, indicating its potential role in the pathogenesis of CCD.

RUNX2 is the master transcription factor involved in bone formation through regulation of the expression of many bone matrix genes including osteocalcin, bone sialoprotein, osteopontin, and collagen I. In contrast to the relatively well-characterized functional role of RUNX2 in skeletal bone development and maintenance, it remains unclear how RUNX2 is involved in tooth formation. Importantly, RUNX2 may play different roles in tooth formation in humans and mice. Animal studies showed that tooth development was arrested at the late bud stage in RUNX2 null mice and was normal in heterozygous RUNX2 mutant mice (D'Souza et al., 1999; Adhami et al., 2015), in contrast to the supernumerary teeth seen in CCD patients with RUNX2 mutations or copy number variations. The CCD-011 dental pulp cells with allelic deletion of total RUNX2 provided a useful cell tool to investigate how RUNX2 deficiency affects downstream targets and contributes to the dentinogenesis dysregulation related to CCD. Using RNA-Seq, we comprehensively analyzed and

assay revealed that IGFBP2 was negatively regulated by RUNX2. HEK293T cells in 24-well plates were transfected with empty lentivirus (Con) or lenti-RUNX2 (RUNX2) together with IGFBP2 luciferase reporter vector and Renilla vector, followed by measurement of luciferase activity by the method detailed in the Section "Material and Methods". <sup>∗</sup>P < 0.01 by Student's t-test between two groups. Data were presented as mean+SD from one representative of two independent experiments.

compared the different gene profiles between CCD-011 and control dental pulp cells. In this study, we found that the expression of numerous genes in CCD-011 pulp cells was up- or downregulated. These genes are associated with a number of biological processes including DNA replication, recombination and repair, cellular movement, assembly and organization, and organ morphology (**Table 1**, Supplementary Table S1, and Supplementary Figures S1, S2), suggesting a potentially critical role in tooth development.

IGF-1 (insulin-like growth factor 1) and IGF-2 are part of a complex system and are involved in various physiological and pathological processes, including cell differentiation and proliferation, morphogenesis, growth, metabolism, and carcinogenesis (Aguirre et al., 2016; Kasprzak et al., 2017; Mathew et al., 2017; Takahashi, 2017). Accumulating evidence from in vitro and in vivo studies demonstrates that IGFs are critical in the regulation of odonto/osteogenic differentiation and subsequent tooth/bone formation (Young, 1995; Mohan and Kesavan, 2012; Ma et al., 2016; Matsumura et al., 2017). Since IGFBPs can bind to IGFs to inhibit or potentiate their bioactivity, it is conceivable that IGFBP2 dysregulation could affect IGF signaling in the development and maintenance of skeletal and dental tissues, contributing to tooth/bone disorders. The expression patterns and functional role of IGFBP2 in skeletal tissues have been reported. IGFBP2 levels decline during neonatal and pubertal growth and increase with advancing age in humans. IGFBP2 generally inhibits IGF action when added to osteoblastic cells in culture (Conover, 2008). Transgenic mice with overexpression of IGFBP2 exhibit skeletal deficiencies (Eckstein et al., 2002). Clinical studies revealed a potentially deleterious role of IGFBP2 on bone density in aging men and women (Amin et al., 2004). However, the effect of IGFBP2 deficiency on bone development is controversial and may be influenced by gender, age, and other factors (DeMambro et al., 2008). The functional role of IGFBP2 in odontogenic differentiation and tooth development has only recently been reported. Studies in human dental pulp cells (DPCs) showed that IGFBP2 was expressed in DPCs, was upregulated during odontogenic differentiation, and coordinately regulated IGF-1-induced matrix mineralization with IGFBP-3 (Alkharobi et al., 2016). However, Kim et al. (2012) reported that IGFBP2 was highly expressed on

FIGURE 4 | IGFBP2 expression in mouse dental tissues, alveolar bone and skeletal muscle. IGFBP2 expression in dental tissue from day 1 (A–C) and day 5 (D–F) postnatal mouse revealed by immunohistochemical staining. Odontoblast (Od), dental pulp (DP), stellate reticulum (SR), ameloblast (Am), dentino-enamel junction (DEJ), first molar (M1).

dental epithelium at the initiation stage but declined at the bell stage of tooth development, and suggested a negative role of IGFBP2 in tooth development. Our current studies showed that IGFBP2 was increased in CCD dental pulp cells with a reduction in ALP expression, indicating a potential negative role of IGFBP2 in extracellular matrix mineralization. Further studies, such as those using transgenic mice with IGFBP2 overexpression or deficiency, are necessary to determine the actual role of IGFBP2 in odontogenic differentiation and tooth development.

Although RUNX2 has been implicated in both positive and negative regulation of gene expression, this is the first report of transcriptional regulation of IGFBP2 by RUNX2. The expression and activity of RUNX2 are affected by a diversity of signaling pathways, which involve extracellular matrix proteins, cell-surface integrins, and growth factors. Previous studies reported that RUNX2 expression, at both the mRNA and protein levels, is regulated by IGF-1 signaling (Sun et al., 2001). IGF-1 also mediates endogenous RUNX2 activity through a phosphatidylinositol 3-kinase/ERK-dependent and Akt-independent signaling pathway (Qiao et al., 2004). Since IGFBP2 is negatively regulated by RUNX2, the IGFs-RUNX2-IGFBP2 axis may play an important physiopathological role in tooth/bone development and disorders.

In summary, our studies have revealed a critical role of RUNX2 in the regulation of gene expression pattern associated with CCD dental pulp cells. Importantly, our studies demonstrate that RUNX2 is a negative regulator of IGFBP2, which may be involved in the pathogenesis of human CCD with haploinsufficiency in RUNX2.

## AUTHOR CONTRIBUTIONS

SG performed the experiments, analyzed the data, and wrote the draft of the manuscript. OM participated in the experiments and the analysis of the data. DC performed the RNA-Seq analysis of the pathways. CL designed and performed the experiments, analyzed the data and critically revised the manuscript. MM conceptualized the project and provided resources for project completion. All authors read and approved the final manuscript.

## FUNDING

This study is supported by the Dentist Academic Research Training (DART) Program (T90 DE022736), National Institute of Dental and Craniofacial Research, National Institutes of Health.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2018.00178/full#supplementary-material



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Greene, Mamaeva, Crossman, Lu and MacDougall. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Pre-clinical Models for Malignant Mesothelioma Research: From Chemical-Induced to Patient-Derived Cancer Xenografts

Noushin Nabavi<sup>1,2</sup>, Jingchao Wei<sup>1,3</sup>, Dong Lin<sup>1,2</sup>, Colin C. Collins<sup>1</sup>, Peter W. Gout<sup>2</sup> and Yuzhuo Wang<sup>1,2</sup>\*

<sup>1</sup> Department of Urologic Sciences, Vancouver Prostate Centre, University of British Columbia, Vancouver, BC, Canada, <sup>2</sup> Department of Experimental Therapeutics, BC Cancer Research Centre, Vancouver, BC, Canada, <sup>3</sup> Department of Urology, the Third Xiangya Hospital, Central South University, Changsha, China

Malignant mesothelioma (MM) is a rare disease often associated with environmental exposure to asbestos and other mineral fibers such as erionite. MM has a long latency period prior to manifestation and a poor prognosis. The survival post-diagnosis is often less than a year. Although use of asbestos has been banned in the United States and many European countries, asbestos is still being used and extracted in many developing countries. Occupational exposure to asbestos, mining, and migration are reasons to expect a continued rise in the incidence of mesothelioma in the coming decades. Despite improvements in survival achieved with multimodal therapies and cytoreductive surgeries, less morbid, more effective interventions are needed. Thus, identifying prognostic and predictive biomarkers for MM, and developing novel agents for targeted therapy, are key unmet needs in mesothelioma research and treatment. In this review, we discuss the evolution of pre-clinical model systems developed to study MM and emphasize the remarkable capability of patient-derived xenograft (PDX) MM models in expediting the pre-clinical development of novel therapeutic approaches. PDX disease model systems retain major characteristics of original malignancies with high fidelity, including molecular, histopathological and functional heterogeneities, and as such play major roles in translational research, drug development, and precision medicine.

Keywords: rare diseases, genomics, patient-derived xenografts, malignant mesothelioma, pre-clinical cancer research, drug development

#### INTRODUCTION

Malignant mesothelioma (MM) is a rare disease that occurs infrequently in the general population, typically affecting fewer than 3,000 patients in North America (Bianchi and Bianchi, 2014). The pleural form, affecting the lining of the chest cavity and lungs, is often referred to as a man-made disease due to the high correlation of its incidence with exposure to asbestos. The rarer form affecting the abdominal cavity, i.e., peritoneal mesothelioma (PeM), is more common in women and is often subject to incorrect diagnosis (Alakus et al., 2015; Shin and Kim, 2016). Additionally, aside from a few large-scale studies on pleural mesothelioma (PM) (Zhang et al., 2015; Bueno et al., 2016; Joseph et al., 2017), PeM remains largely unexplored. Like many known rare diseases, mesotheliomas have no approved targeted therapy and cisplatin-pemetrexed chemotherapy remains the standard of care (van Zandwijk et al., 2013). MM is frequently acute and

#### Edited by:

Amritha Jaishankar, Rare Genomics Institute, United States

#### Reviewed by:

Theodora Katsila, University of Patras, Greece Vita Dolzan, University of Ljubljana, Slovenia

> \*Correspondence: Yuzhuo Wang ywang@bccrc.ca

#### Specialty section:

This article was submitted to Genetic Disorders, a section of the journal Frontiers in Genetics

Received: 05 March 2018 Accepted: 11 June 2018 Published: 04 July 2018

#### Citation:

Nabavi N, Wei J, Lin D, Collins CC, Gout PW and Wang Y (2018) Pre-clinical Models for Malignant Mesothelioma Research: From Chemical-Induced to Patient-Derived Cancer Xenografts. Front. Genet. 9:232. doi: 10.3389/fgene.2018.00232


life-threatening, with survival of less than 1 year in the majority of cases. While large numbers of people may have been exposed to asbestos through occupational or domestic exposure, significantly smaller numbers go on to develop mesothelioma (Carbone et al., 2012), suggesting the involvement of genomic predispositions in disease development. For instance, recent genomic profiling of mesotheliomas shows that mutations in the BAP1 gene render its protein product inactive, and are correlated with MM and uveal melanoma incidence (Testa et al., 2011; Alakus et al., 2015; Ji et al., 2016). Although more research is needed to understand other genetic links to MM tumorigenesis, progress is hindered by the paradox inherent to rare diseases and by a lack of funding, disease model systems, and research resources. Next-generation sequencing technologies (ChIP-Seq, RNA-Seq, DNA-Seq, and Proteome-Seq) applied to patient-derived cell and animal models in rare disease research are becoming key avenues for identifying the underlying etiology of the disease. Here we review the past and current pre-clinical models in MM research (see **Supplementary Table S1**) and address some of the challenges, limitations, and opportunities that can advance its status quo.

## HISTORICAL DEVELOPMENT OF MM MODELS THROUGH CHEMICAL INDUCTION AND GENE MODIFICATION

It is well established that chronic exposure to asbestos induces development of human pleural mesothelial cells with cancer-like properties (Lohcharoenkal et al., 2013). Clinically, it has been demonstrated that exposure to asbestos causes many lung diseases such as asbestosis, MM, and lung cancer due to the generation of chromosomal damage and DNA aberrations (Nymark et al., 2007). Historically, to study tumorigenesis of MM, animal and cell models were induced through exposure to varying doses and sizes of asbestos fibers (Whitaker et al., 1984; Topov and Kolev, 1987; Davis et al., 1992; Pass and Mew, 1996), either by intrapleural or intraperitoneal injection of asbestos fibers into laboratory rats, mice, or hamsters or by incubation of normal mesothelial cell lines with the fibers. Potential MM models would eventually manifest following long latency periods of approximately 7 months for mice, 12 months for rats, and years for primates (Suzuki, 1991). Although these models are difficult to develop, they are ideal platforms for testing and selecting new combinations or targeted therapies, or studying de novo carcinogenic pathways.

Prior to the turn of this century, Simian virus 40 (SV40) was another agent widely studied as an inducer of MM (Testa et al., 1998; Bocchetta et al., 2000). Although it remains controversial whether SV40 contributes to the development of mesothelioma as a causative factor (Hubner and Van Marck, 2002; López-Ríos et al., 2004), its role as a cofactor with asbestos has been established in animal models. Interestingly, some studies showed that SV40 rendered animals more susceptible to asbestos-related carcinogenesis (Kroczynska et al., 2006; Robinson et al., 2006), while asbestos was also reported to promote SV40 infection of cells (Appel et al., 1988).

Following chemical induction of MM, novel genetic models were generated to understand genomic predispositions to this malignancy independent of exposure to asbestos (Jongsma et al., 2008). Both knock-out and knock-in animal models are meaningful steps forward in research and are particularly useful for showing the potential importance of a single gene in disease progression. Well-established genetic alterations associated with MM include loss of p16INK4A, p14ARF, Nf2, p53 and possibly Rb (Cheng et al., 1994; Bianchi et al., 1995; Mor et al., 1997; Papp et al., 2001). Additional studies showed that Nf2 is one of the most frequently mutated tumor suppressor genes in PeM (Sekido et al., 1995), and that asbestos-exposed Nf2 knockout mice exhibited accelerated MM tumor formation (Altomare et al., 2005). To demonstrate the powerful effect of Nf2 deficiency in inducing MM, Nf2-deficient mice were crossed with either Ink4a/Arf-deficient or p53-deficient mice; in the absence of any exposure to asbestos, invasive pleural mesothelioma developed at high incidence and with short median survival (Altomare et al., 2005; Jongsma et al., 2008). Combined genomics studies further showed that MM tumors have frequent hypermethylations or deletions at the Cdkn2a/Arf and Cdkn2b gene loci (Kane, 2006). In summary, in combination with patient-derived xenografts, these models are invaluable systems for studying chronic and systemic effects of gene aberration burden in MM development and deciphering clear linkages between asbestos exposure and genetic predisposition.

## PATIENT-DERIVED CELL MODELS OF MM ACCELERATING RESEARCH AND DEVELOPMENT

Patient-derived cell lines in MM have served as impactful tools for profiling gene expression, uncovering new asbestos-associated genes and pathways, and identifying chromosomal regions that contribute to asbestos and therapy responses. Common chromosomal abnormalities, such as deletions, of chromosomes 1, 3, 4, 9, 11, 14 and 22, have been identified in patient-derived cell lines of MM (Popescu et al., 1988; Taguchi et al., 1993; Lee et al., 1996). Additionally, the identification of asbestos-affected genetic pathways such as integrin-mediated signaling pathways, MAPK pathways, and NFKB/IKB pathways (Ramos-Nino et al., 2003) can be attributed to advances brought about by patient-derived cell lines.

These developments started as early as 1982 with a study that reported a first-in-field in vitro patient-derived mesothelioma cell line generated from the abdominal fluid of a patient diagnosed with mesothelioma. It was shown that this cell line stably yielded MM up to 100 passages (Behbehani et al., 1982). Subsequently, an H-MESO-1 cell line was derived from a 35-year-old male diagnosed with MM; it was capable of growing both as nodules and as ascitic fluid with peritoneal seeding and diffuse peritoneal thickening, strongly mimicking the growth pattern of this tumor type in humans (Reale et al., 1987). Subsequently, a panel of 17 human MM cell lines was derived from 61 patients (46 effusions, 9 biopsies, and 6 tumors obtained at autopsy) and 5 of these cell lines were characterized to closely recapitulate human disease

(Wu et al., 1985; Versnel et al., 1989; Tange et al., 1995). Interestingly, Ishiwata et al. (2003) derived a cell line termed HMMME from the pleural fluid of an MM case; it grew well both in vitro and in vivo, with a doubling time of 42 h, without interruption for 12 years, and was sub-cultured over 200 times. Following these advances, Usami et al. (2006) established and characterized additional malignant PM cell lines (ACC-MESO-1, ACC-MESO-4, Y-MESO-8A, and Y-MESO-8D), and detected differentially expressed genes between Y-MESO-8A and Y-MESO-8D, which were derived from the same patient. Among these four cell lines, Nf2 was found to be mutated only in ACC-MESO-1. This is an important finding, as exploring the genomic aberrations of these cells is necessary for testing potential targeted therapies and for better translating research discoveries. A search of clinicaltrials.gov for clinical trials treating NF2-mutated solid tumors in patients suggests that everolimus, an oral derivative of rapamycin in a phase 2 trial (NCT02352844), may be a potential targeted therapy to test in these cells. In another study, homozygous deletions of p16INK4A and inactivation of the p14ARF gene were found in all four cell lines. Similarly, the phase 2 clinical trial NCT02688907, recruiting small cell lung cancer patients with a p16INK4A mutation, uses AZD1775, a tyrosine kinase inhibitor. In vitro studies with this inhibitor in relevant MM cell lines could therefore accelerate pre-clinical development. Additionally, a key advancement in the field was the establishment of three PM cell lines (TCC-MESO-1, TCC-MESO-2, and TCC-MESO-3) by Yanagihara et al. (2010) from primary and metastatic tumors of a patient with epithelioid subtype and 1 line from a mixed tumor subtype (epithelioid and sarcomatoid) allowing for pathological subtype investigations both in vitro and in vivo.

Traditional cell culture technologies such as gene transfection can be widely applied to malignant cells to directly study mechanisms of pathogenesis and tumorigenesis. However, cells in multicellular spheroids can mimic resistance to drugs better than monolayer cells, as they preserve the complexity of the original tumor (Yanagihara et al., 2010). Thus, discovering genomic aberrations in these cell lines further enables the assessment and development of pre-clinical targeted therapeutics. One example of drug testing on patient-derived cell lines is a study confirming the successful response of 3D multicellular spheroids of MM (MSTO-211H) to cytotoxic paclitaxel-loaded nanoparticles (Lei et al., 2015). Despite their numerous advantages in pre-clinical research, cell lines are not without shortcomings, including an inability to precisely reflect in vivo conditions such as heterogeneity and the tumor microenvironment. Thus, they necessitate further validation in models that better mimic intratumoral parameters of human disease.
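To make the growth figures quoted in this section more concrete, the following minimal sketch converts a doubling time into an expected expansion factor. It assumes uninterrupted exponential growth, which real cultures only approximate between passages; the function names and the one-week window are illustrative choices, and only the 42 h doubling time comes from the text.

```python
def doublings(elapsed_hours, doubling_time_hours):
    """Number of population doublings in the elapsed time."""
    return elapsed_hours / doubling_time_hours


def expansion_factor(elapsed_hours, doubling_time_hours):
    """Fold increase in cell number, assuming ideal exponential growth."""
    return 2.0 ** doublings(elapsed_hours, doubling_time_hours)


DOUBLING_TIME_H = 42.0   # doubling time reported for the HMMME line
ONE_WEEK_H = 7 * 24      # illustrative observation window (hypothetical)

n_doublings = doublings(ONE_WEEK_H, DOUBLING_TIME_H)
fold = expansion_factor(ONE_WEEK_H, DOUBLING_TIME_H)
print(f"{n_doublings:.1f} doublings per week, about {fold:.0f}-fold expansion")
```

Under these idealized assumptions, a 42 h doubling time corresponds to four doublings per week, which helps explain why such a line can be sub-cultured many times over a period of years.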

## PATIENT-DERIVED ANIMAL MODELS OF MM FOR PRE-CLINICAL RESEARCH AND DEVELOPMENT

The practice of engrafting tumor fragments from patient surgical tissues or biopsies either heterotopically or orthotopically in immunodeficient mice started in the 1950s (Woolley, 1958). In heterotopic implantation, tumor fragments are placed at a site in the mouse unrelated to the original tumor site, generally the subcutaneous site or sometimes the sub-renal capsule, whereas in orthotopic implantation they are placed at the site corresponding to the tumor's tissue of origin. Both of these models are unique in answering specific questions and are invaluable tools for mesothelioma research. Subcutaneous tissue xenografts rarely produce metastasis in mice, and have engraftment success rates of 40–60%, whereas sub-renal capsule tissue xenografts maintain the original tumor stroma (at least in the first generation) as well as the host stroma and have engraftment success rates of 95% (Wang et al., 2017). Knowing this, the mesothelioma field has attempted many of these techniques with remarkable success. For instance, Arnold and colleagues first reported inoculation of mesothelioma cells into nude mice to establish an in vivo mesothelioma xenograft model (Arnold et al., 1979; Nissen et al., 1979). Later, Chahinian et al. (1980) successfully established six such xenografts by subcutaneous inoculation of fresh tumor specimens into nude mice. To investigate the suitability of MM PDX models in pre-clinical studies, tumors from 50 patients were implanted into immunodeficient mice and serially passaged for up to five generations (Wu et al., 2017). Successful PDXs formed from 20 of the 50 implanted tumors (40%), retaining both the morphology and characteristic genotypic and phenotypic markers of the primary lesion. Interestingly, PDX formation was associated with poor survival of the patients, making them ideal and replicable models to identify prognostic biomarkers and/or develop better pre-clinical therapeutic strategies.

Interestingly, PDX models derived from epithelioid and sarcomatoid pathologies of mesothelioma have similar differentiation states as the original tumors (Darai-Ramqvist et al., 2013). The sarcomatoid mesothelioma subtype presents with a faster growth rate than the epithelioid subtype in PDXs, consistent with its aggressive physiological behavior in humans (Darai-Ramqvist et al., 2013). The different growth patterns in mixed type mesotheliomas are suitably replicated in PDXs, making them invaluable models for investigating MM's cell differentiation, heterogeneity, and tumor evolution. In another study, mesothelioma cells isolated from ascites or pleural fluid of mesothelioma patients were injected into nude/SCID mice to generate PDX models. All PDXs exhibited morphologic and immunohistochemical features consistent with those of the original patients' mesothelioma cells (Kalra et al., 2015). Since these models provide biological incubators for inoculated tumors, they provide a tumor repository platform that allows deep genomic and pathological analyses. For instance, it was found in this study that BAP1 loss correlated with enhanced tumor growth. Similar to human cells, murine mesothelioma cells injected into humanized BALB/c mice allow study of tumor cell interaction with the immune system. In one study, murine mesothelioma cells responded to exogenous High Mobility Group Box 1 protein, a Damage-Associated Molecular Pattern that acts as a chemoattractant for leukocytes and as a proinflammatory mediator (Mezzapelle et al., 2016). Other malignant mesothelioma cell lines, TCC-MESO-1, TCC-MESO-2 and TCC-MESO-3, show tumorigenicity in mice after orthotopic implantation (Yanagihara et al., 2010) and allow

evaluation of anticancer agents in vivo (Opitz et al., 2007; Yanagihara et al., 2010). Thus, establishment of PDX models of MM in immunocompromised mice provides a high-fidelity model with minimal genetic drift and physiologically relevant tumor microenvironments to investigate the etiology of this malignancy and develop new therapeutic agents for MM. Here, we hope to shed light on the concept that PDXs in combination with emerging gene-editing or nano-particle therapeutic techniques are paramount to harnessing the full potential of animal models. To our surprise, however, animal studies that take into account the genomic background of MMs for targeted therapy explorations are very limited. We think that integrating knowledge of the genomic aberrations of MM models with therapeutics targeting those aberrations can be widely applied in mesothelioma research and has the potential to illuminate the value of these models in answering critical and unmet research needs. Some of the unexplored capabilities that these MM PDX models provide include expediting the evaluation of targeted therapy efficacy and accelerating the pre-clinical translation of novel therapeutic approaches from other indications to MM.
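When comparing engraftment success rates such as those quoted in this section (for example, 20 PDXs from 50 implanted tumors), the sample size matters. The following is a minimal sketch, not an analysis performed in the cited studies, that attaches a Wilson score 95% confidence interval to an observed take rate; the function name and the choice of the Wilson method are assumptions made for illustration.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score confidence interval for a binomial proportion.

    z = 1.96 corresponds to an approximate 95% interval.
    """
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need 0 <= successes <= trials and trials > 0")
    p = successes / trials
    denom = 1.0 + z * z / trials
    center = (p + z * z / (2.0 * trials)) / denom
    half_width = (z / denom) * math.sqrt(
        p * (1.0 - p) / trials + z * z / (4.0 * trials * trials)
    )
    return center - half_width, center + half_width


# Take rate from the text: 20 successful PDXs out of 50 implanted tumors.
low, high = wilson_interval(20, 50)
print(f"Observed take rate 40%, 95% CI roughly {low:.0%} to {high:.0%}")
```

With 50 implants the interval spans roughly 28% to 54%, so a single-study take rate of 40% is compatible with a fairly wide range of underlying engraftment efficiencies.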

#### CONCLUSION

Our understanding of MM biology is hindered by its slow onset, low prevalence, and the difficulty of reproducing prolonged predisposing conditions to induce lesions in model systems. While historical models have yet to faithfully recapitulate all aspects of the clinical disease, MM PDX models are remarkable systems that enable insights into the genetics of tumor initiation, growth, and metastasis. In this review, we provide an overview of the major known models in the mesothelioma field that have been instrumental in key discoveries in the past century. We also highlight unresolved questions and limitations that hamper translational progress. We argue that although PDX models come with inherent challenges such as cost, failing to graft in vitro or in vivo, or not efficiently translating to clinical protocols, they are invaluable platforms to investigate the underlying mechanisms driving tumor initiation, progression, and metastatic events, as well as therapeutic interventions. Conventional orthotopic, sub-renal, and subcutaneous transplantation models as well as cell lines remain indispensable in continuing the study of MM, and new models that can spontaneously develop mesothelioma and be used to test novel and targeted agents are a current unmet need.

## AUTHOR CONTRIBUTIONS

NN, JW, DL, PG, CC, and YW: conception. NN, JW, DL, and PG: writing, review, and/or revision of the manuscript. CC and YW: study supervision.

## FUNDING

This work was supported by the Canadian Institutes of Health Research (YW), BC Cancer Foundation Mesothelioma Research Fund/Mitacs Accelerate Postdoctoral Fellowship Fund (NN, YW, and CC), and the Terry Fox New Frontiers Program on Prostate Cancer Progression (CC and YW). JW's work is supported by a visiting scholarship from the China Scholarship Council (#201706370135).

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2018.00232/full#supplementary-material

TABLE S1 | Summary of pre-clinical models used for malignant mesothelioma research.




of the effect of S-1 therapy. Int. J. Cancer 126, 2835–2846. doi: 10.1002/ijc.25002

Zhang, W., Wu, X., Wu, L., Zhang, W., and Zhao, X. (2015). Advances in the diagnosis, treatment and prognosis of malignant pleural mesothelioma. Ann. Transl. Med. 3:182. doi: 10.3978/j.issn.2305-5839.2015.07.03

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Nabavi, Wei, Lin, Collins, Gout and Wang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Exploring the Crosstalk Between LMNA and Splicing Machinery Gene Mutations in Dilated Cardiomyopathy

Hind C. Zahr and Diana E. Jaalouk\*

Department of Biology, Faculty of Arts and Sciences, American University of Beirut, Beirut, Lebanon

Mutations in the LMNA gene, which encodes the nuclear lamina proteins lamins A and C, are responsible for a diverse group of diseases known as laminopathies. One type of laminopathy is Dilated Cardiomyopathy (DCM), a heart muscle disease characterized by dilation of the left ventricle and impaired systolic function, often leading to heart failure and sudden cardiac death. LMNA is the second most commonly mutated gene in DCM. In addition to LMNA, mutations in more than 60 genes have been associated with DCM. The DCM-associated genes encode a variety of proteins including transcription factors, cytoskeletal, Ca2+-regulating, ion-channel, desmosomal, sarcomeric, and nuclear-membrane proteins. Another important category among DCM-causing genes emerged upon the identification of DCM-causing mutations in RNA binding motif protein 20 (RBM20), an alternative splicing factor that is chiefly expressed in the heart. In addition to RBM20, loss of several essential splicing factors was shown, using mouse knockout models, to be embryonically lethal due to aberrant cardiogenesis. Furthermore, heart-specific deletion of some of these splicing factors was found to result in aberrant splicing of their targets and DCM development. In addition to splicing alterations, advances in next generation sequencing highlighted the association between splice-site mutations in several genes and DCM. This review summarizes LMNA mutations and splicing alterations in DCM and discusses how the interaction between LMNA and splicing regulators could possibly explain DCM disease mechanisms.

#### Edited by:

Amritha Jaishankar, Rare Genomics Institute, United States

#### Reviewed by:

Consolato Sergi, University of Alberta Hospital, Canada

Muhammad Tariq, University of Tabuk, Saudi Arabia

> \*Correspondence: Diana E. Jaalouk dj11@aub.edu.lb

#### Specialty section:

This article was submitted to Genetic Disorders, a section of the journal Frontiers in Genetics

Received: 17 February 2018 Accepted: 11 June 2018 Published: 09 July 2018

#### Citation:

Zahr HC and Jaalouk DE (2018) Exploring the Crosstalk Between LMNA and Splicing Machinery Gene Mutations in Dilated Cardiomyopathy. Front. Genet. 9:231. doi: 10.3389/fgene.2018.00231

Keywords: lamins, splicing, RNA binding proteins, cardiomyopathies, DCM, RBM20

## INTRODUCTION

Cardiomyopathies are a diverse group of diseases that affect the heart muscle, often rendering it hypertrophied, dilated or rigid. As these diseases progress, the heart becomes weakened and unable to perform its normal mechanical and electrical functions (Maron et al., 2006). Dilated cardiomyopathy (DCM), the most common type of cardiomyopathy, is characterized by dilation of the left or both ventricles and impaired systolic function, in the absence of ischemic coronary artery disease or abnormal pressure or volume loading (Yancy et al., 2013; Pinto et al., 2016). Despite being a rare disease, DCM represents a serious health burden that affects both adults and children, often leading to heart failure and sudden cardiac death (Elliott et al., 2008). Indeed, DCM is one of the leading causes of heart failure and heart transplantation in the world, with a prevalence ranging from 1 case per 2,500 individuals to 1 per 250 and an incidence of 7 cases per 100,000 individuals (Bozkurt, 2016; Weintraub et al., 2017; Masarone et al., 2018).

In addition, DCM comprises 60% of pediatric cardiomyopathies, occurring at the highest rate during the first year of life (Towbin et al., 2006). Although different causes of DCM have been identified, many cases remain idiopathic. Causes of DCM are either acquired or genetic. Examples of acquired causes include infectious agents, drugs, toxins, alcohol, nutritional deficiencies, peripartum, and autoimmune, metabolic and endocrine disorders (Maron et al., 2006; Pinto et al., 2016). Although the genetic complexity of DCM has yet to be fully elucidated, known genetic causes include mutations in more than 60 genes. The DCM-associated genes encode a variety of proteins including cytoskeletal, sarcomeric, Ca2+-regulating, ion-channel, desmosomal, mitochondrial, nuclear and nuclear-membrane proteins (Pérez-Serra et al., 2016). Dilated cardiomyopathy is either manifested as a predominant cardiac phenotype or associated with systemic conditions such as neuromuscular diseases (Duchenne muscular dystrophy) or syndromic diseases (Barth Syndrome) (Weintraub et al., 2017).

## GENETIC BASIS OF DILATED CARDIOMYOPATHY

About 35% of idiopathic DCM cases are familial and are therefore due to a genetic cause (Michels et al., 1992; McCartan et al., 2012). Familial DCM gene mutations mainly follow an autosomal dominant mode of inheritance. Autosomal recessive, X-linked and mitochondrial patterns of inheritance also occur, but less frequently (McNally et al., 2013). The most commonly mutated gene in DCM is TTN, being altered in ∼25% of familial DCM cases and in 18% of sporadic cases (Herman et al., 2012; Haas et al., 2015). TTN encodes titin, the largest known human protein and a key component of the sarcomeres which, through its interaction with thin and thick filaments, plays a role in sarcomere assembly, passive force generation during diastole and elasticity during systole. The majority of TTN variants are truncating mutations (Herman et al., 2012). A study employing human induced pluripotent stem cell-derived cardiac tissue showed that certain TTN truncating mutations exert their pathogenicity by improper interaction with other proteins during sarcomere assembly, attenuated contractility and impaired response to stress and growth signals (Hinson et al., 2015). The LMNA gene, which encodes A-type lamins, is the second most commonly mutated gene in DCM, accounting for ∼6% of cases (Hershberger and Siegfried, 2011). Mutations in sarcomeric genes such as MYH7, MYH6, MYBPC3, ACTC1, TNNT2, and TPM1 have also been associated with DCM, collectively being responsible for ∼5% of all cases (Kamisago et al., 2000). DCM-causing mutations in the RBM20 gene, which encodes the splicing regulator RNA binding motif protein 20 (RBM20), were first identified in 2009 (Brauch et al., 2009). The studies that followed showed that RBM20 gene mutations account for 3% of all DCM cases (Haas et al., 2015). Other pathogenic variants, causing a predominant cardiac DCM phenotype, have been reported in Z-disk, desmosomal and ion channel genes (Mohapatra et al., 2003; Schmitt et al., 2003; Vatta et al., 2003; Olson et al., 2005; Taylor et al., 2007).

Dilated Cardiomyopathy-associated mutations in the above genes include missense/nonsense mutations, insertions, deletions and splicing mutations (McNally et al., 2013). Variants that arise from the different mutations exert their pathogenicity via dominant negative effects of abnormal normal-sized proteins or haploinsufficient effects of truncated proteins. Mechanistically, the diversity of the genes involved in DCM underscores the complexity of the underlying mechanisms. Various mechanistic insights have illustrated abnormalities in protein degradation, transcriptional activity, Ca2+-handling and homeostasis, metabolic activity, nuclear integrity and force generation and transmission (McNally et al., 2013). Adding to the mechanistic complexity of DCM, only 30–35% of familial DCM cases follow a Mendelian mode of inheritance, suggesting a more complex multi-variant or oligogenic basis of inheritance for the remaining cases (Hershberger and Siegfried, 2011). In support of this notion, genetic screening methods have revealed the presence of nonrare variants in multiple genes for several DCM cases (Hershberger et al., 2013). Another complicating aspect is the variability of expression of the same mutation in different carriers within the same family. Variability also occurs in terms of onset, severity, progression and phenotype of disease. For instance, the 960delT mutation in the LMNA gene may be manifested as primary DCM, or as DCM associated with either an Emery-Dreifuss muscular dystrophy (EDMD)-like or a limb girdle muscular dystrophy (LGMD)-like phenotype (Brodsky et al., 2000).

## THE LMNA GENE AND LAMINOPATHIES

The LMNA gene encodes A-type lamins which comprise lamins A and C (lamin A/C). Lamins are type V intermediate filaments, exclusively localized to the nucleus of most differentiated cells and mesenchymal stem cells (Fisher et al., 1986; Rober et al., 1989; Ho and Lammerding, 2012). In humans, the LMNA gene is composed of 12 exons (Lin and Worman, 1993). Lamins A and C are produced by alternative splicing of exon 10 of the LMNA gene (Lin and Worman, 1993; Machiels et al., 1996). Both isoforms are identical in their first 566 amino acids, after which they differ in both length and amino acid composition. While lamin C is 572 amino acids long, mature lamin A is composed of 646 amino acids. Another difference is the presence of a CAAX motif at the C-terminal end of prelamin A, which acts as a site for sequential posttranslational modifications that result in cleavage of prelamin A into mature lamin A (Davies et al., 2011). Lamins A and C have a similar structural organization consisting of a short globular N-terminal head domain, a central coiled-coil rod domain and a long globular C-terminal tail domain (Lin and Worman, 1993). The rod domain is highly conserved and is implicated in lamin dimerization. An immunoglobulin-like domain is also present in the tail of lamin A/C where various posttranslational modifications occur (Ho and Lammerding, 2012; Burke and Stewart, 2013). Lamins assemble first by dimerizing into coiled-coil dimers, then by forming polar polymers through head-to-tail dimer arrangement, and finally by forming antiparallel nonpolar filaments (Heitlinger et al., 1992; Sasse et al., 1998; Stuurman et al., 1998).

Lamins constitute the main components of the nuclear lamina and interact with proteins and DNA. Through their architectural attachment to the inner nuclear membrane, lamins provide mechanical and structural support to the nucleus. Furthermore, by acting as a platform for diverse protein interactions, they play a role in anchoring and positioning nuclear membrane proteins, regulating various signaling pathways, recruiting and sequestering transcription and DNA replication and repair factors, coupling the nucleoskeleton to the cytoskeleton and mechanotransduction. In addition, lamins bind DNA both directly and indirectly and hence play a role in chromatin organization, gene silencing and transcription (Ho and Lammerding, 2012).

The first mutation in the LMNA gene was identified in 1999 to be causative of EDMD, a progressive muscle weakening and wasting disorder with conduction system malfunction and DCM (Bonne et al., 1993; Bonne et al., 1999). LMNA gene mutations were also shown, in the same year, to be causative of DCM and conduction system disease in the absence of skeletal muscle involvement (Fatkin et al., 1999). Since then, more than 450 mutations in the LMNA gene have been reported<sup>1</sup>. The different mutations are associated with diverse diseases, collectively called laminopathies, that are manifested either as tissue-specific disorders or as multisystem disease (Worman and Bonne, 2007; Szeverenyi et al., 2008). While tissue-specific effects are seen in striated muscle tissue, adipose tissue or peripheral nervous tissue, multisystem effects involve multiple tissues and are seen in premature aging syndromes or overlapping syndromes (Cao and Hegele, 2000; Brown et al., 2001; Broers et al., 2006; Worman and Bonne, 2007). Most LMNA mutations (79.1%) affect striated muscle tissue, followed by adipose tissue (8.6%) and peripheral nervous tissue (0.3%). Furthermore, the percentages of LMNA mutations causing progeroid and overlapping syndromes are 9.3% and 10.9%, respectively (Bertrand et al., 2011).

### LMNA GENE MUTATIONS IN DILATED CARDIOMYOPATHY

Dilated Cardiomyopathy caused by LMNA mutations has the worst prognosis, with the highest rate of heart transplantation (Hoorntje, 2017) due to congestive heart failure and a considerable risk of sudden cardiac death (Bécane et al., 2000; Taylor et al., 2003; van Berlo et al., 2005; Pérez-Serra et al., 2015). Affected individuals frequently suffer from progressive conduction system disease such as atrioventricular block, bradyarrhythmias and tachyarrhythmias and have a high chance of developing thromboembolic disorder (Fatkin et al., 1999; Arbustini et al., 2002; Fatkin and Graham, 2002; van Rijsingen et al., 2012, 2013a). Males have a worse prognosis than females, owing to frequent ventricular arrhythmias and end stage heart failure (van Rijsingen et al., 2013b). Altogether, LMNA mutation carriers have high disease penetrance, often presenting symptoms at an early age and having a high mortality rate (Taylor et al., 2003). Furthermore, most LMNA mutation carriers exhibit an age-dependent penetrance, with the percentage of carriers showing a cardiac phenotype increasing from 7% under the age of 20 years to 100% above the age of 60 years (Pasotti et al., 2008).

In 2014, 165 DCM-associated mutations, based on four different databases, were identified in the LMNA gene (Tesson et al., 2014) (see Ref. for a detailed table of all 165 mutations). Most of these pathogenic variants were missense/nonsense mutations, some were splicing mutations, small deletions or small insertions and very few were small indels, gross deletions or gross insertions (Sébillon et al., 2003; Parks et al., 2008; Millat et al., 2009, 2011; Zimmerman et al., 2010; Narula et al., 2012; Pugh et al., 2014; Pérez-Serra et al., 2016). Since then, more DCM-associated mutations have been identified in the LMNA gene (Forleo et al., 2015; Haas et al., 2015; Pérez-Serra et al., 2015; Ambrosi et al., 2016; Hasselberg et al., 2017; Kayvanpour et al., 2017; Walsh et al., 2017). Remarkably, a recent study performed on a multicenter cohort of 77 subjects from 45 different families identified 24 novel mutations in the LMNA gene (Nishiuchi et al., 2017). Of these mutations, 18 were associated with DCM and were considered pathogenic (Nishiuchi et al., 2017). Most DCM-causing mutations in the LMNA gene occur in the head and rod domains, which comprise more than half of lamin A and two thirds of lamin C, but rarely in the tail domain. Unlike DCM, mutations linked to other laminopathies such as EDMD, familial partial lipodystrophy and Hutchinson-Gilford progeria syndrome (HGPS) commonly affect the tail domain and thus overlap with the various phosphorylation sites that are abundant in that region of the protein (Szeverenyi et al., 2008). In addition, although hot spots have been identified for some laminopathies including HGPS, mandibuloacral dysplasia and adipose tissue-specific disorders, hot spots for DCM or disorders affecting striated muscle tissue have not been recognized (Bertrand et al., 2011).

## SPLICING ALTERATIONS IN DCM

Several splicing factors have been associated with heart diseases including DCM (van den Hoogenhof et al., 2016). Splicing alterations in DCM include splicing factor mutations, deregulation in expression of alternative splicing isoforms and splice-site mutations. Embryonic lethality of mouse knockout models of essential splicing factors narrowed the list of splicing regulators described to cause human cardiac pathologies. Nonetheless, loss of many of these splicing factors was shown to be embryonically lethal due to aberrant cardiogenesis. For instance, RNA binding motif protein 24 (RBM24) knockout mice die of many cardiac abnormalities and show hindered sarcomere formation. Further analysis showed that RBM24 is responsible for splicing of 64 genes, many of which are important for cardiac development and sarcomere function, which is in line with its preferential expression in striated muscle tissue (Yang et al., 2014). In addition, loss of SRSF10 (or SRp38), a ubiquitously expressed splicing factor belonging to the conserved Serine/Arginine (SR) protein family, leads to embryonic lethality due to impaired cardiogenesis. In particular, its loss has been shown to be associated with altered expression and splicing of Ca2+ handling genes (Feng et al., 2009).

<sup>1</sup>http://www.umd.be/LMNA/

Another example is the ubiquitously expressed SR protein ASF/SF2 (or SFRS1), which plays a role in both constitutive and alternative splicing. As its complete loss is embryonically lethal, conditional cardiac-specific ablation of ASF/SF2 was used and shown to result in DCM due to aberrant Ca2+ handling and excitation-contraction coupling. These effects were attributed to missplicing of several genes including Ca2+/calmodulin-dependent protein kinase II delta (CamkIIδ), troponin T2 (TNNT2), and LIM domain binding 3 (LDB3) (Xu et al., 2005). SC35 (or SRSF2) is another ubiquitously expressed SR protein whose heart-specific loss results in DCM. Although a missplicing effect of SC35 loss was not confirmed, downregulation of ryanodine receptor 2 (RyR2) was observed in SC35-deficient hearts. This downregulation is speculated to result from nonsense-mediated decay of misspliced RyR2 mRNA. In addition to its heart-specific effects, SC35 appears to be essential for embryogenesis, as knockout mice die at a very early stage even before the beginning of cardiogenesis (Ding et al., 2004). In addition to SR proteins, a heterogeneous nuclear ribonucleoprotein (hnRNP) family member, hnRNP U, which acts as both a constitutive and an alternative splicing factor, has been associated with cardiac disease. Heart-specific deletion of hnRNP U was shown to be lethal during early postnatal life due to the development of severe DCM. The DCM phenotype was associated with altered alternative splicing of the Ca2+ handling gene CamkIIδ (Ye et al., 2015). Dysregulation in the expression of certain splicing factors has also been shown to be associated with cardiac disease. For instance, Rbfox2, which belongs to the FOX-protein family of splicing regulators, is down-regulated in heart disease. Rbfox2 regulates the alternative splicing of many genes that are related to cardiac function, and mice with heart-specific deletion of Rbfox2 develop DCM and heart failure (Wei et al., 2015).

Despite the associations of several splicing factors with cardiac pathologies, mutations in only a single splicing factor gene, RBM20, have thus far been confirmed to cause heart disease (Brauch et al., 2009; Li et al., 2010; Refaat et al., 2012). These mutations cause DCM, and RBM20 has been put forward as one of the most commonly affected genes in DCM (Haas et al., 2015). In addition to being prevalent among DCM patients, RBM20 mutations rank first for the youngest mean age of heart transplantation and are correlated with advanced disease (Brauch et al., 2009; Kayvanpour et al., 2017) (**Table 1**).

### RBM20 AND CARDIAC FUNCTION

RBM20 is an RNA-binding protein (RBP) that regulates alternative mRNA splicing. RBM20 has one RNA recognition motif (RRM) domain in exons 6 and 7 that binds RNA, an Arginine-Serine rich (RS) domain in exon 9 that mediates interactions with other proteins, and a zinc finger domain of the U1 type (Long and Caceres, 2009; Guo et al., 2012) (**Figure 1**). RBM20 has been shown to bind to a distinct UCUU-containing RNA recognition element that is conserved between rats and humans. This binding motif is enriched within introns, such that binding of RBM20 to intronic regions flanking 3′ and 5′ splice sites represses exon splicing (Maatz et al., 2014). Many cardiac-expressed transcripts containing the UCUU motif have been shown to directly bind RBM20. The most prominent of these targets is TTN which encodes the protein titin (Maatz et al., 2014). In addition to TTN, transcripts of 17 genes (CamkIIδ, DST, ENAH, IMMT, LDB3, LMO7, MLIP, LRRFIP1, MYH7, MYOM1, NEXN, OBSCN, PDLIM3, RTN4, RyR2, SORBS1, TNNT2) were shown to be directly regulated by RBM20, mainly by mutually exclusive splicing (Maatz et al., 2014). Splicing regulation of mutually exclusive exons is often related to the expression of tissue-specific splice variants (Wang et al., 2008). This regulation mechanism is in accord with the tissue-specific expression of RBM20, being chiefly expressed in striated muscle with the highest amounts in cardiac muscle (Guo et al., 2012). Supporting this idea is the enrichment of the identified RBM20 targets for tissue-specific diseases associated with RBM20 mutations (DCM, hypertrophic cardiomyopathy and heart failure) according to NCBI Medical Subject Heading (MeSH) (Ho and Lammerding, 2012).
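As a rough illustration of the motif-based reasoning in this paragraph, the following sketch counts UCUU occurrences in the intronic windows flanking an exon. The toy sequence, the exon coordinates, the 20-nucleotide window and the function names are hypothetical and are not taken from Maatz et al. (2014); binding sites in that study were identified experimentally, not by a plain string search.

```python
from __future__ import annotations

import re

MOTIF = "UCUU"  # core of the RBM20 recognition element described in the text


def find_motif(seq: str, motif: str = MOTIF) -> list[int]:
    """Return 0-based start positions of every (possibly overlapping) motif hit."""
    seq = seq.upper().replace("T", "U")  # accept DNA or RNA alphabets
    # A zero-width lookahead lets overlapping hits be reported.
    return [m.start() for m in re.finditer(f"(?={re.escape(motif)})", seq)]


def flanking_hits(seq: str, exon_start: int, exon_end: int,
                  window: int = 20) -> dict[str, list[int]]:
    """Motif hits in the intronic windows upstream and downstream of an exon.

    exon_start and exon_end are 0-based, end-exclusive coordinates
    (hypothetical values chosen for the toy sequence below).
    """
    up_lo = max(0, exon_start - window)
    upstream = seq[up_lo:exon_start]
    downstream = seq[exon_end:exon_end + window]
    return {
        "upstream": [p + up_lo for p in find_motif(upstream)],
        "downstream": [p + exon_end for p in find_motif(downstream)],
    }


# Toy pre-mRNA invented for illustration: lowercase = intron, uppercase = exon.
toy = "gguucuuaggucuucag" + "AUGGCCAAG" + "guaagucuuuccucuug"
exon_start = 17
exon_end = exon_start + 9
hits = flanking_hits(toy, exon_start, exon_end)
print(hits)
```

In this toy case both flanking introns carry motif copies, which is the arrangement the text associates with repression of the intervening exon. A motif count alone does not establish binding or repression.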

To date, RBM20 has been shown to regulate the alternative splicing of 31 genes, many of which are associated with cardiomyopathies and cardiac cell biology (Guo et al., 2012). RBM20 regulates several sarcomeric genes, thus influencing sarcomere structure and function. The genuine splicing target of RBM20 is titin (Li et al., 2010; Guo et al., 2012; Methawasin et al., 2014), an enormously large elastic protein that spans half the length of the sarcomere (Labeit and Kolmerer, 1995). Titin maintains the structural integrity of the sarcomere and restores it to normal length following extension and contraction (Helmes et al., 1996). Alternative splicing of titin results in many protein isoforms, the most common of which are the N2B and N2BA isoforms (Bang et al., 2001). The N2B isoform is short and rigid while N2BA isoforms are longer and more compliant (Lahmers et al., 2004). The relative ratio of short to long titin isoforms is developmentally regulated and is a determining factor of myocardial passive stiffness (Granzier and Irving, 1995; Lahmers et al., 2004). Cardiac N2BA is predominantly expressed during fetal life, whereas the N2B isoform becomes the main isoform expressed after birth (Lahmers et al., 2004; Opitz et al., 2004; Warren et al., 2004). This shift in titin isoforms is essential for proper diastolic function (Fukuda et al., 2003; Opitz et al., 2004) as N2B-enhanced passive stiffness prevents ventricular overfilling during diastole (Opitz et al., 2004). In addition, the titin isoform switch is important for systolic function as N2B-increased Ca2+ sensitivity improves contractility during systole (Fukuda et al., 2003).

Another direct target of RBM20 is myomesin 1, a structural component of the sarcomeric M-line. Interaction of myomesin with myosin and titin is responsible for structural organization of these contractile proteins and sarcomere integrity during contraction (Agarkova and Perriard, 2005). In addition, the presence of a phosphorylation site in myomesin suggests that it responds to stretch-dependent signaling (Obermann et al., 1997). Myomesin 1 undergoes an isoform switch that coincides in timing with that of titin (Agarkova et al., 2000). After birth, myomesin 1 isoforms that lack a molecular spring domain (EH domain) become upregulated (Agarkova et al., 2000). The myomesin 1 switch has been suggested to enhance alignment of contractile filaments and contraction efficiency (Siedner et al., 2003). Remarkably, re-expression of fetal isoforms of both titin and myomesin has been observed in DCM (Makarenko et al., 2004; Nagueh et al., 2004; Schoenauer et al., 2011). Tropomyosin 1 (TPM1), a component of thin filaments, is yet another target of RBM20 (Guo et al., 2012). Tropomyosin binds actin and mediates contraction in response to Ca2+ binding to troponin. Alternative splicing of TPM1 results in two striated muscle-specific isoforms (TPM1α and TPM1κ) and a smooth muscle-specific isoform (TPM1β). The specific role of RBM20 in these splicing events is not known. However, overexpression of the TPM1κ isoform has been observed in DCM patients and has been shown to cause systolic and diastolic dysfunction in transgenic mice (Rajan et al., 2010).

All mutations are missense except for two nonsense mutations (marked with <sup>∗</sup>). Most mutations occur in a 5-amino acid hotspot comprising residues 634, 635, 636, 637, and 638 in the RS domain of the protein. The RBM20 nucleotide position is based on the NCBI reference sequence NM\_001134363.1 and amino acid positions are based on the NCBI reference sequence NP\_001127835.1. RS, Arginine-Serine rich; RRM, RNA-recognition-motif; ZnF, Zinc Finger; ACMG, American College of Medical Genetics and Genomics.

RBM20 regulates the tissue-specific splicing of LIM domain binding 3 (LDB3), a Z-line structural protein important for maintaining sarcomere integrity during contraction (Zhou et al., 2001; Guo et al., 2012; Maatz et al., 2014). RBM20 controls the inclusion of mutually exclusive exons of LDB3 that are present either in its cardiac-specific isoform (exon 4) or in its skeletal muscle-specific isoform (exons 5 and 6). Loss of RBM20 results in the inclusion of exons 5 and 6 rather than exon 4 (Guo et al., 2012). Moreover, mutations in exon 4 as well as in another cardiac-specific exon (exon 10) of LDB3 were identified in DCM patients (Arimura et al., 2009). Thus, loss of exon 4 due to loss of RBM20 might contribute to the development of the DCM phenotype in Rbm20 null mice (Arimura et al., 2009; Guo et al., 2012). RBM20 is also a splicing regulator of CamkIIδ, a crucial enzyme in the heart (Guo et al., 2012). Four of the eleven isoforms described for CamkIIδ are expressed in the heart (Gray and Heller Brown, 2014). CamkIIδ isoforms phosphorylate proteins of the sarcomere and the sarcoplasmic reticulum, calcium channels and transcription factors, thus affecting cell signaling, Ca2+ handling and gene expression (Dzhura et al., 2000; Zhang et al., 2002, 2003; Hidalgo et al., 2013). As such, splicing regulation of CamkIIδ by RBM20 can greatly affect cardiac function by modulating these processes. This is further supported by the association of dysregulated CamkIIδ isoforms with DCM and heart failure (Zhang et al., 2002, 2003).

## RBM20 GENE MUTATIONS IN DCM

In an attempt to find a novel pre-clinical biomarker for DCM, genome-wide analysis of 8 families with DCM uncovered distinct heterozygous missense mutations in exon 9 of RBM20. These mutations segregated with the DCM phenotype, which led to the recognition of RBM20 as a DCM-causing gene (Brauch et al., 2009). Genotype-phenotype associations linked RBM20 mutations with aggressive DCM characterized by variable symptoms that include arrhythmias, heart failure and sudden death. Patient-derived tissues also showed variable involvement of cardiac hypertrophy and interstitial fibrosis (Brauch et al., 2009). Multiple RBM20 mutations were identified by subsequent studies (Li et al., 2010; Refaat et al., 2012). The role of RBM20 in DCM was revealed in rats harboring a loss of function mutation that removes the RRM, RS and zinc finger domains (exons 2–14) of RBM20 (Guo et al., 2012). This spontaneously occurring mutation resulted in altered titin mRNA splicing. In addition to titin, alternative splicing of 30 other transcripts was shown to be altered in both rats and a DCM patient carrying an S635A missense mutation in the RS region of RBM20. The identified RBM20-dependent genes were enriched for genes related to ion-handling, sarcomere function and cardiomyopathy (Guo et al., 2012). As most of the identified genes are key determinants of cardiac cell function and as some of them have been associated with cardiomyopathy, their missplicing is thought to be a key determinant of the DCM phenotype. This is illustrated by impairment of the Frank-Starling mechanism (FSM) as a result of expression of longer and more compliant titin isoforms in Rbm20-deficient mice (Methawasin et al., 2014). In addition to titin, missplicing of other sarcomeric proteins, such as myomesin 1, might affect their contractile function. On the other hand, splicing alterations in other RBM20 targets such as CamkIIδ, RyR2, and Cacna1c might affect Ca2+ homeostasis. Indeed, Rbm20 deficiency induces a switch into larger cardiac-specific isoforms of CamkIIδ, which might compromise its normal function (Maatz et al., 2014).

In addition to altered expression of adult/fetal protein isoforms, RBM20 loss or mutation is also manifested in deregulation of tissue-specific protein isoforms or mislocalization of misspliced proteins. For instance, RyR2 and CamkIIδ aberrant splicing caused by RBM20 mutation results in the expression of mislocalized protein isoforms. RyR2 constitutes the key calcium release channel in the sarcoplasmic reticulum membrane that plays a role in excitation-contraction coupling. A 24-bp exon inclusion in RyR2 transcript causes a translocation of the corresponding protein from the ER to the intranuclear cisternae, thus deeply affecting calcium signaling (George et al., 2007). Interestingly, RyR2 transcripts containing this exon are upregulated in Rbm20-null rats as well as in cardiomyopathy patients (Maatz et al., 2014). Moreover, mutations affecting RyR2 function have been associated with cardiomyopathies (Tang et al., 2012). Remarkably, DCM-associated RBM20 mutation reversed the splicing of mutually exclusive exons in CamkIIδ, resulting in an isoform switch from CamkIIδB into CamkIIδA. Although both isoforms are expressed in the heart, CamkIIδB is predominantly found in the nucleus where it regulates gene expression (Zhang et al., 2002), while CamkIIδA lacks the nuclear localization signal and is mainly located in T-tubules where it plays a role in facilitation of the L-type calcium channel (LTCC) (Dzhura et al., 2000). This isoform switch has been shown, under other circumstances, to cause excitation-contraction coupling defects and proneness to tachyarrhythmia, symptoms that are also seen in DCM caused by RBM20 mutations (Xu et al., 2005).

RBM20 has also been shown to repress splicing of different targets (such as LMO7, RTN4, PDLIM3 and LDB3) in favor of their heart-specific isoforms. LMO7 is a transcription factor that regulates both skeletal and cardiac muscle-related genes (Holaska et al., 2006). RBM20 represses the inclusion of exons 9 and 10, which characterize the brain-specific isoform of LMO7 (Ooshio et al., 2004; Maatz et al., 2014). Although LMO7 has not been associated with DCM, expression of its brain-specific isoform in the absence of RBM20 might be an important disease mechanism. Similarly, RBM20 suppresses the neuronal-specific isoform of RTN4 in favor of the heart-specific isoform (Maatz et al., 2014). RTN4 is a neurite growth inhibitor with an unknown role in the heart (Huber et al., 2002). The brain-specific RTN4 isoform is weakly detected in the heart; however, it is upregulated in cases of ischemia and DCM and has also been suggested as a marker of heart failure (Bullard et al., 2008; Gramolini et al., 2008; Sarkey et al., 2011).

Most reported cardiomyopathy-related mutations in the RBM20 gene arise in the RS region which includes a five-amino acid mutation "hotspot" within exon 9 (Brauch et al., 2009; Li et al., 2010; Millat et al., 2011; Rampersaud et al., 2011; Refaat et al., 2012; Guo et al., 2012; Wells et al., 2013; Klauke et al., 2017; Long et al., 2017). The RS region is thought to mediate protein-protein interactions (Long and Caceres, 2009). Indeed, quantitative proteomic analysis revealed that RBM20 interacts with many protein components of the U1 and U2 small nuclear ribonucleoproteins (snRNPs), which associate with pre-mRNA to form spliceosomal complex A. Moreover, the RNA recognition element of RBM20 is proximal to U1 and U2 snRNP binding sites. Association of RBM20 with early-stage spliceosomal assembly but not with the catalytically active spliceosome has been suggested to stall further spliceosomal assembly beyond complex A formation, thus causing splicing repression (Maatz et al., 2014). As such, mutations in the RS region of RBM20 are expected to abrogate protein-protein interactions essential for RBM20 function as a splicing repressor. Indeed, the DCM-associated S635A mutation in the RS region of RBM20 (Guo et al., 2012) has been shown to considerably reduce interactions with 38 alternative spliceosomal factors with no effect on interactions with fundamental spliceosomal proteins. One possible explanation of this outcome is that association of RBM20 with these alternative splicing factors might be needed for the suggested spliceosomal stalling mechanism and splicing repression (Maatz et al., 2014).
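The hotspot description lends itself to a simple lookup. The sketch below parses protein-level substitutions written in the short form used in this review (for example S635A or E913K) and reports whether the affected residue falls within residues 634 to 638, the RS-domain hotspot given in the table note. The function names and the regular expression are illustrative assumptions; actual variant annotation would rely on HGVS-aware tools and on the reference sequence NP\_001127835.1.

```python
from __future__ import annotations

import re

# Residues 634-638 inclusive, as listed in the table note (range end is exclusive).
RS_HOTSPOT = range(634, 639)

# Short substitution notation: reference residue, position, altered residue.
# "*" is accepted as the altered residue so nonsense changes can be parsed too.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z*])$")


def parse_substitution(variant: str) -> tuple[str, int, str]:
    """Split a variant such as 'S635A' into ('S', 635, 'A')."""
    match = _SUBSTITUTION.match(variant.strip())
    if match is None:
        raise ValueError(f"unrecognized substitution format: {variant!r}")
    ref, pos, alt = match.groups()
    return ref, int(pos), alt


def in_rs_hotspot(variant: str) -> bool:
    """True if the substituted residue lies within the 634-638 hotspot."""
    _, pos, _ = parse_substitution(variant)
    return pos in RS_HOTSPOT


# Two variants discussed in the text.
for v in ("S635A", "E913K"):
    print(v, in_rs_hotspot(v))
```

S635A falls inside the hotspot, whereas E913K, which the text places in a glutamate-rich region encoded by exon 11, does not. Position alone says nothing about pathogenicity.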

In addition to protein-protein interactions, binding of RBM20 to nascent transcripts is important for its function. Indeed, mutations in exon 6 of RBM20 were identified in idiopathic DCM patients. As these mutations localize to the RRM domain of RBM20, they are expected to disrupt its binding to mRNA (Li et al., 2010). Mice lacking the RRM domain of RBM20, by deletion of exons 6 and 7, exhibit altered titin splicing with a favored expression of more compliant titin isoforms that increased in length from heterozygous to homozygous RBM20 mutant mice. Increased titin compliance was associated with a decrease in passive stiffness and FSM which also correlated with the number of affected alleles (Methawasin et al., 2014). Along the same line, mutations in the RBM20 binding site also influence its splicing activity (Maatz et al., 2014). Although many other missense and nonsense mutations of RBM20 have been identified in DCM patients, their functional consequences have not been explored (Refaat et al., 2012; Haas et al., 2015; Zhao et al., 2015; Waldmüller et al., 2015). Yet, many of these mutations localized to novel exons of RBM20, including exons 2, 4, 11, 12, 13, and 14 (Refaat et al., 2012; Zhao et al., 2015; Beqqali et al., 2016).

Recently, a novel familial DCM-causing mutation (E913K) in a glutamate-rich region of RBM20, encoded by exon 11, has been studied. Although the region of the mutation is not characterized, its conservation across distinct species suggests its functional significance. This mutation was shown to cause a strong reduction in RBM20 protein levels in human cardiomyocytes, which was suggestive of compromised RBM20 protein stability. One possible mechanism that could affect protein stability is the generation of misfolded proteins and their subsequent proteasomal degradation. The outcome of reduced RBM20 protein levels was manifested in the aberrant inclusion of several exons in the spring region of titin. Missplicing of titin caused a dramatic shift from the stiff N2B isoform to the highly compliant N2BA isoform and resulted in an attenuated FSM (Beqqali et al., 2016). Notably, similar effects on titin splicing and the FSM were previously reported in a mouse model of RRM-deficient RBM20 as well as in mice lacking RBM20 (Methawasin et al., 2014).

## DEREGULATION IN ALTERNATIVE SPLICING ISOFORM EXPRESSION

Many studies revealed, by deep sequencing and microarray analysis, sets of genes that show differential splicing between control and diseased heart; yet the mechanisms behind these alterations were not identified. In humans, splicing alterations of the sarcomeric genes, TNNT2 (troponin T2), TNNI3 (troponin I3), MYH7 (myosin heavy chain 7), and FLNC (filamin C gamma) were observed in both DCM and hypertrophied myocardium. Interestingly, the ratio of the different splice-isoforms of each of TNNT2, MYH7, and FLNC served as markers that distinguished failing from non-failing heart (Kong et al., 2010). Downregulation of the L-type voltage gated Ca<sup>2+</sup> channel Cav1.2 has previously been associated with cardiac hypertrophy and heart failure (Chen et al., 2002; Goonasekera et al., 2012). Recently, a novel neonatal splice variant of Cav1.2 has been identified and was shown to be aberrantly re-expressed in adult rodent heart, upon pressure overload-induced cardiac hypertrophy, as well as in left ventricles of DCM patients. Re-expression of the identified isoform by missplicing of the Cav1.2 gene, Cacna1c, promoted proteasomal degradation of wild-type Cav1.2, thus explaining the reported decreased expression and activity of Cav1.2 in cardiac hypertrophy (Hu et al., 2016).
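The use of splice-isoform ratios as markers can be illustrated with a short calculation. The following is a minimal sketch and not the method of Kong et al. (2010): the function name `isoform_fraction`, the pseudocount, and all read counts are hypothetical values chosen only to show how a ratio is derived from isoform-specific counts.

```python
def isoform_fraction(reads_a: float, reads_b: float, pseudocount: float = 1.0) -> float:
    """Fraction of isoform A among isoforms A and B.

    A pseudocount is added to each isoform so that samples with
    no reads do not cause a division by zero.
    """
    return (reads_a + pseudocount) / (reads_a + reads_b + 2 * pseudocount)


# Hypothetical read counts (isoform A, isoform B) for one gene in two samples.
samples = {
    "non_failing": (900, 100),
    "failing": (400, 600),
}

fractions = {name: isoform_fraction(a, b) for name, (a, b) in samples.items()}

for name, value in fractions.items():
    print(f"{name}: isoform A fraction = {value:.2f}")
```

In practice such fractions would be compared across many samples per group; a single pair of values, as here, only demonstrates the arithmetic.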

## SPLICE-SITE MUTATIONS

In addition to loss or dysregulation of splicing factors, splice site mutations also cause splicing alterations and disease (van den Hoogenhof et al., 2016). DCM has also been associated with splice-site mutations in its most commonly mutated gene, TTN. One fourth of idiopathic familial DCM cases harbor truncated titin proteins and almost 31% of mutations that generate a truncated titin protein are splice-site mutations (Herman et al., 2012). Splice site mutations in TTN are also thought to be responsible for HCM through the generation of a truncated titin protein which results in a reduced myocardial passive stiffness (Herman et al., 2012). However, while TTN truncating mutations frequently occur in DCM, they are rare in HCM (Herman et al., 2012). In addition, splice site mutations in other genes such as those encoding lamin A/C (LMNA) (Parks et al., 2008), desmoplakin (DSP) (Garcia-Pavia et al., 2011) and dystrophin (DMD) (Obler et al., 2010) have been reported in DCM. For instance, an A > G substitution at the 3′ splice site of exon 4 leads to an in-frame addition of 3 amino acids to lamin A/C protein thus causing DCM (Otomo et al., 2005).

## LAMIN A/C SPECKLES AND SPLICING FACTOR COMPARTMENTS (SFCs)

Several reports identified an association between lamin A/C and splicing factor compartments (SFCs). SFCs are 1–2 µm diameter speckles in which RNA splicing factors are concentrated. Acting as storage sites for transcription factors, SFCs are dynamic in that splicing factors are constantly recruited into and out of these compartments from and to transcription sites (Misteli et al., 1997). As such, their size changes depending on the level of transcription and mRNA splicing (Carmo-Fonseca et al., 1992; Spector, 1993; O'Keefe et al., 1994). Splicing of most nascent transcripts is simultaneous with transcription (Beyer and Osheim, 1988) and occurs on perichromatin fibrils that, in addition to being localized at the borders of SFCs, are also found throughout the nucleoplasm away from SFCs (Jackson et al., 1993; Wansink et al., 1993; Cmarko et al., 1999). In accordance with the previously proposed role of intranuclear lamins in nuclear organization, the finding that intranuclear lamin foci or speckles colocalize with RNA splicing factors in SFCs was suggestive of a structural role of lamins in SFCs (Jagatheesan et al., 1999). Subsequent studies likewise implicated lamin A/C in SFC organization. For instance, the expression of terminally tagged lamin A/C resulted in depletion of lamin speckles and SFCs, which was associated with reduction in RNA polymerase II (pol II) transcription. Furthermore, lamin speckles and SFCs concomitantly became larger upon pol II transcriptional inhibition and reversibly attained their normal size upon withdrawal of inhibition (Kumaran et al., 2002). In another study published the same year, the authors showed that disruption of the nuclear organization of lamin A/C, by means of a dominant-negative mutant that lacks the N-terminal domain, also reorganizes SFCs and inhibits pol II transcription (Spann et al., 2002). Contradicting these studies, Vecerová et al. (2004) tested Lmna<sup>−/−</sup> cells and showed that lamin A/C is non-essential for the formation and maintenance of SFCs. Nonetheless, this discrepancy between the different studies might be due to the different protein factors used to label SFCs. In addition, the expression of a truncated fragment of lamin A in Lmna<sup>−/−</sup> cells might be sufficient to preserve lamin speckles, despite its failure to preserve the nuclear lamina (Jahn et al., 2012).

## INTERACTION BETWEEN LAMIN A/C AND SPLICING FACTORS/RNA-BINDING PROTEINS

As mentioned earlier, disruption of lamin speckles was associated with down-regulation of pol II transcription (Kumaran et al., 2002; Spann et al., 2002). This could be a direct effect of the loss of the putative interaction between lamin A/C and SFC components such as SC-35. Indeed, it has been shown that SC-35 supports pol II-dependent elongation through its interaction with cyclin-dependent kinase 9 (CDK9), a component of the positive transcription elongation factor b (P-TEFb). CDK9 phosphorylates the C-terminal domain (CTD) of pol II and results in transcriptional elongation (Lin et al., 2008). In addition to its interaction with CDK9, an interaction between SC-35 and the CTD of pol II has also been reported (Yuryev et al., 1996). Interestingly, cardiac-specific deletion of SC-35 causes DCM (Ding et al., 2004), which is also well-known to be caused by LMNA gene mutations (Gruenbaum et al., 2005). ASF/SF2 is another splicing factor which interacts with lamin A/C (Gallego et al., 1997; Depreux et al., 2015). ASF/SF2 is a splicing regulator of several genes that encode cardiac proteins, such as CamkIIδ (Xu et al., 2005). As already mentioned, dysregulated CamkIIδ isoforms are associated with cardiomyopathy and heart failure (Zhang et al., 2002, 2003) and loss of ASF/SF2 in the heart tissue is responsible for DCM as well as perturbed excitation-contraction coupling (Xu et al., 2005). Therefore, the loss of association between lamin A/C and each of ASF/SF2 and SC-35 might affect their functions pertaining to striated muscle tissue, which might explain the skeletal and cardiac muscle effects of most LMNA mutations.

A recent study identified 130 proteins that repeatedly associate with lamin A tail in C2C12 myoblasts differentiated to form myotubes. Upon functional classification of these proteins, enrichment of proteins involved in RNA splicing was noted. Furthermore, binding partners belonging to this functional category were solely found to differ between wild type lamin A and two lamin A mutants associated with EDMD (Depreux et al., 2015). Of the identified proteins in this study, 15 proteins are localized in nuclear speckles (CDC5L, DDX3X, EFTUD2, LUC7L3, NPM1, PRPF19, RNPS1, SFRS1, SFRS3, SFRS4, SRSF10, SRRM1, SRRM2, THOC4, and U2AF2) and 30 proteins in the spliceosomal complex. Furthermore, some of the identified proteins, such as ASF/SF2 (or SFRS1) and SRSF10 are localized to both compartments. As already mentioned, both alternative splicing factors have heart-specific effects despite their ubiquitous expression and their knockout in mice is embryonically lethal due to impaired cardiac development (Feng et al., 2009). Accordingly, loss of interaction between lamin A/C and these splicing factors might account for the tissue-specific effects of lamin A/C mutations.

Driven by the importance of protein interactions in core cellular processes, large-scale biochemical, proteomic and bioinformatic approaches were employed to characterize the composition of cellular protein complexes in cultured human cells (Havugimana et al., 2012). This study identified new proteins that associate with lamin A/C. Although none of the identified putative lamin A/C partners are localized to nuclear speckles, three play a role in splicing (NONO, SF3B3 and hnRNP-M). hnRNP-M belongs to the heterogeneous nuclear ribonucleoprotein (hnRNP) family of proteins. Members of this family are implicated in pre-mRNA transcription, translation, processing and transport, all of which might affect gene expression (Kim et al., 2000). They exert their effects on the fate of pre-mRNA by alternative splicing, manipulating the structure of pre-mRNA and affecting accessibility to other RNA processing factors (Martinez-Contreras et al., 2007). Remarkably, hnRNP-M has been shown to interact with cell division cycle 5-like (CDC5L) and pleiotropic regulator 1 (PLRG1) (Lleres et al., 2010), two core components of the CDC5L complex which is crucial for spliceosome assembly and function (Ajuh et al., 2000; Makarova et al., 2004). The interaction domain in hnRNP-M was shown to be essential for its role in constitutive and alternative splicing. In addition to its presence in the spliceosome, CDC5L is also present in nuclear speckles and has been shown to interact with lamin A/C (Lleres et al., 2010; Depreux et al., 2015). Other CDC5L-associated proteins that are localized to nuclear speckles and putatively interact with lamin A/C are SC-35 and ASF/SF2 (Ajuh et al., 2000).

As such, lamin A/C might regulate the pre-assembly and targeting of sub-complexes such as the CDC5L complex to the emerging spliceosomal complex, such that loss of interaction between sub-complex components and lamin A/C might influence the function of the spliceosome and gene expression. Other hnRNPs were identified by a novel approach that assesses proximity or binding to lamin A in a quite natural cellular context. This approach utilizes a biotin ligase fusion of lamin A followed by mass spectrometry of biotinylated proteins. The biotinylated hnRNPs by this procedure were hnRNP-E1, hnRNPA1, hnRNPA2B1, hnRNPA0 and hnRNPR. In addition to hnRNPs, other splicing factors that interact with and/or are proximal to lamin A, such as SF1, U2SURP, GPATCH1, DGCR14, RBM10, SUGP1, PAPOLA, TFIP11 and GTF2F2, were identified (Roux et al., 2012). Interestingly, hnRNP-E1 was shown by another study to retain its interaction with progerin, a truncated form of lamin A that causes HGPS (Zhong et al., 2005). This implies that the interaction domain might be preserved in progerin and that loss of interaction between lamin A and its interacting partners is mutation dependent, thus providing further insight into the diverse tissue-specific phenotypes associated with LMNA mutations (Zhong et al., 2005). In tandem with our improved understanding of DCM disease mechanisms, these insights have been enabled by the advances in next generation sequencing and transgenic models in the past two decades.

#### LIMITATIONS AND FUTURE PERSPECTIVES

Not only has NGS helped in identifying mutations in a panel of DCM-causing genes, but its RNA-seq application has also uncovered DCM-associated defects in mRNA splicing. By means of this high-throughput technology, genetic testing has helped the diagnosis of DCM mostly through a targeted NGS panel. The benefits of such genetic diagnosis lie in determining the disease etiology, natural history and prognosis but most importantly in determining the need for family screening and the choice of a proper treatment strategy. To improve the diagnostic yield of genetic testing without increasing the chance of identifying variants of unknown significance (VUS) and incidental findings (IF), a recent study proposed a standardized stepwise exome-sequencing based approach in pediatric DCM. As this approach combines exome-based targeted analysis, copy number variation analysis and Human Phenotype Ontology (HPO) filtering, it permits heterogeneous disease identification and thus personalized data analysis (Herkert et al., 2018).
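The logic of a stepwise analysis, in which a narrow gene panel is examined first and the search is widened by phenotype terms only when the panel yields nothing, can be sketched as a toy filter. This is an illustration under stated assumptions and not the pipeline of Herkert et al. (2018): the `Variant` record, the gene lists, the HPO-to-gene mapping, the frequency threshold, and the function names are all hypothetical, and the copy number variation step is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str
    population_frequency: float
    consequence: str  # e.g. "missense", "splice_site", "synonymous"


# Hypothetical gene sets, for illustration only.
DCM_PANEL = {"TTN", "LMNA", "RBM20", "MYH7"}
# The HPO identifier is used here only as an example dictionary key.
HPO_TO_GENES = {"HP:0001644": {"DSP", "DMD", "FLNC"}}
DAMAGING = {"missense", "nonsense", "frameshift", "splice_site"}


def is_candidate(variant: Variant, max_frequency: float = 0.001) -> bool:
    """Keep rare variants with a potentially damaging consequence."""
    return (variant.population_frequency <= max_frequency
            and variant.consequence in DAMAGING)


def stepwise_filter(variants, hpo_terms):
    """Step 1: targeted panel. Step 2 (only if step 1 is empty): HPO-derived genes."""
    candidates = [v for v in variants if is_candidate(v)]
    panel_hits = [v for v in candidates if v.gene in DCM_PANEL]
    if panel_hits:
        return "panel", panel_hits
    hpo_genes = set()
    for term in hpo_terms:
        hpo_genes |= HPO_TO_GENES.get(term, set())
    return "hpo", [v for v in candidates if v.gene in hpo_genes]


# Hypothetical patient: a common TTN variant and a rare DSP splice-site variant.
patient = [
    Variant("TTN", 0.20, "missense"),
    Variant("DSP", 0.0001, "splice_site"),
]
stage, hits = stepwise_filter(patient, ["HP:0001644"])
print(stage, [v.gene for v in hits])
```

Stopping at the first step that returns a candidate is what limits exposure to variants of unknown significance in this sketch: genes outside the panel are inspected only when the panel is uninformative.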

Despite the progress made in identifying DCM-associated genes, further work is still needed to uncover new DCM-causing genes, to investigate the pathogenic role of many of the reported mutations, and to decipher their biofunctional relevance, especially for those revealed by candidate-gene approaches or identified in a small number of families. Unbiased genome-wide approaches such as whole genome sequencing should be used to validate DCM-associated mutations. In addition, mutations should be replicated and validated in large cohorts. Furthermore, while most studies have focused on identifying single nucleotide variants in coding regions, other types of genomic variation such as structural variants, transposable element insertions or variants in non-coding regions should be explored further.

Studies should also tackle the role of mitochondrial variation, somatic variation, and epigenetic modifications in DCM. Confounding factors that affect the DCM phenotype in different individuals such as penetrance, effect of multiple variants, ethnic and gender differences, and environmental factors have yet to be established.

#### REFERENCES


#### AUTHOR CONTRIBUTIONS

DJ and HZ contributed to the idea conception, overall review design, text mining, and interpretation of the scientific literature discussed in this review. HZ wrote the paper. DJ revised and edited the paper.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Zahr and Jaalouk. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Emergence of Intrahepatic Cholangiocarcinoma: How High-Throughput Technologies Expedite the Solutions for a Rare Cancer Type

Meng-Shin Shiao<sup>1†</sup>, Khajeelak Chiablaem<sup>2†</sup>, Varodom Charoensawan<sup>3,4,5</sup>, Nuttapong Ngamphaiboon<sup>6</sup> and Natini Jinawath<sup>2,4</sup>\*

#### Edited by:

*Arvin Gouw, Rare Genomics Institute, United States*

#### Reviewed by:

*Theodora Katsila, University of Patras, Greece*

*Ramu Elango, Princess Al-Jawhara Center of Excellence in Research of Hereditary Disorders, King Abdulaziz University, Saudi Arabia*

#### \*Correspondence:

*Natini Jinawath jnatini@hotmail.com; natini.jin@mahidol.ac.th*

*†These authors have contributed equally to this work*

#### Specialty section:

*This article was submitted to Genetic Disorders, a section of the journal Frontiers in Genetics*

Received: *09 October 2017* Accepted: *23 July 2018* Published: *15 August 2018*

#### Citation:

*Shiao M-S, Chiablaem K, Charoensawan V, Ngamphaiboon N and Jinawath N (2018) Emergence of Intrahepatic Cholangiocarcinoma: How High-Throughput Technologies Expedite the Solutions for a Rare Cancer Type. Front. Genet. 9:309. doi: 10.3389/fgene.2018.00309*

*<sup>1</sup> Research Center, Faculty of Medicine Ramathibodi Hospital, Mahidol University, Bangkok, Thailand, <sup>2</sup> Program in Translational Medicine, Faculty of Medicine Ramathibodi Hospital, Mahidol University, Bangkok, Thailand, <sup>3</sup> Department of Biochemistry, Faculty of Science, Mahidol University, Bangkok, Thailand, <sup>4</sup> Integrative Computational BioScience (ICBS) Center, Mahidol University, Nakhon Pathom, Thailand, <sup>5</sup> Systems Biology of Diseases Research Unit, Faculty of Science, Mahidol University, Bangkok, Thailand, <sup>6</sup> Medical Oncology Unit, Department of Medicine, Faculty of Medicine, Ramathibodi Hospital, Mahidol University, Bangkok, Thailand*

Intrahepatic cholangiocarcinoma (ICC) is the cancer of the intrahepatic bile ducts and, together with hepatocellular carcinoma (HCC), constitutes the majority of primary liver cancers. ICC is a rare disorder as its overall incidence is <1/100,000 in the United States and Europe. However, it shows much higher incidence in particular geographical regions, such as northeastern Thailand, where liver fluke infection is the most common risk factor of ICC. Since the early stages of ICC are often asymptomatic, the patients are usually diagnosed at advanced stages with no effective treatments available, leading to a high mortality rate. In addition, unclear genetic mechanisms, heterogeneous nature, and various etiologies complicate the development of new efficient treatments. Recently, a number of studies have employed high-throughput approaches, including next-generation sequencing and mass spectrometry, in order to understand ICC in different biological aspects. In general, the majority of recurrent genetic alterations identified in ICC are enriched in known tumor suppressor genes and oncogenes, such as mutations in *TP53*, *KRAS*, *BAP1, ARID1A*, *IDH1*, *IDH2*, and novel *FGFR2* fusion genes. Yet, there are no major driver genes with immediate clinical solutions characterized. Interestingly, recent studies utilized multi-omics data to classify ICC into two main subgroups, one with immune response genes as the main driving factor, while another is enriched with driver mutations in the genes associated with epigenetic regulations, such as *IDH1* and *IDH2*. The two subgroups also show different hypermethylation patterns in the promoter regions. Additionally, the immune response induced by host-pathogen interactions, i.e., liver fluke infection, may further stimulate tumor growth through alterations of the tumor microenvironment.
For in-depth functional studies, although many ICC cell lines have been globally established, these homogeneous cell lines may not fully explain the highly heterogeneous genetic contents of this disorder. Therefore, the advent of patient-derived xenograft and 3D patient-derived organoids as new disease models together with the understanding of evolution and genetic alterations of tumor cells at the single-cell resolution will likely become the main focus to fill the current translational research gaps of ICC in the future.

Keywords: intrahepatic cholangiocarcinoma, high-throughput technology, integrative multi-omics analysis, molecular biomarker, disease model, translational medicine, precision oncology

## BACKGROUND

The biliary system includes bile ducts and gallbladder. The main functions of bile ducts are to transfer bile from the liver and gallbladder to the small intestine to help with the digestion and absorption of dietary fats. Bile ducts can be classified into several parts based on the anatomical locations and structures. Peripheral branches of intrahepatic bile ducts drain into the right and left hepatic ducts, which then merge into a larger tube outside the liver, called the common hepatic duct. This extrahepatic bile duct further combines with the cystic duct from the gallbladder and becomes the common bile duct. Cholangiocarcinoma (CCA) is a group of heterogeneous malignancies that occurs in any part of the bile ducts. It can be further classified into three different categories based on the anatomical positions. The tumors that occur in the intrahepatic bile ducts are termed intrahepatic cholangiocarcinoma (ICC), while those located between the secondary branches of the right and left hepatic ducts and the common hepatic duct proximal to the cystic duct origin, and in the common bile duct are classified as perihilar and distal cholangiocarcinomas, respectively (Blechacz, 2017; **Figure 1**). As ICC occurs inside the liver, it is also one of the two main types of primary liver cancers besides hepatocellular carcinoma (HCC). In this review, we aim to provide a comprehensive update and novel insights on ICC, the rare type of CCA, which is known for its extraordinary complexity and heterogeneity, along with dismal prognosis.

Based on a 31-year study in the United States, ICC accounts for only 8% of all CCA cases, and is considered to be a rare disorder (DeOliveira et al., 2007). ICC occurs with the highest prevalence in Hispanic Americans (1.22 per 100,000 people) and lowest in African Americans (0.3 per 100,000 people) (McLean and Patel, 2006). By contrast, it is more common in East Asian and Southeast Asian countries. ICC has an incidence of around 10 per 100,000 people in China (males), and the highest frequency of occurrence, 71 per 100,000 people (males), is found in the northeastern part of Thailand (Shin et al., 2010a). Interestingly, the global incidence of ICC seems to have increased in recent years (Khan et al., 2012).

Risk factors of ICC include bile duct cysts, chronic biliary irritation, parasitic or viral infections, inflammatory bowel disease, abnormal bile ducts, and exposure to chemical carcinogens. Chronic inflammation caused by parasitic infection, particularly liver flukes (*Opisthorchis viverrini* and *Clonorchis sinensis*), is a well-known risk factor of ICC in northeastern Thailand (Sripa et al., 2007; Sripa and Pairojkul, 2008). Eating raw or uncooked fermented fish, a common local dish in this area, results in the high incidence of recurrent liver fluke infections, which are strongly associated with ICC. Several mechanisms have been proposed to explain the association between liver fluke infection and ICC (Sripa et al., 2007). First, when liver flukes start their parasitic life in humans, they attach themselves to the bile duct epithelia using their suckers, which cause damage to the epithelial walls of the ducts. The repeated damage-repair processes may result in the epithelial-mesenchymal transition (EMT) of cell states. Second, the inflammation reactions induced by parasites and the chemicals secreted by them, as well as mutagens from fermented food, may create more carcinogens that damage DNA and result in irreversible oncogenic mutations.

Other than parasitic infections, hepatitis B virus (HBV) and hepatitis C virus (HCV) infections are also associated with ICC. HBV and HCV nucleic acids have been found in 27% of ICC tumors in a US-based study (Perumal et al., 2006). Another study in China has shown a strong association between chronic HBV infection and ICC in a total of 317 patients, and further suggested that ICC and HCC may share a common carcinogenesis process (Zhou et al., 2010). In addition, HBV and HCV infections are proposed to be associated with increasing incidence of ICC from several case-control studies (Yamamoto et al., 2004; Fwu et al., 2011; Sempoux et al., 2011; Yu et al., 2011; Zhou et al., 2012; Li et al., 2015). Other possible risk factors of ICC include smoking, alcohol drinking, obesity and diabetes mellitus, which are mostly observed in western countries (Tyson and El-Serag, 2011). A detailed summary of established risk factors for ICC and their relative risks is shown in **Table 1**.

The most fundamental categorization of ICC is based on the macroscopic features established by the Liver Cancer Study Group of Japan in 2003 (Yamasaki, 2003). The authors described three macroscopic subtypes of ICC, namely, mass-forming type (MF), periductal-infiltrating type (PDI) and intraductal growth (IDG) type. MF type forms a definite mass in the liver parenchyma. PDI type is defined as tumors that extend longitudinally along the ducts, while the IDG type forms a papillary growth inside the lumen of intrahepatic ducts. MF subtype is the most common subtype (about 65%), whereas PDI and IDG types are less prevalent (around 5% each), and mixed-type (MF+PDI) accounts for ∼25% of the cases (Yamasaki, 2003; Sempoux et al., 2011). However, based on more recent data, a noteworthy degree of heterogeneity of ICCs in regard to their histopathological and molecular features was observed. Therefore, in addition to the traditional classifications, multiple new criteria were proposed in order to subcategorize ICC (Vijgen et al., 2017).

Serum biomarkers are usually used to help screen cancer at its earliest stages. A wide variety of markers have been tested in bile and serum with limited success. To date, disease-specific biomarkers for CCA have yet to be established (Valle et al., 2016) and are urgently needed. The most frequently used biomarker for diagnosis and treatment prediction in CCA patients in clinical practice is carbohydrate antigen 19-9 (CA 19-9) (Liang et al., 2015), which is the standard tumor marker for pancreatic adenocarcinoma (Ballehaninna and Chamberlain, 2012). Nevertheless, serum levels of CA 19-9 are also elevated in benign cholestasis such as primary sclerosing cholangitis (PSC), complicating its clinical use (Lin et al., 2014). A serum CA 19-9 level >100 U/mL has quite limited sensitivity and specificity (75 and 80%, respectively) in identifying PSC patients with CCA (Chalasani et al., 2000). In ICC, a large cohort analysis by Bergquist et al. reported an elevated CA 19-9 level as an independent risk factor for mortality. Elevation of CA 19-9 independently predicted increased mortality with impact similar to node-positivity, positive-margin resection, and non-receipt of chemotherapy (Bergquist et al., 2016).
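To see why a sensitivity of 75% and a specificity of 80% are limiting, it helps to convert them into predictive values. The sketch below applies Bayes' rule; the 10% prevalence of CCA among tested PSC patients is an assumed figure chosen only for illustration and is not taken from Chalasani et al. (2000), and the function name `predictive_values` is hypothetical.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) for a test applied to a population with the given prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity from the text; prevalence is an assumption.
ppv, npv = predictive_values(sensitivity=0.75, specificity=0.80, prevalence=0.10)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")  # PPV = 0.29, NPV = 0.97
```

Under this assumed prevalence, fewer than one in three positive results would correspond to CCA, which illustrates why the marker alone cannot serve as a screening test. A different prevalence would change both values.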

Since the clinical presentation of ICC is not specific and the disease in its early stage is usually asymptomatic, the patients are often diagnosed at an advanced stage. Surgical resection, which is the only curative treatment, remains the anchor of therapy for patients with resectable ICC (Weber S. M. et al., 2015). Nevertheless, because of the late presentation of symptoms and the central hepatic location of ICC, only ∼30% of the patients are deemed eligible for resection by the time of diagnosis. This results in a low 5-year survival and high recurrence rate after resection (Hyder et al., 2013). Loco-regional therapies (LRT) including intra-arterial embolotherapy (IAT) and radiofrequency ablation have been reported as feasible and effective palliative treatments for patients with unresectable ICC (Savic et al., 2017). Overall, systemic cytotoxic chemotherapy is still the mainstay of treatment for patients with advanced unresectable, recurrent or metastatic ICC. In a landmark phase III randomized study in patients with advanced biliary tract cancer (BTC), doublet chemotherapy (addition of cisplatin to gemcitabine) improved the tumor control rate from 72 to 81% (P = 0.049) and overall survival from 8.1 to 11.7 months (hazard ratio 0.64; P < 0.001) (Valle et al., 2010). Thus, it has since been considered as the standard of care although the efficacy remains limited. Of note, CCA only accounts for ∼60% of all BTC patients enrolled in this study. Another well-established combination chemotherapy regimen for advanced BTC is GEMOX, which consists of gemcitabine plus oxaliplatin (Sharma et al., 2010). So far, several clinical trials investigating the efficacy of targeted therapies, such as cetuximab, panitumumab, erlotinib, selumetinib, sunitinib, and bevacizumab, have failed to demonstrate survival benefits for this group of patients (Zhu et al., 2010; Bekaii-Saab et al., 2011; Jensen et al., 2012; Lee et al., 2012; Yi et al., 2012; Malka et al., 2014).

*<sup>a</sup>Endemic in Northeastern Thailand, Lao, Vietnam, Cambodia.*

*<sup>b</sup>Endemic in South China, Japan, Korea, Taiwan.*

*<sup>c</sup>HFR 677CC*+*TSER 2R; GSTO1*\**A140D; MRP2/ABCC2 variant c.3972C*>*T; (NKG2D rs11053781, rs2617167)* +*PSC; MICA5.1*+*PSC; CYP1A2*\**1A/*\**1A; NAT2*\**13,*\**6B,*\**7A; XRCCI194W; XRCC1 R280H; PYGS2 Ex10*+*837 (Tyson and El-Serag, 2011).*

Taken together, even though ICC is considered a rare cancer type, it represents an emerging health problem with increasing incidence worldwide. ICC is usually diagnosed at late stages and has poor prognosis, partly due to the complex anatomical structure of the biliary system, its various etiologies, heterogeneous subclassifications, and the lack of effective biomarkers and treatments. To date, the genetic signatures of ICC are still poorly understood and no major clinically actionable driver mutations have been identified. An overview of current challenges in the treatment of ICC is outlined in **Box 1**. In the next sections, we aim to provide an in-depth update on the application of recent advances in high-throughput technologies that can help expedite the translation of research discoveries in ICC and related cancers, as well as current disease models used to facilitate the development of precision oncology in ICC.

#### MOLECULAR FEATURES AND SUBTYPES OF ICC IDENTIFIED BY HIGH-THROUGHPUT APPROACHES

Advances in high-throughput screening methods such as next-generation sequencing (NGS) and liquid chromatography–mass spectrometry (LC-MS) have enabled broader interrogation of genetic diseases and other disorders. The so-called "omics" data can be defined and categorized according to different groups of biological molecules and regulatory processes, which provide different information about the cells. Given the advantages of broader and deeper scales of available data, different types of omics are applied widely and rapidly to study the associations between different variations and phenotypes, and are also used to predict prognosis. They also help in the classification of subtypes of a disease, which may require different treatment guidelines (Kristensen et al., 2014).

**Box 1 |** Challenges in ICC treatment.

Genomics is one of the earliest to be introduced among the omics data series. Common types of somatic DNA alterations including single nucleotide variants (SNVs), insertions and deletions (INDELs), copy number alterations (CNAs), and structural variations (SVs) have all been shown to play important roles in development and progression of ICC (Zou et al., 2014). Comparative genomics of cancer and normal cells serves as an important platform to investigate molecular mechanisms of cancers; however, biological functions of oncogenes largely depend on how they are expressed (or not expressed) into functional oncoproteins and which tissues they are expressed in. Transcriptomics describes the abundance of transcribed messenger RNA (mRNA) and other non-coding RNAs. Even though most transcriptomic studies on ICC and related cancers have been focused on mRNA (Jinawath et al., 2006), dysfunction of non-coding RNAs, particularly microRNA (miRNA) and long non-coding RNA (lncRNA), has recently been found to play roles in ICC as well (Wang et al., 2016; Yang et al., 2017; Zheng et al., 2017). Beyond transcript abundance, transcriptomic profiling by RNA-Seq also provides novel information on alternative splicing isoforms of a gene and confirms the expression of novel fusion gene transcripts, which are surprisingly prevalent in ICC (Arai et al., 2014; Borad et al., 2014; Ross et al., 2014; Nakamura et al., 2015; Sia et al., 2015). Transcriptional levels significantly depend on epigenetic configuration of regulatory elements targeting the oncogenes and tumor suppressor genes. It has been shown in CCA, including ICC, that DNA methylation is markedly enriched in either CpG islands or shores, which are regulatory regions enriched in cytosine and guanine nucleotides (Jusakul et al., 2017). Downstream of transcriptomics, proteomics has been widely used to quantify peptide sequences, post-translational modifications, protein abundance and interactions. Aberrant proteins secreted by cancer cells and released into various kinds of body fluids, such as blood, urine and saliva, provide good non-invasive biomarkers for early detection of cancer and recurrent disease. A few studies have proposed potential biomarkers for CCA and HCC based on mass spectrometry analysis of cancer-specific secreted proteins (Srisomsap et al., 2010; Cao et al., 2013). Another high-throughput approach, metabolomics, quantifies small molecules, such as amino acids, fatty acids, carbohydrates, or other compounds related to cellular metabolic functions. Metabolite levels and relative ratios reflect metabolic function, and perturbations outside the normal range are often indicative of disease, as also shown in ICC (Murakami et al., 2015).

One of the most apparent applications of omic techniques in cancer research is the characterization of cancer subtypes and their signatures, which frequently leads to personalized treatments for cancer patients bearing different tumor signatures. For instance, based on a large whole-exome sequencing (WES) and whole-genome sequencing (WGS) dataset of 7,042 tumors generated from 30 primary cancer types, cancers could be categorized into 21 different molecular signatures (Alexandrov et al., 2013). Molecular signature 1, for example, has the highest prevalence in all the cancer samples (∼70%), and is mostly associated with age. Signature 3 accounts for about 10% of the prevalence and is associated with mutations in BRCA1/2. Therefore, combining signatures 1 and 3 explains over 80% of the breast cancer cases. Even though the prevalence of somatic mutations varies significantly within each cancer type, cancer types can be distinguished using different combinations of signatures. In parallel, another study categorized 3,299 tumors from The Cancer Genome Atlas (TCGA), comprising 12 cancer types, into two main classes: one with dominant oncogenic signatures of somatic mutations (M class), and the other with dominant signatures of CNAs (C class) (Ciriello et al., 2013). The M class tumors show primarily genomic mutations and epigenetic alterations, such as DNA hypermethylation. Conversely, the C class tumors show primarily CNAs, particularly high-level amplifications and homozygous deletions. Targetable molecular alterations in a tumor class allow the use of class-specific combination cancer therapy. More recently, an integrated analysis of genetic alterations focusing on 10 canonical signaling pathways in 9,125 TCGA-profiled tumors from 33 cancer types, including CCA, has underlined significant representation of individual and co-occurring actionable alterations among these pathways, which suggests opportunities for targeted and combination therapy (Sanchez-Vega et al., 2018). In addition, WES and transcriptome data were applied to identify molecular signatures of metastatic solid tumors from 500 adult patients (Robinson et al., 2017). Altogether, such systematic approaches can potentially be applied specifically to ICCs, where each tumor may carry different underlying genetic mechanisms and prognoses, in order to obtain more effective treatment for individual patients.

To overcome the challenges in ICC diagnosis and treatment (**Box 1**), multiple high-throughput omics studies have been performed to discover the underlying molecular mechanisms that can be translated into precision oncology applications. To better understand the current progress in ICC translational research, here we review the various subclassifications of ICC with regard to its cells of origin, different etiologies and unique clinicomolecular aspects of this rare disorder. A detailed summary of the high-throughput omics studies of ICC can be seen in **Table 2**.

#### Cells of Origin of ICC

Primary liver cancer, which is the second leading cause of cancer-related death worldwide, is mainly composed of ICC and HCC. The molecular and clinical features of the two cancers are distinct in most cases. Nevertheless, many studies have shown that the two cancers may share the same driver genes, possibly because they also share the same cells of origin: hepatocytes and cholangiocytes arise from a common progenitor, the hepatoblast. ICC usually has a poorer prognosis than HCC because of the difficulties in early disease detection and its poorly understood carcinogenesis mechanisms. In a small proportion of cases, ranging from 0.4 to 14% depending on the geographical region, patients develop combined hepatocellular cholangiocarcinoma (CHC) (Theise et al., 2010), which was proposed to be of monoclonal origin based on a recent study (Wang et al., 2018).

Various genetically engineered mouse models have been generated to study the cellular origin of primary liver cancers; however, the results are still inconclusive. By ablating genes in the Hippo signaling pathway (Lee et al., 2010; Lu et al., 2010) or knocking out the neurofibromatosis type 2 (Nf2) gene (Benhamouche et al., 2010) in mice, the authors proposed that ICC and HCC may share the same progenitor cells, since all surviving mice eventually developed both CCA and HCC. A similar result was obtained by transducing oncogenes, i.e., H-Ras or SV40LT, into mouse primary hepatic progenitor cells, lineage-committed hepatoblasts, and differentiated adult hepatocytes. Regardless of their position in the hepatic lineage hierarchy, transduced cells were able to give rise to a continuous spectrum of liver cancers from HCC to CCA, suggesting that any hepatic lineage cell can be the cell of origin of primary liver cancer (Holczbauer et al., 2013). Several large multi-omics studies have shown that ICC and HCC share recurrently mutated genes including TP53, BAP1, ARID1A, and ARID2 (Chaisaingmongkol et al., 2017; Farshidfar et al., 2017; Wang et al., 2018). Furthermore, ICC together with HCC can be categorized into C1 and C2 subtypes. ICC-C1 and HCC-C1 share similar transcriptomic patterns that are significantly different from those of ICC-C2 and HCC-C2. Interestingly, ICC-C1 and HCC-C1 are enriched for aberrant mitotic checkpoint signaling, suggesting a high rate of chromosomal instability, while the C2 groups are enriched for cell immunity-related pathways, which implies an association with inflammatory responses (Chaisaingmongkol et al., 2017). These findings indicate that ICC and HCC, while clinically treated as separate entities, share common molecular subtypes with similar actionable drivers that can be exploited to improve precision therapy.

It should be noted that ICC- or HCC-specific alterations also exist. Aberrant activation of NOTCH signaling and gain-of-function mutations in the genes encoding isocitrate dehydrogenases (IDH1 and IDH2) are required for ICC development, and thus are significantly more common in ICC than in HCC (Sekiya and Suzuki, 2012; Moeini et al., 2016). In addition, activation of KRAS combined with deletion of PTEN in a mouse model generates only ICC (Ikenoue et al., 2016). Multiple studies have identified different molecular features of ICC and HCC by applying large-scale high-throughput datasets. By combining metabolomics and transcriptomics data from 10 ICC and six HCC samples together with their paired normal tissues, a research team showed that ICC can be distinguished from HCC by the distinct expression patterns of 62 mRNAs, 17 miRNAs, and 14 metabolites (Murakami et al., 2015), leading to the conclusion that ICC and HCC have different oncogenic mechanisms. Recently, Farshidfar et al. conducted a meta-analysis by combining sequencing data from a total of 458 ICC, 153 pancreatic ductal adenocarcinoma (PDAC), and 196 HCC samples from multiple studies including TCGA. They identified a distinct subtype of ICC enriched for IDH mutants, and found that HCC can be characterized by CTNNB1 and TERT promoter mutations, which are absent in ICC (Farshidfar et al., 2017).

In conclusion, although ICC shares some molecular changes with HCC, likely because of the same cells of origin, this rare cancer also possesses its own unique differentiation and evolution pathways, as well as specific genetic alterations and distinct gene expression patterns.

#### Different Etiologies of ICC

Parasitic infection by liver flukes, i.e., O. viverrini (OV) and C. sinensis, is a well-known ICC risk factor, particularly in Thailand. Chronic liver fluke infection is estimated to account for 8–10% of the overall ICC incidence (Gupta and Dixon, 2017). The gene expression study by Jinawath et al. (2006) was one of the first reports to elucidate the different genetic mechanisms between liver fluke- and non-liver fluke-associated ICCs. Using cDNA microarrays, the authors compared the two groups of ICC at the transcriptional level, and found that genes involved in xenobiotic and endobiotic metabolism, i.e., UDP-glucuronosyltransferases (UGT2B11, UGT1A10) and sulfotransferases (CHST4, SULT1C1), have higher expression in liver fluke-associated ICCs compared with the non-liver fluke group. These genes are believed to play important roles in the detoxification of carcinogens such as nitrosamines from preserved food and, if any, toxic substances released from the parasites or created by parasite-induced chronic inflammation. On the other hand, genes involved in growth factor signaling show higher expression in non-liver fluke ICCs.

Different causative etiologies may induce distinct somatic alterations. Recurrent infection of liver flukes, particularly OV, has been associated with different DNA mutation signatures in ICCs. A WES study demonstrated that the frequently mutated genes in OV-related ICCs comprise both known cancer genes, such as TP53, KRAS and SMAD4, and newly implicated cancer genes including MLL3, ROBO2, RNF43, PEG3, and GNAS, which are genes involved in histone methylation, genome stability, and G-protein signaling (Ong et al., 2012). Another WES study further showed that TP53 mutations are more enriched in OV-related ICCs, while mutations in BAP1, IDH1, and IDH2 genes are more common in non-OV-related tumors (Chan-On et al., 2013).

A recent multi-omics study analyzed the combined datasets of WGS, WES, CNAs, transcriptomes and epigenomes, and identified four CCA clusters likely driven by distinct etiologies, with separate genetic, epigenetic, and clinical features (Jusakul et al., 2017). The results showed that liver fluke infection is one of the most important classification factors and is also a factor that leads to poorer prognosis. In this study, clusters 1 and 2, which are liver fluke positive, are enriched for recurrent mutations in TP53, ARID1A and BRCA1/2, and for ERBB2 amplifications. In contrast, clusters 3 and 4, which comprise mostly non-liver fluke-associated tumors, are enriched for recurrent mutations in epigenetic-related genes, i.e., BAP1 and IDH1/2, as well as FGFR rearrangements, and have high PD-1/PD-L2 expression. Additionally, DNA hypermethylation of CpG islands and high levels of mutations in H3K27me3-associated promoters were observed only in cluster 1, while cluster 4 exhibited DNA hypermethylation in CpG shores. These findings suggest different mutational pathways across the four CCA subtypes.

Other than liver fluke, hepatitis virus infection has been proposed to be associated with an increased risk of ICC as well. A meta-analysis combining 13 case–control studies and three cohorts of ICC patients reported a statistically significant increased risk of ICC with HBV and HCV infection (OR = 3.17, 95% CI, 1.88–5.34, and OR = 3.42, 95% CI, 1.96–5.99, respectively) (Zhou et al., 2012). To investigate whether viral hepatitis-associated ICC may harbor specific histomorphological and genetic features, Yu et al. analyzed 170 ICC patients who were either seropositive or seronegative for HBV or HCV. The authors identified N-cadherin as an immunohistochemistry (IHC) marker for viral hepatitis-associated ICC. N-cadherin IHC positivity is also strongly associated with cholangiolar morphology, lack of CEA, high MUC2 expression, and low KRAS mutation frequency (Yu et al., 2011). In line with these findings, another study conducting WES in ICCs found that HBV-associated ICCs carry high TP53 mutation loads, while mutations in KRAS are almost exclusively identified in tumors of HBV-seronegative patients (Zou et al., 2014). However, larger scale high-throughput studies have yet to be performed in viral hepatitis-associated ICCs.
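As a quick consistency check on the odds ratios quoted above, note that a 95% confidence interval for an odds ratio is conventionally built symmetrically on the log scale. The geometric mean of the two bounds should therefore fall close to the point estimate, and the width of the interval gives an approximate standard error for the log odds ratio. The minimal Python sketch below applies this to the reported values; the function name `log_scale_summary` and the normal-approximation critical value of 1.96 are illustrative assumptions, not details taken from Zhou et al. (2012).

```python
import math


def log_scale_summary(or_point, ci_low, ci_high, z_crit=1.96):
    """Summarize an odds ratio and its CI on the log scale.

    Assumes the interval was constructed as exp(log(OR) +/- z_crit * SE),
    which is the usual convention but is an assumption here.
    """
    log_low, log_high = math.log(ci_low), math.log(ci_high)
    # Half the log-scale width equals z_crit * SE.
    se_log_or = (log_high - log_low) / (2 * z_crit)
    # Geometric mean of the bounds; should sit near the point estimate.
    geometric_midpoint = math.exp((log_low + log_high) / 2)
    # Wald z statistic for the null hypothesis OR = 1.
    z = math.log(or_point) / se_log_or
    return {
        "se_log_or": se_log_or,
        "geometric_midpoint": geometric_midpoint,
        "z": z,
    }


# Values as quoted in the text for HBV and HCV infection.
hbv = log_scale_summary(3.17, 1.88, 5.34)
hcv = log_scale_summary(3.42, 1.96, 5.99)

for label, summary in (("HBV", hbv), ("HCV", hcv)):
    print(label, {key: round(value, 3) for key, value in summary.items()})
```

For both infections the geometric midpoint of the interval agrees with the reported odds ratio to within rounding, and neither interval includes 1, which is consistent with the statistically significant association described above.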

#### Other Molecular and Clinical Aspects

Based on gene expression and SNP microarrays, two main subtypes of ICC, proliferation (PF) and inflammation (IF), were identified (Sia et al., 2013a). The PF subtype is more common and is characterized by activation of oncogenic signaling pathways, DNA amplifications of 11q13.2 (including the CCND1 and FGF19 gene loci), deletions of 14q22.1 (including the SAV1 gene locus), and mutations in KRAS and BRAF; it is associated with a poor prognosis. In contrast, the IF subtype is characterized by activation of inflammatory signaling pathways, overexpression of cytokines and STAT3 activation, and is associated with a better prognosis. Another study led by Andersen et al. classified ICC patients into two subgroups based on 5-year survival rate, time to recurrence, and the absence or presence of KRAS mutations. Similarly, KRAS mutations are associated with poor clinical outcomes (Andersen et al., 2013).

As mentioned earlier, based on a large-scale TCGA study, mutational signatures can be divided into two major classes, namely M and C (Ciriello et al., 2013). By combining WES and transcriptomic data, a study showed that ICCs carry signatures of both M and C classes as well (Kim et al., 2016). ICC of C class harbors recurrent focal CNAs including deletions involving CDKN2A, ROBO1, ROBO2, RUNX3, and SMAD4, while those of M class harbor recurrent mutations in the genes frequently mutated in ICC, i.e., TP53, KRAS, and IDH1, as well as epigenetic regulators and genes in TGFβ signaling pathway.

Focusing on the genomic findings from all ICC studies discussed above, recurrent mutations of ICC are enriched in tumor suppressor genes, i.e., ARID1A, ARID1B, BAP1, PBRM1, TP53, STK11, and PTEN, and oncogenes, i.e., IDH1, IDH2, KRAS, BRAF, and PIK3CA. The frequencies of these recurrent mutations in ICC across multiple studies are summarized in **Figure 1**. The majority of these genes are associated with genome instability and epigenetic alterations, which are common underlying mechanisms of cancer. Recurrent mutations of the BRCA2, MLL3, APC, NF1, and ELF3 tumor-suppressor genes have also been reported in ICC (Farshidfar et al., 2017). Using transcriptomic analysis, fibroblast growth factor receptor 2 (FGFR2) fusion genes, i.e., FGFR2-AHCYL, FGFR2-BICC1 type1, FGFR2-BICC1 type2, FGFR2-PPHLN1, FGFR2-MGEA5, FGFR2-TACC3, FGFR2-KIAA1598, FGFR2-KCTD1, and FGFR2-TXLNA, are found to be among the most prevalent alterations in ICC (Jiao et al., 2013; Borad et al., 2014; Ross et al., 2014; Murakami et al., 2015; Sia et al., 2015; Farshidfar et al., 2017; **Figure 1**). Furthermore, they are reported to be present exclusively in ICC, and not in ECC or gallbladder cancer (Nakamura et al., 2015). FGFR2 fusion proteins have been shown to facilitate oligomerization and FGFR kinase activation, resulting in altered cell differentiation and increased cell proliferation (Wu et al., 2013). Although the genomic and transcriptomic analyses of ICC support the use of targeted therapeutic interventions, there is currently no targeted therapy considered effective for this disorder. To overcome this challenge, a disease model that mimics most or all biological and genetic aspects of ICC is an ideal tool for performing functional studies of the target genes or screening potential anticancer drugs. In the coming sections, we review recent progress and introduce new disease models that may expedite the discovery of novel treatments for ICC.

## CURRENT DISEASE MODELS OF ICC

The first ICC cell line, HChol-Y1, was established in 1985. The cell line secretes very low levels of CEA and high levels of CA 19-9, which are markers of various kinds of cancer (Yamaguchi et al., 1985). Since then, many more ICC cell lines originating from ICCs with different etiologies have been established around the world. The PCI:SG231 (Storto et al., 1990), CC-SW-1 (Shimizu et al., 1992), and CC-LP-1 (Shimizu et al., 1992) cell lines were established from patients in the US. HuH-28 (Kusaka et al., 1988), KMCH-2 (Yano et al., 1996), RBA (Enjoji et al., 1997), SSP-25 (Enjoji et al., 1997), NCC-CC1, NCC-CC3-1, NCC-CC3-2, and NCC-CC4-1 (Ojima et al., 2010) were derived from Japanese patients. SNU-1079 (Ku et al., 2002) was derived from a Korean patient, while HKGZ-CC (Ma et al., 2007) and HCCC-9810 (Liu et al., 2013) were derived from Chinese patients. In particular, HuCCA-1 was established from a tumor removed from a Thai patient with liver fluke infection (Sirisinha et al., 1991). This cell line is of epithelial cell origin and secretes a number of non-specific tumor markers including CA125 (Srisomsap et al., 2004).

Unlike most of the ICC cell lines, which were established directly from primary tumor cells, two cell lines, namely MT-CHC01 and KKU-213L5, were established via xenografts, in which human primary tumor cells are grown in immunodeficient mice, such as nonobese diabetic (NOD)/Shi-severe combined immunodeficient (scid)-IL2rγ null mice (NOG mice). MT-CHC01 was established from a xenograft derived from the tumor of an Italian patient. After the primary tumor cells had been grown in NOD/Shi-scid mice for four generations, the xenograft was stabilized, and the tumors were resected from the mice to generate xenograft-derived cell lines. MT-CHC01 retains epithelial cell markers, and shows stemness and pluripotency markers (Cavalloni et al., 2016b). After subcutaneous injection, it retains in vivo tumorigenicity and expresses CEA and CA19-9; the KRAS G12D mutation is also maintained in this cell line. KKU-213L5 was recently derived from its parental cell line, KKU-213, which was established from the primary tumor of a Thai patient. KKU-213L5 was selected in vivo through five serial passages of pulmonary metastasized tissues via tail-vein injection into NOD/scid/Jak3 mice (Uthaisar et al., 2016). Compared to KKU-213, KKU-213L5 shows more pronounced metastatic behavior, such as higher migration and invasion abilities, and also shows stem cell characteristics. The cells exhibit significantly higher expression of anterior gradient protein-2 (AGR2) and suppression of KiSS-1, which are associated with metastasis in the later stages of disease (**Figure 2A**).

The use of human tumor xenografts, or patient-derived xenografts (PDX), provides a "patient-like" environment in animal models for a better study of human cancers. To generate a PDX, tumor cells are transplanted into immunocompromised animals either by subcutaneous injection or by direct injection into the desired organs. An orthotopic xenograft model is generated by either implanting or injecting human tumor cells into the equivalent organ from which the cancer originated. It is widely believed that orthotopic PDX reflects the original tumor microenvironment much better than the conventional subcutaneous xenograft models. Recently, a novel PDX model was generated from an Italian patient with ICC. This PDX shows the same biliary epithelial markers, tissue architecture, and genetic aberrations as the primary tumor (Cavalloni et al., 2016a) (**Figure 2D**). Other than PDX, a genetically engineered mouse model of ICC has been generated by inducing an oncogenic Kras mutation and homozygous Pten deletion in the liver. The tumors induced in this model are exclusively ICCs and show a histological phenotype similar to human ICC with cholangiocyte origin. This mouse line is suited for the development of new therapies for ICCs with an oncogenic KRAS mutation and an activated PI3K pathway (Ikenoue et al., 2016) (**Figure 2C**). The latest gene-editing technology, CRISPR/Cas9, has also been used successfully to induce ICC in mice. A study led by Weber J. et al. (2015) used multiplex CRISPR/Cas9 gene editing in mice to introduce mutations in a set of tumor suppressor genes often altered in human ICC/HCC, such as Arid1a, Pten, Smad4, Trp53, Apc, and Cdkn2a, and in a few rarely mutated genes, including Tet2 and Brca1/2. The results showed that CRISPR/Cas9-induced mouse ICCs preferentially carry higher frequencies of mutations in the genes frequently dysregulated in human ICCs, especially those related to chromatin modification. However, the authors unexpectedly observed a high mutation frequency of Tet2, which has never been observed in human ICCs. Nevertheless, TET2 is believed to harbor tumor suppressive function linked to IDH1/2, which are among the commonly mutated oncogenes in ICC. The authors, therefore, highlighted the importance of genetic screening in pinpointing cancer genes that may not be mutated, but altered by other mechanisms (Weber J. et al., 2015) (**Figure 2E**).

mutations to the selected tumor suppressor genes including *Arid1a*, *Trp53*, *Tet2*, *Pten*, *Cdkn2a*, *Apc*, *Brca1/2*, and *Smad4*, which lead to ICC in the gene-edited mice.

## TRANSLATIONAL CLINICAL ASPECTS AND FUTURE DIRECTIONS

Looking ahead to the future of cancer research, one of the most exciting trends is the application of patient-derived organoids, which serve as a source of expanded in vitro patient-derived cancer cells (**Figure 2B**). Organoids provide a 3D semi-solid tissue-like architecture that captures the real structure and heterogeneity of a solid tumor, a quality that is lacking in the commonly used immortalized cancer cell lines. Organoids, therefore, serve as a good model for studying the underlying carcinogenesis mechanisms, as well as for drug sensitivity testing and developing targeted therapies (Lancaster and Knoblich, 2014). Recently, human cholangiocytes were isolated and propagated from the human extrahepatic biliary tree in the form of organoids as a proof-of-concept experiment for regenerative medicine applications (Sampaziotis et al., 2017). These extrahepatic cholangiocyte organoids can form tissue-like structures with biliary characteristics when transplanted into immunocompromised mice, and can reconstruct the gallbladder wall by repairing the biliary epithelial cells in a mouse model of injury. The results showed that bioengineered artificial ducts can functionally mimic the native common bile duct. Recently, Broutier et al. have successfully developed organoids from primary cell cultures of HCC, CHC, ICC, and perihilar CCA (Broutier et al., 2017). By generating ICC organoids that reflect the heterogeneous origins and etiologies, we foresee the possibility of identifying the functions of somatic alterations in ICC by systematically conducting CRISPR/Cas9 gene editing. In addition, one can investigate the effects of the microenvironment (i.e., tumor-immune interactions and cell-cell communications) and cell state transitions more thoroughly, and test the efficacy of drugs in a high-throughput manner. Ultimately, patient-derived organoids together with PDX mice may serve as two of the most important models for the development of precision medicine in ICC and other rare cancers. In **Figure 3**, we summarize the application of precision oncology through the use of high-throughput technologies and disease models to expedite translational research outcomes in ICC.

Intra-tumor heterogeneity reflects the diverse clonal evolution of tumor cells. Tumor evolution is proposed to have one of the following characteristics: hypermutability phenotypes, various mutation signature patterns, weak clonal selection, and high heterogeneity of tumor cell subclones (Schwartz and Schäffer, 2017). Extensive intra-tumor heterogeneity of ICC has lately been observed using WES, which identified branched evolution, collectively shaped by parallel evolution and chromosome instability, as the predominant pattern in ICC (Dong et al., 2018). As single-cell omics technologies have matured, it is now possible to characterize the reference expression patterns of individual cells in humans (Nawy, 2014) in order to provide the most fundamental knowledge for understanding human health and diseases (Rozenblatt-Rosen et al., 2017). Such advanced technologies will also expedite the understanding of carcinogenesis mechanisms, including those of ICC. These approaches include generating transcriptomes and epigenomes at the single-cell level (scRNA-seq and scATAC-seq, respectively), as well as spatial transcriptomes, which can be used to investigate the physical relationships of each cell in a tumor mass (Ståhl et al., 2016). Single-cell genomics has also become another important tool for understanding the clonal evolution of tumor cells phylogenetically by exploring the mutating ability of cancer cells (Kim and Simon, 2014; Müller and Diaz, 2017). In the same way, single-cell genomics may help better elucidate the heterogeneity of ICC, particularly when combined with other multi-level omics data generated from either primary tumor cells or patient-derived 3D tumor models such as organoids. A recent study by Roerink et al. investigated the nature and extent of intra-tumor diversification at the single-cell level by characterizing organoids derived from multiple single cells from three colorectal cancers and adjacent normal intestinal crypts. Interestingly, the responses to anticancer drugs of even closely related cells from the same tumor are markedly different, emphasizing the importance of studying individual cancer cells (Roerink et al., 2018).
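To make the notion of branched evolution more concrete, the toy Python sketch below labels mutations from a hypothetical multi-region sequencing experiment as "trunk" (found in every sampled region), "branch" (shared by some regions), or "private" (found in one region only). This is a simplified illustration of the general idea and not the analysis performed by Dong et al. (2018); the region names, mutation labels, and the function `classify_mutations` are all invented for the example.

```python
from typing import Dict, Set


def classify_mutations(calls: Dict[str, Set[str]], regions: Set[str]) -> Dict[str, str]:
    """Label each mutation by how many sampled tumor regions carry it."""
    labels = {}
    for mutation, found_in in calls.items():
        present = found_in & regions  # ignore regions that were not sampled
        if not present:
            labels[mutation] = "absent"
        elif present == regions:
            labels[mutation] = "trunk"      # clonal: shared by all regions
        elif len(present) == 1:
            labels[mutation] = "private"    # subclonal: one region only
        else:
            labels[mutation] = "branch"     # subclonal: shared by a subset
    return labels


# Hypothetical presence/absence calls for three regions of one tumor.
regions = {"R1", "R2", "R3"}
toy_calls = {
    "mut_A": {"R1", "R2", "R3"},
    "mut_B": {"R1", "R2"},
    "mut_C": {"R3"},
}

labels = classify_mutations(toy_calls, regions)
print(labels)
```

In such a scheme, a tumor dominated by branch and private mutations would be described as highly heterogeneous, whereas one dominated by trunk mutations would be closer to a single clonal population.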

With the current advances in NGS technology, the genomic landscapes of ICC have been largely revealed, which is critically important for the clinical development of novel drugs. In addition, multi-omics profiles that can classify tumor types based on molecular features may be essential for clinical success in treating patients. To this end, biomarker-driven clinical trials are being conducted. Many ongoing clinical trials of all types of CCA, including ICC, are listed in **Table 3**. Among these, targeting FGFR alterations appears to be particularly promising. A phase 2 study of BGJ398, a selective pan-FGFR inhibitor, in metastatic FGFR-altered CCA patients who failed or were intolerant to platinum-based chemotherapy demonstrated impressive anti-tumor activity (Javle et al., 2016). Among the 22 evaluable metastatic patients harboring FGFR2 fusions or other alterations, three patients achieved partial response (PR) and 15 patients had stable disease (SD). A phase 1 study of ARQ 087, an oral pan-FGFR inhibitor, in patients harboring FGFR2 fusions demonstrated two patients with a confirmed PR and one with durable SD at ≥16 weeks (Papadopoulos et al., 2017). A phase 3 study of ARQ 087 is ongoing and recruiting more patients with FGFR2 fusions as well as inoperable or advanced ICC (NCT03230318). Other novel drugs targeting FGFR fusions, such as INCB054828, H3B-6527, erdafitinib, and INCB062079, are in early phases of clinical development (**Table 3**).
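To put the BGJ398 response counts quoted above on a common scale, the short Python sketch below computes the objective response rate (ORR, here PR only) and the disease control rate (DCR, PR plus SD) among the 22 evaluable patients, together with a Wilson score interval to convey how uncertain a rate from such a small cohort is. It assumes that there were no complete responses, since none are mentioned above. The function name `wilson_interval` is hypothetical, and the choice of the Wilson interval is ours for illustration rather than a statistic reported by Javle et al. (2016).

```python
import math


def wilson_interval(events: int, n: int, z: float = 1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p = events / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Counts as quoted in the text.
evaluable = 22
partial_response = 3
stable_disease = 15
complete_response = 0  # assumption: no complete responses are mentioned

responders = complete_response + partial_response
controlled = responders + stable_disease

orr = responders / evaluable
dcr = controlled / evaluable
orr_low, orr_high = wilson_interval(responders, evaluable)

print(f"ORR = {orr:.1%} (Wilson 95% CI {orr_low:.1%} to {orr_high:.1%})")
print(f"DCR = {dcr:.1%}")
```

Under these assumptions, roughly one in seven evaluable patients responded while about four in five had their disease controlled, and the wide interval around the ORR shows why larger confirmatory trials are needed.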

Mutations of IDH1 were reported in up to 25% of CCA (Lowery et al., 2017). AG-120, a highly selective small molecule inhibitor of mutant IDH1 protein, demonstrated preliminary efficacy in refractory CCA patients with IDH1 mutations. A phase 1 study of AG-120 reported one patient who achieved PR and five patients with SD >6 months (Lowery et al., 2017). A phase 3 randomized placebo-controlled study of AG-120 in IDH1 mutation-positive patients is underway (NCT02073994) (**Table 3**).

Immunotherapy with checkpoint inhibitors may be effective only in patients with mismatch-repair deficiency (dMMR). In CCA, including ICC, the incidences of dMMR and/or microsatellite instability-high (MSI-H) have been reported at varying but generally low levels (Liengswangwong et al., 2003, 2006; Limpaiboon et al., 2006; Walter et al., 2017). A phase 2 non-randomized study of pembrolizumab, an anti-PD1 antibody, in 41 patients with progressive metastatic carcinoma demonstrated an immune-related objective response rate of 40, 71, and 0% for patients who had colorectal cancer with dMMR, CCA and other cancers with dMMR, and colorectal cancer with mismatch-repair proficiency (pMMR), respectively (Le et al., 2015). In addition, WES revealed an average of 1,782 somatic mutations for each dMMR tumor compared with only 73 for a pMMR tumor (P = 0.007). High somatic mutation loads were also associated with prolonged progression-free survival (PFS) (P = 0.02). Hence, dMMR tumors with a large number of somatic mutations may be more susceptible to immune checkpoint blockade, as a result of the substantial amount of new immunogenic antigens produced. Based on these findings, the US Food and Drug Administration (FDA) has granted accelerated approval to pembrolizumab in patients with unresectable or metastatic solid tumors with MSI-H or dMMR. A phase 1b study of pembrolizumab (KEYNOTE-028) with 89 advanced biliary tract cancer patients has reported preliminary efficacy of the checkpoint inhibitor (Bang et al., 2015). An overall response was observed in ∼17% of the patients. Several other ongoing studies are investigating checkpoint inhibitors in combination with other drugs, including chemotherapy, targeted drugs, and other immunotherapies (**Table 3**).

## CONCLUSIONS

In summary, we have described how advances in high-throughput technologies have provided a massive amount of information for understanding the genetic mechanisms of disorders, including rare cancers and ICC in particular. To use such high-throughput methods effectively in cancer research, the following should be taken into consideration. First, determining clinical information, such as risk factor exposure or etiologies, disease stages, responsiveness to therapy, histological subtypes and anatomical locations, prior to inclusion of the clinical samples is crucial, as it may affect the overall success of downstream analyses. For ICC, liver fluke and hepatitis virus infections are both strongly associated with the disease. Hence, additional information on whether the patients are seropositive for these infections may help better characterize the sample subgroups. Furthermore, ICC can also be subcategorized by macroscopic features, i.e., MF, IDG, and PDI, which rely on accurate pathological determination of the tumor sections. Secondly, insufficient sample size is one of the greatest challenges in studying ICC and other rare cancer types. This cancer in particular is prevalent in certain regions in Asia, such as northeastern Thailand, where most cases are believed to be associated with liver fluke infection. Finding a suitable ICC cohort with adequate sample size is difficult. Earlier studies have combined patients from different countries or geographical regions, as well as other types of BTC, e.g., ECC and gallbladder cancer, in order to elucidate the molecular mechanisms and treatment responses. These cohorts, particularly in clinical trials, consisted of patients and tumors with different genetic backgrounds, which may have resulted in therapeutic failure due to confounding factors and selection biases. Lastly, the small amount or low quality of source clinical materials limits the comprehensive application of true "multi-omics" approaches. The majority of previous studies relied on obtaining multiple levels of omics information from different sets of ICC patients. The restricted amount of biological material from one patient is the main hindrance to performing multiple omics analyses at once to comprehensively investigate the correlations and connections between multiple regulatory processes.

Therefore, in addition to a good systematic longitudinal collection of clinical specimens from cancer patients in a tumor biobank, having organoids or PDX mouse models as "cancer avatars" would, at least in part, solve the problem of sample limitation, and should contribute to better omics study design and more effective translational outcomes for rare cancer patients.

#### TABLE 3 | Ongoing clinical trials of targeted therapy in cholangiocarcinoma<sup>a</sup>.

*<sup>a</sup> Information acquired from Clinicaltrials.gov.*

#### AUTHOR CONTRIBUTIONS

KC, M-SS, and NJ conceived the concept of the review and figures. KC, M-SS, VC, NN, and NJ wrote the manuscript. KC, M-SS, NN, and NJ prepared the tables and figures. All the authors read, reviewed, and approved the final manuscript.

## FUNDING

NJ is a recipient of TRF Research Scholar Fund (RSA5780065), Government Fiscal Year Budget Funds administered through Mahidol University-National Research Council of Thailand and the Research University Network (RUN) Program, and the research grant from the Ramathibodi Cancer Center. NN acknowledges the Talent Management Program, Mahidol University. VC acknowledges the TRF Grant for New Researcher (MRG6080235), Newton Advanced Fellowship through TRF (DBG60800003) and Royal Society (NA160153), and Faculty of Science, Mahidol University. The NJ and VC laboratories are supported by the Crown Property Bureau Foundation through Integrative Computational BioScience (ICBS) Center, Mahidol University.

#### ACKNOWLEDGMENTS

The authors thank Mr. Jeffrey Makin for very helpful comments on the manuscript.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Shiao, Chiablaem, Charoensawan, Ngamphaiboon and Jinawath. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Infantile Neuroaxonal Dystrophy: Diagnosis and Possible Treatments

Patricia L. Babin<sup>1</sup> , Sudheendra N. R. Rao<sup>1</sup> \*, Anita Chacko<sup>1</sup> , Fidelia B. Alvina<sup>1</sup> , Anil Panwala<sup>2</sup> , Leena Panwala<sup>2</sup> and Danielle C. Fumagalli<sup>1</sup>

<sup>1</sup> Rare Genomics Institute, Downey, CA, United States, <sup>2</sup> INADcure Foundation, Fairfield, NJ, United States

Infantile Neuroaxonal Dystrophy (INAD) is a rare neurodegenerative disease that often limits a child's life span to 10 years. With a typical onset at 6 months of age, INAD is characterized by regression of acquired motor skills, delayed motor coordination and eventual loss of voluntary muscle control. Biallelic mutations in the PLA2G6 gene have been identified as the most frequent cause of INAD. We highlight the salient features of INAD molecular pathology and the progress made in molecular diagnostics. We reiterate that enhanced molecular diagnostic methodologies such as targeted gene panel testing, exome sequencing, and whole genome sequencing can help ascertain a molecular diagnosis. We describe how the defective catalytic activity of the PLA2G6 gene product could potentially be overcome by enzyme replacement or gene correction, giving examples and challenges specific to INAD. This is expected to encourage steps toward developing and testing emerging therapies that might alleviate INAD progression and help realize the objectives of patient-formed organizations such as the INADcure Foundation.

Keywords: infantile neuroaxonal dystrophy, rare disease, exome sequencing, enzyme replacement therapy, vector gene therapy, CRISPR/Cas9

## INTRODUCTION

Infantile neuroaxonal dystrophy (INAD) is an autosomal recessive rare neurodegenerative disease of unknown frequency. The onset of symptoms generally occurs between 6 months and 3 years of age. Prior to that time, infants develop normally. The first symptom of INAD may be the slowing of the attainment of normal developmental milestones or regression in developmental milestones (Ramanadham et al., 2015). Trunk hypotonia, strabismus, and nystagmus are early symptoms of the disease (Gregory et al., 2017). The progression of the disease is rapid and as it progresses, more acquired skills are lost. Muscles soon become hypotonic and later become spastic (Levi and Finazzi, 2014). Eventually, all voluntary muscle control is lost. Muscle weakness can also lead to difficulties in feeding and breathing. In addition to nystagmus, some children experience vision loss. Cognitive functions are gradually lost and dementia develops. The life span is generally 5 to 10 years (Gregory et al., 1993; Jardim et al., 2004; Macauley and Sands, 2009).

#### Edited by:

George A. Brooks, University of California, Berkeley, United States

#### Reviewed by:

Michael L. Raff, MultiCare Health System, United States

Prashant Kumar Verma, All India Institute of Medical Sciences, Rishikesh, India

\*Correspondence:

Sudheendra N. R. Rao sudhee26@gmail.com; sudheendra.rao@raregenomics.org

#### Specialty section:

This article was submitted to Genetic Disorders, a section of the journal Frontiers in Genetics

Received: 15 February 2018 Accepted: 15 November 2018 Published: 10 December 2018

#### Citation:

Babin PL, Rao SNR, Chacko A, Alvina FB, Panwala A, Panwala L and Fumagalli DC (2018) Infantile Neuroaxonal Dystrophy: Diagnosis and Possible Treatments. Front. Genet. 9:597. doi: 10.3389/fgene.2018.00597

## MOLECULAR PATHOLOGY

Infantile neuroaxonal dystrophy belongs to a family of neurodegenerative disorders that includes atypical late-onset neuroaxonal dystrophy (ANAD) and dystonia Parkinsonism complex (DPC). Most cases of INAD are associated with homozygous or compound heterozygous mutations in the PLA2G6 gene that affect the catalytic activity of its protein product (Engel et al., 2010). The PLA2G6 gene encodes a group VIA calcium-independent phospholipase A2 protein (PLA2G6 or iPLA2β, ∼85/88 kDa) with a lipase domain and seven ankyrin repeat-containing domains (Tang et al., 1997). PLA2G6 hydrolyzes the sn-2 acyl chain of phospholipids, generating free fatty acids and lysophospholipids.

Phospholipids in the inner mitochondrial membrane, particularly cardiolipin, are rich in unsaturated fatty acids in the sn-2 position (Seleznev et al., 2006). These unsaturated fatty acids are particularly vulnerable to the abundant reactive oxygen species produced by the mitochondria (Murphy, 2009), resulting in peroxidized phospholipids in the inner mitochondrial membrane. PLA2G6 localizes to the mitochondria (Williams and Gottlieb, 2002; Liou et al., 2005), consistent with an increased demand for hydrolysis of the peroxidized fatty acids in the sn-2 position of phospholipids, which leads to remodeled phospholipids (Balsinde et al., 1995; Zhao et al., 2010). When PLA2G6 is defective, the integrity of the inner mitochondrial membrane is compromised. PLA2G6 also localizes to the axon (Ong et al., 2005; Seleznev et al., 2006), indicating an increased localized demand for phospholipid remodeling there as well. The manifestation of such accumulation in the brain is unique to key brain areas, such as the basal ganglia, which has resulted in various names for the same underlying molecular pathogenesis involving PLA2G6 (Mehnaaz, 2016; Nassif et al., 2016).

Ultrastructure analysis of neurons in PLA2G6 knockout mice is consistent with this molecular pathology. Mitochondria with branching and tubular cristae, mitochondria with degenerated cristae, axons with cytoskeleton collapse, and partial membrane loss at axon terminals have been observed (Beck et al., 2011). At a microscopic level, these features appear as axonal swellings and spheroid bodies in pre-synaptic terminals (**Figure 1**) in the central or peripheral nervous system.

#### MOLECULAR DIAGNOSTICS AND RARE DISEASE PATIENT EMPOWERMENT

Prior to the availability of next generation sequencing, and apart from specific clinical, electrophysiological, and imaging features, skin biopsies showing axonal swellings and spheroid bodies in pre-synaptic terminals in the central or peripheral nervous system were the diagnostic criteria for the confirmation of INAD (Gregory et al., 2017; Iodice et al., 2017). Often, multiple biopsies were required to confirm the diagnosis, and families generally waited many years for it. With the decreasing cost of gene and genome sequencing, the availability of targeted gene panel testing at diagnostic labs for undiagnosed neurological diseases, and the increasing awareness among physicians of genetic diagnostics, families are receiving a diagnosis more rapidly, sometimes within a year of the first symptom appearing. The children in these families are still young and the families are motivated to partner with scientists to find a treatment for their children's illness. In order to fund the research process, a group of parents of INAD patients formed the INADcure Foundation. The foundation has raised substantial funds for research and is partnering with Rare Genomics Institute, which guides it in the awarding of research grants. Interestingly, a 2016 genetic analysis of 22 Indian families with INAD, ANAD, and DPC found that 10/22 families (45.45%) lacked mutations in the PLA2G6 gene coding region (Kapoor et al., 2016). The failure to identify deleterious mutations in the coding region of PLA2G6 indicates that future molecular diagnostic efforts will require whole gene sequencing to identify mutations in the intronic and regulatory regions of the PLA2G6 gene in INAD-affected patients. Furthermore, INAD cases in which no PLA2G6-associated mutations are observed suggest that the disease could be caused by mutations in genes other than PLA2G6, a possibility that needs to be explored.
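The proportion quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only the two counts given in the text (22 families, 10 without a coding-region mutation); the function name `percent` is an illustrative choice and is not taken from Kapoor et al. (2016).

```python
# Counts quoted in the text from Kapoor et al. (2016).
FAMILIES_TOTAL = 22
FAMILIES_WITHOUT_CODING_MUTATION = 10


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimal places."""
    if whole <= 0:
        raise ValueError("whole must be a positive count")
    if not 0 <= part <= whole:
        raise ValueError("part must lie between 0 and whole")
    return round(100 * part / whole, 2)


# Share of families with no mutation found in the PLA2G6 coding region.
without_mutation = percent(FAMILIES_WITHOUT_CODING_MUTATION, FAMILIES_TOTAL)

# Complementary share: families in which a coding-region mutation was found.
with_mutation = percent(
    FAMILIES_TOTAL - FAMILIES_WITHOUT_CODING_MUTATION, FAMILIES_TOTAL
)

print(f"No coding-region mutation: {without_mutation}%")
print(f"Coding-region mutation found: {with_mutation}%")
```

Under these counts, roughly 45% of families would remain without a molecular diagnosis if only the coding region were sequenced, which is the gap that whole gene sequencing is meant to address.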

## POSSIBLE THERAPIES

The therapies most often considered for a rare disease like INAD that involves a defective enzyme are enzyme replacement, gene replacement and gene correction. When an enzyme deficiency is caused by a recessive genetic defect, it is assumed that enzyme replacement or supplementation may correct the problem (Smith et al., 2012; Yu-Wai-Man, 2016). However, experiments to show that therapies providing the correct gene or enzyme will rescue the INAD phenotype have yet to be performed.

## Enzyme Replacement Therapy (ERT)

Since the brain is the primary organ affected in INAD, enzyme replacement therapy for INAD would most likely require infusion of the enzyme into the brain. In 2017, BioMarin received FDA approval for tripeptidyl peptidase 1 (cerliponase alfa) as a treatment for the underlying cause of Batten disease, the deficiency of tripeptidyl peptidase 1 (TPP1), a lysosomal enzyme (U.S. Food Drug Administration, 2017). Tripeptidyl peptidase 1 (cerliponase alfa; ∼59 kDa) is the first ERT to be administered directly into the cerebrospinal fluid (CSF) of the brain. Other ERT drugs that can be administered into the CSF are in clinical trials (Jardim et al., 2004; Macauley and Sands, 2009).

The efficacy of tripeptidyl peptidase 1 (cerliponase alfa) on the walking ability of Batten disease patients demonstrates that enzyme replacement therapy for diseases that affect the brain is possible in principle. Targeting a replacement enzyme to the mitochondria for INAD will be more challenging than lysosomal targeting, which has long been achieved by intravenous ERT, as in the case of Gaucher disease. Hence, there are still many biological issues specific to INAD that need to be resolved:



An additional issue is that enzyme replacement therapy requires the initial placement of an intracerebroventricular catheter and frequent infusions. The placement of the catheter requires anesthesia and the infusions may require anesthesia depending on the patients' cooperation. Addressing the concerns of pediatric neurologists and parents of children with INAD with regard to using anesthesia on INAD patients with appropriate information will require further effort.

#### Gene Therapy/Gene Replacement

The human PLA2G6 gene is around 70 kb, with 17 exons and 2 alternate exons. The longest protein-coding transcript, however, is just 3.3 kb and the protein coding sequence is just over 2.4 kb, which can easily be packaged as viral vector cargo. Because the lack of preclinical data on ERT in INAD is a significant knowledge gap in the field, it is instructive to look at preclinical models of other CNS disorders that are undergoing gene therapy studies. We therefore first consider a lysosomal storage disorder of the CNS for which preclinical data exist, and then compare it with INAD. A preclinical study in mice addressed gene therapy for the lysosomal storage disorder mucopolysaccharidosis type IIIA (MPS-IIIA) by packaging the correct version of the sulphamidase gene inside a viral vector (Sorrentino et al., 2013). The study took advantage of the growing understanding of the blood brain barrier (BBB), which actively regulates the transport of large molecules from blood into the CNS by a process called transcytosis (Pardridge, 2005b; Sorrentino and Fraldi, 2016).

Transcytosis involves endocytosis of ligands on the luminal side, mediated by ligand-specific receptors (e.g., insulin receptor, transferrin receptor, and low-density-lipoprotein receptor) enriched on the capillary endothelium (Pardridge, 2005a). This is followed by movement of the endocytosed cargo through the endothelial cytoplasm and finally exocytosis at the abluminal (brain) side, thus effectively delivering the cargo across the BBB (Pardridge, 2002). The MPS-IIIA preclinical study engineered a chimeric sulphamidase protein that contained a BBB-binding domain from apolipoprotein B to facilitate uptake by the endothelium and a signal peptide from iduronate-2-sulfatase to aid efficient exocytosis toward the abluminal side of the BBB (Sorrentino et al., 2013). A cargo encoding the chimeric sulphamidase was then packaged into an adeno-associated virus (AAV) serotype 2/8 vector targeting the liver (Sorrentino and Fraldi, 2016). The liver thus served as an internal factory that provided a constant supply of chimeric sulphamidase, which resulted in a 10–15% increase in brain sulphamidase activity even 7 months after liver gene therapy. This increase in brain sulphamidase activity led to quantifiable improvement in brain pathology and behavioral outcomes in the mouse model of MPS-IIIA (Sorrentino et al., 2013).

If a similar study were conducted for INAD, it would answer several critical questions about enzyme replacement for INAD therapy. Alternatively, intra-vascular or intra-CSF administration of AAV9-based gene therapy products can directly target the CNS (Bey et al., 2017; Roca et al., 2017). However, the multiple mutations that INAD patients carry in their PLA2G6 gene pose unique challenges to gene therapy. Although the 2–3 kb size of the PLA2G6 coding sequence should not pose a problem for its insertion into a viral vector, the regulatory complications from correcting PLA2G6 are difficult to predict. Regulatory complications become especially hard to control if the gene therapy cannot be localized properly within the target tissues. Enzyme replacement therapy may also be problematic if some mutant products turn out to be dominant negative to the wild-type PLA2G6.
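The size argument above can be made concrete with a back-of-the-envelope check. This is a minimal sketch, not a vector design: the gene, transcript and coding-sequence sizes are the ones quoted in this article, whereas the AAV packaging limit of about 4.7 kb and the 1.0 kb allowance for promoter and other regulatory elements are assumptions introduced here for illustration and do not come from the article.

```python
# Sizes quoted in the text, in kilobases (kb).
PLA2G6_GENE_KB = 70.0
PLA2G6_LONGEST_TRANSCRIPT_KB = 3.3
PLA2G6_CODING_SEQUENCE_KB = 2.4

# ASSUMPTION: approximate AAV packaging limit; not stated in the article.
ASSUMED_AAV_CAPACITY_KB = 4.7
# ASSUMPTION: hypothetical allowance for promoter, polyA signal, etc.
ASSUMED_REGULATORY_OVERHEAD_KB = 1.0


def fits_in_vector(
    insert_kb: float,
    overhead_kb: float = ASSUMED_REGULATORY_OVERHEAD_KB,
    capacity_kb: float = ASSUMED_AAV_CAPACITY_KB,
) -> bool:
    """Return True if the insert plus regulatory overhead fits the capacity."""
    if insert_kb < 0 or overhead_kb < 0 or capacity_kb <= 0:
        raise ValueError("sizes must be non-negative and capacity positive")
    return insert_kb + overhead_kb <= capacity_kb


candidates = {
    "full gene": PLA2G6_GENE_KB,
    "longest transcript": PLA2G6_LONGEST_TRANSCRIPT_KB,
    "coding sequence": PLA2G6_CODING_SEQUENCE_KB,
}

for name, size_kb in candidates.items():
    verdict = "fits" if fits_in_vector(size_kb) else "does not fit"
    print(f"{name} ({size_kb} kb): {verdict}")
```

Under these assumed limits, the coding sequence and the longest transcript fit while the full 70 kb gene does not, which is consistent with the point that cargo size is unlikely to be the limiting factor for PLA2G6 gene replacement.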

Most of the challenges facing possible therapies for INAD come from the fact that it is an ultra-rare disease. However, the fact that INAD has a known genetic etiology provides avenues for possible therapies. Moreover, because INAD is a rare disease, any successful therapy for it will receive orphan drug status and all the protection that status confers. Thus, despite all the aforementioned challenges, orphan drug status provides a strong incentive for rare disease researchers and the biotechnology industry, all the more so when the genetic cause is already known.

#### Gene/Base Editing

As of 2017, at least 277 missense mutations had been observed in the human PLA2G6 gene (Lek et al., 2016). Only a small proportion of PLA2G6 mutations are frameshifts, indels, nonsense mutations or splice-site mutations (Morgan et al., 2006). Thus, correcting the gene in the target cell population or correcting the mutated DNA bases is an attractive therapeutic avenue. Over the years, several tools have been developed for gene editing, and CRISPR/Cas9-based genome editing is being hailed as a groundbreaking technology, with a clinical trial planned in 2018. These technologies utilize a DNA binding protein that can also cleave the strand in a specified manner to make space for the insertion of a new DNA sequence or the correction of the deleterious DNA base (LaFountaine et al., 2015). CRISPR/Cas9 technology has already yielded astounding therapeutic results in several pre-clinical disease models (e.g., Duchenne muscular dystrophy and liver metabolic diseases) (Dai et al., 2016). Additionally, a growing understanding of human variation is posing newer challenges to gene therapy and driving innovation toward a true personalization of gene editing (Lessard et al., 2017; Scott and Zhang, 2017). New technologies such as vSLENDR, an AAV- and CRISPR/Cas9-mediated technology to replace defective genes in neurons and other nervous system cells, are pushing gene editing technologies to new frontiers (Nishiyama et al., 2017).

## CONCLUSION

Infantile Neuroaxonal Dystrophy is a severe neurodegenerative disease with a certain morbidity and mortality. This rare disease offers an exciting opportunity to revalidate available modes of next generation therapy and generate newer ones. Emerging congruence on the clinical diagnostic criteria for INAD is expected to provide an impetus toward enhanced molecular diagnostics. This progress is expected to lead to the development of affordable therapies that would provide quantifiable improvement in the quality of life of INAD patients and retard or ameliorate disease progression. We have highlighted some success stories in Batten disease and the mucopolysaccharidoses that provide the inspiration to ask the right questions to make INAD therapy a reality. In addition, growing viral and nonviral approaches for CRISPR/Cas9-based gene editing should also open newer therapeutic avenues for INAD.

## AUTHOR CONTRIBUTIONS

PB and FA prepared the preliminary draft. SR and AC edited and added more sections to the manuscript. DF, AP, and LP edited the manuscript.

#### REFERENCES

Balsinde, J., Bianco, I. D., Ackermann, E. J., Conde-Frieboes, K., and Dennis, E. A. (1995). Inhibition of calcium-independent phospholipase A2 prevents arachidonic acid incorporation and phospholipid remodeling in P388D1 macrophages. Proc. Natl. Acad. Sci. U.S.A. 92, 8527–8531. doi: 10.1073/pnas.92.18.8527

AAV9-GFP for gene therapy of neurological disorders. Gene Ther. 24:325. doi: 10.1038/gt.2017.18


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Babin, Rao, Chacko, Alvina, Panwala, Panwala and Fumagalli. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
