# GENETICS AND EPIGENETICS OF PSYCHIATRIC DISEASES, 2nd Edition

EDITED BY: Cunyou Zhao, Weihua Yue and Zhexing Wen

PUBLISHED IN: Frontiers in Genetics and Frontiers in Psychiatry

#### Frontiers eBook Copyright Statement

The copyright in the text of individual articles in this eBook is the property of their respective authors or their respective institutions or funders. The copyright in graphics and images within each article may be subject to copyright of other parties. In both cases this is subject to a license granted to Frontiers. The compilation of articles constituting this eBook is the property of Frontiers.

Each article within this eBook, and the eBook itself, are published under the most recent version of the Creative Commons CC-BY licence. The version current at the date of publication of this eBook is CC-BY 4.0. If the CC-BY licence is updated, the licence granted by Frontiers is automatically updated to the new version.

When exercising any right under the CC-BY licence, Frontiers must be attributed as the original publisher of the article or eBook, as applicable.

Authors have the responsibility of ensuring that any graphics or other materials which are the property of others may be included in the CC-BY licence, but this should be checked before relying on the CC-BY licence to reproduce those materials. Any copyright notices relating to those materials must be complied with.

Copyright and source acknowledgement notices may not be removed and must be displayed in any copy, derivative work or partial copy which includes the elements in question.

All copyright, and all rights therein, are protected by national and international copyright laws. The above represents a summary only. For further information please read Frontiers' Conditions for Website Use and Copyright Statement, and the applicable CC-BY licence.

ISSN 1664-8714 ISBN 978-2-88966-025-4 DOI 10.3389/978-2-88966-025-4

Topic Editors:

Cunyou Zhao, Southern Medical University, China; Weihua Yue, Peking University Sixth Hospital, China; Zhexing Wen, Emory University School of Medicine, United States

Psychiatric diseases have a highly complex etiology, aggregating in families but not segregating in a traditional Mendelian manner. Recent approaches to understanding the causes of psychiatric disease have focused on describing the genetic contribution to major psychiatric illnesses. The use of large-scale genome-wide association studies (GWAS) and exome sequencing has enabled a systematic exploration of genetic risk factors and identified over 100 independent genomic loci significantly associated with psychiatric diseases; however, uncertainty remains about the causal genes involved in disease pathogenesis and how their function is regulated. Since many GWAS variants reside in non-coding regions, the disease-associated common variants might be enriched in regulatory domains, including enhancers and regions of active chromatin state. These observations lead us to focus on the possible role of non-sequence-based genomic variation in health and disease. Of particular interest are epigenetic modifications that regulate gene expression through modifications to DNA, RNA, histone proteins, and chromatin. The availability of high-throughput profiling methods for quantifying epigenomic modifications in large numbers of samples has enabled epigenome-wide association studies (EWAS) aimed at screening methylomic variations associated with environmental exposure and disease. Thus, systematic integration of genetic, epigenetic and epidemiological approaches will contribute to improving our understanding of the molecular mechanisms underlying disease phenotypes.

Publisher's note: In this 2nd edition, the following article has been updated: Gao J, Yi H, Tang X, Feng X, Yu M, Sha W, Wang X, Zhang X and Zhang X (2018) DNA Methylation and Gene Expression of Matrix Metalloproteinase 9 Gene in Deficit and Non-deficit Schizophrenia. Front. Genet. 9:646. doi: 10.3389/fgene.2018.00646

Citation: Zhao, C., Yue, W., Wen, Z., eds. (2020). Genetics and Epigenetics of Psychiatric Diseases, 2nd Edition. Lausanne: Frontiers Media SA. doi: 10.3389/978-2-88966-025-4

# Table of Contents


Yung-Fu Wu, Huey-Kang Sytwu and For-Wey Lung


Yanjie Fan, Xiujuan Du, Xin Liu, Lili Wang, Fei Li and Yongguo Yu

MECP2 *Mutation Interrupts Nucleolin–mTOR–P70S6K Signaling in Rett Syndrome Patients*

Carl O. Olson, Shervin Pejhan, Daniel Kroft, Kimia Sheikholeslami, David Fuss, Marjorie Buist, Annan Ali Sher, Marc R. Del Bigio, Yehezkel Sztainberg, Victoria Mok Siu, Lee Cyn Ang, Marianne Sabourin-Felix, Tom Moss and Mojgan Rastegar

*Exome Sequencing Identifies* TENM4 *as a Novel Candidate Gene for Schizophrenia in the SCZD2 Locus at 11q14-21*

Chao-Biao Xue, Zhou-Heng Xu, Jun Zhu, Yu Wu, Xi-Hang Zhuang, Qu-Liang Chen, Cai-Ru Wu, Jin-Tao Hu, Hou-Shi Zhou, Wei-Hang Xie, Xin Yi, Shan-Shan Yu, Zhi-Yu Peng, Huan-Ming Yang, Xiao-Hong Hong and Jian-Huan Chen

*Whole Exome Sequencing Identifies a Novel Predisposing Gene, MAPKAP1, for Familial Mixed Mood Disorder*

Chunxia Yang, Suping Li, Jack X. Ma, Yi Li, Aixia Zhang, Ning Sun, Yanfang Wang, Yong Xu and Kerang Zhang

*Current Understanding of Gut Microbiota in Mood Disorders: An Update of Human Studies*

Ting-Ting Huang, Jian-Bo Lai, Yan-Li Du, Yi Xu, Lie-Min Ruan and Shao-Hua Hu

*SNP Variation of* RELN *Gene and Schizophrenia in a Chinese Population: A Hospital-Based Case–Control Study*

Xia Luo, Si Chen, Li Xue, Jian-Huan Chen, Yan-Wei Shi and Hu Zhao


Chenxing Liu, Ian Everall, Christos Pantelis and Chad Bousman


Jingwen Yin, Dongjian Zhu, You Li, Dong Lv, Huajun Yu, Chunmei Liang, Xudong Luo, Xusan Xu, Jiawu Fu, Haifeng Yan, Zhun Dai, Xia Zhou, Xia Wen, Susu Xiong, Zhixiong Lin, Juda Lin, Bin Zhao, Yajun Wang, Keshen Li and Guoda Ma


Jiali Jin, Lu Liu, Wai Chen, Qian Gao, Haimei Li, Yufeng Wang and Qiujin Qian

# Human Endogenous Retroviral Envelope Protein Syncytin-1 and Inflammatory Abnormalities in Neuropsychological Diseases

Xiuling Wang<sup>1,2†</sup>, Jin Huang<sup>3†</sup> and Fan Zhu<sup>1,4</sup>\*

*<sup>1</sup> Department of Medical Microbiology, School of Medicine, Wuhan University, Wuhan, China, <sup>2</sup> Department of Medical Laboratory, The Central Hospital of Wuhan, Huazhong University of Science and Technology, Wuhan, China, <sup>3</sup> Key Laboratory for Molecular Diagnosis of Hubei Province, The Central Hospital of Wuhan, Huazhong University of Science and Technology, Wuhan, China, <sup>4</sup> Hubei Province Key Laboratory of Allergy and Immunology, Wuhan University, Wuhan, China*

#### Edited by:

*Zhexing Wen, Emory University School of Medicine, United States*

#### Reviewed by:

*Sarven Sabunciyan, Johns Hopkins University, United States; Ting Zhao, University of Pennsylvania, United States*

\*Correspondence:

*Fan Zhu zhufan@hotmail.com; fanzhu@whu.edu.cn*

*†These authors have contributed equally to this work*

#### Specialty section:

*This article was submitted to Behavioral and Psychiatric Genetics, a section of the journal Frontiers in Psychiatry*

Received: *25 April 2018*; Accepted: *17 August 2018*; Published: *07 September 2018*

#### Citation:

*Wang X, Huang J and Zhu F (2018) Human Endogenous Retroviral Envelope Protein Syncytin-1 and Inflammatory Abnormalities in Neuropsychological Diseases. Front. Psychiatry 9:422. doi: 10.3389/fpsyt.2018.00422*

Human endogenous retroviruses (HERVs) comprise approximately 8% of the human genome. Recent studies have considered HERVs as potential pathogenic factors. The majority of HERV genes are mutated and cannot encode functional proteins; nevertheless, some HERV genes, such as the HERV-W envelope (env) glycoprotein gene, are known to have intact open reading frames. The HERV-W element on 7q21.2, which encodes a protein referred to as Syncytin-1, participates in human placental morphogenesis and can activate a pro-inflammatory and autoimmune cascade. Neuropsychological disorders are typically linked to inflammatory abnormalities. In this review, we summarize evidence that Syncytin-1 is increasingly implicated in the development of neuropsychological disorders, such as schizophrenia and multiple sclerosis (MS). We also describe inflammatory imbalances in schizophrenia and MS. More importantly, we discuss the potential role and molecular mechanisms by which Syncytin-1 regulates inflammatory abnormalities in neuropsychological diseases. In summary, Syncytin-1 activity may represent a novel molecular pathogenic mechanism in neuropsychological diseases, such as schizophrenia and MS.

Keywords: HERV, syncytin-1, inflammation, schizophrenia-associated genes, neuropsychological diseases

# INTRODUCTION

Human endogenous retroviruses (HERVs), a class of retroelements, are regarded as remnants of ancient exogenous retroviruses that integrated into the genome by infecting germ line cells millions of years ago (1). HERVs comprise approximately 8% of the human genome, replicate along with it, and are inherited following Mendel's laws (2–5). In addition, HERVs are polynucleotide sequences with the complete structure of a retrovirus (6). Classical HERVs have the general components of retroviruses, including the 5′LTR, GAG, POL (retroviral polymerase gene), ENV (envelope), and 3′LTR (7, 8). Based on phylogenetic analyses of the pol and env genes, HERVs have been assigned to at least 55 families/groups and categorized into three main classes: Class I (HERV-W and HERV-H), Class II (HERV-K), and Class III (HERV-L; **Figure 1**). HERV DNA, once classified as useless junk DNA, is essential to human embryonic development and is deeply involved in human evolution.

HERVs include many families, and each has multiple copies. HERV-W, an important member of the HERV family, was first named multiple sclerosis-associated retrovirus (MSRV). It was isolated from the leptomeningeal choroid plexus, as well as from the Epstein–Barr virus-immortalized B cells of patients with MS (9–12). A full-length DNA copy of the HERV-W gene is located at chromosome 7q21; it is defective and does not produce a functional virus (13). Syncytin-1, also known as ERVWE1 or HERV-W Env, is a functional envelope glycoprotein encoded by a single HERV-W env locus that harbors a complete open reading frame (14). Syncytin-1 comprises two functional domains: the cell surface domain (SU) and the transmembrane domain (TM; **Figure 2**). SU binds to host cell receptors, and TM promotes virus–cell or cell–cell fusion. Syncytin-1 plays a critical role in placental trophoblast formation and is involved in the maternal immunosuppressive effect on the fetus. In addition, Syncytin-1 is a highly fusogenic membrane glycoprotein that can induce syncytium formation in cell–cell fusion assays (15, 16). However, recent studies show that Syncytin-1 expression is reproducibly associated with numerous neurological diseases such as schizophrenia, and an increasing number of studies have focused on the potential inflammatory mechanism by which Syncytin-1 mediates neuroimmune activation and oligodendrocyte damage in these diseases. In this article, we mainly introduce the role of Syncytin-1 in inflammatory abnormalities and emphasize an inflammatory mechanism mediated by Syncytin-1 in neuropsychological diseases.

# SYNCYTIN-1 AND NEUROLOGICAL DISEASES

Beyond its normal physiological function in placental development, the activity and expression of Syncytin-1 are increased in several diseases, such as neuropsychiatric disorders, autoimmune diseases, and cancer (8). A growing number of studies suggest that Syncytin-1 contributes to the development of neuropsychological diseases, such as schizophrenia and MS (3, 17).

# Syncytin-1 and Schizophrenia

Schizophrenia is a severe neuropsychiatric disorder characterized by abnormal social behavior and an inability to distinguish what is real (18). Findings indicate that a number of genes contribute to the development of schizophrenia, such as those encoding brain-derived neurotrophic factor (BDNF), neurotrophic tyrosine kinase type 2 receptor (NTRK2), dopamine receptor D3 (DRD3), small conductance Ca<sup>2+</sup>-activated K<sup>+</sup> channel protein 3 (SK3), and glycogen synthase kinase 3β (GSK3β) (18, 19). Considerable attention has recently been directed toward the role of Syncytin-1 in schizophrenia.

A growing number of articles have reported the involvement of Syncytin-1 in schizophrenia. Syncytin-1 expression in serum samples of patients with schizophrenia was described by Perron et al. (20). In that study, positive Syncytin-1 expression was detected in 23 of 49 subjects with schizophrenia but in only 1 of 30 healthy controls. In another study, the transcripts of Syncytin-1 in peripheral blood mononuclear cells (PBMCs) were similarly elevated in patients with schizophrenia relative to those in control subjects (21). In our previous study, we detected Syncytin-1 mRNA in the plasma samples of 42 of 118 patients with recent-onset schizophrenia but in none of 106 controls (3). We also detected increased protein levels of Syncytin-1 in the sera of 99 patients with schizophrenia relative to those of 83 normal individuals by ELISA (22). Taken together, these results suggest that Syncytin-1 is involved in the development of schizophrenia.
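As a rough illustration of how strongly the detection counts quoted above differ between patients and controls, the following Python sketch applies a two-sided Fisher's exact test to the two studies for which both counts are given in the text. The choice of test, the function name `fisher_exact_two_sided`, and the study labels are assumptions made for illustration only; the cited studies may have used different statistical methods.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Illustrative helper (not taken from the cited studies).
    Rows are groups (cases, controls); columns are (positive, negative).
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Hypergeometric probability of x positives in row 1, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_obs * (1 + 1e-9)
    )


# Counts quoted in the text:
# (positive cases, total cases, positive controls, total controls)
studies = {
    "serum, ref. 20": (23, 49, 1, 30),
    "plasma mRNA, ref. 3": (42, 118, 0, 106),
}

for label, (pos_case, n_case, pos_ctrl, n_ctrl) in studies.items():
    p_value = fisher_exact_two_sided(
        pos_case, n_case - pos_case, pos_ctrl, n_ctrl - pos_ctrl
    )
    print(
        f"{label}: cases {pos_case / n_case:.1%}, "
        f"controls {pos_ctrl / n_ctrl:.1%}, p = {p_value:.2e}"
    )
```

A small p-value here indicates only that the positivity rates differ between groups in these samples; it does not address the differences in tissue, assay, or disease stage discussed in the text.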

Several findings contradict the results described above. Frank analyzed Syncytin-1 mRNA expression in the brains of seven healthy individuals and seven individuals with schizophrenia and found no differences between the groups (23). Similar levels of Syncytin-1 expression were also observed in the PBMCs of patients with schizophrenia and controls (24, 25). These inconsistencies may be attributable to the following factors. First, variation in sample size might have influenced the statistical conclusions. Second, Syncytin-1 expression was measured in different tissues and fluids in these studies, such as brain tissue, serum, and PBMCs. The variation in results across these samples suggests that Syncytin-1 may play diverse roles in the development of schizophrenia. Last, the patients with schizophrenia in these studies were at different stages of the disease. In the early and late stages of schizophrenia, pathogenic factors are involved in the promotion and exacerbation of the disease, respectively. Therefore, Syncytin-1 can potentially perform different functions in the development of schizophrenia.

# Syncytin-1 and MS

Multiple sclerosis (MS) is a demyelinating disease with chronic inflammation. In patients with MS, the insulating covers of nerve cells in the brain and spinal cord are damaged, which disrupts communication between parts of the nervous system. Consequently, signs and symptoms manifest, including mental, physical, and psychiatric disorders (26). The pathogenesis of MS remains unclear; however, its underlying mechanism involves destruction by the immune system and failure of myelin-producing cells (27). Potential causes have been identified, which include complex interactions between genetic susceptibility and environmental factors. Viral infection is considered a potential environmental factor (28).

A growing number of studies indicate that Syncytin-1 plays an important role in MS. In 2004, Antony et al. reported that Syncytin-1 was elevated in glial cells in patients with acute demyelinating MS. In the same study, the Syncytin-1 gene was inserted into a virus that could infect astrocytes, and the modified virus was injected into the brains of healthy mice. Overexpression of Syncytin-1 in astrocytes promoted the release of redox reactants cytotoxic to oligodendrocytes. Two weeks post-injection, the mice developed MS-like symptoms, and numerous deformed and dead oligodendrocytes were found during autopsy (29). Perron et al. also found the physiologic expression of HERV-W in gray matter and white matter microglia as well as in central vascular endothelial cells in patients with MS (30). In 2007, Giuseppe et al. demonstrated by polymerase chain reaction (PCR) and reverse transcription–PCR that HERV-W env (Syncytin-1) and pol genes were highly expressed in the brain and PBMCs of individuals with MS. Immunohistochemical analysis showed that Syncytin-1 protein was expressed only in the glial cells of patients with MS exhibiting hyperplastic damage and was mainly distributed in the margins of microglia and astrocytes (31). In 2010, MSRV was observed in the cerebrospinal fluid of patients with early MS and was reported to contribute to the secondary progressive phase of MS (32). The role of Syncytin-1 in MS has been widely acknowledged; thus, the study of Syncytin-1 may provide new ideas for defining the neuropathic mechanisms of MS as well as its diagnosis, prognosis, and treatment (33, 34).

# INFLAMMATORY ABNORMALITIES IN NEUROPSYCHOLOGICAL DISEASES

Inflammation is a series of complex biological reactions of an organism in response to harmful stimuli (35). Various inflammatory cytokines, which play a role in initiating the inflammatory response, are essential for regulating inflammation (36). In the central nervous system (CNS), the inflammatory cytokines produced by neuronal and glial cells affect cortical neuronal development (37–39). Inflammatory abnormalities are involved in a wide range of human diseases and are regarded as a potential pathogenic mechanism of neuropsychological diseases such as schizophrenia (40) and MS (41).

# Inflammation and Schizophrenia

Inflammatory abnormalities have been repeatedly linked to schizophrenia in recent research (42–44). In either the early pivotal stage of brain development or the adult acute disease state, inflammation significantly affects the development of schizophrenia (36).

A confluence of evidence has demonstrated an association between prenatal inflammation induced by bacterial or viral infections and an increased risk of schizophrenia in the offspring during adulthood (36, 45, 46). Studies on rodents have indicated that an immune disorder during pregnancy can produce changes in the adult offspring that mimic the clinical symptoms of schizophrenia, including brain dysfunction and behavioral changes (47). In addition to prenatal and perinatal inflammation, the correlation between inflammation in adulthood and schizophrenia has also been investigated (48–51). The inflammatory response in the development of schizophrenia is a chronic low-grade response rather than an acute, short-term state (51, 52). Acute inflammation is a quick response and often benefits tissue repair and recovery (53, 54), whereas chronic inflammation has long-term consequences that are often detrimental, inducing immune system perturbations (55, 56). This may be one reason why high rates of chronic inflammation are reported in patients with schizophrenia. Numerous studies have revealed increased concentrations of several inflammatory cytokines in patients with schizophrenia (57). Higher levels of interleukin (IL)-1β, IL-2, IL-6, IL-8, IL-12, transforming growth factor-beta (TGF-β), and tumor necrosis factor-alpha (TNF-α) were detected in patients with schizophrenia than in controls (52, 58–60). C-reactive protein (CRP), another pro-inflammatory molecule, has recently been found to be elevated in patients with schizophrenia (36, 61–63). One mechanism by which inflammatory cytokines may contribute to schizophrenia is apoptosis, which can induce neuronal injury or death (64, 65). Researchers have demonstrated that alterations in the apoptotic cascade can potentially reduce the viability of neurons and glia at different stages of neurodevelopment, inducing the deficits in brain volume and function seen in schizophrenia (66–68). Another mechanism is that chronic inflammation may damage the brain microvascular system and disrupt the blood–brain barrier and cerebral blood flow, which may lead to the development of clinical symptoms of schizophrenia (69–71). Thus, inflammation plays a critical role in the development of schizophrenia.

#### Inflammation and MS

MS is a chronic inflammatory demyelinating disorder, and inflammatory disorders play a pivotal role in it. Richard et al. found significantly increased secretion of the inflammatory cytokines IL-1β and TNF-α in the monocytes of patients with MS relative to those of controls (72). Meanwhile, Celia et al. performed immunohistochemistry to detect the expression and distribution of pro-inflammatory and regulatory cytokines in different MS lesions and compared them with inflammatory and non-inflammatory CNS tissues from other neurological diseases. Results showed a widespread distribution of cytokines in perivascular inflammatory cells and glial cells in all inflammatory lesions. No apparent pattern of these cytokines in MS lesions was observed; however, pro-inflammatory cytokines were rarely detectable under normal and non-inflammatory conditions, and regulatory cytokines were easily detected in MS (73). Moreover, Josa et al. observed a robust brain inflammatory response in relapsing–remitting MS (RRMS), secondary progressive MS (SPMS), and primary progressive MS (PPMS). A significant correlation between inflammation and axonal injury was observed in both the global MS population and progressive MS alone (74). These results indicate that inflammation is associated with MS and suggest a potential process by which inflammation is triggered in MS. During the inflammatory reaction, encephalitogenic lymphocytes, which are activated peripherally, bind to receptors on endothelial cells within the CNS and then cross the blood–brain barrier, pass into the interstitial matrix, and trigger and amplify the inflammatory disorders in the brain. Inflammatory abnormalities may further induce neurodegeneration in MS (75).

# SYNCYTIN-1 COULD CAUSE INFLAMMATORY ABNORMALITIES IN NEUROPSYCHOLOGICAL DISEASES

Recent research has linked HERVs to the inflammatory condition in neuropsychological diseases. HERV-K, another well-studied HERV, was found to be robustly expressed in the brains of subjects with amyotrophic lateral sclerosis (ALS) (76, 77). In addition, the inflammatory transcription factors interferon regulatory factor 1 (IRF1) and NF-κB could trigger HERV-K expression via its interferon-stimulated response elements in neurons of the motor cortex in ALS (78), suggesting a potential role of HERVs in mediating inflammation in neuropsychological diseases.

Syncytin-1, functioning as an immunotoxin, can induce inflammation with superantigen-like effects, thereby activating the innate immune system (79). Studies indicate that specific infections can activate HERV-W elements, leading to the production of Syncytin-1, which then stimulates pro-inflammatory and neurotoxic cascades (21). Murphy demonstrated that overexpression of Syncytin-1 upregulated the expression of proinflammatory factors, such as IL-1β and IL-6 (80). Moreover, Syncytin-1 overexpression in glial cells can trigger endoplasmic reticulum stress, leading to neuroinflammation and the production of free radicals that destroy neighboring cells (34). Given the regulatory role of Syncytin-1 in inflammation, abnormal expression of Syncytin-1 may result in cell death or tissue damage (81). An in vitro study indicates indirect cytotoxicity of Syncytin-1 to oligodendrocytes, and murine models show that Syncytin-1 overexpression can lead to demyelination (17, 29, 31, 82). In a study by Perron, Syncytin-1 not only induced a proinflammatory reaction but also triggered experimental autoimmune encephalomyelitis (EAE) in mice (83). Owing to its potential to elicit immunosuppressive and neuroinflammatory effects, Syncytin-1 has been linked to some neurological and neuropsychiatric disorders (29, 84). For instance, Syncytin-1 has been regarded as an important regulator in the development of MS and schizophrenia because of its capacity to induce neuroinflammation and cytotoxicity. Below, we describe several potential mechanisms by which Syncytin-1 may contribute to neuroinflammation.

#### Syncytin-1 Increases Nitric Oxide in Glial Cells

Schizophrenia and MS are neurological diseases with an inflammatory response in the CNS (85). Glial cells, including astrocytes, microglia, and oligodendroglial cells, are widespread in the CNS and are necessary for regulating brain inflammation (86). Nitric oxide (NO) plays regulatory roles in the inflammatory condition of the brain and the function of neuronal cells and participates in the pathogenesis of various neuropsychological diseases (87, 88). Antony et al. indicated that Syncytin-1 could activate inducible NO synthase in astrocytes to initiate an old astrocyte specifically induced substance (OASIS)-mediated suppression of ASCT1 (17). In addition, Antony et al. observed that the overexpression of Syncytin-1 in astrocytes also induced the release of oxidation–reduction reaction products and NO, which were cytotoxic to oligodendrocytes (29). In our recent research, we found that overexpression of Syncytin-1 in microglia could induce the expression of inducible NO synthase, thereby increasing NO production and promoting the migration of microglia (89). Together, these findings suggest that Syncytin-1 contributes to neuroinflammation by inducing the production of NO in glial cells.

# Syncytin-1 Induces Proinflammatory Cytokines via CD14 and TLR4 in Human Monocytes

Toll-like receptor 4 (TLR4) is a transmembrane protein belonging to the TLR family. It can recognize lipopolysaccharide and lead to the activation of the NF-κB signal transduction pathway and the production of inflammatory cytokines. TLR4 mainly participates in activating the innate immune system. Meanwhile, CD14 is a glycosylphosphatidylinositol-anchored membrane protein that functions as a pattern recognition receptor together with the extracellular domain of TLR4. One study focused on the inflammatory response induced by Syncytin-1 through CD14 and TLR4. In human monocytes, Syncytin-1 could induce the proinflammatory cytokines IL-6, IL-1β, and TNF-α; however, incubation with neutralizing antibodies against CD14 and TLR4 effectively blocked the secretion of these cytokines (90). The signaling pathways of CD14 and TLR4 in glial cells have not been confirmed; nevertheless, increased TLR4 has been identified in the oligodendroglial cells in MS, inducing brain inflammation (29). Moreover, the proinflammatory cytokines IL-6, IL-1β, and TNF-α are important for regulating the inflammation status in the CNS and brain development (39, 91, 92). Given these findings, we consider that CD14/TLR4 potentially mediates the effect of Syncytin-1 in the CNS, inducing proinflammatory cytokines and contributing to neuropsychological diseases such as schizophrenia and MS.

# Syncytin-1 Induces CRP Activation via TLR3 in Glial Cells

C-reactive protein (CRP), an inflammatory marker, is associated with several neuropsychological diseases. For instance, CRP was elevated in the serum of patients with schizophrenia (93) and MS (94). Recent research indicated that the expression of several TLRs, including TLR3, was highly increased in the blood of individuals with schizophrenia (95). A member of the TLR family, TLR3 mainly recognizes viral double-stranded RNA (dsRNA) and activates the innate immune system. Activation of TLR3 can induce the production of proinflammatory cytokines as diverse as IL-6, IL-1β, and TNF-α (96, 97). In our recent study, we reported that Syncytin-1 levels correlated positively and were markedly consistent with the expression levels of CRP in individuals with schizophrenia. We also found that Syncytin-1 could trigger the activation of CRP via the TLR3–IL-6 signaling pathway in glial cells, and that TLR3 deficiency significantly impaired Syncytin-1-induced CRP and IL-6 expression (22). Direct interaction and cellular colocalization between Syncytin-1 and TLR3 were observed by confocal microscopy (22). Thus, TLR3 can potentially function as a Syncytin-1 mediator to induce inflammatory abnormalities in glial cells.

# HLA-A∗0201+-Restricted Epitopes of Syncytin-1 Could Induce Cytotoxic T Lymphocytes

HLA-A<sup>∗</sup>0201 is a human leukocyte antigen (HLA) allele. HLA restriction is involved in the immune response in neuropsychiatric diseases. Epitopes derived from Syncytin-1 are HLA-A<sup>∗</sup>0201-restricted and have potential for adoptive immunotherapy. In that study, we predicted and synthesized five Syncytin-1 peptides that displayed HLA-A<sup>∗</sup>0201-binding motifs. Among them, peptides W, Q, and T could promote the proliferation of lymphocytes. Stimulation of PBMCs from HLA-A<sup>∗</sup>0201<sup>+</sup> donors with these peptides could induce peptide-specific CD8<sup>+</sup> T cells. Abundant interferon-γ-secreting T cells were also detected after stimulation with these peptides for several weeks. These data demonstrate that Syncytin-1 peptides (such as the W, Q, and T peptides) can induce HLA-A2.1-restricted CD8<sup>+</sup> CTL and could be a potential target for astrocytoma immunotherapy (98). On the other hand, the cytotoxic T lymphocytes induced by Syncytin-1 could be a potential mechanism for inflammatory abnormalities in the CNS.

# COMMENTS

Recent clinical reports have indicated the importance of Syncytin-1 in neuropsychological diseases; however, these studies have several limitations. First, the sample sizes of these clinical studies are relatively small. Sample size is crucial because an insufficient sample may make it difficult to establish statistical significance and to reproduce findings. Larger sample sizes are necessary for verifying the role of Syncytin-1 in these diseases. Second, including patients with other psychiatric disorders as control groups in clinical studies would strengthen the potential implications of the findings.

A notable finding from the previous observations is that the abnormal expression levels of Syncytin-1 in neuropsychological diseases seem ubiquitous. For instance, elevated Syncytin-1 expression in MS was detected in different tissues or fluids, including the serum, PBMCs, glial cells, and brain tissues from patients with MS (31, 32, 99, 100). This widespread elevation makes it challenging to clearly elaborate the pathogenic mechanism of Syncytin-1 in neuropsychological diseases. It also suggests that Syncytin-1 can execute multiple functions in the development of diseases. The evidence for an association between Syncytin-1 and inflammation indicates that the change in Syncytin-1 in neuropsychological diseases is unlikely to be an incidental phenomenon. In view of the brain damage in neuropsychological diseases, increased Syncytin-1 in the cerebrospinal fluid or neuroglial cells may be associated with neuroinflammation, leading to brain injury. Our previous data supported this possibility: we found that Syncytin-1 could trigger the production of inflammatory cytokines CRP and IL-6 in microglial and astroglial cells (22). Another study also suggested that Syncytin-1 can induce inflammation by promoting the secretion of IL-6, IL-1β, and TNF-α in human monocytes (90). Owing to the association between Syncytin-1 and inflammation, abnormal Syncytin-1 expression in PBMCs indicates that Syncytin-1 may promote the inflammatory state of immune cells in blood, enhancing inflammation in the brain of individuals with neuropsychological diseases. Therefore, the role of Syncytin-1 in neuropsychological diseases may be complex, and more clinical studies and cell experiments are necessary to verify the specific functions of Syncytin-1 in different tissues or fluids in neuropsychological diseases.

TLRs may be essential factors in the molecular mechanisms by which Syncytin-1 regulates inflammation in neuropsychological diseases. Once activated, TLR3 and TLR4 can trigger the innate immune reaction. We found that TLR3 could mediate the inflammatory effect of Syncytin-1 in microglial and astroglial cells (22). Meanwhile, neutralizing antibodies against TLR4 could effectively impair the inflammation induced by Syncytin-1 in human monocytes (90). Therefore, different TLRs may function as mediators to induce inflammatory reactions in response to Syncytin-1 in different tissues or fluids. Nevertheless, the existing data in the literature remain inconclusive. The mechanism by which Syncytin-1 regulates neuroinflammation in neuropsychological diseases has yet to be elucidated. Further research on the role of Syncytin-1 in neuropsychological diseases has to be conducted. Mouse models should also be used in these studies.

# CONCLUSION

An increasing number of findings suggest that neuropsychological diseases result from both genetic and environmental factors. In addition to genetic factors, environmental factors play an essential role in disease development, particularly in the early phases of brain neurodevelopment (18, 19). Syncytin-1 may link environmental and genetic factors. Accumulating evidence indicates that Syncytin-1 is closely involved in the development of neuropsychological diseases. Environmental factors, such as specific viral infections, drug application, and exposure to ultraviolet rays (22), can induce Syncytin-1. The elevated Syncytin-1 in the brain has been associated with abnormal inflammation, contributing to the development of neuropsychological diseases. Many studies reveal the potential role of Syncytin-1 in neuroinflammation, but the potential mechanisms of HERV pathogenicity have yet to be elucidated. In this review, we described several activated signaling networks in response to Syncytin-1 that may lead to abnormal inflammation in neuropsychological diseases. Syncytin-1 may induce inflammatory abnormalities via four routes: (1) release of NO; (2) activation of the TLR4/CD4 pathway; (3) activation of the TLR3 signal pathway; (4) induction of CTL. These inflammatory abnormalities could lead to neuronal damage and apoptosis of neuron cells, which play crucial roles in neuropsychological diseases such as schizophrenia and MS (**Figure 3**).

FIGURE 3 | Hypothesis that Syncytin-1 contributes to inflammatory abnormalities, which lead to neuropsychological diseases. Environmental factors (e.g., ultraviolet rays), infectious agents (e.g., viruses), drug application (e.g., caffeine, aspirin, etc.), and genetic variation could trigger the expression of Syncytin-1 in glial cells. The expression of Syncytin-1 induced the release of nitric oxide in microglia and astrocytes and activated the TLR signaling pathways (e.g., TLR3 and TLR4) to induce the production of inflammatory cytokines. In addition, Syncytin-1-derived cytotoxic T lymphocytes could also secrete inflammatory cytokines. The production of these inflammatory cytokines led to the inflammatory abnormalities in the CNS and contributed to the development of neuropsychological diseases.

We summarize the relationship between increased Syncytin-1 and abnormal inflammation and elucidate the potential mechanisms of inflammation induced by Syncytin-1 in neuropsychological disorders. This review also presents a new insight into the diagnosis and treatment of neuropsychological diseases.

#### REFERENCES


#### AUTHOR CONTRIBUTIONS

XW prepared the first draft of the manuscript, which was corrected and improved by JH and FZ. All authors read and approved the final manuscript.

### FUNDING

This work was supported by grants from the National Natural Science Foundation of China (Nos. 81772196, 31470264, 81271820, 30870789, 30300117, and 81500634) and the Stanley Foundation from the Stanley Medical Research Institute, United States (No. 06R-1366) for FZ.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Wang, Huang and Zhu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Prioritized High-Confidence Risk Genes for Intellectual Disability Reveal Molecular Convergence During Brain Development

Zhenwei Liu<sup>1†</sup>, Na Zhang<sup>1†</sup>, Yu Zhang<sup>1†</sup>, Yaoqiang Du<sup>1</sup>, Tao Zhang<sup>1</sup>, Zhongshan Li<sup>1</sup>, Jinyu Wu<sup>1\*</sup> and Xiaobing Wang<sup>2\*</sup>

<sup>1</sup> Institute of Genomic Medicine, Wenzhou Medical University, Wenzhou, China, <sup>2</sup> Department of Rheumatology, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China

#### Edited by:

Cunyou Zhao, Southern Medical University, China

#### Reviewed by:

Thomas V. Fernandez, Yale University, United States; Fangqing Zhao, Beijing Institutes of Life Science (CAS), China

#### \*Correspondence:

Jinyu Wu iamwujy@gmail.com Xiaobing Wang gale820907@163.com; wztdwang@163.com

†These authors have contributed equally to this work

#### Specialty section:

This article was submitted to Behavioral and Psychiatric Genetics, a section of the journal Frontiers in Genetics

Received: 31 May 2018; Accepted: 09 August 2018; Published: 18 September 2018

#### Citation:

Liu Z, Zhang N, Zhang Y, Du Y, Zhang T, Li Z, Wu J and Wang X (2018) Prioritized High-Confidence Risk Genes for Intellectual Disability Reveal Molecular Convergence During Brain Development. Front. Genet. 9:349. doi: 10.3389/fgene.2018.00349

Dissecting the genetic susceptibility to intellectual disability (ID) based on de novo mutations (DNMs) will aid our understanding of the neurobiological and genetic basis of ID. In this study, we identify 63 high-confidence ID genes with q-values < 0.1 based on four background DNM rates and coding DNM data sets from multiple sequencing cohorts. Bioinformatic annotations revealed a higher burden of these 63 ID genes in FMRP targets and CHD8 targets, and these genes show evolutionary constraint against functional genetic variation. Moreover, these ID risk genes were preferentially expressed in the cortical regions from the early fetal to late mid-fetal stages. In particular, a genome-wide weighted co-expression network analysis suggested that ID genes tightly converge onto two biological modules (M1 and M2) during human brain development. Functional annotations showed specific enrichment of chromatin modification and transcriptional regulation for M1 and synaptic function for M2, implying the divergent etiology of the two modules. In addition, we curated 12 additional strong ID risk genes whose molecular interconnectivity with known ID genes (q-values < 0.3) was greater than random. These findings further highlight the biological convergence of ID risk genes and help improve our understanding of the genetic architecture of ID.

Keywords: intellectual disability, de novo mutations, brain development, gene prioritization, molecular convergence

#### INTRODUCTION

Intellectual disability (ID) is a complex neurodevelopmental disorder characterized by notable deficits in intellectual functioning and adaptive behavior (Ropers, 2010; Musante and Ropers, 2014) with a prevalence of approximately 1% of the world's population (Maulik et al., 2011). Larger studies have provided compelling evidence that genetic factors are a major contributor to ID and may explain 25–50% of cases, although this association is complicated by extensive clinical and genetic heterogeneity (Vissers et al., 2016). Dissecting the relationship between genetics and ID would advance our understanding of the etiology of this disorder and may offer key information for the development of diagnostics and therapies (Harripaul et al., 2017a).

Whole-exome sequencing (WES) and whole-genome sequencing (WGS) of parent–offspring trios or quartets have established that rare de novo mutations (DNMs) play a prominent role in the pathogenesis of severe sporadic ID (de Ligt et al., 2012; Gilissen et al., 2014; Hamdan et al., 2014; Lelieveld et al., 2016). DNMs have been identified as an important source of novel risk genes and provide further insight into the genetic landscape of ID (Hamdan et al., 2014; Lelieveld et al., 2016; Vissers et al., 2016). Screening for recurrent and deleterious DNMs from ever more cohort and family studies has produced a steadily growing number of risk loci and genes associated with ID, such as DYNC1H1 (de Ligt et al., 2012), CTNNB1 (de Ligt et al., 2012), KCNQ3 (Rauch et al., 2012), DLG4 (Lelieveld et al., 2016), and PPM1D (Lelieveld et al., 2016). Statistical analyses of larger cohorts have demonstrated that the candidate genes identified from patients with severe ID often harbor an excess number of loss-of-function (LoF) or functional DNMs with a potentially greater disruptive effect on protein function than expected (Gilissen et al., 2014; Lelieveld et al., 2016). However, due to the extreme genetic heterogeneity of ID, each newly identified gene accounts for only a small proportion of ID cases (Carvill and Mefford, 2015; Vissers et al., 2016). It is therefore still crucial to use available sequencing data to effectively prioritize the causative mutations and candidate genes associated with ID.

Recent functional-network-based analyses, including gene co-expression or physical protein interactions, have shown high functional coherence and connectivity between ID risk genes (Hamdan et al., 2014; Riazuddin et al., 2016; Harripaul et al., 2017b; Shohat et al., 2017). Additionally, Gene Ontology (GO)-based annotations of multiple biological processes in several studies revealed that ID risk genes are significantly associated with nervous system development, RNA metabolism, and transcription, presenting convergent functional features in specific biological pathways (Kochinke et al., 2016). Analyses of the unique spatiotemporal expression patterns of ID risk genes during human brain development indicated that the altered functions of certain specific brain regions were responsible for the range of various clinical ID phenotypes (Parikshak et al., 2015; Harripaul et al., 2017b; Shohat et al., 2017). Therefore, determining ID-associated biological pathways and their expression in the human brain would be of great utility for understanding the pathogenesis of ID (Parikshak et al., 2015; Vissers et al., 2016).

In this study, using the TADA statistical model, we identified 63 high-confidence ID genes with q-values < 0.1 based on all coding DNMs reported to date for ID from currently available trio-based WES/WGS studies. Furthermore, we sought to provide further insight into the pathogenesis of ID by validating these high-confidence ID genes based on a range of function-related analyses. Our analyses showed increased molecular connectivity between strong candidate genes and known ID genes and suggest that these high-confidence ID genes converge on specific brain regions and developmental stages as well as common biological processes.

# MATERIALS AND METHODS

#### Data Collection and Annotation

All DNM datasets in this study were available from 11 published cohorts for ID and control, and detailed information is shown in **Supplementary Table S1**. In addition, the four background DNM rates (DNMRs), including DNMR-GC (Sanders et al., 2012), DNMR-SC (Samocha et al., 2014), DNMR-MF (Francioli et al., 2015), and DNMR-DM (Jiang et al., 2017), were retrieved from the mirDNMR database (Jiang et al., 2017).

#### Annotation of DNMs and Prioritization of ID Risk Genes

By combining the datasets from each study, a total of 1,404 DNMs were collected from the WES/WGS of 1,027 ID trios and 38,403 DNMs from the WGS of 951 control trios (**Supplementary Table S1**). We annotated variants using ANNOVAR software (Wang et al., 2010) based on RefSeq hg19 and multiple allele frequency databases (ExAC, UK10K, 1000 Genomes and ESP6500). The functional prediction of missense mutations was performed using 14 integrated tools in ANNOVAR (SIFT, Polyphen2\_hdiv, Polyphen2\_hvar, LRT, Mutation Taster, Mutation Assessor, FATHMM, RadialSVM, MetaLR, VEST3, CADD, GERP, phyloP100way\_vertebrate, SiPhy). After filtering out non-exonic DNMs and common variants with minor allele frequency ≥ 0.001, we focused on 1,392 and 702 de novo coding mutations for cases and controls, respectively (**Supplementary Table S2**). We then investigated de novo extreme mutations, including LoF [frameshift, indel, stop-gain, stop-loss or splicing single nucleotide variants (SNVs) in coding regions] and missense mutations that were predicted to be damaging by at least eight of the fourteen tools. We then used the Bayesian TADA model (TADA-Denovo) to prioritize ID risk genes based on extreme mutations and four background DNMRs, and the TADA P-value was adjusted to calculate the q-value (He et al., 2013). Genes with a q-value < 0.1 for at least three background mutation rates were defined as high-confidence ID risk genes. Known ID genes were derived from three articles (Lelieveld et al., 2016; Vissers et al., 2016; Harripaul et al., 2017b) (**Supplementary Table S3**).
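The filtering and selection rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (`exonic`, `max_maf`, `effect`, `n_damaging`), the effect labels, the gene labels, and all toy values are hypothetical and are not ANNOVAR or TADA output. The TADA model itself is not reproduced here; the q-values are simply taken as given.

```python
# Thresholds are taken from the text; field names and gene labels are hypothetical.
MAF_CUTOFF = 0.001         # variants with MAF >= 0.001 are treated as common
MIN_DAMAGING_TOOLS = 8     # "at least eight of the fourteen tools"
Q_CUTOFF = 0.1
MIN_RATES = 3              # at least three of the four background DNMRs

# LoF classes as listed in the text (labels are illustrative only).
LOF_EFFECTS = {"frameshift", "indel", "stopgain", "stoploss", "splicing"}


def is_extreme(variant):
    """True for a rare coding DNM that is LoF or a predicted-damaging missense."""
    if not variant["exonic"] or variant["max_maf"] >= MAF_CUTOFF:
        return False
    if variant["effect"] in LOF_EFFECTS:
        return True
    return (variant["effect"] == "missense"
            and variant["n_damaging"] >= MIN_DAMAGING_TOOLS)


def high_confidence_genes(qvalues):
    """Keep genes whose q-value is < 0.1 under at least three background rates."""
    return sorted(
        gene for gene, per_rate in qvalues.items()
        if sum(q < Q_CUTOFF for q in per_rate.values()) >= MIN_RATES
    )


# Toy input (invented values, for illustration only).
toy_variants = [
    {"exonic": True, "max_maf": 0.0, "effect": "stopgain", "n_damaging": 0},
    {"exonic": True, "max_maf": 0.0, "effect": "missense", "n_damaging": 9},
    {"exonic": True, "max_maf": 0.0, "effect": "missense", "n_damaging": 5},
    {"exonic": True, "max_maf": 0.01, "effect": "frameshift", "n_damaging": 0},
]
toy_q = {
    "GENE_A": {"GC": 0.01, "SC": 0.02, "MF": 0.05, "DM": 0.20},
    "GENE_B": {"GC": 0.05, "SC": 0.15, "MF": 0.30, "DM": 0.08},
}
extreme_flags = [is_extreme(v) for v in toy_variants]
selected = high_confidence_genes(toy_q)
```

In this toy input, only the first two variants count as extreme, and only `GENE_A` passes the three-of-four rule.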

#### Conservation and Damage Estimation

We assessed the tolerance of genes to functional genetic variations using the Residual Variation Intolerance Score (RVIS), which measures deviation from the expected amount of common functional variations in genes (Petrovski et al., 2013). Genes with an RVIS score in the top 25% were described as intolerant. The probability of being LoF intolerant (pLI) was derived from ExAC<sup>1</sup>, and genes with a pLI greater than 0.9 were defined as extremely intolerant genes (Lek et al., 2016). Additionally, we defined the 'hot zone' as a region that reflects a pLI score greater than 0.9 and an RVIS in the top 25th percentile. The Fragile X Mental Retardation Protein (FMRP) is a polyribosome-associated neuronal RNA-binding protein (Darnell et al., 2011). We collected FMRP targets from two independent data sets, Ascano et al. (2012) (939 genes) and Darnell et al. (2011) (842 genes). CHD8 targets, genes encoding postsynaptic density (PSD) proteins, haploinsufficient genes with predicted haploinsufficient probability greater than 0.9 and constrained genes were derived from previous studies (Huang et al., 2010; Bayes et al., 2011; Samocha et al., 2014; Cotney et al., 2015). We utilized Fisher's exact test with correction for multiple comparisons to analyze whether our ID risk genes were enriched in the above gene sets.

<sup>1</sup>http://exac.broadinstitute.org/
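As a rough illustration of the hot-zone definition and the enrichment test, the sketch below uses only the Python standard library. Several points are assumptions rather than details from the text: the test is written as one-sided (enrichment only), the multiple-comparison correction is shown as Bonferroni, and the background size is whatever the caller supplies. The function names are hypothetical.

```python
from math import comb


def in_hot_zone(rvis_percentile, pli):
    """Hot zone per the Methods: RVIS in the top 25th percentile and pLI > 0.9."""
    return rvis_percentile <= 25 and pli > 0.9


def fisher_enrichment_p(overlap, n_risk, n_set, n_background):
    """One-sided Fisher's exact P: chance of seeing >= overlap shared genes.

    n_risk risk genes are drawn from n_background genes, of which n_set
    belong to the annotated gene set (hypergeometric null).
    """
    total = comb(n_background, n_risk)
    upper = min(n_risk, n_set)
    tail = sum(
        comb(n_set, k) * comb(n_background - n_set, n_risk - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def bonferroni(p, n_tests):
    """Bonferroni correction (an assumed choice; the text does not name one)."""
    return min(1.0, p * n_tests)


# Tiny worked case: 2 risk genes, both inside a 2-gene set, 4 background genes.
# Only one of the comb(4, 2) = 6 possible draws gives an overlap of 2.
p_small = fisher_enrichment_p(2, 2, 2, 4)
```

The small case can be checked by hand: exactly one of the six equally likely draws contains both set members, so `p_small` is 1/6.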

#### Functional Enrichment Analysis


To characterize the functional convergence of ID, the GO annotations of ID risk genes were determined using DAVID v6.8<sup>2</sup>.

#### Network Analysis

The protein–protein interaction (PPI) network used in this study was retrieved from the STRING database<sup>3</sup> (v10). Analytical data on spatiotemporal enrichment and co-expression were obtained from the HBT database<sup>4</sup>. To construct the co-expression network, we first computed the Pearson correlation coefficient (r) between any two genes in the HBT and defined the gene pair as co-expressed if the absolute value of r was greater than 0.6. We then estimated whether the absolute value of r for any gene pair between the 12 novel candidate genes and the 741 known ID genes, or the 63 known ID genes with q-values < 0.3, was greater than 0.6. To show that the constructed PPI and co-expression networks were not random, we employed a permutation test with 100,000 iterations for genes and their connections. The network was visualized using Cytoscape v3.4.0 (Shannon et al., 2003). Code for the permutations performed in **Figure 2** is provided in **Supplementary File S1**.
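The co-expression rule (|r| > 0.6) and a permutation test on the number of connections can be sketched as follows. This is a minimal illustration and not the code in Supplementary File S1: the null model used here (random gene sets of the same size as the candidate set, drawn from genes outside the known set) is an assumption, as are the function names, the add-one P-value estimate, and the toy expression profiles.

```python
import random
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def count_edges(candidates, known, expr, cutoff=0.6):
    """Number of candidate-known gene pairs with |r| above the cutoff."""
    return sum(
        abs(pearson_r(expr[c], expr[k])) > cutoff
        for c in candidates
        for k in known
    )


def permutation_p(candidates, known, expr, n_iter=1000, seed=0):
    """Empirical P that a random gene set has at least the observed edge count."""
    rng = random.Random(seed)
    observed = count_edges(candidates, known, expr)
    known_set = set(known)
    pool = [g for g in expr if g not in known_set]  # assumed null gene pool
    hits = sum(
        count_edges(rng.sample(pool, len(candidates)), known, expr) >= observed
        for _ in range(n_iter)
    )
    return (hits + 1) / (n_iter + 1)  # add-one estimate avoids P = 0


# Toy expression profiles (invented values, for illustration only).
toy_expr = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],
    "g3": [4.0, 3.0, 2.0, 1.0],
    "g4": [1.0, -1.0, -1.0, 1.0],
}
observed_edges = count_edges(["g1"], ["g2", "g3"], toy_expr)
p_value = permutation_p(["g1"], ["g2", "g3"], toy_expr, n_iter=200)
```

Because the absolute value of r is used, the perfectly anti-correlated pair (`g1`, `g3`) counts as co-expressed, just as the perfectly correlated pair (`g1`, `g2`) does.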

### Spatiotemporal Enrichment of ID Risk Genes

To gain insight into the spatiotemporal and tissue-specific expression of ID risk genes, we used Tissue Specific Expression Analysis (TSEA<sup>5</sup>) (Dougherty et al., 2010) and specific expression analysis across brain regions and development using previously developed tools<sup>6</sup> (Xu et al., 2014).

### Weighted Gene Co-expression Network Analysis

As previously described (Langfelder and Horvath, 2008), we performed a weighted gene co-expression network analysis (WGCNA) for ID risk genes using an R package. The expression levels of 60 of the 63 genes across different developmental stages, based on the HBT, were utilized to build gene co-expression modules. The WGCNA clusters the genes using a measure of topological overlap based on the change in the correlation matrix using a power consistent with scale-free topology standards (Zhang and Horvath, 2005). The relevant parameters of the software package were set to 6 for clustering the spatiotemporal expression patterns of a given gene set.
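For readers unfamiliar with the topological overlap measure, the following sketch shows how an unsigned weighted adjacency and the resulting overlap matrix are commonly computed, following the general formulation of Zhang and Horvath (2005). It is not the WGCNA R package, and reading the value 6 as the soft-thresholding power is an assumption: the text does not state which parameter was set to 6. Function names are hypothetical.

```python
def adjacency(corr, power=6):
    """Unsigned soft-threshold adjacency: a_ij = |cor_ij| ** power, zero diagonal.

    power=6 is an assumed reading of the parameter value given in the text.
    """
    n = len(corr)
    return [
        [0.0 if i == j else abs(corr[i][j]) ** power for j in range(n)]
        for i in range(n)
    ]


def topological_overlap(adj):
    """TOM_ij = (sum_u a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij)."""
    n = len(adj)
    k = [sum(row) for row in adj]  # connectivity of each gene
    tom = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            shared = sum(
                adj[i][u] * adj[u][j] for u in range(n) if u != i and u != j
            )
            tom[i][j] = (shared + adj[i][j]) / (min(k[i], k[j]) + 1 - adj[i][j])
    return tom


# Toy correlation matrix (invented values); power=1 keeps the arithmetic simple.
toy_corr = [
    [1.0, 0.5, 0.0],
    [0.5, 1.0, 0.5],
    [0.0, 0.5, 1.0],
]
toy_tom = topological_overlap(adjacency(toy_corr, power=1))
```

Genes 0 and 2 are uncorrelated in the toy matrix, yet they receive a non-zero overlap because both are connected to gene 1. This shared-neighbor effect is why clustering on topological overlap can group genes that a plain correlation threshold would leave apart.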

#### RESULTS

#### Comprehensive Detection and Prioritization of Candidate ID Risk Genes

After removing redundant samples, we collected a combined cohort of 1,027 ID trios and 951 normal trios, with a total of 39,807 DNMs from available parent–offspring sequencing studies, to comprehensively investigate known and potential ID-associated genes (**Supplementary Table S1**). After excluding non-exonic variants and common variants with MAF ≥ 0.001 based on different public databases (ExAC, UK10K, 1000 Genomes, and ESP6500), we focused on 2,094 DNMs located in the coding regions; these DNMs consisted of 1,924 de novo SNVs and 170 de novo indels (**Supplementary Table S2**). To further optimize and achieve the appropriate power for the discovery of ID-associated genes, we prioritized candidate genes using the TADA model based on coding DNMs and four DNMRs (DNMR-GC, DNMR-SC, DNMR-MF, and DNMR-DM). TADA prioritized 71 ID risk genes with q-values < 0.1 and 145 with q-values < 0.3 using any DNMRs (**Figure 1A** and **Supplementary Figure S1**). Moreover, we found 63 candidate genes with q < 0.1 (63/71, 88.7%) that harbored more than one DNM and could be found simultaneously by at least three of the background DNMRs, and we defined these as high-confidence risk genes. In addition, 44 candidate genes (44/71, 62.0%) were shared by the four background DNMRs (**Figure 1A** and **Supplementary Table S4**). Of the 145 genes with q-values < 0.3 for ID, 127 (127/145, 87.6%) could be found by at least three of the background DNMRs, while 92 (92/145, 63.4%) were found by all background DNMRs (**Supplementary Figure S1A** and **Supplementary Table S4**). However, no genes showed a q-value < 0.1 among the 951 controls (**Supplementary Figure S1B**), and only two genes (SH3D19 and P2RY14) had a q-value < 0.3 (**Supplementary Figure S1C** and **Supplementary Table S4**).

Additionally, we curated 741 well-known ID-associated genes reported in three published studies (**Supplementary Table S3**). After excluding 62 known ID genes of 145 ID risk genes with q-value < 0.3, we isolated 12 additional candidate genes with q-values < 0.1 and 63 potential candidate genes with q-values < 0.3 that harbored DNMs in the ID trios (**Figure 1B**). Of the 12 candidate genes (q-values < 0.1), TCF7L2 had 4 independent DNMs, 4 genes (KDM2B, PPP1CB, TNPO2, USP7) had 3 independent DNMs each, and 7 genes (ABCC3, CACNA1A, CEP85L, CSNK2A1, FBXO11, PPP2CA, SLC6A1) had 2 independent DNMs each (**Figure 1C** and **Supplementary Table S4**). Fourteen generic tools for functional prediction (see section "Materials and Methods") predicted that approximately 94.7% (18/19) of missense DNMs were damaging (D-Mis). In this study, LoF and D-Mis DNMs were considered extreme mutations. With the exception of one synonymous DNM in TNPO2, all DNMs in these genes were extreme mutations.

### Functional Co-expression and Physical Interaction Networks of ID Risk Genes

Physical interactions often occur between the different causative genes of the same disorder. To evaluate the PPI formed by the 12 candidate genes and the 63 known ID genes with q-values < 0.3, we generated an interconnected network using the comprehensive human protein interactome dataset collected from the STRING database (**Figure 2A**). Our evaluation of the PPI network showed statistical significance for the number of interacting proteins (P = 1.04 × 10<sup>−3</sup>) and connections (P = 8.30 × 10<sup>−4</sup>) relative to random expectations. Within the interconnected network encoded by 45 genes, 9 candidate genes showed highly likely direct interactions with 36 known ID genes (**Figure 2A**). Strikingly, the 4 genes with the most edges (PPP2CA, CSNK2A1, TCF7L2, CACNA1A) interacted with more than 10 known ID genes, and PPP2CA had the most common gene interactions, associating with 13 known ID genes. Moreover, we found that 11 candidate genes and 337 of the 741 known ID genes formed a significant interaction network which displayed more connections than random expectation (P = 1.00 × 10<sup>−5</sup> for genes; P = 1.00 × 10<sup>−5</sup> for connections; **Supplementary Figure S2A**).

<sup>2</sup>https://david.ncifcrf.gov/

<sup>3</sup>http://string-db.org/

<sup>4</sup>http://hbatlas.org/

<sup>5</sup>http://genetics.wustl.edu/jdlab/tsea/

<sup>6</sup>http://genetics.wustl.edu/jdlab/csea-tool-2/

To further explore the functional relevance of the 12 candidate genes and the known ID genes, we performed a co-expression network analysis based on the spatiotemporal transcriptome data set of the developing brain found in the Human Brain Transcriptome (HBT) database. We observed clear co-expression of novel candidate genes and the known ID genes, as demonstrated by absolute r-values greater than 0.6 (**Figure 2B**). Eight of these candidate genes were more frequently co-expressed with 34 of the known ID genes than would be expected by chance (P = 3.80 × 10<sup>−4</sup> for genes; P = 7.00 × 10<sup>−5</sup> for connections; **Figure 2B**). A further analysis of the network revealed that the four genes with the most edges (KDM2B, CSNK2A1, FBXO11, and SLC6A1) interacted with more than 15 known ID genes. Furthermore, 11 candidate genes were more frequently co-expressed with 292 known ID genes than those observed in randomly permuted networks (P = 3.00 × 10<sup>−5</sup> for genes; P = 9.80 × 10<sup>−4</sup> for connections; **Supplementary Figure S2B**). Our PPI and co-expression data provided support for the biological relationship between the 12 candidate genes and the known ID genes.

#### Functional Characteristics and Evaluation of ID Risk Genes

To assess whether the ID risk genes with q-values < 0.1 were intolerant of functional genetic variation, we used the RVIS percentile and pLI in the ExAC to measure intolerance. There were 44 ID risk genes with RVIS values in the top 25th percentile of the most constrained genes (enrichment P = 7.31 × 10<sup>−13</sup>) and 55 risk genes with pLI values ≥ 0.9 (enrichment P = 2.47 × 10<sup>−46</sup>). In addition, 43 risk genes were preferentially enriched for the "hot zone," defined as genes with RVIS ≤ 25th percentile and pLI values ≥ 0.9 (enrichment P = 1.98 × 10<sup>−28</sup>, **Figure 3A**). To further characterize the function of the 63 ID risk genes with q-values < 0.1, we performed an enrichment test for genes encoding messenger RNAs bound by FMRP, a neuronal RNA-binding protein implicated in regulating synaptic function during normal neurogenesis. The 63 ID risk genes were strongly enriched in the FMRP-related gene sets from Darnell et al. (2011) (24 risk genes, corrected P = 1.58 × 10<sup>−16</sup>). Although significant enrichment was not observed in the FMRP targets from Ascano et al. (2012) (6 risk genes; corrected P = 0.12), the enrichment in the shared set of FMRP genes from the above two independent data sets still achieved statistical significance (4 risk genes; corrected P = 1.62 × 10<sup>−3</sup>). Moreover, we also found significant enrichment for several canonical functional classes involved in a wide range of neurodevelopmental phenotypes (**Figure 3B**), such as CHD8 target genes (31 risk genes, corrected P = 6.96 × 10<sup>−9</sup>), PSD genes (15 risk genes, corrected P = 6.50 × 10<sup>−5</sup>), haploinsufficient genes (8 risk genes, corrected P = 1.62 × 10<sup>−3</sup>), and constrained genes (36 risk genes, corrected P = 6.62 × 10<sup>−29</sup>).

FIGURE 2 | Protein–protein interaction (PPI) and co-expression network analyses of ID risk genes. (A) Physical interaction network was created by seeding 12 candidates and known ID genes with q-values < 0.3 in STRING. The node color reveals the class of the gene set (known ID genes, dark sea green; candidate ID genes, red), and the thickness of the turquoise edges shows the degree of connectivity (PPI score). (B) The co-expression network between the 12 candidate ID genes (cyan) and the 63 known ID genes (firebrick) was analyzed using data from the HBT. Edge (blue line) size indicates the level of co-expression of gene pairs with an absolute value of r greater than 0.6. The histograms show the distributions of the numbers of genes and connections across the 100,000 iterations. The red vertical lines depict the numbers of observed nodes and connections in the networks. P-values are shown in the figures.

In addition, we further assessed the phenotypic term enrichment of the 63 ID risk genes based on the Human Phenotype Ontology database. We found that the 63 ID risk genes were significantly enriched for eight major neurodevelopmental phenotypes in humans (all corrected P < 0.05; **Figure 3C** and **Supplementary Table S5**). Hypoplasia of the corpus callosum was the most highly enriched (corrected P = 3.86 × 10<sup>−4</sup>), followed by epileptic encephalopathy, aggressive behavior, febrile seizures, stereotypy, autistic behavior, ID and global developmental delay. Constrained genes or genes with missense mutations in neuropsychiatric disorders have been proposed to have more protein interactions than non-constrained genes or controls (Shohat et al., 2017). Consistent with previous hypotheses, we found that 63 ID risk genes had a significant excess of PPIs compared with genes with q-values ≥ 0.1 identified in the present study (P = 0.031, **Figure 3D**).

### Spatiotemporal Expression Profiles of ID Risk Genes Involved in Brain Development

To investigate whether the co-expression of the 63 ID risk genes was enriched in specific human tissues or in the stages of human brain development most pertinent to ID, we performed TSEA and spatiotemporal enrichment analysis in the brain using previously developed tools (Xu et al., 2014). We found that those 63 genes are enriched for brain expression and preferentially expressed in specific brain regions, in accordance with previous findings (**Supplementary Figure S3**) (Shohat et al., 2017). Across brain regions and developmental stages, we observed strong signals of association in the cortical regions during the early fetal, early mid-fetal and late mid-fetal stages (**Figure 4A**). In particular, the most significant enrichment was detected in the early mid-fetal stage (corrected P = 1.24 × 10<sup>−8</sup>). In addition, significant enrichment was also found for the amygdala and striatum during early mid-fetal stages (**Figure 4A**).

Given that our analysis pointed to the roles of the 63 ID risk genes in the context of human brain development, we wanted to further characterize the spatiotemporal expression dynamics of these genes and assess their molecular convergence on specific biological processes. We employed WGCNA to group 60 of the 63 risk genes into 2 different co-expression modules (M1 and M2) based on pairwise correlations between the gene expression profiles of the tissue samples from the HBT (**Figure 4B** and **Supplementary Table S6**). The gene expression profile of the largest module (M1), which contained 39 genes, revealed a gradual trend toward increased expression in the human brain from the embryonic to early mid-fetal periods [16–19 post-conception weeks (PCW)] and then a gradual decrease to the lowest expression at childhood. An enrichment analysis of GO terms showed that this group of genes significantly converged on covalent chromatin modification (corrected P = 1.04 × 10<sup>−3</sup>) and some transcriptional regulation, including positive regulation of transcription, DNA-templated (corrected P = 4.67 × 10<sup>−2</sup>), negative regulation of transcription from RNA polymerase II promoter (corrected P = 6.78 × 10<sup>−3</sup>) and positive regulation of transcription from RNA polymerase II promoter (corrected P = 2.15 × 10<sup>−4</sup>; **Figure 4C** and **Supplementary Table S7**). For M2, we found that the expression of the 18 genes within this module gradually decreased during the fetal and infancy periods, followed by a gradual increase in expression from the infancy to adolescence periods, reaching a stable level after adulthood. Functional annotation showed that the M2 genes were enriched for chemical synaptic transmission (corrected P = 0.023), protein dephosphorylation (corrected P = 3.46 × 10<sup>−3</sup>) and nervous system development (corrected P = 2.23 × 10<sup>−3</sup>; **Figure 4C** and **Supplementary Table S7**).

Gene Ontology enrichment analysis showed that some biological processes were specific to the genes of M1 or M2, implying that these two modules have a divergent etiology. We then evaluated whether the DNM number, genes with DNMs and patients harboring DNMs differed across the two types of functional DNMs (LoF and D-Mis) between M1 and M2. We found that M1 had a higher prevalence of LoF mutations than did M2 (OR = 3.19, P = 5.06 × 10<sup>−4</sup>; two-tailed Fisher's exact test), but a lower rate of D-Mis mutations was observed in M1 than in M2 (OR = 0.27, P = 8.79 × 10<sup>−5</sup>; two-tailed Fisher's exact test; **Figure 4D**). Consistent with this observation, the burden in ID patients harboring LoF mutations was clearly higher in M1 than in M2 (OR = 3.88, P = 1.58 × 10<sup>−4</sup>; two-tailed Fisher's exact test), while an excess of patients harboring D-Mis mutations was observed in M2 over M1 (OR = 0.28, P = 2.30 × 10<sup>−4</sup>; two-tailed Fisher's exact test; **Figure 4D**). In addition, the frequency of genes with LoF and D-Mis mutations was not significantly different between M1 and M2, although a high proportion of LoF mutations was observed in M1 (for LoF, OR = 3.32, P = 0.08; for D-Mis, OR = 0.45, P = 0.25; two-tailed Fisher's exact test).

# DISCUSSION

Recent advances in genetic studies based on DNMs identified from large-scale WES/WGS analyses of ID patient cohorts allow us to further reinforce our understanding of the genetic etiology of ID (Gilissen et al., 2014; Hamdan et al., 2014; Lelieveld et al., 2016). However, the considerable genetic heterogeneity underlying ID makes it essential to prioritize causative mutations and explore new candidate genes as well as understand the relative biological processes associated with ID (Vissers et al., 2016). In this study, we employed the TADA statistical model to identify 63 high-confidence ID genes with q-values < 0.1, including 51 known and 12 potential ID genes, on the basis of coding DNM data sets from multiple trio-based WES/WGS studies in combination with four background DNMRs. We also observed a significant enrichment of FMRP targets and CHD8 targets among these 63 genes. Summarizing gene burden analyses in multiple metrics of evolutionary constraint suggests that the 63 risk genes are intolerant of functional genetic variations, highlighting the importance of their association with ID. Importantly, the enrichment of spatiotemporal gene expression signatures shows that ID genes were preferentially expressed in the cortex during the early fetal, early mid-fetal and late mid-fetal stages as well as in the amygdala and striatum during early mid-fetal stages. In particular, WGCNA analyses revealed an obvious convergence of the signals of these risk genes on similar biological processes, including synaptic function, chromatin modification and transcriptional regulation.

exact test. (C) Enrichment of 63 ID genes in human phenotypes drawn from the Human Phenotype Ontology (HPO). The x-axis represents the log<sub>10</sub> of the corrected P-values. (D) The cumulative distribution function (CDF) of the number of interactions (log<sub>10</sub>) is depicted for 63 high-confidence risk genes relative to q-values ≥ 0.1. A two-sample Kolmogorov–Smirnov test was used to detect the difference.

By excluding known ID genes, we highlighted 12 potential candidate ID genes from the 63 high-confidence ID genes. Moreover, several previous functional and association studies have pointed to the pathogenicity of most of the 12 potential candidate genes. Numerous genetics studies have identified pathogenic variants of CACNA1A (Epi, 2016; Luo et al., 2017), CSNK2A1 (Trinh et al., 2017), PPP1CB (Gripp et al., 2016; Ma et al., 2016), PPP2CA (Reijnders et al., 2017), SLC6A1 (Carvill et al., 2015; Halvorsen et al., 2016; Palmer et al., 2016; Yuan et al., 2017), and USP7 (Zarrei et al., 2017) from large cohorts of unrelated patients who presented a wide spectrum of neurological and behavioral phenotypes of global developmental delay, attention deficit disorder, epileptic encephalopathy, macrocephaly, ID or sensory processing disorder. Several studies in model systems have provided definitive evidence of the role of some of these genes in the neurodevelopmental process. Drosophila models have suggested that LoF alleles of CACNA1A affect synaptic transmission and neurodegeneration (Luo et al., 2017). A SLC6A1-knockout mouse model showed phenotypes of absence seizures or ADHD-like symptoms (Chen et al., 2015). A CRISPR/Cas9-based knockout of USP7 in neurons clearly impaired the proper function of hypothalamic neurons (Hao et al., 2015). Expression profile analysis and immunohistochemistry revealed that TCF7L2 is very highly expressed in the cortical, thalamic, and midbrain regions from the late gestational stage to the adult stage in mice (Nagalski et al., 2013). TNPO2 and 71 other constrained genes formed a significantly connected subnetwork and were preferentially expressed in the hippocampal region during the early stages of brain development (Choi et al., 2016). Based on our analysis of the PPI and co-expression networks, the present study also provides compelling support for the strong functional association between the 12 potential candidate genes and the known ID genes with q-values < 0.3.

FIGURE 4 | Specific expression patterns of the 63 ID risk genes in the brain. (A) Enrichment analysis across brain regions and development periods is depicted for different specificity index thresholds (pSIs). The outer hexagons depict pSI < 0.05, and the inner hexagons indicate a more stringent pSI. The dimension of the hexagons is scaled to the size of the gene list. Bullseyes are color-filled according to corrected P-values calculated by Fisher's exact test. (B) Illustration of the WGCNA of the 63 ID risk genes in the brain for the modules' eigengenes (dots for different brain regions) and smooth curves for the confidence intervals (gray ranges). (C) GO enrichment analysis for the two modules. All P-values are corrected for multiple comparisons. The red dotted line indicates a corrected P = 0.05. (D) Enrichment analysis of mutation class (LoF and D-Mis) from both modules at the DNM level, gene level and sample level. OR: odds ratio (module 1/module 2); P-values were calculated using two-sided Fisher's exact tests. Fetal is composed of 4–8 PCW, 8–10 PCW, 10–13 PCW, 13–16 PCW, 16–19 PCW, 19–24 PCW, and 24–38 PCW; Infancy includes 0–6 months and 6–12 months; Childhood contains 1–6 years and 6–12 years; Adolescence refers to 12–20 years; Adulthood is made up of 20–40 years, 40–60 years and over 60 years.

The finding in the present study that 15 of the 63 high-confidence ID genes were significantly associated with hypoplasia of the corpus callosum, which showed the highest enrichment (corrected P = 3.86 × 10<sup>−4</sup>), reflects the importance of the corpus callosum in ID. The corpus callosum is the largest forebrain commissure, comprising highly organized neocortical connections and functioning in bilateral movements, the development of language and handedness, and behavior and cognition (Raybaud, 2010; van der Knaap and van der Ham, 2011). The agenesis or dysgenesis of the corpus callosum has been implicated in severe ID by previous neuroradiologic studies that examined a wealth of magnetic resonance imaging (MRI) data on these patients (Schatz and Buzan, 2006; Luders et al., 2007; Aukema et al., 2009). With respect to healthy and autistic subjects, approximately 12.2% of patients with ID presented with a hypoplastic corpus callosum, as measured by the thickness and length of the corpus callosum on midsagittal T1-weighted images (Erbetta et al., 2015). An additional MRI study on a novel checklist of structural anomalies in 80 patients with unexplained mental retardation found mild to severe callosal anomalies in 28.8% of intellectually disabled patients, with a low IQ associated with the thinning of the corpus callosum (Spencer et al., 2005). In addition, a variety of abnormalities in the morphology of the corpus callosum are also found relatively frequently in children and adults with ASD, SCZ, and EE (Whitford et al., 2010; Basel-Vanagaite et al., 2013; Wolff et al., 2015).

Evidence supporting a shared genetic etiology between ID and other neuropsychiatric disorders, such as EE, ASD, and DD, has increased substantially (Vissers et al., 2016; Shohat et al., 2017). In the present study, some of the 63 risk genes were clearly implicated in EE, autistic behavior and global developmental delay (**Figure 3C**). Moreover, based on WES or WGS studies, functional DNMs in several potential candidates among the 63 risk genes were frequently detected in ASD and DD (**Supplementary Table S8**). For example, DNMs within CACNA1A, CSNK2A1, and FBXO11 were recurrently detected in unrelated patients with severe DD syndromes in independent sequencing studies of larger cohorts (Deciphering Developmental Disorders, 2015, 2017). Recurrent DNMs in the SLC6A1 and TCF7L2 genes were shared among ID, ASD, and DD (De Rubeis et al., 2014; Iossifov et al., 2014; Deciphering Developmental Disorders, 2015, 2017; Yuen et al., 2016), further highlighting the shared genetic basis of DNMs in neuropsychiatric disorders.

Recent studies using co-expression enrichment in the brain have identified the fetal development of the cortex as a point of molecular convergence for de novo LoF or missense mutations in ID (Harripaul et al., 2017b; Shohat et al., 2017), implying that altered cortical function is critical for ID susceptibility. Indeed, increased stability during evolution led to insufficient time for the evolution of a buffering capacity for the cerebral cortex, which is generally more intolerant of genetic perturbation (McGrath et al., 2011). The dysfunction of the cerebral cortex has been consistently implicated in neurodevelopmental disorders by multiple modalities (McGrath et al., 2011; Rubenstein, 2011; Hutsler and Casanova, 2016; Kim et al., 2016). Of the six major brain regions tested, the cortex showed significantly enriched expression of the 63 ID risk genes identified in the present study, consistent with previous findings. Despite extensive genetic heterogeneity in ID, there is emerging evidence that ID-associated genes that are highly connected in co-expression networks or in modules converge on certain specific biological functions (Kochinke et al., 2016; Harripaul et al., 2017b; Shohat et al., 2017). A WGCNA analysis of our gene set identified two spatially and temporally specific modules, M1 associated with chromatin modification, chromatin organization and transcriptional regulation, and M2 associated with synaptic function. The biological processes involved in ID are consistent with previous findings (Kochinke et al., 2016; Shohat et al., 2017), further emphasizing the role of convergent biological functions in ID.

# CONCLUSION

We provide multiple lines of evidence from function-related analyses of biological annotations, evolutionary constraints, gene co-expression and protein interaction networks that support the important role of these 63 high-confidence genes with q-values < 0.1 in the etiology of ID. In particular, we took advantage of a brain-specific network to define the preferential expression of ID genes in the cortex, and our results point to a shared molecular basis in synaptic function, chromatin modification and transcriptional regulation for the pathogenesis of ID.

# AUTHOR CONTRIBUTIONS

ZWL, NZ, and YZ contributed to the drafting and revision of the manuscript, data acquisition and analysis. TZ and YD contributed to data acquisition and data analysis. ZSL contributed to data acquisition and manuscript revision. XW and JW contributed to study concept and design, critical review and manuscript revision. All authors read and approved the final manuscript.

# FUNDING

This study was supported by the Zhejiang Provincial Natural Science Foundation of China (Grant No. LY18C060007).

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2018.00349/full#supplementary-material

FIGURE S1 | Identification of candidate genes in ID. The number of genes with q-values < 0.3 in ID (A), genes with q-values < 0.1 (B) and q-values < 0.3 (C) in control performed by the TADA method based on four background DNMRs are shown in the Venn diagram.

FIGURE S2 | Protein–protein interaction (PPI) and co-expression network analyses between 12 new genes and 741 known ID genes. (A) The histograms display the results of the permutation tests (100,000 simulations each) that assess the combined nodes and edges (connections) scores of the PPI networks. (B) The histograms display the results of the permutation tests (100,000 simulations each) that assess the combined nodes and edges (connections) scores of the co-expression networks. The vertical red lines indicate observed scores.

FIGURE S3 | Overrepresentation of the 63 ID risk genes across human tissue types, shown for different specificity index thresholds (pSIs).

TABLE S1 | DNM information from the published literature on ID and control.

TABLE S2 | Annotations for DNMs in ID and control.


TABLE S3 | Information on reported known genes in ID.

TABLE S4 | Prioritized genes with q-values < 0.3 by TADA.

TABLE S5 | High-confidence ID genes are significantly enriched in Human Phenotype Ontology with an enrichment score of corrected P < 0.05.

TABLE S6 | Conservative assessment of 63 ID risk genes and module information.

TABLE S7 | Enrichment of biological processes of 63 ID risk genes in each module.

TABLE S8 | Shared DNMs of ID risk genes in ASD and DD.

FILE S1 | Code for permutation test performed in PPI and co-expression network.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Liu, Zhang, Zhang, Du, Zhang, Li, Wu and Wang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Genetic and Epigenetic Alterations Underlie Oligodendroglia Susceptibility and White Matter Etiology in Psychiatric Disorders

Xianjun Chen<sup>1</sup> , Huifeng Duan<sup>1</sup> , Lan Xiao<sup>2</sup> and Jingli Gan<sup>1</sup> \*

<sup>1</sup> Department of Psychiatry, Mental Diseases Prevention and Treatment Institute of PLA, PLA 91st Central Hospital, Jiaozuo, China, <sup>2</sup> Department of Histology and Embryology, Chongqing Key Laboratory of Neurobiology, Army Medical University (Third Military Medical University), Chongqing, China

#### Edited by:

Zhexing Wen, Emory University School of Medicine, United States

#### Reviewed by:

Tomas J. Ekstrom, Karolinska Institutet (KI), Sweden

Ting Zhao, University of Pennsylvania, United States

> \*Correspondence: Jingli Gan cco91@163.com

#### Specialty section:

This article was submitted to Epigenomics and Epigenetics, a section of the journal Frontiers in Genetics

Received: 26 September 2018 Accepted: 06 November 2018 Published: 22 November 2018

#### Citation:

Chen X, Duan H, Xiao L and Gan J (2018) Genetic and Epigenetic Alterations Underlie Oligodendroglia Susceptibility and White Matter Etiology in Psychiatric Disorders. Front. Genet. 9:565. doi: 10.3389/fgene.2018.00565

Numerous genetic risk loci are associated with major neuropsychiatric disorders, represented by schizophrenia. The pathogenic roles of genetic risk loci in psychiatric diseases are further complicated by their association with cell lineage- and/or developmental stage-specific epigenetic alterations. Besides aberrant assembly and malfunction of neuronal circuitry, an increasing volume of discoveries clearly demonstrates impairment of oligodendroglia and disruption of white matter integrity in psychiatric diseases. Nonetheless, whether and how genetic risk factors and epigenetic dysregulations for neuronal susceptibility may affect oligodendroglia is largely unknown. In this mini-review, we discuss emerging evidence regarding the functional interplay between genetic risk loci and epigenetic factors, which may underlie compromised oligodendroglia and myelin development in neuropsychiatric disorders. Transcriptional and epigenetic factors are the major aspects affected in oligodendroglia. Moreover, multiple disease susceptibility genes are connected by epigenetically modulated transcriptional and post-transcriptional mechanisms. An oligodendroglia-specific complex molecular orchestra may explain how distinct risk factors lead to the common clinical expression of white matter pathology in neuropsychiatric disorders.

Keywords: psychiatric disorders, schizophrenia, oligodendroglia, myelin, white matter, genetic risk loci, epigenetic dysregulation

# INTRODUCTION

The highly complex etiology of neuropsychiatric disorders, which is influenced by both genetic predisposition and environmental factors, has been a major challenge in understanding these devastating diseases, like schizophrenia. In recent years, genomic studies have uncovered the complex genetic architecture of psychiatric disorders including thousands of genetic loci (Sullivan et al., 2012). Multiple genome-wide association studies (GWAS) have further produced remarkable findings in different populations. Besides, rare mutations, which are sufficiently deleterious with a low frequency, also play important roles in the pathogenesis (Abecasis et al., 2012). These advances in genetics have updated our understanding of psychiatric disorders. However, since some common symptoms of major psychiatric diseases have already been well-defined, how numerous genetic alterations affect the function of multiple distinct genes and thereafter lead to similar clinical phenotypes is a puzzle.

Apart from the genetic component, environmental risk factors, including biological and psychosocial ones, are also involved in the onset of mental diseases (Owen et al., 2016). Environmental insults induce stable changes in gene expression, which are governed by epigenetic modifications (Nestler et al., 2016). Epigenetic modifications, including DNA methylation, histone modifications, and non-coding RNAs, exert functional control over genetic information by regulating chromatin accessibility and gene transcription (Shorter and Miller, 2015). Chromatin accessibility analysis (assay for transposase accessible chromatin followed by sequencing, ATAC-seq) and transcriptome evidence suggest a possible molecular framework of how genetic alterations and epigenetic factors interact with each other (Guo et al., 2017; Fan et al., 2018).

Decades of extensive investigations have revealed aberrant neuronal circuit assembly and synaptic malfunction in neuropsychiatric diseases, which form the mechanistic basis for most current antipsychotics and new therapeutic development (Forrest et al., 2018). Recent studies of genetic alterations strongly support a neuronal susceptibility in psychiatric diseases. A large-scale GWAS has also implicated common variation in genes encoding the glutamate receptors, dopamine receptors, and members of the voltage-gated calcium channel family of proteins (Schizophrenia Working Group of the Psychiatric Genomics Consortium, 2014). Studies of rare mutations indicate that genes encoding a variety of synaptic proteins and the above-mentioned voltage-gated calcium channel related proteins are involved in the pathogenesis (Hall et al., 2015). Besides, epigenetic alterations underlying aberrant gene regulation in neurons have also been implicated in psychiatric disorders (Tsankova et al., 2007; Iwamoto and Kato, 2009; Guidotti et al., 2016).

Moreover, in recent years, neuropathological, neuroimaging, and genetic studies clearly revealed developmental defects in oligodendroglia/myelin formation and disrupted white matter integrity in major psychiatric disorders, including schizophrenia, depression, and bipolar disorders (Fields, 2008; Martins-de-Souza, 2010; Edgar and Sibille, 2012). Postmortem and brain imaging evidence showed volume reduction and ultrastructural changes of white matter in the prefrontal cortex of schizophrenic patients (Sanfilipo et al., 2000; Staal et al., 2000; Van Haren et al., 2004; Thong et al., 2014). Moreover, several postmortem studies revealed a loss of oligodendroglia in multiple brain regions (Hof et al., 2003; Stark et al., 2004; Byne et al., 2008), as well as some apoptotic and necrotic signs in oligodendroglia (Uranova et al., 2001). Another study further identified a decrease in the total number of oligodendroglia lineage cells, but not the number of progenitor cells, implying impaired differentiation of oligodendroglia (Mauney et al., 2015). In schizophrenia, impaired myelin integrity occurs in frontal areas even before disease onset and extends to more brain regions at later stages (Friedman et al., 2008; Yao et al., 2013; Holleran et al., 2014; Liu et al., 2014), suggesting that oligodendroglia and myelin deficits are involved in the early pathogenesis. Besides, some neurological disorders characterized by white matter abnormalities, such as leukodystrophies and multiple sclerosis, showed some psychotic symptoms (Walterfang et al., 2005; Mckay et al., 2018). Rodent models with impaired oligodendroglia development and myelin deficits exhibit various phenotypes reminiscent of psychiatric disorders (Chen et al., 2015; Poggi et al., 2016). Importantly, some clinical studies found that abnormal frontal myelin integrity was correlated with cognitive symptoms in first-episode patients with schizophrenia (Perez-Iglesias et al., 2010; Kuswanto et al., 2012). These results strongly suggest that impaired oligodendroglia development and myelination could be related to the etiology of psychiatric disorders rather than merely being accompanying pathological abnormalities.

However, several questions need further discussion. How do genetic alterations and epigenetic dysregulations influence oligodendroglia besides neurons? Are there any cell-type-specific or developmental-stage-specific mechanisms in oligodendroglia? Whether and how do multiple risk factors coordinate to play roles in oligodendroglia? In this mini-review, we will try to address these questions based on emerging concepts of genetic and epigenetic findings in oligodendroglia during the pathogenesis of psychiatric disorders.

# GENETIC ALTERATIONS ASSOCIATED WITH SCHIZOPHRENIA ARE PREDICTED TO AFFECT CODING GENES EXPRESSED IN BOTH NEURONS AND OLIGODENDROGLIA

Major psychiatric disorders, represented by schizophrenia, have been well-recognized as highly polygenic based on genetic epidemiological findings at the population level (Gottesman and Shields, 1967). Recent large-scale GWAS identified more than 100 genetic loci (Schizophrenia Working Group of the Psychiatric Genomics Consortium, 2014; Li et al., 2017), which contained 332 Reference Sequence annotated genes (Birnbaum et al., 2015). When cross-referenced to microarray probes in the BrainCloud data set, 239 genes were further mapped out (Birnbaum et al., 2015), which included both protein coding genes and non-coding genes. Notably, most of these genes were not cell-type-specific (Mckenzie et al., 2018). In addition to the GWAS data, recent whole exome sequencing studies have further identified some rare de novo single nucleotide and insertion/deletion variants in schizophrenia (Fromer et al., 2014; Purcell et al., 2014). If the genetic alterations occur in coding sequence, they may cause a destabilization of the protein conformation and aberrant posttranslational modifications (**Figure 1**; Ishizuka et al., 2017), thereafter resulting in functional deficits within various cell types.

In recent years, substantial functional studies of risk genes focused on neuronal function (Owen et al., 2016). Even though previous clinical studies demonstrated that several oligodendroglia and myelin related genes were significantly associated with major mental illnesses, such as MAG (Wan et al., 2005; Voineskos et al., 2013), Olig2 (Georgieva et al., 2006; Mitkus et al., 2008), and CNP (Voineskos et al., 2008), the function of a majority of schizophrenia related genes was poorly explored in oligodendroglia, which in turn hampers the understanding of pathological mechanisms of causal genes in oligodendroglia.

# NON-CODING VARIANTS AFFECT CHROMATIN ACCESSIBILITY AND GENE TRANSCRIPTION

Non-synonymous protein coding changes cannot explain the majority of disease related genetic variants. A recent study found that the majority of GWAS hits occur within intergenic and intronic regions of the genome (Hindorff et al., 2009). Moreover, 85% of non-synonymous variants and more than 90% of stop-gain and splice-disrupting variants occur at low frequency (below 0.5%) and could be highly deleterious (Abecasis et al., 2012). For example, alterations to splicing site sequences, nucleotides adjacent to splice junctions and splicing regulatory sequences can severely influence gene splicing (Reble et al., 2018), which may explain global changes in alternatively spliced transcripts of risk genes (e.g., ERBB4 and DISC1) in psychiatric disorders (**Figure 1**; Nakata et al., 2009; Chung et al., 2018).

In fact, some genetic effects are attributable not only to the common single nucleotide polymorphisms identified in GWAS, but also to rare variants (Yu et al., 2018), copy-number variations (Malhotra and Sebat, 2012) and other types of mutations (Hall et al., 2015). A previous study identified schizophrenia related rare variants concentrated in promoter and enhancer regions, but not insulators (Duan et al., 2014). Another study further found individuals harboring rare variants in conserved transcription factor binding motifs, untranslated regions of genes and non-coding RNAs (Abecasis et al., 2012). Therefore, these risk alleles could result in differential transcription factor binding and chromatin accessibility, implying subsequent effects on the regulation of gene expression (Zhang et al., 2018). Moreover, DNaseI sequencing has recently been applied to analyze chromatin accessibility at a genome-wide level, which led to the discovery of genetic variants that modify chromatin accessibility and may in turn disrupt cis-regulatory elements. These are major mechanisms that could explain how genetic alterations lead to gene expression differences (Degner et al., 2012; Maurano et al., 2012).

Open chromatin, which is accessible for DNA-binding proteins, plays key roles in securing ordered spatiotemporal regulation of gene expression. Notably, some studies found that most of the identified open chromatin regions were enriched in promoters, enhancers and well-defined cell-type markers. Moreover, these regions were differentially accessible between neurons and non-neuronal cells, which may be related to the fact that neuronal open chromatin regions were more evolutionarily conserved and were enriched in distal regulatory elements as compared with non-neuronal cells (Fullard et al., 2017). Even though the precise molecular mechanism underlying this cell-type-specific discrepancy is not clear at present, it raises the possibility that genetic variants could exhibit cell-type-specific effects and differentially affect neuronal function and oligodendroglia function. Besides, the chromatin accessibility of oligodendroglia differentiation inhibitors and the dynamic expression of transcription factors and non-coding RNAs differ, and in some cases are opposite, between oligodendroglia progenitor cells and mature oligodendroglia (Emery and Lu, 2015), which sheds light on the possibility that genetic variants occurring in these factors could differently interfere with oligodendroglia function at various developmental stages.

# EPIGENETIC DYSREGULATION OF RISK GENES AFFECTS OLIGODENDROGLIA DEVELOPMENT AND MYELINATION

Oligodendroglia differentiation and myelination are tightly controlled by epigenetic regulation. The transition from progenitor cells to mature oligodendroglia is characterized by rapid and substantial chromatin remodeling (Nielsen et al., 2002; Liu et al., 2012), which is governed by epigenetic regulators involved in histone modification and DNA methylation. Besides, transcriptional activators and repressors negotiate through the underlying chromatin organization and play critical roles during the specification, differentiation and myelination of oligodendroglia (Emery and Lu, 2015). These findings suggest that dysregulation of the epigenetic regulators and transcription factors may impair oligodendroglia development and myelination.

In fact, some oligodendroglia-specific risk factors (e.g., OLIG2, SOX10, and CNP), which are crucial for oligodendroglia development and myelination, are robustly dysregulated in major mental illnesses (Tkachev et al., 2003; Kato and Iwamoto, 2014). A recent genome-wide methylation analysis revealed overall disease related differential methylation of 817 genes in promoter regions (Wockner et al., 2014), which also included some oligodendroglia-specific risk genes. These findings imply aberrant epigenetic mechanisms that regulate the expression of key risk genes in oligodendroglia. In particular, the DNA methylation of SOX10 was associated with oligodendroglia dysfunction in schizophrenia (Iwamoto et al., 2005), suggesting that epigenetic mechanisms causing functional deficits of risk genes could increase vulnerability to oligodendrocyte dysfunction during the pathogenesis.

Furthermore, several studies found a dysregulation of epigenetic regulators in the brains of psychiatric cohorts, including histone modification factors (e.g., histone acetyltransferase, histone deacetylase), DNA methylation factors (e.g., DNA methyltransferase, DNA methylase, and DNA demethylase) and microRNAs (Nestler et al., 2016). Therefore, if the epigenetic regulators are dysregulated in oligodendroglia, they may lead, at least in part, to the abnormal expression of oligodendroglia-specific risk genes. However, as the epigenetic regulators are commonly expressed within various cell types, whether the aberrant epigenetic mechanisms exhibit cell-type specificity remains largely unclear.

In our previous study, we found that histone acetylation could affect transcription of FEZ1, a well-defined schizophrenia risk gene (Yamada et al., 2004; Kang et al., 2011), in oligodendroglia rather than in neurons (Chen et al., 2017), which might be due to variable contribution of transcription factor binding, providing an example of oligodendroglia-specific epigenetic regulation of risk genes. However, much of the epigenetic landscape remains unexplored in both neurons and oligodendroglia. Therefore, in order to analyze the abnormal epigenetic modifications in nuclei in depth and further understand the cell-type-specificity issue, chromatin modification and transcriptional profiling analyses could be applied to neuronal and oligodendroglia nuclei, which can be isolated from frozen postmortem human brain by fluorescence activated nuclear sorting (FANS) (Fullard et al., 2017).

# GENE TRANSCRIPTION IS A MAJOR ASPECT AFFECTED IN OLIGODENDROGLIA, LIKELY COOPERATING WITH EPIGENETIC DYSREGULATION

Strikingly, in a recent study, researchers analyzed genetic variants in glial type-specific schizophrenia risk genes by cross-referencing GWAS data to a previously published microarray database, and identified 1650 schizophrenia associated genes highly enriched in oligodendroglia (Goudriaan et al., 2014). Furthermore, the functional gene set analysis revealed three oligodendroglia-specific gene sets that were significantly associated with schizophrenia, including lipid metabolism, oxidation-reduction, and gene transcription (Goudriaan et al., 2014). Notably, the gene transcription set was the largest one and accounted for 47% of all disease related genes in oligodendroglia. Besides, the association between oligodendroglia gene transcription and disease depended on the accumulated effects of multiple genes rather than the effects of a few genes (Goudriaan et al., 2014), implying that the transcription regulators in oligodendroglia could coordinate to play functional roles during the pathogenesis.

We further analyzed the aforementioned gene transcription set, which included 123 genes in total. As described in the Entrez Gene database<sup>1</sup>, these genes are involved in histone modifications, transcription coactivation/corepression, transcription initiation and chromatin remodeling. Moreover, 59% of these genes are well-defined transcription factors. As transcription factors are crucial for oligodendroglia differentiation and myelination (Emery and Lu, 2015), both genetic deficits and epigenetic dysregulation of these factors may cause functional deficits of risk genes and ultimately lead to abnormalities of oligodendroglia and/or white matter.

<sup>1</sup>http://www.ncbi.nlm.nih.gov/gene

### MOLECULAR NETWORK OF RISK GENES IN OLIGODENDROGLIA IS CONNECTED BY EPIGENETICALLY REGULATED TRANSCRIPTION AND POST-TRANSCRIPTIONAL MECHANISMS


However, beyond their individual roles, whether and how the oligodendroglia-specific risk factors, including the above-mentioned transcription factors, coordinate to affect oligodendroglia function and myelination remains largely unclear. For example, HDAC activity is necessary for oligodendroglia differentiation (Conway et al., 2012). When overall histone acetylation was changed in oligodendroglia, the expression of multiple factors related to psychiatric disorders was altered (Conway et al., 2012; Chen et al., 2017). In our previous study, by inhibiting histone deacetylation in oligodendroglia, we demonstrated a sophisticated molecular orchestra that regulates the expression of the risk gene FEZ1 (Chen et al., 2017). First, HDAC inhibition directly increased histone acetylation at the Fez1 promoter and thereby opened the chromatin in this DNA region. Second, HDAC inhibition in oligodendroglia altered the expression of some transcription activators and repressors related to psychiatric diseases, which may cause an imbalance of transcription activity and thereby indirectly regulate FEZ1 transcription. Third, the transcriptional repressor inhibited the activators, which reinforced the cross-talk network among the various transcription factors. Together with another study (Deng et al., 2017), these findings reveal coordination between HDAC-mediated chromatin remodeling and transcription factors, raising the intriguing possibility that DNA accessibility of risk genes and an oligodendroglia-specific transcription factor orchestra may functionally interact to regulate oligodendroglia function during pathogenesis.

Besides epigenetic factors involved in gene transcription, some post-transcriptional regulators are also well recognized as risk factors in oligodendroglia. For example, substantial evidence supports quaking as a glia-expressed schizophrenia risk factor (Aberg et al., 2006a,b). The quaking protein, which controls the nuclear export, stability and translation of its bound mRNAs in oligodendroglia (Zhao et al., 2006; Bockbrader and Feng, 2008), is essential for oligodendroglia and myelin development (Chen et al., 2007). In addition to the well-known myelin-related mRNAs involved in schizophrenia (Aberg et al., 2006a), we found that the quaking protein can regulate FEZ1 mRNA stability by directly binding to the 3′UTR of FEZ1 mRNA in the cytoplasm of oligodendroglia (Chen et al., 2017), hence connecting the function of two risk genes in oligodendroglia at the level of post-transcriptional regulation. Taken together, these studies provide an example of how multiple risk factors in oligodendroglia are functionally connected in schizophrenia at the transcriptional and post-transcriptional levels (**Figure 1**), abnormalities of which may underlie oligodendroglia susceptibility and white matter etiology.

## CONCLUSION

Genetic alterations, as a large component of mental disease etiology, affect not only neuronal function but also myelinating oligodendroglia. Moreover, these alterations need to interact with epigenetic regulators to produce functional effects. Genetic variants located in gene coding regions could lead to aberrant protein conformation and post-translational modifications. Genetic variants within non-coding regions are mostly located in cis-regulatory elements, including promoters and enhancers, and exert deleterious effects by affecting chromatin accessibility and transcription factor binding motifs (**Figure 1**).

Transcription and epigenetic regulation are important for oligodendroglia development and are also affected during oligodendroglia pathogenesis. The ordered pattern of chromatin remodeling and transcription factor expression during oligodendroglia development and myelin formation raises the possibility that genetic alterations and epigenetic dysregulation may exert functional effects at specific developmental stages, depending on the expression pattern and functional requirements of the epigenetic factors. Moreover, multiple risk factors in oligodendroglia could functionally interact at the transcriptional and post-transcriptional levels to affect oligodendroglia function during pathogenesis, suggesting that malfunction of a molecular orchestra involving distinct risk factors may lead to the common pathophysiology of psychiatric diseases.

# AUTHOR CONTRIBUTIONS

XC and JG designed this study and drafted the manuscript. All authors contributed to reviewing and editing the final manuscript.

# FUNDING

This work was supported by the Research Project of Medical Science of the Chinese PLA (CWS12J071) to JG and an NSFC grant (3167060380) to LX.

#### REFERENCES

Abecasis, G. R., Auton, A., Brooks, L. D., Depristo, M. A., Durbin, R. M., Handsaker, R. E., et al. (2012). An integrated map of genetic variation from 1,092 human genomes. Nature 491, 56–65. doi: 10.1038/nature11632

Aberg, K., Saetre, P., Jareborg, N., and Jazin, E. (2006a). Human QKI, a potential regulator of mRNA expression of human oligodendrocyte-related genes involved in schizophrenia. Proc. Natl. Acad. Sci. U.S.A. 103, 7482–7487.

Aberg, K., Saetre, P., Lindholm, E., Ekholm, B., Pettersson, U., Adolfsson, R., et al. (2006b). Human QKI, a new candidate gene for schizophrenia involved in myelination. Am. J. Med. Genet. B Neuropsychiatr. Genet. 141B, 84–90.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Chen, Duan, Xiao and Gan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Human Aquaporin 4 Gene Polymorphisms and Haplotypes Are Associated With Serum S100B Level and Negative Symptoms of Schizophrenia in a Southern Chinese Han Population

#### Yung-Fu Wu<sup>1,2</sup>, Huey-Kang Sytwu<sup>3,4</sup>\* and For-Wey Lung<sup>2,5</sup>\*

*<sup>1</sup> Department of Psychiatry, Beitou Branch, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan, <sup>2</sup> Graduate Institute of Medical Science, National Defense Medical Center, Taipei, Taiwan, <sup>3</sup> Department of Microbiology and Immunology, National Defense Medical Center, Taipei, Taiwan, <sup>4</sup> National Health Research Institutes, Zhunan, Taiwan, <sup>5</sup> Calo Psychiatric Center, Pingtung, Taiwan*

#### Edited by:

*Weihua Yue, Peking University Sixth Hospital, China*

#### Reviewed by:

*Jianping Zhang, Zucker Hillside Hospital, United States*

*Yong Xu, First Hospital of Shanxi Medical University, China*

#### \*Correspondence:

*Huey-Kang Sytwu sytwu@ndmctsgh.edu.tw*

*For-Wey Lung forwey@seed.net.tw*

#### Specialty section:

*This article was submitted to Molecular Psychiatry, a section of the journal Frontiers in Psychiatry*

Received: *18 July 2018* Accepted: *19 November 2018* Published: *11 December 2018*

#### Citation:

*Wu Y-F, Sytwu H-K and Lung F-W (2018) Human Aquaporin 4 Gene Polymorphisms and Haplotypes Are Associated With Serum S100B Level and Negative Symptoms of Schizophrenia in a Southern Chinese Han Population. Front. Psychiatry 9:657. doi: 10.3389/fpsyt.2018.00657*

Background: Aquaporin 4 (AQP4) polymorphism may influence the required dosage of antipsychotic drugs. However, the roles of AQP4 polymorphisms in the blood–brain barrier (BBB) and different neuroprotective effects need further exploration. This study aims to investigate whether the gene polymorphisms and haplotype of AQP4 are associated with serum S100 calcium-binding protein B (S100B) level and clinical symptoms in patients with schizophrenia (SCZ).

Methods: We recruited 190 patients with SCZ. They provided demographic data, completed relevant questionnaires, and submitted samples to test for four AQP4 tag single nucleotide polymorphisms (SNPs) and eight haplotypes. The Positive and Negative Syndrome Scale (PANSS), Personal and Social Performance (PSP), Global Assessment of Functioning (GAF), and Clinical Global Impression (CGI) rating scales were assessed, and serum S100B levels were measured, repeatedly during antipsychotic treatment at weeks 0 (baseline), 3, 6, and 9. Using generalized estimating equation (GEE) analyses, log-transformed S100B (logS100B) level was tested for associations with haplotype and other independent variables.

Results: Discretization via the median split procedure showed that logS100B level >1.78 or ≤1.78 had the best discriminant validity to stratify the patients into two groups. After 9 weeks of treatment, the serum S100B level was decreased. The TAA haplotype of AQP4 SNPs was associated with increased serum S100B level (*p* = 0.006). The PANSS negative subscale (PANSS-N) (*p* = 0.001) and Clinical Global Impression–Improvement (CGI-I) (*p* = 0.003) scores had a positive association with S100B level.

Conclusion: Patients with the TAA haplotype of the AQP4 polymorphism are likely to have increased serum S100B level, negative symptoms and poor control of neuroinflammation. A logS100B level >1.78 may be sufficiently specific to predict a higher severity of negative symptoms. Further study including healthy controls and patients with first and recurrent episodes under selective AQP4 modulators will be necessary to explore the profound effects on the treatment of patients with SCZ and may positively influence their overall outcome.

Keywords: schizophrenia (SCZ), aquaporin 4 (AQP4), single nucleotide polymorphism (SNP), haplotype, S100 calcium-binding protein B (S100B)

# BACKGROUND

Schizophrenia (SCZ) is a complex chronic psychiatric disorder; its clinical psychopathology involves cognition, perception, emotion, and other aspects of behavior, although the presentation of these features differs across patients and over time. The lifetime prevalence of SCZ is 0.6–1.9% (1). Recent genome-wide studies suggest immune involvement in SCZ (2). Sustained inflammatory activation of microglia and astrogliosis are important mechanisms in the progression of neuroinflammation. Neuroinflammation is frequently associated with blood–brain barrier (BBB) dysfunction and is a common pathological event in neuropsychiatric diseases. An accumulating body of evidence points to an association between neuroinflammation and SCZ (3). Increasing numbers of studies have shown that SCZ involves a chronic process of neuroinflammation in the brain (4, 5).

Aquaporin-4 (AQP4) is a water-channel protein that is highly expressed in the human body, primarily at the end-feet of astrocytes surrounding capillaries (6, 7). In addition, AQP4 is involved in BBB development, function, and integrity (8). Apart from its function in water homeostasis, many studies have shown possible inter-relations between AQP4 and neuroinflammation (9). This protein plays an important role in various pathological conditions of the brain, such as SCZ (10, 11).

S100 calcium-binding protein B (S100B) is a member of the S100 protein family and is abundant in astroglial cells. The protein has therefore been considered a glial marker protein (12–14). Given its high level of expression in brain tissue, most studies of the relation between S100 and neurodegeneration have focused on S100B in particular. Increased S100B concentrations are mainly considered to be the result of astroglial or BBB dysfunction. Studies have reported that blood levels of S100B are increased in SCZ (15–18). Therefore, S100B may be useful in the development of a diagnostic biomarker signature of SCZ (19).

The existence of single nucleotide polymorphisms (SNPs) from different forms of DNA sequence variation may explain the possible genetic risk for SCZ (20). Our previous study found that AQP4 polymorphism may influence the required dosage of antipsychotic drugs (11). However, the roles of AQP4 polymorphisms in the BBB and different neuroprotective effects need further exploration. Here, we hypothesized that astroglial AQP4 would play a crucial role in the regulation of the extent of neuroinflammation and the illness severity in SCZ. We aimed to investigate whether the gene polymorphisms and haplotype of AQP4 are associated with serum S100B level and clinical symptoms in patients with SCZ, while controlling related factors.

#### METHODS

#### Participants and Procedure

A total of 190 patients with SCZ, recruited from psychiatric wards or outpatient departments in Taiwan, completed the study. All patients were interviewed face-to-face by a senior psychiatrist and fulfilled the criteria for SCZ, based on the Mini International Neuropsychiatric Interview (MINI) (21) for DSM-IV criteria and the International Classification of Diseases-10 (WHO-ICD-10). This study was approved by the Independent Ethics Committee/Institutional Review Board in Taiwan. The exclusion criteria were: abnormal value of C-reactive protein (CRP) or erythrocyte sedimentation rate (ESR); psychosis other than SCZ; intellectual disability; substance-related and addictive disorders; other neurocognitive disorders; and having undergone electroconvulsive therapy within the past 6 months. The exclusion criteria also included active medical illnesses that could be etiologically related to the level of S100B (e.g., uncontrolled cancer, autoimmune or infectious diseases, or a cardiovascular incident within the past 6 months). All participants provided written informed consent and demographic data, and completed the relevant assessments of clinical symptoms and global function. Whole blood samples were obtained in the morning for testing. EDTA-anticoagulated blood samples were collected by venipuncture for DNA extraction and subsequent tag SNP genotyping. An additional venous blood sample was withdrawn at the same time for human serum S100B ELISA detection. Both the assessment of clinical symptoms and the serum S100B level were recorded at baseline (week 0) and at 3, 6, and 9 weeks. Each recruited participant with SCZ took typical and atypical antipsychotics according to their physician's choice.

#### Demographic Variables

A self-report questionnaire was used to obtain demographic information, including age, gender, educational level, marital status, military service, age at diagnosis of SCZ, smoking and family history of mental disorder.

#### Isolation of DNA, SNP Selection, and Genotyping

Genomic DNA was extracted from peripheral blood leukocytes using a salting out method. Based on the HapMap data for Han Chinese in the Beijing population, tag SNPs across the entire region of the AQP4 gene were selected using the tagger algorithm (http://www.broadinstitute.org/mpg/tagger/) with a pairwise approach, an r<sup>2</sup> cutoff of 0.8 and a minor allele frequency >0.05. A total of four tag SNPs in two distinct gene regions were retrieved: in the 3′UTR region (rs1058424, rs335929, rs3763043) and in the intronic region between exons 4 and 5 (rs335931). Tag SNP genotyping was performed with TaqMan allele-specific discrimination assays on an ABI PRISM 7700 Sequence Detection System and analyzed with the SDS software (Applied Biosystems, Foster City, CA).
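The pairwise r<sup>2</sup> criterion used for tag SNP selection can be illustrated with a minimal sketch. This is not the tagger implementation; the function names and the toy haplotype data are hypothetical, and the input is assumed to be already phased (one two-character string per chromosome).

```python
from collections import Counter


def minor_allele_frequency(alleles):
    """Frequency of the rarer allele at one SNP (0.0 if monomorphic)."""
    counts = Counter(alleles)
    if len(counts) < 2:
        return 0.0
    return min(counts.values()) / len(alleles)


def ld_r2(haplotypes):
    """Squared correlation (r^2) between two biallelic SNPs.

    `haplotypes` holds one 2-character string per phased chromosome,
    e.g. "TA" = allele T at the first SNP and allele A at the second.
    """
    n = len(haplotypes)
    alleles_1 = sorted({h[0] for h in haplotypes})
    alleles_2 = sorted({h[1] for h in haplotypes})
    if len(alleles_1) != 2 or len(alleles_2) != 2:
        raise ValueError("both SNPs must be biallelic and polymorphic")
    a, b = alleles_1[0], alleles_2[0]            # reference allele at each SNP
    p_a = sum(h[0] == a for h in haplotypes) / n
    p_b = sum(h[1] == b for h in haplotypes) / n
    p_ab = Counter(haplotypes)[a + b] / n        # frequency of the a-b haplotype
    d = p_ab - p_a * p_b                         # disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


# Hypothetical toy data: two SNPs in complete LD, and two unlinked SNPs.
linked = ["TA"] * 6 + ["AC"] * 4
unlinked = ["TA", "TC", "AA", "AC"]
print(ld_r2(linked), ld_r2(unlinked))
```

Under a pairwise tagging scheme, a SNP with `ld_r2` of at least 0.8 to an already chosen tag would be considered captured, and only SNPs with `minor_allele_frequency` above 0.05 would be candidates.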

#### Haplotype Reconstruction

We used PHASE (22) to estimate the haplotypes from the tag SNPs. The software PHASE version 2.1.1 (http://stephenslab.uchicago.edu/phase/download.html) uses a Gibbs sampling approach in which each individual's haplotypes are updated conditionally upon the current estimates of haplotypes from all other samples. Approximations to the distribution of a haplotype conditional upon a set of other haplotypes were used for the conditional distributions of the Gibbs sampler.
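To see why statistical phasing is needed at all, the following sketch enumerates every haplotype pair compatible with one unphased multi-SNP genotype. It is only an illustration of phase ambiguity, not the PHASE algorithm; the function name is hypothetical, and the allele labels (T/A, A/C, A/G) are an assumption inferred from the TAA and ACG haplotype names reported in this article.

```python
from itertools import product


def compatible_haplotype_pairs(genotype):
    """List the unordered haplotype pairs consistent with a genotype.

    `genotype` is a list of (allele, allele) tuples, one per SNP, with no
    phase information. A genotype with k heterozygous sites (k >= 1) has
    2**(k - 1) compatible pairs.
    """
    pairs = set()
    # For every SNP, choose which of its two alleles sits on haplotype 1.
    for choice in product((0, 1), repeat=len(genotype)):
        hap1 = "".join(site[c] for site, c in zip(genotype, choice))
        hap2 = "".join(site[1 - c] for site, c in zip(genotype, choice))
        pairs.add(tuple(sorted((hap1, hap2))))
    return sorted(pairs)


# Assumed alleles for a patient heterozygous at all three 3'UTR tag SNPs.
triple_het = [("T", "A"), ("A", "C"), ("A", "G")]
for pair in compatible_haplotype_pairs(triple_het):
    print(pair)
```

For this triple heterozygote there are four equally consistent pairs, only one of which is TAA/ACG. A Gibbs sampler such as the one in PHASE resolves the ambiguity by weighting each candidate pair according to the haplotypes currently inferred for the other individuals.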

#### Human S100B ELISA Detection

Samples of venous blood (5 ml) were withdrawn to measure the level of S100B protein. Blood samples were collected by venipuncture into tubes containing heparin. Plasma samples were centrifuged at 1,008 × g for 10 min. The samples were maintained at −80°C before performing the assays. The concentration of human S100B in serum was measured with an ELISA kit. A Multiskan FC microplate photometer (Thermo Scientific) was used for reading at 450 nm. The results were expressed in pg/ml.

#### Clinical Symptoms and Function Assessment

Assessment of clinical symptoms and function involved the rating scales of Positive and Negative Syndrome Scale (PANSS), the Global Assessment of Functioning (GAF), Personal and Social Performance (PSP), Clinical Global Impression– Severity (CGI-S) and Clinical Global Impression–Improvement (CGI-I).

#### Data Processing and Statistical Analyses

S100B values were log-transformed to normalize the data because of their non-normal distribution. To define groups with high and low baseline levels of S100B, this variable was dichotomized by a median split. The cut-off value of the baseline log-transformed S100B (logS100B) was determined to be 1.78. To investigate the differences between groups, we genotyped patients for the risk SNPs. Hardy–Weinberg equilibrium (HWE) of the four SNPs was assessed from the allele and genotype frequencies. A goodness-of-fit χ<sup>2</sup> test was used to test for HWE, and a Pearson χ<sup>2</sup> test was used to compare allele distributions. SPSS 23.0 (SPSS Inc., Chicago, IL) was used for demographic, descriptive, and exploratory analyses. Logistic regression analysis was used to investigate the risk level of the tested SNPs associated with SCZ, before and after adjustment for age and gender. The generalized estimating equation (GEE), developed by Zeger and Liang in 1986 (23), was applied to handle missing data due to absenteeism or failure to complete the assessment on time. The primary objective of this analysis was to explore the relationship between clinical effects and serum logS100B levels. The GEE was used to analyze the independent variables of gender, age, age at onset, duration of illness, educational level, military service, haplotypes of tag SNPs, smoking history, family history of mental disorder and other outcome variables. The dependent variable was set as the logS100B level.
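As an illustration of the goodness-of-fit test for Hardy–Weinberg equilibrium mentioned above, a minimal standard-library sketch is shown here. It is not the authors' SPSS procedure; the function name and the genotype counts are hypothetical. For a biallelic SNP the test has 1 degree of freedom (three genotype classes, minus one, minus one estimated allele frequency), and for 1 degree of freedom the χ<sup>2</sup> tail probability equals erfc(√(χ<sup>2</sup>/2)).

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Goodness-of-fit chi-square test for Hardy-Weinberg equilibrium.

    Arguments are the observed counts of the three genotypes at a
    biallelic SNP. Returns (chi2, p_value) with 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype must be observed")
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1 - p                         # frequency of allele B
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: the first set matches HWE exactly, the second
# has no heterozygotes at all and departs from it strongly.
print(hwe_chi_square(25, 50, 25))
print(hwe_chi_square(50, 0, 50))
```

A SNP whose p-value falls below the chosen threshold would be flagged as deviating from HWE, which often points to genotyping error or population structure.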

# RESULTS

A total of 190 patients with SCZ from the Southern Chinese Han population participated in our study. Of these, 93 joined the study in acute relapse and 97 were in stable condition. Of the patients, 77.5% were treated with one antipsychotic drug (monotherapy) and 22.5% received two or more different antipsychotics (polypharmacy). **Table 1** shows the clinical and demographic information of the participants, and baseline statistics between groups. We recorded each participant's age, gender, educational level, smoking, marital status, family history of mental disorder, military service, age at diagnosis of SCZ, and baseline ESR and CRP levels. None of these variables differed significantly between groups. Repeated measurements of PANSS, PSP, GAF, CGI-S, and CGI-I scores and S100B level were collected longitudinally to assess change over time. Comparing the high S100B group with the low S100B group, significant differences were observed in the baseline PANSS total scores (p = 0.01) as well as in the positive (p = 0.035), negative (p = 0.021) and general (p = 0.017) subscales **(Table 2)**. The baseline GAF level (p = 0.035) and final CGI-I scores (p = 0.006) also differed significantly. Multiple bar graphs show the changes in S100B, PANSS, and other outcome variables at the different time-points between groups (**Figure 1**).

TABLE 1 | Clinical and demographic information of the participants, and baseline statistics between groups.

*<sup>a</sup>ESR, erythrocyte sedimentation rate; <sup>b</sup>CRP, C-reactive protein.*
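The high and low S100B groups compared here come from a median split of log-transformed baseline values. A minimal sketch of that dichotomization follows; the function name and the S100B concentrations are hypothetical, and the base-10 logarithm is an assumption, since the article does not state which base was used.

```python
import math
from statistics import median


def median_split(s100b_pg_per_ml):
    """Log-transform S100B values and split them at the median.

    Returns (cutoff, groups), where groups[i] is "high" when the
    log10 value of patient i exceeds the median and "low" otherwise.
    """
    logs = [math.log10(v) for v in s100b_pg_per_ml]
    cutoff = median(logs)
    groups = ["high" if x > cutoff else "low" for x in logs]
    return cutoff, groups


# Hypothetical baseline concentrations in pg/ml.
cutoff, groups = median_split([10, 100, 1000, 10000])
print(cutoff, groups)
```

Values equal to the cutoff fall in the "low" group, matching the >1.78 versus ≤1.78 convention used in this article.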


\**p* < *0.05,* \*\**p* < *0.01.*

*S100B, S100 calcium-binding protein B; PANSS-T, total scores of the positive and negative syndrome scale; PANSS-P, positive subscale of the positive and negative syndrome scale; PANSS-N, negative subscale of the positive and negative syndrome scale; PANSS-G, general subscale of the positive and negative syndrome scale; PSP, personal and social performance; GAF, global assessment of functioning; CGI-S, clinical global impression-severity; CGI-I, clinical global impression–improvement.*

FIGURE 1 | Comparison of the high and low S100B groups. Significant differences were observed in (A) baseline PANSS total scores (*p* = 0.01) and in the (B) positive (*p* = 0.035), (C) negative (*p* = 0.021), and (D) general (*p* = 0.017) subscales, as well as in (F) baseline GAF level (*p* = 0.035) and (H) final CGI-I scores (*p* = 0.006). \**p* < 0.05. \*\**p* < 0.01.


*MAF, Minor allele frequency.*

**Table 3** shows the characteristics of the four genotyped AQP4 tag SNPs. **Table 4** shows the allele and genotype frequencies for each tag SNP. The frequency of the T allele of rs1058424 differed significantly between the two groups (p = 0.013). Among the four genotyped tag SNPs, rs335931 exhibited a low level of linkage disequilibrium and was therefore excluded. We chose the three other tag SNPs (rs1058424, rs335929, and rs3763043) for further PHASE analysis. **Table 5** shows that the TAA haplotype differed significantly between the two groups (p = 0.018), as did the ACG haplotype (p = 0.048). Although neither P-value remained significant after Bonferroni correction, the FDR-adjusted P-value of the TAA haplotype remained meaningful, suggesting that this association is less likely to be a false positive.
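The difference between the two multiple-testing corrections can be made concrete with a short sketch. The article does not name the FDR procedure, so the Benjamini–Hochberg step-up method is assumed here; the function names and the p-values in the example are hypothetical and are not the values from Table 5.

```python
def bonferroni(p_values):
    """Bonferroni adjustment: multiply each p-value by the number of tests."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg FDR-adjusted p-values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for four tests.
p_values = [0.01, 0.02, 0.03, 0.04]
print(bonferroni(p_values))
print(benjamini_hochberg(p_values))
```

In this toy case Bonferroni leaves only the smallest p-value below 0.05, whereas all four Benjamini–Hochberg adjusted values equal 0.04. This is the general pattern behind a result that fails Bonferroni correction yet survives FDR adjustment: Bonferroni controls the chance of any false positive, while FDR controls the expected proportion of false positives among the findings.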

We used GEE methodology to analyze the data and to investigate the factors possibly related to logS100B level (**Table 6**). The logS100B levels decreased over the measured time intervals, and this change was statistically significant (β = 1.765, p < 0.001) in comparison with the baseline level. The logS100B level had a positive association with the PANSS negative subscale (PANSS-N) score (β = 0.084, p = 0.001) and the CGI-I score (β = 0.288, p = 0.003). In addition, the TAA haplotype had a positive association with the logS100B level (β = 0.254, p = 0.006). Pearson correlation coefficients confirmed the positive correlations among logS100B level, PANSS-N, and CGI-I (**Figure 2**).
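For readers unfamiliar with the method, the general form of the generalized estimating equation is given below. This is the standard textbook formulation rather than a detail reported in this study: the link function and the working correlation structure $R(\alpha)$ used by the authors are not stated, so the identity link shown for logS100B is an assumption.

$$
\sum_{i=1}^{N} D_i^{\top} V_i^{-1}\bigl(Y_i - \mu_i(\beta)\bigr) = 0,
\qquad
D_i = \frac{\partial \mu_i}{\partial \beta},
\qquad
V_i = \phi\, A_i^{1/2} R(\alpha)\, A_i^{1/2}
$$

Here $Y_i$ is the vector of repeated logS100B measurements (weeks 0, 3, 6, and 9) for patient $i$, $\mu_i(\beta) = X_i \beta$ is its marginal mean under the assumed identity link, $A_i$ is the diagonal matrix of variance functions, and $\phi$ is a scale parameter. Under this formulation, each β reported for a covariate is the estimated change in mean logS100B per unit increase in that covariate, with the other covariates held fixed.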

# DISCUSSION

The brain's activation of glial cells may represent an effort to fight neuroinflammation. S100B is a primary product of astrocytes and has been implicated in the regulation of intracellular processes. It exhibits cytokine-like activities and mediates interactions between glial cells and neurons. S100B that is produced in increased amounts and released from activated glial cells may act as a cytokine and interfere with neurodegeneration. S100B is a proposed biomarker of SCZ pathophysiology, diagnosis and progression (24). Persistent astrocyte activation, indicated by increased S100B concentration, may be directed toward an ongoing pathogenic process not successfully limited by glial activation. In our study, the high S100B group differed significantly in baseline PANSS and GAF scores, possibly because of the persistent influence of neuroinflammation.

The use of antipsychotic monotherapy or polypharmacy with either typical or atypical antipsychotics was allowed according to the severity of illness. As recently summarized, antipsychotic drugs can affect glial S100B release (25). Unfortunately, there is no consistent association between S100B level and therapeutic response. Previous studies with repeated measurements have shown either increased or decreased levels of serum S100B during antipsychotic treatment (25). Rothermundt et al. observed that, compared with age- and sex-matched healthy controls, patients with SCZ had increased levels of serum S100B both upon admission and after 12 or 24 weeks of treatment (15). However, Ling et al. (26) and Steiner et al. (27) reported higher baseline levels of S100B in patients with SCZ compared with levels after 6 or 12 weeks of treatment. Our study had similar findings, showing higher levels at baseline and decreased levels of S100B after 3, 6, and 9 weeks of treatment. In support of our findings, it has been suggested that antipsychotic medication may decrease S100B levels and control neuroinflammation in patients with SCZ.

\**p* < *0.05.*

Both negative symptoms and functional impairment are often enduring and resistant to conventional treatments in individuals with SCZ (28). Previous studies have shown that elevated S100B levels are partly correlated with acute exacerbations, compatible with acute illness, and with the severity of negative symptoms, compatible with chronic illness (29, 30). Furthermore, persistently high S100B concentrations correlate with memory impairment in patients with chronic SCZ (31). This shows that negative symptoms may significantly contribute to more severe functional disabilities and outcomes (32). In our study, we demonstrated that patients who had a logS100B value >1.78 showed greater severity of negative symptoms.

In contrast to a recent study, we did not observe a strong association between S100B level and the PANSS positive subscale under antipsychotic medication (33). In addition, no significant correlation was observed between S100B and the PANSS total score or its general subscale scores. Instead, our study found that the serum S100B level had a positive association with the CGI-I and PANSS-N scores after 9 weeks of treatment. A possible explanation for this may be the influence of age and duration of illness. As shown by the descriptive analysis of demographic characteristics in **Table 1**, the high S100B group had a higher average age and a longer duration of illness than the low S100B group. Although the CGI-I score improved, the elevated PANSS-N scores seemed to be the core problem in older patients with chronic illness.

Past studies revealing an unexpected extent of correlation and structure in haplotype patterns (34, 35) led to the development of the Human Haplotype Map project (HapMap), to the benefit of genetics research. Our study shows a possible correlation of the rs1058424 polymorphism and the TAA haplotype of the AQP4 gene with the clinical outcome of patients with SCZ. It is therefore possible that the rs1058424 polymorphism within the 3′UTR may regulate the function of the AQP4 gene. This is believed to be the first study to find an association between the haplotypes of the AQP4 gene polymorphisms and S100B activity in a Southern Chinese Han population. We concluded that patients with the TAA haplotype of AQP4 had an increased level of S100B that may be due to persistent neuroinflammation. The expression of the TAA haplotype seemed to be associated with an increased risk of disease progression and poor treatment response. The increase in S100B concentrations appeared to be functionally related to increased negative symptoms of SCZ.

\**p* < *0.05.*

TABLE 6 | Parsimonious model of changes in variables from the GEE in a 9-week trial.

\*\**p* < *0.01,* \*\*\**p* < *0.001, dependent variable, logS100B.*

Importantly, expression of the AQP4 gene, regulated by genetic polymorphisms, may influence blood flow and fluid balance, and consequently the extent of neuroinflammation. Using specific haplotypes as potential biomarkers may help to distinguish subgroups of patients and to identify potential aquaporin modulators for the management of neuroinflammation in SCZ.

Our study had several limitations that are applicable to genetic association studies. We cannot exclude the possibility of selection bias because our participants were selected mainly on the basis of repeated hospital admissions, and cases at the first onset of SCZ may have been under-represented. In addition, we cannot draw conclusions about specific associations between drugs and treatment response because the patients were allowed to receive different types and doses of antipsychotics. We used a graphic method to dichotomize the two groups using selected cutoff points. This approach may in some instances have altered the study outcome when used instead of the traditional analysis in comparison with healthy individuals. Finally, the lack of measurement of inflammatory cytokines provided less powerful evidence of an association between neuroinflammation and AQP4 polymorphisms and haplotypes.

#### CONCLUSIONS

AQP4 seems to influence brain neuroinflammation in SCZ because of its important role in maintaining BBB integrity, structure, and permeability. Our study suggests a possible association between genetic variations in the AQP4 gene and the functional outcome of patients with SCZ. The risk variants in the AQP4 gene, including the T allele of rs1058424, the A allele of rs335929, the A allele of rs3763043 and the TAA haplotype, are associated with elevated S100B level and greater severity of negative symptoms in individuals with SCZ. The increased interest in AQP4 derives from its potential use as a therapeutic target in patients with SCZ, for the prevention and treatment of negative symptoms, because inhibitors of the TAA haplotype of AQP4 are expected to protect the brain from persistent neuroinflammation. Further study including healthy controls and patients with first and recurrent episodes under selective AQP4 modulators will be necessary to explore the profound effects on the treatment of patients with SCZ and may positively influence their overall outcome.

#### AUTHOR CONTRIBUTIONS

Y-FW, with the help of H-KS and F-WL, planned the present study's content and analysis, interpreted the data and wrote the paper. Y-FW, H-KS, and F-WL initiated and performed the whole survey, analyzed the data and helped to interpret the findings and to write the paper. All authors read and approved the final manuscript.

#### ACKNOWLEDGMENTS

We would like to express our gratitude to all patients who volunteered for this experiment. This study was supported by a research grant from the Tri-Service General Hospital, Beitou Branch (grant TSGH-BT-106-002).

#### REFERENCES

35. Gabriel SB, Schaffner SF, Nguyen H, Moore JM, Roy J, Blumenstiel B, et al. The structure of haplotype blocks in the human genome. Science (2002) 296:2225–9. doi: 10.1126/science.1069424

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Wu, Sytwu and Lung. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# DNA Methylation and Gene Expression of Matrix Metalloproteinase 9 Gene in Deficit and Non-deficit Schizophrenia

Ju Gao<sup>1,2</sup>†, Hongwei Yi<sup>3</sup>†, Xiaowei Tang<sup>4</sup>†, Xiaotang Feng<sup>5</sup>, Miao Yu<sup>1</sup>, Weiwei Sha<sup>4</sup>, Xiang Wang<sup>6</sup>, Xiaobin Zhang<sup>4</sup> and Xiangrong Zhang<sup>1</sup>\*

<sup>1</sup> Department of Geriatric Psychiatry, Nanjing Brain Hospital, Affiliated to Nanjing Medical University, Nanjing, China, <sup>2</sup> Centers of Disease Prevention and Control for Mental Disorders, Shanghai Changning Mental Health Center, Shanghai, China, <sup>3</sup> Department of Pharmacology, School of Medicine, Southeast University, Nanjing, China, <sup>4</sup> Department of Psychiatry, Affiliated WuTaiShan Hospital of Medical College, Yangzhou University, Yangzhou, China, <sup>5</sup> Department of Psychiatry, Nanjing Qing Long Mountain Psychiatric Hospital, Nanjing, China, <sup>6</sup> Medical Psychological Institute of the Second Xiangya Hospital, Central South University, Changsha, China

#### Edited by:

Cunyou Zhao, Southern Medical University, China

#### Reviewed by:

Kazuhiko Nakabayashi, National Center for Child Health and Development (NCCHD), Japan

Jian-Huan Chen, Jiangnan University, China

\*Correspondence: Xiangrong Zhang drxrz@hotmail.com

†These authors have contributed equally to this work

#### Specialty section:

This article was submitted to Epigenomics and Epigenetics, a section of the journal Frontiers in Genetics

Received: 05 August 2018 Accepted: 29 November 2018 Published: 11 December 2018

#### Citation:

Gao J, Yi H, Tang X, Feng X, Yu M, Sha W, Wang X, Zhang X and Zhang X (2018) DNA Methylation and Gene Expression of Matrix Metalloproteinase 9 Gene in Deficit and Non-deficit Schizophrenia. Front. Genet. 9:646. doi: 10.3389/fgene.2018.00646

The biological pathology of deficit schizophrenia (DS) remains unclear. Matrix metalloproteinase 9 (MMP9) might be associated with neural plasticity and glutamate regulation, both of which are involved in schizophrenia pathogenesis. This study explores gene expression and DNA methylation of MMP9 in peripheral blood mononuclear cells (PBMCs) and their relationship with clinical symptoms in DS and non-deficit schizophrenia (NDS). Pyrosequencing was used to determine DNA methylation at CpG sites in exon 4 and exon 5 of MMP9 in 51 DS patients, 53 NDS patients and 50 healthy subjects (HC). RT-qPCR was used to detect MMP9 expression. Clinical symptoms were assessed by the BPRS, SANS and SAPS scales. MMP9 expression in PBMCs was significantly higher in DS than in NDS and HC subjects. Compared to NDS patients, DS patients had significantly lower DNA methylation at individual CpG sites in exon 4 and exon 5 of MMP9. Correlation analysis showed that DNA methylation in exon 4 was negatively correlated with gene expression in the DS group. A positive correlation was found between MMP9 expression and negative symptoms across all patients with schizophrenia. The social amotivation factor of SANS and the negative syndrome of BPRS were negatively correlated with DNA methylation of CpG5-1 in DS patients but not in NDS patients. DS patients showed a specific abnormality of peripheral MMP9 expression and DNA methylation, indicating a pathological mechanism underlying DS as a specific subgroup of schizophrenia.

Keywords: deficit schizophrenia, matrix metalloproteinase-9, DNA methylation, gene expression, negative symptoms, pyrosequencing

# INTRODUCTION

Schizophrenia is a severe psychiatric disorder with impairment of perception, thought, emotion and behavior, resulting in profoundly impaired social function. Negative symptoms are characterized by an absence or reduction of affective, social and behavioral expression, which are regarded as important predictors of treatment response and prognosis (Foussias et al., 2011). However, the heterogeneity of negative symptoms, primary or secondary symptoms, could lead to the substantial differences in clinical outcome of antipsychotic response and disease prognosis (Aleman et al., 2016). Deficit schizophrenia (DS), proposed by Carpenter et al. (1988), characterizes patients with primary and permanent negative symptoms including restricted affect, diminished emotional range, poverty of speech, curbing of interests, diminished sense of purpose and diminished social drive (Carpenter et al., 1988; Kirkpatrick et al., 1989, 2001). Increasing evidence demonstrates discrepant factors between DS and non-deficit schizophrenia (NDS) in clinical symptoms, disease course, genetic variation (Hong et al., 2005; Wonodi et al., 2006; Bakker et al., 2007; Rethelyi et al., 2010), neuroimaging changes (Lahti et al., 2001; Galderisi et al., 2008; Voineskos et al., 2013; Lei et al., 2015) and neuropsychology (Cohen et al., 2007; Yu et al., 2015; Tang et al., 2016), suggesting that DS might be a homogeneous disease entity with a unique pathogenesis. The research on DS might help to understand the etiology of negative symptoms and prediction biomarkers for the long-term prognosis of patients with schizophrenia.

The matrix metalloproteinases (MMPs), a large family of extracellular proteolytic enzymes, are implicated in numerous developmental and disease-related processes (Sternlicht and Werb, 2001). Matrix metalloproteinase-9 (MMP9) is the best-characterized MMP family member and is thought to have an important role in the pathophysiology of neuropsychiatric disorders, such as schizophrenia, bipolar disorder, stroke, neurodegeneration and brain tumors (Vafadari et al., 2015). Evidence from animal models demonstrated that extracellular proteolytic activity of MMP9 can remodel the synaptic microenvironment, which might be involved in mechanisms of long-term potentiation (LTP), learning and memory and brain disease involving aberrant plasticity (Lepeta and Kaczmarek, 2015). Michaluk et al. (2011) showed that excessive MMP9 caused elongation and thinning of dendritic spines in hippocampal neurons in both in vivo and in vitro models. Interestingly, MMP9 was reported to regulate surface trafficking of N-methyl-D-aspartate receptors (NMDAR) in rat hippocampal neurons in vitro (Michaluk et al., 2009); these receptors are closely related to cellular and molecular processes involved in synaptic adaptation, plasticity and NMDAR-dependent neuronal pathologies (Lau and Zukin, 2007; Groc et al., 2009).

Previous reports on MMP9 in patients with schizophrenia are inconsistent. Rybakowski et al. (2009) genotyped the functional -1562C/T polymorphism of MMP9 and found a significant preponderance of the C/C genotype and C allele in the schizophrenia group compared to normal controls. Domenici et al. (2010) detected increased MMP9 and TIMP-1 (tissue inhibitor of matrix metalloproteinases-1) in peripheral blood of patients with schizophrenia. Moreover, a randomized double-blind clinical trial reported that minocycline, whose pleiotropic effects include decreasing the expression and activity of MMP9 (Yao et al., 2004), acted as an efficient adjuvant therapy to benefit negative symptoms in early schizophrenia (Chaudhry et al., 2012). However, Niitsu et al. (2014) reported that neither serum mature BDNF nor MMP9 levels differed between chronic schizophrenia and controls, a finding confounded by unmatched smoking status between groups. Yamamori et al. (2013) observed significantly increased MMP9 protein in treatment-resistant schizophrenia treated with clozapine, but no association was found between MMP9 and clinical variables. These reports showed no clear consistency regarding the role of MMP9 in schizophrenia, which might be attributed to various clinical and methodological factors such as variation in the severity of psychiatric symptoms and the heterogeneity of schizophrenia. Therefore, it would be of value to investigate whether there is a specific involvement of MMP9 in DS patients with prominent and enduring negative symptoms.

Epigenetics refers to the combination of mechanisms that confer long-term and heritable changes in gene expression without altering the DNA sequence itself. DNA methylation is one of the most common epigenetic alterations; it influences gene expression through methylation of cytosine at C-phosphate-G (CpG) dinucleotides located in distinct genomic regions such as the gene promoter (Deaton and Bird, 2011). In postmortem brains or peripheral blood samples of patients with schizophrenia, DNA methylation alterations in the promoters of candidate genes such as RELN, COMT and GABRB2 have been reported (Abdolmaleky et al., 2005, 2006; Pun et al., 2011). Genome-wide analysis revealed that monozygotic twins discordant for schizophrenia showed characteristic DNA methylation alterations in peripheral blood samples, highlighting an additional role for epigenetic processes in mediating susceptibility (Dempster et al., 2011). However, it is unclear whether MMP9 expression might be associated with epigenetic changes that potentially influence negative symptoms in schizophrenia. In the present study we used pyrosequencing to detect the methylation status of a CpG-rich exonic region of MMP9 in DS, NDS and healthy control (HC) groups. Furthermore, potential correlations between gene expression, DNA methylation and clinical symptoms were studied. Our primary hypothesis was that MMP9 expression would differ between DS and NDS patients. Subsequent analyses, dependent on the outcome of the primary hypothesis, explored the hypothesis that any such differences in DNA methylation of MMP9 would be associated with differences in gene expression and clinical symptoms.

# MATERIALS AND METHODS

#### Participants

A total of 104 patients with schizophrenia (51 DS and 53 NDS) and 50 healthy subjects participated in this study, which was approved by the Institutional Ethical Committee for clinical research of Wutaishan Hospital, Jiangsu Province, China. All participants were male Han Chinese, right-handed, and provided written informed consent. The patients with schizophrenia were recruited from the psychiatric rehabilitation unit of Yangzhou Wutaishan Hospital. The inclusion criteria were (1) a diagnosis of schizophrenia according to the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV), confirmed by the Chinese version of the Structured Clinical Interview for DSM-IV (SCID-I) (First et al., 1997); (2) age between 20 and 65 years; (3) long-standing psychiatric symptoms and stable antipsychotic pharmacological treatment for at least 12 months based on inpatient medical records. Exclusion criteria included any neurological or medical condition, such as head trauma, mental retardation, alcoholism or substance abuse, or a history of electroconvulsive therapy in the past 6 months. Deficit and non-deficit schizophrenia were diagnosed using the Chinese version of the Schedule for the Deficit Syndrome (SDS) (Wang et al., 2008). The healthy controls were selected from the community and matched to the patients with schizophrenia for age and handedness; they had no Axis I psychiatric disorder according to the Structured Clinical Interview for DSM-IV Non-Patient version (SCID-NP) (First et al., 1996) and no family history of psychiatric disorders.

#### Clinical Assessment of Patients

The Schedule for the Deficit Syndrome (SDS) (Kirkpatrick et al., 1989) was used to categorize the patients with schizophrenia into DS and NDS groups, according to the assessment of six enduring (persistent over 12 months) and primary (rather than arising from secondary sources) negative symptoms (restricted affect, diminished emotional range, poverty of speech, curbing of interests, diminished sense of purpose, diminished social drive). The Brief Psychiatric Rating Scale (BPRS), organized into separate positive, negative, disorganized and affect syndromes based on the findings of a factor analysis of the 18-item scale (Mueser et al., 1997; Cohen et al., 2007), was used to evaluate the full range of symptomatology of DS and NDS patients. As the BPRS fails to reflect the comprehensive range of negative symptoms, we utilized the Scale for the Assessment of Negative Symptoms (SANS, 19 original items) to assess the negative symptoms. The SANS scale was divided into three factors: Diminished Expression (including items from the Affective Flattening or Blunting scale, as well as the "poverty of speech" item), Inattention-Alogia (including items from the Inattention and Alogia scales, as well as the "poor eye contact" item) and Social Amotivation (reflecting items from the Anhedonia-Asociality and Avolition-Apathy subscales) (Blanchard and Cohen, 2006; Lyne et al., 2013). Since attentional problems fit the negative symptom construct poorly (Peralta and Cuesta, 1995), the Inattention-Alogia factor was not included in the exploratory analysis in the current study. The Scale for the Assessment of Positive Symptoms (SAPS) was used to assess positive symptoms.

#### MMP9 Expression and DNA Methylation Pyrosequencing Processing

Fasting venous blood samples were taken from the patients with schizophrenia and healthy subjects between 6 and 8 am. Peripheral blood mononuclear cells (PBMCs) were isolated from blood samples using BD Vacutainer Cell Preparation tubes according to the manufacturer's instructions (Becton, Dickinson and Company, United States) and stored at −80°C. Total RNA was isolated from PBMC samples using the RNeasy mini kit (Qiagen, CA, United States). Gene expression was measured by real-time quantitative PCR (RT-qPCR). Primers were designed from the cDNA sequences of human MMP9 (forward: 5′-GTGGACGATGCCTGCAACGT-3′; reverse: 5′-GCCGCTCCTCAAAGACCGAG-3′) and GAPDH (forward: 5′-ACCACAGTCCATGCCATCAC-3′; reverse: 5′-TCCACCACCCTGTTGCTGTA-3′). cDNA samples were run in duplicate. Real-time PCR was performed according to the manufacturer's protocol using the QuantiTect SYBR Green RT-PCR kit (Qiagen, United States). Briefly, a 20 µL total reaction volume containing 10 µL SYBR Green master mix (Applied Biosystems), 0.1 µL each of forward and reverse primer (10 pM/µL), and 2 µL cDNA was amplified on an ABI 7900HT FAST instrument. PCR was performed with an initial incubation at 50°C for 2 min, followed by a 10-min denaturation at 95°C and 40 cycles of 95°C for 15 s, 60°C for 1 min, and 72°C for 40 s. MMP9 expression was normalized to the mRNA levels of the housekeeping gene GAPDH (Barber et al., 2005). Relative mRNA levels of MMP9 were calculated with the Delta-Delta CT method (CT = threshold cycle) (Livak and Schmittgen, 2001). The relative fold changes of MMP9 mRNA in DS or NDS patients were expressed relative to the mean MMP9 mRNA of healthy subjects.
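The relative quantification step can be illustrated with a minimal Python sketch of the 2^(−ΔΔCT) calculation described by Livak and Schmittgen (2001). The function name, the CT values and the use of the mean healthy-control ΔCT as the calibrator are assumptions made for illustration; they are not taken from the study data or from the authors' analysis scripts.

```python
from statistics import mean


def relative_fold_change(ct_mmp9, ct_gapdh, control_delta_cts):
    """Return 2^-(ddCT) for one sample (hypothetical helper).

    ct_mmp9, ct_gapdh: duplicate CT readings for the target and the
    reference gene of one sample.
    control_delta_cts: dCT values (CT_MMP9 - CT_GAPDH) of the healthy
    controls; their mean serves as calibrator (assumption of this sketch).
    """
    delta_ct = mean(ct_mmp9) - mean(ct_gapdh)          # normalize to GAPDH
    delta_delta_ct = delta_ct - mean(control_delta_cts)  # compare with controls
    return 2.0 ** (-delta_delta_ct)


# Illustrative CT values only, not study data.
fold = relative_fold_change([24.25, 23.75], [18.0, 18.0], [7.0, 7.5, 6.5])
print(fold)
```

With these illustrative numbers the sample's ΔCT is one cycle lower than the control mean, which corresponds to a two-fold higher relative expression.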

Genomic DNA was isolated from PBMCs using the QIAamp DNA Blood Mini Kit (Qiagen, United States) and bisulfite-modified to convert unmethylated cytosine residues to uracil using the EpiTect Fast DNA Bisulfite Kit (Qiagen, United States). PCR reactions were set up according to the instructions of the PyroMark PCR Master Mix kit (Qiagen, Cat. No. 978703). In brief, each reaction contained 12.5 µl PyroMark PCR Master Mix, 2.5 µl CoralLoad Concentrate, 2 µl primer, 6 µl RNase-free water and 2 µl template DNA, mixed gently. Thermal cycling conditions were 95°C for 15 min; 45 cycles of 94°C for 30 s, 56°C for 30 s and 72°C for 30 s; and 72°C for 10 min. After amplification, samples were stored at −20°C. Pyrosequencing was performed using the PyroMark Q96 ID System (Qiagen, United States) to analyze DNA methylation of the MMP9 gene in patients and healthy controls. According to the CpG islands track of the UCSC Genome Browser<sup>1</sup>, the human MMP9 gene contains four CpG islands. Given that DNA methylation usually occurs within the promoter or nearby intragenic exon regions, we chose the sequence in the first CpG island, which contains exons 4 and 5, for analysis. The region containing exon 4 was analyzed with the Hs\_MMP9\_02\_PM PyroMark CpG assay (Cat. No. PM00079198; sequence to analyze: 5′-GCCC**CG**GCATTCAGGGAGA**CG**CCCATTT**CG**A**CG**ATGA**CG**A-3′) and the region containing exon 5 with the Hs\_MMP9\_01\_PM PyroMark CpG assay (Cat. No. PM00079191; sequence to analyze: 5′-TCGGTTTGGAAACGCAGATGGCGCG-3′). Mean methylation values of each exonic CpG-containing sequence were calculated. In total, nine CpG sites were included, named CpG4-1, CpG4-2, CpG4-3, CpG4-4, CpG4-5, CpG5-1, CpG5-2, CpG5-3, and CpG5-4. The relative methylation changes of MMP9 in DS or NDS patients were expressed relative to the mean MMP9 methylation of healthy subjects (Gao et al., 2019).
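As a quick consistency check on the site naming, the short Python sketch below locates the CpG dinucleotides in the two analyzed sequences quoted above and averages per-site methylation percentages by exon. Numbering sites in 5′ to 3′ order, the helper names and the example percentages are assumptions of this sketch, not a description of the PyroMark software or of the authors' procedure.

```python
EXON4_SEQ = "GCCCCGGCATTCAGGGAGACGCCCATTTCGACGATGACGA"
EXON5_SEQ = "TCGGTTTGGAAACGCAGATGGCGCG"


def cpg_sites(sequence, exon_label):
    """Map site names such as 'CpG4-1' to the 0-based position of each CG."""
    seq = sequence.upper()
    starts = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]
    return {f"CpG{exon_label}-{n}": pos for n, pos in enumerate(starts, start=1)}


def exon_mean(percent_by_site, exon_label):
    """Average the methylation percentages of all sites in one exon."""
    prefix = f"CpG{exon_label}-"
    values = [v for name, v in percent_by_site.items() if name.startswith(prefix)]
    return sum(values) / len(values)


sites = {**cpg_sites(EXON4_SEQ, 4), **cpg_sites(EXON5_SEQ, 5)}
print(list(sites))  # five exon 4 sites followed by four exon 5 sites

# Hypothetical per-site methylation percentages for one sample.
sample = {"CpG4-1": 10.0, "CpG4-2": 20.0, "CpG5-1": 40.0}
print(exon_mean(sample, 4))
```

The two sequences contain five and four CG dinucleotides, respectively, which agrees with the nine sites listed in the text.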

#### Statistical Analysis

Statistical analysis was undertaken using SPSS version 19.0 (SPSS Inc., United States). Demographic data, clinical characteristics, gene expression and percentage methylation are presented as mean ± standard deviation. Differences among the three groups were assessed with ANCOVA for continuous variables, using education years as a covariate, followed by Bonferroni post hoc analysis for between-group comparisons. Dichotomous variables (i.e., the percentage of smokers) were compared among the three groups with the chi-squared test. Psychiatric symptoms in the DS and NDS groups were compared using Student's t-tests. Spearman rank correlation and multiple linear regression were used to determine the influence of demographic factors, clinical measurements and gene methylation level on negative symptoms. Furthermore, partial correlation analyses were used to determine the relationships between the CpG site methylation percentage of MMP9 and clinical assessments, controlling for age and CPZ-equivalent dose. In testing our primary hypothesis, a Bonferroni-corrected P-value of 0.0056 was applied to the multiple comparisons between clinical groups for each CpG site. Otherwise a two-tailed P-value < 0.05 was predetermined as significant.

<sup>1</sup>http://genome.ucsc.edu/
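The corrected threshold follows from simple arithmetic: 0.05 divided by nine comparisons gives approximately 0.0056. The Python sketch below makes that step explicit, assuming that the nine tests are the nine CpG sites; the function names are hypothetical, and the example P values are the unadjusted values reported for CpG4-2 and CpG4-3 in the Results.

```python
ALPHA = 0.05        # family-wise significance level
N_CPG_SITES = 9     # assumed number of comparisons (one per CpG site)


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def is_significant(p_value, threshold):
    """True when a single test passes the corrected threshold."""
    return p_value < threshold


threshold = bonferroni_threshold(ALPHA, N_CPG_SITES)
print(round(threshold, 4))

# P values reported for CpG4-2 and CpG4-3 (DS vs. NDS).
for site, p in {"CpG4-2": 0.027, "CpG4-3": 0.262}.items():
    print(site, is_significant(p, threshold))
```

Under this threshold a nominal P of 0.027 is not significant, which is consistent with how CpG4-2 is reported.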

# RESULTS

#### Demographic and Clinical Characteristics

There was a significant difference in education years, but not in age or the proportion of smokers, among the DS, NDS and HC groups. The DS group had fewer education years than HC subjects, while no significant difference was detected between the two schizophrenia subgroups. There was no significant difference in age at onset, duration or CPZ-equivalent dose between the DS and NDS groups. DS patients showed significantly higher scores than NDS patients in the negative syndrome, but not in the positive syndrome, disorganization syndrome or affect factors of the BPRS. Consistent with the BPRS measurement, SANS scores were higher in the DS group than in the NDS group, but no significant differences were seen in SAPS scores. DS patients showed significantly higher scores than NDS patients in the diminished expression and social amotivation factors of SANS, indicating that DS patients suffered more severe impairment in blunted affect, anhedonia and avolition (**Table 1**).

#### Methylation of the MMP9 and Gene Expression in PBMCs

Mean methylation levels of the nine sites in MMP9 were significantly different among the DS, NDS and HC groups. Compared to NDS patients, DS patients had significantly lower methylation levels at CpG4-4, CpG4-5, CpG5-1, CpG5-2, CpG5-3 and CpG5-4 (all adjusted P < 0.001), while CpG4-2 (P = 0.027) and CpG4-3 (P = 0.262) showed no significant difference and CpG4-1 (adjusted P < 0.001) had a higher level. Post hoc analysis showed that DNA methylation of all individual sites in both DS and NDS patients was significantly lower than in HC subjects, except that DNA methylation of CpG5-3 did not differ between NDS and HC (P = 0.027). Mean values of DNA methylation in exons 4 and 5 of MMP9 were significantly different among the three groups. Bonferroni post hoc comparisons revealed lower methylation levels in both exons 4 and 5 in DS patients (P < 0.001) and NDS patients (P < 0.001) relative to HC subjects, while DS patients showed significantly lower DNA methylation of exons 4 and 5 than NDS patients. Repeating the initial analysis with smoking status and CPZ equivalents included as an additional factor and covariate, respectively, showed essentially equivalent results with no differences in the levels of significance reached for each CpG site (data not shown).

The gene expression of MMP9 in peripheral blood mononuclear cells was significantly higher in DS and NDS patients than in HC subjects, and it was higher in DS patients than in NDS patients (**Table 2**). Here too, neither smoking status nor CPZ equivalents influenced these findings.

#### Relationship Between MMP9 Methylation, Gene Expression, Demographic and Clinical Characteristics

Stepwise linear regression indicated a significant effect of age (β = 0.031, P = 0.02) and medication (CPZ-equivalent dose) (β = 0.001, P = 0.029) on gene expression of MMP9 in the DS group, but no significant effect of education, duration, onset or smoking status. After controlling for age and CPZ-equivalent dose, DNA methylation in exon 4 (r = −0.388, P = 0.005) but not exon 5 (r = 0.252, P = 0.074) remained significantly negatively correlated with gene expression of MMP9 in the DS group (**Figure 1**). Age had a significant effect (β = 0.026, P < 0.001) on gene expression of MMP9 in the NDS group, while other variables showed no significant effect. Controlling for the effect of age, there was no significant correlation between gene expression and methylation in either exon 4 (r = 0.163, P = 0.244) or exon 5 (r = −0.225, P = 0.106) in the NDS group (**Figure 1**). There was no significant correlation between gene expression of MMP9 and average methylation in exon 4 or exon 5 in the HC group.

#### Relationship Between Clinical Symptoms and MMP9 Methylation

Partial correlation analysis was used to determine the relationships between clinical measurements, MMP9 DNA methylation and expression in patients when controlling for age and CPZ-equivalent dose. For all patients with schizophrenia, a significant positive correlation was found between gene expression of MMP9 and negative symptoms (SANS total scores: r = 0.256, P = 0.009; negative syndrome subscale of BPRS: r = 0.272, P = 0.006), but no significant correlation was found with positive symptoms. Lower methylation levels at individual CpG sites were associated with higher SANS scores, except for CpG4-1, CpG4-2, and CpG4-3. After dividing patients into subgroups, the social amotivation factor of SANS (r = −0.351, P = 0.013) and the negative syndrome of BPRS (r = −0.334, P = 0.019) were negatively correlated with DNA methylation of CpG5-1 in DS patients but not in NDS patients (**Figure 2**). The correlation diagrams are presented with the residual values of CpG methylation and clinical assessments.
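A partial correlation with two covariates can be sketched in plain Python as follows. This is a minimal illustration that assumes Pearson-type correlation and uses the standard recursive formula, which is equivalent to correlating the residuals left after regressing both variables on the covariates. It is not the SPSS procedure used in the study, and the function names and all values are hypothetical.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / sqrt(var_a * var_b)


def partial_one(x, y, z):
    """Correlation of x and y with one covariate z held constant."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def partial_two(x, y, z, w):
    """Correlation of x and y with covariates z and w held constant."""
    r_xy = partial_one(x, y, z)
    r_xw = partial_one(x, w, z)
    r_yw = partial_one(y, w, z)
    return (r_xy - r_xw * r_yw) / sqrt((1 - r_xw ** 2) * (1 - r_yw ** 2))


# Illustrative values only, not study data.
methylation = [12.0, 9.5, 14.2, 8.1, 11.3, 10.4, 13.0, 7.6]   # % at one CpG site
symptom = [14, 19, 11, 22, 15, 18, 12, 21]                    # a symptom score
age = [30, 42, 35, 51, 28, 47, 39, 44]                        # years
cpz_dose = [200, 350, 150, 400, 300, 250, 500, 450]           # mg/d

print(partial_two(methylation, symptom, age, cpz_dose))
```

A negative value here would correspond to the pattern reported for CpG5-1 in DS patients, where lower methylation accompanies higher symptom scores once age and dose are held constant.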


TABLE 1 | Demographics and characteristics for deficit schizophrenia (DS), non-deficit schizophrenia (NDS), and healthy subjects (HC) groups.

Values represented as mean ± S.D.

DS, deficit schizophrenia; NDS, non-deficit schizophrenia; HC, healthy controls; BPRS, Brief Psychiatric Rating Scale; SAPS, the Scale for the Assessment of Positive Symptoms; SANS, the Scale for the Assessment of Negative Symptoms; CPZ, chlorpromazine; IQR, interquartile range.

∗∗P < 0.001 DS vs. NDS.

<sup>1</sup>P < 0.01 DS vs. HC.

TABLE 2 | Methylation of MMP-9 and gene expression in DS, NDS, and HC group.


<sup>∗</sup>P < 0.01 DS vs. NDS; ∗∗P < 0.001 DS vs. NDS; #P < 0.01 DS/NDS vs. HC; ##P < 0.001 DS/NDS vs. HC.

Values are expressed as % methylation.

<sup>1</sup>Relative fold changes of MMP9 (range: DS, 0.87–2.85; NDS, 0.74–2.23; HC, 0.36–1.70).

# DISCUSSION

The present study demonstrated that the epigenetic pattern and gene expression of MMP9 in peripheral blood mononuclear cells differed among the DS, NDS, and HC groups. DS patients had increased gene expression of MMP9, which might be related to the lower level of average methylation in exon 4, near the promoter region of MMP9. Notably, DNA methylation of individual CpG sites showed negative correlations with clinical symptoms such as the social amotivation factor of SANS and the negative syndrome of BPRS in DS patients, but not in NDS patients. To the best of our knowledge, this was the first study to provide evidence that DNA methylation is involved in gene expression of MMP9 in patients with schizophrenia, contributing to our understanding of the potential pathogenesis of DS.

MMP9 has been considered to have pathological importance in patients with schizophrenia (Lepeta and Kaczmarek, 2015). Domenici et al. (2010) applied a focused proteomic approach in a large-scale case-control study including 229 schizophrenic patients and 254 controls and revealed increased peripheral MMP9 in patients with schizophrenia. Similar results were reported in a recent ROC curve analysis (Ali et al., 2017), indicating that increased MMP9 had some value in distinguishing patients with schizophrenia from healthy controls. Increased peripheral MMP9 was also reported in remitted (Devanarayanan et al., 2015) and treatment-resistant schizophrenia (Yamamori et al., 2013). However, a few studies have shown negative findings for peripheral MMP9 in patients with schizophrenia, which have been attributed to disease state, medication (Kumarasinghe et al., 2013) or smoking status (Niitsu et al., 2014). For example, Kumarasinghe et al. (2013) found that MMP9 mRNA was significantly up-regulated in PBMCs of treatment-naïve schizophrenic patients relative to healthy subjects and returned to control level after 6–8 weeks of antipsychotic pharmacotherapy at 200 mg/d CPZ-equivalents. Our study was consistent with the majority of the previous reports (Domenici et al., 2010; Devanarayanan et al., 2015; Ali et al., 2017) in showing increased MMP9 expression in PBMCs in these long-term stabilized patients with schizophrenia. This study, along with our recent study (Gao et al., 2019), also indicated that MMP9 was significantly elevated in DS patients relative to NDS patients. As influencing factors including age, gender and smoking status were well matched, the increased MMP9 observed in the present study might reflect an association with clinical symptoms, especially the primary and persistent negative symptoms in DS patients compared with NDS patients.

FIGURE 1 | Correlation between MMP9 expression and methylation in exons 4 and 5. The points of the scatter plot represent residual values of mean methylation of exon 4/exon 5 and relative fold changes of MMP9 (gene expression) after controlling for covariates in the DS (filled dots) and NDS (empty dots) groups. The mean methylation was calculated by using the relative methylation changes in each CpG site.

FIGURE 2 | … methylation of CpG5-1 and subscales of SANS and BPRS after controlling for age and CPZ-equivalent as covariates in DS patients.

A positive correlation was found between gene expression of MMP9 and negative symptoms after controlling for age and CPZ-equivalents in the total schizophrenia group, suggesting that increased MMP9 might have a potential effect on negative symptoms. MMP9 is synthesized by neurons, astrocytes and microglia in the hippocampus and prefrontal cortex (Okulski et al., 2007), which are critical brain regions associated with negative symptoms in schizophrenia (Shaffer et al., 2015; Palm et al., 2016). Altered LTP in MMP9 knockout mice was first reported by Nagy et al. (2006), indicating that MMP9 plays an important role in hippocampal synaptic physiology and plasticity. This finding has been supported by several other studies, including investigations of different LTP pathways in the hippocampus. Wiera et al. (2013) reported that the maintenance of LTP was nearly abolished in the mossy fiber-CA3 projection of the hippocampus in both MMP9-overexpressing and MMP9 knockout rats. Abnormal synaptic plasticity renders the patient incapable of learning from social experiences and adaptive exchanges, which is likely to trigger social withdrawal and apathy in schizophrenia (Stephan et al., 2009). Moreover, Michaluk et al. (2009) reported that MMP9 stimulated the surface trafficking of NMDAR by increasing lateral diffusion in rat hippocampal neurons in vitro. Lateral diffusion has been identified as a key process in NMDA receptor internalization (Groc et al., 2009), indicating that increased MMP9 might theoretically lead to NMDAR hypoactivity in the hippocampus. Studies focusing on the relationship between MMP9 and NMDAR in the prefrontal cortex are virtually absent. It is well known that NMDAR hypoactivity in the prefrontal cortex contributes to the inhibition of the dopamine pathway from the ventral tegmental area to the dorsolateral and ventromedial prefrontal cortex (Stahl, 2007). Therefore, it might be speculated that elevated MMP9 is involved in functional disturbances of the NMDA receptor in the hippocampus and prefrontal cortex and is possibly associated with negative symptoms and cognitive dysfunction in schizophrenia. Additionally, it should be noted that the present study found no significant correlation between SANS scores and gene expression of MMP9 in either the DS or NDS group. Whether this negative finding could be attributed to ceiling effects of severe negative symptoms in the two patient groups remains unclear.

The association between gene polymorphism and behavioral symptoms in schizophrenia might provide mechanistic insight into MMP9 gene function. Recently, a phenotype-based genetic association study by Lepeta et al. (2017) demonstrated an association between the MMP9 rs20544 CC/CT genotype and severity of a chronic delusional syndrome. The molecular mechanism might be that the rs20544 C/T single-nucleotide polymorphism (SNP) affects the affinity of Fragile X mental retardation protein (FMRP) binding to MMP9 mRNA and influences MMP9 activity at dendritic spines. By testing MMP9 heterozygous mice for psychosis-related locomotor hyperactivity induced by an NMDA receptor antagonist, they also showed that a lower MMP9 level influenced performance in a behavioral model of the positive symptoms of schizophrenia (Lepeta et al., 2017). Another study by Bienkowski et al. (2015) found no association between the functional MMP9 -1562C/T gene polymorphism and deficit/non-deficit subtypes of schizophrenia, which seemed to argue against a role of MMP9 gene polymorphism in deficit symptomatology. However, one should be aware that presumed genetic differences between deficit and non-deficit subtypes may involve many genes and that the influence of a single gene can be relatively weak and difficult to confirm. The present study employed different methodologies (e.g., MMP9 expression and methylation) to study deficit and non-deficit subtypes and further demonstrated the association between elevated gene expression of MMP9 and negative symptoms of schizophrenia. Thus gene polymorphism and alterations of MMP9, both down- and up-regulation, might be possible factors contributing to the pathophysiological underpinning of schizophrenia.

The present study demonstrated hypo-methylation in exon 4 and exon 5 of MMP9 in the two schizophrenia subgroups compared to healthy controls, which might result in up-regulated gene expression of MMP9. Previous studies demonstrated that gene expression of MMP9 might be mediated by DNA methylation in the MMP9 promoter and intragenic DNA regions, indicating a mechanism of regulation by epigenetic modifications (Campos et al., 2016; Klassen et al., 2018). Roach et al. (2005) showed that demethylation of the 5′-flanking region containing the promoter of MMP9 was associated with elevated expression of MMP9 in human osteoarthritic chondrocytes. Zybura-Broda et al. (2016) found that gene expression was regulated by demethylation of the MMP9 proximal promoter in human epilepsy. These studies indicate that epigenetic mechanisms play an important role in MMP9 up-regulation and protease activity, and might also be implicated in the biological role of MMP9 in schizophrenia. DNA methylation, one of the principal epigenetic mechanisms, has been strongly implicated in the pathophysiology of schizophrenia (Feng and Fan, 2009; Kinoshita et al., 2013). The present study provided the first evidence of a possible association between DNA methylation and gene expression of MMP9 in schizophrenia. Importantly, the DS patients showed lower methylation than NDS patients at the majority of individual CpG sites, while a negative correlation was found between gene expression and average methylation of exon 4 of MMP9 in DS patients. DS patients had poorer premorbid adjustment during childhood and early adolescence and exhibited more impairment in general cognitive abilities than NDS patients (Galderisi et al., 2002). Considering that MMP9 apparently plays a special role during developmental plasticity, this raises the intriguing possibility that the hypo-methylation pattern of MMP9 might affect downstream cellular changes and contribute to the manifestation of behavioral abnormalities or clinical symptoms from the early life of DS patients.

Lower methylation at several individual CpG sites and higher MMP9 expression were associated with higher SANS scores in schizophrenia patients in the present study. This finding is in keeping with the expectation that lower methylation would be associated with increased MMP9, which in turn could contribute to the neuropathology that may underlie negative symptomatology. Generally, promoter sequence methylation could directly interfere with transcription factor binding sites or indirectly cause gene silencing through methylated-DNA binding proteins that recruit histone deacetylases, leading to chromatin condensation (Jones et al., 1998). As predicted by bioinformatics methods (PROMO v8.3) (Messeguer et al., 2002), the region of MMP9 adjacent to CpG4-4, CpG4-5 and CpG5-1 contains recognition sequences of several important transcription factors, such as p53, GR-alpha and c-Myb. p53 is importantly involved in proliferation, differentiation and apoptosis of neural progenitor cells (Tedeschi and Di Giovanni, 2009). Thus the hypo-methylation in exon 4 of MMP9 might enhance the activity of transcription factors (e.g., p53), and thereby increase gene expression of MMP9 in DS patients. Notably, the present study further found that both the social amotivation factor of SANS and the negative syndrome of BPRS were negatively correlated with the DNA methylation of CpG5-1, particularly in DS patients, indicating a distinct epigenetic characteristic of DS patients. The social amotivation factor of SANS would represent the behavioral abnormality relevant to the learning disability of social experiences and adaptive exchanges (Stephan et al., 2009), consistent with the hypothesis that MMP9 contributes to pathological synaptic plasticity in schizophrenia (Lepeta and Kaczmarek, 2015). The present findings thus provide a possible linkage between hypo-methylation of MMP9 and negative symptoms in DS patients.

Several limitations of the present study should be considered. The main limitation was that MMP9 DNA methylation and gene expression were detected in PBMCs, where MMP9 expression and methylation may not necessarily represent that occurring in brain regions relevant to the pathogenesis of schizophrenia. However, several previous reports have also demonstrated that MMP9 expression is substantially altered in peripheral blood in patients with schizophrenia. The second limitation is that the present study did not completely analyze the DNA methylation at full length of promoter (exon) region of MMP9. The promoter (exon1) region, containing many CpG sites, was omitted in the present study, although this region does not overlap with a CpG island. Further study is needed to reveal a more comprehensive understanding of MMP9 DNA methylation in DS. Thirdly, the present study used CPZ-equivalent daily dose as the parameter to evaluate antipsychotic medication for the recruited patients, wherein the type and dosage of antipsychotic drug was comparable between the two patient groups. The influence of antipsychotic medication on MMP9 expression and methylation does not here differentiate typical and atypical antipsychotics, although differential epigenetic modification of various antipsychotics in the patients with schizophrenia has been reported (Ibi and Gonzalez-Maeso, 2015). Further studies would be needed to determine the distinct epigenetic effects on the MMP9 following different antipsychotic medication. Finally, the present study is limited in the sample sizes of the DS, NDS and control groups. This is an inevitable consequence of the strict diagnostic criteria for DS and the need to restrict variance by eliminating or minimizing confounders such as age, gender, smoking and fluctuations of psychiatric symptoms. Replication with a larger sample size would be valuable to increase the statistical power and fully investigate the effects of various clinical confounders.

In summary, the present study provided evidence for abnormal peripheral gene expression and DNA methylation of MMP9 in DS patients, indicating that subjects with the deficit syndrome might be a specific sub-group within schizophrenia. The negative correlations between MMP9 DNA methylation of individual CpG sites and negative symptoms revealed a distinct neuropathological impairment in DS patients. The present study indicated that MMP9 methylation might be a promising disease biomarker especially for the diagnosis and treatment domains of negative symptoms.

#### ETHICS STATEMENT

This study was carried out in accordance with the recommendations of 'name of guidelines, name of committee' with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the 'name of committee'.

#### AUTHOR CONTRIBUTIONS

JG and HY performed the research and analyzed the data. JG and XT wrote the manuscript. XgZ made substantial contributions to conception and coordination. XF, MY, WS, and XbZ helped collect samples. XW provided the scales for assessing clinical symptoms. All authors read and approved the final manuscript.

# FUNDING

This work was supported by the National Natural Science Foundation of China (NSFC) (Nos. 81371474, 81571314, 81741164, 91132727, and 31671144), the National Key Research and Development Program (2016YFC1307002 and 2018YFC1314303), the Nanjing Technology Development Foundation (Nos. 201505001, YKK16291, and YKK16290), the Medical key talent projects in Jiangsu Province (ZDRCA2016075), the six talent peaks projects in Jiangsu Province (No. 2015-WSN-071), and the Shanghai Changning Medical Research Program (CNKW2016Y017).

#### ACKNOWLEDGMENTS

We sincerely thank Hao Tang for preparing experiments and Gavin P. Reynolds for revising the whole paper.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Gao, Yi, Tang, Feng, Yu, Sha, Wang, Zhang and Zhang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.


# Rare Copy Number Variations in a Chinese Cohort of Autism Spectrum Disorder

Yanjie Fan<sup>1</sup>\*†, Xiujuan Du<sup>2</sup>†, Xin Liu<sup>3</sup>, Lili Wang<sup>1</sup>, Fei Li<sup>2</sup>\* and Yongguo Yu<sup>1</sup>\*

*<sup>1</sup> Shanghai Institute for Pediatric Research, Xinhua Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, China, <sup>2</sup> Department of Developmental and Behavioral Pediatrics, Department of Child Primary Care, Brain and Behavioral Research Unit of Shanghai Institute for Pediatric Research & MOE-Shanghai Key Laboratory for Children's Environmental Health, Xinhua Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, China, <sup>3</sup> Department of Developmental and Behavioral Pediatrics, Department of Child Primary Care, Xinhua Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, China*

#### Edited by:

*Zhexing Wen, Emory University School of Medicine, United States*

#### Reviewed by:

*Feng Zhang, Fudan University, China Ting Zhao, University of Pennsylvania, United States*

#### \*Correspondence:

*Yanjie Fan fanyanjie@shsmu.edu.cn Fei Li feili@shsmu.edu.cn Yongguo Yu yuyongguo@shsmu.edu.cn*

*†These authors have contributed equally to this work*

#### Specialty section:

*This article was submitted to Epigenomics and Epigenetics, a section of the journal Frontiers in Genetics*

Received: *02 October 2018* Accepted: *04 December 2018* Published: *18 December 2018*

#### Citation:

*Fan Y, Du X, Liu X, Wang L, Li F and Yu Y (2018) Rare Copy Number Variations in a Chinese Cohort of Autism Spectrum Disorder. Front. Genet. 9:665. doi: 10.3389/fgene.2018.00665*

Autism spectrum disorder (ASD) is heterogeneous in symptom and etiology. Rare copy number variations (CNVs) are important genetic factors contributing to ASD. Currently chromosomal microarray (CMA) detecting CNVs is recommended as a first-tier diagnostic assay, largely based on research in North America and Europe. The feature of rare CNVs has not been well characterized in ASD cohorts from non-European ancestry. In this study, high resolution CMA was utilized to investigate rare CNVs in a Chinese cohort of ASD (*n* = 401, including 177 mildly/moderately and 224 severely affected individuals), together with an ancestry-matched control cohort (*n* = 197). Diagnostic yield was about 4.2%, with 17 clinically significant CNVs identified in ASD individuals, of which 12 CNVs overlapped with recurrent autism risk loci or genes. Autosomal rare CNV burden analysis showed an overrepresentation of rare loss events in ASD cohort, whereas the rate of rare gain events correlated with the phenotypic severity. Further analysis showed rare losses disrupting genes highly intolerant of loss-of-function variants were enriched in the ASD cohort. Among these highly constrained genes disrupted by rare losses, *RIMS2* is a promising candidate contributing to ASD risk. This pilot study evaluated clinical utility of CMA and the feature of rare CNVs in Chinese ASD, with candidate genes identified as potential risk factors.

Keywords: autism spectrum disorder, rare copy number variations, chromosomal microarray, LoF-intolerant genes, RIMS2

#### INTRODUCTION

Autism spectrum disorder (ASD) is characterized by persistent deficits in social communication and restricted, repetitive patterns of behavior (Lai et al., 2014). The manifestation of ASD spans a broad range of symptoms and severity (Lai et al., 2014). This phenotypic diversity coincides with heterogeneous genetic etiology: known genetic causes of ASD include aneuploidy, copy number variations (CNVs), and single nucleotide variations (De Rubeis and Buxbaum, 2015).

Multiple lines of evidence support rare CNVs as an important type of genetic factors contributing to autism risk (Schaefer et al., 2013), and currently chromosomal microarray (CMA) detecting CNVs is recommended as a first-tier diagnostic assay for ASD (Miller et al., 2010).


However, most of this evidence comes from studies in North America and Europe (Shen et al., 2010; Schaefer et al., 2013; Tammimies et al., 2015). Research on rare CNVs in ASD cohorts of non-European ancestry is limited but necessary, considering the substantial differences in CNV distribution and pattern across ethnic groups (Park et al., 2010; Manrai et al., 2016). For the Chinese population, only three studies so far have examined the yield of CMA in ASD (probands from Northern China, Taiwan and Hong Kong, respectively; Gazzellone et al., 2014; Yin et al., 2016; Mak et al., 2017). These studies primarily focused on clinical utility, so further work is needed to characterize the general features and burden of rare CNVs. Besides lacking ethnic diversity, published CNV studies are largely from ASD cohorts with varying degrees of severity. The correlation between rare CNV burden and symptom severity has not been investigated yet. Dissecting the heterogeneity of severity is a critical step toward understanding the genetic architecture of ASD.

Interrogating the genic content of rare CNVs is another way to gain insight into ASD etiology, as candidate genes can be discovered in rare CNV regions. Given the strong selective pressure on neurodevelopmental disorders (Kosmicki et al., 2017), genes intolerant of loss-of-function (LoF) variants are prioritized candidates. Among the 86 genes curated as high-risk factors by SFARI (Simons Foundation Autism Research Initiative, https://gene.sfari.org), 63 were LoF-intolerant genes (based on pLI score > 0.99 in the Exome Aggregation Consortium; Ruderfer et al., 2016). Interrogating these evolutionarily constrained genes in rare CNVs is a rational approach to candidate search.

In this study, we investigated rare CNVs in a well-characterized Chinese ASD cohort (n = 401), including 177 mildly affected and 224 severely affected individuals, together with an ancestry-matched control cohort (n = 197). The three aims of this study are: 1. To evaluate the diagnostic yield of CMA in Chinese ASD individuals; 2. To examine the rare CNV burden between mildly and severely affected subgroups; 3. To identify candidate risk genes based on rare CNVs disrupting those genes extremely intolerant of LoF variants.

# MATERIALS AND METHODS

#### Sample Selection

Four hundred and one Chinese individuals diagnosed with ASD were recruited from July 2014 to December 2017 from the Developmental and Behavioral Clinic at Xinhua Hospital and Shanghai Children's Medical Center. The Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5) (American Psychiatric Association, 2013), the Autism Diagnostic Observation Schedule (ADOS) (Lord, 2002), and the Childhood Autism Rating Scale (CARS) (Schopler et al., 1986) were used for diagnosis. The ASD cohort consisted of 335 males and 66 females, with ages ranging from 1 year 5 months to 17 years. Severity was categorized by CARS score: 30–37 was defined as mildly/moderately affected (Mild group), and 37–60 as severely affected (Severe group). The full list of this ASD cohort is included in **Supplementary Table 1**. The control cohort consisted of 123 males and 85 females of Chinese ancestry with no ASD or other major anomalies.
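The CARS-based severity grouping described above can be sketched as a small function. This is only an illustration: the function name is hypothetical, and because the two published ranges share the value 37, the sketch assumes that a score of exactly 37 falls in the Severe group and that scores outside 30–60 are rejected.

```python
def cars_severity(score: float) -> str:
    """Assign the study's severity group from a CARS total score.

    Assumptions (not stated in the text): a score of exactly 37 is
    treated as Severe, and scores outside 30-60 raise an error.
    """
    if not 30 <= score <= 60:
        raise ValueError(f"CARS score {score} is outside the 30-60 range used here")
    return "Severe" if score >= 37 else "Mild"
```

For instance, `cars_severity(33.5)` would place an individual in the Mild group under these assumptions.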

This study was carried out in accordance with the recommendations of national guidelines on research involving human subjects in China with written informed consent. All subjects gave written informed consent in accordance with the Declaration of Helsinki. For participants under 16, written informed consent was obtained from the parents of the participants. The protocol was approved by the Ethics Committee of Xinhua hospital.

# CMA and Data Analysis

Genomic DNA was extracted from peripheral blood of participants. The Affymetrix CytoScan HD array (average probe spacing 1,148 bp) was utilized to detect genomic CNVs following the manufacturer's guide (Thermo-Fisher Scientific, United States). Array results were analyzed by Chromosome Analysis Suite software with a streamlined CNV calling workflow (Thermo-Fisher Scientific, United States). The CYCHP files generated were used for summarizing chromosomal aberrations. The size threshold was set to 100 kb (with >25 probes), a relatively stringent criterion to ensure high-confidence CNV calling.
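The calling threshold above amounts to a simple per-call filter. The following is a minimal sketch, not the Chromosome Analysis Suite workflow: the `CNV` record and `passes_size_filter` are hypothetical names, and it assumes that a call of exactly 100 kb passes while the probe count must be strictly greater than 25.

```python
from dataclasses import dataclass

MIN_SIZE_BP = 100_000   # 100 kb size threshold from the text
MIN_PROBES = 25         # the text requires more than 25 probes


@dataclass(frozen=True)
class CNV:
    """Hypothetical record for one CNV call."""
    chrom: str
    start: int      # bp
    end: int        # bp
    kind: str       # "loss" or "gain"
    n_probes: int

    @property
    def size(self) -> int:
        return self.end - self.start


def passes_size_filter(cnv: CNV) -> bool:
    """Keep calls of at least 100 kb supported by more than 25 probes."""
    return cnv.size >= MIN_SIZE_BP and cnv.n_probes > MIN_PROBES
```

Applied to a list of calls, `[c for c in calls if passes_size_filter(c)]` would leave only those meeting both criteria.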

# Rare CNV Burden Analysis

Population control data was obtained from 2,691 phenotypically normal controls analyzed by the same CMA platform (dataset offered by Affymetrix) and from Database of Genomic Variants (http://dgv.tcag.ca/dgv/app/home) (MacDonald et al., 2014). One percent frequency threshold (defined as >50% overlap of length) was applied to retain only rare CNVs. Burden analysis for rare CNVs was performed using PLINK v1.07 and scripts developed in house. Due to the imbalance of gender in the ASD cohort, only autosomal rare CNV burden was analyzed in this study. Three aspects of rare CNV burden were evaluated: the rate (the number of rare CNV events per individual), the CNV size, and the proportion of individuals harboring at least one event. P-values were estimated by permutation function in PLINK (http://pngu.mgh.harvard.edu/purcell/plink/) (MacDonald et al., 2014). The setting of one-sided, 100,000 permutations was used for these comparisons—ASD vs. control, and Severe vs. Mild.
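The rarity filter and the one-sided permutation test described above can be sketched as follows. This is an illustration under stated assumptions, not PLINK's implementation or the in-house scripts: all names are hypothetical, overlap is measured as the fraction of the query CNV's length covered by a reference CNV of the same type, frequency is counted per reference carrier, and the test statistic is the difference in mean events per individual under label permutation.

```python
import random
from typing import Iterable, NamedTuple, Sequence


class Call(NamedTuple):
    sample: str   # sample identifier
    chrom: str
    start: int    # bp
    end: int      # bp
    kind: str     # "loss" or "gain"


def covered_fraction(query: Call, other: Call) -> float:
    """Fraction of the query CNV's length covered by a same-type CNV."""
    if query.chrom != other.chrom or query.kind != other.kind:
        return 0.0
    shared = min(query.end, other.end) - max(query.start, other.start)
    return max(shared, 0) / (query.end - query.start)


def is_rare(query: Call, reference: Iterable[Call], n_reference: int,
            max_freq: float = 0.01, min_overlap: float = 0.5) -> bool:
    """True if fewer than 1% of reference samples carry a >50%-overlapping CNV."""
    carriers = {ref.sample for ref in reference
                if covered_fraction(query, ref) > min_overlap}
    return len(carriers) / n_reference < max_freq


def one_sided_permutation_p(group_a: Sequence[int], group_b: Sequence[int],
                            n_perm: int = 100_000, seed: int = 0) -> float:
    """One-sided p-value for 'rate in group_a > rate in group_b'.

    Each element is the number of rare CNV events in one individual.
    Group labels are shuffled n_perm times.
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    pooled = list(group_a) + list(group_b)

    def rate_diff(values):
        a, b = values[:n_a], values[n_a:]
        return sum(a) / len(a) - sum(b) / len(b)

    observed = rate_diff(pooled)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if rate_diff(pooled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)
```

The same function covers both comparisons named in the text by passing ASD versus control counts, or Severe versus Mild counts, as `group_a` and `group_b`.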

### Regions of Interest

The chromosomal regions of known ASD loci were based on the summary by Pinto et al., including well-established ASD loci with multiple lines of evidence (Pinto et al., 2014). The list of high-risk ASD genes with strong evidence was taken from the curated SFARI Gene database (https://gene.sfari.org/, "category 1–high confidence" and "category 2–strong candidate"). The LoF-intolerant gene list was generated based on the pLI score in the Exome Aggregation Consortium (http://exac.broadinstitute.org/, genes with pLI > 0.99 were included) (Lek et al., 2016). These lists of genes are included in **Supplementary Table 2**.
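Building the LoF-intolerant gene list reduces to a threshold on pLI. A minimal sketch follows; the function name and the placeholder gene names and scores in the usage note are illustrative assumptions, not values from ExAC.

```python
from typing import Mapping, Set

PLI_THRESHOLD = 0.99  # cutoff stated in the text


def lof_intolerant_genes(pli_by_gene: Mapping[str, float],
                         threshold: float = PLI_THRESHOLD) -> Set[str]:
    """Return genes whose pLI score is strictly above the threshold."""
    return {gene for gene, pli in pli_by_gene.items() if pli > threshold}
```

With made-up scores such as `{"GENE_A": 0.999, "GENE_B": 0.99, "GENE_C": 0.42}`, only `GENE_A` would be retained, since the comparison is strict.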

TABLE 1 | The occurrence of rare CNVs by type, size and regions of interest.

*R, rate of CNV (number of events per individual); HiRisk genes, high-risk ASD genes; LoF-I genes, LoF-intolerant genes. See "2.4 Regions of interest" for details. P* < *0.05 was displayed in bold.*

CNVs overlapping with chromosomal regions of known ASD loci (>80% overlap, and of the corresponding type of deletion or duplication) were considered clinically relevant and counted as "with ASD loci" in burden analysis. In the analysis of potentially disruptive events of LoF-intolerant genes, any loss events intersecting the genic regions were counted, while gain events were counted only when starting or stopping within the genic regions (resulting in partially duplicated genes).
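The counting rules above can be written as two predicates. This is a sketch under assumptions rather than the in-house scripts: the names are hypothetical, coordinates in any usage are made up, and the ">80% overlap" is interpreted here as the fraction of the known locus covered by the CNV.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # bp
    end: int    # bp


def overlap_bp(a: Interval, b: Interval) -> int:
    """Number of shared base pairs between two intervals."""
    if a.chrom != b.chrom:
        return 0
    return max(0, min(a.end, b.end) - max(a.start, b.start))


def hits_known_locus(cnv: Interval, cnv_kind: str,
                     locus: Interval, locus_kind: str,
                     min_fraction: float = 0.8) -> bool:
    """CNV matches the locus type and covers more than 80% of the locus."""
    if cnv_kind != locus_kind:
        return False
    return overlap_bp(cnv, locus) / (locus.end - locus.start) > min_fraction


def potentially_disrupts(cnv: Interval, cnv_kind: str, gene: Interval) -> bool:
    """Losses count on any intersection; gains only if a breakpoint lies in the gene."""
    if cnv.chrom != gene.chrom:
        return False
    if cnv_kind == "loss":
        return overlap_bp(cnv, gene) > 0
    start_inside = gene.start < cnv.start < gene.end
    end_inside = gene.start < cnv.end < gene.end
    return start_inside or end_inside
```

Under this rule a gain that spans an entire gene is not counted, because the gene is duplicated whole rather than interrupted by a breakpoint.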

### RESULTS

#### Diagnostic Yield of CMA in Chinese ASD

Based on the American College of Medical Genetics and Genomics (ACMG) guideline of CNV interpretation (Kearney et al., 2011), 17 out of 405 individuals in our ASD cohort were found to harbor pathogenic CNVs. These CNVs included: (1) 6 CNVs in the regions of 8p23.3p23.1, 10q11.2, 4q31.21q33, 3p14.1, and 17p12; (2) 10 CNVs with at least 80% overlap of known ASD loci; and (3) 2 CNVs involving high-risk ASD genes, among which the TAOK2-relevant CNV also resided in the known 16p11.2 ASD locus. These CNVs are summarized in **Table 2** (see "2.4 Region of interest" for details of known ASD loci and high-risk ASD genes). This resulted in a diagnostic yield of approximately 4.2% for CMA in the ASD cohort.

# Rare CNV Burden in ASD vs. Control and the Correlation With Severity

The occurrence of rare CNV events is summarized in **Table 1**, stratified by CNV type and size. Only autosomal CNVs were analyzed in this study, considering the gender bias in the ASD cohort. When taking all the CNVs above 100 kb into consideration, the average occurrence of rare loss events was 0.369 per person in the ASD cohort, significantly higher than the occurrence rate of 0.259 per person in the control cohort (R<sub>ASD</sub>/R<sub>Control</sub> = 1.43, p = 0.021, one-sided, 100,000 permutations). The rate of rare loss events at all size ranges was nominally higher in the ASD cohort, but statistical significance was not reached at any particular size range. For rare gain events, no significant difference in the occurrence rate was found between the ASD and control cohorts (R<sub>ASD</sub>/R<sub>Control</sub> = 0.76, p = 0.996).

In the comparison between ASD individuals categorized by severity, rare gain events occurred at a higher rate in the severely affected individuals (right panel of **Table 1**, R<sub>Severe</sub>/R<sub>Mild</sub> = 1.43, p = 0.003), and this difference was significant at the small size range of 100–400 kb (R<sub>Severe</sub>/R<sub>Mild</sub> = 1.41, p = 0.010). No correlation between the occurrence rate of rare loss events and ASD severity was found ("All" size range, R<sub>Severe</sub>/R<sub>Mild</sub> = 0.83, p = 0.861).

Besides the occurrence rate, we also analyzed two other parameters of burden, including CNV size and the proportion of individuals harboring at least one rare CNV event. However, no significant difference in these two measures was found, either in the comparison of "ASD vs. control" or in "Severe vs. Mild" (data not shown).

Taken together, the difference of CNV burden between ASD and control was mainly in the rate of rare loss events, while within the ASD cohort, the severity correlated with the rate of rare gain events.


TABLE 2 | Rare CNVs with clinical significance, in known ASD loci or intersected with high-risk genes.

# Rare CNVs in Regions of Interest–Recurrent ASD Loci and High-Risk ASD Genes

A total of 10 rare CNVs (6 losses and 4 gains) overlapped with known recurrent ASD loci (see **Supplementary Table 2** for the list of chromosomal locations). Among these loci, 15q11q13 duplications and 22q11.2 deletions were recurrently found in the ASD cohort (**Table 2**). Heterozygous losses of two high-risk ASD genes, NRXN1 and TAOK2, were found in two ASD individuals.

Rare loss events involving well-known ASD loci and high-risk ASD genes were not found in controls. However, potentially disruptive gain events were found in the control cohort, at an even higher rate than in the ASD cohort (lower left part of **Table 1**). Between the Mild and Severe groups, no significant difference in rare CNV events overlapping with known loci/genes was observed (lower right part of **Table 1**).

### Genes Intolerant of LoF Variants Intersected by Rare CNVs in the ASD Cohort

Genes intolerant of LoF variants were prioritized candidates of ASD risk factors (see "2.4 Regions of interest" for details of gene list). Rare CNVs identified in the ASD cohort were interrogated for potential disruption of these evolutionarily constrained genes. The occurrence of loss events and potentially disruptive gain events (when starting or stopping within the genic region, resulting in a partially duplicated gene) in genes intolerant of LoF variants is summarized in **Tables 1**, **3**. Loss events intersecting these constrained genes were enriched in the ASD cohort (2.95 times higher rate, **Table 1**), but potentially disruptive gain events were found in both the control and ASD cohorts. RIMS2, PTPRT, FRMD4A, HSPA14, and CCZ1 were intersected by loss events (**Table 3**), and a total of 24 LoF-intolerant genes were intersected by potentially disruptive gain events only in the ASD cohort (**Supplementary Table 3**). In particular, RIMS2 was found in two rare CNVs (one loss and one gain) in two independent ASD patients, and both of the affected individuals presented severe symptoms (**Table 3**).

#### DISCUSSION

#### Diagnostic Yield of CMA Can Be Affected by Heterogeneity of the ASD Cohort

The diagnostic yield of CMA in this study was 4.2%, as clinically significant CNVs were identified in 17 out of 405 ASD individuals. This yield is slightly lower than that of the majority of ASD studies based on cohorts of European ancestry, which reported a diagnostic rate of 5–10% (Shen et al., 2010; Schaefer et al., 2013). In three published studies on Chinese ASD, the diagnostic rate was reported to be 8.6, 5.1, and 3.5% in cohorts with sample sizes of 104, 335, and 228, respectively (Gazzellone et al., 2014; Yin et al., 2016; Mak et al., 2017). Besides the potential bias in patient origin and CNV analysis, the difference in diagnostic yields could be attributed to the presence of comorbidity in the cohort. When CMA was performed in ASD patients with comorbid intellectual disability, microcephaly or other congenital anomalies in our center, the yield increased to 15% (Fan et al., 2018). Affected individuals in this study were relatively "pure": over 95% of the ASD cohort were free of major systemic anomalies. The 4.2% diagnostic yield found in this study is exactly the same as the finding in the "essential group" of ASD by Tammimies et al., who found the yield increased drastically from 4.2 to 24.5% in the "complex group" with co-presence of morphological anomalies (Tammimies et al., 2015).

TABLE 3 | Novel candidate genes intersected by rare CNVs in this study.

*DGV freq: the population frequency of respective CNV in Database of Genomic Variants (http://dgv.tcag.ca/dgv/app/home).*

#### Rare Gain Burden Was Implicated to Correlate With the Phenotypic Severity

Our pilot study on rare CNV burden in ASD of Chinese ancestry suggested increased occurrence of rare loss events in the ASD cohort. This differs from a prior burden analysis on a large European cohort, which showed that rare "genic" losses and gains were overrepresented but the overall occurrence was not (Pinto et al., 2010). Due to the small sample size in this study, a replication study on a larger Chinese cohort is necessary to ascertain whether CNV burden is influenced by ethnicity.

Our finding also implied a higher burden of rare gains in severe than in mild ASD. A similar, though not identical, observation that rare gains influence the phenotypic outcome in ASD has been reported: the authors found that the burden of duplications, but not deletions, correlated with the severity score (Girirajan et al., 2013). Our results are in agreement with the finding by Girirajan et al., but replication on larger datasets is warranted.

### Potentially Disruptive CNV Events Were Found in Genes Intolerant of LoF Variants

Rare loss events disrupting genes extremely intolerant of LoF variants were found to be enriched in the ASD cohort, while rare gain events did not show such enrichment. One explanation is that the impact of deletions (loss) on gene function is relatively definite, while partial duplications (gain events counted as "potentially disruptive" in this study) may not have a deleterious impact on the gene of interest. This may also explain why a nominally higher incidence of rare gains intersecting high-risk ASD genes was found in the control cohort (**Table 1**).

LoF-intolerant genes disrupted by rare loss events in ASD are prioritized candidates in this study. Among the five genes intersected by rare loss events, RIMS2 was also intersected by a rare gain found in another individual of the ASD cohort. No study so far has reported the association of RIMS2 with genetic disorders, but its homologues RIMS3 and RIMS4 have been implicated as autism risk factors (Kumar et al., 2010; Leblond et al., 2018). RIMS2 codes for a presynaptic protein that regulates synaptic membrane exocytosis and mediates neurotransmitter release during short- and long-term synaptic plasticity (Kaeser and Südhof, 2005). Given the well-established role of synaptic plasticity in ASD etiology (Bourgeron, 2015), genetic variants of RIMS2 could affect synaptic regulation and confer risk to ASD.

# CONCLUSION

This study investigated rare CNVs in a Chinese ASD cohort. The diagnostic yield of CMA was 4.2%, and CNV burden analysis suggested overrepresentation of rare losses in ASD, whereas the symptom severity correlated with rare gain burden. Additionally, rare losses intersecting LoF-intolerant genes were enriched in ASD. The CNV burden and potential candidates implicated in this pilot study should be validated in larger ASD cohorts for definite clues of genetic etiology.

# DATA AVAILABILITY STATEMENT

The CNVs in this study can be found in the LOVD database (https://databases.lovd.nl/shared/individuals), with accession numbers #00181115 to #00181139.

# AUTHOR CONTRIBUTIONS

YF and XD performed the data analysis and drafted the manuscript. XD, XL, and FL collected the clinical information and performed the psychiatric diagnostic evaluation. LW performed the experiments of chromosomal microarray. YF, FL, and YY designed and supervised the study.

# FUNDING

This work was supported by the National Key R&D Program of China (2018YFC1002204, to YY), the National Natural Science Foundation of China (No. 81500972 and No. 81873735, to YF; No. 81670812, to YY; No. 81571031 and No. 81761128035, to FL), the Shanghai Municipal Education Commission (No.15CG14, to YF; No.20152234, to FL), and the Jiaotong University Cross Biomedical Engineering (No. YG2017MS72, to YY), the Shanghai Municipal Commission of Health and Family Planning (No.201740192, to YY; No.2017ZZ02026, No. 2018BR33, No.2017EKHWYX-02 and No.GDEK201709 to FL), the Shanghai Shen Kang Hospital Development Center new frontier technology joint project (No.SHDC12017109, to YY; No.16CR2025B, to FL), Shanghai Committee of Science and Technology (No.17XD1403200 & No.18DZ2313505, to FL), Xinhua Hospital of Shanghai Jiao Tong University School of Medicine (2018YJRC03, Talent introduction-014 & Top talent-201603, to FL).



# ACKNOWLEDGMENTS

The authors were grateful for the support of all the participants and families in this study.

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2018.00665/full#supplementary-material




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Fan, Du, Liu, Wang, Li and Yu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# MECP2 Mutation Interrupts Nucleolin–mTOR–P70S6K Signaling in Rett Syndrome Patients

Carl O. Olson<sup>1</sup>, Shervin Pejhan<sup>1</sup>, Daniel Kroft<sup>1</sup>, Kimia Sheikholeslami<sup>1,2</sup>, David Fuss<sup>1</sup>, Marjorie Buist<sup>1</sup>, Annan Ali Sher<sup>1</sup>, Marc R. Del Bigio<sup>3</sup>, Yehezkel Sztainberg<sup>4</sup>, Victoria Mok Siu<sup>5</sup>, Lee Cyn Ang<sup>6</sup>, Marianne Sabourin-Felix<sup>7</sup>, Tom Moss<sup>7</sup> and Mojgan Rastegar<sup>1</sup>\*

<sup>1</sup> Regenerative Medicine Program, and Department of Biochemistry and Medical Genetics, Max Rady College of Medicine, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, MB, Canada, <sup>2</sup> Faculty of Medicine, University of Toronto, Toronto, ON, Canada, <sup>3</sup> Department of Pathology, Max Rady College of Medicine, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, MB, Canada, <sup>4</sup> Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, United States, <sup>5</sup> Division of Medical Genetics, Department of Paediatrics, Schulich School of Medicine, Western University, London, ON, Canada, <sup>6</sup> Department of Pathology, Schulich School of Medicine and Dentistry, Western University, London, ON, Canada, <sup>7</sup> Cancer Division of the Quebec University Hospital Research Centre, Department of Molecular Biology, Medical Biochemistry and Pathology, Faculty of Medicine, Laval University, Quebec City, QC, Canada

#### Edited by:

Zhexing Wen, Emory University School of Medicine, United States

#### Reviewed by:

Xinyuan Wang, University of Pennsylvania, United States Ying Zhou, Shanghai Jiao Tong University, China

\*Correspondence:

Mojgan Rastegar mojgan.rastegar@umanitoba.ca

#### Specialty section:

This article was submitted to Epigenomics and Epigenetics, a section of the journal Frontiers in Genetics

Received: 01 June 2018 Accepted: 27 November 2018 Published: 19 December 2018

#### Citation:

Olson CO, Pejhan S, Kroft D, Sheikholeslami K, Fuss D, Buist M, Ali Sher A, Del Bigio MR, Sztainberg Y, Siu VM, Ang LC, Sabourin-Felix M, Moss T and Rastegar M (2018) MECP2 Mutation Interrupts Nucleolin–mTOR–P70S6K Signaling in Rett Syndrome Patients. Front. Genet. 9:635. doi: 10.3389/fgene.2018.00635

Rett syndrome (RTT) is a severe and rare neurological disorder that is caused by mutations in the X-linked MECP2 (methyl CpG-binding protein 2) gene. MeCP2 protein is an important epigenetic factor in the brain and in neurons. In Mecp2 deficient neurons, nucleoli structures are compromised. Nucleoli are sites of active ribosomal RNA (rRNA) transcription and maturation, a process mainly controlled by nucleolin and mechanistic target of rapamycin (mTOR)–P70S6K signaling. Currently, it is unclear how nucleolin–rRNA–mTOR–P70S6K signaling from RTT cellular model systems translates into human RTT brain. Here, we studied the components of nucleolin–rRNA–mTOR–P70S6K signaling in the brain of RTT patients with common T158M and R255X mutations. Immunohistochemical examination of T158M brain showed disturbed nucleolin subcellular localization, which was absent in Mecp2-deficient homozygous male or heterozygote female mice, compared to wild type (WT). We confirmed by Western blot analysis that nucleolin protein levels are altered in RTT brain, but not in Mecp2-deficient mice. Further, we studied the expression of rRNA transcripts in Mecp2-deficient mice and RTT patients, as downstream molecules that are controlled by nucleolin. By data mining of published ChIP-seq studies, we showed MeCP2-binding at the multi-copy rRNA genes in the mouse brain, suggesting that rRNA might be a direct MeCP2 target gene. Additionally, we observed compromised mTOR–P70S6K signaling in the human RTT brain, a molecular pathway that is upstream of rRNA–nucleolin molecular conduits. RTT patients showed significantly higher phosphorylation of active mTORC1 or mTORC2 complexes compared to age- and sex-matched controls. Correlational analysis of mTORC1/2–P70S6K signaling pathway identified multiple points of deviation from the control tissues that may result in abnormal ribosome biogenesis in RTT brain. To our knowledge, this is the first report of deregulated nucleolin–rRNA–mTOR–P70S6K signaling in the human RTT brain. Our results provide important insight toward understanding the molecular properties of human RTT brain.

Keywords: MECP2 mutations, Rett syndrome, human brain tissues, DNA methylation, ribosome biogenesis, mTOR, nucleolin, protein translation

# INTRODUCTION

Methyl CpG-binding protein 2 gene was discovered in 1992, encoding for MeCP2 as an important member of the DNA methyl binding proteins (MBP) (Lewis et al., 1992). MeCP2 is an epigenetic regulator with crucial functions in the brain and in neurons (Delcuve et al., 2009; Ezeonwuka and Rastegar, 2014; Liyanage et al., 2014). De novo mutations of the X-linked MECP2 gene are the underlying cause of ∼95% cases of RTT (Amir et al., 1999). RTT is a severe and rare progressive neurodevelopmental disease in females (1:10,000), with few cases of reported male patients (Liyanage and Rastegar, 2014). RTT patients appear normal at the beginning of their life, but by 6–18 months, they exhibit developmental regression and loss of acquired skills, along with neurological symptoms that may include seizures, ataxia, and autistic characteristics.

It is well established that MECP2 deficiency in neurons is associated with compromised protein synthesis (Li et al., 2013), a fundamental process in all cells including neurons. Protein synthesis is tightly regulated and has multiple rate-limiting steps. Of those steps, ribosome biogenesis and rRNA synthesis are largely controlled (Moss and Langlois, 2007). Eukaryotic ribosomes are subcellular organelles made of rRNA transcripts and a multitude of ribosomal proteins. The process of rRNA synthesis, in turn, is a rate-limiting step for ribosome biogenesis. The multi-copy rRNA genes are initially transcribed by polymerase I as 45S pre-rRNA precursors in the nucleolus that are processed into 18S, 28S, and 5.8S rRNAs (Moss, 2004; Moss and Langlois, 2007). RNA polymerase I activity is controlled by nucleolin and the mTOR–P70S6K ribosomal protein pathway. It has been reported that the 28S and 18S rRNA transcripts are reduced in murine Mecp2-deficient neurons (Gabel et al., 2015), and that mTOR signaling is impaired in RTT mouse models (Ricciardi et al., 2011). However, no study has been done in the human RTT brain, and there is no report on whether nucleolin levels are altered in human RTT brain. As nucleolin controls rRNA synthesis/ribosome biogenesis, and this process is also controlled by mTOR–P70S6K signaling, we hypothesized that MeCP2 mutations in human RTT brain would be associated with deregulation of nucleolin, rRNA transcripts, and mTOR–P70S6K signaling. Previous reports have highlighted a role for MeCP2 in organizing neuronal nucleoli structure during embryonic development (Singleton et al., 2011), while pointing toward MeCP2 recruitment at the nucleolar periphery of Purkinje cells in mouse cerebellum. This is suggestive of MeCP2 binding to methylated rRNA genes at peri-nucleolar parts of the nucleus (Payen et al., 1998), introducing rRNA genes as potential direct target genes of MeCP2.
While these studies highlight a functional importance for MeCP2 in embryonic neuronal nucleoli and Purkinje cells of mice, it is unclear if human RTT cerebellum has nucleolar deficits. MeCP2 levels are highest in neurons, and of different brain regions, cerebellum has the highest neuronal density (Marzban et al., 2014). The cerebellum has established links with autism, cognitive characteristics, ataxia, and memory function (some of the main RTT phenotypic characteristics), and has been studied for RTT-associated research on the mechanism of disease (Ben-Shachar et al., 2009; Rangasamy et al., 2016; Rastegar, 2017). Therefore, in our studies we focused on human RTT cerebellum.

Here, we report that human RTT brain shows deregulation of multiple molecules upstream of protein translation, altered nucleolin protein levels, and mTOR–P70S6K pathway. To our knowledge, the possible link between MeCP2, nucleolin levels, and mTOR–P70S6K pathway in RTT and other MeCP2 associated neurological disorders (i.e., MDS) is a novel concept that is being reported. Data presented in this study suggest a potential regulatory role for MeCP2 that may lead to a better understanding of MeCP2-associated disease pathobiology.

# MATERIALS AND METHODS

#### Immunohistochemistry

Dissected mouse brains were fixed in ice-cold, freshly de-polymerized 2% PFA (0.16 M sodium phosphate buffer, pH 7.4 with PFA), followed by incubation in cryoprotectant (25 mM sodium phosphate buffer, pH 7.4, 10% sucrose, and 0.04% NaN3) at 4°C for at least 24 h. Ten-micron mouse brain cryosections were processed onto gelatinized slides and stored at −20°C. Slides were air-dried at room temperature prior to use. Human brain sections (5 µm) were incubated in an oven at 60°C for 30 min, then deparaffinized using sequential incubations of 4 × 5 min xylene, 2 × 1 min 100% ethanol, 1 × 1 min 95% ethanol, 1 × 1 min 70% ethanol, 1 × 1 min running tap water, and 1 × 1 min distilled de-ionized water. De-paraffinized human brain sections were treated using Tris-EDTA antigen retrieval buffer (10 mM Trizma base, 1 mM EDTA, pH 9.0, 0.05% Tween-20) or citrate antigen retrieval buffer (10 mM sodium citrate, pH 6.0, 0.05% Tween-20), both at boiling temperature for 20 min, followed by 3 min air-cooling and 3 × 5 min TBS (50 mM Trizma base, pH 7.6, 1.5% NaCl) wash. Human and mouse brain sections were permeabilized for 20 min in TBS-Tr (50 mM Trizma base, pH 7.6, 1.5% NaCl, 0.3% Triton X-100) and preblocked with 10 or 20% normal goat serum (NGS) in TBS-Tr overnight at 4°C. Immunohistochemistry (IHC) was performed using rabbit polyclonal anti-nucleolin (Abcam, ab22758) primary antibody in TBS-Tr with serum. Secondary antibody, goat anti-rabbit Alexa 594 (Thermo Fisher, A11037), was also diluted in TBS-Tr with serum and applied for 1 h at room temperature, followed by washes using 3 × 20 min TBS-Tr and 1 × 15 min Tris–HCl buffer (50 mM Trizma base, pH 7.4). Sudan Black counterstaining (0.1% w/v in 70% EtOH) for 30 min followed secondary antibody washes for citrate antigen retrieval samples, followed by 1 × 5 min wash with 70% EtOH and 3 × 5 min wash with TBS. DAPI counterstaining and washes with Tris–HCl were performed, followed by application of Prolong Gold (Thermo Fisher, P36930) antifade and cover slipping. Immunolabeling was detected using an Axio Observer Z1 inverted microscope and LSM710 confocal microscope (Carl Zeiss Canada Ltd.), as previously described (Olson et al., 2014). Images were obtained and analyzed using Zen (Carl Zeiss Canada Ltd.) software, and assembled into figures using Adobe Photoshop C5 and Adobe Illustrator C5. Please refer to **Supplementary Tables S1**, **S2** for the list of primary and secondary antibodies.

**Abbreviations:** DBD, DNA-binding domain; EEG, electroencephalogram; GCL, granular cell layer; IGV, Integrative Genomics Viewer; MBP, methyl binding proteins; MDS, MECP2 duplication syndrome; MECP2, methyl CpG-binding protein 2 (human gene)/Mecp2 (murine gene)/MeCP2 (protein); ML, molecular layer; mTOR, mechanistic target of rapamycin; PCL, Purkinje cell layer; PFA, paraformaldehyde; PMI, post-mortem delay; qRT-PCR, quantitative real-time PCR; RPM, reads per million; rRNA, ribosomal RNA; RTT, Rett syndrome; SEM, standard error of the mean; TRD, transcriptional repression domain; WB, Western blot; WGBS, whole-genome bisulfite sequencing; WT, wild type.

#### Western Blot

Nuclear and cytoplasmic extraction from brain tissues was carried out using the NE-PER™ Nuclear and Cytoplasmic Extraction Kit (Thermo Scientific Inc., 78835) as per the manufacturer's instructions, and as we reported (Olson et al., 2014). Total protein cell extracts were prepared by a high-salt protocol as we have reported (Lahuna et al., 2000; Rastegar et al., 2000; Wu et al., 2001; Nolte et al., 2006). WB experiments and quantification of the signals were performed as we reported (Olson et al., 2014; Nagakannan et al., 2016). AlphaEaseFC (version 6.0.0, Alpha Innotech) software was used for quantification. Please refer to **Supplementary Tables S1**, **S2** for the list of primary and secondary antibodies.

As a loading control for WBs with total cell extracts, we used GAPDH, a commonly used housekeeping protein, which appeared to be consistently detectable across different samples when the same amount of protein was loaded for each sample. Similarly, GAPDH signals in the cytoplasmic extracts remained constant, providing a reliable indication that comparable amounts of protein were loaded for each sample. This is in agreement with GAPDH being reported as a key enzyme for glycolysis (Bruns and Gerald, 1976; Sirover, 1999) in the cytoplasm. We also used GAPDH as a loading control for nuclear extracts, with consistent detection among different samples when the same amount of nuclear protein extract was loaded onto the gels. This is in accordance with the reported role of nuclear GAPDH in maintenance and protection of telomeric DNA (Sundararaj et al., 2004), and also with its functional role in controlling histone H2B expression (Zheng et al., 2003; Nicholls et al., 2012). While the levels of loaded proteins in the cytoplasmic and nuclear extracts were verified by GAPDH signals, detection of histone H3 (pan H3 and acetylated H3) and S100 protein was used to verify the quality of the nuclear and cytoplasmic extracts, respectively.

#### Quantitative Real-Time PCR (qRT-PCR)

Total RNA from murine and human brain regions was extracted by Trizol, as we reported elsewhere (Rastegar et al., 2004; Kobrossy et al., 2006). Quantitative RT-PCR was done using SYBR Green-based RT2 qPCR Master Mix (Applied Biosystems, 4367659) in an Applied Biosystems Fast 7500 Real-Time PCR machine. The threshold cycle value (Ct) for each gene was obtained from the Applied Biosystems Fast 7500 Real-Time PCR machine and the values were normalized against a housekeeping gene (Gapdh). The ΔCt value was then obtained for each sample, and the relative level of each gene was calculated as 2<sup>−ΔCt</sup>. Analysis was done in Microsoft Excel 2010, and the 2<sup>−ΔCt</sup> values of each gene were transferred to GraphPad Prism 6.0 to generate the final graphs, similar to the analysis we reported previously (Liyanage et al., 2013, 2015). Statistical significance was determined by Welch's t-test, with ∗∗∗∗p < 0.0001, ∗∗∗p < 0.001, ∗∗p < 0.01, or ∗p < 0.05.
The sequences of the primers used in qRT-PCR reactions are as follows: mouse nucleolin: forward: 5′-AAGCAGCACCTGGAAAACG-3′, reverse: 5′-TCTGAGCCTTCTACTTTCTGTTTCTTG-3′ (Monte et al., 2013); mouse GAPDH: forward: 5′-ATGTCGTGGAGTCTACTGG-3′, reverse: 5′-GTGGTGCAGGATGCATTGC-3′; mouse 45S rRNA: forward: 5′-GAGAGTCCCGAGTACTTCAC-3′, reverse: 5′-GGAGAAACAAGCGAGATAGG-3′ (Chen et al., 2008); human/mouse 28S rRNA: forward: 5′-AGAGGTAAACGGGTGGGGTC-3′, reverse: 5′-GGGGTCGGGAGGAACGG-3′ (Uemura et al., 2012); human/mouse 18S rRNA: forward: 5′-GATGGTAGTCGCCGTGCC-3′, reverse: 5′-GCCTGCTGCCTTCCTTGG-3′ (Uemura et al., 2012); human GAPDH: forward: 5′-CCACTCCTCCACCTTTGAC-3′, reverse: 5′-ACCCTGTTGCTGTAGCCA-3′; human nucleolin: forward: 5′-AGCAAAGAAGGTGGTCGTTT-3′, reverse: 5′-CTTGCCAGGTGTGGTAACTG-3′; human 45S rRNA: forward: 5′-CTCCGTTATGGTAGCGCTGC-3′, reverse: 5′-GCGGAACCCTCGCTTCTC-3′.
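The relative-expression calculation described above can be illustrated with a minimal Python sketch. It is not the authors' Excel workflow; the function name and the Ct values are hypothetical and serve only to show how a target Ct is normalized to the housekeeping gene and converted to 2^−ΔCt.

```python
def relative_expression(ct_target: float, ct_reference: float) -> float:
    """Return 2^-dCt, where dCt = Ct(target gene) - Ct(housekeeping gene)."""
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


# Hypothetical Ct values (target, housekeeping) for illustration only.
example_cts = {
    "control_1": (24.0, 18.0),
    "patient_1": (23.0, 18.0),
}

for sample, (ct_gene, ct_gapdh) in example_cts.items():
    # A lower Ct for the target at equal housekeeping Ct means higher expression.
    print(sample, relative_expression(ct_gene, ct_gapdh))
```

Because each unit of ΔCt corresponds to one PCR doubling, a sample whose ΔCt is one cycle lower has twice the relative transcript level.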

#### Correlation Analysis and Ratios

The correlations between protein contents of mTOR, mTORC1 (2448), mTORC2 (2481), P70S6K, and phosphorylated P70S6K were determined by Pearson's correlation analysis. WB signals were normalized against GAPDH loading control, and individual values for each signal were compared to the average of the controls in that blot; this was done in order to render values from different technical replicates comparable. We then calculated Pearson's correlation coefficient (r) for the normalized values of each pair of molecules within the RTT patients and controls. The power of correlation is reported as follows: very weak/poor, 0 < r < 0.3; moderate/medium, 0.3 < r < 0.4; strong, 0.4 < r < 0.7; and very strong, 0.7 < r < 1.0. Due to the nature of the data, significance was not computed.
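As a rough illustration of this procedure, the following Python sketch normalizes WB signals to GAPDH, rescales them to the control average of the same blot, computes Pearson's r, and assigns the strength category listed above. The function names are hypothetical. Two choices are assumptions not stated in the text: the absolute value of r is used so that negative correlations can be classified, and boundary values (for example r = 0.3) are assigned to the higher category.

```python
from math import sqrt


def normalize_to_controls(signal, gapdh, is_control):
    """Divide each WB signal by its GAPDH signal, then by the control mean of the blot."""
    ratios = [s / g for s, g in zip(signal, gapdh)]
    control_ratios = [r for r, c in zip(ratios, is_control) if c]
    control_mean = sum(control_ratios) / len(control_ratios)
    return [r / control_mean for r in ratios]


def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_strength(r):
    """Map r onto the categories used in the text (absolute value is an assumption)."""
    magnitude = abs(r)
    if magnitude < 0.3:
        return "very weak/poor"
    if magnitude < 0.4:
        return "moderate/medium"
    if magnitude < 0.7:
        return "strong"
    return "very strong"
```

Rescaling to the control mean of each blot is what makes values from different technical replicates comparable before they are pooled for the correlation.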

Ratios between molecules were computed for control tissues (n = 3), average RTT patients (n = 4), and each RTT patient individually. The average of normalized WB signals for each sample was compared between the phosphorylated proteins to the corresponding total protein to obtain ratios. Error bars for each sample represent the SEM for the RTT and control ratios. Significance was computed using FDR-adjusted multiple t-tests, with an alpha of 0.05.
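The ratio calculation can be sketched in Python as follows. The helper names are hypothetical, and the text does not state which FDR procedure was applied; the Benjamini–Hochberg adjustment shown here is an assumption chosen as one common option, not a description of the authors' analysis.

```python
from math import sqrt
from statistics import mean, stdev


def phospho_to_total_ratio(phospho_signal, total_signal):
    """Ratio of a normalized phosphorylated-protein signal to its total-protein signal."""
    return phospho_signal / total_signal


def mean_and_sem(values):
    """Mean and standard error of the mean (SEM is 0.0 for a single value)."""
    m = mean(values)
    sem = stdev(values) / sqrt(len(values)) if len(values) > 1 else 0.0
    return m, sem


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

A comparison would then be called significant when its adjusted p-value falls below the alpha of 0.05 given in the text.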

#### Sequence Data Mining

The raw sequence data was obtained for both ChIP and input DNA samples (GSM1464563 and GSM1464564) (Gabel et al., 2015) and was then aligned using Bowtie2 (Langmead and Salzberg, 2012) to the mouse genome version MmGRCm38, to which a single copy of the mouse rDNA repeat sequence (GenBank BK000964v3) was added as an additional chromosome. For practicality, the origin of the rDNA repeat was placed at the EcoRI site at 30,493, so that the pre-rRNA initiation site is now located at nucleotide 14,815.
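The re-origination of the repeat amounts to a circular rotation of coordinates, which the following Python sketch illustrates. It rests on two assumptions that are not stated in the text: that the repeat is 45,306 bp long, and that the pre-rRNA initiation site sits at position 1 of the original GenBank numbering. Under these assumptions the rotation reproduces the stated position of 14,815. The function names are hypothetical.

```python
REPEAT_LENGTH = 45_306  # assumed length of the rDNA repeat (not given in the text)
NEW_ORIGIN = 30_493     # EcoRI site used as the new first base (from the text)


def rotate_coordinate(old_pos, origin=NEW_ORIGIN, length=REPEAT_LENGTH):
    """Map a 1-based position in the original numbering to the rotated numbering."""
    if not 1 <= old_pos <= length:
        raise ValueError("position outside the repeat")
    return (old_pos - origin) % length + 1


def rotate_sequence(sequence, origin):
    """Rotate a circularly permuted sequence so that 1-based `origin` becomes base 1."""
    return sequence[origin - 1:] + sequence[:origin - 1]


# Under the stated assumptions, original position 1 maps to 14,815.
print(rotate_coordinate(1))
```

Placing the origin away from the initiation site keeps the promoter and upstream spacer contiguous in the reference, so reads spanning that region can align without being split across the ends of the added chromosome.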

The deconvoNorm.py script was then used to normalize the data<sup>1</sup> (Mars et al., 2018). Briefly, the aligned reads were extended to 100 bp and the coverage was calculated using BEDtools (Quinlan Lab, University of Utah). The resulting data was then converted to RPM, and the sample DNA coverage was normalized to the input DNA coverage (sample/input) at each base position. The resulting normalized BED files were then converted to BEDgraph format and visualized using IGV (IGV 2.3, Broad Institute). WGBS data (GSM1173783) (Lister et al., 2013) was also analyzed for methyl-dC by alignment to the same composite mouse genome using Bismark v0.10 (Krueger and Andrews, 2011) and Bowtie2, and was again visualized using IGV (IGV 2.3, Broad Institute).
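The normalization steps can be illustrated with a simplified pure-Python sketch. This is not the deconvoNorm.py script and does not call BEDtools; all function names are hypothetical, reads are represented as (start, strand, length) tuples with 0-based half-open coordinates, and positions without input coverage are skipped, which is an assumption about how division by zero would be handled.

```python
from collections import Counter


def extend_read(start, strand, read_len, target=100):
    """Extend a read to `target` bp in its 3' direction; returns (start, end)."""
    if strand == "+":
        return start, start + target
    end = start + read_len
    return max(0, end - target), end


def coverage(reads, target=100):
    """Per-base coverage from (start, strand, read_len) tuples after extension."""
    cov = Counter()
    for start, strand, read_len in reads:
        ext_start, ext_end = extend_read(start, strand, read_len, target)
        for pos in range(ext_start, ext_end):
            cov[pos] += 1
    return cov


def to_rpm(cov, total_reads):
    """Scale raw coverage to reads per million mapped reads."""
    scale = 1_000_000 / total_reads
    return {pos: depth * scale for pos, depth in cov.items()}


def sample_over_input(sample_rpm, input_rpm):
    """Per-base sample/input ratio; positions lacking input coverage are skipped."""
    return {
        pos: value / input_rpm[pos]
        for pos, value in sample_rpm.items()
        if input_rpm.get(pos, 0) > 0
    }
```

Converting both tracks to RPM before taking the ratio corrects for differences in sequencing depth between the ChIP and input libraries.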

#### Ethical Approval and Consent to Participate

All experiments with mice were conducted according to the standards of the Canadian Council on Animal Care with the approval of the Office of Research Ethics of the University of Manitoba, in accordance with approved guidelines on animal experimentation. MeCP2 knockout transgenic mice Mecp2<sup>tm1.1Bird y/−</sup> (null), heterozygote female (Mecp2<sup>tm1.1Bird +/−</sup>), and their WT counterparts were purchased from The Jackson Laboratory, United States. Mouse tissue harvest and the outlined experimental procedures were peer-reviewed and approved under the "animal protocol number 16-031/1/2(AC-11190)" by the University of Manitoba Bannatyne Campus Protocol Management and Review Committee. Samples from MECP2-Tg1 and Tg3 mice were received from Dr. Huda Zoghbi, Baylor College of Medicine, Houston, TX, United States, as they previously reported (Samaco et al., 2012). The human tissue research has been reviewed and approved by the University of Manitoba Bannatyne Campus research ethics board (Health Research Board protocol # HS20095 H2016:337). For donated T158M human RTT brain tissues, we obtained appropriate family consent to participate in research (through Dr. Victoria Siu, coauthor), and the T158M RTT post-mortem brain tissues were collected from a 13-year-old female with RTT diagnosis. Control fixed cerebellum tissues are from an age-matched female. Human brain tissues for RNA and protein extractions from RTT patients (R255X: c.763C > T nonsense mutation, 20 and 17 years old, case numbers #4516 and #4882; and G451T case number #4852) and control age-matched female tissues (17, 19, and 20 years old, case numbers #5446, #1347, and #5646) were received from NIH NeuroBiobank at the University of Maryland Brain and Tissue Bank, as frozen and formalin-fixed paraffin-embedded tissues.

#### Clinical Information and History of the Rett Syndrome Patients

The T158M patient was born at 40 weeks gestation following an uncomplicated pregnancy, weighing 2984 g (25th percentile). She was the only child born to a 21-year-old mother and a 33-year-old father, and was diagnosed with RTT at the age of 2½ years. Her early developmental milestones were normal. She rolled at 5 months, started grinding her teeth at 6 months, and crawled and took her first steps at 1 year. At 18 months, she began to regress, losing purposeful hand movements and the ability to walk and to speak, as well as developing severe constipation. Her eye contact became very poor, but was regained by 2½ years of age. From age 2, she was constantly mouthing or wringing her hands. She also exhibited hyperventilation and abdominal bloating. She had intermittent strabismus, and abnormal EEG with generalized grade 4 dysrhythmia. At 2 years, she began exhibiting repetitive hand movements characteristic of RTT, leading to her diagnosis. Genetic testing revealed a T158M mutation in the MECP2 gene, confirming the RTT diagnosis. She was always a happy and passive child who never experienced a period of irritability in association with her regression. On examination at age 4, she showed constant hand-wringing and bruxism. Height was 93 cm (third percentile), weight 15.1 kg (25th percentile) and head circumference 48.2 cm (10th to 25th percentile). By age 8, she was having three types of seizures: staring spells with facial and arm twitching, tonic–clonic seizures, and apneic periods associated with cyanosis. At 9 years of age, she appeared to have multifocal myoclonus. Choking and aspiration episodes became frequent, necessitating recurrent admissions to the critical care unit for intubation and ventilation. By 10 years, she had developed a seizure disorder and respiratory dysfunction, showing characteristic autonomic fluctuations of heart rate and respiratory function.
By age 12 years, her overall health and quality of life had decreased significantly to the point of respiratory insufficiency, requiring continuous BiPAP and frequent suctioning to maintain oxygenation and clear secretions. She died of respiratory failure after the age of 12. She passed away peacefully and her mother requested that her brain be donated for research into RTT. The post-mortem autopsy revealed a partially resolved subdural hemorrhage, mild enlargement of the lateral ventricles, and slight thinning of the posterior corpus callosum, with no other focal abnormalities noted. Upon analysis by microscopy, increased cell density and small pyramidal neurons were noted, as well as diffuse mild microglial activation in the white matter. No other abnormalities were noted.

The NIH case # 4516, R255X mutation, 20-year-old patient was a right-handed female, born vaginally at 43 weeks gestation. At around 16 months, development of her speech stagnated. Over the following 6 months, she began to lose motor function in her hands and bowels/bladder. At 2 years of age, she began to have focal onset unaware seizures lasting 5–8 s and occurring three times per week. She continued to deteriorate until her death at age 20. On autopsy, slight cerebral atrophy was noted. A postmortem genetic analysis identified an R255X mutation in the MECP2 gene.

The NIH case # 4882, R255X mutation, 17-year-old patient was born vaginally at 39.5 weeks gestation, after a prolonged rupture of membranes of 26 h. At birth, she was diagnosed to have torticollis, which was thought to be due to low intrauterine tone. This was corrected with special pillows. At 21 months of age, she was noted to have hypotonia and hyporeflexia, constant repetitive hand and foot movements, and brachycephaly. A genetic test at this time identified a C763T nonsense mutation in the MECP2 gene translating into R255X mutation in the MeCP2 protein, confirming a diagnosis of RTT. The patient passed away at the age of 17 years old. On autopsy, no significant pathologic findings were identified.

<sup>1</sup>https://github.com/mariFelix/deconvoNorm

The NIH case # 4852, G451T mutation, had a limited clinical history available. Her clinical course included kyphoscoliosis and epilepsy, which began in her early childhood. She died at the age of 19 years old and a neuropathological examination revealed a pale substantia nigra.

For further information, including PMI and the years of storage prior to arrival at our lab for research, please refer to **Supplementary Table S3**.

# RESULTS

#### Nucleolin Protein Levels and Sub-Cellular Localization Are Deregulated in the Human T158M RTT Brain

In order to study the impact of MeCP2 mutations on nucleoli structures in humans, we analyzed post-mortem cerebellum of an RTT patient with the most common MeCP2 mutation (T158M). This mutation is recognized as the highest frequency RTT-associated MeCP2 mutation and occurs in the MeCP2 DBD (**Figure 1A**). Nucleolin is a major nucleoli protein that makes up about 10% of total nucleolar proteins (Singleton et al., 2011; Tajrishi et al., 2011). We performed IHC analysis of post-mortem cerebellum tissues from the T158M 13-year-old RTT patient with a clinical RTT diagnosis (**Figures 1B–F**). We detected higher levels of nucleolin in the cerebellum ML, PCL, and GCL of the RTT patient compared to control tissue (**Figures 1B–F**). Especially in the Purkinje cells, a faint nucleolin staining was detected in DAPI-devoid regions of the nucleolus, but was detected at higher levels throughout the nucleus in the RTT patient (**Figure 1E**). Using confocal microscopy, we also detected a clear nucleolin staining in the nucleoli of the GCL and ML cells of the control cerebellum, but this appeared to be faint and distributed throughout the nucleus in the RTT T158M cerebellum (**Figures 1D,F**). No signal was detected in primary antibody omission control samples (**Figure 1G**). Comparative analysis of murine cerebellum in 6-week WT compared to null Mecp2<sup>tm1.1Bird y/−</sup> brain tissues did not show significant differences in the nucleolar morphological organization analyzed by nucleolin staining (**Figures 2A,B**). Our observation in mouse cerebellum was in agreement with a previous report of nucleolin staining in the cortex of adult Mecp2<sup>tm1.1Bird y/−</sup> homozygous mice compared to WT male (Singleton et al., 2011). This is also in agreement with a previous report that detected compromised nucleolar structures in Mecp2-deficient neurons at the embryonic stage, which were corrected by adulthood in mice (Singleton et al., 2011).
In murine cerebellum at 6 weeks of age, we did not detect any differences between the WT female and heterozygote Mecp2<sup>tm1.1Bird −/+</sup> female (**Figures 2A–D**). Due to the X-linked nature of the Mecp2/MECP2 gene, no homozygous female or heterozygote male mice are available for comparison studies. Comparing the results from the T158M RTT patient and Mecp2-deficient transgenic mice, it is possible that there is differential regulation of nucleoli structures in mice and humans that might be detected in human MeCP2 mutant brain tissues. It is also possible that the effect of total MeCP2 protein loss (in Mecp2<sup>tm1.1Bird</sup> mice) would be different from that in an RTT patient who has the full-length protein, but with a specific point mutation.

Next, we asked if the higher nucleolin levels shown by IHC in T158M cerebellum are also detectable by a more quantitative analysis. We isolated protein extracts of the cerebellum from the T158M patient and control tissues for WB. Nucleolin protein levels in T158M female cerebellum were observed to be at higher levels compared to control cerebellum tissues (**Figure 3A**). Examination of Nucleolin transcripts did not show a direct correlation with the protein levels, as transcripts were found to be at the lowest levels in the T158M patient (**Figure 3B**).

In general, the severity of the disease in RTT and the associated phenotypes may vary depending on the type of MECP2 genetic mutation and the affected functional domain of the protein (Liyanage and Rastegar, 2014). While T158M is the highest frequency of MeCP2 mutations in RTT, R255X is the highest frequency of MeCP2 mutation in the TRD, constituting the third most common RTT-associated mutation (**Figure 1A**). In two different cases of R255X mutations in RTT patients, nucleolin levels appeared to be below the control levels (**Figure 3A**), suggesting that altered nucleolin levels might depend on the type of MECP2 mutation. Transcript analysis of Nucleolin mRNA level in the two R255X patients showed slightly increased Nucleolin transcripts compared to controls (**Figure 3B**), suggesting regulation at the level of nucleolin translation or turnover.

Immunohistochemistry examination of nucleolin in the cerebellum tissues of these R255X patients compared to age- and sex-matched controls suggested lower detection of nucleolin in these patients (**Supplementary Figures S1A–D**). However, the quality of the tissues for IHC examination was largely reduced due to the long-term storage of these brain tissues in formalin (NIH Neurobiobank #4516: over 11 years, and NIH Neurobiobank #4882: over 9 years). In both cases, some level of background noise was detected in primary antibody omission control slides for the two RTT patients (**Supplementary Figures S1Ce,De**), indicating that there is some level of autofluorescence in these tissues when trying to visualize the low levels of nucleolin by microscopy. It is important to note that although IHC results may point toward antigen detection in individual cells, WB experiments are more reliable for quantitative comparison of expression levels between different samples.

Next, we studied nucleolin levels in the nuclear and cytoplasmic fractions of the cerebellum from all three patients, as we had noticed a change in nucleolin sub-cellular localization in the T158M patient. The T158M cerebellum showed higher levels of nucleolin in both nuclear and cytoplasmic fractions, while the two R255X and control cerebellum cytoplasmic extracts showed negligible nucleolin levels (**Figures 3C,D**). In order to ensure that the high level of nucleolin protein detected in the cytoplasmic fraction of the T158M patient is not a simple technical error due to contamination with nuclear protein in this patient, we verified the quality of our nuclear–cytoplasmic fractions. Analysis of histone H3 detection along with its specific acetylation modification (H3AC: histone H3 di-acetyl K9–K14) as nuclear-specific proteins indicated that there is no nuclear contamination in the T158M cytoplasmic fraction (**Figure 3E**), as no H3 or H3AC was detected in the T158M sample. Accordingly, examination of a cytoplasmic protein (S100) confirmed negligible detection in the nuclear extracts, compared to the cytoplasmic extracts (**Figure 3E**). We used a housekeeping protein (GAPDH) as loading control in these experiments (**Figure 3E**). Examination of the nuclear extracts from RTT cerebellum showed lower levels of H3 and H3AC in all three patients compared to controls. Accordingly, cytoplasmic extracts of the RTT patients showed lower levels of S100 expression compared to controls (**Figure 3E** and **Supplementary Figure S3**). As GAPDH levels remained relatively consistent among controls and RTT patients, it is possible that lower levels of H3, H3AC, and S100 in RTT patients may have biological relevance. Regardless, quantification of nucleolin signals in the R255X patients normalized to GAPDH, H3, or H3AC in the nuclear extracts showed a trend of decreased nucleolin levels, which was more drastic when normalized to GAPDH (**Figure 3F**). In the T158M patient, the nucleolin level was between twofold and fivefold higher than the control levels, depending on normalization to the GAPDH loading control or to the H3 and H3AC nuclear proteins. Combination of the values from the three RTT patients did not show a significant change from the controls, due to the opposite alteration of nucleolin protein levels in R255X (decreased) versus T158M (increased) patients (**Figure 3G**).
Accordingly, detection of cytoplasmic nucleolin levels in the T158M RTT patient compared to GAPDH loading control or S100 cytoplasmic marker showed an increase of over 100-fold (when compared to S100 levels) (**Figure 3H**), but the change was not significant when all three patients (R255X and T158M) were combined (**Figure 3I**). Importantly, all three patients showed reduced levels of nuclear histones (H3 and H3AC), as well as S100 cytoplasmic protein, that appeared to be statistically significant (**Supplementary Figures S3A–D**). However, understanding the biological relevance of these differences and possible pathological implications requires further investigations.

**Figure 1** | … signals (white) in a female control and a T158M 13-year-old patient is shown, for the three cerebellum layers (granular, Purkinje, and molecular cell layers). (C) Higher magnification images for the three cerebellum layers in control (a–c) and T158M RTT patient (d–f) are shown. (D–F) Confocal images of nucleolin in the three layers of the human cerebellum are shown. Note that in all three layers, nucleolin signals are redistributed from the nucleolus into the nuclei. Scale bars represent 100 µm in (B), 20 µm in (C), and 2 µm in (D–F). Yellow arrows point toward nucleoli structures. (G) Primary antibody omission in female control (a,b) and the T158M RTT patient (c,d) are shown. CTD, C-terminal domain; DBD, DNA-binding domain; GCL, granular cell layer; ID, intervening domain; ML, molecular layer; NCL, nucleolin; NTD, N-terminal domain; PCL, Purkinje cell layer; TRD, transcriptional repression domain.

**Figure 3** | … The data are shown with the following samples in the order of controls (NIH NeuroBiobank case numbers #5646 and #5446) and RTT patients (R255X: c.763C>T nonsense mutation, 20 and 17 years old, case numbers #4516 and #4882), and T158M cerebellum (brain received as donation by family members with appropriate consent for research). Averaged data for the three patients is shown in the "RTT" column. (B) Transcript level of Nucleolin is shown for human cerebellum. Human control and RTT patients are the same as in (A). (C) Same as (A), but for nuclear proteins. (D) Same as in (A,C), but for cytoplasmic extracts. (E) Validation of nuclear and cytoplasmic fractions using antibodies against histone H3, H3 di-acetylation at K9–K14 (H3AC) as nuclear proteins, and astrocytic protein S100 as a cytoplasmic protein. The order of samples is the same as in (A,C,D). (F) Quantification of nuclear nucleolin signals from (C) against GAPDH (loading control), H3, or H3AC as other ubiquitous nuclear proteins. (G) Combined quantification of the three RTT patients from (F). (H) Quantification of cytoplasmic nucleolin signals from (D) against GAPDH (loading control), or S100 as other cytoplasmic protein in the brain. (I) Combined quantification of the three RTT patients from (H). N = 2 for controls and values represent single RTT patients in (F,H) (the two R255X patients, and T158M patient). In (A,B,G,I), N = 2 for controls and N = 3 ± SEM for RTT patients. Statistical significance was determined by Welch's t-test with ∗p < 0.05 and two-way ANOVA.

Next, we tested whether the altered nucleolin levels in the human T158M (higher levels) and R255X (lower levels) brain represent a phenotype that can be detected in one of the most-studied RTT mouse models (Mecp2<sup>tm1.1Bird/Y</sup>). In this transgenic mouse, our IHC studies showed no nucleoli structure alteration by nucleolin staining in the cerebellum (**Figure 2**). The specificity of these detected signals was confirmed by absence of any signal in primary antibody omission control samples (**Supplementary Figures S2A–D**). In agreement with the absence of nucleoli alteration in murine RTT brain, no change in nucleolin was detected in Mecp2<sup>tm1.1Bird y/−</sup> (homozygous, n = 3) cerebellum compared to WT (**Figure 4A**). Accordingly, analysis of H3 and H3AC proteins showed no difference between WT and homozygous Mecp2<sup>tm1.1Bird y/−</sup> mice (**Figure 4B**).

# Detection of Ribosomal RNA Transcripts in Human RTT Brain

It has been reported that Mecp2 deficiency/knockdown in embryonic murine neurons alters the 28S and 18S rRNA transcript levels and causes compromised nucleolar structures that are visible during development (Singleton et al., 2011; Gabel et al., 2015). It is also known that nucleolin plays key roles in the transcription of the 45S pre-rRNA and its processing into mature rRNAs (Durut and Saez-Vasquez, 2015). To study whether rRNA transcripts are impacted in RTT cerebellum, we analyzed 45S pre-rRNA, 28S, and 18S rRNA transcript levels in these three RTT patients compared to age-matched control cerebellum tissues. Although the three patients did not share a common pattern, the two R255X patients (17 and 20 years old) showed a trend of increased rRNA transcripts, whereas the T158M cerebellum showed a trend of decreased rRNA transcripts (**Figure 4B**). Regardless, combination of the results from all three patients suggested a trend of increased rRNA transcripts, which was significant in the case of 18S rRNA (**Figure 4Bd**). These results suggest that in RTT brain, possible deregulation of rRNA synthesis might be mutation-dependent, implicating rRNA synthesis as a possible contributing mechanism in impaired protein translation that warrants further investigation.

To see whether ribosomal RNA transcripts are affected by the absence of MeCP2, we studied rRNA transcripts in the cerebellum of 6-week-old Mecp2-null mice (Mecp2tm1.1Birdy/<sup>−</sup> homozygous, n = 3). While we observed a 15-fold induction of the 45S pre-rRNA, no significant difference in processed 28S and 18S rRNA transcripts compared to WT mice was detected. Further studies in transgenic mice overexpressing MeCP2, with either a twofold (MECP2 Tg1: Tg1) or a threefold (MECP2 Tg3: Tg3) increase in MeCP2 levels, showed decreased levels of 28S rRNA only in Tg3 cerebellum, but not in Tg1 mice (n = 4). These results suggest that a possible regulation of rRNA genes by MeCP2 might be complex and may depend on the type of MECP2 mutation. It is also possible that MeCP2 represses rRNA genes, with a trend of rRNA induction in Mecp2-deficient mice and/or RTT patients and decreased level(s) where MeCP2 is overexpressed (when an alteration is seen in Tg3 mice) (**Figures 4B,C**). Such a role for MeCP2 and other MBDs was previously suggested in non-neuronal murine cells (Goshen et al., 2004).

FIGURE 4 | Nucleolin, MeCP2, histone H3, and histone H3 di-acetylation at K9-K14 (H3AC) in the murine cerebellum along with ribosomal RNA transcripts in the human and murine cerebellum. (A) Western blot (WB) analysis of the total cell extracts of wild type (WT) and Mecp2tm1.1BirdY/<sup>−</sup> (null, N = 3). As expected, no MeCP2 is detected in the Mecp2tm1.1BirdY/<sup>−</sup> cerebellum, and nucleolin levels are not changed. No obvious difference in nucleolin, H3, or H3AC is visible between the WT and Mecp2tm1.1BirdY/<sup>−</sup> null mice (a). Quantification of WB signals is provided in (b) confirming no change in the nucleolin, H3, and H3AC levels between the wild-type and null mice. N = 3 ± SEM for the WB for WT and null mice. Statistical significance was determined by paired t-test, with ∗∗∗p < 0.001. (B) Transcript levels of 45S precursor ribosomal RNA (rRNA) (a), mature 28S rRNA (b), and 18S rRNA (c) are shown for the human cerebellum. Human controls are NIH Neurobiobank #5646 and #5446, and RTT patients are in the order of NIH Neurobiobank #4516 and #4882, followed by the T158M patient. Means of technical replicates are presented for individual patients. Combined patient data is shown in (d) for each rRNA transcript, with N = 2 for controls and N = 3 ± SEM for patients. (C) Transcript levels of 45S precursor ribosomal RNA (rRNA) (a), mature 28S rRNA (b), and 18S rRNA (c) are shown for the cerebellum of Mecp2tm1.1BirdY/<sup>−</sup> (null, N = 3), Tg1 (MECP2-Tg1, N = 4), and Tg3 (MECP2-Tg3, N = 4) mice compared to control cerebellum from WT controls. N = 3–4 ± SEM. Statistical significance for Bd and C was determined by Welch's t-test, with <sup>∗</sup>p < 0.05.

### MeCP2 Binding to rRNA Genes Follows the meCpG Modification Levels

A significant fraction of the ∼200 mouse and human rRNA genes exists in an inactive, highly meCpG-modified, heterochromatic state. Increased rRNA gene methylation has been shown to repress rRNA transcription (McStay and Grummt, 2008), but unexpectedly, loss of methylation also leads to a repression of transcription that is associated with a failure in rRNA processing (Gagnon-Kugler et al., 2009). MeCP2 was suggested to localize to peri-nucleolar areas of the nucleus in the brain (Payen et al., 1998). These condensed chromatin areas include the silent, heterochromatic rRNA genes. To examine if MeCP2 binding is indeed present at the rRNA gene loci in comparison with meC sites, we performed data mining of published whole genome bisulphite sequencing and MeCP2 ChIP-seq data for 6-week-old mouse frontal cortex samples (Lister et al., 2013; Gabel et al., 2015). Realignment of the ChIP-seq data and its normalization to input sequence coverage revealed MeCP2 enrichment across the Intergenic Spacer (IGS) (**Figure 5**). By contrast, MeCP2 was depleted from the active gene regions, including the upstream enhancer repeats and the 47S gene body. CpG methylation of the rRNA gene repeat followed a similar pattern, with most sites being fully or nearly fully methylated in the IGS, but on average only 30–40% methylated throughout the active gene regions. No cytosine methylation in a non-CpG context was detected. Since it is expected that a significant fraction of rRNA genes will be heterochromatic, methylation of the active gene regions is consistent with the existence of the silent gene fraction. However, the near 100% methylation observed in the IGS suggests that this region is strongly repressed in all rRNA genes regardless of their activity status. Thus, MeCP2 binding followed the overall level of CG methylation, probably being present within the IGS of most rRNA genes, but not the gene bodies of the active genes.
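As a rough illustration of what normalizing ChIP-seq signal to input coverage means, the sketch below scales each track to its own total coverage and then takes a per-bin ChIP/input ratio. The authors' actual pipeline is not described in this section, so the depth scaling, the pseudocount, the bin labels, and the read counts are all assumptions chosen for the example.

```python
def enrichment(chip_counts, input_counts, pseudocount=1.0):
    """Per-bin ChIP/input ratio after scaling each track to its total coverage.

    The pseudocount avoids division by zero in bins with no input reads.
    """
    if len(chip_counts) != len(input_counts):
        raise ValueError("ChIP and input tracks must cover the same bins")
    n_bins = len(chip_counts)
    chip_total = sum(chip_counts) + pseudocount * n_bins
    input_total = sum(input_counts) + pseudocount * n_bins
    return [
        ((c + pseudocount) / chip_total) / ((i + pseudocount) / input_total)
        for c, i in zip(chip_counts, input_counts)
    ]

# Hypothetical read counts per bin along one rRNA gene repeat.
bins = ["enhancer", "47S gene body", "IGS part 1", "IGS part 2"]
chip = [10, 12, 40, 38]
inp = [25, 25, 25, 25]

ratios = enrichment(chip, inp)
for name, ratio in zip(bins, ratios):
    status = "enriched" if ratio > 1 else "depleted"
    print(f"{name}: {ratio:.2f} ({status})")
```

A ratio above 1 marks a bin where the ChIP track holds a larger share of reads than the input track, which is the sense in which the IGS is described as enriched and the active gene regions as depleted.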

# The mTOR–P70S6K Pathway and mTORC1 and mTORC2 Complexes Are Interrupted in RTT Brain

The mTOR–P70S6K pathway is a well-established signaling pathway upstream of protein synthesis. In order to study the protein components of mTOR–P70S6K signaling in human RTT brain, total cell extracts were isolated from the cerebella of controls and four RTT patients: two with R255X (a 20-year-old and a 17-year-old), one with T158M (a 13-year-old), and one with G451T [a 19-year-old patient with a rare mutation in the MeCP2 C-terminal domain (CTD)]. In all four RTT cerebellum tissues, mTOR levels were about twofold higher than in the controls (**Figures 6A,B** and **Supplementary Figure S4**) (∗∗p < 0.01). Accordingly, phosphorylated mTOR at both Serine 2448 (mTORC1) and Serine 2481 (mTORC2) was elevated in these RTT patients compared to the controls (∗∗p < 0.01 for S2481) (**Figure 6B**). While increased phosphorylation of the mTORC2 complex (S2481) was consistent among these four patients, elevated mTORC1 phosphorylation (S2448) was present in the R255X and T158M patients (**Figure 6A**), but absent in the patient with the G451T mutation (**Supplementary Figure S4**). This could hint toward differential involvement of MeCP2 protein domains (MBD, TRD, or CTD) in mTORC1 phosphorylation. Despite the activation of mTORC1 and/or mTORC2 complexes (indicated by increased S2448 and S2481 phosphorylation, respectively), the level of a common protein component of both mTORC1 and mTORC2 complexes, namely G-Beta-L protein, was slightly but significantly reduced (∗p < 0.05) in the RTT cerebellum (**Figures 6A,B**), suggesting potentially compromised mTORC1/2 functional complexes. Also, the levels of the complex-specific protein components of mTORC1 and mTORC2 (Raptor and Rictor, respectively) were slightly (and significantly in the case of Raptor, ∗∗p < 0.01) elevated in RTT patients (**Figure 6B**).
While these data indicated that in these human RTT cerebella, mTOR protein and its two associated mTORC1 and mTORC2 complexes were elevated, further studies were required to determine if this impacted P70S6K signaling. WB experiments showed that, alongside elevated mTOR, there was also a significant increase of about 7-fold in the protein levels of P70S6K in RTT cerebellum compared to control tissues (∗p < 0.05) (**Figure 6B**). Accordingly, phosphorylation of P70S6K at Thr 389 was deregulated in RTT patients (**Figures 6A,B**), with a drastic decrease in the G451T patient (**Supplementary Figure S4**). These data collectively may hint toward a deregulated mTORC–P70S6K cell-signaling pathway, in parallel to elevated P70S6K.

Next, we analyzed the ratio of phosphorylated mTORC1 (S2448) or mTORC2 (S2481) to mTOR (total protein), and phosphorylated P70S6K (Thr389) to P70S6K (total protein). While individual RTT patients did not show similar patterns of the phosphorylated versus non-phosphorylated molecules, a trend of elevated P-mTORC1: mTOR (S2448) and P-mTORC2: mTOR (S2481), but reduced P-P70S6K (Thr389): P70S6K, was observed compared to the controls (**Figure 7A**). This prompted us to study the correlation of mTOR, phosphorylated mTORC1 (S2448), phosphorylated mTORC2 (S2481), and P70S6K with different components of this pathway, especially phosphorylated P70S6K (Thr389) (**Figure 7B**). Pearson's correlation analysis (r) showed that while mTOR was similarly correlated with mTORC1 and P-P70S6K-389 in control and RTT brain, mTOR and mTORC2 (S2481) were negatively correlated in RTT patients (**Figure 7B**). As mTORC2 controls the cellular cytoskeleton, this could partly explain the smaller brain size that is a general characteristic of RTT brain. Additionally, the correlation of mTORC1 (S2448) with P-P70S6K was in the moderate range in RTT brain, lower than in controls. This is important, as active mTORC1 is the protein directly upstream of phosphorylated P70S6K (Thr389) in this signaling pathway.

Notably, similar correlation analysis in RTT patients indicated a weak correlation between P70S6K and its phosphorylated form (P-P70S6K-Thr389). This highlights that in RTT brain, phosphorylation of P70S6K at Thr389, which is required for proper cell signaling toward rRNA synthesis and ribosomal biogenesis, might be compromised. Thus, impaired mTORC1–P70S6K signaling may be associated with compromised protein synthesis in RTT, a process that is directly downstream of this cellular pathway.
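The two calculations used in this analysis, the phosphorylated-to-total ratio and Pearson's correlation coefficient r, can be written out in a few lines. The sketch below uses hypothetical normalized WB signals, not the data behind Figure 7, and the helper names are ours.

```python
from math import sqrt

def phospho_ratio(phospho, total):
    """Ratio of phosphorylated to total protein signal, sample by sample."""
    return [p / t for p, t in zip(phospho, total)]

def pearson_r(x, y):
    """Pearson's correlation coefficient r for two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length (at least 2 values)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    ss_x = sum(a * a for a in dx)
    ss_y = sum(b * b for b in dy)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when one sample is constant")
    return sum(a * b for a, b in zip(dx, dy)) / sqrt(ss_x * ss_y)

# Hypothetical normalized WB signals (arbitrary units), for illustration only.
total_p70s6k = [1.0, 2.0, 3.0, 4.0]
phospho_p70s6k = [0.9, 1.1, 1.0, 1.2]

ratios = phospho_ratio(phospho_p70s6k, total_p70s6k)
r = pearson_r(total_p70s6k, phospho_p70s6k)
print(ratios, round(r, 3))
```

With only three or four samples per group, as in this study, r is sensitive to single values, so it is best read as a descriptive trend rather than a precise estimate.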

#### DISCUSSION

Four common MeCP2 mutations make up >28% of all RTT cases. These include T158M (8.7%) and R168M (7.35%) in the DBD, as well as R255X (6.35%) and R270X (5.8%) in the TRD. We analyzed post-mortem cerebellar tissues from one RTT patient with the T158M mutation and two different R255X RTT patients to determine the impact of these common MECP2 mutations on pathways that impinge on ribosome biogenesis. We provide evidence that nucleolin, a regulator of rRNA transcription and processing, might be a potential MeCP2 target. In T158M RTT cerebellum, nucleolin levels were changed, along with its nucleolar localization, in association with highly increased protein levels in the cytoplasmic and nuclear fractions of cerebellum. Nucleolin is an RNA-binding protein (Ghisolfi-Nieto et al., 1996) and its abnormal sub-cellular localization may cause molecular abnormalities, besides effects on rRNA synthesis and/or processing. In terms of rRNA transcription, other well-known MeCP2 targets, such as BDNF and IGF-1, also impinge on rRNA transcription (Schratt et al., 2004; Donati et al., 2011) and may therefore play deregulatory roles that warrant further studies.

We also observed increased levels of mTOR and deregulation of its two phosphorylated forms, which contribute to mTORC1 and mTORC2 activities in the brain. Increased mTORC1 phosphorylation in RTT patients (T158M and R255X) may also explain increased levels of p70S6K phosphorylation and/or increased rRNA levels. Although not all patients showed a positive trend of rRNA increase, it is possible that there are regulatory mechanisms in place that are disturbed, as an opposite change in nucleolin levels (transcripts and protein) is also observed between RTT patients with T158M and R255X mutations. These data suggest that deregulation of cell signaling pathways and molecular properties of RTT brain might be MeCP2 mutation-dependent, which could be addressed by future studies. Our data collectively suggest that in human RTT brain, ribosomal RNA transcripts and/or mTOR–P70S6K may be elevated, pointing toward a potential over-activation of this fundamental process, exhausting cellular resources that are essential for other cellular functions. This may include components of the protein translation machinery, neuronal plasticity, synapse formation, and other critical functions that are compromised in RTT brain and neurons.

While a role for mTOR in autism spectrum disorders is suggested (Onore et al., 2017), to our knowledge, our study is the first to implicate mTOR–P70S6K–nucleolin–rRNA synthesis in human RTT cerebellum. Although our results bring important insights toward understanding the molecular abnormalities of the RTT brain, further studies are required to determine if our findings can be generalized to other RTT-associated MeCP2 mutations. RTT is a rare disease with hundreds of different mutations that occur within different protein domains. One limitation of our study was the limited access to a large number of human post-mortem brain tissues with the same mutation. While in our studies RTT patients showed different trends of nucleolin-rRNA biogenesis, the two R255X patients exhibited similar molecular characteristics. This suggests that RTT-associated molecular abnormalities might depend on the type of genetic mutation and the affected protein domains. Regardless, the mTOR–P70S6K signaling pathway appeared to be more similarly affected between these three patients, suggesting the importance of this pathway in RTT cerebellum. Our analysis of a fourth RTT patient with a rare MeCP2 mutation (G451T) in the MeCP2 CTD highlighted that elevated mTOR and phosphorylation of mTORC2 are common among mutations that involve three different MeCP2 protein domains (MBD, TRD, and CTD). A summary of our results and the proposed model of MeCP2 involvement in mTOR–P70S6K–nucleolin–rRNA synthesis is provided in **Figure 8**.

# AUTHOR CONTRIBUTIONS

MR and COO designed experiments. COO dissected murine tissues and prepared protein extracts from murine cerebellum, extracted RNA from Mecp2tm1.1Birdy/<sup>−</sup> and WT mice, prepared human brain tissues, extracted total protein extracts, prepared nuclear and cytoplasmic extracts, conducted human Western blots, and performed IHC and microscopic imaging. SP extracted RNA from human brain and performed RT-PCR from murine brain. AAS performed mouse WBs, prepared the graph for the generated signals, and carried out marker analysis of nuclear–cytoplasmic extracts by WB. DF performed human brain RT-PCR. DF, DK, KS, and MDB quantified WB results, prepared related graphs and table(s), and performed the correlation coefficient analysis. YS from Dr. Huda Zoghbi's lab provided RNA samples from WT, MECP2-Tg1, and MECP2-Tg3 mice. MS-F and TM conducted alignment analysis of MeCP2 binding at rRNA genes. MDB provided control human brain tissues for IHC, and dissected T158M human brain. VMS and LA arranged consent, donation, transfer, and usage of T158M post-mortem brain to the Rastegar lab for research. For controls of RNA and protein extractions, and both R255X RTT brain tissues, "Human tissue was obtained from University of Maryland Brain and Tissue Bank, which is a Brain and Tissue Repository of the NIH Biobank." MR wrote the manuscript, assembled final graphs and images prepared by other authors, provided conception and design, and contributed reagents, materials, analysis tools, and research facilities. All authors have read and approved the final version of the manuscript.

# FUNDING

This work is supported by funds from the International Rett Syndrome Foundation (IRSF) Grant 3212 to MR, Ontario Rett Syndrome Association (ORSA) to MR, and Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grant 2016-06035 to MR. COO is supported by IRSF and ORSA funding to MR. DK is supported by a BSc Med scholarship. KS is supported by ORSA, IRSF, and NSERC-DG grants to MR. DF's training was supported by an NSERC undergraduate award and further supported by ORSA and IRSF grants to MR. MB's training was supported by CHRIM summer studentship awards, further supported by NSERC-DG to MR, and is currently supported by a CIHR graduate student scholarship. SP is supported by NSERC-DG 2016-06035 to MR and Graduate Enhancement of Tri-Council Stipends (GETS) supplements to MR. MDB holds the Canada Research Chair in Developmental Neuropathology. Funding to TM was provided by the Canadian Institutes of Health Research (CIHR, MOP12205/PJT153266) and the Natural Sciences and Engineering Research Council of Canada (NSERC) (Discovery Grant).

# ACKNOWLEDGMENTS

We would like to thank current and past members of the Rastegar lab for scientific discussions and input. Tissues and relevant data for the two RTT (R255X) and frozen control tissues were obtained through the NIH NeuroBioBank Program: neurobiobank.nih.gov. We would like to express our sincere gratitude for the donation of the T158M RTT patient brain tissues.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2018.00635/full#supplementary-material

FIGURE S1 | Detection of nucleolin protein in the cerebellum of R255X Rett syndrome (RTT) patients. (A,B) Microscopic images of post-mortem human cerebellum for nucleolin (red) and DAPI signals (white) in two female controls (a–c), and primary omission (d,e). (C,D) Microscopic images of post-mortem human cerebellum for nucleolin (red) and DAPI signals (white) in two female R255X patients are shown (a–c) as well as the primary omission (d,e). In each panel, the NIH Neurobiobank case number is indicated. GCL, granular cell layer; ML, molecular layer; NCL, nucleolin; PCL, Purkinje cell layer. Scale bars represent 20 µm.

FIGURE S2 | Primary omission control for murine cerebellum. Primary antibody omission in wild-type male and female (A,C) and mutant homozygote (B) or heterozygote (D) mice.

FIGURE S3 | Quantification of nuclear histone H3 and H3 di-acetylation at K9–K14 (H3AC) and cytoplasmic S100 normalized to GAPDH in Rett syndrome cerebellum and controls. (A,B) Western blot (WB) quantification of nuclear cell extracts of controls and RTT patients individually and in combination, respectively. The data are shown with the following samples in the order of controls (NIH NeuroBiobank case numbers #5646 and #5446) and Rett syndrome (RTT) patients (R255X: c.763C>T nonsense mutation, 20 and 17 years old, case numbers #4516 and #4882), and T158M cerebellum (brain received as donation by family members with appropriate consent for research). (C,D) Same as in (A,B), but for the cytoplasmic extracts for S100. N = 2 for controls while individual patient data is shown in (A,C). For (B,D), N = 2 for controls and N = 3 ± SEM for RTT patients. Statistical significance was determined by two-way ANOVA, with ∗∗p < 0.01 and ∗∗∗∗p < 0.0001.

FIGURE S4 | The mTOR and P70S6K signaling molecules in Rett syndrome. Representative Western blots (WB) with total cell extract of a human control and a G451T RTT cerebellum with indicated antibodies (mTOR, phosphorylated mTOR at Serine 2481 or 2448, G-Beta-L as the common component of mTOR complexes, Raptor as part of mTORC1, and Rictor as part of mTORC2), P70S6K (and its phosphorylated form Thr389) and GAPDH. The molecular weight of each detected protein is indicated, and the NIH Neurobiobank case numbers are indicated for the control and RTT cerebellum.

TABLE S1 | Primary antibodies used for Western blot (WB) or immunohistochemistry (IHC).

TABLE S2 | Secondary antibodies used for Western blot (WB) or immunohistochemistry (IHC).

TABLE S3 | Brain sample characteristics for Rett syndrome (RTT) patients and controls.

# REFERENCES



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Olson, Pejhan, Kroft, Sheikholeslami, Fuss, Buist, Ali Sher, Del Bigio, Sztainberg, Siu, Ang, Sabourin-Felix, Moss and Rastegar. This is an openaccess article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Exome Sequencing Identifies TENM4 as a Novel Candidate Gene for Schizophrenia in the SCZD2 Locus at 11q14-21

Chao-Biao Xue<sup>1,2†</sup>, Zhou-Heng Xu<sup>3†</sup>, Jun Zhu<sup>1,4</sup>, Yu Wu<sup>1</sup>, Xi-Hang Zhuang<sup>1</sup>, Qu-Liang Chen<sup>1</sup>, Cai-Ru Wu<sup>1</sup>, Jin-Tao Hu<sup>1</sup>, Hou-Shi Zhou<sup>2</sup>, Wei-Hang Xie<sup>2</sup>, Xin Yi<sup>5</sup>, Shan-Shan Yu<sup>5</sup>, Zhi-Yu Peng<sup>5</sup>, Huan-Ming Yang<sup>5</sup>, Xiao-Hong Hong<sup>1</sup>\* and Jian-Huan Chen<sup>3</sup>\*

Edited by: Cunyou Zhao, Southern Medical University, China

#### Reviewed by:

Xingyin Liu, Nanjing Medical University, China Xiao Dong, Albert Einstein College of Medicine, United States Mariza De Andrade, Mayo Clinic, United States

#### \*Correspondence:

Xiao-Hong Hong hongxiaohong@21cn.com

Jian-Huan Chen cjh\_bio@hotmail.com

†These authors have contributed equally to this work

#### Specialty section:

This article was submitted to Behavioral and Psychiatric Genetics, a section of the journal Frontiers in Genetics

> Received: 10 August 2018 Accepted: 22 December 2018 Published: 28 January 2019

#### Citation:

Xue C-B, Xu Z-H, Zhu J, Wu Y, Zhuang X-H, Chen Q-L, Wu C-R, Hu J-T, Zhou H-S, Xie W-H, Yi X, Yu S-S, Peng Z-Y, Yang H-M, Hong X-H and Chen J-H (2019) Exome Sequencing Identifies TENM4 as a Novel Candidate Gene for Schizophrenia in the SCZD2 Locus at 11q14-21. Front. Genet. 9:725. doi: 10.3389/fgene.2018.00725

<sup>1</sup> Mental Health Center, Shantou University Medical College, Shantou, China, <sup>2</sup> Shantou Central Hospital, Affiliated Shantou Hospital of Sun Yat-sen University, Shantou, China, <sup>3</sup> Laboratory of Genomic and Precision Medicine, Wuxi School of Medicine, Jiangnan University, Wuxi, China, <sup>4</sup> Shenzhen Kang Ning Hospital, Shenzhen, China, <sup>5</sup> Beijing Genomics Institute – Shenzhen, Shenzhen, China

Schizophrenia is a complex psychiatric disorder with high genetic heterogeneity; however, the contribution of rare mutations to the disease etiology remains to be further elucidated. We herein performed exome sequencing in a Han Chinese schizophrenia family and identified a missense mutation (c.6724C>T, p.R2242C) in the teneurin transmembrane protein 4 (TENM4) gene in the SCZD2 locus, a region previously linked to schizophrenia at 11q14-21. The mutation was confirmed to co-segregate with the schizophrenia phenotype in the family. Subsequent investigation of TENM4 exons 31, 32, and 33 adjacent to the p.R2242C mutation revealed two additional missense mutations in 120 sporadic schizophrenic patients. The residues affected by these mutations, which are predicted to be deleterious to protein function, were highly conserved among vertebrates. These rare mutations were not detected in 1000 Genomes, NHLBI Exome Sequencing Project databases, or our in-house 1136 non-schizophrenic control exomes. Analysis of RNA-Seq data showed that TENM4 is expressed in the brain with high abundance and specificity. In line with the important role of TENM4 in central nervous system development, our findings suggested that increased rare variants in TENM4 could be associated with schizophrenia, and thus TENM4 could be a novel candidate gene for schizophrenia in the SCZD2 locus.

Keywords: association, co-segregation, schizophrenia, exome analysis, rare mutation

# INTRODUCTION

Schizophrenia (OMIM 181500) is a complex psychiatric disorder, and is also a public health problem affecting approximately 1% of the world population, leading to reduced life expectancy by an average of 20–25 years (Tiihonen et al., 2009). Family and twin studies have demonstrated a strong genetic component in schizophrenia with heritability estimated to be 60–80% (Sullivan et al., 2003; Shih et al., 2004).

High heritability in the disease underlines substantial effects from genetic variants. The frequency of these alleles ranges from common to extremely rare. Common variants associated with schizophrenia have been widely studied by candidate gene association studies (Stefansson et al., 2002; Chen et al., 2009a, 2) and genome-wide association studies (GWASs) (Bergen and Petryshen, 2012). For example, a recent multi-stage schizophrenia GWAS of up to 36,989 cases and 113,075 controls identified 128 independent associations spanning 108 loci that meet genome-wide significance (Schizophrenia Working Group of the Psychiatric Genomics Consortium, 2014). It is estimated that between a third and a half of the genetic risk of schizophrenia is indexed by common alleles in GWAS.

Apart from common variants with small effect, recent studies have shown that rare variants or mutations with large effect may also help to explain the remaining component of psychiatric etiology that is unexplained by common variants (Pulver et al., 1994; Jacquet et al., 2002; Ament et al., 2015). For example, a mutation in the Neuronal PAS domain protein 3 (NPAS3) gene segregates with mental illness in a family affected by schizophrenia and major depression (Yu et al., 2014, 3). Recent exome sequencing studies have advanced understanding of rare variants and mutations in schizophrenia (Bustamante et al., 2017; Singh et al., 2017; John et al., 2019). Studies in parent-proband trios have revealed de novo mutations with high genetic heterogeneity in schizophrenia (Xu et al., 2011, 2012). Moreover, analysis of exomes from 2536 schizophrenia cases and 2543 controls has emphasized the burden arising from extremely rare (less than 1 in 10,000), disruptive mutations in the patients (Purcell et al., 2014). However, the contribution of rare mutations to schizophrenia remains to be further elucidated. In spite of a number of genetic loci identified by linkage studies in schizophrenia,<sup>1</sup> only a few genes have been mapped in these loci, such as PRODH (SCZD4) (Jacquet et al., 2002), DISC1 (SCZD9) (Debono et al., 2012, 1), SHANK3 (SCZD15) (Gauthier et al., 2010), NRXN1 (SCZD17) (Rujescu et al., 2009), and SLC1A1 (SCZD18) (Myles-Worsley et al., 2013, 1). Most of these linkage loci have not yet been defined molecularly, and they might harbor unknown genes or mutations with large effects in disease risk determination that remain to be identified.

In the current study, by using exome sequencing we identified a novel rare mutation in the teneurin transmembrane protein 4 gene (TENM4, also named ODZ4), which co-segregated with schizophrenia in a Han Chinese family. Moreover, additional rare TENM4 mutations were found in a cohort of 120 unrelated sporadic schizophrenic patients. None of these mutations was detected in 1441 non-schizophrenic control exomes. TENM4 is located within the schizophrenia disorder 2 (SCZD2) locus at 11q14-21, in which the underlying gene has not been identified so far. The study thus demonstrated increased TENM4 mutation burden in schizophrenia and suggested that TENM4 could be a candidate gene for schizophrenia in the SCZD2 locus.

#### MATERIALS AND METHODS

#### Sample Collection and Clinical Examination

This study has been approved by the Ethics Committee of the Mental Health Center of Shantou University Medical College and was performed in accordance with the ethical standards laid down in the 1964 Declaration of Helsinki and all subsequent revisions. All participants gave their informed consent for their participation, and for the publication of clinical data and indirectly identifiable information, prior to their inclusion in the study. The family with schizophrenia was recruited at the Mental Health Center of Medical College of Shantou University, Shantou, China (**Figure 1**). Clinical diagnoses in the proband (III-2) and family members were based on a Chinese version of the Structured Clinical Interview for DSM-IV-TR Axis I Disorder-Patient Edition (SCID-I/P) (First et al., 1997) criteria derived from a standard interview and from a case-note review by two trained psychiatrists. Four affected family members (II-2, III-2, III-3, and III-4) were interviewed with the Social Disability Screening Schedule (SDSS) (World Health Organization, 1988) to assess the clinical-specific features and social abilities. Clinical global impression-severity of illness (CGI-SI) (Programs and Guy et al., 1976) was used to assess the disease severity (**Table 1**). All of them were diagnosed with paranoid schizophrenia and had social or occupational dysfunction and continuous signs of the disturbance for more than 15 years. Four individuals affected with treatment-resistant schizophrenia were enrolled. The two unaffected family members were confirmed to have no sign of mental illness.

A cohort of 120 unrelated sporadic schizophrenic patients was recruited using the same criteria as described above. All of them were diagnosed as schizophrenia, and their main manifestations included auditory hallucination, persecutory delusion, and social or occupational dysfunction.

A group of 205 unrelated non-schizophrenia individuals, used in Sanger sequencing validation, had no history of psychotic diseases according to a clinical interview following FH-RDC criteria (Endicott, 1978).

Peripheral blood was collected from all participants except for II-2, whose blood and DNA were not available due to her death before blood sample collection for the current study. Her genotypes and mutation status were therefore inferred from the genotypes of her children and spouse. Genomic DNA was extracted by using the QIAamp Blood kit (Qiagen, Hilden, Germany).

#### Exome Capture and Sequencing

Genomic DNA (3 µg) from two affected (III-2 and III-4) and two unaffected members (II-1 and III-1) was used to perform exome sequencing by Axeq Technologies (Rockville, MD, United States). Whole exome was captured by SeqCap EZ Human Exome Library v3.0 (Roche NimbleGen, Madison, WI, United States) and sequenced on an Illumina HiSeq 2000 (Illumina, Hayward, CA, United States) with a paired-end 100 bp length configuration.

<sup>1</sup>https://www.ebi.ac.uk/arrayexpress/experiments/E-GEOD-30352/

#### Read Mapping and Variant Detection

The reads were mapped against the UCSC hg19 Human Reference Genome<sup>2</sup> by using BWA<sup>3</sup>. The single nucleotide variations (SNVs) and Indels were detected by SAMTOOLS<sup>4</sup> and annotated using ANNOVAR (Wang et al., 2010). Previously known and reported variants were identified and filtered using dbSNP 135<sup>5</sup> and 1000 Genomes project<sup>6</sup> data. The functional impact of variants was predicted by PROVEAN, SIFT, and PolyPhen.
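The filtering step described above, in which previously known variants are removed so that only novel candidates remain, amounts to a set difference on variant keys. The following is a minimal sketch under our own assumptions: the `Variant` tuple, the function name, and the coordinates are hypothetical, and it does not reproduce how ANNOVAR performs filter-based annotation.

```python
from typing import NamedTuple

class Variant(NamedTuple):
    """A variant identified by position and alleles (hypothetical key)."""
    chrom: str
    pos: int
    ref: str
    alt: str

def filter_novel(candidates, *known_sets):
    """Keep candidate variants that appear in none of the known-variant sets."""
    known = set().union(*known_sets)
    return [v for v in candidates if v not in known]

# Made-up coordinates for illustration; these are not real database entries.
calls = [
    Variant("chr11", 1000, "C", "T"),
    Variant("chr11", 2000, "G", "A"),
    Variant("chr2", 3000, "A", "G"),
]
dbsnp_like = {Variant("chr11", 2000, "G", "A")}
thousand_genomes_like = {Variant("chr2", 3000, "A", "G")}

novel = filter_novel(calls, dbsnp_like, thousand_genomes_like)
print(novel)
```

Matching on chromosome, position, and both alleles is a deliberate choice here: matching on position alone would also discard a novel allele that happens to share a site with a known variant.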

#### Sanger Sequencing

fgene-09-00725 January 24, 2019 Time: 16:12 # 3

The genomic sequence of TENM4 was obtained from the NCBI reference sequence database<sup>7</sup>. Primers were designed accordingly using Primer 3 and are summarized in **Supplementary Table S5**. Polymerase chain reaction (PCR) amplification was performed using the GeneAmp PCR System 9700 (ABI, Foster City, CA, United States) in a 25-µl mixture containing 1.5 mM MgCl<sub>2</sub>, 0.2 mM of each dNTP (Sangon, Shanghai, China), 1 U Taq DNA polymerase (Invitrogen, Carlsbad, CA, United States), 0.2 µM primers, and 20 ng of genomic DNA. Sanger sequencing was performed using the BigDye Terminator Cycle Sequencing v3.1 kit (ABI, Foster City, CA, United States) and the 3130xl Genetic Analyzer (ABI, Foster City, CA, United States) following the protocol suggested by the manufacturer. Sequence alignment and analysis of variations were performed using the NovoSNP program<sup>8</sup>.

<sup>2</sup>http://genome.ucsc.edu/

<sup>3</sup>http://bio-bwa.sourceforge.net/

<sup>4</sup>http://samtools.sourceforge.net/

<sup>5</sup>http://www.ncbi.nlm.nih.gov/snp/

<sup>6</sup>http://www.1000genomes.org

<sup>7</sup>http://www.ncbi.nlm.nih.gov/refseq/

<sup>8</sup>http://www.molgen.ua.ac.be/bioinfo/novosnp/

FIGURE 1 | A Chinese Han family with schizophrenia. Filled squares and circles denote affected males and females, respectively. Unaffected individuals are shown as empty symbols. All family members in the second and third generations were examined. The ages at diagnosis for the examined affected family members are shown below the symbols. Whole-exome sequencing was performed in two affected (III-2 and III-4) and two unaffected (II-1 and III-1) family members. Asterisks denote individuals with blood samples and DNA collected. Triangles denote individuals (II-1, III-1, III-2, and III-4) whose DNA was used in exome sequencing in the current study.

#### Non-schizophrenic Control Exomes

Variants were screened against exomes from Chinese Han individuals without schizophrenia to remove common variants.

TABLE 1 | Demographic information and clinical features of four affected family members from schizophrenia family enrolled in the exome sequencing study.


M, male; F, female; NA, not available or applicable; SDSS, Social Disability Screening Schedule; CGI-SI, clinical global impression-severity of illness.

The control exome data set was consolidated from several previous exome sequencing studies performed by our group. Various exome capture kits were used for this data set, including the Roche NimbleGen SeqCap EZ Human Exome Library v3.0, the Agilent SureSelect All Human Exon v4.0 kit, and the Illumina TruSeq Exome Enrichment kit. Sequencing was performed on an Illumina HiSeq 2000. The data quality criteria and filtering process followed the same standard as that for the family.

#### Statistical Analysis


Differences in clinical assessment scores were analyzed using Student's t-test. Association of SNVs with schizophrenia was tested using Fisher's exact test.
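As an illustration of the association test, a two-sided Fisher's exact test on a 2 × 2 carrier table can be sketched as follows. This is a minimal standard-library sketch, not the software used in the study; the function name `fisher_exact_two_sided` and the counts are hypothetical and are not taken from the study data.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows: cases / controls; columns: carriers / non-carriers.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x carriers among the cases.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    # (small tolerance guards against floating-point ties).
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Hypothetical counts: 3 carriers among 100 cases, 0 carriers among 200 controls.
p_value = fisher_exact_two_sided(3, 97, 0, 200)
print(f"two-sided p = {p_value:.4f}")
```

With very rare variants the carrier counts are small, which is the situation in which an exact test is preferred over a chi-square approximation.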

#### RNA-Seq of Human Tissues

Fragments Per Kilobase of transcript per Million mapped reads (FPKM) values from RNA-seq of the brain, cerebellum, heart, kidney, liver, and testis of human, macaque, mouse, and opossum were obtained from the Baseline Atlas of the Gene Expression Atlas website<sup>8</sup> (Keane et al., 2011). The relative expression level of TENM4 mRNA was calculated as the original expression normalized by the highest expression among the six tissues in each species.
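The normalization described above (each tissue's FPKM divided by the highest FPKM among the six tissues of the same species) can be sketched as follows. The function name `relative_expression` and the FPKM values are hypothetical and serve only to illustrate the calculation.

```python
def relative_expression(fpkm_by_tissue):
    """Scale each tissue's FPKM by the highest FPKM among the tissues."""
    peak = max(fpkm_by_tissue.values())
    if peak <= 0:
        raise ValueError("at least one tissue must have a positive FPKM")
    return {tissue: value / peak for tissue, value in fpkm_by_tissue.items()}


# Hypothetical FPKM values for one gene in one species (not study data).
example_fpkm = {
    "brain": 12.0, "cerebellum": 6.0, "heart": 1.5,
    "kidney": 3.0, "liver": 0.6, "testis": 2.4,
}
relative = relative_expression(example_fpkm)
for tissue, value in relative.items():
    print(f"{tissue}: {value:.2f}")
```

Because the scaling is done per species, the tissue with the highest expression always has a relative value of 1, which makes tissue profiles comparable across species with different absolute FPKM ranges.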

### RESULTS

#### Clinical Data of the Schizophrenia Family

A Chinese Han family of five family members with three generations affected by schizophrenia was recruited from our longitudinal follow-up; the latest interviews were conducted in 2004 for II-2 and III-3 and in 2011 for III-2 and III-4 (**Figure 1**, **Table 1**). All affected family members who were interviewed were diagnosed with schizophrenia (**Table 1**). All affected patients had adult onset (age of onset later than 22 years) and manifested auditory hallucinations and persecutory delusions, and they were easily offended, irritable, and aggressive in the acute stage. They gradually developed difficulty in concentration and blunted affect, leading to social isolation and work disability over multiple episodes.

#### Exome Sequencing Data of the Family

In order to identify the underlying genetic predisposition in the family, exome sequencing was performed in two affected (III-2 and III-4) and two unaffected (II-1 and III-1) family members. The original exome sequencing data from the four family members are summarized in **Supplementary Table S1**. The exome sequencing achieved 96.2% mean coverage and 62.5× mean depth of the target region, which allowed high-quality variant calling (**Supplementary Table S2**). The variants called from the exome data were analyzed using a step-by-step filtering method (**Table 2**). The variants were first filtered to remove all noncoding and synonymous variants and keep only nonsynonymous single-nucleotide variants (NSVs), splice site variants (SSVs), and coding indels. The selected variants were then filtered against databases including dbSNP 135, 1000 Genomes, the NHLBI Exome Sequencing Project, and nonpsychiatric exomes from the Exome Aggregation Consortium (ExAC) to remove common and known variants in the public databases. Given the pattern of dominant inheritance in the family, only heterozygous variants shared by the two affected (III-2 and III-4) but not found in the two unaffected (II-1 and III-1) were kept for subsequent analysis. In order to look for rare variants, the remaining variants were then filtered against exomes of non-schizophrenic controls. Twenty-eight remaining variants had a minor allele frequency (MAF) less than 1% in the control exomes (**Table 2**), none of which was in PRODH, DISC1, SHANK3, NRXN1, or SLC1A1. These variants were then evaluated for impact on protein function by three bioinformatic programs, PROVEAN, SIFT, and PolyPhen-2. Higher priority was given to four NSVs and one SSV that were predicted to be deleterious and were not detected in control exomes (**Table 2**). These five variants were then analyzed with prior knowledge of gene function, as shown in **Supplementary Table S4**. SLC11A2 is a known disease-causing gene for autosomal recessive anemia (OMIM #206100) (Iolascon et al., 2006). Likewise, TSPAN12 has been characterized as a disease-causing gene for exudative vitreoretinopathy (OMIM #613310) (Nikopoulos et al., 2010). These two genes are thus known to cause phenotypes distinct from schizophrenia. A recent study on autozygosity showed the lack of an apparent phenotype in individuals with loss-of-function variants of FUK (Alsalem et al., 2013). Therefore, these three genes were excluded from further study. TENM4 is located within a region that was previously linked to schizophrenia (SCZD2, OMIM %603342). LRRTM4 has highly selective expression in the brain and can mediate excitatory synapse development on dentate gyrus granule cells (Siddiqui et al., 2013). Therefore, TENM4 and LRRTM4 were included for the subsequent validation experiment.
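The step-by-step filtering described in this section can be sketched as a single predicate applied to each called variant. This is a simplified illustration under stated assumptions, not the pipeline used in the study: the `Variant` record, its field names, and the placeholder gene names are hypothetical, and database membership is reduced to a single boolean.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str
    effect: str          # "missense", "splice_site", "coding_indel", "synonymous", ...
    in_public_db: bool   # seen in dbSNP / 1000 Genomes / ESP / ExAC (simplified)
    control_maf: float   # allele frequency in the in-house control exomes
    genotypes: dict      # family member ID -> "het", "hom", or "ref"


KEPT_EFFECTS = {"missense", "splice_site", "coding_indel"}
AFFECTED = ("III-2", "III-4")
UNAFFECTED = ("II-1", "III-1")


def passes_filters(v, maf_cutoff=0.01):
    """Apply the filtering steps in the order given in the text."""
    if v.effect not in KEPT_EFFECTS:      # 1. keep NSVs, SSVs, coding indels
        return False
    if v.in_public_db:                    # 2. drop common and known variants
        return False
    if any(v.genotypes.get(m) != "het" for m in AFFECTED):
        return False                      # 3a. heterozygous in both affected
    if any(v.genotypes.get(m, "ref") != "ref" for m in UNAFFECTED):
        return False                      # 3b. absent from both unaffected
    return v.control_maf < maf_cutoff     # 4. rare in control exomes


shared = {"III-2": "het", "III-4": "het", "II-1": "ref", "III-1": "ref"}
variants = [
    Variant("GENE_A", "missense", False, 0.0, shared),
    Variant("GENE_B", "synonymous", False, 0.0, shared),
    Variant("GENE_C", "missense", False, 0.0, {**shared, "III-4": "ref"}),
    Variant("GENE_D", "missense", True, 0.0, shared),
    Variant("GENE_E", "splice_site", False, 0.03, shared),
]
candidates = [v.gene for v in variants if passes_filters(v)]
print(candidates)
```

Functional prediction and prior knowledge of gene function would then be applied to the surviving candidates, as described above.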

#### Cosegregation With Schizophrenia in the Family

Sanger sequencing results showed that the LRRTM4 variant was not found in one affected family member, III-3; it was thus excluded from further analysis due to lack of co-segregation with the disease phenotype. The c.6724C>T variant in exon 32 of TENM4 was heterozygous in the four affected members (III-2, III-3, and III-4 by sequencing and II-2 by inference) (**Figure 2A** and **Supplementary Figure S1**) and was not detected in the two unaffected members (II-1 and III-1). These results showed that the c.6724C>T variant cosegregated with the schizophrenia phenotype in the family. The variant was not detected in any of another group of 205 unrelated non-schizophrenic controls by Sanger sequencing, consistent with its absence in the public databases and our in-house control exomes as described above.

#### TENM4 Mutations in Unrelated Sporadic Schizophrenic Patients

As the region of TENM4 surrounding c.6724C>T could possibly serve as a mutation hotspot, we further investigated TENM4 exons 31, 32, and 33 in 120 unrelated sporadic schizophrenic patients using Sanger sequencing (**Supplementary Table S5**). Two additional missense alterations, c.5738T>G (p.I1913S) in exon 31 and c.6880G>A (p.D2294N) in exon 32, were detected (**Table 3**), both of which were predicted to be deleterious to protein function by PROVEAN, SIFT, and PolyPhen-2. The variant p.D2294N was found in a single heterozygous individual (MAF = 8.3 × 10<sup>−6</sup>) in the ExAC database but was not found in our controls. p.I1913S was not detected in the public databases, the control exomes, or the Sanger sequencing results of 205 unrelated controls. A rare synonymous variant (p.K1922K) was also significantly associated with schizophrenia (p < 0.05). In contrast, only one missense variant in the YD repeats, with a MAF less than 0.04% (1 in 2882), was detected in the same exons in controls; it was predicted to be benign, and its frequency was not significantly different between patients and controls (p = 1). Further comparison of clinical features did not find evident differences between patients with and without either p.I1913S or p.D2294N.

mutations were predicted to be deleterious to the protein function.
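The allele frequencies quoted in this section follow from simple counting, which can be made explicit in a short sketch. The helper name `allele_frequency` is hypothetical, and reading "1 in 2882" as one allele among 2882 alleles is an assumption made here for illustration.

```python
def allele_frequency(allele_count, allele_number):
    """Allele frequency = observed alternate alleles / total alleles genotyped."""
    return allele_count / allele_number


# One heterozygous carrier contributes a single alternate allele, so a
# frequency of 8.3e-6 implies roughly 1 / 8.3e-6 genotyped alleles at the site.
implied_alleles = round(1 / 8.3e-6)
implied_individuals = implied_alleles // 2   # diploid: two alleles per person

# The control figure "1 in 2882" expressed as a percentage
# (assumed here to mean 1 allele among 2882 alleles).
control_freq_percent = 100 * allele_frequency(1, 2882)

print(implied_alleles, implied_individuals, round(control_freq_percent, 4))
```

The back-calculation gives on the order of 120,000 alleles, that is, about 60,000 individuals, which is the scale expected for an exome aggregation database such as ExAC.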

#### TENM4 Residues Mutated in Schizophrenia

The human TENM4 protein sequence was aligned to its orthologs in other vertebrates. The alignment showed that the TENM4 residues mutated in schizophrenic patients were evolutionarily conserved (**Figure 2**). They were all located within a region that consisted of 23 tyrosine/aspartic acid (YD) repeats in the extracellular domain of the transmembrane TENM4 protein.

To evaluate the burden of rare variants in the gene, all coding variants of TENM4 were examined in the exomes of the family and the non-schizophrenia controls. The data are summarized in **Supplementary Table S6**. In total, 39 variants were detected in the gene from the exomes, including the novel mutation p.R2242C, which was found exclusively in the affected family members. We analyzed the distribution of variants predicted to be deleterious in control exomes. Three novel rare variants that were predicted to be damaging were found exclusively in control exomes: a missense variant (p.R325W) in the teneurin N-terminus, a missense variant (p.E1098G) between the EGF-like and NHL repeats, and one stop-gained variant (p.Q2735X) in the last exon close to the C-terminus. However, careful inspection of the positions of novel and known variants predicted to be deleterious in control exomes showed that none of them was within the YD repeats. The three control individuals carrying these variants had no symptoms of schizophrenia. Moreover, neither the control exomes nor the two unaffected family members had any variant predicted to be deleterious in the YD repeats. In contrast, novel, rare deleterious variants in the YD repeat region were observed exclusively in affected family members and unrelated patients. These findings might suggest that an increased burden of rare deleterious variants within the YD repeats of TENM4 is associated with schizophrenia (variant burden test p < 0.001).

#### Expression Specificity of TENM4 in the Brain

We then analyzed RNA-seq data of six human tissues, namely brain, cerebellum, heart, liver, kidney, and testis, which were retrieved from the Gene Expression Atlas. The results showed that TENM4 was highly expressed in human brain tissues (**Supplementary Figure S2**). In addition, the gene exhibited higher specificity to the brain than to other tissues. Moreover, the highest expression level of TENM4 among the six tissues was observed in the brain in mouse, opossum, and macaque, suggesting that the high specificity of TENM4 expression in the brain is evolutionarily conserved in mammals.

### DISCUSSION

In the current study, we conducted exome sequencing in a three-generation family affected by schizophrenia and identified a novel candidate gene, TENM4, in the linkage locus SCZD2 at 11q14-21. One missense mutation cosegregated with the schizophrenia phenotype, and two more missense mutations were detected in unrelated sporadic schizophrenic patients. These findings hence implicate a potential role of TENM4 in the etiology of schizophrenia.

No disease gene for schizophrenia has yet been molecularly defined in the SCZD2 locus at 11q14-21 (Yamada et al., 2004). A balanced t(1;11)(q42.1;q14.3) translocation was first reported to co-segregate with schizophrenia (Millar et al., 2000). The 11q14-21 region has been linked to mental illnesses, including schizophrenia and bipolar disorder (Fanous et al., 2012). A recent study has further demonstrated that polymorphisms in the SCZD2 locus are associated with schizophrenia in a Scottish population (Debono et al., 2012), emphasizing the possibility that the SCZD2 region may harbor genetic variants contributing to the risk of schizophrenia. By showing co-segregation with schizophrenia in the family, our findings thus suggest that TENM4 could be a candidate gene in the SCZD2 locus.

TENM4 is located at 11q14.1 and contains 34 exons, which encode a large protein of 2769 amino acid residues. The molecular function of the gene has not yet been well characterized. It probably acts as a type II transmembrane protein and functions as a cellular signal transducer. TENM4 is a member of the TENEURIN/ODD OZ (TEN/ODZ) protein family, which was first identified in Drosophila (Baumgartner and Chiquet-Ehrismann, 1993; Baumgartner et al., 1994) and contains many members with orthologs identified in mammals, other vertebrates, insects, and nematodes (Minet and Chiquet-Ehrismann, 2000). This family of transmembrane proteins shares a common TENEURIN N-terminal intracellular domain, 5–8 EGF-like repeats, and more than 20 tyrosine/aspartic acid (YD) repeats. TENM4 is highly expressed in the mammalian brain (Zhou et al., 2003). TENM4 mutations in mice can be lethal, and the mutants exhibit anomalies in gastrulation (Lossie et al., 2005). Notably, missense mutations in TENM4 have recently been reported to cause autosomal-dominant essential tremor (Hor et al., 2015). Rare TENM4 variants have also been found to be associated with bipolar disorder (Ament et al., 2015). Furthermore, a role of Tenm4 has been implicated in a rodent model of schizophrenia (Neary et al., 2017). Taken together, these studies are in line with a substantial role of TENM4 in the development and function of the mammalian central nervous system.

In the current study, a rare missense mutation (c.6724C>T; p.R2242C) in TENM4 showed cosegregation with schizophrenia in a three-generation family. In addition, two more rare mutations were identified in unrelated sporadic schizophrenic patients (**Table 3**), suggesting that the exons surrounding p.R2242C in TENM4 could be a potential mutation hotspot for schizophrenia. Such findings thus point to a potential association of TENM4 with schizophrenia. More TENM4 missense variants were observed in these schizophrenic patients than in controls. This could be due to higher clinical homogeneity in our patients: all of the unrelated schizophrenic patients in the current study were diagnosed with paranoid schizophrenia. It should also be noted that the local population in the current study could be more genetically homogeneous, as reported in a recent study (Chen et al., 2009b). The absence of rare TENM4 mutations in non-schizophrenic controls was confirmed by Sanger sequencing in 205 controls plus exome sequencing in a larger control cohort, which should ensure that the extreme rarity of these mutations is unlikely to be due to technical bias. Among the three TENM4 mutations found in schizophrenia patients, p.D2294N was found in a single heterozygous individual (MAF = 8.3 × 10<sup>−6</sup>) in the ExAC database, indicating that these deleterious TENM4 mutations might exist in the general population, yet at an extremely low frequency. We further checked loss-of-function variants in TENM4 in the ExAC database. In total, six loss-of-function variants were found for the canonical TENM4 isoform, each of which was detected in a single heterozygous individual (MAF = 8.3 × 10<sup>−6</sup>), suggesting that loss-of-function variants in TENM4 are extremely rare. For comparison, 27 loss-of-function variants were found in DISC1 in the ExAC database, with the highest MAF = 0.0005. Therefore, the function of the TENM4 protein could be relatively conserved and is unlikely to be tolerant of loss-of-function mutations.

The TENM4 protein contains 1 teneurin N-terminal domain, 8 EGF-like repeats, 5 NHL (NCL-1, HT2A, and Lin-41) repeats, and 23 YD repeats. Our findings suggest that the distribution of variants predicted to be deleterious is unlikely to be random. The mutations detected in schizophrenia patients in the current study were located within a YD repeat-containing region in the extracellular domain of the TENM4 protein, which is essential for extracellular protein–protein interaction and signal transduction for transmembrane proteins. In contrast, only a benign missense variant with low MAF, and no deleterious variant, was found in the YD repeat region among non-schizophrenia control exomes. A premature termination codon variant was identified in the last exon close to the C-terminus in one of our control exomes. However, it was not within the YD repeats. Because the truncated transcript still encodes 98.7% of the full-length protein, the variant is unlikely to trigger nonsense-mediated decay of mRNA and thereby affect protein function (Cirulli et al., 2011). Such findings suggest that an increased burden of rare deleterious variants in the YD repeats of TENM4 is probably associated with schizophrenia. The YD repeat sequences contain two tandem copies of a 21-residue extracellular repeat named for a YD dipeptide, the most strongly conserved motif of the repeat (Minet et al., 1999). These repeats appear in general to be involved in binding carbohydrate, but their function is not yet clearly understood. In humans, YD repeats are found in all four TENM proteins. Prior studies suggest that these conserved repeats might be involved in neuron development. Minet et al. (1999) reported that culture substrate coated with the YD repeat region of chicken teneurin-1 supported neuron outgrowth from dorsal root ganglion explants. Therefore, defects in YD repeats might contribute to the development of psychiatric diseases through their role in neuron development. Our results thus warrant further studies of the YD repeats and TENM4 in the molecular mechanism of schizophrenia.
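The 98.7% figure quoted for the stop-gained control variant can be checked with a short calculation, using the protein length of 2769 residues and the variant p.Q2735X given in the text. The function name `retained_fraction` is hypothetical; this is a back-of-the-envelope sketch rather than part of the study's analysis.

```python
PROTEIN_LENGTH = 2769   # residues in full-length TENM4, as stated in the text
STOP_POSITION = 2735    # p.Q2735X: residue 2735 is replaced by a stop codon


def retained_fraction(stop_position, protein_length):
    """Fraction of the protein translated before a premature stop codon."""
    return (stop_position - 1) / protein_length


retained = retained_fraction(STOP_POSITION, PROTEIN_LENGTH)
print(f"{100 * retained:.1f}% of the full-length protein is retained")
```

Only the last 35 residues are lost, which is consistent with the stop codon lying in the last exon close to the C-terminus.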

Although TENM4 has not previously been clinically linked to schizophrenia, recent studies have already implicated its possible role in mental illness and cognition. In an epigenetic study, the Tenm4 gene was recently implicated in a rodent model of schizophrenia (Neary et al., 2017). In addition, TENM4 was associated with bipolar disorder in a recent genome-wide association study with a large sample size (Sklar et al., 2011), and genetic studies have shown prominent genetic overlap between schizophrenia and bipolar disorder. In a large population-based study in Sweden and Denmark (Lichtenstein et al., 2009), the risk of bipolar disorder was associated with a family history of schizophrenia. Moreover, a recent study has reported that rare variants in TENM4 might contribute to the etiology of bipolar disorder (Ament et al., 2015). Furthermore, in a study of genetic correlates of brain magnetic resonance imaging and cognitive tests, TENM4 was ranked among the top 25 loci associated with Boston Naming Test score via generalized estimating equations, suggesting a link to cognitive function (ref. 33). An association of cognitive impairment with schizophrenia has been implicated in previous studies (O'Carroll, 2000). Although cognitive function was not assessed directly in our study subjects during sample collection, our records showed that all affected members of the family had received education and had been employed, suggesting that they were unlikely to have had cognitive impairment before disease onset. After disease onset, as the disease progressed, they gradually exhibited cognitive impairments with marked negative symptoms, including disorganized thinking, lack of judgment and insight, social dysfunction, and loss of ability to work. Likewise, all of the unrelated schizophrenia patients had evident hallucinations and delusions during the acute phase. As the disease progressed, cognitive impairments were observed in these patients, including poor attention, active social avoidance, lack of judgment and insight, poor rapport, and disturbance of volition. The rare deleterious TENM4 variants found in schizophrenia patients might not affect cognition directly. Therefore, our data suggest an association of TENM4 with schizophrenia, while its association with cognitive impairment needs to be investigated in further studies. Nevertheless, taken together with these previous studies, our findings are in line with a substantial role of TENM4 in central nervous system development and diseases.

Schizophrenia is genetically highly heterogeneous, and genetic variants with different effects can contribute to the disease risk. In addition to common variants with relatively small effects identified by both GWAS and candidate gene studies, the role of rare variants or mutations with large effects has been emphasized in psychiatric disorders such as autism (Chaste et al., 2014), especially by exome sequencing studies (Cukier et al., 2014; Toma et al., 2014). Schizophrenia-associated rare mutations have been described in recent exome sequencing studies (Purcell et al., 2014). However, most of these variants need to be further confirmed, and for most of them validation still remains a big challenge due to their low frequency (<1%) and uncertain penetrance, especially for schizophrenia. The mapping of rare mutations can be enhanced by analyzing rare variants that cosegregate with disease phenotypes in high-density families with schizophrenia, and by integrating prior knowledge of linkage loci. The association between TENM4 mutations and schizophrenia in the current study does not contradict previous literature regarding the architecture of schizophrenia genetic risk. First, to the best of our knowledge, mutations conferring a large risk of schizophrenia have been reported in at least five genes, DISC1, NRXN1, PRODH, SHANK3, and SLC1A1, according to records in OMIM (**Supplementary Table S3**). Second, the observation of TENM4 mutations in some of our schizophrenic patients does not rule out a possible contribution of other genetic factors, such as polymorphisms with small effects or mutations with large effects, to the disease in the same patients.

In the current study, by using exome sequencing, we demonstrated cosegregation of a novel missense mutation in TENM4 with schizophrenia in a Chinese family. With additional rare TENM4 mutations significantly associated with schizophrenia, our findings suggest that TENM4 is a candidate gene for the disease in the SCZD2 locus at 11q14-21. The specific functional effects of these rare TENM4 variants warrant further study.

#### ADDITIONAL INFORMATION

#### Accession Codes


Whole-exome sequence data for the schizophrenia family have been deposited in the GenBank Sequence Read Archive (SRA) under the accession code SPR172436.

#### URLs

Online Mendelian Inheritance in Man (OMIM), http://omim.org/; PROVEAN, http://provean.jcvi.org/; SIFT, http://sift.jcvi.org/; PolyPhen-2, http://genetics.bwh.harvard.edu/pph2/; Gene Expression Atlas, http://www.ebi.ac.uk/gxa/; ExAC, http://exac.broadinstitute.org/.

### AUTHOR CONTRIBUTIONS

C-BX and X-HZ contributed equally to this work. X-HH and J-HC conceived the project and planned the experiments. C-BX, X-HZ, YW, Q-LC, C-RW, J-TH, H-SZ, and W-HX clinically characterized the cases and collected blood samples. C-BX, Z-HX, and JZ performed the validation experiments. J-HC, XY, S-SY, Z-YP, and H-MY analyzed and interpreted the experiment data. H-MY provided the control exome data. C-BX and X-HH interpreted the clinical data. All authors contributed to the final manuscript.

#### REFERENCES


### FUNDING

This study was supported in part by grants from the National Natural Science Foundation of China (Nos. 31671311, 81371033, 81000397, and 81573510), the National first-class discipline program of Light Industry Technology and Engineering (LITE2018-14), the National "Tenth Five-Year" Science and Technology Program (2002BA711A08), the Project of Guangdong Province Medical Research Foundation (A2012409), Natural Science Foundation of Guangdong Province, China (No. S2013010015618), Science and Technology Planning Project of Guangdong Province, China (No. 2010B031600130), the "Six Talent Peak" Plan of Jiangsu Province (No. SWYY-127), the Innovative and Entrepreneurial Talents of Jiangsu Province, the Program for High-Level Entrepreneurial and Innovative Talents Introduction of Jiangsu Province, Guangdong High-Level Personnel of Special Support Program, Yangfan Plan of Talents Recruitment Grant, and Fundamental Research Funds for the Central Universities (JUSRP51712B).

#### ACKNOWLEDGMENTS

We thank all the participating patients, institutions, and medical staff, without whose contribution this work would not have been possible.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2018.00725/full#supplementary-material


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Xue, Xu, Zhu, Wu, Zhuang, Chen, Wu, Hu, Zhou, Xie, Yi, Yu, Peng, Yang, Hong and Chen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.


# Whole Exome Sequencing Identifies a Novel Predisposing Gene, MAPKAP1, for Familial Mixed Mood Disorder

Chunxia Yang<sup>1</sup>, Suping Li<sup>1</sup>, Jack X. Ma<sup>2</sup>, Yi Li<sup>3</sup>, Aixia Zhang<sup>1</sup>, Ning Sun<sup>1,4</sup>, Yanfang Wang<sup>1</sup>, Yong Xu<sup>1</sup>\* and Kerang Zhang<sup>1</sup>\*

*<sup>1</sup> Department of Psychiatry, First Hospital of Shanxi Medical University, Taiyuan, China, <sup>2</sup> McGovern Medical School, University of Texas Health Science Center at Houston, Houston, TX, United States, <sup>3</sup> School of Statistics, Shanxi University of Finance and Economics, Taiyuan, China, <sup>4</sup> Nursing College of Shanxi Medical University, Taiyuan, China*

#### Edited by:

*Weihua Yue, Peking University Sixth Hospital, China*

#### Reviewed by:

*Ming Li, Kunming Institute of Zoology, China; Xiangrong Zhang, Nanjing Brain Hospital Affiliated to Nanjing Medical University, China; Guolian Kang, St. Jude Children's Research Hospital, United States*

#### \*Correspondence:

*Yong Xu xuyongsmu@vip.163.com; Kerang Zhang krzhang\_sxmu@vip.163.com*

#### Specialty section:

*This article was submitted to Behavioral and Psychiatric Genetics, a section of the journal Frontiers in Genetics*

Received: *16 August 2018* Accepted: *28 January 2019* Published: *15 February 2019*

#### Citation:

*Yang C, Li S, Ma JX, Li Y, Zhang A, Sun N, Wang Y, Xu Y and Zhang K (2019) Whole Exome Sequencing Identifies a Novel Predisposing Gene, MAPKAP1, for Familial Mixed Mood Disorder. Front. Genet. 10:74. doi: 10.3389/fgene.2019.00074*

Background: Mood disorder is ranked seventh among the worldwide causes of non-fatal disease burden and is generally believed to be a heritable disease. However, there is still a substantial portion of the heritability yet to be discovered, despite the success of genome-wide association studies (GWAS) for mood disorder. A proportion of the missing heritability may be accounted for by rare coding variants segregating in families enriched with mood disorder.

Methods: To identify novel variants segregating with mood disorder, we performed whole-exome sequencing on genomic DNA for a multigenerational family with nine members affected with mood disorder. We prioritized potential causal variants within the family based on segregation with mood disorder, predicted functional effects, and prevalence in human populations. In addition, for the top-ranked candidate variant, we conducted validation *in vivo* to explore the pathogenesis of mood disorder.

Results: We identified and ranked 26 candidate variants based on their segregation pattern and functional annotations. The top-ranked variant, rs78809014, is located in intron 7 of the MAPKAP1 gene. The expression levels of MAPKAP1 in the peripheral blood of major depressive disorder (MDD) patients and in the ventral dentate gyrus of depressive-like mice were significantly higher than those in the corresponding controls. In addition, the expression level of MAPKAP1 was correlated with antidepressant response.

Conclusions: Although the exact mechanisms in the family remain to be elucidated, our data indicate a probable role of the variant rs78809014 in regulating the expression of MAPKAP1 and thus in the development of familial mood disorder.

Keywords: mood disorder, pedigree-based analysis, whole exome sequencing, rare variants, validation


# INTRODUCTION

Mood disorders are serious mental illnesses characterized by high incidence, recurrence, and suicide rates (Ogasawara et al., 2018). Among the different types of mood disorders, bipolar disorder (BD) affects ∼1% of the population, causing severe psychosocial disturbance and requiring life-long treatment (Kanba et al., 2013; Stahl et al., 2013). Major depressive disorder (MDD), on the other hand, is a common disease with a lifetime prevalence of around 10% (Motomura and Kanba, 2013). The social and financial burden of these disorders is large, and current treatments are still insufficient. Although the etiology of mood disorders remains largely unknown, a genetic component has been strongly suggested by family and twin studies. The heritability of mood disorders ranges from ∼37% (95% CI 31–42) for major depression (Flint and Kendler, 2014) to 75% for bipolar disorder (Sullivan et al., 2012). In the last decade, many single nucleotide polymorphisms (SNPs) and de novo or inherited copy number variations (CNVs) have been found to be associated with mood disorders by genome-wide association studies (GWAS) and CNV analysis using DNA microarrays (Kato, 2015). Despite the success of GWAS, the identified SNPs and CNVs that reach genome-wide significance and have been validated by independent studies so far can explain only a small portion of the heritability (Peterson et al., 2017; Xiao et al., 2017). It is generally believed that the degree of genetic heterogeneity is remarkably higher than previously thought for most mood disorders, and that the overall genetic structure probably has a polygenic component that contributes only a small portion of the overall liability. The rest of the variance, which cannot be explained by variants identified by GWAS and is known as "missing heritability," may be accounted for by loci with modest to large effects (Collins et al., 2013; Cruceanu et al., 2013). Because GWAS focus on common variants, it is believed that low-frequency (0.5–5%) and rare (<0.5%) variants could explain the missing heritability. Rare variants are known to play an important role in many Mendelian disorders and in rare, highly penetrant forms of common diseases (Keinan and Clark, 2012; Zuk et al., 2014). Recent empirical evidence also shows that low-frequency and rare variants are associated with complex diseases (Goes et al., 2016).

With the advance of second-generation DNA sequencing technologies, detection of rare variants has become increasingly feasible. Because rare variants have an extremely low frequency in general populations, one of the ideal study designs for detecting them is to utilize pedigrees with a significant number of affected individuals (Roach et al., 2010). The availability of second-generation whole genome sequencing (WGS) and whole exome sequencing (WES) now permits the study of rare single nucleotide variants (SNVs) and small insertions/deletions (indels) in a systematic genome-wide manner (Roach et al., 2010; Ament et al., 2015). Studies using WGS or WES have been conducted for adult BD to search for highly penetrant rare variants (in 1% of population) with some success (Kato, 2015; Zhang et al., 2018). Collins et al. (2013) genotyped 46 individuals in a three-generation Old Order Amish pedigree with 19 affected (16 BP and 3 MD) and 27 unaffected subjects, and suggested that family-based studies of the combined effect of common and rare CNVs at many loci may represent a useful approach in the genetic analysis of disease susceptibility of mental disorders. Although WGS has many advantages, such as allowing examination of both coding and non-coding regions (e.g., regulatory regions), WES is more cost-effective, carries a much lower computational burden, and can quickly and effectively identify common and rare coding variants. In addition, a large-scale WGS study of 200 individuals from 41 families with BD showed an excess of rare variants in pathways associated with γ-aminobutyric acid and calcium channel signaling (Ament et al., 2015). In a recent study, Goes et al. (2016) performed exome sequencing of 36 affected members with BD from eight multiplex families, tested rare, segregating variants in three independent case-control samples consisting of 3,541 BD cases and 4,774 controls, and found 84 rare (frequency <1%), segregating variants that were bioinformatically predicted to be damaging.

In this study, we recruited a Chinese pedigree affected by mood disorders and sequenced the exomes of 22 subjects in this pedigree, comprising 9 members with mood disorders and 13 unaffected members, to explore novel genetic alterations predisposing individuals to familial mood disorder. We also conducted in vivo validation from a genetic perspective to explore the pathogenesis of mood disorder.

# MATERIALS AND METHODS

#### Subjects

We studied a Northern Chinese family of ethnic Han origin in which 9 individuals (5 males and 4 females) were affected with MDD or BD (**Figure 1**). We recruited this family through a proband (A21), who was diagnosed with BD at the age of 22. Clinical diagnoses were made between March 2013 and February 2015 by at least two consultant psychiatrists according to the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV) criteria for MDD (American Psychiatric Association, 2000). The affected individuals were also assessed with the Chinese Version of the Modified Structured Clinical Interview for DSM-IV TR Axis I Disorders Patient Edition (SCID-I/P, 11/2002 revision). None of the affected individuals had other diseases, except for two who had hypertension in a stable condition. The 13 unaffected members of the family had no mental illness or other diseases.

In order to explore the functional impact of the candidate gene identified from the pedigree, we utilized data from one of our previous studies of 30 MDD patients. Details on the collection and diagnosis of the subjects were given in Sun et al. (2016). In short, all the patients were assessed by well-trained research assistants with a background in psychology or psychiatry using the 17-item Hamilton depressive scale (HAMD-17) before and after an 8-week antidepressant treatment (SSRIs).

Among the 30 MDD patients, initial doses were increased to curative doses, which were administered for the next 2–4 weeks. The specific doses (maintenance doses and increased doses) were adjusted according to side effects and clinical assessment. If necessary, small doses of benzodiazepines were prescribed for agitation, but for no longer than 3 days. Twenty-two patients were remitted after the 8-week antidepressant treatment (HAMD < 7). Five patients did not undergo the 8-week antidepressant treatment, and three did not meet the remission criterion after it. Twenty-two age- and gender-matched healthy controls were selected from 86 healthy volunteers. None of these controls had any family history of major psychiatric disorders (schizophrenia, bipolar disorder, MDD, and so on), and none had any history of blood transfusion or severe traumatic event within 1 month.

The study was approved by the Medical Research Ethics Committee of Shanxi Medical University. All subjects gave written informed consent.

#### Exome Sequencing Followed by Quality Control and Statistical Analyses

#### Exome Capture and Sequencing

Blood samples for 22 family members (affected: A06, A08, A12, A15, A17, A18, A21, A23, A25; unaffected: A03, A07, A09, A10, A11, A13, A14, A16, A20, A22, A28, A29, A30) were submitted to MyGenostics (Beijing, China) for genome analysis. DNA was extracted from 5 mL aliquots using the CTAB approach. Each DNA sample was run on a 1% agarose gel to assess its quality in terms of degradation and RNA contamination. The purity of DNA was analyzed with Nanodrop. Qualified DNA samples had to exceed 1.5 µg, show no degradation or RNA contamination, and have a 260/280 ratio between 1.8 and 2.0. The quantified genomic DNA was sheared to a fragment length between 180 and 280 base pairs using focused acoustic energy (Covaris). The samples were then end-repaired, A-tailed, and ligated with specific adapters and multiplexing indexes.

The amplified DNA was captured with pooled biotin-labeled probes (up to 543,872) for liquid-phase hybridization. The probes were designed to tile along 20,965 genes containing 334,378 exons. The capture experiment was conducted according to the manufacturer's protocol. In brief, 1 µg of DNA library was mixed with Buffer BL and GenCap gene panel probe (MyGenostics, Beijing, China), and heated at 95°C for 7 min and 65°C for 2 min on a PCR machine; 23 µl of Buffer HY (MyGenostics, Beijing, China) prewarmed to 65°C was then added to the mix, and the mixture was held at 65°C with the PCR lid heat on for 22 h for hybridization. Fifty microliters of MyOne beads (Life Technology) were washed three times in 500 µl of 1X binding buffer and resuspended in 80 µl of 1X binding buffer. Sixty-four microliters of 2X binding buffer was added to the hybrid mix, which was then transferred to the tube with 80 µl of MyOne beads. The mix was rotated for 1 h on a rotator. The beads were then washed once with WB1 buffer at room temperature for 15 min and three times with WB3 buffer at 65°C for 15 min. The bound DNA was then eluted with Buffer Elute. The eluted DNA was finally amplified for 15 cycles using the following program: 98°C for 30 s (1 cycle); 98°C for 25 s, 65°C for 30 s, 72°C for 30 s (15 cycles); 72°C for 5 min (1 cycle). The PCR product was purified using SPRI beads (Beckman Coulter) according to the manufacturer's protocol. The enriched libraries were sequenced on an Illumina HiSeq 2000 sequencer with 100 bp paired-end reads.

#### Quality Control, Variants Calling, and Annotation

High-quality reads were retrieved from raw reads by filtering out the low-quality reads and adaptor sequences using the Solexa QA package and the cutadapt program, respectively. Clean reads were first aligned to the human reference genome (GRCh37) using the Burrows-Wheeler Aligner (BWA) (http://bio-bwa.sourceforge.net/). The GATK (http://www.broadinstitute.org/gsa/wiki/index.php/Home\_Page) package was then used to call SNPs and InDels from the aligned reads. The identified SNPs and InDels were then filtered out if: (a) mapping qualities <30; (b) the Total Mapping Quality Zero Reads <4; (c) approximate read depth <5; (d) QUAL <50.0; or (e) phred-scaled p-value using Fisher's exact test to detect strand bias >10.0. The variants were then annotated using ANNOVAR (http://annovar.openbioinformatics.org/en/latest/) with multiple databases, including 1000genome (http://www.1000genomes.org/), dbSNP (http://www.ncbi.nlm.nih.gov/projects/SNP/), EXAC (http://exac.broadinstitute.org/), Inhouse (MyGenostics), and HGMD (http://www.biobaseinternational.com/product/hgmd). Non-synonymous variants were evaluated using prediction algorithms including SIFT (http://sift.jcvi.org/), PolyPhen-2 (http://genetics.bwh.harvard.edu/pph2/), and MutationTaster (http://www.mutationtaster.org/).
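The hard filters (a) to (e) listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline actually used: the record type `Call` and its field names (`mq`, `mq0`, `dp`, `qual`, `fs`) are hypothetical, and the thresholds and their directions are taken verbatim from the text.

```python
from dataclasses import dataclass


@dataclass
class Call:
    """One called SNP or InDel; all field names are hypothetical."""
    mq: float    # mapping quality
    mq0: int     # total mapping-quality-zero reads
    dp: int      # approximate read depth
    qual: float  # variant quality (QUAL)
    fs: float    # phred-scaled strand-bias p-value (Fisher's exact test)


def fails_hard_filter(c: Call) -> bool:
    """Return True if the call meets any exclusion criterion (a)-(e)."""
    return (
        c.mq < 30          # (a)
        or c.mq0 < 4       # (b) direction copied verbatim from the text
        or c.dp < 5        # (c)
        or c.qual < 50.0   # (d)
        or c.fs > 10.0     # (e)
    )


# Toy calls, invented purely for illustration.
calls = [
    Call(mq=60, mq0=4, dp=40, qual=900.0, fs=1.2),   # passes every criterion
    Call(mq=25, mq0=4, dp=40, qual=900.0, fs=1.2),   # fails (a)
    Call(mq=60, mq0=4, dp=3, qual=900.0, fs=1.2),    # fails (c)
    Call(mq=60, mq0=4, dp=40, qual=900.0, fs=35.0),  # fails (e)
]
kept = [c for c in calls if not fails_hard_filter(c)]
```

Expressing the criteria as a single predicate keeps each threshold visible next to its label in the text, which makes it easy to audit against the written protocol.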

#### Filtering Based on Coverage, Functions, and Prevalence

Given our interest in identifying functional, rare, or even family-specific variants, we applied a second filtering step that removed variants that were synonymous, had an allele frequency > 5% in any of the mutation databases, or had a depth of coverage <6 or a mutation ratio <30%.
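As a rough sketch of this second filtering step, the four removal rules can be written as one predicate. The argument names (`effect`, `max_af`, `depth`, `alt_ratio`) and the toy variants are assumptions introduced here for illustration; `max_af` stands for the highest allele frequency observed across the annotation databases.

```python
def passes_second_filter(effect: str, max_af: float, depth: int, alt_ratio: float) -> bool:
    """Keep a variant only if none of the removal rules in the text applies.

    effect    -- functional class, e.g. "synonymous" or "missense" (hypothetical labels)
    max_af    -- highest allele frequency across the databases (0.05 means 5%)
    depth     -- depth of coverage at the site
    alt_ratio -- fraction of reads supporting the variant ("mutation ratio")
    """
    return (
        effect != "synonymous"
        and max_af <= 0.05
        and depth >= 6
        and alt_ratio >= 0.30
    )


# Toy variants, invented for illustration only.
variants = {
    "v1": ("missense", 0.01, 40, 0.48),    # kept
    "v2": ("synonymous", 0.01, 40, 0.48),  # removed: synonymous
    "v3": ("missense", 0.12, 40, 0.48),    # removed: too common
    "v4": ("missense", 0.01, 4, 0.48),     # removed: low coverage
    "v5": ("missense", 0.01, 40, 0.10),    # removed: low mutation ratio
}
retained = [name for name, v in variants.items() if passes_second_filter(*v)]
```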

#### Filtering Based on Segregation Pattern and Genetic Model

The segregation pattern of MDD/BD in the family suggested a dominant model with high but incomplete penetrance, because the father (A03) of the two affected sons (A17 and A18) was not affected. We also assumed that the causal variant segregated within the family members but not the marry-ins. Therefore, we first identified all variants that were shared by all affected members and the father (A03) of A17 and A18; then we excluded variants that appeared in the five marry-ins.
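The segregation rule described above amounts to two set operations per variant, sketched below. The affected sample IDs and A03 come from the text; the marry-in labels (`MI1` to `MI5`), the variant names, and the carrier sets are hypothetical placeholders, since the text does not identify the marry-ins individually.

```python
AFFECTED = {"A06", "A08", "A12", "A15", "A17", "A18", "A21", "A23", "A25"}
OBLIGATE_CARRIER = "A03"  # unaffected father of A17 and A18


def segregating_variants(carriers_by_variant, marry_ins):
    """Return variants carried by every affected member and A03 but by no marry-in."""
    required = AFFECTED | {OBLIGATE_CARRIER}
    return [
        variant
        for variant, carriers in carriers_by_variant.items()
        if required <= carriers and not (carriers & marry_ins)
    ]


# Hypothetical marry-in labels and toy carrier sets, for illustration only.
marry_ins = {"MI1", "MI2", "MI3", "MI4", "MI5"}
carriers_by_variant = {
    "var_a": AFFECTED | {"A03"},                # segregates as assumed
    "var_b": AFFECTED | {"A03", "MI2"},         # also seen in a marry-in
    "var_c": (AFFECTED - {"A25"}) | {"A03"},    # missing in one affected member
    "var_d": AFFECTED,                          # absent from A03
}
candidates = segregating_variants(carriers_by_variant, marry_ins)
```

Note that this sketch does not penalize a variant for appearing in other unaffected blood relatives, which matches the incomplete-penetrance assumption stated in the text.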

#### Prioritizing and Ranking Using MendelScan

Following the strategy proposed in the program package MendelScan (Koboldt et al., 2014), we prioritized potential causal variants within the family based on segregation with mood disorder, predicted functional effects, and prevalence in human populations. The segregation score, population score, and the annotation score of the candidate variants were calculated using the MendelScan approach.

#### Differential Expression Analysis

The Kolmogorov-Smirnov test was used to test the normality of the expression data. Differential expression analyses for MDD vs. control groups and for treated vs. untreated MDD were performed using the two-sample t-test. All analyses were conducted with SPSS (version 17.0) and GraphPad Prism (version 5.0). Data are expressed as means ± standard deviations (SD) unless otherwise indicated. The demographic data of the two groups were analyzed using the t-test and the chi-square test. P < 0.05 was considered statistically significant.

#### RNA Extraction

Blood samples of 22 MDD patients before and after an 8-week treatment and of 22 healthy controls were collected using EDTA anticoagulant tubes and processed within 3 h. Peripheral blood leukocytes were isolated from the fresh blood samples by centrifugation and stored at −80°C in fresh RNase/DNase-free 2 ml microcentrifuge tubes. Total RNA was extracted from the peripheral blood leukocytes with TRIzol (Invitrogen; USA) with on-column DNase I treatment according to the manufacturer's protocol. The integrity of total RNA was evaluated by denaturing agarose gel electrophoresis. RNA was further purified using an RNeasy mini kit (Qiagen, Valencia, CA, USA) according to the manufacturer's instructions.

#### qPCR and Comparison of Gene Expression

The expression of the MAPKAP1 gene in 22 MDD patients and 22 healthy controls was analyzed by real-time quantitative PCR (qRT-PCR). cDNA was synthesized using a High Capacity RNA-to-cDNA Kit (Invitrogen; USA) as described by the manufacturer. The primers used for MAPKAP1 are listed in **Table 1**. PCR was performed using a 7900HT real-time PCR machine (Applied Biosystems; USA) for 2 min at 50°C, 2 min at 95°C, and then 40 cycles consisting of 15 s at 95°C and 60 s at 60°C, followed by a standard dissociation protocol to ensure that each amplicon was a single product. All quantifications were normalized to GAPDH. The qRT-PCR was performed in triplicate for each of the three independent samples. The expression of MAPKAP1 was measured using the miScript system (QIAGEN, CA) (including the miScript Reverse Transcription kit, miScript Primer assays, and miScript SYBR Green PCR kit) as described in the protocol provided by the company. Small nuclear RNA U6 was used for normalization. The threshold cycle (CT) was defined as the fractional cycle number at which the fluorescence passes the fixed threshold. The comparative CT (2<sup>−ΔΔCT</sup>) method was used for quantification of transcripts.
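The comparative CT calculation named above can be illustrated with a short Python sketch. It follows the standard 2<sup>−ΔΔCT</sup> formulation and assumes roughly 100% amplification efficiency for both target and reference; the function name, argument names, and CT values are invented for illustration and are not data from this study.

```python
from statistics import mean


def relative_expression(target_ct, reference_ct, calib_target_ct, calib_reference_ct):
    """Fold change of the target gene by the comparative CT method.

    Each argument is a list of replicate CT values (e.g. a triplicate).
    delta CT       = mean CT(target) - mean CT(reference gene)
    delta-delta CT = delta CT(sample) - delta CT(calibrator)
    fold change    = 2 ** (-delta-delta CT)
    """
    delta_sample = mean(target_ct) - mean(reference_ct)
    delta_calibrator = mean(calib_target_ct) - mean(calib_reference_ct)
    return 2.0 ** -(delta_sample - delta_calibrator)


# Invented triplicates: the sample's target gene crosses the threshold one
# cycle earlier than the calibrator's, relative to the reference gene.
fold = relative_expression(
    target_ct=[24.25, 24.0, 23.75],
    reference_ct=[18.0, 18.0, 18.0],
    calib_target_ct=[25.0, 25.0, 25.0],
    calib_reference_ct=[18.0, 18.0, 18.0],
)
```

A difference of one cycle corresponds to a two-fold difference in starting template, so `fold` here comes out as a doubling of expression relative to the calibrator.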

#### GEO Data Analysis

We also analyzed the expression data of MAPKAP1 from GEO in the ventral dentate gyrus of adult mice subjected to chronic corticosterone (CORT) to induce depression-like behaviors, followed by the selective serotonin reuptake inhibitor fluoxetine (Samuels et al., 2014). All mice were 7–8 weeks old and weighed 23–35 g at the beginning of the treatment. Fifteen mice were treated with chronic CORT, which induced depression-like behavior with increased immobility in the Forced Swim Test (FST) and anxiety-like behavior with increased latency to eat in the Novelty Suppressed Feeding Test (NSF). These features can then be reversed with fluoxetine (FLX) for 21 days. However, a minority of mice (4 out of 15) appeared to respond to FLX in the NSF but not in the FST; these mice were not used for microarray experiments, so that 11 (7 that appeared to respond to FLX and 4 that appeared to resist FLX) of the 15 fluoxetine-treated mice were used for microarrays. Eight randomly chosen mice not treated with FLX were used as representative controls for microarray studies. Student's t-test was performed to compare the expression levels between the MDD mice group and the control group, and between the responder group and the resistant and control groups.

TABLE 1 | Sequences for primers of MAPKAP1.


# RESULTS

The genogram of the pedigree is given in **Figure 1**. Blood samples were available for 9 affected and 13 unaffected members (A03, A07, A09, A10, A11, A13, A14, A16, A20, A22, A28, A29, A30) of the family, and we performed WES on these family members (**Figure 1**).

The clinical characteristics of the affected family members are provided in **Table 2**. None of these individuals displayed atypical findings on neurological examination, except that one of them (A25) had had mild intellectual disability since childhood. Two of the affected members had hypertension. There were nine affected family members, seven of whom were diagnosed with major depression and one with bipolar disorder (**Table 2**).

#### QC of Sequencing Data

We attained relatively high coverage of the exome, with each sample yielding, on average, 12,638 Mbp. On average, 94.6 and 89.0% of base pairs had Phred values greater than 20 and 30, respectively, and 99.8% of the base pairs passed the strict quality control filters for each sample. On average, about 84% of base pairs were mapped to the reference genome and 34% to the targeted region (the exome); the average depth was 69.6, covering 98.5% of the exome. On average, 96.1, 91.9, and 84.3% of the exome had a sequencing depth of at least 4X, 10X, and 20X, respectively. On average, 84.37% of reads were successfully aligned, with 10.74% PCR duplicates per sample, resulting in 98.52% coverage, with 84% of the exome being covered by 20 or more reads. The detailed quality control metrics are given in **Table 3**.

#### Variants Calling and QC

There were 115,915 variants (102,717 SNVs, and 13,198 InDels) called that were variable in at least one sample. For those called variants, the average coverage per sample was between 46X and 79X. Of those variants, the majority were within exons (49%), and introns (30%). About 8% of those variants were in regions not considered to be within or around known genes (**Table 4**). After applying stringent quality control filters, 101,126 variants remained, including 93,071 SNVs, and 8,055 InDels. For those high quality variants, the average coverage was between 48X and 81X. The majority of those variants were within exons (51%) and introns (31%), while 6.5% within intergenic regions (**Table 4**).


#### Filtering Based on Coverage, Functions, and Prevalence

This filtering process yielded 21,306 variants, including 18,464 SNVs, and 2,842 InDels, with the average coverage between 53X and 81X. Among those variants, 8,071 (38%) were within exons, 7,225 (34%) within introns, and 2,686 (13%) within intergenic regions (**Table 4**).

#### Filtering Based on Segregation Pattern and Genetic Model

We assumed that the causal variant segregated within the family members but not the marry-ins (see Methods for details). We first identified 475 variants that were shared by all affected members and the father, A03, of A17 and A18; we then excluded those that were among the 1,104 variants found in the five marry-ins, yielding a final set of 26 candidate variants.

#### Prioritizing and Ranking Candidate Variants

To prioritize those 26 candidate variants, we calculated their segregation score, population score, and annotation score using the MendelScan approach (Koboldt et al., 2014). The results are summarized in **Table 5**. The top-ranked variant is rs78809014 (chr9; 128305252;128305252; C;G), which has a population score of 0.02 and is located in the seventh intron of the gene MAPKAP1. The gene is a key component of the mTOR signaling pathway, which has great potential for identifying new therapeutic targets for antidepressant drugs in MDD and is involved in the response to antidepressants (Szewczyk et al., 2015; Liu et al., 2016). Specifically, the top variant (rs78809014) had a perfect association with affection status: all affected members had the mutation and all unaffected members except A03 had the normal allele.

TABLE 3 | Average variants coverage by sample.

#### Clinical Characteristics of the MDD Cases and Controls

As shown in **Table 6**, 22 MDD patients underwent an 8-week antidepressant treatment (SSRI only). All the patients and controls were of Han nationality, and there were no statistically significant differences in age, sex, or residential location between MDD patients and healthy controls. The mean baseline HAMD score was 19.86 ± 2.51.

#### Expression of MAPKAP1 in MDD and Control

All the expression data followed a normal distribution (P > 0.05). The mean expression level of the MDD group before treatment (1.92 ± 0.42) was significantly higher than that of the control group (1.52 ± 0.30) (t = 3.652, P = 0.001), and was also significantly higher than that after treatment (1.66 ± 0.38) (t = 2.131, P = 0.039) (**Figure 2**).

#### Expression of MAPKAP1 in MDD and Control Mice

We performed the one-sample Kolmogorov-Smirnov test on the expression data in each of the four groups separately and found that each of them followed a normal distribution (P > 0.05). The mean expression level of MAPKAP1 in the ventral dentate gyrus of MDD mice (236.46 ± 23.17) was significantly higher than that of the control group (208.47 ± 9.23) (t = −3.216, P = 0.005). The mean expression level of MAPKAP1 in the ventral dentate gyrus of the responder group (245.90 ± 19.07) was significantly higher than that of the control group (208.47 ± 9.23) (t = 4.946, P < 0.001) and was also marginally higher than that of the resistant group (219.95 ± 22.14) (t = −2.055, P = 0.070) (**Figure 3**).
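As a worked check, the reported t statistics can be approximated from the summary statistics alone with a pooled-variance (Student's) two-sample formula. This sketch is not the authors' SPSS analysis: the function name is made up here, the group sizes (11 fluoxetine-treated mice, of which 7 responders and 4 resistant, and 8 controls) are inferred from the GEO Data Analysis subsection, and the sign of t depends only on the order in which the groups are entered.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's two-sample t statistic and degrees of freedom from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / standard_error, df


# Means and SDs as reported in the text; group sizes are assumptions (see lead-in).
t_mdd_vs_control, df_mdd = pooled_t(236.46, 23.17, 11, 208.47, 9.23, 8)
t_resp_vs_control, df_resp = pooled_t(245.90, 19.07, 7, 208.47, 9.23, 8)
t_resp_vs_resistant, df_res = pooled_t(245.90, 19.07, 7, 219.95, 22.14, 4)

for label, t, df in [
    ("MDD vs. control", t_mdd_vs_control, df_mdd),
    ("responder vs. control", t_resp_vs_control, df_resp),
    ("responder vs. resistant", t_resp_vs_resistant, df_res),
]:
    print(f"{label}: t = {t:.3f}, df = {df}")
```

Under these assumptions the absolute values agree with the reported statistics (3.216, 4.946, and 2.055) to within rounding of the published means and SDs.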

# DISCUSSION

With only a small fraction of the predicted heritability being accounted for by variants identified through linkage analysis and GWAS, it is believed that mood disorder is highly genetically heterogeneous and its genetic susceptibility factors may involve rare variants (Gershon, 2000). The pedigree we recruited was enriched with affected individuals and thus provided us with a great opportunity to reduce the genetic heterogeneity and identify rare or even family-specific variants responsible for mood disorder (Rao et al., 2017). Exome sequencing is increasingly utilized to identify rare and likely disease-causing mutations in many neuropsychiatric disorders (Binder, 2012). The segregation pattern of mood disorder in this pedigree strongly suggests a highly penetrant dominant model. Based on this model and the segregation score, annotated functional score, and population score, we identified and ranked 26 candidate rare variants. These rare mutations were shared by affected family members and were absent in the unaffected family members (except for A03), which is consistent with the currently favored hypothesis of oligogenic disease causation in BD (Gershon, 2000; Rao et al., 2017).

The top-ranked variant, rs78809014, is located in an intronic region of the MAPKAP1 gene. This variant may be regarded as segregating perfectly with mood disorder in this pedigree, if it is deemed non-penetrant in A03, the father of two affected sons. rs78809014 may be regarded as a rare variant, with a global MAF of 0.0088 in the major populations. However, in the HapMap population CHB (Han Chinese in Beijing) the MAF is as high as 0.044 (https://www.ncbi.nlm.nih.gov/snp/?term=rs78809014). To the best of our knowledge, no GWAS has identified rs78809014 as a susceptibility locus associated with the risk of mood disorders. We suspect that this not-so-rare variant (especially in the Chinese population) may have escaped genome-wide scans because mood disorders are complex diseases caused by a combination of inherited variations, which often act together with environmental and/or behavioral factors (Li et al., 2010). It is possible that there was an additional genetic mutation shared by the affected individuals in the pedigree under study, which might have been identified had we conducted WGS. This additional mutation might act together with rs78809014 to cause the prevalence of mood disorder in this pedigree. It is also likely that the affected members share a common environmental risk factor that interacts with rs78809014 in regulating the expression of MAPKAP1.

MAPKAP1 (also known as Sin1) is a key component of the mTORC2 signaling complex, which is necessary for AKT phosphorylation (Li et al., 2010; Machado-Vieira et al., 2015). Ketamine, which has recently become one of the most popular antidepressant medicines and has been proved to be effective, can rapidly activate the mammalian target of rapamycin (mTOR) pathway, leading to increased synaptic signaling proteins and increased number and function of new spine synapses in the prefrontal cortex of rats (Abelaira et al., 2014). Fluoxetine, a commonly used classic antidepressant, has also been shown to regulate mTOR signaling in a region-dependent manner in depression-like mice, mainly in the hippocampus (Liu et al., 2015). Some of us have previously reported that the key genes of mTORC2 signaling, AKT1 and GSK3B, appear to be associated with MDD in the Han Chinese population (Yang et al., 2010, 2012; Zhang et al., 2010; Liu et al., 2014). The characterization of the mTOR signaling pathway in depression and its action in response to antidepressants show great potential for the identification of new therapeutic targets for the development of antidepressant drugs (Szewczyk et al., 2015; Liu et al., 2016).

TABLE 5 | Scores of the family separation analysis.

TABLE 6 | Clinical characteristics of MDD patients and healthy controls.

Although the variant rs78809014 is located in an intron and has not been annotated with any functional terms, we anticipate that it may have some regulatory impact on the MAPKAP1 gene. In vivo data supported our expectation that the expression level of MAPKAP1 may be altered in the development of MDD. We compared the mRNA expression levels of MAPKAP1 in MDD patients vs. healthy controls, and before vs. after treatment (8 weeks) in the MDD patients, and found that MAPKAP1 was indeed overexpressed in MDD patients and that SSRI treatment could significantly reduce its expression level. Also, the expression levels of MAPKAP1 in the ventral dentate gyrus of responder mice were significantly higher than those in the control group, and marginally higher than those in the resistant group. Recent studies show that antidepressant treatment such as imipramine inhibits PI3K/Akt/mTOR signaling (Jeon et al., 2011). Moreover, Lin et al. (2010) showed that sertraline exerts antiproliferative activity by targeting the mTOR signaling pathway in rat embryonic fibroblasts. Our data provide further evidence that the expression level of MAPKAP1 plays an important role in the pathology of MDD. In a future study, it will be highly desirable to investigate whether this variant affects the expression level of MAPKAP1 in brain tissues and/or blood cells. Moreover, a more detailed expression analysis at the transcript level should be performed in future studies, because the gene MAPKAP1 has 21 isoforms. Specifically, in the qPCR experiments, more than five different transcripts were mapped, and the results reflected the aggregate expression of those isoforms.

Not surprisingly, none of the candidate variants identified in this study was implicated in any of the 15 major GWASs on BD (Shinozaki and Potash, 2014). Because of the genetic heterogeneity of mood disorder, a single susceptibility variant may affect only a very small proportion of a population. Given the substantial evidence that the majority of causal genetic variants in BD are common with very small effect sizes (OR = 1.05 to 1.20), individual rare variants can hardly reach the genome-wide significance level required in a typical GWAS. The advantage of a family-based WES or WGS study is its ability to identify rare variants with moderate to high penetrance that confer susceptibility to complex diseases such as mood disorders.

The pedigree we have studied included multiple affected individuals with BD or MDD. Although it has been believed that focusing on a single subtype of a disease to reduce the phenotypic heterogeneity will increase the chance of identifying genetic factors, increasing evidence has indicated that familial co-aggregation or comorbidity between these disorders is mainly attributable to overlapping genetic influences (Smoller and Finn, 2003; Chang et al., 2013). A recent large-scale GWAS across the five disorders, MDD, BD, SZ, ADHD, and autism, identified SNPs at four loci that accounted for some of the shared variation across the disorders at p < 5 × 10<sup>−8</sup> (Cross-Disorder Group of the Psychiatric Genomics Consortium, 2013). Moreover, the proband of this pedigree was diagnosed with BD at the age of 22, the cut-off age for defining early onset (Grigoroiu-Serbanescu et al., 2014). It is believed that early-onset BD is more heritable and more severe than BD occurring at an older age (Schürhoff et al., 2000; Grigoroiu-Serbanescu et al., 2001; Somanath et al., 2002). The etiologic mechanisms of BD are not well understood, but empirical data consistently suggest the polygenic character of BD, with estimated heritability ranging from 80 to 85% (Barnett and Smoller, 2009). We therefore strongly believe that a genetic variant segregating with, and responsible for, the mood disorder exists in this pedigree.

One of the limitations of exome sequencing is that it does not identify the non-coding and structural variants that could be found by WGS. Also, none of the current exome capture reagents cover 100% of the coding region. Nevertheless, we believe that the candidate variants identified in the current study warrant validations via family-based and/or population based studies of large sample size. Furthermore, the proposed susceptibility genes may be validated functionally for their role in the brain and the impact of the identified mutations on protein structure and function and/or expression levels of the corresponding transcripts.

# AUTHOR CONTRIBUTIONS

KZ: Study concepts and definition of intellectual content; KZ and YX: Study design; AZ: Literature research; YW and SL: Clinical studies; NS: Experimental studies; CY and YX: Data acquisition; CY and JM: Data analysis and statistical analysis; CY: Manuscript preparation; CY and YL: Manuscript editing; AZ, YL, KZ, and YX: Manuscript review. All authors have approved the final article.

# ACKNOWLEDGMENTS

This study was supported by the National Natural Science Youth Fund Project (81701345, 81601192), National Natural Science Foundation of China (81471379), the National Key Research and Development Program of China (2016YFC1307103), National Clinical Research Center on Mental Disorders (2015BAI13B02), National Key Basic Research Program (No. 2013CB531305), NSFC grants (31470070 and 31501002), Program for the Outstanding Innovative Teams of Higher Learning Institutions of Shanxi, Natural Science Foundation of Shanxi Province for Youths (201601D021151), Doctoral Fund of Shanxi Medical University (BS03201635), and Shanxi Province Graduate Student Education Innovation at 2017 (2017BY079). We sincerely thank the patients and their families, as well as the healthy volunteers, for their participation, and all the medical staff involved in the collection of specimens.

#### REFERENCES


mammalian target of rapamycin signaling. Cancer Res. 70, 3199–3208. doi: 10.1158/0008-5472.CAN-09-4072


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Yang, Li, Ma, Li, Zhang, Sun, Wang, Xu and Zhang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Current Understanding of Gut Microbiota in Mood Disorders: An Update of Human Studies

Ting-Ting Huang<sup>1</sup>†, Jian-Bo Lai<sup>1,2,3</sup>†, Yan-Li Du<sup>1</sup>, Yi Xu<sup>1,2,3</sup>, Lie-Min Ruan<sup>4</sup>\* and Shao-Hua Hu<sup>1,2,3</sup>\*

#### Edited by:

Weihua Yue, Peking University Sixth Hospital, China

#### Reviewed by:

Chuanjun Zhuo, Tianjin Anding Hospital, China; Richard S. Lee, Johns Hopkins University, United States; Xueqin Song, Zhengzhou University, China; Shuping Tan, Beijing HuiLongGuan Hospital, Peking University, China

#### \*Correspondence:

Shao-Hua Hu dorhushaohua@zju.edu.cn; Lie-Min Ruan 13805869162@163.com

†These authors have contributed equally to this work

#### Specialty section:

This article was submitted to Behavioral and Psychiatric Genetics, a section of the journal Frontiers in Genetics

> Received: 05 November 2018 Accepted: 29 January 2019 Published: 19 February 2019

#### Citation:

Huang T-T, Lai J-B, Du Y-L, Xu Y, Ruan L-M and Hu S-H (2019) Current Understanding of Gut Microbiota in Mood Disorders: An Update of Human Studies. Front. Genet. 10:98. doi: 10.3389/fgene.2019.00098

<sup>1</sup> Department of Psychiatry, First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China, <sup>2</sup> The Key Laboratory of Mental Disorder's Management of Zhejiang Province, Hangzhou, China, <sup>3</sup> Brain Research Institute of Zhejiang University, Hangzhou, China, <sup>4</sup> Department of Mental Health, Ningbo First Hospital, Ningbo, China

Gut microbiota plays an important role in the bidirectional communication between the gut and the central nervous system. Mounting evidence suggests that gut microbiota can influence brain function via neuroimmune and neuroendocrine pathways as well as the nervous system. Advances in gene sequencing techniques further facilitate investigation of the underlying relationship between gut microbiota and psychiatric disorders. In recent years, researchers have preliminarily explored the gut microbiota in patients with mood disorders. The current review aims to summarize the published human studies of gut microbiota in mood disorders. The findings showed that microbial diversity and taxonomic compositions were significantly changed compared with healthy individuals. Most of these findings revealed that short-chain fatty acid-producing bacterial genera were decreased, while pro-inflammatory genera and those involved in lipid metabolism were increased in patients with depressive episodes. Interestingly, the abundance of Actinobacteria and Enterobacteriaceae was consistently increased and that of Faecalibacterium consistently decreased in patients with either bipolar disorder or major depressive disorder. Some studies further indicated that specific bacteria were associated with clinical characteristics, inflammatory profiles, metabolic markers, and pharmacological treatment. These studies present preliminary evidence of the important role of gut microbiota in mood disorders, through the brain-gut-microbiota axis, which emerges as a promising target for disease diagnosis and therapeutic interventions in the future.

Keywords: gut microbiota, mood disorder, brain-gut-microbiota axis, gene sequencing techniques, human study

# INTRODUCTION

Mood disorders, including depressive disorder and bipolar disorder (BD), affect approximately 10% of the world's population, causing significant individual and socioeconomic burdens (Wittchen, 2012). Compared with the general population, people with mood disorders tend to have higher rates of mortality and a decreased life expectancy (Angst et al., 1999; Kessing et al., 2015).

However, the underlying mechanisms of mood disorders are not sufficiently characterized. Both genetic and environmental factors contribute, and the proposed mechanisms include genetic vulnerability and susceptibility (Sullivan et al., 2012), chronic non-infectious inflammation (Rosenblat et al., 2014), oxidative stress (Zhao et al., 2017), neurotransmitter imbalance (Pralong et al., 2002), insufficient signaling by neurotrophic factors (Castren and Kojima, 2017), and neuroendocrine abnormalities (Linkowski, 2003). In the last decade, the potential role of gut microbiota in the pathogenesis of mood disorders has attracted considerable attention. Until now, however, no article has comprehensively reviewed all the human studies investigating the role of the gut microbiota in mood disorders.

The human commensal microbiota inhabits various body surfaces including the skin, nose, oral cavity, vagina, stomach, and intestine (Marsland and Gollwitzer, 2014). The human intestine contains 10 to 100 trillion microbes, almost 10 times the total number of human cells (Bäckhed et al., 2005). The predominant bacterial phyla in the human gastrointestinal (GI) tract are Firmicutes, Bacteroidetes, Proteobacteria, Actinobacteria, Fusobacteria, and Cyanobacteria (Bäckhed et al., 2005; Marsland and Gollwitzer, 2014). Furthermore, some researchers regard the human microbiota as a second genome, containing 100 times as many genes as the human genome (Bäckhed et al., 2005; Grice and Segre, 2012). The normal gut ecosystem helps maintain human health through functions that can be classified as metabolic, protective, structural, and histological (Prakash et al., 2011). The microbiota changes dynamically during individual growth (Clemente et al., 2012) and can be influenced by various factors, such as host genetics (Kurilshikov et al., 2017), environment (Chen et al., 2018b), mode of delivery (Dominguez-Bello et al., 2010), diet (Patman, 2015), antibiotics (Bokulich et al., 2016), and probiotics and prebiotics (Preidis and Versalovic, 2009). Dysbiosis in gut microbiota has been associated with many systemic disorders, such as functional bowel disorders (Mayer et al., 2014), inflammatory disease (Clemente et al., 2018), atherosclerosis (Jie et al., 2017), metabolic disease (Bouter et al., 2017), and neuropsychiatric disorders (Sharon et al., 2016). A reduction in certain microbes that produce short-chain fatty acids (SCFAs) has been observed in inflammatory bowel disease and autoimmune diseases, and dysbiosis in gut microbiota was associated with higher levels of inflammation (Clemente et al., 2018). Furthermore, obesity was shown to be associated with a lower ratio of Bacteroidetes to Firmicutes, and the ratio increased after weight loss (Ley et al., 2006).
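The Bacteroidetes-to-Firmicutes ratio mentioned above is a quotient of two phylum-level relative abundances. The following minimal Python sketch shows the arithmetic only; the function name `phylum_ratio` and the abundance values are hypothetical and are not taken from Ley et al. (2006) or any other cited study.

```python
def phylum_ratio(abundances, numerator="Bacteroidetes", denominator="Firmicutes"):
    """Return the ratio of two phylum-level relative abundances.

    `abundances` maps phylum names to relative abundances (fractions of reads).
    """
    num = abundances.get(numerator, 0.0)
    den = abundances.get(denominator, 0.0)
    if den == 0:
        raise ValueError(f"{denominator} abundance is zero; ratio is undefined")
    return num / den


# Hypothetical relative abundances for one subject at two time points
# (illustrative values only, not data from any study).
sample_before = {"Firmicutes": 0.60, "Bacteroidetes": 0.30, "Proteobacteria": 0.10}
sample_after = {"Firmicutes": 0.50, "Bacteroidetes": 0.40, "Proteobacteria": 0.10}

ratio_before = phylum_ratio(sample_before)
ratio_after = phylum_ratio(sample_after)
ratio_increased = ratio_after > ratio_before
```

Because the ratio depends only on relative abundances, it can be compared across samples sequenced to different depths, although it discards information about all other phyla.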

Alterations in the composition of the human gut microbiota have also been linked to a variety of neuropsychiatric disorders, including mood disorders, autism spectrum disorder (ASD), schizophrenia and Parkinson's disease (PD) (Cenit et al., 2017). Studies indicated that altered gut bacterial communities could substantially influence central physiology. Furthermore, patients who suffered from GI discomfort were more likely to have comorbid mental disorders (Mussell et al., 2008; Lee et al., 2015). The GI symptoms in patients with irritable bowel syndrome (IBS) significantly improved after receiving psychotropic treatments (Palsson and Whitehead, 2002). The altered gut microbiota composition in patients with depression was related to abnormalities in hypothalamic–pituitary–adrenal (HPA) axis function, intestinal low-grade inflammation and an imbalanced neurotransmitter metabolism via the brain–gut–microbiota axis (Kelly et al., 2016). Therefore, gut microbial dysregulation may contribute to the pathogenesis of mental disorders, supporting the hypothesis of a pathological process of bidirectional communication between the gut and the brain.

The aim of the current review is thus first to introduce the brain-gut-microbiota axis and briefly describe evidence from animal studies and other neuropsychiatric disorders relevant to this axis, then to focus on human studies in patients with mood disorders, and lastly to discuss the cause-effect relationship between gut dysbiosis and mood disorders. We also discuss the limitations of previous studies and propose directions for future investigation.

# THE BRAIN-GUT-MICROBIOTA AXIS

Gut microbiota modulates brain development and function, and the brain in turn interacts with gut bacteria via neuroimmune and neuroendocrine pathways and the nervous system. This bidirectional communication system is commonly called the brain-gut-microbiota axis (Rhee et al., 2009). Through this system, signals from the brain can influence the physiological functions of the gut, including motility, secretion and immune function, and messages from the gut can influence brain function with regard to reflex regulation and mood states (O'Mahony et al., 2011). Chronic stress can affect the gut microbiota composition, which is associated with activation of the HPA axis and an elevated pro-inflammatory status (Bailey et al., 2011; O'Mahony et al., 2011). Hyperactivity of the HPA axis promotes cortisol secretion and induces a pro-inflammatory response. The intestinal mucosal barrier and the blood–brain barrier are important gates for substance transfer. Cortisol can increase the permeability of both the intestinal tract and the blood–brain barrier, thus facilitating communication between the gut microbiota and the central nervous system (CNS). In addition, the microbial composition can be altered by pathogen infection or by microecological treatment (O'Toole and Cooney, 2008). Changes in the microbiota have a direct effect on the immune system, and the disrupted balance between pro-inflammatory and anti-inflammatory cytokines further affects brain function (Duerkop et al., 2009). The vagus nerve (VN), as a major anatomical pathway connecting the enteric nervous system (ENS) and the CNS, also plays a key role in microbiota–brain interactions (Cryan and Dinan, 2012).

### Gut Immune System

Gut microbiota is an important component in the development of the gut immune system, and gut immunological homeostasis is influenced by host–microbe interactions (Furusawa et al., 2013). Germ-free (GF) mice exhibited an underdeveloped immune system and immune function, which could be restored by colonization with certain bacteria, such as segmented filamentous bacteria (Talham et al., 1999). Symbiotic microbes maintain the immune balance through both direct and indirect pathways (Petra et al., 2015). On the one hand, microbial-associated molecular factors, including lipopolysaccharide, bacterial lipoprotein, flagellin, and CpG oligodeoxynucleotide (a ligand of Toll-like receptor 9 expressed in endosomes of dendritic cells), can activate immune cells as well as Toll-like receptors to promote the release of pro-inflammatory cytokines, which further increases the permeability of the gut-blood barrier and blood–brain barrier and regulates CNS function and behavior (Petra et al., 2015; Sampson and Mazmanian, 2015). A microbiota-driven pro-inflammatory state and low-grade inflammation with a dysfunctional intestinal mucosal barrier were observed in stress-related psychiatric disorders such as depression (Kelly et al., 2015). Inflammatory cytokines can also cause the over-release of corticotropin-releasing hormone, the dominant regulator of the HPA axis (de Weerth, 2017). Hyperactivity of the HPA axis could also contribute to increased cytokine expression in animal studies (Hueston and Deak, 2014). Indeed, hyperactivity of the HPA axis and immune activation were both observed during depressive episodes (Young, 2004; Slavich and Irwin, 2014). On the other hand, the VN, as a link between the CNS and ENS, can mediate immunoregulatory signals directly to the brain and the gut (Petra et al., 2015; Breit et al., 2018).

### The Neuroendocrine Pathway

Gut microbiota can secrete a series of neurotransmitters, such as γ-aminobutyric acid (GABA) (Barrett et al., 2012), acetylcholine (Stephenson and Rowatt, 1947), serotonin (Mittal et al., 2017), dopamine (Asano et al., 2012; Mittal et al., 2017), and histamine (Devalia et al., 1989). For instance, Lactobacillus spp. produce GABA and acetylcholine; Bifidobacterium spp. produce GABA; Escherichia spp. produce noradrenalin and serotonin; Bacillus spp. produce noradrenalin and dopamine; Saccharomyces spp. produce noradrenalin; and Candida spp., Streptococcus spp., and Enterococcus spp. produce serotonin (Cryan and Dinan, 2012). Notably, more than 90% of the serotonin in the human body is produced in the gut, and it can affect emotion regulation when transmitted to the CNS (Smith, 2015). Studies in GF mice showed higher levels of noradrenalin, dopamine, and serotonin in the striatum and the hippocampus (Diaz Heijtz et al., 2011; Clarke et al., 2013). It is conceivable that neurotransmitters secreted by gut microbiota can influence the level of central neurotransmitters and thereby affect behavior and mood. Furthermore, bacterial metabolites, such as SCFAs (e.g., acetic acid, propionate, butyrate, isobutyric acid, valeric acid, and isovaleric acid) (Al-Lahham et al., 2010), have physiological effects including the regulation of food intake, glucose/insulin or lipid metabolism, and anti-inflammatory and antitumorigenic functions, and can even activate the sympathetic nervous system (Layden et al., 2013; Borre et al., 2014; Tan et al., 2014). In addition, butyrate can alter the activity of cells located in the blood–brain barrier and exert an antidepressant-like effect in animal models (Yamawaki et al., 2012; Smith, 2015). Therefore, brain function and behavior can also be modulated by gut microbiota through the neuroendocrine pathway.

### The Neural Pathway

The communication between the gut and brain, through the neural anatomical pathway, is based on a hierarchical four-level integrative organization, including the ENS, prevertebral ganglia, the autonomic nervous system, and the CNS (Wang and Wang, 2016). Animal studies have shown that gut microbiota can activate the VN and further influence brain function and behavior (Sherwin et al., 2016). Anxiolytic and antidepressant-like behavior was observed in mice treated with Lactobacillus rhamnosus, but not in vagotomized mice (Bravo et al., 2011). Similar findings were reported in rats with probiotic administration of Bifidobacterium longum (Bested et al., 2013). It seems that the effects of gut microbiota on brain function are dependent on vagal activation. Furthermore, activation of the VN inhibits cytokine production, manifesting as an anti-inflammatory response (Johnston and Webster, 2009).

### Evidence From Animal Studies

Currently, evidence for the brain-gut-microbiota axis is mostly obtained from animal studies. These studies have verified the role of gut microbiota in modulating gut–brain interactions through various strategies, such as GF animal observation, fecal microbial transplantation, and probiotic treatment (Mayer et al., 2015). GF mice exhibited an enhanced HPA response to stress and reduced expression of brain-derived neurotrophic factor (BDNF) in the cortex and hippocampus, compared with specific-pathogen-free mice (Sudo et al., 2004). Moreover, the exaggerated HPA stress response in the GF mice could be partially reversed by oral inoculation with Bifidobacterium infantis (Sudo et al., 2004). It is, however, worth noting that some studies showed a reduction of anxiety-like behaviors in GF mice (Neufeld et al., 2011; Clarke et al., 2013; Arentsen et al., 2015). GF mice displayed alterations in behavior and stress responses, changes in neurotransmitter levels and immune activation (Campos et al., 2016). Anxiety-like behaviors in GF mice can be normalized following the restoration of intestinal microbiota (Clarke et al., 2013). Interestingly, fecal microbiota transplantation from depressed patients into GF mice led to depression-like behaviors (Zheng et al., 2016). Furthermore, depression-like behaviors in a rat model could also be reversed with probiotic treatment (Desbonnet et al., 2008, 2010). These results suggest that the composition of the gut microbiota may contribute to regulating mood and behavior, but the detailed molecular mechanisms and the cause-effect relationship between gut microbiota and mood and behavioral phenotypes are not fully understood.

# GUT MICROBIOTA AND NEUROPSYCHIATRIC DISORDERS

Previous human studies have investigated the link between gut microbiota and a series of neuropsychiatric disorders. Alterations in gut microbial composition have been observed in children with ASD, with an increase in the Firmicutes/Bacteroidetes ratio (Williams et al., 2011; Tomova et al., 2015; Strati et al., 2017). A recent study with 35 ASD children and six healthy controls found consistent results, and its functional analysis demonstrated that butyrate/lactate-producing bacteria were decreased in ASD children (Zhang et al., 2018). Patients with schizophrenia also showed dysbiosis of gut microbiota, with a higher Proteobacteria abundance compared to healthy controls (Shen et al., 2018). Another study reported an increased abundance of Lactobacillus in patients with first-episode psychosis, and patients whose microbiota differed more strongly from that of controls showed worse treatment outcomes (Schwarz et al., 2018). Some neurodegenerative diseases have also been associated with gut microbiota. Increasing evidence shows gut microbiota changes in PD patients. Higher microbial diversity was found in Chinese and American PD patients, but not in Finnish and German patients, which may be related to regional variation (Keshavarzian et al., 2015; Scheperjans et al., 2015; Hopfner et al., 2017; Qian et al., 2018). A metagenomic shotgun analysis performed in PD patients showed that these patients can potentially be distinguished from healthy controls with gut microbiota-based biomarkers (Bedarf et al., 2017). Alterations in gut microbial metabolites, including β-glucuronate and tryptophan metabolism, were further observed in PD patients (Bedarf et al., 2017). In addition to the diseases mentioned above, preliminary studies characterizing the gut microbiota in patients with mood disorders have also emerged recently, providing rudimentary knowledge in this field. The remainder of this review focuses on studies carried out in patients with major depressive disorder (MDD) and BD.

Huang et al. Gut Microbiota in Mood Disorders

# HUMAN STUDIES OF THE GUT MICROBIOTA IN MOOD DISORDERS

The search strategy was in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA). We selected relevant studies before October 1, 2018, by searching the PubMed, Embase and PsycINFO databases. The search keyword string was (mood disorder OR bipolar disorder OR mania OR depression OR depressive disorder) AND gut microbiota. The results were further restricted to human studies. No language restrictions were applied. We also manually searched the reference lists of key articles. Eligible studies needed to have investigated the characteristics of gut microbiota in patients with MDD or BD using a high-throughput sequencing or proteomics approach, and needed to be full-text research articles rather than reviews, letters, case reports, or meeting abstracts. All studies identified were screened by title and abstract, and by full text if needed. The study selection process is shown in **Figure 1**. Finally, 12 research articles on the gut microbiota in mood disorders were included for further review: seven studies on MDD and five on BD. 16S rRNA gene sequencing is currently the mainstream tool for identifying phylogenetic relationships between bacteria. The 16S rRNA gene exists in all bacteria, its function has been conserved over time, and it is large enough to distinguish between different bacteria (Patel, 2001). The included studies further conducted correlation analyses to explore the relationships between gut microbial features and demographic, immune, metabolic and clinical data in MDD and BD patients. The sample size across studies ranged from 10 to 58 in MDD subjects, and from 31 to 115 in BD subjects. Among MDD and BD patients, the mean age was 39.5 and 39.9 years; the mean proportion of females was 45.7% and 55.8%; and the mean body mass index (BMI) was 23.1 and 26.3, respectively. Details of these studies are provided in **Table 1**. We discuss the main findings in MDD and BD separately below.
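The search string and eligibility criteria described above can be written down explicitly, which makes the screening step easier to reproduce. The Python sketch below is only an illustration: the record fields (`id`, `human_study`, `population`, `method`, `article_type`) and the three candidate records are hypothetical, and the code does not call any real database interface.

```python
def build_query(condition_terms, topic_term):
    """OR the condition terms together, then AND the group with the topic term."""
    return "(" + " OR ".join(condition_terms) + ") AND " + topic_term


# Terms exactly as given in the search strategy above.
CONDITION_TERMS = [
    "mood disorder",
    "bipolar disorder",
    "mania",
    "depression",
    "depressive disorder",
]
query = build_query(CONDITION_TERMS, "gut microbiota")

# Eligibility criteria from the text; the field names are assumptions.
ELIGIBLE_POPULATIONS = {"MDD", "BD"}
ELIGIBLE_METHODS = {"high-throughput sequencing", "proteomics"}
EXCLUDED_TYPES = {"review", "letter", "case report", "meeting abstract"}


def is_eligible(record):
    """Return True if a screened record meets all stated inclusion criteria."""
    return (
        record["human_study"]
        and record["population"] in ELIGIBLE_POPULATIONS
        and record["method"] in ELIGIBLE_METHODS
        and record["article_type"] not in EXCLUDED_TYPES
    )


# Hypothetical screening records, for illustration only.
candidates = [
    {"id": "rec-1", "human_study": True, "population": "MDD",
     "method": "high-throughput sequencing", "article_type": "research article"},
    {"id": "rec-2", "human_study": False, "population": "MDD",
     "method": "high-throughput sequencing", "article_type": "research article"},
    {"id": "rec-3", "human_study": True, "population": "BD",
     "method": "proteomics", "article_type": "review"},
]
included = [r["id"] for r in candidates if is_eligible(r)]
```

Encoding the criteria as data (sets of allowed values) rather than as nested conditions keeps them easy to audit against the written protocol.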

### Gut Microbiota and Major Depressive Disorder

To date, seven documented studies have investigated the association between MDD and gut microbiota in humans. Among these studies, four performed a microbial diversity analysis. Jiang et al. (2015) reported greater diversity of gut microbiota in MDD patients than in healthy individuals, whereas the other three studies found no significant differences in microbial diversity (Naseribafrouei et al., 2014; Zheng et al., 2016; Chen et al., 2018a). High microbial diversity is potentially beneficial to health, but it can easily be influenced by age, diet, and other factors (Yatsunenko et al., 2012). Additionally, all seven studies analyzed microbial composition changes in MDD patients. Naseribafrouei et al. (2014) first compared the gut microbiota of 37 depressed patients and 18 healthy controls, and found that a higher abundance of order Bacteroidales and genera Oscillibacter and Alistipes, and a lower abundance of family Lachnospiraceae, were associated with depression. Another study investigated the gut microbiota in active-MDD (A-MDD) patients, responded-MDD (R-MDD) patients and healthy controls (Jiang et al., 2015). In this study, the abundance of Proteobacteria, Bacteroidetes, and Actinobacteria was increased, while that of Firmicutes was decreased, in both A-MDD and R-MDD patients compared to the healthy controls. However, higher microbial diversity was only found in A-MDD patients, not in R-MDD patients. At lower taxonomic levels, increased Enterobacteriaceae and Alistipes and decreased Faecalibacterium were observed in MDD patients. Furthermore, the Faecalibacterium genus was negatively associated with the severity of depressive symptoms (Jiang et al., 2015), whereas Prevotella and Klebsiella were positively correlated with the depression score in another study (Lin et al., 2017). The latter study also reported more of phylum Firmicutes and genera Prevotella, Klebsiella, Streptococcus, and Clostridium XI, and less Bacteroidetes, in depressed patients (Lin et al., 2017). Beneficial gut bacteria, such as Bifidobacterium and Lactobacillus, were also reduced in MDD patients (Aizawa et al., 2016). Zheng et al. (2016) found increased Actinobacteria and Bacteroidetes abundance and reduced Firmicutes in MDD patients, consistent with the findings of Jiang et al. (2015). However, Chen et al. (2018c) found that Firmicutes and Actinobacteria were increased, whereas Bacteroidetes and Proteobacteria were reduced, in 10 MDD patients compared to 10 healthy controls. Abundance of Faecalibacterium was also found to be associated with depression severity (Chen et al., 2018c). Proteomics analysis further indicated disordered bacterial proteins involved in carbohydrate and amino acid metabolism (Chen et al., 2018c). Only one study explored the sex differences of gut microbiota in MDD patients, demonstrating that levels of the Actinobacteria phylum were increased in females and the Bacteroidia class was decreased in males (Chen et al., 2018a).

MDD, major depressive disorder; A-MDD, active-MDD; R-MDD, responded-MDD; F-MDD, female-MDD; M-MDD, male-MDD; BD, bipolar disorder; BDD, BD patients with depressive episode; BDM, BD patients with manic episode; HCs, healthy controls; AAP-treated, atypical antipsychotic-treated.
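The diversity comparisons in this subsection rest on alpha-diversity measures computed per sample. One widely used measure is the Shannon index, H = −Σ pᵢ ln pᵢ, where pᵢ is the proportion of reads assigned to taxon i. The sketch below assumes the Shannon index for illustration; the text above does not state which index each reviewed study used, and the read counts are hypothetical.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            h -= p * math.log(p)
    return h


# Hypothetical per-taxon read counts for two samples (illustration only).
even_community = [25, 25, 25, 25]    # four equally abundant taxa
skewed_community = [85, 5, 5, 5]     # same richness, dominated by one taxon

h_even = shannon_index(even_community)      # equals ln(4) for a perfectly even sample
h_skewed = shannon_index(skewed_community)  # lower, because evenness is lower
```

The two samples contain the same number of taxa, so the difference in H reflects evenness alone; this is one reason diversity comparisons between studies depend on which index and taxonomic level were used.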

From the current findings in MDD patients, we found that increased levels of phylum Actinobacteria, order Bacteroidales, family Enterobacteriaceae, and genus Alistipes, and decreased levels of family Lachnospiraceae and genus Faecalibacterium, were associated with depression in most studies. However, the change in Bacteroidetes was not consistent among these studies. Phylum Actinobacteria is involved in lipid metabolism (Painold et al., 2018), suggesting that the higher Actinobacteria levels in depressed patients may be related to dyslipidemia. Bacteroidetes and Bacteroidales were found to be associated with complex polysaccharide hydrolysis, and low Bacteroidetes and Bacteroidales levels were shown to be associated with metabolic diseases, such as obesity and diabetes (Ley et al., 2006; Zhang et al., 2013). The Enterobacteriaceae family is a natural inhabitant of the intestinal tract, and an inflammatory status in the gut is particularly beneficial for the proliferation of Enterobacteriaceae (Zeng et al., 2017). The Alistipes genus was associated with triggering inflammation and tumorigenesis in an IL-6-dependent manner (Moschen et al., 2016). A study on BALB/c mice showed a significant increase in Alistipes abundance when the mice were exposed to stress (Bangsgaard Bendtsen et al., 2012). Recent studies have shown that Alistipes was associated with a higher risk of obesity (Kang et al., 2018), IBS (Saulnier et al., 2011), and immune deficiency syndrome (McHardy et al., 2013). Notably, Faecalibacterium and Lachnospiraceae are important for the biosynthesis of the microbial product butyrate, which has anti-inflammatory effects, partly due to a decrease in pro-inflammatory cytokine synthesis and an increase in anti-inflammatory cytokine secretion (Duncan et al., 2007; Sokol et al., 2008). Consistent findings showed a negative association between Faecalibacterium and depressive symptoms, but the underlying mechanisms connecting Faecalibacterium with depression remain unclear.
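Statements such as "in most studies" and "not consistent among these studies" amount to an informal vote count of the direction of change per taxon. The sketch below tallies the phylum-level directions exactly as they are described in the prose of this section for four of the MDD studies. It is a bookkeeping illustration only, not a re-analysis or a meta-analysis: it ignores effect sizes, sample sizes, and lower taxonomic levels, and the helper names are hypothetical.

```python
from collections import Counter, defaultdict

# Direction of phylum-level change in MDD patients versus controls, transcribed
# from the study descriptions in this section (+1 = increased, -1 = decreased).
reported_changes = {
    "Jiang et al., 2015": {"Proteobacteria": +1, "Bacteroidetes": +1,
                           "Actinobacteria": +1, "Firmicutes": -1},
    "Lin et al., 2017": {"Firmicutes": +1, "Bacteroidetes": -1},
    "Zheng et al., 2016": {"Actinobacteria": +1, "Bacteroidetes": +1,
                           "Firmicutes": -1},
    "Chen et al., 2018c": {"Firmicutes": +1, "Actinobacteria": +1,
                           "Bacteroidetes": -1, "Proteobacteria": -1},
}


def tally_directions(changes):
    """Count, per taxon, how many studies reported an increase or a decrease."""
    tally = defaultdict(Counter)
    for taxa in changes.values():
        for taxon, direction in taxa.items():
            tally[taxon]["increased" if direction > 0 else "decreased"] += 1
    return tally


def is_consistent(counter):
    """True when every study reporting the taxon agrees on the direction."""
    return (counter["increased"] == 0) != (counter["decreased"] == 0)


tally = tally_directions(reported_changes)
consistent_taxa = sorted(t for t, c in tally.items() if is_consistent(c))
```

Under this tally, Actinobacteria is the only phylum reported in the same direction by every study that mentions it, while Bacteroidetes, Firmicutes, and Proteobacteria are split, which matches the qualitative summary given above.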

### Gut Microbiota and Bipolar Disorder

To date, the association between BD and gut microbiota has been documented in only five studies. Evans et al. (2017) first investigated the gut microbiota characteristics in 115 BD patients compared to 64 healthy controls. The levels of Faecalibacterium and of a member of the Ruminococcaceae family of the Firmicutes phylum were decreased, and Faecalibacterium abundance was negatively correlated with self-reported symptoms and depressive severity (Evans et al., 2017). This finding is consistent with the study by Painold et al. (2018). The latter also reported higher levels of phylum Actinobacteria and class Coriobacteria in BD patients (Painold et al., 2018). Based on the severity of depressive symptoms, the Clostridiaceae family and Roseburia genus were more abundant in healthier BD patients, while the Enterobacteriaceae family was more abundant in clinically depressive patients (Painold et al., 2018). In addition, microbial diversity was negatively correlated with illness duration (Painold et al., 2018). Inflammatory and metabolic profiles, such as serum IL-6, lipids, tryptophan, and BMI, were associated with specific bacteria in BD patients, and were all positively correlated with the abundance of the genus Lactobacillus (Painold et al., 2018). Another study found higher microbial diversity in BD patients compared to healthy subjects, especially in patients with manic episodes (Guo et al., 2018). In patients with different mood statuses, the relative abundance of Escherichia coli and Bifidobacterium adolescentis was higher in manic individuals, while Stercoris was higher in depressed individuals (Guo et al., 2018). The Flavonifractor genus was also found to be associated with BD, especially in female patients who smoke (Coello et al., 2018). However, no difference in gut microbiota was observed between unaffected first-degree relatives of BD patients and healthy subjects (Coello et al., 2018). To date, only one study has explored the influence of atypical antipsychotic (AAP) treatment on gut microbiota in BD patients. Significantly decreased microbial diversity was revealed in AAP-treated females (Flowers et al., 2017). Furthermore, AAP treatment was associated with an increased abundance of Lachnospiraceae and a decreased abundance of Akkermansia and Sutterella.
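Several findings above are correlations between a microbial feature and a clinical measure (for example, Faecalibacterium abundance versus depressive severity, or microbial diversity versus illness duration). A rank-based coefficient such as Spearman's rho is one common choice for such skewed data; the text does not specify which statistic each study used, so the sketch below is an assumption-laden illustration with hypothetical values, implemented in plain Python so that it has no external dependencies.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, assigning tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values for five patients (illustration only, not study data).
faecalibacterium_abundance = [0.12, 0.09, 0.07, 0.05, 0.03]
depression_score = [8, 12, 15, 21, 27]

rho = spearman(faecalibacterium_abundance, depression_score)
```

In this toy data set the abundance falls strictly as the score rises, so rho is −1; real cohorts would give weaker values and would also require a significance test and correction for multiple comparisons across taxa.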

As shown in these studies, the gut microbiota in BD patients tends to harbor higher levels of phylum Actinobacteria, order Coriobacteriales, families Coriobacteriaceae and Enterobacteriaceae, and genus Flavonifractor, and lower levels of genus Faecalibacterium and Bacteroides. Actinobacteria, Coriobacteriales, Coriobacteriaceae, and Bacteroides have been associated with lipid and glucose metabolism. Therefore, the increased risk of metabolic disturbance in BD patients may be related to the altered abundance of these bacteria (McElroy and Keck, 2014). Flavonifractor abundance was increased in BD patients, and may be related to oxidative stress and inflammatory reactions (Coello et al., 2018). Faecalibacterium, which was also decreased in MDD patients, was shown to be beneficial for human health through anti-inflammatory activities in the gut. As a whole, the disordered gut microbial communities in BD patients are linked to abnormal inflammatory and metabolic processes and oxidative stress, as well as to the disease itself.

### Gut Microbiota Difference Between Major Depressive Disorder and Bipolar Disorder

Major depressive disorder and BD are two different psychiatric disorders according to the DSM-5, with different clinical symptoms, therapies, and prognoses. The different gut microbial characteristics in BD and MDD patients may indicate different etiologies of these two diseases. However, some findings were similar in BD and MDD patients. For example, a higher abundance of Actinobacteria and Enterobacteriaceae and a lower abundance of Faecalibacterium were reported in both diseases. These bacteria are related to lipid metabolism and the inflammatory response, and the changes may contribute to the disturbance in lipid metabolism and pro-inflammatory activities in affected patients (Sowa-Kucma et al., 2018). Other findings differed between BD and MDD patients. The SCFA-producing family Lachnospiraceae was decreased in MDD patients in most studies, but the Lachnospira genus was increased in BD patients. The pro-inflammatory genera Alistipes and Klebsiella were increased in MDD patients, but were not reported in BD patients. Furthermore, bacterial communities identified in MDD patients were involved more in the metabolism and production of serotonin, GABA, valeric acid and butyrate, whereas those identified in BD patients were more likely to be associated with lipid metabolism. Details are shown in **Figure 2**.

### The Impact of Psychotropic Medication on the Gut Microbiota

What impact do psychotropic medications have on the human gut microbiota? So far, few studies have addressed this issue in humans. Among the studies included for review, only one has described the changes in microbiota associated with AAP treatment in BD patients. In this study, AAP-treated patients had a higher BMI than non-treated patients, and microbial diversity was decreased in treated female patients but not in treated male patients (Flowers et al., 2017). In addition, treated patients without obesity showed significantly decreased Akkermansia (Flowers et al., 2017). This bacterium was reported to show an inverse association with inflammation, insulin resistance, and lipid metabolism (Schneeberger et al., 2015). Therefore, its reduction in BD may predispose patients to inflammatory conditions and metabolic perturbations. In patients with schizophrenia, chronic treatment with risperidone was associated with weight gain and a lower ratio of Bacteroidetes to Firmicutes (Bahr et al., 2015). Interestingly, chemically different antipsychotics can inhibit the growth of gut-derived microbial strains, indicating that these non-antibiotic drugs have antibiotic-like side effects (Maier et al., 2018). In addition to antipsychotics, some antidepressants are also considered to have antimicrobial effects (Macedo et al., 2017). Other possible mechanisms should also be taken into consideration. Most psychotropic medications target neurotransmitters and their receptors, including serotonin, dopamine and noradrenalin; these neurotransmitters can also be produced by gut microbiota, so the drugs may feed back on the bacteria. Furthermore, improvement of clinical symptoms through psychotropic medication may also influence the diversity and composition of gut microbiota. The gastrointestinal side effects of these drugs, such as constipation and diarrhea, may also affect the commensal bacteria. Although the underpinnings are not fully understood, alterations of gut microbiota in relation to psychotropic drugs also seem to contribute to weight gain, metabolic disturbance and inflammatory activities in patients. Drug–microbiota interactions offer promising paths to understanding and controlling the off-target side effects of these drugs.

FIGURE 2 | The taxonomic tree and function of the main fecal bacterial clades in MDD and BD. The taxonomic tree shows the common and different features of gut microbiota composition between MDD and BD patients, as well as the physiological functions of the specific bacteria. A decrease in SCFA-producing genera and an increase in pro-inflammatory genera were reported in patients with both BD and MDD. # Four studies showed lower Bacteroidetes levels in MDD patients, and one study showed a higher abundance of Bacteroidetes in MDD patients. Similarly, the abundance of Lachnospiraceae was decreased in MDD patients in two studies, but increased in another study. SCFAs, short-chain fatty acids; IBS, irritable bowel syndrome; 5-HT, 5-hydroxytryptamine.

# DISCUSSION

Accumulating evidence on the role of the brain-gut-microbiota axis in neuropsychiatric diseases has emerged in recent years. Herein, we focus on the human studies of gut microbiota pertaining to mood disorders. Compared to healthy individuals, MDD and BD patients showed significant changes in gut microbial diversity and composition. In depressed patients, decreased microbial diversity was found in most studies. Across studies, a consistent increase in the abundance of Actinobacteria and Enterobacteriaceae and a decrease in Faecalibacterium were revealed. These findings indicate that decreased SCFA-producing genera and increased pro-inflammatory genera may be related to chronic, low-grade systemic inflammation in patients with mood disorders. Furthermore, specific gut bacteria were also associated with inflammatory markers and metabolic profiles, disease severity, duration of illness, psychiatric symptoms, and pharmacological treatment.

Current studies shed light on the potential of using gut microbial markers to distinguish patients with mood disorders from unaffected healthy individuals. However, these studies were all cross-sectional, and the cause-effect relationship between mood disorders and gut microbiota remains unclear. Among the studies included, only two compared the microbiota characteristics of actively depressed patients and responder patients (Jiang et al., 2015; Painold et al., 2018). The gut microbiota composition in responder patients also differed significantly from that of healthy controls, indicating that the improvement of clinical symptoms did not restore the gut microbiota to a state close to that of healthy individuals. Longitudinal studies, with confounding factors controlled and patients in different statuses (depression, mania, and remission), are needed to clarify the causal relationship between gut microbiota and mood disorders.

There are some major limitations in the current studies of human gut microbiota in mood disorders. Most studies had small sample sizes, which inevitably weakens their rigor. Consistent demographic and clinical characteristics of recruited subjects are needed. However, the gut is a complicated ecosystem that can be influenced by various factors, such as age (O'Toole and Jeffery, 2015), genetics (Bonder et al., 2016), diet (David et al., 2014), and regional variations (He et al., 2018). Recruited subjects in most studies did not receive a standardized diet, and the geographical effect was also not strictly controlled. Furthermore, all studies were cross-sectional and did not evaluate the influence of antipsychotic medications on the gut microbiota. Although some studies have evaluated the associations between depression severity and gut microbiota, other domains, such as the relationship of gut microbiota with psychotic symptoms, cognitive function and sleep disturbances, were not further investigated. In addition, the disease status of patients should be clearly classified, as patients with different mood statuses may have distinct gut microbial compositions.

#### CONCLUSION

Current research on gut microbiota and mood disorders is still at an early stage. Growing evidence shows altered gut microbiota in patients with mood disorders, which may play an important role in disease pathology. The cause-effect relationship is still inconclusive. Future well-designed studies with new techniques, such as proteomics, metabonomics and metagenomics, are warranted to address this issue.

#### AUTHOR CONTRIBUTIONS

S-HH, L-MR, and YX contributed to the study concept and design. T-TH, J-BL, and Y-LD wrote and revised the manuscript. All authors read and approved the final manuscript.

#### REFERENCES


#### FUNDING

This study was supported by the grants of the National Key Research and Development Program of China (Grant No. 2016YFC1307104 to S-HH) and the Key Research Project of Zhejiang Province (Grant No. 2015C03040 to YX).





**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Huang, Lai, Du, Xu, Ruan and Hu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# SNP Variation of RELN Gene and Schizophrenia in a Chinese Population: A Hospital-Based Case–Control Study

#### Edited by:

Weihua Yue, Peking University Sixth Hospital, China

#### Reviewed by:

Xiaoping Wang, Second Xiangya Hospital of Central South University, China Qiang Wang, West China Hospital of Sichuan University, China

#### \*Correspondence:

Jian-Huan Chen cjh\_bio@hottmail.com Yan-Wei Shi shiyanw@mail.sysu.edu.cn Hu Zhao zhaohu3@mail.sysu.edu.cn †Co-first authors

#### Specialty section:

This article was submitted to Behavioral and Psychiatric Genetics, a section of the journal Frontiers in Genetics

> Received: 21 August 2018 Accepted: 18 February 2019 Published: 05 March 2019

#### Citation:

Luo X, Chen S, Xue L, Chen J-H, Shi Y-W and Zhao H (2019) SNP Variation of RELN Gene and Schizophrenia in a Chinese Population: A Hospital-Based Case–Control Study. Front. Genet. 10:175. doi: 10.3389/fgene.2019.00175

Xia Luo<sup>1,2†</sup>, Si Chen<sup>3,4†</sup>, Li Xue<sup>3</sup>, Jian-Huan Chen<sup>5</sup>\*, Yan-Wei Shi<sup>3,6</sup>\* and Hu Zhao<sup>3,6</sup>\*

<sup>1</sup> Department of Psychiatry, Shenzhen Kangning Hospital, Shenzhen Mental Health Center, Shenzhen, China, <sup>2</sup> Department of Psychiatry, Shantou University Medical College, Shantou, China, <sup>3</sup> Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, China, <sup>4</sup> Institute of Biomedical and Pharmaceutical Sciences, Guangdong University of Technology, Guangzhou, China, <sup>5</sup> Laboratory of Genomic and Precision Medicine, Wuxi School of Medicine, Jiangnan University, Wuxi, China, <sup>6</sup> Guangdong Province Translational Forensic Medicine Engineering Technology Research Center, Sun Yat-sen University, Guangzhou, China

Aims: We aimed to explore whether RELN contributes to the vulnerability and severity of clinical symptoms of schizophrenia (SZ) in a Chinese population.

Methods: In an adult Han Chinese population from southern China, we conducted case–control association analyses of 30 representative single nucleotide polymorphisms (SNPs), screened using bioinformatics tools and previous research, and quantitative trait locus analyses of these SNPs against psychiatric symptoms evaluated with the positive and negative symptoms scale.

Results: A 4-SNP haplotype consisting of rs362814, rs39339, rs540058, and rs661575 was found to be significantly associated with SZ even after Bonferroni correction (χ <sup>2</sup> = 29.024, p = 6.42E-04, pBonf = 0.017), and the T-C-T-C haplotype was a protective factor for SZ (OR = 0.050, 95% CI = 0.004–0.705). Moreover, the 4-SNP haplotype showed a significant association with G16 (active social avoidance) after false discovery rate correction (χ <sup>2</sup> = 28.620, p = 1.697E-04, pFDR = 0.025). In addition, P7 (hostility) was related to the haplotype comprising rs2229864, rs2535764, and rs262355 (χ <sup>2</sup> = 31.424, p = 2.103E-05, padjustment = 0.019) in quantitative trait loci analyses.

Conclusion: Overall, this study showed several positive associations between RELN and SZ, as well as psychiatric symptoms, which not only supports the proposition that RELN is a susceptibility gene for SZ but also provides information on a genotype-phenotype correlation for SZ in a Chinese population.

Keywords: RELN, schizophrenia, SNP, association analysis, quantitative trait loci analyses, psychiatric symptoms, bioinformatics tools

# INTRODUCTION

fgene-10-00175 March 4, 2019 Time: 16:32 # 2

Schizophrenia (SZ, Online Mendelian Inheritance in Man [OMIM] 181500) is a common and serious psychiatric disorder with a lifetime prevalence estimate of 4.0 (1.6–12.1) per 1,000 individuals (Esan et al., 2012). SZ has been reported to be a predominantly genetic disorder, in which heritability is estimated to be 80% (Lichtenstein et al., 2009).

The human RELN gene (OMIM 600514) maps to chromosome 7q22 and encodes reelin, a large secreted glycoprotein that is thought to be critical for cell positioning and neuronal migration by controlling cell-cell interactions during brain development (Hartfuss et al., 2003; Huang and D'Arcangelo, 2008; Niu et al., 2008). Growing evidence has shown that the reelin protein might also be associated with neurotransmission, memory formation and synaptic plasticity (Herz and Chen, 2006), which have also been shown to be impaired in SZ patients (Falkai et al., 2015). Decreased reelin expression in patients with SZ has been found in brain and blood tissues (Impagnatiello et al., 1998; Guidotti et al., 2000; Nabil Fikri et al., 2017). Hence, low levels or dysfunction of reelin protein may cause deficits in neuronal development and cognitive function in adults and may play pathogenic roles in neuropsychiatric illnesses, such as SZ. This proposition was further supported by anatomical and in vivo studies that included the reeler mouse and the heterozygous reeler mouse (HRM) model (Hill et al., 2006; Qiu et al., 2006; Tueting et al., 2006). The role of RELN as a potential risk gene for SZ has also been suggested by genetic association analyses, especially GWASs (Shifman et al., 2008; Li et al., 2011, 2013; Ovadia and Shifman, 2011; Zhou et al., 2016). Over the past decade, dozens of SNPs in RELN gene loci have been reported to be associated with the onset and/or severity of clinical symptoms of SZ (Kahler et al., 2008; Shifman et al., 2008; Wedenoja et al., 2008, 2010; Need et al., 2009; Ben-David et al., 2010; Liu et al., 2010; Kuang et al., 2011; Li et al., 2013). However, the results remain controversial (Tost and Weinberger, 2011; Li et al., 2013; Bocharova et al., 2017).

In this study, we hypothesized that RELN may be related to the onset of SZ and the severity of some clinical symptoms. A case–control study was performed in a Han Chinese population from southern China. To thoroughly understand the genetic basis of mental symptoms in SZ, rather than focusing only on verifying previous positive results and searching for new susceptibility SNPs in RELN, we performed quantitative trait locus (QTL) analyses in addition to qualitative association studies. Because SZ is clinically and genetically highly heterogeneous, different clinical subtypes may have different genetic bases. We therefore specifically collected patients with paranoid or undifferentiated SZ who were experiencing a first onset, or a recurrence after at least 1 month of drug withdrawal, to improve the consistency of research subjects and to minimize the influence of medication on PANSS scores.

### MATERIALS AND METHODS

## Subjects

The patient sample consisted of 102 unrelated SZ patients (46 females and 56 males; mean age, 32.42 ± 10.08 years) recruited from the Third People's Hospital of Zhongshan City between November 2012 and June 2013. Patients were diagnosed by at least two psychiatrists according to the criteria of the Diagnostic and Statistical Manual of Mental Disorders (Fourth Edition, DSM-IV). Detailed information on clinical features, such as the onset time, symptoms and family history of mental illness, was obtained according to DIGS and FIGS. The PANSS was used to evaluate the severity of psychotic symptoms of SZ patients. It is among the most widely used instruments worldwide for evaluating the severity of clinical symptoms (Kay et al., 1987). PANSS comprises 33 items, including 30 psychopathological items, which are usually divided into the positive subscale (7 items), the negative subscale (7 items) and the general psychopathology subscale (16 items), and 3 supplementary attack-risk items. Each of the 33 items is scored from 1 to 7, with 1 indicating "no symptoms" (Kay et al., 1987). The Chinese version of PANSS, which has acceptable validity and reliability, was used in this study (Si et al., 2004). Paranoid or undifferentiated SZ patients who were at first onset and had never been treated, or who were at recurrence and had not taken any antipsychotics for at least 1 month, were enrolled. Among them, patients with the both-negative type (i.e., fewer than three items with a score greater than or equal to 4 on both the positive and negative subscales of PANSS) were excluded. Patients were also excluded if they had ambiguous diagnoses; neurological diseases, organic mental disorders or other symptomatic psychoses; concomitant severe somatic disease, cancer, pregnancy or lactation, or an immune or endocrine system disease; or a history of alcoholism or substance abuse.
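The both-negative exclusion rule described above can be written as a short check. The following is a minimal sketch rather than the authors' procedure: the function name and the example scores are hypothetical, and item scores are assumed to be integers from 1 to 7.

```python
def is_both_negative(positive_items, negative_items, cutoff=4, min_items=3):
    """Return True when fewer than `min_items` items score >= `cutoff`
    on BOTH the positive and the negative PANSS subscale."""
    def n_marked(items):
        # Count items at or above the severity cutoff
        return sum(1 for score in items if score >= cutoff)

    return (n_marked(positive_items) < min_items
            and n_marked(negative_items) < min_items)


# Hypothetical patients: (7 positive-subscale scores, 7 negative-subscale scores)
patient_a = ([5, 4, 4, 2, 1, 3, 2], [2, 2, 1, 1, 3, 2, 1])  # three positive items >= 4
patient_b = ([4, 3, 2, 2, 1, 3, 2], [4, 4, 1, 1, 3, 2, 1])  # at most two per subscale

exclude_a = is_both_negative(*patient_a)
exclude_b = is_both_negative(*patient_b)
```

Under this reading, a patient is retained as soon as either subscale has at least three items scored 4 or higher.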

A total of 169 healthy controls (75 females and 94 males; mean age 33.27 ± 8.63 years), matched with the patients for sex, age, and birthplace, were randomly recruited from volunteers who came for physical examinations at the Affiliated Hospital of Sun Yat-sen University between June and November 2013. Volunteers were asked to provide detailed information on medical and family psychiatric histories. Subjects were excluded for any of the following: a positive family history (first-degree relatives) of psychiatric illness; substance abuse; abnormal birth; febrile convulsions; juvenile adoption or residence with a single-parent family; pregnancy or lactation.

All participants were unrelated Han Chinese born and living in southern China (mostly from Guangdong province and nearby cities), and all their biological grandparents were of Han Chinese ancestry. They provided written informed consent for participation, and the research protocol was approved by the Ethical Committee for Genetic Studies of Shantou University.

**Abbreviations:** DIGS, the Diagnostic Interview for Genetic Studies; DSM-IV, the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition; FASTSNP, function analysis and selection tool for single nucleotide polymorphisms; FDR, false discovery rate; FIGS, the Familial Interview for Genetic Studies; GWAS, genome wide association study; HWE, Hardy-Weinberg equilibrium; MAF, minor allele frequency; OMIM, Online Mendelian Inheritance in Man; OR, odds ratio; PANSS, the positive and negative symptoms scale; QTL, quantitative trait loci; SIFT, Sorting Intolerant From Tolerant; SNPs, single nucleotide polymorphisms; SZ, schizophrenia; UTR, untranslated region.

#### SNP Selection

The SNPs were selected based on previous studies (Shen et al., 2006; Doss et al., 2010; Wu and Jiang, 2013) and/or screened with SNP functional prediction software, including SIFT (Kumar et al., 2009), PolyPhen-2 (Adzhubei et al., 2010), and FASTSNP (Yuan et al., 2006). Information on the SNPs was obtained from dbSNP, HapMap and other human genome databases. A SNP was accepted only when its MAF was greater than or equal to 0.01 in the Chinese Beijing population in the HapMap database. The detailed strategies for SNP selection were as follows. (1) "Positive" SNPs that had previously been reported to be associated with SZ were selected provided their MAFs were greater than or equal to 0.01 in the Chinese Beijing population in the HapMap database. (2) A two-step method was used for the selection of exonic SNPs (**Figure 1A**). First, PolyPhen-2, the SIFT online service and the FASTSNP online service were employed for functional analysis of exonic SNPs. Next, by comparing the outputs of the three software programs, repeated SNPs with MAFs equal to or greater than 0.01 were chosen. (3) For intronic SNPs, a four-step method was conducted using the FASTSNP, SNP functional prediction (Xu et al., 2007), and F-SNP online services and the HapMap database successively, and SNPs with MAFs greater than or equal to 0.05 were retained (**Figure 1B**). (4) Using FASTSNP, UTRscan (Shen et al., 2006), and the UCSC Variant Annotation Integrator online software (Meyer et al., 2013), functional analyses similar to those for exonic SNPs were conducted for UTR and upstream SNPs of RELN, but with a higher MAF threshold of 0.05. Rs7341475, a "positive" SNP thought to be associated with SZ (Shifman et al., 2008; Ben-David et al., 2010), was not included in this study, as there was no significant evidence that rs7341475 is associated with SZ in the Han Chinese population (Li et al., 2011, 2013).
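The second step of the exonic screen (comparing tool outputs, then applying the MAF cutoff) can be illustrated with set operations. This is a sketch under stated assumptions: "repeated SNPs" is read here as SNPs flagged by all three tools, which the text does not state explicitly, and the SNP identifiers, tool outputs and MAF values below are hypothetical placeholders.

```python
def select_exonic_snps(tool_hits, maf, maf_min=0.01):
    """Keep SNPs flagged by every tool whose MAF meets the threshold.

    tool_hits: dict mapping tool name -> set of flagged SNP identifiers
    maf:       dict mapping SNP identifier -> minor allele frequency
    """
    # SNPs reported by all tools (assumed meaning of "repeated SNPs")
    shared = set.intersection(*(set(hits) for hits in tool_hits.values()))
    # SNPs without a known MAF are treated as failing the filter
    return sorted(snp for snp in shared if maf.get(snp, 0.0) >= maf_min)


# Hypothetical tool outputs and population MAFs
tool_hits = {
    "PolyPhen-2": {"rsA", "rsB", "rsC"},
    "SIFT": {"rsA", "rsB", "rsD"},
    "FASTSNP": {"rsA", "rsB", "rsC", "rsD"},
}
maf = {"rsA": 0.12, "rsB": 0.004, "rsC": 0.30, "rsD": 0.08}

selected = select_exonic_snps(tool_hits, maf)
```

The same function covers the stricter intronic and UTR thresholds by passing `maf_min=0.05`.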

## SNP Genotyping

For each participant, 5 ml of peripheral venous blood was collected after informed consent was signed and before any medication was administered. Genomic DNA was extracted using the phenol-chloroform method within 1–2 weeks and stored at −20°C. SNP genotyping was performed with the Sequenom MassARRAY iPLEX Gold platform (Sequenom, San Diego, CA, United States) (Ikeda et al., 2008). Primers were designed using the MassARRAY Assay Design 3.1 software. Two amplification primer systems were built to capture as many SNP loci as possible while limiting costs. Genomic fragments containing SNPs were amplified by polymerase chain reaction (PCR) in a total reaction volume of 5 µl, which included 20–50 ng of genomic DNA, in two 384-well plates using the ABI GeneAmp® 9700 384 Dual. Purified and specific genotyping primers were used to amplify target sites. Genotypes were automatically called by MassARRAY Typer 4.0 and verified manually. To ensure the accuracy of genotyping, 32 samples were duplicated for quality control, and no genotyping errors were found. Additionally, each 384-well plate had four blank controls. Any individual with more than 50% missing genotypes was excluded from further statistical analysis.
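The per-individual exclusion rule (more than 50% missing genotypes) amounts to a simple call-rate filter. The sketch below is illustrative only; the sample identifiers and genotype calls are hypothetical, and a missing call is represented as `None`.

```python
def missing_rate(genotypes):
    """Fraction of SNP calls that are missing (None) for one individual."""
    return sum(call is None for call in genotypes) / len(genotypes)


def passes_qc(genotypes, max_missing=0.5):
    """Keep an individual unless more than `max_missing` of calls are missing."""
    return missing_rate(genotypes) <= max_missing


# Hypothetical genotype calls for two individuals across four SNPs
samples = {
    "S1": ["AA", "AG", None, "GG"],   # 25% missing -> kept
    "S2": [None, None, None, "CT"],   # 75% missing -> excluded
}

kept = [sample_id for sample_id, calls in samples.items() if passes_qc(calls)]
```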

### Statistical Analysis

Statistical analyses, including HWE tests, the description of MAF, single-marker allelic association, the standardized linkage disequilibrium (LD) coefficient (D'), haplotype blocks, and haplotype frequencies and associations within blocks, were all performed using Haploview v4.2 (Broad Institute of MIT and Harvard, Cambridge, MA, United States) (Gabriel et al., 2002; Barrett et al., 2005). Haplotype frequencies were estimated using the expectation-maximization (EM) algorithm. The significance criterion in Haploview v4.2 was set at p < 0.05 for HWE tests and for corrected p-values after 10,000 permutations.

FIGURE 1 | The general screening flows of exonic (A) and intronic (B) SNPs in the SNP selection phase.
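For readers unfamiliar with the HWE test, the following minimal sketch shows the textbook Pearson chi-square version for a biallelic SNP. It is not Haploview's implementation, which may use a different (for example, exact) test; the genotype counts are hypothetical, and the function assumes the SNP is polymorphic so that no expected count is zero.

```python
import math


def hwe_chi_square(n_major_hom, n_het, n_minor_hom):
    """Pearson chi-square test (1 df) for Hardy-Weinberg equilibrium.

    Assumes a polymorphic biallelic SNP (all expected counts > 0).
    """
    n = n_major_hom + n_het + n_minor_hom
    p = (2 * n_major_hom + n_het) / (2 * n)  # major allele frequency
    q = 1.0 - p                              # minor allele frequency
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_major_hom, n_het, n_minor_hom)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical genotype counts (DD, Dd, dd) in a control group of 163
chi2, p_value = hwe_chi_square(81, 72, 10)
in_hwe = p_value >= 0.05  # same 0.05 criterion as described in the text
```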

SPSS 19.0 was used to analyze the demographic and genotypic frequency distributions of the samples using t-tests or chi-square tests. Logistic regression analyses of population, sex, and age were used to generate population stratification assignments for all individuals. The criterion for significance in SPSS 19.0 was set at p < 0.05. Genotype and multiple-SNP qualitative trait association tests between patients and controls were performed with Unphased 3.1.7 (Dudbridge, 2008). The overall test of association in Unphased 3.1.7 is a likelihood ratio test. The odds ratio (OR) and 95% confidence interval (95% CI) were calculated automatically to evaluate the effects of alleles and haplotypes. To further investigate possible complexity in the genotype-phenotype relationship, QTL analyses were also performed with Unphased v3.1.7 using 33 factor scores and 4 scale scores (i.e., the total scale score, the positive scale score, the negative scale score, and the general psychopathology scale score) from PANSS as quantitative traits. In QTL analyses, the AddVal (i.e., additive genetic value) was calculated automatically instead of the OR to show the estimated additive genetic value for a specific haplotype. For all multiple tests performed in Unphased 3.1.7, Bonferroni correction and FDR correction were employed to correct p-values and limit type I errors. A corrected p-value threshold of 0.05 was used for significance after Bonferroni correction (pBonf = p × m, where p is the uncorrected p-value and m is the number of hypotheses tested, compared against the desired overall alpha level of 0.05) and FDR correction (the corrected p-value is shown as pFDR), which was performed using R version 3.1.2<sup>1</sup>. A power analysis was performed using the G∗Power software for this study (Faul et al., 2009).
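The two multiple-testing corrections can be sketched as follows. This is an illustration, not the authors' script: the FDR procedure is assumed to be the Benjamini-Hochberg method (the text does not name the variant), and the p-values are hypothetical.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: p * m, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg FDR-adjusted p-values, returned in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical uncorrected p-values from five tests
p_values = [0.001, 0.008, 0.039, 0.041, 0.60]
p_bonf = bonferroni(p_values)
p_fdr = benjamini_hochberg(p_values)
significant_fdr = [p < 0.05 for p in p_fdr]
```

As a plausibility check on the reported haplotype result, multiplying p = 6.42E-04 by 27 tests gives approximately 0.017. The value m = 27 is inferred here and is not stated in the source.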

#### RESULTS

#### SNP Selection and Genotyping

Initially, thirty-nine SNPs were screened from more than ten thousand RELN loci, including twenty-nine "positive" SNPs reported previously and ten "functional" SNPs. Screening flows and the number of targeted SNPs in exonic and intronic zones in the RELN gene are shown in **Figure 1**. Six SNPs (i.e., rs362746, rs12705169, rs885995, rs11761011, rs16872603, and rs2237628) were ruled out for poor specificity or missing data, and three SNPs (i.e., rs3025962, rs73714410, and rs7811571) were monomorphic in the samples. Therefore, 30 SNPs in the RELN gene were genotyped successfully in 100 SZ patients (45 females and 55 males; mean age, 32.170 ± 10.000 years) and 163 healthy controls (74 females and 89 males; mean age, 33.350 ± 8.585 years). No differences were found between patients and controls in age (t = 1.015, df = 261, p = 0.311) or gender (χ <sup>2</sup> = 0.004, df = 1, p = 1.000). The average call rate of all SNPs was 99.5% in the total sample. No SNP deviated from HWE in the controls (**Table 1**). Among the 30 SNPs, there were 2 exonic SNPs and 28 intronic SNPs. The rough distribution of the 30 SNPs is shown in **Figure 2**.
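The reported age comparison can be checked from the summary statistics alone. The sketch below assumes a Student's t-test with pooled variance (the text says only that t-tests were used); with the means, standard deviations and group sizes given above, it yields a t statistic of roughly 1.02 on 261 degrees of freedom, in line with the reported values.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's two-sample t statistic with pooled variance, plus its df."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean2 - mean1) / se, df


# Patients: 32.170 +/- 10.000 (n = 100); controls: 33.350 +/- 8.585 (n = 163)
t_stat, df = pooled_t(32.170, 10.000, 100, 33.350, 8.585, 163)
```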

#### Single-SNP Association Analyses

Detailed information regarding the allelic and genotypic frequencies of the 30 SNPs from patients and controls was obtained (**Table 1**). No single polymorphism was found to be significantly associated with SZ, either in the total sample (**Table 1**) or when grouped by sex (**Table 2**). The weak genotypic association of rs6465938 (χ <sup>2</sup> = 6.087, df = 2, p = 0.048) did not withstand the FDR correction (pFDR = 0.713).

#### Haplotype Association Analyses

Five blocks were captured using Haploview v4.2 software in the total sample (**Figure 3**). The haplotype frequencies and associations within the blocks are shown in **Supplementary Table S1**. No haplotype association within the blocks survived 10,000 permutations. Haplotype associations of 2-SNP, 3-SNP, and 4-SNP combinations were analyzed using Unphased v3.1.7 software. In the logistic regression analyses, only birthplace showed population stratification [p = 0.019, Exp (B) = 0.526, 95% CI 0.308–0.898]. To reduce the stratification, we employed birthplace, sex, and age as covariates in the association analyses. Two 3-SNP haplotypes (one consisting of rs362814, rs39339, and rs540058, and the other consisting of rs362691, rs362719, and rs362726) were nominally significant before multiple-testing adjustment (χ <sup>2</sup> = 19.406, df = 6, p = 0.004, pBonf = 0.099 and χ <sup>2</sup> = 20.621, df = 7, p = 0.004, pBonf = 0.122, respectively). The haplotype with four SNPs (rs362814, rs39339, rs540058, and rs661575) was significantly associated with SZ (χ <sup>2</sup> = 29.024, df = 9, p = 6.42E-04), even after Bonferroni and FDR correction (both padjustment = 0.017). The T-C-T-C haplotype of the four SNPs was more common in the controls (0.764% in cases versus 6.173% in controls, OR = 0.050, 95% CI = 0.004–0.705). No sex effect was found in multiple SNP analyses (data not shown).

#### QTL Association Analyses

For the QTL analyses, we used smoking, birthplace, age, and sex as covariates, as smoking may be associated with the severity of negative symptoms (Zhang et al., 2012). Several factors, rather than scale scores, were significantly associated with 3-SNP or 4-SNP haplotypes after FDR correction (**Table 3**; more information is shown in **Supplementary Table S2**). The T-C-T haplotype and the C-T-T haplotype of the 3-SNP combination consisting of rs2229864, rs2535764 and rs262355 were related to the severity of P7 (hostility) (AddVal = 0.748, 95% CI = 0.291–1.204 and AddVal = 0.718, 95% CI = 0.230–1.206). No significant result was found in single-SNP or 2-SNP QTL association analyses, either in the total sample or when grouped by sex (data not shown). We did not perform 3-SNP and 4-SNP QTL association analyses in male or female patients separately due to the small sample.

#### Power

The G∗Power program was used to perform the power calculation. The sample size gave a power of 98.873% to detect a significant association (α < 0.05) given an effect size index of 0.5 (corresponding to a "medium" gene effect). However, the power dropped to 47.033% when the effect size index was 0.2 (corresponding to a "small" gene effect).
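To give a feel for how such power figures arise, the following rough sketch uses a normal approximation for a two-group comparison. Everything about the test setting is an assumption: the source does not say which G∗Power test or how many tails were used. Here the effect size index is treated as Cohen's d, the test as one-tailed, and the group sizes as the 100 patients and 163 controls analyzed. Under these assumptions the approximation lands near the reported values, which does not establish that these were the authors' settings.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(d, n1, n2, alpha=0.05, one_tailed=True):
    """Normal-approximation power for a two-group mean comparison.

    d is Cohen's d (assumed effect size index). For the two-tailed case
    the negligible contribution of the opposite tail is ignored.
    """
    z = NormalDist()
    ncp = d * sqrt(n1 * n2 / (n1 + n2))  # noncentrality under the alternative
    z_crit = z.inv_cdf(1 - alpha if one_tailed else 1 - alpha / 2)
    return z.cdf(ncp - z_crit)


power_medium = approx_power(0.5, 100, 163)  # "medium" effect
power_small = approx_power(0.2, 100, 163)   # "small" effect
```

An exact calculation would use the noncentral t distribution, so small differences from G∗Power output are expected.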

<sup>1</sup>https://www.r-project.org/

TABLE 1 | Genotype frequencies, HWE tests, and single-SNP association analyses of 30 RELN SNPs in the SZ and control cases.


<sup>a</sup>Major/minor allele, major, and minor alleles are denoted by D and d, respectively. <sup>b</sup>Numbers of samples that are successfully genotyped in patients and controls, respectively. <sup>c</sup>p-Values of the Hardy-Weinberg equilibrium (HWE) tests from Haploview v4.2. <sup>d</sup>The minor allele frequency of 30 selected SNPs in this study. <sup>e</sup>Uncorrected p-values of allelic and genotypic association analysis in this study. <sup>f</sup>Two exonic SNPs selected in this report.

#### DISCUSSION

Schizophrenia is a complex genetic disease with diverse clinical symptoms. Here, we performed qualitative and quantitative trait association analyses in a Han Chinese population from southern China to verify that RELN was a susceptibility gene for SZ. The results of multiple SNP analyses confirmed the genotype-phenotype relationship between RELN and SZ.

A 4-SNP haplotype consisting of rs362814, rs39339, rs540058, and rs661575 was observed to be significantly associated with SZ even after the Bonferroni correction, and the T-C-T-C haplotype may be a protective factor for SZ. Moreover, when performing QTL analysis, this 4-SNP haplotype had a significant association with G16 after FDR correction. These results suggested the importance of the four SNPs, which was supported by previous studies. A study of a Han population from southwestern China demonstrated that the allele of rs362814 was more common in SZ cases and that the A and T alleles took part in building a risk haplotype and a protective haplotype, respectively (Li et al., 2013). In our study, the T allele of rs362814 also took part in building a protective haplotype in the participants, but with different SNPs. Rs39339 was slightly significant in allelic association tests before the Bonferroni correction and nominally significant in a combined analysis in a Scandinavian population (Kahler et al., 2008). Rs540058 was reported to be associated with the severity of positive symptoms of SZ (Wedenoja et al., 2010), and rs661575 was associated with visual learning and memory in Finnish families (Wedenoja et al., 2008). The four SNPs are intronic and not in high linkage disequilibrium, so they are unlikely to have direct functional impacts on RELN and are unlikely to be co-inherited. However, SNPs in different functional areas may affect gene expression in different manners and have different effects on gene function (Manolio, 2010). Moreover, up to 52% of all SNPs that were associated with disease were intronic SNPs (Chen et al., 2010; Manolio, 2010). Hence, further studies on large samples are needed to verify the associations between the four SNPs and SZ, and rs362814 might be associated with SZ in an allele-dependent manner.

TABLE 2 | The sex differences in single-SNP association analyses of 30 RELN SNPs in SZ and control cases.

Rs2229864 is a synonymous mutation located in the 50th exon of the RELN gene. Rs2229864 was recognized to be detrimental in the screening process and showed imbalanced allelic expression of RELN in SZ (Ovadia and Shifman, 2011). Rs2535764 and rs262355 are both intronic SNPs and were slightly significant in the allelic association test only before the Bonferroni correction (Kahler et al., 2008). Both rs262355 and rs2229864 failed to obtain definite positive results in subsequent studies (Li et al., 2011). However, a weak association between P7 and the haplotype consisting of rs2229864, rs2535764, and rs262355 withstood both the FDR and Bonferroni corrections in the QTL analysis, which suggests that the relationship may be genuine rather than a false positive in this study. In addition, 4-SNP haplotypes consisting of the three SNPs and rs362626 or rs17157643 were associated with P7, G14 and S3, or P7 and S3, which all reflect the impulsive and dangerous behavior of SZ patients in PANSS. Both single-item scores and scale scores were used as quantitative traits, rather than only synthesized scale scores, to better understand the relationships between the severity of clinical symptoms and genetic bases as well as between the category of symptoms and genetics. Overall, we presumed that rs2229864, rs2535764, and rs262355 may be closely related to the risk for SZ patients, and subsequent studies with large samples will be especially necessary.

FIGURE 3 | Linkage disequilibrium (LD) among the 30 RELN SNPs in the case and control groups. The pairwise LD R<sup>2</sup> values of the sample set are illustrated in the matrix. The dark color indicates relatively strong LD. The R<sup>2</sup> values of five blocks covering the RELN gene were larger than 0.75, indicating reasonable haplotype blocks.

P4: excitement; P7: hostility; G11: poor attention; G14: poor impulse control; G16: active social avoidance; S3: emotional instability; pBonf: corrected p-values by Bonferroni correction; pFDR: corrected p-values by FDR correction.

Another interesting phenomenon was that the AA/GG ratio of rs727708 seemingly had opposite trends in the SZ patients and controls, although a significant difference was not found in the allele or genotype association analyses. Both rs727708 and rs540058 were thought to be related to the severity of positive symptoms (Wedenoja et al., 2010), so rs727708 may be related to both the susceptibility to and severity of SZ.

Neither rs362691 nor rs6465938 showed significant associations in single or multiple SNP analyses. Rs362691 is a missense mutation located in the 22nd exon of the RELN gene and causes a Leu-Val amino acid change. Rs362691 had also been recognized as a detrimental mutation in the screening process. Researchers have found that rs362691 was not only associated with autism spectrum disorders (ASDs) (Wang et al., 2014) but might also take part in the influence of the RELN gene on the cognitive functions of healthy people (Baune et al., 2010). Haplotypes containing rs362691, rather than the single SNP, played a role in susceptibility to SZ in the Chinese Va population (Yang et al., 2013). A subtle but significant difference in the genotype frequency distribution of rs6465938 was found before multiple corrections. Notably, rs6465938 deviated from HWE only in the patient samples. Moreover, rs6465938 showed a nominal association in a Scandinavian population (Kahler et al., 2008). Therefore, the negative results in this study are not sufficient evidence to rule out possible associations between rs362691 or rs6465938 and SZ.

When subjects were divided by sex, none of these SNPs was found to be significantly associated with SZ in this study. Due to the small sample size, QTL analyses of different sexes were not performed. Many studies have demonstrated that males and females show different clinical factors, such as age of onset, treatment, cognitive function and clinical symptomatology (Tang et al., 2007; Ochoa et al., 2012; Zhang et al., 2012; Thorup et al., 2014), as well as genetic information (Shifman et al., 2008; Liu et al., 2010). We agree with the notion that male SZ patients are different from female patients in terms of both clinical and genetic conditions. Our negative findings are mainly due to the small sample size and low statistical power, so studies with larger sample sizes would contribute to identifying sex differences in patients with SZ.

There are several possible explanations for the discrepancies between studies, such as genetic heterogeneity in distinct ethnic populations (Caucasian versus East Asian race, Han versus Va populations), the heterogeneity of SZ (subtypes and disease categories), population stratification, environmental exposure, cultures and diets (Li et al., 2011, 2013). Although we tried to reduce the effects of region and SZ subtype, the major limitation of the small sample size due to the rigorous standards still exists. The enrollment of patients principally from an outpatient service instead of an inpatient department may be another reason, as the overall PANSS scores may tend to be lower for outpatients than for hospitalized patients.

#### CONCLUSION

In summary, the current investigation showed qualitative and quantitative trait associations between genetic variants in RELN and SZ in a southern Han Chinese population. Several SNPs showed significant associations with the pathogenesis of SZ and/or the severity of psychiatric symptoms, principally in multi-SNP analyses. Considering the limitations of our work, further investigations of genetic susceptibility in larger samples and in inpatients are required to elucidate the role of RELN polymorphisms in the risk of and sex differences in SZ, as well as in the severity of clinical symptoms.

#### REFERENCES

Adzhubei, I. A., Schmidt, S., Peshkin, L., Ramensky, V. E., Gerasimova, A., Bork, P., et al. (2010). A method and server for predicting damaging missense mutations. Nat. Methods 7, 248–249. doi: 10.1038/nmeth0410-248

#### DATA AVAILABILITY

The raw data supporting the conclusions of this manuscript will be made available by the authors, without undue reservation, to any qualified researcher.

#### AUTHOR CONTRIBUTIONS

HZ, Y-WS, and J-HC contributed to the conception and design of the study, and provided the approval for publication of the content. XL wrote the protocol and managed the literature searches and SNP screening. SC and LX organized the database and performed the statistical analysis. XL and SC wrote sections of the manuscript. All authors contributed to the manuscript revision and read and approved the submitted version.

#### FUNDING

This work was supported by the National Natural Science Foundation of China (Grant Numbers 81871535 and 81273350).

#### ACKNOWLEDGMENTS

We are grateful to all the voluntary donors of DNA samples in this study. We thank all the field workers who participated in the sample collection and diagnostic assessment, especially Mr. Ting-Yun Jiang, Ms. Wen-Wei Zhang, and Mr. Lei Liu. We acknowledge the contributions of Ms. Qiu-Lin Liu, Miss Lin Zhou, and Miss Lu Zong for the laboratory work to help with the statistical analysis.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2019.00175/full#supplementary-material




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Luo, Chen, Xue, Chen, Shi and Zhao. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Effect of Dopamine Antagonist Treatment on Auditory Verbal Hallucinations in Healthy Individuals Is Clearly Influenced by *COMT* Genotype and Accompanied by Corresponding Brain Structural and Functional Alterations: An Artificially Controlled Pilot Study

#### *Edited by:*

*Weihua Yue, Peking University Sixth Hospital, China*

#### *Reviewed by:*

*Cao Qingjiu, Peking University, China Jinsong Tang, Central South University, China*

#### *\*Correspondence:*

*Chuanjun Zhuo chuanjunzhuotjmh@163.com Chunhua Zhou zhouchunhua80@126.com*

#### *Specialty section:*

*This article was submitted to Behavioral and Psychiatric Genetics, a section of the journal Frontiers in Genetics*

> *Received: 16 October 2018 Accepted: 29 January 2019 Published: 06 March 2019*

#### *Citation:*

*Zhuo C, Xu Y, Zhang L, Jing R and Zhou C (2019) The Effect of Dopamine Antagonist Treatment on Auditory Verbal Hallucinations in Healthy Individuals Is Clearly Influenced by COMT Genotype and Accompanied by Corresponding Brain Structural and Functional Alterations: An Artificially Controlled Pilot Study. Front. Genet. 10:92. doi: 10.3389/fgene.2019.00092*

*Chuanjun Zhuo1,2,3,4,5 \*, Yong Xu3,4 , Li Zhang6 , Rixing Jing7,8 and Chunhua Zhou9 \**

*1Department of Psychiatric-Neuroimaging-Genetics and Comorbidity Laboratory (PNGC-Lab), Tianjin Mental Health Centre, Mental Health Teaching Hospital of Tianjin Medical University, Tianjin Anding Hospital, Tianjin, China, 2Department of Psychiatry, College of Basic Medical Science, Tianjin Medical University, Tianjin, China, 3Department of Psychiatry, First Hospital/First Clinical Medical College of Shanxi Medical University, Taiyuan, China, 4MDT Center for Cognitive Impairment and Sleep Disorders, First Hospital of Shanxi Medical University, Taiyuan, China, 5Department of Psychiatry, Institute of Mental Health, Jining Medical University, Jining, China, 6GHM Institute of CNS Regeneration, Jinan University, Guangzhou, China, 7Department of Pattern Recognition, China National Key Laboratory, Institute of Automation, Chinese Academy of Sciences, Beijing, China, 8Department of Pattern Recognition, University of Chinese Academy of Sciences, Beijing, China, 9Department of Pharmacy, The First Hospital of Hebei Medical University, Shijiazhuang, China*

Few studies have explored the influence of the catechol-O-methyltransferase (*COMT*) genotype on the severity of auditory verbal hallucination (AVH) symptoms, or on the efficacy of their treatment, in healthy individuals with AVHs (Hi-AVHs). We hypothesized that the efficacy of dopamine antagonist treatment on AVHs in Hi-AVHs may be influenced by their *COMT* genotype and may be accompanied by corresponding brain alterations. To test this hypothesis in an artificially controlled pilot study, we enrolled 42 Hi-AVHs as subjects and used magnetic resonance imaging and genetic methods to explore the underlying brain features and to investigate whether the efficacy of dopamine antagonist treatment on AVHs in Hi-AVH subjects was influenced by their *COMT* genotype. We found that *COMT*-met genotype subjects responded better to treatment than *COMT*-val genotype subjects. At baseline, *COMT*-met genotype subjects demonstrated higher global functional connectivity density (gFCD) than *COMT*-val genotype subjects but no difference in gray matter volume (GMV). Notably, both groups demonstrated gFCD and GMV reduction after treatment, but the reduction was more widespread in *COMT*-met genotype subjects than in *COMT*-val genotype subjects. This is the first study to report that Hi-AVH subjects' baseline brain functional features are influenced by their *COMT* genotypes and that *COMT*-met genotype subjects exhibit better responses to dopamine antagonists but have more widespread GMV and gFCD reduction than subjects with the *COMT*-val genotype. Despite several limitations, these findings may provide auxiliary information to further explain the mechanisms of AVHs and provide a clue for scholars to further explore specific treatment targets for AVHs in Hi-AVH subjects or in schizophrenia patients.

Keywords: *COMT* genotypes, auditory verbal hallucinations, dopamine antagonists, brain alterations, healthy individuals

# INTRODUCTION

According to previous findings, even under the strictest criteria, 0.7% of the general population has experienced auditory verbal hallucinations (AVHs); these subjects are usually called healthy individuals with AVHs (Hi-AVHs) (Johns et al., 2004; Sommer et al., 2010; Upthegrove et al., 2016). Many hypotheses of AVHs have been established in recent decades; each hypothesis explains AVHs from a different perspective (Jones, 2010; Liemburg et al., 2012; Cho and Wu, 2014; McCarthy-Jones et al., 2014; Northoff, 2014; Wilkinson, 2014; Alderson-Day et al., 2015, 2017; Hugdahl, 2015; Conde et al., 2016; Baumeister et al., 2017; Curcic-Blake et al., 2017; Wilkinson and Fernyhough, 2017). To date, however, no hypothesis has achieved general acceptance (Wilkinson and Fernyhough, 2017). Many studies have proposed that investigating AVHs in Hi-AVH subjects can provide important information to help clarify the mechanisms of AVHs (Jones, 2010; Hugdahl, 2015; Wilkinson and Fernyhough, 2017).

The catechol-o-methyltransferase (*COMT*) genotype influences brain functional (including auditory processing, which is highly related to AVHs) and dopaminergic alterations both in healthy people and in patients with schizophrenia (Lu et al., 2007; Kang et al., 2010; Gothelf et al., 2011; Edgar et al., 2012; Tian et al., 2013a,b; Li et al., 2015; Steiner et al., 2018) and the efficacy of dopamine antagonists in patients with schizophrenia (Olgiati et al., 2009; Sagud et al., 2010; Huang et al., 2016). These previous findings converge to indicate that the reciprocal interactions between *COMT* genotype, dopamine levels, and structural and/or functional alterations in the human brain are related to mental disorder symptoms, such as AVHs (Lu et al., 2007; Kang et al., 2010; Sagud et al., 2010; Gothelf et al., 2011; Edgar et al., 2012; Tian et al., 2013a,b; Li et al., 2015; Huang et al., 2016; Steiner et al., 2018).

*COMT*-met genotype patients with schizophrenia respond more strongly than *COMT*-val genotype patients to dopamine antagonists. In particular, patients with positive symptoms respond more strongly to dopamine antagonists in the former group than in the latter, and this response is associated with corresponding brain structural and functional alterations (Edgar et al., 2012; Lei et al., 2015; Gong et al., 2016). AVHs are a classic positive symptom (Tandon, 2013; Reed et al., 2018). However, no study has reported the efficacy of antipsychotics on AVHs in patients with schizophrenia. Most studies refer to AVHs as part of the positive symptom cluster and do not report them as a distinct category (Olgiati et al., 2009; Sagud et al., 2010; Huang et al., 2016). To the best of our knowledge, only one study has reported that the efficacy of transcranial direct current stimulation (tDCS) as a supplement to antipsychotic treatment of AVHs is weaker in schizophrenia patients with the *COMT*-met genotype than in those with the *COMT*-val genotype (Chhabra et al., 2018). A recent systematic review reported that antipsychotics can improve AVHs in patients with borderline personality disorder (Slotema et al., 2018). That review indicates the feasibility of exploring the efficacy of antipsychotics on AVHs. Exploring the pathological features of Hi-AVH subjects and the efficacy of antipsychotic treatment in them can provide new fundamental information for exploring the mechanisms of AVHs in patients with schizophrenia. Recruiting Hi-AVH subjects avoids many confounding factors, such as other positive symptoms.

To the best of our knowledge, few studies have explored the influence of the *COMT* genotype on the severity of AVH symptoms in healthy individuals with AVHs, the relationship between brain structural and functional alterations and the *COMT* genotype, or the efficacy of dopamine antagonists on AVHs in Hi-AVH subjects together with the brain structural or functional alterations that accompany treatment efficacy. *We hypothesized that the efficacy of dopamine antagonist treatment on AVH symptoms in Hi-AVH subjects is influenced by the COMT genotype and is accompanied by structural and functional brain alterations.*

In the present pilot study, we adopted genotyping, functional connectivity density mapping, and statistical parametric mapping (SPM) techniques to explore the influence of the *COMT* genotype on AVH symptoms in Hi-AVH subjects, on the efficacy of dopamine antagonist treatment on AVHs in these subjects, and on the accompanying brain structural and functional alterations.

# SAMPLES AND METHODS

#### Samples

For the present pilot study, we recruited by advertisement in 1,000 communities (total resident population greater than 200,000) to enroll Hi-AVH volunteers from 1,1,2016 to 6,31,2018. In accordance with previous studies (Haddock et al., 1999; Xu et al., 2012; Dollfus et al., 2018), we adopted the traditional Auditory-Verbal Hallucinations Rating Scale (AHRS) extracted from the Psychotic Symptom Rating Scales (PSYRATS) to assess the severity of the AVH symptoms. We enrolled 300 healthy people with diagnosed AVHs. Among them, only 115 subjects reported that they had suffered mental distress caused by the AVHs and volunteered to accept treatment. None of the subjects who participated in the treatment process reported a positive family history of psychiatric illness, childhood trauma, abuse, or any other negative life events. In addition, none of the subjects met the diagnostic criteria for any mental disorder according to DSM-IV, as assessed by two senior psychiatrists using the SCID-NP semistructured interview (Phillips et al., 2009). The Tianjin Anding Hospital ethics review board approved this study. All subjects provided written consent. The assessments were carried out in compliance with the Declaration of Helsinki guidelines and approved by the institutional ethics committee.

# Methods

#### Genotyping

The genotypes were grouped by allele dominance according to the available literature (Chhabra et al., 2018). Blood collection and genotyping were performed as previously reported (Chhabra et al., 2018). In brief, 5 ml of peripheral blood was collected in K2EDTA-treated vacutainers (Becton & Dickinson, NJ, USA), and genomic DNA was extracted using commercial spin columns (Qiagen, Inc., Limburg, the Netherlands). The quality of extracted DNA was determined by UV spectrophotometry (Thermo Scientific, Waltham, MA, USA). Then, the genomic DNA was subjected to *COMT* genotyping at rs4680 using the TaqMan 5′ nuclease allelic discrimination assay. The genotyping was performed by real-time polymerase chain reaction (PCR) in a 96-well plate (StepOne Plus™ Real-Time PCR Systems, Applied Biosystems) with predesigned, commercially made primers and allele-specific minor groove binding probes (FAM and VIC; Applied Biosystems, Foster City, CA, USA) in a reaction volume of 10 μl (10 ng of genomic sample DNA, assay mix, and PCR Universal Master Mix with AmpErase® Uracil-DNA Glycosylase) as follows: 60°C for 30 s, and 95°C for 10 min, followed by 50 cycles of 92°C for 15 s and 60°C for 90 s. PCR was performed in duplicate with both positive and negative controls. In accordance with previous studies, *COMT*-met denotes subjects with the *COMT*-met/met genotype, while *COMT*-val denotes subjects with the *COMT*-met/val or *COMT*-val/val genotype (Kang et al., 2010).
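
The grouping rule at the end of this paragraph can be written as a small lookup. The following is a minimal sketch only: the function name `comt_group` and the `met/val` string format are hypothetical, and the rule itself (met/met homozygotes versus any val carrier) is taken from the text above.

```python
def comt_group(genotype: str) -> str:
    """Map an rs4680 genotype call such as 'met/val' to an analysis group.

    The 'a/b' string format is an assumption made for illustration.
    """
    alleles = sorted(a.strip().lower() for a in genotype.split("/"))
    if len(alleles) != 2 or any(a not in ("met", "val") for a in alleles):
        raise ValueError(f"unrecognized genotype: {genotype!r}")
    # Only met/met homozygotes form the COMT-met group;
    # met/val and val/val are pooled into the COMT-val group.
    return "COMT-met" if alleles == ["met", "met"] else "COMT-val"


if __name__ == "__main__":
    for call in ("met/met", "met/val", "val/val"):
        print(call, "->", comt_group(call))
```

Pooling heterozygotes with val/val homozygotes treats val as the dominant allele, which is the grouping the paragraph attributes to Kang et al. (2010).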

#### Magnetic Resonance Imaging (MRI) Data Acquisition

All MRI data were obtained on a 3.0-tesla MR system (Discovery MR750, General Electric, Milwaukee, WI, USA). Tight but comfortable foam padding was used to stabilize head position, and earplugs were used to reduce scanner noise during image acquisition. A sagittal 3D T1-weighted brain volume sequence with 188 sagittal slices was performed with the following scan parameters: repetition time (TR) = 8.2 ms; echo time (TE) = 3.2 ms; inversion time (TI) = 450 ms; flip angle (FA) = 12°; field of view (FOV) = 256 mm × 256 mm; matrix = 256 × 256; slice thickness = 1 mm, no gap. Resting-state functional MRI (fMRI) scans were performed using a gradient-echo single-shot echo-planar imaging sequence with scan parameters of TR/TE = 2000/45 ms, FOV = 220 mm × 220 mm, matrix = 64 × 64, FA = 90°, slice thickness = 4 mm, gap = 0.5 mm, 32 interleaved transverse slices, and 180 volumes. During fMRI scans, all subjects were instructed to keep their eyes closed, to relax, to move as little as possible, to think of nothing in particular, and not to fall asleep.

#### fMRI Data Preprocessing

Resting-state fMRI data were preprocessed using SPM8 (http://www.fil.ion.ucl.ac.uk/spm). The first 10 volumes for each subject were discarded to allow for scanner stabilization and the participants' adaptation to the scanning environment. The remaining volumes were preprocessed after a slice-timing correction. All subjects' fMRI data were within defined motion thresholds (i.e., translational and rotational motion parameters less than 2 mm or 2°). Several nuisance covariates (six motion parameters and average BOLD signals of the ventricles and white matter) were regressed out of the data. Framewise displacement (FD), which indexes volume-to-volume changes in head position, was also calculated. Volumes with an FD over 0.5 were treated as spike volumes and were also regressed out. The datasets were bandpass filtered with frequencies ranging from 0.01 to 0.08 Hz. Individual structural images were linearly coregistered to the mean functional image, and the transformed structural images were linearly coregistered to the Montreal Neurological Institute (MNI) space. Finally, the motion-corrected functional volumes were spatially normalized to the MNI space using parameters that were estimated during linear coregistration. The functional images were resampled into 3-mm cubic voxels.
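
The FD step can be illustrated as follows. The text does not give the FD formula or its unit, so this sketch assumes one common formulation: the sum of absolute volume-to-volume changes in the six motion parameters, with rotations (in radians) converted to arc length on a sphere of assumed radius 50 mm. The function names are hypothetical; only the 0.5 cutoff comes from the text, and it is assumed here to be in millimetres.

```python
def framewise_displacement(motion, radius_mm=50.0):
    """Return one FD value per volume.

    motion: list of [tx, ty, tz, rx, ry, rz] rows, translations in mm and
    rotations in radians (an assumed convention). The first volume has no
    predecessor, so its FD is set to 0.
    """
    fd = [0.0]
    for prev, cur in zip(motion, motion[1:]):
        translation = sum(abs(c - p) for c, p in zip(cur[:3], prev[:3]))
        # Convert each rotation change to millimetres of arc on the sphere.
        rotation = sum(abs(c - p) * radius_mm for c, p in zip(cur[3:], prev[3:]))
        fd.append(translation + rotation)
    return fd


def spike_volumes(fd, threshold=0.5):
    """Indices of volumes whose FD exceeds the threshold."""
    return [i for i, value in enumerate(fd) if value > threshold]


if __name__ == "__main__":
    motion = [
        [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
        [0.1, 0.0, 0.0, 0.0, 0.0, 0.0],
        [0.1, 0.0, 0.0, 0.02, 0.0, 0.0],
    ]
    fd = framewise_displacement(motion)
    print(fd, spike_volumes(fd))
```

Each flagged index would typically get its own one-hot nuisance regressor, which is one way to read "regressed out" for spike volumes.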

#### gFCD Calculation

We calculated the gFCD of each voxel using an in-house script on the Linux platform as previously reported (Tomasi and Volkow, 2010, 2011). The strength of the functional connectivity between voxels was evaluated using Pearson's linear correlation with a correlation coefficient threshold of R > 0.6. The gFCD calculation was restricted to voxels in the cerebral gray matter regions using a cerebral gray matter mask. The gFCD at a given voxel x0 was computed as the total number of functional connections, k(x0), between x0 and all other voxels using a "growing" algorithm that was developed on the Linux platform. This calculation was repeated for all voxels x0 in the whole brain. To increase the normality of the distribution, we applied grand mean scaling to gFCD by dividing by the mean value of the qualified voxels in the whole brain. Finally, to minimize differences in the functional brain anatomy across subjects, we spatially smoothed the gFCD maps with a 6 × 6 × 6 mm³ Gaussian kernel.
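
To make the definition of k(x0) concrete, here is a brute-force sketch. It is not the authors' in-house script or the "growing" algorithm: it simply correlates every pair of voxel time series and counts those above the threshold, which is feasible only for small examples. The function names and the toy data are hypothetical; the R > 0.6 threshold and the grand mean scaling come from the text.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def gfcd(timeseries, r_threshold=0.6):
    """Count, for each voxel, the other voxels correlated with it above the threshold."""
    n = len(timeseries)
    counts = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if pearson(timeseries[i], timeseries[j]) > r_threshold:
                counts[i] += 1
                counts[j] += 1
    return counts


def grand_mean_scale(counts):
    """Divide each voxel's count by the mean count over all voxels."""
    mean = sum(counts) / len(counts)
    return [c / mean for c in counts] if mean else list(counts)


if __name__ == "__main__":
    # Four toy gray matter voxels, four time points each.
    voxels = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1], [1, 2, 3, 5]]
    k = gfcd(voxels)
    print(k, grand_mean_scale(k))
```

In practice the input would be restricted to voxels inside the gray matter mask before counting, as the paragraph describes.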

#### Gray Matter Volume (GMV) Calculation

The GMV of each voxel was calculated using Statistical Parametric Mapping (SPM8; http://www.fil.ion.ucl.ac.uk/spm/software/spm8/). With the standard unified segmentation model, we segmented structural MR images into gray matter (GM), white matter, and cerebrospinal fluid. After an initial affine registration of the GM concentration map into the MNI space, GM concentration images were nonlinearly warped using diffeomorphic anatomical registration through exponentiated Lie algebra and then resampled to a voxel size of 1.5 × 1.5 × 1.5 mm³. The nonlinear determinants were first derived from the spatial normalization step and then multiplied by the GM concentration map to obtain the GMV of each voxel. Finally, the GMV images were smoothed with a 6 × 6 × 6 mm³ full width at half-maximum Gaussian kernel. The normalized, modulated, and smoothed GMV maps were used for statistical analyses after spatial preprocessing.
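
The modulation and smoothing steps reduce to two small calculations, sketched below under stated assumptions. The function names are hypothetical and the flat lists stand in for 3D images. The voxelwise product is as described in the text, and the FWHM-to-sigma conversion is the standard Gaussian identity, sigma = FWHM / (2·sqrt(2·ln 2)).

```python
import math


def modulate(gm_concentration, jacobian_det):
    """Voxelwise GMV: GM concentration times the nonlinear Jacobian determinant."""
    return [c * j for c, j in zip(gm_concentration, jacobian_det)]


def fwhm_to_sigma_voxels(fwhm_mm, voxel_mm):
    """Convert a Gaussian kernel's FWHM in mm to its standard deviation in voxels."""
    sigma_mm = fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return sigma_mm / voxel_mm


if __name__ == "__main__":
    # A voxel stretched by the warp (determinant > 1) gains volume and vice versa.
    print(modulate([0.8, 0.5], [1.25, 0.5]))
    # 6 mm FWHM kernel on the 1.5 mm grid described above.
    print(fwhm_to_sigma_voxels(6.0, 1.5))
```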

#### Statistical Analysis

A two-sample *t*-test was used to compare gFCD between groups in a voxelwise manner with adjustment for age and sex. A familywise error (FWE) method was used to correct for multiple comparisons (*p* < 0.05). For each cluster showing a significant difference between groups, the mean gFCD was extracted for each subject and then used for region of interest (ROI)-based comparison between groups. The possible effect of GMV on gFCD changes was excluded by including the GMV of each ROI as an added covariate of no interest. For these ROI-based analyses, the effect size of each comparison was described using Cohen's d. To further investigate whether gFCD was correlated with clinical variables, we used partial correlation analysis to relate the ROI-based gFCD values to antipsychotic dose (in chlorpromazine equivalents) and illness duration, adjusted for age and sex. Partial correlation analyses were also performed to investigate the relationship between AVH scores and gFCD values, adjusted for age, gender, educational level, and antipsychotic dose. The Bonferroni method was used to correct for multiple comparisons (*p* < 0.05).
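
Two of the quantities named above can be sketched in a few lines. The text does not state which variant of Cohen's d was used, so the pooled-standard-deviation form below is an assumption, and the function names and example values are hypothetical.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d using the pooled sample standard deviation (assumed variant)."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


def bonferroni_significant(p_values, alpha=0.05):
    """Flag each test as significant if p < alpha / number of tests."""
    cutoff = alpha / len(p_values)
    return [p < cutoff for p in p_values]


if __name__ == "__main__":
    # Toy ROI-mean gFCD values for two groups (illustrative numbers only).
    print(cohens_d([2.0, 4.0, 6.0], [1.0, 3.0, 5.0]))
    print(bonferroni_significant([0.01, 0.02, 0.2]))
```

With three tests the Bonferroni cutoff is 0.05 / 3, so a raw *p* of 0.02 no longer counts as significant.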

# RESULTS

#### Sample Information

Ultimately, 34 *COMT*-met subjects and 45 *COMT*-val subjects underwent dopamine antagonist treatment for 6 months. However, the MRI data from only 25 *COMT*-met subjects and 21 *COMT*-val subjects could be used for analysis. We sought to assess accurately how the *COMT* genotype influenced antipsychotic efficacy and the accompanying brain alterations. We therefore artificially discarded information from three *COMT*-met subjects and one *COMT*-val subject (a flaw of the present study; please see the section on limitations), preserving only 22 *COMT*-met subjects and 20 *COMT*-val subjects for further analysis to guarantee comparability. All the sociodemographic, genotype, and treatment response information is shown in **Table 1**. There were no significant group differences in gender, age, educational level, illness duration, or baseline AVH symptom severity. During the treatment, all subjects took risperidone (Johns and Johns, Xi'an Yang-Sen Pharmaceutical Co., Ltd.), and their antipsychotic dosages (in chlorpromazine equivalents) ranged from 100 to 500 mg/d. Antipsychotic dosage showed a significant difference between genotypes, with *COMT*-val subjects receiving higher dosages than *COMT*-met subjects. Surprisingly, however, despite largely comparable medication regimens, the efficacy of antipsychotics on AVHs was remarkably different between the two groups (**Table 1** and **Figure 1**, Note: \*\* *P* < 0.001).

TABLE 1 | Sociodemographic and clinical characteristics.

#### gFCD and GMV Differences Between Two Groups at Baseline

At baseline, compared to the *COMT*-val genotype subjects, *COMT*-met genotype subjects demonstrated higher gFCD in the auditory cortex (superior temporal gyrus or Wernicke brain region, *p* < 0.05, corrected with FWE) (**Figure 2A**). The ventral lateral prefrontal lobe, which is related to mood regulation, also demonstrated higher gFCD (**Figure 2A**). However, GMV showed no significant differences between the two groups (**Figure 2B**).

#### gFCD and GMV Differences Between Two Groups After Treatment

After treatment, compared to the *COMT*-val genotype subjects, the present study found that *COMT*-met genotype subjects demonstrated lower gFCD in the middle temporal gyrus (**Figure 2C**). More notably, *COMT*-met genotype subjects' GMV was also significantly lower in the lateral parietal lobe (**Figure 2D**).

*Note: AHRS: Auditory-Verbal Hallucinations Rating Scale.*

#### gFCD and GMV Differences Before and After Treatment in Each Group

In the *COMT*-met subjects, gFCD was lower in the lateral parietal lobe, dorsolateral prefrontal cortex, posterior superior temporal gyrus, temporal pole, and motor cortex after treatment (**Figure 3A**), while GMV was lower in the lateral frontal lobe, lateral temporal lobe, and lateral parietal lobe (**Figure 3B**). However, in the *COMT*-val subjects, gFCD was lower in the posterior superior temporal gyrus and posterior parietal lobe (**Figure 3C**), and GMV was lower in the superior temporal gyrus and ventral lateral prefrontal lobe (**Figure 3D**).

#### The Association Among gFCD, GMV, and AVHs

In each group, we found no statistical correlations between gFCD and antipsychotic dosage in chlorpromazine equivalents, illness duration, or AVH scores before or after treatment. Similarly, we did not find any statistical correlations between GMV and antipsychotic dosage in chlorpromazine equivalents, illness duration, or AVH scores before or after treatment.

FIGURE 1 | Differential effects of treatment in two groups (note: \*\* *P* < 0.01).

# DISCUSSION

The present artificially controlled pilot study is the first one to demonstrate that the efficacy of dopamine antagonists on Hi-AVH subjects was influenced by the *COMT* genotype and was accompanied by corresponding brain structural and functional alterations. More importantly, we not only compared treatment efficacy and brain alterations between two groups but also assessed the difference before and after treatment in each group, providing supplementary information to further clarify the pathological features of Hi-AVHs with specific *COMT* genotypes. Although this pilot study has several limitations, it can at least provide a clue to guide further study.

A growing number of studies have confirmed that GMV, usually referred to as structural alterations, affects many functional activities and subsequently causes mental and behavioral alterations (Asami et al., 2013; Rogers and De Brito, 2016). gFCD is an index of functional connectivity, and many previous studies have reported that it reflects informational communication capacity to some extent: an increase in gFCD indicates that information communication throughout the brain is enhanced, and a decrease indicates the opposite (Goni et al., 2014; Teodoro et al., 2018). The GMV and gFCD alterations in Hi-AVH subjects after treatment indicate that antipsychotics may normalize structural and functional aberrations of the brain, subsequently alleviating AVH symptoms, although the treatment efficacy is influenced by the *COMT* genotype.

In this pilot study, we found that *COMT*-met Hi-AVH subjects achieved markedly better treatment efficacy than those with the *COMT*-val genotype; correspondingly, brain structural and functional alterations were also more widespread after treatment in the former group than in the latter. However, we did not find any correlation between the gFCD or GMV alterations accompanying AVH alleviation and the dosage of antipsychotics or the duration of illness. More interestingly, we also found that *COMT*-met subjects had a broader scope of brain structural and functional aberrations before and after treatment than *COMT*-val subjects. These aberrant brain regions are involved in multiple types of information processing; for example, the superior temporal gyrus participates in the processing of auditory information, the ventral lateral prefrontal lobe and lateral parietal lobe participate in cognitive information processing, the posterior superior temporal gyrus and posterior parietal lobe are related to language processing, and the temporal pole participates in language processing and multisensory integration (Fan et al., 2014; Xu et al., 2015). These findings indicate that Hi-AVH subjects also have functional aberrations in many brain regions that modulate multiple types of functional activity in the brain, not limited to auditory information processing. More importantly, after treatment, all subjects also demonstrated normalization of hyperfunctional activity, which indicated that GMV was impaired in all subjects. More notably, the GMV reduction before and after treatment was more widespread among *COMT*-met subjects than among *COMT*-val subjects, which indicates that antipsychotics may have caused GMV reduction or normalized GMV that was enlarged before treatment. Clarifying this complex phenomenon requires further study of genotypically similar normal controls and Hi-AVH subjects. Some studies have reported that antipsychotics may cause GMV reduction (Asami et al., 2013; Rogers and De Brito, 2016). The normalization of aberrant functional activity by antipsychotic treatment has been confirmed by many studies (Gong et al., 2016); therefore, we do not expand on the explanation in this paper. Our data may support the postulation that antipsychotics normalize GMV that was enlarged before treatment, although this postulation has very little support in the currently available literature. Of course, additional attention and studies are required to confirm this conclusion, since the subjects in our study took antipsychotics for only 6 months.

AVHs in schizophrenia have been reported by many studies from multiple perspectives, ranging from clinical perspectives to combinations of neuroimaging, electrophysiological, and neurotransmitter perspectives (Lu et al., 2007; Kang et al., 2010; Gothelf et al., 2011; Edgar et al., 2012; Steiner et al., 2018). Multiple specific features have been corroborated by many studies, and some are generally accepted to some extent. However, few studies have reported on AVHs in Hi-AVH subjects. Our pilot study also found that AVH symptoms in Hi-AVHs are nearly as severe as in schizophrenia patients, and many subjects desired treatment. Further study is needed to explore specific strategies by which to treat these subjects, especially considering the GMV impairment caused by antipsychotics. In only 6 months of treatment, the GMV decreased at a remarkable speed. One possible explanation is that the GMV in Hi-AVH subjects was enlarged compared with that of healthy controls, while antipsychotics normalized the enlarged GMV, thus causing GMV reduction. However, this postulation is highly speculative and cannot be adequately tested with the current data. Therefore, further study is urgently needed to address this hypothesis.

#### Limitations

By listing the flaws of this pilot study, we hope to help other scholars avoid similar weaknesses in their research. This study has at least 12 limitations, which we list in the following paragraphs; we sincerely hope that international scholars will provide constructive comments to guide our subsequent studies.

First, to evaluate the practicability of this study, we adopted many methods to ensure that the subjects could complete the full study. However, despite our best efforts, only 42 subjects had adequate data for the whole analysis. To investigate more accurately the influence of *COMT* genotype on the efficacy of dopamine antagonist treatment of AVH symptoms in Hi-AVH subjects, a long-term cohort study with a large sample will be necessary. Given this pilot study, we should rethink our methods to ensure that a sufficiently large sample is enrolled. Second, to explore potentially objective evaluation indices, we discarded four samples that deviated substantially from the other samples. The present study was only a pilot study, and we need to strengthen our study methods so that heterogeneous samples can be analyzed. Third, we adopted the relatively simple indices of gFCD and GMV to explore brain alterations. More precise MRI data analysis methods are currently available, and we should adopt the most advanced methods to analyze brain alterations in future studies. Fourth, in this pilot study, we considered only the *COMT* genotype. Other genes, such as the *FOX2* gene, *NRG1*, and other newly identified genes, were not examined. We should adopt genomic, transcriptomic, and even proteomic methods in further studies. Fifth, we used a 3.0-T scanner to acquire MRI data. Currently, a 7.0-T scanner is in use in China. A higher-resolution scanner should be applied in future studies to explore subtle alterations in the brain. Sixth, we did not consider treatment time or reciprocal gene interaction effects, which should be taken into account in further studies. Seventh, an important flaw in this pilot study is that, owing to practical constraints, we did not enroll healthy controls with the *COMT*-met and *COMT*-val genotypes at baseline to compare their brain alterations with those of similarly genotyped Hi-AVH subjects. 
Therefore, with the current data, we cannot clarify whether brain structural and functional alterations exist in the Hi-AVH subjects at baseline. However, each subject was compared before and after treatment in the present study. This self-controlled comparison may partly remedy the study flaw. In future studies, we must solve this problem to achieve a more precise understanding of AVH-related brain structural alterations in Hi-AVHs. Eighth, we did not enroll schizophrenia patients with AVHs for comparison, which limits the information that can be gained from the present study. The ninth limitation of this pilot study is that we did not compare cognitive alterations before antipsychotic treatment. To the best of our knowledge, there are no studies reporting any influence of antipsychotics on the cognitive ability of psychiatrically healthy subjects. Conversely, many studies have reported that antipsychotics (including risperidone) have positive effects on cognitive impairments in patients with schizophrenia (Suzuki and Gen, 2012; Desamericq et al., 2014), which indicates that antipsychotics can improve cognitive impairments or, at least, do not impair cognitive ability. Hence, we need to clarify the effect of antipsychotics on the cognitive ability of Hi-AVHs in the future. Tenth, when we found that GMV was lower after the six-month treatment, we worried about the influence of the treatment on the subjects' social and cognitive function. Thus, we adopted the Global Assessment of Functioning (GAF) scale (Vaskinn and Abu-Akel, 2018) and the Wisconsin Card Sorting Test (WCST) (Westwood et al., 2016) to evaluate each subject as a remedial measure to determine whether the dopamine antagonist had caused functional impairment. Fortunately, all subjects scored within the normal range. This problem should be avoided in future studies. 
More importantly, as mentioned above, GMV decreased so quickly in the Hi-AVH subjects receiving treatment that we must be highly vigilant. According to our pilot study, we suggest that a dopamine antagonist is not an appropriate treatment for Hi-AVH subjects. Eleventh, in this pilot study, we calculated the gFCD both with and without the global signal and found little difference between the two values. As there is no consensus as to whether the global signal should be included in calculating gFCD, we reported the gFCD with the global signal here. Twelfth, as this was a pilot study, we did not compare differences in the demographic and clinical data between the subjects who completed and those who did not complete the full study. In this study, we aimed to observe the influence of the *COMT* genotype on the efficacy of atypical antipsychotics on AVHs in the Hi-AVH subjects. However, we did not genotype *COMT* in the subjects who failed to complete the study, which was a flaw of this study.

#### CONCLUSION

In this artificially controlled pilot study, despite its many limitations, we report for the first time that *COMT* genotypes influence the functional features of the brain and the efficacy of dopamine antagonists in the treatment of AVHs in Hi-AVH subjects.

*COMT*-met genotype subjects responded more strongly to dopamine antagonists but also had more serious GMV and FCD reductions than *COMT*-val subjects. Although the design of this pilot study is less than optimal, these findings can at least provide primary information to further explain the mechanisms of AVHs and to help reveal specific targets for the treatment of AVHs in Hi-AVH subjects or in schizophrenia patients.

#### AUTHOR CONTRIBUTIONS

CZhuo and CZhou had full access to all of the data in the study and took responsibility for the integrity of the data and the accuracy of the data analysis. CZhuo and CZhou contributed to the study concept and design. YX, LZ, and RJ participated in acquisition, analysis, or interpretation of data. CZhuo, YX, LZ, and RJ performed statistical analysis. CZhuo and CZhou contributed to administrative, technical, or material support. CZhuo was involved in drafting the manuscript.

#### FUNDING

This work was supported by grants from the Tianjin Health Bureau Foundation (2014KR02 to CZhuo), National Natural Science Foundation of China (81871052 to CZhuo), the Key Projects of the Natural Science Foundation of Tianjin, China (17JCZDJC35700 to CZhuo), National Key Research and Development Program of China (2016YFC1307004 to YX), and Multidisciplinary Team for Cognitive Impairment of Shanxi Science and Technology Innovation Training Team (201705D131027 to YX).


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Zhuo, Xu, Zhang, Jing and Zhou. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Epigenetic Regulations in Neuropsychiatric Disorders

Janise N. Kuehner<sup>1</sup>, Emily C. Bruggeman<sup>1</sup>, Zhexing Wen<sup>2,3,4</sup> and Bing Yao<sup>1</sup>\*

<sup>1</sup> Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, United States, <sup>2</sup> Department of Psychiatry and Behavioral Sciences, Emory University School of Medicine, Atlanta, GA, United States, <sup>3</sup> Department of Cell Biology, Emory University School of Medicine, Atlanta, GA, United States, <sup>4</sup> Department of Neurology, Emory University School of Medicine, Atlanta, GA, United States

Precise genetic and epigenetic spatiotemporal regulation of gene expression is critical for proper brain development, function and circuitry formation in the mammalian central nervous system. Neuronal differentiation processes are tightly regulated by epigenetic mechanisms including DNA methylation, histone modifications, chromatin remodelers and non-coding RNAs. Dysregulation of any of these pathways is detrimental to normal neuronal development and functions, which can result in devastating neuropsychiatric disorders, such as depression, schizophrenia and autism spectrum disorders. In this review, we focus on the current understanding of epigenetic regulations in brain development and functions, as well as their implications in neuropsychiatric disorders.

#### Edited by:

Mojgan Rastegar, University of Manitoba, Canada

#### Reviewed by:

Tomas J. Ekström, Karolinska Institutet (KI), Sweden Malav Suchin Trivedi, Northeastern University, United States

#### \*Correspondence:

Bing Yao bing.yao@emory.edu

#### Specialty section:

This article was submitted to Epigenomics and Epigenetics, a section of the journal Frontiers in Genetics

Received: 13 November 2018 Accepted: 11 March 2019 Published: 04 April 2019

#### Citation:

Kuehner JN, Bruggeman EC, Wen Z and Yao B (2019) Epigenetic Regulations in Neuropsychiatric Disorders. Front. Genet. 10:268. doi: 10.3389/fgene.2019.00268

Keywords: epigenetics, neuropsychiatric disorders, DNA methylation, Rett syndrome, Fragile X syndrome, autism spectrum disorders, schizophrenia, major depressive disorders

#### INTRODUCTION

The concept of epigenetics was first proposed in 1939 by Conrad Waddington to describe early embryonic development (Waddington, 1939). He proposed that development originates from the interactions of the starting material in the fertilized egg, and that the interactions give rise to something new. He further postulated that this process cycles, leading to the formation of a whole organism. Today, the accepted definition of epigenetics is the study of modifications that directly affect the expression of a gene, but do not change the underlying DNA sequence (Goldberg et al., 2007; Allis and Jenuwein, 2016; Gayon, 2016). There are several major epigenetic mechanisms that are extensively studied including DNA modifications, histone modifications, chromatin remodeling and RNA regulation via non-coding RNAs such as microRNA (miRNA) and long noncoding RNA (lncRNA) (**Figure 1**). Modifications can be added, removed and interpreted by various classes of proteins collectively known as 'writers,' 'erasers' and 'readers,' respectively. Disruption of these epigenetic mechanisms and their molecular machinery can have catastrophic consequences in the mammalian central nervous system (CNS).

Both nervous system development and function can be affected by epigenetic spatiotemporal regulation of gene expression. In the mammalian CNS, epigenetic dysregulation is associated with neuropsychiatric diseases such as major depressive disorder (MDD), autism spectrum disorders (ASDs), Fragile X, Rett syndrome and schizophrenia. Epigenetic studies are actively trying to identify biomarkers that could be associated with diseases to aid in our development of novel therapeutics. This information is critical, as the prevalence of neuropsychiatric diseases is on the rise (Atladottir et al., 2015). Here, we review the current understanding of epigenetic regulation in brain development and functions, with a focus on DNA methylation, as well as their implications in psychiatric diseases.

#### DNA METHYLATION

#### Functional Roles of DNA Methylation

DNA methylation is one of the best characterized epigenetic marks and has been regarded as a highly stable mark found in differentiated cells (Reik, 2007; Suzuki and Bird, 2008). It involves the covalent methylation of the fifth position in the cytosine ring, generating 5-methylcytosine (5mC) (**Figure 1A**). DNA methylation largely occurs at CpG dinucleotides (Bird, 1986). Accumulation of short, unmethylated CpG-rich clusters known as CpG islands occurs in the promoter regions of most genes (Jones, 2012). Genome-wide studies have suggested that the distribution of 5mC in transcripts could have differential roles in gene expression. For example, methylation status of the CpG islands helps to determine whether the corresponding gene will be expressed, whereas gene body methylation has been proposed to promote transcriptional elongation (Neri et al., 2017) and affect splicing (Maunakea et al., 2013). In addition, the methylation status of CpG islands can vary with tissue and cell type (Illingworth and Bird, 2009). For instance, the gene HTR2A, which has been implicated in many neuropsychiatric disorders (Norton and Owen, 2005), shows differential expression in the cerebellum and the cortex and is regulated by DNA methylation (Ladd-Acosta et al., 2007). Strikingly, the methylated CpG locus regulating HTR2A expression is over 1 kb upstream of the promoter rather than in the promoter region, illustrating that methylation can regulate genes across long distances. Thus, DNA methylation has important roles in shaping brain region-specific transcriptome profiles.
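The idea of a CpG-rich cluster can be made concrete with a short calculation. The Python sketch below is an illustration only and is not drawn from any study cited here. It computes the GC fraction and the observed/expected CpG ratio of a sequence and applies commonly used heuristic thresholds (length of at least 200 bp, GC fraction above 0.5, observed/expected ratio above 0.6). The function names `cpg_stats` and `looks_like_cpg_island` and the default thresholds are assumptions introduced for this example.

```python
def cpg_stats(seq: str) -> dict:
    """Return the GC fraction and observed/expected CpG ratio of a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so this count is exact
    gc_fraction = (c + g) / n if n else 0.0
    # Expected CpG count if C and G were placed independently along the sequence
    expected = (c * g) / n if n else 0.0
    obs_exp = cpg / expected if expected else 0.0
    return {"length": n, "gc_fraction": gc_fraction, "obs_exp": obs_exp}


def looks_like_cpg_island(seq: str, min_len: int = 200,
                          min_gc: float = 0.5, min_obs_exp: float = 0.6) -> bool:
    """Apply heuristic CpG island thresholds (assumed defaults, see text)."""
    stats = cpg_stats(seq)
    return (stats["length"] >= min_len
            and stats["gc_fraction"] > min_gc
            and stats["obs_exp"] > min_obs_exp)


if __name__ == "__main__":
    print(cpg_stats("CG" * 100))
    print(looks_like_cpg_island("AT" * 100))
```

Real annotation tools scan a sliding window along the genome and merge adjacent qualifying windows; the sketch evaluates a single window and says nothing about methylation status, which has to be measured experimentally.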

Not only can DNA methylation regulate protein coding genes, it can also regulate non-protein coding RNA like lncRNAs. Random X-inactivation, an essential embryonic event, is triggered by the production of Xist, a lncRNA that coats the X chromosome destined to be inactivated (Borsani et al., 1991; Brown et al., 1992). The promoter of the Xist gene contains a CpG island whose methylation status ultimately dictates whether the X chromosome is active (Beard et al., 1995). How DNA methylation regulates lncRNA in the brain is still unclear. One study compared the DNA methylation patterns around the transcription start sites (TSSs) of protein coding genes and lncRNA loci (Sati et al., 2012). Surprisingly, a sharp increase in DNA methylation immediately downstream of the TSS was associated with lncRNA loci, but did not correlate with expression of the lncRNA. While this finding suggests that DNA methylation may not play an essential role in lncRNA expression, it would be interesting to investigate if blocking methylation at these sites influenced lncRNA expression.

In addition to its roles in gene regulation, DNA methylation also maintains genomic stability by controlling the expression of highly repetitive regions in the genome such as retrotransposons and satellite DNA (Liu et al., 1994; Woodcock et al., 1997; Walsh et al., 1998). In general, long interspersed nuclear element-1 (LINE 1) is only active in the germline and during early development (Ma et al., 2010). During somatic cell differentiation, DNA methylation silences LINE 1. Interestingly, studies have suggested that LINE 1 may be active during human and rodent neuronal differentiation and influence neuronal gene expression to create cell heterogeneity in the adult brain (Muotri et al., 2005; Muotri and Gage, 2006; Coufal et al., 2009). Indeed, LINE 1 has been shown to be more active in the brain compared to other tissues (Coufal et al., 2009). Increases in LINE 1 and other repetitive elements have been associated with the neuropsychiatric disorder Rett syndrome (Muotri et al., 2010). Suppression of LINE 1 requires methylation of its promoter and binding of the methyl-binding protein MeCP2, which plays a causal role in Rett syndrome.

Suppressing the expression of repetitive elements is one way by which DNA methylation maintains genomic stability and integrity. Genome instability has been shown to be highly correlated with many neuropsychiatric diseases such as schizophrenia, autism, Rett syndrome and several others (Smith et al., 2010). Numerous genes associated with these disorders, particularly schizophrenia and autism, co-localize with regions of the genome that are more susceptible to mutations or epigenetic alterations, known as fragile sites. The most studied fragile site is associated with Fragile X syndrome and will be discussed later in this review.

Finally, DNA methylation has important roles in early developmental processes such as gene imprinting. Often, the "imprint" is methylation of a long-range control element called an imprint control element (ICE) (also referred to as imprint control region, ICR, or imprint center, IC) (Li et al., 1993; Barlow, 2011). Parental specific methylation of the ICE is established by the DNA methyltransferase (DNMT) complex DNMT3A/3L during gamete development (Bourc'his et al., 2001; Kaneda et al., 2004). Of the approximately 100 imprinted genes currently known, the majority of them are expressed in brain tissues, though not always exclusively, and have been reviewed previously (Wilkinson et al., 2007). One of the more extensively studied imprinted genes, specifically in the CNS of mammals, is the paternally expressed gene Necdin (Ndn) (Aizawa et al., 1992). Ndn regulates neuronal differentiation and axonal outgrowth. Also, Ndn is most highly expressed during mouse neuronal generation and between postnatal days 1–4.

#### DNA Methylation in the Brain

DNA methylation in the brain is required for brain development and function throughout all stages in life. Dynamic regulation of DNA methylation is critical for cellular differentiation. One study compared the changes in DNA methylation patterns between two differentiation phases: the transition of embryonic stem cells (ESCs) to neuronal progenitor cells (NPCs), and the transition of NPCs to differentiated neurons (Mohn et al., 2008). The most dynamic changes in DNA methylation patterns were found when ESCs lost their pluripotency and became NPCs. In fact, ESCs were nearly devoid of DNA methylation marks except at the promoters of genes that were germline specific. In contrast, during the differentiation of NPCs to mature neurons, only 2.3% of the analyzed promoters gained de novo methylation and only 0.1% of promoters were demethylated, suggesting that the majority of DNA methylation dynamics do not occur in this phase. Similar to neurogenesis, astrocytogenesis is tightly controlled by DNA methylation. In mouse, astrocyte differentiation from neuroepithelial cells requires that the promoter of the GFAP gene be demethylated on embryonic day 14.5, allowing for the transcription factor STAT3 to bind and activate GFAP expression (Teter et al., 1994; Takizawa et al., 2001).

Very few studies have focused on how DNA methylation regulates other brain developmental features, such as neural migration and axonal/dendritic outgrowth. Two recent studies have identified the DNA methyltransferase DNMT1 as having putative regulatory roles in immature GABAergic interneuron migration (Pensold et al., 2017; Symmank et al., 2018). They found that Dnmt1 promotes the migration and survival of immature migratory GABAergic interneurons that derive from the embryonic preoptic area (POA) by repressing Pak6 expression (Pensold et al., 2017). p21-activated kinases (PAKs) are known for their roles in cytoskeletal organization (Kumar et al., 2017), and Pak6 has previously been shown to stimulate neurite outgrowth in post-migratory neurons derived from POA (Civiero et al., 2015; Pensold et al., 2017). De novo methylation by Dnmt3b in early embryonic neurodevelopmental processes has been shown to be critical in regulating the clustered protocadherin (Pcdh) genes (Toyoda et al., 2014). Protocadherins are cell-surface adhesion proteins that are predominantly expressed in the nervous system (Sano et al., 1993), and have critical functions in neurite self-avoidance (Lefebvre et al., 2012), neuronal survival (Wang et al., 2002b), and dendritic patterning (Garrett et al., 2012). In mammals, they are found in three closely linked gene clusters called α (Pcdha), β (Pcdhb), and γ (Pcdhg) (Kohmura et al., 1998; Wu and Maniatis, 1999). Interestingly, the Pcdhs are stochastically expressed from alternative promoters in individual neurons, generating single cell diversity of isoforms in the brain (Wang et al., 2002a). This stochastic expression is regulated by methylation of variable exons and has been thoroughly reviewed elsewhere (Hirayama and Yagi, 2017). Protocadherins have critical roles in neural development and are starting to be implicated in neuropsychiatric disorders such as ASDs, depression and schizophrenia (Redies et al., 2012; El Hajj et al., 2017).

DNA methylation also has roles in brain function such as memory processing. In the mammalian brain, the hippocampus and the cortex are largely responsible for memory formation and storage (Morris et al., 1982; Squire, 1986; Miller et al., 2010). In the hippocampus, contextual fear conditioning induced changes in DNA methylation during memory formation in rats. When DNMTs were inhibited by either zebularine or 5-aza-2′-deoxycytidine, neuronal plasticity-promoting genes Bdnf and Reelin demonstrated altered methylation patterns (Levenson et al., 2006). After contextual fear conditioning, Dnmt3a and Dnmt3b mRNA were highly upregulated in the brain; however, when DNMT inhibitors, zebularine or 5-aza-2′-deoxycytidine, were injected into the hippocampus immediately after contextual fear conditioning, the fear response was eliminated, suggesting that DNA methylation is required for memory formation (Miller and Sweatt, 2007). Importantly, when the memory suppressor gene Pp1 was examined after fear conditioning, there was an increase in methylation at the CpG island upstream of the Pp1 transcriptional start site. It was postulated that the increase in de novo Dnmts may be necessary to transcriptionally silence memory suppressor genes after fear conditioning training to allow for memory formation and consolidation. In addition to the formation of memories, DNA methylation also has putative roles in long-term memory storage. Contextual fear conditioning was found to disrupt DNA methylation at three genes associated with memory, Egr1, reelin, and calcineurin, which also happen to have large promoter CpG islands (Miller et al., 2010). Both reelin and calcineurin were hypermethylated; however, only calcineurin maintained this hypermethylated state for 30 days, suggesting that DNA methylation might be required for long term memory storage.

Worth noting is that DNA methylation patterns in the brain can be affected by external stimuli in one's environment. Interestingly, a study found that in mature neuronal cells, CpGs in low density regions compared to CpG islands undergo dynamic DNA methylation changes in response to electroconvulsive stimulation (Guo et al., 2011a). Numerous studies have shown that maternal care during childhood (Weaver et al., 2004), early life stressors including abuse (McGowan et al., 2009), parental separation and social defeat stressors can alter DNA methylation patterns in the brain and have been reviewed elsewhere (Yu et al., 2011).

#### DNA Methyltransferases

DNA methylation is generated by a group of DNMTs, also regarded as 5mC enzymatic "writers" (**Figure 1A**). Each Dnmt (Dnmt1, 3a, 3b, 2, and 3L) has evolved to have its own specialized regulatory functions. These specialized functions could be attributed to the lack of sequence homology seen in the N-terminal regulatory domains of the Dnmts (Bestor and Verdine, 1994). All of the Dnmts contain some version of a cysteine rich domain that further defines their functions. The most conserved region between the DNMTs is the C-terminal catalytic domain, which is characteristic of all enzymes that modify pyrimidines at the fifth position (**Figure 2A**).

The first Dnmt to be purified was Dnmt1, in 1983, and it was found to be responsible for maintaining methylated CpG sites during DNA replication (Bestor and Ingram, 1983). Dnmt1 interacts with replication machinery, such as proliferating cell nuclear antigen (PCNA). Maintenance of the genomic methylation pattern requires that unmethylated regions also be maintained during replication. During the S phase, the cell cycle inhibitor p21 blocks Dnmt1 from interacting with PCNA, which ensures that unmethylated regions maintain their original state (Chuang et al., 1997). This regulation of Dnmt1 plays an important role in asynchronous replication, specifically at replication origins that include CpG islands (Delgado et al., 1998). Mutation and loss-of-function studies have demonstrated the necessity of Dnmt1 during embryonic development. By gestational day 9.5, Dnmt1-null mouse embryos failed to develop and died by gestational day 11 (Li et al., 1992). In addition, overall global methylation levels decreased by threefold in the Dnmt1-null embryos.
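The maintenance role described above can be illustrated with a toy dilution model. The following Python sketch is a simplification under stated assumptions, not a model from the cited studies: a single CpG site starts fully methylated on both strands, every replication pairs each parental strand with a newly synthesized unmethylated strand, and maintenance methylation copies the mark onto the new strand with a fixed probability. The function name `methylated_strand_fraction` and the parameter `maintenance_efficiency` are hypothetical.

```python
def methylated_strand_fraction(divisions: int, maintenance_efficiency: float) -> float:
    """Fraction of strands still methylated at one CpG after repeated replication.

    maintenance_efficiency is the assumed probability that the mark on a
    parental strand is copied to the newly synthesized strand (1.0 mimics
    fully active maintenance, 0.0 mimics its complete absence).
    """
    if divisions < 0:
        raise ValueError("divisions must be non-negative")
    if not 0.0 <= maintenance_efficiency <= 1.0:
        raise ValueError("maintenance_efficiency must lie in [0, 1]")
    fraction = 1.0  # start from a fully methylated site
    for _ in range(divisions):
        # Parental strands keep their marks; new strands gain them only
        # through maintenance, and the total number of strands doubles.
        fraction = fraction * (1.0 + maintenance_efficiency) / 2.0
    return fraction


if __name__ == "__main__":
    for efficiency in (1.0, 0.5, 0.0):
        print(efficiency, methylated_strand_fraction(3, efficiency))
```

With full maintenance the mark persists indefinitely, whereas without maintenance it halves at every division. The model ignores de novo methylation and active demethylation, so it should not be read as a quantitative account of the methylation loss reported in Dnmt1-null embryos.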

Nearly 15 years later, two additional Dnmts were discovered, Dnmt3a and Dnmt3b. Both Dnmt3a and Dnmt3b are responsible for de novo methylation, which is also critical during early embryogenesis (Okano et al., 1998a). When either Dnmt3a or Dnmt3b is deleted during embryogenesis, severe developmental defects or embryonic lethality is observed, respectively (Okano et al., 1999). Mouse embryos with Dnmt3a depletion appear normal at birth, but die around 4 weeks of age. In contrast, embryos null for Dnmt3b were not viable and had growth retardation and neural tube defects. In addition to embryonic development, the de novo methyltransferases work in conjunction with Dnmt1 to regulate genome stability and imprinted genes. At a global level, deletion of Dnmt3a and/or Dnmt3b results in slight demethylation at repetitive sequences, but not to the same extent observed in Dnmt1 gene deletion. This indicates that Dnmt1 is more important for the maintenance of methylation at repetitive sequences. At a locus-specific level, deletion of Dnmt3a and/or Dnmt3b has varied effects. For example, at the imprinted gene loci Igf2r and H19, neither single nor dual gene disruption of Dnmt3a or Dnmt3b resulted in the demethylation pattern observed in Dnmt1 gene disruption. However, at another imprinted locus, Igf2, dual deletion of Dnmt3a/Dnmt3b showed demethylation levels comparable to Dnmt1 loss, whereas single gene disruption had no effect on demethylation. This indicates that there is some overlap in the roles of the Dnmts at certain gene sites.

Lesser known methyltransferases include Dnmt2 and Dnmt3L that were identified by sequence homology studies. Dnmt2 contains all of the C-terminal catalytic domains necessary to act as a methyltransferase; however, it was found to be non-essential for maintenance or de novo methylation (Okano et al., 1998b), but rather responsible for tRNA methylation (Goll et al., 2006; Schaefer et al., 2010). Dnmt3L demonstrates homology with Dnmt3a and Dnmt3b, but lacks the enzymatic activity required to generate de novo methylation (Bourc'his et al., 2001; Hata et al., 2002). Instead, Dnmt3L is essential in the establishment of maternal imprints and co-localizes with Dnmt3a/3b to regulate imprinting. Furthermore, in the male germ line, loss of Dnmt3L resulted in the reactivation of retrotransposons and meiotic failure in spermatocytes (Bourc'his and Bestor, 2004), suggesting a role in genomic stability.

#### DNA Methyltransferases in the CNS

As writers of DNA methylation, Dnmts play critical roles in the mammalian CNS. Studies conducted on embryonic and adult mice revealed that Dnmts are highly expressed in neural progenitor cells, but are maintained at substantially lower levels in most differentiated neurons (Goto et al., 1994). Furthermore, mouse studies revealed that in the CNS, Dnmt3a is detected as early as embryonic day (E) 10.5 in the ventricular and subventricular zones, but its expression is predominantly in adult post-mitotic neurons (Feng et al., 2005). In contrast, Dnmt3b could only be detected during early neurogenesis. These specific time points of expression suggest that Dnmt3b may be important during the early stages of brain development, whereas Dnmt3a is more crucial to mature neurons. Further supporting different spatiotemporal roles for the de novo methyltransferases, it was shown that Dnmt3b is required for methylation at centromeric minor satellite repeats during embryonic brain development, whereas Dnmt3a is not (Okano et al., 1999).

Targeted mutagenesis studies revealed how critical the Dnmts are in the CNS. Conditional deletion of Dnmt1 in CNS precursor cells, but not post-mitotic neurons, caused daughter cells to be severely hypomethylated (Fan et al., 2001). Interestingly, mice that had 30% of their CNS cells mutated showed selective pressure against the Dnmt-knockout cells in their brain. Three weeks after birth, all Dnmt-knockout cells were abolished. In adult forebrain neurons, double knockout of both Dnmt1 and Dnmt3a (but neither gene by itself) resulted in significantly smaller hippocampi and dentate gyrus brain regions, due to smaller neurons (Feng et al., 2010). These mice also showed impairments in learning and memory as well as inappropriate upregulation of immune genes associated with demethylation. These results suggest that Dnmt1 and Dnmt3a may have redundant roles in post-mitotic neurons.

Adding to the elaborate network of DNA methylation in the mammalian CNS, non-CpG (CpH) methylation has emerged and has been shown to be highly enriched in the brain, where it has critical roles (**Figure 1A**). CpG dinucleotides make up around 75% of total cytosine methylation, whereas CpH dinucleotides ('H' denotes adenine, thymine or cytosine) make up the remaining 25% (Guo et al., 2014). Interestingly, CpH methylation is enriched in low CpG dense regions, is associated with repressed gene expression, but is unassociated with protein–DNA interaction sites. As previously mentioned, Dnmt1 preferentially associates with CpG dinucleotides, and maintains symmetric CpG methylation on both strands of DNA during replication. This symmetric balance is further facilitated by the complementary base pairing (GpC). CpH methylation does not maintain the sequence symmetry and consequently during replication, CpH methylation is not conserved. This requires the re-establishment of CpH methylation after each cell division (Shirane et al., 2013). Re-establishment of CpH methylation has been linked to Dnmt3a gene expression (Xie et al., 2012; Shirane et al., 2013; Varley et al., 2013). In knockdown experiments, loss of Dnmt3a, but not Dnmt1 or Dnmt3b, resulted in reduced CpH methylation with no effect on CpG methylation (Guo et al., 2014).
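The strand symmetry argument above can be checked directly on a sequence. The Python sketch below is an illustrative example whose helper names (`reverse_complement`, `cytosine_contexts`, `is_strand_symmetric`) are assumptions introduced here. It labels every cytosine on a strand as CpG or CpH and tests whether a cytosine in the same dinucleotide context sits directly opposite on the complementary strand, which is the situation a maintenance enzyme acting on hemi-methylated DNA would rely on.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def cytosine_contexts(seq: str) -> dict:
    """Map the 0-based index of each cytosine to 'CpG' or 'CpH' on this strand."""
    seq = seq.upper()
    contexts = {}
    for i in range(len(seq) - 1):  # the last base has no 3' neighbour
        if seq[i] == "C":
            contexts[i] = "CpG" if seq[i + 1] == "G" else "CpH"
    return contexts


def is_strand_symmetric(seq: str, i: int) -> bool:
    """True if the cytosine at index i has a same-context partner on the other strand."""
    forward = cytosine_contexts(seq)
    reverse = cytosine_contexts(reverse_complement(seq))
    # The base paired with forward position i + 1 lies at reverse index len - 2 - i
    partner = len(seq) - 2 - i
    return i in forward and reverse.get(partner) == forward[i]


if __name__ == "__main__":
    example = "ACGTCAG"
    print(cytosine_contexts(example))
    print(is_strand_symmetric(example, 1), is_strand_symmetric(example, 4))
```

Every CpG has a CpG partner on the opposite strand, while a CpH site never does, because the base opposite its 3' neighbour cannot be a cytosine. This is only a sequence-level illustration; it does not model enzyme kinetics or sequence preferences.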

Like CpG methylation dynamics in early development, CpH methylation levels change during development. CpH methylation has been shown in relatively high abundance in stem cells (Lister et al., 2009; Laurent et al., 2010) and found to be enriched in both adult mouse and human brain tissues (Xie et al., 2012; Lister et al., 2013; Varley et al., 2013). A recent study showed that CpH methylation accumulates in the frontal cortex of the brain early after birth through adolescence and then slightly diminishes during aging (Lister et al., 2013). Different subclasses of neurons have unique CpH and CpG methylomes and CpH methylation may correlate more robustly with gene expression as compared to CpG methylation (Mo et al., 2015).

#### Methyl-Binding Proteins

After the establishment of DNA methylation marks by "writers," a subset of proteins with methyl binding abilities known as "readers" can bind, protect and interpret these marks and facilitate function (**Figure 1A**). There are two main classes of methyl-CpG-binding proteins that have been thoroughly reviewed elsewhere (Ballestar and Wolffe, 2001), so this review will briefly discuss methyl-CpG-binding domain (MBD) proteins and MeCP2. Both protein families, for the most part, selectively bind to methylated DNA and aid in transcriptional repression (Hendrich and Bird, 1998). MeCP2 can facilitate gene repression by recruiting histone deacetylase (HDAC) machinery that further remodel the chromatin environment, facilitating a repressed state (Jones et al., 1998; Nan et al., 1998; Fuks et al., 2003). Later it was found that MeCP2 could also bind to non-CpG methylation modifications (Mellen et al., 2012; Guo et al., 2014; Gabel et al., 2015).

Methyl-binding proteins are ubiquitously expressed in somatic cells, but are particularly enriched in the mammalian CNS (Hendrich and Bird, 1998; Nan et al., 1998; Shahbazian et al., 2002; Cassel et al., 2004; Mullaney et al., 2004). Several studies have found that MeCP2 is involved in the regulation of brain-derived neurotrophic factor (BDNF), which promotes neuronal maturation (Chen et al., 2003; Martinowich et al., 2003). Additionally, MeCP2 was found to regulate a maternally imprinted gene called Dlx5 that is part of the gamma-aminobutyric acid (GABA) pathway for inhibitory GABAergic neurons (Horike et al., 2005). Importantly, mutations in the MBD of MeCP2 have been implicated in the X-linked neurodevelopmental disorder known as Rett syndrome (Amir et al., 1999).

In addition to MeCP2, there are four other mammalian MBD proteins. MBD1-3 are known for their roles in transcriptional repression, whereas MBD4 functions as a thymine glycosylase in the mismatch repair pathway (Fujita et al., 2003). The MBD proteins can repress gene expression in several ways. One is through the recruitment of the H3K9 methyltransferase Suv39h1 and heterochromatin protein 1 (HP1). Both Suv39h1 and HP1 interact with MBD1 and aid in the establishment and maintenance of a repressive chromatin state which is further facilitated by the recruitment of both HDAC1 and HDAC2 (Fujita et al., 2003). During the S phase of DNA replication, regions of the chromosome that are repressed by DNA methylation or histone modifications must be maintained. MBD1 forms an S phase specific complex with the H3K9 methyltransferase SETDB1, and then associates with chromatin assembly factor (CAF-1) to help maintain a repressed chromatin state (Sarraf and Stancheva, 2004). MBD2 and 3 were found to be in the nucleosome remodeling and histone deacetylation (NuRD) complex, further linking DNA methylation to histone modifications and chromatin remodeling enzymes (Zhang et al., 1999). Although MBD3 cannot bind methylated DNA, it was found to mediate the association between metastasis-associated protein 2 (MTA2), a MBD-containing protein, and the HDAC core of the NuRD complex. MBD2 is thought to direct the NuRD complex to methylated DNA and aid in the maintenance of a repressed environment.

Very little work has been done to identify functions of MBD1-3 in the CNS. Mice with a loss-of-function MBD1 gene showed normal development, but as adults exhibited deficits in neurogenesis, impaired spatial learning and reduced long-term potentiation in the dentate gyrus (Zhao et al., 2003). Additionally, MBD1 was most enriched in hippocampus. During early embryogenesis, MBD3 was found to be highly expressed in the developing brain compared to MBD2 expression (Jung et al., 2003). In addition, in the adult brain, MBD3 is highly expressed in hippocampal and cortex neurons, but has very little expression in the outer cortical layer. Based on overall brain region enrichment patterning, it appears that the MBD proteins have some role in adult neurogenesis, but to what extent is unknown.

#### DNA DEMETHYLATION

#### Mechanism of DNA Demethylation

The mammalian genome undergoes genome-wide passive and active DNA demethylation processes during early embryogenesis and in the germline (Monk et al., 1987; Kafri et al., 1992; Tada et al., 1998). During passive demethylation, there is either a lack or an inhibition of Dnmt1, preventing the replacement of methyl marks (Howlett and Reik, 1991; Mertineit et al., 1998; Rougier et al., 1998; Howell et al., 2001). Furthermore, Dnmt1 is unable to recognize and bind to unmethylated DNA (Valinluck and Sowers, 2007); rather, it prefers to bind to hemi-methylated DNA. The precise molecular events of active DNA demethylation were not elucidated until 2009 when two seminal studies identified the presence of 5-hydroxymethylcytosine (5hmC) in the mammalian genome (Kriaucionis and Heintz, 2009; Tahiliani et al., 2009). Tahiliani et al. (2009) discovered that Ten-Eleven Translocation 1 (TET1) could oxidize the methyl group on 5mC to generate 5hmC (**Figure 1A**). Subsequent studies further identified TET2 and TET3 proteins as additional "erasers" of 5mC (Ito et al., 2010). 5hmC can be further oxidized by all TETs to form 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC) (He et al., 2011; Ito et al., 2011). In addition, 5hmC can be converted to 5-hydroxymethyluracil (5hmU) via the activation-induced cytidine deaminase (AID) and apolipoprotein B mRNA-editing catalytic polypeptides (APOBEC) enzymes (Bhutani et al., 2011). All three of these derivatives (5fC, 5caC, and 5hmU) can be cleaved by thymine-DNA glycosylase (TDG), which excises the modified base, allowing the base excision repair (BER) pathway to return it to an unmodified cytosine base (Bhutani et al., 2011; He et al., 2011). Contrary to the previous belief that the accumulation of 5hmC was solely dependent on TET activity on 5mC, recent work has suggested that Dnmt1 and Dnmt3a drive the initial accumulation of 5hmC in the early mouse zygote stage (Amouroux et al., 2016).
Knockout models and small molecule inhibitor studies were able to uncouple the formation of 5hmC from 5mC in the paternal pronucleus. This suggests that 5hmC could itself be an independent epigenetic modification.
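
The conversions described in this section can be summarized as a small directed graph. The following Python sketch is illustrative only: the `PATHWAY` dictionary and the `routes` helper are hypothetical names introduced here, and the graph encodes only the conversions named in the text (TET oxidation, AID/APOBEC deamination, and TDG excision followed by BER), not reaction rates or cellular context.

```python
# Hypothetical encoding of the active demethylation steps named in the text.
# Each entry maps a cytosine state to (enzyme, product) pairs.
PATHWAY = {
    "5mC": [("TET", "5hmC")],
    "5hmC": [("TET", "5fC"), ("AID/APOBEC", "5hmU")],
    "5fC": [("TET", "5caC"), ("TDG+BER", "C")],
    "5caC": [("TDG+BER", "C")],
    "5hmU": [("TDG+BER", "C")],
    "C": [],
}


def routes(start, goal, graph=PATHWAY):
    """Return every path of intermediates leading from start to goal."""
    if start == goal:
        return [[start]]
    found = []
    for _enzyme, product in graph[start]:
        # Extend each route from the product with the current state in front.
        for tail in routes(product, goal, graph):
            found.append([start] + tail)
    return found


if __name__ == "__main__":
    for path in routes("5mC", "C"):
        print(" -> ".join(path))
```

Under these assumptions, the sketch enumerates three routes from 5mC back to unmodified cytosine: through 5fC, through 5fC and 5caC, and through 5hmU.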

#### TET Enzymes

TET enzymes catalytically oxidize the methyl group on 5mC to form 5hmC. The TET protein family is made up of three members: TET1, 2 and 3 (**Figure 2B**). Each contains a core catalytic domain structured as a double-stranded β-helix (DSBH) fold (Iyer et al., 2009; Tahiliani et al., 2009). The TET proteins are distinguished from other members of the related TET J-binding protein (TET-JBP) families by the presence of a Cys domain, located N-terminal to the DSBH domain, that is thought to be essential for catalytic activity. TET1 and TET3 also contain a CXXC domain that allows them to associate with chromatin through its binding to methylated cytosines. During development, the TET proteins can elicit either an activating or a repressive response from the genes they control, depending on which cofactors associate with them. In ES cells, TET1 has a repressive role when bound to the promoter region because it recruits MBD3-NURD (Yildirim et al., 2011) and SIN3A (Deplus et al., 2013). On the other hand, TET2 is not able to recruit either repressive component and has been associated with activating cofactors such as Nanog and OGT (O-GlcNAc transferase) (Costa et al., 2013; Vella et al., 2013). In the male pronucleus, TET3 is responsible for the complete loss of 5mC and the accumulation of 5hmC, as shown by antibody staining and TET3 knockdown studies (Gu et al., 2011; Iqbal et al., 2011; Wossidlo et al., 2011).

# TET Enzymes in the CNS

Once it was discovered that TET enzymes were the long sought-after DNA demethylases (Iyer et al., 2009; Tahiliani et al., 2009), extensive efforts were made to understand the dynamics of the global demethylation events observed in early embryogenesis. The catalytic function of the TET enzyme family and their putative novel roles were yet to be discovered. Even after all the advancements made in the past 9 years, very little is known about the function of TET enzymes in the mammalian CNS. Although all three TET proteins are expressed in the brain, Tet2 and Tet3 have higher expression than Tet1 (Kriaucionis and Heintz, 2009; Szulwach et al., 2011; Hahn et al., 2013). Overexpression of Tet2 and Tet3 caused premature neuronal differentiation, whereas knockdown caused defects in differentiation progression (Hahn et al., 2013). Tet1 knockout studies have identified several neural activity-regulated genes that are downregulated. Animals with this knockout display abnormal hippocampal synaptic plasticity and impaired memory extinction (Rudenko et al., 2013). Intriguingly, Tet1 deletion did not appear to affect anxiety- or depression-related behaviors. Due to the embryonic lethality of Tet3 deletion in mice, determining its function in the adult brain has been challenging. Instead of knockout studies, several groups have utilized small hairpin RNAs (shRNAs) to conditionally inhibit Tet3 expression. A recent study demonstrated that knockdown of Tet3, and not Tet1, in the mouse infralimbic prefrontal cortex (ILPFC), a region of the brain associated with fear extinction learning, impaired the animals' ability to reverse a previously learned fear response (Li et al., 2014). Importantly, it was found that Tet3 mediates the drastic genome-wide redistribution of 5hmC in the ILPFC in response to extinction learning. This finding is clinically relevant because posttraumatic stress disorders and phobias have been associated with impairments in fear extinction learning (Orsini and Maren, 2012).

# Roles of 5hmC, 5fC, and 5caC in the CNS

As previously discussed, 5hmC is the immediate product of TET enzymes in the demethylation of 5mC. 5hmC levels are approximately 10 times higher in the brain than in ESCs (Tahiliani et al., 2009; Globisch et al., 2010; Song et al., 2011). Genome-wide analysis studies have demonstrated that 5hmC is dynamically regulated in human (Wang et al., 2012) and mouse brains during neurodevelopment and aging (Szulwach et al., 2011). Dot blot analysis on cerebellum DNA showed that 5hmC increased roughly 42% from fetal to adult brains. Furthermore, human 5hmC modifications were enriched at CpG islands and shores, exons and untranslated regions, consistent with 5hmC being associated with active genes. Notably, 5hmC has been found to be enriched at genes that are associated with ASDs. Differentially hydroxymethylated regions found in human fetal and adult cerebellum were more likely to localize on Fragile X mental retardation protein (FMRP) target genes (Wang et al., 2012). These pieces of evidence indicate key roles for 5hmC in the mammalian CNS. In addition to brain regions, some neuronal types have been found to contain high levels of 5hmC. For example, Purkinje neurons in the cerebellum were found to have roughly 40% more 5hmC relative to 5mC (Kriaucionis and Heintz, 2009). The enrichment of 5hmC in Purkinje neurons could reflect their biological function as neurons involved in motor control that require an active transcriptome. Locus-specific demethylation has been observed at the Bdnf locus. Bdnf is involved in adult neural plasticity and in learning and memory (West et al., 2001). When cortical and hippocampal neurons experience a depolarization event, the Bdnf promoter is activated, enhancing its transcription (Shieh et al., 1998; Tao et al., 1998). The depolarization was also found to correlate with a decrease in CpG methylation in the Bdnf regulatory region (Martinowich et al., 2003; Guo et al., 2011b).

Very little is known about the functional roles of 5fC and 5caC other than their roles in active demethylation and conversion back to an unmodified cytosine. Genome-wide profiling studies found an enrichment of 5fC at poised and active enhancers, but with a clear preference for poised enhancers (Song et al., 2013). A recent study examined the dynamics of 5fC and 5caC in embryonic day 11.5 mice through 15-week-old adult mice (Bachman et al., 2015). They found that 5fC could be detected throughout all of the developmental time points, while 5caC could not be detected. Interestingly, both 5fC and 5caC were found to induce pausing of RNA Pol II during elongation, where this effect was not observed at C, 5mC nor 5hmC bases (Kellinger et al., 2012). It is possible that TDG could be recruited to sites of paused RNA Pol II to initiate the BER mechanism. Interestingly, TDG is the only glycosylase that is required for embryonic development (Cortazar et al., 2011; Cortellino et al., 2011). Even more intriguing is that in ESCs, both 5fC and 5caC recruit more proteins than either 5mC or 5hmC (Spruijt et al., 2013). The recruited proteins mostly had functional roles in DNA damage response (such as Tdg and p53), and proteins involved in chromatin remodeling (such as BAF170) were also found to interact with them.

### HISTONE MODIFICATIONS

DNA is wrapped around a core histone octamer containing two copies each of the histones H2A, H2B, H3 and H4, forming a chromatin structure (Kornberg, 1974). The amino acids that make up the amino-terminal 'histone tails,' specifically lysines and arginines, are subject to modifications, such as methylation and acetylation, that can affect transcription (**Figure 1B**). Unlike DNA methylation, which has only three major methyltransferases, numerous histone methyltransferases and demethylases have been identified (Hyun et al., 2017). The potential cross-talk among histone methylation, DNA modifications, chromatin remodelers and regulatory RNAs adds another layer of complexity. These cross-talk events are thought to establish and maintain the local chromatin environment as well as help cells "remember" their differentiated state (Cedar and Bergman, 2009; Jobe et al., 2012). Several mechanisms facilitate this cross-talk, such as DNMT3L and methyl-binding proteins like MeCP2 and MBD2, but we will focus in detail on the Polycomb (PcG) repressive proteins and the Trithorax (TrxG) activating proteins (**Figure 1B**). These two groups of proteins antagonistically regulate genes that are critical for development and cell differentiation pathways (Schwartz and Pirrotta, 2008). The proteins encoded by PcG and TrxG genes form large complexes that maintain the local chromatin environment in a repressed or an active state, respectively (Locke et al., 1988; Franke et al., 1992).

#### Polycomb Group Proteins

The PcG proteins are divided into two major multiprotein complexes: polycomb repressive complexes 1 and 2 (PRC1 and PRC2) (Shao et al., 1999). Both complexes contain a core set of proteins critical for their basic function and can incorporate accessory proteins, permitting the complex to act in a spatiotemporal manner. There are four core proteins that are present in all PRC2 complexes: the SET domain contained in the enhancer of zeste [E(z), EZH1, and EZH2] protein, extra sex combs (Esc, EED) proteins, suppressor of zeste 12 [Su(z)12, SU(Z)12] and the histone binding protein p55 (RBAP48 and RBAP46) (Ng et al., 2000; Tie et al., 2001; Kuzmichev et al., 2002). The SET domain within E(Z) is responsible for the lysine methyltransferase activity specifically occurring on histone 3 at lysine 27 (H3K27) (Cao et al., 2002). PRC1 is also composed of a set of four major core proteins including polycomb (Pc), polyhomeotic (Ph), posterior sex combs (Psc) and Sex combs extra (Sce/dRing 1) (Shao et al., 1999). The chromodomain in Pc is responsible for recognizing and binding trimethylated H3K27 (H3K27me3) and upon binding will induce structural changes in the chromatin (Fischle et al., 2003; Min et al., 2003). In addition, PRC1 is also responsible for the monoubiquitination of lysines on histone H2A via the proteins Ring1A/B (de Napoles et al., 2004).

#### Trithorax Group Proteins

Antagonistic to the PcG proteins, the TrxG proteins are recognized for their activating mechanisms and addition of histone 3 lysine 4 trimethylation (H3K4me3). TrxG proteins are also evolutionarily conserved and are categorized into three groups based on their function. Group one is composed of the SET-domain-containing proteins that methylate histone tails, group two contains ATP-dependent chromatin remodeling proteins and group three contains the TrxG proteins that can bind DNA in a sequence-specific manner. Each of these groups is thoroughly reviewed elsewhere (Schuettengruber et al., 2011). One of the first SET-domain-containing histone modifying complexes identified that could catalyze mono-, di-, and trimethylation on H3K4 was a complex called COMPASS in yeast (Miller et al., 2001; Roguev et al., 2001). Mammals have six COMPASS-like complexes that have been shown to facilitate most H3K4me3 present, indicating that they are likely involved in global gene activation (Wu et al., 2008).

# PcG and TrxG Proteins in the CNS

In the mammalian CNS, both PcG and TrxG proteins help to regulate the differentiation process of neuronal cells. In ESCs, polycomb proteins prevent neuronal differentiation by adding H3K27me3 repressive marks at neuronal-specific genes such as Ngns, Pax6 and Sox1 (Bernstein et al., 2006; Mikkelsen et al., 2007). However, these genes simultaneously contain the active trithorax H3K4me3 mark, making these promoters bivalent. As ESCs differentiate into NPCs, the H3K27me3 polycomb mark is removed specifically by the histone demethylase Jmjd3 to further commit them to a neural lineage (Burgold et al., 2008). In addition to histone demethylation, activation of the TrxG COMPASS-like complex proteins RBBP5 and DPY30 is essential for the differentiation of ESCs into NPCs (Jiang et al., 2011). In NPCs, the PRC2 subunit Ezh2 is initially highly expressed, but declines during cortical neuron differentiation (Pereira et al., 2010). The loss of Ezh2 was shown to augment neurogenesis and neuronal differentiation. PcG complexes have also been associated with differentiation of NPCs to astrocytes (Hirabayashi et al., 2009) and oligodendrocytes (Sher et al., 2008). As the brain develops, NPCs can travel up and outward to form the outer layers of the brain. One study demonstrated that Ezh2 silences genes associated with neuron migration, such as Netrin1, to maintain correct migration patterns throughout the brain (Di Meglio et al., 2013).

Furthermore, several studies have demonstrated the importance of cross-talk between DNA methylation and histone modifications during mammalian brain development (Wu et al., 2010a; Hahn et al., 2013). As previously described, during neurogenesis as NPCs begin to differentiate, there is an increase in 5hmC specifically in gene bodies of developmentally active genes with little change in 5mC. Accompanying this increase, there is also a decrease in Polycomb-mediated repression and H3K27me3 formation (Hahn et al., 2013). Overexpression of Tet2 and Tet3, both of which are highly expressed in the embryonic cortex, prompted early differentiation of NPCs. An analogous and more obvious transition was seen when Ezh2 was also depleted. Moreover, when Tet proteins were inhibited and Ezh2 overexpressed, NPCs failed to differentiate. This suggests that Polycomb may regulate the transition of NPC differentiation, and Tet proteins putatively maintain the differentiated state. Additionally, it has been demonstrated that there is an inverse association of Dnmt3a de novo methylation on non-promoter CpGs and H3K27me3 formation in the mouse brain (Wu et al., 2010a). Mice deficient for Dnmt3a had an increase of H3K27me3 as well as increases of PRC2 components Suz12 and Ezh2 at Dnmt3a targets. As previously discussed, Dnmt3a has more of a role in DNA methylation maintenance in postnatal development. The proposed cross-talk suggests that in addition to methylating promoters of self-renewal genes in NPCs, Dnmt3a also has an activating function by inducing transcription of mature neural genes by downregulating H3K27me3 and antagonizing PRC2 binding.

# Histone Acetylation

Methylation is just one type of modification that can be present on histone tails; acetylation is a second type of modification that also regulates chromatin dynamics. Histone acetyltransferases (HATs) and HDACs are enzymes that add or remove acetyl groups on lysines, respectively (Inoue and Fujimoto, 1970; Racey and Byvoet, 1971) (**Figure 1B**). Core histones are acetylated by transcriptional coactivators like CBP/p300 that are ubiquitously expressed and involved in cell cycle control, differentiation and apoptosis (Yang et al., 1996). HATs can be divided into three families based on the structure of their catalytic domains: GNAT, MYST and CBP/p300, which are reviewed elsewhere (Sterner and Berger, 2000; Kouzarides, 2007). Consistent with their activating role, HATs interact with various transcription factors to promote many signaling cascades (Saha and Pahan, 2006). Similar to methylation, acetylation is reversible and is removed by HDACs, which silence gene expression. HDACs can be categorized into four distinct classes, of which class 1 and class 2 HDACs seem to have important roles in the nervous system (Gray and Ekstrom, 2001; Abel and Zukin, 2008). Inhibitors of HDACs have shown promising effects in treating both neurodegenerative and neuropsychiatric diseases. It has been demonstrated that HDAC inhibitors could re-establish histone acetylation that is potentially lost due to dysregulation of the HAT Tip60 (Cao and Sudhof, 2001). Furthermore, inhibition of HDACs restored learning and memory in a mouse model of neurodegeneration (Fischer et al., 2007). In Fragile X studies, combined administration of 5-azadeoxycytidine and various HDAC inhibitors caused reactivation of FMR1 gene expression (Chiurazzi et al., 1998). In the mouse brain, Hdac3 deletion provoked abnormal locomotor coordination, sociability and cognition (Nott et al., 2016).
Interestingly, a cross-talk between HDAC3 and MeCP2 was shown to positively regulate neuronal genes by deacetylating FOXO, a transcription factor that is highly expressed in the hippocampus. A putative link for this cross-talk in relation to Rett syndrome is discussed below.

#### CHROMATIN REMODELING

The total length of DNA in one mammalian cell is on average 2 meters, yet the nucleus is only about 6 µm in diameter. In order to fit the entire genome into such a limited space, DNA molecules have to undergo extraordinary consolidation by a process termed chromatin remodeling. In addition to histones, a major contributor to chromatin compaction is a family of ATP-dependent remodeling proteins. The BAF (mammalian SWI/SNF) complex is a multisubunit chromatin remodeling complex that uses energy from ATP to modify the chromatin landscape and promote cell differentiation (Son and Crabtree, 2014) (**Figure 1C**). BAF complexes exist in a highly spatiotemporally specific fashion. For example, in the mammalian CNS, there are developmental stage-specific BAF complexes in ESCs (Kaeser et al., 2008), NPCs and in post-mitotic neurons (Lessard et al., 2007). A unique feature of BAF complexes is that the alternative subunits that make up the various stage-specific complexes are not interchangeable, indicating that their functions are non-overlapping (Wang et al., 1996a,b). Interestingly, BAF complexes are being increasingly associated with neuropsychiatric diseases such as ASD (Neale et al., 2012; O'Roak et al., 2012) and schizophrenia (Koga et al., 2009).

#### BAF Chromatin Remodelers

The ESC specific BAF (esBAF) contains the ATPase BRG1, BAF250a, BAF60a/b and BAF155 (Kaeser et al., 2008). Deletion of any of the core subunits results in a lethal phenotype (Bultman et al., 2000). For example, shRNA depletion of Brg1 impairs self-renewal properties of ESCs and results in loss of key ESC markers such as Oct4, Sox2 and Nanog (Ho et al., 2009). In addition, deletion of Brg1 also resulted in an increase of the PRC2 recruitment and subsequently, H3K27me3 repression at active ESC genes (Ho et al., 2011). All this evidence suggests that esBAF maintains a euchromatic environment that is required to maintain the pluripotency of ESCs.

The transition from esBAF to neural progenitor BAF (npBAF) is associated with the replacement of BAF155 with BAF170 (Ho et al., 2009; Tuoc et al., 2013). npBAF is composed of a combination of either ATPase BRG1 or BRM along with several other BAF subunits. Similar to esBAF, npBAF is critical for the self-renewal properties of NPCs, and loss of Brg1 produces phenotypes similar to those seen in ESCs. Interestingly, BAF170 was shown to interact with the transcription factor Pax6, whose primary function is to regulate neural progenitor division during early cortical development (Gotz et al., 1998). Upon BAF170 binding to Pax6, the transcriptional repressor REST (RE1 silencing transcription factor, also known as NRSF) is recruited, and represses Pax6 in non-neuronal radial glia cells (Tuoc et al., 2013). A conserved, 23 base pair sequence known as RE1 (repressor element 1, also known as NRSE) acts as the binding site for REST (Chong et al., 1995; Schoenherr and Anderson, 1995; Chen et al., 1998). Two corepressors are required for REST-mediated silencing: Sin3-HDAC and the CoREST protein complex that contains HDACs (Andres et al., 1999; Grimes et al., 2000). Additionally, it was shown that CoREST interacts with BAF57, a subunit present in all stage-specific complexes, to induce long-term silencing (Battaglioli et al., 2002). BAF170 is present in the subset of radial glia cells that are destined to be non-neuronal, and absent in radial glia cells destined to become intermediate progenitors that migrate outward to form the outer cortex layer (Andres et al., 1999; Grimes et al., 2000; Tuoc et al., 2013).

The replacement of BAF53a with BAF53b, SS18 with CREST and BAF45a/d with BAF45b/c marks the transition from npBAF to the mature neuron (nBAF) complex (Olave et al., 2002). Importantly, the nBAF subunits are exclusive to neuronal cells and maintain the chromatin environment of post-mitotic neurons (Olave et al., 2002; Naik et al., 2007). nBAF, in complex with CREST, is essential in regulating dendritic outgrowth (Wu et al., 2007). Normal brain function depends on correct wiring and synaptic function, both of which require adequate dendritic outgrowth. Calcium signaling in the CNS can activate calcium-mediated transcription factors, such as CREST, to promote the activation of genes required for dendrite growth (Aizawa et al., 2004).

# REGULATORY RNA

An emerging field in epigenetics focuses on deciphering the large amount of non-protein-coding DNA contained in the mammalian genome. Over the past 20 years, scientists have begun to discover that non-coding is not equivalent to nonfunctional. When transcribed, these regions generate non-coding RNA (ncRNA) that can range in size from just ∼21 nucleotides to 100,000 nucleotides and can post-transcriptionally regulate mRNA. Many classes of ncRNAs have been identified (Cech and Steitz, 2014); however, this review will briefly cover miRNA and lncRNA and the putative functions they may serve in the mammalian CNS.

#### MicroRNAs

MicroRNAs are roughly 22 nucleotides in length and have major roles in post-transcriptionally regulating gene expression by destabilizing their target mRNA (Bartel, 2004). Partial sequence complementarity to the 3′ untranslated region (3′UTR) of the target is adequate for gene downregulation (Lewis et al., 2005). Perfect complementarity is required at what is called the "seed sequence" in the 5′ region of the miRNA. Interestingly, a single miRNA can target hundreds of different mRNAs, and a single mRNA can be targeted by more than one miRNA (Lim et al., 2005). Determining functional roles for the hundreds of miRNAs discovered has eluded scientists for years. Early studies proposed that miRNAs had extensive roles during mammalian brain development, and several of these studies identified neural-specific miRNAs (Krichevsky et al., 2003; Kim et al., 2004; Miska et al., 2004; Sempere et al., 2004). Of the neural-specific miRNAs identified, one in particular stands out: miR-124. miR-124 is the most abundant and highly conserved miRNA found in the mammalian brain (Lagos-Quintana et al., 2002). Accounting for nearly 25–48% of all the miRNA in the brain, miR-124 has been implicated as a major contributor in neuronal differentiation and maturation (Krichevsky et al., 2006; Makeyev et al., 2007). For example, the direct targeting and repression of the RNA binding protein PTBP1 by miR-124 has critical roles in non-neuronal cell development (Makeyev et al., 2007) (**Figure 1D**). PTBP1 is highly expressed in non-neuronal cells and inhibits alternative splicing of neuron-specific genes (Wagner and Garcia-Blanco, 2001; Sharma et al., 2005). In cells destined to become neurons, miR-124 binds and represses PTBP1, resulting in increased expression of PTBP2, the neuronal homolog of PTBP1, which induces neuron-specific alternative splicing.
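
The seed-pairing requirement can be illustrated with a short string search. This is a minimal sketch, assuming the common convention that the seed spans nucleotides 2 to 8 counted from the 5′ end of the miRNA (a convention not stated in the text above). The sequences are toy examples rather than annotated transcripts, and `seed_sites` is a hypothetical helper, not a published target-prediction tool.

```python
# Watson-Crick pairing for RNA; G:U wobble pairs are ignored in this sketch.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq):
    """Return the reverse complement of an RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def seed_sites(mirna, utr, seed_start=2, seed_end=8):
    """Return 0-based UTR positions that pair perfectly with the miRNA seed.

    seed_start and seed_end are 1-based, inclusive positions in the miRNA
    (assumed here to be nucleotides 2-8).
    """
    seed = mirna[seed_start - 1:seed_end]
    site = reverse_complement_rna(seed)  # sequence expected in the target
    return [
        i
        for i in range(len(utr) - len(site) + 1)
        if utr[i:i + len(site)] == site
    ]


if __name__ == "__main__":
    toy_mirna = "UAAGGCACGCGGUGAAUGCC"     # toy miRNA, written 5' to 3'
    toy_utr = "AAAGUGCCUUAAACCCGUGCCUUGG"  # toy 3'UTR fragment, 5' to 3'
    print(seed_sites(toy_mirna, toy_utr))
```

Real target prediction also weighs site context, conservation and pairing outside the seed, none of which is modeled in this sketch.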

Another brain-enriched miRNA, miR-137, is thought to have roles in both adult neurogenesis and neuronal maturation. During adult neurogenesis, miR-137 regulation of proliferation versus differentiation is coupled with its ability to cross-talk with MeCP2 and Ezh2 (Szulwach et al., 2010). Roughly 2–4 Kb upstream of miR-137, methylated CpGs were found as well as a threefold enrichment of MeCP2 binding. Subsequently, it was found that Sox2 also binds upstream of miR-137, and concurrent binding of Sox2 with MeCP2 inhibited miR-137. When miR-137 expression is reduced, there is an increase in neuronal differentiation and a decrease in adult neural stem cell proliferation. This is concurrent with a previous observation that miR-137 expression increases during neuronal differentiation (Silber et al., 2008). The polycomb protein Ezh2 was found to be a direct target of miR-137 in vitro (Szulwach et al., 2010). miR-137 reduces the expression of Ezh2, and consequently there is also a decrease in H3K27me3. Loss of H3K27me3 encourages adult stem cells to begin to differentiate rather than proliferate. With regard to neuropsychiatric disorders, genome-wide association studies (GWAS) identified miR-137 as one of the factors most strongly associated with schizophrenia (Schizophrenia Psychiatric Genome-Wide Association Study (GWAS) Consortium, 2011; Kwon et al., 2013; Schizophrenia Working Group of the Psychiatric Genomics Consortium, 2014). Intriguingly, four targets of miR-137 were also found to be highly associated with schizophrenia (Kwon et al., 2013); however, the biological impact of miR-137 in schizophrenia still remains to be explored.

#### Long Non-coding RNAs

Long non-coding RNAs are classified as having at least 200 nucleotides and no protein-coding ability (Kapranov et al., 2007). They are also one of the least well understood classes of ncRNAs because of the difficulty in distinguishing them from transcription by-products. Compositionally, lncRNAs do not appear to be very well conserved between mouse and human (Pang et al., 2006). In the mouse genome, the vast majority of lncRNAs do not contain an open reading frame (Ravasi et al., 2006). In addition, compared to protein-coding transcripts, lncRNAs tend to be shorter and contain fewer introns. Unusually, some lncRNAs, such as the paternally imprinted lncRNA H19, are polyadenylated, spliced and exported to the cytoplasm just like protein-coding transcripts (Brannan et al., 1990). Functional roles of lncRNAs may depend on where in the genome they are located. Those that are transcribed near expressed genes have the potential to regulate the expression of those genes in cis. One of the most well-studied lncRNAs is Xist, which functions in cis and is critical for inactivating one of the X chromosomes in mammalian females (Brockdorff et al., 1991). As Xist coats the X chromosome, other repressive factors are recruited, such as Polycomb repressive complexes PRC1 and PRC2 and other histone modifying enzymes (Plath et al., 2003; Silva et al., 2003; de Napoles et al., 2004). lncRNAs have also been demonstrated to regulate transcriptional repressors and activators from a distance (in trans). The HOTAIR lncRNA is 2.2 Kb in length, and was shown to repress the transcription of 40 Kb of the HOXD locus (Rinn et al., 2007). It is proposed that HOTAIR interacts with PRC2 to facilitate H3K27me3 of the HOXD locus because siRNA-mediated knockdown of HOTAIR resulted in the loss of H3K27me3 marks specifically at HOXD.
Beyond chromatin remodeling, other putative functions for lncRNAs have been suggested, such as transcriptional control and post-transcriptional processing, which are reviewed in detail elsewhere (Mercer et al., 2009; Ponting et al., 2009).

The role of lncRNAs in chromatin remodeling has been extensively studied, and the scientific community is just starting to make strides in investigating their roles in the brain (reviewed by Ng et al., 2013). lncRNAs have been found in many tissues (Iyer et al., 2015), but are strikingly enriched in the mammalian brain. One study identified over 800 lncRNAs in the mouse brain, and found that most were associated with specific brain regions, cell types or subcellular compartments, suggesting some putative function (Mercer et al., 2008). One of the better studied lncRNAs in the brain is Malat1 (also known as NEAT2), which is particularly enriched in neurons (Bernard et al., 2010; Lipovich et al., 2012). Malat1 localizes to nuclear speckles, which are storage/assembly sites for processing factors involved in pre-mRNA splicing (Lamond and Spector, 2003; Hutchinson et al., 2007; Clemson et al., 2009). Studies demonstrated that Malat1 recruits SR splicing factors to nuclear speckles and can regulate genes involved in neural processes and synaptic function (Bernard et al., 2010). Importantly, Malat1 was shown to have 90% conservation between human and mouse, suggesting maintenance of a critical function. A recent computational study utilized RNA-seq data from mouse embryonic brains to identify temporally regulated lncRNAs in brain development. Interestingly, lncRNAs specifically expressed in embryonic brains were no longer expressed in adult brains (Lv et al., 2013). Another study employed RNA-seq on human iPSCs to investigate the expression of lncRNAs during their differentiation into mature neurons (Lin et al., 2011). Several of the lncRNAs that were aberrantly regulated during differentiation were associated with candidate genes of neuropsychiatric disorders, such as ASDs, bipolar disorder and schizophrenia. Much research is being conducted on identifying and determining functional roles of the ever-growing list of lncRNAs; however, more work remains to be done.

To add another layer of complexity, different groups of noncoding RNAs have been found to cross-talk with each other and form regulatory networks in the brain (Kleaveland et al., 2018). A recent study found that in mouse brain, the lncRNA Cyrano destabilizes miR-7 through its highly complementary site for miR-7. Degradation of miR-7 promoted the accumulation of a circular RNA Cdr1as, which is known to dampen neuronal activity (Memczak et al., 2013; Piwecka et al., 2017). Interestingly, Cdr1as contains an inherent destruction mechanism where binding of miR-671 induces its slicing (Kleaveland et al., 2018). It has been proposed that because the binding sites for miR-7 and miR-671 are so close on Cdr1as, cooperative binding could recruit a silencing complex and control the accumulation of Cdr1as in the brain (Grimson et al., 2007; Saetrom et al., 2007).

Proper epigenetic regulation is critical for normal brain development and function. Considerable evidence suggests that its dysregulation could play a causal role in the onset of neurological, neurodegenerative and neuropsychiatric disorders. In the following sections, we will focus on several neuropsychiatric disorders with known roles of epigenetic regulation in their etiology and progression (**Table 1**).

#### TABLE 1 | Summary table of epigenetic processes that can occur in various neuropsychiatric diseases.

#### MAJOR DEPRESSIVE DISORDER

Individuals with MDD present clinically not only with a depressed mood, but may also suffer from anhedonia, dysregulated appetite and sleep, fatigue, poor concentration and suicidal ideations or acts (Belmaker and Agam, 2008). In the United States, the incidence of depression in women is greater than 20%, nearly twice that of men (Kessler et al., 2003). Twin studies have suggested that MDD has a heritability of about 37% (Sullivan et al., 2000). However, MDD is not monogenic, but rather caused by many genes, each contributing only a small proportion of the risk. Environment, such as early life stress or trauma, is a major risk factor. Many studies have tried to identify biomarkers to assess a patient's predisposition for MDD; however, no useful biomarkers have yet been identified. Furthermore, many individuals with MDD are resistant to treatments, and so developing a greater understanding of the neurological facets of MDD has become paramount to the creation of efficacious therapies.

#### Differential DNA Methylation

A major candidate gene for MDD is BDNF. Individuals with MDD show reduced BDNF protein, and multiple studies have associated this reduction with increased methylation of the BDNF promoter in peripheral blood cells (Angelucci et al., 2005). BDNF has two small CpG islands upstream of exons 1 and 4. One study found that the methylation status of exon 1 in BDNF could be used to accurately distinguish between MDD patients and healthy controls. Remarkably, the depressed patients consistently showed a complete absence of methylation at certain CpG sites in exon 1 (Fuchikami et al., 2011). Because this study was based on only a small number of participants, it would be worth investigating whether these findings can be replicated in larger populations. Absence of methylation at one particular CpG site in exon 4 of BDNF has been associated with reduced response to antidepressant drugs (Tadic et al., 2014). While antidepressants showed no effect on exon 4 methylation, in vitro experiments established that antidepressants could regulate the promoter activity of BDNF. Furthermore, antidepressants have been shown to increase BDNF expression in mice by phosphorylation of MeCP2, which causes the removal of MeCP2 from the DNA (Hutchinson et al., 2012). BDNF exon 4 methylation levels and circulating BDNF protein together may predict a patient's treatment response (Lieb et al., 2018). These findings collectively suggest that BDNF methylation levels may be a useful biomarker and tool to make more informed choices about individual therapies. Another well-studied biological factor in MDD is the serotonin transporter gene SLC6A4. SLC6A4 methylation correlates with depression in a variety of ways. For example, in an analysis of individuals with MDD, those who had a family member with depression showed a higher percentage of SLC6A4 methylation, indicating that epigenetic regulation of this locus may be related to depression heritability.
In mother–child pairs that were concordant for depression, increased methylation of the SLC6A4 promoter was seen in both mother and child (Mendonca et al., 2019).

#### Disruption of DNA Methylation From Environmental Stressors

Stressful, traumatic events in early life are a major environmental risk factor for MDD, and changes in stress-related genes may be part of the mechanism of depression for some individuals. The glucocorticoid receptor gene, NR3C1, plays an important role in the hypothalamic–pituitary–adrenal (HPA) axis, a stress response system that becomes dysregulated in MDD. Exon 1F of NR3C1 has been extensively studied with regard to its role in early life adversity (Daskalakis and Yehuda, 2014), and has been the focus of many depression studies as well. Individuals with MDD show hypermethylation of NR3C1 exon 1F, which correlated with morning cortisol levels (Farrell et al., 2018). In adolescent males, increased NR3C1 exon 1F methylation was associated with stressful experiences such as being bullied and lacking friends, and with internalizing symptoms, as assessed by a depression scale (Efstathopoulos et al., 2018). Polymorphisms of the glucocorticoid receptor co-chaperone protein, FK506 binding protein 5 (FKBP5), have also been associated with MDD. Interestingly, methylation of certain CpG sites of FKBP5 intron 7 significantly correlated with early life adversity in MDD patients (Farrell et al., 2018). Thus, the connection between many MDD cases and early life trauma involves disruption to the stress response system at an epigenetic level. It is plausible to imagine potential pharmacotherapies that could target methylation of key genes in this system to help restore balance in the HPA axis, and thus attenuate MDD symptoms. Whether targeting HPA axis genes alone would be enough to improve MDD remains to be understood.

Several studies have linked clustered Pcdhs to depression-like behaviors. A rat model of depression revealed that Pcdhga11 expression levels were increased in the hippocampus (Garafola and Henn, 2014), suggesting that Pcdhga11 could be used as a putative biomarker. In contrast to early life stressors, which epigenetically alter the HPA axis, positive early-life parental interactions can epigenetically alter genes that promote neuronal function. For example, adult mice that received good maternal care (high licking) showed increased histone acetylation and DNA methylation in exons of Pcdh genes. There was also reduced methylation at their promoters, increasing overall expression of Pcdh genes (McGowan et al., 2011).

#### HDAC Inhibitors as a Putative Antidepressant

Histone deacetylases are a promising target for MDD therapies. Mouse behavioral paradigms, such as chronic social defeat stress, have been relied upon as a way to measure antidepressant efficacy (Yin et al., 2016). In mice that have experienced chronic social defeat stress and in postmortem brains from humans with clinical depression, HDAC2 protein is reduced in the nucleus accumbens (NAc; a brain region associated with reward) (Covington et al., 2009). Hdac5 expression is also reduced in the NAc of chronically stressed mice, and this expression is restored and further increased with antidepressant treatment. Consistent with this, mice lacking Hdac5 exhibit enhanced depressive-like behaviors in response to chronic stress (Renthal et al., 2007). In the hippocampus, however, chronically stressed mice have increased Hdac5, and this can be reversed by antidepressant administration (Tsankova et al., 2006). It is no surprise then that HDAC inhibitors, which have been commonly used as anti-cancer agents, are now also being studied for their antidepressant actions (Eckschlager et al., 2017). For example, MS-275 delivery to the hippocampus reverses anhedonia and reduces social avoidance in mice that experienced chronic social defeat stress (Covington et al., 2011). While HDAC inhibitors remain strong candidates for potential therapeutics in humans, evidence that the findings from mouse studies translate to humans is currently lacking. A remaining gap in this research is determining whether HDAC expression in one particular brain region drives MDD and, if so, whether there are therapeutics that can regulate it.

#### MicroRNAs in MDD

Several studies have begun to look at miRNAs as putative peripheral biomarkers for MDD. Remarkably, evidence indicates that, under certain conditions, miRNAs expressed in the brain can cross the blood–brain barrier and circulate in the plasma (Sheinerman and Umansky, 2013). In patients with MDD, BDNF levels were found to be decreased in plasma (Molendijk et al., 2014). More importantly, two miRNAs known to interact with BDNF have also been found in the plasma of MDD individuals (Fang et al., 2018). This study compared the levels of BDNF, miR-132 and miR-124 in MDD patients who were either treated or not treated with citalopram with those in healthy controls. It was found that miR-132 was highest in non-treated MDD patients relative to treated patients and controls, suggesting that miR-132 could be used as a potential biomarker for MDD. Notably, miR-132 is the only miRNA that has been consistently identified in several MDD studies (Yuan et al., 2018). Additionally, MDD patients had higher levels of miR-124, with citalopram-treated patients having the largest increase (Fang et al., 2018). Conflicting evidence has been reported regarding how reliable miR-124 plasma expression levels are as an MDD biomarker (Bocchio-Chiavetto et al., 2013; He et al., 2016). Many other prospective miRNA biomarkers have been proposed (Lopez et al., 2018; Yuan et al., 2018); however, much work remains to validate whether any of these biomarkers can be used reliably.

Antidepressant drugs are the most common treatment for individuals with MDD; however, many patients do not respond to them. An interesting area of research focuses on how miRNAs can help predict patient response to antidepressants. Selective serotonin reuptake inhibitors (SSRIs) are a commonly prescribed class of antidepressants that target the serotonin transporter (SERT). Interestingly, it was found that long-term treatment of MDD with SSRIs increases the expression of miR-16, which serendipitously also directly targets SERT (Baudry et al., 2010). Specifically, SSRIs promote the conversion of precursor miR-16 into its mature form, which regulates SERT uptake of serotonin. Another study examined the expression of three miRNAs (miR-1202, miR-135a and miR-16) in MDD patients and controls from two independent cohorts and compared miRNA expression between antidepressant responders and non-responders (Fiori et al., 2017). In both cohorts, decreased levels of miR-1202 correlated with patients responding to either an SSRI or a serotonin-norepinephrine reuptake inhibitor (SNRI). After 8 weeks of antidepressant treatment, the responders' miR-1202 expression levels increased and were indistinguishable from those of non-responders and healthy controls. Importantly, in vitro studies demonstrated a similar result, where NPCs treated with SSRI drugs had an increase in miR-1202; however, miR-1202 expression did not increase when NPCs were treated with non-serotonergic drugs (Lopez et al., 2014). This suggests that MDD patients with low miR-1202 may be more likely to respond to serotonin-based antidepressants. With continued research, miRNAs may become valuable tools for developing a personalized treatment plan, increasing the chances of patients receiving the most appropriate antidepressant the first time.

In summary, epigenetic studies will be highly beneficial for the development of individualized MDD therapeutics, the categorization of MDD subtypes and the enhancement of the efficacy of existing treatments.

#### AUTISM SPECTRUM DISORDERS

Autism spectrum disorders are characterized as heritable neurodevelopmental disorders in which affected individuals have deficits in social interactions, communication and behaviors (American Psychiatric Association, 2013). Over the years, genomic studies have identified genes that seem to contribute to the condition (Abrahams and Geschwind, 2008); however, none significantly stand out as a major contributor to ASD. Rather, it appears that much of the heritability is polygenic with each gene only contributing a very small portion. Recent studies are beginning to suggest that in addition to genetics, ASD may also have an epigenetic component.

#### Putative Role for Polycomb Repressive Complex 1 in ASD

Several putative genes have been proposed as contributors to ASD, one of which is autism susceptibility candidate 2 (AUTS2) (Sultana et al., 2002; Oksenberg and Ahituv, 2013). Surprisingly, recent studies have demonstrated that AUTS2 can form a complex with PRC1 and function in gene activation, contrary to PRC1's traditional repressive role (Gao et al., 2012; Gao et al., 2014). It is proposed that the PRC1-AUTS2 complex can promote gene expression through the recruitment of CK2 and the co-activator P300 protein. CK2 inhibits monoubiquitination of lysine 119 on histone H2A by phosphorylating RING1B. Further supporting the role of AUTS2 in gene activation, ChIP-seq analysis has localized AUTS2 predominantly near TSSs in the mouse brain. These binding sites also carried active histone marks such as histone 3 lysine 27 acetylation (H3K27ac) and H3K4me3, and were depleted of the repressive histone mark H3K27me3. Furthermore, gene ontology analysis of PRC1-AUTS2 targets identified functional terms that were associated with CNS transcriptional programming. All of this evidence supports the PRC1-AUTS2 complex as being involved in promoting gene expression. Behavioral and developmental analysis of AUTS2 knockout mice also showed impaired developmental phenotypes similar to those observed in humans with a disruption in AUTS2 (Gao et al., 2014). The interaction between AUTS2 and epigenetic machinery could be a rich area of investigation for uncovering potential therapeutic targets for individuals with AUTS2 polymorphisms.

#### Differential DNA Methylation

The SHANK3 gene has been identified as a strong contributing factor to ASDs (Durand et al., 2007; Moessner et al., 2007; Gauthier et al., 2009). In neuronal synapses, SHANK3 acts as a scaffolding protein with critical roles in the formation, maturation and maintenance of synapses (Du et al., 1998; Boeckers et al., 1999). The SHANK3 gene contains 5 CpG islands at putative intragenic promoters whose methylation status has been associated with alternative splice variants (Zhu et al., 2014). In postmortem ASD brains, there was a significant increase in DNA methylation at the CpG islands 2, 3, and 4 of SHANK3. In addition, the methylation at these islands was associated with decreased expression and decreased alternative splicing of SHANK3, suggesting DNA methylation regulates the expression of the splice variants. This evidence introduces the possibility that the methylation status of SHANK3 could serve as a putative predictor for ASD.

#### Dysregulation of Non-coding RNAs

Although very little is known about the contributions that miRNAs and lncRNAs make to ASD, several studies have investigated non-coding RNAs in this disorder. One study identified 28 differentially expressed miRNAs in ASD cerebellar cortex tissue using qPCR (Abu-Elneel et al., 2008). Interestingly, 7 of the identified miRNAs were predicted to target the autism-associated genes NEUREXIN and SHANK3. Another study that looked at lncRNAs detected over 200 differentially expressed lncRNAs in ASD (Ziats and Rennert, 2013). Of those identified, more than 90% mapped within 500 kb of a known gene, and many of these genes have functional roles in neurodevelopment and psychiatric diseases. These findings imply that lncRNAs could be part of the mechanism that regulates genes contributing to ASD. This study was also able to compare the expression of lncRNAs in the cerebellum and cortex of the same individual, for both healthy and ASD brains. Between brain regions, the ASD brains had significantly fewer differentially expressed genes and lncRNAs than the control brains. This finding is consistent with imaging studies showing that autistic brains have less specialized, less distinct regions than healthy brains (Minshew and Keller, 2010). In summary, because ASD lacks a single strong genetic determinant, epigenetic studies will likely fill in many gaps in our understanding of the mechanisms and risk factors contributing to ASD.
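
The proximity criterion described above (an lncRNA lying within 500 kb of a known gene) can be sketched in a few lines. This is only an illustrative sketch, not the pipeline used in the cited study: the `Interval` class, the function names and all coordinates are hypothetical, and real analyses would work from annotation files with dedicated interval tools.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Interval:
    """A genomic feature; coordinates are 0-based, half-open [start, end)."""
    name: str
    chrom: str
    start: int
    end: int


def gap_bp(a: Interval, b: Interval) -> Optional[int]:
    """Distance in bp between two features: 0 if they overlap or touch,
    None if they lie on different chromosomes."""
    if a.chrom != b.chrom:
        return None
    return max(0, max(a.start, b.start) - min(a.end, b.end))


def fraction_near_gene(
    lncrnas: List[Interval], genes: List[Interval], window: int = 500_000
) -> Tuple[float, List[str]]:
    """Return the fraction (and names) of lncRNAs within `window` bp of any gene."""
    if not lncrnas:
        raise ValueError("no lncRNAs supplied")
    near = []
    for lnc in lncrnas:
        for gene in genes:
            d = gap_bp(lnc, gene)
            if d is not None and d <= window:
                near.append(lnc.name)
                break  # one nearby gene is enough
    return len(near) / len(lncrnas), near


# Hypothetical toy data (not real annotations).
toy_genes = [
    Interval("geneA", "chr1", 1_400_000, 1_450_000),
    Interval("geneB", "chr2", 150, 1_000),
]
toy_lncrnas = [
    Interval("lnc1", "chr1", 1_000_000, 1_002_000),  # 398 kb from geneA
    Interval("lnc2", "chr1", 5_000_000, 5_001_000),  # far from any gene
    Interval("lnc3", "chr2", 100, 200),              # overlaps geneB
    Interval("lnc4", "chr3", 10, 20),                # no gene on chr3
]
fraction, names = fraction_near_gene(toy_lncrnas, toy_genes)
```

A linear scan over all genes is used here for clarity; for genome-scale annotations a sorted index or interval tree would be the more practical choice.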

Hundreds of genes have been found to be associated with ASD, including Pcdh genes. A GWAS identified 5 SNPs in the PCDHA gene that were significantly associated with ASD (Anitha et al., 2013). Interestingly, deletions near PCDH10 have consistently been found in families with autism (Morrow et al., 2008; Bucan et al., 2009). A role for protocadherins in ASD is further supported by the finding that ASD brains have increased dendritic spine densities compared to controls (Hutsler and Zhang, 2010). Pcdh genes are renowned for their roles in dendritogenesis, dendrite arborization and dendritic spine regulation (Keeler et al., 2015), making them strong candidate genes for autism studies. Studying Pcdh genes in neuropsychiatric diseases has become a hot topic in the field, and it would be interesting to see how their epigenetic regulation also contributes to disease pathology.

# FRAGILE X SYNDROME

Fragile X syndrome (FXS) is the most common inherited form of intellectual disability, and is caused by a CGG trinucleotide repeat expansion in the 5′UTR of the FMR1 gene, which encodes the RNA binding protein FMRP (Webb et al., 1986; Verkerk et al., 1991; Ashley et al., 1993). FMRP is widely expressed in fetal and adult tissues, with the highest enrichment in the brain and testes (Devys et al., 1993). It predominantly localizes in the cytoplasm; however, it can be transported to the nucleus via its nuclear localization signal (Devys et al., 1993; Eberhart et al., 1996). As an RNA binding protein, FMRP appears to have several functions, including translation regulation, miRNA-mediated translation suppression and neuronal synaptic plasticity (Jin et al., 2004a). Currently, the precise mechanisms by which FMRP regulates transcription/translation, as well as its target RNAs, are still under rigorous investigation. Recent work has started to unveil putative functional roles of FMRP as well as potential regulatory targets of FMRP in FXS and other intellectual disabilities (Nelson et al., 2013).

#### Hypermethylation of FMR1 Putatively Mediated by RNAi

Fragile X has been shown to be caused by the loss of FMR1 gene expression in conjunction with the hypermethylation of the cytosines in the CGG trinucleotide repeat (Bell et al., 1991; Pieretti et al., 1991; Sutcliffe et al., 1992; Orsini and Maren, 2012). Methylation of the CGG repeats was identified in human fetal tissue, suggesting that the methylation is acquired after fertilization, or is already present in the carrier female's oocytes (Sutcliffe et al., 1992). Remarkably, FMR1 gene expression could be rescued in vitro by utilizing DNMT inhibitors and CRISPR/Cas9 to remove the DNA methylation (Bar-Nur et al., 2012; Park et al., 2015; Liu et al., 2018). The question that remains is what initiates or causes the hypermethylation of the expanded repeats seen at the CpG island. One model proposes that the RNA interference (RNAi) pathway may be involved (Jin et al., 2004a). This model suggests that the mRNA produced from the expanded FMR1 gene can fold back on itself, generating a hairpin-like structure that is processed by the RNAi machinery. Ultimately, targeting of the RNAi complex is thought to recruit de novo DNMTs and histone methyltransferases (HMTs) to the expanded FMR1 sequence. This model is supported by the initial finding that the mutant FMR1 RNA sequence forms different hairpin structures, with the prominent structure forming in the 3′UTR of the transcript (Handa et al., 2003).

#### MicroRNA Pathway in Fragile X

FMRP has been shown to function as a translational repressor through its RNA binding properties (Laggerbauer et al., 2001; Li et al., 2001). In the brain, FMRP bound to mRNA has been found at dendritic spines in association with polyribosomes, suggesting some involvement in protein synthesis at synapses (Feng et al., 1997). In addition, the brains of fragile X patients show abnormal dendritic spine growth (Hinton et al., 1991). Recent work has prompted a model in which FMRP regulates the expression of its target mRNAs through the miRNA pathway. Immunoprecipitation studies demonstrated that mammalian wildtype FMRP, but not mutant FMRP, could associate with miRNAs and the miRNA pathway proteins Dicer, eIF2C2 and the mammalian Argonaute (AGO) protein (Jin et al., 2004b). This study also determined that, in flies, AGO1 is required for the regulation of synaptic plasticity by dFmr1, the fly ortholog of FMR1. These observations are supported by previous studies in Drosophila showing that dFmr1 associates with AGO2 and the RNA-induced silencing complex (RISC) (Caudy et al., 2002; Ishizuka et al., 2002).

In vitro rescue studies of FMR1 have demonstrated that current technologies, such as DNMT inhibitors and CRISPR/Cas9, can be applied as putative therapeutics. The next step is to conduct translational studies to test whether FMR1 expression can be rescued in mammals. A possible place to begin would be in vitro fertilization experiments, because hypermethylation of FMR1 either arises after fertilization or is already present in the oocyte. It would be interesting to explore the effects of CRISPR technology on FMR1 expression at these early stages of development.

# RETT SYNDROME

#### MeCP2 Dysregulation

Rett syndrome (RTT) is a rare disease that was first described in 1966, although the criteria for diagnosing patients did not become available until the 1980s (Rett, 1966; Hagberg et al., 1985). Described as a progressive neurodevelopmental disorder, Rett syndrome is most common in females, and symptoms such as autistic behavior, stereotypic hand wringing and loss of facial expression begin to appear around 18 months of age (Hagberg et al., 1983). Later clinical presentations can include difficulty with motor control, breathing and communication, as well as small head size, muscle wasting and seizures (Gold et al., 2018). The Rett locus was mapped to a region on the X chromosome (Xq28) (Sirianni et al., 1998). Further mapping studies found that the MeCP2 gene also mapped to this region, and that mutations in the methyl binding domain (MBD) and transcriptional repression domain (TRD) of MeCP2 caused RTT (Amir et al., 1999). The MeCP2 missense mutation R133C abolishes the methyl binding ability of the MBD (Mellen et al., 2012). Many other RTT-associated missense mutations in the MBD and TRD have also been shown to prevent MeCP2 from interacting with protein complexes and methylated DNA (Lyst et al., 2013). MeCP2 is essential to normal brain morphology and, consequently, individuals with RTT have more densely packed, shorter neurons with dendrites that are less dense and less complex (Armstrong et al., 1995). Conditional deletion of Mecp2 in postnatal mice produced phenotypes similar to those observed in RTT patients (Gemelli et al., 2006). These mice had impaired motor coordination, increased anxiety and abnormal social behavior.

The mechanism by which MeCP2 mutation (or loss of function) causes Rett is not known, although several studies have tried to identify dysregulated genes in RTT that are direct targets of MeCP2 (Colantuoni et al., 2001; Peddada et al., 2006; Jordan et al., 2007). One study determined that MeCP2-deficient mice and RTT human brains showed significant upregulation of the inhibitor of differentiation genes (ID1–4), which are targets of MeCP2 (Peddada et al., 2006). In vitro studies demonstrated that MeCP2 normally downregulates the protocadherin genes PCDHB1 and PCDH7 (Miyake et al., 2011). Because protocadherins are critical for proper brain development, aberrant expression of these genes could contribute to the pathogenesis of RTT. Furthermore, a study involving four independent cDNA microarrays demonstrated that the majority of differentially expressed genes were downregulated in human RTT postmortem brains, but it did not investigate whether any of these genes were associated with MeCP2 (Colantuoni et al., 2001). Another study used microarrays to identify differentially expressed genes in a MeCP2-null mouse model and examined several brain regions (cortex, midbrain and cerebellum) to determine whether certain regions were more sensitive to loss of MeCP2 (Urdinguio et al., 2008). Although no significant differences were found between brain regions, the study did identify several genes that are direct binding targets of MeCP2. Importantly, these genes (Fkbp5, Mobp, Plagl1, Ddc, Mllt2h, Eya2, and S100a9) were found to be upregulated in the RTT mouse model, and their functions are associated with neural function. Identifying candidate genes in RTT is important for developing a greater understanding of the underlying mechanism.

While Rett syndrome is the result of MeCP2 loss of function, MeCP2 duplication syndrome is the result of MeCP2 overexpression, and mimics some of the symptoms of Rett (Van Esch et al., 2005). Rodent models have clearly demonstrated that having a balance of MeCP2 expression and function is absolutely essential to normal brain activity. Mice that have either overexpression or deletion of MeCP2 show disrupted neuronal activity in the hippocampus (Lu et al., 2016). Both mouse models exhibit neuronal hypersynchrony, which is an aberration from the normal asynchrony typically present at baseline. Importantly, this phenotype could be observed several months before the animals started to have seizures. Deep brain stimulation therapy rescued the abnormal synchrony in both mouse models. Thus, proper MeCP2 expression levels are required for stable neuronal activity.

#### DNA Methylation Affects MeCP2 Binding

Several studies have looked at how dynamic changes in DNA methylation, at both CpG and CpH sites, could correlate with Rett syndrome pathology, and have even speculated as to how they could contribute to the delayed onset of RTT symptoms. In addition to mCG, MeCP2 also binds to mCH, preferentially to mCA (Guo et al., 2014; Gabel et al., 2015). Accumulation of MeCP2 in mammalian neurons occurs early after birth, when mCH starts to accumulate (Shahbazian et al., 2002; Ballas et al., 2009; Lister et al., 2013). Interestingly, in maturing neurons, genes that acquired mCH marks were more likely to be dysregulated in the RTT mouse model (Chen et al., 2015). This evidence suggests that early in brain development, MeCP2 initially binds to mCG and then to mCH as it accumulates around a subgroup of neuronal genes (such as Bdnf) to influence gene transcription. This epigenetic mechanism could help explain why a genetic disease like Rett syndrome has a delayed onset.

Proper regulation of the LINE 1 retrotransposon is important for brain development. MeCP2 directly targets the 5′UTR of LINE 1 in the brain to regulate LINE 1 mobility (Muotri et al., 2010). Mutations in MeCP2, as seen in RTT, prevent its binding to LINE 1, resulting in increased expression of LINE 1 in both in vitro and in vivo models of RTT. Whether the increase in neuronal retrotransposition contributes to the cause of RTT, or is simply an effect, is not clear. These findings warrant further investigation of LINE 1's contribution to RTT. It could also be worth investigating global methylation profiles of developing mouse embryos to determine whether DNA methylation patterns are also disrupted, contributing to aberrant LINE 1 expression.

#### MeCP2 Interacts With HDACs

A possible epigenetic mechanism to investigate for RTT is the interaction of MeCP2 with HDAC3. MeCP2 associates with HDAC3 as part of the NCoR/SMRT co-repressor complex (Lyst et al., 2013), and MeCP2 missense mutations that occur in Rett syndrome prevent this interaction (Nan et al., 1998; Ebert et al., 2013; Lyst et al., 2013). Furthermore, in mice, HDAC3 binds near transcriptional start sites of active gene promoters, including the Bdnf gene promoter in the brain (Nott et al., 2016). In mouse models of Rett syndrome, MeCP2 mutations prevent the recruitment of HDAC3 and FOXO to gene promoters. FOXO is a transcription factor that, when acetylated, has reduced binding affinity for DNA (Daitoku et al., 2004; Matsuzaki et al., 2005; Hatta et al., 2009). Recruitment of HDAC3 to active gene promoters through MeCP2 regulates the deacetylation of FOXO, and promotes the expression of neuronal genes (Nott et al., 2016). The gene targets of this complex might yield insightful avenues for developing site-directed therapeutics for Rett patients.

#### Dysregulation of miRNAs

Recently, miRNAs have been suggested to interact with MeCP2 and potentially contribute to RTT. Using a mouse RTT model, one study found that just over one-fourth of the miRNAs analyzed showed different expression patterns in Mecp2-null brains compared to wildtype, most of which were downregulated (Urdinguio et al., 2010). Additionally, MeCP2 was found to associate with the miRNAs that showed 5′UTR hypermethylation. Interestingly, two of the downregulated miRNAs, miR-146a and miR-146b, base pair to the 3′UTR of IL-1 receptor-associated kinase 1 (Irak1), which is upregulated in RTT mouse brains (Taganov et al., 2006; Urdinguio et al., 2008). It was then shown that both miR-146a and miR-146b could downregulate IRAK1 expression in vitro (Nahid et al., 2009), and it was proposed that in Rett syndrome the downregulation of miR-146a/b contributes to the overexpression of IRAK1 (Urdinguio et al., 2010). Another study identified altered expression of miRNAs in the cerebellum of Mecp2-null mice (Wu et al., 2010b). The authors showed that the promoters of the dysregulated miRNAs were methylated and bound by MeCP2, downregulating their expression. Furthermore, the 3′UTR of the Bdnf transcript contained multiple binding sites for miRNAs that were upregulated, providing mechanistic evidence to explain reduced Bdnf expression in RTT.

MeCP2 and its interactions with epigenetic factors play major roles in Rett syndrome, yet why there is a delay in disease onset is not fully elucidated. It may be worthwhile to investigate a spectrum of early developmental stages to determine what epigenetic changes occur before the onset of disease, and how these changes could contribute to the delayed onset of RTT.
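
Pairing between a miRNA and a 3′UTR, as described above for miR-146a/b and Irak1, is commonly predicted by searching the UTR for the reverse complement of the miRNA seed (nucleotides 2 to 8 from the 5′ end). The following is a minimal sketch of that heuristic, assuming a simple exact 7-nt match; it is not the method used in the cited studies, and the sequences are hypothetical toy strings rather than real miRNA or Irak1 sequences.

```python
from typing import List

# Watson-Crick complements for RNA bases.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq: str) -> str:
    """Return the reverse complement of an RNA sequence (A, C, G, U only)."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(seq))


def seed_match_positions(mirna: str, utr: str) -> List[int]:
    """0-based start positions in `utr` that pair perfectly with the
    miRNA seed (nucleotides 2-8, counted from the miRNA 5' end)."""
    seed = mirna[1:8]                    # nt 2-8 in 1-based numbering
    site = reverse_complement_rna(seed)  # sequence expected in the mRNA
    width = len(site)
    return [
        i for i in range(len(utr) - width + 1)
        if utr[i:i + width] == site
    ]


# Hypothetical toy sequences, for illustration only.
toy_mirna = "UACGGAUCCAGUACGUUAGCAA"
toy_site = reverse_complement_rna(toy_mirna[1:8])
toy_utr = "AAA" + toy_site + "AAACCC" + toy_site + "UU"
positions = seed_match_positions(toy_mirna, toy_utr)
```

An exact seed match is only a first-pass filter; established prediction tools additionally weigh site context and conservation, and predicted sites still require experimental validation.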

#### SCHIZOPHRENIA

Schizophrenia (SZ) is a mental illness with clinical phenotypes such as dissociation of thought, ideas, identity and emotion (Moskowitz and Heim, 2011). The age at first episode varies widely, with early onset occurring in adolescence and late onset in the mid-50s (Jablensky et al., 1992). Interestingly, males appear to experience their first episode 4–5 years earlier than females, on average (Hafner et al., 1998). Like ASD, SZ lacks a single causal gene (International Schizophrenia et al., 2009); however, epigenetic factors are a promising area of research.

#### Aberrant DNA Methylation

In postmortem brains of schizophrenic individuals, studies have found a ∼50% increase in DNA methylation at the Reelin gene (RELN) promoter (Impagnatiello et al., 1998). Reelin is an extracellular matrix protein highly expressed in GABAergic neurons (Pesold et al., 1999). Functionally, RELN has been shown to be essential in brain development, contributing to neuronal migration, axonal branching and synaptogenesis. Upstream of the promoter region is a CpG island, suggesting that inappropriate methylation could regulate RELN (Royaux et al., 1997). One study demonstrated that hypermethylation of the RELN promoter was associated with a decrease in RELN expression in the brains of schizophrenic patients (Abdolmaleky et al., 2005). It was also established that the transcription factors Sp1 and Tbr1 have binding sites upstream of the RELN promoter and induce gene expression (Chen et al., 2002). Interestingly, Sp1 regulation of the adenine ribosyltransferase gene triggers demethylation and prevents de novo methylation, and it is proposed that Sp1 could similarly regulate RELN (Han et al., 2001; Chen et al., 2002). Additionally, prevention of DNA methylation at the CpG island with 5-aza-2′-deoxycytidine (5-azadC) increased gene expression of RELN more than 50-fold (Chen et al., 2002). This evidence indicates that methylation at the RELN gene likely plays a major role in SZ.

Very little research has examined how aberrant epigenetic modifications affect the expression of protocadherins in SZ. Interestingly, olanzapine, a common antipsychotic drug often prescribed to SZ patients, is proposed to exert its effect by causing DNA methylation changes throughout the brain (Melka et al., 2014). Importantly, several protocadherin genes (Pcdha11, Pcdha9, and Pcdhga5) had altered promoter methylation in the cerebellum, whereas hypomethylation of the Pcdhga8 promoter was observed in the hippocampus. Regions of the genome that are thought to contribute to SZ susceptibility appear to overlap with cadherin superfamily genes (Pedrosa et al., 2008). Polymorphisms in PCDH12 and PCDH15 were found to be associated with SZ (Gregorio et al., 2009; Narayanan et al., 2015), and linkage studies implicated the CTNNA2 gene in sibling pairs with SZ (DeLisi et al., 2002; Chu and Liu, 2010). Protocadherin genes appear to have important roles in SZ, but the extent to which they are affected by epigenetic changes is unclear. It would be interesting to test whether manipulation of methylation at various protocadherin genes could significantly impact brain development and function in neuropsychiatric diseases.

#### Non-coding RNAs

The contribution of miRNAs to cognitive disorders has been best characterized in SZ. In postmortem SZ brains, miR-132 was found to be dysregulated, and has been associated with cognitive and behavioral impairments (Moreau et al., 2011; Miller et al., 2012). In the prefrontal cortex of SZ brains, miR-132 was significantly downregulated while its target mRNAs were all upregulated (Miller et al., 2012). Some of the identified targets were associated with synaptic long-term potentiation and depression, neuronal CREB signaling and DNA methylation. Interestingly, Dnmt3a was found to be a putative target of miR-132; however, the expression patterns of Dnmt3a and miR-132 at early developmental stages are opposite. It is not until later in development when miR-132 expression drastically increases that it would have the potential to target Dnmt3a. One could speculate that this temporal expression of miR-132 and Dnmt3a prevents their dysregulation during early development, consistent with SZ requiring an ongoing and prolonged accumulation of dysregulated events that must reach a threshold for symptoms to develop (Cannon, 1996). Another miRNA found in SZ brains was miR-195, which targets several genes (BDNF, RELN, DRD1) implicated in SZ (Beveridge et al., 2010). GWAS for SZ have identified a locus on chromosome 1p21.3 that is highly associated with miR-137 (Schizophrenia Working Group of the Psychiatric Genomics Consortium, 2014; Gianfrancesco et al., 2017). Several independent GWAS studies have identified a single nucleotide polymorphism (SNP) within the miR-137 gene that is common amongst schizophrenic patients (Hamshere et al., 2013; Guan et al., 2014). Patients with this high-risk SNP had earlier age of onset (Lett et al., 2013), abnormal development of brain structure and lower prefrontal cortex activity during working memory (van Erp et al., 2014). 
In vitro analysis identified a novel lncRNA whose expression pattern closely resembles that of miR-137 (Gianfrancesco et al., 2017). The lncRNA was found to be highly expressed, specifically in the prefrontal cortex, and transcriptionally induced by psychoactive drugs, suggesting a potential connection with the hallucinations that many SZ patients experience. Although further work is needed to better understand the functional role of this lncRNA, it could be postulated that miR-137 and the lncRNA regulate each other.

Over 200 lncRNAs have been found in the brains of individuals with psychiatric disorders such as SZ (Ziats and Rennert, 2013). The lncRNA GOMAFU in humans is involved in brain development (Mercer et al., 2010) and post-mitotic neuronal function (Sone et al., 2007). In addition, SZ patients have reduced expression of GOMAFU, which was found to be important for cognitive function (Barry et al., 2014). Interestingly, GOMAFU can directly interact with the splicing factors quaking (QKI) and SRSF1 (serine/arginine-rich splicing factor 1), and when GOMAFU is dysregulated, the alternative splicing resembles that seen in schizophrenia-associated genes DISC1 and ERBB4 (**Figure 1D**). QKI was identified as a potential SZ gene because it is the only gene located in the chromosome susceptibility locus, 6q25-6q27, in a schizophrenia pedigree (Aberg et al., 2006b). mRNA expression analysis revealed that two QKI splice variants were significantly downregulated in SZ patient brains, suggesting that the splice variants could increase susceptibility to SZ. Moreover, disrupted QKI splicing could account for the decreased expression of myelin-related genes associated with SZ (Aberg et al., 2006a). Interestingly, most of the myelin-related gene repression was explained by the splice variant QKI-7kb, and putative QKI-binding sites were identified in the mRNAs of five myelin genes.

In summary, non-coding RNAs as well as methylation of the RELN gene have been implicated as epigenetic research areas that may hold potential therapeutic targets for SZ. Future studies should aim to further elucidate the role of miRNAs in the SZ brain, in order to pinpoint certain miRNAs that may be pivotal in SZ symptoms. Because SZ has so many genetic variants that only contribute a small portion to the overall increased risk, identifying global epigenetic dysregulation patterns may be more promising. Additionally, there is a lack of studies looking at how epigenetic patterning in the brain changes due to environmental risk factors such as drug use, birth complications and childhood adversity (Neilson et al., 2017). All of these environmental risk factors have been highly correlated with SZ and shown to impact brain development.

#### CONCLUSION AND OUTLOOK

Genetic and epigenetic regulations are critical for brain development, function and prevention of neurological diseases. Currently, the field lacks clear molecular mechanisms underlying neuropsychiatric diseases and effective treatment options. Epigenetics provides a whole new dimension for therapeutic treatments because so many of these diseases are not monogenic and likely have a significant environmental contribution. The epigenome is greatly influenced by environmental factors such as nutrition, chemical pollutants, traumatic early life experiences, temperature changes and exercise (Roth and Sweatt, 2011; Feil and Fraga, 2012), but how they affect brain development is poorly understood. Importantly, the effect of the environment on epigenetics is not limited to development after birth, but can also affect development in utero. Recent work hypothesized that early life stressors that cause long-lasting epigenetic changes may be due to cellular epigenetic "priming." Similar to the immune system, once a particular environmental exposure is experienced and alters the epigenetic state of a gene, that gene now remains in a state of "primed responsiveness," and will have a quicker response if that same environmental exposure is experienced again (Vineis et al., 2017). This concept of epigenetic memory in response to environmental stimuli could serve as a way to identify individuals predisposed to developing neuropsychiatric diseases.

For several of the monogenic neuropsychiatric diseases, such as Rett syndrome and Fragile X, exploring epigenetic mechanisms may reveal whether early intervention could attenuate the disease prior to its onset. As the cost of sequencing continues to decrease, prenatal genome sequencing could be implemented to look for mutations in specific genes. If it is known ahead of time that a child is predisposed, early intervention treatments could be started to slow or prevent disease progression. Possible directions for treatment development could include the use of CRISPR editing to fix missense mutations in MeCP2 of Rett patients, or developing DNMT-based drugs to remove the methylation on the CGG expanded repeat in Fragile X. Additionally, it could be useful to look at developmental time points to identify what epigenetic changes are occurring just before the onset of disease. This could shed light on when key epigenetic remodeling events take place and when potential interventions could be tested.

Polygenic neuropsychiatric diseases, such as MDD, could benefit the most from epigenetic treatments because there is no clear-cut mechanism to explain disease development. The field is currently exploring two approaches for developing HDAC inhibitor treatments. The first approach combines HDAC inhibitors with antidepressant drugs (Fuchikami et al., 2016). In this method, HDACs are thought to promote the condensation of the chromatin and prevent transcription factors from binding, regardless of whether the antidepressant is able to increase the levels of the transcription factor. Administering an HDAC inhibitor together with an antidepressant could therefore make both drugs work better. The second approach addresses the problem of low specificity of current HDAC inhibitors. The goal of this approach is to synthesize new, highly selective compounds/analogs that can cross the blood-brain barrier and be administered acutely instead of chronically (Misztak et al., 2018). Additionally, identifying more reliable biomarkers, such as miR-1202, that can help predict a patient's likelihood of responding to antidepressants could eliminate much of the guesswork in finding a drug that will best treat a patient.

In summary, this review has discussed several epigenetic processes and how dysregulation of any of them can affect brain development, function and disease. An important topic not covered in this review is that dysregulation of DNA methylation, histone modifications, chromatin remodeling, and regulatory RNA also contributes to neurodegenerative diseases such as Huntington's, Parkinson's and Alzheimer's. Several model systems, such as mice and postmortem human brains, have been used to generate the current body of knowledge. A promising new model system, the organoid, can help advance our understanding of genetics and epigenetics in neuropsychiatric disorders.

Currently, a major challenge in studying neuropsychiatric diseases lies in the limitations of the model systems available. Mouse models and human postmortem brains have been heavily relied upon to provide insight into neuropsychiatric disease pathology and etiology. However, both options have their limitations. Although mouse and human brains are highly similar at genetic, structural and general circuitry levels, key differences limit them as models of human diseases that are characterized by complex dysfunction of behavior and thought. For example, human brains have evolved to contain the granular prefrontal cortex, which is absent in mouse brains (Passingham and Wise, 2012). This portion of the cortex is thought to have emerged in relation to increasing brain size, and to have roles in comprehension, planning and perception (Goldman-Rakic, 1996; Barbas, 2000; Rolls, 2000; Miller and Cohen, 2001). Human brain samples are obtained postmortem, and thus can never fully recapitulate the epigenetic landscape of a living brain. Postmortem brains only provide a snapshot in the timeline of the disease, and this snapshot is usually biased toward the state of death. Thus, postmortem human brains fail to provide data regarding disease initiation and progression over time.

A new and promising model system that can compensate for the limitations of animal models and postmortem brains is the organoid. Organoids are 3-dimensional cultures that model whole developing organs (Itskovitz-Eldor et al., 2000). This system evolved from embryoid cultures, which are 3D aggregates of stem cells that are grown in a suspension that will induce their differentiation. When organoids are used to generate neuronal lineages, they can recapitulate human brain development in vitro. Morphological studies have further confirmed that forebrain organoids have developmental patterns similar to those of the developing human cortex (Qian et al., 2016; Zhang et al., 2016). For example, developing organoids can undergo neural differentiation, form multi-layer progenitor zones, form discrete brain regions and display typical neuron morphologies such as spine-like structures (Lancaster et al., 2013; Qian et al., 2016). Epigenomic studies have also confirmed that brain organoids recapitulate the fetal brain epigenome (Luo et al., 2016). Whole-genome methylome profiling revealed that mCH accumulation in both fetal brain and cerebral organoid occurred at super-enhancers that are specifically active during fetal development, and later became repressed. Additionally, organoid mCG signatures at DNA methylated valleys, large domains depleted of mCG, were comparable to fetal cortex and localized to genes with roles in brain development. Organoids are cultured from mature epithelial cells that are reverted back to induced pluripotent stem cells. The mature epithelial cells can be obtained non-invasively from an individual affected by a neurological disease, allowing researchers to use a model that is genetically identical to the patient. This provides the field with the ability to develop unique therapeutic options specific to each patient.

In conclusion, the study of epigenetics, along with the exploitation of organoid models, can accelerate our understanding of neuropsychiatric diseases and support the development of better treatments.

#### AUTHOR CONTRIBUTIONS

JK and BY wrote the review together. EB and ZW helped with revisions. EB contributed to the MDD section.

#### REFERENCES

channel gene expression to neurons. Cell 80, 949–957. doi: 10.1016/0092- 8674(95)90298-8






single precursor cells derived in vitro and in vivo. Nat. Immunol. 8, 1217–1226. doi: 10.1038/ni1522


mammalian preimplantation development. Genes Dev. 12, 2108–2113. doi: 10.1101/gad.12.14.2108



methylation in the mouse genome. Cell 148, 816–831. doi: 10.1016/j.cell.2011. 12.035


human astrocytes reveals transcriptional and functional differences with mouse. Neuron 89, 37–53. doi: 10.1016/j.neuron.2015.11.013


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Kuehner, Bruggeman, Wen and Yao. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Interrogating the Evolutionary Paradox of Schizophrenia: A Novel Framework and Evidence Supporting Recent Negative Selection of Schizophrenia Risk Alleles

#### Chenxing Liu<sup>1</sup>\*, Ian Everall<sup>2,3</sup>, Christos Pantelis<sup>1,4,5,6</sup> and Chad Bousman<sup>1,7,8,9</sup>

<sup>1</sup> Department of Psychiatry, Melbourne Neuropsychiatry Centre, University of Melbourne and Melbourne Health, Melbourne, VIC, Australia, <sup>2</sup> Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, United Kingdom, <sup>3</sup> South London and Maudsley NHS Foundation Trust, London, United Kingdom, <sup>4</sup> Florey Institute of Neuroscience and Mental Health, University of Melbourne, Melbourne, VIC, Australia, <sup>5</sup> Department of Electrical and Electronic Engineering, Centre for Neural Engineering (CfNE), University of Melbourne, Carlton South, VIC, Australia, <sup>6</sup> Melbourne Health, NorthWestern Mental Health, Melbourne, VIC, Australia, <sup>7</sup> Department of Medical Genetics, University of Calgary, Calgary, AB, Canada, <sup>8</sup> Department of Psychiatry, University of Calgary, Calgary, AB, Canada, <sup>9</sup> Department of Physiology and Pharmacology, University of Calgary, Calgary, AB, Canada

#### Edited by:

Cunyou Zhao, Southern Medical University, China

#### Reviewed by:

Annemie Ploeger, University of Amsterdam, Netherlands; Tom Dickins, Middlesex University London, United Kingdom; Jian-Huan Chen, Jiangnan University, China

> \*Correspondence: Chenxing Liu CHL284@pitt.edu

#### Specialty section:

This article was submitted to Behavioral and Psychiatric Genetics, a section of the journal Frontiers in Genetics

> Received: 06 November 2018 Accepted: 10 April 2019 Published: 30 April 2019

#### Citation:

Liu C, Everall I, Pantelis C and Bousman C (2019) Interrogating the Evolutionary Paradox of Schizophrenia: A Novel Framework and Evidence Supporting Recent Negative Selection of Schizophrenia Risk Alleles. Front. Genet. 10:389. doi: 10.3389/fgene.2019.00389

Schizophrenia is a psychiatric disorder with a worldwide prevalence of ∼1%. The high heritability and reduced fertility among schizophrenia patients have raised an evolutionary paradox: why has negative selection not eliminated schizophrenia-associated alleles during evolution? To address this question, we examined evolutionary markers, known as modern-human-specific (MD) sites and archaic-human-specific (AD) sites, using existing genome-wide association study (GWAS) data from 34,241 individuals with schizophrenia and 45,604 healthy controls included in the Psychiatric Genomics Consortium (PGC). By testing the distribution of schizophrenia single nucleotide polymorphisms (SNPs) with risk and protective effects in the human-specific sites, we observed negative selection of risk alleles for schizophrenia in modern humans relative to archaic humans (e.g., Neanderthals and Denisovans). Such findings indicate that risk alleles of schizophrenia have been gradually removed from the modern human genome due to negative selection pressure. This novel evidence contributes to our understanding of the genetic origins of schizophrenia.

Keywords: schizophrenia, evolution, GWAS, Neanderthal, negative selection

# INTRODUCTION

Schizophrenia is a severe, highly heritable (h<sup>2</sup> = 0.64–0.80) psychiatric disorder that typically emerges in late adolescence or early adulthood (Thaker and Carpenter, 2001; Lichtenstein et al., 2009; van Os and Kapur, 2009). The peak of illness onset differs by sex regardless of culture, definition of onset, and definition of illness, with onset peaking at 15–25 years of age in men and 20–35 years of age in women (Mendrek and Mancini-Marïe, 2016). Aligned with these onset peaks, evidence indicates that schizophrenia patients, particularly males, have a reduced rate of reproduction (fitness) compared with non-affected populations (Bassett et al., 1996; Avila et al., 2001). Although it has been reported that fertility among relatives of patients with schizophrenia is increased, a large cohort study and meta-analysis identified that this increase was too small to counterbalance the reduced fitness of affected patients (Bundy et al., 2011; Power et al., 2013). In fact, MacCabe et al. (2009) showed that patients with schizophrenia had fewer grandchildren than the general population, demonstrating that the reduced reproductivity persists into subsequent generations. This reduction in overall reproduction among those with schizophrenia and their progeny, coupled with high heritability, should result in a decrease in schizophrenia according to the evolutionary concept of negative selection. Negative selection results in the purging of deleterious alleles that contribute to traits that reduce fertility. However, the principle of negative selection seems inconsistent with schizophrenia, which is characterized by both high heritability and reduced fertility (Avila et al., 2001) but relatively stable prevalence in the population, suggesting an evolutionary paradox.

Some have attempted to explain this paradox by proposing that risk alleles for schizophrenia at some time in human history conferred evolutionary advantages (i.e., mating success or reproductivity) (Karksson, 1970; Waddell, 1998; Turelli and Barton, 2004; Nettle and Clegg, 2006), while others have attributed the existence of these risk alleles to a price paid for language and development of the social brain (Crow, 1997, 2000). The former evolutionary perspective on schizophrenia has been explained by Nettle (2001) and Nettle and Clegg (2006), who suggested that schizotypy characteristics could be linked to intelligence and artistic creativity and thus may positively correlate with mating success. A recent cross-trait analysis of genome-wide association study (GWAS) data supports this notion in that higher polygenic risk scores for schizophrenia predicted creativity (Power et al., 2015). The latter explanation by Crow proposed schizophrenia as a price the modern human paid for achievement of language (Crow, 1997). This idea was subsequently incorporated in the so-called "by-product" hypothesis of schizophrenia by Burns (2004, 2006). The by-product hypothesis relies on the argument that schizophrenia shares a common genetic basis with the evolution of the social brain, representing the abnormal cortical connectivity that occurred approximately 1 to 1.5 million years ago in our ancestors, archaic humans (e.g., Neanderthals, Denisovans). Other evolutionary theories, such as ancestral neutrality and polygenic mutation-selection balance, have been proposed to explain the evolutionary paradox (Keller and Miller, 2006). However, a consensus has not been reached by evolutionary scientists.

The development of evolutionary genomic tools and the emergence of a critical mass of GWAS data have provided the opportunity to empirically examine the "schizophrenia paradox" and uncover evolutionary mechanisms underpinning the pathogenesis of schizophrenia. Xu et al. (2015) identified the enrichment of schizophrenia SNPs near human accelerated regions (HARs) in the genome that are conserved in primates but have undergone accelerated evolution in humans (pHAR, a type of HARs based on conservation of non-human primates). More recently, Srinivasan et al. (2016) applied a novel evolutionary statistic, the Neanderthal selective sweep (NSS) score, to the largest schizophrenia GWAS dataset (Schizophrenia Working Group of the Psychiatric Genomics Consortium, 2014) and found SNPs associated with schizophrenia were significantly (p = 7.30 × 10<sup>−9</sup>) enriched in genome regions that were under recent positive selection. However, recent GWAS findings by Pardiñas et al. (2018) have challenged the notion of selective advantage of schizophrenia risk alleles by demonstrating that these risk alleles have undergone strong background (negative) selection.

To assist in reconciling the current evidence to date, additional evolutionary genomic markers, i.e., modern-human-specific (MD) sites and archaic-human-specific (AD) sites, have recently become available (Prüfer et al., 2014; **Figure 1**). These genomic sites provide an opportunity to further interrogate the schizophrenia paradox and examine in more detail the direction of evolutionary mechanisms on SNPs/alleles associated with schizophrenia after modern humans split from archaic humans. As such, we analyzed the Psychiatric Genomics Consortium (PGC) schizophrenia GWAS data (Schizophrenia Working Group of the Psychiatric Genomics Consortium, 2014) using these new evolutionary markers. Based on the most recent findings by Pardiñas et al. (2018), we hypothesized that the risk alleles of schizophrenia underwent negative selection after modern humans branched away from Neanderthals and Denisovans.

(derived) alleles. Modern-human-specific (MD) sites are those sites where Denisovans or Altai Neanderthals have the derived allele and the ancestral allele is fixed or appears at a high frequency (>90%) in modern humans. Archaic-human-specific sites are those sites. For each site, the ancestral/non-ancestral state (allele) was determined via a comparison with the chimpanzee genome.

#### MATERIALS AND METHODS


#### Data Sources

#### GWAS

Summary statistics of GWAS SNPs were obtained from the PGC schizophrenia study<sup>1</sup>, which consisted of 34,241 cases and 45,604 controls.

#### MAF

Minor allele frequency (MAF) information from the 1000 Genomes Project in European (pop\_id = 16652) populations was downloaded from the dbSNP149 database<sup>2</sup>.

#### Human-Specific Sites

General information on MD/AD sites was downloaded from the Max Planck Institute's Evolutionary Anthropology website<sup>3</sup>. We extracted information (NCBI identifier, genome coordinates and ancestral allele of the site) for SNPs within modern-human-specific (MD-SNPs) and archaic-human-specific (AD-SNPs) sites. Although most of these sites were fixed in modern humans and did not have alternative alleles, 91,752 MD-SNPs (28.5%) and 66,952 AD-SNPs (31.0%) were identified in the PGC schizophrenia GWAS following cross-table querying using NCBI identifiers (rsID) or chromosome coordinates as keys. It was these polymorphic sites that were used in the subsequent analyses (**Supplementary Figure S1**).
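The cross-table query described above can be illustrated with a minimal Python sketch. It is not the authors' code: the record types `Site` and `GwasSnp` and the function `match_sites` are hypothetical names introduced here, and the sketch assumes that both tables already share one genome build and that an rsID match takes precedence over a coordinate match.

```python
from typing import NamedTuple, Optional


class Site(NamedTuple):
    """One human-specific (MD or AD) site; field names are illustrative."""
    rsid: Optional[str]  # NCBI identifier, if the site has one
    chrom: str
    pos: int             # coordinate, assumed to be on the same build as the GWAS
    ancestral: str       # ancestral allele of the site


class GwasSnp(NamedTuple):
    """One row of GWAS summary statistics; field names are illustrative."""
    rsid: str
    chrom: str
    pos: int
    p_value: float


def match_sites(sites, gwas):
    """Return (site, snp) pairs, keyed by rsID first and by coordinates second."""
    by_rsid = {snp.rsid: snp for snp in gwas}
    by_coord = {(snp.chrom, snp.pos): snp for snp in gwas}
    matched = []
    for site in sites:
        snp = by_rsid.get(site.rsid) if site.rsid else None
        if snp is None:
            # Fall back to chromosome coordinates when no rsID match exists.
            snp = by_coord.get((site.chrom, site.pos))
        if snp is not None:
            matched.append((site, snp))
    return matched
```

Sites that match neither key are dropped, which corresponds to the sites described above as fixed in modern humans and therefore absent from the GWAS.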

#### Analytical Approach

#### Linkage Disequilibrium-Pruning Approach

Prior to statistical analysis, available SNPs were subjected to a linkage disequilibrium (LD)-based SNP pruning process because the statistical tests described below assume independence of the studied data. The pruning process was conducted with PLINK software in a 1 Mb window in which any pair of SNPs with R<sup>2</sup> > 0.2 was noted and SNPs were greedily pruned from the window until no such pairs remained. During the pruning process, SNPs were randomly removed with the same priority. The 1000 Genomes Project phase 3 data<sup>4</sup> were used as a reference in the pruning process.
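The logic of greedy LD pruning can be sketched in a few lines of Python. This is a conceptual illustration under stated assumptions, not PLINK's implementation: the function `ld_prune` and its arguments are hypothetical, pairwise R<sup>2</sup> values are assumed to be precomputed from a reference panel, and the SNP removed from each offending pair is chosen at random with equal priority, as described above.

```python
import random


def ld_prune(snps, r2, window_bp=1_000_000, threshold=0.2, seed=0):
    """Greedy LD pruning on one chromosome (illustrative sketch only).

    snps: list of (snp_id, position) tuples.
    r2:   dict mapping frozenset({id_a, id_b}) to a precomputed R^2 value;
          pairs missing from the dict are treated as unlinked (R^2 = 0).
    Returns the IDs of the SNPs that survive pruning, ordered by position.
    """
    rng = random.Random(seed)
    kept = sorted(snps, key=lambda snp: snp[1])
    while True:
        # Collect every remaining pair that lies within the window and
        # exceeds the R^2 threshold.
        offending = [
            (first, second)
            for i, first in enumerate(kept)
            for second in kept[i + 1:]
            if second[1] - first[1] <= window_bp
            and r2.get(frozenset((first[0], second[0])), 0.0) > threshold
        ]
        if not offending:
            return [snp_id for snp_id, _ in kept]
        # Remove one member of the first offending pair, chosen at random.
        kept.remove(rng.choice(offending[0]))
```

The loop recomputes the offending pairs after every removal, which is slow but keeps the stopping condition explicit: pruning ends only when no pair inside the window exceeds the threshold.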

#### Enrichment Analysis of Schizophrenia SNPs for Human-Specific Sites

To control the potential bias caused by MAF, only SNPs with a MAF < 0.1 were included in the enrichment analysis. The MAF of <0.1 was selected because variants in human-specific sites occur at this frequency or below. Fold change scores (F-scores) within each association p-value decile bin (p ∼ [1, 0.886], [0.886, 0.781], [0.781, 0.671], [0.671, 0.559], [0.559, 0.443], [0.443, 0.336], [0.336, 0.233], [0.233, 0.140], [0.140, 0.054], and [0.054, 0]) were calculated as the ratio of the observed proportion to the expected proportion:

$$\text{F score} = \frac{d / c}{b / a} = \frac{a \times d}{b \times c}$$

where the observed proportion is the ratio of the number of SNPs within the queried p-value bin located in MD/AD sites (d in **Table 1**) to the number of these SNPs in all regions of the genome (c), and the expected proportion is the ratio of the number of all available SNPs located in MD/AD sites (b) to the number of these SNPs in all regions of the genome (a). Fisher's exact test was used to quantify the difference between AD and MD sites within each decile bin.

<sup>4</sup>http://phase3browser.1000genomes.org/index.html
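As a worked illustration, the fold change can be computed directly from the four counts defined in the text, taking the observed proportion as d/c and the expected proportion as b/a and dividing the former by the latter. The Python sketch below is not the authors' code: `decile_bin` and `fold_change` are hypothetical helper names, the bin edges are copied from the decile boundaries listed above, and assigning a boundary p-value to the less significant of the two adjacent bins is an assumption made here.

```python
# Decile bin edges for the association p-value, copied from the text.
BIN_EDGES = [1.0, 0.886, 0.781, 0.671, 0.559, 0.443, 0.336, 0.233, 0.140, 0.054, 0.0]


def decile_bin(p_value):
    """Return the bin index: 0 for [1, 0.886] through 9 for [0.054, 0]."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p-value must lie in [0, 1]")
    for index, lower_edge in enumerate(BIN_EDGES[1:]):
        # Assumption: a value equal to an edge goes to the less significant bin.
        if p_value >= lower_edge:
            return index
    raise AssertionError("unreachable: the last lower edge is 0")


def fold_change(a, b, c, d):
    """Observed proportion (d / c) divided by expected proportion (b / a).

    a: all available SNPs, genome-wide
    b: all available SNPs located in MD/AD sites
    c: SNPs in the queried p-value bin, genome-wide
    d: SNPs in the queried p-value bin located in MD/AD sites
    """
    if a <= 0 or b <= 0 or c <= 0:
        raise ValueError("a, b and c must be positive")
    return (d / c) / (b / a)
```

For example, if 100 of 1,000 available SNPs fall in MD sites (expected proportion 0.1) and 20 of the 100 SNPs in a given bin fall in MD sites (observed proportion 0.2), `fold_change(1000, 100, 100, 20)` gives a two-fold enrichment; a value near 1 indicates no enrichment.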

#### Identification of Derived-Risk or Derived-Protective Alleles

To further investigate changes of risk and protective alleles during the process of human evolution, we identified the derived-risk and derived-protective alleles for schizophrenia. Risk and protective alleles for schizophrenia were determined using summary results from the PGC GWAS (Schizophrenia Working Group of the Psychiatric Genomics Consortium, 2014). Derived/ancestral alleles were identified using the chimpanzee genome as a reference. Those SNPs within MD/AD sites were divided into the derived-risk category, in which the derived allele is the risk allele for schizophrenia (the ancestral allele is the protective allele), and the derived-protective category, in which the derived allele is the protective allele for schizophrenia. We then calculated the ratio of derived-risk to derived-protective schizophrenia SNPs in each of the decile p-value bins described above to examine the pattern of risk and protective allelic substitutions during the recent evolution of humans. Fisher's exact test was used to assess statistical significance within each of the decile bins. All statistical tests were performed in R v3.2.3.
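The classification and test described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' R code: `classify` and `fisher_exact_two_sided` are hypothetical names, the summary statistics are assumed to report an effect allele together with an odds ratio, and the test is assumed to compare derived-risk and derived-protective counts between AD and MD sites in a 2 × 2 table.

```python
from math import comb


def classify(effect_allele, other_allele, odds_ratio, ancestral_allele):
    """Return 'derived-risk', 'derived-protective', or None if undetermined."""
    alleles = {effect_allele, other_allele}
    if len(alleles) != 2 or ancestral_allele not in alleles or odds_ratio == 1:
        return None
    # The risk allele is the effect allele when OR > 1, otherwise the other one.
    risk_allele = effect_allele if odds_ratio > 1 else other_allele
    # If the risk allele is ancestral, the derived allele must be protective.
    return "derived-protective" if risk_allele == ancestral_allele else "derived-risk"


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)  # tolerance for float ties
    )
    return min(1.0, p_value)
```

In this layout the rows would be AD and MD sites and the columns derived-risk and derived-protective counts within one decile bin, so a small p-value indicates that the two classes of sites differ in their derived-risk to derived-protective ratio.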

#### Cross-Disorder Analyses

To determine if our results observed in schizophrenia could also be observed in other psychiatric disorders, we obtained PGC GWAS summary results<sup>1</sup> for bipolar, autism and major depressive disorder. The same analytical pipeline used to examine the schizophrenia data (described above) was applied separately to the bipolar, autism and major depressive disorder GWAS data. The chromosome coordinates for genome build 38 (hg38) and build 18 (hg18) were aligned with the coordinates for genome build hg19 by the LiftOver software, along with corresponding conversion references<sup>5,6</sup>.

<sup>1</sup>https://www.med.unc.edu/pgc/results-and-downloads

<sup>2</sup>https://ftp.ncbi.nih.gov/snp/

<sup>3</sup>http://cdna.eva.mpg.de/neandertal/

# RESULTS

### Enrichment Analysis of Schizophrenia SNPs

As shown in **Figure 2**, SNPs examined in the schizophrenia GWAS were not significantly enriched within MD sites or AD sites, regardless of decile bin (**Supplementary Table S1**). Furthermore, there was no difference in the proportion of MD-SNPs (overall p-value across all bins = 0.66) or AD-SNPs (p-value = 0.56) among all GWAS SNPs.

### Schizophrenia Risk and Protective Allelic Substitution

The schizophrenia SNPs within MD and AD sites had diametrically opposite evolutionary patterns (**Figures 3A,B** and **Supplementary Table S2**). The AD sites contained more derived-risk alleles for schizophrenia compared with the MD sites, whereas the MD sites had more derived-protective alleles. The strongest difference (p-value = 3.9 × 10<sup>−15</sup>) was found within the decile bin containing SNPs with the smallest p-value in the PGC schizophrenia GWAS (**Supplementary Table S2**).

<sup>6</sup>http://hgdownload.soe.ucsc.edu/goldenPath/hg38/liftOver/

#### Cross-Disorder Analysis

Similar to schizophrenia, SNPs from the bipolar, autism and major depressive disorder GWAS were not significantly enriched within MD sites or AD sites, regardless of the decile bin examined (**Supplementary Figure S2**). In contrast to schizophrenia, however, these disorders did not show opposing patterns of derived-risk and derived-protective alleles between AD and MD sites (**Supplementary Figure S3** and **Table S3**).

# DISCUSSION

Our findings show that since the modern human lineage split from Neanderthals and Denisovans, risk alleles for schizophrenia, but not for other psychiatric disorders, have been progressively eliminated from the modern human genome. Interestingly, the tendency toward eliminating risk alleles and retaining protective alleles was identified not only in nominally associated SNPs, but also in SNPs that currently have not been associated with schizophrenia (i.e., SNPs with p-values > 0.05). One explanation for this observation is background selection, in which negative selection decreases the frequency of a deleterious allele and, with it, removes linked variation within the same LD block. Under background selection, the elimination of a given schizophrenia risk allele may result not from its own intrinsically deleterious effect, but from negative selection acting on a linked causal allele.

The enrichment of schizophrenia SNPs in pHAR regions and NSS regions was identified by Xu et al. (2015) and Srinivasan et al. (2016), respectively. Srinivasan et al. attributed their observation to the effect of positive selection after the divergence of humans and Neanderthals. However, the most recent study by Pardiñas et al. (2018) has emphasized the role of background selection in the persistence of risk alleles for schizophrenia. Contrary to the perspective in Srinivasan's study, Pardiñas et al. (2018) suggested that SNPs under positive selection are less likely to be associated with schizophrenia. Our findings are consistent with those reported by Pardiñas et al. (2018) in that our results support negative selection and corresponding background selection of schizophrenia risk alleles rather than positive selection.

<sup>5</sup>https://genome.ucsc.edu/cgi-bin/hgLiftOver

FIGURE 3 | (A) Derived-risk/derived-protective allele ratios within AD and MD sites. (B) An expanded view of the p < 0.054 bin. MD = Modern-human-specific sites; AD = Archaic-human-specific sites.

In **Figure 4**, we offer a simple preliminary framework that integrates our results within an evolutionary context. Our framework adopts the by-product hypothesis' notion that the number of schizophrenia risk alleles increased with the development of the social brain, language, and high-order cognitive functions (Crow, 2000; Burns, 2004). Aligned with this notion, we speculate that around 100,000–150,000 years ago (Burns, 2004), before the migration of modern humans out of Africa (Stringer and Andrews, 1988), there was a "turning point" at which time the number of schizophrenia risk alleles plateaued. Thereafter, risk alleles for schizophrenia have been progressively but slowly eliminated from the modern human genome while undergoing negative selection pressure.

Support for our proposed framework would ideally involve evidence suggesting progressive reductions in schizophrenia incidence over the past 100,000–150,000 years, along with evidence showing greater schizophrenia polygenic burden among our more distant human ancestors. Currently, however, we are limited to DNA obtained from Neanderthals and Denisovans. Calculating schizophrenia polygenic burden in Neanderthals and Denisovans and comparing it with that observed in modern humans would be an effective approach to validate the proposed framework. In addition, the time-frame over which human evolution occurred (e.g., more than a million years) and the relatively recent operationalization of schizophrenia pose a significant challenge in evaluating changes in the incidence of schizophrenia from an evolutionary perspective. Nevertheless, an epidemiological study has suggested that the incidence of schizophrenia is declining (McGrath et al., 2008).

Our framework could be strengthened or refined by answers to several outstanding questions. First, when did the "turning point" occur? We have speculated that this event took place 100,000–150,000 years ago, but more precise estimates would allow for more sophisticated evolutionary models to be created. Second, how many schizophrenia common risk alleles were present at the turning point? Our framework assumes the number of schizophrenia risk alleles or polygenic burden was greater among our human ancestors, but the extent of this additional burden is unknown. Third, what is the rate at which common risk alleles have been eliminated, and to what extent have other evolutionary mechanisms such as balancing selection or sexual selection counteracted the rate of allele elimination? Our proposed framework assumes removal of risk alleles has occurred in a static, linear fashion since the turning point. However, to confirm this assumption, DNA from more distant ancestors will be required. Finally, can a single evolutionary framework explain the genetic origin of schizophrenia? Our analysis and framework assume that schizophrenia is a unitary disorder. However, it is widely accepted that schizophrenia represents a clustering of various symptoms rather than a unitary disorder, and any comprehensive framework is likely to require a combination of models. As such, our analyses would ideally have been performed on more homogeneous populations that shared similar symptoms. Unfortunately, most public schizophrenia GWAS datasets are limited in the amount of symptom-level data available, prohibiting these types of analyses. Nevertheless, our findings suggest that risk alleles for schizophrenia have been progressively eliminated from the modern human genome, regardless of the presumed symptom heterogeneity within our sample. Future investigations of schizophrenia GWAS data with high-quality phenotyping are warranted.

Despite the novelty and strength of our study, we acknowledge several limitations. First, due to the limited number of associated SNPs, the study did not examine the enrichment and substitution of schizophrenia susceptibility under strict p-value thresholds. Novel evolutionary markers encompassing more schizophrenia SNPs are therefore required to further investigate SNPs with genome-wide significance. Second, insertion-deletion (indel) variants were not included in our analysis due to the low number available in our dataset. Indels play regulatory roles in brain functions, thus future studies should explore their contribution to the genetic origins of schizophrenia. Third, our findings rely on genome information of several archaic humans, but the psychiatric status of the Neanderthal or Denisovan individuals remains unknown. If any of them were affected by psychosis, our findings could be biased. Finally, other evolutionary models, such as the sexual selection and balancing selection models (Nettle, 2001; Del Giudice, 2017), have been proposed to reconcile the evolutionary paradox in schizophrenia. However, the present study did not empirically evaluate these models because the available evolutionary markers are not suitable for testing them.

In sum, we have performed a novel evolutionary analysis using schizophrenia and other psychiatric disorder GWAS data and comparative genome results in modern and archaic humans. Our study, for the first time, provides experimental evidence supporting the role of negative selection in eliminating risk alleles for schizophrenia but not other psychiatric disorders from the modern human genome. Based on these theoretical and biological findings, we have proposed a novel evolutionary framework to stimulate further research on the evolutionary paradox and genetic origin of schizophrenia.

# AUTHOR CONTRIBUTIONS

CL conceived and designed the study, performed bioinformatics and statistical analyses. CB and CL interpreted the main findings. CB, IE, and CP supervised the work.

#### FUNDING

CP was supported by an NHMRC Senior Principal Research Fellowship (628386 and 1105825), and a Brain and Behavior Research Foundation (NARSAD) Distinguished Investigator Award (US; Grant ID: 18722).

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2019.00389/full#supplementary-material



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Liu, Everall, Pantelis and Bousman. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Transcriptome Changes in Relation to Manic Episode

*Ya-Chin Lee1, Yu-Lin Chao2, Chiao-Erh Chang1, Ming-Hsien Hsieh3, Kuan-Ting Liu1, Hsi-Chung Chen3, Mong-Liang Lu4,5, Wen-Yin Chen6, Chun-Hsin Chen4,5, Mong-Hsun Tsai7, Tzu-Pin Lu1, Ming-Chyi Huang5,6\* and Po-Hsiu Kuo1,8\**

*1 Institute of Epidemiology and Preventive Medicine, College of Public Health, National Taiwan University, Taipei, Taiwan, 2 Department of Psychiatry, Buddhist Tzu Chi General Hospital and Tzu Chi University, Hualien, Taiwan, 3 Department of Psychiatry, National Taiwan University Hospital, Taipei, Taiwan, 4 Department of Psychiatry, Wang-Fang Hospital, Taipei Medical University, Taipei, Taiwan, 5 Department of Psychiatry, School of Medicine, College of Medicine, Taipei Medical University, Taipei, Taiwan, 6 Department of Psychiatry, Taipei City Psychiatric Center, Taipei City Hospital, Taipei, Taiwan, 7 Institute of Biotechnology, National Taiwan University, Taipei, Taiwan, 8 Department of Public Health, National Taiwan University, Taipei, Taiwan*

#### *Edited by:*

*Zhexing Wen, Emory University School of Medicine, United States*

#### *Reviewed by:*

*Herb Lachman, Albert Einstein College of Medicine, United States Hechen Bao, University of North Carolina at Chapel Hill, United States Rafael Tabarés-Seisdedos, University of Valencia, Spain*

*\*Correspondence:* 

*Ming-Chyi Huang mch@tpech.gov.tw Po-Hsiu Kuo phkuo@ntu.edu.tw*

#### *Specialty section:*

*This article was submitted to Molecular Psychiatry, a section of the journal Frontiers in Psychiatry*

*Received: 07 October 2018 Accepted: 11 April 2019 Published: 01 May 2019*

#### *Citation:*

*Lee Y-C, Chao Y-L, Chang C-E, Hsieh M-H, Liu K-T, Chen H-C, Lu M-L, Chen W-Y, Chen C-H, Tsai M-H, Lu T-P, Huang M-C and Kuo P-H (2019) Transcriptome Changes in Relation to Manic Episode. Front. Psychiatry 10:280. doi: 10.3389/fpsyt.2019.00280*

Bipolar disorder (BD) is highly heritable and well known for its recurrent manic and depressive episodes. The present study focused on manic episodes in BD patients and aimed to investigate state-specific transcriptome alterations between acute episode and remission, including messenger RNAs (mRNAs), long noncoding RNAs (lncRNAs), and micro-RNAs (miRNAs), using microarray and RNA sequencing (RNA-Seq) platforms. BD patients were enrolled with clinical information, and peripheral blood samples were collected at both acute and remission status, which were separated by at least 2 months and confirmed by follow-ups. Symptom severity was assessed by the Young Mania Rating Scale. We enrolled six BD patients as the discovery samples and used the Affymetrix Human Transcriptome Array 2.0 to capture transcriptome data at the two time points. For replication, expression data from Gene Expression Omnibus that consisted of 11 BD patients were downloaded, and we performed a mega-analysis of microarray data from 17 patients. Moreover, we conducted RNA-Seq in additional samples from 7 BD patients. To identify intraindividual differentially expressed genes (DEGs), we analyzed data using a linear model controlling for symptom severity. We found that noncoding genes made up the majority of the top DEGs in microarray data. The expression fold change of coding genes among DEGs showed moderate to high correlations (~0.5) across platforms. A number of lncRNAs and two miRNAs (*MIR181B1* and *MIR103A1*) exhibited high levels of gene expression in the manic state. For coding genes, we reported that the taste function-related genes, including *TAS2R5* and *TAS2R3*, may be mania state-specific markers. Additionally, four genes showed a nominal *p*-value of less than 0.05 in all our microarray data, mega-analysis, and RNA-Seq analysis. They were upregulated in the manic state and consisted of *MS4A14*, *PYHIN1*, *UTRN*, and *DMXL2*, and their gene expression patterns were further validated by quantitative real-time polymerase chain reaction (qRT-PCR). We also performed weighted gene coexpression network analysis to identify gene modules for manic episode. Genes in the mania-related modules were different from the susceptibility loci of BD obtained from genome-wide association studies, and biological pathways in relation to these modules were mainly related to immune function, especially cytokine–cytokine receptor interaction. Results of the present study elucidated potential molecular targets and genomic networks that are involved in manic episode. Future studies are needed to further validate these biomarkers for their roles in the etiology of bipolar illness.

Keywords: bipolar disorder, manic episode, microarray, RNA-sequencing, transcriptome, noncoding RNAs, state markers

#### INTRODUCTION

Bipolar disorder (BD) is a severe and highly heritable psychiatric disorder, characterized by repeated manic episodes and depressive episodes (1). Having a manic episode is a unique feature for diagnosing BD. The symptoms of a manic episode include elevated mood, irritability, racing thoughts and rapid speech, inflated self-esteem, increased activity, reduced need for sleep, and engagement in risky behaviors (2). BD ranks among the leading causes of disease burden worldwide and causes a huge loss of disability-adjusted life years (3). However, the pathological mechanisms of BD are still unclear. It remains a great challenge to make an accurate and early diagnosis, as well as to provide efficient treatment, for BD (4). In the past decade, genome-wide association studies (GWAS) have had a huge impact on the study of complex traits (5) and facilitated the identification of hundreds of genetic loci (6) and follow-up functional studies (7). However, progress for BD has lagged behind (8). In particular, the mechanisms underlying the episodic feature of BD are largely unknown. In this regard, the transcriptome, whose dynamic characteristics respond to physiological and environmental stimuli, is a suitable genomic system for studying the molecular alterations of manic episode.

Recently, with the advances in RNA-sequencing (RNA-Seq) technology (9) and breakthrough findings on noncoding RNAs (ncRNAs) (10–12), studies in transcriptomics have entered a new era. Among different categories of ncRNAs, short micro-RNAs (miRNAs) are probably the most studied, with many known molecular functions, including binding mechanism and target gene repression. Another novel group of large ncRNAs, named long noncoding RNAs (lncRNAs) and defined as transcripts longer than 200 nucleotides, has been identified and found to have substantial regulatory functions during development in multiple genomic domains (13), including transcriptional and posttranscriptional regulation (10, 11). LncRNAs exert their complex functions by further interacting with miRNAs as competitors, primers, or cooperators to regulate miRNAs or other regulators (12). Moreover, lncRNAs are vital for brain development and neural function (14), and are recognized to play roles in psychiatric disorders (15). For example, the lncRNA myocardial infarction associated transcript (MIAT) (Ensembl gene ID: ENSG00000225783) has been found to regulate schizophrenia-associated alternative splicing and is regulated by neuronal activation (16). Akula et al. conducted the first RNA-Seq analysis in postmortem brains of 11 BD patients and 8 healthy controls to explore between-group transcriptome differences (17). Together with results from another study in BD (18), lncRNAs were reported to be significantly differentially expressed. Moreover, using human induced pluripotent stem cells (iPSCs), RNA-Seq results showed dysregulated lncRNAs in BD patients compared to healthy control iPSCs (19). These results highlight the importance of studying both coding and noncoding transcriptomic targets for psychiatric disorders.

It is known that transcriptome profiles differ across individuals and are easily confounded by environmental and genetic factors, RNA extraction methods, and sample timing (20, 21). Therefore, it might be more appropriate to study within-individual transcriptomic differences, such as disease state markers, to reduce the potential confounding effects of interindividual comparisons. The concept of studying state-specific markers for BD using gene expression data is not new (22). For example, brain-derived neurotrophic factor, the *BDNF* gene, is widely studied for its association with disease status of BD (23). Previous studies often adopted a candidate gene approach and used quantitative real-time PCR to investigate gene expression in specific candidate genes (24), but not at the genome-wide scale to explore more comprehensive transcriptome alterations for mood episodes. In addition, despite a few studies exploring gene expression patterns for depressive states, very few studies were conducted for the manic state of BD (22), and none reported results of within-individual comparisons. As far as we are aware, there was only one study using a prospective design to follow 11 BD patients from manic episode to euthymia to explore blood gene expression alterations (25). However, this study reported results from interindividual comparisons, including mania patients versus healthy controls, and euthymia patients versus healthy controls.

In the present study, we aimed to investigate transcriptome patterns for the unique manic feature of BD patients, including coding genes and ncRNAs. We followed patients from acute episode to remission and made intraindividual comparisons. The current study included discovery samples and replication samples using different high-throughput platforms of gene expression for comparisons. We also accessed public data from Witt et al. (25) to run a mega-analysis with their 11 pairs of BD samples in order to obtain more robust results combining different studies and ethnic groups. To identify differentially expressed genes (DEGs), we performed analyses using linear regression models adjusted for symptom severity. Using the latest gene annotation version, we were able to annotate all recorded ncRNAs in the database to obtain a more comprehensive transcriptome profile for the manic state. Moreover, the field is experiencing a paradigm shift for considering "omnigenic model" (26) and network analysis (27). We performed coexpression and network analyses to explore modules/pathways that are correlated with manic episode and symptom severity. These strategies could help us to better interpret the potential functions of identified coding genes and ncRNAs, and facilitate our understanding about the underlying mechanisms for the development of manic episode.

#### MATERIAL AND METHODS

#### Subjects

Inpatients aged 20–65 years who were diagnosed with BD and had a current manic episode according to the criteria of the *Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition* were referred by psychiatrists in several central and regional hospitals in Taipei. We excluded patients with mental retardation, schizophrenia, schizoaffective disorder, or substance-induced secondary BD. Acute manic patients were followed for at least 2 months until they achieved full remission. Each patient had paired data at two time points, including clinical data and bio-sample collection. In total, we recruited 13 BD patients with manic episode who had complete data in the acute phase and in remission in the present study. Six BD patients were discovery samples analyzed with genome-wide microarrays, and seven BD patients underwent whole-genome RNA-Seq analysis. All participants signed informed consent forms after study procedures were fully explained. The sample recruitment and data collection were approved by the Institutional Review Boards of all participating institutes and hospitals.

For replication, we further downloaded the microarray gene expression data of Witt et al. (25) with 11 German BD patients from Gene Expression Omnibus database (GEO, Series: GSE46416), which included data from both manic and remission status, and obtained patients' clinical severity information from the corresponding author. The demographic data of all subjects were listed in **Table 1**. We conducted a mega-analysis to combine the two sets of microarray gene expression data as replication.

#### Assessment

Our trained lay interviewers conducted a face-to-face interview with each participant. Subjects were interviewed with a modified Chinese version of the Schedule for Affective Disorders and Schizophrenia-Lifetime (SADS-L) to assess demographic characteristics and lifetime history of psychiatric disorders (28). For symptom severity, we used the Chinese version of the Young Mania Rating Scale (YMRS) for mania (29), whose reliability, validity, and sensitivity have been examined and which showed a high correlation between the total scores of two independent clinicians (0.93), and the Hamilton Rating Scale (HAMD) for depression (30). A cutoff of 16 in YMRS score was set for defining individuals in manic episode, while a score of less than 7 was considered in remission. We defined a variable named stage to represent the contrast between acute and remission status. The HAMD score was less than 11 at all times, and this criterion was taken to ensure that the BD patients were in mania but not in a mixed state.

TABLE 1 | Demographic data of all subjects in different platforms of the present study.

*The Microarray column was our primary analysis with Affymetrix Human Transcriptome Array (HTA) 2.0 and Witt's study using Affymetrix Exon 1.0 ST array. For RNA sequencing (RNA-Seq), we used Illumina NextSeq 500 platforms. YMRS, Young Mania Rating Scale; HAMD, Hamilton Rating Scale; N/A, not applicable.*
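The severity thresholds described in the Assessment subsection can be expressed as a small rule. The following is a minimal sketch, not the authors' code: the function name `classify_stage` is hypothetical, and treating the YMRS cutoff of 16 as inclusive is an assumption, since the text does not say whether a score of exactly 16 counts as manic.

```python
from typing import Optional

MANIC_YMRS_CUTOFF = 16     # from the text: cutoff of 16 defines a manic episode
REMISSION_YMRS_CUTOFF = 7  # from the text: a score below 7 counts as remission
MIXED_HAMD_CUTOFF = 11     # from the text: HAMD stayed below 11 at all times


def classify_stage(ymrs: int, hamd: int) -> Optional[str]:
    """Return 'acute', 'remission', or None when a visit fits neither state."""
    if hamd >= MIXED_HAMD_CUTOFF:
        # Depressive symptoms too high: a possible mixed state, so no label.
        return None
    if ymrs >= MANIC_YMRS_CUTOFF:  # assumption: the cutoff is inclusive
        return "acute"
    if ymrs < REMISSION_YMRS_CUTOFF:
        return "remission"
    return None  # intermediate severity: neither acute mania nor remission


if __name__ == "__main__":
    print(classify_stage(ymrs=28, hamd=4))  # acute
    print(classify_stage(ymrs=3, hamd=2))   # remission
```

Visits that return `None` would not contribute to the acute versus remission contrast under this rule.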

#### RNA Isolation and Biosample Quality Control

Whole blood samples were drawn from patients at each time point and stored with TRIzol™ reagent in −80°C freezers. We used chloroform for lysis, followed by isopropanol with incubation and centrifugation to retrieve total RNA precipitates from mononuclear cells. Then, we used 70% ethanol for washing. Finally, we solubilized the RNA pellet in 20–50 μL of RNase-free water and 0.1 mM EDTA for further application. RNA samples underwent quality control with the following criteria: OD260/280 between 1.8 and 2.0, and OD260/230 larger than 2.0. The BioAnalyzer quality control criteria for the RNA integrity number (RIN) and 28S/18S values were as follows: RIN ≥ 6 and 28S/18S ≥ 1.0. Most of our sequenced samples were of good RNA quality. The mean RIN across all samples was 8.8, with only two samples having RIN values between 6 and 7 (see **Supplementary Table S1**).
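The quality control criteria above amount to four threshold checks per sample. The sketch below illustrates them under stated assumptions: the `RnaSample` container and its field names are hypothetical, and the example values are invented for illustration rather than taken from the study.

```python
from dataclasses import dataclass


@dataclass
class RnaSample:
    """Hypothetical container for the QC readouts named in the text."""
    sample_id: str
    od260_280: float
    od260_230: float
    rin: float
    ratio_28s_18s: float


def passes_qc(sample: RnaSample) -> bool:
    """Apply the four criteria: purity ratios, RIN, and 28S/18S ratio."""
    return (
        1.8 <= sample.od260_280 <= 2.0   # OD260/280 between 1.8 and 2.0
        and sample.od260_230 > 2.0       # OD260/230 larger than 2.0
        and sample.rin >= 6.0            # RIN >= 6
        and sample.ratio_28s_18s >= 1.0  # 28S/18S >= 1.0
    )


if __name__ == "__main__":
    # Illustrative values only.
    good = RnaSample("S01", 1.95, 2.1, 8.8, 1.6)
    degraded = RnaSample("S02", 1.95, 2.1, 5.2, 0.8)
    print(passes_qc(good), passes_qc(degraded))  # True False
```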

#### Microarrays and RNA Sequencing

For transcriptome analysis, we conducted genome-wide gene expression analysis using microarrays in the discovery samples. We used the Affymetrix Human Transcriptome Array (HTA) 2.0, which contains more than 6.0 million probes and can be annotated to 44,699 coding transcripts and 22,829 noncoding transcripts, including miRNAs and lncRNAs. We used 500 ng of qualified RNA to synthesize cDNA, which was hybridized to the HTA microarrays. All procedures followed the Affymetrix protocols and were performed in the Microarray Core Lab of the Core Instrument Center in the National Health Research Center. The transcriptome microarrays used by Witt et al. (25) were Affymetrix Human Exon 1.0 ST Arrays, which contain 46,753 transcripts in total. For RNA-Seq, the qualified RNAs were used for library construction and then sequenced on the Illumina NextSeq 500 platform with 75-bp paired-end sequencing. On average, each sample had more than 30 million reads.

#### Transcriptome Analysis

All of the transcriptome analyses for microarrays were conducted with R software (31). First, the microarray data were imported and normalized using the Robust Multichip Average (RMA) method (32), including background correction, log2 transformation, and quantile normalization, in the *affy* package (33). To obtain a systematic nomenclature for cross-platform comparisons between different arrays, we used the *biomaRt* package (34) to match probe IDs in the different microarray platforms with Ensembl gene IDs using Human Genome Reference GRCh38 (35). Next, we used the collapseRows MaxMean function in the *weighted gene co-expression network analysis (WGCNA)* package to select the representative probe expression value for gene-level comparison and to increase between-study consistency. The MaxMean function was originally developed to perform cross-platform microarray mega-analysis and has been extensively validated (36). In total, 34,576 genes were common to the different platforms and carried forward for further analysis, including 18,049 coding genes and 16,527 noncoding genes. Finally, possible batch effects were corrected with the ComBat function of the *sva* package (37). To capture the DEGs between manic episode and remission, we constructed a linear model with the following covariates: age, sex, and YMRS score, using the *limma* package (38) to calculate expression values with empirical Bayes methods. For RNA-Seq, we performed analyses on a Linux system. We first used FastQC (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/) for quality checks of sequence files and used Trimmomatic (39) for sequence trimming. We then used HISAT2 (40) for sequence alignment with GRCh38 index files and used featureCounts (41) to perform gene annotation with Ensembl gene IDs. Lastly, we used DESeq2 (42) to detect DEGs for intraindividual comparisons. We reported DEGs with a nominal *p*-value of <0.05 and an absolute value of fold change larger than 1.2.
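The reporting rule at the end of this subsection (nominal *p* < 0.05 and absolute fold change above 1.2) can be sketched as a simple filter. This is an illustration only, not the authors' pipeline. It assumes fold changes are stored on the log2 scale, as limma and DESeq2 report them, and it interprets "absolute value of fold change larger than 1.2" as a change of more than 1.2-fold in either direction; the function names and gene labels are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def is_deg(p_value: float, log2_fc: float,
           p_cutoff: float = 0.05, fc_cutoff: float = 1.2) -> bool:
    """True when the nominal p-value and the fold-change criteria are both met.

    Assumption: log2_fc is a log2 fold change, so a 1.2-fold change in
    either direction corresponds to |log2_fc| > log2(1.2), about 0.263.
    """
    return p_value < p_cutoff and abs(log2_fc) > math.log2(fc_cutoff)


def select_degs(results: Dict[str, Tuple[float, float]]) -> List[str]:
    """Return the sorted gene names whose (p_value, log2_fc) pass the filter."""
    return sorted(gene for gene, (p, lfc) in results.items() if is_deg(p, lfc))


if __name__ == "__main__":
    # Hypothetical gene labels and statistics, for illustration only.
    toy = {
        "GENE_A": (0.01, 0.50),   # passes both criteria
        "GENE_B": (0.20, 1.00),   # fails the p-value criterion
        "GENE_C": (0.03, 0.10),   # fails the fold-change criterion
        "GENE_D": (0.04, -0.40),  # downregulated, passes both criteria
    }
    print(select_degs(toy))  # ['GENE_A', 'GENE_D']
```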

#### Quantitative Real-Time PCR for Potential Targets

For experimental validation, we used 10 pairs of RNA samples from manic episodes and remission. Among the 10 paired samples, 4 paired samples were technical replicates from previous samples and 6 paired samples were additional (see **Supplementary Table S2**).

For qRT-PCR, we performed standard RNA isolation and reverse transcription (RT) using the PrimeScript RT reagent Kit (TaKaRa, Japan) following the manufacturer's protocol. The analysis was conducted with SYBR® Premix Ex Taq™ (TaKaRa, Japan), with sequence-specific primers for each target gene (see **Supplementary File** for all the target sequences). The qPCR assays were conducted in duplicate on the MyiQ™ single-color real-time PCR detection system (Bio-Rad, CA), and glyceraldehyde 3-phosphate dehydrogenase (GAPDH) was used as the internal control. The threshold was set above the nontemplate control background and within the linear phase of target gene amplification to calculate the cycle number at which the transcript was detected [threshold cycle (CT)].
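The text reports CT values normalized to GAPDH but does not state how relative expression was computed. As an illustration, the sketch below uses the widely used 2^(−ΔΔCT) calculation; this choice is an assumption, not a method confirmed by the source, and the function name and CT values are hypothetical.

```python
from statistics import mean
from typing import Sequence


def relative_expression(target_manic: Sequence[float],
                        ref_manic: Sequence[float],
                        target_remission: Sequence[float],
                        ref_remission: Sequence[float]) -> float:
    """Fold change (manic vs. remission) by the 2^(-ddCT) calculation.

    Each argument holds the duplicate CT readings of one well pair:
    target gene or reference gene (GAPDH in the text), in each state.
    """
    # dCT: target CT minus reference CT, after averaging the duplicates.
    dct_manic = mean(target_manic) - mean(ref_manic)
    dct_remission = mean(target_remission) - mean(ref_remission)
    # ddCT: manic state relative to the remission state of the same patient.
    ddct = dct_manic - dct_remission
    return 2.0 ** (-ddct)


if __name__ == "__main__":
    # Illustrative CT values: the target is detected one cycle earlier in
    # mania, which corresponds to a two-fold higher relative expression.
    fold = relative_expression([23.5, 24.5], [18.0, 18.0],
                               [25.0, 25.0], [18.0, 18.0])
    print(fold)  # 2.0
```

A lower CT means earlier detection, so a negative ΔΔCT yields a fold change above 1, that is, upregulation in the manic state.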

#### Network and Enrichment Analyses

We ran weighted gene coexpression network analysis using the *WGCNA* package (43) for the 17 BD samples altogether; the two-time-point data were normalized and batch effects were corrected before analysis. We used the signed hybrid network type and biweight midcorrelation (bicor) for module detection and soft-threshold power calculation, which could provide robust results with better biological meaning (44). A soft-threshold power of 4 could achieve approximate scale-free topology with *R*<sup>2</sup> > 0.8. The network was constructed using the blockwiseModules function for computational efficiency. The module detection criteria were as follows: minimum module size of 50, deepSplit of 4, and merge threshold of 0.25. The merged modules were then summarized with module eigengene (ME) correlations >0.75. MEs were defined by their first principal component and were labeled with different colors as module names in the Results section. After the modules were generated, we conducted different enrichment analyses to explore the functional interpretation of genes within the modules.
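To make the signed hybrid network type concrete, the sketch below computes a signed hybrid adjacency, in which positive correlations are raised to the soft-threshold power and non-positive correlations are set to zero. It is a simplified illustration and not a substitute for the *WGCNA* package: Pearson correlation stands in for bicor (an assumption made to keep the example short), and the gene labels and expression values are invented.

```python
import math
from typing import Dict, List, Tuple


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation; stands in for bicor in this simplified sketch."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)  # undefined for constant profiles


def signed_hybrid_adjacency(expr: Dict[str, List[float]],
                            power: int = 4) -> Dict[Tuple[str, str], float]:
    """Signed hybrid rule: a_ij = cor_ij ** power if cor_ij > 0, else 0."""
    genes = list(expr)
    adjacency = {}
    for g in genes:
        for h in genes:
            r = 1.0 if g == h else pearson(expr[g], expr[h])
            adjacency[(g, h)] = r ** power if r > 0 else 0.0
    return adjacency


if __name__ == "__main__":
    # Invented expression profiles across four samples.
    toy = {
        "g1": [1.0, 2.0, 3.0, 4.0],
        "g2": [2.0, 4.0, 6.0, 8.0],  # perfectly co-expressed with g1
        "g3": [4.0, 3.0, 2.0, 1.0],  # anti-correlated with g1
    }
    adj = signed_hybrid_adjacency(toy, power=4)
    print(round(adj[("g1", "g2")], 6), adj[("g1", "g3")])  # 1.0 0.0
```

Because negative correlations map to zero, anti-correlated genes are never placed in the same module under this network type, while the power of 4 down-weights weak positive correlations.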

Moreover, we conducted enrichment analysis with genome-wide association (GWA) signals for BD. We downloaded the summary statistics from one of the latest GWA studies of BD from the Psychiatric Genomics Consortium, available on bioRxiv (8). We used loci with a *p*-value of less than 5 × 10<sup>−6</sup> and reannotated them with the Ensembl database (35). In total, we obtained 131 genes, including coding and noncoding genes, for the GWA enrichment analysis. There were, in total, 137 DEGs that showed signals in both our discovery samples and the mega-analysis. We thus conducted DEG enrichment analysis to explore whether the mania-related modules are more enriched with these DEG signals. The *p*-value of each module for the enrichment analyses was calculated using Fisher's exact test. Lastly, for modules significantly associated with mania status or YMRS scores, we conducted gene-set analysis for genes in these identified modules to explore their functions using Multi-marker Analysis of GenoMic Annotation (MAGMA) (45). The background pool of the gene-set analysis contained all 34,576 genes as reference. To explore tissue specificity for these modules, we used data from the GTEx project (46) and the BrainSpan Atlas (47). The significance criterion was set at an adjusted *p*-value of <0.05.
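The module enrichment test described above can be sketched as a one-sided Fisher's exact test, which is equivalent to an upper-tail hypergeometric probability. The text names Fisher's exact test but not its sidedness, so testing for over-representation only is an assumption here; the function name and the toy counts are hypothetical.

```python
from math import comb


def enrichment_p_value(overlap: int, module_size: int,
                       hits_total: int, background: int) -> float:
    """One-sided Fisher's exact test for over-representation.

    Returns P(X >= overlap), where X is the number of hit genes (for
    example DEGs or GWA genes) falling in a module of `module_size`
    genes, given `hits_total` hit genes among `background` genes.
    """
    max_overlap = min(module_size, hits_total)
    total_ways = comb(background, module_size)
    tail_ways = sum(
        comb(hits_total, k) * comb(background - hits_total, module_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail_ways / total_ways


if __name__ == "__main__":
    # Toy counts: 10 background genes, 3 hits, a module of 4 genes
    # that contains 2 of the hits.
    p = enrichment_p_value(overlap=2, module_size=4, hits_total=3, background=10)
    print(round(p, 4))  # 0.3333
```

With the study's numbers, `background` would be the 34,576 genes and `hits_total` would be 131 (GWA genes) or 137 (DEGs); exact integer arithmetic keeps the calculation stable at that size.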

# RESULTS

In the DEG analysis, we constructed a linear model including YMRS score for intraindividual comparisons between episodic status. Results of the DEG analysis showed reasonable consistency between DEGs obtained from either stage or symptom severity. The correlations between them were high, whether in our discovery samples or in the Witt et al. (25) samples, with correlation coefficients of around 0.69–0.84 (see **Supplementary Figure 1**). We found that most of the DEGs were upregulated in the manic state (see **Figure 1C** and **D** and **Table 2**). There were 306 DEGs in our discovery microarray data, and 321 DEGs in the mega-analysis. Among the 306 DEGs in the discovery data, 60 DEGs remained significant in the mega-analysis, and the predominant gene category was ncRNAs (more than 40%, see **Figure 1A** and **B**). However, none of the DEGs met the statistical significance criterion after multiple testing correction, whether from the discovery analysis or the mega-analysis. We further evaluated the gene expression correlations across experimental platforms. The correlations for the fold change between HTAs, RNA-Seq, and mega-analysis are shown in **Figure 2**, with a significant but modest correlation between microarray (mega-analysis) and RNA-Seq (0.20 in **Figure 2A**). The correlation was higher among coding DEGs (*r* = 0.44) (**Figure 2B**) but relatively unchanged among noncoding DEGs (*r* = 0.19) (**Figure 2C**). The heatmaps of DEGs among different platforms are shown in **Supplementary Figure S2**.
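The text does not name the multiple testing procedure that was applied. As an illustration of why nominally significant DEGs can lose significance after correction, the sketch below implements the Benjamini-Hochberg false discovery rate adjustment; choosing this procedure is an assumption, and the *p*-values are invented.

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Invented nominal p-values, all below 0.05.
    nominal = [0.01, 0.04, 0.03, 0.005]
    print([round(p, 4) for p in benjamini_hochberg(nominal)])
    # [0.02, 0.04, 0.04, 0.02]
```

In this four-test toy case every gene survives, but the adjustment scales with the number of tests: with 34,576 genes tested, a nominal *p*-value of 0.01 would need to rank among roughly the top 170 genes to reach an adjusted value below 0.05.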

We listed the top 10 coding and noncoding DEGs in discovery samples in **Table 2** along with results from other experimental platforms. In general, the expression directions were similar across platforms. Results of the mega-analysis showed that there were more noncoding than coding DEGs, such as RNA5SP294 (rRNA), PTCSC3 (lincRNA), and RNU6-1228P (snRNA). Among noncoding genes, two miRNAs (*MIR181B1* and *MIR103A1*) showed signals with high levels of gene expression in the manic state (fold change ranged from 1.8 to 3.0). These ncRNA results, however, could not be examined in our RNA-Seq data (see **Supplementary Figure S3**). We used standard library preparation, which is not designed for capturing ncRNAs. Thus, most of the noncoding genes could not be detected on this platform. For instance, none of the DEGs in the RNA-Seq analysis were miRNAs. Consequently, fewer noncoding genes were seen in RNA-Seq than on the microarray platform (**Figure 1C and D**), and in turn, the correlation of fold change in noncoding DEGs was low (see also **Figure 2**). On the other hand, among the coding genes, we found interesting targets in the taste 2 receptor (TAS2) gene family. The *TAS2R5* and *TAS2R3* genes showed signals in the discovery samples and mega-analysis, suggesting the potential involvement of the taste system in the manic state. In general, the DEG targets from microarrays and RNA-Seq did not overlap much, and only four genes were significant in both the microarray and RNA-Seq platforms as well as in the mega-analysis: *MS4A14*, *PYHIN1*, *UTRN*, and *DMXL2* (**Table 2**). We used qPCR to validate gene expression patterns of selected gene targets, including the four overlapping genes and *TAS2R5*. All of these genes showed significant differences between manic and remission states in qPCR, in the same direction as in the original analysis (**Figure 3**).

In our network analysis, we found that the 34,576 genes were clustered into 33 modules (median module size: 324; mean module size: 1,048; **Figure 4**). There were three modules (Royalblue, Brown, and Darkred) that were significantly correlated with both mania and YMRS score. Two modules (Darkgrey and Cyan) were significantly correlated with YMRS score, and one module (Lightcyan) was significantly correlated with the manic state. We then ran enrichment analysis for these identified modules. In GWAS enrichment analysis, none of the modules were significantly enriched with GWAS signals.


TABLE 2 | Top 10 significant coding and noncoding differentially expressed genes (DEGs) in HTA and their performance on different platforms.

*\*Significant in HTA and mega-analysis, #validated by quantitative real-time PCR (qRT-PCR).*

*FC, fold-change; rRNA, ribosomal RNA; snRNA, small nuclear RNA; lincRNA, long intergenic non-coding RNA.*

For the DEG enrichment analysis, only the Darkred module showed marginal significance (*p* = 0.045), and interestingly, all DEGs enriched in this module were ncRNAs (see **Supplementary Table S3**).

Possible biological functions of each mania-related module were explored, and the results are displayed in **Table 3**. Using the tissue specificity test, we found that the Royalblue, Cyan, and Lightcyan modules were enriched in whole blood and Epstein-Barr virus (EBV)-transformed lymphocytes. The genes in the Royalblue and Cyan modules were enriched in substantia nigra (SN) and anterior cingulate cortex (ACC) brain tissue, respectively. In the gene-set enrichment analysis, these two modules were significantly associated with immune-related pathways, including cytokine–cytokine receptor interaction, response to type I interferon, and antigen processing and presentation of exogenous peptide antigen via major histocompatibility complex (MHC) class I.

*\*GTEx 53, the 53 tissue expression from Genotype-Tissue Expression (GTEx); YMRS, Young Mania Rating Scale.*

*N/A, not applicable; BA24, Brodmann area 24; MHC, major histocompatibility complex; KEGG, Kyoto Encyclopedia of Genes and Genomes; GO, Gene Ontology; ERAD, endoplasmic-reticulum-associated protein degradation.*

#### DISCUSSION

In the present study, we focused on the manic feature in BD to capture state-specific transcriptome patterns using data from different platforms, and analyzed both coding and noncoding RNAs. Our results were obtained by DEG analysis as well as network and enrichment analyses to identify potential transcriptome biomarkers and provide biological explanations. As far as we know, this is the first transcriptome study to report "state markers" for manic episode with intraindividual comparisons, while previous studies usually compared different groups of patients (22). Interindividual comparisons are subject to confounding bias, such as batch effects, genetic heterogeneity, and different environmental exposures across subjects (20). Therefore, studying dynamic transcriptome changes within individuals is preferable when searching for biomarkers of the manic state and may provide insights to assist early diagnosis of BD.

In our DEG analysis, we found that ncRNAs are the predominant category, showing mainly upregulation during acute manic episode (see **Figure 1**), which is possibly related to characteristics of manic symptoms, such as agitation and overenergetic behaviors. Among ncRNAs, lncRNAs are the largest category in our DEGs. In a previous study, a small number of DEGs were shown for mania, and all of the 22 genes were coding genes. Interestingly, all of them were upregulated in mania (25). Across microarray data (our discovery and German samples), we also found that the fold change is more consistent in ncRNAs than in coding genes; therefore, ncRNAs may be relatively reliable targets for manic episode.

In the RNA-Seq results, unfortunately, the original library preparation was not tailored for detecting ncRNAs, and lncRNAs have particularly lower expression than coding genes (48) in non-brain tissues. Thus, most of the ncRNAs reported in our discovery samples cannot be detected by RNA-Seq (**Table 2**). Specific techniques, such as ribosomal RNA depletion for library construction (18) or CaptureSeq (49), are required for more comprehensive ncRNA detection. Similarly, correlations of fold change across microarray and RNA-Seq platforms vary by coding or noncoding category such that the correlations of coding DEGs are higher, ranging from 0.4 to 0.5, and lower for noncoding DEGs (0.2). The lncRNAs showed various regulatory abilities in human genomes (16). Thus, lncRNAs may serve as potential targets for manic episode and could be conveniently assessed by available microarrays. Future studies need to use ribosomal RNA depletion for library construction or CaptureSeq to validate the results of ncRNAs for mania episode biomarkers. Despite our efforts to use multiple datasets to search for more convincing biomarkers for the manic state, none of our DEGs with nominal *p*-value of <0.05 reach statistical significance after multiple testing correction. Our study design with intra-individual comparisons might increase statistical power to a certain extent (50); however, the sample size in the present study might still be too small to detect weak gene effects between acute and remission states. Moreover, heterogeneity in datasets, including ethnics (e.g., Taiwanese and German samples), patients' inclusion criteria, and clinical characteristics, may also contribute to the increased challenges in finding reliable biomarkers across datasets for the manic state. For the purpose of validation, we selected only the four overlapping DEGs across different platforms and *TAS2R5* (**Table 2**) for experimental validation. 
All the qPCR results were significant and highly consistent with the fold-change direction in the original analyses, indicating the potential of studying these targets for bipolar illness in future studies. Among the target genes, *MS4A14* belongs to a large gene family called membrane spanning 4-domains. It has been found to be involved in DNA methylation and transcript splicing in Alzheimer's disease (51). *PYHIN1*, pyrin and HIN domain family member 1, is related to interferon regulation and has been found to be related to depressive behaviors induced by lipopolysaccharide in mouse models (52), which suggests its potential role as a state marker. *UTRN* encodes utrophin, which is located at synapses and myotendinous junctions, and was reported to be a candidate gene for schizophrenia and BD (53). An interesting study exploring the blood-based spliceosome found that *UTRN* had differential splicing for psychosis in schizophrenia and BD (54), again demonstrating the potential of using blood-based biomarkers for psychiatric disorders. *DMXL2*, Dmx Like 2, is involved in synaptic vesicle exocytosis in major depressive disorder patients (55). *DMXL2* has also been found to be related to co-occurring cardiovascular disease under selective serotonin reuptake inhibitor (SSRI) treatment in patients with major depressive disorder (56). Among the top 10 coding genes reported in the discovery samples, *TAS2R5* is an interesting target that also showed signal in the mega-analysis (see **Table 2**). Taste 2 receptors (TAS2Rs), also called bitter taste receptors, belong to the G protein-coupled receptors. Unlike other TAS2Rs, *TAS2R3* and *TAS2R5* have a limited range of agonists due to their unique structure (57). *TAS2R5* has been found to be overexpressed in children with asthma and might have an anti-inflammatory function by regulating cytokines (58), indicating its multifunctional characteristics.
TAS2Rs are expressed in rat and human brain tissues and have been reported to be related to Parkinson's disease and neurodevelopmental diseases (59). *TAS2R5* was also found to be downregulated in the dorsolateral prefrontal cortex of schizophrenia postmortem brain tissues compared to healthy control brain tissues (60). Interestingly, another gene, *CES1*, showing signal in our discovery samples (*p* = 7.99 × 10<sup>−3</sup>) and mega-analysis (*p* = 1.94 × 10<sup>−2</sup>), was found to be related to taste reduction in children with attention-deficit/hyperactivity disorder who underwent methylphenidate treatment (61). In the present study, although the effects of medication were not directly evaluated, we recorded the information for each patient to ensure that medication was not changed during the follow-up period. In clinical observations, dysregulation of taste often exists during acute episodes in BD patients (62). The taste alteration might also be related to cognitive performance (63). Therefore, the taste-related receptors are potential candidates as pharmacological targets for episodic status. Further *in vivo*/*in vitro* experiments are needed to verify the medication effects on these taste-related genes. The recent development of BD patient-derived induced pluripotent stem cells (iPSCs) provides information on medication effects on gene expression (64, 65). For example, recent studies using iPSCs of BD patients have produced interesting findings on the therapeutic mechanisms of lithium (66). This is a potential model to study mechanisms underlying the manic state.

For noncoding RNAs among the DEGs, lncRNAs were the predominant category. Among our noncoding DEGs, the PTCSC3 (papillary thyroid carcinoma susceptibility candidate 3) gene was reported to be associated with thyroid cancer (67). *In silico* prediction is now widely used for the functional interpretation of lncRNAs (68), and more lncRNA functions are likely to be validated in the near future. On the other hand, miR181B1, which was upregulated in the manic state, was an interesting target among short ncRNAs. miR181B1 has been found to be downregulated in schizophrenia patients receiving antipsychotic medication, including olanzapine, quetiapine, ziprasidone, and risperidone (69). Furthermore, another study demonstrated that miR181B1 showed higher expression in treatment-resistant schizophrenia patients than in treatment-responsive patients (70). The target genes of miR181B1 might be related to manic symptom onset in BD.

In total, we identified 33 modules by WGCNA with the ideal soft threshold (Beta = 4), and 6 modules were significantly related to clinical outcomes (YMRS or manic stage). The mania-related modules were not enriched with GWAS signals even when the signals were from one of the largest BD studies so far (8). The state markers that we explored in the present study might be very different from the trait susceptibility loci of BD. In the recent cross-disorder analysis among major psychiatric disorders, schizophrenia, BD, and major depressive disorder (71), the authors reported a similar scenario: common genetic loci exist across different major psychiatric disorders, but transcriptional dysregulation differs.

In the DEG enrichment analysis for modules, we found that most of the modules were not enriched with DEGs, which could be explained by the "omnigenic model" proposition. This model suggests that the DEGs within the module network are highly connected with other genes, and the biological functions are decided by the whole module rather than by a few signals. Among the six mania-related modules, we found that the Royalblue and Cyan modules are enriched with gene expression in the substantia nigra and BA24 (anterior cingulate cortex) brain regions, which are highly relevant to BD (18, 72). These results suggest that peripheral transcriptome analysis can, to some extent, reflect the dynamic changes in the central nervous system. Moreover, these two modules are especially enriched for cytokine-related immune pathways. These findings are consistent with previous findings on the disturbance of cytokines between different states of BD (73). In addition, among previous gene expression studies based on postmortem brain (17, 18) or iPSCs (19, 74) using a case–control study design, most of the enriched pathways were related to calcium channels or G protein-coupled receptors. It is highly possible that the dysregulated targets of trait markers and state markers involve different sets of genes. This echoes the observation of not finding enrichment for GWAS signals in coexpression modules. Information coming from different study designs would provide more comprehensive perspectives in understanding the etiology of bipolar illness. In our results, the cytokine-related pathways are based on a genome-wide gene expression network, which might provide more insights into cytokine-related mechanisms underlying manic episodes in BD.

There are several limitations in the present study. First, we had a limited sample size, as follow-up of manic BD patients is difficult. We used different platforms and independent samples for replication and validation, and conducted a mega-analysis to increase generalizability. In addition, within-individual paired data are less prone to confounding effects in gene expression studies and increase the power for DEG identification. However, the sample size in the present study was small and may not have provided enough power to detect genes with weak effects. Nevertheless, all of the gene targets that we selected for qPCR validation showed significant differences with the same expression patterns as the original analyses, which increased our confidence of finding meaningful targets despite moderate power. Second, for practical reasons, we used only peripheral blood samples rather than brain tissues for state-specific transcriptome analysis. We could not control the blood drawing time across subjects and time points, which may slightly influence gene expression results because a certain proportion of human genes show circadian expression rhythms. Third, the assessment of symptom severity is subject to interviewer bias. Nevertheless, by treating YMRS score as one of the covariates in our linear model, this concern may be minimized. Lastly, although medication might influence gene expression results, we were not able to control the medications among BD patients in our observational study design, and the medication effects on gene expression cannot be directly assessed as we compared the same individuals at two time points with the same medication treatment during follow-up. Instead, we tried to adjust for medication usage of different categories of drugs in regression models, and the results remained similar with or without adjustment.

In summary, the present study represents one of the very few studies that focused on the manic feature of BD using intraindividual comparisons. Using genome-wide exploration of gene expression patterns with different experimental platforms and sets of independent samples, our results provide the first line of evidence for the involvement of coding and noncoding transcriptomic alterations in manic episodes. In particular, with experimental validation, taste-related genes (including *TAS2R5*, *TAS2R3*, and *CES1*) and other common targets across different platforms (*MS4A14*, *PYHIN1*, *UTRN*, and *DMXL2*) might be important targets for episodic onset. Results of network and enrichment analyses suggest a potential role of cytokine-related pathways in mania, and the gene expression network is independent of the signals from BD susceptibility loci identified in previous GWA studies. These results bring insights to the design of future studies for early diagnosis and detection of manic episodes in bipolar illness.

#### ETHICS STATEMENT

In the present study, all participants signed informed consent forms after study procedures were fully explained. The sample recruitment and data collection were approved by the Institutional Review Board of all participating institutes and hospitals, including National Taiwan University Hospital Research Ethics Committee, Quality and Patient Safety Committee of Taipei City Hospital and Institutional Review Board of Wangfang Hospital.

#### AUTHOR CONTRIBUTIONS

All investigators contributed to the design or execution of the study, and approved the final version. YC was responsible for the major data analysis and manuscript writing. PH designed the study, obtained funding, drafted the analytical plan, guided the statistical analysis, interpreted the data, and critically revised the manuscript. MC provided the major part of the resources for data collection and interviewer training. CE, MH, HC, ML, WY and CH were responsible for interviewer training and data collection. YL and KT conducted the experimental validation. TP and MH provided the resources and guidance for RNA-sequencing techniques.

#### FUNDING

This study was supported by the National Health Research Institutes Project (NHRI-EX106-10627NI), Ministry of Science and Technology Project (MOST 105-2628-B-002-028-MY3), and the National Taiwan University Career Development Project (104R7883) to PI, Dr. PH Kuo, and MOST 107-2314-b-038-085 to PI, Dr. CH Chen, and Taipei City Hospital Research Project (TPCH-108-057) to PI, Dr. MC Huang.

#### ACKNOWLEDGMENTS

Part of the data analysis and microarray experiments was done with the help of the National Taiwan University Center of Genomic Medicine and the Microarray Core Lab of the Core Instrument Center in National Health Research Center. We thank P.C. Hsiao for his IT assistance and Jessica Ho for experimental assistance. We also acknowledge the PGC Bipolar Disorder Working Group for access to published GWAS summary statistics. Lastly, we acknowledge Dr. Stephanie H. Witt for kindly offering clinical severity data for her published results to enable the mega-analysis of all microarray samples.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyt.2019.00280/full#supplementary-material




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Lee, Chao, Chang, Hsieh, Liu, Chen, Lu, Chen, Chen, Tsai, Lu, Huang and Kuo. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Genetic Variability of TCF4 in Schizophrenia of Southern Chinese Han Population: A Case-Control Study

Jingwen Yin<sup>1</sup>†, Dongjian Zhu<sup>1</sup>†, You Li<sup>2,3</sup>†, Dong Lv<sup>1</sup>, Huajun Yu<sup>4</sup>, Chunmei Liang<sup>2,3</sup>, Xudong Luo<sup>1</sup>, Xusan Xu<sup>3</sup>, Jiawu Fu<sup>2</sup>, Haifeng Yan<sup>1</sup>, Zhun Dai<sup>1</sup>, Xia Zhou<sup>3</sup>, Xia Wen<sup>3</sup>, Susu Xiong<sup>1</sup>, Zhixiong Lin<sup>1</sup>, Juda Lin<sup>1</sup>, Bin Zhao<sup>3</sup>, Yajun Wang<sup>5</sup>\*, Keshen Li<sup>3,6,7</sup>\* and Guoda Ma<sup>2,3</sup>\*

#### Edited by:

Zhexing Wen, Emory University School of Medicine, United States

#### Reviewed by:

Fabian Streit, Central Institute for Mental Health, Germany Gabriel R. Fries, University of Texas Health Science Center at Houston, United States

#### \*Correspondence:

Yajun Wang, wangyajuny1977@aliyun.com; Keshen Li, keshenli1971@163.com; Guoda Ma, sihan1107@126.com

†These authors have contributed equally to this work

#### Specialty section:

This article was submitted to Behavioral and Psychiatric Genetics, a section of the journal Frontiers in Genetics

Received: 05 November 2018; Accepted: 10 May 2019; Published: 28 May 2019

#### Citation:

Yin J, Zhu D, Li Y, Lv D, Yu H, Liang C, Luo X, Xu X, Fu J, Yan H, Dai Z, Zhou X, Wen X, Xiong S, Lin Z, Lin J, Zhao B, Wang Y, Li K and Ma G (2019) Genetic Variability of TCF4 in Schizophrenia of Southern Chinese Han Population: A Case-Control Study. Front. Genet. 10:513. doi: 10.3389/fgene.2019.00513

<sup>1</sup> Department of Psychiatry, Affiliated Hospital of Guangdong Medical University, Zhanjiang, China, <sup>2</sup> Department of Neurology, Affiliated Hospital of Guangdong Medical University, Zhanjiang, China, <sup>3</sup> Guangdong Key Laboratory of Age-Related Cardiac and Cerebral Diseases, Guangdong Medical University, Zhanjiang, China, <sup>4</sup> Experiment Animal Center, Guangdong Medical University, Zhanjiang, China, <sup>5</sup> Clinical Research Center, Affiliated Hospital of Guangdong Medical University, Zhanjiang, China, <sup>6</sup> Department of Neurology and Stroke Center, The First Affiliated Hospital, Jinan University, Guangzhou, China, <sup>7</sup> Clinical Neuroscience Institute of Jinan University, Guangzhou, China

Objective: Schizophrenia is thought to be a neurodevelopmental disorder. As a key regulator in the development of the central nervous system, transcription factor 4 (TCF4) has been shown to be involved in the pathogenesis of schizophrenia. The aim of our study was to assay the association of TCF4 single nucleotide polymorphisms (SNPs) with schizophrenia and the effect of these SNPs on phenotypic variability in schizophrenia in a Southern Chinese Han population.

Methods: Four SNPs (rs9960767, rs2958182, rs4309482, and rs12966547) of TCF4 were genotyped in 1137 schizophrenic patients and 1035 controls in a Southern Chinese Han population using the improved multiplex ligation detection reaction (iMLDR) technique. For patients with schizophrenia, the severity of symptom phenotypes was analyzed by the five-factor model of the Positive and Negative Symptom Scale (PANSS). Cognitive function was assessed using the Brief Assessment of Cognition in Schizophrenia (BACS) scale.

Results: The genotypes and alleles of three SNPs (rs2958182, rs4309482, and rs12966547) were not significantly different between the control group and the case group (all P > 0.05). rs9960767 could not be included in the statistics because of its extremely low minor allele frequency. However, the genotypes of rs4309482 showed a potential risk for the positive symptoms (P = 0.04) and excitement symptoms (P = 0.04) of the five-factor model of the PANSS, although these findings did not survive multiple test correction. The same potential risk was shown for rs12966547 in positive symptoms of the PANSS (P = 0.03).

Conclusion: Our results found no association of the SNPs (rs2958182, rs4309482, and rs12966547) in TCF4 with schizophrenia in a Southern Chinese Han population.

Keywords: schizophrenia, TCF4, polymorphisms, positive psychotic symptoms, Southern Chinese Han population

# INTRODUCTION


Schizophrenia is thought to be a highly heritable disease with a genetic architecture arising from the subtle effect of multiple risk genes (Wang et al., 2017). The "accumulation" of dysregulation events in susceptibility genes leading to dysfunction in the nervous system results in the phenotypic heterogeneity of schizophrenia, including positive symptoms, negative symptoms, and cognitive dysfunction (Takahashi, 2013).

TCF4, a transcription factor involved in the development of the nervous system, is found to be a highly plausible candidate for contributing to schizophrenia (Lennertz et al., 2011). In vitro and in vivo evidence demonstrated the involvement of TCF4 in all stages of brain development, including proliferation, differentiation, migration and synaptogenesis, as well as in adult brain plasticity and information signaling (D'Rozario et al., 2016). Rare TCF4 mutations led to neurodevelopmental disorders, such as Pitt–Hopkins syndrome, which is characterized by severe cognitive deficit, microcephaly, disrupted motor development, and hyperventilation (Forrest et al., 2012). In animal studies, overexpression of TCF4 in transgenic mice resulted in deficits in prepulse inhibition (PPI) and memory, fear conditioning and sensorimotor gating (Brzózka et al., 2010). Functional deficit in the neural system might be involved in the etiology of schizophrenia. Corresponding results in clinical studies suggested a critical effect of TCF4 in several phenotypes of schizophrenia, including age at onset, sensorimotor gating, negative symptoms, cognitive function and MRI-measured brain structure (Albanna et al., 2014; Hui et al., 2015; Chow et al., 2016; Alizadeh et al., 2017). Further analysis showed increased TCF4 mRNA expression in psychosis patients compared with controls (Wirgenes et al., 2012). Furthermore, as a basic helix-loop-helix (bHLH) transcription factor, TCF4 is considered a crucial player in gene expression networks through regulation of gene expression (Forrest et al., 2014). In particular, TCF4 has been identified as a direct target of schizophrenia-associated pivotal factor miR-137, suggesting a particular susceptibility whereby TCF4 could be involved in the gene regulatory networks underlying schizophrenia (Yin et al., 2014; Xia et al., 2018).

Large genome-wide association studies (GWAS) with replicable and intriguing findings have suggested that several single nucleotide polymorphisms (SNPs) of TCF4 significantly increase susceptibility to schizophrenia (Schizophrenia Psychiatric Genome-Wide Association Study [GWAS] Consortium, 2011; Steinberg et al., 2011; Zammit et al., 2014). The variant rs9960767, located in intron 3 of the TCF4 gene on chromosome 18q21.1, was significantly associated with schizophrenia in the European population (Steinberg et al., 2011). However, it was not polymorphic in a previous case-control study in the East Chinese Han population (Li et al., 2010). Interestingly, the rs2958182 polymorphism was predicted to be a proxy SNP for rs9960767, given the high linkage disequilibrium (LD) (D' = 1) and the short physical distance (∼6 kb) between the two SNPs. rs2958182 has been reported to be involved in several phenotypes of schizophrenia in the Chinese population. Another risk SNP identified by Steinberg et al. (2011), rs4309482, lies in the intergenic region downstream of TCF4 and upstream of coiled-coil domain containing 68 (CCDC68). Finally, in another mega-GWAS analysis, rs12966547, which was in high LD with rs4309482, was associated with the risk of schizophrenia (Schizophrenia Psychiatric Genome-Wide Association Study [GWAS] Consortium, 2011).

Considering the potential for ethnic and geographic heterogeneity, the susceptibility associated with rs9960767 and rs2958182 should be further verified in the Southern Han population in China. For the first time, we conducted a case-control study to explore the risk associated with rs4309482 and rs12966547 in schizophrenia. The effects of a genetic risk variant on phenotype, including demographic characteristics and neurocognitive function, were examined to further identify the genetic sources of phenotypic heterogeneity.

# MATERIALS AND METHODS

#### Ethics Statement

The study was approved by the Ethical Committee of the Affiliated Hospital of Guangdong Medical University, and written consent forms were obtained from the participants or their legal representatives.

#### Subjects

A total of 1137 unrelated patients with schizophrenia were consecutively recruited from the Affiliated Hospital of Guangdong Medical University. All patients were enrolled based on the following criteria: (1) a diagnosis of schizophrenia according to the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV) criteria, made by at least two experienced senior psychiatrists; (2) age from 18 to 55 years; (3) completion of a standardized battery of examinations, including family history, extensive drug and alcohol assessment, physical and neurological examination, and laboratory tests, to exclude substance-induced psychotic disorders or psychosis caused by a general medical condition. A total of 1035 healthy controls were randomly selected from the Health Examination Center of the Affiliated Hospital of Guangdong Medical University. Based on unstructured interviews and physical examination reports, controls with a personal or family history of psychiatric disorders, substance abuse, or serious somatic illnesses were excluded. All subjects were of Han Chinese origin.

#### Symptom and Neurocognitive Function Assessment

The psychotic symptoms of patients with schizophrenia were evaluated with the Positive and Negative Symptom Scale (PANSS). Although the items of the PANSS are divided into three subscales (the positive, negative and general psychopathology scale), several factor analyses have shown that five-factor models better characterize PANSS data (Van et al., 2006). Thus, our results for the PANSS were presented in a five-factor model encompassing the following factors: positive (total score of P1, P5, P6, G9), negative (total score of N1, N2, N3, N4, N6, G16), excitement (total score of P4, P7, G4, G14), depression/anxiety (total score of G1, G2, G3, G6, G15), and cognitive (total score of P2, N5, G5, G10, G11) (Lindenmayer et al., 1994). Additionally, the neurocognitive function of the patients was assessed using the Brief Assessment of Cognition in Schizophrenia (BACS), as described previously (Ma et al., 2014).
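The five-factor scoring described above can be sketched in a few lines of Python. This is only an illustration of the item-to-factor sums listed in the text; the function name, the dictionary layout, and the example ratings are assumptions made for this sketch and are not taken from the study.

```python
# Item-to-factor mapping exactly as listed in the text (Lindenmayer et al., 1994).
FIVE_FACTOR_ITEMS = {
    "positive": ["P1", "P5", "P6", "G9"],
    "negative": ["N1", "N2", "N3", "N4", "N6", "G16"],
    "excitement": ["P4", "P7", "G4", "G14"],
    "depression_anxiety": ["G1", "G2", "G3", "G6", "G15"],
    "cognitive": ["P2", "N5", "G5", "G10", "G11"],
}


def five_factor_scores(item_scores):
    """Sum the PANSS item ratings that belong to each of the five factors."""
    return {
        factor: sum(item_scores[item] for item in items)
        for factor, items in FIVE_FACTOR_ITEMS.items()
    }


# Hypothetical patient (not study data): all 30 PANSS items rated 2.
all_items = (
    [f"P{i}" for i in range(1, 8)]
    + [f"N{i}" for i in range(1, 8)]
    + [f"G{i}" for i in range(1, 17)]
)
example = five_factor_scores({item: 2 for item in all_items})
print(example)
```

Items that do not appear in any factor (for example P3 or N7) are simply ignored by this mapping.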

#### DNA Extraction and Genotyping

Genomic DNA from EDTA-treated peripheral blood was extracted using the TIANamp Blood DNA Kit (Tiangen Biotech, Beijing, People's Republic of China). The improved multiplex ligation detection reaction (iMLDR) method (Genesky Biotechnologies Inc., Shanghai, China) was used to genotype candidate SNPs, as described previously (Xu et al., 2018). The primer information for the multiplex polymerase chain reaction (PCR) is described in **Supplementary Table S1**.

#### Statistical Analysis

The statistical analysis was performed using SPSS 22.0 software. The descriptive variables are presented as the mean ± standard deviation (SD). P < 0.05 was considered significant for all statistical tests. Hardy–Weinberg equilibrium was tested using Pearson's chi-square (χ<sup>2</sup>) test. The allelic and genotypic frequencies were compared between patients and controls using χ<sup>2</sup> tests. Generalized odds ratios (ORs) with 95% confidence intervals (CIs) of the alleles were calculated. Subjects' basic demographic data, such as age, gender, and family history, were compared using χ<sup>2</sup> tests. To test the effect of genotype on phenotypes, analysis of variance (ANOVA) was conducted with the genotype as the fixed factor and, as dependent variables, the five factors (positive, negative, excitement, depression/anxiety, and cognitive) from the PANSS, age, age at onset, duration, and cognitive scores (the BACS total and 5 index scores). Multiple testing was corrected using the Bonferroni method. Power calculations were performed using QUANTO 1.2 software. The LD status was determined using the Haploview 4.2 program. Only those haplotypes with frequencies greater than 3% were further analyzed.
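As an illustration of the Hardy–Weinberg test mentioned above, the following minimal Python sketch computes the Pearson χ<sup>2</sup> statistic and its P value (1 degree of freedom) from genotype counts. The study itself used SPSS; the function name and the genotype counts below are hypothetical and serve only to show the calculation.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Pearson chi-square test of Hardy-Weinberg equilibrium for a biallelic SNP.

    Returns (chi2, p_value) with 1 degree of freedom.
    Assumes the SNP is polymorphic (both alleles observed).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 df, the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical genotype counts (not study data).
chi2, p_value = hwe_chi_square(30, 270, 700)
print(f"chi2 = {chi2:.2f}, P = {p_value:.2f}")
```

A P value above 0.05, as in this hypothetical example, would be read as no evidence of departure from Hardy–Weinberg equilibrium.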

# RESULTS

#### Demographic Characteristics

The patients and controls had comparable age and gender distributions for the TCF4 polymorphisms (all P > 0.05), as described in **Supplementary Table S2**. In addition, when analyzing the effect of TCF4 SNPs on clinical characteristics, we found no difference in duration, family history, or age at onset between the genotypes of the selected SNPs (**Table 2**).

#### Association Study of SNPs (rs9960767, rs2958182, rs4309482, rs12966547) and Schizophrenia

#### TABLE 2 | Genotypes of TCF4 gene polymorphisms and clinical characteristics of schizophrenic patients.

PANSS: Positive and Negative Symptom Scale; <sup>∗</sup>P < 0.05; P<sup>c</sup>: the P value corrected by Bonferroni correction.

A total of 1137 patients and 1035 controls were genotyped for rs4309482, rs12966547, and rs9960767, and 1916 samples (1021 patients and 895 controls) were genotyped for rs2958182. Only the AA and CA genotypes were observed for rs9960767, with 6 CA genotypes in schizophrenia patients and 2 in controls; thus, rs9960767 could not be included in the statistics. The distributions of the TCF4 rs2958182, rs4309482, and rs12966547 polymorphisms in our cohort are shown in **Table 1**. The frequency distribution of each tag SNP (rs2958182, rs4309482, rs12966547) in the case group and the controls was in Hardy–Weinberg equilibrium (all P > 0.05). No significant differences were found in the frequencies of genotypes (χ<sup>2</sup> = 2.15, P = 0.34 for rs2958182; χ<sup>2</sup> = 2.30, P = 0.32 for rs4309482; χ<sup>2</sup> = 2.46, P = 0.29 for rs12966547) or alleles (P = 0.50, OR = 0.94, 95% CI: 0.79–1.12 for rs2958182; P = 0.21, OR = 1.08, 95% CI: 0.96–1.08 for rs4309482; P = 0.20, OR = 0.92, 95% CI: 0.82–1.04 for rs12966547) between the patients with schizophrenia and the controls. Additionally, in the gender-stratified analysis, there were no significant differences in either genotype or allele distributions between schizophrenic patients and controls (**Supplementary Table S3**).
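To make the allelic comparison concrete, the sketch below computes an allelic odds ratio with a 95% confidence interval using the standard log (Woolf) method. The allele counts are an assumption: they are back-calculated from the rs2958182 A-allele frequencies reported in the Discussion (15.67% in cases, 16.48% in controls) and the numbers of genotyped subjects (1021 patients, 895 controls), not read from **Table 1**, and the function name is hypothetical. The article does not state which interval method was used.

```python
import math


def allelic_odds_ratio(case_a, case_b, ctrl_a, ctrl_b, z=1.96):
    """Odds ratio of allele A (cases vs. controls) with a Woolf-type 95% CI."""
    odds_ratio = (case_a * ctrl_b) / (case_b * ctrl_a)
    se_log_or = math.sqrt(1 / case_a + 1 / case_b + 1 / ctrl_a + 1 / ctrl_b)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Assumed counts: 320 of 2042 case alleles and 295 of 1790 control alleles
# carry A (reconstructed from reported frequencies, for illustration only).
odds_ratio, lower, upper = allelic_odds_ratio(320, 2042 - 320, 295, 1790 - 295)
print(f"OR = {odds_ratio:.2f}, 95% CI: {lower:.2f}-{upper:.2f}")
```

Under these assumed counts the sketch gives OR = 0.94 with a 95% CI of 0.79 to 1.12, in line with the values reported for rs2958182.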

#### TCF4 SNPs and Clinical Characteristics

The genotype distributions of the three SNPs were in Hardy–Weinberg equilibrium (all P > 0.05, data not shown). Positive scores (F = 3.34, P = 0.04) and excitement scores (F = 3.27, P = 0.04) differed significantly among the genotypes of rs4309482, and positive scores differed among the genotypes of rs12966547 (F = 3.57, P = 0.03). However, none of these three results survived Bonferroni correction (corrected P = 0.12, P = 0.12, and P = 0.09, respectively). There was no significant difference in other items for rs4309482 and rs12966547. Furthermore, there was no significant difference in age at onset, family history, duration, total or five-factor PANSS scores, or BACS scores when comparing the different rs2958182 genotypes (**Table 2** and **Supplementary Table S4**).
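The Bonferroni step can be illustrated with a short Python sketch. The number of tests is an assumption: the article does not state it, but a factor of 3 is consistent with the reported nominal and corrected P values. The function name is hypothetical.

```python
def bonferroni(p_values, n_tests):
    """Bonferroni-adjusted P values: multiply by the number of tests, cap at 1."""
    return [min(1.0, p * n_tests) for p in p_values]


# Nominal P values reported in the text; n_tests=3 is inferred, not stated.
nominal = [0.04, 0.04, 0.03]
corrected = [round(p, 2) for p in bonferroni(nominal, n_tests=3)]
print(corrected)
```

With this assumed factor the adjusted values are 0.12, 0.12, and 0.09, all above the 0.05 threshold.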

#### LD Analysis

We performed an LD analysis of the 4 loci. Strong LD was observed between rs4309482 and rs12966547 (D' = 1.0, r<sup>2</sup> = 0.997). rs2958182 showed complete D' but no correlation with rs9960767 (D' = 1.0, r<sup>2</sup> = 0.0), and it was not in LD with rs4309482 (D' = 0.04, r<sup>2</sup> = 0.0) or rs12966547 (D' = 0.04, r<sup>2</sup> = 0.001). In addition, rs9960767 showed moderate D' but negligible r<sup>2</sup> with rs4309482 (D' = 0.51, r<sup>2</sup> = 0.001) and rs12966547 (D' = 0.51, r<sup>2</sup> = 0.001) (**Supplementary Table S5**).
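The combination of D' = 1.0 with r<sup>2</sup> close to 0 can look contradictory, so a minimal Python sketch of the two standard LD measures may help. The haplotype and allele frequencies below are hypothetical (a rare allele at about 0.2% and a common allele at about 16% that never occur on the same haplotype); they were chosen only to resemble the situation described for rs9960767 and rs2958182 and are not study data. The study itself used Haploview.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D', r^2) from the A-B haplotype frequency and the A and B allele frequencies.

    Assumes both loci are polymorphic (0 < p_a < 1 and 0 < p_b < 1).
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r2


# Hypothetical frequencies: the rare allele never shares a haplotype with the common one.
d_prime, r2 = ld_stats(p_ab=0.0, p_a=0.002, p_b=0.16)
print(f"D' = {d_prime:.2f}, r2 = {r2:.4f}")
```

Because one of the four possible haplotypes is absent, D' reaches its maximum, while r<sup>2</sup> stays near zero because the two allele frequencies are very different.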

# DISCUSSION

The TCF4 gene was highly associated with schizophrenia in a recent GWAS analysis (Schizophrenia Psychiatric Genome-Wide Association Study [GWAS] Consortium, 2011). As a member of the bHLH group of proteins, TCF4 controls critical steps of various developmental and possibly plasticity-related transcriptional programs in the central nervous system (Quednow et al., 2014). A growing body of compelling evidence supports the crucial role of TCF4 during neurodevelopment and raises the possibility that TCF4 genetic perturbations may increase the risk for schizophrenia. In the present case-control study, we evaluated the potential association of four SNPs (rs9960767, rs2958182, rs4309482, rs12966547) of the TCF4 gene with schizophrenia in a sample of 1137 unrelated patients with schizophrenia and 1035 unrelated healthy people. We further aimed to test the effects of a genetic risk variant on phenotype and to identify the genetic sources of phenotypic heterogeneity.

Unexpectedly, we found no association of the genotype or allele distribution of rs2958182 with schizophrenia in our sample from Zhanjiang, although an association of this SNP with susceptibility to schizophrenia has been repeatedly found in previous studies. The inconsistent results might be due to ethnic and geographic heterogeneity. In the present study, the frequency of the rs2958182 A allele detected in our cohort from China (15.67% in the cases and 16.48% in the controls) was lower than that previously observed in other ethnicities, including Norwegians (37.3% in the cases) (Wirgenes et al., 2012) and Malaysians (18.94% in the cases) (Chow et al., 2016). Within the same ethnicity, genetic variation in TCF4 also exists among geographic groups of the Chinese Han population. The A allele frequency of rs2958182 in our cohort from Zhanjiang, in Southern China, was higher than that in the Chinese cohorts from Beijing (13.7% in the cases and 11.2% in the controls) (Hui et al., 2015), Shandong (11.2% in the cases and 11.4% in the controls) (Zhu et al., 2013) and Shanghai (10.3% in the cases) (Li et al., 2010) (as shown in **Figure 1**). Furthermore, we failed to find any association of rs2958182 with clinical neurocognitive characteristics of schizophrenia, which also might be due to the bias in genotype frequencies. In addition to geographical and ethnic factors, the case collection process, diagnostic criteria and environmental factors may also explain these differences. Therefore, rs2958182 susceptibility in schizophrenia should be further verified in different populations. In addition, with our study sample and assuming a risk allele frequency of 15.67%, we had 86.7% power to detect a genotype relative risk with an odds ratio of 1.3 at the 0.05 level. Therefore, there is still a 13.3% chance of a type II error, and a false negative result cannot be excluded.
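For readers who want a feel for the power calculation, the following Python sketch uses a simple normal approximation for a two-sided allelic test. It is not the genotype-based method implemented in QUANTO, so it is not expected to reproduce the reported 86.7% exactly. Several inputs are assumptions made for this sketch: the 15.67% frequency is treated as the control allele frequency, the allele counts are twice the numbers of subjects genotyped for rs2958182 (1021 patients, 895 controls), and the function names are hypothetical.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def allelic_power(p_control, odds_ratio, n_case_alleles, n_control_alleles, z_alpha=1.96):
    """Approximate power of a two-sided allelic test at alpha = 0.05 (normal approximation)."""
    odds_case = odds_ratio * p_control / (1 - p_control)
    p_case = odds_case / (1 + odds_case)  # allele frequency in cases implied by the OR
    se = math.sqrt(
        p_case * (1 - p_case) / n_case_alleles
        + p_control * (1 - p_control) / n_control_alleles
    )
    return normal_cdf(abs(p_case - p_control) / se - z_alpha)


# Assumed inputs (see the lead-in); not the study's QUANTO settings.
power = allelic_power(0.1567, 1.3, 2 * 1021, 2 * 895)
print(f"approximate power = {power:.2f}")
```

Under these assumptions the approximation yields a power of similar magnitude to the reported figure, which is all it is meant to illustrate.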

The variant rs9960767 is a common polymorphism that has been reported as a risk allele for schizophrenia in populations of European origin (Quednow et al., 2011; Steinberg et al., 2011) and in the Canadian population (Albanna et al., 2014). However, in China, a previous study of a large sample from East China (2496 schizophrenia cases and 5184 control subjects) reported no polymorphic variation in rs9960767 (Li et al., 2010). In our Southern Chinese sample, variation at rs9960767 was rare, with only 6 CA genotypes in patients with schizophrenia and 2 CA genotypes in controls. The frequency of the CA genotype was 0.53% in patients and 0.19% in controls. Both of these frequencies are significantly lower than those in the HapMap HCB data (CA = 2.33%)<sup>1</sup>. These results suggest that ethnic heterogeneity also exists in rs9960767. Furthermore, the absence of the CC haplotype at rs9960767 and the different allele frequencies lead to heterogeneity in LD between rs2958182 and rs9960767. Although rs2958182 is a marker near (∼6 kb) and in complete LD with rs9960767 in people of European origin, the LD analysis in our sample showed D' = 1 and r<sup>2</sup> = 0. This striking discrepancy highlights the inherent problems of the LD indices D' and r<sup>2</sup>. D' is not a sensitive measure of LD for rare variants such as rs9960767. However, r<sup>2</sup>, which summarizes both recombinational and mutational history, is better able to assess LD. This r<sup>2</sup> value indicated the lack of LD between rs2958182 and rs9960767 in our Chinese sample. Therefore, particular care should be taken in case-control studies that evaluate the association of these two variants with schizophrenia in a Chinese sample.

<sup>1</sup>https://www.ncbi.nlm.nih.gov/projects/SNP/snp\_ref.cgi?rs=9960767
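The contrast between D' = 1 and r<sup>2</sup> close to 0 follows directly from the standard definitions of the two indices. The minimal Python sketch below illustrates this with hypothetical frequencies (a rare allele at 0.3% and a common allele at 16%; these are illustrative values, not the haplotype frequencies estimated in this study), assuming that the rare allele is never observed on the same haplotype as the common allele.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A (locus 1) and B (locus 2)
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    d = p_ab - p_a * p_b
    # D' scales |D| by the largest value it could take given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    # r^2 is the squared correlation between the alleles at the two loci.
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


# Hypothetical example: a rare allele (0.3%) that never co-occurs with a
# common allele (16%) on the same haplotype.
d, d_prime, r2 = ld_stats(p_ab=0.0, p_a=0.003, p_b=0.16)
print(f"D = {d:.6f}, D' = {d_prime:.2f}, r^2 = {r2:.4f}")
```

Under these assumptions D' reaches its maximum because one of the four haplotypes is missing, whereas r<sup>2</sup> stays near zero because the rare allele carries almost no information about the genotype at the common locus.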

Our results indicated that rs4309482 and rs12966547 are in strong LD in our sample of Southern Chinese Han individuals. The rs4309482 and rs12966547 genotypes were associated with positive symptoms and excitement symptoms in a five-factor model of the PANSS, but these associations did not survive multiple testing correction. In previous studies, these two variants have been linked to several psychosis phenotypes. In a larger Norwegian study sample, these two risk alleles were confirmed to be related to poorer verbal fluency (Wirgenes et al., 2012). Furthermore, rs12966547 of TCF4 was significantly associated with earlier age at onset in a Malaysian population (Chow et al., 2016). However, in a German study, rs4309482 was not associated with schizophrenia at the level of genotypes, alleles, clinical symptoms or cognitive function (Papiol et al., 2011). Although rs4309482 and rs12966547 of TCF4 appear to be risk factors influencing the phenotypes in schizophrenia, it seems unlikely that a single SNP could account for a disorder as complex as schizophrenia. The mechanisms underlying the effect of TCF4 SNPs in schizophrenia deserve further exploration.

In previous studies, TCF4 has been linked to cognitive functions (Wirgenes et al., 2012; Zhu et al., 2013; Albanna et al., 2014; Hui et al., 2015). Wirgenes et al. (2012) reported that risk alleles of rs4309482 are associated with poorer executive function, in the form of verbal fluency, in a Norwegian population. Our data showed marginally significant differences in the cognitive domain of Reasoning and Problem Solving between the genotypes of rs4309482, but these differences did not survive multiple testing correction. Similar results were found for the cognitive subscale of the PANSS (shown in **Supplementary Table S3**). This is the first analysis of the effect of rs4309482 on cognitive impairment in schizophrenia in a Chinese population, and the finding needs to be further confirmed.

In conclusion, in a Southern Chinese cohort, we found no association of the TCF4 SNPs rs4309482 and rs12966547 with schizophrenia and failed to replicate the association of rs2958182 with schizophrenia found in an East Chinese cohort. These findings argue against the rs2958182 polymorphism being a risk factor for schizophrenia in the Chinese population. Additional TCF4 SNPs suggested by GWAS should be included in further research to gain insight into the contribution of TCF4 to schizophrenia risk.

#### ETHICS STATEMENT

This study was carried out in accordance with the recommendations of "the Affiliated Hospital of Guangdong Medical University, the Ethical Committee of the Affiliated Hospital of Guangdong Medical University" with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the "Ethical Committee of the Affiliated Hospital of Guangdong Medical University."

#### AUTHOR CONTRIBUTIONS

KL, GM, BZ, JL, and YW supervised the entire project and gave critical comments on the manuscript. DL, XL, JF, HaY, and ZL contributed to the data collection. HuY, CL, SX, XZ, and XW participated in genetic analyses. ZD and XS administered the neuropsychological tests. JY, DZ, and YL managed the literature searches, collected the data, undertook the statistical analyses, and wrote the draft of the manuscript. All authors approved the final manuscript.

#### FUNDING

This study was supported by the National Natural Science Foundation of China (81670252, 81571157, 81471294, and 81770034), the Natural Science Foundation of Guangdong Province (2015A030313523), the third session of the China-Serbia Committee for Scientific and Technological Cooperation (3–13), the 2016 Talent Assistance Project of Guangdong (4YF17006G), the Science and Technology Research Project of Zhanjiang City (2016A01008) and the Opening Foundation of Guangdong Key Laboratory of Age-Related Cardiac and Cerebral Diseases (4CX16008G).

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2019.00513/full#supplementary-material




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Yin, Zhu, Li, Lv, Yu, Liang, Luo, Xu, Fu, Yan, Dai, Zhou, Wen, Xiong, Lin, Lin, Zhao, Wang, Li and Ma. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.


# Target Genes of Autism Risk Loci in Brain Frontal Cortex

*Yan Sun1, Xueming Yao1, Michael E. March2, Xinyi Meng1, Junyi Li1, Zhi Wei3, Patrick M.A. Sleiman2,4,5, Hakon Hakonarson2,4,5, Qianghua Xia1\* and Jin Li1\**

*1 Department of Cell Biology, 2011 Collaborative Innovation Center of Tianjin for Medical Epigenetics, Tianjin Key Laboratory of Medical Epigenetics, Tianjin Medical University, Tianjin, China, 2 Center for Applied Genomics, The Children's Hospital of Philadelphia, Philadelphia, PA, United States, 3 College of Computing Sciences, New Jersey Institute of Technology, University Heights, Newark, NJ, United States, 4 Division of Human Genetics, The Children's Hospital of Philadelphia, Philadelphia, PA, United States, 5 Department of Pediatrics, The Perelman School of Medicine, University of Pennsylvania, PA, United States*

Autism spectrum disorder (ASD) is a complex neuropsychiatric disorder. A number of genetic risk loci have been identified for ASD from genome-wide association studies (GWAS); however, their target genes in relevant tissues and cell types remain to be investigated. The frontal cortex is a key region in the human brain for communication and cognitive function. To identify risk genes contributing to potential dysfunction in the frontal cortex of ASD patients, we took an *in silico* approach integrating multi-omics data. We first found genes with expression in frontal cortex tissue that correlates with ASD risk loci by leveraging expression quantitative trait loci (eQTLs) information. Among these genes, we then identified 76 genes showing significant differential expression in the frontal cortex between ASD cases and controls in microarray datasets and further replicated four genes with RNA-seq data. Among the ASD GWAS single nucleotide polymorphisms (SNPs) correlating with the 76 genes, 20 overlap with histone marks and 40 are associated with gene methylation level. Thus, through multi-omics data analyses, we identified genes that may work as target genes of ASD risk loci in the brain frontal cortex.

#### *Edited by:*

*Weihua Yue, Peking University Sixth Hospital, China*

#### *Reviewed by:*

*Fuquan Zhang, Nanjing Medical University, China Chuanjun Zhuo, Tianjin Anding Hospital, China*

#### *\*Correspondence:*

*Jin Li jli01@tmu.edu.cn Qianghua Xia qianghua@gmail.com*

#### *Specialty section:*

*This article was submitted to Behavioral and Psychiatric Genetics, a section of the journal Frontiers in Genetics*

> *Received: 11 October 2018 Accepted: 04 July 2019 Published: 09 August 2019*

#### *Citation:*

*Sun Y, Yao X, March ME, Meng X, Li J, Wei Z, Sleiman PMA, Hakonarson H, Xia Q and Li J (2019) Target Genes of Autism Risk Loci in Brain Frontal Cortex. Front. Genet. 10:707. doi: 10.3389/fgene.2019.00707*

Keywords: autism, brain frontal cortex, DNA methylation, eQTL, GWAS loci, histone modification, target gene

#### INTRODUCTION

Autism spectrum disorder (ASD) is a complex neurodevelopmental disorder mainly characterized by stereotyped behavior and deficits in social communication. Among children, the prevalence of ASD has been estimated at 1 in 68 in the USA and 1 in 100 worldwide, and it is four times higher among boys than girls (Ginsberg et al., 2012; Developmental Disabilities Monitoring Network Surveillance Year 2010 Principal Investigators; Centers for Disease Control and Prevention, 2014; Geschwind and State, 2015). ASD severely affects the quality of life of patients and their families and increases the public health burden (Ginsberg et al., 2013). ASD patients exhibit highly heterogeneous clinical presentations and are mainly treated by rehabilitation intervention, with no specific therapeutic drug available (Bowers et al., 2015). Therefore, it is necessary to understand the genetic mechanism underlying ASD development in important brain regions.

Genetic studies of ASD have revealed a number of risk loci that may contribute to ASD pathogenesis. It has been shown that single nucleotide polymorphisms (SNPs) located at loci 3p21 and 10q24, as well as in *CACNA1C* and *CACNB2*, are significantly associated with multiple psychiatric disorders including ASD (Cross-Disorder Group of the Psychiatric Genomics Consortium, 2013). Xia et al. discovered *TRIM33* and *NRAS-CSDE1* as ASD candidate genes by GWAS analysis of Chinese autistic patients and datasets of three European populations (Xia et al., 2014). A recent study by the Autism Spectrum Disorders Working Group of the Psychiatric Genomics Consortium (PGC) identified multiple loci, composed of common variants, associated with ASD and found a significant genetic correlation between ASD and schizophrenia *via* meta-analysis of more than 16,000 autistic patients (The Autism Spectrum Disorders Working Group of The Psychiatric Genomics Consortium, 2017). Furthermore, Cantor et al. found that rs289883 located in the intron of gene *PHB* was associated with the degree of behavioral abnormality in ASD patients (Cantor et al., 2018).

Different brain regions control different functions, which may be impaired in ASD patients. The frontal lobe of the brain plays an important role in social, emotional, and cognitive functions and has shown severe dysfunction in ASD patients (Courchesne and Pierce, 2005). The frontal lobes in ASD patients undergo an abnormal overgrowth while other regions are not significantly enlarged (Buxhoeveden et al., 2006). Additionally, a decrease in astrocyte precursor cells and an increase in synaptic connectivity are observed in the frontal cortex of ASD patients (Broek et al., 2014). Previous studies demonstrated pronounced ASD-associated gene expression changes in the cerebral cortex, including attenuated distinction between the frontal and temporal cortices in ASD brains (Voineagu et al., 2011).

Because of the importance of the frontal cortex in normal brains and its dysfunction in ASD brains, we aimed to identify target genes of ASD risk loci in the frontal cortex. We obtained ASD-associated SNPs from the GWAS catalog and, using eQTL databases, found genes with genotype-correlated expression in frontal cortex tissue. By analyzing microarray gene expression datasets, we then identified 76 ASD loci-correlated genes showing significant expression differences between ASD brain frontal cortices and controls, and we further replicated four genes in an RNA-seq dataset. Among the ASD GWAS SNPs correlating with the 76 genes, 20 overlap with histone marks and 40 are associated with the gene methylation level, suggesting that they may regulate the transcription of their target genes through epigenetic mechanisms. Our results help to explain how ASD GWAS loci confer disease risk and to prioritize genes for further functional validation.

#### MATERIAL AND METHODS

#### Extraction of ASD GWAS Loci

Significant ASD associations were downloaded from the GWAS catalog (https://www.ebi.ac.uk/gwas/) (Macarthur et al., 2017) using the keyword "autism spectrum disorder," and SNP information was extracted from downloaded data. We did not apply any significance threshold when extracting ASD SNPs from the GWAS catalog.

#### eQTL Analysis

Genes with expression in the frontal cortex that correlate with the genotypes of ASD SNPs were extracted from two eQTL databases: GTEx (https://gtexportal.org/home/) (GTEx Consortium, 2013) and Braineac (http://caprica.genetics.kcl.ac.uk/BRAINEAC/) (Ramasamy et al., 2014). Only associations with *P* < 0.05 were extracted.

#### Analysis of Microarray Data

Series matrix files of two microarray datasets GSE28475 (Chow et al., 2012) and GSE28521 (Voineagu et al., 2011) that compare transcriptome data in human brain frontal cortex between ASD cases and controls were downloaded from the Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/geo/) (Barrett et al., 2013). There are 16 ASD cases and 16 controls in dataset GSE28521, and there are 52 cases and 61 controls in dataset GSE28475. Microarray data underwent quality control with log2 transformation and quantile normalization. Differential expression analysis was performed using the linear regression approach (Limma model) and further meta-analyzed using Fisher's method in the NetworkAnalyst portal (http://www.networkanalyst.ca/) (Xia et al., 2015) with adjustment for batch effects.
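The meta-analysis step can be illustrated with the textbook form of Fisher's method, which combines the per-dataset *P*-values of a gene into one statistic. The Python sketch below is only an illustration under stated assumptions: the function name and the example *P*-values are hypothetical, the two datasets are treated as independent, and the exact implementation inside NetworkAnalyst is not described here and may differ.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null hypothesis. For an even
    number of degrees of freedom the upper tail has a closed form, so no
    external statistics library is required.
    """
    k = len(p_values)
    if k == 0:
        raise ValueError("at least one p-value is required")
    if any(p <= 0.0 or p > 1.0 for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    half_x = -sum(math.log(p) for p in p_values)  # X / 2
    # Upper tail: exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return math.exp(-half_x) * total


# Hypothetical p-values for one gene in the two microarray datasets.
combined = fisher_combined_p([0.03, 0.04])
print(f"combined P = {combined:.4f}")
```

In a genome-wide setting the combined *P*-values would still need adjustment for multiple testing before a threshold such as adjusted *P* < 0.05 is applied.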

#### Analysis of RNA-Seq Data

The SRA file of an RNA-seq dataset (GSE102741) (Wright et al., 2017), which similarly compares transcriptome data in human brain frontal cortex between ASD cases and controls, was downloaded from the GEO database. Dataset GSE102741 contains 13 ASD cases and 39 controls. The quality of RNA-seq raw reads was examined using FastQC (Andrews, 2010), and reads were aligned to the human reference genome (GRCh37) using software HISAT2 (Kim et al., 2015). Then transcripts were assembled and quantified using Stringtie (Pertea et al., 2015) with the reference gene annotation (GRCh37) as a guide. Differential expression analysis between cases and controls was conducted using edgeR (Robinson et al., 2010).

#### Protein–Protein Interaction Network Analysis

The gene symbols were input into the NetworkAnalyst web portal, which maps each gene to protein–protein interaction (PPI) databases to construct networks. The PPI network was constructed among the 76 genes without further extension, with InnateDB (Breuer et al., 2013) as the source of protein interactions.

#### Pathway Enrichment Analysis

We input the 76 genes into DAVID (https://david.ncifcrf.gov/home.jsp) (Huang Da et al., 2009) and PANTHER (http://www.pantherdb.org/) (Mi and Thomas, 2009) web portals for pathway analysis. The pathway databases used in our analyses are KEGG (Kyoto Encyclopedia of Genes and Genomes) (Kanehisa and Goto, 2000) and Reactome (Fabregat et al., 2017), respectively.

#### Analysis of Methylation Data

The generation of genome-wide methylation profiles of 843 subjects on the Infinium HumanMethylation450 BeadChip by the Center for Applied Genomics, the Children's Hospital of Philadelphia, was reported in a previous publication (Van Ingen et al., 2016). The methylation level of each methylation probe was represented by its *M*-value (the log2 ratio between the methylated and unmethylated probe intensities). The association of ASD GWAS SNPs with methylation probes in each of the 76 genes was assessed in a linear regression model including gender, age, and 10 genotype-derived principal components as covariates.
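The *M*-value definition given above can be written out as a short calculation. The Python sketch below is illustrative only: the function names and intensity values are hypothetical, and the small `offset` added to both intensities is an assumption commonly used to avoid division by zero, not something stated in the methods (setting `offset=0` gives the plain log2 ratio described in the text).

```python
import math


def m_value(methylated, unmethylated, offset=1.0):
    """log2 ratio of methylated to unmethylated probe intensity.

    `offset` is an assumed stabilising constant; offset=0 reproduces the
    plain log2 ratio.
    """
    return math.log2((methylated + offset) / (unmethylated + offset))


def m_to_beta(m):
    """Convert an M-value to the corresponding methylated fraction (beta)."""
    return 2.0 ** m / (2.0 ** m + 1.0)


# Hypothetical probe intensities.
m = m_value(methylated=3000.0, unmethylated=1000.0, offset=0.0)
print(f"M = {m:.3f}, beta = {m_to_beta(m):.3f}")
```

An *M*-value of 0 corresponds to equal methylated and unmethylated signal, positive values indicate predominantly methylated probes, and negative values indicate predominantly unmethylated probes.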

#### Hi-C Data Visualization

We conducted Hi-C data visualization for the ASD loci and target genes through the 3D Genome browser (promoter.bx.psu.edu/hi-c/) (Wang et al., 2018) and the FUMA GWAS site (http://fuma.ctglab.nl/) (Watanabe et al., 2017) using reported brain Hi-C data (Schmitt et al., 2016; Won et al., 2016).

#### RESULTS

As the frontal cortex is involved in important brain functions that are severely impaired among ASD patients (communication, language, social behavior, and complex cognitive functions), we were interested in identifying target genes of ASD risk loci in the frontal cortex region. To do this, we first extracted all reported ASD-associated risk loci from the GWAS catalog. A total of 466 SNPs from 19 studies were extracted, with the highest reported association *P*-value of 1 × 10<sup>−5</sup>. The majority (97%) of these SNPs were located in non-coding regions. As these non-coding SNPs could function by regulating downstream target gene expression, we examined their potential regulatory effects in two eQTL databases [brain frontal cortex tissue in the GTEx database (GTEx Consortium, 2013) and frontal cortex in the Braineac database (Ramasamy et al., 2014)]. We found 457 genes from GTEx and 1,848 genes from Braineac with mRNA levels correlated with the additive genotype of ASD GWAS SNPs (nominal *P* < 0.05). As GWAS loci and their targeted genes may not exhibit highly significant correlations in eQTL analysis, we adopted this less stringent threshold and combined the ASD loci targeted genes from the two datasets, yielding a list of 2,098 genes. The eQTL associations suggest that the expression of these genes may be directly or indirectly influenced by the genotype of the ASD loci.

We hypothesized that the expression of genes functioning in the frontal cortex may be dysregulated among ASD patients; thus, we conducted a gene expression meta-analysis comparing the mRNA levels of ASD cases and healthy controls at the genome-wide scale using datasets GSE28475 and GSE28521 from the GEO database. The analysis yielded 893 differentially expressed genes (adjusted *P* < 0.05). Among the 2,098 genes likely regulated by ASD loci, 76 displayed significant differential gene expression (**Supplementary Table 1**), implying that these genes may be ASD loci-controlled genes in the frontal cortex.
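The filtering logic of the two preceding steps amounts to a union of the eQTL gene lists followed by an intersection with the differentially expressed genes. The Python sketch below uses hypothetical placeholder gene names and a hypothetical helper function; only the counts in the final arithmetic check are taken from the text, and that check assumes the reported numbers refer to unique genes.

```python
def candidate_targets(gtex_genes, braineac_genes, de_genes):
    """Union the eQTL gene lists, then keep genes that are also differentially expressed."""
    eqtl_genes = set(gtex_genes) | set(braineac_genes)
    return eqtl_genes, eqtl_genes & set(de_genes)


# Toy input with placeholder gene names (not real results).
gtex = {"GENE_A", "GENE_B", "GENE_C"}
braineac = {"GENE_B", "GENE_C", "GENE_D", "GENE_E"}
differentially_expressed = {"GENE_C", "GENE_E", "GENE_X"}

eqtl_union, targets = candidate_targets(gtex, braineac, differentially_expressed)
print(sorted(eqtl_union))
print(sorted(targets))

# With the reported counts (457 GTEx genes, 1,848 Braineac genes, 2,098 combined),
# the number of genes shared by both eQTL databases would be:
implied_overlap = 457 + 1848 - 2098
print(implied_overlap)
```

If the reported counts are unique genes, roughly one in ten of the combined eQTL genes is supported by both databases, which could serve as an additional prioritization criterion.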

As replication, we examined the ASD RNA-seq dataset GSE102741 in the GEO database. We similarly conducted transcriptome profiling analysis and found that four of the 76 genes identified in the above steps showed significant differences (*P* < 0.05) in mRNA level between ASD cases and healthy controls: *HIST1H1C*, *HSPA1B*, *PRPF3*, and *SERPINA3* (**Table 1**). Therefore, differential expression of these four genes in the frontal cortex of ASD brains was further validated by RNA-seq. The inability to validate the remaining 72 genes could be due to the small sample size of the RNA-seq dataset. We further checked brain Hi-C data and found additional supporting evidence for plausible chromatin interactions between ASD SNPs and the target genes (**Supplementary Figure 1**). Future Hi-C experiments specifically focusing on the frontal cortex regions should be performed to examine these interactions. According to the Human Protein Atlas database (https://www.proteinatlas.org/) (Uhlen et al., 2015), both the mRNA and the proteins of ASD target genes *HIST1H1C* and *PRPF3* were detected in human cerebral cortex; the HIST1H1C protein level is particularly high. The HSPA1B protein and *SERPINA3* mRNA were also detected in the cerebral cortex. By searching the Mouse Genome Informatics (MGI) database (http://www.informatics.jax.org) (Law and Shaw, 2018), we also found that mRNA of the mouse homologues of HIST1H1C and PRPF3 has been detected in mouse brains in previous publications (Mckee et al., 2005; Diez-Roux et al., 2011). Furthermore, abnormal behavior, neurological phenotypes, or defects in the nervous system have been documented for mouse strains carrying mutant Prpf3 or Hspa1b genes (Law and Shaw, 2018), suggesting the biological relevance of these genes to ASD.

To understand how the 76 genes are involved in ASD pathogenesis, we constructed a protein–protein interaction (PPI) network (**Figure 1**) using NetworkAnalyst. The largest module consists of 11 of the 76 genes, including three of the RNA-seq validated genes (*HIST1H1C*, *HSPA1B*, and *SERPINA3*). To better understand the interactions between these genes, we further examined the pathways in which these genes are enriched. We found two significantly enriched pathways: "Antigen processing and presentation" and "Noncanonical activation of NOTCH3" (**Supplementary Table 2**). Both of these pathways are highly relevant to ASD pathogenesis (Needleman and Mcallister, 2012; Hormozdiari et al., 2015; Bennabi et al., 2018; Jones et al., 2018).

*SNP, single nucleotide polymorphism; Target Gene, candidate target gene identified for each ASD GWAS locus; eQTL P-value, P-value of correlation between gene expression level and GWAS SNP genotype in GTEx database or Braineac database; Microarray P-value, P-value of the differentially expressed gene in microarray meta-analysis; RNA-seq P-value, P-value of the differentially expressed gene in RNA-seq analysis.*

Both phenotypic and genetic overlap has been observed between neuropsychiatric diseases. We found that SNPs in genes *HIST1H1C*, *HSPA1B*, *PRPF3*, and *SERPINA3* showed at least nominally significant association with four other neuropsychiatric diseases [schizophrenia (SCZ), bipolar disorder, major depressive disorder (MDD), and attention-deficit hyperactivity disorder (ADHD)] in the Broad PGC database (https://data.broadinstitute.org/mpg/ricopili/) (Ripke and Thomas, 2017) (**Table 2**).

To understand how ASD SNPs may regulate the expression of their target genes, we explored the functional annotations of ASD GWAS SNPs corresponding to the 76 target genes in the ENCODE (Encode Project Consortium, 2012) and ROADMAP (Roadmap Epigenomics Consortium et al., 2015) epigenome databases *via* the HaploReg web portal (Ward and Kellis, 2016). We found that 20 ASD SNPs overlap with histone marks in the brain dorsolateral prefrontal cortex (**Table 3**). This suggests that these SNPs may affect chromatin activation through histone methylation and acetylation, which in turn affects the expression of their target genes.

We also looked into whether there is any correlation between the genotypes of ASD SNPs and methylation at or near their target genes. Forty-five of the 76 genes contained probes with methylation level correlated with additive SNP genotype (**Table 4**) at a nominal significance level, suggesting that the expression level of these genes may be regulated by ASD SNPs through DNA methylation.

#### DISCUSSION

To find ASD GWAS loci targeted genes, we conducted an analysis integrating eQTL, transcriptome, epigenome, and methylation data. We began by analyzing the correlation between SNP genotype and the mRNA level of genes as reflected by eQTL data, and we obtained 2,098 target genes that may be regulated by these ASD loci. Then, we analyzed the differentially expressed genes between cases and controls at the mRNA level using array and RNA-seq data. A total of 76 genes with expression correlating with ASD SNP genotype were differentially expressed between ASD cases and controls in the frontal cortex in array data. Four of those genes were further validated by RNA-seq data. Evidence also suggested that the expression level of these genes could be regulated through histone modification or DNA methylation. Therefore, by *in silico* analysis, we identified candidate genes likely controlled by ASD loci in the frontal cortex, which are worthy of further experimental validation.

TABLE 2 | Within genes *HIST1H1C*, *HSPA1B*, *PRPF3*, and *SERPINA3*, single-nucleotide polymorphisms (SNPs) are associated with other neuropsychiatric diseases.

*ADHD, attention-deficit hyperactivity disorder; MDD, major depressive disorder; SCZ, schizophrenia.*

TABLE 3 | ASD genome-wide association studies (GWAS) SNPs overlap with histone marks in brain dorsolateral prefrontal cortex, based on ENCODE and ROADMAP datasets.

There are multiple lines of evidence suggesting the involvement of the four candidate ASD genes in disease etiology. *HIST1H1C* encodes a protein that belongs to the histone cluster 1 H1 family. In an ASD model system based on haploinsufficiency of *SHANK3*, Darville et al. found that five histone isoforms including HIST1H1C were down-regulated upon lithium and VPA treatment in neurons differentiated from pluripotent stem cells. Lithium and VPA increased levels of *SHANK3* mRNA, and the authors speculated that *SHANK3* may be regulated through an epigenetic mechanism involving histone modification (Darville et al., 2016). In addition, *HIST1H1C* may also be involved in the development of other brain diseases. For example, *HIST1H1C* displayed a consistently and significantly increased mRNA level in the cortex of brains from 7- and 18-month-old mice in an Alzheimer's disease model (Ham et al., 2018). The mRNA level of *HIST1H1C* is up-regulated in hypoxia and is correlated with worse disease outcome among neuroblastoma patients (Applebaum et al., 2016). A mutation in another member of the histone cluster 1 H1 family, *HIST1H1E*, has been detected in an ASD patient and is likely to be the underlying causal mutation (Duffney et al., 2018). A systematic review indicated that nearly 20% of ASD candidate genes play a role in epigenetic regulation, especially histone modification (Duffney et al., 2018). These data support the potential involvement of HIST1H1C in ASD development, likely through epigenetic regulation of neurodevelopmental genes.

TABLE 4 | ASD GWAS SNPs correlate with target gene methylation level at nominal significance level.

*HSPA1B* encodes a heat shock protein, which works as a chaperone for other proteins. In heat shock experiments on induced pluripotent stem cells modeling brain development under maternal fever, *HSPA1B* was one of the heat shock genes whose mRNA level drastically increased, together with other genes involved in neurogenesis and neuronal function (Lin et al., 2014). Heat shock proteins target misfolded proteins and facilitate proper refolding or targeting of damaged proteins for degradation (Lin et al., 2014). The mRNA level of *HSPA1B* is increased in the frontal cortex of schizophrenia subjects (Arion et al., 2007). It has been shown that *HSPA1B* also functions in the pathogenesis of other neurological conditions, such as Parkinson disease (Kalia et al., 2010), progressive supranuclear palsy (Hauser et al., 2005), and Alzheimer's disease (Sherman and Goldberg, 2001; Muchowski and Wacker, 2005), presumably by facilitating protein folding and inhibiting apoptosis (Leak, 2014). Genes enriched in multiple signaling pathways, such as "Heterotrimeric G-protein signaling pathway" and "B cell activation," were altered by Hspa1b deficiency in an MPTP-induced mouse model of Parkinson disease (Ban et al., 2012).

PRPF3 is one of several proteins interacting with U4 and U6 small nuclear ribonucleoproteins, which are components of spliceosomes. Mutations in *MECP2* (methyl-CpG-binding protein 2) cause the neurodevelopmental disorder Rett syndrome. The MECP2 protein directly interacts with PRPF3 (Long et al., 2011), and several Rett-associated mutations in *MECP2* affect interaction of MECP2 with PRPF3, implying that neurodevelopmental disorders, in general, could be related to abnormal mRNA splicing.

*SERPINA3* belongs to the serine protease inhibitor family. The protein antagonizes the activity of neutrophil cathepsin G and mast cell chymase and has been implicated in neuroinflammation, neurodegeneration (Baker et al., 2007), and other types of brain conditions such as human prion diseases (Vanni et al., 2017). The mRNA level of SERPINA3 is robustly up-regulated in the prefrontal cortex of schizophrenia patients, suggesting its involvement in the pathogenesis of neuropsychiatric disorders (Arion et al., 2007; Saetre et al., 2007; Fillman et al., 2014).

In summary, through multi-omics data analyses, we identified genes that may function as target genes of ASD genetic loci in the brain frontal cortex. The function of these genes in ASD development merits further characterization through experimental approaches.

#### AUTHOR CONTRIBUTIONS

YS was mainly involved in the data analysis, processing, and summarization. XY was involved in part of the data analysis and was mainly responsible for drafting the manuscript. QX and JL were responsible for the conception and design of the study and for revising the manuscript critically for important intellectual content. The other authors participated sufficiently in the work to take public responsibility for the content, including participation in the concept, design, analysis, writing, or revision of the manuscript.

#### FUNDING

This study was supported by National Natural Science Foundation of China (81771769); Tianjin Natural Science Foundation (18JCYBJC42700); Startup Funding from Tianjin Medical University; and the Thousand Youth Talents Plan of Tianjin.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2019.00707/full#supplementary-material


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Sun, Yao, March, Meng, Li, Wei, Sleiman, Hakonarson, Xia and Li. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# The Implicated Roles of Cell Adhesion Molecule 1 (*CADM1*) Gene and Altered Prefrontal Neuronal Activity in Attention-Deficit/Hyperactivity Disorder: A "Gene–Brain–Behavior Relationship"?

#### *Edited by:*

*Cunyou Zhao, Southern Medical University, China*

#### *Reviewed by:*

*Pamela Belmonte Mahon, Harvard Medical School, United States Lucía Colodro-Conde, QIMR Berghofer Medical Research Institute, Australia*

#### *\*Correspondence:*

*Lu Liu liulupku@bjmu.edu.cn Qiujin Qian qianqiujin@bjmu.edu.cn*

#### *†ORCID:*

*Wai Chen orcid.org/0000-0002-0477-7883*

#### *Specialty section:*

*This article was submitted to Behavioral and Psychiatric Genetics, a section of the journal Frontiers in Genetics*

> *Received: 07 October 2018 Accepted: 21 August 2019 Published: 26 September 2019*

#### *Citation:*

*Jin J, Liu L, Chen W, Gao Q, Li H, Wang Y and Qian Q (2019) The Implicated Roles of Cell Adhesion Molecule 1 (CADM1) Gene and Altered Prefrontal Neuronal Activity in Attention-Deficit/Hyperactivity Disorder: A "Gene–Brain–Behavior Relationship"?. Front. Genet. 10:882. doi: 10.3389/fgene.2019.00882*

*Jiali Jin1,2, Lu Liu1,2\*, Wai Chen3,4†, Qian Gao1,2, Haimei Li1,2, Yufeng Wang1,2 and Qiujin Qian1,2\**

*1 Department of Child Psychiatry, Peking University Sixth Hospital/Institute of Mental Health, Beijing, China, 2 National Clinical Research Center for Mental Disorders & the Key Laboratory of Mental Health, Ministry of Health (Peking University), Beijing, China, 3 Centre & Discipline of Child and Adolescent Psychiatry, and Psychotherapy, School of Medicine, Division of Paediatrics and Child Health & Division of Psychiatry and Clinical Neurosciences, The University of Western Australia, Perth, WA, Australia, 4 Complex Attention and Hyperactivity Disorders Service (CAHDS), Specialised Child and Adolescent Mental Health Services of Health in Western Australia, Perth, WA, Australia*

Background: Genes related to the cell adhesion pathway have been implicated in the genetic architecture of attention-deficit/hyperactivity disorder (ADHD). Cell adhesion molecule 1, encoded by the *CADM1* gene, is a protein that facilitates cell adhesion and is highly expressed in the human prefrontal lobe. This study aimed to evaluate the association of *CADM1* genotype with ADHD, executive function, and regional brain function.

Methods: Genotype data for 10 tag single nucleotide polymorphisms of *CADM1* from 1,040 children and adolescents with ADHD and 963 controls were used for case–control association analyses. The Stroop color–word interference test, Rey–Osterrieth complex figure test, and trail making test were conducted to assess "inhibition," "working memory," and "set-shifting," respectively. A subsample (35 ADHD versus 56 controls) participated in the nested imaging genetic study. Resting-state functional magnetic resonance images were acquired, and the mean amplitude of low-frequency fluctuations (mALFF) was calculated.

Results: A nominally significant genotypic effect of rs10891819 was detected in the "ADHD-alone" subgroup (*P =* 0.008), with the TT genotype as protective; this result did not survive multiple testing correction. No direct genetic effect was found for performance on executive function tasks. In the imaging genetic study for the "ADHD-whole" sample, rs10891819 genotype was significantly associated with altered mALFF in the right superior frontal gyrus (rSFG, peak *t =* 3.85, corrected *P* < 0.05). Specifically, mALFF in T-allele carriers was consistently higher than in GG carriers in both the ADHD and control groups. Endophenotypic correlation analyses indicated a significant negative correlation between "word interference time" in the Stroop test (shorter "word interference time" indexing better inhibitory function) and mALFF in the rSFG (*r =* -0.29, *P =* 0.006). Finally, mediation analysis confirmed significant indirect effects from "rs10891819 genotype (T-allele carriers)" *via* "mALFF (rSFG)" to "inhibition ("word interference time")" (Sobel *z =* -2.47; B *=* -2.61, 95% confidence interval -0.48 to -4.72; *P =* 0.009).

Conclusions: Our study offers preliminary evidence implicating *CADM1* in prefrontal brain activity, inhibition function, and ADHD, indicating a potential "gene–brain–behavior" relationship for the *CADM1* gene. Future studies with larger samples should specifically test the hypotheses generated by these exploratory findings.

Keywords: attention-deficit/hyperactivity disorder, *CADM1*, executive function, imaging genetics, prefrontal cortex, mean amplitude of low-frequency fluctuation

#### INTRODUCTION

Attention-deficit/hyperactivity disorder (ADHD), characterized by developmentally inappropriate levels of inattention, hyperactivity, and impulsivity, is one of the most common childhood neurodevelopmental disorders, with an estimated worldwide prevalence of 5% (American Psychiatric Association, 2013). It is a condition with a substantial genetic etiology, with heritability estimated at around 74% (Faraone and Larsson, 2019). To date, a large-scale genome-wide association meta-analysis based on categorical clinical diagnoses (which are derived from observed patterns of symptom clustering) has identified 12 significant loci involved in the underlying biology of ADHD (Demontis et al., 2019). Interestingly, these significant loci do not coincide with those reported by previous candidate gene studies, underscoring the limitations of the candidate gene approach. The Research Domain Criteria (RDoC), proposed by the National Institute of Mental Health, offer a different theoretical framework for re-orienting the research approach: they redirect the primary focus from diagnostic categories of ADHD to the functioning of specific domains (i.e., along the continuum from genes to molecules, cells, brain circuitry, cognitive endophenotypes, and behaviors) that are presumed to underlie the clinical manifestations (Musser and Raiker, 2019). In addition to the candidate association approach, the present study also applies the RDoC approach to explore associations between genes (putative functional molecules), brain activities, cognitive endophenotypes, and ADHD behaviors, in the context of the cell adhesion molecule 1 (*CADM1*) gene.

Using topological and functional analyses, a recent study implicated genes related to the cell adhesion pathway in the genetic architecture of ADHD (Lima et al., 2016). In particular, cell adhesion molecule 1 (CADM1), encoded by the *CADM1* gene, is a member of the immunoglobulin superfamily with cell adhesion properties; it promotes axonal growth, neuronal migration, pathfinding, and synaptic formation in the developing nervous system and is also involved in the formation of neural networks (Fujita et al., 2005; Perez et al., 2015). In the central and peripheral nervous systems, CADM/SynCAM is essential for myelination *via* moderating the adhesion of glial cells and axons (Niederkofler et al., 2010) and plays pivotal roles in developing neurons by shaping their migrating growth cones and adhesive differentiation in their axo-dendritic contacts (Stagi et al., 2010; Yamagata et al., 2018). An animal model demonstrated an additional function of the CADM1 protein in mature neurons, where it participates in regulating neuronal plasticity and synapse number (Robbins et al., 2010). Overall, this earlier evidence suggests that CADM1 plays multiple critical roles in maintaining neuronal integrity and function from development to maturity.

Genes involved in CADM1-related pathways have been implicated in ADHD, such as *ITGA1* or *CDH13* genes related to cell adhesion (Liu et al., 2017) and cell-to-cell communication functions (Hawi et al., 2015). Despite lacking direct evidence from human studies, a recent rodent model reported a relationship between *CADM1* function and ADHD-like behaviors. Altered expression of *CADM1* gene was associated with abnormal diurnal spontaneous activity in mice. When compared with wild type, mice with GFAP-DNSynCAM1 (i.e., dominant negative mutation) displayed increased daytime activity, decreased rest, and nocturnal hyperactivity, and strikingly, these anomalies were reversed by amphetamine. Moreover, higher levels of impulsive and aggressive behaviors were also present, such as jumping out of their cages and attacking other mice, i.e., behaviors consistent with other rodent models of ADHD (Sandau et al., 2012).

Furthermore, two missense mutations of the *CADM1* gene, C739A(H246N) and A755C(Y251S), were found in probands with autism spectrum disorder and their family members (Zhiling et al., 2008); autism spectrum disorder is a neurodevelopmental disorder that shares high rates of comorbidity with ADHD (Sharma et al., 2018) and potentially common molecular genetic etiologies (Gonzalez-Mantilla et al., 2016). At the molecular level, the mutant variants of the *CADM1* gene were associated with abnormal expression of matured oligosaccharide, defective cell surface trafficking, and greater susceptibility to cleavage or degradation (Zhiling et al., 2008). At the cellular level, the mutant *CADM1* gene was also associated with morphological and functional alterations in neurons, including shorter dendrites, impaired synaptogenesis (Fujita et al., 2010), and disruption in protein distribution (Muhle et al., 2004). Compared with normal cells, the abnormal CADM1 proteins in mutant cells accumulated mainly in the endoplasmic reticulum (ER) and induced upregulation of the ER stress marker known as "C/EBP-homologous protein" (Fujita et al., 2010). Long-term exposure to excessive ER stress can lead to neuronal death, thereby implicating *CADM1* genetic mutation as potentially relevant to aberrant neurodevelopment and related pathogenesis (Muhle et al., 2004).

Jin et al. *CADM1* on EF in ADHD

Interestingly, CADM1 is highly expressed in many brain areas, including the cingulate cortex, parietal lobe, temporal lobe, occipital lobe, amygdala, caudate nucleus, cerebellum, and, especially, prefrontal cortex (http://biogps.org/#goto=genereport&id=23705), all of which are among the brain regions most commonly identified by neuroimaging studies of ADHD (Liston et al., 2011). Notably, functional magnetic resonance imaging (fMRI) studies have highlighted a complex neural network (comprising the prefrontal lobe, parietal lobe, cingulate cortex, cerebellum, and basal ganglia) involved in information processing relevant to ADHD deficits, such as attention, shifting, planning, reward, working memory, and response inhibition (Liston et al., 2011; Carlen, 2017). More specifically, ADHD participants showed reduced inhibitory control associated with lower brain activation of the bilateral ventral lateral prefrontal cortex when compared with controls (Norman et al., 2016). Task-specific fMRI studies also found differences in ADHD participants: significantly decreased activation in the right inferior frontal cortex while performing inhibition tasks (Hart et al., 2013); increased activation in the left dorsolateral prefrontal cortex while completing working memory tasks (Bedard et al., 2014); and under-activation in bilateral inferior prefrontal cortices during visual–spatial switching tasks (Rubia et al., 2010). In other words, certain brain regions with high CADM1 expression overlap with those related to ADHD and associated neurocognitive deficits. However, no reports have explored the specific genetic effects of *CADM1* polymorphisms on these brain structures and/or functions in the context of ADHD.

Intriguingly, despite different strands of compelling evidence on the implicated roles of the *CADM1* gene in neurodevelopmental disruption (and likely symptom expression) and its potential relevance to ADHD, no genetic variants of *CADM1* have achieved the postulated genome-wide significance (*P* < 10<sup>-8</sup>) or been represented among the top hits of genome-wide association studies (GWAS), in either the meta-analysis or primary GWAS of ADHD (Demontis et al., 2019; Faraone and Larsson, 2019). These findings suggest that the association between *CADM1* genotype and ADHD phenotypes, if it exists, may not be readily detected by conventional genetic analyses. The postulated links between *CADM1* and ADHD may therefore need to be probed by an alternative strategy guided by the RDoC initiative (which redirects focus to the "gene–brain–behavior" relationships along the continuum of interlinking domains from genes, cells, anatomical regions, functions, and behaviors), instead of conventional diagnostic phenotypes. As this is an exploratory and hypothesis-generating study, we therefore considered it best to interrogate brain circuitry anomalies by applying an atheoretical probe along the domain continuum. In contrast to task-related fMRI, resting-state fMRI (rs-fMRI) permits evaluation of brain activities without prior theoretical assumptions, representing a more appropriate method to capture spontaneous regional neural activity as indexed by the mean amplitude of low-frequency fluctuation (mALFF).

There were two key objectives of this study. First was to examine the association of *CADM1* gene in relation to ADHD psychiatric phenotypes, neurocognitive endophenotypes, and regional brain circuitry activities. Second was to test whether the domain approach guided by RDoC could be an alternative avenue to elucidate "gene–brain–behavior" relationships of the *CADM1* gene by a series of domain-specific probes on different intermediate phenotypes within one single study, instead of relying on traditional diagnostic phenotypes alone. In other words, as an exploratory and hypotheses-generating study, we sought to explore the possible "gene–brain–behavior" relationship between *CADM1* genetic polymorphism, brain circuitries activities, and executive tasks performance relevant to ADHD. We postulated that the "domain-based" analyses for complex relationships could yield findings with finer specificity than conventional phenotypes.

#### MATERIALS AND METHODS

#### Participants

In total, 2,003 individuals (1,040 children with ADHD and 963 healthy controls) participated in the current study. All ADHD probands were recruited from the child psychiatric clinics at Peking University Sixth Hospital/Institute of Mental Health. Psychiatric diagnoses of ADHD and comorbidities were assessed and classified according to Diagnostic and Statistical Manual of Mental Disorders, 4th Edition criteria (American Psychiatric Association, 2013) by trained child psychiatrists using the Chinese-translated version of the Clinical Diagnostic Interview Scale (Barkley, 1998; Liu and Guan, 2011) in a semi-structured interview with probands and their parents together. Comorbidities captured by the Clinical Diagnostic Interview Scale included oppositional defiant disorder, conduct disorder, tic disorder, learning disorder, obsessive–compulsive disorder, specific phobias, anxiety, depression, and bipolar disorder.

The inclusion criteria for ADHD probands were: 1) a Diagnostic and Statistical Manual of Mental Disorders, 4th Edition ADHD diagnosis; 2) age between 6 and 16 years; 3) a full-scale IQ ≥ 70, as measured by the Chinese version of the Wechsler Intelligence Scale for Children (Gong and Cai, 1993); and 4) both biological parents of Chinese Han descent. Exclusion criteria were a psychiatric history of schizophrenia, affective disorder, pervasive developmental disorders (or autism spectrum disorders), major physical or metabolic disorders, and neurological disorders. Healthy control subjects were of Chinese Han descent (including both adults and children) and were recruited from three sources: students from local elementary schools; healthy blood donors attending the Blood Center of Peking University First Hospital; and healthy volunteers attending The Institute of Mental Health (Beijing) for research. Among recruited controls, those found to have ADHD, other major psychiatric disorders, a family history of psychosis, severe physical diseases, or substance abuse were excluded. Adult control participants were screened by the ADHD Rating Scale (DuPaul et al., 1998) and self-report. Child control participants were screened i) for low IQ by the Chinese version of the Wechsler Intelligence Scale for Children and ii) for psychopathologies by the parent-rated ADHD Rating Scale (DuPaul et al., 1998), Conners' Parent Rating Scale (Conners et al., 1998), and Achenbach's Child Behavior Checklist (Tseng et al., 1988).

A subsample of 35 ADHD cases and 56 healthy controls (aged between 8 and 16 years) was recruited for the nested imaging genetic study. To control for potential confounders relevant to this imaging genetic study, more stringent inclusion and exclusion criteria were introduced. Additional exclusion criteria were post-traumatic stress disorder, enuresis, and encopresis [captured by the Schedule for Affective Disorders and Schizophrenia for School-Age Children-Present and Lifetime version (Chinese-translated version) (Kaufman et al., 1997)]. Participants were also excluded if they had contraindications for MRI procedures, namely metal implants (including nonremovable dentures) and claustrophobia. More stringent inclusion criteria were applied to control for potential artifacts and confounders relevant to neuroimaging studies: 1) right-hand dominance; 2) no history of severe head injury or brain trauma (leading to loss of consciousness or coma); 3) full-scale IQ ≥ 80; and 4) no exposure to ADHD medication, i.e., only drug-naïve participants in the ADHD group were recruited.

This project was approved by the Ethics Committee of Peking University Sixth Hospital/Institute of Mental Health. Written informed consent was obtained from parents for the child participants and from adult participants.

#### Genotyping and Single Nucleotide Polymorphism Selection

Blood samples from both cases and controls were collected and genotyped using the Affymetrix 6.0 array at CapitalBio Ltd. (Beijing) according to the standard Affymetrix protocol. Samples of cases and controls were added in equal proportion to each chip to avoid batch effects. The Affymetrix 6.0 array included 96 single nucleotide polymorphism (SNP) probes for the *CADM1* gene. The final set of 10 tag SNPs (**Table 1**) was selected based on two criteria: 1) common SNPs that conformed to Hardy–Weinberg equilibrium and had a minor allele frequency above 5%; and 2) linkage disequilibrium was identified using the confidence interval (CI) method of the haplotype analysis software HaploView (ver4.2), and the tag SNPs obtained with a threshold of *r*<sup>2</sup> > 0.80 were included in the subsequent analysis.
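The two SNP-level filters named above (Hardy–Weinberg equilibrium and minor allele frequency) can be illustrated with a minimal Python sketch. The function names, the example genotype counts, and the HWE *P*-value cut-off of 0.05 are assumptions made for illustration; the text states only the 5% minor allele frequency threshold, and this is not the HaploView implementation.

```python
import math

def maf_and_hwe(n_aa, n_ab, n_bb):
    """Return (minor allele frequency, chi-square statistic, P-value) for one SNP.

    Uses the 1-degree-of-freedom chi-square goodness-of-fit test for
    Hardy-Weinberg equilibrium on genotype counts AA, AB, BB.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p                       # frequency of allele B
    if p == 0.0 or q == 0.0:
        raise ValueError("monomorphic SNP: HWE test is undefined")
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return min(p, q), chi2, p_value

def passes_filters(n_aa, n_ab, n_bb, maf_min=0.05, hwe_alpha=0.05):
    """Apply both filters; hwe_alpha is an assumed cut-off, not from the source."""
    maf, _, p_value = maf_and_hwe(n_aa, n_ab, n_bb)
    return maf > maf_min and p_value > hwe_alpha

# Hypothetical counts in exact Hardy-Weinberg proportions (p = 0.6, q = 0.4).
maf, chi2, p_value = maf_and_hwe(360, 480, 160)
```

A SNP with counts such as 990/10/0 would be dropped by the frequency filter alone, since its minor allele frequency is 0.5%.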

#### Executive Function Measures

Executive function measures were ascertained in the subsample of ADHD participants and child controls.

*Stroop Color–Word Interference test*: The test included four conditions: i) color naming (condition 1, e.g., name patches of color); ii) word reading (condition 2, e.g., read the rows of words printed in black ink); iii) color inhibition (condition 3, e.g., read the word *red* printed in *green* ink); iv) word inhibition (condition 4, e.g., name the color of the *green* ink rather than the word *red*). The "color interference time" denotes the average time (in seconds) taken to complete each trial of condition 3 minus that of condition 2, and "word interference time" denotes the average time (in seconds) taken to complete each trial of condition 4 minus that of condition 1. In this study, color and word interference time scores were analyzed to represent the "inhibition" component of executive function (Shuai et al., 2011).

*Rey–Osterrieth complex figure test (RCFT)*: RCFT evaluates visuospatial construction ability, visual working memory, and organizational skills. Participants were asked first to inspect and then copy the RCFT figure. After 30 s (Immediate Recall Condition) and 20 min (Delayed Recall Condition), they were asked to recall and reproduce the figure from memory without any visual cues or prompts. Both immediate and delayed scores were rated according to i) the structure recalled (structure score, 0–6 for 3 items) and ii) the accuracy of the details reproduced (detail score, 0–36 for 18 items). "Forgotten" scores (for "structure" and for "detail") were generated by subtracting the respective "delay" scores from the "immediate" scores. This yielded two sets of scores, the "structure forgotten score" and the "detail forgotten score," indicating the information that was lost during the 20-min interval. These discrepancy scores were analyzed in this study to represent the "visual working memory" component of executive function (Shuai et al., 2011).

*Trail making test*: This test was used to assess "set-shifting." It includes two parts: i) number sequencing trail making; ii) number–letter switching trail making. The time taken to complete each part was recorded. The "set-shifting time" was defined as the time taken to complete part 2 minus the time taken to complete part 1 (Shuai et al., 2011).
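All five executive function indexes described above are simple difference scores. The following Python sketch shows the arithmetic under assumed input structures; the function name, the dictionary keys, and the example values are hypothetical and are not taken from the study's data.

```python
def executive_function_scores(stroop, rcft_immediate, rcft_delayed, tmt):
    """Derive the five executive function indexes from raw task measures.

    stroop: mean time per trial (s) for conditions 1-4, keyed 1..4
    rcft_immediate / rcft_delayed: dicts with "structure" and "detail" scores
    tmt: completion time (s) for parts 1 and 2, keyed 1..2
    """
    return {
        # Condition 3 minus condition 2.
        "color_interference_time": stroop[3] - stroop[2],
        # Condition 4 minus condition 1.
        "word_interference_time": stroop[4] - stroop[1],
        # Immediate recall minus delayed recall.
        "structure_forgotten": rcft_immediate["structure"] - rcft_delayed["structure"],
        "detail_forgotten": rcft_immediate["detail"] - rcft_delayed["detail"],
        # Part 2 minus part 1.
        "set_shifting_time": tmt[2] - tmt[1],
    }

# Hypothetical raw scores for one participant.
scores = executive_function_scores(
    stroop={1: 0.5, 2: 0.75, 3: 1.25, 4: 1.5},
    rcft_immediate={"structure": 5, "detail": 30},
    rcft_delayed={"structure": 4, "detail": 26},
    tmt={1: 30.0, 2: 75.0},
)
```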

#### Magnetic Resonance Imaging Acquisition

All MRI images were acquired on a 3-T Siemens Tim Trio MRI scanner (Siemens, Erlangen, Germany) with a standard 12-channel head coil in the Imaging Center for Brain Research, Beijing Normal University. Participants were instructed to remain awake, still, and relaxed with their eyes closed during the 30-min period of rs-fMRI scanning, and a head strap and foam pads were used to minimize head movements. Functional images were acquired using an echo-planar imaging sequence with the following parameters: 33 axial slices, thickness/skip = 3.5/0.7 mm, repetition time = 2,000 ms, echo time = 30 ms, flip angle = 90°, matrix = 64 × 64, field of view = 200 × 200 mm, and 240 volumes. T1-weighted anatomical images were acquired with the following parameters: 128 slices, slice thickness = 1.33 mm, repetition time = 2,530 ms, inversion time (TI) = 1,100 ms, echo time = 3.39 ms, flip angle = 7°, matrix = 256 × 256, and field of view = 256 × 256 mm.

*HWE, Hardy–Weinberg equilibrium; MAF, minor allele frequency.*

#### Data Preprocessing

Analysis of the rs-fMRI data was performed using the Data Processing & Analysis for Brain Imaging (DPABI) 4.4 toolbox (DPABI\_V3.1) (Yan et al., 2016). Preprocessing included the following procedures: 1) removal of the first 10 volumes; 2) slice-timing correction; 3) head-motion correction (head motion for all subjects was below our criteria of 3 mm and 3°); 4) coregistration of the T1 image to the functional image, with the T1 image segmented into gray matter, white matter, and cerebrospinal fluid using the "new segment" method; 5) spatial normalization of the segmented T1 image to standard Montreal Neurological Institute (MNI) space using "Dartel," followed by normalization of the functional data to MNI space (resampled voxel size = 3 × 3 × 3 mm); 6) spatial smoothing with a Gaussian kernel (full width at half-maximum = 6 mm); 7) removal of linear trends and temporal band-pass filtering (0.01–0.08 Hz); and 8) regression of head motion effects with the Friston-24 parameter model, together with white matter, cerebrospinal fluid, and global signals.

#### Mean Amplitude of Low-Frequency Fluctuation Calculation

ALFF, regional homogeneity, and degree centrality are the three most commonly used methods for "voxel-wise whole-brain" analysis in rs-fMRI (Zang et al., 2015). The intra-scanner (i.e., test–retest) reliability of mALFF has been found to be higher than that of regional homogeneity and degree centrality (Zhao et al., 2018). In this study, mALFF was used as the metric to represent resting-state neural circuit activity.

The power spectrum was obtained by fast Fourier transform of the preprocessed time courses, and the square root of the power spectrum, averaged across the 0.01–0.08 Hz frequency band, was taken as ALFF (Zang et al., 2007). The ALFF of each voxel was divided by the global mean ALFF for standardization, yielding mALFF, which was used for further statistical comparison and analysis.
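The calculation described above can be sketched in plain Python for a single voxel time course. This is an illustration under stated assumptions, not the DPABI implementation: the function names are hypothetical, a direct discrete Fourier transform stands in for the fast Fourier transform, the power normalization (squared magnitude divided by the number of time points) is one common convention, and the two "voxels" are synthetic sine waves. The 230 volumes correspond to the 240 acquired volumes minus the 10 discarded ones, with a repetition time of 2 s.

```python
import cmath
import math

def alff(time_course, tr, low=0.01, high=0.08):
    """Mean square root of the power spectrum within [low, high] Hz."""
    n = len(time_course)
    mean = sum(time_course) / n
    centered = [v - mean for v in time_course]
    amplitudes = []
    for k in range(1, n // 2 + 1):
        freq = k / (n * tr)                 # frequency of bin k in Hz
        if low <= freq <= high:
            coef = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                       for t, x in enumerate(centered))
            power = abs(coef) ** 2 / n      # assumed normalization
            amplitudes.append(math.sqrt(power))
    if not amplitudes:
        raise ValueError("no frequency bins fall inside the band")
    return sum(amplitudes) / len(amplitudes)

def mean_normalize(alff_values):
    """Divide each voxel's ALFF by the global mean ALFF to obtain mALFF."""
    global_mean = sum(alff_values) / len(alff_values)
    return [a / global_mean for a in alff_values]

# Two synthetic voxels: one oscillating inside the band, one outside it.
n_vol, tr = 230, 2.0
slow = [math.sin(2 * math.pi * 0.05 * t * tr) for t in range(n_vol)]
fast = [math.sin(2 * math.pi * 0.20 * t * tr) for t in range(n_vol)]
alff_values = [alff(slow, tr), alff(fast, tr)]
malff_values = mean_normalize(alff_values)
```

By construction, mALFF values average to 1 across the voxels included in the global mean, so a value above 1 indicates above-average low-frequency amplitude.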

#### Statistical Analysis

#### Demographic and Clinical Characteristics

Demographic and clinical characteristics were compared between ADHD and control groups. *Chi*-square tests were applied for the categorical variable (sex), and independent sample *t*-tests were applied for continuous variables (age and IQ scores).

#### Gene–Behavior Association Analysis

First, the allelic and genotypic distributions (under an additive model) of the SNPs were compared between the "ADHD-whole" group and controls using chi-square tests. When the allelic and/or additive model difference reached nominal significance (*P* < 0.05), further genotypic comparisons under recessive and dominant models were conducted. Furthermore, we repeated these analyses after stratifying the full ADHD sample (i.e., the "ADHD-whole" sample) into "ADHD-comorbid" and "ADHD-alone" subsamples to account for potential heterogeneity related to comorbidities within the ADHD phenotype, given the concerns raised in the literature about such effects on genetic association (Robinson et al., 2014). To correct for multiple comparisons, Bonferroni correction was applied, setting the significance threshold at *P* = 0.0008 (i.e., 0.05/10/2/3; with 10 representing the number of SNPs analyzed, 2 representing the allelic and genotypic models, and 3 representing the three analyzed phenotypes, "ADHD-whole," "ADHD-alone," and "ADHD-comorbid"). In addition, logistic regression analyses were conducted to adjust for potential confounding effects, with age, sex, and 10 principal components derived from the multidimensional scaling procedure for the Affymetrix 6.0 genotyping data (Yang et al., 2013) as covariates.

To limit the number of tests, only SNPs showing significant associations in the case–control analysis were included in the general linear model examining the association between genotypes and executive function measures. The general linear model (a multiple linear regression model with more than one dependent variable) was fitted using SPSS. The executive function measures (as represented by the scores for "Stroop color interference time," "Stroop word interference time," "RCFT structure forgotten score," "RCFT detail forgotten score," and "TMT set-shifting time") were entered as dependent variables, while genotypes were entered as independent variables, with age, sex, IQ, and ADHD diagnoses set as covariates. Whenever genetic main effects were detected, *post hoc* analyses were then conducted separately in the ADHD and control groups.
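The Bonferroni threshold quoted above follows from dividing the nominal alpha by the total number of tests. A short Python check of that arithmetic is given below; the function and variable names are illustrative, and the nominal *P*-value is the one reported for rs10891819 in the "ADHD-alone" subgroup.

```python
def bonferroni_threshold(alpha, *test_counts):
    """Divide alpha by the product of the given numbers of tests."""
    n_tests = 1
    for count in test_counts:
        n_tests *= count
    return alpha / n_tests

# 10 SNPs x 2 models (allelic, genotypic) x 3 phenotype groupings = 60 tests.
threshold = bonferroni_threshold(0.05, 10, 2, 3)

# Nominal P-value reported for rs10891819 in the "ADHD-alone" subgroup.
nominal_p = 0.008
survives = nominal_p < threshold
```

Because 0.05/60 is roughly 0.0008, a nominal *P* of 0.008 is an order of magnitude above the corrected threshold.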

#### Imaging Genetic Analysis

Statistical analyses of mALFF were performed in DPABI (Yan et al., 2016). A mixed effect analysis was conducted in DPABI to determine whether there were any significant regional mALFF differences between genotypes and phenotype groups—ADHD and control groups. To control for multiple testing, AlphaSim correction was applied. By using AlphaSim correction, the significance threshold was set at *P* < 0.05 (a combination threshold of voxel level at *P* < 0.01 and a cluster size estimated by AlphaSim, with the kernel of smoothness recalculated based on four-dimensional residual).

#### Correlations of Genotype-Modulated Regional Mean Amplitude of Low-Frequency Fluctuations With Executive Function and Mediation Analyses

Correlations between regional mALFF alterations and executive function (EF) indexes were computed for those brain regions using Pearson correlation, with sex, age, IQ, and ADHD diagnoses controlled as covariates.

If the correlation reached significance, moderation and mediation effects were evaluated using the PROCESS macro of SPSS (Hayes, 2017). First, moderation was assessed by model 1, in which the interaction effect of W (the moderator, regional mALFF) and X (genotype) on Y (EF indexes) was computed. If no moderation effect was detected, mediation was then assessed by model 4, in which the indirect effects of X on Y through M (the mediator, regional mALFF) were evaluated for effect size and significance (Hayes and Rockwood, 2017). We used bootstrapping with 5,000 samples. The effects of sex, age, IQ, and ADHD diagnoses were controlled as covariates. In the moderation analyses, the variables were mean-centered before the interactions were modeled. The Sobel test of mediation (Sobel, 1982) was applied to determine whether regional mALFF significantly mediated the relations between rs10891819 genotype and executive function measures.
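A Pearson correlation that controls for covariates is a partial correlation. The sketch below shows one way to compute it in plain Python, using the recursive first-order partial correlation formula; this is an illustration rather than the software routine used in the study, and the function names and the synthetic data are assumptions.

```python
import math

def pearson(x, y):
    """Plain Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, covariates):
    """Correlation of x and y after removing linear effects of the covariates.

    Applies the recursive first-order partial correlation formula,
    peeling off one covariate at a time.
    """
    if not covariates:
        return pearson(x, y)
    z, rest = covariates[0], covariates[1:]
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Synthetic data: x and y are related only through the shared covariate z.
z = [-7, -5, -3, -1, 1, 3, 5, 7]
e1 = [1, -1, -1, 1, 1, -1, -1, 1]
e2 = [1, 1, -1, -1, -1, -1, 1, 1]
x = [a + b for a, b in zip(z, e1)]
y = [a + b for a, b in zip(z, e2)]
raw_r = pearson(x, y)
adjusted_r = partial_corr(x, y, [z])
```

In this constructed example the raw correlation is high while the partial correlation vanishes, which is the kind of confounding that covariate adjustment is meant to remove.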
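The logic of a bootstrapped indirect effect can be sketched in plain Python. This is a simplified illustration, not the PROCESS macro: it omits the covariates, uses a percentile interval rather than a bias-corrected one, and runs on synthetic data. The function names, the seed, and the number of resamples in the example are assumptions.

```python
import random

def _centered_sum(u, v):
    """Sum of cross-products of u and v about their means."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))

def indirect_effect(x, m, y, eps=1e-12):
    """Return a*b, or None if the sample is degenerate.

    a: slope of M regressed on X
    b: slope of M in the regression of Y on X and M
    """
    sxx, smm = _centered_sum(x, x), _centered_sum(m, m)
    sxm, sxy, smy = _centered_sum(x, m), _centered_sum(x, y), _centered_sum(m, y)
    det = sxx * smm - sxm ** 2
    if sxx < eps or det < eps:
        return None                      # X constant, or M collinear with X
    a = sxm / sxx
    b = (sxx * smy - sxm * sxy) / det
    return a * b

def bootstrap_ci(x, m, y, n_boot=5000, seed=0):
    """Percentile 95% interval for the indirect effect (not bias-corrected)."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        est = indirect_effect([x[i] for i in idx], [m[i] for i in idx],
                              [y[i] for i in idx])
        if est is not None:
            estimates.append(est)
    estimates.sort()
    lo = estimates[int(0.025 * len(estimates))]
    hi = estimates[int(0.975 * len(estimates)) - 1]
    return lo, hi

# Synthetic data: X = genotype (0/1), M = mediator, Y = outcome.
x = [0] * 6 + [1] * 6
noise = [0.05, -0.05] * 6
m = [1.0 + 0.2 * xi + e for xi, e in zip(x, noise)]
y = [50.0 - 13.0 * mi for mi in m]
point = indirect_effect(x, m, y)
ci_low, ci_high = bootstrap_ci(x, m, y, n_boot=2000)
```

An interval that excludes zero is read as evidence of an indirect effect; in this synthetic example the outcome depends on genotype only through the mediator, so the whole interval lies below zero.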

#### Expression Quantitative Trait Loci Analysis

To explore the potential biological functions of the SNPs identified in our first analysis, we further examined the patterns of expression quantitative trait loci (eQTL) based on the data from the UK Brain Expression Cohort (http://www.braineac.org).

#### RESULTS

#### Demographic Data

The demographic and clinical characteristics of participants for both the association study and nested imaging genetic study are summarized in **Table 2** and **Supplementary Table 1**. Sex ratio and IQ scores differed between "ADHD-whole" and control groups, with male preponderance and lower IQ scores detected in the ADHD group. In the sample of the imaging genetic study, the mean age of the control group was higher (all *P*s < 0.05).

#### Gene–Behavior Association Analyses

No differences in the allelic or genotypic distribution of any SNP examined were found in the "ADHD-whole" sample or the "ADHD-comorbid" subsample when compared with controls (**Supplementary Table 2**). In the "ADHD-alone" subsample, the genotypic distribution of rs10891819 differed from that of controls at a nominal level of significance in both the additive model (*P =* 0.008) and the recessive model, with the TT genotype as protective [odds ratio *=* 0.48 (95% CI, 0.27–0.85), *P =* 0.012] (**Table 3**). Further adjustment for covariates (age, sex, and the 10 principal components from the multidimensional scaling procedure) yielded similar results (**Table 3**). None of these results survived Bonferroni correction. Quanto 1.2.4 was used to evaluate the statistical power of our sample. The estimated power was 73% at an alpha of 0.05, based on the respective values for sample size, prevalence, allele frequency, and relative risk (ADHD cases *=* 295; healthy controls *=* 963; prevalence *=* 0.05; allele frequency *=* 0.29; inheritance mode *=* recessive; relative risk of alleles *=* 0.48). We then repeated the same analysis in the "ADHD-whole" sample and the "ADHD-comorbid" and "ADHD-alone" subsamples using the child-only control subsample (i.e., excluding the adult controls for a more stringent validation); the results did not differ substantially (data not shown).

For the "gene-EF" analyses, performance on all executive function measures was poorer in the ADHD group than in controls. However, no genotypic main effect of rs10891819 was detected for the "ADHD-whole," "ADHD-alone," or "ADHD-comorbid" grouping (**Supplementary Table 3**) (all *P*s > 0.05).

#### Imaging Genetic Study

The genotypic distributions of rs10891819 in the nested study were: 19, 12, and 4 carriers of GG, GT, and TT genotypes in the "ADHD-whole" group (n = 35) and 37, 13, and 6 carriers in control group (n = 56). However, when stratified by comorbidities, no TT genotype carriers were found in the "ADHD-alone" subgroup. Subsequently, the GT and TT genotypes were combined to form the "T-allele carrier" group for imaging genetic analyses. Further details on sample characteristics were given in **Supplementary Table 4**. No genotypic effect was found on any EF parameters (**Supplementary Table 5**) (all *P*s > 0.05).

When controls and ADHD cases were analyzed as a combined group, a main genotypic effect of rs10891819 on brain activity was detected. Specifically, significantly higher levels of mALFF were detected in the right superior frontal gyrus (rSFG) for T-allele carriers when compared with GG carriers (peak *t =* 3.85, corrected *P* < 0.05; also see **Table 4A**, **Figure 1A**). *Post hoc* analyses stratifying ADHD and control participants into two separate groups detected the same pattern: T-allele carriers showed significantly higher levels of mALFF than GG carriers in the ADHD group [(1.34 ± 0.30) versus (1.07 ± 0.13), *P =* 0.001] and in the control group [(1.18 ± 0.18) versus (1.05 ± 0.16), *P =* 0.002] (also see **Supplementary Table 6**, **Figure 1B**). These results survived AlphaSim correction.

*ADHD, attention-deficit/hyperactivity disorder; ADHD-I, ADHD inattentive subtype; ADHD-C, ADHD combined subtype; ADHD-alone, ADHD subjects without any comorbidity; ADHD-comorbid, ADHD with assessed comorbidities; IQ, intelligence quotient; SD, standard deviation.*

TABLE 3 | Allelic and Genotypic Analysis in ADHD-alone (n = 295) and Controls (n = 963).

*ADHD-alone, ADHD subjects without any comorbidity; OR, odds ratio; 95% CI, 95% confidence interval; the ancestral alleles are shown in bold.*

*<sup>a</sup>Age, sex, and the 10 principal components from the multidimensional scaling procedure as covariates.*

When the ADHD cases were stratified into "ADHD-alone" and "ADHD-comorbid" subgroups and then combined with the participants from control group, respectively (for sufficient statistical power for comparison), the genotypic effect of rs10891819 remained significant in the rSFG region, with higher levels of mALFF shown in T-allele carriers compared with GG carriers (**Supplementary Table 6**) (all *P*s < 0.05).

#### Correlational, Moderation, and Mediation Analyses for Genotype, Regional mALFF, and Executive Function Measures

Correlation analyses of regional mALFF (in the rSFG) and executive function measures were conducted in the combined sample of ADHD participants and controls. The results showed a negative correlation between mALFF in the rSFG and "word interference time" in the STROOP test (*r =* -0.29, *P =* 0.006, **Figure 2A**), indicating higher mALFF levels correlated with better performance in this inhibition task. However, the correlation


*ADHD, attention-deficit/hyperactivity disorder; AAL, anatomical automatic labeling; age, sex, and IQ as covariates. Threshold: voxel p < 0.01, cluster P < 0.05 after AlphaSim correction.*

between mALFF and the remaining executive measures was not detected (*P*s > 0.05, **Table 4B**). When ADHD and control participants were analyzed as two separate groups, the negative correlation between mALFF in the rSFG and "word interference time" in the STROOP test was detected only in the control group (*r =* -0.41, *P =* 0.003).
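
A covariate-adjusted correlation of this kind can be illustrated as a partial correlation computed from regression residuals. The sketch below is an illustration only, not the authors' analysis code. It assumes adjustment of the type noted for Table 4B; the function name `partial_corr`, the variable names, and all data are synthetic and hypothetical.

```python
import numpy as np


def partial_corr(x, y, covariates):
    """Pearson correlation between x and y after regressing out covariates."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix: intercept column plus the covariate columns.
    Z = np.column_stack([np.ones(len(x)), np.asarray(covariates, dtype=float)])
    # Residuals of x and y after least-squares projection onto Z.
    res_x = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    res_y = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return float(np.corrcoef(res_x, res_y)[0, 1])


# Synthetic data (hypothetical values, chosen only to give a negative association).
rng = np.random.default_rng(0)
n = 200
age = rng.normal(10, 2, n)
iq = rng.normal(100, 15, n)
malff = 1.1 + 0.01 * age + rng.normal(0, 0.2, n)
word_interference_time = 30 - 10 * malff + 0.5 * age + rng.normal(0, 3, n)

r = partial_corr(malff, word_interference_time, np.column_stack([age, iq]))
print(f"partial r = {r:.2f}")
```

Residualizing both variables on the same covariates before correlating them is equivalent to the standard partial correlation; categorical covariates such as sex or diagnosis would enter as 0/1 columns.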

variables in start points on those in end points. rSFG: right superior frontal gyrus. \*\*P < 0.01, \*\*\*P < 0.001.

Using PROCESS, the moderation effect of mALFF in rSFG (moderator) on the relationship between genotype (X) and word interference time (Y) was evaluated. The level of mALFF (moderator) was significantly associated with word interference time, but there was no significant mALFF\*genotype interaction in relation to word interference time (**Supplementary Table 7**).
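
The moderation test described above amounts to asking whether the product term genotype × mALFF improves a linear model of word interference time. The following is a minimal sketch under stated assumptions. It uses ordinary least squares in NumPy rather than the PROCESS macro, codes genotype as 0 (GG) or 1 (T-allele carrier), mean-centers the moderator, and runs on entirely synthetic data. The helper `fit_moderation` and all variable names are hypothetical.

```python
import numpy as np


def fit_moderation(x, m, y):
    """OLS fit of y = b0 + b1*x + b2*m_c + b3*(x*m_c); returns (coefs, t_stats)."""
    x = np.asarray(x, dtype=float)
    m = np.asarray(m, dtype=float)
    y = np.asarray(y, dtype=float)
    m_c = m - m.mean()  # center the moderator so b1 is the effect of x at mean m
    X = np.column_stack([np.ones_like(x), x, m_c, x * m_c])
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coefs
    dof = len(y) - X.shape[1]
    sigma2 = resid @ resid / dof            # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)   # covariance of the estimates
    t_stats = coefs / np.sqrt(np.diag(cov))
    return coefs, t_stats


# Synthetic data: a main effect of mALFF but no genotype x mALFF interaction.
rng = np.random.default_rng(1)
n = 300
genotype = rng.integers(0, 2, n)  # hypothetical 0/1 coding
malff = 1.05 + 0.2 * genotype + rng.normal(0, 0.2, n)
word_interference_time = 40 - 13 * malff + rng.normal(0, 5, n)

coefs, t_stats = fit_moderation(genotype, malff, word_interference_time)
for name, b, t in zip(["intercept", "genotype", "mALFF", "genotype*mALFF"], coefs, t_stats):
    print(f"{name:>15s}: B = {b:8.2f}, t = {t:6.2f}")
```

A significant coefficient on the moderator with a non-significant product term corresponds to the pattern reported here: mALFF relates to word interference time, but it does not change the strength of the genotype effect.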

TABLE 4B | Correlation between mean amplitude of low-frequency fluctuation in right superior frontal gyrus and executive measures in combined samples of ADHD-whole and control.

*aAdjusted with age, sex, IQ, and ADHD diagnoses.*

The mediation effect was then examined to evaluate the three-way relationship between mALFF in the rSFG (mediator), genotype (X), and word interference time (Y). In the mediation model, the path from genotype to mALFF was significant [B = 0.20 (SE = 0.04), 95% CI = 0.12 to 0.27, *P =* 2.10 × 10⁻⁶], and the path from mALFF to word interference time was significant [B = -13.33 (SE = 4.72), 95% CI = -22.73 to -3.39, *P =* 0.006], but the path from genotype to word interference time did not reach statistical significance. The bias-corrected bootstrap 95% CI indicated that the indirect path through mALFF was significant [B = -2.61 (SE = 1.07), 95% CI = -4.72 to -0.48, **Supplementary Table 7**]. An indirect-only subtype of mediation was detected (Zhao et al., 2010): the Sobel test for the mediation effect was significant (Sobel z = -2.47, *P =* 0.009), supporting the interpretation that mALFF mediated the path between genotype and word interference time (**Figure 2B**).
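
As a worked check on the mediation statistics, the Sobel z can be approximated from the two reported path coefficients and their standard errors. The sketch below assumes the first-order Sobel formula, z = ab / sqrt(b²·SE_a² + a²·SE_b²); the source does not state which variant of the test was used, and the function name `sobel_test` is hypothetical. Inputs are the rounded values reported in the text.

```python
import math


def sobel_test(a, se_a, b, se_b):
    """Return (indirect effect, Sobel z, two-sided p) for paths a and b."""
    indirect = a * b
    # First-order standard error of the product a*b (assumed variant).
    se_indirect = math.sqrt(b**2 * se_a**2 + a**2 * se_b**2)
    z = indirect / se_indirect
    # Two-sided p-value from the standard normal distribution.
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return indirect, z, p_two_sided


# Path a: genotype -> mALFF; path b: mALFF -> word interference time.
indirect, z, p = sobel_test(a=0.20, se_a=0.04, b=-13.33, se_b=4.72)
print(f"indirect effect = {indirect:.2f}, Sobel z = {z:.2f}")
```

With these rounded inputs the indirect effect is about -2.67 and z is about -2.46, close to the reported B = -2.61 and Sobel z = -2.47. Small differences are expected because the published coefficients are rounded and the bootstrap estimate is computed differently.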

#### Expression Quantitative Trait Loci Analyses for rs10891819

According to data extracted from the online resource of the UK Brain Expression Cohort (Caucasian participants), the minor G allele of the SNP rs10891819 was associated with a higher *CADM1* expression level (*P* = 0.037, **Figure 3A**). This pattern differed from that in our sample of Chinese Han participants, in whom the T variant was the minor allele (**Figure 3B**).

FIGURE 3 | (A) Expression quantitative trait loci analysis of rs10891819 on CADM1 transcriptional expression in human brain. FCTX, frontal cortex; HIPP, hippocampus; THAL, thalamus; TCTX, temporal cortex; CRBL, cerebellar cortex; OCTX, occipital cortex (specifically primary visual cortex); PUTM, putamen; SNIG, substantia nigra; MEDU, medulla (specifically inferior olivary nucleus); WHMT, intralobular white matter. (B) Worldwide diversity of rs10891819 allele frequencies in Human Genome Diversity Project (http://genome.ucsc.edu/trash/hgc/hgdpGeo\_rs10891819.png).

### DISCUSSION

Our study examined the association of the *CADM1* gene with ADHD psychiatric phenotypes, neurocognitive endophenotypes, and regional brain circuit activity. There are four key findings.

The first key finding was a marginally significant genotypic effect of rs10891819 detected only in the "ADHD-alone" subgroup, with the TT genotype as protective, although the association did not survive Bonferroni correction. Second, in the nested imaging genetic study, rs10891819 genotype was significantly associated with altered spontaneous regional brain activity in the rSFG during rs-fMRI. More specifically, mALFF in T-allele carriers was consistently higher than in GG carriers in both the ADHD and control groups. Third, endophenotypic correlation analyses detected a significant negative correlation between "word interference time" in the Stroop test and mALFF in the rSFG; that is, higher spontaneous regional brain activity in the rSFG was correlated with better performance in the inhibition task (as indexed by shorter "word interference time"). Fourth, our mediation analysis indicated a significant three-way effect (supported by a significant Sobel test for "indirect-only mediation") from "gene" to "brain activity" to "inhibition task," potentially representing a "gene–brain–behavior" relationship. The significant indirect effect involved two paths: from rs10891819 genotype (T-allele carriers) to brain activation in the rSFG (higher activity) and from rSFG activity to the Stroop inhibition task (better performance).

In other words, we detected only a protective effect of *CADM1* genotype and its association with higher brain activation in the context of better performance in the inhibition task. These two strands of findings are consistent with each other, suggesting that the detected *CADM1* genotypic effects confer better cognitive function and therefore protection, rather than elevating the risk of impaired cognitive processes or phenotypic expression of ADHD. Our preliminary findings could also indicate that *CADM1* genotypes may not directly elevate the risk of ADHD expression and would therefore be consistent with our predictions derived from ADHD GWAS, which did not detect genome-wide significant associations with the "disorder phenotype."

Our second aim was to test whether the domain approach guided by RDoC could be an alternative avenue to elucidate "gene–brain–behavior" relationships of the *CADM1* gene through a series of domain-specific probes on different intermediate phenotypes within a single study, instead of relying on traditional diagnostic phenotypes alone. The SNP rs10891819 showed a marginal association (TT genotype as protective) in the "ADHD-alone" subsample. This association was not found in the "ADHD-comorbid" subsample or in the whole (unstratified) sample. It could represent a spurious chance finding (Liu et al., 2015) or a weak genetic signal partially obscured by unmeasured confounders. This preliminary finding needs to be replicated by future studies with larger sample sizes. Within the remit of our study, we then applied the domain approach guided by RDoC and interrogated this weak genetic signal further using a nested imaging genetic study. Through iterations along the domains posited by RDoC, other significant findings were uncovered in brain activity in relation to genotype and the cognitive intermediate phenotype. Finally, a significant mediation model emerged, delineating the paths from "T-allelic carrier genotype" to "higher brain activation in the rSFG" and from "rSFG" to "better performance in inhibition task." These findings are theoretically and biologically plausible: the detected "better performance in inhibition" is in line with our other findings, such as "higher brain activity in the PFC" involved in top–down control and the "detected protective effect" against ADHD expression. Given the small sample size in our imaging genetic study and the multiple tests conducted (with findings not surviving Bonferroni correction), our findings should be interpreted with caution and regarded as exploratory.

Because this is a hypothesis-generating study, our findings provide preliminary support for the merits of a domain-informed approach based on the RDoC framework in exploring potential "gene–brain–behavior" relationships within the context of the *CADM1* gene. Future studies with larger samples may specifically test the hypotheses generated by our exploratory findings. It is particularly striking that the mediation effect on the "gene–brain–endophenotype" relationship was detected independent of the clinical diagnostic phenotypes (i.e., in ADHD and/or control groups). If replicated, our findings may offer support for the RDoC conceptual framework that privileges brain circuitries (in relation to genes and endophenotypes) over the clinical phenotypes as the primary anchor for investigation.

Our findings are in line with the suggestion from a recent study that the cell adhesion pathway could be an etiological candidate for ADHD (Lima et al., 2016), but the association is unlikely to be linear or to conform to the conventional bivariate model of risk and disease. The *CADM1* gene encodes cell adhesion molecule 1, which influences a wide range of neural functions, including neuronal development, myelination, synaptic formation, plasticity, and integrity of neuronal networks (Lima et al., 2016). Genes involved in neuronal migration, growth, morphology, synaptic plasticity, and cell adhesion have been implicated by GWAS in ADHD (Zayats et al., 2015; Lima et al., 2016).

Interestingly, rs10891819 is located in intron 9 of the *CADM1* gene, a region with an uncertain but putative function in influencing expression of the CADM1 protein. As shown by the expression quantitative trait loci analyses, the minor G allele (in Caucasian samples) was associated with a higher *CADM1* expression level. However, the reverse minor-allele pattern for rs10891819 was observed in our Chinese Han participants (T as minor allele) (**Figure 3B**); one possible explanation is that the T allele in Chinese Han subjects and the G allele in Caucasian subjects might confer the same postulated function relevant to the expression of ADHD symptoms, given the putative protective function bestowed by higher expression of *CADM1* and its positive downstream influences on higher prefrontal neural activity and better inhibitory control. However, no expression data are available in the Han population to support this interpretation, and we could only infer that the higher cognitive performance observed in our findings is attributable to better function of the CADM1 protein. If our findings are replicated, future studies may be needed to evaluate the transcriptional functions of *CADM1* polymorphisms and to elucidate more fully their functional roles in Chinese Han participants.

In addition, a possible mechanism for the involvement of *CADM1* in ADHD has been considered in relation to dopaminergic function in a recent review (Kitagishi et al., 2015). The dopamine transporter (DAT) is a key molecule in the psychopharmacological treatment of ADHD, pivotal in i) regulating the dopamine (DA) level within the synaptic cleft and ii) maintaining presynaptic DA function through synthesis and storage. However, the regulatory functions of DAT depend on protein kinase A (PKA) and protein kinase B (AKT), which are activated by phosphatidylinositol-3-hydroxykinase (PI3K). By recruiting PI3K to the membrane surface, the CADM1 molecule plays a putatively crucial role in affecting the upstream signaling pathways of DAT and, consequently, in the pathogenesis of ADHD (Kitagishi et al., 2015). Therefore, the potential roles of the *CADM1* gene in the expression of ADHD symptoms are complex and likely implicated at multiple levels, including the level of a specific pathway (e.g., cell adhesion) and the level of pathway–pathway interaction (e.g., the "cell adhesion pathway" intersecting with the "monoaminergic pathway"). Moreover, the genotypic effect of *CADM1* on the rSFG and its subsequent relationship with inhibition function reported here might be the consequence of a *CADM1*\**DAT1* gene–gene interaction. ADHD is likely a disorder involving multiple causal genes of small effect and their interactions. To examine this possibility, future studies including *DAT1* and other genotypes could unpack more fully the effects and theoretical implications of *CADM1*\**DAT1* and other gene–gene interactions.

Several limitations need to be considered. First, most of the findings did not survive correction for multiple testing. Our study is an exploratory examination of the genetic effects of the *CADM1* gene on ADHD and should be regarded as hypothesis-generating. Second, the scope was limited by the small sample size, especially after stratification by comorbidity status. Our preliminary findings should be treated with caution and need to be replicated in other samples. More specifically, Caucasian samples may show an opposite effect, given that the G allele is the minor allele conferring higher *CADM1* expression in the Caucasian population. Future replication studies should therefore be vigilant about potentially divergent functional effects of a given minor allele on cellular, brain, functional, and behavioral expression. Third, the controls recruited for the gene–behavior association analyses included both adults and children. Genotypes do not change with age, and healthy adults without a childhood history of psychiatric disorders can be used as controls. Further validation analyses in children-only samples could overcome this limitation. In addition, if some adults failed to recall or disclose childhood disorders accurately, contamination of the controls by ADHD cases would reduce the statistical power of the sample, biasing the results toward the null hypothesis rather than producing spurious positive findings. Fourth, multiple tests and comparisons were conducted in our study, so our significant findings could be spurious and have arisen by chance. However, the directions of the significant findings converged meaningfully in line with theoretical and biological plausibility regarding the protective effect of the rs10891819 genotype, higher PFC activation, better cognitive function, and their mediating relationship. It remains likely that a weak genetic signal initially detected by the candidate gene association approach was amplified through subsequent domain iterations guided by the RDoC approach. Fifth, we could only detect an "indirect-only mediation" effect (Zhao et al., 2010). Zhao et al. (2010) provided an extensive review of the different subtypes of mediation model, and our final mediation model conformed to the "indirect-only mediation" subtype. It is possible that the long chain of intermediates between the *CADM1* gene and the inhibition endophenotype diluted the direct effect to the extent that our small sample could not detect a significant association between gene and inhibition. Alternatively, the "brain-activation" phenotype may embody two unrelated or weakly correlated components of variance, each of which correlates independently with the *CADM1* gene or with inhibition; as a result, only "brain" correlates with both. Future studies with larger, adequately powered samples (based on our detected effect sizes) may be able to provide a fuller explanation. Furthermore, future studies could apply mediation and moderation analyses in a gene–environment interaction model justified by a plausible biological theory (van der Meer et al., 2015). Our findings should be regarded as hypothesis-generating and should serve only as a stimulus for future research.

In conclusion, our study offers preliminary evidence supporting a role of *CADM1* in relation to prefrontal brain activity, inhibitory executive function, and the ADHD phenotype, implicating a potential "gene–brain–behavior" relationship of the *CADM1* gene. Our preliminary findings from this hypothesis-generating study also support the merits of applying the domain-informed approach based on the RDoC framework in ADHD research. Future studies with larger samples may specifically test the hypotheses generated by our exploratory findings.

#### DATA AVAILABILITY

The raw data supporting the conclusions of this manuscript will be made available by the corresponding author on reasonable request, without undue reservation, to any qualified researcher.

#### ETHICS STATEMENT

This project was approved by the Ethics Committee of Peking University Sixth Hospital/Institute of Mental Health. Written informed consent was sought and obtained from parents for the child participants and from adult participants.

#### AUTHOR CONTRIBUTIONS

JJ, QG, LL, YW, and QQ contributed to the conception and design of the study. HL, JJ, and QG organized the database. JJ, LL, WC, YW, and QQ performed the statistical analysis, interpreted the results, and wrote sections of the manuscript. All authors contributed to manuscript revision and read and approved the submitted version.

#### FUNDING

This work was supported by the National Science Foundation of China (81571340), the National Key Basic Research Program of China (973 program 2015CB856405), Beijing Municipal Science & Technology Commission (No.Z161100000516032), the National Natural Science Foundation of China (81873802, 81641163, 81761148026), Beijing Natural Science Foundation (7172245), and the Capital Foundation of Medical Developments (CFMD:2016-2-4113).

#### REFERENCES


#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2019.00882/full#supplementary-material

Children-Present and Lifetime Version (K-SADS-PL): initial reliability and validity data. *J. Am. Acad. Child. Adolesc. Psychiatry* 36, 980–988. doi: 10.1097/00004583-199707000-00021


**Conflict of Interest Statement**: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Jin, Liu, Chen, Gao, Li, Wang and Qian. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*
