# EPIGENETIC APPROACHES IN DRUG DISCOVERY, DEVELOPMENT AND TREATMENT

EDITED BY: Shibashish Giri and Chandravanu Dash

PUBLISHED IN: Frontiers in Pharmacology and Frontiers in Genetics

#### Frontiers eBook Copyright Statement

The copyright in the text of individual articles in this eBook is the property of their respective authors or their respective institutions or funders. The copyright in graphics and images within each article may be subject to copyright of other parties. In both cases this is subject to a license granted to Frontiers. The compilation of articles constituting this eBook is the property of Frontiers.

Each article within this eBook, and the eBook itself, are published under the most recent version of the Creative Commons CC-BY licence. The version current at the date of publication of this eBook is CC-BY 4.0. If the CC-BY licence is updated, the licence granted by Frontiers is automatically updated to the new version.

When exercising any right under the CC-BY licence, Frontiers must be attributed as the original publisher of the article or eBook, as applicable.

Authors have the responsibility of ensuring that any graphics or other materials which are the property of others may be included in the CC-BY licence, but this should be checked before relying on the CC-BY licence to reproduce those materials. Any copyright notices relating to those materials must be complied with.

Copyright and source acknowledgement notices may not be removed and must be displayed in any copy, derivative work or partial copy which includes the elements in question.

All copyright, and all rights therein, are protected by national and international copyright laws. The above represents a summary only. For further information please read Frontiers' Conditions for Website Use and Copyright Statement, and the applicable CC-BY licence.

ISSN 1664-8714 ISBN 978-2-88963-932-8 DOI 10.3389/978-2-88963-932-8

# EPIGENETIC APPROACHES IN DRUG DISCOVERY, DEVELOPMENT AND TREATMENT

Topic Editors:

Shibashish Giri, University of Leipzig, Germany

Chandravanu Dash, Meharry Medical College, United States

Establishment of a normal phenotype involves dynamic epigenetic regulation of gene expression that when affected contributes to human diseases. On a molecular level, epigenetic regulation is marked by specific covalent modifications (acetylation, methylation, phosphorylation, sumoylation, PARylation and ubiquitylation) of DNA and its associated histones. Studies also suggest the influence of such epigenetic modifications on non-coding RNA expression implicated in normal and diseased phenotypes. Epigenetic control of genetic expression is a reversible process essential for normal development and function of an organism. Alteration of epigenetic regulation leads to various disease forms such as cancer, diabetes, inflammation and neuropsychiatric disorders. Assessing these alterations provides a deeper insight into the changes induced in the genome, which is often informative for identifying disease subtypes or developing suitable treatments. Therefore, epigenetics proves to be a key area of clinical investigation in diagnosis, prognosis, and treatment of complex diseases.

Genetic mutations, environmental stress, pathogens and drugs of abuse are some of the predominant factors that induce changes in chromatin, which can directly dictate a diseased phenotype. It is essential to consider the interaction between genetic and epigenetic factors to understand the molecular mechanisms of complex human diseases for safer and more efficient drug development. Furthermore, genetic variation in absorption, distribution, metabolism, and excretion (ADME) genes is insufficient to account for interindividual variability of drug response. Therefore, current efforts aim to identify epigenetic components of ADME gene regulation, which include phase-I and phase-II enzymes, uptake transporters, efflux transporters and nuclear receptors involved in regulation of ADME genes. Monitoring circulating epigenetic biomarkers of disease-associated and drug-associated epigenetic alterations in liquid biopsies (blood, saliva, urine, cerebrospinal fluid) may prove useful for decision support in routine clinical treatment and drug discovery. Hence, targeting the epigenome has emerged as an area of interest in recent drug discovery efforts, with several new drugs being developed and tested and some already approved by the US Food and Drug Administration (FDA). These new insights into the complexities of epigenetic regulation are key contributors to our basic understanding of this process in human health and disease, which will provide scope for innovative drug therapies. There is an urgent need to advance the present understanding of epigenomics-driven disease outcomes, with the expectation that further studies will identify early markers of disease and targets for therapeutics.

Citation: Shibashish, G., Chandravanu, D., eds. (2020). Epigenetic Approaches in Drug Discovery, Development and Treatment. Lausanne: Frontiers Media SA. doi: 10.3389/978-2-88963-932-8

# Table of Contents

*miRNAs and lncRNAs as Predictive Biomarkers of Response to FOLFOX Therapy in Colorectal Cancer*

Kha Wai Hon, Nadiah Abu, Nurul-Syakima Ab Mutalib and Rahman Jamal

*Inhibitors of DNA Methyltransferases From Natural Sources: A Computational Perspective*

Fernanda I. Saldívar-González, Alejandro Gómez-García, David E. Chávez-Ponce de León, Norberto Sánchez-Cruz, Javier Ruiz-Rios, B. Angélica Pilón-Jiménez and José L. Medina-Franco

*Sulfotransferase and Heparanase: Remodeling Engines in Promoting Virus Infection and Disease Development*

Dominik D. Kaltenbach, Dinesh Jaishankar, Meng Hao, Jacob C. Beer, Michael V. Volin, Umesh R. Desai and Vaibhav Tiwari

*Using Integrative Analysis of DNA Methylation and Gene Expression Data in Multiple Tissue Types to Prioritize Candidate Genes for Drug Development in Obesity*

Qingjie Guo, Ruonan Zheng, Jiarui Huang, Meng He, Yuhan Wang, Zonghao Guo, Liankun Sun and Peng Chen

*Corrigendum: Using Integrative Analysis of DNA Methylation and Gene Expression Data in Multiple Tissue Types to Prioritize Candidate Genes for Drug Development in Obesity*

Qingjie Guo, Ruonan Zheng, Jiarui Huang, Meng He, Yuhan Wang, Zonghao Guo, Liankun Sun and Peng Chen


Mohammad Afaque Alam and Prasun K. Datta

# miRNAs and lncRNAs as Predictive Biomarkers of Response to FOLFOX Therapy in Colorectal Cancer

#### Kha Wai Hon, Nadiah Abu\*, Nurul-Syakima Ab Mutalib and Rahman Jamal

UKM Medical Molecular Biology Institute (UMBI), Universiti Kebangsaan Malaysia, Kuala Lumpur, Malaysia

Chemotherapy is one of the options for cancer treatment. FOLFOX is one of the widely used chemotherapeutic regimens used to treat primarily colorectal cancer and other cancers as well. However, the emergence of chemo-resistance clones during cancer treatment has become a critical challenge in the clinical setting. It is crucial to identify the potential biomarkers and therapeutics targets which could lead to an improvement in the success rate of the proposed therapies. Since non-coding RNAs have been known to be important players in the cellular system, the interest in their functional roles has intensified. Non-coding RNAs (ncRNAs) as regulators at the post-transcriptional level could be very promising to provide insights in overcoming chemo-resistance to FOLFOX. Hence, this mini review attempts to summarize the potential of ncRNAs correlating with chemo-sensitivity/resistance to FOLFOX.

#### Edited by:

Chandravanu Dash, Meharry Medical College, United States

#### Reviewed by:

Yang Zhang, University of Pennsylvania, United States

Alessio Squassina, Università degli studi di Cagliari, Italy

\*Correspondence:

Nadiah Abu nadiah.abu@ppukm.ukm.edu.my

#### Specialty section:

This article was submitted to Pharmacogenetics and Pharmacogenomics, a section of the journal Frontiers in Pharmacology

Received: 01 June 2018 Accepted: 13 July 2018 Published: 06 August 2018

#### Citation:

Hon KW, Abu N, Ab Mutalib N-S and Jamal R (2018) miRNAs and lncRNAs as Predictive Biomarkers of Response to FOLFOX Therapy in Colorectal Cancer. Front. Pharmacol. 9:846. doi: 10.3389/fphar.2018.00846

Keywords: FOLFOX, chemo-resistance, biomarkers, non-coding RNAs, molecular target

## INTRODUCTION

Colorectal cancer (CRC) is the third most commonly diagnosed cancer and the fourth leading cause of cancer mortality worldwide (Torre et al., 2015). CRC begins with the appearance of benign adenomatous polyps on the inner wall of the colon and rectum of the large intestine, which progressively develop into advanced adenoma, invasive carcinoma and eventually distant metastases (Flor et al., 2013; Yee et al., 2013; US Preventive Services Task Force et al., 2016; Veettil et al., 2016). Despite advances in clinical oncology, multidrug resistance (MDR) remains a major obstacle in the treatment of CRC patients, especially those at an advanced stage of the disease (Hammond et al., 2016). Several mechanisms have been proposed to modulate MDR in CRC, mainly limitation of drug transport, dysregulation of cellular processes, and alteration of drug sensitivity via epigenetic modifications such as disturbance of miRNA levels (Holohan et al., 2013; Panczyk, 2014; Hu et al., 2016; Zhang and Wang, 2017). Over-expression of ATP-dependent transporters on the plasma membrane of cancer cells could suppress the influx of drug into cancer cells while simultaneously increasing its efflux, reducing overall drug accumulation (Liu et al., 2010; Wilson et al., 2011; Nies et al., 2015). Dysregulation of cellular processes including apoptosis, drug metabolism, DNA damage repair and regulation of cell cycle checkpoints may also modulate MDR in CRC (Bouwman and Jonkers, 2012; Xu et al., 2012, 2013; Czabotar et al., 2014). Epigenetic modifications in cancer cells, such as selective expression of miRNAs, DNA methylation and histone modification, have been postulated to alter drug sensitivity in certain cancers including CRC (Brown et al., 2014; Shen et al., 2017; Zhang and Wang, 2017). Although various mechanisms leading to drug resistance in CRC have been described, the problem remains largely unresolved.

One of the CRC chemotherapeutic regimens widely used today is FOLFOX, the combination of folinic acid (FOL), 5-fluorouracil (F) and oxaliplatin (OX) (André et al., 2004). 5-fluorouracil (5-FU), the main component of FOLFOX, is a fluoropyrimidine that incorporates into the DNA molecule to inhibit thymidylate synthase (TS) (Longley et al., 2003). This subsequently hinders the synthesis of the pyrimidine thymidine required for DNA replication, so that actively dividing cancerous cells undergo apoptosis due to the thymineless condition (Noordhuis et al., 2004). 5-FU has been used to treat multiple types of cancer including esophageal cancer, gastric cancer, pancreatic cancer, breast cancer, and cervical cancer (Peters et al., 2000; Ling et al., 2012; Carter et al., 2013; Kim et al., 2016; Lee and Park, 2016). Oxaliplatin (trans-l-diaminocyclohexane oxalatoplatinum; L-OHP), on the other hand, is a platinum-based antineoplastic agent that inhibits DNA replication and transcription by forming cross-links within the double strands of DNA (Bleiberg, 1998; Woynarowski et al., 2000; Kelland, 2007). The combination of 5-FU and oxaliplatin provides a synergistic anti-proliferative effect, especially among patients with metastatic colorectal cancer (mCRC) (Gustavsson et al., 2015). Folinic acid, also known as leucovorin or calcium folinate, stabilizes the 5-FU-TS complex, enhancing cytotoxicity, and reduces the side effects of 5-FU by lowering the dosage required to complete the cycles of treatment (Morgan, 1989; Van Der Wilt et al., 1992). Leucovorin and oxaliplatin also exhibit antitumor properties against metastatic colorectal cancer, esophageal cancer, gastric cancer and hepatocellular carcinoma, and are used either individually or in combination with other chemotherapeutics (Lin et al., 2013b; Skinner et al., 2014; Wu et al., 2014a; Hironaka et al., 2016; Liu et al., 2016b).

Currently, FOLFOX is widely administered by intravenous injection, mostly to treat stage II and III CRC patients after surgical resection (André et al., 2015). Although FOLFOX is among the preferred chemotherapeutic regimens for CRC patients, the response rate to this systemic treatment is estimated at only around 50%. Studies have reported that almost half of the patients receiving FOLFOX develop chemo-resistance at a later stage of treatment, resulting in a high incidence of cancer recurrence and metastasis to other organs (De Gramont et al., 2000; André et al., 2004, 2015; Howlader et al., 2016).

Non-coding RNAs (ncRNAs) represent a group of functional RNA molecules transcribed from DNA but not translated into proteins (Chen and Xue, 2016). ncRNAs can be classified into two major groups: infrastructural and regulatory ncRNAs (Kaikkonen et al., 2011). Infrastructural ncRNAs such as transfer RNAs (tRNAs), ribosomal RNAs (rRNAs) and small nuclear RNAs (snRNAs) are abundantly expressed in all eukaryotic cells and play housekeeping roles in splicing and translation of mRNAs into proteins (Mattick and Makunin, 2006). Regulatory ncRNAs include microRNAs (miRNAs), short interfering RNAs (siRNAs), piwi-interacting RNAs (piRNAs), and long ncRNAs (lncRNAs). They are involved in the epigenetic modification of other RNAs (Fu, 2014). These regulatory ncRNAs regulate gene expression at the transcriptional and post-transcriptional levels via several mechanisms, namely heterochromatin formation, histone modification, DNA methylation, and gene silencing (Meister and Tuschl, 2004; Volpe and Martienssen, 2011; Nohata et al., 2013; Matzke and Mosher, 2014). Other classes of RNAs have been discovered in the recent decade, such as enhancer RNAs (eRNAs), circular RNAs (circRNAs) and promoter-associated RNAs (PARs), but more studies are required to validate the functions of these minor classes in gene regulation (Han et al., 2007; Yan and Ma, 2012; Kim et al., 2015; Salzman, 2016). MicroRNAs (miRNAs) are a class of small, single-stranded endogenous ncRNAs of 21–25 nucleotides (Ul Hussain, 2012). MiRNAs bind partially or completely to complementary sequences of their target mRNAs and can silence the mRNA through regulatory mechanisms such as cleavage of the mRNA strand and destabilization of the mRNA through shortening of its poly(A) tail (Bartel, 2009). MiRNAs play important roles in a variety of biological processes, namely cellular development, proliferation, differentiation, metabolism, apoptosis and tumorigenesis (Ul Hussain, 2012).

Long non-coding RNAs (lncRNAs) constitute a large family of ncRNAs with a length of 200 nucleotides or longer (Geisler and Coller, 2013). LncRNAs mostly interact with DNA, RNA and proteins through their secondary and tertiary structures to form multiple kinds of complexes that could be key regulators of gene expression (Mercer et al., 2009). Aberrant expression of lncRNAs has been discovered in many diseases including cancers (Fang and Fullwood, 2016). Emerging literature has revealed the importance of lncRNAs as oncogenes or tumor suppressors that regulate several key steps in carcinogenesis, such as tumor proliferation, apoptosis, metastasis and chemo-resistance, by interfering with target gene expression (Gupta et al., 2010; Yang et al., 2012; Majidinia and Yousefi, 2016; Pan et al., 2016). LncRNAs may also serve as therapeutic targets or biomarkers of disease pathogenesis and pathophysiology (Lavorgna et al., 2016). Recent transcriptomic and bioinformatics studies have reported an increasing number of miRNAs and lncRNAs that modulate the epigenetic regulation of cancer chemo-resistance. Emerging evidence has also demonstrated that interactions of miRNAs and lncRNAs with other biomolecules such as proteins are equally important in modulating the molecular mechanisms underlying cancer chemo-resistance. This review provides insight into promising miRNAs and lncRNAs as potential biomarkers or therapeutic targets related to FOLFOX responsiveness in colorectal cancer.

## miRNAs and FOLFOX Chemo-Resistance

Chen et al. reported that the upregulation of serum miR-19a is significantly associated with FOLFOX-resistance in advanced CRC (Chen et al., 2013b). In this study, serum miR-19a had a sensitivity of 66.7% and a specificity of 63.9% in differentiating FOLFOX-resistant patients from FOLFOX-responsive patients. This points to the potential of miR-19a as a biomarker in advanced CRC to predict innate resistance before FOLFOX therapy as well as to monitor acquired resistance to FOLFOX during treatment (Chen et al., 2013b). MiR-19a is an integral component of the oncomiR miR-17-92 family (miR-17, miR-18a, miR-19a, miR-20a, miR-19b-1, and miR-92-1) (Olive et al., 2009; Matsumura et al., 2015). The aberrant expression of this oncogenic cluster has been observed in different cancers, such as myeloma, acute myeloid leukemia, lung cancer, bladder cancer and CRC (Zhang et al., 2012; Collins et al., 2013; Lepore et al., 2013; Lin et al., 2013a; Feng et al., 2014; Wu et al., 2014b; Yamamoto et al., 2015; Liu et al., 2017). MiR-19a was also detected in CRC-derived exosomes and has been suggested as a possible prognostic biomarker for recurrence in CRC patients (Matsumura et al., 2015). Exosomes are microvesicles released by most living cells as natural carriers of molecular information such as DNA and RNA (Milane et al., 2015). Furthermore, miR-19a has been associated with other types of drug resistance, including gefitinib resistance in non-small cell lung cancer (Cao et al., 2017) and resistance to epirubicin plus paclitaxel in breast cancer (Li et al., 2014). In breast cancer, miR-19a was involved in resistance by regulating the PTEN protein (Li et al., 2014). Meanwhile, in non-small cell lung cancer, miR-19a was reported to be involved in acquired gefitinib resistance via the c-Met pathway (Cao et al., 2017). This suggests that the mechanism of resistance depends on the type of drug: different drugs may induce different types of resistance even when the same miRNA is involved.
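The sensitivity and specificity reported for serum miR-19a can be re-expressed as likelihood ratios, which indicate how strongly a positive or negative result shifts the odds of FOLFOX resistance. The following is a minimal sketch that uses only the two values quoted above from Chen et al. (2013b); the function name is illustrative, and this calculation is not part of the original study.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a binary test.

    LR+ = sensitivity / (1 - specificity)
    LR- = (1 - sensitivity) / specificity
    """
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


# Values quoted in the text for serum miR-19a (Chen et al., 2013b)
lr_pos, lr_neg = likelihood_ratios(0.667, 0.639)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")  # LR+ = 1.85, LR- = 0.52
```

By common rules of thumb, likelihood ratios this close to 1 shift the pre-test odds only modestly, which is in line with the call for further validation made later in this review.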

Another member of the miR-17-92 cluster, miR-17-5p, has also been reported to be significantly upregulated among FOLFOX-resistant CRC patients (Fang et al., 2014). Elevated expression of miR-17-5p was associated with poor prognosis, distant metastases and advanced clinical presentation (Fang et al., 2014). miR-17-5p could serve as a biomarker to predict chemotherapy response in CRC as well as a potential target for the study of CRC tumorigenesis. Additionally, the role of miR-17-5p in drug resistance has been reported in other types of cancer. For instance, miR-17-5p was involved in erlotinib resistance in non-small cell lung cancer cells (Zhang et al., 2017), paclitaxel resistance in lung cancer (Chatterjee et al., 2014) and cisplatin resistance in gastric cancer (Wang and Wang, 2018). In gastric cancer, the resistance mediated by miR-17-5p was achieved by modulating the p21 protein. In ovarian cancer, miR-17-5p was reported to contribute to drug resistance by regulating the AKT pathway through PTEN, as well as other EMT players (Fang et al., 2015). Similarly, in colorectal cancer, the PTEN protein was also found to be involved in miR-17-5p-mediated drug resistance (Fang et al., 2014). Interestingly, it was reported that miR-17-5p affected paclitaxel resistance by binding to the 3'UTR of the beclin-1 gene (Chatterjee et al., 2014). A similar mechanism was observed in erlotinib resistance, where miR-17-5p was reported to bind to the EZH1 gene instead (Zhang et al., 2017). From these reports, it can be postulated that PTEN is a major player in drug resistance and a candidate for targeted therapy. Additionally, miR-17-5p was found to mediate drug resistance by acting as a competitive inhibitor of different genes.

Kjersem et al. also identified the upregulation of three other miRNAs (miR-106a, miR-130b, and miR-484) that could serve as predictive biomarkers of intrinsic resistance to FOLFOX among metastatic CRC patients (Kjersem et al., 2014). miR-130b was found to be involved in breast cancer resistance to adriamycin via the PI3K/AKT signaling pathway (Miao et al., 2017). miR-484 was also reported to contribute to gemcitabine resistance in breast cancer and sunitinib resistance in renal cell carcinoma (Merhautova et al., 2015; Ye et al., 2015). In breast cancer, miR-484 was found to contribute to resistance by targeting the cell-cycle related protein CDA (Ye et al., 2015). Additionally, a comprehensive real-time PCR-based profiling of 742 different miRNAs in 26 CRC tissues with or without response to first-line capecitabine and oxaliplatin (XELOX)/FOLFOX treatment reported that the overexpression of miR-27b, miR-181b, and miR-625-3p was significantly associated with poor response to XELOX/FOLFOX (Rasmussen et al., 2013). The same study further validated these candidate miRNAs in primary tumor tissues of 94 metastatic CRC patients, confirming that high expression of miR-625-3p was significantly associated with poor response to XELOX/FOLFOX as first-line treatment. A subsequent study found that this miRNA regulated chemoresistance by targeting MAP2K6 of the MAPK pathway (Lyskjær et al., 2016). Zhang et al. screened for differentially expressed miRNAs in the serum of 20 responders and 20 non-responders to FOLFOX (Zhang et al., 2014). They reported that 14 miRNAs were differentially expressed, and the findings were further validated in a larger cohort of 93 responders and 80 non-responders. The validation narrowed the candidates down to five serum miRNAs (miR-20a, miR-130, miR-145, miR-216, and miR-372), which were then tested statistically for their ability to differentiate between responders and non-responders. The AUC using all five miRNAs was 0.841 in the training set (40 CRC patients) and 0.918 in the validation set (173 CRC patients), whereas the AUC values for CEA and CA19-9 were 0.689 and 0.746, respectively. This indicates that the panel of five miRNAs was more accurate at determining responsiveness to chemotherapy than CEA and CA19-9 (Zhang et al., 2014).
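The AUC values above summarize how well a score separates two groups: the AUC equals the probability that a randomly chosen case from one group receives a higher score than a randomly chosen case from the other. A minimal sketch of this rank-based (Mann-Whitney) calculation follows. The scores are made-up placeholders rather than data from Zhang et al. (2014), and the way the five miRNAs were combined into a single score is not specified here, so a generic "panel score" is assumed.

```python
def auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs in which the positive case
    has the higher score; ties count as half a win."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical panel scores, for illustration only
non_responders = [0.9, 0.8, 0.6, 0.4]  # treated as the "positive" class
responders = [0.7, 0.5, 0.3, 0.2]

print(auc(non_responders, responders))  # 0.8125 (13 of 16 pairs ordered correctly)
```

An AUC of 0.5 corresponds to chance-level separation and 1.0 to perfect separation, which is the scale on which the reported 0.918 for the miRNA panel and 0.689 and 0.746 for CEA and CA19-9 are compared.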

Dong et al. discovered the upregulation of miR-429 in both serum and primary tissues from chemo-resistant CRC patients who received 5-FU based adjuvant chemotherapy including FOLFOX (Dong et al., 2016). Overexpression of miR-429 was positively correlated with tumor size, lymph node involvement, distant metastases and TNM staging, resulting in poor prognosis and a lower survival rate for CRC patients. Furthermore, miR-429 was also implicated in cisplatin resistance in epithelial ovarian cancer (Zou et al., 2017), where resistance was found to be mediated by targeting the ZEB1 protein. This pathway of resistance may be similar in FOLFOX resistance, as both oxaliplatin and cisplatin are platinum-based drugs. Furthermore, Takahashi et al. reported that the expression of miR-148a was lower in non-responder CRC patients than in responders (Takahashi et al., 2012). The lower expression of miR-148a was also associated with lower progression-free survival and significantly poorer overall survival (Takahashi et al., 2012). At the molecular level, the downregulation of miR-148a in primary tissues was correlated with the development of high-grade adenoma and poor clinical outcome in stage III CRC patients (Hibino et al., 2015). All these findings suggest that miR-148a could serve as a predictive biomarker for FOLFOX response. With respect to drug resistance, miR-148a was also reported to be involved in tamoxifen resistance in breast cancer, where resistance was achieved by targeting the ALCAM protein (Chen et al., 2017).

Liu et al. reported that the reduced expression of serum exosomal miR-4772-3p was significantly associated with a higher risk of tumor recurrence in stage II and stage III CRC patients who received FOLFOX therapy (Liu et al., 2016a). However, no study has yet examined the relationship between miR-4772-3p levels and FOLFOX responsiveness. In another study, which involved a 3-year follow-up focused on the dynamic monitoring of serum miRNA levels (miR-155, miR-200c, and miR-210) during adjuvant FOLFOX therapy plus cetuximab in 15 CRC patients, the researchers suggested that re-elevation of serum miR-155 levels after surgery and chemotherapy may help to predict chemo-resistance (Chen et al., 2013a). MiR-155 is one of the most multi-functional and conserved miRNAs reported (Yu et al., 2015; Bayraktar and Van Roosbroeck, 2018). In fact, miR-155 is well known to be associated with treatment resistance in multiple types of cancer such as breast cancer (Yu et al., 2015), lung cancer (Van Roosbroeck et al., 2017), prostate cancer (Li et al., 2017a), cervical cancer (Lei et al., 2012) and renal cell carcinoma (Merhautova et al., 2015). Although these studies reported different mechanisms by which miR-155 regulates resistance, they still provide a basic view of how miR-155 operates that can be applied to further understand its role. MiR-155 is one of the major oncogenic miRNAs known to be involved in drug resistance and is well studied. In breast cancer, miR-155 was found to modulate resistance by targeting the FOXO3 pathway, the MAPK pathway and EMT-related proteins (Bayraktar and Van Roosbroeck, 2018). Another study reported the prognostic value of miR-320e as a novel biomarker in CRC, validated in two different cohorts of patients treated with FOLFOX (Perez-Carbonell et al., 2015). The expression level of miR-320e in primary CRC tissues showed a positive correlation with recurrence, advanced clinical presentation and poor prognosis among stage II and III CRC patients treated with FOLFOX (Perez-Carbonell et al., 2015).

Recently, Kiss et al. found that a cohort of CRC patients treated with the combination of bevacizumab and FOLFOX showed a distinctive profile of tissue miRNAs (Kiss et al., 2017). The study identified 67 miRNAs differentially expressed between responders and non-responders, of which seven were independently validated (Kiss et al., 2017). Of these, four miRNAs (miR-92b-3p, miR-3156-5p, miR-10a-5p, and miR-125a-5p) were significantly associated with responsiveness according to the Response Evaluation Criteria In Solid Tumors (RECIST) (Kiss et al., 2017). Moreover, the combination of these four miRNAs had a sensitivity of 82% and a specificity of 64% in differentiating between responders and non-responders, indicating the potential use of these miRNAs as biomarkers of chemotherapy responsiveness and progression-free survival (Kiss et al., 2017).
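Sensitivity and specificity alone do not say how often a positive or negative panel result is correct, because that also depends on how common the target class is among the patients tested. The sketch below applies Bayes' rule to the 82% sensitivity and 64% specificity quoted from Kiss et al. (2017). The prevalence of 0.5 is an assumption, loosely motivated by the response rate of roughly 50% to FOLFOX cited in the Introduction; it is not a value taken from the Kiss et al. cohort, and the function name is illustrative.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from test characteristics and class prevalence."""
    tp = sensitivity * prevalence                  # true positives
    fp = (1.0 - specificity) * (1.0 - prevalence)  # false positives
    tn = specificity * (1.0 - prevalence)          # true negatives
    fn = (1.0 - sensitivity) * prevalence          # false negatives
    return tp / (tp + fp), tn / (tn + fn)


# 0.82 and 0.64 are quoted in the text; prevalence = 0.5 is an assumption
ppv, npv = predictive_values(0.82, 0.64, 0.5)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")  # PPV = 0.69, NPV = 0.78
```

Under this assumed prevalence, roughly three in ten positive calls and two in ten negative calls would be wrong; a different prevalence would change both figures.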

## Long Non-coding RNAs (lncRNAs) and FOLFOX-Resistance

Li et al. described two different lncRNAs, MALAT1 and HOTAIR, that contribute to resistance to 5-FU/oxaliplatin-based chemotherapy via similar inhibition of miR-218 (Li et al., 2017b,c). The lncRNA MALAT1, also known as nuclear-enriched abundant transcript 2 (NEAT2), was initially discovered as a promising biomarker for lung cancer metastasis (Gutschner et al., 2013b). Later, MALAT1 dysregulation was found in various cancers, where it acts as a key regulator of metastasis and cancer development (Gutschner et al., 2013a; Tripathi et al., 2013). MALAT1 has also been associated with other types of drug resistance. For instance, MALAT1 was shown to be involved in cisplatin resistance in NSCLC (Fang et al., 2018), adriamycin resistance in diffuse large B-cell lymphoma (Long et al., 2017) and temozolomide resistance in glioblastoma (Lu et al., 2017). Upregulation of MALAT1 in primary CRC tissue was highly associated with a poor survival rate and a weak response to FOLFOX in advanced CRC patients (Li et al., 2017b). The same study demonstrated that the overexpression of MALAT1 in oxaliplatin-resistant CRC cells modulates chemo-resistance via suppression of E-cadherin expression and enhancement of epithelial-mesenchymal transition (EMT), but the underlying signaling pathways have not been fully elucidated (Wang and Zhou, 2013). Correspondingly, HOTAIR overexpression in primary CRC tissue was also demonstrated to inhibit miR-218 expression in CRC, resulting in poor response to 5-FU-based adjuvant chemotherapy (Li et al., 2017c). Similarly, HOTAIR was also reported to be involved in other types of drug resistance in different cancers. HOTAIR was associated with cisplatin resistance in lung adenocarcinoma (Liu et al., 2013), crizotinib resistance in NSCLC (Yang et al., 2018) and imatinib resistance in chronic myeloid leukemia (Wang et al., 2017). In gastric cancer, HOTAIR was involved in cisplatin resistance via inhibition of the PI3K/AKT and Wnt/β-catenin pathways. Both are hallmark pathways involved in colorectal cancer pathogenesis. It may therefore be hypothesized that FOLFOX resistance is achieved through the same mechanism.

**Table 1** summarizes all the ncRNAs mentioned above that show correlation with chemo-resistance to FOLFOX.

#### CONCLUSION, CHALLENGES AND FUTURE DIRECTION

This mini-review highlights the increasing evidence on, and provides a fresh update of, the potential role of ncRNAs, primarily miRNAs and lncRNAs, in FOLFOX chemo-resistance. Most of the ncRNAs reported are involved not only in FOLFOX resistance but in resistance to other drugs as well. This reflects that the mechanisms of drug resistance are rather complex and that different pathways may crosstalk with each other, as illustrated in **Figure 1**. Drug resistance may also occur in a centralized manner, regardless of the type of cancer or drug being administered. Nevertheless, there are also instances where a certain ncRNA may modulate different pathways of resistance depending on the type of drug. Most of the studies mentioned above analyzed clinical tissues and serum/plasma samples from retrospective patient cohorts without sufficient validation or in-depth functional analysis. Future research is necessary to validate these findings in multi-center cohort studies as well as to elucidate the underlying signaling pathways via in vitro and in vivo functional studies. The bioinformatics analysis in studies related to FOLFOX resistance is still insufficient to provide a solid foundation for the translation of miRNAs and lncRNAs into powerful predictors of FOLFOX resistance in the clinical setting. More intensive and comprehensive statistical analysis is essential to validate the specificity and sensitivity of each individual miRNA/lncRNA as a biomarker (Liu et al., 2012). Ultimately, all these findings may contribute toward the development of a next-generation diagnostic panel comprising miRNAs and lncRNAs, which would be a more powerful diagnostic tool to predict patient response to FOLFOX. However, it is still challenging to accurately identify those clinically promising ncRNAs suitable for early diagnosis, risk assessment, prognosis prediction and drug monitoring in patients treated with FOLFOX. Furthermore, there are still considerable obstacles that limit the clinical application of miRNAs and lncRNAs for diagnostic and prognostic purposes. Lack of standardization in the extraction of ncRNAs from tissues and bodily fluids remains a major challenge, which greatly affects the stability of ncRNAs in specimens and subsequently leads to inconsistency in most findings. Due to the high abundance of ncRNAs in human serum/plasma, the liquid biopsy approach could be an ideal method to develop standardized operating procedures for ncRNA extraction in the clinical environment (Erbes et al., 2015; Komatsu et al., 2015).
Liquid biopsy is now widely accepted as a non-invasive method to retrieve circulating cancer cells or traces of nucleic acids derived from tumors (Karachaliou et al., 2015; Murtaza et al., 2015; Birkenkamp-Demtröder et al., 2016; Jamal-Hanjani et al., 2016). Discovery and characterization of new ncRNAs related to FOLFOX resistance will help researchers explore the diverse ncRNAs as potential biomarkers and therapeutic targets to overcome drug resistance.

#### AUTHOR CONTRIBUTIONS

KH and NA drafted the manuscript. NA, N-SA, and RJ were responsible for critical feedback and manuscript revision.

#### FUNDING

This manuscript was funded by the Dana Impak Perdana Grant (DIP-2016-013) awarded by Universiti Kebangsaan Malaysia.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Hon, Abu, Ab Mutalib and Jamal. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Inhibitors of DNA Methyltransferases From Natural Sources: A Computational Perspective

Fernanda I. Saldívar-González, Alejandro Gómez-García, David E. Chávez-Ponce de León, Norberto Sánchez-Cruz, Javier Ruiz-Rios, B. Angélica Pilón-Jiménez and José L. Medina-Franco\*

Department of Pharmacy, School of Chemistry, National Autonomous University of Mexico, Mexico City, Mexico

#### Edited by:

Shibashish Giri, University of Leipzig, Germany

#### Reviewed by:

Jaigopal Sharma, Delhi Technological University, India Sujata Mohanty, All India Institute of Medical Sciences, India Rup Lal, University of Delhi, India

#### \*Correspondence:

José L. Medina-Franco medinajl@unam.mx; jose.medina.franco@gmail.com

#### Specialty section:

This article was submitted to Pharmacogenetics and Pharmacogenomics, a section of the journal Frontiers in Pharmacology

Received: 08 August 2018 Accepted: 21 September 2018 Published: 10 October 2018

#### Citation:

Saldívar-González FI, Gómez-García A, Chávez-Ponce de León DE, Sánchez-Cruz N, Ruiz-Rios J, Pilón-Jiménez BA and Medina-Franco JL (2018) Inhibitors of DNA Methyltransferases From Natural Sources: A Computational Perspective. Front. Pharmacol. 9:1144. doi: 10.3389/fphar.2018.01144

Naturally occurring small molecules include a large variety of natural products from different sources that have confirmed activity against epigenetic targets. In this work we review chemoinformatic, molecular modeling, and other computational approaches that have been used to uncover natural products as inhibitors of DNA methyltransferases, a major family of epigenetic targets with therapeutic interest. Examples of computational approaches surveyed in this work are docking, similarity-based virtual screening, and pharmacophore modeling. We also discuss the chemoinformatic-guided exploration of the chemical space of naturally occurring compounds as epigenetic modulators, which may have significant implications in epigenetic drug discovery and nutriepigenetics.

Keywords: chemical space, chemoinformatics, databases, DNMT inhibitors, drug discovery, molecular modeling, similarity searching, virtual screening

#### SECTION 1: INTRODUCTION

Epigenetics has been defined as a change in phenotype without an underlying change in genotype (Berger et al., 2009). In the 1940s Waddington suggested the term "epigenetics" to describe "the interactions of genes with their environment, which brings the phenotype into being" (Waddington, 2012). Alterations in epigenetic modifications have been related to several diseases including cancer, diabetes, neurodegenerative disorders, and immune-mediated diseases (Dueñas-González et al., 2016; Tough et al., 2016; Hwang et al., 2017; Lu et al., 2018). Moreover, epigenetic targets are also attractive for the treatment of parasitic infections (Sacconnay et al., 2014).

In epigenetic drug discovery, epigenetic targets have been classified into three main groups (Ganesan, 2018). "Writers" are enzymes that catalyze the addition of a functional group to a protein or nucleic acid; "readers" are macromolecules that function as recognition units that can distinguish a native macromolecule from the modified one; and "erasers" are enzymes that aid in the removal of chemical modifications introduced by the writers. Thus far, several targets from these three major families have reached different stages of drug discovery, ranging from lead discovery and preclinical development to clinical trials and approval. Currently, there are seven compounds approved for clinical use (Ganesan, 2018).

DNA methyltransferases (DNMTs) are a family of "writer" enzymes responsible for DNA methylation, that is, the addition of a methyl group to carbon atom number five (C5) of cytosine. As surveyed in this work, since DNA methylation has an essential role in cell differentiation and development, alterations in the function of DNMTs have been associated with cancer (Castillo-Aguilera et al., 2017) and other diseases (Lyko, 2017).

Several natural products have been identified as inhibitors of epigenetic targets including DNMTs. Most of these compounds have been uncovered fortuitously. However, there are recent efforts to systematically screen natural products as DNMT inhibitors. The vastness of the chemical space of natural products has led to the hypothesis that many more active compounds could potentially be identified. Indeed, it has been estimated that more than 95% of the biodiversity in nature remains to be explored to identify potential bioactive molecules (Ho et al., 2018).

The aim of this work is to discuss a broad range of computational methods to identify novel inhibitors of DNMTs from natural products. The manuscript also discusses the chemical space of natural products as inhibitors of DNMTs. The manuscript is organized into eight sections. After this introduction, Section 2 briefly reviews the structure of DNMTs including different isoforms. Section 3 covers major aspects of the function of DNMTs including the mechanism of methylation. Section 4 reviews currently known inhibitors of DNMTs from natural sources including food chemicals. Section 5 discusses the epigenetic relevant chemical space of natural products, comparing the chemical space of DNMT inhibitors from natural sources vs. other compounds. Section 6 reviews computational strategies that are used to identify natural compounds as potential epi-hits or epi-leads targeting DNMTs. Sections 7 and 8 present the Conclusion and Perspectives, respectively.

#### SECTION 2: STRUCTURE OF DNMTs

The human genome encodes DNMT1, DNMT2, DNMT3A, DNMT3B, and DNMT3L. While DNMT1, DNMT3A, and DNMT3B have catalytic activity, DNMT2 and DNMT3L do not (Lyko, 2017). DNMT1 is a maintenance methyltransferase, responsible for duplicating the pattern of DNA methylation during replication. DNMT1 is essential for proper mammalian development and it has been proposed as the most interesting target for experimental cancer therapies (Dueñas-González et al., 2016). DNMT3A and DNMT3B are de novo methyltransferases. Human DNMT1 has 1616 amino acids and its structure can be divided into an N-terminal regulatory domain and a C-terminal catalytic domain (Jeltsch, 2002; Jurkowska et al., 2011). The N-terminal domain contains a replication foci-targeting domain, a DNA-binding CXXC domain, and a pair of bromo-adjacent homology domains. The C-terminal catalytic domain has 10 amino acid motifs. The cofactor binding site in the C-terminal catalytic domain comprises motifs I and X, and the substrate binding site comprises motifs IV, VI, and VIII (Lan et al., 2010). The target recognition domain, which is maintained by motif IX and is involved in DNA recognition, is not conserved across the DNMT family. **Figure 1A** shows a three-dimensional (3D) model of DNMT1 (PDB ID: 4WXX) (Zhang et al., 2015). **Figure 1B** shows a schematic diagram of human DNMT1, DNMT2, DNMT3A, DNMT3B, and DNMT3L.

#### Section 2.1: Isoforms

Two isoforms of DNMT3A have been identified, DNMT3A1 and DNMT3A2. At the N-terminal domain both isoforms have a PWWP (Pro-Trp-Trp-Pro) domain and an ADD (ATRX-DNMT3-DNMT3L) domain (Jurkowska et al., 2011). The C-terminal domain is identical in the two isoforms (Choi et al., 2011).

There are more than 30 isoforms of DNMT3B; however, only DNMT3B1 and DNMT3B2 are catalytically active (Ostler et al., 2007). Similar to DNMT3A, DNMT3B1 and DNMT3B2 have PWWP and ADD domains at the N-terminal region (Lyko, 2017). The rest of the isoforms are not catalytically active. Some of these, such as DNMT3B3, DNMT3B4, and DNMT3B7, are overexpressed in many tumor cell lines (Gordon et al., 2013). ΔDNMT3B has seven isoforms and lacks 200 amino acids from the N-terminal region of DNMT3B (Wang et al., 2006). ΔDNMT3B1–4 possess catalytic activity whereas ΔDNMT3B5–7 lack the catalytic domain (Wang et al., 2006). ΔDNMT3B is mainly expressed in non-small cell lung cancer (Wang et al., 2006; Ostler et al., 2007). **Figure 1C** shows the identity matrix of 14 DNMT isoforms. The identity matrix indicates that the amino acid sequence at the catalytic site of the DNMT3A1 and DNMT3A2 isoforms is identical. In the same manner, the amino acid sequences at the C-terminal domain of the catalytically active isoforms DNMT3B1, DNMT3B2, and ΔDNMT3B1–4 are identical. DNMT1, DNMT2, and DNMT3L show a significant difference in the sequence of the catalytic site with respect to the rest of the isoforms. Therefore, it can be anticipated that it is possible to identify or design selective inhibitors for these isoforms.
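An identity matrix such as the one in Figure 1C can be read as pairwise percent identity over aligned catalytic-site residues. The following is a minimal sketch of that calculation, assuming the sequences have already been aligned to equal length. The isoform labels and the short residue strings are hypothetical placeholders, not real DNMT sequences, and the function names are introduced here only for illustration.

```python
from itertools import product


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned columns with identical residues.

    Columns where both sequences have a gap ('-') are skipped;
    a gap against a residue counts as a mismatch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    compared = matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def identity_matrix(aligned: dict[str, str]) -> dict[tuple[str, str], float]:
    """All-against-all percent identity, keyed by (name, name)."""
    return {
        (p, q): percent_identity(aligned[p], aligned[q])
        for p, q in product(aligned, repeat=2)
    }


# Hypothetical toy alignment (placeholder strings, not real sequences).
toy_alignment = {
    "isoform_A1": "PCNDLSIV",
    "isoform_A2": "PCNDLSIV",
    "isoform_X": "PCQGFSGL",
}
matrix = identity_matrix(toy_alignment)
for (p, q), value in matrix.items():
    print(f"{p:>10} vs {q:<10} {value:5.1f}%")
```

In this toy input the first two entries are identical, mirroring the observation that isoforms sharing a catalytic domain give 100% identity, while a divergent sequence gives a much lower value.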

### SECTION 3: FUNCTION AND MECHANISM OF DNMTs

As outlined in Section 2, cytosine-5 DNMTs catalyze the addition of methylation marks to genomic DNA. All DNMTs have a related catalytic mechanism that is characterized by the formation of a covalent adduct intermediate between the enzyme and the substrate base. All DNMTs use S-adenosyl-L-methionine (SAM) as the methyl group donor (Vilkaitis et al., 2001; Du et al., 2016). DNMT forms a complex with DNA, and the cytosine that will be methylated flips out from the DNA (Klimasauskas et al., 1994). A conserved cysteine performs a nucleophilic attack on the six-position of the target cytosine, yielding a covalent intermediate. The five-position of the cytosine is thereby activated and conducts a nucleophilic attack on the cofactor SAM to form the 5-methyl covalent adduct and S-adenosyl-L-homocysteine (SAH). The attack on the six-position is aided by a transient protonation of the cytosine ring at the endocyclic nitrogen atom N3, which can be stabilized by glutamate and arginine residues. The covalent complex between the enzyme and the methylated base is resolved by deprotonation at the five-position to generate the methylated cytosine and the free enzyme.

FIGURE 1 | (A) Three-dimensional model of DNMT1, amino acid residues 351–1600. Figure rendered from Protein Data Bank entry PDB ID: 4WXX. (B) Schematic diagram of the structure of human DNMT1, DNMT2, DNMT3A, DNMT3B, and DNMT3L. (C) Identity matrix of the catalytic site of 14 DNMT isoforms. Note that there is a significant difference in the sequence of DNMT1, DNMT2, and DNMT3L.

#### SECTION 4: KNOWN INHIBITORS OF DNMTs FROM NATURAL SOURCES

Thus far, more than 500 compounds have been tested as inhibitors of DNMTs. Their structural diversity and coverage of chemical space have been analyzed using chemoinformatic methods (Fernandez-de Gortari and Medina-Franco, 2015). The chemical space of DNMT inhibitors has been compared with that of inhibitors of other epigenetic targets (Naveja and Medina-Franco, 2018). Furthermore, the structure-activity relationships (SAR) of DNMT inhibitors have been documented using the concept of activity landscape (Naveja and Medina-Franco, 2015).

DNA methyltransferase inhibitors have been obtained through a broad range of strategies including organic synthesis, virtual screening, and high-throughput screening (Medina-Franco et al., 2015). Organic synthesis has been employed in several instances for lead optimization (Castellano et al., 2008; Kabro et al., 2013; Davide et al., 2016). Natural products and food chemicals have also been a major source of active compounds. Natural products that are known to act as DNMT inhibitors or demethylating agents have been extensively reviewed by Zwergel et al. (2016). These natural products include polyphenols, flavonoids, anthraquinones, and other classes. Some of the first natural products described were curcumin, (-)-epigallocatechin-3-gallate (EGCG), mahanine, genistein, and quercetin. Other natural products that have been described as inhibitors of DNMT or demethylating agents are silibinin, luteolin, kazinol Q, laccaic acid, hypericin, boswellic acid, and lycopene. **Figure 2** shows the chemical structure of representative DNMT inhibitors with emphasis on compounds of natural origin.

The bioactivity profile and potency in enzymatic and/or cell-based assays of these natural products have been discussed in detail by Zwergel et al. (2016). Of note, it would be valuable to screen all of these natural products under the same assay conditions so that their activities can be compared directly. Selectivity has been characterized for only a few natural products, nanaomycin A being one exception (vide infra). Indeed, the IC<sub>50</sub> has been measured in enzymatic assays for only about eight natural products. Although the potency of natural products against DNMTs is not very high in enzymatic assays, e.g., IC<sub>50</sub> between 0.5 and 10 µM, several natural products have shown promising activity in cell-based assays. Notably, natural products have distinct chemical scaffolds that could be used as a starting point in lead optimization efforts. Moreover, quercetin in combination with green tea extract has advanced into phase I clinical trials for the treatment of prostate cancer.

Most of the natural products with demethylating activity or the ability to inhibit DNA methyltransferases in enzymatic assays have been identified fortuitously. However, as discussed in this work, there are efforts toward the identification of bioactive demethylating agents using systematic approaches such as virtual screening. Indeed, the natural product nanaomycin A (**Figure 2**) was identified from a virtual screening campaign initially focused on the identification of inhibitors of DNMT1. This quinone-based antibiotic isolated from Streptomyces showed antiproliferative effects in three human tumor cell lines, HCT116, A549, and HL60, after 72 h of treatment. Moreover, nanaomycin A reduced global methylation levels in all three cell lines when tested at concentrations ranging from 0.5 to 5 µM. Nanaomycin A reactivated the transcription of the RASSF1A tumor suppressor gene, inducing its expression up to 18-fold at 5 µM, higher than the reference drug 5-azacytidine (sixfold at 25 µM). In an enzymatic inhibitory assay, nanaomycin A was selective toward DNMT3B with an IC<sub>50</sub> = 0.50 µM.

### Section 4.1: Natural Products and Food Chemicals

It is remarkable that several of these natural products occur in dietary sources: for example, curcumin; caffeic acid and chlorogenic acid, found in Coffea arabica; genistein, found in soybean; and quercetin, found in fruits, vegetables, and beverages. Indeed, there is a large overlap between the chemical space of food chemicals and that of natural products (Naveja et al., 2018). This has prompted the systematic screening of food chemical databases for potential regulators of epigenetic targets.

#### SECTION 5: EPIGENETIC RELEVANT CHEMICAL SPACE OF NATURAL PRODUCTS: FOCUS ON DNMT INHIBITORS

In drug discovery it is generally accepted that a major benefit of natural products over purely synthetic organic molecules is, overall, the capacity of the former to exert a biological activity and their increased chemical diversity (Ho et al., 2018). The chemical space of natural products is vast and its molecular diversity has been quantified over the past few years (López-Vallejo et al., 2012; Olmedo et al., 2017; Shang et al., 2018). A major contribution to these studies has been the increasing availability of natural product collections in the public domain (Medina-Franco, 2015). Examples of major compound collections are the Traditional Chinese Medicine database (Chen, 2011), NuBBE, a collection of natural products from Brazil (Pilon et al., 2017), AfroDb (Ntie-Kang et al., 2013), and collections available for medium- to high-throughput screening. The importance of natural products in drug discovery has boosted the development of open access applications to mine these rich repositories. A few examples are ChemGPS-NP, TCMAnalyzer, and other resources described elsewhere (Rosen et al., 2009; Chen et al., 2017; Gonzalez-Medina et al., 2017; Liu et al., 2018).

The chemical space of natural products from different sources has been compared to several other collections including the chemical space of drugs approved for clinical use and synthetic compounds (Olmedo et al., 2017; Shang et al., 2018). These studies demonstrate that the chemical space of natural products is vast, that there is a notable overlap with the chemical space of drugs, and that natural products cover novel regions of the chemical space. The overlap with the chemical space of approved drugs is not surprising since a large percentage of drugs are of natural origin. **Figure 3** shows a visual representation of the chemical space of 15 representative DNMT inhibitors from natural sources vs. 4103 compounds from a commercial vendor library of natural products, 206 fungi metabolites, and 6253 marine natural products (Krishna et al., 2017). The visual representation was generated with principal component analysis of six physicochemical properties of pharmaceutical relevance, namely molecular weight (MW), topological polar surface area (TPSA), number of hydrogen bond donors and acceptors (HBD/HBA), number of rotatable bonds (RB), and octanol/water partition coefficient (logP). The first two principal components capture about 90% of the total variance. The visual representation of the chemical space in this figure indicates that marine natural products (data points in blue) cover the broadest area of the chemical space, followed by natural products in the vendor collection (orange) and by fungi metabolites (green). DNMT inhibitors of natural origin (purple) are, in general, inside the subspace of the DNMT1 inhibitors (red). This visualization of the chemical space suggests that more DNMT1 inhibitors can be expected to be identified in the marine and vendor collections, as well as in the data set of fungi metabolites.
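The projection described above (principal component analysis of six physicochemical properties) can be sketched in a few lines. This is a minimal, dependency-free illustration, assuming autoscaled descriptors and power iteration with deflation to obtain the first two components. It is not the authors' workflow, which is not specified here. The compound names and descriptor values are hypothetical placeholders, and in practice a chemoinformatics toolkit would compute the descriptors and a numerical library would perform the decomposition.

```python
import math

# Hypothetical descriptor table (placeholder values, not curated data).
# Column order: MW, TPSA, HBD, HBA, RB, logP
DESCRIPTORS = {
    "cpd_1": [368.4, 93.1, 2, 6, 8, 3.2],
    "cpd_2": [458.4, 197.4, 8, 11, 4, 1.5],
    "cpd_3": [270.2, 90.9, 3, 5, 1, 2.6],
    "cpd_4": [302.2, 131.4, 5, 7, 1, 1.8],
    "cpd_5": [536.9, 0.0, 0, 0, 16, 9.2],
    "cpd_6": [482.4, 155.1, 5, 10, 4, 1.9],
}


def standardize(rows):
    """Scale every column to zero mean and unit (population) variance."""
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    stds = [math.sqrt(sum((x - m) ** 2 for x in col) / n) or 1.0
            for col, m in zip(columns, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def covariance(rows):
    """Covariance matrix of already centered rows."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / n for j in range(d)]
            for i in range(d)]


def matvec(mat, vec):
    return [sum(m * v for m, v in zip(row, vec)) for row in mat]


def top_components(cov, k=2, iters=1000):
    """Leading k eigenpairs via power iteration with deflation."""
    d = len(cov)
    mat = [row[:] for row in cov]
    pairs = []
    for _component in range(k):
        start = [float(i + 1) for i in range(d)]  # arbitrary non-zero start
        scale = math.sqrt(sum(x * x for x in start))
        vec = [x / scale for x in start]
        for _step in range(iters):
            nxt = matvec(mat, vec)
            norm = math.sqrt(sum(x * x for x in nxt))
            if norm == 0.0:
                break
            vec = [x / norm for x in nxt]
        value = sum(v * w for v, w in zip(vec, matvec(mat, vec)))
        pairs.append((value, vec))
        # Remove the component just found before searching for the next one.
        mat = [[mat[i][j] - value * vec[i] * vec[j] for j in range(d)]
               for i in range(d)]
    return pairs


names = list(DESCRIPTORS)
scaled = standardize([DESCRIPTORS[name] for name in names])
cov = covariance(scaled)
components = top_components(cov)
total_variance = sum(cov[i][i] for i in range(len(cov)))
explained = [value / total_variance for value, _vec in components]
scores = {name: [sum(x * w for x, w in zip(row, vec)) for _val, vec in components]
          for name, row in zip(names, scaled)}

for name, (pc1, pc2) in scores.items():
    print(f"{name}: PC1={pc1:+.2f} PC2={pc2:+.2f}")
print("explained variance ratio:", [round(r, 3) for r in explained])
```

Autoscaling matters here because the six properties are on very different numerical scales; without it, molecular weight would dominate the first component. The explained variance printed for this toy table has no bearing on the value reported for the actual data sets.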

### SECTION 6: OPPORTUNITIES FOR SEARCHING FOR NATURAL PRODUCTS AS DNMT INHIBITORS

Most of the DNMT inhibitors from natural sources have been identified by serendipity. As discussed in Section 5, the chemical space of natural products and food chemicals can be explored in a systematic manner using computational approaches. A classical and general approach is virtual screening. The main aim of virtual screening is to filter compound data sets in order to select a reduced number of compounds with increased probability of showing biological activity. Virtual screening has proven to be useful to identify hit compounds (Clark, 2008; Lavecchia and Di Giovanni, 2013). **Table 1** summarizes representative case studies where virtual screening has led to the identification of active compounds with novel scaffolds. In other studies, virtual screening has uncovered potentially active compounds but experimental validation still needs to be conducted. Examples of these studies are further discussed in the following sections.

There are several published studies of virtual screening of natural products to identify DNMT inhibitors and/or demethylating agents. In an early work, Medina-Franco et al. (2011) reported the screening of a lead-like subset of natural products available in ZINC. The authors implemented a multistep virtual screening approach, selecting consensus hits identified by three different docking programs. One computational hit had shown DNMT1 activity in a previous study. Other candidate compounds were identified for later experimental validation (Medina-Franco et al., 2011).
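The idea of consensus hits across docking programs can be illustrated with a small rank-based sketch. It assumes that each program returns one score per compound and that lower (more negative) scores are better, and it keeps only compounds that fall in the top fraction for every program. The selection criteria of the cited study are not given here, so the cutoff, the program names, and all scores below are hypothetical.

```python
def top_fraction(scores: dict[str, float], fraction: float) -> set[str]:
    """IDs of the best-scoring fraction of compounds.

    Assumption: lower (more negative) docking scores are better.
    """
    keep = max(1, round(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get)
    return set(ranked[:keep])


def consensus_hits(runs: list[dict[str, float]], fraction: float = 0.1) -> set[str]:
    """Compounds ranked in the top fraction by every docking program."""
    selections = [top_fraction(run, fraction) for run in runs]
    return set.intersection(*selections) if selections else set()


# Hypothetical scores from three docking programs (arbitrary units).
program_a = {"c1": -9.1, "c2": -8.7, "c3": -6.0, "c4": -5.2, "c5": -8.9, "c6": -4.8}
program_b = {"c1": -7.5, "c2": -5.0, "c3": -7.9, "c4": -4.1, "c5": -7.2, "c6": -3.9}
program_c = {"c1": -10.2, "c2": -9.8, "c3": -6.1, "c4": -9.9, "c5": -10.5, "c6": -5.5}

hits = consensus_hits([program_a, program_b, program_c], fraction=0.5)
print(sorted(hits))
```

Ranks are compared rather than raw scores because scoring functions from different programs are generally not on a common scale.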

In a separate work, Maldonado-Rojas et al. (2015) developed a QSAR model based on linear discriminant analysis to screen 800 natural products. Selected hits were further docked with two crystallographic structures of human DNMT employing two docking programs. Six consensus hits were identified as potential inhibitors (Maldonado-Rojas et al., 2015).

FIGURE 3 | Visual representation of the chemical space, based on principal component analysis of six physicochemical properties of pharmaceutical interest. The percentage of variance is shown on each axis of the plot.

Virtual screening of synthetic libraries has also been reported to identify active compounds with novel scaffolds and suitable for lead optimization. For instance, Chen et al. (2014) reported a docking-based virtual screening of the commercial screening compound library SPECS with 111,121 molecules (after filtering compounds with undesirable physicochemical properties). Results of that work led to the identification of a compound with a novel scaffold with low micromolar IC<sub>50</sub> (10.3 µM). Starting from the computational hit, similarity searching led to the identification of two more potent compounds.

Hassanzadeh et al. (2017) recently reported a pharmacophore-based virtual screening of a compound database with 500 compounds. The pharmacophore was generated using a ligand-based approach by superimposing a group of active nucleoside analogs. Selected hits, which are structurally related to barbituric acid, were docked into the substrate binding site of DNMT1. One compound with a novel chemical scaffold was identified that inhibits DNMT1 in the low micromolar range (IC<sub>50</sub> = 4.1 µM). The compound also showed some selectivity for DNMT1 over DNMT3 enzymes (Hassanzadeh et al., 2017).

Krishna et al. (2017) implemented a virtual screening protocol using several structure- and ligand-based approaches. Methods included a pharmacophore model, a Naïve Bayesian classification model, and ensemble docking. Three out of ten selected compounds from a commercial library of synthetic molecules (e.g., Maybridge with 53,000 small drug-like compounds) showed DNMT1 inhibitory activity at a compound concentration of 20 µM. Two of these molecules showed activity at 1 µM (Krishna et al., 2017).

In addition to the studies discussed above and summarized in **Table 1**, the next subsections discuss other approaches that can be explored. Case studies for each strategy are outlined briefly.

#### Section 6.1: Similarity-Based Virtual Screening of Natural Products

Similarity searching is a commonly used approach for identifying new hit compounds. Major goals are identifying starting points for later optimization or expanding the SAR of analog series. Since similarity searching is fast, it can be used to filter large chemical databases and can be combined with other computational approaches such as molecular docking.

Similarity searching involves two major components: a molecular representation and a similarity coefficient. In practice, one of the most common molecular representations is the two-dimensional (2D) fingerprint. A fingerprint is generally a string of ones and zeros that indicate the presence or absence of molecular features, respectively. In turn, one of the most common similarity coefficients is Tanimoto's (Bajusz et al., 2015). A full discussion of molecular representations and similarity coefficients is published elsewhere (Willett et al., 1998; Maggiora et al., 2014).
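For binary fingerprints, the Tanimoto coefficient is the number of features shared by two molecules divided by the number of features present in either one, T = c / (a + b − c), where a and b are the numbers of "on" bits in each fingerprint and c is the number of "on" bits in common. A minimal sketch of a similarity search built on it follows, assuming fingerprints are stored as sets of "on" bit positions. The bit positions, library names, and threshold are hypothetical, and a real workflow would generate fingerprints with a chemoinformatics toolkit.

```python
def tanimoto(fp_a: set[int], fp_b: set[int]) -> float:
    """Tanimoto coefficient for fingerprints stored as sets of 'on' bits."""
    if not fp_a and not fp_b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def similarity_search(query: set[int], library: dict[str, set[int]],
                      threshold: float = 0.6) -> list[tuple[str, float]]:
    """Library entries at or above the threshold, most similar first."""
    scored = ((name, tanimoto(query, fp)) for name, fp in library.items())
    return sorted((hit for hit in scored if hit[1] >= threshold),
                  key=lambda hit: hit[1], reverse=True)


# Hypothetical fingerprints (bit positions chosen arbitrarily).
query_fp = {1, 4, 7, 9, 12}
library_fps = {
    "lib_1": {1, 4, 7, 9, 13},
    "lib_2": {1, 4, 7, 9, 12, 15},
    "lib_3": {2, 3},
}

results = similarity_search(query_fp, library_fps)
for name, similarity in results:
    print(f"{name}: {similarity:.2f}")
```

Storing only the "on" positions keeps the comparison to a set intersection, which is one reason similarity searching scales well to large databases.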

A novel approach to encode the chemical structures of data sets is the database fingerprint (DBFP) (Fernández-de Gortari et al., 2017). The rationale of DBFP is to account for most of the structural features of an entire data set, encoded as bit positions in a single fingerprint. In principle, virtually any data set can be represented. For instance, it can be a small or large chemical database of screening compounds or a group of active compounds. DBFP can be used in visual representation of the chemical space (Naveja and Medina-Franco, 2018) and similarity searching (Fernández-de Gortari et al., 2017). More recently, this approach was further refined into the so-called statistical-based database fingerprint (SB-DBFP). This approach has the same underlying idea and applications as DBFP. A key improvement is that the most relevant structural features are selected through a statistical comparison between the structural features of a data set of interest and those of a reference database.
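One simple way to picture a database fingerprint is as a single bit string that switches on the features occurring frequently across a data set. The sketch below assumes a frequency cutoff as the selection rule, which is an illustrative choice and not necessarily the rule used in the cited work. The statistical variant would replace that cutoff with a comparison against a reference database. The function name, the cutoff parameter, and the toy fingerprints are hypothetical.

```python
from collections import Counter


def database_fingerprint(fingerprints: list[set[int]],
                         min_fraction: float = 0.5) -> set[int]:
    """Bits that are 'on' in at least min_fraction of the data set.

    Assumption: a plain frequency cutoff decides which bits are kept.
    """
    if not fingerprints:
        return set()
    counts = Counter(bit for fp in fingerprints for bit in fp)
    cutoff = min_fraction * len(fingerprints)
    return {bit for bit, count in counts.items() if count >= cutoff}


# Hypothetical fingerprints for a small set of active compounds.
actives = [{1, 2, 3}, {1, 2, 5}, {1, 4, 5}, {1, 2, 6}]

dbfp = database_fingerprint(actives)
strict_dbfp = database_fingerprint(actives, min_fraction=0.75)
print(sorted(dbfp), sorted(strict_dbfp))
```

The resulting set can be compared with the fingerprint of an individual molecule using any standard similarity coefficient, which is how a single data-set-level representation can serve as a query in similarity searching.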

#### Section 6.2: Pharmacophore-Based

Thus far, several pharmacophore modeling studies have been conducted for inhibitors of DNMT1. Different approaches and input molecules have been used to develop these models. Most of the pharmacophore models have been employed to virtually screen chemical databases and identify novel hit compounds.

Yoo and Medina-Franco (2011) reported one of the first pharmacophore models for inhibitors of DNMT1. The model was generated based on the docking poses of 14 known inhibitors available at that time. The docking was conducted with a homology model of the catalytic domain of DNMT1. Of note, at the time of that study the crystallographic structure of human DNMT1 was not available. Known DNMT inhibitors used to develop the pharmacophore model included the natural products curcumin, parthenolide, EGCG and mahanine (Yoo and Medina-Franco, 2011). A year later, it was reported that trimethylaurintricarboxylic acid (**Figure 2**) showed good agreement with this structure-based pharmacophore model. This compound is structurally related to 5,5′-methylenedisalicylic acid, which inhibits DNMT1 in the low micromolar range (IC<sub>50</sub> = 4.79 µM) (Yoo and Medina-Franco, 2012; Yoo et al., 2012).

More recently, as described in the first part of Section 6, Hassanzadeh et al. (2017) developed a pharmacophore model using a ligand-based approach by 3D superimposition of active nucleoside analogs. That model was used for virtual screening (vide supra). In the same year, with the aid of the Hypogen module of the software DS4.1, Krishna et al. (2017) developed a ligand-based pharmacophore model using the structures of 20 compounds obtained from the literature. The model was validated by classifying an external set of known active and inactive compounds. The validated pharmacophore models were employed as part of a combined strategy to identify novel active molecules (Krishna et al., 2017).

#### Section 6.3: De novo Design

De novo design is a technique currently explored for DNMT inhibitors on a limited basis. Here we briefly outline two promising perspectives related to natural product research.

The first one is a strategy that provides a structural diversity classification of natural product scaffolds through implementation of the generative topographic mapping algorithm, which yields maps often called chemographies. Chemographies allow visualization of the distribution of natural products and their synthetic mimetic compounds in chemical space (Miyao et al., 2015). Since chemographies can be generated from pharmacophoric features and molecular descriptors, it would be feasible to do scaffold hopping based on the structures of natural products (Rodrigues et al., 2016). The second approach is based on scaffold simplification and could be adapted to generate fragment-like natural products focused on DNMT inhibitors. This strategy reduces the molecular framework of natural products through the implementation of a scaffold tree algorithm based on rule-based decomposition of ring systems (Bajorath, 2018).

#### SECTION 7: CONCLUSION

Epigenetic targets are attractive for developing therapeutic strategies. DNA methyltransferases are a major enzyme family and one of the first epigenetic targets studied, in particular for the treatment of cancer. However, over the past few years, more therapeutic opportunities related to the modulation of DNMTs have been emerging. Therefore, there is a growing interest in the scientific community to identify and develop small molecules that can be used as epi-drugs or epi-probes targeting DNMTs. Virtual screening has been used increasingly in recent years to uncover natural products as inhibitors of DNMTs and/or demethylating agents. To this end, well-established structure- and ligand-based virtual screening approaches are being used, such as automated docking, QSAR and similarity searching. Also, novel chemoinformatic approaches are being developed. Of course, the computational methods should be validated with rigorous in vitro and in vivo experiments to support their application.

Natural products have a well-established history as inhibitors of DNMTs and demethylating molecules. However, most of the active natural products have been identified by serendipity. Knowledge of the three-dimensional structures of DNMTs, in combination with more in silico approaches and better computational resources, is boosting the systematic search for bioactive molecules of natural origin. In addition, the increasing availability of natural product databases facilitates the discovery of epi-drugs and epi-probes targeting DNMTs.

#### SECTION 8: PERSPECTIVES

Natural products inside or outside of the traditional drug-like chemical space hold great promise for developing novel compounds with DNMT inhibitory activity or demethylating properties. This is because the traditional chemical space is highly represented by small molecules that over the past few years have not been very successful. A notable example in this direction is the reemergence of peptide-based drug discovery. Indeed, linear and cyclic peptides and peptidomimetics are regaining interest in drug discovery (Fosgerau and Hoffmann, 2015; Henninot et al., 2018).

Another promising and emerging avenue is the modulation of protein–protein interactions (PPIs) (Díaz-Eufracio et al., 2018). DNMTs are known to be involved in several PPIs (Díaz-Eufracio et al., 2018). Modulation of such interactions can be conveniently achieved with natural products because PPIs are "difficult targets" not easily addressed by small molecules from the traditional chemical space (Villoutreix et al., 2014). In other words, since PPIs have unique features, they can be approached with novel chemical libraries. Natural product collections are excellent candidates for this purpose.

We foresee increased hit and lead identification efforts based on natural products that combine approaches such as high-throughput screening, structure- and ligand-based in silico screening, structure-based optimization, similarity searching, and scaffold hopping (Schneider et al., 1999). In the search for novel and more potent compounds, it is crucial to consider potential toxicity, since toxicity issues play a major part in the failure of drug discovery projects.

### DISCLAIMER

A similar version of this manuscript was deposited in a Pre-Print server on July 6, 2018. The reference is: Saldívar-González, F. I.; Gómez-García, A.; Sánchez-Cruz, N.; Ruiz-Rios, J.; Pilón-Jiménez, B. A.; Medina-Franco, J. L. Computational Approaches to Identify Natural Products as Inhibitors of DNA Methyltransferases. Preprints 2018, 2018070116 (doi: 10.20944/preprints201807.0116.v1).

### AUTHOR CONTRIBUTIONS

All authors contributed to methodology and formal analysis. FS-G, JR-R, and BP-J contributed to data curation. AG-G, FS-G, DC-PdL, and JM-F contributed to writing (original draft preparation). AG-G, FS-G, NS-C, and JM-F contributed to writing (review and editing). AG-G, FS-G, and BP-J contributed to visualization. JM-F contributed to project administration.

### FUNDING

This research was funded by Consejo Nacional de Ciencia y Tecnologia (CONACYT, Mexico) grant number 282785, the Programa de Apoyo a Proyectos de Investigación e Innovación Tecnológica (PAPIIT) grant IA203718, and by Programa de Apoyo a Proyectos para la Innovación y Mejoramiento de la Enseñanza (PAPIME) grant PE200118, UNAM.

#### ACKNOWLEDGMENTS


FS-G, AG-G, and NS-C acknowledge Consejo Nacional de Ciencia y Tecnologia (CONACyT, Mexico) for the graduate scholarships. DC-PdL and JR-R thank the Programa de Apoyo a Proyectos para la Innovación y Mejoramiento de la Enseñanza (PAPIME) for the undergraduate scholarship. The authors also thank Chanachai Sae-Lee for providing the sequences used in **Figure 1**. They also acknowledge all current and past members of the DIFACQUIM research group for their comments and discussions that enriched this manuscript.

#### REFERENCES



Angew. Chem. Int. Ed. 38, 2894–2896. doi: 10.1002/(SICI)1521-3773(19991004)38:19<2894::AID-ANIE2894>3.0.CO;2-F


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Saldívar-González, Gómez-García, Chávez-Ponce de León, Sánchez-Cruz, Ruiz-Rios, Pilón-Jiménez and Medina-Franco. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Sulfotransferase and Heparanase: Remodeling Engines in Promoting Virus Infection and Disease Development

Dominik D. Kaltenbach<sup>1</sup>, Dinesh Jaishankar<sup>2</sup>, Meng Hao<sup>3</sup>, Jacob C. Beer<sup>4</sup>, Michael V. Volin<sup>5</sup>, Umesh R. Desai<sup>6</sup> and Vaibhav Tiwari<sup>5</sup>\*

<sup>1</sup> Department of Biomedical Sciences, College of Graduate Studies, Midwestern University, Downers Grove, IL, United States, <sup>2</sup> Department of Ophthalmology & Visual Sciences, University of Illinois at Chicago, Chicago, IL, United States, <sup>3</sup> Chicago College of Pharmacy, Midwestern University, Downers Grove, IL, United States, <sup>4</sup> Chicago College of Osteopathic Medicine, Midwestern University, Downers Grove, IL, United States, <sup>5</sup> Department of Microbiology & Immunology, College of Graduate Studies, Midwestern University, Downers Grove, IL, United States, <sup>6</sup> Department of Medicinal Chemistry and Institute for Structural Biology, Drug Discovery and Development, Virginia Commonwealth University, Richmond, VA, United States

#### Edited by:

Chandravanu Dash, Meharry Medical College, United States

#### Reviewed by:

Yu Bai, Novo Nordisk, China

Chunliang Li, St. Jude Children's Research Hospital, United States

\*Correspondence: Vaibhav Tiwari, vtiwar@midwestern.edu

#### Specialty section:

This article was submitted to Pharmacogenetics and Pharmacogenomics, a section of the journal Frontiers in Pharmacology

Received: 10 September 2018 Accepted: 29 October 2018 Published: 22 November 2018

#### Citation:

Kaltenbach DD, Jaishankar D, Hao M, Beer JC, Volin MV, Desai UR and Tiwari V (2018) Sulfotransferase and Heparanase: Remodeling Engines in Promoting Virus Infection and Disease Development. Front. Pharmacol. 9:1315. doi: 10.3389/fphar.2018.01315

An extraordinary binding site generated in heparan sulfate (HS) structures during its biosynthesis provides a unique opportunity to interact with multiple protein ligands, including viral proteins, and therefore adds tremendous value to this master molecule. An example of such a moiety is sulfation at the C3 position of glucosamine residues in the HS chain by 3-O sulfotransferase (3-OST) enzymes, which generates a unique virus-cell fusion receptor during herpes simplex virus (HSV) entry and spread. Emerging evidence now suggests that unique patterns of HS sulfation assist multiple viruses in invading host cells at various steps of their life cycles. In addition, sulfated HS structures are known to assist in invading host defense mechanisms and initiating multiple inflammatory processes, a critical event in disease development. All these processes are detrimental to the host and therefore raise the question of how HS sulfation is regulated. Epigenetic modulations have been implicated in these reactions during HSV infection as well as in the HS-modifying sulfotransferases, and are therefore a critical component in answering it. Interestingly, heparanase (HPSE) activity is upregulated during virus infection and in multiple other diseases, assisting virus replication and promoting cell and tissue damage. These phenomena suggest that sulfotransferases and HPSE are key players in extracellular matrix remodeling and possibly generate unique signatures in a given disease. Therefore, identifying the epigenetic regulation of OST genes and HPSE that results in altered yet specific sulfation patterns in the HS chain during virus infection will be a significant step toward developing potential diagnostic markers and designing novel therapies.

Keywords: heparan sulfate, herpes simplex virus, heparanase, sulfotransferases, heparan mimetic, viral entry

#### INTRODUCTION

Heparan sulfate (HS) is a linear polysaccharide ubiquitously present in all tissues. HS is attached to cell surface or extracellular matrix proteins, where it exists as heparan sulfate proteoglycans (HSPGs). Because of their unique structural capability, HSPGs serve as "heavy duty engines" by interacting with a diverse array of protein ligands to participate in virtually any given biological reaction (Yanagishita and Hascall, 1992; Sasisekharan and Venkataraman, 2000; Esko and Lindahl, 2001; Varki, 2002; Bishop et al., 2007). HS undergoes numerous modifications during its biosynthesis. One family of enzymes, the sulfotransferases, catalyzes sulfation at specific sites on HS and is recognized to play important roles ranging from cellular processes to microbial pathogenesis and the associated inflammation. The infectious disease literature documents two types of HS that contribute to viral infection. In the first category, the plain-type or unmodified form of HS contains specific types or arrangements of sulfated residues, which assist multiple viruses in cell attachment or binding (WuDunn and Spear, 1989; Feldman et al., 2000; Shukla and Spear, 2001; Jiang et al., 2012; Richards et al., 2013; Tan et al., 2013). This is considered the first step toward establishing a successful infection. In the second category, a highly specialized form of HS containing unique patterns of sulfation generated by sulfotransferases is utilized by viruses to further facilitate infectious processes such as virus-cell fusion, internalization, and trafficking (Shukla et al., 1999; Zautner et al., 2006; Borst et al., 2013; Connell and Lortat-Jacob, 2013; Makkonen et al., 2013).

### SYNTHESIS OF HEPARAN SULFATE (HS)

HS synthesis is a dynamic process that begins with the addition of a tetrasaccharide linker region (GlcA-Gal-Gal-Xyl) on serine residues of the syndecans, the protein core (Esko and Lindahl, 2001). Following the initial addition of an N-acetylated GlcN (GlcNAc) residue to the beginning of the HS chain, polymerization continues with the addition of alternating GlcA and GlcNAc residues. Polymeric chain extension is accompanied by a series of modifications in which multiple enzymes participate in sequential order. These include glycosyltransferases, an epimerase, and sulfotransferases (de Agostini et al., 2008). During the initial process, N-deacetylation and N-sulfation of GlcNAc occur, converting these residues into N-sulfo-GlcN (GlcNS). Next, C5 of GlcA is epimerized to IdoA, followed by O-sulfation, which is performed by 2-O-sulfotransferase (2-OST), 6-OST, and 3-OST in the following order: first, 2-O-sulfation of IdoA and GlcA; then 6-O-sulfation of GlcNAc and GlcNS units; and lastly 3-O-sulfation of GlcN residues (Esko and Lindahl, 2001). The final step, 3-O-sulfation, is a rare modification that involves a limited number of chains (Marcum et al., 1986; Zhang et al., 2001; Thacker et al., 2014, 2016). Various arrangements of these modified residues result in a heterogeneous HS structure with distinct binding motifs on the HS chains, which are thought to regulate its functional specificity in distinct biological processes within the host. Concurrently, modifications of HS residues allow for distinct functions in pathogen–host interactions, highlighting the impact of HS in disease development. For instance, the importance of HS signaling in cancer biology, tumor development, metastasis, and differentiation is emerging (**Figure 1**) (Blackhall et al., 2001; Raman and Kuberan, 2010). Furthermore, a vast spectrum of pathogens, including viruses, bacteria, parasites, and fungi, exploit the various moieties present on HS to signal and facilitate microbial pathogenesis (Sinnis et al., 2007; Gardner et al., 2011; Tiwari et al., 2012; Park and Shukla, 2013; Jinno and Park, 2015; Lin et al., 2015; Raman et al., 2016). HS promotes both initial microbial attachment and the associated inflammatory response, which may result in damaging outcomes for the host (Urbinati et al., 2009; Xu et al., 2011; Axelsson et al., 2012; Knelson et al., 2014; Yun et al., 2014; Kumar et al., 2015; Dyer et al., 2016).
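The sequential order of modifications described above can be made concrete with a small toy model. The sketch below is an illustration only: the `Disaccharide` class, the step names, and the `modify` helper are hypothetical, and a single repeat unit stands in for a whole chain. Whichever steps are enabled are always applied in the order given in the text, so different subsets of steps yield differently modified units, a crude picture of how heterogeneity arises.

```python
from dataclasses import dataclass, field


@dataclass
class Disaccharide:
    """One uronic acid-glucosamine repeat of a growing HS chain (toy model)."""
    uronic: str = "GlcA"
    glucosamine: str = "GlcNAc"
    o_sulfates: list = field(default_factory=list)


def n_deacetylate_n_sulfate(unit):
    unit.glucosamine = "GlcNS"      # GlcNAc -> N-sulfo-GlcN


def epimerize_c5(unit):
    unit.uronic = "IdoA"            # GlcA -> IdoA


def sulfate_2o(unit):
    unit.o_sulfates.append("2-O")


def sulfate_6o(unit):
    unit.o_sulfates.append("6-O")


def sulfate_3o(unit):
    unit.o_sulfates.append("3-O")


# Order taken from the text: N-sulfation, epimerization, then 2-O, 6-O, 3-O.
ORDERED_STEPS = [
    ("N-deacetylation/N-sulfation", n_deacetylate_n_sulfate),
    ("C5 epimerization", epimerize_c5),
    ("2-O-sulfation", sulfate_2o),
    ("6-O-sulfation", sulfate_6o),
    ("3-O-sulfation", sulfate_3o),
]


def modify(unit, enabled_steps):
    """Apply the enabled steps to a unit, always in the described order."""
    for name, step in ORDERED_STEPS:
        if name in enabled_steps:
            step(unit)
    return unit


fully_modified = modify(Disaccharide(), {name for name, _ in ORDERED_STEPS})
partly_modified = modify(Disaccharide(),
                         {"N-deacetylation/N-sulfation", "6-O-sulfation"})
print(fully_modified)
print(partly_modified)
```

The model ignores substrate specificity (for example, which residue each sulfate is placed on) and treats every step as all-or-nothing, which real biosynthesis does not.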

#### SULFOTRANSFERASE GENERATED HEPARAN SULFATE

Currently, only three types of sulfotransferases, the 2-O, 3-O, and 6-O sulfotransferases (2-OST, 3-OST, and 6-OST), are known to generate 2-O, 3-O, and 6-O sulfated heparan sulfate (2-OS HS, 3-OS HS, and 6-OS HS), respectively. These sulfated forms of HS generate high-affinity receptors that allow viral and bacterial pathogens to target their natural cell types (Clement et al., 2006; Tiwari et al., 2006, 2007, 2011; Zautner et al., 2006; Trottein et al., 2009; Kobayashi et al., 2012; Hayashida et al., 2015; García et al., 2016) (**Table 1** and **Figure 2**). Gene expression profiles have shown enhanced expression of sulfotransferases in human monocytes and dendritic cells (DCs) during differentiation, suggesting a role in responses to pathogens and stress and during aging (Taylor and Gallo, 2006; Sikora et al., 2016). Since a large array of inflammatory and immunoregulatory mediators are known to interact with cell surface HS to target subsets of T lymphocytes and monocytes/macrophages (Zhou et al., 2015), dissecting such interactions could provide a valuable tool for understanding inflammatory and immune responses. Moreover, the impact of specific ligands on sulfotransferase enzymes and their turnover during pathogen invasion requires more detailed investigation. In fact, a recent study offers oligosaccharides as a means to inhibit HS-sulfotransferases, adding new tools to probe the structural selectivity of HS-binding proteins (Raman et al., 2013).

Interestingly, these modified versions of HS are also increasingly recognized as potential markers for invasive diseases such as cancers and tumors. It has been proposed that 3-OS HS and 6-OS HS help cancer cells break down the extracellular matrix (ECM) and hijack normal signaling pathways to facilitate cell spread (Brennan et al., 2016). In addition, HS structures containing unique subsets of sulfated domains are also proving to be facilitators of inflammatory responses, making this specialized HS a high-value molecule to be exploited for future drug development. In fact, engineered structure-specific HS analogs or HS glycomimetics designed to modulate a specific function, such as enhancing protein interaction or modulating the inflammatory process for overall higher efficacy and lower toxicity, are now routinely synthesized as promising new drugs for use against cancer and multiple other diseases (Esko and Selleck, 2002; Zhou et al., 2015; Afratis et al., 2017; Sanderson et al., 2017).

The unique chain that results from modifying HS enables it to bind specifically to its ligand to modulate angiogenesis, axonal sprouting, and insulin secretion (Liu and Thorp, 2002; Takahashi et al., 2012; van Wijk and van Kuppevelt, 2014; Zhang et al., 2015a). The many and diverse effects of these drugs can result in better control of many different disease processes with minimal side effects and ultimately bring about better patient outcomes.

### Sulfotransferase Generated Heparan Sulfate in HSV Entry

Modification also permits viral membrane fusion and penetration (Shukla et al., 1999; Xia et al., 2002; Chen et al., 2008). Unique sites within HS chains generated by 3-OSTs produce 3-OS HS, the newest family of HSV-1 glycoprotein D (gD) receptors (Liu et al., 2002; Borst et al., 2013). The 3-OSTs act to modify HS further along in its biosynthesis, and each isoform recognizes glucosamine residues as substrates within regions of the HS chain that carry specific and unique prior modifications, including epimerization and sulfation at other positions (Liu et al., 1999; Shworak et al., 1999). Thus, each 3-OST is capable of producing potentially unique protein-binding sites within HS. To date, seven different isoforms of human 3-OSTs (3-OST-1, 3-OST-2, 3-OST-3A, 3-OST-3B, 3-OST-4, 3-OST-5, 3-OST-6, and 3-OST-7) are known. Although all the isoforms modify HS to generate 3-OS HS, it remains a mystery why seven isoforms are present. All forms of 3-OST except 3-OST-1 produce HSV-1 entry receptors (Liu et al., 1999; Shukla et al., 1999). Interestingly, the zebrafish embryo encodes 3-OST isoforms that are functionally similar in recognizing HSV-1 gD to promote viral entry. The gD receptors generated by other isoforms of 3-OST are very similar in structure, but likely not identical (Liu et al., 1999; Shukla et al., 1999; Chen et al., 2008; Borst et al., 2013). The isoform 3-OST-1 creates binding sites for antithrombin (Shworak et al., 1999; Mochizuki et al., 2003) but is unable to produce receptors for HSV-1 gD (Borst et al., 2013). Additionally, one or more isoforms of 3-OSTs have been found to be expressed in both human and mouse tissues relevant to HSV-1 infection that have been examined thus far (Liu et al., 1999; Mochizuki et al., 2003; Lawrence et al., 2007), suggesting that 3-OSTs play a significant role in disease development. The general distinction between plain-type HS and HS chains containing 3-O sulfation regarding their usage in viral entry is well documented. Although the vast jungle of HS on the host cell assists in the capture and concentration of pathogens at a given site, it may not guarantee successful infection, since HS mainly serves as an attachment receptor. Overall, interactions between HS and pathogens are considered to be non-specific and may influence pathogenesis indirectly. On the other hand, target cells expressing 3-O or 6-O sulfated moieties in HS chains may guarantee that certain pathogens will be able to cross the host cell membrane, and therefore the presence of the latter form of specialized HS may serve as a "Wi-Fi" zone through which the targeting pathogens easily invade the host cell. That is, pathogens in the vicinity of cells expressing modified HS can connect to "the internet" (the host organism) by using modified HS as a signal and a means to connect to host cells. To further the analogy, pathogens may even be able to detect modified HSPGs, similar to how one can scan for Wi-Fi signals with a computer or a cellphone. Nonetheless, it remains to be investigated whether such "Wi-Fi" zones can be distinctly equipped with a particular signature of the 3-O-sulfated isoform that influences or contributes to establishing HSV tropism. Moreover, HSV-1 gD-type 3-OS HS specific to neurons or to ocular cells is well documented (Clement et al., 2006; Deligny et al., 2010).

TABLE 1 | O-Sulfation in heparan sulfate and its implication in pathology.

Several lines of evidence support the role of HS in viral infection, as indicated in **Figure 3**. For example, enzymatic removal of specific regions of the HS chain in target cells with heparanase significantly impairs viral infection (Tiwari et al., 2004). Likewise, soluble heparin or HS-mimetic analogs inhibit viral infection through direct competition for cell surface HS (Gangji et al., 2018). Furthermore, a higher degree of viral entry was observed in cells that expressed HS than in mutant cells that lacked or weakly expressed HS. Lastly, the expression in target cells of viral envelope proteins that interact with HS resulted in resistance to viral entry by sequestering cell surface HS (Tiwari et al., 2006). In the case of HSV-1, expression in target cells of viral envelope glycoprotein B (gB), which is known to bind HS, significantly impairs viral entry, a process also known as gB-mediated interference. Notably, our initial work showed that viruses do in fact utilize HS beyond cell attachment and binding. Using cDNA library screening, direct evidence was presented that a modified version of HS, 3-OS HS generated by the 3-OST-3 isoform, acts as a receptor that promotes virus-cell membrane fusion (Shukla et al., 1999). Expression of the HS-modifying 3-OST-3 enzymes confers susceptibility to HSV-1 infection on previously HSV-1-resistant Chinese hamster ovary (CHO-K1) cells (Shukla et al., 1999). Additionally, an event with the protein receptors (nectin and HVEM) was observed that supports HSV-1 entry (Shukla and Spear, 2001). After identifying the significance of 3-OST-3 in HSV-1 entry, our group cloned an additional six individual human and zebrafish isoforms of 3-OST enzymes and characterized them for HSV-1 entry. Given that these enzymes are tightly regulated and are expressed in specific cells and tissues, it is tempting to speculate that 3-OST isoforms might be prime candidates for localizing HSV infection to brain or ocular cells and tissues, essentially contributing to viral tropism. In this regard, our previous work demonstrated the significance of one 3-OST-3 isoform type in primary cultures of human corneal stroma (CF) derived from eye donors, a natural target cell for HSV infection. Using siRNA approaches together with heparanase treatment of cultured CF, we found a significant reduction in HSV-1 entry (Tiwari et al., 2006). Additional support for HSV use of the 3-OS HS receptor in CF was gained with a phage display-derived peptide targeting the 3-OS HS receptor, which resulted in a compelling inhibition of HSV-1 entry (Tiwari et al., 2011). To further strengthen our work, we used anti-3-OS HS peptides to successfully prevent HSV-1 infection in a mouse corneal model (Tiwari et al., 2011). With these encouraging results, we tested the clinical usage of anti-3-OS HS peptides by loading them on a commercially available contact lens as a delivery vehicle for extended release, as a potential and effective way to control corneal herpes infection (Jaishankar et al., 2016). Interestingly, a laboratory-produced 3-O-sulfonated HS octasaccharide was demonstrated to inhibit interactions between HSV-1 and the host cell, suggesting the application of HS-derived molecules as potential therapeutic tools against viral pathogens.

FIGURE 2 | The O-sulfation in HS chain is known to generate binding sites for multiple viral proteins to facilitate viral entry. For instance, sulfation at 3-O position on glucosamine residues generates a HSV-1 glycoprotein D (gD) receptor for entry. In addition, 3-O sulfation in HS facilitates the spread of cytomegalovirus (CMV). Similarly, 6-O sulfation in HS chain promotes hepatitis C and coxsackievirus entry. Interestingly, it has been demonstrated that HIV glycoprotein gp120 interacts with 2-O sulfated HS during cell entry.

In the case of 3-OS HS, the ligands documented to interact are HSV-1 glycoprotein D (gD), antithrombin III, fibroblast growth factor (FGF), Cyclophilin-B, and Neuropilin-1 (Liu et al., 1999; Datta et al., 2013; Baldwin et al., 2015; Zhang et al., 2015a). Emerging evidence suggests that multiple viruses have evolved to interact with O-sulfated domains in HS to promote infection. For instance, 3-OS HS, although first recognized as an HSV-1 gD receptor, is now reported to be utilized by other members of the herpesvirus family as well. A recent study by Baldwin et al. (2015) demonstrated that human cytomegalovirus interacts with 3-OS HS generated by 3-OST-3 isoforms to mediate virus spread (Zhang et al., 2010). This result emphasizes the role of 3-OS HS in co-infection or superinfection models, especially since herpesviruses are opportunistic by nature. On the other hand, it is also possible that viruses maintain a hidden and independent life in the infected cell by down-regulating the expression of 3-OS HS, creating a supportive habitat for their own advantage. This question remains unresolved and needs to be addressed further. Similarly, although 3-OS HS is exploited by viruses as an entry receptor, its primary role is as a signal transduction molecule. Therefore, it is possible that viruses sense the 3-OS HS environment but have no control over its turnover. Clearly, all of the above questions are vital, as HS/3-OS HS participates widely in various steps of the virus life cycle.

### Role of Sulfotransferase Generated Heparan Sulfate in Other Viruses

Interestingly, among non-herpes viruses, expression of human 3-OST-3A may suppress hepatitis B virus replication in hepatocytes (Hallak et al., 2000), while 6-O sulfation in the HS chain potentially supports entry of human cytomegalovirus (Zautner et al., 2006). In addition, 6-OS HS has been shown to mediate internalization of coxsackievirus B3 (Connell and Lortat-Jacob, 2013). Interestingly, 2-O sulfation is recognized by HIV glycoprotein gp120 during cell entry (**Figure 2**) (Makkonen et al., 2013; Matos et al., 2014). Conversely, N-sulfation of heparan is necessary for inhibition of respiratory syncytial virus (RSV) infection (Fechtner et al., 2013) and is required for CHIKV infection (Tanaka et al., 2017). In the case of hepatitis C virus (HCV), the envelope glycoproteins E1/E2 require both N- and 6-O-sulfate groups to interact with HS (Trottein et al., 2009).

Besides viruses, other pathogens can also interact with modified HS. For instance, OmcB, an outer membrane protein of Chlamydia trachomatis, interacts with 6-OS HS during infection (Tanaka et al., 2012). HS/3-OS HS thus becomes an ideal target for both viral and bacterial infection, a possibility that exists in many sexually transmitted infections. In the case of human T-cell leukemia virus type 1 (HTLV-1), it has been demonstrated that the combination of the number and the length of HS chains containing heparin-like regions is a critical factor in determining the cell tropism of HTLV-1 (Monneau et al., 2016). It is hypothesized that shorter HS chains may induce receptor complexes (HTLV-1 Env-HS-NRP-1-GLUT1) more efficiently than their longer counterparts by attracting HTLV-1 particles to the target cell surface. The findings of this study may provide the foundation for developing effective therapies against HTLV-1 infection and aid in developing a metric for assessing the prognosis of HTLV-1-induced diseases. Whether the length of the HS chain influences virus infection in general still requires investigation. Furthermore, whether virus infection alters the length of HS or affects HS polymerization is also unknown. However, recent progress in glycobiology has shown that glycosaminoglycan structural properties such as length, sulfation, and epimerization patterns are cell, tissue, and developmental stage specific (Patel et al., 1993).

It is clear that a single type of sulfation in an HS chain can serve as an attachment point for multiple types of viruses. For instance, 3-O sulfation in HS generated by 3-OST-3 is utilized by both HSV and cytomegalovirus (CMV) to trigger virus entry. Moreover, an individual virus can show a preference for two entirely different sulfation patterns in HS chains (Shukla et al., 1999; Baldwin et al., 2015). According to recent studies, CMV can use both 6-OS HS and 3-OS HS for infectious entry (Borst et al., 2013; Baldwin et al., 2015). For many viruses, however, the preferred sulfation pattern is still unknown.
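The many-to-many relationship described in this section (one sulfation type serving several viruses, and one virus using several sulfation types) can be captured as a small lookup table. The sketch below is illustrative only: the dictionary holds just the pairings stated in the surrounding text and the Figure 2 caption, it is not exhaustive, and the names `VIRUS_TO_SULFATION` and `invert` are hypothetical.

```python
from collections import defaultdict

# Pairings transcribed from the surrounding text; not an exhaustive list.
VIRUS_TO_SULFATION = {
    "HSV-1": {"3-O"},               # gD receptor generated by 3-OSTs
    "CMV": {"3-O", "6-O"},          # reported to use both patterns
    "Coxsackievirus B3": {"6-O"},   # internalization
    "HIV": {"2-O"},                 # recognized by gp120
    "HCV": {"N", "6-O"},            # E1/E2 need N- and 6-O-sulfate groups
}


def invert(mapping):
    """Turn virus -> sulfation types into sulfation type -> viruses."""
    inverted = defaultdict(set)
    for virus, patterns in mapping.items():
        for pattern in patterns:
            inverted[pattern].add(virus)
    return dict(inverted)


SULFATION_TO_VIRUS = invert(VIRUS_TO_SULFATION)

# Viruses that the text associates with more than one sulfation pattern.
multi_pattern = sorted(v for v, p in VIRUS_TO_SULFATION.items() if len(p) > 1)

for pattern in sorted(SULFATION_TO_VIRUS):
    print(pattern, sorted(SULFATION_TO_VIRUS[pattern]))
print("more than one pattern:", multi_pattern)
```

Keeping the table in one direction and deriving the inverse avoids the two views drifting apart when a new pairing is added.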

A growing body of evidence suggests that HSPGs play a significant role in the HIV life cycle and disease development. Initially, it was demonstrated that the virus enters multiple T cells by using the HS chains of HSPGs (Roderiquez et al., 1995). Concomitantly, the presence of heparanase has been shown to competitively inhibit HIV entry into target cells (Roderiquez et al., 1995; Ohshiro et al., 1996). In addition, multiple studies demonstrated that the presence of HSPGs in various cell types enhances HIV entry (Ibrahim et al., 1999; Saphire et al., 2001; Guibinga et al., 2002; Zhang et al., 2002; Bobardt et al., 2004). For instance, primary human endothelial cells, which are rich in HSPGs, are extremely efficient in capturing viruses on the cell surface (Argyris et al., 2003). The overexpression of HS in primary endothelial cells near the blood-brain barrier and in the microvasculature, together with their ability to capture HIV-1, has been proposed as a novel mechanism that facilitates invasion of the brain by HIV-1 (Argyris et al., 2003). Furthermore, prime target cells such as DCs, macrophages, and epithelial and endothelial cells provide docking sites for HIV by expressing HS (Guibinga et al., 2002; Wu et al., 2003; Gallay, 2004; de Witte et al., 2007; Ceballos et al., 2009).

Additionally, HS-expressing spermatozoa play a crucial role in virus transmission to DCs, macrophages, and T cells during HIV transmission (de Parseval et al., 2005). In fact, four HS-binding domains have been identified in the V2 and V3 loops, the C-terminal domain, and the CD4-induced bridging sheet of HIV gp120 (Crublet et al., 2008; Herrera et al., 2016). In contrast, one recent study reports that HS combines with the innate protein human beta-defensin (hBD) and reduces HIV trans-epithelial transmission through inactivation of the virus. Using an adult oral epithelial cell that expresses hBD, the authors showed that HSPG-mediated internalization of hBD and HIV gp120 into the endosome results in oligomerization and reduced infectivity of HIV (Vivès et al., 2005).

Remarkably, it has been demonstrated that HS not only facilitates viral attachment to host cells via HIV gp120 (Cladera et al., 2001) but also mediates viral-host fusion by interacting with the fusion domain of gp41 (Teixé et al., 2008). On permissive cells, HIV binding to HS is thought to increase infectivity by concentrating viral particles at the cell surface (the cis process). Additionally, for some cell types such as macrophages, HS may compensate for low CD4 expression (Guibinga et al., 2002). It has also been shown that the expression of HSPGs on the CD4<sup>+</sup> cell surface is regulated by virus infection and immune activation (Ibrahim et al., 1999; Alvarez Losada et al., 2002). HS may be directly involved in the infection of CD4<sup>+</sup> and/or CD4<sup>−</sup> cells, making it an ideal target for new anti-HIV therapies (Wu et al., 2003; Somiya et al., 2016). Besides interacting with gp120, HS also co-localizes with matrix protein p17 on activated human CD4<sup>+</sup> T cells. Additionally, single-particle tracking confirms that multivalent Tat protein transduction results in a domain-induced heparan sulfate proteoglycan cross-linkage, which activates Rac1 for virus internalization. The interaction between HIV-1 Tat and HSPGs has been proposed as a novel mechanism of lymphocyte adhesion and migration across the endothelium (Xu et al., 2011).

Besides its role in viral entry, HS plays other extensively documented roles in the viral life cycle. Using recombinant hepatitis B virus (HBV) surface antigen L protein particles (bio-nanocapsules, BNCs), an HS-dependent mechanism of HBV uptake has been proposed (Dong et al., 2013). To study hepatitis C virus (HCV), a gene silencing approach was used against enzymes involved in HS biosynthesis to investigate the structural determinants required by HCV during infection. The study found that N- and 6-O-sulfation, but not 2-O-sulfation, are required for HCV infection. It is also known that the minimum HS oligosaccharide length required for HCV infection is a decasaccharide. Overall, these discoveries highlight that HCV uses a specific HS structure to initiate its infectious cycle. Concurrently, another study used a different approach, cleaving specific structures in HS with heparanase I, II, and III enzymes, and reached the same conclusion. The findings indicate that the heterogeneity of HS, especially regions rich in N-sulfation and iduronic acid, contributes heavily to respiratory syncytial virus (RSV) infection in some mammalian cells (Cooper et al., 2005).
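The structural requirements reported for HCV in this paragraph (N- and 6-O-sulfation required, 2-O-sulfation not required, and a minimum length of a decasaccharide) amount to a simple rule, sketched below. This is a deliberate simplification for illustration: real binding is graded rather than yes/no, and the function name, the set-based encoding of sulfation, and the residue count argument are all assumptions of the sketch.

```python
# Requirements as stated in the text for HCV; encoded as assumptions.
REQUIRED_SULFATION = frozenset({"N", "6-O"})
MIN_RESIDUES = 10  # a decasaccharide is ten monosaccharide residues


def meets_reported_hcv_requirements(sulfation, n_residues):
    """Check an HS oligosaccharide against the requirements reported for HCV.

    `sulfation` is a collection of sulfation types present on the
    oligosaccharide; 2-O-sulfation is neither required nor penalized.
    """
    has_required = REQUIRED_SULFATION <= set(sulfation)
    long_enough = n_residues >= MIN_RESIDUES
    return has_required and long_enough


print(meets_reported_hcv_requirements({"N", "6-O"}, 10))
print(meets_reported_hcv_requirements({"N", "2-O"}, 12))
print(meets_reported_hcv_requirements({"N", "6-O"}, 8))
```

Only the first call satisfies both conditions; the second lacks 6-O-sulfation and the third is shorter than a decasaccharide.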

Besides the role of HS in viral entry, it has been shown that HS is also involved in the induction of cytokines by the HBV capsid in macrophages via TLR2 signaling (Tsidulko et al., 2015). A recent study has proposed that, among various bovine tissues used for HS preparation, only liver-derived HS can strongly bind to both E1 and E2 of HCV. This finding is significant because HCV displays liver tropism and liver HS contains highly sulfated structures compared to HS from other tissues. It suggests that HS proteoglycans on liver cell surfaces are among the molecules that define the liver-specific tissue tropism of HCV (Trottein et al., 2009). Later studies by Lawrence et al. (2007) and Deligny et al. (2010) support the notion that HS structures vary depending on the cell and tissue, and that this can influence virus tropism. For instance, the HS-modifying enzymes 3-OST-2 and 3-OST-4 are the predominant forms expressed in the neurons of the trigeminal ganglia, and HS generated by 3-OST-2 and 3-OST-4 has been shown to be the principal gD receptor type for HSV-1. Ultimately, all of these subtleties reinforce the presence of a selective interaction between HS and viruses. Although multiple viruses use HS proteoglycans, which are ubiquitously expressed GAGs, for binding and/or attachment, it remains to be determined whether virus interactions with HS are random or require structural specificity.

A unique pattern of HS expression in Epstein-Barr virus (EBV)-infected cells has been established. Lymphoblastoid cell lines infected with EBV have shown specific proteoglycan expression, with down-regulation of CD44 and ECM components and up-regulation of serglycin and perlecan/HSPG2. Nevertheless, in Burkitt's lymphoma (BL) cells, serglycin was shown to be down-regulated. Fundamentally, the biosynthetic machinery for HS construction and modification was active in all cell lines, with the caveat that this machinery was down-regulated in BL cells (Shannon-Lowe and Rowe, 2011). A separate study documents the involvement of sulfated HS in B cell/epithelial cell interactions during EBV transfer (Woodard et al., 2016). In fact, the HS-virus association has been investigated for many entirely different reasons with the aim of directly helping patients. In ocular gene therapy, a novel chimeric adeno-associated virus (AAV) vector has been developed that binds to HS and hence displays better transduction ability. This work has generated an AAV with higher accumulation in and penetration of the retina, and it therefore offers a less invasive intravitreal injection route for ocular gene therapy (Kaever et al., 2016). Similarly, the significance of HS in developing antibodies against vaccinia virus A27 and human papillomavirus (HPV) has been advocated (Silva et al., 2014; Xia et al., 2016).

In addition, a study tested two different strains (a vaccine strain and clinical isolates) of Chikungunya virus (CHIKV) for GAG binding and efficient virus replication. Although GAGs are necessary for efficient binding of both strains, the strains exhibit differential requirements for GAGs. The study identified amino acid residue 82 in the envelope glycoprotein E2 as a primary determinant of GAG utilization, a critical outcome for the future development of viral entry inhibitors against CHIKV infection (Richards et al., 2013). Additionally, the importance of HS structure for CHIKV infection was demonstrated in a study by Tanaka et al. (2017). Following genome screening, the CRISPR/Cas9 system was deployed to create knockout models of haploid human cells (HAP1). These knockout models were created to assess the importance of various HS-modifying genes for efficient CHIKV infection. Infectivity was not affected following disruption of genes encoding proteins that act sequentially after NDST1, an enzyme that catalyzes the N-sulfation step of HS (Tanaka et al., 2017). GLCE is one such protein that acts after the N-sulfation step. GLCE catalyzes the C5 epimerization of GlcA to IdoA, which is a necessary step for the later production of the 2- and 3-O-sulfated forms of HS. This stands in contrast to the mechanism of HSV infection, where these modifications have been shown to be necessary for infection. For CHIKV, the 2- and 3-O-sulfated forms of HS are not required for productive infection; rather, it is N-sulfated HS that facilitates viral binding and host cell entry.

In the case of HPV, multiple HS-binding sites were found on the capsid protein L1. These sites bring about sequential conformational shifts during virus-receptor engagement, assisting virus attachment and internalization. For instance, primary binding is regulated by site 1 (Lys278, Lys361). It can cause a conformational shift in L2, which mediates the display of the amino terminus. In the step after primary attachment, site 2 (Lys54, Lys356) and site 3 (Asn57, Lys59, Lys442, Lys443) are essential for viral entry (Richards et al., 2013). Overall, HS binding mediates exposure of secondary binding sites, aiding virus engulfment.

A study comparing the amino acid sequences of the E2 glycoprotein from natural North American eastern equine encephalitis virus (NA-EEEV) isolates showed that the neurovirulence of EEEV is supported by its interaction with HS (Voss et al., 2010). For instance, mutations from lysine to glutamine at E2 71, from lysine to threonine at E2 71, or from threonine to lysine at E2 72 were found to alter virulence and interaction with HS. Using an electrostatic map, the HS-binding regions in the EEEV E1/E2 heterotrimer were predicted at the apical surface of E2. Based on the recent Chikungunya virus crystal structure, these variants were found to affect the electrochemical nature of the binding site (Ryman et al., 2007). The EEEV sylvatic cycle could be the starting point for these natural variations in the EEEV HS-binding domain, which may influence receptor interaction as well as the severity of EEEV disease (Voss et al., 2010).

Similarly, HS binding has been proposed to be an important factor in the neurovirulence of neuroadapted and non-neuroadapted Sindbis viruses (SV) (Milho et al., 2012). Notably, HS on the olfactory neuroepithelium offers an essential and multifaceted site for HS-dependent murid herpesvirus-4 uptake. The olfactory neuroepithelium is therefore a critical entry point for heparan-dependent herpesviruses (Antoine et al., 2014).

### HEPARAN SULFATE- A GATEWAY FOR THE EMERGING DISEASES

Emerging highly pathogenic viruses continue to use cell surface HS for attachment. For instance, zoonotic viruses such as Nipah and Hendra viruses have been shown to use HS for viral entry during lethal infection (O'Hearn et al., 2015). Similarly, filoviruses, including Ebola virus (EBOV) and Marburg virus (MARV), which cause hemorrhagic fever with a high mortality rate, are also known to use cell surface HS and EXT-1, an enzyme involved in HS biosynthesis (Riblett et al., 2016). In addition, Rift Valley fever virus (RVFV), another emerging pathogen that can cause lethal hemorrhagic fever syndrome in humans, has been documented to use HS during the initial stages of infection (Mao et al., 2016).

#### HEPARAN SULFATE- A PLATFORM FOR INFECTION

Viruses consistently face enormous challenges from the jungle of HS chains on the cell surface, which may hinder or prevent efficient release of the virus from the infected cell. Ironically, this gives viruses an opportunity to equip themselves with strategies to overcome the problem. For instance, in hepatocytes, hepatitis B virus hides its envelope glycoprotein in its interior during release and becomes the HSPG-non-binding N-type. Later, it becomes the HSPG-binding B-type through translocation of its envelope protein back to the viral surface, which makes the virus infectious. In the case of HSV, it was recently discovered that upon infection the expression of the heparanase enzyme is up-regulated via NF-κB mechanisms, which aids in cleaving off the HS/3-OS HS chains, resulting in virus release and supporting viral pathogenesis. Taken together, viruses find their way out of cells and prepare for the next target cell by changing the landscape of their own HS-interacting envelope glycoproteins or by shedding cell surface HS. Interestingly, overexpression of heparanase is well documented in multiple pathological conditions such as inflammation, angiogenesis, tumor metastasis, and atherosclerosis. Heparanase-mediated breakdown or shedding of the basement membrane of epithelial and endothelial cells increases vascular permeability, which in turn supports leakiness and the migration of leukocytes into the surrounding tissues. In addition, heparanase causes further damage by releasing HS-sequestered cytokines and growth factors (e.g., vascular endothelial growth factor; VEGF), which, either on the spot or in the surrounding regions, ignite inflammatory processes and angiogenesis, aggravating the issue. The latter process results in the activation of GTPase signaling, which activates cytoskeletal proteins to promote cell migration and invasion.
These actions, combined with the up-regulation of matrix metalloproteinases, facilitate intercellular tumor invasion and metastasis after the loss of the HS barrier in the ECM. Further, the released HS may bind to endogenous TLR4, which in turn activates a signaling cascade that releases pro-oncogenic signals. These reactions are too overwhelming for the cells and tissues to repair the damage and maintain homeostasis. Therefore, the overall process may result in devastating disease. A similar situation exists during ocular HSV-1 infection, when the virus initially infects the corneal epithelium, which later brings an overwhelming immune response together with angiogenic cytokines. This immune-mediated response in the corneal stroma can compromise corneal transparency, ultimately resulting in vision loss.

The sulfated moieties present in an HS chain are key determinants of its biological significance. Analyzing HS moieties for their structure and function is an important strategy for multiple reasons. For instance, multiple moieties in the HS structure remain "orphan" in the sense that the ligands they interact with are not yet fully discovered or, if known, are poorly understood. In addition, the structural moieties in HS have "good" and "ugly" components, which are differentially expressed, signifying a "biological clock" and suggesting that expression patterns could serve as a diagnostic tool. For example, expression of 3-OST and 6-OST has been shown to be up-regulated in certain cancers (Connell et al., 2012; Cole et al., 2014). Since the role of HS in human disease and associated pathologies, including virus infection, is now being recognized, identifying the vasculature of HS and the associated signature of proinflammatory chemokines/cytokines in a diseased state is crucial for potential target development.

Interestingly, a study by Connell et al. (2012) demonstrated that a synthetic HS-mimetic peptide conjugated to a mini-CD4 displays high anti-HIV-1 activity independent of co-receptor usage (Whittall et al., 2013). The CD4 mimetic and the heparan sulfate derivative interact synergistically and allow potent antiviral activity against both CCR5- and CXCR4-tropic HIV-1 strains. Similarly, peptide-based molecules that directly recognize 3-OS HS, 6-OS HS, or 2-OS HS will be of enormous value for understanding multiple disease pathologies, including virus pathogenesis. In this direction, we made an optimistic attempt by developing peptides that recognize HS or 3-OS HS. These peptides were found to be effective in a murine ocular HSV model. The ability of the anti-HS and anti-3-OS HS peptides to prevent the inflammatory response still needs to be tested. In fact, there is a critical need to crack the code between viruses and their affinity for specific forms of HS, and to define the structural moieties involved. There is an emerging trend indicating that multiple viruses use specific moieties present in the HS chain, suggesting an association with its structural components. However, it remains to be determined whether viruses have a preference for one type of sulfation pattern over other types. If true, this would give viruses an edge in establishing tropism, because cells and tissues exhibit specific expression of sulfated HS. It could also mean that viral use of a specific sulfated-HS moiety could serve as a marker for a more virulent strain. Further, is there a unique subset of sulfated HS expressed in disease states that is either absent from healthy cells and tissues or present only at low levels? A major gap in knowledge is whether virus infection influences the expression of certain types of HS sulfotransferases, or whether their combinations add further variability and complexity to the existing HS structure.
Developing a blueprint of the sulfated moieties in HS before and after infection, and identifying their binding partners, especially the viral ligands, will boost the development of targeted therapy. A complete understanding of the structural components in HS that interact with ligands during various biological processes is very likely to advance our knowledge to the benefit of human health by preventing virus infection, serious diseases like cancer, and their associated pathologies. For instance, peptide-based formulations that directly target interactions between viral envelope glycoproteins and their respective sulfated HS, and that affect the immune response, would be exciting new therapeutic options as broad-spectrum virus inhibitors. Usage of cell surface HS moieties during viral binding and entry is a feature shared by medically important viruses, including many herpesviruses. Similarly, multiple viruses exploit modified HS differently to promote infection. For instance, HSV uses 3-OS HS to promote virus-cell fusion, while the presence of 3-OS HS significantly increases during CMV internalization, suggesting the possibility that 3-OS HS mediates endocytosis (Baldwin et al., 2015). Further, the use of multiple virus strains, including clinical isolates, and multiple cell types is also necessary to classify global or very specific usage of sulfated HS. These data will be highly informative and will allow HS typing to be established. One additional approach is to screen a phage display library against 2-O- and 6-O-sulfated HS. Interestingly, certain structural forms of HS have been suggested to be involved in the uptake of ligands by endocytosis. Although HS is a very versatile molecule, little information is available regarding the role of HS epitopes in driving efficient trafficking of virus cargo in endosomes, trapping cell signaling molecules that might otherwise diffuse, and controlling potential virus gene expression.
Understanding the type of HS involved in loading and unloading virus cargo, the type of HS that affects cell-to-cell communication, and the secondary messengers involved is an emerging and fascinating science with tremendous treatment potential. For instance, therapeutic agents such as HS-recognizing particles conjugated with anti-HS/anti-3-OS HS peptides in nanosomes may be delivered to the patient. Phage display based on anti-sulfated-HS peptides will be useful in this regard as well, to improve therapeutic and diagnostic options and patient outcomes. Phage display library screening is now widely used as an antiviral approach to identify ligands that bind to virus glycoproteins or cellular receptors to reduce or block virus infection. It is tempting to speculate that designing peptides that delay or interfere with virus internalization or trafficking, or that potentially stop cell communication, could be very useful on various levels.

### HEPARAN SULFATE- A KEY COMPONENT IN ACTIN CYTOSKELETON AND INFLAMMATION

Cell surface HSPGs are known to influence cell behavior and cytoskeletal organization via interaction with numerous ligands (Martinho et al., 1996). It has recently been discovered that intercellular or cytoplasmic bridges between cells expressing F-actin and HS/3-OS HS are indicators of cell-based oxidative stress, virus infection, inflammation, and cancer (Onfelt et al., 2004; Rustom et al., 2004; Vidulescu et al., 2004; Sowinski et al., 2008). Similarly, long bridges of tunneling nanotubes (TNTs) that connect two or more cells and express HS and 3-OS HS (Rustom et al., 2004; Chang et al., 2016) have been proposed to be heavily involved in organelle and vesicle transport as part of cell-to-cell communication. Their usage in viral transport has gained momentum for retroviruses (Sowinski et al., 2008; Omsland et al., 2018). For instance, it was recently shown that HIV infection of primary human macrophages results in an increased number of TNTs, which gives the virus an opportunity to spread via a novel mechanism (Hashimoto et al., 2016). A similar mechanism may exist for multiple viruses. Viruses appear to use TNTs strategically to cover long distances quickly and efficiently while escaping the host immune response via compartmentalization. HS may play a bigger role in virus trafficking across TNTs by clumping or aggregating around the actin bundles and then organizing the molecular machinery to interact with myosin-binding ATPases to trigger virus movement. HSV, being a neurotropic virus, may find similar TNTs between two neurons. Recently it has been shown that pathogenic α-synuclein fibrils, responsible for Parkinson's disease, travel between neurons in culture inside lysosomal vesicles through TNTs, a recently discovered mechanism of intercellular communication (Gallegos et al., 2015).
HSV, given a similar opportunity, may exploit TNTs to move to new cells and cause persistence, neuroinflammation, and potentially CNS-associated neurodegenerative diseases. For human metapneumovirus (HMPV), a cellular extension-based model of spread has recently been proposed. Interestingly, the virus has two modes of infection: cell-free infection, which is blocked by neutralizing antibodies and requires binding to HS moieties, and direct cell-to-cell infection, which is HS independent (El Najjar et al., 2016).

During inflammation, chemokines stimulate the induction of endothelial filopodia and microvilli structures, which have high levels of HS/2-O/6-O/3-O sulfated HS (Sowinski et al., 2008). This points to a role for sulfated HS in the sequestration or clustering of chemokines, in forming their gradient and presenting them to leukocytes, and in leukocyte trans-endothelial migration. Interestingly, multiple studies have shown that viruses often reorganize the host actin cytoskeleton during cell entry and induce filopodia (Chang et al., 2016). Although these structures have been proposed to aid virus infection through processes such as surfing and trafficking, it remains to be determined whether they also participate in the recruitment of inflammatory cells and tissue invasion. On the one hand, a study by Whittall et al. (2013) clearly demonstrated co-localization of 2-O, 3-O, and 6-O heparan sulfate with the chemokine CXCL8 on filopodial surfaces. On the other hand, HS mimetics that antagonize the chemotactic activity of proinflammatory cytokines and angiogenic activity are proving to be very useful (Mohamed and Coombe, 2017). Therefore, such small molecules can be valuable candidates for preventing not only HSV entry but also the recruitment of inflammatory cells in response to HSV-1 in ocular cells and tissues (Gangji et al., 2018). This would bring a much-needed benefit to patients suffering from either high-titer acute infection or a low-dosage chronic immune-mediated response to infection. The latter phase is responsible for substantial immune cell infiltrates leading to scarring of the eye tissues.

#### HEPARAN SULFATE STRUCTURES THAT INTERACT WITH VIRAL GLYCOPROTEINS

Although the biopolymer HS is massively diverse, specific sequences within it are likely to be critical for recognition of viral glycoproteins. An example of specific recognition is the HS–gD interaction, wherein a 3-O-sulfated GlcNp residue is required for HSV-1 to penetrate cells (Shukla et al., 1999; Xia et al., 2002; Tiwari et al., 2004). The site in HS to which gD binds is generated by an isoform of 3-O-sulfotransferase that yields an octasaccharide, ΔUA-(1:4)-GlcNp2S-(1:4)-IdoAp2S-(1:4)-GlcNp2Ac-(1:4)-GlcAp2S (or IdoAp2S)-(1:4)-GlcNp2S-(1:4)-IdoAp2S-(1:4)-GlcNp3S6S (Liu et al., 2002). Although gB and gC are major components of the viral envelope that facilitate binding to HS, the structure(s) in HS that these glycoproteins recognize have not yet been determined. However, it appears likely that these glycoproteins recognize different HS sequence(s), as suggested by competition studies (Herold et al., 1995). In addition, different HS structures appear to be recognized by gC from HSV-1 and HSV-2 (Gerber et al., 1995). This raises the concept of HS mimetics as inhibitors of viral infection of host cells.

#### Mimics of HS as Inhibitors of Viral Entry Into Cells

In 1964, it was discovered that heparin, a glycosaminoglycan related to heparan sulfate, inhibited HSV infection, which was possibly the origin of the idea that heparan sulfate is a viral receptor. Inhibition by heparin is a competitive process and is therefore most effective when the inhibitor is present during the attachment phase of viral entry. Heparin has been shown to bind to soluble forms of gB, gC, and gD (Nahmias and Kibrick, 1964; Herold et al., 1995; Tal-Singer et al., 1995; Feyzi et al., 1997; Trybala et al., 2000). Interestingly, whereas 6-O- and 3-O-sulfation are primary determinants of HSV-1 infection, they appear to have little role in HSV-2 infection, suggesting differences between the two types of viruses (Herold et al., 1996). Likewise, O-sulfation was found to be more important for binding to gB from HSV-1 than to gB from HSV-2. Fractionation of polymeric heparin into discrete sulfated oligosaccharide mixtures suggests that nearly 10–12 monosaccharides are necessary for effective binding to gC (Feyzi et al., 1997).

Numerous other sulfated polysaccharides have been explored as mimics of HS in the inhibition of herpesvirus entry into cells. A principal source of these sulfated polysaccharides is sea algae, which biosynthesize these molecules to retain K<sup>+</sup> and Ca<sup>2+</sup> ions from seawater and for enhanced resistance to desiccation. These polysaccharides include κ and λ carrageenans (Marchetti et al., 1995; Carlucci et al., 1997, 1999a,b; Zacharopoulos and Phillips, 1997), galactan sulfate (Damonte et al., 1996; Mazumder et al., 2002), galactofucan sulfate (Thompson and Dragar, 2004), fucan sulfate (Preeprame et al., 2001), spirulan (Hayashi K. et al., 1996; Hayashi T. et al., 1996), fucoidan (Ponce et al., 2003; Lee et al., 2004a), rhamnan sulfate (Lee et al., 1999), chitin sulfate (Ishihara et al., 1993), and other uncharacterized polymers (Hasui et al., 1995; Witvrouw and De Clercq, 1997; Lee et al., 2004b; Zhu et al., 2004). In addition, other polysaccharides investigated as mimics of heparan sulfate include chemically modified heparins (Herold et al., 1995, 1997; Feyzi et al., 1997), non-anticoagulated heparin (Herold et al., 1997), pentosan polysulfate (Herold et al., 1997), and dextran sulfate (Marchetti et al., 1995; Neyts et al., 1995; Dyer et al., 1997; Herold et al., 1997).

### Non-saccharide Mimetics of HS as Inhibitors of Viral Entry Into Cells

A growing class of small molecules is the non-saccharide glycosaminoglycan (GAG) mimetics (NSGMs). NSGMs are much smaller than oligomeric and polymeric HS mimetics, which makes them more drug-like in nature. NSGMs are also much easier to synthesize than oligomeric HS mimetics, and they are highly water soluble. Most importantly, NSGMs are functional mimetics of polymeric GAGs, which arises from their ability to bind to sites on proteins that interact with GAGs. Many NSGMs have been discovered so far that modulate various processes in addition to viral infection (Desai, 2013). For example, sulfated flavonoids and xanthones have been found to work as anticoagulant and antiplatelet agents (Al-Horani et al., 2011; Correia-da-Silva et al., 2011); sulfated benzofurans have been found to be inhibitors of thrombin (Sidhu et al., 2013); and sulfated benzylated glycosides have been discovered as inhibitors of human factor XIa (Al-Horani et al., 2013; Al-Horani and Desai, 2014).

In a recent work, Gangji et al. (2018) showed for the first time that NSGMs present an excellent alternative to polymeric HS agents for inhibiting HSV. These authors screened a small library of synthetic NSGMs (MW in the range of 500–2500) and identified a distinct group of NSGMs that bind glycoprotein D with high affinity (10 nM or so). More importantly, one specific NSGM, called SPGG, inhibited cellular entry of HSV-1 with an IC<sub>50</sub> in the range of 430 nM to 1.0 µM. This is more than 10-fold lower than the values reported for 3-O-sulfate-containing heparin/heparan sulfate-derived octasaccharides (Liu et al., 2002; Copeland et al., 2008; Hu et al., 2011).

Overall, competitive inhibition of viral entry through the use of either oligosaccharidic or non-saccharidic agents is very exciting. Several poly/oligosaccharides and small aromatic agents have been discovered that present major opportunities for development of anti-virals. It appears that the category of NSGMs represents excellent agents for targeting viral glycoproteins.

#### EPIGENETIC COMPONENTS

Equally important, growing evidence suggests that epigenetic regulation of HS-modifying enzymes (sulfotransferases), together with heparanase (HPSE; a mammalian endoglycosidase that degrades HSPG), is an important determinant in the pathogenesis of several inflammatory conditions (**Figure 4**). For example, Bui et al. (2010) showed, upon analysis of chondrosarcoma cells, a typical hypermethylation profile of 3-OST sulfotransferase genes, which contributed to the invasive phenotype of the cancer. A similar hypermethylation pattern in 3-OST-2 genes has recently been reported in a variety of cancers (Hwang et al., 2013; Hull et al., 2017). Several studies have shown that DNA methylation of the HPSE promoter influences HPSE expression in different stages of breast cancer and has a direct effect on tumor progression (Jiao et al., 2014). Knockdown and overexpression experiments with HPSE confirmed that HPSE regulates the transcription of a distinct cohort of immune response genes involved in T cell effector function and migration (Parish et al., 2013). An increase in HPSE levels results in NF-κB activation, followed by the release of tumor-promoting substances, growth factors (GFs), and cytokines by tumor-associated macrophages (Goldberg et al., 2013). Further, the increased metastatic potential in mice in vivo was inhibited with laminaran sulfate, a potent inhibitor of HPSE activity (Shteper et al., 2003). In addition, HPSE promotes cancer and related inflammatory pathologies by removing extracellular barriers that serve to limit invasion/extravasation. The release of HS-bound GFs and cytokines results in activation of antiapoptotic signaling and stimulates angiogenesis (Goldberg et al., 2013).
During ocular HSV-1 infection, it has been shown that upregulation of HPSE at the nucleus caused decreased interferon signaling and increased NF-κB activation, rendering neighboring cells more susceptible to infection and increasing proinflammatory factor production (Agelidis et al., 2017). Therefore, HPSE represents an attractive target for the development of broad-spectrum drugs that may have antimicrobial, anti-inflammatory, and anti-tumor activities (Khachigian and Parish, 2004). Antimicrobial peptides and HS mimetics that target HPSE are already gaining wide popularity for resolving life-threatening situations (Martin et al., 2015; Brennan et al., 2016).

The ultimate benefit of understanding the regulation of the sulfotransferase and HPSE genes and their localization mechanisms during disease development will be instrumental in the discovery of novel biomarkers. This approach will be beneficial for the early detection of disease and therefore may offer a better prognosis. In addition, targeting genes responsible for abnormal phenotype by developing highly specific inhibitors may have limited side effects and hence bring novel interventions to prevent disease.

#### CONCLUSION

Growing evidence suggests that the interaction between viral components and their preferred modified HS variants is important to our understanding of viral pathogenicity. Over the past decade, scientists have developed sophisticated information on the molecular determinants of modified HS and their counterpart ligands involved in various steps of microbial pathogenesis. Nonetheless, more advanced and specialized work is required to define the precise role of modified HS variants in viral pathogenesis and in disease development. Heparan sulfate (HS) was once regarded as an ancient yet vital regulator of cell homeostasis; however, with the discovery of the HS-modifying enzymes, the multiple roles of HS variants (i.e., modified versions of HS) are turning out to be equally fascinating, since the domains in sulfated HS carry very precise functions. One question remains to be addressed: do pathogens, including viruses, have the ability to directly or indirectly control the expression of HS-modifying enzymes? Further, the structure-function analysis of HS variants in terms of their role in virus infectivity, tropism, and the disease phenotype remains to be investigated. Since viral glycoproteins have unique structural motifs that bind HS and 3-OS HS, the possibility now exists to design HS mimetics that selectively block viral protein interaction with the host cell, expanding the selective therapeutic potential of HS mimetics (Gangji et al., 2018). The fact that HS and modified HS, together with HPSE activity, play a critical role during viral infection suggests that cell surface proteoglycans are a promising area for the development of reagents and for understanding the disease process. Overall, these studies are certainly worth pursuing to advance our approach to understanding disease and developing novel therapeutic interventions.

## AUTHOR CONTRIBUTIONS

DDK, DJ, MH, JB, MVV, and VT wrote the article. URD provided HS-mimetic section. VT, MH, JCB, and DDK prepared the figures and the table.

#### FUNDING

DDK was supported by the Biomedical Science Program (College of Graduate Studies) at Midwestern University (MWU). VT was supported by MWU's start-up funds.

#### ACKNOWLEDGMENTS

Midwestern University start-up funds to VT are gratefully acknowledged.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Kaltenbach, Jaishankar, Hao, Beer, Volin, Desai and Tiwari. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Using Integrative Analysis of DNA Methylation and Gene Expression Data in Multiple Tissue Types to Prioritize Candidate Genes for Drug Development in Obesity

Qingjie Guo<sup>1,2</sup>, Ruonan Zheng<sup>3</sup>, Jiarui Huang<sup>3</sup>, Meng He<sup>3</sup>, Yuhan Wang<sup>3</sup>, Zonghao Guo<sup>3</sup>, Liankun Sun<sup>4</sup>\*† and Peng Chen<sup>1,2</sup>\*†

<sup>1</sup> Department of Genetics, College of Basic Medical Sciences, Jilin University, Changchun, China, <sup>2</sup> Key Laboratory of Pathobiology, Ministry of Education, Jilin University, Changchun, China, <sup>3</sup> College of Clinical Medicine, Jilin University, Changchun, China, <sup>4</sup> Department of Pathophysiology, College of Basic Medical Sciences, Jilin University, Changchun, China

#### Edited by:

Chandravanu Dash, Meharry Medical College, United States

#### Reviewed by:

Meenal Gupta, University of Utah, United States Zhenbang Chen, Meharry Medical College, United States

#### \*Correspondence:

Peng Chen pchen@jlu.edu.cn Liankun Sun sunlk@jiu.edu.cn

†These authors have contributed equally to this work

#### Specialty section:

This article was submitted to Pharmacogenetics and Pharmacogenomics, a section of the journal Frontiers in Genetics

Received: 24 July 2018 Accepted: 04 December 2018 Published: 19 December 2018

#### Citation:

Guo Q, Zheng R, Huang J, He M, Wang Y, Guo Z, Sun L and Chen P (2018) Using Integrative Analysis of DNA Methylation and Gene Expression Data in Multiple Tissue Types to Prioritize Candidate Genes for Drug Development in Obesity. Front. Genet. 9:663. doi: 10.3389/fgene.2018.00663

Obesity has become a major public health issue and is caused by a combination of genetic and environmental factors. Genome-wide DNA methylation studies have shown that DNA methylation at Cytosine-phosphate-Guanine (CpG) sites is associated with obesity. However, subsequent functional validation of the results from these studies has been challenging given the high number of reported associations. In this study, we applied an integrative analysis approach, aiming to prioritize candidate genes for drug development from the many associated CpGs. Association data were collected from previous genome-wide DNA methylation studies and combined using a sample-size-weighted strategy. Gene expression data in adipose tissues and enriched pathways of the affiliated genes were overlapped to shortlist the associated CpGs. The CpGs with the most overlapping evidence were indicated as the most appropriate CpGs for future studies. Our results revealed that 119 CpGs were associated with obesity (p ≤ 1.03 × 10<sup>−7</sup>). Of the affiliated genes, SOCS3 was the only gene involved in all enriched pathways and was differentially expressed in both visceral adipose tissue (VAT) and subcutaneous adipose tissue (SAT). In conclusion, our integrative analysis is an effective approach for highlighting the DNA methylation sites with the highest relevance to drug development. SOCS3 may serve as a target for drug development in obesity and its complications.

Keywords: DNA methylation, obesity, association, gene expression, CpG

### INTRODUCTION

Since 1980, the incidence of obesity has increased throughout the world (Stevens et al., 2012; Ng et al., 2014). The onset of obesity involves the interaction between genetic and environmental factors (Contaldo and Pasanisi, 2004; Ussar et al., 2015). Genome-wide association studies (GWASs) have successfully identified many genetic variants associated with complex human diseases and have provided crucial new insights into the underlying molecular mechanisms (De La Vega et al., 2011; Fall and Ingelsson, 2014; Winham et al., 2014; Evangelou et al., 2018). To date, the largest obesity GWAS has identified 97 body mass index (BMI)-associated loci (P ≤ 5 × 10<sup>−8</sup>) from up to 339,224 individuals; however, most of the genetic susceptibility remains unexplained (Locke et al., 2015).

Existing evidence suggests that obesity is a result of interactions between genetic and environmental factors (Marti et al., 2008). DNA methylation provides a molecular mechanism for the interaction between the environment and obesity, in that it may affect individual susceptibility to obesity by altering gene expression. In recent years, the association between DNA methylation and obesity has been studied intensively (van Dijk et al., 2015; Dhana et al., 2018; Wang et al., 2018). For example, a genome-wide DNA methylation association study of obesity that recruited 5,387 individuals identified 278 CpGs associated with BMI (Wahl et al., 2017). The associated CpGs have provided insight beyond that of previous genetic studies. On the other hand, the large number of associated CpGs has made functional investigation in cell and animal models difficult.

In this study, we applied an integrative analysis approach to prioritize the most relevant genes from the many associated CpGs. Using this approach, we identified SOCS3 as a promising candidate for mechanistic studies and drug development. This approach can also be adapted to genome-wide DNA methylation studies of other diseases.

### METHODS

The integrative analysis approach included three components. The first component was to nominate the candidate CpGs by combining the association results from previous studies of peripheral blood samples (Steps 1–4, **Figure 1**). The second component was to estimate the functional relevance of the candidates through pathway enrichment analysis (Step 5). The third component was to validate that the genes affiliated with candidate CpGs were differentially expressed in adipose tissues (Step 6). Finally, the evidence from these components was put together, and the genes with positive support from all components were prioritized by our approach (Step 7).

#### Literature Search

The literature search was conducted in the PubMed database using the keywords "CpG", "DNA methylation" and "obesity" to capture all articles published from 2014 to 2018. We applied an English language restriction to our search results.

#### Inclusion Criteria and Data Extraction

Both cohort studies and case-control studies reporting the association between DNA methylation and obesity (as measured by BMI) were included in this meta-analysis. Studies that used samples from cancer patients were not included. We further excluded the studies that used non-human subjects.

The full text of each article was carefully read to determine whether studies should be included. Once included, data were extracted from the articles, including the publication year, participant characteristics, sample size, association p-value, and the effect size.

#### Meta-Analysis

We employed a sample-size-weighted strategy to combine the p-values reported in the included studies, taking into account the direction of the association effect. This strategy was implemented using R software (https://www.r-project.org/). In this meta-analysis, CpG sites with a p-value below 1.03 × 10<sup>−7</sup> (Bonferroni correction based on the 485,577 CpGs on the Illumina HM450K array) and with a direction of effect that was consistent across all included studies were considered significant.
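The exact weighting formula is not given in the text. As a minimal sketch, assuming a common scheme in which each two-sided p-value is converted to a signed Z-score and weighted by the square root of the study sample size, the combination and the Bonferroni threshold could look as follows in Python. The function names and the example inputs are hypothetical, and the authors' own implementation was in R.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, SD 1)


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def directions_consistent(studies):
    """True when every study reports the same sign of effect."""
    signs = {1 if direction > 0 else -1 for _, direction, _ in studies}
    return len(signs) == 1


def weighted_z_meta(studies):
    """Combine two-sided p-values using sqrt(sample size) weights.

    studies: iterable of (p_value, direction, sample_size), with direction
    +1 or -1 and 0 < p_value <= 1. Returns (combined_z, combined_p).
    Assumed scheme, not confirmed by the source text.
    """
    weighted_sum = 0.0
    total_n = 0
    for p_value, direction, n in studies:
        # Magnitude comes from the two-sided p-value, sign from the effect.
        z = -STD_NORMAL.inv_cdf(p_value / 2.0)
        if direction < 0:
            z = -z
        weighted_sum += sqrt(n) * z
        total_n += n
    combined_z = weighted_sum / sqrt(total_n)
    # Note: cdf() underflows to 0 for very large |Z|; fine for a sketch.
    combined_p = 2.0 * STD_NORMAL.cdf(-abs(combined_z))
    return combined_z, combined_p


# Hypothetical inputs for a single CpG: (p-value, direction, sample size).
example = [(1e-4, +1, 1200), (3e-3, +1, 800), (2e-5, +1, 2500)]
z, p = weighted_z_meta(example)
threshold = bonferroni_threshold(0.05, 485_577)
significant = p < threshold and directions_consistent(example)
print(f"Z = {z:.2f}, p = {p:.2e}, significant = {significant}")
```

Weighting by the square root of the sample size lets larger cohorts dominate the combined statistic without requiring comparable effect-size scales across studies, which matters when the included studies used different data transformations.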

#### Pathway Enrichment Analysis

We investigated the enrichment of the affiliated genes in Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways using the Metascape online software (http://metascape.org; Tripathi et al., 2015). The genes were annotated using the default resources provided by Metascape. KEGG pathways were filtered using the default settings (number of gene hits ≥3, enrichment p-value ≤ 0.05 and enrichment statistic ≥1.5). An FDR-adjusted p-value ≤ 0.05 was taken to declare significant enrichment.
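To make the enrichment criteria concrete, the following Python sketch computes a one-sided hypergeometric enrichment p-value, the enrichment ratio, and Benjamini-Hochberg adjusted p-values. It is an illustration under stated assumptions rather than a reproduction of Metascape: the function names are hypothetical, the hypergeometric model and the Benjamini-Hochberg procedure are assumed as the test and FDR method, and the background size must be supplied by the user.

```python
from math import comb


def hypergeom_enrichment_p(background, pathway_size, list_size, hits):
    """P(X >= hits) for X ~ Hypergeometric(background, pathway_size, list_size)."""
    total = comb(background, list_size)
    upper = min(pathway_size, list_size)
    favourable = sum(
        comb(pathway_size, i) * comb(background - pathway_size, list_size - i)
        for i in range(hits, upper + 1)
    )
    return favourable / total


def enrichment_ratio(background, pathway_size, list_size, hits):
    """Observed fraction of hits divided by the fraction expected by chance."""
    return (hits / list_size) / (pathway_size / background)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def passes_filters(hits, p_value, ratio, min_hits=3, p_cutoff=0.05, min_ratio=1.5):
    """Apply the three filter settings quoted in the text."""
    return hits >= min_hits and p_value <= p_cutoff and ratio >= min_ratio
```

A pathway can pass the three filters and still fail the FDR criterion once all tested pathways are adjusted together, which is the distinction the two thresholds in the text draw.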

#### Differential Expression Analysis in Adipose Tissues

We investigated whether the associated genes were differentially expressed in the SAT and VAT of patients with obesity by comparing their transcription levels with those of lean individuals. This analysis was performed using the GEO2R tool (https://www.ncbi.nlm.nih.gov/geo/geo2r/) on two datasets, GSE2508 (10 obese vs. 10 lean) and GSE88837 (15 obese vs. 15 lean), for the SAT and VAT, respectively. Gene transcription levels were assayed using Affymetrix Human Genome U95 V2 and U133 arrays. Transcription level data for each sample were queried from the GEO database (Davis and Meltzer, 2007), and differential expression in obese samples was identified from the empirical Bayes statistics that GEO2R calculates with the R package "limma" (Smyth, 2004; Ritchie et al., 2015). The fold change in gene expression was calculated using the group means. A p-value ≤ 0.05 and |log<sub>2</sub>(fold change)| ≥ 1 were used as the criteria for differentially expressed genes. Genes that were differentially expressed in both tissue types were identified as relevant loci.
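The two cut-offs can be expressed as a simple filter. The sketch below is in Python and uses hypothetical function names. It assumes that expression values are already on a log<sub>2</sub> scale, so that the log<sub>2</sub> fold change is the difference between group means; it does not reproduce the moderated statistics that limma computes.

```python
from statistics import mean


def log2_fold_change(case_log2_values, control_log2_values):
    """Difference of group means for values already on a log2 scale (assumption)."""
    return mean(case_log2_values) - mean(control_log2_values)


def is_differentially_expressed(p_value, log2_fc, p_cutoff=0.05, lfc_cutoff=1.0):
    """Apply the criteria from the text: p <= 0.05 and |log2 fold change| >= 1."""
    return p_value <= p_cutoff and abs(log2_fc) >= lfc_cutoff


# The SOCS3 values reported in the Results satisfy both criteria.
print(is_differentially_expressed(9.23e-3, 1.06))  # SAT
print(is_differentially_expressed(8.52e-3, 1.92))  # VAT
```

An absolute log<sub>2</sub> fold change of 1 corresponds to a two-fold difference in either direction, so the filter treats up- and down-regulated genes symmetrically.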

#### The DNA Methylation Associated With Obesity in Human VAT and Liver Tissue

DNA methylation in all of the included studies was measured in peripheral blood, but DNA methylation in peripheral blood may differ from that in metabolic tissues. To test whether the associations found in peripheral blood samples carry over to obesity-related tissues, we tested the association of the significant CpGs in human VAT and liver tissue, using two GEO datasets, GSE88940 (10 obese vs. 10 normal VAT samples) and GSE65057 (8 obese vs. 7 normal liver samples), respectively.

**Abbreviations:** ABCG1, ATP Binding Cassette Subfamily G Member 1; BMI, body mass index; CpG, Cytosine-phosphate-Guanine; FDR, false discovery rate; GWAS, Genome Wide Association Study; IRS1, insulin receptor substrate 1; IRS2, insulin receptor substrate 2; KEGG, Kyoto Encyclopedia of Genes and Genomes; LEPR, leptin receptor; SAT, subcutaneous adipose tissue; SOCS3, Suppressor Of Cytokine Signaling 3; T2D, type 2 diabetes; VAT, visceral adipose tissue.

## RESULTS

### Characteristics of Individual Studies

Using the keywords "CpG", "DNA methylation" and "obesity", a total of 350 related articles were retrieved. Two hundred and seventy studies were excluded based on the title and abstract because they did not meet the inclusion criteria, leaving 80 articles. Of those, 67 articles were excluded after a full-text review. As a result, 13 articles were included in the analysis. Most of the excluded articles were functional studies in cells or animals. The basic characteristics of the included studies are detailed in **Table 1**.

### Meta-Analysis and Pathway Enrichment Analysis

A total of 13 articles were included in our meta-analysis. The pooled number of peripheral blood samples for each CpG ranged from 700 to 18,370. We identified 119 CpGs associated with obesity that reached the genome-wide significance level of p ≤ 1.03 × 10<sup>−7</sup> (**Supplementary Table 1**). The top 10 associated CpGs are shown in **Table 2**.

Seventy-eight genes were annotated as affiliated with these CpGs and were used as the input to the pathway enrichment analysis. These genes were enriched in three KEGG pathways related to insulin resistance, adipocytokine signaling and TNF signaling. However, none of these pathways was significant after multiple testing correction (FDR p > 0.05). According to KEGG, only one gene (SOCS3) was involved in all three pathways (**Table 3**).

### Differential Expression of the Affiliated Genes

We analyzed 30 VAT samples and 20 SAT samples to assess the differential expression of the affiliated genes. A p-value ≤ 0.05 and |log<sub>2</sub>(fold change)| ≥ 1 were used as the criteria for differentially expressed genes. In the SAT, a total of 392 differentially expressed genes were obtained, of which 317 were up-regulated and 75 were down-regulated. In the VAT, there were 875 differentially expressed genes, of which 406 were up-regulated and 469 were down-regulated. Among the genes affiliated with the 119 significant CpGs, SOCS3 and DOK2 were differentially expressed in the SAT of patients with obesity, while seven genes (SOCS3, PRR5L, ABCG1, BRDT, B3GNT7, ZNF710, and RARRES1) were differentially expressed in the VAT of patients with obesity. Notably, SOCS3 was up-regulated in both human SAT (log<sub>2</sub> fold change = 1.06, p = 9.23 × 10<sup>−3</sup>) and VAT (log<sub>2</sub> fold change = 1.92, p = 8.52 × 10<sup>−3</sup>). The results are detailed in **Supplementary Tables 2**, **3**.

<sup>a</sup>Values are shown as mean ± SD.

<sup>b</sup>BMI was derived as weight (kg) divided by height<sup>2</sup> (m<sup>2</sup>).

MARTHA, MARseille THrombosis Association; KORA, Cooperative Health Research; ARIC, Atherosclerosis Risk in Communities; GUSTO, Growing Up in Singapore toward Healthy Outcomes; FHS, Framingham Heart Study; EUGENE2, European Network on Functional Genomics of Type 2 Diabetes; ALSPAC, Avon Longitudinal Study of Parents and Children; RS-III, Rotterdam Study III; BIOS, Biobank-based Integrative Omics Studies; LOLIPOP, The London Life Sciences Prospective Population; LBC, Lothian Birth Cohort; KoCAS, Korean Child-Adolescent Cohort Study; EpiGO, Epigenetic Basis of Obesity-Induced Cardiovascular Disease and Type 2 Diabetes; LACHY, Lifestyle, Adiposity and Cardiovascular Health in Youth; BP, Blood pressure; RS, Rotterdam Study; AA, African American; EA, European American.

### The Association With Obesity in Human VAT and Liver Tissue

Twenty VAT samples and 15 liver tissue samples were analyzed to test whether the associations found in the peripheral blood samples carry over to metabolic tissues. The results revealed that seven of the 119 associated CpGs were significantly associated with obesity (p < 0.05) in both the VAT and the liver tissue. The detailed results are shown in **Table 4**. Interestingly, most of them were associated with obesity in opposite directions in the VAT and the liver tissue. For example, the CpG site cg07136133 was hyper-methylated in the VAT (log<sub>2</sub> fold change = 0.135, p = 0.014) but hypo-methylated in the liver (log<sub>2</sub> fold change = −0.066, p = 0.013) of patients with obesity.

#### DISCUSSION

In this study, we identified 119 obesity-associated DNA methylation sites in human peripheral blood samples by combining results from previous studies. We further implemented an integrative approach that highlighted SOCS3, among the numerous associated genes, as a promising drug target.

The role of SOCS3 in obesity was supported by our pathway enrichment analysis and by the differential gene expression in the metabolic tissues. Pathway enrichment analysis is an efficient tool for drug target discovery (Aguirre-Plans et al., 2018). A gene that is involved in each or most of the enriched pathways may occupy an essential position in the etiology of obesity. In our results, SOCS3 was the only obesity-associated gene whose protein regulated all three of the most enriched KEGG pathways. SOCS3 suppresses its target proteins by promoting their ubiquitination and degradation. These targets include the insulin receptor substrates (IRS1 and IRS2) in liver cells and the leptin receptor (LEPR) in adipocytes (Bjorbak et al., 2000; Eyckerman et al., 2000; Rui et al., 2002; Howard et al., 2004). In our study, SOCS3 gene expression was up-regulated in both the VAT and SAT of patients with obesity. This observation is in line with the increased insulin resistance found in patients with morbid obesity, and it further supports SOCS3 as a promising drug target (Mitrou et al., 2013; Dawson et al., 2014; Pucci et al., 2014).

TABLE 2 | The top 10 associated CpGs in the meta-analysis.

<sup>a</sup>The genes were annotated using the default resources provided by Metascape. CpG, Cytosine-phosphate-Guanine; P, P-value; N, the total sample size of the corresponding CpG sites; Dir, direction of association with body mass index.

KEGG, Kyoto Encyclopedia of Genes and Genomes; P, P-value; FDR, false discovery rate.

Although SOCS3 is involved in the insulin signaling pathway, the association between SOCS3 DNA methylation and type 2 diabetes (T2D) has been under debate. In studies with small sample sizes (N < 300), the SOCS3 CpG was not associated with T2D (p > 0.05), with or without adjustment for BMI (Al Muftah et al., 2016; Dayeh et al., 2016). Furthermore, in one of these studies the CpG remained associated with BMI after adjustment for T2D in the same cohort (Dayeh et al., 2016). In a study with 1074 incident T2D samples and 1590 controls, the SOCS3 CpG was associated with incident T2D (p = 1.2 × 10<sup>−7</sup>) without adjustment for BMI (Chambers et al., 2015).

It has also been controversial whether the obesity-associated SOCS3 CpG affects the transcription level. Hypomethylation at the associated SOCS3 CpG was shown to be associated with higher SOCS3 expression in peripheral blood mononuclear cells (Ali et al., 2016). One might expect this to be transferable to other tissues; however, hypo-methylation at this CpG was found to be related to lower gene expression in human pancreatic islets but to higher gene expression in adipose tissue (Dayeh et al., 2016). This apparently conflicting evidence indicates that the regulation of SOCS3 expression might be more complex than previously thought. Further investigation is necessary to uncover the tissue-specific modifiers of the expression of this gene and to understand whether they help to clarify the association between SOCS3 DNA methylation and T2D.

<sup>a</sup>The genes were annotated using the default resources provided by Metascape.

CpG, Cytosine-phosphate-Guanine; VAT, visceral adipose tissue; P, P-value; Dir, direction of association with body mass index; Logfc, log fold change.

DNA methylation in peripheral blood samples can differ from that in metabolic tissues, such as adipose tissue and liver cells (De Bustos et al., 2009; Lovinsky-Desir et al., 2014). Conclusions derived from non-metabolic tissues should be validated in multiple metabolic tissues before being used as evidence to support drug development or clinical trials. However, the GEO DNA methylation datasets in metabolic tissues had small sample sizes. A statistical power analysis showed that we had only 9.5% power to detect a weak effect of DNA methylation on obesity using 20 samples. We hope that a better powered DNA methylation analysis in metabolic tissues can be taken into consideration in future integrative studies.
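The low power reported for 20 samples can be put in context with a rough calculation. The Python sketch below uses a normal approximation for a two-sided, two-group comparison with equal group sizes. The standardized effect size, the test and the function names are assumptions made for illustration; the text does not state how the 9.5% figure was derived, so this sketch is not expected to reproduce it exactly.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def two_sample_power(effect_size_d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample test (normal approximation).

    effect_size_d is a standardized mean difference (assumed definition).
    """
    z_crit = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    shift = effect_size_d * sqrt(n_per_group / 2.0)
    return STD_NORMAL.cdf(shift - z_crit) + STD_NORMAL.cdf(-shift - z_crit)


def min_n_per_group(effect_size_d, target_power=0.8, alpha=0.05, max_n=100_000):
    """Smallest group size reaching the target power, or None if not found."""
    for n in range(2, max_n + 1):
        if two_sample_power(effect_size_d, n, alpha) >= target_power:
            return n
    return None


# Illustrative call: a weak effect (d = 0.2, an assumed value) with 10 vs. 10 samples.
print(f"power = {two_sample_power(0.2, 10):.3f}")
print(f"n per group for 80% power = {min_n_per_group(0.2)}")
```

Under these assumptions, power for a weak effect stays close to the significance level at 10 samples per group, which is consistent with the caution expressed above about the tissue datasets.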

As compared to SOCS3, other genes showed a relatively weak relevance in our integrative analysis. The DNA methylation of the ABCG1 gene was the top signal in peripheral blood samples. Unfortunately, the ABCG1 gene was only differentially expressed in the VAT of obesity patients. The CPT1A gene was involved in two enriched pathways, but not differentially expressed in adipose tissues. Except for SOCS3, CPT1A, and ABCG1, other top 10 associated genes, shown in **Table 2**, lacked evidence of differential expression in adipose tissues and involvement in the enriched pathways. When taking a closer look at the 119 associated genes, we did have five additional differentially expressed genes in the VAT and one in the SAT. However, their priority was not supported by the enriched pathways.
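The overlap logic described in this paragraph can be sketched with plain set operations. The Python example below uses only the gene lists and pathway counts stated in the text; pathway membership for genes other than SOCS3 and CPT1A is not given here and is therefore omitted, and the scoring function is a hypothetical illustration rather than the authors' pipeline.

```python
# Differentially expressed affiliated genes, as listed in the Results.
SAT_DE = {"SOCS3", "DOK2"}
VAT_DE = {"SOCS3", "PRR5L", "ABCG1", "BRDT", "B3GNT7", "ZNF710", "RARRES1"}

# Number of enriched KEGG pathways per gene; only the values stated in the text.
PATHWAY_HITS = {"SOCS3": 3, "CPT1A": 2}
N_ENRICHED_PATHWAYS = 3


def prioritize(sat_de, vat_de, pathway_hits, n_pathways):
    """Rank genes by the number of supporting lines of evidence (0 to 3)."""
    genes = sat_de | vat_de | set(pathway_hits)
    ranked = []
    for gene in genes:
        evidence = {
            "sat": gene in sat_de,
            "vat": gene in vat_de,
            "all_pathways": pathway_hits.get(gene, 0) == n_pathways,
        }
        ranked.append((gene, sum(evidence.values()), evidence))
    # Highest score first; ties broken alphabetically for stable output.
    ranked.sort(key=lambda row: (-row[1], row[0]))
    return ranked


for gene, score, evidence in prioritize(SAT_DE, VAT_DE, PATHWAY_HITS, N_ENRICHED_PATHWAYS):
    print(gene, score, evidence)
```

Requiring support from every component is deliberately strict: a gene such as ABCG1, which is the top methylation signal, still drops out because it lacks differential expression in the SAT.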

The strength of this study lies in overlapping multiple lines of evidence to prioritize candidate genes for drug target development from the many associated DNA methylation sites. The integrated approach included genome-wide screening results in peripheral blood samples, pathway enrichment analysis, and differential gene expression in multiple adipose tissues. Screening obesity-associated CpGs in peripheral blood remains the most practical approach at present, as peripheral blood samples are abundant in many research groups. However, it should be noted that genomic DNA methylation can vary among different tissue types. For example, in our DNA methylation analyses in metabolic tissues, we observed opposite directions of association at most of the associated CpGs (**Table 4**).

It should be noted that our study had limitations. First, the association results from the included studies came with various types of data transformation and statistical models, and the effect sizes showed strong heterogeneity. We therefore combined the p-values using a sample-size-weighted strategy, which is flexible but can be inaccurate. The genome-wide screening of our pipeline could be improved when individual-level DNA methylation data become available. Second, we analyzed the GEO datasets using the GEO2R tool, which could not adjust for covariates such as age and sex; such adjustment may help minimize the effects of confounding factors. Finally, the included GEO datasets have much smaller sample sizes than the genome-wide screening, which may have increased the false negative rate of our approach.

#### CONCLUSION

In summary, we integrated multiple lines of evidence to reveal candidate genes for the treatment of obesity and its complications. Our study provided new insights on the interaction between obesity and the epigenome. Future studies are warranted to discover more potential drug targets using larger sample sizes from metabolic tissues, and to elucidate the mechanism of SOCS3 DNA methylation interacting with obesity.

#### AUTHOR CONTRIBUTIONS

PC contributed to the conception and design of the study. QG, RZ, YW, ZG, JH, and MH were responsible for finding literature and extracting data. QG conducted the data analysis and wrote the first draft of the manuscript. PC and LS contributed to manuscript revision and final approval for submission. All authors reviewed the manuscript and provided comments.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2018.00663/full#supplementary-material



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Guo, Zheng, Huang, He, Wang, Guo, Sun and Chen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Corrigendum: Using Integrative Analysis of DNA Methylation and Gene Expression Data in Multiple Tissue Types to Prioritize Candidate Genes for Drug Development in Obesity

#### Approved by:

Frontiers Editorial Office, Frontiers Media SA, Switzerland

#### \*Correspondence:

Peng Chen pchen@jlu.edu.cn Liankun Sun sunlk@jiu.edu.cn

†These authors have contributed equally to this work

#### Specialty section:

This article was submitted to Pharmacogenetics and Pharmacogenomics, a section of the journal Frontiers in Genetics

Received: 06 May 2019 Accepted: 31 May 2019 Published: 21 June 2019

#### Citation:

Guo Q, Zheng R, Huang J, He M, Wang Y, Guo Z, Sun L and Chen P (2019) Corrigendum: Using Integrative Analysis of DNA Methylation and Gene Expression Data in Multiple Tissue Types to Prioritize Candidate Genes for Drug Development in Obesity. Front. Genet. 10:571. doi: 10.3389/fgene.2019.00571

Qingjie Guo<sup>1,2</sup>, Ruonan Zheng<sup>3</sup>, Jiarui Huang<sup>3</sup>, Meng He<sup>3</sup>, Yuhan Wang<sup>3</sup>, Zonghao Guo<sup>3</sup>, Liankun Sun<sup>4</sup>\*† and Peng Chen<sup>1,2</sup>\*†

<sup>1</sup> Department of Genetics, College of Basic Medical Sciences, Jilin University, Changchun, China, <sup>2</sup> Key Laboratory of Pathobiology, Ministry of Education, Jilin University, Changchun, China, <sup>3</sup> College of Clinical Medicine, Jilin University, Changchun, China, <sup>4</sup> Department of Pathophysiology, College of Basic Medical Sciences, Jilin University, Changchun, China

Keywords: DNA methylation, obesity, association, gene expression, CpG

#### **A Corrigendum on**

#### **Using Integrative Analysis of DNA Methylation and Gene Expression Data in Multiple Tissue Types to Prioritize Candidate Genes for Drug Development in Obesity**

by Guo, Q., Zheng, R., Huang, J., He, M., Wang, Y., Guo, Z., et al. (2018). Front. Genet. 9:663. doi: 10.3389/fgene.2018.00663

In the published article, there was an error regarding the affiliation for Liankun Sun. He should have the affiliation <sup>4</sup>Department of Pathophysiology, College of Basic Medical Sciences, Jilin University, Changchun, China instead of the affiliation <sup>1</sup>Department of Genetics, College of Basic Medical Sciences, Jilin University, Changchun, China.

The authors apologize for this error and state that this does not change the scientific conclusions of the article in any way. The original article has been updated.

Copyright © 2019 Guo, Zheng, Huang, He, Wang, Guo, Sun and Chen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Making Sense of the Epigenome Using Data Integration Approaches

Emma Cazaly<sup>1</sup>, Joseph Saad<sup>1</sup>, Wenyu Wang<sup>1</sup>, Caroline Heckman<sup>1</sup>, Miina Ollikainen<sup>1,2</sup>\* and Jing Tang<sup>1,3,4</sup>\*

<sup>1</sup> Institute for Molecular Medicine Finland, Helsinki Institute of Life Science, University of Helsinki, Helsinki, Finland, <sup>2</sup> Department of Public Health, University of Helsinki, Helsinki, Finland, <sup>3</sup> Department of Mathematics and Statistics, University of Turku, Turku, Finland, <sup>4</sup> Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland

Epigenetic research involves examining the mitotically heritable processes that regulate gene expression, independent of changes in the DNA sequence. Recent technical advances, such as whole-genome bisulfite sequencing and affordable epigenomic array-based technologies, allow researchers to measure epigenetic profiles of large cohorts at a genome-wide level, generating comprehensive high-dimensional datasets that may contain important information for disease development and treatment opportunities. The epigenomic profile for a certain disease is often a result of the complex interplay between multiple genetic and environmental factors, which poses an enormous challenge for visualizing and interpreting these data. Furthermore, due to the dynamic nature of the epigenome, it is critical to determine causal relationships from the many correlated associations. In this review we provide an overview of recent data analysis approaches to integrate various omics layers to understand epigenetic mechanisms of complex diseases, such as obesity and cancer. We discuss the following topics: (i) advantages and limitations of major epigenetic profiling techniques, (ii) resources for standardization, annotation and harmonization of epigenetic data, and (iii) statistical methods and machine learning methods for establishing data-driven hypotheses of key regulatory mechanisms. Finally, we discuss the future directions for data integration that shall facilitate the discovery of epigenetic-based biomarkers and therapies.

#### Keywords: epigenetics, data integration, functional annotation, drug discovery, data resources, profiling techniques

#### Edited by:

Chandravanu Dash, Meharry Medical College, United States

#### Reviewed by:

Volker Martin Lauschke, Karolinska Institute (KI), Sweden Zhiguo Xie, Central South University, China

#### \*Correspondence:

Miina Ollikainen miina.ollikainen@helsinki.fi Jing Tang jing.tang@helsinki.fi

#### Specialty section:

This article was submitted to Pharmacogenetics and Pharmacogenomics, a section of the journal Frontiers in Pharmacology

Received: 22 November 2018 Accepted: 31 January 2019 Published: 19 February 2019

#### Citation:

Cazaly E, Saad J, Wang W, Heckman C, Ollikainen M and Tang J (2019) Making Sense of the Epigenome Using Data Integration Approaches. Front. Pharmacol. 10:126. doi: 10.3389/fphar.2019.00126

## INTRODUCTION

Complex diseases and traits have a genetic background, yet the final phenotypic outcome largely depends on an individual's environment and lifestyle, and genomic studies have thus far explained only a small fraction of the inherited risk of many complex diseases (Eichler et al., 2010). This missing heritability may in part be explained by the contribution of epigenetic variation to complex diseases. Moreover, the majority of genetic variants associated with a disease risk are located at non-coding regions of the genome, suggesting that these SNPs point to genomic regions with a downstream regulatory role. It is well-established that cells regulate gene expression during multiple stages of transcription and translation, predominantly through chromatin packaging (Holliday, 2006). Chromatin is a complex of DNA and DNA-binding proteins that control the packaging of DNA and thereby affect the access of transcription factors to the regulatory regions of genes. This process is regulated by two epigenetic mechanisms: dynamic DNA methylation and post-translational modifications of DNA-binding histone proteins.

DNA methylation plays an important role in silencing tissue-specific genes, imprinted genes and repetitive elements (Walsh et al., 1998; Fouse et al., 2008). DNA methylation in human cells occurs predominantly at the cytosine of a cytosine-guanine pair (CpG dinucleotide), where a methyl group is covalently attached to the carbon 5 position. In the human genome there are approximately 28 million CpG dinucleotides, accounting for 1% of the whole genome. Of these, 60 to 90% are methylated, while the majority of unmethylated sites cluster non-randomly in regions called CpG islands (CGIs). CGIs co-localize to the promoter region of up to 70% of human genes (Illingworth and Bird, 2009). In general, unmethylated CGIs are associated with transcriptionally permissive chromatin and gene expression. During normal development and in certain disease states, particularly in cancer, these CGIs can become methylated, leading to inhibition of transcription factor binding and gene repression.
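As a small illustration of what "CpG dinucleotide" and "CpG-rich" mean in practice, the Python sketch below counts CpG sites in a DNA sequence and computes two summary statistics often used to describe CpG islands: GC content and the observed-to-expected CpG ratio. The function name and the example sequence are hypothetical, and the observed/expected formula is the commonly used one (number of CpGs × sequence length / (number of C × number of G)), which is not taken from this review.

```python
def cpg_stats(sequence):
    """Count CpG dinucleotides and summarize CpG richness of a DNA sequence."""
    seq = sequence.upper()
    length = len(seq)
    n_c = seq.count("C")
    n_g = seq.count("G")
    # "CG" cannot overlap itself, so a non-overlapping count is exact.
    n_cpg = seq.count("CG")
    gc_content = (n_c + n_g) / length if length else 0.0
    # Observed/expected CpG ratio; defined as 0 when C or G is absent.
    obs_exp = (n_cpg * length) / (n_c * n_g) if n_c and n_g else 0.0
    return {
        "length": length,
        "cpg_count": n_cpg,
        "gc_content": gc_content,
        "cpg_obs_exp": obs_exp,
    }


# Hypothetical short sequence, for illustration only.
print(cpg_stats("ATCGCGTTACGGCGCATCG"))
```

Because methylated cytosines are prone to mutation over evolutionary time, bulk genomic DNA tends to show an observed/expected ratio well below 1, whereas CpG islands retain values closer to what base composition alone would predict.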

In addition to DNA methylation (5mC), DNA hydroxymethylation (5hmC) is another essential epigenetic modification in cells. 5-Hydroxymethylcytosine is the primary product of the oxidation of 5-methylcytosine by the ten-eleven translocation (TET) enzymes. In this process, methylated cytosine (5mC) is first oxidized to 5-hydroxymethylcytosine (5hmC), then to 5-formylcytosine (5fC) and to 5-carboxylcytosine (5caC). 5fC and 5caC are removed by thymine DNA glycosylase and replaced with unmethylated cytosine through base excision repair. However, hydroxymethylation is not merely an intermediate of the dynamic demethylation process but a temporarily stable epigenetic modification of DNA in its own right (Globisch et al., 2010). Hydroxymethylated cytosines are enriched at the promoters and enhancers of developmental genes, and they correlate positively with gene expression during cell lineage commitment in early development. In addition, hydroxymethylation is present in the gene bodies of actively transcribed genes (Colquitt et al., 2013; Tsagaratou et al., 2014; Nestor et al., 2016). Hydroxymethylation is less abundant than DNA methylation, and its abundance varies between tissues and cell types. It is relatively abundant in embryonic stem cells (Ito et al., 2010) and in human brain (0.67%), kidney (0.38%), colon (0.45%), rectum (0.57%), and liver (0.46%), but low or very low in human lung, breast and placenta (Li and Liu, 2011). The abundance of hydroxymethylation seems to be inversely correlated with the proliferation rate of a cell (Kriaucionis and Heintz, 2009; Bachman et al., 2014). The dynamic interplay between DNA methylation and hydroxymethylation is presumably important for maintaining normal gene expression patterns in a cell; however, the causes and consequences of an imbalance between these two DNA modifications are still to be understood.

In contrast to DNA methylation and hydroxymethylation, which are set de novo at early embryogenesis and maintained during DNA replication, histone modifications are posttranslational changes. They act to remodel the chromatin structure and regulate gene expression through chromatin accessibility (ENCODE Project Consortium, 2012). Histone modifications are the largest category of chromatin modifications identified so far, with 15 known chemical modifications at more than 130 sites on 5 canonical histones and on around 30 histone variants. Specific histone modification patterns often correlate with known functional genomic elements. For example, H3K9me3 and H3K27me3 are associated with inactive promoters, while H3K4me3 and H3K27ac are enriched in active enhancers and promoters (Karlic et al., 2010; Zhou V.W. et al., 2011). However, the theoretical number of all possible histone-modification combinations is huge, particularly when compared to the extremely limited knowledge of their functional roles.

An additional layer of epigenetic regulation is derived from non-coding RNA (ncRNA), which is transcribed from DNA but not translated into protein. NcRNA ranges from very small 22-nucleotide microRNA molecules (miRNA) to transcripts longer than 200 nucleotides (lncRNA). NcRNAs play a role in translation, splicing, DNA replication and gene regulation, particularly through miRNA-directed downregulation of gene expression (Valencia-Sanchez et al., 2006). NcRNAs are most widely studied in the context of cancer, where they have been identified in the tumor suppressor or oncogenic processes of all major cancers (Anastasiadou et al., 2018). The techniques for measuring ncRNA are similar to other transcriptomic techniques, predominantly involving deep sequencing approaches (Veneziano et al., 2016). In recent years it has become apparent that there is a coordinated interaction between ncRNA and other epigenetic marks, the extent of which is yet to be fully realized (Ferreira and Esteller, 2018). The discovery of more than 100 post-transcriptional modifications to ncRNA, such as methylated adenines and cytosines, is providing further insight into the interaction between these different epigenetic layers (Romano et al., 2018). For the latest advances in ncRNA biology, we refer the reader to the special series in Nature Reviews Genetics, January 1st 2018<sup>1</sup>.

DNA methylation (referring to both 5mC and 5hmC from here on), histone modifications and ncRNA respond to genetic and environmental effects and thereby alter gene expression, providing biological mechanisms for the development of common diseases. Therefore, epigenetic mechanisms are key to understanding disease progression and discovering new treatment targets (Lord and Cruchaga, 2014). As one of the more recent omics fields, epigenomics has experienced rapid growth in the past decade, providing novel insights to many areas of cell biology. Recent developments in microarray technology have made the generation of genome-wide epigenetic data feasible in large populations (Pidsley et al., 2016). As such, epigenome-wide association studies (EWASs) have become an important component of omics-driven approaches to investigate the origin of common human traits and diseases (Lappalainen and Greally, 2017).

Despite the tremendous potential to improve our understanding of disease progression and treatment, epigenetics has yet to become fully utilized in clinical applications. Similar to transcriptomics, epigenetic profiles are continuous, dynamic and tissue-specific. As ever more epigenetic data are generated with advances in high-throughput sequencing and microarray technologies, the challenge now is to develop data analysis approaches that facilitate the identification of coordinated epigenetic changes and the interpretation of their functional consequences in normal development and disease. For example, an effective data annotation protocol is needed for community-driven data standardization to improve the replicability of epigenetic findings (Carter et al., 2017). In particular, the variation in epigenetic profiles at different time points is yet to be established as a reference control in normal populations. Partly due to the lack of appropriate and efficient computational methods, the majority of existing studies focus on a single epigenetic mark in isolation, although interactions among multiple marks and genotypes exist in vivo (Davila-Velderrain et al., 2015). To realize the full potential offered by epigenetics, an interdisciplinary research community is needed to foster effective and robust data integration strategies for combining epigenetics data with other omics data (**Figure 1**).

<sup>1</sup>https://www.nature.com/collections/sqtqxdnvdz

In the following sections we will review the recent advances in computational methods and applications for epigenomic analysis and discovery, ranging from databases and software tools for statistical analysis to data integration techniques for functional annotation. We will start by comparing the common epigenomic profiling technologies, before moving on to data annotation and standardization models. We then provide an overview of various data sources leveraged in epigenetic studies and their applications. We describe statistical and machine learning methods to pinpoint epigenetic modifications driving disease, and provide a list of software tools capable of implementing these methods, as well as databases containing epigenomic and other omics data. This catalog provides a comprehensive and practical resource to build data-driven hypotheses for analyzing the functional consequences of epigenetic marks. Finally, we provide representative examples of profiling epigenetics in disease states and its significance in biomarker and drug discovery.


## EPIGENETIC PROFILING TECHNIQUES

Epigenetic analysis techniques can be broadly classified as typing, involving a small number of loci across many samples, or profiling, which can be extended to epigenome-wide analysis. The end-point measurements from these methods often reflect a proportion or ratio of chromatin with epigenetic marks compared to the total chromatin. Within these categories, various sequencing, microarray and antibody-based methodologies are employed to examine the different aspects of epigenetic regulation, including DNA methylation, chromatin accessibility and histone modifications. Epigenetic data generated from these techniques require different pre-processing steps depending on the methodology employed. For example, array-based DNA methylation analysis requires extensive within- and between-array normalization, preprocessing and integration across platforms (Fortin et al., 2017), while bisulfite sequencing can be processed with a relatively standardized sequence trimming and alignment pipeline (Wreczycka et al., 2017). Further complications include the feasibility of using epigenetic profiles derived from blood as a proxy for other less accessible tissue types (Houseman et al., 2015), and controlling for tumor purity in cancer studies (Zheng et al., 2017). Here, we summarize the most common epigenetic profiling techniques and compare their advantages and limitations (**Tables 1**, **2**).
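For array-based DNA methylation data, the proportion-type end-point described above is commonly expressed as a beta value, with the M-value as its logit-scale counterpart. The following is a minimal sketch in Python, assuming methylated and unmethylated signal intensities as inputs. The function names, the stabilizing offset of 100 and the example intensities are illustrative assumptions, not part of any pipeline cited here.

```python
import math

def beta_value(meth, unmeth, offset=100.0):
    """Proportion of methylated signal; the offset stabilises low-intensity probes."""
    return meth / (meth + unmeth + offset)

def m_value(beta, eps=1e-6):
    """Base-2 logit of a beta value, clipped away from 0 and 1."""
    b = min(max(beta, eps), 1.0 - eps)
    return math.log2(b / (1.0 - b))

# Hypothetical probe intensities for a single CpG site.
beta = beta_value(meth=4500.0, unmeth=400.0)
print(round(beta, 3), round(m_value(beta), 3))
```

Beta values are bounded between 0 and 1 and are easy to interpret, whereas M-values are unbounded and often behave better in linear models.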

### DATA RESOURCES FOR STANDARDIZATION, ANNOTATION, AND HARMONIZATION

Unlike the human genome, the epigenome varies across different cell types and over time. Thanks to recent efforts by big data consortia, such as the Encyclopedia of DNA Elements (ENCODE) (Davis et al., 2018) and the International Human Epigenome Consortium (IHEC) (Bujold et al., 2016), genome-wide epigenetic reference datasets are now publicly available for different cell lineages, tissues, and diseases. Within IHEC, standardized sample preparation and assay protocols have been benchmarked and implemented across multiple centers, and datasets have been collected from seven international consortia including ENCODE, NIH Roadmap (Bernstein et al., 2010), Blueprint (Martens and Stunnenberg, 2013) and others across Europe, North America, and Asia. Furthermore, efficient data portal infrastructure has provided powerful tools for interactive exploration and annotation of the resulting datasets at a genome-wide scale, encompassing over 800 reference epigenomes for different tissues and conditions. Such a community-driven profiling effort has provided rich resources and tools for future epigenetic data mining and functional annotation. More recently, these datasets have been made available via the Human Epigenome Browser (Zhou X. et al., 2011), which provides visualization tools similar to the UCSC Genome Browser (Kent et al., 2002). Here, we list the common data repositories and their visualization tools (**Table 3**).

To facilitate the sharing of epigenomic data between different studies, standardization of sample preparation and assay protocols is required. While there are existing recommendations for reporting the minimal information to annotate omics studies, such as MIAME for gene expression data and MIAPE for proteomics data, a consensus annotation protocol for epigenetics data has yet to be defined. This is partly due to the variety of techniques used for different epigenetic features, each of which requires a distinct experimental protocol to achieve optimal results (Chervitz et al., 2011). To improve data interoperability, comparisons of epigenetic profiling techniques have been initiated by the international consortia. For example, the BLUEPRINT consortium has conducted a systematic comparison of different DNA methylation profiling technologies and reported generally consistent results, whilst also highlighting the higher performance of sequencing-based assays over array-based or antibody-based assays (BLUEPRINT Consortium, 2016). Moreover, informatics approaches such as APIs (Application Programming Interfaces) have been developed to extract data from different repositories in a more efficient manner. One example is the DeepBlue web server, which provides an API for retrieving major epigenetic studies of IHEC (Albrecht et al., 2016). The use of a resource description framework (RDF) such as Bio2RDF has also been proposed to allow knowledge sharing and to facilitate text mining techniques for information retrieval (Jupp et al., 2014).

TABLE 1 | Summary of major profiling techniques for DNA methylation.

TABLE 2 | Summary of major profiling techniques for Chromatin Accessibility and Histone modifications.

TABLE 3 | Epigenetic data repositories and browsers.

### STATISTICAL AND DATA INTEGRATION METHODS FOR INTERROGATING THE EPIGENOME

As is the case in association studies in other fields, EWAS detect epigenetic marks associated with a certain phenotype. Common epigenetic study designs include case-control studies, cross-sectional or longitudinal cohort studies, and family or twin designs. Logistic regression is commonly used for a case vs. control or binomial phenotype design, while linear regression is employed with continuous phenotypes. Technical and biological covariates are added to the regression models to adjust for confounding factors in the data, and methods that control the false discovery rate posed by multiple testing are applied.
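The multiple-testing step mentioned above is often handled with the Benjamini-Hochberg procedure. The sketch below, in plain Python, assumes that per-CpG p-values from the regression models are already available. The function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

pvals = [0.001, 0.04, 0.03, 0.5]  # hypothetical per-CpG p-values
qvals = benjamini_hochberg(pvals)
significant = [i for i, q in enumerate(qvals) if q < 0.05]
print(qvals, significant)
```

In a real EWAS the same function would be applied to hundreds of thousands of p-values, one per tested CpG site.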

The resulting epigenetic profiles can be visualized on appropriate web tools, such as UCSC Genome Browser (Kent et al., 2002), EpiGenome Browser (Zhou X. et al., 2011), or coMET (Martin et al., 2015). While recent advances in epigenetic profiling techniques have made EWAS more cost-efficient and effective, interpreting the results from such epigenomic studies remains a challenge. Without a careful selection of tissues and population samples, many EWAS associations may partly stem from the dynamic and complex nature of the interactions between the different epigenetic layers, or arise from the fact that epigenetic states differ spatially across tissues and cell types as well as during aging. It has therefore been difficult to infer the causality of epigenetic marks among a range of genetic, environmental and stochastic factors. A variety of data integration approaches, such as co-mapping and network analysis, are currently employed to unravel the complexities of these various epigenetic layers and their interaction with other omics datasets (Hasin et al., 2017).

In this section we discuss data integration approaches for the functional annotation of trait-associated epigenetic hits by the use of knowledge bases, by predicting chromatin states, and by establishing associations with gene expression. Alternatively, the genetic basis of DNA methylation marks can be studied using methylation quantitative trait locus (meQTL) analyses, after which computational tools can be used to identify the potential functional variants. The results of robust associations between genetic variants, epigenetic marks and disease traits can be integrated in the framework of causal modeling, with the aim of separating causal epigenetic marks from those that are secondary to disease progression. These likely causal epigenetic marks may be further developed into potential disease biomarkers and drug targets upon experimental investigation, for example using epigenome editing techniques.

## Functional Annotation

#### Pathways

Genes and their regulators do not function in isolation, but are organized into pathways and networks. To obtain a more holistic view of the potential functional implications of the EWAS hits, multiple tools for gene ontology (GO), pathway and network analysis are available for researchers to interpret their findings. For example, GO biological process, molecular function, and cellular component pathways of the EWAS hits can be explored by PANTHER (protein annotation through evolutionary relationship) tools (Mi et al., 2013). Other commonly used tools include Gene Set Enrichment Analysis (GSEA) (Subramanian et al., 2005), where a predefined set of genes represents a pathway collected from multiple databases such as KEGG (Kyoto Encyclopedia of Genes and Genomes) (Kanehisa et al., 2017). The commercial Ingenuity Pathway Analysis (IPA; QIAGEN) can also be used to examine biological networks, functions, and associated diseases (Kramer et al., 2014). In addition to these gene-centered analyses, genome region enrichment analysis has been proposed to infer the functional significance of the epigenetic marks at potential regulatory elements. For example, the LOLA tool can test a non-coding genomic region of interest for overlap with curated region set databases (Sheffield and Bock, 2016). The GREAT tool (Genomic Regions Enrichment of Annotations Tool) associates cis-regulatory regions identified by, e.g., ChIP-seq with biological processes by computing the enrichment scores for a given ontology term of the nearby genes (McLean et al., 2010). As a result, insights into the functional significance of the cis-regulatory regions across the genome are produced.
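The simplest form of pathway enrichment is an over-representation test, which asks whether genes near EWAS hits fall into a pathway more often than expected by chance. The sketch below uses a one-sided hypergeometric test in plain Python. It is a generic illustration rather than the algorithm of PANTHER, GSEA (which is rank-based), LOLA or GREAT, and all counts are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(n_universe, n_in_set, n_hits, n_overlap):
    """P(X >= n_overlap) when n_hits genes are drawn without replacement
    from a universe in which n_in_set genes belong to the pathway."""
    total = comb(n_universe, n_hits)
    upper = min(n_in_set, n_hits)
    tail = sum(
        comb(n_in_set, k) * comb(n_universe - n_in_set, n_hits - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Hypothetical counts: 20,000 annotated genes, a 100-gene pathway,
# 200 genes near EWAS hits, 5 of which are pathway members.
p = hypergeom_enrichment_p(20000, 100, 200, 5)
print(p)
```

The choice of gene universe matters: restricting it to genes covered by the profiling platform avoids inflating the apparent enrichment.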

#### Chromatin States

To infer the chromatin states from epigenetics data, network-based methods such as hidden Markov models (HMMs) have been developed to determine the probability of chromatin states at different genomic regions from the histone modification marks (de Pretis and Pelizzola, 2014). Notably, a widely applied method is ChromHMM, which can efficiently learn the hidden chromatin states based on the distinctive combinatorial and spatial patterns of histone modification marks (Ernst and Kellis, 2017). These data-driven chromatin states are then annotated by their putative functions, such as transcription start sites, enhancers or promoters. Annotating the genome with such predicted chromatin states together with other genomic information may reveal functional elements, particularly for those regions that are in linkage disequilibrium (LD) with disease-associated SNPs. ChromHMM has been implemented in an ENCODE study to integrate 14 epigenetic marks, including histone modifications, transcription factors and chromatin accessibility for 6 human cell types, resulting in 25 chromatin states that are predictive of RNA transcription (Hoffman et al., 2013). The resulting gene regulatory elements mapped by these computational methods from ENCODE and other consortium projects have allowed individual researchers to interrogate and interpret their EWAS findings. Furthermore, computational methods that aim to predict tissue or cell-type specific functional regions have been proposed. For example, the web-based tool eFORGE (experimentally derived Functional element Overlap analysis of ReGions from EWAS) can be used to inform which trait-associated methylation hits are likely functional in a given tissue or cell type. The eFORGE method computes an enrichment score based on the overlap between the CpG sites of interest and DNase I hypersensitive sites (as marks for active chromatin) to predict the functionality of a CpG site in a given cell type, and thus helps prioritize the EWAS results in terms of functional impact (Breeze et al., 2016). Another complementary method, dCMA, is based on differential chromatin modification analysis to identify cell-type specific regulatory elements from ChIP-Seq data (Chen et al., 2013).
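To illustrate how an HMM assigns chromatin states to genomic bins, the sketch below decodes the most probable state path with the Viterbi algorithm, treating each histone mark as an independent presence/absence call per bin. The two states, all probabilities and the input bins are hypothetical. Tools such as ChromHMM learn such parameters from data, which this sketch does not attempt.

```python
import math

STATES = ["active", "repressed"]
MARKS = ["H3K4me3", "H3K27me3"]

# Hypothetical parameters: probability that each mark is present in a bin.
EMIT = {
    "active": {"H3K4me3": 0.8, "H3K27me3": 0.1},
    "repressed": {"H3K4me3": 0.1, "H3K27me3": 0.7},
}
TRANS = {
    "active": {"active": 0.9, "repressed": 0.1},
    "repressed": {"active": 0.1, "repressed": 0.9},
}
START = {"active": 0.5, "repressed": 0.5}

def log_emission(state, obs):
    """Log-probability of a binarised mark vector under independent Bernoullis."""
    lp = 0.0
    for mark, present in zip(MARKS, obs):
        p = EMIT[state][mark]
        lp += math.log(p if present else 1.0 - p)
    return lp

def viterbi(observations):
    """Most probable state path for a sequence of binarised mark vectors."""
    score = {s: math.log(START[s]) + log_emission(s, observations[0]) for s in STATES}
    back = []
    for obs in observations[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda r: score[r] + math.log(TRANS[r][s]))
            new_score[s] = score[prev] + math.log(TRANS[prev][s]) + log_emission(s, obs)
            pointers[s] = prev
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]

# Five consecutive bins; each tuple is (H3K4me3 present, H3K27me3 present).
bins = [(1, 0), (1, 0), (0, 0), (0, 1), (0, 1)]
print(viterbi(bins))
```

The bin with no marks is assigned by its neighbours through the transition probabilities, which is how spatial context enters the state calls.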

#### Gene Expression

The association between epigenetic marks and gene expression has been extensively studied to identify the functional consequences of epigenetic marks identified in an EWAS. This is commonly accomplished by linear regression models with the expression level of a gene as the dependent variable and CpG site methylation or histone modification as the independent variable. Adjusting for biological and technical confounders is also common practice in such models, which can be used to explore how epigenetic marks interact with gene expression throughout the genome. For example, a recent study in human blood cells applied a linear mixed effects model, by which DNA methylation signatures for more than 13k transcripts were defined (Kennedy et al., 2018).
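The effect of covariate adjustment in such models can be shown with ordinary least squares and a single confounder. The sketch below removes the covariate from both expression and methylation before relating them, which gives the same methylation coefficient as fitting expression on methylation plus the covariate. It uses plain Python, simulated toy values and hypothetical function names, and it is a simplification of the mixed-effects models used in the cited work.

```python
def mean(v):
    return sum(v) / len(v)

def simple_fit(x, y):
    """Least-squares intercept and slope for y ~ x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def residuals(x, y):
    """Part of y that a linear fit on x does not explain."""
    intercept, slope = simple_fit(x, y)
    return [b - (intercept + slope * a) for a, b in zip(x, y)]

def adjusted_slope(expression, methylation, covariate):
    """Methylation coefficient in expression ~ methylation + covariate,
    obtained by residualising both variables on the covariate."""
    e_res = residuals(covariate, expression)
    m_res = residuals(covariate, methylation)
    return simple_fit(m_res, e_res)[1]

# Toy data: expression falls with methylation but rises with age,
# and methylation itself increases with age (confounding).
age = [30, 40, 50, 60, 70, 80]
meth = [0.20, 0.35, 0.30, 0.55, 0.50, 0.70]
expr = [2.0 - 3.0 * m + 0.05 * a for m, a in zip(meth, age)]

print(simple_fit(meth, expr)[1])        # unadjusted slope
print(adjusted_slope(expr, meth, age))  # slope after adjusting for age
```

In this toy example the unadjusted slope has the wrong sign, and adjusting for age recovers the negative relationship built into the data.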

While the association between CGI promoter methylation and gene expression is well-established and readily interpretable (Cedar and Bergman, 2012), the regulatory role of DNA methylation outside CGIs, in 'shores' and 'shelves' and throughout gene bodies, is less extensively studied. However, methylation in these regions is potentially more relevant to diseases, as these are the regions that vary the most between tissue types and between cancerous and normal tissue (Irizarry et al., 2009). Unlike promoter methylation, which is associated with gene repression, the association between intragenic methylation and gene expression is bell-shaped, with high methylation associated with moderately expressed genes and low methylation observed in genes with either high or low expression (Jjingo et al., 2012). This complex relationship between DNA methylation and gene expression poses challenges for comprehensively integrating gene expression and DNA methylation data. Public databases such as the Gene Expression Omnibus (GEO<sup>2</sup>) can also be employed to assist in the interpretation of EWAS findings. Causal relationships between DNA methylation and gene expression can be inferred by including genetic data in the models, as discussed in the next two sections.

### Identification of Genetic Drivers of Epigenetic Marks

One of the major objectives in epigenetic studies is to identify SNPs that are associated with DNA methylation marks as meQTLs. In order to demonstrate whether trait-associated DNA methylation is independent of genetic variants influencing methylation, a regression analysis can be conducted using, for example, the R package MatrixEQTL (Shabalin, 2012). Results of meQTL analyses include a ranked list of both short-distance cis and more distal (>1 Mb from the DNA methylation site) trans effects of genetic variants on DNA methylation. Public repositories such as the mQTLdb database (Gaunt et al., 2016) and BIOS QTL browser (Bonder et al., 2017) are invaluable in epigenetic research as they enable the results from large-scale individual studies to be incorporated in subsequent meta-analyses. Recently, meta-databases have been developed to systematically curate, harmonize and integrate meQTL data across different diseases. For example, Pancan-meQTL provides meQTL results for 23 cancer types (Gong et al., 2018). The findings of meQTL analyses can be coupled with eQTL results in interpreting GWAS hits, as demonstrated in a recent study which identified a strong correlation between meQTLs and eQTLs that are shared by common genetic variants from peripheral blood (Pierce et al., 2018). Similar conclusions have been made in a study involving 3,841 Dutch individuals, where disease-associated variants have been found to affect both transcription factor levels and methylation of their binding sites (Bonder et al., 2017).
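At its core, an meQTL scan regresses methylation at each CpG site on genotype dosage at each SNP and labels the pair as cis or trans by genomic distance. The sketch below shows this logic in plain Python. It is not the MatrixEQTL interface; the identifiers, positions and values are hypothetical, and the 1 Mb window follows the cis/trans distinction described above.

```python
def slope_and_r(x, y):
    """Least-squares slope and Pearson correlation of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx, sxy / (sxx * syy) ** 0.5

def scan_meqtl(snps, cpgs, cis_window=1_000_000):
    """Test every SNP-CpG pair and label it cis or trans by distance."""
    results = []
    for snp_id, (snp_chrom, snp_pos, dosage) in snps.items():
        for cpg_id, (cpg_chrom, cpg_pos, beta) in cpgs.items():
            slope, r = slope_and_r(dosage, beta)
            is_cis = snp_chrom == cpg_chrom and abs(snp_pos - cpg_pos) <= cis_window
            results.append((snp_id, cpg_id, "cis" if is_cis else "trans", slope, r))
    # Rank pairs by the strength of association.
    return sorted(results, key=lambda row: -abs(row[4]))

# Hypothetical genotype dosages (0/1/2) and methylation beta values for six samples.
snps = {
    "rs_hypothetical_1": ("chr1", 1_000_500, [0, 1, 2, 0, 1, 2]),
    "rs_hypothetical_2": ("chr2", 500_000, [0, 0, 1, 1, 2, 2]),
}
cpgs = {"cg_hypothetical_1": ("chr1", 1_000_000, [0.30, 0.42, 0.49, 0.31, 0.40, 0.51])}

hits = scan_meqtl(snps, cpgs)
for row in hits:
    print(row)
```

A production analysis would add covariates, compute p-values and exclude monomorphic SNPs, for which the slope above is undefined.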

Integrating epigenetic marks with genotypes can also aid in interpreting the functionality of trait-associated SNPs observed in GWAS. Therefore, computational tools that predict the functions of genetic variants can also be used for annotating the functional consequences of meQTLs. Information generally considered in such prediction tasks includes sequence conservation, population frequency and functional genomics. Approaches such as SIFT (Kumar et al., 2009) and PolyPhen2 (Adzhubei et al., 2013) align human protein sequences to homologous sequences from other organisms to evaluate the impact of missense variants. Such sequence conservation approaches have been extended to identify conserved elements in non-coding regions by PhastCons (Siepel et al., 2005) and GERP (Davydov et al., 2010). In comparison, tools such as VAAST also utilize population frequency information from large consortia, e.g., the 1000 Genomes Project, for variant prioritization. Moreover, machine learning has long been used in the functional annotation of genetic variants (see Holder et al., 2017 for a recent review). For example, the PANTHER method utilizes an HMM to capture the relationship between sequence similarity and functional similarity, based on which the functional impact of a given genetic variant can be predicted (Thomas et al., 2003). As one of the most widely used methods, CADD employed epigenomic information such as genomic regions of DNase I hypersensitivity and histone modifications as predictive features to train a support vector machine to predict the causal variants in the genomic regions (Rentzsch et al., 2019).

### Dissecting Causality by Mendelian Randomization and Causal Networks

While many of the above-mentioned methods help illustrate the various functions of trait-associated epigenetic marks, it is often difficult to distinguish cause from consequence. In addition, the associations are often confounded by other factors. Mendelian randomization (MR) is a special form of causal network modeling, where the causality between a potential risk factor and an outcome can be established by including the genotype data (Tang et al., 2009; Latvala and Ollikainen, 2016). To establish whether an association between an epigenetic mark and a disease outcome is causal, MR utilizes a series of statistical inference rules, which start by identifying an instrumental variable from the trait-associated genetic variants. This genetic instrument must fulfill the following criteria: it must be (1) associated with the exposure, (2) independent of any potential confounders, and (3) associated with the outcome of interest only via its association with the exposure. Because germline genetic variants are established before the onset of disease, reverse causality is not possible. Also, as parental alleles are randomly segregated and assorted to offspring, associations between genetic variation and the outcome of interest are unlikely to be affected by confounding factors. The principles and recent developments in MR are described in detail elsewhere (Davey Smith and Ebrahim, 2003; Davey Smith and Hemani, 2014).

<sup>2</sup>https://www.ncbi.nlm.nih.gov/geo/
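With a single valid instrument, the simplest MR estimator is the Wald ratio: the effect of the variant on the outcome divided by its effect on the exposure. The sketch below assumes summary-level regression coefficients as inputs. The function name and the numerical values are hypothetical, and the standard error is a first-order approximation that ignores uncertainty in the variant-exposure effect.

```python
def wald_ratio(beta_gx, beta_gy, se_gy):
    """Causal effect of exposure X on outcome Y from one genetic instrument G.

    beta_gx: effect of G on the exposure (e.g., methylation at a CpG site)
    beta_gy: effect of G on the outcome
    se_gy:   standard error of beta_gy
    """
    estimate = beta_gy / beta_gx
    # First-order delta-method standard error (treats beta_gx as fixed).
    se = se_gy / abs(beta_gx)
    return estimate, se

# Hypothetical summary statistics for one meQTL used as the instrument.
est, se = wald_ratio(beta_gx=0.25, beta_gy=0.05, se_gy=0.02)
print(est, se)
```

The estimate is only interpretable as causal if the three instrument criteria listed above hold, which cannot be verified from these numbers alone.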

Mendelian randomization has been commonly used in epidemiology, and has recently been applied to infer causality in epigenetics studies as well. Depending on the application, epigenetic marks have been considered as either the exposure or the outcome of interest in the MR model. For example, Relton and Davey Smith provided a two-step MR framework to select the instrumental variables for both the risk factors and the DNA methylation marks, so that the causality cascade from the risk factors to the disease outcome can be established (Relton and Davey Smith, 2012). Such a two-step MR framework has been recently applied to study the causal roles of DNA methylation between smoking and inflammation (Jhun et al., 2017). On the other hand, a similar stepwise MR framework has been applied to distinguish causal effects from associations between blood lipid levels and DNA methylation, where the blood lipid levels were considered as the risk factor affecting DNA methylation of white blood cells (Dekkers et al., 2016). More recently, a systematic MR study used multiple steps, with meQTLs as the instrumental variables, to investigate the causal effect of DNA methylation on a large variety of disease traits (Richardson et al., 2018). As a validation, the majority of the candidate loci were known to affect gene expression and DNA methylation, which supported the validity of MR as a data-driven approach to generate plausible biological hypotheses that warrant further experimental investigation. The basic version of MR involves bivariate analysis, which can be extended to causal network inference that tests multiple instrumental variables in relation to different risk factors and disease outcomes. For example, the joint likelihood method (JLIM) tests whether two risk factors share the same causal genetic variants by evaluating the similarity of LD patterns between the SNPs, which is a form of co-localization method (Chun et al., 2017). Other co-localization methods include HEIDI (heterogeneity in dependent instrument) (Zhu et al., 2016) and coloc (Giambartolomei et al., 2014), both of which require only summary-level data. More recently, a method called GSMR leveraged multiple SNPs as instrumental variables to test for causality between risk factors and common diseases (Zhu et al., 2018).
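When several independent SNPs are available as instruments, their individual ratio estimates are often combined by inverse-variance weighting (IVW). The sketch below implements the fixed-effect IVW estimator from summary statistics in plain Python. It is a generic illustration and not the GSMR method itself, which addresses further issues that this sketch ignores. All input values are hypothetical.

```python
import math

def ivw_estimate(beta_gx, beta_gy, se_gy):
    """Fixed-effect inverse-variance-weighted MR estimate across independent SNPs."""
    # Each SNP's ratio estimate is weighted by the inverse of its approximate variance.
    weights = [bx ** 2 / s ** 2 for bx, s in zip(beta_gx, se_gy)]
    ratios = [by / bx for bx, by in zip(beta_gx, beta_gy)]
    estimate = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return estimate, se

# Hypothetical SNP-exposure effects, SNP-outcome effects and outcome standard errors.
beta_gx = [0.10, 0.20, 0.15]
beta_gy = [0.05, 0.10, 0.075]
se_gy = [0.01, 0.02, 0.01]

ivw, ivw_se = ivw_estimate(beta_gx, beta_gy, se_gy)
print(ivw, ivw_se)
```

Marked disagreement among the per-SNP ratios would suggest that some instruments violate the MR assumptions, for example through pleiotropy.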

Alternatives for causal modeling include causal mediation analysis, which employs a series of hypothesis tests on the conditional independence among genetic variants, exposure, and disease traits (Millstein et al., 2009). Mediation analysis infers how much of the causal effect of an exposure on a disease outcome is transmitted indirectly through a mediator, while MR focuses on the direct causal effect of the exposure on the disease outcome using a genetic variant as the proxy (Richmond et al., 2016a). A model-based causal mediation approach is available in the mediation R package (Imai et al., 2010), which has been applied in a recent study to identify nine potential epigenetic CpG sites that may mediate the effect of prenatal famine exposure on adult body mass index (BMI), serum triglycerides, and glucose levels. Notably, these CpG sites were all located at regulatory regions which are linked to the expression of growth, differentiation, and metabolism-related genes (Tobi et al., 2018).
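For linear models, the split between indirect and direct effects can be computed with the product-of-coefficients approach. The sketch below does this in plain Python for one exposure, one mediator (for instance, methylation at a CpG site) and one outcome. It is a simplified illustration with simulated values and hypothetical names, and it does not reproduce the simulation-based inference of the mediation R package.

```python
def centered_sum(u, v):
    """Sum of products of deviations from the mean."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))

def mediation_effects(exposure, mediator, outcome):
    """Linear product-of-coefficients decomposition of an exposure effect."""
    sxx = centered_sum(exposure, exposure)
    smm = centered_sum(mediator, mediator)
    sxm = centered_sum(exposure, mediator)
    sxy = centered_sum(exposure, outcome)
    smy = centered_sum(mediator, outcome)
    det = sxx * smm - sxm ** 2
    a = sxm / sxx                            # mediator ~ exposure
    b = (sxx * smy - sxm * sxy) / det        # mediator term in outcome ~ exposure + mediator
    direct = (smm * sxy - sxm * smy) / det   # exposure term in the same model
    return {"indirect": a * b, "direct": direct, "total": sxy / sxx}

# Simulated toy data: binary exposure, CpG methylation as mediator.
exposure = [0, 0, 0, 1, 1, 1, 0, 1]
mediator = [0.30, 0.35, 0.28, 0.45, 0.50, 0.41, 0.33, 0.48]
outcome = [1.0 + 0.5 * x + 2.0 * m for x, m in zip(exposure, mediator)]

effects = mediation_effects(exposure, mediator, outcome)
print(effects)
```

In ordinary least squares the total effect equals the sum of the direct and indirect effects, so the proportion mediated follows directly from the returned values.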

From a model selection perspective, both causal mediation analysis and MR can be considered as special cases of causal network modeling, which compares the likelihoods of multiple competing models of causality (e.g., a reverse causality model or a confounding effect model) (Burgess et al., 2015). These different statistical frameworks for testing the causality of epigenetic marks are useful tools. However, it is never possible to definitively prove causality based on these methods alone. Instead, any negative or positive findings should be interpreted with caution, should be supported by multiple independent approaches with different assumptions and by sensitivity analyses of measurement error, and should finally be checked against the available biological knowledge and experimental validation (Hemani et al., 2017; Yarmolinsky et al., 2018).

### INTEGRATIVE APPROACHES TO UNDERSTAND THE ROLE OF EPIGENETICS IN COMPLEX TRAITS

To date, tens of thousands of genetic variants have been associated with human complex traits via GWAS. Based on the findings of twin studies, these diseases and traits are, on average, 50% heritable (Polderman et al., 2015). To better explain the functions of the genetic variants, the field of epigenetics has been actively researched. Next, we will describe a few representative case studies in obesity and cancer, where the integration of genetic, epigenetic, and transcriptomic data has been a key component in understanding the disease etiology and progression. The information gained from such studies can then help inform future diagnostic biomarker and treatment strategies.

#### Obesity and Associated Traits

Numerous EWASs have shown that BMI and obesity are associated with widespread changes in DNA methylation, most often profiled using Illumina 450K or EPIC arrays (Dick et al., 2014; Ollikainen et al., 2015; Pietilainen et al., 2016; Mendelson et al., 2017; Wahl et al., 2017; Davis et al., 2018; Dhana et al., 2018). Most of the findings are tissue-specific, or shared by a few tissue types (Dick et al., 2014; Wahl et al., 2017), with some hits replicated between studies, while others appear to be more study- or population-specific. Many of the observed DNA methylation hits are at or near genes that have previously been related to BMI or obesity traits by genetic association, while others may reflect novel genes and pathways involved in the regulation of adiposity or obesity-related diseases (Ollikainen et al., 2015; Mendelson et al., 2017; Wahl et al., 2017).

Integration of DNA methylation data with predicted chromatin states from ENCODE data has revealed that the genomic regions associated with obesity by DNA methylation are often enriched for regulatory features (Ollikainen et al., 2015; Wahl et al., 2017). Potential functional consequences of the observed methylation alterations have been tested by correlating DNA methylation with gene expression of the nearby genes, and concomitant changes in DNA methylation and gene expression have been observed in many obesity-relevant genes. Integration of DNA methylation with genotype data (as meQTLs) has been used to annotate GWAS hits, and to identify novel candidate obesity-associated genes. For example, meQTLs at KLF13 (Koh et al., 2017) and MC4R (Tang et al., 2017) have been shown to associate with childhood obesity. In addition to identification of meQTLs, integration of genotypes and DNA methylation can be used to infer causality in the observed associations, for example by MR-based approaches. These analyses have shown that the observed associations are predominantly the consequence of high BMI or obesity-related metabolic outcomes (Dick et al., 2014; Ollikainen et al., 2015; Richmond et al., 2016b; Wahl et al., 2017). However, NFATC2IP and SREBF1 methylation have been shown to have potential causal associations with BMI (Mendelson et al., 2017; Wahl et al., 2017). Finally, some studies have shown that disturbances in DNA methylation predict future development of type 2 diabetes (Wahl et al., 2017) and coronary heart disease (Hedman et al., 2017), and that DNA methylation could be used to distinguish metabolically unhealthy from healthy obesity (Ollikainen et al., 2015; Wahl et al., 2017). To enable early detection of individuals with increased risk for metabolic complications, further studies are needed to thoroughly examine whether DNA methylation could serve as a biomarker for metabolically unhealthy obesity.

Taken together, results from multiple epigenetic studies using data integration approaches in obesity and related traits may provide new insights into the biological pathways influenced by adiposity. Although most of the epigenetic changes are consequences of obesity or related traits, a few appear to have a causal role. Identification of causal hits is critical not only for understanding the biological mechanisms in the development of obesity and metabolic disturbances, but also for developing novel and effective prevention and treatment strategies that target the underlying mechanisms. However, the cross-sectional nature of most of the analyzed data sets limits definitive causal determination. In addition, the marks that are caused by obesity can be considered potential biomarkers of obesity or related metabolic disturbances. These may enable the development of new strategies for prediction and prevention of the adverse metabolic consequences of obesity.

#### Cancer

Although cancer has traditionally been perceived as a genetic disease, epigenetic mechanisms have increasingly been found to contribute to many hallmarks of cancer (Flavahan et al., 2017). Epigenetic alterations have been shown to be responsible for the activation of oncogenes or the inactivation of tumor suppressors (Kagohara et al., 2018). Numerous recent cancer epigenetics studies have demonstrated that data integration not only enables a more detailed understanding of disease mechanisms at the molecular level, but also offers novel insights into improved approaches for disease diagnostics, treatment, and management. For example, The Cancer Genome Atlas (TCGA) project has produced DNA methylation data for over 10,000 cancer samples (Hoadley et al., 2014). Here, we highlight a few representative cancer epigenetic studies in which a combination of multiple data analysis methods has been applied.

One case study implemented genome-wide chromatin accessibility profiling for chronic lymphocytic leukemia (CLL) patient samples using ChIPmentation and RNA-seq profiling (Rendeiro et al., 2016). Using a Random Forest machine learning method (Rahman et al., 2017), it was found that epigenetic profiles can accurately predict the IGHV mutation status. Furthermore, common and constitutively accessible regions, as well as regions with higher inter-individual variability, were also found. Similar studies were done using reduced representation bisulfite sequencing (RRBS) for Ewing sarcoma, a rare cancer that is known to be caused by the EWS-FLI1 fusion gene. Despite the common genetic background, substantial DNA methylation differences between and within cancers were found (Sheffield et al., 2017). Notably, several computational tools were developed in this study. For example, a MIRA score was derived to transform the epigenetic state of a given genomic region into the degree of regulatory activity. Moreover, the intra-tumor heterogeneity was measured using the PIM (proportion of sites with intermediate methylation) and PDR (proportion of discordant reads) scores, which can capture the cell-to-cell heterogeneity and the epigenetic instability within the tumor cells separately. The PIM score was then used to predict the metastatic state of a patient-derived sample using a logistic regression model.
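The two heterogeneity scores named above can be sketched directly from their definitions. This is an illustrative sketch only and not the implementation of Sheffield et al. (2017): the intermediate-methylation window (0.2 to 0.8), the minimum of four CpGs per read, and the encoding of reads as strings of `1` (methylated) and `0` (unmethylated) are all assumptions made here for clarity.

```python
def pim(site_methylation, low=0.2, high=0.8):
    """Proportion of CpG sites with intermediate methylation.

    site_methylation: per-site methylation levels in [0, 1].
    low/high: assumed bounds of the 'intermediate' window.
    """
    if not site_methylation:
        raise ValueError("At least one CpG site is required.")
    intermediate = sum(1 for level in site_methylation if low < level < high)
    return intermediate / len(site_methylation)


def pdr(reads, min_cpgs=4):
    """Proportion of discordant reads.

    reads: one string per read, with '1' for a methylated CpG and '0' for an
    unmethylated CpG. A read is discordant if it contains both states.
    Reads covering fewer than min_cpgs CpGs are ignored (assumed cutoff).
    """
    informative = [read for read in reads if len(read) >= min_cpgs]
    if not informative:
        raise ValueError("No read covers enough CpGs.")
    discordant = sum(1 for read in informative if "0" in read and "1" in read)
    return discordant / len(informative)


# Toy data, invented for illustration.
sites = [0.05, 0.50, 0.95, 0.40]
reads = ["1111", "0000", "1101", "10", "0010"]
pim_score = pim(sites)   # two of four sites fall inside the window
pdr_score = pdr(reads)   # two of four informative reads are discordant
print(pim_score, pdr_score)
```

A score such as `pim_score` could then be passed as a single feature to a logistic regression model, which is the kind of downstream use the text describes.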

Another study focused on triple-negative breast cancer (TNBC) by jointly contrasting the transcriptomic and epigenetic profiles of cancer stem cells (CSCs) versus non-cancer stem cells (NCSCs) (Li et al., 2018). Differentially expressed genes between CSCs and NCSCs were first identified by preprocessing the RNA-Seq data with tools including HTSeq (Anders et al., 2015) and samtools (Li et al., 2009), followed by differential analyses using R packages including DEGSeq (Wang et al., 2010). Subsequently, the functional significance of cis-regulatory regions was analyzed with GREAT (McLean et al., 2010) to identify significantly disrupted signaling pathways. Furthermore, patterns of differential DNA methylation and histone modifications were analyzed. By performing a WGBS analysis, differentially methylated CpG sites in promoter regions [defined around genes' transcription start sites (TSSs)] were identified using the methylKit R package (Akalin et al., 2012) and PeakAnalyzer (Salmon-Divon et al., 2010). In parallel, histone modifications were analyzed using ChIP-seq to determine and visualize different binding sites of antibodies specific to H3K4me2 (considered a permissive mark for transcription) and H3K27me3 (a transcriptional silencer), using the R packages DiffBind (Ross-Innes et al., 2012) and seqMINER (Zhan and Liu, 2015). As a result, the repressive mark H3K27me3 appeared to contribute more to the tumor-promoting tendencies of CSCs, notably by affecting the melanogenesis, Wnt, and GnRH pathways, all of which are known to be involved in cellular proliferation and self-renewal and thereby contribute to the typical characteristics of chemo- or radiotherapy resistance.

In a study conducted on epithelial ovarian cancer (EOC), the integrated analysis of genetic (GWAS), expression (proteomic) and epigenetic (DNA methylation) data permitted the identification of a novel subtype-specific susceptibility gene for the malignancy (Shen et al., 2013). As a first step, a GWAS for ovarian cancer (consisting of 43 smaller studies and a total of more than 16,000 EOC patients) identified various HNF1B SNPs for the serous and the clear cell subtypes of EOC. Specifically, while rs7405776 [minor allele frequency (MAF) = 36%] was the SNP most strongly associated with serous EOC and conferred an increased risk of 13% per minor allele, rs11651755 (MAF = 45%) was strongly associated with the clear cell subtype of EOC and decreased the malignancy risk by 23% at genome-wide significance. This detection of HNF1B as a risk gene encouraged a more detailed evaluation of its promoter methylation profiles and its protein expression levels. Epigenetic silencing of HNF1B by DNA methylation was confirmed in half of the cases in the TCGA data, which included 576 primary serous EOC samples. To follow up on the functional effects of the retained DNA methylation, a third cohort of 1149 EOC samples from the Ovarian Tumor Tissue Analysis (OTTA) Consortium (Bolton et al., 2012) was assessed. DNA methylation analysis was also performed on 254 serous cases and 17 clear cell cases from those samples, using the Illumina 450K assay, with plate normalization using a linear model on the logit-transformed beta values. The correlation between gene expression and methylation was in line with the previous hypotheses, revealing high HNF1B expression and absence of promoter methylation in most of the clear cell EOC samples, while the majority of serous samples displayed high promoter methylation and stained negative for HNF1B in the IHC assay. Such an integrated analysis involving multiple omics data provides strong evidence that different genetic or epigenetic variations within the HNF1B gene can predispose to different histological variants of EOC, and that those variations could potentially be used as diagnostic tools for ovarian tumors.
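The plate normalization on logit-transformed beta values mentioned above can be sketched as follows. The exact model used in the cited study is not specified here, so this sketch assumes the simplest case, in which plate is the only covariate; the residuals of such a linear model equal the values minus their plate mean, and the grand mean is added back to keep the original scale. The function names, the clipping constant `eps`, and the toy data are assumptions introduced for illustration.

```python
import math
from collections import defaultdict


def logit2(beta, eps=1e-6):
    """Base-2 logit of a methylation beta value (often called an M-value).

    Betas are clipped to [eps, 1 - eps] to avoid infinite values at 0 or 1.
    """
    clipped = min(max(beta, eps), 1.0 - eps)
    return math.log2(clipped / (1.0 - clipped))


def plate_normalize(betas, plates):
    """Remove plate-level shifts from logit-transformed betas for one CpG.

    Equivalent to taking residuals from a linear model with plate as the only
    categorical covariate, then adding back the grand mean.
    """
    m_values = [logit2(beta) for beta in betas]
    grand_mean = sum(m_values) / len(m_values)
    by_plate = defaultdict(list)
    for value, plate in zip(m_values, plates):
        by_plate[plate].append(value)
    plate_mean = {plate: sum(vals) / len(vals) for plate, vals in by_plate.items()}
    return [value - plate_mean[plate] + grand_mean
            for value, plate in zip(m_values, plates)]


# Toy example: plate "B" has systematically higher methylation than plate "A".
betas = [0.2, 0.4, 0.6, 0.8]
plates = ["A", "A", "B", "B"]
normalized = plate_normalize(betas, plates)
print([round(value, 3) for value in normalized])
```

In a real analysis the model would usually include biological covariates as well, so that only the technical plate effect is removed; with a design as confounded as this toy example, plate correction would also remove any true group difference.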

### Epigenetics Biomarker and Drug Discovery

Once its functional role in the disease etiology has been validated, an epigenetic mark can be further developed as a diagnostic biomarker or a drug target. By definition, a biomarker is any characteristic that can be quantified and evaluated as an indicator of normal or pathogenic biological processes, or as a measure of response to some form of treatment. Biomarkers can take a wide variety of forms, including (but not limited to) genomic modifications, RNA transcripts, proteins, and/or epigenetic alterations (Costa-Pinheiro et al., 2015). Ideally, a suitable biomarker is highly accurate and can be obtained in a minimally invasive or non-invasive manner, so that it can be used for screening and detection, diagnosis and prognostication, risk assessment, and/or the prediction of response to therapy. Accordingly, epigenetic changes are considered among the most promising classes of cancer biomarkers, owing to their stability, potential reversibility, and ease of access. A few epigenetic biomarkers have been approved for non-invasive cancer diagnosis. For example, Cologuard has become the first FDA-approved test for colorectal cancer (CRC) that involves testing DNA methylation levels at BMP3 and NDRG4, together with KRAS mutation status and hemoglobin. More recently, the FDA has approved a blood-based screening test for CRC called Epi proColon. The test measures the DNA methylation level of SEPT9, a gene that has been found to be hypermethylated in the promoter region (Issa and Noureddine, 2017).

Currently, a rich set of epigenetic biomarkers, including noncoding RNA expression levels, aberrant methylation patterns, and histone-modifying enzyme levels, is being tested in preclinical and clinical settings. For example, a urine-based epigenetic test on the DNA methylation of three genes (TWIST1, ONECUT2, and OTX1) in bladder cancer has achieved superior accuracy and has now progressed to a larger validation study (Velazquez, 2018). Other potential epigenetic biomarkers include SHOX2 for lung cancer and BRCA1 for breast and ovarian cancers (Fece de la Cruz and Corcoran, 2018). To leverage the existing cancer samples in the TCGA, a recent study developed a pan-cancer bisulfite sequencing assay to measure the methylation status of 9,223 CpG sites in plasma cell-free DNA in 34 major cancer types (Liu et al., 2018). The derived methylation signatures were then used for training cancer type-specific classifiers, each of which consisted of a unique set of CpG sites. The resulting classifier was used to predict the cancer type for a given sample, based solely on its methylation signature, demonstrating the feasibility of using genome-wide epigenetic profiles for cancer diagnosis. In contrast, the development of epigenetic biomarkers in other disease areas is still at a relatively early stage, with a few links being made for diabetes (Bacos et al., 2016) and schizophrenia (Rodrigues-Amorim et al., 2017).
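The idea of predicting a cancer type from a methylation signature alone can be illustrated with a deliberately simple nearest-centroid classifier. This is not the classifier of Liu et al. (2018), whose details are not given here; it is a swapped-in, minimal technique, and the labels `type_A` and `type_B`, the three-CpG signatures, and all values are hypothetical.

```python
import math


def train_centroids(samples, labels):
    """Average the methylation vectors of each class into one centroid."""
    sums, counts = {}, {}
    for vector, label in zip(samples, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vector)
            counts[label] = 0
        sums[label] = [total + value for total, value in zip(sums[label], vector)]
        counts[label] += 1
    return {label: [total / counts[label] for total in sums[label]]
            for label in sums}


def predict(centroids, vector):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], vector))


# Hypothetical training data: methylation levels at three CpG sites per sample.
samples = [
    [0.9, 0.1, 0.8],
    [0.8, 0.2, 0.9],
    [0.1, 0.9, 0.2],
    [0.2, 0.8, 0.1],
]
labels = ["type_A", "type_A", "type_B", "type_B"]

centroids = train_centroids(samples, labels)
prediction = predict(centroids, [0.7, 0.3, 0.6])
print(prediction)
```

With thousands of CpG sites and dozens of cancer types, a practical classifier would also need feature selection and held-out validation, which this sketch omits.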

In epigenetic drug discovery, histone post-translational modifications (PTMs) have been pursued as a major strategy, as they constitute one of the most immediate contributors to epigenetic regulation. The PTM-affecting enzymes can be classified into three distinct functional classes, namely writers, erasers and readers, which have been pursued as targets for epigenetic drugs (Hyun et al., 2017). For example, cancer epigenetic therapy has focused on the development of targeted histone deacetylase (HDAC) inhibitors and DNA methyltransferase (DNMT) inhibitors. HDAC inhibitors increase histone acetylation, leading to higher expression of certain genes involved in apoptosis and the cell cycle, while DNMT inhibitors re-activate tumor suppressor genes. The use of HDAC inhibitors (e.g., vorinostat, belinostat, panobinostat, and romidepsin) and DNMT inhibitors (e.g., azacitidine and decitabine) has been approved for hematological malignancies. Furthermore, combinations of HDAC and DNMT inhibitors have shown synergistic interactions in a variety of cancer cell lines (Brocks et al., 2017).

In addition, overexpression and increased activity of histone methyltransferases (HMTs) have been reported in a variety of cancers, notably acting via the silencing of essential tumor suppressors (Bracken et al., 2003; Kim and Roberts, 2016). Consequently, HMT inhibitors such as tazemetostat and CPI-1205 have found their way to clinical development. It is unlikely that any single drug targeting epigenetic modifications is capable of curing a malignancy on its own. Combination with other such drugs or with standard chemotherapeutic approaches offers the most promising prospects. For example, DNMT and HDAC inhibitors are thought to open up the chromatin conformation, thus rendering DNA more accessible to chemotherapy and thereby more susceptible to damage by it. This observation has been validated by the successful combinations of azacitidine and low-dose cytarabine for AML (Radujkovic et al., 2014), and of vorinostat with carboplatin or paclitaxel in non-small cell lung cancer (Owonikoko et al., 2010).

Other epigenetic modifiers that target the downstream proteins have also sparked interest. For example, the family of bromodomain-containing proteins known as BETs has been implicated in chromatin remodeling and transcriptional activity in a variety of diseases including inflammation, viral infection and cancer (Ferri et al., 2016). Furthermore, BET inhibition has been shown to decrease MYC expression and to restore normal cellular functions in a variety of cancers including hematological malignancies and solid tumors (Wang and Filippakopoulos, 2015). The first potent and selective BET inhibitor is the thienotriazolo-1,4-diazepine known as the positive enantiomer (+) of JQ1. Other BET inhibitors include I-BET762, which is currently being investigated in several ongoing clinical trials for different cancers (Andrieu et al., 2016).

#### Pharmacoepigenetics

Due to the lack of full annotations on drug-induced epigenetic changes, the exact mode of action of epigenetic drugs in different cancer cells remains largely unknown, which partly explains the individual variation in clinical response (Treppendahl et al., 2014). On the other hand, it has been shown that many common drugs also induce epigenetic changes via direct interaction with the PTM-affecting enzymes or via the downstream drug signaling pathways (Lotsch et al., 2013). These epigenetic changes may contribute to both the therapeutic and the adverse effects of the compounds, which are also mediated by the patient's individual genetic background, e.g., of drug-metabolizing enzymes and transporters. Only recently has the concept of pharmacoepigenetics started to emerge, aiming at the study of epigenetic mechanisms to explain the interindividual variability in drug responses (Majchrzak-Celińska and Baer-Dubowska, 2017; Lauschke et al., 2018). The epigenetic regulators of drug responses have often been linked to ADME (drug absorption, distribution, metabolism, and excretion) genes. For example, many genes in the cytochrome P450 family are reported to be directly or indirectly regulated by miRNAs (Kim et al., 2014). Hypomethylation of the ABCB1 promoter region has been shown to increase the gene's expression in cancer cells, leading to acquired drug resistance (Reed et al., 2010). Research in this field may eventually lead to the development of ADME-related biomarkers for the stratification of patients into different treatment groups. In addition, epigenetic biomarkers that are not linked to ADME genes have also been reported, although the exact mechanisms remain largely undetermined. In breast cancer, for example, the quantification of PSAT1 DNA methylation is used to predict tamoxifen response (Martens et al., 2005; De Marchi et al., 2017), whereas that of BRCA1/2 (similarly to somatic mutations in those genes) is indicative of response to PARP inhibitors (Martens et al., 2005). Similarly, hypermethylation of MGMT and MLH1 correlates with increased response to 5-FU treatment and improved survival in CRC (Nagasaka et al., 2003; Jensen et al., 2013). Notably, a recent clinical study discovered a DNA methylation signature that predicts the response to anti-programmed death-1 (PD-1) treatment in advanced non-small-cell lung cancer (Duruisseaux et al., 2018). Another clinical study, called Genetic and Environmental Determinants of Triglycerides (GOLDN), measured the genetic and epigenetic profiles for metabolic syndrome using a family-based design (Aslibekyan et al., 2018). In this study, epigenetic profiling was performed before and after fenofibrate treatment, allowing the characterization of genotype and DNA methylation to understand the variability in the drug treatment response. Although potential biomarkers have been found in these recent advances, a systematic strategy to predict and understand the epigenome-wide interactions mediating drug responses is still lacking. We anticipate that the data integration methods summarized in the previous sections, which are capable of annotating the epigenome from a pharmacological and pharmacokinetic perspective, will provide a valuable source of information to inform personalized treatment decisions.

### CONCLUSION

Understanding epigenomic regulation is critical for dissecting gene–environment interactions in both normal development and disease. The fact that epigenetic profiles are plastic and reversible holds great promise for developing epigenetic biomarkers and drug targets. Furthermore, epigenetics captures the spatial and temporal variation on top of each individual's unique genome, and thus better informs decision-making in personalized medicine. Recent developments have made chromatin accessibility profiling more cost-effective by requiring only a small number of cells as input, demonstrating its clinical potential for disease monitoring (Buenrostro et al., 2015). On the other hand, biobanks have made large-scale clinical samples accessible and often provide functionality to share the accumulating raw data and molecular profiles, similar to the concept of the European Genome-phenome Archive (EGA) (Lappalainen et al., 2015). Although individual epigenetic marks are often studied in isolation, an understanding of how the putative gene regulatory mechanisms occur will not be achieved without efficient tools to design, analyze, integrate, and interpret the versatile epigenetic features. To facilitate the systematic characterization of cells in a specific context, other omics data such as transcriptomics and metabolomics may also provide complementary information to explain the interplay of gene–environment interactions. Further development of data integration tools should make it possible to prioritize more efficiently the robust epigenetic modifications that are susceptible to environmental exposures and causal to specific diseases, so that specifically targeted compounds can be developed. Furthermore, despite the advances in these computational methods, one ultimately needs to resort to experimental approaches to confirm the hypotheses. The recent development of CRISPR-Cas9 and other genome editing tools may provide an efficient way to induce epigenetic alterations without changing the DNA sequence, so that novel drug targets and disease biomarkers may be identified more efficiently (Liao et al., 2017).

#### AUTHOR CONTRIBUTIONS

EC, CH, MO, and JT conceived the study. All authors participated in the writing of the manuscript.

#### FUNDING

This work has been supported by the European Research Council (ERC) starting grant agreement no. 716063 DrugComb to JT, Academy of Finland (Grant Nos. 317680 to JT, 297908 to MO), Sigrid Juselius Foundation to MO and University of Helsinki Research Funds to MO.

#### REFERENCES

Kim, K. H., and Roberts, C. W. (2016). Targeting EZH2 in cancer. Nat. Med. 22, 128–134. doi: 10.1038/nm.4036

disease: a mendelian randomization approach. PLoS Med. 14:e1002215. doi: 10.1371/journal.pmed.1002215



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Cazaly, Saad, Wang, Heckman, Ollikainen and Tang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Bioactive Ingredients in Chinese Herbal Medicines That Target Non-coding RNAs: Promising New Choices for Disease Treatment

Yan Dong† , Hengwen Chen† , Jialiang Gao, Yongmei Liu, Jun Li\* and Jie Wang\*

*Department of Cardiology, Guang'anmen Hospital, China Academy of Chinese Medical Sciences, Beijing, China*

#### Edited by:

*Chandravanu Dash, Meharry Medical College, United States*

#### Reviewed by:

*Zijian Zhang, Texas Tech University Health Sciences Center El Paso, United States Yong Xu, First Hospital of Shanxi Medical University, China*

#### \*Correspondence:

*Jun Li gamyylj@163.com Jie Wang wangjie0230@126.com*

*†These authors have contributed equally to this work*

#### Specialty section:

*This article was submitted to Pharmacogenetics and Pharmacogenomics, a section of the journal Frontiers in Pharmacology*

Received: *13 January 2019* Accepted: *24 April 2019* Published: *21 May 2019*

#### Citation:

*Dong Y, Chen H, Gao J, Liu Y, Li J and Wang J (2019) Bioactive Ingredients in Chinese Herbal Medicines That Target Non-coding RNAs: Promising New Choices for Disease Treatment. Front. Pharmacol. 10:515. doi: 10.3389/fphar.2019.00515*

Chinese herbal medicines (CHMs) are widely used in China and have long been a powerful method to treat diseases in Chinese people. Bioactive ingredients are the main components extracted from herbs that have therapeutic properties. Since artemisinin was discovered to inhibit malaria by Nobel laureate Youyou Tu, extracts from natural plants, particularly bioactive ingredients, have aroused increasing attention among medical researchers. The bioactive ingredients of some CHMs have been found to target various non-coding RNA molecules (ncRNAs), especially miRNAs, lncRNAs, and circRNAs, which have emerged as new treatment targets in numerous diseases. Here we review the evidence that, by regulating the expression of ncRNAs, these ingredients exert protective effects, including pro-apoptosis, anti-proliferation and anti-migration, anti-inflammation, anti-atherosclerosis, anti-infection, anti-senescence, and suppression of structural remodeling. Consequently, they have potential as treatment agents in diseases such as cancer, cardiovascular disease, nervous system disease, inflammatory bowel disease, asthma, infectious diseases, and senescence-related diseases. Although research has been relatively limited and inadequate to date, the promising choices and new alternatives offered by bioactive ingredients for the treatment of the above diseases warrant serious investigation.

Keywords: bioactive ingredient, Chinese herbal medicine, traditional Chinese medicine, ncRNA, therapeutic target

## INTRODUCTION

Chinese herbal medicines (CHMs) were the main treatment method used in ancient times by the Chinese to combat disease. As early as the Qin and Han Dynasties (around 221 BCE to 220 CE), Shen Nong's Herbal Classic recorded 365 medicines. By the time of the Ming Dynasty (1368–1644), the number of CHMs listed in the Compendium of Materia Medica had increased to 1892. Most herbal medicines in such publications have been used constantly throughout medical history and are still applied in practice today. For example, according to Shen Nong's Herbal Classic, Coptis chinensis Franch. was found to relieve abdominal pain and diarrhea, and this herb is still widely used in China for the treatment of diarrhea or dysentery. Further, Panax notoginseng (Burk.) F. H. Chen, a traditional herb initially used to stop bleeding, promote blood circulation and ease pain, was recorded in the Compendium of Materia Medica. It is now commonly used in cases of trauma and of cardiovascular and cerebrovascular diseases. A recent meta-analysis further demonstrated that several P. notoginseng preparations are beneficial for patients with unstable angina pectoris (Song et al., 2017).

Despite the positive effects of CHMs, little is known about their effective constituents, bioactive ingredients, and mechanisms of action. Therefore, in addition to the clinical applications mentioned in classic texts, understanding the specific active ingredients and clarifying the mechanisms of action of these compounds would facilitate the improved application of CHMs. The discovery of the drug artemisinin best illustrates the importance of CHM to the world (Tu, 2016), inspiring the notion that, through study of their bioactive ingredients, CHMs can help people around the world to conquer life-threatening diseases.

Non-coding RNA molecules (ncRNAs), which mainly comprise miRNA, lncRNA, and circRNA, do not encode proteins; however, as the most abundant class of RNA (at least 90%) (Sana et al., 2012), ncRNAs have important functions in gene regulation and are involved in pathological processes contributing to many diseases (Batista and Chang, 2013; Memczak et al., 2013; Zhang et al., 2018a), particularly cancer, and cardiovascular and nervous system diseases. Moreover, circRNA and lncRNA act as competitive endogenous RNAs (ceRNA), which are natural miRNA sponges that influence miRNA-induced gene silencing via miRNA response elements (Tay et al., 2014). Thus, complex regulatory networks exist, comprising circRNA, lncRNA, miRNA, and target genes. Unraveling of this complexity has laid the foundation for a comprehensive understanding of the pathology and treatment of diseases influenced by gene regulatory networks, rather than only core disease-related genes (Boyle et al., 2017). Excitingly, recent studies (Feng et al., 2015; Tian F. et al., 2017; Zhou Y. et al., 2017) have revealed that some miRNA, lncRNA, circRNA, and ceRNA crosstalk can be regulated by bioactive ingredients from CHMs, which often have multiple targets (**Table 1**). By influencing regulatory mechanisms, including pro-apoptosis (Feng et al., 2015), anti-proliferation and anti-migration (Liu T. et al., 2017), anti-inflammation (Fan et al., 2016), anti-atherosclerosis (Han et al., 2018), anti-infection (Liu et al., 2016), anti-senescence (Zhang J. et al., 2017), and suppression of structural remodeling (Liu L. et al., 2017), these ingredients exert protective functions in cancer, cardiovascular disease, nervous system disease, inflammatory bowel disease, asthma, infectious diseases, and senescence-related diseases.

### METHODOLOGY

The bioactive ingredients of CHMs and their interactions with ncRNA targets are the subject of intensive and rapidly expanding research. In this study, we undertook a comprehensive review of this research. The PubMed database was searched using the terms: "(ncRNA) AND herbal medicine", "(((miRNA) OR lncRNA) OR circRNA) AND herbal medicine," "(((miRNA) OR lncRNA) OR circRNA) AND active ingredient," "(((miRNA) OR lncRNA) OR circRNA) AND Chinese herb," "(((miRNA) OR lncRNA) OR circRNA) AND natural agent," "(((miRNA) OR lncRNA) OR circRNA) AND natural compound," or "(((miRNA) OR lncRNA) OR circRNA) AND traditional Chinese medicine."
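The PubMed search strings listed above share a common ncRNA block combined with a varying topic term, so they can be generated programmatically. The following is a small sketch that only builds the query strings; the names `NCRNA_BLOCK`, `TOPICS`, and `build_pubmed_queries` are introduced here for illustration, and joining the queries with `OR` reflects the "or" in the list above rather than a documented detail of how the authors ran the search.

```python
# Shared ncRNA block, written exactly as in the search terms above.
NCRNA_BLOCK = "(((miRNA) OR lncRNA) OR circRNA)"

# Topic terms that are each combined with the ncRNA block.
TOPICS = [
    "herbal medicine",
    "active ingredient",
    "Chinese herb",
    "natural agent",
    "natural compound",
    "traditional Chinese medicine",
]


def build_pubmed_queries():
    """Return the seven PubMed search strings described in the text."""
    queries = ["(ncRNA) AND herbal medicine"]
    queries.extend(f"{NCRNA_BLOCK} AND {topic}" for topic in TOPICS)
    return queries


queries = build_pubmed_queries()

# A single combined query, assuming the individual searches are alternatives.
combined = " OR ".join(f"({query})" for query in queries)

for query in queries:
    print(query)
```

Generating the strings from one list keeps the ncRNA block identical across searches and makes the strategy easier to report and reproduce.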

In addition, the China National Knowledge Infrastructure (CNKI) was searched with the following terms: "FT = 'Chinese herbal medicine' AND SU = 'ncRNA' NOT (TI = 'Review' OR TI = 'Progress' OR TI = 'Overview' OR TI = 'Current situation')," "FT = 'Chinese herbal medicine' AND (SU = 'lncRNA' OR SU = 'miRNA' OR SU = 'circRNA') NOT (TI = 'Review' OR TI = 'Progress' OR TI = 'Overview' OR TI = 'Current situation')," "FT = 'Active ingredient' AND (SU = 'lncRNA' OR SU = 'miRNA' OR SU = 'circRNA') NOT (TI = 'Review' OR TI = 'Progress' OR TI = 'Overview' OR TI = 'Current situation')," "FT = 'Natural compound' AND (SU = 'lncRNA' OR SU = 'miRNA' OR SU = 'circRNA') NOT (TI = 'Review' OR TI = 'Progress' OR TI = 'Overview' OR TI = 'Current situation')," "FT = 'Natural ingredient' AND (SU = 'lncRNA' OR SU = 'miRNA' OR SU = 'circRNA') NOT (TI = 'Review' OR TI = 'Progress' OR TI = 'Overview' OR TI = 'Current situation')," "FT = 'Traditional Chinese medicine extract' AND (SU = 'lncRNA' OR SU = 'miRNA' OR SU = 'circRNA') NOT (TI = 'Review' OR TI = 'Progress' OR TI = 'Overview' OR TI = 'Current situation')," or "FT = 'Traditional Chinese medicine' AND (SU = 'lncRNA' OR SU = 'miRNA' OR SU = 'circRNA') NOT (TI = 'Review' OR TI = 'Progress' OR TI = 'Overview' OR TI = 'Current situation')." Here, "FT" means full text, "SU" means subject, and "TI" means title. Only articles included simultaneously in the "Guide to Core Journals of China," the "Chinese Science Citation Database" and "Chemical Abstracts" were selected, to ensure the high quality of the literature.

From the results of the above search strategy, original articles in English and Chinese related to bioactive ingredients of CHMs and any ncRNA (miRNA, lncRNA, or circRNA) were selected manually.

### PRO-APOPTOSIS EFFECTS OF CHMs

Apoptosis is programmed cell death, which is a normal physiological process of cells. Imbalance of apoptosis is closely associated with various diseases, particularly cancer. Proteins that inhibit apoptosis are over-expressed in various cancers, and are considered to be related to tumorigenesis and chemotherapy resistance (Mohamed et al., 2017); therefore, the induction of apoptosis is a promising method for cancer management (Fulda and Vucic, 2012). Recently, several bioactive ingredients of CHMs have been reported to promote apoptosis by targeting miRNA, lncRNA, or ceRNA crosstalk, indicating their potential as complementary therapies for cancer.

#### Berberine

Berberine [BBR, 9,10-Dimethoxy-2,3-(methylenedioxy)-7,8,13,13a-Tetrahydroberbinium; **Chem. 1**] is an isoquinoline alkaloid extracted from the roots of several species, including Coptis chinensis Franch., Berberis soulieana Schneid., Berberis poiretii Schneid., Berberis vernae Schneid., Berberis wilsoniae Hemsl., and Platycladus orientalis (Linn.) Franco. These herbs are considered to have antipyretic and detoxification effects, based on the theory of traditional Chinese medicine (TCM), and were mainly used to treat diseases of the digestive and urinary systems, such as diarrhea, ulcer, jaundice, and urinary infection, as well as dermatological diseases, including eczema. Recent studies (Luo et al., 2014; Chai et al., 2018) have reported anti-cancer activity for two extracts from Coptis chinensis Franch., one of which is BBR. BBR has been indicated to affect apoptotic pathways in various cancers through the regulation of multiple miRNAs. For example, BBR exerts significant protective effects in multiple myeloma (MM) by targeting several miRNAs. BBR down-regulates the expression of miR-99a∼125b, miR-17∼92, miR-106∼25 (Feng et al., 2015), and miR-21 (Luo et al., 2014), thereby influencing the P53, ERB, and MAPK signaling pathways, leading to acceleration of apoptosis and growth inhibition. Moreover, BBR suppresses MM cell viability by down-regulating miR-19a/92a expression (Yin et al., 2018). Further, BBR can up-regulate miR-23a in hepatocellular carcinoma (HCC) (Wang N. et al., 2014), as well as miR-152, miR-429, and miR-29a in colorectal cancer (Huang et al., 2017), in a P53-dependent manner. By inducing NEK6 inhibition and transcriptional activation of the P53-associated tumor suppressor genes P21 and GADD45α, BBR induces cell death, G2/M cell cycle arrest, and tumor growth suppression in HCC cells; in contrast, miR-23a inhibition can attenuate these BBR-mediated functions (Wang N. et al., 2014). In addition, BBR regulates the cell cycle and inhibits cell proliferation in melanoma A375 cells by promoting miR-34a, miR-154, miR-26a, and miR-124 expression, as well as by suppressing the target genes CDK4, CyclinD1, CyclinE, and CDK2 (Yang L. H. et al., 2016).

TABLE 1 | Detailed information on bioactive ingredients targeting ncRNAs.

Additionally, BBR can enhance cellular sensitivity to chemotherapeutic drugs, helping to address the problem of drug resistance. In ovarian and gastric cancers, BBR can enhance cisplatin sensitivity by regulating the expression of miR-93 (Chen et al., 2015) and miR-203 (You et al., 2016), respectively, thus inducing apoptosis. Furthermore, there is evidence that BBR can function together with other compounds. It has a synergistic antiproliferative effect on colorectal cancer when combined with NVP-AUY922. The potential mechanism underlying this phenomenon was reported to be suppression of CDK4 and induction of miR-296-5p-mediated inhibition of Pin1-β-catenin-cyclin D1 signaling, resulting in cell growth arrest (Su et al., 2015).

Interestingly, BBR also has protective effects against obesity, steatotic liver, and insulin resistance. It inhibits cell viability, cell differentiation, and triglyceride content in a dose- and time-dependent manner, through marked induction of miR-27a and miR-27b, while miR-27a and miR-27b inhibitors can counteract this repressive function of BBR (Wu et al., 2016). Steatotic liver results from disordered lipid metabolism, where lncRNA MRAK052686, NRF2 (Yuan et al., 2015), and miR-373 (Li C. H. et al., 2018) are down-regulated. BBR can reverse the abnormal expression of these genes in steatotic liver, thereby inhibiting the AKT-mTOR-S6K signaling pathway and preventing the development of hepatic steatosis (Li C. H. et al., 2018). In addition, through down-regulating miR-29a-3p in insulin-resistant HepG2 cells, BBR can increase the mRNA and protein expression of IRS1, leading to regulation of the insulin receptor signaling pathway protein (Mao et al., 2018).

Based on the above results, BBR exerts a range of pharmacological effects by targeting miRNAs. In particular, it shows clear advantages for the treatment of MM and digestive cancers, mainly through promotion of apoptosis and inhibition of cell growth. More importantly, its direct anticancer properties are strongly associated with P53 signaling. Therefore, BBR shows promise for future use in cancer treatment.

#### Artesunate

Artesunate (ART, dihydroartemisinin-12-alpha-succinate; **Chem. 2**) is a sesquiterpene lactone derived from the leafy portions of the Chinese herb, Artemisia annua L. It is a semisynthetic derivative of artemisinin, which is widely known as a natural antimalarial medicine (Tu, 2016). Recently, a role for ART in cancer therapy, through targeting lncRNA and promoting cell apoptosis, was identified. The lncRNA, UCA1, is up-regulated in prostate cancer tissues and positively correlated with poor prognosis (Zhou Y. et al., 2017). ART significantly decreased the expression of lncRNA UCA1, thereby regulating the downstream miR-184/BCL-2 axis, inducing apoptosis, and inhibiting metastatic ability. Furthermore, these protective effects could be reversed by overexpression of lncRNA UCA1, indicating that it is a target of ART (Zhou Y. et al., 2017). Hence, ART exhibits anticancer properties through regulating the ceRNA crosstalk of the lncRNA UCA1/miR-184/BCL-2 axis in prostate cancer. Nevertheless, additional evidence to support these findings is lacking and the stability of this regulatory network requires validation.

#### Triptolide/Triptonide

Triptolide (TP, (3bs,4as,5as,6r,6ar,7as,7bs,8as,8bs)-6-hydroxy-6a-isopropyl-8b-methyl-3b,4,4a,6,6a,7a,7b,8b,9,10-decahydrotrisoxireno[6,7:8a,9:4b,5]phenanthro[1,2-c]furan-1(3h)-one; **Chem. 3**) and Triptonide (TN, (3bS,4aS,5aS,6aS,7aS,7bS,8aS,8bS)-6a-isopropyl-8b-methyl-3b,4,4a,7a,7b,8b,9,10-octahydrotrisoxireno[6,7:8a,9:4b,5]phenanthro[1,2-c]furan-1,6(3H,6aH)-dione; **Chem. 4**) are both diterpene lactone components derived from Tripterygium wilfordii Hook. f. (TwHf), which has traditionally been used for treatment of rheumatoid arthritis (RA). A recent study revealed that TwHf exerts its anti-rheumatic effects through regulation of miR-146a, which is over-expressed in patients with RA and negatively correlated with prognosis. TwHf treatment could significantly decrease miR-146a expression. Moreover, miR-146a could be used as a predictor of patient clinical response to TwHf (Chen Z. Z. et al., 2017).

Studies of TP and TN have extended these traditional applications and revealed new pharmacological effects relevant to cancer treatment. TP has been found to exert anticancer activities in lung cancer. It can induce apoptosis and suppress proliferation through inhibiting miR-21 and increasing expression of phosphatase and tensin homolog (PTEN) protein in non-small cell lung cancer. Moreover, miR-21 upregulation could reverse the effect of TP on cell viability and PTEN (Li et al., 2016). Furthermore, TP treatment is also reported to regulate 227 miRNAs and markedly decrease the migration, invasion, and metastasis of lung cancer cells (Reno et al., 2015). In addition, TP can promote apoptosis and suppress cell proliferation in hepatocellular carcinoma, potentially via inhibition of miR-17-92 and miR-106b-25, in a c-MYC-dependent manner, leading to an increase of BIM, PTEN, and P21 levels (Li S. G. et al., 2018).

TN, in turn, shows significant therapeutic advantages in human nasopharyngeal carcinoma (NPC). It promotes NPC cell apoptosis and cell cycle arrest and inhibits cell migration and invasion, without toxicity to nasopharyngeal epithelial cells. This anti-cancer activity is attributed to suppression of lncRNA THOR, followed by downregulation of IGF2BP1 mRNA targets, including Myc, IGF2, and Gli1. Furthermore, lncRNA THOR knockout enhances the effect of TN on NPC cells, while lncRNA THOR overexpression reverses the effects of TN treatment. In vivo, TN administration also markedly impedes subcutaneous NPC xenograft growth in mice. Similarly, lncRNA THOR knockout inhibits xenograft growth (Wang et al., 2019).

Therefore, TP and TN are attractive candidate chemotherapeutic agents against the above cancers. For TP, PTEN is an important target, while TN possesses anti-cancer activity in vitro and in vivo through regulating lncRNA THOR/IGF2BP1 signaling. Nevertheless, because extracts from TwHf exhibit general toxicity (Chen et al., 2008; Luo et al., 2018), the effectiveness and safety of TP and TN require additional confirmation.

#### Ailanthone

Ailanthone (AIL, Picrasa-3,13(21)-diene-2,16-dione, 11,20-epoxy-1,11,12-trihydroxy-, (1-beta,11-beta,12-alpha)-; **Chem. 5**) is a water-soluble quassinoid extracted from the root bark of Ailanthus altissima (Mill.) Swingle. In TCM, the root bark was traditionally used to relieve itching, bleeding, and diarrhea. More recently, AIL has been found to possess anti-tumor activity in several tumor types (Chen Y. et al., 2017; Peng et al., 2017; Yang P. et al., 2018). Among these, the inhibitory effect of AIL on human vestibular schwannomas (VSs) is associated with miRNA regulation. One study demonstrated that AIL induces cleavage of caspase 3 and caspase 9, promotes accumulation of Beclin-1 and LC3-II, and decreases p62 and cyclin D1 expression, thus increasing the apoptotic cell rate. The upstream mechanism may be suppression of miR-21 and of the Ras/Raf/MEK/ERK and mTOR pathways, leading to apoptosis and autophagy in AIL-treated cells. In addition, miR-21 overexpression can attenuate the effects of AIL on the Ras/Raf/MEK/ERK and mTOR pathways, as well as on apoptosis and autophagy, indicating that miR-21 may be a treatment target of AIL in VSs (Yang P. et al., 2018).

#### Cordycepin

Cordycepin (COR, 9-(beta-D-3′-Deoxyribofuranosyl)adenine; **Chem. 6**) is the main bioactive ingredient of Cordyceps militaris, a precious CHM. The medicinal herb has an immunity-strengthening effect and is already used as a health care product in clinical practice. Recent studies have extended the application of COR to various cancers (Wang et al., 2016; Liang et al., 2017; Yu X. et al., 2017). Specifically, the effect of COR in renal cell carcinoma (RCC) is attributed to regulation of miR-21 and PTEN phosphatase. COR down-regulates miR-21 expression and Akt phosphorylation but promotes PTEN phosphatase in RCC Caki-1 cells, resulting in induction of apoptotic cell death and suppression of cell migration. Furthermore, miR-21 mimic or PTEN siRNA can markedly abolish these COR-induced effects (Yang et al., 2017). These findings indicate that COR exerts pro-apoptotic and anti-migration functions through regulating the miR-21/PTEN axis.

In addition, soya-cerebroside (**Chem. 7**), another extract from Cordyceps militaris, has been shown to have anti-inflammatory activity in osteoarthritis (OA). It suppresses the AMPK and AKT signaling pathways and then promotes miR-432 expression in OA synovial fibroblasts, leading to inhibition of monocyte chemoattractant protein-1 (MCP-1), monocyte migration and infiltration, as well as cartilage degradation (Liu S. C. et al., 2017). Thus, soya-cerebroside exerts a protective effect in OA partially via regulating miR-432, MCP-1, and the AMPK and AKT pathways; however, this study did not report clear functional relationships among these factors.

#### Tubeimoside I

Tubeimoside I (TBMS1, nosyl]-β-D-glucopyranosyl]oxy]-2,23-dihydroxy-,28-(O-β-D-xylopyranosyl-(1→3)-O-6-deoxy-α-L-mannopyranosyl-(1→2)-α-L-arabinopyranosyl)ester, intramol. ester, [2β,3β(S),4α]- Tubeimoside TUBEIMOSIDE A(P); **Chem. 8**) is the main triterpenoid saponin derived from Bolbostemma paniculatum (Maxim) Franquet., which has detoxification and detumescent activities. Recent studies have revealed the pharmacological action of TBMS1 as a potential anti-cancer agent (Wang et al., 2011; Gu et al., 2016). One study demonstrated that TBMS1 can promote apoptosis and attenuate migration and invasion of non-small cell lung cancer cells. The underlying mechanism is attributed to upregulation of miR-126-5p, followed by inactivation of the VEGF-A/VEGFR-2/ERK signaling pathway. MiR-126-5p inhibitor can reverse the downregulation of VEGF-A and VEGFR-2 induced by TBMS1 treatment; moreover, both miR-126-5p inhibitor and VEGF-A or VEGFR-2 overexpression upregulate the mRNA expression and phosphorylation of MEK1 and ERK. Significantly, the effects of TBMS1 on apoptosis, migration, and invasion can be reversed by either miR-126-5p inhibitor or ERK activator (Shi et al., 2017). These results suggest that miR-126-5p/VEGF-A/VEGFR-2/ERK signaling is the protective pathway of TBMS1 in cancer therapy.

#### Oridonin

Oridonin (ORI, (14R)-7-alpha,20-Epoxy-1-alpha,6-beta,7,14-tetrahydroxykaur-16-en-15-one; **Chem. 9**) is an ent-kaurane diterpenoid compound mainly derived from Rabdosia rubescens (Hemsl.) Hara. Traditionally, the herb was believed in China to promote detoxification and circulation and to relieve pain. ORI has since been shown to participate in the treatment of several tumors via different regulatory pathways. ORI treatment is reported to accelerate apoptosis of human laryngeal cancer cells through inhibiting EGFR signaling. Similarly, EGFR suppression increased ORI-induced apoptosis by promoting oxidative stress and activating the intrinsic and extrinsic apoptotic pathways (Kang et al., 2010). Moreover, 105 miRNAs are involved in the regulation of ORI-treated pancreatic cancer (Gui et al., 2015). Therefore, miRNAs may be involved in the anti-cancer activity of ORI; however, whether EGFR is a downstream target of these miRNAs requires further research.

### ANTI-PROLIFERATION AND ANTI-MIGRATION EFFECTS OF CHMs

Abnormal cell proliferation is involved in the pathogenesis of many diseases. In particular, the proliferation and invasion of cancer cells are primary contributors to poor patient outcomes (Gao et al., 2017). In addition, asthma is also associated with the cell proliferation and migration in airway smooth muscle (Zhao et al., 2016). Hence, suppression of cell proliferation and migration are critical methods for treatment of these diseases. Excitingly, some bioactive ingredients of CHMs have been found to inhibit the proliferation and migration of both cancer and asthma cells through targeting miRNA, lncRNA, or ceRNA crosstalk.

#### Curcumin

Curcumin (CUR, (1E,6E)-1,7-Bis(4-hydroxy-3-methoxyphenyl)hepta-1,6-diene-3,5-dione; **Chem. 10**) is a phenolic compound extracted from Curcuma longa L., which was traditionally used as a painkiller in rheumatism and other bone and joint diseases. Recent studies have found that CUR can also act as an anticancer agent, via miRNA and lncRNA targets. CUR inhibits miR-208 and activates expression of the cell cycle suppressor, CDKN1A, resulting in dose-dependent suppression of prostate cancer cell proliferation (Guo H. et al., 2015). Further, CUR can significantly increase miR-143 and decrease PGK1 expression, while ectopic expression of FOXD3 can enhance the regulatory effect of CUR on miR-143, thereby inhibiting the proliferation and migration of prostate cancer cells (Cao et al., 2017). Further studies reveal that CUR also acts on human prostate cancer stem cells (HuPCaSC). CUR treatment increases the expression of miR-145 and decreases levels of lncRNA-ROR, the cell cycle proteins CCND1, CDK4, and the stem cell markers OCT4, CD44, and CD133. The tumorigenicity of these cells is thereby significantly reduced through inhibition of their proliferation and invasion and induction of cell cycle arrest (Liu T. et al., 2017). Moreover, expression levels of miR-770-5p and miR-1247 in the DLK1–DIO3 imprinted gene cluster were significantly up-regulated, leading to suppression of HuPCaSC proliferation and invasion in vitro (Zhang et al., 2018a). CUR also promotes the expression of miR-98 in lung cancer, thus inhibiting cell growth and migration (Liu W. L. et al., 2017). By reducing miR-186<sup>∗</sup> expression, it induces apoptosis and decreases cell viability in lung cancer cells as well (Tang et al., 2010). Furthermore, CUR both inhibits proliferation and accelerates apoptosis in bladder, gastric, non-small cell lung, and pancreatic cancers and in hepatic carcinoma via the up-regulation of miR-203 (Saini et al., 2011), miR-33b (Sun et al., 2016), miR-192-5p (Jin et al., 2015), miR-7 (Ma et al., 2014), and lncRNA AK125910 (Guo Y. et al., 2015), respectively.

CUR has also been reported to increase the sensitivity of non-small-cell lung cancer (Lu et al., 2017), breast cancer (Zhou S. et al., 2017), and nasopharyngeal carcinoma (Wang Q. et al., 2014) to chemotherapy drugs by targeting ncRNAs including miR-30c, miR-29b-1-5p, and lncRNA AK294004, respectively, along with their downstream genes. Moreover, CUR can exert synergistic effects in combination with other compounds, to suppress cell proliferation and invasion and induce apoptosis in glioblastoma (Wu et al., 2015), breast cancer (Guo et al., 2013), and hepatocellular carcinoma (Zhang S. et al., 2017). In glioblastoma, miR-378 was found to promote the anticancer effect of CUR by regulating p38 expression, demonstrating the mutual interaction of miRNA and CUR (Li et al., 2017). Furthermore, CUR is reported to exert anti-inflammatory effects (Ma F. et al., 2017) and to inhibit adipogenic differentiation (Tian L. et al., 2017).

Notably, liposome technology is an effective approach to targeted drug delivery that can overcome the solubility problems of poorly soluble drugs (Allen and Cullis, 2004). One study used this technology to produce CUR-loaded liposomes, increasing the solubility and oral bioavailability of CUR and reducing the hepatic first-pass effect. This drug combination can also promote the sensitivity of breast cancer cells to chemotherapy by regulating several miRNAs, including miR-29b-1-5p, miR-29b-3p, miR-6068, miR-6790-5p, and miR-4417, as well as their target genes, including DDIT4, EPAS1, VEGFA, RPS14, and DCDC2 (Zhou et al., 2018).

These data demonstrate that CUR can suppress cell proliferation, growth, and metastasis in various cancers by targeting ncRNAs. In particular, CUR has clear advantages for the treatment of prostate cancer through its regulation of both cancer cells and cancer stem cells. Moreover, the synergistic effects of CUR with other chemotherapies provide new alternative strategies to address drug resistance. Excitingly, structural improvement of CUR not only preserves its anti-cancer effect but also improves its bioavailability.

#### Shikonin

Shikonin (SHK, 5,8-dihydroxy-2-((1R)-1-hydroxy-4-methyl-3-penten-1-yl)-1,4-naphthalenedione; **Chem. 11**) is a naphthoquinone derivative compound. SHK is extracted from the root of the natural herbal medicine, Lithospermum erythrorhizon Sieb. et Zucc. This plant was generally used to treat rash, pox, measles, and urticaria in TCM. Modern studies have discovered broader applications for this compound in cancer, by revealing its anti-proliferation function, which is reported to be associated with targeting of miRNAs. SHK can inhibit proliferation and promote apoptosis by modulating the miR-106b/PTEN/AKT/mTOR signaling pathway in endometrioid endometrial cancer (Huang and Hu, 2018). Moreover, SHK inhibits the proliferation of breast cancer cells through downregulation of tumor-derived exosomal miR-128 (Wei et al., 2016). In addition, the anticancer activity of SHK in glioblastoma is enhanced by miR-143 by reducing the expression of the anti-apoptosis regulator, BAG3, which is a functional target of miR-143 (Liu et al., 2015).

Overall, the regulatory relationships between SHK and miRNAs are mutual. SHK could target miR-106b and miR-128 in endometrioid endometrial cancer and breast cancer to prevent cell proliferation. Further, miR-143 expression influences the anticancer activity of SHK in glioblastoma. Finally, the results reviewed above demonstrate that the anti-proliferation activity of SHK in cancers can be attributed to its interactions with miRNAs.

#### Paeoniflorin

Paeoniflorin [PF, 5b-((Benzoyloxy)methyl)tetrahydro-5-hydroxy-2-methyl-2,5-methano-1H-3,4-dioxacyclobuta(cd)pentalen-1a(2H)-yl-beta-D-glucopyranoside; **Chem. 12**] is the main active ingredient of Paeonia lactiflora Pall., which was commonly used to regulate blood circulation and relieve pain in TCM theory. Recent investigations have revealed roles for PF in vasodilation (Goto et al., 1996), anti-inflammation (Chen et al., 2013; Hu et al., 2018), microcirculation improvement (Zhou et al., 2011), anti-oxidation (Chen et al., 2011), and anti-cancer (Wang et al., 2012) activities. Specifically, PF exhibits protective activity in glioma via suppression of cell proliferation and promotion of apoptosis. The potential underlying mechanism may involve upregulation of miR-16 and downregulation of matrix metalloproteinase-9 (MMP-9), which are differentially expressed in glioma tissues and cells compared with healthy controls (Li W. et al., 2015). This result lays the foundation for treatment of cancer using PF; however, supporting evidence is insufficient and more investigations are needed.

#### Honokiol

Honokiol (HNK, 5,3′-Diallyl-2,4′-dihydroxybiphenyl; **Chem. 13**) is a bioactive polyphenol isolated from Magnolia grandiflora. Although the flower was traditionally valued as ornamental, it contains the phenolic ingredient, HNK, which has been shown to have antimicrobial activity (Clark et al., 1981). A recent study discovered that HNK has anti-tumor activity; it can markedly inhibit the leptin-induced growth, invasion, and migration of breast cancer cells, as well as leptin-induced breast-tumor-xenograft growth. HNK promotes the expression of miR-34a, and inhibits WNT1-MTA1-β-catenin signaling, through suppression of STAT3 phosphorylation and recruitment of STAT3 to the promoter of miR-34a (Avtanski et al., 2015). Hence, HNK has demonstrated a protective effect on breast cancer in a diet-induced-obese mouse model with high leptin levels and could serve as a new endocrine therapy drug for patients with obesity-related breast cancer that is estrogen receptor- and progesterone receptor-negative. However, the research described above was limited to animal experiments, and clinical trials are warranted to provide evidence in humans.

#### Schisandrin B

Schisandrin B (Sch B, 7-dimethyl-ethoxy-stereoisomer;benzo (3,4)cycloocta(1,2-f)(1,3)benzodioxole,5,6,7,8-tetrahydro-1,2, 3,13-tetram; **Chem. 14**) is a type of lignan, extracted from Schisandra sphenanthera Rehd. et Wils. The original fruit was commonly used to relieve symptoms of cough, gasp, abnormal sweating, nocturnal emission, thirst, and palpitations, under TCM theory. Although it was widely used to treat various diseases in ancient China, its specific target and underlying mechanism of action were unclear. A recent study of Sch B provided information about the involvement of ncRNA. Sch B may increase the expression of miR-150 and subsequently reduce levels of the lncRNA BCYRN1 in airway smooth muscle cells (ASMCs) of asthmatic rats. By regulating these two ncRNAs, Sch B suppresses the proliferation, viability, and migration of ASMCs; therefore, the study generated evidence that partially explains the mechanism underlying the activity of Sch B against asthma (Zhang X. Y. et al., 2017). Moreover, Sch B can mediate ceRNA crosstalk between miR-150 and lncRNA BCYRN1, further establishing an miR-150/lncRNA BCYRN1/cell proliferation axis; however, as a new regulatory mechanism influencing asthma, the stability of the ceRNA crosstalk requires further investigation.

#### Resveratrol

Resveratrol (RES, 3,4′,5-trihydroxystilbene; **Chem. 15**) is a natural phenol stilbenoid that is mainly found in food, including the skin of grapes and blueberries, and several CHMs, including Morus alba L., Polygonum cuspidatum Sieb. et Zucc., and Rubus idaeus L. It is considered to protect individuals from cardiovascular diseases, as well as dietary and metabolic diseases (Bradamante et al., 2004; Baur et al., 2006; Lagouge et al., 2006). Recently, its anticancer properties have also been evaluated by researchers and RES has been used as a dietary supplement (Garvin et al., 2006; Kalra et al., 2008; Roy et al., 2009). RES can down-regulate the lncRNA, MALAT1, and up-regulate miR-200c, as well as inhibiting WNT/β-catenin signaling, leading to suppression of cell invasion, metastasis, and migration in colorectal cancer (Ji et al., 2013; Karimi Dermani et al., 2017). Moreover, by significantly decreasing oncogenic miR-221 and regulating NF-κB and TFG, RES exerts inhibitory effects on melanoma cells, both in vitro and in vivo (Wu and Cui, 2017). In glioma, RES inhibits cell proliferation, arrests the cell cycle in S phase, and induces apoptosis in vitro, through down-regulation of miR-21, miR-30a-5p, and miR-19, as well as regulating their targets, including P53, PTEN, EGFR, STAT3, COX-2, NF-κB, and the PI3K/AKT/mTOR pathway (Wang G. et al., 2015).

RES also has anti-inflammatory effects. It can reduce expression of miR-155 and promote that of its target gene, suppressor of cytokine signaling 1 (SOCS1), leading to subsequent inhibition of the inflammatory factors, TNF-α, IL-6, MAPKs, and STAT1/STAT3 (Ma C. et al., 2017). Interestingly, by increasing miR-663 expression, RES down-regulates miR-155, thus acting as both an anti-inflammatory and an anticancer agent (Tili et al., 2010). Furthermore, RES exhibits neuroprotective effects. It promotes miR-96 and inhibits its target gene, BAX, resulting in prevention of oxygen/glucose deprivation/reoxygenation-induced apoptosis and brain damage, while this protective function can be reversed by miR-96 inhibitor (Bian et al., 2017). In Alzheimer's disease, RES also improves long-term memory formation and induction of long-term potentiation of hippocampus CA1 neurons, through down-regulation of miR-134 and miR-124, and up-regulation of CREB and BDNF (Zhao et al., 2013). Therefore, RES is a potential therapeutic agent against cancers, cerebral ischemia, Alzheimer's disease, and other inflammatory conditions.

#### Soybean Isoflavones

Soybean isoflavones (SIF, 3-phenyl-4h-1-benzopyran-4-one; **Chem. 16**) are extracted from Glycine max (Linn.) Merr. They act as phytoestrogens in mammals and have been used as dietary supplements. SIF are associated with breast cancer (Douglas et al., 2013; Takagi et al., 2015). Recently, they have also been demonstrated to suppress cell growth and invasion in prostate cancer. A potential mechanism underlying the anti-prostate cancer activity of SIF is its promotion of miR-29a and miR-1256, leading to down-regulation of TRIM68 and PGK-1 by inhibiting methylation of the miR-29a and miR-1256 promoters (Li et al., 2012). Nevertheless, as a controversial ingredient with weak estrogen-like properties, the influence of SIF on hormone-receptor-positive cancers has caused widespread concern. Therefore, research is needed to determine the effectiveness and safety of SIF in the context of different cancers.

#### Matrine

Matrine (MAT, (7aS,13aR,13bR,13cS)-Dodecahydro-1H,5H,10H-dipyrido[2,1-f:3′,2′,1′-ij][1,6]naphthyridin-10-one; **Chem. 17**) is the main alkaloid extracted from Sophora flavescens Ait, which was commonly used in China to treat dysentery, eczema, and jaundice. Modern pharmacological research shows that MAT has protective activity in melanoma, as evidenced by inhibition of proliferation and invasion, and promotion of apoptosis in melanoma cell lines. By downregulating miR-19b-3p expression, MAT increases the protein and mRNA expression of PTEN, a direct target of miR-19b-3p. Similarly, miR-19b-3p downregulation can mimic the effect of MAT, while PTEN silencing reverses the protection induced by MAT (Wei et al., 2018). As a result, MAT can exert anti-cancer activity in melanoma via regulating the miR-19b-3p/PTEN axis.

#### Corylin

Corylin (CL, 3-(2,2-dimethylchromen-6-yl)-7-hydroxychromen-4-one; **Chem. 18**) is a flavonoid compound extracted from Psoralea corylifolia Linn. In TCM practice, Psoralea corylifolia Linn. was often used for degenerative bone and joint diseases. Recent studies have revealed its application in inflammation (Kim et al., 2016; Hung et al., 2017) and cancer (Chen et al., 2018). The anti-cancer activity of CL is related to upregulation of the tumor suppressor lncRNA GAS5 and activation of its downstream anticancer pathways. As a result, proliferation, migration, invasiveness, and epithelial-mesenchymal transition are all inhibited in hepatocellular carcinoma cells. Moreover, lncRNA GAS5 silencing can attenuate these inhibitory effects of CL. In an animal experiment, CL was also observed to markedly retard tumor growth, with no significant physiological toxicity (Chen et al., 2018). Taken together, lncRNA GAS5 may act as a treatment target of CL in hepatocellular carcinoma; however, the specific downstream genes of lncRNA GAS5 require further study.

### ANTI-INFLAMMATORY EFFECTS OF CHMs

Inflammation is a common pathological process involved in many diseases, including coronary heart disease, inflammatory bowel disease, myocarditis, asthma, and neuroinflammatory disorder (Harrington, 2017; Robinson et al., 2017; Mahajan et al., 2018); however, both non-steroidal anti-inflammatory drugs and immunosuppressive agents have clear side effects (Shah and Gecys, 2006; Ahmad et al., 2010). Consequently, safe and effective anti-inflammatory drugs for the treatment of the basic pathologies underlying the above diseases are still needed. Several bioactive ingredients of CHMs are reported to target miRNA or ceRNA crosstalk, thereby exerting anti-inflammatory effects.

#### Tanshinone IIA

Tanshinone IIA (Tan IIA, Phenanthro[1,2-b]furan-10,11-dione, 6,7,8,9-tetrahydro-1,6,6-trimethyl; **Chem. 19**) is a lipophilic diterpenoid extracted from the root of Salvia miltiorrhiza Bge. Under TCM theory, the original herb is considered to promote blood circulation. Recent studies have illustrated that Tan IIA has cardioprotective activity (Shang et al., 2012; Feng et al., 2016) and injection of sodium Tan IIA sulfonate has been widely used as an adjunctive therapy for cardiovascular diseases in China (Yu M. L. et al., 2018). Its inhibition of inflammation (Pan et al., 2017; Cheng et al., 2018) may be mediated by miRNAs acting as upstream regulators. Tan IIA can reduce the expression levels of cytokines, chemokines, and acute-phase proteins, including TLR4, MyD88, GM-CSF, sICAM-1, CXCL-1, and MIP-1α. Moreover, it significantly inhibits the mRNA expression levels of IL-1β, TNF-α, and COX-2, thereby suppressing lipopolysaccharide (LPS)-induced activation of the TLR4-NF-κB pathway. Furthermore, expression of miR-155, miR-147, miR-184, miR-29b, and miR-34c is also reduced by Tan IIA, and these may be upstream regulators in anti-inflammation processes (Fan et al., 2016). In addition, by down-regulation of miR-146b and miR-155, Tan IIA significantly reduces the levels of inflammatory factors, including CRP, ox-LDL, IL-1β, IL-6, IL-12, TNF-α, CCL-2, CD40, and MMP-2, thereby exerting protective functions in atherosclerosis induced by Porphyromonas gingivalis (Xuan et al., 2017).

Another study indicated that Tan IIA can also inhibit apoptosis caused by hypoxia. Through increasing miR-133 expression and activating the stress-induced protein kinase, MAPK ERK1/2, Tan IIA enhances resistance to hypoxic exposure in neonatal cardiomyocytes (Zhang et al., 2012). Treatment with Tan IIA has also been illustrated to reverse the abnormal expression of miR-1, SRF, and MEF2, and participates in suppression of the p38 MAPK signaling pathway, restoring declined I(K1) current density and Kir2.1 and Cx43 protein levels, thus lowering the incidence of arrhythmogenesis and mortality after myocardial infarction, and improving cardiac function (Shan et al., 2009; Zhang et al., 2010). These results provide a partial explanation for the anti-inflammatory and anti-hypoxia activity of Tan IIA via miRNAs in cardiovascular diseases; in particular, miR-155 may be a specific target of Tan IIA in inflammation.

Additionally, an aqueous extract from Salvia miltiorrhiza Bge., named magnesium lithospermate B (MLB, magnesium (2R)-3-(3,4-dihydroxyphenyl)-2-[(E)-3-[(2S,3S)-2-(3,4-dihydroxyphenyl)-3-[(2R)-3-(3,4-dihydroxyphenyl)-1-oxido-1-oxopropan-2-yl]oxycarbonyl-7-hydroxy-2,3-dihydro-1-benzofuran-4-yl]prop-2-enoyl]oxypropanoate; **Chem. 20**), has a neuroprotective effect in ischemia/reperfusion (I/R) injury. I/R injury can lead to miR-107 upregulation, glutamate transporter 1 (GLT-1) suppression, and glutamate accumulation, increasing neurological deficit score, infarct volume, and cellular apoptosis (Yang Z. B. et al., 2014). MLB treatment improves I/R-induced cerebral injury by reversing the abnormal levels of miR-107, GLT-1, and glutamate (Yang et al., 2015).

The above results may help to throw light on the underlying mechanisms of Tan IIA and MLB for the treatment of cardiovascular and cerebrovascular diseases from the perspective of miRNA; however, it should be noted that the pharmacological action of Salvia miltiorrhiza Bge. is not limited to ncRNAs (Zhu et al., 2017).

#### Baicalin

Baicalin (BA, 7-D-glucuronic acid-5,6-dihydroxy-flavone; **Chem. 21**) is a flavone glycoside extracted from Scutellaria baicalensis Georgi, which was commonly applied for the treatment of respiratory and digestive diseases in CHM. The traditional treatment effects may be related to regulation of inflammatory responses. TNF-α stimulation promotes the expression of miR-191a, causing downregulation of ZO-1 mRNA and protein. BA pretreatment could reverse the TNF-α-induced changes in ZO-1 and miR-191a expression, leading to improved viability and migration of rat small intestine epithelial cells. Furthermore, knockdown of miR-191a expression significantly increased BA-induced ZO-1 protein expression, thereby enhancing the protective effect of BA on cell motility (Wang L. et al., 2017). These data suggest that miR-191a may be an upstream target of BA in the treatment of inflammatory bowel disease; moreover, the therapeutic effects of BA can also be influenced by miR-191a.

Another study showed that BA inhibits the proliferation of mouse embryonic stem cells. BA suppresses the expression of miR-294, c-jun, and c-fos, while miR-294 overexpression could significantly reverse these effects, indicating that miR-294 may be the treatment target (Wang J. et al., 2015).

Therefore, BA exerts anti-inflammatory and anti-proliferative effects by targeting miRNAs and emerges as a potential treatment agent for digestive disease. Further, as BA remains one of the most frequently used medicines for the treatment of cough and phlegm, the activity of BA in respiratory disease warrants similar studies.

#### Cinnamaldehyde

Cinnamaldehyde (CA, 3-phenylprop-2-enaldehyde; **Chem. 22**) is a conjugated aromatic aldehyde extracted from the bark of the Chinese herb, Cinnamomum cassia Presl. According to TCM theory, the traditional plant can enhance the function of "yang qi" (a substance with excitatory function in TCM) and is often used to relieve symptoms of weakness. Recent studies have broadened the application of this preparation to the treatment of cerebrovascular diseases, ulcerative colitis, and cancer (Zhao et al., 2015; Tian F. et al., 2017; Qu et al., 2018), where it acts by exerting anti-inflammatory or ncRNA regulatory functions. CA improves symptoms of weight loss, disease activity index, and infiltration of inflammatory cells, by decreasing the levels of pro-inflammatory cytokines, including TNF-α, IL-1β, and IL-6, as well as the NLRP3 inflammasome, miR-21, and miR-155, in both colon tissue and macrophages. Moreover, levels of reactive oxygen species were also reduced, along with the phosphorylation of AKT, mTOR, and COX2 proteins. Further experiments revealed similar suppression of IL-1β and IL-6 in response to miR-21 or miR-155 inhibitors, demonstrating that these inflammatory factors are positively regulated by miR-21 or miR-155 (Qu et al., 2018). As a result, CA suppresses the miR-21/miR-155/IL-1β/IL-6 axis to exert its protective function in ulcerative colitis.

CA also has anti-cancer activity through regulation of ceRNA crosstalk and can suppress cell proliferation and induce apoptosis in non-small cell lung cancer. Through upregulation of hsa_circ_0043256 and ITCH expression, CA inhibits the Wnt/β-catenin pathway, while this function can be partially abolished by miR-1252, indicating that miR-1252 may participate in hsa_circ_0043256-related regulation. Moreover, hsa_circ_0043256 knockdown can reverse the effects of CA on cells (Tian F. et al., 2017). Consequently, hsa_circ_0043256/miR-1252/ITCH crosstalk may contribute to the anticancer activity of CA.

#### Geniposide

Geniposide (GEN, methyl (1S,4aS,7aS)-7-(hydroxymethyl)-1-[(2S,3R,4S,5S,6R)-3,4,5-trihydroxy-6-(hydroxymethyl)oxan-2-yl]oxy-1,4a,5,7a-tetrahydrocyclopenta[c]pyran-4-carboxylate; **Chem. 23**) is derived from Gardenia jasminoides Ellis, a traditional antipyretic and detoxifying CHM. A recent study reported that GEN has anti-inflammatory and cardioprotective effects in LPS-injured H9c2 cells. It up-regulates miR-145 expression, inhibits the pro-inflammatory factors IL-6, TNF-α, and MCP-1, and then suppresses the MEK/ERK pathway, thus promoting cell viability and inhibiting apoptosis. Moreover, a miR-145 inhibitor could reverse the protective function induced by GEN pretreatment (Su et al., 2018). Therefore, GEN is a potential therapeutic agent for myocarditis that acts by targeting miR-145 and suppressing inflammation in cardiomyocytes.

#### Carvacrol/Thymol

Carvacrol (Car, 5-Isopropyl-2-methylphenol; **Chem. 24**) and Thymol (Thy, 2-Isopropyl-5-methylphenol; **Chem. 25**) are isolated from the essential oil of Origanum vulgare L. or wild bergamot. They are isomeric monoterpenoid phenols. Traditionally, Origanum vulgare L. was applied for the treatment of cold and heatstroke, and bergamot is held to relieve pain and vomiting under TCM theory. One study expanded the applicable scope of these herbs by examining the two bioactive ingredients. Car/Thy can suppress allergic inflammation in asthma by regulating miRNAs and inflammatory factors. In a chitin-induced model, the expression levels of miR-155, miR-146a, and miR-21, which promote pro-inflammatory cytokines, are upregulated. Furthermore, SOCS1 and SHIP1, targets of miR-155 and negative regulators of TLR-mediated inflammation, are inhibited by chitin. However, Car/Thy treatment can reverse the abnormal expression of TLR2, TLR4, SOCS1, SHIP1, miR-155, miR-146a, and miR-21 (Khosravi and Erle, 2016). These results provide preliminary evidence linking the anti-inflammatory effect of Car/Thy to miRNAs; however, the specific targets and the corresponding regulatory network have not been reported.

#### Boswellic Acids

Boswellic acids are extracted from the oleo-gum-resin of Boswellia serrata, a traditional CHM used to promote blood circulation and relieve pain. Boswellic acids contain different ingredients, among which 3-acetyl-11-keto-β-boswellic acid (AKBA, (3R,4R,6aR,6bS,8aR,11R,12S,14bS)-3-acetyloxy-4,6a,6b,8a,11,12,14b-heptamethyl-14-oxo-1,2,3,4a,5,6,7,8,9,10,11,12,12a,14a-tetradecahydropicene-4-carboxylic acid; **Chem. 26**) possesses the most potent anti-inflammatory activity (Siddiqui, 2011; Sayed et al., 2018). AKBA can attenuate the behavioral dysfunction in LPS-induced neuroinflammation, with an effect similar to that of dexamethasone. Moreover, AKBA lowers the expression of miR-155, P-IκB-α, and carbonyl protein, and increases the content of normal cytokine and SOCS-1, resulting in anti-apoptotic and anti-amyloidogenic effects (Sayed et al., 2018). Therefore, regulation of miR-155 and its downstream proteins helps to reveal the possible mechanism underlying the positive role of AKBA in neuroinflammatory disorders; however, the specific target requires further verification.

### ANTI-ATHEROSCLEROSIS EFFECTS OF CHMs

Atherosclerosis is the basic pathology underlying coronary artery disease, cerebral infarction, and other vascular diseases (Pothineni et al., 2017; Li Q. et al., 2018). Medicines with anti-atherosclerosis activities are therefore highly significant for the prevention and treatment of these diseases. Statins are currently the main drugs used against atherosclerosis; however, when taken for long periods of time, they risk impairing liver function and causing muscle lysis, particularly in elderly patients (Guyton, 2006; Ramakumari et al., 2018). Therefore, better drugs with relatively few side effects are needed and CHMs represent a good resource in this context. Recent studies have identified three bioactive ingredients of CHMs as regulators of atherosclerosis through targeting miRNA or lncRNA.

#### Sinapic Acid

Sinapic Acid (SA, 3,5-dimethoxy-4-hydroxycinnamic acid; **Chem. 27**) is the bioactive ingredient isolated from seeds of the Chinese herb, Sinapis alba L. The seeds were commonly used to treat cough, phlegm, limb numbness, and chronic abscess. A recent study reported protective effects of SA in atherosclerosis, which helped to partially explain the original application of the seed to treat limb numbness. The lncRNA MALAT1 is significantly up-regulated in rats with diabetic atherosclerosis and low-dose SA treatment can suppress this abnormal expression. Subsequently, pyroptotic death of bone marrow-derived macrophages is inhibited, accompanied by decreased expression of ET-1 and IL-1β, and the pyroptotic proteins, ASC, NLRP3, and Caspase-1 (Han et al., 2018). Hence, SA exerts anti-inflammatory activity and prevents pyroptosis, thus exerting anti-atherosclerosis effects, through targeting lncRNA MALAT1; however, the efficacy and safety of SA as a potential treatment agent require verification by additional studies.

#### Polydatin

Polydatin (PLD, 3,4,5-trihydroxystilbene-3-beta-monoglucoside; **Chem. 28**), also known as Piceid, is the bioactive ingredient from Polygonum cuspidatum Sieb. et Zucc. This herb was traditionally used for the treatment of jaundice and cough. However, PLD has also been found to exert protective effects against cardiac hypertrophy (Zhang et al., 2015a), insulin resistance, and hepatic steatosis (Zhang et al., 2015b). Further, a recent study has revealed the underlying regulatory action of PLD in atherosclerosis with liver dysfunction. The findings indicated that PLD treatment can markedly lower the elevated blood glucose, serum ALT, AST, TC, TG, and LDL-C in mice fed a high-fat diet. Simultaneously, changes in HDL-C, MDA, SOD, and miR-214 were also improved in liver tissue (Zhou et al., 2016). This study indicates that PLD can be therapeutically effective in complex diseases by regulating various factors. In addition, PLD shows great potential as a complement to treatment for statin-induced liver damage via its anti-atherosclerosis and liver protection properties; however, the above study only reported the expression levels of various factors induced by PLD, rather than systematically studying the relationships between miR-214 and its target genes. Therefore, further in-depth investigations are required.

#### Ampelopsin

Ampelopsin [(2R,3R)-3,5,7-trihydroxy-2-(3,4,5-trihydroxyphenyl)chroman-4-one; **Chem. 29**], also called dihydromyricetin (DHM), is the main flavonoid compound from Ampelopsis grossedentata. The original herb has detoxifying, anti-inflammatory, and analgesic effects and is commonly used as a dietary supplement. DHM has now been demonstrated to impede the atherosclerotic process by regulating endothelial dysfunction (Yang D. et al., 2018), and to exert an anti-aging effect against neurodegenerative diseases (Kou et al., 2016). It inhibits miR-21 expression and then improves endothelial dysfunction induced by TNF-α, accompanied by suppression of abnormal expression of eNOS, DDAH1, NO, and ADMA, as well as improvement of tube formation and migration. Furthermore, miR-21 blockade produces effects similar to those of DHM treatment, while miR-21 overexpression abolishes this protection. Additionally, the improvement of endothelial dysfunction can be reversed by a non-specific NOS inhibitor, indicating that DHM ameliorates vascular endothelial function and inhibits atherosclerosis by targeting the miR-21-mediated DDAH1/ADMA/NO signaling pathway (Yang D. et al., 2018).

Another study revealed that miR-34a is upregulated in rats with D-gal-induced brain aging, while DHM treatment can inhibit this abnormal expression. Moreover, DHM suppresses apoptosis and ameliorates impaired autophagy of neurons in D-gal-injured hippocampus tissue by up-regulating SIRT1 and down-regulating mTOR signaling pathways (Kou et al., 2016). Therefore, DHM exerts an anti-aging effect partially through regulating the miR-34a-mediated SIRT1-mTOR signaling pathway, suggesting an important role in the treatment of neurodegenerative diseases.

From the above results, it can be seen that DHM exerts not only an anti-atherosclerosis effect but also an anti-aging function by targeting miRNAs and downstream signaling pathways.

### ANTI-INFECTION EFFECTS OF CHMs

Antibiotics and antiviral drugs are basic treatments for infectious diseases; however, a deteriorating situation caused by antibiotic abuse, drug-resistance, and viral mutations is shifting the focus of research attention to other therapeutic and complementary drugs for treatment of these conditions (Miyoshi et al., 2015; Jiang et al., 2019). To date, two bioactive ingredients of CHMs have been found to contribute to the treatment of infection through regulation of miRNAs. These substances can protect the human body from pathological damage, although they do not directly induce pathogen resistance.

#### Icariine

Icariine [ICA, 2-(4′-methoxyphenyl)-3-rhamnosido-5-hydroxyl-7-glucosido-8-(3′-methyl-2-butylenyl)-4-chromanone; **Chem. 30**] is the main bioactive flavonoid glucoside extracted from Epimedium brevicornu Maxim. Under TCM, this herb was considered to nourish "yang qi" and generally applied for treatment of osteoarticular and reproductive diseases. ICA can suppress osteoclast bone resorption and bone loss, indicating great potential for use as a treatment agent for both aseptic loosening and bacteria-induced bone loss (Zhang et al., 2015; Liu et al., 2016). Specifically, ICA can restore LPS-induced bone loss, without obvious cytotoxicity. This product can downregulate expression of the osteogenic inhibitor, miR-34c, while it up-regulates levels of the key transcription factor, RUNX2, thereby inducing osteogenic differentiation and mineral nodule formation. Moreover, miR-34c overexpression can reverse these effects of ICA. Additionally, ICA markedly suppresses LPS-induced activation of JNK, p38, and NF-κB pathways, leading to therapeutic effects in diseases causing bacteria-induced bone loss, such as osteomyelitis and septic arthritis (Liu et al., 2016). Interestingly, ICA also exhibits anticancer activity in ovarian cancer via down-regulation of miR-21 expression. This subsequently induces PTEN, RECK, and Caspase-3 activity, while BCL-2 protein expression is inhibited, leading to suppression of cell proliferation and increased apoptosis (Li J. et al., 2015). Based on these findings, miR-34c appears to facilitate the mechanisms underlying the role of ICA in infectious bone loss. Furthermore, the identification of miR-21 suggests a potential new application of ICA in cancer therapy. Therefore, there is promise that additional currently unknown functions of this medicinal herb could be determined by studying ncRNA and related regulatory networks.

#### Ginsenosides

Ginsenosides (GS, (3S,5R,8R,9R,10R,14R,17S)-17-(2-hydroxy-6-methylhept-5-en-2-yl)-4,4,8,10,14-pentamethyl-2,3,5,6,7,9,11,12,13,15,16,17-dodecahydro-1H-cyclopenta[a]phenanthren-3-ol; **Chem. 31**), also referred to as panaxosides, are a class of natural steroid glycosides and triterpene saponins. These products include various active components, such as ginsenoside Re, Rg, Rh, Rb, and Rc. GS are mainly isolated from Panax ginseng C. A. Mey., a valuable herb with nourishing effects and a long history of use in ancient China. At present, GS products are not only used to promote health, but also for their activity as immune regulators in many diseases (Jiang Z. et al., 2017; Shin et al., 2018; Yu X. et al., 2018). GS exert a cytoprotective effect, thereby promoting cell viability during avian influenza H9N2/G1 infection. During this process, the expression of miR-15b was up-regulated, while production of IP-10 was markedly inhibited. Furthermore, cytometry and TUNEL analyses indicated that ginsenoside Re prevents apoptosis and DNA damage in human endothelial cells caused by H9N2/G1 (Chan et al., 2011). These results are inconsistent with the traditional concept that GS is only suitable for treatment of sub-optimal health status or chronic diseases and greatly expand the potential for application of GS for treatment of acute infectious diseases in the future.

### ANTI-SENESCENCE EFFECTS OF CHMs

Cell senescence is an irreversible state in which cells undergo cell cycle arrest in response to various factors (Watanabe et al., 2017). It participates in biological processes including embryonic development, wound healing, and aging, is closely related to organismal aging and disease, and has therefore attracted widespread research interest (Watanabe et al., 2017; de Magalhães and Passos, 2018). Currently, several bioactive ingredients of CHMs have been found to play positive roles in anti-senescence.

#### Salidroside

Salidroside [SAL, 2-(4-Hydroxyphenyl)ethyl beta-D-glucopyranoside; **Chem. 32**] is the main bioactive extract from Rhodiola rosea L., which nourishes "yang qi" and promotes blood circulation under TCM theory. Modern pharmacological studies further revealed that the medicinal herb not only has anti-fatigue activity but also improves resistance to hypoxia (Li et al., 2014). Moreover, SAL has been reported to possess anti-senescence activity. The potential mechanism is related to regulation of the expression of multiple miRNAs. By upregulating let-7c, let-7e, and miR-3620 and decreasing the expression of miR-411, miR-24-2-5p, and miR-485-3p in aging cells, SAL participates in several pathways involving p53, the transcription factor CREB, and AKT/mTOR signaling (Zhang J. et al., 2017). Both let-7 and mTOR are known to be aging-related (Marasa et al., 2010; Wu et al., 2013), and the former can directly inhibit the expression of the latter (Marcais et al., 2014). Therefore, it is possible that SAL exerts its anti-senescence effect by regulating let-7 and mTOR; however, the predicted regulatory relationship requires further validation.

#### Phlorizin

Phlorizin (PZ, Phloretin-2′-O-beta-glucoside; **Chem. 33**) is the main active ingredient of Acanthopanax senticosus (Rupr. et Maxim.) Harms, a traditional CHM used for nourishing and enhancing strength. PZ is reported to exert anti-fatigue, learning-improving, and immune-enhancing effects (Huang et al., 2011). Studies have further reported that PZ can act as a promising agent against skin aging (Zhai et al., 2015; Choi et al., 2016). By promoting epidermal cell proliferation and self-renewal, PZ thickens the epidermis to maintain skin structure and resistance to aging. Moreover, PZ increases the expression of p63 and proliferating cell nuclear antigen (PCNA), as well as integrin α6, integrin β1, and type IV collagen. In particular, the mRNA of type IV collagen is also increased, possibly as a result of miR-135b downregulation (Choi et al., 2016). As a result, the miR-135b/type IV collagen axis may be the regulatory mechanism underlying the anti-senescence effect of PZ.

#### Osthole

Osthole (Ost, 7-methoxy-8-(3-methyl-2-butenyl)-2H-1-benzopyran-2-one; **Chem. 34**) is mainly extracted from Cnidium monnieri (L.) Cuss., which was commonly used for nourishing "yang qi" and relieving itching in TCM practice. Recent pharmacological studies revealed that Ost can improve memory, delay senescence, and resist cell damage in Alzheimer's disease (AD) (Hu et al., 2013; Zheng et al., 2013). Because beta-amyloid peptide (Aβ) is central to the pathology of AD, inhibition of Aβ deposition has become an important treatment strategy for the disease (Wilcock et al., 2009). Ost was reported to enhance cyclic AMP response element-binding protein (CREB) phosphorylation and then inhibit Aβ cytotoxicity on neural cells (Hu et al., 2013). A further mechanistic study indicates that it upregulates miR-107 and promotes neuronal cell viability, resulting in suppression of the protein expression of Aβ and BACE1, as well as LDH (Xiao et al., 2017). Therefore, Ost exerts a clear neuroprotective effect by targeting miR-107 and impeding Aβ deposition, making it a potential treatment agent for neurogenic aging and neurodegenerative disease.

### INHIBITORY EFFECTS OF CHMs ON STRUCTURAL REMODELING

Structural remodeling is an important factor that can impede the normal functions of tissues and organs. It is also the main pathological change during the late stages of various diseases, leading to poor prognosis and making treatment difficult (Bijkerk et al., 2019; Bittencourt et al., 2019; Zhuang et al., 2019). Encouragingly, three CHM ingredients have been demonstrated to exert protective effects on this pathological change through regulation of miRNAs.

#### Panax Notoginseng Saponins

Panax notoginseng saponins (PNS, notoginsenoside-fe, 98%; **Chem. 35**) are a chemical mixture extracted from the root of Panax notoginseng (Burk.) F. H. Chen. According to TCM theory, the traditional herb can simultaneously promote blood circulation and prevent bleeding; therefore, it was commonly used to treat coronary artery disease, stroke, gastrointestinal bleeding, irregular menstruation, and bruises. Currently, several PNS preparations, including xuesaitong injections and xuesetong capsules, are widely used to treat cardiovascular diseases (Song et al., 2017). The improvement in cardiac prognosis caused by PNS has been attributed to its regulation of miRNAs and suppression of structural remodeling. PNS was reported to increase expression of the anti-fibrotic factor, miR-29c, which is clearly reduced in mice with isoproterenol-induced myocardial fibrogenesis, leading to downregulation of its target genes: collagen (Col) 1a1, Col1a2, Col3a1, Col5a1, Fbn1, and TGFβ1, thus exerting protective effects against myocardial injury and fibrosis (Liu L. et al., 2017). In addition, PNS clearly protects against H2O2-induced oxidative damage, showing anti-apoptosis activity in vascular endothelial cells (VECs) by suppressing miR-146b-5p expression (Wang J. et al., 2017). Moreover, notoginsenoside R1, one of the main components of PNS, can delay the process of senescence in VECs by regulating the miR-34a/SIRT1/p53 pathway (Lai et al., 2018). As a result, by repairing damage to VECs, PNS inhibits vascular pathological processes.

PNS also has an active role in tumors complicated by myocardial ischemia, a setting in which the treatment strategies for the two conditions conflict. PNS and its major components, Rg1, Rb1, and R1, are implicated in tissue-specific regulation of angiogenesis, and can inhibit tumor growth, as well as attenuating myocardial ischemia. The potential underlying mechanism may be the downregulation of miR-18a and vascular markers (CD34 and vWF) in tumor, with simultaneous up-regulation of these factors in heart (Yang Q. et al., 2014). Notably, by modulating the Met/miR-222 axis and thereby increasing expression of its target genes, the tumor suppressors p27 and PTEN, PNS selectively inhibits the survival of Lewis lung carcinoma cells and attenuates tumor growth in mice (Yang Q. et al., 2016).

Based on the above results, PNS appears to exert its cardioprotective function by preventing fibrosis, improving vascular endothelium, and promoting angiogenesis in the heart. Given that it combines cardioprotective and antitumor effects through targeting miRNAs, PNS may be especially suitable for patients with both heart disease and tumors.

#### Tetrandrine

Tetrandrine (TET, 6,6′,7,12-tetramethoxy-2,2′-dimethyl-1 beta-berbaman; **Chem. 36**) is a bisbenzylisoquinoline alkaloid extracted from Stephania tetrandra S. Moore. The medicinal plant was often used for treatment of rheumatism and edema under the theories of TCM. Notably, the source plant should not be confused with Radix Aristolochiae Fangchi which causes nephrotoxicity, despite their similar Chinese names. A study identified a new pharmacological effect of TET in the treatment of hypertrophic scarring. The underlying mechanism was suggested to be repression of DNA and collagen synthesis in scar-derived fibroblasts (Liu et al., 2001). Furthermore, by upregulating miR-27b and downregulating miR-125b, TET influenced the expression of putative targets, including VEGFC, BCL2L12, COL4A3, and FGFR2, predicted to contribute to several scar and wound healing-related signaling pathways (Ning et al., 2016). Consequently, TET has therapeutic potential for the inhibition of skin tissue hyperplasia after wounding or surgery.

#### Leonurine

Leonurine (LEO, 4-Guanidino-n-butyl syringate; **Chem. 37**) is the alkaloid compound from Leonurus artemisia (Laur.) S. Y. Hu F, which was commonly used for gynecological diseases in TCM. Recent studies of LEO activity have indicated cardioprotective effects, namely anti-atherosclerosis (Jiang T. et al., 2017), anti-oxidation (Gao et al., 2016), and resistance to cardiomyocyte hypertrophy (Lu et al., 2019). LEO treatment can significantly reduce the surface area of hypertrophic cardiomyocytes and decrease the content of atrial natriuretic peptide (ANP), endothelin-1 (ET-1), p38 mitogen-activated protein kinase (p38 MAPK), phosphorylated p38 MAPK (p-p38 MAPK), myocyte enhancer factor 2, and β-myosin heavy chain. Moreover, it up-regulates the expression levels of α-myosin heavy chain protein and miR-1. Thus, by upregulating miR-1 expression and then inhibiting the activation of the p38 MAPK signaling pathway, LEO may inhibit AngII-induced cardiomyocyte hypertrophy and structural remodeling (Lu et al., 2019).

### OTHER EFFECTS OF CHMs

Bioactive ingredients of CHMs can exert various protective effects through targeting different miRNAs, lncRNAs, or circRNAs. Besides the above-mentioned mechanisms, some ingredients also play positive roles in anti-I/R injury, anti-arrhythmia, recovery of the blood-spinal cord barrier, and promotion of cardiac differentiation by targeting lncRNA and miRNA.

#### Calycosin/Astragaloside IV

Calycosin (CAL, 7,3′-dihydroxy-4′-methoxyisoflavone; **Chem. 38**) is a natural phytoestrogen derived from Astragalus membranaceus (Fisch.) Bunge., which can nourish "yang qi" and was commonly used for cardiovascular and cerebrovascular diseases in TCM practice. The neuroprotective effect of CAL has been linked to miRNA. CAL markedly improves the infarct volume, brain water content, and neurological deficit in rats with cerebral I/R injury by upregulating miR-375, ER-α, and Bcl-2 and inhibiting RASD1 expression (Wang Y. et al., 2014). However, that study did not establish a systematic mechanism connecting miR-375 to these downstream targets.

Moreover, CAL also has anti-cancer activity, broadening the pharmacological effects and applications of Astragalus membranaceus extracts (Tseng et al., 2016; Kong et al., 2018). CAL has been shown to significantly suppress lncRNA EWSAT1 expression in nasopharyngeal carcinoma (NPC), which in turn affects downstream factors and pathways and leads to growth inhibition. Furthermore, lncRNA EWSAT1 overexpression can reverse the CAL-induced effect, indicating that lncRNA EWSAT1 is a promising specific target of CAL (Kong et al., 2018).

Additionally, Astragaloside IV (ASIV, 3-O-beta-D-xylopyranosyl-6-O-beta-D-glucopyranosylcycloastragenol; **Chem. 39**), another bioactive compound of Astragalus membranaceus (Fisch.) Bunge., can ameliorate precancerous lesions of gastric carcinoma (PLGC) markedly. It lowers mRNA and protein expressions of LDHA, MCT1, MCT4, HIF-1α, and CD147, as well as increasing TIGAR and p53 content. Furthermore, ASIV treatment promotes miR-34a expression. As a result, ASIV improves abnormal glycolysis and dysplasia possibly via regulation of miR-34a/LDHA pathway (Zhang C. et al., 2018).

Interestingly, the total flavonoids of Astragalus membranaceus (Fisch.) Bunge. (TFA) can improve heart function damaged by viral myocarditis. By upregulating the expression of miR-378 and miR-378<sup>∗</sup> in cardiomyocytes infected with coxsackie B3 virus, TFA may inhibit cardiac hypertrophy and improve prognosis (Nagalingam et al., 2013; Wan et al., 2017). Therefore, it can be speculated that the heart protection of TFA is attributed to inhibition of myocardial pathology by regulating miR-378 and miR-378<sup>∗</sup>.

From the above results, it can be seen that although CAL, ASIV, and TFA are extracted from the same herb, they have different targets and are applicable to distinct diseases, demonstrating the need to study the individual ingredients of CHMs and their targets.

#### Paeonol

Paeonol (PAE, 2′-Hydroxy-4′-methoxyacetophenone; **Chem. 40**) is the main bioactive ingredient of Paeonia suffruticosa Andr. and Cynanchum paniculatum (Bunge) Kitagawa. The two herbs were used to promote blood circulation and to treat cardiovascular diseases in TCM practice. Further pharmacological research shows that PAE significantly reduces the incidence of ischemic arrhythmia in rats, lowering the frequency of ventricular premature beats, ventricular tachycardia, and ventricular fibrillation. Moreover, it markedly decreases myocardial infarct size. The potential treatment target is miR-1, which is inhibited by PAE in cardiomyocytes (Zhang and Xiong, 2015). Nevertheless, the above study only reveals a possible regulatory relationship between PAE and miR-1; this relationship needs further verification, and the downstream target genes of the miRNA remain to be identified.

#### Salvianolic Acid A

Salvianolic acid A (Sal A, (R)-3-(3,4-Dihydroxyphenyl)-2-(((E)-3-(2-((E)-3,4-dihydroxystyryl)-3,4-dihydroxyphenyl)acryloyl)oxy)propanoic acid; **Chem. 41**) is derived from the water-soluble phenolic compounds of Salvia miltiorrhiza Bge. It has protective effects against I/R injury (Jiang et al., 2008; Yang et al., 2011), promotes recovery of neurological function (Yu D. S. et al., 2017), and has anti-cancer activity (Chen et al., 2016; Lu et al., 2016); the latter two pharmacological actions are attributed to miRNA regulation. Sal A is reported to significantly increase the expression of tight junction proteins and HO-1 and to decrease p-caveolin-1 and apoptosis-related proteins, resulting in recovery of blood-spinal cord barrier integrity after spinal cord injury (SCI). Furthermore, an HO-1 inhibitor can attenuate the regulation of ZO-1, occludin, and p-caveolin-1 by Sal A. The underlying target and mechanism may be upregulation of miR-101, which promotes expression of nuclear factor erythroid 2-related factor 2 (Nrf2) and HO-1. Conversely, a miR-101 inhibitor increases the permeability of rat brain microvascular endothelial cells and the protein level of Cul3, whose mRNA is targeted by miR-101. As a result, Sal A improves neurological function after SCI through targeting the miR-101/Cul3/Nrf2/HO-1 signaling pathway (Yu D. S. et al., 2017).

Another study indicated that Sal A can also down-regulate the expression of the multidrug resistance gene MDR1 in lung cancer, thereby emerging as a new treatment agent for drug-resistant lung cancer. The potential mechanism may be related to up-regulation of four miRNAs: miR-3686, miR-4708-3p, miR-3667-5p, and miR-4738-3p (Chen et al., 2016). This study attempts to find the upstream target of Sal A against MDR1 from the perspective of post-transcriptional regulation; however, the current results cannot directly confirm a regulatory relationship between the four miRNAs and MDR1. Thus, further and more detailed experiments are needed.

#### Andrographolide

Andrographolide (Andro, 17-hydro-9-dehydroandrographolide; **Chem. 42**) is a diterpenoid lactone compound derived from Andrographis paniculata (Burm. f.) Nees, a natural anti-bacterial and anti-viral CHM. A recent study further demonstrates that Andro can inhibit hepatoma tumor growth. It promotes the expression of 22 miRNAs but reduces that of 10 others in a xenograft mouse tumor model in vivo. Among the upregulated miRNAs, miR-222-3p, miR-106b-5p, miR-30b-5p, and miR-23a-5p are confirmed in cell experiments in vitro. Functional analysis reveals that these miRNAs are mainly involved in the signaling pathways of miRNAs in cancer, MAPKs, and focal adhesion. Moreover, the expression of 24 target genes involved in these signaling pathways is shown to be consistent with the miRNA expression (Lu et al., 2016). As a result, Andro prevents hepatoma tumor growth partially through regulating the miRNA profile; however, the specific target and underlying mechanism still need deeper study.

#### Puerarin

Puerarin (PUE, 7,4'-Dihydroxy-8-C-glucosylisoflavone; **Chem. 43**) is the main active ingredient extracted from Radix Puerariae Lobatae, which was used to improve symptoms of fever, neck stiffness, thirst, and diarrhea in ancient China. Modern studies reveal the cardiovascular protection of PUE in myocardial infarction (Zhang et al., 2006) and arrhythmia (Zhang et al., 2011). These effects may be attributed to promotion of cardiac differentiation (Cheng et al., 2013; Wang L. et al., 2014). PUE upregulates expression of caveolin-3, amphiphysin-2, and junctophilin-2, and then ameliorates myofibril array and sarcomere formation, accompanied by increased t-tubule development in embryonic stem cell-derived cardiomyocytes. Moreover, PUE suppresses miR-22, the upstream regulatory factor of caveolin-3, indicating that the miR-22/caveolin-3 axis may be the mechanism underlying PUE-induced cardiomyogenesis (Wang L. et al., 2014).

### CONCLUSION

CHM has long been a powerful weapon used by Chinese people to combat disease. Over thousands of years, practitioners of CHM have accumulated a wealth of knowledge which is used to prevent and cure diseases. The theoretical concepts of TCM act as the basis for scientific research into CHM today, including identification of the bioactive ingredients and underlying mechanisms of CHMs that could be of benefit internationally, a gift from the Chinese people to the world (Tu, 2011, 2016). Chinese pharmacologist Youyou Tu and her team discovered the highly effective and low-toxicity bioactive ingredient "artemisinin" from Artemisia annua L. (also referred to as "qinghao" in Chinese), inspired by "A Handbook of Prescriptions for Emergencies" (written around 317–420 CE), thus making an outstanding contribution to the global treatment of malaria (Tu, 2016). Consequently, we became convinced that study of the bioactive ingredients of CHMs is an effective way to reveal their potential mechanisms of action and further broaden their clinical application.

By conducting the above comprehensive review, we found that bioactive ingredients of CHMs can play positive roles in the treatment of cancer and of cardiovascular, nervous system, respiratory, digestive, infectious, and senescence-related diseases. By targeting various miRNAs, lncRNAs, circRNAs, or ceRNA crosstalk, these ingredients exert protective effects, including pro-apoptosis, anti-proliferation and anti-migration, anti-inflammation, anti-atherosclerosis, anti-infection, anti-senescence, and anti-structural remodeling. Some miRNAs, including miR-21, miR-34a, miR-34c, miR-155, miR-29a, miR-203, miR-27b, miR-184, and miR-143, contributed to the treatment mechanisms of more than one bioactive ingredient of CHMs. In particular, miR-21 was identified as targeted and regulated by BBR (Luo et al., 2014), TP (Li et al., 2016), RES (Wang G. et al., 2015), CA (Qu et al., 2018), ICA (Li J. et al., 2015), AIL (Yang P. et al., 2018), Car/Thy (Khosravi and Erle, 2016), DHM (Yang D. et al., 2018), and COR (Yang et al., 2017), especially in anti-cancer activities, indicating that this miRNA is stably targetable and responsive to the pharmacological effects of various CHMs. Moreover, miR-155 was associated with inflammatory responses and could be inhibited by CUR, RES, Tan IIA, CA, Car/Thy, and AKBA in inflammation-related diseases (Tili et al., 2010; Fan et al., 2016; Khosravi and Erle, 2016; Ma F. et al., 2017; Xuan et al., 2017; Qu et al., 2018; Sayed et al., 2018). Thus, it is highly likely that miR-155 could represent a new anti-inflammatory treatment target. In addition, three complex ceRNA crosstalk networks were found to function in the therapeutic mechanisms of ART, Sch B, and CA. Specifically, ART regulates the lncRNA UCA1/miR-184/BCL-2 axis to inhibit prostate cancer (Zhou S. et al., 2017), while the hsa_circ_0043256/miR-1252/ITCH axis was involved in the treatment of non-small cell lung cancer by CA (Tian F. et al., 2017).
The miR-150/lncRNA BCYRN1 axis was targeted by Sch B treatment, leading to suppression of cell proliferation in asthma (Zhang X. Y. et al., 2017). All of these complex networks provide foundations for in-depth understanding and broader application of CHMs in the near future.

The interactions between bioactive ingredients of CHMs and their ncRNA targets are the subject of intensive and rapidly expanding research. This research has helped to reveal the treatment mechanisms underlying the activities of CHMs and offers promising complementary and alternative treatments for diseases, based on scientific evidence. Some previous reviews have highlighted the increasing importance of bioactive ingredients (Huang et al., 2016; Lelli et al., 2017) or CHMs (Hong et al., 2015) in the treatment of diseases by targeting ncRNA. However, most of them focused on one kind of ncRNA (particularly miRNA) or were limited to a specific disease (mostly cancer) (Mohammadi et al., 2017; Mirzaei et al., 2018), rather than covering ncRNA targets, ceRNA crosstalk, and the corresponding mechanisms comprehensively. We therefore considered it necessary to systematically review the treatment mechanisms by which bioactive ingredients from CHMs target miRNA, lncRNA, and circRNA. Our review shows that studies are currently in initial and exploratory phases and that several critical problems remain. First, various individual ncRNA molecules are targets of CHM bioactive ingredients; however, recent results are far from sufficient to allow understanding of the complex regulatory interactions between circRNA, lncRNA, miRNA, and mRNA in the treatment of diseases. Second, the metabolism of drugs in single cell lines and animals may differ from that in the human body; therefore, results based on basic research require further verification in clinical trials. Third, each CHM generally contains numerous ingredients, and a TCM clinical prescription often consists of several CHMs; therefore, multiple targets and ceRNA crosstalk must be involved, and the study of classic TCM formulae will further complicate the picture.

In conclusion, ncRNAs are potential targets of CHMs and understanding of ceRNA crosstalk has helped to reveal the complex mechanisms underlying multi-target and multi-level regulation of bioactive ingredients from CHMs. Therefore, CHM ingredients represent new and promising choices for future alternative disease treatments.

#### AUTHOR CONTRIBUTIONS

JW and JL designed the study. YD, HC, and JG conducted searches and extracted the data. YL and YD analyzed the data. YD wrote the manuscript.

#### FUNDING

This study was supported by the National Natural Science Foundation of China (No. 81673847; No. 81473561).

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Dong, Chen, Gao, Liu, Li and Wang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

### APPENDIX

Chem. 1–43 Chemical formulae of bioactive ingredients from CHMs.

Chem. 9 Oridonin; Chem. 10 Curcumin; Chem. 11 Shikonin; Chem. 12 Paeoniflorin; Chem. 13 Honokiol; Chem. 14 Schisandrin B; Chem. 17 Matrine; Chem. 18 Corylin; Chem. 41 Salvianolic acid A; Chem. 42 Andrographolide

# Drugs Targeting Epigenetic Modifications and Plausible Therapeutic Strategies Against Colorectal Cancer

#### *Srinivas Patnaik\* and Anupriya*

*School of Biotechnology, KIIT University, Bhubaneswar, India*

Genetic variations along with epigenetic modifications of DNA are involved in colorectal cancer (CRC) development and progression. CRC is the fourth leading cause of cancer-related deaths worldwide. Initiation and progression of CRC result from the accumulation of a variety of genetic and epigenetic changes in colonic epithelial cells. Colorectal carcinogenesis is associated with epigenetic aberrations including DNA methylation, histone modifications, chromatin remodeling, and non-coding RNAs. Recently identified epigenetic modifications include hypermethylation of the gene Claudin11 (CLDN11), which is associated with metastasis and poor survival in CRC. DNA methylation of the genes CMTM3, SSTR2, MDFI, NDRG4, and TGFB2 is a potential epigenetic biomarker for the early detection of CRC. Tumor suppressor candidate 3 (TUSC3) mRNA expression is silenced by promoter methylation, which promotes epidermal growth factor receptor (EGFR) signaling and rescues CRC cells from apoptosis, leading to a poor survival rate. Previous evidence strongly suggests that epigenetic modifications contribute to anticancer drug resistance. Recent studies emphasize the development of histone deacetylase (HDAC) inhibitors and DNA methyltransferase inhibitors as an emerging anticancer strategy. This review covers chemotherapeutic drugs that target epigenetic modifications and their probable implementation in the treatment of CRC, which offers a strong rationale to explore therapeutic strategies and provides a basis for developing potent antitumor drugs.

#### Keywords: drugs, histone, colorectal cancer, therapy, epigenetics

#### *Edited by:*

*Chandravanu Dash, Meharry Medical College, United States*

#### *Reviewed by:*

*Yang Zhang, University of Pennsylvania, United States; Zhiguo Xie, Central South University, China*

*\*Correspondence: Srinivas Patnaik srinivas.patnaik@kiitbiotech.ac.in*

#### *Specialty section:*

*This article was submitted to Pharmacogenetics and Pharmacogenomics, a section of the journal Frontiers in Pharmacology*

*Received: 16 January 2019 Accepted: 08 May 2019 Published: 06 June 2019*

#### *Citation:*

*Patnaik S and Anupriya (2019) Drugs Targeting Epigenetic Modifications and Plausible Therapeutic Strategies Against Colorectal Cancer. Front. Pharmacol. 10:588. doi: 10.3389/fphar.2019.00588*

### INTRODUCTION

Colorectal cancer (CRC) is the third most common cancer in men and the second most common in women, with an expected increase in burden of 60% in the coming 10 years (Arnold et al., 2017). The mechanisms underlying CRC pathogenesis and progression remain subjects of extensive investigation in the field of cancer biology. It is known that CRC results from the accumulation of both genetic and epigenetic alterations of the cellular genome, which drive transformation of normal glandular epithelium into adenocarcinoma. Through further alterations in genetic and epigenetic profiles, CRC can acquire the migration and invasion capability needed to metastasize to other parts of the body (Coyle et al., 2017; Hong, 2018). Epigenetic changes are defined as non-genetic influences on gene expression. These changes do not involve changes in DNA sequence but are heritable. Epigenetic aberrations affect every aspect of tumor development from initiation to metastasis. Anomalous expression of genes including p53, Ras, beta-catenin, transcription factors involved in embryogenesis, and DNA mismatch repair genes drives the progression of the disease from benign adenoma to malignant adenocarcinoma (Vaiopoulos et al., 2014). Three major classical pathways of genomic instability contribute to the development of CRC: microsatellite instability (MSI), chromosomal instability (CIN), and CpG island methylator phenotype (CIMP) (Armaghany et al., 2012). CIMP-positive CRCs show extensive hypermethylation in the CpG islands of a panel of marker genes: CRABP1, SOCS1, RUNX3, MLH1, CACNA1G, NEUROG1, CDKN2A, and IGF2 (Bae et al., 2017). The aberrant methylation of tumor suppressor genes results in inactivation of these genes and subsequent promotion of neoplasia.

The field of epigenetics includes DNA modifications, histone modifications, and nucleosome remodeling. These modifications include methylation, acetylation, PARylation, citrullination, phosphorylation, ribosylation, sumoylation, and ubiquitylation. Among these, methylation and acetylation are the most widely studied. One of the main causes of cancer morbidity is late detection. The methylation and acetylation status of a few genes can be considered as potential epigenetic biomarkers for the early detection of CRC.

DNA methylation, one of the most important epigenetic phenomena, occurs at the cytosine of CpG dinucleotides within CpG islands and marks the inactivation of several key tumor suppressor genes, which drives the initiation and progression of CRC (Ichimura et al., 2015). Ezh2-mediated trimethylation of lysine 27 of histone 3 (H3K27me3) leads to inactivation of tumor suppressor genes and hence increases the EMT phenotype and malignancy (Tiwari et al., 2013). Epigenetic modifications have also been suggested as early-stage biomarkers for cancer. In the early stage of CRC, several tumor suppressor genes, CMTM3, SSTR2, and MDFI, are markedly hypermethylated in CRC tissues compared with adjacent normal colorectal tissues (Li et al., 2017a). The DCC (Deleted in CRC) gene carries repressive histone-tail marks, namely trimethylation of lysine 9 and lysine 27 of H3 (H3K9me3 and H3K27me3) (Derks et al., 2009). Promoter regions of the tumor suppressor genes hMLH1, MGMT, APC, and CDH1 were found to be hypermethylated in early stages of tumor formation in colon adenocarcinomas (Michailidi et al., 2015). Hypermethylation of the promoter of MGMT (O6-alkylguanine DNA alkyltransferase), a gene involved in DNA repair, is common in brain metastases from CRC and in the corresponding primary tumors (Maglio et al., 2015). Promoter hypermethylation of the tumor suppressor gene MGMT, along with that of RASSF1A and FHIT, is correlated with tumor stage and metastasis (Sinha et al., 2013). All of the abovementioned genes are involved in either tumor suppression or DNA repair, and their inactivation by methylation results in uncontrolled and unchecked growth of cancer. Overall, methylation of certain genes is associated with more advanced tumor stage, poorly differentiated cancer cells, and metastasis.

Acetylation and deacetylation of histones and non-histone proteins are crucial events for gene regulation. The histone acetylation state is maintained by two crucial classes of enzymes, histone acetyltransferases (HATs) and histone deacetylases (HDACs) (Verdone et al., 2006). Histone acetylation is an indicator of unconstrained gene expression, whereas deacetylation ensures repression of gene expression. Therefore, HATs and HDACs are associated with hyperactivity and hypoactivity of genes, respectively. Several lines of evidence support the significance of this kind of epigenetic modification in CRC. CNT2 (concentrative nucleotide transporter 2) is a pharmacologically important gene because it encodes a transporter that mediates the uptake of both natural nucleosides and nucleoside-derived drugs. However, CNT2 expression is significantly repressed in CRC due to hypoacetylation of the promoter region (Ye et al., 2018). One of the major regulatory cascades in CRC is the Wnt/β-catenin signaling pathway, which is also controlled by epigenetic mechanisms (Zhang et al., 2018). Cell-cycle-related and expression-elevated protein (CREPT), along with p300 (a HAT), is involved in the acetylation of β-catenin, promoting oncogenic Wnt/β-catenin signaling and CRC (Zhang et al., 2018). Many more reports link acetylation to the progression of CRC. Reduced expression of Scaffold Matrix Attachment Region Binding Protein 1 (SMAR1) is associated with poor prognosis of cancer. Loss of SMAR1 leads to enriched H3K9 acetylation of the β-catenin promoter, which further activates the Wnt/β-catenin signaling pathway and CRC progression (Taye et al., 2018). Cancer stem cells are one of the main reasons for cancer relapse and drug resistance, and their generation is associated with acetylation. The JADE3 (Jade family PHD finger 3) gene, whose product acetylates histones during transcription, was found to be upregulated in colon cancer cell lines.
Its overexpression increases while knockdown decreases the stem-cell-like property of colon cancer cells both *in vivo* and *in vitro*. Reduction in expression of JADE3 impairs the tumor-initiating property of cancer cells *in vivo*. JADE3 interacts with the promoters of LGR5 (colon stem cell marker) and activates its transcription by increasing the occupancy of p300 acetyltransferase and histone acetylation, hence substantially inducing Wnt/β-catenin signaling (Jian et al., 2018). Overexpression of HDACs has been shown to promote migration, invasion, tumorigenesis, and metastasis. HDAC has also been suggested to modulate the expression of IL-10 as a transcriptional activator (Cheng et al., 2014). Due to the fundamental significance of HDAC for cellular de-differentiation processes, HDAC inhibition has been proposed as a strategy to re-balance transcription of those genes deregulated in cancer.

In recent years, our understanding of the role of epigenetic factors in cancer has improved. The inherently plastic nature of epigenetic changes opens possible lines of targeted treatment using specific inhibitors of proteins involved in epigenetic modifications. The FDA has approved several of these compounds for the treatment of certain cancers, for example, azacytidine (market name Vidaza), a nucleoside-like compound. This drug acts as a cytidine analog in which the carbon at the fifth position is replaced by a nitrogen atom. During replication, azacytidine is incorporated into DNA. It is recognized by DNMT1 and forms an irreversible DNMT1–aza linkage that triggers DNMT1 degradation and leads to an overall reduction in methylation (Garcia-Manero et al., 2019; Nehme et al., 2019). Inhibitors that block the catalytic activity of HATs and HDACs have also been identified in many cancer types. p300 is a histone acetyltransferase that acetylates β-catenin in many cases of CRC (Zhang et al., 2018). A number of synthetic as well as natural HAT inhibitors (HATi) are used to inhibit p300, including bisubstrate inhibitors, garcinol, anacardic acid, C646, and the natural HATi curcumin. HATi are less selective and bind many non-specific targets, whereas HDAC inhibitors (HDACi) are relatively more specific. Many potential HDACi are on the market. Hydroxamic acids have emerged as potent chemotherapeutic drugs that inhibit class I and II HDACs. In fact, suberanilohydroxamic acid (SAHA) has been shown to inhibit HDACs at very low concentrations (Guo and Zhang, 2012). This review covers chemotherapeutic drugs that target epigenetic mechanisms, as well as compounds with the potential to be used as such drugs, for the restriction and treatment of CRC.

#### REGULATION OF COLORECTAL CANCER BY EPIGENETIC MECHANISM

CRC is the third most common cancer in men and the second most common cancer in women worldwide. For decades, gene mutations have been known to contribute significantly to cancer development. However, epigenetic changes have recently been shown to play a potential role in cancer progression, specifically in CRC, where epigenetic modifications have been demonstrated to act earlier than genetic modifications (Porcellini et al., 2018). Three major classical pathways of genomic instability contribute to the development of CRC: MSI, CIN, and CIMP (Armaghany et al., 2012). CIMP-positive CRCs show extensive hypermethylation in the CpG islands of a panel of marker genes: CRABP1, SOCS1, RUNX3, MLH1, CACNA1G, NEUROG1, CDKN2A, and IGF2 (Weisenberger et al., 2006). Comprehensive genomic and epigenomic studies have revealed the heterogeneity of CRC. The epigenetic status of the eight CIMP marker genes also correlates with this heterogeneity. Compared with CRCs with fewer than five methylated genes, CRCs with five or six methylated genes showed a moderate increase in MLH1 methylation, an MSI-high status, CK7 overexpression, and downregulation of CK20 and CDX2. CRCs with seven or eight methylated CIMP marker genes, which also showed overexpression of CK7 and downregulation of CK20 and CDX2, had a high incidence of *BRAF* mutations and lacked *KRAS* mutations. Based on these trends, CRCs were classified into CIMP-negative (CIMP-N, 0–4 methylated genes), CIMP-positive 1 (CIMP-P1, 5–6 methylated genes), and CIMP-positive 2 (CIMP-P2, 7–8 methylated genes) categories (Bae et al., 2017). Wnt/β-catenin signaling has been implicated in a variety of cancers and other diseases. Loss of the Wnt signaling negative regulator adenomatous polyposis coli (APC) is a major hallmark of human CRCs (Novellasdemunt et al., 2015). APC is a tumor suppressor that blocks transition of cells from G1 to S phase.
Stem cells reside at the base of the colonic crypts and are maintained in their native undifferentiated state through Wnt/β-catenin signaling. This signaling supports the survival of normal stem cells as well as cancer stem cells. β-Catenin regulates the migration of stem cells out of the epithelial crypts as they differentiate. During this process, many stem cells acquire mutations. In healthy people, however, these cells are normally sloughed off within a week after apoptosis and hence do not have enough time to induce cancer. The APC gene product downregulates Wnt/β-catenin signaling through its ability to bind β-catenin and mediate its degradation. Hypermethylation of the promoter region of APC causes its inactivation, which leads to the accumulation of β-catenin. Accumulation of β-catenin in enterocyte precursors results in retention of a stem cell phenotype, which prevents them from migrating to the surface to be sloughed off. The accumulation of these undifferentiated cells eventually leads to polyp formation in the colonic crypts (Armaghany et al., 2012). Mismatch repair gene promoter hypermethylation has been frequently observed in sporadic CRC with MSI, while hypermethylation of the APC promoter is positively correlated with CRC metastasis (Van Engeland et al., 2011; Roy and Majumdar, 2012). Another DNA-repair gene, MGMT, is silenced in CRC by hypermethylation of its promoter region, which favors mutation of the p53 and KRAS genes. MGMT promoter hypermethylation can already be seen in precancerous polyps (Matsubara, 2012). This finding suggests that hypermethylation of certain genes at the early stage of CRC could provide promising diagnostic biomarkers. Hypermethylation-mediated silencing of genes in CRC has been widely studied. Recently, however, many reports have described the involvement of histone modification in CRC progression, in particular acetylation of lysine residues of H3 and H4.
The acetylation level is regulated and balanced by both HAT and HDAC activity (Dawson and Kouzarides, 2012). Specifically, members of the HDAC family, including HDAC1, HDAC2, HDAC3, HDAC5, and HDAC7, are commonly upregulated in CRC (Barneda-Zahonero and Parra, 2012). HDAC2 upregulation was observed as one of the earliest events in CRC carcinogenesis and may serve as an early-stage biomarker for detection (Stypula-Cyrus et al., 2013). HDACs are responsible for silencing of tumor suppressor genes. The Wnt/β-catenin target genes CDX1 and EPHB act as tumor suppressors in intestinal epithelial cells and are frequently downregulated in CRC. One study showed that HDAC1 and HDAC3 caused a strong reduction of active histone modifications in the promoter region of CDX1. Unlike the inactive CDX1 locus, EPHB-encoding DNA was hypomethylated in the promoter regions in the silent state. Treatment with both DNMT inhibitors and HDACi restored the activity of the tumor suppressor genes (Rönsch et al., 2011).
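The CIMP categories described above (Bae et al., 2017) amount to a counting rule over the eight marker genes. The following is a minimal illustrative sketch of that rule only; the function name `classify_cimp` and the input format (a list of gene symbols already called as methylated) are assumptions made for illustration and are not taken from the cited studies, which also rely on an upstream methylation-calling step that is not modeled here.

```python
# Eight CIMP marker genes named in the text.
CIMP_MARKERS = frozenset({
    "CRABP1", "SOCS1", "RUNX3", "MLH1",
    "CACNA1G", "NEUROG1", "CDKN2A", "IGF2",
})


def classify_cimp(methylated_genes):
    """Return the CIMP category for a tumor.

    `methylated_genes` is a hypothetical input: an iterable of gene symbols
    that an upstream assay has already called as methylated. Genes outside
    the eight-marker panel are ignored.
    """
    count = len(CIMP_MARKERS & set(methylated_genes))
    if count <= 4:
        return "CIMP-N"   # 0-4 methylated marker genes
    if count <= 6:
        return "CIMP-P1"  # 5-6 methylated marker genes
    return "CIMP-P2"      # 7-8 methylated marker genes


if __name__ == "__main__":
    sample = ["MLH1", "RUNX3", "SOCS1", "IGF2", "CDKN2A", "NEUROG1"]
    print(classify_cimp(sample))
```

Using a set intersection means that duplicate entries and non-panel genes do not inflate the count, so the thresholds apply only to the eight markers.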

#### Histone Deacetylase Inhibitors

HATs mediate acetylation of amino acid residues within histone tails, which loosens chromatin structure and thus makes target genes more accessible to transcription factors. Conversely, HDACs catalyze histone deacetylation, resulting in chromatin condensation and transcriptional repression (Eberharter and Becker, 2002). The HDAC family consists of 18 members, subdivided into four classes: I (HDAC1–HDAC3 and HDAC8), II (HDAC4–HDAC7, HDAC9, and HDAC10), III (sirtuins 1–7), and IV (HDAC11) (Barneda-Zahonero and Parra, 2012). Upregulated expression of HDAC2 is found in the early stages of CRC, along with that of HDAC1, HDAC3, HDAC5, and HDAC7 (Stypula-Cyrus et al., 2013). Expression levels of HDACs vary according to the tissue type. Upregulation of HDAC1 has been seen in prostate, gastric, lung, esophageal, breast, and colon cancer. HDAC2 was found to be upregulated in cervical, gastric, and colorectal cancer, whereas HDAC6 and HDAC11 are more highly expressed in breast cancer and rhabdomyosarcoma, respectively (Eckschlager et al., 2017). HDAC-regulated proteins include STAT3, tumor protein p53, Myc, RUNX3, β-catenin, EKLF, estrogen receptor, the GATA family, HIF-1α, Foxp3, NF-κB, and MyoD, which play major roles in cancer progression (Hull et al., 2016). Studies have shown that HDAC1 inhibition controls proliferation and inhibits tumorigenicity, while HDAC3 inhibition reduces cell migration and is accompanied by overexpression of epithelial markers (Hayashi et al., 2010). Several lines of evidence have demonstrated that aberrant expression of HDACs is often associated with poor prognosis as well as poor survival rates in CRC. HDACi have emerged as novel drugs with potent anticancer activity in both preclinical experiments and clinical trials. For example, vorinostat (SAHA), romidepsin, belinostat, and panobinostat are potent HDACi approved for cancer therapy by the FDA (Eckschlager et al., 2017).
Overall, HDACi reduce metastasis by reducing the expression of genes involved in migration, angiogenesis, epithelial-to-mesenchymal transition (EMT), and cell survival while enhancing the expression of genes involved in apoptosis (**Figure 1**).

HDACi may be specific or act against all types of HDACs. HDACi can be classified into five categories on the basis of nature of compounds—hydroxamic acids, short chain fatty acids, benzamides, cyclic tetrapeptides, and sirtuin inhibitors (Ceccacci and Minucci, 2016). Currently, there are a number of HDACi that are in use or under clinical trials.
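The four-class subdivision of the 18 HDAC family members given above can be written as a small lookup table, which makes it easy to check which class a given enzyme belongs to. This is only an illustrative sketch: the names `HDAC_CLASSES`, `CLASS_OF`, and `hdac_class` are hypothetical, and the sirtuins are written with the symbols SIRT1 to SIRT7 as an assumed naming convention, since the text refers to them only as "sirtuins 1–7".

```python
# Class membership as listed in the text (Barneda-Zahonero and Parra, 2012).
HDAC_CLASSES = {
    "I": ["HDAC1", "HDAC2", "HDAC3", "HDAC8"],
    "II": ["HDAC4", "HDAC5", "HDAC6", "HDAC7", "HDAC9", "HDAC10"],
    # Assumed symbols for "sirtuins 1-7".
    "III": [f"SIRT{i}" for i in range(1, 8)],
    "IV": ["HDAC11"],
}

# Reverse lookup: family member -> class label.
CLASS_OF = {
    member: cls
    for cls, members in HDAC_CLASSES.items()
    for member in members
}


def hdac_class(name):
    """Return the class label ("I" to "IV") for a family member, or None."""
    return CLASS_OF.get(name.upper())


if __name__ == "__main__":
    # HDACs reported in the text as commonly upregulated in CRC.
    for enzyme in ["HDAC1", "HDAC2", "HDAC3", "HDAC5", "HDAC7"]:
        print(enzyme, "-> class", hdac_class(enzyme))
```

Under this table, the HDACs reported as commonly upregulated in CRC fall into classes I and II, the two classes inhibited by hydroxamic acid-type compounds such as SAHA.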

#### Sulforaphane

Sulforaphane (SFN) is a natural compound found in cruciferous vegetables of the Brassicaceae family, such as broccoli, cauliflower, Brussels sprouts, and cabbage. It belongs to the isothiocyanate group of organosulfur compounds. Various clinical and epidemiological studies have shown SFN to be a potential chemopreventive agent that can be used for cancer treatment (Michaud et al., 1999; Cipolla et al., 2015). It has shown pronounced antitumor effects *in vitro* and *in vivo* without any toxic effect (Alumkal et al., 2015). SFN is well known to suppress HDACs, which alters epigenetic regulation. The interaction of SFN and HDACs has been shown both clinically and preclinically. The formulation of this drug was based on previous findings about the anticancer properties of its sources, such as broccoli. An earlier study in human subjects showed significant inhibition of HDAC activity following consumption of a single dose of 68 g of broccoli sprouts (Dashwood and Ho, 2007). The effect of SFN has also been demonstrated *in vivo*. In a prostate cancer xenograft mouse model, daily consumption of 7.5 µmol SFN per mouse for 21 days significantly decreased HDAC activity, as indicated by an increase in acetylated histones. These findings suggested that SFN suppresses cancer cell growth through increased histone acetylation (Myzak et al., 2007). Feeding colon cancer xenograft mouse models with 10 µmol SFN per mouse gave the same result. It was concluded that inhibition of HDAC activity coupled with increased acetylated histones might contribute to the cancer chemoprotective and therapeutic effects of SFN (Myzak et al., 2007). In a few SFN-treated prostate cancer cell lines, the protein levels of several class I and class II HDACs (HDAC3, HDAC4, HDAC6, and HDAC8) decreased, whereas in normal cells, only a temporary depletion of HDAC activity was noticed. This revealed the selective action of SFN on benign hyperplastic and cancer cells but not normal cells (Clarke et al., 2011). SFN convincingly targets HDAC1, HDAC2, HDAC3, and HDAC8, but not HDAC6, in CRC cells (Juengel et al., 2018).

At the molecular level, SFN has wide-ranging chemopreventive properties. Through its HDAC-inhibiting property, SFN alters the expression of apoptotic and cell cycle-regulating genes, which blocks the growth of tumor cells and induces apoptosis *in vitro* as well as *in vivo*. SFN triggers checkpoint kinase 2 (Chk2) phosphorylation-dependent upregulation of p21, which inhibits cyclin-dependent kinase (cdk) and results in cell cycle arrest. In prostate cancer cells, reduced protein levels of cyclin D1, cyclin E, Cdk4, and Cdk6 correlated with S phase arrest induced by SFN treatment, whereas activation of the G(2)-M checkpoint correlated with induction of cyclin B1 and reduced protein levels of Cdk1 and the mitosis inducer Cdc25C (Herman-Antosiewicz et al., 2007). Advanced stage cancers are characterized by metastasis, which includes cell migration, invasion, and angiogenesis. It is therefore interesting that SFN interferes with the invasion cascade of cancer cells and with angiogenesis by downregulating matrix metalloproteinases (MMPs) such as MMP-1, MMP-2, MMP-7, and MMP-9 (Shankar et al., 2008). In recent years, the role of microRNAs (miRNAs) in the regulation of gene expression at the epigenetic level has been acknowledged, along with their influence on cell development and the progression of various cancers. SFN has been shown to regulate certain miRNAs, such as miRNA-21 and miRNA-320a (Sato et al., 2016; Martin et al., 2018). Although a clear connection between SFN and miRNAs has not yet been deciphered, it offers further scope for investigation, as these miRNAs are involved in the progression of EMT, which leads to aggressiveness of tumors (Yang et al., 2018).

#### Vorinostat

Vorinostat, also known as suberoylanilide hydroxamic acid (SAHA), is an orally bioavailable broad-spectrum HDAC inhibitor that inhibits class I and II HDACs. It is a new potential therapeutic drug used for the treatment of cancer (Richon et al., 1998; Bubna, 2015). Vorinostat does not inhibit class III HDAC enzymes (Richon et al., 2000). The United States Food and Drug Administration (FDA) approved it as the first HDACi for the treatment of relapsed cutaneous T-cell lymphoma (CTCL) (Eckschlager et al., 2017). SAHA is active both *in vitro* and *in vivo*: it induces apoptosis of cancer cells *in vitro* and inhibits tumor growth in mouse models (Glick et al., 1999; Richon et al., 2000; Suraweera et al., 2018). It has also been used in combination with other drugs, such as tamoxifen, pembrolizumab, sorafenib, rituximab, and gefitinib, for cancer treatment (Richon et al., 2000).

Previous studies have shown the involvement of SAHA in cell cycle arrest. SAHA-induced expression of the cyclin-dependent kinase inhibitor WAF1 leads to growth arrest of T24 bladder carcinoma cells (Butler et al., 2002). In prostate, bladder, and breast cancer cells, SAHA increases expression of thioredoxin-binding protein-2 (TBP-2), which inhibits the intracellular antioxidant thioredoxin. Treatment of cancer cells with this HDAC inhibitor induces ROS-dependent apoptosis (Lincoln et al., 2003; Lin and Pollard, 2007). Vorinostat acts indirectly under hypoxic conditions, suppressing hypoxia inducible factor (HIF)-1 alpha and vascular endothelial growth factor (VEGF), and thus blocks angiogenesis (Zhijun et al., 2016). HDAC overactivity is known to occur in CRC progression. Hence, SAHA can be used to target HDACs for CRC treatment.

#### Domatinostat (4SC-202)

4SC-202 (domatinostat) is an orally administered small molecule for the treatment of various types of cancer. The compound inhibits HDAC1, HDAC2, and HDAC3, which are believed to play important roles in the regulation of aberrant cancer signaling. It potently inhibits survival, proliferation, and cell cycle progression in CRC cells (HT-29, HCT-116, HT-15, and DLD-1). Colon epithelial cells with low expression of HDAC1/2 were minimally affected by 4SC-202 treatment, which indicates the specificity of 4SC-202 for HDACs (Maes et al., 2015). It has dual HDAC and KDM (lysine demethylase) inhibitory activity (Gruber et al., 2018). Together, these preclinical results indicate that 4SC-202 merits further investigation as a valuable anti-CRC agent/chemo-adjuvant.

Domatinostat targets the oncogenic Hedgehog (HH)/Gli signaling pathway and hence reduces proliferation, survival, self-renewal, metastasis, and overall tumor formation and cancer progression (Fu et al., 2016). Addition of 4SC-202 in hepatocellular carcinoma (HCC) cells activates ASK-1 dependent mitochondrial apoptosis pathway (Mishra et al., 2017). 4SC-202 treatment inhibits TGFβ-induced EMT. It markedly induces p21 expression and significantly attenuates cell proliferation. Genome-wide studies revealed that 4SC-202-induced genes were enriched for Bromodomain-containing Protein-4 (BRD4) and MYC occupancy (Zhao et al., 2018).

#### Resminostat (4SC-201)

Resminostat (4SC-201 or RAS2410) is an orally bioavailable inhibitor of HDACs. It is a direct inhibitor of HDAC classes I and II, including HDACs 1, 3, 6, and 8. It was shown to reduce the growth of HCC cells by inhibiting proliferation through its specificity for class I HDACs (Mandl-Weber et al., 2010). Resminostat was seen to have anti-myeloma activity (Brunetto et al., 2013). The effect of Resminostat has been seen in several cancers, including head and neck squamous cell carcinoma (HNSCC), multiple myeloma (MM), and HCC. In human patients with advanced solid tumors, an investigation of Resminostat was carried out to study its safety and tolerability. It was found to be safely administered and to have anticancer effects with a dose-proportional pharmacokinetic profile (Tambo et al., 2017). Resminostat is also used in combination with other drugs, such as sorafenib and docetaxel, for better efficacy (Bitzer et al., 2016; Mandl-Weber et al., 2010).

Resminostat inhibits proliferation and induces G0/G1 cell cycle arrest along with decreased levels of cyclin D1, cdc25a, Cdk4, and pRb, as well as upregulation of p21. It strongly induces apoptosis in MM cell lines, as shown by increased expression of Bim and Bax and decreased expression of Bcl-xL (Enzenhofer et al., 2017). In the case of MM and HNSCC, Resminostat has been reported to have anticancer activity by affecting the AKT signaling pathway and hence reducing cell survival and proliferation (Soukupova et al., 2017). Resminostat prevents cell growth and induces death in HCC cell lines with a decrease in mesenchymal-related genes and an increase in epithelial-related genes. Moreover, it downregulates CD44 expression and hence the colony formation capacity of HCC cells (Gimsing et al., 2008).

Resminostat is in clinical trials for the treatment of advanced CRC, but no results have been published yet. Activation of AKT signaling in CRC results in proliferation, migration, and inhibition of apoptosis in CRC cell lines, which suggests the possible use of Resminostat for CRC treatment.

#### Belinostat

Belinostat is a hydroxamic acid-type HDAC inhibitor (HDACi) drug with antineoplastic activity. It was developed by TopoTarget for the treatment of hematological malignancies and solid tumors. Importantly, it has tolerable side effects with infrequent toxicity when compared to other HDACi (Beck et al., 2010). Proteomic profiling of the human colon cancer cell line HCT116 has been done to evaluate the effect of this drug on the expression of proteins involved in cancer progression and regression. In Belinostat-treated cells, 45 differentially expressed proteins related to proto-oncogene were revealed, including nucleophosmin and stratifin, which were downregulated, and nucleolin, gelsolin, heterogeneous nuclear ribonucleoprotein K, annexin 1, and HSP90B, which were upregulated (Tumber et al., 2007). Belinostat in combination with the chemotherapeutic drug 5-fluorouracil has shown a promising effect in inhibiting colon cancer cell growth *in vitro* and *in vivo* (Kong et al., 2017). Belinostat has promising chemosensitizing characteristics in lung squamous cell carcinoma. Its treatment triggers the proteasomal degradation of SOS proteins FBXO3 and FBXW10 through the suppression of MAPK activity (ERK1/2 and p38) (Chowdhury et al., 2011). This drug has been tested on colon, breast, and pancreatic cancer cells with an epigenetically silenced TGFβ receptor. Belinostat induces the expression of tumor suppressor gene TGFβRII with simultaneous restoration of the downstream cascade. Survivin, a cancer-associated gene, gets downregulated through the TGFβ/protein kinase A (PKA) pathway. This results in poor cell survival and reduced metastasis. Hence, its downregulation leads to cancer cell death (Giles et al., 2006).

The drug is still being investigated for its effective use against multiple cancers. Preclinical experiments demonstrated that the drug works by inhibiting cell proliferation and inducing programmed cell death in tumor cells.

#### Panobinostat

Panobinostat (LBH589) is a member of the hydroxamic acid class of HDACi approved by the FDA for the treatment of MM. It is a colorless, clear, slightly viscous liquid and chemically known as (2E)-*N*-hydroxy-3-[4-[[(2-hydroxyethyl) [2-(1H-indol-3-yl) ethyl]amino]methyl]phenyl]-2-propanamide, administered in both oral and intravenous forms (Atadja, 2009). It is a nonselective HDAC inhibitor that works against all the classes of HDACs including class I, II, and IV. It interferes with both histone and non-histone proteins. Panobinostat treatment increases acetylated H3 and H4 as well as non-histone proteins HIF-1α, α-tubulin, β-catenin, chaperones (HSP90), estrogen receptor (ERα), androgen receptor (AR), signaling mediators (Stat3, Smad7), DNA repair proteins (Ku70), retinoblastoma protein (pRb), etc., leading to alterations in transcriptional factors (p53, E2F, NF-κB, c-Myc) (Singh et al., 2010; Kim and Bae, 2011; Jones et al., 2011).

It has been tested for several types of cancer including hematologic and solid malignancies, CTCL, Hodgkin's lymphoma, leukemia, prostate, thyroid, and breast cancer (LaBonte et al., 2009). One of the main advantages of Panobinostat is its ability to prolong hyperacetylation of histones, which allows intermittent dosing schedules to reduce the challenging thrombocytopenia. Thrombocytopenia, a condition of low platelet counts, is a likely side effect of all HDACi. Hence, Panobinostat is currently a more potent HDAC inhibitor with better persistence in the body. Panobinostat treatment of colon cancer cell lines inhibits proliferation and survival at nanomolar concentrations. Analysis of gene expression profiles of CRC cell lines treated with panobinostat revealed alteration of genes involved in angiogenesis, mitosis, DNA replication, and apoptosis (Regel et al., 2012).

Panobinostat, when used in combination with anthracyclines, works as a chemosensitizing agent for gastric cancer cells *via*  activation of CITED2 (Cbp/p300-interacting transactivator 2) (Catalano et al., 2012). Panobinostat induces G1, G2/M cell cycle arrest and cell death in the case of head and neck cancer and CRC respectively (Prystowsky et al., 2009; Gandesiri et al., 2012). Combined treatment of lapatinib and panobinostat inhibits the proliferation and colony formation in all CRC cell lines tested. Lapatinib is an EGFR/HER2 kinase inhibitor. Combination treatment resulted in rapid induction of apoptosis with increased DNA double-strand breaks, caspase-8 activation, and PARP cleavage with downregulation of transcriptional targets including NF-κB1, IRAK1, and CCND1. This was paralleled by decreased signaling through the MAPK and PI3K/AKT pathways.

In CRC, panobinostat has been shown to activate the tumor suppressor gene death-associated protein kinase (DAPK), which plays a role in induction of autophagy and apoptosis (LaBonte et al., 2011). Panobinostat treatment downregulates EGFR, HER2, and HER3 expression both at the mRNA and protein level through transcriptional and posttranslational mechanisms (Costello and Plass, 2001).

#### DNA/Histone Methyltransferase Inhibitors

Methylation is involved in several biological processes, including development, cell cycle regulation, differentiation, and DNA repair (**Figure 2**). Any alteration in methylation can affect any of these events and result in disease development. H3K4me2 is downregulated in many types of cancers including lung, kidney, prostate, breast, and pancreatic cancer, whereas H3K27me3 is downregulated in gastric adenocarcinoma and downregulation of H4K20me3 is associated with CRC (Greer and Shi, 2012). Epigenetic silencing by aberrant methylation of regulatory genes leads to tumorigenesis. Inactivation of critical genes involved in tumor suppression, DNA repair, cell-cycle regulatory mechanism, apoptosis, angiogenesis, and metastasis has been demonstrated in a wide variety of tumor types (Orta et al., 2017). Enzymes responsible for these processes are histone methyltransferases (HMTs) and DNA methyltransferases (DNMTs), which maintain an altered methylation pattern by copying it from parent to daughter DNA strands after replication. High DNMT1 expression is observed in almost all cancer types and maintains a higher methylation level (Heerboth et al., 2014). Aberrant epigenetic changes induced in malignant cells lead to the emergence of neoplastic properties. Deviant histone methylations have been suggested to play a major role in CRC. More than 20 histone methylation enzymes are found to be clinically relevant to CRC, including 17 oncoproteins and 8 tumor suppressors (Huang et al., 2017). However, abnormal epigenetic patterns can be reversed by the action of epigenetically active agents. In recent years, CRC epigenetic regulation, particularly by HMTs and demethylases (HDMs), has been the subject of extensive research. Agents used as HMT and DNMT inhibitors have shown promising anticancer effects. For example, EZH2 and DOT1L inhibitors have shown potency in preclinical trials for CRC treatment. Chaetocin, a fungal metabolite, inhibits SUV39H1 and inhibits migration of CRC cells (Yokoyama et al., 2013). Altogether, DNMT/HMT inhibitors reduce metastasis (**Figure 2**).

Aberrant histone and DNA hypermethylation is frequently found in tumor cells, and inhibition of methylation is an effective anticancer strategy. Currently, there are a number of DNMT/HMT inhibitors used as chemotherapeutic drugs for CRC treatment.

#### Zebularine

Zebularine [1-beta-D-ribofuranosyl-2(1H)-pyrimidinone] is a hydrophilic, orally bioavailable nucleoside analog of cytidine. It inhibits DNA methylation by getting incorporated into DNA, which makes it appealing for use in rapidly dividing cancer cells. Zebularine acts as an inhibitor of DNA methylation by inhibiting the action of DNMTs. Zebularine-substituted DNA forms a covalent bond with DNMTs, which become trapped in the complex (Zhou et al., 2002). It possesses high oral bioavailability and shows low toxicity and high efficacy, making it a promising adjuvant agent for anticancer chemotherapy (Cao et al., 2018). In addition, the low toxicity of zebularine gives scope for low-dose administration over a prolonged period.

Zebularine-treated cumulus cells were found with reduced overall DNA methylation patterns and gene-specific DNA methylation levels at the promoter regions of pluripotency genes (Oct4, Sox2, and Nanog), which indicates that zebularine has a role to play in the case of cancer stem cells (Rao et al., 2007).

Several studies have reported its antitumor effects on many types of cancer, including lung cancer, gastric cancer, pancreatic cancer, medulloblastoma, leukemia, head and neck cancer, hepatocellular carcinoma, cervical cancer, breast cancer, bladder carcinoma, prostate cancer, ovarian cancer, and CRC. Zebularine was developed to counter the shortcomings of the DNMT inhibitor 5-azacytidine. Zebularine was reported to be safer than 5-azacytidine for the treatment of cancers in Epstein–Barr Virus (EBV) carriers and was proposed for use against tumors possessing EBV (Takemura et al., 2018).

In human malignant mesothelioma cells, the decrease in DNMT1 expression was proportional to the zebularine concentration. It exerted antiproliferative activity through S phase delay and cell death (You and Park, 2014). It has been seen to have the same effect in lung cancer, where it induces A549 cell death accompanied by the loss of mitochondrial membrane potential, Bcl-2 reduction, and activation of Bax, p53, caspase-3, and caspase-8 (Nakamura et al., 2015). It induces suppression of the Wnt signaling pathway by decreasing β-catenin protein levels in cholangiocarcinoma (CCA) cell lines TFK-1 and HuCCT1, which leads to apoptotic cell death in CCA (Yang et al., 2013). These studies indicated that zebularine could effectively act on both DNMT and non-DNMT targets. In lymphoma cells, zebularine reactivates silenced E-cadherin, which suggests its capacity to reverse the EMT (Takemura et al., 2018).

The effect of zebularine on CRC has also been investigated. It induces p53-dependent ER stress and autophagy, whereas it inhibits tumorigenesis and stemness of CRC (You and Park, 2014). Zebularine also works at the microRNA level. Its treatment increases the expression level of let-7b, which functions as a tumor suppressor microRNA, and hence suppresses the invasion activity of CRC cells (Tanaka et al., 2017).

Genes involved in tumor suppression, DNA damage repair, and cell cycle regulation often get inactivated because of CpG island hypermethylation, which causes cancer progression. Methylation inhibitors target the transcriptional silencing of tumor suppressor genes due to hypermethylation and direct the reactivation. Zebularine is one such methylation inhibitor with the potential for clinical utilization with less cytotoxic effect, stability, and high selectivity for cancer cells, making it a promising candidate as a chemotherapeutic agent (Yoo et al., 2004; Veverka et al., 1997).

#### Disulfiram

Disulfiram (DSF, bis-diethylthiocarbamoyl disulfide), also known as Antabuse, is an irreversible inhibitor of aldehyde dehydrogenase (ALDH), which is responsible for ethanol metabolism. It contains a thiol-reactive functional group that interacts with a thiol group at the active site of ALDH and hence it is used for the management of alcoholism (Lin et al., 2011). DNA methyltransferase 1 (DNMT1) contains a reactive CXXC region (C is cysteine; X is any other amino acid) at its active site, which makes it susceptible to DSF (Syro et al., 2006). Disulfiram has the ability to cross the blood–brain barrier and has been reported as a potential inhibitor of DNMT in several cancers.

DSF has been reported to significantly inhibit the growth and clonogenic survival of prostate cancer cell lines by demethylating the promoter of the APC (adenomatosis polyposis coli) gene, which encodes a tumor suppressor protein that acts as an antagonist of the Wnt signaling pathway, and the RARB (retinoic acid receptor beta) gene, which limits the growth of many cell types. DSF exposure also leads to reduction of global genomic 5-methylcytosine (5meC) content (Zhao et al., 2015b). DSF in combination with other drugs can be an effective therapeutic strategy against cancer. Studies have suggested O6-methylguanine-DNA methyltransferase (MGMT) as the key factor responsible for chemoresistance of aggressive pituitary adenomas to the currently most promising chemotherapeutic drugs temozolomide (TMZ) and 2-methoxyestradiol (2ME) (Sharma et al., 2016). TMZ efficacy increases when used in combination with DSF. The antitumor effect of TMZ is observed in human pituitary cancer cells *via* the ubiquitin–proteasomal MGMT protein elimination route (Sharma et al., 2016). Estrogen receptor-β (ER-β), a tumor suppressor gene in prostate cancer, is repressed by DNMT-mediated hypermethylation. DSF treatment reverses the silencing of ER-β and prevents cell proliferation (Wang et al., 2003). The role of DSF has also been examined in CRC. DSF in combination with 5-fluorouracil (5-FU) is used as the major chemotherapeutic component for CRC. It imparts chemosensitization, significantly enhances the apoptotic effect, and synergistically potentiates the toxic effect of 5-FU on CRC cell lines. Cancer cells with high NF-κB nuclear activity demonstrate robust chemoresistance and radioresistance. DSF strongly inhibits both NF-κB nuclear translocation and DNA binding activity (Oki et al., 2007).

Inhibition of DNMT function can potentially reverse some of the cancer-associated methylation marks, reprogram the epigenetic makeup, and change the protein expression profile. DSF represents an attractive therapeutic avenue.

#### Decitabine

Decitabine or 5-aza-2'-deoxycytidine (cytidine analog) is a hypomethylating agent that functions as a nucleic acid synthesis inhibitor by inhibiting DNMTs. Its trade name is Dacogen. It engages the DNMTs by binding to them irreversibly through a covalent bond and inhibiting the methylation of the daughter strand during replication (Jabbour et al., 2008). Decitabine treatment at high concentrations inhibits DNA synthesis and leads to cell cycle arrest (Palii et al., 2008), whereas low-dose but long-term treatment traps DNMTs and eventually causes their degradation without cell cycle arrest (Briot et al., 2017). Many studies have been carried out to increase the efficiency of decitabine, as it has low oral bioavailability. To increase its oral bioavailability, lipid nanocapsules have been developed to encapsulate decitabine. Decitabine cytotoxicity against different cancer cell lines was observed to be higher when it was used with lipid nanocapsules (Briot et al., 2017).

Understanding the molecular mechanism is necessary to enhance the efficacy of any drug. Decitabine is involved in the regulation of key factors to prevent cancer. Overall, it helps restore cancer cell sensitivity toward drugs and body immunity (cytotoxic lymphocytes) and, most importantly, reverses the EMT. Multidrug resistance continues to be a big hurdle for cancer treatment. Decitabine restores drug sensitivity through p-glycoprotein (P-gp), encoded by the mdr-1 gene, in both myeloid and solid tumor cells K562/ADR and MCF-7/ADR, respectively, in a time- and dose-dependent manner (Wang et al., 2017a). It has been demonstrated that decitabine inhibits the MAPK pathway, which could be the possible reason behind upregulation of p-glycoprotein (Vitale et al., 2017). To overcome drug resistance, decitabine has been used in combination with the mTOR inhibitor everolimus for the treatment of medullary thyroid cancer (MTC). This combination showed strong antiproliferative activity through apoptosis induction. Bioinformatic analysis showed that four major molecular pathways involved in cancer progression were affected, including PI3K-Akt signaling, ECM/receptor interaction, neurotrophin pathway, and focal adhesion, which leads to the apoptosis of cancer cells through the overexpression of apoptosis regulators NGFR and Bax genes (Li et al., 2017b). Decitabine enhances the allo-NK cell-mediated killing effects on leukemia stem cells by upregulation of NKG2D ligands (Morel et al., 2017). Decitabine is also involved in miRNA regulation, and miR-375 level is upregulated after decitabine treatment, which, in turn, represses HPV16 E6 oncoprotein level (Bai et al., 2017). Progestin and adipoQ receptor family member 3 (PAQR3) expression is significantly associated with advanced TNM stage of cancer.
Decitabine treatment induced the expression of PAQR3, which significantly reduced proliferation, colony formation, and invasion of ESCC cells *via* inhibition of ERK signaling (Wang et al., 2017b). Decitabine restores tumor suppressor Bridging integrator-1 (Bin1), which reduces ESCC cell malignant behaviors and reverses EMT *via* regression of MMP-2 and MMP-9 expression through the PTEN/AKT signaling pathway.

In CRC, decitabine has been used alone as well as in combination with other drugs. Low expression of NALP1 (nucleotide-binding oligomerization domain-like receptor family, pyrin domain-containing 1) is associated with survival and tumor metastasis in colon cancer. Decitabine (DAC) treatment increases its expression and hence suppresses the growth of colon cancer (Chen et al., 2015). Decitabine treatment suppressed the invasion ability of CRC lymph node metastasis-derived SW620 cells as well as oxaliplatin-resistant SW620 cells. Cells regained epithelial characteristics, as indicated by upregulation of E-cadherin, miR-200C, and miR-141 (Tanaka et al., 2015; Manfrão-Netto et al., 2018). It has also been used in combination with other drugs like gefitinib and azacitidine as an effective treatment approach for CRC (Müller and Florek, 2010; Gerecke et al., 2018).

#### Azacitidine

Azacitidine (marketed as Vidaza) is a ribonucleoside and a chemical analog of cytidine. It acts as a hypomethylating agent. It is incorporated into RNA to a greater extent than into DNA. Decitabine is the deoxy derivative of azacitidine. Azacitidine differs from decitabine in that it can bind to both DNA and RNA, whereas decitabine can bind only to DNA. Its oral version is called CC-486. It works in a dose-dependent manner. At low dose, it causes hypomethylation of DNA by inhibiting DNMT through covalent binding, whereas at high dose, it functions as a cytotoxic agent and gets incorporated into DNA and RNA in the abnormal cells, resulting in cell death (Borodovsky et al., 2013).

It has been demonstrated that somatic mutations in isocitrate dehydrogenase 1 (IDH1) alone are sufficient to induce a global hypermethylated phenotype, which is one of the features of gliomas with this kind of mutation. Long-term treatment with azacitidine resulted in reduction of DNA methylation, which results in glial differentiation and a reduction in cell proliferation and tumor growth. Also, there was no sign of recurrence despite discontinuation of therapy (Lee et al., 2018). H3K9me3 and H3K27me3 marks are associated with cancer progression and metastasis. Azacitidine treatment resulted in reduction of H3K9me3 and H3K27me3 marks in neuroendocrine prostate cancer (Roulois et al., 2015). At low dose, azacitidine targets CRC initiating cells by induction of viral mimicry *via* the MDA5/MAVS/IRF7 pathway (Hu et al., 2017). The reduced expression of NDN (Necdin, MAGE Family Member) is associated with poor differentiation, advanced TNM stage, and poor prognosis of CRC. Administration of azacitidine causes hypomethylation of the NDN promoter. Enhanced expression of NDN causes it to bind to the LRP6 promoter, leading to reduced transcription and Wnt signaling pathway inhibition in CRC (Lai et al., 2015).

#### Chaetocin

Chaetocin is a fungal mycotoxin that inhibits HMTs. HMT SUV39H1-mediated methylation of lysine 9 on histone H3 is associated with repression of tumor suppressor genes. Treatment of cancer cell lines with chaetocin led to downregulation of SUV39H1 along with a reduction in H3K9 methylation (Cherblanc et al., 2013; Chiba et al., 2015). Chaetocin was previously considered a "specific" inhibitor of the H3K9 HKMT (histone lysine methyltransferase) SU(VAR)3–9, but it was later shown to be a non-specific inhibitor of HMTs (Zuma et al., 2017). It promotes irreversible arrest of cell cycle, nucleolus fragmentation, and RNA transcript blockage in pathogenic trypanosomatids (Dixit et al., 2014). Chaetocin induces ROS-mediated apoptosis through an ATM-YAP1 driven apoptotic pathway in glioma cells (Shuai et al., 2018). Chaetocin also shows synergistic cytotoxicity in combination with other epigenetic drugs such as SAHA (HDACi) or JQ1 (bromodomain inhibitor) (Chiba et al., 2015).

The histone H3 lysine 9 (H3K9) methylation mark is linked with the progression of CRC and positively correlated with metastasis (Yokoyama et al., 2013). Methyltransferases SUV39H1/SUV39H2 have been seen to be involved in cell migration regulation (Yokoyama et al., 2013; Liu et al., 2015). Chaetocin treatment was observed to reduce cell migration in CRC cell lines (Liu et al., 2015). Thus, chaetocin alone or in combination with other drugs may be a potent drug for the treatment of multiple cancers. Chaetocin has been demonstrated to suppress cancer cells through the induction of apoptosis. In multiple lung cancer cells, its treatment activates endoplasmic reticulum stress, which results in the upregulation of ER stress response proteins, transcription factor ATF3, and CHOP, which further contributes to apoptosis in a death receptor 5 (DR5)-dependent manner (Zhao et al., 2015a). Chaetocin treatment reduces cell growth by downregulating Blimp1 and RANKL (receptor activator of NFκB ligand) expression, which reduces osteoclast differentiation. Osteoclasts differentiate from hematopoietic macrophage-like cells through the RANKL–RANK signaling system. Osteoclast formation is associated with bone-related diseases (Zhao et al., 2015a). In B16F10 mouse melanoma cells, chaetocin inhibits IBMX (3-isobutyl-1-methylxanthine)-induced melanogenesis through activation of ERK (Bae et al., 2016). Chaetocin is able to induce autophagy along with caspase-dependent apoptosis in hepatic cancer, but inhibition of autophagy enhances its effectiveness as an anticancer apoptotic agent (Jung et al., 2016). There is still a lot more to be discovered, as the molecular mechanism underlying its behavior as an anticancer agent in several types of cancer remains unclear.

Epigenetic modification is a crucial mechanism in cancer and has been exploited for the development of anticancer therapeutic drugs. Overall, HDAC inhibitor and DNMT/HMT inhibitor treatment increases the expression of tumor suppressor genes and genes responsible for drug sensitivity, and reduces the expression of oncogenes (**Figure 3**).

### CONCLUSION AND FUTURE PERSPECTIVE

Currently, there are several potential drugs targeting HDACs and DNA/histone methyltransferases (DNMT/HMT) used in treating several types of cancer (**Table 1**). Most of the drugs mentioned in this review are FDA-approved as they work efficiently in specific cancers at certain stages of the disease.

TABLE 1 | Chemotherapeutic drugs targeting epigenetic modifications and signaling pathways.

Although the approach of using epigenetic modifying agents as anticancer drugs may have clinical benefits, there are several problems that must be considered. Even after the drug treatment, the reversible nature of methylation persists. Remethylation and this reversibility are major problems that need to be resolved. Moreover, it is important to consider that involvement of HDACs in cancer does not necessarily mean their overexpression. Sometimes, truncated or mutated HDACs can also be present. In such cases, alternative therapeutic drugs will be required. One of the major limitations associated with HDAC and DNMT inhibitors is lack of specificity. Through the use of DNMT inhibitor therapy, non-specific genes are reactivated. The epigenome is tremendously complex; hence, the off-target effects of epigenetic modifying drugs need to be minimized. There is room for development of effective methods for drug delivery to reduce side effects and attain a higher therapeutic index. Various approaches, such as nanocarriers, administering drugs in combination for synergistic effect, and/or altering the chemical structure of drugs, enhance the effectiveness of the drugs. Nanocarriers used to deliver drugs include nanogels, liposomes, dendrimers, and polymeric nanoparticles. Epigenetic drugs have low stability and hence low persistence in the body. For example, some drugs like decitabine need to be administered continuously because they are rapidly degraded in the body and systemic drug levels drop. Drug delivery modification enhances drug stability, permeability, and retention. It also lowers the required drug concentration during administration. During DNA replication, inhibition of DNMT distorts methylation status throughout the genome. Methylation targets many DNA repair pathway genes, whose expression may lead to drug resistance. Cancer stem cells are a small population of cells that escape chemotherapy, enter a dormant phase, slow their growth rate, and resemble normal stem cells. These residual cancer stem cells, once triggered, reappear as recurrent and chemoresistant disease. Combined therapies of standard chemotherapeutic drugs with epigenetic targeting drugs provide the opportunity for the reactivation of genes required for response to the chemotherapeutic drugs. Combining epigenetic therapies has shown both additive and synergistic effects.
One of the main drawbacks of these drugs is their toxicity. Many drugs targeting epigenetic modifications, like the demethylating agents azacitidine and decitabine, are highly toxic. However, it is not easy to develop a chemically derived drug that is not toxic to normal cells. Therefore, drug discovery using plant species and their natural products is drawing attention. Naturally derived drugs have been shown to have little or no toxic effect on normal cells and are more tolerable. Their anticancer properties and natural abundance make them great candidates for drug development. Plant-derived drugs can also be categorized into four classes according to their activities: antioxidants, cell cycle inhibitors, methyltransferase inhibitors, and HDAC inhibitors. For example, SFN, isoflavones, and isothiocyanates work as HDACi. Many plant-derived natural compounds and secondary metabolites are still under investigation for their anticancer activities and offer scope for the development of new clinical drugs.

#### AUTHOR CONTRIBUTIONS

SP conceptualized the topic, framed the write-up sequence, collected the relevant references, and finalized the manuscript. A contributed to writing some of the portions under the guidance of SP.

## REFERENCES


MGMT, APC, and CDH1 genes in patients with colon adenocarcinoma. *Exper. Biol. Med.* 240 (12), 1599–1605. doi: 10.1177/1535370215583800


to epigenetic silencing of CDX1 and EPHB tumor suppressor genes in colorectal cancer. *Epigenetics* 6 (5), 610–622. doi: 10.4161/epi.6.5.15300


colorectal cancer by up-regulating histone acetylation state. *Br. J. Pharmacol.* 175 (22), 4209–4217. doi: 10.1111/bph.14467


**Conflict of Interest Statement:** The handling editor declared a past co-authorship with one of the authors. The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Patnaik and Anupriya. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# CD 36: Focus on Epigenetic and Post-Transcriptional Regulation

#### *Cristina-Mariana Niculite1,2\*†, Ana-Maria Enciu1,2† and Mihail Eugen Hinescu1,2*

*1 Cell Biology Department, "Victor Babes" National Institute of Pathology, Bucharest, Romania, 2 Department of Cellular and Molecular Biology and Histology, "Carol Davila" University of Medicine and Pharmacy, Bucharest, Romania*

CD36 is a transmembrane protein involved in fatty acid translocation, scavenging for oxidized fatty acids, and acting as a receptor for adhesion molecules. It is expressed on macrophages, as well as other types of cells, such as endothelial and adipose cells. CD36 participates in muscle lipid uptake, adipose energy storage, and gut fat absorption. Recently, several preclinical and clinical studies demonstrated that upregulation of CD36 is a prerequisite for tumor metastasis. Cancer metastasis-related research emerged much later and has been less investigated, though it is equally or even more important. CD36 protein expression can be modified by epigenetic changes and post-transcriptional interference from non-coding RNAs. Some data indicate modulation of CD36 expression in specific cell types by epigenetic changes *via* DNA methylation patterns or histone tails, or through miRNA interference, but this is largely unexplored. The few papers addressing this topic refer mostly to lipid metabolism-related pathologies, whereas in cancer research, data are even more scarce. The aim of this review was to summarize major epigenetic and post-transcriptional mechanisms that impact CD36 expression in relation to various pathologies while highlighting the areas in need of further exploration.

#### Keywords: CD36, epigenetics, non-coding RNAs, inflammation, obesity, biomarkers

### INTRODUCTION

Very different pathologies, such as cardiovascular disease, malaria, and tumor metastasis, share a mechanism involving the membrane glycoprotein CD36. It was first identified as an adhesion protein, mediating collagen, fibronectin, thrombospondin (TSP) (Tandon et al., 1989; Silverstein and Febbraio, 2009), and *Plasmodium falciparum* binding (Barnwell et al., 1989). CD36 was later demonstrated to be a scavenger receptor (Febbraio et al., 2001; Silverstein and Febbraio, 2009), a fatty acid translocator for native and oxidized low-density lipoprotein (oxLDL) (Endemann et al., 1993), anionic phospholipids (Rigotti et al., 1995), and long-chain fatty acids (FAs) (Abumrad et al., 1993; Glatz and Luiken, 2017). These different functions are performed by different binding sites of the extracellular domain of the receptor (Asch et al., 1993; Jay and Hamilton, 2018), and the outcome of ligand binding is largely dependent on cell type. In muscle, adipose, and intestinal cells, CD36 participates mainly in lipid uptake (Smith et al., 2008; Tran et al., 2011).

In macrophages, depending on the environment, CD36 can act as a scavenger of oxLDL, forming foam cells (Park, 2014), as a pattern recognition receptor in innate immunity (Thylur et al., 2017), or as a trigger of inflammatory responses (Stewart et al., 2010; Qin et al., 2017). In endothelial cells, it has mainly anti-angiogenic and pro-apoptotic effects *via* thrombospondin binding (Klenotic et al., 2013). In vascular smooth muscle cells, CD36 may contribute to generation of reactive oxygen species (ROS) and cellular proliferation (Li et al., 2010; Yue et al., 2019). In sensory cells, it can act as a taste (Laugerette et al., 2005; Keller et al., 2012; Pepino et al., 2012; Ozdener et al., 2014) or olfactory receptor (Oberland et al., 2015; Xavier et al., 2016).

#### *Edited by:*

*Chandravanu Dash, Meharry Medical College, United States*

#### *Reviewed by:*

*Laelie Allison Snook, University of Guelph, Canada Nikhlesh Singh, University of Tennessee Health Science Center (UTHSC), United States*

#### *\*Correspondence:*

*Cristina-Mariana Niculite maria.niculite@ivb.ro*

*†These authors have contributed equally to this work.*

#### *Specialty section:*

*This article was submitted to Pharmacogenetics and Pharmacogenomics, a section of the journal Frontiers in Genetics*

*Received: 15 January 2019 Accepted: 28 June 2019 Published: 19 July 2019*

#### *Citation:*

*Niculite C-M, Enciu A-M and Hinescu ME (2019) CD 36: Focus on Epigenetic and Post-Transcriptional Regulation. Front. Genet. 10:680. doi: 10.3389/fgene.2019.00680*

CD36 is currently investigated as a potential therapeutic target in cardiovascular disease (Portal et al., 2016), metabolic syndrome, and obesity (Corpeleijn et al., 2008; Goyenechea et al., 2008). Several preclinical (Pascual et al., 2017; Ladanyi et al., 2018; Sp et al., 2018) and clinical studies (Liang et al., 2018; Pan et al., 2019) recently demonstrated that upregulation of CD36 could be a prerequisite for tumor metastasis (reviewed in Enciu et al., 2018), opening new avenues for anticancer therapies. Even though the localization and changes in expression of CD36 protein have been sufficiently addressed, less is known about how epigenetic and post-transcriptional regulation of CD36 influences CD36-related pathologies. A more detailed look into these less explored areas is necessary to highlight gaps in knowledge, as well as new possible therapeutic targets. Therefore, the aim of this review was to summarize major epigenetic and post-transcriptional mechanisms with impact on CD36 expression in relation to various pathologies. The blank spots on the CD36 map are also addressed.

#### CD36 Gene Regulation and Gene-Related Pathologies

The CD36 gene is located on the long arm of chromosome 7 (7q21.11) (*Homo sapiens* Annotation Release 109, GRCh38.p12) and consists of 17 exons and 18 introns and can generate four protein isoforms *via* alternative splicing (https://www.ncbi.nlm.nih.gov/gene/948). Out of the four isoforms, three have at least one spliced exon (4, 6-7, or 8) (UniProtKB ID: P16671). At least 23 alternative transcripts are known for CD36, and six alternative first exons have been described, yielding different transcripts in different tissue types (Pietka et al., 2014; Mikkelsen et al., 2010). A STAT binding GAS element has also been identified in the CD36 gene promoter (Sp et al., 2018), and several papers have reported that CD36 expression is modulated by various STAT family members (Hosui et al., 2017; Kotla et al., 2017; Rozovski et al., 2018).

Transcription from the gene promoter is under the control of core binding factor (CBF) family members (Armesilla et al., 1996) and CCAAT/enhancer-binding protein α (C/EBPα) (Qiao et al., 2008). A major regulator of *CD36* expression is peroxisome proliferator activated receptor gamma (PPAR-γ), which binds to enhancer regions of *CD36* (Mikkelsen et al., 2010), but other transcription factors, such as Activating Transcription Factor 2 (ATF2), were demonstrated to induce CD36 expression as well (Raghavan et al., 2018). Genetic modifications of *CD36* have been investigated thoroughly since the identification of malaria-predisposing mutations (Aitman et al., 2000). More than 9,000 SNPs have been described in the CD36 gene, in both intronic and exonic sequences, as well as the 5′ and 3′ untranslated regions (UTRs), but only a handful of them have been associated with known pathologies (https://www.genecards.org/cgi-bin/carddisp.pl?gene=CD36#transcripts). This is due in part to the large blocks of linkage disequilibrium across the gene in several of these population studies. CD36 polymorphisms are more frequent in Asian and African American populations than in Caucasians (Love-Gregory et al., 2008). Several studies have investigated CD36 polymorphisms in African Americans (Love-Gregory et al., 2008; Beydoun et al., 2014), and genome-wide studies have compared other ethnic groups (Coram et al., 2013; Ellis et al., 2014). A gene-centric meta-analysis of lipid traits in African, East Asian, and Hispanic populations confirmed previous data on CD36 polymorphisms and proposed the rs3211938-G allele, which is nearly absent in European and Asian populations, as a "signature of selection" in Africans and African Americans (Elbers et al., 2012).

Regarding copy number variations (CNVs), *CD36*-related modifications occur mostly with loss of genetic sequences (Vogler et al., 2010; Suktitipat et al., 2014; Uddin et al., 2015), leading to platelet glycoprotein IV deficiency or neurocognitive developmental delay (Coe et al., 2014).

In addition to genetic alterations, epigenetic and posttranscriptional interventions can also modify the final protein output, impacting protein function. Gene expression can be further altered in the cytoplasmic compartment by noncoding RNA species that can interfere with mRNA translation and modulate protein output (Pop et al., 2018). Some of these epigenetic gene control mechanisms are addressed below with respect to CD36 expression and involvement in pathology. Posttranslational modifications of CD36 were reviewed recently by Luiken et al. (2016) and are beyond the scope of this review.

#### Epigenetic Regulation

Epigenetics is defined as "the study of changes in gene function that are mitotically and/or meiotically heritable and that do not entail change in DNA sequence" (Wu, 2001). Although no general consensus has been reached in regard to which mechanisms can be categorized as epigenetic (Deans and Maggert, 2015), they usually involve nuclear processes, such as DNA methylation, histone modifications, and non-coding RNAs (Goldberg et al., 2007; Peschansky and Wahlestedt, 2013; Ramassone et al., 2018).

DNA methylation involves the transfer of methyl groups from S-adenosylmethionine to cytosines in CpG dinucleotide sequences. The methylation patterns of CpG sites are created and maintained by a family of DNA methyltransferases (DNMT1, DNMT3A, and DNMT3B) and can be reversed by ten-eleven translocation (TET) enzymes (Moutinho and Esteller, 2017; Pfeifer, 2018). Hypermethylation is associated with gene repression, and hypomethylation with gene activation (Morlando and Fatica, 2018). Alterations of methylation patterns play an important part in regulating gene activity during embryogenesis, gametogenesis, and cellular differentiation (Goldberg et al., 2007; Pfeifer, 2018). In many types of cancer, global hypomethylation has been observed at repetitive genomic regions, leading to genomic instability, in combination with hypermethylation at specific CpG-rich islands in the promoters of key tumor suppressor genes (Pfeifer, 2018; Morlando and Fatica, 2018; Thomas and Marcato, 2018; Ramassone et al., 2018).
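The notion of CpG density used in this paragraph can be made concrete with a short calculation. The sketch below is illustrative only: it counts CpG dinucleotides in a sequence and computes the GC fraction and the observed/expected CpG ratio, which together form a commonly used heuristic for flagging CpG-rich regions. The function names `cpg_stats` and `is_cpg_rich`, the toy sequence, and the default thresholds are assumptions introduced for illustration and are not taken from the studies reviewed here.

```python
def cpg_stats(seq: str) -> dict:
    """Return CpG count, GC fraction, and observed/expected CpG ratio."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    # "CG" cannot overlap with itself, so str.count gives the exact number.
    cpg = seq.count("CG")
    gc_fraction = (c + g) / n if n else 0.0
    # Expected CpG count if C and G were distributed independently.
    expected = (c * g) / n if n else 0.0
    obs_exp = cpg / expected if expected else 0.0
    return {"length": n, "cpg": cpg, "gc_fraction": gc_fraction, "obs_exp": obs_exp}


def is_cpg_rich(seq: str, min_len: int = 200,
                min_gc: float = 0.5, min_obs_exp: float = 0.6) -> bool:
    """Heuristic CpG-island-like test; thresholds are conventional defaults
    chosen for illustration, not values from the reviewed papers."""
    s = cpg_stats(seq)
    return (s["length"] >= min_len
            and s["gc_fraction"] > min_gc
            and s["obs_exp"] > min_obs_exp)


# Hypothetical toy sequence, not a real CD36 promoter fragment.
TOY_SEQUENCE = "CG" * 100
```

Applied to a real promoter sequence, a window-based scan with `is_cpg_rich` would give a first approximation of where methylation-sensitive CpG-rich stretches lie; measured methylation levels would still have to come from experimental data such as bisulfite sequencing.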

Histone (H) proteins organize chromatin into structural units called nucleosomes, which consist of DNA wrapped around an octamer of H2A, H2B, H3, and H4 core histones connected by linker DNA and stabilized by H1 (Hergeth and Schneider, 2015). Epigenetic events usually include covalent modification of amino acids from the H3 and H4 tails (Cheng and Blumenthal, 2010; Iorio et al., 2010; Moutinho and Esteller, 2017). These changes represent the so-called "histone code," which affects the conformation of the chromatin fiber, regulating the switch between euchromatin (transcriptionally active) and heterochromatin (transcriptionally inactive) (Moutinho and Esteller, 2017). Histone modifications are mediated by histone acetyltransferases (HATs), histone deacetylases (HDACs), histone methyltransferases (HMTs), and polycomb repressive complex 2 (PRC2) (Moutinho and Esteller, 2017; Thomas and Marcato, 2018). Several studies have shown an interplay between DNA methylation and histone modifications in determining the transcriptional status of a gene (Fuks et al., 2000; Johnson et al., 2002; Fuks et al., 2003; Weber et al., 2007; Cheng and Blumenthal, 2010).

Non-coding RNAs (ncRNAs) are transcribed from DNA but not translated into proteins. Although not all categories of ncRNAs can be classified as epigenetic factors, several types have been shown to control epigenetic mechanisms and, in turn, to be regulated by the epigenetic machinery (Goldberg et al., 2007; Iorio et al., 2010; Collins et al., 2011; Peschansky and Wahlestedt, 2013; Ramassone et al., 2018). The classes of ncRNAs that are epigenetically related are long non-coding RNAs (lncRNAs) and three types of short non-coding RNAs: microRNAs (miRNAs), short interfering RNAs (siRNAs), and Piwi-interacting RNAs (piRNAs) (Collins et al., 2011; Peschansky and Wahlestedt, 2013). However, the data available on CD36 regulation by ncRNAs show an involvement of the latter only in post-transcriptional mechanisms controlling CD36 mRNA levels, with no input from the epigenetic machinery.

#### DNA Methylation of *CD36*

The human CD36 gene presents several promoter sequences and distal cis-regulatory elements, which contain a number of CpG sites (Mikkelsen et al., 2010). Changes in CpG methylation patterns in the *CD36* promoters have been studied more extensively in relation to lipid metabolism and associated disorders. One study investigated the impact of *CD36* promoter SNPs and DNA methylation sites on postprandial lipid uptake and clearance in adipose and heart tissue (Love-Gregory et al., 2016). Several SNPs were associated with higher chylomicron (CM) remnants and LDL particle numbers, as well as a delayed triglyceride (TG) clearance, and some of them also correlated with lower CD36 mRNA levels and aligned to binding sites for PPAR-γ. Furthermore, the SNPs negatively related to CD36 level were associated with methylation at several CpG sites, although the two factors seem to function independently in regulating *CD36* mRNA expression (Love-Gregory et al., 2016).

Several studies have focused on obesity and its related co-morbidities: metabolic syndrome or obesity hypoventilation syndrome. In a weight loss and n-3 polyunsaturated fatty acid (PUFA) supplementation study in young overweight women, *CD36* promoter methylation was significantly reduced when adjusted for baseline body weight (Amaral et al., 2014). *CD36* was hypermethylated and less expressed in abdominal omental visceral adipose tissue (OVAT) from non-obese individuals than obese individuals (Keller et al., 2017). Fat deposits in OVAT have been shown to more strongly correlate with a higher risk of obesity-related co-morbidities than subcutaneous deposits (Vega et al., 2006). *CD36* hypermethylation has also been found in monocytes from patients with obesity hypoventilation syndrome, following the application of positive airway pressure during sleep (the current treatment for the condition) (Cortese et al., 2016). Another study investigated the methylation of the most significant TG-associated CpG in a regulatory region within *CD36* (Allum et al., 2015): expression of the main CD36 transcript in adipose tissue from obese individuals with or without metabolic syndrome was negatively associated with methylation of this regulatory region.

The DNA methylation profile of *CD36* has also been investigated in liver tissue. In human primary hepatocytes, *CD36* was hypomethylated and upregulated after 5 days of exposure to valproic acid (VPA), an HDAC inhibitor that can also modulate the expression and DNA methylation level of genes involved in liver steatosis (van Breda et al., 2018). Yu et al. (2015) reported that *CD36* hypermethylation was induced in the livers of adult offspring mice by a high-lipid, high-energy maternal diet during gestation and lactation.

In addition to metabolic diseases, CD36 can be involved in the occurrence of cancer. Only one study has focused on the DNA methylation status of *CD36* in relation to tumor progression. Sun et al. (2018) reported hypermethylation of *CD36*, correlated with low expression, in primary lung tumors. CD36 inhibited migration, invasion, and proliferation of lung cancer cells and arrested the cell cycle in the G0/G1 phase. Furthermore, treatment with decitabine, an inhibitor of DNA methylation, and chidamide, an HDAC inhibitor, decreased the methylation and increased the mRNA expression level of *CD36*.

Currently, most of the data regarding changes in *CD36* DNA methylation come from genome-wide or epigenome-wide studies focusing on obesity and metabolic disorders, such as diabetes. The different DNA methylation signatures found in adipose tissue from lean vs. obese individuals with or without metabolic disorders could be investigated further to establish the causality between these epigenetic variants and associated pathologies. An overlooked area of research is the involvement of *CD36* methylation in the onset of cancer.

A summary of DNA methylation changes affecting the CD36 gene is presented in **Table 1**.

#### Histone Modifications of *CD36*

The CD36 gene promoters and distal enhancers that bind CCCTC-binding factor (CTCF) or PPAR-γ are subjected to both histone acetylation and methylation (Mikkelsen et al., 2010). Although histone tails can be subjected to many other covalent modifications, they have not been documented in CD36 expression. Changes in *CD36* histone marks have been studied mostly in adipocytes (Steger et al., 2008; Mikkelsen et al., 2010), monocytes/macrophages (Choi et al., 2005; Bekkering et al., 2014; Cortese et al., 2017), and hepatocytes (Cao et al., 2013; Zhong et al., 2017), but also in erythroid precursors (Cui et al., 2009), fibroblasts (Garbes et al., 2012), endothelial cells (Ren et al., 2016), and microglia (Xia et al., 2017). A summary of histone modifications affecting *CD36* is presented in **Table 1**.

TABLE 1 | *CD36* epigenetic changes based on cell type/tissue.

Gene expression of *CD36* in human adipocytes has been shown to be activated by both H3K4me3 at P3, the major *CD36* promoter, and H3K27Ac in PPAR-γ enhancer (Mikkelsen et al., 2010). Intergenic enrichment of H3K79 monomethylation upstream of *CD36* correlates with PPAR occupancy (Steger et al., 2008). The H3K4me3 mark at the *CD36* promoter has also been linked to differentiation of hematopoietic stem/progenitor cells to erythroid precursors (Cui et al., 2009), and in macrophages it accompanies the switch to a pro-inflammatory phenotype (Bekkering et al., 2014). Another histone mark affecting CD36 expression in macrophages is H3K9Ac enrichment, which is found in aortic macrophages exposed to long-term chronic intermittent hypoxia, resulting in a higher *CD36* mRNA level (Cortese et al., 2017). Increased H3 acetylation upstream of the *CD36* promoter has also been correlated with hepatic accumulation of TGs (Cao et al., 2013).

Activation or repression of *CD36* expression *via* histone modifications can be induced pharmacologically. Trichostatin A (TSA), a specific HDAC inhibitor, stimulates acetylation of H4 at the *CD36* promoter and increases *CD36* mRNA, followed by higher uptake of oxLDL in macrophages (Choi et al., 2005). CD36 repression has been reported with lysophosphatidic acid, *via* HDAC7 in endothelial cells (Ren et al., 2016), and with an HDAC3 inhibitor (RGFP966) in primary microglia (Xia et al., 2017).

Two studies have shown that the relationship between CD36 and histone modifiers is not unidirectional, and the former is capable of influencing the activity of the latter. Zhong et al. (2017) reported that *CD36* deletion inhibits nuclear HDAC2 expression in hepatocytes, changing the acetylation of histones binding to the monocyte chemoattractant protein-1 (MCP-1) promoters and increasing macrophage infiltration and hepatic inflammation. In a study investigating the efficiency of VPA treatment in patients with spinal muscular atrophy, non-responsiveness to the drug was linked to CD36 overexpression, which suppressed the inhibitory effect of VPA on HDACs (Garbes et al., 2012).

*CD36* histone modifications affect cells involved in lipid metabolism, but in contrast to DNA methylation patterns, changes in *CD36* histone marks in these cells have been associated mostly with inflammation (Bekkering et al., 2014; Zhong et al., 2017). Several studies activating or inhibiting CD36 *via* histone modifiers have shown the therapeutic potential of these histone marks as targets for regulating the inflammatory response (Bekkering et al., 2014; Zhong et al., 2017). Although histone modifications have been documented extensively in cancer (Audia and Campbell, 2016), no study has yet linked *CD36* histone marks with tumorigenesis.

#### Post-Transcriptional Regulation

miRNAs are small single-stranded RNA molecules (18–25 nucleotides) (Sato et al., 2011; Moutinho and Esteller, 2017) that start off as long primary miRNA transcripts (pri-miRNAs), generated by RNA polymerase II. They are cleaved into hairpin precursors (pre-miRNAs) in the nucleus by a complex containing the RNase III Drosha (Sato et al., 2011; Moutinho and Esteller, 2017), exported to the cytoplasm, and further cleaved into miRNA duplexes by RNase III Dicer (Sato et al., 2011; Moutinho and Esteller, 2017). One of the RNA strands is incorporated into the RNA-induced silencing complex (RISC) and drives its binding to the 3′-UTR of the target mRNA, where it induces cleavage and degradation or translational repression (Collins et al., 2011; Sato et al., 2011; Moutinho and Esteller, 2017). miRNA expression can be regulated by epigenetic mechanisms (Tuna et al., 2016; Moutinho and Esteller, 2017). miRNAs themselves can control the epigenetic machinery by modulating the activity of various epigenetic effectors at a post-transcriptional level in the cytoplasm. However, miRNAs can also act in the nucleus at the transcriptional level, activating or repressing gene transcription by inducing changes in the chromatin state (Moutinho and Esteller, 2017; Ramassone et al., 2018).
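The binding of a RISC-loaded miRNA to a 3′-UTR, as described in this paragraph, is commonly approximated in silico by searching the UTR for sequences complementary to the miRNA "seed" (nucleotides 2 to 8 from the 5′ end). The sketch below illustrates that idea under simplifying assumptions: it considers only perfect Watson-Crick seed matches and ignores site context, conservation, and binding energy, all of which dedicated prediction tools take into account. The function names and both sequences are hypothetical and do not correspond to any real miRNA or to the *CD36* 3′-UTR.

```python
# Watson-Crick complement table for RNA.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def to_rna(seq: str) -> str:
    """Normalise a sequence to upper-case RNA letters (T becomes U)."""
    return seq.upper().replace("T", "U")


def seed_site(mirna: str, start: int = 2, end: int = 8) -> str:
    """Return the mRNA sequence (5'->3') that pairs with the miRNA seed.

    The seed is taken as nucleotides start..end (1-based, inclusive) of the
    miRNA; the target site is its reverse complement.
    """
    seed = to_rna(mirna)[start - 1:end]
    return seed.translate(COMPLEMENT)[::-1]


def find_seed_matches(mirna: str, utr: str) -> list:
    """Return 0-based start positions of perfect seed matches in the UTR."""
    site = seed_site(mirna)
    utr = to_rna(utr)
    positions = []
    i = utr.find(site)
    while i != -1:
        positions.append(i)
        i = utr.find(site, i + 1)  # allow overlapping matches
    return positions


# Hypothetical toy sequences, for illustration only.
TOY_MIRNA = "AACGGUUACCGAUUGCAUGCAA"
TOY_UTR = "GGGUAACCGUCCCAAAUAACCGUGG"
```

A seed match found this way is only a candidate site; the functional interactions reported in this review were established experimentally, for example with reporter assays on the *CD36* 3′-UTR.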

lncRNAs are non-protein-coding transcripts longer than 200 nucleotides with higher tissue specificity than protein-coding genes (Morlando and Fatica, 2018). Through interaction with other molecules (proteins, DNA, other RNAs), lncRNAs can regulate gene expression at transcriptional and post-transcriptional levels and direct epigenetic changes by coordinating chromatin remodeling, affecting DNA methylation, or acting as competing endogenous RNAs by binding to the target sequence of miRNAs (Forrest and Khalil, 2017; Morlando and Fatica, 2018; Hanly et al., 2018; Hu et al., 2018).

#### CD36 Regulation by miRNAs

*CD36* mRNA is targeted by different species of miRNA that modulate its expression at a post-transcriptional level in a tissue-specific manner. **Table 2** summarizes the species of ncRNAs (miRNAs and lncRNAs) demonstrated to be involved in the regulation of CD36 expression. Zhou et al. (2016) reported that CD36 is increased during bone marrow cell differentiation towards the monocytic-macrophage line and associated with differential expression of seven miRNA species; miR-130a, -134, -141, -199a, and -363 were decreased, and miR-152 and -342-3p were increased. The miR expression profiling in erythropoiesis revealed that miR-16, miR-22, miR-26a, and miR-223 correlated with the appearance and level of CD36 as an erythroid surface antigen (Choong et al., 2007).


Given the association of CD36 with lipid metabolism and the robust expression on circulating monocytes and tissue macrophages, a lot of data have been collected from *in vitro* models using these cells. For example, in an *in vitro* atherosclerosis model based on oxLDL-stimulated dendritic cells derived from circulating monocytes, Chen et al. (2011) found that increased levels of miR-29a were associated with higher expression of CD36 at both the protein and mRNA level. miR-155 induced expression of CD36 in oxLDL-treated macrophages, which promoted their differentiation into dendritic cells (DCs) (So et al., 2017). In addition, silencing miR-155 in a human monocytic-macrophage cell line upregulated the expression of CD36 and increased the uptake of lipids induced by oxLDL. CD36 and scavenger receptor A (SRA) have been shown to be involved in oxLDL-mediated miR-155 upregulation in DCs (Yan et al., 2016).

Several other studies have reported post-transcriptional regulation of this fatty-acid translocator by miRNAs. miR-758-5p was shown to specifically bind to the 3′-UTR of *CD36* mRNA, downregulating both the mRNA and protein in macrophages (Li et al., 2017). Furthermore, these biochemical alterations were specifically related to reduced cellular cholesterol uptake. miR-34a is also involved in downregulation of CD36, as miR34a−/− mice exhibit increased protein expression and increased lipid content in adipocytes (Lavery et al., 2016). Other miRNA species involved in downregulation of CD36 are miR-135a (Du and Lu, 2018), miR-181a (Du et al., 2018), and miR-182-5p (Qin et al., 2018).

As with histone modifiers, interactions between CD36 and miRNAs can go both ways. For example, oxLDL binding to CD36 suppresses cellular Dicer levels and subsequent miR-30c-5p expression (Ceolotto et al., 2017). In human prostate cancer cells, CD36 activation *via* TSP-2 binding downregulates miR-376c *via* a MAPK-dependent pathway (Chen et al., 2017).

Apart from several miRs that control the exposure of CD36 on the surface of erythroid precursors, most miR species that have been shown to influence CD36 expression are involved in regulating lipid metabolism. For example, miR-133a attenuates lipid accumulation *via* the testicular orphan nuclear receptor 4 (TR4)-CD36 pathway in macrophages and may serve as a potential biomarker and potent therapeutic agent for atherosclerosis (Peng et al., 2016). On the other hand, CD36 is downregulated in hypertensive kidney (cortex), putatively targeted by miR-26a/b (Marques et al., 2011).

Evidence is currently insufficient to support the clinical use of these miRs as diagnostic or prognostic biomarkers, but further data could establish their clinical relevance.

#### CD36 Regulation by lncRNAs

Fewer studies have investigated the roles of various lncRNAs in controlling CD36 expression, and they focus mainly on lipid uptake in macrophages. Two recent papers have demonstrated the involvement of lncRNAs metastasis-associated lung adenocarcinoma transcript 1 (MALAT1) (Huangfu et al., 2018) and nuclear-enriched abundant transcript 1 (NEAT1) (Huang-Fu et al., 2017) in oxLDL-induced CD36-mediated lipid uptake by macrophages, which plays an important role in the development of atherosclerosis. MALAT1 was first identified in non-small cell lung cancer patients (Ji et al., 2003) but has been shown to be expressed in almost all human tissues (Zhang et al., 2017). Although MALAT1 and NEAT1 loci are adjacent in the genome, their nuclear localization is distinct: MALAT1 is found in nuclear speckles, nuclear bodies that contain various pre-mRNA splicing factors, whereas NEAT1 is localized in paraspeckles, subnuclear structures present in the interchromatin space near the nuclear speckles (West et al., 2014). Huangfu et al. (2018) show that oxLDL promotes MALAT1 transcription, which induces CD36 upregulation by binding to β-catenin and promoting its accumulation on PPAR-γ responding sites of the *CD36* promoter. Even though MALAT1 is known to act in epigenetic events, such as histone modification induced by an interaction with complexes PRC1 and PRC2 (Yang et al., 2011; Abdouh et al., 2016; Qi et al., 2016; Kim et al., 2017; Biswas et al., 2018), it is not clear whether *CD36* transcription is affected by direct binding of the lncRNA to the promoter or by changing the nucleosome organization. oxLDL upregulates both NEAT1 isoforms (NEAT1 and NEAT1\_2) and also stimulates NEAT1\_2 mediated paraspeckle formation, which then suppresses lipid uptake by stabilizing *CD36* mRNA in paraspeckles (Huang-Fu et al., 2017).

CD36 upregulation was induced by overexpression of lncRNA E330013P06 in macrophages from diabetic mice (Reddy et al., 2014), increasing foam cell formation and contributing to an enhanced inflammatory and atherogenic phenotype in macrophages. The human equivalent of lncRNA E33, MIR143HG, which has a genomic organization similar to that of the mouse gene, was also overexpressed in monocytes from patients with type 2 diabetes (Reddy et al., 2014). In another study focusing on aberrantly expressed lncRNAs in patients with type 2 diabetes, *CD36* mRNA expression had a high correlation coefficient with two co-expressed lncRNAs (NONCODE IDs n382000 and n341587) (Wang et al., 2017), suggesting a possible role in the pathogenesis of type 2 diabetes through regulation of inflammation.

Finally, CD36 expression and that of other genes involved in lipid synthesis and uptake can be controlled by the interaction between an ultraconserved (uc) lncRNA and two species of miRs (Guo et al., 2018). The upregulation of uc.372 drives hepatic steatosis in mice, as well as patients with non-alcoholic fatty liver disease (NAFLD), by intervening in the maturation of miR-195 and miR-4668. In the case of *CD36*, uc.372 binds to the terminal loop region of pri-miR-4668, blocking its maturation and relieving its gene silencing effect on *CD36*. Because uc.372 promotes hepatic steatosis through these mechanisms, uc.372 inhibitors could be potential therapeutic agents for NAFLD.

#### CONCLUSIONS

CD36 is a widespread protein with various functions depending on tissue localization, from adhesion protein to scavenger for oxidized phospholipids and lipoproteins, or fatty acid translocator. CD36 is involved in cardiovascular and metabolic diseases, as well as immune responses and cancer metastasis. However, CD36 has been studied in detail in some cells (monocytes-macrophages) and diseases (obesity, atherosclerosis/cardiovascular diseases), whereas other areas (tumor metastasis) are less covered. Genetic mutations in *CD36* and changes in protein expression have been repeatedly reported in relation to cardiac artery disease, metabolic syndrome and obesity, malaria, and tumor spreading. Beyond genetic variants of *CD36*, epigenetic and post-transcriptional controls of *CD36* gene expression play a significant role in protein function and may serve as regulatory circuits, which are still insufficiently explored. DNA methylation of *CD36* has been studied mostly in relation to lipid metabolism and obesity, whereas changes in *CD36* histone marks have been linked to inflammation. Both epigenetic mechanisms can serve as targets for drug development in their associated pathologies. Thus far, a clear relationship has been demonstrated between certain miRNAs, CD36, and altered lipid profile that can be further exploited therapeutically. However, miRNAs do not stand alone in this post-transcriptional regulation, as long non-coding RNAs are also responsible for the stability of *CD36* mRNA. No data are yet available on the involvement of other small non-coding RNA species in CD36 expression, such as endogenous siRNAs or piRNAs. Thus, in terms of epigenetic control of CD36, there are still many gaps to be filled, especially in cancer research.

### AUTHOR CONTRIBUTIONS

C-MN, A-ME, and MH gathered the data and wrote the manuscript; MH corrected the final proof. All authors have read and approved the submitted form.

### FUNDING

This manuscript was funded by the Ministry of Research and Innovation in Romania, under Program 1—The Improvement of the National System of Research and Development, Subprogram 1.2—Institutional Excellence—Projects of Excellence Funding in RDI, Contract No. 7PFE/16.10.2018 and grant COP A 1.2.3., ID: P\_40\_197/2016.







**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Niculite, Enciu and Hinescu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Epigenetic Regulation of Excitatory Amino Acid Transporter 2 in Neurological Disorders

*Mohammad Afaque Alam and Prasun K. Datta\**

Department of Neuroscience, Center for Comprehensive NeuroAIDS, Lewis Katz School of Medicine at Temple University, Philadelphia, PA, United States

Excitatory amino acid transporter 2 (EAAT2) is the predominant astrocyte glutamate transporter involved in the reuptake of the majority of the synaptic glutamate in the mammalian central nervous system (CNS). Gene expression can be altered without changing DNA sequences through epigenetic mechanisms. Mechanisms of epigenetic regulation include DNA methylation, post-translational modifications of histones, chromatin remodeling, and small non-coding RNAs. This review is focused on neurological disorders, such as glioblastoma multiforme (GBM), Alzheimer's disease (AD), amyotrophic lateral sclerosis (ALS), Parkinson's disease (PD), bipolar disorder (BD), and neuroHIV, where there is evidence that epigenetics plays a role in the reduction of EAAT2 expression. The emerging field of pharmaco-epigenetics provides a novel avenue for epigenetics-based drug therapy. This review highlights findings on the role of epigenetics in the regulation of EAAT2 in different neurological disorders and discusses the current pharmacological approaches used and the potential use of novel therapeutic approaches to induce EAAT2 expression in neurological disorders using CRISPR/Cas9 technology.

#### Edited by:

Chandravanu Dash, Meharry Medical College, United States

#### Reviewed by:

Lin Cheng, The University of Iowa, United States Gurudutt Pendyala, University of Nebraska Medical Center, United States

#### \*Correspondence:

Prasun K. Datta dattapk@temple.edu

#### Specialty section:

This article was submitted to Pharmacogenetics and Pharmacogenomics, a section of the journal Frontiers in Pharmacology

Received: 04 July 2019 Accepted: 21 November 2019 Published: 13 December 2019

#### Citation:

Alam MA and Datta PK (2019) Epigenetic Regulation of Excitatory Amino Acid Transporter 2 in Neurological Disorders. Front. Pharmacol. 10:1510. doi: 10.3389/fphar.2019.01510

Keywords: glutamate, microRNA, excitatory amino acid transporter 2, DNA methyltransferase, histone deacetylase, CRISPR/Cas9

### INTRODUCTION

The human excitatory amino acid transporter 2 (EAAT2), known as glutamate transporter 1 (GLT-1) in rodents, is the primary glutamate transporter in astrocytes (Rothstein et al., 1996; Sheldon and Robinson, 2007; Kim et al., 2011), and handles 90% of total glutamate uptake in the CNS (Tanaka et al., 1997). The SLC1A2 (solute carrier family 1, member 2) gene, located on chromosome 11, codes for EAAT2 in humans (Meyer et al., 1997), while in mouse it is located on chromosome 2 and is known as glutamate transporter 1 (GLT1). The size of the human and mouse EAAT2 gene is ~11.7 and ~11.5 kb, respectively, and both genes contain 11 exons (**Figure 1A**).

Promoter analysis of the human EAAT2 gene reveals that it harbors transcription factor binding sites for Nuclear factor kappa-B (NFκB), Specificity protein 1 (Sp1), cAMP responsive element binding protein (CREB), and Yin Yang 1 (YY1), as well as a peroxisome proliferator activated receptor (PPAR) response element (Su et al., 2003; Sitcheran et al., 2005; Zschocke et al., 2007; Romera et al., 2007; Allritz et al., 2010; Unger et al., 2012; Karki et al., 2014; Vartak-Sharma et al., 2014). Also, the proximal promoter harbors CpG islands at positions −1472 to −1146 and −680 to −494, with 17 and 15 CpG dinucleotides, respectively (Zschocke et al., 2007). The human EAAT2 cDNA harbors an unusually long 3'-UTR of 9684 bp (Kim et al., 2003). Sequence analysis shows that the 3'-UTR of EAAT2 cDNA is nearly identical and conserved in human, macaque, rat, and mouse (Kim et al., 2003). This observation suggests that EAAT2 mRNA expression is likely to be regulated at the post-transcriptional level by miRNAs.

FIGURE 1 | (A) [...] size of SLC1A2 gene is 11704 bp in human and 11565 bp in mouse. (B) Schematic depiction of the organization of EAAT2 in the plasma membrane as deduced from crystallographic data (Yernool et al., 2004) and adapted from Boston-Howes et al. (2006). The protein contains eight transmembrane domains and two helical hairpin loops (HP1 and HP2). These hairpin structures are involved in the transport of amino acids, mainly L-glutamate. (C) Schematic representation of the mechanism of glutamate-mediated excitotoxicity in the synaptic cleft due to dysregulation of EAAT2 expression in astrocytes. Under normal conditions, upon depolarization of the presynaptic nerve terminal, glutamate is released from synaptic vesicles. Released glutamate then binds to ionotropic glutamate receptors (NMDA-R and AMPA-R) on the postsynaptic terminal, which results in depolarization and action potential generation. Glutamate is then removed quickly from the synaptic cleft by the astrocyte EAAT2 transporter to prevent overstimulation of glutamate receptors. However, excessive glutamate accumulation in the synaptic cleft due to dysregulation of astrocyte EAAT2 expression causes overstimulation of NMDA and AMPA receptors, which results in the build-up of intracellular Ca²⁺ ions, leading to neuronal death or excitotoxicity.

EAAT2 is a plasma membrane, sodium-dependent, high-affinity amino acid transporter that mediates the uptake of L-glutamate (Arriza et al., 1994). In brief, the protein has eight transmembrane domains, with the amino- and carboxy-termini located intracellularly (**Figure 1B**). It clears the excitatory neurotransmitter glutamate from the extracellular space at synapses in the brain (Rothstein et al., 1996). Glutamate clearance by astrocytes is critical for proper synaptic activation; in addition, glutamate is converted to glutamine, which is transported out of the astrocytes into neurons for reuse in glutamate synthesis (Erecińska and Silver, 1990). Furthermore, reuptake of glutamate by EAAT2 prevents neuronal damage caused by excessive activation of NMDA receptors (**Figure 1C**), a phenomenon known as excitotoxicity (Olney and Sharpe, 1969).

### EPIGENETIC REGULATORS: THE WRITERS, READERS, AND ERASERS

Epigenetics is defined as changes in gene expression that do not involve changes in the DNA sequence. The epigenetic "writers" are enzymes, such as DNA methyltransferases, histone lysine methyltransferases, protein arginine methyltransferases, and histone acetyltransferases, that catalyze the addition of a functional group to a protein or nucleic acid (Gillette and Hill, 2015). The epigenetic "readers" are proteins or enzymes, such as methyl CpG binding proteins, histone methylation readers, and histone acetylation readers, that recognize methylated DNA, methylated lysine residues in proteins, and acetylated histones, respectively. The epigenetic "erasers" are enzymes, such as the ten-eleven translocation (TET) family of proteins, histone demethylases, and histone deacetylases (HDACs), that demethylate DNA, demethylate lysine residues on histone proteins, and deacetylate histone proteins, respectively (see reviews: Gillette and Hill, 2015; Biswas and Rao, 2018).

## DNA METHYLTRANSFERASES (DNMTS)

DNMTs are classified into three categories: DNMT1, DNMT2, and DNMT3 [DNMT3a, DNMT3b, and DNMT3L] (Lyko, 2018; Gujar et al., 2019). DNMT1 is involved in maintenance methylation (Ren et al., 2018). DNMT3a and DNMT3b methylate cytosine residues in CpG island(s) and are considered *de novo* methyltransferases. DNMT1, DNMT3a, and DNMT3b catalyze the transfer of a methyl group from S-adenosylmethionine (SAM) to cytosine, resulting in 5-methylcytosine (5-mC). 5-mC acts as a stable transcriptional repressor (Kitsera et al., 2017). DNMT2 and DNMT3L are non-canonical family members, as they do not possess catalytic DNMT activity (Lyko, 2018).

### TEN-ELEVEN TRANSLOCATION

DNA demethylation involves the TET family of methylcytosine dioxygenases, which are α-ketoglutarate (α-KG)-dependent enzymes (Koivunen and Laukka, 2018). This family consists of TET1, TET2, and TET3, which participate in the conversion of 5-mC to 5-hydroxymethylcytosine (5-hmC) to promote reversal of methylation (Ito et al., 2010; Melamed et al., 2018). In addition, studies have shown that TET enzymes also catalyze the conversion of 5-hmC to 5-formylcytosine (5-fC) and 5-carboxylcytosine (5-caC). These modifications serve as DNA demethylation intermediates and are subject to deamination, glycosylase-dependent excision, and repair, resulting in a reversion to unmodified cytosine (Antunes et al., 2019).

### DNMT EXPRESSION IN ASTROCYTES

In late-stage embryonic development in the brain, DNMT3a is ubiquitously expressed, while DNMT3b expression level decreases but remains high in comparison to early-stage embryos (Okano et al., 1999). The expression of DNMT1 and DNMT3a has been documented in rat brain cortical astrocytes (Zhang et al., 2014).

## TET EXPRESSION IN ASTROCYTES

In the brain, NeuN-positive neuronal cells express all forms of TETs (Kaas et al., 2013; Li et al., 2014). These observations are in tune with reports that neuronal cells are enriched for 5-hmC (Szulwach et al., 2011). TET1 expression has been documented in glial fibrillary acidic protein (GFAP)-positive astrocytes in the adult mouse hippocampus (Kaas et al., 2013). It has been observed that TET enzymatic activity is inhibited by increased production of 2-hydroxyglutarate in gliomas as a consequence of oncogenic mutations in the metabolic regulators IDH1 (isocitrate dehydrogenase 1) and IDH2 (isocitrate dehydrogenase 2) (Reiter-Brennan et al., 2018).

## HISTONE DEACETYLASES

HDACs are grouped into four classes based on their amino acid sequence, domain organization, and catalytic dependence (de Ruijter et al., 2003). Class I, II, and IV HDACs are zinc-dependent, while class III HDACs are nicotinamide adenine dinucleotide (NAD+) dependent. The class I HDACs include HDAC1, -2, -3, and -8, while class II includes HDAC4, -5, -6, -7, -9, and -10, and class IV is represented by HDAC11 (de Ruijter et al., 2003). Class III HDACs include sirtuins 1–7 (SIRT1–7), which are structurally unrelated to the other HDACs (Carafa et al., 2016).
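The classification above can be restated as a small lookup table. The following Python sketch is purely illustrative: the names `HDAC_CLASSES`, `COFACTOR`, and `classify` are hypothetical helpers, and the class and cofactor assignments simply repeat those given in the text.

```python
# Illustrative lookup of the HDAC classes described in the text.
# HDAC_CLASSES, COFACTOR, and classify are hypothetical names.
HDAC_CLASSES = {
    "I": ["HDAC1", "HDAC2", "HDAC3", "HDAC8"],
    "II": ["HDAC4", "HDAC5", "HDAC6", "HDAC7", "HDAC9", "HDAC10"],
    "III": [f"SIRT{i}" for i in range(1, 8)],  # sirtuins 1-7
    "IV": ["HDAC11"],
}

# Catalytic dependence of each class, as stated in the text.
COFACTOR = {"I": "Zn2+", "II": "Zn2+", "III": "NAD+", "IV": "Zn2+"}


def classify(enzyme):
    """Return (class, cofactor) for an HDAC or sirtuin name."""
    name = enzyme.upper()
    for hdac_class, members in HDAC_CLASSES.items():
        if name in members:
            return hdac_class, COFACTOR[hdac_class]
    raise KeyError(f"{enzyme} is not listed in the classification")
```

For example, `classify("HDAC6")` looks up a class II, zinc-dependent enzyme, while `classify("SIRT5")` looks up a class III, NAD+ dependent enzyme.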

### HDAC EXPRESSION IN ASTROCYTES

A comprehensive study was the first to demonstrate the expression of HDACs in astrocytes, oligodendrocytes, neurons, and endothelial cells of the rat brain using high-resolution *in situ* hybridization (ISH) coupled with immunohistochemistry (Broide et al., 2007). The study showed that GFAP-positive astrocytes expressed HDAC3 to HDAC11 (Broide et al., 2007). However, a more recent study reported that only HDAC1, 2, and 4 are expressed in rat astrocytes (Kalinin et al., 2013). HDAC1, 2, 3, and 8 are expressed in normal human astrocytes and in glioblastoma multiforme (GBM)-derived astrocytic cell lines (Zhang et al., 2016).

### SIRTUIN EXPRESSION IN ASTROCYTES

Among the class III HDACs, SIRT1 is the most conserved member of the sirtuin family of NAD+ dependent protein deacetylases (Cohen et al., 2004) and is predominantly a nuclear enzyme but is also present in the mitochondria (Tang, 2016). SIRT1 is expressed in mouse (Li M et al., 2018), rat, and human astrocytes (Hu et al., 2017). SIRT2 is a cytoplasmic enzyme (Braidy et al., 2015), and its expression was observed in rat hippocampus and cerebral cortex. Unlike SIRT1, which is primarily a nuclear enzyme, SIRT3, 4, and 5 are mitochondrial enzymes (Jęśko et al., 2017; Sidorova-Darmos et al., 2018). The expression of SIRT3 was shown in rat astrocytes (Li X et al., 2018). SIRT4 is highly expressed in rat astrocytes (Komlos et al., 2013). It is reported that SIRT5 is expressed in rat striatum (Omonijo et al., 2014). Not much is known about the astrocyte-specific expression of SIRT6 and SIRT7, which are nuclear enzymes, except for the fact that they are expressed in rat hippocampus and cerebral cortex (Braidy et al., 2015).

### NONCODING RNA: miRNAs

miRNAs are small noncoding RNAs (20–22 nucleotides) that regulate gene expression by binding, through their seed sequences, to complementary sites located in the 3'-UTR of mRNAs (He and Hannon, 2004; Bartel, 2009). The complementarity between the miRNA seed sequence and its target mRNA determines the fate of the mRNA, resulting in either translational repression or mRNA cleavage (Guo et al., 2010). A single miRNA can regulate many different mRNAs and can bind to a single site or to multiple sites within the 3'-UTR of an mRNA.
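Seed-based target recognition can be illustrated with a short script. The sketch below is a minimal example, assuming the commonly used definition of the seed as nucleotides 2–8 of the miRNA and requiring perfect complementarity; the function names and both sequences are hypothetical toy data, not EAAT2 or any real miRNA. Real target prediction considers additional features that are not modeled here.

```python
# Minimal sketch of miRNA seed matching in a 3'-UTR.
# Assumption: seed = miRNA nucleotides 2-8 (1-based), perfect match only.
# The sequences below are invented toy data for illustration.

def reverse_complement_rna(seq):
    """Return the reverse complement of an RNA sequence."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


def seed_sites(mirna, utr, seed_start=2, seed_end=8):
    """Return the seed, its complementary site, and 0-based UTR positions."""
    seed = mirna[seed_start - 1:seed_end]
    site = reverse_complement_rna(seed)
    positions = []
    index = utr.find(site)
    while index != -1:
        positions.append(index)
        index = utr.find(site, index + 1)  # allow overlapping sites
    return seed, site, positions


toy_mirna = "UACGGAUCCAUGCAAGUCUGAA"   # hypothetical 22-nt miRNA
toy_utr = "AAGAUCCGUCCUUUGAUCCGUAA"    # hypothetical 3'-UTR fragment
seed, site, positions = seed_sites(toy_mirna, toy_utr)
print(seed, site, positions)
```

In this toy case the 3'-UTR fragment carries two copies of the site, which mirrors the statement that a single miRNA can bind to multiple sites within one 3'-UTR.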

### NEUROLOGICAL DISORDERS AND EAAT2 EXPRESSION

In this section, we describe the various neurological disorders in which dysregulation of EAAT2 expression has been reported. A summary of the epigenetic changes affecting the EAAT2 gene is presented in **Table 1**. The epigenetic changes that are involved in EAAT2 expression are shown in **Figure 2**.

## GLIOBLASTOMA MULTIFORME (GBM)

GBM, a WHO grade IV astrocytoma, is an extremely aggressive, invasive, and destructive primary brain tumor in the adult population (Geraldo et al., 2019). Lee and co-workers demonstrated a strong negative correlation between the expression of Astrocyte Elevated Gene-1 (AEG-1), an oncogene, and EAAT2 by immunofluorescence analyses in human glioma tissue arrays (Lee et al., 2011). Dysregulation of EAAT2 expression is also seen in cell lines derived from tumors (Zschocke et al., 2007; Lee et al., 2011). In two different glioma cell lines that lack EAAT2 expression, A172 and LN18, profiling of DNA methylation by bisulfite sequencing revealed hypermethylation in both CpG islands of the EAAT2 promoter (Zschocke et al., 2007).
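The logic behind methylation calling from bisulfite sequencing can be sketched in a few lines. Bisulfite treatment converts unmethylated cytosine to uracil, which is read as thymine after amplification, whereas 5-mC is protected and still reads as cytosine. The example below is a simplified illustration under several assumptions: the read is already aligned to the forward strand of the reference, both sequences have equal length, and there are no sequencing errors. The function names and sequences are hypothetical and do not represent the EAAT2 promoter or the pipeline used in the cited study.

```python
# Simplified methylation calling at CpG sites from one bisulfite read.
# Assumptions: read pre-aligned to the forward strand, equal lengths,
# no sequencing errors. Sequences are invented toy data.

def cpg_methylation_calls(reference, bisulfite_read):
    """Map each CpG cytosine position (0-based) to a methylation call."""
    if len(reference) != len(bisulfite_read):
        raise ValueError("reference and read must have the same length")
    calls = {}
    for i in range(len(reference) - 1):
        if reference[i:i + 2] == "CG":
            base = bisulfite_read[i]
            if base == "C":
                calls[i] = "methylated"      # protected from conversion
            elif base == "T":
                calls[i] = "unmethylated"    # converted C -> U -> T
            else:
                calls[i] = "ambiguous"
    return calls


def methylation_fraction(calls):
    """Fraction of informative CpG sites called methylated, or None."""
    informative = [c for c in calls.values() if c != "ambiguous"]
    if not informative:
        return None
    return sum(c == "methylated" for c in informative) / len(informative)


toy_reference = "ACGTTCGACCGGA"
toy_read = "ACGTTTGATCGGA"   # non-CpG C and one CpG C read as T
toy_calls = cpg_methylation_calls(toy_reference, toy_read)
print(toy_calls, methylation_fraction(toy_calls))
```

A promoter region would be described as hypermethylated when this fraction, averaged over many reads or clones, is high relative to a control sample.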

## ALZHEIMER'S DISEASE (AD)

AD is a chronic neurodegenerative disorder that accounts for 60% to 70% of dementia cases worldwide. Most forms of AD are sporadic, and less than 1% of all cases are familial AD. Early-onset AD is caused by mutations of the genes for APP (amyloid precursor protein), PSEN1 (presenilin 1), and PSEN2 (presenilin 2) (Giau et al., 2019). The pathological hallmarks are β-amyloid plaques, which are localized extracellularly, and neurofibrillary tangles, which are localized intracellularly, especially in the frontal cortex and hippocampus (Pinheiro and Faustino, 2019). Studies have shown decreased EAAT2 protein expression in AD brains (Li et al., 1997; Jacob et al., 2007). EAAT2 expression is reduced in astrocytes by oligomeric Aβ via NFAT signaling (Abdul et al., 2009). Dysregulation of EAAT2 expression has been linked to the pathogenesis of AD in APPSw/Ind mice, a transgenic mouse model of AD (Takahashi et al., 2015).

TABLE 1 | Epigenetic modifications involved in dysregulation of EAAT2 expression.

FIGURE 2 | [...] expression. (C) miRNA mediated regulation. Binding of miRNA to the 3'-UTR of EAAT2 mRNA can result in miRNA mediated mRNA degradation or repression of translation, resulting in reduced expression of EAAT2. HAT, Histone acetyltransferase; HDAC, Histone deacetylase; DNMT, DNA methyltransferase; TET, Ten-eleven translocation enzyme.

### AMYOTROPHIC LATERAL SCLEROSIS (ALS)

ALS is a late-onset and devastating neurodegenerative disorder that is characterized by progressive degeneration of motor neurons in the motor cortex, spinal cord, and brainstem (Verber et al., 2019). Studies have shown that there is a loss of EAAT2 protein in the motor cortex and spinal cord in ALS patients (Rothstein et al., 1995). Reduced expression of EAAT2 protein has also been observed in transgenic mice and rats expressing familial ALS-linked mutant SOD1 (Bruijn et al., 1997; Bendotti et al., 2001; Howland et al., 2002). In addition, another epigenetic modulator, sumoylation, was shown to regulate the localization of EAAT2 in the SOD1-G93A mouse model of inherited ALS, wherein the cytosolic carboxy-terminal domain is cleaved from EAAT2 and conjugated to SUMO1, resulting in the accumulation of EAAT2 in the cytoplasm instead of its expression in the plasma membrane (Foran et al., 2014).

## PARKINSON'S DISEASE (PD)

PD is a complex neurodegenerative disorder that impacts the dopaminergic neurons located in the midbrain nucleus substantia nigra (Dauer and Przedborski, 2003). The pathological hallmark of PD is the accumulation of α-synuclein oligomers to form Lewy bodies (Wong and Krainc, 2017). EAAT2 expression is reduced in mouse models of PD induced by 6-hydroxydopamine injection into the nigrostriatal pathway (Chung et al., 2008) or by 1-methyl-4-phenyl-1,2,3,6-tetrahydropyridine (Holmer et al., 2005). Studies have shown that high manganese (Mn) levels induce manganism, the symptoms of which are similar to those of PD (Bowman et al., 2011). In this regard, Mn treatment of astrocytes inhibited EAAT2 expression by upregulating YY1, which repressed EAAT2 expression at the mRNA and protein levels (Karki et al., 2014).

### BIPOLAR DISORDER (BD)

BD is a complex neurobiological disease (Harrison et al., 2018). In BD, both glial cells and neurons are affected, and dysregulation of monoamines, altered glutamatergic neurotransmission, increased oxidative stress, mitochondrial dysfunction, and neuroinflammation play a role in the etiology of the disease (Yuksel and Ongur, 2010; Data-Franco et al., 2017). It is reported that a T-to-G polymorphism in the SLC1A2 gene promoter affects EAAT2 expression in BD (Dallaspezia et al., 2012). A recent study using high resolution melting PCR (HRM-PCR) and thymine-adenine (TA) cloning reported that the SLC1A2 promoter region was hypermethylated in BD (Jia et al., 2017).

### HIV-ASSOCIATED NEUROCOGNITIVE DISORDER (HAND)

HAND, or NeuroHIV, persists despite effective antiretroviral therapy (Saylor et al., 2016). HIV-1 and gp120 have been shown to inhibit EAAT2 expression in human fetal brain astrocytes (Wang et al., 2003). Studies using immunohistochemistry have demonstrated that expression of EAAT2 is reduced in HAND-positive brain tissues in comparison to uninfected brain tissue (Xing et al., 2009). Furthermore, it was shown that treatment of human brain astrocytes with the pro-inflammatory cytokine IL-1β induced AEG-1 expression that, in turn, upregulated YY1 expression and inhibited EAAT2 transcription (Vartak-Sharma et al., 2014). Analysis of global DNA methylation status in brain tissues of HIV-positive individuals who used methamphetamine showed increased levels of DNMT1 activity and also hypermethylation of CpG nucleotides in the SLC1A2 promoter (Desplats et al., 2014).

#### ROLE OF HDACS AND SIRTUINS IN EAAT2 EXPRESSION

Overexpression of HDAC1 and -3 (class I), and HDAC6 and -7 (class II), was shown to inhibit EAAT2 promoter activity in rat astrocytes (Karki et al., 2014). In the same study, the co-expression of HDACs with YY1 or NFκB further attenuated EAAT2 promoter activity (Karki et al., 2014). There is a lack of information on the effect of SIRTs on the regulation of EAAT2 expression. A recent metabolomics study using a SIRT5 knockout mouse model showed dysregulation of glutamate levels in the brain cortex and reduced expression of EAAT2 at the mRNA level (Koronowski et al., 2018).

#### ROLE OF miRNA IN EAAT2 EXPRESSION

The upregulation of miR-107 was shown to inhibit GLT-1 expression in a rodent model of nerve cell hypoxia/reoxygenation (H/R) injury (Yang et al., 2014). Our preliminary studies show that miR-146a reduces EAAT2 expression in U251 cells and human fetal brain astrocytes (Deshmane et al., 2018). A report demonstrated that murine neuronal miR-124a induces astroglial EAAT2 not by targeting the EAAT2 3'-UTR but by indirectly modulating astrocyte-derived factors that regulate EAAT2 expression (Morel et al., 2013). Also, exosome-mediated delivery of miR-124 was shown to induce the expression of EAAT2 in human neural precursor cells and astrocytes (Lee et al., 2014). A recent study described a novel mechanism of neurodegeneration in a rat model of ALS, in which extracellular miR-218 released from dying motor neurons inhibited EAAT2 expression in astrocytes (Hoye et al., 2018).

#### PHARMACO-EPIGENETIC STRATEGIES TO ACTIVATE EAAT2 EXPRESSION

A successful approach in the treatment of neurodegenerative diseases where epigenetics regulate gene expression could be the use of therapeutic drugs that target epigenetic mechanisms, such as DNA methylation, chromatin, and histone modifications. In this regard, significant advancements have been made to develop drugs that can restore or alter epigenetic mechanisms. In this section, we highlight the findings reported so far with DNMT inhibitors and HDAC inhibitors in the restoration of EAAT2 expression *in vitro* and *in vivo*.

### DNMT INHIBITORS

DNMT inhibitors prevent DNA methylation and, as a consequence, reduce promoter hypermethylation, which leads to re-expression of silenced genes. DNMT inhibitors have been widely used as anticancer drugs since hypermethylation of promoters of tumor suppressor genes occurs in numerous cancers (Pfister and Ashworth, 2017). The DNMT inhibitors that are approved by the US Food and Drug Administration (FDA) and widely used as anticancer drugs are nucleoside analogs. These are azacytidine (5-azacytidine) (Sorm and Vesely, 1964; Christman, 2002) and decitabine (5-aza-2'-deoxycytidine) (Pliml and Sorm, 1964). In this regard, the DNMT inhibitor azacytidine was shown to restore EAAT2 expression in a glioma cell line (Zschocke et al., 2007).

### HDAC INHIBITORS

Among the four major structural families of HDAC inhibitors, viz., short-chain aliphatic acids, hydroxamic acids, benzamides, and cyclic tetrapeptides and depsipeptides, only the efficacy of short-chain aliphatic acids, hydroxamic acids, and cyclic tetrapeptides and depsipeptides in inducing EAAT2 expression has been evaluated.

Valproic acid (VPA; a short-chain aliphatic acid), an FDA-approved anti-epileptic agent, and sodium butyrate, both of which inhibit class I and II HDACs (Göttlicher et al., 2001), were reported to prevent manganese-induced inhibition of GLT1 expression in mice (Johnson et al., 2018a; Johnson et al., 2018b). VPA induced CpG site demethylation and acetylated histone H4 enrichment in the distal part of the GLT-1 promoter in rat astrocytes (Perisic et al., 2010).

Among the hydroxamic acids, trichostatin A (TSA) (Yoshida et al., 1990) was shown to induce EAAT2 mRNA expression in glioma cells (Zschocke et al., 2007) and EAAT2 promoter activity in rodent astrocytes (Karki et al., 2014).

Suberoylanilide hydroxamic acid (SAHA), an FDA-approved drug, also induced EAAT2 promoter activity in rodent astrocytes (Karki et al., 2014).

Among the cyclic tetrapeptides and depsipeptides, romidepsin has been shown to induce EAAT2 promoter activity (Karki et al., 2014). MC1568, a class II HDAC inhibitor, was reported to upregulate the expression of EAAT2 *in vitro* and also in the spinal cord of SOD1-G93A mice, a rodent model of ALS (Lapucci et al., 2017).

### POTENTIAL USE OF CRISPR/CAS9 FOR EAAT2 GENE EXPRESSION

With the discovery of several genome editing technologies, such as zinc-finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and the clustered regularly interspaced short palindromic repeat (CRISPR)/Cas9 system (Datta et al., 2016; Khalili et al., 2017), it is possible not only to edit genes but also to activate genes (Hilton et al., 2015) that are epigenetically repressed. Since epigenetic modifying enzymes can be fused to the catalytically inactive dCas9 (D10A mutation in the RuvC and H840A in the HNH nuclease domain), it is possible to target specific gene promoters using guide RNAs (Tadić et al., 2019) and thereby prevent the off-target effects of either overexpression or knockdown of epigenetic modifying enzymes. In the context of EAAT2 gene activation, two CRISPR tools can be used: dCas9 fused to (a) the histone acetyltransferase p300 activation domain (dCas9-p300) (Hilton et al., 2015), or (b) the DNA demethylase catalytic domain from the TET family (Xu et al., 2016). In the former scenario, the recruitment of dCas9-p300 by guide RNAs can result in histone acetylation mediated EAAT2 gene transcription (**Figure 3A**), and in the latter situation, DNA hypermethylation of the EAAT2 gene CpG islands can potentially be reversed by targeted demethylation of cytosine residues using the Tet catalytic domain (Tet-CD) and guide RNA (**Figure 3B**). This strategy can be accomplished *in vivo* since several viral vectors, including adeno-associated virus, lentivirus, and adenovirus (**Figure 3C**), have been employed for delivery of Cas9 and gRNAs (Gori et al., 2015; Mout et al., 2017).
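As a rough illustration of how candidate guide RNA target sites might be located in a promoter sequence, the sketch below scans for 20-nt protospacers that are immediately followed by an NGG protospacer adjacent motif (PAM). This assumes the *Streptococcus pyogenes* Cas9 PAM requirement, which the text does not specify, and it scans the forward strand only. The function name and the sequence are hypothetical; the sequence is not the EAAT2 promoter, and no off-target or efficiency scoring is attempted.

```python
import re

# Minimal sketch: list candidate protospacers followed by an NGG PAM.
# Assumptions: SpCas9-style NGG PAM, forward strand only, toy sequence.

def find_protospacers(sequence, length=20):
    """Return (start, protospacer, pam) tuples for each NGG-adjacent site."""
    sequence = sequence.upper()
    # A lookahead is used so that overlapping candidate sites are reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % length)
    return [(m.start(), m.group(1), m.group(2))
            for m in pattern.finditer(sequence)]


toy_promoter = "ATGCGCTAGCTAGCATCGATCATGGATAT"  # hypothetical sequence
for start, protospacer, pam in find_protospacers(toy_promoter):
    print(start, protospacer, pam)
```

In practice, candidate sites on both strands would be considered and then filtered for specificity before being paired with a dCas9-p300 or dCas9-Tet-CD fusion.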

### CONCLUSIONS AND FUTURE DIRECTIONS

A large body of evidence demonstrates the involvement of epigenetic mechanisms, including DNA methylation and histone modification at the pre-transcriptional level and miRNAs at the post-transcriptional level, in the dysregulation of EAAT2 expression in numerous neurodegenerative diseases. These mechanisms may also act in concert while regulating EAAT2 expression. The involvement of other epigenetic features, including post-translational histone modifications such as acetylation, methylation, and phosphorylation, in the regulation of EAAT2/GLT1 promoter activation in astrocytes remains to be investigated in future studies. With the development of new epigenetic drugs with increased sensitivity and specificity and decreased toxicity, it might be possible to upregulate EAAT2 expression in neurological disorders, depending on the epigenetic modification that is involved in the repression of EAAT2 expression. However, it is likely that, in addition to gene-specific modulation, genome-wide reactivation or inactivation of genes at random can have potentially deleterious effects. The proposed CRISPR/Cas9 mediated EAAT2 gene regulation can, therefore, be employed in animal models to mitigate glutamate-mediated excitotoxicity.

### AUTHOR CONTRIBUTIONS

MAA contributed to writing the initial draft of the manuscript and illustrations. PD contributed to writing the review and editing.

### FUNDING

PD was supported by the National Institutes of Health through grants from the National Institute on Drug Abuse, 5R01DA033213, and in part by 5P01DA037830-05.

### REFERENCES

neurodegeneration: opportunities for developing novel therapeutics. *J. Cell. Physiol.* 226, 2484–2493. doi: 10.1002/jcp.22609


**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Alam and Datta. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*