# ENDOGENOUS VIRAL ELEMENTS – LINKS BETWEEN AUTOIMMUNITY AND CANCER?

EDITED BY : Martin S. Staege and Alexander Emmer PUBLISHED IN : Frontiers in Microbiology

#### Frontiers Copyright Statement

© Copyright 2007-2019 Frontiers Media SA. All rights reserved. All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.

The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.

Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.

Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.

As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.

All copyright, and all rights therein, are protected by national and international copyright laws.

The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use. ISSN 1664-8714 ISBN 978-2-88945-763-2 DOI 10.3389/978-2-88945-763-2

Topic Editors:

Martin S. Staege, Martin Luther University of Halle-Wittenberg, Germany Alexander Emmer, Martin Luther University of Halle-Wittenberg, Germany

Eukaryotic genomes are seasoned with a wealth of sequences that can be addressed as endogenous viral elements. The cover shows the overlay of a microscopic image of human chromosomes and a red/green pattern symbolizing the penetration of endogenous viral elements into the genome. © Martin S. Staege, 2018. The overlay was generated with GIMP version 2.10.8.

In this eBook, original and review papers on various aspects of endogenous viral elements (EVEs) are included. EVEs are integral parts of the genomes of eukaryotic organisms and are involved in various physiological and pathological processes. The focus of this eBook is on the involvement of EVEs in cancer and autoimmune diseases.

In particular, research on endogenous retroviruses and endogenous bornaviruses is included. The presented data demonstrate that EVEs are fascinating objects that are still worth exploring.

Citation: Staege, M. S., Emmer, A., eds. (2019). Endogenous Viral Elements – Links Between Autoimmunity and Cancer?. Lausanne: Frontiers Media. doi: 10.3389/978-2-88945-763-2

# Table of Contents

*Editorial: Endogenous Viral Elements – Links Between Autoimmunity and Cancer?*

Martin S. Staege and Alexander Emmer

## CHAPTER 1

## TRANSCRIPTION OF EVEs IN HEALTH AND DISEASE


## CHAPTER 2

## EVEs AND AUTOIMMUNITY


Christine Brütting, Harini Narasimhan, Frank Hoffmann, Malte E. Kornhuber, Martin S. Staege and Alexander Emmer

*HERV Envelope Proteins: Physiological Role and Pathogenic Potential in Cancer and Autoimmunity*

Nicole Grandi and Enzo Tramontano

## CHAPTER 3

## EVEs AND CANCER


Maria Giebler, Martin S. Staege, Sindy Blauschmidt, Lea I. Ohm, Matthias Kraus, Peter Würl, Helge Taubert and Thomas Greither

*Human Endogenous Retrovirus K in the Crosstalk Between Cancer Cells Microenvironment and Plasticity: A New Perspective for Combination Therapy*

Emanuela Balestrieri, Ayele Argaw-Denboba, Alessandra Gambacurta, Chiara Cipriani, Roberto Bei, Annalucia Serafino, Paola Sinibaldi-Vallebona and Claudia Matteucci

*APOBEC3B Activity is Prevalent in Urothelial Carcinoma Cells and Only Slightly Affected by LINE-1 Expression*

Ananda Ayyappan Jaguva Vasudevan, Ulrike Kreimer, Wolfgang A. Schulz, Aikaterini Krikoni, Gerald G. Schumann, Dieter Häussinger, Carsten Münk and Wolfgang Goering

*Murine Endogenous Retroviruses are Detectable in Patient-Derived Xenografts but not in Patient-Individual Cell Lines of Human Colorectal Cancer*

Stephanie Bock, Christina S. Mullins, Ernst Klar, Philippe Pérot, Claudia Maletzki and Michael Linnebacher

# Editorial: Endogenous Viral Elements—Links Between Autoimmunity and Cancer?

Martin S. Staege<sup>1</sup>\* and Alexander Emmer<sup>2</sup>

*<sup>1</sup> Department for Operative and Non-operative Pediatric and Adolescent Medicine, Martin Luther University Halle-Wittenberg, Halle, Germany, <sup>2</sup> University Clinic and Outpatient Clinic for Neurology, Martin Luther University Halle-Wittenberg, Halle, Germany*

Keywords: endogenous retroviruses, endogenous bornaviruses, envelope proteins, patient derived xenografts, multiple sclerosis, gene expression, solid tumors, hematopoietic neoplasia

**Editorial on the Research Topic**

### **Endogenous Viral Elements—Links Between Autoimmunity and Cancer?**

The association between cancer and autoimmune diseases has been known for a long time (Benvenuto et al., 2017). With the exception of some virus-induced tumors, cancer antigens are synthesized based on information that is present in the normal genome of the patient. Therefore, these antigens are usually highly similar or even identical to self-antigens. This fact may explain why antibodies in cancer patients often have a similar epitope spectrum compared to antibodies in autoimmune patients. In this regard, one can consider para-neoplastic autoimmunity as simple cross-reactivity between tumor cells and normal cells. In the present era of checkpoint-inhibition therapy for cancer, this phenomenon is obviously of great clinical importance. On the other hand, patients with some autoimmune diseases have an increased risk of developing cancer (Cristaldi et al., 2011). Hematopoietic neoplasia may be a consequence of prolonged immune cell stimulation by non-clearable auto-antigens. The interpretation of solid tumors emerging in patients with autoimmune disorders is probably more complicated. In this case, the never-ending stimulation of the immune system may force the development of regulatory cells that accidentally suppress immunocompetent cells with anti-cancer activity.

Interestingly, a common feature of cancer and autoimmune diseases is the altered expression of endogenous viral elements (EVEs). EVEs can be classified based on the exogenous viruses that are their most likely nearest relatives. In mammals, endogenous retroviruses (ERVs) represent the largest EVE family, and comprise nearly 10% of the human genome. Based on sequence similarities, this family can further be subdivided into several clades. Most ERV loci are transcriptionally inactive, but transcription has been observed under varying pathological conditions. In particular, ERV-encoded envelope proteins are considered as pathogenetic factors (Grandi and Tramontano; Gröger and Cynis).

The presence of a substantial number of EVEs in the genomes of virtually all higher eukaryotes strongly suggests a physiological function for such elements. Indeed, host organisms employ some of these elements for important tasks, e.g., placenta development, regulation of gene expression, or defense against exogenous pathogens. The liaison between the host and EVEs can approach true symbiosis, as in the example of endogenous viruses in some parasitic wasps (Federici and Bigot, 2003).

This Frontiers Research Topic compiles current aspects of EVE research, paying particular attention to ERVs and bornaviruses, two EVE families that are important in the context of human diseases. Like ERVs, endogenous bornavirus-like elements (EBLs) have been found in several vertebrates including humans (Fujino et al., 2014). The presence of these elements in cellular DNA is surprising because bornaviruses have no reverse transcriptase and no known DNA stage in their life cycle. However, integration of bornavirus-derived DNA into host genomes during natural infection seems to be a widespread event that is mediated by transposable elements from the host cells (Horie et al., 2010). Interestingly, EBLs have been shown to inhibit replication of exogenous bornaviruses in ground squirrels, pointing to a probable function of EBLs (Horie et al., 2013). On the other hand, alterations in EBLs may be involved in cancer formation (Honda).

#### Edited by:

*Gkikas Magiorkinis, National and Kapodistrian University of Athens, Greece*

#### Reviewed by:

*Tara Patricia Hurst, Abcam, United Kingdom*

#### \*Correspondence:

*Martin S. Staege martin.staege@uk-halle.de*

#### Specialty section:

*This article was submitted to Virology, a section of the journal Frontiers in Microbiology*

Received: *30 October 2018* Accepted: *07 December 2018* Published: *20 December 2018*

#### Citation:

*Staege MS and Emmer A (2018) Editorial: Endogenous Viral Elements—Links Between Autoimmunity and Cancer? Front. Microbiol. 9:3171. doi: 10.3389/fmicb.2018.03171*

The exact number and chromosomal locations of EVEs vary even between closely related species. For instance, ERV3 is present in most but not all Old World monkeys (Bustamante-Rivera et al.). The species specificity of EVEs offers an appealing explanation for why some human diseases (e.g., certain cancers like Ewing sarcoma or Hodgkin lymphoma) do not occur (to our knowledge) or only occasionally occur in the rest of the animal kingdom. ERV3 is one of the best-studied human EVEs, but its function in health and disease is not clearly understood (Bustamante-Rivera et al.).

For most EVE/cancer associations, it remains unclear whether the expression of these EVEs is only a consequence of deregulated gene expression in cancer cells, or whether EVEs are causally involved in cancer development. Not all EVEs are over-expressed in cancer cells in comparison to normal cells. Moreover, a single EVE can behave differently in different tumor models. Tumor cells proliferate by bypassing the control mechanisms that otherwise allow cell division only when physiologically necessary. In vitro, quasi-physiological cell proliferation can be studied in lymphocytes, which can be activated by varying stimuli mimicking the recognition of cognate antigens. A comparison between such activated B lymphocytes and neoplastic B cells indicates that different sets of ERV elements are transcribed in these two cell types (Attig et al.). Obviously, proliferation alone is not sufficient to induce the expression of a complete set of ERV elements. Tumor cells and their normal counterparts have more differences than just the lack of proliferation control. Another feature of cancer cells is their altered differentiation capacity. Germ cell tumors may be an interesting model for studying the interaction between ERVs and the cellular differentiation status (Mueller et al.). In this model, ERV expression levels were inversely correlated with differentiation status. Such correlations between differentiation capacity and ERV expression may also be responsible for the shortened progression-free survival of sarcoma patients with high ERV expression (Giebler et al.). Differentiation potential and self-renewal are the two hallmarks that define stem cells. In this regard, the impact of ERV expression on stemness in normal and malignant cells is remarkable, and has been studied in melanoma (Balestrieri et al.). The malignant phenotype of these cells seems to be at least partially dependent on ERV expression. Similar pro-oncogenic effects have been observed for other repetitive elements, e.g., L1. Knockdown of L1 in urothelial carcinoma cells decreases their proliferation (Vasudevan et al.).

Cells have developed restriction systems that inhibit the propagation of exogenous viruses as well as of EVEs. One important group of factors involved in this restriction process are the cytidine deaminases of the apolipoprotein B mRNA editing enzyme family (APOBEC). As a side effect, these enzymes can also cause mutations in the cellular genome. The increased activity of APOBEC members after EVE re-activation might therefore be a mechanism by which oncogenic mutations are generated, and this effect might be one factor explaining the association between EVEs and cancer. As a proof of principle, the association between L1 and APOBEC activity was investigated in urothelial carcinoma (Vasudevan et al.). In this model, L1 only showed a small impact on APOBEC activity. Whether other EVEs might exert APOBEC-dependent mutagenic effects should be investigated in the future.

An interesting aspect of so-called patient-derived xenografts (PDXs) is their frequent contamination with murine ERVs (Bock et al.). Today, we observe an unintelligible hype over PDX models, and one can get the impression that PDXs represent an absolutely important innovation that drastically advances research. The scientific community could probably use a reminder that passaging of tumor cells in immunocompromised animals was invented several decades ago as the only way of maintaining these cells outside of the patient. With the development of efficient cell-culture techniques, the necessity of passaging in live animals has principally been overcome, and the re-invention of animal passaging under the new name "PDX" does not seem well-founded (at least in a large proportion of applications). It is well-known that animal passages alter the biological behavior of cells (Sanford et al., 1959). Which roles ERVs play in this process requires further elucidation.

Gene expression in cancer cells, as seen by researchers, is an end product of a long-lasting co-evolution of the cancer cell population and the host organism's counter-strike mechanisms. Therefore, active EVE transcription in cancer cells can also be a consequence of activated defense mechanisms. As mentioned above, APOBEC activation is one mechanism by which cells restrict exogenous virus infections. EVEs may be a reservoir of endogenous activators in cases where exogenous viruses fail to activate these defense mechanisms (Bannert et al.). Such "virus mimicry" (Bannert et al.) could be considered as part of an "SOS response" (Bustamante-Rivera et al.) that is activated in cells if they cannot respond purposefully to a given condition. This lack of a coordinated cellular reaction is a typical feature not only of cancer but also of autoimmunity. In the autoimmune situation, the immune system cannot eliminate the antigen because this antigen is an integral part of the body. The attacked tissue, on the other hand, cannot power down the attack because the aggressor is also an integral part of the body.

If synthesized at the wrong time, EVE-derived proteins can be harmful for the host organisms. It seems that such proteins, especially ERV envelope proteins, can induce unwanted immune responses and tissue damage (Grandi and Tramontano; Gröger and Cynis). For instance, such effects have been observed for the human syncytin ERVW1. ERVW1 and other ERV envelope proteins are considered to be pathogenetic factors in autoimmune diseases including multiple sclerosis. Gene expression analyses in a neuroblastoma model of CoCl<sub>2</sub>-simulated hypoxia identified another ERV envelope locus, ERV-FRD1, as a possible candidate for cytopathic envelope proteins in the context of neuronal diseases (Brütting et al.).

In addition to EVEs sensu stricto, eukaryotic genomes contain high numbers of EVE-like elements that have no similarity to known exogenous viruses or that have lost most of their genetic information in the course of evolution. The distinction between these classes is often blurred and sometimes a matter of opinion. ERVs are usually classified as repetitive elements, while the low copy numbers of other EVE classes prevented their inclusion in the corresponding databases. Taking into account the large number of different EVEs (ERVs) and related elements in eukaryotic genomes, the comprehensive analysis of these elements is not easy. The co-expression of EVEs and neighboring genes (Mueller et al.) may offer a new approach for the characterization of disease-associated EVE loci (Kruse et al.; Brütting et al.). Genome coordinates from gene expression experiments or other sources can be mapped to EVE coordinates in order to predict transcriptionally active loci. Using a new web tool that implements this approach, a new transcription start site was identified in the Hodgkin lymphoma-associated cytochrome 4 family Z member 1 gene (Kruse et al.).
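The coordinate-mapping idea mentioned above can be illustrated with a minimal interval-overlap sketch. It is not the web tool described by Kruse et al.; the function names, the locus names, and all coordinates below are hypothetical and serve only to show the principle of assigning expression signals to EVE loci by genomic overlap.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """Return True if two closed intervals share at least one position."""
    return a_start <= b_end and b_start <= a_end


def map_to_eve_loci(signals, eve_loci):
    """Assign each signal interval (chrom, start, end) to every EVE locus
    it overlaps on the same chromosome. Loci without any overlapping
    signal are not reported."""
    hits = {}
    for chrom, start, end in signals:
        for locus in eve_loci:
            if locus["chrom"] == chrom and overlaps(
                    start, end, locus["start"], locus["end"]):
                hits.setdefault(locus["name"], []).append((chrom, start, end))
    return hits


# Hypothetical EVE annotation and hypothetical expression signals.
eve_loci = [
    {"name": "EVE_a", "chrom": "chr1", "start": 1000, "end": 2000},
    {"name": "EVE_b", "chrom": "chr2", "start": 1000, "end": 2000},
]
signals = [("chr1", 1900, 2100), ("chr1", 3000, 3100), ("chr2", 500, 999)]

active = map_to_eve_loci(signals, eve_loci)
```

A linear scan is sufficient for a sketch; for genome-scale annotations a sorted index or interval tree would likely be preferable.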

Taken together, several EVEs are integral parts of the genomes of virtually all eukaryotic species. Clarifying the physiological and pathophysiological functions of these elements, as well as investigating the mechanisms that lead to altered EVE expression in disease contexts, may identify new targets for the treatment of conditions like cancer or autoimmune diseases.

## AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

## ACKNOWLEDGMENTS

We thank Malte E. Kornhuber and Michelle Ng for critical reading of the manuscript. We thank all authors of this Frontiers Research Topic for their valuable contributions.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Staege and Emmer. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Physiological and Pathological Transcriptional Activation of Endogenous Retroelements Assessed by RNA-Sequencing of B Lymphocytes

Jan Attig<sup>1</sup>†, George R. Young<sup>2</sup>†, Jonathan P. Stoye<sup>2,3</sup> and George Kassiotis<sup>1,3</sup>\*

#### Edited by:

Martin Sebastian Staege, Martin Luther University of Halle-Wittenberg, Germany

#### Reviewed by:

Tara Patricia Hurst, Abcam, United Kingdom Yukihito Ishizaka, National Center for Global Health and Medicine, Japan

#### \*Correspondence:

George Kassiotis george.kassiotis@crick.ac.uk

†These authors have contributed equally to this work.

#### Specialty section:

This article was submitted to Virology, a section of the journal Frontiers in Microbiology

Received: 29 September 2017 Accepted: 29 November 2017 Published: 12 December 2017

#### Citation:

Attig J, Young GR, Stoye JP and Kassiotis G (2017) Physiological and Pathological Transcriptional Activation of Endogenous Retroelements Assessed by RNA-Sequencing of B Lymphocytes. Front. Microbiol. 8:2489. doi: 10.3389/fmicb.2017.02489

<sup>1</sup> Retroviral Immunology, The Francis Crick Institute, London, United Kingdom, <sup>2</sup> Retrovirus-Host Interactions, The Francis Crick Institute, London, United Kingdom, <sup>3</sup> Department of Medicine, Faculty of Medicine, Imperial College London, London, United Kingdom

In addition to evolutionarily-accrued sequence mutation or deletion, endogenous retroelements (EREs) in eukaryotic genomes are subject to epigenetic silencing, preventing or reducing their transcription, particularly in the germplasm. Nevertheless, transcriptional activation of EREs, including endogenous retroviruses (ERVs) and long interspersed nuclear elements (LINEs), is observed in somatic cells, variably upon cellular differentiation and frequently upon cellular transformation. ERE transcription is modulated during physiological and pathological immune cell activation, as well as in immune cell cancers. However, our understanding of the potential consequences of such modulation remains incomplete, partly due to the relative scarcity of information regarding genome-wide ERE transcriptional patterns in immune cells. Here, we describe a methodology that allows probing RNA-sequencing (RNA-seq) data for genome-wide expression of EREs in murine and human cells. Our analysis of B cells reveals that their transcriptional response during immune activation is dominated by induction of gene transcription, and that EREs respond to a much lesser extent. The transcriptional activity of the majority of EREs is either unaffected or reduced by B cell activation both in mice and humans, albeit LINEs appear considerably more responsive in the latter host. Nevertheless, a small number of highly distinct ERVs are strongly and consistently induced during B cell activation. Importantly, this pattern contrasts starkly with B cell transformation, which exhibits widespread induction of EREs, including ERVs that minimally overlap with those responsive to immune stimulation. The distinctive patterns of ERE induction suggest different underlying mechanisms and will help separate physiological from pathological expression.

Keywords: endogenous retroviruses, endogenous retroelements, transcription, genetic, B lymphocyte activation, B cell lymphoma, autoimmunity, cancer

## INTRODUCTION

Vertebrate genomes contain a considerable number of endogenous retroelements (EREs) with various degrees of open reading frame (ORF) integrity and replication autonomy. Occupying approximately a fifth of the mouse and human genomes, long interspersed nuclear elements (LINEs) are the largest group of EREs (Lander et al., 2001; Mouse Genome Sequencing Consortium et al., 2002). LINEs are still capable of autonomous retrotransposition in both host species. They also provide the reverse-transcriptase (RT) activity and retrotransposition machinery for mobilization of other EREs that lack long terminal repeats (LTRs), collectively known as non-LTR elements, and occasionally also of processed RNAs from cellular genes (Burns and Boeke, 2012). Distinguished by the presence of LTRs flanking the proviral genome, endogenous retroviruses (ERVs) and mammalian apparent LTR-retrotransposons (MaLRs) together comprise approximately 9.8 and 8.5% of mouse and human genomes, respectively (Lander et al., 2001; Mouse Genome Sequencing Consortium et al., 2002). ERVs may still possess and express ORFs encoding functional RT, which is necessary for the replication of non-autonomous LTR elements, such as MaLRs (McCarthy and McDonald, 2004; Mager and Stoye, 2015). However, only a few distinct ERVs are still replication-competent in mice (McCarthy and McDonald, 2004; Mager and Stoye, 2015), and ERV replication has not been demonstrated to date in humans (Kassiotis and Stoye, 2017).

In addition to loss of replication competence as a result of sequence mutation or deletion sustained over long evolutionary periods, EREs are subject to epigenetic silencing preventing or reducing their transcription, which may otherwise produce nucleic acid and protein products with significant effects on host physiology and pathology (Kassiotis and Stoye, 2016). Epigenetic silencing of EREs is particularly potent in the germplasm, but is thought to be less effective when somatic cells alter their gene expression patterns, as part of the physiological process of their differentiation or response to stimuli or as part of the pathological process of cellular transformation (Slotkin and Martienssen, 2007). Increased ERE expression has frequently been reported as a hallmark of murine and human cancer (Kassiotis and Stoye, 2017). However, ERE induction is also characteristic of the physiological lymphocyte response to stimulation. For example, the transcriptional induction of certain groups of endogenous murine leukaemia viruses (MLVs) upon lipopolysaccharide (LPS) stimulation of murine B cells was well documented over three decades ago and has been linked to B cell differentiation (Stoye and Moroni, 1983). Moreover, transcriptional induction of EREs was also described in B cells from multiple sclerosis (MS) patients, which were found to express elevated surface levels of ERV envelope glycoproteins (Brudek et al., 2009).

Thus, this transcriptional regulation of EREs in B cells or other hematopoietic cells may influence immune function, and both beneficial and detrimental effects have been proposed (Kassiotis and Stoye, 2016). However, what remains an open question is the degree of specificity of ERE induction during physiological or pathological conditions. Understanding the degree of overlap between those EREs that are induced as part of the normal processes of cellular activation and differentiation and those that signify cellular transformation or other pathological conditions requires detailed knowledge of ERE transcriptional patterns on a genome-wide scale, which is currently lacking.

Previous analyses of ERE expression have frequently employed PCR-based assays or microarrays, which rarely afforded element-specific or genome-wide resolution. For instance, although there are hundreds or thousands of EREs represented on commercial microarray probesets, these amount to only 0.25 and 0.04% of all genomic LTR elements and non-LTR elements, respectively (Young et al., 2014). The recent advent of RNA-sequencing (RNA-seq) techniques and the increasing availability of public RNA-seq datasets now provide the opportunity to study genome-wide ERE transcriptional regulation under a range of physiological or pathological conditions (Haase et al., 2015; Sokol et al., 2016). Here, we have analyzed ERE modulation in RNA-seq data from murine and human B cells, covering physiological B cell responses to in vitro and in vivo stimulation, as well as chronic diseases, including B cell lymphoma. Our results reveal distinct patterns of limited ERE induction during B cell activation, contrasting with widespread ERE upregulation during B cell transformation, which indicates different underlying mechanisms.

## MATERIALS AND METHODS

## Repeat Region Annotation

The precise annotation of repetitive regions is central to the accurate assessment of their activities. Until recently, this has relied upon the use of manually curated consensus sequences (Bao et al., 2015) with BLASTn-based search methods to define regions of interest. In place of these flattened representations, hidden Markov models (HMMs) can now also be used to represent repeat families, better representing the full range and variability of their sequence space (Hubley et al., 2016). This profile-based masking improves both accuracy and sensitivity, and annotates an additional 5.5 and 5.1% of the mouse and human genomes, respectively (Hubley et al., 2016). Using this method, the mouse and human genomes (GRCm38.78 and GRCh38.78, respectively) were masked using RepeatMasker<sup>1</sup> configured with nhmmer (Wheeler and Eddy, 2013) in sensitive mode using the Dfam 2.0 library (v150923). RepeatMasker annotates LTR and internal regions separately, complicating the summation of reads spanning these divides. Tabular outputs were, therefore, parsed to merge adjacent annotations for the same element and to produce gene transfer format (GTF) files compatible with popular read-counting programs. GTF files for both genomes are freely available upon request.
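The merging step described above can be sketched as follows. This is a minimal illustration and not the authors' parser: the `Annotation` record, the `element_id` key used to link pieces of one element, the `max_gap` parameter, the `~` name separator, and the choice of `exon` as the GTF feature type are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Annotation:
    chrom: str
    start: int            # 1-based, inclusive
    end: int              # 1-based, inclusive
    strand: str
    rep_class: str        # e.g. "LTR/ERV1"
    names: list = field(default_factory=list)
    element_id: str = ""  # hypothetical key linking pieces of one element


def merge_adjacent(annotations, max_gap=0):
    """Merge pieces that share chrom, strand and element_id and lie
    within max_gap bases of the previous piece."""
    merged = []
    ordered = sorted(annotations,
                     key=lambda a: (a.chrom, a.strand, a.element_id, a.start))
    for a in ordered:
        last = merged[-1] if merged else None
        if (last is not None
                and (last.chrom, last.strand, last.element_id)
                == (a.chrom, a.strand, a.element_id)
                and a.start - last.end - 1 <= max_gap):
            last.end = max(last.end, a.end)
            for n in a.names:
                if n not in last.names:
                    last.names.append(n)
        else:
            merged.append(Annotation(a.chrom, a.start, a.end, a.strand,
                                     a.rep_class, list(a.names), a.element_id))
    return merged


def to_gtf_line(a, source="RepeatMasker"):
    """Format one merged element as a nine-column GTF record."""
    feature_id = "|".join([a.rep_class, "~".join(a.names), a.chrom,
                           str(a.start), str(a.end)])
    attributes = f'gene_id "{feature_id}"; transcript_id "{feature_id}";'
    return "\t".join([a.chrom, source, "exon", str(a.start), str(a.end),
                      ".", a.strand, ".", attributes])


# Hypothetical provirus annotated as LTR, internal region, LTR.
pieces = [
    Annotation("1", 100, 200, "+", "LTR/ERV1", ["RLTR4_Mm"], "e1"),
    Annotation("1", 201, 900, "+", "LTR/ERV1", ["MuLV-int"], "e1"),
    Annotation("1", 901, 1000, "+", "LTR/ERV1", ["RLTR4_Mm"], "e1"),
    Annotation("1", 5000, 5100, "+", "LTR/ERV1", ["RLTR4_Mm"], "e2"),
]
elements = merge_adjacent(pieces)
gtf_lines = [to_gtf_line(e) for e in elements]
```

Merging the LTR and internal pieces into one record means a read spanning the LTR/internal boundary is counted once for the whole element instead of being split or discarded as ambiguous.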

## Read Mapping and Counting

<sup>1</sup> repeatmasker.org

The expression data used in this study have been previously described and are publicly available. Ethical review, experimental and methodological details relating to study design and data acquisition can be found in the original reports. The following accessions were used: E-MTAB-2499; GSE61608; GSE60927; GSE68769; GSE65422; GSE60424; GSE72420 and GSE62241, which are a mixture of single-end and paired-end Illumina RNA-seq reads. Adapter contamination, assessed with FastQC<sup>2</sup>, was removed using Trimmomatic (Bolger et al., 2014), with additional quality trimming (Q20) and subsequent length filtering (both reads of a pair ≥ 35 nt). The resulting read pairs were aligned with HISAT2 (Kim et al., 2015) and primary mappings counted with featureCounts (Subread, Liao et al., 2014) using standard GTFs for annotated genes and the curated RepeatMasker GTFs for repeat regions. For accuracy and to prevent ambiguity, only reads that could be uniquely assigned to a single feature were counted. This may underestimate total expression in certain situations, but ensures confident count allocation to individual features. Features with no assigned reads across all samples within an experiment were discarded. Those remaining were normalized to account for variable sequencing depth between samples using DESeq2 (Love et al., 2014). In comparison to normalization to transcripts per million (TPM), for example, normalized read counts do not facilitate comparison of individual feature expression levels between experiments, but are nevertheless preferable for the assessment of repetitive element expression. Methods normalizing expression to TPM or reads per kilobase million (RPKM) require accurate knowledge of transcript lengths, which cannot simply be determined for repetitive elements and are, in fact, often variable between treatments and systems. Normalized read counts were subsequently imported into Qlucore Omics Explorer (Qlucore, Lund, Sweden) for all downstream analysis and visualization. This included all statistical comparisons, calculation of fold changes in transcript abundance, computation of Z-scores (the number of standard deviations from the mean of each variable for each data point), and plotting either Z-scores or log<sub>2</sub> fold changes in heat-map form.
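The normalization and scoring steps described above can be illustrated with a minimal sketch. It assumes the median-of-ratios size-factor estimate that DESeq2 applies by default, and it computes Z-scores and log<sub>2</sub> fold changes in plain Python rather than in Qlucore Omics Explorer, which was the tool actually used. The function names, the pseudocount, the choice of the sample standard deviation and the toy counts are illustrative assumptions, not part of the original pipeline.

```python
import math
import statistics


def size_factors(counts):
    """Median-of-ratios size factors, one per sample.

    counts maps a feature name to its raw read counts across samples.
    Features with a zero count in any sample are skipped, because their
    geometric mean is zero.
    """
    n_samples = len(next(iter(counts.values())))
    log_ratios = [[] for _ in range(n_samples)]
    for row in counts.values():
        if min(row) == 0:
            continue
        log_geo_mean = sum(math.log(c) for c in row) / n_samples
        for j, c in enumerate(row):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    return [math.exp(statistics.median(r)) for r in log_ratios]


def normalize(counts):
    """Drop features without any reads, then divide by the size factors."""
    kept = {f: row for f, row in counts.items() if sum(row) > 0}
    factors = size_factors(kept)
    return {f: [c / s for c, s in zip(row, factors)] for f, row in kept.items()}


def z_scores(values):
    """Standard deviations from the feature mean, for each sample."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (assumption)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mean) / sd for v in values]


def log2_fold_change(treated, control, pseudocount=1.0):
    """log2 ratio of group means; the pseudocount (an assumption) avoids log(0)."""
    return math.log2((statistics.mean(treated) + pseudocount)
                     / (statistics.mean(control) + pseudocount))


# Toy counts for three samples (hypothetical values, not study data).
raw = {"geneA": [10, 20, 40], "geneB": [5, 10, 20],
       "ERE_1": [2, 8, 8], "ERE_silent": [0, 0, 0]}
norm = normalize(raw)
ere_z = z_scores(norm["ERE_1"])
```

Because size factors are estimated within one experiment, counts normalized this way are comparable between samples of that experiment only, which matches the caveat stated in the text.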

## RESULTS

## ERE Modulation during Murine B Cell Activation

Induction of endogenous MLVs in LPS-stimulated B cells provided one of the earliest examples of ERE modulation upon immune cell activation (Stoye and Moroni, 1983). We therefore focused on murine B cells to examine the transcriptional response of LTR and non-LTR elements to B cell stimulation. To this end, we analyzed RNA-seq data (E-MTAB-2499) from mature B cells, isolated from the spleens of C57BL/6 (B6) mice and stimulated in vitro with LPS, α-IgM antibodies, or a combination of CD40 ligand (CD40L) and IL-4 (Hartweger et al., 2014).

As expected, analysis of this dataset highlighted a strong modulation of a great number of non-viral gene transcripts, with just over half of responding genes (53.6%) upregulated upon stimulation, relative to unstimulated cells (**Figure 1A**). In contrast, under the same conditions of stimulation, the majority of LTR element and LINE transcripts (85.8 and 89.3%, respectively) were proportionally downregulated (**Figure 1A**). This apparent reduction in ERE transcription is likely the effect of the increase in overall gene transcription in response to stimulation. Closer inspection of the top 31 LTR elements that were induced in these B cells upon stimulation revealed several different groups of LTR elements (**Figure 1B**). Notable, however, was the over-representation of xenotropic endogenous MLVs (Supplementary Table 1). These included two closely integrated proviruses on chromosome 1, Xmv41 (LTR/ERV1|MuLV-int∼RLTR4\_Mm|1|171481146|171489815) and Xmv43 (LTR/ERV1|MuLV-int∼RLTR4\_Mm|1|170941521|170950177), which have been previously shown to be LPS-responsive (Young et al., 2014), as well as a previously unlocalized provirus, Xmv45 on chromosome 5 (LTR/ERV1|MuLV-int∼RLTR4\_Mm|5|23700579|23709245) (Supplementary Table 1). Four additional xenotropic proviruses, previously uncharacterized due to their location on the Y chromosome (Frankel et al., 1989), were also significantly upregulated (**Figure 1B** and Supplementary Table 1). However, these were highly homologous with Xmv41, Xmv43 and Xmv45 (Supplementary Figure 1), making it difficult to discern whether the Y-linked proviruses are genuinely expressed or whether they report expression of Xmv41, Xmv43 or Xmv45.

As an independent confirmation of the observed pattern of LTR element transcriptional activation, we analyzed a second set of RNA-seq data (GSE61608) from mature B cells, isolated from spleens of B6 mice and stimulated in vitro with LPS or α-IgM antibodies (Fowler et al., 2015). Again, 12 out of the top 31 LTR elements identified in the previous set were also significantly induced at the earlier time-point of 2 h in this set and, notably, these included Xmv45 (**Figure 2A**). Furthermore, Xmv45 was also significantly induced in a third set of RNA-seq data (GSE60927) by longer in vitro stimulation of B6-derived B cells with LPS or a combination of CD40L, IL-4 and IL-5 (Shi et al., 2015) (**Figure 2B**). More importantly, Xmv45 also seems to be transcriptionally induced in vivo, as splenic plasma cells assessed directly ex vivo, which represent a state of recent B cell activation, showed elevated Xmv45 transcription and clustered closely with in vitro LPS-stimulated B cells (**Figure 2B**). Lastly, further support for transcriptional induction of Xmv45 in vivo was provided by analysis of RNA-seq data (GSE68769) from B cells isolated from the lymph nodes of mice responding to Influenza A virus vaccination. Indeed, Xmv45 was one of the few LTR elements that were significantly induced over the course of vaccination, despite the fact that Influenza A-specific B cells should constitute only a small fraction of the total lymph node B cells that were analyzed (**Figure 2C**).

<sup>2</sup> www.bioinformatics.babraham.ac.uk/projects/fastqc

We next explored whether the consistency with which Xmv45 transcriptional induction was detected in multiple datasets was explained by the degree of this induction. Indeed, transcription of Xmv45 in stimulated B cells was much higher than any other induced LTR element (**Figure 2D**) and dwarfed transcription of other MLV proviruses that are either weakly (Emv2) or strongly (Xmv41 and Xmv43) inducible by LPS stimulation (**Figure 2E**; please note that Xmv45 expression is plotted on a scale that is 20-times higher than the rest). Together, these data suggest that transcription of a small selection of LTR elements, exemplified by Xmv45, is consistently induced in murine B cells by a multitude of in vitro and in vivo stimuli and validate the capacity of our analysis to detect this induction in multiple datasets.

## ERE Modulation in Murine B Cell Lymphoma

We next explored whether ERE transcriptional modulation as observed during physiological B cell activation overlapped with modulation that may occur following B cell transformation. For this purpose, we compared RNA-seq data (GSE65422) from nontransformed B cells (resting splenic B cells and germinal center B cells analyzed directly ex vivo; and B cells activated in vitro with α-CD40 and α-IgM antibodies) with B cells resembling diffuse large B cell lymphoma (DLBCL) (Zhang et al., 2015). The latter were obtained from mice that develop spontaneous B cell lymphomas as a result of deregulated expression of BCL6 under the immunoglobulin (Ig) heavy chain Iµ promoter and of deregulated activation of the alternative NF-κB pathway by expression of the NF-κB inducing kinase (NIK) under the ROSA26 promoter (Zhang et al., 2015). Both these genetic alterations were restricted to the germinal center lineage by conditional mutagenesis, using the Cγ1-cre transgene (Zhang et al., 2015).

Comparison of α-CD40 and α-IgM in vitro activated B cells with resting B cells in this dataset (**Figure 3A**) revealed a picture comparable with that from the previous dataset (**Figure 1A**), with transcriptional activation favoring gene induction, and to a lesser extent LTR element and LINE transcription. In contrast, LTR elements and LINEs dominated the transcriptional differences between resting B cells and B cell lymphomas, with minimal overlap between B cell lymphomas and activated B cells (**Figure 3B**). Of note, whereas Xmv45 was still the most induced provirus upon in vitro B cell activation, B cell lymphomas were characterized by significantly elevated expression of Emv2 (**Figure 3C**). In fact, expression of Emv2 in lymphomas was ∼50 times higher than that of Xmv45 (**Figure 3C**; please note that Emv2 expression is plotted on a scale that is 10-times higher than that of Xmv45). The elevated ecotropic MLV expression likely reflects restoration of Emv2 infectivity, which has been previously observed in cancer cell lines (Li et al., 1999) and immunodeficient animals (Young et al., 2012). In contrast to Xmv45, Emv2 transcription in immune cells is only weakly inducible by LPS, but strongly inducible by epigenetic derepression through BrdU treatment (Young et al., 2014), suggesting that the primary cause of its upregulation in B cell lymphomas is loss of epigenetic repression, followed by restoration of infectivity.

Consistent with different mechanistic origins of LTR element modulation during B cell activation and B cell transformation, approximately one-third of LTR elements that were transcriptionally induced in B cell lymphomas were also induced either in germinal center B cells or in in vitro activated B cells (in equal proportions between the two), whereas the majority (two-thirds) were unique to B cell lymphomas (**Figure 3D**).
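The overlap analysis summarized above amounts to set operations on the identifiers of significantly induced elements. The following sketch shows one way to partition lymphoma-induced elements into shared and lymphoma-specific subsets. The function name and the element identifiers are hypothetical, and the actual comparison in this study was performed in Qlucore Omics Explorer.

```python
def partition_induced(lymphoma, germinal_center, activated):
    """Split lymphoma-induced elements into shared and lymphoma-specific sets."""
    shared_gc = lymphoma & germinal_center
    shared_activated = lymphoma & activated
    shared = shared_gc | shared_activated
    unique = lymphoma - shared
    return {
        "shared_with_germinal_center": shared_gc,
        "shared_with_activated": shared_activated,
        "unique_to_lymphoma": unique,
        "fraction_unique": len(unique) / len(lymphoma) if lymphoma else 0.0,
    }


# Hypothetical element identifiers, chosen so that two-thirds are unique.
lymphoma = {"LTR_a", "LTR_b", "LTR_c", "LTR_d", "LTR_e", "LTR_f"}
germinal_center = {"LTR_a", "LTR_x"}
activated = {"LTR_b", "LTR_y"}
result = partition_induced(lymphoma, germinal_center, activated)
```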

## ERE Modulation in Human B Cells under Physiological and Pathological Conditions

Given the evolutionary divergence between EREs in different host species, we next asked whether the specificity with which EREs are modulated in murine B cells in distinct conditions also characterized ERE modulation in B cells from a different host, namely the human. We started by investigating gene and ERE transcriptional modulation in RNA-seq data (GSE60424), generated from peripheral blood B cells, isolated from healthy individuals and those with infectious, degenerative or autoimmune diseases, including Sepsis, Amyotrophic Lateral Sclerosis (ALS), Type 1 Diabetes (T1D) and MS (Linsley et al., 2014).

FIGURE 2 | … were identified in each study separately (≥2-fold change; p < 0.05), and the elements shared with in vitro stimulated cells (Figure 1B) are shown. (A) Transcriptional analysis of purified splenic follicular B cells before and after 2-h in vitro stimulation with α-IgM (10 µg/ml) or LPS (25 µg/ml) (GSE61608), depicting the significantly induced LTR elements. (B) Transcriptional analysis of purified splenic follicular B cells before and after in vitro stimulation with LPS for 3 days or a combination of CD40L, IL-4 and IL-5 for 4 days (GSE60927). Also included in the comparison are ex vivo analyzed splenic germinal center B cells, marginal zone B cells and plasma cells. The heat map depicts the significantly induced LTR elements and unsupervised hierarchical clustering of samples according to their expression. (C) Mice were primed by intramuscular injection of inactivated influenza A/New Caledonia/20/99 virus and were boosted with intramuscular injection of seasonal (2006–2007) trivalent inactivated influenza vaccine 30 days later (GSE68769). The figure shows the transcriptional analysis of purified lymph node B cells, pooled from 3 mice for each of the indicated time-points after boost, depicting the significantly induced LTR elements. In (A–C) each column is an independent pool and the underlined element is Xmv45. (D) Normalized counts for the 6 LTR elements with the highest expression in the dataset described in Figure 1A. Symbols represent the mean values of triplicate samples. The underlined element is Xmv45. (E) Normalized counts of the indicated proviruses in the same dataset described in Figure 1A. Each symbol is an independent sample.

As might be expected by its acute and severe nature, Sepsis accounted for the majority of the 2,159 genes that were differentially regulated between the studied conditions, with a smaller, but clearly evident signature in MS patients shortly after the first treatment with IFNβ (**Figure 4**). A comparable number of LTR elements were also differentially expressed between the conditions (**Figure 4**). Interestingly, transcription of LTR elements appeared more distinct between the conditions, with the exception of T1D, than overall gene expression, with a particularly strong signature in the IFNβ-treated subset of MS patients (**Figure 4**). Moreover, B cells from these individuals differentially expressed more than twice the number of LINEs than of genes, with clusters of LINEs clearly distinguishing the different conditions, again with a very strong signature evident in the IFNβ-treated subset of MS patients (**Figure 4**).

To probe further the IFNβ-responsiveness of EREs, we first examined whether the pattern observed in purified B cells was also present when the entire complement of blood cell types was analyzed. We focused on LTR elements as they include phylogenetically more diverse groups than LINEs. Indeed, a sizeable set of the IFNβ-inducible LTR elements upregulated in B cells from IFNβ-treated MS patients was also detectable and highly induced in the same patient group, when RNA-seq data from whole-blood was analyzed (131 of 779 elements, **Figure 5A**).

We next examined the overlap between LTR elements that are induced by IFNβ treatment of MS patients and those that might be naturally induced in a setting of elevated levels of endogenously produced type I IFN (IFN I). To this end, we analyzed RNA-seq data (GSE72420) from whole-blood samples obtained from Systemic Lupus Erythematosus (SLE) patients (Hung et al., 2015), an autoimmune disease with a strong IFN I signature (Obermoser and Pascual, 2010). The intersection of LTR elements that were upregulated in purified B cells in response to IFNβ treatment of MS patients and those induced in SLE patients as a group identified 219 common LTR elements (**Figure 5B**). Importantly, significantly elevated expression of these LTR elements was present in the vast majority (88%; 66 of 75) of SLE patients with a high Interferon Signature Metric (ISM) score, but not in any of the SLE patients with a low ISM score (0/24) or any healthy individuals (0/18) (**Figure 5B**). These results indicated that an overlapping set of LTR elements (Supplementary Table 2) were responsive to IFN I both in IFNβ-treated MS patients and in SLE patients with a high ISM score.

… transcripts) in the same dataset as in (A), in comparison with resting B cells. Mean fold changes from resting B cells are plotted.

To explore whether IFN I was preferentially inducing certain LTR groups, we compared the composition of all LTR elements expressed in MS or SLE patients with that of the IFN I-inducible LTR elements shared between the two conditions (Supplementary Table 2). This comparison uncovered significant enrichment for the ERV1 groups as a whole, with members of the LTR48, HERV4, MER41D and HERVFH19 subgroups being frequently responsive to IFN I (**Figure 5C**). Thus, transcriptional induction of LTR elements by exogenously administered or endogenously produced IFN I displays a certain degree of specificity.
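The group-level enrichment reported above can be illustrated with a minimal sketch of a per-group χ<sup>2</sup> test. It assumes a 2 × 2 contingency table for each group (inducible versus non-inducible, in the group versus in any other group), a Pearson χ<sup>2</sup> statistic without continuity correction, and a Bonferroni correction. The source states only that a χ<sup>2</sup> test with multiple comparison correction was used, so the table layout, the correction method, the function names and the example counts are all assumptions.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-square (1 d.f., no continuity correction) and its p-value."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0, 1.0
    chi2 = n * (a * d - b * c) ** 2 / denom
    p = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square survival function, 1 d.f.
    return chi2, p


def enriched_groups(inducible, expressed, alpha=0.05):
    """Test each group for over-representation among inducible elements.

    inducible and expressed map a group name to an element count; the
    inducible elements are assumed to be a subset of the expressed ones.
    """
    n_ind = sum(inducible.values())
    n_rest = sum(expressed.values()) - n_ind
    n_tests = len(expressed)
    results = {}
    for group, total in expressed.items():
        a = inducible.get(group, 0)    # inducible, in group
        b = n_ind - a                  # inducible, other groups
        c = total - a                  # not inducible, in group
        d = n_rest - c                 # not inducible, other groups
        chi2, p = chi2_2x2(a, b, c, d)
        p_adj = min(1.0, p * n_tests)  # Bonferroni correction (assumption)
        over = a * d > b * c           # over- rather than under-represented
        results[group] = {"chi2": chi2, "p_adj": p_adj,
                          "enriched": over and p_adj < alpha}
    return results


# Hypothetical counts per group (not taken from Supplementary Table 2).
inducible = {"ERV1": 60, "ERVK": 10, "ERVL": 30}
expressed = {"ERV1": 300, "ERVK": 200, "ERVL": 500}
report = enriched_groups(inducible, expressed)
```

The direction check matters because a χ<sup>2</sup> test is two-sided: a group that is significantly depleted among inducible elements would otherwise be reported alongside the enriched ones.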

Lastly, we investigated how human B cell transformation might influence ERE transcriptional behavior. To achieve this, we compared ERE transcription in RNA-seq data (GSE62241) from follicular B cell lymphoma and from non-transformed B cells (Koues et al., 2015). The groups included B cells purified from follicular lymphoma biopsies; centrocytes, the non-cycling fraction of germinal center B cells, isolated ex vivo from tonsillar tissues; and activated B cells, isolated from peripheral blood samples and stimulated in vitro using a combination of IL-4, α-CD40, α-IgM, and α-IgD (Koues et al., 2015). Consistent with their original description (Koues et al., 2015), follicular lymphoma B cells differentially expressed a substantial number of genes, in comparison with non-transformed activated B cells or centrocytes (**Figure 6A**). Notably, a much larger number of EREs were dysregulated in follicular lymphoma B cells, with twice as many LTR elements and four times as many LINEs upregulated in follicular lymphoma B cells as genes (**Figure 6A**). More importantly, investigation of the overlap between LTR elements upregulated in follicular lymphoma B cells with those induced in B cells from diseases other than cancer uncovered distinguishable, non-overlapping patterns, with the majority of induced LTR elements specific to B cell lymphoma (**Figure 6B**). Together, these results suggested that distinct LTR elements are transcriptionally activated in cancer and in other diseases.

## DISCUSSION

Endogenous retroelements constitute a sizeable fraction of the genome and their transcription has considerable potential to affect cellular physiology or contribute to pathology. However, their precise contribution can only be accurately assessed with detailed knowledge of their transcriptional behavior at the genome-wide level with the resolution of individual ERE integrations (Kassiotis and Stoye, 2016). The technological advances of transcriptional profiling by RNA-seq now afford a means of addressing genome-wide ERE modulation in health and disease. Using such methodology, we uncovered unique patterns of ERE modulation characteristic of physiological activation or pathological transformation of murine and human B cells.

Study of murine B cell responses to physiological innate immune, cytokine, or BCR stimuli highlighted the responsiveness predominantly of gene transcription and the lack of widespread induction of LTR elements or LINEs. In fact, most B cell-expressed EREs appeared downregulated in murine B cells upon activation, likely due to the overshadowing induction of strong gene transcription under these conditions. Prior analysis of immune cell stimulation by microarray methods had also indicated that transcription of different ERE groups could either increase or, indeed, decrease upon activation (Young et al., 2014), but lacked the resolution of RNA-seq data analysis. Our current analysis identified a small number of distinct LTR elements that are consistently activated in stimulated murine B cells, with a single endogenous MLV provirus, Xmv45, expressed at higher levels than all other LTR elements together. It would be interesting to explore the reasons for this unique inducibility of Xmv45, as well as its potential consequences for B cell function.

In contrast to the strong induction of a very limited number of ERE proviruses following B cell stimulation, expression of a much larger number of EREs was found to be altered following B cell transformation. Gene expression profiling of DLBCL has revealed patterns associated either with an activated B cell phenotype or with a germinal center phenotype (Lenz and Staudt, 2010). Each subtype is characterized by a different frequency of mutations in pathways affecting cellular activation, such as mutations promoting constitutive NF-κB activation, or differentiation, such as BCL6 or its antagonist Blimp-1 (Lenz and Staudt, 2010). Consistent with these observations, we found that one-third of LTR elements that were induced in B cell lymphoma cells were shared with either activated B cells or germinal center B cells. Induction of these shared LTR elements is likely driven by the same transcriptional regulators induced in activated B cells (e.g., NF-κB) or germinal center B cells (e.g., BCL6), which are also overexpressed in B cell lymphomas.

NF-κB, together with IRF1, has been implicated in the transcriptional activation of HERV-K(HML2) proviruses in ALS brain tissue and human astrocytes and neurons treated with inflammatory cytokines (Manghera et al., 2016). These two immune activation-induced transcription factors are part of a longer list of nearly 40 host transcription factors that are suspected to directly drive transcription of HERVK LTRs (Manghera and Douville, 2013). Indeed, this relatively high affinity of ERV LTRs for host transcription factors seems to be an intrinsic, evolutionarily shared property (Dunn et al., 2005), and underlies their ability to establish and rewire host gene regulatory networks (Rebollo et al., 2012). Whether BCL6 or Blimp-1 directly affect transcription of ERVs is not currently known, but Blimp-1 has been reported to repress expression of HIV-1 proviruses in T cells (Kaczmarek Michaels et al., 2015). Therefore, BCL6 may induce expression of ERVs indirectly, through its established role in repressing the repressor Blimp-1 (Crotty et al., 2010).

… of 219 LTR elements that were induced in B cells by IFNβ treatment of MS patients and were also upregulated in RNA-seq data (GSE72420) from whole-blood samples from SLE patients as a group, compared with those from healthy individuals (≥2-fold change; p < 0.05). In (A,B) each column is an independent sample. (C) Diversity of the LTR elements that are expressed in peripheral blood cells (left) and of the 108 IFN I-inducible LTR elements that were common between purified B cells and whole-blood samples from IFNβ-treated MS patients and SLE patients. Slice widths are proportional to the frequency of each member. Significantly enriched groups (p < 0.05, χ<sup>2</sup> with multiple comparison correction) are indicated by red asterisks.

Given the common pathways that drive B cell activation, germinal center response and B cell transformation, it was surprising to observe that the majority of LTR elements that were significantly upregulated in B cell lymphoma were unique to this condition and were not shared with physiologically activated B cells. This was exemplified by the observed expression of Emv2, which surpassed the highly inducible Xmv45 by two orders of magnitude, to become the single most expressed MLV in B cell lymphoma. Whereas Xmv45 is highly inducible by LPS, cytokine or antigenic stimulation of B cells, Emv2 is primarily responsive to epigenetic modifiers (Young et al., 2014), implicating epigenetic changes in the altered MLV expression profile in B cell lymphoma cells. It should be noted that although expression of ecotropic MLV found in RNA-seq data is attributed here to the germline copy of Emv2, based on sequence identity, it may also arise from new somatically acquired integrations of an Emv2-derived infectious retrovirus. Indeed, the dysregulation of Emv2 alongside the expression of complementary viruses may support the production of infectious recombinant retroviruses, further increasing observed expression, particularly in B cell lymphomas (Young et al., 2012). The same is also true for other mobile EREs, such as intracisternal A particle (IAP) elements in mice and LINEs in both humans and mice, where the reported expression is the combination of transcription of germline copies and any somatically acquired additional copies.

Akin to murine B cells, distinct patterns of LTR element and LINE expression characterized B cells isolated from patients suffering from different autoimmune, infectious, degenerative or neoplastic diseases. However, interesting differences between the two host species were also observed. Whereas the transcriptional response of murine B cells was overshadowed by gene expression changes, this was not the case in human B cells where transcription of EREs was far more responsive to the influence of the diseases studied here. This was particularly visible for LINEs, which indeed were nearly three-times more numerous than non-viral genes in the transcriptional difference between B cells from the different diseases. These findings may indicate higher overall transcriptional activity of LINEs in humans (Hancks and Kazazian, 2012).

Regardless of its origin, the enhanced transcriptional responsiveness of human EREs provides a considerably more detailed map of transcriptional activity across the genome than annotated genes alone. For example, the transcriptional response to IFN I treatment of MS patients was more evident in LTR element or LINE transcription than in gene transcription overall. The sheer number of transcribed EREs allows for increased statistical power, revealing differences that may be too subtle to detect otherwise.

Also similar to the specificity of LTR element expression in stimulated murine B cells, human B cells upregulated a select list of LTR elements in conditions of IFN I stimulation. IFN I-inducible LTR elements, shared between purified B cells and whole-blood samples and between IFNβ-treated MS patients and SLE patients, were enriched for ERV1 class elements and included LTR48, HERV4 and MER41D members. Several IFN-induced transcription factors, including NF-κB, IRFs and STATs, are predicted to bind HERV-K LTRs (Manghera and Douville, 2013), indicating direct responsiveness of at least certain ERV groups to IFN stimulation. Interestingly, members of the MER41 group were recently shown to confer IFNγ-responsiveness to the AIM2 gene and to contribute to the activation of other immune-related genes, by providing binding sites for the transcription factors IRF1 and STAT1 (Chuong et al., 2016). It is, therefore, conceivable that the IFN I responsiveness of LTR48, HERV4 and MER41D elements is mediated by IFN I-induced transcription factors.

In comparison with resting or activated B cells, the most pronounced induction of ERE transcription was witnessed in human B cell lymphoma cells, affecting thousands of LTR elements and LINEs. More importantly, as was the case with murine B cell lymphoma cells, the LTR elements that were activated in human B cell lymphoma cells exhibited minimal overlap with those expressed in non-transformed B cells from any of the conditions studied. Together, these results suggest that cellular transformation, at least in the B cell lineage, is associated with dysregulation of a non-random set of EREs that are not typically found dysregulated in other conditions. The distinctive patterns of ERE induction will help separate physiological from pathological expression, as well as provide targets for possible intervention.

## AUTHOR CONTRIBUTIONS

GY developed the bioinformatics pipeline. JA, GY, and GK analyzed the data. JA, GY, JS, and GK wrote the manuscript. JS and GK supervised the study.

## FUNDING

This work was supported by the Francis Crick Institute (FC001099 and FC001162), which receives its core funding from Cancer Research UK, the UK Medical Research Council and the Wellcome Trust; and by the Wellcome Trust (102898/B/13/Z).

## ACKNOWLEDGMENT

The authors are grateful for assistance from the Scientific Computing Facility at the Francis Crick Institute.

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmicb.2017.02489/full#supplementary-material



**Conflict of Interest Statement:** GK is a scientific co-founder of and consulting for ERVAXX and a member of its scientific advisory board. GK, GY, and JA may receive royalties through their institution from ERVAXX.

The other author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Attig, Young, Stoye and Kassiotis. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# WebHERV: A Web Server for the Computational Investigation of Gene Expression Associated With Endogenous Retrovirus-Like Sequences

Konstantin Kruse<sup>1,2†</sup>, Martin Nettling<sup>1\*†</sup>, Nadine Wappler<sup>2</sup>, Alexander Emmer<sup>3</sup>, Malte Kornhuber<sup>3,4</sup>, Martin S. Staege<sup>2‡</sup> and Ivo Grosse<sup>1,5‡</sup>

#### Edited by:

*Gkikas Magiorkinis, National and Kapodistrian University of Athens, Greece*

#### Reviewed by:

*Robert Belshaw, Plymouth University, United Kingdom Amr Aswad, University of Oxford, United Kingdom*

#### \*Correspondence:

*Martin Nettling martin.nettling@informatik.uni-halle.de*

*†These authors have contributed equally to this work*

*‡These authors have contributed equally to this work as last authors*

#### Specialty section:

*This article was submitted to Virology, a section of the journal Frontiers in Microbiology*

Received: *28 February 2018* Accepted: *18 September 2018* Published: *05 November 2018*

#### Citation:

*Kruse K, Nettling M, Wappler N, Emmer A, Kornhuber M, Staege MS and Grosse I (2018) WebHERV: A Web Server for the Computational Investigation of Gene Expression Associated With Endogenous Retrovirus-Like Sequences. Front. Microbiol. 9:2384. doi: 10.3389/fmicb.2018.02384*

*<sup>1</sup> Institute of Computer Science, Martin Luther University Halle-Wittenberg, Halle, Germany, <sup>2</sup> Department of Surgical and Conservative Pediatrics and Adolescent Medicine, Martin Luther University Halle-Wittenberg, Halle, Germany, <sup>3</sup> Department of Neurology, Martin Luther University Halle-Wittenberg, Halle, Germany, <sup>4</sup> Department of Neurology, Helios Hospital, Sangerhausen, Germany, <sup>5</sup> German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Leipzig, Germany*

More than eight percent of the human genome consists of human endogenous retroviruses (HERVs). Typically, the expression of HERVs is repressed, but varying activities of HERVs have been observed in diseases ranging from cancer to neuro-degeneration. Such activities can include the transcription of HERV-derived open reading frames, which can be translated into proteins. However, as a consequence of mutations that disrupt open reading frames, most HERV-like sequences have lost their protein-coding capacity. Nevertheless, these loci can still influence the expression of adjacent genes and, hence, mediate biological effects. Here, we present WebHERV (http://calypso.informatik.uni-halle.de/WebHERV/), a web server that enables the computational prediction of active HERV-like sequences in the human genome based on a comparison of genome coordinates of expressed sequences uploaded by the user and genome coordinates of HERV-like sequences stored in the specialized key-value store DRUMS. Using WebHERV, we predicted putative candidates of active HERV-like sequences in Hodgkin lymphoma (HL) cell lines, validated one of them by a modified SMART (switching mechanism at 5′ end of RNA template) technique, and identified a new alternative transcription start site for cytochrome P450, family 4, subfamily Z, polypeptide 1 (CYP4Z1).

Keywords: endogenous retroviruses, HERVs, Hodgkin lymphoma, CYP4Z1, BLAST, DRUMS, database, web server

## 1. INTRODUCTION

Human endogenous retroviruses (HERVs) are an increasingly recognized part of the human genome that entered the germline in the course of evolution and today comprise more than eight percent of it (Makałowski, 2001; de Parseval and Heidmann, 2005). Most of these sequences are no longer protein coding due to mutations, but there are several exceptions such as syncytins, which exert a physiological function in placenta development (Denner, 2016).

HERV envelope sequences might be translated and exert immuno-regulatory activity (Kassiotis and Stoye, 2016), and it is a matter of current research to which degree the high number of HERV-like sequences without intact open reading frames have physiological or patho-physiological functions (Moyes et al., 2007; Ruprecht et al., 2008; Voisset et al., 2008; Dolei and Perron, 2009). Interestingly, HERV-like sequences can act as regulatory elements for adjacent genes, and the promoter activity of such elements has been demonstrated to influence the activity of oncogenes in lymphoma cells (Lamprecht et al., 2010).

Cloning strategies for the identification of expressed HERV loci have been developed (Wilkinson et al., 1993), but the experimental effort of these methods is high. In contrast, the experimental effort of next-generation-sequencing-based approaches is low, but the challenge of these approaches is that analyzing the resulting data in a coherent manner requires some non-negligible bioinformatics effort.

Here, we describe the web server WebHERV that enables genome-wide analyses of the proximity of differentially expressed genes and HERV-like sequences. These analyses are based on sets of genome coordinates representing transcriptionally active gene loci generated for example by micro-array or RNA-seq experiments and sets of coordinates of HERV-like sequences retrieved from an integrated DRUMS database (Nettling et al., 2014).
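The proximity analysis described above can be sketched as an interval query: for each expressed region, report the HERV-like sequences that overlap it or lie within a chosen window. The sketch below uses a sorted in-memory list and binary search. It is not the WebHERV implementation, which queries the DRUMS store, and the function names, the tuple layout, the closed-interval coordinate convention and the example loci are hypothetical.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


def build_index(loci):
    """Group loci by chromosome and sort them by start coordinate.

    loci is an iterable of (chrom, start, end, name) tuples.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end, name in loci:
        by_chrom[chrom].append((start, end, name))
    index = {}
    for chrom, items in by_chrom.items():
        items.sort()
        starts = [s for s, _, _ in items]
        max_len = max(e - s for s, e, _ in items)
        index[chrom] = (starts, items, max_len)
    return index


def nearby(index, chrom, q_start, q_end, window=0):
    """Names of loci overlapping [q_start - window, q_end + window]."""
    if chrom not in index:
        return []
    starts, items, max_len = index[chrom]
    # A locus starting further left than this cannot reach the query region.
    lo = bisect_left(starts, q_start - window - max_len)
    hi = bisect_right(starts, q_end + window)
    return [name for s, e, name in items[lo:hi] if e >= q_start - window]


# Hypothetical HERV-like loci and one expressed region.
herv_index = build_index([
    ("chr1", 100, 200, "herv_a"),
    ("chr1", 5000, 5600, "herv_b"),
    ("chr2", 100, 200, "herv_c"),
])
close_hits = nearby(herv_index, "chr1", 250, 300, window=100)
```

Storing the longest locus per chromosome bounds the search from the left, so each query inspects only the loci whose start lies in a narrow coordinate range instead of scanning the whole chromosome.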

Co-expression of HERV-like sequences and host genes may lead to differential transcript abundances of these genes in the vicinity of HERV-like sequences. In turn, the observation of such expression patterns might be indicative of the activity and of the potential physiological or patho-physiological functions of the associated HERV-like sequences.

The current implementation of WebHERV is based on a database of sequences with high similarities to HERV loci as defined by Villesen et al. (2004) without any additional size limitation and without further sequence restriction. In addition, WebHERV provides the option of using RepeatMasker coordinates as alternative source for HERV-like sequences.

The rest of the paper is structured as follows. In section 2, we present the underlying data of WebHERV, the architecture of WebHERV, and one exemplary biological application. In section 3, we use WebHERV for predicting putative candidates of HERV-like sequences that might influence the transcription of genes in Hodgkin lymphoma (HL) cell lines.

## 2. MATERIALS AND METHODS

In this section, we first describe the prediction of HERV-like sequences in the human genome. Second, we describe the data storage. Third, we describe the functionality of the web server and the data flow.

## 2.1. Predicting HERV-Like Sequences

We used the predicted HERVs published by Villesen et al. (2004) as a basis to search for the complete set of HERV-like sequences in two versions of the human reference genome using BLAST version blast-2.6.0-linux (Altschul et al., 1990) with an E-value threshold of 10<sup>−10</sup>. This BLAST search identified more than 4 × 10<sup>8</sup> HERV-like sequences in both of the human reference genome versions hg18 and hg19, which we integrated into the DRUMS database at the back-end of WebHERV.
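The E-value filter applied to the BLAST hits can be illustrated with a minimal sketch. It assumes that the hits are available as standard 12-column tabular BLAST output; the names `Hit` and `parse_tabular_hits` are hypothetical and are not part of WebHERV.

```python
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    """One HERV-like hit on the reference genome (hypothetical record type)."""
    chrom: str
    start: int
    end: int
    evalue: float


def parse_tabular_hits(lines: Iterable[str], max_evalue: float = 1e-10) -> List[Hit]:
    """Keep hits whose E-value does not exceed max_evalue.

    Assumes the 12 standard tabular columns:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    hits = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        sstart, send, evalue = int(fields[8]), int(fields[9]), float(fields[10])
        if evalue <= max_evalue:
            # Hits on the minus strand have sstart > send; store ordered coordinates.
            hits.append(Hit(fields[1], min(sstart, send), max(sstart, send), evalue))
    return hits
```

The default threshold of 10<sup>−10</sup> corresponds to the value stated above; all other details of the sketch are assumptions.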

The sequences from Villesen et al. (2004) preferentially include long and almost complete HERV-like sequences with partially intact open reading frames. We accepted this bias because the expression of such sequences might be of particular interest in the context of human diseases.

However, this bias might not be desirable or sometimes not be acceptable for other applications. Hence, we developed WebHERV in such a way that the user can easily extend or modify the database and include for example other HERV-like sequences with lower sequence similarity to typical retroviruses or other repetitive elements.

In order to allow the identification of long terminal repeats (LTRs) and other elements with lower sequence similarities to retroviruses, we also integrated the RepeatMasker coordinates of LTR elements into the DRUMS database of WebHERV.

## 2.2. HERV Loci Store

Traditional relational databases like MySQL have severe problems with the integration of—and with performing queries against—more than 10<sup>8</sup> sequences (Nettling et al., 2014). Hence, we used the tailored storage system DRUMS (Nettling et al., 2014) as HERV loci store at the back-end of WebHERV, which allows both a smooth integration and smooth queries of more than 4 × 10<sup>8</sup> HERV-like sequences.

## 2.3. WebHERV

The front-end of WebHERV is a web application based on JavaServer Faces and publicly available at http://calypso.informatik.uni-halle.de/WebHERV/. The user can upload a file with genomic positions derived from arbitrary sources of RNA-seq or micro-array data. Alternatively, the user can upload a file with probe set IDs of the Affymetrix Human Exon 1.0 ST array platform, as WebHERV stores the positional information of these probe sets in an additional SQLite database.

The user can specify several search parameters, which are described in detail next to the corresponding input field. In addition, a step-by-step user instruction is available for download at http://calypso.informatik.uni-halle.de/WebHERV/resources/docs/InstructionsWebHERV.pdf. Finally, pressing the "submit" button starts the search for HERV-like sequences in the HERV loci store.

The results are presented interactively on a separate page. For each genomic position (or, alternatively, for each probe set), WebHERV displays all HERV-like sequences and their E-values found with the specified search parameters and makes these data available for download as a CSV file. The complete data flow is illustrated in **Figure 1**.
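The core of such a proximity search can be sketched as follows. This is an illustration under stated assumptions and not the WebHERV implementation: the function names, the in-memory dictionary that stands in for the HERV loci store, and the convention of measuring the distance as the gap between closed intervals are all hypothetical.

```python
from typing import Dict, List, Tuple

# (start, end, evalue) of one HERV-like sequence; coordinates are inclusive.
HervHit = Tuple[int, int, float]


def gap(a_start: int, a_end: int, b_start: int, b_end: int) -> int:
    """Distance in bp between two closed intervals; 0 if they overlap."""
    return max(0, b_start - a_end, a_start - b_end)


def herv_neighbors(
    probe: Tuple[str, int, int],
    hervs_by_chrom: Dict[str, List[HervHit]],
    max_distance: int,
    max_evalue: float,
) -> List[HervHit]:
    """Return all HERV-like sequences on the same chromosome that pass the
    E-value threshold and lie within max_distance bp of the probe set."""
    chrom, p_start, p_end = probe
    return [
        hit
        for hit in hervs_by_chrom.get(chrom, [])
        if hit[2] <= max_evalue and gap(p_start, p_end, hit[0], hit[1]) <= max_distance
    ]
```

A linear scan is used here for clarity only; a store holding more than 10<sup>8</sup> sequences requires an indexed lookup instead.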

## 2.4. Biological Application

In this section, we describe one application of WebHERV to the analysis of HL data. All analyzed data are available from the Gene Expression Omnibus (GEO) database. RNA from HL cell lines was isolated using TRIzol (Invitrogen, Karlsruhe, Germany) following the manufacturer's protocol.

Micro-array data (Affymetrix Human Exon 1.0 ST arrays) from HL cell lines were generated as described (Kewitz and Staege, 2013). For comparison, micro-array data from normal B cells (Nikitin et al., 2010) and normal blood cells (Shehadeh et al., 2010) were used. All Affymetrix cel files (GSE18838 for normal blood cells, GSE20200 for normal B cells, and GSE47686 for HL cell lines) were downloaded from the GEO database and processed using the Affymetrix Expression Console software (build 1.4.1.46; annotation file version huex-1-0stv2.na 36.hg19).

From the mentioned data sets GSE18838 and GSE20200 only normal blood cells (11 samples) or normal B cells (4 samples) were used. Additional cel files (Affymetrix HG-U133Plus2.0 arrays) from HL samples (GSE12453, GSE12427, GSE20011, GSE25986, GSE39134, GSE7303, and GSE12453) and a panel of normal tissues from the human body atlas GSE7307 were also downloaded from the GEO database (Roth et al., 2006; Brune et al., 2008; Liu et al., 2010; Köchert et al., 2011; Steidl et al., 2011, 2012). Probe sets were pre-filtered by using MAFilter (Winkler et al., 2012) on the basis of a simple fold change calculation (Witten and Tibshirani, 2007; Draghici, 2011).

Probe sets were considered to be differentially expressed if the median signal intensity in one group (HL cell lines or normal blood cells) was at least 10 times the 75th percentile of the signal intensities in the other group. Dividing the 50th percentile of one group by the 75th percentile of the other group has the advantage that outliers have only a low impact on the ratios.
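The filter criterion can be written down in a few lines. The following sketch assumes signal intensities on a linear (not logarithmic) scale and a percentile definition with linear interpolation between order statistics; both are assumptions, and the function names are hypothetical.

```python
from statistics import median
from typing import Sequence


def percentile(values: Sequence[float], q: float) -> float:
    """q-th percentile (0-100) with linear interpolation between order statistics."""
    xs = sorted(values)
    if not xs:
        raise ValueError("percentile of an empty sequence")
    pos = (len(xs) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def robust_ratio(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Median of group A divided by the 75th percentile of group B."""
    return median(group_a) / percentile(group_b, 75)


def is_higher_in_a(group_a: Sequence[float], group_b: Sequence[float],
                   min_ratio: float = 10.0) -> bool:
    """True if group A exceeds group B by at least min_ratio in the robust ratio."""
    return robust_ratio(group_a, group_b) >= min_ratio
```

Because neither the median nor the 75th percentile depends on the most extreme values, a single outlier in either group changes the ratio only slightly. A 75th percentile of zero would require separate handling.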

For reverse transcription-polymerase chain reactions (RT-PCRs), two micrograms of RNA were transcribed into cDNA using the qScript™ cDNA SuperMix (Quantabio, Beverly, MA, USA) following the manufacturer's protocol. RT-PCR was performed using a total volume of 25 µl with final concentrations of 10 pM forward and 10 pM reverse primer, 200 µM dNTP Mix (Thermo Fisher Scientific, Waltham, MA, USA), 1x Go-Taq-Buffer (Promega, Fitchburg, WI, USA), 0.04 U/µl GoTaq DNA-Polymerase (Promega), and 2 µl cDNA.

The following RT-PCR procedure was used: (i) 94°C, 5 min; (ii) 94°C, 30 s; (iii) 60°C, 30 s; (iv) 72°C, 45 s; (v) 72°C, 5 min. Steps (i) to (iv) were repeated 30 times. The following primers were used: CYP4Z1-E1-3: 5′-ttc ttg ctg ctg atc ctc ct-3′, 5′-ccc agg att caa gga ttt tg-3′; CYP4Z1-HERVLE: 5′-tca gca aac tat cgc aag ga-3′, 5′-tag ggg ttg tgg tga aga gc-3′.

Transcripts of CYP4Z1 in HL cells were characterized by using a modified SMART (switching mechanism at 5′ end of RNA transcript) technique as described in Kewitz et al. (2014). Sequencing of RT-PCR products was performed using the BigDye Terminator V1.1 Cycle Sequencing Kit (Life Technologies, Austin, TX, USA). The sequences were analyzed by BLAST (Altschul et al., 1990), and splice-site prediction was performed by Human Splicing Finder (Desmet et al., 2009).

## 3. RESULTS AND DISCUSSION

We tested the functionality of the web server by using two random lists of 60,000 probe sets generated by the MySQL random function. The percentages of probe sets that were identified as being HERV associated increased monotonically with increasing distances or increasing E-values.

When using no size limitation, an E-value threshold of 10<sup>−100</sup>, and a distance of 0 base pairs, 795 probe sets (1.33%) and 873 probe sets (1.46%) from the two random lists of probe sets were identified as being HERV associated, respectively. These numbers increased to 3,291 (5.50%) and 3,267 (5.46%) when using an E-value threshold of 10<sup>−10</sup>.

HL is a lympho-proliferative disease with known re-activation of HERV-like sequences (Lamprecht et al., 2010; Staege et al., 2014). The majority of HL patients can be cured today, but the toxicity of the therapy regimens used is high. Hence, major efforts are being made worldwide to identify novel targets that allow the development of less toxic treatment strategies in the future.

HERV-like sequences associated with differential gene expression might represent such targets. As the characterization of HL is one of our main research topics, we analyzed micro-array data from HL cell lines (Kewitz and Staege, 2013) in comparison to normal blood cells (Shehadeh et al., 2010) using WebHERV. From the blood cell data set, we only used healthy control samples.

We filtered probe sets with the highest signal intensities in HL cells compared to blood cells and vice versa as described in section 2 and obtained 4,329 up-regulated and 4,994 down-regulated probe sets in HL cells in comparison to normal blood cells. Of these probe sets, 4,306 and 4,918 have genome coordinates, and we provide these two lists of probe sets as sample files on the WebHERV homepage.

GSE18838) were analyzed for the presence of HERV-like sequences. Presented are the percentages of probe sets with hits in the distance between 1 and 10,000 bp. (B) Presented is the ratio of the two percentages of probe sets with hits from up-regulated and down-regulated probe sets. The HERV association of HL specific probe sets is most pronounced at a distance of approximately 200 bp from the probe set.

We analyzed these probe sets for the presence of neighboring HERV-like sequences by WebHERV using the hg19 database and found that probe sets with higher signal intensities in HL cell lines were located more often in the vicinity of HERV-like sequences than probe sets with higher signal intensities in normal blood cells (**Figure 2**).

The percentage of probe sets with HERV-like sequences in the neighborhood increases in both cell types with increasing distances from the probe sets. However, a higher percentage of probe sets with high signal intensities in HL cells is located in the vicinity of HERV-like sequences.

We found 2,575 of the up-regulated probe sets (59.80%) and only 2,335 of the down-regulated probe sets (47.48%) to be associated with HERV-like sequences using a distance between 0 and 5,000 bp from the probe sets, an E-value threshold of 10<sup>−100</sup>, and no size restriction. In this data set, the HERV association was most pronounced at around 200 bp from the probe sets (**Figure 2**), but in other data sets the optimal distance might be different, so we set the default distance of WebHERV to 1,000 bp.

We found 163 of the up-regulated probe sets (3.79%) and 96 of the down-regulated probe sets (1.95%) in the neighborhood of HERV-like sequences using a distance of 200 bp. With increasing distances, the percentage of HERV-associated probe sets approaches the limit of 100% and the ratio of HERV-associated probe sets in HL and normal cells approaches the limit of 1.0, so we limited this analysis to a distance of 10 kb.
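The reported percentages follow from the counts when the probe sets with genome coordinates (4,306 up-regulated and 4,918 down-regulated) are used as denominators. The short sketch below reproduces this arithmetic and the resulting ratio of the two percentages; the function names are hypothetical.

```python
def percent(hits: int, total: int) -> float:
    """Percentage of probe sets with at least one HERV-like sequence nearby."""
    return 100.0 * hits / total


def association_ratio(up_hits: int, up_total: int,
                      down_hits: int, down_total: int) -> float:
    """Ratio of the HERV-associated percentages (up-regulated / down-regulated)."""
    return percent(up_hits, up_total) / percent(down_hits, down_total)


UP_TOTAL, DOWN_TOTAL = 4306, 4918  # probe sets with genome coordinates

# Distance of up to 5,000 bp
print(round(percent(2575, UP_TOTAL), 2), round(percent(2335, DOWN_TOTAL), 2))
print(round(association_ratio(2575, UP_TOTAL, 2335, DOWN_TOTAL), 2))

# Distance of 200 bp
print(round(percent(163, UP_TOTAL), 2), round(percent(96, DOWN_TOTAL), 2))
print(round(association_ratio(163, UP_TOTAL, 96, DOWN_TOTAL), 2))
```

With these counts, the ratio is about 1.26 at 5,000 bp and about 1.94 at 200 bp, in line with the observation that the association is most pronounced at short distances.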

We found several genes among the HL specific probe sets associated with HERV-like sequences with a known high expression in HL (Staege et al., 2008, 2015; Hermes et al., 2016) such as the cancer/testis antigen PRAME (preferentially expressed antigen in melanoma), the cytokine EBI3 (Epstein-Barr virus induced 3), the chemokine fractalkine (CX3CL1), fascin (FSCN), or topoisomerase 2A (TOP2A). We also observed a high expression of all of these genes in HL cells in comparison to isolated B cells (**Supplementary Figure 1**).

In addition, we found a high up-regulation in HL cells for a locus on chromosome 1 corresponding to cytochrome P450, family 4, subfamily Z, polypeptide 1 (CYP4Z1; **Figure 3**). Independent micro-array data (Affymetrix HG-U133Plus2.0 arrays) suggest that CYP4Z1 is indeed an HL associated gene. High signal intensities were observed in the majority of HL samples, and among normal tissues only mammary gland expressed CYP4Z1 (**Figure 4**).

We asked whether HERV-like sequences in the CYP4Z1 gene might influence the transcription of this gene or vice versa. A HERV-like sequence is located in the intron between exons 9 and 10 of the reference sequence, and we used a SMART (switching mechanism at 5′ end of RNA transcript) technique for the identification of the 5′ end of the CYP4Z1 transcripts.

We identified and sequenced three different transcripts (**Figure 5**) and found that the two longer transcripts represent CYP4Z1 splice variants with or without exon 2. Primers with a specificity for exons 1 and 3 of CYP4Z1 demonstrated the presence of these two CYP4Z1 splice variants in 3/5 HL cell lines but not in normal peripheral blood mononuclear cells (PBMC).

The shortest transcript corresponds to a sequence that starts in the intron between exons 9 and 10 of CYP4Z1. The intronic part of this transcript has homology to L1 transposable elements. Primers with a specificity for exon 10 and the adjacent HERV-like sequence identified transcripts with an alternative transcription start site defined by the HERV-like sequence only in HL cell lines but not in PBMC (**Figure 5**).

The putative function of CYP4Z1 in HL cells is unknown. CYP4Z1 is not expressed in all HL cell lines, so it seems unlikely that CYP4Z1 plays a major role in HL pathogenesis. Nevertheless, as recently discussed for breast cancer, CYP4Z1 might represent an interesting target for cancer therapy (Yang et al., 2017).

FIGURE 3 | Signal intensities of six up-regulated genes in HL cell lines in the vicinity of HERV-like sequences. Presented are means and standard deviations for probe sets identified as associated with HERV-like sequences in HL cell lines (closed bars; GEO data set GSE47686) and normal blood cells (open bars; GEO data set GSE18838). A high up-regulation in HL was found for a probe set corresponding to the gene cytochrome P450, family 4, subfamily Z, polypeptide 1 (CYP4Z1).

GSE39134) and normal tissues (from GEO data sets GSE7307) was assessed in micro-array data sets from the GEO database (Affymetrix HG-U133Plus2.0 micro-array data). High signal intensities were observed in the majority of HL samples. From the normal tissues only mammary gland expresses CYP4Z1.

For example, the enzymatic function of CYP4Z1 might be targetable for prodrug activation. In addition, autoantibodies against CYP4Z1 have been detected in breast cancer patients, suggesting that CYP4Z1 can serve as target for immunological treatment strategies.

CYP4Z1 amplified two CYP4Z1 splice variants (1 and 2) in 3/5 HL cell lines but not in normal peripheral blood mononuclear cells (PBMC). Primers with a specificity for exon 10 and the adjacent HERV-like sequence detected transcripts (3) with an alternative transcription start site only in HL cell lines but not in PBMC; ntc, no template control; M, DNA size marker. (C) Schematic representation of identified CYP4Z1 transcripts. Exons are indicated by blue boxes. The position of the HERV-like sequence is indicated by a red box.

The relevance of the CYP4Z1 splice variant with missing exon 2 is unclear. Splice site analysis of the genomic CYP4Z1 reference sequence (NG\_007967.1) using Human Splicing Finder (Desmet et al., 2009) yielded the expected splice donor site (GAGgtaaga) at the 3′ end of exon 1 with an HSF score of 95.9 and a MaxEnt score of 10.06.

Interestingly, however, the 5′ end of exon 2 yielded a splice acceptor site HSF score of 83.63 and a MaxEnt score of 5.55, whereas the 5′ end of exon 3 showed an HSF score of 89.69 and a MaxEnt score > 12. Hence, it might be possible that the exon 3 acceptor is preferentially used, and exon 2 is lost, resulting in the truncation of the protein sequence.

Alternatively, a downstream start codon might be used, resulting in an N-terminally truncated CYP4Z1 protein. The lost amino acids include the transmembrane region, and it might be possible that the new splice variant represents a soluble isoform of CYP4Z1. The presence of such soluble isoforms might be important for the development of future treatment strategies using CYP4Z1 as target.

In breast cancer, CYP4Z1 and the pseudogene CYP4Z2P have been indicated to play a role in angiogenesis and cell transformation (Zheng et al., 2015), but it needs to be analyzed if the detected HERV-associated transcript variant can interfere with the CYP4Z1/CYP4Z2P network. Interestingly, all HL cell lines that expressed the newly identified transcript variant also expressed the longer transcripts. This suggests that the expression of the new CYP4Z1 transcript variant is not controlled by the HERV-like sequence independently from the normal promoter but that the activity of the locus as a whole is switched on in CYP4Z1 expressing HL cells.

This application example demonstrates that WebHERV might be useful for the analysis of gene expression associated with HERV-like sequences. The performed analyses preferentially returned hits that have a relatively high sequence similarity with preserved HERV-like sequences, whereas shorter sequences, isolated long terminal repeats, or HERV-like sequences with unclear phylogenetic relationships to retroviruses such as members of the transposon-like human element (THE) family were not recognized entirely.

Hence, we included the RepeatMasker coordinates of LTR elements as an alternative database in the WebHERV server. Using this database allows the identification of additional HERV-like sequences including for example the mentioned THE family.

These elements can play an important role in the context of human diseases including HL. One example is a long terminal repeat of the THE1B family acting as promoter for the colony stimulating factor 1 in HL cells (Lamprecht et al., 2010). As described in the online GitHub documentation, WebHERV can be extended to include also broader arrays of elements beyond HERV-like sequences.

The feature of WebHERV to provide pre-processed HERV loci might be advantageous for some users, but other users might be interested in extending or replacing the lists of genome coordinates with further or alternative putative elements. Hence, WebHERV provides the possibility of including additional elements as well as additional genome sequences for users who wish to perform similar studies with elements of their choice in genomes of their choice.

## AUTHOR CONTRIBUTIONS

MN, MS, and IG designed the study. KK, MN, MS, and NW performed the experiments. All authors analyzed the data, wrote the manuscript, and approved the final version of the manuscript.

## ACKNOWLEDGMENTS

We are grateful to Ines Volkmer for technical assistance.

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmicb.2018.02384/full#supplementary-material


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Kruse, Nettling, Wappler, Emmer, Kornhuber, Staege and Grosse. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

## NOMENCLATURE

## Resource Identification Initiative

Used cell lines: HDLM-2: DSMZ Cat. ACC-17, RRID:CVCL\_0009. KM-H2: DSMZ Cat. ACC-8, RRID:CVCL\_1330. L-1236: DSMZ Cat. ACC-530, RRID:CVCL\_2096. L-428: DSMZ Cat. ACC-197, RRID:CVCL\_1361. L-540: DSMZ Cat. ACC-72, RRID:CVCL\_1362.

# Differentiation-Dependent Regulation of Human Endogenous Retrovirus K Sequences and Neighboring Genes in Germ Cell Tumor Cells

#### Thomas Mueller<sup>1</sup> \*, Claudia Hantsch<sup>2</sup> , Ines Volkmer<sup>2</sup> and Martin S. Staege<sup>2</sup>

<sup>1</sup> Department of Internal Medicine IV, Haematology/Oncology, Martin Luther University Halle-Wittenberg, Halle, Germany, <sup>2</sup> Department of Surgical and Conservative Paediatrics and Adolescent Medicine, Martin Luther University Halle-Wittenberg, Halle, Germany

#### Edited by:

Gkikas Magiorkinis, National and Kapodistrian University of Athens, Greece

#### Reviewed by:

Tara Patricia Hurst, Abcam, United Kingdom Timokratis Karamitros, University of Oxford, United Kingdom

\*Correspondence:

Thomas Mueller thomas.mueller@medizin.uni-halle.de

#### Specialty section:

This article was submitted to Virology, a section of the journal Frontiers in Microbiology

Received: 28 February 2018 Accepted: 23 May 2018 Published: 15 June 2018

#### Citation:

Mueller T, Hantsch C, Volkmer I and Staege MS (2018) Differentiation-Dependent Regulation of Human Endogenous Retrovirus K Sequences and Neighboring Genes in Germ Cell Tumor Cells. Front. Microbiol. 9:1253. doi: 10.3389/fmicb.2018.01253

Under physiological conditions, most human endogenous retroviruses (HERVs) are transcriptionally silent. However, re-activation of HERVs is observed under pathological conditions like inflammation or cancer. In addition to expression of HERV sequences, an impact of HERV-loci on expression of adjacent genes has been suggested as a probably important patho-physiological mechanism. A candidate for such a gene is PRODH (proline dehydrogenase 1), which is located on chromosome 22 adjacent to HERVK-24. Germ cell tumors (GCTs) are known to express high levels of HERVK sequences. In addition, non-seminomatous GCTs are useful models to study HERV expression in the context of differentiation since they reflect aspects of cellular development during embryogenesis and usually contain different cell types. This is due to the embryonal carcinoma (EC) cells, which are the stem cell component of GCTs. They are pluripotent, show high expression of pluripotency markers like OCT4 and LIN28A and can differentiate into either somatic derivatives (teratoma cells) or choriocarcinoma or yolk-sac tumor cells reflecting extra-embryonal differentiation. OCT4 is lost upon differentiation. We used GCT derived cell lines of varying differentiation stages to analyze expression of HERVK and PRODH. Differentiation status and cellular relationship of GCT cells were determined using microarray analysis and western blotting of the embryonic pluripotency markers OCT4 and LIN28A. The highest expression of HERVK was found in undifferentiated EC cells, which retain a stem cell phenotype and express both OCT4 and LIN28A. In contrast, the lowest expression of HERVK was observed in somatic differentiated GCT cells, which also lack OCT4 and LIN28A, whereas GCT cells with differentiation characteristics of yolk-sac tumor expressed LIN28A but not OCT4 and showed intermediate levels of HERVK. A similar pattern was found for PRODH. Differentiation of EC cells by siRNA mediated knock-down of OCT4 or treatment with differentiation inducing medium decreased expression of HERVK and PRODH. Treatment of differentiated GCT cells with 5′-azacytidine and trichostatin A increased expression of HERVK and PRODH, indicating that epigenetic mechanisms are responsible for altered expression of these genes. Our data suggest that HERVK expression is dependent on cellular differentiation stages regulated by epigenetic mechanisms, which can also affect expression of neighboring genes.

Keywords: endogenous retroviruses, germ cell tumor, differentiation, HERVK, PRODH, OCT4, LIN28A

## INTRODUCTION

Human endogenous retroviruses (HERVs) are retroviral sequences that are permanently integrated into human DNA and are passed from parents to offspring like other genes. In addition to HERVs that are present in every individual, some HERVs are polymorphic and the presence or absence of these HERVs varies between individuals (Turner et al., 2001). Interestingly, these HERVs are particularly able to produce virus-like particles (Boller et al., 2008). Re-activation of HERVs has been found in cancer patients. The patho-physiological function of this phenomenon is unclear but oncogenic transformation of cells by HERV gene products has been described (Boese et al., 2000; Galli et al., 2005; Argaw-Denboba et al., 2017; Lemaître et al., 2017). In addition, re-activation of HERV-like promoters has been shown to be involved in the aberrant expression of transformation associated genes in lymphoma cells (Lamprecht et al., 2010). Therefore, an impact of HERV-loci on expression of adjacent genes can be suggested as one probably important patho-physiological mechanism.

In this study we focused on a specific HERVK locus, which is referred to as ERVK-24 according to the nomenclature from Mayer et al. (2011). Formerly, this locus was described as HERV-K101 (Barbulescu et al., 1999) and c22\_A (Ruprecht et al., 2008). ERVK-24 is located on chromosome 22 between the loci for proline dehydrogenase 1 (PRODH) and DiGeorge critical region 5 (DGCR5). DGCR5 has been identified as chromosomal breakpoint in patients with DiGeorge syndrome (Sutherland et al., 1996). As DGCR5 did not contain a functional open reading frame, it was suggested that expression of DGCR5 might reflect a particular chromatin configuration that is required for regulation of adjacent genes (Sutherland et al., 1996). One candidate for such a gene is PRODH. PRODH is an evolutionarily conserved gene and a homolog of the Drosophila gene sluggish A (Gogos et al., 1999). Like PRODH, sluggish A is a mitochondrial protein and is involved in glutamate synthesis (Hayward et al., 1993). Mutations in PRODH are a cause of hyperprolinemia and a risk factor for schizophrenia (Bender et al., 2005).

ERVK-24 belongs to a group of HERVs with high expression in patients with germ cell tumors (GCTs) that are positive for antibodies against HERV-proteins (Flockerzi et al., 2008). It seems to be one of the transcriptionally most active HERVs in GCT cells (Ruprecht et al., 2008). In addition to their high expression of HERVK sequences, GCTs, in particular non-seminomatous GCTs, are useful models to study HERV expression in the context of differentiation processes since they can reflect some aspects of cellular development during embryogenesis. This is due to the pluripotent nature of embryonal carcinoma (EC) cells, which are the stem cell component of GCTs. EC cells can be considered as the malignant counterpart of pluripotent embryonic stem cells, and show high expression of pluripotency markers like OCT4 (Looijenga et al., 2003; Sperger et al., 2003). They can differentiate into either somatic derivatives leading to teratoma tissue or into tissues like choriocarcinoma and yolk sac tumor reflecting an extra-embryonic differentiation (Oosterhuis and Looijenga, 2005). OCT4 is lost during differentiation. Therefore, GCTs are usually composed of undifferentiated EC cells and variously differentiated cell types (Oosterhuis and Looijenga, 2005).

In the present paper we analyzed expression of HERVK and PRODH in cell lines of GCT with varying differentiation stages and upon induction of differentiation in undifferentiated cells. In addition, differentiated cells were treated with agents modifying DNA methylation and histone acetylation to investigate epigenetic mechanisms, which are known to be involved in both differentiation processes and inactivation of HERVs.

## MATERIALS AND METHODS

## Cell Lines and Cell Culture

The following human GCT cell lines were used: H12.1 and H12.5 (Casper et al., 1987), H12.1D (Mueller et al., 2006), 1411HP (Vogelzang et al., 1985), GCT72 and GCT27 (Pera et al., 1987), 1777NRpmet, 2102EP, 833K, and NTera2-D1 (Bronson et al., 1980, 1983; Andrews et al., 1996). The cell lines 1777NRpmet, 1411HP, and 833K were kindly provided by Prof. Peter W. Andrews (University of Sheffield, United Kingdom). The H12.1 and H12.5 were established in the former group of Prof. H.-J. Schmoll (University Hospital Halle, Germany) and belong to our lab. The cell lines GCT72 and GCT27 were kindly provided by Prof. Martin F. Pera (Monash University, Australia, at the time of shipping). The NTera2-D1 was kindly provided by Dr. Heiko van der Kuip (University of Tübingen, Germany).

The Hodgkin lymphoma (HL) cell lines L-1236, L-428, L-540, KM-H2, and HDLM-2 (Schaadt et al., 1979; Diehl et al., 1982; Drexler et al., 1986; Kamesaki et al., 1986; Wolf et al., 1996) were purchased from the German Collection of Microorganisms and Cell Cultures, Brunswick, Germany.

All cell lines were cultured in RPMI-1640 (Invitrogen, Karlsruhe, Germany) supplemented with 10% fetal calf serum, 100 U/mL penicillin, and 100 µg/mL streptomycin at 37°C in a humidified atmosphere with 5% CO<sub>2</sub>. For induction of differentiation of H12.1 cells, cells were treated with 10 µM retinoic acid and harvested after 5 days. For re-induction of HERVK expression, 1777NRpmet cells were treated with 5′-azacytidine (4 µM) and trichostatin A (10 nM) and harvested after 5 days.

## Gene Expression Analysis

We used published cell lines and commercially available RNA from anonymous sources for gene expression analysis. RNA from cell lines was isolated using Trizol reagent (Invitrogen, Karlsruhe, Germany) following the manufacturer's protocol. Possible DNA contamination was removed by treatment with DNase (Roche, Mannheim, Germany). In addition, RNA from human placenta from anonymous donors was obtained from Becton-Dickinson (Heidelberg, Germany). RNA (2 µg) was transcribed into cDNA using oligo-dT12-18 primers (Promega, Mannheim, Germany) and reverse transcription-polymerase chain reaction (RT-PCR) was performed. The following primer combinations were used: actin beta (ACTB): 5′-GGC ATC GTG ATG GAC TCC G-3′, 5′-GCT GGA AGG TGG ACA GCG A-3′; HERVK primer combination a (HERVKa): 5′-CCT GCA GTC CAA AAT TGG TT-3′, 5′-GCA ATG CAA CTC CTG CTA CA-3′; HERVK primer combination b (HERVKb): 5′-TTC TGC TGG TGA GAG CAA GA-3′, 5′-TGG ACA CAG CAC ATG TTT CA-3′; glyceraldehyde 3-phosphate dehydrogenase (GAPDH): 5′-CCA TGG AGA AGG CTG GGG-3′, 5′-CAA AGT TGT CAT GGA TGA CC-3′; proline dehydrogenase 1 (PRODH): 5′-GAG GCT TTG AGA AGC CAG TG-3′, 5′-GGT ATT GCT TGT CCC GCT TA-3′. The PCR conditions were: 94°C, 30 s; 60°C, 30 s; 72°C, 45 s (35 cycles). The HERVK primers bind to the following genome coordinates: HERVKa: NC\_000022.11:18945350-18945369 and NC\_000022.11:18946285-18946304; HERVKb: NC\_000022.11:18946101-18946120 and NC\_000022.11:18947034-18947049. The reverse primer of the latter combination has two mismatches with the current genome version. The primers should also be able to amplify additional HERVK elements. However, the very high expression of ERVK-24 in comparison to other elements seems to favor amplification of this locus, as confirmed by sequencing of polymerase chain reaction products (see Supplementary Material). Absence of DNA contamination was tested randomly by using RNA without reverse transcription as template for PCR. See the Supplementary Material for an example. PCR products were subjected to agarose gel electrophoresis in the presence of ethidium bromide. Real-time quantitative RT-PCR (qRT-PCR) was performed using the Maxima™ SYBR Green qPCR Master Mix (Fermentas, Sankt Leon-Rot, Germany) using the following conditions: 94°C, 45 s; 60°C, 45 s; 72°C, 60 s (40 cycles).
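The genomic span covered by each HERVK primer combination can be derived from the listed binding coordinates. The following sketch assumes 1-based inclusive coordinates on the same reference sequence (NC\_000022.11); the function name is hypothetical, and the result refers to the genomic span between the listed sites only.

```python
from typing import Tuple

Site = Tuple[int, int]  # (first base, last base), 1-based and inclusive


def genomic_span(forward_site: Site, reverse_site: Site) -> int:
    """Number of bases from the first base of one primer site to the
    last base of the other primer site on the same reference sequence."""
    start = min(forward_site[0], reverse_site[0])
    end = max(forward_site[1], reverse_site[1])
    return end - start + 1


# Coordinates as listed in the text above
herv_ka = genomic_span((18945350, 18945369), (18946285, 18946304))
herv_kb = genomic_span((18946101, 18946120), (18947034, 18947049))
print(herv_ka, herv_kb)
```

Note that the listed binding site of the HERVKb reverse primer is shorter than the 20-nucleotide primer, so the actual product may be a few bases longer than the span computed here.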

FIGURE 1 | Expression of stem cell markers in GCT cells. (A) Gene expression in GCT cell lines was assessed by DNA microarray analysis. Two independent samples per cell line were analyzed. Gene expression in GCT was compared with gene expression in a panel of normal tissues (Ge et al., 2005). Genes with high expression in GCT were filtered by using MAFilter. Probe sets with a WLS > 10 were considered to be GCT specific. Presented are signal intensities (arbitrary units) for probe sets with specificity for the indicated stem cell markers. The following normal tissues are included (from left to right): heart, thymus, spleen, ovary, kidney, skeletal muscle, pancreas, prostate, small intestine, colon, placenta, bladder, breast, uterus, thyroid, skin, salivary gland, trachea, cerebellum, brain, fetal brain, adrenal gland, bone marrow, amygdala, caudate nucleus, corpus, hippocampus, thalamus, pituitary gland, spinal cord, testis, liver, stomach, lung, fetal lung, fetal liver. (B) Western blot analysis of pluripotent stem cell markers OCT4 and LIN28A. In addition to H12.1 and H12.5, four other cell lines representing the undifferentiated, pluripotent EC cell type were analyzed: 2102EP, 833K, GCT27, NTera-D1.

Global gene expression in GCT cells was analyzed using Affymetrix HG\_U133A arrays (Affymetrix, Santa Clara, CA, United States). Arrays were processed essentially as described (Staege et al., 2004). In short, biotinylated cRNA was prepared by in vitro transcription after synthesis of double-stranded cDNA. After fragmentation of cRNA and hybridization, signals were detected with streptavidin-phycoerythrin and enhanced by using goat anti-streptavidin antibodies. Arrays were washed and stained with a GeneChip Fluidics Station 400 and scanned with a GeneArray Scanner G2500A. Affymetrix cell files were processed using the Robust Multi-array Average (RMA) algorithm with Expression Console 1.1 (Affymetrix). GCT-associated genes were identified on the basis of Wilks' Lambda score (WLS) by using MAFilter (Winkler et al., 2012). WLS was used descriptively, without significance calculation, for filtering probe sets with high signal intensities in GCT cells in comparison to normal cells. To this end, WLS was calculated as the quotient of the variance in the total group of samples and the variance in the group of normal tissues alone. Microarray cell files have been submitted to the Gene Expression Omnibus (GEO) database (Accession No. GSE113423). For comparative analysis, published microarray data from a panel of normal tissues [normal body atlas (NBA)] from the GEO database (GSE2361) were used (Ge et al., 2005). Cluster analysis and visualization were performed with Genesis (Sturn et al., 2002).
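The WLS filter described above can be illustrated with a minimal sketch. It assumes the sample variance (n − 1 denominator) and uses hypothetical function names and signal values; it is not the MAFilter implementation, whose internal details are not given here. Note that the ratio by itself does not encode the direction of the difference between tumor and normal samples.

```python
from statistics import variance


def wilks_lambda_score(tumor_signals, normal_signals):
    """Variance of all samples divided by variance of the normal samples.

    Assumption: sample variance (n - 1 denominator) is used for both terms.
    Raises ZeroDivisionError if the normal samples have zero variance.
    """
    total = list(tumor_signals) + list(normal_signals)
    return variance(total) / variance(normal_signals)


def filter_probe_sets(tumor, normal, threshold=10.0):
    """Return IDs of probe sets whose score exceeds the threshold.

    tumor, normal: dicts mapping a probe set ID to a list of signal values.
    """
    return [
        probe_id
        for probe_id in tumor
        if wilks_lambda_score(tumor[probe_id], normal[probe_id]) > threshold
    ]


# Hypothetical signal intensities (arbitrary units) for two probe sets.
tumor = {"probe_a": [100.0, 110.0], "probe_b": [5.0, 6.0]}
normal = {"probe_a": [1.0, 2.0, 3.0], "probe_b": [5.0, 6.0, 7.0]}
selected = filter_probe_sets(tumor, normal)
```

In this toy example only `probe_a` passes the threshold of 10, because its tumor signals lie far outside the spread of the normal samples.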

## Sequencing and Bioinformatical Analyses

Polymerase chain reaction products were purified with NucleoSpin Gel and PCR Clean-up (Macherey-Nagel, Düren, Germany). Sequencing of PCR products was performed using the BigDye Terminator v1.1 Cycle Sequencing Kit (Life Technologies, Austin, TX, United States). The sequences were analyzed with BLAST (Altschul et al., 1990). Open reading frames in the intergenic region between PRODH and DGCR5 were identified by using getorf<sup>1</sup>. Long terminal repeats (LTRs) were identified using RepeatMasker<sup>2</sup>.
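To make the notion of open reading frame identification concrete, the following is a minimal sketch of a forward-strand ORF scan. It is not the getorf algorithm: it assumes the standard stop codons, requires an ATG start codon, ignores the reverse strand, and uses a hypothetical minimum-length parameter and a made-up test sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence, min_length=30):
    """Return (start, end, frame) tuples for ATG...stop ORFs on the forward strand.

    start is the 0-based index of the A in ATG, end is exclusive and
    includes the stop codon. min_length is in nucleotides (an assumption).
    """
    seq = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if i + 3 - start >= min_length:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next ORF in this frame
    return orfs


# Made-up sequence containing one short ORF (ATG AAA TAG) in frame 2.
orfs = find_orfs("CCATGAAATAGCC", min_length=9)
```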

## Western Blot Analysis

Cells were harvested by trypsinization, rinsed twice with PBS and lysed in RIPA buffer (50 mM Tris-HCl pH 8.0, 100 mM NaCl, 0.5% NP40, 0.5% DOC, 0.5% SDS) supplemented with a protease inhibitor cocktail (Sigma, St. Louis, MO, United States). Insoluble components were removed by centrifugation and protein concentrations were measured (BIO-RAD protein assay, Bio-Rad, Hercules, United States). After boiling for 5 min in SDS-loading buffer (500 mM Tris-HCl pH 6.8; 10% glycerol, 2% SDS, 5% 2-mercaptoethanol, 0.05% bromophenol blue), 20 µg protein per lane was separated by SDS-PAGE and electroblotted onto nitrocellulose transfer membrane (Whatman, Maidstone, United Kingdom). Equal protein loading was controlled by Ponceau S staining (Sigma, St. Louis, MO, United States). Membranes were blocked with 5% non-fat dry milk in PBST for 1 h and probed for 2 h with the primary antibodies diluted in PBST/5% milk, followed by incubation with secondary HRP-conjugated antibodies. Proteins were visualized by enhanced chemiluminescence (Carl Roth, Karlsruhe, Germany). The following primary antibodies were used: OCT4: sc-5279 mouse monoclonal C-10; β-actin: sc-1615 goat polyclonal C-11 (both from Santa Cruz Biotechnology, Santa Cruz, CA, United States); LIN28A: #3978 rabbit polyclonal (Cell Signaling). Horseradish peroxidase (HRP)-conjugated anti-goat, anti-mouse and anti-rabbit IgG (all from Santa Cruz Biotechnology, Santa Cruz, CA, United States) were used as secondary antibodies.

<sup>1</sup>http://www.hpa-bioinfotools.org.uk/pise/getorf.html

<sup>2</sup>http://www.repeatmasker.org/

## siRNA-Mediated Protein Knock-down

For siRNA-mediated protein knock-down of OCT4, cells were transfected with OCT4-specific siRNA or control siRNA (both from Santa Cruz Biotechnology, United States). Transfection of siRNA was performed with the Nucleofector® technology (Amaxa Biosystems, Germany). Cells (2 × 10<sup>6</sup>) were suspended in 100 µl transfection buffer (Amaxa Biosystems, Germany) and combined with 1 µg siRNA. After reaction in the Nucleofector® system, the transfected cell suspension was diluted in growth media, seeded in 6-well plates and incubated for the indicated times. OCT4 knock-down was confirmed by western blot analysis.

## RESULTS AND DISCUSSION

To investigate cellular relationships, we analyzed the gene expression pattern of six GCT cell lines with varying differentiation stages in comparison to a panel of normal tissues (NBA) from the GEO database (Ge et al., 2005). Microarray data were filtered for up-regulated genes on the basis of WLS by using MAFilter, and 1,104 probe sets were identified with a WLS > 10, indicating up-regulation of the corresponding genes in GCT cells. Among the most strongly up-regulated genes we found typical markers of pluripotent stem cells such as LIN28A, NANOG, and OCT4. Based on the expression pattern of LIN28A, NANOG and OCT4, three groups of GCT cells could be defined (**Figure 1A**). Group 1 included the cell lines H12.1 and H12.5, which represent the undifferentiated, pluripotent EC cell type. These GCT cells are characterized by the expression of all three genes. Group 2 included the cell lines GCT72 and 1411HP, which have characteristics of differentiation toward yolk-sac tumor. These cells have lost expression of NANOG and OCT4 but still express LIN28A. Group 3 included the cell line 1777NRpmet and H12.1D, which is an in vitro differentiated, stable derivative of the EC cell line H12.1. These cells have lost expression of all three stem cell markers. The differential gene expression patterns were confirmed by western blot analysis of OCT4 and LIN28A (**Figure 1B**), indicating that both markers are useful to define the three groups of GCT cells.

Cluster analysis indicated that the gene expression profile of cells from group 3 has greater similarity with normal somatic tissues than the gene expression profiles of the other groups (**Figure 2**), indicating a teratoma-like, somatic differentiation lineage of these cells. This similarity could also be seen in cluster analysis when we used probe sets that were filtered for cell line specificity (**Figure 3**). To this end, we divided the mean signal intensity of each cell line (which in our case is identical to the 50th percentile) by the 85th percentile of the signal intensities in all other GCT cell lines. Using the 85th percentile has the advantage that outliers from these cell lines have only low impact on the calculated ratios. A probe set was considered cell line specific if this ratio was greater than 2. Based on this filtering criterion, a total of 1,315 probe sets showed cell line specificity. Cluster analysis using these probe sets as data points revealed again, and more clearly, the higher similarity between normal somatic tissues and 1777NRpmet and H12.1D cells. Interestingly, it also revealed that among the cells with yolk-sac tumor characteristics, GCT72 cells are more closely related to pluripotent H12.1/H12.5 cells than 1411HP cells (**Figure 3**).
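The cell line specificity ratio can be sketched as follows. The percentile is computed by linear interpolation between ranks, which is an assumption because the interpolation rule is not stated; the function names and signal values are hypothetical.

```python
from statistics import mean


def percentile(values, q):
    """Percentile (q in 0..100) by linear interpolation between ranks."""
    xs = sorted(values)
    if len(xs) == 1:
        return xs[0]
    pos = (len(xs) - 1) * q / 100
    lower = int(pos)
    upper = min(lower + 1, len(xs) - 1)
    return xs[lower] + (xs[upper] - xs[lower]) * (pos - lower)


def specificity_ratio(signals_by_line, cell_line):
    """Mean signal of one cell line over the 85th percentile of all others."""
    own = mean(signals_by_line[cell_line])
    others = [
        value
        for name, values in signals_by_line.items()
        if name != cell_line
        for value in values
    ]
    return own / percentile(others, 85)


# Hypothetical signals for one probe set, two samples per cell line.
signals = {
    "H12.1": [900.0, 1100.0],
    "GCT72": [100.0, 120.0],
    "1411HP": [90.0, 110.0],
}
is_specific = {line: specificity_ratio(signals, line) > 2 for line in signals}
```

With these made-up values, only H12.1 exceeds the ratio threshold of 2 for this probe set.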

Together, gene expression analysis and western blotting defined three groups of GCT cells: (i) OCT4+/LIN28+ undifferentiated pluripotent, (ii) OCT4−/LIN28+ differentiated toward yolk-sac tumor, and (iii) OCT4−/LIN28− somatically differentiated.

Germ cell tumors are known for their high expression of endogenous retroviruses. Therefore, we tested expression of HERVK in GCT cell lines by conventional and quantitative RT-PCR. For comparison we used HL cell lines. As shown in **Figure 4A**, GCT cell lines showed higher expression of HERVK than HL cell lines. Although reactivation of HERVs in HL cells has been described, the expression in HL cells is low in comparison to GCT, at least for HERVK. Next we asked whether the different differentiation status of our GCT cell lines might have an impact on HERVK expression. Among GCT cells, expression of HERVK was particularly high in the undifferentiated pluripotent EC cell lines H12.1 and H12.5 (**Figure 4B**). Cells with characteristics of yolk-sac tumor (GCT72 and 1411HP) showed intermediate expression, whereas somatically differentiated tumor cells (1777NRpmet, H12.1D) expressed the lowest levels (**Figure 4B**). To test for a direct link between differentiation processes and HERVK expression, we performed siRNA-mediated knock-down of OCT4 in pluripotent H12.1 cells, which induces differentiation in those cells. As shown in **Figure 5**, induction of differentiation rapidly led to repression of HERVK.

FIGURE 6 | Correlative expression of HERVK and PRODH in GCT cells. (A) Gene expression in GCT cell lines was assessed by DNA microarray analysis. Two independent samples per cell line were analyzed. Presented are signal intensities (arbitrary units) for probe sets with specificity for PRODH as means ± SD. (B) Presented are results from quantitative RT-PCR analysis with primers specific for HERVK and PRODH. cDNA was prepared from GCT cell lines and β-actin was used as house-keeping control. Two additional cell lines (2102EP, 833K), which represent the undifferentiated, pluripotent EC cell type, were included to support the correlation and to analyze differences among this group of GCT cells.

FIGURE 7 | Epigenetic regulation of HERVK and PRODH in GCT cells. Presented are results from quantitative RT-PCR analysis with primers specific for HERVK and PRODH. cDNA was prepared from GCT cell lines and β-actin was used as house-keeping control. Expression of HERVK in untreated cells was set as 1 and relative expression was calculated according to the 2<sup>−ΔΔCt</sup> method (Livak and Schmittgen, 2001). (A) Pluripotent H12.1 cells were treated with retinoic acid to induce differentiation. (B) Somatically differentiated 1777NRpmet cells were treated with a combination of 5′-azacytidine and trichostatin A to induce re-expression of HERVK and PRODH.

Sequencing of PCR products indicated that the primers used for PCR preferentially amplify ERVK-24 (see Supplementary Material). ERVK-24 is located between the loci for PRODH and DGCR5 (see Supplementary Material for the topography of the complete locus including the position of PCR amplicons). Organization of the locus suggests that PRODH and ERVK-24 might be regulated by a bi-directional promoter. We asked whether ERVK-24 and the neighboring PRODH might be co-regulated. Analysis of PRODH based on our microarray data showed an expression pattern similar to that observed for HERVK across the three groups of GCT cells (**Figure 6A**). Notably, GCT72 yolk-sac tumor cells had similarly high PRODH expression as pluripotent H12.1 cells. Next we performed combined RT-PCR analysis of HERVK and PRODH in our GCT cell panel and found a correlation between PRODH and HERVK expression (**Figure 6B**). The characteristic higher PRODH expression in GCT72 among the cells with yolk-sac tumor differentiation could be reproduced. Therefore, the PRODH and HERVK expression patterns of GCT72 are consistent with the cluster analyses and suggest that it is more closely related to pluripotent H12.1/H12.5 cells than 1411HP.

To further investigate a possible differentiation-dependent co-regulation of HERVK and PRODH, pluripotent H12.1 cells were treated with differentiation-inducing retinoic acid. As shown in **Figure 7A**, induction of differentiation led to repression of HERVK and was accompanied by down-regulation of PRODH. Next we asked whether HERVK could be re-induced in cells with low expression, e.g., somatically differentiated cells. As shown in **Figure 7B**, treatment of 1777NRpmet cells with 5′-azacytidine and trichostatin A increased expression of HERVK. Interestingly, this was accompanied by induction of PRODH expression. Together, these data demonstrate a differentiation-dependent and epigenetically regulated expression of HERVK and suggest co-regulation of PRODH expression.
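The relative expression values in Figure 7 are reported according to the 2<sup>−ΔΔCt</sup> method (Livak and Schmittgen, 2001). The calculation can be sketched as follows, with β-actin as the reference gene and untreated cells as the calibrator. The function name and all Ct values are hypothetical, and the method assumes approximately equal, near-100% amplification efficiencies for target and reference.

```python
def relative_expression(ct_target_treated, ct_reference_treated,
                        ct_target_control, ct_reference_control):
    """Fold change of the target gene by the 2^(-ddCt) method.

    dCt  = Ct(target) - Ct(reference gene), per condition
    ddCt = dCt(treated) - dCt(untreated control)
    """
    delta_treated = ct_target_treated - ct_reference_treated
    delta_control = ct_target_control - ct_reference_control
    return 2 ** -(delta_treated - delta_control)


# Hypothetical Ct values: the target crosses threshold two cycles later
# after treatment while the reference gene is unchanged.
fold_change = relative_expression(26.0, 18.0, 24.0, 18.0)

# The untreated control compared with itself is 1 by construction.
calibrator = relative_expression(24.0, 18.0, 24.0, 18.0)
```

Here a ΔΔCt of 2 cycles corresponds to a fold change of 0.25, i.e., a fourfold reduction relative to untreated cells.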

Expression of HERV sequences has been observed in different diseases including cancer. It remains unclear whether HERV expression is directly involved in pathogenesis or whether HERV expression is only an epiphenomenon of altered gene regulation under pathological conditions. More recently, it was shown that activation of an endogenous retroviral LTR-like promoter is responsible for the expression of growth factor receptors in cancer cells (Lamprecht et al., 2010). However, the reasons for the aberrant activation of such promoters in cancer cells require further investigation. In the present paper we analyzed the expression of the PRODH/ERVK-24 locus. PRODH has been identified as a putative tumor suppressor gene (Liu et al., 2008, 2010). On the other hand, knock-down of PRODH decreases the viability of oxidized low-density lipoprotein (OxLDL)-treated cancer cells (Zabirnyk et al., 2010). OxLDL induces PRODH-dependent autophagy, which may explain some of the effects of PRODH, because limited autophagy is a cell survival factor whereas excessive autophagy promotes cell death (Degenhardt et al., 2006).

In general, a large number of human genes are regulated by bi-directional promoters (Trinklein et al., 2004) and highly active HERV promoters might serve as bi-directional promoters (Domansky et al., 2000). The chromosomal organization of the PRODH/ERVK-24 locus together with our expression data in GCT suggests that both genes are co-regulated. Interestingly, expression of PRODH in germ line cells is evolutionarily highly conserved since germ line stem cells from Drosophila express high amounts of the PRODH homolog sluggish A (Kai et al., 2005). These data suggest that expression of PRODH is a feature of cells with an embryonic phenotype. The co-expression of ERVK-24 together with PRODH might be a consequence of the active chromatin state in GCT. Whether the expression of ERVK-24 and PRODH has consequences for the tumor cell biology requires further investigation.

## CONCLUSION

In addition to direct effects of HERV expression, co-regulation of neighboring genes should be considered as a possible mechanism for HERV-associated diseases. This co-regulation can be associated with differentiation processes regulated by epigenetic mechanisms, as we have shown using GCT cell lines reflecting different stages of development.

## AUTHOR CONTRIBUTIONS

TM and MS designed the study and wrote the paper. All authors performed experiments, analyzed the data, and approved the final version of the paper.

## FUNDING

Our work was supported by the Wilhelm-Roux-Programm of the Medical Faculty of the Martin Luther University Halle-Wittenberg.

## ACKNOWLEDGMENTS

We thank Franziska Reipsch for technical assistance.

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmicb.2018.01253/full#supplementary-material

## REFERENCES

Altschul, S. F., Gish, W., Miller, W., Myers, E. W., and Lipman, D. J. (1990). Basic local alignment search tool. J. Mol. Biol. 215, 403–410. doi: 10.1016/S0022-2836(05)80360-2

Andrews, P. W., Casper, J., Damjanov, I., Duggan-Keen, M., Giwercman, A., Hata, J., et al. (1996). Comparative analysis of cell surface antigens expressed by cell lines derived from human germ cell tumours. Int. J. Cancer 66, 806–816. doi: 10.1002/(SICI)1097-0215(19960611)66:6<806::AID-IJC17>3.0.CO;2-0



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Mueller, Hantsch, Volkmer and Staege. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Endogenous Retrovirus 3 – History, Physiology, and Pathology

Yomara Y. Bustamante Rivera<sup>1</sup>, Christine Brütting<sup>1,2</sup>, Caroline Schmidt<sup>1</sup>, Ines Volkmer<sup>1</sup> and Martin S. Staege<sup>1</sup>\*

<sup>1</sup> Department of Paediatrics I, Martin Luther University Halle-Wittenberg, Halle, Germany, <sup>2</sup> Department of Neurology, Martin Luther University Halle-Wittenberg, Halle, Germany

Endogenous viral elements (EVE) seem to be present in all eukaryotic genomes. The composition of EVE varies between different species. The endogenous retrovirus 3 (ERV3) is one of these elements that is present only in humans and other Catarrhini. Conservation of ERV3 in most of the investigated Catarrhini and the expression pattern in normal tissues suggest a putative physiological role of ERV3. On the other hand, ERV3 has been implicated in the pathogenesis of auto-immunity and cancer. In the present review we summarize knowledge about this interesting EVE. We propose the model that expression of ERV3 (and probably other EVE loci) under pathological conditions might be part of a metazoan SOS response.

#### Edited by:

Gkikas Magiorkinis, National and Kapodistrian University of Athens, Greece

#### Reviewed by:

Manja Marz, Friedrich-Schiller-Universität Jena, Germany George Robert Young, Francis Crick Institute, United Kingdom

#### \*Correspondence:

Martin S. Staege martin.staege@uk-halle.de; martin.staege@medizin.uni-halle.de

#### Specialty section:

This article was submitted to Virology, a section of the journal Frontiers in Microbiology

Received: 29 September 2017 Accepted: 26 December 2017 Published: 15 January 2018

#### Citation:

Bustamante Rivera YY, Brütting C, Schmidt C, Volkmer I and Staege MS (2018) Endogenous Retrovirus 3 – History, Physiology, and Pathology. Front. Microbiol. 8:2691. doi: 10.3389/fmicb.2017.02691

Keywords: endogenous viral elements, endogenous retroviruses, ERV3, ZNF117, cancer, autoimmunity, ultra-stability, SOS response

## ENDOGENOUS VIRAL ELEMENTS (EVE)

Several virus species can persist lifelong in their hosts (Norja et al., 2006; Thorley-Lawson et al., 2013). In some cases, persistence is a consequence of integration in the host genome (Wang et al., 2015). In addition to somatic cells, cells of the germ line can be target cells of integration events. The integrated virus can then be transmitted vertically like an ordinary gene (Feschotte and Gilbert, 2012). If such endogenous viral elements (EVE) have no negative effects on the host, EVE can become stable elements of the host genome (Villesen et al., 2004).

Endogenous retroviruses (ERV) are the largest group of EVE and form at least 8% of the human genome (Griffiths, 2001). In some other species the amount of ERV DNA in the genome is much lower, suggesting the existence of efficient control systems in these species (Muir et al., 2004). ERV have been detected in the genomes of virtually all higher eukaryotes (Belshaw et al., 2004; Heidmann et al., 2009). There is growing evidence that ERV have played an important role in the evolution of mammals, primates, and humans (Deininger et al., 2003). Nearly all known human ERV (HERV) integrated up to 100 million years ago (Magiorkinis et al., 2015; Escalera-Zamudio and Greenwood, 2016).

Endogenous viral elements are usually inactivated by genetic and epigenetic mechanisms (Jern and Coffin, 2008). Genetic mechanisms include deletions, inversions, and point mutations in the open reading frames for the viral proteins. Therefore, most EVE are no longer able to replicate and form virus particles autonomously. However, release of virus particles derived from EVE has been described in cancer and other diseases (Wang-Johanning et al., 2007; Volkman and Stetson, 2014). In addition to mutations, epigenetic mechanisms inhibit EVE transcription (Blazkova et al., 2009; Lee et al., 2012). Reactivation of epigenetically silenced EVE can occur and lead to transcription of EVE-encoded proteins or non-coding sequences.


The majority of genomic HERV sequences are incomplete or heavily mutated, are often relatively short, and do not retain the complete retrovirus genome organization. Nevertheless, these HERV-like elements (HERVLE) can contribute to physiological or pathological processes. Complete HERV and HERVLE have been shown to be reactivated in certain types of cancer (Bannert and Kurt, 2004). Reactivated HERVLE can modulate expression of adjacent genes. For instance, HERVLE have been shown to act as alternate promoters for varying cellular genes in Hodgkin lymphoma and Non-Hodgkin lymphoma cells (Huff et al., 2005; Lamprecht et al., 2010; Lock et al., 2014; Babaian et al., 2016).

Endogenous retroviruses have been classified based on sequence similarities, but no system is universally accepted (Blomberg et al., 2009). ERV comprise over 200 distinct groups and subgroups, which have been assigned to three major classes: Class I ERV are related to gammaretroviruses and include human ERVE and ERV3; Class II ERV are related to betaretroviruses and include human ERVK and mouse mammary tumor virus; Class III ERV are related to Spumaretrovirinae and include ERVL (Katzourakis and Tristem, 2005).

Endogenous retroviruses are preferentially located on the Y chromosomes of human, chimpanzee and orang-utan (Sin et al., 2010). It has been suggested that reduced recombination of the Y chromosome renders loss of integrated sequences less likely. In addition, the apparently low number of functional genes and the high amount of heterochromatin on the Y chromosome might allow integration of ERV without negative impact (Kjellman et al., 1995).

## THE ENDOGENOUS RETROVIRUS 3 (ERV3)

ERV3 (also known as HERV-R) has been detected only in Hominidae (with the exception of Gorilla) and Cercopithecoidea. ERV3 apparently entered the primate genome 30–40 million years ago, around the time of the separation of the Catarrhini and Platyrrhini lineages (separation of the Old and New World monkeys). In several studies, ERV3 has been used as a marker for the presence of human DNA (Yuan et al., 2001; Whiley et al., 2004; Eberhart et al., 2005; Lee et al., 2005, 2006; Adaui et al., 2006; Rollison et al., 2007; Gage et al., 2011; MacIsaac et al., 2012; Agrawal et al., 2014; Alsaleh et al., 2014; Barletta et al., 2014; Devonshire et al., 2014; Shigeishi et al., 2016). ERV3 is located at an identical genomic position in great apes, monkeys and humans. No ERV3 locus was found in the genome of Gorilla. Despite the absence of ERV3 from the Gorilla genome, sequences with similarity to human ERV3 are present in Gorilla (Kim et al., 2006). Indeed, the current Gorilla genome version (gorGor4) contains at least one predicted non-coding gene (LOC109024208) with high sequence similarity to human ERV3. The human genome contains the same non-coding ERV3 copy. In both species, this copy is located upstream of the zinc finger protein ZNF681 on chromosome 19. ERV3 sequences have been found in different species of Catarrhini including Cercopithecinae (macaques, baboons, mangabeys), Hylobatidae (gibbons), and Hominidae. No sequences have been found in Platyrrhini (Shih et al., 1991; Hervé et al., 2004). As demonstrated in **Figure 1**, ERV3 is detectable at the cDNA as well as the genomic DNA level in man (Homo sapiens, Hominoidea, Catarrhini; Hodgkin lymphoma cell line L-1236; Wolf et al., 1996) and grivet (C. aethiops, Cercopithecoidea, Catarrhini; cell line COS-1; Gluzman, 1981) but not in cottontop tamarin (Saguinus oedipus, Cebidae, Platyrrhini; cell line B95.8; Shope et al., 1973). The ERV3 sequences from Catarrhini are highly conserved (**Figure 2**). Unfortunately, a definitive and universally accepted nomenclature for ERV and other EVE has not been established (Mayer et al., 2011; Vargiu et al., 2016). Therefore, several sequences that are annotated in public databases as ERV3 (e.g., gene IDs 71995, 107603642, 105604693, and many others) are not homologous to ERV3 from Catarrhini.

FIGURE 2 | Sequence comparison of ERV3 sequences from different species. Variations specific for individual taxa are highlighted. The following sequences have been analyzed: Cercocebus atys: NM\_001308247, Cercopithecus aethiops: MG574981, H. sapiens: NM\_001007253, Hylobates agilis: AB198937, Hylobates moloch: AJ862653, Macaca fascicularis: AB198938, Macaca fuscata: XM\_015446627, Macaca mulatta: XM\_015133398, Nomascus leucogenys: NM\_001308194, Pan paniscus: XM\_014345675, Pan troglodytes: XM\_016956775, Papio anubis: XM\_017956681, Pongo abelii: NM\_001308132, Pongo pygmaeus: AB198936, Rhinopithecus bieti: XM\_017858756.

ERV3 was isolated from human DNA and cDNA libraries in the mid-1980s (O'Connell et al., 1984; Cohen et al., 1985) and named ERV3 because it was the third identified human endogenous retrovirus locus (after ERV1 and virus 51-1). Sequence similarities with mammalian type C retroviruses qualify this ERV as a class I ERV. Human ERV3 is located on chromosome 7 at 7q11 (Kim et al., 2000). Early observations indicated that some of the transcripts from the ERV3 locus contained sequences from the downstream region (Kato et al., 1987). It was found that such transcripts contain sequences from a zinc finger protein (ZNF117) with unknown function (Kato et al., 1990). Interestingly, these read-through transcripts were more abundant in peripheral blood mononuclear cells (PBMCs) from patients with multiple sclerosis than in PBMCs from healthy individuals (Rasmussen et al., 1995). However, a link between the ERV3 locus and multiple sclerosis could not be established (Clausen, 2003). Read-through transcription from ERV into zinc finger proteins seems to be a common theme. For instance, according to nucleotide databases, ERV-ZNF8 read-through transcription might occur. Notably, ERV3-ZNF117 read-through transcripts (NM\_001348050) and normal ZNF117 reference transcripts (NM\_015852) encode the identical ZNF117 protein sequence. Therefore, the ERV3 locus can be considered an alternative promoter for ZNF117. No specific functions for the different untranslated regions of the two transcripts have been identified. According to RegRNA2.0 analysis (Chang et al., 2013), the shorter 5′-untranslated region of the read-through transcripts might have fewer binding sites for microRNAs and non-coding RNAs. Whether the different ZNF117 transcripts have different stabilities and translation efficiencies should be analyzed. The Gorilla gorilla genome contains a sequence with high homology to human ZNF117 that is located in a predicted gene (LOC101136021, "zinc finger protein 107-like"). In previous genome versions the region was annotated as "zinc finger protein 208-like." As a consequence of the high number of zinc finger proteins with similar sequences, the automated annotation algorithms have apparently not correctly assigned this gene as Gorilla ZNF117. However, this homology is evidenced not only by the high sequence similarity but also by the identical chromosomal context (**Figure 3**). Human ZNF117 as well as Gorilla ZNF107-like are located on the opposite strand between the two zinc finger proteins ZNF273 (G. gorilla LOC101135434) and ZNF92 (G. gorilla LOC101137731) on chromosome 7. The sequence between the two zinc finger proteins is markedly shorter in Gorilla than in Homo, suggesting that the Gorilla ERV3 might have been lost by a deletion.

A large proportion of human genomes harbor a polymorphism that results in a truncated ZNF117 protein (Balasubramanian et al., 2011). This single nucleotide polymorphism (rs1404453) introduces a termination codon in the open reading frame resulting in loss of the last 57 amino acids. The putative nucleic acid binding sites are not affected by the truncation. Interestingly, this polymorphism is conserved in other species, suggesting that the shorter protein form might be functionally active.

The human genome contains approximately 40 ERV3-like elements (Kannan et al., 1991; Kjellman et al., 1995; Andersson et al., 2005). Only the copy on chromosome 7q11 has a complete open reading frame for a viral envelope protein; the other open reading frames from this locus are inactivated by non-sense mutations (Kannan et al., 1991). Polymorphisms in the LTR and open reading frame of ERV3 including non-sense mutations that lead to truncated proteins have been observed but no association with diseases has been found (Rasmussen et al., 1996; Rasmussen and Clausen, 1998). Interestingly, approximately 1% of the Caucasian population has mutations in ERV3 that disrupt the open reading frame (de Parseval and Heidmann, 1998). The functional consequences of this inactivation have not been clarified.

ERV3 transcripts are detectable in several normal tissues including lymphoid organs (spleen, lymph nodes, thymus), the gastro-intestinal tract (stomach, duodenum, small bowel, appendix, colon, rectum), the endocrine system (adrenal glands, thyroid), the urinary system (kidney, urinary bladder), placenta, male and female reproductive system (testis, corpus luteum, Fallopian tubes), the respiratory system (lung bronchial epithelium), astrocytes, sebaceous glands, and salivary glands (Larsson et al., 1994; Andersson et al., 1996; Katsumata et al., 1998; Eo et al., 2014; Fei et al., 2014; Kang et al., 2014). The broad expression profile of ERV3 was also found in other species (Schiavetti et al., 2002).

## ERV3 AND IMMUNOPATHOLOGY

The stimulation of the immune system by ERV-encoded antigens might be involved in autoimmunity. ERV-encoded antigens can be recognized by cytotoxic T cells (Haist et al., 1992). Antibody cross-reactivity between exogenous retroviruses and ERV3 peptides has been described (Katsumata et al., 1999) and ERV3 is up-regulated by cytokines in endothelial cells (Sasaki et al., 2009). Indeed, ERV3 has been suggested as an auto-antigen involved in different immune pathologies (Takeuchi et al., 1995; Li et al., 1996; de Parseval et al., 1999; Blank et al., 2009; Nelson et al., 2010, 2014; Kowalczyk et al., 2012). Expression of ERV3 was found to be up-regulated in blood cells but down-regulated in skin biopsies from patients with morphea (Li et al., 1996). ERV3 was detected in synovial tissues from patients with rheumatoid arthritis and osteoarthritis but also in synovial tissues of healthy individuals (Nelson et al., 2010). Altogether, the possible involvement of ERV3 in autoimmunity requires further investigation. Like many other retroviral envelope proteins, ERV3 has a functionally active so-called immunosuppressive domain that can reduce immune responses in mice (Mangeney et al., 2007). Immune responses are governed by several host factors including highly polymorphic systems like the major histocompatibility complex. It seems possible that the balance between immunosuppressive and immunostimulatory activities depends on the individual combination of such factors.

## ERV3 AND CANCER

The role of ERV3 in cancer might vary in different tumor entities. Elevated presence of ERV3 has been detected in colorectal, lung and liver cancer (Ahn and Kim, 2009; Lee et al., 2014). ERV3 is expressed in prostate cancer cells (Wang-Johanning et al., 2003). ERV3 is up-regulated together with other ERV (ERVFRD-1, ERV-PABLB-1, ERVPb-1, ERVV-1, ERVW-1, and ERVW-2) in endometrial carcinoma (Strissel et al., 2012). Besides, ERV3 is co-expressed together with members of the ERVK family and ERVE family in ovarian cancer (Wang-Johanning et al., 2007). Interestingly, 30% of ovarian cancer patients have antibodies against ERV3 whereas such antibodies are not detectable in healthy individuals (Wang-Johanning et al., 2007). This observation underscores the recognition of ERV3 by the immune system. In early studies, ERV3 was not detected in breast cancer (Wang-Johanning et al., 2001). A more recent study observed increased levels of ERV3 in the blood of untreated patients with breast cancer. Levels of ERV3 and other ERV decreased after therapy (Rhyu et al., 2014). Up-regulation of ERV3 in different cancer types might suggest an involvement in the pathogenesis of these diseases.

On the other hand, ERV3 was considered to be a tumor suppressor (Matsuda et al., 1997; Lin et al., 1999, 2000). ERV3 is up-regulated after irradiation of head and neck squamous cell carcinoma cells (Michna et al., 2016), during monocytic differentiation of acute myelogenous leukemia cells (Larsson et al., 1996, 1997; Abrink et al., 1998) as well as during differentiation of normal squamous cells (Otsuka et al., 2006). Demethylation of the ERV3 locus during monocytic differentiation leads to expression of ERV3 and ZNF117 (Andersson et al., 1998). Growth inhibited Hodgkin lymphoma cells express higher levels of ERV3 RNA than proliferating cells (Kewitz and Staege, 2013).

Regulation of ERV3 seems to be cell type specific (Sibata et al., 1997). For instance, ERV3 is up-regulated together with fusogenic ERV envelope proteins in muscle after long-term endurance exercise (Frese et al., 2015). ERV3 is expressed during embryogenesis and a role of ERV3 in developmental processes has been discussed (Andersson et al., 2002). ERV3 expression might be regulated by steroid hormones (Rote et al., 2004). On the other hand, a function of ERV3 in hormone regulation has been suggested (Morrish et al., 2001). In normal placenta, ERV3 is expressed at higher levels in the first trimester of pregnancy than at term (Holder et al., 2012). ERV3 is up-regulated during trophoblast differentiation (Boyd et al., 1993). Like the 5′-long terminal repeats (LTRs) of ERVW-1 and ERVFRD-1, the ERV3 5′-LTR is hypomethylated in cytotrophoblasts during pregnancy (Gimenez et al., 2009). Expression of ERV3 and other ERV in the placenta is reduced in cases of intrauterine growth retardation (Ruebner et al., 2010). The importance of ERV expression in the placenta has, indeed, been known for a long time (Muir et al., 2004). An immunosuppressive function of ERV3 in the context of mother-fetus interaction has been proposed (Venables et al., 1995). Other ERV expressed in the placenta have fusogenic activity. Whether ERV3 has fusogenic activity remains controversial (Boyd et al., 1993; Morrish et al., 2001). Together with syncytin 1 and syncytin 2, ERV3 is down-regulated in hydatidiform moles and malignant gestational trophoblastic tumors in comparison to normal placenta (Bolze et al., 2016). ERV3 expression is absent in choriocarcinoma (Cohen et al., 1988; Kato et al., 1988).

Taken together, it seems that in some tumor entities ERV3 is preferentially expressed in differentiated or growth inhibited cells compared to proliferating tumor cells. Whether ERV3 has growth inhibitory activity in certain cell types remains to be investigated. Rodent (tumor) models for ERV3 (and other genuine human ERV) have the limitation that ERV3 is not naturally present in these species. Therefore, the interaction between immune cells and ERV3 in these models in particular differs markedly from the situation that can be expected in the human system. In vitro systems might be necessary to reconstruct basic aspects of this interaction. Independent of the function of ERV3 in tumor cells, ERV3 might be considered as a target for immunological treatment strategies. The presence of antibodies against ERV3 in some cancer patients indicates that immune responses are possible. Cytotoxic T cells with specificity for ERV3 might be able to kill ERV3 expressing tumor cells. However, the problems of overcoming tolerance on the one hand and avoiding autoimmunity on the other hand have to be solved before ERV3 (which is not a classical cancer antigen) might be useful as an immunological cancer target.

## THE METAZOAN SOS RESPONSE

Based on the presented observations, it remains unclear whether ERV3 can act as a tumor suppressor or a tumor promoting factor. It remains possible that the expression of ERV3 in tumor cells has no impact on tumor growth but is only an epiphenomenon related to relaxed gene expression control. ERV3 transgenic rats show no pathology (Tanaka et al., 2003). The limitations of such animal models have been discussed above. The presence of mutations in ERV3 that disrupt the open reading frame in virtually healthy individuals suggests that ERV3 protein has no essential function. In addition, it seems doubtful whether the numerous mutated non-coding copies of ERV3 (and other ERV) have individual functions. We propose a different function for ERV3 and other ERV loci. It was suggested that ERV3 DNA can form a structure that activates the intracellular DNA sensor cyclic GMP–AMP synthase (Herzner et al., 2015). Activation of this enzyme can trigger an inflammatory pathway. The importance of this pathway is highlighted by the development of autoimmunity in patients with defective double-stranded DNA removal machinery (Stetson et al., 2008). Interestingly, increased ERV expression has been detected in patients with cancer as well as in patients with a spectrum of auto-immune diseases. One of the common features between cancer and auto-immune diseases is the dysfunction of regulatory circuits. Biological systems are characterized by a high level of ultra-stability (Staege, 2014). In cancer cells, normal regulatory circuits are defective. It seems likely that cells have sensor mechanisms that respond to dysfunctional regulatory circuits (DRC). As a consequence of ultra-stability, cells will try to reach alternative steady-state equilibria. The activation of ERV under these conditions might be involved in these mechanisms. DRC can be the consequence of virus infections. If the immune system cannot eliminate this virus directly, the activation of the immune system by EVE can be an alternative pathway that allows elimination of the exogenous virus by varying mechanisms (receptor interference, lysis of EVE-expressing cells by cytotoxic T cells, competition between RNA molecules, and so on). Such a mechanism might be responsible for the detected antibodies against ERV including ERV3 in some cancer patients. This might be one reason why the genomes of virtually all higher organisms contain a plethora of EVE. ERV re-activation in cancer or other diseases can indicate the presence of DRC in these diseases.

In the case of ERV3, loss of ERV3 expression in certain types of cancer can indicate that in these tumors ERV3 expression would otherwise activate the endogenous sensing machinery. The further elucidation of the function of ERV3 and other EVE in health and disease might allow the development of new treatment strategies for cancer and auto-immune diseases.

## CONCLUSION

ERV3 is a Catarrhini-specific EVE with an interesting expression profile in normal tissues, cancer and other diseases. ERV3 is closely linked to the neighboring ZNF117 locus, and the physiological function of neither gene has been clarified. Differential expression of ERV3 in cancer cells and the corresponding normal tissues makes ERV3 a potential target for future therapeutic developments. However, further investigations are necessary in order to elucidate the role of the ERV3/ZNF117 locus in the context of cancer and other diseases as well as the physiological functions of these genes.

## AUTHOR CONTRIBUTIONS

MS designed the study. IV and CS performed the experiments. All authors analyzed the data, wrote the paper, and approved the final version of the paper.

## ACKNOWLEDGMENTS

The authors thank the Stiftung Mitteldeutsche Kinderkrebsforschung, Germany for kind support of our studies (grant number 124/2016).






**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Bustamante Rivera, Brütting, Schmidt, Volkmer and Staege. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# HERVs New Role in Cancer: From Accused Perpetrators to Cheerful Protectors

### Norbert Bannert\*, Henning Hofmann, Adriana Block and Oliver Hohn

HIV and Other Retroviruses, Robert Koch Institute, Berlin, Germany

Initial indications that retroviruses are connected to neoplastic transformation were seen more than a century ago. This concept has also been tested for endogenized retroviruses (ERVs) that are abundantly expressed in many transformed cells. In healthy cells, ERV expression is commonly prevented by DNA methylation and other epigenetic control mechanisms. ERVs are remnants of former exogenous forms that invaded the germ line of the host and have since been vertically transmitted. Several examples of ERV-induced genomic recombination events and dysregulation of cellular genes that contribute to tumor formation have been well documented. Moreover, evidence is accumulating that certain ERV proteins have oncogenic properties. In contrast to these implications for supporting cancer induction, a recent string of papers has described favorable outcomes of increasing human ERV (HERV) RNA and DNA abundance by treatment of cancer cells with methyltransferase inhibitors. Analogous to an infecting agent, the ERV-derived nucleic acids are sensed in the cytoplasm and activate innate immune responses that drive the tumor cell into apoptosis. This "viral mimicry" induced by epigenetic drugs might offer novel therapeutic approaches to help target cancer cells that are normally difficult to treat using standard chemotherapy. In this review, we discuss both the detrimental and the new beneficial role of HERV reactivation in terms of its implications for cancer.

Keywords: human endogenous retrovirus (HERV), HERV, HERV-K, cancer, innate sensing, DNA-methylation, viral mimicry

#### Edited by:

Martin Sebastian Staege, Martin Luther University of Halle-Wittenberg, Germany

#### Reviewed by:

Sunil Joshi, Old Dominion University, United States

Tara Patricia Hurst, Abcam, United Kingdom

#### \*Correspondence:

Norbert Bannert bannertN@rki.de

#### Specialty section:

This article was submitted to Virology, a section of the journal Frontiers in Microbiology

Received: 06 December 2017 Accepted: 25 January 2018 Published: 13 February 2018

#### Citation:

Bannert N, Hofmann H, Block A and Hohn O (2018) HERVs New Role in Cancer: From Accused Perpetrators to Cheerful Protectors. Front. Microbiol. 9:178. doi: 10.3389/fmicb.2018.00178

## INTRODUCTION

Scattered throughout the genomes of all vertebrates are millions of footprints from past invasion events by retroelements; i.e., fragments of genomic DNA that have been retrotranscribed from RNA. Indeed, 43% of the human genome is made up of such elements and 8% of the genome is comprised of retroviruses that infected human ancestors, entering cells of the germ line or proliferating thereafter by retrotransposition (Katzourakis et al., 2005). This "retroviral self" can be classified into more than 30 distinct HERV families (Bannert and Kurth, 2004). By now, all of the known proviral sequences in the human germ line have suffered postinsertional mutations and deletions and have lost the ability to produce replication competent viral particles. However, around 100 of the germ line invaders belonging to the most recently active HERV-K(HML-2) family are full-length (or nearly) (**Figure 1A**). Many of the most recently acquired elements are polymorphic, leading to a diversity of haplotypes in the human population. HERV-K113, for example, a well-studied non-infectious full-length provirus with open reading frames, is present in only about 30% of Africans and 12% of individuals from other parts of the world. It is becoming increasingly evident that differences in our personal "HERVome" heritage can influence individual traits and susceptibility to disease.

This is particularly obvious with regard to the impact of retroviral promoters, splice sites and other regulatory elements on the expression of proteins and non-coding RNAs. Retroviral LTRs are bona fide promoters able to initiate transcription if appropriate transcription factors are present in the nucleus and their access to the LTR is not epigenetically restricted. Under such conditions, mRNAs are produced that occasionally encode functional viral proteins, and in the case of HERV-K(HML-2), non-infectious viral particles are in fact released (Boller et al., 1983). In differentiated healthy cells, however, LTR activity is tightly repressed by epigenetic constraints such as DNA methylation. In contrast, silencing in embryonic stem cells depends primarily on the activity of histone methyltransferases and other histone modifications (Rowe and Trono, 2011). Transcription of retroviral LTRs plays a fundamental role in the maintenance of pluripotency and induction of an antiviral state in those cells (Grow et al., 2015). The physiological role of HERV expression in embryonic stem cells is not the only known example of domestication of these genomic parasites to serve the host, i.e., "exaptation." The best known examples in this regard are the syncytin genes: HERV envelope proteins under positive selection that play an important role in the physiology of the placenta in mammals (Dupressoir et al., 2012; Lavialle et al., 2013).

Conversely, since the early days of HERV research, these elements have been implicated in cellular transformation processes associated with various types of cancer, although recent studies suggest that expression of HERV-derived nucleic acids may also have a beneficial impact in the fight against cancer.

## Implications of HERVs in the Promotion of Transformation

Investigations into human retroviruses and their involvement in cancerogenesis started in the early 1970s with the search for reverse transcriptase activity and virus particles in tumor cells (Sarngadharan et al., 1972; Zhdanov et al., 1973). This search was later extended to retroviral sequences derived from or related to murine retroviruses in the human genome, as several murine retroviruses are established transforming agents (Chumakov et al., 1982; Repaske et al., 1983).

There is a plethora of publications reporting HERV activation in various cancers: breast cancer (Wang-Johanning et al., 2001, 2003, 2008; Burmeister et al., 2004; Contreras-Galindo et al., 2008; Golan et al., 2008; Zhou et al., 2016; Johanning et al., 2017), lymphoma (Contreras-Galindo et al., 2008; Maliniemi et al., 2013; Fava et al., 2016), melanoma (Muster et al., 2003; Buscher et al., 2005; Hirschl et al., 2007; Serafino et al., 2009; Reiche et al., 2010; Stengel et al., 2010; Huang et al., 2013; Singh et al., 2013), ovarian cancers (Gotzinger et al., 1996; Wang-Johanning et al., 2007; Iramaneerat et al., 2011; Heidmann et al., 2017), and prostate cancers (Tomlins et al., 2007; Ishida et al., 2008; Goering et al., 2011; Agoni et al., 2013; Goering et al., 2015). However, to date, there is no conclusive picture emerging regarding the role and impact of HERVs as causative or promoting agents in cancerogenesis, although some well-described examples of links at the DNA and protein levels are known.

## At the DNA Level

Non-allelic recombination of HERV sequences can lead to deletions, duplications, and other chromosomal rearrangements (**Figure 1B**). In some prostate cancer cases, a translocation of the HERV-K\_22q11.23 5′-LTR-UTR sequence upstream of the transcription factor ETS translocation variant 1 (ETV1) has been described, which results in the enhanced expression of the ETV1 oncogene promoting cancerogenesis (Tomlins et al., 2007). LTRs can also act as alternative promoters and dysregulate nearby proto-oncogenes, or growth-promoting cellular genes (**Figure 1B**). For example, it was shown in B cell-derived Hodgkin's lymphoma cells that transcription of the proto-oncogene colony-stimulating factor 1 receptor (CSF1R) is driven by an aberrantly activated LTR promoter of the THE1B retrotransposon, an apparent member of the mammalian LTR retrotransposons (MaLR) family (Lamprecht et al., 2010). Moreover, the same study demonstrated that the derepression of the THE1B LTR in those Hodgkin's lymphoma cells is a consequence of the loss of expression of the transcriptional corepressor CBFA2T3, which leads to disturbed epigenetic control (Lamprecht et al., 2010). A very recent review summarizes several studies addressing the impact of HERV LTR on cellular genes, among other effects (Kassiotis and Stoye, 2017).

## At the Protein Level

Unlike HTLV-1 with its Tax protein, HERVs do not possess a bona fide oncogene. However, expression of some HERV Env proteins may also be detrimental due to their ability to induce cell–cell fusion and may contribute in this way or others to tumorigenesis (Duelli and Lazebnik, 2003; Bjerregaard et al., 2006). In in vitro experiments, an Env protein coded by the HERV-K(HML-2) consensus sequences as well as by several HERV-K(HML-2) elements can interact with a cellular signaling pathway often involved in cancers (Lemaitre et al., 2017), suggesting a potential pro-oncogenic role of these HERV Envs (**Figure 1C**). Another potential effect of HERV Env expression might be the promotion of tumor escape through immune modulation, mediated by the immunosuppressive domain contained in the transmembrane region (Mangeney et al., 2001, 2005; Kudo-Saito et al., 2014).

Moreover, the expression of the HERV-K(HML-2) accessory proteins Rec and Np9 (Magin et al., 1999; Armbruester et al., 2002) can be linked to tumorigenesis (Galli et al., 2005; Chen et al., 2013; Schmitt et al., 2013; Singh et al., 2013; Fischer et al., 2014). Paradoxically, transcripts of rec and np9 from different HERV-K(HML-2) loci appear to be present in various normal human tissues (Schmitt et al., 2015). It is known that Rec and Np9 both interact with the cellular promyelocytic leukemia zinc-finger protein (PLZF) (Boese et al., 2000; Denne et al., 2007), a transcriptional repressor of the c-MYC proto-oncogene. Furthermore, Rec also binds to the testicular zinc-finger protein (TZFP) (Kaufmann et al., 2010) and the human small glutamine-rich tetratricopeptide repeat protein (hSGT) (Hanke et al., 2013), both involved in androgen receptor repression (**Figure 1C**). Rec-driven dysregulation of the androgen receptor signaling may eventually result in tumor induction or promotion (Hanke et al., 2013). In addition to PLZF, Np9 also binds the ligand of Numb protein X (LNX) and therefore interacts with the Numb/Notch signaling cascade (Armbruester et al., 2004). Dysregulation of this pathway has been linked to several cancers (Roy et al., 2007; Downey et al., 2015).

## HERV Expression as a Diagnostic Marker for Tumors

The significantly elevated expression of various HERV elements in cancer cells has spurred studies for their use as biomarkers for malignant transformation, staging, and prognosis of cancers (Harzmann et al., 1982; Hahn et al., 2008; Wallace et al., 2014; Perot et al., 2015; Ma et al., 2016). One of the best candidates for diagnostic purposes in this regard is the HERV-K(HML-2) envelope protein in human breast cancer. Zhao et al. (2011) demonstrated that this gene is expressed in the majority of breast cancers from United States or Chinese women but is generally not expressed, or expressed only at very low levels, in normal breast tissue. They subsequently showed that HERV-K(HML-2) antibodies and mRNA are elevated in the blood of patients at an early stage of this cancer type, and further increase in patients who are at risk of developing metastasis (Wang-Johanning et al., 2014). Thus, screening for HERV-K(HML-2) expression seems to be a promising additional option for early detection in women at increased risk for breast cancer.

## Antitumor Activity of HERVs

In many respects, an endogenous retrovirus is an intermediate between a genuine virus and a regular human gene. This also applies to the immune response directed against HERV-derived nucleic acids and proteins.

## Adaptive Immunity

Immunologic tolerance to HERV-derived proteins and peptides is imperfect. This is presumably due to the tight epigenetic silencing in the thymus and bone marrow that prevents normal deletion of all reactive HERV-specific T and B lymphocytes, respectively. Indeed, immunization of non-human primates with endogenous retrovirus-derived antigens elicits robust polyfunctional T cell responses and high antibody titers (Sacha et al., 2012; Sheppard et al., 2014). In line with these findings, Boller et al. (1997) were among the first to report transient HERV-K(HML-2)-specific antibodies in the plasma of testicular cancer patients that became rapidly undetectable following tumor surgery. Strong CTL responses against epitopes of certain HERV proteins, considered to be another class of tumor-specific antigens, have been found in patients with various types of cancers (Schiavetti et al., 2002; Mullins and Linnebacher, 2012; Rycaj et al., 2015). Although evidence for tumor regression by the action of HERV-specific CTLs exists, the general impact of these antigens on the adaptive antitumoral defense remains largely unclear (Huang et al., 1996; Takahashi et al., 2008). This also holds true for the potential use of tumor-specific HERV-based therapeutic vaccines against various types of cancers.

## Innate Immunity

The nucleic acids derived from endogenous retroviruses in the cytoplasm and other cellular compartments do not escape a response from the innate immune system. Although there are marked differences between the innate immunity of humans and mice, similar principles might act during the control of ERV reactivation and innate sensing. In an elegant study in a mouse model, Yu et al. (2012) have shown that Toll-Like Receptors (TLRs) 3, 7, and 9 are essential for the control of ERVs at least in mice. Mice lacking these receptors develop late-onset leukemia by insertional mutagenesis of reactivated replicating ERVs (Yu et al., 2012). A key factor for the production of anti-ERV antibodies was further attributed to TLR7 thereby linking innate and adaptive immunity (**Figure 2**). Similar to TLR-knockout mice, those with inactivating mutations in the maintenance gene DNA methyltransferase 1 (DNMT1) develop tumors induced by reactivation of replication-competent ERVs (Howard et al., 2008). Significant DNA hypomethylation and ERV activation can also be achieved by treatment with DNA methyltransferase inhibitors (DNMTis), such as 5-azacytidine (Aza) and 5-aza-2′-deoxycytidine (Dac) (Stengel et al., 2010), both of which are approved by the FDA for the treatment of myelodysplastic syndromes (Kaminskas et al., 2005). These inhibitors were initially thought to epigenetically reactivate silenced tumor suppressor genes in malignant cells and render these cells therefore more prone to apoptosis, but it is also recognized that DNMTis induce immune responses against cancer cells. Chiappinelli et al. (2015) and Roulois et al. (2015) demonstrated a link between DNMTi-induced activation of HERV expression and innate sensing of transcribed viral RNAs and activation of innate immunity signaling pathways leading to an inhibition of tumor cell growth. These results represent a paradigm shift in our comprehension of the antitumor activity of demethylating agents.
The authors demonstrated that DNMTis induce HERV-derived dsRNAs that are sensed primarily by TLR3 (localized at the endosomal membrane) and by the cytosolic CARD-domain family protein MDA5, a known pattern recognition receptor. Following dsRNA binding, the CARD domain of MDA5 interacts with the mitochondrial antiviral signaling protein MAVS located on mitochondrial membranes. This leads to the activation of NF-κB and the phosphorylation and nuclear import of IRF7 resulting in a profound interferon (IFN) response (**Figure 2**). TLR3 binding to dsRNAs of endogenous origin also leads to both IRF and NF-κB activation. In these experiments, Aza treatment also induced partial demethylation of the IRF7 gene and increased its expression. Due to the slow response toward DNMTi treatment, the type I IFN response and thereby IFN release and activation of numerous IFN Stimulated Genes (ISGs) are delayed and peak about 1 week after treatment (Chiappinelli et al., 2015). Interestingly, while in the ovarian cancer cell model, the outcome of a low dose Aza treatment was dominated by an IFN I response (Chiappinelli et al., 2015; Stone et al., 2017), in colorectal cancer cells, activated IFN III genes were more preponderant (Roulois et al., 2015). The latter group also demonstrated that cancer initiating cells, defined by their ability to self-renew and difficult to reach by standard chemotherapy, were importantly targeted by treatment. The overall outcome of the innate immune response was a suppression of tumor cell proliferation and enhanced apoptosis. Moreover, Aza administration sensitized murine melanoma cells to anti-CTLA-4 treatment and high expression of ISGs correlated with a sustained clinical response to anti-CTLA-4. A treatment combining DNMTis and immune checkpoint inhibitors (e.g., anti-CTLA-4) is therefore regarded as extremely promising. Supplementation with vitamin C was also suggested to increase the effect of the DNMTis and was shown to be effective (Liu et al., 2016). Vitamin C promotes DNA demethylation through increased activity of the three so-called "Ten-Eleven Translocation (TET)" enzymes, which convert 5-methylcytosine to 5-hydroxymethylcytosine (Tahiliani et al., 2009).

FIGURE 2 | Antitumor activity mediated by innate sensing of HERV RNA and DNA. Treatment of cancer cells with DNMTis results in demethylation of the LTR promoters and HERV transcription. Viral nucleic acids in the cytoplasm and other cellular compartments (viral mimicry) are sensed by TLRs and other innate sensors including MDA5. Upon activation these pattern recognition receptors initiate signal transduction pathways that result in production of type I and III IFNs and pro-inflammatory cytokines further leading to promotion of an adaptive immune response (e.g., anti-HERV antibodies).

The induction of immune responses by unleashing ERV expression from epigenetic restrictions has been termed "viral mimicry," i.e., a cellular response similar to those seen after infection with an exogenous virus (**Figure 2**). Results from these studies using Aza treatment have been corroborated by others in different settings (Nicholas et al., 2017). In one of these studies, a 3D culture system of murine intestinal tumors was used to demonstrate that DNMT knockdown by Aza treatment significantly reduced cell proliferation in tumor organoids (Saito et al., 2016). More recently, the anticancer agent RRx-001, a dinitroazetidine derivative that is currently being tested in phase II clinical trials, has also been shown to elicit an IFN response through epigenetic induction of viral mimicry. The remarkable safety profile of this immunomodulatory anticancer agent makes it a leading candidate for future clinical applications (Zhao et al., 2017).

Although reactivated endogenous elements have been roughly categorized at the family level, the actual chromosomal loci remained unknown. The identification of these loci might help to better understand epigenetic changes and cancer-specific differences. In this context, it might also explain the surprisingly frequent occurrence of bidirectional transcription in many ERVs which appears to be the underlying reason for the strong activation of MDA5, as this innate sensor requires extended dsRNA structures for efficient activation (Mu et al., 2016).

There are many additional options for the enhancement and advancement of an ERV-mediated epigenetic cancer therapy. One approach might be the inhibition of dsRNA or DNA degradation in cancer cells by blocking the activity of the respective nucleases. Such an accumulation of ERV nucleic acids has been reported in cells from individuals bearing inactivating mutations in nucleases that normally clear nucleic acid from the cytoplasm or influence their metabolism in the cytoplasm. The most prominent examples are mutations in Three-prime Repair Exonuclease 1 (TREX1) or Sam domain and HD domain 1 (SAMHD1), that are associated with the rare autoinflammatory disease Aicardi-Goutieres syndrome (AGS) characterized by an exaggerated type I IFN response (van Montfoort et al., 2014).

## CONCLUSION

It has been recognized for many years that endogenous retroviruses and other retroelements contribute to malignant diseases as well as to inflammatory and autoimmune disorders at the DNA and presumably at the protein level. However, until recently, it escaped attention that an increased expression of HERV-derived nucleic acids also has an adverse effect on cancer cells and that this effect could be the basis of novel therapeutic approaches. Importantly, specific targeting of the neoplastic cell will be essential to avoid jumping from the "frying-pan" into the fire. Aberrant reactivation and expression of HERVs in healthy tissue not only bears the risk of new transformations and autoimmune diseases, but also might influence cellular physiology by activating HERV promoters that act on cellular genes. Curing cancers by activating HERVs that instigate an innate immune response is surely an appealing concept with high expectations and is worth investing significant effort in the future.

## AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to this mini review, and approved it for publication.

## FUNDING

This work was supported by the priority program "Innate Sensing and Restriction of Retroviruses (SPP1923)" of the German Research Foundation (DFG). HH was supported by a Mathilde Krim fellowship, phase II from amfAR (108982-57-RKGN).

## ACKNOWLEDGMENTS

We thank Drs. Stephen Norley and Benoit Barbeau for helpful discussions.

## REFERENCES

Agoni, L., Guha, C., and Lenz, J. (2013). Detection of human endogenous retrovirus K (HERV-K) transcripts in human prostate cancer cell lines. Front. Oncol. 3:180. doi: 10.3389/fonc.2013.00180

Armbruester, V., Sauter, M., Krautkraemer, E., Meese, E., Kleiman, A., Best, B., et al. (2002). A novel gene from the human endogenous retrovirus K expressed in transformed cells. Clin. Cancer Res. 8, 1800–1807.

Armbruester, V., Sauter, M., Roemer, K., Best, B., Hahn, S., Nty, A., et al. (2004). Np9 protein of human endogenous retrovirus K interacts with ligand of numb protein X. J. Virol. 78, 10310–10319. doi: 10.1128/JVI.78.19.10310-10319.2004




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Bannert, Hofmann, Block and Hohn. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Human Endogenous Retroviruses and Their Putative Role in the Development of Autoimmune Disorders Such as Multiple Sclerosis

### Victoria Gröger and Holger Cynis\*

Department of Drug Design and Target Validation, Fraunhofer Institute for Cell Therapy and Immunology, Halle, Germany

Human endogenous retroviruses (HERVs) are remnants of retroviral germ line infections of human ancestors and make up ∼8% of the human genome. Under physiological conditions, these elements are frequently inactive or non-functional due to deactivating mutations and epigenetic control. However, they can be reactivated under certain pathological conditions and produce viral transcripts and proteins. Several disorders, like multiple sclerosis or amyotrophic lateral sclerosis are associated with increased HERV expression. Although their detailed contribution to individual diseases has yet to be elucidated, an increasing number of studies in vitro and in vivo suggest HERVs as potent modulators of the immune system. They are able to affect the transcription of other immune-related genes, interact with pattern recognition receptors, and influence the positive and negative selection of developing thymocytes. Interestingly, HERV envelope proteins can both stimulate and suppress immune responses based on different mechanisms. In the light of HERV proteins becoming an emerging drug target for autoimmune-related disorders and cancer, we will provide an overview on recent findings of the complex interactions between HERVs and the human immune system with a focus on autoimmunity.

#### Edited by:

Gkikas Magiorkinis, National and Kapodistrian University of Athens, Greece

#### Reviewed by:

Tara Patricia Hurst, Abcam, United Kingdom

Masaaki Miyazawa, Kindai University, Japan

#### \*Correspondence:

Holger Cynis holger.cynis@izi.fraunhofer.de

#### Specialty section:

This article was submitted to Virology, a section of the journal Frontiers in Microbiology

Received: 05 October 2017 Accepted: 05 February 2018 Published: 20 February 2018

#### Citation:

Gröger V and Cynis H (2018) Human Endogenous Retroviruses and Their Putative Role in the Development of Autoimmune Disorders Such as Multiple Sclerosis. Front. Microbiol. 9:265. doi: 10.3389/fmicb.2018.00265

Keywords: HERV, immune system, autoimmunity, superantigen, disease

## INTRODUCTION

Retroelements constitute a large portion (42%) of our genome (Lander et al., 2001; Cho et al., 2008; Young et al., 2013). These transposable elements, which replicate via RNA intermediates, are often neglected, even though their contribution to human biology is still poorly understood.

Retroelements are distinguished by the presence or absence of long terminal repeats (LTRs), which are fundamental for the regulation of retroviral gene expression (Mita and Boeke, 2016). Short interspersed nuclear elements (SINEs, without reverse transcriptase) and long interspersed nuclear elements (LINEs, with reverse transcriptase) are retroelements that do not possess LTRs (Mita and Boeke, 2016). LTR-positive retroelements encompass 8% of the human genome (Lander et al., 2001; Balada et al., 2009). They are called either retrotransposons or human endogenous retroviruses (HERVs) according to the absence or presence of the envelope (env) gene, respectively. Hence, HERVs represent the most complete form of retroelements. They entered the primate genome through exogenous retrovirus infections (Belshaw et al., 2004; Young et al., 2013). Retroviruses usually infect somatic cells, but on occasion germ line cells are also targeted. As a consequence, retroviral sequences were transmitted vertically to the offspring in a Mendelian manner and became fixed in the human population (Christensen, 2010).

HERVs are extensively distributed throughout the human genome due to amplification and transposition events. Based on sequence similarities to exogenous retroviruses, HERVs belong to class I (gamma- and epsilon-like), class II (lenti-, alpha-, beta-, and delta-like) or class III (spuma-like) retroviruses (Gifford et al., 2005; Balada et al., 2010). Phylogenetic studies revealed that at least 30 different HERV families exist in the human genome, each resulting from a distinct infection of the germ line (Bénit et al., 2003; Katzourakis et al., 2005; Stoye, 2012). Among them, HERV-K (HML-2) elements integrated most recently and thus are the most intact and biologically active forms (Marchi et al., 2014). Although the number and diversity of HERVs are huge, nomenclature is still not standardized. While most HERVs are named after the tRNA species used to prime reverse transcription (e.g., HERV-W for tryptophan tRNA), some names are still linked to the approaches applied for their identification. For more details refer to Vargiu et al. (2016).

In the course of human evolution most HERVs have accumulated mutations, which rendered a large fraction of their retroviral sequences non-functional (de Parseval and Heidmann, 2005). Only two full-length proviruses with complete reading frames for all viral genes are known, both from the most recently integrated HERV-K family (HERV-K113, HERV-K115) (Turner et al., 2001). However, no infectious endogenous retrovirus has yet been identified in humans (Balada et al., 2009; Stoye, 2012). Nevertheless, intact open reading frames of single retroviral genes have persisted in the genome. These give rise to RNA transcripts as well as proteins, which suggests that they have functions in the human body (de Parseval and Heidmann, 2005).

In this regard, a well-investigated example is syncytin-1, an ancient Env protein from the HERV-W family. It is a 60 kDa viral glycoprotein with fusogenic properties and has an essential function in placental development in humans (Dupressoir et al., 2012; Bolze et al., 2017). Syncytins that share functional properties but derive from independent integration events of multiple ERV lineages are also important for placental development in many other mammals (Dupressoir et al., 2012; Imakawa and Nakagawa, 2017). Furthermore, HERV transcripts are upregulated during early human embryogenesis, with possible implications for early viral defense pathways (Grow et al., 2015).

In surveys of the human genome, a limited number of 16 coding env genes were identified (de Parseval et al., 2003; Villesen et al., 2004). Although it cannot be excluded that shorter ORFs may play a role in cellular processes, it is more probable for long ORFs to have retained their original function. Consequently, the human genome bears a number of retroviral proteins with putative roles in pathophysiological conditions (Hansen et al., 2017). As an example, in amyotrophic lateral sclerosis (ALS), recent research suggested a possible involvement of HERVs (Alfahad and Nath, 2013). It was shown that HERV-K expression in human neurons causes retraction and beading of neurites (Li et al., 2015). As the virus was found to be expressed in neurons of ALS patients but not in neurons of healthy controls it was concluded that HERV-K expression might contribute to neurodegeneration (Li et al., 2015). These results are supported by findings showing increased HERV-K expression in brain tissue of ALS patients compared to non-ALS individuals (Douville et al., 2011).

The focus of the present mini-review is the putative interaction of HERV proteins with the human immune system. Different mechanisms have been proposed to explain how HERVs interact with the immune response. With a focus on adaptive immune mechanisms, superantigen motifs and viral proteins will be discussed. Concerning innate immunity, the interaction of HERVs with pattern recognition receptors (PRRs) such as Toll-like receptor 4 (TLR4) and cluster of differentiation (CD) 14 is described. The immunosuppressive function of HERVs will also be addressed.

## INTERACTION OF HERV PROTEINS WITH THE HUMAN IMMUNE SYSTEM

As part of the human genome, HERV-encoded proteins should be considered as self-antigens and tolerated by the immune system. However, they could be perceived as neo-antigens if not expressed in the thymus during acquisition of immune tolerance (Balada et al., 2009). Moreover, once descended from exogenous viruses, HERVs share sequence homologies with their ancestors, which could provide antigenic epitopes for lymphocyte recognition (Voisset et al., 2008). The underlying mechanism is called molecular mimicry. Here, proteins of infectious agents such as viruses or bacteria and self-derived proteins share structural, functional or immunological similarities. In this light, sequence similarities between Env proteins of HERV-W and myelin are supposed to potentially trigger an immune response in multiple sclerosis (MS) (Ramasamy et al., 2017). There are a number of computationally predicted epitopes, which are shared between retroviruses and host proteins, although biological significance is not always given (Fujinami et al., 2006). Nevertheless, molecular mimicry could help to explain how viral infection leads to autoimmunity.

Retroviral nucleic acids and viral proteins can be sensed by a variety of PRRs, such as Toll-like receptors (TLRs) or NOD-like receptors (Thompson et al., 2011). It is conceivable that HERV-encoded proteins are able to trigger PRRs of the innate immune system, leading to an induction of autoimmunity (Tugnet et al., 2013). A direct interaction between certain HERV proteins and TLRs has been shown. As an example, the surface unit of HERV-W Env binds to TLR4 and CD14 and stimulates the production of pro-inflammatory cytokines including IL-1 beta, IL-6, and TNF-alpha (Rolland et al., 2006). A more detailed description of innate immune response activation by HERVs has been compiled by Hurst and Magiorkinis (2015).

Retroviral envelope proteins are hypothesized to both trigger and suppress an immune response. In this context, a peptide of 14 amino acids (LQARILAVERYLKD) located in the transmembrane (TM) glycoprotein gp41 of HIV-1 inhibits mitogen-induced and lymphokine-dependent T-lymphocyte proliferation (Denner et al., 1994; Mühle et al., 2017). It also modulates cytokine levels, increasing IL-6 and IL-10 and decreasing IL-2 and CXCL9 expression in human peripheral blood mononuclear cells (PBMCs) (Denner et al., 2013). Thereby, it allows the virus to persist and replicate in host cells (Blinov et al., 2013; Denner, 2014). This short sequence, called the immunosuppressive domain (ISD), is highly conserved among retroviruses. It was first described for murine and feline C-type retroviruses and later extended to human T-lymphotropic virus (HTLV) and HIV (Haraguchi et al., 1997). A similar but not identical sequence located N-terminally to the immunodominant Cys–Cys loop can be found in some HERV families including HERV-W, HERV-FRD, and HERV-K (Morozov et al., 2013). A recombinant TM protein and a peptide corresponding to the ISD in HERV-K were shown to inhibit proliferation of human immune cells and to modulate cytokine release similarly to the ISD of HIV-1 (Morozov et al., 2013), although corroboration of these findings by other groups is pending. Moreover, the envelope protein Env59 of HERV-H shows anti-inflammatory effects in an experimental arthritis model (Laska et al., 2016). In contrast to a study by Tolosa et al. showing a reduced immune response of PBMCs to treatment with LPS and syncytin-1 (HERV-W) (Tolosa et al., 2012), Mangeney et al. described immunomodulatory properties for syncytin-2 (HERV-FRD) but not for syncytin-1 (Mangeney et al., 2007). However, the replacement of two amino acids in the ISD of syncytin-1 with those of syncytin-2 was able to restore the immunosuppressive function (Mangeney et al., 2007). Therefore, syncytins may help to protect the fetus from the mother's immune system (Blaise et al., 2003; Mangeney et al., 2007). HERVs might also help tumor growth by shielding the tumor from the host immune system (Kudo-Saito et al., 2014). This was shown for a synthetic peptide corresponding to the ISD of HERV-H, which causes CCL19-mediated CD271<sup>+</sup> cell-governing immunosuppression in stimulated human tumor cells (Kudo-Saito et al., 2014). HERV-H could thus also be an important factor undermining immune defense in cancer. Although the association of HERVs with cancerous tissues is beyond the scope of this review, it has been hypothesized that immune suppression by HERVs could contribute to tumor immune evasion.

## REGULATION OF HERV EXPRESSION

HERV expression is tightly regulated by the host through epigenetic mechanisms, which results in varying expression from tissue to tissue (Hurst and Magiorkinis, 2017). Control of HERV expression depends upon regulation of the LTRs, which are able to bind nuclear transcription factors and function as promoters (Hurst and Magiorkinis, 2017). Both CpG methylation of DNA and histone deacetylation keep HERVs silenced, although histone modifications alone were shown to be insufficient for efficient transcription suppression (Hurst et al., 2016). Retroviral genes are heavily methylated in normal tissues, whereas tumors show increased levels of HERV transcripts due to hypomethylation (Cegolon et al., 2013). In addition to epigenetic regulation, other factors including hormones, microorganisms, and the environment were shown to modulate HERV expression (Balada et al., 2009; Emmer et al., 2014).

In this regard, the Epstein-Barr virus (EBV) is able to transactivate the expression of the normally inactive HERV-K18 Env protein, e.g., in resting B lymphocytes via CD21 receptor interaction (Sutkowski et al., 2001; Hsiao et al., 2006; Balada et al., 2009). The mechanism of transactivation was further shown to depend on the expression of the major EBV latent gene transactivator EBNA-2 (Sutkowski et al., 2004). In-depth analysis identified the EBV latent membrane protein LMP-2A as a strong candidate for the transactivation of HERV-K18 (Sutkowski et al., 2004). Furthermore, Stauffer et al. showed that interferon-α upregulates transcription of the HERV-K18 env gene, suggesting an indirect connection between viral infections and autoimmune disorders (Stauffer et al., 2001). This is of great interest since HERV-K18 has been reported to have superantigen activity (Sutkowski et al., 2001; Tai et al., 2006), although conflicting data have also been published (Lapatschek et al., 2000; Azar and Thibodeau, 2002).

Superantigens activate B- and T-lymphocytes regardless of the specificity of their antigen receptor. They are produced by bacteria and viruses and do not need to be processed as conventional antigens for antigen presentation (Solanki et al., 2008). They bind to conserved regions of major histocompatibility complex (MHC) class II molecules outside of the classical peptide-binding groove and connect them with a subset of T-cells expressing particular T cell receptor (TCR) β chain variable region genes (Solanki et al., 2008). This is different from conventional T-cell activation where highly variable TCR α and β chains CDR3 regions are bound (Sutkowski et al., 2001). Therefore, superantigens can stimulate many subsets of T-cells expressing the same Vβ genes, followed by massive cytokine secretion (Solanki et al., 2008).

In this context, the first HERV superantigen was isolated by Conrad et al. from pancreatic islets of patients with type 1 diabetes (T1D) (Conrad et al., 1997). They showed that the Env protein of this new HERV, initially named IDDMK1,222, has properties of a Vβ7-specific superantigen. Sequence analysis revealed that IDDMK1,222 corresponds to one allele of the polymorphic HERV-K18 env (Stauffer et al., 2001). Sutkowski et al. further showed an activation of TCR Vβ13 T cells in response to murine B cells transfected with the HERV-K18 env gene (Sutkowski et al., 2001). Tai and colleagues found similar results for K18 Env in mice, where it expands Vβ7 and Vβ13 T cells (Tai et al., 2006; Emmer et al., 2014). Although HERV-K18 Env seems to possess superantigenic properties, its contribution to the pathogenesis of T1D remains unclear. Contrary to studies supporting the initial association of the putative superantigen with T1D (Kinjo et al., 2001; Marguerat et al., 2004), four independent studies challenged this hypothesis (Badenhoop et al., 1999; Jaeckel et al., 1999; Knerr et al., 1999; Muir et al., 1999). In summary, the expression of HERVs in the human body is subject to strict regulation, and pathological alterations of this regulation can lead to an increase in HERV transcripts and proteins.

## IMPLICATIONS FOR AUTOIMMUNE DISORDERS

The diversity of as many as 80 different types of autoimmune disorders as well as their clinical resemblance often makes diagnosis difficult. It is known that many different genetic loci with small effect sizes predispose individuals to develop autoimmunity, but in addition, environmental factors play a role in triggering the immune response (Ercolini and Miller, 2009). Here, HERVs might play an important role in the homeostasis of the immune system and could be key players when it comes to development of autoimmunity.

Studies that show an association between HERVs and autoimmune diseases either rely on retroviral antigens at the site of disease or the presence of antiretroviral antibodies in the sera of patients (Herve et al., 2002; Mameli et al., 2007; Laska et al., 2012; Alfahad and Nath, 2013). It has been hypothesized that HERVs are involved in the pathogenesis of diseases characterized by dysregulated immune response, such as autoimmune diseases (**Table 1**). However, whether HERVs are causative or only a consequence of disease is still under debate, as the expression of HERV mRNA or proteins at the site of tissue injury alone is insufficient to prove a pathogenic role of HERVs.

TABLE 1 | Summary of HERVs associated with inflammatory diseases mainly through genetic, serological, and molecular studies.

| Diagnosis | HERV | Main results | References |
| --- | --- | --- | --- |
| MS | HERV-W | Meta-analysis of HERV-W viral protein and/or mRNA expression in peripheral blood, CSF, and brain of MS patients reveals an association between HERV-W and MS | Morandi et al., 2017 |
| | | Accumulated HERV-W Gag expression in axonal structures and endothelial cells of active MS lesions, HERV-W Env expression in macrophages is restricted to early MS lesions | Perron et al., 2005 |
| | | HERV-W Env is upregulated within MS plaques and correlated with the extent of active demyelination and inflammation, significantly greater accumulation of HERV-W-specific RNAs in MS brains vs. controls | Mameli et al., 2007 |
| | | HERV-W Env is dominantly expressed in macrophages and microglia in areas of active demyelination | van Horssen et al., 2016 |
| | | HERV-W Env is present in macrophages within MS brain lesions with particular concentrations | |


MS, Multiple sclerosis; ALS, Amyotrophic lateral sclerosis; SLE, Systemic lupus erythematosus; RA, Rheumatoid arthritis; SS, Sjögren's syndrome; JIA, Juvenile idiopathic arthritis.

As a prominent example, the association of HERVs with MS is extensively discussed (Morandi et al., 2017). The multiple sclerosis associated retrovirus (MSRV) has been observed in leptomeningeal cells shed into cerebrospinal fluid of a patient with progressive MS (Perron et al., 1991). MSRV belongs to a then unknown HERV-W family and encodes a viral envelope protein that is physiologically expressed in microglia cells of normal brain (Perron et al., 2005). It becomes deregulated and is highly expressed in macrophages of active lesions in MS patients (Perron et al., 2005). In rat and human oligodendroglial precursor cells, HERV-W/TLR4 interaction causes both an increase in pro-inflammatory cytokines and nitrosative stress through increased release of inducible nitric oxide synthase. As a result, oligodendroglial differentiation is reduced, which might be the cause of impaired myelin repair observed in MS (Kremer et al., 2013). Antony and colleagues reported similar results for HERV-W Env expression in astrocytes as it leads to neuroinflammation and death of oligodendrocytes (Antony et al., 2004). Interestingly, treatment with specific antibodies against MSRV Env could prevent MS symptoms in a mouse model of experimental autoimmune encephalomyelitis (Perron et al., 2013). Clinical phase 2b studies with the same humanized antibody are currently under way in 12 European countries (CHANGE-MS study) with the possibility of an extension (ANGEL-MS study) for patients that have been enrolled in the CHANGE-MS study (Curtin et al., 2015; GeNeuro, 2017). These studies appear promising in terms of the development of potential novel therapies for MS.

HERV-Fc1, which has the potential to express a full-length Env product of 584 aa and a Gag product of 470 aa, might also be involved in the pathogenesis of MS (Nexø et al., 2015). Laska et al. showed an increased expression of HERV-Fc1 Gag in PBMCs and four times higher RNA levels in plasma of patients suffering from active MS compared to healthy controls (Laska et al., 2012). HERV-Fc1 is unusual among human proviruses in having only a single known integration in the genome (on the X chromosome; Nissen et al., 2012). This locus seems to be genetically associated with MS (Hansen et al., 2011; Nexø et al., 2016). Similarly, homozygous carriers of K18.3, which is one of three allelic forms of HERV-K18 Env and displays superantigenic properties, show an increased risk for MS compared to individuals carrying two K18.2 alleles (Tai et al., 2008).

A possible mechanism of HERV action in MS is inferred from the findings of pre-active plaques in MS patients. These are clusters of activated microglia present in the absence of demyelination and infiltrating leukocytes (van der Valk and Amor, 2009). They can be detected by magnetic resonance imaging (MRI) several months before the appearance of an active lesion (Fazekas et al., 2002). Oligodendrocyte abnormalities and primary damage to myelin appear to be crucially involved (van der Valk and Amor, 2009). Based on these results and HERV expression in active MS lesions (Mameli et al., 2007; Perron et al., 2012; van Horssen et al., 2016), it is tempting to speculate that pathological alterations in MS are supported by HERV protein expression contributing to plaque formation. Further evidence for the role of HERVs in MS would improve our understanding of the etiology and provide new therapeutic insights into MS.

## CONCLUSION

The findings described here suggest that HERV elements may play a role in the pathogenesis of human diseases such as MS or ALS. Particularly in MS, it is conceivable that the formation of HERV Env proteins triggers a damaging cascade that eventually leads to the symptoms of the disease. This assumption could help to integrate unexpected findings, such as pre-active plaques, into the sequence of pathological events (Christensen, 2017). A deeper understanding of HERV expression under physiological and pathophysiological conditions and of the interaction of HERVs with the immune system might help to better explain and combine several factors that contribute to MS. In this regard, the first therapies targeting a specific HERV-W Env protein are currently in clinical trials and may provide further evidence of the validity of this novel approach in the near future.

## AUTHOR CONTRIBUTIONS

VG and HC: designed the outline of the manuscript; VG: wrote the manuscript; HC: supervised the writing, edited, and approved the final version of the manuscript.

## FUNDING

The work was supported by an institutional fund from the state government of Sachsen-Anhalt, Germany.

## ACKNOWLEDGMENTS

The authors are grateful to Malte E. Kornhuber, Martin S. Staege, and Alexander Emmer for helpful comments during preparation of the manuscript. The help of Helen Crehan and Susan Barendrecht in reviewing language and grammar is also acknowledged.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Gröger and Cynis. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

fmicb-09-00287 February 20, 2018 Time: 16:41 # 1

# Investigation of Endogenous Retrovirus Sequences in the Neighborhood of Genes Up-regulated in a Neuroblastoma Model after Treatment with Hypoxia-Mimetic Cobalt Chloride

Christine Brütting<sup>1,2</sup>\*, Harini Narasimhan<sup>1</sup>, Frank Hoffmann<sup>3</sup>, Malte E. Kornhuber<sup>2</sup>, Martin S. Staege<sup>1</sup> and Alexander Emmer<sup>2</sup>

<sup>1</sup> Department of Surgical and Conservative Paediatrics and Adolescent Medicine, Martin Luther University of Halle-Wittenberg, Halle, Germany, <sup>2</sup> Department of Neurology, Martin Luther University of Halle-Wittenberg, Halle, Germany, <sup>3</sup> Department of Neurology, Hospital "Martha-Maria" Halle-Dölau, Halle, Germany

#### Edited by:

Gkikas Magiorkinis, National and Kapodistrian University of Athens, Greece

#### Reviewed by:

Tara Patricia Hurst, Abcam, United Kingdom

Masaaki Miyazawa, Kindai University, Japan

#### \*Correspondence:

Christine Brütting christine.bruetting@uk-halle.de

#### Specialty section:

This article was submitted to Virology, a section of the journal Frontiers in Microbiology

Received: 29 September 2017 Accepted: 07 February 2018 Published: 21 February 2018

#### Citation:

Brütting C, Narasimhan H, Hoffmann F, Kornhuber ME, Staege MS and Emmer A (2018) Investigation of Endogenous Retrovirus Sequences in the Neighborhood of Genes Up-regulated in a Neuroblastoma Model after Treatment with Hypoxia-Mimetic Cobalt Chloride. Front. Microbiol. 9:287. doi: 10.3389/fmicb.2018.00287

Human endogenous retroviruses (ERVs) have been found to be associated with different diseases, e.g., multiple sclerosis (MS). Most human ERVs integrated in our genome are not competent to replicate and these sequences are presumably silent. However, transcription of human ERVs can be reactivated, e.g., by hypoxia. Interestingly, MS has been linked to hypoxia for decades. As some patterns of demyelination are similar to white matter ischemia, hypoxic damage is discussed as a contributing factor. Therefore, we are interested in the association between hypoxia and ERVs. As a model, we used human SH-SY5Y neuroblastoma cells after treatment with the hypoxia-mimetic cobalt chloride and analyzed differences in the gene expression profiles in comparison to untreated cells. The vicinity of up-regulated genes was scanned for endogenous retrovirus-derived sequences. Five genes were found to be strongly up-regulated in SH-SY5Y cells after treatment with cobalt chloride: clusterin, glutathione peroxidase 3, insulin-like growth factor 2, solute carrier family 7 member 11, and neural precursor cell expressed developmentally down-regulated protein 9. In the vicinity of these genes we identified large (>1,000 bp) open reading frames (ORFs). Most of these ORFs showed only low similarities to proteins from retro-transcribing viruses. However, we found very high similarity between retrovirus envelope sequences and a sequence in the vicinity of neural precursor cell expressed developmentally down-regulated protein 9. This sequence encodes the human endogenous retrovirus group FRD member 1; the encoded protein product is called syncytin 2. Transfection of syncytin 2 into the well-characterized Ewing sarcoma cell line A673 did not modulate the low immunostimulatory activity of this cell line. Future research is needed to determine whether the identified genes and the human endogenous retrovirus group FRD member 1 might play a role in the etiology of MS.

Keywords: endogenous retroviruses, open reading frames, ERVFRD-1, HERV-FRD, human endogenous retrovirus group FRD member 1, hypoxia, multiple sclerosis, neural precursor cell expressed developmentally downregulated protein 9 (NEDD9)

## INTRODUCTION

Endogenous retroviruses (ERVs) are viral elements that are present in the genomes of virtually all species including human beings (Hayward and Katzourakis, 2015). At least 8% of the human genome is composed of endogenous retroviral sequences (Lander et al., 2001). These sequences were integrated into the human genome in the course of evolution (Emerman and Malik, 2010). The great majority of ERVs are stabilized in the genome, but there is still ongoing or potential ERV genotype modification from parents to offspring through generations. Like other genes, ERVs are susceptible to mutations, and proviral DNAs are predisposed to accumulate mutations as these sequences are usually not vital for host survival and thus not under strong selective pressure. The majority of ERVs integrated in our genome are not competent to replicate and most ERV sequences are presumably silent (Jern and Coffin, 2008). Nevertheless, about one third of all ERV sequences in the genome were found to be transcriptionally active (Pérot et al., 2012). Some of these sequences still have open reading frames (ORFs) and, therefore, have the potential to code for a protein or peptide (Dupressoir et al., 2012; Wildschutte et al., 2016). ERVs can be reactivated by some herpes viruses such as Epstein–Barr virus (Mameli et al., 2012). Another possibility is the reactivation of ERV expression by hypoxia (Kewitz and Staege, 2013; Kulkarni et al., 2017). ERV-encoded superantigens might lead to hyper-stimulation of the immune system and tissue damage. In addition, fusogenic activity of ERV envelope proteins might have direct cytopathic effects, which might be involved in MS pathogenesis independent of autoimmune mechanisms. Indeed, cell fusion has been detected in MS brain lesions as well as in animal models of MS (Kemp et al., 2012; Sankavaram et al., 2015). A working model for ERV reactivation and its consequences is presented in **Figure 1**.

Endogenous retroviruses have contributed to certain physiological genes (i.e., syncytins) through modifications (Blond et al., 1999; Mi et al., 2000; Soygur and Moore, 2016) and may in some cases protect the host against exogenous retrovirus infections (Malfavon-Borja and Feschotte, 2015). On the other hand, ERVs have also been found to be associated with different diseases (Dolei, 2006; Balada et al., 2009), e.g., schizophrenia and bipolar disorder (Perron et al., 2012), type 1 diabetes mellitus (Mason et al., 2014), or cancer (Goering et al., 2015) as well as multiple sclerosis (MS) (Perron and Lang, 2010; De la Hera et al., 2014). Several ERVs are considered to be associated with MS (Christensen, 2010). For example, human ERV-W envelope mRNA expression was found to be selectively up-regulated in brain tissue from individuals with MS as compared with controls (Antony et al., 2004). In addition, HERV-H Env and HERV-W Env are increased on the surface of B cells and monocytes of MS patients (Brudek et al., 2009).

Multiple sclerosis is a chronic immune-mediated inflammatory disease of the central nervous system (CNS) with characteristic patchy demyelination. It is the most common chronic disabling CNS disease in young adults and affects about 2.3 million people around the world (Browne et al., 2014). The etiology of MS has not been completely elucidated so far; the causes of MS are hypothesized to be multifactorial, including environmental influences (Islam et al., 2007) as well as epigenetic and genetic factors (Küçükali et al., 2015; Booth and Parnell, 2017). An autoimmune attack against myelin autoantigens is commonly considered the main event in the pathogenesis of MS (Hemmer et al., 2003; Pender and Greer, 2007). Additionally, ERVs are discussed as contributors to MS (Tselis, 2011; Emmer et al., 2014). Furthermore, MS has been linked to hypoxia for decades (e.g., Fischer et al., 1983; Auer et al., 1995; Trapp and Stys, 2009). Hypoxic damage is hypothesized to be a factor in MS pathogenesis because some patterns of demyelination are similar to white matter ischemia (Lassmann, 2003).

In the present study, we analyzed the effect of the hypoxia-mimetic cobalt chloride (CoCl<sub>2</sub>) on the gene expression profile of human neuronal-like SH-SY5Y neuroblastoma cells in comparison to unstimulated cells. Genes up-regulated in this model are considered to indicate transcriptionally active chromatin regions, which are also susceptible to ERV reactivation. Therefore, the vicinity of up-regulated genes was scanned for endogenous retrovirus sequences in order to identify ERVs that might be involved in the link between hypoxia and MS. In addition, we analyzed the possible immune modulatory activity of the identified syncytin 2 in the A673 cell line system. We used this system because the immunostimulatory activity of A673 cells is well-characterized (Staege et al., 2004; Max et al., 2014; Reuter et al., 2015) and they display gene expression and splicing features similar to those of neuronal cells (Bros et al., 2006). The immunostimulatory activity of this model cell line has been shown to be susceptible to transgenic expression of various molecules such as interleukin 2 (Staege et al., 2004), CD137 ligand (Max et al., 2014), or OX40 ligand (Reuter et al., 2015).

## MATERIALS AND METHODS

## Cell Lines and Cell Culture

Human SH-SY5Y neuroblastoma cells (Biedler et al., 1973) were obtained from the Deutsche Sammlung von Mikroorganismen und Zellkulturen (DSMZ, Braunschweig, Germany). A673 Ewing sarcoma cells (Giard et al., 1973) were obtained from the American Type Culture Collection (Manassas, VA, United States). All cells were cultured in Dulbecco's Modified Eagle Medium (DMEM, PAA, Pasching, Germany), supplemented with 10% fetal calf serum, 100 U/mL penicillin, and 100 µg/mL streptomycin at 37°C in a humidified atmosphere with 5% CO<sub>2</sub>. For simulation of hypoxia, a fresh stock solution (10 mM) of CoCl<sub>2</sub> was prepared in water and added to the medium to obtain the desired final concentrations. SH-SY5Y cells were treated for 24 h at a cell density of 1 × 10<sup>6</sup> cells/mL with either 0 µM CoCl<sub>2</sub>, 100 µM CoCl<sub>2</sub>, or 200 µM CoCl<sub>2</sub>. The experiment was repeated twice for the gene expression analysis with microarrays and three times for the gene expression analysis with polymerase chain reaction (PCR).


## Gene Expression Analysis

RNA was isolated using the GeneMatrix Universal RNA Kit (roboklon, Berlin, Germany). RNA extracted from the cells was treated with DNase (roboklon, Berlin, Germany) to remove genomic DNA. Absence of DNA contamination was verified in spot checks by using isolated RNA without reverse transcription as template for PCR. Global gene expression in SH-SY5Y cells was analyzed using Affymetrix Human Exon 1.0ST arrays (Affymetrix, Santa Clara, CA, United States). Affymetrix CEL files were processed with Expression Console 1.1 (Affymetrix) at gene level (core; library version: huex-1\_0-st-v2.na36.1.hg19). Calculations were performed with the MAfilter software (Winkler et al., 2012). Genes were considered differentially expressed if values in cobalt(II) chloride-treated samples were three times higher than in controls and signal intensities (RMA normalized, linear values) were above 100. Analysis was performed separately for cells treated with 100 µM CoCl<sub>2</sub> or 200 µM CoCl<sub>2</sub>. For further analysis we included all threefold up-regulated genes that were found in both replicates. Microarray CEL files have been submitted to the Gene Expression Omnibus (GEO) database (GSE107333).
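The filter criteria above can be illustrated with a short Python sketch. This is not the MAfilter implementation: the gene names and intensities are invented, the function names are hypothetical, and the sketch assumes that the intensity cutoff of 100 applies to the treated sample.

```python
# Minimal sketch of the differential-expression filter described above.
# All values are hypothetical RMA-normalized linear intensities, not real data.
FOLD_THRESHOLD = 3.0   # treated signal must exceed 3x the control signal
MIN_SIGNAL = 100.0     # assumed to apply to the treated sample


def upregulated(control, treated, fold=FOLD_THRESHOLD, min_signal=MIN_SIGNAL):
    """Return the set of genes that pass both filters in one replicate."""
    return {
        gene
        for gene, value in treated.items()
        if gene in control
        and value > min_signal
        and value > fold * control[gene]
    }


def consistent_hits(replicates):
    """Keep only genes that pass the filter in every replicate."""
    hit_sets = [upregulated(ctrl, trt) for ctrl, trt in replicates]
    return set.intersection(*hit_sets) if hit_sets else set()


# Two hypothetical replicates, each a (control, treated) pair of dictionaries.
replicates = [
    ({"GENE_A": 50.0, "GENE_B": 200.0, "GENE_C": 20.0},
     {"GENE_A": 400.0, "GENE_B": 300.0, "GENE_C": 90.0}),
    ({"GENE_A": 60.0, "GENE_B": 100.0, "GENE_C": 25.0},
     {"GENE_A": 250.0, "GENE_B": 350.0, "GENE_C": 95.0}),
]

print(sorted(consistent_hits(replicates)))  # ['GENE_A']
```

In this toy example GENE_B passes in the second replicate only and GENE_C never reaches the intensity cutoff, so only GENE_A is retained.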

## Identification of Endogenous Retrovirus Sequences

The chromosomal locations of the up-regulated genes were analyzed for the presence of putative ERV sequences essentially as described (Brütting et al., 2016). To this end, we analyzed the 2 Mbp surrounding each individual gene for the presence of ORFs with a minimal length of 1 kb by using Mobyle 1.5 (Rice et al., 2000). Identified ORFs were analyzed using BLASTP (Altschul et al., 2005) against the NCBI database of retro-transcribing viruses (taxid 35268) with the reference genome GRCh38 (primary assembly).

## Polymerase Chain Reaction

One microgram of the isolated RNA was transcribed into cDNA and used as a template for PCR. Real-time quantitative reverse transcription-PCR (qRT-PCR) was performed using GoTaq qPCR master mix (Promega, Mannheim, Germany) with 10 µL GoTaq qPCR master mix, 7 µL water, 1 µL forward primer, 1 µL reverse primer (25 µM) and 1 µL cDNA. PCR conditions were: 94°C, 30 s; 60°C, 30 s; 72°C, 45 s (40 cycles). Gene expression was calculated with the 2<sup>−ΔΔCt</sup> method (Livak and Schmittgen, 2001). Conventional PCR was performed using 2 µl of the cDNA, 5 µl Green GoTaq Buffer (Promega, Mannheim, Germany), 0.5 µl of 10 mM dNTPs (Fermentas, Sankt Leon-Rot, Germany), 0.25 µl of each of the two primers (25 µM), 0.2 µl GoTaq polymerase (5 U/µl; Promega) and 16.8 µl water. All primer sequences used are listed in **Table 1**. The amplification protocol included an initial denaturation step at 95°C for 5 min, followed by 40 cycles with denaturation at 95°C for 60 s, primer annealing at 60°C for 60 s, and amplification at 72°C for 90 s, and a final extension step at 72°C for 5 min. PCR products were subjected to agarose gel (1.5%) electrophoresis in the presence of ethidium bromide.

#### TABLE 1 | Primer combinations used in this study.

If not otherwise stated, primer combinations were used for qRT-PCR. <sup>a</sup>ACTB, actin beta; HERV-FRD, syncytin 2; EWSR1-FLI1, Ewing sarcoma breakpoint region 1-Friend leukemia virus integration site 1; tumor specific gene fusion; LIPI, lipase member I (cancer-testis antigen 17); NEDD9, neural precursor cell expressed developmentally down-regulated protein 9. <sup>b</sup>Used for conventional PCR only.
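The comparative Ct calculation cited in this section (Livak and Schmittgen, 2001) can be sketched as follows. The Ct values are invented for illustration and the function name is hypothetical. The reference gene corresponds to ACTB as used here, whereas the choice of calibrator sample is an assumption of this sketch.

```python
# Minimal sketch of the comparative Ct (2^-ddCt) calculation.
def relative_expression(ct_target, ct_reference,
                        ct_target_calibrator, ct_reference_calibrator):
    """Return the fold change of a target gene relative to a calibrator sample.

    delta Ct       = Ct(target) - Ct(reference gene)
    delta delta Ct = delta Ct(sample) - delta Ct(calibrator)
    fold change    = 2 ** (-delta delta Ct)
    """
    delta_ct_sample = ct_target - ct_reference
    delta_ct_calibrator = ct_target_calibrator - ct_reference_calibrator
    return 2.0 ** -(delta_ct_sample - delta_ct_calibrator)


# Hypothetical Ct values: the target crosses the threshold two cycles earlier
# in the treated sample while the reference gene is unchanged.
fold = relative_expression(
    ct_target=24.0, ct_reference=18.0,                        # treated sample
    ct_target_calibrator=26.0, ct_reference_calibrator=18.0,  # calibrator
)
print(fold)  # 4.0
```

The calculation assumes amplification efficiencies close to 100% for both the target and the reference gene, which is the standard premise of the method.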

## Cloning of HERV-FRD in pIRES2-AcGFP1 Vector

DNA (PCR product from SH-SY5Y cells) from the agarose gel was extracted with the GeneJET Gel Extraction Kit (Thermo Fisher, Waltham, MA, United States), ligated into the vector pGEM-T Easy (Promega) and transformed into Escherichia coli XL1-Blue. The DNA of one overnight colony was isolated with the GeneJET Plasmid Miniprep Kit (Thermo Fisher, Waltham, MA, United States). DNA and vector pIRES2-AcGFP1 (Clontech, Mountain View, CA, United States) were digested with SacI and SacII. After agarose gel purification, ligation, and transformation into Escherichia coli XL1-Blue, individual clones were sequenced by using HERV-FRD specific primers. For sequencing, a 10 µL sequencing mix was used that contained 6.8 µL HPLC water, 0.2 µL sequence-specific sequencing primers (10 µM), 2.0 µL BigDyeTerminator v1.1 Cycle Sequencing buffer (Applied Biosystems, Foster City, CA, United States), 2.0 µL BigDyeTerminator v1.1 Cycle Sequencing Mix and 10 ng DNA. Sequence analysis was performed using an ABI Prism™ 310 Genetic Analyzer (Applied Biosystems). A clone with the complete HERV-FRD ORF was used for further analysis. This clone differs from the reference sequence by a silent C to T transition (corresponding to base 1,384 in reference sequence NM_207582).

## Transfection

For transient expression, SH-SY5Y cells and A673 cells were cultured for 24 h and then transfected with the appropriate vectors using PromoFectin (PromoKine, Heidelberg, Germany) according to the manufacturer's protocol. For stable expression, cells were treated in the same way. After 24 h they were put under selection with the antibiotic G418.

## Mixed Lymphocyte Tumor Cell Culture (MLTC) and Flow Cytometry

Peripheral blood mononuclear cells (PBMC) were prepared and mixed lymphocyte tumor cell culture (MLTC) was performed as described elsewhere (Staege et al., 2004; Foell et al., 2008). Detection of surface antigens on PBMC by flow cytometry was performed as described elsewhere (Hoennscheidt et al., 2009). The following phycoerythrin labeled antibodies have been used: anti-CD3 clone SK7, anti-CD8 clone RPA-T8, and anti-CD25 clone 2A3. All antibodies were purchased from Becton Dickinson (Heidelberg, Germany) and all samples were analyzed on a FACScan instrument (Becton Dickinson) using CellQuestPro software (Becton Dickinson).

## RESULTS AND DISCUSSION

According to our stringent filter criteria (see section "Materials and Methods"), only five genes were found to be strongly up-regulated in SH-SY5Y cells after treatment with cobalt chloride. These genes include (in alphabetical order) CLU (clusterin), GPX3 (glutathione peroxidase 3), IGF2 (insulin-like growth factor 2), NEDD9 (neural precursor cell expressed, developmentally down-regulated 9), and SLC7A11 [solute carrier family 7 (anionic amino acid transporter light chain, Xc-system), member 11]. The up-regulated genes indicate transcriptionally active chromatin regions which might be susceptible to reactivation of other genetic elements like ERVs.

CLU (also known as apolipoprotein J, testosterone-repressed prostate message-2, or sulfated glycoprotein-2) encodes a glycoprotein which is nearly ubiquitously distributed in human tissues (Jones and Jomary, 2002). It is a 75–80 kDa heterodimer and a molecular chaperone which is normally secreted but in conditions of cellular stress, it can be transported to the cytoplasm where it can bind to BAX and inhibit neuronal apoptosis (Nuutinen et al., 2009). CLU expression has been associated with tumorigenesis of various malignancies, including tumors of the prostate, colon, and breast (Shannan et al., 2006). Variants in the clusterin gene are also associated with the risk of Alzheimer's disease (Schrijvers et al., 2011), dementia (Weinstein et al., 2016), and stroke (Guido et al., 2015). In astrocytes of MS white matter lesions an elevated expression of clusterin was detected (van Luijn et al., 2015). All of these diseases represent states of increased oxidative stress, which in turn, promotes amorphous aggregation of target proteins, increased genomic instability and high rates of cellular death (Trougakos and Gonos, 2006).

GPX3 (also known as plasma or extracellular glutathione peroxidase) encodes a protein which functions in the detoxification of hydrogen peroxide. Most of the GPX3 mRNA is kidney-derived (Avissar et al., 1994), but it is also expressed by heart, lung, liver, brain, breast, and gastrointestinal tract (Chu et al., 1992; Tham et al., 1998). In human cancer, GPX3 promoter down-regulation and hyper-methylation are rather common (Zhang et al., 2010; Chen et al., 2011). GPX3 expression and GPX3 hyper-methylation can thus be used as biomarkers for different kinds of cancer (Yang et al., 2013; Zhou et al., 2015). GPX3 works as a tumor suppressor, for example in colitis-associated carcinoma (Barrett et al., 2013) and in hepatocellular carcinoma (Qi et al., 2014). In initial MS lesions GPX3 was found to be down-regulated (>2 log2-fold) compared to control (Fischer et al., 2012).

IGF2 encodes a protein with high homology to pro-insulin (Livingstone, 2013). IGF2 contains 10 exons and 4 promoters so that several alternatively spliced transcripts are possible (Engström et al., 1998). The IGF2 gene is imprinted: the paternal IGF2 allele is transcribed whereas the maternal allele is silent (Giannoukakis et al., 1993). As a growth factor it is especially expressed in many tissues in early stages of embryonic and fetal development (Hedborg et al., 1994). In adults, IGF2 is preferentially expressed in liver and brain (Engström et al., 1998). IGF2 regulates normal cell growth and proliferation. Moreover, it plays a role in the growth and development of tumors: epigenetic changes at this locus are for example associated with Wilms tumor, Beckwith–Wiedemann syndrome, or rhabdomyosarcoma (Bergman et al., 2013).

SLC7A11 (also known as xCT) encodes a protein that is a member (together with SLC3A2) of a heterodimeric, sodium-independent, anionic amino acid transport system that is highly specific for cystine and glutamate (Sato et al., 2000). While SLC7A11 seems to induce the transport activity, SLC3A2 leads to the surface expression of the system (Verrey et al., 2004). SLC7A11 seems to contribute to different kinds of cancer, including, e.g., malignant glioma (Robert et al., 2015) or breast cancer (Liu et al., 2011). In tumor cells, the amino acid transport system plays a critical role in regulating intracellular glutathione levels (Okuno et al., 2003) and glutathione has been broadly implicated in chemotherapy resistance (Gatti and Zunino, 2005). Besides, SLC7A11 is significantly up-regulated in post-mortem spinal cord samples from MS patients (Lieury et al., 2014). SLC7A11 is a member of the solute carrier family, a large gene family that contains several receptors for retroviruses. Interestingly, two members of this family (SLC1A4, SLC1A5) have also been suggested as receptors for ERV (Lavillette et al., 2002). A function as receptor for viruses has not been described for SLC7A11.

NEDD9 (also known as CasL and HEF1) encodes a protein which regulates diverse cellular processes that are relevant to cancer, like cell attachment, migration, invasion, apoptosis, or cell cycle regulation (Singh et al., 2007; Shagisultanova et al., 2015). Furthermore, NEDD9 seems to play a role in the nervous system, as there is some association between one NEDD9 variation and the susceptibility to late-onset Alzheimer's disease and Parkinson's disease (Li et al., 2008). NEDD9 is involved in TGFβ-mediated differentiation into the neuronal lineage and possibly promotes a progenitor status that renders the cells competent to differentiate into neurons (Vogel et al., 2010). It is enriched in neural progenitor cells (Abramova et al., 2005) and its down-regulation is linked to neuronal lineage commitment (Aquino et al., 2008).

Based on our search strategy (see section "Materials and Methods"), we found large (>1,000 bp) ORFs in the vicinity of the up-regulated genes (from 11 in the vicinity of NEDD9 to 169 in the vicinity of IGF2). For all genes, these ORFs included candidates that passed the default threshold of the NCBI BLASTP implementation [expect (E) value < 10] against the database of retro-transcribing viruses. For four of the genes (all except SLC7A11), these BLASTP hits include envelope sequences from retro-transcribing viruses. The E-values for nearly all of these hits were higher than 0.01, so these hits do not represent convincing retroviral (ERV) sequences. However, we found one hit with very high similarity to retroviral envelope proteins in the vicinity of NEDD9 (see Supplementary Figure 1).
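The two E-value cutoffs mentioned above (the permissive BLASTP default of 10 and the stricter value of 0.01) can be illustrated with a small Python sketch. The hit records are invented, and the class and function names are hypothetical; the sketch only shows how the stricter cutoff separates convincing from unconvincing hits.

```python
# Minimal sketch of E-value based filtering of BLASTP hits (hypothetical data).
from typing import NamedTuple


class BlastHit(NamedTuple):
    query_orf: str   # identifier of the ORF used as query
    subject: str     # description of the matched viral protein
    evalue: float    # expect value reported by BLASTP


def convincing_hits(hits, max_evalue=0.01):
    """Keep hits whose E-value does not exceed the stricter cutoff."""
    return [hit for hit in hits if hit.evalue <= max_evalue]


# All three hypothetical hits pass the default reporting threshold (E < 10),
# but only one of them remains after applying the 0.01 cutoff.
hits = [
    BlastHit("orf_1", "envelope protein (hypothetical)", 2e-80),
    BlastHit("orf_2", "envelope protein (hypothetical)", 0.7),
    BlastHit("orf_3", "polymerase fragment (hypothetical)", 4.2),
]

for hit in convincing_hits(hits):
    print(hit.query_orf, hit.evalue)
```

Because the E-value estimates the number of equally good matches expected by chance in a database of the given size, values near the default threshold carry little evidential weight.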

We validated up-regulation of NEDD9 in CoCl<sub>2</sub>-treated SH-SY5Y cells by qRT-PCR (**Figure 2**). Our results are in agreement with observations from other groups also demonstrating that NEDD9 is induced by hypoxia (Martin-Rendon et al., 2007; Kim et al., 2010).

The BLASTP hit in the vicinity of NEDD9 (accession number CAB94192.1; see Supplementary Figure 1) represents a sequence ("HERV-H/env62") of the human HERV-H family. With about 1,000 elements, the HERV-H family is one of the largest HERV families in the human genome (Wilkinson et al., 1994). Analyses showed that there are three envelopes with large ORFs corresponding to potential 59-, 60-, and 62-kDa translational products (de Parseval et al., 2001). Moreover, the higher HERV seroreactivity in patients with active MS correlates with higher levels of HERV-H Env expression on B cells and monocytes (Brudek et al., 2009).

The sequence in the vicinity of NEDD9 is identical to the human endogenous retrovirus group FRD, member 1 (HERV-FRD). HERV-FRD is located in an intron of the small integral membrane protein 13 (SMIM13) gene. The close association between NEDD9 and SMIM13 is highly conserved in vertebrates. However, in non-primate vertebrates, HERV-FRD is absent (**Figure 3**). HERV-FRD entered the primate genomes more than 40 million years ago (de Parseval and Heidmann, 2005). It has inactivating mutations in the gag and pol genes whereas the envelope glycoprotein gene is preserved (Renard et al., 2005). The encoded protein product is called syncytin 2 (Blaise et al., 2003), which plays a major role in placental development and trophoblast fusion (Malassiné et al., 2007; Vargas et al., 2009). The protein has the characteristics of a typical retroviral envelope protein, including a cleavage site that separates the surface and transmembrane units which together form a heterodimer of the mature syncytin 2 (Renard et al., 2005). Syncytin 2 can induce cell-cell fusion (Blaise et al., 2003).

standard deviations from three independent experiments. For comparative analysis, beta actin was used as housekeeping control and the median expression of all samples was set as one.

In our model we found up-regulation only for the mentioned five genes and not for the associated ERVs, and we have no evidence that ERVs are functionally involved in up-regulation of the genes or vice versa. From our data, only HERV-FRD emerged as a candidate for a possible association between hypoxia and ERVs in MS. Other factors (e.g., patient specific polymorphisms) might be necessary to induce expression of the ERVs and subsequent effects. Under such conditions, it seems possible that over-expression of syncytin 2 in the brain, e.g., as a consequence of local hypoxia, elicits an immunomodulating activity. Therefore, we tested whether syncytin 2 overexpression led to altered immunostimulatory activity in the well-characterized A673 model system (Staege et al., 2004; Reuter et al., 2015). HERV-FRD transfected A673 cells retained the expression of tumor associated antigens (**Figure 4A**). However, we were not able to find altered immunostimulatory activity of transfected cells (**Figure 4B**) in this system. Further investigations are needed to analyze possible immunomodulatory properties.

Taken together, our study shows changes in the gene expression profile of human neuronal-like SH-SY5Y cells treated with the hypoxia-mimetic CoCl<sub>2</sub> in contrast to untreated cells. Five genes were found to be strongly up-regulated: CLU, GPX3, IGF2, NEDD9, and SLC7A11. Three of them (CLU, GPX3, and SLC7A11) have previously been associated with MS. The identified ERV in the vicinity of NEDD9 might thus be involved in the association between hypoxia and MS.

## AUTHOR CONTRIBUTIONS

CB: data collection, data analysis, and interpretation, generating figures, and drafting the article. HN: part of data collection. FH and MK: conception of the work. MS: conception of the work, generating figures, and critical revision of the article. AE: conception of the work and final approval of the version to be published.

## ACKNOWLEDGMENTS


We thank the Wilhelm Roux Program of the Medical Faculty of the Martin Luther University of Halle-Wittenberg (FKZ 28/45) for the kind support for our studies.

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmicb.2018.00287/full#supplementary-material

## REFERENCES

multiple sclerosis exhibit increased surface expression of both HERV-H Env and HERV-W Env, accompanied by increased seroreactivity. Retrovirology 6:104. doi: 10.1186/1742-4690-6-104

derived from a series of solid tumors. J. Natl. Cancer Inst. 51, 1417–1423. doi: 10.1093/jnci/51.5.1417

Goering, W., Schmitt, K., Dostert, M., Schaal, H., Deenen, R., Mayer, J., et al. (2015). Human endogenous retrovirus HERV-K (HML-2) activity in prostate cancer is dominated by a few loci. Prostate 75, 1958–1971. doi: 10.1002/pros.23095

Guido, V., Musaro, V., San Millan, D., Ghika, J. A., Fishman, D., Dayer, E., et al. (2015). Serum and urine clusterin levels are elevated in stroke patients. J. Neurol. Sci. 357, e379–e380. doi: 10.1016/j.jns.2015.08.1352

Hayward, A., and Katzourakis, A. (2015). Endogenous retroviruses. Curr. Biol. 25, R644–R646. doi: 10.1016/j.cub.2015.05.041

Hedborg, F., Holmgren, L., Sandstedt, B., and Ohlsson, R. (1994). The cell type-specific IGF2 expression during early human development correlates to the pattern of overgrowth and neoplasia in the Beckwith-Wiedemann syndrome. Am. J. Pathol. 145, 802–817.

Hemmer, B., Kieseier, B., Cepok, S., and Hartung, H. P. (2003). New immunopathologic insights into multiple sclerosis. Curr. Neurol. Neurosci. 3, 246–255. doi: 10.1007/s11910-003-0085-y

Hoennscheidt, C., Max, D., Richter, N., and Staege, M. S. (2009). Expression of CD4 on Epstein–Barr virus-immortalized B cells. Scand. J. Immunol. 70, 216–225. doi: 10.1111/j.1365-3083.2009.02286.x

Islam, T., Gauderman, W. J., Cozen, W., and Mack, T. M. (2007). Childhood sun exposure influences risk of multiple sclerosis in monozygotic twins. Neurology 69, 381–388. doi: 10.1212/01.wnl.0000268266.50850.48

Jern, P., and Coffin, J. M. (2008). Effects of retroviruses on host genome function. Annu. Rev. Genet. 42, 709–732. doi: 10.1146/annurev.genet.42.110807.091501

Jones, S. E., and Jomary, C. (2002). Clusterin. Int. J. Biochem. Cell Biol. 34, 427–431. doi: 10.1016/S1357-2725(01)00155-8

Kemp, K., Gray, E., Wilkins, A., and Scolding, N. (2012). Purkinje cell fusion and binucleate heterokaryon formation in multiple sclerosis cerebellum. Brain 135, 2962–2972. doi: 10.1093/brain/aws226

Kewitz, S., and Staege, M. S. (2013). Expression and regulation of the endogenous retrovirus 3 (ERV3) in Hodgkin's lymphoma cells. Front. Oncol. 3:179. doi: 10.3389/fonc.2013.00179

Kim, S. H., Xia, D., Kim, S. W., Holla, V., Menter, D. G., and DuBois, R. N. (2010). Human enhancer of filamentation 1 is a mediator of hypoxia-inducible factor-1α–mediated migration in colorectal carcinoma cells. Cancer Res. 70, 4054–4063. doi: 10.1158/0008-5472.CAN-09-2110

Küçükali, C. İ., Kürtüncü, M., Çoban, A., Çebi, M., and Tüzün, E. (2015). Epigenetics of multiple sclerosis: an updated review. Neuromol. Med. 17, 83–96. doi: 10.1007/s12017-014-8298-6

Kulkarni, A., Mateus, M., Thinnes, C. C., McCullagh, J. S., Schofield, C. J., Taylor, G. P., et al. (2017). Glucose metabolism and oxygen availability govern reactivation of the latent human retrovirus HTLV-1. Cell Chem. Biol. 24, 1377.e3–1387.e3. doi: 10.1016/j.chembiol.2017.08.016

Lander, E. S., Linton, L. M., Birren, B., Nusbaum, C., Zody, M. C., Baldwin, J., et al. (2001). Initial sequencing and analysis of the human genome. Nature 409, 860–921. doi: 10.1038/35057062

Lassmann, H. (2003). Hypoxia-like tissue injury as a component of multiple sclerosis lesions. J. Neurol. Sci. 206, 187–191. doi: 10.1016/S0022-510X(02)00421-5

Lavillette, D., Marin, M., Ruggieri, A., Mallet, F., Cosset, F. L., and Kabat, D. (2002). The envelope glycoprotein of human endogenous retrovirus type W uses a divergent family of amino acid transporters/cell surface receptors. J. Virol. 76, 6442–6452. doi: 10.1128/JVI.76.13.6442-6452.2002

Li, Y., Grupe, A., Rowland, C., Holmans, P., Segurado, R., Abraham, R., et al. (2008). Evidence that common variation in NEDD9 is associated with susceptibility to late-onset Alzheimer's and Parkinson's disease. Hum. Mol.

with experimental autoimmune encephalomyelitis. PLoS One 10:e0133903. doi: 10.1371/journal.pone.0133903


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Brütting, Narasimhan, Hoffmann, Kornhuber, Staege and Emmer. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# HERV Envelope Proteins: Physiological Role and Pathogenic Potential in Cancer and Autoimmunity

#### Nicole Grandi<sup>1</sup> and Enzo Tramontano<sup>1,2</sup>\*

*<sup>1</sup> Laboratory of Molecular Virology, Department of Life and Environmental Sciences, University of Cagliari, Cagliari, Italy, <sup>2</sup> Istituto di Ricerca Genetica e Biomedica, Consiglio Nazionale delle Ricerche, Cagliari, Italy*

## Edited by:

*Martin Sebastian Staege, Martin Luther University of Halle-Wittenberg, Germany*

#### Reviewed by:

*Tara Patricia Hurst, Abcam, United Kingdom Masaaki Miyazawa, Kindai University, Japan*

> \*Correspondence: *Enzo Tramontano tramon@unica.it*

#### Specialty section:

*This article was submitted to Virology, a section of the journal Frontiers in Microbiology*

Received: *22 November 2017* Accepted: *27 February 2018* Published: *14 March 2018*

#### Citation:

*Grandi N and Tramontano E (2018) HERV Envelope Proteins: Physiological Role and Pathogenic Potential in Cancer and Autoimmunity. Front. Microbiol. 9:462. doi: 10.3389/fmicb.2018.00462*

Human endogenous retroviruses (HERVs) are relics of ancient infections accounting for about 8% of our genome. Although their persistence in human DNA has led to the accumulation of mutations, HERVs still contribute to the human transcriptome, and a growing number of findings suggest that their expression products may have a role in various diseases. Among HERV products, the envelope proteins (Env) are currently highly investigated for their pathogenic properties, which could participate in several disorders with complex etiology, particularly in the contexts of autoimmunity and cancer. In fact, HERV Env proteins have been shown, on the one hand, to trigger both innate and adaptive immunity, prompting inflammatory, cytotoxic and apoptotic reactions; and, on the other hand, to prevent the activation of the immune response, presenting immunosuppressive properties and acting as immune downregulators. In addition, HERV Env proteins have been shown to induce abnormal cell-cell fusion, possibly contributing to tumor development and metastatic processes. Remarkably, even highly defective HERV *env* genes and alternative *env* splicing variants can provide further mechanisms of pathogenesis. A well-known example is the HERV-K(HML2) *env* gene that, depending on the presence or the absence of a 292-bp deletion, can give rise to two proteins of different length (Np9 and Rec) proposed to have oncogenic properties. The understanding of their involvement in complex pathological disorders has made HERV Env proteins potential targets for therapeutic interventions. Of note, a monoclonal antibody directed against a HERV-W Env is currently in clinical trials as a therapeutic approach for multiple sclerosis, representing the first HERV-based treatment. The present review will focus on the current knowledge of HERV Env expression, summarizing its role in human physiology and its possible pathogenic effects in various cancer and autoimmune disorders. It also analyzes the possible exploitation of HERV Env for the development of innovative therapeutic strategies.

Keywords: HERV, endogenous retroviruses, Env, cancer, autoimmunity, multiple sclerosis, syncytin

## INTRODUCTION

Human endogenous retroviruses (HERVs) are transposable elements acquired along primate evolution through multiple infections by now-extinct exogenous retroviruses. As in modern retroviruses, the ancient RNA genome was reverse transcribed into a double-stranded DNA provirus and stably integrated into the host's chromosomes. In the case of HERVs, however, such ancestral infections affected the germ line, allowing the Mendelian inheritance of HERV proviruses by the offspring. HERVs thus became stable components of the human genome, constituting approximately 8% of our DNA (Lander et al., 2001). While HERV characterization at the genomic level is still ongoing, the widely reported HERV expression across tissues has stimulated the search for a role in human pathogenesis. In general, however, no definitive link between any HERV sequence (or its expressed products) and human diseases has been demonstrated yet, due to a series of confounding factors that include the lack of characterization of individual HERV loci, the poor knowledge of their specific expression in healthy and diseased conditions and the absence of confirmed molecular mechanisms of pathogenesis (Grandi and Tramontano, 2017). Consequently, even if the contribution of HERV expression to our transcriptome is by now undeniable, its tentative association with human pathogenesis has often lacked sufficient support, ending in over-interpreted conclusions (Voisset et al., 2008; Grandi and Tramontano, 2017). Far from meaning that HERV RNAs must be translated to have an effect, the uncertainty in the field demands standardized methodologies and reliable genomic backgrounds to definitively assess which expressed HERV loci constitute a "physiological" phenomenon and which, instead, could actually have some pathological potential. In contrast, HERV protein expression, especially if exclusive to diseased tissues, is less common than RNA production and could more likely have some effects on the host, even if not intrinsically taking part in pathogenesis. An important issue to be addressed is whether the specificity of expression from a given locus can be associated with a defined molecular mechanism of pathogenesis. Such knowledge could lead to the identification of individual HERV proteins involved in disease development, and thus exploitable as therapeutic targets.

The great majority of studies investigating the pathogenic role of HERV products has been focused on HERV envelope proteins (Env). In the present review, we will describe HERV-derived Env contribution to human physiology and analyze their possible impact on pathogenesis. Overall, even if not fully conclusive, the soundest evidence has been reached in cancer and autoimmunity. The main molecular mechanisms of HERV Env pathogenesis and their possible exploitation as therapeutic targets are discussed.

## HUMAN ENDOGENOUS RETROVIRUSES

Being remnants of ancient retroviral infections, HERVs show a typical proviral structure (**Figure 1A**), even if the action of cellular editing systems and the prolonged exposure of proviruses to the host genome substitution rate have often made them coding-defective. In structurally complete HERVs, two long terminal repeats (LTRs) flank the proviral internal portion, which consists of the viral genes gag, pro, pol and env. Briefly, gag encodes the structural components matrix, capsid and nucleocapsid; pro and pol specify the enzymes protease (PR), reverse transcriptase (RT) and integrase (IN); and env encodes the Env surface (SU) and transmembrane (TM) subunits. The LTRs are formed during reverse transcription and have important regulatory functions for viral expression. HERV proviruses moreover include a primer binding site (PBS), between the 5'LTR and gag, and a polypurine tract (PPT), between env and the 3'LTR: the former acts as the binding site for the cellular tRNA priming (−)strand DNA synthesis, the latter serves as a primer for (+)strand DNA synthesis. In addition, the HERV-K(HML2) group presents two accessory proteins, namely Np9 and Rec, generated through the use of alternative splicing sites during env transcription (**Figure 1B**) (Löwer et al., 1995; Armbruester et al., 2002). In fact, besides the full-length env mRNA, a sub-spliced rec mRNA can be generated by type II HML2 proviruses, depending on a splice donor site that can be lost due to a recurrent 292 bp deletion (**Figure 1B**). The loss of this portion characterizes type I HML2 sequences, in which an alternative splice donor site upstream of the deletion generates the np9 mRNA (**Figure 1B**). As described below, both Rec and Np9 have been intensively studied for their possible role in human health. Interestingly, a recent study reported the presence of a rec open reading frame (ORF) also in type II HERV-K(HML10) sequences, opening new perspectives for the group's possible impact on human biology (Grandi et al., 2017).

Due to the lack of a proper nomenclature, the classification of HERVs was for a long time incomplete and sometimes controversial. HERVs have been broadly divided into three main classes based on their similarity to exogenous retroviruses: class I (Gammaretrovirus- and Epsilonretrovirus-like), class II (Betaretrovirus-like) and class III (Spumaretrovirus-like). The individual HERV groups have, however, been designated based on discordant criteria, generating some confusion. Many HERV groups were in fact named according to the cellular tRNA recognized by their PBS (e.g., HERV-K for lysine, HERV-H for histidine), even if the subsequent characterization revealed the occurrence of PBS variants recognizing alternative tRNAs (Jern et al., 2005; Grandi et al., 2016; Vargiu et al., 2016). Other HERVs have been named after a neighboring gene (HERV-ADP) or a particular motif (HERV-FRD). All these nomenclatures are now considered inadequate due to their poor taxonomic value, and HERV classification is now based on phylogenetic relationships among the different groups. Moreover, some structural features shared by all the HERVs belonging to the same genus or class are a valuable support in better understanding retroviral phylogeny (Jern et al., 2005). According to the most updated and comprehensive analysis, performed with the software RetroTector (Sperber et al., 2007), HERVs can be classified into 39 "canonical" groups and 31 "noncanonical" clades characterized by varying degrees of mosaicism (Vargiu et al., 2016) (**Table 1**). This classification is based on a multi-step approach and provided a remarkable background for the characterization of single HERV groups, revealing insights on

FIGURE 1 | General structure of HERV DNA sequences. (A) *Typical structure of HERV proviruses*. The general structure of a full-length HERV proviral sequence is depicted: two Long Terminal Repeats (LTRs) flank *gag*, *pro, pol*, and *env* genes. The viral genes and the correspondent protein products are indicated: *gag* matrix (MA), capsid (CA) and nucleocapsid (NC); *pro-pol* protease (PR) – reverse transcriptase (RT), ribonuclease H (RH) and integrase (IN); *env* surface (SU) and transmembrane (TM). The primer binding site (PBS) and polypurine tract (PPT) are located between 5′LTR and *gag* and between *env* and 3′LTR, respectively. It is worth noting that the action of cellular editing systems and the persistence within the host genome led to the accumulation of substitutions, deletion and insertions that modified the structure of the majority of HERV proviruses, leading to coding-defective sequences and solitary LTRs formation. (B) *HERV-K(HML2) accessory genes*. Differently from other HERV groups, the HERV-K(HML2) *env* gene presents alternative splicing sites, leading to the presence of a full length transcript with structural significance plus two types of accessory variants, named *rec* and *np9*, that subdivide the HML2 sequences into two types. Type I HML2 elements present a characteristic 292-bp *env* deletion, leading to the use of an upstream splice donor and the subsequent production of a shorter protein, named Np9; while type II HML2 sequences retained such portion and use the downstream splice donor site to encode for a longer protein, named Rec.

TABLE 1 | Identification and classification of ∼3,200 HERV sequences in GRCh37/hg19 genome assembly by RetroTector (adapted from Vargiu et al., 2016).


*MLV, murine leukemia virus; FELV, feline leukemia virus; WDSV, walleye dermal sarcoma virus; MMTV, mouse mammary tumor virus; MPMV, Mason-Pfizer monkey virus; JSRV, jaagsiekte sheep retrovirus; SFV, simian foamy virus; C, canonical; NC, non-canonical.*

*<sup>a</sup>Exogenous retroviral species representative of the genus.*

HERV mosaic forms arising from recombination and secondary integrations (Vargiu et al., 2016). It also highlighted frequent recombination events between the env genes of different HERV groups ("env snatching"), providing some positive effect on viral fitness that could be associated with both loss of env, favoring intragenomic spread instead of extracellular replication, and env acquisition, conferring a different/wider tropism (Vargiu et al., 2016). Finally, the calculation of the nucleotide divergence between proviral LTRs gave a remarkably complete (even if approximate) overview of the different HERV groups' time of integration, showing that most of them were acquired by primates from 60 to 20 million years ago (Vargiu et al., 2016) (**Figure 2**). Thus, the majority of HERV groups were acquired by Haplorrhini primates after their evolutionary separation from the older Strepsirrhini suborder, being broadly divided based on their presence in both Platyrrhini and Catarrhini or in Catarrhini species only, due to integration that occurred after the evolutionary separation of these parvorders (∼43 million years ago) (**Figure 2**).
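
The LTR-based dating mentioned above reduces to simple arithmetic, sketched here in Python. The rate of 0.2% substitutions/nucleotide/million years and the ∼43 million year Platyrrhini/Catarrhini split are taken from the text and the Figure 2 caption. Everything else is an assumption made for illustration: the function names are hypothetical, and the division by 2 (on the grounds that the two LTRs of a provirus are identical at integration and then diverge independently) is a common convention that is not stated in this section.

```python
# Values taken from the text and the Figure 2 caption.
SUBSTITUTION_RATE = 0.2   # % substitutions per nucleotide per million years
PARVORDER_SPLIT_MYA = 43.0  # approximate Platyrrhini/Catarrhini separation


def ltr_age_mya(divergence_percent: float,
                rate: float = SUBSTITUTION_RATE,
                independent_ltrs: bool = True) -> float:
    """Estimate the time since integration (million years) from LTR divergence.

    Assumption: when `independent_ltrs` is True the estimate is halved,
    because both LTRs accumulate substitutions independently after integration.
    """
    if divergence_percent < 0 or rate <= 0:
        raise ValueError("divergence must be >= 0 and rate must be > 0")
    age = divergence_percent / rate
    return age / 2 if independent_ltrs else age


def expected_distribution(age_mya: float) -> str:
    """Rough expectation of which parvorders carry an insertion of this age.

    Simplification: only the single ~43 My split reported in the text is used.
    """
    if age_mya > PARVORDER_SPLIT_MYA:
        return "Platyrrhini and Catarrhini"
    return "Catarrhini only"


age = ltr_age_mya(10.0)  # a provirus whose two LTRs differ at 10% of positions
print(age, expected_distribution(age))
```

Under these assumptions, a 10% divergence between the two LTRs corresponds to roughly 25 million years, i.e., an integration after the separation of the two parvorders. As the text stresses, such estimates are approximate.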

## ENV PLEIOTROPIC NATURE: FROM RETROVIRAL FUNCTIONS TO PHYSIOLOGICAL ROLES

(H)ERV Envs gained considerable attention due to the cooptation of some of them during eutherian mammal evolution, providing important biological activities to placenta development and pregnancy-related functions. Here we will only focus on the Env proteins relevant to human physiology, but it is worth noting that many Envs encoded by different (H)ERVs have been domesticated independently in different times and species, representing a fascinating example of convergent evolution (Lavialle et al., 2013).

In the retroviral life cycle, Env glycoproteins mediate the entry into the host cell. env encodes a precursor that is cleaved into a SU subunit, constituting the viral antireceptor, and a TM subunit, holding fusogenic and immunosuppressive activities (**Figures 1A**, **3A,B**). SU-TM heterodimers are assembled at the cellular membrane to form Env trimers, acquired by the viral particles during their budding. In the presence of a susceptible cell, the Env SU antireceptor binds the correspondent receptor on the cellular membrane, mediating the insertion of TM fusion peptide (FP) for membrane fusion and cytoplasmic release of the nucleocapsid. Env expression on the infected cell's surface also competes for receptor occupation, preventing superinfection of the same cell by multiple retroviruses, as shown for ERV-derived Env proteins in mouse (Best et al., 1997) and sheep (Varela et al., 2009). In addition, Env proteins present on the cell membrane can bind the correspondent receptors on uninfected cells, mediating membrane fusions and syncytia formation (**Figure 3B**). This mechanism is involved in the activity of coopted Env, namely "syncytins," which similarly determine the fusion of the blastocyst's peripheral cells. Cytotrophoblasts therefore form a highly invasive layer, the syncytiotrophoblast, which invades the maternal uterine decidua constituting the outer placenta surface, being fundamental for embryo implantation and trophic exchanges.

In addition to fusogenicity, the retroviral Env TM subunit is known to possess an immune modulatory activity, possibly due to the presence of a putative immunosuppressive domain (ISD) (**Figures 3A,B**). In exogenous retroviruses this immune suppression activity is normally used to counteract the host antiviral responses (Mangeney and Heidmann, 1998; Blaise et al., 2001) and, in the case of HERVs, Env-mediated immune modulation has been coopted on occasion for the physiological maternal tolerance during pregnancy (Mangeney et al., 2007; Lavialle et al., 2013). In the latter, a well-evolved immune balance must allow fetal trophoblast invasion avoiding the rejection of paternal antigens (Ags), but should also maintain its activity in counteracting viral and bacterial infections. In particular, pregnancy is characterized by the suppression of cellular immunity that could stimulate cytotoxic processes and be harmful to the fetus. In this context, retroviral Envs have been suggested to inhibit maternal Th1 cytokine production (TNF-α, IFN-γ, and IL-2), leading to a shift toward the anti-inflammatory Th2 cytokines response (IL-4, IL-5 and, especially, IL-10) (Haraguchi et al., 1992, 1995; Tolosa et al., 2012).

Due to their relevant roles for human physiology, syncytin genes have been subjected to a positive selection along primate evolution, as suggested by the limited human polymorphisms and the high conservation of non-human primates' homologous loci (Esnault et al., 2013; Lavialle et al., 2013).

## HERV-W Syncytin-1

The first Env characterized for its domestication is syncytin-1, encoded by a HERV-W provirus in locus 7q21.2 (ERVWE1) that was acquired by primates ∼25 million years ago (Mallet et al., 2004). ERVWE1 is coding-defective for gag and pol, albeit retaining an env ORF producing a protein with pregnancy-related functions (Blond et al., 2000; Mi et al., 2000).

Syncytin-1 is a 73 kDa glycosylated protein composed of 538 amino acids (aa) and presenting an N-terminal signal peptide (SP, aa 1-20), a SU subunit (aa 21-317), a TM subunit (aa 318-538) and various functional domains for the protein maturation and activity (Gimenez and Mallet, 2008) (**Figure 3A**). In syncytin-1 precursor, SU and TM domains associate into a homotrimer through the TM N- and C-terminal heptad repeats (HR1 and HR2, aa 352-392 and 407-440, respectively) (Gimenez and Mallet, 2008). In this homotrimer, each precursor is cleaved by cellular furin proteases at the conserved SU/TM RKNR site, producing mature SU (gp50) and TM (gp24) subunits (Gimenez and Mallet, 2008). The latter remain linked through a disulphide bridge between SU CWIC and TM CX6CC motifs. Once located at the cellular membrane, SU N-terminal receptor binding domain interacts with the type D mammalian retrovirus receptor hASCT2 (human sodium-dependent neutral amino acid transporter type 2) (Lavillette et al., 2002; Cheynet et al., 2006). This activates syncytin-1 fusogenic activity, held by the TM subunit and involving its hydrophobic FP (aa 320-340), the fusion core (composed of HR1 and HR2) and the TM C-terminal intracytoplasmic tail (CYT, aa 471-538) (Gimenez and Mallet, 2008). In particular, while in retroviral Envs the fusogenic activity requires the removal of an inhibitory peptide by viral proteases, syncytin-1 CYT presents a deletion of four aa (LQMV, after aa 485) that makes the protein constitutively competent for fusion (Bonnaud et al., 2004). The recent analysis of 16 HERV-W env ORFs and the correspondent putative proteins revealed their defectiveness as compared to syncytin-1,

FIGURE 2 | Overview of HERV groups' period of acquisition by primate lineages. For each HERV group (as listed on the x-axis), colored rhomboids indicate the average time of integration, while the global period of diffusion is delimited with a line (in millions of years, y-axis). Values were extracted from Vargiu et al. (2016) and derive from the nucleotide divergence calculated between the two LTRs of individual proviral members considering a genomic rate of 0.2% substitutions/nucleotide/millions of years. The estimated period in which the different primate species evolutionary diverged (as derived from Steiper and Young, 2006; Perelman et al., 2011) is indicated in the underlying tree at each node of separation, and is depicted in the graph by colored blocks corresponding to the first primate species infected by a certain HERV group (blue: prosimians, red: New World monkeys, green: Old World monkeys, yellow: gibbon, pink: orangutan, turquoise: gorilla, violet: chimpanzee, orange: humans). Primate parvorders (bold italic) and suborders (italic) are reported in the top of the graph, and a red line delimits the global time period of evolution of the whole order. Photo credits: human by Mostafameraji, https://commons.wikimedia.org/w/index.php?curid=61340991; chimpanzee by Thomas Lersch, https://commons.wikimedia.org/w/index.php?curid=1001910; gorilla by Adrian Pingstone, https://commons.wikimedia.org/w/index.php?curid=519340; orangutan by julubecka, https://commons.wikimedia.org/w/index.php?curid=53997414; gibbon by Raul654, https://commons.wikimedia.org/w/index.php?curid=529865; rhesus by Ltshears, https://commons.wikimedia.org/w/index.php?curid=10846501; rhesus (OWM) by Dr. Raju Kasambe, https://commons.wikimedia.org/w/index.php?curid=64614226; marmoset (NWM) by Georges Néron, https://commons.wikimedia.org/w/index.php?curid=5094199; prosimians by Sannse, https://commons.wikimedia.org/w/index.php?curid=112516.

showing mutations in all sites relevant to fusogenic activity (Grandi et al., 2016). The syncytin-1 crystal structure was obtained for the sole fusion subunit, confirming the presence of a trimeric structure organized by hydrophobic α-helices association (Gong et al., 2005) (**Figure 4**). It has been suggested that, after the FP insertion, HR1 and HR2 associate in an antiparallel manner to form a stable α-helical trimer of heterodimers, i.e., a homotrimeric coiled coil complex (Gong et al., 2005).

Syncytin-1 was shown to induce syncytia through the interaction with hASCT2 receptor (Blond et al., 2000; Mi et al., 2000). Besides this, syncytin-1 is able to bind human divergent receptors (hASCT1) and even murine unglycosylated orthologs (mASCT1 and mASCT2), showing a low restriction that is likely due to a strong selective pressure throughout evolution (Lavillette et al., 2002). Syncytin-1 specifically localizes to the villous and extravillous trophoblasts, where its fusogenic activity has been coopted for the development of the placental syncytiotrophoblast (Blond et al., 2000; Mi et al., 2000; Malassiné et al., 2005). Syncytin-1 is also involved in the syncytiotrophoblast homeostasis, modulating cytotrophoblast differentiation, proliferation and survival by fostering the G1/S transition (Frendo et al., 2003; Huang et al., 2013b, 2014). In such regulation, cyclic AMP (cAMP), besides modulating the kinases involved in trophoblast fusion and differentiation (Keryer et al., 1998), controls a cAMP-responsive element (CRE) in ERVWE1 LTR (Frendo et al., 2003). Such cAMP stimulation promotes ERVWE1 basal expression, while an upstream tissue-restricted enhancer within an adjacent MaLR solitary LTR ensures high placenta-specific production (Prudhomme et al., 2004). This bipartite promoter is moreover under a joint regional epigenetic control, being hypomethylated with stage-dependent profiles in cytotrophoblasts (Gimenez et al., 2009) and being instead hypermethylated in non-placental tissues (Matousková et al., 2006). Aside from cAMP stimulation, the 5′-flanking region of ERVWE1 presents two binding sites for the chorion-specific transcription factor GCM1, required for placental development and shown to increase syncytin-1 expression and fusogenicity in trophoblasts, but not in other cells (Yu et al., 2002).

Please see the text for more details about the individual domain functions. (B) Simplified model of the protein's structural configuration and fusogenic role when inserted into viral or cellular membranes.

Syncytin-1 was also thought to have a role in maternal immune tolerance (Blond et al., 2000; Mi et al., 2000; Malassiné et al., 2005), as demonstrated for the ISD of a murine (Mangeney and Heidmann, 1998) and a primate (Blaise et al., 2001) retrovirus. Further studies in mice suggested instead the absence of such activity (Mangeney et al., 2007). However, syncytin-1 was able to inhibit Th1 cytokine production in human blood, suggesting a possible role in the shift from Th1 to Th2 cytokines occurring during pregnancy (Tolosa et al., 2012). Likewise, a physiological role of syncytin-1 was proposed in the upstream fertilization process: in fact, both syncytin-1 and hASCT2 receptor are expressed in human gametes, showing temporal and spatial appearance in line with a role in oocyte and sperm fusion (Bjerregaard et al., 2014).

Overall, syncytin-1 plays a pivotal role in placental morphogenesis and homeostasis through a well-evolved balance of fusogenic and non-fusogenic functions, having in addition some possible immunomodulatory activity during pregnancy.
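
The syncytin-1 domain coordinates reported above (Gimenez and Mallet, 2008) lend themselves to a small lookup table. The following Python sketch is only an illustration: the names `SYNCYTIN1_DOMAINS` and `domains_at` are hypothetical, and the coordinates are simply those quoted in this section for the 538 aa precursor.

```python
# Amino acid coordinates (1-based, inclusive) as quoted in the text.
SYNCYTIN1_LENGTH = 538
SYNCYTIN1_DOMAINS = {
    "SP": (1, 20),      # N-terminal signal peptide
    "SU": (21, 317),    # surface subunit
    "TM": (318, 538),   # transmembrane subunit
    "FP": (320, 340),   # fusion peptide, within TM
    "HR1": (352, 392),  # N-terminal heptad repeat
    "HR2": (407, 440),  # C-terminal heptad repeat
    "CYT": (471, 538),  # intracytoplasmic tail
}


def domains_at(position: int) -> list:
    """Return the names of all annotated regions covering a residue position."""
    if not 1 <= position <= SYNCYTIN1_LENGTH:
        raise ValueError(f"position must be between 1 and {SYNCYTIN1_LENGTH}")
    return [
        name
        for name, (start, end) in SYNCYTIN1_DOMAINS.items()
        if start <= position <= end
    ]


print(domains_at(330))  # a residue inside the fusion peptide
print(domains_at(400))  # a residue between HR1 and HR2
```

Because functional regions such as FP, HR1, HR2 and CYT are nested inside the TM subunit, a position can map to more than one name. This is why the function returns a list rather than a single label.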

## HERV-FRD Syncytin-2

Similarly to syncytin-1, another Env encoded by a HERV-FRD provirus in locus 6p24.1 showed placenta-specific expression and syncytia induction in cultured cells, being therefore named syncytin-2 (Blaise et al., 2003). Syncytin-2 locus is found in both Catarrhini and Platyrrhini species, having been acquired >40 million years ago (Blaise et al., 2003).

Syncytin-2 is homologous to syncytin-1, being a ∼73 kDa glycosylated protein of 538 aa expressed as a precursor that associates to form homotrimeric complexes (Renard et al., 2005; Chen et al., 2008; Cui et al., 2016) and presenting the same domains (**Figure 3A**): an N-terminal SP (aa 1-15), a SU subunit (aa 16-350) and a TM subunit (aa 351-528). SU and TM precursors are cleaved by a cellular furin protease at RVRR cleavage site (aa 347-350) and remain covalently associated through a disulphide bond between SU CX2C (aa 43-46) and TM CX6CC (aa 431-439) motifs. The TM subunit harbors the putative FP (aa 354-374), an ISD (aa 414-430), a TM domain (aa 479-499) and a C-terminal CYT domain (aa 500-538) missing the LQMV inhibitory motif in the cleavage site. The syncytin-2 crystal structure is also available for the fusion subunit only, being a homotrimer organized by associated HR1 and HR2 hydrophobic α-helices followed by an ISD loop and an extended domain antiparallel to the coiled-coil (Renard et al., 2005) (**Figure 4**).
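
Both SU/TM cleavage sites mentioned in this review (RKNR in syncytin-1 and RVRR in syncytin-2) fit the minimal furin recognition pattern R-X-X-R. The Python sketch below shows how such candidate sites could be located in a protein sequence. It rests on several assumptions: the choice of the minimal R-X-X-R pattern is ours (stricter consensus patterns exist and would not match RKNR), the function name is hypothetical, and the example sequences are invented toy peptides, not the real syncytin sequences.

```python
import re

# Minimal furin-like pattern R-X-X-R. The lookahead lets overlapping
# candidates be reported, since finditer otherwise skips past each match.
FURIN_MINIMAL = re.compile(r"(?=(R..R))")


def find_furin_like_sites(sequence: str) -> list:
    """Return (1-based start position, motif) for every R-X-X-R occurrence."""
    return [
        (match.start() + 1, match.group(1))
        for match in FURIN_MINIMAL.finditer(sequence.upper())
    ]


# Toy peptides for illustration only (NOT real syncytin sequences).
print(find_furin_like_sites("AAARKNRSS"))
print(find_furin_like_sites("MGGRVRRLL"))
```

A motif match is only a candidate: actual cleavage also depends on the accessibility of the site in the folded precursor, so the hits would need to be compared with the experimentally defined SU/TM boundary.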

Syncytin-2 expression is limited to the villous cytotrophoblasts (Malassiné et al., 2007), whereas its receptor, the transmembrane protein MFSD2 (major facilitator superfamily domain containing 2), is found specifically in the placental syncytiotrophoblast (Esnault et al., 2008). Syncytin-2 is thus responsible for a polarized fusion that drives the merging of individual cytotrophoblasts to the syncytiotrophoblast, sustaining its growth and regeneration in concert with syncytin-1 (Esnault et al., 2008). In addition, the syncytin-2 ISD is highly preserved and shows strong immunosuppressive activity, possibly providing immune tolerance to the fetal allograft (Mangeney et al., 2007). As for syncytin-1, syncytin-2 is epigenetically regulated at the proviral 5′LTR, being hypomethylated in placenta and hypermethylated in other cells (Gimenez et al., 2009; Liang et al., 2010); and it is further controlled by GCM1 that binds syncytin-2 and MFSD2A promoters stimulating placental cell-cell fusion (Liang et al., 2010).

## Other HERV-Derived Env Proteins

Besides syncytins, other HERV Envs are expressed in normal conditions. This suggested their possible involvement in human physiology, even if no conclusive demonstrations have been reached yet.

The env gene of an ERV3 sequence in locus 7q11.21 contains a functional ORF (Boyd et al., 1993; Andersson et al., 2002, 2005) and ERV3 Envs have been detected in various human tissues, showing substantial expression in placenta (especially at the cyto/syncytiotrophoblastic layer), in the reproductive tract and in cells undergoing fusion either in physiological (myocardium, skeletal muscle) or pathological conditions (macrophages, tumors) (Fei et al., 2014).

HERV-K(HML2) Env expression has been detected in villous and extravillous cytotrophoblasts during the whole period of gestation, even if the protein has neither been found at any gestational time in placental syncytiotrophoblast nor linked to a specific locus of origin (Kämmerer et al., 2011). Furthermore, type II HML2 Rec proteins were recently found to be upregulated during embryogenesis, specifically stimulating an interferon (IFN)-induced viral restriction factor (IFITM126) in epiblast and embryonic stem cells (Yan et al., 2013; Grow et al., 2015). This led to the intriguing hypothesis that Rec and the associated HML2 transcripts might be sensed in the cytosol, stimulating an innate anti-viral response able to broadly inhibit embryonic viral infections (Grow et al., 2015). In the same embryonic cells, Rec was shown to interact with ∼1,600 cellular mRNAs and to influence their ribosome occupancy, further suggesting a physiological role during early development (Grow et al., 2015).

Contrary to syncytins, the HERV Env encoded by a HERV-F provirus in locus 21q22.3 has been found to inhibit fusogenicity in mammals and therefore named suppressyn (Sugimoto et al., 2013). Suppressyn is a 160 aa polypeptide with placenta-specific expression, corresponding to the Env N-terminal portion and including a putative SP and a SU subunit with a premature stop codon upstream the SU/TM cleavage site (Sugimoto et al., 2013). The protein was shown to compete for the binding to syncytin-1 receptor and to significantly reduce its fusogenicity, possibly representing the first env-derived restriction factor found in Catarrhini primates (Sugimoto et al., 2013). Of note, the defective structure of this HERV-derived Env, lacking a portion of SU and the whole TM subunit, confirms that not only full-length but also truncated proteins can have some biological significance to the host.

## HERV Env and Oncogenesis: Passive Bystanders or Active Contributors?

Oncogenesis is a multistep process hypothesized to be the result of a complex interplay between inherited and environmental factors, including viral infections. Nevertheless, while the oncogenic properties of exogenous retroviruses are well-known, HERV sequences have a more uncertain pathological significance, being expressed in many tissues without pathological consequences. Hence, a major obstacle is to properly evaluate whether their presence has a direct role in the disease onset, or if it is just an indirect product of transformation. In fact, tumors show a general epigenetic dysregulation, known to strongly and non-specifically liberate retrotransposon expression. Overall, different HERV-mediated mechanisms of oncogenesis have been proposed, including the ones not requiring any expressed product. Concerning HERV Env proteins, even if not confirmed yet as oncogenic agents, they have been suggested to support tumorigenesis through the same biological activities domesticated for physiological purposes: fusogenicity and immunosuppression (**Figure 5**).

Cell-cell fusion is known to occur physiologically in certain tissues (e.g., myocardium, skeletal muscle, placenta) and is frequently observed under pathogenic stimuli, such as inflammation and cancer. Tumor cells are in fact able to fuse with each other and with non-transformed cells, and such a process is involved in cancer progression, metastasis and chemoresistance (Walker et al., 2009; Berndt et al., 2013; Bastida-Ruiz et al., 2016) (**Figure 5**). Cellular fusion is also an important source of malignant cell heterogeneity and genetic instability, leading to polyploidy and aneuploidy (Bastida-Ruiz et al., 2016) (**Figure 5**). Thus, given that cancer cells show migration and invasion features very similar to the ones of syncytiotrophoblast, domesticated HERV Env fusogenicity could stimulate the uncontrolled cell fusion in tumors, underlining the need for a strict regulation.

responses. All these processes can thus have an impact on tumor growth and in the subsequent process of tissue invasion and migration.

In addition, the ISD of HERV Envs may support tumor progression, abrogating the anti-oncogenic cytolytic immune response (Kassiotis and Stoye, 2017) (**Figure 5**). This hypothesis is sustained by the strong evidence of tumor promotion by the Env ISD of animal ERVs (Ruprecht et al., 2008b).

Apart from fusogenic and immunosuppressive activities, which can be held by any functional HERV Env, the HERV-K(HML2) sequences could sustain transformation through their accessory proteins Np9 and Rec, proposed to be HERV-derived oncogenes.

## HERV-W

A large body of studies reported the HERV-W group RNA and protein upregulation in human cancers, often with no information about the originating loci and the eventual influence of tumor epigenetic dysregulation on their expression (Stauffer et al., 2004; Yi et al., 2004; Bjerregaard et al., 2006; Kim et al., 2008; Díaz-Carballo et al., 2015, and reviewed in Grandi and Tramontano, 2017). Of note, in the case of HERV-W sequences, the hyperexpression prompted by hypomethylated environments could possibly trigger the de novo retrotransposition of mRNA sequences (Grandi and Tramontano, 2017). This event has been ancestrally responsible for the formation of HERV-W processed pseudogenes, accounting for ∼2/3 of the actual group members (Grandi et al., 2016) and occurred through the L1-mediated reverse transcription and mobilization of HERV-W RNA transcripts (Costas, 2002; Pavlícek et al., 2002). Thus, given that (i) our genome still contains 80–100 retrotransposition-competent L1, and that (ii) L1 reactivation has been broadly reported in human malignancies (Hancks and Kazazian, 2016; Scott and Devine, 2017) and implicated in metastasis and cancer progression (Papasotiriou et al., 2017), the de novo L1-mediated insertion of HERV-W processed pseudogenes could possibly further contribute to tumor genetic instability (Grandi and Tramontano, 2017).

In breast and endometrial carcinomas, syncytin-1 fusogenicity has been proposed to mediate cancer cell fusion with other tumoral or normal cells (Bjerregaard et al., 2006; Strick et al., 2007), potentially altering their biological behavior and sustaining tumor progression. Particularly, human breast cancer (hBC) cell lines that express syncytin-1 on the cellular membrane are able to fuse with endothelial cells presenting hASCT2 receptor (Bjerregaard et al., 2006). Such Env expression on hBC cells was found in ∼40% of patients, being related to the rate of recurrence and survival, and is thus proposed to be a prognostic marker (Larsson et al., 2007). The common occurrence of giant syncytial cells was similarly observed in endometrial carcinomas, whose development is known to be linked to hormone replacement therapies. Thus, both the fusogenic and the steroid-driven nature of this tumor were investigated in relation to syncytin-1 expression. Even if syncytin-1 was upregulated also in benign endometrial specimens, the highest expression was observed in carcinoma tissues, being induced by steroid hormones due to the presence of a hypomethylated estrogen responsive element (ERE) in ERVWE1 5′LTR (Strick et al., 2007; Strissel et al., 2012).

In some other instances, HERV-W Envs have been linked to tumor development due to their interference with crucial cellular pathways. For example, syncytin-1 was upregulated in >75% of bladder urothelial carcinomas, increasing proliferation and viability of immortalized uroepithelial cells, and a recurrent single nucleotide substitution in ERVWE1 3′LTR was shown to drive the binding of c-myb transcription factor, possibly empowering syncytin-1 promoter activity in this malignancy (Yu et al., 2014). In neuroblastoma, such dysregulation was shown to involve the cAMP-pathway, already mentioned for its regulatory role on the syncytin-1 promoter (Frendo et al., 2003) and cytotrophoblast differentiation (Keryer et al., 1998). Particularly, in neuroblastoma cell lines, the HERV-W Env transfection mediated an augmented phosphorylation of the activating transcription factor CREB (CRE-binding protein), leading to the hyperactivation of an ion channel (SK3, small conductance Ca2+- activated K+ channel protein 3) already known to be involved in neuron excitotoxicity and neurological diseases (Li et al., 2013).

Furthermore, a few studies reported the presence of syncytin-1 in other types of malignancies, without however characterizing its possible contribution to cancer. Syncytin-1 was detected in colorectal carcinomas, being confined to the tumor areas and hyperexpressed in villar and intervillar regions as well as in large intestine crypts (Larsen et al., 2009; Díaz-Carballo et al., 2015). In mycosis fungoides (the most common primary cutaneous T-cell lymphoma), 50% of patients expressed syncytin-1 in infiltrating lymphocytes, while no expression was detected in patients presenting benign T cell infiltrates (Maliniemi et al., 2013). In a third study on leukemia, >2/3 of samples showed abnormal syncytin-1 expression, and the protein was detected in the blood cells of patients but not healthy controls (Sun et al., 2010).

Finally, in a few cases, syncytin-1 was contrarily downregulated in cancer, possibly suggesting its positive prognostic value in certain tumors. Accordingly, pancreatic adenocarcinoma samples showed reduced syncytin-1 expression concomitant to ERVWE1 LTRs hypermethylation (Lu et al., 2015), and B16F10 melanoma cells stably expressing syncytin-1 were significantly limited in proliferation and invasion (Mo et al., 2013).

## HERV-FRD

While many studies were devoted to syncytin-1 expression in cancer, very few studies investigated the second domesticated Env protein, HERV-FRD syncytin-2 (**Table 2**). In the first study in hBC cells, the ectopic expression of GCM1 led to the hypomethylation of syncytin-2 5′LTR, stimulating the protein expression and fusogenicity (Liang et al., 2010). In a second study, syncytin-2 overexpression was shown in endometrial tumoral and pre-tumoral lesions, being significantly associated with the disease stage and histological grading (Strissel et al., 2012). Finally, a significant upregulation of syncytin-2, along with syncytin-1, has been reported in colon adenocarcinoma samples (Díaz-Carballo et al., 2015).

TABLE 2 | Main molecular evidence of putative pro-oncogenic activity of HERV-derived Env.

*HERV-derived Env proteins for which a specific pro-oncogenic activity and/or a precise effect on cellular transformation have been characterized are included.*

## HERV-K

The HERV-K supergroup, composed of the Human MMTV-like (HML) groups 1–10 (**Table 1**), probably constitutes the most investigated HERV ensemble in relation to carcinogenesis. Particularly, in-depth attention was devoted to the HERV-K(HML2) group, often generically referred to just as HERV-K. This group includes the evolutionarily youngest HERV sequences, showing a remarkably recent activity that led to the formation of both human-specific loci (i.e., present in humans but not in non-human primates) (Subramanian et al., 2011) and unfixed insertions within the human population (Marchi et al., 2014). Moreover, as mentioned above, HML2 Np9 and Rec accessory proteins have been suggested to be oncogenes.

### HERV-K(HML2)

HML2 env expression attracted significant attention in the tentative link to tumorigenesis, either as Env or as the above accessory variants Np9 and Rec (**Figures 1B**, **6**).

#### **Env**

The upregulation of HML2 env in hBC cells and tissues (Wang-Johanning et al., 2001, 2003; Zhao et al., 2011) together with the link between MMTV and mouse mammary carcinoma (Bittner, 1936; Matsuzawa et al., 1995) greatly prompted the search for a possible involvement of HML2 Env in hBC. The protein is known to be present at different levels in both in situ and invasive hBC tissues, stimulating cellular and humoral immunity (Wang-Johanning et al., 2008; Zhao et al., 2011), and its overexpression has been linked to hBC stage, grade, p53 mutation status and metastatic spread, having thus a possible prognostic value (Zhao et al., 2011; Zhou et al., 2016). Of note, monoclonal antibodies (Abs) against HML2 Env, which were shown to inhibit the proliferation of hBC cells and the growth of xenograft tumors in mice, are being investigated as anti-cancer agents (Wang-Johanning et al., 2012). Overall, while many findings reported the presence of either HML2 Env in diseased tissues or specific immune responses against it in hBC patients, the molecular mechanisms involving the protein in hBC oncogenesis are still not fully clarified. A recent study showed that HML2 Env knockdown was able to inhibit hBC cell proliferation, migration and invasion, affecting various cellular networks playing key roles in cancerogenesis (EGFR, TGFB1, NF-κB, c-myc, and p53) and impairing tumor-associated gene expression (ras, p-RSK, and p-ERK) (Zhou et al., 2016) (**Figure 6**). Accordingly, HML2 Env downregulation significantly reduced tumor formation and metastasis in mouse xenografts while, when overexpressed in hBC cells, the previously observed impairments in cellular networking, migration, invasion, and transformation were reverted (Zhou et al., 2016). Likewise, HML2 Env expression in non-transformed breast cells prompted epithelial to mesenchymal transition, leading to an increase in cell motility, migration and invasion (Lemaître et al., 2017). Mechanistically, such behavioral changes were mediated by HML2 Env through the activation of ERK1/2 MAPK pathway (**Figure 6**) and the stimulation of transcription factors associated with cancer aggressiveness (Lemaître et al., 2017).

site II), respectively, have been reported to potentially trigger transformation by the positive stimulation (orange arrows) of cellular transcription factors and pro-oncogenes (violet circles) and/or through the interaction with proteins controlling their degradation (pink) or repression (yellow), leading to the lack of negative regulation (blue lines). The main potential effects on tumor growth and progression are indicated.

Additional evidence of HML2 Env oncogenicity comes from intensive studies in melanoma. Both HML2-derived Env and Rec proteins have been detected at variable percentages in melanoma biopsies and cell lines, being instead generally absent in normal melanocytes (Muster et al., 2003; Büscher et al., 2005, 2006). Moreover, HML2 virus-like particles with RT activity but lacking infectivity were observed in melanoma cell cultures (Muster et al., 2003; Büscher et al., 2005, 2006; Serafino et al., 2009). HML2 Env was shown to trigger humoral immunity in ∼20% of melanoma patients, and patients with anti-HML2 Abs had a decreased overall survival (Büscher et al., 2005; Hahn et al., 2008). Finally, HML2 Env inhibition decreased the fusion of cultured melanoma cells, suggesting a role in the formation of multinuclear cancer cells and in the onset of genetic heterogeneity conferring trophic and survival advantages to tumor cells (Huang et al., 2013a). In this regard, a pivotal role in melanoma development has been suggested for cancer subpopulations bearing stem cell properties (Frank et al., 2010). CD133+ melanoma cell lines undergoing medium-induced phenotype-switching showed stemness features (enhanced proliferation, migration and invasion) concomitant with the activation of HML2 expression, thus suggested to have a role in cellular plasticity (Argaw-Denboba et al., 2017). Similarly, stress conditions stimulating the transition from an adherent to a non-adherent (and more malignant) phenotype led to a concomitant upregulation of HML2 expression, whose inhibition was in turn able to prevent the phenotype-switch (Serafino et al., 2009). Given that the observed morpho-behavioral changes are analogous to the ones arising from BRAF gene suppression in melanoma, commonly leading to the constitutive activation of MEK–ERK signaling, HML2 Env expression has been related to ERK activation, being downregulated after the inhibition of MEK or CDK4 (Li et al., 2010). 
In general, this augmented HML2 expression in melanoma could represent a tumor epiphenomenon, especially in the presence of impairments in cellular signaling and stress conditions altering the tumor transcriptional environment. In the case of melanoma, for example, UV irradiation is a known risk factor and has been shown to trigger HERV-K protein expression (Reiche et al., 2010; Schanab et al., 2011). Similarly, a melanoma-specific transcription factor is able to activate HML2 LTRs (Katoh et al., 2011) and the RB protein, a downstream mediator of BRAF–MEK–ERK signaling often altered in cancers, is a key regulator of DNA methylation that can influence HERV expression (Li et al., 2010).

HML2 Env was detected on the surface of ovarian cancer (OC) lines and patients' cells, showing a general correlation with the tumor histotype (Wang-Johanning et al., 2007; Rycaj et al., 2015). OC patients showed significantly higher titers of Abs against HML2 Env and specific T-cell cytotoxicity against autologous OC cells (Wang-Johanning et al., 2007; Rycaj et al., 2015). However, the similar Ab positivity found against HERV-E and ERV3 Envs (Wang-Johanning et al., 2007), together with the general hypomethylation of HERV sequences in OC (Iramaneerat et al., 2011), suggests that the increased HERV expression and Ab production could constitute (at least in part) a tumor epiphenomenon. Nevertheless, even HERV proteins arising from tumor-dependent upregulation can then participate in a multifactorial stimulation, contributing to cancer progression.

### **Rec**

HML2 Rec is a functional homolog of HIV-1 Rev and HTLV Rex accessory proteins, protecting the retroviral transcripts from cellular splicing and enhancing their nuclear export (Magin et al., 1999). Thus, Rec shares with Rev and Rex the main structural and biological properties. First of all, to export viral RNAs, these proteins need to be imported from the cytoplasm to the nucleus through a specific nuclear localization signal (NLS) rich in basic amino acids (often arginines) that interacts with cellular import factors. Once inside the nucleus, the efficient binding to viral transcripts relies on the interaction between Rev/Rex/Rec and a specific responsive element in viral transcripts. These responsive elements, named RRE/RxRE/RcRE, respectively, can be located either within env (RRE) or in the 3′UTR (RxRE, RcRE) and show a highly structured and folded RNA organization (Magin et al., 1999; Magin-Lachmann et al., 2001). In the case of HML2 transcripts, RcRE presents four stem-loops essential for Rec- but not Rev- and Rex-mediated export, which does occur in vitro through discrete binding sites (Magin-Lachmann et al., 2001). The ribonucleoprotein multimers so formed then cooperate with various host factors to stabilize the transcripts and mediate their nuclear export, competing in this way with the cellular splicing machinery. To do this, Rec/Rev/Rex presents a nuclear export signal (NES) rich in leucines that is recognized by cellular exportins, the most important of which is CRM1, to mediate the active egress of the protein with the associated unspliced RNAs to the cytoplasm (Fornerod et al., 1997; Fukuda et al., 1997; Neville et al., 1997; Ossareh-Nazari, 1997; Magin et al., 1999; Boese et al., 2000b). Besides CRM1, Rec can interact with Staufen-1, a ubiquitous protein involved in the cytoskeleton-mediated transport of ribonucleoprotein complexes, increasing the export and translation of Rec-associated mRNAs (Hanke et al., 2013b).
Staufen-1 is also involved in RNA decay during cellular stress, recruiting non-essential mRNAs and accumulating them into stress granules or processing bodies, to prevent their translation or mediate their degradation, respectively. Accordingly, in stressed cells, Rec co-localizes with Staufen-1 in stress granules, leading to the block of viral RNA translation (Hanke et al., 2013b).

Besides these virus-specific regulatory functions, a pathological role of Rec was first suggested in human germ cell tumors (hGCT), due to the expression of HML2 env spliced mRNA variants (Löwer et al., 1993) and the development of anti-HML2 Env Abs in ∼85% of patients (Sauter et al., 1996). Such putative oncogenic properties were then attributed to the Rec splicing variant, given that tumor development was observed in nude mice receiving injections of HML2 Rec but not in the ones injected with the full-length Env (Boese et al., 2000a). HML2 Rec expression in transgenic mice was moreover able to induce in situ testicular carcinomas, the predecessor lesion of GCT (Galli et al., 2005). At the molecular level, Rec was shown to interact with the human tumor suppressor PLZF (promyelocytic leukemia zinc-finger protein) (Boese et al., 2000a), known to be involved in leukemia development. Rec binding to PLZF abolished the PLZF-mediated transcriptional repression of c-myc, stimulating cell growth and proliferation (Denne et al., 2007) (**Figure 6**). Besides PLZF, Rec can form complexes with the androgen receptor (AR) (Kaufmann et al., 2010) as well as with its negative regulators: the PLZF-related testicular zinc-finger protein co-repressor (TZFP) (Kaufmann et al., 2010) and the human small glutamine-rich tetratricopeptide repeat-containing protein co-chaperone (hSGT) (Hanke et al., 2013a). In particular, Rec can associate in a trimeric complex with TZFP and AR, overcoming the repression of the latter by the former (Kaufmann et al., 2010), and can also directly interact with hSGT, similarly abrogating its capacity to bind and suppress AR (Hanke et al., 2013a) (**Figure 6**).
Such a pathway could be further sustained by hormone stimulation, including androgens: in this vicious cycle, the Rec-mediated suppression of AR negative regulation on the one hand enhances the expression of AR-dependent genes, leading to cellular proliferation and reduced apoptosis, and, on the other hand, it activates HML2 LTRs, further stimulating Rec production (Hanke et al., 2013a).

In hBC, anti-Rec Abs were detected in early-stage patients and suggested to be predictive of the disease progression (Wang-Johanning et al., 2013). It was proposed that Rec interaction with AR/TZFP/hSGT and the activation of c-myc can cooperate with AR-mediated dysregulation of HER2/HER3 signaling (Hanke et al., 2016). However, to the best of our knowledge, no specific molecular mechanisms of tumorigenesis in hBC have been demonstrated yet, and the pro-oncogenic activation of MAPK-ERK1/2 pathway observed for HML2 Env was instead absent when considering Rec (Lemaître et al., 2017).

### **Np9**

HML2 Np9 originates as a shorter env splicing variant (∼9 kDa) associated with type I HML2 proviruses (**Figure 1B**) and, unlike the Rec protein that is normally found in the cytoplasm, it shows a predominant nuclear localization (Armbruester et al., 2002). Np9 expression was originally reported in transformed cells only (Armbruester et al., 2002) and has therefore been investigated for its oncogenic potential, even if subsequent studies revealed its physiological transcription in various healthy tissues (Schmitt et al., 2015).

First of all, the homology of Np9 and Rec in the 14 aa at the N-terminus drove the search for common cellular partners relevant to tumor transformation (Armbruester et al., 2002). Thus, Np9 was shown to bind PLZF in the nucleus, not through the N-terminal portion shared with Rec but with the first NLS (Denne et al., 2007), already shown to be critical for the protein nuclear localization (Armbruester et al., 2004). In this way, Np9 was suggested to potentially act as an oncoprotein, interfering with PLZF repression of c-myc (Denne et al., 2007) (**Table 2**, **Figure 6**).

Np9 was also shown to interact with two E3 ubiquitin ligases involved in the proteasome-dependent degradation of cellular proteins relevant to proliferation and transformation. Firstly, Np9 can bind LNX (Ligand of Numb protein X), which mediates Numb ubiquitylation and degradation (Armbruester et al., 2004). Numb is a crucial cell fate determinant and functions as a Notch antagonist in the differentiation/proliferative pathway. The Notch pathway is, in fact, an evolutionarily conserved signaling network regulating proliferation, differentiation and self-renewal of stem and progenitor cells, and its dysregulation has been implicated in some tumors (Flores et al., 2014). While LNX-Numb interaction occurs in endosomes, Np9 competes for LNX binding, directing the protein to nucleoli and affecting in this way LNX intracellular localization and Numb degradation, thereby promoting Notch signaling (Armbruester et al., 2004) (**Table 2**, **Figure 6**). Interestingly, Np9 interacts with the Epstein-Barr Virus (EBV) nuclear Ag EBNA2, which is considered to be a viral homolog of Notch due to its capacity to promote B-cell proliferation and immortalization (Gross et al., 2011). In particular, Np9 nuclear binding to EBNA2 exerted a negative impact on the latter's functions of promoter activation and binding to cellular transcription factors (Gross et al., 2011) (**Table 2**). In addition, Np9 was shown to bind another E3 ligase, MDM2, which has a pivotal role in the negative regulation of the p53 transcription factor through its ubiquitylation for degradation or nuclear exclusion (Heyne et al., 2015). Np9 was shown to compete with p53 for the binding to MDM2, thus affecting the former's proteasomal degradation and possibly supporting its uncontrolled activity as a further pro-oncogenic mechanism (Heyne et al., 2015) (**Table 2**, **Figure 6**).

Additional evidence of Np9 oncogenic activity was obtained in myeloid and lymphoblastic leukemia cells, whose survival and growth were promoted by Np9 overexpression and which, when injected into immunodeficient mice, led to faster tumor growth and increased tumor weight (Chen et al., 2013). Moreover, in leukemia cells expressing native Np9, the protein was shown to co-activate multiple leukemia-associated signaling pathways, inducing an aberrant upregulation of pERK, c-Myc and β-catenin and cleaving Notch1 with a subsequent decrease of Numb (Chen et al., 2013) (**Table 2**, **Figure 6**). Overall, the ability of Np9 to act as a molecular switch of multiple signaling pathways critical to cellular growth and proliferation could support a pro-oncogenic role of this protein.

### HERV-K(HML6)

The sole finding linking HML6 Envs to human cancers was reported in a melanoma patient who presented an HML6 Ag expressed on tumor cells and targeted by cytolytic T lymphocytes (Schiavetti et al., 2002). This Ag was encoded by a HERV-K(HML6) env gene located on chromosome 16, being expressed in ∼85% of transformed melanocytes and generally absent in normal tissues, and was therefore named HERV-K-MEL (Schiavetti et al., 2002). HERV-K-MEL was proposed as a biomarker for melanoma onset, even if its expression was detected in the majority of benign nevi and in normal skin samples as well (Schiavetti et al., 2002). The authors subsequently suggested that HERV-K-MEL-derived Ags, being targeted by patients' cytolytic T lymphocytes, could constitute a target for vaccination and anti-cancer approaches, as discussed below.

## Other HERV Groups

Besides the above-discussed groups, other HERV Env proteins have been investigated by a few studies. Donor T lymphocyte infusion has curative effects in hematological malignancies, and has been shown to induce tumor regression in metastatic renal cell carcinoma patients. In tumor lines established from biopsies of the latter, the onset of alloreactive T cells was observed and the recognized sequences had 100% homology with proteins encoded by a HERV-E locus on chromosome 6 (Takahashi et al., 2008). Expression of this protein was up-regulated only in the clear cell carcinoma variant, being promoted by the hypoxia-inducible transcription factor 2α as a consequence of von Hippel-Lindau factor (VHL) inactivation (Takahashi et al., 2008). Later on, the same HERV-E locus was shown to express a full-length env, similarly found in clear cell renal carcinomas only and detected in all patients harboring VHL deficiency (Haruta et al., 2015). The corresponding putative Env stimulated cytolytic T lymphocytes specifically recognizing the HERV-E-expressing carcinoma cells, with a possible immunotherapeutic application (Haruta et al., 2015). Similarly, a HERV-H env on chromosome X was found to be highly expressed in a subset of gastrointestinal cancers (Wentzensen et al., 2007) and the putative Env peptides showed the ability to stimulate autologous cytolytic T lymphocytes, leading to IFN-γ production and lysis of colorectal carcinoma cells (Mullins and Linnebacher, 2012a).

## HERV ENV IMMUNOPATHOGENIC PROPERTIES: POSSIBLE ROLES IN AUTOIMMUNITY

A functional immune system is able to discriminate between foreign immunogenic Ags, stimulating effective immune responses, and self-Ags, for which immune tolerance becomes established during development. Autoimmunity defines a heterogeneous ensemble of multifactorial disorders sharing the loss of such tolerance. Its clinical manifestations include the activation of T helper lymphocytes and the onset of Abs and/or cytotoxic T cells directed against body components, leading to chronic inflammation and tissue damage. In theory, HERV products should be recognized as self-Ags, being stable components of the human genome highly expressed during development, when immune tolerance is acquired. Nevertheless, HERV expression is still able to trigger both innate and adaptive immunity, and has consequently been investigated in a large number of autoimmune disorders. To explain this paradox, the most accepted model is that HERV Ags stimulate immunity due to their similarity to exogenous viral proteins, i.e., based on molecular mimicry (Trela et al., 2016) (**Figure 7**). In this way, HERV Ags normally expressed in healthy cells may be recognized as pathogen-associated molecular patterns (PAMPs) by the pattern recognition receptors (PRRs) of innate immunity, triggering inflammation and T helper lymphocyte differentiation and evoking cell-mediated cytotoxicity and auto-Ab production (Hurst and Magiorkinis, 2015). Besides molecular mimicry, even in the absence of a specific immune recognition, HERV proteins can elicit the non-specific polyclonal activation of auto-reactive T lymphocytes, acting like super-Ags and inducing massive cytokine release with potentially life-threatening manifestations (shock, multi-organ failure) (Brodziak et al., 2011; Emmer et al., 2014) (**Figure 7**).
Finally, as described for cancer, the presence of an altered epigenetic environment has been widely reported in autoimmunity too, and must be considered when investigating the upregulation of HERV expression and its possible pathological significance. Nevertheless, even the hypomethylation-dependent expression of HERV proteins could provide immunopathogenic agents possibly suitable as biomarkers and therapeutic targets.

In autoimmune diseases, Env proteins are the most intensively investigated HERV products due to their remarkable immunopathogenic properties. Overall, the soundest evidence has been obtained about the HERV-W Env immunopathogenic potential in multiple sclerosis (MS), while the findings about the contribution of other HERV Envs to autoimmune disorders are still quite controversial. However, as already mentioned for cancers, no HERV sequence or protein has been definitively associated with any autoimmune disease yet.

destruction. In addition, HERV Env proteins can act as strong activators of the immune system with superAg function, prompting the non-specific stimulation of T lymphocytes. The consequent polyclonal expansion of reactive T cells can lead to massive cytokine release, with extensive tissue damage and systemic life-threatening manifestations (shock, multi-organ failure).

## HERV-W

The potential role of HERV-W sequences and their expression products in autoimmunity has been recently reviewed (Hon et al., 2013; Grandi et al., 2017). Concerning HERV-W Env proteins, the majority of studies investigated their role in MS and, only recently, in chronic inflammatory demyelinating polyradiculoneuropathy (CIDP) and type 1 diabetes.

### Multiple Sclerosis and Other Demyelinating Diseases

MS is an autoimmune disease having as main signature the progressive demyelination of the central nervous system, with immunopathogenic manifestations sustained by alterations in both innate and adaptive immunity (Antony et al., 2011). Like many autoimmune disorders, MS shows a complex and poorly understood etiology, which has so far hindered the development of specific therapeutic approaches. A great number of etiological determinants have been proposed, including genetic predisposition, environmental factors and various infectious agents (Perron and Lang, 2010; Libbey et al., 2013; Morandi et al., 2015) that likely participate in a multifactorial immunopathogenesis. The specific causes of demyelination and axon damage are still not fully clarified, even if inflammatory reactions prompted by cytokines, chemokines, prostaglandins, reactive oxygen species and matrix metalloproteinases are known to play a significant role (Antony et al., 2011). The initial link between the HERV-W group and MS was prompted by its sequence identity with MSRV (MS retrovirus), a putative exogenous element detected in a variable proportion of MS patients (Garson et al., 1998; Komurian-Pradel et al., 1999; Voisset et al., 1999). The origin and nature of MSRV are, however, still a subject of debate (Blomberg et al., 2000; Ruprecht et al., 2008a; Voisset et al., 2008), and the MSRV sequences found in MS patients might have originated from the expression of individual HERV-W loci or from the recombination of different HERV-W transcripts (Schmitt et al., 2013; Grandi et al., 2016), a known confounder in the in vitro analysis of multicopy HERV sequences (Flockerzi et al., 2007). In the last three decades, many studies reported the presence of HERV-W transcripts in MS samples, of HERV-W Ags in MS lesions, or of specific B and T cell immune responses against them in MS patients.
Concerning HERV-W/MSRV Envs, even if the (variable) presence of such proteins in MS patients is undoubted, their precise role in the disease etiology is still to be defined. Nevertheless, their clear immunopathogenic potential, as shown through the use of MS animal models, led to their current exploitation as innovative therapeutic targets (Curtin et al., 2015b). While HERV-W/MSRV Env expression has been documented in both normal and MS brains, making a pivotal role of these proteins in the disease onset unlikely (Grandi and Tramontano, 2017), their increased presence in MS lesions and their ability to trigger adaptive and, especially, innate immunity suggested that they could take part, with other individual factors, in MS immunopathogenesis (**Table 3**).

Regarding innate immunity, syncytin-1 was found to be upregulated in the brain cells of MS patients that are specifically involved in neuroinflammation, i.e., astrocytes and microglia, being instead expressed at low levels (Antony et al., 2004) or even absent (Mameli et al., 2007a; Perron et al., 2012) in healthy individuals. HERV-W/MSRV Envs were similarly abundant within MS brain lesions, being associated with actively demyelinating sites and expressed principally by macrophages and microglia (van Horssen et al., 2016). A moderate expression was reported in reactive astrocytes within demyelinating areas (van Horssen et al., 2016), and syncytin-1 in vitro upregulation prompted the production of proinflammatory molecules that could likely cause astrocyte and oligodendrocyte damage (Antony et al., 2011). In addition, HERV-W/MSRV Env was shown to act like a superAg (Emmer et al., 2014), stimulating polyclonal T-cell activation and eliciting an abnormal innate response with massive release of multiple cytokines, already known to play a central role in demyelination (Perron et al., 2001; Rolland et al., 2005) (**Table 3**). Further studies revealed that HERV-W/MSRV Env pro-inflammatory properties depend on the stimulation of toll-like receptor 4 (TLR4) (Perron et al., 2001; Rolland et al., 2006; Saresella et al., 2009), evoking the same proinflammatory cytokines prevalent in MS, such as interleukins (IL-1, IL-6) and tumor necrosis factor α (Rolland et al., 2005, 2006; Mameli et al., 2007b). Such TLR4 activation by Env proteins also led to the induction of nitric oxide synthase, with the formation of nitrotyrosine groups blocking oligodendrocyte differentiation and hence affecting myelin expression and renewal (Kremer et al., 2013) (**Table 3**). The marked ability of HERV-W/MSRV Envs to trigger innate immunity has been further characterized through murine models (**Table 3**).
The intraperitoneal injection of MSRV virions in humanized immunodeficient mice led to acute neurological inflammation and animal death due to massive brain hemorrhage (Firouzi et al., 2003). A similar outcome was obtained with syncytin-1 overexpression, inducing inflammation, neurobehavioral abnormalities and oligodendrocytes and myelin injuries mediated by redox reactants including nitric oxide (Antony et al., 2004, 2007). Finally, MSRV-Env was confirmed to strongly trigger the TLR4-mediated production of proinflammatory cytokines, leading to experimental allergic encephalomyelitis in mice (Perron et al., 2013). Accordingly, MSRV Env was confirmed to be a potent agonist of TLR4 (Madeira et al., 2016) and a monoclonal Ab neutralizing the protein (GNbAC1) was shown to reduce such immune activation, rescuing myelin expression (Kremer et al., 2014).

Considering adaptive immunity, HERV-W/MSRV Env epitopes have been detected on active MS patient B cells and monocytes (Brudek et al., 2009) and showed potential cross-reactivity against myelin proteins (Ramasamy et al., 2017), being possibly involved in molecular mimicry events. Accordingly, MSRV Env stimulated IFN-γ release by T cells when associated with the myelin oligodendrocyte glycoprotein (MOG) Ag (Perron et al., 2013), and the Env protein encoded by an HERV-W processed pseudogene (ERVWE2, locus Xq22.3) was shown to have five domains similar to the Ig-like domain of MOG, including T and B cell epitopes (Do Olival et al., 2013) (**Table 3**). In addition, IgG Abs in MS patients were shown to strongly recognize two HERV-W Env peptides, with a slight decline after IFN-β treatment (Mameli et al., 2015). A similar decrease in anti-HERV-W Env Abs following IFN-β therapy was previously reported, although it was not statistically significant (Petersen et al., 2009). In contrast, another study assessing the humoral response against HERV-W/MSRV found no support for its specificity to MS, detecting syncytin-1 Abs in only 1/50 patients and MSRV Env Abs in none of them (Ruprecht et al., 2008a).

Overall, to date, the sole presence of HERV-W/MSRV Env Ags and Abs in MS patients does not support their association with MS etiology. Conversely, the evident Env immunopathogenic properties, especially in evoking innate immunity, strongly suggest a role in MS clinical manifestations (Grandi and Tramontano, 2017). However, the fact that HERV-W/MSRV Envs have been reported in healthy controls too suggests that their expression likely represents a physiological phenomenon, possibly showing higher prevalence and pathological consequences in MS due to the altered epigenetic and immunological environment and the presence of other individual triggers (Hon et al., 2013; Sun et al., 2016). Given this multifactorial interplay, involving genetic predisposition and exogenous agents (Christensen, 2005; Ryan, 2011), the HERV-W/MSRV Env superAg activity is the most probable HERV-derived contributor to MS clinical manifestations, inducing inflammatory effects coincident with the major hallmarks of MS (van Horssen et al., 2016). Such possible participation in MS neuropathogenesis makes HERV-W/MSRV Envs promising targets for innovative therapeutic approaches. In this regard, the GNbAC1 monoclonal Ab selectively recognizing HERV-W/MSRV Env showed promising neutralizing effects in vitro and in mouse models, and is currently in clinical trials (Curtin et al., 2015a,b) (see below).

HERV-W expression has also been investigated in CIDP, another autoimmune disease, which affects the peripheral nervous system with inflammatory and demyelinating lesions in nerve roots (Faucard et al., 2016). The expression of MSRV-Env was detected in 5 out of 8 CIDP patients (Perron et al., 2012), and the protein was found in the nerve lesions of 5 out of 7 patients, with a prevailing expression in Schwann cells (Faucard et al., 2016). Exposure of Schwann cells to MSRV-Env stimulated the production of IL-6 and the chemokine CXCL10, which was significantly inhibited by the GNbAC1 Ab (Faucard et al., 2016).

### Diabetes

HERV-W Env expression has been recently studied in type 1 diabetes, showing significant upregulation in 70% of diabetes patients as compared to a 12% positivity in healthy controls (Levet et al., 2017). The immunostaining of the protein in pancreatic specimens from 20 cases and 19 controls gave comparable positivity percentages (75 and 16%, respectively), showing a predominant localization in acinar cells, proximal to Langerhans islets (Levet et al., 2017). Furthermore, mice transgenic for HERV-W Env expression developed hyperglycemia, diminished insulin levels and pancreatic infiltrates of immune cells, all hallmarks of type 1 diabetes (Levet et al., 2017). In particular, insulin secretion was inhibited by the HERV-W Env protein in a dose-dependent manner, being restored in the presence of neutralizing Abs and possibly depending on the protein interaction with pancreatic β cell TLR4 (Levet et al., 2017). Hence, the authors suggested that HERV-W Env might exert a double pathological effect, impairing pancreatic β cell insulin secretion and stimulating autoimmune reactions. It is worth noting that a phase-IIa clinical trial is currently testing the GNbAC1 monoclonal Ab as a possible HERV-based therapeutic approach in type 1 diabetes (Levet et al., 2017).
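
To give a sense of how strong the association reported for the pancreatic immunostaining would be, the percentages above can be turned into a 2×2 table and summarized by an odds ratio and an exact test. The following is only an illustrative sketch: the counts (15/20 positive cases, 3/19 positive controls) are back-calculated here from the rounded percentages and are an assumption, not data taken from Levet et al. (2017), and the choice of Fisher's exact test is ours, not necessarily the method used in the original study.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Odds ratio for the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed one.
    """
    row1 = a + b            # e.g., number of cases
    col1 = a + c            # e.g., number of Env-positive specimens
    n = a + b + c + d       # total number of specimens
    denom = comb(n, row1)

    def prob(k):
        # Probability that exactly k of the row-1 specimens are positive
        return comb(col1, k) * comb(n - col1, row1 - k) / denom

    k_min = max(0, row1 - (n - col1))
    k_max = min(col1, row1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties
    return sum(p for p in (prob(k) for k in range(k_min, k_max + 1))
               if p <= p_obs * (1 + 1e-9))


# ASSUMED counts, back-calculated from the rounded percentages (75% of 20, 16% of 19)
cases_pos, cases_neg = 15, 5
ctrl_pos, ctrl_neg = 3, 16

or_value = odds_ratio(cases_pos, cases_neg, ctrl_pos, ctrl_neg)
p_value = fisher_exact_two_sided(cases_pos, cases_neg, ctrl_pos, ctrl_neg)
print(f"odds ratio = {or_value:.1f}, two-sided Fisher p = {p_value:.2g}")
```

With samples this small, the confidence interval around such an odds ratio would be wide, so the estimate is better read as an indication of the direction of the association than as a precise measure of its strength.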

## HERV-K

As seen for cancer, the majority of studies tentatively linking the HERV-K supergroup to autoimmunity were dedicated to the HML2 group, with particular attention to some specific proviruses: HERV-K10 (locus 5q33.3) and HERV-K18 (locus 1q23.3). It is worth noting that members of the same HERV group generally share high identity, and it is thus likely that findings reported for a given element could actually involve related ones too, especially with the use of Abs not characterized for their cross-reactivity with other proteins of the same group. Overall, unlike for HERV-W, the involvement of HML2-derived Env in autoimmune diseases is still controversial.

HML2 Envs derived from the HERV-K18 provirus were originally proposed to act like superAgs in type 1 diabetes, activating patients' Vβ7 and Vβ13 T lymphocytes and leading to pancreatic β cell damage (Conrad et al., 1997). Later on, this theory was widely contested by different studies that reported the absence of both selective expression and immunopathogenic significance of HERV-K18 Env in type 1 diabetes (Badenhoop et al., 1999; Jaeckel et al., 1999; Kim et al., 1999; Muir et al., 1999; Herve et al., 2002) as well as the lack of any superAg activity (Lapatschek et al., 2000).

In rheumatic diseases, two HML2 Env proteins (associated with HERV-K10 and the above-mentioned HERV-K18 proviruses) were investigated in systemic lupus erythematosus (SLE) and Sjögren syndrome to assess their ability to stimulate humoral immunity, showing, however, no significant increase of specific Abs as compared to healthy controls (Herve et al., 2002). Similarly, a recent study investigating HERV-K Env humoral recognition in rheumatoid arthritis reported the highest reactivity against a SU epitope that was, however, recognized at low prevalence by both patients and healthy controls (19 and 3%, respectively) and did not show significant correlation with the disease (Mameli et al., 2017).

## Other HERV Groups

In addition to HERV-W and HERV-K, a few studies reported the expression of other HERV Envs and/or the presence of specific Abs in some autoimmune disorders.

Psoriatic and atopic skin samples were generally positive for HERV-E Env as compared to a low positivity in normal skin, and the protein was also expressed in CD4+ T cells found in psoriatic lesions (Bessis et al., 2004). Such protein expression, contrary to what was observed for HERV-K in melanoma, was downregulated by UV irradiation (Bessis et al., 2004).

Elevated Ab titers against ERV-3 Env were found in healthy pregnant women (in accordance with its known placental expression) and in women affected by either SLE or Sjögren syndrome (Li et al., 1996). The authors reported that, in healthy women, the highest Ab levels were observed in mothers of babies suffering from congenital heart block. This and the evidence of ERV-3 Env expression in both placenta and fetal heart supported the theory of a possible autoimmunization during pregnancy (Li et al., 1996). However, the fact that patients suffering from SLE showed reactivity against HIV-1 and HTLV-1 Envs too could suggest the occurrence of molecular mimicry events (Balada et al., 2010).

## Evidence of HERV-Env Immunosuppressive Activity in Autoimmunity

While the majority of studies attempted to find a link between HERV Env and autoimmune reactions, a limited number of works presented instead an opposite scenario, in which these proteins could even downregulate immune activation through their immunosuppressive activity.

In a small group of psoriatic patients, the expression of HERV-K(HML2) env was decreased as compared with healthy individuals, with a concomitant decline in the specific Ab response (Gupta et al., 2014). An HERV-H Env protein was upregulated in SLE individuals, but was negatively correlated with IL-6 levels and able to affect the production of the latter ex vivo (Laska et al., 2017). Intriguingly, based on this inverse relation between Env expression and patients' immune response, the authors proposed that in some instances HERV Envs might have been exploited by human immunity to support a negative inflammatory feedback (Laska et al., 2017). In line with this hypothesis, the upregulation of HERV Env expression, as frequently observed in pathological conditions (e.g., exogenous infections and inflammation), could provide an immunosuppressive system to control harmful cytokine production (Laska et al., 2017). Even if limited, these findings might indicate that Env expression in autoimmunity could constitute a multifaceted phenomenon, arising from the altered immune and epigenetic conditions and having some relevance for the disease pathogenesis, likely depending on the genetic and environmental background. In addition, the impact of HERV Env expression on immunopathogenesis may be ambivalent, having negative or positive effects based on the prevalence of either immunostimulatory or immunosuppressive activities. In this view, HERV Envs in the healthy population can theoretically serve as a sort of "immune sentinels" that, on the one hand, provide physiological functions and protect the tissues from exaggerated immune reactions and, on the other hand, maintain a basal immune alert leading, in some complex conditions, to harmful effects.

## TOWARD HERV-BASED THERAPIES: NEEDS AND POTENTIAL

In line with their proposed role in cancer and autoimmunity, HERVs are considered promising targets for the development of innovative therapeutic strategies. It is however noteworthy that, paradoxically, while the first HERV-based therapies are currently in clinical studies, no human illness has been definitively associated with any HERV yet. Thus, in general, an important gap that current studies are trying to fill is to provide definitive evidence of a causal association between HERVs' presence/expression and the onset/progression of a given disorder. Such final demonstration should satisfy the criteria commonly used to assess cause-effect relationships and applied to viral-related pathogenesis as well, relying on the growing number of in vitro and in silico screening tools and biostatistics models (Ronit and Shou-Jiang, 2011; Fedak et al., 2015). First of all, even if a weak association does not necessarily indicate the absence of causality between the exposure to a given factor and pathogenesis, the stronger an association is, the more likely it is to be causal. Hence, the presence of HERV proteins in diseased conditions should represent a starting point for evaluating the significant upregulation of specific HERV loci in diseased vs. healthy tissues through appropriate statistical models. Another important point is consistency, i.e., the reproducibility of observations made in variable conditions (different groups, places, samples, etc.). This is a major goal for HERV studies, because the absence of standardized methodologies and the poor genomic characterization of individual HERV groups have often led to discordant results about their expression in human diseases. This has unfortunately affected the coherence of the epidemiologic literature in the field, with different studies reporting conflicting and poorly comparable associations.
Thus, advanced mechanistic analyses and standardized specific methodologies can elucidate which evidence is the most consistent, thereby rationally directing the causal investigation. Moreover, epidemiologic associations should not only support biological hypotheses, but should also be plausible according to the existing knowledge about disease pathogenesis. Hence, besides being expressed in diseased tissues, HERV proteins should hold biological activities potentially explaining the molecular and/or clinical manifestations of the disease. Accordingly, it is important to explore the interaction with cellular or exogenous partners already involved in the disease pathogenesis, considering that many of the diseases in question arise from the complex interplay between multiple factors. This implies that the manipulation of a single determinant may produce undetectable or misleading effects, calling for exhaustive and carefully designed experimental systems supported by appropriate animal models. The latter should be used to assess the specificity of HERV expression, elucidating the exact molecular mechanism of pathogenesis through multiple integrated research approaches and considering possible influencing factors (epigenetics, drug treatments, physical agents, etc.). Similarly, in the case of a complex temporality between exposure and pathogenesis, the presence of HERV products should direct the investigation to specific loci, demonstrating their ability to induce de novo the observed effects and considering that HERV expression varies in the different periods of life (e.g., development, pregnancy) or according to epigenetic changes (e.g., cancer, autoimmune diseases). Besides temporality, the biological gradient should be evaluated, given that many HERVs are expressed in both patients and healthy individuals, without apparent harmful effects in the latter.
Hence, it would be useful to understand whether this could depend on a threshold under which HERV products have no significant impact on the host, being "symptomatic" only when upregulated (even as an epiphenomenon) in diseased contexts. This could help the rational design of protocols testing candidate molecules for HERV inhibition. Finally, the principle of analogy applies, because HERV products belonging to the same group share significant similarity. Hence, while on the one side it is unlikely that their expression could indiscriminately produce pathogenesis, which calls for more specific procedures, on the other side the reliable characterization of HERV pathogenic determinants in a given disease could allow lower standards of evidence to be accepted for subsequent association studies.
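
As an illustration of the "strength of association" criterion discussed above, the following minimal sketch computes an odds ratio with a Wald 95% confidence interval from a 2×2 table of HERV expression status in diseased vs. healthy samples. The function name `odds_ratio_ci` and all counts are hypothetical and are not taken from any of the cited studies; real analyses would also need locus-specific quantification and adjustment for confounders.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a: diseased samples with high HERV expression
    b: diseased samples with low HERV expression
    c: healthy samples with high HERV expression
    d: healthy samples with low HERV expression
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only.
estimate, lower, upper = odds_ratio_ci(30, 10, 12, 28)
print(f"OR = {estimate:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

An interval that excludes 1 would indicate an association in this toy table, but, as noted above, association strength is only one of several criteria and does not by itself establish causality.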

## Development of HERV-Based Anticancer Approaches

Since HERV Envs have been suggested to exert pro-oncogenic effects in a number of tissues, being possibly involved in tumor progression and in the downstream metastatic spread, they constitute promising targets for innovative anti-cancer strategies based either on HERV inhibitors or immunotherapy approaches. While the former clearly requires a supported causative association between HERV expression and cancer progression, the latter can exploit either the selective or the upregulated expression of HERV Ags to direct therapeutic agents against cancer cells. To date, various HERV-based anticancer approaches have been proposed, even if, to the best of our knowledge, none of them is currently under clinical development (sources: clinicaltrials.gov, USA; clinicaltrialsregister.eu, EU).

### HERV Inhibitors

In the presence of a demonstrated role of HERV proteins in cancer onset and/or progression, a valuable therapeutic strategy could be based on molecules or small RNAs inhibiting either the protein activity or the upstream HERV expression. This approach includes the possibility of testing the cross-efficacy of antiviral molecules already approved for the treatment of exogenous retroviral and non-retroviral infections. As an example, colorectal cancer cells with an induced chemotherapy-resistant phenotype were shown to hyperexpress HERV-W and HERV-FRD Envs, and such expression was efficiently downregulated by different antiviral compounds (amantadine, ribavirin and pleconaril) (Díaz-Carballo et al., 2015). Interestingly, the combination of antitumor agents and these antiviral drugs led to synergistic antiproliferative effects, increasing the cytotoxicity against the multiresistant colorectal tumor cells (Díaz-Carballo et al., 2015). Similarly, the treatment of prostate cancer cell lines harboring endogenous RT activity with the nucleoside HIV-1 RT inhibitor abacavir showed marked anti-proliferative effects, even if no data are available about the specific action on HERV expression (Carlini et al., 2010). The non-nucleoside HIV-1 RT inhibitors nevirapine and efavirenz were also tested against HERV-K(HML2) expression, reducing proliferation and promoting apoptosis in melanoma cells with induced stemness features (Argaw-Denboba et al., 2017).

### Passive and Active Immunotherapy

HERV Envs that are upregulated and/or found exclusively in tumor tissues could be suitable targets to direct both passive and active immunotherapy against cancer cells, even in the absence of a direct role in the disease onset and progression.

Passive immunotherapy is mainly based on the development of Abs recognizing the HERV Envs expressed in diseased tissues. Given the high similarity shared by HERV proteins, especially among related groups, the design of selective Abs cannot ignore the need for a proteomic project to characterize the specific expression of individual HERV peptides. Given the high expression of HERV-K(HML2) in hBC, a monoclonal Ab against HML2 Env was shown to inhibit hBC cell line proliferation, with the concomitant activation of apoptotic signals (Wang-Johanning et al., 2012). The same Ab significantly reduced the growth of xenograft tumors in mice, and was therefore proposed as a possible immunotherapeutic agent for hBC (Wang-Johanning et al., 2012).

While passive immunotherapy relies on Ab administration, active immunotherapy aims to stimulate an intrinsic cellular and humoral response against diseased cells. In particular, an ideal anticancer therapeutic agent should be as selective as possible toward transformed cells only, and should be able to prevent recurrences by evoking a protective immunity (Mullins and Linnebacher, 2012b). Due to its specificity and durability, active immunotherapy is considered more advantageous than passive immunization, even if both approaches might be combined to gain a higher anticancer effect. Currently, various HERV-derived Envs have been investigated for anticancer immunotherapy, being expressed to higher extents (tumor-associated Ags, TAAgs) or exclusively (tumor-specific Ags, TSAgs) in transformed cells. In this context, an important therapeutic opportunity would be the identification of HERV TSAgs shared between different tumors, to develop broad-spectrum anticancer strategies. The first attempt to exploit endogenous retroviral proteins as TAAgs was performed in murine colorectal carcinoma and melanoma cell lines producing ERV Envs, in which recombinant vaccinia virus was used for antitumor immunization against these proteins (Yang and Perry-Lalley, 2000; Kershaw et al., 2001). In a similar way, recombinant vaccinia virus expressing HERV-K(HML2) Env reduced the number of nodules of Env-expressing pulmonary tumors induced in mice, which were even prevented by prophylactic administration of the vaccine (Kraus et al., 2013). In humans, a multicentric study reported that the incidence of melanoma is reduced in individuals who have received vaccinia and/or bacille Calmette-Guerin vaccination, used to induce protective immunity against smallpox and tuberculosis, respectively (Krone et al., 2005).
Such a lower melanoma risk was also confirmed in individuals who had suffered from acute infectious diseases, possibly suggesting that different viral Ags sharing sequence homologies with HERV-K-MEL-Ags could induce cross-protection against melanoma development (Krone et al., 2005). An analogous effect was described in a case report about a patient with metastatic melanoma who achieved spontaneous cancer regression after a febrile reaction to tetanus–diphtheria–pertussis combined vaccination (Tran et al., 2013). Likewise, given the antigenic similarity between HERV-K-MEL and yellow fever virus (YFV) (Krone et al., 2005), cohorts of individuals who had received anti-YFV vaccination were investigated for melanoma incidence, showing however no significant protective effects in the 10 years post-vaccination (Mastrangelo et al., 2009; Hodges-Vazqueza et al., 2012). In addition to the findings in melanoma cells (Schiavetti et al., 2002), HERV-K-MEL expression and its possible exploitation for immunotherapeutic purposes have been investigated in pancreatic cancer (Schmitz-Winnenthal et al., 2007), driving the production of engineered chimeric T cells that lysed tumor cells expressing HERV-K-MEL and decreased tumor mass when injected in a mouse model of metastatic melanoma (Zhou et al., 2015; Krishnamurthy et al., 2016). A subset of gastrointestinal cancers showed significant upregulation of the Env encoded by a HERV-H provirus (locus Xp22.3), and T cells sensitized toward this protein had lytic effects against colorectal carcinoma cells that expressed it (Mullins and Linnebacher, 2012a). Similarly, a HERV-E Env selectively expressed in renal carcinoma was shown to induce cytolytic T lymphocytes recognizing renal carcinoma cells (Haruta et al., 2015).

### Combination With Demethylating Agents

Demethylating drugs are commonly used as anticancer agents and are known to liberate retrotransposon expression by inducing a hypomethylated status. Remarkably, the antitumor activity of DNA methyltransferase inhibitors is thought to rely on this trigger toward HERV expression, stimulating the production of viral dsRNA that is sensed as a PAMP by cellular recognition pathways, leading to the immune attack against tumor cells (Roulois et al., 2015). Accordingly, the individual knock-down of the MDA5, MAVS and IRF7 PRRs in colorectal cells significantly reduced the anticancer activity of DNA methyltransferase inhibitors (Roulois et al., 2015). Therefore, considering that immune checkpoint therapy alone often produces weak responses in cancer patients, demethylating agents have been proposed in combination with active immunization to produce synergistic anticancer effects (Chiappinelli et al., 2016).

## HERV-Based Therapeutic Treatments in Autoimmunity

HERV Envs showed remarkable immunogenic properties, suggesting their contribution to autoimmune diseases through both molecular mimicry and superAg activities (Emmer et al., 2014; Trela et al., 2016). They were hence investigated as possible targets for innovative autoimmunity therapies, mainly focused on either the inhibition of HERV expression or the passive immunization against HERV Env proteins. To date, at least three clinical trials are dedicated to the development of HERV Env-based therapies for MS (2) and type I diabetes (1) (sources: clinicaltrials.gov, USA; clinicaltrialsregister.eu, EU).

### HERV Inhibitors

As reported for cancer, in the presence of a pathological contribution of HERV products, the use of molecules inhibiting the activity or expression of such proteins could reduce the associated clinical manifestations. Intriguingly, it has been hypothesized that the cytoplasmic accumulation of endogenous retroelements could lead to the activation of innate DNA sensors and the consequent production of IFN in mice, especially in the presence of mutations affecting the 3′ repair exonuclease 1 (Trex1) (Stetson et al., 2008; Gall et al., 2012; Xu et al., 2014). Even if the actual role of human Trex1 in the degradation of cytosolic HERV cDNA is still to be demonstrated, it has been shown that this exonuclease can metabolize reverse-transcribed endogenous retroelements, which conversely accumulate in the cytosol of mouse Trex1-deficient cells (Stetson et al., 2008). Accordingly, mice affected by hereditary autoimmune inflammation and deficient for Trex1 showed a significant amelioration in symptomatology when treated with HIV-1 RT inhibitors (Beck-Engeser et al., 2011). Similarly, a case report described that the administration of the HIV-1 IN inhibitor raltegravir led to the stable remission of a patient with severe autoimmune chronic idiopathic urticaria that was resistant to all traditional treatments (Dreyfus, 2011). Thus, the administration of HERV inhibitors could be theoretically suitable to treat some immune manifestations linked to HERV replication products, although further studies are needed to understand the significance of the accumulation of the latter in autoimmunity and the cellular networks involved in their sensing and degradation.

### Passive Immunotherapy

The natural onset of Abs against HERV Envs in autoimmunity patients suggested that specific neutralizing Abs could be developed for therapeutic purposes. Until now, the main findings regard the above-mentioned GNbAC1 monoclonal Ab, targeting HERV-W/MSRV Env proteins and proposed as an innovative therapy for MS and type I diabetes. In fact, GNbAC1 inhibited the release of proinflammatory cytokines by PBMC stimulated with HERV-W Env (Rolland et al., 2006) and reduced the HERV-W Env-dependent TLR4-mediated induction of nitric oxide synthase, rescuing myelin expression and oligodendrocyte differentiation (Kremer et al., 2014). The Ab was then tested in an HERV-W Env-induced experimental allergic encephalitis mouse model (Perron et al., 2013), corroborating its efficacy in ameliorating the disease symptoms and preventing the animals' death (Curtin et al., 2015b). The early clinical development assessed favorable safety and pharmacokinetic profiles (Derfuss et al., 2015b; Curtin et al., 2016) that were confirmed in a phase IIa randomized study of 10 MS patients, showing positive pharmacodynamic responses (Derfuss et al., 2015a,b). Aside from MS, a pathological role of HERV-W Env has been suggested in insulin deficiency and type I diabetes immunopathogenesis (Levet et al., 2017). Consequently, the GNbAC1 Ab is currently being tested in a phase IIa clinical study of type I diabetes patients, and could possibly become an important clinical option in treating this autoimmune disease (Levet et al., 2017).

## CONCLUSIONS

HERV-derived Env proteins constitute multifaceted and multifunctional elements at the interface between self and nonself, showing a delicate balance of the same biological activities in serving the host physiology and exerting harmful effects. Our understanding of HERVs has grown in the last three decades and, by now, it is evident that their expression is a normal phenomenon and, thus, cannot be used as the only evidence of their involvement in human disorders. It is hence necessary to characterize individual HERV proteins for their actual effects on the molecular pathways involved in human pathogenesis, to finally identify precise causalities and allow the effective exploitation of selected elements as specific biomarkers and promising therapeutic targets.


## AUTHOR CONTRIBUTIONS

NG and ET participated in the conception, drafting and revision of the manuscript and approved the final version.

## ACKNOWLEDGMENTS

We would like to thank the colleagues involved in the studies reported in the present review, and apologize to the ones whose work has not been referenced here.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Grandi and Tramontano. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Potential Links between Hepadnavirus and Bornavirus Sequences in the Host Genome and Cancer

### Tomoyuki Honda\*

Division of Virology, Department of Microbiology and Immunology, Osaka University Graduate School of Medicine, Osaka, Japan

Various viruses leave their sequences in the host genomes during infection. Such events occur mainly in retrovirus infection but also sometimes in DNA and non-retroviral RNA virus infections. If viral sequences are integrated into the genomes of germ line cells, the sequences can become inherited as endogenous viral elements (EVEs). The integration events of viral sequences may have oncogenic potential. Because proviral integrations of some retroviruses and/or reactivation of endogenous retroviruses are closely linked to cancers, viral insertions related to non-retroviral viruses also possibly contribute to cancer development. This article focuses on genomic viral sequences derived from two non-retroviral viruses, whose endogenization is already reported, and discusses their possible contributions to cancer. Viral insertions of hepatitis B virus play roles in the development of hepatocellular carcinoma. Endogenous bornavirus-like elements, the only non-retroviral RNA virus-related EVEs found in the human genome, may also be involved in cancer formation. In addition, the possible contribution of the interactions between viruses and retrotransposons, which seem to be a major driving force for generating EVEs related to non-retroviral RNA viruses, to cancers will be discussed. Future studies regarding the possible links described here may open a new avenue for the development of novel therapeutics for tumor virus-related cancers and/or provide novel insights into EVE functions.

Keywords: hepatitis B virus, endogenous viral elements, cancer, borna disease virus, non-coding RNAs, LINE-1, retrotransposon

## INTRODUCTION

Viruses can deposit their sequences into the host genome during infection. Consistently, animal genomes contain many viral-related sequences, called endogenous viral elements (EVEs) (Katzourakis and Gifford, 2010; Holmes, 2011; Parrish and Tomonaga, 2016). EVEs are mainly derived from ancient retroviruses because retroviruses require the integration of their DNAs into the host genome for replication. In addition to retroviruses, DNA and non-retroviral RNA viruses can sometimes become integrated into the host genome, despite the fact that integration events are not required for the viral life cycle. In particular, sequences of non-retroviral RNA viruses seem to have been integrated into the host genome possibly by machineries of a host retrotransposon, long interspersed nuclear element 1 (LINE-1, or L1) (Horie et al., 2010). The integration events of viral sequences occur not only in somatic cells but also in germ line cells. If the integration event occurs in germ line cells, the integrated viral sequences become inherited as EVEs. Thus, the integration of viral sequences into the genome of germ line cells is an essential first step for generating EVEs.

#### Edited by:

Martin Sebastian Staege, Martin Luther University of Halle-Wittenberg, Germany

#### Reviewed by:

Antoinette Van Der Kuyl, University of Amsterdam, Netherlands

Masaaki Miyazawa, Kindai University, Japan

> \*Correspondence: Tomoyuki Honda thonda@virus.med.osaka-u.ac.jp

#### Specialty section:

This article was submitted to Virology, a section of the journal Frontiers in Microbiology

Received: 29 September 2017 Accepted: 06 December 2017 Published: 19 December 2017

#### Citation:

Honda T (2017) Potential Links between Hepadnavirus and Bornavirus Sequences in the Host Genome and Cancer. Front. Microbiol. 8:2537. doi: 10.3389/fmicb.2017.02537

Viral integration can have oncogenic potential via several mechanisms (**Figure 1A**). First, the inserted sequences in the vicinity of an oncogene may function as a promoter for the oncogene. Second, such events may inactivate tumor suppressor genes via insertional mutagenesis. Third, such integrated sequences may induce genomic instability via homologous recombination (Hino et al., 1991). Fourth, the integrated sequences may epigenetically regulate the host gene expression landscape, leading to cancer formation and spreading (Zhao et al., 2016). Fifth, such sequences may produce an oncogenic protein or non-coding RNA (Lau et al., 2014).

Regarding the relationship between viruses and cancers, many excellent reviews have been published about the links between endogenous retroviruses or tumor viruses and cancers (Suntsova et al., 2015; Gonzalez-Cao et al., 2016; Gramolelli and Ojala, 2017; McBride, 2017; Pancholi et al., 2017). On the other hand, few have dealt with the association between non-retroviral viral sequences in the genome and cancers. One such study has proposed that insertions of human papilloma virus contribute to cervical cancer formation by interrupting tumor suppressor genes or destabilizing chromosomes (Zhao et al., 2016). Here, I especially focus on the genomic sequences derived from two non-retroviral viruses, whose endogenization is already reported in animal genomes (Horie et al., 2010; Shen et al., 2016), and discuss how these specific sequences could contribute to cancer formation. As DNA virus-related EVEs, EVEs derived from hepadnavirus and human herpesvirus 6 (HHV-6) are reported in animal genomes (Gravel et al., 2015; Shen et al., 2016). Because the link between HHV-6 insertions and cancer is not convincing at present, I will introduce the current understanding of the relationship between insertions of hepatitis B virus (HBV), a tumor-related hepadnavirus, and hepatocellular carcinoma (HCC). Then, I will discuss the possible involvement of EVEs derived from ancient non-retroviral RNA virus sequences in cancers. Because endogenous bornavirus-like elements (EBLs) are the only non-retroviral RNA virus-derived EVEs found in the human genome thus far (Horie et al., 2010), I focus on the possible links between these elements and cancers, although these links have not been demonstrated. EBLs are possibly generated in a retrotransposon-dependent manner. Therefore, I will finally propose the possible contribution of virus-retrotransposon interactions to cancers.
This article aims to inspire future studies regarding the possible links described here, which may open a new avenue for understanding of the significance of viral insertions in the host genome.

## A POTENTIAL LINK BETWEEN HCC AND HBV INSERTIONS IN THE GENOME

Hepatocellular carcinoma accounts for 80% of liver cancer, whose major causative agents are two hepatitis viruses, HBV and hepatitis C virus (HCV) (Jemal et al., 2011; Forner et al., 2012; Tateishi and Omata, 2012). HBV is a DNA virus that belongs to the Hepadnaviridae family (Beck and Nassal, 2007; Nguyen et al., 2008), while HCV is an RNA virus and belongs to the Flaviviridae family (Hijikata et al., 1991; Grakoui et al., 1993; Aly et al., 2012). Both viruses can cause chronic infections, which may increase the chance of horizontal viral gene transfer to the host genome (Parkin, 2006; Aly et al., 2012). Consistent with this idea, EVEs derived from an ancient hepadnavirus and an ancestor HCV have been identified in animal genomes although they are not in the human genome. The budgerigar genome contains two EVEs with the full-length genome of the ancient budgerigar hepadnavirus (Shen et al., 2016). The rabbit and hare genomes have fragments homologous to HCV genes, which might suggest the possibility that cDNA from an HCV ancestor was integrated into the host genome (Silva et al., 2012). Although HCV replicates without a known DNA intermediate stage, it is still possible that the sequences of non-retroviral RNA viruses are integrated into the host genome via host retrotransposon machineries as evidenced by several studies (Geuking et al., 2009; Horie et al., 2010). HCV cDNA has been reportedly detected in patients infected with HCV (Zemer et al., 2008), further supporting this possibility. However, the contribution of integration events of the HCV sequences to oncogenesis remains unclear.

On the other hand, insertions of the HBV sequences seem to be closely linked to HCC development because the frequency of HBV insertions in cancer tissue is higher than that in cancer-adjacent tissues (Ding et al., 2012; Jiang et al., 2012). So far, several genes that are recurrently targeted by HBV insertions have been reported (Ding et al., 2012; Fujimoto et al., 2012). It has been proposed that HBV insertions occur during chronic hepatitis and that some of the cells with HBV insertions can acquire growth advantages and initiate tumorigenesis (Ding et al., 2012). Possible oncogenic contributions of HBV insertions are modification of gene expression via insertions into genomic regulatory regions, genomic instability induced by recombination between integrated HBV sequences, or production of oncogenic cellular-HBV chimeric proteins or non-coding RNAs (**Figure 1B**). An example of the first mechanism is the recurrent insertion into the telomerase reverse transcriptase (TERT) gene (Ferber et al., 2003). TERT expression is a limiting factor in telomerase activation and its upregulation is thought to be a critical step in tumorigenesis (Ferber et al., 2003). HBV insertions in the promoter region of the TERT gene enhance its expression, which might be related to HCC development (Ding et al., 2012; Sung et al., 2012). The second possibility is supported by the observation that fragments containing the HBV sequences increase the frequency of recombination events (Hino et al., 1991).
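
The comparison of insertion frequencies between tumor and adjacent tissue mentioned above can be sketched as a one-sided Fisher's exact test on a 2×2 table. This is only an illustrative outline under stated assumptions: the function name `fisher_one_sided` and the sample counts are hypothetical and do not reproduce the analyses of the cited studies, which may have used different statistics.

```python
from math import comb


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher's exact test for the table [[a, b], [c, d]].

    a: tumor samples with at least one HBV insertion
    b: tumor samples without insertion
    c: adjacent-tissue samples with at least one HBV insertion
    d: adjacent-tissue samples without insertion

    Returns P(X >= a) under the hypergeometric null hypothesis that
    insertions are equally frequent in both tissue types.
    """
    row1 = a + b              # number of tumor samples
    col1 = a + c              # number of samples with an insertion
    n = a + b + c + d         # total number of samples
    total = comb(n, col1)
    upper = min(row1, col1)
    tail = sum(
        comb(row1, k) * comb(n - row1, col1 - k)
        for k in range(a, upper + 1)
    )
    return tail / total


# Hypothetical counts, for illustration only.
p_value = fisher_one_sided(18, 2, 9, 11)
print(f"one-sided p = {p_value:.4f}")
```

A small p-value in such a table would be compatible with enrichment of insertions in tumor tissue, although it would not distinguish causal insertions from those that merely accompany clonal expansion.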

The chimeric gene, HBx-L1, is an example of the third possible mechanism described above (Lau et al., 2014). HBx-L1 is a fusion gene of HBx, an HBV gene, and LINE-1, a host retrotransposon, produced by the HBV integration event, which is found in more than 20% of HBV-related HCC and correlates with a poor outcome (Lau et al., 2014). Knockdown of the HBx-L1 transcript reduces migratory and invasive properties of HBV-positive HCC cells. HBx-L1 overexpression confers a growth advantage and promotes cell migration and invasion via β-catenin/Wnt signaling, a major pathway in the oncogenesis of HBV-related HCC, regardless of its protein-coding potential (Lau et al., 2014). Thus, the HBx-L1 transcript is a chimeric long non-coding RNA (lncRNA) that promotes the HCC phenotype (Whittaker et al., 2010; Lau et al., 2014).

FIGURE 1 | The possible contribution of viral sequences in the genome to cancer development. (A) Overview of possible functions of genomic viral sequences. Viral sequences in the genome could function as a gene regulatory DNA element (red), a functional RNA (green) or a protein (blue), all of which can contribute to cancer development. (B) HBV insertions in the genome. HBV insertions may enhance TERT promoter activity, have a recombinogenic effect or produce a viral-host chimeric RNA with an oncogenic potential. (C) EBLNs in the genome. The hsEBLN-2 protein may be involved in mitochondrial function, whereas the hsEBLN-1 protein may regulate microtubules. The hsEBLN-1 RNA has also been shown to regulate tumor-related genes.

## A POTENTIAL LINK BETWEEN CANCERS AND ENDOGENOUS BORNAVIRUS-LIKE ELEMENTS

Endogenous bornavirus-like elements are the only non-retroviral RNA virus-derived EVEs found in the human genome, although DNA virus-derived EVEs are also found in the human genome (Gravel et al., 2015). The majority of such elements are EBLs from the bornavirus nucleoprotein (N) gene (EBLNs), which appear to have originated from the reverse transcription and integration of ancient bornavirus N mRNA (Horie et al., 2010). Among the 7 Homo sapiens EBLNs (hsEBLNs) in the human genome, hsEBLN-2 is most closely linked to cancer. Whole-exome sequencing of two sibling pairs of non-smokers with lung adenocarcinoma revealed that a truncating mutation in hsEBLN-2 was detected only in affected siblings (Renieri et al., 2014). The authors concluded that this mutation in hsEBLN-2 might predispose an individual to lung adenocarcinoma (Renieri et al., 2014). The loss of 3p12-p14 is recurrently observed in uterine cervical cancer, suggesting a strong selection advantage for the gene loss (Lando et al., 2013). hsEBLN-2 is highly downregulated in cases with this gene loss (Lando et al., 2013). Gene ontology analysis of the genes associated with the loss, including hsEBLN-2, shows enrichment of tumorigenic pathways, such as apoptosis, proliferation and stress responses, suggesting that hsEBLN-2 might be a tumor suppressor. hsEBLN-2 is homologous to the bornavirus N gene but also contains an additional TOM20 recognition motif (F4LKLY8) at the N-terminus. Furthermore, the hsEBLN-2 protein was shown to be expressed and to interact with several other host proteins (Ewing et al., 2007). Because mitochondrial dysfunction is found in cancers (Lleonart et al., 2017), hsEBLN-2 might play important roles in mitochondrial function and thereby act as a tumor suppressor (**Figure 1C**).

hsEBLN-1 retains a long open reading frame (ORF) that encodes 366 amino acids, which is comparable with the full-length BDV N protein (Horie et al., 2010). Despite the overall homology between hsEBLN-1 and BDV N proteins, their subcellular localizations are different, suggesting that hsEBLN-1 may have acquired new or additional functions during millions of years of residence within the human genome (Honda and Tomonaga, 2013; Fujino et al., 2014). Recently, two studies have revealed the involvement of hsEBLN-1 in tumorigenic pathways, such as cell cycle transit, cell genome stability and apoptosis (He et al., 2016; Myers et al., 2016) (**Figure 1C**). Both studies demonstrated that hsEBLN-1 silencing increases the proportion of cells in the G2/M phase. hsEBLN-1 knockdown cells exhibit microtubule and centrosomal splitting defects (Myers et al., 2016). Proteomic analysis of the purified hsEBLN-1 complex identified several binding partners for hsEBLN-1 (Myers et al., 2016). Among these, TPR (Translocated Promoter Region) is a nuclear protein that regulates mRNA transport and mitotic spindles. Because hsEBLN-1 silencing impairs the nuclear envelope localization of TPR, improper localization of TPR may abrogate the ability of TPR to regulate microtubules and thereby induce abnormal cell cycle progression. Indeed, TPR has been implicated in cancer development (Snow and Paschal, 2014). In addition, three genes upregulated after hsEBLN-1 silencing, RND3, OSMR, and CREB3L2, are closely linked to glioma (He et al., 2016). This observation raises the possibility that hsEBLN-1 may be involved in the development of some kinds of cancers, although no hsEBLN-1 mutations have been identified in cancer thus far.

We have previously demonstrated that hsEBLN-1 can modulate the expression of its neighboring gene, COMMD3 (Sofuku et al., 2015). When transcription from the hsEBLN-1 locus in the human genome was induced, expression of the COMMD3 gene was downregulated. The effect of hsEBLN-1 RNA expression on the COMMD3 locus was abrogated by treatment with siRNA against hsEBLN-1 RNA. These results suggest that hsEBLN-1 RNA may function as a lncRNA that scaffolds transcriptional and/or epigenetic repressors for the COMMD3 gene and suppresses its expression. Although we cannot exclude the possibility that hsEBLN-1 functions as a cis-regulatory DNA element or a protein acting on this locus in trans, our siRNA data and the cytoplasmic localization of the hsEBLN-1 protein strongly suggest a role for hsEBLN-1 as a lncRNA (Fujino et al., 2014; Sofuku et al., 2015). The COMMD3 gene encodes a protein that can interact with and inhibit the NF-κB pathway (Burstein et al., 2005), which regulates type I interferons (IFNs), inflammatory cytokines, such as interleukin-1 (IL-1), IL-2, IL-6, IL-12, and tumor necrosis factor (TNF)-α, and intercellular adhesion molecule 1 (ICAM-1). In addition, enhanced expression of the COMMD3 gene was reported in a particular type of leukemia (Mulaw et al., 2012). EBLN insertion in the hsEBLN-1 locus may downregulate the expression of the COMMD3 gene and thereby potentiate the NF-κB pathway (Honda and Tomonaga, 2016). Cancer cells are known to induce IFNs, which mediate antitumor effects on particular types of tumors, such as renal cell carcinoma, and are therefore used in clinical anti-cancer therapy (Müller et al., 2017; Wu et al., 2017). Taken together, hsEBLN-1 may exert anti-tumor effects via the COMMD3-NF-κB-IFN pathway. Further studies are required to understand the contribution of EBLNs to immune modulation during oncogenesis.

## POSSIBLE INVOLVEMENT OF RETROTRANSPOSON-VIRUS INTERACTIONS IN CARCINOGENESIS

As described above, non-retroviral RNA virus-related sequences in the genome are possibly generated by a retrotransposon machinery (Horie et al., 2010; Shimizu et al., 2014). In other words, retrotransposons are a major driving force for generating such EVEs. Therefore, it is important to understand the interactions between retrotransposons and viruses. Among retrotransposons, L1s constitute approximately 17% of the human genome (Lander et al., 2001). Most L1s are 5′ truncated and therefore defective in retrotransposition, whereas 80–100 copies are still retrotransposition-competent and utilize a "copy-and-paste" mechanism to retrotranspose to new genomic loci (Beck et al., 2010; Brouha et al., 2003). L1 is also responsible for the production of non-retroviral RNA virus elements in the host genome as described. Thus, dysregulation of L1s is considered a major source of endogenous insertional mutagenesis in humans (Levin and Moran, 2011; Burns and Boeke, 2012). Indeed, L1 retrotransposition occurs not only in germ line cells and pluripotent stem cells (van den Hurk et al., 2007; Beck et al., 2011; Levin and Moran, 2011; Klawitter et al., 2016) but also in cancer cells (Iskow et al., 2010; Goodier, 2014). Furthermore, although it is unclear whether L1s are activated in normal cells before clonal expansion or in cancer cells during the later stages of carcinogenesis (Goodier, 2014), many epidemiological studies suggest a linkage between dysregulated L1 expression and cancers (Shukla et al., 2013; Rodić et al., 2014; Harada et al., 2015). Once L1 or L1-mediated viral insertions occur around oncogenes or tumor suppressor genes, some of these insertions may confer survival and/or proliferative advantages to the cells, thereby enhancing the various steps of carcinogenesis. Consistent with this idea, transposon-based insertional mutagenesis has been shown to induce virtually any kind of cancer in mice (Dupuy et al., 2005, 2009; Rad et al., 2010).
Furthermore, several tumor viruses are reported to activate transcription of retrotransposons, such as endogenous retroviruses and short interspersed nuclear elements (SINEs). For example, Marek's disease virus, an avian tumor virus, is reported to induce expression of an endogenous retrovirus (Hu et al., 2017), and murine gammaherpesvirus 68, another tumor virus, also activates transcription of SINEs (Tucker and Glaunsinger, 2017). These observations may emphasize the significance of retrotransposon activation in tumor virus-related carcinogenesis.

## CONCLUSION AND PERSPECTIVE

This article has presented a current view of the possible contributions of hepadnavirus and bornavirus insertions in the genome to cancer formation. The presented lines of evidence suggest potential links between these viral sequences and cancers. However, current knowledge in this field is still limited, and many questions remain to be addressed. Although several genes recurrently targeted by HBV insertions have been identified, the precise role of most of them in tumorigenesis remains unclear. Among the HBV integration sites identified so far, only a limited number of cellular-HBV chimeric proteins/transcripts have demonstrated oncogenic potential. Further accumulation of examples of recurrent HBV insertion sites in the host genome or recurrent chimeric transcripts specific to hepatitis virus-related HCC will help to clarify the contribution of HBV insertions to HCC etiology. Regarding links between EBLs and cancers, the information is more limited. Epidemiological studies on the links between EBL mutations and cancers are clearly required. Furthermore, the causal relationship between such EBL mutations and cancers should be demonstrated in the future.

Although a definitive role for tumor viruses in retrotransposon activation has not been established thus far, investigating a possible link between L1 activation and tumor viruses, especially HBV, would be of considerable interest because L1 hypomethylation or some L1 chimeric transcripts are associated with poor prognosis in HCC (Honda, 2016). Hypomethylation of the L1 loci may upregulate L1 expression, potentially removing an obstacle to L1 transposition in liver cells. Once L1s are activated, any potential disruption of tumor suppressor genes induced by L1 retrotransposition could contribute to the development of HCC. Indeed, L1 has been shown to be a crucial source of mutations that can reduce the tumor-suppressive capacity of somatic cells (Shukla et al., 2013).

Future studies regarding the above links may open a new avenue for the development of novel therapeutics, such as epigenetic modification of viral sequences in the genome, for tumor virus-related cancers. Also, such studies will provide novel insights into the biological roles of EVEs in the cells.


## AUTHOR CONTRIBUTIONS

TH wrote the manuscript, confirms being the sole contributor of this work and approved it for publication.

## ACKNOWLEDGMENTS

This work was supported in part by JSPS KAKENHI Grant Number JP15K08496, the Program on the Innovative Development and the Application of New Drugs for Hepatitis B from Japan Agency for Medical Research and Development (AMED), and grants from the Takeda Science Foundation, Senri Life Science Foundation, Suzuken Memorial Foundation, and Kobayashi International Scholarship Foundation.



**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Honda. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Elevated HERV-K Expression in Soft Tissue Sarcoma Is Associated with Worsened Relapse-Free Survival

Maria Giebler<sup>1†</sup>, Martin S. Staege<sup>2†</sup>, Sindy Blauschmidt<sup>1</sup>, Lea I. Ohm<sup>2</sup>, Matthias Kraus<sup>1</sup>, Peter Würl<sup>3</sup>, Helge Taubert<sup>4</sup> and Thomas Greither<sup>1</sup>\*

<sup>1</sup> Center for Reproductive Medicine and Andrology, Martin Luther University of Halle-Wittenberg, Halle, Germany, <sup>2</sup> Department of Pediatrics I, Martin Luther University of Halle-Wittenberg, Halle, Germany, <sup>3</sup> Department of General, Visceral and Thoracic Surgery, Städtische Klinikum Dessau, Dessau-Roßlau, Germany, <sup>4</sup> Division Molecular Urology, Department of Urology and Pediatric Urology, University Hospital Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany

#### Edited by:

Gkikas Magiorkinis, National and Kapodistrian University of Athens, Greece

#### Reviewed by:

George Robert Young, Francis Crick Institute, United Kingdom César López-Camarillo, Universidad Autónoma de la Ciudad de México, Mexico Tara Patricia Hurst, Abcam, United Kingdom

#### \*Correspondence:

Thomas Greither thomas.greither@medizin.uni-halle.de

†These authors have contributed equally to this work.

#### Specialty section:

This article was submitted to Virology, a section of the journal Frontiers in Microbiology

Received: 26 October 2017 Accepted: 30 January 2018 Published: 13 February 2018

#### Citation:

Giebler M, Staege MS, Blauschmidt S, Ohm LI, Kraus M, Würl P, Taubert H and Greither T (2018) Elevated HERV-K Expression in Soft Tissue Sarcoma Is Associated with Worsened Relapse-Free Survival. Front. Microbiol. 9:211. doi: 10.3389/fmicb.2018.00211

A wide variety of endogenous retroviral sequences has been demonstrated in the human genome so far, divided into several different families according to their sequence homology to viral strains. While increased expression of human endogenous retrovirus (HERV) elements has already been linked to unfavorable prognosis in hepatocellular carcinoma, breast cancer, and ovarian carcinoma, less is known about the impact of the expression of different HERV elements on sarcomagenesis in general as well as on the outcome of soft tissue sarcoma (STS) patients. Therefore, in this study the association between expression of HERV-K and HERV-F and the clinicopathological characteristics in a cohort of STSs as well as the patients' prognosis was evaluated. HERV-K and HERV-F expression was assessed by quantitative real-time PCR in 120 patient specimens. HERV-K and HERV-F expression was significantly correlated (r<sub>S</sub> = 0.5; p = 6.4 × 10<sup>−9</sup>; Spearman's rank bivariate correlation). Also, tumor diameter exhibited a significant negative association with HERV-K and HERV-F expression. Levels of several hypoxia-related RNAs like HIF-1α and miR-210 showed a significant positive correlation with both HERV-K and HERV-F expression. Although in survival analyses no impact of HERV expression on disease-specific survival could be detected, patients with elevated HERV-K expression had a significantly shorter relapse-free survival (p = 0.014, log-rank analysis). In conclusion, we provide evidence for the first time that the increased expression of HERV-K in tumors is associated with STS patients' prognosis.

#### Keywords: soft tissue sarcoma, HERV-K, HERV-Fb, prognosis, relapse

## INTRODUCTION

Soft tissue sarcomas (STSs) are a heterogeneous group of tumors classified by the somatic tissue they resemble, with over 50 subtypes that can be distinguished (Ducimetière et al., 2011). The incidence of STS is relatively low – with estimated 4–5 cases per 100,000 per year (Stiller et al., 2013) – but the 5-year survival is only around 50% (Ferrari et al., 2011). Although treatment of STS usually consists of a wide resection of the tumor, followed by radio- or chemotherapy in selected cases (Casali and Blay, 2010), relapses and metastases are still an urgent clinical issue (Steinestel and Wardelmann, 2015). Due to the heterogeneity in genetics and phenotypes of the different STS entities, many of the proposed laboratory biomarkers and clinical factors still are insufficient for a satisfying prognostic evaluation of the individual patient's outcome.

Human endogenous retroviruses (HERVs) are a class of retroviral sequences acquired during evolution by integration of viral genes in the host genome. They became non-infectious by mutation or loss of relevant genes for replication or virus release (Vargiu et al., 2016). They comprise an estimated 8% of the human genome (Lander et al., 2001), and at least 22 independent families based on their homology to known mammalian retroviruses exist (Griffiths, 2001).

Among the different families, retroviral sequences of the HERV-K family (HML-2) were the most recently acquired and are therefore the most complete and biologically active family (Bannert and Kurth, 2004; Hughes and Coffin, 2004, 2005).


One significant association (Chi<sup>2</sup> test) is marked in bold. <sup>a</sup>Union for International Cancer Control Guidelines; LS, liposarcoma; FS, fibrosarcoma; RMS, rhabdomyosarcoma; LMS, leiomyosarcoma; NS, neuronal sarcoma; Syn, synovial sarcoma; NOS, not other specified; T1, <5 cm in diameter; T2, >5 cm in diameter.



HERV-K expression was detected in several tumor entities, among them chronic myeloid leukemia (Brodsky et al., 1993), renal cell carcinoma (Florl et al., 1999; Kreimer et al., 2013), breast carcinoma (Wang-Johanning et al., 2003), prostate carcinoma (Wallace et al., 2014), pancreatic cancer (Li et al., 2017), and melanoma (Serafino et al., 2009). In melanoma, HERV-K expression is suggested to be an early event in tumorigenesis, seemingly enhancing the pathological process of tumor formation (Serafino et al., 2009; Cegolon et al., 2013). Most recently, it was demonstrated that shRNA-mediated downregulation of HERV-K in pancreatic cancer cell lines suppressed growth rates and metastases as well as the expression of several proliferation-related genes (Li et al., 2017). However, it is worth mentioning that the above-described associations between HERV expression and tumor tissues are correlative. It has also been demonstrated that healthy tissues of different origins express HERV sequences, especially during early embryo development and placentation or in the innate immune response (reviewed in Meyer et al., 2017).

HERV-F is another family of HERVs originally identified by cloning of human genomic DNA sequences and homology analyses (Kjellman et al., 1999a,b). HERV-F expression was described in leukemia cell lines (Patzke et al., 2002) and in a wide range of other tumor cell lines (Yi and Kim, 2004). On the contrary, in adult somatic tissues HERV-F is only expressed in placenta (Kjellman et al., 1999a; Yi and Kim, 2004). Recently, constitutive HERV-F expression was reported in a cohort of breast cancer patients, with an increased expression of HERV-F members in comparison to normal breast tissue (Frank et al., 2008).

Knowledge of the expression and clinical impact of the HERV-K and HERV-F families in STS is scarce. Schiavetti et al. (2002) reported a robust expression of HERV-K-MEL in human sarcoma specimens, which was higher than in other tumor entities and comparable to the expression in bladder and breast carcinoma, but lower than in melanoma samples. We hypothesized that a detectable HERV mRNA expression could be a common feature in STS and might be related to the patients' clinical outcome. Therefore, the aim of this study was the quantification of HERV-K and HERV-F family mRNA expression in a cohort of 120 STS samples and the correlation to clinicopathological and prognostic data of the patients. Furthermore, as a secondary end point we analyzed the correlation of the mRNA expression of HERV-K and HERV-F with the RNA expression of known apoptosis-related [B-cell CLL/lymphoma 2 (BCL2)] or hypoxia-related (miR-210, miR-199a, hypoxia-inducible factor 1α) genes as well as known epigenetically regulated genes (miR-203, H2A.Bbd).

## MATERIALS AND METHODS

## Patients

One hundred and twenty STS patients agreed to participate in this study. An overview of the patient cohort is given in **Table 1**. Patients underwent surgical tumor resection between 1998 and 2001 at the Department of Surgery, University of Leipzig (Leipzig, Germany) without prior adjuvant treatment. Thirty-nine patients exhibited metastases (32.5%). Fresh tumor tissue was snap-frozen immediately after excision and stored at −80 °C until RNA isolation. The study was approved by the local ethics committee of the Medical Faculty of the Martin Luther University of Halle-Wittenberg and the Medical Faculty of the University of Leipzig. In accordance with the Helsinki Declaration, all patients gave written informed consent. Patient cohort composition as well as tissue cryopreservation was as described previously (Kappler et al., 2001; Würl et al., 2002).

## RNA Isolation

Tissue specimens were partly processed on a cryotome in 5 µm tissue slices, and RNA was isolated from 20 tissue slices. The slices were incubated in Trizol (Thermo Fisher Scientific, Waltham, MA, United States) for 5 min at room temperature and subsequently mixed with chloroform (AppliChem, Darmstadt, Germany). After centrifugation, the aqueous phase was collected and treated with DNase (Qiagen, Hilden, Germany). Total RNA was precipitated with isopropanol (AppliChem, Darmstadt, Germany) for 12 h at 4 °C, washed with different ice-cooled ethanol solutions (96 and 70%) and finally dissolved in RNase-free water (Qiagen, Hilden, Germany). RNA concentrations were assessed spectrometrically.

TABLE 3 | Bivariate correlations (Spearman's rank test; r<sub>S</sub>) of HERV-K or HERV-F mRNA expression with several clinicopathological and molecular parameters.

Significant p-values are marked in bold.

## cDNA Synthesis and qPCR

cDNA synthesis was carried out with RevertAid First strand synthesis kit (Thermo Fisher Scientific, Waltham, MA, United States) according to the manufacturer's protocol. One µg total RNA was applied for cDNA synthesis per tissue specimen. The complete elimination of genomic DNA was controlled by mock-RT PCR reactions (see Supplementary Figures S1, S2). cDNA was quantified with Maxima SYBR Green Kit (Thermo Fisher Scientific, Waltham, MA, United States) in a quantitative real-time PCR reaction. The applied primer sequences for the PCR reaction were: HERV-K forward: 5′-GGC CAT CAG AGT CTA AAC CAC G-3′; HERV-K reverse: 5′-CTG ACT TTC TGG GGG TGG CCG-3′; HERV-F forward: 5′-CCT CCA GTC ACA ACA ACT C-3′; HERV-F reverse: 5′-TAT TGA AGA AGG CGG CTG G-3′ (Seifarth et al., 2005); H2A.Bbd forward: 5′-TCG TTT TCA GTA GCC AGG T-3′; H2A.Bbd reverse: 5′-CAG AAT TAA TGA AGG CCC AAG-3′; HPRT forward: 5′-TTG CTG ACC TGC TGG ATT AC-3′; HPRT reverse: 5′-CTT GCG ACC TTG ACC ATC TT-3′. Samples were run on a MyIQ cycler (BioRad, Hercules, CA, United States) and HERV-K or HERV-F expression was calculated according to the 2<sup>−ΔCT</sup> method (Schmittgen and Livak, 2008) with HPRT as reference gene. Linearity of the qPCR reaction for both HERV-K and HERV-F was analyzed by dilution series of the gel-extracted amplicon (see Supplementary Figure S3). All amplicons were analyzed by qPCR melt analyses for the occurrence of a single, distinct peak (see Supplementary Figures S4, S5). Representative PCR products were purified by agarose gel electrophoresis and subsequently sequenced (see **Table 2**). Analysis of the sequenced PCR products with RepeatMasker<sup>1</sup> demonstrated that the used primers amplified sequences from HERV-K and HERV-Fb (HERVFH21).
Expression analyses for BCL2 mRNA, miR-203 and miR-210 (Greither et al., 2012), HIF-1α mRNA (Kessler et al., 2010) and miR-199a (Keßler et al., 2016) were carried out as previously described.
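The 2<sup>−ΔCT</sup> calculation named above can be sketched in a few lines. This is a minimal illustration, not the authors' analysis script: the function name `relative_expression` and all Ct values are hypothetical, and the only thing taken from the text is the formula itself (target Ct minus reference Ct, with HPRT as reference gene).

```python
def relative_expression(ct_target: float, ct_reference: float) -> float:
    """Return 2^-(Ct_target - Ct_reference), i.e. the 2^-dCT value."""
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


# Hypothetical Ct values for illustration only (not data from the study).
samples = {
    "tumor_A": {"HERV-K": 24.0, "HPRT": 26.0},
    "tumor_B": {"HERV-K": 27.0, "HPRT": 26.0},
}

for name, ct in samples.items():
    # A target Ct below the reference Ct yields a value above 1.
    print(name, relative_expression(ct["HERV-K"], ct["HPRT"]))
```

Because each cycle corresponds to a doubling, a target that crosses the threshold two cycles before the reference gene yields a value of 4, and one that crosses one cycle later yields 0.5.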

## Statistical Analyses

Statistical analyses were performed with SPSS 20.0 (IBM Statistics, Ehingen, Germany). HERV expression data were analyzed with bivariate correlation analyses (Spearman rank correlation) and Chi<sup>2</sup> tests. Survival analyses were performed with Kaplan–Meier analyses and multivariate Cox regression analyses adjusted for resection status, localization of the tumor, tumor entity, and tumor stage (inclusion).
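For readers unfamiliar with the Spearman rank correlation, the following sketch shows how the coefficient is obtained: both variables are converted to ranks (ties receive the mean of their positions) and the Pearson correlation of the ranks is computed. It is a plain-Python illustration under stated assumptions, not a reproduction of the SPSS procedure; the function names and the example values are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Assign 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


# Hypothetical expression values for two transcripts in five samples.
herv_k = [0.2, 0.6, 1.5, 4.0, 9.0]
herv_f = [0.5, 0.9, 2.0, 6.0, 2.0]
print(round(spearman_rho(herv_k, herv_f), 3))
```

Working on ranks makes the coefficient insensitive to the strongly skewed expression ranges that are typical for 2<sup>−ΔCT</sup> values.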

## RESULTS

## HERV-K and -F Expression in Soft Tissue Sarcoma Samples

HERV-K mRNA expression was detected in all 120 patient samples with a mean expression of 4.1 (range: 0.06–95.6; 2<sup>−ΔCT</sup> value). HERV-F mRNA expression was also detected in all 120 patient sarcoma specimens with a mean expression of 6.0 (range: 1 × 10<sup>−5</sup>–99.3; 2<sup>−ΔCT</sup> value, see **Figures 1A,B**). Additionally, we measured the HERV-K and HERV-F mRNA expression in normal skeletal muscle tissue. HERV-K mRNA expression was determined at 0.093 (2<sup>−ΔCT</sup> value) and HERV-F mRNA expression at 0.296 (2<sup>−ΔCT</sup> value). Skeletal muscle tissue therefore exhibited lower HERV mRNA values than 76.6 and 78.3% of STS samples, respectively. For survival analyses, HERV-K or HERV-F mRNA expression was classified according to the median as cut-off value (HERV-K: 0.56; HERV-F: 2.02). In Chi<sup>2</sup> tests, low or elevated HERV-K expression exhibited no correlation with demographic (age, sex) or clinical parameters (tumor entity or localization, resection type, tumor size, number of relapses, and patient status). In contrast, a significant correlation of HERV-F expression with the histological subtype of the STS was observed (p = 0.047).

<sup>1</sup>www.repeatmasker.org/
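The median split and a subsequent Chi<sup>2</sup> test on a 2 × 2 table can be illustrated as follows. This is a sketch under assumptions that the text does not specify: values equal to the median are counted as "low", no continuity correction is applied, and the p-value uses the closed form for one degree of freedom. All names and data are hypothetical.

```python
from collections import Counter
from math import erfc, sqrt
from statistics import median


def dichotomize(values):
    """Split values at their median; values above it are labeled 'elevated'."""
    cutoff = median(values)
    # Assumption: values equal to the median count as 'low'.
    return cutoff, ["elevated" if v > cutoff else "low" for v in values]


def contingency_2x2(groups, outcomes, group_levels, outcome_levels):
    """Count co-occurrences of two binary labels into a 2x2 table."""
    counts = Counter(zip(groups, outcomes))
    return [[counts[(g, o)] for o in outcome_levels] for g in group_levels]


def chi_square(table):
    """Pearson chi-square statistic (no continuity correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            if expected == 0:
                raise ValueError("empty row or column in contingency table")
            stat += (observed - expected) ** 2 / expected
    return stat


def p_value_df1(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return erfc(sqrt(stat / 2.0))


# Hypothetical expression values and relapse labels, for illustration only.
expression = [0.1, 0.3, 0.4, 0.5, 0.9, 1.2, 3.0, 8.0]
relapse = ["no", "no", "no", "yes", "no", "yes", "yes", "yes"]
cutoff, labels = dichotomize(expression)
table = contingency_2x2(labels, relapse, ["low", "elevated"], ["no", "yes"])
stat = chi_square(table)
print(cutoff, table, round(stat, 3), round(p_value_df1(stat), 3))
```

Tables with more than two rows or columns, such as expression group versus histological subtype, have more degrees of freedom and would need a general chi-square distribution function from a statistics library instead of `p_value_df1`.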

## Association of HERV Expression with Clinicopathological Parameters

In bivariate correlation analyses, the association between HERV-K or HERV-F expression and several clinicopathological parameters was tested (see **Table 3**). Interestingly, HERV-F and HERV-K expression was significantly associated (r<sub>S</sub> = 0.499; p = 6.4 × 10<sup>−9</sup>). Both HERV-F and HERV-K expression exhibited a significant inverse association with the actual tumor diameter at surgery (r<sub>S</sub> = −0.309; p = 0.001 and r<sub>S</sub> = −0.467; p = 7.4 × 10<sup>−8</sup>; respectively). Moreover, HERV-K mRNA expression was significantly inversely associated with BCL2 mRNA expression (r<sub>S</sub> = −0.408; p = 0.0002) and positively associated with miR-199a expression (r<sub>S</sub> = 0.361; p = 0.0004), while solely HERV-F expression was significantly associated with miR-203 expression (r<sub>S</sub> = 0.333; p = 0.005). Intriguingly, both HERV-K and HERV-F expression were significantly associated with levels of hypoxia-related genes like HIF-1α mRNA expression (r<sub>S</sub> = 0.444; p = 3.0 × 10<sup>−6</sup> and r<sub>S</sub> = 0.359; p = 0.0002; respectively) or miR-210 expression (r<sub>S</sub> = 0.399; p = 0.001 and r<sub>S</sub> = 0.366; p = 0.002; respectively). Additionally, both HERV-K and HERV-F expression were significantly associated with H2A.Bbd mRNA expression (r<sub>S</sub> = 0.456; p = 0.0009 and r<sub>S</sub> = 0.302; p = 0.012, respectively).

## HERV-K or -F Expression and Patients' Survival

In Kaplan–Meier analyses, HERV-F mRNA expression showed no significant correlation with patients' disease-specific survival, whereas a lower HERV-K mRNA expression tended to be associated with a worsened survival (p = 0.08; log-rank test). Further, in a multivariate Cox regression analysis adjusted for the confounders resection type, tumor localization, tumor histotype and staging, there was no significant correlation between HERV-K or HERV-F mRNA expression and patient survival (see Supplementary Figure S6). Interestingly, when analyzing the relapse-free survival in Kaplan–Meier analyses, patients with a lower HERV-K mRNA expression exhibited a significantly longer relapse-free survival (p = 0.014; log-rank test, see **Figure 2A**). A comparable effect was observed in patients with lower HERV-F mRNA expression; however, it was not significant (p = 0.22, see **Figure 2B**). Additionally, when analyzing the effect of HERV-K mRNA expression on relapse-free survival in a multivariate Cox regression analysis, an elevated HERV-K mRNA expression tended to be associated with a 1.78-fold increased risk of relapse (p = 0.08). Furthermore, when comparing only patients exhibiting both low HERV-F and HERV-K expression (n = 43) with patients exhibiting both elevated HERV-F and HERV-K expression (n = 43) in a multivariate Cox regression analysis, an elevated expression of HERVs tended to be associated with a 2.08-fold increased relative risk of relapse (p = 0.066).
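The Kaplan–Meier estimate and the two-group log-rank test used for the relapse-free survival comparison can be sketched as below. This is an illustrative plain-Python version under stated assumptions, not the SPSS routine: event indicators are coded 1 for relapse and 0 for censored, the p-value uses the closed form for a chi-square statistic with one degree of freedom, and all follow-up times are hypothetical.

```python
from math import erfc, sqrt


def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with at least one event."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        relapses = 0
        removed = 0
        # Collect all subjects whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            relapses += data[i][1]
            removed += 1
            i += 1
        if relapses > 0:
            survival *= 1.0 - relapses / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


def logrank(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value)."""
    a = list(zip(times_a, events_a))
    b = list(zip(times_b, events_b))
    event_times = sorted({t for t, e in a + b if e})
    diff = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for x, _ in a if x >= t)  # at risk in group A
        n_b = sum(1 for x, _ in b if x >= t)  # at risk in group B
        d_a = sum(1 for x, e in a if x == t and e)
        d_b = sum(1 for x, e in b if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        diff += d_a - d * n_a / n  # observed minus expected events in A
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        raise ValueError("log-rank statistic is undefined for these data")
    stat = diff ** 2 / variance
    return stat, erfc(sqrt(stat / 2.0))


# Hypothetical relapse-free follow-up in months (1 = relapse, 0 = censored).
elevated_t, elevated_e = [4, 7, 9, 15, 30], [1, 1, 1, 0, 1]
low_t, low_e = [10, 22, 36, 48, 60], [1, 0, 1, 0, 0]
print(kaplan_meier(elevated_t, elevated_e))
print(logrank(elevated_t, elevated_e, low_t, low_e))
```

The log-rank test compares the observed number of relapses in one group with the number expected if both groups shared the same hazard at every event time; covariate adjustment, as in the Cox models reported above, is outside the scope of this sketch.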

## DISCUSSION

In this study, we demonstrated that a robust mRNA expression of HERV-K and HERV-F in a cohort of 120 STS samples is detectable, and that the expression of HERV-K and HERV-F is correlated with clinicopathological features and hypoxia-related gene expression. Furthermore, an elevated HERV-K mRNA expression was significantly associated with a shorter relapse-free survival.

There are only few data on the expression of HERVs in STS. Schiavetti and colleagues studied the expression of HERV-K-MEL in sarcoma in comparison to the expression in a patient's sample of melanoma cells. This transcript was detectable in 9/23 (39.1%) of sarcoma samples (Schiavetti et al., 2002). There are no data on the expression of HERV-F in sarcoma; however, one report shows a wide expression of HERV-F in tumor cell lines originating from breast carcinoma, ovarian carcinoma, pancreatic adenocarcinoma, prostate carcinoma, glioblastoma, and others (Yi and Kim, 2004). Interestingly, HERV-F expression was not detected in any somatic tissue tested, with the exception of placenta (Yi and Kim, 2004). These reports are consistent with the assumption that HERV sequences are normally methylated and therefore transcriptionally inactive, but are hypomethylated and activated during carcinogenesis (Kreimer et al., 2013; Hurst and Magiorkinis, 2017). Concordantly, treatment with 5-azacytidine, a known DNA methyltransferase inhibitor, activates HERV sequences in diverse tumor cell lines (Strissel et al., 2012; Laska et al., 2013; Chiappinelli et al., 2015). Therefore, we propose that re-induction of HERVs may also occur during the multi-step process of sarcomagenesis.

Intriguingly, we identified a significant association of HERV-K and HERV-F expression with that of the hypoxia-related genes HIF-1α and miR-210. HIF-1α is a key regulator of the hypoxic response, and miR-210 is the most prominent microRNA upregulated by hypoxia (Kulshreshtha et al., 2007). Little is known about interactions between HERV expression and the hypoxic response. It has been described that the hypoxia-mimetic CoCl<sub>2</sub> increases the expression of ERV3 in Hodgkin's lymphoma cell lines, and that this increase in ERV3 expression might be associated with a pro-apoptotic reaction (Kewitz and Staege, 2013). Other groups demonstrated upregulation of HERV-W expression in neuroblastoma cell lines due to hypoxic conditions (Hu et al., 2016) or upregulation of HERV-E expression in renal cell carcinomas due to inactivation of the von Hippel-Lindau factor and subsequent stabilization of the oxygen sensor protein HIF-1α (Cherkasova et al., 2011). In our ex vivo samples, we detected an inverse association between HERV-K and BCL2 mRNA expression, implying that HERV-K overexpression could be associated with apoptosis induction. However, other reports on in vitro cell cultures demonstrate that the HERV-K family exerts an anti-apoptotic role (Broecker et al., 2016). Further research on this contradiction is warranted.

Furthermore, HERV-K and HERV-F expression were both significantly correlated with the mRNA expression of H2A.Bbd, a histone H2A variant encoded on the X chromosome, which is found to be associated with the nucleosomes of transcriptionally active genomic regions (Chadwick and Willard, 2001). H2A.Bbd was further shown to induce a more relaxed structure of the DNA by destabilizing the nucleosome (Bao et al., 2004; Doyen et al., 2006), which resembles the genomic reorganization induced by histone acetylation in a modification-independent manner (Eirín-López et al., 2008). A recent report showed H2A.Bbd to localize temporarily on replication-active DNA regions. By this mechanism, H2A.Bbd holds the DNA in a more decondensed state, thereby increasing S-phase progression (Sansoni et al., 2014). Thus, it can be speculated that an H2A.Bbd overexpression in patient samples may be associated with a more transcriptionally active genome, resulting in an increased chance of reactivation and expression of HERV species.

In our patient cohort, we detected a significant association between a lower HERV-K expression and a longer relapse-free survival. This is concordant with previous reports describing a better overall prognosis for patients with a lower HERV-K expression in breast cancer (Golan et al., 2008; Zhao et al., 2011) or hepatocellular carcinoma (Ma et al., 2016). Additionally, hypomethylation and subsequent HERV induction were also demonstrated in ovarian carcinoma, and specifically the extent of HERV-K hypomethylation was associated with a poor prognosis and therapy resistance in ovarian carcinoma patients (Menendez et al., 2004; Iramaneerat et al., 2011).

## CONCLUSION

To the best of our knowledge, we present the first report suggesting an involvement of HERV-K expression in the clinical course of STS. From our ex vivo data, we also suggest that HERV-K and HERV-F expression may be regulated directly or indirectly by tumor hypoxia. Furthermore, HERV-K was associated with apoptosis in our samples and may therefore be an interesting therapeutic target.

## AUTHOR CONTRIBUTIONS

MG performed the data analysis and revised the manuscript; SB, LO, and MK carried out the clinical sample processing and the qPCR measurements; PW recruited the patients and collected the tissue specimens; TG, HT, and MS conceived the study design, and prepared and revised the manuscript. All authors read and approved the final version of the manuscript.

## ACKNOWLEDGMENTS

The authors want to thank Ines Volkmer for excellent technical assistance. They also acknowledge the financial support of the Open Access Publication Fund of the Martin Luther University of Halle-Wittenberg.

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmicb.2018.00211/full#supplementary-material


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Giebler, Staege, Blauschmidt, Ohm, Kraus, Würl, Taubert and Greither. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Human Endogenous Retrovirus K in the Crosstalk Between Cancer Cells Microenvironment and Plasticity: A New Perspective for Combination Therapy

Emanuela Balestrieri<sup>1</sup>, Ayele Argaw-Denboba<sup>1</sup>, Alessandra Gambacurta<sup>1</sup>, Chiara Cipriani<sup>1</sup>, Roberto Bei<sup>2</sup>, Annalucia Serafino<sup>3</sup>, Paola Sinibaldi-Vallebona<sup>1,3</sup> and Claudia Matteucci<sup>1</sup>\*

#### Edited by:

Martin Sebastian Staege, Martin Luther University of Halle-Wittenberg, Germany

### Reviewed by:

Tara Patricia Hurst, Abcam (United Kingdom), United Kingdom Kazuaki Monde, Kumamoto University, Japan

\*Correspondence:

Claudia Matteucci matteucci@med.uniroma2.it

#### Specialty section:

This article was submitted to Virology, a section of the journal Frontiers in Microbiology

Received: 28 February 2018 Accepted: 11 June 2018 Published: 02 July 2018

#### Citation:

Balestrieri E, Argaw-Denboba A, Gambacurta A, Cipriani C, Bei R, Serafino A, Sinibaldi-Vallebona P and Matteucci C (2018) Human Endogenous Retrovirus K in the Crosstalk Between Cancer Cells Microenvironment and Plasticity: A New Perspective for Combination Therapy. Front. Microbiol. 9:1448. doi: 10.3389/fmicb.2018.01448

<sup>1</sup> Department of Experimental Medicine and Surgery, University of Rome "Tor Vergata", Rome, Italy, <sup>2</sup> Department of Clinical Sciences and Translational Medicine, University of Rome "Tor Vergata", Rome, Italy, <sup>3</sup> Institute of Translational Pharmacology, National Research Council, Rome, Italy

Abnormal activation of human endogenous retroviruses (HERVs) has been associated with several diseases such as cancer, autoimmunity, and neurological disorders. In particular, in cancer HERV activity and expression have been specifically associated with tumor aggressiveness and patient outcomes. Cancer cell aggressiveness is intimately linked to the acquisition of peculiar plasticity and heterogeneity based on cell stemness features, as well as on the crosstalk between cancer cells and the microenvironment. The latter is a driving factor in the acquisition of aggressive phenotypes, associated with metastasis and resistance to conventional cancer therapies. Remarkably, in different cell types and stages of development, HERV expression is mainly regulated by epigenetic mechanisms and is subjected to a very precise temporal and spatial regulation according to the surrounding microenvironment. Focusing on our research experience with HERV-K involvement in the aggressiveness and plasticity of melanoma cells, this perspective aims to highlight the role of HERV-K in the crosstalk between cancer cells and the tumor microenvironment. The implications for a combination therapy targeted at HERVs with standard approaches are discussed.

Keywords: endogenous retroviruses, cancer plasticity, cancer therapy, cancer biomarker, combination therapy, reprogramming, stemness, tumor microenvironment

**Abbreviations:** CAR-T, chimeric antigen receptor T cells; Clone6, highly proliferating melanoma cell line; CSCs, cancer stem cells; FBS, fetal bovine serum; Grapes, non-adherent dark cellular aggregates; HDACi, histone deacetylase inhibitors; HERVs, human endogenous retroviruses; hESCs, human embryonic stem cells; HML, human endogenous MMTV-like; IFITM1, interferon induced transmembrane protein 1; iPSCs, induced pluripotent stem cells; LTRs, long terminal repeats; Melan A/MART-1, protein melanoma-A/melanoma antigen recognized by T cells 1; MHC-I, major histocompatibility complex class I; NANOG, DNA binding homeobox transcription factor; NNRTIs, non-nucleoside reverse-transcriptase inhibitors; OCT4, octamer-binding transcription factor 4; ORFs, open reading frames; RPMI, standard medium; SOX2, transcription factor, Sex determining region Y-box 2; TVM-A12, human melanoma cell line; X-VIVO, serum free medium.

## INTRODUCTION

Human endogenous retroviruses are replication-defective proviruses that make up roughly 8% of the human genome. HERVs are recognized as having a role in health maintenance (Vargas et al., 2009) and complex diseases (Suntsova et al., 2015; Meyer et al., 2017; Grandi and Tramontano, 2018), acting by remodeling the structure and function of DNA. Although most of the HERV sequences have been inactivated over time, some of them remain active and, with LTRs, retain the genes encoding the structure, replication, and accessory proteins of retroviruses (Wildschutte et al., 2016). LTRs contain regulatory sequences and their activity essentially depends on the chromatin and CpG island methylation of their regulatory regions (Katoh and Kurata, 2013). In different cell types and stages of development, HERV expression is mainly regulated by epigenetic mechanisms and is subjected to regulation according to the surrounding microenvironment (Hurst and Magiorkinis, 2017). Numerous endogenous/exogenous factors lead to the activation of HERVs, including hormones (Buslei et al., 2015), cytokines (Manghera et al., 2016), cytotoxic chemicals/drugs (Diem et al., 2012; Mercorio et al., 2017), radiation (Schanab et al., 2011), vitamins (Liu et al., 2016), and interactions with microorganisms (Balada et al., 2009; Toufaily et al., 2011; Gonzalez-Hernandez et al., 2012).

In the last few decades, many studies have highlighted the involvement of HERVs in complex diseases, such as cancer, autoimmunity and neurological disorders (Young et al., 2013; Meyer et al., 2017). Although research activity has focused on the "omics" characterization of tumors from the primary site to the metastasis, the molecules that act as intermediaries between the epigenetic effect mediated by the microenvironment and cell fate have not been completely identified.

The ability of tumors to adapt to microenvironmental changes is embedded in their plasticity. Moreover, on the basis of the genetic predisposition, both differentiated and stem cells are driven toward transformation by the epigenetic pressure of the tumor niche and the microenvironmental changes (van den Hurk et al., 2012; Taddei et al., 2013).

An overview of the knowledge on HERVs related to tumors, in light of our experience in melanoma, is provided in order to achieve new insights into the contribution of HERV-K to the crosstalk between cancer cells and the tumor microenvironment. In addition, we suggest future perspectives on their potential therapeutic uses.

## HERVs IN CANCER

Several mechanisms by which HERVs could produce pathological effects have been proposed, including generation of new variants of HERVs, insertional mutagenesis, and protein toxicity (Young et al., 2013). In this regard, HERV activation appears to influence the aggressiveness of different cancers, including seminoma, melanoma, leukemia, hepatocellular carcinoma, sarcoma, prostate, breast and colon cancer (Cegolon et al., 2013; Kassiotis, 2014; Pérot et al., 2015; Suntsova et al., 2015; Giebler et al., 2018). Likewise, the pathologic process of rheumatic disorders, systemic lupus erythematosus, multiple sclerosis, autism spectrum disorders, schizophrenia, bipolar disorder, psoriasis, type I diabetes, and systemic sclerosis shows a correlation with HERV activity (Alelú-Paz and Iturrieta-Zuazo, 2012; Balestrieri et al., 2012; Brodziak et al., 2012).

Several studies suggested that the aberrant activation of HERVs promotes tumorigenesis through oncogenic mechanisms, such as: (1) insertional mutagenesis with inactivation of tumor suppressor genes (Gerdes et al., 2016); (2) activation of downstream (proto-)oncogenes or genes involved in cell growth (Fan and Johnson, 2011); (3) expression of HERV-K oncogenes such as Rec and Np9 (Denne et al., 2007; Chen et al., 2013); (4) expression of HERV proteins involved in the fusion of tumor cells or immunosuppression (Downey et al., 2015); (5) disruption of cellular checkpoints (Kassiotis and Stoye, 2017; Lemaître et al., 2017).

Manifold HERV families have been identified; the HERV-K family is the most recently integrated into the human genome, comprising 10 so-called HML subgroups (Subramanian et al., 2011). Of these, the HML-2 subgroup maintains most of the ORFs actively transcribed. Due to differential transcript splicing and a 292-bp deletion at the pol and env boundary, HML-2 produces the protein Env (singly spliced) and the accessory proteins Np9 and Rec (doubly spliced) (Armbruester et al., 2002; Büscher et al., 2006). Their identification helped to understand and characterize this subgroup of HERV-K in many tumors, including ovarian, breast and prostate cancer, melanoma, lymphomas, leukemias, and sarcomas (Cegolon et al., 2013; Kassiotis, 2014). HERV-K DNA polymorphisms, mRNA and proteins have been detected in cancer cells; in addition, viral particles have been identified in tissue, serum, and cell lines (Hohn et al., 2013). Interestingly, HERV-K is involved in cell transformation and contributes to the metastatic phenotype (Downey et al., 2015). Accordingly, we demonstrated the reactivation of HERV-K under restrictive conditions to be strictly required in human melanoma cells to support the expansion of a subpopulation of cancer cells with stemness features (Serafino et al., 2009; Argaw-Denboba et al., 2017).

## HERVs AND STEMNESS

Stem cells have self-renewal capacity and give rise to progeny capable of differentiating into diverse cell types. The transcription factors OCT4, SOX2, and NANOG have fundamental roles in maintaining the pluripotency and stemness features of hESCs and contribute to the reprogramming of adult somatic cells into iPSCs (Kashyap et al., 2009; Yamasaki et al., 2014). Recent studies showed HERV activity (mainly HERV-H and HERV-K) in hESCs and iPSCs (Ohnuki et al., 2014; Grow et al., 2015). Specifically, LTR7/HERV-H is one of the transposable elements found more often at the binding sites of OCT4 and NANOG (Kunarso et al., 2010) and its targeting compromises the self-renewal functions (Wang et al., 2014). Furthermore, DNA hypomethylation at HERV-K LTR elements, together with transactivation by OCT4, increases HERV-K expression during embryogenesis.

In addition, the overexpression of Rec in pluripotent cells increases the interferon-induced transmembrane protein 1 (IFITM1), suggesting a role of HERV-K in the immunoprotection of human embryos against viruses sensitive to the IFITM1-type restriction (Grow et al., 2015).

Possessing stemness features is crucial for cancer progression and metastasis. The generation of subpopulations with stemness features determines cancer self-renewal, proliferation and differentiation, allowing immune evasion and acquisition of resistance to therapy. These subpopulations, called CSCs, give rise to heterogeneous cell populations and maintain an undifferentiated state that equips them with the plasticity required to survive environmental stress (Aponte and Caicedo, 2017; Ramos et al., 2017).

The role of HERVs in stemness and the acquisition of cancer stemness are linked by a complex crosstalk of cellular signals, in which microenvironmental changes play a significant role (Cabrera et al., 2015; Argaw-Denboba et al., 2017; Flavahan et al., 2017).

## THE ROLE OF HERV-K IN THE PLASTICITY OF CANCER CELLS: OUR POINT OF VIEW IN MELANOMA

Several studies suggest that tumorigenesis is determined by genetic alterations, which contribute to transformation, as well as by external factors present in the cancer microenvironment. The microenvironment is therefore considered a part of the tumor that constantly changes in parallel with cancer progression, as a result of bidirectional interactions between tumor cells and cellular and molecular components of their "niche" (Plaks et al., 2015; Wang et al., 2017). These interactions are essential for the establishment of a permissive stem cell microenvironment, providing a fine balance between self-renewal/differentiation and quiescence/proliferation. The tumor microenvironment is characterized by adverse growth conditions (hypoxia and acidosis), which trigger a stress response in cancer cells that, with molecules such as cytokines and growth factors, is instrumental in phenotype switching, angiogenesis, tumor growth, and immune evasion.

Cellular plasticity is fundamental for tumor progression and metastasis, to adapt to changes in the microenvironment (Taddei et al., 2013). Aggressive cancer cells share many characteristics with embryonal progenitors, expressing developmental genes that allow the differentiation into a wide range of cell lineages, including neural, mesenchymal, and endothelial cells. This mimicry of other cell lineages becomes essential in the cancer's ability to adapt to microenvironmental changes. For instance, melanoma cells show phenotypic heterogeneity and maintain their morphological and biological plasticity despite repeated cloning (Bröcker et al., 1991; Hendrix et al., 2003; Boiko et al., 2010).

Several studies from our group and others demonstrated that HERV-K (HML-2) has a potential aggravating role in malignant melanoma and in immune escape during metastasis (Büscher et al., 2006; Serafino et al., 2009; Argaw-Denboba et al., 2017). Since HERV-K is responsive to microenvironmental changes, and melanoma cells are strongly associated with epigenetic and microenvironmental anomalies, the association of HERV-K activation with carcinogenesis is particularly intriguing (Li et al., 2015; Roesch, 2015).

Our group has established and characterized a metastatic human melanoma cell line, termed TVM-A12, that is highly heterogeneous, plastic and responds strongly to microenvironmental alterations (Melino et al., 1993; Serafino et al., 2009; Argaw-Denboba et al., 2017) (**Figure 1**). This has supplied a model to study the crosstalk between HERVs and cancer cells in a changing microenvironment. The TVM-A12 cellular monolayer is characterized by the presence of cells with different morphologies including small ovoid, spindle polygonal and large dendritic forms. Notably, the multiple morphologies and melanin production persisted after years of continued passage in culture. When grown in different media, despite changes in morphology and functional characteristics, TVM-A12 cells retain the ability to restore the original phenotype if standard conditions are re-established. Peculiarly, when cultured in specific media that promote differentiation, TVM-A12 cells show specific phenotypes of melanogenic, adipogenic, and osteogenic lineages (unpublished data) (**Figure 1**). This morphological transition correlated with the change of culture media, without committing to terminal differentiation, has been previously described and is considered a hallmark of stemness in melanoma cells (Zhu et al., 2014).

TVM-A12 cells were also cultured at low serum concentrations (RPMI with 1% FBS), a recognized protocol for inducing microenvironmental stress conditions in vitro. This prompted a change in their phenotype, switching from adherent to suspension cells, generating a highly proliferating cell line called Clone6 (**Figure 1**). A major event for the generation of metastatic tumor cells is the inhibition of anoikis, the programmed cell death pathway induced by loss of integrin-mediated cell matrix interactions, revealed in vitro by the ability of cells to grow in an anchorage-independent manner (Paoli et al., 2013). In the case of Clone6, following cell detachment, cells undergo uncontrolled growth and show loss of expression of immune recognition molecules (MHC-I, Melan A/MART-1), loss of melanin production and the ability to generate tumor masses in mice (unpublished data), as one would expect from highly malignant cells. Uniquely, Clone6 cells are unable to return to the original phenotype when standard culture conditions are re-established, and there is a marked transcriptional activation of HERV-K with the concomitant production and release of viral particles, along with these phenotypic and functional changes. The generation of Clone6 from TVM-A12 is shown to be dependent on HERV-K, as HERV-K down-regulation by RNA interference prevents it.

When switching to a serum free medium, such as X-VIVO (typically used for stem cells), TVM-A12 generated non-adherent dark cellular aggregates called Grapes, with a low growth rate (**Figure 1**). Grapes are characterized by increased expression of CSC markers (CD133 and nestin), loss of expression of immune recognition molecules (MHC-I, Melan A/MART-1) and an increase in melanoma progression and metastasis associated markers (CD10 and CXCR4). This phenotype switch is accompanied by an increased expression of HERV-K, and the link to microenvironmental changes is confirmed by a strong down-regulation of HERV-K expression when cells are returned to media containing serum (RPMI with 10% FBS; X-VIVO with 2–10% FBS). Once again confirming its important role, the silencing of HERV-K in TVM-A12 leads to a reduction in Grapes formation and an induction of cell death.

This HERV-K interference, during Grapes generation, also specifically inhibits the expansion of a CD133+ subpopulation with stemness features, demonstrating the requirement of HERV-K activation to sustain the expansion of this subpopulation. Microenvironmental stress is indicated as critical for regulating stemness of tumor cells (Plaks et al., 2015). This subpopulation is characterized by recognized hallmarks of cancer aggressiveness such as high expression of OCT4, self-renewal, and migration and invasion capacity. Remarkably, when these cells are treated with NNRTIs such as efavirenz and nevirapine, the results mimic HERV-K interference, with a decrease in HERV-K expression and a concomitant cell death induction. We cannot be certain whether the effect of NNRTI treatment is due to a direct inhibition of the reverse transcriptase of HERV-K or is mediated by the action on other cellular components such as LINE-1 (Sinibaldi-Vallebona et al., 2011).

Tumors are graded and evaluated based on their degree of cellular differentiation; more malignant cancers lose the characteristics of the original tissue and approach a more stem-cell-like state (Lathia and Liu, 2017). Given this, it appears that HERV-K inhibition offers a promising avenue of research for combination therapy in cancer.

## HERV-K AND THE ACQUISITION OF CANCER HALLMARKS: RATIONALE FOR COMBINATION THERAPIES

Cell heterogeneity and plasticity are the main drivers of the clonal evolution of genetic resistance and the emergence of highly metastatic tumor phenotypes resistant to conventional chemotherapies and radiation (Skvortsov et al., 2014; Roesch, 2015). The phenotype-switching ability of melanoma cells in response to the microenvironment drives the dynamicity of its immune escape and malignant characteristics (Li et al., 2015). Accordingly, our studies on melanoma have shown how the activation of HERV-K under microenvironmental stress induces and maintains tumor cell plasticity and determines the acquisition of the typical cancer hallmarks, such as changes in phenotype, stemness features, immune evasion, and metastasis (**Figure 2A**). Cancer progression is accompanied by metabolic alterations and epigenetic reprogramming. The tumor microenvironment, poor in nutrients and oxygen, is responsible for the metabolic switch observed in cancer cells. The accumulation of glycolysis metabolic products, such as lactate, induces local immune suppression, which facilitates tumor progression and metastasis (Renner et al., 2017). Both cellular nutrient metabolism and chromatin organization are remodeled in cancer cells, and these alterations play a key role in tumor development and growth. Indeed, many chromatin-modifying enzymes utilize metabolic intermediates as cofactors or substrates, and recent studies have shown that the epigenome is sensitive to cellular metabolism (Yun et al., 2012). Thus, while epigenetic alterations can modify the expression of metabolic enzymes, the metabolic reprogramming can affect the cancer cell epigenome as well (by DNA methylation and histone modifications). One of the most important events for the development and progression of cancer is global DNA hypomethylation (Ehrlich, 2009; Sandoval and Esteller, 2012); indeed, the expression of HERV-K is strongly associated with hypomethylation (Stengel et al., 2010; Kreimer et al., 2013), and with increased genomic instability and transcriptome activity (Romanish et al., 2010). In this context, the identification and study of mechanisms and regulators of the metabolic switch and epigenetic modifiers, with an eye toward targeted therapies to be used in combination, should provide more effective cancer therapies.

The eradication of tumors and prevention of recurrences are current challenges in cancer therapy. Thus, starting from conventional cancer treatments, including chemotherapy, radiotherapy and surgery, new combination approaches are needed. In this view, we consider HERV-K targeting as a strategy to improve response to therapy (**Figure 2B**). The enhancement of the patient's immune response plays a key role in the treatment of cancer. Indeed, in the field of cancer immunotherapy, progress has been made in the development of new technologies aimed at boosting the immune system, such as monoclonal antibodies and engineered CAR-T cells against tumor antigens (Khalil et al., 2016). In fact, targeting of the HERV-K envelope protein by CAR-T cells has already been reported as a potential immunotherapeutic approach for melanoma and other tumors (Krishnamurthy et al., 2015). Targeted immunotherapy research demonstrated the potential of anti-HML-2-Env antibodies in inhibiting tumor growth and inducing apoptosis, both in vitro and in in vivo mouse models (Wang-Johanning et al., 2012). Moreover, based on the immunogenic property of HERV-K proteins (Reis et al., 2013), studies are underway on a peptide-based vaccine derived from HERV-K in order to control the spread of cancer (Kraus et al., 2013).

Another approach is the identification of the microenvironmental factors and the corresponding signal transduction pathways that are responsible for transdifferentiation of cancer cells. These could be potential targets for a new therapeutic approach for cancer reprogramming into a differentiated state, with a decrease or even loss of cancer stemness features and malignancy (Carpentieri et al., 2016). One promising result in this direction has been the achievement of a new type of cancer cell reprogramming: the osteogenic differentiation of neuroblastoma cells, switched to a different germ layer through rapamycin induction in the presence of a scaffold, without an intermediate iPSCs step (Carpentieri et al., 2015). It would be of interest to study how the expression of retroelements is regulated during the cancer transdifferentiation process.

Histone deacetylase inhibitors are currently used in the clinical setting as anticancer agents that alter the regulation of histone proteins. HDACi can modify the acetylation status of histones, resulting in the induction of cell cycle arrest, apoptosis or differentiation (Eckschlager et al., 2017). HDACi potentially re-activate HERVs; however, the beneficial or detrimental effects of epigenetic drugs on HERV modulation are still under discussion (Chiappinelli et al., 2015; Hurst et al., 2016; Daskalakis et al., 2018; White et al., 2018).

In a broader context, our group and other authors had already suggested intervening on the activity of retroelements with antiretroviral drugs (Sinibaldi-Vallebona et al., 2011; Argaw-Denboba et al., 2017; Contreras-Galindo et al., 2017), which now more than ever appears as a promising new component in future combination therapies in cancer.

## FUTURE DIRECTIONS

In this scenario, the responsiveness to external stimuli of HERVs attributes to these genetic elements a high relevance in the crosstalk between tumor and microenvironment. Based on our experience we have demonstrated that HERV-K is fundamental in the acquisition of stemness features and aggressiveness under the pressure of the microenvironment. Following the path indicated by our results on melanoma cells, we aim to widen the scope and depth of our knowledge of HERV-K in cancer plasticity, by exploring other cancer types and by deciphering the molecular pathways underlying the responsiveness of HERV-K to the microenvironment. We believe that future combination therapies able to target HERV-K and other retroelements will become indispensable weapons in a wider arsenal for fighting cancers. Therefore, we propose that expanding the range of therapeutic options will allow defining personalized combination therapies in the future.

## AUTHOR CONTRIBUTIONS

All authors listed have made substantial, direct and intellectual contributions to this perspective, revised and approved the final version of the manuscript for publication.

## FUNDING

This work was supported by the European Project Tempus n.144529-2008.

## ACKNOWLEDGMENTS

We wish to thank Giacomo Diedenhofen and Dr. Martino Tony Miele for their linguistic assistance.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Balestrieri, Argaw-Denboba, Gambacurta, Cipriani, Bei, Serafino, Sinibaldi-Vallebona and Matteucci. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# APOBEC3B Activity Is Prevalent in Urothelial Carcinoma Cells and Only Slightly Affected by LINE-1 Expression

Ananda Ayyappan Jaguva Vasudevan<sup>1,2</sup>\*, Ulrike Kreimer<sup>1</sup>, Wolfgang A. Schulz<sup>1</sup>, Aikaterini Krikoni<sup>2</sup>, Gerald G. Schumann<sup>3</sup>, Dieter Häussinger<sup>2</sup>, Carsten Münk<sup>2</sup> and Wolfgang Goering<sup>1,4</sup>\*

<sup>1</sup> Department of Urology, Medical Faculty, Heinrich Heine University Düsseldorf, Düsseldorf, Germany, <sup>2</sup> Clinic for Gastroenterology, Hepatology, and Infectiology, Medical Faculty, Heinrich Heine University Düsseldorf, Düsseldorf, Germany, <sup>3</sup> Division of Medical Biotechnology, Paul-Ehrlich-Institut, Langen, Germany, <sup>4</sup> Institute of Pathology, Medical Faculty, Heinrich Heine University Düsseldorf, Düsseldorf, Germany

#### Edited by:

Martin Sebastian Staege, Martin Luther University of Halle-Wittenberg, Germany

#### Reviewed by:

Harold Charles Smith, University of Rochester, United States Gkikas Magiorkinis, National and Kapodistrian University of Athens, Greece

#### \*Correspondence:

Ananda Ayyappan Jaguva Vasudevan ananda.ayyappan@med.uniduesseldorf.de Wolfgang Goering w.goering@hhu.de

#### Specialty section:

This article was submitted to Virology, a section of the journal Frontiers in Microbiology

Received: 15 March 2018 Accepted: 15 August 2018 Published: 04 September 2018

#### Citation:

Jaguva Vasudevan AA, Kreimer U, Schulz WA, Krikoni A, Schumann GG, Häussinger D, Münk C and Goering W (2018) APOBEC3B Activity Is Prevalent in Urothelial Carcinoma Cells and Only Slightly Affected by LINE-1 Expression. Front. Microbiol. 9:2088. doi: 10.3389/fmicb.2018.02088

The most common mutational signature in urothelial carcinoma (UC), the most common type of urinary bladder cancer, is assumed to be caused by the misdirected activity of APOBEC3 (A3) cytidine deaminases, especially A3A or A3B, which are known to normally restrict the propagation of exogenous viruses and endogenous retroelements such as LINE-1 (L1). The involvement of A3 proteins in urothelial carcinogenesis is unexpected because, to date, UC is thought to be caused by chemical carcinogens rather than viral activity. Therefore, we explored the relationship between A3 expression and L1 activity, which is generally upregulated in UC. We found that UC cell lines highly express A3B and in some cases A3G, but not A3A, and exhibit corresponding cytidine deamination activity in vitro. While we observed evidence suggesting that L1 expression has a weak positive effect on A3B and A3G expression and A3B promoter activity, neither efficient siRNA-mediated knockdown nor overexpression of functional L1 elements affected catalytic activity of A3 proteins consistently. However, L1 knockdown diminished proliferation of a UC cell line exhibiting robust endogenous L1 expression, but had little impact on a cell line with low L1 expression levels. Our results indicate that UC cells express A3B at levels exceeding A3A levels by far, making A3B the prime candidate for causing genomic mutations. Our data provide evidence that L1 activation constitutes only a minor and negligible factor involved in induction or upregulation of endogenous A3 expression in UC.

Keywords: cytidine deaminase, APOBEC3B, APOBEC3G, APOBEC3H, urothelial cancer cells, LINE-1, innate immunity, mutation

## INTRODUCTION

The apolipoprotein B mRNA editing enzyme catalytic polypeptide 3 (APOBEC3, A3) protein family of Zn<sup>2+</sup>-dependent DNA cytidine deaminases constitutes a defensive network of proteins restricting exogenous viruses (Chiu and Greene, 2008; Harris and Dudley, 2015) and endogenous transposable elements (Schumann, 2007; Schumann et al., 2010; Refsland and Harris, 2013; Salter et al., 2016). They restrain retroviral replication mainly by deamination of cytidines in ssDNA following reverse transcription (Chiu and Greene, 2008; Münk et al., 2012; Vasudevan et al., 2013; Harris and Dudley, 2015). Importantly, APOBEC3B (A3B) is constitutively localized in the nucleus (Muckenfuss et al., 2006) and inhibits HIV-1 infection independent of the presence of Vif, which otherwise counteracts the activity of the remaining A3 family members (Bishop et al., 2004; Doehle et al., 2005). A3 proteins also inhibit human papilloma virus (HPV) and hepatitis B virus (HBV) (Harris and Dudley, 2015; Henderson and Fenton, 2015). Recently, large-scale exome and whole genome mutation studies have revealed distinct differences in mutational spectra and mutation frequencies between tumor entities (Alexandrov et al., 2013; Burns et al., 2013; Lawrence et al., 2013; Roberts et al., 2013). Many tumors of diverse entities display a characteristic mutational signature with strand-coordinated clusters of C→T transitions, which are frequently located in the proximity of chromosomal breakpoints. This signature is often associated with increased A3A or A3B mRNA expression levels and is thought to be caused by misdirected A3 activity, partly in conjunction with viral infection (Burns et al., 2013; Roberts et al., 2013). Indeed, almost all cervical cancers and a significant fraction of head and neck cancers (HNSCC), all harboring frequent A3 related mutations, are associated with viral infections (Vartanian et al., 2008; Lawrence et al., 2013). Additionally, A3G expression in HPV-induced uterine cervical intraepithelial neoplasia (CIN) and infiltration of A3G expressing CD3 positive T cells in CIN lesions were reported (Iizuka et al., 2017). In contrast, the frequent occurrence of a characteristic mutational A3 signature (Lawrence et al., 2013) in urothelial carcinoma (UC) is puzzling, as these tumors are thought to be caused predominantly by chemical carcinogens rather than viral infections (Tolstov et al., 2014).

It is assumed that retroelements, including endogenous retroviruses that are flanked by long terminal repeats (LTRs), and non-LTR retrotransposons such as long interspersed nuclear element-1 (LINE-1, L1) and short interspersed nuclear elements (SINEs), have been the original targets of A3 activity and have provided the evolutionary pressure necessary for the continuous expansion of the A3 locus in primates (Münk et al., 2012). Mobilization of these retroelements is restricted by the different members of the A3 protein family to protect the genome from deleterious retrotransposition events (Muckenfuss et al., 2006; Schumann, 2007; Chiu and Greene, 2008; Goodier and Kazazian, 2008; Horn et al., 2013; Orecchini et al., 2018). For instance, the role of A3B in intracellular defense against transposable element activity was recently demonstrated by a twofold to fourfold increase in retrotransposition efficiency of an engineered human L1 reporter after shRNA-based knockdown of A3B in hESCs (Wissing et al., 2011). Importantly, L1 retrotransposition has been detected during development and progression of many human cancer entities (Lee et al., 2012; Doucet-O'Hare et al., 2015; Ewing et al., 2015) (for review: Carreira et al., 2014; Goodier, 2014; Burns, 2017). In UC, the most common histological subtype of urinary bladder cancer, L1-mediated retrotransposition frequency has not been established to date. However, L1Hs elements were reported to be particularly strongly hypomethylated in UCs (Nusgen et al., 2015), full-length L1 (FL-L1) transcript levels are increased (Kreimer et al., 2013) and L1 ORF1 protein (ORF1p) can be detected in UC tissues (Rodic et al., 2014; Whongsiri et al., 2018). Beyond L1-mediated retrotransposition, L1-encoded gene products may contribute to carcinogenesis by other mechanisms, including the regulation of RNA–DNA hybrids (Sciamanna et al., 2014; Schwertz et al., 2018). 
Moreover, experimental L1 downregulation in colon carcinoma cells led to reduced mRNA levels of the catalytic telomerase subunit hTERT and the telomerase RNA component hTERC (Aschacher et al., 2016). Whether this observation can be extrapolated to other cancer types like UC is so far unknown.

Conceivably, A3-induced genomic mutations may represent collateral damage to the human genome by a response originally directed against endogenous retrotransposons or exogenous viruses (Alexandrov et al., 2013). Thus, we hypothesized that in UC, where exogenous viruses are considered to contribute rarely to carcinogenesis, induction of A3 protein expression and their mutagenic effects might rather represent a response to the well-documented activation of endogenous L1 retrotransposons in these tumors. To address this hypothesis, we analyzed the mRNA expression profile of the different A3 family members in UC cell lines, which we had previously characterized for FL-L1 expression (Kreimer et al., 2013) and established the actual presence of A3-specific enzymatic activity. Subsequently, we investigated the consequences of modulating L1Hs expression by siRNA-mediated knockdown or ectopic overexpression for A3 expression and cellular properties. While UC cell lines did not express detectable A3A mRNA levels, expression of A3B mRNA was prominent in many. Our experiments provide evidence that there is only some minor effect of L1Hs expression on A3B promoter activity, which alone cannot explain the extensive upregulation of A3B expression in UC tissues and cell lines. Modulation of L1 expression did not have any consistently detectable effect on the expression of endogenous A3A, A3B, or A3G, even though knockdown of L1 elements with intact ORF1p impeded cell growth.

## MATERIALS AND METHODS

## Tissue Samples and Cell Lines

All urothelial cancer cell lines (UCCs) used in this study (253J, 5637, 639-V, 647-V, BFTC905, HT-1376, J82, MGHU4, RT4, RT-112, SCaBER, SD, SW-1710, UMUC3, UMUC6, VM-CUB1, T24) were cultured in DMEM GlutaMax (Gibco, Darmstadt, Germany), supplemented with 10% fetal calf serum (Koch et al., 2012). BC61 cells were cultured as described previously (Seifert et al., 2007). The cell lines were obtained from the Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures (Braunschweig, Germany), except for the cell line UMUC3, which was kindly provided by Dr. Grossman (Houston, TX, United States). The human embryonal carcinoma cell lines NCCIT (ATCC CRL-2073) and Tera-1 (ATCC HTB-105) were kindly provided by Dr. R. Loewer (Paul-Ehrlich-Institut, Langen, Germany) and cultured as described (Hoffmann et al., 2006). HeLa cells (ATCC CCL-2) were cultured following the supplier's recommendations. The telomerase-immortalized TERT-NHUC cell line was kindly provided by Dr. M. A. Knowles (Leeds, United Kingdom) and cultured as described (Chapman et al., 2009). Cell lines were authenticated prior to use by STR profiling in the Institute of Forensic Medicine, Heinrich Heine University Düsseldorf, Germany. Cultures of primary urothelial (UP) cells were established from ureters after nephrectomy and were routinely maintained in keratinocyte serum-free medium (KSFM, Gibco, Darmstadt, Germany) supplemented with 12.5 µg/ml bovine pituitary extract and 0.25 ng/ml epidermal growth factor as described (Swiatkowski et al., 2003). Tissue samples for UP generation were collected with patient informed consent and approval by the ethics committee of the medical faculty of the Heinrich Heine University, Study Number 1788.

## Nucleic Acid Extraction and cDNA Synthesis

To minimize DNA contamination, total RNA was extracted by acid phenol extraction followed by column purification. Synthesis of complementary DNA was performed using the QuantiTect Reverse Transcription Kit (Qiagen, Hilden, Germany), according to the manufacturer's instructions, including an extra DNA removal step by DNase as recommended by the supplier. Briefly, 1 µg of total RNA was subjected to a genomic DNA elimination reaction in a 14 µl volume, comprising 2 µl of 7x gDNA-Wipeout-Buffer, RNA, and water. The reaction mixture was incubated at 42°C for 2 min and then kept on ice. One microliter of the reaction mixture was taken and mixed with 14.38 µl of water in a new tube (considering 1 µg total RNA input, the RNA concentration in this solution would be 4.64 ng/µl), which served as mock RT template for the RT-qPCR assay. With the remaining 13 µl of reaction mixture, cDNA synthesis was performed (the 20 µl reaction mixture consisted of 1 µl RT, 4 µl 5x RT buffer, 1 µl RT primer mix, 1 µl water, and 13 µl DNase-treated RNA) by incubating the RT reaction components for 30 min at 42°C and then inactivating the RT enzyme by heating at 95°C for 3 min (according to the manufacturer's instructions). In order to minimize potential inhibitory effects of the RT buffer system on qPCR, a 1:10 dilution of the cDNA product was generated prior to the PCR reaction quantifying FL-L1 transcripts. The final nucleic acid concentrations of the RNA suspension (used for the mock RT-qPCR) and the cDNA suspension were both adjusted to 4.64 ng/µl prior to qPCR.
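The concentrations quoted above can be retraced with a short calculation. The following is a minimal sketch, assuming that the cDNA concentration is expressed as input-RNA equivalents and that no material is lost during handling; the variable and function names are illustrative and not part of the original protocol.

```python
# Retrace the 4.64 ng/µl figure for the mock-RT control and the diluted cDNA.
RNA_INPUT_NG = 1000.0   # 1 µg total RNA
GDNA_STEP_VOL_UL = 14.0  # volume of the gDNA elimination reaction


def concentration(mass_ng, volume_ul):
    """Return concentration in ng/µl."""
    return mass_ng / volume_ul


# Concentration after the gDNA elimination step (about 71.4 ng/µl).
stock = concentration(RNA_INPUT_NG, GDNA_STEP_VOL_UL)

# Mock-RT control: 1 µl of the reaction mixed with 14.38 µl water.
mock_rt = concentration(stock * 1.0, 1.0 + 14.38)

# cDNA: the remaining 13 µl in a 20 µl RT reaction, then diluted 1:10.
cdna = concentration(stock * 13.0, 20.0) / 10.0

print(round(mock_rt, 2), round(cdna, 2))
```

Under these assumptions both templates end up at the same nominal concentration, which is what makes the Ct values of the mock-RT control directly comparable with those of the cDNA samples.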

The efficiency of DNase treatment was assessed by qPCR on RNA samples that had not been incubated with reverse transcriptase. Data (Ct values) obtained from cell lines 5637, VM-CUB1, 639-V, SD, BC61, RT4 are provided in **Supplementary Table 1**. Ct values obtained from mock-RT experiments were found to be comparable with those obtained from the blank control (water). The qPCR conditions were as follows: initial denaturation step at 95°C for 15 min, followed by 40 amplification cycles consisting of denaturation at 95°C for 15 s, annealing at 55°C for 20 s and extension at 72°C for 30 s, using the primers presented in the following method section and **Supplementary Table 2**.

## Quantitative Real-Time Reverse Transcription PCR (RT-qPCR)

RT-qPCR was performed on a 7500 Fast Real-Time PCR System (Applied Biosystems, Carlsbad, CA, United States) or Roche LightCycler 96 (Hoffmann-La Roche Ltd., Basel, Switzerland) using the QuantiTect SYBR Green PCR Kit (Qiagen, Hilden) with cDNA (1:10 diluted) from DNase-treated RNA samples (see also above) as described previously (Goering et al., 2011). To quantify transcripts, specifically designed primers (**Supplementary Table 2**) were employed using the following PCR conditions: initial denaturation at 95°C for 15 min, followed by 40 amplification cycles consisting of denaturation at 95°C for 15 s, annealing at the appropriate temperature for 20 s and extension at 72°C for 30 s. Assay specificity was controlled for by using UCSC In-Silico PCR and melting curve profiles. All measurements were performed at least in duplicate; assay variance was <10%. Relative expression was calculated by the modified ΔΔCt method using TATA-box binding protein (TBP) mRNA levels as a reference gene transcript (Pfaffl, 2001). To ascertain efficient amplification, a standard curve was included in each RT-qPCR experiment using cDNAs from activated PBMCs (A3A, A3F, A3H), UMUC3 (A3B, A3D), PC3 (A3C, TBP), 5637 (A3G), and VM-CUB1 (FL-L1), respectively.
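The efficiency-corrected relative quantification referred to above can be illustrated as follows. This is a minimal sketch of the ratio described by Pfaffl (2001), not the authors' analysis script; the Ct values and the standard-curve slope in the usage lines are hypothetical, and the function names are illustrative.

```python
def efficiency_from_slope(slope):
    """Amplification efficiency E from a standard-curve slope
    (Ct plotted against log10 of template input); E = 2 means
    perfect doubling per cycle."""
    return 10 ** (-1.0 / slope)


def pfaffl_ratio(e_target, ct_target_control, ct_target_sample,
                 e_ref, ct_ref_control, ct_ref_sample):
    """Efficiency-corrected expression ratio of a target transcript
    in a sample relative to a control, normalized to a reference
    transcript (here this would be TBP)."""
    delta_target = ct_target_control - ct_target_sample
    delta_ref = ct_ref_control - ct_ref_sample
    return (e_target ** delta_target) / (e_ref ** delta_ref)


# Hypothetical example: the target Ct drops by 3 cycles in the sample,
# the reference Ct is unchanged, and both assays amplify with E = 2.
ratio = pfaffl_ratio(2.0, 28.0, 25.0, 2.0, 24.0, 24.0)
print(ratio)
print(efficiency_from_slope(-3.3219))
```

With equal efficiencies of 2 for target and reference, the ratio reduces to the familiar 2^(-ΔΔCt) expression; the standard curves mentioned in the text supply the per-assay efficiencies when they deviate from 2.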

To quantify transcript levels of human endogenous FL-L1 elements, primers specific for the 5′-UTR sequence of the L1.3 reference element (Acc. No. L19088.1, Sassaman et al., 1997) were used which bind L1.3 nucleotide positions 99–120 (L1\_5′\_for: 5′-GTACCGGGTTCATCTCACTAGG-3′) and 323–344 (L1\_5′\_rev: 5′-TGTGGGATATAGTCTCGTGGTG-3′) (**Supplementary Table 2**). RT-qPCRs with these primers were performed as previously described (Kreimer et al., 2013).

## Immunoblot Analysis

Twenty micrograms of each protein lysate were boiled in 3x SDS sample buffer (New England Biolabs, Frankfurt/Main, Germany), loaded on 4–12% Bis/Tris gels (Invitrogen), subjected to SDS-PAGE, and electroblotted onto nitrocellulose membranes. After protein transfer, membranes were blocked for 2 h at room temperature in 10% non-fat milk powder in 1xPBS-T [137 mM NaCl, 3 mM KCl, 16.5 mM Na2HPO4, 1.5 mM KH2PO4, 0.05% Tween 20 (Sigma-Aldrich, Mannheim, Germany)], washed in 1xPBS-T, and incubated overnight with the respective primary antibody at 4°C.

L1 ORF1p was detected using the polyclonal rabbit-anti-L1 ORF1p antibody #984 (Raiz et al., 2012) at a 1:2,000 dilution in 1xPBS-T containing 5% milk powder as primary antibody. Subsequently, membranes were washed three times in 1xPBS-T. As secondary antibody, we used HRP-conjugated donkey anti-rabbit IgG antibody at a 1:30,000 dilution (Amersham Biosciences, Freiburg, Germany) in 1xPBS/5% milk powder for 2 h. Subsequently, the membrane was washed three times for 10 min in 1xPBS-T. β-actin expression was detected using a monoclonal anti-β-actin antibody (clone AC-74, Sigma-Aldrich, Steinheim, Germany) at a dilution of 1:30,000 as primary antibody and an anti-mouse HRP-linked species-specific antibody (from sheep) at a dilution of 1:10,000 as secondary antibody. Immunocomplexes were visualized using luminol-based ECL immunoblot reagent (Amersham Biosciences). For the immunoblot analysis shown in **Figure 2D**, the applied monoclonal anti-L1 ORF1p antibody was kindly provided by Dr. K. Burns (Johns Hopkins University, Baltimore, MD, United States) (Rodic et al., 2014) or purchased (clone 4H1, 1:10,000 dilution, Millipore, Darmstadt, Germany). For immunoblot detection of A3H protein in selected UCCs, anti-Human APOBEC3H monoclonal antibody (P1H6, cat # 12156, 1:10<sup>3</sup> dilution, NIH AIDS reagent) was used.

## Transfection Experiments

In order to knock down expression of functional endogenous L1 elements, cells were transfected for 72 h with 20 nmol of either L1\_siRNA#1 (5′-GAGAACGCCACAAAGAUACtt-3′) (Oricchio et al., 2007) or L1\_siRNA#2 (5′-GAAAUGAAGCGAGAAGGGAAGUUUA-3′) (Aschacher et al., 2016), specifically targeting nucleotide positions 1512–1531 or 1287–1312 of the L1.3 reference sequence (acc.no: L19088.1; Sassaman et al., 1997), respectively (**Supplementary Table 2**), or a non-specific control (Silencer select negative control siRNA 1; Life Technologies, Darmstadt, Germany). Transfections were performed using Lipofectamine RNAiMAX (Life Technologies) according to the manufacturer's instructions. The chosen siRNAs specifically target roughly 500 full-length and truncated L1 elements of the L1PA1 (L1Hs) subfamily in the average human genome, which can be transcriptionally active (Skowronski et al., 1988; Sheen et al., 2000; Tang et al., 2017). To ensure that the siRNA effects persist in long-term (120 h) experiments, the siRNA transfection procedure was repeated 3 days after the initial transfection. Knockdown of A3B and A3G expression was accomplished by transient transfection of siRNAs (Cat. Nos. L-017322-00 and L-013072-00, Dharmacon/ON-TARGETplus siRNA Reagents) for 72 h. As indicated in **Figure 4**, the total concentration of siRNAs was maintained at 20 nM for all transfections.

The episomal L1 retrotransposition reporter plasmids pJM101/L1RP (Kimberland et al., 1999) and pAJG101/L1RP (**Supplementary Figure 1**) facilitating ectopic expression of the mneoI-tagged, full-length L1RP element, and their empty vector pCEP4 (Thermo Fisher Scientific) were separately transfected into UC cells using the X-tremeGENE 9 DNA transfection reagent (Roche). In pAJG101/L1RP the CMV promoter in pJM101/L1RP was replaced by the CAG promoter (Niwa et al., 1991; Kobayashi, 1996).

## A3B Promoter Constructs and Reporter Assays

For this study, we constructed two A3B promoter luciferase reporter constructs, pA3B-120 and pA3B-1200 (**Supplementary Figure 2**). To generate pA3B-1200, the genomic sequence flanked by nucleotide positions −1200 and +18 relative to the ATG start codon of the A3B gene was amplified from genomic HeLa DNA applying forward primer A3B\_Pr\_-1200\_F (5′-GATGGTACCGCTCCCAGCAACCCCCCAG) and reverse primer A3B\_Pr\_-1200\_R (5′-CATGCTAGCCTGATCTGTGGATTCATGTTCAGC) and Pwo DNA polymerase (Roche) using the following PCR conditions: one cycle 94°C for 2 min; 30 cycles of 94°C for 30 s, 58°C for 60 s, and 72°C for 60 s; one cycle 72°C for 7 min. The amplicon was inserted into the promoterless luciferase reporter plasmid pGL3-Basic (Promega, Mannheim, Germany) via KpnI and NheI restriction sites in the primer sequences (**Supplementary Figure 2**).
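As a quick plausibility check of the cloning strategy, the two primers can be scanned for the restriction sites used for insertion into pGL3-Basic. The sketch below uses the primer sequences given above and the well-known recognition sequences of KpnI (GGTACC) and NheI (GCTAGC); the helper name `site_position` is illustrative.

```python
# Recognition sequences of the enzymes used for cloning.
KPNI_SITE = "GGTACC"
NHEI_SITE = "GCTAGC"

# Primer sequences as given in the text (5'->3').
A3B_PR_1200_F = "GATGGTACCGCTCCCAGCAACCCCCCAG"
A3B_PR_1200_R = "CATGCTAGCCTGATCTGTGGATTCATGTTCAGC"


def site_position(primer, site):
    """Return the 0-based index of the first occurrence of a
    recognition site in the primer, or -1 if it is absent."""
    return primer.upper().find(site)


kpni_pos = site_position(A3B_PR_1200_F, KPNI_SITE)
nhei_pos = site_position(A3B_PR_1200_R, NHEI_SITE)
print(kpni_pos, nhei_pos)
```

In both primers the site follows a three-nucleotide 5′ overhang, and the remainder of each primer is the part expected to anneal to the genomic template.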

The pGL3-Basic (Promega)-derived reporter plasmid pA3B-120, containing a 120-bp fragment of the A3B promoter, was constructed from pA3B-1200 using forward primer A3B\_Pr\_-120\_F (5′-GATGGTACCGCCCTGGGAGGTCACTTTAAG) and reverse primer A3B\_Pr\_-1200\_R (**Supplementary Figure 2**). UC cells 5637 and VM-CUB1 were plated in 24-well dishes and cotransfected with either pA3B-120 or pA3B-1200 and either pJM101/L1RP or pAJG101/L1RP on the next day using X-tremeGENE 9 DNA transfection reagent (Roche). The cells were cotransfected with each A3B-promoter-luciferase reporter construct or the ΔHOX plasmid, containing a frameshifted HOXB13 cDNA in pcDNA/TO4, as a MOCK-transfection control. Luciferase activity was quantified 2 days post transfection using the Dual Luciferase Reporter Assay System (Promega, Mannheim, Germany) as described (Goering et al., 2011).

## APOBEC3 Detection and in vitro DNA Deamination Assay

HEK293T cells were transfected with plasmids expressing hemagglutinin (HA)-tagged A3B or A3G using Lipofectamine LTX (Thermo Fisher Scientific). 5637, UMUC3 and VM-CUB1 UCCs were transfected with the pJM101/L1RP expression construct or ΔHOX as a MOCK-transfection control as described above. Seventy-two hours later, cells were washed in PBS and lysed with a mild lysis buffer [50 mM Tris (pH 8), 1 mM PMSF, 10% glycerol, 0.8% NP-40, 150 mM NaCl and 1X complete protease inhibitor (Calbiochem, Darmstadt, Germany)]. Lysates were clarified by centrifugation for 20 min at 14,800 rpm at 4°C. For immunoblot analyses, samples were boiled at 95°C for 5 min in Roti load reducing buffer (Roth, Karlsruhe, Germany) and separated on an SDS-PAGE gel, followed by transfer to a PVDF membrane. Membranes were blocked in TBST containing 5% skimmed milk powder and probed with the respective primary antibody. A rabbit anti-A3G antibody (anti-ApoC17; 1:10<sup>4</sup> dilution, NIH AIDS reagent) that is cross-reactive against A3A and A3B was used to detect A3B and A3G proteins (Mitra et al., 2014; Jaguva Vasudevan et al., 2018). Mouse anti-α-tubulin antibody (1:4,000 dilution, clone B5-1-2; Sigma-Aldrich, Taufkirchen, Germany) or goat anti-GAPDH (C-terminus, 1:15,000 dilution, Everest Biotech, Oxfordshire, United Kingdom) were used as primary antibodies for loading controls.

In vitro deamination reactions were performed as described (Nowarski et al., 2008; Jaguva Vasudevan et al., 2013) in a 10 µl reaction volume containing 25 mM Tris pH 7.0 and 100 fmol single-stranded DNA substrate (CCCA: 5′-GGATTGGTTGGTTATTTGTTTAAGGAAGGTGGATTAAAGGCCCAAGAAGGTGATGGAAGTTATGTTTGGTAGATTGATGG; TTCA: 5′-GGATTGGTTGGTTATTTGTATAAGGAAGGTGGATTGAAGGTTCAAGAAGGTGATGGAAGTTATGTTTGGTAGATTGATGG) with 2 µl of freshly prepared cell lysate. Samples were divided in half and 50 µg/ml RNase A (Thermo Fisher Scientific) was added to one half. Subsequently, reactions were incubated for 1 h at 37°C and terminated by incubating samples at 95°C for 5 min. An equivalent of one fmol single-stranded DNA substrate was used for PCR amplification with Dream Taq polymerase (Thermo Scientific) (95°C for 3 min, followed by 30 cycles of 61°C for 30 s and 94°C for 30 s) using the forward primers 5′-GATTGGTTGGTTATTTGTTTAAGGA for the CCCA substrate or 5′-GGATTGGTTGGTTATTTGTATAAGGA for the TTCA substrate, and in both cases the reverse primer 5′-CCATCAATCTACCAAACATAACTTCCA. PCR products resulting from the CCCA and TTCA substrates were digested with the restriction enzymes Eco147I (StuI) (Thermo Fisher Scientific) or MseI (NEB, Frankfurt/Main, Germany), respectively, and the resulting restriction fragments were separated on a 15% native PAGE gel and stained with ethidium bromide solution (7.5 µg/ml). To control for successful and efficient restriction digestion of the PCR products, additional substrate oligonucleotides in which the nucleotide sequences 5′-CCCA-3′ and 5′-TTCA-3′ were replaced by 5′-CCUA-3′ and 5′-TTUA-3′, respectively, were digested in parallel.
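The logic of the restriction-based readout can be retraced in silico. The sketch below is an illustration rather than part of the published protocol: it takes the two substrate sequences given above, converts the target cytosine of the motif to uracil, lets the PCR step read U as T, and then checks whether the recognition sequence of Eco147I/StuI (AGGCCT) or MseI (TTAA) has been created. For simplicity it treats the full substrate as the PCR product and ignores the slightly shorter forward primer used for the CCCA substrate; the function names are illustrative.

```python
# Recognition sequences of the two restriction enzymes named in the text.
STUI_SITE = "AGGCCT"  # Eco147I (StuI)
MSEI_SITE = "TTAA"    # MseI

# Single-stranded DNA substrates as given in the text (5'->3').
SUBSTRATES = {
    "CCCA": "GGATTGGTTGGTTATTTGTTTAAGGAAGGTGGATTAAAGGCCCAAGAAGGTGATGGAAGTTATGTTTGGTAGATTGATGG",
    "TTCA": "GGATTGGTTGGTTATTTGTATAAGGAAGGTGGATTGAAGGTTCAAGAAGGTGATGGAAGTTATGTTTGGTAGATTGATGG",
}


def deaminate(substrate, motif):
    """Convert the third base of the 4-nt motif (a C) to U, once."""
    edited_motif = motif[:2] + "U" + motif[3:]
    return substrate.replace(motif, edited_motif, 1)


def pcr_copy(template):
    """The polymerase reads U as T, so the amplified strand carries T."""
    return template.replace("U", "T")


def gains_site(motif, site):
    """True if the site is absent before and present after deamination."""
    before = pcr_copy(SUBSTRATES[motif])
    after = pcr_copy(deaminate(SUBSTRATES[motif], motif))
    return site not in before and site in after


print(gains_site("CCCA", STUI_SITE), gains_site("TTCA", MSEI_SITE))
```

In this simplified model a cleavable product appears only when the cytosine within the motif has been deaminated, which is why the CCUA and TTUA oligonucleotides serve as positive controls for the digestion step.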

## Cell Viability, Proliferation, Apoptosis, and Senescence-Associated (SA)-β-Galactosidase Assays

Cell viability and apoptosis were assessed in quadruplicates by CellTiter-Glo® Luminescent Cell Viability Assay and Caspase-Glo® 3/7 Assay (Promega, Madison, WI, United States), respectively, using a Wallac 1420 VICTOR2™ plate reader (PerkinElmer, Waltham, MA, United States). Cell proliferation was measured by an EdU incorporation assay (baseclick GmbH, Neuried, Germany), according to the manufacturer's instructions. The SA-β-galactosidase assay was performed as described (Itahana et al., 2007).

## Clonogenicity Assay

For clonogenicity assays, cells were plated at low density into 6 cm dishes and grown until colonies started to become confluent for the fastest growth condition. Colonies were stained with Giemsa (Merck, Darmstadt, Germany).

## Analysis of A3H Single Nucleotide Polymorphism (SNP)

Since A3H haplotype I encodes an active cytosine deaminase described to reside in the nucleus and linked to cancer mutagenesis (Starrett et al., 2016), the status of A3H SNPs was determined in our panel of UCCs. A 305-bp genomic DNA fragment of the A3H gene was amplified from DNA of various UCCs using the primer pair A3H forward 5′-AGTGCCATGCAGAAATTTGCTTT and A3H reverse 5′-CGGGGGTTTGCACTCTTATAACT. Amplified fragments were subjected to direct Sanger sequencing and results were analyzed for the highly polymorphic SNP rs139297 and the adjacent SNPs rs139298/rs139299.

## Statistical Analyses

p-Values were calculated by the Mann–Whitney U test using SPSS Statistics 21 (IBM, Armonk, NY, United States) or by the unpaired Student's t-test with GraphPad Prism (GraphPad Software, San Diego, CA, United States). Data are presented as the mean ± standard deviation (SD). Significant differences (p < 0.05) are marked by asterisks. Correlation coefficients and significance were calculated by non-parametric Spearman's rank correlation (Spearman's rho) and were corrected for multiple testing using the Bonferroni method.

## RESULTS

## A3B Is Upregulated in Urothelial Cancer Cells

As a first step to investigate if the expression of specific members of the A3 protein family of cytidine deaminases is a response to the upregulation of endogenous L1 expression, we profiled mRNA levels of all seven members of the A3 protein family (A3A, A3B, A3C, A3D, A3F, A3G, and A3H) in 19 UCCs and five independent primary cultures of normal urothelial cells (UPs) by RT-qPCR. A3A was only detectable in the urothelial cell culture UP86, the UCC BFTC905 (**Figure 1** and **Supplementary Figure 3**) and in activated PBMCs serving as positive control for A3A expression (data not shown). A3A expression was not detectable in any of the remaining UCCs and UPs. In contrast, A3B expression was barely detectable in all analyzed urothelial cell cultures, but was high in all analyzed UCCs except for the 647-V line, which harbored no detectable A3B mRNA at all (**Figure 1** and **Supplementary Figure 3**). A3C, D, and F transcripts were detectable at moderate levels in all tested UPs and the majority of UCCs, but a few UCCs expressed A3C and A3D at extremely low or undetectable levels (**Figure 1** and **Supplementary Figure 3**). A3G expression was robust in UPs but low in most UCCs, especially in those originating from muscle-invasive urothelial cancers (**Figure 1** and **Supplementary Figure 3**). A3H transcripts were below detection level in the majority of muscle-invasive UCCs and in two UPs, while the remaining cell lines and UP cultures displayed moderate or robust transcript levels. In the papillary UCC BC61, A3H expression was exceptionally upregulated (**Figure 1** and **Supplementary Figure 3**).

Quantification of FL-L1 transcripts revealed moderate L1 expression in all UPs and robust transcriptional L1 upregulation in almost all analyzed UCCs with the exception of the UCCs 253J and 5637 (**Figure 1** and **Supplementary Figure 3**). If L1 upregulation contributed to activation of A3 enzymes, mRNA expression of L1 and A3 would be expected to correlate. We therefore calculated whether any correlation between FL-L1 mRNA expression levels and mRNA levels of the various A3 genes could be detected in UCCs using Spearman's rho correlation (**Figure 1**, see also Kreimer et al., 2013). Surprisingly, only the correlation between L1 and A3H RNA levels reached the level of significance (Spearman's rho 0.419, <sup>∗</sup>p = 0.042), which was lost after Bonferroni correction for multiple testing.
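The loss of significance after correction can be illustrated with a short calculation. The sketch below assumes that one correlation test was performed per A3 gene, i.e., m = 7 comparisons; this number is an assumption made for illustration and is not stated explicitly in the text.

```python
def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_value * n_tests)


ALPHA = 0.05
N_TESTS = 7          # assumption: one test per A3 family member
P_L1_VS_A3H = 0.042  # uncorrected p-value reported in the text

p_adjusted = bonferroni(P_L1_VS_A3H, N_TESTS)
still_significant = p_adjusted < ALPHA
print(round(p_adjusted, 3), still_significant)
```

Equivalently, an uncorrected p-value would have to fall below 0.05/7 (about 0.007) to remain significant under this assumption, which the reported value of 0.042 does not.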

Because A3H was expressed in several UPs and A3H haplotype I (A3H-I), a specific allele of A3H, has been implicated in breast and lung carcinogenesis (Starrett et al., 2016), we additionally determined the A3H genotype at SNPs rs139297, rs139298 and rs139299 in UCCs (**Supplementary Table 3**). Approximately two thirds of the tested UCCs harbor the G105/K121 allele associated with the A3H-I haplotype in a homozygous (6/18) or heterozygous (7/18) manner. However, expression of A3H was not detectable in selected UCCs irrespective of the A3H genotype (**Supplementary Figure 7**). This finding implies A3H as a possible but unlikely source of A3 mutations in several but not all UCCs.

## L1 ORF1p Is Expressed in Most UCCs

The design of the L1-specific RT-qPCR assay to quantify L1 transcript levels (Kreimer et al., 2013; **Figure 1**), which is based on the L1.3 reference sequence (Sassaman et al., 1997), does not allow transcripts of the approximately 100 retrotransposition-competent L1Hs elements (Brouha et al., 2003) encoding functional L1 ORF1 and L1 ORF2 proteins to be fully distinguished from non-functional FL-L1 transcripts. Therefore, to investigate relative expression levels of L1Hs elements encoding functional L1 proteins, we performed quantitative immunoblot analyses using anti-L1-ORF1p antibodies (Raiz et al., 2012; Rodic et al., 2014).

Consistent with their previously assessed relative FL-L1 mRNA levels (Kreimer et al., 2013) (**Figure 1**), elevated amounts of L1 ORF1p were detected in the UCC lines BFTC905, RT-112, VM-CUB1, and SD (**Figure 2A** and **Supplementary Figures 4A,C**). More moderate but still detectable L1 ORF1p levels were present in J82, SW-1710, UMUC6, 253J, 5637, 639-V, 647-V, HT-1376, T24, and UMUC3 cells (**Supplementary Figure 4**). The relatively modest amounts of L1 ORF1p in BC61 and RT4 cells do not seem consistent with the transcriptional L1 upregulation in these cells (**Figure 1**), but could be explained by the fact that a subset of FL-L1 elements transcribed in these cell lines does not encode intact L1 proteins. Of note, L1 ORF1p expression was not detectable in the normal urothelial cell culture UP239 and the TERT-NHUC immortalized normal urothelial cell line (**Figure 2A**).

## L1 siRNA-Mediated Knockdown of Functional, Endogenous L1 Elements in UCCs Exerts Largely Diverse Effects on A3 Expression

In order to investigate potential effects of L1-encoded gene products on the expression of A3 proteins, we modulated the expression of L1 elements in selected representative UCCs with robust endogenous A3B transcription levels and either low/moderate (5637 and 639-V) or high (VM-CUB1 and SD) L1 mRNA and ORF1p expression levels, respectively. Expression of full-length L1Hs elements was downregulated in the four UCCs by transfecting either siRNA#1 (Oricchio et al., 2007) or siRNA#2 (Aschacher et al., 2016), which are specifically directed against ORF1 of the human L1.3 reference sequence (Sassaman et al., 1997; see Materials and Methods).

Transfection of each of the two L1-specific siRNAs efficiently knocked down L1 ORF1p expression in VM-CUB1 (**Figures 2B,D**), 5637 (**Figure 2C**), and SD UCCs. L1 mRNA and ORF1p expression was barely detectable in 639-V and remained unchanged after L1 siRNA treatment (**Figures 1**, **2D**).

Interestingly, RT-qPCR using primer combinations specific for the 5′ end of the L1 5′-UTR (see Materials and Methods and **Supplementary Table 2**) revealed that overall FL-L1 transcript levels were reduced only by at most 50% (**Figures 2E–H**). Since the partial mRNA knockdown observed in these cell lines nevertheless resulted in a highly efficient L1 ORF1p depletion (**Figures 2A–D**), this observation indicates that the siRNAs efficiently target most if not all protein-coding L1 elements harboring an intact ORF1. In contrast, non-functional FL-L1 elements with mutations in ORF1 that differ from the L1.3 reference sequence were targeted less efficiently.

Following transfection with L1-specific siRNAs, A3A mRNA expression remained undetectable in all four UCC lines (**Figures 2E–H**). A3H expression was only detectable in SD cells among these four UCCs (compare with **Figure 1**) and not altered by the knockdown of L1 expression in the examined UCCs (**Figures 2E–H**), whereas L1 knockdown exerted variable effects on the expression of A3B to A3G in the tested UCCs.

Transfection of L1\_siRNA#1 diminished overall FL-L1 transcript levels in the analyzed UCC lines by 8% (639-V) to 42% (5637) (**Figures 2E–H**), and was associated with a drop of A3B expression by 25% (5637) to 55% (VM-CUB1). A3G expression was diminished by 14, 47 and 36% in VM-CUB1, 5637, and SD cells, respectively, and remained unaffected in 639-V cells. However, only minor, negligible effects on A3C, A3D and A3F transcript levels were observed in VM-CUB1, 5637, and SD cells. In 639-V cells, A3C expression was unchanged, whereas A3D and A3F transcript levels were increased by 56% and 38%, respectively.

Following transfection with L1\_siRNA#2, A3B transcript levels remained basically unchanged in 5637, SD, and 639-V cells, but increased in VM-CUB1 cells by approximately twofold. A3C expression was increased in VM-CUB1, SD, and 639-V cells by 70%, 45%, and 30%, respectively, but remained unchanged in 5637 cells. An increase in A3D expression was observed in each of the four UCCs and ranged from 12 to 90%, while A3F transcript levels were upregulated only in VM-CUB1 and 639-V cells, and remained essentially unaffected in 5637 and SD cells (**Figures 2E–H**). The knockdown of functional L1 elements was associated with an increase of A3F transcript levels in 639-V cells by 44%, but had no meaningful effect on A3F expression in the remaining cell lines. L1\_siRNA#2-mediated knockdown of ORF1p-expressing endogenous L1 elements was associated with a strong increase of A3G transcript levels in three UCCs ranging from 50% in the SD line to fourfold in VM-CUB1 cells but had no consequences for A3F expression in 5637 cells, and no noteworthy effect on 5637 cells (**Figures 2E–H**).

In summary, knockdown of functional L1 elements with L1\_siRNA#2 resulted in a general increase of A3C, D, F, and G transcript levels in three out of four analyzed UCCs, while L1\_siRNA#1-mediated knockdown was associated with a comparable major increase in A3C, A3D, and A3F transcript levels only in 639-V cells and a minor increase in VM-CUB1 cells. A decrease of A3 transcript levels was observed only after L1\_siRNA#1-mediated knockdown in 5637 and SD UCCs and was minor, ranging from 10 to 50% among the A3 genes. Taken together, no consistent correlation between the knockdown of functional endogenous L1 elements and expression of A3 genes could be observed, but knockdown of intact L1s was more often associated with increases rather than decreases of A3 gene expression. In sum, these findings are not consistent with the hypothesis that expression of A3 genes is upregulated in UCCs as a consequence of functional L1s activation. Furthermore, it is currently unclear why the effects of functional L1 knockdown on A3 gene expression differed between L1\_siRNA#1- and L1\_siRNA#2-mediated knockdowns.

## Ectopic Expression of Retrotransposition-Competent L1 Elements Has Only Minor Effects on Expression of Selected A3s

To investigate the consequences of ectopic overexpression of retrotransposition-competent L1 elements on endogenous A3B and A3G transcript levels in UCC lines, we next transfected either of the L1 expression plasmids pJM101/L1RP and pAJG101/L1RP (**Supplementary Figure 1**) into the cell lines VM-CUB1 and 5637, which are characterized by relatively high and low endogenous L1 transcript levels, respectively (**Figure 1**). Following transfection, FL-L1 RNA levels increased 2.5- to 3-fold in VM-CUB1 cells and by 23% to 46% in 5637 cells, as demonstrated by the L1\_5′-UTR-specific RT-qPCR assay (**Figures 3A,B**). To analyze whether ectopic L1 overexpression affects endogenous A3 expression, we quantified A3A, A3B, and A3G mRNA levels in the transfected UCCs. We found that expression of A3B and A3G was slightly increased in VM-CUB1 UCC after transfection with pAJG101/L1RP, but only A3B expression changes reached the level of significance (**Figures 3A,B**). In 5637 UCC, A3B and A3G expression was not altered significantly. Of note, A3A expression remained undetectable in both cell lines (data not shown).

To study whether ectopic expression of functional L1 elements can induce A3B promoter activity, we co-transfected VM-CUB1 and 5637 cells with either of the two A3B promoter luciferase reporter constructs pA3B-120 or pA3B-1200 (**Supplementary Figure 2**) together with the L1 expression plasmids pJM101/L1RP and pAJG101/L1RP or the empty pCEP4 vector as negative control. Activity of the A3B promoter encoded by pA3B-1200 increased by ∼36% – 42% and 64% – 80%, respectively, after co-transfection of the luciferase reporter construct with pJM101/L1RP or pAJG101/L1RP into VM-CUB1 and 5637 cells relative to the effect of the control vector (**Figures 3C,D**). This increase in promoter activity is consistent with the increase in endogenous A3B mRNA levels by ∼27% after transfection of plasmid pAJG101/L1RP in VM-CUB1 cells (**Figure 3A**). Taken together, induction of L1 activity has only minor effects on A3B expression as well as promoter activity and significant effects are limited to VM-CUB1 UCC with high L1 expression.

## A3B Deaminase Activity Is Predominant in UCCs and Not Altered by Ectopic L1 Expression

We investigated expression and deaminase activity of A3 enzymes suspected to cause mutations during bladder carcinogenesis in selected UCCs with different A3 mRNA expression levels. We chose specifically (i) the UCC 5637 exhibiting major transcriptional upregulation of A3B and A3G, (ii) UMUC3 showing robust transcript levels of most A3 genes (with A3B expression being the highest), and (iii) VM-CUB1 with very low transcript levels of most A3 family members except for A3B (**Figure 1**). Distinguishing A3B from A3G by standard immunoblotting techniques is challenging due to the high amino-acid sequence homology between both proteins and their comparable molecular masses (Burns et al., 2015; Jaguva Vasudevan et al., 2018). Therefore, to determine whether A3B or A3G is responsible for any potential deamination activity in these cancer cell lines, A3B and A3G were knocked down separately in different cultures or simultaneously in the same culture. Successful downregulation of A3B and A3G was confirmed by RT-qPCR using A3B- and A3G-specific primer pairs. Transfection of the UCCs 5637, UMUC3, and VM-CUB1 with A3B-specific siRNA alone reduced A3B expression by >90% (**Figure 4A**). Similarly, A3G expression levels dropped after transfection with A3G-specific siRNA to below 10% in all three UCCs (**Figure 4A**). Simultaneous treatment with A3B- and A3G-specific siRNAs resulted in diminished A3B and A3G mRNA levels in all examined UCCs comparable to those observed after single siRNA treatment (**Figure 4A**).

Immunoblot analyses of cell extracts isolated from the differently transfected 5637 cells with an anti-A3G antibody (ApoC17) (Kao et al., 2003) reported to cross-react with A3B detected a ∼45 kDa protein in 5637 cells, which is consistent with the predicted molecular weights of both A3B and A3G proteins (**Figure 4B**) and their mRNA expression pattern in 5637 cells (**Figure 1**). The intensity of the 45-kDa band was slightly diminished after transfection of the cells with A3B-specific siRNA, but the band disappeared almost entirely after transfection with A3G-specific siRNA or a combination of both siRNAs (**Figure 4B**). Expression of the 45-kDa protein was not affected by transfection of control siRNA.

Transfection of A3G-specific siRNA strongly depleted the amounts of the 45-kDa protein in both UMUC3 and VM-CUB1 cells, whereas the A3B-specific siRNA had only a minor effect on the 45-kDa protein levels in UMUC3 cells and did not affect its expression at all in VM-CUB1 cells (**Figure 4B**). These findings suggest that the majority of the 45-kDa proteins detected with the anti-A3G antibody represents A3G. A note of caution on the detection of endogenous A3B: Since a more specific antibody capable of selectively detecting endogenous A3B enzyme in UC cell lines is currently not available, we cannot formally exclude that A3B protein is not depleted by A3B-specific siRNA, despite downregulation of A3B mRNA. Of note, A3A expression was not considered because there was no evidence for the presence of A3A mRNA in the analyzed UCC lines (**Figures 1**, **2**), and consistently, immunoblot analysis with anti-A3A antibodies did not provide any evidence for the presence of A3A proteins (data not shown).

In order to investigate if A3B and/or A3G are enzymatically active in UCC lines, DNA deamination activity assays were performed using cell lysates from the different UCCs transfected with A3B- and A3G-siRNAs as controls (**Figure 4C**). To measure deaminase activity, we applied a qualitative PCR-based in vitro DNA deamination assay to identify C→U conversion in an 80-nt single-stranded DNA substrate harboring the isozyme-specific motif TTCA or CCCA, specifically recognized by A3B or A3G, respectively (Jaguva Vasudevan et al., 2017; Yang et al., 2017). Catalytic deamination of C→U in the respective motif creates specific restriction sites, which can be detected by restriction analysis of the PCR product. As an additional control, substrate specificity of A3B and A3G was tested using lysates from 293T cells transiently transfected with A3B or A3G expression plasmids (**Supplementary Figure 5**). Of note, whereas the substrate TTCA (YTCA) was reported as a statistically favorable target for A3A over A3B in cancer tissues (Chan et al., 2015), TTC was demonstrated as a preferred target for all A3s except for A3G, in vitro, based on high resolution structures, protein–DNA interaction studies, and enzymatic assays (Yu et al., 2004; Harjes et al., 2017; Jaguva Vasudevan et al., 2017; Kouno et al., 2017; Shi et al., 2017; Yang et al., 2017). The assay demonstrated that A3G deaminase activity is present in all three UCCs (**Figure 4C**, CCCA panel). A3G deaminase activity was robust in 5637 cells, but lower in UMUC3 cells and barely detectable in VM-CUB1 cells. Importantly, as expected from the CCCA substrate specificity of A3G (Yu et al., 2004; Jaguva Vasudevan et al., 2017), siRNA-mediated knockdown of A3G affected product formation in the CCCA assay more efficiently than in the TTCA assay (**Figure 4C**). Using the CCCA substrate, A3B downregulation slightly reduced product formation, whereas simultaneous knockdown of A3B and A3G abolished detectable deaminase activity. Conversely, using the TTCA substrate, A3B knockdown, but not A3G knockdown, resulted in complete loss of detectable deaminase activity (**Figure 4C**, TTCA panel). Taken together, these data confirm that A3G favors the CCCA sequence motif and A3B prefers the TTCA motif, but also indicate that A3B might mutate CCCA sequences on ssDNA substrates with a low frequency. More importantly, combining the deamination assay data (**Figure 4C**) with the A3 expression data presented in **Figures 1** and **4A** leads to the conclusion that in vitro A3B is the predominantly active member of the A3 family in the tested UCCs, whereas A3G-specific deaminase activity is comparably low.
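The readout principle of the deamination assay can be illustrated with a minimal Python sketch. It is not the authors' protocol: the short substrate sequences are hypothetical stand-ins for the 80-nt oligonucleotide, and the function names are invented for illustration. The sketch assumes that the cytosine directly 5′ of the final A in each motif is the one deaminated, and it relies on the fact that PCR copies uracil as thymine, which is what changes the sequence of the product and allows detection by restriction analysis.

```python
# Illustrative sketch only: sequences and function names are hypothetical,
# not the 80-nt substrate or any software used in the study.

def deaminate(substrate: str, motif: str) -> str:
    """Convert the C directly 5' of the final A in `motif` to U (C->U).

    Assumption: the target cytosine is the last C of the motif
    (TTCA for A3B, CCCA for A3G).
    """
    if not motif.endswith("CA"):
        raise ValueError("motif must end in 'CA'")
    edited_motif = motif[:-2] + "UA"
    return substrate.replace(motif, edited_motif)


def pcr_copy(template: str) -> str:
    """PCR reads uracil as thymine, so U in the template becomes T."""
    return template.replace("U", "T")


def assay(substrate: str, motif: str, enzyme_active: bool) -> str:
    """Return the PCR product sequence for an active or inactive lysate."""
    template = deaminate(substrate, motif) if enzyme_active else substrate
    return pcr_copy(template)


# Hypothetical short substrates, each carrying one motif.
a3b_substrate = "GGATTGATTCAGGATTG"  # contains TTCA
a3g_substrate = "GGATTGACCCAGGATTG"  # contains CCCA

a3b_product = assay(a3b_substrate, "TTCA", enzyme_active=True)
a3g_product = assay(a3g_substrate, "CCCA", enzyme_active=True)
unedited_product = assay(a3b_substrate, "TTCA", enzyme_active=False)
```

Only the product from an active lysate differs from the input sequence, so a knockdown that removes the responsible enzyme leaves the product unchanged and the diagnostic restriction site absent.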

Next, we investigated the effect of the expression of functional L1 elements on the deamination activity of A3 proteins by ectopically overexpressing transfected functional L1 elements encoded by pAJG101/L1RP in UCCs. Lysates from transfected and untransfected UCCs were then either treated with RNase A to eliminate potential inhibitory RNA molecules, or left untreated, before they were assayed for A3B- or A3G-specific deaminase activity (Soros et al., 2007; McDougall and Smith, 2011). Ectopic L1 expression did not affect A3B- or A3G-encoded deaminase activity in any of the transfected UCC lines (**Figure 4D**).

## L1 Downregulation Reduces Cell Viability Irrespective of Apoptosis and Induces Senescence in UCC

While our results do not indicate that L1 expression affects A3B (or other A3) expression consistently, L1 silencing by siRNA impaired cell proliferation. In VM-CUB1 cells, which express L1 more strongly, the number of viable cells decreased to 47% and 7% after L1 knockdown using L1\_siRNA#1 and L1\_siRNA#2, respectively, compared to control siRNA-transfected cells (**Figure 5A**). The number of viable 5637 UCCs was less severely depleted to 68% and 46% after treatment with L1\_siRNA#1 and L1\_siRNA#2, respectively (**Figure 5A**). Caspase 3/7 activity, measured as an indicator of apoptosis, decreased to 43% and 8% in VM-CUB1 cells after L1\_siRNA#1 and L1\_siRNA#2 treatment, respectively (**Figure 5B**). In 5637 UCCs, caspase activity was increased after treatment with L1\_siRNA#1, but not with L1\_siRNA#2 (**Figure 5B**). According to flow cytometry data, the fraction of VM-CUB1 cells in G2/M phase was increased by roughly 8% in cells treated with either L1 siRNA (**Figure 5C**). In VM-CUB1 cells treated with L1\_siRNA#2, the fraction of subG1 cells also decreased. S-phase fraction was unchanged (**Figure 5C**) and accordingly, incorporation of EdU was only slightly diminished, especially following L1\_siRNA#1 treatment (**Supplementary Figure 6A**). Only minor changes in cell cycle distribution and, accordingly, EdU incorporation were seen in 5637 UCCs (**Figure 5D** and **Supplementary Figure 6A**). Thus, the decrease in viable cells following L1 knockdown does not reflect apoptosis.

In keeping with its effects on cell viability in short-term experiments, FL-L1 knockdown strongly diminished the clonogenicity of VM-CUB1 cells (**Supplementary Figure 6B**). Moreover, some VM-CUB1 cells showed morphological changes typical of senescent cells and stained positive for senescence-associated (SA)-β-galactosidase after treatment with either L1 siRNA, but very rarely after treatment with control siRNA (**Supplementary Figure 6D**). Unexpectedly, in 5637 cells, clonogenicity reproducibly increased after L1 knockdown using L1\_siRNA#2, whereas L1\_siRNA#1 exerted no significant effects on clonogenicity (**Supplementary Figure 6C**). Accordingly, no indications of senescence were detected in 5637 cells after treatment with L1 siRNAs or control siRNA (data not shown). Thus, L1 knockdown affected VM-CUB1 UCCs with high L1 expression levels more severely than 5637 cells with low L1 expression.

## DISCUSSION

## The APOBEC3 Signature in Urothelial Carcinoma

Mutations induced by misdirected activity of A3 proteins have been implicated in several cancer types (Alexandrov et al., 2013; Burns et al., 2013; Lawrence et al., 2013; Roberts et al., 2013). Following viral replication or in the context of other genomic disturbances, A3 proteins can act as endogenous sources of mutations that can promote genomic instability in cancer evolution (Tubbs and Nussenzweig, 2017). The contribution of A3s is especially plausible in cancers elicited by viruses, such as cervical cancer (Henderson et al., 2014), but the high frequency of an APOBEC3 related mutational signature in UC (Alexandrov et al., 2013; Lawrence et al., 2013; Roberts et al., 2013) remains unexplained. Indeed, in a recently published molecular characterization of muscle-invasive bladder cancer, it was calculated that APOBEC3-mediated mutagenesis contributes 67% of all single nucleotide variants (SNVs) (Robertson et al., 2017).

## APOBEC Isoenzymes in Urothelial Carcinogenesis

A specific question is which member of the A3 protein family is responsible for the observed mutational signature in UC. Bioinformatic analyses suggest that the mutational signature in UC matches A3A specificity better than A3B specificity (Chan et al., 2015; Lamy et al., 2016). However, the expression of A3B was reported to exceed A3A expression in UC tissues (Lamy et al., 2016) and A3B may be more capable of introducing base substitutions in genomic DNA in human cells (Shinohara et al., 2012). Likewise, our results demonstrated robust upregulation of A3B expression in 16/17 UCC lines relative to normal urothelial cell cultures (**Figure 1**), whereas A3A was essentially undetectable in almost all (16/17) analyzed UCCs. Obviously, expression and enzymatic activity of A3 family members may vary during urothelial carcinogenesis and may not be fully reflected in the pattern seen in UCCs. However, A3 mutational activity was shown to be involved in both early and late mutation events that occurred during urothelial carcinogenesis, arguing rather for continuous A3 mutational activity (McGranahan et al., 2015; Hurst et al., 2017; Robertson et al., 2017). Accordingly, A3B expression levels exceeding A3A were also observed in high-grade non-muscle invasive bladder cancers (NMIBCs) (Hedegaard et al., 2016). Furthermore, it seems unlikely that A3A would be selectively repressed in UCCs, whereas A3B remains upregulated. Thus, our results rather argue for the enzymatic activity of A3B being responsible for the observed mutations, at least in the context of UCC lines. Conceivably, A3A expression in UC tissues may partly result from macrophages and monocytes highly prevalent in high-grade NMIBCs (Peng et al., 2007; Koning et al., 2009; Thielen et al., 2010; Takeuchi et al., 2016), or may be induced in UC cells in vivo by factors located in the tumor environment.

Currently available antibodies directed against A3B cannot detect A3B at levels present in UCC lines (Burns et al., 2015; Jaguva Vasudevan et al., 2018). However, since we could demonstrate that the amounts of expressed A3G proteins correspond to their A3G mRNA levels (**Figures 1**, **4B**) in UCCs 5637, UMUC3 and VM-CUB1, this is very likely to be the case for A3B too. Moreover, cytidine deamination assays coupled with knockdown experiments convincingly revealed the expected substrate-specific activity levels for both A3B and A3G. Of note, the general DNA motif reported to be recognized by APOBEC proteins to introduce somatic mutations in cancer is "TC" (Roberts et al., 2013) (the A3B-specific motif in our assay here is TTCA). However, A3G recognizes the DNA sequence motif CCCA (Jaguva Vasudevan et al., 2013; Yang et al., 2017). In addition, A3G reportedly possesses a cytoplasmic retention signal that retains A3G exclusively in the cytoplasm (Jaguva Vasudevan et al., 2013; Bennett et al., 2008). For these reasons, A3G is not considered to contribute to A3-mediated mutagenesis during carcinogenesis. Interestingly, A3G may influence cancer cell survival via its likely role in DSB repair (Nowarski and Kotler, 2013).

## Are There Any Effects of Endogenous L1 Activity on A3 Upregulation in Urothelial Cancer Cells?

To address the general question of what triggers A3 activation in urothelial cancer cells, we pursued the hypothesis that A3 activation may be elicited by endogenous retroelement activity rather than the presence of exogenous viruses. Expression of functional endogenous L1 elements seems a plausible cause for A3 activation, because in urothelial cancer cells, L1 promoter sequences are frequently hypomethylated, and FL-L1 expression is increased even more than in other cancer types (Kreimer et al., 2013; Nusgen et al., 2015). In comparison, neither Alu nor HERV-K sequences are significantly upregulated in UCCs (Kreimer et al., 2013). However, our combined results do not allow the conclusion that L1 activity is a major factor for A3 activation, as neither siRNA-mediated downregulation of endogenous FL-L1 elements nor ectopic overexpression of RC-L1 reporter elements led to any consistent and significant alteration in the expression of any A3 protein family member. Only in VM-CUB1 cells did overexpression of the L1 reporter plasmid pAJG101/L1RP lead to a significant increase of A3B transcript levels (**Figure 3**). In addition, endogenous FL-L1 and A3 expression levels did not correlate with each other across the tested panel of cell lines. Here, future investigations are required to unambiguously elucidate any role of L1 expression and/or retrotransposition activity in the activation of A3 proteins in tumor cells.

For instance, it might be useful to investigate the effects of the codon-optimized L1 element, ORFeus-Hs (An et al., 2011) that produces 5- to 10-fold more L1 proteins than the L1RP element used in our study, on the expression of endogenous APOBEC3 gene products.

Knocking down the expression of endogenous FL-L1 elements with two different siRNAs targeting the intact ORF1 coding region resulted in the efficient depletion of endogenous L1 ORF1p. This observation indicates that the majority of transcripts from active L1Hs elements harboring intact ORF1 sequences were removed from the tested cell lines. However, these siRNAs did not decrease the overall FL-L1 transcript levels as measured by RT-qPCR to the same degree (**Figure 2**). This could be explained by the fact that the L1 5′UTR-specific primers used for the RT-qPCR assay also detect transcripts from FL-L1 elements with non-functional ORF1 sequences, which are not or less efficiently targeted by the siRNAs.

In future work, it should be worthwhile to investigate the impact on A3 expression of siRNAs that also target non-functional L1 transcripts.

Although we did not observe any effect of L1 repression on A3 activity, L1 repression clearly elicits severe effects in UCCs. In particular, efficient knockdown of ORF1p-expressing FL-L1 elements by siRNAs diminished proliferation of UCCs with higher L1 expression levels (such as VM-CUB1), but had less effect on UCCs exhibiting lower L1 expression levels (such as 5637 cells). These results are in good agreement with previous reports that L1 knockdown causes a loss of proliferative ability in tumor cells independently of apoptosis (Aschacher et al., 2016), ultimately leading to senescence (Oricchio et al., 2007; Sciamanna et al., 2014; Aschacher et al., 2016). However, this issue has not been investigated in UCCs previously. Since L1 activation may be particularly prevalent in UC (Nusgen et al., 2015; Whongsiri et al., 2018), this result calls for closer investigations of L1 function in UC carcinogenesis, beyond retrotransposition. There is growing evidence suggesting that expression and retrotransposition of LINE-1 in neoplasms affects transcription initiation of oncogenes (Rodic and Burns, 2013). Also in hepatocellular carcinoma, L1 ORF1p was suggested to promote cell proliferation and resistance to chemotherapy (Feng et al., 2013). Indeed, L1 expression is linked to the activation of epithelial-mesenchymal transition (EMT) and was shown to affect the expression of miRNA genes (let-7 miRNA family) specifically regulating tumor suppressor expression (Rangasamy et al., 2015). Consistently, our study found cell growth impairment as a consequence of L1 silencing in UCCs, which requires further studies to identify any specific factor(s) or pathway that is involved in this regulation.

## Potential Causes of APOBEC Activation in Urothelial Carcinoma

Finally, if there is no evidence that A3 activation in UC is elicited by either exogenous virus infection or endogenous L1 retrotransposon activation, what causes it? Several alternative hypotheses deserve investigation. For instance, A3B is induced by several cytokines in normal liver (Lucifora et al., 2014) and through the PKC-NFκB signaling pathway in several cancers (Leonard et al., 2015). These factors may also be relevant in urothelial carcinogenesis and could be fostered by a persistent inflammatory state (Thompson et al., 2015). Interestingly, a recent analysis of A3 expression in UC tissues by Glaser et al. (2018) revealed rather uniform expression of A3B in various molecular subtypes of the disease, whereas A3A was mostly expressed in the basal, squamous-like subtype. A3-high tumors demonstrated higher expression of relevant immune marker genes. A3 genes are inducible by interferon and thus belong to the group of interferon-stimulated genes (ISGs). Indeed, Glaser et al. (2018) could induce A3B expression in the UC cell lines HT-1376 and UMUC3 by IFNγ treatment, but not in two cell lines with initially low A3B expression. Unfortunately, they did not report on A3A expression in UC cell lines.

Most advanced muscle-invasive UCs contain mutations inactivating p53, which are rare in non-muscle invasive UC (Hurst et al., 2017; Robertson et al., 2017). The p53 tumor suppressor also regulates the transcription of several A3 genes. In particular, loss of p53 or overexpression of gain-of-function mutants leads to upregulation of A3B (Menendez et al., 2017; Periyasamy et al., 2017). Loss of p53 function may therefore contribute to A3B activation in muscle-invasive UC, but not likely in non-muscle invasive tumors.

Moreover, recent results suggest that A3B may target ssDNA accumulating as a result of replication stress (Kanu et al., 2016) or transcription stress (Periyasamy et al., 2015; Tubbs and Nussenzweig, 2017). Such ssDNA forms preferentially during lagging-strand synthesis in the course of DNA replication, or arises as displaced non-transcribed-strand ssDNA due to transcription overload, e.g., as a result of hormone stimulation (Periyasamy et al., 2015; Haradhvala et al., 2016; Hoopes et al., 2016). Indeed, replication stress is thought to be common during urothelial carcinogenesis (Schepeler et al., 2013) and exacerbated by p53 loss of function.

Thus, we conclude that several factors may cooperate to activate A3 in urothelial carcinogenesis. This work largely excludes the pervasive activation of L1 retroelements as one potential factor. Moreover, in line with some, but not other previous reports, detailed analysis of UCCs suggests A3B rather than A3A as the predominantly active enzyme.

The major limitations of our study concern the detection of A3 proteins and the high L1 copy number in the human genome. With respect to A3 proteins, we could not obtain antibodies that are sufficiently sensitive and specific to detect endogenous expression of each isoenzyme. A reliable array of such antibodies would be very helpful to characterize the expression pattern of A3s in UC cell lines and tissues more precisely. Also the highly repetitive character of endogenous L1 retroelements causes major complications for our studies. To fully understand the impact of L1 activity in UC cells and tissues, a complete characterization of the repertoire of transcripts from retrotransposition-competent L1 elements and, ideally, from non-functional L1 elements will be required. Third generation techniques currently under development will hopefully enable this investigation.

## AUTHOR CONTRIBUTIONS

WG, WS, AAJV, and CM conceived and designed the experiments. AV and WG performed most of the experiments. GS performed immunoblot analyses, generated the L1 reporter plasmid pAJG101/L1RP, and participated in drafting the manuscript. UK and AK performed some experiments. WG, WS, AAJV, DH, and CM analyzed the data. WS, GS, and CM contributed reagents and tools. WG, WS, AAJV, GS, and CM wrote the paper.

## FUNDING

This study was financially supported by a grant from the Forschungskommission der Medizinischen Fakultät der HHU Düsseldorf to WG, grant SCHU1014/8-1 from the Deutsche Forschungsgemeinschaft to GS, and is supported by the Heinz-Ansmann Foundation for AIDS research to CM.

## ACKNOWLEDGMENTS

We are grateful to Michèle J. Hoffmann for advice on cell cycle analysis and several other experiments and to Christiane Hader and Zhang Zeli for helpful discussions. The following reagents were obtained through the NIH AIDS Research and Reference Reagent Program, Division of AIDS, NIAID, NIH: anti-ApoC17, from Klaus Strebel and Anti-Human APOBEC3H Monoclonal (P1H6) (cat # 12156) from Michael Emerman and Reuben Harris.

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmicb.2018.02088/full#supplementary-material




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Jaguva Vasudevan, Kreimer, Schulz, Krikoni, Schumann, Häussinger, Münk and Goering. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Murine Endogenous Retroviruses Are Detectable in Patient-Derived Xenografts but Not in Patient-Individual Cell Lines of Human Colorectal Cancer

Stephanie Bock<sup>1</sup> , Christina S. Mullins<sup>1</sup> , Ernst Klar<sup>1</sup> , Philippe Pérot<sup>2</sup> , Claudia Maletzki<sup>1</sup> and Michael Linnebacher<sup>1</sup> \*

<sup>1</sup> Department of General Surgery, Molecular Oncology and Immunotherapy, University Medicine Rostock, Rostock, Germany, <sup>2</sup> INSERM U1117, Biology of Infection Unit, Laboratory of Pathogen Discovery, Institut Pasteur, Paris, France

#### Edited by:

Martin Sebastian Staege, Martin Luther University of Halle-Wittenberg, Germany

#### Reviewed by:

Antoinette Van Der Kuyl, University of Amsterdam, Netherlands Qiangming Sun, Institute of Medical Biology (CAMS), China

#### \*Correspondence:

Michael Linnebacher michael.linnebacher@med.uni-rostock.de

#### Specialty section:

This article was submitted to Virology, a section of the journal Frontiers in Microbiology

Received: 14 September 2017 Accepted: 06 April 2018 Published: 24 April 2018

#### Citation:

Bock S, Mullins CS, Klar E, Pérot P, Maletzki C and Linnebacher M (2018) Murine Endogenous Retroviruses Are Detectable in Patient-Derived Xenografts but Not in Patient-Individual Cell Lines of Human Colorectal Cancer. Front. Microbiol. 9:789. doi: 10.3389/fmicb.2018.00789

Endogenous retroviruses are remnants of retroviral infections. In contrast to their human counterparts, murine endogenous retroviruses (mERV) still can synthesize infectious particles and retrotranspose. Xenotransplanted human cells have occasionally been described to be mERV infected. With genetically engineered mice and patient-derived xenografts (PDXs) on the rise as eminent research tools, we systematically investigated whether different tumor models harbor mERV infections. Relevant mERV candidates were first preselected by next generation sequencing (NGS) analysis of spontaneous lymphomas triggered by colorectal cancer (CRC) PDX tissue. Two primer systems were designed for each of these candidates (AblMLV, EcoMLV, EndoPP, MLV, and preXMRV) and implemented in a quantitative real-time PCR (RT-qPCR) screen using murine tissues (n = 11), PDX tissues (n = 22), PDX-derived cell lines (n = 13), and patient-derived tumor cell lines (n = 14). The expression levels of mERV varied widely both in the PDX samples and in the mouse tissues. No mERV signal was, however, obtained from cDNA or genomic DNA of CRC cell lines. Expression of EcoMLV was higher in PDX than in murine tissues; for EndoPP it was the opposite. These two were thus further investigated in 40 additional PDX. In addition, four patient-derived cell lines free of any mERV expression were subcutaneously injected into immunodeficient mice. Outgrowing cell-derived xenografts barely expressed EndoPP. In contrast, the expression of EcoMLV was even higher than in surrounding mouse tissues. This expression gradually vanished within a few passages of re-cultivated cells.
In summary, these results strongly imply that: (i) PDX and murine tissues in general are likely to be contaminated by mERV, (ii) mERV are expressed transiently and at low level in fresh PDX-derived cell cultures, and (iii) mERV integration into the genome of human cells is unlikely or at least a very rare event. Thus, mERVs are stowaways present in murine cells, in PDX tissues and early thereof-derived cell cultures. We conclude that further analysis is needed concerning their impact on results obtained from studies performed with PDX but also with murine tumor models.

Keywords: mERV, expression, PDX, PDX-derived cell lines, CDX, colorectal cancer

## INTRODUCTION


Retroviruses are reverse-transcriptase-encoding viruses with a single-stranded RNA genome. Endogenous retroviruses are located in somatic as well as germ cells and are therefore passed on to following generations. In contrast to their human equivalents, murine endogenous retroviruses (mERV) still have the ability to synthesize infectious particles and to retrotranspose. mERV are categorized according to their ability to infect foreign species, which is called host tropism, into ecotropic (unable to form infectious particles in the original host, although replication is possible in cultures established from it), xenotropic (infect only foreign species) and polytropic (infect the original host as well as foreign species) viruses (Stoye and Coffin, 1987; Weiss, 2006).

Xenotransplants of human tumor tissue grown in immunodeficient mice and termed patient-derived xenografts (PDXs) are used to expand the original tumor tissue and to exactly mimic the biological environment: the latter is hardly possible in cell culture. With the revival of PDX-models for research and especially for pharmaceutical drug development (Izumchenko et al., 2016), the question in dispute whether or not mERV are present and active in these preclinical tumor models demands a definite answer.

Several groups described mERV infections of human tumor tissues xenografted into immunodeficient mice and of thereof derived cell lines.

In 53% of independently isolated BALB/c nude mouse-derived PDX tissues of human breast cancer, a replication competent xenotropic endogenous murine leukemia virus (XMLV) was traceable (Naseer et al., 2015). Subsequent to xenotransplantation into SCID mice, an infection with B10 xenotropic virus 1 (Bxv1) was detectable in two human pancreatic cancer cell lines (Kirkegaard et al., 2016).

Zhang et al. (2011) described a 23% XMLV infection rate of PDX-derived cell lines by using TaqMan qPCR systems directed against XMLV gag, env, and pol regions. Surprisingly, they also found 17% of primary cell lines to be XMLV positive, but failed to detect XMLV in cell lines generated in a lab not working with PDX models (Zhang et al., 2011).

Furthermore, mERV may have oncogenic potential. Abelson murine leukemia virus (AblMLV) can induce pre-B-cell lymphoma (Yi and Rosenberg, 2008) and acute transforming retrovirus (AKT8) thymic lymphoma (Staal and Hartley, 1988). This would have significant consequences for general PDX usage and the results of PDX-based basic research as well as therapeutic drug testing and preclinical development. Thus, we took advantage of our large collection of patient-derived tumor models mainly established from colorectal cancer (CRC) to perform a comprehensive analysis of possible mERV infection of PDX, PDX-derived cell lines and primary tumor-derived cell lines.

In previous work, we occasionally observed the occurrence of lymphomas in NOD-SCID mice carrying human CRC PDX (Mullins et al., 2017). In two cases, lymphoma formation could be reproduced by implantation of a second piece of the primary patient tumor. Molecular and pathomorphological analysis revealed that the lymphoma cells were of murine origin (unpublished data). We hypothesized that it is likely that (human) tumor cell-derived factors trigger this lymphoma development. In light of the previously mentioned data on infections with mERV, we decided to additionally screen for the presence of potentially oncogenic murine viruses in these two CRC PDX tissues associated with lymphoma development in NOD-SCID mice.

## MATERIALS AND METHODS

## Next Generation Sequencing (NGS)

RNA from the two PDX HROC32 and HROC39 was sequenced on an Illumina HiSeq2000 instrument in paired-end mode (2 × 100 nt). Raw read numbers were 80.3 × 10<sup>6</sup> and 183.2 × 10<sup>6</sup> for HROC32 and HROC39, respectively. After an initial step consisting of QC and trimming, reads were mapped to the human reference genome hg19 using Bowtie2. Unmapped reads were assembled with CLC Assembler into contigs of 100 nt minimal size. The contigs were searched for viral and bacterial sequences using a blastn procedure (best score) on the NCBI databases genbank\_release\_virus and genbank\_release\_bacteria. Hits with an alignment longer than 60 nt were kept, and their corresponding sequences were retrieved and used for a second (validation) blastn search against the entire genbank database. Sequences for which both blast searches gave the same result were considered correctly assigned to a taxonomy.
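The two-pass assignment rule described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the `Hit` record, the function name, and the example contigs are hypothetical, and parsing of actual blastn output is left out. Only the 60-nt alignment threshold and the requirement that both searches agree are taken from the text.

```python
from dataclasses import dataclass
from typing import Dict, List

MIN_ALIGNMENT_NT = 60  # hits must align over more than 60 nt (from the text)


@dataclass(frozen=True)
class Hit:
    """Best-scoring blastn hit for one contig (hypothetical record layout)."""
    contig_id: str
    taxon: str
    alignment_nt: int


def confirmed_assignments(first_pass: List[Hit],
                          second_pass: List[Hit]) -> Dict[str, str]:
    """Keep contigs whose first-pass hit is long enough and whose taxon
    is reproduced by the validation search against the full database."""
    validation = {hit.contig_id: hit.taxon for hit in second_pass}
    return {
        hit.contig_id: hit.taxon
        for hit in first_pass
        if hit.alignment_nt > MIN_ALIGNMENT_NT
        and validation.get(hit.contig_id) == hit.taxon
    }


# Hypothetical example data, for illustration only.
first_pass = [
    Hit("contig_1", "Murine leukemia virus", 250),
    Hit("contig_2", "Papillomavirus", 55),          # too short, dropped
    Hit("contig_3", "Murine leukemia virus", 120),  # not confirmed below
]
second_pass = [
    Hit("contig_1", "Murine leukemia virus", 250),
    Hit("contig_3", "Mus musculus", 120),
]

confirmed = confirmed_assignments(first_pass, second_pass)
```

Requiring agreement between the restricted and the full-database search guards against contigs that resemble a viral entry only because the better-matching non-viral sequence was absent from the first database.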

## Xenotransplantation Procedures

Patient-derived xenograft tissues used in this study were generated from CRC and glioblastoma patient tumors as described in previous studies using NMRI-Foxn1nu (NMRI nu/nu), NOD.CB17-Prkdcscid/NcrCrl (NOD SCID) and NOD.Cg-Prkdcscid Il2rgtm1Wjl/SzJ (NSG) mice (Linnebacher et al., 2010; William et al., 2017). To generate cell-derived xenotransplants, cells of patient-derived cell lines were suspended in 50 µl PBS and subcutaneously injected using a 1 ml syringe. A total amount of 1 × 10<sup>6</sup> cells was injected dorsally, 3 × 10<sup>6</sup> cells into the left and 5 × 10<sup>6</sup> cells into the right flank. Outgrowing cell-derived xenografts (CDX) were explanted when at least one of the three tumors reached a size of 6–9 mm. Expression of EcoMLV and EndoPP in these tissues was subsequently analyzed using the probe systems.

## Cell Culture

Patient-derived cell lines of CRC patients (HROC24, HROC32, HROC39, HROC173, and HROC183) were cultured in DMEM/Ham's F12 (1:1) supplemented with 10% fetal calf serum (FCS) and 2 mmol/l L-glutamine, as described before (Maletzki et al., 2012, 2015). Cell cultures established from the CDX were cultured initially in a coated 6-well plate and transferred at passage 1 into a T25 cell culture flask. A snap frozen cell pellet was prepared from this and subsequent passages for isolation of nucleic acids.

## RNA Isolation and cDNA Synthesis

RNA was isolated from cell pellets using the EURx GeneMATRIX Cell Culture RNA Purification Kit (EURx; Gdańsk, Poland). RNA from tissue samples was isolated using the Precellys Tissue RNA Kit (Peqlab; Erlangen, Germany). Skin samples were processed using TRIsure™ according to the manufacturer's instructions (Bioline GmbH; Luckenwalde, Germany). cDNA synthesis was performed using Reverase® (Bioron, Ludwigshafen, Germany). Contamination of the isolated RNA by gDNA was minimized by incorporating a DNase step in all RNA isolation protocols used.

## Quantitative Real-Time PCR

Duplicate determination of mERV expression (AblMLV, EcoMLV, EndoPP, MLV, and preXMRV) was done by quantitative real-time PCR (qRT-PCR) on a ViiA™ 7 Real-Time PCR System instrument (Thermo Fisher Scientific, Waltham, MA, United States). For the initial screening, two primer systems were designed for each mERV (**Table 1**). cDNA (stored at −30 °C) was analyzed in duplicates in 12.5 µl reactions containing 10 µM of each primer, SYBR Green and reagents provided in the Fast SG qPCR Master Mix (Roboklon, Berlin, Germany) according to the manufacturer's protocol. Cycling parameters were as follows: 2 min at 50 °C, 10 min at 95 °C and 40 cycles of 15 s at 95 °C and 30 s at 60 °C, followed by a melting curve analysis. A non-template control (NTC) was included in each qRT-PCR run, and murine GAPDH and human β-actin were used as housekeeping genes (**Table 2**). For further investigations of EcoMLV and EndoPP, probe systems were designed (**Table 3**). The reagents were provided in a Fast Probe qPCR Master Mix (Roboklon) and applied according to the manufacturer's protocol. Cycling parameters were as follows: 2 min at 50 °C, 10 min at 95 °C and 40 cycles of 15 s at 95 °C and 30 s at 60 °C.

Relative expression of EcoMLV and EndoPP in PDX tissues was calculated using the following formula:

$$2^{-\left(\mathrm{Ct}_{\text{relevant gene}} - \mathrm{Ct}_{\text{murine GAPDH}}\right)}\left(1 + 2^{-\left(\mathrm{Ct}_{\text{human }\beta\text{-actin}} - \mathrm{Ct}_{\text{murine GAPDH}}\right)}\right)$$

The relative expression was thereby adjusted for the proportion of human (versus murine) tissue, calculated as $2^{-\left(\mathrm{Ct}_{\text{human }\beta\text{-actin}} - \mathrm{Ct}_{\text{murine GAPDH}}\right)}$. All remaining calculations used the formula $2^{-\left(\mathrm{Ct}_{\text{relevant gene}} - \mathrm{Ct}_{\text{murine GAPDH}}\right)}$.
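The calculation can be illustrated with a minimal Python sketch. It assumes that the adjustment term multiplies the unadjusted relative expression, which is how the formula reads in the text. The function names and Ct values are hypothetical and serve only as an illustration.

```python
def relative_expression(ct_gene: float, ct_murine_gapdh: float) -> float:
    """Unadjusted relative expression: 2^-(Ct gene - Ct murine GAPDH)."""
    return 2.0 ** -(ct_gene - ct_murine_gapdh)


def human_proportion(ct_human_actin: float, ct_murine_gapdh: float) -> float:
    """Human-versus-murine signal: 2^-(Ct human beta-actin - Ct murine GAPDH)."""
    return 2.0 ** -(ct_human_actin - ct_murine_gapdh)


def adjusted_relative_expression(
    ct_gene: float, ct_murine_gapdh: float, ct_human_actin: float
) -> float:
    """Relative expression adjusted for the human tissue proportion.

    Assumption: the term (1 + human proportion) is a multiplicative factor.
    """
    return relative_expression(ct_gene, ct_murine_gapdh) * (
        1.0 + human_proportion(ct_human_actin, ct_murine_gapdh)
    )


if __name__ == "__main__":
    # Hypothetical Ct values, not measured data.
    ct_gene, ct_gapdh, ct_actin = 22.0, 24.0, 23.0
    print(relative_expression(ct_gene, ct_gapdh))                     # 4.0
    print(human_proportion(ct_actin, ct_gapdh))                       # 2.0
    print(adjusted_relative_expression(ct_gene, ct_gapdh, ct_actin))  # 12.0
```

For samples without human tissue (e.g., murine control tissues), only `relative_expression` applies.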

## Statistical Analysis

Gaussian distribution was tested with the D'Agostino and Pearson omnibus normality test before significance was assessed using the Mann–Whitney U test. Expression levels were compared between different samples as well as between different mERV. Statistical evaluation was performed using GraphPad PRISM software, version 5.02. The Pearson correlation coefficient was calculated for EcoMLV and EndoPP expression in PDX samples. p-Values lower than 0.05 were considered statistically significant.
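The group comparison can be sketched in plain Python. The following is only an illustration of a two-sided Mann–Whitney U test using the normal approximation with tie correction; it is not the GraphPad PRISM implementation, whose p-values may differ (for example, for small samples). Function names and input values are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values receive their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U, two-sided p) using the normal approximation.

    Assumptions: tie-corrected variance, no continuity correction,
    and at least two distinct values across both groups.
    """
    n1, n2 = len(x), len(y)
    combined = list(x) + list(y)
    ranks = average_ranks(combined)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2.0
    u = min(u1, n1 * n2 - u1)
    n = n1 + n2
    counts = {}
    for v in combined:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    sigma = math.sqrt(n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1))))
    z = (u - n1 * n2 / 2.0) / sigma
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return u, p


if __name__ == "__main__":
    # Hypothetical relative expression values for two groups.
    control = [0.1, 0.2, 0.3, 0.4]
    pdx = [1.0, 2.0, 3.0, 4.0, 5.0]
    u, p = mann_whitney_u(control, pdx)
    print(u, p < 0.05)
```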

## RESULTS

Starting point of the present analysis were two CRC cases that reproducibly triggered murine lymphoma development in immunodeficient NOD-SCID mice when engrafted as PDX. RNA-seq analysis identified several sequences with homology to viral sequences in both PDX (HROC32 and HROC39). One contig attributed to a papillomavirus and other hits matching common NGS bacterial contaminants were also found but will not be discussed further here (full data and sequences are available in **Supplementary Table S1**). No other viral sequences were detected.

TABLE 1 | Polymerase chain reaction (PCR) primers used for SYBR Green based qRT-PCR analysis.

#### Abelson (P160) murine leukemia virus (Ab-MLV) abl gene (X02963)

#### Ecotropic murine leukemia virus (KJ668270)

EcoMLV

#### Mus musculus mobilized endogenous polytropic provirus (FJ544578)

EndoPP-1

#### Murine leukemia virus (AY714523)

MLV

#### PreXMRV-2 (FR871850)

For each mERV, two primer systems were designed, marked with "-1" and "-2"; for EcoMLV and MLV, however, only the primer system providing specific products is listed.

TABLE 2 | Polymerase chain reaction primers of housekeeping genes used in SYBR Green based qRT-PCR analysis.

#### Housekeeping genes

TABLE 3 | Probe systems used for EcoMLV and EndoPP detection.

Those mERV detectable in both PDX cases were considered most relevant. These were: murine leukemia virus (MLV), AblMLV, ecotropic murine leukemia virus (EcoMLV), Mus musculus mobilized endogenous polytropic provirus (EndoPP), and pre xenotropic murine leukemia virus-related virus (preXMRV). For each candidate, two SYBR Green based qPCR primer systems were designed and tested in an initial screening experiment using cDNA from different normal murine tissues. Two primer systems delivered unspecific signals and were thus excluded from further analysis. **Figure 1** displays the relative expression of the investigated mERV in several healthy murine tissues (n = 11) from three different mouse strains. Clear expression differences between the different mERV were apparent. AblMLV was not expressed. EcoMLV and preXMRV showed moderate expression levels, whereas MLV and EndoPP were generally the most highly expressed mERV. The highest mERV activity was observed in murine tissues of C57BL/6 origin, followed by NSG and then NMRI nu/nu mice (**Supplementary Figures S1A–C**). Yet, for all mERV primer combinations used, signals were weaker than that of the murine housekeeping gene, GAPDH (0- to 0.4-fold relative expression).

Next, the relative expression of these five mERV was analyzed in a small series of CRC PDX tissues (n = 22) (**Figure 2**).

As expected, expression varied between the different mERV in the PDX tumor models. Very weak signals came from AblMLV and weak to intermediate signals from EndoPP, MLV, and preXMRV, whereas EcoMLV was the single most highly expressed mERV. In addition, larger differences in expression between individual PDX samples were observed for preXMRV and EcoMLV.

In a direct comparison of mERV expression between murine control and PDX tissues, EcoMLV showed significantly higher expression in PDX than in murine control tissues (p < 0.0001), whereas EndoPP expression showed the opposite pattern (p < 0.0001).

By definition, an ecotropic retrovirus such as EcoMLV does not replicate in the original host but mainly in cell cultures established from it. Thus, the observed high expression in the xenotransplanted tissue raises the question whether PDX may trigger EcoMLV expression and dampen EndoPP expression. This was subsequently investigated in more detail.

Probe systems specific for EcoMLV and EndoPP were designed and used to assess expression in the previously analyzed 22 PDX as well as in 40 additional CRC PDX (**Figure 3A**). PCR signals of EcoMLV were higher than that of the housekeeping gene (4.80-fold) but ranged from zero to over 60-fold. EndoPP PCR signals were much lower (0.209-fold), with relative expression levels ranging from 0.01 to 1.79. Of note, expression levels of EcoMLV and EndoPP in the PDX were strongly positively correlated (p < 0.00001). Overall, the results of the probe systems confirmed the findings of the less-specific qPCR screening experiments (**Figure 2**).
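The correlation between the two mERV can be illustrated with a short Python sketch of the Pearson correlation coefficient. This is not the GraphPad PRISM computation used in the study, and it returns only the coefficient, not a p-value. The function name and the paired values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical paired relative expression values per PDX sample.
    ecomlv = [0.5, 2.0, 4.8, 12.0, 30.0]
    endopp = [0.02, 0.10, 0.21, 0.60, 1.50]
    print(round(pearson_r(ecomlv, endopp), 3))
```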

We additionally compared expression in the different mouse strains used to generate PDX models (**Figure 3B**). Differences in expression were significant for EcoMLV and EndoPP in NMRI nu/nu vs. NOD SCID (p = 0.0062 and p = 0.0140, respectively) as well as for EndoPP in NMRI nu/nu vs. NSG (p = 0.0196). No significant difference was observed for EcoMLV expression in NSG vs. NMRI nu/nu (p = 0.0682) and vs. NOD SCID (p = 0.8763). EndoPP expression was also not significantly different between NSG and NOD SCID (p = 1.000). Thus, both mERV were expressed higher in NSG and NOD SCID compared to NMRI nu/nu mice.

Next, we analyzed whether expression of these two mERV is restricted to CRC-derived PDX by examining five PDX established from glioblastoma (**Supplementary Figure S2**). Here, the expression of EcoMLV and EndoPP showed no significant difference compared with that in the CRC-derived PDX. Thus, we assume that expression of at least these two mERV cannot specifically be attributed to the PDX tissue origin.

To analyze whether the human CRC cells are sufficient to influence mERV expression, we next generated xenografts from four patient-derived cell lines. These had previously been tested to be free of mERV expression and were injected in different tumor cell numbers (1, 3, and 5 × 10<sup>6</sup> cells) subcutaneously into NSG and NMRI nu/nu mice. The CDX tissues, obtained after several weeks of growth, were analyzed side by side with subcutaneous tissue from a CDX-free area of the same animals. In the CDX tissues, EcoMLV was generally expressed at higher levels than in the CDX-free mouse tissues (**Figure 4A**). In direct comparison to the expression values obtained with the PDX tissues from the same tumor cases, the expression tended to be lower; this, however, was not statistically significant. EndoPP expression was very low with little variation between tumors and mouse tissues (**Figure 4B**).

The CDX tissues were additionally re-cultivated and expression of the two mERV was measured in these secondary cell cultures. Interestingly, both EcoMLV and EndoPP were initially expressed, but the levels decreased gradually with increasing passage numbers to or below the detection limit (**Figures 5A,B**). This observation is best explained by mERV activity being attributable to the murine cells present in CDX (and PDX) tissues, and it additionally suggests that no persistent infection of human tumor cells took place.

To address this question in more detail, we subsequently analyzed mERV expression in a panel of cell lines established in our lab from both primary CRC tissues (n = 17) and PDX tissues (n = 28) established from primary CRC cases, with n = 6 matching pairs. Generally, no expression of EcoMLV and EndoPP was observed in the CRC cell lines – of note, neither in the patient-derived nor in the PDX-derived cell lines.

A failure or bias of the detection systems used can be largely ruled out, since in several cell lines established from murine tumors, expression levels of EcoMLV and EndoPP corresponded to the levels observed in the PDX samples (**Supplementary Figure S3**).

Finally, we tested genomic DNA from these PDX-derived cell lines in an endpoint PCR using all available mERV primer systems. No amplicons were obtained from the probe system primers for EcoMLV and EndoPP (**Table 3**), but occasionally, bands were observed from several of the SYBR Green based primer systems used for the initial screening (**Table 1**). However, sequencing of these amplicons failed to detect any mERV-specific sequence. Thus, unspecific amplification must have taken place in some cases. In sum, we could not demonstrate integration into human tumor cells of any of the mERV analyzed after PDX-passaging in immunodeficient mice.

## DISCUSSION

Patient-derived xenografts are considered the models best conserving the tumors' original biological features including microarchitecture and micromilieu. Perhaps more importantly, they also have the best clinical prediction value for individual treatment success (Guo et al., 2016). Thus, they are not only frequently used for basic research applications, but have recently re-emerged as an ideal tool for preclinical drug development (Izumchenko et al., 2016).

However, it has repeatedly been observed that the xenografting procedure might result in mERV activation – at least in the PDX tissues (Todaro et al., 1973; Oakes et al., 2010; Delviks-Frankenberry et al., 2013; Naseer et al., 2015). Moreover, mERV activity has also been described in cell lines established from PDX (Zhang et al., 2011; Kirkegaard et al., 2016).

We here addressed the question to which extent mERV are activated and possibly transmitted to human CRC cells when passaged as PDX in immunodeficient mice. Technically, this approach was possible due to a large collection of low-passage, patient-derived tumor models including PDX and matching tumor- and PDX-derived permanent cell lines (Maletzki et al., 2015; Gock et al., 2016; Kuehn et al., 2016; Mullins et al., 2016).

FIGURE 3 | Relative expression of EcoMLV and EndoPP in PDX from CRC. (A) Expression levels were analyzed by qRT-PCR with probe systems specific for EcoMLV and EndoPP and relative expression was calculated by using the described formula. The analysis of n = 62 individual CRC PDX shows high expression of EcoMLV ranging from zero to 65.35. Expression of EndoPP was low, ranging from 0.01 to 1.79. Considerable differences in expression between individual PDX samples are apparent. (B) Same expression data stratified according to the mouse strains, NMRI nu/nu (n = 50), NSG (n = 7), and NOD SCID (n = 5), used to generate the PDX. Significant differences in EcoMLV and EndoPP expression for NMRI nu/nu vs. NOD SCID (p = 0.0062 and p = 0.0140, respectively) (∗∗) and in EndoPP expression for NMRI nu/nu vs. NSG (p = 0.0196) (∗) are indicated.

FIGURE 4 | … lower than the expression in the reference PDX tissues from the same tumor cases. (B) Relative EndoPP expression was low with little variation between CDX, mouse control tissues and PDX.

Instead of analyzing pre-described mERV, a global NGS-screen for expressed mERV sequences was performed on two PDX observed to repeatedly induce leukemia of murine cell origin. This is likely to be a functional consequence of mERV (re)-activation. Triviai et al. (2014) investigated a similar hypothesis concerning induction of leukemia in a PDX model of primary myelofibrosis. They identified highly virulent polytropic MLV recombinants generated by incorporation of sequences from xenotropic ERV provirus as causative for this acute myeloid leukemia.

From all mERV sequences identified in our NGS screen, we selected five candidates found expressed in both PDX models analyzed; i.e., AblMLV, EcoMLV, EndoPP, MLV, and preXMRV. With the exception of AblMLV, all selected mERV were expressed in murine control tissues from three different mouse strains. In a large collection of PDX tissues, all mERV were found expressed, including AblMLV at a very low level. The expression levels varied widely in the PDX samples as well as in the murine tissues, with EcoMLV showing a significantly (p < 0.0001) higher expression in PDX than in murine control tissues. EndoPP expression showed the opposite pattern. According to these results, PDX of human tumors seem to trigger EcoMLV but dampen EndoPP expression. This behavior of EcoMLV fits well with the host tropism of an ecotropic virus (Stoye and Coffin, 1987). However, we did not further analyze whether this opposite behavior is functionally linked.

Moreover, compared to murine control tissues, EcoMLV was more highly expressed not only in PDX, independent of the origin of the transplanted human tumor tissue, but also in murine cell lines and in CDX.

An important and somewhat unexpected finding of the present study was that not only directly patient-tumor-derived but also PDX-derived CRC cell lines had no detectable mERV expression – even if the cell lines were established from PDX tissues with proven mERV (in particular EcoMLV) activity. This is in direct contrast to the findings of Zhang et al. (2011), who detected an infection with XMRV in 23% of PDX-derived cell lines and a 17% infection rate of human non-PDX-derived cell cultures; with the latter infections ascribed to horizontal spread in vitro.

Therefore, we also addressed the question whether mERV expression is transient or more stable in cell cultures from freshly explanted human xenograft tissue in a series of CDX experiments. In direct comparison to PDX, mERV expression is similar in CDX; but in CDX-derived secondary cultures it gradually disappears within a few passages. The best explanation for this finding is that mERV are expressed in the murine cells present in PDX tissues, and thus mERV expression vanishes as the number of contaminating murine cells decreases. This result might partly explain the high contamination rate described by Zhang et al. (2011). Then again, CRC cells might be less susceptible to XMRV. So far, only one cell line isolate, i.e., the cell line RKO, has been found positive for XMRV in just one lab (Zhang et al., 2011).

Only high standards in tissue culture can prevent such contaminations. Even then, however, cross-contamination and horizontal transmission should not be neglected, and further studies are needed to clarify which mERV (including mERV recombinants like XMRV) are incidental or frequent contaminants of tumor models of which tumor entities. Similar to identity testing, routine quality tests have to be developed and subsequently strictly applied to prevent data misinterpretation when mERV-contaminated human tumor models are used.

Furthermore, our study showed that mERV integration into the genome of human tumor cells was not detectable and therefore seems to be unlikely or at least a rarer event than previously thought. Technical problems due to laboratory contaminations seem to be a serious issue in this context. Human DNA samples that tested positive for XMRV/MLV sequences were found to be contaminated with murine DNA (Oakes et al., 2010). Moreover, even laboratory reagents containing trace amounts of murine nucleic acids have repeatedly been shown to cause false-positive XMRV detection results (Sato et al., 2010; Zheng et al., 2011).

Taken together, these findings imply that at least most mERV simply represent stowaways which are active within murine cells in PDX tissues and of early PDX-derived cell cultures. This is an important finding generally in favor of the utility of PDX and PDX-derived tumor models, since it attenuates the likelihood of a mERV bias permanently introduced into these models by the murine environment. The observed difference in mERV expression between PDX from different immunodeficient mouse strains additionally hints toward increasing mERV activity levels with the degree of immunodeficiency. Yet, none of these results can be generalized without testing of further immunodeficient mouse strains and PDX models of more tumor entities using similar global mERV screening strategies.

These facts highlight the importance of future research on this topic. Which mouse strain should be selected for which experimental approach, taking spontaneous and PDX-induced mERV activity and infections into account? What influences do specific mERV like EcoMLV and XMRV have on biological functions in vivo? It might also be speculated that mERV activations are somehow related to the success rate of PDX models, and one might ask whether the higher level of mERV activation in strains with higher levels of immunodeficiency is worth the tradeoff for a better tumor take rate.

The influence of additional factors like bacterial and viral co-infections, feed, housing conditions, age, fitness, etc. on mERV activity might also be worth analyzing.

Finally, when considering the increasing number of co-culture protocols using two or more cell lines from different donors and laboratories (Miki et al., 2012), it will be wise to include mERV activity analysis under such conditions into standard quality control schemes.

The present study has some obvious constraints: only two PDX models have been used as starting point for the NGS-based global screening, leading to unambiguous (pro)virus identification by the short sequences obtained by NGS. Additionally, the results of a qPCR-based analysis should be validated on the protein level. Still, one can conclude that this comprehensive analysis of mERV activity in a large cohort of patient-derived CRC tumor models in low passages delivered, in summary, the following data: (i) PDX and murine tissues in general are likely to be contaminated by mERV, but variations between different immunodeficient mouse strains can readily be observed; (ii) mERV are expressed transiently and at a low level in fresh PDX-derived cell cultures and gradually decrease to zero within a few passages; (iii) mERV integration into the human tumor cells' genome is an unlikely or at least very rare event; (iv) "mouse free" cell cultures (i.e., cultures established from primary tumors) are free of any mERV activity.

## ETHICS STATEMENT

Specimen collection was conducted in accordance with the ethics guidelines for the use of human material, approved by the Ethics Committee of the University of Rostock (Reference numbers: II HV 43/2004, A45/2007, and A 2009/34) and with informed written consent from all patients prior to surgery.

Mouse breeding took place in the animal facilities (University of Rostock) under specified pathogen-free conditions. Trials were performed in accordance with the German legislation on protection of animals and the Guide for the Care and Use of Laboratory Animals (Institute of Laboratory Animal Resources, National Research Council; NIH Guide, Vol. 25, No. 28, 1996; Approval No.: LALLF M-V/TSD/7221.3-1.1-071/10 and LALLF M-V/TSD/7221.3-2-036/13).

## AUTHOR CONTRIBUTIONS

SB conducted most of the molecular biology work, designed the probe systems, performed cell culture and the CDX experiments, and wrote the manuscript. EK critically reviewed the manuscript. PP performed the initial NGS screen and designed the primers used in the screening experiment. CM implanted the CRC cells into the immunodeficient mice. CSM was involved in the data analysis and language-edited the final version of the manuscript. ML designed the study and wrote the manuscript.

## ACKNOWLEDGMENTS

The authors thank the Department for Surgery and the Institute for Pathology at the University Medicine of Rostock, especially Prof. F. Prall, for providing the tumor tissues.

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmicb.2018.00789/full#supplementary-material

FIGURE S1 | Relative expression of mERV in murine tissue stratified by mouse strain. Expression levels were analyzed by SYBR Green based qRT-PCR. The analysis of several healthy murine tissues shows the highest mERV activity in (A) C57BL/6, followed by mERV expression in (B) NSG and then in (C) NMRI nu/nu mice.

FIGURE S2 | Relative expression of EcoMLV and EndoPP in PDX tissue derived from glioblastoma (n = 5). Expression levels were analyzed by SYBR Green based qRT-PCR. The relative expression of EcoMLV ranged from 3.31 to 527.58 and of EndoPP from 0.09 to 0.84.

FIGURE S3 | Relative expression of EcoMLV and EndoPP in murine cell lines (n = 5). Expression levels were analyzed by SYBR Green based qRT-PCR. The relative expression of EcoMLV ranged from 16.29 to 94.58 and of EndoPP from 0.01 to 0.21.

TABLE S1 | mERV sequences identified by NGS.

## REFERENCES

… and cancer cell lines. Cancer Res. 76, 4619–4626. doi: 10.1158/0008-5472

… colonic carcinoma cell line and its matched patient-derived xenograft. Sci. Rep. 6:24671. doi: 10.1038/srep24671


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Bock, Mullins, Klar, Pérot, Maletzki and Linnebacher. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.