The Chromatin of Candida albicans Pericentromeres Bears Features of Both Euchromatin and Heterochromatin

Centromeres, sites of kinetochore assembly, are important for chromosome stability and integrity. Most eukaryotes have regional centromeres epigenetically specified by the presence of the histone H3 variant CENP-A. CENP-A chromatin is often surrounded by pericentromeric regions packaged into transcriptionally silent heterochromatin. Candida albicans, the most common human fungal pathogen, possesses small regional centromeres assembled into CENP-A chromatin. The chromatin state of C. albicans pericentromeric regions is unknown. Here, for the first time, we address this question. We find that C. albicans pericentromeres are assembled into an intermediate chromatin state bearing features of both euchromatin and heterochromatin. Pericentromeric chromatin is associated with nucleosomes that are highly acetylated, as found in euchromatic regions of the genome; and hypomethylated on H3K4, as found in heterochromatin. This intermediate chromatin state is inhibitory to transcription and partially represses expression of proximal genes and inserted marker genes. Our analysis identifies a new chromatin state associated with pericentromeric regions.


INTRODUCTION
The centromere is the cis-acting DNA site of kinetochore assembly and spindle attachment during chromosome segregation in mitosis and meiosis. Centromere organization differs across species. Some organisms, such as the budding yeast Saccharomyces cerevisiae, have "point" centromeres while other organisms, such as the fission yeast, the fruit fly and human, have "regional" centromeres (Buscaino et al., 2010). Point centromeres are only ∼125 bp long and include specific DNA binding sites necessary for centromere function (Westermann et al., 2007). Regional centromeres span large DNA domains (∼10 to 10,000 kb) and do not contain a specific DNA sequence but are epigenetically specified by the presence of the histone H3 variant, CENP-A (also termed Cse4 and CENH3) (Buscaino et al., 2010). Regional centromeres are often associated with repetitive elements. The structure and organization of centromere-associated DNA repeats vary across organisms. For example, human centromeres are composed of tandem arrays of 171 bp alpha-satellite repeats and, in Drosophila melanogaster, centromeric DNA contains short repetitive elements interspersed with transposable elements (Buscaino et al., 2010). In the yeast Schizosaccharomyces pombe and the fungal pathogen Candida tropicalis, centromeres are organized in a CENP-A-containing central/mid core domain flanked by inverted repeats (IRs) whose sequences are conserved across centromeres (Buscaino et al., 2010; Chatterjee et al., 2016). In both organisms, these IRs are important for de novo CENP-A deposition on a plasmid containing the central core sequence (Baum et al., 1994; Chatterjee et al., 2016). Pericentromeric regions are usually assembled into transcriptionally silent heterochromatin that is required for establishment of CENP-A chromatin and for faithful chromosome segregation (Bernard et al., 2001; Nonaka et al., 2002; Folco et al., 2008).
At these locations, heterochromatin is hypoacetylated at Lysine 9 of Histone H3 (H3K9) and Lysine 16 of Histone H4 (H4K16). Heterochromatic regions are also hypomethylated at Lysine 4 of Histone H3 (H3K4) and methylated at H3K9 (Strahl and Allis, 2000; Kouzarides, 2007). Histone modifiers control this modification state: for example, the histone deacetylase (HDAC) Sir2 deacetylates H3K9 and/or H4K16, the histone methyltransferase Set1 methylates H3K4 and the histone methyltransferase Su(var)3-9 methylates H3K9 (Shankaranarayana et al., 2003; Wirén et al., 2005; Bühler and Gasser, 2009; Kueng et al., 2013). Although heterochromatin is usually associated with pericentromeric regions, this repressive chromatin state is not absolutely required for centromere function and faithful chromosome segregation. For example, in Candida lusitaniae pericentromeric regions are not assembled into heterochromatin (Kapoor et al., 2015). Given the diversity of centromere structure across eukaryotes, it is important to analyze centromere organization in a variety of organisms.
Candida albicans, the most common human fungal pathogen, is an ideal system to investigate diversity and structure of centromeres because it possesses regional centromeres that are much smaller and simpler than other regional centromeres (Sanyal et al., 2004; Baum et al., 2006; Mishra et al., 2007). Each of the 8 C. albicans diploid chromosomes has a relatively small regional centromere (2-4 kb) assembled into CENP-A chromatin (Baum et al., 2006). The organization and sequence of pericentromeric regions differ at each centromere (Figure 1A). Centromeres on chromosomes 1, 4, 5, 6, and R are similar to centromeres of the fission yeast Schizosaccharomyces pombe, where IRs flank the CENP-A-containing domain. In contrast to S. pombe, the sequence of these repetitive elements is not conserved across centromeres. On chromosomes 2 and 3, Long Terminal Repeats (LTRs) are found within ∼3 kb of the CENP-A-containing domain. The centromere on chromosome 7 does not have obvious repeats nearby (Mishra et al., 2007; Ketel et al., 2009). Therefore, 7 of the 8 pericentromeric regions are associated with DNA repeats. Several lines of evidence suggest that, in C. albicans, pericentromeric repeats are important for centromere function and/or establishing centromere identity. First, despite the lack of conservation in the DNA sequence, repetitive DNA is also associated with centromeres of other Candida species such as C. dubliniensis (Padmanabhan et al., 2008). In addition, following deletion of endogenous centromeres, C. albicans neocentromeres form efficiently and are often assembled in proximity to DNA repeats (Ketel et al., 2009). However, the lack of repetitive elements surrounding the centromere on chromosome 7 argues against a role of DNA repeats in centromere function. In many eukaryotes, pericentromeric regions are assembled into transcriptionally silent heterochromatin. It is possible that, despite the lack of a conserved DNA sequence and/or DNA feature, the common feature of C. albicans pericentromeric regions is a specific chromatin structure resembling heterochromatin. Here, we address this question.
We have recently shown that, in C. albicans, transcriptionally silent heterochromatin is assembled at the ribosomal DNA (rDNA) locus and at telomeres (Freire-Benéitez et al., 2016). At these locations, heterochromatin is typified by nucleosomes that are hypoacetylated and hypomethylated on H3K4. The histone deacetylase Sir2 (orf19.1992) is required to maintain this repressive epigenetic state via hypoacetylation of H3K9 and H4K16 (Freire-Benéitez et al., 2016).
In this study, we investigate the chromatin state associated with C. albicans pericentromeric repeats. We find that pericentromeric regions are assembled into an intermediate chromatin state bearing features of both euchromatin (high histone acetylation) and heterochromatin (hypomethylation of H3K4). This intermediate chromatin state is associated with a weak transcriptionally silent environment that partially represses expression of proximal genes and inserted marker genes. Our analysis identifies a new chromatin state associated with pericentromeric regions.

MATERIALS AND METHODS
Growth Conditions
Yeast cells were cultured in rich medium (YPAD) containing extra adenine (0.1 mg/ml) and extra uridine (0.08 mg/ml), complete SC medium (Formedium™), or SC Drop-Out media (Formedium™). Cells were grown at 30 or 39°C as indicated.

Yeast Strain Construction
Strains are listed in Supplementary Table S1. Integration and deletion of genes were performed using plasmids containing marker genes for substitution or integration at the endogenous locus as previously described (Wilson et al., 1999). Transformation was performed by electroporation (Gene Pulser™, Bio-Rad) using the protocol described in De Backer et al. (1999). The URA3 marker gene was used for silencing assays to determine URA3 expression in complete SC medium (Formedium™) or SC URA Drop-Out media (Formedium™). HIS1, ARG4, and NAT marker genes were used to replace both copies of the SIR2, SET1, and JHD2 genes. PCR was used to screen for positive transformants. Oligonucleotides and plasmids used for strain construction are listed in Supplementary Tables S2 and S3, respectively.

Silencing Assay
Growth analyses were performed using a plate reader (SpectrostarNano, BMG Labtech) in 96-well plate format at 30°C. For each silencing assay, a 1:100 dilution of an overnight culture was inoculated in a final volume of 95 µl of SC or SC-URA media to reach a concentration of 60 cells/µl. Growth was assessed by measuring A600, using the following conditions: OD600 nm, 616 s cycle time, three flashes per well, 700 rpm shaking frequency, orbital shaking mode, 545 s additional shaking time after each cycle, 0.5 s post delay, for 44 h. Graphs represent data from three biological replicates. Error bars: standard deviation (SD) of three biological replicates generated from three independent cultures of the same strain. Data were processed using SpectrostarNano MARS software and Microsoft Excel.
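The per-time-point summary described above (mean and SD of three biological replicates) can be sketched as follows. This is a minimal illustration, not the MARS or Excel workflow used in the study; the function name `summarize_growth` and the use of the sample standard deviation (n − 1 denominator) are assumptions.

```python
from statistics import mean, stdev


def summarize_growth(replicates):
    """Return (mean, SD) of A600 at each time point.

    replicates: list of equal-length lists, one list of A600 readings per
    independent culture. The sample SD (n - 1) is an assumption; the text
    does not state which SD formula was used.
    """
    if len(replicates) < 2:
        raise ValueError("at least two replicates are needed for an SD")
    if len({len(r) for r in replicates}) != 1:
        raise ValueError("all replicates must have the same number of time points")
    # zip(*replicates) groups the readings taken at the same time point
    return [(mean(readings), stdev(readings)) for readings in zip(*replicates)]


if __name__ == "__main__":
    # Hypothetical readings for three cultures at two time points
    curves = [[0.1, 0.2], [0.2, 0.4], [0.3, 0.6]]
    for avg, sd in summarize_growth(curves):
        print(f"mean={avg:.3f} SD={sd:.3f}")
```

Each tuple in the result corresponds to one point and its error bar in a growth curve.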

RNA Extraction and cDNA Synthesis
RNA was extracted from exponentially growing cultures (OD600 = 1.4) using a yeast RNA extraction kit (E.Z.N.A.® Isolation Kit RNA Yeast, Omega Bio-Tek) following the manufacturer's instructions. RNA quality was checked by electrophoresis under denaturing conditions in 1% agarose, 1X HEPES, 6% formaldehyde (Sigma). RNA concentration was measured using a NanoDrop ND-1000 Spectrophotometer. cDNA synthesis was performed using iScript™ Reverse Transcription Supermix for RT-qPCR (Bio-Rad) following the manufacturer's instructions and a Bio-Rad CFX Connect™ Real-Time System.

High-throughput RNA Sequencing
Strand-specific cDNA Illumina barcoded libraries were generated from 1 µg of total RNA extracted from WT and sir2Δ/Δ strains and sequenced with an Illumina HiSeq 2000 platform. Library preparation and deep sequencing were performed by the Genomics Core Facility at EMBL (Heidelberg, Germany). Raw reads were analyzed using the TopHat algorithm following the RNA deep-sequencing analysis pipeline described in Trapnell et al. (2013), using Galaxy and a Linux platform. Heatmaps and boxplot graphs were generated with R. RNA sequencing data are deposited in ArrayExpress (accession number E-MTAB-4622).

qPCR Reactions
Primers used are listed in Supplementary Table S2. Real-time qPCR and RT-qPCR were performed in the presence of SYBR Green (Bio-Rad) on a Bio-Rad CFX Connect™ Real-Time System. Data were analyzed with Bio-Rad CFX Manager 3.1 software and Microsoft Excel. Enrichments were calculated as the percentage ratio of specific IP over input for qChIP analysis and as enrichment over actin for RT-qPCR. Histograms represent data from three biological replicates. Error bars: SD of three biological replicates generated from three independent cultures of the same strain.
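The two enrichment measures named above can be illustrated with a short sketch. The exact formulas used in CFX Manager and Excel are not given in the text, so both functions are assumptions: `percent_input` expects IP and input quantities already corrected for any input dilution, and `relative_to_reference` uses a ΔCt calculation that assumes equal amplification efficiency (2.0 by default) for the target and the reference (here actin).

```python
def percent_input(ip_quantity, input_quantity):
    """qChIP enrichment as the percentage ratio of IP over input.

    Both quantities are assumed to be on the same scale, i.e. the input
    value has already been corrected for its dilution factor.
    """
    if input_quantity <= 0:
        raise ValueError("input quantity must be positive")
    return 100.0 * ip_quantity / input_quantity


def relative_to_reference(target_ct, reference_ct, efficiency=2.0):
    """RT-qPCR level of a target relative to a reference transcript.

    Delta-Ct sketch: a target that crosses threshold three cycles later
    than the reference is efficiency**-3 as abundant. Equal efficiency
    for both primer pairs is an assumption.
    """
    return efficiency ** (reference_ct - target_ct)


if __name__ == "__main__":
    # Hypothetical values, for illustration only
    print(percent_input(2.0, 50.0))
    print(relative_to_reference(target_ct=25.0, reference_ct=22.0))
```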

RESULTS
Expression Level of Centromere-Proximal Genes Is Low
Heterochromatin represses expression of associated and proximal genes (Freitas-Junior et al., 2005; Kaur et al., 2005; Hansen et al., 2006; Merrick and Duraisingh, 2006). Therefore, if the pericentromeric regions of C. albicans were assembled into heterochromatin, genes in proximity to these regions would be poorly expressed. To test this hypothesis, we isolated RNA from wild-type (WT) cells grown at a temperature relevant for growth of C. albicans on the skin (30°C) and at a temperature mimicking fever in the host (39°C) and performed RNA-seq analyses. FPKM (fragments per kilobase of exon per million mapped reads) values were determined for each annotated gene. We then calculated the FPKM for genes in cumulative bins of 1 kb from the centromeres and compared these values to the genome-wide average (Figure 1B). This analysis reveals that genes in proximity to centromeres (0 to 1.5 kb) are expressed at lower levels than the genome-wide average (p-value = 2.2 × 10⁻¹⁶). These data suggest that pericentromeric and centromeric regions impose a weak transcriptionally repressive environment.
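The cumulative-bin comparison described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions rather than the analysis script used in the study: the function name `cumulative_bin_means`, the use of the arithmetic mean as the summary statistic, and the example values are all hypothetical.

```python
from statistics import mean


def cumulative_bin_means(genes, bin_size_kb=1.0, max_kb=5.0):
    """Mean FPKM of all genes within 0..edge kb of a centromere.

    genes: list of (distance_to_centromere_kb, fpkm) tuples.
    Bins are cumulative: the 2 kb bin also contains genes from the 1 kb bin.
    Returns a list of (edge_kb, mean_fpkm or None if the bin is empty).
    """
    n_bins = int(max_kb / bin_size_kb)
    results = []
    for i in range(1, n_bins + 1):
        edge = i * bin_size_kb
        values = [fpkm for dist, fpkm in genes if dist <= edge]
        results.append((edge, mean(values) if values else None))
    return results


if __name__ == "__main__":
    # Hypothetical (distance in kb, FPKM) pairs
    genes = [(0.5, 2.0), (1.5, 4.0), (2.5, 12.0), (40.0, 100.0)]
    genome_wide = mean(fpkm for _, fpkm in genes)
    for edge, avg in cumulative_bin_means(genes, 1.0, 3.0):
        print(f"0-{edge:g} kb: mean FPKM {avg} (genome-wide {genome_wide})")
```

A statistical test comparing each bin with the genome-wide distribution would follow this step; the text does not name the test used, so none is shown here.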

A Marker Gene Inserted at Pericentromeric Repeats Is Partially Repressed
Heterochromatin assembled onto repetitive DNA represses the transcription of marker genes inserted in its proximity (Henikoff and Dreesen, 1989; Gottschling et al., 1990; Bryk et al., 1997; Smith and Boeke, 1997). We have previously shown that, in C. albicans too, heterochromatin silences inserted marker genes (Freire-Benéitez et al., 2016). Our RNA-seq analyses suggest that regions proximal to a centromere impose a weak transcriptionally repressive environment (Figure 1B). To test whether C. albicans pericentromeres impose transcriptional silencing dependently or independently of the presence of DNA repeats, we integrated the URA3 marker gene into the pericentromeric regions of centromeres surrounded by DNA repeats (peri-CEN4:URA3+ and peri-CEN5:URA3+) and into the pericentromeric region of CEN7, which lacks DNA repeats (peri-CEN7:URA3+) (Figure 2A). To investigate whether the URA3 marker gene is transcriptionally silenced when inserted at these locations, strains were grown in non-selective (N/S) medium and in medium lacking uridine (-Uri), in which only cells expressing sufficient Ura3 protein are able to grow. Silencing of URA3 is expected to result in slower growth in -Uri medium compared to N/S medium. However, none of the strains grew poorly in -Uri medium compared to N/S medium (Figure 2B). In contrast, quantitative reverse transcription PCR (qRT-PCR) analysis reveals that the levels of URA3 mRNA in the peri-CEN4:URA3+, peri-CEN5:URA3+, and peri-CEN7:URA3+ strains were significantly lower than those of the euchromatic URA3 gene (Figure 2C). Therefore, pericentromeric regions impose a weak transcriptional silencing that can only be detected at the RNA level and that is independent of the presence of repetitive elements.

Pericentromeric Repeats Are Assembled Into an Intermediate Chromatin State Bearing Features of Both Euchromatin and Heterochromatin
We have shown that, in C. albicans, heterochromatic regions are assembled into nucleosomes that are hypoacetylated on H3K9 and H4K16 and hypomethylated on H3K4 (Freire-Benéitez et al., 2016). To assess whether pericentromeric regions are associated with heterochromatic histone marks, we monitored by qChIP the presence of H3K9Ac, H4K16Ac, and H3K4me at pericentromeric regions surrounding CEN4, CEN5, and CEN7 (Figure 3A). As a control, the chromatin state associated with the euchromatic ACT1 locus was analyzed. We find that all pericentromeric regions analyzed are assembled into chromatin that is highly acetylated on H3K9 and H4K16, as levels of these two histone modifications are similar to levels detected at the active and euchromatic locus ACT1 (Figures 3B, D, and F). High levels of H4K16 acetylation are also found at the central core region assembled into CENP-A chromatin (Figure 3). In contrast, pericentromeric chromatin is hypomethylated on H3K4, a chromatin state more similar to heterochromatic regions and different from the euchromatic ACT1 locus (Figures 3B, D, and F). Thus, pericentromeric regions have only one of the three marks (H3K4 hypomethylation) associated with classic heterochromatin. Importantly, this chromatin state marks pericentromeric regions independently of the presence of repeats, as pericentromeres with DNA repeats (CEN4 and CEN5) and without (CEN7) are associated with a similar histone modification pattern. We conclude that C. albicans pericentromeres are not assembled into classical transcriptionally silent heterochromatin but are associated with an intermediate chromatin state bearing features of euchromatin (high histone acetylation) and heterochromatin (H3K4 hypomethylation).

The Chromatin State Associated with Pericentromeric Regions Is Independent of the Histone Deacetylase Sir2
The HDAC Sir2 specifically deacetylates H3K9 and H4K16 and is required for heterochromatin assembly across the eukaryotic kingdom (Rusche et al., 2003). We have shown that C. albicans Sir2 is necessary for heterochromatin integrity at the rDNA locus and telomeric regions via deacetylation of H3K9 and H4K16 (Freire-Benéitez et al., 2016). To assess whether Sir2 contributes to the chromatin and transcriptional state of pericentromeric regions, we isolated RNA from WT and sir2Δ/Δ cells and performed RNA-seq analyses. FPKM values were determined for all genes proximal to CEN repeats and compared between sir2Δ/Δ and WT strains. Upon deletion of the SIR2 gene, we did not observe any clear effect on expression of CEN-proximal genes (Figure 4A). Only 2 of the 12 genes located in proximity (<1.5 kb) to centromeres were expressed more than 2-fold higher in sir2Δ/Δ isolates compared to WT cells (Table 1). Therefore, Sir2 does not contribute to the poor expression of CEN-proximal genes. In agreement with these results, deletion of the SIR2 gene does not increase the levels of H3K9 and H4K16 acetylation associated with the pericentromeric region on chromosome 5, as revealed by qChIP analyses (Figure 4C).
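The 2-fold comparison between mutant and WT expression can be sketched as below. This is a hypothetical illustration, not the pipeline used in the study: the function name `upregulated_genes`, the gene labels, and the pseudocount added to avoid division by zero for unexpressed genes are all assumptions.

```python
def upregulated_genes(wt_fpkm, mutant_fpkm, min_fold=2.0, pseudocount=1.0):
    """List genes whose mutant/WT FPKM ratio exceeds min_fold.

    wt_fpkm, mutant_fpkm: dicts mapping gene name to FPKM.
    The pseudocount is an assumption made for this sketch; it keeps the
    ratio defined when a gene has zero FPKM in WT cells.
    Returns (gene, fold) tuples sorted from highest to lowest fold change.
    """
    hits = []
    for gene, wt_value in wt_fpkm.items():
        if gene not in mutant_fpkm:
            continue  # skip genes not quantified in both strains
        fold = (mutant_fpkm[gene] + pseudocount) / (wt_value + pseudocount)
        if fold > min_fold:
            hits.append((gene, fold))
    return sorted(hits, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical FPKM values for three CEN-proximal genes
    wt = {"geneA": 1.0, "geneB": 9.0, "geneC": 3.0}
    mutant = {"geneA": 7.0, "geneB": 10.0, "geneC": 7.0}
    print(upregulated_genes(wt, mutant))
```

The threshold is strict, so a gene at exactly 2-fold is not reported.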

The Histone Methyltransferase Set1, But Not the Histone Demethylase Jhd2, Contributes to the Chromatin State Associated with Pericentromeres
Coordination of activities between histone methyltransferases and histone demethylases ensures the appropriate methylation level at euchromatic and heterochromatic loci. Therefore, specific histone methyltransferases and demethylases might be important for maintaining the H3K4 hypomethylated state associated with C. albicans pericentromeres. The C. albicans genome encodes the H3K4 methyltransferase Set1 (Raman et al., 2006) and the putative H3K4 demethylase Kdm5/Jhd2 (orf19.5651). In S. cerevisiae, both proteins have been implicated in transcriptional silencing and heterochromatin formation (Briggs et al., 2001; Ingvarsdottir et al., 2007; Ryu and Ahn, 2014). To assess whether Set1 and/or Jhd2 contribute to the chromatin state associated with C. albicans pericentromeric regions, we deleted both copies of the SET1 and JHD2 genes from WT cells and quantified H3K4me2 levels by qChIP analyses. We find that Set1 is necessary for maintaining the low levels of H3K4me associated with CEN5 repeats (Figure 5A), as H3K4me dropped to background levels in set1Δ/Δ compared to WT cells (Figure 5B). In contrast, we find that Jhd2 is not required for maintaining the hypomethylated state associated with CEN5 repeats (Figure 5A), as H3K4me levels did not change between jhd2Δ/Δ and WT cells (Figure 5C).

FIGURE 2 | Weak transcriptional silencing of a URA3 marker gene integrated at pericentromeric regions. (A) Schematic of peri-CEN4:URA3+, peri-CEN5:URA3+, and peri-CEN7:URA3+ reporter strains. Distance from the URA3 marker gene to each CEN core is indicated in kb. (B) Silencing assay of the peri-CEN4:URA3+, peri-CEN5:URA3+, and peri-CEN7:URA3+ reporter strains. URA3+ (URA3/URA3) and Ura− (ura3Δ/ura3Δ) strains were included as controls. (C) qRT-PCR analyses to measure URA3+ transcript levels of the peri-CEN4:URA3+, peri-CEN5:URA3+, and peri-CEN7:URA3+ reporter strains relative to actin transcript levels (ACT1). Error bars in each panel: standard deviation (SD) of three biological replicates.

DISCUSSION
This study is the first analysis of the chromatin state associated with pericentromeric regions in the human fungal pathogen C. albicans.
In many organisms, regional centromeres have a conserved modular structure despite the lack of a conserved DNA sequence. At these locations, CENP-A domains, sites of kinetochore assembly, are flanked or interspersed by DNA repeats assembled into heterochromatin (Buscaino et al., 2010). Both CENP-A chromatin and pericentromeric heterochromatin are inhibitory to transcription, as illustrated by silencing of inserted marker genes (Allshire et al., 1994; Ketel et al., 2009). The small C. albicans regional centromeres are composed of a CENP-A central core surrounded by pericentromeric regions with different sequence and organization. CENP-A chromatin represses transcription of embedded marker genes (Ketel et al., 2009). Here, we show that, despite the absence of canonical heterochromatin, C. albicans pericentromeric regions weakly repress transcription. This conclusion is supported by two observations. Firstly, RNA-seq analysis shows that genes located in proximity (<1.5 kb) to centromeres are expressed at lower levels than the genome-wide average (Figure 1B). Secondly, a URA3 marker gene inserted into the pericentromeric region of chromosomes 4, 5, and 7 is weakly silenced. This weak silencing is observed when URA3 RNA levels are measured by qRT-PCR, but no silencing is detected by growing strains in -URA medium (Figure 2). This suggests that the reduced expression of the URA3+ marker gene is still sufficient to confer a URA+ growth phenotype.
Candida albicans lacks the heterochromatic structure that is normally associated with regional centromeres because the C. albicans genome does not encode Su(var)3-9, the H3K9-specific methyltransferase, or HP1, the chromodomain protein responsible for assembly and spreading of centromeric heterochromatin. We have recently shown that transcriptionally silenced heterochromatin exists in C. albicans and is associated with the rDNA locus and telomeric regions (Freire-Benéitez et al., 2016). This heterochromatic state, as in S. cerevisiae, is characterized by hypoacetylated nucleosomes that are hypomethylated on H3K4 (Freire-Benéitez et al., 2016). Here, we show that C. albicans pericentromeric regions are not associated with heterochromatin but with an intermediate chromatin state bearing features of both euchromatin and heterochromatin. Nucleosomes at pericentromeric regions are highly acetylated, as observed in euchromatin, and hypomethylated on H3K4, as observed in heterochromatin (Figure 3). Consistently, we find that deletion of the HDAC Sir2 does not perturb the chromatin state of pericentromeric regions (Figure 4).
Given the lack of canonical heterochromatin at pericentromeric regions, it is still unclear what drives the weak transcriptional silencing associated with these regions. We envisage two possible scenarios. It is possible that low levels of CENP-A, undetectable by ChIP analyses, associate with pericentromeric regions and drive transcriptional silencing. In support of this hypothesis, it is well known that CENP-A chromatin is inhibitory to transcription. In addition, it has been observed that, following deletion of endogenous centromeric sequences, neocentromeres often form immediately adjacent to the site of the excised native centromeres (Ketel et al., 2009; Shang et al., 2013; Thakur and Sanyal, 2013). These data suggest that low levels of CENP-A might be present at pericentromeric regions and might be sufficient to nucleate new centromeres. In agreement with this hypothesis, it has been shown that a pool of free CENP-A accessory molecules is present in the vicinity of centromeres. These accessory molecules do not nucleate kinetochore assembly but could allow for rapid incorporation of CENP-A in the event of eviction at the centromere (Haase et al., 2013). Therefore, it is possible that CENP-A accessory molecules drive the transcriptional silencing associated with pericentromeric repeats. Finally, it has been shown that, following neocentromere formation, assembly of CENP-A into transcribed regions is sufficient to repress gene expression (Shang et al., 2013).
An alternative hypothesis is that the chromatin state associated with pericentromeric regions is sufficient to drive transcriptional silencing. We find that pericentromeric regions are hypomethylated on H3K4, an epigenetic mark of heterochromatin. It is possible that this mark is sufficient to drive transcriptional repression even in the presence of acetylated histones. In support of this hypothesis, it is well established that methylation of H3K4 correlates with gene expression and that heterochromatic regions are marked by H3K4 hypomethylation (Fischle et al., 2003; Lachner et al., 2004).
The biology of C. albicans DNA repeats, including pericentromeric regions, is still poorly understood. Future studies will reveal whether and how the novel chromatin state associated with these pericentromeric regions controls centromere function and/or identity.

FUNDING
This work was supported by BBSRC (BB/L008041/1 to AB), MRC (MR/M019713/1 to RJP and AB), and a Royal Society Research Grant (RG130149 to AB).