Regulatory Features for Odorant Receptor Genes in the Mouse Genome

The odorant receptor genes, seven-transmembrane receptor genes constituting the largest mammalian gene multifamily, are expressed monogenically and monoallelically in each sensory neuron of the olfactory epithelium. This characteristic, often referred to as the one neuron–one receptor rule, is driven by mostly uncharacterized molecular dynamics, generally named odorant receptor gene choice. The scientific community has paid much attention to identifying the sequences that regulate the expression of odorant receptor genes within their loci, where related genes are usually arranged in genomic clusters. A number of studies identified transcription factor binding sites on odorant receptor promoter sequences. Similar binding sites were also found on a number of enhancers that regulate odorant receptor transcription in cis, but that have also been proposed to form interchromosomal networks. Odorant receptor gene choice seems to occur via the local removal of strongly repressive epigenetic marks, put in place on each odorant receptor locus during the maturation of the sensory neuron. Here we review the fast-changing state of the art in the study of regulatory features for odorant receptor genes.


INTRODUCTION
Many animals rely strongly on olfaction to gather information about their surroundings, look for food, escape from predators, find a mate, and communicate with each other. The transcriptional regulation of odorant receptor genes, which in mice comprise ∼1100 intact members (Buck and Axel, 1991; Niimura et al., 2014), displays an almost unique feature: from the whole set, a single odorant receptor gene is monoallelically expressed in each sensory neuron. This singularity is driven by mostly uncharacterized molecular dynamics, collectively termed odorant receptor gene choice, which seem to occur via local removal of strongly repressive epigenetic marks put in place on each odorant receptor locus during the maturation of the sensory neuron.
The scientific community has tried to identify sequences that regulate the expression of odorant receptor genes within their loci, which normally contain groups of related genes tightly arranged in genomic clusters (Sullivan et al., 1996; Niimura et al., 2014). Several studies identified transcription factor binding sites (TFBSs) on odorant receptor promoter sequences. Similar binding sites were also found on a number of enhancers located in proximity to odorant receptor genes. Those enhancers regulate the transcription of odorant receptor genes in cis, and seem to form interchromosomal networks.
A solid body of evidence shows that odorant receptors are involved not only in odor detection (Buck and Axel, 1991), but also in neuronal maturation, axonal sorting (Mombaerts et al., 1996; Wang et al., 1998; Vassalli et al., 2002) and neuronal longevity (Santoro and Dulac, 2012). However, their peculiar gene expression, and how it is achieved, still represents a fundamental open question.

THE MOUSE OLFACTORY SYSTEM
In the mouse, the task of sensing a vast range of molecules is carried out by the main and the accessory olfactory systems. The main olfactory system includes the main olfactory epithelium (MOE), which lines the turbinates in the posterior nasal cavity, and the main olfactory bulb (MOB) in the brain. The MOE is a neurogenic pseudostratified epithelium, which houses basal cells, supporting cells, Bowman's glands, and olfactory sensory neurons (OSNs); the latter are responsible for the detection of odorants and other ethologically important molecules. OSNs are bipolar neurons with an apical dendrite ending in a knob from which specialized cilia protrude into the mucus of the nasal cavity (Mendoza, 1993; for detailed reviews see also Breer et al., 2006; Tirindelli et al., 2009).
Primary transduction of odors takes place in the cilia, where the chemosensory receptors, either odorant receptors (ORs) or trace amine-associated receptors (TAARs), are expressed (Buck and Axel, 1991; Liberles and Buck, 2006). These are G protein-coupled receptors (GPCRs) whose signals activate the transduction cascade and influence epigenetic gene regulation. ORs can be phylogenetically divided into two subfamilies: class I, comprising ∼125 fish-like intact OR genes, and class II, including ∼1000 intact OR genes specific to mammals (Niimura et al., 2014). OSNs express monogenically and monoallelically a single OR gene from the whole genomic repertoire (Ngai et al., 1993a,b; Chess et al., 1994), a feature known as the one neuron-one receptor rule. However, as recently reported by Greer et al. (2016), a subset of OSNs localized in the recesses of the olfactory epithelium seems to escape this general rule: each OSN of the necklace subsystem expresses multiple MS4A genes, coding for four-transmembrane chemoreceptors; through a yet unknown signaling pathway, they are mainly involved in the detection of pheromones and other ethologically relevant ligands.
Axons from OSNs expressing the same OR gene, after crossing the cribriform plate, bundle together and converge on the same location, referred to as a glomerulus, in a few stereotypic domains of the MOB (Le Gros Clark and Turner Warwick, 1946; Ressler et al., 1994; Vassar et al., 1994; Mombaerts et al., 1996). Axonal wiring is a process in which the sensory receptor itself has a fundamental role (Mombaerts et al., 1996; Wang et al., 1998; Vassalli et al., 2002; Movahedi et al., 2016). The accessory olfactory system includes the vomeronasal organ of Jacobson and its projections to the accessory olfactory bulb, located in a posterior dorsal region of the MOB (McCotter, 1912; Breer et al., 2006), and other olfactory compartments: the septal organ of Masera, located close to the nasal septum, which sends axonal projections to a subset of glomeruli in the MOB; and the Grueneberg ganglion, in the anterodorsal region of the nasal cavity, which sends projections to a subpopulation of the necklace glomeruli in the MOB. Neurons found in the septal organ and Grueneberg ganglion epithelia are generally called OSNs, although some of them, like the already mentioned necklace OSNs, display some peculiarities (Fleischer and Breer, 2010; Greer et al., 2016).
The vomeronasal organ is a blind tubular structure located at the base of the nasal septum and mainly devoted to pheromone detection. It comprises a non-sensory region and a sensory pseudostratified epithelium hosting vomeronasal sensory neurons (VSNs), basal stem cells, and supporting cells (Halpern, 1987; Døving and Trotier, 1998; Ishii and Mombaerts, 2008). VSNs are bipolar neurons with a single dendrite ending in a knob that exposes microvilli to the vomeronasal lumen. They are divided into two main subpopulations, distributed in an apical and a basal layer, whose receptors belong to two different families of vomeronasal GPCRs. Apical VSNs coexpress the G-protein subunit Gαi2 and receptor genes of the V1R family, which includes ∼150 intact genes (out of 300 genes) divided into 12 clades (Dulac and Axel, 1995; Jia and Halpern, 1996). Genes belonging to the same subfamily are organized in clusters (Herrada and Dulac, 1997; Matsunami and Buck, 1997; Ryba and Tirindelli, 1997; Rodriguez et al., 2002; Zhang et al., 2004) and are expressed monogenically and monoallelically (Dulac and Axel, 1995; Rodriguez et al., 1999; Roppolo et al., 2007). Basal VSNs coexpress the G-protein subunit Gαo and receptor genes of the V2R family, which includes ∼120 intact genes (out of 280) divided into subfamilies A, B, and D, which comprise most of the intact V2R gene repertoire, and subfamily C (seven genes). V2R sensory neurons express a single, apparently stochastically chosen, member of subfamily C plus one or more selected members of subfamilies A, B, or D (Herrada and Dulac, 1997; Matsunami and Buck, 1997; Ryba and Tirindelli, 1997; Martini et al., 2001; Silvotti et al., 2007; Ishii and Mombaerts, 2011). A subset of these neurons may also express non-classical major histocompatibility complex (MHC) 1b H2-Mv genes (Ishii and Mombaerts, 2008; Leinders-Zufall et al., 2009). Although most of the regulatory features of V1R and V2R genes are still not well known, Enomoto et al. (2011) reported that the transcription factor Bcl11b has an important role in regulating the fate choice between the V1R and V2R types of VSNs.
A small subset of VSNs, mostly in the apical neurons, monogenically expresses genes coding for formyl peptide receptors (FPRs), GPCRs that are mainly involved in microbial and viral peptide detection (Rivière et al., 2009;Bufe et al., 2015).

GENOMIC ORGANIZATION OF ODORANT RECEPTOR GENES AND OLFACTORY CODING
Olfactory information is encoded by thousands of OSNs, each of which can bind different molecules with different affinities in a combinatorial fashion (Nara et al., 2011; Jiang et al., 2015) that amplifies the odorant discrimination possibilities of the already huge OR repertoire.
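To give a feel for how a combinatorial code amplifies the repertoire, the following minimal sketch counts how many distinct receptor subsets of a given size could in principle be formed from the ∼1100 intact mouse OR genes mentioned in the Introduction. It is only an arithmetic illustration: the function name `receptor_combinations` is hypothetical, and the assumption that an odorant is represented simply by the set of receptors it activates (ignoring affinity and zonal restrictions) is a deliberate simplification, not a claim from the cited studies.

```python
import math


def receptor_combinations(repertoire_size: int, active_receptors: int) -> int:
    """Number of distinct receptor subsets of a given size.

    Assumption: an odorant is represented only by WHICH receptors it
    activates, so the count is the binomial coefficient C(n, k).
    """
    return math.comb(repertoire_size, active_receptors)


if __name__ == "__main__":
    n_intact_or_genes = 1100  # approximate figure quoted in the text
    for k in (1, 2, 3):
        print(k, receptor_combinations(n_intact_or_genes, k))
```

Even with only two or three co-activated receptors, the number of possible subsets exceeds the size of the repertoire by several orders of magnitude.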
The OR gene family is spread across the whole genome: class I OR genes are in a single cluster on chromosome 7; class II OR genes are scattered on all chromosomes except 18 and Y, and are arranged in clusters distributed over ∼50 loci, with a usual intergenic distance of 19-45 kb (Young et al., 2002; Zhang et al., 2007; Clowney et al., 2012), plus a few solitary genes located more than 1 Mb away, both upstream and downstream of their transcripts, from any other OR gene (Zhang and Firestein, 2002; Godfrey et al., 2004; Malnic et al., 2004; Degl'Innocenti et al., 2016). OR genes are encoded by a single exon ∼1 kb long and present conserved amino acid motifs characteristic of their family (Lane et al., 2001; Ibarra-Soria et al., 2014; Kanageswaran et al., 2015; Saraiva et al., 2015). As mentioned above, the single OR allele expressed in an OSN also determines its identity and influences the OSN's axonal wiring to specific glomeruli in the bulb, resulting in a stereotyped sensory map that depends on as yet unknown information provided by the OR. Knowing how odorant receptor gene choice works is therefore pivotal to understanding the logic behind the integration of olfactory input.
To explain OR gene choice, several lines of evidence point towards molecular mechanisms that lead to the random choice of only one among several OR promoters, possibly through epigenetic dynamics (Chess et al., 1994; Lomvardas et al., 2006; Clowney et al., 2012). By contrast, the possibility of gene rearrangements at OR loci in the OSN lineage has been excluded, at least for the locus of the model OR gene M71. In fact, cloning a mouse from the nucleus of an M71-expressing OSN resets the OR gene choice previously made in favor of M71, and results in specimens with normal OR gene expression (Eggan et al., 2004; Li et al., 2004).

CIS-REGULATING SEQUENCES FOR ODORANT RECEPTOR GENES
Odorant Receptor Gene Promoters
OR gene promoters are AT-rich sequences that usually lack a TATA-box, although some do have one (Young et al., 2011; Plessy et al., 2012). When TATA-boxes are present, however, their positions do not closely correlate with the transcription start site of the gene. For many OR genes, initiation of transcription may adhere to the so-called rule of genomic contrast: mRNA polymerization would be triggered not by a specific increase in AT content but by a sudden local variation of it (cf. Clowney et al., 2011). OR promoters typically feature TFBSs for homeodomain and for olfactory/early B transcription factors (Wang et al., 1997; Vassalli et al., 2002; Young et al., 2011; Plessy et al., 2012), whose presence was in some cases confirmed in vivo (Rothman et al., 2005; Vassalli et al., 2011). Along with them, other TFBSs were found on their sequences, e.g., for MEF2A (Plessy et al., 2012). TFBSs are considered major players in defining the zonality of OR gene expression: OSNs found within a given zone, i.e., a MOE subdomain with a typical transcriptome, stochastically choose their OR allele out of a subset of the whole genomic repertoire. Non-chosen OR promoters are epigenetically silenced by H3K9me3 and H4K20me3 marks. From functional studies, minimal promoters appear to be quite short (∼300 bp; Vassalli et al., 2011), and sequences of similar length have proven capable of driving punctate, stochastic expression of OR transgenes in the MOE (Vassalli et al., 2002, 2011; Rothman et al., 2005).
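The idea of genomic contrast, a sudden local variation in AT content rather than a high AT content as such, can be illustrated with a small sliding-window calculation. The sketch below is an assumption-laden illustration and not the method used in the cited studies: the function names (`at_fraction`, `at_contrast`, `sharpest_transition`) and the default window of 50 bp are hypothetical choices made for this example.

```python
def at_fraction(seq: str) -> float:
    """Fraction of A/T bases in seq (case-insensitive); 0.0 for an empty string."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "AT" for base in seq) / len(seq)


def at_contrast(seq: str, window: int = 50) -> list[float]:
    """Downstream-minus-upstream AT fraction at every position that has a
    full window on both sides. The window size is an illustrative assumption."""
    contrasts = []
    for i in range(window, len(seq) - window + 1):
        upstream = at_fraction(seq[i - window:i])
        downstream = at_fraction(seq[i:i + window])
        contrasts.append(downstream - upstream)
    return contrasts


def sharpest_transition(seq: str, window: int = 50) -> int:
    """0-based position with the largest absolute change in local AT content."""
    contrasts = at_contrast(seq, window)
    if not contrasts:
        raise ValueError("sequence is shorter than two windows")
    best = max(range(len(contrasts)), key=lambda k: abs(contrasts[k]))
    return best + window  # shift back to sequence coordinates


if __name__ == "__main__":
    # Synthetic sequence: 100 bp GC-only followed by 100 bp AT-only.
    toy = "GC" * 50 + "AT" * 50
    print(sharpest_transition(toy))
```

On real promoter sequences the profile returned by `at_contrast` would be noisy, so the choice of window size matters; the synthetic sequence above is used only because its transition point is unambiguous.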

Odorant Receptor Elements
Elements for OR genes are non-genic regulatory sequences traditionally classified as enhancers, although their very nature as facilitators of transcription is debated: it has been proposed that elements differ from typical enhancers in that they control the probability of a given OR gene being chosen, rather than merely increasing the amount of transcript per cell for all the genes they regulate (Khan et al., 2011; Vassalli et al., 2011). Elements are invariably found within, or in proximity to, OR loci (Khan et al., 2011; Markenscoff-Papadimitriou et al., 2014). Their sequences contain, similarly to OR promoters, homeodomain and olfactory/early B TFBSs, plus additional TFBSs such as those for Foxj2, Cdx, C/EBPgamma, and Bptf. To date, a total of 14 enhancers are thought to regulate OR gene expression in the mouse, three of which have been robustly confirmed in vivo; these are called H, P, and Lipsi (Nishizumi et al., 2007; Bozza et al., 2009; Khan et al., 2011; Markenscoff-Papadimitriou et al., 2014). It was realized long ago that elements might be somehow involved in OR gene choice (Serizawa et al., 2000, 2003; Lewcock and Reed, 2004; Shykind et al., 2004), but no clear mechanism has been found yet: elements regulate the expression of nearby OR genes in cis, although Markenscoff-Papadimitriou et al. (2014) have suggested that they may also act in trans with a high degree of redundancy.

ODORANT RECEPTOR GENE CHOICE: REPRESSIVE MECHANISMS
The organization of the nucleus in OSNs plays a role in the regulation of OR gene expression. Instead of being at the nuclear periphery, as in typical eukaryotic cells, constitutive heterochromatin is mainly located in the central nuclear region (Solovei et al., 2009; Clowney et al., 2012; Armelin-Correa et al., 2014a). Indeed, in the early differentiation steps of OSNs, long before OR gene choice takes place, robust silencing and packing occur on OR loci. Cytogenetically, OR gene loci (and their enhancers) become aggregated in a small number of nuclear locations, including arrangements named foci: three-dimensional chromatin structures characterized by the repressive epigenetic marks H3K9me3 and H4K20me3, typical of constitutive pericentromeric and subtelomeric chromatin. These marks are later removed from a single OR allele, ensuring monogenic and monoallelic expression (Clowney et al., 2012).
Table row: Other transcription factor binding sites (MEF2A, TBP, and transcriptional repressors resembling RP58); (some) OR promoters; Clowney et al., 2011; Michaloski et al., 2011; Young et al., 2011; Plessy et al., 2012.
Table caption: Summary of regulatory features, either epigenetic or on primary sequences, found in genomic regions regulating OR gene expression with variable degree of evidence (we report specific references for each of them). (a) While not properly a regulatory feature, the enriched presence of 8-oxodG on the chosen OR allele is reported for convenience.
[...] different compartments, one within constitutive heterochromatin and the other in facultative heterochromatin. In fact, as for any monoallelically expressed gene family, homologous alleles of OR genes are replicated asynchronously (Chess et al., 1994). Consistent with these observations, immunofluorescence staining of the olfactory epithelium for H3K27me3, a mark of facultative heterochromatin, indicates that it is present in the nuclei of OSNs (Armelin-Correa et al., 2014a). However, no clear evidence of H3K27me3 marks on OR genes has been found yet, although Armelin-Correa et al. (2014a) report that H3K27me3 marks are required for asymmetric replication of OR genes in embryonic stem cells.
Recent studies show that early developing OSNs can weakly express multiple OR genes, while during subsequent stages of development the expression of a single OR gene takes over and the other OR loci are silenced (Hanchate et al., 2015; Saraiva et al., 2015; Tan et al., 2015; Scholz et al., 2016). To explain this transition, Hanchate et al. (2015) proposed a winner-takes-all model in which one of the initially expressed OR genes becomes dominant by capturing limiting factors required for high-level expression. Alternatively, the high expression of one OR gene would occur independently of the genes expressed earlier. Hanchate et al. (2015) also suggest a regional bias in OR gene choice: early co-expressed OR genes, although located at multiple chromosomal positions, are expressed in neurons located in the same region of the MOE.
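The winner-takes-all idea can be made concrete with a toy calculation. The sketch below is purely illustrative and is not the model of Hanchate et al. (2015): it assumes, hypothetically, that in every round each co-expressed gene captures a share of a single limiting factor proportional to the square of its current share, a superlinear rule chosen only because it amplifies small initial differences. The function name `winner_takes_all` and the starting levels are likewise invented for the example.

```python
def winner_takes_all(levels: list[float], rounds: int = 10) -> list[float]:
    """Iterate a toy competition for one unit of a limiting factor.

    Assumed rule (not taken from the literature): in each round a gene's new
    share is proportional to the square of its current share, so the gene
    that starts slightly ahead ends up with almost all of the factor.
    """
    total = sum(levels)
    if total <= 0:
        raise ValueError("at least one gene must be expressed")
    shares = [level / total for level in levels]
    for _ in range(rounds):
        weights = [share ** 2 for share in shares]
        norm = sum(weights)  # always > 0 because the shares sum to 1
        shares = [weight / norm for weight in weights]
    return shares


if __name__ == "__main__":
    # Four weakly co-expressed genes with slightly different initial levels.
    print(winner_takes_all([3.0, 2.5, 2.5, 2.0]))
```

With exactly equal starting levels this deterministic rule never breaks the tie, which is one reason a stochastic component would be needed in any more realistic description.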
Immature OSNs expressing an OR gene can still switch to another OR gene in a loop process that continues until a functional OR gene is expressed and elicits a feedback signal that stops the cycle and stabilizes the choice (Serizawa et al., 2003; Lewcock and Reed, 2004; Shykind et al., 2004). In post-mitotic OSNs, a single OR gene, chosen in a stochastic and still elusive event, escapes foci and is repositioned in a nearby nuclear area (Clowney et al., 2012). According to Lyons et al. (2013), a derepressor with limited availability, either in space or time, would act together with the histone lysine demethylase 1 (Lsd1), which is transiently expressed in the core time window of OR gene choice. This event is associated with an epigenetic switch from H3K9me3 to H3K4me3 for the chosen OR allele, which perhaps interacts with an interchromosomal complex of elements (Lyons et al., 2013; Markenscoff-Papadimitriou et al., 2014). Currently, it is unclear whether H3K27 demethylases may also have a role in the process (Armelin-Correa et al., 2014a).
The expression of an intact OR gene activates the unfolded protein response, which eventually leads to the production of adenylate cyclase 3 (Adcy3); Adcy3 represses Lsd1 and promotes neuronal maturation, locking OR gene choice. If the OR gene is nonfunctional and fails to elicit Adcy3-mediated feedback, Lsd1 retains its activity: it might re-heterochromatize the opened locus, or alternatively it may open another one. This process of choice (Figure 1) would continue until an OR gene succeeds in being stably expressed (Dalton et al., 2013).
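The logic of this choice-and-feedback loop can be summarized as a short simulation. The sketch is a cartoon under explicit assumptions: every allele is taken to be equally likely to be de-repressed, the gene names are hypothetical placeholders, and a single boolean stands in for the whole unfolded protein response, Adcy3, and Lsd1 cascade described above.

```python
import random


def choose_or_gene(
    genes: dict[str, bool],
    rng: random.Random,
    max_rounds: int = 1000,
) -> tuple[str, int]:
    """Toy version of the choice loop.

    genes maps a (hypothetical) gene name to True if the gene is intact
    and to False if it is a pseudogene. Returns the stably chosen gene
    and the number of rounds that were needed.
    """
    names = sorted(genes)
    for round_number in range(1, max_rounds + 1):
        # Stochastic de-repression of one allele (uniform choice assumed).
        candidate = rng.choice(names)
        if genes[candidate]:
            # Intact receptor: feedback is elicited and the choice is locked.
            return candidate, round_number
        # Pseudogene: no feedback, so the loop tries another allele.
    raise RuntimeError("no functional OR gene was chosen")


if __name__ == "__main__":
    repertoire = {"OR_a": False, "OR_b": True, "OR_c": False}  # placeholder names
    chosen, rounds = choose_or_gene(repertoire, random.Random(0))
    print(chosen, rounds)
```

Because the loop only terminates on an intact gene, the returned gene is always functional, whatever the random seed; only the number of rounds varies.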
Whereas developmental expression seems to be independent from odorant receptor-induced neuronal activity (Hanchate et al., 2015), Ferreira et al. (2014) have shown in zebrafish that the βγ subunit of the olfactory G protein, released when an OR binds its ligand, has a direct impact on the methylation state of silenced OR loci, thus linking receptor activity to the epigenetic regulation behind the single OR gene choice mechanism.
More recently Zhang et al. (2016) have shown that the homeodomain transcription factor Lhx2 influences OR expression frequencies in immature and mature OSNs, and it is necessary for driving OR expression but not for the OR singularity, although they do not exclude an indirect role in OR gene choice.
FIGURE 1 | Main steps of odorant receptor gene choice. Gray areas represent foci; black filament represents euchromatin; colored circles represent elements; yellow box represents the single "chosen" odorant receptor (OR) allele; blue boxes represent nearby OR genes with repressive marks. (A) Silencing: in the nucleus of the maturing olfactory sensory neuron (iOSN), OR gene loci are heterochromatized; one locus undergoes an epigenetic change. (B) De-repressing: local variation in epigenetic state on OR heterochromatin (magnified shadowed red-stroked box) is initiated by an unknown derepressor, which cooperates with Lsd1 and perhaps with H3K27 demethylases in the random opening of one OR allele only; an element in the same OR locus interacts with the OR gene via DNA looping; nearby OR genes keep their repressive marks. (C) Transcribing: on the euchromatic OR allele, an interchromosomal complex of elements drives robust expression of the gene. (D) Eliciting feedback: if massive protein production within the endoplasmic reticulum is achieved, the unfolded protein response is triggered; this causes an Adcy3-mediated block on Lsd1, leaving the cell unable to unpack silenced OR loci or to re-close the euchromatized allele (purple line); left, if the "chosen" OR is a pseudogene, the process is repeated with a new OR choice; right, if a functional OR protein is produced (green arrow), its activity leads to the release of the βγ subunit of the G protein, which further prevents other OR alleles from escaping foci (red line); therefore, in order to stabilize OR gene choice, the process induces olfactory sensory neuron (OSN) maturation.

CONCLUDING REMARKS
Overall, OR genes seem to adopt a lock-and-key strategy for expression: all loci are initially epigenetically silenced, then a limiting factor randomly opens a single allele, which later stabilizes its own transcription through complex feedback mechanisms. Aside from the olfactory system, other examples of ways to increase cellular diversity among similar cell types are provided by the immune system (Hozumi and Tonegawa, 1976; Jaeger et al., 2013; Magklara and Lomvardas, 2013) and by protocadherins (Lefebvre et al., 2012). Several transcriptional characteristics seem to recur in other clustered gene families, such as globins and homeobox genes, which also display oligogenic expression. However, whilst globin (Drescher and Künzer, 1954; Huehns et al., 1964; Groudine et al., 1983) and homeobox (Gaunt et al., 1988; Duboule and Dollé, 1989; Dressler and Gruss, 1989; Graham et al., 1989) genes are serially expressed according to their chromosomal location, the OR gene family requires more complex regulation. What are the molecular mechanisms that lead to OR gene expression? How is the OSN transition to a single highly expressed OR gene regulated? How does the nuclear architecture influence this process? What is the missing link between OR gene expression and mature OSN identity? These are only a few of the fundamental open questions still puzzling the olfaction field.

AUTHOR CONTRIBUTIONS
ADI drafted an early version of the manuscript; AD critically revised it. ADI and AD wrote the manuscript. All authors read and accepted the final version.