Mechanisms Underlying Hox-Mediated Transcriptional Outcomes

Metazoans differentially express multiple Hox transcription factors to specify diverse cell fates along the developing anterior-posterior axis. Two challenges arise when trying to understand how the Hox transcription factors regulate the required target genes for morphogenesis: First, how does each Hox factor differ from one another to accurately activate and repress target genes required for the formation of distinct segment and regional identities? Second, how can a Hox factor that is broadly expressed in many tissues within a segment impact the development of specific organs by regulating target genes in a cell type-specific manner? In this review, we highlight how recent genomic, interactome, and cis-regulatory studies are providing new insights into answering these two questions. Collectively, these studies suggest that Hox factors may differentially modify the chromatin of gene targets as well as utilize numerous interactions with additional co-activators, co-repressors, and sequence-specific transcription factors to achieve accurate segment and cell type-specific transcriptional outcomes.


INTRODUCTION
Hox genes have long fascinated developmental biologists for the essential roles that they play in specifying different segment and regional identities along the developing anterior-posterior (A-P) axis of metazoans. Classic genetic studies first revealed that Hox gene mutations can result in homeotic transformations, and thereby cause one part of the organism to be transformed into the likeness of another region. As an example, Drosophila with Hox mutations can have obvious developmental abnormalities that include the misspecification of appendages as evidenced by the transformation of antennae into legs (Kaufman et al., 1980;Abbott and Kaufman, 1986;Schneuwly et al., 1987;Casares and Mann, 1998) or the conversion of the haltere into an extra set of wings (Lewis, 1978;Bender et al., 1983;Carroll et al., 1995). Subsequent studies in other organisms including a variety of vertebrate animals revealed that mutations within the highly conserved Hox gene family can cause a wide variety of homeotic transformations across metazoans as reviewed by Mark et al. (1997) and Quinonez and Innis (2014).
Hox genes were originally discovered in Drosophila melanogaster. In total, Drosophila has eight Hox genes that are separated into two distinct chromosomal clusters: The Antennapedia cluster consists of five Hox genes [labial (lab), proboscipedia (pb), Deformed (Dfd), Sex combs reduced (Scr), and Antennapedia (Antp)] that collectively regulate head and anterior thoracic development, whereas the three Hox genes in the Bithorax cluster [Ultrabithorax (Ubx), abdominal-A (abd-A), and Abdominal-B (Abd-B)] specify cell fates within the third thoracic segment and the abdominal segments (Morata et al., 1990;Maeda and Karch, 2009). In general, the order of the Hox genes on the chromosome corresponds with the location along the A-P axis at which the Hox genes act in the embryo (Lewis, 1978;Mann, 1997;Noordermeer and Duboule, 2013;Luo et al., 2019;Hajirnis and Mishra, 2021). For example, genes at the 3′ end of the Hox gene cluster tend to mediate anterior development whereas the 5′ genes tend to control posterior structures. In contrast to the single set of eight Hox genes in Drosophila, vertebrates have undergone genome duplication events such that humans have four distinct Hox clusters (labeled HOXA, HOXB, HOXC, and HOXD, respectively) encoding 39 Hox genes that have been categorized into 13 paralogs (HOX1-13). Importantly, the mammalian Hox genes exhibit the same spatial collinearity along the A-P axis as in Drosophila (Duboule and Dolle, 1989;Graham et al., 1989). For example, HOX1 genes on the 3′ end of each cluster regulate anterior structures including the hindbrain (Singh et al., 2020), while HOX13 genes on the 5′ end of each cluster control posterior and distal structures including digit development (Desanlis et al., 2020).
Based on sequence conservation, the relative positions of each Hox gene within a cluster, and their roles in A-P patterning, the Hox genes have been broadly categorized into anterior (lab, pb, Dfd, and Scr in Drosophila and Hox1-5 in vertebrates), central (Antp, Ubx, and abd-A in Drosophila and Hox6-8 in vertebrates), and posterior groups (Abd-B in Drosophila and Hox9-13 in vertebrates) (Hueber et al., 2010). It is important to note that not all Hox paralogs remain in each of the duplicated vertebrate Hox clusters. For example, cluster HOXB does not have posterior factors HOXB10-B12, and cluster HOXC lacks paralogs of HOXC1-C3 in humans (Mark et al., 1997). In short, metazoans encode variable numbers of Hox genes that are typically found clustered along the chromosome to specify the different cell fates that form along the A-P axis body plan.
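The grouping described above can be written down as a small lookup, which is convenient when annotating genomic or interactome tables by Hox class. The following is a minimal sketch that encodes only the anterior/central/posterior assignments stated in the text; the names `DROSOPHILA_GROUPS` and `vertebrate_group` are illustrative and do not come from any published tool.

```python
# Anterior/central/posterior grouping as stated in the text (Hueber et al., 2010).
# The identifiers below are hypothetical helper names chosen for illustration.
DROSOPHILA_GROUPS = {
    "lab": "anterior", "pb": "anterior", "Dfd": "anterior", "Scr": "anterior",
    "Antp": "central", "Ubx": "central", "abd-A": "central",
    "Abd-B": "posterior",
}


def vertebrate_group(paralog: int) -> str:
    """Return the group for a vertebrate Hox paralog number (1-13)."""
    if not 1 <= paralog <= 13:
        raise ValueError("vertebrate Hox paralog groups are numbered 1-13")
    if paralog <= 5:
        return "anterior"   # Hox1-5
    if paralog <= 8:
        return "central"    # Hox6-8
    return "posterior"      # Hox9-13


if __name__ == "__main__":
    for n in (1, 6, 9, 13):
        print(f"HOX{n}: {vertebrate_group(n)}")
```

Note that the lookup classifies paralog numbers only; it does not record which paralogs are absent from a given cluster (for example, HOXB10-B12 or HOXC1-C3 in humans).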
The mysteries underlying how Hox genes control distinct body regions only grew upon the discovery that each encodes a homeodomain transcription factor (TF) capable of binding highly similar AT-rich DNA sequences (McGinnis et al., 1984a;McGinnis et al., 1984b). In fact, Hox genes are members of a much larger homeodomain TF family that consists of over 200 members in mammals, and many of these genes control distinct developmental processes and fates despite encoding TFs that bind highly similar DNA sequences (Berger et al., 2008;Jolma et al., 2013;Bürglin and Affolter, 2016). Taken together, these conflicting genetic and biochemical findings raise a fundamental paradox: How can a family of homeodomain TFs capable of binding highly similar DNA sequences in vitro, regulate distinct and diverse cell fates in vivo?
During the past two decades, many molecular, genetic, and genomic approaches have begun to reveal that numerous mechanisms likely underlie the ability of Hox TFs to specify different cell fates along the A-P body axis. In total, these studies have made considerable progress in defining mechanisms that enhance Hox DNA target specificity, especially by the formation of larger DNA binding complexes with other TFs. For example, the Extradenticle (Exd, Drosophila)/Pre-B cell leukemia homeobox (Pbx, vertebrate) and/or Homothorax (Hth, Drosophila)/Myeloid ecotropic viral integration site (Meis, vertebrate) TFs have been shown to form cooperative DNA binding complexes with Hox TFs and thereby enhance Hox DNA binding specificity (Mann and Affolter, 1998;Moens and Selleri, 2006;Merabet and Mann, 2016). Through a combination of structural, biochemical, and genetic studies, the analysis of Hox/Exd and Hox/Pbx complexes has uncovered several key concepts that underlie how Hox TFs gain DNA binding specificity, including the critical role of not just nucleotide identity but also DNA shape (Zeiske et al., 2018), the concept of latent specificity (Slattery et al., 2011), and the importance of low affinity versus high affinity binding sites (Crocker et al., 2015;Zandvakili et al., 2019). These mechanisms, which by and large are used to increase Hox target gene specificity, have been reviewed in several excellent articles (Merabet and Mann, 2016;Kribelbauer et al., 2019;De Kumar and Darland, 2021).
In this review, we focus on how large-scale genomic and interactome data have uncovered numerous potential Hox regulatory elements and protein interactors that present both new opportunities and challenges. Genomic DNA binding studies from tissues and cells have revealed that Hox TFs, like most sequence-specific TFs, bind thousands of potential cis-regulatory elements but only a subset of these binding events are likely to be associated with significant changes in the expression of nearby genes (Walter et al., 1994;Farnham, 2009;Biggin, 2011;Choo and Russell, 2011;Walhout, 2011;Fisher et al., 2012). In addition, comparative studies between Hox TFs have revealed differences in their ability to bind inaccessible (i.e., closed chromatin) DNA elements. Such differences in ability to bind DNA wrapped in nucleosomes may indicate that Hox TFs have the potential to elicit pioneer-like activities that promote the opening of closed chromatin, thereby expanding the already large number of possible genomic binding sites. However, since Hox TFs are capable of mediating both transcriptional activation and repression, simply detecting Hox TF binding to an element cannot easily be used to predict transcriptional outcome. Intriguingly, protein-protein interaction assays have uncovered that Hox TFs can interact with many different proteins including other sequence-specific TFs as well as factors involved in mediating gene activation and repression. Integrating these large-scale findings with existing cis-regulatory logic studies of confirmed Hox target genes suggests that Hox TFs are likely to require numerous protein-protein interactions with other TFs to gain the required regulatory specificity to ensure accurate gene activation or repression outcomes occur in a reproducible and robust manner.

HOX TRANSCRIPTION FACTOR BINDING AND CHROMATIN ACCESSIBILITY
Hox factors, like all TFs, must bind specific DNA regulatory elements to mediate accurate transcriptional responses. Since all the cells within an organism have the same genomic material, differences in the chromatin landscape of a cell can play a large role in dictating which DNA elements are available for transcription factor binding. Thus, chromatin accessibility helps to define which genes can be activated during the specification of distinct cell fates along the body plan. Intriguingly, comparative genomic accessibility studies using Formaldehyde-Assisted Isolation of Regulatory Elements sequencing (FAIRE-seq) on Drosophila imaginal discs revealed that the wing, haltere, and metathoracic leg imaginal discs have very similar chromatin profiles (McKay and Lieb, 2013). For example, comparison between the wing and haltere imaginal discs showed that except for genomic regions flanking the Ultrabithorax (Ubx) Hox gene these two tissues have largely identical accessible cis regulatory elements (McKay and Lieb, 2013). Similar results were obtained using the Assay for Transposase-Accessible Chromatin sequencing (ATAC-seq) methods with ∼98% of the accessible DNA sequences being the same between age-matched wing and haltere discs (Loker et al., 2021).
The above findings suggest that the Ubx Hox factor, which is differentially expressed in the Drosophila imaginal discs, directs the formation of different cell and tissue fates by regulating distinct target genes within highly similar chromatin landscapes. But does the expression of this Hox TF alter the chromatin landscape during the process of cell fate specification and morphogenesis? A recent elegant study addressed this question to better define the role of the Ubx TF in regulating haltere development by focusing on the relatively small percentage of loci (∼2% of accessible regions) that were differentially accessible in haltere versus wing discs (Loker et al., 2021). Importantly, Loker et al. combined chromatin accessibility data with Ubx Chromatin Immunoprecipitation sequencing (ChIP-seq) assays and transcriptomics (RNA-seq) to show that Ubx genomic binding correlates with the opening and closing of specific loci to mediate distinct transcriptional outputs during Drosophila haltere development (Loker et al., 2021). In particular, they found that Ubx could modify the chromatin landscape to both reduce chromatin accessibility to repress gene transcription in the capitulum and proximal hinge of the haltere and increase chromatin accessibility to activate gene transcription in the distal hinge with the aid of the Hth and Exd Hox co-factor proteins (Loker et al., 2021). Since Ubx is required for haltere fate and the loss of Ubx function results in the transformation of haltere tissue into wing tissue (Lewis, 1978;Bender et al., 1983;Carroll et al., 1995), these data are congruent with the idea that the primary difference between these two serially homologous appendages is the expression of Ubx and that once expressed, the Ubx TF directs haltere development by modulating chromatin accessibility and target gene expression within an initial chromatin landscape capable of forming either a wing or a haltere (McKay and Lieb, 2013;Loker et al., 2021). 
Thus, while many of the accessible genomic regions across imaginal disc tissues are the same, Hox TFs are likely to modify this landscape to activate and/or repress key target genes during cell fate specification and morphogenesis.
The finding that Hox TF binding can increase genomic accessibility raises the possibility that Hox TFs have pioneer-like activities. By definition, pioneer transcription factors can both bind DNA that is wrapped around a nucleosome and promote chromatin remodeling to make DNA elements accessible for other TFs (Iwafuchi-Doi and Zaret, 2014;Zaret, 2020). To assess the ability of Hox TFs to bind inaccessible DNA and promote chromatin opening, recent comparative genomic binding and accessibility studies have been performed for Hox TFs in both a Drosophila cell line (Kc167 cells) (Beh et al., 2016;Porcelli et al., 2019) and in a mouse motor neuron progenitor culture system (Bulajić et al., 2020). Intriguingly, these data indicate that some, but not all, Hox factors can readily bind inaccessible chromatin. By intersecting ATAC-seq and ChIP-seq profiles of the eight Drosophila Hox TFs, Porcelli et al. showed that this ability to bind inaccessible chromatin is shared by the anterior factors Lab, Dfd, and Pb, as well as the posterior Hox factor Abd-B (Figure 1A; Porcelli et al., 2019). Further, by comparing ATAC-seq profiles before and after inducing Dfd and Abd-B expression in respective Kc167 cell lines, Porcelli et al. found that Dfd and Abd-B can increase chromatin accessibility of their targets (Porcelli et al., 2019). These findings are consistent with a previous finding that 42% of Abd-B specific peaks were bound outside of the DNaseI-accessible regions of Kc167 cells (Beh et al., 2016). The enhanced ability of Abd-B to bind inaccessible chromatin was also supported by studies of the mammalian Abd-B orthologs in neural progenitors and undifferentiated motor neurons (Bulajić et al., 2020). In particular, Bulajić et al. found that the HOXC9 and HOXC13 posterior Hox TFs bound significantly more inaccessible genomic regions than HOXC6 and HOXC8, which are classified as central Hox TFs (Figure 1B; Bulajić et al., 2020).
Consistent with these findings, the HOXD13 TF also demonstrated pioneer factor-like activity by increasing chromatin accessibility of targets to guide proximal to distal limb development (Figure 1B; Desanlis et al., 2020), supporting a mechanism in which select Hox factors can bind inaccessible chromatin and increase chromatin accessibility of their targets (Figure 1; Figure 2A).
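The intersection of binding and accessibility profiles described above can be illustrated with a short interval-overlap calculation. This is a minimal sketch, not the pipeline used in the cited studies: it assumes peaks and accessible regions are supplied as half-open `(chrom, start, end)` tuples, treats any overlap as "accessible," and uses made-up coordinates. The function names are hypothetical.

```python
from bisect import bisect_right


def merge_intervals(intervals):
    """Sort and merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def fraction_inaccessible(chip_peaks, accessible_regions):
    """Fraction of ChIP peaks that overlap no accessible region.

    Assumption: a single base of overlap counts as accessible; real analyses
    often require a minimum overlap or use peak summits instead.
    """
    if not chip_peaks:
        raise ValueError("chip_peaks must not be empty")
    by_chrom = {}
    for chrom, start, end in accessible_regions:
        by_chrom.setdefault(chrom, []).append((start, end))
    merged = {c: merge_intervals(iv) for c, iv in by_chrom.items()}
    starts = {c: [s for s, _ in iv] for c, iv in merged.items()}

    inaccessible = 0
    for chrom, start, end in chip_peaks:
        regions = merged.get(chrom, [])
        # Index of the last accessible region that starts before the peak ends.
        i = bisect_right(starts.get(chrom, []), end - 1) - 1
        if i < 0 or regions[i][1] <= start:
            inaccessible += 1
    return inaccessible / len(chip_peaks)


if __name__ == "__main__":
    # Toy coordinates for illustration only.
    accessible = [("chr2L", 100, 200), ("chr2L", 150, 300), ("chr3R", 500, 600)]
    peaks = [("chr2L", 250, 260), ("chr2L", 300, 350), ("chr3R", 590, 610),
             ("chrX", 10, 20), ("chr3R", 400, 500)]
    print(fraction_inaccessible(peaks, accessible))
```

In practice this step is usually done with dedicated interval tools on BED files; the sketch only makes explicit what a statement such as "42% of peaks fall outside accessible regions" is counting.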
While the above findings are congruent with the idea of the posterior Abd-B-like Hox factors being able to readily bind inaccessible DNA, additional studies revealed that not all posterior Hox TFs may equally share such properties. For example, comparative studies between several posterior HOX TFs in the motor neuron progenitor assay revealed clear differences with HOXC9 and HOXC13 binding many more inaccessible regions than HOXC10, HOXA9, or HOXD9 (Figure 1B; Bulajić et al., 2020). Moreover, the ability of the human HOXC13 factor to bind inaccessible DNA was predominantly influenced by the DNA binding domain and C-terminus (Bulajić et al., 2020). Thus, while it has been argued based on structural studies that posterior Hox TFs may have enhanced binding to inaccessible DNA due to high affinity electrostatic interactions between the narrow groove of DNA and the Hox N-terminal arm of the homeodomain (LaRonde-LeBlanc and Wolberger, 2003;Beh et al., 2016), we currently lack a molecular understanding of why only a subset of posterior HOX TFs readily bind inaccessible chromatin.
Figure 1 (caption; opening text not recovered): … (Beh et al., 2016;Porcelli et al., 2019). Furthermore, Exd and Hth expression tend to enhance a factor's ability to bind to inaccessible chromatin (Beh et al., 2016;Porcelli et al., 2019). It is important to note that the ability to bind inaccessible chromatin of Abd-B was not enhanced and the ability of Scr was only slightly enhanced by Exd/Hth. (B) Diagram summarizing the genomic DNA binding activities of human Hox factors in motor neuron cells (Bulajić et al., 2020), mouse embryonic stem cells (Singh et al., 2021), and mouse limb buds (Desanlis et al., 2020). The genomic binding and accessibility profiles were intersected to assess inaccessible chromatin binding. Nearby PBX and MEIS motifs were used to determine co-binding. Drosophila and human Hox factors follow a similar trend that posterior factors can bind inaccessible chromatin more so than central factors.
Figure 2 (caption; opening text not recovered): … (Porcelli et al., 2019).
(B) Pbx/Meis in vertebrates have been shown to bind inaccessible chromatin and promote chromatin opening. In this model, the Hox factor gains access to accessible DNA, forms a complex with Pbx/Meis, and this larger TF complex is required for accurate target regulation (Sagerstrom, 2004;Choe et al., 2014;Mariani et al., 2021). (C) Another potential model is that Hox TFs, a subset of which are capable of binding inaccessible DNA, recruit Exd/Pbx and Hth/Meis and together these complexes promote chromatin opening (Porcelli et al., 2019). Note that in each of these models, the role of the Hox factor in chromatin opening has not yet been confirmed.
[…] (Mann and Affolter, 1998;Uhl et al., 2010;Merabet and Mann, 2016). More recently, these factors have also been shown to influence the binding of Hox TFs as well as other TFs to genomic DNA elements embedded in chromatin. The first study describing such an activity for Pbx and Meis was in association with MyoD, a non-Hox basic-Helix-Loop-Helix transcription factor that promotes muscle cell development (Berkes et al., 2004). Pbx was shown to bind the inactive myogenin promoter in undifferentiated C2C12 myoblast cells at a time-point that preceded MyoD binding by 6 h (Bergstrom et al., 2002;Berkes et al., 2004). The subsequent binding of MyoD with Pbx during differentiation correlated with myogenin promoter activation, consistent with previous studies that found MyoD is able to remodel chromatin and activate target genes (Gerber et al., 1997). Together, these findings suggest that Pbx is binding to and marking the inaccessible chromatin for activation upon recruitment of MyoD, a mechanism that might extend to Pbx's interactions with Hox factors (Sagerstrom, 2004). In fact, Choe et al. found that Pbx/Meis bound numerous loci as early as the zebrafish blastula and that later in development the Hoxb1a TF was required for these loci to become fully active (Choe et al., 2014). More recently, Mariani et al. used a combination of DNA accessibility assays, Pbx ChIP-seq assays, and transcriptomics on wild type and Pbx knockout cells undergoing paraxial mesoderm differentiation to show that Pbx factors are required to bind and open essential chromatin regions during the maturation of paraxial mesoderm cells (Mariani et al., 2021). Importantly, the authors used genome editing to show that a Pbx binding site in a regulatory element of the msgn1 gene that specifies paraxial mesoderm cell fate is required for its chromatin accessibility. Thus, either the loss of the Pbx protein or the disruption of the Pbx binding site resulted in the loss of msgn1 enhancer DNA accessibility and msgn1 gene activation (Mariani et al., 2021).
These data support a model in which Pbx marks and opens the inaccessible chromatin for subsequent gene regulation by Hox factors (Figure 2B). In Drosophila, Porcelli et al. systematically assessed how the Exd and Hth co-factors impact genomic accessibility and Hox DNA binding profiles by taking advantage of the fact that Kc167 cells lack Hth expression, which thereby restricts Exd to the cytoplasm (Porcelli et al., 2019). These studies revealed that the expression of Hth, which concomitantly localizes Exd to the nucleus, generally increased the binding of all the Drosophila Hox factors except Abd-B to inaccessible chromatin (Figure 1A; Porcelli et al., 2019). This was previously shown for Ubx in which ∼30% of the Ubx and Hth specific binding sites did not intersect with the cell line's DNaseI profile prior to Hox gene expression, whereas in the absence of Hth and nuclear Exd only ∼5% of Ubx-bound regions intersected with this DNaseI inaccessible chromatin profile (Beh et al., 2016). Moreover, by comparing chromatin profiles before and after Ubx and Hth induction, Ubx was shown to open the surrounding chromatin of its targets with the help of Hth (Porcelli et al., 2019). These data support the idea that the formation of Ubx/Hth/Exd complexes can promote chromatin remodeling and DNA accessibility, which is consistent with the findings that Ubx increases chromatin accessibility to activate gene transcription in the Hth and Exd expressing cells of the distal hinge in the haltere disc (Loker et al., 2021).
In agreement with these Drosophila findings, a recent study in mouse embryonic stem cells found that, like Lab (Porcelli et al., 2019), the HOXB1 homologue is capable of binding to both inaccessible and accessible DNA (Singh et al., 2021). Intriguingly, by also performing ChIP-seq assays for PBX1 and various chromatin marks, the authors found that those HOXB1 peaks found in inaccessible DNA were not bound by PBX1 and were predominantly located in gene deserts of nucleosome-bound chromatin. In contrast, the HOXB1 regions that were also bound by PBX1 tended to be in more accessible chromatin that correlated with open chromatin marks such as H3K27ac, H3K4me1, and H3K4me3 (Figure 1B; Singh et al., 2021). These findings suggest that while HOXB1 has the capacity to bind inaccessible DNA, it may have a limited ability to convert that binding event into accessible chromatin unless co-bound with the PBX1 factor.
Collectively, the above genomic data in both Drosophila and vertebrates support the idea that the Pbx/Exd and Meis/Hth factors have some degree of pioneer TF activity. Consistent with this model, a recent nucleosome consecutive affinity purification-systematic evolution of ligands by exponential enrichment (NCAP-SELEX) assay provided evidence that MEIS3 is capable of binding nucleosome-bound DNA in vitro (Zhu et al., 2018), and a comparative study of pioneer TFs highlighted that PBX contains a truncated alpha recognition helix that mimics the structure that allows the FOXA3, OCT4, PU.1, and ASCL1 pioneer TFs to bind nucleosome-bound DNA (Fernandez Garcia et al., 2019). In total, these studies provide support for the following model: the Pbx/Exd and Meis/Hth factors can bind inaccessible DNA, promote chromatin opening, and ultimately regulate target gene expression via the subsequent recruitment of Hox TFs as well as other non-Hox TFs such as MyoD (Figure 2B). What remains less clear is whether the Hox TFs are only involved in the final step of target gene activation or whether the Hox TFs also participate with Pbx/Exd and Meis/Hth in the process of chromatin remodeling. Moreover, since at least a subset of Hox TFs also bind inaccessible DNA, it is possible that at some regulatory elements Hox TFs can use a pioneer-like activity to bind inaccessible DNA and subsequently recruit the Pbx/Exd and/or Meis/Hth factors to open chromatin and regulate target gene expression (Figure 2C). Thus, the differential ability of Hox TFs and the Pbx/Exd and Meis/Hth TFs to bind accessible versus inaccessible DNA provides an additional potential regulatory mechanism that may underlie how the anterior, central, and posterior Hox TFs accurately control target gene expression during animal development.

HOX FACTORS AS MULTI-FUNCTIONAL TRANSCRIPTIONAL ACTIVATORS AND REPRESSORS
Once bound to DNA, Hox factors ultimately function by altering the expression of downstream target genes. Unlike some TFs that are thought to function predominantly as transcriptional activators or repressors, the Hox TFs are capable of mediating both transcriptional outcomes (Pearson et al., 2005;Zandvakili and Gebelein, 2016). In the past, considerable work has been done to map activation and repression domains of the Hox factors as well as to determine how mutating these regions impacts transcriptional output. For example, a structure-function study of Abd-A in Drosophila revealed how a Hox protein can utilize multiple distinct Exd interaction domains to differentially regulate target genes and morphological outcomes (Merabet et al., 2011). Further, a combination of mutational analyses and transcriptional output assays using Gal4 drivers showed that Dfd in Drosophila (Li et al., 1999) and HoxA7 in NIH3T3 cells (Schnabel and Abate-Shen, 1996) as well as HoxD4 in P19 embryonal carcinoma cells (Rambaldi et al., 1994) possessed a proline- and alanine-rich region in the N-terminus that can activate transcriptional output. However, this activity was masked by the homeodomain and C-terminus in the context of the full proteins. In fact, there is increasing evidence that the homeodomain itself can be a large driver of transcriptional repression, and that the extent of this repression is paralog specific. A recent study quantitatively measured protein domain transcriptional activity using a novel high-throughput sequencing technique, HT-recruit (Tycko et al., 2020). This study fused a large library of TF protein domains to the rTetR DNA binding domain within a lentivirus and assessed their ability to alter a citrine reporter gene under the control of TetO binding sites.
After subjecting infected cells to doxycycline, cells were sorted for citrine-ON versus citrine-OFF and the read count ratio between the OFF and ON cells was used to quantify the repression capability of each protein domain. Through this technique, they discovered that the repression capability of the Hox homeodomains was collinear with paralog number, such that posterior Hox factors had a more potent repression activity than the anterior Hox factors (Tycko et al., 2020). The authors then connected the enhanced repression of posterior factors to a more positively charged N-terminal arm in the homeodomain, specifically an RKKR motif (Tycko et al., 2020). This connection is consistent with a previous mutational study that found that mutating a similar region of HoxA7 to the amino acids of HoxB4 resulted in reduced repression activity (Schnabel and Abate-Shen, 1996). Altogether, these findings highlight the importance of the homeodomain in repression and exemplify how transcriptional outputs across Hox proteins can vary. Moreover, these data provide further evidence that the same Hox TFs, such as HoxA7, HoxD4, and Abd-A, can have both functional activation and repression domains. Given that Hox TFs have the capacity to both activate and repress transcription, it is not surprising that a wide variety of co-activator and co-repressor proteins have been shown to physically interact with Hox TFs (reviewed in Mann et al., 2009;Zandvakili and Gebelein, 2016;De Kumar and Darland, 2021). Many large-scale interactome analyses have been performed with Hox factors, and each of these has identified a substantial number of potential protein-protein interactions that could modify the ability of the Hox TFs to mediate gene activation and/or gene repression (Giot et al., 2003;Stanyon et al., 2004;Rual et al., 2005;Yu et al., 2011;Lambert et al., 2012;Rolland et al., 2014;Bischof et al., 2018;Shokri et al., 2019;Carnesecchi et al., 2020;Luck et al., 2020).
For example, the Ubx and Abd-A Hox factors were screened for interactions against 260 different TFs in the Drosophila embryo using a split-fluorescence assay coupled with ectopic expression using the Gal4-UAS system (Bischof et al., 2018). Unexpectedly, both of these Hox TFs interacted with a large number of the tested TFs, as Ubx interacted with 163 of the 260 TFs (62%), and Abd-A interacted with 149 of the TFs (57%) (Bischof et al., 2018). However, it should be noted that an additional large-scale TF-TF interaction screen tested a number of different Hox TFs using a yeast two-hybrid assay and found that the Hox TFs, including Ubx and Abd-A, interact with relatively few tested TFs (Shokri et al., 2019). These conflicting results are likely to be attributed to differences in sensitivity between the two assays as well as the fact that the fluorescence complementation assay was performed in Drosophila cells that express additional co-factor proteins that may allow large-scale complex formation, whereas the two-hybrid approach was performed in yeast. More recently, a proximity-dependent Biotin IDentification (BioID) assay in multiple cell types of the Drosophila embryo revealed that Ubx interacts with many proteins involved in processes from chromatin modification to mRNA processing (Carnesecchi et al., 2020). Surprisingly, however, while most of the BioID-identified Ubx interactions were found to occur in a tissue-specific manner, the vast majority of the proteins that do interact with Ubx are broadly expressed across many tissues (Carnesecchi et al., 2020). This finding raises the possibility that the ability of Ubx to interact with ubiquitous regulatory proteins can be modified in a tissue-specific manner, although the mechanisms regulating such tissue-specific interactions are currently unknown.
Nevertheless, these data raise the possibility that the Hox TFs gain in DNA binding specificity by forming complexes with numerous additional TFs, many of which are expressed in a tissue-restricted manner, and gain in regulatory specificity (i.e., activate versus repress) by interacting with many different co-activator and co-repressor proteins that are widely expressed in numerous cell types.
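The OFF/ON read-count ratio used in the recruitment screen described earlier in this section can be sketched as a simple per-domain score. The code below is an illustration under stated assumptions, not the analysis of Tycko et al. (2020): the depth normalization, the pseudocount, the log2 transform, the function name `log2_off_on`, and the toy counts and domain labels are all choices made here for clarity.

```python
import math


def log2_off_on(off_counts, on_counts, pseudocount=1.0):
    """Per-domain log2(OFF/ON) score from sorted-population read counts.

    Assumptions (not taken from the source): counts are normalized to the
    total reads of each sorted population, and a pseudocount avoids division
    by zero. Positive scores indicate enrichment in reporter-OFF cells,
    i.e. stronger repression.
    """
    off_total = sum(off_counts.values())
    on_total = sum(on_counts.values())
    if off_total == 0 or on_total == 0:
        raise ValueError("both sorted populations need at least one read")
    scores = {}
    for domain in off_counts.keys() | on_counts.keys():
        off = (off_counts.get(domain, 0) + pseudocount) / off_total
        on = (on_counts.get(domain, 0) + pseudocount) / on_total
        scores[domain] = math.log2(off / on)
    return scores


if __name__ == "__main__":
    # Hypothetical read counts for two homeodomain constructs.
    off = {"posterior_homeodomain": 300, "anterior_homeodomain": 100}
    on = {"posterior_homeodomain": 100, "anterior_homeodomain": 300}
    for name, score in sorted(log2_off_on(off, on).items()):
        print(f"{name}: {score:+.2f}")
```

With these toy counts the posterior construct scores above zero and the anterior construct below zero, mirroring the direction of the trend reported for the Hox homeodomains.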

CASE STUDIES ON THE CIS-REGULATORY LOGIC OF HOX TRANSCRIPTION FACTORS
To better understand how Hox TFs regulate target genes in specific tissues, a select number of cis-regulatory modules (CRMs) have been extensively characterized using a combination of DNA binding assays, transcriptional reporter assays, and loss- and gain-of-function genetics. In this review, we focus on our current knowledge of the cis-regulatory logic of two well-characterized Drosophila CRMs, one of which is specifically regulated by Abd-A and the other of which is regulated by the Abd-A, Ubx, and Antp Hox factors. Intriguingly, Abd-A regulates these two CRMs in different cell types and in opposing ways. In the developing peripheral nervous system, Abd-A triggers the secretion of epidermal growth factor ligands from a specific subset of abdominal sensory organ precursor cells by activating the expression of the rhomboid (rho) serine protease gene via a highly conserved CRM called RhoA (Brodu et al., 2002;Li-Kroeger et al., 2008). In contrast, Abd-A, as well as Ubx, suppresses leg development in abdominal segments by repressing the expression of the Distal-less (Dll) homeodomain protein in ectodermal cells via the Dll conserved regulatory element (DCRE) (Vachon et al., 1992;Gebelein et al., 2002, 2004). In addition to being repressed by both Abd-A and Ubx in the abdomen, the DCRE CRM can also be activated by the Antennapedia (Antp) Hox factor in thoracic segments (Uhl et al., 2016).
To determine how these CRMs recruit specific Hox factors to mediate distinct cell type- and segment-specific outputs, comparative studies on the TF binding sites (TFBSs), the TF complexes, and the genetic requirements of each TF have revealed several insights into the principles underlying Hox cis-regulatory logic (Figure 3). First, the same Hox, Exd, and Hth binding sites are capable of mediating either activation or repression. The RhoA CRM contains a single set of adjacent Exd/Hth/Hox binding sites (Figure 3B; Brodu et al., 2002;Li-Kroeger et al., 2008), whereas the DCRE CRM contains three Hox sites, each of which is directly adjacent to an Exd or Hth site (Figures 3C,E;Gebelein et al., 2002, 2004;Uhl et al., 2016). Each configuration of binding sites is capable of cooperatively binding Abd-A/Hth/Exd complexes. Further, mutations within these binding sites disrupt the ability of Abd-A to either activate gene expression in sensory cells (Figure 3B; Li-Kroeger et al., 2012) or repress gene expression in the abdominal ectoderm (Figures 3C,E;Gebelein et al., 2002, 2004;Uhl et al., 2016). Moreover, swapping the "activating" Exd/Hth/Hox sites from the RhoA CRM into the DCRE demonstrated that the Abd-A Hox factor can also use this same configuration of binding sites to mediate transcriptional repression (Figure 3D; Zandvakili et al., 2019). These data suggest that differences in the configuration of Exd, Hth, and Hox TFBSs do not explain how the Abd-A Hox complex mediates distinct outcomes in different cell types.
Second, accurate Hox-dependent transcriptional outcomes by the RhoA and DCRE CRMs require nearby TFBSs for additional tissue-restricted TFs (Figure 3A). For example, the RhoA CRM encodes a binding site for the Pax2 TF (Figure 3B; the Drosophila Pax2 gene name is shaven, sv, but for simplicity we will call it Pax2), and mutations within the RhoA CRM that disrupt Pax2 binding compromise Abd-A-mediated activation (Li-Kroeger et al., 2012; Zandvakili et al., 2018). Moreover, Pax2, Abd-A, Exd, and Hth could utilize these TFBSs to form specific TF complexes on the RhoA CRM (Li-Kroeger et al., 2012), and altering the spacing and orientation between the Pax2 and Exd/Hth/Hox sites disrupted RhoA activity in abdominal sensory organ cells (Zandvakili et al., 2019). Given that the expression of the Drosophila Pax2 gene is predominantly restricted to sensory organ cells in the embryo (Li-Kroeger et al., 2012), the direct regulation of the RhoA CRM by Abd-A and Pax2 provides insight into both the abdominal- and sensory-specific activity of this enhancer.
The DCRE CRM similarly requires additional TFBSs to mediate abdominal-specific repression by the Ubx and Abd-A Hox factors (Figures 3C,E; Gebelein et al., 2004). In fact, these two Hox factors were found to require different TFs in the anterior versus the posterior compartments of the abdominal segments. In the posterior compartment cells, the Ubx and Abd-A Hox factors form cooperative complexes with the Engrailed (En) TF on adjacent binding sites within the DCRE (Figure 3E). In the anterior compartment, the Ubx and Abd-A Hox factors cooperate with the Drosophila FoxG (dFoxG) factors, which are encoded by the largely redundant sloppy-paired 1 (slp1) and sloppy-paired 2 (slp2) genes, via nearby Hox and dFoxG binding sites within the DCRE (Figure 3C; Gebelein et al., 2004). Moreover, as in the RhoA CRM, the spacing between the Hox and dFoxG TFBSs contributed to optimal activity, as adding a 5 bp sequence disrupted repression (Zandvakili et al., 2019). However, adding longer spacer sequences between the Hox and dFoxG sites (+10, +15, and +20 bp) resulted in strong transcriptional repression, suggesting that unlike the Pax2 and Hox sites in RhoA, the configurations of dFoxG and Hox sites in the DCRE are not rigidly fixed to mediate transcriptional repression (Zandvakili et al., 2019). Since the dFoxG factors are specifically expressed in the anterior compartment cells, whereas En is specifically expressed in the posterior compartment cells, these findings again highlight how a CRM can integrate both Hox TFs and tissue-restricted TFs to yield accurate abdominal-specific outcomes.
Third, the DCRE CRM can contribute to both Hox-mediated transcriptional repression in the abdomen and Hox-mediated transcriptional activation in the thorax. In addition to repressing Dll expression in the abdomen, the DCRE can also use a subset of the Hox TFBSs to stimulate gene expression in the thoracic leg primordia cells via the Antp Hox TF (Uhl et al., 2016). In particular, Antp can utilize the two Hox/Exd sites, but not the adjacent Hth/Hox site, to stimulate DCRE-mediated activation in thoracic cells (Figure 3F). However, unlike DCRE-mediated repression, the dFoxG and the En binding sites are not required for this Hox-dependent activity, suggesting that Antp is likely to cooperate with other TFs to stimulate the DCRE (Figure 3F; Uhl et al., 2016).
Additional cis-regulatory studies on the six2 target gene in mammals have also revealed that the same Hox binding sites can be used to activate six2 via Hox11 factors in the kidney and repress six2 via Hoxa2 in the branchial arch and facial mesenchyme (Yallowitz et al., 2009). Thus, the same Hox binding sites can contribute to both activation and repression, but the appropriate transcriptional response will likely depend upon integrating distinct combinations of additional TFs.
While the thorough characterization of the RhoA and DCRE CRMs provides new insight into the cis-regulatory logic of segment- and cell type-specific transcriptional responses, does this cis-regulatory logic provide insight into how Abd-A represses the DCRE and activates the RhoA CRM? Currently, it is unclear how Abd-A/Pax2 complexes activate the RhoA CRM in sensory cells, but it is interesting to note that vertebrate Pax2 and Hox11 factors are thought to collaborate to activate the six2 target gene, and Hox11 contains an activation domain required for this function (Gong et al., 2007; Yallowitz et al., 2009). Moreover, the integration of Abd-A with either the Slp1/2 dFoxG factors or the En homeodomain factor provides a likely mechanism of repression, as both dFoxG and En have been shown to recruit the well-established Groucho co-repressor protein (Jiménez et al., 1997; Andrioli et al., 2004). In addition, Agelopoulos et al. used a novel cell- and gene-specific ChIP strategy to demonstrate that while the regulatory element containing the DCRE loops and contacts the Dll promoter in thoracic segments, consistent with gene activation, the DCRE region does not contact the Dll promoter in the abdominal segments (Agelopoulos et al., 2012). Using whole-embryo ChIP, the authors then found that the Dll enhancer region containing the DCRE was not only bound by Ubx and Abd-A but was also highly correlated with the histone variant H2Av. These data suggest that Ubx and/or Abd-A may recruit H2Av and thereby decrease interactions between the DCRE and the Dll promoter (Agelopoulos et al., 2012). These data also further highlight how some Hox factors, most notably Ubx, can activate and repress gene expression by both increasing and decreasing chromatin accessibility in a gene-specific manner (Agelopoulos et al., 2012; Loker et al., 2021).

DISCUSSION
Several properties of the Hox factors make it particularly challenging to develop a general model that predicts the outcome of a Hox binding event at a regulatory element. First, Hox factors are expressed within many cell types of a segment, and yet most Hox target genes are regulated in only a small subset of the cells that express the Hox TF. Second, as mentioned throughout this review, all the members of the Hox TF family share a homeodomain that binds highly similar AT-rich sequences, and yet Hox TFs specify different segment identities and regional cell fates. Thus, we need to determine both the mechanisms by which the same broadly expressed Hox TF can regulate target genes in a cell type-specific manner, and the mechanisms that make each Hox TF different from the others so that it regulates the distinct combinations of target genes needed to specify different cell fates along the A-P axis.
This review summarized several studies that suggest select Hox factors can bind to inaccessible chromatin by intersecting genome binding and chromatin accessibility profiles (Beh et al., 2016;Porcelli et al., 2019;Bulajić et al., 2020). Moreover, at least in the case of Dfd and Abd-B, Hox TFs have the potential to increase chromatin accessibility even in the absence of nuclear Exd and Hth (Porcelli et al., 2019). However, intersecting genome binding and chromatin accessibility profiles only provides correlative evidence of pioneer-like activity, as it is possible that other TFs regulated by Dfd and Abd-B are ultimately the ones that open the chromatin. Thus, additional studies that combine the use of ATAC-seq, ChIP-seq, and Hox mutational analysis with genomic editing of known Hox binding sites will be required to confirm or refute Hox pioneer-like activity, much like the studies of Mariani et al. for PBX established pioneer activity in mouse epiblast stem cells (Mariani et al., 2021).
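The intersection of genome binding and chromatin accessibility profiles discussed here can be illustrated with a short Python sketch. It is a minimal example under stated assumptions, not the pipeline used in the cited studies: the function names `overlaps` and `classify_peaks`, the half-open (chromosome, start, end) interval representation, and all coordinates are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[str, int, int]  # (chromosome, start, end), half-open coordinates


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if two half-open intervals on the same chromosome share >= 1 bp."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def classify_peaks(hox_peaks: List[Interval], open_regions: List[Interval]):
    """Split Hox-bound peaks by overlap with accessible chromatin.

    Assumption: `open_regions` were measured in cells that do NOT express
    the Hox factor, so they represent pre-existing accessibility.
    """
    accessible, inaccessible = [], []
    for peak in hox_peaks:
        if any(overlaps(peak, region) for region in open_regions):
            accessible.append(peak)
        else:
            inaccessible.append(peak)
    return accessible, inaccessible


# Toy data (hypothetical coordinates, for illustration only).
hox_peaks = [("chr2L", 100, 300), ("chr2L", 1000, 1200), ("chr3R", 50, 250)]
open_regions = [("chr2L", 250, 400), ("chr3R", 500, 700)]

accessible, inaccessible = classify_peaks(hox_peaks, open_regions)
print("bound in accessible chromatin:", accessible)
print("bound in inaccessible chromatin:", inaccessible)
```

Peaks that fall in the inaccessible class are only candidates for pioneer-like binding. As noted above, such a partition is correlative and does not show that the Hox factor itself opens the chromatin.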
Through a combination of large-scale genomic, bioinformatics, and protein interactome approaches, the field has recently identified numerous Hox-bound genomic elements as well as many Hox protein interactors that are likely to contribute to the diverse regulatory potential of Hox factors. Of particular note is that Hox TFs were found to form complexes with a surprisingly large number of other sequence-specific TFs (Bischof et al., 2018; Carnesecchi et al., 2020). Taken together with the finding that several well-characterized Hox regulatory elements require TFBSs for additional TFs that are expressed in tissue- and cell type-restricted patterns, these results suggest that Hox-regulated CRMs function by integrating specific Hox TFs with numerous other TFs to yield accurate segment-, cell-, and gene-specific regulatory outcomes. The question that arises from these studies is: do Hox TFs regulate each target gene through interactions with distinct combinations of TFs, such that each Hox-regulated CRM contains a relatively unique combination of TFBSs? Or do Hox TFs regulate many different target genes via interactions with a common group of TFs, such that potential cis-regulatory codes can be identified and used to predict Hox-regulated CRM output?
To answer these questions, future experiments will be needed to generate additional genomic binding data and transcriptomic data for Hox TFs as well as their potential partner proteins in defined cell types. For example, intersecting ChIP-seq data for Hox TFs, Pbx/Exd, Hth/Meis, and other TFs from the same cell types would allow one to segregate Hox genomic binding events into distinct bins that are associated with the binding, or lack thereof, of additional TFs. Moreover, the use of higher resolution binding assays such as Cleavage Under Targets & Release Using Nuclease (CUT&RUN) or ChIP-Exonuclease can provide near-base-pair resolution that reveals whether adjacent sites are also occupied near the Hox TF binding site. Such an approach was recently utilized for the Gsx2 homeodomain TF to reveal distinct monomer versus dimer binding events using CUT&RUN assays and nucleotide footprinting analysis (Salomone et al., 2021). By combining high-resolution genomic binding data with transcriptomic studies using wild-type and specific mutant cells (e.g., Hox mutant, Pbx/Exd mutant), we will be better positioned to define which binding events are associated with gene expression changes. Lastly, bioinformatics can be used to perform unbiased searches for additional TF motifs as well as to search for potential constraints on the relationships between Hox TF sites and other TFBSs. Such an approach has already shown that many Hox genomic binding events are enriched for coupled Pbx/Hox (vertebrates) or Exd/Hox (Drosophila) motifs, even when the genomic binding assay was performed using a complex tissue composed of many cell types (Loker et al., 2021; Singh et al., 2021). These findings are consistent with the Pbx/Exd TFs serving as widespread Hox co-factor proteins in many tissues.
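A search for constraints on the relationships between Hox sites and other TFBSs could begin by tabulating the spacing and relative orientation of motif pairs within bound regions. The Python sketch below is one minimal way to do this, assuming exact string matching rather than position weight matrices. The two motif strings are placeholders rather than validated binding sites, and the function names and toy sequence are hypothetical.

```python
from typing import List, Tuple

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(seq: str, motif: str) -> List[Tuple[int, str]]:
    """Return (start, strand) for every exact match of `motif` on either strand."""
    rc = reverse_complement(motif)
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        elif window == rc:
            hits.append((i, "-"))
    return hits


def pair_spacings(seq: str, motif_a: str, motif_b: str, max_gap: int = 30):
    """List (gap_bp, orientation) for each motif_a/motif_b pair within `max_gap` bp.

    The gap is the number of bases between the two sites (0 = directly
    adjacent); overlapping sites give a negative gap and are skipped.
    """
    pairs = []
    for a_start, a_strand in find_sites(seq, motif_a):
        for b_start, b_strand in find_sites(seq, motif_b):
            if a_start < b_start:
                gap = b_start - (a_start + len(motif_a))
            else:
                gap = a_start - (b_start + len(motif_b))
            if 0 <= gap <= max_gap:
                orientation = "same" if a_strand == b_strand else "opposite"
                pairs.append((gap, orientation))
    return sorted(pairs)


# Placeholder motifs and a toy sequence (hypothetical, for illustration only).
MOTIF_A = "TGATTTAT"
MOTIF_B = "GTCACG"
toy_seq = ("CC" + MOTIF_A + "AAAAA" + MOTIF_B + "CCCC"
           + reverse_complement(MOTIF_A) + "GG")

for gap, orientation in pair_spacings(toy_seq, MOTIF_A, MOTIF_B):
    print(f"gap = {gap} bp, orientation = {orientation}")
```

Collecting these pairs across many bound regions and comparing the gap distribution with that of shuffled or unbound control sequences would indicate whether a particular spacing or orientation is favored.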
Moreover, a recent study of HoxB1 combined genomic binding assays with transcriptomics and unbiased motif enrichment analysis to show that HoxB1 genomic binding events associated with gene repression, but not gene activation, are enriched for TFBSs for the REST transcriptional repressor (Singh et al., 2021). Hence, future studies focused on genomic binding assays for many Hox and other TFs in specific tissues will be needed to determine which TFs are likely to collaborate with specific Hox factors. Armed with the sequences of these potential regulatory elements, bioinformatics approaches will help to reveal whether specific cis-regulatory codes underlie how Hox TFs are integrated with each different TF to regulate cell-specific gene expression.
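The kind of motif enrichment comparison described above can be framed as a one-sided hypergeometric (Fisher's exact) test. The Python sketch below shows the calculation for a single motif. It is not the analysis performed by Singh et al.; the function name `motif_enrichment_pvalue` and all counts are hypothetical.

```python
from math import comb


def motif_enrichment_pvalue(k: int, n: int, K: int, N: int) -> float:
    """One-sided hypergeometric P(X >= k).

    N: total Hox-bound regions examined
    K: bound regions that contain the motif
    n: bound regions linked to the outcome of interest (e.g., repressed genes)
    k: regions that are both linked to that outcome and contain the motif
    """
    upper = min(n, K)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / comb(N, n)


# Hypothetical counts, for illustration only: 20 bound regions, 6 with the
# motif, 8 near repressed genes, 5 of which contain the motif.
p_value = motif_enrichment_pvalue(k=5, n=8, K=6, N=20)
print(f"enrichment p-value: {p_value:.4f}")
```

In practice the test would be repeated for every motif in a library and for each expression class (activated, repressed, unchanged), with correction for multiple testing.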

AUTHOR CONTRIBUTIONS
The topic of this review was jointly conceived by BC and BG. The initial draft of the manuscript was written by BC and BG. The figures were initially generated by BC and edited by BG.

FUNDING
This work was supported by a National Institutes of Health grant #GM079428 to BG.