Expression of Multiple engrailed Family Genes in Eyespots of Bicyclus anynana Butterflies Does Not Implicate the Duplication Events in the Evolution of This Morphological Novelty

Gene duplication events often create genetic redundancy that can either lead to the appearance of pseudogenes or, instead, create opportunities for the evolution of novel proteins that can take on new functions. One of the genes which has been widely studied with respect to gene duplication is engrailed (en). En-family proteins are expressed in a morphological novelty, eyespots (in the center and in the outer gold ring), in the African squinting bush brown butterfly Bicyclus anynana, as well as in a more conserved pattern, the posterior compartment of a wing. In the present study, we used whole-genome sequencing and transcriptome data to show the presence of three en-family genes and their differential expression on the pupal wings of B. anynana using in situ hybridization. The results suggest two duplication events of en-family genes, the first evidence of a two-fold duplication in the Lepidoptera. We propose that all copies initially had posterior wing compartment expression and all copies subsequently gained a novel expression domain associated with eyespot centers. Two copies secondarily lost the posterior compartment expression, and one copy alone gained the outer ring expression domain. By dating the origin of both duplication events, however, we conclude that they predate the origin of eyespots by at least 60 mya, and hence our data does not support the retention of the multiple en gene duplicates in the genome via their involvement with the novel eyespot evolutionary innovation.


INTRODUCTION
Gene duplication plays an essential role in the evolution of novel proteins, and this process has also been proposed to lead to the evolution of major morphological innovations such as flowers; the chambers of the heart; and brains, bones, and cartilage in vertebrates (Theißen, 2001;Olson, 2006;Conant and Wolfe, 2008;Wagner and Lynch, 2010;Zhang et al., 2015). When genes duplicate, there is initially genetic redundancy which allows one of the copies to quickly accumulate mutations and become converted into a pseudogene (Lynch et al., 2001;Zhang, 2003). However, genetic redundancy can also lead to opportunity. When gene duplicates remain intact during the evolutionary processes this usually means one of three things: (1) that either the double dose of RNA and/or protein increased fitness (Zhang, 2003); (2) that each copy divided the original functions and/or expression domains of the protein among themselves, a process known as sub-functionalization that reduces the total number of functions taken over by each copy and lessens pleiotropy (Force et al., 1999);(3) or that one of the copies gained a novel function, known as neofunctionalization (Force et al., 1999). Examples include subfunctionalization of the gene engrailed (en) in zebrafish, which resulted in the differential expression of the copies in the pectoral appendage, and in neurons (Force et al., 1999); and neo-functionalization of eosinophil-derived neurotoxin (EDN) and eosinophil cationic protein (ECP) in humans and old world monkeys from EDN, where after the duplication ECP developed a novel anti-pathogenic function (Zhang et al., 1998). Neofunctionalization has been of particular interest since the origin of novel gene duplicates has been loosely associated with the origin of evolutionary innovations (Conant and Wolfe, 2008;Wagner and Lynch, 2010), such as with beetle horns (Zattara et al., 2016), human vision (Yokoyama and Yokoyama, 1989), and betalain pigments in plants (Brockington et al., 2015).
The gene engrailed (en) has been heavily studied with respect to gene duplications (Force et al., 1999). Duplication of en in several independent metazoan lineages has generated a family of related genes, varying from one to four copies, where the paralogs display varying degrees of divergence in expression and function (Peterson et al., 1998;Abzhanov and Kaufman, 2000;Gibert et al., 2000;Damen, 2002;Marie and Blagburn, 2003;Peel et al., 2006).
In addition to its conserved function in segmentation during invertebrate embryogenesis (Manzanares et al., 1993;Brown et al., 1994;Fujioka et al., 2002), en is also associated with morphological novelties in several species. For instance, en is involved in neurogenesis (Patel et al., 1989), in determining the fate of glial and neuronal cells in grasshopper median neuroblast (Condron et al., 1994), in axonal targeting in centipedes (Whitington et al., 1991), and in the patterning of insect wing veins (Guillén et al., 1995;. In addition, while no function is yet known, en is expressed in precursor cells that build a mollusc's shell (Nederbragt et al., 2002), in the tentacles of the cephalopod Sepia officinalis (Baratte and Bonnaud, 2009), in bacteriocytes of the aphid, Acyrthosiphon pisum (Braendle et al., 2003), and in the eyespots of saturniid moths (Monteiro et al., 2006) and nymphalid butterflies (Keys et al., 1999;Brunetti et al., 2001).
Eyespots in butterfly wings are novel traits that appear to be specified by old transcription factors and signaling pathways, deployed in novel ways (Keys et al., 1999;Brunetti et al., 2001;Oliver et al., 2012;Monteiro, 2015). Using an antibody that detects a variety of Engrailed proteins (Patel et al., 1989), Keys et al. (1999) found that En-family proteins are expressed in both conserved and novel domains in butterfly wings. En-family proteins maintained their conserved expression domain, and likely their role, in establishing the posterior compartment of each embryonic body segment including that of the wings during the larval stage (Carroll et al., 1994;Keys et al., 1999;). However, one or more of the genes encoding En or its paralog Invected (Inv) were also expressed in the eyespot centers, in late larval and pupal wings, and in a pattern mapping to the ring of gold scales during the pupal stage of wing development in Bicyclus anynana butterflies (Brakefield et al., 1996;Brunetti et al., 2001). A detailed investigation of the number of en family gene copies present in the genome of butterflies, however, and of the expression domains of each paralog in eyespot patterns is still missing.
Here we provide additional insights on the association between the retention of duplicated genes and their involvement in the origin of novel traits by detailing the expression domain of multiple en copies in B. anynana wings, asking whether one or more copies are exclusively associated with the eyespot patterns, and testing whether the gene duplication events are closely associated with the origin of eyespots. We report the presence of three en-family genes in B. anynana, two of which contain all the five major domains of En-family proteins (Peel et al., 2006) and one contains four of these domains. We used sequencing data, to analyze the phylogenetic relationship between the copies, and whole-mount in situ hybridizations to examine the expression of all three copies in the pupal wings. Two of the copies are closely related to each other and have a single expression domain on the pupal wing, in the eyespot centers, while the other copy displays three wing expression domains, in the posterior compartment, in the eyespot centers, and in the golden ring. The timing between the last en-family gene duplication and the origin of eyespots, however, spans at least 60 mya. We discuss the implications of our data on the mechanisms of evolution of these morphological novelties.

MATERIALS AND METHODS
Rearing Bicyclus anynana B. anynana butterflies were reared in the lab at 27 • C, 60% relative humidity, and 12-12 h day-night cycle. Larvae were fed young corn leaves and adults were fed mashed bananas.

Sequence Alignment and Phylogenetic Tree Construction
Nucleotide sequence alignment was carried out using ClustalW (Thompson et al., 1994) with the default parameters in 'SLOW/ACCURATE' pairwise alignment option in GenomeNet and visualized in Geneious v10.1.3 (Kearse et al., 2012). Nucleotides were converted into protein sequence via NCBI ORF finder. The protein sequences were aligned via ClustalW (Thompson et al., 1994) with gap open cost: 10, and gap extend cost 0.1; and afterward, manually edited and visualized in Geneious v10.1.3 (Kearse et al., 2012).
A maximum likelihood tree was inferred using RAxML v8.1.20 ran with model PROTGAMMAJTT and default parameters with 100 bootstraps (Stamatakis, 2014) using ETE v3.1.1 (Huerta-Cepas et al., 2016 implemented on GenomeNet. The same tree topology was also obtained using the following other methods and software:

Probe Design of the en-Family Genes
Sequence used for en probe: A 628 bp region in the exon 1 of en CDS was chosen for designing a probe. This region is absent in inv and inv-like and does not align well with these genes (∼35% bp similarity) (Supplementary Figure S5). Sequences used for inv and inv-like probes: The regions used for probe preparation are Exon 1 of inv, which is not present in inv-like, and the 5 UTR of inv-like, which is not present in inv. The sequence used for the inv probe has 24.9% identical bps to that of the en gene sequence, and 36.03% identical bps to that of the inv-like gene sequence in their overlapping regions. The sequence used for the inv-like probe has 30.2% identical bps to that of the en gene sequence, and 36.8% identical bps to that of the inv gene sequence in the overlapping region (Supplementary Figure S5). This level of sequence similarity is lower than that observed between highly unrelated genes such as en and optix, which share 41.2% identical sequences. The presence of a very low sequence identity and a small overlap as observed in Supplementary Figure S5 makes it unlikely for the two probes to produce results due to cross-reaction.
Control stainings: inv-like and inv are mostly identical in the CDS region. Due to this limitation, we used a 182 bps long region in the 5 UTR of inv-like ( Figure 1A) which is small in size and can produce non-specific stainings. To test for presence of non-specific stainings we performed inv-like sense stainings using a probe prepared with the same sequence ( Supplementary  Figures S1p,q). Sense stainings were also performed for en and inv (Supplementary Figures S1d-f,j-l).
Prior to probe preparation, amplified fragments of each paralog were sequenced and aligned to ensure that the correct amplicons were used for in situ hybridization.

In situ Hybridization
One to two days old pupal wings, timed for moment of pupation using a time-lapse camera, were dissected in 1X PBS at room temperature based on a previously published protocol  and transferred into 1X PBST with 4% formaldehyde for 30-35 min. Afterward, the wings were washed three times in 1X PBST and treated with 20 mg/ml proteinase K (NEB, P8107S) in 1 ml 1X PBST and then with 2 mg/ml glycine in 1X PBST. Wings were then washed three times in 1X PBST and gradually transferred into a prehybridization buffer (composition in Supplementary Table S4) and incubated at 60 • C for 1 h prior to transfer into hybridization buffer (composition in Supplementary Table S4) supplemented with 100 ng/µl of probe. The wings were incubated in hybridization buffer at 60 • C for 16-20 h and washed five times with prehybridization buffer at 60 • C to remove any unbound probe. To prevent crosshybridization of the probes, experiments with each probe were performed on separate days. Afterward, wings were brought to room temperature and gradually transferred to 1X PBST and then incubated in block buffer (composition in Supplementary  Table S4) for 1 h. 1:3000 dilution of anti-Digoxigenin was then added and incubated for 1 h followed by five washes with block buffer. The wings were then transferred to an alkalinephosphatase buffer (composition in Supplementary Table S4) supplemented with NBT/BCIP. Imaging was done under a Leica DMS1000 microscope.

Antibody Staining
One to two days old pupal wings, timed using a time-lapse camera, were dissected in 1X PBS and immediately transferred into a fixation buffer supplemented with 4% formaldehyde at room temperature (composition in Supplementary Table S5) for 30-35 min. Wings were then moved to ice and washed four times with 1X PBS and transferred to block buffer for 1 day (composition in Supplementary Table S5). The next day wings were incubated in primary antibodies against En/Inv [1:15, mouse 4F11, a gift from Patel et al. (1989)], followed by antimouse AF488 (Invitrogen, #A28175) secondary antibody. Wings were then washed and mounted on an inhouse mounting media (composition in Supplementary Table S5) and imaged under an Olympus fv3000 confocal microscope.

Three en-Family Copies Are Present in the Genome of Bicyclus anynana
Whole-genome sequencing (Nowell et al., 2017) and transcriptomic data (Macias-Munoz et al., 2016;Özsu and Monteiro, 2017) revealed three en-family genes present in the genome of B. anynana within a single scaffold. We named the first copy engrailed (en) based on the presence of specific sequence domains described below. It has three exons and three splice variants. The 5 UTR of the splice variants are the same, but the 3 UTR region has different lengths across the three variants. The sequences are present between 136,804 and 158,542 bps  Figures 1A,B). The transcript is translated into a 352 aa sequence containing all the five conserved domains of En-family proteins and the two En specific domains (Hui et al., 1992) (Figure 2A). The second copy we named invected (inv). It has two exons present between 249,290 and 235,356 bps of the same scaffold ( Figure 1A). The transcript is translated into a 479 aa sequence containing all the five conserved domains of En-family proteins and the Inv specific LVSG and RS domains (Hui et al., 1992;Peel et al., 2006) (Figure 2A). The third copy we named invected-like (inv-like). It has two exons with a 5 UTR as identified using a pupal brain (Macias-Munoz et al., 2016) and an eyespot-specific transcriptome (Özsu and Monteiro, 2017) (see Supplementary File for sequence). The sequence is present between 189,552 and 191,029 bps of the same scaffold ( Figure 1A). The transcript is translated into a 125 aa sequence containing four domains of En-family proteins with high similarity to Inv (Figure 2A and results below for details).

Conserved Domains of En-Family Proteins in B. anynana
En and Inv proteins contain all five domains of high conservation previously identified for En-family proteins, while Inv-like contains three full domains and one partial domain (Figure 2A) (Logan et al., 1992). The function of most of these regions has been determined: EH1 and EH2 are binding sites for the transcriptional co-repressor Groucho (Tolkunova et al., 1998) and the co-activator Exd (Peltenburg and Murre, 1996) respectively, EH4 is the homeodomain (Fjose et al., 1985) and EH5 represses En targets (Han and Manley, 1993). The function of EH3 is not known. The EH1 domain of En and Inv are 13 aa each and share 61.5% identity. The EH2 domain of En and Inv are 19 and 21 aa respectively and share 90.5% identity, while Inv-like has only three aa from this domain. The EH3 domain is highly divergent among species and paralogs. En has 11 aa, while Inv and Inv-like have 15 aa each for this domain, respectively. The EH4 domain is a 60 aa homeodomain region which is highly conserved in the three paralogs. Inv and Inv-like are 100% identical in this domain while En has 95% identity with the other two. The EH5 domain is an 18 aa region which is also highly conserved. Inv and Inv-like are 100% identical in this region while En shares 94.4% identity with the other two ( Figure 2B).

Phylogenetic Analysis of En-Family Proteins in Lepidoptera
Phylogenetic analysis of the highly conserved sequences from the EH2 to EH5 domains of En-family proteins from nine lepidopteran species using RaxML (Randomly Accelerated Maximum likelihood) (Stamatakis, 2014) resulted in the clustering of all En sequences as a sister lineage to Inv and Invlike sequences in Lepidoptera ( Figure 2C). The same results were obtained using PhyMl (Guindon et al., 2010)  Frontiers in Ecology and Evolution | www.frontiersin.org while En has 90.1% similarity to Inv, and 91.2% similarity to Inv-like. Furthermore, the nucleotide sequences of inv and inv-like are 98.25% similar; while en is 80.07% similar to inv, and 81.81% similar to inv-like at the conserved EH4 and EH5 regions (Supplementary Figure S5 and Supplementary Table S1). The presence of highly conserved sequences between Inv and Inv-like compared to En indicate that inv and inv-like underwent a more recent duplication.

Expression of en Paralogs in the Pupal Wings of B. anynana
In situ hybridization results indicate expression differences between the three en paralogs in the pupal wings of B. anynana. en is expressed in the posterior compartment of the wing, in the eyespot centers, and in the golden ring area surrounding each center (Figures 3d-f; Supplementary Figures S1a-c). inv and inv-like, however, are only present in the eyespot centers (Figures 3i-n and Supplementary Figures S1g-i,m-o).

Duplication of en Paralogs in Holometabolous Insects
Recent research proposed that a common ancestor to all Lepidoptera has undergone a large-scale genome duplication that is consistent with a whole-genome duplication, while hexapod ancestors to both Diptera and Lepidoptera have undergone two other large scale bursts of gene duplication (Li et al., 2018). These duplications are believed to play important roles in the evolution of hexapods by increasing genetic redundancy and freeing up redundant copies to evolve new functions (Li et al., 2018;Doyle and Coate, 2019). The duplication of en into en and inv happened to a common ancestor to all hexapods including the holometabolous orders Diptera (containing Drosophila) and Lepidoptera (butterflies and moths) (Peel et al., 2006). Two of the families within the Lepidoptera, including Pieridae (containing Pieris rapae), and Nymphalidae (Junonia coenia, Vanessa cardui, and B. anynana) have butterflies whose genomes and transcriptomes have at least two copies of en-family genes (Daniels et al., 2014;Challis et al., 2016;Connahs et al., 2016) with characteristic En-family conserved domains. Inv has the conserved RS and LSVG motifs present across holometabolous insects (Peel et al., 2006), and En proteins one or both En-specific motifs (Figure 2A, see Supplementary File for sequences and motifs).
The gene phylogeny suggested that the duplication of a common ancestor into inv and inv-like might have occurred independently in at least three different lineages: B. anynana, Heliconius melpomene, and Plutella xylostella ( Figure 2C). This phylogenetic pattern, of Inv-like clustering with its paralog Inv in two of these lineages (H. melpomene and B. anynana) before clustering with its other orthologs (Inv-like), can also derive from the phenomenon of gene conversion (discussed below). Independent evidence using genomic copy position and orientation suggests that the duplication of these genes likely happened to a lineage ancestral to the divergence of all three species. The three en-like paralogs from P. xylostella map to contig unitig_1988 of P. xylostella pacbiov1 genome (Challis et al., 2016); and the same three paralogs from H. melpomene melpomene map to scaffold Hmel207002 of H. melpomene melpomene Hmel2 genome (Davey et al., 2015;Challis et al., 2016). All three paralogs map next to each other and in the same genomic orientation as in B. anynana (Supplementary Figures  S6, S7, Supplementary Table S1, and Figure 1A; sequence in Supplementary File). The conserved genomic architecture suggests that this gene duplication event happened only once in a common ancestor to the three lineages, rather than in the three different lineages independently, and this must have happened at least 130 mya, which is an estimated date for the split of P. xylostella moths from the two butterflies obtained from a recent phylogenomic study that included fossil calibrations (Kawahara et al., 2019).

Concerted Evolution of En-Family Proteins in B. anynana
En-family proteins seem to have evolved under concerted evolution in B. anynana (as well as in H. melpomene). Gene duplicates that undergo concerted evolution are often found in close physical proximity and maintain high sequence similarity due to unequal crossing over and frequent gene conversion (Zhang, 2003). Due to concerted evolution high sequence similarity is maintained between En-family members which has often led to misleading phylogenetic trees (Abzhanov and Kaufman, 2000;Marie and Bacon, 2000;Peel et al., 2006). In particular, higher sequence similarity is maintained between paralogs within a species when compared to similarity between orthologs across species, giving a false phylogenetic signal of multiple duplication events having happened independently in each lineage (Liao, 1999). Specific conserved LSVG and RS motifs in Inv paralogs, however, appear to be immune to gene conversion allowing researchers to conclude that the divergence of en and inv happened before the divergence of hexapods (Peel et al., 2006). We have identified both LSVG and RS motifs in B. anynana and other Lepidopteran species (Figures 2A,B) but also observed clear signs of concerted evolution in that B. anynana Inv has more sequence similarity to B. anynana En than to Inv from D. melanogaster (Diptera) (see Supplementary File for sequence). The same phenomenon of gene conversion is also likely happening to Inv and Inv-like within Lepidoptera, as discussed above.

Overlapping and Distinct Expression Patterns of en Paralogs in the Pupal Wings of B. anynana
Studies on en-family genes in butterflies have been primarily carried out using the 4F11 monoclonal antibody that binds to a conserved domain of the translated En and Inv proteins of D. melanogaster (Patel et al., 1989). These En-family proteins are expressed in the posterior compartment and eyespot centers of larval and pupal wings (Keys et al., 1999; and in the cells that will differentiate the golden ring of the eyespots in pupal wings of B. anynana (Brunetti et al., 2001), but until now it was not clear how many copies of enlike genes and transcripts were contributing to the observed multiple protein expression domains in this species. We aimed to find if one or more copies of en-like genes were exclusively associated with the novel eyespot color patterns and if these novel expression domains might have contributed to the preservation of en paralogs in the B. anynana genome.
In situ hybridization against each en paralog in the present study indicates that they are expressed in different patterns in the pupal wings of B. anynana. The protein expression patterns are likely the result of the expression of all three en paralogs (Figure 3). The expression in the posterior wing compartment and in the golden ring areas is specific to en, while all three paralogs are expressed in the eyespot centers.
Previous work established that both en and inv are expressed in the posterior compartment of butterflies during larval wing development (Carroll et al., 1994;, however, we show here that later in development only en is present in the posterior compartment of pupal wings. No data is currently available for the expression of inv-like during the larval stages. Previous studies using in situ hybridization showed that en in B. anynana is homogeneously expressed in the posterior compartment, while inv is present in the upper posterior compartment in B. anynana  and Junonia coenia (Carroll et al., 1994). In the pupal stage, we only observed the expression of en in the posterior compartment (Figures 3d-f and Supplementary  Figures S1a-c). The loss of inv in the posterior compartment during pupal wing development is most likely because this gene is no longer needed to perform its earlier larval function of setting up posterior veins . Venation patterning happens during the early fifth instar , where both en and inv are likely involved in activating other downstream diffusible signals such as Hedgehog and Bone Morphogenic Proteins that position the veins . Late pupal expression of en in the wings of D. melanogaster, however, is involved in the maintenance of the posterior compartment, as mutations of en at these late stage leads to an enlarged posterior compartment with venation defects (Garcia-Bellido and Santamaria, 1972;Lawrence and Morata, 1976;Blair, 1992). The maintenance of en expression in pupal wings of butterflies, suggests that this gene may be playing a similar role.
The expression of all three B. anynana en-family genes in the eyespot centers indicates that all three paralogs might have a redundant function in the differentiation or signaling from the eyespot center which sets the rings surrounding it, but this requires functional evidence (Figure 3) (Beldade and Brakefield, 2002;Monteiro, 2015). Many lineages of nymphalid butterflies lost expression of En-family protein in the eyespot center (as evidenced by lack of protein expression detected by the 4F11 antibody (Oliver et al., 2012)) and yet they develop eyespots. This suggests that En-family proteins might not be necessary for eyespot center differentiation and/or signaling (Oliver et al., 2012).
The expression of en in the golden ring area indicates that this gene alone might play a role in its differentiation, but again this requires functional validation. B. anynana goldeneye spontaneous mutants, where the golden ring extends inward into the area of the central black scales in the eyespots, display an equally extended expression of En-family proteins into that area (Saenko et al., 2008) but it is still unclear whether en, or some other gene equally affected by the goldeneye mutation is actually responsible for the change in scale colors in the central disk.

Alternative 3 UTR Splicing of en in B. anynana
The different isoforms of en with varying 3 UTR length could be involved in controlling the level of proteins produced and their localization; and hence, they might also be playing a role in the differential expression of this paralog compared to inv and inv-like. Alternative 3 UTR lengths are achieved by the post-transcriptional process known as alternative cleavage and polyadenylation (APA) (Elkon et al., 2013). These 3 UTRs are known to control expression level and localization of proteins (Mayr and Bartel, 2009;Elkon et al., 2013;Mariella et al., 2019). Shorter 3 UTRs are usually translated at higher rates, leading to higher protein levels (Mayr and Bartel, 2009). Future studies could focus on understanding the spatial and temporal expression of the alternative 3 UTR of en to investigate if they have diverged from each other.

en Paralogs and Eyespot Evolution
En-family proteins are differentially expressed in Nymphalid butterflies (Brunetti et al., 2001;Oliver et al., 2012). Eyespots are believed to have originated in a lineage sister to the subfamily Danainae, and now revised to have originated around 70 mya (Kawahara et al., 2019), via co-option which resulted in the expression of a set of genes in the eyespot centers, one of which includes En/Inv (Oliver et al., 2012). However, after the duplication, many lineages lost expression of En-family proteins in the eyespot centers (Oliver et al., 2012). V. cardui, for example, has expression in the golden rings but not in the eyespot centers (Brunetti et al., 2001), while the closely related J. coenia has expression of En-family proteins in the eyespot centers and the surrounding black scales, but not in the golden ring (Brunetti et al., 2001).
The in situ hybridization data on the pupal wings of B. anynana (Figure 3) reveal that two of the en paralogs (inv and inv-like) either lost two ancestral expression domains (posterior compartment and gold ring) or, alternatively, that one paralog (en) gained novel expression domains at these two locations after the duplication event. Given that inv is expressed in the posterior compartment of larval wings of B. anynana, but is absent in pupal wings, it might be more parsimonious to propose that inv might have had both larval and pupal wing expression but subsequently lost its pupal expression.
We propose two equally parsimonious scenarios for the evolution of expression domains of en paralogs (Figure 4). The two scenarios are not exclusive. In both scenarios, the ancestral copy of all the present paralogs in B. anynana already had expression in the posterior compartment of the wing because this is the pattern of expression for these genes in D. melanogaster in the pupal stage (Blair, 1992), which shared a common ancestor with Lepidoptera around 290 mya (Misof et al., 2014) (Figure 4A). This would be facilitated if all copies shared a common set of regulatory elements, as is observed for the regulation of en and inv in D. melanogaster (Gustavson et al., 1989;Cheng et al., 2014). After eyespots originated, around 70 mya (Oliver et al., 2012) (this is a revised estimate based on a new phylogenetic study (Kawahara et al., 2019)), all paralogs gained a novel expression domain in the eyespot centers. This might have happened in a few different ways. A novel cis-regulatory element might have evolved in a genomic region that affects the expression of all en-copies simultaneously in the eyespot center (Scenario 1, Figure 4B), or a pre-existent cis-regulatory element that was already enhancing all copies simultaneously might have been re-used in a novel location (eyespot centers) due to the co-option of a gene regulatory network to that novel location, containing en/inv as genes positioned within the coopted network (Monteiro and Podlaha, 2009;Monteiro and Gupta, 2016) (Scenario 2, Figure 4C). Subsequently, in both scenarios, the expression of inv and inv-like in the posterior compartment is lost (Figures 4D,E). Finally, either a new cis-regulatory element driving en expression in the golden ring is gained (Scenario 1, Figure 4F), or a pre-existent cisregulatory element is reused for the novel en expression in the golden ring, again via gene network co-option (Scenario 2, Figure 4G). It is, however, also possible that the expression of en in association with the gold ring is gained in all three copies at once and subsequently lost in the inv/inv-like copies (not shown).
An important point to make here is that the timing of the first duplication event of en into en and inv paralogs happened around 480 mya or earlier (Peel et al., 2006;Misof et al., 2014), and the duplication inv into inv and inv-like happened no more recently than around 130 mya (see "Discussion" section above). Both of these duplications happened much earlier than the origin of eyespots, estimated to be around 70 mya (Oliver et al., 2012(Oliver et al., , 2014Kawahara et al., 2019). This suggests that the duplication and maintenance of en-like paralogs in the genome of B. anynana as well as in the genome of other Lepidoptera is not particularly well associated with the origin of eyespots.
Future studies focused on the expression pattern of the en paralogs in model species such as J. coenia and V. cardui, as well as on the sequencing of open chromatin region surrounding the genes in B. anynana using techniques such as FAIRE and ATAC, will help us identify how the expression of en-like genes and their cis-regulatory elements diversified in the context of eyespot evolution. Functional studies will also be important to discover whether en-family members are actually involved in the regulation of eyespots in nymphalid butterflies.

CONCLUSION
The present study provides the first evidence of the expression of multiple gene duplicates associated with the development of eyespots, a trait novel to nymphalid butterflies (Monteiro, 2015). By studying gene transcripts rather than conserved domains of en/inv protein we found that en paralogs are expressed in both overlapping and distinct domains. We proposed two models for the evolution of the expression pattern in en paralogs, where either loss or gain of novel expression domains resulted in differential expression of the paralogs in the wings on B. anynana. Furthermore, we showed that the duplication of the three genes happened much earlier than the origin of eyespots. Hence, this study does not lend support to the idea that these particular gene duplicates were retained in Lepidopteran genomes due to their involvement in the development of this particular evolutionary novelty. Future studies focused on understanding the interaction of en paralogs with each other as well as other known genes that play roles in eyespot development will continue to illuminate the evo-devo of eyespot patterns.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.

AUTHOR CONTRIBUTIONS
TB, DR, and AM wrote the manuscript. TB performed bioinformatic analysis, in situ hybridization, and phylogenetic analysis. TB and AM performed antibody stainings. DR found the splice variants of en. TB, DR, and AM analyzed and visualized the results. All authors read and agreed to the final version of the manuscript.

ACKNOWLEDGMENTS
We would like to thank Nipam Patel for the anti-Engrailed/Invected 4F11 antibody and Heidi Connahs for help with the transcriptome analysis. We would like to thank DBS-CBIS confocal facility for access to confocal microscopes.

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fevo.2020.00227/ full#supplementary-material FIGURE S1 | Replicates of en, inv, inv-like and control in situ hybridization staining in B. anynana pupal wings. (a-c) Pupal hindwings stained with probe against en. The transcript is present in the posterior compartment, in the eyespot centers and in the eyespot golden ring area. (g-i) Pupal hindwings stained with probe against inv. The transcript is present only in the eyespot centers (m-o) Pupal hindwings stained with probe against inv-like. The transcript is present only in the eyespot centers. (d-f) Pupal wings stained with en sense probe (control). (j-l) Pupal wings stained with inv sense probe. (p,q) Pupal wings stained with inv-like sense probe. Non-specific staining is observed in the veins and trachea. Anterior-posterior compartments are separated by dotted lines. Black arrows point to eyespot centers.  Vertical lines indicate mean number of amino acid substitutions per site. Numbers above branches represents branch support.
FIGURE S4 | Phylogenetic tree of En-family proteins in Lepidoptera created using Geneious tree builder (Jukes-Cantor as the genetic distance model, and UPGMA as tree build method). The tree clusters En together; and Inv and Inv-like together. Numbers above branches represent branch support calculated based on 1000 bootstraps. FIGURE S5 | DNA sequence alignment of en, inv and inv-like and the regions used for probe design. Sequence used for inv probe has 24.9% identical bps to that of the en sequence, and 36.03% identical bps to that of the inv-like sequence. Sequence used for inv-like probe has 30.2% identical bps to that of the en sequence, and 36.8% identical bps to that of the inv sequence.