Retention and Molecular Evolution of Lipoxygenase Genes in Modern Rosid Plants

Whole-genome duplication events have occurred more than once in the genomes of some rosids and played a significant role over evolutionary time. Lipoxygenases (LOXs) are involved in many developmental and resistance processes in plants. Our study concerns the subject of the LOX gene family; we tracked the evolutionary process of ancestral LOX genes in four modern rosids. Here we show that some members of the LOX gene family in the Arabidopsis genome are likely to be lost during evolution, leading to a smaller size than that in Populus, Vitis, and Carica. Strong purifying selection acted as a critical role in almost all of the paralogous and orthologous genes. The structure of LOX genes in Carica and Populus are relatively stable, whereas Vitis and Arabidopsis have a difference. By searching conserved motifs of LOX genes, we found that each sub-family shared similar components. Research on intraspecies gene collinearity show that recent duplication holds an important position in Populus and Arabidopsis. Gene collinearity analysis within and between these four rosid plants revealed that all LOX genes in each modern rosid were the offspring from different ancestral genes. This study traces the evolution of LOX genes which have been differentially retained and expanded in rosid plants. Our results presented here may aid in the selection of special genes retained in the rosid plants for further analysis of biological function.

Whole-genome duplication events have occurred more than once in the genomes of some rosids and played a significant role over evolutionary time. Lipoxygenases (LOXs) are involved in many developmental and resistance processes in plants. Our study concerns the subject of the LOX gene family; we tracked the evolutionary process of ancestral LOX genes in four modern rosids. Here we show that some members of the LOX gene family in the Arabidopsis genome are likely to be lost during evolution, leading to a smaller size than that in Populus, Vitis, and Carica. Strong purifying selection acted as a critical role in almost all of the paralogous and orthologous genes. The structure of LOX genes in Carica and Populus are relatively stable, whereas Vitis and Arabidopsis have a difference. By searching conserved motifs of LOX genes, we found that each sub-family shared similar components. Research on intraspecies gene collinearity show that recent duplication holds an important position in Populus and Arabidopsis. Gene collinearity analysis within and between these four rosid plants revealed that all LOX genes in each modern rosid were the offspring from different ancestral genes. This study traces the evolution of LOX genes which have been differentially retained and expanded in rosid plants. Our results presented here may aid in the selection of special genes retained in the rosid plants for further analysis of biological function.

INTRODUCTION
Whole-genome duplications (WGDs) bring a huge impact on genome sizes of many angiosperms and may have provided the genetic material for evolutionary novelties (Sémon and Wolfe, 2007;Jaillon et al., 2009). Duplication events are usually followed by gene loss (Bowers et al., 2003), nucleotide divergence (Bowers et al., 2003) and structural rearrangements (Hufton and Panopoulou, 2009). It has long been hypothesized that the ancient genome triplication event happened to a single common ancestor of Arabidopsis-Populus-Vitis-Carica and finally caused a paleohexaploid (Tang et al., 2008a).Other than that, the two recent paleopolyploidies that have affected Arabidopsis are β-and α-duplications. At-α was a recent event, and the At-β was an intermediate event (Barker et al., 2009). In Populus, there was a single genome-wide event. This duplication event was called the "salicoid" duplication event (P-duplication; Tuskan et al., 2006). Vitis vinifera and Carica papaya each have only γ-triplication event and no other polyploidies (Tang et al., 2008a).
Polyploidy has been and continues to have an extensive effect on the number or type of genes in plant evolution (Adams and Wendel, 2005). Analysis of the differential retention and expansion of ancestral genes in modern plants provide an informative and robust way to resolve relationships among many lineages (Rokas and Holland, 2000). In this study, we will take the Lipoxygenase gene family as an example and discuss the differential retention and expansion of ancestral genes in four rosids.
Lipoxygenases involved in food-related applications during bread-making and production of the aroma are controlled by enzymes, which were found related to the formation of volatile compounds (Leenhardt et al., 2006). Studies have shown that extractable activities of enzymes are major factors that can affect the degrading efficiency of carotenoid pigments during the kneading step of bread-making in each of the three cultivated wheat species. Lipoxygenases also have a negative relationship with the color, off-flavor and antioxidant status of plant-based foods. Studies on soy-based foods have demonstrated that lipoxygenases are responsible for the offflavor associated with biological components present in soybean (Leenhardt et al., 2006). So far, there is sufficient evidence to prove that lipoxygenase is the most crucial element in plant defense responses (Baysal and Demirdöven, 2007;Bannenberg et al., 2009). In recent years, one LOX gene in Arabidopsis (AtLOX2) was thought to function exclusively in jasmonates (JA) biosynthesis upon wounding (Van Loon et al., 2006). The study is backed up by recent findings in apple which showed that MdLOX5 gene was more likely to be responsible for aphid tolerance or resistance (Vogt et al., 2013).
Lipoxygenase genes are chosen for their biological significance. In our research, taking lipoxygenases as an example, we studied the expansion of these genes in four species. Previous analysis showed that one or more paleopolyploidy events which had an impact on these four modern rosid genomes, fluctuate remarkably in size and arrangement. Our results trace the differential retention and expansion of the ancestral Lipoxygenases in Arabidopsis thaliana, Populus trichocarpa, V. vinifera and C. papaya and help facilitate the extrapolation of the evolutionary process.

Ethics Statement
No specific permits were required for the described field studies. No specific permissions were required for these locations and activities. The location is not privately-owned or protected in any way and the field studies did not involve endangered or protected species.

Database Search and Sequence Retrieval
LOX genes were identified following the method described by (Podolyan et al., 2010;Umate, 2014;Chen et al., 2015). Protein and cDNA sequences of LOX genes in Arabidopsis were obtained from the Arabidopsis Information Resource (TAIR, http://www.arabidopsis.org/, release 10.0). Protein and cDNA sequences of P. trichocarpa, V. vinifera, and C. papaya were downloaded from Phytozomev.11.0 database. The respective genome sequence sites are as follows: P. trichocarpa, V. vinifera, and C. papaya (http://genome.jgi.doe.gov/pages/ dynamicOrganismDownload.jsf?organism=PhytozomeV11). Local Blast searching was performed using Arabidopsis LOX proteins as queries for the identification of LOX genes from Carica and Vitis, and then using the resulting poplar and grapevine sequences as secondary queries. To obtain good gene models of CpLOX genes, all-against-all nucleotide sequence similarity searches were performed between the gene models and EST sequences using BLASTN software (Supplementary Table 1). Besides, we also worked on the multiple sequence alignment with CpLOX proteins and 70 experimantally verified gene models (Supplementary Tables 2, 3). All of the obtained genes were further manually analyzed to confirm the presence of the LOX domain (PF00305) and PLAT/LH2 (polycystin-1, lipoxygenase, α-toxin domain, or the lipoxygenase homology) domain (PF01477) in the Pfam HMM database (http://pfam.sanger.ac.uk/) (Finn et al., 2006) and InterPro (European Bioinformatics Institute) (http://www.ebi.ac.uk/interpro/scan.html) (Supplementary Table 4; Mulder et al., 2007). Redundant sequences with different identification numbers and the same chromosome locus were removed.

Phylogenetic Trees Construction
Complete protein sequences of LOX in the four plant species were aligned with the aid of ClustalW (Larkin et al., 2007). The phylogenetic tree was constructed by MEGA version 6.0 software with the minimum evolution (ME) method (Tamura et al., 2013). Bootstrap analysis with 1000 replicates was performed to calculate the reliability of the ME tree. To confirm the robustness of the ME tree, we also constructed other phylogenetic trees by using the Neighbor-Joining (NJ) method.

Exon-Intron Structural Analysis and Identification of Conserved Motifs
The exon-intron structure positions of LOX genes were generated online using the online program Gene Structure Display Server (GSDS; http://gsds.cbi.pku.edu.cn/; Guo et al., 2007) by alignment of the cDNAs with their corresponding genomic DNA sequences. To identify the conserved motifs of the LOX genes in four rosid plants, the structural motif annotation was employed using the MEME (Multiple Em for Motif Elicitation, Version 4.11.1) program (Bailey et al., 2006) with the following parameters: the maximum number of motifs was set at 20, and the optimum motif widths were set between six and 200 residues. Structural motif annotation was provided by using the Batch search tool in Pfam program.

Identification of Paralogs and Orthologs
Paralogs and orthologs were identified by using the same procedure described in Blanc and Wolfe (2004). This method was performed by running a BLASTN (Altschul et al., 1997) for all nucleotide sequences for each species. A pair of matching sequences were defined as pairs of paralogs when the identity was more than 40% and the alignment covered over 300 bp. To identify putative orthologs between two species, for example A and B, each sequence from species A was searched against all sequences from species B using BLASTN. At the same time, each sequence from species B was searched against all sequences from species A. The two sequences were defined as orthologs whose reciprocal best hits were each within >=300 bp of the two sequences aligned.

ω and Ks Analysis
First, pairwise protein sequence alignment was performed using MUSCLE (Edgar, 2004).Then, used in conjunction with protein alignments, CDS sequence, and an in-house PERL script, the input file format of KaKs_Calculator2.0 could be got. Finally, the input file were converted into computation of Ks (synonymous substitution rate) and Ka (non-synonymous substitution rate) values using KaKs_Calculator2.0 (Wang et al., 2010). To further assess whether positive selection acts upon specific sites, the gene pairs for all the paralogs and orthologs were used to calculate the ω, where ω = Ka/Ks.

Intraspecies and Interspecies Microsynteny Analysis
Microsynteny analysis within four species was detected by MicroSyn software (Cai et al., 2011). At the beginning of the generated MSY file step, three property files are needed: the gene list file, the CDS file and the gene identifier file. The microsynteny graphic file was provided by loading the three files. Then MicroSyn creates a homologous relationship among all genes. Finally, the microsynteny graphic file was provided by the software. In order to analyze duplication of LOX genes of the four rosid plants we researched the expansion of LOX genes through segmental or whole-genome duplications (S/WGD) for LOX genes in each species by using the plant genome duplication database (PGDD; http://chibba.agtec.uga.edu/duplication/; Lee et al., 2013). In order to determine whether the LOX genes of the four rosid plants arose from a large-scale duplication event (duplicated blocks derived from whole-genome or segmental duplication) or tandem duplication, genome-wide analysis was undertaken to examine whether LOX genes occurred within duplicated blocks. We researched the expansion of LOX genes through segmental or whole-genome duplications (S/WGD) for LOX genes in each species by using the plant genome duplication database (PGDD; http://chibba.agtec.uga.edu/duplication/). The LOX genes duplicated through S/WGDs were inferred on the basis of gene collinearity on syntenic blocks. First, from the PGDD, we download the file containing collinear block information within and between the four rosids. Then, all download blocks information were imported into MySQL, and the LOX gene ID were used as the query to perform a search in these species. The LOX genes duplicated through S/WGDs were identified in this way. In this analysis, the counterparts of a particular LOX gene on an SCB may have been retained as LOXs, subfunctionalized into non LOXs (indicated by an "N" before the first letter in the code name). LOX genes expanded through tandem duplication (TD) were inferred following the method that (1) belong to LOX gene family, (2) are located within 60 kb each other (data comes from Phytozome), and (3) are separated by five or fewer gene loci (non LOXs). Syntenic blocks between species were identified using the MCscanX (Wang et al., 2012) software with default parameters. To categorize the expansion of the LOX gene families, the positions of the LOX genes in the blocks were A. thaliana, P. trichocarpa, V. vinifera, and C. papaya. Circos software was used to draw the syntenic diagram (Krzywinski et al., 2009).

LOX Genes in Four Modern Rosids
Based on the previous studies, we obtained 6 and 20 putative LOX genes from the Arabidopsis, and poplar, respectively. In a recently published report, a total of 18 LOX genes were identified in Vitis. By removing pseudogenes, 13 LOX genes were identified in the Vitis genome. In this study, by removing psedogenes we further filtered five additional LOX genes in Vitis and changed the total member into 13. To identify LOX in Carica, we performed a search against the genome database with BlastP using AtLOX protein sequences as queries. Finally, 11 LOX genes were identified in Carica. The detail information of each LOX genes are listed in Table 1.
To date, four studied rosids have been suggested to possess paleohexaploidy in a common ancestor (Jaillon et al., 2007). Based on previous results, the multiplicity ratio for an ancestral gene comparison in the genomes of four species should be 4:2:1:1. But in our research results, the number of LOX genes in Arabidopsis is far fewer than that estimated for other plant species. Previous studies suggest that the V. vinifera genome is by far the closest to the ancestral arrangement such that the ancestral gene order can be deduced from this species with no difficulty. The ratio of LOX genes for the four species is 0.5: 1.5: 0.85: 1 when the number of LOX genes in V. vinifera is used as a benchmark. In addition to A. thaliana, this result is basically in line with the expected current ratios of LOXs.
Our results show that Populus and Vitis shared the most orthologous pairs, of up to 26 pairs of orthologous LOXs. We only got one pair of orthologous LOXs between Arabidopsis and Vitis. Two pairs of orthologous LOXs were detected between Arabidopsis and Populus. After comparing Carica and other three species we found no orthologous LOXs between them.
In order to better understand the evolutionary constraints acting among the four rosid species, we measured the Ka/Ks ratios for these pairs of LOX paralogs and orthologs. The ratio of non-synonymous substitutions per non-synonymous site (Ka) vs. the synonymous substitutions per synonymous site (Ks) is an indicator of the history of selection (Yang and Bielawski, 2000). If Ka/Ks < 1, it suggests that the gene is undergoing purifying selection. When Ka/Ks > 1, it means there is accelerated devolution with positive selection, and Ka/Ks = 1 suggests neutral selection. A summary of Ka/Ks for LOX paralogous and orthologous pairs is shown in Supplementary Table 5. The resulting pairwise comparison data showed the Ka/Ks values of only one Vv paralogous pair larger than 1. The relatively higher Ka/Ks ratio of VvLOX6/11 suggests that they may have experienced relatively rapid evolution following duplication. There were two Vitis pairs (VvLOX6/10 and VvLOX7/12) and one Populus pair (PtLOX18/19) that were larger than 0.5 but less than 1, while all of the remaining Ka/Ks ratios were less than 0.5, suggesting that the LOX family has mainly undergone strong purifying selection and these LOX genes are slowly evolving at the protein level. Our calculations shows that all orthologous LOXs between species were less than 1.

Expansion and Structural Characteristics of the LOX Genes in Four Rosid Plants
To investigate the extent of the expansion of the LOX genes in rosid plants, we performed a joint phylogenetic analysis with MEGA using the ME method (Figure 1) and the NJ method (Supplementary Figure 1). The ME and NJ trees show identical topologies. As mentioned above, plant lipoxygenases are clustered into two groups (9-LOX and 13-LOX). In our study, a total of 50 genes formed two distinct clades and are in agreement with the previously studied results (Brash, 1999; Figure 1). As shown in Figure 1, 9-LOX consisted of 20 LOX genes from four modern rosids; two from Arabidopsis, eight from Populus, six from Carica, and four from Vitis. This clade is composed of four sub-clades, one of which includes purely six Populus LOX genes. Yet there is another sub-clade simply containing four Carica LOX genes. The rest includes LOX genes from two or more species. Paralogous groups in this clade are CpLOX2/4/5/7 and CpLOX8/9 from Carica; PtLOX7/8/9/10/16/17 and PtLOX11/15 from Populus; VvLOX6/10/11 from Vitis. Beyond that there are three orthologous pairs shared by the four modern rosids. The remaining LOXs are placed in the 13-LOXs group. Paralogous groups in this clade were A3-A6, from Arabidopsis; PtLOX1/2, PtLOX3/4/20, PtLOX17/26, and PtLOX10/12 from Populus; VvLOX3/4/5, VvLOX4/9 and VvLOX7/12 from Vitis. In addition, this clade contained only one paralogous pair from Carica, CpLOX10/11. In this group, there are seven orthologous pairs shared by the four modern rosids. Besides, the genetic distances among the two LOX sub-families were studied and the result showed that the genetic distance of 9-LOX genes was smaller than 13-LOX genes, indicating that 9-LOX genes are more closely related to each other.
For a better understanding of the structural diversity of LOX genes, using the structures of LOX genes we generated the exon-intron architecture of each LOX gene in four rosid plants (Figure 2). Overall, the structures of LOX genes in Carica and Populus were conserved. But some changes take place in the AtLOX and VvLOX genes. The detailed structural analysis of the exon/intron are presented in Figure 3. Of the four species surveyed, Carica and Populus LOX genes are in a similar position with eight or nine exons, and the number of exons in Arabidopsis range from six to nine. VvLOX genes are much more dramatic, VvLOX9/10/11 have the highest number of exons at 11, but in the same species VvLOX6 also has the least number of exons at five. We further analyzed the exon/intron structure of the LOX orthologous and paralogous gene pairs discussed previously. The results showed that majority of these gene pairs have different exon numbers. Among paralogous gene pairs, the structure rationality changes obviously in VvLOX6/10. Simultaneously, by comparing the orthologous pairs, we found that all the differences come from Populus and Vitis.
We also studied the conserved motifs of LOX genes because of its particularity and the importance to the diversified functions of LOX genes. Therefore, we used the MEME web server to find the relatively conserved motifs which are shared with the 50 LOX proteins. In total, 20 distinct conserved motifs were found (Figure 3, Supplementary Table 6), and the relevant information is shown in Supplementary Table 6. Each of the putative motifs is well commented by searching in Pfam database. In detail, motifs 1, 2, 3, 5, 6, 7, 8, 10, 11, 12, 14, 15, and 19 are associated with the Lipoxygenase domain; motif 9 is found to encode the PLAT domain; motif 4 is thought to be involved in foring the PLAT and Lipoxygenase domain. However, the other motifs have no functional annotation. As illustrated in Figure 3, most LOX members belong to the same sub-family and are alike in motif compositions, suggesting that a lot of similarity may have many overlapping parts from a functional perspective. Motif 5 is widely presented in all fifty LOX proteins. Motif 15 and motif 20 are unique to the proteins in the 9-LOX clade. The former is considered for all of the components of the Lipoxygenase domain. Even though the function of motif 20 is still unknown, we still think that these motifs might be important to the functions of unique LOX proteins due to their specificity. To some extent, these specific motifs may help us to understand the functional divergence of LOX genes during evolutionary history.

Expansion Manners of LOX Genes within Four Rosid Plants
In order to probe the relationship between the genetic divergences within the LOX gene family and the corresponding expansion patterns, we further analyzed the gene duplication events within each species. As previously mentioned, rosid plants have experienced at least one polyploidy event. These events may have lasting implications for the evolution of LOX gene families. We used the MicroSyn software to investigate this possibility. If two members of the same gene family are homologous pairs, and three or more of the 50 upstream and downstream neighboring genes are also considered to be homologous pairs, we defined these two regions as those resulting from a duplication event. The number of LOX genes that arose from duplication events varied among the four rosids. Our survey results showed that 10 collinear gene pairs occurred in the Populus genome and a total of four collinear gene pairs occurred in the Vitis genome; however, there was only one collinear gene pair in both Arabidopsis and Carica genomes (Figure 4).
In Arabidopsis, one gene pair (AtLOX3/AtLOX6) was found to have conserved neighboring regions and no syntenic relationships were detected within the other four AtLOX genes. In Carica, the microsynteny between the CpLOX8 and CpLOX9 genes is extensive and this gene pair is located next to each other on the same supercontig 58, believed to have derived from tandem duplication events. In Vitis, one pair, VvLOX2/VvLOX8, shares a substantial collinear region. In addition, one gene cluster VvLOX3/VvLOX4/VvLOX5 and one gene pair, VvLOX10/VvLOX11 are located near each other on the same chromosomes and these gene pairs might be evolved from tandem duplication. In Populus, three gene pairs, PtLOX7/PtLOX16, PtLOX11/PtLOX15, and PtLOX12/PtLOX14 share extraordinary conserved synteny, with less conserved collinear genes surrounding PtLOX5/PtLOX6, PtLOX8/PtLOX16, and PtLOX15/PtLOX16. Besides, one gene cluster, PtLOX7/PtLOX8/PtLOX9/PtLOX10, and one gene pair, PtLOX18/PtLOX19, appear to have evolved from tandem duplication events. Using this approach, we made a preliminary judgment on the duplication events within each species. To better examine the gene duplication events of LOX genes, we retrieved the syntenic chromosomal blocks (SCBs) associated with the expansion of LOX genes through S/WGDs from the plant genome duplication database. The results are consistent with the findings of most studies done in the MicroSyn software. The biggest difference occurs in Vitis and Populus. From the PGDD database, we found that the other two gene pairs VvLOX2/VvLOX10 and VvLOX3/VvLOX9 are associated with S/WGDs in Vitis. In Populus, two counterparts (PtLOX1/PtLOXN1 and PtLOX13/PtLOXN1) of LOXs on SCBs are found to have sub-functionalized into other gene family members (indicated by the code name preceded by the letter "N"; Guo et al., 2014). Beyond that, the two gene pairs PtLOX8/PtLOX16 and PtLOX15/PtLOX16 achieved through MicroSyn software were not found in the PGDD database. Generally to consider the results from both ways, in current findings the number of LOX genes that arose from S/WGD are 10, five and two in Populus, Vitis, and Arabidopsis, respectively. Because the duplicated gene located on a SCB is simultaneous with another one, the median Ks value of duplicated genes in SCBs can be used to infer the dates of the large-scale duplication events. In this analysis, the duplicated gene pairs as well as the homologous genes in neighbor regions are used to date duplication events. The mean Ks values for each duplication FIGURE 2 | Phylogenetic relationship of LOX proteins and the exon-intron structure of LOX genes from four rosid plants. Left: an unrooted phylogenetic tree constructed using MEGA 6.0 by the ME method. Different sub-families of LOX genes are highlighted with different colored backgrounds. Right: exon-intron structure. The exons and introns are indicated by yellow rectangles and thin lines, respectively. The untranslated regions (UTRs) are indicated by thick purple lines.
pair in the LOX genes are shown in Figure 5 and Table 2. In Populus, the median Ks value of the γ triplication event is 1.54, and that related to the P-WGD is 0.27 (Tang et al., 2008b). We detected eight conserved gene pairs, which most likely resulted from SCB events. The median Ks of five in eight shows one range: 0.27-0.5. The median Ks values of the rest of the gene pairs is 2.1 and this pair is considered to associate with the most ancient γ-triplication event. In Arabidopsis, the median Ks values that have a relationship with β-and γ-WGDs are almost indistinguishable, and the Ks value is 2.00 (Tang et al., 2008b). To our knowledge the overall median Ks value for α-duplication in Arabidopsis is nearly 0.86. Therefore, the only one duplicated gene pair in Arabidopsis should be related with the α-duplication event. We also examined the expansion of LOX genes within the genomes of Vitis. According to previous reports, the overall median Ks value of SCBs in Vitis associated with the γ triplication is 1.22. In our research results, the synonymous silent substitutions per site are calculated over these three possible gene pairs. The Ks values for VvLOX2/VvLOX10, VvLOX2/VvLOX8, and VvLOX3/VvLOX9 are 1.2, 0.83, and 1.3, respectively. Based on the predicted Ks value, VvLOX2/VvLOX10 and VvLOX3/VvLOX9 appear to evolve from the γ triplication, while VvLOX2/VvLOX8 evolve from a duplication event that occurred more recently.
Based on the gene-collinearity analysis within each species, we established an idealized gene tree of the duplication groups of LOX genes in four rosid plants. As shown in Figure 6, in the Arabidopsis LOX genes duplicated network, one ancestor in the ancient genome duplication should have produced at least 12 AtLOX genes, but actually there is only one gene pair that is considered from α-duplication in our study. So we think that a possible ancient gene loss event occurred. In Populus, after two rounds of duplications, one ancestor in ancient genome duplication should have produced at least six LOX genes. However, two of these lines lacked the copies, which would have been obtained from p genome duplication. Moreover, PtLOX7, PtLOX8, PtLOX11, PtLOX15 and PtLOX16 originate from the same ancestral gene. PtLOX1 and PtLOXN1 evolve from a prior duplication event, while PtLOXN1 and PtLOX13 result from a duplication event that occurred more recently. In addition, two LOX gene pairs could be matched to the γ triplication in Vitis. In contrast, there is no LOX-containing segments in Carica being matched in any duplicated pairs. Such a huge difference existed in the expansion manners of LOX gene within the four rosid plants, so where did the remaining LOX genes in these species originate from?

Evolutionary History of LOX Gene Families in Four Rosid Plants
We used the LOX gene family members as anchor genes to further examine the orthologous relationships and evolutionary history of LOX genes among four rosid plants. After this interspecies microsynteny analysis, the relationships between syntenic orthologs of LOX genes in four rosid plants are displayed in Figure 7, indicating that the strongly conserved microsynteny among these regions across four species is observed significantly.
We obtained the collinear correlations of LOX genes in the four plant genomes by using MicroSyn. In total, 33 conserved syntenic segments were found (Supplementary Figure 2), and these syntenic segments are divided into six groups. Four of the groups contains all the four species with the LOX gene. AtLOX1 in Arabidopsis, PtLOX7/8/16/17 in Populus, VvLOX11 in Vitis, and CpLOX7 in Carica have conserved collinearity, and are identified as group "A". AtLOX3/6 in Arabidopsis, PtLOX5/6 in Populus, VvLOX8 in Vitis, and CpLOX3 in Carica are classified into the group "B." AtLOX4 in Arabidopsis, PtLOX12/14 in Populus, VvLOX1 in Vitis and CpLOX6 in Carica are grouped as group "C." AtLOX5 in Arabidopsis, PtLOX11/15 in Populus, VvLOX2 in Vitis and CpLOX8/9 in Carica are grouped as group "D." Group "E" consists of LOX genes from three species, which are PtLOX9 in Populus, VvLOX12 in Vitis, and CpLOX1 in Carica.
The LOX genes from two species comprise the group "F, " as they are PtLOX1 in Populus and VvLOX3/4/5 in Vitis. The results are consistent with the findings of the phylogenetic analysis.
Subsequently, the synteny quality was calculated in four rosid plants. The quality was calculated as twice the number of  matches divided by the total number of genes in both segments (Cannon et al., 2006). These four species have a synteny quality of 67.77% for orthologous regions. The minimum value of synteny quality observed between Arabidopsis and Vitis was 48.97%, and the maximum value was 97.70%. The average synteny quality in the Carica/Populus syntenic regions reached over 89.24%, followed by Carica/Vitis, for which the average synteny quality was 76%. The average synteny quality in the Arabidopsis/Populus and Arabidopsis/Carica syntenic regions was 53.34 and 45.35%, respectively. Details of this comparative analysis are shown in Table 3. It is thought that LOX families evolved from a process of different duplication events. However, within those LOX homologs, what kind of role is the genome-wide duplication playing? Since previous studies, the expansion of LOX genes within the genome of each species have been researched by using the PGDD database. Similarly, from the database, we also examined the SCBs associated with the expansion of LOXs between species. To better understand the gene-collinearity between species, a panoramic picture about the differential retention and evolution of the ancestral LOXs related to paleopolyploidy in the four rosid plants was built (Figure 8, Table 2). The study, building on previous research, has identified FIGURE 6 | Idealized gene trees of the duplication groups of LOX genes in Populus, Vitis, and Arabidopsis. Each tree represents a duplication group from large-scale gene duplication. As shown in the trees, the question marks indicate possible gene loss events. As shown in the trees, three paleopolyploidies affecting Arabidopsis (α, β, and γ duplication event). Populus trichocarpa has two duplication events (p and γ duplication event) and γ, which is shared by Vitis. The question marks indicate possible gene loss events.
differences in the duplicates of these ancestral genes through S/WGDs in each species. Five, one, eight and four LOX genes have been linked to paleopolyploidy in Vitis, Carica, Populus, and Arabidopsis, respectively.
In theory, it should always be possible to find out if the microsynteny were maintained among the members of four rosid plants. But this was not what we found. As shown in Figure 8, in the process of analyzing the gene-collinearity in individual species, we found out that one gene pair AtLOX3/AtLOX6 in Arabidopsis originated from α-WGD. When we expanded to analyze the gene-collinearity between species, we found that AtLOX3 and AtLOX6 were all orthologous SCBs of the chromosomal block containing PtLOX5 and PtLOX6 in Populus. In addition, in Carica, one gene, CpN1 on SCBs, was also found to have collinearity with AtLOX3/AtLOX6 and PtLOX5/PtLOX6. This may be because LOX gene sub-functionalized into other gene family members over evolutionary time. A similar dynamic can be seen in the gene-collinearity analysis between PtLOX12 and PtLOX14.The SCB analysis revealed that this gene pair is related to AtLOX4 and CpN2. In addition, based on the results of the present study, VvLOX2, VvLOX10, PtLOX11, PtLOX15, and AtLOX5 genes might have originated from the same ancestral gene. The orthologous CpLOX6 genes that might have arose through S/WGDs were sub-functionalized into VvN1 and PtN2 in Vitis and Populus, respectively. Beyond that, the orthologous pair CpN3 and VvLOX12 were found to have originated from the same ancestral gene. Besides, the orthologous relationship of VvLOX3/9 and PtLOX1/13/N1 are no longer traceable to the LOX genes in the other three species. Furthermore, our opinion is that LOX genes which are unique to their species might represent the oldest relics of ancient LOXs differentially retained in each species.

DISCUSSION
In terms of evolution, A. thaliana, P. trichocarpa, C. papaya, and V. vinifera are believed to originate from a common paleohexaploid ancestor demonstrated by numerous studies (Ohno, 2013). Single and more recent multiple WGD events have been found in the genomes of Populus and Arabidopsis. Paleopolyploidy events provided opportunities for gene duplication, and those duplicated genes have been shown to act as an important role in evolutionary innovation (Hittinger and Carroll, 2007). Functional diversification with duplicated genes results in more complex organisms. Typically, ancient genome duplications have always been thought to be a powerful source of functional innovation and genome complexity, and is also followed by substantial gene loss. Lipoxygenase and its products are involved in the regulation of a variety of processes. In this paper, we used the model rosid plant Arabidopsis, as well as Carica, Populus, and Vitis to study the evolutionary history of this gene family.
In this study, we identified 6, 20, 13, and 11 LOXs in Arabidopsis, Populus, Vitis, and Carica, respectively. Except Arabidopsis, previous surveys indicated that a very high proportion of most LOX members in other three species are paralogs. In order to improve our understanding on what affects the evolutionary constraints, we measured the Ka/Ks ratios of paralogous pairs of the four species. Amidst all of the pairwise comparison data, only one gene pair, VvLOX6/11 exhibits a Ka/Ks ratio larger than 1, suggesting that accelerated devolution with positive selection occurred in this gene pair. Other than that, Ka/Ks ratios of all the other paralog pairs are lower than 1, indicating that the LOX genes at the protein level are very slowchanging and the majority of sites are often controlled by strong purifying selection. Phylogenetic trees are quite informative for obtaining the LOX gene relationships with each other. In this study, the LOX genes are divided into two groups, 13-LOX and 9-LOX, consistent with many previous studies. The calculated genetic distances among the two LOX subfamilies were computed and the results show that the LOX genes of 9-LOX sub-families appear to be more closely related to each other than those LOX genes in 13-LOX sub-families. The number of exons in Carica and Populus LOX genes is relatively stable, whereas the exon numbers has changed dramatically in Vitis and Arabidopsis. Vitis have the most number of exons with 11 and the least number of exons with five. Exon-intron structural diversification has been confirmed in the evolution of many gene families, and the reason why exonintron gain or loss occurs is because of the genetic assortment of different chromosome fragments. The MEME server identifies that each sub-family shares a similar motif, and the results could have implications for functional similarities about these LOX proteins (Paterson et al., 2006). The differential motifs in each sub-family may endow the LOX proteins with new functions or to raise their performance. Our study shows that the results meet the similarities in gene structure and motif composition of most LOX proteins from phylogenetic analysis of the LOX gene family. LOX genes differentiate into various characteristics among the different sub-families for a variety of possible reasons, and the most probable cause is that the LOX members were functionally diversified (Blanc and Wolfe, 2004).
The current tools related to investigate the relationship among genes in modern plants include sequence similarity, microsynteny analysis, and retrieve the syntenic chromosomal blocks from PGDD. As might be expected each of these approaches has advantages and shortcoming in certain situations. For example, some ancient duplicates could not be retrieved preferences from the Blast method because sequence similarity may have severely eroded in long evolutionary process. As a result, we could no longer track the paralog-ship for such duplicates based on this method. For instance, our results showed that VvLOX3 and VvLOX9 are duplicates resulting from γ-WGD in Vitis (Figure 8), but this gene pairs in the paralog analysis based on sequence similarity that remained undetected (Supplementary Table 5). Thanks to the plant genome duplication database, more duplicates which had arisen from segmental or whole-genome duplications could be discovered easily. But this approach had a number of drawbacks, such as we could not detect the homology pairs that proliferate via other duplication strategies. The large paralogous group unique to only one rosid plant were found abundant in our results (Supplementary Table 5). It is unscientific to infer evolutionary relationships for homologous genes among different lineages reducing only based on Microsynteny analysis. Microsynteny between two members of a gene family is calculated from their flanking genes. And we have already known that the quality of gene prediction in different genome sequencing programs is drastically different. If the flanking regions contain assembly errors, gaps or annotation errors, would cause the microsynteny that be artificial. In conclusion, to approach the evolutionary relationships for members of LOX gene family across four modern rosids, the above methods should be utilized compositely.
Based on the gene collinearity on syntenic chromosomal blocks within and between these four rosid plants, we set up a panoramic picture to trace the evolutionary history of LOX gene families (Figure 8). Our analysis shows that five (line 1-5 in Figure 8) of these ancestral LOXs are retained in more than two species. By contrast, two of the ancestral LOX genes were retained in only one of the four rosids. These observations suggests that, no matter which rosid plant is used as the model plant, the functions of a gene family inevitably get the limited amount. We could speculate genes uniquely retained in only one species may have a specific and indispensable function. As it turns out, this finding will help us dig deeper into the unique genes retained in the rosid plants and further research the function of these genes. A surprising finding of this study is that we have not found any line containing the LOX genes in all four modern rosids. Thus, we determined that all of the LOXs in each modern rosid are offspring coming from different ancestral genes.
To date, top-down analysis shows a high degree of collinearity between the four studied rosids. Arabidopsis (Lamesch et al., 2012), Carica (Ming et al., 2008), Vitis (Jaillon et al., 2007), and Populus (Tuskan et al., 2006) have been suggested to possess paleohexaploidy in a common ancestor (Jaillon et al., 2007). Previous results indicate that genome triplication (γ) occurred in a common ancestor of Vitis, Arabidopsis, Carica, and Populus. Meanwhile, the previous results show the two most recent paleopolyploidies affecting Arabidopsis that are often described as α and β duplication. Populus trichocarpa has had a unique duplication event in recent times, which is called salicoid lineage (p, following the usage in Tang et al., 2008a). Considering the paleopolyploidy events that occurred in each species, there should be 3 ancestral loci in Carica and Vitis, 6 ancestral loci in Populus, and 12 in Arabidopsis. Based on these results, the multiplicity ratio for an ancestral gene comparison in the genomes of four species should be 4:2:1:1. And in fact none of the ancestral LOXs included the extremes of each condition. However, collinear correlations of LOX genes in the four plant genomes have been obtained by using MicroSyn provides an interesting point. In our work, 33 FIGURE 8 | Panoramic picture to visualize the differential retention and expansion of the ancestral LOXs associated with paleopolyploidy events that have occurred in four modern rosids. Square represents a SCB duplicated through paleopolyploidy events within and between species. Codes in the square correspond to associated LOX genes. Genes in the same line are thought to have originated from the same ancestral gene. Genes coded with an "N" between the letter and the number (e.g., CN1) represent those that have sub-functionalized into non-LOX; blank positions correspond to situations where the whole SCBs has been completely lost.
conserved syntenic segments are divided into six groups, and in almost every group, LOX genes from Populus are present in at least an extra copy compared with Vitis and Carica, and the two copies are paralogs. The result accords with the well-documented fact and also provides powerful evidence that Populus has undergone an additional whole genome duplication, which is not shared with Vitis and Carica. However, our survey results show that there are not twice as many LOX genes in Populus vs. Vitis, suggesting them could have suffered differential gene loss events could have happened to these two species. The LOX gene family has shrunk in the herbaceous plants and retained a large number of LOX genes in the woody plants, leading to the hypothesized that some Arabidopsis LOX genes might have been lost during the evolutionary process due to functional redundancy. Previous studies showed that after the paleopolyploidy events, the exponential growth in gene numbers is often tempered by massive and progressive gene death in the subsequent diploidization process (Tang et al., 2008a). Another possibility is that LOXs may have expanded faster in the other three species than in Arabidopsis. This expansion to more abundant LOX genes in Populus, Vitis, and Carica genomes suggests a great need of LOX genes to participate in more complicated physiological and biological processes in these three woody species. These results probably suggest the complex evolutionary history of the LOX family in rosid plants.
In this study, there are eight pairs of genes associated with S/WGDs, including six pairs from the PtLOX gene family, along with two pairs that have sub-functionalized into other gene family members. In contrast, there is only one collinear gene pair in both Arabidopsis, which is mainly caused by the rapid substitutions in Arabidopsis. The γ-triplication event has ever happened in the common ancestor of these four species. But all of them have different values median Ks about γ-paleologs from four modern rosids. In Arabidopsis the median Ks is close to 2.0, which was higher than that in Populus (1.54), Carica (1.76), and Vitis (1.22) (Tang et al., 2008a). Some studies have shown rapid substitutions at a rate proportional to the amount of synonymous sites (Guo et al., 2014). Over millions of years of evolution in Arabidopsis extensive chromosome have been actively rearranged. That might contribute to the high median Ks between γ-paleologs in Arabidopsis and this would destroy collinearity. The result accords with the fact that Arabidopsis which contains more paleopolyploidies has a smaller genome than that of Populus, though both originated from a common ancestor. Besides, our analyses showed that almost all Vitis contains many more paralog-ship LOX genes than Carica, although both of them were affected by the γ-WGD event. The observations suggesting that the gene duplication impacts turn out to be small in CpLOX gene family. The dates of the large-scale duplication events have been obtained through calculating the median Ks value of duplicated genes in SCBs. In Populus, the number of LOX genes produced by the recent duplication event is much more than those produced from ancient duplication events. In Arabidopsis, the only unique gene pair is generated by α duplication event, which is also a recent duplication event. This illustrates that recent duplication host to those species which have undergone at least whole-genome duplication.
The current study provides an overview of LOX genes in four rosid plants, including their phylogenetic relationship, gene structure, conserved motifs, microsynteny and gene collinearity. Based on these findings, we tracked the evolutionary history of ancestral LOX genes among four modern rosids. The results suggest that all of the LOX genes in each species could have resulted from different ancestral genes. This study presented here may provide clues for exploring the unique genes retained in the rosid plants and aid in the research of the biological functions about these special genes.