Mitogen-activated protein kinase cascades in Vitis vinifera

Protein phosphorylation is one of the most important mechanisms to control cellular functions in response to external and endogenous signals. Mitogen-activated protein kinases (MAPK) are universal signaling molecules in eukaryotes that mediate the intracellular transmission of extracellular signals resulting in the induction of appropriate cellular responses. MAPK cascades are composed of four protein kinase modules: MAPKKK kinases (MAPKKKKs), MAPKK kinases (MAPKKKs), MAPK kinases (MAPKKs), and MAPKs. In plants, MAPKs are activated in response to abiotic stresses, wounding, and hormones, and during plant-pathogen interactions and cell division. In this report, we performed a complete inventory of MAPK cascade genes in Vitis vinifera, the whole genome of which has been sequenced. By comparison with MAPK, MAPK kinase, MAPK kinase kinase and MAPK kinase kinase kinase members of Arabidopsis thaliana, we revealed the existence of 14 MAPKs, 5 MAPKKs, 62 MAPKKKs, and 7 MAPKKKKs in Vitis vinifera. We identified orthologs of V. vinifera putative MAPKs in different species, and ESTs corresponding to members of MAPK cascades in various tissues. This work represents the first complete inventory of MAPK cascades in V. vinifera and could help elucidate the biological and physiological functions of these proteins in V. vinifera.


Introduction
Mitogen-activated protein kinase (MAPK) cascades are highly conserved modules of signal transduction in eukaryotes including yeast, animals, and plants. MAPK cascades play an important role in the protein phosphorylation events of signal transduction (Rodriguez et al., 2010). MAPK cascades typically consist of three protein kinases, MAPK, MAPK kinase (MAPKK), and MAPK kinase kinase (MAPKKK), but sometimes include a MAP3K kinase (MAP4K); each kinase phosphorylates the corresponding downstream substrate (Jonak et al., 2002;Champion et al., 2004).
MAPK is activated via phosphorylation of conserved threonine (T) and tyrosine (Y) residues in the catalytic subdomain by its specific MAPKK, which is in turn activated by phosphorylation of two serine/threonine residues in a conserved S/T-X3-5-S/T motif by an upstream MAPKKK (Stulemeijer et al., 2007;Zaïdi et al., 2010;Huang et al., 2011). Upon activation, the MAPK can be translocated into the nucleus or cytoplasm to trigger cellular responses through phosphorylation of downstream transcription factors or components of the transcription machinery, whereas some MAP kinases, like ERK3, are constitutively present in the nucleus and may function there (Lee et al., 2004;Pedley and Martin, 2005;Fiil et al., 2009;Nadarajah and Sidek, 2010). MAPKKK is usually activated by a G protein, but sometimes activation is mediated via an upstream MAP4K (Champion et al., 2004).
MAPK proteins contain 11 evolutionarily conserved kinase domains that may be involved in substrate specificity or protein-protein interaction (Nadarajah and Sidek, 2010). MAPK cascade proteins have TEY or TDY phosphorylation motifs in the region between kinase domains VII and VIII (Group et al., 2002), which provides a protein-binding domain for the activation of MAPKs (Rohila and Yang, 2007).
MKKs are activated by phosphorylation of conserved serine and threonine residues in the S/T-X3-5-S/T motif and are characterized by a putative MAPK-docking domain, K/R-K/R-K/R-X1-6-L-X-L/V/S, and a kinase domain (Group et al., 2002). To date, many MAPKKs have been identified from several plant species. All the identified MAPKK genes from Arabidopsis, rice and poplar contain 11 catalytic subdomains (Rao et al., 2010;Wang et al., 2014c). In Arabidopsis, MKK1 was activated by wounding and abiotic stress (Matsuoka et al., 2002). Alfalfa SIMKK mediates both salt and elicitor-induced signals (Kiegerl et al., 2000;Cardinale et al., 2002). NtMEK2 activates SIPK and WIPK resulting in cell death (Yang et al., 2001).
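The consensus motifs quoted above can be written as regular expressions, which is one simple way to screen candidate sequences. The following is a minimal sketch under stated assumptions: the pattern names and the helper `find_motif` are hypothetical, "X" in each consensus is read as "any residue", and the example peptides in the usage note are made up. Because the S/T-X3-5-S/T pattern is very permissive, a hit is only suggestive and would still need to be checked for its position in the activation loop.

```python
import re

# Regular-expression readings of the consensus motifs quoted in the text.
# TxY activation-loop motif of MAPKs, restricted to the TEY and TDY variants.
MAPK_TXY = re.compile(r"T[ED]Y")
# S/T-X3-5-S/T: Ser/Thr, then 3 to 5 residues of any kind, then Ser/Thr.
MAPKK_ACTIVATION = re.compile(r"[ST].{3,5}[ST]")
# K/R-K/R-K/R-X1-6-L-X-L/V/S: putative MAPK-docking domain of MAPKKs.
MAPKK_DOCKING = re.compile(r"[KR]{3}.{1,6}L.[LVS]")


def find_motif(pattern, sequence):
    """Return (1-based start, matched text) for each non-overlapping match."""
    return [(m.start() + 1, m.group()) for m in pattern.finditer(sequence.upper())]
```

For example, `find_motif(MAPK_TXY, "AAGFMTEYVATRW")` reports a TEY match starting at residue 6 of that made-up peptide.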
MAPKKKs form the largest class of MAPK cascade enzymes with 80 members classified into three subfamilies, MEKK, ZIK, and Raf, containing 21, 11, and 48 genes, respectively, in Arabidopsis (Jonak et al., 2002). Plant MAPKKKs are characterized by different primary structures of their kinase domains, but are conserved within a single group (Champion et al., 2004). The MEKK subfamily has the conserved kinase domain signature G(T/S)Px(W/Y/F)MAPEV (Jonak et al., 2002). The ZIK subfamily contains GTPEFMAPE(L/V)Y while the Raf subfamily has GTxx(W/Y)MAPE (Jonak et al., 2002). All the MAPKKK proteins have a kinase domain, and most of them have a serine/threonine protein kinase active site (Wang et al., 2015). In the Raf subfamily, most of the proteins have a long N-terminal regulatory domain and a C-terminal kinase domain. By contrast, the majority of the members of the ZIK subfamily have an N-terminal kinase domain (Wang et al., 2015). However, the MEKK subfamily has a less conserved protein structure with a kinase domain located either at the C- or N-terminus or in the central part of the protein (Wang et al., 2015). Homologs of MAPKKKs have been identified in plant species such as alfalfa, Arabidopsis, and tobacco (Kovtun et al., 2000;Nishihama et al., 2001;Lukowitz et al., 2004;Nakagami et al., 2004). The MEKK subfamily contains NPK1, NbMAPKKKα, NbMAPKKKγ, and NbMAPKKKε in tobacco (Jin et al., 2002;del Pozo et al., 2004;Liu et al., 2004;Melech-Bonfil and Sessa, 2010), MEKK1 in Arabidopsis (Asai et al., 2002), and SlMAPKKKα and SlMAPKKKε in tomato (Oh et al., 2010;Sun et al., 2014). The second subfamily, Raf, includes Arabidopsis CTR1/Raf1 (Kieber et al., 1993), EDR/Raf2 (Frye et al., 2001), and DSM1 in rice (Ning et al., 2010). In Arabidopsis, MEKK1 regulates defense responses against different pathogens including bacteria and fungi (Asai et al., 2002;Qiu et al., 2008;Galletti et al., 2011). In addition, AtEDR1, a Raf-like MAPKKK, regulates SA-inducible defense responses (Frye et al., 2001).
The ZIK subfamily, which contains 10 and 9 members in Arabidopsis and rice, respectively, is able to regulate flowering time and circadian rhythms (Wang et al., 2008;Kumar et al., 2011).
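The three MAPKKK signature motifs given above (Jonak et al., 2002) lend themselves to a simple rule-based assignment of a sequence to the MEKK, ZIK, or Raf subfamily. The sketch below is illustrative only and is not the procedure used in this report: the function name `classify_mapkkk` and the test order are assumptions. The order matters because a ZIK signature ending in MAPEVY also satisfies the MEKK pattern, and an MEKK signature also satisfies the looser Raf pattern, so the most specific pattern is tried first.

```python
import re

# Signature motifs from the text, with "x" read as any residue (".").
# Patterns are listed from most to least specific; the first match wins.
SIGNATURES = [
    ("ZIK", re.compile(r"GTPEFMAPE[LV]Y")),      # GTPEFMAPE(L/V)Y
    ("MEKK", re.compile(r"G[TS]P.[WYF]MAPEV")),  # G(T/S)Px(W/Y/F)MAPEV
    ("Raf", re.compile(r"GT..[WY]MAPE")),        # GTxx(W/Y)MAPE
]


def classify_mapkkk(sequence):
    """Return 'ZIK', 'MEKK', or 'Raf' for the first signature found, else None."""
    seq = sequence.upper()
    for name, pattern in SIGNATURES:
        if pattern.search(seq):
            return name
    return None
```

A sequence that carries none of the three signatures returns `None`; in practice such cases would need to be resolved by phylogenetic placement rather than by motif alone.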
A putative phosphorylation domain, T/S-X5-T/S, is found between domains VII and VIII in MAP4Ks, which is identical to the phosphorylation motif of MAPKKs from plants (Ichimura et al., 2002). Both domains participate in peptide-substrate recognition (Champion et al., 2004). MAP4Ks can be linked to the plasma membrane through association with a small GTPase or lipid (Qi and Elion, 2005). They are directly activated by stimulated interaction with adaptor proteins (Qi and Elion, 2005). The MAP4Ks are divided into eight classes including PAK-related, Gck, Mst, Tao, Ste/PAK, and Sok (Champion et al., 2004). The majority of MAP4Ks are from the large class of Ste20 protein kinases, which exhibit a highly diverse noncatalytic domain (Dan et al., 2001). The PAKs, which have a C-terminal catalytic domain, are separated from the GC kinase-related polypeptides, which contain an N-terminal catalytic domain (Dan et al., 2001). Most of the MAP4Ks contain an N-terminal catalytic domain, but members of the STE20/PAK group have a C-terminal kinase domain and some plant MAP4Ks have their kinase domain in the middle of the sequence. The Arabidopsis genome contains 10 putative MAP4Ks (Champion et al., 2004). A maize gene encodes MIK, a GCK-like kinase belonging to a subfamily of MAP4Ks (Llompart et al., 2003), which links membrane-located receptors to MAP kinases (Dan et al., 2001). Some MAP4Ks are able to phosphorylate MEKK or Raf members whereas other MAP4Ks either phosphorylate MAPKKs or function as adaptors (Champion et al., 2004).
However, the functions of most MAPK genes in plants are still unknown. Although MAPK cascades are involved in signaling multiple defense responses, the role of Vitis MAPK cascades in response to biotic and abiotic stresses has not been elucidated. In previous studies in grapevine, a few components of the MAPK gene family were isolated (Wang et al., 2014a). In addition, the MAPKKK gene family was identified and the expression profiles of its members were analyzed in different organs in response to different stresses (Wang et al., 2014b). Interestingly, the expression of a VvMAP kinase gene was induced by salinity and drought (Daldoul et al., 2012). However, the MAPKK and the MAPKKKK subfamilies have not yet been characterized. To explore the role of MAPK cascade proteins in biotic and abiotic stress responses in grapevine, the publicly available grapevine genome (Jaillon et al., 2007) was analyzed to identify all members of MAPK cascade proteins. Using these databases, we characterized all members of MAPK cascades of V. vinifera and performed a phylogenetic analysis in comparison with members of Arabidopsis MAPK cascade proteins.

Multiple-sequence Alignment and Phylogenetic Tree Construction
Multiple-sequence alignments of the putative MAPK cascade proteins were generated using CLUSTAL W and subjected to phylogenetic analysis by both the maximum parsimony method and the neighbor-joining distance method with 1000 bootstrap replicates (Saitou and Nei, 1987;Thompson et al., 1994). The phylogenetic tree was illustrated using MEGA5. Because similar results were obtained with both methods, only the single tree retrieved from the distance analysis is discussed in detail.
For MAPK cascade subfamilies from both V. vinifera and A. thaliana, multiple sequence alignment was performed using the multiple sequence comparison by log-expectation (MUSCLE) alignment tool (http://www.ebi.ac.uk/Tools/msa/muscle/) (Edgar, 2004). The phylogenetic analysis was performed using a neighbor-joining method with 1000 bootstrap replicates and visualized with MEGA5 software (Tamura et al., 2011). The protein theoretical molecular weight and isoelectric point were predicted using compute pI/MW (http://au.expasy.org/tools).
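Distance-based tree building with bootstrap support, as described above, rests on two simple ingredients: a pairwise distance computed from the alignment, and resampling of alignment columns with replacement. The sketch below illustrates only these two ingredients; it is not the MEGA5 implementation, it does not build the neighbor-joining tree itself, and it assumes an uncorrected p-distance, which may differ from the distance model actually used. The sequences and labels are made up.

```python
import random


def p_distance(seq_a, seq_b):
    """Fraction of differing residues, ignoring columns with a gap in either sequence."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no ungapped columns in common")
    return sum(a != b for a, b in pairs) / len(pairs)


def distance_matrix(alignment):
    """Pairwise p-distances for a {name: aligned_sequence} mapping."""
    names = list(alignment)
    return {(p, q): p_distance(alignment[p], alignment[q])
            for p in names for q in names}


def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement (one pseudo-alignment)."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in columns)
            for name, seq in alignment.items()}


# Toy, made-up alignment; the labels are placeholders, not real gene models.
toy = {
    "seq_A": "MTEYVVTRWY",
    "seq_B": "MTDYVVTRWY",
    "seq_C": "MTEYIATRWF",
}
rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicates = [distance_matrix(bootstrap_replicate(toy, rng)) for _ in range(1000)]
```

Each of the 1000 replicate matrices would be passed to a tree-building step, and the bootstrap value of a clade is the percentage of replicate trees in which that clade appears.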

Orthology Analysis and Database Search
Orthology analysis was performed using the PHOG web server (http://phylofacts.berkeley.edu/orthologs/) (Datta et al., 2009). The sequences of conserved domains with similarity over 70% and an "E" value of 0.0 were selected as queries. The selected sequences of conserved domains from different species were then used in a BLASTP search against the V. vinifera protein sequence database. The best hits were annotated as putative orthologous sequences (Moreno-Hagelsieb and Latimer, 2008).
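The best-hit step described above can be sketched as a small filter over tabular BLAST output. This is a minimal illustration, assuming the standard 12-column tabular format of BLAST+ (`-outfmt 6`); the function name `best_hits` and the default thresholds are hypothetical and are not the settings used in this report. Note that the `pident` column reports percent identity, which is not the same as the similarity criterion mentioned in the text.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(tabular_text, min_identity=70.0, max_evalue=1e-5):
    """Map each query to its highest-scoring subject that passes both filters.

    The thresholds are illustrative assumptions, not values from the source.
    """
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(FIELDS, row))
        if float(hit["pident"]) < min_identity or float(hit["evalue"]) > max_evalue:
            continue
        score = float(hit["bitscore"])
        current = best.get(hit["qseqid"])
        if current is None or score > current[1]:
            best[hit["qseqid"]] = (hit["sseqid"], score)
    return {query: subject for query, (subject, _) in best.items()}
```

Keeping only the top-scoring subject per query gives a one-directional best hit; a reciprocal search in the other direction would be needed to call reciprocal best hits.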
Expressed sequence tags (ESTs) were identified by BLASTn searches of the V. vinifera expressed sequence tag (EST) database (http://www.ncbi.nlm.nih.gov/dbEST) using the sequences of all of the MAPK cascade proteins as queries. The positive sequences were then confirmed by alignment with the query ORF.

Genome-wide Identification of MAPK Cascade Genes in Vitis vinifera
Vitis vinifera MAPK cascade sequences were mined from the grapevine genome proteome 12x database (Jaillon et al., 2007). We identified 88 ORFs encoding putative MAPK cascade proteins containing at least a MAPK domain by BLAST searches of the grapevine genome proteome 12x database with the amino acid sequences of the MAPK cascade proteins from A. thaliana as queries (Table 1). The completed Vitis genome contains 14 MAPKs, 5 MAPKKs, 62 MAPKKKs, and 7 MAPKKKKs (Table 1).

Phylogenetic Analysis
All predicted MAPK cascade family sequences were aligned using ClustalW (Thompson et al., 1994). A rooted phylogenetic tree was constructed by alignment of full length amino acid sequences using the MEGA5 program and maximum parsimony and distance with neighbor-joining methods (Saitou and Nei, 1987) (Figure 1). One thousand bootstrap replicates were produced for each analysis.
Vitis MAPK cascade sequences can be divided into four subfamilies on the basis of the presence of conserved threonine and tyrosine residues in the motif TxY located in the activation loop (T-loop) between kinase subdomains VII and VIII. In addition, we identified a MAPKKKK subfamily with 7 members in the Vitis genome, which has the conserved amino acid motif TFVGTPxWMAPEV as described (Jonak et al., 2002). The members of each of the four subfamilies clustered more tightly with each other than with members of other subfamilies (Figure 1).

MAPKs
The phylogenetic analysis showed that the VvMAPKs were divided into five distinct groups, which is more than in previous reports (Kumar and Kirti, 2010;Nadarajah and Sidek, 2010). Among plant species, group V MAPKs are found only in the grapevine genome. All of the identified ORFs encoding MAPKs were named VvMPK1 through 14. Hyun et al. (2010) reported 12 MAPKs based on the 8x sequence coverage of the grapevine genome, whereas we identified a total of 14 ORFs in the 12x genome coverage, which may be due to errors corrected in the 12x genome sequence. The grapevine genome contains fewer MAPKs than Arabidopsis (20 MAPKs) and rice (17 MAPKs) (Liu and Xue, 2007). Members of the Vitis MAPK subfamily show 20-86% identity to each other. Full-length MAPK proteins ranged in size from 195 to 769 amino acids (Table 1). Variation in the length of the entire MAPK gene is usually due to differences in the length of the MAPK domain and/or the number of introns. The difference in length among MAPK genes may indicate the presence or absence of motifs which could affect functional specificity. VvMPK12 and VvMPK14 belong to group I, which contains well-characterized MAPK genes including AtMPK3 and AtMPK6 (Figure 2). It has been demonstrated that AtMPK3 and OsMPK5 were activated in response to pathogens and abiotic stresses (Zhang and Klessig, 2001;Hamel et al., 2006;Rohila and Yang, 2007). OsMPK5 plays an important role in resistance to blast disease (Song and Goodman, 2002;Huang et al., 2011). AtMPK6 can be activated by various abiotic and biotic stresses (Ichimura et al., 2000;Yuasa et al., 2001;Feilner et al., 2005;Huang et al., 2011). Similarly, PtrMAPK is involved in resistance to both dehydration and cold (Huang et al., 2011).
Group II MAPKs are involved in both abiotic stresses and cell division in Arabidopsis. VvMPK13, VvMPK11, and VvMPK9 cluster with group II, which includes AtMPK4, AtMPK5, AtMPK12, and AtMPK11. AtMPK4 and its upstream MAPKK AtMKK2 can be activated by biotic and abiotic stresses (Ichimura et al., 2000;Teige et al., 2004).
VvMPK4 and VvMPK8 belong to group III. AtMPK1 in the group III is regulated by salt stress treatment (Mizoguchi et al., 1996). In addition, AtMPK1 and AtMPK2 are activated by ABA (Ortiz-Masia et al., 2007). The group III genes, such as rice BWMK1 and alfalfa TDY1, are activated by wounding and pathogens (Nowak et al., 1997;Lynch et al., 2001).
Group IV, which includes VvMPK1, VvMPK3, VvMPK5, VvMPK6, and VvMPK7 of the Vitis MAPKs, has the TDY motif in the T-loop and lacks the C-terminal CD domain, which is consistently found in members of the other MAPK groups. VvMPK2 and VvMPK10, belonging to group V, were separated from the other groups.
All of the 14 Vitis MAPK proteins are represented in the Vitis ESTs database (Supplementary Table 1) and are expressed in different tissues such as fruits, berries, buds, flowers, leaves, and roots. In addition, 12 VvMPK genes were isolated (Wang et al., 2014a). Expression analysis of VvMPK genes showed that all VvMPK genes are expressed during grapevine growth and development, and in biotic and abiotic stresses (Wang et al., 2014a).

MAPKKs
To date, none of the Vitis MAPKK homologs have been cloned or characterized. However, 98 ESTs were identified for this subfamily in different tissues in response to biotic or abiotic stresses (Supplementary Table 2). A role of the MAPK kinase MKK1 in abiotic stress signaling was previously demonstrated (Matsuoka et al., 2002). Analysis of MKK1 revealed that drought, salt stress, cold, and wounding activated MKK1, which in turn activates its downstream target MPK4 (Matsuoka et al., 2002). Tobacco NtMEK2 is functionally interchangeable with two Arabidopsis MAPKKs, AtMKK4 and AtMKK5, in activating the downstream MAPKs (Ren et al., 2002). MdMKK1 was reported to be downregulated by ABA (Wang et al., 2010). In Arabidopsis, AtMKK3 is upregulated in response to ABA (Hwa and Yang, 2008). Interestingly, AtMKK1/AtMKK2 play an important role in signaling in ROS homeostasis (Liu, 2012).

MAPKKKs
With 62 members, the MAPKKK subfamily represents the largest subfamily of V. vinifera MAPK cascade proteins, although it is smaller than those of Arabidopsis (80 members) and rice (75 members) (Colcombet and Hirt, 2008;Rao et al., 2010). Recently, Wang et al. (2014b) identified 45 MAPKKK genes in the grapevine 12x genome coverage. The difference in the number of MAPKKK members in the grapevine genome may be related to the "E" value (> E-120) used in this report, which is more significant. In addition, a domain scan using two different databases (PROSITE and CDD) can identify more sequences in the grapevine genome. The members of the Vitis MAPKKK subfamily share 11-35% identity with each other and are distributed on various chromosomes (from 2 to 18) (Table 1). The full-length Vitis MAPKKK sequences range from 175 (VviMAPKKK38) to 1397 (VviMAPKKK17) amino acids. The phylogenetic analysis of both Vitis and Arabidopsis MAPKKK sequences shows that this subfamily is categorized into three main groups with bootstrap values up to 93% (Figure 4).
The first group contains MAPKKKs whose kinase domains have similarity to MEKK subfamily members (Figure 4) (Jonak et al., 2002). A second group includes Raf subfamily members while a third group presents ZIK subfamily members (Figure 4) (Jonak et al., 2002). In total, there are 21 VviMAPKKKs in the MEKK subfamily, while there are 12 in the ZIK subfamily and 29 in the Raf subfamily among the 62 members in the Vitis genome.
Analysis of the conserved domains of VviMAPKKKs identified a long regulatory domain in the N-terminal region and a kinase domain in the C-terminal region in most VviMAPKKKs. It is suggested that the long regulatory domain in the N-terminal region of the Raf subfamily may be involved in protein-protein interactions and regulate or specify their kinase activity. Twenty members of the Vitis MAPKKK subfamily share 75.1-89.2% similarity with their orthologs from different plant species (Table 2).
We identified at least 640 ESTs for 59 of the Vitis MAPKKKs (Supplementary Table 3), indicating that the MAPKKK subfamily is transcriptionally active. The expression profiles of VviMAPKKK genes suggested that some of them are involved in responses to biotic and abiotic stresses in different tissues and organs (Wang et al., 2014b). In support of a role for some Vitis MAPKKKs, AtMEKK1 expression is enhanced by drought and salt stress (Mizoguchi et al., 1996). Recently, it was reported that AtMKK1/MKK2 and AtMEKK1 were able to negatively regulate programmed cell death (PCD) as well as immune responses (Kong et al., 2012). In tobacco, the NPK1-MEK1-Ntf6 cascade is also involved in resistance to tobacco mosaic virus (TMV) (Jin et al., 2002;Liu et al., 2004). In addition, AtEDR1, a Raf-like MAPKKK, could regulate SA-inducible defense responses negatively (Frye et al., 2001).

MAPKKKKs
Several MAP4Ks have been identified in plant genomes based on phylogenetic analyses of their kinase domain. A MAP4K, named MIK, was characterized from Zea mays (Wang et al., 2014d). Recently, a new MAP4K from the GCK-II subfamily named ScMAP4K1, which plays important roles in ovule, seed, and fruit development, was characterized (Major et al., 2009).
In addition, we identified several orthologs from different species for 3 VvMAP4Ks (Table 2). All 7 ORFs encoding Vitis MAP4Ks are transcriptionally active (Supplementary Table 4), but none of them has been cloned or characterized.

Conclusions
This report represents the first complete genome-wide analysis of MAPK cascade proteins in grapevine. The identification of Vitis MAPK cascade proteins and their comparative analysis with the Arabidopsis MAPK cascade proteins indicate that MAPK cascade genes have been conserved during evolution. In this report, we annotated 88 ORFs encoding MAPK cascade proteins in V. vinifera using a bioinformatics approach. Taken as a whole, our data provide a significant basis for future biological and physiological analysis of MAPK cascades from V. vinifera.

Author Contributions
BÇ conceived and designed all research. OK performed the bioinformatic analyses. BÇ analyzed data and wrote the article.