K-Domain Technology: Constitutive Expression of a Blueberry Keratin-Like Domain Mimics Expression of Multiple MADS-Box Genes in Enhancing Maize Grain Yield

MADS-box genes are regarded as the foundation of all agronomic traits because they play essential roles in almost every aspect of plant reproductive development. The keratin-like (K) domain is a conserved protein domain shared by tens of MIKC-type MADS-box genes in plants. K-domain technology constitutively expresses a single K-domain to mimic the simultaneous expression of the K-domains of other MADS-box genes and thus to generate new opportunities for yield enhancement, because the additional K-domains can likely prevent MADS-domain proteins from binding to target DNA. In this study, we evaluated the K-domain technology for increasing maize yield. The K-domain of the blueberry SUPPRESSOR of CONSTITUTIVE EXPRESSION OF CONSTANS 1 gene (VcSOC1K) shows similarity to five MADS-box genes in maize. Transgenic maize plants expressing VcSOC1K produced 13–100% more grain per plant than nontransgenic plants in all five experiments conducted under different experimental conditions. Transcriptome comparisons revealed 982 differentially expressed genes (DEGs) in the leaves of 83-day-old plants, supporting the view that the K-domain technology is powerful and multifunctional. The results demonstrate that constitutive expression of VcSOC1K is very effective in enhancing maize grain production. With its potential to mimic the K-domains of multiple MADS-box genes, the K-domain technology opens a new approach to increasing crop yield.

MADS-box genes were frequent targets of selection during maize domestication and improvement (Zhao et al., 2011; Schilling et al., 2018). They play essential roles in every aspect of plant reproductive development and have been described as the jack of all traits (Heuer et al., 2001; Schilling et al., 2018). For MIKC proteins, the I-domain defines specificity in the formation of DNA-binding dimers (Masiero et al., 2011), and the K-domain contributes to specificity in protein-protein interactions (Kaufmann et al., 2005; Rumpler et al., 2015). Both the K and C domains function in the formation of higher-order protein complexes, and the C domain also determines the specificity of interactions of MADS-box proteins (van Dijk et al., 2010; Liu et al., 2013). Among the MIKCc gene clades, SUPPRESSOR of CONSTITUTIVE EXPRESSION OF CONSTANS 1 (SOC1) is a positive regulator of downstream MADS-box genes such as APETALA1 (AP1) and FRUITFUL (FUL)/AGAMOUS-like 8 (AGL8) (Lee and Lee, 2010; Alter et al., 2016). Because of their regulatory roles, many of the MIKCc genes have the potential to change agronomic traits (Cacharron et al., 2000; Takatsuji and Kapoor, 2002; Lee et al., 2004; Podila et al., 2005; Ryu et al., 2009; Bae et al., 2011; Giovannoni et al., 2013; Alter et al., 2016). For example, the maize ZMM28 gene (patent application # WO2008148872A1), a homolog of AGL8, has been successfully applied to enhance grain yield through its constitutive expression (Munster et al., 2002; Anderson et al., 2019a,b; Catron, 2019).
VcSOC1K is derived from the K domain of the blueberry (Vaccinium corymbosum L.) SOC1 gene. Overexpression of VcSOC1K was found effective in promoting flowering, reducing plant height, enhancing abiotic tolerance, and increasing blueberry yield potential through its broad impact on the expression of numerous genes (Song et al., 2013; Song and Chen, 2018). This laid the foundation for the K-domain technology, which uses a constitutively expressed K-domain to mimic or affect the expression of multiple MADS-box genes simultaneously. In this study, we evaluated the K-domain technology for increasing maize yield. We provide phenotypic data for transgenic maize plants containing a constitutively expressed VcSOC1K (VcSOC1K-CX) from five experiments conducted under different conditions. We also present transcriptome comparisons that reveal a broad effect of VcSOC1K-CX on plant development at the transcript level. The K-domain technology opens a new approach to enhancing crop yield potential by mimicking expression of the K-domains of multiple MADS-box genes.

Constructs and Plant Transformation
The maize SOC1 gene (ZmSOC1 or ZmMADS1) was cloned from cDNA of the maize inbred line B104. The protein sequence deduced from the cloned 696-bp ZmSOC1 is identical to that derived from GenBank accession HQ858775.1. The ZmSOC1 and VcSOC1K protein sequences were aligned using Clustal Omega at EBI with default parameters.
The VcSOC1K was previously cloned into the T-DNA region of the binary vector pBI121 between the CaMV 35S promoter and the Nos terminator for constitutive expression (Song and Chen, 2018). The CaMV 35S-VcSOC1K-Ocs expression cassette in the pBI121 vector was released by digestion with Hind III and EcoR I, gel-purified, and then ligated into the T-DNA region of the Hind III- and EcoR I-digested binary vector pTF101.1. The pTF101.1 vector contains the bialaphos resistance (bar) gene under the CaMV 35S promoter for selection of transformed plant cells using the herbicide glufosinate (GS). The resulting pTF101.1-VcSOC1K was verified by sequencing the VcSOC1K and was then transformed into Agrobacterium tumefaciens strain EHA101.
Transformation of the pTF101.1-VcSOC1K into calluses of maize cultivar Hi-II (A188 × B73) was conducted at the Plant Transformation Facility of Iowa State University. The first-generation (T0) transgenic (TR) Hi-II plants were backcrossed with the nontransgenic inbred line B73 to produce first-generation backcross (BC1) seeds, which have about 75% of the B73 genetic background. T0 plants from separate callus clusters were defined as independent transgenic lines. BC1 seeds from 18 transgenic lines were obtained. Transgenic BC1 plants that were PCR-positive for both the bar gene and the VcSOC1K were crossed with inbred line B73 to produce BC2 seeds, which have about 87.5% of the B73 genetic background.
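The expected B73 background at each backcross generation can be checked with a short calculation. The sketch below is illustrative only: it assumes that the Hi-II T0 plants carry an expected one half of their genome from B73 (as an A188 × B73 hybrid) and that each backcross to B73 averages the current fraction with 1. The function name is hypothetical.

```python
from fractions import Fraction


def recurrent_parent_fraction(start: Fraction, backcrosses: int) -> Fraction:
    """Expected genome fraction from the recurrent parent after n backcrosses."""
    frac = start
    for _ in range(backcrosses):
        # Each backcross averages the current fraction with the recurrent parent (1).
        frac = (frac + 1) / 2
    return frac


# Assumption: T0 Hi-II (A188 x B73) plants start at an expected 1/2 B73.
t0 = Fraction(1, 2)
bc1 = recurrent_parent_fraction(t0, 1)
bc2 = recurrent_parent_fraction(t0, 2)
print(f"BC1: {float(bc1):.1%}, BC2: {float(bc2):.1%}")
```

These are expectations averaged over the genome; individual plants will deviate from them because of segregation.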

Sequence Analysis of Maize MADS-Box Genes
The amino acid sequence of the VcSOC1K was used to search the sequence database "all gene model protein sequences" at the Maize Genetics and Genomics Database (MaizeGDB) using the BLAST program blastp. BLAST hits meeting an E-value cutoff of 1e-4 were retained and annotated by BLAST against the GenBank database. Protein sequence alignment was conducted using CLC Sequence Viewer 8.0. Phylogenetic tree analysis was performed using the Maximum Likelihood method in MEGA X (Jones et al., 1992; Kumar et al., 2018; Stecher et al., 2020).
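The E-value filtering step can be sketched as follows. This is a minimal illustration, not the procedure used in the study: it assumes that the hits are available in the standard 12-column tabular BLAST format (E-value in the 11th column) and that hits at or below the cutoff are retained. The hit table, sequence identifiers, and function name are invented for the example.

```python
import csv
import io

E_VALUE_CUTOFF = 1e-4  # cutoff stated in the text

# Hypothetical tabular blastp output (12 standard columns); all values are invented.
EXAMPLE_HITS = (
    "query_K\tsubject_A\t45.1\t120\t60\t2\t1\t118\t70\t189\t3e-25\t98.2\n"
    "query_K\tsubject_B\t38.0\t100\t58\t3\t5\t102\t80\t178\t2e-06\t45.1\n"
    "query_K\tsubject_C\t30.2\t60\t40\t1\t10\t69\t90\t149\t0.5\t22.3\n"
)


def filter_hits(tabular_text, cutoff=E_VALUE_CUTOFF):
    """Return (subject_id, e_value) pairs whose E-value is at or below the cutoff."""
    kept = []
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row:
            continue  # skip blank lines
        e_value = float(row[10])  # 11th column holds the E-value
        if e_value <= cutoff:
            kept.append((row[1], e_value))
    return kept


retained = filter_hits(EXAMPLE_HITS)
for subject, e_value in retained:
    print(subject, e_value)
```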

Plant Phenotyping
To collect phenotypic data of the BC1 plants, three experiments started on May 17th, June 11th, and June 25th were conducted in 2018, and one experiment started on May 11th was performed in 2019. Plants of four transgenic lines were evaluated in all four experiments, and five additional transgenic lines were evaluated in one or two experiments. For all four experiments, BC1 seeds were germinated in water-soaked Suremix Perlite planting medium (Michigan Grower Products Inc., Galesburg, MI) in 4-inch plastic pots (8.9 cm width × 12.7 cm height). Each BC1 plant was transplanted to a 4-gallon pot (top diameter 30 cm, bottom diameter 24 cm, depth 27 cm), and the plants were grown in a secured courtyard under natural environmental conditions at Michigan State University, East Lansing, Michigan. All of the plants were irrigated and fertilized as needed. During the summer, plants were watered every other day and fertilized weekly using 20-20-20 fertilizer. Young leaves (0.5 g per plant) were collected from 30- to 40-day-old plants, frozen in liquid nitrogen, and stored in a freezer at −80 °C for DNA isolation. To avoid bias in phenotypic data collection, verification of the transgenic plants by polymerase chain reaction (PCR) was conducted after phenotypic data collection.
To collect phenotypic data of the BC2 plants, one field test started on May 26th was conducted in 2020. One of the four transgenic lines evaluated in the BC1 generation was tested in six field plots. A total of 30 plants in three lanes were randomly grown in each of the six plots, with three plots at a high planting density of 40,000 plants/acre and three at a low planting density of 32,000 plants/acre. Two extra lanes of B73 plants in each plot were used as protection lanes (Supplementary Figure 1). A drip irrigation system was installed in the field to irrigate the plants as needed.
Phenotypic data collected included plant height, seed germination date, dates of tassel and silk appearance, the total number of stem nodes and leaves, the number of cobs, dry weight of aerial parts without ears, dry weight of ear(s) excluding husk(s), and dry weight of grains. Plant heights measured during plant growth refer to stalk heights from the soil surface to the node of the highest leaf. The final heights of the maize plants refer to stalk heights from the soil surface to the base of the first tassel branch at harvest. All of the plants in each experiment were harvested at the same time after they reached full physiological maturity in late October. The ear(s) of each plant were collected in a paper bag and dried at 25 °C for over two months in the lab before the dry weights of cob(s) and grains were measured. Grain quality of BC2 transgenic plants, nontransgenic null segregant (NT) plants, and B73 plants was measured using a Grain Analyser (Infratec 1241, FOSS Analytical AB, Denmark).

Transgene Detection
DNA was isolated from about 200 mg of leaf tissue for each sample using the cetyltrimethylammonium bromide (CTAB) method (Doyle and Doyle, 1987). Two pairs of primers, bar-F and bar-R for the bar gene, and 35S-F (3′ portion of the CaMV 35S promoter) and SOK for the VcSOC1K gene (Supplementary Table 1), were used to detect the presence of transgenes in each sample. PCR conditions for all primer pairs were an initial denaturation for 2 min at 94 °C; 30 cycles of 45 s at 94 °C, 60 s at 58 °C, and 90 s at 72 °C; and a final extension for 10 min at 72 °C. All amplified PCR and RT-PCR products were separated on a 1.0% agarose gel containing ethidium bromide and were visualized and photographed under UV light.

RNA Sequencing and Transcriptome Analysis
The third leaf from the top of 83-day-old plants at the Blister (R2) stage, a reproductive growth stage that occurs 10-14 days after silking, was harvested, frozen immediately in liquid nitrogen, and stored at −80 °C in a freezer for RNA isolation. Three transgenic and three NT plants from one field plot were used as individual biological replicates. Total RNA of each sample was isolated from about 500 mg of young leaf tissue using a separate CTAB method (Zamboni et al., 2008) and was purified using the RNeasy Mini Kit (Qiagen, Valencia, CA, United States). On-column DNase digestion with the RNase-free DNase Set (Qiagen, Valencia, CA, United States) was used to remove DNA from the RNA samples. RNA quality was determined using the High Sensitivity RNA ScreenTape system (Agilent Technologies, Santa Clara, CA, United States). All of the RNA samples used for RNA sequencing had an RNA integrity number (RIN) equivalent score greater than 5.0.
The RNA samples were sequenced (150-bp paired-end reads) using the Illumina HiSeq4000. All sequencing was performed at the Research Technology Support Facility at Michigan State University (East Lansing, Michigan, United States). The FastQC program was used to assess the per-base quality scores of the sequencing reads. A total of 7.3-11.0 million paired reads (MR) with average quality scores ranging from 38.4 to 39.4 were obtained for each of the six biological samples for transcriptome analysis. A transcriptome reference of BC1 plants (ZmTrinity), assembled from about 100 MR of multiple NT and transgenic lines using Trinity/2.8.5, was used to conduct differential expression analysis (Haas et al., 2013). The differentially expressed transcripts (DETs) with a false discovery rate (FDR) value below 0.05 were used for further analyses of different pathway genes. The transcriptome reference ZmTrinity was annotated using Trinotate (Bryant et al., 2017).
Pathway genes of nine phytohormones in Arabidopsis, including auxin, cytokinin, ABA, ethylene, gibberellin, brassinosteroid, jasmonic acid, salicylic acid, and strigolactones, were retrieved from the RIKEN Plant Hormone Research Network. Similarly, sugar pathway genes in Arabidopsis were identified. These Arabidopsis hormone, MADS-box, and sugar genes were used as queries in BLAST searches against the transcriptome reference ZmTrinity, and the isoforms showing e-values less than 1e-20 were identified and used for transcriptome comparisons. Flowering pathway genes in Arabidopsis and cereals (Walworth et al., 2016) were used to analyze flowering-related DETs identified in this study. Cytoscape 3.8.2 was used to construct gene networks of overrepresented gene ontology (GO) terms for the selected DETs under BiNGO's default parameters, with the ontology file "GOSlim_Plants" and the organism "A. thaliana" selected (Shannon et al., 2003; Maere et al., 2005).
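The FDR threshold applied to the DETs can be illustrated with a short sketch. The text does not state which FDR procedure was used; the example below assumes the widely used Benjamini-Hochberg adjustment, and the p-values are invented for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotonic.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for five transcripts.
raw_p = [0.001, 0.008, 0.039, 0.041, 0.60]
fdr = benjamini_hochberg(raw_p)

# Keep transcripts with FDR below 0.05, the threshold stated in the text.
significant = [i for i, q in enumerate(fdr) if q < 0.05]
print(significant)
```

In this invented example, the third and fourth transcripts have raw p-values below 0.05 but adjusted values slightly above it, which shows why the FDR threshold is stricter than an unadjusted p-value cutoff.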
Quantitative reverse transcription PCR (RT-qPCR) using the SYBR Green system (LifeTechnologies, Carlsbad, CA, United States) was conducted to check the selected transcripts. The primers were designed according to the RNA-seq data, and ZmActin1 was used to normalize the RT-qPCR results (Supplementary Table 1). RT-qPCR was performed on a Roche LightCycler 480 Instrument II (Roche). The reaction conditions for RT-qPCR were 95 °C for 5 min, followed by 45 cycles of 30 s at 95 °C, 45 s at 62 °C, and 30 s at 72 °C. Fold changes were calculated using $2^{-\Delta\Delta C_t}$, where $\Delta\Delta C_t = (C_{t,\mathrm{TARGET}} - C_{t,\mathrm{NOM}})_{\mathrm{transgenic}} - (C_{t,\mathrm{TARGET}} - C_{t,\mathrm{NOM}})_{\mathrm{nontransgenic}}$. Three biological samples and three technical replicates were used for the analysis of each transgenic and nontransgenic line.
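The fold-change calculation described above can be sketched in a few lines. The example is illustrative: the Ct values are invented, the function name is hypothetical, and replicate Ct values are simply averaged before the differences are taken, which is an assumption about how replicates were combined.

```python
from statistics import mean


def fold_change(target_tr, norm_tr, target_nt, norm_nt):
    """Relative expression (transgenic vs. nontransgenic) by the 2^(-ddCt) method."""
    # Normalize the target Ct to the reference gene within each genotype.
    delta_tr = mean(target_tr) - mean(norm_tr)
    delta_nt = mean(target_nt) - mean(norm_nt)
    # Difference between genotypes, then convert cycles to fold change.
    delta_delta = delta_tr - delta_nt
    return 2 ** (-delta_delta)


# Hypothetical Ct values (three replicates each); not data from this study.
fc = fold_change(
    target_tr=[22.0, 22.2, 21.8],
    norm_tr=[18.0, 18.1, 17.9],
    target_nt=[24.0, 24.1, 23.9],
    norm_nt=[18.0, 18.0, 18.0],
)
print(round(fc, 2))
```

A lower normalized Ct in the transgenic samples gives a negative ΔΔCt and therefore a fold change above 1 (up-regulation).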

Statistical Analysis
Statistical analysis of the phenotypic data was conducted using ANOVA and TukeyHSD in RStudio (Version 1.3.1093).
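For readers unfamiliar with the test, the F statistic of a one-way ANOVA can be computed by hand as sketched below. This is not the R code used in the study; the group values are invented, the function name is hypothetical, and the Tukey HSD post hoc comparison (which needs the studentized range distribution) is not reproduced here.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical trait values for three groups; not data from this study.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_b, df_w = one_way_anova_f(groups)
print(round(f_stat, 2), df_b, df_w)
```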

VcSOC1K Sequence Has Similarities to the Proteins of Multiple Maize MADS-Box Genes
The VcSOC1K, lacking the MADS-box domain, had 36.9% identity to the maize SOC1 gene (ZmSOC1/ZmMADS1) (Figures 1A,B). It contained the K-, I-, and C-domains (Figure 1B). A total of five MADS-box genes on five chromosomes in the maize B73 representative genome showed similarities to the VcSOC1K (Figure 1C). This was the foundation of our hypothesis that constitutive expression of the VcSOC1K could mimic expression of the K-domains of multiple MADS-box genes and thus regulate plant development to increase grain yield.

VcSOC1K-CX Enhanced Grain Production
Five experiments were conducted to evaluate the phenotypic changes in VcSOC1K-CX plants under five environmental conditions over three years. Pot-grown VcSOC1K-CX BC1 plants from nine transgenic lines in four experiments and field-grown VcSOC1K-CX BC2 plants from one transgenic line at two planting densities were compared with the NT plants (Figure 2A and Supplementary Figure 1). Of the nine agronomic traits investigated, grain production per plant was higher in the VcSOC1K-CX plants than in the NT plants in all five experiments (Figure 2B and Supplementary Table 2). The increases for VcSOC1K-CX BC1 plants ranged from 13 to 27%. Notably, for the 180 BC2 plants tested in the field in 2020, the average dry grain weight of the VcSOC1K-CX BC2 plants (81 g/plant) was about twice that of the NT plants (40 g/plant) (Figure 2B). The larger difference for the field-grown BC2 plants, compared with the pot-grown BC1 plants, was likely due to the greater abiotic and biotic stresses under field conditions. All of the other traits (e.g., flowering time, leaf number, and plant height) showed no significant changes (Figure 2C and Supplementary Table 2). Grain quality was also measured for the BC2 plants, and the grains from the VcSOC1K-CX plants showed no significant differences from those of the NT plants (Supplementary Table 3).
When the MADS box of the MIKC-type SOC1 protein was removed, constitutive expression of the truncated SOC1 genes resulted in SOC1-deficient phenotypes in the model plants petunia, Arabidopsis, and Brachypodium (Ferrario et al., 2004; Seo et al., 2012). Overexpression of a truncated petunia SOC1 gene containing the K-domain and C-region delayed flowering in transgenic petunia plants (Ferrario et al., 2004). Similarly, in both Arabidopsis and Brachypodium, expression of truncated SOC1 genes containing either the K-domain only or the K-domain and I-region inactivated SOC1 and delayed flowering (Seo et al., 2012). In contrast to the other truncated SOC1 genes reported (Ferrario et al., 2004; Seo et al., 2012), when the truncated VcSOC1K lacking only the MADS box was constitutively expressed, both VcSOC1K-CX tobacco and VcSOC1K-OX blueberry plants had SOC1-OX phenotypes (Song et al., 2013; Song and Chen, 2018). Surprisingly, the VcSOC1K-CX maize plants in this study showed neither the SOC1-deficient phenotype of delayed flowering nor the obvious SOC1-OX phenotypes of promoted flowering and plant dwarfing (Alter et al., 2016).

VcSOC1K-CX Affected the Expression of Numerous Genes
The kernel blister stage (growth stage R2) is a reproductive growth stage that occurs 10-14 days after silking. The R2 stage is important for the determination of grain yield. At this stage, we found that the leaves of the VcSOC1K-CX BC1 plants were often greener than those of the NT plants. Thus, we conducted transcriptome comparisons between the VcSOC1K-CX and NT BC2 plants. The comparison revealed 2,247 DETs, which were annotated to 982 differentially expressed genes (DEGs) with an E-value cutoff of 1e-19 (Supplementary Table 4). RT-qPCR results for seven selected DEGs were consistent with the RNA-sequencing data, suggesting that the RNA-seq data were reliable (Supplementary Figure 2). Of the 982 DEGs, we further identified 21 DEGs in the flowering pathway and 41 DEGs related to phytohormones, including abscisic acid (9 DEGs), auxin (9), brassinosteroid (8), cytokinin (5), ethylene (1), gibberellin (8), and jasmonate (1). Additionally, there were 3 DEGs of MADS-box genes, 18 DEGs related to sucrose synthesis, and 28 DEGs in the mitogen-activated protein kinase (MAPK) family, which is related to plant resistance to abiotic and biotic stresses (Bigeard and Hirt, 2018; Krysan and Colcombet, 2018; He et al., 2020). Remarkably, among these essential DEGs, changes of greater than 100-fold occurred for 10 repressed and 40 up-regulated DEGs (Table 1). These examples indicate that VcSOC1K-CX could affect grain production at least through flowering, phytohormones, MAPK-mediated signaling (Bigeard and Hirt, 2018), or photosynthetic sucrose synthesis (Stein and Granot, 2019), although these DEGs represent the changes in only one specific tissue at one specific developmental stage. For instance, the upregulated ZEAXANTHIN EPOXIDASE gene (ZEP_ORYSJ) could increase plant resistance to osmotic and drought stresses and affect seed development and dormancy (Agrawal et al., 2001; Cao et al., 2018).
The increased expression of the floral homeotic protein APETALA 2 (AP2_ARATH) could play a broad role in flower and seed development by controlling the expression of other floral organ identity genes (Jofuku et al., 1994; Krogan et al., 2012). The upregulated mitogen-activated protein kinase kinase kinase YODA (YODA_ARATH) could enhance the regulation of inflorescence architecture owing to its role in promoting extra-embryonic fate (Lukowitz et al., 2004; Meng et al., 2012). Alpha,alpha-trehalose-phosphate synthase [UDP-forming] 1 (TPS1_ARATH) plays a critical role in vegetative growth and the transition to flowering, embryo development and growth, and starch and sucrose degradation (Blazquez et al., 1998; van Dijken et al., 2004; Avonce et al., 2005; Gomez et al., 2006). It is likely that the increased expression of TPS10_ARATH enhanced grain yield (Table 1). More studies at the protein level are still needed to determine how and why the expression of the truncated VcSOC1K, with the MADS box removed, had such a broad impact on gene expression.

Gene Networks of the DETs
Because grain yield is a quantitative trait and no convincing molecular criteria are available to define different crop yield potentials, we visualized the overall impact of VcSOC1K-CX by further analyzing the DETs with the GOSlim_Plants ontology file in BiNGO to identify overrepresented GO terms. Thirty-nine overrepresented GO terms were revealed in the gene networks, including 13 in "biological process," nine in "molecular function," and 17 in "cellular component" (Figure 3). These overrepresented GO terms indicated a broad impact of the VcSOC1K-CX at different levels, which provided additional evidence that the VcSOC1K-CX worked effectively in regulating plant development.
Of the 13 overrepresented GO terms in "biological process," five were related to abiotic factors (Figure 3). In this study, the field-grown BC2 plants were likely exposed to more abiotic stresses than the pot-grown BC1 plants, which were tested under more controlled conditions (i.e., watering and fertilization). The 100% grain yield increase for the BC2 TR (vs. BC2 NT) plants, compared with the 13 to 27% increase observed in four experiments for the BC1 TR (vs. BC1 NT) plants, could be attributed to enhanced abiotic tolerance in the transgenic plants, which would be more evident under stress conditions. In the previous report, VcSOC1K-OX resulted in high pH tolerance in blueberry plants (Song and Chen, 2018).

CONCLUSION
K-domain technology uses expression of the VcSOC1K to regulate plant growth. In maize, the VcSOC1K showed similarities to five MADS-box genes. VcSOC1K-CX increased grain yield by 13 to 100% in all five experiments conducted under different experimental conditions. Transcriptome comparisons revealed 982 DEGs in the leaves of plants at growth stage R2, supporting the view that the K-domain technology is multifunctional. The K-domain technology opens a new approach to increasing crop yield through its potential to mimic the K-domains of multiple MADS-box genes.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are publicly available. This data can be found here: NCBI repository, accession number PRJNA701291.

AUTHOR CONTRIBUTIONS
G-qS conceived, supervised the study, analyzed data, and wrote the manuscript. XH and G-qS conducted the experiments. Both authors read and approved the final manuscript.

Figure legend (fragment): (Table 1 and Supplementary Table 4). $-\Delta\Delta C_t$ is an average of three biological and three technical replicates for each DET. ZmActin1 (SAC1_ARATH) was used to normalize the RT-qPCR results. Bars indicate standard deviation.