Transcriptional Reprogramming of Legume Genomes: Perspective and Challenges Associated With Single-Cell and Single Cell-Type Approaches During Nodule Development

Transcriptomic approaches have revealed thousands of genes differentially or specifically expressed during nodulation, a biological process resulting from the symbiosis between leguminous plant roots and rhizobia, atmospheric nitrogen-fixing symbiotic bacteria. Ultimately, nodulation leads to the development of a new root organ, the nodule. Through functional genomic studies, scientists have used plant transcriptomes to reveal plant genes potentially controlling nodulation. However, it is important to acknowledge that the physiology, transcriptomic programs, and biochemical properties of the plant cells involved in nodulation are continuously regulated. They also differ between the cell-types composing the nodules. To generate a more accurate picture of the transcriptome, epigenome, proteome, and metabolome of the cells infected by rhizobia and of the cells composing the nodule, there is a need to implement plant single-cell and single cell-type strategies and methods. Accessing such information would allow a better understanding of the infection of plant cells by rhizobia and of the complex interactions existing between rhizobia and plant cells. In this mini-review, we report the current knowledge on legume nodulation gained by plant scientists at the level of single cell-types, and provide perspectives on single-cell/single cell-type approaches when applied to legume nodulation.


INTRODUCTION
Nodulation is a complex biological process which occurs between the root system of plants (i.e., legumes and the genus Parasponia of the Ulmaceae family) and rhizobia, soil bacteria capable of fixing and assimilating atmospheric dinitrogen. The establishment of nitrogen-fixing nodules requires two developmental programs, one leading to the formation of infection threads (plant-made structures through which rhizobia grow to reach the developing nodule) and one leading to nodule morphogenesis. Several molecular, physiological, and cellular aspects of this biological interaction were characterized during the past two decades. For instance, the root and bacterial exudates used to initiate the recognition between the two partners are now well characterized [e.g., plant flavonoids and iso-flavonoids (Phillips et al., 1994), bacterial nodulation factor (Nod factor) (Lerouge et al., 1990), and polysaccharides (Fraysse et al., 2003)]. More specifically, Nod factors are lipo-chitooligosaccharides whose synthesis is stimulated upon recognition of plant flavonoids by rhizobial NodD proteins (Oldroyd and Downie, 2008). Several functional genomic studies revealed the role of plant genes in controlling the perception and then the infection of the legume root hair cells and nodule cells by rhizobia (Oldroyd, 2013). Notably, the nodulation signaling pathway, a gene regulatory pathway conserved between legume species and induced upon recognition of the Nod factor by Nod factor receptors, was characterized across several legume species (Oldroyd, 2013). In addition to these functional genomic studies, the development of microarrays followed by the emergence of high-throughput sequencing technologies led researchers to better characterize the overall response of the legume transcriptome to rhizobia inoculation and infection.
For instance, these transcriptomic analyses were conducted to reveal the early responses of the legume root hair cells to rhizobia inoculation as well as the transcriptomic changes occurring during nodule development [see below for a more detailed description of these studies (Colebatch et al., 2002, 2004; Barnett et al., 2004; El Yahyaoui et al., 2004; Kouchi et al., 2004; Lee et al., 2004; Asamizu et al., 2005; Starker et al., 2006; Benedito et al., 2008; Brechenmacher et al., 2008; Hogslund et al., 2009; Libault et al., 2009; Afonso-Grunz et al., 2014; Roux et al., 2014; Kant et al., 2016; Peng et al., 2017; Yuan et al., 2017)].
While these studies allowed the identification of numerous differentially expressed genes, opening avenues for new functional analyses, the cellular complexity of the samples used to establish these transcriptomic resources remains an obstacle to accurately understanding the response of plant cells to rhizobia inoculation and infection. For instance, only root hair cells localized in one specific zone of the root, the "susceptible zone" of the root system, are potentially infected. Similarly, only a subset of the nodule cells is infected by rhizobia upon endocytosis and formation of the symbiosome, a plant cell compartment containing the symbiotic bacteria. To overcome the problem associated with sample heterogeneity, researchers implemented strategies to isolate specific cell-types before applying high-throughput transcriptomic methodologies such as microarray hybridization and RNA-sequencing. Such strategies successfully revealed the activation and repression of transcriptomic programs in response to rhizobia inoculation and infection. In this mini-review, we discuss the outcome of these analyses, their limitations, and opportunities to develop new strategies to better capture the dynamic changes of the legume transcriptome during the various stages of the nodulation process.

ROOT HAIR INFECTION BY RHIZOBIA
The infection of the plant root hair cell by rhizobia is a continuous process initiated by the chemical recognition between plant and rhizobia [i.e., plant flavonoids and iso-flavonoids are recognized by the bacteria, leading to the activation of the transcriptional regulators NodD (Fisher and Long, 1993), and bacterial Nod factors as well as exopolysaccharides are recognized by Lysin motif-receptor-like kinases of host plants (Limpens et al., 2003; Madsen et al., 2003; Radutoiu et al., 2003; Kawaharada et al., 2015)]. This recognition between the two partners is required to ensure the specificity of the interaction and the success of the symbiosis. Upon recognition, the plant root hair cell undergoes molecular and morphological changes that enhance its infection by rhizobia. For instance, a gradual and constant reorientation of the direction of root hair growth leads to the curling of the root hair cell. This curling is needed to trap rhizobia in an infection pocket, which enhances the infection rate of the root hair cells. The reallocation of plasma membrane proteins is also one of the earliest responses of the plant to rhizobia inoculation. Specifically, several proteins of the microdomain fraction of the plasma membrane are reallocated to the tip of the root hair cells only several hours after bacterial inoculation (Haney and Long, 2010). Functional analysis of the Medicago truncatula flotillin proteins suggests that this reallocation is needed before the formation of the preinfection thread, then during the initiation and elongation of the infection thread and the progression of rhizobia within this tubular structure in the root hair cells (Haney and Long, 2010).
As described above, the initiation of the nodulation process results from sequential and progressive changes in root hair cell physiological, morphological, and molecular responses. While the morphological responses of the root hair cells following rhizobia inoculation (e.g., root hair cell branching and curling) are well documented because they are easy to monitor under the microscope, the molecular response of the root hair cells remains poorly described, especially when considering the specific programs required at each step of the infection of the root hair cell (El Yahyaoui et al., 2004; Kouchi et al., 2004; Lohar et al., 2006; Hogslund et al., 2009; Libault et al., 2010b; Breakspear et al., 2014; Damiani et al., 2016; Jardinaud et al., 2016; Figure 1). Single cell-type strategies were implemented to carefully decipher the transcriptomic programs and the time-course of gene activity following rhizobia inoculation. For instance, researchers isolated root sections enriched in rhizobia-susceptible root hair cells (Lohar et al., 2006; Hogslund et al., 2009; Figure 1). This strategy was useful since it led to the identification of hundreds of genes differentially expressed in response to bacterial inoculation. To reach a higher level of resolution of these responses, populations of root hair cells were isolated from the root system at different times after bacterial inoculation (Libault et al., 2010b; Breakspear et al., 2014; Figure 1). This approach highlighted the regulation of thousands of genes including many genes of the Nodulation Signaling Pathway, the differential expression of genes at different times of the infection, and the transient activation of the plant defense system (Libault et al., 2010b).
The rapid inhibition of the plant defense system in root hair cells is likely required to promote the infection of the plant by rhizobia (El Yahyaoui et al., 2004; Kouchi et al., 2004; Libault et al., 2010b).

FIGURE 1 | Schematic representation of the current biological knowledge gained during the past years in legume symbiosis at the level of single cell-types (blue boxes) and some of the major remaining gaps in our understanding of legume nodulation (red boxes). Relevant studies are mentioned in each box. In addition to gaining more knowledge on various aspects of legume nodulation, data integration must be conducted. Ideally, multi-omic analyses should be conducted at the level of the single cells relevant to the nodulation process (e.g., infected root hair cells and cells composing the nodule). Another challenge is related to the dynamic molecular changes occurring in those cells during the recognition, interaction, infection, and then symbiosis between plant cells and rhizobia. Taking into consideration the permanent adaptation of each cell involved in nodulation will clearly enhance our understanding of legume nodulation.
Despite this effort, the temporal regulation of the expression of the legume genes upon rhizobia inoculation has been difficult to resolve. This is inherent to the nodulation process itself, since new root hair cells are continuously infected and each infection progresses independently of the others. Hence, the lack of synchronization of the infection of the root hairs by rhizobia logically leads to the isolation of heterogeneous plant material: a mixture of unresponsive root hair cells (those located outside of the susceptible zone of the root system), responsive but uninfected root hairs, and responsive and infected root hairs. The latter category could also be divided into distinct populations of cells according to their stage of infection by rhizobia.
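As a purely illustrative sketch of why this heterogeneity matters, the short Python example below computes the expression of a gene measured in a bulk sample as the weighted mean of its expression in each sub-population. The cell fractions and expression values are hypothetical numbers chosen for illustration, not measurements from any of the studies cited here, and the function name `bulk_signal` is our own.

```python
def bulk_signal(fractions, expression):
    """Weighted mean expression of one gene across the sub-populations of a bulk sample."""
    if abs(sum(fractions.values()) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    return sum(fractions[name] * expression[name] for name in fractions)

# Hypothetical composition of an isolated root hair population (assumed, not measured).
fractions = {"unresponsive": 0.70, "responsive_uninfected": 0.25, "infected": 0.05}

# Hypothetical expression (arbitrary units) of a gene induced only in infected root hairs.
expression = {"unresponsive": 1.0, "responsive_uninfected": 1.0, "infected": 20.0}

bulk = bulk_signal(fractions, expression)

# Fold changes relative to the unresponsive (baseline) root hairs.
fold_change_bulk = bulk / expression["unresponsive"]
fold_change_infected = expression["infected"] / expression["unresponsive"]

print(f"fold change in infected cells: {fold_change_infected:.2f}")
print(f"fold change seen in bulk sample: {fold_change_bulk:.2f}")
```

Under these assumed values, a 20-fold induction restricted to infected root hairs appears as a less than 2-fold change in the bulk sample, which is one way to picture how sample heterogeneity can mask cell-specific responses.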

THE LEGUME NODULE, A COMPLEX ROOT ORGAN
Concomitantly with root hair cell infection, the cortical cells of the root actively divide, leading to the formation of the nodule primordia. The location of these divisions differs between legume species. For instance, in M. truncatula, the inner cortex and pericycle actively divide upon rhizobia inoculation whereas, in Lotus japonicus, the outer cortex cells divide (Szczyglowski et al., 1998; Timmers et al., 1999; Xiao et al., 2014). In parallel, the infection thread progresses in and between plant cells until it reaches those dividing cells. There, the bacteria, which differentiate into bacteroids, are released into the symbiosome, an organelle-like structure where the bacteroids are surrounded by the host plasma membrane. The presence of microdomain-associated proteins in the symbiosome membrane suggests a role of these membrane proteins in regulating the communication between the symbionts and the infected plant cells of the nodule (Haney and Long, 2010; Lefebvre et al., 2010; Qiao and Libault, 2017). Nodule organogenesis differs between legume species. For instance, indeterminate nodule development requires the maintenance of the primordia even upon formation of a mature indeterminate nodule (e.g., M. truncatula and Pisum sativum). Conversely, in determinate nodules (e.g., Glycine max, L. japonicus, and Phaseolus vulgaris), the initially active nodule meristem degenerates in mature nodules. As a consequence, the cellular organization differs between determinate and indeterminate nodules (Brewin, 1991; Ferguson et al., 2010). In indeterminate nodules, four major zones can be distinguished. These zones are biologically different from one another. Zone #1, which is located at the tip of the nodule, is the site of the permanent nodule meristem. Zone #2 corresponds to the infection zone where the bacteria infect the plant cells. Zone #3 is the nitrogen fixation zone where the bacteroids fix and assimilate atmospheric dinitrogen for the plant.
Zone #4 is located on the basal side of the nodule and is the location of the senescence of the nodule cells. In contrast to indeterminate nodules, determinate nodules are not organized in zones. However, these nodules remain structurally organized: the plant cells colonized by rhizobia are exclusively located in the center of the globular nodules and are surrounded by uninfected epidermal, cortex, and vascular cells. In addition to their complex cellular composition, the nodules are also characterized by the level of endoreduplication of their cells, a duplication of the genomic DNA without cell division (Foucher and Kondorosi, 2000; Vinardell et al., 2003; Kondorosi and Kondorosi, 2004). While most plant cells contain 2C of genomic DNA, the infected cells of the nodules can reach 4, 8, 16, 32, 64C, etc., of genomic DNA content, where C is the haploid DNA content. As a consequence, zone #3 of indeterminate nodules is characterized by its massive endoreduplication.
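The series of DNA contents listed above follows from each endocycle doubling the genomic DNA without cell division. The arithmetic can be written as a minimal sketch, assuming a cell that starts at 2C and that every endocycle is one complete doubling (the function name is hypothetical):

```python
def dna_content_after_endocycles(n_endocycles, base_c=2):
    """DNA content (in units of C) after n rounds of endoreduplication.

    Assumes the cell starts at `base_c` (2C for a typical diploid cell) and
    that each endocycle exactly doubles the genomic DNA without cell division.
    """
    if n_endocycles < 0:
        raise ValueError("n_endocycles must be non-negative")
    return base_c * 2 ** n_endocycles

# DNA content after 0 to 5 endocycles.
levels = [dna_content_after_endocycles(n) for n in range(6)]
print(levels)  # → [2, 4, 8, 16, 32, 64]
```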
To date, most transcriptomic analyses conducted on legume nodules focused on their developmental stages rather than their cellular complexity [e.g., L. japonicus (Colebatch et al., 2002, 2004; Kouchi et al., 2004; Asamizu et al., 2005; Hogslund et al., 2009), M. truncatula (Barnett et al., 2004; El Yahyaoui et al., 2004; Starker et al., 2006; Benedito et al., 2008), G. max (Lee et al., 2004; Brechenmacher et al., 2008; Libault et al., 2009; Yuan et al., 2017), Cicer arietinum (Afonso-Grunz et al., 2014; Kant et al., 2016), and Arachis hypogaea (Peng et al., 2017); Figure 1]. In indeterminate nodules, Roux et al. (2014) used laser microdissection to collect different zones of M. truncatula nodules and to better depict the unique transcriptional properties of each zone. This work supports the use of laser microdissection to enhance the purity of the biological samples collected from nodules (Roux et al., 2018). More recently, the same group revealed the role of MtDME (DEMETER) as a major regulator of the transcriptional activity of nodule genes and transposable elements (Satge et al., 2016). However, additional biological information is needed to reveal the complexity of the transcriptional regulation, especially in determinate nodules.

APPLYING SINGLE-CELL/SINGLE CELL-TYPE APPROACHES TO BETTER UNDERSTAND LEGUME NODULATION
To better understand the role of legume genes during the nodulation process, it is important to reveal the dynamic changes of their expression during nodulation (Figure 1). Such studies should be conducted on infected root hair cells and nodule cells in order to capture the complexity of the molecular regulation at different stages of the infection of plant cells by the symbiotic bacteria. Accordingly, there is a need to isolate and separate each legume cell or cell-type (i.e., population of plant cells sharing the same biological function) infected by rhizobia or contributing to nodulation, such as the root hairs preferentially located in the susceptible zone of the root and the different nodule cell-types (e.g., epidermal cells, vascular cells, and infected and uninfected cortex cells of the nodule).
Various methodologies were established to isolate plant cell-types (see Libault et al., 2017 for review). One method consists in using fluorescence-activated cell sorting (FACS) to isolate transgenic plant cell protoplasts (i.e., living plant cells devoid of cell walls upon digestion of the cell wall by a cocktail of cellulases, hemicellulases, and pectinases) expressing the green fluorescent protein (GFP) in a cell-type-dependent manner (Birnbaum et al., 2003; Brady et al., 2007; Dinneny et al., 2008; Iyer-Pascuzzi et al., 2011; Petersson et al., 2015; Marx, 2016). Another approach consists in sequencing the transcriptome of cell nuclei upon their isolation (e.g., isolation of biotinylated nuclei), on the expectation that the cellular and nuclear transcriptomes are similar (Deal and Henikoff, 2011). A more sophisticated approach, allowing the sequencing of transcripts interacting with ribosomes, consists in the isolation of mRNA using a cell type-preferential tagged ribosomal protein (Zanetti et al., 2005). Applying those methods, genes preferentially expressed in specific root cell-types were characterized, validating the idea of root cell-type-preferential transcriptomes. More recently, the gDNA methylation profiles of six different root cell-types from Arabidopsis were established (Kawakatsu et al., 2016). Another strategy successfully applied when analyzing the transcriptomic, epigenomic (Yan et al., 2013, 2015, 2016), proteomic (Larrainzar et al., 2007; Thal et al., 2018), phosphoproteomic (Rose et al., 2012), metabolomic (Brechenmacher et al., 2010), and glycomic responses of legume plants during the nodulation process consists in the large-scale isolation of root hairs inoculated with rhizobia (Brechenmacher et al., 2009; Libault et al., 2010a,c).
However, single cell-type approaches have several limitations when considering the nodulation process. For instance, while the isolation of a population of legume root hairs enhances plant sample homogeneity, leading to a more accurate depiction of the molecular mechanisms controlling root hair infection by rhizobia, it is important to acknowledge the heterogeneity of this cellular population according to their unique stages of differentiation, unique responses to their environment, different stages in their infection by rhizobia, and the stochastic variations existing between cells. Also, other strategies need to be established to properly investigate the unique transcriptomic signature of the cells composing the nodule. To overcome these limitations, single-cell approaches (i.e., individual analysis of the transcriptome of each cell composing a complex organ) coupled with droplet-based microfluidic systems (Kolodziejczyk et al., 2015) are emerging. These systems [e.g., Chromium Single Cell Gene Expression Solution (10× Genomics), ddSEQ (Bio-Rad), C1 (Fluidigm)] allow the separation and isolation of each single cell prior to its molecular analysis. However, there are several technical limitations to consider when using these droplet-based microfluidic systems. For instance, the use of plant protoplasts in droplet-based microfluidic systems remains challenging due to the cell size discrimination of these systems (e.g., the 10× Genomics gel beads and C1 Fluidigm chips cannot incorporate cells/nuclei larger than 52 and 25 µm in diameter, respectively). This size exclusion might lead to the absence or relative depletion of the transcriptome of large plant cells. Also, protoplast bursting remains a major concern, leading to a decrease in RNA-seq library construction efficiency and a poor representation of low-abundance cell types (Shulse et al., 2018).
Consequently, isolated plant nuclei represent an interesting alternative, but their use presupposes that the cell and nuclear transcriptomes are similar. Previous studies concluded that working on isolated nuclei is an acceptable way to overcome the problem of fragile cells (Deal and Henikoff, 2011). There is a need to validate these results on plant cells before fully considering isolated nuclei as an alternative in single-cell biology. More generally, the application of droplet technology to plant cells will require the combination of unique expertise in plant cell biology, molecular biology, and bioinformatics in order to generate viable biological samples compatible with droplet-based microfluidic systems. Overcoming these limitations will open new avenues not only to understand legume nodulation but also to reveal the dynamic changes of the plant cell molecular responses during the infection process (Figure 1).
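The size constraint mentioned above can be illustrated with a minimal sketch. The diameter limits are those quoted in the text (52 and 25 µm); the cell names and diameters are hypothetical placeholders rather than measured values, and real instruments may impose additional constraints that are not modeled here.

```python
# Upper diameter limits (in µm) quoted in the text for each platform.
SIZE_LIMIT_UM = {"10x Genomics": 52.0, "Fluidigm C1": 25.0}

# Hypothetical protoplast or nucleus diameters (in µm), for illustration only.
diameters_um = {"small_cell": 15.0, "medium_cell": 40.0, "large_cell": 80.0}

def retained(diameters, limit_um):
    """Return the sorted names of the cells whose diameter does not exceed the limit."""
    return sorted(name for name, d in diameters.items() if d <= limit_um)

retained_by_platform = {
    platform: retained(diameters_um, limit)
    for platform, limit in SIZE_LIMIT_UM.items()
}

for platform, names in retained_by_platform.items():
    print(platform, names)
```

With these assumed diameters, the largest cell is excluded by both limits and the intermediate one by the smaller limit, which mirrors the expected depletion of large plant cells from the resulting datasets.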

CONCLUSION AND PERSPECTIVES
Accessing single-cell transcriptomes is only a first step toward fully understanding legume nodulation. Additional avenues must be considered in order to develop a system-level understanding of legume nodulation, including the integration of transcriptomic, epigenomic, proteomic, and metabolomic datasets. In addition, gene regulatory networks, including the binding sites of transcription factors controlling the nodulation process (Andriankaja et al., 2007), should also be more systematically characterized. As mentioned above, such experiments should be conducted at the level of single cells or, at least, at the level of single cell-types. In order to reach this goal, new strategies and technologies have recently been applied to plants or should be adapted to plant single-cell biology (Figure 1). For instance, recent improvements in the sensitivity of mass spectrometers and the development of new biochemical tools allow the characterization of plant single-cell proteomes (Misra et al., 2014; Zhu et al., 2016) and of the three-dimensional spatial distributions of plant metabolites, including from soybean nodules (Stopka et al., 2017; Velickovic et al., 2018). The establishment of single-cell ATAC-seq methodology [Assay for Transposase-Accessible Chromatin using sequencing (Cusanovich et al., 2015)] to reveal the folding of the chromatin fiber of eukaryotic cells at the level of single cells also represents an interesting approach to better understand the impact of the epigenome on gene expression. However, such methodology will need to be adapted and applied to plant single cells.

AUTHOR CONTRIBUTIONS
ML designed, wrote, and edited this mini-review.