Defining Color Change in Pitaya: A Close Look at Betacyanin Synthesis Genes in Stenocereus queretaroensis

Betalains are tyrosine-derived plant pigments present in several species of the order Caryophyllales. They are classified into red betacyanins and yellow betaxanthins and are implicated in plant stress tolerance and in the visual attraction of pollinators. These compounds are used as natural colorants in many industries. Little is currently known about betalain biosynthesis, and several key enzymes remain unidentified in plants of the Caryophyllales. Omics tools have proven very useful for gaining insight into diverse molecular mechanisms. In this study, we used suspension cells derived from fruits of the cactus Stenocereus queretaroensis. RNA-seq and differential expression analysis were performed on cells from two growing conditions to help identify betalain biosynthesis-related genes. We found 98 differentially expressed genes related to the aromatic amino acid and betalain biosynthesis pathways. Interestingly, only one gene of the betalain synthesis pathway was differentially expressed. The remaining genes belong to the aromatic amino acid pathway, including hydroxyphenylpyruvate-related genes, suggesting the possibility of an alternative biosynthetic pathway similar to that observed in legumes.


INTRODUCTION
Color is a characteristic that has a profound influence on the animal kingdom. It also affects our choice of objects, clothing, and food. However, the use of synthetic colorants has been regulated to such an extent that only a limited set of dyes is endorsed by health organizations, owing to their possible side effects (Trasande et al., 2018). Because of this, natural pigments have regained commercial and scientific interest in recent decades. Betalains are water-soluble, tyrosine-derived pigments that are synthesized in the cytoplasm and stored in the vacuole. They are present in flowers, fruits, and occasionally in vegetative tissue of plants of the order Caryophyllales (Osbourn, 2017; Polturak and Aharoni, 2018).
Betalains arise from several reactions that primarily involve betalamic acid, resulting in two types of pigments: betacyanins and betaxanthins. Betacyanins are red pigments that result from the condensation of betalamic acid with a cyclo-3-(3,4-dihydroxyphenyl)-L-alanine molecule (cyclo-DOPA), whereas betaxanthins are yellow pigments formed by the conjugation of betalamic acid with the amino group of an amino acid. The presence of betalains in nature is related to essential processes, such as the pollination of flowers or stress tolerance mechanisms. Betalains also have bioactive properties, including antioxidant and antimicrobial activities (Tesoriere et al., 2004; Esatbeyoglu et al., 2015), and, compared with other plant pigments such as anthocyanins, they are stable over a wide pH range (Gandía-Herrero et al., 2016; Choo et al., 2018). The best-known edible sources of betacyanins and betaxanthins include the roots of beet (Beta vulgaris L.), which belongs to the Amaranthaceae family, and the fruits of the cactus Hylocereus polyrhizus (both from the order Caryophyllales) (Azeredo, 2009; Osbourn, 2017; Polturak and Aharoni, 2018).
Stenocereus (A. Berger) Riccob. is a genus within the Caryophyllales that comprises ∼24 species of cacti, and its geographical distribution ranges from the southern border of the US to Peru and Venezuela. The main characteristic of this genus is the production of highly desired edible fruits, generically called "pitayas," which have acquired commercial value as exotic fruits worldwide (Alviter, 2002). In Mexico, Stenocereus is represented by 19 species throughout the territory (Sánchez Mejorada, 1984). The cultivated species of this genus are Stenocereus thurberi, Stenocereus griseus, Stenocereus stellatus, Stenocereus fricci, and Stenocereus queretaroensis, the latter being the most widely used owing to the availability of varieties and their productivity (Ruiz et al., 2015). S. queretaroensis (Weber) Buxbaum is a well-defined columnar cactus (Bravo Hollis and Sánchez Mejorada, 1978). The branches are cylindrical and have eight prominent ribs. The flowers differentiate in the areoles of the upper half of the branches. The pitayas of S. queretaroensis are ovoid or globose fruits covered by a generally soft shell that bears areoles with spines. The pulp of the fruit is juicy and sweet, with colors in various blends of white and purple, although, depending on the variety, it can range from pink to yellow (Supplementary Figure 1). The betacyanins and betaxanthins present in the fruits provide intense and attractive colors, which are considered essential indicators for the classification of varieties in this genus.
The pigments extracted from pitayas have some advantages over those extracted from beetroots, as they do not alter the taste of other foods (Azeredo, 2009). In addition, cactus betalains span a color range from yellow to violet, which is broad compared with pigments obtained from other sources. Based on their broad color spectrum, water solubility, and various bioactive properties, pitaya betalains are promising candidate pigments with a wide range of potential applications in the food industry and the biotechnological processing of food products.
Because plants can harbor particular steps in metabolite biosynthesis and their precursors (Schenck and Maeda, 2018), transcriptomics can be a useful tool for the study of the biosynthetic pathway of betalains in different species within the Caryophyllales order, all of them without a reference genome.
Four key genes of betalain production were identified: 4,5-DOPA dioxygenase extradiol, cytochrome P450, and glucosyltransferase in a study of Hylocereus polyrhizus fruit pulp (Qingzhu et al., 2015). More recently, a newly recognized gene named CYP76AD6, which codes for a cytochrome P450-like enzyme, was silenced in sugar beet, reducing betalain formation. Moreover, in tobacco leaves, overexpression of this gene led to heterologous betalain biosynthesis (Polturak et al., 2016). Furthermore, in Bougainvillea species, a tight correlation between betalain content and the expression of CYP76AD1 and dihydroxyphenylalanine (DOPA)-4,5-dioxygenase (DODA) was shown in a transcriptomic analysis (Xu et al., 2016). An evolutionary approach in Mirabilis jalapa flowers revealed pivotal changes in betalain biosynthesis compared with anthocyanin biosynthesis, in which not only the lack of anthocyanin-related genes but also some specific gene mutations lead to betalain synthesis. Recently, a study in fruits of Hylocereus costaricensis found a higher expression of the CYP76AD1α and DODAα genes in red fruits compared with the white pulp of Hylocereus undatus (Xi et al., 2019). Here, we identified pivotal genes related to betacyanin synthesis through their differential expression profiles and their induction by sugar content or osmotic stress.

Suspension Cell Growth Conditions
Two growing conditions were established for cell suspensions obtained from the mesocarp of S. queretaroensis fruits. A modified MS medium (Murashige and Skoog, 1962) containing 4.02 µM Ca2+ and lacking glycine and nicotinic acid was used. Cells were grown under in vitro culture conditions with 3% sucrose (SqY1 medium) or 8% sucrose (SqR1 medium) (Miranda-Ham et al., 1999). Before multiplication of the plant material, 15 ml of inoculum cells in SqY1 medium was added to 40 ml of half-strength MS medium using 50 Erlenmeyer flasks of 250-ml capacity; likewise, 10 ml of inoculum cells in SqR1 medium was added to 40 ml of half-strength MS medium using 60 Erlenmeyer flasks of 250-ml capacity. The cell cultures were kept at room temperature (25 ± 2°C) under constant mechanical agitation at 115 rpm and continuous light.

Growth Measurement
We evaluated the fresh weight (FW) and dry weight (DW) of the cell mass as well as the pH and conductivity of the residual medium. Cell growth was assessed on pre-scheduled dates, starting on day 0 and then every 2 days for 34 days. Three flasks were randomly selected from each condition. The FW of the cultures was determined by vacuum filtering the total content of each flask with the help of a funnel and depositing the collected liquid medium in individual conical tubes. We measured pH using a Thermo Orion potentiometer and electrical conductivity using a Hanna conductometer calibrated against distilled water. For both cultures, all measurements were conducted every second day. All experiments were performed three times; on every day of measurement, we took three flasks to calculate the mean and standard deviation for that day of evaluation.
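The per-day summary described above (three flasks per condition, mean and standard deviation) can be sketched as follows. This is a minimal illustration, not the authors' script: the function name and the example weights are hypothetical, and the use of the sample (n − 1) standard deviation is an assumption, since the text does not say which estimator was applied.

```python
from statistics import mean, stdev


def summarize_flasks(values):
    """Return (mean, sample standard deviation) for replicate flask readings.

    Assumption: the sample (n - 1) standard deviation is used; the text does
    not specify the estimator.
    """
    if len(values) < 2:
        raise ValueError("at least two replicates are needed for a standard deviation")
    return mean(values), stdev(values)


# Hypothetical fresh weights (g) of the three flasks sampled on one day
day_mean, day_sd = summarize_flasks([4.0, 5.0, 6.0])
print(f"FW = {day_mean:.2f} ± {day_sd:.2f} g")
```

The same helper applies unchanged to dry weight, pH, or conductivity readings.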

Betalain Production Quantification
The pigment content was determined by a previously reported spectrophotometric method (Cai et al., 2003), using the fresh extract, measured in triplicate in a spectrophotometer (Genesys 10 UV-vis), and applying the following formula:
BC = (A × DF × MW × V)/(ε × L × W)
where BC is the content of betalains in fresh weight (mg/g), V is the volume (ml) of the extract obtained, DF is the dilution factor, L is the optical path length (cm), and W is the weight of the sample (g). For red pigments (betacyanins), A is the absorbance measured at 538 nm, MW is the molecular mass of betanin (550 g/mol), and ε is the molar extinction coefficient of betanin (60,000 L/mol × cm). In the case of betaxanthins, A is the absorbance measured at 480 nm, MW is the molecular mass of indicaxanthin (308 g/mol), and ε is the molar extinction coefficient of indicaxanthin (48,000 L/mol × cm). For each quantification, the mean and standard deviation were calculated.
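As a worked illustration of this quantification, the sketch below assumes the usual form of the relation for these variables, BC = (A × DF × MW × V)/(ε × L × W), which is dimensionally consistent with the units given (with V in ml the result is in mg/g). The function name and the example reading are hypothetical; only the MW and ε constants come from the text.

```python
def betalain_content(absorbance, dilution_factor, mol_weight, volume_ml,
                     epsilon, path_cm, sample_g):
    """Betalain content in mg per g of fresh weight.

    Assumed relation: BC = (A * DF * MW * V) / (epsilon * L * W).
    Units: MW in g/mol, V in ml, epsilon in L/(mol*cm), L in cm, W in g.
    (g/mol * ml) / (L/mol * g) = 1e-3 g/g = mg/g, so no extra factor is needed.
    """
    return (absorbance * dilution_factor * mol_weight * volume_ml) / (
        epsilon * path_cm * sample_g)


# Constants stated in the text
BETANIN = {"mol_weight": 550.0, "epsilon": 60000.0}        # absorbance read at 538 nm
INDICAXANTHIN = {"mol_weight": 308.0, "epsilon": 48000.0}  # absorbance read at 480 nm

# Hypothetical reading: A538 = 0.6, undiluted, 10 ml extract, 1 cm cuvette, 1 g cells
bc = betalain_content(0.6, 1, volume_ml=10, path_cm=1, sample_g=1, **BETANIN)
print(f"betacyanins: {bc:.3f} mg/g FW")
```

Passing `**INDICAXANTHIN` with the absorbance at 480 nm gives the betaxanthin content in the same way.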

Plant Material Preparation
Six flasks with 3% sucrose and six with 8% sucrose were harvested on the 16th day of culture. The flasks of each condition were then pooled, filtered, and aliquoted into cryovials containing 0.1 g of sample each. The cryovials were frozen in liquid nitrogen and stored at −76°C until RNA extraction.

Microscopic Analysis
Fourteen-day-old SqY1 cells were reseeded in 3% sucrose, 8% sucrose, and 3% sucrose plus 5% PEG 6000 media. Color changes were documented on days 3, 5, and 7 of the culture cycle. Cells from the 7th day were taken and fixed with 4% formaldehyde for 24 h at 4°C. The cells were then mounted in Fluoroshield-DAPI (Sigma Aldrich) on glass slides and incubated for 30 min. Cell images were taken with a confocal microscope [Olympus FV1000 microscope (Olympus Corporation)] using 355, 488, and 555 nm, and bright field. Bright field was used for cell detection. The 355-nm channel was used for the detection of DAPI (in blue). Betaxanthin fluorescence was detected at an excitation wavelength of 488 nm. No fluorescence was detected at 555 nm. Colocalization of DAPI and betaxanthin was assessed by merging the 355- and 488-nm channels.

Color Change Measurement by Image Analysis
The color change was measured from photos of the Erlenmeyer flasks with cells growing in 3% sucrose, 8% sucrose, and 3% sucrose plus 5% PEG media. All photos were taken under the same conditions. A composite photo of all nine flasks was assembled and analyzed in the ImageJ software, in which the background was subtracted with the background subtraction tool. The values of the three color channels and the calculated percentage of each signal were obtained. Finally, the percentages of the red channel signal from each treatment were plotted.
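The channel-percentage step can be illustrated with a short sketch. It assumes that the percentage of each signal is the channel's summed intensity divided by the summed intensity of all three channels; the text does not give the exact definition. The analysis itself was done in ImageJ, so the function and the pixel values below are hypothetical stand-ins for exported, background-subtracted RGB values.

```python
def channel_percentages(pixels):
    """Percentage of the total RGB signal contributed by each channel.

    pixels: iterable of (R, G, B) intensity tuples, assumed to be already
    background-subtracted. Returns (red_pct, green_pct, blue_pct).
    """
    totals = [0, 0, 0]
    for r, g, b in pixels:
        totals[0] += r
        totals[1] += g
        totals[2] += b
    grand_total = sum(totals)
    if grand_total == 0:
        raise ValueError("image region has no signal after background subtraction")
    return tuple(100.0 * t / grand_total for t in totals)


# Hypothetical region of interest from one flask
red_pct, green_pct, blue_pct = channel_percentages([(200, 100, 100), (100, 50, 50)])
print(f"red channel: {red_pct:.1f}%")
```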

RNA Extraction and Sequencing Library Preparation
A 0.1-g sample of SqY1 and of SqR1 cells was used for RNA extraction with the PureLink™ RNA Mini Kit (Invitrogen) with some modifications: 1 ml of TRIzol™ Reagent was used for lysis; homogenization was carried out with 0.5-mm glass beads; and 500 µl of isopropanol was added to the homogenate, which was then passed through the kit columns according to the manufacturer's procedure. RNA was eluted with 50 µl of water. The quality of the RNA samples was evaluated with a NanoDrop 2000 (Thermo Scientific™) and an Agilent 2100 Bioanalyzer. The integrity of the samples was visualized in a non-denaturing agarose gel (1%).

RNA-Seq Library Construction and Sequencing
Of the total RNA per sample (three samples per condition), 1 µg was used as input material for each sequencing library, with mRNA captured using poly-T oligo-attached magnetic beads. Sequencing libraries were generated using the NEBNext® Ultra™ RNA Library Prep Kit for Illumina® (NEB, USA) following the recommendations of the manufacturer. Index codes were added to attribute sequences to each sample, and library quality was assessed on the Agilent Bioanalyzer 2100 system.

RNA-Sequencing and De Novo Assembly
Paired-end sequencing (150 PE × 40 M raw reads) was performed on a NovaSeq 6000 sequencer. The resulting raw reads were processed with FASTQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/) and Trimmomatic v0.36 (Bolger et al., 2014) to filter out low-quality sequences, retaining reads with Q ≥ 20. The resulting clean reads were assembled using Trinity v.r2014-04-13p1 (Grabherr et al., 2011) with minimum k-mer coverage set to 2 and all other parameters set to default. We used the Corset software v1.05 (Davidson and Oshlack, 2014) to cluster transcripts, which reduces redundancy and thereby enhances the statistical power of the DEG analysis.

Quantification of Gene Expression Levels
Gene expression levels were estimated by mapping all the trimmed PE libraries to the de novo assembled transcriptome (filtered by Corset) with the Bowtie2 program (Langmead and Salzberg, 2012), which is wrapped in the RNA-Seq by Expectation-Maximization (RSEM) software v1.2.26 (Li and Dewey, 2011). Read counts were normalized to FPKM (expected number of Fragments Per Kilobase of transcript sequence per Million mapped fragments), representing the normalized gene expression levels.
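For readers unfamiliar with the FPKM unit, the following sketch shows the arithmetic behind it. It is a simplification under stated assumptions: RSEM works with expected counts and effective transcript lengths, whereas this illustration uses a plain fragment count and the full transcript length. The function name and example numbers are hypothetical.

```python
def fpkm(fragment_count, transcript_length_bp, total_mapped_fragments):
    """Fragments Per Kilobase of transcript per Million mapped fragments.

    FPKM = count * 1e9 / (length_bp * total_mapped_fragments),
    i.e., count / (length in kb) / (library size in millions).
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragment_count * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical unigene: 500 fragments, 2,000 bp, in a library of 10 million mapped fragments
value = fpkm(500, 2000, 10_000_000)
print(f"{value:.1f} FPKM")
```

Dividing by length makes long and short transcripts comparable, and dividing by library size makes libraries of different depth comparable.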

Differential Expression Analysis
Before expression analysis, the correlation between samples was calculated using Pearson correlation to test the reliability of the experiment. Differential expression analysis of the two conditions (SqYC and SqRT) was performed using the DESeq R package v1.10.1 (Anders and Huber, 2010), which is based on the negative binomial distribution. Differentially expressed genes (DEGs) were defined as those presenting a fold change (FC) ≥ 1 and an adjusted p-value ≤ 0.05.
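The DEG selection step can be sketched as below. This is not the DESeq implementation: it only illustrates thresholding on fold change and adjusted p-value. Two points are assumptions, labeled as such in the code: the p-values are adjusted with the Benjamini-Hochberg procedure, and the fold-change cutoff is read as |log2 FC| ≥ 1, a common convention, whereas the text states only "FC ≥ 1". Gene identifiers and values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(genes, min_abs_log2fc=1.0, max_padj=0.05):
    """genes: list of (gene_id, log2_fold_change, raw_p_value) tuples.

    Assumption: the cutoff is applied to |log2 FC|; adjust min_abs_log2fc
    if a different reading of the threshold is intended.
    """
    padj = benjamini_hochberg([p for _, _, p in genes])
    return [(gene_id, lfc, q)
            for (gene_id, lfc, _), q in zip(genes, padj)
            if abs(lfc) >= min_abs_log2fc and q <= max_padj]


# Hypothetical results table
table = [("g1", 2.5, 0.01), ("g2", -1.2, 0.04), ("g3", 0.3, 0.03), ("g4", 3.0, 0.20)]
degs = select_degs(table)
print([gene_id for gene_id, _, _ in degs])
```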

GO and KEGG Enrichment Analysis
Gene Ontology (GO) enrichment analysis of the DEGs was implemented with the GOseq R package v1.10.0, based on the Wallenius non-central hypergeometric distribution (Robinson and Oshlack, 2010). KOBAS software v2.0.12 (Mao et al., 2005) was used to test the statistical enrichment of differentially expressed genes in KEGG pathways.
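To make the idea of an enrichment test concrete, the sketch below computes an over-representation p-value with the ordinary (central) hypergeometric distribution. This is a deliberate simplification and an assumption of this example: GOseq uses the Wallenius non-central variant to correct for transcript-length bias, which is not reproduced here. The function name and the numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(total_genes, genes_in_term, n_degs, degs_in_term):
    """P(X >= degs_in_term) when n_degs genes are drawn without replacement
    from total_genes, of which genes_in_term carry the annotation term."""
    upper = min(genes_in_term, n_degs)
    denominator = comb(total_genes, n_degs)
    tail = sum(
        comb(genes_in_term, k) * comb(total_genes - genes_in_term, n_degs - k)
        for k in range(degs_in_term, upper + 1)
    )
    return tail / denominator


# Hypothetical toy case: 10 genes, 4 in the term, 3 DEGs, 2 of them in the term
p_value = hypergeom_enrichment_p(10, 4, 3, 2)
print(f"p = {p_value:.4f}")
```

A small p-value indicates that the term contains more DEGs than expected by chance; in practice, the p-values of all tested terms are then corrected for multiple testing.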

Growth of Cell Suspensions
Cell cultures of S. queretaroensis from both treatments (SqY1 and SqR1, Figure 1) were evaluated for growth and color over 34 days (yellow to reddish gray, as indicated in the color index; Supplementary Figure 2). Four parameters were measured to follow growth, as stated in the Materials and Methods section; since they behaved in a similar manner in both treatments, only fresh weight is presented. Over the SqY1 culture cycle (treatment with 3% sucrose), a typical sigmoid curve with well-defined growth phases was observed. The lag phase occurred between days 0 and 8; the exponential phase occurred between days 8 and 20, with a progressive deceleration between days 20 and 22; and the stationary phase was observed from day 22 (Figure 1B). For the SqR1 treatment (grown with 8% sucrose), the lag phase lasted until the fourth day (Figure 1E). From that day, exponential growth was evident until the 20th day, and the stationary phase was observed between days 20 and 30, finally giving way to the cell death phase (Figure 1E).

Betalain Quantification
The quantification of betalains in the SqY1 line (Figure 1A) shows that the pigments present belonged mainly to the betaxanthin class, with a small amount of betacyanins. We observed a small increase in both betalains during the exponential phase, followed by a fall on day 9 and a recovery on day 12 (Figure 1C) that coincides with the stationary phase. The SqR1 line (Figure 1D) differed by one order of magnitude in betaxanthin content compared with the SqY1 line. However, SqR1 presented similar amounts of betacyanins and betaxanthins, with an apparent increase on the fourth day, a fall on the 6th day, and a recovery on day 8 (Figure 1F).

Transcriptome Assembly
Six transcriptome libraries were constructed with a total of 88 million raw reads: three corresponding to betaxanthin-producing cells (SqY1) and three to betacyanin-producing cells (SqR1). The error rate distribution was calculated for all libraries (Supplementary Figure 3), and all reads were then filtered to retain only those of the highest quality and free of adapter sequences (Supplementary Figures 3, 4); finally, we kept 13 G of clean bases with a Phred quality ≥Q20, with a total of 81.7 Gbp for the assembly (Table 1) and a GC content of around 46% (Supplementary Figure 5).
A total of 171,128 transcripts (Table 2) were assembled, with a minimum length of 201 nucleotides, a maximum length of 15,505, and an average length of 1,531. The N50 and N90 for the assembled transcripts were 2,379 and 703, respectively. We obtained 170,970 unigenes (Table 2) with a minimum length of 201 and a maximum of 15,505, while the average sequence length was 1,532. The N50 and N90 were 2,380 and 704, respectively (Table 3). This is reflected in the distribution of transcripts according to their length in nucleotides (Table 2), where it can be observed that 77% of transcripts were >500 bases.
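The N50 and N90 statistics reported above follow a standard definition: the length L such that sequences of length ≥ L together contain at least 50% (or 90%) of all assembled bases. A minimal sketch, with a hypothetical function name and toy lengths rather than the actual assembly, is:

```python
def nxx(lengths, fraction=0.5):
    """Length at which the cumulative sum of lengths, taken from the longest
    sequence downward, first reaches `fraction` of the total bases."""
    if not lengths:
        raise ValueError("no sequence lengths given")
    target = fraction * sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= target:
            return length


# Toy assembly with five transcripts (lengths in nucleotides)
toy_lengths = [100, 200, 300, 400, 500]
n50 = nxx(toy_lengths, 0.5)
n90 = nxx(toy_lengths, 0.9)
print(n50, n90)
```

Because N90 requires covering more of the assembly, it is always less than or equal to N50, as in the reported values.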

Gene Functional Annotation
The 170,970 unigenes were annotated against seven public databases: NR, Pfam, GO, KO, KOG, NT, and SwissProt (Supplementary Figure 6A). From the total of assembled transcripts, 26,159 were found in seven databases, 58.91% of them in the NR database, 45.01% in the SwissProt, and 21.23% in the KOG database (Figure 2A). From the total of annotated transcripts, 17,619 were found exclusively in the NR database, 1,644 in NT, and five were only found in KOG (Supplementary Figure 6B).
Regarding the blast species classification, 66.5% of the annotated sequences had a higher similarity to Beta vulgaris, 4.7% to Vitis vinifera, 1.6% to Nelumbo nucifera, and 1.4% to Theobroma cacao. The other 24.4% of the sequences had similarities to other species (Supplementary Figure 6C). About the highest range of similarity, 46.3% of the assembled transcripts had a similarity percentage of 60-80%, while only 1.5% of annotated sequences had a similarity value range of 95-100% (Supplementary Figure 6D).
From the total of annotated transcripts, the most represented e-value class (31%) was 1e-100, while only 4.6% had an e-value equal to 0 (Supplementary Figure 7A). Regarding the KOG classification, the main categories within annotated genes were general function prediction only; posttranslational modification, protein turnover, chaperones; and signal transduction mechanisms (Supplementary Figure 7B). Finally, for the KEGG classification, the main enriched pathways were carbohydrate metabolism; folding, sorting and degradation; and translation (Supplementary Figure 7C). All transcripts were mapped against the GO term database and classified according to the three main categories: biological process (BP), molecular function (MF), and cellular component (CC). Among the principal terms in the BP category are cellular process, metabolic process, and single-organism process; among the MF terms are binding and catalytic activity; and among the CC terms are cell, cell part, organelle, and macromolecular complex (Supplementary Figure 7D).
Frontiers in Sustainable Food Systems | www.frontiersin.org

Quantification of Gene Expression Levels and Differential Expression Analysis
Pearson's correlation indicated values above 0.9 among the SqY1 libraries and above 0.86 among the SqR1 libraries (Supplementary Figure 8A), but values close to 0.6 between libraries of different lines. The expression level of each gene was quantified with the RSEM program by mapping all clean reads of each library against our de novo assembled transcriptome. The counts for all six libraries were normalized to FPKM values (Supplementary Figure 8B).
From all the unigenes that were expressed at 0.3 FPKM or more (116,136 transcripts), we identified a total of 48,963 DEGs between treatments; 25,353 were upregulated and 23,610 were downregulated (Supplementary Figure 9B). Comparing the SqYC and SqRT groups, we identified 60,456 commonly expressed genes; 36,282 were expressed only in SqRT and 19,398 only in SqYC (Supplementary Figure 9C).

GO and KEGG Enrichment Analysis on DEG
Among the three main categories of the GO database, the terms within MF were the most enriched for the DEGs, with binding, heterocyclic compound binding, organic cyclic compound binding, ion binding, and protein binding being the most enriched terms. Within biological process (BP), the most enriched terms were protein metabolic process and cellular protein metabolic process (Supplementary Figure 10A).
The KEGG categories enriched among upregulated DEGs were protein processing in the endoplasmic reticulum and plant-pathogen interaction (Supplementary Figure 10B). For the downregulated DEGs, on the other hand, the enriched KEGG pathways were carbon metabolism, ribosome, and RNA transport (Supplementary Figure 10C). The upregulated genes had lower q-values overall than the downregulated genes, where ribosome was the category with the lowest q-value.

DEGs Associated With Betalain Synthesis
We focused on the biosynthetic pathways of aromatic amino acids and betalains, in which a total of 98 DEGs were identified (Figure 2A). Of those, 40 genes were upregulated and 58 were downregulated; 10 of them showed the most significant differences (-log10 q-value >35), and the clustered heat map (Figure 2B) shows the close relationship between the expression patterns of these genes.
Concerning the betalain biosynthetic pathway (Figure 4), only two gene clusters associated with 4,5-DOPA dioxygenase extradiol (DODA, EC 1.13.11.-) were observed (Figure 2B). The first, Cluster-2051.103340, is upregulated in yellow cultures and downregulated in red. The second, Cluster-2051.102127, shows the inverse regulation. We observed a similar behavior in the case of two tyrosine aminotransferases: Cluster-2051.66822 and Cluster-2051.73928. The first is slightly upregulated in yellow cells and downregulated in red cells, while the second is upregulated in red cells and downregulated in yellow cells.

Color Change Measurement by Image Analysis
According to the total betalain quantification (Figures 1C,F), a sucrose concentration of 8% enhanced betalain production in pitaya suspension cells. This solute concentration can create osmotic stress in the cells. Therefore, we performed a time course to follow the color change of yellow S. queretaroensis cell cultures grown in SqR1 or SqY1 media, with or without added polyethylene glycol 6000 to induce osmotic stress (Elmaghrabi et al., 2017). The cells showed no apparent changes in pigmentation in the first 3 days of culture (Figures 5A-C). On the 5th day, we observed a clear difference between the cells in the 8% sucrose medium and the cells with 5% PEG. The latter showed a slightly red pigmentation in comparison with the cells in 8% sucrose (Figures 5A,D-F). On the 7th day of culture (Figures 5A,G-I), the cells in the 3% sucrose medium showed no change in color; however, the cells cultivated in 8% sucrose or in the medium containing 5% PEG began to change pigmentation (Figure 5B). On the 9th day, the cells with 8% sucrose showed a greater shift toward red in comparison with the other two conditions (Figure 5B).

Microscopic Analysis
Confocal analysis shows betaxanthin accumulation in cells (Figure 6). Cells from the 3% sucrose and PEG conditions were taken on the 7th day, fixed, and stained with DAPI, which was used as a nuclear reference. We observed intact round cells in the bright field (Figures 6A,F) and DAPI in the 355-nm channel showing the cell nuclei. The nuclei did not show alterations in shape (Figures 6B,G). In the 488-nm channel, we observed a signal corresponding to betaxanthin (Figures 6C,H), consistent with the excitation wavelength of 450-490 nm reported for these pigments (Gandía-Herrero et al., 2005; Guerrero-Rubio et al., 2020). The autofluorescence occurs only in yellow cells at 488 nm (Supplementary Figure 11). The signal in PEG-treated cells was located in the nuclear and cytoplasmic areas of the cell, while the cells grown in 3% sucrose showed autofluorescence surrounding the cell nucleus.

DISCUSSION
Betalains are secondary metabolites derived from tyrosine and are unique to plants of the order Caryophyllales. In addition to attracting pollinators and seed dispersers, betalains also play a role in mechanisms of stress tolerance. Recently, betalains have gained importance owing to the wide variety of colors they display as well as their bioactive properties (Tesoriere et al., 2004; Khan and Giridhar, 2015; Osbourn, 2017; Polturak and Aharoni, 2018).
Sequencing tools, particularly RNA-seq analysis with de novo assembly, have proven very useful for identifying genes important for the synthesis of a large number of plant metabolites, including tyrosine-derived secondary metabolites such as alkaloids and other pigments such as anthocyanins (Zhao et al., 2014; Liu et al., 2015; Wei et al., 2015), in model and non-model plants. This approach is therefore well suited to studying particularities in the synthesis of specialized tyrosine-derived metabolites such as betalains. S. queretaroensis can produce a large number of betalain colors, which makes it an attractive system for deepening our knowledge of the synthesis of these pigments.
According to the growth curve and the pigment quantification, the days of greatest betalain accumulation coincide with the stationary phase, which is typically considered the phase in which secondary metabolites are produced. Therefore, day 16 was chosen for RNA extraction from cells (yellow and red) of both sucrose conditions as a first approach to the identification of DEGs. Since a reference genome is not available, seven databases were used to annotate the 170,970 unigenes, of which 66.5% had the highest degree of similarity to genes of B. vulgaris, one of the closest species with a publicly available genome and transcriptomes and also a betalain-producing species within the Caryophyllales. Moreover, functional annotation showed that many of the genes might be primarily related to protein synthesis and post-translational modifications, as well as cell transport, which is characteristic of the exponential phase in cell cultures.
Pearson's correlation analysis indicates a high correlation between the replicates of each condition, but differences between treatment conditions. The Pearson correlation values among the SqYC replicates, higher than 0.9, indicate greater reproducibility than those among the SqRT replicates, which were <0.9. This probably reflects the betalain content: in SqYC samples (SqY1-3), betaxanthins are present almost exclusively, while in SqRT samples (SqR1-3), betaxanthins and betacyanins are present in similar proportions, suggesting a more heterogeneous population. Despite this, Pearson correlation values close to 0.5 between the two groups indicate different expression profiles.
We expected an increased expression of genes related to tyrosine biosynthesis, especially arogenate dehydrogenase (TyrA; EC 1.3.1.78). This gene had a reduction in its expression in red SqR1 cells and a slight increase in yellow SqY1 cultivated cells. This is the opposite of what is expected, since the production of tyrosine is necessary for enhanced betalain biosynthesis, which has been observed in other betalain-producing species (Lopez-Nieves et al., 2018; Liu et al., 2019; Xi et al., 2019).
The results suggest two alternatives. The first is that in S. queretaroensis, tyrosine is preferentially synthesized from hydroxyphenylpyruvate (Yoo et al., 2013; Bedewitz et al., 2014) under specific conditions, such as the osmotic stress caused by growth media with 8% sucrose. This is suggested by the differential expression of tyrosine aminotransferase in the cultures. Moreover, it would indicate the presence of this alternative, cytosol-located pathway.
PEG treatment mimics osmotic stress (Elmaghrabi et al., 2017;Zhang and Shi, 2018), and our treatment was capable of inducing color change to red and betaxanthin accumulation inside yellow cells. In addition, the content of betacyanins is higher in cells of the SqR1 line than in SqY1. However, these results cannot confirm this assumption, since this pathway is common only in bacteria and legumes (Schenck and Maeda, 2018), and so, future analyses such as RT-qPCR, heterologous expression, and enzymatic characterization of the products of these genes should be carried out.
The second alternative is that the time point at which sampling was performed may not be ideal for identifying differentially expressed genes related to betacyanin synthesis, since the content of betacyanins and the expression of specific genes may change over time. This may be an effect of reseeding and of differences between the two cultures, such as the doubling time, which could be reflected in the phase of the cell cycle in which most cells are found, as they are not synchronized.
According to their functional classification, some of the genes are restricted to protein synthesis, nutrient transport, and DNA repair, which are common in the G2 phase of the cycle, when cells prepare to divide (Vasconsuelo and Boland, 2007). This could explain the TyrA downregulation, since tyrosine must be tightly regulated for normal cell growth (de Oliveira et al., 2019), and no bioinformatic evidence could be found for the expression of a prephenate dehydrogenase, which is responsible for tyrosine biosynthesis in legumes (Schenck et al., 2015).
We identified two DEGs corresponding to DOPA 4,5-extradiol dioxygenase (DODA; EC 1.13.11.-), which is consistent with data from other betalain-producing species (Lopez-Nieves et al., 2018; Liu et al., 2019; Xi et al., 2019). DODA has been identified as an essential gene for the synthesis of betacyanins, since its product, 4,5-seco-DOPA, undergoes a spontaneous rearrangement to produce betalamic acid, which is condensed with cyclo-DOPA to produce betanidin, which is in turn modified to produce the different varieties of betalains (Christinet et al., 2004; Gandía-Herrero and García-Carmona, 2012; Bean et al., 2018). DODA belongs to a family of enzymes with different catalytic properties, which are able to produce different amounts of 4,5-seco-DOPA. These enzymes also participate in functions not related to pigment synthesis but still related to tyrosine metabolism. This could explain why the two genes show opposite expression patterns (Christinet et al., 2004).

CONCLUSION
A high-quality transcriptome was obtained for S. queretaroensis, a cactus with promising potential as a source of betalains. Ninety-eight DEGs were related to the synthesis of tyrosine, the amino acid from which these pigments are derived, and to the betalain synthesis pathway. The arogenate dehydrogenase (2051.131393) was found to be downregulated in red cultures, which contradicts what is observed in the metabolism of aromatic amino acids in most plants, except for legumes. Two DEGs probably correspond to DODA, one of which (2051.103340) is upregulated in yellow cultures and the other (2051.102127) in red cultures. A similar case occurs with one tyrosine aminotransferase upregulated in yellow (2051.66822) and one in red (2051.73928), suggesting the possibility of an alternative cytosolic biosynthetic pathway, which is only found in legumes. The color change and betalain accumulation are related to stress, mainly osmotic stress, since PEG is capable of inducing betacyanin production.
This species, as well as its cell suspensions, may be crucial to the identification of genes essential for the synthesis of betalains. Finally, more in-depth analyses must be carried out to confirm or refute the results obtained.

AUTHOR CONTRIBUTIONS
JM: RNA extraction, quantification, transcriptomic and DEG results analysis, cell fixation, data analysis, and manuscript writing. JA-S: plant cell growth and manuscript writing. LC-C: plant growth, betalain quantification, and culture cycle characterization. AK: confocal microscopy analysis, including sample preparation. AP-S: bioinformatic data analysis and critical review and editing. MM-H: conceived the study and manuscript preparation. EC: conceived the study, data analysis, and manuscript preparation.

Supplementary Table 2 | Representative differentially expressed genes in aromatic amino acid and betalain biosynthesis.