Transcriptome and Biochemical Analysis of a Flower Color Polymorphism in Silene littorea (Caryophyllaceae)

Flower color polymorphisms are widely used as model traits from genetics to ecology, yet determining the biochemical and molecular basis can be challenging. Anthocyanin-based flower color variations can be caused by at least 12 structural and three regulatory genes in the anthocyanin biosynthetic pathway (ABP). We use mRNA-Seq to simultaneously sequence and estimate expression of these candidate genes in nine samples of Silene littorea representing three color morphs (dark pink, light pink and white) across three developmental stages in hopes of identifying the cause of flower color variation. We identified 29 putative paralogs for the 15 candidate genes in the ABP. We assembled complete coding sequences for 16 structural loci and nine of ten regulatory loci. Among these 29 putative paralogs, we identified 622 SNPs, yet only nine synonymous SNPs in Ans had allele frequencies that differentiated pigmented petals (dark pink and light pink) from white petals. These Ans allele frequency differences were further investigated with an expanded sequencing survey of 38 individuals, yet no SNPs consistently differentiated the color morphs. We also found one locus, F3h1, with strong differential expression between pigmented and white samples (>42x). This may be caused by decreased expression of Myb1a in white petal buds. Myb1a in S. littorea is a regulatory locus closely related to Subgroup 7 Mybs known to regulate F3h and other loci in the first half of the ABP in model species. We then compare the mRNA-Seq results with petal biochemistry which revealed cyanidin as the primary anthocyanin and five flavonoid intermediates. Concentrations of three of the flavonoid intermediates were significantly lower in white petals than in pigmented petals (rutin, quercetin and isovitexin). The biochemistry results for rutin, quercetin, luteolin and apigenin are consistent with the transcriptome results suggesting a blockage at F3h, possibly caused by downregulation of Myb1a.


INTRODUCTION
Flower color has played a pivotal role in our current understanding of biology since Mendel's discovery of the inheritance of flower color in Pisum sativum (Mendel, 1866;Ellis et al., 2011). Since then, flower color has contributed to a wide range of important biological discoveries including gene regulation (Napoli et al., 1990), pleiotropy (Streisfeld and Rausher, 2011), population genetics (Wright, 1943;Schemske and Bierzychudek, 2001), speciation (Bradshaw et al., 1995;Hopkins and Rausher, 2011) and floral ecology (Irwin and Strauss, 2005;Eckhart et al., 2006;Strauss and Whittall, 2006). Although many breakthroughs involving flower color utilize model species with available complete reference genomes, numerous evolutionary and ecological questions regarding flower color variation reside in non-model species. Investigating the cause of flower color variation in non-model species would benefit from an efficient method for sequencing and detecting expression of all flower color related genes in non-model species.
The most common floral pigments are the anthocyanins (Miller et al., 2011;Campanella et al., 2014) which are produced by the anthocyanin biosynthetic pathway (ABP). Floral anthocyanins are now considered a metamodel because of the conserved nature of the biosynthetic pathway across most angiosperms (Kopp, 2009). Changes in the color of anthocyanins (e.g., shifts from blue to red flowers) and the loss of floral anthocyanins (producing white flowers) can now be traced from ecological interactions in the field to the biochemical and molecular basis for these changes (Tanaka et al., 2008;Davies, 2009;Hopkins and Rausher, 2011;Zhao and Tao, 2015). These changes can result from mutations in core structural genes or regulatory loci . It is expected that null coding mutations will be more frequent within species; and cis-regulatory mutations between species (Stern and Orgogozo, 2008), which has been demonstrated in polymorphic populations of some species such as Mimulus lewisii (Wu et al., 2013).
The ABP is composed of seven core enzymes and several side-branching enzymes, and appears largely conserved across angiosperms (Holton and Cornish, 1995;Grotewold, 2006). Blockages in the first steps of the ABP are predicted to have more dramatic physiological effects and potentially ecological consequences compared to blockages in the latter steps because of the lack of stress-responsive flavones and flavonols . These maladaptive consequences of blocking the ABP can be ameliorated by recruiting tissue-specific regulators in order to limit effects solely to the petal (Streisfeld and Rausher, 2011;Wessinger and Rausher, 2012;Sobel and Streisfeld, 2013). The ABP is regulated by a complex composed by three interacting transcription factors: the R2R3-MYB, the basic helix-loop-helix (bHLH) and the WD40-repeat (WDR) (Hichri et al., 2011). The MYBs confer the majority of the tissue specificity (Zhang et al., 2003;Schwinn et al., 2006;Dubos et al., 2010;reviewed in Albert et al., 2014). Collectively, the core ABP, side-branches within the ABP, genes leading into the ABP and regulatory genes provides a relatively large target for a diversity of mutations that could block the ABP (Wessinger and Rausher, 2012). For flower color polymorphic plants, locating the blockage and predicting the physiological and ecological consequences require a thorough characterization of the ABP at the biochemical and molecular scales (e.g., Lou et al., 2014;Nishihara et al., 2014). Deciphering the cause of flower color variation is a complicated task because it requires sequencing and measuring expression of all genes acting in the ABP and their regulators.
RNA-Seq is a fast and efficient approach to sequence and examine the expression of all ABP-related genes, even when a reference genome is not available as in most non-model species (Li et al., 2012;Lulin et al., 2012;Xu et al., 2013;Butler et al., 2014). For flower color polymorphisms, petal mRNA must be examined across a range of developmental stages, especially the earliest stage when the flower color polymorphism is first manifested Dick et al., 2011;Butler et al., 2014). Large, complex genomes often exhibit multiple paralogs, sometimes expressed in the same tissue (e.g., Martins et al., 2013;Yuan et al., 2014). Differentiating paralogs and getting paralog-specific expression levels can be a complicated step in the mRNA-Seq bioinformatics pipeline. Once a candidate gene has been identified with either sequence or expression differences that correlate with flower color, subsequent biochemical analysis of the petal can be used to test the flavonoid composition.
A blockage in the ABP should restrict the production of downstream flavonoids and may lead to an accumulation of upstream intermediates Dick et al., 2011). High-Performance Liquid Chromatography (HPLC) coupled with mass spectrometry has been extensively used to identify and quantify the flavonoid composition in many ornamental and wild plants (Fossen and Andersen, 2006;Qiao et al., 2011). For instance, high concentrations of anthocyanins in black cultivars of Dahlia, were related with elevated expression of the ABP genes and low concentrations of flavones (Thill et al., 2012).
The genus Silene (Caryophyllaceae) is a model for studies of evolutionary ecology (Bernasconi et al., 2009), yet no one has examined the molecular and biochemical basis for flower color polymorphisms in any of the species (yet see the proposed ABP in Kamsteeg et al., 1979). Although the Caryophyllales are largely characterized by the production of betalain pigments in place of anthocyanins, flower color variation in Silene is still caused by anthocyanins (Brockington et al., 2011). Herein, we focus on a discrete flower color polymorphism in the Iberian Peninsula endemic, S. littorea (Talavera, 1979). After surveying 17 populations across the species range, we found most populations were composed of primarily dark pink individuals ( Figure 1A). Yet, in two northern populations, there were three distinct color morphs: white, light pink, and dark pink (Figures 1B-D). The three petal color morphs were compared with a UV-Vis spectrophotometer. The differences among them are concentrated in the visible range especially at the typical wavelength that anthocyanins absorb (∼550 nm; Figure 1E), yet the biochemical and molecular underpinnings causing this flower color variation in S. littorea remains unknown.
Here, we examine the petal transcriptome and biochemistry of the flower color polymorphism in S. littorea using RNA-Seq complemented with HPLC flavonoid profiling to try and identify the most likely cause of the blockage in the ABP. Transcriptome analysis is used to sequence and estimate expression of 15 ABP-related genes. The sequences of these candidates are examined to determine if there are any consistently color differentiating SNPs. Simultaneously, we estimate expression differences between color morphs to establish if downregulation of any ABP-related genes correlate with white petals. We complement our RNA-Seq results with an investigation of the petal biochemistry of the three color morphs by identifying the primary anthocyanin pigment and flavonoid intermediates. We then compare their relative abundances among the three color morphs to help target the blockage in the ABP leading to white petals. The biochemical results are interpreted in light of the SNP and expression findings from the transcriptome analysis.

Plant Species
Silene littorea (Caryophyllaceae) has an anthocyanin petal polymorphism with three distinct categories-dark pink (D), light pink (L), and white (W) (Figures 1B-D). It belongs to the section Psammophilae (Oxelman et al., 2013), that is composed of five diploid (2n = 24; Talavera, 1979), annual, primarily pink flowered taxa. The species grows primarily in coastal sand dunes of Spain and Portugal ( Figure 1A; Talavera, 1979). It has a gynodioecious-gynomonoecious sexual system and produces a highly variable number of flowers per plant (Casimiro-Soriguer et al., 2013), yet the flower color is constant among flowers within a plant (unpublished observations). There is no correlation between flower color and sexuality (unpublished observations), but white individuals are much more common in the northwestern portion of the species range compared to the southeast ( Figure 1A).

Sampling and RNA Extraction
Plants were sampled from a polymorphic population near the northwestern range limit (Playa de Barra, Spain; 42 Figure 1A). Petals from W, L, and D flowers were sampled at three developmental stages: bud, opening, and anthesis ( Figure S1 in Supplementary Material). All five petals from the same flower were collected, immediately preserved in RNAlater (Ambion, Inc., Austin, Texas), and stored at −20 • C until RNA extraction. RNA was isolated using the RNeasy Plant Mini Kit (Qiagen, Valencia, CA). Concentration and purity of RNA was measured with a NanoDrop ND-1000 spectrophotometer (NanoDrop Technologies, Inc., Wilmington, DE) and agarose gels were run to verify RNA integrity. The nine samples with the highest concentration of RNA for each of the three color morphs and developmental stages were selected: bud white, opening white, anthesis white, bud light, opening light, anthesis light, bud dark, opening dark, and anthesis dark.

Library Preparation and Sequencing
RNA-Seq libraries were prepared and sequenced at the Epigenome Center of the University of Southern California following the manufacturers protocol (Illumina, San Diego, CA). The nine libraries were barcoded (6 bp), pooled in equimolar concentrations and loaded on a single lane of the Illumina Hi-Seq 2000 system. Sequencing consisted of 50 cycles of singleend sequencing-by-synthesis reactions. Fortuitously, the libraries were sequenced twice. After a preliminary analysis indicated that both datasets produced qualitatively similar results, we merged them for all analyses reported herein. In the combined dataset, there was an average of 37.9 million reads per sample (range 36.3-40.0 million). Raw sequencing reads were deposited to NCBI's Short Read Archive (Accession number SRP033277).

De novo Assembly of ABP-Related Loci
Since there are no closely related genome resources for Silene, we conducted de novo assembly of the S. littorea transcriptome following a similar pipeline developed by Butler et al. (2014) relying largely on VELVET v.1.2.07 (Zerbino and Birney, 2008) and OASES v.0.2.08 (Schulz et al., 2012). We assembled using the FASTQ files across a range of k-mers (23-39) to maximize ABP candidate gene coverage. Assembled contigs were identified by BLAST+ against the Arabidopsis thaliana RefSeq database from TAIR (v. 10; Swarbreck et al., 2008), limiting the results to e < 10 −10 .
We made sequence comparisons and expression analyses of 15 ABP candidate genes. These include seven core ABP structural genes [chalcone synthase (Chs), chalcone isomerase (Chi), flavanone-3-hydroxylase (F3h), dihydroflavonol 4reductase (Dfr), anthocyanidin synthase (Ans), flavonoid 3-O-glucosyltransferase (Uf3gt), and acyltransferase (At)], three genes immediately upstream of the ABP [phenylalanine ammonia-lyase (Pal), cinnamate 4-hydroxylase (C4h), coumarate CoA ligase (4Cl)], two side-branching genes [flavonol synthase (Fls), flavonoid 3 ′ hydroxylase (F3 ′ h)], and three transcriptional regulators [basic helix loop helix (Bhlh), WD40 Repeats (Wd40), and R2R3-MYB domains (Mybs)]. For each of the nine samples, we extracted all contigs from the candidates for each gene of the ABP and these contigs were aligned in BioEdit (Hall, 1999). For two regulatory genes, Wd40 and Mybs, two and seven different sequences were found. For five core ABP genes, two to three very distinct sequences of the same gene were assembled. Even though these loci blasted to the same ABP-related locus in the TAIR database, the alignment suggested they were unlikely orthologous. Therefore, we treated them separately as putative paralogs (hereafter "locus") in all further analyses of 29 loci in total. For each locus, a consensus sequence with ambiguities representing all the variable sites among the nine samples was extracted as the reference for gene expression. All ABP-related reference loci are available in Genbank nucleotide database [http://www.ncbi.nlm.nih.gov/nuccore/] Accession numbers KT954895-KT954923.

Sequence Comparisons of ABP-Related Loci
For all 29 ABP related loci, we tested for single nucleotide polymorphisms (SNPs) that correlated with flower color. We started by mapping the microreads back onto the de novo consensus sequences using Mosaik (Lee et al., 2014). We followed the author's recommendations for the parameter settings: two allowed mismatches and a hash length of 15. We then used Picard v.1.94 (http://broadinstitute.github.io/picard/) to identify PCR duplicate reads-an artifact of the library preparation methodology. The Genome Analysis Toolkit v.2.6-4 (GATK, McKenna et al., 2010) was used to (1) re-align the reads around potential indels, (2) remove the PCR duplicates identified in Picard and finally (3) identify SNPs in each of the nine samples across the 29 ABP-related loci (DePristo et al., 2011). We calculated the allele frequencies for each SNP for each color type using the genotype field (GT) in the GATK output file (we surveyed 3 individuals per color morph = 6 alleles per color morph). We also calculated the mean likelihood of genotype assignment (0/0, 0/1, or 1/1) for each color type (parameter PL in the GATK output).
After finding allele frequency differences among pink and white individuals in Ans, we conducted a survey of additional individuals from the same population using genomic DNA. DNA was extracted from 19 pink and 19 white individuals following the PL2 protocol in the NucleoSpin Plant II kit (Macherey-Nagel, Bethlehem, PA). We designed primers specific to SlAns in order to amplify and sequence a 677 bp fragment including the color differentiating SNPs and a 110 bp intron (ANS-416F: CTAGTGGCCAACTCGAGTGG & ANS-1021R: CAAAGGTTCGAGGCGGGTAA). PCR conditions followed those of Dick et al. (2011) using Taq polymerase from New England Biolabs (Ipswich, Massachusetts) with the following thermal cycling steps: initial denature at 95 • C for 3 min; 35 cycles of 95 • C for 30 s, 60 • C for 30 s, 72 • C for 90 s; a final extension at 72 • C for 10 min; and a 4 • C hold. PCR products were purified using exoSAP and sequenced using Big Dye Terminator methodology on an ABI 3730xl DNA Analyzer (Sequetech Corp., Mountain View, California). Contigs were produced from forward and reverse reads. Contigs were then aligned and allele frequencies calculated in Geneious v.8.1.6 (Auckland, New Zealand). The flower color of each additional sample in the Ans survey was verified with an anthocyanin extraction of the petals and quantification by measuring their absorbance at 520 nm (see Materials and Methods in Del Valle et al., 2015) following correction for the petal area ( Figure S2 in Supplementary Material).

Expression Analysis
To extract the number of reads mapped for each gene from the bam file produced by Mosaik we used Artemis (Rutherford et al., 2000). For differential expression analysis we used the DESeq package (Anders and Huber, 2010) in R (R Core Team, 2013). This package requires the normalization of the raw counts. After normalization, only those loci above the 33rd quantile are further analyzed for differential expression. This filtering step is neccesary to avoid spurious estimates of fold-change differences due to very low expression values. A negative binomial tests was applied to identify any statistically differentially expressed loci (p < 0.05). Although we analyzed the three developmental stages separately, we focus the remainder of the expression analysis and interpretation on the bud stage since the petal color is already present in the bud ( Figure S1 in Supplementary Material) and therefore, the causal genes should be detectable at this developmental stage.

Phylogenetic Analysis of R2R3 Mybs
To help infer which S. littorea petal Mybs may be involved in anthocyanin biosynthesis, we compared the seven distinct Mybs identified in S. littorea to known regulators of the ABP from several model species (Antirrhinum majus, Arabidopsis thaliana, Chrysanthemum morifolium, Eucaliptus gunnii, Fragaria ananassa, Fragaria chiloensis, Gerbera hybrida, Ipomoea nil, Lycopersicon esculentum, Malus domestica, Mimulus aurantiacus, Salvia miltiorrhiza, Oryza sativa, Petunia hybrida, Populus trichocarpa, Vitis vinifera, and Zea mays-accession numbers can be found in Table S1 in Supplementary Material). We conducted both maximum likelihood (RAxML; Stamatakis, 2014) and Bayesian (MrBayes; Huelsenbeck and Ronquist, 2001) phylogenetic analyses of the aligned nucleotids of the R2R3 binding domain (315 bp) using plug-ins within Geneious v.8.1.6. For the maximum likelihood analysis, we fit a GTR+CAT+I model followed by 1000 bootstrap replications. For the Bayesian analysis, we applied the GTR+I+G model of sequence evolution for two separate runs, each consisting of four independent chains run for 5,000,000 generations sampling every 50,000 generations after 1,000,000 generations of burnin. Bayesian runs were checked for proper mixing and convergence using standard diagnostics.

HPLC Analysis of Flavonoids
Petal flavonoids were identified for three dark pink samples. After that, we compared specific compounds in three white, six light pink, and six dark pink samples. Flavonoids were extracted from four petals of an anthesis stage flower that were preserved in 1 mL of CH 3 OH:H 2 O:HCl (90:9:1, v:v:v) and stored on ice and in the dark until the flavonoids could be extracted. The samples were homogenized using 5 × 3 mm diameter glass beads in a Mixer Mill MM 200 (Retsch, Haan, Germany) with a frequency of 30 oscillations/s. They were beaten until the sample was completely homogenized (minimum of 60 s). The supernatant was removed after 10 min centrifugation (13,000 rpm) and stored at −20 • C until it could be separated by HPLC.
Chromatographic separation was performed using a Perkin Elmer Series 200 HPLC system (Wellesley, USA) coupled to an Applied Biosystems QTRAP LC/MS/MS system (Foster City, USA) consisting of a hybrid triple-quadruple linear ion trap (QqQLIT) mass spectrometer equipped with an electrospray ion source. HPLC analyses were performed on a 150 × 2.0 mm Phenomenex Luna 3u C18(2) 100A reversed-phase column with a particle size of three µm. The flow rate was 0.2 mL/min. To identify and quantify the flavonoid compounds in the petals of S. littorea, we performed multiple reactions monitoring (MRM) combined with precursor ions scan and subsequent MS/MS analysis (Li et al., 2006;Qiao et al., 2011). We used the standards of the flavonoids that were previously reported for S. littorea and others species of Silene (Table S2 in Supplementary Material). The standards were obtained from SDS (Toulouse, France). The parameters for the MRM transitions and HPLC-ESI-MS/MS analyses were fixed following Dardanelli et al. (2008), with the exception of the dwell time for each transition which was set to 0.05 s. Flavonoid amounts were corrected for flower size using the total area of the petals measured with the software ImageJ (US National Institutes of Health, Bethesda, MD, USA, http://imagej.nih.gov/ij/). Size-corrected flavonoid amounts were standardized by their maximum value. Significant differences in individual flavonoid concentrations were examined for the three color categories (D, L, and W) in an ANOVA after log transformation in R v.3.1.0 (R Core Team, 2013). When significant, we conducted Tukey HSD post-hoc paired tests to

De novo Assembly of ABP-Related Genes
We identified all 15 ABP-related genes from de novo assembly of the petal transcriptome. The longest contigs from the de novo assembly were most frequently from Velvet k-mer 31 ( Table 1), but supplemented by contigs from other kmer analyses as necessary. After BLAST+ identification against all known genes of A. thaliana, multiple putative paralogs were identified for seven genes producing a total of 29 ABP-related loci ( Table 2).
Three of the genes that feed into the ABP had two or three copies each (Pal, C4h, and 4Cl). Most of the ABP structural genes and their side-branches had only a single locus expressed in the petals except Chs and F3h which had two and three copies respectively. Of the three regulatory loci, there were seven Mybs, two Wd40s, and only one Bhlh locus. The bHLH locus is closely related to the AN1 locus of Petunia and TT8 locus from Arabidopsis, both regulators of the ABP (see Figure S3 in Supplementary material). We sequenced 100% of the coding sequence (CDS) for 28 of the 29 ABP-related loci (all except Myb5). In addition, we acquired an average of 121 bp of the 5 ′ UTR sequence (range 35-451 bp) and 170 bp of the 3 ′ UTR (range 21-306) ( Table 2).

Sequence Comparisons among Color Morphs
Among the nine samples, we found 622 SNPs in the 5 ′ UTR, CDS and 3 ′ UTR of the 29 ABP-related loci ( Table 2). The number of SNPs per gene was highly variable, ranging from zero to 91 in F3h1 and Pal1, respectively ( Table 2). Although we found numerous non-synonymous SNPs in several loci, none of them consistently differentiated the three color morphs.
Ans was the only gene that had SNPs with allele frequencies consistently associated with flower color (Table S3 in Supplementary Material). A total of 32 SNPs were found in the 5 ′ UTR, CDS and 3 ′ UTR in Ans, yet nine of these between positions 697-1099 exhibited substantially different frequencies in dark pink vs. white samples ( Figure S4 in Supplementary Material). Furthermore, the likelihood of being homozygous for one allele or the other was also strongly correlated with petal color ( Figure S5 in Supplementary Material). There was a very low likelihood of heterozygosity at all nine of these SNPs for all three color morphs. Nevertheless, all of these color-differentiating SNPs were synonymous. Additional sequencing of the 677 bp region (including the 110 bp intron) of Ans in 19 pink and 19 white individuals contained all of the color differentiated SNPs except the last one at bp 1099. Seventeen SNPs including two additional SNPs discovered in the intron were examined, however no single SNP consistently differentiated pink and white individuals (mean allele frequency = 0.21; Table 3).

Expression Comparisons among Color Morphs
Since petal anthocyanins are detectable in the bud stage ( Figure  S1 in Supplementary Material), we infer that all ABP-related loci should have been expressed by this developmental stage. Thus, we focus on the three bud stage samples for expression comparisons (expression values for later developmental stages are available in Table S4 in Supplementary Material). The DESeq corrected expression estimates for the bud stage ranged from 1459.8 reads-494,875.6 reads (median = 8713.9 reads; Table S4 in Supplementary Material).
When comparing the two pigmented morphs (dark pink and light pink), there are two signficantly differentially expressed loci

Phylogenetic Analysis of R2R3 Myb Loci
The phylogenetic analysis of the R2R3 Myb DNA binding domain including several ABP-related Mybs from model species indicates numerous S. littorea Mybs are likely ABP regulators. RAxML and MrBayes phylogenetic analyses produced nearly identical topologies. For simplicity, we present the RAxML results (Figure 3). In particular, SlMyb4, SlMyb1a, and SlMyb1b, are strongly supported as sister to Subgroup 7, which controls the first dedicated steps of the ABP gene regulation including F3h in A. thaliana (Stracke et al., 2007;Dubos et al., 2010). SlMyb5 and SlMyb6 grouped with a large number of unresolved eudicot Mybs of Subgroup 6 which are known to control the later genes of the ABP (Figure 3; Dubos et al., 2010). Both of these mybs have the expected bHLH interaction residues and the ANDV motif in the R3 domain that are characteristic of Subgroup 6 (results not shown). SlMyb2 and SlMyb3 are less likely involved in the blockage of the ABP since they are not closely related to exemplars from Subgroups 6 or 7. Both of these mybs have the bHLH binding domain and the C1 conserved motif, but the C2 motif is only present in SlMyb3 (Dubos et al., 2010;Yoshida et al., 2015).

Identification and Quantification of Flavonoids
HPLC analysis revealed three anthocyanin compounds (glycosylated cyanidin derivatives) responsible for the petal color in S. littorea (Table S5 in Supplementary Material). We detected seven additional flavonoids: four flavones (identified from standards as apigenin, isoorientin, isovitexin and luteolin), two flavonols (quercetin and rutin), and one dihydroflavonol (dihydroquercetin). No flavonoids matching the flavanone narigenin nor the isoflavone genisteine, from the earliest dedicated steps of the ABP, were detected. The putative location of these flavonoid intermediates in the ABP is shown in Figure 4. We compared the relative amounts of anthocyanins and their intermediates across color morphs to link the transcriptome results to the phenotype. The amount of cyanidin derivatives significantly increased with the intensity of the petal color as expected (Table 4, Figure 5). In three of the five flavonoid intermediates (rutin, isovitexin and quercetin), the relative amounts of metabolites in the color morphs were significantly different. Amounts of luteolin derivatives and apigenin were not significantly different among the color morphs (Table 4, Figure 5). Post-hoc pairwise comparisons among the three color morphs indicate that differences were always strongest between pigmented and white petals, except for quercetin where white and pink were not significantly different from each other. Light and dark pigmented morphs did not differ in the relative amount of any of the five flavonoid intermediates (Table 4, Figure 5).

DISCUSSION
We sequenced and measured expression of all ABP-related genes from the petals of the non-model species, S. littorea. We assembled complete coding sequences of 28 out of 29 ABP-related loci and identified over 600 SNPs, yet none are sufficient to confer a structural blockage in the ABP. This study is the first to sequence and measure expression of structural and regulatory FIGURE 2 | Expression differences, estimated as fold changes, between dark/white (dark pink bars) and light/white (light pink bars) petals at bud developmental stage. Fold change values between pigmented and white petals after normalization of the raw counts are shown. Only those loci that are above the lower 33 quantile are included. This filtering step prevents spourious estimates of fold-change values due to very low expression.
genes of the ABP in the petals of Silene. A previous transcriptome analysis of white flowered S. vulgaris has been completed, but it utilized pooled RNA from leaves, roots and whole flowers (Blavet et al., 2011;Sloan et al., 2012). Recently, RNA-Seq studies in nonmodel species of Mimulus, Muscari, and Parrya have similarly used de novo assemblies followed by expression analyses of the ABP genes (Butler et al., 2014;Lou et al., 2014;   *P < 0.05; **P < 0.01; ***P < 0.001.

2014). As in
Silene littorea, these studies also identified the ABP genes and some of the regulatory loci, confirming our result that RNA-Seq is an efficient tool for narrowing the number of candidate genes responsible for a flower color polymorphism. We assembled the complete CDS of most of the ABP loci (28 out of 29) compared to an average of 71% in Muscari and 89% in Parrya likely due to the excessive coverage following the replicate sequencing runs (Butler et al., 2014;Lou et al., 2014). The concentration of synonymous SNPs that correlated with flower color in Ans was initially encouraging and warranted further investigation. Unfortunately, none of these SNPs consistently differentiate the flower color samples (highest D-W allele frequency difference is 0.45). Given that there are numerous non-color differentiating SNPs on either side of the cluster of color-related SNPs, it is unlikely that we surveyed a region adjacent to a structural blockage in the ABP at ANS. Furthermore, since none of these SNPs cause any changes to the amino acids, they should have no structural effect on the enzymes activity. Lastly, these SNPs are unlikely to cause regulatory changes since in Arabidopsis and Ipomoea, the regulation of Ans occurs in the promoter (Dong et al., 2014;Xu et al., 2014) where MYB and bHLH binding sites are found.
The expression analysis identified several significantly differentially expressed genes in the petal buds where there was substantially lower expression in white samples compared to pigmented samples. Although we have focused our interpretation of expression differences on the developmental stage closest to when the pigment difference between pink and white become apparent (bud), we have provided results for later petal developmental stages as well (Table S4 in Supplementary Material). In particular, F3h1 exhibits significantly different expression for both dark pink vs. white and light pink vs. white comparisons. In fact, these are the two largest fold-changes in expression of pigmented vs. white among all ABP-related loci. Changes in F3h1 expression could be due to mutations in the cis-regulatory elements of the promoter or changes in the transacting myb-bHLH-WD40 regulatory complex. Unfortunately,  Table S1 in Supplementary Material. there is no DNA sequence variation in the 129 bp of the 5 ′ UTR that we have sequenced, thereby limiting our ability to associate this region with any adjacent cis-acting color-differentiating SNPs. Mutations upstream from the 5 ′ UTR cannot be excluded as a cause of F3h1 downregulation in S. littorea since this locus can affect flower color as has been reported in other species (Dedio et al., 1995;van Houwelingen et al., 1998). Nishihara et al. (2014) found that a white-petaled Torenia was caused by a retrotransposon in the promoter of the F3h gene. In addition, antisense suppression of F3h in carnation resulted in a variety of transgenic plants showing a range of loss of function, from subtle attenuation to complete loss (Zuker et al., 2002).
The regulation of ABP genes through changes in expression of their regulatory elements, could also lead to the differential expression observed in F3h. In Mimulus aurantiacus, MaMyb2 regulates the expression of F3h, Dfr, and Ans. When MaMyyb2 was silenced, the expression of these genes was significantly lower than the control . In S. littorea, the gene tree of MYBs, placed SlMyb5 and SlMyb6 in the same group as MaMyb2 and many other known ABP regulators (Subgroup 7 according to Dubos et al., 2010), and also presented a reduction of expression in the non-anthocyanin morph, although not significant. This result highlights SlMyb5 and SlMyb6 as tentative regulators of the expression of the ABP genes, however further experiments are needed to test this hypothesis.
Significant differences in expression were also found in C4h2 (dark pink vs. white), F3 ′ h (light pink vs. white) and the transcription factor Myb1a (dark pink vs. white and dark pink vs. light pink). C4h2 is a pre-ABP gene acting in the general phenylpropanoid biosynthetic pathway (Ehlting et al., 2006). Since we did not detect the biochemical product of C4H (nor were the products of CHS and CHI, chalcone and naringinen, identified), we cannot differentiate whether there is a blockage at C4H due to decreased expression or if the enzymes downstream of C4H are consuming all of the product during flux down the ABP. Interestingly, suppression of the first two dedicated genes of ABP such as Chs or Chi would eliminate most flavonoid intermediates without affecting the production of upstream compounds including the volatile benzenoids responsible for floral scent (Clark and Verwoerd, 2011). This is because C4h2, Chs, and Chi are all located downstream of the production of cinnamic acid, the initial substrate of this side branch (Davies and Schwinn, 2006;Ben Zvi et al., 2008). Although differences were not significant, expression of C4h2 was also much higher in light vs. white morphs. The phylogenetic tree of Mybs showed that SlMyb1a (closely related to SlMyb1b, with 68% of amino acid similarity) and SlMyb4 are closely related to Subgroup 7 (sensu Dubos et al., 2010) which controls several genes in the first half of the ABP including F3h in A. thaliana (Dubos et al., 2010). On the other hand, we also found that light pink and  Figure 4). Log transformed, size corrected relative peak areas are compared for white, light pink, and dark pink samples. The boxes represent the 25th and 75th percentiles, the whiskers are the 5th and 95th percentiles, the central solid lines are the median values, and circles represents outliers.
dark pink petals showed differential expression in two Myb transcriptional regulators. This suggests that differences in color intensity between the light and dark pink morphs could be due to a different candidate ABP locus than the loss-of-function (e.g., Hopkins and Rausher, 2011;Yuan et al., 2013), rather than heterozygosity of a single loss-of-function locus in the white morph. In fact, during the SNP assignment, the light pink morph never had a higher probability of being heterozygote, however confirming the number of loci responsible for these three color morphs must be evaluated with an F 2 population that is segregating for flower color.
A structural or regulatory blockage in the ABP would decrease the amount of flavonoid intermediates below the blockage, but would increase the amount of intermediates in upstream side branches (depending on the dynamics of metabolite flux through the pathway). The flavonoid biochemical analysis identified cyanidin as the primary anthocyanin and an additional five flavonoid intermediates to compare among the color morphs. Only three of them were significantly different between white and pigmented individuals and two (quercetin and rutin) are consistent with a blockage at or above F3H. Consistent results between HPLC and expression analysis were found in Parrya nudicaulis, where the white morph did not produce catechins or flavonols due to the reduced expression in Chs (Dick et al., 2011).
Nevertheless, in Iris lutescens, the production of non-anthocyanic flavonoids (including chalcones, flavones and flavonols) in the yellow morph was higher than in the purple (Wang et al., 2013). In Muscari armeniacum, Lou et al. (2014) found that except for anthocyanins (delphinidin and cyanidin), the white morph contained the same metabolites as the blue, and generally at higher concentrations. They argue that the blockage in DFR in the white morph, caused a redirection of the flux of metabolites through a side-chain to other products. A similar argument may hold between the light and pink morphs in S. littorea. Although differences were not significant, the light morph of S. littorea showed a trend toward higher concentrations of flavonols and flavones compared with the dark pink morph (Figure 5), which could be due to a redirection of the flux of metabolites.
Based on our biochemical analysis, we have proposed a tentative metabolic pathway of anthocyanin in the petals of S. littorea (Figure 4). The pink color of the petals is caused by the accumulation of cyanidin 3-glucoside derivatives, as is found in S. armeria (Iwashina and Ootani, 1987) and S. dioica (Kamsteeg et al., 1979). Dark pink flowers in S. littorea showed the same cyanidin 3-glucoside derivatives but in a much higher concentration than light pink flowers, which suggest that the pink intensity is caused by the different concentration of these compounds. In other species, it has been proposed that copigments such as flavones and flavonols play an important role in the color or intensity of the petals (Gould and Lister, 2006;Thill et al., 2012;Nishihara et al., 2014). For example, brown color of outer part of the labellum of Ophrys speculum is suggested to be caused by the flavonols acting as co-pigments of cyanidins (Vignolini et al., 2012). In S. littorea, flavonols and flavones are not expected to play a key role in the pink intensity since higher concentrations were not found in darker petals.
The lack of anthocyanins, and the lower levels of other flavonoids in petals of the white-flowered morph of S. littorea could result in a fitness disadvantage in stressful conditions. These pigments (and some of their intermediates) are known to influence pollinator visitation, attraction of florivores and susceptibility to pathogens (e.g., Hoballah et al., 2007;Johnson et al., 2008;Falcone Ferreyra et al., 2012). Furthermore, anthocyanin-less morphs may more susceptible to abiotic stresses such as heat, cold and dessication (reviewed in Winkel-Shirley, 2002). Interestingly, individuals of the white campion, S. pratensis, lacking glyscosilated isovitexin showed ruptured upper epidermal cells that caused curved petals (Van Brederode et al., 1982). The possible disadvantage of the lack of anthocyanins or other flavonoids can be even higher when vegetative tissues are also affected (Levin and Brack, 1995;Warren and Mackenzie, 2001). This could be the case of a different type of white-flowered mutant that appears rarely in a few southern populations of S. littorea. This whole-plant mutant is not able to produce anthocyanins in other tissues of the plant (see the calyx in Figure S1E in Supplementary Material), and is found at very low frequencies (<0.05%; Casimiro-Soriguer, 2015). Mutations in structural genes are commonly responsible for low frequency white-flowered mutants in several other species (i.e., Coberly and Rausher, 2008;Wu et al., 2013). Thus, the rare mutant in S. littorea could be also due to a coding mutation, but future experiments should be carried out to answer this question. However, the high frequency white-flowered mutant studied here, is able to produce anthocyanins and flavonoids in other tissues of the plant including calyx, leaves and stem (see calyx in Figure S1E in Supplementary Material). Instead, we posit that regulatory changes in SlMyb1a affects expression of SlF3h1 (at least) which is then the most likely blockage in the ABP for these northwestern Iberian white-flowered morphs of S. littorea.

CONCLUSIONS
We used RNA-Seq to simultaneously sequence and estimate expression of 29 ABP-related loci among three flower color morphs of the non-model plant, S. littorea. After sequencing the complete coding regions of all structural genes and most regulatory loci, we found a cluster of nine synonymous SNPs around the intron in Ans whose frequencies differ among color morphs, yet their functional significance is unclear. Additional sampling confirmed these Ans allele frequency differences, yet no single SNP consistently differentiates the color morphs. Instead, there is consistent and significant downregulation in the expression of F3h when comparing pigmented and white petal buds which may be influenced by decreased expression of Myb1a-a regulator of F3h in other eudicots. The flavonoid biochemical analysis is partially consistent with downregulation of F3h-the most likely blockage in the ABP leading to the loss of floral anthocyanins potentially mediated by expression of Myb1a. Expanded sampling of white and dark individuals for expression analysis of SlMyb1a and F3h and sequencing of the promoter region in association with genetic analysis of these loci using a segregating F 2 population are essential steps to validating these results.

AUTHOR CONTRIBUTIONS
JW, MB, EN, and IS conceived and designed the experiments. MB, JD, EN, and IS carried out the sampling. JW and IS performed the assembly and the sequences comparison. JW and JD performed the Sanger sequencing. JD, JW, and IS run the phylogenetic analysis. MB and IS carried out the differential expression analysis. EN, JD, and IS analyzed the HPLC data. JW, EN, MB, and IS drafted the manuscript. All authors read and approved the final manuscript.

ACKNOWLEDGMENTS
The authors thank Charles Nicolet and the University of Southern California's Epigenome Center for library preparation and conducting the sequencing. The Department of Biology at Santa Clara University supported IS during a research stay in partial fulfillment of her Ph.D. to conduct a portion of these analyses. Cindy Dick provided useful guidance in the Whittall laboratory and Daryn Baker in the Department of Biology provided computational support at Santa Clara University. HPLC analysis were performed in the Mass Spectrometry Service at the CITIUS center of the University of Seville, with the supervision of ME Soria-Díaz. We would like to thank to Adolfo Cordero for facilitate the location of one of the polymorphic populations. This work was supported with FEDER funds and grants by the Spanish Ministerio de Ciencia e Innovación through a Formación de Personal Investigador grant to IS [BES-2010-031073] and the research projects CGL2009-08257 and CGL2012-37646.