A systems genetic analysis identifies putative mechanisms and candidate genes regulating vessel traits in poplar wood

Wood is the water conducting tissue of tree stems. Like most angiosperm trees, poplar wood contains water-conducting vessel elements whose functional properties affect water transport and growth rates, as well as susceptibility to embolism and hydraulic failure during water stress and drought. Here we used a unique hybrid poplar pedigree carrying genomically characterized chromosomal insertions and deletions to undertake a systems genomics analysis of vessel traits. We assayed gene expression in wood forming tissues from clonal replicates of genotypes covering dosage quantitative trait loci with insertions and deletions, genotypes with extreme vessel trait phenotypes, and control genotypes. A gene co-expression analysis was used to assign genes to modules, which were then used in integrative analyses to identify modules associated with traits, to identify putative molecular and cellular processes associated with each module, and finally to identify candidate genes using multiple criteria including dosage responsiveness. These analyses identified known processes associated with vessel traits including stress response, abscisic acid and cell wall biosynthesis, and in addition identified previously unexplored processes including cell cycle and protein ubiquitination. We discuss our findings relative to component processes contributing to vessel trait variation including signaling, cell cycle, cell expansion, and cell differentiation.


Introduction
Wood development is a culmination of cell division, tissue patterning and cell differentiation that is modified to balance growth versus safety under environmental conditions that vary throughout growing seasons and across years.The vascular cambium is a secondary meristem composed of ray and fusiform initials that give rise to both wood and inner bark tissues (Larson, 1994).In the developing wood of the model tree genus Populus, fusiform initials give rise to the axial cell system that includes fibers and vessel elements within wood.Vessel elements, the water conducting cells in wood, develop through rapid and dramatic radial cell expansion involving both symplastic and intrusive growth (Wilczek et al., 2011).After expansion, vessel elements synthesize a complex lignified secondary cell wall, followed by program cell death and hydrolysis of the cytoplasmic contents and portions of the primary cell wall (Turner et al., 2007).Dead at functional maturity, individual vessel elements connect end-to-end to form longer conduits called vessels (Esau, 1977), allowing for the upward flow of water under tension.The dimensions and distribution of vessels in wood help determine tissue hydraulic and mechanical functions (Sperry, 2011).
The width, number, and clustering of vessels affect water transport efficiency and vulnerability to hydraulic failure under stressful conditions (Rodriguez-Zaccaro and Groover, 2019).Stressful environmental conditions including drought, extreme heat and high salinity, reduce cambial activity and result in wood with vessels that are smaller in diameter, more numerous and more interconnected (Arend and Fromm, 2007;Fonti et al., 2013).Small variations in vessel diameter can have a large impact on water transport efficiency, with wider vessels allowing for greater volumetric flow rates (Baas et al., 2004;Venturas et al., 2017).Vessels that are in direct contact share pit pairs through which water can flow laterally.Consequently, increased vessel clustering (measured through vessel grouping indices) allows for greater vessel network interconnectivity and greater stem-specific hydraulic conductivities (Lens et al., 2011;Hajek et al., 2014).The ability of trees to transport water efficiently is tied to their photosynthetic capacity and growth rate (Tyree et al., 1998;Brodribb and Field, 2000).Greater water transport efficiency, however, has been linked to increased vulnerability to hydraulic failure through freeze-thaw and drought-induced air embolisms that can lead to death (Feng et al., 2015;Gleason et al., 2016).Trees are thus challenged to sense environmental conditions and adjust wood development accordingly to balance growth and safety.
A number of vessel traits have been linked to hydraulic properties of tree stems.The number of vessels within a cross-sectional area of wood (termed vessel frequency) is often strongly negatively correlated to vessel diameter (Sperry et al., 2008).Greater vessel frequencies can buffer trees from hydraulic failure by decreasing the impact of embolism of a single vessel on the overall conductivity of a stem (Baas et al., 1983;Ewers et al., 2007).Some studies have linked greater vessel grouping to increased vulnerability to embolism (Scholz et al., 2013;Guet et al., 2015), possibly due to an increased probability of embolism spread throughout a more interconnected xylem network (Loepfe et al., 2007).However, greater vessel grouping indices have also been related to decreased vulnerability to embolism (Lens et al., 2011;Smith et al., 2013), likely due to the presence of alternate pathways that can bypass embolism blockage (Carlquist, 1984).The cross-sectional shape of vessels (termed vessel circularity) can also affect transport efficiency, as deviations from perfect circularity can decrease the volumetric flow rate of water through a conduit (Zhao et al., 2019).Vessel width and frequency scale with tree height and diameter; taller and thicker trees have wider and less numerous vessels than shorter and thinner trees at the same sampling height (Olson et al., 2014;Olson et al., 2020).Accordingly, taller trees, while more conductive, are generally more vulnerable to drought and freeze-thaw cycles (Olson et al., 2018).
Our understanding of the regulation of vessel traits in trees remains incomplete and there has been ongoing debate regarding whether vessel traits are at all directly genetically regulated or if they represent an indirect response to environmental conditions (Olson et al., 2014;Olson et al., 2020).Emerging evidence gives strong support to the hypothesis that changes in wood development and resulting vessel traits are the result of coordinated regulation, however.For example, abscisic acid plays a key role in the mediation of stress response signals in poplar wood, and treatment with exogenous ABA can mimic the wood anatomical features seen under drought (Yu et al., 2021).Recent application of genomic approaches has provided some insights into specific regulatory pathways, including identification of AREB1 and NAC transcription factors affecting vessel size in response to ABA in Arabidopsis and poplar (Li et al., 2019;Yu et al., 2019;Ramachandran et al., 2021).Auxin transport and biosynthesis disturbances have been shown to alter vessel width, frequency and grouping resulting from changes in cell differentiation patterns (Tuominen et al., 1997;Junghans et al., 2004;Johnson et al., 2018).However, less is known about how developing vessels perceive signals that influence their differentiation and size, how turgor and cell wall extensibility are regulated, or whether models of cell size regulation proposed for other organisms extend to trees.
We previously established a poplar population for functional genomics based on gene dosage (Henry et al., 2015;Bastiaanse et al., 2019;Bastiaanse et al., 2021).A large population of Populus deltoides x P. nigra hybrids was created, following gamma irradiation of pollen to create chromosomal deletions and additions (indels) in the resulting F1 progeny (Henry et al., 2015).The full genome of each F1 line was sequenced to precisely locate the chromosomal positions of deletions and insertions (Henry et al., 2015).Systems genomics approaches have been used in this population to dissect complex traits, including descriptions of genetic architecture and dosage quantitative trait loci (dQTL) that affect phenotype trait values when relative gene dosage is modulated, as well as putative mechanisms and individual candidate genes affecting traits (Bastiaanse et al., 2019;Bastiaanse et al., 2021).In previous work we used this unique population to determine heritabilities and identify dQTL affecting multiple vessel traits (Rodriguez-Zaccaro et al., 2021).The traits examined included mean vessel diameter (MVD), vessel frequency (VF), vessel grouping index (VGI), mean vessel circularity (MVC) and non-lumen fraction (NF; percentage of wood not comprised of vessel lumens), all of which showed significant heritabilities and behaved as classical quantitative traits influenced by many genes with modest individual effects.
Here we extend our previous study through analyses integrating transcript abundance data from wood-forming tissues with phenotypic and gene function data.Specifically, we use a systems genetics approach integrating information within a gene coexpression framework to identify putative genes and mechanisms underlying vessel traits.We discuss and test evidence for mechanisms affecting traits including stress perception and longdistance signaling, turgor regulation, cell wall extensibility, regulation of cell division and posttranslational modifications.

Plant materials
A subset of 33 poplar F1 hybrid genotypes was selected from a larger group of 201 lines included in a previous study (Rodriguez-Zaccaro et al., 2021).Eleven genotypes were selected based on the presence of indel mutations spanning contiguous dQTL regions associated with both cMVD and cVF in chromosome 9. Six genotypes were selected based on the presence of indels spanning adjoining dQTL regions associated to cMVD in chromosome 16.Nine more genotypes were selected for their extreme cMVD and cVF phenotypes, regardless of indel mutation locations.Lastly, 7 genotypes were randomly selected as nonlesion controls.
Cuttings of the 201 selected genotypes were planted in individual 3-L pots inside a greenhouse and were established during a 2-month growth period.Three clonal replicates per line were then selected to include in a randomized complete block design across three benches.Trees were watered using a drip irrigation system and grown at 23°C.All trees were harvested for woody stem segments and coppiced in the late fall of 2018.These stems were processed to obtain stem and wood anatomical trait data as described previously (Rodriguez-Zaccaro et al., 2021).Trees were then allowed to regrow a main stem from the original cutting during the spring and early summer of 2019 under the same greenhouse conditions established in the previous growth season for sampling wood forming tissues for RNA extraction as described below.

RNA-seq
The stems of 33 genotypes (2-3 replicates per genotype) were harvested in July of 2019.The final height before harvest was obtained for each of these trees, measuring from the point of emergence of the current year stem to the shoot apex.Stems were then harvested 5 to 10 cm from the point of emergence from the original cutting.Wood forming tissues were collected from harvested stems by gently peeling off the bark and scraping the uncovered tissue with double-edged razorblades.Tissues from each stem were placed in individual aluminum foil envelopes and quickly submerged in liquid nitrogen before being stored at -80°C.
Frozen tissue samples were ground to a fine powder using ceramic mortar and pestles containing liquid nitrogen.Total RNA was extracted from each sample and then purified through a protocol adapted from the TRIzol reagent user guide (Life Technologies) and the RNeasy handbook (QIAGEN).RNA samples were assessed for quantity and quality using a Qubit fluorometer (Invitrogen) and a Bioanalyzer (Agilent Technologies), respectively.A 3' Tag-Seq mRNA library preparation kit (Lexogen QuantSeq) was used to construct cDNA libraries for each sample.Ninety-one libraries were multiplexed across three flow cell lanes of an Illumina HiSeq 4000 run generating 90 bp single end reads.

Preprocessing and read counts table generation
FASTQ files were processed by removing unique molecular identifier (UMI) barcodes from sequences and adding them to sequence read names using the "extract" command in the "UMI-Tools" (v1.1.2)Python package (Smith et al., 2017).Files were then demultiplexed using 6 nucleotide long single indices through the Python tool "Allprep" (scripts available at: https://github.com/Comai-Lab/allprep).Reads were mapped to the Populus trichocarpa reference genome (v3.1) using the STAR software (v2.6) (Dobin and Gingeras, 2015).The resulting SAM files were then sorted using the package "SAMtools" (Li et al., 2009) and deduplicated to correct for amplification bias using UMIs in the sequence read names through the adjacency method of the "dedup" command in "UMI-Tools" (Smith et al., 2017).The "htseq-count" Python script (from the HTSeq package v0.13.5) was used to calculate the number of read counts mapped to each gene in the deduplicated SAM files (Anders et al., 2015).Normalization factors were calculated using the "calcNormFactors" function (TMM method) in the R package "edgeR" (v2.7) to correct for raw library sizes (Robinson et al., 2010).Genes with a maximum expression of less than 10 counts across all samples were filtered out as low-expressed or unexpressed genes.The "voom" function in the R package "limma" was used to transform counts into log2 CPM by implementing the previously calculated normalization factors.This function was also used to obtain weights to correct for differences in expression variance across genes (Law et al., 2014;Liu et al., 2015).A linear model using weighted least squares for each gene was then fitted using the "lmFit" function in "limma" (Ritchie et al., 2015).Model coefficients were extracted to produce a final gene expression dataset.

Pearson correlations among traits
Trait data obtained from trees harvested in 2018 (Rodriguez-Zaccaro et al., 2021) were used to find correlations between all trait combinations within the 33-genotype subset.Pearson correlations and their corresponding P values were obtained using base R (v4.1.0),with significant correlations declared for P values of less than 0.05.Trait data include tree height at harvest (TH), mean vessel diameter (MVD), height-corrected mean vessel diameter (cMVD), vessel frequency (VF), height-corrected vessel frequency (cVF), vessel grouping index (VGI), mean vessel circularity (MVC), non-lumen fraction (NF), and bark thickness (BT).

Weighted gene correlation network analysis
To identify candidate genes related to stem and wood anatomical traits, gene co-expression networks were constructed using the weighted gene correlation network analysis R package (WGCNA, v1.70-3;Langfelder and Horvath, 2008).All 33 sampled genotypes were included in the analysis, with 2-3 replicates per genotype.Twenty-six of these lines had large-scale indels throughout their genome, including 17 selected for the presence of indels in specific areas of chromosomes 9 and 16.A previous WGCNA study involving 164 dosage mutant lines from this pedigree found that half of all gene modules were localized within individual chromosomes, suggesting the creation of artificial indelinduced modules (Bastiaanse et al., 2021).To account for this effect, TMM and variance normalized log2 CPM gene expression data (described above) were processed into an indel-normalized data set.In indel lines, the expression of genes located within indel regions was replaced with the average expression of these genes in all other lines, eliminating the cis-effects of indels on expression (Bastiaanse et al., 2021).However, this approach can potentially compromise the ability to detect real biological correlations.Gene network construction and downstream analyses were thus run using both indel-normalized and non indel-normalized datasets to determine if normalization is necessary.
Co-expression networks were built by obtaining Pearson's correlation coefficients for every expressed gene pair combination across all samples.Weighted adjacencies were then calculated through a transformation of coefficients involving a soft thresholding power of 12.This soft thresholding power was chosen for both networks to produce an 82% model fit to a scalefree topology.Topological overlap matrices (TOM) produced from the adjacencies were then used to calculate measures of topological overlap dissimilarities.Color-labeled modules of highly coexpressed genes were identified through the hierarchical clustering of these dissimilarities using a dynamic tree cutting height of 0.99.Eigengenes, or the weighted average of the expression of all genes within a module, were calculated and used to merge modules showing correlations of 0.75 or higher.Module stability was assessed for both networks through repeated resampling of 63% of all libraries, producing 49 co-expression network iterations.The physical locations of all co-expression modules across the Poplar genome were visualized using the R package "chromoMap" v0.3.1 (Anand and Rodriguez Lopez, 2022).
Stem and wood anatomical traits were tested for significant Pearson correlations with module eigengene values (P<0.05).Specifically, trait data obtained from trees harvested in 2018 for our previous study (Rodriguez-Zaccaro et al., 2021) were used to test for correlations with the gene expression of trees harvested in 2019 belonging to the same 33 genotypes.Two tree height datasets were included in the analysis, tree heights collected from 2018 plants, and tree heights collected from the 2019 plants harvested for RNA.
Gene co-expression modules were examined for functional enrichment through a GO enrichment analysis using Arabidopsis thaliana best BLAST hits of P. trichocarpa genes.The "treeGO" R package was used to perform hypergeometric tests to identify GO terms that are significantly overrepresented (P<0.05)within each module.Gene significance (GS) vs. module membership (MM) plots were obtained for biologically interesting modules.Gene significance is the correlation between the expression of a particular gene and a trait of interest, while module membership is the correlation between the expression of a gene and the eigengene value of the module.

Dosage response analysis
We tested for the presence of dosage responsive genes within two previously identified cMVD and cVF dQTL regions (Rodriguez-Zaccaro et al., 2021).One dQTL region is located on chromosome 9 (from 6.3 to 6.8 Mbp) and was highly correlated to both cMVD and cVF.The other dQTL region was located on chromosome 16 (from 0 to 0.5 Mbp) and was highly correlated to cMVD.Genotypes were grouped into relative dosage score (RDS) categories determined by the presence of insertions or deletions at each dQTL region.Lines with deletions have an RDS of 0.5, lines without indels have an RDS of 1.0 and lines with insertions have an RDS of 1.5.All genes expressed within a dQTL region were then tested for significant differences in gene expression across RDS categories through ANOVAs (P<0.05).Three lines with deletions, 4 lines without lesions and 2 lines with insertions covering the chromosome 9 region were included in an ANOVA.Four lines with deletions, 4 lines without lesions and 2 lines with insertions spanning the chromosome 16 region were included in a separate ANOVA.The 4 non-lesion lines were randomly selected.Each analysis was followed by a Tukey's honest significance post hoc test.Analyses were performed using base R (v4.1.0).

Results
Genotypes in the study were selected either to provide indel coverage of dQTL identified in our previous study (Rodriguez-Zaccaro et al., 2021) or to include genotypes with extreme trait values as well as control genotypes without indels (Materials and Methods).Phenotypic traits measured for each clone are summarized in Table 1 and include wood anatomical and cell morphological traits relevant to stem hydraulic properties measured in our previous study (Rodriguez-Zaccaro et al., 2021).Pearson correlation coefficients were calculated for all stem and wood anatomical trait combinations across the 33 hybrid poplar genotypes (Supplementary Figure 1).Tree height was positively correlated to both mean vessel diameter (R=0.66,P<0.01) and bark thickness (R=0.70,P<0.001), and strongly negatively correlated to vessel frequency (R=-0.85,P<0.001).Vessel frequency in turn showed a strong negative correlation to mean vessel diameter (R=-0.82,P<0.001) and bark thickness (R=-67, P<0.001).Height corrected mean vessel diameter and height corrected vessel frequency were negatively related (R=-0.67,P<0.05).Vessel grouping index, mean vessel circularity and non-lumen fraction showed no significant correlations to any trait.

Constructing gene co-expression networks for wood forming tissues
RNAseq read count data from wood forming tissues from each clone was used to construct gene co-expression networks to identify broad biological pathways and candidate genes related to stem and wood anatomical traits (Materials and Methods).Some genotypes included in the study have overlapping indel mutations that could potentially cause the creation gene co-expression modules reflecting dosage responsive genes localized to the overlapping indels, and these potential artifacts can be addressed using a normalization procedure (Baastianse et al., 2020).To determine if normalization was warranted for data in the current study, co-expression networks were constructed and compared for both non indel-normalized and indel-normalized gene expression datasets (Materials and Methods).Nineteen modules consisting of highly coexpressed genes were identified within both networks (Supplementary Figure 2).Module sizes were similar across non-normalized and indel normalized networks and ranged from 46 to 4711 genes per module (Supplementary Table 1).Hypergeometric tests showed a highly significant overrepresentation of genes of the nonnormalized modules in each normalized module (P<0.00001)consistent with similar networks being recovered by the two approaches.A module stability analysis showed that individual modules within each network persist across many network construction subsampling iterations, suggesting that the modules are robust within both networks (Supplementary Figure 3).Similar to results using indel-normalized data (Supplementary Figure 4), genes from non indel-normalized modules were widespread throughout the genome, with genes from every module represented across most chromosomes (Figure 1).The lightyellow and orange modules were the least widely distributed, with most of their genes located on chromosome 9 (Supplementary Figure 5), and correlations of genes within these modules and traits should thus be evaluated with caution.However, regions in chromosome 9 with high indel coverage depth also included many genes that were assigned to other modules (Supplementary Figure 6), and there was no enrichment of genes from any particular module for a 0.5Mbp bin on chromosome 16 spanned by indels (Supplementary Figure 7).Together these results did not support the need for global data normalization.Because normalization changes expression values for genes in indel regions for genotypes that are most informative for examining dosage-trait relationships it would reduce the power to detect correlations among modules or genes and phenotypes, and thus results are presented using nonindel normalized networks for the remainder of analyses.However, transcript counts, annotations and module membership for individual genes from both the non-indel normalized and the indel normalized network are provided in Supplementary Tables 2, 3, respectively.

Gene modules correlate with phenotypic traits
Correlations were calculated between the eigengene value of expression for each module and stem anatomical and vessel morphological trait data (P<0.05; Figure 2).Tree height in both years were the most strongly correlated to the darkgrey, grey60, turquoise, cyan, blue, and dark-green modules, which together comprised about 64% of all expressed genes, consistent with the complex polygenic nature of this trait.Modules tan, brown, lightcyan and black were only associated with tree height in the year transcript levels were measured, 2019.No modules were uniquely associated with tree height in 2018.Together these findings suggest that while overall transcript levels and module eigengene values associated with height were similar across the two years of the study, correlations were weaker when height and transcript abundance were not obtained in the same year.Thus, it is likely that the power to detect correlations between traits below (also measured in 2018) and modules is weakened by variation in plant growth between the two years.
All anatomical and vessel morphological traits were correlated with one or more modules, with the exception of height corrected mean vessel diameter and mean vessel circularity (P <0.05; Figure 2).Vessel frequency was the most highly correlated to any gene module across both networks and had the greatest number of significant correlations to modules (P <0.05; Figure 2).Vessel frequency had positive correlations to the darkgrey, grey60, and turquoise (R=0.42,0.43 and 0.56, respectively) and negative correlation to the blue and darkgreen modules (R=-0.56 and -0.37).Similarly, mean vessel diameter was significantly correlated to the turquoise, blue and darkgreen modules (R=0.4,0.41 and 0.36 respectively).Height corrected vessel frequency was positively related to the light-yellow (R=0.51) and the grey60 (R=0.36).Vessel grouping index was significantly related to the cyan module (R=-0.39),while non-lumen fraction was correlated to the turquoise (R=-0.34),cyan (R=0.41),light-cyan (R=0.39) and blue (R=0.35)modules.Bark thickness was positively correlated with the darkturquoise (R=0.39),lightgreen (R=0.35),cyan (R=0.47),orange (R=0.40) and blue (R=0.35)modules.
Modules associated with multiple traits can reflect common regulatory mechanisms shared across traits.Modules that correlated with vessel traits also correlated with tree height, with the exception of the unique correlation of the lightyellow module with height corrected vessel frequency.For example, three of the modules correlated with mean vessel diameter (turquoise, blue, orange) were also correlated with height and in the same direction.
Similarly, all six of the modules correlated with vessel frequency were also correlated with height, but in the opposite direction.The positive correlation of height with mean vessel diameter trait and negative correlation with vessel frequency (Supplementary Figure 1) are thus reflected in these co-expression modules.Such correlations may indicate that a module's biological function contributes to variation in vessel traits and indirectly influences height growth, for example by affecting water flow rates during growth, or that the module's functions directly influence both sets of traits.
Mean vessel diameter and vessel frequency are negatively correlated traits (Supplementary Table 1), and all three modules significantly correlated with mean vessel diameter (turquoise, blue and darkgreen) were also correlated with vessel frequency, while three additional modules (darkgrey, grey60, and cyan) were unique to vessel frequency and could reflect mechanisms specific to this trait.One of the two modules correlated with height corrected vessel frequency was also correlated with vessel frequency (grey60), while the lightyellow module was unique to height corrected vessel frequency.

Gene module GO enrichment analyses
Co-expression modules were next assigned broad biological functions through GO enrichment analyses (Materials and Broadly distributed genomic locations of gene modules identified through a weighted gene coexpression network analysis (WGCNA) using a non indel normalized dataset.Colored regions along chromosomes represent areas with a high density of genes in a particular module.The grey dotted lines mark the edges of indel regions.The number above each region represents the number of genotypes with indels in that region.
Methods) to provide insights into potential mechanisms affecting traits.GO enrichment categories and statistics are provided in Supplementary Table 4.As shown in Figure 3, the turquoise module associated with height, mean vessel diameter and vessel frequency was significantly enriched in stress response, transcription regulation and hormone related terms, among others (P<0.05).Response to stress was one of the most significantly overrepresented terms, with 769 genes with this annotation included in the module.Response to cold (125 genes) and water deprivation (134 genes) were the most significantly overrepresented specific stress response terms.After stress response, transcription regulation related terms where among the most numerous and significantly overrepresented, with 446 genes in the turquoise module associated with nucleic acid-templated transcription.Lastly,  this module was highly enriched in response to hormone terms, which together encompassed 330 genes.In particular, genes annotated with response to abscisic acid terms were the most numerous and overrepresented within this category, with 141 genes included in the turquoise module.Response to jasmonic acid and cytokinin related terms were also significantly overrepresented.
The blue module was also correlated with height, mean vessel diameter, and vessel frequency but in the opposite directions as the turquoise module.In contrast to the turquoise module, the blue module was significantly enriched in genes associated with vesicle transport, Golgi processes, cell wall, cell growth and cell differentiation related terms, among others (Figure 3).Vesicle mediated transport related terms were the most significantly overrepresented (193 genes), with 109 of these genes annotated with Golgi vesicle transport and organization terms.In addition, the blue module was highly enriched in cell wall organization and biogenesis terms, with 149 genes with these annotations included in the module.Hemicellulose was the most overrepresented among specific cell wall component related terms, followed by pectin and cellulose.Lastly, cell growth (133 genes) and differentiation (145 genes) associated terms were greatly overrepresented in the blue module.The blue module would thus be consistent with genes and mechanisms associated with cell differentiation and cell wall biosynthesis, a critical feature of vessel element differentiation.The darkgreen module was also correlated with height, mean vessel diameter, and vessel frequency (Figure 2) but in contrast to the turquoise and blue modules was not highly enriched in genes associated with abscisic acid (Figure 3).
The cyan module showed the strongest correlation to BT and the third strongest correlation to VF, after the turquoise and blue modules (Figure 2).This module was enriched in stress response and cell wall related annotations (Figure 3).The light-yellow and grey60 modules in the non indel-normalized network were correlated to height corrected vessel frequency and were enriched in transcription and stress response related terms, respectively (Figure 3).

Gene significance vs module membership plots
Biologically relevant modules were next examined for candidate genes related to wood anatomical traits through gene significance (GSa measure of correlation between a gene's expression and a phenotype) vs module membership (MMa measure of correlation between a gene's expression and the eigengene of the module) plots (Langfelder and Horvath, 2008), as shown in Figure 4.While this approach can be used to dissect any module-trait combination, we focus on vessel frequency-related candidate genes, as this trait showed the strongest and most numerous correlations to modules among the examined wood anatomical traits, as well as height corrected vessel frequency candidate genes.Genes with the greatest GS and MM values within overrepresented GO categories were identified as candidate genes within each selected module (Langfelder and Horvath, 2008) and are listed in Table 2. Candidate genes reflect potential mechanisms affecting vessel traits, including stress response and signaling, Golgi and vesicle-related genes, cell wall -related genes, and cell differentiation-related genes (Table 2).Some of these candidate genes have been functionally characterized in other species and connected to directly or indirectly to vessel traits, such as the orthologs of Arabidopsis genes IRREGULAR XYLEM 1 (IRX1) and IRX5 that encode secondary cell wall-related cellulose synthases (Taylor et al., 2003).Loss of function mutations of these genes in Arabidopsis result in a "collapsed vessels" phenotype (Turner and Somerville, 1997).Interestingly, IRX1 mutants have also been shown to have enhanced drought and osmotic stress tolerance in Arabidopsis (Chen et al., 2005).Other notable candidates include an ortholog of Arabidopsis CONTINUOUS VASCULAR RING (COV1), which encodes a protein associated with Golgi morphology and vacuolar protein sorting (Shirakawa et al., 2014).Arabidopsis cov1 mutants are characterized by extension of vascular bundles boundaries, suggesting COV1 is a negative regulator of vascular differentiation in stems (Parker et al., 2003).An ortholog of VASCULAR-RELATED NAC DOMAIN 2 (VND2) encodes a transcription factor that, in Arabidopsis, has been shown to modify respond to ABA during stress to modify the differentiation rates of vessel elements (Ramachandran et al., 2021).Lastly an ortholog of BIG BROTHER (BB) was identified, which in Arabidopsis encodes a RING-finger E3 ubiquitin ligase that represses organ growth, potentially by degrading growth-stimulating proteins (Disch et al., 2006).Interestingly Arabidopsis BB is dosage sensitive (Disch et al., 2006) and has been hypothesized to uncouple cell proliferation from cell elongation (Cattaneo and Hardtke, 2017).

Dosage responsiveness further narrows candidate genes underlying dQTL
One prediction is that the causative gene(s) underlying vessel trait dQTL should be expressed in wood forming tissues and should themselves respond to dosage in terms of expression.We tested genes within dQTL regions on chromosome 9 (from 6.3 to 6.8 Mbp) associated with height corrected mean vessel diameter and height corrected vessel frequency and chromosome 16 (from 0 to 0.5 Mbp) associated with height corrected mean vessel diameter (Rodriguez-Zaccaro et al., 2021) for genes whose expression changes within relative dosage.Lines were placed in relative dosage score (RDS) groups determined by the presence of insertions or deletions spanning each dQTL region.Forty-two genes were expressed out of a total of 75 annotated genes within the chromosome 9 region.Eighteen of these genes showed significantly different expression across lines with different RDS values, suggesting dosage responsiveness (P<0.05).The expression of most of the 18 genes showed a positive relationship to their dosage in the region, with more copies of a gene leading to greater expression (Figure 5).Slopes obtained from linear regressions of gene expression data across different RDS lines ranged from 0.70 to 2.04, showing a wide variation in the effect of dosage on expression (Table 3).None of the genes within this group scored high on gene significance/module membership plots, however (Figure 4).Within the chromosome 16 region, 46 out of 103 annotated genes were expressed in wood forming tissues.Seventeen of these genes showed significantly different expression across different RDS line categories (P<0.05; Figure 6).As expected, the expression of these genes was positively related to their gene dosage, with lines with greater RDS values showing greater expression (Figure 6).Slopes obtained from linear regressions of gene expression data across RDS categories ranged from 0.83 to 2.12, showing a large variation in the effect of dosage on expression (Table 4).Within this group, two genes scored high on gene significance/module membership plots (Table 4), Potri.009G062100(ortholog of BIG BROTHER ubiquitin E3 ligase) and Potri.009G062600(ortholog of a MAKORIN-like ubiquitin E3 ligase).

Discussion
In this study we used a unique poplar germplasm resource (Henry et al., 2015) to identify potential mechanisms and individual genes influencing vessel traits.The poplar pedigree utilized carries dosage variation from genomically defined insertions and deletions and is an unusually rich source of genetic and phenotypic variation for dissecting complex traits, as shown here and in previous studies (Bastiaanse et al., 2019;Bastiaanse et al., 2021;Rodriguez-Zaccaro et al., 2021).We used a gene co-expression approach that allowed integration of RNAseq transcript abundance profiling data with vessel traits and other data types across clonally replicated genotypes to identify potential mechanisms and candidate genes correlated with vessel traits.Below, we discuss the results from the study against the backdrop of known and anticipated mechanistic components of vessel traits.
To fully understand vessel traits, a minimal list of component traits that must be explained include 1) perception of water stress and long distance signaling across organs and tissues, 2) cell fate decisions of cambial daughter cells that ultimately determine vessel patterning, 3) regulation of vessel morphology and diameter.Our gene expression data from wood forming tissues is likely not ideal for identifying genes and mechanisms related to water/osmotic stress perception, and it is not known to what degree osmotic stress is directly perceived in wood forming tissues or responds to signals originating in roots.However, at the level of mechanisms our data reflected a role for ABA in both mean vessel diameter and vessel Gene significance (GS) vs module membership (MM) scatterplots of selected modules identified from a non indel normalized dataset.The Blue, Turquoise and Cyan module plots (A-C) show GS for vessel frequency (VF), while the Lightyellow and Grey60 plots (D, E) show GS for heightcorrected vessel frequency (cVF).These modules showed highly significant correlations between GS and MM.Representative genes with high GS and MM are highlighted in the modules and included in Table 2.
frequency (see turquoise module Figures 2, 3).ABA has been previously implicated as a primary drought stress signal whose manipulation can change drought response (Yu et al., 2019) and vessel trait phenotypes in poplar (Popko et al., 2010).This same gene module was also enriched for stress response genes as well as other hormone related genes, including jasmonic acid, which has been shown to interact with ABA in other plant systems during stress (de Ollas et al., 2015), and auxin which has also been shown to modify vessel patterning and size (Junghans et al., 2004;Johnson et al., 2018).
The sequence of events in vessel element differentiation includes specification of vessel element cell fate, cessation of cell division and initiation of differentiation, cell expansion, and secondary cell wall synthesis prior to programmed cell death (Turner et al., 2007).We were unable to identify mechanisms or candidate genes that are unique to the process by which cambial daughter cells are specified to differentiate as vessels and thus determine vessel patterning traits (vessel frequency, vessel grouping index) from our results.Unfortunately, our experimental design was not able disentangle Putative dosage-responsive genes within a previously identified dosage dependent QTL (dQTL) in chromosome 9.Each boxplot shows gene expression for genotypes grouped by relative dosage score (RDS) at each gene locus within the chromosome 9 dQTL region (6.3-6.8Mbp).Only genes that had significantly different expression levels across RDS categories were included (determined by ANOVAs).Tukey pairwise comparisons were used to show significant differences in gene expression values between RDS groups; means that share a letter are not significantly different (p > 0.05).Module membership for each gene is indicated by colors (see scale for names).how vessel patterning versus size could be uniquely regulated, which would have required experimental treatments capable of breaking the observed phenotypic correlations between these traits.Thus, our experimental design likely resulted in some gene moduletrait correlations that indirectly reflect trait-trait correlations.It is also possible that at least at some early stage of cell specification and differentiation there are common regulatory mechanisms affecting both traits.Although purely speculative, a general type of mechanism that could connect vessel patterning and size could be envisioned where a cell specified as a vessel represses expansion of neighboring vessels, for example by localized diffusion of a repressive signal.
Cell expansion and final cell size reflect the interplay of turgor pressure and resistance of the cell wall (Ali et al., 2023).Clearly turgor pressure is a central aspect of vessel size, as illustrated by the massive expansion of vessel elements in contrast to the limited expansion of neighboring cells that differentiate as fibers that also synthesize secondary cell walls.We did not find clear evidence for Putative dosage-responsive genes within a previously identified dosage dependent QTL (dQTL) in chromosome 16.Each boxplot shows gene expression for genotypes grouped by relative dosage score (RDS) at each gene locus within the chromosome 16 dQTL region (0-0.5 Mbp).Only genes that had significantly different expression levels across RDS categories were included (determined by ANOVAs).Tukey pairwise comparisons were used to show significant differences in gene expression values between RDS groups; means that share a letter are not significantly different (p > 0.05).Module membership for each gene is indicated by colors (see scale for names).
enrichment of genes associated with turgor-related process at the level of GO analysis, which could reflect that this process is not globally regulated at the level of transcription.In support of that notion, recent results in Arabidopsis identified posttranslational regulatory mechanisms affecting turgor-driven cell expansion (Chaudhary et al., 2023).Specifically, the Arabidopsis receptor kinase FERONIA (FER) was implicated in regulating turgordriven cell expansion in response to brassinosteroids, and FER is postranslationally regulated through phosphorylation by the BRASSINSTEROID-INSENSITIVE 2 kinase (Chaudhary et al., 2023).However, we did find strong signatures of cell wall biosynthesis and modification-related genes associated with both vessel frequency and diameter (Figure 3).One possibility is cell size is affected in part by the timing of biosynthesis or changes in mechanical properties in cell walls that affect the resistance of the cell wall to turgor during expansion, with lignification of the rigid secondary wall ultimately preventing further cell expansion.While many secondary cell wall-related mutants in plants have not been specifically examined for vessel element patterning or diameter phenotypes, there are examples of wall-related genes and mutants with altered vessel morphologies.For example, an ABA responsive AREB1 transcription factor in poplar regulates both the diameter and frequency of vessels.AREB1 was also shown to regulate poplar NAC transcription factors (Li et al., 2019), which have been shown to include master regulators capable of initiating vessel-like differentiation and cell wall biosynthesis in other cell types when overexpressed (Kubo et al., 2005;Yamaguchi et al., 2010).That both size and frequency are affected in AREB1 misregulated poplars may indicate that specification of vessel cell identity is a key point of regulation (Li et al., 2019).However, the apparent lack of significant cell expansion in other cell types induced to differentiate as vessel by ectopic expression of NACs (e.g.Yamaguchi et al., 2010) may indicate that cell identity including cell expansion is specified upstream or in parallel of NAC cell wall regulation.The regulation of cell size is a fundamental issue for plant development but remains poorly understood.The determination of vessel element diameter is especially complex because it is highly responsive to environmental conditions.Three generalized models of cell size regulation have been proposed based on findings in the more simplified yeast system: the "timer", "sizer" and "adder" models (D'Ario and Sablowski, 2019), all of which evoke a key role for cell division in determining final size.The timer model predicts that cells divide at specific time intervals, while sizer model cells divide when they reach a certain size.In the adder model, cells divide after a specific amount of growth regardless of initial size and can continue to increase in size across cell divisions.The adder model best fits what is seen in the cambial zone, where daughter cells (also referred to as xylem mother cells) are similar in size, although daughters directly born from the fusiform initials can be smaller than older daughter cells in their final round of division prior to differentiation.However, these models do not account for observed anatomical features of wood such as the large difference in final size of fiber cells compared to vessel elements derived from the same pool of initials, and they do not account for the large variation observed in vessel element size across developmental age, height position within the stem, or in response to environmental cues and stress.Also, these models were developed for cell types that continue dividing, whereas vessel elements undergo terminal differentiation.Although speculative, we favor a model by which an interplay between division rates in the cambial zone and time of onset of differentiation may have an effect on final vessel size, which is manifest or modified through either the time allotted for differentiation or intensity of increase in turgor and cell wall modifications.The interplay of cell division and vessel size can be seen through histological examination of poplar stems from fast growing trees in well-watered conditions, vs trees under water stress.In the former, rapid cell division results in an expanded cambial zone which ultimately produces larger diameter vessel elements.In the latter, slower division rates are associated with a narrow cambial zone that produces smaller diameter vessel elements.Our results showed evidence for enrichment in genes associated with cell cycle and ubiquitination in modules associated with vessel patterning and size traits (Figures 2, 3).Three of the candidate genes within previously identified dQTL for vessel traits (Table 3) encode ubiquitin E3 ligases that have been implicated in auxin-mediated cell cycle regulation (SKP2: Jurado et al., 2010), putative morphological and developmental regulation (MAKORINlike: Arumugam et al., 2007) and organ size (BIG BROTHER: Disch et al., 2006;Cattaneo and Hardtke, 2017) in Arabidopsis.
The current study was limited by several factors.On a practical basis, our sample sizes were modest, and our experimental design did not include stress or other treatments that might allow further dissection of traits and breaking of correlations among traits.A limiting factor was the extremely laborious task of sectioning, imaging, and analyzing images of wood forming tissue from large numbers of stems.Future studies using advanced phenotyping platforms and imaging technologies could no doubt greatly extend the results here.Additionally, our study relied primarily on gene expression data, and undoubtedly key regulatory point of the rapid developmental changes associated with vessel traits include posttranscriptional regulation not surveyed here, including posttranslational modification and protein degradation.However, our study show that it is possible to dissect vessel traits de novo and pointed to new candidate mechanisms and genes for future study.

FIGURE 3
FIGURE 3 Enrichment analysis heatmaps showing significant overrepresentation (P<0.05) of GO terms within gene modules.The color bars represent the -log10 of P values (scale to right) generated from hypergeometric tests.Each major category indicated on the x axis is composed of individual child GO categories represented by individual bars.Supplementary File 2 lists all child categories and statistical test values for each module.The dendrogram to the left shows the relationships among modules.

FIGURE 2
FIGURE 2Module eigengene and trait correlations across 33 hybrid poplar genotypes.Module eigengene and trait correlations across 33 hybrid poplar genotypes.Wood anatomical traits include mean vessel diameter (MVD), height-corrected mean vessel diameter (cMVD), vessel frequency (VF), height-corrected vessel frequency (cVF), vessel grouping index (VGI), mean vessel circularity (MVC), and non-lumen fraction (NF).Stem traits include tree height (TH) and bark thickness (BT).Each box includes a Pearson's correlation coefficient represented by a color bar (color scale on right shows ranged of positive and negative correlations) and a P value.P values ≤ 0.05 are marked with an asterisk.

TABLE 1
Stem and wood anatomical traits examined across 33.

TABLE 3 Continued
Genotypes were grouped by RDS at each gene locus within the dQTL region.An ANOVA was performed using gene expression data across different RDS groups.Only genes that had significantly different expression levels across RDS categories were included (P<0.05).The slope obtained from the linear regression of gene expression data across RDS groups is included to show the magnitude of the effect of dosage on expression.