The role of gene expression in the recent evolution of resistance in a model host parasite system

Damage by parasites is a perpetual challenge for hosts, often leading to the evolution of elaborate mechanisms of avoidance, immunity, or tolerance. Host resistance can evolve via changes in immune protein coding and/or expression. Heritable population differences in gene expression following infection can reveal mechanisms of immune evolution. We compared gene expression in infected and uninfected threespine stickleback (Gasterosteus aculeatus) from two natural populations that differ in their resistance to a native cestode parasite, Schistocephalus solidus. Genes in both the innate and adaptive immune system were differentially expressed as a function of host population, infection status, and their interaction. These genes were enriched for loci controlling immune functions that we independently verified differ between host populations, or in response to infection. For instance, populations differ strongly in reactive oxygen (ROS) production, and we observed corresponding differences in expression of ROS-affecting loci. Differentially expressed genes also were involved in fibroblast activation, B-cell activation, and leukocyte trafficking. Coexpression network analysis identified two distinct immune processes contributing to stickleback resistance; several modules of genes are correlated with parasite survival while a different set of modules are correlated with suppression of cestode growth. Comparison of networks between populations showed resistant fish have a dynamic expression profile while susceptible fish are static. In summary, recent evolutionary divergence between two vertebrate populations has generated population-specific gene expression responses to parasite infection, which reveal a few immune modules likely to separately affect cestode establishment, and growth.

). This is mirrored by differences in expression of a select few immune genes, between wild caught 87 stickleback from six populations, and between wild caught fish with versus without cestodes (Lenz,88 Eizaguirre, Rotter, et al. 2013 ;Stutz, et al. 2015). populations, rarely encounter the cestode because its eggs do not hatch in brackish water. These fish 93 genotypes are therefore highly susceptible to infection in laboratory exposure trials. When marine 94 stickleback colonized post-glacial freshwater lakes, they encountered cestodes and evolved increased 95 resistance to infection by cestodes (Weber, et al. 2017  ). The first host (copepods) and terminal hosts (piscivorous birds, mostly loons and mergansers) are common in both lakes. Diet data from 102 both lakes shows that Rob and Gos fish consume copepods at an equal rate (Snowberg, et al. 2015; (N = 21 and 16), or exposed and infected (N = 17 and 9) fish. Tissues from the latter two groups were 137 harvested 42 days post exposure. See ) for full experimental methods. We focus on 138 expression in head kidneys as it is the major site of immune cell differentiation in fish (Scharsack, et al. 139 2004;Fischer, et al. 2006;Scharsack, et al. 2007;Fischer, et al. 2013), and head kidney cell cultures were 140 used to measure immune function independently of gene expression. Below we present and interpret 141 candidate genes from our linear models (see Table 1 for summary statistics). 142

143
Main effect of population on immune gene expression 144 Our negative binomial linear models (see Materials and Methods) identified 643 genes that were 145 differentially expressed as a function of stickleback population (Wald, p < 0.1 after 10% FDR correction; 146 361 genes after 5% FDR). These main effects of population represent genes whose expression differs 147 constitutively between populations (regardless of infection status). Because these differences occur in lab-148 raised fish, they represent heritable between-population differences in RNA abundance. Because we 149 measured gene expression from the entire head kidney, expression differences could reflect evolved 150 changes in gene regulation per cell, changes in cell population composition, or both). A caveat is that 151 because we used first-generation lab-reared fish, we are as yet unable to rule out maternal or other 152 epigenetic effects. However, comparison of Rob, Gos, and reciprocal F1 hybrids revealed little evidence 153 for maternal effects on infection outcomes or immune traits (with the exception of 154 granulocyte:lymphocyte ratio)   One study noted increased expression of MHC II in wild fish which were more heavily parasitized, 210 especially when MHC allele diversity was low (Wegner, et al. 2006 therefore possible that differences in expression of the two variants described here is due to altered 217 regulation of only particular alleles or paralogs. 218 219 Transcriptomic response to exposure or infection 220 Surprisingly, no genes differed between control versus exposed-but-uninfected fish (Wald, p > 0.1 after 221 10% FDR correction). This could be because resistant fish quickly mounted a response to the cestode, 222 eliminated the parasite, and then down-regulated immune function by our 42-day sample date. Or, early-223 acting resistance to the cestode may involve physical or chemical barriers to entry that entail constitutive 224 gene expression or non-genetic effects (e.g., gut epithelial mucous, protective symbiotic bacteria, etc). suggested that helminths could potentially suppress stickleback adaptive immunity (Scharsack, et al. 257 2007), and the downregulation of CD40 is one plausible mechanism. Alternatively, fish with inherently 258 lower CD40 expression may be more susceptible to infection. This raises a broader question that we are 259 not yet able to answer, but which warrants further study: to what extent are the expression differences 260 between infected and control fish a result of host immune response or parasite immune suppression? 261 262 CD40 expression is not limited to immune cells, but can also be expressed in fibroblasts (Grewal and 263 Flavell 1998), so its precise function in stickleback infection by cestodes is unclear. This dual role is 264 intriguing because fibroblast activation is associated with the formation of fibrotic cysts that encapsulate 265 cestodes (Zeisberg, et al. 1999). These cysts likely restrict cestode movement and concentrate ROS while 266 limiting damage to host tissues. Recall that this is a population specific defense, exhibited by Rob but not 267 presence of activated B-cells indicate the host immune system has recognized the parasite and is actively 276 mounting a defense. In our study, infected fish show higher levels of tspan33 compared to controls 277 (ENSGACG00000019698: log2fold change = 1.04, Wald p < 0.07 after 10% FDR correction).

Genotype by infection interactions 282
The higher resistance to S. solidus infection in Rob compared to Gos stickleback could be due to 283 constitutive differences in gene expression (as documented above), or differences in the induced immune 284 response to infection. The latter can be detectable via interactions between host genotype and infection 285 status. Linear modeling results identified 16 genes significant for this interaction (Wald p < 0.1 after 10% 286 FDR correction). Most of these genes are known to affect the immune traits that ) 287 already showed are divergent between Rob and Gos fish. For example, glutathione peroxidase 1a (gpx1a) 288 is an enzyme that degrades hydrogen peroxide, a type of ROS, into glutathione and water (Turrens 2003). 289 Expression of gpx1a in Gos fish increases upon infection, and therefore should tend to decrease the 290 amount of ROS (hydrogen peroxide) available to defend against cestodes (ENSGACG00000010455: 291 log2foldchange = -1.86, Wald p < 0.07 after 10% FDR correction). We speculate that this proactive 292 down-regulation upon infection might be a tolerance response to mitigate autoimmune damage by Gos 293 fish, which are commonly infected and therefore might not be able to tolerate a strong ROS response. The 294 cytochrome c complex produces ROS (Turrens 2003), and we see increased expression of Cytochrome c 295 oxidase subunit IV (cox4i1) in Rob fish that are infected, while Gos fish decrease expression 296 (ENSGACG00000015963: log2foldchange = 0.43, Wald p < 0.06 after 10% FDR correction). This gene 297 expression data is consistent with our phenotypic data showing that Rob fish have more ROS producing 298 macrophages than Gos fish, and more ROS per cell. This cox4il up-regulation in Rob fish may be 299 amplified by population differences in bcl2 (see above). Oddly, we do not observe a significant infection-300 induced increase in ROS production in fish of either genotype. This discrepancy may reflect our head-301 kidney cell-culture based ROS assay, which does not rule out changes in ROS in vivo or in other tissues. 302 The one contrary result involves colony stimulating factor 1b (csf1b), a paralog of csf1/mcsf, a well-304 studied regulator of monocytes in mammals (Akagawa, et al. 2006). csf1 increases the production of head 305 kidney leukocytes (which includes granulocytes) in trout (Oncorhynchus mykiss) (Wang, et al. 2008). In 306 our study, csf1b is downregulated in infected Rob fish even though they have more granulocytes (which log2foldchange = -1.11, Wald p < 0.08 after 10% FDR correction). This discrepancy may be resolved by 309 recognizing that we examined a single time point post-exposure. It is likely that Rob fish initially increase 310 csf1b or another gene to drive the granulocyte proliferation that we observe after infection. resistance to the cestode. For example, the black module has modest to strong correlations with infection 362 status, cell population phenotypes including density of all cells, precursors, and myeloids but is not 363 correlated to ROS production (r = 0.13 p = 0.2, Figure 3). In contrast, the magenta and greenyellow 364 modules are correlated with ROS production and also with cestode size, but much less so with cell 365 population phenotypes and not at all with infection status (Figure 3). These observations imply that 366 stickleback prevention of cestode establishment, and suppression of cestode growth, entail two distinct 367 immune pathways (innate response and ROS production, respectively). 368

369
The magenta module has a modest, but strongly significant correlation to the fraction of cells that are Stickleback samples were generated as part of a large lab infection experiment . 425 Briefly, head kidneys were dissected from stickleback and preserved in RNAlater at -20ºC. Libraries were 426 constructed according to (Lohman, et al. 2016). Samples were sequenced on the Illumina HiSeq 2500 at 427 the Genome Sequence and Analysis Facility at the University of Texas at Austin, producing ~6.7M raw 428 reads per sample. See ) for ROS and flow cytometry methods. Flow cytometry data 429 was analyzed using FlowJo software (Treestar). Granulocyte and lymphocyte populations were defined 430 based on gating described in ). Precursor, myeloid, and eosinophil populations were 431 defined using gating as described in (Wittamer, et al. 2011). βInfectionStatus is a fixed effect with three levels: control, exposed (but not infected), and infected, and full-450 sibling families are nested within populations. βBatch is the lane on which samples were sequenced. An 451 additional predictor βSex was included for genes when appropriate (lower AIC score), and improved the 452 model fit of 839 genes total. We fit the full model (including sex) to all genes and then extracted only the 453 839 that were improved by the addition of sex and looked for significant p values for main effects and 454 interactions. With the full model, 67 of these 'sex improved' genes were significantly different between 455 populations. No genes were significant for either exposure or infection, and 1 gene was significant for the 456 interaction of population and infection status (myosin 5ab, ENSGACG00000006025: log2foldchange = 457 2.98, Wald p = 0.07). All p-values were multiple test corrected using 10% FDR. Although fish from the 458 controlled infection experiment were exposed to three different parasite genotypes (each family exposed 459 to only one parasite genotype), we are only interested in the host response to any parasite, and therefore 460 average across parasite genotypes by simply not including this as a term in our linear model. 461

462
Gene Ontogeny with GO_MWU: 463 We used the Mann-Whitney U test for GO analysis. This approach has been described (Wright, et al.  dynamic tree cut, we merged modules with greater than 80% similarity, producing 14 modules. We populations, using historical natural selection as a tool to screen for changes in gene expression associated 476 with parasite infection. While our host-parasite model system is powerful, it does have some limitations. 477 The reference genome is of generally good quality but annotation is lacking (approximately 22.5% of the 478 entries in the stickleback genome are either unnamed or labeled as novel genes). Thus, GO analysis is 479 performed after assigning GO accession terms by BLAST homology, rather than functional verification, a 480 common solution for non-model systems. The features of the stickleback genome may be missing 481 potentially interesting immunological genes which are sufficiently diverged from human or mouse genes 482 and therefore may be unannotated. In particular, the number and location of MHC II paralogs remains 483 uncertain, illustrating need for genome sequence improvement. 484 Our linear modeling with DESeq2 employs appropriate FDR correction, but we choose to accept 485 higher than 'standard' p-values associated with LFC because of the direct connection between candidate 486 genes and independently observed immune phenotypes. If, for example, we had not measured immune 487 phenotypes, we would not accept log2 fold changes in expression with associated p-values greater than 488 0.05 but less than 0.1. Furthermore, we chose a very low base min mean filter because we have high 489 confidence in detecting lowly expressed genes (Lohman, et al. 2016). We also wished to include more 490 genes in our enrichment analysis and this also detracts from our power due to multiple test correction. We 491 accordingly accept slightly larger than normally allowed p-values. Our TagSeq based approach has been 492 shown to be at least as good as total RNAseq methods (having an equal or higher correlation between 493 observed and known values of a spike in control), but does not account for splice variants or copy number 494 variation, which may be potentially important in the evolution of immune responses. 495 Our study used tissue from a single organ (head kidneys) for both gene expression and immune 496 phenotype measures. Head kidneys are a crucial hematopoetic organ in fish, but analysis of other tissues 497 may produce different results. Moreover, head kidneys contain multiple immune cell populations that we 498 are unable to sort effectively for cell-type-specific expression studies. We do use cell population counts 499 (proportion granulocytes versus lymphocytes) as a covariate, which as noted for MHC weakly contributes 500 to expression variation of a few genes. But, lacking monoclonal antibodies to many immune cell receptors 501 in stickleback, we cannot readily distinguish among finer subdivisions of cell types. This resource 502 limitation, typical of most non-model organisms, limits our ability to statically detect effects of cell 503 population composition on expression. 504  and dependent manner. Y axis is variance stabilized data, the product of log transforming and library size 518 correcting raw gene counts. C = control, E = exposed, I = infected. Black lines are Rob and magenta lines 519 are Gos. 520 Figure 3. Weighed coexpression gene network analysis suggest that host response to cestodes involves 522 two traits, the initial immune response (salmon, purple, black, yellow, brown) and the control over 523 parasite growth (magenta and greenyellow). Each cell indicates the correlation between the module and 524 the p-value for that correlation (in parentheses). Correlations weaker than 0.07 were omitted. 525