Alternative Splicing of Putative Stroke/Vascular Risk Factor Genes Expressed in Blood Following Ischemic Stroke Is Sexually Dimorphic and Cause-Specific

Genome-wide association studies have identified putative ischemic stroke risk genes, yet, their expression after stroke is unexplored in spite of growing interest in elucidating their specific role and identifying candidate genes for stroke treatment. Thus, we took an exploratory approach to investigate sexual dimorphism, alternative splicing, and etiology in putative risk gene expression in blood following cardioembolic, atherosclerotic large vessel disease and small vessel disease/lacunar causes of ischemic stroke in each sex compared to controls. Whole transcriptome arrays assessed 71 putative stroke/vascular risk factor genes for blood RNA expression at gene-, exon-, and alternative splicing-levels. Male (n = 122) and female (n = 123) stroke and control volunteers from three university medical centers were matched for race, age, vascular risk factors, and blood draw time since stroke onset. Exclusion criteria included: previous stroke, drug abuse, subarachnoid or intracerebral hemorrhage, hemorrhagic transformation, infection, dialysis, cancer, hematological abnormalities, thrombolytics, anticoagulants or immunosuppressants. Significant differential gene expression (fold change > |1.2|, p < 0.05, partial correlation > |0.4|) and alternative splicing (false discovery rate p < 0.3) were assessed. At gene level, few were differentially expressed: ALDH2, ALOX5AP, F13A1, and IMPA2 (males, all stroke); ITGB3 (females, cardioembolic); ADD1 (males, atherosclerotic); F13A1, IMPA2 (males, lacunar); and WNK1 (females, lacunar). GP1BA and ITGA2B were alternatively spliced in both sexes (all patients vs. controls). Six genes in males, five in females, were alternatively spliced in all stroke compared to controls. Alternative splicing and exon-level analyses associated many genes with specific etiology in either sex. Of 71 genes, 70 had differential exon-level expression in stroke patients compared to control subjects. 
Among stroke patients, 24 genes represented by differentially expressed exons were male-specific, six were common between sexes, and two were female-specific. In lacunar stroke, expression of 19 differentially expressed exons representing six genes (ADD1, NINJ2, PCSK9, PEMT, SMARCA4, WNK1) decreased in males and increased in females. Results demonstrate alternative splicing and sexually dimorphic expression of most putative risk genes in stroke patients' blood. Since expression was also often cause-specific, sex, and etiology are factors to consider in stroke treatment trials and genetic association studies as society trends toward more personalized medicine.


INTRODUCTION
Genome-wide association studies (GWAS) have identified many loci associated with vascular risk factors (VRFs) and ischemic stroke (IS) risk, including some stroke-cause specific genes such as alpha 1-3-galactosyl-transferase (ABO), which is suggested to be a risk gene for large vessel disease (LVD) and cardioembolism (CE) causes of IS (1). Yet whether these genes also play a direct role in post-stroke pathology, which might identify them as potential targets for stroke treatment, is not well understood. Sexual dimorphism exists for etiology, pathology, outcome, and IS risk factors (2)(3)(4)(5)(6). This has led to the development of male- and female-specific clinical guidelines (7) and prediction models (8). However, GWAS-identified risk loci have only recently been examined for sex differences, and higher heritability has been reported in women (9,10). These studies also found differing heritability between stroke causes, and a recent study reported overlap in risk loci between ischemic subtypes and hemorrhagic strokes (11). However, there are relatively few loci that are consistently associated with stroke. Additionally, we have reported differential alternative gene splicing in stroke patients compared to control subjects (12,13), thus raising questions about whether risk loci may also be alternatively spliced and/or specific for sex or cause of stroke. To partially address these issues, we examined peripheral blood of male and female patients with specific causes of ischemic stroke at gene- and transcript-levels to determine sexually dimorphic expression, stroke cause-specific expression, and alternative splicing of stroke/vascular risk factor genes compared to vascular risk factor matched control subjects.
Gene expression measurement has been revolutionized by technologies that allow assessment at whole transcript (gene), exon, and alternatively spliced transcript levels. Alternative splicing (AS) is the process of exon inclusion or exclusion from final mRNA transcripts that allows a gene to produce several isoforms with cell- and tissue-specific functions (14)(15)(16)(17)(18)(19)(20). AS and its role in transcript diversity is critical for disease susceptibility and severity and is a potential target for treatment (21)(22)(23). Indeed, differential AS has recently been reported following stroke (13), yet it has not been examined with respect to GWAS-implicated stroke or VRF risk genes and/or sex-associated differences in stroke. Therefore, we hypothesized that differential expression and AS of GWAS-identified IS and VRF genes in patients' blood following ischemic stroke is sexually dimorphic and stroke cause-specific when compared to control subjects.

Derivation of Risk Genes List
A PubMed search was conducted in December 2015 and May 2016 using the terms "GWAS," "stroke," "risk," and "genes" to determine those genes that had been identified by genome-wide association studies as having an associated stroke risk. Additionally, the Online Mendelian Inheritance in Man® website (www.OMIM.org) was used to search phenotype association #601367 (ischemic stroke) for associated genes and references. From these sources, a list of 75 gene names was derived, of which 71 could be investigated in Partek® Genomics Suite® (Table 1). The list included genes implicated in any stroke GWAS study, including non-replicated genes such as NINJ2 and WNK1. Additionally, selected genes implicated as vascular risk factors (VRF) specifically for stroke were included in the analyses (n = 71 genes) (Table 1). Due to the dynamic nature of this field of study, this could not be an exhaustive list of risk-associated genes. Thus, forkhead box F2 (FOXF2) and other recently identified loci were not included in the original analyses, but rather were investigated at gene- and transcript-levels after all other analyses were completed. Further, VRF genes were selected based upon a published association with stroke and, here again, we did not set a goal of compiling a comprehensive list of VRF genes.

Study Subjects
Stroke and control subjects were recruited from 2005 through 2013 in medical centers at the Universities of California (UCs) at Davis and San Francisco and at the University of Alberta, Canada (see experimental flow chart, Figure 1). The studies involving human participants were reviewed and approved by the UC Davis Institutional Review Board (IRB) as the base of the study (IRB#248994), with additional site approvals by the UC San Francisco IRB and the University of Alberta Health Research Ethics Board (Biomedical Panel). The studies adhere to all federal and state regulations related to the protection of human research subjects, including The Common Rule, the principles of The Belmont Report, and Institutional policies and procedures. Written informed consent was obtained from all participants or their proxy. Board-certified neurologists diagnosed stroke as previously described (37).
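Samples were matched for age, race, and vascular risk factors, including hypertension, diabetes mellitus, hyperlipidemia, and smoking status. Stroke patient samples were additionally matched for blood draw time since stroke onset. Exclusion criteria included previous stroke, drug abuse, subarachnoid or intracerebral hemorrhage, hemorrhagic transformation, infection, dialysis, cancer, hematological abnormalities, and use of thrombolytics, anticoagulants, or immunosuppressants.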

Blood Collection and RNA Isolation
Blood was collected via venipuncture into PAXgene tubes and RNA isolated as previously described (37). PAXgene tubes for blood collection are pre-filled with a solution that lyses the cells and stabilizes the RNA, which prevents degradation during storage.

Array Design and Array Hybridization
Hybridization to Affymetrix GeneChip® Human Transcriptome 2.0 arrays (Affymetrix, Santa Clara, CA) was performed as described previously (37). The HTA 2.0 array includes exon-exon junction (JUC) and within-exon (PSR) probesets for coding and non-coding RNAs. Control and stroke samples were randomly allocated to microarray batches, and scan date was a covariate in statistical analyses.

Data Normalization and Statistical Analysis
For this exploratory investigation, probes were imported using Partek® Genomics Suite® software version 6.6 (Partek Inc., St. Louis, MO, USA). Robust multi-array average with pre-background adjustment for GC content was used to quantile normalize and log2-transform raw gene expression. After import, 925,032 probeset regions (PSRs) were annotated using Affymetrix specified library files HTA-2_0.r3 version hg19 and then filtered to include only 3,358 within-exon, junction and non-coding PSRs associated with the 71 previously identified risk genes (Table 1).
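As an illustration of the quantile normalization and log2 transformation step, the following is a minimal sketch in Python with NumPy. It is not the Partek implementation: it omits the GC-content background adjustment and probe summarization of robust multi-array average, breaks ties arbitrarily, and uses a small made-up intensity matrix (rows are probesets, columns are arrays).

```python
import numpy as np

def quantile_normalize(matrix):
    """Give every column (array) the same empirical distribution."""
    matrix = np.asarray(matrix, dtype=float)
    order = np.argsort(matrix, axis=0)                  # per-column sort order
    sorted_vals = np.take_along_axis(matrix, order, axis=0)
    reference = sorted_vals.mean(axis=1)                # mean intensity at each rank
    normalized = np.empty_like(matrix)
    # Write the rank-r reference value back to the row that held rank r.
    np.put_along_axis(normalized, order, reference[:, None], axis=0)
    return normalized

# Hypothetical raw intensities: 4 probesets x 3 arrays (illustrative values only).
raw = np.array([[5.0, 4.0, 3.0],
                [2.0, 1.0, 4.0],
                [3.0, 4.0, 6.0],
                [4.0, 2.0, 8.0]])

log2_expr = np.log2(raw)                 # log2-transform raw intensities
normalized = quantile_normalize(log2_expr)
print(normalized.round(3))
```

After normalization every array shares the same set of values, so remaining differences between arrays reflect rank changes of individual probesets rather than global intensity shifts.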
Exploratory analyses of gene- and probeset-specific (including probesets within exons and probesets spanning exon-exon junctions) differential expression were performed using a mixed regression model in Partek® Genomics Suite®. Factors in the model included: diagnosis [IS, vascular risk factor matched control (VRFC)], cause of stroke [CE, LVD, small vessel disease/lacunar (SVD), and VRFC], age, time since event, sex, technical variation (array scan date), and VRFs (hypertension, diabetes mellitus and hyperlipidemia). Although samples were collected from three different medical centers, we did not consider this as a factor in our model because the majority were collected from UC Davis (88%) with very few samples from UC San Francisco (8%) and University of Alberta (4%). Moreover, after filtering the samples on the genes of interest list, we determined that there were no outliers from the University of California, San Francisco or the University of Alberta.
Analyses were performed separately on males and females to decrease bias related to hormonal differences and because sexual dimorphism exists in stroke (43,44). Gene-, exon- and transcript-level expression common to both sexes and unique for each were identified. Differential expression was considered significant with absolute fold change (FC) > 1.2 and p < 0.05. Additionally, differentially expressed genes were filtered to include only significant genes for which more than 30% of the total number of probesets on the array for any particular gene were differentially expressed.
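The significance criteria above can be sketched as a simple filter. The code below is an illustrative Python sketch, not the Partek workflow: the function names, the signed fold change convention (negative values denote decreases), and the example probesets for the placeholder genes GENE_A and GENE_B are all assumptions made for demonstration.

```python
from collections import defaultdict

FC_CUTOFF = 1.2      # absolute fold change threshold
P_CUTOFF = 0.05      # unadjusted p-value threshold
MIN_FRACTION = 0.30  # share of a gene's probesets that must be significant

def signed_fold_change(log2_case_mean, log2_control_mean):
    """Convert a log2 difference to a signed fold change (-2.0 = 2-fold decrease)."""
    ratio = 2.0 ** (log2_case_mean - log2_control_mean)
    return ratio if ratio >= 1.0 else -1.0 / ratio

def is_significant(fold_change, p_value):
    """Apply |FC| > 1.2 and p < 0.05 to one gene or probeset."""
    return abs(fold_change) > FC_CUTOFF and p_value < P_CUTOFF

def genes_passing_probeset_filter(probesets):
    """probesets: iterable of (gene, fold_change, p_value), one per probeset."""
    total = defaultdict(int)
    hits = defaultdict(int)
    for gene, fc, p in probesets:
        total[gene] += 1
        if is_significant(fc, p):
            hits[gene] += 1
    # Keep genes with more than 30% of their probesets differentially expressed.
    return sorted(g for g in total if hits[g] / total[g] > MIN_FRACTION)

# Hypothetical probeset results (placeholder genes, made-up values).
example = [
    ("GENE_A", -1.35, 0.010), ("GENE_A", -1.25, 0.030),
    ("GENE_A", -1.10, 0.200), ("GENE_A", 1.05, 0.600),
    ("GENE_B", 1.40, 0.020), ("GENE_B", 1.10, 0.040),
    ("GENE_B", 1.30, 0.080), ("GENE_B", -1.02, 0.900),
]
print(genes_passing_probeset_filter(example))
```

In this toy input, GENE_A has two of four probesets passing both thresholds (50%) and is retained, while GENE_B has one of four (25%) and is dropped.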
Genes with differential alternative splicing (DAS) were investigated using diagnosis as the main effect in an AS analysis of variance, as per the Partek algorithm, that included subject ID and the same factors used in the mixed regression model. Genes displaying DAS with false discovery rate p (FDR) < 0.3 were considered significant. A more relaxed FDR p-value was chosen due to the stringency of the Partek software Splicing ANCOVA model:

$$
\begin{aligned}
Y_{ijklmnop} = {} & \mu + \text{Scan Date}_i + \text{Diagnosis}_j + \text{Hypertension}_k + \text{Diabetes}_l + \text{Hypercholesterolemia}_m \\
& + \text{Marker ID}_n + (\text{Diagnosis} \times \text{Marker ID})_{jn} \\
& + \text{Sample ID}(\text{Scan Date} \times \text{Diagnosis} \times \text{Hypertension} \times \text{Diabetes} \times \text{Hypercholesterolemia})_{ijklmo} + \varepsilon_{ijklmnop}
\end{aligned}
$$

Where:
• $Y_{ijklmnop}$ represents the p-th observation on the i-th Scan Date, j-th Diagnosis, k-th Hypertension, l-th Diabetes, m-th Hypercholesterolemia, n-th Marker ID, and o-th Sample ID.
• $\mu$ is the common effect for the whole experiment.
• $\varepsilon_{ijklmnop}$ represents the random error present in the p-th observation on the i-th Scan Date, j-th Diagnosis, k-th Hypertension, l-th Diabetes, m-th Hypercholesterolemia, n-th Marker ID, and o-th Sample ID. The errors $\varepsilon_{ijklmnop}$ are assumed to be normally and independently distributed with mean 0 and standard deviation $\delta$ for all measurements.
• $\text{Marker ID}_n$ is the exon-to-exon effect (alt-splicing independent of group). This term also accounts for the fact that not all exons of a gene hybridize to the corresponding probesets (Marker ID) with the same efficiency.
• $(\text{Diagnosis} \times \text{Marker ID})_{jn}$ represents whether an exon is expressed differently at different levels of the specified Alternative Splice Factor(s).
• $\text{Sample ID}(\text{Scan Date} \times \text{Diagnosis} \times \text{Hypertension} \times \text{Diabetes} \times \text{Hypercholesterolemia})_{ijklmo}$ is a sample-to-sample effect. Sample ID and Scan Date are random effects.
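To make the FDR < 0.3 criterion concrete, the sketch below applies the Benjamini-Hochberg step-up adjustment to a list of per-gene splicing p-values. This is an assumption for illustration: the text does not state which FDR procedure the software applied, and the p-values and placeholder gene names are made up.

```python
FDR_CUTOFF = 0.3  # relaxed threshold used for differential alternative splicing

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical alt-splicing p-values for four placeholder genes.
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
p_vals = [0.01, 0.04, 0.03, 0.5]

q_vals = benjamini_hochberg(p_vals)
das_genes = [g for g, q in zip(genes, q_vals) if q < FDR_CUTOFF]
print(das_genes)
```

With a cutoff of 0.3, up to roughly 30% of the genes called significant are expected to be false positives, which is the trade-off accepted in an exploratory analysis.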
Determination of differential gene expression between the three main causes of ischemic stroke, CE, LVD, and SVD, was made by performing sex-specific analysis of variance (ANOVA) on each subtype vs. VRFC. An analysis of covariance (ANCOVA) was performed for each sex as a function of time, with technical variation (array scan date), age and VRFs as co-variates. A list (p < 0.05) was derived and those with a partial correlation r > |0.4| were considered most significant (47).
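The partial correlation criterion (|r| > 0.4) can be illustrated by regressing both expression and time on the covariates and correlating the residuals. The following Python/NumPy sketch is not the Partek ANCOVA; the function name, the single covariate (age), and all numeric values are hypothetical.

```python
import numpy as np

R_CUTOFF = 0.4  # partial correlation threshold used for the time analysis

def partial_correlation(y, x, covariates):
    """Correlation between y and x after regressing both on the covariates."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    # Design matrix: intercept plus one column per covariate.
    z = np.column_stack([np.ones(len(y)), np.asarray(covariates, dtype=float)])
    res_y = y - z @ np.linalg.lstsq(z, y, rcond=None)[0]
    res_x = x - z @ np.linalg.lstsq(z, x, rcond=None)[0]
    return float(res_y @ res_x / np.sqrt((res_y @ res_y) * (res_x @ res_x)))

# Hypothetical data for six patients (made-up values).
age = np.array([60.0, 65.0, 70.0, 75.0, 80.0, 62.0])
hours_since_stroke = np.array([3.0, 24.0, 12.0, 48.0, 6.0, 72.0])
noise = np.array([0.05, -0.03, 0.02, -0.04, 0.01, -0.01])
log2_expression = 8.0 - 0.02 * hours_since_stroke + 0.01 * age + noise

r = partial_correlation(log2_expression, hours_since_stroke, age)
print(f"partial r = {r:.3f}; passes cutoff: {abs(r) > R_CUTOFF}")
```

A negative r here corresponds to expression that falls with time after stroke once age is accounted for; additional covariates (scan date, VRFs) would be added as further columns.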

Participant Characteristics
There

Differential Gene Expression of Risk Genes Differed Between the Sexes and Between the Three Main Causes of IS
Four genes were significantly differentially expressed (DE) between all IS male patients and VRFC males: aldehyde dehydrogenase 2 family (mitochondrial) (ALDH2), arachidonate 5-lipoxygenase activating protein (ALOX5AP), coagulation factor XIII A chain (F13A1), and inositol monophosphatase 2 (IMPA2) (Figure 2, Table 3). All had decreased (FC < −1.2) expression levels (Table 4). No significant genes were found when all female IS patients were compared to female VRFCs (Figure 2, Tables 3, 4).

Differential Gene Expression in Cardioembolism Cause of Ischemic Stroke
At the gene level, integrin subunit beta 3 (ITGB3) had significantly lower expression in female CE patients compared to female VRFC subjects whereas there were no differentially expressed genes in male CE patients (Figure 2, Tables 3, 4).

Differential Gene Expression in Large Vessel Disease Cause of Ischemic Stroke
Adducin 1 (ADD1) had significantly decreased expression in male LVD patients compared to VRFCs, whereas there were no differentially expressed genes in females (Figure 2, Tables 3, 4).

Differential Alternative Splicing in the Risk Genes in Ischemic Stroke Patients Compared to Matched Control Subjects
A striking finding related to AS is that most of the genes with predicted AS for each of the three main causes of stroke investigated are generally unique to each stroke cause (Figure 3, Tables 3, 5-7). This is emphasized by the fact that only two genes, histone deacetylase 9 (HDAC9) and ALDH2, are predicted to have AS in all three causes of stroke (CE, LVD, and SVD) in either male or female patients (Figure 3, Tables 6, 7).

Differential Alternative Splicing in Cardioembolism Cause of Ischemic Stroke
In male CE patients compared to VRFCs, 15 genes were significant for DAS, including previously implicated GWAS CE risk genes: NINJ2, proprotein convertase subtilisin/kexin type 9 (PCSK9) and ZFHX3 (Figure 3A, Table 5A). In female CE patients, 17 genes were significant for DAS, including ADD1, a gene previously reported in the vicinity of a long non-coding RNA that had increased expression in female IS patients compared to control subjects (37) (Figure 3B, Table 5B).

Differential Alternative Splicing in Large Vessel Disease Cause of Ischemic Stroke
There were 21 genes significant for DAS in male LVD patients compared to VRFCs (Figure 3A, Table 5A). These included LVD risk genes: coagulation factor II, thrombin (F2) (10,26), HDAC9 (24), and serum/glucocorticoid regulated kinase 1 (SGK1) (10). LPA was significant for DAS in males for both LVD and CE etiologies and had previously been reported in the vicinity of linc-SLC22A2, a lncRNA with increased differential expression in male IS patients compared to control subjects (37). In contrast, female LVD patients had only six genes significant for DAS (Figure 3B, Table 5B): F13A1, coagulation factor XIII B chain (F13B), F5, interleukin 1 alpha (IL1A), MTHFR and phosphodiesterase 4D (PDE4D), also alternatively spliced in CE patients and previously identified as a CE/LVD risk gene (10,26). Two genes, F13B and IL1A, were significant for DAS in both sexes (Figure 3, Table 5).

Differential Exon/Junction Usage in the Risk Genes of Ischemic Stroke Patients Compared to Matched Control Subjects
Interestingly, of the 71 risk genes that were investigated in this study, 70 (Supplementary Tables 1-3), had significant differential exon/junction usage, yet only four genes were common between sexes in all three main causes of stroke: ADD1, ALDH2, nitric oxide synthase 3 (NOS3) and PDE4D (Supplementary Tables 2, 3), indicating that differential exon and junction expression is sex-and stroke cause-specific. Of the 70 genes with significant differential exon/junction expression, 21 were specific to the male cohort (Supplementary Table 2) and two were female-specific: fibrinogen alpha chain (FGA) and solute carrier family 22 member 3 (SLC22A3) (Supplementary Table 3). Within the specific stroke causes, only nine genes were represented by 23 exons and junctions that were common between the sexes (Supplementary Tables 1-3), yet only one, ITGB3, showed exons/junctions with similar expression patterns in both sexes (Supplementary Tables 2, 3).
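The sex-specific and shared gene counts reported here amount to set differences and intersections between the male and female result lists. A minimal Python sketch follows; the function name and the placeholder gene symbols are hypothetical and do not reproduce the study's gene lists.

```python
def partition_by_sex(male_genes, female_genes):
    """Split two gene lists into male-specific, female-specific, and common sets."""
    male, female = set(male_genes), set(female_genes)
    return {
        "male_specific": sorted(male - female),
        "female_specific": sorted(female - male),
        "common": sorted(male & female),
    }

# Placeholder gene symbols (illustrative only).
male_hits = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
female_hits = ["GENE_C", "GENE_D", "GENE_E"]

groups = partition_by_sex(male_hits, female_hits)
for label, members in groups.items():
    print(label, len(members), members)
```

The same partition can be applied per stroke cause to obtain the cause-specific overlaps shown in the Venn diagrams.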

Differential Exon/Junction Usage in All Causes of Ischemic Stroke
Comparing all IS patients to VRFCs, there were 143 differentially expressed exons and junctions from 30 genes in males (Figure 2, Table 6). One junction, JUC18000387, in IMPA2 was commonly but inversely expressed in male and female patients (Table 6).

Differential Exon/Junction Usage in Large Vessel Disease Cause of Ischemic Stroke
In male LVD patients compared to VRFCs, 312 differentially expressed exons/junctions represented 62 genes (Figure 2, Table 3, Supplementary Table 2), including SPT3 homolog, SAGA and STAGA complex component (SUPT3H) (24,39,49) (Figure 4A, Supplementary Table 2). In female LVD patients, 43 differentially expressed exons/junctions represented 18 genes (Figure 2, Table 3, Supplementary Table 3), all of which were also represented in male LVD patients (Figure 2, Table 3, Supplementary Table 2). However, only three of the 18 genes were unique to the female LVD cohort: LPAL2, ZFHX3, and CDKN2B which, although not previously reported as an LVD-associated risk gene, was also unique to LVD male patients, though the associated expressed exons and junctions differed between the sexes (Figure 4, Table 7A, Supplementary Tables 2, 3). Though not unique to LVD, one associated risk gene, HDAC9 (10,26), was commonly represented in both sexes, albeit by differing exon and junction expression (Supplementary Tables 2, 3). One exon in ALDH2, PSR12011484, was commonly expressed in both sexes but was downregulated in male patients and upregulated in female patients (Table 6).

Differential Exon/Junction Usage in Small Vessel Disease/Lacunar Cause of Ischemic Stroke
Male SVD patients had 148 differentially expressed exons/junctions from 34 genes (Figure 2, Table 3, Supplementary Table 2). Similarly, females had 150 differentially expressed exons/junctions from 34 genes (Figure 2, Table 3, Supplementary Table 3). In male patients, four of the 34 SVD-associated genes were unique to that specific stroke cause: CYP4F2, GP1BA, solute carrier family 4 member 1 (SLC4A1), and F13A1, which was also uniquely expressed in the male cohort (Figure 4A). In female patients, 12 of the 34 SVD-associated genes were unique to that stroke cause, including FGA, SLC22A3, both unique to the female cohort, and SLC4A1 which was also expressed in males, though by different exons and junctions (Figure 4, Tables 6, 7A). Twenty SVD genes were common between the sexes (Figure 2, Table 3, Supplementary Tables 2, 3). However, only eight were represented by 21 common exons/probesets, 18 of which had inverse expression levels between the sexes (Table 6).
FIGURE 3 | Venn diagram depicting alternatively spliced genes by cause of ischemic stroke in male (A) and female (B) stroke patients compared to vascular risk factor control subjects. These genes are derived from the middle column of Figure 2. Numbers in parentheses represent the number of ischemic stroke cause-specific genes for each sex. Bold font indicates sex-specific genes. Italicized font indicates common alternatively spliced genes between sexes but within different causes of IS. Actual statistical significance values for the male cohort are presented in Table 5A and the female cohort in Table 5B. See methods section for statistical analysis protocol. CE, cardioembolic IS; SVD, small vessel disease/lacunar IS; LVD, large vessel disease IS.
Remarkably, six genes represented by differential exon and junction expression in male LVD patients were also represented in female SVD patients, including: F2, an LVD-risk associated gene, nitric oxide synthase 1 (NOS1), paired like homeodomain 2 (PITX2), phospholipid phosphatase 3 (PLPP3), retinoic acid induced 1 (RAI1), and zinc finger C3HC-type containing 1 (ZC3HC1) (Figure 4, Table 7B). However, only one gene probeset, a junction, JUC17001562, in RAI1, was commonly expressed, albeit inversely, in both sexes (Table 7B).
Hyaluronan binding protein 2 (HABP2; Figure 5A), tumor necrosis factor alpha (TNF) (Figure 5B), and GP1BA (Figure 5C) were correlated with time in male SVD patients. GP1BA and angiopoietin-1 (ANGPT1) negatively correlated with time after stroke in females at the gene level (Figures 5D,E). At the exon/junction level, 45 genes represented by 103 exons/junctions were significantly correlated with time after stroke in females (Supplementary Table 4). GP1BA was positively correlated with time in male SVD patients (Figure 5C) and negatively correlated with time in female SVD patients (Figure 5F). A probeset within an exon (PSR19008811) in APOE and a probeset spanning an exon-exon junction (JUC19001725) in LDLR had expression that was negatively correlated with time after stroke in male IS patients (Supplementary Table 4) but positively correlated with time in female IS patients (Supplementary Table 4).

DISCUSSION
Building on our previous results that alternatively spliced genes differ between stroke and vascular risk factor controls (13), we investigated differential expression of putative risk genes after ischemic stroke in male and female patients compared to vascular risk factor matched male and female control subjects. Thus, this is the first systematic study to demonstrate sexual dimorphism and differential alternative splicing (DAS) in expression of many GWAS-identified stroke risk genes in the blood of patients following the three main causes of ischemic stroke compared to control subjects. Notably, of the 71 genes studied, 62 showed differential exon/junction expression in large vessel disease (LVD), 48 showed differential exon/junction expression in small vessel disease (SVD) and 32 showed differential exon/junction expression in cardioembolism (CE). This demonstrates transcriptional responses to stroke in most loci that are dependent upon different causes of stroke.
FIGURE 4 | Venn diagram of ischemic stroke (IS) cause-specific genes with differentially expressed exons/junctions in ischemic stroke vs. vascular risk factor control subjects in the male (A) and in female (B) cohorts. These genes are derived from the far-right column of Figure 2. Numbers in parentheses represent the number of subtype-specific genes per number of total genes differentially expressed within each stroke subtype. Bold underlined gene symbols indicate common genes between male and female cohorts within the same stroke cause. Bold italicized genes are common between male LVD and female SVD. Actual statistical significance (p < 0.05) and fold change (FC > |1.2|) values are presented in Tables 6, 7 and Supplementary Tables 2, 3. See methods section for statistical analysis protocol. CE, cardioembolic IS; SVD, small vessel disease/lacunar IS; LVD, large vessel disease IS.
The American Heart Association (AHA) began identifying genes as risk factors in 2012 (50), adding more each year, though it still does not officially recognize all of the stroke subtype-specific genes in this study (51). Given that this field of study continues to be very dynamic, in addition to the AHA-reported risk genes (52), we included GWAS-identified loci for ischemic stroke in general, as well as additional subtype risk genes and vascular risk factor genes (Table 1). In support of the AHA-acknowledged genes, we found evidence for AS and/or differential expression of four of those genes. Specifically, in male patients ZFHX3, implicated in CE stroke (24,49), and HDAC9, implicated in LVD stroke (24,49), were significant for DAS. ALDH2, implicated in SVD stroke (24), was significant for DAS in male LVD and female SVD patients, and SH2B adaptor protein 3 (SH2B3), also implicated in all/SVD stroke (1), had significant differential exon/junction expression in female SVD patients.
Though several other genes have been genetically associated with specific causes of stroke, their transcriptional post-stroke response was not necessarily specific to the risk-associated cause. Examples include SUPT3H/CDC5L which is implicated in LVD (25) yet showed differential exon/junction usage in male LVD and male SVD patients compared to control subjects. CDKN2A, implicated in LVD and all IS (25), was regulated in LVD in male and female patients and in male CE patients compared to control subjects. PITX2 which has been implicated in CE stroke (25), was differentially expressed in LVD in males and females and in female SVD. The CE associated gene, ZFHX3 (24,49), was represented by differentially expressed exons/junctions in female LVD patients and male SVD patients. ABO, a putative LVD and CE risk gene (25), had differentially expressed exons/junctions in male and female patients with LVD and male CE etiologies. MMP12 implicated in LVD (25) was regulated in male and female LVD. NINJ2, implicated in all IS (25) was regulated in male and female LVD and SVD. NAA25, implicated in all IS (25), showed differential exon expression in female CE, female LVD, and female SVD and DAS in female CE. HDAC9, implicated in LVD and all IS (25), was significant for differential splicing not only in male LVD patients but also in male SVD patients, and showed differential exon expression in male and female LVD and male SVD.
We previously reported that several differentially expressed lncRNAs were in proximity to stroke risk and VRF genes (37). ADD1, a gene associated with increased stroke risk (27), was reported to be in the vicinity of a lncRNA with increased expression in female IS patients compared to controls (37) and was found in this study to be associated with DAS in male and female CE patients and male LVD patients. Additionally, LPA, significant for DAS in male CE and LVD patients, and LPAL2, significant for DAS in male SVD patients, were in proximity to a lncRNA with increased expression in male IS patients compared to VRFCs (37). LPA also had differentially expressed exons in female CE, male LVD, and male and female SVD, and LPAL2 had differentially expressed exons/junctions in male and female LVD and male SVD. These results further support a need for better understanding of lncRNA functions in stroke biology.
WNK1 was of note since it had significant differential gene-level expression in SVD, was significant for DAS in male CE and SVD patients, and had differential exon/junction expression in male CE, female LVD and both male and female SVD. This gene has previously been implicated in atherothrombotic stroke risk (42) but could not be validated in a follow-up study (35). Though we previously reported there were no changes in WNK1 expression, those studies were done at the gene level only, did not include separate sex analyses, and did not include small vessel disease/lacunar stroke (35). Thus, by assessing expression at different levels in both males and females with all three causes of stroke, a more complete picture of WNK1 expression was obtained in this study.

Study Strengths and Limitations
The strengths of this study include investigating post-stroke differential sex- and cause-specific expression of risk genes at gene-, exon- and alternative splicing levels. However, there are several limitations of this study, including the non-exhaustive nature of our gene list due to the dynamic nature of this field of study. This is evidenced by a recently published GWAS (54): we found sexually dimorphic differential expression for many of the genes it reported when we compared stroke patients and control subjects (Table 8). Two genes, semaphorin 4A (SEMA4A) and pre-mRNA processing factor 8 (PRPF8), had significantly decreased expression (FC > |1.2|; FDR < 0.3) in male IS patients compared to VRFC, whereas there were no genes in the female comparison with significant differential expression. Five probesets representing five genes [castor zinc finger 1 (CASZ1), guanylate cyclase 1 soluble subunit alpha 3 (GUCY1A3), leucine rich repeats and calponin homology domain containing 1 (LRCH1), PRPF8, and SH3 and PX domains 2A (SH3PXD2A)] were significantly differentially expressed (FC > |1.2|; FDR < 0.3) between IS and VRFC females. In male IS vs. VRFC, 44 probesets, representing fourteen genes [ankyrin 2 (ANK2), CASZ1, FES proto-oncogene, tyrosine kinase (FES), GUCY1A3, interleukin enhancer binding factor 3 (ILF3), long intergenic non-protein coding RNA 1492 (LINC01492), polyamine modulated factor 1 (PMF1), PRPF8, SEMA4A, SH3PXD2A, solute carrier family 25 member 44 (SLC25A44), solute carrier family 44 member 2 (SLC44A2), zinc finger protein 318 (ZNF318)], were significantly differentially expressed (FC > |1.2|; FDR < 0.3). LRCH1, a gene linked to cardiac mechanism, was represented only in women. In men, eight (ANK2, FES, ILF3, LINC01492, PMF1, SEMA4A, SLC25A44, SLC44A2, ZNF318) of the 14 represented genes were male-specific. Interestingly, the majority of differentially expressed probesets were exon-exon junctions in both men (56.5%) and women (71.4%).
These results, when combined with the differential expression in men of PRPF8, a gene that plays a distinct role in spliceosome assembly (55), suggest that AS may be playing a role in the differential expression of these newly identified stroke risk genes.
Although we did not evaluate differential expression between the stroke causes, we found differential expression of genes that are associated with specific etiologies. LINC01492 is an identified risk gene for LVD (53,54), while PMF1, SLC25A44, and SEMA4A, a gene only recently linked with IS, are all associated with SVD (53,54). These data reinforce the results of our more in-depth analyses that sex and etiology should both be considered in stroke studies. Because stroke is known to affect immune responses, variations in gene expression could reflect changes in blood cell types. Therefore, future research should investigate gene expression of individual blood cell types after stroke. Additional limitations of this study include the relatively small sample size of some of the subgroups, so the results need to be replicated. Alternative splicing was predicted rather than directly measured, as can now be done with some sequencing systems. Race should be addressed in future studies, though our previous studies indicate small effects of race on gene expression following stroke (56).
Because of the putative involvement of these genes in stroke pathophysiology, this study investigates differential expression in the putative risk genes between male and female patients with ischemic stroke of different etiologies and male and female vascular risk factor matched controls but it is not our intention to suggest that sexual dimorphism exists only in the expression of these putative risk genes. Indeed, we expect that sexual dimorphism may affect non-stroke risk genes in a stroke-etiology specific manner. Future studies need to address the broader sexual dimorphism in stroke of different etiologies.

CONCLUSION
This study provides evidence for a role of previously GWAS-identified genes beyond an increased risk for stroke, and future research should investigate their role in post-stroke injury and/or recovery. Many of these genes are alternatively spliced between the three main causes of stroke and are sexually dimorphic. Strikingly, although few of the 71 genes studied pass GWAS statistical criteria (51), 70 of the 71 genes associated with stroke or stroke risk factors had differential expression in stroke patients compared to control subjects. Since sex and cause of stroke had such a large impact on these gene expression results, we suggest a need for stroke treatment trials and stroke genetic studies to be powered for inclusion of these variables as we search for more targeted stroke treatment and as society continues its trend toward more personalized medicine for patients.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by University of California (UC), Davis Institutional Review Board, UC San Francisco Institutional Review Board and University of Alberta Health Research Ethics Board (Biomedical Panel). The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
CD-A analyzed and interpreted the patients' array data. CD-A, FS, BS, and BA were major contributors in writing the manuscript. HH prepared samples and arrays. All authors read and approved the final manuscript.

FUNDING
These studies were supported by an AHA Fellow to Faculty award to GJ, American Heart Association grants to BS and DL, and NIH/NINDS grants to FS, GJ, and BS (NS079153, NS075035, and NS097000).