- 1Department of Neurology, Nippon Medical School, Tokyo, Japan
- 2Laboratory of Complex Trait Genomics, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
- 3Department of Computational Biology and Medical Science, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
- 4Department of Hematology, Nippon Medical School, Tokyo, Japan
Background: Previous multi-ancestry genome-wide association studies (GWAS) of stroke reported 32 stroke risk loci in the MEGASTROKE study. Most studies of the genetic risk score (GRS) for stroke have been conducted predominantly in European general populations. We aimed to explore the association among GRS, clinical characteristics, and mortality in patients with ischemic stroke registered in the BioBank Japan (BBJ) database.
Methods: This is a cohort study of BBJ participants. The project participants were recruited between June 2003 and March 2018. We conducted a GWAS for stroke in 19,702 Japanese patients with ischemic stroke and 159,610 controls. GRS was generated using 29 stroke risk single nucleotide polymorphisms (SNPs) from 32 stroke-related loci identified in the MEGASTROKE. A multivariate logistic regression model was used to estimate odds ratios (ORs) and 95% confidence intervals (95% CIs) for comorbidities and stroke etiology across the GRS. The Cox proportional hazard model was used to estimate hazard ratios (HRs) and 95% CIs for mortality associated with GRS.
Results: The ORs for atrial fibrillation were significantly higher in those at Intermediate GRS [20–80th percentile of GRS; ORs 1.59 (1.25–1.90)] and High GRS [top 20th percentile of GRS; ORs 2.12 (1.69–2.67)] after a full adjustment than in those at Low GRS (bottom 20th percentile of GRS). Regarding stroke etiology, the ORs for cardioembolism were significantly higher in those at Intermediate GRS [ORs 1.31 (1.04–1.61)] and High GRS [ORs 1.44 (1.13–1.89)] than in those at Low GRS. During a median follow-up of 10.0 years, the risk of stroke mortality was significantly higher in those at High GRS [HRs 1.27 (1.04–1.56)] than in those at Low GRS in a fully adjusted model.
Conclusion: In Japanese patients with ischemic stroke, a higher GRS was significantly associated with atrial fibrillation, cardioembolism, and stroke mortality. Our findings suggest that the GRS may predict the risk of stroke mortality and provide insights into the pathogenesis of stroke.
Introduction
Stroke is the second-leading cause of death and the primary cause of neurological disability worldwide (1, 2). Stroke is caused by a complex interplay of environmental and traditional risk factors, including older age, hypertension, diabetes mellitus, dyslipidemia, atrial fibrillation, chronic kidney disease, and smoking (3). Besides conventional clinical risk factors, the genetic contribution to the development of stroke is also widely recognized (4). Twin and family history studies suggest that genetic factors account for part of the stroke risk that conventional risk factors leave unexplained. The heritability estimates were 0.32 for the liability to stroke death and 0.17 for stroke hospitalization or stroke death (5).
Over the last decades, several genome-wide association studies (GWAS) have identified genetic variants associated with stroke in different ethnic populations (6–11). Previous multi-ancestry GWAS of 52,000 subjects in predominantly European-ancestry groups have identified 32 loci associated with stroke and stroke subtypes (MEGASTROKE study) (11). Recent work has highlighted the potential of the genetic risk score (GRS) based on the MEGASTROKE study, which can be evaluated as a risk factor for stroke and used to predict incident stroke events in an independent population (11–16). The risk of incident stroke was higher in those at high genetic risk than in those at low genetic risk (12). A polygenic risk score (PRS) using 3.6 million genetic variants predicted incident stroke in a population of 12,792 healthy older individuals enrolled in the ASPREE trial (Aspirin in Reducing Events in the Elderly) (13). In a genetic cohort analysis pooling 51,288 subjects with cardiometabolic disease from five cardiovascular clinical trials, a GRS using the set of 32 single nucleotide polymorphisms (SNPs) derived from the MEGASTROKE study was a strong, independent predictor of ischemic stroke incidence over a median follow-up period of 2.5 years (14). In the Northern Finland Birth Cohort 1966 of 12,058 children, a higher PRS for stroke was associated with the risk of cerebrovascular disease in mid-life in the Finnish population (15). Although investigation of genetic risk for stroke has been limited in non-European populations, the Hisayama Study, which involved 3,038 Japanese individuals, reported that a PRS for stroke using 350,000 SNPs was significantly associated with stroke incidence during long-term follow-up (median 10.2 years) (16).
Most of the literature on genetic risk for stroke has addressed general populations, regardless of ethnicity; however, no solid evidence describes the clinical significance of genetic risk for stroke in non-European patients with stroke.
To address this gap, we developed a GRS for stroke in Japanese patients with ischemic stroke from the set of 32 stroke risk loci identified in the MEGASTROKE study. We hypothesized that stroke mechanisms and mortality differ between patients with a higher genetic risk of ischemic stroke and those with a lower genetic risk. This cohort study aimed to clarify the association between the GRS, clinical characteristics, and mortality in stroke patients registered in the BioBank Japan (BBJ) database.
Methods
Study participants
The BBJ is a multi-institutional hospital-based registry initially designed to focus on human genetic research (17, 18). All study participants were Japanese individuals registered in the BBJ project. The project aimed to register patients with newly developed diseases (incident cases) as well as those who had been diagnosed and treated before the project started (prevalent cases). The project participants were recruited between June 2003 and March 2018. The biological samples and clinical information were collected and anonymized onsite at the cooperating hospitals. The BBJ 1st cohort consisted of approximately 200,000 patients with 47 common diseases between 2003 and 2007, while the BBJ 2nd cohort included approximately 67,000 patients with 38 diseases between 2013 and 2017. All study participants were diagnosed with one or more of the 51 target diseases, including malignant, cerebral, cardiovascular, respiratory, liver, metabolic, and urologic diseases (17, 18). The identification of ischemic stroke and comorbidities was based on the physicians’ diagnoses written in the medical records and the questionnaire. Ischemic stroke was diagnosed based on clinical presentation and neuroimaging findings, including magnetic resonance imaging (MRI) or computed tomography (CT). If detailed medical record surveys were available, stroke subtypes were determined according to the Trial of ORG 10172 in Acute Stroke Treatment (TOAST) criteria (19). For the genetic association analysis, individuals without any history of stroke or intracranial aneurysm were selected as controls (n = 159,610).
Clinical information
Clinical information, including common clinical variables, disease-specific variables, and laboratory parameters, was collected from each participant upon registration. Among ischemic stroke patients, the following clinical variables were collected: 1) age and sex; 2) systolic and diastolic blood pressure; 3) vascular risk factors, such as hypertension, diabetes, dyslipidemia, and chronic renal failure; 4) atrial fibrillation; 5) congestive heart failure; 6) prior history of ischemic stroke and myocardial infarction; 7) smoking status; 8) alcohol consumption; 9) laboratory parameters; and 10) stroke subtype [large artery atherosclerosis (LAA), cardioembolism (CE), small vessel occlusion (SVO), and stroke of others/undetermined etiology (O/U)].
Genotyping, imputation, and quality control
For biological samples, deoxyribonucleic acid (DNA) and serum were collected from participants at cooperating hospitals and stored in the BBJ DNA and serum banks, respectively. At baseline, a total of 14 mL of whole blood was obtained from each participant using two 7-mL EDTA-containing tubes.
One 7-mL blood sample was sent to one of three commercial laboratories (SRL, BML, or MBC, Japan) for DNA extraction using standard laboratory procedures. Following extraction, DNA concentration was adjusted to 100 ng/mL, aliquoted into three 1-mL tubes labeled with two-dimensional (2D) barcodes, and stored at 4–10 °C. The DNA samples were then delivered to BBJ, where barcode information was verified against anonymized participant identifiers before long-term storage in the BBJ DNA bank at 4 °C (17).
The remaining 7-mL blood sample was centrifuged according to standard protocols at each cooperating hospital. The resulting serum was aliquoted into three 1-mL tubes, labeled with 2D barcodes, and initially stored at −80 °C at each site. After batch collection, serum samples were transported to BBJ, where barcode verification was performed, and samples were subsequently stored in the BBJ serum bank at −150 °C (17).
The individuals included in the GWAS were genotyped using the Illumina HumanOmniExpress Exome BeadChip array or a combination of the Illumina HumanOmniExpress and HumanExome BeadChips in BBJ1, whereas BBJ2 used the Illumina Infinium Asian Screening Array for SNP genotyping. We imputed genotypes using a combined reference panel of the 1000 Genomes Project Phase 3 (20) and the Japanese in-house reference panel from BBJ, with Eagle v2.4.1 for phasing and Minimac4 v1.0.2 for imputation (21). Rigorous quality control filters were applied prior to phasing and imputation, excluding samples with a call rate <0.98 and variants with a heterozygous genotype count (HGC) <5, an SNP call rate <0.99, or a Hardy–Weinberg equilibrium p < 1.0 × 10⁻⁶.
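The variant-level filters quoted above (SNP call rate, heterozygous genotype count, and Hardy–Weinberg equilibrium) can be illustrated with a minimal sketch. It is not the pipeline used in this study: the function names and genotype counts are hypothetical, and the Hardy–Weinberg p value is approximated with a 1-df chi-square test rather than the exact test that genotype QC tools commonly apply.

```python
import math


def hwe_chi_square_p(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Approximate Hardy-Weinberg p value from a 1-df chi-square test."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 df: erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


def passes_variant_qc(
    n_aa: int,
    n_ab: int,
    n_bb: int,
    n_missing: int,
    min_call_rate: float = 0.99,   # SNP call rate threshold from the text
    min_hgc: int = 5,              # heterozygous genotype count threshold
    min_hwe_p: float = 1.0e-6,     # Hardy-Weinberg p value threshold
) -> bool:
    """Return True when a variant survives all three filters."""
    genotyped = n_aa + n_ab + n_bb
    if genotyped == 0:
        return False
    call_rate = genotyped / (genotyped + n_missing)
    return (
        call_rate >= min_call_rate
        and n_ab >= min_hgc
        and hwe_chi_square_p(n_aa, n_ab, n_bb) >= min_hwe_p
    )
```

With hypothetical counts, `passes_variant_qc(250, 500, 250, 0)` keeps a variant in perfect equilibrium, while `passes_variant_qc(400, 200, 400, 0)` drops one with a strong heterozygote deficit.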
Calculating GRS and risk categories
We summarized ORs and 95% CIs for the 32 stroke risk loci in the MEGASTROKE study and BBJ cohorts (Supplementary Table S1). Among the 32 stroke-related loci identified in the MEGASTROKE study, three SNPs (rs146390073, rs12124533, and rs635634) were not assessed in the BBJ cohort. We then constructed a GRS for ischemic stroke using the remaining 29 SNPs in the BBJ cohort. The GRS was calculated with PLINK v2.0 according to the following equation: $\mathrm{GRS}_j = \sum_{i=1}^{N} \beta_i \times \mathrm{dosage}_{ij}$, where N is the number of SNPs in the score, $\beta_i$ is the effect size of variant i, and $\mathrm{dosage}_{ij}$ is the number of copies of SNP i in the genotype of individual j (22). Patients were assigned to three categories according to the GRS: “Low GRS” (bottom 20th percentile of GRS), “Intermediate GRS” (20–80th percentile of GRS), and “High GRS” (top 20th percentile of GRS). For the risk categories, we set “Low GRS” as the reference group.
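As an illustration of the score and the three risk categories, the following sketch computes a weighted sum of allele dosages and assigns percentile-based labels. It is a simplified stand-in for the PLINK-based calculation, not the study's code: the function names and the linear-interpolation percentile rule are assumptions, and the effect sizes would typically be per-allele log odds ratios.

```python
from typing import List, Sequence


def genetic_risk_score(betas: Sequence[float], dosages: Sequence[float]) -> float:
    """Weighted sum over SNPs: GRS_j = sum_i beta_i * dosage_ij."""
    if len(betas) != len(dosages):
        raise ValueError("betas and dosages must cover the same SNPs")
    return sum(b * d for b, d in zip(betas, dosages))


def percentile_cutoff(sorted_scores: Sequence[float], fraction: float) -> float:
    """Percentile by linear interpolation on an already sorted sequence."""
    position = fraction * (len(sorted_scores) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_scores) - 1)
    weight = position - lower
    return sorted_scores[lower] * (1 - weight) + sorted_scores[upper] * weight


def assign_risk_categories(scores: Sequence[float]) -> List[str]:
    """Label scores Low (bottom 20%), Intermediate (20-80%), or High (top 20%)."""
    ordered = sorted(scores)
    low_cut = percentile_cutoff(ordered, 0.20)
    high_cut = percentile_cutoff(ordered, 0.80)
    labels = []
    for score in scores:
        if score <= low_cut:
            labels.append("Low")
        elif score > high_cut:
            labels.append("High")
        else:
            labels.append("Intermediate")
    return labels
```

In this sketch the "Low" label plays the role of the reference group against which the other two categories are compared.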
Statistical analysis
Association between GRS and comorbidities
We compared the clinical variables, including age, sex, and comorbidities for stroke, according to the risk categories of the GRS in all patients (n = 19,702). Continuous variables were expressed as medians and interquartile ranges (IQR) in the text and tables. The significance of intergroup differences was assessed using Fisher’s exact test for categorical variables and the Kruskal–Wallis test for continuous variables with Bonferroni correction. Receiver operating characteristic (ROC) curve analyses were performed to determine the cut-off values of GRS for distinguishing the presence or absence of each comorbidity. In addition to univariate analysis, we investigated the association between the GRS (continuous and risk categories) and each comorbidity using multivariate logistic analysis. We included sex and age as covariates in the sex- and age-adjusted models (Model 1). The fully adjusted model (Model 2) also included sex, age, comorbidities (hypertension, dyslipidemia, diabetes mellitus, atrial fibrillation, congestive heart failure, and chronic kidney failure), history of ischemic stroke and myocardial infarction, smoking, and alcohol consumption. The logistic regression models were used to estimate odds ratios (ORs) and 95% confidence intervals (95% CIs) for each comorbidity across the GRS (continuous and risk categories).
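To make the OR and 95% CI concrete, the sketch below computes an unadjusted odds ratio with a Wald interval from a 2 × 2 table, comparing one GRS category against the Low GRS reference. This is a simplification: the study's estimates come from multivariate logistic regression, which this function does not perform, and the function name and counts are hypothetical.

```python
import math
from typing import Tuple


def odds_ratio_with_ci(
    exposed_cases: int,
    exposed_noncases: int,
    reference_cases: int,
    reference_noncases: int,
    z: float = 1.96,  # normal quantile for a two-sided 95% interval
) -> Tuple[float, float, float]:
    """Unadjusted odds ratio and Wald confidence interval from a 2x2 table."""
    cells = (exposed_cases, exposed_noncases, reference_cases, reference_noncases)
    if min(cells) <= 0:
        raise ValueError("all four cells must be positive")
    a, b, c, d = cells
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

For example, with hypothetical counts of 20 of 100 High GRS patients and 10 of 100 Low GRS patients having a comorbidity, `odds_ratio_with_ci(20, 80, 10, 90)` gives an OR of 2.25 with an interval that just includes 1.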
Association between GRS and stroke subtype
Similarly, we also investigated the association between the GRS and stroke subtype in patients with an identified stroke etiology based on the TOAST criteria (n = 6,608). Univariate and multivariate analyses were conducted using the same methods used to evaluate the significant association between the GRS and comorbidities. The significance of intergroup differences was assessed using Fisher’s exact test for categorical variables with Bonferroni correction. We constructed ROC curves to determine the cutoff values of the GRS for distinguishing stroke subtypes. The logistic regression model was used to estimate ORs and 95% confidence intervals (95% CIs) for each stroke subtype across the GRS (continuous and risk categories).
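The ROC-based cut-off selection can be sketched as follows. The text does not state which criterion defined the cut-off, so Youden's J statistic (sensitivity + specificity − 1) is used here purely as an assumption; the function names are hypothetical, and the area under the curve is computed from pairwise comparisons of cases and non-cases.

```python
from typing import List, Sequence, Tuple


def roc_points(scores: Sequence[float], labels: Sequence[int]) -> List[Tuple[float, float, float]]:
    """(threshold, sensitivity, specificity), calling score >= threshold positive."""
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both outcome classes are required")
    points = []
    for threshold in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
        points.append((threshold, tp / positives, tn / negatives))
    return points


def youden_cutoff(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Threshold maximizing Youden's J = sensitivity + specificity - 1."""
    best = max(roc_points(scores, labels), key=lambda p: p[1] + p[2] - 1)
    return best[0]


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a random case outscores a random non-case (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

An area near 0.5 indicates that the score alone separates the two groups little better than chance, even when the same score shows a significant association in an adjusted regression model.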
Association between GRS and mortality
For the survival analysis, we obtained survival follow-up data based on the cause of death under the ICD-10 code. ROC curve analyses were performed to determine the cut-off values of GRS for distinguishing survival from death. We used Cox proportional hazards models to explore the association between GRS and the risk of mortality [all-cause, stroke, and cardiovascular (CV)] during long-term follow-up (n = 15,468). Models 1 and 2 were similar to those used in the “Association between GRS and comorbidities” subsection. The Cox proportional hazard model was used to estimate hazard ratios (HRs) and 95% CIs for all-cause, stroke, and cardiovascular mortality associated with the GRS (continuous and risk categories).
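For reference, the standard form of the Cox proportional hazards model underlying these estimates can be written as below. The notation is introduced here for illustration only: $\mathbf{z}$ denotes the covariates of Model 1 or Model 2, $h_0(t)$ the unspecified baseline hazard, and $\hat{\beta}$ the estimated coefficient for the GRS (per unit for the continuous score, or per category relative to Low GRS).

```latex
h(t \mid \mathrm{GRS}, \mathbf{z}) = h_0(t)\,\exp\!\left(\beta\,\mathrm{GRS} + \boldsymbol{\gamma}^{\top}\mathbf{z}\right),
\qquad
\mathrm{HR} = e^{\hat{\beta}},
\qquad
95\%\ \mathrm{CI} = \exp\!\left(\hat{\beta} \pm 1.96\,\mathrm{SE}(\hat{\beta})\right)
```

Under this model the HR is assumed constant over follow-up, so an HR above 1 with a CI excluding 1 indicates a higher mortality hazard at every time point relative to the reference.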
Statistical analyses were performed using SPSS Software version 25.0 (IBM), GraphPad Prism 10 (GraphPad), and R version 4.0.0 (R Project for Statistical Computing). In this study, all statistical analyses, including those without Bonferroni correction, were interpreted using a significance threshold of p < 0.05.
Data access statement: All the data used for the analysis are presented in the tables and figures in this article. Data will be shared after obtaining ethical approval if requested by any qualified investigator to replicate the results.
Standard protocol approvals, registrations, and patient consents
Written informed consent was obtained from all participants, and the study was approved by the ethics committees of Nippon Medical School (A-2021-070) and the University of Tokyo (2019-17-0718).
Results
Genome-wide significant stroke loci in the BBJ cohort and baseline clinical characteristics
A total of 267,309 individuals with one or more of the 51 target diseases were registered in the BioBank Japan Project between April 2003 and March 2018. Among them, 21,404 individuals with ischemic stroke were identified based on the diagnoses of physicians. After the quality check, 19,702 patients in the BBJ cohort were examined to generate the GRS for ischemic stroke (Supplementary Table S1).
Table 1 presents the baseline demographic characteristics of the study population. The median age of the participants was 71 years, and 63.1% were male. At the time of registration, blood pressure levels and laboratory parameters were not assessed in any of the subjects. Many patients had traditional comorbidities for stroke, including smoking (53.2%), alcohol (51.4%), previous ischemic stroke (43.6%), hypertension (37.0%), dyslipidemia (18.2%), and diabetes (15.7%). A small proportion of the cohort had myocardial infarction (4.6%), atrial fibrillation (4.4%), congestive heart failure (3.0%), or chronic kidney failure (2.1%). The stroke subtype was identified in 6,608 patients based on the TOAST criteria (Supplementary Figure S1; LAA, n = 1,276; CE, n = 752; SVO, n = 3,687; O/U, n = 893).
Association between GRS and comorbidities
Baseline characteristics by GRS risk category are shown in Supplementary Table S2. Subjects at High GRS were more likely to be younger and male than those at Low GRS. The prevalence of atrial fibrillation was significantly higher among patients at Intermediate and High GRS than among those at Low GRS. The correlation between the GRS, blood pressure, and laboratory parameters is shown in Supplementary Figure S2. The results of the ROC curve analyses for comorbidities and the GRS are presented in Supplementary Figure S3. We investigated the association between GRS (both continuous level and risk categories) and each comorbidity using multivariate logistic analysis (Table 2). The ORs (95% CIs) of atrial fibrillation in those at Intermediate and High GRS were significantly higher in the fully adjusted model than those at Low GRS. Similarly, higher GRS (continuous level) was significantly associated with atrial fibrillation in the fully adjusted model (Table 2 and Figure 1). A significant association was also observed between GRS and hypertension in the fully adjusted model (Table 2 and Figure 1). We also analyzed the association using a GRS constructed from the odds ratios reported in the MEGASTROKE study, and found that only atrial fibrillation was associated with a higher MEGASTROKE GRS (Supplementary Table S4).
Table 2. Association between genetic risk score (GRS) and comorbidities by multivariate logistic analysis.
Figure 1. Impact of the continuous genetic risk score (GRS) for stroke and comorbidities (n = 19,702). The data are presented as estimated odds ratios (ORs) and 95% confidence intervals (CIs) for increased GRS for stroke. Statistical significance was set at p < 0.05. Model 1: adjusted for age and sex. Model 2: adjusted for Model 1 + other stroke comorbidities.
Association between GRS and stroke subtype
The prevalence of stroke subtypes based on the TOAST criteria by GRS category is summarized in Supplementary Table S2. The results of the ROC curve analyses for stroke subtype and the GRS are presented in Supplementary Figure S4. Subjects at Intermediate and High GRS were more likely to have CE than those at Low GRS. Conversely, those at Low and Intermediate GRS were more likely to have SVO than those at High GRS. We investigated the association between GRS (both continuous level and risk categories) and stroke subtype using multivariate logistic analysis (Table 3). The ORs (95% CIs) for CE were significantly higher in those at Intermediate GRS and High GRS in the fully adjusted model (Table 3). Higher GRS (continuous level) was significantly associated with CE in the fully adjusted model (Table 3 and Figure 2). However, a lower GRS was significantly associated with SVO in the fully adjusted model (Table 3 and Figure 2). In contrast, the MEGASTROKE GRS was not significantly associated with cardioembolic stroke (Supplementary Table S5).
Table 3. Association between genetic risk score (GRS) and stroke etiology by multivariate logistic analysis.
Figure 2. Impact of continuous genetic risk score (GRS) for stroke and stroke subtypes based on TOAST criteria (n = 6,608). The data are presented as estimated odds ratios (ORs) and 95% confidence intervals (CIs) for increased GRS for stroke. Statistical significance was set at p < 0.05. Model 1: adjusted for age and sex. Model 2: adjusted for Model 1 + other stroke comorbidities.
Association between GRS and mortality
A total of 15,468 patients were included in the outcome analysis, with a median follow-up of 10.0 years (Supplementary Figure S1). The number of events for mortality was as follows (Table 4): all-cause (n = 6,253, 31.7%), stroke (n = 949, 4.8%), and cardiovascular (n = 1,167, 5.9%). The results of the ROC curve analyses for mortality and the GRS are presented in Supplementary Figure S5. Kaplan–Meier estimates of the cumulative mortality rate were higher in those at High GRS than in those at Low GRS, and Cox proportional hazard analysis showed a significantly higher risk of stroke and CV mortality in those at High GRS (HRs 1.27 [1.04–1.56], p = 0.018 for stroke mortality; HRs 1.27 [1.06–1.53], p = 0.009 for CV mortality) (Table 4 and Figure 3). These significant associations were also observed for the continuous GRS level (HRs 1.30 [1.07–1.56], p = 0.007 for stroke mortality; HRs 1.26 [1.06–1.50], p = 0.007 for CV mortality) (Table 4 and Figure 3). However, no significant association was observed between the GRS and all-cause mortality using the Cox proportional hazard analysis (Table 4 and Figure 3). While the MEGASTROKE GRS was associated with cardiovascular mortality, no association was observed with stroke-related mortality (Supplementary Table S6).
Table 4. Association between genetic risk score (GRS) and mortality by Cox proportional hazard analysis (n = 15,468).
Figure 3. Impact of genetic risk score (GRS) for stroke on long-term mortality in Cox proportional hazard analysis. Kaplan–Meier estimates of cumulative events from (A) all-cause, (B) stroke, and (C) cardiovascular (CV) are shown with a band of 95% CIs. Individuals were classified as having high GRS (red), intermediate GRS (green), or low GRS (blue). (D) Effect of continuous GRS for stroke and mortality (all-cause, stroke, and CV). Data are presented as estimated hazard ratios (HRs) and 95% confidence intervals (CIs) for increased GRS for stroke. Statistical significance was set at p < 0.05. Model 1: adjusted for age and sex. Model 2: adjusted for Model 1 + other stroke comorbidities.
Discussion
We found a strong association between the GRS and atrial fibrillation in Japanese patients with ischemic stroke. In the MEGASTROKE study, approximately half of the identified loci shared genetic variation with related vascular traits, including blood pressure, atrial fibrillation, and lipid levels (11). In the BBJ cohort, the effect sizes of stroke-related loci associated with atrial fibrillation (rs13143308, OR 1.41; rs12932445, OR 1.21) were greater than those of other stroke-related loci, suggesting a stronger genetic contribution of atrial fibrillation-related pathways. Recent cross-ancestry GWAS of atrial fibrillation identified 35 new susceptibility loci using data from BBJ and European cohorts (77,690 cases, 1,167,040 controls) (23). Notably, the PRS for atrial fibrillation predicted increased risks of CV and stroke mortalities and segregated individuals with cardioembolic stroke in undiagnosed atrial fibrillation patients (23). Moreover, Ebara et al. (24) reported that GRS using eight risk loci for atrial fibrillation was associated with the risk of ischemic stroke in patients with atrial fibrillation in the BBJ cohort. Therefore, the genetics of atrial fibrillation may play a clinically important role in Japanese patients with ischemic stroke.
Regarding the association between GRS and vascular risk factors, there was no association between the GRS and hyperlipidemia, while a significant association was observed between the GRS and hypertension. The effect sizes of stroke-related loci associated with blood pressure (rs880315, OR 1.06; rs1689638, OR 1.05; rs4932370, OR 1.07; rs35436, OR 1.08) were generally larger than those of lipid-related loci (e.g., rs8103309, OR 1.06) in the BBJ dataset. The GRS used in this analysis was derived from GWAS signals related to overall ischemic stroke risk, and therefore may not fully capture genetic pathways specific to lipid metabolism. Widespread statin use in Japan may attenuate the association between genetic predisposition and clinically diagnosed hyperlipidemia, as lipid levels can be modified irrespective of genetic background, potentially reducing the detectability of genetic effects (25). Furthermore, ethnic differences in lipid-related genetic architecture may also contribute. Several lipid-associated loci identified in European populations demonstrate smaller or inconsistent effect sizes in East Asian populations (26), which may partially explain the weaker association observed in our cohort. Taken together, these factors likely contributed to the absence of a clear relationship between the GRS and hyperlipidemia in the present study.
Among patients with an identified stroke etiology based on the TOAST criteria, a higher GRS was also associated with CE and atrial fibrillation. This finding suggests that part of the observed association between the GRS and CE may indeed be mediated through atrial fibrillation. Previous epidemiological studies, including the Framingham Heart Study (27) and the Asymptomatic Atrial Fibrillation and Stroke Evaluation in Pacemaker Patients and the Atrial Fibrillation Reduction Atrial Pacing Trial (28), have consistently demonstrated the profound impact of atrial fibrillation on CE risk. Furthermore, large-scale GWAS have identified multiple genetic loci that contribute to atrial fibrillation susceptibility, supporting the notion that genetic predisposition may influence CE partly through atrial fibrillation-related pathways (29, 30). Identifying the underlying stroke etiology is important because its pathophysiology has consequences for acute treatment and secondary stroke prevention. Traditionally, an ischemic stroke with an unclear etiology based on the TOAST criteria was classified as cryptogenic stroke (31). In 2014, the concept of embolic stroke of undetermined source (ESUS) was developed based on previous observations that patients with non-lacunar cryptogenic ischemic stroke were likely to have embolic stroke mechanisms (31, 32). Occult paroxysmal atrial fibrillation has been considered an important cause of ESUS (33, 34). With an insertable cardiac monitor (ICM), atrial fibrillation was detected in 5.1% of patients in the in-hospital setting and in 8.9, 12.4, and 30.0% at 6, 12, and 36 months after stroke, respectively (33, 34). To date, no detailed investigations have addressed the GRS in the diagnostic work-up of ESUS with ICM, and further research is needed.
We found a significant association between lower GRS and SVO. A previous Japanese cohort study using two independent data sets (Kyusyu U data set and JPJM data set) showed that the ORs of the top PRS quintiles were significantly higher than those of the lowest PRS quintiles when compared with the control group in CE (134 cases, 134 matched controls), LAA (360 cases, 360 matched controls), and SVO (486 cases, 486 matched controls) (35). In the ASPREE trial, continuous GRS was a significant predictor of the risk of large vessel and cardioembolic stroke but not of small vessel stroke among 12,792 healthy older individuals over 5 years (13). Several factors may explain the discrepant associations between GRS and SVO across populations. Methodological differences between BBJ and European cohorts likely contribute. European studies, including the ASPREE trial (13), often adopt standardized MRI protocols with centralized image review, whereas imaging environments in large Japanese registries are more heterogeneous, potentially affecting subtype definitions and comparability. Ethnic variation in susceptibility to cerebral small vessel disease may also play a role. Previous reports have shown higher burdens of microbleeds and white matter lesions in East Asian individuals compared with Europeans, independent of conventional vascular risk factors (36, 37). Such population-specific predispositions may strengthen the association between lower GRS and SVO in Japanese cohorts.
GRS was not associated with LAA in the BBJ cohort, although the genetic loci rs7610618 and rs10820405 have been associated with LAA in European populations in the MEGASTROKE study (11). In this cohort, SVO accounted for more than half of all cases (56%), whereas LAA and CE represented 19 and 11%, respectively. In contrast, data from a Japanese multicenter, hospital-based acute stroke registry (n = 10,392) reported the distribution of TOAST subtypes as follows: LAA, 30%; CE, 27%; and SVO, 22% (38). Furthermore, ethnic differences in the genetic architecture of atherosclerosis may contribute to the observed discrepancies. Previous trans-ethnic meta-analyses have shown that LAA-associated loci identified in European populations, such as rs7610618 and rs10820405, often differ in allele frequency and linkage disequilibrium structure in East Asians, resulting in attenuated or non-replicated effect size (11). These findings suggest that population characteristics and genetic heterogeneity should be carefully considered when interpreting GRS associations with LAA across different ethnic groups.
In the present study, we confirmed a significant association between the GRS for stroke and the risk of stroke and CV mortality during a long-term follow-up. Previous cohort studies have shown a significant association between a higher genetic risk for stroke and the incidence of future stroke in the general population (11, 12, 39). However, the clinical relevance of the GRS for stroke to survival prediction in patients with stroke has not been well established. Previous GWAS for atrial fibrillation and coronary artery disease showed that a higher PRS for these diseases significantly increased the risk of CV mortality in the BBJ cohort (23, 40). Our findings are strengthened by the large cohort design with long-term follow-up, which allowed us to explore the association between the GRS for stroke and stroke mortality as well as cardiovascular mortality in Japanese individuals.
This study had several limitations. First, the cause of mortality was identified based on the ICD-10 codes. Clinical information in the BBJ cohort may not include as many clinical details as individual hospital records regarding survival analysis. Second, only one-third of the participants were diagnosed with stroke subtypes based on the TOAST criteria. Moreover, most patients with stroke in the BBJ cohort had mild disabilities or were in a stable chronic stage. The difference in baseline characteristics, including risk factors, stroke subtypes, and neurological severity, should be taken into account when investigating the clinical significance of GRS in patients with ischemic stroke. Third, detailed information on neuroimaging findings was not available in the BBJ cohort. Previous GWAS in general cohorts have assessed imaging markers of cerebral small vessel disease on brain MRI, including white matter lesions (41), cerebral microbleeds (42), and perivascular spaces (43). However, it remains unclear whether there exists an association between genetic risk for stroke and other MRI findings, such as acute ischemic volume on diffusion-weighted imaging as well as large vessel involvement due to atherosclerotic changes or cardiac embolism. Fourth, in the present study, we restricted our analysis to patients with ischemic stroke because our primary objective was to characterize differences in risk profiles within the stroke population. Finally, to confirm an association between the GRS, clinical characteristics, and mortality in a non-European population, our study population included only Japanese subjects from the BBJ cohort. Recent cross-ancestry GWAS meta-analyses of 110,182 stroke patients (GIGASTROKE study) identified 89 independent stroke risk loci (39).
Higher GIGASTROKE GRS was significantly associated with increased risk of stroke in the East Asian cohort (1,312 participants of whom 27 developed an incident stroke over a 3-year follow-up; HRs = 1.49, 95% CIs = 1.00–2.21, p = 0.048), whereas the MEGASTROKE GRS was not associated with incident stroke (HRs = 0.82, 95% CIs = 0.55–1.23, p = 0.34) (39). In the ROC analyses of this study, the predictive value of GRS for comorbidity, stroke subtype, and mortality risk was lower than that obtained using the GRS as a continuous variable or as a categorical classification in the multivariate analysis. Further studies, including the GIGASTROKE GRS, brain MRI findings, and clinical outcomes, are needed to clarify the clinical significance of genetic risk in patients with ischemic stroke.
This large cohort study demonstrated an association among the GRS for stroke, clinical characteristics, and mortality in Japanese patients with ischemic stroke. The GRS for stroke was significantly associated with atrial fibrillation, CE, and stroke mortality. Our findings suggest that the GRS for stroke may provide insights into the pathogenesis of stroke and predict the risk of stroke mortality.
Data availability statement
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary material.
Ethics statement
The studies involving humans were approved by the Ethics Committees of Nippon Medical School (Approval number: A-2021-070) and the University of Tokyo (Approval number: 2019-17-0718). The studies were conducted in accordance with the local legislation and institutional requirements. The participants provided their written informed consent to participate in this study.
Author contributions
TS: Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing. YK: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing – review & editing. KM: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing – review & editing. HY: Conceptualization, Data curation, Investigation, Methodology, Project administration, Resources, Supervision, Validation, Visualization, Writing – review & editing. KK: Conceptualization, Methodology, Supervision, Validation, Visualization, Writing – review & editing.
Funding
The author(s) declared that financial support was received for this work and/or its publication. This work was supported by a Grant-in-Aid for Scientific Research (C) (21K07445 and 25K10799) from the Japan Society for the Promotion of Science (JSPS).
Acknowledgments
We would like to thank all the participants for their willingness and time devoted to this study and extend our appreciation to the BBJ project team for their assistance in data collection.
Conflict of interest
The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declared that Generative AI was not used in the creation of this manuscript.
Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fneur.2026.1664594/full#supplementary-material
References
2. GBD 2016 Stroke Collaborators. Global, regional, and national burden of stroke, 1990-2016: a systematic analysis for the global burden of disease study 2016. Lancet Neurol. (2019) 18:439–58. doi: 10.1016/S1474-4422(19)30034-1
3. Boehme, AK, Esenwa, C, and Elkind, MS. Stroke risk factors, genetics, and prevention. Circ Res. (2017) 120:472–95. doi: 10.1161/CIRCRESAHA.116.308398
4. Dichgans, M, Pulit, SL, and Rosand, J. Stroke genetics: discovery, biology, and clinical applications. Lancet Neurol. (2019) 18:587–99. doi: 10.1016/S1474-4422(19)30043-2
5. Bak, S, Gaist, D, Sindrup, SH, Skytthe, A, and Christensen, K. Genetic liability in stroke: a long term follow up study of Danish twins. Stroke. (2002) 33:769–74. doi: 10.1161/hs0302.103619
6. Bevan, S, Traylor, M, Adib-Samii, P, Malik, R, Paul, NL, Jackson, C, et al. Genetic heritability of ischemic stroke and the contribution of previously reported candidate gene and genomewide associations. Stroke. (2012) 43:3161–7. doi: 10.1161/STROKEAHA.112.665760
7. Gretarsdottir, S, Thorleifsson, G, Manolescu, A, Styrkarsdottir, U, Helgadottir, A, Gschwendtner, A, et al. Risk variants for atrial fibrillation on chromosome 4q25 associate with ischemic stroke. Ann Neurol. (2008) 64:402–9. doi: 10.1002/ana.21480
8. Gudbjartsson, DF, Holm, H, Gretarsdottir, S, Thorleifsson, G, Walters, GB, Thorgeirsson, G, et al. A sequence variant in ZFHX3 on 16q22 associates with atrial fibrillation and ischemic stroke. Nat Genet. (2009) 41:876–8. doi: 10.1038/ng.417
9. International Stroke Genetics Consortium (ISGC) et al. Genome-wide association study identifies a variant in HDAC9 associated with large vessel ischemic stroke. Nat Genet. (2012) 44:328–33. doi: 10.1038/ng.1081
10. Traylor, M, Farrall, M, Holliday, EG, Sudlow, C, Hopewell, JC, Cheng, Y-C, et al. Genetic risk factors for ischaemic stroke and its subtypes (the METASTROKE collaboration): a meta-analysis of genome-wide association studies. Lancet Neurol. (2012) 11:951–62. doi: 10.1016/S1474-4422(12)70234-X
11. Malik, R, Chauhan, G, Traylor, M, Sargurupremraj, M, Okada, Y, Mishra, A, et al. Multiancestry genome-wide association study of 520,000 subjects identifies 32 loci associated with stroke and stroke subtypes. Nat Genet. (2018) 50:524–37. doi: 10.1038/s41588-018-0058-3
12. Rutten-Jacobs, LC, Larsson, SC, Malik, R, Rannikmäe, K, MEGASTROKE consortium, International Stroke Genetics Consortium, et al. Genetic risk, incident stroke, and the benefits of adhering to a healthy lifestyle: cohort study of 306 473 UK Biobank participants. BMJ. (2018) 363:k4168. doi: 10.1136/bmj.k4168
13. Neumann, JT, Riaz, M, Bakshi, A, Polekhina, G, Thao, LTP, Nelson, MR, et al. Predictive performance of a polygenic risk score for incident ischemic stroke in a healthy older population. Stroke. (2021) 52:2882–91. doi: 10.1161/STROKEAHA.120.033670
14. Marston, NA, Patel, PN, Kamanu, FK, Nordio, F, Melloni, GM, Roselli, C, et al. Clinical application of a novel genetic risk score for ischemic stroke in patients with cardiometabolic disease. Circulation. (2021) 143:470–8. doi: 10.1161/CIRCULATIONAHA.120.051927
15. Hyytiäinen, V, Ala-Mursula, L, Oura, P, Paananen, M, Karhunen, V, Rusanen, H, et al. Clusters of parental socioeconomic status in early childhood and inherited risk for cerebrovascular disease until mid-life—northern Finland birth cohort 1966. Int J Stroke. (2025) 20:85–94. doi: 10.1177/17474930241282521,
16. Hachiya, T, Hata, J, Hirakawa, Y, Yoshida, D, Furuta, Y, Kitazono, T, et al. Genome-wide polygenic score and the risk of ischemic stroke in a prospective cohort: the Hisayama study. Stroke. (2020) 51:759–65. doi: 10.1161/STROKEAHA.119.027520
17. Nagai, A, Hirata, M, Kamatani, Y, Muto, K, Matsuda, K, Kiyohara, Y, et al. Overview of the BioBank Japan project: study design and profile. J Epidemiol. (2017) 27:S2–8. doi: 10.1016/j.je.2016.12.005
18. Hirata, M, Kamatani, Y, Nagai, A, Kiyohara, Y, Ninomiya, T, Tamakoshi, A, et al. Cross-sectional analysis of BioBank Japan clinical data: a large cohort of 200,000 patients with 47 common diseases. J Epidemiol. (2017) 27:S9–S21. doi: 10.1016/j.je.2016.12.003
19. Adams, HP Jr, Bendixen, BH, Kappelle, LJ, Biller, J, Love, BB, Gordon, DL, et al. Classification of subtype of acute ischemic stroke. Definitions for use in a multicenter clinical trial. TOAST. Trial of org 10172 in acute stroke treatment. Stroke. (1993) 24:35–41. doi: 10.1161/01.str.24.1.35
20. 1000 Genomes Project Consortium, Abecasis, GR, Altshuler, D, Auton, A, Brooks, LD, Durbin, RM, et al. A map of human genome variation from population-scale sequencing. Nature. (2010) 467:1061–73. doi: 10.1038/nature09534
21. Akiyama, M, Ishigaki, K, Sakaue, S, Momozawa, Y, Horikoshi, M, Hirata, M, et al. Characterizing rare and low-frequency height-associated variants in the Japanese population. Nat Commun. (2019) 10:4393. doi: 10.1038/s41467-019-12276-5
22. Collister, JA, Liu, X, and Clifton, L. Calculating polygenic risk scores (PRS) in UK Biobank: a practical guide for epidemiologists. Front Genet. (2022) 13:818574. doi: 10.3389/fgene.2022.818574
23. Miyazawa, K, Ito, K, Ito, M, Zou, Z, Kubota, M, Nomura, S, et al. Cross-ancestry genome-wide analysis of atrial fibrillation unveils disease biology and enables cardioembolic risk prediction. Nat Genet. (2023) 55:187–97. doi: 10.1038/s41588-022-01284-9
24. Ebana, Y, Liu, L, Ihara, K, Abe, K, Terao, C, Kamatani, Y, et al. Genetic risk score of cerebral infarction in atrial fibrillation genome-wide association study. Eur J Clin Investig. (2023) 53:e14084. doi: 10.1111/eci.14084
25. Wake, M, Onishi, Y, Guelfucci, F, Oh, A, Hiroi, S, Shimasaki, Y, et al. Treatment patterns in hyperlipidaemia patients based on administrative claim databases in Japan. Atherosclerosis. (2018) 272:145–52. doi: 10.1016/j.atherosclerosis.2018.03.023
26. Kuchenbaecker, K, Telkar, N, Reiker, T, Walters, RG, Lin, K, Eriksson, A, et al. The transferability of lipid loci across African, Asian and European cohorts. Nat Commun. (2019) 10:4330. doi: 10.1038/s41467-019-12026-7
27. Wolf, PA, Abbott, RD, and Kannel, WB. Atrial fibrillation as an independent risk factor for stroke: the Framingham study. Stroke. (1991) 22:983–8. doi: 10.1161/01.str.22.8.983
28. Healey, JS, Connolly, SJ, Gold, MR, Israel, CW, Van Gelder, IC, Capucci, A, et al. Subclinical atrial fibrillation and the risk of stroke. N Engl J Med. (2012) 366:120–9. doi: 10.1056/NEJMoa1105575
29. Roselli, C, Chaffin, MD, Weng, LC, Aeschbacher, S, Ahlberg, G, Albert, CM, et al. Multi-ethnic genome-wide association study for atrial fibrillation. Nat Genet. (2018) 50:1225–33. doi: 10.1038/s41588-018-0133-9
30. Nielsen, JB, Thorolfsdottir, RB, Fritsche, LG, Zhou, W, Skov, MW, Graham, SE, et al. Biobank-driven genomic discovery yields new insight into atrial fibrillation biology. Nat Genet. (2018) 50:1234–9. doi: 10.1038/s41588-018-0171-3
31. Diener, HC, Easton, JD, Hart, RG, Kasner, S, Kamel, H, and Ntaios, G. Review and update of the concept of embolic stroke of undetermined source. Nat Rev Neurol. (2022) 18:455–65. doi: 10.1038/s41582-022-00663-4
32. Hart, RG, Diener, HC, Coutts, SB, Easton, JD, Granger, CB, O'Donnell, MJ, et al. Embolic strokes of undetermined source: the case for a new clinical construct. Lancet Neurol. (2014) 13:429–38. doi: 10.1016/S1474-4422(13)70310-7
33. Sanna, T, Diener, HC, Passman, RS, Lazzaro, VD, Bernstein, RA, Morillo, CA, et al. Cryptogenic stroke and underlying atrial fibrillation. N Engl J Med. (2014) 370:2478–86. doi: 10.1056/NEJMoa1313600
34. Sposato, LA, Cipriano, LE, Saposnik, G, Vargas, ER, Riccio, PM, and Hachinski, V. Diagnosis of atrial fibrillation after stroke and transient ischaemic attack: a systematic review and meta-analysis. Lancet Neurol. (2015) 14:377–87. doi: 10.1016/S1474-4422(15)70027-X
35. Hachiya, T, Kamatani, Y, Takahashi, A, Hata, J, Furukawa, R, Shiwa, Y, et al. Genetic predisposition to ischemic stroke: a polygenic risk score. Stroke. (2017) 48:253–8. doi: 10.1161/STROKEAHA.116.014506
36. Shu, J, Neugebauer, H, Li, F, Lulé, D, Müller, H-P, Zhang, J, et al. Clinical and neuroimaging disparity between Chinese and German patients with cerebral small vessel disease: a comparative study. Sci Rep. (2019) 9:20015. doi: 10.1038/s41598-019-55899-w
37. Shi, Y, and Wardlaw, JM. Update on cerebral small vessel disease: a dynamic whole-brain disease. Stroke Vasc Neurol. (2016) 1:83–92. doi: 10.1136/svn-2016-000035
38. Miwa, K, Koga, M, Nakai, M, Yoshimura, S, Sasahara, Y, Koge, J, et al. Etiology and outcome of ischemic stroke in patients with renal impairment including chronic kidney disease. Neurology. (2022) 98:e1738–47. doi: 10.1212/WNL.0000000000200153
39. Mishra, A, Malik, R, Hachiya, T, Jürgenson, T, Namba, S, Posner, DC, et al. Stroke genetics informs drug discovery and risk prediction across ancestries. Nature. (2022) 611:115–23. doi: 10.1038/s41586-022-05165-3
40. Koyama, S, Ito, K, Terao, C, Akiyama, M, Horikoshi, M, Momozawa, Y, et al. Population-specific and trans-ancestry genome-wide analysis identify distinct and shared genetic risk loci for coronary artery disease. Nat Genet. (2020) 52:1169–97. doi: 10.1038/s41588-020-0705-3
41. Sargurupremraj, M, Suzuki, H, Jian, X, Sarnowski, C, Evans, TE, Bis, JC, et al. Cerebral small vessel disease genomics and its implications across the lifespan. Nat Commun. (2020) 11:6285. doi: 10.1038/s41467-020-19111-2
42. Knol, MJ, Lu, D, Traylor, M, Adams, HHH, Romero, JRJ, Smith, AV, et al. Association of common genetic variants with brain microbleeds: a genome-wide association study. Neurology. (2020) 95:e3331–43. doi: 10.1212/WNL.0000000000010852
43. Duperron, MG, Knol, MJ, Le Grand, Q, Evans, TE, Mishra, A, Tsuchida, A, et al. Genomics of perivascular space burden unravels early mechanisms of cerebral small vessel disease. Nat Med. (2023) 29:950–62. doi: 10.1038/s41591-023-02268-w
Glossary
GWAS - genome-wide association studies
GRS - genetic risk score
ASPREE - aspirin in reducing events in the elderly
PRS - polygenic risk score
SNPs - single nucleotide polymorphisms
BBJ - BioBank Japan
MRI - magnetic resonance imaging
CT - computed tomography
TOAST - Trial of ORG 10172 in Acute Stroke Treatment
LAA - large artery atherosclerosis
CE - cardioembolism
SVO - small vessel occlusion
O/U - others/undetermined
DNA - deoxyribonucleic acid
EDTA - ethylenediaminetetraacetic acid
HGC - heterozygous genotype count
ORs - odds ratios
CIs - confidence intervals
IQR - interquartile ranges
ROC - receiver operating characteristic
ICD - International Classification of Diseases
CV - cardiovascular
HRs - hazard ratios
ESUS - embolic stroke of undetermined source
ICM - insertable cardiac monitor
Keywords: BioBank Japan, cohort study, genetic risk, GWAS, stroke
Citation: Shimoyama T, Kamatani Y, Matsuda K, Yamaguchi H and Kimura K (2026) Genetic risk impacts stroke mortality and pathogenesis in patients with ischemic stroke: a cohort study of BioBank Japan. Front. Neurol. 17:1664594. doi: 10.3389/fneur.2026.1664594
Edited by:
Shinichiro Uchiyama, Sanno Medical Center, Japan
Reviewed by:
Georgia Damoraki, National and Kapodistrian University of Athens, Greece
Shigeru Nogawa, Tokai University Hachioji Hospital, Japan
Copyright © 2026 Shimoyama, Kamatani, Matsuda, Yamaguchi and Kimura. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Takashi Shimoyama, s-takashi@nms.ac.jp
†These authors have contributed equally to this work and share first authorship