Association of Genetic Variants With Migraine Subclassified by Clinical Symptoms in Adult Females

Migraine is heritable and formally diagnosed by structured criteria that require presence of some but not all possible migraine symptoms, which include aura, several distinct manifestations of pain, nausea/vomiting, and sensitivity to light or sound. The most recent genome-wide genetic association study (GWAS) for migraine identified 38 loci. We investigated whether 46 single-nucleotide polymorphisms (SNPs), i.e., genetic variants, at these loci may have especially pronounced, i.e., selective, association with migraine presenting with individual symptoms compared to absence of migraine. Selective genetic associations of SNPs were evaluated through a likelihood framework in the Women's Genome Health Study (WGHS), a population-based cohort of middle-aged women including 3,003 experiencing migraine and 18,108 not experiencing migraine, all with genetic information. SNPs at 12 loci displayed significant selective association for migraine subclassified by specific symptoms, among which six selective associations are novel. Symptoms showing selective association include aura, nausea/vomiting, photophobia, and phonophobia. The selective associations were consistent whether the women met all formal diagnostic criteria for migraine or lacked one of the diagnostic criteria, formally termed probable migraine. Subsequently, we performed latent class analysis of migraine diagnostic symptoms among 69,861 women experiencing migraine from the WGHS recruitment sample to assess whether there were clusters of specific symptoms that might also have a genetic basis. However, no globally robust latent migraine substructures of diagnostic symptoms were observed, nor were selective genetic associations with specific combinations of symptoms revealed among weakly supported latent classes.
The findings extend previously reported selective genetic associations with migraine diagnostic symptoms while supporting models for shared genetic susceptibility across all qualifying migraine at many loci.

Keywords: latent class analyses, migraine diagnostic criteria, migraine with and without aura, migraine pain, genetic association analysis
Beyond MA and MO, migraine may be further subclassified according to symptoms constituting migraine diagnosis (11) including pain character, photophobia, phonophobia, attack duration, and nausea/vomiting. Previously, we reported significant selective associations at 4 of 12 single-nucleotide polymorphisms (SNPs) from an early migraine GWAS (6) with migraine subclassified according to aura status or individual diagnostic symptoms (5). In principle, latent subclasses of aura status and the diagnostic symptoms may also underlie the heterogeneity of migraine presentation and may be accompanied by unique genetics (12)(13)(14)(15)(16). However, previous latent class analysis (LCA) of diagnostic symptoms among 6,265 twins (13) found that potential latent structure was consistent with a continuum of genetic liability rather than distinct genetics for each latent subclass.
Here, we expand on the existing literature (5,10,13), testing for selective associations with aura status and the diagnostic symptoms at 46 SNPs from 38 loci from the most recent GWAS (7). We also explored whether any selective genetic associations may be extended to self-reported migraineurs who do not meet full diagnostic criteria (11). Finally, we revisit migraine latent classes and potential corresponding selective genetics among the 46 SNPs in a sample with unprecedented power including 69,861 migraineurs.

Study Population
The current study leveraged data from the Women's Health Study (WHS) (Figure 1). The design, methods, and results of the WHS have been described in detail previously (18)(19)(20). In brief, the WHS was a randomized, placebo-controlled trial designed to test the benefits and risks of low-dose aspirin and vitamin E in the primary prevention of cardiovascular disease and cancer among 39,876 apparently healthy female healthcare professionals aged 45 or older at baseline. During 1992-1995, over 1.7 million female healthcare professionals were recruited to join the study, including women both younger and older than 45 (although the trial only included women aged 45 or older), of whom 453,787 returned baseline questionnaires, which included questions for migraine assessment. The analytic sample for assessing migraine latent classes was the subset of 69,861 baseline respondents with self-identified European ancestry who reported having a migraine in the year preceding recruitment. The sample for the genetic analysis is derived from the Women's Genome Health Study (WGHS) (17), a subset of randomized WHS participants including 23,294 WHS participants with whole genome genotype data and verified European ancestry.

Abbreviations: GWAS, genome-wide association study; ICHD, International Classification of Headache Disorders; LCA, latent class analysis; LD, linkage disequilibrium; LRT, likelihood ratio test; MA, migraine with typical aura; MO, migraine without aura; SNP, single-nucleotide polymorphism.

Figure 1 legend (17): *Sample for latent class analysis. ∧Sample for analysis of selective genetic association by the likelihood procedure using the diagnostic symptoms or latent class assignments.

Migraine Assessment
Migraine in the WHS was assessed at baseline by self-report as described previously (5,21). Briefly, participants were asked: "Have you ever had migraine headaches?" and "In the past year, have you had migraine headaches?" Based on their responses, participants were classified as having no history of migraine, "active" migraine, i.e., migraine experienced within the past year, or "prior" migraine, i.e., migraine experienced previous to the past year. Participants reporting active migraine were additionally asked about further symptoms of their migraine attacks related to the International Classification of Headache Disorders (ICHD) criteria, such as aura status, nausea/vomiting, phonophobia, and photophobia, and included frequency of migraine attacks (21). Responses to these questions allowed classification using modified ICHD-2 criteria as described previously, including the formal diagnosis of probable migraine, defined in the ICHD as missing just one of the diagnostic criteria (Supplementary Table 1) (21). The migraine frequency variable was dichotomized as fewer than six attacks/year compared with 6 or more attacks/year. Of the 23,294 WGHS participants for genetics, analysis was restricted to the 3,003 with active migraine and 18,108 reporting no migraine, excluding 2,119 reporting prior but not active migraine and 64 with missing migraine status.
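The classification rules and sample accounting described above can be sketched as follows. This is only an illustration: the function names, the string labels, and the use of `None` for a missing answer are hypothetical encodings, not the coding actually used in the WHS.

```python
from typing import Optional


def migraine_status(ever: Optional[bool], past_year: Optional[bool]) -> str:
    """Map the two questionnaire answers to a status label.

    None stands for a missing answer (a hypothetical encoding).
    """
    if ever is None:
        return "missing"
    if not ever:
        return "none"
    if past_year is None:
        return "missing"
    return "active" if past_year else "prior"


def is_high_frequency(attacks_per_year: int) -> bool:
    """Dichotomize attack frequency: 6 or more attacks/year versus fewer."""
    return attacks_per_year >= 6


# Sample accounting reported in the text for the WGHS participants.
counts = {"active": 3003, "none": 18108, "prior": 2119, "missing": 64}
total_wghs = sum(counts.values())
# Only active migraineurs and women reporting no migraine enter the genetic analysis.
analysed = counts["active"] + counts["none"]
```

Under these assumptions, the four status groups add up to the 23,294 genotyped participants, and 21,111 of them remain in the genetic analysis.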

Genotype Data
Whole-genome genotype data were collected in the WGHS as described (17). Statistical modeling was applied to the 46 candidate SNPs identified in a recent GWAS (7) that mapped to 38 distinct genomic loci (see Supplementary Table 2). Among these SNPs, rs6724624 and rs10166942 (at TRPM8) were in high LD (r² = 1). SNPs at FHL5 were in moderate LD (r² = 0.25 for rs4839827-rs7775721 and r² = 0.54 for rs67338227-rs7775721). SNPs rs12135062, rs10166942, rs11031122, rs11172113, and rs17857135 were genotyped directly. Genotype information for the remaining SNPs, and missing genotypes for the genotyped SNPs, were derived from imputation using MaCH v.1.0.16 to the 1000 Genomes cosmopolitan reference panel (version 1, phase 3, March 2012) (22). All imputed SNP genotypes (as maximum likelihood dose) were of high quality, with minimum imputation R² ≥ 0.87.
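For readers unfamiliar with the LD measure quoted above, r² between two SNPs can be approximated as the squared Pearson correlation of their allele dosages (0 to 2 per person). The sketch below illustrates that genotype-level approximation only; it is not necessarily how the LD values in this study were computed, and the function name `ld_r2` is hypothetical.

```python
def ld_r2(dosage_a, dosage_b):
    """Squared Pearson correlation between two vectors of allele dosages."""
    n = len(dosage_a)
    if n != len(dosage_b) or n < 2:
        raise ValueError("need two dosage vectors of equal length >= 2")
    mean_a = sum(dosage_a) / n
    mean_b = sum(dosage_b) / n
    s_aa = sum((a - mean_a) ** 2 for a in dosage_a)
    s_bb = sum((b - mean_b) ** 2 for b in dosage_b)
    s_ab = sum((a - mean_a) * (b - mean_b) for a, b in zip(dosage_a, dosage_b))
    if s_aa == 0 or s_bb == 0:
        raise ValueError("a monomorphic SNP has undefined r^2")
    return (s_ab * s_ab) / (s_aa * s_bb)
```

Two SNPs whose dosages are perfectly correlated, in either direction, give r² = 1, as reported for the two TRPM8 SNPs.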

Likelihood Framework for Testing Selective Genetic Associations
A likelihood framework was used to evaluate the selectivity of associations between the GWAS SNPs and migraine subclassified according to individual diagnostic symptoms as described previously (5,23). For each symptom, the WGHS sample was classified into three groups: (a) active migraineurs reporting the symptom, (b) active migraineurs not reporting the symptom, and (c) non-migraineurs, i.e., WGHS participants reporting having never experienced migraine. The Bayesian Information Criterion (BIC) was used to discriminate among six models for the association of each SNP with migraine compared with absence of migraine: (1) the "null" model of no association; (2) the "basic" model of association regardless of symptom-based subclass; (3) the "subset" model of association with the migraine subtype defined by the presence, and not the absence, of a particular diagnostic symptom; (4) the "inverse subset" model of association with the migraine subtype defined by the absence, and not the presence, of a particular diagnostic symptom; (5) the "general" model of a different magnitude of association depending on the presence or absence of one of the symptoms; and finally, (6) the "modifier" model assuming association with a given symptom conditional on being a migraineur. The significance of selected models was evaluated with a likelihood ratio test and a permutation method to address multiple testing as well as potential confounding. The permutation method provided corrected p-values for each SNP across all of the symptoms. These corrected p-values were further corrected for multiple testing across the 46 SNPs by the Šidák method. Subsequently, the magnitude of genetic associations with subclasses compared to non-migraineurs was assessed with logistic regression controlling for age and principal components of European ancestry substructure (see Supplementary Methods). All statistical analysis was performed in R (24).
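The model comparison and multiple-testing steps described above can be sketched as follows. This is a minimal sketch, not the authors' R implementation. The log-likelihood values are invented for illustration. The parameter counts cover only the SNP effect parameters, on the assumption that nuisance parameters are shared by all models. The "modifier" model is omitted for brevity.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion; lower values are preferred."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def select_model(fits, n_obs):
    """fits maps a model name to (log-likelihood, number of SNP parameters)."""
    scores = {name: bic(ll, k, n_obs) for name, (ll, k) in fits.items()}
    best = min(scores, key=scores.get)
    return best, scores


def lrt_pvalue_1df(loglik_alt, loglik_null):
    """Likelihood ratio test p-value when the models differ by one parameter."""
    stat = max(2.0 * (loglik_alt - loglik_null), 0.0)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(stat / 2.0))


def sidak(p, n_tests):
    """Sidak-adjusted p-value for n_tests independent tests."""
    return 1.0 - (1.0 - p) ** n_tests


# Hypothetical fits for one SNP and one symptom (illustrative numbers only).
fits = {
    "null": (-9000.0, 0),
    "basic": (-8990.0, 1),
    "subset": (-8985.0, 1),
    "inverse subset": (-8998.0, 1),
    "general": (-8984.5, 2),
}
n_obs = 3003 + 18108  # active migraineurs plus non-migraineurs
best, scores = select_model(fits, n_obs)
p_lrt = lrt_pvalue_1df(fits[best][0], fits["null"][0])
p_adjusted = sidak(0.001, 46)  # a permutation-corrected p of 0.001 across 46 SNPs
```

With these invented numbers, the "general" model has the highest likelihood but the "subset" model has the lowest BIC, because the BIC penalizes the extra parameter by ln(n) per parameter.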

Latent Class Analysis (LCA) of Symptoms From Migraine Diagnostic Criteria
Standard LCA was performed among active migraineurs (N = 69,861) and subsamples thereof derived from the WHS recruitment sample (N = 453,787) using the poLCA (25) package in R (24). Symptoms of the diagnostic criteria were encoded as binary variables (yes/no). poLCA was iterated 50 times for each number of classes K ranging from 2 to 15, using the BIC to determine an optimal model for each value of K (see Supplementary Methods). Genetic analysis was applied to the LCA results restricted to the subset of migraineurs and non-migraineurs in the WGHS subset. Selective SNP associations for a particular latent class (analogous to the "subset" model above) or for a particular LCA solution (analogous to the "general" model above) were evaluated by an extension of the likelihood framework for the diagnostic symptoms (see Supplementary Methods).
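To make the procedure concrete, the sketch below fits a latent class model to binary symptom indicators with a plain expectation-maximization loop, repeats the fit from several random starts, and scores each solution with the BIC. It is a simplified Python stand-in for illustration, not the poLCA implementation used in the study. All names are hypothetical, and the data are synthetic (two well-separated classes, four items).

```python
import math
import random


def e_step(data, pi, theta):
    """Return the log-likelihood and per-respondent class responsibilities."""
    loglik, resp = 0.0, []
    for row in data:
        joint = []
        for c, weight in enumerate(pi):
            p = weight
            for j, x in enumerate(row):
                p *= theta[c][j] if x else 1.0 - theta[c][j]
            joint.append(p)
        total = sum(joint)
        loglik += math.log(total)
        resp.append([p / total for p in joint])
    return loglik, resp


def fit_lca(data, k, seed, max_iter=500, tol=1e-8):
    """EM fit of a k-class model from one random start."""
    rng = random.Random(seed)
    n, m = len(data), len(data[0])
    pi = [1.0 / k] * k
    theta = [[rng.uniform(0.2, 0.8) for _ in range(m)] for _ in range(k)]
    prev = -math.inf
    for _ in range(max_iter):
        loglik, resp = e_step(data, pi, theta)
        if loglik - prev < tol:
            break
        prev = loglik
        for c in range(k):
            weight = sum(r[c] for r in resp)
            if weight <= 0.0:
                continue  # leave an empty class unchanged
            pi[c] = weight / n
            for j in range(m):
                hits = sum(r[c] for r, row in zip(resp, data) if row[j])
                # Clip to keep probabilities strictly inside (0, 1).
                theta[c][j] = min(max(hits / weight, 1e-6), 1.0 - 1e-6)
    loglik, _ = e_step(data, pi, theta)
    return loglik, pi, theta


def lca_bic(loglik, k, n_items, n_obs):
    """BIC with (k - 1) class proportions and k * n_items item probabilities."""
    n_params = (k - 1) + k * n_items
    return n_params * math.log(n_obs) - 2.0 * loglik


def best_of_restarts(data, k, n_starts=50):
    """Keep the highest-likelihood solution over several random starts."""
    fits = (fit_lca(data, k, seed) for seed in range(n_starts))
    return max(fits, key=lambda fit: fit[0])


# Synthetic demonstration data: 200 respondents per class, 4 binary items.
_rng = random.Random(1)
demo = [tuple(int(_rng.random() < 0.9) for _ in range(4)) for _ in range(200)]
demo += [tuple(int(_rng.random() < 0.1) for _ in range(4)) for _ in range(200)]
ll1, pi1, _ = best_of_restarts(demo, 1, n_starts=1)
ll2, pi2, _ = best_of_restarts(demo, 2, n_starts=3)
```

For a fixed K all solutions have the same number of parameters, so ranking restarts by log-likelihood is equivalent to ranking them by BIC. The BIC matters when comparing solutions across different values of K.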

Other Statistical Procedures
Differences in demographics and health characteristics between migraineurs and controls were compared using ANOVA or chi-square tests as appropriate.
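As an illustration of the chi-square comparison for a categorical characteristic, the sketch below computes the Pearson statistic for a contingency table and a p-value for the 2 x 2 case. The counts are invented, no continuity correction is applied, and all margins are assumed to be positive. The study itself used R, so this is only an illustrative equivalent.

```python
import math


def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def p_value_1df(stat):
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


# Hypothetical 2 x 2 table: rows are migraineurs/controls, columns are yes/no.
stat, df = chi_square([[10, 20], [20, 10]])
p = p_value_1df(stat)
```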

Data Use and Availability
All data collection and analysis were consistent with written informed consent in the WHS and approved by the Institutional Review Board (IRB) of Brigham and Women's Hospital. Public release of WGHS data is restricted by the IRB. However, access to data described in this work will be made available on a collaborative basis upon request.

RESULTS
Figure 1 shows the overall study design. We examined whether 46 genome-wide significant SNPs at 38 loci (Supplementary Table 2) identified in recent GWAS of migraine were selectively associated with migraine according to aura status and the individual diagnostic symptoms in three nested subsets of the Women's Genome Health Study (WGHS): fully qualifying migraineurs (N = 1,422), fully qualifying and probable migraineurs (N = 2,258), and all migraineurs (ICHD fully qualifying, ICHD probable migraineurs, and individuals reporting migraine but not meeting ICHD criteria) (N = 3,003) compared with 18,108 non-migraineurs. Demographic characteristics of the WGHS are shown in Table 1. The prevalence of aura and the diagnostic symptoms associated with migraine for the three nested samples of WGHS migraineurs is shown in Table 2. Aura prevalence was roughly equivalent in all three subsets, whereas other symptoms, e.g., nausea/vomiting, were less prevalent at least partly owing to the lack of fulfillment of the diagnostic criteria.

Selectivity of SNP Associations for Aura and Other Migraine Characteristics
Selective association with migraine subclassified by aura and the diagnostic symptoms was assessed at the 46 SNPs using the Bayesian Information Criterion (BIC). In the WGHS, after correcting for multiple testing, fifteen SNPs (including the two TRPM8 SNPs in high LD) were either significant for migraine overall regardless of symptoms or significantly selective for at least aura or one of the diagnostic symptoms for migraine. Overall, six SNPs that were not available in the previous selectivity analysis (5) were selective here: rs12260159 (HPSE2), rs4910165 (MRVI1), rs561561 (IGSF9B), rs11031122 (MPPED2), rs1024905 (near FGF6), and rs17857135 (RNF213). In particular, SNPs rs12260159 and rs1024905 were selective for migraine without aura, while the SNP rs11031122 was selective for migraine accompanied by aura in at least one of the nested sets of migraine defined by diagnostic stringency. Seven of the 15 significant BIC models would not have been evident among fully qualifying migraineurs only, likely due to power. For example, the preferential association with nausea/vomiting at rs6724624 and rs10166942 (both TRPM8) was absent when limited to full migraineurs but significant in the combined full and probable migraineurs (p cor < 0.001) and all migraineurs (p cor < 0.001, Supplementary Table 3).
Further, rs11031122 (MPPED2), which was the only SNP to be preferentially associated with migraine with aura ("subset" model), was only significant in the sample of all migraineurs (p cor = 0.015). Two SNPs, rs561561 (IGSF9B) and rs4910165 (MRVI1), were both found to be selective for migraine characterized by nausea/vomiting (p cor = 0.010 and 0.006, respectively) in the combined full and probable migraineurs. Selectivity for migraine without aura was found for rs1024905 (near FGF6) in combined full and probable migraineurs (p cor = 0.014), but the association was consistent with the null among full migraineurs alone.
However, for other loci, significance decreased when augmenting the sample of fully qualifying migraineurs with probable or other migraineurs. For example, rs12260159 (HPSE2) was significantly preferential for migraine without aura ("inverse subset") among the fully qualifying migraineurs (p cor = 0.002) but was not selected for association by the BIC in the larger samples including non-qualifying, self-reported migraineurs. A similar pattern was observed for rs6791480 (near TGFBR2) and migraine characterized by inhibited daily activity due to pain, which was significant among fully qualifying migraineurs (p cor = 0.013) but null in the augmented samples. Meanwhile, for other SNPs such as rs2078371 (near TSPAN2), the model identified with the BIC changed from non-selective (i.e., "basic") in the limited group to being preferentially associated with migraine characterized by sensitivity to sound in the larger samples (both p cor = 0.001). The same pattern was found for rs10218452 (PRDM16). The increased selectivity by augmenting the fully qualifying migraineur sample with probable migraineurs is counter to the inference that the selective associations are restricted to severe migraineurs.
Using logistic regression, we assessed the magnitudes of the significant associations at the 15 SNPs, especially those that were selective (Supplementary Table 4). The estimated effects (i.e., logistic regression beta-coefficients) of some SNPs were greater in the absence compared with presence of aura or a particular diagnostic symptom, again suggesting that selective effects do not necessarily reflect association with more severe migraine. For example, SNPs rs7775721 (FHL5-UFL1) among all migraineurs, rs12260159 (HPSE2) among fully qualifying migraineurs, rs1024905 (near FGF6) among full and probable migraineurs, and rs11172113 (LRP1) among all migraineurs are more strongly associated with absence of aura compared to its presence. Additional stronger associations are observed for absence compared with presence for phonophobia (rs561561 and rs11172113) and migraine frequency, i.e., ≥6 attacks/year (rs10218452 and rs12135062). By contrast, stronger associations in the presence compared with absence of aura or the diagnostic symptoms were also observed, and included the aura-specific association at rs11031122 (MPPED2) that was also noted previously in the discovery GWAS (7). Table 5). Note that the diagnostic criteria will induce correlations in groups with full migraineurs and combined full and probable migraineurs, e.g., the strong correlations between photophobia and phonophobia.

Selectivity of SNP Associations With Latent Classes
Within these nested samples, we performed LCA using aura status and the diagnostic symptoms as binary manifest variables over total number of subclasses, K, ranging from 2 to 15. We also explored latent models in these samples further stratified by aura status. In each of these groups and for each value of K, LCA was performed with 50 random initializing frequencies. Yet none of the LCA models recurred for any K, indicating lack of truly stable solutions (26). Moreover, no clear support was found for an optimal value of K, as indicated by an absence of a clear optimum BIC value, except possibly among all WHS migraineurs reporting aura, where a broad minimum was centered around K = 8 (Figure 3). Including symptoms that are obligate for full ICHD migraine when applying LCA to the sample that included full, probable, and other active migraineurs may have precluded a robust solution. However, excluding these symptoms and repeating the entire LCA procedure with only aura status and the diagnostic symptoms for pain, nausea/vomiting, photophobia, and phonophobia did not reveal a clearly optimal K or a solution that recurred over the 50 iterations (not shown). Similarly, stratifying the LCA by age (≤45 vs. >45), the former stratum being potentially relevant to the greater prevalence of migraine among younger women, did not reveal a reliable solution in the LCA (not shown).
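One simple way to check whether a solution recurs across random starts is to count how many runs reach the best log-likelihood within a small tolerance. The sketch below assumes that criterion, which may differ from the one actually applied in this analysis. The function name and tolerance are hypothetical.

```python
def count_recurrences(logliks, tol=1e-6):
    """Number of runs whose log-likelihood matches the best run within tol."""
    best = max(logliks)
    return sum(1 for ll in logliks if best - ll <= tol)
```

A count of 1 means the best solution was reached only once, so there is no evidence that the optimizer has found a stable maximum.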
In spite of the lack of a robust LCA solution, it remained possible that particular latent classes recur and moreover that such potential latent classes have pronounced, selective association with one or more of the candidate SNPs. However, having adapted the selective association hypothesis testing framework for latent classes (Materials and Methods), we found no selective associations meeting significance thresholds consistent with the large burden of multiple testing.

DISCUSSION
The current study focused on 46 GWAS migraine susceptibility SNPs (7), evaluating selective associations with migraine subclasses among adult women defined on the basis of individual symptoms of the diagnostic criteria (5,10,13,14,27). Six new selective associations were observed, raising the total for this BIC-based model selection approach to 15 (5). The analysis further examined the influence of diagnostic stringency on selective associations. Meanwhile, in spite of exceptional power, LCA did not reveal robust overall substructure among diagnostic criteria of migraine, nor was there evidence of selective SNP associations with any individual latent classes that may have existed even in the absence of overall latent substructure.
Among six SNPs identified in the recent GWAS and chosen by the BIC model selection, three were selective for aura status (7). One of these SNPs, rs11031122, is the first reported specifically for MA and was noted for heterogeneity in comparing its association between MA and MO in meta-analyses that included data from the WGHS (7). Rs11031122 maps to the intron of the highly conserved gene for metallophosphoesterase-domain containing 2 (MPPED2) that is expressed in fetal brain (28), consistent with a role in neuronal development (29), although it is expressed also in adult non-brain tissues including aorta (30). The second SNP, rs1024905, was selective for migraine without aura in the combined full and probable migraineurs group. It maps near the FGF6 gene that has been shown to play an important role in the regulation of cell proliferation, cell differentiation, angiogenesis, and myogenesis and is required for normal muscle regeneration (7,31). The final new selective SNP pertaining to aura, rs12260159, which is in the intron of the heparanase 2 gene (HPSE2), was also selective for lack of aura but only in the stringently qualifying migraineur group. Heparanases are involved in tissue remodeling, and HPSE2, in particular, is expressed widely in fetal and adult central nervous system of mouse (32). Mutations in HPSE2 have been associated with urofacial syndrome in humans, suggesting a potential developmental role (33).
Table footnote: Unilateral pain, pulsating pain, pain aggravated by physical activity, and pain inhibiting daily activities relate to criterion C of the ICHD diagnostic criteria. Nausea/vomiting, photophobia, and phonophobia relate to criterion D of the ICHD diagnostic criteria.

The comparison of models selected across nested subsets of migraineurs with decreasing diagnostic stringency begins to address relationships among selectivity, genetic heterogeneity, and power. Among the most strongly associated SNPs, augmenting the sample of full migraineurs with probable migraineurs almost always resulted in greater selectivity and significance, most simply explained by a greater increase in power and minimal deterioration by potential heterogeneity.
This interpretation is consistent with previous conclusions that probable migraineurs are genetically similar to full migraineurs (13). However, adding the "other" self-reported migraineurs who did not meet criteria for either full or probable migraine did not uniformly improve significance. These individuals may have aged out of some symptoms leading to misclassification of their status, they may have genetic background that is less susceptible to migraine at the candidate SNPs, or their self-reported migraine condition does not arise from genetic influences shared with migraine meeting ICHD criteria.
We have previously argued that selective genetic associations are not simply explained by associations with more severe migraine (5). If they were, and were severity assessed by the presence of one or more specific symptoms, then the selective associations would have highlighted the same set of symptoms for each selective SNP. However, this was not the case. Nor were two measures of severity, the specific symptoms of high frequency migraine and/or aura, uniformly highlighted. At the same time, selective associations denoted by "inverse-subset" imply a stronger association in the absence rather than the presence of particular diagnostic symptoms and are possibly consistent with less severe migraine. Finally, in the current analysis, selectivity typically improved by augmenting the fully qualifying migraineurs with probable migraineurs, who may be viewed as experiencing a less severe form of migraine. Notwithstanding the preceding argument, it is important to note that approximately half, i.e., only part, of the variation in the concordance rate for common migraine is attributable to genetic factors, while the remainder depends on environmental factors (1)(2)(3).
With up to 69,861 female migraineurs, our sample for LCA was over ten times larger than the sample in the previous analysis (13) and therefore better powered to detect much more subtle clustering. That previous study did not identify discrete latent structures but found evidence for a continuum of increasing numbers of symptoms, greater prevalence of aura, and male sex that was correlated with overall genetic liability. Although migraine in our study was coded slightly differently such that we could not evaluate symptom prevalence in the absence of self-reported migraine, our analysis also largely failed to reveal robust substructure, except possibly among migraineurs with aura who met stringent diagnostic criteria. Even in this group, there was no single model that recurred in the 50 randomly initialized iterations of the modeling procedure, suggesting that latent substructure among migraineurs remains elusive, if it exists at all.
While the main strengths of our study are the unprecedented sample size for the LCA and the large sample size for the genetic analysis for the diagnostic symptoms, the main limitation of our study is the self-reported ascertainment of migraine, which may result in misclassification. However, the demographic of our study population, female healthcare professionals, is known to provide accurate clinical information by self-reported questionnaire (34) and a previous study in the WHS showed good agreement between self-reported MO and classification of MO based on ICHD-2 criteria (21). Moreover, migraine status in the WGHS was used to identify the first consistent migraine susceptibility loci by GWAS (16) and to discover robust liability to stroke associated with MA (35). Further, our model selection was based on the BIC, which enforces a high stringency, and we used a permutation procedure to establish significance thresholds consistent with multiple hypothesis testing. Additional limitations pertain to the migraine ascertainment. Aura in the questionnaire was not distinguished from other prodromal phenomena, possibly leading to misclassification, and is at the higher limit of prevalence compared to other population-based surveys (36). Nor did the ascertainment provide longitudinal information about the symptoms. Again, the consequence is potential misclassification since it is impossible to distinguish whether symptoms may never have existed compared to whether they disappeared or possibly altered with age in our participants who were at least 45 years old (37)(38)(39). However, all of these possibilities would likely attenuate the selective associations we observe, and they may underlie the loss of some selective associations, e.g., for rs17857135 at gene RNF213, when augmenting the sample of full and probable migraineurs with other self-reported migraineurs in the selectivity analysis.
The diversity of symptoms qualifying for diagnosis of common migraine raises the possibility that the phenotypic heterogeneity is accompanied by underlying genetic heterogeneity (40). The findings here together with previous findings from ourselves and others support the notion that at least some of the heterogeneity in common migraine is influenced by genetics (5,13). As more loci are discovered in future GWAS, we anticipate improving the understanding of both migraine pathophysiology and diagnosis through additional study of selective genetic associations.

DATA AVAILABILITY STATEMENT
The data analyzed in this study is subject to the following licenses/restrictions: The datasets used in this study, namely The Women's Genome Health Study and its parent cohort The Women's Health Study, are restricted from public access by the local IRB. However, the datasets are available through collaboration and no reasonable collaborative requests have been refused. Requests to access these datasets should be directed to dchasman@research.bwh.harvard.edu.