Bayesian nonparametric method for genetic dissection of brain activation region

Biological evidence indicewates that the brain atrophy can be involved at the onset of neuropathological pathways of Alzheimer's disease. However, there is lack of formal statistical methods to perform genetic dissection of brain activation phenotypes such as shape and intensity. To this end, we propose a Bayesian hierarchical model which consists of two levels of hierarchy. At level 1, we develop a Bayesian nonparametric level set (BNLS) model for studying the brain activation region shape. At level 2, we construct a regression model to select genetic variants that are strongly associated with the brain activation intensity, where a spike-and-slab prior and a Gaussian prior are chosen for feature selection. We develop efficient posterior computation algorithms based on the Markov chain Monte Carlo (MCMC) method. We demonstrate the advantages of the proposed method via extensive simulation studies and analyses of imaging genetics data in the Alzheimer's disease neuroimaging initiative (ADNI) study.


. Introduction
Imaging genetics is an emerging interdisciplinary field with a focus on assessing the impact of genetic variation on brain function and structure.It is a useful tool to uncover the etiologies of complex neuropsychiatric diseases, such as Autism (Ameis and Szatmari, 2012) schizophrenia (Meyer-Lindenberg, 2010) and Alzheimer's disease (Weiner et al., 2013).Traditional genetics studies have attempted to search genetic variants that are strongly associated with a behavior or related phenotypes; however, some findings were weak and inconsistent.There are considerable inter-subject differences in the behavioral measures, usually requiring large sample sizes to detect a signal.For neuropsychiatric disease, many genetic variants may not be directly associated with a clinical outcome or a behavior response but have a strong indirect effect which is mediated through molecular and cellular level information processing by neurons in the brain.We refer to this information processing procedure as brain activity.Functional neuroimaging, including functional magnetic resonance imaging (fMRI) and positron emission tomography (PET), is a set of powerful techniques to indirectly measure the brain activity at each location in the brain.Many current functional neuroimaging studies have focused on detecting the brain activation regions in association with particular cognitive and emotional tasks or at resting state.
Therefore, in imaging genetics studies, it is of great interest is to simultaneously select important genetic variants and detect brain activation regions where the genetic effects are strongly associated with brain activity.We refer to this procedure as genetic dissection of brain activation regions.Our motivating example is joint analysis of the fluorodeoxyglucose positron emission tomography (FDG-PET) data, single nucleotide polymorphisms (SNP) data and clinical data in the Alzheimer's Disease Neuroimaging Initiative (ADNI) study.Alzheimer's disease (AD) is one of the most common neurodegenerative disorders that impair mental functioning.It affects approximately eight percent of people who are 65 years of age or older.It has been shown that AD leads to nerve cell death and tissue loss in the brain (Bookheimer et al., 2000).As AD progresses, the brain shrinks dramatically; and abnormal changes in the brain worsen over time, eventually interfering with many aspects of brain function, such as memory loss, resulting in a decline in some intellectual abilities and changes in personality and behavior.New and potential treatments for AD focus on slowing the progression of the disease, making it important to identify at an early stage markers of future cognitive decline.Genetics studies showed that the presence of some of genes such as APOE and NEDD9 may be associated with cognitive decline in older persons (Wang et al., 2011).Structural magnetic resonance imaging (sMRI) studies (Bookheimer et al., 2000) identified that older persons with normal cognition may show medial temporal atrophy and thus indicate the possibility of future cognitive impairment.Many ADNI studies have focused on the joint analysis of sMRI and SNPs to discover the genetic effects on brain structure (Stein et al., 2010;Zhu et al., 2014;Huang et al., 2015).Functional neuroimaging techniques can facilitate to discover more subtle alternations in brain function as AD progresses, thus analyses of PET or fMRI data in the ADNI studies have drawn much attention recently as well.For example, Huang et al. (2010) and Kundu and Kang (2016) developed statistical methods for leaning the genetic effects on the functional connectivity of AD.In this work, our goal is to study the genetic effects on functional brain activity for people at risk of AD, based on which we can identify the consistent brain activation regions across multiple subjects and quantify the changes of their shapes over times.
Many of them have been adopted to detect the association between imaging biomarkers and genetic variants.The pioneer work includes voxelwise genome-wide association (vGWAS) study (Stein et al., 2010) where each voxel is considered as a phenotype and univariate regression models were fitted for all the combinations of voxels and genetic variants.This approach enjoys the simplicity and fast computations but suffers from the difficulty of the multiple testing problem since the number of voxels often can be up to more than 10,000.To address those limitations, Huang et al. (2015) proposed a joint modeling approach, termed as, Fast voxelwise genome wide association analysis (FVGWAS) with a well family-wise error control procedure and developed efficient computing tools for large-scale imaging genetics studies.Huang et al. (2017) introduced a new framework called Functional Genome-Wide Association Analysis (FGWAS) designed to analyze functional phenotypes like those found in neuroimaging studies.FGWAS improves upon FVGWAS methods by incorporating the unique features of functional phenotypes-like smoothness and correlation-into the statistical model, resulting in more powerful detection of genetic variants affecting brain structure and function.Alternatively, Vounou et al. (2010) and Zhu et al. (2014) proposed to use low rank regression to handle the high-dimensional neuroimaging phenotype, where a latent structure are imposed in the regression coefficients.Besides reduced rank approximation approaches, independent component analysis (ICA) (Liu et al., 2009) and canonical correlation analysis (CCA) (Chi et al., 2013) have been applied to discover the association between the imaging biomarkers and genetic variants with different latent structure assumptions.
Different from all the existing methods, in this work, we propose a Bayesian hierarchical model for genetic dissection of brain activation regions using the level set function with the Gaussian process (GP) prior.We term our method as Bayesian nonparametric level set (BNLS) method.BNLS consists of two levels of hierarchy.
At level 1, a Bayesian nonparametric level set model is developed for characterizing the shape of consistent brain activation regions across multiple subjects.The level set method has been widely used in image segmentation problems (e.g., Balafar et al., 2010;Li et al., 2011;Bergeest and Rohr, 2012), where contours (2D) or surfaces (3D) are represented as the zero-level set of a higher dimensional function, thus spatial voxels can be classified based on the function values: positive (inside the region) or negative (outside the region).We refer to this function as the level set function.The corresponding shape representation can characterize complex topological variations: the appearance of holes or tails, shapes that break down into smaller pieces, etc.The traditional level set based shape estimation problem can be solved by the numerical methods for partial differential equations.In our model, we propose to assign a GP prior to the level set function and make fully posterior inference on the level set function as well as the shape of the activation regions, taking advantages of the good statistical properties of GP.
At level 2, a regression model is adopted to select genetic variants that are strongly associated with the average brain activity within the region over multiple subjects, where a spike-and-slab prior and a Gaussian prior are chosen for feature selection.In particular, we model the average brain activation intensity within the region for each subject as the response variable; and we consider all the genetic variants as well as some clinical factors as predictors.We assign the Bayesian spike and slab prior on the regression coefficients for variable selection and thus to detect the important genetic variants of interest.The spike and slab prior was initially proposed by Mitchell and Beauchamp (1988) and George and McCulloch (1993) and has been broadly adopted for various applications (Chipman et al., 2001;Ishwaran and Rao, 2005a,b).In the spike and slab prior specifications, the coefficients are mutually independent with a two-point mixture distribution made up of a "uniform-like" flat distribution (called "slab") and a "degeneratedpoint-mass-at-zero-like" distribution (called "spike"), leading to sparsity in the posterior inference.
Our proposed Bayesian model offers several unique features compared to existing methods.First, the foundational assumptions of our model diverge substantially from those of FVGWAS and FGWAS.Specifically, we focus on probabilistic modeling of brain activation regions and the selection of key genetic variants with significant associations to these regions.Second, we aim for fully Bayesian inferences that account for all sources of variation in the model parameters, which in turn characterize both imaging measurements and genetic variants.Third, our method is designed to efficiently analyze high-resolution images and a moderately large set of genetic variants.It is important to note that we do not expect our Bayesian approach to scale in the same way as variable screening-based methods like FVGWAS and FGWAS.Our model also has the potential to be integrated with prior knowledgeguided Bayesian variable screening methods (He and Kang, 2022).The efficacy of this combined approach certainly warrants future investigation.
The remainder of the manuscript is organized as follows.In Section 2, we present the proposed model with prior specifications, and develop the posterior computation algorithms for fully Bayesian model.In Section 3, we evaluate the performance of the proposed method via extensive simulation studies.In Section 4, we illustrate the proposed method on analysis of the PET and SNP data from the ADNI study to detect influential SNPs and consistent activation regions across subjects.Finally, we conclude our paper by discussion in Section 5.

. Materials and methods
In this section, we develop BNLS: a two-level Bayesian hierarchical model for fitting the brain activation regions that can simultaneously select important genetic variants.At Level 1, we focus on identifying the consistent activation regions across subjects, where the brain activation intensity may be different for different subjects.At Level 2, we are interested in identifying the important genetic variants (such as SNPs) that are strongly associated with brain activation intensities.

. . Two-level model
Suppose we collect brain images consisting of p voxels in a brain region B ⊂ R 3 and genetic variants of m SNPs from n subjects.Let i(i = 1, . . ., n) index the subject, j(j = 1, . . ., p) index the voxels and k(k = 1, . . ., m) index the SNPs.Denote by y ij the observed imaging signal at voxel v j ∈ B. Let S ik be the genetic variant for SNP k.
At Level 1, we model the brain signal intensity within brain activation regions by assuming y ij follow a normal mixture model: where then it is located in a activation region and the brain signal y ij has an average activation intensity µ i .Otherwise, the voxel is located outside the brain activation regions with a mean intensity zero.The parameter σ 2 i is the variance of the signal y ij across all voxels j for subject i.
At Level 2, we link the activation intensity to the genetic variant by using a regression model where η k is the genetic effects of SNP k on the brain activation intensity.The variance parameter τ 2 µ characterizes the variability of the average activation intensity that are not from the genetic variants.

. . Prior specifications
In this section, we discuss the prior specifications for models (1) and (2).
At Level 1, to guarantee the robustness and flexibility of modeling the activation regions shape, we assign a Gaussian process prior to the level set function φ(v) with mean zero and covariance kernel function, denoted as At Level 2, we impose sparsity on η k to identify the important SNP sets that are strongly associated with the brain activation intensity.To achieve this goal, we adopt the spike-and-slab prior proposed by Ishwaran and Rao (2005b): where δ v (•) refers to a point mass measure at a real value v.The parameter w is the prior inclusion probability indicating how likely each feature is to be selected.Pre-defined value ν 0 usually select very small so that the "spike" (δ ν 0 , i.e., N[0, ν 0 τ 2 k ]) part and "slab" (δ 1 , i.e., N[0, τ 2 k ]) part can be mostly differentiated from each other.For all the variance parameters σ 2 i , τ 2 µ , and τ 2 k , we assume they are mutually indenpendent and follow conjugate priors: where IG(w 1 , w 2 ) represents an inverse gamma prior with shape w 1 and rate w 2 .

. . Model representation
To implement posterior computation algorithm, we need to consider model approximations.First, we consider the basis , where {ψ l (•)} and {λ l } are respectively eigen functions and eigenvalues for the kernel function κ(•, •) that are shared cross all patient samples.Here, L is the number of basis functions and it can be determined according to the proportion of the variation of the GP, denoted as α ∈ {0, 1}, that can be explained by the basis expansions, i.e., min{L A common choice of α is around 0.7 to 0.8.Second, we introduce the function and ψ l,j = ψ l (v j ).Then our Bayesian hierarchical model with prior specifications can be represented as where Ŵ is a diagonal matrix with (k, k) element being γ k τ 2 k . .

. The model with non-sparse prior
We also consider a conjugate normal prior on η without imposing sparsity which leads to more efficient posterior computation.The model is represented as .

. Posterior computation and variable selection
For posterior computation, we adopt the Riemann Manifold Metropolis adjusted Langevin algorithm (MMALA) (Girolami and Calderhead, 2011) and Stochastic Search Variable Selection (SSVS) George and McCulloch (1997) within Gibbs sampling.
For the variable selection, we apply an ad-hoc method based on posterior credible intervals.For correlating the variables (clinical or SNPs) with brain image intensity levels, we use the null hypothesis that SNP k is uncorrelated with the intensity level inside the activation region (H 0 : η k = 0) and the alternative hypothesis that SNP k is not uncorrelated with the intensity level inside the activation region (H a : η k = 0).Based on the marginal posterior distribution for η k , if 0 is included in the posterior 99% credible interval, we assign γ k = 1, otherwise γ k = 0 where γ k is the same indicator variable introduced in SSVS.We approximate the posterior inclusion probability of SNP k: Eγ k using the averaged values after burn-ins γ k .Then the SNPs with posterior inclusion probability larger than 0.01 are selected as important.
The details of derivations and posterior computation algorithms the Bayesian level set method with spike-and-slab prior and normal prior are provided in the Supplementary material.

. Simulations
We tested the performance for learning activation region shapes and selection influential variables using proposed method starting from the simplest scenario and then gradually extended to the most complicated scenario.For the simplest simulation setting, we simulated a single subject, 2D imaging data and zero predictor matrix, i.e., set n = 1, d = 2, S = 0 thus no variable selection was involved.For the most complicated simulation setting, we simulated multiple subjects, 3D imaging data and utilized the predictors in real data analysis for selection.

. . Single subject with D image and no variable selection
In this simulation study, the objective is to test the Bayesian nonparametric level set method for random shape fitting.We simulated 2D images of size 150 × 150 on a square region [−1, 1] 2 (d = 2).We considered three activation region shapes: circles, squares and random shapes.We simulated data by setting σ 2 = 1, and the signal intensities µ and the level set function were set as follows: For the posterior computation, we set ǫ = 1 × 10 −3 and run 50,000 iterations with 20,000 burn-ins.The shape estimation results were presented in Figure 1.

. . Multi-subjects with D image and no variable selection
We evaluated the proposed method on a total of m = 50 subjects with 3D images simulated for each of them.The 3D image grid was of 20 × 20 × 20 (p = 8, 000) on a square region [−1, 1] 2 (d = 3).Like simulation 3.1, we set S = 0 so that there is no variable selection involved.We considered three different shapes of activation region: spheres, diamonds, and random shapes.We set The signal intensities µ i (i = 1, . . ., n) and the level set function φ(v) were set as follows: • Sphere shapes: set the signal intensity µ i ∼ N(1, 1) and the true level set function φ and draw the true level set from a Gaussian process with mean zero and covariance kernel κ For the posterior computation, we set ǫ = 1 × 10 −3 , α = 0.8 as PCA percent and run 5,000 iteration with 2,000 burn-in.The shape segmentation results were respectively summarized in Figure 2.   The results include parameter estimation mean squared error (MSE) and selection accuracy (ACC), true positive rate (TPR) and true negative rate (TNR) for brain activation region (AR) and genetic variants (GV) for three shapes with combinations of the signal-to-noise ratios. .

. Multi-subjects with D image and variable selection
In the simulation study, we evaluated the proposed method on the most complicated scenario where there is a total of n = 235 subjects with 3D images simulated for each of them.We only took the first 200 columns (m = 200) from the SNP matrix in real data analysis to form S in the simulations.We randomly selected 5 of them as signal (without loss of generosity, set η k = 1, k = 1, . . ., 5) and the remaining 195 (η k = 0, k = 6, . . ., 200) as noise.Like previous simulation studies, we considered three different activation region shapes with different combination of abilities for shape estimation and variable selection quantified by signal-to-noise-ratio SNR(•).
For the posterior computation, we set ǫ = 1 × 10 −4 and α = 0.75 as the proportion of variation explained in the GP approximation, which leads to 120 basis functions.We ran 6,000 iterations with 4,000 burn-ins and saved the MCMC sample for every two iterations.For each of the simulation settings, we simulated 50 datasets in total and evaluated the algorithm performance based on some proposed metrics averaged across different datasets.The voxels inside activation regions were selected if their posterior inclusion probability is larger than 0.5.The variable are selected if their posterior inclusion probability is larger than 0.02 for SSVS and 0.01 when used non-sparse prior.
For activation shape estimation and variable selection, as there are only two possible values that voxels can take: "inside the region" or "outside the region", also two possible values that variables can take: "selected" or "not-selected", we can summarize spatial voxels and variable selection results by their averaged accuracy, sensitivity, and specificity respectively.We also provided the averaged meansquared-errors (MSE) for η and µ.The simulation results using SSVS are presented in Table 1 and results using non-sparse prior are presented in Table 1.
The simulation studies indicate our proposed method is accurate for voxels classification and variable selection.For simulations using SSVS, even with the worse scenario when SNR(β) = 2 and SNR(η) = 2, the averaged accuracy, sensitivity and specificity for voxels classification are all above 0.94 and for The results include parameter estimation mean squared error (MSE) and selection accuracy (ACC), true positive rate (TPR) and true negative rate (TNR) for brain activation region (AR) and genetic variants (GV) for two levels the signal-to-noise ratios genetic variants.
variable selection are all 100%.As SNR(β) increased to 5 and 8, classification performance improves as expected while SNR(η) increased to 5 and 8, variable selection are all 100% accurate.For MSE of η and µ, it decreases in the general trend when SNR(β) increases.
Compared to Bayesian level set method with non-sparse prior for variable selection, the voxels classification is robust but the variable selection generates worse performance.If we compare the scenario when SNR(β) = SNR(η) = 2, the accuracy, sensitivity (true positive rate) and specificity (true negative rate) decrease to 0.893, 0.840, 0.894 and MSE for η and µ increases to 0.081 and 0.141.The proposed method does suffer a decrease performance as expected, but in general, the results are acceptable.We recommend applying the fast algorithm when there is exceedingly large number of candidate SNPs in the study for fast computation purpose.
To assess our model's performance relative to existing methods like FVGWAS and FGWAS, we specifically examine signal-to-noise ratios (SNR) for both brain activation region selection and genetic effects, using estimations from the analysis of ADNI data.The estimated SNR(β) ≈ 0.5, while for η, the SNR varies between 0.5 and 1.0; hence, we consider two scenarios: SNR(η) = 0.5, 1.0.We simulate the activation region through Gaussian Processes (GP), where the spatial correlation is approximated to be 0.9 for neighboring voxels, based on PET image analysis.Given that the underlying assumptions of our Bayesian model differ from those of FVGWAS and FGWAS, the parameter estimations are not directly comparable.Therefore, we focus our comparison on activation region and SNP selection results, omitting the mean squared error (MSE) metrics.The findings are summarized in Table 2.
The comparative analysis indicates that FVGWAS exhibits low power in detecting activation regions but performs exceptionally well in identifying genetic variants, albeit with a slightly inflated false-positive rate.On the other hand, FGWAS, which incorporates spatial smoothness into its estimations, demonstrates high power in detecting activation regions but struggles to control the falsepositive rate effectively, resulting in low specificity.In contrast, our proposed Bayesian level set method, augmented with a spikeand-slab prior, outperforms both methods in terms of activation region and genetic variant selection.Importantly, it also effectively controls the false-positive rate.the activation regions if their posterior inclusion probabilities after burn-ins are larger (i.e., voxel v j are selected if z j ≥ 0.5).Compared to this hard-thresholding likewise method, in real data analysis, we combined all voxels selected across all time points together and defined as our "ROI" and then presented results in a probability map where we presented each voxel's marginal posterior inclusion probability.For instance, the progression in hippocampus region from the right hemisphere, in the middle temporal gyrus from the right hemisphere at different time points are presented in Figure 3.We observed that in both regions activations are decreased over the time along with disease progression.Moreover, all activation regions at a brain-wide level are presented at the axial, sagittal and coronal panel in Figure 4. We observed that the activations follow the human brain structure symmetry.We also presented anatomically regional activation selection in the Table 3.
For variable selection, different from simulations studies where we selected variables based on their posterior inclusion probabilities (≥ 0.02), this time, we applied a more stringent rule where only SNPs with posterior inclusion probability larger than 0.5 are selected as highly-influential SNP of interest.We the SNPs selected from at least one anatomical region together.As shown in the Venn diagram (Figure 5), in total there were 5 SNPs selected: SNP rs677066 (gene CR1) and SNP rs16940638 (gene ADAM10) are selected at all time points which indicated their consistent impact on the activation intensities; SNP rs2274976 (gene MTHFR) and SNP rs3734404 (gene NEDD9) were only selected at baseline and month 6 respectively; SNP rs10512186 (gene DAPK1) was selected at both month 6 and month 12.For each of the selected SNP, the number of regions related at different time points, region names, and region-related lobe names can be found in Table 4.
Based on SNP-activation relation, we observe that ADAM10 (SNP rs16940638) is universally associated with the majority of the anatomical brain regions across time: 38 out of 42 at baseline, 33 out of 42 at month 6, and 33 out of 42 at month 12.The SNP has been identified as one of the significant genetic variants using genomewide association studies (GWAS) and mediation analyses where the objective was to detect SNPs that influence psychiatric and cognitive traits through intermediaries, and would not be detected otherwise (Bi et al., 2017).The biological function of the gene, ADAM10, is to proteolytic release cell-surface proteins, including TNF-alpha, heparin-binding epidermal growth-like factor, Notch receptors, and amyloid precursor protein (APP) in the nonamyloidogenic manner (Rabquer et al., 2010;Jouannet et al., 2016;Seegar et al., 2017).The regulatory role of ADAM10 in the brain has been well-documented (Saftig and Lichtenthaler, 2015).
In our results, rs677066 is associated with inferior parietal left in all three time points, while it is associated with middle frontal gyrus left at both 6 months and 12 months, but not at baseline., and it is associated with superior temporal gyrus left only at month 12.In existing literature, rs677066 (gene: CR1) is among the top SNPs to be related with AD gene pathway implicated in Alzheimer's disease (Silver et al., 2012).It was also found to play a role in the spontaneous idiopathic preterm birth (McElroy, 2013).The CR1 gene encodes a type I membrane glycoprotein that typically mediates cellular binding with immune complexes (Schifferli et al., 1988).The exact molecular mechanism of CR1's involvement with Alzheimer's is yet to be elucidated.
The role of DAPK1 (SNP rs10512186) and NEDD9 (SNP rs3734404) in the brain are not documented based on existing literature, but their functionalities can be inferred.DAPK1 (rs10512186) may be related to the late-onset of Alzheimer's disease as it is only selected at month 6 and month 12. Functionally, DAPK1 is a death-associated protein kinase that mediates a number of cellular processes including apoptosis and autophagy (Singh et al., 2016).The genetic variations in DAPK1 are known to be related with late-onset of Alzheimer's disease (Li et al., 2006).
NEDD9 (neural precursor cell expressed developmentally down-regulated protein 9) plays a key role in tyrosine-kinase signaling related to cell adhesion (Regelmann et al., 2006).The role of rs3734404/NEDD9 in AD is incomplete and inconsistent: some literature argued its functionalities with late-onset Alzheimer's disease (Strittmatter et al., 1993;Li et al., 2007) while some literature reported no association between the SNP genotype and AD (Chapuis et al., 2008).In our results, NEDD9 is associated with two regions only at month 6-superior occipital gyrus right and superior parietal gyrus right, indicating a transient role of the gene in disease development.
In our results, the MTHFR rs2274976 polymorphism is associated with superior parietal gyrus right at baseline.The presence of another MTHFR polymophism, rs180113 is associated with increased risk for Alzheimer's disease, adult depression, and neural tube defects in the fetus, etc (Trimmer, 2013).The gene MTHFR codes the protein methylenetetrahydrofolate reductase, which catalyzes a reaction involving the vitamin folate, and also plays a role processing amino acids (Wan et al., 2018).The level of serum folate is lower in AD patients, and folate deficiency is associated with higher risk of AD (Zhang et al., 2021;Prado et al., 2023).Our results indicates the association may be more critical at the onset of the disease.In addition, the variant of rs2274976 in MTHFR results in an arginine-to-glutamic acid change at amino acid 594.But as it is less frequent, its functionality is largely unknown.It may need additional attention based on our result.

. Conclusion and discussion
We have developed a novel Bayesian hierarchical model in imaging genetics studies for simultaneous activation shape estimation and variable selection.Our approach can jointly estimate the brain activation regions after accounting for external sources of clinical factors and genetic variation.To the best of our knowledge, currently there is no method that shares the same goal with us.We also borrow the anatomical brain segmentation as prior information.Our approach can detect important genetic and demographic factors associated with activation intensities inside activation regions.We applied the new method to an ADNI dataset as real The method yielded new that are interpretable, and pointed to some important loci that deserve further biological investigations.
On the other hand, our method does suffer from some limitations.First, our method uses the assumption that all averaged intensities inside are shared across all activation regions as long as they are anatomically the same, which is a relatively strong assumption.Mathematically speaking, the µ i can be further extended to an activation-region-specific variable: µ i,r , where r can be pre-specified by some spatial clustering methods.Second, the proposed method is limited by computational speed.It should be optimized so that it can be scalable to larger number of SNPs which is common to GWAS studies.
The study also has some limitations in its real data application.In this study, we limited our analysis to 614 SNPs associated with 34 genes known to be potentially related to AD.The purpose is to study which genotypes among the selection had an effect at the brain image level during the first year of the onset the disease.Thus genes whose relation with the disease manifests in later stages will be not identified in this analysis.As an example, the Apolipoprotein E (APOE) gene is a well-known risk factor due to its influence on blood cholesterol.Although it was included in our analysis, APOE was not found to be significantly associated with brain activation in the FDG-PET data, presumably because APOE variants act in a more global manner, and are not directly linked to the activation level of brain regions in the specific data type.
In this work, the genetic data used only involved genotyping data.Although genetic variations can shed some light on the potential association between genes and brain region activation during AD development, it cannot elucidate detailed molecular mechanisms.In future works, we will try to include gene expression and other data types to further study the mechanisms behind the genetic associations.

FIGURE
FIGURE Single subject with D image and no variable selection: from top to bottom, left to right: simulated boundary in red, simulated intensity data, estimated boundary in red and inclusion probability map.(a) Circle (weak).(b) Square (strong).(c) Random (intermediatel).

FIGURE
FIGUREInclusion probability map indicating changes of brain activation shapes.Left/right: the hippocampus from the right hemisphere from coronal panel/ the middle temporal gyrus from the right hemisphere at saggital panel.Points: light blue, anatomical regions; colored, activation regions.Red to dark blue: inclusion probability decreases from to .

FIGURE
FIGUREDi erent views of brain-wide activation regions.Red: activation regions; light blue: anatomical regions.

FIGURE
FIGUREVenn diagram presenting relations of SNPs selected at di erent time points.
TABLE Comparisons between BNLS with the spike-and-slab prior and with the normal prior in the simulation results for D image with multiple subjects.
TABLE Comparisons between FVGWAS, FGWAS, BNLS in the simulations for D image with multiple subjects, where the data were simulated according to the signal to noise ratios estimated from the real PET images in the ADNI study.
TABLE Anatomical region-wise results: number of voxels inside activation regions at di erent time points using hard-thresholding criterion, time points for each selected SNPs within one region.
TABLE Selected SNPs with related anatomical regions at di erent time points.