Front. Genet.Frontiers in GeneticsFront. Genet.1664-8021Frontiers Media S.A.10.3389/fgene.2018.00653GeneticsOriginal ResearchPrediction of Alzheimer’s Disease-Associated Genes by Integration of GWAS Summary Data and Expression DataHaoSicheng1*WangRui1ZhangYu2*ZhanHui3*1College of Computer and Information Science, Northeastern University, Boston, MA, United States2Department of Neurosurgery, Heilongjiang Province Land Reclamation Headquarters General Hospital, Harbin, China3College of Electronic Engineering, Heilongjiang University, Harbin, China
Edited by: Yan Huang, Harvard Medical School, United States
Reviewed by: Mingxuan Han, University of Utah, United States; Yaohui Nie, Harvard University, United States
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
Alzheimer’s disease (AD) is the most common cause of dementia. It is the fifth leading cause of death among elderly people. With high genetic heritability (79%), finding the disease’s causal genes is a crucial step in finding a treatment for AD. Following the International Genomics of Alzheimer’s Project (IGAP), many disease-associated genes have been identified; however, we do not have enough knowledge about how those disease-associated genes affect gene expression and disease-related pathways. We integrated GWAS summary data from IGAP and five different expression-level data by using the transcriptome-wide association study method and identified 15 disease-causal genes under strict multiple testing (α < 0.05), and four genes are newly identified. We identified an additional 29 potential disease-causal genes under a false discovery rate (α < 0.05), and 21 of them are newly identified. Many genes we identified are also associated with an autoimmune disorder.
Alzheimer’s diseasegenome-wide association studyautoimmune diseasestranscriptome-wide association studyfalse discover rateNational Natural Science Foundation of China10.13039/501100001809Introduction
Alzheimer’s disease (AD) is the most common cause of dementia which is characterized by a decline in cognitive skills that affects a person’s ability to perform everyday activities. Estimated 5.4 million people in the United States are living with AD. It is the fifth-leading cause of death among those age 65 and older (Alzheimer’s Association, 2016). Although some drugs showing effectiveness to mitigate the symptoms from getting worse for a limit time, no treatment can stop the disease. Heritability for the AD was estimated up to 79% (Gatz et al., 2006). However, the current finding of AD-associated genetic variants is not enough to fully explain the AD signal pathway in sufficient detail.
During recent years, with the rapid advance of next-generation DNA sequencing, identify disease-related mutation from large data set and develop treatment become possible (Cheng et al., 2016a, 2018a,b). Genome-wide comparison studies (GWASs) have identified a significant amount of common genetic variants associated with complex traits and diseases (Welter et al., 2014; Hu et al., 2017a,b). Many previous studies have identified genes such as APOE (Mahoney-Sanchez et al., 2016; Liao et al., 2017) on chromosome 19. However, the causal relation of those associated genes and variants remain unclear. For example, recent study and data showed that a female with the APOE gene under greater risk than a male with the APOE gene (Cacciottolo et al., 2016; Mazure and Swendsen, 2017). This strongly indicates that we have little knowledge about how this risk factor effect people.
With GWAS summary data provided by the International Genomics of Alzheimer’s Project (IGAP) (Lambert et al., 2013), we are able to study AD in great detail. For a complex disease such as AD, the top single nucleotide polymorphisms (SNPs) often located in the non-coding region, hard to know which gene is modified by that mutation and many significant SNPs are in high linkage disequilibrium (LD) with non-significant SNPs, plus many associated SNPs are more likely to locate in expression regulation region of the disease causal gene (Nicolae et al., 2010; Karch et al., 2016). To identify disease-associated genes, we used the transcriptome-wide association study (TWAS) (Gusev et al., 2016) method which integrates GWAS summarization level data, expression level data from human tissue. TWAS method can eliminate potential confounding and find disease causal gene by focusing only on expression trait linking related by genetic variation; it can also increase statistical power from the lower multiple-testing burden and the noise reduction of gene expression from environmental factors (Gusev et al., 2016).
Previous studies have pointed out at AD is closely related to autoimmune disorders (D’Andrea, 2005; Carter, 2010). After detecting possible disease causal gene for AD, we manually curated existing research about the autoimmune diseases that potentially related to AD.
Materials and Methods
Data we used for SNP-trait association is a large-scale GWAS summary data provided by IGAP with total 17,008 AD cases and 37,154 controls, include 7,055,881 SNPs, we selected 6,004,159 SNPs. Expression level data are from adipose tissue (RNA-seq), whole blood (RNA array), peripheral blood (RNA array), brain tissue (RNA-seq and RNA-seq splicing) (Raitakari et al., 2008; Nuotio et al., 2014; Wright et al., 2014; Fromer et al., 2016). Selection method can be find in Supplementary Materials.
Transcriptome-Wide Association Study
Transcriptome-wide association study can be viewed as a test for correlation between predicted gene expression and traits from GWAS summary association data. The predicted effect size of gene expression on traits can be viewed as a linear model of genotypes with weights based on the correlation between SNPs and gene expression in the training data while accounting for LD among SNPs.
There are eight modes of causality for the relationship between genetic variant, gene expression, and traits. Scenarios Figures 1E–H should be identified as significant by TWAS and its corresponding null hypothesis is gene expression completely independent of traits (Figures 1A–D). By only focusing on the genetic component of expression, the instances of expression-trait association that is not caused by genetic variation but variation in traits can be avoided. One aspect that needs to be noticed is, same as other methods, TWAS is also confounded by linkage and pleiotropy.
Eight of causal assumption between gene, expression and trait in TWAS study. Null hypothesis: gene expression is completely independent of traits (A–D). Alternative hypothesis: causal relation exists between SNPs and traits (E–H).
Performing TWAS With GWAS Summary Statistics
We integrated gene expression measurements from five tissues with summary GWAS to perform multi-tissue transcriptome-wide association. In each tissue, TWAS used cross-validation to compare predictions from the best cis-eQTL to those from all SNPs at the locus. Prediction models choosing from BLUP (Lofgren et al., 1989), BSLLM (Zhou et al., 2013), LASSO (Tibshirani, 1997), and elastic net (Gamazon et al., 2015).
Transcriptome-wide association study Imputes effect size (z-score) of the expression and trait are linear combination of elements of z-score of SNPs for traits with weights. The weights, W = ∑ e,s∑s,s−1, are calculated using ImpG-Summary algorithm (Pasaniuc et al., 2014) and adjusted for LD. ∑ e,s is the estimated covariance matrix between all SNPs at the locus and gene expression and ∑ s,s is the estimated covariance among all SNPs which is used to account for LD.
Standardized effect sizes (Z-scores) of SNPs for a trait at a given cis locus can be denoted as a vector Z. Also, the imputed Z-score of expression and trait, WZ, has variance. W ∑s,sWt. Therefore, the imputation Z score of the cis genetic effect on the trait is,
ZTWAS=WZ/(WΣs,sWt)12.
Bonferroni correction is usually applied when identifying significant disease-associated gene. The standard multiple testing conducted in TWAS is 0.05/15000 (Gusev et al., 2016). But traditional p-value cutoffs adjusted by Bonferroni correction are made too strict in order to avoid an abundance of false positive results. The thresholds like 0.05/15000 for significant genes are usually chosen so that the probability of any single false positive among all loci tested is smaller than 0.05, which will lead to many missed findings. Instead, False Discovery Rate error measure is a more useful approach when a study involves a large number of tests, since it can identify as many significant genes as possible while incurring a relatively low proportion of false positives (Storey and Tibshirani, 2003). For each tissue, we used the Benjamini and Hochberg procedure (Benjamini and Hochberg, 1995) in addition to the Bonferroni correction for all gene tested. The Benjamini and Hochberg procedure is one of false discovery rate procedures that are designed to control the expected proportion of false positives. It is less stringent than the Bonferroni correction, thus has greater power. Since this is study is more exploratory, we can pay more risk of type I error for larger statistical power. It works as follows:
Put individual p-values in ascending order and assign ranks to the p-values.
Calculate each individual Benjamini and Hochberg critical value with formula kmα, where k is individual p-value’s rank, m is total number of tests and α is the false discovery rate.
Find the largest k such that Pk ≤kmα and reject the null hypothesis for all Hi for i = 1, ...k.
Results
To determine which gene is significantly associated with AD, we first performed strict multiple testing Bonferroni correction (p-value < 0.05/15000). We found 15 significant genes (Table 1), 11 of them has identified by previous studies of AD. In order to increase the search range, we performed false discovery rate under the same alpha (0.05). After the Benjamini and Hochberg procedure (Benjamini and Hochberg, 1995), we found 29 additional genes (Table 2). Nine of those genes has previously identified to be related to AD.
Significant genes identified by TWAS under strict multiple testing.
Gene
Chromosome
Tissue
P-Value
Z-score
Related to autoimmune diseases
PVRL2
19
Brain (CMC) RNA-seq
4.92E-34
−12.1626
Yes
TOMM40
19
Whole Blood (YFS) RNA Arr ay
1.13E-25
10.4749
CLPTM1
19
Brain (CMC) RNA-seq
5.73E-17
−8.37061
CLU
8
Brain (CMC) RNA-seq splicing
1.45E-16
−8.26075
CR1
1
Brain (CMC) RNA-seq
4.08E-15
7.8523
Yes
CEACAM19
19
Adipose (METSIM) RNA-seq
3.38E-11
6.62905
Yes
MS4A6A
11
Whole Blood (YFS) RNA Array
2.92E-10
6.30316
TRPC4AP
20
Brain (CMC) RNA-seq splicing
9.43E-10
6.1188
MLH3
14
Brain (CMC) RNA-seq splicing
7.86E-09
−5.77148
Yes
MS4A6A
11
Peripheral Blood (NTR) RNA Array
5.72E-08
5.4272
PTK2B
8
Peripheral Blood (NTR) RNA Array
9.93E-08
5.32809
PVR
19
Brain (CMC) RNA-seq
2.05E-07
−5.19443
Yes
PICALM
11
Peripheral Blood (NTR) RNA Array
2.84E-07
5.1337
Yes
MS4A4A
11
Adipose (METSIM) RNA-seq
6.11E-07
4.99
BIN1
2
Whole Blood (YFS) RNA Array
1.18E-06
4.859114
FNBP4
11
Whole Blood (YFS) RNA Array
1.49E-06
−4.81307
PTK2B
8
Whole Blood (YFS) RNA Array
2.89E-06
4.6784
Yes
BIN1
2
Peripheral Blood (NTR) RNA Array
3.24E-06
4.65503
Yes
Additional gene under Benjamini and Hochberg procedure.
Gene
Chromosome
Tissue
P-Value
Z-score
Previously identified
PHACTR1
6
Whole Blood (YFS) RNA Array
3.41E-06
−4.64434
PTPMT1
11
Whole Blood (YFS) RNA Array
4.45E-06
4.58895
MTCH2
11
Peripheral Blood (NTR) RNA Array
5.76E-06
4.535
C1QTNF4
11
Adipose (METSIM) RNA-seq
8.82E-06
4.44
FAM180B
11
Brain (CMC) RNA-seq
1.09E-05
−4.39814
Yes
DMWD
19
Whole Blood (YFS) RNA Array
1.22E-05
4.3733
ELL
19
Whole Blood (YFS) RNA Array
1.89E-05
4.277
Yes
ZNF740
12
Brain (CMC) RNA-seq splicing
2.08E-05
4.25599
NYAP1
7
Adipose (METSIM) RNA-seq
2.47E-05
−4.21777
SDAD1
4
Whole Blood (YFS) RNA Array
3.04E-05
−4.17062
MTSS1L
16
Brain (CMC) RNA-seq splicing
3.35E-05
4.14833
PHKB
16
Brain (CMC) RNA-seq
3.70E-05
−4.1257
Yes
SLC39A13
11
Brain (CMC) RNA-seq splicing
4.01E-05
−4.10667
Yes
CD33
19
Whole Blood (YFS) RNA Array
4.04E-05
4.1051
Yes
AP2A2
11
Brain (CMC) RNA-seq
4.28E-05
−4.09193
Yes
ZYX
7
Adipose (METSIM) RNA-seq
4.56E-05
−4.07718
ZNF232
17
Brain (CMC) RNA-seq splicing
4.73E-05
−4.0688
ZNF232
17
Brain (CMC) RNA-seq splicing
4.76E-05
4.0671
DLST
14
Peripheral Blood (NTR) RNA Array
5.26E-05
4.0436
Yes
TBC1D7
6
Adipose (METSIM) RNA-seq
5.34E-05
4.0403
ELL
19
Adipose (METSIM) RNA-seq
5.48E-05
4.03401
SLC39A13
11
Brain (CMC) RNA-seq splicing
5.79E-05
−4.02128
Yes
TMCO6
5
Whole Blood (YFS) RNA Array
6.50E-05
3.9938
CEL
9
Whole Blood (YFS) RNA Array
6.99E-05
3.97671
Yes
MYBPC3
11
Adipose (METSIM) RNA-seq
7.05E-05
3.97
Yes
TBC1D7
6
Brain (CMC) RNA-seq splicing
7.48E-05
−3.96063
LRRC25
19
Peripheral Blood (NTR) RNA Array
7.74E-05
−3.9523
TBC1D7
6
Brain (CMC) RNA-seq splicing
8.37E-05
3.93351
KIR3DX1
19
Peripheral Blood (NTR) RNA Array
8.87E-05
3.9195
SIX5
19
Peripheral Blood (NTR) RNA Array
9.32E-05
3.9076
HBEGF
5
Whole Blood (YFS) RNA Array
9.92E-05
−3.8926
Yes
NUP88
17
Peripheral Blood (NTR) RNA Array
1.60E-04
−3.7748
FAM105B
5
Whole Blood (YFS) RNA Array
1.61E-04
3.773
ARL6IP4
12
Peripheral Blood (NTR) RNA Array
2.10E-04
3.707
Gene
Chromosome
Tissue
P-Value
Z-score
PHACTR1
6
Whole Blood (YFS) RNA Array
3.41E-06
−4.64434
PTPMT1
11
Whole Blood (YFS) RNA Array
4.45E-06
4.58895
MTCH2
11
Peripheral Blood (NTR) RNA Array
5.76E-06
4.535
C1QTNF4
11
Adipose (METSIM) RNA-seq
8.82E-06
4.44
FAM180B
11
Brain (CMC) RNA-seq
1.09E-05
−4.39814
DMWD
19
Whole Blood (YFS) RNA Array
1.22E-05
4.3733
ELL
19
Whole Blood (YFS) RNA Array
1.89E-05
4.277
ZNF740
12
Brain (CMC) RNA-seq splicing
2.08E-05
4.25599
NYAP1
7
Adipose (METSIM) RNA-seq
2.47E-05
−4.21777
SDAD1
4
Whole Blood (YFS) RNA Array
3.04E-05
−4.17062
MTSS1L
16
Brain (CMC) RNA-seq splicing
3.35E-05
4.14833
PHKB
16
Brain (CMC) RNA-seq
3.70E-05
−4.1257
SLC39A13
11
Brain (CMC) RNA-seq splicing
4.01E-05
−4.10667
CD33
19
Whole Blood (YFS) RNA Array
4.04E-05
4.1051
AP2A2
11
Brain (CMC) RNA-seq
4.28E-05
−4.09193
ZYX
7
Adipose (METSIM) RNA-seq
4.56E-05
−4.07718
ZNF232
17
Brain (CMC) RNA-seq splicing
4.73E-05
−4.0688
ZNF232
17
Brain (CMC) RNA-seq splicing
4.76E-05
4.0671
DLST
14
Peripheral Blood (NTR) RNA Array
5.26E-05
4.0436
TBC1D7
6
Adipose (METSIM) RNA-seq
5.34E-05
4.0403
ELL
19
Adipose (METSIM) RNA-seq
5.48E-05
4.03401
SLC39A13
11
Brain (CMC) RNA-seq splicing
5.79E-05
−4.02128
TMCO6
5
Whole Blood (YFS) RNA Array
6.50E-05
3.9938
CEL
9
Whole Blood (YFS) RNA Array
6.99E-05
3.97671
MYBPC3
11
Adipose (METSIM) RNA-seq
7.05E-05
3.97
TBC1D7
6
Brain (CMC) RNA-seq splicing
7.48E-05
−3.96063
LRRC25
19
Peripheral Blood (NTR) RNA Array
7.74E-05
−3.9523
TBC1D7
6
Brain (CMC) RNA-seq splicing
8.37E-05
3.93351
KIR3DX1
19
Peripheral Blood (NTR) RNA Array
8.87E-05
3.9195
SIX5
19
Peripheral Blood (NTR) RNA Array
9.32E-05
3.9076
HBEGF
5
Whole Blood (YFS) RNA Array
9.92E-05
−3.8926
NUP88
17
Peripheral Blood (NTR) RNA Array
1.60E-04
−3.7748
FAM105B
5
Whole Blood (YFS) RNA Array
1.61E-04
3.773
ARL6IP4
12
Peripheral Blood (NTR) RNA Array
2.10E-04
3.707
PVRL2 (p-value 4.92∗10ˆ–34 in Brain (CMC) RNA-seq, also known as NECTIN2) is a well-known gene for AD. This gene encodes a single-pass type I membrane glycoprotein and interact with AOPE gene (Kulminski et al., 2018). TOMM40 [p-value 1.13∗10ˆ–25 in Whole Blood (YFS) RNA Array] is also located adjacent to APOE. It has been identified by previous studies worldwide as AD related gene (Lyall et al., 2014; Goh et al., 2015; Mise et al., 2017). It is the central and essential component of the translocase of the outer mitochondrial membrane (Humphries et al., 2005). This confirmed that mitochondrial dysfunction plays a significant role in AD-related pathology (Swerdlow and Khan, 2004; Roses et al., 2016).
Other highly connected genes function group identified are BIN1 [p-value 1.18 × 10−6in Whole Blood (YFS) RNA Array; 3.24 × 10−6 in Peripheral Blood (NTR) RNA Array], CLU (p-value 1.45 × 10−16), MS4A6A [p-value 5.72 × 10−8in Peripheral Blood (NTR) RNA Array; 2.92 × 10−10in Whole Blood (YFS) RNA Array] (Han et al., 2017).
New Identified Genes
MLH3 [p-value 7.86 × 10−9 in Brain (CMC) RNA-seq splicing] FNBP4 [p-value 1.49 × 10−6in Whole Blood (YFS) RNA Array], CEACAM19 [p-value 3.38 × 10−11 in Adipose (METSIM) RNA-seq], and CLPTM1 [p-value 5.73 × 10−17 in Brain (CMC) RNA-seq] are newly identified AD-associated genes. MLH3 gene is known for its function in repair mismatched DNA and risk for thyroid cancer and lupus (Souliotis et al., 2016; Al-Sweel et al., 2017; Javid et al., 2018). CEACAM19 gene located in chromosome 19, a previous study showed high expression of CEACAM19 for patients with breast cancer (Estiar et al., 2017); CLPTM1 has been shown to increase the risk of lung cancer and melanoma (Llorca-Cardenosa et al., 2014; Lee et al., 2017). Both CEACAM19 and CLPTM1 gene are located in chromosome 19 and near APOE gene. More detailed studies are needed to investigate the relationship between those genes and whether CLPTM1 and CEACAM19 are disease causal gene.
DiscussionAPOE Related Genes
Although APOE is not reported to be significant in any tissue, not enough evidence to conclude that APOE is not related to AD. Since each SNP has a weight assigned regarding the expression in TWAS study, even two genes are both significantly related to a disease, it is very likely only one of them will be showing significant in TWAS. TOMM40 (Figure 2, p-value 1.13 × 10−25) gene located adjacent to APOE (Pomara et al., 2011), and has a strong LD with APOE gene (Yu et al., 2007), hence TWAS didn’t detect this APOE does not imply it is not disease causal gene. APOE and TOMM40 may interact to affect AD pathology such as mitochondrial dysfunction (David et al., 2005; Roses et al., 2013). Further study is needed to show causal relation in detail. PICALM [p-value 2084 × 10−7 in Peripheral Blood (NTR) RNA Array] and PTK2B [p-value 9.93 × 10−8 in Peripheral Blood (NTR) RNA Array; p-value 2.89 × 10−6 in Whole Blood (YFS) RNA Array] are also related to APOE and TOMM40 gene according to previous studies (Carter, 2011; Gharesouran et al., 2014; Morgen et al., 2014; Han et al., 2017).
Geneposition plot in chromosome 19. Expression data: whole blood.
Association With Autoimmune Diseases
Complex disease such as AD, often shares common pathways or causal genes with other diseases (Hu et al., 2017c). For instance, TOMM40 is a shared disease-associated gene between AD and Type II diabetes (Greenbaum et al., 2014). Recent studies showing autoimmune diseases have closed relation with AD (D’Andrea, 2005; Lehrer and Rheinstein, 2015; Wotton and Goldacre, 2017). Among all the genes we identified through TWAS method, eight of them are related to autoimmune diseases.
As shown in Figure 3, PICALM, PVRL2, PVR, and CLU have shown to be related to systemic, an autoimmune disease characterized by vascular injury and debilitating tissue fibrosis (Xia et al., 2010; Ryu et al., 2014; Tsou et al., 2016; van Luijn et al., 2016). CR1 and CLU gene are related to thymus function which could potentially cause an autoimmune disorder (French et al., 1992; Pekalski et al., 2017). MLH3 and BIN1 gene have shown to be associated with Lupus, another severe autoimmune disease (Armstrong et al., 2014; Souliotis et al., 2016). Although with existing result, we don’t have enough evidence to prove these genes are both disease causal genes for AD and autoimmune disease, further research from areas such as metabolomics and proteomics is needed to study the disease association between AD and autoimmune diseases (Cheng et al., 2016b, 2017; Hu et al., 2018).
Shared disease associated gene between Alzheimer’s disease (AD) and Autoimmune diseases.
Author Contributions
RW and YZ wrote the method manuscript. SH and HZ analyzed the data and wrote the manuscript. All authors read and approved the final manuscript.
Conflict of Interest Statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Funding. Publication costs were funded by the Fundamental Research Funds for the Central Universities (Grant No. HIT NSRIF 201856), National Natural Science Foundation of China (Grant No. 61502125), Heilongjiang Postdoctoral Fund (Grant Nos. LBH-Z6064 and LBH-Z15179), and China Postdoctoral Science Foundation (Grant No. 2016 M590291).
Supplementary Material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2018.00653/full#supplementary-material
ReferencesAl-SweelN.RaghavanV.DuttaA.AjithV. P.Di VietroL.KhondakarN. (2017). mlh3 mutations in baker’s yeast alter meiotic recombination outcomes by increasing noncrossover events genome-wide.13:e1006974. 10.1371/journal.pgen.100697429065117Alzheimer’s Association (2016). Alzheimer’s disease facts and figures.12459–509.ArmstrongD. L.ZidovetzkiR.Alarcon-RiquelmeM. E.TsaoB. P.CriswellL. A.KimberlyR. P. (2014). GWAS identifies novel SLE susceptibility genes and explains the association of the HLA region.15347–354. 10.1038/gene.2014.2324871463BenjaminiY.HochbergY. (1995). Controlling the false discovery rate - a practical and powerful approach to multiple testing.57289–300. 10.1111/j.2517-6161.1995.tb02031.xCacciottoloM.ChristensenA.MoserA.LiuJ. H.PikeC. J.SmithC. (2016). The APOE4 allele shows opposite sex bias in microbleeds and Alzheimer’s disease of humans and mice.3747–57. 10.1016/j.neurobiolaging.2015.10.01026686669CarterC. (2011). Alzheimer’s disease: APP, gamma secretase, APOE, CLU, CR1, PICALM, ABCA7, BIN1, CD2AP, CD33, EPHA1, and MS4A2, and their relationships with herpes simplex, C. Pneumoniae, other suspect pathogens, and the immune system.2011:501862. 10.4061/2011/50186222254144CarterC. J. (2010). Alzheimer’s disease: a pathogenetic autoimmune disorder caused by herpes simplex in a gene-dependent manner.2010:140539. 10.4061/2010/14053921234306ChengL.HuY.SunJ.ZhouM.JiangQ. (2018a). DincRNA: a comprehensive web-based bioinformatics toolkit for exploring disease associations and ncRNA function.341953–1956. 10.1093/bioinformatics/bty00229365045ChengL.JiangY.JuH.SunJ.PengJ.ZhouM. (2018b). InfAcrOnt: calculating cross-ontology term similarities using information flow by a random walk.19:919. 10.1186/s12864-017-4338-629363423ChengL.JiangY.WangZ.ShiH.SunJ.YangH. (2016a). DisSim: an online system for exploring significant similar diseases and exhibiting potential therapeutic drugs.6:30024. 10.1038/srep3002427457921ChengL.SunJ.XuW.DongL.HuY.ZhouM. (2016b). OAHG: an integrated resource for annotating human genes with multi-level ontologies.10:34820. 10.1038/srep3482027703231ChengL.YangH.ZhaoH.PeiX.ShiH.SunJ. (2017). MetSigDis: a manually curated resource for the metabolic signatures of diseases.10.1093/bib/bbx103 [Epub ahead ofprint]. 28968812D’AndreaM. R. (2005). Add Alzheimer’s disease to the list of autoimmune diseases.64458–463. 10.1016/j.mehy.2004.08.02415617848DavidD. C.HauptmannS.ScherpingI.SchuesselK.KeilU.RizzuP. (2005). Proteomic and functional analyses reveal a mitochondrial dysfunction in P301L tau transgenic mice.28023802–23814. 10.1074/jbc.M50035620015831501EstiarM. A.EsmaeiliR.ZareA. A.FarahmandL.FazilatyH.ZekriA. (2017). High expression of CEACAM19, a new member of carcinoembryonic antigen gene family, in patients with breast cancer.17547–553. 10.1007/s10238-016-0442-127909883FrenchL. E.SappinoA. P.TschoppJ.SchifferliJ. A. (1992). Distinct sites of production and deposition of the putative cell death marker clusterin in the human thymus.901919–1925. 10.1172/JCI1160691430214FromerM.RoussosP.SiebertsS. K.JohnsonJ. S.KavanaghD. H.PerumalT. M. (2016). Gene expression elucidates functional impact of polygenic risk for schizophrenia.191442–1453. 10.1038/nn.439927668389GamazonE. R.WheelerH. E.ShahK. P.MozaffariS. V.Aquino-MichaelsK.CarrollR. J. (2015). A gene-based association method for mapping traits using reference transcriptome data.471091–1098. 10.1038/ng.336726258848GatzM.ReynoldsC. A.FratiglioniL.JohanssonB.MortimerJ. A.BergS. (2006). Role of genes and environments for explaining Alzheimer disease.63168–174. 10.1001/archpsyc.63.2.16816461860GharesouranJ.RezazadehM.KhorramiA.GhojazadehM.TalebiM. (2014). Genetic evidence for the involvement of variants at APOE, BIN1, CR1, and PICALM loci in risk of late-onset Alzheimer’s disease and evaluation for interactions with APOE genotypes.54780–786. 10.1007/s12031-014-0377-525022885GohL. K.LimW. S.TeoS.VijayaraghavanA.ChanM.TayL. (2015). TOMM40 alterations in Alzheimer’s disease over a 2-year follow-up period.4457–61. 10.3233/JAD-14159025201778GreenbaumL.SpringerR. R.LutzM. W.HeymannA.LubitzI.CooperI. (2014). The TOMM40 poly-T rs10524523 variant is associated with cognitive performance among non-demented elderly with type 2 diabetes.241492–1499. 10.1016/j.euroneuro.2014.06.00225044051GusevA.KoA.ShiH.BhatiaG.ChungW.PenninxB. W. (2016). Integrative approaches for large-scale transcriptome-wide association studies.48245–252. 10.1038/ng.350626854917HanZ.HuangH.GaoY.HuangQ. (2017). Functional annotation of Alzheimer’s disease associated loci revealed by GWASs.12:e0179677. 10.1371/journal.pone.017967728650998HuY.ChengL.ZhangY.BaiW.WangT.HanZ. (2017a). Rs4878104 contributes to Alzheimer’s disease risk and regulates DAPK1 gene expression.381255–1262. 10.1007/s10072-017-2959-928429084HuY.ZhengL.ChengL.BaiW.WangT.HanZ. (2017b). GAB2 rs2373115 variant contributes to Alzheimer’s disease risk specifically in European population.37518–22. 10.1016/j.jns.2017.01.03028320126HuY.ZhouM.ShiH.JuH.JiangQ.ChengL. (2017c). Measuring disease similarity and predicting disease-related ncRNAs by a novel method.10(Suppl. 5):71. 10.1186/s12920-017-0315-929297338HuY.ZhaoT.ZhangN.ZangT.ZhangJ.ChengL. (2018). Identifying diseases-related metabolites using random walk.19(Suppl. 5):116. 10.1186/s12859-018-2098-129671398HumphriesA. D.StreimannI. C.StojanovskiD.JohnstonA. J.YanoM.HoogenraadN. J. (2005). Dissection of the mitochondrial import and assembly pathway for human Tom40.28011535–11543. 10.1074/jbc.M41381620015644312JavidM.SasanakietkulT.NicolsonN. G.GibsonC. E.CallenderG. G.KorahR. (2018). DNA mismatch repair deficiency promotes genomic instability in a subset of papillary thyroid cancers.42358–366. 10.1007/s00268-017-4299-629075860KarchC. M.EzerskiyL. A.BertelsenS.Alzheimer’s Disease Genetics Consortium (ADGC)GoateA. M. (2016). Alzheimer’s disease risk polymorphisms regulate gene expression in the ZCWPW1 and the CELF1 loci.11:e0148717. 10.1371/journal.pone.014871726919393KulminskiA. M.HuangJ.WangJ.HeL.LoikaY.CulminskayaI. (2018). Apolipoprotein E region molecular signatures of Alzheimer’s disease.10.1111/acel.12779 [Epub ahead of print]. 29797398LambertJ. C.Ibrahim-VerbaasC. A.HaroldD.NajA. C.SimsR.BellenguezC. (2013). Meta-analysis of 74,046 individuals identifies 11 new susceptibility loci for Alzheimer’s disease.451452–1458. 10.1038/ng.280224162737LeeD. H.HeoY. R.ParkW. J.LeeJ. H. (2017). A TERT-CLPTM1 locus polymorphism (rs401681) is associated with EGFR mutation in non-small cell lung cancer.2131340–1343. 10.1016/j.prp.2017.09.02829033187LehrerS.RheinsteinP. H. (2015). Is Alzheimer’s disease autoimmune inflammation of the brain that can be treated with nasal nonsteroidal anti-inflammatory drugs?30225–227. 10.1177/153331751454547825100747LiaoF.YoonH.KimJ. (2017). Apolipoprotein E metabolism and functions in brain and its role in Alzheimer’s disease.2860–67. 10.1097/MOL.000000000000038327922847Llorca-CardenosaM. J.Pena-ChiletM.MayorM.Gomez-FernandezC.CasadoB.Martin-GonzalezM. (2014). Long telomere length and a TERT-CLPTM1 locus polymorphism association with melanoma risk.503168–3177. 10.1016/j.ejca.2014.09.01725457634LofgrenD. L.HarrisD. L.StewartT. S.SchinckelA. P. (1989). Adapting best linear unbiased prediction (BLUP) for timely genetic evaluation: II. Progeny traits in multiple contemporary groups within a herd.673223–3242. 10.2527/jas1989.67123223x2613571LyallD. M.HarrisS. E.BastinM. E.Munoz ManiegaS.MurrayC.LutzM. W. (2014). Alzheimer’s disease susceptibility genes APOE and TOMM40, and brain white matter integrity in the Lothian Birth Cohort 1936.351513.e25–1533.e25. 10.1016/j.neurobiolaging.2014.01.00624508314Mahoney-SanchezL.BelaidiA. A.BushA. I.AytonS. (2016). The complex role of apolipoprotein E in Alzheimer’s disease: an overview and update.60325–335. 10.1007/s12031-016-0839-z27647307MazureC. M.SwendsenJ. (2017). Sex differences in Alzheimer’s disease and other dementias.15451–452. 10.1016/S1474-4422(16)00067-3MiseA.YoshinoY.YamazakiK.OzakiY.SaoT.YoshidaT. (2017). TOMM40 and APOE gene expression and cognitive decline in Japanese Alzheimer’s disease subjects.601107–1117. 10.3233/JAD-17036128984592MorgenK.RamirezA.FrolichL.TostH.PlichtaM. M.KolschH. (2014). Genetic interaction of PICALM and APOE is associated with brain atrophy and cognitive impairment in Alzheimer’s disease.10(5 Suppl.), S269–S276. 10.1016/j.jalz.2013.11.00124613704NicolaeD. L.GamazonE.ZhangW.DuanS. W.DolanM. E.CoxN. J. (2010). Trait-associated SNPs are more likely to be eQTLs: annotation to enhance discovery from GWAS.6:e1000888. 10.1371/journal.pgen.100088820369019NuotioJ.OikonenM.MagnussenC. G.JokinenE.LaitinenT.Hutri-KahonenN. (2014). Cardiovascular risk factors in 2011 and secular trends since 2007: the cardiovascular risk in young finns study.42563–571. 10.1177/140349481454159725053467PasaniucB.ZaitlenN.ShiH.BhatiaG.GusevA.PickrellJ. (2014). Fast and accurate imputation of summary statistics enhances evidence of functional enrichment.302906–2914. 10.1093/bioinformatics/btu41624990607PekalskiM. L.GarciaA. R.FerreiraR. C.RainbowD. B.SmythD. J.MasharM. (2017). Neonatal and adult recent thymic emigrants produce IL-8 and express complement receptors CR1 and CR2.2:93739. 10.1172/jci.insight.9373928814669PomaraN.BrunoD.SidtisJ. J.LutzM. W.GreenblattD. J.SaundersA. M. (2011). Translocase of outer mitochondrial membrane 40 homolog (TOMM40) poly-T length modulates lorazepam-related cognitive toxicity in healthy APOE epsilon4-negative elderly.31544–546. 10.1097/JCP.0b013e318222810e21720235RaitakariO. T.JuonalaM.RonnemaaT.Keltikangas-JarvinenL.RasanenL.PietikainenM. (2008). Cohort profile: the cardiovascular risk in young finns study.371220–1226. 10.1093/ije/dym22518263651RosesA.SundsethS.SaundersA.GottschalkW.BurnsD.LutzM. (2016). Understanding the genetics of APOE and TOMM40 and role of mitochondrial structure and function in clinical pharmacology of Alzheimer’s disease.12687–694. 10.1016/j.jalz.2016.03.01527154058RosesA. D.LutzM. W.CrenshawD. G.GrossmanI.SaundersA. M.GottschalkW. K. (2013). TOMM40 and APOE: requirements for replication studies of association with age of disease onset and enrichment of a clinical trial.9132–136. 10.1016/j.jalz.2012.10.00923333464RyuJ.WooJ.ShinJ.RyooH.KimY.LeeC. (2014). Profile of differential promoter activity by nucleotide substitution at GWAS signals for multiple sclerosis.93:e281. 10.1097/MD.000000000000028125526461SouliotisV. L.VougasK.GorgoulisV. G.SfikakisP. P. (2016). Defective DNA repair and chromatin organization in patients with quiescent systemic lupus erythematosus.18:182. 10.1186/s13075-016-1081-327492607StoreyJ. D.TibshiraniR. (2003). Statistical significance for genomewide studies.1009440–9445. 10.1073/pnas.153050910012883005SwerdlowR. H.KhanS. M. (2004). A ”mitochondrial cascade hypothesis” for sporadic Alzheimer’s disease.638–20. 10.1016/j.mehy.2003.12.04515193340TibshiraniR. (1997). The lasso method for variable selection in the Cox model.16385–395. 10.1002/(SICI)1097-0258(19970228)16:4<385::AID-SIM380>3.0.CO;2-3TsouP.-S.WrenJ. D.AminM. A.SchiopuE.FoxD. A.KhannaD. (2016). Histone deacetylase 5 is overexpressed in scleroderma endothelial cells and impairs angiogenesis via repression of proangiogenic factors.682975–2985. 10.1002/art.39828van LuijnM. M.van MeursM.StoopM. P.VerbraakE.Wierenga-WolfA. F.MeliefM. J. (2016). Elevated expression of the cerebrospinal fluid disease markers chromogranin a and clusterin in astrocytes of multiple sclerosis white matter lesions.7586–98. 10.1093/jnen/nlv00426683597WelterD.MacArthurJ.MoralesJ.BurdettT.HallP.JunkinsH. (2014). The NHGRI GWAS catalog, a curated resource of SNP-trait associations.42D1001–D1006. 10.1093/nar/gkt122924316577WottonC. J.GoldacreM. J. (2017). Associations between specific autoimmune diseases and subsequent dementia: retrospective record-linkage cohort study, UK.71576–583. 10.1136/jech-2016-20780928249989WrightF. A.SullivanP. F.BrooksA. I.ZouF.SunW.XiaK. (2014). Heritability and genomics of gene expression in peripheral blood.46430–437. 10.1038/ng.295124728292XiaZ.ChibnikL. B.GlanzB. I.LiguoriM.ShulmanJ. M.TranD. (2010). A putative Alzheimer’s disease risk allele in PCK1 influences brain atrophy in multiple sclerosis.5:e14169. 10.1371/journal.pone.001416921152065YuC. E.SeltmanH.PeskindE. R.GallowayN.ZhouP. X.RosenthalE. (2007). Comprehensive analysis ofAPOE and selected proximate markers for late-onset Alzheimer’s disease: patterns of linkage disequilibrium and disease/marker association.89655–665. 10.1016/j.ygeno.2007.02.00217434289ZhouX.CarbonettoP.StephensM. (2013). Polygenic modeling with bayesian sparse linear mixed models.9:e1003264. 10.1371/journal.pgen.100326423408905