AUTHOR=Chan Yi Hao, Wang Conghao, Soh Wei Kwek, Rajapakse Jagath C. TITLE=Combining Neuroimaging and Omics Datasets for Disease Classification Using Graph Neural Networks JOURNAL=Frontiers in Neuroscience VOLUME=Volume 16 - 2022 YEAR=2022 URL=https://www.frontiersin.org/journals/neuroscience/articles/10.3389/fnins.2022.866666 DOI=10.3389/fnins.2022.866666 ISSN=1662-453X ABSTRACT=Both neuroimaging and genomics datasets are often gathered for the detection of neurodegenerative diseases. The high dimensionality of both neuroimaging and omics data poses a tremendous challenge for methods that integrate multiple modalities. There are few existing solutions that can combine both multi-modal imaging and multi-omics datasets to derive neurological insights. We propose a deep neural network architecture that combines both structural and functional connectome data with multi-omics data for disease classification. Graph convolutional networks (GCNs) are used to model functional magnetic resonance imaging (fMRI) and diffusion tensor imaging (DTI) data simultaneously to learn compact representations of the connectome. A separate set of GCNs is then used to model multi-omics datasets, expressed in the form of patient similarity networks, and combine them with latent representations of the connectome. An attention mechanism is used to fuse the GCN outputs, providing insight into which omics data contributed most to the model's classification decision. We demonstrate our methods for Parkinson's disease (PD) classification using datasets from the Parkinson's Progression Markers Initiative (PPMI). PD has been shown to be associated with changes in the human connectome and is also known to be influenced by genetic factors. We combine DTI and fMRI data with multi-omics data from messenger RNA expression, single nucleotide polymorphism (SNP), DNA methylation and non-coding RNA experiments.
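The pipeline outlined above (patient similarity networks, graph convolution, attention-based fusion) can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the cosine-similarity threshold, the toy feature vectors, the fixed weight matrix `w`, and the attention `scores` are all hypothetical values chosen for illustration. In a trained model the weights and attention scores would be learned.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def similarity_graph(features, threshold=0.5):
    """Patient similarity network: edge where cosine similarity >= threshold.
    Self-loops are always included (assumed construction, not from the paper)."""
    n = len(features)
    return [[1.0 if i == j or cosine(features[i], features[j]) >= threshold else 0.0
             for j in range(n)] for i in range(n)]

def normalize(adj):
    """Symmetric normalization D^(-1/2) A D^(-1/2)."""
    deg = [sum(row) for row in adj]
    n = len(adj)
    return [[adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_layer(adj_norm, h, w):
    """One graph-convolution step: ReLU(A_norm @ H @ W)."""
    z = matmul(matmul(adj_norm, h), w)
    return [[max(0.0, x) for x in row] for row in z]

def attention_fuse(embeddings, scores):
    """Softmax the per-modality scores, then take the weighted sum of embeddings."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    n, d = len(embeddings[0]), len(embeddings[0][0])
    fused = [[sum(wt * emb[i][k] for wt, emb in zip(weights, embeddings))
              for k in range(d)] for i in range(n)]
    return fused, weights

# Toy data: 3 patients, 2 features per omics modality (hypothetical values).
mrna = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
snp = [[0.0, 1.0], [0.1, 0.9], [1.0, 0.0]]
w = [[1.0, -1.0], [0.5, 0.5]]   # fixed here; learned in practice
scores = [0.2, 1.0]             # hypothetical attention scores (mRNA, SNP)

embeddings = [gcn_layer(normalize(similarity_graph(x)), x, w) for x in (mrna, snp)]
fused, weights = attention_fuse(embeddings, scores)
```

The softmax `weights` play the role of the attention values: a larger weight for one modality means its embedding dominates the fused representation, which is the kind of signal that can be inspected to compare modalities.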
To address the paucity of multi-modal imaging data, we used CycleGAN on structural and functional connectomes to generate missing imaging modalities, in the process reducing the problem of imbalanced data in the PPMI dataset. Our proposed architecture can achieve a Matthews correlation coefficient of more than 0.8 over many combinations of multi-modal imaging data and multi-omics data. Furthermore, ablation studies offer insights into the importance of each imaging and omics modality for PD prediction. Analysis of the generated attention matrices revealed that SNP data was the most important of the omics modalities considered. Our work motivates further research into imaging genetics and the creation of more multi-modal imaging and multi-omics datasets to study PD and other complex neurodegenerative diseases.
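For readers unfamiliar with the Matthews correlation coefficient (MCC) reported above, the following is a minimal sketch of the standard binary-classification formula, MCC = (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). The labels below are made-up toy values, not results from the paper, and returning 0.0 when the denominator is zero is a common convention assumed here.

```python
import math

def matthews_corrcoef(y_true, y_pred):
    """Binary MCC from 0/1 labels; returns 0.0 when the denominator is zero."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Imbalanced toy labels: 8 positives, 2 negatives (hypothetical).
y_true = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
always_positive = [1] * len(y_true)

accuracy = sum(t == p for t, p in zip(y_true, always_positive)) / len(y_true)
mcc_trivial = matthews_corrcoef(y_true, always_positive)
mcc_perfect = matthews_corrcoef(y_true, y_true)
```

In this toy case a classifier that always predicts the majority class reaches 80% accuracy but an MCC of 0, which is why MCC is a common choice for imbalanced datasets.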