AUTHOR=Wang Shuo, Zhang Hao, Liu Zhen, Liu Yuanning
TITLE=A Novel Deep Learning Method to Predict Lung Cancer Long-Term Survival With Biological Knowledge Incorporated Gene Expression Images and Clinical Data
JOURNAL=Frontiers in Genetics
VOLUME=Volume 13 - 2022
YEAR=2022
URL=https://www.frontiersin.org/journals/genetics/articles/10.3389/fgene.2022.800853
DOI=10.3389/fgene.2022.800853
ISSN=1664-8021
ABSTRACT=Lung cancer is the leading cause of cancer deaths. Therefore, predicting the survival status of lung cancer patients is of great value. However, existing methods mainly depend on statistical machine learning (ML) algorithms, which are not well suited to high-dimensional genomics data. Deep learning (DL), with its strong capability for learning from high-dimensional data, can instead be used to predict lung cancer survival from genomics data. The Cancer Genome Atlas (TCGA) is a database that contains many kinds of genomics data for 33 cancer types. With this enormous amount of data, researchers can analyse key factors related to cancer therapy. This paper proposes a novel method to predict lung cancer long-term survival using gene expression data from TCGA. First, we selected the genes most relevant to the target problem using a supervised feature selection method called the mutual information selector. Second, we proposed a method to convert gene expression data into two kinds of images, incorporating KEGG BRITE and KEGG Pathway data respectively, so that we could make good use of a CNN model to learn high-level features. Afterwards, we designed a CNN-based deep learning model and added two kinds of clinical data to improve performance, yielding a multi-modal deep learning model. The results of the generalized experiments indicated that our method performed much better than the machine learning models and uni-modal deep learning models.
Furthermore, we applied survival analysis to the test set and observed that our model could better divide the test samples into high-risk and low-risk groups.
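
The feature selection step described in the abstract (ranking genes by mutual information with the survival label) can be illustrated with a minimal sketch. The abstract does not say how mutual information was estimated, so everything here is an assumption made for illustration: the median-based binarization, the function names (`mutual_information`, `binarize`, `select_top_genes`), the gene names, and the toy values are hypothetical and are not taken from the paper or from TCGA.

```python
import math
from collections import Counter
from statistics import median


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def binarize(values):
    """Assumed discretization: 1 if above the gene's median, else 0."""
    m = median(values)
    return [1 if v > m else 0 for v in values]


def select_top_genes(expression, labels, k):
    """Rank genes by mutual information with the label; return the top k.

    expression: dict mapping gene name -> list of per-sample values
    labels:     list of 0/1 survival labels, one per sample
    """
    scores = {g: mutual_information(binarize(v), labels)
              for g, v in expression.items()}
    ranked = sorted(scores, key=lambda g: scores[g], reverse=True)
    return ranked[:k], scores


# Toy data, made up for illustration only.
expression = {
    "GENE_A": [0.1, 0.2, 0.3, 0.4, 2.1, 2.2, 2.3, 2.4],  # tracks the label
    "GENE_B": [1.0, 2.0, 1.1, 2.1, 1.2, 2.2, 1.3, 2.3],  # independent of it
}
labels = [0, 0, 0, 0, 1, 1, 1, 1]

selected, scores = select_top_genes(expression, labels, k=1)
print(selected)
```

In this toy case the gene whose binarized expression matches the label scores log 2 nats, while the gene that is independent of the label scores 0, so only the former is kept. A real pipeline would more likely use a continuous estimator on the full expression matrix; this sketch only shows the ranking idea.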