AUTHOR=He Binsheng, Zhang Yanxiang, Zhou Zhen, Wang Bo, Liang Yuebin, Lang Jidong, Lin Huixin, Bing Pingping, Yu Lan, Sun Dejun, Luo Huaiqing, Yang Jialiang, Tian Geng TITLE=A Neural Network Framework for Predicting the Tissue-of-Origin of 15 Common Cancer Types Based on RNA-Seq Data JOURNAL=Frontiers in Bioengineering and Biotechnology VOLUME=Volume 8 - 2020 YEAR=2020 URL=https://www.frontiersin.org/journals/bioengineering-and-biotechnology/articles/10.3389/fbioe.2020.00737 DOI=10.3389/fbioe.2020.00737 ISSN=2296-4185 ABSTRACT=Sequencing-based identification of tumor tissue-of-origin (TOO) is critical for patients with cancer of unknown primary (CUP) lesions. Even when the TOO of a tumor can be diagnosed by clinicopathological observation, re-evaluation by computational methods can also help avoid misdiagnosis. In this study, we developed a deep neural network (DNN) framework that uses the expression of a 150-gene panel to infer tumor TOO and is applicable to 15 common cancer types: lung, breast, liver, colorectal, gastroesophagus, ovary, cervix, endometrium, pancreas, bladder, head and neck, thyroid, prostate, kidney and brain. Specifically, we first downloaded RNA-Seq data for 7,460 tumor samples across the 15 cancer types from The Cancer Genome Atlas (TCGA), with 142 to 1,052 samples per type. Then, we performed feature selection by a Pearson correlation method and extracted a 150-gene panel whose genes were significantly enriched in functions such as GO:2001242 (regulation of intrinsic apoptotic signaling pathway) and GO:0009755 (hormone-mediated signaling pathway). After that, we developed a novel DNN model using the 150 genes to predict tumor TOO for the 15 cancer types.
The average prediction sensitivity and specificity of the framework are 94.85% and 99.66%, respectively, for the 7,460 tumor samples based on 10-fold cross-validation, and those for a few specific cancers such as prostate cancer can reach 100%. In summary, we present a highly accurate method to infer tumor tissue-of-origin, which could be useful in clinical practice.
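
The sensitivity and specificity quoted above are per-class quantities for a multi-class classifier, so it may help to see how such figures are typically obtained. The following is a minimal sketch, not the authors' code: it assumes a one-vs-rest definition per cancer type and an unweighted (macro) average across types, neither of which the abstract specifies. The function name, the integer label encoding, and the toy labels are all hypothetical.

```python
def per_class_sensitivity_specificity(y_true, y_pred, n_classes):
    """One-vs-rest sensitivity and specificity for each class.

    Assumes labels are integers in range(n_classes). This is an
    illustrative definition, not necessarily the one used in the paper.
    """
    # Confusion matrix: rows are true classes, columns are predicted classes.
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    total = sum(sum(row) for row in cm)

    sensitivity, specificity = [], []
    for k in range(n_classes):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                                # class k missed
        fp = sum(cm[r][k] for r in range(n_classes)) - tp   # others called k
        tn = total - tp - fn - fp
        # Guard against classes with no positives or no negatives.
        sensitivity.append(tp / (tp + fn) if tp + fn else float("nan"))
        specificity.append(tn / (tn + fp) if tn + fp else float("nan"))
    return sensitivity, specificity


# Toy example with 3 hypothetical classes (not data from the study).
y_true = [0, 0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 0, 1, 1, 1, 2, 2, 0]
sens, spec = per_class_sensitivity_specificity(y_true, y_pred, 3)

# Unweighted (macro) averages across classes; an assumed averaging scheme.
mean_sens = sum(sens) / len(sens)
mean_spec = sum(spec) / len(spec)
```

Under 10-fold cross-validation, one common choice is to pool the held-out predictions from all folds before computing these values, so that every sample contributes exactly once; averaging the per-fold values is another option and can give slightly different numbers.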