AUTHOR=Pan Mengmeng , Yang Pingping , Wang Fangce , Luo Xiu , Li Bing , Ding Yi , Lu Huina , Dong Yan , Zhang Wenjun , Xiu Bing , Liang Aibin TITLE=Whole Transcriptome Data Analysis Reveals Prognostic Signature Genes for Overall Survival Prediction in Diffuse Large B Cell Lymphoma JOURNAL=Frontiers in Genetics VOLUME=Volume 12 - 2021 YEAR=2021 URL=https://www.frontiersin.org/journals/genetics/articles/10.3389/fgene.2021.648800 DOI=10.3389/fgene.2021.648800 ISSN=1664-8021 ABSTRACT=BACKGROUND Although clinical treatment outcomes in diffuse large B cell lymphoma (DLBCL) have improved, the high relapse rate in DLBCL patients remains a barrier, because the selection of therapeutic strategies based on potential targets is still unsatisfactory. There is therefore an urgent need to further explore prognostic biomarkers so as to improve the prognosis of DLBCL. METHODS Univariable and multivariable Cox regression models were employed to screen for gene signatures that predict overall survival in DLBCL. Differential expression analysis, using Student's t-test and fold change, was used to identify representative genes in the high-risk and low-risk groups. Functional differences between the high-risk and low-risk groups were identified by gene set enrichment analysis. RESULTS We conducted a systematic data analysis to screen for candidate genes significantly associated with overall survival of DLBCL in three NCBI Gene Expression Omnibus (GEO) datasets. To construct a prognostic model, five genes (CEBPA, CYP27A1, LST1, MREG, and TARP) were then selected and tested using a multivariable Cox model with stepwise regression. Kaplan-Meier curves confirmed the good predictive performance of the five-gene Cox model. The prognostic model and the expression levels of the five genes were then validated in an independent dataset. 
All five genes were significantly associated with a favorable prognosis in DLBCL in both the training and validation datasets. Further analysis revealed the independence and superiority of the prognostic model in risk prediction. Functional enrichment analysis revealed key pathways that lead to unfavorable outcomes, as well as potential therapeutic targets in DLBCL. CONCLUSION We developed a five-gene Cox model for predicting clinical outcomes in DLBCL patients. In addition, potential drug selection using this model can help clinicians improve clinical practice for these patients.
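
The risk stratification described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the coefficients in `COEFS` are hypothetical placeholders (chosen negative only to mirror the statement that all five genes are favorable), the median cutoff is an assumption because the abstract does not state how the high-risk and low-risk groups were split, and the function names are invented for illustration. The sketch computes a Cox linear predictor per sample, splits samples at the median score, and estimates a Kaplan-Meier curve for a group.

```python
from statistics import median

# Hypothetical Cox coefficients, NOT taken from the paper. Negative values
# mean that higher expression lowers the risk score (a favorable gene).
COEFS = {"CEBPA": -0.30, "CYP27A1": -0.25, "LST1": -0.20, "MREG": -0.35, "TARP": -0.15}


def risk_score(expr):
    """Cox linear predictor: sum of coefficient * expression over the five genes."""
    return sum(COEFS[gene] * expr[gene] for gene in COEFS)


def stratify(samples):
    """Label each sample 'high' or 'low' risk using a median split (an assumption)."""
    scores = {sid: risk_score(expr) for sid, expr in samples.items()}
    cutoff = median(scores.values())
    return {sid: ("high" if score > cutoff else "low") for sid, score in scores.items()}


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as (time, survival probability) pairs at event times.

    times  -- follow-up time per patient
    events -- 1 if death was observed, 0 if the patient was censored
    """
    surv, curve = 1.0, []
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for x in times if x >= t)
        deaths = sum(1 for x, e in zip(times, events) if x == t and e)
        surv *= 1 - deaths / at_risk
        curve.append((t, surv))
    return curve
```

In practice the coefficients would come from fitting the multivariable Cox model on the training data, and the two Kaplan-Meier curves would be compared with a log-rank test, which this sketch omits.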