Machine Learning-Based Methods for RNA Data Analysis

41.5K views, 68 authors, 11 articles, 4 editors
Original Research
13 April 2021
Deep Learning Enables Fast and Accurate Imputation of Gene Expression
Ramon Viñas, 2 more and Pietro Liò
Per-gene imputation R² scores on genes from the Alzheimer's disease pathway. Each point represents the average R² score in a tissue type. We note that some genes in the pathway (e.g., PSMB6, COX6C, PSMD7, PSMA2, PSMD14, SDHB, TUBB1, TUBA8, FZD9, LPL, KIF5C, TUBB4A, TUBB2B, APOE) exhibited different distributions between brain and non-brain tissue types.

A question of fundamental biological significance is to what extent the expression of a subset of genes can be used to recover the full transcriptome, with important implications for biological discovery and clinical application. To address this challenge, we propose two novel deep learning methods, PMI and GAIN-GTEx, for gene expression imputation. To increase the applicability of our approach, we leverage data from GTEx v8, a reference resource that has generated a comprehensive collection of transcriptomes from a diverse set of human tissues. We show that our approaches compare favorably to several standard and state-of-the-art imputation methods in terms of predictive performance and runtime in two case studies and two imputation scenarios. In a comparison conducted on protein-coding genes, PMI attains the highest performance in inductive imputation, whereas GAIN-GTEx outperforms the other methods in in-place imputation. Furthermore, our results indicate strong generalization on RNA-Seq data from three cancer types across varying levels of missingness. Our work can facilitate a cost-effective integration of large-scale RNA biorepositories into genomic studies of disease, with high applicability across diverse tissue types.
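
The evaluation idea behind the abstract above (hide some expression values, impute them, score the result per gene with R²) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the per-gene mean imputer is a simple stand-in baseline rather than PMI or GAIN-GTEx, the data are synthetic, and all function names are hypothetical.

```python
import random


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")


def mask_entries(matrix, missing_rate, rng):
    """Boolean mask with the matrix's shape; True marks a value hidden from the imputer."""
    return [[rng.random() < missing_rate for _ in row] for row in matrix]


def impute_gene_means(matrix, mask):
    """Stand-in baseline (assumption): fill hidden values with the gene's observed mean."""
    imputed = [row[:] for row in matrix]
    for g in range(len(matrix[0])):
        observed = [row[g] for row, m in zip(matrix, mask) if not m[g]]
        fill = sum(observed) / len(observed) if observed else 0.0
        for i, m in enumerate(mask):
            if m[g]:
                imputed[i][g] = fill
    return imputed


def per_gene_r2(matrix, imputed, mask):
    """R² for each gene (column), computed only on the hidden entries."""
    scores = []
    for g in range(len(matrix[0])):
        hidden = [i for i, m in enumerate(mask) if m[g]]
        if len(hidden) < 2:
            scores.append(float("nan"))  # too few hidden values to score
            continue
        scores.append(r2_score([matrix[i][g] for i in hidden],
                               [imputed[i][g] for i in hidden]))
    return scores


# Synthetic samples x genes matrix; rows are samples, columns are genes.
rng = random.Random(0)
expression = [[rng.gauss(5.0, 1.0) for _ in range(3)] for _ in range(8)]
mask = mask_entries(expression, missing_rate=0.5, rng=rng)
print(per_gene_r2(expression, impute_gene_means(expression, mask), mask))
```

Scoring only the hidden entries keeps the observed values from inflating the metric; repeating the run at several `missing_rate` values gives a rough picture of how a method degrades as missingness grows.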

Original Research
21 January 2021
A Novel Framework to Predict Breast Cancer Prognosis Using Immune-Associated LncRNAs
Zhijian Huang, 6 more and Nani Li

Background: Breast cancer (BC) is one of the most frequently diagnosed malignancies among females. Because BC is a highly heterogeneous malignancy, reliable molecular biomarkers are needed to stratify patients. We surveyed immune-associated lncRNAs that may be used as potential therapeutic targets in BC.

Methods: LncRNA expression data and clinical information of BC patients were downloaded from the TCGA database for a comprehensive analysis of candidate genes. A model consisting of immune-related lncRNAs enriched in BC cancerous tissues was established using univariate Cox regression analysis and iterative Lasso Cox regression analysis. The prognostic performance of this model was validated in two independent cohorts (GSE21653 and BC-KR) and compared with known prognostic biomarkers. A nomogram that integrated the immune-related lncRNA signature and clinicopathological factors was constructed to assess the prognostic value of this signature. The correlation between the signature and immune cell infiltration in BC was also analyzed.
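
A signature fitted with Cox regression is commonly applied as a linear risk score: each lncRNA's expression is weighted by its coefficient, and patients are split into risk groups at a cutoff. The sketch below assumes that convention and a median cutoff; the abstract does not state the cutoff used, and the lncRNA names, coefficients, and patient values here are hypothetical placeholders, not values from the study.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Linear predictor: sum of coefficient * expression over the signature lncRNAs."""
    return sum(coefficients[name] * expression[name] for name in coefficients)


def stratify(patients, coefficients):
    """Label each patient 'high' or 'low' risk using a median cutoff (assumption)."""
    scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}
    cutoff = median(scores.values())
    return {pid: ("high" if score > cutoff else "low") for pid, score in scores.items()}


# Hypothetical coefficients and expression values, for illustration only.
coefficients = {"lncA": 0.5, "lncB": -0.25}
patients = {
    "p1": {"lncA": 1.0, "lncB": 2.0},
    "p2": {"lncA": 4.0, "lncB": 0.0},
    "p3": {"lncA": 2.0, "lncB": 2.0},
}
print(stratify(patients, coefficients))
```

A positive coefficient raises the score as expression increases, and a negative coefficient lowers it, so the sign of each weight indicates whether the lncRNA is associated with worse or better outcome in the fitted model.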

Results: Kaplan-Meier analysis showed that patients in the low-risk group had significantly better overall survival (OS) than those in the high-risk group. Clinical subgroup analysis showed that the predictive ability was independent of clinicopathological factors. Univariate and multivariate Cox regression analyses showed that the immune lncRNA signature is an important and independent prognostic factor. In addition, GSEA and GSVA analyses, as well as a comprehensive analysis of immune cells, showed that the signature was significantly correlated with the infiltration of immune cells.
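
For readers unfamiliar with the Kaplan-Meier analysis mentioned above, the estimator itself is short enough to sketch. At each time with at least one observed death, the running survival probability is multiplied by (1 - deaths / number at risk); censored patients leave the risk set without lowering the curve. The follow-up times and event flags below are made up for illustration and are not data from the study.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times:  follow-up time for each patient
    events: True if death was observed at that time, False if censored
    Returns a list of (time, survival_probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share this follow-up time.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                deaths += 1
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored patients both leave the risk set
    return curve


# Made-up follow-up data for one risk group (illustration only).
times = [1, 2, 2, 3, 4]
events = [True, True, False, True, False]
for t, s in kaplan_meier(times, events):
    print(t, round(s, 3))
```

Running the function once per risk group yields the two curves that such an analysis compares; testing whether they differ significantly (for example with a log-rank test) is a separate step not shown here.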

Conclusion: We successfully constructed an immune-associated lncRNA signature that can accurately predict BC prognosis.
