RNA, a type of nucleic acid, is one of the four fundamental classes of macromolecules essential to all known life forms. Whereas DNA (deoxyribonucleic acid) typically serves as the primary genetic material in cells, many viruses use RNA as their genetic material. RNA viruses are known for their ability to mutate rapidly, and the emergence of novel strains and variants (Yin et al., 2020) is potentially responsible for a wide range of diseases, leading to epidemics or pandemics such as the swine-origin flu pandemic (Yin et al., 2018) and COVID-19 (V’kovski et al., 2021; Yin et al., 2018; Ding and Xu, 2023). In addition, RNA plays critical roles in various biological processes, including gene expression and protein synthesis (Frye et al., 2018). Understanding the mechanisms and roles of RNA in disease pathogenesis and progression is crucial for advancing our knowledge of human biology and developing optimized therapeutic strategies to combat RNA-related diseases. Computational approaches such as machine learning and statistics have attracted much attention in this field owing to the increasing availability of diverse RNA datasets (Yin et al., 2022; Li et al., 2023; Yin et al., 2023). This Research Topic of Frontiers in Genetics features a collection of the latest advances in developing and applying computational methods to analyze RNA data, with a focus on non-coding RNAs (e.g., miRNA, lncRNA) and RNA viruses (e.g., influenza, coronavirus).
Non-coding RNAs (ncRNAs) do not encode proteins but are crucial for regulating gene expression at both the transcriptional and post-transcriptional levels (Winkle et al., 2021). In particular, microRNAs (miRNAs) are small, single-stranded ncRNAs, about 19–25 nucleotides long, that have highly conserved sequences and regulate gene expression at the post-transcriptional level. Extensive research on miRNAs in the context of development and disease has established them as compelling targets for innovative therapeutic approaches (Shen et al., 2020a; Shen et al., 2020b; Li Peng et al., 2022). In this Research Topic, Luo et al. presented a comprehensive overview of recent progress in miRNA-targeted therapeutics employing machine learning techniques. In addition to discussing resources and preprocessing of pharmacogenomic data, they reviewed the main machine learning algorithms employed in identifying miRNA-disease associations. Given the limitations of current methods in constructing negative sample sets, Wei et al. introduced a clustering-based sampling approach called CSMDA to predict miRNA-disease associations. This method aims to address the problem of negative sample selection in miRNA-disease association prediction. Under five-fold cross-validation, CSMDA achieved an area under the curve (AUC) of 0.9610. Additionally, validation against the dbDEMC database confirmed that all predicted miRNAs, except hsa-mir-34c, were associated with colon cancer.
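The general idea of clustering-based negative sampling can be sketched as follows. This is only an illustration under stated assumptions, not the CSMDA algorithm of Wei et al.: the function names `kmeans` and `sample_negatives`, the use of k-means, and the toy feature vectors are all hypothetical. Unlabeled miRNA-disease pairs are grouped by their feature vectors, and an equal number of presumed negatives is drawn from each cluster so that the negative set covers the feature space rather than a single dense region.

```python
import random


def kmeans(points, k, iters=20):
    """Tiny k-means; returns one cluster index per point."""
    centers = [list(p) for p in points[:k]]  # deterministic init: first k points
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def sample_negatives(unlabeled, features, k, n_per_cluster, seed=0):
    """Draw up to n_per_cluster presumed negatives from each cluster."""
    rng = random.Random(seed)
    labels = kmeans(features, k)
    picked = []
    for c in range(k):
        members = [pair for pair, lab in zip(unlabeled, labels) if lab == c]
        picked.extend(rng.sample(members, min(n_per_cluster, len(members))))
    return picked


# Hypothetical unlabeled (miRNA, disease) pairs with 2-D toy features.
pairs = [("miR-a", "d1"), ("miR-b", "d1"), ("miR-c", "d1"),
         ("miR-d", "d1"), ("miR-e", "d1"), ("miR-f", "d1")]
feats = [(0.0, 0.0), (5.0, 5.0), (0.1, 0.0), (5.1, 5.0), (0.0, 0.1), (5.0, 5.1)]
negatives = sample_negatives(pairs, feats, k=2, n_per_cluster=1)
```

Sampling per cluster, rather than uniformly over all unlabeled pairs, is one simple way to keep the selected negatives diverse; other selection rules are equally plausible.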
Long non-coding RNAs (lncRNAs) are a subset of ncRNAs longer than 200 nucleotides. They have important functions in controlling gene expression at various levels, including translational, transcriptional, and epigenetic processes (Qin et al., 2020). LncRNAs are crucial in controlling genes and proteins related to a range of human diseases, such as cancer (Xiao et al., 2018), digestive system disorders, and heart problems. Their role in disease regulation is well established and holds promise for future therapies. Yao et al. proposed a computational model called GCHIRFLDA, which uses geometric complement heterogeneous information and random forest to predict lncRNA-disease associations. Under five-fold cross-validation, GCHIRFLDA achieved an AUC of 0.9897 and an area under the precision-recall curve (AUPR) of 0.7040. The study reported that 18 of the predicted lncRNAs were validated by records in databases or published literature. Meanwhile, the inherent sparsity of known heterogeneous biological data poses a challenge for computational methods aiming to improve prediction accuracy. Thus, Zhang et al. explored a novel multiple-mechanism approach, MM-LDA, to discover underlying lncRNA-disease associations (LDAs). By integrating a graph attention network (GAT) and inductive matrix completion (IMC), this approach boosts prediction accuracy. First, a multiple-operator aggregation was created as part of the n-head attention mechanism in the GAT. Then, IMC was applied to the improved node features, and the LDA network was reconstructed to address the cold start problem caused by insufficient data in entire rows or columns of the known association matrix. Under five-fold cross-validation, MM-LDA achieved an AUC of 0.9395 and an AUPR of 0.8057. The results from MM-LDA suggested potential links of HOTAIR and HTTAS with gastric cancer.
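The evaluation protocol mentioned above (five-fold cross-validation scored by AUC and AUPR) can be illustrated with a minimal sketch. The function names are hypothetical, the fold assignment is a plain shuffled split, and AUPR is approximated here by average precision, which is one common estimator; the studies summarized above may have computed these metrics differently.

```python
import random


def k_fold_indices(n, k=5, seed=0):
    """Shuffle sample indices and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def auc(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values at the rank of each true positive."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Toy predictions for four candidate associations (1 = known association).
y_true = [1, 0, 1, 0]
y_score = [0.9, 0.8, 0.7, 0.1]
print(auc(y_true, y_score), average_precision(y_true, y_score))
```

In a full experiment, each fold returned by `k_fold_indices` would be held out in turn, the model trained on the remaining folds, and the two metrics averaged over the five held-out folds. Because known associations are usually far outnumbered by unknown pairs, AUPR is often reported alongside AUC.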
In recent years, the competing endogenous RNA (ceRNA) hypothesis has been proposed (Salmena et al., 2011). Under this hypothesis, lncRNAs can function as endogenous molecular sponges for miRNAs, indirectly regulating the expression of messenger RNAs (mRNAs). Because of the intricate nature of the lncRNA-miRNA-mRNA network, its dysregulation is closely linked to the onset and progression of various human diseases. For example, Ye et al. (2019) discovered that the lncRNA MIAT increases the expression of CD47 by acting as a sponge for miR-149-5p, leading to the inhibition of efferocytosis in advanced atherosclerosis. Yang et al. (2021) showed that lncRNA XIST acts as a ceRNA, promoting atherosclerosis by upregulating TLR4 expression through miR-599. Additionally, several putative ceRNA networks have been identified, including those associated with implantation failure (Feng et al., 2018), polycystic ovary syndrome (Ma et al., 2021), and epithelial ovarian cancer (Zhao et al., 2019). Chen et al. employed the CIBERSORT algorithm to investigate the potential ceRNA-related mechanism of peripheral arterial occlusive disease (PAOD) and to identify the associated patterns of immune cell infiltration. They developed an immune-related core ceRNA network that offered valuable insights into the molecular mechanisms underlying PAOD. This network consists of CREB1, LINC00221, miR-20b-5p, and miR-17-5p, along with the infiltrating immune cells, specifically M1 macrophages and monocytes. Luo et al. introduced a lncRNA–mRNA network based on POI (POILMN) to identify essential lncRNAs. This research yielded a set of 288 differentially expressed mRNAs and 244 differentially expressed lncRNAs. Ultimately, through topological analysis, POILMN identified four intersecting lncRNAs based on two centralities, namely, degree and betweenness.
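The topological step described above, ranking nodes by degree and by betweenness and keeping those that rank highly on both, can be sketched as follows. This is a minimal illustration, not the POILMN pipeline: the toy edge list, the node names, and the cut-off `top_n` are hypothetical. Betweenness is computed with Brandes' algorithm for an unweighted, undirected graph and left unnormalized.

```python
from collections import deque


def build_adjacency(edges):
    """Undirected adjacency sets from a list of (u, v) edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_centrality(adj):
    """Number of neighbours of each node."""
    return {v: len(nbrs) for v, nbrs in adj.items()}


def betweenness_centrality(adj):
    """Brandes' algorithm (unweighted, undirected, unnormalized)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        stack, preds = [], {v: [] for v in adj}
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        dist = {v: -1 for v in adj}   # BFS distance from s
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate pair dependencies in order of decreasing distance.
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: b / 2 for v, b in bc.items()}


def top_nodes(scores, candidates, top_n):
    """Top-n candidates by score, ties broken alphabetically."""
    ranked = sorted(candidates, key=lambda v: (-scores[v], v))
    return set(ranked[:top_n])


# Hypothetical lncRNA-mRNA co-expression edges.
edges = [("lnc1", "m1"), ("lnc1", "m2"), ("lnc1", "m3"),
         ("lnc2", "m3"), ("lnc2", "m4"), ("lnc3", "m4")]
adj = build_adjacency(edges)
lncrnas = [v for v in adj if v.startswith("lnc")]
hubs = (top_nodes(degree_centrality(adj), lncrnas, 2)
        & top_nodes(betweenness_centrality(adj), lncrnas, 2))
```

Intersecting the two rankings favours lncRNAs that are both highly connected and positioned on many shortest paths, which is the intuition behind using two centralities rather than one.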
Circular RNAs (circRNAs) are a class of ncRNAs that form covalently closed loop structures (Li et al., 2020; Xiao et al., 2020; Peng et al., 2022; Peng et al., 2023). CircRNA molecules have been observed or artificially synthesized in various organisms, including mammals (Xu and Zhang, 2021) and viruses (Tan and Lim, 2021). Interactions between miRNAs and circRNAs have been shown to modify gene expression and play a regulatory role in diseases. Therefore, He et al. introduced a novel approach called GCNCMI, which uses a graph convolutional network (GCN) to uncover latent associations between miRNAs and circRNAs. GCNCMI first examines the underlying connections between neighboring nodes in the graph. It then iteratively propagates this connection information across the graph convolutional layers. Finally, the embeddings produced by each layer are combined to output the final prediction results. GCNCMI achieved an AUC of 0.9312 and an AUPR of 0.9412. The results showed that 8 predicted interactions involving hsa-miR-149-5p and 7 involving hsa-miR-622 were validated.
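The propagate-then-combine pattern described above can be sketched in a simplified form. This is not the GCNCMI implementation: the sketch assumes a symmetrically normalized adjacency matrix, omits learnable weights and nonlinearities, combines the layers by simple averaging, and scores a pair by a dot product. All function names are hypothetical.

```python
import math


def normalize_adjacency(adj):
    """Symmetric normalization: A_hat[i][j] = A[i][j] / sqrt(deg_i * deg_j)."""
    deg = [sum(row) for row in adj]
    n = len(adj)
    return [[adj[i][j] / math.sqrt(deg[i] * deg[j]) if adj[i][j] else 0.0
             for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def propagate(adj, emb, n_layers):
    """Spread embeddings over the graph and average all layer outputs."""
    a_hat = normalize_adjacency(adj)
    layers = [emb]
    for _ in range(n_layers):
        layers.append(matmul(a_hat, layers[-1]))
    n, d = len(emb), len(emb[0])
    return [[sum(layer[i][k] for layer in layers) / len(layers)
             for k in range(d)] for i in range(n)]


def score(final_emb, i, j):
    """Association score of nodes i and j as a dot product of embeddings."""
    return sum(x * y for x, y in zip(final_emb[i], final_emb[j]))


# Toy graph: nodes 0-1 are miRNAs, nodes 2-3 are circRNAs (hypothetical).
adj = [[0, 0, 1, 0],
       [0, 0, 1, 1],
       [1, 1, 0, 0],
       [0, 1, 0, 0]]
emb = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 1.0]]
final = propagate(adj, emb, n_layers=2)
print(score(final, 0, 3))  # score for the unobserved pair (miRNA 0, circRNA 3)
```

Keeping the output of every layer, rather than only the last one, lets the final embedding mix information from near and distant neighbors; in a trained model the initial embeddings would be learned rather than fixed.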
Additionally, mitochondrial dysfunction could be among the molecular mechanisms implicated in obstructive sleep apnea (OSA) and its concurrent conditions. Although several studies have reported the involvement of various proteins and miRNAs in OSA (Targa et al., 2020; Pinilla et al., 2021), the impact of OSA on genes and pathways, particularly concerning mitochondrial dysfunction, remains largely unexplored. A previous study by Li et al. (2017) reported differentially expressed miRNAs in OSA, but did not establish their specific association with mitochondrial dysfunction. Liu et al. developed a novel diagnostic model consisting of a four-gene signature related to mitochondrial dysfunction. Based on the expression of genes related to mitochondrial dysfunction, all samples were categorized into two clusters, and the OSA samples were further subdivided into three clusters. Compared with control samples, the OSA samples showed significant differences in the levels of M0 and M1 macrophages as well as plasma cells. Additionally, among the mitochondrial dysfunction-related clusters of OSA samples, various immune cell types, particularly T cells, showed significant differences.
Although multiple databases offer information on virus-host protein interactions, they often lack detailed information about strain-specific virulence factors or the specific protein domains implicated in the interactions (Yin et al., 2017; Yin et al., 2021). Several databases have incomplete coverage of influenza strains because of the challenge of sifting through extensive literature to gather comprehensive information. No existing database provides complete records of strain-specific protein-protein interactions for all types of influenza A virus (IAV). To help address this gap, Ng et al. presented an innovative network that predicts domain-domain interactions between proteins from the mouse host and IAV. By incorporating vital virulence details such as lethal dose, this network facilitates a methodical exploration of disease factors. The authors built the network from interacting protein domains of both mouse and viral proteins, representing the domains as nodes and their interactions as weighted edges.
In summary, this Research Topic centers on the recent progress in utilizing and refining diverse computational methods, including machine learning and statistical techniques, to analyze RNA data related to RNA viruses and non-coding RNA. As a result, these analyses have delved into the biological disease mechanisms and aided in the understanding of human diseases, leading to improved preventive measures, diagnoses, and treatments.
Statements
Author contributions
PD: Conceptualization, Formal Analysis, Writing–original draft, Writing–review and editing. MZ: Conceptualization, Formal Analysis, Writing–original draft, Writing–review and editing. RY: Conceptualization, Funding acquisition, Writing–original draft, Writing–review and editing.
Funding
This study was partially supported by grants from Centers for Disease Control and Prevention (1U18DP006512), National Institute of Environmental Health Sciences (R21ES032762) and the NIH National Center for Advancing Translational Sciences (UL1TR001427).
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
References
Ding, P., and Xu, R. (2023). Causal association of COVID-19 with brain structure changes: findings from a non-overlapping 2-sample mendelian randomization study. medRxiv 2023.07.16.23292735.
Feng, C., Shen, J. M., Lv, P. P., Jin, M., Wang, L. Q., Rao, J. P., et al. (2018). Construction of implantation failure related lncRNA-mRNA network and identification of lncRNA biomarkers for predicting endometrial receptivity. Int. J. Biol. Sci. 14, 1361–1377. doi: 10.7150/ijbs.25081
Frye, M., Harada, B. T., Behm, M., and He, C. (2018). RNA modifications modulate gene expression during development. Science 361, 1346–1349. doi: 10.1126/science.aau1646
Li, G., Luo, J., Wang, D., Liang, C., Xiao, Q., Ding, P., et al. (2020). Potential circRNA-disease association prediction using DeepWalk and network consistency projection. J. Biomed. Inf. 112, 103624. doi: 10.1016/j.jbi.2020.103624
Li, K., Wei, P., Qin, Y., and Wei, Y. (2017). MicroRNA expression profiling and bioinformatics analysis of dysregulated microRNAs in obstructive sleep apnea patients. Medicine 96, e7917. doi: 10.1097/MD.0000000000007917
Li, M., Zhao, B., Yin, R., Lu, C., Guo, F., and Zeng, M. J. (2023). GraphLncLoc: long non-coding RNA subcellular localization prediction using graph convolutional networks based on sequence to graph transformation. Briefings Bioinforma. 24, bbac565. doi: 10.1093/bib/bbac565
Li Peng, Y. T., Huang, L., Yang, L., Fu, X., and Chen, X. (2022). Daestb: inferring associations of small molecule–miRNA via a scalable tree boosting model based on deep autoencoder. Briefings Bioinforma. 23, bbac478. doi: 10.1093/bib/bbac478
Ma, Y., Ma, L., Cao, Y., and Zhai, J. (2021). Construction of a ceRNA-based lncRNA-mRNA network to identify functional lncRNAs in polycystic ovarian syndrome. Aging (Albany NY) 13, 8481–8496. doi: 10.18632/aging.202659
Peng, L., Yang, C., Chen, Y., and Liu, W. (2023). Predicting CircRNA-disease associations via feature convolution learning with heterogeneous graph attention network. IEEE J. Biomed. Health Inf. 27, 3072–3082. doi: 10.1109/JBHI.2023.3260863
Peng, L., Yang, C., Huang, L., Chen, X., Fu, X., and Liu, W. (2022). Rnmflp: predicting circRNA–disease associations based on robust nonnegative matrix factorization and label propagation. Briefings Bioinforma. 23, bbac155. doi: 10.1093/bib/bbac155
Pinilla, L., Barbe, F., and De Gonzalo-Calvo, D. J. (2021). MicroRNAs to guide medical decision-making in obstructive sleep apnea: A review. Sleep Med. Rev. 59, 101458. doi: 10.1016/j.smrv.2021.101458
Qin, T., Li, J., and Zhang, K. Q. (2020). Structure, regulation, and function of linear and circular long non-coding RNAs. Front. Genet. 11, 150. doi: 10.3389/fgene.2020.00150
Salmena, L., Poliseno, L., Tay, Y., Kats, L., and Pandolfi, P. P. (2011). A ceRNA hypothesis: the rosetta stone of a hidden RNA language? Cell 146, 353–358. doi: 10.1016/j.cell.2011.07.014
Shen, C., Luo, J., Lai, Z., and Ding, P. (2020a). Multiview joint learning-based method for identifying small-molecule-associated MiRNAs by integrating pharmacological, genomics, and network knowledge. J. Chem. Inf. Model. 60, 4085–4097. doi: 10.1021/acs.jcim.0c00244
Shen, C., Luo, J., Ouyang, W., Ding, P., and Wu, H. (2020b). Identification of small molecule–miRNA associations with graph regularization techniques in heterogeneous networks. J. Chem. Inf. Model. 60, 6709–6721. doi: 10.1021/acs.jcim.0c00975
Tan, K. E., and Lim, Y. (2021). Viruses join the circular RNA world. FEBS J. 288, 4488–4502. doi: 10.1111/febs.15639
Targa, A., Dakterzada, F., Benítez, I., De Gonzalo-Calvo, D., Moncusí-Moix, A., López, R., et al. (2020). Circulating MicroRNA profile associated with obstructive sleep apnea in alzheimer’s disease. Mol. Neurobiol. 57, 4363–4372. doi: 10.1007/s12035-020-02031-z
V’kovski, P., Kratzel, A., Steiner, S., Stalder, H., and Thiel, V. (2021). Coronavirus biology and replication: implications for SARS-CoV-2. Nat. Rev. Microbiol. 19, 155–170. doi: 10.1038/s41579-020-00468-6
Winkle, M., El-Daly, S. M., Fabbri, M., and Calin, G. (2021). Noncoding RNA therapeutics - challenges and potential solutions. Nat. Rev. Drug Discov. 20, 629–651. doi: 10.1038/s41573-021-00219-z
Xiao, Q., Luo, J., Liang, C., Li, G., Cai, J., Ding, P., et al. (2018). Identifying lncRNA and mRNA co-expression modules from matched expression data in ovarian cancer. IEEE/ACM Trans. Comput. Biol. Bioinforma. 17, 623–634. doi: 10.1109/TCBB.2018.2864129
Xiao, Q., Yu, H., Zhong, J., Liang, C., Li, G., Ding, P., et al. (2020). An in-silico method with graph-based multi-label learning for large-scale prediction of circRNA-disease associations. Genomics 112, 3407–3415. doi: 10.1016/j.ygeno.2020.06.017
Xu, C., and Zhang, J. (2021). Mammalian circular RNAs result largely from splicing errors. Cell Rep. 36, 109439. doi: 10.1016/j.celrep.2021.109439
Yang, K., Xue, Y., and Gao, X. (2021). LncRNA XIST promotes atherosclerosis by regulating miR-599/TLR4 axis. Inflammation 44, 965–973. doi: 10.1007/s10753-020-01391-x
Ye, Z. M., Yang, S., Xia, Y. P., Hu, R. T., Chen, S., Li, B. W., et al. (2019). LncRNA MIAT sponges miR-149-5p to inhibit efferocytosis in advanced atherosclerosis through CD47 upregulation. Cell Death Dis. 10, 138. doi: 10.1038/s41419-019-1409-4
Yin, R., Luo, Z., Zhuang, P., Lin, Z., and Kwoh, C. (2021). VirPreNet: a weighted ensemble convolutional neural network for the virulence prediction of influenza A virus using all eight segments. Bioinformatics 37, 737–743. doi: 10.1093/bioinformatics/btaa901
Yin, R., Luo, Z., Zhuang, P., Zeng, M., Li, M., Lin, Z., et al. (2023). ViPal: a framework for virulence prediction of influenza viruses with prior viral knowledge using genomic sequences. J. Biomed. Inf. 142, 104388. doi: 10.1016/j.jbi.2023.104388
Yin, R., Luusua, E., Dabrowski, J., Zhang, Y., and Kwoh, C. (2020). Tempel: time-series mutation prediction of influenza A viruses via attention-based recurrent neural networks. Bioinformatics 36, 2697–2704. doi: 10.1093/bioinformatics/btaa050
Yin, R., Tran, V. H., Zhou, X., Zheng, J., and Kwoh, C. (2018). Predicting antigenic variants of H1N1 influenza virus based on epidemics and pandemics using a stacking model. PLoS One 13, e0207777. doi: 10.1371/journal.pone.0207777
Yin, R., Zhou, X., Ivan, F. X., Zheng, J., Chow, V. T., and Kwoh, C. K. (2017). “Identification of potential critical virulent sites based on hemagglutinin of influenza a virus in past pandemic strains,” in Proceedings of the 6th International Conference on Bioinformatics and Biomedical Science, Singapore, June 22 - 24, 2017, 30–36.
Yin, R., Zhu, X., Zeng, M., Wu, P., Li, M., and Kwoh, C. (2022). A framework for predicting variable-length epitopes of human-adapted viruses using machine learning methods. Briefings Bioinforma. 23, bbac281. doi: 10.1093/bib/bbac281
Zhao, X., Tang, D. Y., Zuo, X., Zhang, T. D., and Wang, C. (2019). Identification of lncRNA–miRNA–mRNA regulatory network associated with epithelial ovarian cancer cisplatin-resistant. J. Cell. Physiol. 234, 19886–19894. doi: 10.1002/jcp.28587
Summary
Keywords
RNA, miRNA, lncRNA, RNA virus, ceRNA network, machine learning/statistics, human disease
Citation
Ding P, Zeng M and Yin R (2023) Editorial: Computational methods to analyze RNA data for human diseases. Front. Genet. 14:1270334. doi: 10.3389/fgene.2023.1270334
Received
31 July 2023
Accepted
14 August 2023
Published
22 August 2023
Volume
14 - 2023
Edited and reviewed by
Fangqing Zhao, Beijing Institutes of Life Science (CAS), China
Copyright
© 2023 Ding, Zeng and Yin.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Pingjian Ding, pxd210@case.edu; Min Zeng, zengmin@csu.edu.cn; Rui Yin, ruiyin@ufl.edu