ORIGINAL RESEARCH article
Front. Genet.
Sec. Computational Genomics
Volume 16 - 2025 | doi: 10.3389/fgene.2025.1658577
Cross-Modal Predictive Modeling of Multi-Omic Data in 3D Airway Organ Tissue Equivalents During Viral Infection
Provisionally accepted- Wake Forest University School of Medicine, Winston-Salem, United States
Developing robust predictive models from multi-omics data is challenging because sample sizes are typically small (often fewer than 100) while the feature space is vast (over 20,000 molecular features such as genes, transcripts, and proteins), which increases the risk of overfitting and limits generalizability. To address this challenge, this study introduces the Magnitude-Altitude Score Analysis for Tracking Infection and Time-Dependent Genes (MASIT), a novel method that filters out irrelevant features/genes while retaining the important ones. We applied MASIT to a 3D airway organ tissue equivalent model that mimics human airway physiology, using both RNA-Seq and NanoString technologies for a comprehensive analysis. RNA-Seq offered a transcriptomic overview of 19,671 protein-coding genes, whereas NanoString targeted 773 specific genes. We used MASIT to analyze gene expression changes in the airway tissue equivalent after exposure to Influenza A virus, Human metapneumovirus, and Parainfluenza virus type 3 at 24 and 72 hours post-infection. MASIT was trained and validated on NanoString data and tested on the held-out RNA-Seq test set, where it achieved 92% accuracy in differentiating eight groups of infected samples. Models built on MASIT-selected genes outperformed models using the full gene set, notably with algorithms such as Random Forest, XGBoost, and AdaBoost. Selected genes such as IFIT1, IFIT2, IFIT3, OASL, IFI44, and OAS3 were particularly effective in categorizing samples by viral type and infection stage, providing insights into the host's molecular response to viral infections. Benchmarking against widely used feature selection approaches, including Fisher score, Minimum Redundancy Maximum Relevance (mRMR), embedded Lasso regression, and Boruta feature importance, further demonstrated that MASIT not only exceeded their performance within NanoString data but was also the only method that maintained high accuracy and stability when applied to held-out RNA-Seq data.
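To make the idea of filter-based feature selection concrete, the following is a minimal sketch of the Fisher score, one of the benchmark methods named in the abstract above. It is not MASIT, whose scoring rule is not described on this page. The function names, gene labels, and expression values are hypothetical and chosen only for illustration.

```python
from statistics import fmean, pvariance


def fisher_scores(X, y):
    """Return one Fisher score per feature (column) of X.

    X is a list of samples, each a list of feature values; y holds the
    class label of each sample. The score is the between-class scatter
    divided by the within-class scatter, so higher means more separable.
    """
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        overall_mean = fmean(column)
        between = 0.0
        within = 0.0
        for c in classes:
            values = [v for v, label in zip(column, y) if label == c]
            between += len(values) * (fmean(values) - overall_mean) ** 2
            within += len(values) * pvariance(values)
        if within > 0:
            scores.append(between / within)
        else:
            # No within-class spread: perfectly separated or constant.
            scores.append(float("inf") if between > 0 else 0.0)
    return scores


def top_k_features(X, y, names, k):
    """Return the names of the k highest-scoring features."""
    ranked = sorted(zip(names, fisher_scores(X, y)),
                    key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ranked[:k]]


# Toy data (made up): 4 samples x 3 hypothetical genes, two classes.
names = ["GENE_A", "GENE_B", "GENE_C"]
X = [
    [1.0, 3.0, 2.0],
    [1.2, 3.1, 4.0],
    [5.0, 2.9, 3.0],
    [5.2, 3.0, 5.0],
]
y = ["mock", "mock", "infected", "infected"]

selected = top_k_features(X, y, names, k=1)
print(selected)
```

In a cross-platform setting like the one described, a ranking of this kind would be computed on the training platform only and restricted to genes measured on both platforms, so that the held-out data never influences which features are kept.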
Keywords: Predictive Modeling, 3D airway organ tissue equivalent (OTEs), viral infection, RNA-seq data, NanoString technologies, differential expression analysis
Received: 02 Jul 2025; Accepted: 28 Aug 2025.
Copyright: © 2025 Rezapour, McNutt, Ornelles, Walker, Murphy, Atala and Gurcan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Mostafa Rezapour, Wake Forest University School of Medicine, Winston-Salem, United States
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.