Omics and Computational Modeling Approaches for the Effective Treatment of Drug-Resistant Cancer Cells

Chemotherapy is a mainstream cancer treatment, but it faces the persistent challenge of drug resistance, which leads to poor prognosis. To better understand and more effectively treat drug-resistant cancer cells, omics approaches have been widely applied in various forms. A notable use of omics data beyond routine data mining is computational modeling, which generates useful predictions such as drug responses and prognostic biomarkers. In particular, an increasing volume of omics data has facilitated the development of machine learning models. In this mini review, we highlight recent studies on the use of multi-omics data for studying drug-resistant cancer cells. We put a particular focus on studies that use computational models to characterize drug-resistant cancer cells, and to predict biomarkers and/or drug responses. Computational models covered in this mini review include network-based models, machine learning models and genome-scale metabolic models. We also provide perspectives on future research opportunities for combating drug-resistant cancer cells.


INTRODUCTION
Drug resistance has been a major obstacle to the successful treatment of cancers: over 90% of the mortality among cancer patients appears to be associated with drug resistance (Bukowski et al., 2020). Drug resistance is a phenotypic state that arises as a result of a complex interplay between genetic and non-genetic mechanisms (Marine et al., 2020). Such genetic and non-genetic reprogramming leads to drug resistance through various mechanisms (Gatti and Zunino, 2005;Housman et al., 2014;Zheng, 2017;Lim and Ma, 2019;Vasan et al., 2019;Bukowski et al., 2020), including: drug inactivation, for example by an excessive level of glutathione that detoxifies xenobiotics (Jiang et al., 2017;De Luca et al., 2019); alteration of a drug target by mutations or changes in its expression level (Likhite et al., 2006;Costa et al., 2008); drug efflux by transporters (Giddings et al., 2021); an enhanced DNA damage repair system (Harte et al., 2014); dysregulated autophagy (Martin et al., 2017;Cai et al., 2019); epithelial-mesenchymal transition (EMT) (Fischer et al., 2015;Zheng et al., 2015); or heterogeneity of a cancer cell population that contains cancer stem cells (Seth et al., 2019;Zhao et al., 2021). Drug resistance is thus a highly complex phenotype whose study requires multidimensional approaches.
Omics technologies have now become indispensable for characterizing mechanisms of cancer progression, and for identifying effective biomarkers and treatment targets for cancers. For this reason, large-scale projects have been launched to generate omics data of various cancer cells. A recent representative example is the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA), which has allowed advanced studies on gene mutations and gene expression profiles across cancers (Consortium, 2020). The datasets resulting from such large-scale efforts have proven useful for studying drug-resistant cancer cells. Relevant representative datasets include the NCI-60 Human Tumor Cell Lines Screen (Shoemaker, 2006), the Genomics of Drug Sensitivity in Cancer (GDSC) (Yang et al., 2013), TCGA (Cancer Genome Atlas Research et al., 2013), the Cancer Therapeutic Response Portal (CTRP) (Seashore-Ludlow et al., 2015), L1000 profiles from The Library of Integrated Network-Based Cellular Signatures (LINCS) Program (Subramanian et al., 2017), the Cancer Cell Line Encyclopedia (CCLE) (Ghandi et al., 2019), and the Catalogue Of Somatic Mutations In Cancer (COSMIC) (Tate et al., 2019). All these datasets have served as a source of novel insights that help characterize and overcome drug-resistant cancer cells. In particular, an increasing volume of such large-scale datasets is expected to facilitate the development of various computational models that will better systematize our approaches to studying drug-resistant cancer cells.
Here, we review recent studies that utilized multi-omics and computational modeling approaches to better understand mechanisms associated with the progression of drug resistance, and to identify biomarkers and/or drug responses (Figure 1 and Table 1). In particular, we focus on computational modeling that makes predictions for various scenarios in the treatment of drug-resistant cancer cells. We also provide an outlook for further advances in the use of computational models for studying drug-resistant cancer cells.

MULTI-OMICS ANALYSES
Multiple omics data are often generated to examine various biological aspects of drug-resistant cancer cells (Figure 1). Target genotypes and phenotypes examined using omics data (Table 1) include: cancer-associated mutations (Niehr et al., 2018;Marczyk et al., 2020;Sinkala et al., 2021); changes in the expression level of specific genes (Niehr et al., 2018;Nava et al., 2019;Kagohara et al., 2020;Marczyk et al., 2020;Poojan et al., 2020;Sinkala et al., 2021); changes in chromosome structure (Kagohara et al., 2020;Marczyk et al., 2020;Aissa et al., 2021); epigenetic alterations (e.g., methylation or acetylation states of histones) (Kagohara et al., 2020;Marczyk et al., 2020;Poojan et al., 2020;Sinkala et al., 2021); and heterogeneity of a cell population (Niehr et al., 2018), increasingly examined at single-cell resolution (Kagohara et al., 2020;Aissa et al., 2021). In a recent study of cell population heterogeneity, for example, application of single-cell DNA and RNA sequencing (RNA-seq) to 20 triple-negative breast cancer (TNBC) patients revealed that rare pre-existing clones having genotypes associated with chemoresistance were adaptively selected in response to neoadjuvant chemotherapy, which subsequently led to acquired transcriptional reprogramming (Kim et al., 2018). For epigenetic alteration, chromosome conformation capture (Hi-C) along with additional omics analyses was conducted for estrogen receptor positive (ER+) breast cancer, which showed that resistance development to endocrine therapy was accompanied by notable three-dimensional (3D) epigenome alterations (Achinger-Kawecka et al., 2020). Application of multi-omics analyses has also been extended to examine biological processes in quiescent cancer cells that show drug resistance (Kumar et al., 2021). Understanding the biology of drug resistance often helps devise effective treatment strategies for drug-resistant cancer cells.
Relevant examples (Table 1) include targeting: cancer stem cell phenotypes, in particular stem cell factor receptor c-KIT, for TNBC cells resistant to an anticancer agent RH1 that is currently under clinical trials (Kuciauskas et al., 2019); a range of biological pathways (e.g., metabolism), microenvironment as well as proliferation, migration and invasion of cells, which are all associated with drug resistance for diffuse large B-cell lymphoma patients (Fornecker et al., 2019); zinc finger MYND domain-containing protein 8 (ZMYND8), a putative chromatin reader that appeared to suppress tumorigenic potential and drug resistance induced by doxorubicin (Mukherjee et al., 2020); and EZH2 responsible for histone methylation in taxane-resistant TNBC (Deblois et al., 2020).
As representative examples of overcoming drug resistance on the basis of omics analyses, recent studies additionally conducted CRISPR-Cas9-based genetic screens to examine cellular plasticity, which was suggested as a therapeutic target for drug-resistant cancer cells (Bell et al., 2019;Torre et al., 2021). Cellular plasticity describes non-genetic transformation of a cellular state into a drug-resistant state by reprogramming gene expression profiles. In a study by Torre et al., CRISPR-Cas9 genetic screens were implemented for melanoma cells to identify genes that affect cell fate decisions by altering cellular plasticity (Torre et al., 2021).
In particular, modulation of cellular plasticity was demonstrated for vemurafenib, which inhibits B-Raf, encoded by a proto-oncogene, in melanoma. Interestingly, inhibiting DOT1L, associated with the onset of melanoma, before the B-Raf inhibition resulted in greater drug resistance than simultaneous inhibition of DOT1L and B-Raf using pinometostat and vemurafenib, respectively. Subsequent transcriptome analysis of knockout cell lines generated clues for non-genetic mechanisms of drug resistance. Another study by Bell et al. focused on acute myeloid leukemia patients that showed non-genetic drug resistance (Bell et al., 2019). Single-cell RNA-seq, followed by CRISPR-Cas9 screening, led to the identification of genes responsible for transcriptional plasticity that triggered epigenetic resistance. Among the genes identified was Lsd1, the inhibition of which was shown to overcome non-genetic drug resistance. As demonstrated by these two recent studies, implementation of genome engineering in addition to omics analyses provides compelling evidence for targets that can help overcome drug resistance.

COMPUTATIONAL MODELING APPROACHES
While various bioinformatic analyses are available for analyzing omics data, such as enrichment analyses, gene co-expression networks (GCNs) (Cui et al., 2020;Qi and Zhang, 2020) and their variants (e.g., a network of long non-coding RNAs and mRNAs) (Huang et al., 2018;Liu H. et al., 2019) as well as dimensionality reduction (e.g., t-SNE and UMAP), omics data have also been subjected to computational modeling to make predictions for discovering novel mechanisms and devising treatment strategies for drug-resistant cancers (Figure 1). Representative computational modeling approaches recently considered for studying drug-resistant cancer cells include survival analysis in combination with GCNs, gene regulatory network (GRN) models built from a set of ordinary differential equations (ODEs), machine learning models, and genome-scale metabolic models (GEMs) (Table 1).

Network-Based Modeling
GCN analysis has been a popular approach for understanding gene expression patterns from transcriptome data. A GCN is an undirected graph that can be constructed from transcriptome data (e.g., RNA-seq), and connects a pair of genes (nodes in a GCN) with an edge if the two genes show significant co-expression patterns across the transcriptome data. GCN analysis, such as identifying hub genes and/or modules, allows prioritizing candidate genes that may be highly associated with drug resistance of cancer cells. A weighted GCN additionally considers the level of significance in the co-expression relationship between genes in a pair. Often, outcomes from (weighted) GCN analysis are further subjected to other computational analyses, for example survival analysis, to validate the biological and/or clinical significance of the candidate genes. As a recent example, Li et al. focused on PPP2R2B, encoding serine/threonine-protein phosphatase 2A 55 kDa regulatory subunit B beta isoform, as a potential prognostic biomarker for TNBC on the basis of a series of bioinformatic analyses involving a GCN (Li Z. et al., 2021). Kaplan-Meier survival analysis for this gene revealed that patients with a low expression level of PPP2R2B showed shorter survival time than those with a high expression level of PPP2R2B. Interestingly, PPP2R2B upregulation could attenuate the resistance of TNBC cells to doxorubicin. Likewise, a Cox proportional hazards regression model (Cox regression model) was used for genes selected from GCNs to predict prognostic biomarkers for breast cancer, and to suggest genes (e.g., CCNE2 and KIF14) that may help overcome drug resistance (Li Y.-K. et al., 2021). While GCNs can provide clinically important information when combined with additional predictive models, such as the survival analyses above, they are limited in generating clues on the molecular mechanisms associated with the development of drug resistance, in particular dynamic interactions between genes.
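As an illustration of the GCN construction described above, the following is a minimal sketch rather than the pipeline used in any of the cited studies. It assumes a toy expression matrix with made-up values, uses a hypothetical cutoff on the absolute Pearson correlation in place of a formal significance test, and ranks hub genes simply by node degree. The names `build_gcn`, `hub_genes` and `threshold` are introduced here for illustration only.

```python
import numpy as np

def build_gcn(expr, genes, threshold=0.8):
    """Return an undirected co-expression network as {gene: set(neighbors)}.

    expr has shape (n_genes, n_samples). The threshold on |Pearson r| is a
    hypothetical stand-in for a proper significance test.
    """
    corr = np.corrcoef(expr)  # gene-by-gene Pearson correlation matrix
    network = {g: set() for g in genes}
    for i in range(len(genes)):
        for j in range(i + 1, len(genes)):
            if abs(corr[i, j]) >= threshold:
                network[genes[i]].add(genes[j])
                network[genes[j]].add(genes[i])
    return network

def hub_genes(network, top=1):
    """Rank genes by degree; ties are broken alphabetically."""
    ranked = sorted(network, key=lambda g: (-len(network[g]), g))
    return ranked[:top]

# Toy expression matrix (rows: genes, columns: samples); values are made up.
genes = ["G1", "G2", "G3", "G4"]
expr = np.array([
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.1, 3.9, 6.2, 8.0, 9.9],   # closely tracks G1
    [5.0, 4.1, 2.9, 2.0, 1.1],   # anti-correlated with G1
    [3.0, 1.0, 4.0, 1.0, 5.0],   # only weakly related to the others
])
gcn = build_gcn(expr, genes)
hubs = hub_genes(gcn)
```

A weighted GCN would keep the correlation values (or a transformation of them) as edge weights instead of applying a hard cutoff.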
To address this problem, Zhang et al. developed a time-course RNA-seq data-driven computational framework (DryNetMC) to construct GRNs that help elucidate dynamic interactions between genes, and identify key genes associated with mechanisms of drug resistance (Zhang et al., 2019). DryNetMC involves a set of ODEs, a regularized regression method as well as a series of network analyses. Using DryNetMC, GRNs were constructed for dbcAMP-sensitive and dbcAMP-resistant glioma cells based on their time-course RNA-seq data. These differential GRNs were subsequently subjected to a systematic characterization to identify their unique network properties (e.g., node importance), which helped identify key genes (e.g., KIF2C, CCNA2, NDC80, KIF11, and KIF23) that are predictive of a cancer cell's drug response. Because network-based models, built either from a GCN or by other methods (e.g., ODEs), can visualize a biological context (e.g., association between genes), they will continue to be actively used in the analysis of omics data, likely along with additional predictive models.
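The general idea of combining ODEs with regularized regression can be sketched as follows. This is not the DryNetMC implementation: it assumes a simple linear model dx/dt = A x, approximates time derivatives by forward differences, and uses ridge regression as one possible regularizer. The interaction matrix, the time courses and the edge cutoff of 0.1 are all made up for illustration.

```python
import numpy as np

def simulate(A, x0, dt, steps):
    """Forward-Euler time course for the linear model dx/dt = A x."""
    xs = [np.asarray(x0, dtype=float)]
    for _ in range(steps):
        xs.append(xs[-1] + dt * (A @ xs[-1]))
    return np.array(xs)

def infer_grn(time_courses, dt, lam=1e-6):
    """Ridge-regularized least-squares estimate of A from time courses."""
    X = np.vstack([tc[:-1] for tc in time_courses])                  # states
    D = np.vstack([(tc[1:] - tc[:-1]) / dt for tc in time_courses])  # slopes
    n = X.shape[1]
    # Solve (X^T X + lam * I) A^T = X^T D for all genes at once.
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ D).T

genes = ["G1", "G2", "G3"]
# Hypothetical ground truth: A[i, j] is the effect of gene j on gene i.
A_true = np.array([
    [-0.5,  0.0,  0.0],
    [ 1.0, -0.3,  0.0],   # G1 activates G2
    [ 0.0, -0.8, -0.2],   # G2 represses G3
])
# One simulated time course per initial condition (made-up experiments).
time_courses = [simulate(A_true, x0, dt=0.1, steps=20) for x0 in np.eye(3)]
A_hat = infer_grn(time_courses, dt=0.1)

# Report off-diagonal interactions above a hypothetical cutoff as GRN edges.
edges = [
    (genes[j], genes[i], "+" if A_hat[i, j] > 0 else "-")
    for i in range(3) for j in range(3)
    if i != j and abs(A_hat[i, j]) > 0.1
]
```

With real, noisy RNA-seq time courses the derivative estimates are far less accurate than in this noise-free toy, which is one reason a regularized fit is used in the first place.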

Machine Learning
Increasing availability of omics data for drug-resistant cancer cells has also provided unprecedented opportunities for building machine learning models. In general, machine learning models perform classification or regression, depending on a given problem. Recently, prediction of anticancer drug response was attempted by using various types of machine learning methods, such as logistic regression (Frejno et al., 2017;Yu et al., 2021), random forest (Xu et al., 2019) and deep neural network (DNN; e.g., multilayer perceptron) (Malik et al., 2021) on the basis of a range of omics and drug response data (Table 1). When developing these machine learning models, transcriptome (RNA-seq or mRNA microarray) was the most frequently adopted dataset, but other types of datasets were also considered, including genome (e.g., gene mutations) (Yu et al., 2021), proteome (Frejno et al., 2020), epigenome (Xu et al., 2019), mass spectrometry data (Liu R. et al., 2019) and molecular features of a target drug (Zhu et al., 2020). In a recent study by Kong et al., a machine learning model was developed that can predict a patient's drug response on the basis of the analysis of a protein-protein interaction (PPI) network and pharmacogenomic data from 3D organoid culture models. Specifically, potential biomarkers were first inferred from the PPI network analysis, and their corresponding expression profiles along with drug response data (IC50) were used to train a machine learning model (e.g., ridge regression). The resulting drug responses were validated using survival analysis by focusing on colorectal and bladder cancer patients treated with 5-fluorouracil and cisplatin, respectively.
Frontiers in Genetics | www.frontiersin.org October 2021 | Volume 12 | Article 742902
The predicted drug responses also appeared to be consistent with transcriptome profiles from drug-sensitive and drug-resistant isogenic cancer cell lines as well as data on somatic mutations associated with already known biomarkers. In this study, consideration of the network analysis not only helped improve the performance of the developed machine learning model, but also facilitated the interpretation of model prediction outcomes. Likewise, in another study, elastic net and random forest regression were used to predict drug responses from abundance data of proteins and their phosphorylation sites in cancer cell lines (Frejno et al., 2020).
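To make the regression step concrete, the sketch below fits a ridge regression that maps biomarker expression to a drug response value. It is an illustrative example only and does not reproduce the model of Kong et al.: the expression matrix, the "true" coefficients, the response values and the penalty `lam` are all synthetic assumptions, and the closed-form solver `fit_ridge` is a helper written for this example.

```python
import numpy as np

def fit_ridge(X, y, lam=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean  # centering removes the intercept
    w = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ yc)
    return w, y_mean - x_mean @ w

rng = np.random.default_rng(0)
n_models, n_genes = 60, 5            # e.g., 60 organoids, 5 biomarker genes
X = rng.normal(size=(n_models, n_genes))         # synthetic expression
true_w = np.array([1.5, -2.0, 0.5, 0.0, 1.0])    # made-up gene effects
# Synthetic response (standing in for log IC50) with a little noise.
y = X @ true_w + 3.0 + rng.normal(scale=0.1, size=n_models)

weights, intercept = fit_ridge(X, y)
pred = X @ weights + intercept
mse = float(np.mean((pred - y) ** 2))
```

In practice the penalty would be chosen by cross-validation, and performance would be judged on held-out samples rather than on the training data used here.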
Among machine learning methods, DNNs are increasingly used for various predictions, and they have also been used to predict drug responses. Sakellaropoulos et al. developed a DNN model by using GDSC datasets (i.e., transcriptomic data for 1,001 cancer cell lines and IC50 values of 251 drugs) to predict drug responses (Sakellaropoulos et al., 2019). Across several datasets tested, the DNN model showed consistently better performance than elastic net and random forest models. The DNN model was validated by conducting survival analyses for the model-predicted IC50 values, which split patients based on their drug responsiveness. Importantly, pathway enrichment analysis using information from the DNN model (i.e., weights that connect the input layer and the first hidden layer) appeared to associate specific biological pathways with mechanisms of action for drugs. In a more recent study, predicting drug response was also attempted by using a DNN model combined with multiple elastic nets (Choi et al., 2020), referred to as Reference Drug-based Neural Network (RefDNN). RefDNN was developed more in the context of drug resistance, which predicts whether a given cell line is resistant to a target drug by processing gene expression profiles and molecular structure of a drug. RefDNN was also shown to help identify biomarker genes associated with drug resistance, and explore a novel anticancer drug via drug repositioning.
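The survival-based validation mentioned above can be illustrated with a minimal Kaplan-Meier estimator. The sketch assumes hypothetical patients with made-up predicted IC50 values, follow-up times and event indicators, and splits them at the median predicted IC50. It is not the analysis performed in the cited studies, and a real comparison would add a statistical test such as the log-rank test.

```python
import statistics

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each time with at least one observed event.

    times: follow-up time per patient; events: 1 if the event (e.g., death)
    was observed, 0 if the patient was censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = sum(1 for tt, e in data if tt == t and e == 1)
        leaving = sum(1 for tt, _ in data if tt == t)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored patients leave the risk set
        i += leaving
    return curve

# Made-up cohort: model-predicted IC50, follow-up in months, event flag.
predicted_ic50 = [0.2, 0.3, 0.4, 0.5, 1.5, 1.8, 2.0, 2.5]
times = [30, 28, 36, 24, 6, 10, 12, 8]
events = [0, 1, 0, 1, 1, 1, 0, 1]

cutoff = statistics.median(predicted_ic50)
sensitive = [i for i, v in enumerate(predicted_ic50) if v < cutoff]
resistant = [i for i, v in enumerate(predicted_ic50) if v >= cutoff]

km_sensitive = kaplan_meier([times[i] for i in sensitive],
                            [events[i] for i in sensitive])
km_resistant = kaplan_meier([times[i] for i in resistant],
                            [events[i] for i in resistant])
```

In this toy cohort the group predicted to be resistant has the lower survival curve, which is the pattern such a validation looks for.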
Despite its demonstrated performance, machine learning is often limited by the scarcity of training datasets in many technical fields. This challenge can be addressed to a certain extent by employing transfer learning as recently demonstrated (Zhu et al., 2020). Zhu et al. demonstrated that ensemble transfer learning can improve the prediction of drug responses in the context of drug repositioning (i.e., use of a drug for another cancer that is already known), precision oncology (i.e., use of a drug for a new cancer that has never been treated before) and new drug development (i.e., use of a new drug for already known cancer). In this particular study, LightGBM (Light Gradient Boosting Machine) and two different DNN models were considered for ensemble transfer learning; larger datasets from the CTRP and GDSC were used as source data for initial training of models, and smaller datasets from CCLE and the Genentech Cell Line Screening Initiative (gCSI) served as target data for further refinement and testing of the models. It was shown that ensemble transfer learning-based models almost always outperformed models that were not developed using transfer learning. This study suggests that transfer learning could be applied to other drug-resistant cancer cells for which sufficient training data are not available.
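The principle of training on a large source dataset and refining on a small target dataset can be sketched with a much simpler model than those used by Zhu et al. The example below is an assumption-laden toy, not ensemble transfer learning with LightGBM or DNNs: it uses linear ridge regression, transfers knowledge by shrinking the target-task weights toward the source-task weights, and relies entirely on synthetic data.

```python
import numpy as np

def ridge(X, y, lam, prior=None):
    """Ridge regression that shrinks weights toward `prior` (zeros if None)."""
    p = X.shape[1]
    prior = np.zeros(p) if prior is None else prior
    # Closed form of argmin ||y - Xw||^2 + lam * ||w - prior||^2.
    return prior + np.linalg.solve(X.T @ X + lam * np.eye(p),
                                   X.T @ (y - X @ prior))

def error(w, w_true):
    """Euclidean distance between estimated and true weights."""
    return float(np.linalg.norm(w - w_true))

rng = np.random.default_rng(0)
p = 6
w_source_true = np.array([1.0, -2.0, 0.5, 1.5, -1.0, 2.0])  # made up
w_target_true = w_source_true + 0.1   # related but shifted target task

X_src = rng.normal(size=(200, p))     # large source dataset
y_src = X_src @ w_source_true
X_tgt = rng.normal(size=(4, p))       # small target dataset (4 < 6 features)
y_tgt = X_tgt @ w_target_true

w_src = ridge(X_src, y_src, lam=1.0)                     # initial training
w_scratch = ridge(X_tgt, y_tgt, lam=1.0)                 # no transfer
w_transfer = ridge(X_tgt, y_tgt, lam=1.0, prior=w_src)   # refinement

err_source_only = error(w_src, w_target_true)
err_scratch = error(w_scratch, w_target_true)
err_transfer = error(w_transfer, w_target_true)
```

Comparing `err_scratch` with `err_transfer` shows how much the source task helps when the target dataset has fewer samples than features. With noise-free target data, the refined weights are never farther from the target weights than the source weights alone.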

Genome-Scale Metabolic Modeling
A GEM is a computational model that describes gene-protein-reaction (GPR) associations, and can be simulated to predict genome-scale metabolic flux distributions. GEMs are now available for an increasing number of organisms that are important in biotechnology and biomedicine. Several versions of human GEMs (Ryu et al., 2017;Brunk et al., 2018;Robinson et al., 2020) are currently available, which have been used to examine a target cell's metabolism, and to predict biomarkers and drug targets for various diseases (Cook and Nielsen, 2017;Gu et al., 2019). For a medical application, a generic human GEM, covering all the known GPR associations in human metabolism, is initially integrated with omics data, often transcriptome (e.g., RNA-seq), to build a context-specific GEM, a GEM that is specific to a target cell or tissue (Ryu et al., 2015;Opdam et al., 2017). The resulting context-specific GEM is then simulated for various metabolic studies.
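The workflow of building a context-specific model and simulating it can be illustrated on a toy network. The sketch below is a simplified assumption, not one of the published integration algorithms: it uses flux balance analysis solved with `scipy.optimize.linprog`, a made-up four-reaction network, single-gene GPR associations, and a hypothetical expression cutoff below which a reaction is switched off.

```python
import numpy as np
from scipy.optimize import linprog

# Toy network (made up): R1: -> A, R2: A -> B, R3: A -> B, R4: B -> (objective)
reactions = ["R1", "R2", "R3", "R4"]
S = np.array([
    [1, -1, -1,  0],   # mass balance for metabolite A
    [0,  1,  1, -1],   # mass balance for metabolite B
])
generic_bounds = {"R1": (0, 10), "R2": (0, 6), "R3": (0, 8), "R4": (0, 1000)}
gpr = {"R2": "g1", "R3": "g2"}        # single-gene GPR associations

def fba(S, bounds, objective):
    """Maximize the flux of `objective` subject to the steady state S v = 0."""
    c = np.zeros(len(reactions))
    c[reactions.index(objective)] = -1.0   # linprog minimizes, so negate
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                  bounds=[bounds[r] for r in reactions], method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return -res.fun

def context_specific(bounds, gpr, expression, cutoff=1.0):
    """Close reactions whose associated gene is expressed below the cutoff."""
    new_bounds = dict(bounds)
    for rxn, gene in gpr.items():
        if expression.get(gene, 0.0) < cutoff:
            new_bounds[rxn] = (0, 0)
    return new_bounds

expression = {"g1": 50.0, "g2": 0.5}   # hypothetical RNA-seq levels (TPM)
generic_flux = fba(S, generic_bounds, "R4")
context_flux = fba(S, context_specific(generic_bounds, gpr, expression), "R4")
```

In this toy example the generic network can carry an objective flux of 10, limited by uptake, whereas the context-specific network, in which R3 is closed because g2 is barely expressed, is limited to 6 by R2. Real human GEMs contain thousands of reactions and GPR rules with AND/OR logic, and are typically handled with dedicated toolboxes.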
Human GEMs have recently been used to study radiation-resistant tumors, but not drug-resistant cancer cells, to the best of our knowledge. Lewis et al. newly constructed GEMs for radiation-sensitive and radiation-resistant tumors through multi-omics integration (i.e., transcriptome data, mutational data, kinetic data and thermodynamic data). These context-specific GEMs were used to identify changes in redox cofactor production that confer resistance to radiation therapy. In the other study, ensemble machine learning classifiers were developed to predict whether an individual is responsive or resistant to radiation therapy by considering data of metabolite production rates predicted from context-specific GEMs as well as mutation data, transcriptome data and clinical data from TCGA. These two studies suggest that GEM-based approaches can also be considered to identify metabolic signatures of drug-resistant cancer cells, and to predict effective drug targets for these cancer cells.

OUTLOOK
Understanding genotype-phenotype associations in drug-resistant cancer cells is a highly complex problem, and therefore multi-omics data have been used to capture various aspects of these cells. In particular, multi-omics analyses along with additional tools, such as genome engineering (e.g., CRISPR-Cas9), will continue to play an important role in the thorough characterization of drug-resistant cancer cells. Also, an increasing volume of omics data will facilitate the development of various types of computational models. As a consequence, prediction outcomes from computational models will allow experiments on drug-resistant cancer cells to be designed more systematically.
Despite the promises of omics data and computational models, technical challenges exist. First, current coverage of multi-omics data is not sufficient for thoroughly studying a range of drug-resistant cancer cells. In particular, generation of a consistent set of multi-omics data from each single cell is necessary for in-depth study of a target cancer cell and comparison of different types of cancer cells. Also, it will be interesting to examine the effects of using datasets obtained from patients having a specific disease instead of publicly available datasets (e.g., GDSC and CTRP). While currently available machine learning models have been rigorously validated by using public datasets, they might reveal previously unnoticed limitations in a clinical setting because the public datasets are often generated under highly controlled conditions. In particular, additional consideration of non-genetic factors (e.g., age, gender, and lifestyle) may help reveal new insights on drug-resistant cancer cells. Use of patient-specific datasets will allow more widespread use of the state-of-the-art computational models in a clinical setting.
For network-based modeling, including both GCN and GRN, a breakthrough is needed that allows efficient development of a cell-specific large-scale GRN that can be simulated under various conditions (e.g., gene perturbation). For machine learning, despite its high predictive performance, there is always a challenge of avoiding overfitting and achieving explainability. Explainability in terms of biological processes is particularly important in the field of biomedicine in order to explain prediction outcomes and make medical decisions. In the case of human GEMs, because patient-specific omics data (e.g., RNA-seq) are available to a certain extent, human GEMs should be more actively considered to systematically examine the metabolism of drug-resistant cancer cells. Availability of multi-omics data will be particularly useful for interpreting human GEMs and their prediction outcomes; because human GEMs only cover a metabolic network, use of multi-omics data can help explain a complex interplay between metabolic and regulatory networks. Prediction outcomes from the simulation of human GEMs will in turn help explain the insights gained from omics analyses.
Taken together, advances in omics technologies and computational modeling will bring about positive impacts in understanding and treating drug-resistant cancer cells. Feedback from clinicians and biomedical researchers will be additionally useful for the successful development and clinical application of computational models.

AUTHOR CONTRIBUTIONS
HUK, HDJ and YJS wrote the manuscript. HDJ and YJS prepared the figure and table, and critically reviewed the manuscript. All authors contributed to the article and approved the submitted version.