Identification of IRAK1BP1 as a candidate prognostic factor in lung adenocarcinoma

Introduction Lung cancer is one of the major causes of cancer-related mortality worldwide. High-throughput RNA sequencing (RNA-seq) of surgically removed tumors has been used to identify new biomarkers of lung cancer; however, contamination by non-tumor cells in the tumor microenvironment significantly interferes with the search for novel biomarkers. Tumor organoids, as a pre-clinical cancer model, exhibit similar molecular characteristics with tumor samples while minimizing the interference from other cells. Methods and Results Here we analyzed six RNA-seq datasets collected from different organoid models, in which cells with oncogenic mutations were reprogrammed to mimic lung adenocarcinoma (LUAD) tumorigenesis. We uncovered 9 LUAD-specific biomarker genes by integrating transcriptomic data from multiple sources, and identified IRAK1BP1 as a novel predictor of LUAD disease outcome. Validation with RNA-seq and microarray data collected from multiple patient cohorts, as well as patient-derived xenograft (PDX) and lung cancer cell line models confirmed that IRAK1BP1 expression was significantly lower in tumor cells, and had no correlation with known markers oflung cancer prognosis. In addition, loss of IRAK1BP1 correlated with the group of LUAD patients with worse survival; and gene-set enrichment analysis using tumor and cell line data implicated that high IRAK1BP1 expression was associated with suppression of oncogenic pathways. Discussion In conclusion, we demonstrate that IRAK1BP1 is a promising biomarker of LUAD prognosis.


Introduction
Lung cancer is one of the most frequently diagnosed and deadliest cancers in the world (1). Comprising approximately 80% to 85% of all cases of lung cancer (2), non-small-cell lung carcinoma (NSCLC) is a heterogeneous disease that consists of tumors originating from different cell types in the lung. Some of the most frequently diagnosed subtypes of NSCLC include lung adenocarcinoma (LUAD), squamous cell carcinoma (LUSC), and large cell carcinoma, with LUAD being the most common subtype. While the gold standard of care for early-stage lung cancer is surgical resection, tumors of later stages are often treated with a combination of surgery and other treatments such as chemotherapy, targeted therapy, radiotherapy, or immunotherapy (3). Despite recent advances in the diagnosis and treatment of lung cancer, prognosis remains dire for many patients, as the early detection and prediction of long-term treatment response remain challenging. Therefore, the identification of new biomarkers that are indicative of lung cancer progression is critical for improving patient survival.
The search for biomarkers of lung cancer has benefitted largely from advances in biomedical and molecular technologies such as tumor messenger RNA (mRNA) microarray and RNA-sequencing (RNA-seq). Molecular profiling of resected lung tumors has uncovered genes indicative of lung cancer progression (4)(5)(6)(7). However, a number of technical limitations may hinder future discoveries. For example, surgically removed tumors often contain stromal and immune cells in the tumor microenvironment (TME), which may generate 'noise' in sequencing and reduce the likelihood of uncovering cancer-specific biomarkers. The emergence of single cell RNA-seq (scRNA-seq) has allowed for the differentiation of tumor cells from cells in the TME (7,8); yet partial detection of the transcriptome of different cells (a.k.a. dropouts) has made it difficult to identify novel biomarkers (9). In an attempt to address some of these challenges, tumor organoids developed from murine cells or human induced pluripotent stem-cells (iPSCs) carrying oncogenic mutations were utilized to minimize contamination from nontumor cells while providing enough cells for performing scRNAseq or bulk RNA-seq (10,11). Evidently, these tumor organoids effectively recapitulated the phenotypic and epigenetic characteristics of lung tumors, demonstrating the physiological relevance of these models. Furthermore, tumor organoids serve as an excellent tool for the study of early events in tumorigenesis and cancer progression, which may have predictive power.
Here, we describe our study exploring 6 RNA-seq datasets from two studies using tumor organoids as pre-clinical models of LUAD (10,11). We uncovered 9 genes that were differentially expressed (DEGs) across all 6 different organoid models. Interestingly, among these DEGs, we identified IL-1R-associated kinase 1 binding protein 1 (IRAK1BP1), an anti-inflammatory factor that negatively regulates Toll-like receptor (TLR) signaling (12)(13)(14). While studies have elucidated the anti-inflammatory function of IRAK1BP1, the association between IRAK1BP1 and cancer has not been reported and its role in cancer is unknown. Further investigation of IRAK1BP1 expression in publicly available LUAD patient data found that IRAK1BP1 expression was lower in tumor samples relative to the corresponding normal tissue; and its predictive power is independent from other known indicators of patient outcome, such as known oncogenic mutations, tumor mutation burden, age or gender. Importantly, loss of IRAK1BP1 coincided with the group of LUAD patients with poorer prognosis; and gene-set enrichment analysis (GSEA) in both cell line and patient data suggested that elevated IRAK1BP1 expression was associated with reduced expression of oncogenic pathways. Our analysis suggests that IRAK1BP1 is an independent prognostic factor of LUAD that robustly predicts LUAD tumor progression, which may be valuable for future development of personalized treatments.
Gene expression and prognosis analysis TIMER (http://timer.cistrome.org/) was used to analyze the differential expression of IRAK1BP1 between normal tissues and tumor types ( Figure 1G). [IRAK1BP1 expression levels in various tumor types and the corresponding normal tissues were analyzed by TIMER (http://timer.cistrome.org/) ( Figure 1G)]. The clinical characteristics, somatic mutations and RNA-seq data in TPM (transcripts per million reads) of lung cancer patients, as well as the corresponding normal lung tissue data in the cancer genome atlas (TCGA) (https://portal.gdc.cancer.gov/) were downloaded using TCGAbiolinks (16). The correlation between IRAK1BP1 expression and overall survival (OS) was analyzed by the "survival" R package (version 3.4-0). Patient samples were divided into four groups based on 25%, 50% and 75% quartiles ranked by IRAK1BP1 expression, and the Kaplan-Meier survival analysis for OS was performed in the top and bottom groups. Statistical significance of survival curves was assessed by a log-rank test.

Correlation analysis of gene mutations and LUAD clinical data
We calculated the percentage of mutation for each gene across all the LUAD samples and selected the top 20 genes with the highest mutation rates. We then ranked the samples based on the expression of IRAK1BP1 from high to low, and split all cases into three groups (high, < 25%; medium, ≥ 25% and ≤ 75%; low, > 75%). A waterfall plot was displayed using the "maftools" R package (version 2.14.0). sites.broadinstitute.org/ccle/). Differentially expressed genes in LUAD samples from TCGA and LUAD cell lines from CCLE were identified with adjusted P value < 0.05 and |log2FC| > 1. Gene Ontology enrichment analyses were performed by the "clusterProfiler" R package (version 4.5.2). The "fgsea" R package (version 1.22.0) was used to perform gene set enrichment analysis (GSEA).

Results
Analysis of RNA-seq data obtained from organoid models identified LUAD-specific DEGs and IRAK1BP1, which is downregulated in six organoid models as well as in LUAD patient samples To search for novel biomarkers for lung cancer that predict disease outcome, we analyzed RNA-seq datasets from two studies using murine tumor organoids derived from murine alveolar epithelial progenitor (AT2) cells and/or bronchioalveolar stem cells (BASCs), which were postulated to give rise to LUAD (10). Figure 1A demonstrates the workflow of RNA-seq data collection from mouse organoids with or without the in vitro induced activation of KRAS G12D (10) or EGFR T790M/L858R (11), which are two of the most predominant oncogenic mutations in NSCLC (17); this was then followed by DEG analysis to find genes aberrantly expressed in tumor organoids by comparing the transcriptomic profiles of tumor versus normal (control) organoids. Data collected for a total of six organoid types were analyzed for our study ( Figure 1A). Lists of DEGs for each dataset were then compared to find 9 DEGs that were common to all six organoid models of LUAD ( Figures 1A-C). Genes that were consistently up-regulated across all LUAD organoid types include Wnt7a, Unc13c, Rap1gap, and Ank1 ( Figure 1B, upper heatmap), while genes consistently down-regulated were Gstm1, Gpx2, Acox2, Irak1bp1, and Erich2 ( Figure 1B, bottom heatmap). The log fold change (log2FC) for each DEG was displayed in Figure 1D, showing that these 9 genes were indeed consistently up-or down-regulated in all 6 organoid types that were analyzed.
To find candidate biomarkers that may be valuable to the prediction of LUAD progression, we retrieved and analyzed LUAD patient transcriptomic and prognostic data from the cancer genome atlas (TCGA) as well as 3 published tumor microarray datasets (GSE19188, GSE30219, GSE43458) to obtain our list of shared DEGs. To see how well these candidate biomarkers predict patient outcome, we measured the hazard ratio (hr) of the 9 DEGs discovered, and found that one gene, in particular, called IRAK1BP1 had a lower hr value with statistical significance, suggesting that it had greater predictive power than the rest of the DEGs ( Figure 1E). As a result, we decided to further investigate the expression of IRAK1BP1 in patient samples from TCGA as well as 3 independent cohorts; our findings suggested that IRAK1BP1 expression in LUAD patients resembled what was observed in the organoid models, showing reduced expression in tumor tissue relative to normal tissue ( Figure 1F). Interestingly, the differential expression of IRAK1BP1 is also observed in other cancer types: we assessed IRAK1BP1 expression for all cancer types with available data ( Figure 1G). The differential expression of IRAK1BP1 is again evident in LUAD and LUSC, the two main NSCLC subtypes (red box); IRAK1BP1 expression is lower in tumor relative to normal tissue in many other cancer types such as breast cancer (BRCA), kidney cancers (KICH, KIRC), and prostate adenocarcinoma (PRAD); while in some cancer types, such as cholaniocarcinoma (CHOL) and colon adenocarcinoma (COAD), the opposite is true ( Figure 1G). The differential expression of IRAK1BP1 in various cancer types warrants further survival analysis in cancers other than lung cancer.

IRAK1BP1 expression correlates with LUAD patient survival and is an independent marker of LUAD prognosis
We further investigated whether differential IRAK1BP1 expression correlated with differential LUAD prognosis by plotting the survival curves for LUAD patients divided into two groups (top and bottom quartiles), based on the level of IRAK1BP1 expression (Figure 2A). The overall survival (OS) and early-stage survival for LUAD patients in the TCGA cohorts (TCGA-LUAD), as well as the survival curves calculated from patient microarray data, suggested that LUAD patients with high IRAK1BP1 expression (red) had higher survival rates hence better prognosis compared with patients with low IRAK1BP1 (blue) (Figure 2A, respectively). To see how well IRAK1BP1 alone performs as a prognostic predictor relative to other known biomarkers of NSCLC prognosis, we measured the coefficients of correlation between IRAK1BP1 expression and well-studied indicators of lung cancer prognosis such as patient age, gender, and tumor stage at diagnosis, and found that IRAK1BP1 expression had no correlation with any of these factors; our analysis showed that IRAK1BP1 alone was able to robustly predict LUAD patient prognosis ( Figure 2B). In addition, pair-wise correlation between each of a collective set of lung cancer prognostic marker genes and IRAK1BP1 expression was measured and shown in Figure 2D, where the color represents the coefficient of correlation and the size of the dot represents statistical significance; again, our analysis suggests that IRAK1BP1 had very weak correlation with most known biomarkers and was able to act alone as a prognostic predictor ( Figure 2D).
To compare the performance of IRAK1BP1 as a prognostic predictor with known biomarkers of LUAD, hazard ratios (hr) were calculated for IRAK1BP1 along with previously described lung cancer biomarkers listed in Figure 2C: hr values lower than 1 indicates low hazards (i.e. better prognosis); the lower the hr value, the better the prognosis it predicts. Our analysis showed that IRAK1BP1 was able to robustly predict LUAD patient prognosis, outperforming each of the known biomarkers in the list ( Figure 2C).
Tumor mutation burden (TMB) and mutation rates are wellknown biomarkers of lung cancer patient treatment response and prognosis. To assess whether the correlation between IRAK1BP1 expression and patient survival is a function of TMB and mutation rates, we evaluated the correlation between IRAK1BP1 expression and a number of factors including TMB, the mutation status of the top 20 genes with the highest mutation rates, age, gender, OS, smoking status, and pathological stage of LUAD (Figure 3). Patient samples were divided into three groups based on the level of IRAK1BP1 expression and were aligned horizontally from high to low expression ( Figure 3A). Here, we did not observe any significant correlation between IRAK1BP1 expression and most of the factors evaluated; mutation rates were not dramatically altered for the 20 most mutated genes selected, though mutation rates in the high-expression group were slightly lower than those in low-and/or mediumexpression groups ( Figure 3B). The frequencies of different types of mutations found in LUAD patients were also fairly even between the High-IRAK1BP1 and Low-IRAK1BP1 groups ( Figure 3C). Our results suggest that IRAK1BP1 expression correlates with LUAD prognosis and is able to act independently as a candidate predictor of disease outcome.

GSEA and GO term analyses suggest that higher IRAK1BP1 expression is associated with suppressed oncogenic pathway signatures
To find out how IRAK1BP1 may assist in tumor formation and progression, we performed gene-set enrichment analysis (GSEA) as well as gene ontology analysis for LUAD samples in the high-IRAK1BP1 group (Figure 4). By comparing transcriptomes between LUAD patients with high-and low-IRAK1BP1 expression, we found genes that were up-regulated or down-regulated in patients with high IRAK1BP1 ( Figure 4A). GSEA analysis in the high-IRAK1BP1 transcriptomes from the TCGA-LUAD patient cohorts suggested that some of the most well-known oncogenic pathways, such as the Notch, canonical Wnt, as well as the TGF-b pathways, were down-regulated in LUAD tumors from the high-IRAK1BP1 group ( Figure 4B). Further validating the predictive power of IRAK1BP1, our GSEA analysis of high-IRAK1BP1 expressers using RNA-seq data from lung cancer cell lines in the cancer cell line encyclopedia (CCLE) database showed remarkable similarity to the GSEA results in patient data ( Figures 4C, D). Interestingly, in addition to down-regulated oncogenic pathways, signatures of inactive KRAS pathway were also present in the high-IRAK1BP1 group ( Figure 4D). GO term analysis of up-regulated genes ( Figure 4E) and down-regulated genes ( Figure 4F) in TCGA-LUAD samples with high IRAK1BP1 suggested that high IRAK1BP1 expression was associated with suppression of pathways associated with immune/inflammatory response as well as developmental/wound healing, possibly reflecting reduced TMB in this group ( Figure 4F); in contrast, pathways associated with cytoskeleton and cell adhesion were up-regulated in the high-IRAK1BP1 group, suggesting that these cells may have better structural stability ( Figure 4E). These findings prompt further studies to better characterize the association of IRAK1BP1 with tumorigenesis and cancer progression.

Discussion
Recent advances in translational research and the invention of new therapeutic approaches have shed light on the diagnosis and treatment of cancer. However, many patients continue to suffer from the lack of treatment response, the development of drug resistance, and the dread of recurrence after remission. For most cancer patients, diagnosis at an earlier stage of tumor formation and/or progression means better chance of surviving. As a result, there has been ongoing effort dedicated to the search for biomarkers that enable more sensitive detection of early-stage cancers as well as more accurate prediction of disease outcome. Serving as a preclinical model of cancer, tumor organoids allow us to overcome the limitations of animal and patient-derived xenograft models by capturing early events of tumorigenesis and/or the initiating steps in cancer progression. While some organoid models arguably lack components from the TME and therefore may not reveal the intricacies of cancer-stroma/immune interactions, they nonetheless provide a relatively controlled system for the discovery of biomarkers and/or druggable targets.
Here, using RNA-seq data that are publicly available, we identified IRAK1BP1 as a putative prognosis predictor of LUAD disease outcome. Our analyses showed that IRAK1BP1 is not only one of the 9 LUAD-specific DEGs that are common to the 6 datasets that were analyzed, but also possesses predictive potential, as the reduced expression of IRAK1BP1 alone in tumor samples predicts poor survival. Our findings suggest that elevated IRAK1BP1 is associated with suppressed oncogenic signatures in both lung cancer cell lines and patient data, further showing promise for IRAK1BP1 as a predictor of lung cancer prognosis.
The association of IRAK1BP1 with cancer is a novel discovery, while the function of IRAK1BP1 in cancer remains to be explored. To-date, multiple studies have suggested that IRAK1BP1 promotes nuclear translocation of NF-kappaB thereby inhibiting inflammatory responses (12)(13)(14). In the context of cancer, chronic inflammation is often associated with tumor development (18); therefore it would be interesting to test if the reduced expression of IRAK1BP1 in tumor cells leads to elevated inflammatory signals and contributes to cancer progression. Mechanistic studies will allow us to evaluate the therapeutic potential of this gene.

Data availability statement
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/supplementary material.

Ethics statement
Ethical review and approval was not required for the study on human participants in accordance with the local legislation and institutional requirements. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements. Ethical review and approval were not required for the animal study because the study was based on bioinformatic analysis of published patient data, for which ethical reviews were conducted to obtain approval. All the datasets and their source projects were properly referenced.