

Front. Immunol., 20 September 2022
Sec. Viral Immunology
Volume 13 - 2022

Blood gene expression predicts intensive care unit admission in hospitalised patients with COVID-19

Rebekah Penrice-Randal1,2*, Xiaofeng Dong1, Andrew George Shapanis3, Aaron Gardner2, Nicholas Harding2, Jelmer Legebeke3,4, Jenny Lord3, Andres F. Vallejo5, Stephen Poole4,5, Nathan J. Brendish4,5, Catherine Hartley1, Anthony P. Williams6, Gabrielle Wheway3, Marta E. Polak5,7, Fabio Strazzeri2, James P. R. Schofield2, Paul J. Skipp2,8, Julian A. Hiscox1,9,10, Tristan W. Clark4,5† and Diana Baralle3,4†
  • 1Institute of Infection, Veterinary and Ecological Sciences, University of Liverpool, Liverpool, United Kingdom
  • 2TopMD Precision Medicine Ltd, Southampton, United Kingdom
  • 3School of Human Development and Health, Faculty of Medicine, University of Southampton, Southampton, United Kingdom
  • 4National Institute for Health Research (NIHR) Southampton Biomedical Research Centre, University Hospital Southampton National Health Service (NHS) Foundation Trust, University of Southampton, Southampton, United Kingdom
  • 5School of Clinical and Experimental Sciences, Faculty of Medicine, University of Southampton, Southampton, United Kingdom
  • 6Cancer Sciences Division, Faculty of Medicine, University Hospital Southampton, Southampton, United Kingdom
  • 7Institute for Life Sciences, University of Southampton, Southampton, United Kingdom
  • 8Centre for Proteomic Research, School of Biological Sciences, University of Southampton, Southampton, United Kingdom
  • 9NIHR Health Protection Research Unit in Emerging and Zoonotic Infections, Liverpool, United Kingdom
  • 10A*STAR Infectious Diseases Laboratories (A*STAR ID Labs), Agency for Science, Technology and Research (A*STAR) Singapore, Singapore, Singapore

Background: The COVID-19 pandemic has created pressure on healthcare systems worldwide. Tools that can stratify individuals according to prognosis could allow for more efficient allocation of healthcare resources and thus improved patient outcomes. It is currently unclear if blood gene expression signatures derived from patients at the point of admission to hospital could provide useful prognostic information.

Methods: Gene expression of whole blood obtained at the point of admission from a cohort of 78 patients hospitalised with COVID-19 during the first wave was measured by high resolution RNA sequencing. Gene signatures predictive of admission to Intensive Care Unit were identified and tested using machine learning and topological data analysis, TopMD.

Results: The best gene expression signature predictive of ICU admission was defined using topological data analysis, with an accuracy of 0.72 and a ROC AUC of 0.76. The gene signature was primarily based on differentially activated pathways controlling epidermal growth factor receptor (EGFR) presentation, peroxisome proliferator-activated receptor alpha (PPAR-α) signalling and transforming growth factor beta (TGF-β) signalling.

Conclusions: Gene expression signatures from blood taken at the point of admission to hospital predicted ICU admission of treatment naïve patients with COVID-19.


Introduction

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a betacoronavirus responsible for coronavirus disease 2019 (COVID-19), which resulted in a global pandemic with over 6.3 million deaths by June 2022. SARS-CoV-2 causes a spectrum of symptoms in humans, from asymptomatic to severe disease, where the latter requires continuous and intensive care and is associated with extensive pulmonary immunopathology (1, 2). The nature of severe coronavirus disease has caused a strain on healthcare systems across the world (3). Biomarkers predictive of outcome in patients with Ebola virus disease have been identified (4), highlighting that prognostic biomarkers could be useful in outbreak and clinical settings. There is an urgent need for tools which can stratify patients according to prognosis to better manage healthcare resources and improve patient outcomes, particularly in resource-poor or resource-limited settings.

There have been many attempts to define prognostic biomarkers in COVID-19 (5–10). However, these have focused on predicting mortality, which is primarily associated with older age groups. The ability to predict, at admission to hospital, the trajectory of a patient towards intensive care unit (ICU) admission will allow for more efficient triaging and improve outcomes through early targeted interventions. The decision to admit individuals to ICU is a result of applying standard clinical and physiological metrics with clinical oversight and scoring tools such as NEWS2 (11). Leveraging host response data from an accessible sample (e.g., peripheral blood) to predict and inform ICU admission is therefore an exciting continuation of previous work to define the host response in patients with COVID-19 (12).

Several studies in different disease contexts, including COVID-19, have been conducted to predict in-hospital mortality and ICU admission (11, 13–17). For sepsis, NEWS was assessed for the prediction of in-hospital death, with an AUROC of 0.65 (0.61 to 0.68), and of ICU admission, with an AUROC of 0.64 (0.57 to 0.71); however, the authors highlight that no scoring system has both high sensitivity and specificity for predicting adverse outcomes in sepsis at admission (13). A retrospective analysis of data available at the time of admission, including heart rate, supplementary oxygen, abnormal sodium, and amount of time spent in the emergency department, was used to build a logistic regression model to predict early ICU admission, which produced an AUROC of 0.70 (0.67–0.72) and was able to identify 10% of early ICU transfers (14). Some attempts at predicting COVID-19 ICU admission have not performed well (16). However, a model based on age, sex and comorbidities did predict ICU mortality and ICU admission in COVID-19 patients, generating a c-statistic of 0.876 (0.864–0.886) (11). Others have found that CURB-65 scores perform well in predicting in-hospital mortality, with an AUC of 0.781, and that the qCSI score performed well in predicting ICU admission, with an AUC of 0.761 (15). Models with AUC values between 0.86 and 0.88 have been developed for predicting hospitalisation, ICU care and mechanical ventilation (18). Age and BMI were important predictors for hospitalisation, whereas for ICU admission male sex, opacities in chest scans and age were important variables (17). Routine laboratory values predictive of ICU admission and mechanical ventilation included elevated serum lactate dehydrogenase (LDH), C-reactive protein (CRP), anion gap and glucose, in addition to decreased serum calcium, sodium and albumin (17).

The use of gene expression signatures from a sample such as blood to predict clinical outcome or care trajectories has been infrequently reported in the literature. Previously, an 11-gene host response score, measured as a stand-alone test in whole blood collected within 30 days of admission, was found to perform similarly to SAPS3 and APACHE II when predicting 60-day mortality (AUC: 0.68), in-hospital mortality (AUC: 0.75), shock (AUC: 0.77) and primary MODS or ARDS (AUC: 0.98) (19). In sepsis, 20- and 10-gene panels have been trialled, with AUCs between 0.723 and 0.956 achieved depending on the cohort and the number of genes included in the panel (20).

Genomic analyses such as RNAseq are routinely used to inform clinical decisions (21–24). Turnaround times from sampling to actionable data are continually improving, making their potential use as point-of-care tools more feasible. In addition, the cost of sequencing continues to decrease and many sequencing platforms are becoming more accessible. In this study, using blood gene expression profiles from 78 SARS-CoV-2 infected patients, machine learning and an emerging topological data analysis approach (25, 26) were used to identify and validate gene signatures that were predictive of ICU admission of patients with COVID-19. This predictive model demonstrates potential as a valuable tool to support personalised treatment, to assist clinical decision making for hospitalised COVID-19 patients, and to provide a point of comparison for evaluating the effects of medical countermeasures.

Materials and methods

Patient cohort and study design

In this study, a cohort of 78 patients hospitalised with COVID-19 was analysed. Samples were collected as part of the CoV-19POC study (ISRCTN trial registry: ISRCTN14966673) as previously described (12). In brief, blood samples were collected in PAXgene tubes within 24 hours of admission to hospital between March and April 2020. All patients were sampled and RNAseq data were generated. Detailed patient characteristics and demographics, collected from medical records at the time of admission, are included in Table 1, which was generated with gtsummary (27).


Table 1 Patient characteristics and demographics grouped by ICU admission status.

Extraction of RNA from clinical samples and Illumina sequencing

Total RNA was extracted from PAXgene blood RNA tubes (BRT) using the PAXgene Blood RNA Kit (PreAnalytiX), according to the manufacturer’s protocol, at Containment Level 3 in a Tripass Class I hood. Libraries were sequenced using 150 bp paired-end reads on an Illumina® NovaSeq 6000.

Data processing and machine learning

Raw paired-end fastq files generated by the NovaSeq were trimmed for the presence of adapter sequences using cutadapt (v.1.2.1) with the -O 3 parameter (28). The fastq files were further trimmed using sickle (v.1.200) with a minimum window quality score of 20, and reads shorter than 15 bp were removed from the analysis (18). Hisat2 v2.1.0 (29) was used to map the trimmed reads to the reference Homo sapiens genome assembly (release-94) downloaded from the Ensembl FTP site. The resulting alignment files were processed by featureCounts v2.0.0 (30) with default settings to generate raw read counts per gene. Before further analysis, outlier samples identified in the hierarchical clustering were removed and low-expression genes were filtered out (retention threshold: at least 1 read per million in the smallest groups). Decision tree models to classify ICU admission in COVID-19 samples were built with a random forest classifier, based on gene expression or on hospital assay traits, using the randomForest() function in the R package “randomForest” (31) with “ntree=500, proximity=TRUE, mtry=5”. Variable importance in the random forest models was measured through the mean decrease in accuracy and the Gini index.
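The low-expression filter can be illustrated with a minimal Python sketch. It assumes that the criterion means a gene is kept when it reaches at least 1 count per million (CPM) in at least as many samples as the smallest group contains; this reading of the threshold, the function names and the toy counts are illustrative assumptions, not the pipeline used in the study.

```python
def counts_per_million(counts):
    """Convert raw counts (gene -> list of per-sample counts) to CPM."""
    n_samples = len(next(iter(counts.values())))
    # Library size of each sample: total reads summed over all genes.
    lib_sizes = [sum(gene_counts[s] for gene_counts in counts.values())
                 for s in range(n_samples)]
    return {gene: [1e6 * c / lib_sizes[s] for s, c in enumerate(gene_counts)]
            for gene, gene_counts in counts.items()}


def filter_low_expression(counts, smallest_group_size, min_cpm=1.0):
    """Keep genes with CPM >= min_cpm in at least smallest_group_size samples."""
    cpm = counts_per_million(counts)
    return {gene: counts[gene] for gene, values in cpm.items()
            if sum(v >= min_cpm for v in values) >= smallest_group_size}


# Hypothetical toy counts: three genes measured in four samples.
toy_counts = {
    "geneA": [500, 400, 450, 480],
    "geneB": [300, 600, 550, 520],
    "geneC": [0, 0, 0, 1],  # expressed in a single sample only
}
kept = filter_low_expression(toy_counts, smallest_group_size=2)
```

In this toy example geneC passes the CPM threshold in only one sample, so it is dropped when the smallest group is assumed to contain two samples.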

Topological data analysis (TDA)

To determine the reliability and accuracy of the TDA method presented here, the cohort was divided randomly into two non-overlapping sets, one for training (48 samples) and another for statistical testing (30 samples). Patient demographics and characteristics are presented in Table 2 for the test and training datasets. The average gene expression of ICU samples within the training set was calculated, and the topology of its global differential gene expression was measured by Topological Pathway Mapping (TopMD) without filtering. This topology was then used as a reference against which the topology of the global differential gene expression of each sample was compared. Highly modulated pathways appear as large features of the TopMD maps and represent gene pathways of high importance. We stress that, for the regression analysis performed by Logistic Regression with ElasticNet penalty (see formula below), the TopMD ICU profile used as reference was computed only on the training set.


Table 2 Patient characteristics and demographics grouped by test or train status.

To define a gene signature, TopMD profiles were computed for each patient blood sample and for the ICU average gene expression within the training set, relative to the average of all training set samples. From the training ICU profile, a panel of m genes taken from each of the N TopMD-pathways of highest importance was selected, and a feature matrix was subsequently constructed to perform the logistic regression analysis, as follows.

From the training ICU profile, a reference panel was constructed using the N most important TopMD-pathways and, for each of them, the m most abundant genes. The feature matrix was then constructed by associating each sample with a row and each reference gene with a column, so that the entry (i, j) referred to sample Pi and gene gj. Any matrix entry (i, j) was defined to be 0 whenever the gene gj was not within the TopMD-defined sample panel, that is, when gj was not one of the m most abundant genes within the N most important TopMD-pathways of the Pi TopMD-profile. Otherwise, the entry was the relative gene expression of gj for sample Pi.
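The construction of the feature matrix can be sketched in Python as follows. The representation of a TopMD profile as a list of (importance, {gene: abundance}) pairs, the function names and the toy values are assumptions made for illustration; the actual TopMD data structures are not described in the text.

```python
def select_panel(profile, n_pathways, m_genes):
    """Return the m most abundant genes of each of the N most important pathways.

    `profile` is a list of (importance, {gene: abundance}) tuples, a
    hypothetical stand-in for a TopMD profile.
    """
    top = sorted(profile, key=lambda p: p[0], reverse=True)[:n_pathways]
    panel = []
    for _, genes in top:
        ranked = sorted(genes, key=genes.get, reverse=True)[:m_genes]
        panel.extend(g for g in ranked if g not in panel)
    return panel


def build_feature_matrix(reference_panel, sample_profiles, sample_expression,
                         n_pathways, m_genes):
    """Row i = sample Pi, column j = reference gene gj.

    Entry (i, j) is the relative expression of gj in Pi when gj is in the
    sample's own panel, and 0 otherwise.
    """
    matrix = []
    for profile, expression in zip(sample_profiles, sample_expression):
        sample_panel = set(select_panel(profile, n_pathways, m_genes))
        matrix.append([expression[g] if g in sample_panel else 0.0
                       for g in reference_panel])
    return matrix


# Hypothetical training ICU profile and one hypothetical sample, N=2 and m=2.
icu_profile = [(3.0, {"g1": 2.0, "g2": 1.5, "g3": 0.1}),
               (2.0, {"g4": 1.2, "g5": 0.9}),
               (0.5, {"g6": 3.0})]
p1_profile = [(2.5, {"g1": 1.8, "g2": 0.2, "g3": 1.9}),
              (1.0, {"g4": 0.7, "g5": 0.6}),
              (0.1, {"g6": 0.1})]
p1_expression = {"g1": 1.8, "g2": 0.2, "g3": 1.9,
                 "g4": 0.7, "g5": 0.6, "g6": 0.1}

reference_panel = select_panel(icu_profile, n_pathways=2, m_genes=2)
feature_matrix = build_feature_matrix(reference_panel, [p1_profile],
                                      [p1_expression], n_pathways=2, m_genes=2)
```

Here the reference panel is g1, g2, g4 and g5; gene g2 is not among the two most abundant genes of the sample's first pathway, so its entry is set to 0 even though the sample expresses it.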

For the statistical analysis, the Logistic Regression model, with ElasticNet penalty, was used, defined by the following formula:
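An objective of this kind is commonly written in the form sketched here. The coding of y as −1 or +1, the absence of an intercept term, and the ordering of the penalty terms (l1 first, l2 second, following the description of ρ) are assumptions rather than details confirmed by the text; X_i denotes the i-th row of X and n the number of samples.

```latex
\min_{w} \; \rho \,\lVert w \rVert_{1}
  \;+\; \frac{1-\rho}{2}\, w^{\top} w
  \;+\; C \sum_{i=1}^{n} \log\!\left( 1 + \exp\!\left( -y_{i}\, X_{i}^{\top} w \right) \right)
```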


Here X is the feature matrix, y is the binary classification vector and w is the weights vector. The parameters of this model are C, a regularisation parameter (improving numerical stability), and ρ, which controls the relative strength of l1 and l2 regularisation, respectively the first and second terms in the formula. The best performing panel of genes was selected among all combinations of N and m with values ranging from 1 to 100, given that m ≤ N. The best performing model, with respect to predictive error, was obtained using N=10 TopMD-pathways and m=5 genes. The regression model naturally yields the probability that a sample belongs to the positive class, the ICU class in this case. For statistical testing purposes, each patient blood sample in the test set was predicted to be ICU when this probability was higher than 0.5.
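The role of C, ρ and the 0.5 probability cut-off can be made concrete with a short Python sketch. It assumes labels coded as −1 and +1 in the objective, no intercept, and hypothetical weights and feature rows; it evaluates an ElasticNet-penalised logistic objective and applies the decision rule, and does not reproduce the fitting procedure used in the study.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def icu_probability(x, w):
    """Probability of the positive (ICU) class for one feature row."""
    return sigmoid(sum(xi * wi for xi, wi in zip(x, w)))


def elasticnet_objective(X, y, w, C, rho):
    """l1 penalty + l2 penalty + C * logistic loss (y coded as -1/+1)."""
    l1 = rho * sum(abs(wi) for wi in w)
    l2 = 0.5 * (1.0 - rho) * sum(wi * wi for wi in w)
    loss = sum(math.log1p(math.exp(-yi * sum(a * b for a, b in zip(xi, w))))
               for xi, yi in zip(X, y))
    return l1 + l2 + C * loss


def predict_icu(X, w, threshold=0.5):
    """A sample is predicted ICU when its probability exceeds the threshold."""
    return [icu_probability(x, w) > threshold for x in X]


# Hypothetical feature rows, labels and weights.
X_toy = [[1.8, 0.0, 0.7, 0.6], [0.0, 0.4, 0.0, 0.1]]
y_toy = [1, -1]
w_toy = [1.0, -2.0, 0.5, 0.0]

predictions = predict_icu(X_toy, w_toy)
# With all weights at zero, both penalties vanish and each sample adds log(2).
baseline = elasticnet_objective(X_toy, y_toy, [0.0] * 4, C=1.0, rho=0.5)
```

Setting ρ closer to 1 puts more weight on the l1 term, which tends to drive individual gene weights to exactly zero, while smaller values favour the l2 term.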

Statistical analysis

Statistical testing included a Shapiro-Wilk test to assess data normality, followed by either an unpaired parametric t-test (Shapiro-Wilk test p-value > 0.05) or an unpaired non-parametric Wilcoxon test (Shapiro-Wilk test p-value < 0.05) for continuous data, or a Chi-square test for categorical data.
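The test-selection rule can be written as a small Python function. Two points are assumptions because the text does not specify them: normality is taken to be required in every group being compared, and a Shapiro-Wilk p-value of exactly 0.05 is treated as non-normal. The function name and return labels are illustrative.

```python
def choose_test(is_categorical, shapiro_p_values=(), alpha=0.05):
    """Pick the comparison test for one variable.

    shapiro_p_values holds the Shapiro-Wilk p-value of each group and is
    only needed for continuous variables.
    """
    if is_categorical:
        return "chi-square"
    if not shapiro_p_values:
        raise ValueError("continuous variables need Shapiro-Wilk p-values")
    if all(p > alpha for p in shapiro_p_values):
        return "unpaired t-test"
    return "Wilcoxon"
```

For example, `choose_test(False, [0.20, 0.01])` selects the Wilcoxon test because one group departs from normality.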


Results

To identify transcripts that were predictive of ICU admission for those with COVID-19 disease, the transcriptome of blood samples from infected patients was analysed.

Patient characteristics

These samples were collected through the CoV-19POC trial in early 2020. Of the 78 samples included in this study, 48 were included in the training dataset and 30 in the test dataset. The median age of the study population was 61 (IQR: 46-74); 52 were male (67%) and 26 were female (33%). The most common comorbidities were hypertension (37%), chronic respiratory disease (27%) and diabetes mellitus (24%) (Table 1). Twenty-seven patients were admitted to ICU, of whom 15 died within 30 days of admission. In this dataset there was no difference in sex between those admitted to ICU and those not admitted to ICU (p = 0.61). Age differed between those admitted to ICU and those not admitted to ICU (p = 0.006), with median ages of 56 and 70 years respectively. Table 1 shows that data points from blood chemistry, cytokine/chemokine assessment and physiological metrics were significantly different between the patients admitted to ICU and those not admitted, including white blood cell count, neutrophil count, albumin, LDH, ferritin, CRP, IL-6, IL-33, oxygen saturation, administration of oxygen, NEWS and consolidation of infiltrates. Patient characteristics and demographics are also shown for the test and training split (Table 2).

Machine learning

Random Forest analysis identified a combination of the 30 most important genes that was predictive of ICU admission in the test dataset (Figure 1), achieving an accuracy of 0.73 and a ROC AUC of 0.68. The higher the importance value of a variable (mean decrease in Gini score), the higher the importance of the gene in the model. In this analysis, the gene most associated with the decision to admit to ICU was family with sequence similarity 219 member A (FAM219A).


Figure 1 The importance of genes in a classification of ICU admission with Random Forest. The higher the value of importance of the variable (mean decrease gini score), the higher the importance of the gene(s) in the model.

Topological data analysis

TopMD Pathway Biomarker Analysis defined a model of 79 genes, identified from TopMD clusters, that was predictive of ICU admission in the test dataset with an accuracy of 0.72 and a ROC AUC of 0.76 (Figure 2). The genes of this predictive signature were features of the top 10 pathways, taking the top 10 genes of each, for a total of 79 genes overall; these pathways were differentially activated between patients admitted to ICU and those not admitted to ICU in the training dataset.
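For readers less familiar with the two reported metrics, the following Python sketch shows how accuracy and ROC AUC can be computed from predicted ICU probabilities. The labels and probabilities are hypothetical toy values, not study data; the AUC is computed as the probability that a randomly chosen ICU patient receives a higher score than a randomly chosen non-ICU patient, with ties counted as one half.

```python
def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    predicted = [p > threshold for p in y_prob]
    correct = sum(pred == bool(t) for pred, t in zip(predicted, y_true))
    return correct / len(y_true)


def roc_auc(y_true, y_prob):
    """Pairwise (rank-based) estimate of the area under the ROC curve."""
    pos = [p for t, p in zip(y_true, y_prob) if t]
    neg = [p for t, p in zip(y_true, y_prob) if not t]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical test labels (1 = ICU) and predicted ICU probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0]
y_prob = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

acc = accuracy(y_true, y_prob)
auc = roc_auc(y_true, y_prob)
```

Accuracy depends on the chosen 0.5 cut-off, whereas ROC AUC summarises ranking quality across all cut-offs, which is why a model can have a slightly lower accuracy and a higher ROC AUC than another.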


Figure 2 ROC analysis of the overall performance of the TopMD-defined gene signature predictive of ICU admission. ROC curve with split 62/38, using top 10 pathways with top 10 genes for a total of 79 genes overall.

The top 3 identified pathways predictive of ICU admission are involved in EGFR, PPAR-α and TGFβ signalling pathways

TopMD analysis identified pathways associated with ICU admission by defining and ranking pathways by their topological volume, the sum of normalised differential expression. The gene with the largest fold change was termed the peak gene of the identified pathway. The top pathway had peak gene SNX2, associated with epidermal growth factor receptor (EGFR) signalling, followed by ACAA1, associated with peroxisome proliferator-activated receptor alpha (PPAR-α) signalling, and finally FAM89B, associated with transforming growth factor beta (TGF-β) signalling (Figure 3). Additional peak genes and pathways are presented in Supplementary Figure 1. These consist of peak genes PHETA1, KEAP1, BAIAP2, TRAPPC6A, AGXT, HES1 and CDK5R1, highlighting pathways such as phosphatidylinositol signalling, and glyoxylate and dicarboxylate metabolism (Supplementary Figure 1).
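The ranking described above can be sketched in Python. The sketch assumes that each pathway is available as a mapping from gene to a non-negative normalised differential expression value, and that the peak gene is the one with the largest absolute value; the pathway and gene names are hypothetical, and the way TopMD derives pathways from the expression topology is not reproduced here.

```python
def rank_pathways(pathways):
    """Rank pathways by topological volume and report each peak gene.

    `pathways` maps pathway name -> {gene: normalised differential expression}.
    Returns (name, volume, peak_gene) tuples, largest volume first.
    """
    summary = []
    for name, genes in pathways.items():
        volume = sum(genes.values())  # sum of normalised differential expression
        peak_gene = max(genes, key=lambda g: abs(genes[g]))
        summary.append((name, volume, peak_gene))
    return sorted(summary, key=lambda item: item[1], reverse=True)


# Hypothetical pathways with hypothetical values.
toy_pathways = {
    "pathway_1": {"gA": 0.9, "gB": 0.4},
    "pathway_2": {"gC": 1.5, "gD": 0.6, "gE": 0.2},
}
ranked = rank_pathways(toy_pathways)
```

In this toy example pathway_2 ranks first because its summed differential expression is larger, and gC is reported as its peak gene.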


Figure 3 Differential expression of top genes in the top 3 pathways between patients admitted to ICU and not admitted to ICU of the training set. Connections represent known gene interactions according to STRING-db. (A) SNX2 - controlling epidermal growth factor receptor (EGFR) presentation, (B) ACAA1-peak pathway, representing peroxisome proliferator-activated receptor alpha (PPAR-α) signalling, (C) FAM89B-peak pathway, mediating transforming growth factor beta (TGF-β) signalling. Pathways and genes identified by topological data analysis, TopMD.


Discussion

The emergence of the novel infectious agent SARS-CoV-2 has had a huge impact on healthcare systems worldwide and highlighted the importance of pandemic preparedness and management of limited healthcare resources. Here we demonstrate, using retrospective analysis of gene expression data from patients hospitalised with COVID-19 at the point of admission, that there are markers that can predict the patient’s clinical outcome.

As many other studies have previously identified, there were significant differences in clinical observations and physiological metrics between those who were and were not admitted to the ICU. In this study population, these included white blood cell count, neutrophil count, albumin, LDH, ferritin, CRP, IL-6, IL-33, oxygen saturation, NEWS, and consolidation of infiltrates (Table 1). This is in line with previous studies (17, 32). To further understand the host response in this study population and to determine whether mRNA signatures were able to predict ICU admission, a combination of topological analysis and machine learning was employed to identify genes and related pathways that predict disease.

To test the predictive nature of the model, the data were split randomly into training and test datasets. There were differences in variables between the training and test cohorts (Table 2); such differences in measured variables are expected with high dimensional profiling of randomly split cohorts. The results of this study represent biological mechanisms which are consistent across the training and test cohorts; however, they are unlikely to be the only mechanisms driving COVID-19 disease severity, and others may relate to variables not balanced between the training and test cohorts.

COVID-19 gene expression prognosis studies are limited (33, 34). A scoring algorithm of molecular subphenotypes (SAMS) has been used to identify 50-gene risk profiles for COVID-19 which discriminate between mild and severe disease (33). Such profiles were able to predict ICU admission, the need for mechanical ventilation and mortality with AUCs of 0.77, 0.75 and 0.74 respectively. Immunophenotyping combined with transcriptomic analysis of data derived from COVID-19 patients has led to the discovery of molecules associated with more severe disease; however, no AUC values were presented (34). In our analysis we ranked the top 30 most important genes with random forest, achieving an accuracy of 0.73 and a ROC AUC of 0.68, where FAM219A was identified as the most important variable for predicting ICU admission. FAM219A has been identified as a potential interactor with the SARS-CoV-2 M protein (35); however, the transcript’s function is unknown.

TopMD analysis is an emerging topological data analysis (TDA) technology. When using high dimensional and noisy biological data sets, such as gene expression data, TDA approaches are particularly advantageous and have been successful in disease sub-phenotyping studies (25, 36–40). These approaches facilitate measurement of genes relative to their networks in a disease context, as opposed to the conventional differential abundance analysis traditionally used in biomarker discovery. The TopMD algorithm was applied to gene expression data from COVID-19 patients at the point of admission, with varying care trajectories. Our analysis shows that gene expression signatures in blood predict ICU admission. Gene expression signatures predictive of ICU admission were defined by machine learning (accuracy: 0.73, ROC AUC: 0.68) and by TopMD (accuracy: 0.72, ROC AUC: 0.76). Topological analysis with TopMD improved the ROC AUC of the predictive model in comparison to the machine learning approach, demonstrating the advantages of considering the shape of data relative to underlying biological mechanisms above standard bioinformatic approaches, which rely on statistical analysis of abundances of isolated molecules in vastly reduced, noisy, ‘omics datasets.

The TDA analysis of gene expression relative to pathways by TopMD acts as a global pathway analysis tool, defining patterns of differentially expressed genes with evidenced interactions. The top pathways differentially modulated between patients admitted to ICU and not admitted to ICU were, first, the SNX2-peak pathway, controlling epidermal growth factor receptor (EGFR) presentation; second, the ACAA1-peak pathway, representing peroxisome proliferator-activated receptor alpha (PPAR-α) signalling; and third, the FAM89B-peak pathway, mediating transforming growth factor beta (TGF-β) signalling (Figure 3). SNX2 was the top peak gene identified through TopMD analysis and is associated with EGFR signalling pathways. Dysfunctional EGFR signalling has been identified as a contributing factor to pulmonary fibrotic-like illness during SARS-CoV infections in animal models following the SARS-CoV outbreak in 2002, where the authors speculated that inhibiting EGFR pathways would prevent fibrotic disease (41, 42). This is further supported by similar findings in SARS-CoV-2 infected patients, in whom EGFR was again found to be a regulator of pulmonary fibrosis (43). Inhibiting this pathway with nimotuzumab, a monoclonal antibody against EGFR, was found to decrease inflammatory markers and fibrosis associated with COVID-19 (44, 45). ACAA1, the peak gene of the second top pathway, is representative of PPAR-α signalling. PPAR-α signalling is a key mediator of inflammation and, like EGFR, a potential marker for acute lung injury. Modulation of PPAR-α signalling by SARS-CoV-2 may alter lipid metabolism in lung epithelial cells, contributing to lipotoxicity, inflammation and untoward respiratory effects (46). Therapeutics such as fenofibrate that target PPAR-α have been recommended to enter clinical trials (47). Others have proposed that oleoylethanolamide (OEA), a high-affinity agonist of PPAR-α, and ultramicronised palmitoylethanolamide (PEA) may have therapeutic effects by suppressing inflammatory responses (48, 49); PEA is also able to inhibit SARS-CoV-2 entry and replication (50). Interestingly, others have identified PPAR-α as a potential mediator of neuroinflammation in COVID-19 (51). The third top pathway had peak gene FAM89B, representing the TGF-β signalling pathway, which is also associated with pulmonary fibrosis (52). TGF-β is a known regulator of immune reactions and its signalling is associated with fibrosis (53, 54). In the context of COVID-19, TGF-β gene signatures are observed in plasmablasts following seroconversion and are associated with a chronic immune reaction and severe disease (52). Within the ten pathways, peak gene KEAP1 was identified as a biomarker for ICU admission. KEAP1 is best known for its interaction with Nrf2, facilitating its ubiquitination; exploiting this interaction to manage cytokine storms has been discussed in the context of COVID-19 (55, 56).

A key limitation of this study is that only one time point was considered in the analysis. Although sampling at the point of admission to hospital demonstrates the potential value of the approach as a point-of-care (POC) tool, it does not capture the dynamic element of the disease course; future studies would benefit from gene expression measured at multiple time points. RNA sequencing can take a long time; however, with third-generation sequencing platforms, rapid biomarker discovery and implementation at POC may be possible in the future. RNA sequencing at the bedside for personalised and precision medicine may not be an accessible solution for healthcare systems at this point in time; however, our data and analysis show the potential use of sequencing data for prognosis. As sequencing costs continue to fall and accessibility to sequencing increases, this concept could progress to the bedside. In the case of retrospective analysis, useful pathways can also be identified, informing future research and thus our understanding of disease.

Prognostic gene expression signatures identified here, upon further validation in independent cohorts, could be used to inform management of healthcare resources and improve outcomes of patients with COVID-19. Gene expression signatures measured in global RNAseq transcriptomics data could be applied across health and disease for precision medicine.

Data availability statement

The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: European Genome-Phenome Archive, EGAS00001005971.

Ethics statement

The studies involving human participants were reviewed and approved by South Central Hampshire A Research Ethics Committee. The patients/participants provided their written informed consent to participate in this study.

Author contributions

TC and DB conceptualised the study. SP and NB screened and recruited the patients and collected the data in the CoV-19POC trials. RP-R and CH performed sample processing and experiments. RP-R, XD, AG, JS, and JH performed data analysis. RP-R, AG, JS, and JH drafted and edited the article. All authors read and approved the final manuscript.


Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: the CoV-19POC trial was funded by University Hospital Southampton Foundation Trust (UHSFT). In addition, the CoV-19POC trials were supported by the NIHR Southampton Clinical Research Facility and NIHR Southampton Biomedical Research Centre (BRC). JLe was supported by a PhD studentship from the NIHR Southampton BRC (no. NIHR-INF-0932). NB was supported by the NIHR Clinical Lecturer scheme. JH, RP-R, CH, and XD were supported by the US Food and Drug Administration Medical Countermeasures Initiative (no. 75F40120C00085) awarded to JH. MP was supported by a Sir Henry Dale Fellowship from the Wellcome Trust and The Royal Society (no. 109377/Z/15/Z). TC was supported by a NIHR Post-Doctoral Fellowship (no. 2016-09-061). DB and her laboratory are supported by a NIHR Research Professorship (no. RP-2016-07-011). TopMD, the University of Southampton and the University of Liverpool are members of the DRAGON consortium, which received funding from the Innovative Medicines Initiative 2 Joint Undertaking (JU) under grant agreement no. 101005122. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.


Acknowledgments

The authors would like to acknowledge and give thanks to the patients who kindly participated in this study and to all the clinical staff at University Hospital Southampton Foundation Trust who cared for them. The DRAGON consortium is a group of high-tech SMEs, academic research institutes, biotech and pharma partners, affiliated patient-centered organisations and professional societies aiming to apply artificial intelligence for improved and more rapid diagnosis and prognosis in COVID-19. Further details may be found at

Conflict of interest

TC has received speaker fees, honoraria, travel reimbursement, and equipment and consumables free of charge for the purposes of research from BioFire diagnostics LLC and BioMerieux. TC has received discounted equipment and consumables for the purposes of research from QIAGEN. TC has received consultancy fees from BioFire diagnostics LLC, BioMerieux, Synairgen research Ltd, Randox laboratories Ltd and Cidara therapeutics. TC has been a member of advisory boards for Roche and Janssen and has received reimbursement for these. TC is a member of two independent data monitoring committees for trials sponsored by Roche. TC has previously acted as the UK chief investigator for trials sponsored by Janssen. TC is currently a member of the NHSE COVID-19 Testing Technologies Oversight Group and the NHSE COVID-19 Technologies Validation Group. JS is a founding director, CEO, employee, and shareholder in TopMD Precision Medicine Ltd. FS is a founding director, CTO, employee, and shareholder in TopMD Precision Medicine Ltd. PS is a founding director, employee and shareholder in TopMD Precision Medicine Ltd. AG is an employee and shareholder in TopMD Precision Medicine Ltd. RP-R is an employee at TopMD Precision Medicine Ltd.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at:


References

1. Dorward DA, Russell CD, Um IH, Elshani M, Armstrong SD, Penrice-Randal R, et al. Tissue-specific immunopathology in fatal COVID-19. Am J Respir Crit Care Med (2021) 203(2):192–201. doi: 10.1164/rccm.202008-3265OC


2. Russell CD, Valanciute A, Gachanja NN, Stephen J, Penrice-Randal R, Armstrong SD, et al. Tissue proteomic analysis identifies mechanisms and stages of immunopathology in fatal COVID-19. Am J Respir Cell Mol Biol (2022) 66(2):196–205. doi: 10.1165/rcmb.2021-0358OC


3. Lal A, Erondu NA, Heymann DL, Gitahi G, Yates R. Fragmented health systems in COVID-19: rectifying the misalignment between global health security and universal health coverage. Lancet (2021) 397(10268):61–7. doi: 10.1016/S0140-6736(20)32228-5

4. Liu X, Speranza E, Muñoz-Fontela C, Haldenby S, Rickett NY, Garcia-Dorival I, et al. Transcriptomic signatures differentiate survival from fatal outcomes in humans infected with Ebola virus. Genome Biol (2017) 18(1):4. doi: 10.1186/s13059-016-1137-3

5. Ye W, Chen G, Li X, Lan X, Ji C, Hou M, et al. Dynamic changes of d-dimer and neutrophil-lymphocyte count ratio as prognostic biomarkers in COVID-19. Respir Res (2020) 21(1):169. doi: 10.1186/s12931-020-01428-7

6. Aboughdir M, Kirwin T, Abdul Khader A, Wang B. Prognostic value of cardiovascular biomarkers in COVID-19: A review. Viruses (2020) 12(5):527. doi: 10.3390/v12050527

7. Danlos FX, Grajeda-Iglesias C, Durand S, Sauvat A, Roumier M, Cantin D, et al. Metabolomic analyses of COVID-19 patients unravel stage-dependent and prognostic biomarkers. Cell Death Dis (2021) 12(3):258. doi: 10.1038/s41419-021-03540-y

8. Ponti G, Maccaferri M, Ruini C, Tomasi A, Ozben T. Biomarkers associated with COVID-19 disease progression. Crit Rev Clin Lab Sci (2020) 57(6):389–99. doi: 10.1080/10408363.2020.1770685

9. Kermali M, Khalsa RK, Pillai K, Ismail Z, Harky A. The role of biomarkers in diagnosis of COVID-19 - a systematic review. Life Sci (2020) 254:117788. doi: 10.1016/j.lfs.2020.117788

10. Zahran AM, El-Badawy O, Ali WA, Mahran ZG, Mahran E, Rayan A. Circulating microparticles and activated platelets as novel prognostic biomarkers in COVID-19; relation to cancer. PLoS One (2021) 16(2):e0246806. doi: 10.1371/journal.pone.0246806

11. Carr E, Bendayan R, Bean D, Stammers M, Wang W, Zhang H, et al. Evaluation and improvement of the national early warning score (NEWS2) for COVID-19: A multi-hospital study. BMC Med (2021) 19(1):23. doi: 10.1186/s12916-020-01893-3

12. Legebeke J, Lord J, Penrice-Randal R, Vallejo AF, Poole S, Brendish NJ, et al. Evaluating the immune response in treatment-naive hospitalised patients with influenza and COVID-19. Front Immunol (2022) 13:853265. doi: 10.3389/fimmu.2022.853265

13. Goulden R, Hoyle MC, Monis J, Railton D, Riley V, Martin P, et al. qSOFA, SIRS and NEWS for predicting inhospital mortality and ICU admission in emergency admissions treated as sepsis. Emerg Med J (2018) 35(6):345–9. doi: 10.1136/emermed-2017-207120

14. Glass G, Hartka TR, Keim-Malpass J, Enfield KB, Clark MT. Dynamic data in the ED predict requirement for ICU transfer following acute care admission. J Clin Monit Comput (2021) 35(3):515–23. doi: 10.1007/s10877-020-00500-3

15. Rodriguez-Nava G, Yanez-Bello MA, Trelles-Garcia DP, Chung CW, Friedman HJ, Hines DW. Performance of the quick COVID-19 severity index and the brescia-COVID respiratory severity scale in hospitalized patients with COVID-19 in a community hospital setting. Int J Infect Dis (2021) 102:571–6. doi: 10.1016/j.ijid.2020.11.003

16. Wang AZ, Ehrman R, Bucca A, Croft A, Glober N, Holt D, et al. Can we predict which COVID-19 patients will need transfer to intensive care within 24 hours of floor admission? Acad Emerg Med (2021) 28(5):511–8. doi: 10.1111/acem.14245

17. Hao B, Sotudian S, Wang T, Xu T, Hu Y, Gaitanidis A, et al. Early prediction of level-of-care requirements in patients with COVID-19. Elife (2020) 9:e60519. doi: 10.7554/eLife.60519

18. Joshi N, Fass J. Sickle: A sliding-window, adaptive, quality-based trimming tool for FastQ files (2011). Available at:

19. Moore AR, Roque J, Shaller BT, Asuni T, Remmel M, Rawling D, et al. Prospective validation of an 11-gene mRNA host response score for mortality risk stratification in the intensive care unit. Sci Rep (2021) 11(1):13062. doi: 10.1038/s41598-021-91201-7

20. Banerjee S, Mohammed A, Wong HR, Palaniyar N, Kamaleswaran R. Machine learning identifies complicated sepsis course and subsequent mortality based on 20 genes in peripheral blood immune cells at 24 h post-ICU admission. Front Immunol (2021) 12:592303. doi: 10.3389/fimmu.2021.592303

21. Oberg JA, Glade Bender JL, Sulis ML, Pendrick D, Sireci AN, Hsiao SJ, et al. Implementation of next generation sequencing into pediatric hematology-oncology practice: Moving beyond actionable alterations. Genome Med (2016) 8(1):133. doi: 10.1186/s13073-016-0389-6

22. Qiao B, Zhao M, Wu J, Wu H, Zhao Y, Meng F, et al. A novel RNA-Seq-Based model for preoperative prediction of lymph node metastasis in oral squamous cell carcinoma. BioMed Res Int (2020) 2020:4252580. doi: 10.1155/2020/4252580

23. Schieffer KM, Choi CS, Emrich S, Harris L, Deiling S, Karamchandani DM, et al. RNA-Seq implicates deregulation of the immune system in the pathogenesis of diverticulitis. Am J Physiol Gastrointest Liver Physiol (2017) 313(3):G277–84. doi: 10.1152/ajpgi.00136.2017

24. Seco-Cervera M, González-Rodríguez D, Ibáñez-Cabellos JS, Peiró-Chova L, Pallardó FV, García-Giménez JL. Small RNA-seq analysis of circulating miRNAs to identify phenotypic variability in friedreich’s ataxia patients. Sci Data (2018) 5:180021. doi: 10.1038/sdata.2018.21

25. Bigler J, Boedigheimer M, Schofield JPR, Skipp PJ, Corfield J, Rowe A, et al. A severe asthma disease signature from gene expression profiling of peripheral blood from U-BIOPRED cohorts. Am J Respir Crit Care Med (2017) 195(10):1311–20. doi: 10.1164/rccm.201604-0866OC

26. Nicolau M, Levine AJ, Carlsson G. Topology based data analysis identifies a subgroup of breast cancers with a unique mutational profile and excellent survival. Proc Natl Acad Sci U.S.A. (2011) 108(17):7265–70. doi: 10.1073/pnas.1102826108

27. Sjoberg DD, Whiting K, Curry M, Lavery JA, Larmarange J. Reproducible summary tables with the gtsummary package. R J (2021) 13:570–80. doi: 10.32614/RJ-2021-053

28. Martin M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal (2011) 17(1):10–2. doi: 10.14806/ej.17.1.200

29. Kim D, Langmead B, Salzberg SL. HISAT: a fast spliced aligner with low memory requirements. Nat Methods (2015) 12(4):357–60. doi: 10.1038/nmeth.3317

30. Liao Y, Smyth GK, Shi W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics (2014) 30(7):923–30. doi: 10.1093/bioinformatics/btt656

31. Breiman L. Bagging predictors. Mach Learn (1996) 24(2):123–40. doi: 10.1007/BF00058655

32. Henry BM, de Oliveira MHS, Benoit S, Plebani M, Lippi G. Hematologic, biochemical and immune biomarker abnormalities associated with severe illness and mortality in coronavirus disease 2019 (COVID-19): A meta-analysis. Clin Chem Lab Med (CCLM) (2020) 58(7):1021–8. doi: 10.1515/cclm-2020-0369

33. Juan Guardela BM, Sun J, Zhang T, Xu B, Balnis J, Huang Y, et al. 50-gene risk profiles in peripheral blood predict COVID-19 outcomes: A retrospective, multicenter cohort study. EBioMedicine (2021) 69:103439. doi: 10.1016/j.ebiom.2021.103439

34. Russick J, Foy PE, Josseaume N, Meylan M, Hamouda NB, Kirilovsky A, et al. Immune signature linked to COVID-19 severity: A SARS-score for personalized medicine. Front Immunol (2021) 12:701273. doi: 10.3389/fimmu.2021.701273

35. Chen Z, Wang C, Feng X, Nie L, Tang M, Zhang H, et al. Interactomes of SARS-CoV-2 and human coronaviruses reveal host factors potentially affecting pathogenesis. EMBO J (2021) 40(17):e107776. doi: 10.15252/embj.2021107776

36. Schofield JPR, Burg D, Nicholas B, Strazzeri F, Brandsma J, Staykova D, et al. Stratification of asthma phenotypes by airway proteomic signatures. J Allergy Clin Immunol (2019) 144(1):70–82. doi: 10.1016/j.jaci.2019.03.013

37. Tariq K, Schofield JPR, Nicholas BL, Burg D, Brandsma J, Bansal AT, et al. Sputum proteomic signature of gastro-oesophageal reflux in patients with severe asthma. Respir Med (2019) 150:66–73. doi: 10.1016/j.rmed.2019.02.008

38. De Meulder B, Lefaudeux D, Bansal AT, Mazein A, Chaiboonchoe A, Ahmed H, et al. A computational framework for complex disease stratification from multiple large-scale datasets. BMC Syst Biol (2018) 12(1):60. doi: 10.1186/s12918-018-0556-z

39. Östling J, van Geest M, Schofield JPR, Jevnikar Z, Wilson S, Ward J, et al. IL-17-high asthma with features of a psoriasis immunophenotype. J Allergy Clin Immunol (2019) 144(5):1198–213. doi: 10.1016/j.jaci.2019.03.027

40. Schofield JPR, Strazzeri F, Bigler J, Boedigheimer M, Adcock IM, Chung KF, et al. Morse-Clustering of a topological data analysis network identifies phenotypes of asthma based on blood gene expression profiles. bioRxiv (2020) 516328. doi: 10.1101/516328

41. Venkataraman T, Coleman CM, Frieman MB. Overactive epidermal growth factor receptor signaling leads to increased fibrosis after severe acute respiratory syndrome coronavirus infection. J Virol (2017) 91(12):e00182-17. doi: 10.1128/JVI.00182-17

42. Venkataraman T, Frieman MB. The role of epidermal growth factor receptor (EGFR) signaling in SARS coronavirus-induced pulmonary fibrosis. Antiviral Res (2017) 143:142–50. doi: 10.1016/j.antiviral.2017.03.022

43. Vagapova ER, Lebedev TD, Prassolov VS. Viral fibrotic scoring and drug screen based on MAPK activity uncovers EGFR as a key regulator of COVID-19 fibrosis. Sci Rep (2021) 11(1):11234. doi: 10.1038/s41598-021-90701-w

44. Londres HD, Armada JJ, Martínez AH, Abdo Cuza AA, Sánchez YH, Rodríguez AG, et al. Blocking EGFR with nimotuzumab: A novel strategy for COVID-19 treatment. Immunotherapy (2022) 14(7):521–30. doi: 10.2217/imt-2022-0027

45. Abdo Cuza AA, Ávila JP, Martínez RM, González JJ, Aspuro GP, Gutiérrez Martínez JA, et al. Nimotuzumab for COVID-19: Case series. Immunotherapy (2022) 14(3):185–93. doi: 10.2217/imt-2021-0269

46. Heffernan KS, Ranadive SM, Jae SY. Exercise as medicine for COVID-19: On PPAR with emerging pharmacotherapy. Med Hypotheses (2020) 143:110197. doi: 10.1016/j.mehy.2020.110197

47. Buschard K. Fenofibrate increases the amount of sulfatide which seems beneficial against covid-19. Med Hypotheses (2020) 143:110127. doi: 10.1016/j.mehy.2020.110127

48. Del Re A, Corpetti C, Pesce M, Seguella L, Steardo L, Palenca I, et al. Ultramicronized palmitoylethanolamide inhibits NLRP3 inflammasome expression and pro-inflammatory response activated by SARS-CoV-2 spike protein in cultured murine alveolar macrophages. Metabolites (2021) 11(9):592. doi: 10.3390/metabo11090592

49. Akbari N, Ostadrahimi A, Tutunchi H, Pourmoradian S, Farrin N, Najafipour F, et al. Possible therapeutic effects of boron citrate and oleoylethanolamide supplementation in patients with COVID-19: A pilot randomized, double-blind, clinical trial. J Trace Elem Med Biol (2022) 71:126945. doi: 10.1016/j.jtemb.2022.126945

50. Fonnesu R, Thunuguntla V, Veeramachaneni GK, Bondili JS, La Rocca V, Filipponi C, et al. Palmitoylethanolamide (PEA) inhibits SARS-CoV-2 entry by interacting with s protein and ACE-2 receptor. Viruses (2022) 14(5):1080. doi: 10.3390/v14051080

51. Flannery LE, Kerr DM, Hughes EM, Kelly C, Costello J, Thornton AM, et al. N-acylethanolamine regulation of TLR3-induced hyperthermia and neuroinflammatory gene expression: A role for PPARα. J Neuroimmunol (2021) 358:577654. doi: 10.1016/j.jneuroim.2021.577654

52. Ferreira-Gomes M, Kruglov A, Durek P, Heinrich F, Tizian C, Heinz GA, et al. SARS-CoV-2 in severe COVID-19 induces a TGF-β-dominated chronic immune response that does not target itself. Nat Commun (2021) 12(1):1961. doi: 10.1038/s41467-021-22210-3

53. Beller A, Kruglov A, Durek P, von Goetze V, Werner K, Heinz GA, et al. Specific microbiota enhances intestinal IgA levels by inducing TGF-β in T follicular helper cells of peyer’s patches in mice. Eur J Immunol (2020) 50(6):783–94. doi: 10.1002/eji.201948474

54. Lee CG, Homer RJ, Zhu Z, Lanone S, Wang X, Koteliansky V, et al. Interleukin-13 induces tissue fibrosis by selectively stimulating and activating transforming growth factor beta(1). J Exp Med (2001) 194(6):809–21. doi: 10.1084/jem.194.6.809

55. Singh E, Matada GSP, Abbas N, Dhiwar PS, Ghara A, Das A. Management of COVID-19-induced cytokine storm by Keap1-Nrf2 system: A review. Inflammopharmacology (2021) 29(5):1347–55. doi: 10.1007/s10787-021-00860-5

56. Ulasov AV, Rosenkranz AA, Georgiev GP, Sobolev AS. Nrf2/Keap1/ARE signaling: Towards specific regulation. Life Sci (2022) 291:120111. doi: 10.1016/j.lfs.2021.120111

Keywords: COVID-19, Critical Care, biomarkers, prognosis, topology, transcriptome, RNA-seq - RNA sequencing

Citation: Penrice-Randal R, Dong X, Shapanis AG, Gardner A, Harding N, Legebeke J, Lord J, Vallejo AF, Poole S, Brendish NJ, Hartley C, Williams AP, Wheway G, Polak ME, Strazzeri F, Schofield JPR, Skipp PJ, Hiscox JA, Clark TW and Baralle D (2022) Blood gene expression predicts intensive care unit admission in hospitalised patients with COVID-19. Front. Immunol. 13:988685. doi: 10.3389/fimmu.2022.988685

Received: 07 July 2022; Accepted: 29 August 2022;
Published: 20 September 2022.

Edited by:

Martijn van Griensven, Maastricht University, Netherlands

Reviewed by:

Diane Marie Del Valle, Icahn School of Medicine at Mount Sinai, United States
Mattia Bellan, Università del Piemonte Orientale, Italy

Copyright © 2022 Penrice-Randal, Dong, Shapanis, Gardner, Harding, Legebeke, Lord, Vallejo, Poole, Brendish, Hartley, Williams, Wheway, Polak, Strazzeri, Schofield, Skipp, Hiscox, Clark and Baralle. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Rebekah Penrice-Randal,

†These authors share senior authorship