A Novel Framework for Phenotyping Children With Suspected or Confirmed Infection for Future Biomarker Studies

Background: The limited diagnostic accuracy of biomarkers in children at risk of a serious bacterial infection (SBI) might be due to the imperfect reference standard of SBI. We aimed to evaluate the diagnostic performance of a new classification algorithm for biomarker discovery in children at risk of SBI. Methods: We used data from five previously published, prospective observational biomarker discovery studies, which included patients aged 0 to <16 years: the Alder Hey emergency department (n = 1,120), Alder Hey pediatric intensive care unit (n = 355), Erasmus emergency department (n = 1,993), Maasstad emergency department (n = 714) and St. Mary's hospital (n = 200) cohorts. Biomarkers including procalcitonin (PCT) (4 cohorts), neutrophil gelatinase-associated lipocalin-2 (NGAL) (3 cohorts) and resistin (2 cohorts) were compared for their ability to classify patients according to current standards (dichotomous classification of SBI vs. non-SBI), vs. a proposed PERFORM classification algorithm that assigns patients to one of eleven categories. These categories were based on clinical phenotype, test outcomes and C-reactive protein level and accounted for the uncertainty of final diagnosis in many febrile children. The performance of the biomarkers was measured by the area under the receiver operating characteristic curve (AUC) when they were used individually or in combination. Results: Using the new PERFORM classification system, patients with clinically confident bacterial diagnosis (“definite bacterial” category) had significantly higher levels of PCT, NGAL and resistin compared with those with a clinically confident viral diagnosis (“definite viral” category). Patients with diagnostic uncertainty had biomarker concentrations that varied across the spectrum. AUCs were higher for classification of “definite bacterial” vs. “definite viral” following the PERFORM algorithm than using the “SBI” vs. “non-SBI” classification; summary AUC for PCT was 0.77 (95% CI 0.72–0.82) vs. 0.70 (95% CI 0.65–0.75); for NGAL this was 0.80 (95% CI 0.69–0.91) vs. 0.70 (95% CI 0.58–0.81); for resistin this was 0.68 (95% CI 0.61–0.75) vs. 0.64 (95% CI 0.58–0.69). The three biomarkers combined had a summary AUC of 0.83 (95% CI 0.77–0.89) for “definite bacterial” vs. “definite viral” infections and 0.71 (95% CI 0.67–0.74) for “SBI” vs. “non-SBI.” Conclusion: Biomarkers of bacterial infection were strongly associated with the diagnostic categories using the PERFORM classification system in five independent cohorts. Our proposed algorithm provides a novel framework for phenotyping children with suspected or confirmed infection for future biomarker studies.


INTRODUCTION
Amongst the many children presenting with febrile illness to healthcare, a minority have serious bacterial infections (SBI), and of those, only the rare few are admitted to intensive care units or have a fatal outcome (Figure 1). Clinical signs and symptoms alone lack the ability to reliably differentiate between children with viral and bacterial infection, and accurate biomarkers are urgently needed (1). SBIs are still one of the leading causes of childhood mortality and morbidity in both high income as well as low and middle income countries (2,3). The increasing global burden of antimicrobial resistance has amplified the need for improved diagnostics, and there has been an emphasis on the need for "rule-out" tests for bacterial infection, so that antibiotic treatment can be reserved for those needing treatment, irrespective of the presence of coincident viral infection. In addition, sepsis campaigns have promoted early identification of children at risk of sepsis and the early escalation of care, including prompt administration of broad spectrum antibiotics, making improved biomarkers to guide management even more urgent (4)(5)(6)(7)(8). In high income countries, an increasing proportion of children who present to the emergency department (ED) with SBI have pre-existing co-morbidities, which can make the diagnosis more challenging (9,10). The economic impact of diagnostic uncertainty when managing pediatric febrile illness is significant, with the precautionary use of antibiotics being associated with increased costs (11).
With effective antivirals emerging or in the pipeline for common and important viral illnesses [including coronavirus and Respiratory Syncytial Virus (RSV)], identification of when antibiotics are required will be insufficient to guide accurate treatment. With both viral and bacterial illnesses requiring targeted treatment, and the possibility of one or both being present, successful infection biomarkers must make a more nuanced diagnosis.
FIGURE 1 | Child with fever: patient journey in order of likely outcome. A small proportion of children presenting to the ED with a febrile illness have a confirmed Serious Bacterial Infection (SBI), and of these a smaller number require admission to hospital or PICU, as shown in the pyramid as a percentage of the total number of febrile children in ED. The data were collected in the MOFICHE study (Management and Outcome of Fever in Children in Europe, n = 38,480) as part of the EU Horizon 2020-funded PERFORM study (Personalized Risk assessment in Febrile illness to Optimize Real-life Management across the European Union, www.perform2020.org). The MOFICHE study was an observational study in twelve EDs in eight different European countries [Austria, Germany, Greece, Latvia, the Netherlands (n = 3), Spain, Slovenia and the United Kingdom (n = 3)], which recorded clinical data on consecutive children with febrile illness in 2017–2018 (78). There were no fatal cases of SBI in the MOFICHE study, but 1 case of fatal viral gastro-enteritis; PICU admission with SBI: 39 (25%) out of total of 158 PICU admissions; hospital admission with SBI: 1,947 (20%) out of total of 9,893 admissions; *the MOFICHE study reflects death in ED, not overall mortality. ED, emergency department; SBI, serious bacterial infection; PICU, pediatric intensive care unit.
Moreover, in childhood in particular, the incidence of bacterial infections has decreased considerably since the introduction of conjugate vaccines; it follows that the proportion of children presenting with febrile illness who have alternative diagnoses, including inflammatory conditions, is increasing. In addition to the decreasing burden of bacterial infection, the increased recognition of inflammatory illness in children may reflect changes in ascertainment, as well as true increases in incidence, as seen in the case of Kawasaki disease (12).

DEFINING BACTERIAL INFECTIONS
Traditionally, most biomarker discovery studies have ascertained bacterial etiology based on bacterial detection by culture or PCR in a sterile site (including urine, CSF, blood; often referred to as "invasive bacterial infections") (13)(14)(15). A patient without this evidence will then typically be classed as nonbacterial. Some studies, in particular those with a focus on pragmatic clinical translation, include positive cultures from non-sterile sites (e.g., throat swab, stools, skin) and imaging results such as radiographical changes on chest X-ray (e.g., to define bacterial pneumonia), CT or MRI (e.g., to define mastoiditis). Furthermore, intra-operative findings and histology (e.g., appendicitis, septic arthritis) or a clinical diagnosis without microbiological evidence (e.g., abscess, cellulitis) might be added to the outcome reference standard. This broader definition of complicated bacterial infections is often referred to as "serious bacterial infections" (Appendix A in Supplementary Material). In many studies, an expert opinion will be included to agree on the most appropriate final diagnosis. This has proven a fairly robust, but labor intensive approach to ensuring reproducibility between study outcomes (16,17).
One of the major drawbacks of using bacterial cultures for confirming "definite bacterial" infection is their limited sensitivity ( Table 1) (18). The sensitivity is directly related to the volume sampled, which is a well-known problem for blood cultures in neonates and children (19), to the prior use of antibiotics, which is very common in some settings (20), to culture techniques and to the types of pathogens. Other limitations of cultures of sterile sites are the high rates of contaminants, at times as high as the rate of true bacterial pathogens (21), and whether or not the site of infection can be sampled directly. Molecular strategies are now being employed to optimize the capture rate of pathogens in addition to conventional blood cultures. For instance, meningococcal PCR is already considered the gold standard confirmatory test (22,23). In the UK-based multicenter DINOSAUR study molecular techniques improved the number of pathogens detected in children with convincing evidence of infective osteomyelitis or septic arthritis (24,25). Molecular diagnostic panels, such as Septifast, Sepsitest, and Biofire Filmarray (26)(27)(28)(29)(30), have been shown to increase the number of positive findings in blood with relatively short turnaround time (31,32). However, problems of sensitivity and specificity persist (33,34), and studies that combine molecular and culture approaches still have disappointing diagnostic yield. For example, in the large observational study of children with life-threatening infection admitted to hospitals across several European countries (EUCLIDS), more than half of the children with a serious infection did not have a definitive causative pathogen identified, despite extensive diagnostic work-up (35). In addition, with bacterial identification, usually from non-sterile sites, the distinction between acute infection and carriage is often unclear, particularly in patients with co-morbidities.
Defining bacterial pneumonia, the most common SBI with an overall mortality of 6.4 per 100,000 for children aged 5 years and under in high income countries (2, 3), is particularly challenging without a gold-standard diagnostic test. As collecting suitable diagnostic biosamples for the lower respiratory tract in children is difficult, a diagnosis will often be made based on chest X-ray changes or on clinical grounds alone, both of which are unreliable for establishing a definitive diagnosis of community acquired pneumonia (36,37). Guidance by the World Health Organisation, albeit mostly applicable to lower income countries without referral capacity, recommends antibiotic treatment for community acquired pneumonia on fast breathing alone (38). Recent studies have improved our understanding of the etiology of childhood pneumonia using more elaborate diagnostic platforms. In the PERCH ("Pneumonia Etiology Research for Child Health") study, conducted in several lower and middle income countries, viruses were the causative pathogen in the majority of children with pneumonia, with RSV most commonly identified, in approximately one third of children (39). However, in the absence of a sensitive diagnostic test, bacterial involvement cannot be ruled out as contributory. A North American study on childhood pneumonia demonstrated that multiple viruses or bacteria or both viruses and bacteria were isolated in many children, in line with current thinking that respiratory tract infections are the result of complex mechanisms involving multiple organisms and varying host immune responses (40,41). Some viruses, in particular RSV and influenza virus, were more likely to be associated with disease than carriage (42)(43)(44).

ROLE OF BIOMARKERS
TABLE 1 (partial) | Surviving entries: differences in patient populations and the case-mix of settings; differences in incidence of SBI; differences in epidemiology: seasonality and endemic disease (a); cell cytopenia and limited protein, metabolite, or RNA yield. (a) Differences in vaccination schemes and status, as well as seasonality and endemic disease can change the pre-test probabilities of the individual patient of having or not having (a specific type of) SBI, altering the interpretation of a biomarker result and its effect on the post-test probabilities and making a diagnosis of (a specific type of) SBI more or less likely; e.g., seasonality of enterovirus and influenzavirus can lead to different interpretation of biomarker value. ED, emergency department; PICU, pediatric intensive care unit; SBI, serious bacterial infection.
Many biomarkers have been proposed for differentiating viral and bacterial infections (45). The most evidence is available for CRP and PCT, and they appear equally useful in many clinical areas, even though neither can be used for diagnosing SBI with confidence in isolation (13,46). PCT performs slightly better in young infants and neonates, and children with a short duration of fever compared with CRP (47), reflecting the differences in physiological inflammatory responses and time to elevated levels of CRP and PCT after stimulus (48). Despite the evidence available on its limited diagnostic utility, white cell count is still commonly used (13). Many other biomarkers have been explored, some with very promising initial results. For example, CD64 was useful in PICU settings, but did not validate well in ED settings (Table 1). Other markers of bacterial infection, such as neutrophil gelatinase-associated lipocalin-2 (NGAL) and resistin, have shown promise across a range of clinical settings but have not been integrated in clinical practice yet. The pressure to improve early treatment of true bacterial infection, whilst avoiding unnecessary treatments, set against the decreasing incidence of bacterial illness and increasing incidence of inflammatory conditions makes the case for novel, accurate diagnostic strategies more compelling (49). Yet, despite many promising candidates (1,50,51), few biomarkers complete the journey from discovery to translation (52). An important obstacle in the development of bacterial biomarkers remains the lack of a consistent reference standard to classify SBIs, often aiming to capture a heterogeneous mix of causative pathogens and clinical phenotypes, not easily captured with a single, or minimal set of, biomarker(s) (35,51,53). Another obstacle to translation arises when biomarkers are discovered and perform well in high-incidence settings, for instance in severely unwell children in PICU with a clear or extreme presentation, but have poor performance in a low-incidence setting like emergency departments where they are most needed, where diagnostic uncertainty is higher, and clinical presentations less clear-cut.
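The effect of incidence on the interpretation of a biomarker result can be illustrated with Bayes' rule in odds form (post-test odds = pre-test odds × likelihood ratio). The sketch below is not part of the authors' analysis: the sensitivity and specificity are hypothetical values chosen only for illustration, the function name is ours, and the two pre-test probabilities are the lowest and highest SBI proportions reported in the Results (12% and 54%). It also assumes that test characteristics stay constant across settings, which in practice they may not.

```python
def post_test_probability(pre_test, sensitivity, specificity, positive=True):
    """Convert a pre-test probability into a post-test probability.

    Uses Bayes' rule in odds form:
    post-test odds = pre-test odds x likelihood ratio.
    """
    for name, value in (("pre_test", pre_test),
                        ("sensitivity", sensitivity),
                        ("specificity", specificity)):
        if not 0.0 < value < 1.0:
            raise ValueError(f"{name} must lie strictly between 0 and 1")

    if positive:
        # Likelihood ratio of a positive result.
        likelihood_ratio = sensitivity / (1.0 - specificity)
    else:
        # Likelihood ratio of a negative result.
        likelihood_ratio = (1.0 - sensitivity) / specificity

    pre_odds = pre_test / (1.0 - pre_test)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Hypothetical test characteristics, chosen only for illustration.
SENSITIVITY = 0.80
SPECIFICITY = 0.80

# SBI proportions of the lowest- and highest-incidence cohorts in this paper.
SETTINGS = {"Erasmus ED": 0.12, "St. Mary's hospital": 0.54}

if __name__ == "__main__":
    for setting, prevalence in SETTINGS.items():
        after_pos = post_test_probability(prevalence, SENSITIVITY, SPECIFICITY, True)
        after_neg = post_test_probability(prevalence, SENSITIVITY, SPECIFICITY, False)
        print(f"{setting}: pre-test {prevalence:.2f}, "
              f"after positive {after_pos:.2f}, after negative {after_neg:.2f}")
```

Under these assumed test characteristics, a positive result raises the probability of SBI to roughly 0.35 in the 12% setting but to roughly 0.82 in the 54% setting, which is one way to see why a biomarker that looks convincing in a high-incidence cohort can disappoint in the ED.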
As different types of bacterial infections might need different diagnostic and management strategies, it seems unrealistic for biomarkers to be equally predictive for all. Some studies use a polytomous approach, allowing for different types of bacterial infections in their modeling (51), whilst others have looked at a single bacterial infection (54)(55)(56)(57)(58).
Future biomarker strategies, drawn from multi-omic discovery approaches, may enable classification of a wide range of febrile illnesses spanning bacterial and viral illness, other infections and inflammatory conditions, and also include other variables such as disease severity or prognosis. It is therefore essential that phenotyping approaches can classify the full range of presentations that such a multi-class testing approach would need to diagnose. With this paper we propose a novel classification framework to guide the design and evaluation of biomarker discovery and validation for childhood febrile illness, one which reflects the complex interplay between bacterial, viral and inflammatory illnesses. By means of illustrative validation studies using five prospective cohorts of children with infections, we aimed to evaluate the diagnostic performance of a new classification algorithm for biomarker discovery in children at risk of SBI.

METHODOLOGY
Using five prospective, previously published cohorts of children aged <16 years that were used for biomarker discovery and validation studies (Table 2) (59–62), we assessed the performance of the biomarkers procalcitonin (PCT, 4 cohorts), neutrophil gelatinase-associated lipocalin-2 (NGAL, 3 cohorts) and resistin (2 cohorts) to classify patients as having "bacterial" infection. These biomarkers were measured in each of the local reference laboratories, as detailed in the original publications.
The Alder Hey ED (n = 1,183), Maasstad ED (n = 714) and Erasmus ED (n = 1,993) cohorts recruited consecutive febrile children presenting to the emergency department in whom additional blood tests were done; the research biosamples for the ED cohorts were taken as additional samples at the time of taking the initial blood tests, and ideally before the administration of systemic antibiotics. The Alder Hey PICU cohort included children with suspected infection in the pediatric intensive care unit (n = 352), with research biosamples taken on admission to PICU or at the time of developing an infection during PICU stay; the St. Mary's hospital cohort recruited acutely ill febrile children admitted to pediatric wards or intensive care (n = 394), and research biosamples were taken at the earliest opportunity during the patient's inpatient stay.
We then re-categorized the children of these four cohorts, blinded to our biomarkers of interest, from the original dichotomous SBI classification into one of the eleven distinct outcome groups in view of their likelihood of having a bacterial or viral infection, or both (Figure 2), using an extended version of a published algorithm previously used to derive a 2-transcript bacterial-viral diagnostic classifier (63). For the fifth cohort, i.e., the St. Mary's hospital cohort, we allocated final diagnoses for both the SBI and the PERFORM classification systems, blinded to the biomarkers of interest. The PERFORM algorithm broadly groups patients into those with a likely bacterial infection, those with a likely viral infection, those with unknown viral and/or bacterial infection, and other febrile syndromes, which include patients with suspected or confirmed inflammatory conditions, and infections with distinct treatments or non-viral/bacterial etiology, such as tuberculosis and malaria (Figure 2). We then examined the distribution of the biomarkers according to the patient classifications. For this study, we combined "trivial," "other infection," "infection or inflammation" and "inflammatory syndrome" into one "Other" group, because the cohorts had few or no cases in these categories.
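The regrouping described above can be sketched as a simple lookup. This is a minimal illustration, not the authors' code: the category names are those used in this paper, but the ordering of the infection spectrum from most to least likely bacterial is our assumption (Figure 2 is not reproduced in the text), and the function names and the example patient list are hypothetical.

```python
from collections import Counter

# Categories along the bacterial-to-viral spectrum.
# ASSUMPTION: this ordering (most to least likely bacterial) is illustrative.
INFECTION_SPECTRUM = [
    "definite bacterial",
    "probable bacterial",
    "bacterial syndrome",
    "unknown",
    "viral syndrome",
    "probable viral",
    "definite viral",
]

# The four categories that were combined into one "Other" group for this study.
OTHER_CATEGORIES = {
    "trivial",
    "other infection",
    "infection or inflammation",
    "inflammatory syndrome",
}


def analysis_group(category):
    """Map one of the eleven PERFORM categories to its analysis group."""
    key = category.strip().lower()
    if key in OTHER_CATEGORIES:
        return "Other"
    if key in INFECTION_SPECTRUM:
        return key
    raise ValueError(f"unknown PERFORM category: {category!r}")


def spectrum_rank(category):
    """Position on the bacterial-to-viral spectrum, or None for "Other".

    A rank like this is what an ordinal (e.g., Spearman) correlation with
    biomarker concentration would be computed against.
    """
    group = analysis_group(category)
    if group == "Other":
        return None
    return INFECTION_SPECTRUM.index(group)


if __name__ == "__main__":
    # Hypothetical final diagnoses, for illustration only.
    patients = ["definite bacterial", "Trivial", "unknown",
                "definite viral", "inflammatory syndrome"]
    print(Counter(analysis_group(p) for p in patients))
```

Keeping the four small categories as distinct labels in the source data, and collapsing them only at analysis time, means they can be analyzed separately in cohorts where they are better represented.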
Concentrations of biomarkers were visualized using barplots with median concentration levels and interquartile ranges, and these were compared for "SBI" vs. "non-SBI" (the SBI algorithm, Appendix A in Supplementary Material) and "definite bacterial" vs. "definite viral" (i.e., the groups of the PERFORM algorithm with most diagnostic certainty, Figure 2) using Wilcoxon nonparametric rank sum tests, and for all levels of the PERFORM algorithm using Kruskal-Wallis tests. Spearman correlation coefficients were calculated for the concentrations of biomarkers and categories with viral and bacterial infections of the PERFORM algorithm. Pearson correlation coefficients were calculated for the correlation between C-reactive protein (CRP), which was used to allocate a final diagnosis in the PERFORM algorithm, and the biomarkers PCT, NGAL and resistin. We compared the "SBI" and "PERFORM" phenotyping classification systems by measuring the area under the receiver operating characteristic curve (AUC) of the biomarkers' ability to discriminate the predicted bacterial and viral groups, by means of "SBI" vs. "non-SBI" and "definite bacterial" vs. "definite viral" infection. Additionally, we calculated the AUC for a model that combined PCT, NGAL and resistin using the data from the Alder Hey ED and Alder Hey PICU cohorts, applying restricted cubic splines for optimal model fit. Only cases with available biomarker data were used. We calculated summary AUCs using random-effects models and presented these in forest plots. All analyses were performed in R v4.0.0, including the use of packages pROC, ggpubr, and metafor.
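To make the two central quantities concrete, the following sketch shows one way to compute an AUC for a single biomarker and to pool per-cohort AUCs into a summary estimate. It is a standard-library Python illustration, not the authors' R code: the AUC uses the rank-based (Mann-Whitney) formulation, and the pooling uses the DerSimonian-Laird random-effects estimator on the untransformed AUC scale, which is a simplification we have swapped in and which may differ from the estimator and scale used in the original analysis. All function names and all numbers in the usage section are hypothetical.

```python
import math


def auc_mann_whitney(cases, controls):
    """AUC as the probability that a random case has a higher biomarker
    value than a random control; ties count as one half."""
    if not cases or not controls:
        raise ValueError("both groups must contain at least one value")
    wins = 0.0
    for x in cases:
        for y in controls:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def se_from_ci(lower, upper, z=1.959964):
    """Approximate standard error from a symmetric 95% confidence interval."""
    return (upper - lower) / (2.0 * z)


def pool_random_effects(estimates, std_errors, z=1.959964):
    """DerSimonian-Laird random-effects pooled estimate with 95% CI."""
    if len(estimates) != len(std_errors) or not estimates:
        raise ValueError("need one standard error per estimate")
    k = len(estimates)
    w = [1.0 / se ** 2 for se in std_errors]          # fixed-effect weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # Between-cohort variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))
    return pooled, pooled - z * se_pooled, pooled + z * se_pooled


if __name__ == "__main__":
    # Hypothetical PCT concentrations, for illustration only.
    definite_bacterial = [0.9, 2.4, 5.1, 12.0, 0.3]
    definite_viral = [0.1, 0.2, 0.3, 0.6, 1.1]
    print("AUC:", auc_mann_whitney(definite_bacterial, definite_viral))

    # Hypothetical per-cohort AUCs with 95% CIs (not the study's values).
    cohort_aucs = [(0.75, 0.68, 0.82), (0.80, 0.70, 0.90), (0.72, 0.66, 0.78)]
    estimates = [a for a, _, _ in cohort_aucs]
    std_errors = [se_from_ci(lo, hi) for _, lo, hi in cohort_aucs]
    print("Summary AUC (95% CI):", pool_random_effects(estimates, std_errors))
```

The same `auc_mann_whitney` call can be run once with "SBI" vs. "non-SBI" labels and once with "definite bacterial" vs. "definite viral" labels on the same cohort, which is the comparison of reference standards that this paper reports.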
Table footnote: All concentrations are given as median and interquartile range (IQR); median (IQR) not shown for the "Other" category. ∧ The Alder Hey PICU cohort had only minimal cases of "probable viral" or "viral syndrome"; the St. Mary's hospital cohort had no cases coded as "bacterial syndrome," "viral syndrome" or "probable viral." *Spearman ρ p-value < 0.05 for correlation of the full spectrum of the PERFORM classification. ED, emergency department; NGAL, neutrophil gelatinase-associated lipocalin; PCT, procalcitonin; PICU, pediatric intensive care unit; SBI, serious bacterial infection.
Frontiers in Pediatrics | www.frontiersin.org
FIGURE 2 | Algorithm for classifying children at risk of serious infection. Following discharge, clinical phenotypes were assigned after review of all available clinical and laboratory data including biochemistry, hematology, radiology and microbiology. Children allocated to the "other infection," "infection or inflammation," or "inflammatory syndrome" boxes at the bottom right would normally be analyzed as their component parts individually, so that studies can recruit and meaningfully analyze data from these types of patients alongside the infection patients. CRP, C-reactive protein.

RESULTS
The number of children diagnosed with SBI varied between the 5 cohorts, ranging from 12% in the Erasmus ED cohort to 54% in the St. Mary's hospital cohort. Increased concentrations of PCT were strongly associated with bacterial infections in all four cohorts for both the PERFORM and SBI algorithms, and the PERFORM classification algorithm showed a clear trend toward higher concentrations of PCT in children with a higher degree of certainty of having a bacterial infection (Figure 3). NGAL concentrations differed between bacterial and viral infections as per the PERFORM algorithm, as well as between "SBI" and "non-SBI," in the Alder Hey ED, the Alder Hey PICU and the St. Mary's hospital cohorts (Figure 4). Resistin levels did not discriminate "SBI" from "non-SBI" in the Alder Hey PICU cohort (Figure 5). Notably, children in the "viral syndrome" or "unknown" groups of the PERFORM algorithm had high levels of PCT and NGAL.

DISCUSSION
Compared to the dichotomous categories from the original publications ("SBI" vs. "non-SBI"), the new PERFORM algorithm showed better discrimination and granularity across the full spectrum from "definite bacterial" to "definite viral." It aligned well with host response biomarker concentrations, which were highest in the group with most certainty. Hence, the PERFORM algorithm helped define those with a bacterial infection, as well as those without a bacterial illness. This was seen across a range of clinical settings with varying incidences of bacterial infections, reflecting different recruitment strategies and supporting the broad applicability of the PERFORM algorithm. A combination of PCT, NGAL and resistin improved discrimination compared with the individual biomarkers in the Alder Hey ED and PICU cohorts, and more so for differentiating between "definite bacterial" infections and "definite viral" infections based on the PERFORM algorithm than for "SBI" and "non-SBI." In the PERFORM algorithm, children with a clinical phenotype resembling a viral infection, but with a high CRP level not clearly explained by the presence of a bacterial co-infection, will be classified in either the "viral syndrome" or "unknown" group. Hence, high concentrations of PCT and NGAL in these two groups might represent misclassification or co-infection and are of interest for future biomarker studies. Even though there was only moderate correlation between the biomarkers of interest and CRP, using CRP to guide the phenotyping in the PERFORM algorithm might have increased the diagnostic performance of PCT, NGAL, and resistin.
When we examined biomarker concentrations in children with unclear etiology for their illness (children with no positive microbiology, or in whom the microbiology does not fit the diagnostic phenotype) and all viral infections ("probable" and "definite"), there was a trend to a stepwise decrease in the median biomarker values, moving from most to least likely bacterial infection. Within each phenotypic category we found a range of biomarker concentrations spanning from "bacterial" level to "viral" range, as well as some showing intermediate values.
The PERFORM phenotyping approach enabled categorization of children with more granularity. The data are consistent with the emerging evidence for a complex relationship between bacterial and viral pathogens in the etiology of disease, such that the overall clinical presentation may be a result of interplay between pathogens, including between bacteria and viruses (41). The PERFORM algorithm will also allow for the accurate classification of children with inflammatory conditions and other types of infection. Although there were few of these cases in our cohorts, it will be important to optimize their phenotyping, as illustrated by the emergence of the SARS-CoV-2 associated Multisystem Inflammatory Syndrome in children (MIS-C) (64).
The PERFORM classification algorithm aptly captures the degree of uncertainty of the final diagnosis, increasing the likelihood that a successful candidate biomarker will perform well in validation cohort studies. Furthermore, the algorithm gives insight into the distribution of the types of infection in different clinical settings. As a next step toward clinical implementation, we suggest selecting the most promising candidate biomarkers with the most convincing trend of biomarker concentrations, and the best discriminative ability for "definite bacterial" vs. "definite viral" infection. Including those children with "probable bacterial" or "probable viral" infections can be considered to cover a wider range of clinical phenotypes. Then, candidate biomarkers should be validated in independent validation cohorts. Validation studies should recruit cohorts with consecutive patients, including those with an unclear clinical phenotype, and should be conducted in various clinical settings including low and middle income countries, as well as high and low incidence settings. To illustrate the importance of this, we showed that biomarker levels were markedly higher in the Alder Hey PICU cohort than in the ED cohorts for all diagnostic groups. Specific populations such as neonates and children with comorbidity are also important to consider for validation studies. Following this strategy, it will become apparent in which groups of patients a potential new biomarker might or might not perform satisfactorily. This framework could easily be extended to account for non-viral and non-bacterial causes of febrile illness as well, as is currently being explored in the Diagnosis and Management of Febrile Illness using RNA Personalized Molecular Signature Diagnosis (DIAMONDS) study (65). The algorithm could be modified for adult studies, but would need to be validated using case biomarker studies from adults.
FIGURE (caption fragment) | … 2nd and 4th row), and the Alder Hey PICU cohort (right, 2nd and 4th row). Each bar represents median concentration values, with the black lines representing the interquartile range, and the gray dots representing individual values. Overall significance for the PERFORM classification algorithm is given using the Kruskal-Wallis test, and for the SBI classification using the Wilcoxon rank sum test. In addition, the significance value for "definite bacterial" vs. "definite viral" of the PERFORM algorithm was calculated using the Wilcoxon rank sum test.

CLINICAL TRANSLATION OF BIOMARKERS AND FUTURE RESEARCH
Following satisfactory discovery and validation stages, the performance of a potential biomarker needs to be studied against clinically meaningful and patient-centered endpoints. Until now, only a few high-quality randomized trials have evaluated this in the pediatric population. As one example, the Neopins study showed that PCT could successfully be used to shorten the duration of antibiotics in suspected early onset sepsis in neonates (66). Similarly, Baer et al. showed that the duration of antibiotic treatment could be guided by PCT in a pediatric ED setting (67). The UK BATCH ("Biomarker-guided duration of Antibiotic Treatment in Children Hospitalized with confirmed or suspected bacterial infection") trial is currently recruiting patients, aiming to use PCT for guiding the duration of antibiotics in hospitalized children with an acute infection (68).
Other studies showed limited impact of using biomarkers on the management of children with acute infections (69,70). Successful clinical implementation of a biomarker is complex and multifactorial (71), as was shown in a trial implementing rapid diagnostics for malaria, in which physicians did not use the result to guide treatment, despite good diagnostic accuracy (72). Next, given the significant cost associated with diagnostic uncertainty in childhood febrile illness, diagnostic advances increasing the confidence to withhold antibiotics may yield considerable efficiency gains, especially in the sub-groups where the perceived risks of failing to identify potentially life-threatening bacterial infections are greatest (11). It is therefore imperative that future randomized trials of biomarkers include comprehensive cost-effectiveness analysis. In addition, future studies will need to focus on combinations of biomarkers that ideally include markers of both viral and bacterial infections and of other febrile illnesses including inflammatory disease (17,63,73). Throughout, we discussed biomarkers in blood, but future studies should additionally consider the optimal type of biosample for a biomarker, as some have markedly improved diagnostic performance in sterile fluids such as cerebrospinal fluid (74). Lastly, establishing the likelihood of viral or bacterial disease in febrile children, as suggested in this manuscript, does not always relate to the severity of disease. Emerging evidence is providing more insight into the role of clinical signs and symptoms in predicting the severity of childhood illness (75,76). Future studies should therefore combine new biomarkers with existing validated clinical prediction models with an aim to predict both severity and etiology of childhood febrile illness (77).

CONCLUSION
The absence of a perfect reference standard for biomarker studies in serious bacterial infections has hindered translation of biomarker studies into clinical practice. Our proposed new algorithm provides a framework for phenotyping children with infections based on the trends in the different biomarkers in relation to the certainty of the diagnosis of either bacterial or viral categories. The findings from our independent biomarker validation studies suggest that the algorithm also aligns well with the host response and could provide mechanistic insights for those with uncertain diagnoses. To utilize the full potential of -omics-driven biomarker discovery studies, it will be essential to reach agreement on the best outcome reference standard in future studies, and we propose our diagnostic phenotyping algorithm as the best possible way to do so at present.

PPI AND STAKEHOLDERS' STATEMENT
Patient representatives were involved in all facets of the PERFORM study, including study design, data collection, patient recruitment, and presentation and dissemination of results, throughout the study period. The PERFORM consortium, made up of 18 organizations from 10 different countries, actively engages with national and European-wide policymakers and stakeholders to maximize the project's reach and impact.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.