SYSTEMATIC REVIEW article

Front. Surg., 10 October 2022

Sec. Orthopedic Surgery

Volume 9 - 2022 | https://doi.org/10.3389/fsurg.2022.924810

Decision support by machine learning systems for acute management of severely injured patients: A systematic review

  • 1. Department for Orthopedics and Traumatology, University Hospital Leipzig, Leipzig, Germany

  • 2. Center for Musculoskeletal Surgery, Charité University Medicine Berlin, Berlin, Germany

  • 3. Clinic for Traumatology, University Hospital Zurich, Zurich, Switzerland

  • 4. Clinic for Traumatology and Orthopedics, Bundeswehr Hospital Berlin, Berlin, Germany

  • 5. Department of Orthopaedics and Trauma Surgery, University Hospital Bonn, Bonn, Germany

  • 6. Department for Orthopedics, Traumatology and Plastic Surgery, University Hospital Leipzig, Leipzig, Germany

Article metrics

View details

19

Citations

4,3k

Views

2k

Downloads

Abstract

Introduction:

Treating severely injured patients requires numerous critical decisions within short intervals in a highly complex situation. The coordination of a trauma team in this setting has been shown to be associated with multiple procedural errors, even of experienced care teams. Machine learning (ML) is an approach that estimates outcomes based on past experiences and data patterns using a computer-generated algorithm. This systematic review aimed to summarize the existing literature on the value of ML for the initial management of severely injured patients.

Methods:

We conducted a systematic review of the literature with the goal of finding all articles describing the use of ML systems in the context of acute management of severely injured patients. MESH search of Pubmed/Medline and Web of Science was conducted. Studies including fewer than 10 patients were excluded. Studies were divided into the following main prediction groups: (1) injury pattern, (2) hemorrhage/need for transfusion, (3) emergency intervention, (4) ICU/length of hospital stay, and (5) mortality.

Results:

Thirty-six articles met the inclusion criteria; among these were two prospective and thirty-four retrospective case series. Publication dates ranged from 2000 to 2020 and included 32 different first authors. A total of 18,586,929 patients were included in the prediction models. Mortality was the most represented main prediction group (n = 19). ML models used were artificial neural network ( n = 15), singular vector machine (n = 3), Bayesian network (n = 7), random forest (n = 6), natural language processing (n = 2), stacked ensemble classifier [SuperLearner (SL), n = 3], k-nearest neighbor (n = 1), belief system (n = 1), and sequential minimal optimization (n = 2) models. Thirty articles assessed results as positive, five showed moderate results, and one article described negative results to their implementation of the respective prediction model.

Conclusions:

While the majority of articles show a generally positive result with high accuracy and precision, there are several requirements that need to be met to make the implementation of such models in daily clinical work possible. Furthermore, experience in dealing with on-site implementation and more clinical trials are necessary before the implementation of ML techniques in clinical care can become a reality.

Introduction

Time is considered one of the significant factors for patient outcomes after major trauma. Depending on injury severity, a rapid medical assessment, life-saving on-site treatment, and transportation to an appropriate trauma center are essential to improve survival rates. Therefore, constant improvement in prehospital settings in resuscitation, rapid transit, and adequate initial treatment in hospitals have a substantial impact on survival rates (1).

The introduction of standardized training and education programs has improved the quality of care for severely injured trauma patients—both in the preclinical field and in the emergency trauma room. An example is Advanced Trauma Life Support. Altogether, educational training and standards have led to improvements in the factor of time and treatment quality (2).

Although emergency care and surgical care improvement led to a better outcome, up to 8.0% of all trauma patients’ death are considered preventable or potentially preventable (3). Management errors arise because of time pressure, inexperience, reliance on memory, multitasking, information flow analysis, and failures in trauma team coordination, particularly during the initial minutes of patient reception and resuscitation in emergency rooms. Even in established trauma centers with experienced trauma care professionals, despite guidelines, protocols, and continuous performance improvement, protocol compliance was only 53% (4).

In the age of digitalization, connecting computer-generated prompts through visual and auditory displays within the resuscitation may enhance trauma care professionals’ interaction and reduce errors of omission and miscommunication. In addition, the past decade led to the excitement for the potential to apply deep-learning algorithms to healthcare. This subtype of artificial intelligence (AI) has the ability to improve the accuracy and speed of interpreting large datasets, such as images, speech, and text (5). Machine learning (ML) deals with the estimation of outcomes based on past experiences and data patterns using a computer-generated algorithm (6).

This systematic review aims to evaluate the existing literature on how ML can change the decision support of acute management in severely injured patients.

Materials and methods

Study design

A systematic review of the literature according to the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-analyses) checklist and algorithm was conducted, with the goal of finding all articles describing the value of machine learning systems in the context of acute management of severely injured patients (7).

Study characteristics

Investigations between 2000 and January 2021 were included. For analysis, prospective and retrospective observational investigations including database studies were considered.

Information source

The authors performed a systematic search of the PubMed/Medline and Web of Science (Core Collection) databases for eligible investigations.

Search

The search terms were (trauma) AND ((decision) OR (predict*) OR (assist*)) AND ((artificial intelligence) OR (neural network) OR (machine learning) OR (deep learning)).

Study selection

The authors limited the research to observational studies, while systematic reviews, meta-analyses, case series, and case reports were excluded. Titles and abstracts were reviewed after the removal of duplicates. The remaining full texts were checked for suitability by all authors, and disagreement was resolved by consensus. In cases of doubt, articles were included in the next stage. A flowchart of the filtering stages (titles, abstracts, full-length texts) is shown in Figure 1.

Figure 1

Figure 1

Selection process for the systematic review - flow chart.

Data items

Studies were selected according to the following inclusion criteria: (a) case series, cohort studies, clinical trials, or registry data studies regarding decision support by self-learning systems for the acute management of adult civilian trauma patients, (b) studies that used parameters available during initial assessment and resuscitation in the trauma bay, (c) studies that used at least one physiologic parameter (e.g., heart rate), (d) models that predicted patient-related outcome (e.g., mortality, hemorrhage, need for emergency intervention), and (e) articles published in English or German language.

Exclusion criteria are as follows: (a) case reports or case series with fewer than 10 patients, (b) review articles, (c) animal studies, (d) cadaver studies, (e) studies including only patients with isolated non-life-threatening injuries, (f) studies including only patients with isolated traumatic brain injury, (g) models relying on imaging data, and (h) studies that predicted an effect that would become immanent after more than 30 days (e.g., 1-year survival).

Synthesis of results

We extracted data concerning study characteristics including author names, title, year of publication, journal of publication, number of patients, time of follow-up, and type of study. For the description of the study population, the number of patients and age were collected. Outcome parameters were analyzed according to the inclusion criteria and were assigned to five predicted outcomes: (1) injury pattern, (2) hemorrhage/need for transfusion, (3) emergency intervention, (4) ICU/length of hospital stay, and (5) mortality.

For all included studies, we used the Oxford Centre for Evidence-Based Medicine 2011 to define the level of evidence (OCEBM Levels of Evidence Working Group 2011).

Results

Thirty-six articles met the inclusion criteria; among these were 2 prospective and 34 retrospective case series. Publication dates ranged from 2000 to 2020 and included 32 different first authors. A total of 18,586,929 patients were included in the prediction models. Machine learning models used were artificial neural network (ANN; n = 15), singular vector machine (n = 3), Bayesian network (n = 7), random forest (n = 6), natural language processing (n = 2), stacked ensemble classifier (SL, n = 3), k-nearest neighbor (KNN, n = 1), belief system (n = 1), and sequential minimal optimization (n = 2) models. Thirty articles assessed results as positive, five showed moderate results, and only one article described negative results to their implementation of the prediction model (Table 1). The quality of included articles showed OCEBM Levels of Evidence of 3, correlating with the retrospective character of model training.

Table 1

YearAuthorYearInputMain prediction groupModel typeModel descriptionPatient typeNo. of patientsOutcome compared to other predictionsOutcome parameterResultConclusion
2020Ahmed2020Vital signsMortalityDeep FLAIMBiostatically rigorous two-phase ML frameworkTrauma ICU patients3,014Other ML modelsAUC, Sens, ACCPositiveVery positive, the model outperforms other machine learning models
2019Almagrabi2019Vital signsMortalityBelief SystemMAKER RIMER combinationTrauma Audit and Research Network database177,014Other ML modelsAUCPositivePositive, with an AUC of 0.683 for mortality
2019Christie2018Patient histories from 3 hospital sitesMortalitySLEnsemble machine learning algorithmTrauma patients28,212Standard scoring methodsAUCPositiveVery positive prediction of mortality (AUC range: 0.90–0.94)
2018Chrisite2019Multiple inputsMortalitySLEnsemble machine learning algorithmCritically injured database prospective cohort data1,494N/AAUCPositiveGood prediction for mortality (AUC range: 0.94–097), multiorgan failure (AUC range: 0.84–0.90), and transfusion (AUC range: 0.87–0.90).
2000DiRusso2000Prehospital variables, ER admission variables, ISSMortalityANNStandard feed-forward back-propagation neural networkRegional area trauma system10,609TRISS, ISSROCPositiveCompared to TRISS and ISS improvement in ROC (0.912). ANN suited for implementation
2005Fuller2005ISS, TS, age, sex, mode of injury, othersMortalityANNBack-propagation neural networkTrauma register2,510TRISSRelative errorPositiveThe ANN model was able to predict mortality with a relative error of 5% and was superior to TRISS (relative error 27.7%
2018Kim2018National sample project dataset, wearable device dataMortalityANN, othersLogistic regression, random forest, deep neural networkBlunt and penetrating injuries460,865GCSAUCPositiveOf the tested models, all alternatives to GCS showed superior AUC, with the proposed ANN showing the best results (AUC: 0.89)
2019Gorcyca2019TRISS variables, ICD 9MortalityTSMLogistic regression, random forest, gradient boost machine, feed-forward neural networkTrauma patient database1,432,5172Bayesian network, injury severity score, Harborview assessment of risk of mortality, trauma mortality prediction modelACC, F-scorePositiveThe model showed superior predictive power compared to other models
2008Pearl2006Multiple inputsMortalityANNBack-propagation, feed-forward ANNDatabase1,433,024GCSROCPositiveImproved predictive power compared to GCS
2006Pearl2008Multiple inputsMortalityANNBack-propagation, feed-forward ANNSwedish trauma registry7,688GCSROCPositiveImplemented models performed well; however, additional variables could improve the performance of the ANN
2019Rau2019Multiple inputsMortalityANN (Stuttgart neural network Simulator)Back-propagation, feed-forward ANNTrauma patients18,811TRISS, SVM, LRACC, SensPositiveImplemented ANN showed the highest score in balanced accuracy (75.1%) compared to the other prediction models
2017Roveda2017Multiple inputsMortalityANN, naive Bayes, logistic regression, random forestTwo hidden layers, 5/8 hidden neurons per layer depending on predictionCRASH II patients20,207ANN, naive bayes, logistic regression, random forestROCModeratePromising results, but the used dataset was limited in dimensionality of the given patient data
2017Schetinin2013Multiple inputsMortalityBayesian networkBayesian averaging over decision tree modelNTDB database571,148TRISSSens, specPositiveWhen compared to TRISS classification accuracy showed significant improvement
2013Schetinin2017Multiple inputsMortalityBayesian networkBayesian averaging over decision tree modelNTDB database14,840TRISSAUCPositiveThe proposed network outperformed the standard methods in predictive performance
2017Sefrioui2017Seventeen featuresMortalityRandom forest, ANN, LR, KNN, naive Bayes, PART division lists, SMOMultiple models for mortality on a big data setNTDB database656,092TRISSACCPositiveFrom the tested models, the random forest model showed highest accuracy and improvement over TRISS for mortality prediction
2020Servia2020Multiple inputs, pre- and postresuscitationMortalityLogistic regression, ANN, SMO, Bayesian networks, AdaBoost, Bagging, othersMultiple models for mortality on a big data setRETRAUCI database9,625All implemented model typesF-Score, AUCPositiveAuthors implemented a range of different models, fittest best for the task of mortality prediction were SMO and Bayesian network model
2020Tsiklidis202032; 8 after reducing variablesMortalityEstimator gradient boosting classifierEstimator gradient boosting classifierNTDB Database799,233N/AAUCPositiveThe proposed model was able to predict mortality correctly in 92.4% of cases
2005Wolfe2006Multiple features, prehospitalMortalityANN, logistic regression, classification treeThree different investigators on the same databasePatients with blunt trauma Victorian state trauma registry4,014N/ASens, specNegativeNone of the implemented methods were optimal in predicting mortality but useful insides for further developments were collected
2015Gholipour2015TRISS variablesICU LOS, mortalityANNStandard feed-forward back-propagation neural networkTrauma patients admitted to ER125N/ASens, specPositivePositive outcome with no significant difference of ICU LOS predictions and real LOS. High sensitivity for mortality
2020Bhat2020BGA parametersHemorrhage/transfusionSVM, bagged decision tree, ANN with a Bayesian regulatorDifferent models were compared in their ability to classify similar to HISS100 cases, data synthetically derived (SFRP)100HISSACC, SensPositiveAll models were highly accurate without significant differences (range ACC 0.91–0.93)
2008Chen2008Vital signsHemorrhage/transfusionLinear classifierLinear classifierHelicopter transport trauma patient898Schock IndexAUCPositiveGood prediction with AUC 0,76 for identification of patients in need for transfusion
2002Clarke2002Multiple inputsHemorrhage/transfusion, emergency interventionTrauma AIDTrauma AID, early algorithmTrauma patients with penetrating thoracoabdominal trauma97N/AN/AModerateModerate results using the model to decide treatment strategy. Of 40 cases 5 computer predicted strategies were altered
2018Hodgman2018Data from resuscitation bay PROMMTTHemorrhage/transfusionApp model (Mina et al.)N/APatients needing at least 1 transfusion at resuscitation bay1,245N/AAUCModerateModerate predictive performance of the app model for need for massive transfusion (AUC: 0.694)
2020Li2020Vital signs, BGA parametersHemorrhage/transfusionRandom forest modelRandom forest algorithm based on decorrelated decision treesDatabase, ER1,405Logistic regression modelF1-ScoreModerateSuperior F1 Score (93.4%) but lower AUC (0.81) when compared to logistic regression model for predicting acute traumatic coagulopathy
2013Hubbard2013Data from resuscitation bay PROMMTTHemorrhage/transfusionSLEnsemble machine learning algorithmPROMMTT patients980N/AROC, AUCPositiveSuperior prediction of ongoing transfusion with SL over naive approaches for different time intervals
2006Partridge200616 parameters, resuscitation bay parameters minus BGAHemorrhage/transfusionBayesian networkDecision tree with Bayesian averagingTrauma patients’ “difficult” cases316Maximum a posterioriN/APositiveEarly work on decision tree implementation with positive results compared to other implementations
2005Walczak200517 featuresHemorrhage/transfusionANNBack-propagationPatients trauma1016N/ASens, spev, mean absolute differencePositiveThe authors demonstrated high sensitivity and specificity for predicting transfusion indication in trauma patients with the implementation of ANN
2017Gu2017Speech inputInjury patternNLPCNN, LSTMThirty local cases30N/AACCModerateAuthors could show a 79.12% accuracy for phase prediction
2020Kulshrestha2021Text-based inputsInjury patternNLPLogistic regression with elastic net regularizationPatients local hospital with thorax injury6,891N/AAUCPositiveAuthors were able to show good prediction of differentiation between severe and non-severe injury to the thorax (AUC 0.88)
2015Metzger2015Lab parameters, comorbidities, demographicInjury pattern“Figaro” modelStacked ensemble classifier (similar to SL)Trauma patients’ subsection vascular injuries2,643Other classifiersSens, accPositiveGood prediction for death, multi-organ failure and transfusion. Superior when compared to other classifiers
2002Ogunyemi2002Resuscitation bay parameters, clinical findingsInjury patternTrauma SCAN, Bayesian networkTrauma SCAN, Bayesian networkPatients with gunshot wounds chest and abdomen26Trauma SCAN diagnosis, vs. Trauma SCAN and clinical findings in a Bayesian networkAUCPositiveImplementing geometric and probabilistic reasoning showed an improvement of prediction for TraumaScan
2020Paydar2020Multiple inputsInjury patternBagged, SVM, KNN, AdaBoost, neural networkDifferent classifiers comparedTrauma patients >16 years1,107Bagged best, SVM second, KNN, Adaboost, neural networkACCPositiveAuthors were able to predict deterioration of injured patients with high accuracy, through bagging and SVM models (99%)
2019Harvin2019Multiple inputsEmergency interventionRandom forest algorithmRandom forest algorithm based on decorrelated decision treesPatients admitted to trauma center25,983N/AACCPositiveGood prediction of damage control laparotomy (ACC range 0.772–0.857)
2002Hirshberg2002Bullet trajectory, admission systolic blood pressureEmergency interventionANNArray of ANNs, two hidden layers, logistic activation, back-propagationPatients with single abdominal gunshot wound312N/AClassification rate, ACC, sens, specPositiveThe authors were able to achieve excellent correct classification rate (0.96) and AUC (0.94)
2014Liu2014Vital signs, monitoring, GCSEmergency interventionANNMultilayer perceptronPatients transported via helicopter with trauma104Multivariate logistic regressionROC, AUCPositiveCompared to multivariate logistic regression models, the ANN showed higher AUC (0.99) in prediction performance for emergency intervention
2014Liu2014Feature sets, patient recordEmergency interventionANNMultilayer perceptron and simple rule-based algorithmPatients transported via helicopter with trauma79N/AMean absolute errorPositiveThe authors showed higher ROC with the proposed implementation of ANN for emergency intervention prediction

Overview of the 36 included studies, sorted by name and prediction groups.

Deep FLAIM framework, deep Fahad-Liaqat-Ahmad intensive machine framework; ML, machine learning; AUC, area under the curve; Sens, sensitivity; ACC, accuracy; MAKER, maximum likelihood evidential reasoning; RIMER, rule-based inference methodology using the evidential reasoning approach; BGA, blood gas analysis; SVM, support vector machine; ANN, artificial neural network; HISS, hemorrhage intensive severity and survivability score; SFRP, sensible fictitious rationalized patient; SL, “Super Learner”; ER, emergency room; ISS, injury severity score; TRISS, trauma and injury severity score; ROC, receiver operator characteristic; TS, trauma score; ICU LOS, intensive care unit length of stay; spec, specificity; TSM, trauma severity model; PROMMTT, prospective, observational, multicenter, majof trauma transfusion; GCS, Glasgow coma scale; NLP, natural language processing; KNN, k-nearest neighbor; CRASH, Clinical Randomization of an Antifibrinolytic in Significant Hemorrhage; SMO, sequential minimal optimization.

Predicting injury patterns

Over the past 20 years, various research groups have been working with computer-assisted systems to partially automate the analysis of patient data in the resuscitation room to assess injury patterns. Depending on the study, prehospital and hospital-acquired parameters were used for predicting patients’ injury patterns.

In 2002, Ogunyemi et al. explored the possibility of using probabilistic graphic models in combination with 3D reconstruction to analyze penetrating chest and abdominal trauma with the aim of predicting outcomes based on the location of penetration. The TraumaScan tool developed for this purpose used a combination of the location of the entry wound and the patient’s symptoms and parameters. The use of such software in the treatment of penetrating injury is an effective tool to make the treatment of injured people more time effective and safe (8).

Metzger et al. tested the possibility of using various artificial intelligence (AI) algorithms to detect vascular injury based on the initially collected patients’ parameters to aid in treatment decisions and the identification of critical patients. For this purpose, 2,643 patients were selected and parameters were extracted. The parameters were tested for outcomes on different classifiers and combinations of multiple classifiers. The use of multiple classifiers deployed on these parameters produced the best results (9).

The work of Gu et al. follows a new approach to classifying the individual phases of a trauma resuscitation (pre-arrival, patient arrival, primary survey, secondary survey, and postsecondary survey) and uses the spoken words in the resuscitation room. For this purpose, microphones were installed and the recorded words were converted into individual path-finding phrases. This was tested on 24 recorded trauma resuscitations and converted into an algorithm by deep learning processes. Subsequently, this process was performed on six recorded cases with a matching accuracy of almost 80%. In summary, audio analysis during resuscitation room management shows a novel implementation of data collection and options for phase classification (10).

In the most recent study on classifying injury patterns in polytrauma patients, Paydar et al. showed that early classification of the injury pattern and decision-making support are relevant when the patient prognosis is poor. Using data from 1,107 trauma patients and using various analytics algorithms, they aimed to investigate the benefits of AI-aided decision-making for the treatment of polytrauma patients. For this purpose, paraclinical and clinical data were extracted. Diastolic blood pressure, GCS, and BE after resuscitation crystallized as the most impactful parameters; the outcome was predicted with high accuracy of 0.99 (11).

Kulshrestha et al. retrospectively evaluated data from 6,891 trauma patients and analyzed 450,000 corresponding documents. Using natural language processing, ANN was trained for automated analysis. The authors were able to show that the implementation of natural language processing can aid in the adequate classification of thoracic trauma (12).

Predicting hemorrhage/need for transfusion

Walczak trained ANN for the prediction of the need for blood transfusions (fresh frozen plasma, packed red blood cells, and platelets) using data from 508 retrospective patients who were transferred to their trauma center between January 1996 and December 1997 using the back-propagation method (13). The input variables for the ANN were easily accessible patient characteristics obtained on admission to the ER, such as sex, age, blood pressure, Glasgow coma scale, and so on. Trauma patients with no transfusions but with similar epidemiological data were added to the data set, which resulted in a total of 1,016 data sets (training set with 538 patients and hold-out-sample with 478 patients). The main finding was that the proposed ANN was able to predict blood transfusions with a mean absolute error (MAE) of 7.02 units for patients who received 0–174 units in total (all types of transfusions within the first 24 h after admission), but the ANN showed better MAE (5.49 units) for specific blood transfusion types in shorter time periods (13).

Chen et al. introduced a classifier as a decision assist tool for identifying hypovolemia in trauma patients being transported from the scene via a helicopter when reliable vital parameters are hard to assess (14). The working group used data from 898 trauma patients and included 627 subjects with 71 cases of major bleeding. The ensemble classifier, which was fed with five easily assessable vital parameters every second (RR, RR, DBP, SBP, and SaO2) showed an area under the curve (AUC) of almost 0.8 after 14 min of transport for the prediction of life-threatening hemorrhage and was able to tolerate missing variables better than linear classifiers (14).

In the comparison of standard stepwise logistic regression analysis to new SL techniques, the SL showed a superior prediction of mortality in this complex dynamic multivariate data set at several time intervals.

Hodgman et al. utilized the PROMMT database in 2018 for validation of a smartphone app model for predicting the activation of mass transfusion protocol or MT delivery for five different mass transfusion definitions: 10 units of packed red blood cells (PRBCs) within 24 h (1), resuscitation intensity score ≥4 (2), critical administration threshold (3), 4 units of PRBCs within 4 h (4), and 6 units of PRBCs within 6 h (5). Examining 1,245 patients, the smartphone app showed consistent prediction for the need for MT, MTP activation, and MT delivery with AUC ranging from 0.694 to 0.711 regardless of the MT definition (15).

Christie et al. assessed 1,494 severely injured patients with 2,397 variables in the time period from February 2005 to April 2015 for several outcomes including the need for transfusion and need for mass transfusion (>10 L in 24 h) several times after trauma (2–120 h) (16).

SL technique was applied to the data sets, and early blood transfusion was sufficiently predicted with an AUC of 0.82 and 0.84–0.88 for interval prediction throughout the first 72 h after admission (16).

Recently, Bath et al. introduced the hemorrhage intensive severity and survivability score (HISS) using five biomarkers: glucose, lactate, pH, potassium, and oxygen tension. The working group created 100 sensible fictitious rationalized patients (SFRPs) and let five trauma experts rate their HISS score for triage (0 = low, 1 = guarded, 2 = elevated, 3 = high, 4 = severe). Afterward, linear support vector machine, ensemble bagged decision tree, ANN with Bayesian regularization algorithm, and possibility rule-based using function approximation (PRBF) were evaluated for their ability to accurately classify the 100 entries of the SFRP data set with an identified adequate training set of 75, and it was felt that these classification algorithms can be used as an adjunct to the HISS due to an accuracy higher than 91% in the clinical setting (17).

Predicting the need for emergency intervention

One of the earlier publications covering the use of an AI-based technology was a study by Clarke et al. with the TraumAID computer program in 2002. Here, a retrospective analysis of 97 cases showed a significantly higher evaluation of three raters for TraumAID’s protocols over actual care in 64 cases. TraumaAID was used by residents in 40 cases in the emergency department, and in 5 of these, this resulted in a change of evaluation, diagnosis, or treatment, while none of these changes was judged to be an error by the majority of the raters (18).

In the same year, Hirshberg et al. reported on the creation of an ANN (ANN) for the prediction of damage control operations in patients with a single abdominal gunshot injury. After training the ANN on data of 312 patients, the authors tested it on 34 cases. A sensitivity of 88% and a specificity of 96% were achieved. Variables like systolic pressure, bullet path, or trajectory were determined as strong inputs (19).

Prediction of performing a damage control laparotomy in trauma patients was also the focus of a study by Harvin et al. 2019. In a single center, a quality improvement intervention had been introduced in advance, successfully reducing the rate of damage control laparotomies (DCLs) without increasing morbidity or mortality. A random forest computer learning algorithm, based upon decorrelated decision trees for prediction, was used to analyze 72 variables for their predictive value for a DCL, identifying some significant correlations. The authors concluded that ML could be used to point out the effects of interventions on surgeons’ decision-making (20).

Wolfe et al. created three prediction models for outcomes following nonpenetrating trauma using a data set of 4,014 patients, of which a first subgroup was used for training the model and a second one for external validation. Models based on data from the scene of injury were developed for an intensive care unit (ICU) stay (complete data sets 1,324; 33%) or later death (complete data sets: 2,059; 51.3%). Statistical models used were logistic regression, classification trees, and ANN. None of the models was seen as optimal and met the self-given performance criteria, therefore being the only article in this systematic review showing negative results (21).

Liu et al. published two papers in 2014. In the first one, they described the development and validation of a multiparameter ML algorithm for predicting the need for life-saving interventions in trauma patients. All patients sustained severe blunt and penetrating injuries and were transported to two Level I trauma centers by helicopter. The authors used 79 patient records with over 110,000 feature sets to train the system and applied the trained network to data from 104 patients. The algorithm showed positive results for the prediction of trauma patients needing live-saving interventions (22).

In another paper from the same year, Liu et al. analyzed the data of 104 patients extracting a combination of vital signs, heart rate complexity, and others, again applying ML tools to identify the need for life-saving interventions. They showed that an ML model has superior performance over multivariate logistic regression models (23).

Predicting admission to ICU/length of hospital stay

For healthcare providers, the assessment of valences in intensive care units and their distribution at any given point in time is essential. In 2015, Gholipour et al. used trauma and injury severity score (TRISS) variables to train a feed-forward propagating ANN to predict the length of stay in the ICU and mortality. For this purpose, data from 95 trauma patients admitted to the emergency room were retrospectively used to train an ANN and tested on 30 other cases.

Overall, the results were good, with a sensitivity of 75% and a specificity of 96% for the survival of trauma patients and no significant difference in the ICU length of stay prediction and real length of stay (24).

Predicting mortality

In 19 studies, mortality was the most common prediction in this review. A total of 17,972,347 patients were included overall. In total, 13 different ML algorithms and models were considered, showing positive results for 18 of 19 studies. Models based on ANNs were most prevalent in this review, with the adaption of this architecture in 16 papers (84% of mortality predicting articles).

Ahmed et al. identified 3,041 trauma patients admitted to their trauma surgery ICU. Univariate and multivariate analyses on mortality were performed, and ANN and other ML models were deployed on the extracted data. With an accuracy of 92.3% and sensitivity of 79.1%, the ANN-based deep-FLAIM model outperformed other tested methods (25). Almagrabi et al. used the TARN database to access retrospective vital sign data from 177,014 patients to predict mortality, testing different ML models. With AUC values of 0.6882, 0.6829, and 0.678, respectively, logistic regression, interpretable ML, and ANN were the best classifiers tested (26).

An earlier study by DiRusso et al. compared established mortality predictors, namely, TRISS and Injury Severity Score (ISS), with a feed-forward backpropagation neural network, showing its superiority with a receiver under the operating curve (ROC) of 0.912 for the ANN and 0.895 and 0.766 for TRISS and ISS, respectively. Including 10,609 patients admitted to 24 hospitals in a seven-county region, this multicenter study confirms the previous work of the research group (27). A 2005 study from Fuller et al. also compared TRISS predictions with ANN including 2,510 patients from the CAMC Trauma registry. Using ISS and six other inputs, the ANN outperformed TRISS predictions with a relative error of 5% compared to 28%. Unfortunately, no AUC, ROC, accuracy, or sensitivity were calculated in this study (28). A more recent study from 2015 by Gholipour et al. used input variables of TRISS on 125 patients to predict mortality and length of stay, showing satisfactory results with a sensitivity and specificity of 75% and 97%, respectively, for the prediction of mortality by ANN. Comparing the results to TRISS, the sensitivity and specificity of TRISS showed better results with 81% and 95%, respectively (24). In 2019, Gorczyca et al. were able to achieve excellent classification rates for mortality prediction, comparing a state-of-the-art ANN to several other prediction models including Bayesian networks, ISS, and others (29). Hubbard et al. used SL to predict mortality and the need for transfusion within discrete time intervals (30–90, 90–180, and 180–360 min) in patients meeting the criteria for highest-level trauma activation in 10 major Level I hospitals. SL outperformed the standard methods for predicting future mortality, with the greatest difference being the prediction of death at the 180–360 min interval (AUC SL 0.92 vs. 0.55 for standard methods) and a 5% increase compared to logistic regression models in prediction performance (30). Kim et al. extracted 460.865 cases of blunt and penetrating trauma from the National Trauma Data Bank (NTDB) and assessed the implementation of a consciousness index as well as ML algorithms to predict mortality. With an AUC of 0.89, the deep neural network showed the best results in predicting mortality. The used input variables were chosen as though they were collected by wearable patient devices, showcasing the feasibility of such devices (31).

Pearl et al. published two studies in 2006 and 2008 using ANNs with eight and five parameters, respectively, for the prediction of mortality for clinicians and nonclinicians [32]. In both studies, the NTDB data were used, including 1,433,024 patients in 2006 and 7,688 patients in 2008. Pearl et al. could show that ANNs predicted mortality with prehospital variables well, with a correct prediction of 91% in 2006, but could not show improvement when excluding pulse and other input variables in 2008 for nonclinician use of ANNs (33).

In a recent study by Rau et al., the authors compared mortality prediction for 18,811 patients using TRISS, ANN (neural network configured via the Stuttgart neural network simulator), support vector machine, and logistic regression. Results showed high accuracy for all four models but the highest specificity (51.5%) for ANN (34). Another retrospective analysis by Roveda et al. included 20,207 patients from the CRASH (Clinical Randomisation of an Antifibrinolytic in Significant Haemorrhage) database to predict mortality, as well as ICU stay and need for surgical intervention. All tested models (logistic regression, Bayesian network, random forest, ANN) showed similar results, concluding that a combination of the above-mentioned models could enhance the performance of predictive power (35).

Schetinin et al. used Bayesian averaging over decision trees to predict the mortality of NTDB data by considering 571,148 cases, comparing results with TRISS estimates, and concluding that the goodness of fit was superior to the TRISS method (36). A study from 2017 by Sefrioui et al. compared different ML approaches to the NTDB database, characterizing every patient (n = 656.092) by 17 features, including GCS, vital signs, and other parameters, with TRISS prediction. In their testing, the random forest approach showed the most promising results (ACC 0.9774) compared to TRISS and other ML models in the prediction of patient mortality (37). A recent study from Servia et al. used data from the National Trauma Registry of 52 Spanish ICUs to test the predictive capabilities for mortality on nine ML-based classifiers on data from 9,790 critically injured patients, showing a high correlation of mortality in patients with traumatic brain injury and organic failure. Even though all tested classifiers were able to produce a high accuracy, specificity, and AUC, low values for recall were obtained. Servia et al. discussed that since comparable results in accuracy and sensitivity could be achieved by all nine classifiers, one should rather choose ML techniques by their architecture and fit to a specific task than by the determination of statistical superiority only (38).

Discussion

Decision-making in the acute management of severely injured patients has to be based on reliable and accurate statements and predictions.

In the past, experience-based algorithms like TRISS have shown weaknesses in clinical use. Data-driven ML tools show great potential as a new approach to problems of this nature. In this systematic review, we could show that ML tools are generally more accurate and sensitive when compared to existing tools for decision-making in acute trauma management like GCS, ISS, and TRISS. Only one of eight studies that compared prediction performance showed TRISS superior when compared to the tested ML mechanisms.

Mortality was the most frequently predicted outcome parameter in our review (n = 19). We included this outcome even though the use of decision aid for predicting mortality is discussable. Using vital signs and BGA parameters, mortality can be predicted with higher accuracy and sensitivity compared to methods in clinical use right now. Different models, more recently ANNs, can solve the task of determining the mortality risk of a patient in the resuscitation bay, but the implementation of these tools assumes digitalized parameters of patients’ vital signs, lab parameters, and demographics at the time of hospitalization. Furthermore, interfaces need to be in place to make these predictions possible and actually aid in the acute management of trauma patients. We hypothesize that these are limitations to the broad implementation of these tools. Clinical validation of such tools also mandates standardization of evaluation parameters of ML models fit for clinical use. The benefit of mortality prediction in the acute setting of a resuscitation room remains unclear. However, if an implementation can aid in acute management and the above-mentioned limitations are addressed, we consider ML tools as a promising development. Would we do less if a patient had a 70% probability of 48 h-survival?

Because of the broad range of different trauma mechanisms, injury pattern prediction showed a heterogeneous field of studies that all predicted specific outcomes with different ML models. The complexity of the prediction makes it hard to generalize injury patterns of trauma patients in one ML model so far. Another field with similar limitations was the prediction of emergency interventions. One paper was able to predict the need for DCL with high sensitivity and specificity, but the often complex indication of different emergency or damage control interventions makes predictions on a general scale complicated. Further developments are needed to make ML models implementable for these tasks.

Also, hemorrhage and the subsequent need for transfusion can be predicted by different ML models; recent publications especially show high accuracy in these predictions. If the implementation of live evaluation of vital signs and BGA parameters is widely available, ML-powered decision support for transfusion protocols seems likely to be implemented in the near future. Only one article set the prediction of ICU length of stay as the main outcome of ANN implementation and was able to show that the prediction made by the network did not significantly differ from the real length of stay after training on 95 cases. However, due to the small number of cases and lack of comparable studies, we cannot assess the validity of these findings yet.

Specific problems cannot always be answered best by one ML model or approach. To evaluate models only on their differences in accuracy or sensitivity does not always reflect their implementation ability, especially in a clinical environment, with the need for clinical evaluation and standardization.

ANNs show a wide range of implementation possibilities for the future but are very sparsely implemented in the clinical routine so far. To our knowledge, none of the shown ML tools in the included studies seem to be in actual clinical use in the version described in the respective articles. One reason is the need for big data to train these ML tools, reflected by the large number of studies in this systematic review (10 of 36) training and testing their models on data from national or international registries or databases. Unfortunately, data generation in such databases can be incomplete at times and are not standardized internationally. Subsequently, there will be a call for access to well-structured and complete datasets for trauma patients in the future, which we hope will enable the training of generalized ML and ANN models.

Conclusion

Digitalization and general computing and technical capabilities of healthcare providers are the basis on which implementation of ML models into the clinical routine can be made possible. While the benefits of these ML tools, especially ANNs for prediction in different fields (image segmentation, image processing) are undeniable, several requirements concerning live availability of data, better and more accessible big data sets on trauma patients, technical requirements on site, and insurance of patient data security need to be met before the implementation of ML techniques, and especially ANNs can become a widely implemented reality in the acute management of trauma patients.

Statements

Data availability statement

The original contributions presented in the study are included in the article/Supplementary Material, further inquiries can be directed to the corresponding author/s.

Author contributions

The authors confirm their contribution to the paper as follows: study conception and design: DB, GO, DAB, ST, TG; data collection: DB, GO. Author; analysis and interpretation of results: all authors; draft manuscript preparation: DB. All authors contributed to the article and approved the submitted version.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

  • 1.

    BerkeveldEPopalZSchoberPZuidemaWPBloemersFWGiannakopoulosGF. Prehospital time and mortality in polytrauma patients: a retrospective analysis. BMC Emerg Med. (2021) 21:78. 10.1186/s12873-021-00476-6

  • 2.

    HussmannBLendemansS. Pre-hospital and early in-hospital management of severe injuries: changes and trends. Injury. (2014) 45(Suppl 3):S3942. 10.1016/j.injury.2014.08.016

  • 3.

    ParkYLeeGJLeeMAChoiKKGwakJHyunSYet alMajor causes of preventable death in trauma patients. J Trauma Inj. (2021) 34:22532. 10.20408/jti.2020.0074

  • 4.

    SpanjersbergWRBergsEAMushkudianiNKlimekMSchipperIB. Protocol compliance and time management in blunt trauma resuscitation. Emerg Med J. (2009) 26:237. 10.1136/emj.2008.058073

  • 5.

    TopolEJ. High-performance medicine: the convergence of human and artificial intelligence. Nat Med. (2019) 25:4456. 10.1038/s41591-018-0300-7

  • 6.

    DeoRC. Machine learning in medicine. Circulation. (2015) 132:192030. 10.1161/CIRCULATIONAHA.115.001593

  • 7.

    MoherDLiberatiATetzlaffJAltmanDG. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. PLoS Med. (2009) 6:e1000097. 10.1371/journal.pmed.1000097

  • 8.

    OgunyemiOIClarkeJRAshNWebberBLCombining geometric and probabilistic reasoning for computer-based penetrating-trauma assessment. J Am Med Inform Assoc. (2002) 9:27382. 10.1197/jamia.m0979

  • 9.

    MetzgerMHowardMKelloggLKundiR. Ensemble prediction of vascular injury in trauma care initial efforts towards data-driven, low-cost screening. Proceedings 2015 IEEE International Conference on Big Data:25608 (2015).

  • 10.

    ChenLOgundeleOClermontGHravnakMPinskyMRDubrawskiAW. Dynamic and personalized risk forecast in step-down units. Implications for monitoring paradigms. Ann Am Thorac Soc. (2017) 14:38491. 10.1513/AnnalsATS.201611-905OC

  • 11.

    PaydarSParvaEGhahramaniZPourahmadSShayanLMohammadkarimiVet alDo clinical and paraclinical findings have the power to predict the critical conditions of injured patients after traumatic injury resuscitation? Using data mining artificial intelligence. Chin J Traumatol. (2020) 24(1):4852. 10.1016/j.cjtee.2020.11.009

  • 12.

    KulshresthaSDligachDJoyceCBakerMSGonzalezRO'RourkeAPet alPrediction of severe chest injury using natural language processing from the electronic health record. Injury. (2020) 52(2):20512. 10.1016/j.injury.2020.10.094

  • 13.

    WalczakS. Artificial neural network medical decision support tool: predicting transfusion requirements of ER patients. IEEE Trans Inf Technol Biomed. (2005) 9:46874. 10.1109/titb.2005.847510

  • 14.

    ChenLMcKennaTMReisnerATGribokAReifmanJ. Decision tool for the early diagnosis of trauma patient hypovolemia. J Biomed Inform. (2008) 41:46978. 10.1016/j.jbi.2007.12.002

  • 15.

    HodgmanEICrippsMWMinaMJBulgerEMSchreiberMABraselKJet alExternal validation of a smartphone app model to predict the need for massive transfusion using five different definitions. J Trauma Acute Care Surg. (2018) 84:397402. 10.1097/TA.0000000000001756

  • 16.

    ChristieSAConroyASCallcutRAHubbardAECohenMJ. Dynamic multi-outcome prediction after injury: applying adaptive machine learning for precision medicine in trauma. PLoS One. (2019) 14:e0213836. 10.1371/journal.pone.0213836

  • 17.

    BhatAPodstawczykDWaltherBKAggasJRMachado-ArandaDWardKRet alToward a hemorrhagic trauma severity score: fusing five physiological biomarkers. J Transl Med. (2020) 18:348. 10.1186/s12967-020-02516-4

  • 18.

    ClarkeJRHaywardCZSantoraTAWagnerDKWebberBL. Computer-generated trauma management plans: comparison with actual care. World J Surg. (2002) 26:5368. 10.1007/s00268-001-0263-5

  • 19.

    HirshbergAWallMJJrMattoxKL. Bullet trajectory predicts the need for damage control: an artificial neural network model. J Trauma. (2002) 52:8528. 10.1097/00005373-200205000-00006

  • 20.

    HarvinJAKaoLSLiangMKAdamsSDMcNuttMKLoveJDet alDecreasing the use of damage control laparotomy in trauma: a quality improvement project. J Am Coll Surg. (2017) 225:2009. 10.1016/j.jamcollsurg.2017.04.010

  • 21.

    WolfeRMcKenzieDPBlackJSimpsonPGabbeBJCameronPA. Models developed by three techniques did not achieve acceptable prediction of binary trauma outcomes. J Clin Epidemiol. (2006) 59:2635. 10.1016/j.jclinepi.2005.05.007

  • 22.

    LiuNTHolcombJBWadeCEDarrahMISalinasJ. Utility of vital signs, heart rate variability and complexity, and machine learning for identifying the need for life-saving interventions in trauma patients. Shock. (2014) 42:10814. 10.1097/SHK.0000000000000186

  • 23.

    LiuNTHolcombJBWadeCEBatchinskyAICancioLCDarrahMIet alDevelopment and validation of a machine learning algorithm and hybrid system to predict the need for life-saving interventions in trauma patients. Med Biol Eng Comput. (2014) 52:193203. 10.1007/s11517-013-1130-x

  • 24.

    GholipourCRahimFFakhreeAZiapourB. Using an artificial neural networks (ANNs) model for prediction of intensive care unit (ICU) outcome and length of stay at hospital in traumatic patients. J Clin Diagn Res. (2015) 9:OC1923. 10.7860/JCDR/2015/9467.5828

  • 25.

    AhmedFSAliLJosephBAIkramAUl MustafaRBukhariSAM. A statistically rigorous deep neural network approach to predict mortality in trauma patients admitted to the intensive care unit. J Trauma Acute Care Surg. (2020) 89:73642. 10.1097/TA.0000000000002888

  • 26.

    AlmaghrabiFXuD-LYangJ-B. A new machine learning technique for predicting traumatic injuries outcomes based on the vital signs. 2019 25th IEEE International Conference on Automation and Computing (Icac):648 (2019).

  • 27.

    DiRussoSMChahineASullivanTCuffSNealonPSiegelBet alDevelopment of an artificial neural network to predict trauma survival in pediatric patients. Crit Care Med. (2000) 28:A29.

  • 28.

    FullerJJEmmettMKesselJWPricePDForsytheJH. A comparison of neural networks for computing predicted probability of survival for trauma victims. W V Med J. (2005) 101:1205. PMID:

  • 29.

    GorczycaMTToscanoNCChengJD. The trauma severity model: an ensemble machine learning approach to risk prediction. Comput Biol Med. (2019) 108:919. 10.1016/j.compbiomed.2019.02.025

  • 30.

    GorczycaMTToscanoNCChengJD. Time-dependent prediction and evaluation of variable importance using superlearning in high-dimensional clinical data. J Trauma Acute Care Surg. (2013) 75:S5360. 10.1097/TA.0b013e3182914553

  • 31.

    KimDCogillSYangS. A data-driven artificial intelligence model for remote triage in the prehospital environment. PLoS One. (2018) 13:e0206006. 10.1371/journal.pone.0206006

  • 32.

    PearlACaspiRBar-OrD. Artificial neural network versus subjective scoring in predicting mortality in trauma patients. Stud Health Technol Inform. (2006) 124:101924. PMID:

  • 33.

    PearlABar-OrRBar-OrD. An artificial neural network derived trauma outcome prediction score as an aid to triage for non-clinicians. Stud Health Technol Inform. (2008) 136:2538 PMID:

  • 34.

    RauC-SWuS-CChuangJ-FHuangC-YLiuH-TChienPCet alMachine learning models of survival prediction in trauma patients. J Clin Med. (2019) 8:799. 10.3390/jcm8060799

  • 35.

    RovedaGKoledoyeMAParimbelliEHolmesJH. Predicting clinical Outcomes in Patients With Traumatic Bleeding: A Secondary Analysis of the CRASH-2 Dataset. 2017 IEEE 3rd International Forum on Research and Technologies for Society and Industry (Rtsi):3327 (2017).

  • 36.

    SchetininVJakaiteLKrzanowskiW. Bayesian averaging over decision tree models for trauma severity scoring. Artif Intell Med. (2018) 84:13945. 10.1016/j.artmed.2017.12.003

  • 37.

    SefriouiIAmadiniRMauroJEl FallahiAGabbrielliM. Survival prediction of trauma patients: a study on US National Trauma Data Bank. Eur J Trauma Emerg Surg. (2017) 43:80522. 10.1007/s00068-016-0757-3

  • 38.

    ServiáLMontserratNBadiaMLlompart-PouJABarea-MendozaJAChico-FernándezMet alMachine learning techniques for mortality prediction in critical traumatic patients: anatomic and physiologic variables from the RETRAUCI study. BMC Med Res Methodol. (2020) 20:262. 10.1186/s12874-020-01151-3

  • 39.

    PartridgeDSchetininVLiDCoatsTJFieldsendJEKrzanowskiWJet alInterpretability of Bayesian decision trees induced from trauma data. Medical Image Computing and Computer Assisted Intervention - Miccai. (2006) 2019(Pt Iii 4029):97281.

  • 40.

    TsiklidisEJSimsCSinnoTDiamondSL. Using the National Trauma Data Bank (NTDB) and machine learning to predict trauma patient mortality at admission. PLoS One. (2020) 15:e0242166. 10.1371/journal.pone.0242166

  • 41.

    SchetininVJakaiteLKrzanowskiWJ. Prediction of survival probabilities with Bayesian decision trees. Expert Syst Appl. (2013) 40:546676. 10.1016/j.eswa.2013.04.009

Summary

Keywords

trauma, polytrauma, decision support, machine learning, deep learning, artificial intelligence, neural networks, prediction

Citation

Baur D, Gehlen T, Scherer J, Back DA, Tsitsilonis S, Kabir K and Osterhoff G (2022) Decision support by machine learning systems for acute management of severely injured patients: A systematic review. Front. Surg. 9:924810. doi: 10.3389/fsurg.2022.924810

Received

20 April 2022

Accepted

31 August 2022

Published

10 October 2022

Volume

9 - 2022

Edited by

Pascal Jungbluth, University Hospital of Düsseldorf, Germany

Reviewed by

Michael Wiktor Zyskowski, University Hospital rechts der Isar, Technical University of Munich, Germany Mehmet Yildirim, University of Health Sciences (Turkey), Turkey

Updates

Copyright

*Correspondence: Georg Osterhoff

Specialty Section: This article was submitted to Orthopedic Surgery, a section of the journal Frontiers in Surgery

Disclaimer

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.

Outline

Figures

Cite article

Copy to clipboard


Export citation file


Share article

Article metrics