- 1Department of Statistics, Colorado State University, Fort Collins, CO, United States
- 2Department of Occupational Therapy, Colorado State University, Fort Collins, CO, United States
- 3Department of Computer Science, University of Colorado, Boulder, CO, United States
- 4Shirley Ryan Ability Lab and Department of Physical Medicine & Rehabilitation, Northwestern University, Chicago, IL, United States
- 5UCHealth, University of Colorado Hospital, Aurora, CO, United States
- 6Department of Psychology, Colorado State University, CO, United States
Background: Traumatic brain injury (TBI) is one of the most common and complex neurological conditions. Many TBI patients require ongoing rehabilitation beyond acute care, making treatment and discharge decisions critical. While individual risk factors for TBI outcomes are known, integrating comprehensive electronic health record (EHR) data into practical, validated prediction tools for personalized discharge planning and readmission risk assessment remains a key challenge. EHRs offer a valuable resource by integrating sociodemographic information, clinical care details, and prior healthcare encounters, providing an opportunity to develop models that predict key outcomes for TBI patients, such as discharge disposition and 30-day readmission.
Methods: This retrospective cohort study utilized EHRs from a large multi-hospital health system (2017–2023) to develop and validate statistical models predicting discharge disposition and 30-day readmission among hospitalized TBI patients, and to translate these models into an accessible clinical prediction tool. Descriptive statistics were calculated to summarize patient characteristics. Multinomial logistic regression was used to model discharge disposition, and logistic regression was used for 30-day readmission. Forward stepwise regression based on the Akaike information criterion was used for variable selection. Predictive performance was evaluated by 5-fold cross-validation using the area under the receiver operating characteristic curve (AUC).
Results: Several factors were significantly associated with both outcomes. Older age was positively associated with discharge to Inpatient Rehabilitation Facility/Skilled Nursing Facility or Hospice/Died versus Home (p < 0.001), and with 30-day readmission (p = 0.002). Ethnicity, significant other status, insurance, prior inpatient stays, length of stay, Glasgow Coma Scale, activities of daily living, and mobility were all significantly associated with discharge disposition (p < 0.001). Prior inpatient stays (p < 0.001) and intensive care unit admission (p = 0.002) were associated with higher odds of 30-day readmission, as was prior mental health diagnosis with marginal significance (p = 0.062), while Commercial insurance was associated with lower odds compared to Medicare (p = 0.024). A prediction tool is available.
Conclusion: We developed and validated predictive models using EHR data, culminating in a practical tool that may enhance the management of patients hospitalized with TBI by supporting personalized discharge planning and risk stratification.
1 Introduction
Traumatic brain injury (TBI) is a prevalent neurological disorder associated with significant individual costs and a substantial societal burden worldwide (1). Recovery from TBI typically requires substantial time and resources, highlighting the importance of effective treatment planning to improve long-term outcomes and reduce complications. Given the finite availability of therapy staff and rehabilitation services (2), efficient resource allocation is essential. Predictive tools can help identify patients who may benefit from specific services, thereby supporting more targeted and patient-centered care. For example, identifying patients who are likely to require post-acute care in an inpatient rehabilitation facility (IRF) or skilled nursing facility (SNF) can facilitate early involvement of rehabilitation specialists, multidisciplinary care planning, and proactive communication with patients and families (3–5). Similarly, identifying patients with elevated risk of 30-day readmission can prompt targeted interventions such as medication reconciliation, focused education on warning signs, and timely post-discharge follow-up (6). Optimizing resource allocation, ensuring appropriate discharge disposition, and preventing readmissions all highlight the need for personalized treatment tailored to each patient’s prognostic profile. Historically, however, the ability to personalize patient care has been viewed as a luxury, often unattainable for resource-constrained acute care facilities (7).
Existing studies have explored predicting discharge disposition (8–11), 30-day readmission (12–15), and other outcomes, such as mortality and functional outcomes (16–20), among TBI patients. Nevertheless, a unique aspect of this study is its use of electronic health records (EHRs), a valuable resource for personalizing patient care. The widespread adoption of EHRs is a relatively recent development. In 2004, the United States government set a goal for all Americans to have an EHR by 2014 (21). However, it was not until the American Recovery and Reinvestment Act of 2009, which incentivized the adoption of EHRs, that many healthcare providers actively pursued the transition to EHR systems (22). Despite the widespread adoption of EHRs, they are sometimes perceived as a burden, diverting clinicians’ time and attention away from direct patient care (23). While various studies have focused on the challenges of EHRs, the unprecedented opportunities to leverage real-world clinical data to improve patient care are increasingly being recognized (24, 25). Still, translating the vast repository of real-world clinical data into reliable, interpretable, and actionable insights for specific conditions like TBI presents ongoing challenges. This study contributes to that effort by leveraging EHRs that contain a wealth of information on patient demographics, clinical care details, and medical history that influence recovery after TBI. Navigating these complex datasets and extracting meaningful insights necessitate advanced analytical techniques that can be translated into user-friendly tools.
Statistical predictive modeling is a powerful approach for analyzing complex healthcare data, identifying patterns and relationships that are not readily apparent to care providers, and creating personalized predictions of patient outcomes (26). These insights can inform clinical decision-making, such as treatment strategies and discharge planning, and help reduce readmission rates.
We aim to develop and validate statistical models using a comprehensive set of EHR-derived variables to predict two key patient outcomes following TBI hospitalization: discharge disposition and 30-day readmission. The goal is not only to identify significant predictors but also to transform these complex EHR data into an accessible prediction tool for clinicians, thereby demonstrating a practical pathway to support personalized treatment and discharge planning, optimizing resource allocation, and ultimately enhancing patient outcomes during and after acute care. The prediction tool can be accessed at https://tjzhou.shinyapps.io/INREACHapp/.
2 Materials and methods
2.1 Study design and setting
This retrospective cohort study analyzed data extracted from deidentified EHRs from a large multi-hospital health system. The study period encompassed records from 2017 to 2023. Data were obtained with support from Health Data Compass, an enterprise health data warehouse, and the study was approved by the Colorado Multiple Institutional Review Board and the Colorado State University Institutional Review Board.
2.2 Study population
Adult patients (18 years or older) hospitalized with a primary diagnosis of TBI were considered for inclusion. TBI diagnoses were identified using the International Classification of Diseases, Tenth Revision (ICD-10) codes.
2.3 Data collection and preprocessing
Data were extracted from the EHR system, including patient demographics, medical histories, treatment details, and outcomes. The initial dataset of 11,137 patients was cleaned and coded, resulting in a final analytical cohort of 6,275 patients (Figure 1). A total of 4,862 patients were excluded because they had missing data for at least one of the following continuous variables: median income for zip code (n = 213), Glasgow coma scale (GCS, n = 465), activities of daily living (ADL, n = 3,921), and mobility (n = 2,218). Some patients had more than one missing variable. Missing data for categorical variables were handled by creating a separate “Missing” category.

Figure 1. Cohort selection diagram. (i) There can be overlapping missing data (e.g., both GCS and ADL can be missing for the same patient); (ii) when GCS, ADL, and mobility are missing, it is always the case that both the minimum and maximum scores are missing.
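Because the missing-data counts overlap, they do not sum to the number of excluded patients. The following is a minimal Python sketch of complete-case filtering on the continuous variables. The field names (`income`, `gcs`, `adl`, `mobility`) and the four toy records are hypothetical and are not drawn from the study's EHR extract.

```python
# Toy illustration of complete-case filtering.
# Field names and values are hypothetical, not the study's data.
CONTINUOUS = ("income", "gcs", "adl", "mobility")

def split_complete_cases(records, required=CONTINUOUS):
    """Return (kept, excluded, per-variable missing counts)."""
    kept, excluded = [], []
    missing_counts = {name: 0 for name in required}
    for rec in records:
        missing = [name for name in required if rec.get(name) is None]
        for name in missing:
            missing_counts[name] += 1
        # A record is excluded if at least one required variable is missing
        (excluded if missing else kept).append(rec)
    return kept, excluded, missing_counts

records = [
    {"id": 1, "income": 61000, "gcs": 14, "adl": 18, "mobility": 17},
    {"id": 2, "income": 48000, "gcs": None, "adl": None, "mobility": 12},
    {"id": 3, "income": None, "gcs": 15, "adl": 20, "mobility": 21},
    {"id": 4, "income": 75000, "gcs": 9, "adl": None, "mobility": None},
]
kept, excluded, missing_counts = split_complete_cases(records)
print(len(kept), len(excluded), missing_counts)
```

In the toy data, three records are excluded although the per-variable counts add up to five. The same pattern holds in the cohort: the per-variable counts reported above sum to 6,817, while 4,862 patients were excluded.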
2.4 Predictor variables
A comprehensive set of predictor variables was considered, including:
• Sociodemographic factors: Age, race, ethnicity, sex, significant other status, median income for zip code, percentage above high school education for zip code, and percentage above bachelor’s education for zip code.
• Substance use: Tobacco, alcohol, and drug use.
• Health and functional status: Three measures were considered. The Glasgow Coma Scale (GCS) is a clinical tool used to assess a patient’s level of consciousness by evaluating verbal, eye, and motor responses (27, 28); the GCS score ranges from 3 to 15, with higher scores indicating higher levels of responsiveness. Activities of daily living (ADL) were measured using the Activity Measure for Post-Acute Care (AM-PAC) “6-Clicks” tool, an electronically administered questionnaire (29, 30); the ADL score evaluates self-care abilities and ranges from 6 to 24, with higher scores indicating greater self-care independence. Mobility was also measured using the AM-PAC “6-Clicks” tool; the mobility score evaluates a patient’s ability to perform mobility tasks and ranges from 6 to 24, with higher scores reflecting greater mobility function. Because GCS, ADL, and mobility were measured multiple times during a patient’s stay, the minimum and maximum scores recorded across the acute care hospitalization were used. This approach captures the observed range of a patient’s neurological and functional status during their admission, rather than relying on a single time-point assessment (e.g., initial emergency department GCS), to inform predictions related to discharge planning and 30-day readmission risk from the perspective of the entire hospital course.
• Hospital encounter information: Total length of stay, emergency department (ED) visits, and intensive care unit (ICU) admission.
• Insurance and payment information: Insurance type (Medicare, Medicaid, Commercial, and Self-pay/Other).
• Prior medical history: Prior mental health diagnosis and inpatient stays.
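The minimum/maximum summarization of repeated GCS, ADL, and mobility assessments described above can be sketched in Python as follows. The row layout `(patient_id, scale, score)` and the example values are assumptions made for illustration; the actual EHR extract may be structured differently.

```python
from collections import defaultdict

def summarize_min_max(measurements):
    """Collapse repeated (patient_id, scale, score) rows to per-patient (min, max)."""
    scores = defaultdict(list)
    for patient_id, scale, score in measurements:
        scores[(patient_id, scale)].append(score)
    return {key: (min(vals), max(vals)) for key, vals in scores.items()}

# Hypothetical repeated assessments during one hospital stay
measurements = [
    ("p1", "GCS", 9), ("p1", "GCS", 13), ("p1", "GCS", 15),
    ("p1", "ADL", 12), ("p1", "ADL", 17),
    ("p2", "GCS", 15),
]
summary = summarize_min_max(measurements)
print(summary)
```

A patient with a single assessment, such as `p2` here, has identical minimum and maximum scores.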
2.5 Outcome variables
The primary outcome variables of interest were:
• Discharge disposition: Categorized as Home, IRF/SNF, Hospice/Died, and Other.
• 30-day readmission: Defined as any unplanned hospital readmission within 30 days of the initial discharge date for TBI.
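As an illustration of the readmission definition, the sketch below flags a 30-day readmission from dates. It is not the study's extraction logic: the `(admit_date, planned)` tuple layout, the handling of the window boundaries (day 0 and day 30 both counted), and the way planned admissions are identified are all assumptions.

```python
from datetime import date

def readmitted_within_30_days(index_discharge, later_admissions, window_days=30):
    """True if any unplanned admission begins within window_days of the index discharge.

    later_admissions: iterable of (admit_date, planned) tuples (hypothetical layout).
    Boundary handling (0 <= gap <= window_days) is an assumption.
    """
    for admit_date, planned in later_admissions:
        gap = (admit_date - index_discharge).days
        if not planned and 0 <= gap <= window_days:
            return True
    return False

discharge = date(2021, 3, 1)
print(readmitted_within_30_days(discharge, [(date(2021, 3, 20), False)]))  # unplanned, day 19
print(readmitted_within_30_days(discharge, [(date(2021, 3, 20), True)]))   # planned admission
print(readmitted_within_30_days(discharge, [(date(2021, 4, 15), False)]))  # day 45
```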
2.6 Statistical analysis
Descriptive statistics, including frequencies for categorical variables and means with standard deviations for continuous variables, were calculated to summarize patient characteristics. Multinomial logistic regression was used to model discharge disposition (a categorical variable), with the reference category for the outcome variable set to “Home” (31). Logistic regression was used to model 30-day readmission (a binary variable) (31). Forward stepwise regression based on the Akaike information criterion (AIC) was employed for variable selection and to identify important predictors for each outcome, with a maximum of 10 predictors included in each final model for ease of interpretation and model parsimony (32–34). A reasonable reference category was selected for each categorical predictor. Likelihood ratio tests were used to calculate p-values from multiple regression models that included all important predictors identified through stepwise selection. Statistical significance was determined using a p-value threshold of 0.05. Additionally, p-values from individual simple regression models with one predictor at a time (marginal p-values) were also calculated and reported. This approach captures both the conditional effect, the significance of a predictor after controlling for the other important predictors, and the marginal effect, which reflects a predictor’s significance when considered independently from the other predictors. The p-values are intended for descriptive purposes, with the understanding that their interpretation is complicated by post-selection inference (35) and multiple testing (36).
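The selection procedure can be sketched generically in Python: at each step, add the candidate that lowers the AIC the most, and stop when no candidate lowers it or when the cap on predictors is reached. This is a simplified sketch, not the R `step` implementation used in the study. The `aic_of` callback stands in for refitting the regression model, and the log-likelihood gains are invented toy numbers.

```python
def forward_select(candidates, aic_of, max_predictors=10):
    """Greedy forward selection by AIC with a cap on the number of predictors."""
    selected = []
    best_aic = aic_of(selected)
    while len(selected) < max_predictors:
        trials = [(aic_of(selected + [c]), c) for c in candidates if c not in selected]
        if not trials:
            break
        trial_aic, trial_var = min(trials)
        if trial_aic >= best_aic:
            break  # no remaining candidate improves the AIC
        selected.append(trial_var)
        best_aic = trial_aic
    return selected, best_aic

# Toy stand-in for model fitting: each predictor adds a fixed log-likelihood gain.
# Real predictors interact, and categorical predictors add more than one parameter.
GAINS = {"age": 40.0, "prior_stays": 25.0, "insurance": 12.0, "sex": 0.4}

def toy_aic(variables):
    log_lik = -500.0 + sum(GAINS[v] for v in variables)
    n_params = len(variables) + 1  # one coefficient per predictor plus the intercept
    return 2 * n_params - 2 * log_lik  # AIC = 2k - 2 log L

selected, best_aic = forward_select(list(GAINS), toy_aic)
print(selected, best_aic)
```

In the toy example, `sex` is never added because its log-likelihood gain (0.4) is smaller than the penalty of one additional parameter.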
The area under the receiver operating characteristic curve (AUC) was used to evaluate the predictive performance of forward stepwise logistic regression (37). When the outcome variable has more than 2 classes (discharge disposition), the multiclass AUC was applied (38). To better capture out-of-sample predictive performance, 5-fold cross-validation was performed (39). Other candidate models and variable selection methods, including random forest (40), the least absolute shrinkage and selection operator (LASSO) (41), support vector machine (SVM) (42), and the full logistic regression model without variable selection, were compared based on cross-validated AUC values. Note that random forest, LASSO, and SVM do not provide straightforward quantification of predictor significance through p-values.
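For intuition, the binary AUC equals the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case. One common multiclass extension averages pairwise AUCs over all pairs of classes. The Python sketch below implements both. That this pairwise average matches the exact definition in reference (38), or the computation performed by the pROC package, is an assumption, and the example probabilities are invented.

```python
from itertools import combinations

def binary_auc(scores_pos, scores_neg):
    """Probability that a random positive outscores a random negative (ties count 1/2)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            wins += 1.0 if p > n else 0.5 if p == n else 0.0
    return wins / (len(scores_pos) * len(scores_neg))

def multiclass_auc(labels, probs):
    """Average of pairwise AUCs over all unordered class pairs.

    probs[i] maps each class to its predicted probability for observation i.
    """
    classes = sorted(set(labels))
    pair_aucs = []
    for a, b in combinations(classes, 2):
        # How well P(a) separates class a from class b, and P(b) separates b from a
        a_vs_b = binary_auc([p[a] for y, p in zip(labels, probs) if y == a],
                            [p[a] for y, p in zip(labels, probs) if y == b])
        b_vs_a = binary_auc([p[b] for y, p in zip(labels, probs) if y == b],
                            [p[b] for y, p in zip(labels, probs) if y == a])
        pair_aucs.append((a_vs_b + b_vs_a) / 2)
    return sum(pair_aucs) / len(pair_aucs)

# Hypothetical predictions for five patients
labels = ["Home", "Home", "IRF/SNF", "IRF/SNF", "Other"]
probs = [
    {"Home": 0.80, "IRF/SNF": 0.15, "Other": 0.05},
    {"Home": 0.70, "IRF/SNF": 0.20, "Other": 0.10},
    {"Home": 0.20, "IRF/SNF": 0.70, "Other": 0.10},
    {"Home": 0.30, "IRF/SNF": 0.60, "Other": 0.10},
    {"Home": 0.10, "IRF/SNF": 0.20, "Other": 0.70},
]
print(binary_auc([0.9, 0.8], [0.1, 0.8]), multiclass_auc(labels, probs))
```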
The R statistical software was used to fit the models to the data (43). The multinomial logistic regression model was fitted using the multinom function from the nnet package (44). Logistic regression was performed with the glm function in base R. Forward stepwise regression, based on the AIC, was conducted using the step function in base R. AUC values were calculated with the pROC package (45). The random forest model was fitted with the randomForest package (46), while the LASSO method was implemented using the glmnet package (47). The SVM method was implemented using the svm function from the e1071 package (48). The shiny package was used to create an R Shiny app that predicts a patient’s discharge disposition and chance of 30-day readmission (49).
3 Results
3.1 Study population characteristics
The study cohort comprised 6,275 adult patients hospitalized for TBI. Descriptive statistics are provided in Tables 1, 2. The mean age was 60.74 years (SD = 20.82), and 60.41% of the patients were male. The racial distribution of the cohort was as follows: 78.47% White, 5.40% Black, 1.91% Asian, 0.76% Native American, 0.29% Pacific Islander, 0.62% Multiple Race, and 12.54% Missing. The ethnic distribution of the cohort was: 84.62% Non-Hispanic, 14.15% Hispanic, and 1.23% Missing. The distribution of insurance types was: 49.72% Medicare, 20.13% Medicaid, 20.41% Commercial, and 9.74% Self-pay/Other. Prior mental health conditions were present in 10.61% of the cohort, and 16.03% of the cohort had prior inpatient stays. The average length of stay was 11.48 days (SD = 18.51), and 55.90% of patients required an ICU admission. An ED visit was recorded for 67.73% of patients, 7.01% did not have an ED visit, and ED visit status was missing for the remaining 25.26%. The cohort’s mean minimum and maximum GCS scores during their stays were 11.87 (SD = 4.14) and 14.83 (SD = 0.68), respectively. The mean minimum and maximum ADL scores were 15.24 (SD = 5.31) and 17.55 (SD = 4.78), while the mean minimum and maximum mobility scores were 13.94 (SD = 5.48) and 18.34 (SD = 4.75). The distribution of discharge disposition of the cohort was: 50.57% Home, 38.74% IRF/SNF, 3.68% Hospice/Died, and 7.01% Other. Lastly, 8.61% of the patients had an unplanned 30-day readmission.
3.2 Discharge disposition model
Multinomial logistic regression was used to model discharge disposition, with “Home” set as the reference category for the outcome variable. Forward stepwise regression was employed to identify 10 predictors that were most predictive of discharge disposition. Table 3 shows the full model output with regression coefficients, p-values, and marginal p-values. Older age, prior inpatient stays, and longer length of stay were all significantly associated with a greater likelihood of discharge to IRF/SNF, Hospice/Died, or Other compared to Home (all p < 0.001). Being Hispanic, having a significant other, having Medicaid or Commercial insurance rather than Medicare, higher minimum GCS scores, higher maximum ADL scores, and higher minimum/maximum mobility scores were significantly associated with a greater likelihood of being discharged Home compared to all other categories (all p < 0.001). Higher maximum GCS score was linked to a lower likelihood of discharge to Hospice/Died and Other compared to Home, but a greater likelihood to IRF/SNF (p < 0.001). Higher minimum ADL score was linked to a lower likelihood of discharge to IRF/SNF and Other compared to Home, but a greater likelihood to Hospice/Died (p < 0.001). The difference in discharge disposition between patients with and without an ED visit was not significant when controlling for the above predictors (p = 0.211). However, ED visit was selected by forward regression, driven primarily by the “Missing” category (p = 0.003).
3.3 30-day readmission model
Logistic regression was used to model 30-day readmission, with forward stepwise regression identifying 7 predictors that were most predictive of the outcome variable. The full model results, including regression coefficients, p-values, and marginal p-values, can be found in Table 4. Older age (p = 0.002) and prior inpatient stays (p < 0.001) were significantly associated with higher odds of readmission. Commercial insurance was associated with lower odds of readmission compared to Medicare (p = 0.024), while the difference between Medicaid and Medicare was not significant (p = 0.963). Prior mental health diagnosis was associated with higher odds of readmission, although the association was only marginally significant in the multiple regression model (p = 0.062). ICU admission, significant only in the multiple regression model (p = 0.002), was also associated with higher odds of readmission. The differences in 30-day readmission between patients with and without a significant other (p = 0.716) and between patients with and without alcohol use (p = 0.571) were not significant when controlling for the other predictors included in the model. However, significant other status and alcohol use were selected by forward regression, driven primarily by the “Missing” category (p < 0.001 and p = 0.006, respectively).
3.4 Model evaluation
The AUC was used as the performance metric to assess the predictive performance of the forward stepwise logistic regression model (37, 38). The AUC quantifies the model’s ability to distinguish between outcome categories by integrating sensitivity and specificity, providing a summary of performance across all classification thresholds. This makes the AUC particularly useful for applications involving imbalanced outcomes, as in this study. The AUC ranges from 0 to 1, where 0.5 corresponds to chance-level discrimination and higher values indicate stronger discriminatory power. To obtain a robust estimate of out-of-sample performance, 5-fold cross-validation was employed. The data were randomly partitioned into five subsets, with the model trained on four subsets and tested on the remaining one. This process was repeated five times, with each subset serving as the test set once, and the average AUC value was taken.
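The partition-train-test-average loop can be sketched in Python as follows. The `fit` and `score` callbacks are placeholders for fitting a model on the training indices and computing its AUC on the held-out indices. The toy callbacks, the seed, and the helper names are illustrative assumptions.

```python
import random

def k_fold_indices(n, k=5, seed=0):
    """Randomly partition range(n) into k disjoint folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_validate(n, fit, score, k=5, seed=0):
    """Average the held-out score over k train/test splits."""
    folds = k_fold_indices(n, k, seed)
    fold_scores = []
    for i, test_idx in enumerate(folds):
        # Train on the other k - 1 folds, evaluate on fold i
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        model = fit(train_idx)
        fold_scores.append(score(model, test_idx))
    return sum(fold_scores) / k

# With the cohort size of 6,275, each of the five folds holds 1,255 patients.
folds = k_fold_indices(6275, k=5)
print([len(fold) for fold in folds])

# Toy callbacks: the "model" is the training-set size, and the "score" is the
# training fraction. They only demonstrate the control flow.
avg = cross_validate(6275, fit=len, score=lambda m, test: m / (m + len(test)))
print(avg)
```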
For comparison, additional candidate models and variable selection methods were considered, including random forest, LASSO, SVM, and the full logistic regression model without variable selection. These models were selected to represent a range of approaches varying in complexity and interpretability. Their performance was compared to the stepwise logistic regression model.
The cross-validated AUC values for each model are presented in Table 5. For discharge disposition, the forward stepwise logistic regression model achieved an AUC of 0.773, comparable to the AUC of 0.779 for random forest, 0.770 for LASSO, and 0.772 for the full model. SVM yielded a slightly lower AUC of 0.759. For 30-day readmission, the forward stepwise logistic regression model achieved an AUC of 0.656, comparable to LASSO (AUC = 0.655) and outperforming random forest (AUC = 0.609), SVM (AUC = 0.546), and the full model (AUC = 0.638).

Table 5. 5-fold cross-validated AUC for measuring and comparing the predictive performance of forward stepwise logistic regression (Forward), random forest (RF), the least absolute shrinkage and selection operator (LASSO), support vector machine (SVM), and the full logistic regression model without variable selection (Full).
Based on these results, the forward stepwise logistic regression model showed reasonable discrimination for discharge disposition (AUC = 0.773) and more modest discrimination for 30-day readmission (AUC = 0.656), in both cases comparable to or better than the alternative models considered. Given its competitive performance, straightforward variable selection, and the added benefit of providing interpretable regression coefficients and p-values, forward stepwise logistic regression was selected as the primary modeling approach for this study.
3.5 Prediction tool
An R Shiny app was developed to provide an interactive interface for these predictive models. The app allows clinicians to input a patient’s values for the identified predictors and subsequently receive estimated probabilities for different discharge dispositions and the likelihood of 30-day readmission. This tool aims to support clinical decision-making for treatment and discharge planning by predicting a patient’s discharge disposition and chance of 30-day readmission. The app is available at https://tjzhou.shinyapps.io/INREACHapp/.
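To illustrate how fitted coefficients become the displayed probabilities, the Python sketch below converts linear predictors into probabilities for a multinomial logistic model with "Home" as the reference category, and for a binary logistic model. The linear predictor values are hypothetical and are not taken from Tables 3 or 4. The function names are illustrative, and the app itself is written in R.

```python
import math

def multinomial_probabilities(linear_predictors, reference="Home"):
    """Convert per-category log-odds (relative to the reference) into probabilities."""
    exp_terms = {cat: math.exp(eta) for cat, eta in linear_predictors.items()}
    exp_terms[reference] = 1.0  # the reference category has linear predictor 0
    total = sum(exp_terms.values())
    return {cat: val / total for cat, val in exp_terms.items()}

def readmission_probability(eta):
    """Inverse logit for the binary 30-day readmission model."""
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical linear predictors (intercept + coefficients times patient values)
etas = {"IRF/SNF": 0.4, "Hospice/Died": -2.0, "Other": -1.5}
disposition = multinomial_probabilities(etas)
print(disposition)
print(readmission_probability(-2.3))
```

The four disposition probabilities sum to 1, and the ratio of any category's probability to that of Home equals the exponential of its linear predictor.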
4 Discussion
We used statistical predictive modeling to create a practical tool that may support treatment and discharge planning for TBI patients, leveraging the rich data available in EHRs. While several factors identified in our models, such as age and GCS, are established predictors of TBI outcomes (9, 12, 18), this study’s primary contribution extends beyond re-identifying individual risk factors. We demonstrate the utility of integrating a comprehensive suite of readily available EHR variables, encompassing sociodemographics, functional status, hospital encounter details, insurance, and medical history from a large, multi-hospital system, into parsimonious yet robust predictive models. Furthermore, we show that forward stepwise logistic regression, a highly interpretable method, performs comparably to or better than more complex machine learning approaches like random forests or SVMs for these specific TBI outcomes (Table 5). Crucially, we translate these validated models into an accessible, interactive prediction tool, bridging the gap between data analysis and potential clinical application. Our analyses identified several important predictors for discharge disposition and 30-day readmission, providing valuable insights for clinical decision-making.
The INREACHapp prediction tool (https://tjzhou.shinyapps.io/INREACHapp/) is designed to translate these statistical insights into actionable information at the point of care. For instance, when a clinician inputs a patient’s data, the tool provides probabilities for discharge to Home, IRF/SNF, Hospice/Died, or Other. A high predicted probability for IRF/SNF might prompt earlier engagement of rehabilitation specialists, multidisciplinary team discussions about post-acute care needs, and proactive communication with the patient and family regarding expectations and planning. Similarly, a high predicted risk of 30-day readmission could trigger targeted interventions. These might include comprehensive medication reconciliation, enhanced patient and caregiver education focused on warning signs, scheduling prompt post-discharge follow-up appointments, or coordinating with community-based services to ensure a smoother care transition. The current models utilize summary variables from the hospital stay (e.g., length of stay, min/max functional scores) and are thus most relevant for informing discharge planning as the patient stabilizes. While the tool can be used whenever these data points are available, dynamic, day-to-day prediction based on evolving patient status would represent a future development, potentially requiring different predictors and model structures.
Regarding the discharge disposition model, several factors were predictive of discharge to different settings. Not unexpectedly, older age was associated with a higher likelihood of discharge to IRF/SNF, Hospice/Died, or Other rather than home. Similarly, a history of prior inpatient stays and a longer length of stay were significantly associated with a higher likelihood of discharge to other settings instead of home. This is not surprising, as prior hospitalizations and extended stays likely indicate a more complex medical history and greater symptom severity, necessitating continued post-discharge care. The inclusion of both minimum and maximum functional scores (GCS, ADL, Mobility) by the stepwise selection process suggests that both the lowest point of function and the peak recovery achieved during hospitalization contribute unique predictive information for discharge disposition.
In contrast, identifying as Hispanic, having a significant other, being insured through Medicaid or Commercial insurance rather than Medicare, and having higher minimum GCS, maximum ADL, and minimum/maximum mobility scores were all significantly associated with a higher likelihood of being discharged home rather than to other settings. Factors reflecting less severe brain injury (GCS) and improved health and mobility (ADL and mobility) are logically related to a greater likelihood of home discharge. It is important to interpret the discharge disposition model outputs with nuance. The “Home” category was used as the reference in our multinomial logistic regression, a standard statistical approach, and its prediction does not inherently imply it is a “better” outcome than IRF/SNF for every individual. The most appropriate discharge setting is a complex clinical decision tailored to individual patient needs, functional status, and support systems. Our model aims to predict the observed discharge patterns within our healthcare system based on the available data, thereby providing insights into factors influencing these decisions and helping to anticipate post-acute care needs and resource utilization. The primary outcomes modeled were the discharge event itself and 30-day readmission, not comparative long-term functional outcomes across different discharge settings, which would necessitate different study designs and outcome measures.
The association between identifying as Hispanic and discharge disposition may be influenced by various factors, as discussed previously (50–54). Our findings regarding ethnicity’s association with discharge disposition, even within a cohort that was predominantly Caucasian, highlight the complex interplay of socio-cultural factors in healthcare decisions and outcomes. This underscores the need for future research to validate and potentially recalibrate such predictive models in more racially and ethnically diverse TBI populations to ensure their equitable applicability and to uncover population-specific predictors. Similarly, the presence of a significant other facilitating home discharge aligns with prior research (55), though the complexities noted (56) remain relevant.
In terms of 30-day readmission, older age, Medicare rather than Commercial insurance, prior mental health diagnosis, previous inpatient stays, and ICU admission were associated with a higher risk. Readmission after acute care for TBI is not merely an indicator of care quality or a source of financial strain for patients and healthcare systems (57, 58); it often signals unresolved medical issues or the emergence of TBI-related complications such as post-traumatic seizures, persistent headaches, or neurological decline. Such early readmissions can negatively impact long-term functional recovery, increase overall morbidity, and are associated with a higher risk of mortality (13, 57, 59–61). While our model identified factors like prior inpatient stays and ICU admission as significant predictors or proxies for greater medical complexity and severity, specific TBI sequelae (e.g., seizure disorders, persistent headaches) or detailed mechanisms of injury were not available as discrete variables for inclusion in this particular dataset. Future iterations of predictive models could be enhanced by incorporating such granular clinical information where feasible, potentially improving predictive accuracy for TBI-specific adverse events.
Many of our findings regarding individual predictors are consistent with previous studies (9, 12, 18). However, our study expands upon prior research by leveraging EHR data from a large multi-hospital health system, which capture a broader range of predictors simultaneously. This allows us to develop a more comprehensive predictive model using interpretable statistical methods that perform comparably to more complex machine learning techniques in this context.
Several directions can be pursued to further validate and generalize the current study. First, applying the proposed method to other sources of EHR data can help assess the robustness of the conclusions. Second, the method remains applicable even when predictions are made at different time points, with candidate predictors added or dropped based on data availability. Third, developing a flexible software infrastructure that can adapt to varying EHR structures and automate the construction of the prediction tool across different datasets would enhance its scalability and usability. Fourth, incorporating unstructured EHR data could make the predictive model more comprehensive (62). Fifth, expanding the tool to include other relevant outcomes would further increase its clinical utility. Lastly, evaluating the feasibility and impact of integrating the prediction tool into clinical workflows is a critical next step.
The principles and methodologies central to this study, which include the systematic leveraging of comprehensive EHR data, the application of robust statistical predictive modeling, and the development of user-friendly decision-support tools, hold considerable potential for broader application in TBI care worldwide. While specific predictor variables and their weights will undoubtedly vary across different healthcare systems, patient demographics, and cultural contexts, the foundational approach of transforming routinely collected clinical data into actionable, personalized insights can empower clinicians globally. This can lead to more efficient resource allocation, timely interventions for at-risk patients, and ultimately, contribute to improving TBI care pathways and outcomes. Future international collaborations could focus on standardizing key TBI-related data elements within EHRs and sharing best practices for the development, validation, and ethical implementation of such predictive tools across diverse settings.
5 Limitations
This study has several strengths, including a large sample size from a multi-hospital system, the use of real-world EHR data, rigorous statistical methodology, and the development of a user-friendly app. However, certain limitations should be acknowledged. As a retrospective cohort study, there is potential for selection bias and unmeasured confounding (63). Therefore, the findings should be interpreted as associations that do not necessarily represent causal relationships. The study cohort, while large, was predominantly Caucasian (78.47% White). This may limit the generalizability of the specific model coefficients to TBI populations with different racial and ethnic compositions. Future validation and calibration in more diverse cohorts are essential. Certain factors potentially pertinent to TBI outcomes, such as the detailed mechanism of injury or the presence of specific post-TBI complications (e.g., seizures, neurobehavioral symptoms not captured by broader mental health diagnoses), were not available as discrete, structured variables in our dataset for model inclusion. Their absence might affect the model’s comprehensiveness (11, 13). The GCS scores utilized (minimum and maximum during hospitalization) differ from the initial ED GCS often reported in acute TBI prognostic studies. While these scores reflect a patient’s trajectory during admission and are relevant for discharge planning, they may capture different aspects of neurological status than a one-time ED assessment. Missing data, a common challenge in EHR analyses, were handled by excluding records with missing values in continuous variables and creating a “Missing” category for categorical variables with missingness. However, the impact of missing data on the findings requires further investigation, as nonignorable missingness may introduce bias. For example, underserved groups may be more susceptible to missing data due to fragmented care or language barriers (64). 
Additionally, the “Missing” category in several categorical variables, including ethnicity, significant other status, ED visits, and alcohol use, was significantly associated with discharge disposition and 30-day readmission (Tables 3, 4). Given the inherent ambiguity of the “Missing” category, it is unclear whether it reflects specific patient characteristics, underlying conditions, or reporting practices at certain facilities. Findings related to this category should therefore be interpreted with caution. Moreover, while ethnicity was significantly associated with discharge disposition, race was not identified as an important predictor for either outcome. This may be due to the small number of participants whose recorded race was neither White nor Missing (Table 1). Lastly, the models developed in this study estimate the probability of each discharge disposition and of 30-day readmission based on patterns in the EHR data. These predictions reflect overall trends and should complement, not replace, clinical judgment and holistic patient assessment, so that all patients receive individualized care based on their comprehensive needs.
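The missing-data rule described above (excluding records with a missing continuous value and recoding missing categorical values as “Missing”) can be sketched as follows. This is a minimal illustration in Python, not the study's code, which was written in R; the function names, variable names, and example records are hypothetical and chosen only to show the logic.

```python
from typing import Any, Dict, Iterable, List

MISSING_LABEL = "Missing"  # explicit level assigned to absent categorical values


def is_missing(value: Any) -> bool:
    """Treat None, blank strings, and NaN as missing."""
    if value is None:
        return True
    if isinstance(value, str):
        return value.strip() == ""
    if isinstance(value, float):
        return value != value  # NaN is the only float not equal to itself
    return False


def prepare_records(
    records: Iterable[Dict[str, Any]],
    continuous: List[str],
    categorical: List[str],
) -> List[Dict[str, Any]]:
    """Drop records lacking any continuous value; recode missing categoricals."""
    prepared = []
    for record in records:
        # Complete-case rule for continuous variables
        if any(is_missing(record.get(name)) for name in continuous):
            continue
        cleaned = dict(record)  # copy so the input is left unchanged
        # Missing-indicator rule for categorical variables
        for name in categorical:
            if is_missing(cleaned.get(name)):
                cleaned[name] = MISSING_LABEL
        prepared.append(cleaned)
    return prepared


# Hypothetical records; field names and values are illustrative only
example = [
    {"age": 71.0, "gcs_min": 13.0, "ethnicity": "Hispanic", "alcohol_use": None},
    {"age": None, "gcs_min": 15.0, "ethnicity": "Non-Hispanic", "alcohol_use": "No"},
    {"age": 54.0, "gcs_min": float("nan"), "ethnicity": "", "alcohol_use": "Yes"},
    {"age": 38.0, "gcs_min": 9.0, "ethnicity": "", "alcohol_use": "Yes"},
]
cleaned = prepare_records(example, ["age", "gcs_min"], ["ethnicity", "alcohol_use"])
```

In this sketch the second and third records are excluded because a continuous value is absent, while the first and fourth are retained with “Missing” substituted for the absent categorical value. The asymmetry illustrates the concern raised above: exclusion shrinks the sample in ways that may not be random, and the “Missing” level pools patients whose data are absent for different reasons.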
Data availability statement
The datasets presented in this article are not readily available because the data analyzed in this study were obtained through a data use agreement with the University of Colorado Hospital Authority (UCHA). Others must establish their own data use agreements with UCHA to access these datasets. Requests to access the datasets should be directed to https://www.uchealth.org/locations/uchealth-university-of-colorado-hospital-uch/.
Ethics statement
The studies involving humans were approved by the Colorado Multiple Institutional Review Board and the Colorado State University Institutional Review Board. The studies were conducted in accordance with the local legislation and institutional requirements. The participants provided their written informed consent to participate in this study.
Author contributions
TZ: Conceptualization, Data curation, Formal analysis, Funding acquisition, Methodology, Project administration, Software, Supervision, Validation, Writing – original draft, Writing – review & editing. JG: Conceptualization, Data curation, Funding acquisition, Methodology, Project administration, Validation, Writing – original draft, Writing – review & editing, Resources. DD-D: Validation, Writing – original draft, Writing – review & editing. AM: Formal analysis, Methodology, Software, Writing – review & editing. JE: Data curation, Writing – review & editing. AH: Data curation, Writing – review & editing. HW: Conceptualization, Funding acquisition, Writing – review & editing. DD: Conceptualization, Funding acquisition, Methodology, Project administration, Supervision, Validation, Writing – original draft, Writing – review & editing.
Funding
The author(s) declare that financial support was received for the research and/or publication of this article. Funding for this project was provided by the INterdisciplinary REsearch into Aging Challenges (IN-REACH) pilot grant administered by the Columbine Health Systems Center for Healthy Aging at Colorado State University.
Acknowledgments
This project was supported by the Health Data Compass Data Warehouse project (healthdatacompass.org).
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declare that no Gen AI was used in the creation of this manuscript.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
References
1. Maas, AIR, Menon, DK, Manley, GT, Abrams, M, Åkerlund, C, Andelic, N, et al. Traumatic brain injury: progress and challenges in prevention, clinical care, and research. Lancet Neurol. (2022) 21:1004–60. doi: 10.1016/S1474-4422(22)00309-X
2. Ziemek, J, Hoge, N, Woodward, KF, Doerfler, E, Bradywood, A, Pletcher, A, et al. Hospital personnel perspectives on factors influencing acute care patient outcomes: a qualitative approach to model refinement. BMC Health Serv Res. (2024) 24:805. doi: 10.1186/s12913-024-11271-x
3. Lin, CJ, Cheng, SJ, Shih, SC, Chu, CH, and Tjung, JJ. Discharge planning. Int J Gerontol. (2012) 6:237–40. doi: 10.1016/j.ijge.2012.05.001
4. Bajorek, SA, and McElroy, V. Discharge planning and transitions of care. Disch plan transit care. (2019). Available online at: https://psnet.ahrq.gov/primer/discharge-planning-and-transitions-care (Accessed November 29, 2024).
5. James, MK, Robitsek, RJ, Saghir, SM, Gentile, PA, Ramos, M, and Perez, F. Clinical and non-clinical factors that predict discharge disposition after a fall. Injury. (2018) 49:975–82. doi: 10.1016/j.injury.2018.02.014
6. Shadmi, E, Flaks-Manov, N, Hoshen, M, Goldman, O, Bitterman, H, and Balicer, RD. Predicting 30-day readmissions with preadmission electronic health record data. Med Care. (2015) 53:283–9. doi: 10.1097/MLR.0000000000000315
7. Shah, MA, Firdous, A, and Dar, GN. Personalized cancer nanomedicine: why it is a necessity and not a luxury. In: MA Aziz, editor. Personalized and precision nanomedicine for cancer treatment. Singapore: Springer Nature (2024). 389–98.
8. Chan, L, Doctor, J, Temkin, N, MacLehose, RF, Esselman, P, Bell, K, et al. Discharge disposition from acute care after traumatic brain injury: the effect of insurance type. Arch Phys Med Rehabil. (2001) 82:1151–4. doi: 10.1053/apmr.2001.24892
9. Cuthbert, JP, Corrigan, JD, Harrison-Felix, C, Coronado, V, Dijkers, MP, Heinemann, AW, et al. Factors that predict acute hospitalization discharge disposition for adults with moderate to severe traumatic brain injury. Arch Phys Med Rehabil. (2011) 92:721–730.e3. doi: 10.1016/j.apmr.2010.12.023
10. Lu, J, Gormley, M, Donaldson, A, Agyemang, A, Karmarkar, A, and Seel, RT. Identifying factors associated with acute hospital discharge dispositions in patients with moderate-to-severe traumatic brain injury. Brain Inj. (2022) 36:383–92. doi: 10.1080/02699052.2022.2034180
11. Satyadev, N, Warman, PI, Seas, A, Kolls, BJ, Haglund, MM, Fuller, AT, et al. Machine learning for predicting discharge disposition after traumatic brain injury. Neurosurgery. (2022) 90:768–74. doi: 10.1227/neu.0000000000001911
12. Canner, JK, Giuliano, K, Gani, F, and Schneider, EB. Thirty-day re-admission after traumatic brain injury: results from MarketScan®. Brain Inj. (2016) 30:1570–5. doi: 10.1080/02699052.2016.1199898
13. Saverino, C, Swaine, B, Jaglal, S, Lewko, J, Vernich, L, Voth, J, et al. Rehospitalization after traumatic brain injury: a population-based study. Arch Phys Med Rehabil. (2016) 97:S19–25. doi: 10.1016/j.apmr.2015.04.016
14. Li, CY, Karmarkar, A, Adhikari, D, Ottenbacher, K, and Kuo, YF. Effects of age and sex on hospital readmission in traumatic brain injury. Arch Phys Med Rehabil. (2018) 99:1279–1288.e1. doi: 10.1016/j.apmr.2017.12.006
15. Kelly, DJ, Thibault, D, Tam, D, Liu, LJ, Cragg, JJ, Willis, AW, et al. Readmission following hospitalization for traumatic brain injury: a nationwide study. J Head Trauma Rehabil. (2022) 37:E165–74. doi: 10.1097/HTR.0000000000000699
16. MRC CRASH Trial Collaborators. Predicting outcome after traumatic brain injury: practical prognostic models based on large cohort of international patients. BMJ. (2008) 336:425–9. doi: 10.1136/bmj.39461.643438.25
17. Steyerberg, EW, Mushkudiani, N, Perel, P, Butcher, I, Lu, J, McHugh, GS, et al. Predicting outcome after traumatic brain injury: development and international validation of prognostic scores based on admission characteristics. PLoS Med. (2008) 5:e165. doi: 10.1371/journal.pmed.0050165
18. Lingsma, HF, Roozenbeek, B, Steyerberg, EW, Murray, GD, and Maas, AI. Early prognosis in traumatic brain injury: from prophecies to predictions. Lancet Neurol. (2010) 9:543–54. doi: 10.1016/S1474-4422(10)70065-X
19. Walker, WC, Stromberg, KA, Marwitz, JH, Sima, AP, Agyemang, AA, Graham, KM, et al. Predicting long-term global outcome after traumatic brain injury: development of a practical prognostic tool using the traumatic brain injury model systems national database. J Neurotrauma. (2018) 35:1587–95. doi: 10.1089/neu.2017.5359
20. Khalili, H, Rismani, M, Nematollahi, MA, Masoudi, MS, Asadollahi, A, Taheri, R, et al. Prognosis prediction in traumatic brain injury patients using machine learning algorithms. Sci Rep. (2023) 13:960. doi: 10.1038/s41598-023-28188-w
21. Simborg, DW. Promoting electronic health record adoption. Is it the correct focus? J Am Med Inform Assoc. (2008) 15:127–9. doi: 10.1197/jamia.M2573
22. Honavar, SG. Electronic medical records – the good, the bad and the ugly. Indian J Ophthalmol. (2020) 68:417–8. doi: 10.4103/ijo.IJO_278_20
23. Hill, RG, Sears, LM, and Melanson, SW. 4000 clicks: a productivity analysis of electronic medical records in a community hospital ED. Am J Emerg Med. (2013) 31:1591–4. doi: 10.1016/j.ajem.2013.06.028
24. Coorevits, P, Sundgren, M, Klein, GO, Bahr, A, Claerhout, B, Daniel, C, et al. Electronic health records: new opportunities for clinical research. J Intern Med. (2013) 274:547–60. doi: 10.1111/joim.12119
25. Cowie, MR, Blomster, JI, Curtis, LH, Duclaux, S, Ford, I, Fritz, F, et al. Electronic health records to facilitate clinical research. Clin Res Cardiol. (2017) 106:1–9. doi: 10.1007/s00392-016-1025-6
26. Abul-Husn, NS, and Kenny, EE. Personalized medicine and the power of electronic health records. Cell. (2019) 177:58–69. doi: 10.1016/j.cell.2019.02.039
27. Teasdale, G, and Jennett, B. Assessment of coma and impaired consciousness. A practical scale. Lancet. (1974) 2:81–4.
28. McNett, M. A review of the predictive ability of Glasgow coma scale scores in head-injured patients. J Neurosci Nurs. (2007) 39:68–75. doi: 10.1097/01376517-200704000-00002
29. Jette, DU, Stilphen, M, Ranganathan, VK, Passek, SD, Frost, FS, and Jette, AM. AM-PAC “6-clicks” functional assessment scores predict acute care hospital discharge destination. Phys Ther. (2014) 94:1252–61. doi: 10.2522/ptj.20130359
30. Jette, DU, Stilphen, M, Ranganathan, VK, Passek, SD, Frost, FS, and Jette, AM. Validity of the AM-PAC “6-clicks” inpatient daily activity and basic mobility short forms. Phys Ther. (2014) 94:379–91. doi: 10.2522/ptj.20130199
31. Menard, S. Applied logistic regression analysis. Thousand Oaks, CA, USA: SAGE publications (2001).
32. Draper, NR, and Smith, H. Applied regression analysis. New York, NY, USA: John Wiley & Sons (1998).
33. Hocking, RR. The analysis and selection of variables in linear regression. Biometrics. (1976) 32:1–49.
34. Akaike, H. A new look at the statistical model identification. IEEE Trans Automat Control. (1974) 19:716–23.
35. Kuchibhotla, AK, Kolassa, JE, and Kuffner, TA. Post-selection inference. Annu Rev Stat Appl. (2022) 9:505–27. doi: 10.1146/annurev-statistics-100421-044639
36. Bender, R, and Lange, S. Adjusting for multiple testing—when and how? J Clin Epidemiol. (2001) 54:343–9. doi: 10.1016/S0895-4356(00)00314-0
37. Hanley, JA, and McNeil, BJ. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology. (1982) 143:29–36.
38. Hand, DJ, and Till, RJ. A simple generalisation of the area under the ROC curve for multiple class classification problems. Mach Learn. (2001) 45:171–86. doi: 10.1023/A:1010920819831
39. Stone, M. Cross-validatory choice and assessment of statistical predictions. J R Stat Soc Ser B Methodol. (1974) 36:111–33. doi: 10.1111/j.2517-6161.1974.tb00994.x
41. Tibshirani, R. Regression shrinkage and selection via the lasso. J R Stat Soc Ser B Stat Methodol. (1996) 58:267–88.
42. Cristianini, N, and Shawe-Taylor, J. An introduction to support vector machines and other kernel-based learning methods. Cambridge, UK: Cambridge University Press (2000).
43. R Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing (2024).
44. Venables, WN, and Ripley, BD. Modern applied statistics with S. Fourth ed. New York, NY, USA: Springer (2002).
45. Robin, X, Turck, N, Hainard, A, Tiberti, N, Lisacek, F, Sanchez, JC, et al. pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinformatics. (2011) 12:1–8. doi: 10.1186/1471-2105-12-77
47. Friedman, J, Hastie, T, and Tibshirani, R. Regularization paths for generalized linear models via coordinate descent. J Stat Softw. (2010) 33:1–22. doi: 10.18637/jss.v033.i01
48. Meyer, D, Dimitriadou, E, Hornik, K, Weingessel, A, and Leisch, F. e1071: misc functions of the department of statistics, probability theory group (formerly: E1071), TU Wien. R package version 1.7-4. Available online at: https://CRAN.R-project.org/package=e1071. (2020). (Accessed December 1, 2024).
49. Chang, W, Cheng, J, Allaire, JJ, Xie, Y, and McPherson, J. Shiny: web application framework for R. R package version 1.5.0. Available online at: https://CRAN.R-project.org/package=shiny. (2020). (Accessed December 1, 2024).
50. Martínez, DE, and Gonzalez, KE. “Latino” or “Hispanic”? The sociodemographic correlates of panethnic label preferences among U.S. Latinos/Hispanics. Sociol Perspect. (2021) 64:365–86. doi: 10.1177/0731121420950371
51. Hacker, K, Anies, M, Folb, BL, and Zallman, L. Barriers to health care for undocumented immigrants: a literature review. Risk Manag Healthc Policy. (2015) 8:175–83. doi: 10.2147/RMHP.S70173
52. Betancourt, JR, Green, AR, Carrillo, JE, and Ananeh-Firempong, O. Defining cultural competence: a practical framework for addressing racial/ethnic disparities in health and health care. Public Health Rep. (2003) 118:293–302. doi: 10.1016/S0033-3549(04)50253-4
53. Bartley, CN, Atwell, K, Cairns, B, and Charles, A. Racial and ethnic disparities in discharge to rehabilitation following burn injury. J Burn Care Res. (2019) 40:143–7. doi: 10.1093/jbcr/irz001
54. Warren, KL, and García, JJ. Centering race/ethnicity: differences in traumatic brain injury inpatient rehabilitation outcomes. PM&R. (2022) 14:1430–8. doi: 10.1002/pmrj.12737
55. Farrell, RT, Bennett, BK, and Gamelli, RL. An analysis of social support and insurance on discharge disposition and functional outcomes in patients with acute burns. J Burn Care Res. (2010) 31:385–92. doi: 10.1097/BCR.0b013e3181db516b
56. Howie-Esquivel, J, and Spicer, JG. Association of partner status and disposition with rehospitalization in heart failure patients. Am J Crit Care. (2012) 21:e65–73. doi: 10.4037/ajcc2012382
57. Dams-O’Connor, K, Pretz, C, Mellick, D, Dreer, LE, Hammond, FM, Hoffman, J, et al. Rehospitalization over 10 years among survivors of TBI: a National Institute on Disability, Independent Living, and Rehabilitation Research (NIDILRR) Traumatic Brain Injury Model Systems study. J Head Trauma Rehabil. (2017) 32:147–57. doi: 10.1097/HTR.0000000000000263
58. Wish, JB. The role of 30-day readmission as a measure of quality. Clin J Am Soc Nephrol. (2014) 9:440–2. doi: 10.2215/CJN.00240114
59. Gardner, J, Sexton, KW, Taylor, J, Beck, W, Kimbrough, MK, Davis, B, et al. Defining severe traumatic brain injury readmission rates and reasons in a rural state. Trauma Surg Acute Care Open. (2018) 3:e000186. doi: 10.1136/tsaco-2018-000186
60. Koo, AB, Elsamadicy, AA, David, WB, Zogg, CK, Santarosa, C, Sujijantarat, N, et al. Thirty- and 90-day readmissions after treatment of traumatic subdural hematoma: national trend analysis. World Neurosurg. (2020) 139:e212–9. doi: 10.1016/j.wneu.2020.03.168
61. Keneally, RJ, Heinz, ER, Young, R, DeFreitas, C, and Estroff, JM. Modern readmission rates after head trauma. Proc (Baylor Univ Med Cent). (2023) 36:663–8. doi: 10.1080/08998280.2023.2249387
62. Juhn, Y, and Liu, H. Artificial intelligence approaches using natural language processing to advance EHR-based clinical research. J Allergy Clin Immunol. (2020) 145:463–9. doi: 10.1016/j.jaci.2019.12.897
63. Talari, K, and Goyal, M. Retrospective studies – utility and caveats. J R Coll Physicians Edinb. (2020) 50:398–402. doi: 10.4997/jrcpe.2020.409
Keywords: discharge planning, electronic health record, predictive modeling, readmission, rehabilitation, traumatic brain injury
Citation: Zhou T, Graham JE, Davalos-DeLosh D, Maulik AK, Edelstein J, Hoffman AL, Wang H and Davalos D (2025) Prediction tool for discharge disposition and 30-day readmission using electronic health records among patients hospitalized for traumatic brain injury. Front. Neurol. 16:1581176. doi: 10.3389/fneur.2025.1581176
Edited by:
Barak Bar, University of Wisconsin-Madison, United States
Reviewed by:
Nikki Miller Ferguson, Virginia Commonwealth University Health System, United States
Ved Vrat Verma, National Institute of Cancer Prevention and Research (ICMR), India
Copyright © 2025 Zhou, Graham, Davalos-DeLosh, Maulik, Edelstein, Hoffman, Wang and Davalos. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Tianjian Zhou, tianjian.zhou@colostate.edu