Improving the Prediction of Total Surgical Procedure Time Using Linear Regression Modeling

Edelman, Eric R.; van Kuijk, Sander M. J.; Hamaekers, Ankie E. W.; de Korte, Marcel J. M.; van Merode, Godefridus G.; Buhre, Wolfgang F. F. A.

doi:10.3389/fmed.2017.00085

ORIGINAL RESEARCH article

Front. Med., 19 June 2017

Sec. Intensive Care Medicine and Anesthesiology

Volume 4 - 2017 | https://doi.org/10.3389/fmed.2017.00085

This article is part of the Research Topic: Medical Management Redesign in Hospitals: Manage Growing Demands with Dwindling Resources

Eric R. Edelman¹*

Sander M. J. van Kuijk²

Ankie E. W. Hamaekers³

Marcel J. M. de Korte³

Godefridus G. van Merode⁴

Wolfgang F. F. A. Buhre³

¹Faculty of Health, Medicine and Life Sciences, Department of Health Services Research, CAPHRI School for Public Health and Primary Care, Maastricht University, Maastricht, Netherlands
²Department of Clinical Epidemiology and Medical Technology Assessment (KEMTA), Maastricht University Medical Center+, Maastricht, Netherlands
³Department of Anesthesiology, Maastricht University Medical Center+, Maastricht, Netherlands
⁴Maastricht University Medical Center+, Maastricht, Netherlands

For efficient utilization of operating rooms (ORs), accurate schedules of assigned block time and sequences of patient cases need to be made. The quality of these planning tools is dependent on the accurate prediction of total procedure time (TPT) per case. In this paper, we attempt to improve the accuracy of TPT predictions by using linear regression models based on estimated surgeon-controlled time (eSCT) and other variables relevant to TPT. We extracted data from a Dutch benchmarking database of all surgeries performed in six academic hospitals in The Netherlands from 2012 till 2016. The final dataset consisted of 79,983 records, describing 199,772 h of total OR time. Potential predictors of TPT that were included in the subsequent analysis were eSCT, patient age, type of operation, American Society of Anesthesiologists (ASA) physical status classification, and type of anesthesia used. First, we computed the predicted TPT based on a previously described fixed ratio model for each record, multiplying eSCT by 1.33. This number is based on the research performed by van Veen-Berkx et al., which showed that 33% of SCT is generally a good approximation of anesthesia-controlled time (ACT). We then systematically tested all possible linear regression models to predict TPT using eSCT in combination with the other available independent variables. In addition, all regression models were again tested without eSCT as a predictor to predict ACT separately (which leads to TPT by adding SCT). TPT was most accurately predicted using a linear regression model based on the independent variables eSCT, type of operation, ASA classification, and type of anesthesia. This model performed significantly better than the fixed ratio model and the method of predicting ACT separately. Making use of these more accurate predictions in planning and sequencing algorithms may enable an increase in utilization of ORs, leading to significant financial and productivity related benefits.

Introduction

Operating rooms (ORs) are among the most valuable hospital assets, generating a large part of hospital revenue. Revenue per OR hour varies per procedure, but is estimated to be between $1,000 and $2,000 on average, before subtracting the variable costs of personnel and supplies related to hospitalization (1). This makes efficient utilization of ORs paramount: every minute wasted may cause a significant loss of revenue. For efficient utilization of ORs, accurate schedules of assigned block time and sequences of patient cases need to be made.

The quality of these planning tools is dependent on the accurate prediction of total procedure time (TPT; abbreviations are described in Table 1) per case. TPT consists of anesthesia-controlled time (ACT, itself consisting of the induction and emergence phases) and surgeon-controlled time (SCT, being the duration of the actual operation, including patient positioning and draping). ACT is included because in Dutch academic hospitals, the induction and emergence phases always take place in the OR, making them relevant to OR utilization.

Table 1. Descriptions of abbreviations used.

Predicted TPTs are used to plan up to a desired level of utilization of the OR complex. Sequencing patient cases based on predicted TPT can help minimize the probability of underutilization of the OR and cancelation of procedures. Previous research has shown that using a fixed ratio to calculate TPT from SCT as estimated prior to an operation [estimated surgeon-controlled time (eSCT)] provides more accurate estimates than adding a fixed duration for ACT to eSCT to compute TPT (2). In this paper, we attempt to improve the accuracy of TPT predictions further by including patient and surgery characteristics relevant to TPT.

Materials and Methods

We extracted data from a Dutch benchmarking database of all surgeries performed in all eight academic hospitals in The Netherlands from 2012 till 2016. Written informed consent from the patients was not required, because no individual patient data were included. The data contributed by two of these hospitals were excluded, because they only contained observed and subsequently recorded SCT instead of the initially estimated SCT. The other records also did not contain eSCT, but did describe estimated TPT. We used this to approximate eSCT by subtracting 20 min, which is the default time allocated to ACT in many Dutch hospitals. Unfortunately, it was not feasible to accurately discover the exact time attributed to anesthesia for each operation in each hospital. Subtracting 20 min gives us approximate eSCTs that are sufficient for testing the methods described in this paper.
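The approximation of eSCT described above amounts to a single subtraction. The following is a minimal sketch of it, not the authors' code. The function name and the choice to return `None` for non-positive results are assumptions made for illustration.

```python
DEFAULT_ACT_MIN = 20  # default ACT allowance (minutes) named in the text


def approximate_esct(estimated_tpt_min, default_act_min=DEFAULT_ACT_MIN):
    """Approximate eSCT (minutes) by removing the default ACT allowance
    from the estimated TPT."""
    esct = estimated_tpt_min - default_act_min
    # Assumption: a non-positive eSCT is implausible, so it is returned as
    # None and can be treated as missing later on.
    return esct if esct > 0 else None


print(approximate_esct(95))  # 75
print(approximate_esct(15))  # None
```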

Potential predictors of TPT that were included in the subsequent analysis were eSCT, patient age, type of operation (identified by unique codes as registered by the hospitals), American Society of Anesthesiologists (ASA) physical status classification, and type of anesthesia used (again identified by hospital supplied codes). Other database fields described observed TPT, anesthesia induction time, and anesthesia emergence time. Observed ACT was calculated by adding up induction and emergence durations. Only records describing elective surgery were included, because emergency surgery does not receive an estimated TPT/SCT.

Data analysis and statistical calculations were performed in R version 3.3.1. Implausible or impossible data values, such as a 0 for observed TPT, were marked as missing data. As we suspected missing data in the database to have occurred completely at random, we omitted incomplete records from the analysis. The final dataset consisted of 79,983 records, describing 199,772 h of total OR time. The distribution of the characteristics within this dataset is shown in Tables 2 and 3. The data were split into a training set with records from the years 2012 till 2015 and a test set from 2016.
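The preparation steps in this paragraph (marking implausible values as missing, dropping incomplete records, and splitting by year) might look roughly as follows. The analysis in the paper was done in R; this Python sketch uses hypothetical field names and toy records, and it assumes that any non-positive duration counts as implausible.

```python
DURATION_FIELDS = ("esct", "tpt", "induction", "emergence")
REQUIRED_FIELDS = DURATION_FIELDS + ("year", "operation", "asa", "anesthesia")


def mark_implausible_as_missing(record):
    """Return a copy in which non-positive durations are replaced by None."""
    cleaned = dict(record)
    for field in DURATION_FIELDS:
        value = cleaned.get(field)
        if value is not None and value <= 0:
            cleaned[field] = None
    return cleaned


def is_complete(record):
    """True when every required field holds a value."""
    return all(record.get(field) is not None for field in REQUIRED_FIELDS)


def prepare(records, test_year=2016):
    """Clean the records, keep complete cases, and split them by year."""
    cleaned = [mark_implausible_as_missing(r) for r in records]
    complete = [r for r in cleaned if is_complete(r)]
    train = [r for r in complete if r["year"] < test_year]
    test = [r for r in complete if r["year"] == test_year]
    return train, test


# Toy records, invented purely for illustration.
base = {"esct": 60, "tpt": 85, "induction": 15, "emergence": 10,
        "operation": "A", "asa": 2, "anesthesia": "general"}
sample = [
    dict(base, year=2014),
    dict(base, year=2015, tpt=0),     # implausible, so dropped
    dict(base, year=2016),
    dict(base, year=2016, asa=None),  # incomplete, so dropped
]
train, test = prepare(sample)
print(len(train), len(test))  # 1 1
```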

Table 2. Distribution of characteristics in the dataset.

Table 3. Miscellaneous descriptive statistics about the dataset used.

An often used rule-of-thumb states the need for at least 10 records for each potential predictor of TPT to be included in the model. Recent research suggests the actual number may be even lower (3). Considering that the dataset used for our analysis contained nearly 80,000 records, we had ample precision to test all potential predictors and interactions.

First, we computed for each record the predicted TPT based on the fixed ratio model described by van Veen-Berkx et al. (2). For each patient, the eSCT was multiplied by 1.33. This number is based on the research performed by van Veen-Berkx et al., which showed that 33% of SCT is generally a good approximation of ACT. Using both predicted and observed TPT, we computed the mean absolute error (MAE), the mean squared error (MSE), and model fit expressed as the adjusted R-squared of the model. The adjusted R-squared can be interpreted as the proportion of variance in TPT that can be explained by parameters in the model.
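As an illustration of the fixed ratio model and the two error measures, the sketch below multiplies eSCT by 1.33 and computes MAE and MSE against observed TPT. The function names and the four data points are invented for the example and do not come from the benchmarking database.

```python
def fixed_ratio_tpt(esct_min, ratio=1.33):
    """Predicted TPT (minutes) under the fixed ratio model."""
    return ratio * esct_min


def mean_absolute_error(observed, predicted):
    """Mean of the absolute prediction errors, in minutes."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def mean_squared_error(observed, predicted):
    """Mean of the squared prediction errors, in squared minutes."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


# Toy values in minutes, invented for illustration.
esct = [60, 90, 120, 45]
observed_tpt = [85, 115, 170, 60]

predicted_tpt = [fixed_ratio_tpt(e) for e in esct]
print(round(mean_absolute_error(observed_tpt, predicted_tpt), 2))
print(round(mean_squared_error(observed_tpt, predicted_tpt), 2))
```

Because the MSE squares each error, a few badly mispredicted cases weigh more heavily in it than in the MAE, which is why the two measures are reported side by side.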

All linear regression models were created using the 2012–2015 data and then validated on both this set and the 2016 set. This enabled us to separately measure the performance of the models on new data and compare this to their performance on the training data.

We used the p-value of each variable and the adjusted R-squared values to test all possible linear regression models to predict TPT using eSCT in combination with the other available independent variables.
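A search over all predictor subsets, ranked by adjusted R-squared, can be sketched as below. This is a sketch under stated assumptions rather than the authors' R code: it uses only the Python standard library, solves the normal equations directly, dummy-codes all three factors (including ASA class), and runs on synthetic data generated inside the block. All names and effect sizes are hypothetical.

```python
import itertools
import random

NUMERIC = {"esct", "age"}  # continuous predictors; all others are dummy-coded


def design_matrix(records, predictors):
    """Intercept, numeric columns, and dummy columns (first level dropped)."""
    columns = [[1.0] * len(records)]
    for name in predictors:
        if name in NUMERIC:
            columns.append([float(r[name]) for r in records])
        else:
            levels = sorted({r[name] for r in records})
            for level in levels[1:]:
                columns.append([1.0 if r[name] == level else 0.0 for r in records])
    return [list(row) for row in zip(*columns)]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_ols(x, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    p = len(x[0])
    xtx = [[sum(row[i] * row[j] for row in x) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(x, y)) for i in range(p)]
    return solve(xtx, xty)


def adjusted_r2(x, y, beta):
    """1 - (SS_res / SS_tot) * (n - 1) / (n - p - 1), p excluding the intercept."""
    n, p = len(y), len(beta) - 1
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in x]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - (ss_res / ss_tot) * (n - 1) / (n - p - 1)


def rank_models(records, candidates, target="tpt", always=("esct",)):
    """Fit every subset of candidates (plus the fixed predictors) and rank them."""
    y = [float(r[target]) for r in records]
    results = []
    for k in range(len(candidates) + 1):
        for extra in itertools.combinations(candidates, k):
            predictors = tuple(always) + extra
            x = design_matrix(records, predictors)
            results.append((adjusted_r2(x, y, fit_ols(x, y)), predictors))
    return sorted(results, reverse=True)


# Synthetic data: TPT depends on everything except age (hypothetical effects).
random.seed(0)
op_effect = {"A": 0.0, "B": 25.0, "C": 60.0}
an_effect = {"general": 15.0, "regional": 0.0}
records = []
for _ in range(300):
    r = {"esct": random.uniform(30, 240), "age": random.randint(18, 90),
         "operation": random.choice("ABC"), "asa": random.choice([1, 2, 3]),
         "anesthesia": random.choice(["general", "regional"])}
    r["tpt"] = (20 + 1.1 * r["esct"] + op_effect[r["operation"]] + 5 * r["asa"]
                + an_effect[r["anesthesia"]] + random.gauss(0, 10))
    records.append(r)

for score, predictors in rank_models(records, ["age", "operation", "asa", "anesthesia"])[:3]:
    print(round(score, 4), predictors)
```

With four optional predictors there are only 16 candidate models, so exhaustive enumeration is cheap. With real operation codes, each factor can contribute many dummy columns, and a dedicated statistics package would be the more practical choice.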

As an additional alternative, all regression models were again tested without eSCT as a predictor to predict ACT separately (which leads to TPT by adding SCT). This allowed us to compare our findings with various previous attempts to predict ACT (4, 5).
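The indirect route, predicting ACT first and then adding SCT, can be sketched in a few lines. To keep the example short, ACT is predicted here from a single factor, in which case a least-squares regression reduces to the mean ACT per level. That simplification, the use of eSCT as the SCT term at planning time, and all names and numbers below are assumptions for illustration only.

```python
from collections import defaultdict


def fit_group_means(records, factor, target="act"):
    """Mean of the target per factor level (least-squares fit for one factor)."""
    totals = defaultdict(lambda: [0.0, 0])
    for r in records:
        totals[r[factor]][0] += r[target]
        totals[r[factor]][1] += 1
    return {level: total / count for level, (total, count) in totals.items()}


def predict_tpt_via_act(record, act_by_level, factor):
    """Predicted TPT = SCT estimate + predicted ACT, both in minutes."""
    return record["esct"] + act_by_level[record[factor]]


# Toy training records, invented for illustration.
train = [
    {"anesthesia": "general", "act": 40.0},
    {"anesthesia": "general", "act": 50.0},
    {"anesthesia": "regional", "act": 30.0},
]
act_means = fit_group_means(train, "anesthesia")
new_case = {"esct": 90.0, "anesthesia": "general"}
print(predict_tpt_via_act(new_case, act_means, "anesthesia"))  # 135.0
```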

Finally, to test for any possible influence that the omission of the incomplete records might have had on our results, we reran the analyses after imputation of the missing data. Linear regression was used to impute the numeric variables and a proportional odds model to impute the ordered variable describing ASA classification. The type of anesthesia used and the type of surgery performed could not be imputed, due to the large number of categories.
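Regression-based imputation of a numeric variable can be illustrated with a single predictor. The sketch below fits a straight line on the complete cases and fills in the missing values from it. The choice of variables, the one-predictor form, and the data are assumptions; the imputation model used in the study is not specified in more detail, and the proportional odds model for ASA class is not sketched here.

```python
def fit_simple_regression(xs, ys):
    """Intercept and slope of the least-squares line y = a + b * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def impute_numeric(records, target, predictor):
    """Fill missing target values with predictions fitted on complete cases."""
    complete = [r for r in records if r[target] is not None]
    intercept, slope = fit_simple_regression(
        [r[predictor] for r in complete], [r[target] for r in complete])
    imputed = []
    for r in records:
        if r[target] is None:
            r = dict(r, **{target: intercept + slope * r[predictor]})
        imputed.append(r)
    return imputed


# Toy records (minutes), invented for illustration.
toy = [
    {"esct": 60.0, "tpt": 80.0},
    {"esct": 120.0, "tpt": 150.0},
    {"esct": 90.0, "tpt": None},
]
filled = impute_numeric(toy, target="tpt", predictor="esct")
print(round(filled[2]["tpt"], 1))  # 115.0
```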

Results

Using the fixed ratio model, the MAE of the 2012–2015 predictions was 39.5 min with an MSE of 3,859.6 min². For the 2016 predictions, the MAE was 38.5 min with an MSE of 3,275.9 min².

Owing in part to the size of the dataset, all variables in the linear regression models were highly significant predictors (p < 0.01), except for some levels of the factor variables for type of anesthesia and type of operation. These levels were nevertheless retained in the model, since the overall effect of each factor variable was significant. Ultimately, the best model was identified as the point at which adding further predictors yielded only minimal improvement in the adjusted R-squared.

Of all models tested, TPT was most accurately predicted using a linear regression model based on all available independent variables. However, as can be seen in Tables 4 and 5, including patient age in the model did not significantly improve the goodness-of-fit, so we only retained the variables eSCT, type of operation, ASA classification, and type of anesthesia. Using this best model, the MAE of the 2012–2015 predictions was 29.2 min with an MSE of 2,320.7 min². For the 2016 predictions, the MAE was 31.3 min with an MSE of 2,366.9 min². The adjusted R-squared of this model was 0.8498.

Table 4. Goodness-of-fit of the linear regression models for predicting total procedure time ranked by best adjusted R-squared value.

Table 5. Goodness-of-fit of the linear regression models for predicting anesthesia-controlled time (ACT), ranked by best adjusted R-squared value.

Similarly, ACT was most accurately predicted by all independent variables, but with very little improvement from adding patient age. The final model, based on the type of operation, ASA classification, and type of anesthesia, did not perform better than the direct prediction of TPT: the MAE of the 2012–2015 predictions was 34.7 min with an MSE of 3,269.7 min², and the MAE of the 2016 predictions was 34.2 min with an MSE of 2,878.7 min². The adjusted R-squared was 0.6314.

These main outcomes are summarized in Table 6. Figure 1 displays plots of the predicted versus the actual TPTs for these three models.

Table 6. Performance of fixed ratio model and best performing linear regression models.

Figure 1. Plots of the predicted versus the actual total procedure times (TPTs) for the fixed ratio model and the two best linear regression models for predicting TPT and anesthesia-controlled time (ACT).

After imputation of missing data in the initial dataset instead of elimination of incomplete records, all results were practically the same.

Discussion

The improvement in TPT prediction of the best performing linear regression model over the fixed ratio model was convincing. On the training data, the MAE was reduced by about a quarter of its original value (from 39.5 to 29.2 min) and the MSE by about 40% (from 3,859.6 to 2,320.7 min²). This indicates that the variation in prediction errors was substantially reduced. As is to be expected, this effect was somewhat less pronounced on the 2016 testing data, but still very useful.
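The size of the improvement can be recomputed from the MAE and MSE values reported in the Results. The short calculation below does so for both periods; only the helper name and the dictionary layout are introduced here.

```python
def relative_reduction(baseline, improved):
    """Fraction by which an error measure fell relative to the baseline."""
    return (baseline - improved) / baseline


# (fixed ratio model, best regression model), values as reported in the Results.
reported = {
    "2012-2015": {"MAE": (39.5, 29.2), "MSE": (3859.6, 2320.7)},
    "2016": {"MAE": (38.5, 31.3), "MSE": (3275.9, 2366.9)},
}

for period, metrics in reported.items():
    for name, (fixed, regression) in metrics.items():
        print(period, name, f"{relative_reduction(fixed, regression):.1%}")
# 2012-2015 MAE 26.1%
# 2012-2015 MSE 39.9%
# 2016 MAE 18.7%
# 2016 MSE 27.7%
```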

Making use of these more accurate predictions may help prevent the typical consequences of under- and overestimation. Underestimation can lead to costly overtime or even the cancelation of operations, while overestimation can lead to downtime of both the operating theater and its staff. For the hospital with the highest number of complete records in our dataset, totaling all the under- and overestimation of the included operations from 2016 results in a total overestimation of 3,118 h. Had they made use of a model as described in this paper (based on their own data), the total result would have been an overestimation of only 179 h. Depending on the way these hours would have been distributed in the scheduling, they may have led to additional operations being performed.
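One way to arrive at such a total is to sum the signed differences between planned and observed durations. The sketch below takes that reading, which is an assumption about how the hours were totaled, and uses invented numbers.

```python
def net_estimation_hours(planned_min, observed_min):
    """Signed total of planned minus observed time, converted to hours.

    Positive values mean net overestimation, negative values net
    underestimation.
    """
    return sum(p - o for p, o in zip(planned_min, observed_min)) / 60.0


# Toy durations in minutes, invented for illustration.
planned = [120, 90, 60]
observed = [100, 95, 45]
print(net_estimation_hours(planned, observed))  # 0.5
```

Because over- and underestimates cancel in a signed total, a small net figure does not by itself imply that individual cases were predicted well. The MAE remains the better guide to per-case accuracy.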

The accuracy of predicted durations of surgery also directly influences the confidence with which planners might increase the level of utilization of ORs. Planning for higher utilization is only possible with more certainty about case duration, but can offer significant financial and productivity related benefits.

A second important finding is that separate ACT prediction (using the same available variables but without eSCT) yields worse results than direct TPT prediction.

The fact that TPT is the result of ACT and SCT is demonstrated by the best performing model. This model is based on eSCT, type of operation, and the two most important anesthesiologic variables: ASA classification and type of anesthesia used. This means predictions are possible using a limited number of easily obtainable values. Even though our model is intended for use by a computer system, keeping the model simple by requiring fewer inputs improves its usability, understandability, and speed.

The fact that the regression models were calculated and tested using surgeons’ actual pre-surgery estimations of SCT instead of recorded, historical SCTs lends additional credibility to our results. In actual planning practice, predictions will similarly need to be based on estimated SCT. Therefore, the performance of the models as described in our results should match real-world performance, as opposed to a likely positive bias when based on historical data. This is especially true for the performance on the 2016 data, which the model was not trained on. While performing our research, it became apparent that the predictions of the 2016 TPTs became increasingly accurate as our collection of training data grew. This suggests that the method described in this paper holds potential for improved performance when applied to even larger datasets, as are becoming increasingly available to health-care data analysts. Additionally, further improvement may be achieved by tailoring the analyses to local circumstances. It is possible to prepare custom models for the level of individual hospitals, departments, types of operations, or even surgeons.

Summarizing the above, we encourage hospital data analysts and surgical managers to create similar models to those described in this paper using as much of their own historical data as possible. The method described is relatively straightforward and might provide them with more accurate procedure time predictions than current practices.

A limitation of this study was that the data used were recorded in academic centers only. The applicability to typical OR schedules in regional hospitals has not been studied. In addition, we have averaged all suitable data available from these academic centers under the assumption that there were no major differences between these centers that might significantly alter the TPT.

The manual registration of the timestamps and semi-manual process of aggregating the other data has two important weaknesses. First, it most probably resulted in inaccuracies of the data, possibly leaning toward late recording of the key moments during the operations. Second, there was a surprising amount of missing data at analysis. Of the records we started with, only ca. 21% contained complete and plausible data in all required fields, making the rest unsuitable for analysis. The fact that the results after imputation of the missing data were very similar to those of our initial analyses indicates that eliminating the incomplete records had limited influence on the outcomes as described.

Both issues underline the importance of the implementation of automatic registration systems that integrate into the work processes in the OR to collect more and better data. Only then will the results of analysis of this data be taken to a higher level, allowing for robust conclusions with operational consequences.

A final important remark is that, despite the new model generally performing well over the long term, a relatively high interindividual variability still exists. This could limit the usefulness of its predictions in day-to-day planning.

Conclusion

A linear regression model to predict TPT based on eSCT, type of operation, patient ASA classification, and anesthesia type outperforms the current practices of using a standard duration for ACT or a fixed ratio between eSCT and TPT. A second conclusion is that predicting TPT through the separate prediction of ACT yields less accurate results than direct prediction of TPT.

Author Contributions

EE performed the analyses based on advice by SK and drafted the original manuscript. WB provided direct supervision during the entire project. AH, MK, and WB made valuable contributions to the anesthesiological aspects of the research performed and contributed to the article contents. GM independently performed the statistical analyses a second time to confirm the outcomes. He also provided additional advice and feedback on the methods and their textual descriptions.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

1. Macario A, Dexter F, Traub RD. Hospital profitability per hour of operating room time can vary among surgeons. Anesth Analg (2001) 93(3):669–75. doi:10.1097/00000539-200109000-00028

2. van Veen-Berkx E, Bitter J, Elkhuizen SG, Buhre WF, Kalkman CJ, Gooszen HG, et al. The influence of anesthesia-controlled time on operating room scheduling in Dutch university medical centres. Can J Anesth (2014) 61(6):524–32. doi:10.1007/s12630-014-0134-9

3. Austin PC, Steyerberg EW. The number of subjects per variable required in linear regression analyses. J Clin Epidemiol (2015) 68(6):627–36. doi:10.1016/j.jclinepi.2014.12.014

4. Dexter F, Yue JC, Dow AJ. Predicting anesthesia times for diagnostic and interventional radiological procedures. Anesth Analg (2006) 102(5):1491–500. doi:10.1213/01.ane.0000202397.90361.1b

5. Silber JH, Rosenbaum PR, Zhang X, Even-Shoshan O. Influence of patient and hospital characteristics on anesthesia time in Medicare patients undergoing general and orthopedic surgery. Anesthesiology (2007) 106(2):356–64. doi:10.1097/00000542-200702000-00025

Keywords: operating room utilization, procedure time, regression, prediction, anesthesia time, surgeon time, surgical time

Citation: Edelman ER, van Kuijk SMJ, Hamaekers AEW, de Korte MJM, van Merode GG and Buhre WFFA (2017) Improving the Prediction of Total Surgical Procedure Time Using Linear Regression Modeling. Front. Med. 4:85. doi: 10.3389/fmed.2017.00085

Received: 31 January 2017; Accepted: 06 June 2017;
Published: 19 June 2017

Edited by:

Joachim Paul Hasebrook, Steinbeis University Berlin, Germany

Reviewed by:

Rene Waurick, University Hospital Muenster, Germany
Gereon Schälte, Uniklinik RWTH Aachen, Germany

Copyright: © 2017 Edelman, van Kuijk, Hamaekers, de Korte, van Merode and Buhre. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Eric R. Edelman, eric.edelman@mumc.nl