Assessing the length of hospital stay for patients with myasthenia gravis based on the data mining MARS approach

Chang, Che-Cheng; Yeh, Jiann-Horng; Chiu, Hou-Chang; Liu, Tzu-Chi; Chen, Yen-Ming; Jhou, Mao-Jhen; Lu, Chi-Jie

doi:10.3389/fneur.2023.1283214

ORIGINAL RESEARCH article

Front. Neurol., 14 December 2023

Sec. Neuromuscular Disorders and Peripheral Neuropathies

Volume 14 - 2023 | https://doi.org/10.3389/fneur.2023.1283214

Che-Cheng Chang¹,²

Jiann-Horng Yeh³,⁴,⁵

Hou-Chang Chiu³,⁶

Tzu-Chi Liu⁷

Yen-Ming Chen¹

Mao-Jhen Jhou⁷

Chi-Jie Lu⁷,⁸,⁹*

¹Department of Neurology, Fu Jen Catholic University Hospital, Fu Jen Catholic University, New Taipei City, Taiwan
²PhD Program in Nutrition and Food Science, Fu Jen Catholic University, New Taipei City, Taiwan
³School of Medicine, Fu Jen Catholic University, New Taipei City, Taiwan
⁴Department of Neurology, Shin Kong Wu Ho-Su Memorial Hospital, Taipei City, Taiwan
⁵Department of Neurology, Kaohsiung Medical University, Kaohsiung, Taiwan
⁶Department of Neurology, Taipei Medical University, Shuang-Ho Hospital, New Taipei City, Taiwan
⁷Graduate Institute of Business Administration, Fu Jen Catholic University, New Taipei City, Taiwan
⁸Artificial Intelligence Development Center, Fu Jen Catholic University, New Taipei City, Taiwan
⁹Department of Information Management, Fu Jen Catholic University, New Taipei City, Taiwan

Predicting the length of hospital stay for myasthenia gravis (MG) patients is challenging due to the complex pathogenesis, high clinical variability, and non-linear relationships between variables. Considering the management of MG during hospitalization, it is important to conduct a risk assessment to predict the length of hospital stay. The present study aimed to successfully predict the length of hospital stay for MG based on an expandable data mining technique, multivariate adaptive regression splines (MARS). Data from 196 MG patients' hospitalization were analyzed, and the MARS model was compared with classical multiple linear regression (MLR) and three other machine learning (ML) algorithms. The average hospital stay duration was 12.3 days. The MARS model, leveraging its ability to capture non-linearity, identified four significant factors: disease duration, age at admission, MGFA clinical classification, and daily prednisolone dose. Cut-off points and correlation curves were determined for these risk factors. The MARS model outperformed the MLR and the other ML methods (including least absolute shrinkage and selection operator MLR, classification and regression tree, and random forest) in assessing hospital stay length. This is the first study to utilize data mining methods to explore factors influencing hospital stay in patients with MG. The results highlight the effectiveness of the MARS model in identifying the cut-off points and correlation for risk factors associated with MG hospitalization. Furthermore, a MARS-based formula was developed as a practical tool to assist in the measurement of hospital stay, which can be feasibly supported as an extension of clinical risk assessment.

1 Introduction

Myasthenia gravis (MG) is a neuromuscular junction disorder in which antibodies attack post-synaptic proteins, causing muscle weakness and fatigue during repeated muscle contraction (1). MG is an uncommon disease that affects 15–25 people per 100,000 (2). Complication rates of MG have improved under good management, and the treatment of MG has been well documented (3). However, the relapse rate is still variable and the severity varies from person to person; approximately 38% of patients with MG achieve remission, and 10% are resistant to traditional immunotherapy and need hospitalization (4). Even with the use of multiple drugs, some patients have poor symptom control and occasionally require repeated or prolonged hospitalization (4). It is currently difficult to predict who will require a longer hospital stay, or to estimate the length of hospitalization, because of complex clinical variability.

Most previous studies investigating predictors of prognosis and risk factors for MG symptom deterioration have been based on linear or logistic regression (5–9). Multiple linear regression (MLR) is a classic method used in many medical studies (10). However, MLR has limitations when the data contain non-linear variables. Using traditional methods for risk prediction and outcomes measurement in autoimmune diseases (including MG) is difficult because of the long-term course and complex phenotype. In addition, traditional methods cannot solve the problem of collinearity or non-linear relationships between variables. Recently, data mining methods have been used as alternatives to traditional statistical methods in medical research (11–13). They can process different types of input data that fill a gap in learning from clinical experience with computers capable of recognizing disease patterns and detecting disease features (14–18). Machine learning (ML) is one of the data mining tools that can provide computers with the ability to learn from experience (19–22). Multivariate adaptive regression splines (MARS), which is a data mining method, is a non-linear and non-parametric regression algorithm that does not require the specification of a functional form in advance (23). The MARS method can use a series of piecewise regression splines to process the unknown functional form which makes it appropriate for modeling complex non-linear relationships (24, 25).

Currently, ML algorithms are widely applied in medicine because they can effectively extract potentially useful information from datasets (26–29). However, these methods are not broadly used in the clinical evaluation of MG. The present study aims to investigate the relationship between risk factors and the length of hospital stay based on MARS. With MARS, we developed a decision process for screening clinical factors associated with the length of hospital stay and constructed an explainable prediction model. Our findings suggest that the MARS model can help to identify cut-off points for risk factors associated with MG hospitalization. Furthermore, a MARS-based formula was designed as an assisting tool to help with the measurement of hospital stay.

2 Materials and methods

2.1 Participant and study design

This retrospective study was based on 513 hospital admissions of patients with MG at the Shin Kong Wu Ho-Su Memorial Hospital in Taiwan between December 2015 and October 2018. Patients admitted for MG symptom deterioration or for MG-related management, including thymectomy or immunotherapy, were considered for enrollment. Furthermore, we treated as outliers those admissions whose length of stay was more than three standard deviations (SD) above the mean length of stay in the raw data before filtration; on this basis, we excluded four hospital admissions lasting more than 80 days. Figure 1 shows the detailed case identification process. After filtration, a total of 196 patients were included in the analysis. The study protocol was evaluated and deemed acceptable by the Research Ethics Review Committee of the Shin Kong Wu Ho-Su Memorial Hospital (No. 20190109R). All of our methods were carried out in accordance with relevant guidelines and regulations.
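
The three-SD exclusion rule described above can be illustrated with a minimal Python sketch. This is not the authors' code: the function name, the use of the sample SD, the reading of the rule as "mean plus three SD", and the example values are all assumptions made for illustration.

```python
from statistics import mean, stdev

def exclude_long_stay_outliers(stays, n_sd=3.0):
    """Drop stays more than n_sd sample SDs above the mean of the raw data."""
    # Assumption: the cutoff is computed once, on the unfiltered data.
    cutoff = mean(stays) + n_sd * stdev(stays)
    kept = [s for s in stays if s <= cutoff]
    return kept, cutoff

# Hypothetical lengths of stay in days (not the study data).
example_stays = [5, 7, 8, 9, 10, 10, 11, 12, 12, 13,
                 14, 15, 9, 8, 11, 10, 12, 7, 13, 120]
kept, cutoff = exclude_long_stay_outliers(example_stays)
print(f"cutoff = {cutoff:.1f} days, kept {len(kept)} of {len(example_stays)} stays")
```

Because the cutoff depends on the raw mean and SD, a single extreme stay also inflates the threshold itself, which is one reason the rule is applied to the data before any other filtering.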

Figure 1. Overall flowchart of the participant enrollment process.

The data of all patients were collected through a review of their admission medical records, and the detailed characteristics are shown in Table 1. Disease severity was graded according to the Myasthenia Gravis Foundation of America (MGFA) classification based on previous reviews that reported the clinical severity of the patient upon admission (30). A total of 19 clinical variables that may affect the length of hospital stay in patients with MG were assessed (6, 31, 32). Among them, average length of hospital stay was the target variable whereas the rest of the 18 variables were the predictor variables.

Table 1. Subject demographics.

Disease duration was defined as the time from disease onset to the first visit after enrollment. The oral steroid dose before admission was recorded as the maximum dosage in the 1 month before hospitalization. Treatments during hospitalization, including plasmapheresis (PP), intravenous corticosteroid (IC), immunoglobulin (IVIG), and rituximab (RTX) administration, were recorded. The serological status of MG autoantibodies was recorded as antibody against AChR, antibody against muscle-specific tyrosine kinase (MuSK), or double seronegative. We averaged the number of days spent by the same patient during different hospital stays, defined as the “average length of hospital stay.”

2.2 ML model of multivariate adaptive regression splines

MARS is a flexible procedure, introduced by Friedman, for modeling non-linear relationships and variable interactions (23). It can estimate non-linear data relationships by approximating them with separate linear regression slopes in distinct intervals of the independent variable space (23–25). These lines, also known as splines, are piecewise linear segments that best describe the data, whereas the points where the segments join are the knots. Each knot indicates an optimal cutting point of a variable in the data. Furthermore, the combination of a spline and its corresponding knot is known as a hinge function, which takes the form max(0, variable − knot) or max(0, knot − variable). The hinge functions that describe a variable, together with their corresponding coefficients, are known as basis functions (BFs). Because each variable may have one or more BFs, all BFs of a variable should be considered together when interpreting its effect (23–25).
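
As a minimal illustration of the two hinge forms, the Python sketch below evaluates max(0, variable − knot) and max(0, knot − variable). The function names are hypothetical, and the knot of 41 is borrowed from the age-at-admission result in Section 3.4 purely as an example value.

```python
def hinge_right(value, knot):
    """max(0, value - knot): non-zero only above the knot."""
    return max(0.0, value - knot)

def hinge_left(value, knot):
    """max(0, knot - value): non-zero only below the knot."""
    return max(0.0, knot - value)

AGE_KNOT = 41  # example knot, taken from the age-at-admission result

for age in (30, 41, 50):
    print(age, hinge_right(age, AGE_KNOT), hinge_left(age, AGE_KNOT))
```

Each hinge is zero on one side of the knot and linear on the other, so a weighted sum of hinges gives a piecewise linear curve whose slope can change only at the knots.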

The building procedure for MARS involves several key steps. First, MARS generates candidate hinge functions to capture non-linear relationships in multivariate data. Second, to select the hinge functions that form the most suitable BFs, MARS carries out a forward pass, which iteratively adds the BFs that most reduce the prediction error. A backward pass is then conducted to simplify the model by removing the BFs that contribute least to the predictions. During this procedure, generalized cross-validation (a form of regularization that trades off goodness-of-fit against model complexity) is commonly utilized as the model selection criterion. The procedure stops when further additions or eliminations of BFs do not significantly improve model performance. The final MARS equation is then composed of the BFs of each selected variable.
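
To make the trade-off behind generalized cross-validation (GCV) concrete, the sketch below computes one commonly used form of the criterion, in which the residual sum of squares is inflated by an effective number of parameters that grows with the number of model terms. The exact form and penalty used in the study are not stated in the text, so the formula variant, the penalty value of 2, and all input numbers here are assumptions.

```python
def gcv(rss, n_obs, n_terms, penalty=2.0):
    """GCV = RSS / (N * (1 - ENP / N)^2), where ENP is the effective
    number of parameters (assumed form: terms plus a per-knot penalty)."""
    enp = n_terms + penalty * (n_terms - 1) / 2.0
    if enp >= n_obs:
        raise ValueError("effective number of parameters must be below n_obs")
    return rss / (n_obs * (1.0 - enp / n_obs) ** 2)

# Hypothetical values: the same residual error with more or fewer terms.
print(gcv(rss=1000.0, n_obs=156, n_terms=8))
print(gcv(rss=1000.0, n_obs=156, n_terms=4))
```

For a fixed residual error, the model with more terms receives the larger GCV, which is why the backward pass keeps a BF only if it reduces the error enough to offset the added complexity.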

The benefit of the MARS algorithm is that the estimated knots of important independent variables can provide useful information of the relationships between independent variables and the dependent variable, which helps to learn how non-linear features affect the target and select the important features. Thus, many studies from the clinical field utilize MARS because of the strengths that MARS can provide (33–37).

2.3 Data preprocessing of MARS model

The experiment was performed using the “R” software (version 4.1.2) (38) in RStudio (version 1.1.453) (39); MLR was implemented with the “stats” package (version 4.1.2) (38); MARS was implemented with the “earth” package (version 5.3.1) (40); Lasso MLR was constructed with the “glmnet” package (version 4.1-4) (41). For comparison purposes, classification and regression tree (CART) and random forest (RF) models were also built. CART was implemented with the “rpart” package (version 4.1.16) (42), and RF was built with the “randomForest” package (version 4.7-1.1) (43). In the modeling process, we randomly assigned 80% of the dataset to a training dataset and the remaining 20% to a testing dataset. In the training dataset, a 10-fold cross-validation method was utilized for hyper-parameter tuning with the aid of the “caret” package (version 6.0-92) (44). In 10-fold cross-validation, the training set was randomly and equally divided into 10 folds (10% for each fold). Nine folds (90% of the training dataset) were then used for training the model and the remaining fold (10% of the training dataset) was used for validating the model. This process was repeated until each fold had been used for validation once. Finally, after the best hyper-parameter set was found, the trained model was evaluated on the testing data. The modeling process was repeated 10 times in our study; we then compared the results to determine the best-performing MARS model and derived the equation from the selected one.
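
The splitting scheme described above can be sketched as follows. The study performed these steps in R with the “caret” package; the Python below is only an illustration of the 80/20 split and the 10-fold rotation, with hypothetical function names, an arbitrary seed, and a rounding rule for the test-set size that is an assumption.

```python
import random

def train_test_split_indices(n, test_fraction=0.2, seed=0):
    """Shuffle row indices and hold out a fraction as the testing set."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_test = round(n * test_fraction)  # assumption: round to nearest integer
    return indices[n_test:], indices[:n_test]

def k_fold_indices(train_indices, k=10, seed=0):
    """Yield (training, validation) index lists; each fold validates once."""
    shuffled = train_indices[:]
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield training, validation

train_idx, test_idx = train_test_split_indices(196)
splits = list(k_fold_indices(train_idx))
print(len(train_idx), len(test_idx), len(splits))
```

The testing indices are never seen during cross-validation, so hyper-parameters are tuned only on the training portion.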

For the performance evaluation, three metrics were used: mean absolute percentage error (MAPE), symmetric mean absolute percentage error (SMAPE), and relative absolute error (RAE). Using multiple metrics for evaluation ensures that the performance of the model is stable. These metrics measure the prediction error of model output and the model with the smallest error values is the best in terms of performance. MAPE and RAE were generated using the “MLmetrics” package (version 1.1.1) (45); SMAPE was generated using the “Metrics” package (version 0.1.4) (46). The described modeling process in this section is presented as a framework and shown in Figure 2.
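
For reference, the three error metrics can be written out as below. These are the common textbook definitions expressed as fractions rather than percentages; the R packages used in the study may scale or define them slightly differently, and the example values are hypothetical.

```python
def mape(actual, predicted):
    """Mean absolute percentage error (as a fraction)."""
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def smape(actual, predicted):
    """Symmetric MAPE: error scaled by the mean of |actual| and |predicted|."""
    return sum(2.0 * abs(a - p) / (abs(a) + abs(p))
               for a, p in zip(actual, predicted)) / len(actual)

def rae(actual, predicted):
    """Relative absolute error: total error relative to a mean-only predictor."""
    mean_actual = sum(actual) / len(actual)
    model_error = sum(abs(a - p) for a, p in zip(actual, predicted))
    baseline_error = sum(abs(a - mean_actual) for a in actual)
    return model_error / baseline_error

# Hypothetical observed and predicted lengths of stay in days.
observed = [10.0, 20.0, 5.0]
estimated = [12.0, 18.0, 5.0]
print(mape(observed, estimated), smape(observed, estimated), rae(observed, estimated))
```

An RAE below 1 means the model's total absolute error is smaller than that of always predicting the mean observed stay.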

Figure 2. Data preprocessing processes for training and testing the MARS model.

3 Results

3.1 Characteristics

A total of 196 patients were included in the study, along with 19 clinical variables. The distribution of features in the entire dataset is listed in Table 1. The average age at admission was 49.4 years, and women were predominant (60.7%). The mean disease duration was 72.7 months. The average age at the onset of MG symptoms was 43 years. Among the patients, 88.8% displayed anti-AChR-antibody positivity and 4.1% showed anti-MuSK-antibody positivity. The average duration of hospital stay was 12.3 days. The MGFA clinical classification at admission divided the patients into several groups: 23 patients were classified as class I, 71 as class II, 64 as class III, 25 as class IV, and 13 as having an MG crisis with intubation. According to the thymus histology, 91 patients (46.4%) had thymoma and 59 patients (30.1%) had thymic hyperplasia. In total, 129 patients (65.8%) underwent thymectomy.

3.2 Performance of the MARS model

As mentioned, MARS is a non-parametric approach that can capture non-linear relationships between variables and can provide unique information. The 18 predictor variables in this study were used when constructing the MARS model, while three ML methods, namely, Lasso MLR, CART, and RF, were also constructed for comparison. Table 2 shows the model performance of the MARS model and the other three competing ML methods. According to the table, the performance of MARS is similar to that of Lasso MLR, CART, and RF.

Table 2. Model performance of all four models used in this study.

To confirm the performance of the ML methods, the Kruskal–Wallis test (KW-test) and Wilcoxon signed-rank test (WS-test) were utilized to test the four methods. The KW-test is a non-parametric alternative to the one-way ANOVA that can be used to compare multiple groups of data (47). The KW-test was first utilized for comparing MARS, Lasso MLR, CART, and RF. Then, to further check whether MARS generates a different model performance compared to the other three competing methods, the WS-test, a well-known non-parametric statistical technique for assessing the prediction performance of two different algorithms (48), was used for pairwise comparison.

Table 3 shows the KW-test and WS-test results for comparing the performance of the MARS model with that of the Lasso MLR, CART, and RF models. Table 3 indicates that the MARS model does not differ significantly in performance from the Lasso MLR, CART, and RF models. As mentioned in Section 2.2, the advantage of MARS is that it can capture the non-linear relationships between variables by assessing the knots and can provide interpretable information through the knots in its equation, information that cannot be generated by the Lasso MLR, CART, and RF models. Since the statistical tests indicated no significant performance difference between MARS and the three competing ML methods, we consider the MARS model the most suitable model for this study, as it offers additional information to support clinical decision-making when predicting the average length of hospital stay for patients with MG.

Table 3. KW-test and WS-test results of the four used ML methods.

3.3 Equation for prediction of length of hospital stay based on the MARS model

The BFs and coefficients of the best MARS model are listed in Table 4. As presented in the table, four important variables were selected by the best MARS, along with the corresponding knots, for which a total of seven BFs with seven knots were acquired from MARS. Based on Table 4, the MARS equation can be generated as follows:

\begin{array}{l} \text{Average length of hospital stay} \\ = 20.750 - 0.128 \times \mathrm{BF1} - 0.013 \times \mathrm{BF2} - 1.772 \times \mathrm{BF3} \\ + 3.762 \times \mathrm{BF4} - 1.180 \times \mathrm{BF5} - 1.241 \times \mathrm{BF6} + 1.268 \times \mathrm{BF7} \end{array}
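
A minimal sketch of how the equation above could be evaluated is given below. The intercept and coefficients are copied from the equation; the function name is hypothetical, and the seven BF values must be computed by the caller from the hinge definitions in Table 4, which are not reproduced in this sketch.

```python
INTERCEPT = 20.750
# Coefficients of BF1..BF7, in order, as given in the MARS equation.
COEFFICIENTS = [-0.128, -0.013, -1.772, 3.762, -1.180, -1.241, 1.268]

def predict_average_stay(bf_values):
    """Return the predicted average length of stay (days) from seven BF values."""
    if len(bf_values) != len(COEFFICIENTS):
        raise ValueError("expected one value for each of BF1..BF7")
    return INTERCEPT + sum(c * b for c, b in zip(COEFFICIENTS, bf_values))

# With every hinge inactive (all BFs equal to zero), only the intercept remains.
print(predict_average_stay([0.0] * 7))
```

Because each BF is a hinge value that is zero on one side of its knot, a patient whose variables sit on the inactive side of every knot would be assigned the intercept alone.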

Table 4. Basis functions and important variables of the best MARS model.

3.4 Influence of the important variables

To better understand how the four important variables, under the structure of the BFs, affect the average length of hospital stay, Figure 3 visualizes the influence of the important variables on the average number of hospital days. Each panel in the figure contains one of the important variables and its corresponding BFs. For example, the MGFA clinical classification has two BFs, which are plotted by combining the BFs and knots of the MGFA clinical classification. All of the panels in Figure 3 follow the same concept. Figure 3 visualizes the influence of age at admission, disease duration, MGFA clinical classification, and maximum daily dose of prednisolone (PSL) on the average length of hospital stay. In Figure 3A, the age of 41 is the knot of the variable age at admission; before age 41, there is no difference in the average length of hospital stay, and after the age of 41, the average number of hospital days decreases. In Figure 3B, 12 months is the knot of the disease duration, and the average number of hospital days decreases after the duration exceeds 12 months. In Figure 3C, using MGFA clinical classification stage IIIb as the datum point, the average length of hospital stay is shortened when the MGFA clinical classification value decreases from stage IIIb. Further, when the values of MGFA clinical classification increase to stage IIIb, the average length of hospital stay increases. Interestingly, in Figure 3D, the variable maximum daily dose of PSL has three knots, at 5, 10, and 15 mg. When the maximum daily dose of PSL decreases from 5 mg, the average length of hospital stay shortens. The average length of hospital stay showed no difference when the maximum daily PSL dose was between 5 and 10 mg. A maximum daily dose of PSL between 10 and 15 mg shortened the average length of hospital stay. Finally, when the maximum daily PSL dose increased from 15 mg, the average length of hospital stay increased.

Figure 3. Influence of important variables on the average number of hospital days. (A) Age at admission; (B) disease duration; (C) MGFA clinical classification; (D) PSL maximum daily dose.

4 Discussion

Using the data mining adaptive scheme with the MARS methodology, the study identified four clinical variables that were important for the prediction of the average length of hospital stay: age at admission, disease duration, MGFA clinical classification, and prednisolone dose. The non-linear relationships between these variables and the length of stay can be captured and described with the MARS equation. The MARS model identified cut-off points for the four factors and provided more detailed data on how these factors influence the length of hospital stay. The data-mining-based equation incorporating the four risk factors and detailed clinical parameters could provide good predictive accuracy in our sample.

It is important to assess the days in the hospital and the risk factors that influence the length of hospital stay at the time of patient admission because this is beneficial for treatment protocols and financial planning for the hospitalization of MG patients (6, 31). In addition, it helps physicians to explain the situation to patients, control risk, and make a decision plan that could improve the quality of care. MG is a rare disease; understanding the length of hospitalization is helpful for the formulation of national medical insurance policy (6, 31, 49). Several previous studies have investigated the risk factors that influence the outcomes of MG hospitalization based on traditional retrospective and regression analysis (6, 9, 31). One retrospective study demonstrated that intubation and PP correlated with the length of hospital stay and that male sex correlated with a prolonged hospital stay (50). Respiratory distress and pneumonia correlated with poor outcomes during hospitalization in a nationwide review (31). The duration of corticosteroid administration can add to the burden of poorly controlled MG (51). Our results identified another four clinical variables related to the length of hospital stay, which could provide direction and an explainable result for the evaluation of hospital stay for patients with MG.

There are some limitations of the traditional regression method for evaluation of the risk factors. First, if the clinical variables are non-linear or have collinearity, the strength of the relationship could be under-estimated. Second, it cannot determine the cut-off point between different factors. ML methods could address the weakness described above. Different machine learning techniques may need to be applied to various datasets (52). Our results also indicate that the MARS model was statistically significantly stronger than linear regression. Compared with MLR, MARS can automatically create a piecewise linear model that provides an intuitive stepping block into non-linearity after grasping the concept of multiple linear regression (23, 53). MARS is now a well-known ML method and has been used in some medical care issues, including optimal drug level detection, or applied in important variable cut-point detection (36, 54, 55).

Several previous studies have focused on the predictive factors for MG prognosis using data mining methods. A previous study using five ML methods showed that the MGFA classification, intravenous steroid administration during hospitalization, age, treatment with intravenous immunoglobulins, and thymoma were significant variables affecting prolonged hospitalization in MG patients (32). However, no studies have tried to assess length of hospital stay, and our study fills this gap. In contrast to previous results, which tried to identify the relatively important risk factors that related to prolonged hospital stay and the resulting target variable was a categorical variable (prolonged and non-prolonged), our results showed a good prediction accuracy of hospital stay length and the target variable is a continuous variable. We used the MARS methodology not only to identify the important risk factors that influence the average length of hospital stay but also to construct an easy-to-use model, and we can improve the model and prediction accuracy after incorporating this non-linearity.

Moderate MG symptoms at admission constituted an important variable in our dataset. The MGFA clinical classification is a standard method for identifying the different severities and clinical presentations of myasthenia gravis (56). The association of the MGFA classification with the length of hospital stay could be explained by the profound muscle weakness in these patients, which causes severe disability. According to our results, there was a cut-off point at MGFA stage IIIb, and the relationship is non-linear, as the curve in Figure 3 is not a straight line across the range of the findings. Furthermore, in the context of MG treatment, corticosteroids have been a first-line immunosuppressive therapy when symptoms are not adequately controlled (57). However, there is a possible risk of exacerbating MG, known as steroid-induced exacerbation, due to the mechanism involving lymphocyte depletion (58). The reported frequency of steroid-induced exacerbation varies (59), and the slow titration regimen is designed to reduce the risk (60). Thus, clinically, it is important to know which regimen best avoids steroid-induced exacerbation and achieves optimal symptom control. However, there is currently no clear guidance. As a pilot study, our results showed that the prednisolone dose had a biphasic influence on hospital stay in patients with MG. It is possible that higher doses of prednisolone for MG symptoms may cause prolonged hospitalization, and this finding provides an indicator of the impact of steroids on the length of hospital stay.

Several studies have emphasized the importance of age at onset as a prognostic factor for MG. A systematic review highlighted that an onset age below 40 years was a crucial factor for predicting remission (7). Johan et al. demonstrated that early-onset patients tend to have milder disease (40). Chinese studies indicated that MG patients with an onset age exceeding 40 years were more likely to develop generalized MG (41). Furthermore, a retrospective study found that elderly MG patients (onset age > 65 years) were prone to experience increased disease severity (42). Despite a higher percentage of patients in this subgroup presenting with life-threatening events and increased cost during admission, literature reviews have shown that elderly MG patients respond well to treatment (5, 6). While most studies traditionally focus on early/late-onset myasthenia gravis, typically distinguished by an age of 50, our research, although requiring further validation, has identified a critical age threshold at 41 years that influences prognosis. The use of the MARS methodology has introduced new variables and trends for assessing hospital stay duration. The precise impact of age on hospitalization remains unclear, necessitating further research for confirmation.

Our findings indicate that disease duration is a factor that could influence the length of hospitalization. The association between the duration of the disease and the prognosis of MG has been a subject of controversy. Some reports have not concluded that disease duration is closely associated with prognosis in patients with MG (61). However, one cohort study identified disease duration exceeding 41 months as a factor negatively impacting the need for intensive care after MG admission (62). Additionally, a large retrospective study demonstrated that the risk of death tended to decrease after 15 years of the prevalence of the disease (63). Since it is an autoimmune disease, proper medical intervention helps stabilize the symptoms significantly (63). This may be because a longer disease duration allows for more stable drug treatment and better psychological adaptation of the patient to the disease, resulting in a shorter length of hospitalization. Our findings, unlike those in other studies, identified a specific threshold: a disease duration longer than 12 months was associated with a shorter length of hospitalization. To the best of our knowledge, no prior study has established how the duration of the disease might affect MG outcomes. Our research offers fresh insights into the clinical care of MG.

The clinical implications of this study are that we constructed a MARS-derived model that can serve as a supportive assessment tool for clinical physicians in evaluating the length of hospital stay, which is rarely used in health care and allows us to model the interaction of explanatory variables. The interaction between the influencing factors found in this study has not been reported previously. After inputting the values for the four data points, a more accurate estimate of hospital stay duration in the clinician's diagnostic dataset can be derived, which can help in estimating medical costs and providing health education for patients with diseases. Moreover, because the variable phenotype of MG makes it difficult to determine the prognosis, physicians can use this model to identify patients likely to have prolonged hospitalization and the risk factors that influence it.

Despite these promising results, this study had some limitations. Firstly, the sample size was small, and it was drawn from a single center, which may reduce the generalizability of our results. Additionally, this model was not validated on a representative sample. For future validation and to enhance generalizability, data from multiple centers and various regions should be collected. Second, the data were collected from retrospective reviews, not prospective. As mentioned above, MG is a fluctuating disease; the MGFA classifies the disease according to the worst state the patient has been in and is not the best tool for grading patient severity at the time. It would have been better to use validated scales for MG such as the MG composite score (MGC) or the quantitative myasthenia gravis scoring (QMG), which can represent the disease severity and status, and also were not collected for analysis. In future studies, using prospective data for analysis can enhance model validation and improve the overall generalization and practicality of the ML model. Third, these models were chosen based on the clinical data. Other variables, such as blood samples, underlying comorbidities, and complications during hospitalization, were not included in our analysis. Incorporating this information could facilitate a more comprehensive analysis. Fourth, we excluded a significant number of cases from the original dataset because the admission reasons for these patients were unrelated to MG. This exclusion may affect the potential for future general applications. Fifth, our current study primarily focuses on the factors affecting the length of hospitalization after the admission. Therefore, we did not conduct an analysis of hospital stay duration based on different admission methods, including acute disease worsening leading to emergency admissions or admission to the intensive care unit. We also did not investigate the impact of hospitalization simply due to surgical procedures. 
Further studies should emphasize the impact of heterogeneity in hospitalization reasons, including factors like thymectomy surgery, as well as ICU or emergency admission, on the length of hospital stay. Finally, our study population was Asian, which significantly limits the generalizability of the study, and the pattern of clinical practice and admission criteria in this study may differ from those in other countries. Multicenter studies with larger sample sizes may complete the framework of this study and improve the performance of the MARS model, and are worthy of further research.

5 Conclusions

Our results are the first to assess factors that influence the length of hospital stay using data mining methods. The result suggests that the ML-based models of hospital stay length in patients with MG should allow for non-linear associations that could improve their predictive ability. The non-linearity of the MARS model helped to identify cut-off points for four risk factors that influence hospital stay, including disease duration, age at admission, MGFA clinical classification, and daily dose of prednisolone. Furthermore, a MARS-based formula was designed as an assisting clinical decision support tool to help with the assessment of the average hospital stay in MG. In summary, the model maximizes predictions from measurements that can be feasibly supported as an extension of clinical risk assessments. The practical application of this model as a screening tool needs to be replicated and developed further, particularly in community settings.

Data availability statement

The datasets presented in this article are not readily available because of ethical and privacy restrictions. Requests to access the datasets should be directed to C-CC, Y2hhbmdjYzc1QGdtYWlsLmNvbQ==.

Ethics statement

The studies involving humans were approved by the Research Ethics Review Committee of the Shin Kong Wu Ho-Su Memorial Hospital (No. 20190109R). The studies were conducted in accordance with the local legislation and institutional requirements. Written informed consent for participation was not required from the participants or the participants' legal guardians/next of kin in accordance with the national legislation and institutional requirements.

Author contributions

C-CC: Conceptualization, Data curation, Formal analysis, Investigation, Project administration, Writing – original draft, Writing – review & editing. J-HY: Data curation, Writing – review & editing, Resources. H-CC: Data curation, Writing – review & editing, Resources. T-CL: Methodology, Software, Writing – original draft, Writing – review & editing. Y-MC: Data curation, Writing – review & editing, Resources. M-JJ: Methodology, Software, Writing – original draft. C-JL: Conceptualization, Methodology, Project administration, Supervision, Writing – original draft, Writing – review & editing, Formal analysis, Investigation, Funding acquisition.

Funding

The author(s) declare that financial support was received for the research, authorship, and/or publication of this article. This research was partially supported by the National Science and Technology Council, Taiwan (NSTC 111-2221-E-030-009), Fu Jen Catholic University (A0111181), and Fu Jen Catholic University Hospital (PL-202008004-V).

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer CH declared a shared affiliation, with no collaboration with the authors to the handling editor at the time of the review.

The author(s) declared that they were an editorial board member of Frontiers, at the time of submission. This had no impact on the peer review process and the final decision.

Publisher's note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

1. Gilhus NE. Myasthenia gravis. N Engl J Med. (2016) 375:2570–81. doi: 10.1056/NEJMra1602678

2. Gilhus NE, Tzartos S, Evoli A, Palace J, Burns TM, Verschuuren JJGM. Myasthenia gravis. Nat Rev Dis Primers. (2019) 5:30. doi: 10.1038/s41572-019-0079-y

3. Narayanaswami P, Sanders DB, Wolfe G, Benatar M, Cea G, Evoli A, et al. International consensus guidance for management of myasthenia gravis: 2020 update. Neurology. (2021) 96:114–22. doi: 10.1212/WNL.0000000000011124

4. Wakata N, Iguchi H, Sugimoto H, Nomoto N, Kurihara T. Relapse of ocular symptoms after remission of myasthenia gravis–a comparison of relapsed and complete remission cases. Clin Neurol Neurosurg. (2003) 105:75–7. doi: 10.1016/S0303-8467(02)00104-X

5. Cortés-Vicente E, Álvarez-Velasco R, Segovia S, Paradas C, Casasnovas C, Guerrero-Sola A, et al. Clinical and therapeutic features of myasthenia gravis in adults based on age at onset. Neurology. (2020) 94:e1171–80. doi: 10.1212/WNL.0000000000008903

6. Tiamkao S, Pranboon S, Thepsuthammarat K, Sawanyawisuth K. Factors predicting the outcomes of elderly hospitalized myasthenia gravis patients: a national database study. Int J Gen Med. (2017) 10:131–5. doi: 10.2147/IJGM.S129075

7. Mao ZF, Mo XA, Qin C, Lai YR, Olde Hartman TC. Course and prognosis of myasthenia gravis: a systematic review. Eur J Neurol. (2010) 17:913–21. doi: 10.1111/j.1468-1331.2010.03017.x

8. Donaldson DH, Ansher M, Horan S, Rutherford RB, Ringel SP. The relationship of age to outcome in myasthenia gravis. Neurology. (1990) 40:786–90. doi: 10.1212/WNL.40.5.786

9. Wang L, Zhang Y, He M. Clinical predictors for the prognosis of myasthenia gravis. BMC Neurol. (2017) 17:77. doi: 10.1186/s12883-017-0857-7

10. Zhou S, AbdelWahab A, Sapp JL, Warren JW, Horáček BM. Localization of ventricular activation origin from the 12-lead ECG: A comparison of linear regression with non-linear methods of machine learning. Ann Biomed Eng. (2019) 47:403–12. doi: 10.1007/s10439-018-02168-y

11. James G, Witten D, Hastie T, Tibshirani R. An Introduction to Statistical Learning: With Applications in R. 2nd ed New York, USA: Springer. (2013).

12. Kong C, Zhu XZ, Lee TF, Feng PB, Xu JH, Qian PD, et al. LASSO-Based NTCP model for radiation-induced temporal lobe injury developing after intensity-modulated radiotherapy of nasopharyngeal carcinoma. Sci Rep. (2016) 6:26378. doi: 10.1038/srep26378

13. Sharma S, Lal V, Prabhakar S, Agarwal R. Clinical profile and outcome of myasthenic crisis in a tertiary care hospital: a prospective study. Ann Indian Acad Neurol. (2013) 16:203–7. doi: 10.4103/0972-2327.112466

14. Esteva A, Robicquet A, Ramsundar B, Kuleshov V, DePristo M, Chou K, et al. A guide to deep learning in healthcare. Nat Med. (2019) 25:24. doi: 10.1038/s41591-018-0316-z

15. Huang K, Ji F, Xie Z, Wu D, Xu X, Gao H, et al. Artificial liver support system therapy in acute-on-chronic hepatitis B liver failure: classification and regression tree analysis. Sci Rep. (2019) 9:16462. doi: 10.1038/s41598-019-53029-0

16. Liu Y, Gao J, Liu J, Walline JH, Liu X, Zhang T, et al. Development and validation of a practical machine-learning triage algorithm for the detection of patients in need of critical care in the emergency department. Sci Rep. (2021) 11:24044. doi: 10.1038/s41598-021-03104-2

17. Niu X, Liu G, Huo L, Zhang J, Bai M, Peng Y, et al. Risk stratification based on components of the complete blood count in patients with acute coronary syndrome: a classification and regression tree analysis. Sci Rep. (2018) 8:2838. doi: 10.1038/s41598-018-21139-w

18. Wu T-E, Chen H-A, Jhou M-J, Chen Y-N, Chang T-J, Lu C-J. Evaluating the effect of topical atropine use for myopia control on intraocular pressure by using machine learning. J Clin Med. (2021) 10:111. doi: 10.3390/jcm10010111

19. Liu Y, Chen P-HC, Krause J, Peng L. How to read articles that use machine learning: users' guides to the medical literature. JAMA. (2019) 322:1806–16. doi: 10.1001/jama.2019.16489

20. Choi RY, Coyner AS, Kalpathy-Cramer J, Chiang MF, Campbell JP. Introduction to machine learning, neural networks, and deep learning. Transl Vis Sci Technol. (2020) 9:14. doi: 10.1167/tvst.9.2.14

21. Yang L, Wu H, Jin X, Zheng P, Hu S, Xu X, et al. Study of cardiovascular disease prediction model based on random forest in eastern China. Sci Rep. (2020) 10:5245. doi: 10.1038/s41598-020-62133-5

22. Shih CC, Lu CJ, Chen GD, Chang CC. Risk prediction for early chronic kidney disease: results from an adult health examination program of 19,270 individuals. Int J Environ Res Public Health. (2020) 17:4973. doi: 10.3390/ijerph17144973

23. Friedman JH. Multivariate adaptive regression splines. Ann Statist. (1991) 19:1–141. doi: 10.1214/aos/1176347963

24. Lu CJ, Lee TS, Lian CM. Sales forecasting for computer wholesalers: a comparison of multivariate adaptive regression splines and artificial neural networks. Decis Support Syst. (2012) 54:584–96. doi: 10.1016/j.dss.2012.08.006

25. Ting WC, Chang HR, Chang CC, Lu CJ. Developing a novel machine learning-based classification scheme for predicting SPCs in colorectal cancer survivors. Appl Sci. (2020) 10:1355. doi: 10.3390/app10041355

26. Triantafyllidis AK, Tsanas A. Applications of machine learning in real-life digital health interventions: review of the literature. J Med Internet Res. (2019) 21:e12286. doi: 10.2196/12286

27. Peiffer-Smadja N, Rawson TM, Ahmad R, Buchard A, Georgiou P, Lescure FX, et al. Machine learning for clinical decision support in infectious diseases: a narrative review of current applications. Clin Microbiol Infect. (2020) 26:584–95. doi: 10.1016/j.cmi.2019.09.009

28. Aggarwal CC. Data Mining: The Textbook. 1st ed New York, USA: Springer. (2015). doi: 10.1007/978-3-319-14142-8

29. Zaki MJ, Wagner M. Data Mining and Analysis: Fundamental Concepts and Algorithms. 1st ed Cambridge, UK: Cambridge University Press. (2014). doi: 10.1017/CBO9780511810114

30. Barnett C, Herbelin L, Dimachkie MM, Barohn RJ. Measuring clinical treatment response in myasthenia gravis. Neurol Clin. (2018) 36:339–53. doi: 10.1016/j.ncl.2018.01.006

31. Tiamkao S, Pranboon S, Thepsuthammarat K, Sawanyawisuth K. Prevalence of factors associated with poor outcomes of hospitalized myasthenia gravis patients in Thailand. Neurosciences. (2014) 19:286–90.

32. Chang C-C, Yeh J-H, Chen Y-M, Jhou M-J, Lu C-J. Clinical predictors of prolonged hospital stay in patients with myasthenia gravis: a study using machine learning algorithms. J Clin Med. (2021) 10:4393. doi: 10.3390/jcm10194393

33. Menon R, Bhat G, Saade GR, Spratt H. Multivariate adaptive regression splines analysis to predict biomarkers of spontaneous preterm birth. Acta Obstet Gynecol Scand. (2014) 93:382–91. doi: 10.1111/aogs.12344

34. Serrano NB, Sánchez AS, Lasheras FS, Iglesias-Rodríguez FJ, Valverde GF. Identification of gender differences in the factors influencing shoulders, neck and upper limb MSD by means of multivariate adaptive regression splines (MARS). Appl Ergon. (2020) 82:102981. doi: 10.1016/j.apergo.2019.102981

35. Lima E, Davies P, Kaler J, Lovatt F, Green M. Variable selection for inferential models with relatively high-dimensional data: between method heterogeneity and covariate stability as adjuncts to robust selection. Sci Rep. (2020) 10:8002. doi: 10.1038/s41598-020-64829-0

36. Bitetto A, Cerchiello P, Mertzanis C. A data-driven approach to measuring epidemiological susceptibility risk around the world. Sci Rep. (2021) 11:24037. doi: 10.1038/s41598-021-03322-8

37. Sarossy M, Crowston J, Kumar D, Weymouth A, Wu Z. Prediction of glaucoma severity using parameters from the electroretinogram. Sci Rep. (2021) 11:23886. doi: 10.1038/s41598-021-03421-6

38. R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. (2019). Available online at: http://www.R-project.org (accessed June 1, 2022).

39. RStudio Team. RStudio: Integrated Development Environment for R. Boston, MA, USA (2018). Available online at: https://www.rstudio.com/products/rstudio/ (accessed June 1, 2022).

40. Milborrow S. Earth: Multivariate Adaptive Regression Splines. Derived from mda:mars by T. Hastie and R. Tibshirani. (2022). R Package Version, 5.3.1. Available online at: http://CRAN.R-project.org/package=earth (accessed June 1, 2022).

41. Friedman J, Hastie T, Tibshirani R, Narasimhan B, Tay K, Simon N, et al. Glmnet: Lasso and Elastic-Net Regularized Generalized Linear Models. (2022). R Package Version, 4.1-4. Available online at: https://CRAN.R-project.org/package=glmnet (accessed June 1, 2022).

42. Therneau T, Atkinson B. Rpart: Recursive Partitioning and Regression Trees. (2022). R Package Version, 4.1.16. Available online at: https://CRAN.R-project.org/package=rpart (accessed June 1, 2022).

43. Breiman L, Cutler A, Liaw A, Wiener M. RandomForest: Breiman and Cutler's Random Forests for Classification and Regression. (2022). R Package Version, 4.7-1.1. Available online at: https://CRAN.R-project.org/package=randomForest (accessed June 1, 2022).

44. Kuhn M. Caret: Classification and Regression Training. (2022). R Package Version, 6.0-92. Available online at: https://CRAN.R-project.org/package=caret (accessed June 1, 2022).

45. Yan Y. MLmetrics: Machine Learning Evaluation Metrics. (2016). R Package Version, 1.1.1. Available online at: https://CRAN.R-project.org/package=MLmetrics (accessed June 1, 2022).

46. Hamner B, Frasco M. Metrics: Evaluation Metrics for Machine Learning. (2018). R Package Version, 0.1.4. Available online at: https://CRAN.R-project.org/package=Metrics (accessed June 1, 2022).

47. Sawilowsky S, Fahoome G. Kruskal-Wallis Test: Basic. In: Wiley StatsRef: Statistics Reference Online. (2014). doi: 10.1002/9781118445112.stat06567

48. Diebold FX, Mariano RS. Comparing predictive accuracy. J Bus Econ Stat. (2002) 20:134–44. doi: 10.1198/073500102753410444

49. Souayah N, Nasar A, Suri MFK, Kirmani JF, Ezzeddine MA, Qureshi AI. Trends in outcomes and hospitalization charges among mechanically ventilated patients with myasthenia gravis in the United States. Int J Biomed Sci. (2009) 5:209–14. doi: 10.59566/IJBS.2009.5209

50. Ramsaroop T, Gelinas D, Kang SA, Govindarajan R. Analysis of length of stay and treatment emergent complications in hospitalized myasthenia gravis patients with exacerbation. BMC Neurol. (2023) 23:12. doi: 10.1186/s12883-022-02922-9

51. Harris L, Graham S, MacLachlan S, Exuzides A, Jacob S. A retrospective longitudinal cohort study of the clinical burden in myasthenia gravis. BMC Neurol. (2022) 22:172. doi: 10.1186/s12883-022-02692-4

52. Sarker IH. Machine learning: algorithms, real-world applications and research directions. SN Comput Sci. (2021) 2:1–21. doi: 10.1007/s42979-021-00592-x

53. York TP, Eaves LJ, van den Oord EJ. Multivariate adaptive regression splines: a powerful method for detecting disease-risk relationship differences among subgroups. Stat Med. (2006) 25:1355–67. doi: 10.1002/sim.2292

54. Mukherjee S, Frimpong Boamah E, Ganguly P, Botchwey N. A multilevel scenario based predictive analytics framework to model the community mental health, built environment nexus. Sci Rep. (2021) 11:17548. doi: 10.1038/s41598-021-96801-x

55. Butte NF, Wong WW, Adolph AL, Puyau MR, Vohra FA, Zakeri IF. Validation of cross-sectional time series and multivariate adaptive regression splines models for the prediction of energy expenditure in children and adolescents using doubly labeled water. J Nutr. (2010) 140:1516–23. doi: 10.3945/jn.109.120162

56. Fernandez-Lozano C, Hervella P, Mato-Abad V, Rodríguez-Yáñez M, Suárez-Garaboa S, López-Dequidt I, et al. Random forest-based prediction of stroke outcome. Sci Rep. (2021) 11:10071. doi: 10.1038/s41598-021-89434-7

57. Warmolts JR, Engel WK. Benefit from alternate-day prednisone in myasthenia gravis. N Engl J Med. (1972) 286:17–20. doi: 10.1056/NEJM197201062860104

58. Bae JS, Go SM, Kim BJ. Clinical predictors of steroid-induced exacerbation in myasthenia gravis. J Clin Neurosci. (2006) 13:1006–10. doi: 10.1016/j.jocn.2005.12.041

59. Jenkins RB. Treatment of myasthenia gravis with prednisone. Lancet. (1972) 299:765–7. doi: 10.1016/S0140-6736(72)90520-X

60. Farmakidis C, Pasnoor M, Dimachkie MM, Barohn RJ. Treatment of myasthenia gravis. Neurol Clin. (2018) 36:311–37. doi: 10.1016/j.ncl.2018.01.011

61. Citirak G, Cejvanovic S, Andersen H, Vissing J. Effect of gender, disease duration and treatment on muscle strength in myasthenia gravis. PLoS ONE. (2016) 11:e0164092. doi: 10.1371/journal.pone.0164092

62. Chang C-C, Yeh J-H, Chiu H-C, Chen Y-M, Jhou M-J, Liu T-C, et al. Utilization of decision tree algorithms for supporting the prediction of intensive care unit admission of myasthenia gravis: a machine learning-based approach. J Pers Med. (2022) 12:32. doi: 10.3390/jpm12010032

63. Liu C, Wang Q, Qiu Z, Lin J, Chen B, Li Y, et al. Analysis of mortality and related factors in 2195 adult myasthenia gravis patients in a 10-year follow-up study. Neurol India. (2017) 65:518–24. doi: 10.4103/neuroindia.NI_804_16

Keywords: myasthenia gravis, multivariate adaptive regression splines, data mining, prognosis, hospitalization, machine learning

Citation: Chang C-C, Yeh J-H, Chiu H-C, Liu T-C, Chen Y-M, Jhou M-J and Lu C-J (2023) Assessing the length of hospital stay for patients with myasthenia gravis based on the data mining MARS approach. Front. Neurol. 14:1283214. doi: 10.3389/fneur.2023.1283214

Received: 01 September 2023; Accepted: 27 November 2023;
Published: 14 December 2023.

Edited by:

Jian-Quan Shi, Nanjing Medical University, China

Reviewed by:

Chien Tai Hong, Taipei Medical University, Taiwan
Shih-Hsin Chen, Tamkang University, Taiwan
Xiucai Ye, University of Tsukuba, Japan

Copyright © 2023 Chang, Yeh, Chiu, Liu, Chen, Jhou and Lu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Chi-Jie Lu, MDU5MDk5QG1haWwuZmp1LmVkdS50dw==; Y2hpamllLmx1QGdtYWlsLmNvbQ==
