Prediction of patient choice tendency in medical decision-making based on machine learning algorithm

Objective: Machine learning (ML) algorithms, as an early branch of artificial intelligence technology, can effectively simulate human behavior by learning from training data. In this study, ML algorithms were used to predict patient choice tendencies in medical decision-making. The goal was to help physicians understand patient preferences and to provide a reference for developing decision-making schemes in clinical treatment, so that physicians and patients can communicate better at lower cost and reach better medical decisions.
Method: Patient medical decision-making tendencies were predicted from primary survey data obtained from 248 participants at third-level grade-A hospitals in China. Specifically, 12 predictor variables were set according to the literature review, and four outcome variables were set based on the optimization principle of clinical diagnosis and treatment; that is, the patient's medical decision-making tendency was classified as treatment effect, treatment cost, treatment side effect, and treatment experience. In line with the characteristics of the data, three ML classification algorithms, decision tree (DT), k-nearest neighbor (KNN), and support vector machine (SVM), were used to predict patients' medical decision-making tendency, and the performance of the three algorithms was compared.
Results: The accuracy of the DT algorithm in predicting patients' choice tendency was 80% for treatment effect, 60% for treatment cost, 56% for treatment side effect, and 60% for treatment experience; the corresponding accuracies were 78%, 66%, 74%, and 84% for the KNN algorithm and 82%, 76%, 80%, and 94% for the SVM algorithm. The corresponding F1-scores were 0.80, 0.61, 0.58, and 0.60 for the DT algorithm; 0.75, 0.65, 0.71, and 0.84 for the KNN algorithm; and 0.81, 0.74, 0.73, and 0.94 for the SVM algorithm.
Conclusion: Among the three ML classification algorithms, SVM had the highest accuracy and the best performance. The prediction results therefore have reference value and guiding significance for physicians formulating clinical treatment plans. The findings help to promote the development and application of a patient-centered medical decision assistance system, to resolve conflicts of interest between physicians and patients, and to assist them in reaching scientific decisions.


1. Introduction
Traditional medical models are gradually shifting from disease-centered to patient-centered as medicine progresses (1,2). The Institute for Patient- and Family-Centered Care (IPFCC), an institution specializing in patient-centered medical services in the United States, proposes that medical services should involve four elements: respect and dignity, information sharing, participation, and collaboration (3). Patient involvement in medical decision-making has recently become a focus of research and is also a significant issue in clinical practice (4). According to a survey conducted in the United Kingdom, more patients want to be involved in their medical decision-making: one-third of community patients and half of hospitalized patients want to participate in determining treatment plans (5). Schoenfeld et al. (6) found that when there are multiple reasonable options, most adult emergency department patients want to participate in part of the decision-making process.
Patients' participation in medical decision-making has been regarded as a sign of the quality of medical care (7). It is conducive to establishing a harmonious doctor-patient relationship, forming correct diagnosis and treatment plans, and reducing medical expenses, which in turn improves patient satisfaction and treatment effect; the concept of patient participation in decision-making is therefore widely accepted. However, there are many barriers to patient participation in medical decision-making, such as insufficient doctor-patient communication, a lack of patient health literacy, limited diagnosis and treatment time, and other factors that leave patients not knowing how to participate in decision-making. As a result, communication between physicians and patients is critical. Physicians need to inform their patients about different treatment options, listen patiently to their appeals, and discuss the risks and benefits of different treatment options with them. Similarly, patients should inform physicians about their preferences in medical decision-making. These factors will influence the final decision; thus, patient preference is undeniably important in medical decision-making. According to the principle of medical decision-making optimization (8), physicians should formulate diagnosis and treatment plans with the goal of maximizing benefits while minimizing costs. More specifically, according to the patient's condition and family situation, as well as the development level of local medical technology and objective conditions, the diagnosis and treatment measures with the least pain, the lowest cost, the best curative effect, and the highest degree of safety should be taken. On this basis, patients' medical decision-making tendency was divided into four aspects: treatment effect, treatment cost, treatment side effect, and treatment experience.
In recent years, more researchers have attempted to predict medical problems using machine learning (ML), deep learning, and neural network modeling methods. Researchers at the University of Pittsburgh created the first clinical decision support system in human medicine in the 1970s, with the goal of diagnosing complex internal diseases (9). Zhou et al. (10) modeled and analyzed doctor-patient generated data with an ensemble CNN-RNN framework in order to improve patients' access to high-quality health information, medical resources, and professional guidance in a virtual healthcare setting, and thereby to promote patient participation in shared decision-making. Sun et al. (11) constructed a deep learning-based medical image and transcript data analysis model. Based on the analysis of medical big data, the model can intelligently judge diseases and make effective decisions. It can also analyze the health status of patients from their medical examination records and predict the future risk of a given disease. During the COVID-19 pandemic, Pourhomayoun and Shakibi (12) used an ML-based prediction model to extract the characteristic symptoms of COVID-19 patients that deserve attention, and the model predicted mortality with an accuracy as high as 89.98%. Specific application studies include leukemia diagnosis (13), prediction of death risk in patients with sepsis (14), formulation of drug dosage in patients receiving radiotherapy (15), and diagnosis of allergic rhinitis (16). Although algorithms are not new to medical decision-making, the availability of large amounts of medical data makes ML increasingly applicable in this field (17), and the scope of decision-making problems it can solve is gradually expanding (18).
The "patient-centered" service concept, which emphasizes patient input into healthcare decisions, is gaining popularity. However, in the practical implementation of medical decision-making, evaluation tools and quantitative models are urgently needed to assist physicians in making judgments. It is therefore important to apply ML algorithms to help physicians understand their patients' medical decision-making preferences. Because medical decision-making involves a large number of variables and data are difficult to acquire, predicting the choice tendencies of patients is challenging, and few studies have used ML algorithms to predict and analyze choice tendencies in medical decision-making. The purpose of this study was to investigate the use of traditional ML algorithms in the prediction of patient choice tendencies in order to improve the quality of medical decision-making.
2. Materials and methods

2.1. Data and sample
The data for this study were obtained from 248 valid patient questionnaires collected by the project team members at third-level grade-A hospitals in Guangzhou by random sampling from September to December 2021. Before the questionnaire was administered, the investigators explained the study to the hospital's medical staff and patients, and all participants provided written informed consent. Ethical approval was obtained from the Ethical Review Committee of the China Guangzhou Medical University. The questionnaire data consist of two parts. The first part includes the basic information of the patient, and the second part includes the choice tendency of patients' medical decision-making. The specific questionnaire design is shown in Appendix A. The predictor variables consisted of a set of demographic and study variables that were selected based on the literature. The demographic variables (19) include patients' gender, age, education, religion, marital status, number of children, family annual income, and main source of income. Study variables include the condition and severity of any disease (20, 21), the ratio of family monthly medical expenses to income (within 5 years) (22), and medical insurance status (23).
The outcome variable was the patients' medical decision-making choice tendency. According to the principle of optimization, the choice tendency of medical decision-making was divided into four aspects: treatment effect, treatment cost, treatment side effect, and treatment experience (8). The choice tendency for each aspect was rated on a four-point scale (1 = very important, 2 = important, 3 = low importance, 4 = not important at all).

2.2. Methods
Because of the small sample size (n = 248), too few observations exist at each level of the medical decision-making tendency, which would affect the performance of the model. The medical decision-making choice tendency was therefore collapsed into two classes: important (ratings 1 or 2) and unimportant (ratings 3 or 4).
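The recoding described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `binarize_tendency` and the 1/0 label coding (1 = important, 0 = unimportant) are assumptions made for the example.

```python
def binarize_tendency(rating: int) -> int:
    """Collapse the four-point importance scale into two classes.

    Ratings 1 or 2 map to 1 (important); ratings 3 or 4 map to 0 (unimportant).
    The 1/0 coding is an assumption made for this sketch.
    """
    if rating not in (1, 2, 3, 4):
        raise ValueError(f"rating must be 1, 2, 3, or 4, got {rating!r}")
    return 1 if rating <= 2 else 0


# Hypothetical questionnaire answers for one outcome (e.g., treatment cost).
ratings = [1, 2, 3, 4, 2]
labels = [binarize_tendency(r) for r in ratings]
print(labels)  # [1, 1, 0, 0, 1]
```

Each of the four outcomes (treatment effect, cost, side effect, and experience) would be recoded in the same way, giving four separate binary classification targets.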
With a limited sample size, traditional ML models generalize better than deep learning models (24,25). Furthermore, because the samples in this study are structured data and the patient's medical decision-making tendency is predicted as one of two classes, the task is a binary classification problem. A supervised binary classification algorithm is therefore used in this study (26). Currently, binary classification algorithms can be divided into single classification algorithms and ensemble algorithms with good performance. Single algorithms mainly include Naive Bayes, k-nearest neighbor (KNN), decision tree (DT), and support vector machine (SVM) classification, while ensemble algorithms include Bagging, Random Forest, and the Boosting series (27). Ensemble classification algorithms are more suitable for complex data, but their prediction speed is significantly lower than that of single algorithms. Additionally, the Naive Bayes algorithm presupposes that the sample attributes are independent of one another (28). Thus, taking into account the data characteristics and the prediction problem in this study, KNN, DT, and SVM classification are chosen to predict the patient's medical decision-making tendency, and the performance of the three classification algorithms is compared and analyzed.
The classification algorithm's performance is then evaluated using the following evaluation indexes: Accuracy rate, Precision rate, Recall rate, and F1-Score (29).
For binary problems, the prediction results are classified into positive or negative categories. As shown in Figure 1, True Positive (TP) refers to the number of positive samples predicted as positive; True Negative (TN) refers to the number of negative samples predicted as negative; False Positive (FP) refers to the number of negative samples predicted as positive; and False Negative (FN) refers to the number of positive samples predicted as negative. The Accuracy rate, given in Equation (1), is the proportion of all samples that are predicted correctly and measures the ability of the model to avoid errors.
The Precision rate is the proportion of samples predicted as positive that are actually positive, as shown in Equation (2).
The Recall rate is the proportion of actual positive samples that are correctly recognized, as shown in Equation (3).
The F1-Score is a comprehensive evaluation index that combines the Precision (P) rate and the Recall (R) rate, as shown in Equation (4); the higher the F1-Score, the more effective the model.
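Written out under the standard definitions of these indexes (assumed here to correspond to the paper's Equations 1-4), with TP, TN, FP, and FN as defined above and P and R denoting Precision and Recall:

```latex
\begin{align}
\text{Accuracy}  &= \frac{TP + TN}{TP + TN + FP + FN} \tag{1} \\
\text{Precision} &= \frac{TP}{TP + FP} \tag{2} \\
\text{Recall}    &= \frac{TP}{TP + FN} \tag{3} \\
F1 &= \frac{2 \times P \times R}{P + R} \tag{4}
\end{align}
```

As a worked illustration with made-up counts, TP = 3, TN = 3, FP = 1, and FN = 1 give Accuracy = 6/8 = 0.75, Precision = 3/4 = 0.75, Recall = 3/4 = 0.75, and F1 = 0.75.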

3. Results and discussion
Following the global crisis triggered by the COVID-19 pandemic, the world economy has plunged into a slump, with a sharp increase in instability and uncertainty. At the same time, [...] ML algorithms. The research results will help to build a patient-centered medical decision-making assistance system that assists medical staff in quickly understanding patients' decision-making tendencies, effectively resolves conflicts of interest between physicians and patients, and ultimately helps physicians and patients to reach scientific decisions. Drawing on the empirical results of this study, this paper discusses the development prospects of the patient-centered medical decision assistance system from the following three aspects.
3.1. Patient-centered medical decision-making assistance systems help to resolve conflicts of interest between physicians and patients
The key to predicting patients' medical decision-making choice tendency is selecting effective predictor variables. However, the factors influencing decision-making are complex and the data are difficult to collect. As described above, 12 predictor variables were selected for feature processing, and the medical decision-making preference was classified into four aspects: treatment effect, treatment cost, treatment side effect, and treatment experience.
The descriptive analysis of specific variables is shown in Table 1. The number of male patients was similar to that of female patients, with 46% males and 54% females. Patients aged 18-35 years had the highest proportion (53%), and the proportion of bachelor's degree or above was the highest, at 46%. Among the surveyed patients, 52% thought the treatment effect was very important, 36% thought the treatment cost was very important, 33% thought the treatment side effect was very important, and 60% thought the treatment experience was very important. Based on the descriptive statistical analysis of variables, linear regression analysis was performed with basic information about patients as independent variables and the four types of medical choice tendencies as dependent variables. The specific analysis results are shown in Table 2, in which the regression results of treatment effect can be seen in model 1; treatment costs, model 2; treatment side effects, model 3; and treatment experience, model 4.
In medical services, physicians commit to practicing ethically and to putting patient wellbeing first (32). As a result, in addition to diagnosing and treating disease, they also need to pay attention to other aspects of patients' medical needs. First, for patients who pay attention to the treatment effect, the regression results in Table 2 show that educational level has a negative effect on the importance rating of the treatment effect (1 = very important, 4 = very unimportant) at the 5% significance level. That is, the higher the educational level of the patient, the more attention is paid to the effect of treatment (33,34). This finding also suggests that patients with higher education levels have higher levels of health literacy (35) and are more concerned about and aware of disease treatment options. When providing medical services to this type of patient, medical staff can give a more professional introduction based on the patient's knowledge background, explain more of the theoretical knowledge of the disease, and enrich the patient's understanding of the disease, thereby helping to improve the treatment effect. Second, for patients who pay attention to treatment costs, the regression results in Table 2 show that fully self-paid medical insurance status has a negative effect on the importance rating of treatment costs at the 5% significance level. In other words, patients who pay entirely out of pocket are more concerned about the cost of treatment than patients with other insurance statuses. As a result, when providing medical services for this type of patient, medical staff should pay more attention to the cost of treatment and can offer alternative cost-effective treatment options (36), such as the use of domestic drugs during treatment.
Third, Table 2 shows that patients with religious beliefs are more likely to pay attention to treatment side effects than those without religious beliefs. Religion may play a role here, specifically the religious taboos of China's ethnic minorities (37), which hold that women should not compromise their bodies in any way and thus reject surgical treatment options such as organ removal. Physicians and nurses planning care for patients with religious restrictions therefore need to pay special attention to the possibility of treatment-related harm. Fourth, for patients who pay attention to treatment experience, the patient's disease status is positively associated with the importance ranking of treatment experience at the 5% significance level. This suggests that patients with severe diseases are more concerned about the pain that treatment may bring (38). As a result, medical personnel should strengthen nursing care for patients with serious illnesses, reduce the pain of treatment, and enhance their treatment experience.
The establishment of a patient-centered medical decision-making assistance system will help medical staff to understand different types of patients and provide them with targeted and humanized medical services. This will assist in resolving conflicts of interest between physicians and patients, thereby reducing the cost of communication between them and improving the quality of medical services.
3.2. Machine learning technology effectively promotes the development of patient-centered medical decision-making assistance systems
Following the regression analysis of patient medical decision choice tendency, the ML algorithm was used to predict patient choice tendency. The ML algorithms in this study were implemented with the sklearn library in the Python programming language. Categorical variables were One-Hot encoded before the models were run, and the parameters were tuned during model construction.
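As a minimal sketch of the One-Hot encoding step, the function below expands one categorical column into 0/1 indicator columns, which is the kind of transformation that encoders such as scikit-learn's `OneHotEncoder` perform. The helper `one_hot_encode`, the column name `insurance`, and the three category labels are hypothetical and are not taken from the questionnaire.

```python
def one_hot_encode(records, column):
    """Replace a categorical column with one 0/1 indicator column per category."""
    # Sort the observed categories so the output columns have a stable order.
    categories = sorted({record[column] for record in records})
    encoded = []
    for record in records:
        # Keep every other field unchanged.
        row = {key: value for key, value in record.items() if key != column}
        for category in categories:
            row[f"{column}_{category}"] = 1 if record[column] == category else 0
        encoded.append(row)
    return encoded


# Hypothetical patient records with one numeric and one categorical predictor.
patients = [
    {"age": 34, "insurance": "self_paid"},
    {"age": 61, "insurance": "employee"},
    {"age": 47, "insurance": "resident"},
]
encoded = one_hot_encode(patients, "insurance")
print(encoded[0])
```

Encoding categories as separate indicator columns avoids imposing an artificial order on unordered variables such as insurance status, which matters for distance-based models like KNN and SVM.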
To compare the performance of the three algorithms, the accuracy rate, precision rate, recall rate, and F1-Score were selected as evaluation indexes. The specific results can be seen in Tables 3-5. When the DT algorithm is used to predict patient preferences, the accuracy for treatment effect, cost, side effect, and experience is 80%, 60%, 56%, and 60%, respectively, compared with 78%, 66%, 74%, and 84% for the KNN algorithm and 82%, 76%, 80%, and 94% for the SVM algorithm.
Figure 2 depicts the specific performance comparison results.
Decision tree parameters: the training set ratio is 0.8, the node split criterion is "gini", the node division mode is "best", the minimum number of samples required to split a node is 2, the minimum number of samples per leaf node is 1, and the maximum tree depth is unlimited.
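The decision tree settings listed above map directly onto the arguments of scikit-learn's `DecisionTreeClassifier`. The sketch below shows that mapping under stated assumptions: the survey data are not available here, so a synthetic data set of the same size (248 samples, 12 predictors) stands in for them, and the `random_state` values are arbitrary choices made only for reproducibility. The scores it prints therefore bear no relation to the values reported in the paper.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the survey data (assumption): 248 samples, 12 predictors,
# binary target (1 = important, 0 = unimportant).
X, y = make_classification(n_samples=248, n_features=12, random_state=0)

# Training set ratio of 0.8, as stated in the parameter note.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.8, random_state=0
)

dt_model = DecisionTreeClassifier(
    criterion="gini",      # node split criterion
    splitter="best",       # node division mode
    min_samples_split=2,   # minimum samples required to split a node
    min_samples_leaf=1,    # minimum samples per leaf node
    max_depth=None,        # no limit on tree depth
    random_state=0,        # arbitrary seed, not from the paper
)
dt_model.fit(X_train, y_train)
y_pred = dt_model.predict(X_test)

# The four evaluation indexes used in the study.
dt_scores = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
}
print(dt_scores)
```

With 248 samples and a training ratio of 0.8, the held-out test set contains 50 samples, so each test sample contributes two percentage points to the accuracy.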
In summary, the study's ML prediction results demonstrate that the SVM model performs well, suggesting that ML algorithms can help medical professionals gain a deeper understanding of patients' propensities for making certain choices. In the era of artificial intelligence, scholars have proposed combining ML tools with decision-making to address the impact of uncertain information on decision results (39). Some scholars even propose using ML algorithms to statistically analyze causal relationships and make predictions, thereby replacing human decision-making behavior (40). The use of ML algorithms in medical decision-making is already well established, for example in the diagnosis and prediction of diseases including cancer, chronic kidney disease, Parkinson's disease, and skin diseases (41).
Taken together with the results of predicting patients' medical decision-making tendencies, this demonstrates that ML algorithms can effectively help medical staff quickly understand patients' needs (42), which in turn encourages the development of patient-centered medical decision-making assistance systems (43). Such a system can enhance the trust between physicians and patients (44) and ultimately helps them reach a shared decision-making model (45) and make scientifically grounded choices.
3.3. Opportunities and challenges of building a patient-centered medical decision-making assistance system under the background of artificial intelligence
The healthcare system has been overwhelmed by the COVID-19 pandemic (46), and mental health issues are on the rise among healthcare workers (47). The likelihood of harmful medical disputes is bound to rise in such a setting. In this context, there is a huge opportunity to build a patient-centered medical decision aid system based on artificial intelligence technologies such as ML algorithms. The patient-centered medical decision assistance system helps to strengthen the understanding between physicians and patients, resolve conflicts of interest, and assist in realizing scientific decision-making, which has important application value in clinical practice.
However, the following obstacles will also need to be overcome during the construction of the auxiliary decision-making system. First, during system development, security issues such as leakage of patient medical information must be avoided. Because medical data are highly valuable, they are frequently targeted for theft by criminal groups who profit from their malicious use (48). Medical data leakage will aggravate patient distrust of medical institutions, which can lead to major crises for those institutions. Therefore, in the development of auxiliary decision-making systems, it is necessary to strengthen the protection of patient privacy and rights through information technology (49). Second, during operation of the auxiliary decision-making system, patients are required to provide not only disease information but also demographic information, which increases the burden on patients, particularly elderly patients with low information literacy (50). According to relevant studies (51, 52), only 16.7% of elderly patients in western Chinese cities meet the health literacy level. To reduce patients' confusion and concern, medical staff must patiently explain and guide such patients while gathering their information. Third, in the application of the auxiliary decision-making system, medical staff also need corresponding training in information technology to enhance their information literacy. According to studies, health information literacy is becoming increasingly important in both developed and developing countries (53). Information skills training for medical staff is therefore very important: it will improve their ability to interpret data-driven decision results and, in turn, the quality of medical decision-making.
Support vector machine parameters: the training set ratio is 0.8, the error term penalty coefficient is 1.0, the kernel is rbf, the kernel coefficient is 0.01, the multiclassification decision function is ovr, the model convergence parameter is 0.001, and the maximum number of iterations is 2,000.
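The support vector machine settings reported in the parameter note (training set ratio 0.8, penalty coefficient 1.0, rbf kernel, kernel coefficient 0.01, ovr decision function, convergence parameter 0.001, and at most 2,000 iterations) correspond to arguments of scikit-learn's `SVC`. The sketch below shows one plausible mapping; reading the "kernel coefficient" as `gamma` and the "convergence parameter" as `tol` is an assumption, as are the synthetic stand-in data and the `random_state` values. Its printed scores do not reproduce the paper's results.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for the survey data (assumption): 248 samples, 12 predictors,
# binary target (1 = important, 0 = unimportant).
X, y = make_classification(n_samples=248, n_features=12, random_state=0)

# Training set ratio of 0.8, as stated in the parameter note.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.8, random_state=0
)

svm_model = SVC(
    C=1.0,                          # error term penalty coefficient
    kernel="rbf",                   # radial basis function kernel
    gamma=0.01,                     # kernel coefficient (assumed to mean gamma)
    decision_function_shape="ovr",  # one-vs-rest decision function
    tol=0.001,                      # convergence tolerance (assumed to mean tol)
    max_iter=2000,                  # maximum number of solver iterations
)
svm_model.fit(X_train, y_train)
y_pred = svm_model.predict(X_test)

svm_scores = {
    "accuracy": accuracy_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
}
print(svm_scores)
```

For a two-class target the `decision_function_shape` setting has no effect on predictions; it is included only to mirror the reported configuration.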

4. Conclusions
Medical decision assistance systems are a growing trend that helps physicians and patients make scientific decisions and resolve conflicts of interest. The purpose of this research was to propose a method that uses ML algorithms to predict a patient's preferences when making important medical decisions. The findings demonstrate the SVM algorithm's strong predictive ability, which suggests that artificial intelligence techniques such as ML algorithms can contribute to the development and application of patient-centered medical decision assistance systems.
This study still has some limitations. First, owing to practical constraints, the sample size of the questionnaire survey on patients' medical decision-making preferences is insufficient, which prevents some algorithms from achieving good performance. Second, the features for this study were selected through literature review, which may introduce noise in some features and affect the generalization ability of the model. In future research, the sample size should be expanded and the method of feature selection optimized, and the results should be compared with ensemble learning or deep learning algorithms to improve the prediction model's performance and generalization ability.

Data availability statement
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Ethics statement
Ethical review and approval was not required for the study of human participants in accordance with the local legislation and institutional requirements. Written informed consent from the patients/participants was not required to participate in this study in accordance with the national legislation and the institutional requirements.

Author contributions
YL and JL: conceptualization and validation. YL and ZY: methodology, software, data curation, and visualization. YL and QX: formal analysis. JL: investigation, writing-review and editing, supervision, project administration, and funding acquisition.