Development of an interpretable machine learning-based intelligent system of exercise prescription for cardio-oncology preventive care: A study protocol

Background: Cardiovascular disease (CVD) and cancer are the first and second leading causes of death in over 130 countries. They are also among the top three causes in almost 180 countries. Cardiovascular complications are often observed in cancer patients, with nearly 20% exhibiting cardiovascular comorbidities. Physical exercise may be helpful for cancer survivors and people living with cancer (PLWC), as it can help prevent relapse, CVD, and cardiotoxicity. Therefore, it is beneficial to recommend exercise as part of cardio-oncology preventive care.
Objective: With the progress of deep learning algorithms and the improvement of big data processing techniques, artificial intelligence (AI) has gradually become popular in medicine and healthcare. Given the shortage of medical resources in China, adopting AI and machine learning methods for prescription recommendations is of great significance. This study aims to develop an interpretable machine learning-based intelligent system of exercise prescription for cardio-oncology preventive care, and this paper presents the study protocol.
Methods: This will be a retrospective machine learning modeling cohort study with interventional methods (i.e., exercise prescription). We will recruit PLWC participants at baseline (from 1 January 2025 to 31 December 2026) and follow them up from 1 January 2027 to 31 December 2028. Specifically, participants will be eligible if they are (1) PLWC in Stage I or cancer survivors from Stage I; (2) aged between 18 and 55 years; (3) interested in physical exercise for rehabilitation; (4) willing to wear smart sensors/watches; and (5) assessed by doctors as suitable for exercise interventions. At baseline, clinical exercise physiologists certificated by the joint training program (from 1 January 2023 to 31 December 2024) of the American College of Sports Medicine and the Chinese Association of Sports Medicine will recommend an exercise prescription to each participant. During the follow-up, effective exercise prescriptions will be determined by assessing the CVD status of the participants.
Expected outcomes: This study aims to develop not only an interpretable machine learning model to recommend exercise prescriptions but also an intelligent system of exercise prescription for precision cardio-oncology preventive care.
Ethics: This study has been approved by the Human Experimental Ethics Inspection of Guangzhou Sport University.
Clinical trial registration: http://www.chictr.org.cn, identifier ChiCTR2300077887.

Introduction
Globally, cardiovascular disease (CVD) is the leading cause of death, while cancer ranks second in around 130 countries (1). These two diseases are also among the top three causes of death in almost 180 countries (2). People living with cancer (PLWC) often exhibit cardiovascular complications resulting from so-called "cardio-toxicity" (3), defined as any heart damage arising from cancer treatments, as well as from the overlap of risk factors for cancer and CVD, including a diet unbalanced in fat, alcohol abuse, and physical inactivity (4).
Common prevention strategies exist for these shared risk factors. For example, regularly engaging in physical activity (i.e., exercise prescription) is an efficient prevention strategy in cardio-oncology (5) because physical exercise can reduce not only cardio-toxicity but also the adverse effects of chemotherapy, including lymphoedema, fatigue, and immunological disorders (6). As a result, physical exercise is a valuable tool for CVD prevention among PLWC and cancer survivors. Therefore, it is beneficial to recommend exercise prescription in cardio-oncology preventive care (5).
Artificial intelligence (AI) has been widely employed in healthcare and medicine, especially following major advances in deep learning algorithms and big data processing techniques (7). AI has been applied in mental health (8), internal medicine (9), infectious disease control (10), heart failure (11, 12), and diabetes (13), among other fields. Given the shortage of medical resources in China, developing prescription recommendation systems using AI and machine learning methods is promising (14). Wang et al. have proposed a reinforcement learning-based dynamic prescription recommendation system (15).
As for a dynamic recommendation system of exercise prescription, Tuka and Linhart discussed the possibility of utilizing AI and machine learning approaches for personalized exercise prescription recommendations for  (17). However, no existing studies have examined machine learning-based exercise prescription recommendations for cardio-oncology preventive care. Therefore, we aim to develop an interpretable machine learning-based intelligent system of exercise prescription for cardio-oncology preventive care, and we present the study protocol here.

Design
This will be a cohort study with retrospective machine learning modeling. The study timeline is presented in Figure 1.
From 1 January 2023 to 31 December 2024, we will train our exercise prescription doctors in the training program for clinical exercise physiologists (CEPs). This training program is jointly supported by the American College of Sports Medicine (ACSM) and the Chinese Association of Sports Medicine (CASM). Candidates admitted to the program should either hold a bachelor's degree (or above) in medicine or public health, or have at least 3 years of professional clinical experience in healthcare. Candidates are also expected to have extensive knowledge of physical education and sports training. In the program, candidates need to complete reading materials, online courses, offline tutorials, practice, and examinations. The offline training activities will take place in Zhuhai, China, where a certificated training base of the ACSM-CASM CEP program is located. After completing all modules, our exercise prescription doctors will be jointly certificated by ACSM and CASM as CEPs.
From 1 January 2025 to 31 December 2026, we will recruit 600 participants who are PLWC (in Stage I) or cancer survivors (from Stage I) for this study. The baseline characteristics, including demographics, cancer-related information, exercise habits and lifestyle, health-related physical fitness, and CVD-related items, will be collected, using physical examination or biomedical testing where necessary. The exercise prescription doctors certificated by ACSM-CASM will then recommend an exercise prescription to each participant based on these variables. During the 2-year exercise intervention period, we will monitor each participant's completion status through wearable devices, which record physical activity every day. Additionally, our health management team will keep track of participants every week to ensure their adherence.
We will undertake the follow-up study between 1 January 2027 and 31 December 2028, re-examining CVD-related items for all participants. Finally, effective exercise prescriptions will be identified based on the changes in CVD-related items from baseline to follow-up, and interpretable machine learning models will be applied to the effective exercise prescriptions. The intelligent exercise prescription recommendation system will be developed based on machine learning models with both good interpretability and high performance.

Selection of subjects
This protocol involving human participants was reviewed and approved by the Ethics Committee of Guangzhou Sport University. Participants will provide written informed consent to participate in this study.

Exclusion criteria
Participants will be ineligible for inclusion if they have (1) current or recent serious sports injuries; (2) existing severe CVD; or (3) other conditions that make them unsuitable for exercise interventions, as assessed by doctors.

The sample size
We will recruit 600 participants for this study. Generally, the minimum sample size for machine learning modeling is 200 (18). We estimate that the loss rate in the follow-up will be 47%, according to a finding that 53% of cancer survivors do not follow the recommended physical activity guidelines (19). We infer that 65% of exercise prescriptions will be assessed as effective for cardio-oncology preventive care after the follow-up and used for machine learning modeling. This assumed value is based on the effectiveness rate of exercise prescriptions in reducing the risk of cardiovascular events among cancer patients (20). Therefore, at least 200 ÷ 65% ÷ 53% ≈ 580.55 participants (i.e., 581 after rounding up) are needed. As a result, we decided to recruit 600 participants at baseline to guarantee the minimum sample size for machine learning modeling.
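The arithmetic above can be checked with a short sketch. The function and variable names are illustrative and not part of the protocol; the minimum modeling sample (200), the effectiveness rate (65%), and the retention rate (53%) are the protocol's own assumed values.

```python
import math

def required_baseline_sample(min_modeling_n, effectiveness_rate, retention_rate):
    """Baseline recruits needed so that the expected number of retained,
    effective prescriptions reaches min_modeling_n."""
    return min_modeling_n / effectiveness_rate / retention_rate

# Values stated in the protocol: 200 minimum, 65% effective, 53% retained.
raw = required_baseline_sample(200, 0.65, 0.53)
needed = math.ceil(raw)  # whole participants, rounded up

# Expected usable sample if 600 participants are recruited.
expected_usable = 600 * 0.53 * 0.65

print(f"{raw:.2f}")              # 580.55
print(needed)                    # 581
print(f"{expected_usable:.1f}")  # 206.7
```

Under these assumptions, recruiting 600 participants leaves an expected margin of only about 7 usable prescriptions above the minimum of 200, so the result is sensitive to the assumed rates.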

The baseline and follow-up
At baseline and at follow-up, five groups of characteristics/variables will be collected: demographics, cancer-related information, exercise habits and lifestyle, health-related physical fitness (21), and CVD-related items. Table 1 presents these variables in detail. Specifically, demographics, cancer-related information, exercise habits and lifestyle, and health-related physical fitness are used for exercise prescriptions, while CVD-related items are used to evaluate the effectiveness of exercise prescriptions after comparing them with the follow-up data.

FIGURE 1
The timeline of this study.

Frontiers in Cardiovascular Medicine (frontiersin.org)

Interventional methods
The interventional method for all participants is the exercise prescription, prescribed by our ACSM-CASM certificated exercise prescription doctors.The exercise intervention strategies are prescribed based on each participant's baseline characteristics including demographics, cancer-related information, exercise habits and lifestyle, and health-related physical fitness.
From a professional perspective, exercise prescription doctors will determine the exercise dose by considering three aspects: frequency, duration, and intensity (26). Specifically, doctors prescribe an exercise dose such as "3 exercise sessions per week, 150 minutes in total, at moderate intensity" or "5 exercise sessions per week, 75 minutes in total, at high intensity." Frequency is the number of exercise sessions per week, duration is the total length of time per week, and intensity is classified as high, moderate, or low. Furthermore, we can express the exercise dose using the metabolic equivalent of task (MET), which combines frequency, duration, and intensity (54). For example, the two exercise doses above are equivalent to each other, and both represent 7.5 MET-h/week.
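The equivalence of the two example doses can be verified with a minimal sketch. It assumes MET values of 3 for moderate and 6 for high intensity, which are common reference points but are not fixed by the protocol, and it reads both durations as weekly totals, in line with the definition of duration given above. The function and dictionary names are illustrative.

```python
# Assumed MET value per intensity level (an assumption for illustration;
# the protocol does not specify these numbers).
ASSUMED_MET = {"moderate": 3.0, "high": 6.0}

def weekly_met_hours(total_minutes_per_week, intensity):
    """Weekly exercise dose in MET-hours: MET value times hours per week."""
    return ASSUMED_MET[intensity] * total_minutes_per_week / 60.0

moderate_dose = weekly_met_hours(150, "moderate")  # 3 sessions, 150 min in total
high_dose = weekly_met_hours(75, "high")           # 5 sessions, 75 min in total

print(moderate_dose, high_dose)  # 7.5 7.5
```

Frequency does not enter the calculation directly; it only determines how the weekly total is split across sessions.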
During the 2-year exercise intervention period, wearable devices will be used to monitor each participant's completion status. In addition, our professional health management team (including CEPs certificated by the ACSM-CASM program and several assistants who hold a degree in public health, sports training, physical education, social work, or psychology) will keep track of participants every week to ensure their adherence. Participants will pay a deposit for the wearable devices at the beginning, which will be returned after the follow-up. Participants who remain in contact with our health management team during these 2 years will be given free sports equipment (e.g., badminton rackets, yoga mats, foam rollers) every 6 months.

Data analysis and interpretable learning
Effective exercise prescriptions will be selected based on pre- and post-intervention data analysis of CVD-related items. For example, if the blood pressure variability decreases or at least does not increase, the exercise prescription can be considered effective. We will train interpretable machine learning models on all effective prescriptions to learn exercise prescription recommendations. Demographics, cancer-related information, exercise habits and lifestyle, and health-related physical fitness will be input variables, while the frequency, duration, and intensity of the exercise prescription will be output variables. We formulate the prescription learning process as a machine learning classification task.
Explainable machine learning models such as logistic regression, support vector machine, decision tree, random forest, k-nearest neighbor, and naive Bayes classifiers will be utilized. Specifically, in the logistic regression model, the estimated coefficients and their standard errors, P-values, and 95% confidence intervals will provide interpretability. The process of feature selection and the choice of kernels when using a support vector machine may reveal the model's interpretability (55). Graphical presentation of trees and importance ranking of features for decision tree and random forest can lead to good explanations (56). In the k-nearest neighbor algorithm, showing the k nearest neighbors might also be explainable (57). Naive Bayes can be interpreted on the modular level through conditional probabilities, making it clear how much each feature contributes toward a certain class prediction (58). Generally, these machine learning models may all have good potential for interpretability.
Some deep learning models will also be employed, and their classification performance (accuracy, precision, recall, F1 score, area under curve) in fivefold cross-validation will be compared with that of the above-mentioned explainable machine learning models. Explanations of deep neural networks can be challenging (59). Hence, to make our black-box models (convolutional neural networks, eXtreme gradient boosting, multilayer perceptron, deep residual network, DeepGBM) more interpretable, an explainability tool named SHapley Additive exPlanations (SHAP) (60) will also be included. Recent advances in interpretability research have demonstrated the explainable potential of such models when combined with SHAP. For example, Zhao et al. proposed a novel SHAP score computing algorithm for convolutional neural networks in classification (61). Meng et al. developed an integrated framework with better interpretability based on SHAP and eXtreme gradient boosting (62).
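The selection of effective prescriptions described at the start of this section can be sketched as follows. This is a minimal illustration that uses only the blood pressure variability example from the text; the record layout, field names, and all values are hypothetical, and the actual study would presumably consider several CVD-related items.

```python
from dataclasses import dataclass

@dataclass
class FollowUpRecord:
    """Hypothetical record layout; field names are not from the protocol."""
    participant_id: str
    prescription: tuple   # (sessions per week, total minutes per week, intensity)
    bpv_baseline: float   # blood pressure variability at baseline
    bpv_follow_up: float  # blood pressure variability at follow-up

def is_effective(record):
    """Effective if blood pressure variability decreased or did not increase."""
    return record.bpv_follow_up <= record.bpv_baseline

def effective_prescriptions(records):
    """Keep only the records whose prescription is assessed as effective."""
    return [r for r in records if is_effective(r)]

# Made-up example data for illustration only.
records = [
    FollowUpRecord("P001", (3, 150, "moderate"), 12.0, 10.5),  # decreased
    FollowUpRecord("P002", (5, 75, "high"), 9.0, 9.0),         # unchanged
    FollowUpRecord("P003", (2, 60, "low"), 8.0, 11.0),         # increased
]

kept = effective_prescriptions(records)
print([r.participant_id for r in kept])  # ['P001', 'P002']
```

Only the retained records would then serve as training examples, with baseline characteristics as inputs and the prescription as the label.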
Since the intervention will last 2 years, there may be a high chance of loss to follow-up. Therefore, we set our sample size at 600 instead of 580, to cope with a higher loss rate or a lower effectiveness rate than our initial estimates. If the final sample size for machine learning modeling still fails to reach the minimum bound (18), we will employ few-shot learning algorithms (63) suited to such small-sample machine learning tasks.

Intelligent system development
Considering both model performance and model interpretability, the intelligent exercise prescription recommendation system will be developed based on machine learning models with both good interpretability and high performance. Figure 2 depicts an example of this intelligent system. Notably, the system is designed to explain why it recommends specific exercise prescriptions.
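One way a recommendation could carry its own explanation is the k-nearest neighbor approach mentioned in the data analysis section, where the most similar past participants are shown alongside the recommended prescription. The sketch below is an assumption-laden illustration, not the system's actual design: the two normalized features, the prescription labels, and the participant identifiers are all hypothetical.

```python
from collections import Counter
from math import dist

def recommend_with_neighbors(training_rows, query_features, k=3):
    """Return the majority prescription among the k nearest training rows,
    together with those rows so they can be shown as the explanation.

    training_rows: list of (features, prescription_label, participant_id).
    """
    ranked = sorted(training_rows, key=lambda row: dist(row[0], query_features))
    neighbors = ranked[:k]
    votes = Counter(label for _, label, _ in neighbors)
    recommendation = votes.most_common(1)[0][0]
    return recommendation, neighbors

# Hypothetical, already-normalized features: (age, fitness score).
training_rows = [
    ((0.20, 0.80), "3x/week, 150 min, moderate", "P001"),
    ((0.25, 0.75), "3x/week, 150 min, moderate", "P002"),
    ((0.30, 0.90), "5x/week, 75 min, high", "P003"),
    ((0.80, 0.20), "2x/week, 60 min, low", "P004"),
    ((0.85, 0.30), "2x/week, 60 min, low", "P005"),
]

recommendation, neighbors = recommend_with_neighbors(training_rows, (0.22, 0.78))
print(recommendation)
for features, label, participant_id in neighbors:
    print(participant_id, features, label)
```

Here the explanation is simply the list of similar participants and the prescriptions that were effective for them; other model families would need a different explanation format, such as coefficients or SHAP values.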

Discussion
All cancer patients need to consider a multidisciplinary approach during treatment, which includes physical training, psychological support, and lifestyle advice (26). The concept of cardio-oncology rehabilitation has been introduced by both the American Heart Association and the American Cancer Society (64). Cardio-oncology rehabilitation aims to identify PLWC who are also at high risk of cardiac dysfunction. Physical activity intervention, that is, exercise prescription, is an important component of cardio-oncology rehabilitation and can prevent or moderate cardiovascular events in cancer patients or survivors. It has been demonstrated that heterogeneous responses to the same physical training can enhance cardio-respiratory fitness in cancer therapy (65). There is some theoretical and experimental evidence as to why physical exercise can help reduce cardiovascular events in cancer patients or survivors. For example, so-called "cancer-induced cardiac cachexia," which refers to a multi-organ/tissue syndrome affecting the brain, liver, and heart, can exist in cancer patients because of the tumor environment (66). Previous studies have revealed that physical exercise can restore muscle strength and improve endurance, thus counteracting cardiac cachexia (67). Exercise prescription can act as an aid and therapy for cardio-oncology preventive care. Therefore, our study on interpretable machine learning of exercise prescription for cardio-oncology prevention is of significance.
Some limitations of our study need to be mentioned. First, only PLWC in cancer Stage I or cancer survivors from cancer Stage I are taken into account. Although including more participants (e.g., expansion to Stage II or Stage III) may improve our approach's coverage, we decided to focus on participants with less advanced cancer. Second, there is always a trade-off between model interpretability and model performance. Black-box deep learning models may outperform explainable machine learning models in evaluation metrics such as accuracy and area under curve (AUC). Therefore, we plan to adopt both in this study and select the model that best balances the two for intelligent system development. Third, we will only employ internal cross-validation methods in training and testing as an initial validation choice. To fill this gap, after the interpretable machine learning-based intelligent system of exercise prescription has been developed and used in practice for a period, we will conduct a quasi-experimental trial for external validation in another study, as other machine learning-based medical studies have done (68-70). Furthermore, with external validation and more data in the future, we can continuously update the intelligent system of exercise prescription through dynamic optimization of the parameters of the interpretable machine learning model.

FIGURE 2
A demonstration of the intelligent system of exercise prescription.

In conclusion, physical exercise is a promising interventional strategy for cancer patients or survivors during and after medical treatment and may also be effective in counteracting some adverse effects of the tumor environment or drugs on their cardiovascular system. When prescribing exercise, we need to take the cancer patients' or survivors' individual characteristics, cancer drugs/medications, personal lifestyle history, and health-related physical fitness into consideration. Such a tailored exercise prescription process can be learned by interpretable models using machine learning approaches and can generate an intelligent recommendation system of exercise prescription for cardio-oncology preventive care in the future.

TABLE 1
Characteristics of participants in the baseline and at follow-up.