PRAgmatic Clinical Trial Design of Integrative MediCinE (PRACTICE): A Focus Group Series and Systematic Review on Trials of Diabetes and Kidney Disease

Background: Pragmatic trials inform clinical decisions with better generalizability and can bridge different streams of medicine. This study collated the expectations regarding pragmatic trial design of integrative medicine (IM) for diabetes and kidney diseases among patients and physicians. Dissonance between users' perspective and existing pragmatic trial design was identified. The association between risk of bias and pragmatism of study design was assessed.
Method: A 10-group semi-structured focus group interview series was conducted with participants [21 patients, 14 conventional medicine (ConM) and 15 Chinese medicine (CM) physicians] purposively sampled from private and public clinics in Hong Kong. Perspectives were qualitatively analyzed by constant comparative method. A systematic search of four databases was performed to identify existing IM pragmatic clinical trials in diabetes or kidney disease. Primary outcomes were the pragmatism, risk of bias, and rationale of the study design. Risk of bias and pragmatism were assessed based on the Cochrane risk-of-bias tool and PRECIS-2, respectively. The correlation between risk of bias and pragmatism was assessed by regression models with sensitivity analyses.
Results: The subtheme on the motivation to seek IM service was analyzed, covering the perceived limitation of ConM effect, perceived benefits of IM service, and assessment of IM effectiveness. Patients expected IM service to retard disease progression, stabilize concomitant drug dosage, and reduce potential side effects associated with ConM. In the systematic review, 25 studies from six countries were included covering CM, Korean medicine, Ayurvedic medicine, and western herbal medicine. Existing study designs did not include a detailed assessment of concomitant drug change and adverse events. The majority of studies either recruited a non-representative proportion of patients because traditional, complementary, and integrative medicine (TCIM) diagnosis was used as an inclusion criterion, or did not reflect the real-world practice of TCIM because TCIM diagnosis was dropped completely from the trial design. Consultation follow-up frequency was the least pragmatic domain. An increase in pragmatism was not associated with a higher risk of bias.
Conclusion: Existing IM pragmatic trial design does not match patients' expectations regarding the analysis of incident concomitant drug change and adverse events. A two-layer design incorporating TCIM diagnosis as a stratification factor maximizes the generalizability of evidence and real-world translation of both ConM and TCIM.

1. This is the first focus group series to explore the expected outcomes of patients and physicians regarding pragmatic trial design of IM for diabetes and renal service, involving patients and family medicine, internal medicine, and Chinese medicine (CM) physicians. Expectations unmet by existing studies were identified through systematic review.
2. Patients expected integrative Chinese-western medicine service to retard disease progression, stabilize concomitant drug dosage, and reduce potential side effects associated with conventional treatment.
3. Existing IM pragmatic trial designs did not include detailed assessment of concomitant drug change and adverse events. Consultation follow-up frequency is the least pragmatic domain in existing IM pragmatic trials.
4. The majority of studies either recruited a non-representative proportion of patients by using traditional, complementary, and integrative medicine (TCIM) diagnosis as an inclusion criterion, or did not reflect the real-world practice of TCIM because TCIM diagnosis was dropped completely.
5. In existing evidence, an increase in pragmatism in study design was not associated with a higher risk of bias.

INTRODUCTION
Pragmatic trials evaluate the effectiveness of interventions in the real-world setting, aiming to inform clinical decisions and implementation with better generalizability (1,2). Compared to conventional phase III randomized controlled trials, pragmatic trials often are open-label, have less stringent inclusion/exclusion criteria, involve complex/flexible interventions, compare to usual care, and measure outcomes that are patient-centered (1,2). Integrative medicine (IM) amalgamates conventional medicine (ConM) and other streams of medicine from a patient-centered and effectiveness-driven approach (3)(4)(5). Traditional, complementary, and integrative medicine (TCIM), including Chinese medicine (CM), naturopathic medicine, mind-body therapies, and other streams of medicine, is often personalized, as its theories were developed predominantly from expert consensus and case series (6). Differences in epistemology (for instance, disease classification and treatment strategy) between ConM and TCIM have led to controversies in the evaluation of TCIM's effectiveness (7)(8)(9)(10). Most clinical trials and meta-analyses were designed to estimate the adjusted or averaged effectiveness of a regimen from a population of patients. However, what a physician often needs more in the clinical situation is the likelihood that an individual patient, with distinctive demographics and phenotypes, will respond to a regimen (11)(12)(13). There are continuing concerns about the conventional evidence-based paradigm built on meta-analyses and randomized controlled trials with limited personalized design (e.g., prespecified subgroup analysis, responder analysis), such as being over-concentrated on population-based assessment (14,15), over-standardizing treatment (15,16), and lacking personalization (17). This affects the clinical utility of the evidence (18) and contradicts many core principles of TCIM. The effectiveness-driven approach, which focuses on comparative effectiveness, has been proposed to bridge ConM and TCIM (8)(19)(20)(21)(22).
Stakeholder (e.g., patients and physicians) engagement is the foundation of designing pragmatic studies (2,23). Stakeholder involvement in the study design stage, from the selection of disease condition, drug formulation, and outcome measurement, is increasingly emphasized to enhance the clinical utility and translation of evidence (18,24). Nevertheless, there are controversies over the pragmatic features (e.g., unblinding of subjects, no placebo control, intervention adjustment) as these flexibilities may enhance generalizability at the expense of internal validity of the evidence (25,26). The correlation between risk of bias and pragmatism remains unclear.
Diabetes was present in 9.5% of the adult population and accounted for 9.9% of all-cause mortality globally (27,28). The healthcare expenditure on diabetes amounted to US $850 billion worldwide in 2017, representing 11.6% of the total health expenditure (27,28). Both diabetes and kidney dysfunction are among the top 10 conditions contributing to disability-adjusted life-years in the population aged over 25 globally (29). In the past decade, CM formulations have been reported to protect against diabetes and chronic kidney disease (CKD) via orchestrated mechanisms (30)(31)(32)(33)(34)(35). However, fewer than 2% of diabetic patients in Hong Kong have ever used CM for diabetes or CKD, which is substantially lower than the utilization in other disciplines (e.g., 50% for cancer patients) (36). Lack of high-quality and communicable evidence has been suggested as one of the key obstacles in implementing IM (6).
This study aimed to collate and explore the expectations regarding the pragmatic trial design of IM for diabetes among patients and physicians. Subsequently, the existing trial design was systematically assessed to identify the dissonance with users' perspective.

Study Design
A 10-group semi-structured focus group interview series was conducted among patients and physicians with constant comparative method to explore their expectation regarding the IM management of diabetes in general (37). Seven high-level themes were previously identified from the interview series. Two themes regarding the barriers to access and the preferred delivery mode of health services were reported (6). In this study, we report another major theme related to pragmatic trial design. A systematic review was conducted subsequently to contrast existing IM pragmatic trials to the users' perspectives identified from the focus group interviews.

Focus Group Interview
The focus group interview series was designed to explore the expectations and concerns of the patients and physicians regarding IM service access and further research. Details of the interview methods were previously described (6). Briefly, 50 subjects (21 diabetes patients, 14 ConM physicians, and 15 CM physicians) with diverse demographics and experience were purposively sampled from public clinics, private clinics and teaching hospitals in Hong Kong. A series of face-to-face group interviews with three groups of 6-8 patients, three groups of 3-6 ConM physicians, and four groups of 3-4 CM physicians was conducted. Each interview lasted 60-120 min allowing at least 20 min per participant for adequate interaction. CM physicians were sampled to represent TCIM in Hong Kong as CM is the mainstream of TCIM, and integrative Chinese-western medicine is the major form of IM globally including Hong Kong (38).
The interviews were facilitated by a moderator (P.W.L.) with relevant experience and conducted in Cantonese (native language of participants). The identity of interviewees and the moderator was blinded before the interview took place. The interviews were built around participants' consultation experience, concerns and expectations based on a semi-structured interview guide (6). The process of recruitment, interview and analysis was iterative until data saturation during the last round of interview (patient and ConM: third round, CM: fourth round). Interview content was analyzed by constant comparative method (37). Maximum codes on main themes and subthemes were first generated independently by two bilingual investigators (K.W.C., P.W.L.) for initial open coding, with the data revisited to check for emerging ideas. The concepts and theories were refined, and the association of the coding was explored to form axial coding. Final core coding was formed after data saturation and was applied to index the whole dataset. Charted results were translated by a bilingual investigator (K.W.C.) when used as illustrative quotations. Data were processed with the support of simple software (Microsoft Word and Excel) for convenient access.

Search
We sought to assess the pragmatism, risk of bias, and rationale of study design of the existing pragmatic trials of diabetes and kidney disease using IM as intervention. The search strategy (Supplementary File 1) was formulated to include all IM pragmatic clinical trials and trial protocols that recruited patients with diabetes or kidney diseases and were published until 24 August 2020. IM included any intervention that is not conventionally used in clinical practice, for instance, herbal medicine, acupuncture, and massage. Four databases were searched including Cochrane, Medline, Embase and PubMed. Reference lists were also searched. A clinical epidemiologist (K.W.C.) led the search and data processing. Endnote X9 was used to aid the review process. Protocol registration: CRD42021231288.

Screening
After removing duplicated studies, screening started with title and abstract followed by full text before data extraction. All articles were dually screened, assessed, and extracted (Y.K.L, L.G.) independently with a standardized form. All disagreements were resolved by discussion and determined by K.W.C. if consensus could not be reached. There was no language restriction. All observational and qualitative studies were excluded. Studies that used health services or supplements as intervention were excluded.

Quality Assessment and Data Extraction
The co-primary outcomes were the pragmatism, risk of bias, and rationale of the study design. Pragmatism of the trials was assessed based on the PRECIS-2 tool (39) on study population, recruitment setting, intervention delivery, and outcome assessment. Risk of bias in randomization, allocation concealment, blinding, incomplete outcome data, and selective reporting was assessed based on the Cochrane risk-of-bias tool (40). The rationale of study design in target population, intervention, comparator, and outcome assessment was identified from the study.

FIGURE 1 | Motivation to seek integrative medicine (IM) service. Themes generally agreed upon by patients in yellow, by Chinese medicine (CM) physicians in blue, by conventional medicine (ConM) physicians in red, by both patient and CM physicians in green, by both patient and ConM clinician in orange, by all parties in black. Control of disease progression was the common perceived benefit of IM. Stabilizing ConM usage was emphasized by patients and acknowledged by CM physicians. Surrogate biomarkers were mutually accepted among patients and physicians. Importance on quality of life divided between patients and CM physicians.

Statistical Analysis
The correlation between risk of bias and pragmatism was assessed by univariable and backward multivariable regression analysis adjusting publication year and sample size. For the quantified assessment of the overall risk of bias of each study, the scores of low, unknown, and high risk were given 0, 1, and 2 points. Lower total score represents low risk of bias in the reported study design. For pragmatism, each domain scored 1 for being least pragmatic and 5 for being most pragmatic, respectively, according to the guideline from the PRECIS-2 tool. For domains that were not assessable, the score was replaced by 3 (midpoint). As there is no consensus on the statistical handling of undetermined domains, sensitivity analysis was conducted to replace undetermined domains by 1 and 5 to test the robustness of results. STATA 15.1 was used for analysis.
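The scoring scheme described above can be sketched in a few lines of Python. This is a minimal illustration rather than the analysis code used in the study: the function names and the example study are hypothetical, and summing the PRECIS-2 domain scores into one total is an assumption (the text specifies a total score only for risk of bias).

```python
# Points per risk-of-bias judgement, as described in the text.
ROB_POINTS = {"low": 0, "unknown": 1, "high": 2}


def risk_of_bias_score(judgements):
    """Sum the points across domains; a lower total means lower risk of bias."""
    return sum(ROB_POINTS[j] for j in judgements)


def pragmatism_score(domain_scores, fill=3):
    """Sum PRECIS-2 domain scores (1 = least pragmatic, 5 = most pragmatic).

    Domains that could not be assessed are passed as None and replaced by
    `fill`: 3 (midpoint) in the main analysis, 1 or 5 in the sensitivity
    analyses. Summing the domains is an assumption made for this sketch.
    """
    for s in domain_scores:
        if s is not None and not 1 <= s <= 5:
            raise ValueError("PRECIS-2 domain scores must lie between 1 and 5")
    return sum(fill if s is None else s for s in domain_scores)


# Hypothetical study, for illustration only (not taken from the review).
rob = ["low", "unknown", "low", "high", "unknown", "low"]
precis = [4, 5, None, 2, 3, None, 1, 4, 5]

print("risk of bias:", risk_of_bias_score(rob))
for fill in (3, 1, 5):  # main analysis, then the two sensitivity analyses
    print("pragmatism with fill =", fill, "->", pragmatism_score(precis, fill=fill))
```

Re-running the regression on the three versions of the pragmatism score (fill of 3, 1, and 5) is what tests whether the conclusion depends on how undetermined domains are handled.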

Focus Group Interviews
The majority of patients had poor glycemic control (71.4%), with stage 2-4 CKD (95.2%) and albuminuria (90.5%); 4.8% of patients had reached end-stage kidney failure. Among ConM physicians, 57.1% (n = 8/14) specialized in internal medicine, 42.9% (n = 6/14) specialized in family medicine or practiced as general practitioners, and 42.9% (n = 6/14) had received CM education. All (n = 15) CM physicians had received substantial credit-bearing ConM education during their undergraduate study. Seven high-level themes, namely, barriers toward IM service, motivation to seek CM service, background knowledge on diabetes, experience of CM service, preferred model of integrative service delivery, and evidence of IM and CM hospital, were previously identified, leading to 25 subthemes (6). One high-level theme, motivation to seek IM service, is related to clinical trial design and is reported in this study (Figure 1). Quotes are summarized in Table 1.

Main Theme: Motivation to Seek IM Service
Four subthemes related to the motivation of seeking IM service were identified, namely, (1) perceived limitation of ConM effect, (2) peer or media influence, (3) perceived benefits of IM service, and (4) assessment of IM effectiveness. Subthemes 1, 3, and 4 are relevant to study design and summarized below.

Subtheme: Perceived Limitation of ConM Effect
The majority of patients considered IM because they believed the effect of ConM was limited and were concerned about adverse effects after receiving ConM.

Limitation in ConM Efficacy
Most patients believed that diabetes and diabetic kidney disease (DKD) are irreversible, which was reflected by the limitation of the current regimens (41)(42)(43)(44). This prompted patients to explore alternatives for more options to control disease progression. Physicians from both ConM and CM acknowledged that patients generally prefer IM treatment. Majority of patients approached IM when they experienced disease progression, for example, poor blood glucose control, or developed complications including DKD.

ConM-Associated Adverse Effect
Patients mentioned their experience of developing adverse effects that they perceived to be ConM-associated. These included hypoglycemia, hyperkalemia, diarrhea, fluctuating blood glucose, and fatigue. The majority of patients believed that CM has fewer adverse effects than ConM. A similar observation was suggested by CM physicians.

Subtheme: Perceived Benefits of IM Service
There are several benefits that patients believed IM can offer, including better control of disease progression, prevention of ConM-associated side effect, and stabilizing ConM usage.

Better Control of Disease Progression
Patients sought better control of disease progression, for instance, reducing the risk of complications and increasing life expectancy, when they considered IM. DKD was highlighted as a major concern as patients were reluctant to receive dialysis. Some CM physicians suggested that patients of different age groups had different treatment targets. Older patients placed more emphasis on symptomatic improvement and quality of life, while younger patients focused on laboratory investigations. A few CM physicians suggested that CM emphasizes holistic improvement including both quality of life and biomarkers. Although patients reported feeling subjectively unwell after receiving ConM, symptomatic improvement did not emerge as a major expectation from patients. CM physicians, however, believed that improving quality of life would be a major concern among patients and an advantage of CM. ConM physicians suggested control of renal function deterioration as an important milestone of complication management; however, they emphasized that more evidence is needed to demonstrate such an effect of CM.

Stabilizing ConM Usage and Preventing the Associated Adverse Effects
Reducing ConM dependence was one of the common expectations of patients. Some CM physicians reported similar requests encountered in their clinical practice. This is likely because patients linked the use of ConM with disease progression and adverse effects. A minority of patients expected CM to reduce the adverse effects of ConM. Some CM physicians suggested that they have managed ConM-associated adverse events.

FIGURE 2 | Flow diagram of systematic review. Four databases and search engines were searched and 25 papers were included. Interventions included acupuncture/acupressure (n = 7), herbal products (n = 14), massage-related (n = 2), qigong (n = 1) and combined acupuncture-herbal (n = 1) treatment. The treatment was formulated according to Chinese (n = 20), Kampo (n = 2), Korean (n = 1), Ayurvedic (n = 1), and western herbal (n = 1) medicine.

Subtheme: Assessment of IM Effectiveness
Patients generally focused on objective conventional biomarkers measured by laboratory investigations for the monitoring of treatment effect, which was supported by the ConM clinicians. Some CM physicians also believed that objective markers were important for their self-evaluation of treatment effect, as DKD is a ConM-defined condition. They also expected the patients would evaluate their treatment based on laboratory investigation results.
A substantially divergent opinion was noted among other CM physicians, who suggested that current biomarkers should not be the only outcome assessment. They believed that CM manages patients' general condition while treating DKD, and that DKD-related biomarkers reflect only a certain aspect of patients' overall condition. They suggested the concurrent use of CM-related outcome measures, which are phenome-based (e.g., change in symptoms, tongue color and pulse form).
Some ConM physicians acknowledged the difference in the epistemology between CM and ConM and suggested that CM may require different outcome measures. Nevertheless, ConM physicians generally believed that it would be an advantage if the effect of CM can be demonstrated with study designs conventionally used in ConM. There was also a suggestion to personalize the assessment of effect based on the patients' preference which is related to their demographics.

Systematic Review
Our search identified 303 studies from four databases after removing duplicated studies (Figure 2). A total of 264 studies were excluded by title and abstract screening and 14 studies were excluded after full-text screening (Supplementary File 2S). A total of 25 trials were included for analysis.
All DKD-related studies used urine albumin/protein and/or estimated glomerular filtration rate (GFR) as primary outcomes. All CKD-related studies assessed estimated GFR as the primary outcome. For hemodialysis-related studies, majority (4/5) assessed quality of life or symptom as primary outcomes. All studies described adverse events narratively. No studies measured the change of concurrent medication as primary or secondary outcomes. Nine, 12, two and two studies used standard care, placebo or sham acupuncture, both standard care and placebo, and other active intervention (e.g., other TCIM medication, active exercise) as comparators, respectively.

Risk of Bias, Pragmatism and the Association
Majority (22/25) of studies reported unclearly in at least one domain of potential bias (Figure 3). Twelve studies had unclear description on handling of attrition that led to undetermined bias on completeness of outcome measurement. Four studies were with high risk of bias in at least one domain. The main source of high-risk bias was from the blinding of outcome assessment (n = 3) and allocation concealment (n = 2).
In terms of pragmatism, the eligibility and outcome measurement of most trials were close to the target population with limited exclusion criteria (Figure 3). The outcome measurement was mostly relevant to the target population with clinical significance, for instance, the measurement of estimated GFR among DKD patients and quality of life among dialysis patients. The setting of trials was less pragmatic as most trials required additional expertise to execute on top of existing infrastructure. The follow-up duration was also less practical as the interventions required substantially more frequent service attendance. The reporting on recruitment strategy and adherence control was not clear enough to assess the degree of pragmatism. No positive correlation was observed between the risk of bias and pragmatism of the included studies (R² = 0.0215, β = −0.116, p = 0.484) (Figure 4). The result was comparable in sensitivity analysis with imputation on undetermined domains in pragmatism (Supplementary File 3S). Replacing undetermined domains in the assessment of pragmatism with the lowest value resulted in a negative correlation (R² = 0.176, β = −0.277, p = 0.037). Replacement with the highest value did not result in a significant correlation (R² = 0.035, β = 0.129, p = 0.374). The rationale of study design parameters was uncommonly reported. One study used estimated GFR as primary outcome based on the conventional practice of other studies. No study included or referred to stakeholder analysis in justifying the study design.
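As a rough arithmetic illustration of how an R² value and a p-value relate in this kind of analysis, the sketch below assumes (this is not stated in the text) that each reported pair comes from a single-predictor linear regression fitted to the 25 included studies. Under that assumption the slope's t statistic satisfies t² = R²(n − 2)/(1 − R²) with n − 2 degrees of freedom. The function names are hypothetical, and the t distribution is integrated numerically so that only the Python standard library is needed.

```python
import math


def t_two_sided_p(t, df, steps=2000):
    """Two-sided p-value for a t statistic, by Simpson integration of the t density."""
    t = abs(t)
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return c * (1 + x * x / df) ** (-(df + 1) / 2)

    h = t / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3  # P(0 < T < t)
    return 1 - 2 * area


def p_from_r2(r2, n):
    """p-value of the slope in a single-predictor regression, given R² and n."""
    df = n - 2
    t = math.sqrt(r2 * df / (1 - r2))
    return t_two_sided_p(t, df)


# R² values quoted in the paragraph above; n = 25 is the number of included trials.
for r2 in (0.0215, 0.176, 0.035):
    print(f"R2 = {r2}: p = {p_from_r2(r2, 25):.3f}")
```

If the single-predictor assumption holds, the printed p-values should land close to the reported ones; adjusted (multivariable) models would not follow this simple relation.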

DISCUSSION
Patients expected IM service to retard disease progression, stabilize the dosage of concomitant drugs (any medications given to the patients other than the investigational article), and reduce potential side effects associated with conventional treatment. Existing study designs, however, did not include a detailed assessment of concomitant drug change or adverse events. Consultation follow-up frequency is the least pragmatic domain in existing studies. An increase in pragmatism in study design was not associated with a higher risk of bias.

Outcome Measures on the Change of Concomitant Drug and Adverse Events
From the focus group interviews, patients expected IM service could retard disease progression, stabilize the use of concomitant drugs, and lower the risk of having adverse events associated with conventional treatment. Surrogate biomarkers were mutually accepted among patients and physicians. Most reviewed pragmatic DKD studies used GFR and urine albumin/protein to measure the change of renal function which addressed both patients' and physicians' preference (6). Nevertheless, no study in the review reported the change of concomitant regimen as primary or secondary outcomes. Pragmatic trials often involve open-label design to better replicate real-world application. The potential bias in delivering intervention due to unblinding could be adjusted or assessed by mediation analysis on the dynamic change of concomitant regimens. Besides, as clinicians often adjust concomitant drugs to achieve or maintain targets of disease control (e.g., lowering hemoglobin A1c to below 7.0% or lowering systolic blood pressure to below 130 mmHg) in chronic conditions, the change in concomitant drugs could better reflect the disease progression than that of biochemical parameters, which is well-noted by patients in the focus group interviews. While most existing studies included analysis of adverse events, the data collection and assessment methods were unclear, and the reporting was often limited to narrative analyses. Further pragmatic studies should include the change of concomitant regimen as outcome measures and consider performing more systematic and in-depth quantitative analyses (e.g., survival analysis) on the incidence of adverse events.
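To make the suggestion of survival analysis on adverse events concrete, here is a minimal sketch of a Kaplan-Meier estimate of adverse-event-free probability. Kaplan-Meier is one common choice and is used here only as an example; the function name and the follow-up data are hypothetical and not drawn from any reviewed trial.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate of event-free probability.

    times  -- follow-up time of each subject (e.g., months)
    events -- 1 if the adverse event occurred at that time, 0 if censored
    Returns a list of (time, event_free_probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all subjects whose follow-up ends at the same time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            surv *= 1 - n_events / at_risk
            curve.append((t, surv))
        at_risk -= n_leaving
    return curve


# Hypothetical follow-up data (months) for five subjects in one arm.
months = [2, 3, 3, 5, 8]
had_event = [1, 1, 0, 1, 0]
for t, s in kaplan_meier(months, had_event):
    print(f"month {t}: event-free probability {s:.2f}")
```

Unlike a narrative list of events, this kind of estimate accounts for when each event occurred and for subjects who left follow-up early, which is what allows arms to be compared formally.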

Better Adherence by Reducing Intervention and Consultation Follow-Up Frequency
Among the existing studies, the frequency of add-on oral TCIM medication intake was often three times daily. Since the TCIM-ConM drug interaction is a common concern among ConM physicians, add-on oral TCIM medication is commonly taken separately from ConM (6). Therefore, existing IM study protocols require patients to take medication five to six times per day. Besides, most existing IM acupuncture programs require three consultation follow-ups per week. We previously demonstrated that convenience of access is a key barrier to IM service implementation (6). Strategies to reduce the frequency of oral TCIM medication intake and integrate TCIM service delivery into the workflow of ConM would be important to enhance service utilization and compliance.

FIGURE 4 | Correlation between risk of bias and pragmatism in existing study designs. The risk of bias and pragmatism were assessed according to the Cochrane risk-of-bias tool and PRECIS-2 tool. Higher score corresponds to higher risk of bias and more pragmatism. Sample size is presented as the size of circle. There is no statistically significant correlation between risk of bias and pragmatism in either the unadjusted or the adjusted (publication year and sample size) model. Result is robust in sensitivity analyses replacing undetermined domains with extreme values.

Using Add-On Design With Standard Care Comparator to Inform Integrative Practice
Most existing studies used standard care or placebo/sham acupuncture as comparator. While placebo minimizes various kinds of bias, it is not an ideal control in pragmatic trial design as patients are neither blinded nor receiving placebo in real-world practice (1). Furthermore, our focus group series shows that both patients and clinicians focus on the add-on effect of TCIM. The add-on effect would be difficult to assess if other active interventions are used as comparator.
N-of-1 design is advocated in pragmatic trial to evaluate programs with individualized intervention (45). TCIM, including CM, strongly emphasizes personalization with tailor-made treatment and each patient would be an ideal self-control (6). However, the assumption underpinning N-of-1 design is that the intervention would not have a long-term effect after cessation. This assumption is contradictory to the theory of many streams of TCIM which consider that TCIM can restore the balance of human constitution and therefore offers a long-term healing effect (6,46). As the latent effect of TCIM is often a subject of interest, the wash-out period of N-of-1 trial needs to be long enough and should be justified by pilot studies.

Implementation Challenges on Using TCIM Diagnosis as Inclusion Criteria
Five studies from our systematic review included TCIM-specific symptom-based diagnosis in the inclusion/exclusion criteria. Some streams of TCIM, for instance, CM, have a different epistemology compared to ConM, including disease stratification (6). CM defines disease predominantly according to phenotype. We previously demonstrated that add-on symptom-based diagnosis independently predicts renal progression among diabetic patients (47). Using standardized treatment across a study population with different CM-specific diagnoses is not personalized and is contradictory to CM practice (6). As pragmatic trials are designed to reflect and inform real-world practice, CM-specific diagnosis is necessary in defining CM subgroups for intervention and assessment.
However, evidence generated from a specific subgroup of patients based on CM diagnosis may not be generalizable to the whole disease population (Figure 5) (48). For example, a formulation effective among diabetes patients that presented with qi-yin deficiency may not be effective among those without qi-yin deficiency, and therefore, the evidence has limited external validity to the whole diabetes population. As majority of ConM physicians are not trained in CM, evidence from trials that only recruited a subset of patients defined by symptoms could not inform ConM physicians' decision in referring patients for IM service.
To facilitate the implementation of evidence to IM service, we propose not to include TCIM-specific diagnosis in the inclusion/exclusion criteria of IM pragmatic trials to maximize the representation of the study population of interest (49). TCIM-specific diagnosis can be included as a stratification factor in randomization instead to generate TCIM-specific subgroups for analysis (Figure 5). By combining all subgroups, which together represent the whole disease population, the primary analysis evaluates the overall effectiveness of a TCIM service program that is executed according to TCIM real-world practice (49). The main analysis informs ConM physicians on whether to make necessary referral to IM service. Subgroup analysis stratified according to TCIM theory evaluates the effectiveness of different treatments given to each TCIM-specific subgroup. The subgroup analyses inform TCIM physicians on the choice of modalities from a personalized perspective. This two-layer design maximizes the generalizability of evidence and translation to real-world practice for both ConM and TCIM physicians.
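One way to picture the stratification step of the two-layer design is permuted-block randomization carried out separately within each TCIM diagnosis stratum. The sketch below is illustrative only: the function name, arm labels, block size, and the "other pattern" stratum are hypothetical choices, and the text does not prescribe a particular randomization algorithm.

```python
import random
from collections import defaultdict


def stratified_block_randomize(participants, arms=("IM add-on", "standard care"),
                               block_size=4, seed=0):
    """Assign arms by permuted blocks within each stratum.

    participants -- list of (participant_id, stratum) in order of enrolment,
                    where stratum is the TCIM diagnosis recorded at baseline
    Returns a dict mapping participant_id to the allocated arm.
    """
    if block_size % len(arms):
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)
    pending = defaultdict(list)  # stratum -> allocations left in the current block
    allocation = {}
    for pid, stratum in participants:
        if not pending[stratum]:
            block = list(arms) * (block_size // len(arms))
            rng.shuffle(block)
            pending[stratum] = block
        allocation[pid] = pending[stratum].pop()
    return allocation


# Hypothetical enrolment list: everyone with the disease is eligible, and the
# TCIM diagnosis is used only as a stratum, not as an inclusion criterion.
people = [(i, "qi-yin deficiency") for i in range(8)]
people += [(i, "other pattern") for i in range(8, 12)]
alloc = stratified_block_randomize(people, seed=1)
for stratum in ("qi-yin deficiency", "other pattern"):
    ids = [pid for pid, s in people if s == stratum]
    n_im = sum(alloc[pid] == "IM add-on" for pid in ids)
    print(stratum, "-> IM add-on:", n_im, "standard care:", len(ids) - n_im)
```

Because arms are balanced within every stratum, the pooled comparison addresses the whole disease population while each stratum still supports its own subgroup comparison.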

Strategies to Maximize Reproducibility and Internal Validity in Pragmatic Trials
Although there are concerns over the trade-off between pragmatism and internal validity, our analysis showed that there is no positive correlation between risk of bias and pragmatism in existing study designs. Bias from randomization, allocation concealment, outcome assessment, and reporting in pragmatic trials can be controlled similarly to conventional trial designs (2). However, the interventions evaluated by pragmatic trials are often programs requiring flexibility, and their reproducibility is scrutinized (1). Although an unrestricted replicate of real-world practice best produces evidence on effectiveness for implementation, the protocol may be neither applicable to nor reproducible in other clinical settings as high-quality standardized diagnostic instruments are lacking (50)(51)(52). For instance, the CM symptom-based diagnosis and personalized treatment in diabetes involve subjective professional judgment and likely differ between CM physicians. Although objective biomarkers may serve as alternative diagnostics, subjective symptom measures have been consistently demonstrated to correlate significantly and independently with long-term clinical outcome (47,53,54) and have unique clinical value in patient-centered care (11).
To enhance the validity and reproducibility, symptom-based diagnosis and the corresponding variations in treatment should be pre-specified in a semi-individualized manner (49). Instead of diagnosing and treating patients purely by professional judgment that gives rise to unlimited combinations, patients can be divided by a predefined number of groups based on TCIM diagnosis with prespecified criteria. The treatment plan can be prespecified accordingly with clear instructions on adjustment. An alternative approach is to randomize or stratify the factor causing these variations, in most cases, the physician deciding the diagnosis and treatment. The potential confounding effect from different physicians can therefore be balanced between arms. However, a large cohort of subjects is needed for this method.
Non-uniform observation periods are another commonly encountered challenge in pragmatic trial design. Most clinical trials would consider terminating a subject's participation when serious adverse events develop because of clinical need and ethical concerns, especially for patients receiving the intervention in an open-label design. As pragmatic trials often use standard care as the control, subjects receiving standard care can be observed continuously without disturbing clinical management when serious adverse events develop. The imbalance in the length of observation between arms may confound outcome assessment, especially for trials involving a long observation period and a high incidence of serious adverse events, for instance, diabetes and CKD trials (55). Standardized termination criteria across arms upon development of serious adverse events can balance the observation length. In addition, using the slope of change instead of absolute change for quantitative outcomes, and the incidence rate instead of incidence for count outcomes, can also minimize confounding from non-uniform follow-up.
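The two follow-up-adjusted measures mentioned above can be sketched as follows. This is a minimal illustration assuming that per-subject event counts, follow-up durations in years, and repeated measurements are available. The function names and all numbers are hypothetical and are not data from the included studies.

```python
def incidence_rate(events, person_years):
    """Events per person-year: total events divided by total follow-up time."""
    total_time = sum(person_years)
    if total_time <= 0:
        raise ValueError("total follow-up time must be positive")
    return sum(events) / total_time


def slope_of_change(times, values):
    """Ordinary least-squares slope of a quantitative outcome over time."""
    n = len(times)
    if n < 2 or n != len(values):
        raise ValueError("need at least two paired observations")
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("observation times must not all be identical")
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    return sxy / sxx


# Hypothetical arms with the same event count but different follow-up length.
rate_short = incidence_rate(events=[1, 0, 1], person_years=[1.0, 1.0, 2.0])
rate_long = incidence_rate(events=[1, 0, 1], person_years=[2.0, 3.0, 3.0])

# Hypothetical outcome measured at years 0, 1 and 2 for one subject.
yearly_slope = slope_of_change([0, 1, 2], [60.0, 58.0, 56.0])
```

In this hypothetical example both arms record two events, but the arm observed for twice as long has half the incidence rate, which a simple comparison of event counts would miss. A per-subject slope likewise expresses change per unit of time rather than over whatever period the subject happened to be observed.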

Quality of Reporting
Overall, the quality of reporting of the included studies was suboptimal, often with limited information for assessing the completeness of outcome reporting. Prospective trial registration and/or protocol publication, with outcome measurements clearly prespecified before completion of the study, can increase the transparency of outcome reporting. The handling of missing values in the statistical analysis was also unclear. The use of less biased statistical methods for handling attrition (e.g., mixed regression models) with sensitivity analyses could enhance the internal validity of the results. Several studies had a high risk of bias in outcome assessment because assessors were not blinded. Although pragmatic trials are often open-label among subjects and investigators, blinding of the outcome assessor (e.g., by using an independent laboratory/assessor) is critical to reduce potential observer bias in outcome assessment.

Strengths and Limitations
This is the first focus group series to explore the specific expectations of patients and physicians regarding IM diabetes and renal service, involving patients and family medicine, internal medicine, and CM physicians. A mixed-method approach was used in this study. The expectations of stakeholders were qualitatively explored to maximize the finding of mechanisms, and the status quo of clinical trial design was evaluated objectively and systematically with quantitative methods. This study has several limitations. As the focus group series focused on identifying detailed expectations of integrative Chinese-western medicine diabetes and CKD management, findings could be context specific (6). Nevertheless, CM is the mainstream of TCIM, and most of the papers identified from the systematic review used CM as the intervention. Also, focus group interviews only delineate possible mechanisms of behavior. Further quantitative studies, including surveys, are needed to quantify the magnitude of the concerns and test the generalizability to other diseases. The priority of recommendations on study design could be assessed by further consensus methods and surveys involving an extended scope of stakeholders (e.g., caregivers) (56, 57). In the systematic review, the lack of detailed reporting on methodology, partly attributable to journal word limits, impeded the accuracy of assessment. The correlation analysis between risk of bias and pragmatism is likely underpowered, although all IM pragmatic trials were included. The estimated correlation therefore reflects only the association in the best evidence currently available. Lastly, the assessment in the systematic review only evaluates the quality of trial design through reporting and may not reflect the true quality of trial execution, especially for study protocols.

CONCLUSION
Patients expected IM service to retard disease progression, stabilize concomitant drug dosage, and reduce potential side effects associated with conventional treatment, which were not reflected in existing study designs. Further pragmatic studies should consider more systematic and in-depth quantitative analyses of incident concomitant drug change and adverse events. The majority of studies either recruited a non-representative proportion of patients because TCIM diagnosis was used as an inclusion criterion, or did not reflect the real-world practice of TCIM because TCIM diagnosis was dropped completely. A two-layer design incorporating TCIM-specific symptom-based diagnosis as a stratification factor maximizes the generalizability of evidence and translation to real-world practice for both ConM and TCIM physicians.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by The University of Hong Kong/Hospital Authority Hong Kong West Cluster Institutional Review Board and Hong Kong East Cluster Research Ethics Committee. The patients/participants provided their written informed consent to participate in this study. The funding organizations had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.