An Overview of Systematic Reviews of Chinese Herbal Medicine for Parkinson's Disease

Parkinson's disease (PD) is a highly prevalent neurodegenerative disorder for which no disease-modifying therapy exists. A number of systematic reviews have evaluated the efficacy and safety of Chinese herbal medicine (CHM) for PD patients. Here, we conducted an overview to assess the methodological and reporting quality of these systematic reviews, and then to synthesize and grade the available evidence on CHM for PD. Six databases were searched from inception to September 2018. Studies were selected and data were extracted according to prespecified criteria. A Measurement Tool to Assess Systematic Reviews (AMSTAR) was used to evaluate methodological quality, and Grading of Recommendations Assessment, Development, and Evaluation (GRADE) was used to determine the evidence quality of the primary outcome measures. A total of 11 systematic reviews covering 230 RCTs of CHM for PD were included. AMSTAR scores of the included reviews ranged from 4 to 9. Compared with western conventional medicine (WCM), CHM paratherapy showed a significant effect in improving UPDRS score, Webster scale score, PDQ-39, NMSQuest, CHM Syndrome Integral Scale, and PDSS. However, CHM monotherapy showed no difference relative to WCM on various outcome measures. Adverse events were reported in 9 systematic reviews. Side effects in the CHM paratherapy group were generally fewer or milder than those in the WCM group. The quality of evidence for the primary outcomes was moderate (42%) to high (54%) according to the GRADE profiler. The present findings support the use of CHM paratherapy for PD patients, although the evidence should be treated cautiously because of methodological flaws; there is insufficient evidence for CHM monotherapy for PD.


INTRODUCTION
Parkinson's disease (PD) is a common, chronic, and progressive neurodegenerative disorder that results from the progressive loss of dopaminergic neurons in the substantia nigra and produces both motor symptoms and non-motor symptoms (NMS) (Bohnen and Albin, 2011). Although the biochemical and molecular pathogenesis of this neuronal loss is not yet fully understood, it is thought to involve oxidative stress, mitochondrial dysfunction, glutamate-mediated excitotoxicity, and inflammation (Hirsch et al., 2013; Mullin and Schapira, 2015). Currently, there is no proven disease-modifying therapy for PD. Levodopa, the conventional treatment for PD, provides only symptomatic relief and is commonly associated with levodopa-related motor fluctuations or dyskinesia. Thus, an increasing number of PD patients resort to complementary and alternative medicine (CAM); the prevalence of CAM use for PD is estimated at 25.7 to 76% according to epidemiological data from seven countries (Wang et al., 2013; Pan et al., 2018).
Traditional Chinese medicine (TCM), one of the main forms of CAM, has played an indispensable role in the medical care of PD patients in China for thousands of years and is now used worldwide (Zheng, 2009; Wang et al., 2011, 2013). Chinese herbal medicine (CHM) is the main pharmacological therapy of TCM. Herbal extracts and their biocompounds exert antioxidant, anti-apoptotic, and anti-inflammatory effects, which contribute to avoiding neuronal loss, acting on the biosynthesis of dopamine and its metabolites, and preventing hypersensitivity of D2 receptors (da Costa et al., 2017). In past years, a number of systematic reviews have evaluated the potential therapeutic benefits of CHM for PD (Chung et al., 2006; Kim et al., 2012; Wang et al., 2012; Huo and Yu, 2014; Wen et al., 2014; Zhang et al., 2014, 2015; Cui and Liu, 2015; Zhang, 2015; Wei et al., 2017; Shan et al., 2018), but their conclusions are inconsistent because of the quality of the primary studies or methodological flaws. An overview of systematic reviews (SRs) is a novel tool that addresses a specific, focused question relevant to policy or practice and synthesizes evidence from multiple SRs into a single, useful document that can guide health care professionals and policy makers (Thomson et al., 2010; Baker et al., 2014). Thus, we conducted an overview to critically assess the methodological and reporting quality of SRs, and then to synthesize and grade the available evidence on CHM for PD.

Search Strategy
The following electronic databases were searched from inception to September 2018 without language restrictions: PubMed, EMBASE, Web of Science, China National Knowledge Infrastructure, VIP Journals Database, and Wanfang Med Online Database. The keywords used were as follows: "Traditional Chinese Medicine OR herbal medicine" AND "systematic review OR meta-analysis" AND "Parkinson's Disease" (Parkinson's Disease as a MeSH term). For the Chinese databases, the equivalent search terms were used in Chinese. The search strategy developed for PubMed was modified to suit the other databases.

Eligibility Criteria
Type of study: We included SRs of CHM for PD that met the following criteria: (1) evaluated the effects of CHM on PD compared with western conventional medicine (WCM); (2) provided a clear definition of the clinical question, eligibility criteria, and search strategy; (3) reported at least one result for a desired outcome. SRs with insufficient information in the methods section, quality evaluations, and methodological studies were excluded. Type of participants: Participants were of any age or sex with a confirmed diagnosis of PD based on at least one of the following criteria: (1) the UK Brain Bank criteria (Hughes et al., 1992); (2) the Chinese National Diagnosis Standard (CNDS) for PD of 1984 (Wang, 1985); (3) the updated 2006 version of the CNDS for PD (Zhang, 2006); (4) other comparable formal criteria.
Type of intervention: CHM or CHM paratherapy was used in the treatment groups, regardless of the form of the drug, dosage, frequency, or duration of treatment. Comparator interventions were placebo or WCM.
Type of outcome measures: The primary outcomes were total Unified Parkinson's Disease Rating Scale (UPDRS) score, Webster scale, Parkinson's Disease Questionnaire-39 (PDQ-39), and Non-motor Symptoms Questionnaire (NMSQuest). The UPDRS is the major rating scale for assessing the severity of PD symptoms. It consists of four parts: Part I (mentation, behavior, and mood) addresses mental dysfunction and mood; Part II (activities of daily living, ADL) assesses motor disability; Part III (motor section) evaluates motor impairment; Part IV (complications) assesses treatment-related motor and non-motor complications. The secondary outcomes were Parkinson's Disease Sleep Scale (PDSS), Hamilton depression rating scale (HAM-D), CHM syndrome integral scale, the 36-Item Short Form Health Survey (SF-36), and adverse reactions.

Study Selection and Data Collection
Two investigators (XC-J and LZ) independently screened titles and abstracts to select potentially relevant references. Full articles were obtained for potentially eligible studies, and the two investigators independently read them in full and made a final decision. The data collected from the studies included author name, year of publication, country of first author, number of primary studies and participants, overall conclusion, whether a meta-analysis was performed, and outcome measures. Disagreements between the two investigators were resolved by discussion with a third author. If critical data were missing or were presented only graphically, we tried to contact the authors for further information or calculated the values ourselves where possible.

Assessing the Quality of SRs
A Measurement Tool to Assess SRs (AMSTAR) (Shea et al., 2007), which consists of 11 items, was used to evaluate the methodological quality of all included SRs. For each item, a judgement of "Yes," "No," "Can't answer," or "Not applicable" was assigned according to the AMSTAR judgement criteria. The number of "Yes" judgements was counted as the total AMSTAR score. A total score of 4 or less was considered to indicate low quality, a total score of 5 to 8 moderate quality, and a total score of 9 or more high quality (Monasta et al., 2010; Jaspers et al., 2011). Each SR was assessed by two researchers (XC-J and LZ) independently, and any disagreements were resolved by discussion with a third author (GQZ).
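The scoring rule described above can be illustrated with a minimal Python sketch. The function names `amstar_total` and `amstar_quality` are hypothetical and were not part of the original assessment; the thresholds are those stated in the text (4 or less, 5 to 8, 9 or more).

```python
def amstar_total(judgements):
    """Count the "Yes" judgements across the 11 AMSTAR items (hypothetical helper)."""
    if len(judgements) != 11:
        raise ValueError("AMSTAR has 11 items")
    return sum(1 for j in judgements if j == "Yes")


def amstar_quality(score):
    """Map a total AMSTAR score (0-11) to the quality band used in this overview."""
    if not 0 <= score <= 11:
        raise ValueError("AMSTAR total score must be between 0 and 11")
    if score <= 4:
        return "low"
    if score <= 8:
        return "moderate"
    return "high"


# Illustrative judgements only, not taken from any included SR.
example = ["Yes"] * 7 + ["No"] * 3 + ["Can't answer"]
print(amstar_total(example), amstar_quality(amstar_total(example)))
```

Only "Yes" counts toward the total, so "Can't answer" and "Not applicable" are treated the same as "No" for scoring purposes.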

Assessing the Quality of Evidence
For the primary outcome measures with detailed information, two researchers (XC-J and LZ) independently used GRADE (Guyatt et al., 2008) to evaluate the quality of evidence, following the GRADE handbook, and disagreements were resolved by a third author (GQZ). GRADE classifies the quality of evidence into four levels: high, moderate, low, and very low. We judged evidence as high quality when we were highly confident that the true effect lay close to the estimate of the effect; as moderate quality when we considered that the true effect was likely to be close to the estimate of the effect, but there was a possibility that it was substantially different; and as low or very low quality when the true effect might be substantially different from the estimate of the effect.

Data Synthesis
A narrative description of the included SRs was conducted. Review-level summaries for all the primary and secondary outcomes from the included SRs were tabulated. We extracted and reported pooled effect sizes when outcomes were meta-analyzed within an SR. If there was no quantitative pooling of effect sizes, we reported results using standardized language indicating the direction of effect and statistical significance. Risk ratio (RR) with 95% confidence interval (CI) was used to summarize dichotomous outcomes, while weighted mean difference (WMD) or standardized mean difference (SMD) with 95% CI was used to summarize continuous data. The heterogeneity reported in each included SR, as detected by the I² and Chi² tests, was also summarized and analyzed.
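For readers unfamiliar with these quantities, the following Python sketch shows how a pooled mean difference, its 95% CI, Cochran's Q (the Chi² heterogeneity statistic), and I² relate to one another. It assumes fixed-effect inverse-variance pooling, which is an assumption made for illustration; the included SRs may have used other models. The function name and the input values are hypothetical and are not data from any included SR.

```python
import math


def pool_fixed_effect(effects, std_errors):
    """Inverse-variance fixed-effect pooling with Cochran's Q and I^2.

    effects: per-trial mean differences; std_errors: their standard errors.
    Returns (pooled effect, (CI lower, CI upper), Q, I^2 in percent).
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total_w = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_w
    pooled_se = math.sqrt(1.0 / total_w)
    ci = (pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se)
    # Cochran's Q: weighted squared deviations from the pooled effect.
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    # I^2 = max(0, (Q - df) / Q) * 100, defined as 0 when Q is 0.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, ci, q, i2


# Hypothetical trial-level inputs, for illustration only.
pooled, ci, q, i2 = pool_fixed_effect([0.0, 4.0], [1.0, 1.0])
print(pooled, q, i2)
```

An I² below 50% is the threshold this overview treats as acceptable heterogeneity when describing the included SRs.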

Description of the Screening Process
A total of 99 studies were retrieved, of which 22 were excluded as duplicates. After screening of titles and abstracts, 66 studies were excluded because they were not relevant to efficacy for PD, were not relevant to CHM, were not SRs, or combined CHM with other TCM therapeutic modalities. Ultimately, 11 eligible studies were included in the present study. The screening process is presented in a flow diagram (Figure 1).
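The counts reported above are internally consistent, as the short Python check below shows. The variable names are illustrative; the numbers are those given in the text.

```python
retrieved = 99                    # records identified by the searches
duplicates = 22                   # removed as duplicates
excluded_on_title_abstract = 66   # excluded after title/abstract screening

after_dedup = retrieved - duplicates
included = after_dedup - excluded_on_title_abstract

print(after_dedup, included)  # 77 11
```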

Study Characteristics
Eleven SRs with 230 randomized controlled trials (RCTs) were included in the present study. Ten SRs were published journal articles, and one was an academic dissertation (Zhang, 2015). Four SRs were published in Chinese (Huo and Yu, 2014; Wen et al., 2014; Cui and Liu, 2015; Zhang, 2015) and the other 7 in English. The SRs were published between 2006 and 2018, 8 of them in the most recent 5 years. The first authors of 10 studies were from China and affiliated with academic institutions, while the first author of one study (Kim et al., 2012) was from Korea. All included SRs evaluated the efficacy of CHM for PD. Two studies (Kim et al., 2012; Shan et al., 2018) compared CHM with placebo. Four studies (Chung et al., 2006; Kim et al., 2012; Huo and Yu, 2014; Wen et al., 2014) compared CHM therapy with WCM. Ten studies compared CHM paratherapy with WCM (Chung et al., 2006; Kim et al., 2012; Wang et al., 2012; Wen et al., 2014; Zhang et al., 2014, 2015; Cui and Liu, 2015; Zhang, 2015; Wei et al., 2017; Shan et al., 2018). The number of RCTs included in the SRs ranged from 9 to 64. The overall quality of the primary studies was poor according to the Jadad score (Huo and Yu, 2014; Wen et al., 2014; Cui and Liu, 2015; Zhang, 2015) or the Cochrane risk of bias tool (Chung et al., 2006; Kim et al., 2012; Wang et al., 2012; Zhang et al., 2014, 2015; Wei et al., 2017; Shan et al., 2018). Nine SRs conducted a meta-analysis, while the other 2 (Chung et al., 2006; Kim et al., 2012) did not. The characteristics of the included SRs are summarized in Table 1.

Description of the CHM Formulas and High-Frequency Used Herbs
Eight of the 11 SRs summarized the CHM formulas used and reported a wide range of them. A total of 52 CHM formulas were used in these studies. The 3 most frequently used formulas were Bushen Huoxue granule, Guiling Pa'an granule, and Xifeng Dingchan granule. The 10 most frequently used herbs for PD in the included studies were Rhizoma Ligustici Chuanxiong, Radix Paeoniae Alba, Rhizoma Acori Tatarinowii, Radix Angelicae Sinensis, Fructus Corni, Radix Polygoni Multiflori, Radix Changii, Rhizoma Coptidis, Rhizoma Gastrodiae, and Radix Glycyrrhizae. The details of these 10 herbs are summarized in Table 2.

Assessing the Quality of SRs
AMSTAR was used to evaluate the methodological quality of the included SRs. None of the included SRs was registered in advance or provided a list of included and excluded studies. One SR (Huo and Yu, 2014) did not perform a comprehensive literature search, 2 SRs (Huo and Yu, 2014; Wen et al., 2014) did not search gray literature, and 3 SRs (Chung et al., 2006; Kim et al., 2012; Wang et al., 2012) neither assessed and documented the scientific quality of the included studies nor used it appropriately in formulating conclusions. Two SRs (Chung et al., 2006; Wen et al., 2014) did not appropriately explain the findings of studies, 3 SRs (Chung et al., 2006; Wen et al., 2014; Cui and Liu, 2015) did not assess the likelihood of publication bias, and 6 SRs (Chung et al., 2006; Huo and Yu, 2014; Wen et al., 2014; Zhang et al., 2014; Cui and Liu, 2015; Zhang, 2015) did not state conflicts of interest. In terms of overall scores, 3 SRs were of high quality, each scoring 9 points on AMSTAR (Zhang et al., 2015; Wei et al., 2017; Shan et al., 2018); one was of low quality, scoring 4 points (Chung et al., 2006); and the remaining 7 were of moderate quality. Among the latter, 3 SRs scored 7 points (Kim et al., 2012; Wang et al., 2012; Cui and Liu, 2015), 2 scored 8 points (Zhang et al., 2014; Zhang, 2015), 1 scored 5 points (Wen et al., 2014), and 1 scored 6 points (Huo and Yu, 2014). The details of the quality assessment of the SRs are listed in Table 3.

UPDRS I CHM paratherapy vs. WCM
Five SRs (Wang et al., 2012; Cui and Liu, 2015; Zhang et al., 2015; Wei et al., 2017; Shan et al., 2018) assessed the UPDRS I score, and all of them indicated that CHM combined with WCM was superior to WCM alone (P < 0.05). All 5 SRs conducted a meta-analysis. Heterogeneity was acceptable (I² < 50%) in 3 SRs (Wang et al., 2012; Wei et al., 2017; Shan et al., 2018) and high (I² > 50%) in 2 SRs (Cui and Liu, 2015; Zhang et al., 2015). Neither of the latter 2 SRs explained the reason for the high heterogeneity. The details of WMD or SMD, 95% CI, and heterogeneity are summarized in Table 1.

UPDRS II CHM vs. placebo
One SR (Shan et al., 2018) showed that the efficacy of CHM monotherapy was similar to placebo according to UPDRS II (P > 0.05).

UPDRS III CHM vs. placebo
One SR (Shan et al., 2018) showed that the efficacy of CHM monotherapy was similar to placebo according to UPDRS III (P > 0.05).

Total Score of UPDRS CHM vs. placebo
In one SR (Kim et al., 2012), CHM showed significant improvement in total UPDRS score after treatment (P < 0.05). One SR (Shan et al., 2018) showed that the efficacy of CHM monotherapy was similar to placebo according to total UPDRS score (P > 0.05).

Webster Scale CHM vs. WCM
Webster scale score was assessed in 2 SRs (Chung et al., 2006; Kim et al., 2012). In the SR by Chung et al. (2006), two trials reported improvement in the overall Webster scale score. However, flaws in the design and statistical analysis of these two trials limited the reliability of their conclusions. In the SR by Kim et al. (2012), three CHM formulas showed a significant effect in improving Webster score.

CHM paratherapy vs. WCM
One SR (Chung et al., 2006) showed a significant effect of CHM paratherapy in improving Webster score compared with WCM. Three of the 4 trials included in the SR by Kim et al. (2012) indicated that combination therapy was superior to WCM alone.

PDQ-39 CHM vs. WCM
One SR (Kim et al., 2012) indicated that the efficacy of CHM monotherapy was similar to WCM according to PDQ-39 (P > 0.05).

Adverse Events
One SR (Chung et al., 2006) evaluated adverse events associated with CHM, including dry mouth, altered taste, musculoskeletal pain, diarrhea/loose stool, constipation, and dizziness. These adverse events were more common in the WCM group than in the CHM group.
Nine SRs (Chung et al., 2006; Wang et al., 2012; Wen et al., 2014; Zhang et al., 2014, 2015; Zhang, 2015; Wei et al., 2017; Shan et al., 2018) evaluated adverse events associated with CHM combined with WCM. The main symptoms reported were dry mouth, fatigue, sleep disorders, gastrointestinal complaints, dizziness, nausea, and flatulence. All of these SRs indicated that side effects in the CHM adjuvant therapy group were generally fewer or milder than those in the WCM group.

Summary of Evidence
This overview found that a number of SRs of CHM for PD were published between 2006 and 2018, suggesting that interest in the use of CHM for PD treatment has grown considerably in recent years. Compared with WCM, CHM paratherapy showed a significant effect in improving UPDRS score, Webster scale score, PDQ-39, NMSQuest, CHM Syndrome Integral Scale, and PDSS. Side effects in the CHM paratherapy group were generally fewer or milder than those in the WCM group. The findings of the present study support the use of CHM paratherapy for PD patients, although the evidence should be treated cautiously because of methodological flaws. In addition, CHM monotherapy showed no difference relative to WCM on various outcome measures.

Limitations
SRs are considered the highest level of evidence in healthcare; only data from SRs of high-quality RCTs receive level 1a according to the levels of evidence of the Centre for Evidence-Based Medicine in Oxford (Glasziou et al., 2004). An overview of SRs is a comprehensive evaluation method that summarizes the findings, examines the methodological quality, and grades the evidence quality of all included SRs on one disease. In this overview, a summary of the findings of the included SRs showed that CHM paratherapy for PD has better efficacy and safety than WCM alone. However, the present study has several limitations. Firstly, most of the included SRs were based on primary studies of poor quality, and the reliability of the positive results may be undermined by these methodological flaws. According to AMSTAR, none of the 11 SRs provided an a priori design, which probably affected their rigor. Six SRs failed to state conflicts of interest, which makes it difficult for users to judge whether potential problems exist in these SRs, such as bias in outcome evaluation arising from conflicts of interest. Secondly, the quality of evidence for most primary outcomes was moderate (42%) to high (54%). However, only 6 SRs provided full information for grading the quality of evidence, while the quality of evidence in the remaining 5 SRs was unclear, which may affect the comprehensiveness and persuasiveness of the quality grading. Thirdly, the included SRs mostly focused on intermediate outcomes, such as UPDRS and Webster scale, which mainly reflect a single point in the course of an intervention and do not fully capture the results of a complex pathological process, thus affecting the analysis of effectiveness.
Fourthly, PD is considered a multisystemic neurodegenerative disorder with both motor symptoms and NMS. Recent research indicates that some NMS are the direct result of PD progression or are induced by PD medication, and increasing attention has been paid to NMS in PD patients (Antonini et al., 2015; Bastide et al., 2015; Shi et al., 2017). However, the included studies mainly focused on evaluating motor symptoms and neglected specific analysis of NMS (Schapira, 2015). Fifthly, the included studies covered many different kinds of CHM. Individual drugs were not evaluated, so it remains unclear which specific ingredients were effective.

Implications
This is the first overview of SRs focused on the efficacy and safety of CHM for PD. In the 11 included SRs, CHM paratherapy exhibited significant improvement in PD symptoms compared with WCM. According to the safety assessment, CHM for PD is generally safe and well tolerated. The evidence available from the present study supports the use of CHM paratherapy for PD patients, although it should be treated cautiously because of methodological flaws. However, there is insufficient evidence for CHM monotherapy for PD.
Given the methodological issues, recommendations for further research are as follows: (1) when designing RCTs of CHM, specific guidelines should be combined and used as a comprehensive guideline, such as the CONSORT 2010 statement (Schulz et al., 2010), guidelines for RCTs investigating CHM (Flower et al., 2012), and CONSORT for TCM (Bian et al., 2011); (2) further RCTs of CHM should include a separate placebo-controlled group in order to evaluate the placebo effect; (3) to evaluate the effectiveness of specific CHM ingredients, further studies of the efficacy of individual CHMs should be conducted; (4) it is important to improve the methodological quality of further SRs themselves: the PRISMA statement (Liberati et al., 2009) should be used as a guide, and prospective registration of SRs should be encouraged; (5) assessments of NMS are crucial, and specific scales such as the Non-Motor Symptoms Scale, the Mini Mental State Examination, and the Montreal Cognitive Assessment Test should be applied (Asakawa et al., 2016), together with more comprehensive terminal outcomes in the natural course of PD, which would contribute to a more accurate evaluation of the efficacy of CHM for PD; (6) as CHM becomes more widely used for PD, reports of adverse events may become more common, so we suggest that a dedicated reporting format be adopted for follow-up to ensure safety.

CONCLUSIONS
The findings of the present study support the use of CHM paratherapy for PD patients, although the evidence should be treated cautiously because of methodological flaws. Further rigorous RCTs are still needed. In addition, there is insufficient evidence for CHM monotherapy for PD; however, it should be remembered that a lack of scientific evidence does not necessarily mean that the treatment is ineffective (Kotsirilos, 2005). Thus, the question of CHM monotherapy for PD remains open.