Different Traditional Herbal Medicines for the Treatment of Gastroesophageal Reflux Disease in Adults

Background/Aims Traditional Herbal Medicines (THM) have long been used for gastroesophageal reflux disease (GERD), but clinical evidence is still scarce. We evaluated different THM prescriptions for GERD in adults. Methods Nine online databases were systematically searched from their inception to November 30, 2019. All relevant randomized controlled trials (RCTs) were included and combined using Bayesian network analysis. The Cochrane Collaboration’s risk of bias tool and GRADE profiler version 3.6 were employed to evaluate the risk of bias of the included trials and the quality of evidence of the outcomes, respectively. Results Seventeen publications involving 1441 participants were included. The results of our analysis suggested that Jianpi therapy+proton pump inhibitors (PPIs) and Ligan Hewei therapy ranked first in overall clinical efficacy and efficacy under gastroscope, respectively; Ligan Hewei therapy+PPIs was the optimum intervention for improving acid regurgitation and heartburn. Conclusion This research indicates that Ligan Hewei therapy and Jianpi therapy, alone or combined with PPIs, should be recommended as appropriate complementary and alternative treatments based on the specific characteristics of GERD. However, additional well-designed RCTs with high methodological quality are still needed.


INTRODUCTION
Gastroesophageal reflux disease (GERD) is a common chronic disorder characterized by an impaired barrier between the stomach and the esophagus, resulting in the regurgitation of gastric contents into the esophagus and even the hypopharynx (DeVault and Castell, 2005). Based on the Montreal definition published in 2006, it is subclassified into non-erosive reflux disease (NERD), reflux esophagitis (RE), and Barrett esophagus (BE) (Vakil et al., 2006). Epidemiological investigation has shown that the disease affects approximately 20%~30% of the population around the world and 7.8%~8.8% in East Asia (El-Serag et al., 2014). Without timely treatment, patients may suffer from complications including esophageal stricture, ulceration, and even BE (Freston et al., 1995; Schwizer and Fried, 1997), leading to a heavy psychological burden and poor work productivity (Wahlqvist et al., 2008; Nocon et al., 2009).
Currently, proton pump inhibitors (PPIs) are the first-line drugs for GERD. They are estimated to relieve related symptoms in about 56%~76% of patients and to heal esophageal lesions in 80%~85% (Katz et al., 2006), as well as reducing the incidence of complications (Savarino et al., 2009). However, approximately 30% of GERD sufferers respond unsatisfactorily to PPIs, remain symptomatic, and are at high risk of complications, including BE (Fass et al., 2005). Therefore, in search of other effective therapies and a better quality of life, many patients turn to alternative medicine (Patrick, 2011).
The use of traditional Chinese medicine (TCM) has a long history and was first documented in Shen Nong's Classic of Materia Medica. Currently, Traditional Herbal Medicines (THM) are widely used in cardio-cerebrovascular, endocrine, gastrointestinal, neuropsychiatric, and respiratory disorders (Dai et al., 2018; Jiang et al., 2019; Kong et al., 2019; Qin et al., 2019; Gao et al., 2020; Liu et al., 2020; Zhang et al., 2020). Several studies have evaluated the efficacy and safety of THM in treating GERD (Dai et al., 2017; Shih et al., 2019). However, these findings were obtained from pairwise comparisons between a single prescription and conventional Western medicine(s). No comparison among different THM has been conducted in the treatment of GERD. Consequently, to obtain up-to-date information on the effectiveness of different prescriptions for this disorder, we performed a Bayesian network analysis that integrates direct and indirect evidence from comparisons of multiple interventions.

METHODS
This study was performed in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement (Liberati et al., 2009) and the Cochrane Handbook for Systematic Reviews of Interventions (details via http://training.cochrane.org/handbook).

Data Sources and Search Strategy
We systematically searched the following databases from their inception to November 30, 2019: PubMed, MEDLINE, EMBASE, Cochrane Library, Scopus, Clarivate, and the Chinese databases CNKI, WanFang, and VIP. The preestablished search terms consisted of three parts: terms for GERD, terms for THM treatment, and a specific filter for randomized controlled trials. Both Medical Subject Headings (MeSH) terms and text words were used as keywords. The detailed search strategies for each database are shown in Supplementary Table S1. No limitation was placed on the language of publication. Publications missed by the electronic search were added by manual retrieval. To identify further eligible trials, the reference lists of the included studies were also checked.

Study Selection
Following the PICOS (participants, interventions, comparisons, outcomes, and study design) criteria, two investigators (Yun-kai Dai, Yun-bo Wu) preliminarily screened the relevant titles and abstracts. Randomized, parallel-group clinical trials of THM for GERD were initially included. The full texts of these studies were then read for further evaluation. Briefly, participants had to be over the age of 18 and meet the diagnostic criteria for GERD (DeVault and Castell, 2005). Any THM prescription was accepted as the intervention, and the positive controls were PPIs, gastrointestinal motility drugs (GMD), or their combination. The sample size of each trial had to be at least 30 per arm, and the duration of treatment at least 4 weeks. To obtain literature of higher quality, only studies with a Jadad score above 1 were selected.
The following participants and publications were excluded: pregnant women; patients with comorbidities such as severe cardio-cerebrovascular diseases and cancers; published meeting abstracts; non-research articles; cross-over studies; and trials that used THM as the positive control.
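The selection rules above can be written as a single predicate. The following is a minimal sketch, not the screening procedure the reviewers actually ran: the `Trial` record and its field names are hypothetical, and the boundary handling (for example, reading "over the age of 18" as a strict inequality) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    """Hypothetical record of one screened trial (field names are illustrative)."""
    min_age: int            # youngest enrolled participant, in years
    smallest_arm: int       # participants in the smallest arm
    duration_weeks: float   # length of treatment
    jadad: int              # Jadad score
    design: str             # "parallel" or "crossover"
    thm_as_control: bool    # True if THM served as the positive control


def is_eligible(t: Trial) -> bool:
    """Apply the inclusion and exclusion criteria stated in the text."""
    return (
        t.min_age > 18                 # participants over the age of 18 (assumed strict)
        and t.smallest_arm >= 30       # at least 30 participants per arm
        and t.duration_weeks >= 4      # at least 4 weeks of treatment
        and t.jadad > 1                # Jadad score above 1
        and t.design == "parallel"     # cross-over studies are excluded
        and not t.thm_as_control       # THM as positive control is excluded
    )


# Two made-up trials: the second fails only on treatment duration.
ok = Trial(min_age=19, smallest_arm=30, duration_weeks=4, jadad=2,
           design="parallel", thm_as_control=False)
short = Trial(min_age=19, smallest_arm=30, duration_weeks=2, jadad=3,
              design="parallel", thm_as_control=False)
print(is_eligible(ok), is_eligible(short))  # True False
```

Criteria that need clinical judgment, such as the GERD diagnosis and comorbidity exclusions, are deliberately left out of this sketch.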

Data Abstraction and Quality Evaluation
Using a prepiloted data extraction sheet, two researchers (Yun-kai Dai, Yun-bo Wu) independently conducted data abstraction and quality assessment. The following items were extracted: characteristics of participants (gender, age, and sample size), details of interventions and comparisons (treatment regimen and duration), course of disease, classification of GERD, primary outcomes (overall clinical efficacy and efficacy under gastroscope), secondary outcomes (improvements of acid regurgitation and heartburn, reflux diagnostic questionnaire (RDQ) scores), side effects, and study design. Where necessary, missing information was sought by telephoning the corresponding authors.
On the basis of the Cochrane Collaboration Recommendations assessment tool (Savovic et al., 2018), the quality of the included trials was independently evaluated by two reviewers (Yun-kai Dai, Yun-bo Wu). The evaluation of methodological quality covered seven aspects: (i) random sequence generation; (ii) allocation concealment; (iii) blinding of participants and personnel; (iv) blinding (or masking) of outcome assessment; (v) incomplete outcome data; (vi) selective reporting; (vii) other bias. Disagreements were resolved by further discussion or negotiation. For each methodological quality item of each study, a value of "high quality" (low risk), "uncertain quality" (unclear risk), or "low quality" (high risk) was assigned, and these values were used to calculate an overall score ranging from 0 to 6 points (from worst to best methodological quality). On this basis, the distribution of methodological quality across the comparisons in the evidence network was assessed. In addition, the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) profiler version 3.6 was used to analyze the overall evidence quality of this network analysis.

Statistical Analysis
Network meta-analysis combines direct and indirect evidence from comparisons of multiple interventions, and performing this analysis within a Bayesian framework can improve the accuracy of the results. WinBUGS version 1.4.3 (MRC Biostatistics Unit, Cambridge, UK), which is based on the Bayesian framework and the Markov chain Monte Carlo (MCMC) method, was used to process and analyze the research data. We used noninformative uniform and normal prior distributions (Ades et al., 2006; Sutton et al., 2008) and three Markov chains to fit the model. For each chain, 50,000 simulation iterations with a thinning interval of 10 were run to obtain the posterior distributions of the model parameters. The first 20,000 iterations were discarded as burn-in to eliminate the effects of the initial values, while the last 30,000 were used for sampling. The network of direct and indirect comparisons among the interventions was drawn as a network figure using Stata version 13.0 software. The Brooks-Gelman-Rubin statistic was calculated to evaluate model convergence. The closer the potential scale reduction factor (PSRF) was to 1, the better the convergence; a PSRF value below 1.2 was still considered acceptable. Node-splitting analysis was used to test consistency (Dias et al., 2010). If the p-value was greater than 0.05, a consistency model was used; otherwise, an inconsistency model was used. Sensitivity analysis was used to explore the sources of heterogeneity. To summarize the rank probabilities of all interventions, the surface under the cumulative ranking curve (SUCRA) was used as a summary statistic for the cumulative ranking (Salanti et al., 2011). By definition, the larger the SUCRA score, the more effective the intervention.
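The two diagnostics named above, the PSRF and SUCRA, can be illustrated in a few lines. This is a minimal sketch in plain Python rather than the WinBUGS code used for the analysis: the function names are hypothetical, the PSRF follows the basic Gelman-Rubin form for equal-length chains, and the count of retained draws assumes that thinning is applied to the post-burn-in iterations only.

```python
from statistics import mean, variance

# MCMC settings quoted in the text
N_CHAINS, N_ITER, BURN_IN, THIN = 3, 50_000, 20_000, 10
# Assumption: thinning applies to the 30,000 post-burn-in iterations of each chain
retained = N_CHAINS * (N_ITER - BURN_IN) // THIN
print(retained)  # 9000


def psrf(chains):
    """Potential scale reduction factor for equal-length chains of one parameter."""
    n = len(chains[0])
    w = mean(variance(c) for c in chains)            # mean within-chain variance (W)
    b_over_n = variance([mean(c) for c in chains])   # variance of the chain means (B/n)
    var_hat = (n - 1) / n * w + b_over_n             # pooled estimate of the posterior variance
    return (var_hat / w) ** 0.5


def sucra(rank_probs):
    """SUCRA of one intervention from its rank probabilities (index 0 = best rank)."""
    a = len(rank_probs)
    running, total = 0.0, 0.0
    for p in rank_probs[:-1]:   # the cumulative probability at the last rank is always 1
        running += p            # cumulative probability of being at this rank or better
        total += running
    return total / (a - 1)
```

An intervention that is certain to rank first gets `sucra([1.0, 0.0, 0.0]) == 1.0`, and one that is certain to rank last gets 0, which matches the reading that larger SUCRA scores indicate more effective interventions.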
In this study, the effect sizes of all outcomes were analyzed with a fixed- or random-effects model depending on indexes of statistical heterogeneity, including the p-value and the inconsistency index statistic (I²) (Higgins et al., 2003). Dichotomous outcomes were calculated using the odds ratio (OR) with 95% credible intervals (CrIs). Continuous variable data were evaluated using the mean difference (MD) with the corresponding 95% CrIs.
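For readers unfamiliar with these quantities, the sketch below computes an odds ratio from a 2x2 table and the I² statistic from Cochran's Q. It is an illustration under stated assumptions, not the model fitted in this study: the function names and the example counts are hypothetical, and the credible intervals reported here come from the Bayesian posterior, which this sketch does not reproduce.

```python
import math


def odds_ratio(events_t, total_t, events_c, total_c):
    """Odds ratio of treatment versus control from a 2x2 table.

    Assumes no zero cells; a continuity correction would be needed otherwise.
    """
    return (events_t * (total_c - events_c)) / ((total_t - events_t) * events_c)


def i_squared(effects, variances):
    """I^2 in percent from study effect estimates and their variances."""
    weights = [1.0 / v for v in variances]                             # inverse-variance weights
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))   # Cochran's Q
    df = len(effects) - 1
    return 0.0 if q <= df else 100.0 * (q - df) / q


# Made-up counts: 40/50 responders on treatment versus 30/50 on control
or_example = odds_ratio(40, 50, 30, 50)
log_or_example = math.log(or_example)   # effects are usually pooled on the log scale
print(round(or_example, 2))             # 2.67
```

A low I² would point toward a fixed-effect model and a high I² toward a random-effects model; the cut-off used for that choice is not stated in the text.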

RESULTS

Study Identification and Selection
In total, 2679 articles were retrieved from the nine databases according to the corresponding search strategies. After removing duplicates and irrelevant publications, 17 randomized controlled trials (RCTs) including 1441 participants were selected for quantitative analysis. A flow diagram of the retrieval process is shown in Figure 1. The baseline characteristics of the included trials are displayed in Table 1. The classification of herbal medicines and the usage frequency of the included herbs can be found in Table 2 and Figure 2. From these, we could roughly conclude that herbs with the functions of regulating the liver and harmonizing the stomach (TCM term: Ligan Hewei) and invigorating the spleen (TCM term: Jianpi) were used more frequently than the other included herbs.

Risk of Bias Evaluation
On the basis of the Cochrane Collaboration Recommendations evaluation tools (Savovic et al., 2014), the quality of the included RCTs was assessed. Of all the studies, 82.35% (14/17) gave a specific description of the random-allocation process, such as the use of a random number table or a computer-generated randomization list. The others only used the word "randomization" without any explanation. Because of insufficient information about allocation concealment, all included trials were judged to be at "unclear risk" in this domain. For performance bias, only four studies (23.53%) described double or single blinding. As for detection bias, in 13 trials (76.47%) outcome assessment either could not be blinded or it was unclear whether it had been. In addition, eight RCTs (47.06%) were at low risk of attrition bias because they provided detailed explanations or statistical estimations of dropout rates. However, two trials (11.76%) failed to provide adequate information for judging the risk from missing data. Moreover, there was insufficient information on other risks for all 17 studies. In sum, among all trials, 4 were viewed as low risk, 2 as unclear risk, and 11 as high risk. A detailed quality evaluation is shown in Figures 3A, B.
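The percentages in this paragraph follow directly from the reported counts out of 17 trials. The short check below recomputes them; the dictionary labels are shorthand introduced here for illustration and are not terms from the risk of bias tool.

```python
N_TRIALS = 17

# Counts reported in the text for each risk-of-bias item
counts = {
    "random sequence described": 14,
    "blinding described": 4,
    "detection bias not low": 13,
    "low attrition risk": 8,
    "missing-data risk unclear": 2,
}

# Percentage of the 17 included trials, rounded to two decimals as in the text
percentages = {label: round(100 * n / N_TRIALS, 2) for label, n in counts.items()}

for label, pct in percentages.items():
    print(f"{label}: {counts[label]}/{N_TRIALS} = {pct:.2f}%")
```

The recomputed values (82.35%, 23.53%, 76.47%, 47.06%, and 11.76%) agree with those quoted above.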

Network Evidence
This study included the following seven regimens: Ligan Hewei therapy, Ligan Hewei therapy+PPIs, Jianpi therapy, Jianpi therapy+PPIs, PPIs, PPIs+GMD, and GMD. The network plot showed that the number of GERD patients treated with PPIs was the largest, followed by Ligan Hewei therapy and then Jianpi therapy, while the number of GERD patients treated with Ligan Hewei therapy+PPIs was the smallest (Figure 4).

Sensitivity Analysis
A sensitivity analysis was conducted by omitting the studies one at a time. The results showed no significant differences in overall clinical efficacy or efficacy under gastroscope (Figure 6).
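The leave-one-out idea can be sketched as follows. This is an illustration only: it uses a frequentist fixed-effect inverse-variance pool as a simple stand-in for the Bayesian network model, the function names are hypothetical, and the input values are made up rather than taken from the included trials.

```python
import math


def pooled_log_or(log_ors, variances):
    """Fixed-effect inverse-variance pooled log odds ratio."""
    weights = [1.0 / v for v in variances]
    return sum(w * y for w, y in zip(weights, log_ors)) / sum(weights)


def leave_one_out(log_ors, variances):
    """Pooled odds ratio after omitting each study in turn."""
    results = []
    for i in range(len(log_ors)):
        ys = log_ors[:i] + log_ors[i + 1:]       # effects without study i
        vs = variances[:i] + variances[i + 1:]   # variances without study i
        results.append(math.exp(pooled_log_or(ys, vs)))
    return results


# Hypothetical log odds ratios and variances, for illustration only
log_ors = [0.9, 1.1, 0.7, 1.3]
variances = [0.20, 0.25, 0.15, 0.30]
for i, or_i in enumerate(leave_one_out(log_ors, variances), start=1):
    print(f"omit study {i}: pooled OR = {or_i:.2f}")
```

If the pooled estimate stays close to the full-data estimate no matter which study is removed, no single trial is driving the result, which is the pattern the sensitivity analysis reports.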

GRADE Evidence of Quality
GRADE profiler software, which includes the elements of GRADE criteria such as study design, risk of bias, inconsistency, indirectness, imprecision, and publication bias, was used to rate the quality of evidence and grade strength of recommendations for this network meta-analysis. The results shown in Figure 7 suggested that the evidence quality of overall clinical efficacy was "Low," which could be related to high risk of bias and indirectness within RCTs.

DISCUSSION
Network meta-analysis is used to analyze studies with multiple interventions and provide rankings for them (Naci et al., 2014). Our findings from the comprehensive network analyses demonstrated the overall synthesis of data for currently available GERD treatments in terms of different THM.
Regarding the usage frequency of each herb, we could roughly conclude that herbs with the functions of Ligan Hewei and Jianpi were used more frequently than the other included herbs. In terms of outcomes, Jianpi therapy+PPIs ranked first in overall clinical efficacy. Ligan Hewei therapy might be a better choice for healing gastroesophageal mucosal lesions according to gastroscope observations. In addition, Ligan Hewei therapy+PPIs was superior to the other interventions in improving acid regurgitation and heartburn. Therefore, Ligan Hewei therapy and Jianpi therapy could be promising complementary and alternative therapies in the management of GERD, which may give TCM practitioners further guidance for clinical decisions and for treatment based on syndrome differentiation.
The pathogenesis of GERD remains poorly understood. The currently acknowledged potential mechanisms include hiatal hernia (Dore et al., 2016), anti-reflux barrier dysfunction (Xie et al., 2017), esophageal inflammation (Dunbar et al., 2016), and transient lower esophageal sphincter relaxation (TLESR) (Banovcin et al., 2016), as well as psychological factors (Baker et al., 1995; Wright et al., 2005) and obesity (Nadaleto et al., 2016). Modern pharmacological research suggests that complementary and alternative medicine (CAM), especially TCM, could potentially intervene in these mechanisms. A clinical study showed that wu chu yu tang (affiliated with Jianpi therapy) could improve the symptoms of GERD through anti-inflammation, antioxidant activity, acid suppression, reduction in pepsin secretion, and mucosal protection (Shih et al., 2019).
In the treatment of gastrointestinal (GI) reflux diseases, another study indicated that Wendan decoction (WDD, affiliated with Ligan Hewei therapy) could reduce unhealthy emotions in patients by normalizing behaviors and up-regulating orexin-A, orexin receptor 1, and leptin and its receptor in the brain. Additionally, WDD could resolve phlegm-related problems and restore GI homeostasis through a dual action on acid and bile secretion. Meanwhile, acupuncture regulating qi based on the compatibility of the five meridians (affiliated with Ligan Hewei therapy) could also play an important role in treating GERD with the syndrome of disharmony between liver and stomach. Its mechanisms were possibly related to regulation of the neuro-endocrine-immune system, thereby alleviating TLESR, promoting GI motility, suppressing acid secretion, and protecting the gastric mucosa (Pan et al., 2017). Besides, acupoint drug finger pressing, based on the TCM theory of Jianpi therapy, also showed good therapeutic effects on GERD, which is probably attributable to increased lower esophageal sphincter pressure, decreased acid reflux into the esophagus, and improved coordination of gastroesophageal movement (Xie et al., 2007). In sum, CAM, especially TCM, may offer multi-target treatments for GERD that are worth studying further in vitro and in vivo.
Generally speaking, non-randomized trials are susceptible to many biases and therefore provide weaker forms of evidence. However, in RCTs too, certain deficits in design, conduct, analysis, and reporting may result in bias (Savovic et al., 2018). In this study, the methodological quality of the included trials was generally moderate, and the quality level of evidence for overall clinical efficacy, according to the GRADE evidence classification, was "Low." Judging from these two results, the potential risk of bias in our study was possibly rooted in three aspects.
First, blinding was not implemented in 13/17 (76.47%) RCTs, which may have led to performance and detection biases. Next, because allocation concealment was not reported in any of the included studies, participants or investigators could potentially have recognized which treatment was allocated, which may have introduced selection bias. Last, although 8/17 (47.06%) studies reported withdrawals or dropouts in detail, another 2 (11.76%) failed to provide an adequate explanation for missing data, which may also have increased the risk of attrition bias.
In network analysis, consistency refers to the agreement between direct and indirect sources of evidence for a single comparison (Madan et al., 2011). Poor consistency in a statistical analysis may indicate a lack of transitivity. In our study, for the primary and secondary outcomes, the analysis based on the "node-splitting" method showed good convergence and strong stability, which supports the reliability of our results. Nevertheless, clinical heterogeneity cannot be ruled out. For example, the improvement of symptoms (acid regurgitation and heartburn) was assessed by standards that rely heavily on the subjective judgments of doctors or patients. It should also be taken into consideration that overall clinical efficacy and efficacy under gastroscope were described as comprehensive evaluations of the improvement of both multiple GERD symptoms and histopathological changes of the gastroesophageal mucosa.
There are several potential limitations to our study. First, the included studies were only in Chinese and Japanese. Evidence with this geographically limited distribution needs to be supported by more multicenter, large-scale research around the world. Second, discrepancies in traditional herbal medicines (specifically, the interventions mentioned in our study) may exist because of their source and preparation, which could influence the strength of the evidence. Third, missing data could pose a threat to the validity of RCTs because the observed outcomes of an RCT may then not be representative of all participants in the trial. Meanwhile, there was no corresponding evidence to verify the impact of missing data on the overall results in our study. Fourth, there was no unified criterion for the classification of interventions. Accordingly, we categorized them by the functions of the herbs or prescriptions described in the literature. Last, high-quality RCTs play a key role in producing optimal sources of evidence.
Therefore, we look forward to further standardized research with superior methodology, such as multicenter, large-sample, well-designed RCTs (including the implementation of allocation concealment and blinding), to update and refine the current body of evidence. Furthermore, strictly following the Consolidated Standards of Reporting Trials (CONSORT) or the Standards for Reporting Interventions in Clinical Trials of Acupuncture (STRICTA) statement is also essential to improve the reporting quality of future research.

CONCLUSION
Evidence from this network analysis indicates that Ligan Hewei therapy and Jianpi therapy could be the most suitable complementary and alternative interventions for GERD. Across the different outcomes, Jianpi therapy+PPIs could be an optimum treatment in terms of overall clinical efficacy. Ligan Hewei therapy might be suitable for improving gastroesophageal mucosal lesions as seen under a gastroscope. Ligan Hewei therapy+PPIs could be a better choice for patients with acid regurgitation and heartburn. These findings could help physicians and patients choose appropriate treatments based on the specific characteristics of GERD. However, additional high-quality RCTs should be conducted to provide stronger evidence.