Standard-Dose Proton Pump Inhibitors in the Initial Non-eradication Treatment of Duodenal Ulcer: Systematic Review, Network Meta-Analysis, and Cost-Effectiveness Analysis

Background: Short-term use of standard-dose proton pump inhibitors (PPIs) is the first-line initial non-eradication treatment for duodenal ulcer (DU), but the choice of individual PPI remains controversial. The purpose of this study is to compare the efficacy, safety, and cost-effectiveness of standard-dose PPI medications in the initial non-eradication treatment of DU. Methods: We searched PubMed, Embase, Cochrane Library, Clinicaltrials.gov, China National Knowledge Infrastructure, VIP database, and the Wanfang database from their earliest records to September 2017. Randomized controlled trials (RCTs) evaluating omeprazole (20 mg/day), pantoprazole (40 mg/day), lansoprazole (30 mg/day), rabeprazole (20 mg/day), ilaprazole (10 mg/day), ranitidine (300 mg/day), famotidine (40 mg/day), or placebo for DU were included. The outcomes were 4-week ulcer healing rate (4-UHR) and the incidence of adverse events (AEs). A network meta-analysis (NMA) using a Bayesian random effects model was conducted, and a cost-effectiveness analysis using a decision tree was performed from the payer’s perspective over 1 year. Results: A total of 62 RCTs involving 10,339 participants (eight interventions) were included. The NMA showed that all the PPIs significantly increased the 4-UHR compared to H2 receptor antagonists (H2RA) and placebo, while there was no significant difference in 4-UHR among PPIs. As to the incidence of AEs, no significant difference was observed among PPIs, H2RA, and placebo during the 4-week follow-up. Based on the costs of both PPIs and management of AEs in China, the incremental cost-effectiveness ratios per quality-adjusted life year (in US dollars) for pantoprazole, lansoprazole, rabeprazole, and ilaprazole compared to omeprazole were $5134.67, $17801.67, $25488.31, and $44572.22, respectively.
Conclusion: Although the efficacy and tolerance of different PPIs are similar in the initial non-eradication treatment of DU, pantoprazole (40 mg/day) seems to be the most cost-effective option in China.

Although eradication of Helicobacter pylori (Hp) is associated with higher healing rates and lower ulcer recurrence rates in patients with Hp-positive DU (Leodolter et al., 2001; Ford et al., 2016), non-eradication therapies are still appropriate for patients with Hp-negative DU or without an Hp test result. Proton pump inhibitors (PPIs) are benzimidazole prodrugs that inhibit gastric acid secretion by irreversibly binding to the hydrogen-potassium ATPase pump residing on the luminal surface of the parietal cell membrane (Wolfe and Sachs, 2000; Shin et al., 2004). These agents have been recommended by the Japanese Society of Gastroenterology (JSG) as first-line agents for the initial non-eradication treatment of DU (Satoh et al., 2016). Chinese guidelines recommend a standard dose of PPI given over 4-6 weeks for the treatment of DU (Editorial Board of Chinese Journal of Digestion, 2016). Omeprazole (OME; 20 mg/day), lansoprazole (LAN; 30 mg/day), pantoprazole (PAN; 40 mg/day), rabeprazole (RAB; 20 mg/day), ilaprazole (ILA; 10 mg/day), and esomeprazole (ESO; 20 mg/day) are widely used PPIs in the initial non-eradication treatment of DU. PPIs differ in their pKa, bioavailability, peak plasma levels, and route of excretion. A previous network meta-analysis of randomized controlled trials (RCTs) compared the healing rates and adverse effects of different PPIs in ordinary doses for patients with DU and concluded that there was no significant difference in efficacy and tolerance between the ordinary doses of PPIs. However, that study included 24 RCTs and compared nine interventions, which resulted in an underpowered test. Moreover, ranitidine (RAN) and famotidine (FAM) were treated as one intervention (H2RA), which introduced clinical heterogeneity into the model. Therefore, this conclusion needs to be further verified. In addition, the relative cost-effectiveness of PPIs is still controversial due to high variability in cost.
The present study aims to evaluate the efficacy, safety, and cost-effectiveness of standard-dose PPI medications in the initial non-eradication treatment of DU.

MATERIALS AND METHODS
We followed the PRISMA Extension Statement for Reporting of Systematic Reviews Incorporating Network Meta-analyses of Health Care Interventions (Supplementary Table S1). The systematic review was prospectively registered in the International Prospective Register of Systematic Reviews (PROSPERO, CRD42017079704). The reporting of the economic evaluation also followed the Consolidated Health Economic Evaluation Reporting Standards (CHEERS) statement (Supplementary Table S2).

Search
PubMed, Embase, and the Cochrane Central Register of Controlled Trials (CENTRAL) were searched using the search strategies detailed in Supplementary Table S3, from their inception to September 2017. Clinicaltrials.gov was also searched using the terms "duodenal ulcer," "proton pump inhibitor," "omeprazole," "pantoprazole," "lansoprazole," "rabeprazole," "ilaprazole," "esomeprazole," "famotidine," and "ranitidine." The China National Knowledge Infrastructure (CNKI), VIP database, and Wanfang database were also searched with Chinese terms. We reviewed the references from published network meta-analyses of PPIs, included studies, and relevant review articles to find additional studies.

Eligibility Criteria
We included studies meeting the following criteria: (1) RCTs; (2) participants with endoscopically verified DU; (3) a focus on the following interventions by oral administration: OME 20 mg/day, PAN 40 mg/day, LAN 30 mg/day, RAB 20 mg/day, ILA 10 mg/day, ESO 20 mg/day, RAN 300 mg/day, FAM 40 mg/day, and placebo (PLA); (4) a treatment duration of 4 weeks or longer; (5) reporting on any of the following outcomes: 4-week ulcer healing rate (4-UHR, primary outcome), defined as complete re-epithelialization of the ulcer crater irrespective of residual erosions after 4 weeks of treatment; incidence of overall adverse events (AEs, secondary outcome); and (6) published in English or Chinese.
We excluded studies that enrolled participants with upper gastrointestinal bleeding, stress ulcer, or concomitant therapy for Hp eradication; studies that compared only different doses of the same drug; and studies reported only as conference abstracts, for which the risk of bias could not be assessed.

Study Selection and Data Extraction
Two reviewers independently screened the titles and abstracts of all studies identified by the search strategies according to the inclusion criteria. The full-texts of all potentially relevant articles were downloaded for further reviewing. We resolved any disagreements through discussion or adjudication by a third reviewer (Juan Xie).
We used a pre-designed data collection form to extract data from each eligible study, including: (1) authors, year of publication, and country or region where the study was conducted; (2) study design; (3) medication used in the treatment or control group, dose, and duration of treatment; (4) number of participants randomized into each group; (5) diagnosis, gender, age, and smoking and drinking habits of participants; (6) length of follow-up; (7) outcome data (outcomes of interest, events, and number of patients included for analyses in each group); and (8) sources of funding. For the outcome data, we extracted intention-to-treat (ITT) data where these were reported. Otherwise, we extracted the data as reported (often a modified ITT based on, e.g., all patients who received at least one dose of the study drug). A kappa statistic (K) was manually calculated to measure the agreement between the two reviewers on the decisions made in study selection.
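The agreement calculation mentioned above can be illustrated with a minimal sketch of Cohen's kappa for two reviewers making include/exclude decisions. The function name and the counts below are hypothetical and are not taken from this review; the source only states that a kappa statistic was calculated manually.

```python
def cohen_kappa(a, b, c, d):
    """Cohen's kappa for a 2x2 table of two reviewers' screening decisions.

    a: both reviewers include
    b: reviewer 1 includes, reviewer 2 excludes
    c: reviewer 1 excludes, reviewer 2 includes
    d: both reviewers exclude
    """
    n = a + b + c + d
    if n == 0:
        raise ValueError("no screening decisions supplied")
    p_observed = (a + d) / n
    # Chance agreement from each reviewer's marginal include/exclude rates.
    p_include = ((a + b) / n) * ((a + c) / n)
    p_exclude = ((c + d) / n) * ((b + d) / n)
    p_expected = p_include + p_exclude
    if p_expected == 1:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical counts for illustration only.
kappa = cohen_kappa(a=60, b=5, c=5, d=130)
print(round(kappa, 3))
```

Values close to 1 indicate near-perfect agreement beyond chance, and values near 0 indicate agreement no better than chance.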

Risk of Bias Assessment
Two reviewers independently assessed the risk of bias in each included study using the tool developed by Cochrane Collaboration (Higgins and Green, 2011). The items included random sequence generation, allocation concealment, blinding, incomplete outcome data, selective outcome reporting, and other bias. We categorized the judgments as low, high, or unclear risk of bias and created a summary graph using Review Manager Software (version 5.3).

Statistical Synthesis
We generated network plots of comparisons to illustrate which interventions had been compared within randomized trials (head-to-head comparisons). A Bayesian random effects network meta-analysis was conducted to compare the relative efficacy (4-UHR) and safety (the incidence of AEs) between different regimens. WinBUGS (version 1.4.3) was used to perform the analysis. Posterior samples were generated using Markov Chain Monte Carlo (MCMC) simulation in two parallel chains. We used 5,000 burn-in iterations to allow convergence, and then a further 50,000 iterations to produce the outputs. We calculated odds ratios (ORs) with 95% confidence intervals (95% CIs), and the surface under the cumulative ranking (SUCRA). We evaluated and graded the statistical heterogeneity according to the value of I². An I² value of 50% or greater was used to denote significant heterogeneity. A node-splitting approach was employed to assess inconsistency in the triangular loops (van Valkenhoef et al., 2016) using the gemtc package in the R environment (version 3.3.1) (van Valkenhoef et al., 2012). To assess the robustness of the results, we conducted a sensitivity analysis comparing the results from ITT data with those from per-protocol (PP) data. We also conducted a sensitivity analysis by excluding trials with high risk of bias. Subgroup analyses were also conducted between Chinese and non-Chinese participants. Patients from Chinese Mainland, Hong Kong, and Taiwan were considered to be Chinese for this study.
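As an illustration of how SUCRA values can be derived from MCMC output, the sketch below ranks treatments within each posterior draw and averages the cumulative rank probabilities over the first a - 1 ranks (a being the number of treatments). This is a generic sketch under the usual definition of SUCRA, not the WinBUGS code used in this study; the function name and the toy draws are hypothetical.

```python
def sucra_from_samples(samples, higher_is_better=True):
    """Compute SUCRA per treatment from posterior draws.

    samples: dict mapping treatment name -> list of posterior draws of an
             effect measure (all lists must have the same length).
    """
    names = list(samples)
    n_trt = len(names)
    if n_trt < 2:
        raise ValueError("at least two treatments are required")
    n_draws = len(samples[names[0]])
    if n_draws == 0 or any(len(samples[t]) != n_draws for t in names):
        raise ValueError("all treatments need the same non-zero number of draws")

    # rank_counts[t][r] = number of draws in which t holds rank r (0 = best).
    rank_counts = {t: [0] * n_trt for t in names}
    for i in range(n_draws):
        ordered = sorted(names, key=lambda t: samples[t][i],
                         reverse=higher_is_better)
        for r, t in enumerate(ordered):
            rank_counts[t][r] += 1

    result = {}
    for t in names:
        cumulative = 0.0
        total = 0.0
        # Sum cumulative rank probabilities over the first n_trt - 1 ranks.
        for r in range(n_trt - 1):
            cumulative += rank_counts[t][r] / n_draws
            total += cumulative
        result[t] = total / (n_trt - 1)
    return result


# Toy draws for three hypothetical treatments (not study data).
toy = {"A": [0.9, 0.8, 0.7], "B": [0.6, 0.9, 0.5], "C": [0.1, 0.2, 0.3]}
print(sucra_from_samples(toy))
```

A SUCRA of 1 means a treatment is always ranked best and 0 means it is always ranked worst. For AE incidence, where lower values are preferable, `higher_is_better=False` would be the appropriate setting.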

Cost-Effectiveness Analysis
We evaluated the cost-effectiveness of PPIs in Chinese patients with DU from the payer's perspective. A decision tree model was constructed in Excel to explore the economic benefits and quality-adjusted life year (QALY) gains. The model considered costs and outcomes over 1 year, and was based on 10,000 Chinese DU patients (male/female = 1), one each in the OME, PAN, LAN, RAB, ILA, and ESO arms. To estimate the probability of 4-UHR for OME, we conducted a single-arm meta-analysis based on data from trials on OME with a random-effects model using the meta package in the R environment (version 3.3.1) (DerSimonian and Laird, 1986). The probability for OME and the OR for 4-UHR for each PPI versus OME, as estimated in the NMA, were then used to derive the respective probabilities for the other PPIs. To estimate QALYs, we extracted health state utility values from previously published research (Groeneveld et al., 2001; Sun et al., 2011). The cost of each treatment strategy was calculated according to the drug cost for one standard treatment course (4 weeks) obtained from the National Health and Family Planning Commission of the People's Republic of China. The costs of managing AEs were obtained from the published literature (Xuan et al., 2016), while all other costs associated with administering the medications were assumed to be the same across the five arms. All costs were recorded in Chinese yuan and then converted into US dollars (exchange rate: 1 yuan = $0.1591). The incremental cost-effectiveness ratio (ICER) per additional life-year saved was calculated to compare the performance of different PPIs. We considered treatment strategies with an ICER of less than $25,761 per QALY saved (i.e., three times the Chinese gross domestic product [GDP] per capita in 2016; Hutubessy et al., 2003) to be acceptable. Probabilistic sensitivity analysis (PSA) was performed to test the robustness of the model.
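The arithmetic behind these steps can be sketched as follows. The conversion from a baseline probability and an OR to a comparator probability uses the standard odds identity, which we assume here for illustration; the exact spreadsheet formulas of the decision tree are not given in the text. Function names and the numerical inputs, other than the exchange rate, are hypothetical.

```python
YUAN_TO_USD = 0.1591  # exchange rate stated in the text


def prob_from_odds_ratio(p_ref, odds_ratio):
    """Probability in the comparator arm given the reference-arm probability
    and the odds ratio (comparator vs. reference)."""
    if not 0 < p_ref < 1:
        raise ValueError("p_ref must lie strictly between 0 and 1")
    odds_new = odds_ratio * p_ref / (1 - p_ref)
    return odds_new / (1 + odds_new)


def yuan_to_usd(amount_yuan):
    """Convert a cost in Chinese yuan to US dollars."""
    return amount_yuan * YUAN_TO_USD


def icer(cost_new, cost_ref, effect_new, effect_ref):
    """Incremental cost per unit of incremental effect (e.g., per QALY)."""
    delta_effect = effect_new - effect_ref
    if delta_effect == 0:
        raise ValueError("ICER is undefined when the effects are equal")
    return (cost_new - cost_ref) / delta_effect


# Hypothetical inputs for illustration only (not the study's estimates).
p_ome = 0.80                              # assumed 4-UHR probability for OME
p_other = prob_from_odds_ratio(p_ome, 1.2)  # assumed OR of 1.2 versus OME
print(round(p_other, 4))
print(icer(cost_new=yuan_to_usd(300), cost_ref=yuan_to_usd(100),
           effect_new=0.905, effect_ref=0.900))
```

A strategy would then be judged acceptable when its ICER falls below the stated willingness-to-pay threshold.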

Risk of Bias Assessment
As shown in Supplementary Figure S1, four studies (Delle Fave et al., 1992; Rensburg et al., 1994; Meneghelli et al., 2000; Hu, 2001) had low risk of selection bias because they clearly described the methods of randomization and allocation concealment, while the risk in the other 58 was unclear because information about the selection of participants was not reported. Thirty-nine studies (62.90%) had low risk of performance bias and detection bias, as both participants and study personnel were masked; however, this risk was unclear in 23 studies (37.10%) that failed to report who was blinded. Sixty-one studies (98.39%) had low risk of attrition bias, as there was no loss to follow-up or missing data were appropriately addressed (e.g., applying ITT analysis, which could underestimate the efficacy of the intervention). Thirty-nine studies (62.90%) had low risk of reporting bias since they had reported all pre-designed outcomes. The other 23 studies (37.10%) neither mentioned registration information nor had an available protocol, so it was unclear whether all the pre-designed outcomes in these studies had been reported. Eight studies (12.90%) were supported by the pharmaceutical industry, and bias caused by conflict of interest was unclear.

Incidence of AEs
Fifty studies (9,012 participants) reported the overall incidence of any AEs in participants receiving the eight interventions. The heterogeneity (Supplementary Figure S4) was not statistically significant among most comparisons (I² < 50%), except for PAN vs. LAN (I² = 51.8% for network). The inconsistency (Supplementary Figure S5) was also not statistically significant among most triangular loops, with the exception of PAN vs. OME (P = 0.0359). As shown in Table 2, there was no significant difference in the incidence of AEs among all the PPIs, H2RAs, and PLA. The results of SUCRA (Supplementary Table S5

Subgroup Analyses
Considering the impact of ethnicity on the results, we performed subgroup analyses in Chinese and non-Chinese participants, respectively. As shown in Supplementary Tables S9, S10, ILA tended to be more effective in improving 4-UHR in Chinese compared to non-Chinese participants. Chinese and non-Chinese subgroups showed similar results for incidence of AEs (Supplementary Tables S11, S12).
ILA was associated with the best efficacy with respect to incremental QALYs, but it also had the highest costs. PAN, LAN, and RAB were also associated with greater efficacy but higher costs than OME. According to the threshold recommended by WHO, PAN was preferred based on its efficacy at an acceptable cost. In contrast, ILA could not be strongly recommended for patients in China, since its ICER exceeded $25683.33. Probabilistic sensitivity analyses (Supplementary Table S13) with 1,000 Monte Carlo simulations revealed that PAN, LAN, RAB, and ILA had probabilities of 73.1% (Supplementary Figure S8), 60.6% (Supplementary Figure S9), 60.9% (Supplementary Figure S10), and 15.2% (Supplementary Figure S11), respectively, of being cost-effective relative to OME under the threshold ($25683.33) currently accepted in China.
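The probability of being cost-effective reported from a probabilistic sensitivity analysis can be read as the share of simulations in which a strategy is acceptable at the threshold. The sketch below uses the net monetary benefit criterion (threshold x incremental QALYs - incremental cost > 0), which is an assumption on our part; the distributions, their parameters, and the function name are hypothetical and do not reproduce the model in Supplementary Table S13.

```python
import random


def prob_cost_effective(delta_costs, delta_qalys, threshold):
    """Share of simulations with positive incremental net monetary benefit.

    delta_costs, delta_qalys: paired incremental values versus the reference
    strategy, one pair per simulation.
    """
    if len(delta_costs) != len(delta_qalys) or not delta_costs:
        raise ValueError("need paired, non-empty simulation results")
    favourable = sum(
        1 for dc, dq in zip(delta_costs, delta_qalys)
        if threshold * dq - dc > 0
    )
    return favourable / len(delta_costs)


# Illustrative simulation with made-up normal distributions (not study data).
rng = random.Random(0)
n_sims = 1000
delta_costs = [rng.gauss(20.0, 5.0) for _ in range(n_sims)]
delta_qalys = [rng.gauss(0.002, 0.002) for _ in range(n_sims)]
print(prob_cost_effective(delta_costs, delta_qalys, threshold=25683.33))
```

Using net monetary benefit rather than the raw ICER avoids misclassifying simulations in which the incremental QALYs are negative, where a ratio below the threshold does not indicate an acceptable strategy.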

DISCUSSION
To the best of our knowledge, this is the first systematic review incorporating a network meta-analysis and cost-effectiveness analysis to compare PPIs for the initial non-eradication treatment of DU and to recommend a rank order based on efficacy, safety, and cost. Our study suggests that all the PPIs significantly improve the 4-UHR compared to H2RAs and PLA, while there is no significant difference in 4-UHR among PPIs. The incidences of AEs with PPIs, H2RAs, and PLA are similar during the 4-week follow-up. PAN seems to be the most cost-effective choice in the initial non-eradication treatment of DU in China.
Most guidelines recommend that all patients with peptic ulcers be tested for Hp infection and treated (Malfertheiner et al., 2017). Nevertheless, an overview of systematic reviews and network meta-analysis (Xin et al., 2016) concluded that the antibiotics used in triple therapy influence the eradication rate, which in turn is associated with the healing rate. To reduce the clinical heterogeneity caused by different antibiotics, this review evaluated the efficacy, safety, and cost-effectiveness of different PPIs in the non-eradication treatment of DU. At present, there are six PPIs (OME, PAN, LAN, RAB, ESO, and ILA) on the pharmaceutical market, but only five PPIs were included in this study. The main reason was that ESO is more effective in inhibiting gastric acid secretion (Beck, 2004; McKeage et al., 2008) and is used more for the eradication of Hp than for the non-eradication treatment of DU.
The subgroup analyses suggested that ILA achieved much better efficacy in Chinese than in non-Chinese participants. This could be attributed to the fact that most RCTs including ILA were conducted in China, and one RCT with high risk of bias reported an extremely high 4-UHR for ILA in Chinese participants. After excluding that RCT, there was no significant difference in the 4-UHR between ILA and other PPIs in either Chinese or non-Chinese participants, which was consistent with the meta-analysis conducted by Ji et al. (2014).
A previous NMA including 24 RCTs and 6,188 patients showed no significant difference in efficacy and tolerance between the ordinary doses of different PPIs, which was mostly consistent with our study. However, we included more RCTs (62) and participants (10,339) to make the conclusion of the NMA more robust. In addition, in order to perform the pharmacoeconomic analysis, our study only included the standard dose of PPIs rather than LAN (15 mg/day or 60 mg/day) or OME (40 mg/day).
The cost-effectiveness analysis indicated that ILA did not dominate OME, which was inconsistent with the previous study conducted by Xuan et al. (2016). This could be attributed to the different cost of OME applied in the model: the cost of OME in Xuan's study was set at 16 yuan/day ($2.5456/day), exceeding the upper limit in our study. The price of OME has fallen greatly because of greater competition and supply of OME in the domestic market. The drug cost data in our study came from the National Health and Family Planning Commission of the People's Republic of China and are more representative.
There are several limitations in this study. We only included RCTs in this review and were therefore underpowered to detect rare AEs related to the medications, as the sample size was relatively small and the follow-up time was short. In addition, some included RCTs, especially those from China, had poor methodological quality, but the results and interpretation did not change when these trials were excluded from the analyses. Because few trials reported results by CYP2C19 genotype, our study did not stratify by genotype.

CONCLUSION
This study suggests that the efficacy and tolerance of different PPIs are similar in the initial non-eradication treatment of DU, but PAN (40 mg/day) seems to be the most cost-effective choice in China. More RCTs are warranted to compare the efficacy, long-term safety, and cost-effectiveness of different PPIs across different CYP2C19 genotypes.

AUTHOR CONTRIBUTIONS
JZ, LG, MH, YL, JX, DC, XL, WZ, and RH conceptualized and designed the experiments, critically revised the manuscript for important intellectual content, and approved the final version to be published including the authorship list. JZ, JX, and XL contributed to literature search and data collection. JZ and LG analyzed the statistical data. JZ, LG, and XL interpreted the data. JZ, LG, MH, YL, XL, and WZ drafted the manuscript.