Efficacy and safety of total glucosides of paeony in the treatment of recurrent aphthous ulcers: a systematic review and meta-analysis

Background: Recurrent aphthous ulcer (RAU) has a high prevalence and lacks a widely recognized treatment. Total glucosides of paeony (TGP) has been used in the treatment of RAU in recent years. This study aimed to summarize the efficacy and safety of TGP in the treatment of RAU. Methods: We searched eight commonly used databases for relevant studies published before 1 November 2023. The primary outcome was the visual analogue scale (VAS). Secondary outcomes included overall response rate, significant response rate, ulcer healing time, interval, number of ulcers, and serum inflammatory factors. We conducted the meta-analysis and assessed the risk of bias and the confidence of the evidence using Stata 15.0, Review Manager 5.4, and GRADEpro. Results: Nine randomized controlled trials (RCTs) encompassing 883 patients with RAU were included in the final analysis. The VAS in the TGP group was lower than that in the control group (MD = −1.18, 95% CI = −1.58 to −0.78, p < 0.001, moderate-certainty evidence), and subgroup analysis suggested that longer (>8 weeks) medication and observation led to a more significant reduction in pain (p = 0.02). Moreover, TGP had a higher overall response rate (RR = 1.18, 95% CI = 1.04 to 1.33, p = 0.008, very low-certainty evidence) and significant response rate (RR = 1.72, 95% CI = 1.38 to 2.14, p < 0.001, very low-certainty evidence), accelerated ulcer healing (MD = −1.79, 95% CI = −2.67 to −0.91, p < 0.001, low-certainty evidence), and extended intervals (MD = 23.60, 95% CI = 14.17 to 33.03, p < 0.001, very low-certainty evidence). The efficacy of TGP in reducing the number of ulcers showed no significant difference compared to the control group (MD = −1.66, 95% CI = −3.60 to 0.28, p = 0.09, low-certainty evidence). Moreover, TGP treatment was associated with a higher incidence of abdominal symptoms (RR = 3.27, 95% CI = 1.62 to 6.60, p < 0.001). Conclusion: TGP appears to hold promise as a widely used clinical therapeutic option for treating RAU. Nevertheless, further rigorous studies of high quality are required to validate its effectiveness.
Systematic Review Registration: https://www.crd.york.ac.uk/PROSPERO/display_record.php?RecordID=471154, Identifier CRD42023471154


Introduction
Recurrent aphthous ulcer (RAU), with a prevalence of 5%-66% worldwide, is the most common oral mucosal disease (Saikaly et al., 2018; Prajapat et al., 2021). The pain caused by ulcers significantly impairs patients' ability to eat, speak, and perform other daily tasks (Lu et al., 2020). Patients who experience long-term and high-frequency RAU may become emotionally unstable and lose faith in their medical care (Hariyani et al., 2020).
The treatment goals associated with RAU can be categorized into two distinct domains. Short-term objectives encompass the reduction of pain intensity and the facilitation of ulcer healing, while long-term goals focus on mitigating the frequency of ulcerative episodes and the quantity of ulcers (Lau and Smith, 2022). The attainment of short-term goals is feasible through the utilization of diverse pharmacological interventions; however, an optimal treatment strategy for long-term goals remains elusive at present.
Several drugs, including glucocorticoids, thalidomide, and colchicine, have been used in the treatment of RAU. However, the clinical application of thalidomide is considerably limited due to its teratogenic effects (Zeng et al., 2020; Amare et al., 2021; Deng et al., 2022), particularly among the young population, who are commonly affected by RAU (Cui et al., 2016). Additionally, troubling symptoms such as dizziness, constipation, and rash are challenging to mitigate. Notably, glucocorticoids frequently lead to gastrointestinal adverse reactions, and patients with obesity, glaucoma, depression, and hypertension may experience varying degrees of adverse effects, even with a dosage below 10 mg/day (Yasir et al., 2023). In the case of colchicine, the treatment of RAU may lead to gastrointestinal issues, neutropenia, and abnormal liver function, with the incidence of adverse events even surpassing that of prednisolone (Pakfetrat et al., 2010).
Although a lack of vitamins, minerals, and trace elements is thought to be one of the causes of RAU (Saikaly et al., 2018), recent studies have revealed that patients do not benefit from vitamins and minerals (Shao et al., 2018). With the exception of certain anemia patients whose RAU symptoms can be alleviated by supplementing with folic acid and vitamin B12 (Han and Liu H., 2022; Taleb et al., 2022), using vitamins, minerals, and trace elements to treat RAU is not advised by the most recent therapy guidelines (Guo et al., 2020; Lau and Smith, 2022; Milia et al., 2022).
Botanical drugs have gained considerable attention as researchers endeavor to achieve a delicate balance between effectiveness and potential side effects (Shavakhi et al., 2022). Total glucosides of paeony (TGP) is the total glycosides extracted from the dried roots of Paeonia lactiflora Pall. [Ranunculaceae; Paeoniae Radix Alba]. It possesses immunoregulatory properties and has been widely utilized in the treatment of autoimmune diseases (Jiang et al., 2020; Gong et al., 2022). Based on the findings of several randomized controlled trial (RCT)-based meta-analyses (Luo et al., 2017; Feng et al., 2019; Zheng et al., 2019; Wang et al., 2022; Yang et al., 2023), it is suggested that combining TGP with effective therapeutic drugs can lead to more significant treatment efficacy for Sjogren's syndrome, systemic lupus erythematosus, psoriasis, and rheumatoid arthritis compared to using these drugs alone. However, it should be noted that the quality of the RCTs included in these meta-analyses was limited. Nonetheless, two relatively high-quality randomized, double-blinded, placebo-controlled clinical trials demonstrated the effectiveness of TGP as a standalone treatment for Sjogren's syndrome and psoriasis (Zhou et al., 2016; Yu et al., 2017).
Traditional Chinese medicine holds that Paeonia has hepatoprotective functions (Peng et al., 2023). Recent research suggests that TGP may inhibit liver fibrosis and the inflammatory response associated with cirrhosis via the FLI1/NLRP3 axis (Zhang et al., 2022). Previous RCTs on TGP showed no significant hepatotoxicity or ocular toxicity (Feng et al., 2019). Several large-scale meta-analyses of RCTs even indicate that the combination of TGP with other drugs reduces the incidence of hepatotoxicity compared to using the other drugs alone (Luo et al., 2017; Yu et al., 2017; Huang et al., 2019).
Previous studies have suggested various potential mechanisms by which TGP may treat RAU effectively, including the regulation of inflammatory factors such as TNF-α, IL-1β, IL-6, IL-12, TGF-β, and IL-10 (Shi et al., 2014; Giannetti et al., 2018; Zhao et al., 2018); the maintenance of a balanced ratio of CD4+/CD8+ T cells (Sun et al., 2000) and Th1/Th17 cells (Kong et al., 2018); inhibition of T-cell sensitivity to inflammation (Shi et al., 2014); and reduction in the secretion of secretory immunoglobulin A (Peng et al., 2019). The commercially available TGP capsules (trade name: Pavlin, produced by Ningbo Liwah Pharmaceutical Co., Ltd., H20055058, Paeonia lactiflora Pall. 0.3 g/capsule, containing 130 mg of paeoniflorin) are extensively utilized in clinical practice, which, to some extent, helps mitigate the heterogeneity of treatments resulting from traditional Chinese medicine empirical practices. The extraction process of TGP is shown in Supplementary Table S1.
However, the existing studies predominantly consist of small-sample trials with varying research designs. In response to these limitations, we conducted a rigorous systematic review and meta-analysis according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 2020 statement (Supplementary Table S2). Our aim was to evaluate the efficacy and safety of TGP in treating RAU.

Eligibility criteria
The studies were screened according to the "PICOS" principle.

Population
Patients diagnosed with RAU by specialist doctors in oral mucosal diseases based on typical clinical manifestations and medical history (Milia et al., 2022).

Intervention and control
Inclusion criteria: (1) TGP capsules were used in the intervention group; (2) The control group was treated with vitamins (and minerals), placebos, or received the exact same medication as the intervention group, excluding TGP.
Exclusion criteria: (1) The intervention group received traditional Chinese medicine that consisted of peony or total glucosides of paeony along with other Chinese herbal ingredients; (2) The two groups differed in any medication other than TGP, vitamins, minerals, and placebo.

Outcome
The primary outcome was the visual analogue scale (VAS), used to assess the pain intensity of ulcers. Secondary outcomes included overall response rate, significant response rate, ulcer healing time, interval (ulcer-free days in the observation period), number of ulcers, and serum inflammatory factors, namely tumor necrosis factor-α (TNF-α) and interleukin-2 (IL-2). The incidence of adverse reactions was measured as the safety outcome.

Study design
Inclusion criteria: (1) Randomized controlled trials, non-randomized controlled trials and cohort studies.
Exclusion criteria: (1) Case reports, reviews, conference articles, expert consensus, animal experiments or mechanism research; (2) Duplicate publications, and studies with incomplete or unavailable data.

Data search strategy
We searched PubMed, Embase, Cochrane Library, Web of Science, China National Knowledge Infrastructure (CNKI), Wanfang Database, VIP information resource integration service platform, and China Biology Medicine Disc (SinoMed) for relevant studies that were published before 1 November 2023. The search keywords included "Stomatitis, Aphthous" and "Paeonia". All search strategies are presented in Supplementary Table S3. The search results were not limited by language.

Study selection
Two investigators (LZJ and LXY) independently selected studies. All the literature retrieved was imported into EndNote 20 to eliminate duplicates. LZJ and LXY screened the remaining literature by reading the titles and abstracts, and the full texts when needed, to determine whether they met the inclusion and exclusion criteria. Divergences were settled by consulting with a third author (LWH).

Data extraction
Three reviewers (LZJ, LXY, and HYP) independently used the prespecified form to obtain data from the papers that satisfied the criteria. Inconsistencies were corrected under the supervision of the responsible author (LWH). The data included authors, publication year, study design, region, sample sizes, participants' characteristics (gender, age, and course), medication duration, observation duration, interventions, outcomes (evaluation criteria and results), and adverse events. We tried to contact the original authors for clarification when we encountered unclear data.

Assessment of risk of bias
Two authors (LZJ and LXY) independently assessed the risk of bias of included studies using the recommended 'Risk of bias' tool for trials according to the Cochrane manual. This approach addresses the following seven specific domains: (1) random sequence generation, (2) allocation concealment, (3) blinding of participants and personnel, (4) blinding of outcome assessment, (5) incomplete outcome data, (6) selective reporting, and (7) other bias. Each item was evaluated as "high risk", "low risk" or "unclear". All discrepancies were resolved by discussion to reach consensus between the two review authors, with a third review author (LWH) acting as an arbiter if necessary.

Statistical analysis
For continuous outcomes, such as VAS, ulcer healing time, interval, number of ulcers, TNF-α, and interleukin-2, we employed the weighted mean difference. Dichotomous outcomes, such as overall response rate, significant response rate, and incidence of adverse reactions, were assessed using the risk ratio (RR). To quantify the effects, we provided effect sizes and 95% confidence intervals (95% CI) for all analyses. To conduct the meta-analysis, we performed necessary data conversions, such as merging multiple medians (interquartile ranges) and converting the median (interquartile range) to mean ± standard deviation (Wan et al., 2014; Luo et al., 2018). Review Manager 5.4 (http://www.cochranelibrary.com/) was used to perform data analysis.
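The effect measures named in this section can be illustrated with a minimal Python sketch. It is not the Review Manager implementation: the function names and input values are hypothetical, the risk ratio assumes no zero cells, and the median conversion uses only the quartile-based approximation of Wan et al. (2014) without the refinement of Luo et al. (2018).

```python
import math
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)  # two-sided 95% critical value, about 1.96


def risk_ratio(events_t, n_t, events_c, n_c):
    """Risk ratio with a 95% CI built on the log scale (assumes no zero cells)."""
    rr = (events_t / n_t) / (events_c / n_c)
    se_log = math.sqrt(1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c)
    lower = math.exp(math.log(rr) - Z95 * se_log)
    upper = math.exp(math.log(rr) + Z95 * se_log)
    return rr, lower, upper


def mean_difference(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Mean difference (treatment minus control) with a 95% CI."""
    md = mean_t - mean_c
    se = math.sqrt(sd_t**2 / n_t + sd_c**2 / n_c)
    return md, md - Z95 * se, md + Z95 * se


def median_iqr_to_mean_sd(q1, median, q3, n):
    """Approximate mean and SD from quartiles, following Wan et al. (2014)."""
    mean = (q1 + median + q3) / 3
    z = NormalDist().inv_cdf((0.75 * n - 0.125) / (n + 0.25))
    return mean, (q3 - q1) / (2 * z)


# Hypothetical inputs for illustration; these are not data from the included trials.
print(risk_ratio(20, 50, 10, 50))
print(mean_difference(3.0, 1.0, 50, 4.0, 1.0, 50))
print(median_iqr_to_mean_sd(2.0, 4.0, 6.0, 100))
```

A negative mean difference on the VAS favors the treatment group, whereas a risk ratio above 1 for a response rate favors it.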

Heterogeneity assessment and sensitivity analysis
The statistical heterogeneity was assessed using the I² value. If the I² exceeded 50%, it signified a notable presence of heterogeneity, and a random-effects model was chosen. Otherwise, a fixed-effects model was utilized. Subgroup analyses were used to explore the sources of heterogeneity.
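The model-selection rule above can be sketched as follows. This is an illustration under stated assumptions, not the software's exact procedure: it uses generic inverse-variance weights and the DerSimonian-Laird estimate of between-study variance, whereas Review Manager may apply other weighting schemes (for example Mantel-Haenszel for dichotomous data). The function name `pool` and the inputs are hypothetical.

```python
import math


def pool(effects, ses):
    """Inverse-variance pooling with Cochran's Q, I^2 (%), and DerSimonian-Laird tau^2.

    Uses a random-effects model when I^2 exceeds 50%, otherwise a fixed-effects model.
    """
    w = [1 / se**2 for se in ses]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    if i2 > 50:
        weights = [1 / (se**2 + tau2) for se in ses]
        model = "random"
    else:
        weights, model = w, "fixed"
    pooled = sum(wi * yi for wi, yi in zip(weights, effects)) / sum(weights)
    se_pooled = math.sqrt(1 / sum(weights))
    return {"model": model, "pooled": pooled, "se": se_pooled,
            "Q": q, "I2": i2, "tau2": tau2}


# Hypothetical study-level mean differences and standard errors.
print(pool([-0.5, -2.0, -1.1], [0.10, 0.10, 0.15]))
```

Under the random-effects branch, adding tau² to each study's variance flattens the weights, so small studies gain relative influence and the pooled confidence interval widens.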
Sensitivity analyses were conducted using Stata 15.0 to generate graphical representations. Additionally, for results with heterogeneity, we obtained the precise changes of I² by omitting the included studies one by one. In cases where certain studies significantly influenced the stability of the outcome, a thorough evaluation of their study design and outcome was performed. If high risks of bias or clinical heterogeneity were identified, the respective study was excluded, and a new meta-analysis was conducted using the remaining studies.
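The one-by-one omission procedure can be sketched in Python as below. This is a hedged illustration rather than the Stata routine used in the study: the function names, study labels, and numbers are hypothetical, and I² is computed from generic inverse-variance weights.

```python
def i_squared(effects, ses):
    """I^2 (%) from Cochran's Q with inverse-variance weights."""
    w = [1 / se**2 for se in ses]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    return max(0.0, (q - df) / q) * 100 if q > 0 else 0.0


def leave_one_out(labels, effects, ses):
    """Recompute I^2 after omitting each study in turn."""
    result = {}
    for k, label in enumerate(labels):
        rest_effects = effects[:k] + effects[k + 1:]
        rest_ses = ses[:k] + ses[k + 1:]
        result[label] = i_squared(rest_effects, rest_ses)
    return result


# Hypothetical studies: omitting "C" removes the outlying effect.
print(leave_one_out(["A", "B", "C"], [0.1, 0.1, 1.0], [0.1, 0.1, 0.1]))
```

A large drop in I² after omitting a single study flags that study for the closer review of design and outcomes described above.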

Subgroup analysis
Subgroup analyses were conducted based on medication duration, observation duration, and specific intervention measures employed in the control group. Notably, due to inconsistencies in efficacy evaluation criteria across studies, as indicated in Supplementary Table S4, subgroup analyses by evaluation criteria were performed specifically for the outcomes of the overall response rate and significant response rate.

Assessment of reporting bias
We used Egger's test to assess reporting bias, given the limited number of studies available (Sterne et al., 2011). Assessing reporting bias is not recommended for results based on fewer than five studies.
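Egger's test regresses each study's standardized effect on its precision and examines whether the intercept departs from zero. A minimal sketch follows, assuming ordinary least squares and hypothetical inputs; the function name and the `min_studies` parameter (set to the five-study threshold stated above) are illustrative. A p-value would come from a t distribution with the returned degrees of freedom and is not computed here.

```python
import math


def egger_intercept(effects, ses, min_studies=5):
    """OLS of standardized effect (effect/SE) on precision (1/SE).

    Returns the intercept, its standard error, the t statistic, and the
    degrees of freedom (number of studies minus 2).
    """
    n = len(effects)
    if n < min_studies:
        raise ValueError("too few studies for a reporting bias assessment")
    x = [1 / se for se in ses]
    y = [e / se for e, se in zip(effects, ses)]
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_int = math.sqrt(rss / (n - 2) * (1 / n + mx**2 / sxx))
    t = intercept / se_int if se_int > 0 else math.copysign(math.inf, intercept)
    return intercept, se_int, t, n - 2


# Hypothetical effects and standard errors from five studies.
print(egger_intercept([-1.2, -0.9, -1.5, -0.8, -1.1],
                      [0.30, 0.20, 0.45, 0.15, 0.25]))
```

An intercept far from zero suggests small-study effects, of which publication bias is one possible cause.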

Certainty assessment
Two authors (LZJ and LXY) independently assessed the confidence of the evidence according to the Grading of Recommendations Assessment, Development and Evaluation (GRADE) approach (https://gdt.gradepro.org/). The level of evidence was evaluated and categorized as "high", "moderate", "low", or "very low."

Literature screening
A comprehensive search of eight databases yielded a total of 139 records. After removing duplicates (n = 52), an evaluation of titles and abstracts resulted in the identification of 22 potentially eligible literature sources. Finally, nine studies (Tao, 2012; Wang et al., 2013; Su and Nong, 2014; Xu and Chen Z., 2017; Yao et al., 2017; Yan and Zhang H., 2019; Sun, 2020; Chen X. and Zhang H. L., 2021; Liu Z. et al., 2023), all of which were randomized controlled trials, were included based on a thorough examination of their full texts. The study selection process is visually depicted in Figure 1.

Study characteristics
All included studies were single-center RCTs conducted in eight different provinces in China. A total of 883 patients (444 in the treatment group and 439 in the control group) were enrolled. Excluding Xu et al.'s study (Xu and Chen Z., 2017), which did not report the gender and age distribution of the grouped participants, the male-to-female ratio was 328:391, with average ages ranging from 26.6 to 44.67 years. The average course in both groups ranged from 1 to 10 years.
The prescribed dosage of TGP for the RAU patients was 0.6 g per administration, to be taken 2-3 times daily. Only one study (Liu Z. et al., 2023) provided a comprehensive report on the actual dosage. The average daily intake of TGP and placebo during different periods ranged from 1.50 to 1.68 g, equivalent to five to six capsules per day. Several studies (Tao, 2012; Wang et al., 2013; Sun, 2020) reported cases of dose reduction in patients, but they did not specify the exact dosage and/or duration of reduced intake. The remaining studies did not mention whether the patients adhered to the prescribed medication regimen.
All the included studies utilized TGP capsules, which were exclusively manufactured by Ningbo Liwah Pharmaceutical Co., Ltd. According to the "type A extract" of the ConPhyMP consensus statement (Heinrich et al., 2022), three different (orthogonal) fingerprinting methods need to be provided to verify the main ingredients of the drug. However, only one study (Liu Z. et al., 2023) showed the detection results of TGP capsule components, obtained once by high-performance liquid chromatography, which demonstrated that the content of the crucial component, paeoniflorin, in the drug was 130 mg per capsule. This amount exceeded the minimum standard set by the Chinese Pharmacopoeia 2020.

Risk assessment of bias
The results of the risk of bias assessment are shown in Figure 2. Only one study (Liu Z. et al., 2023) adhered to the CONSORT 2010 Statement (http://www.consort-statement.org) and employed a double-blind randomized controlled trial design. The remaining studies were randomized controlled trials that adequately reported the main outcome measures, but did not provide information on patient withdrawal, allocation considerations, and blinding. Among these studies, only three reported utilizing the random number table method for generating random sequences (Su and Nong, 2014; Yan and Zhang H., 2019; Chen X. and Zhang H. L., 2021).
In terms of efficacy evaluation, three studies assessed efficacy using the interval and number of ulcers (Tao, 2012; Xu and Chen Z., 2017; Sun, 2020). However, these studies solely reported significant and overall response rates without specifying the interval and number of ulcers. Consequently, these studies were classified as high-risk for selective reporting due to the absence of crucial outcome indicators.
Furthermore, one study (Liu Z. et al., 2023) extensively described the use of diary cards by patients for daily recording of outcome indicators. None of the other studies provided detailed information on the recording method of outcomes, thereby raising concerns about potential recall bias in the follow-up visits.

Primary outcome (VAS)
Xu reported on the VAS (Xu and Chen Z., 2017). Compared to the administration of thalidomide alone, the combination of TGP and thalidomide showed a higher VAS score. However, they concluded that the combined therapy had a better pain relief effect, which contradicts our understanding that a higher VAS indicates more severe pain. We were unable to reach the author for further clarification of the data. In order to ensure the accuracy of our research, we excluded this study in this part. Among the studies included, Liu Z. et al. (2023) reported VAS separately for the 1-4 weeks and 25-36 weeks time periods, so we included the data from these two time points independently in this study. All studies reported no intergroup differences in pre-treatment VAS.
After the interventions, the VAS in the TGP groups was lower than that in the control groups (MD = −1.18, 95% CI = −1.58 to −0.78, p < 0.001; Figure 3A), with significant statistical heterogeneity present (I² = 91%). Sensitivity analysis showed good stability (Supplementary Figure S1A), and individual study exclusion resulted in a change in I² ranging from 84% to 93% (Supplementary Table S5). Subgroup analysis suggested that the observation period and medication duration (p < 0.001), and treatment of the control group (p = 0.02) may be the sources of heterogeneity. Longer (>8 weeks) medication and observation (Liu Z. et al., 2023) resulted in a more significant reduction in pain (MD = −3. [...]

Table footnotes: [...] ④ and ⑤ were reported every 4 weeks. Data were presented as number, the number of patients (%), mean ± standard deviation, or median (interquartile range). d Detailed information of the drugs, including manufacturer, batch number, and dosage, is provided in Supplementary Table S7.
Frontiers in Pharmacology, frontiersin.org

Overall response rate
[...] (Xu and Chen Z., 2017) conducted separate reports on the overall response rate and significant response rate for short-term and long-term durations. Both sets of data were included. The TGP group demonstrated a higher overall response rate compared to the control group (RR = 1.18, 95% CI = 1.04 to 1.33, p = 0.008, I² = 78%; Figure 3B). Sensitivity analysis indicated good stability (Supplementary Figure S1B). After excluding the study by Wang (Wang et al., 2013), the heterogeneity of the results significantly decreased (I² = 22%), as shown in Supplementary Table S5. After reexamining this study, we found no unique intervention measures or outcome evaluation criteria, and there was no high risk of bias. Therefore, we decided against excluding this study. Subgroup analysis demonstrated that neither medication duration (p = 0.43), observation duration (p = 0.61), treatment of the control group (p = 0.82), nor efficacy evaluation criteria (p = 0.60) were the sources of heterogeneity (Table 2).

Significant response rate
The TGP group demonstrated a higher significant response rate (RR = 1.72, 95% CI = 1.38 to 2.14, p < 0.001; Figure 3C). Despite the absence of significant statistical heterogeneity (I² = 48%), we opted for a random-effects model considering the varying treatment measures and efficacy evaluation criteria across studies. The sensitivity analysis confirmed the stability of the results (Supplementary Figure S1C). The difference in significant response rate between the TGP group and the control group was smaller when the control group used both vitamins and thalidomide (Xu and Chen Z., 2017; Chen X. and Zhang H. L., 2021) (RR = 1.49, 95% CI = 1.22 to 1.81, p < 0.001) than when the control group used vitamins (and minerals) (Tao, 2012; Wang et al., 2013; Su and Nong, 2014; Yao et al., 2017; Yan and Zhang H., 2019; Sun, 2020) (RR = 2.04, 95% CI = 1.62 to 2.55, p < 0.001), with a p-value of 0.04 for the difference between subgroups. See Table 2.

FIGURE 2 Risk of bias.

Number of ulcers
Three studies reported the number of ulcers. However, Liu's study (Liu Z. et al., 2023) employed a totally different calculation method for the number of ulcers compared to the other two studies, making it impossible to convert and combine the data. The remaining two studies (Wang et al., 2013; Xu and Chen Z., 2017) demonstrated that TGP's ability to reduce the number of ulcers did not significantly differ from the control group (MD = −1.66, 95% CI = −3.60 to 0.28, p = 0.09, I² = 95%; Figure 3G). Xu and Chen Z. (2017) did not report the specific observation time. Similar to the results for the interval, Wang's study (Wang et al., 2013) indicated a reduction in the number of ulcers within 0-24 weeks (p < 0.01), while Liu Z. et al. (2023) showed a reduction in the number of ulcers only in the 25-26 weeks (p < 0.001). See Supplementary Table S6 for details.

Serum inflammatory factors
TGP was found to significantly decrease serum TNF-α levels (MD = −17.51, 95% CI = −19.25 to −15.78, p < 0.001, I² = 99%; Figure 3H), while its effect on IL-2 was not significant (MD = 69.42, 95% CI = −65.10 to 203.93, p = 0.31, I² = 100%; Figure 3I). Although each individual study demonstrated the efficacy of TGP in reducing serum inflammatory factors, a notable disparity in the levels of serum inflammatory factors was observed between Chen's study (Chen X. and Zhang H. L., 2021) and other studies (Xu and Chen Z., 2017; Yan and Zhang H., 2019). This dissimilarity could potentially account for the significant heterogeneity and the absence of statistical significance. However, we were unable to pinpoint a specific reason for this disparity based on the methodology employed.

Safety outcomes
Two studies (Su and Nong, 2014; Yan and Zhang H., 2019) did not report adverse reactions, while five studies indicated that taking TGP was associated with a higher incidence of abdominal symptoms, primarily characterized by increased stool frequency, loose stools, or diarrhea. Among them, four studies (Tao, 2012; Wang et al., 2013; Sun, 2020; Liu Z. et al., 2023) reported a specific number of individuals experiencing adverse reactions, suggesting a significantly elevated likelihood of abdominal symptoms in the TGP group compared to the vitamin, mineral, and placebo groups (RR = 3.27, 95% CI = 1.62 to 6.60, p < 0.001, I² = 0%; Figure 3J). The abdominal symptoms disappeared when patients discontinued or reduced the dosage of TGP.
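As a worked check on how a reported risk ratio, confidence interval, and p-value relate, the z statistic can be back-calculated from the interval on the log scale. The sketch below is an illustration assuming a normal approximation and a symmetric log-scale interval; the function name is hypothetical, and the inputs are the abdominal-symptom values reported above.

```python
import math
from statistics import NormalDist


def p_from_ratio_ci(ratio, lower, upper, level=0.95):
    """Back-calculate z and a two-sided p-value from a ratio and its CI."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    # The CI width on the log scale equals 2 * z_crit * SE.
    se_log = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(ratio) / se_log
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# RR = 3.27, 95% CI = 1.62 to 6.60, as reported for abdominal symptoms.
z, p = p_from_ratio_ci(3.27, 1.62, 6.60)
print(round(z, 2), p)
```

Under these assumptions the z statistic is roughly 3.3 and the two-sided p-value falls below 0.001, in line with the reported result.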
Additionally, Wang et al. (2013) reported that among 50 patients receiving TGP, five individuals experienced symptoms of nausea and mild headache, while four individuals experienced decreased appetite. These symptoms resolved after 4-7 days without any intervention. Yao et al. (2017) reported the same adverse reactions but did not provide detailed information regarding the number of affected individuals, duration of symptoms, and persistence post-treatment cessation.
Table footnotes: [...] Not applicable. a Xu's study was excluded from the subgroup analysis because they did not report the specific observation time. The Interval in the subgroup analysis refers to the average number of oral ulcer-free days per month during the observation period.
In two studies (Xu and Chen Z., 2017; Chen X. and Zhang H. L., 2021) that used thalidomide in the control group, the combined use of TGP was found to reduce the incidence of abdominal symptoms, but the intergroup difference lacked statistical significance (RR = 0.50, 95% CI = 0.15 to 1.62, p = 0.25, I² = 0%; Figure 3K). Furthermore, TGP was also found to decrease dizziness and drowsiness caused by thalidomide.
Only one study (Liu Z. et al., 2023) reported that after 6 months of drug administration, the blood biochemical parameters of patients were examined, including alanine aminotransferase, aspartate aminotransferase, bilirubin, and albumin. The results showed no abnormalities, and there were no significant differences in the measured values compared to before the medication, suggesting that TGP did not affect liver function in that study. Other studies did not mention monitoring and evaluation of liver function.

Publication bias
Egger's tests uncovered the existence of publication bias in both the overall response rate (p < 0.001) and the significant response rate (p = 0.006). There was no conclusive indication of publication bias detected for the VAS (p = 0.201). Egger's linear regression test is shown in Supplementary Figure S3.

GRADE assessment
We conducted a GRADE assessment only for the clinical outcomes. All the outcomes exhibited a serious risk of bias. There was significant heterogeneity among the studies for the overall response rate and interval, but no reasonable explanation could be identified. Except for the VAS, other outcomes suffered from imprecision due to vague judgment methods, the absence of specific key quantitative values in the reporting, or small sample sizes and wide confidence intervals. Egger's tests detected publication bias in the overall response rate and significant response rate. The certainty of evidence for the VAS was rated as moderate, while for the rest of the outcomes, it was low or even very low. Figure 4 provides an overview of the evidence certainty.

Summary of findings
In comparison to the utilization of vitamins (and minerals), placebos, or no alternative treatments, the administration of TGP, either alone or in combination with the same drugs used in the control group, demonstrated superior pain relief, higher response rates, and accelerated ulcer healing. However, apart from the significant response rate, all the other results mentioned above exhibited heterogeneity. Subgroup analysis revealed that TGP treatment exceeding 2 months resulted in enhanced pain relief. Furthermore, the combined use of thalidomide and TGP significantly shortened the healing time of ulcers.
Two indicators, "interval" and "number of ulcers", represent long-term efficacy. Taking into account the limitations and heterogeneity of Xu's study, we conducted a meta-analysis after excluding it, which indicated a significant prolongation of the interval with TGP, while subgroup analysis suggested a significant extension only after medication for 6 months. Three articles reported the "number of ulcers". Similarly, it seems that a distinct decrease in the number of ulcers could be observed only after 6 months of TGP medication. The included literature collectively demonstrated that TGP was able to lower serum inflammation levels, although the analysis results for IL-2 did not show statistical significance.
It should be noted that TGP may induce abdominal symptoms and alterations in stool characteristics, which return to normal after discontinuation. On the other hand, TGP might reduce the incidence of adverse reactions associated with thalidomide and accelerate the ulcer healing facilitated by thalidomide.
Publication bias and issues such as small sample sizes diminished our confidence in these results. The GRADE assessment indicated that TGP's effect on alleviating pain was relatively reliable, while the evidence grade for the remaining outcomes was low or even very low.

Strengths and limitations
We employed multiple outcomes and conducted various subgroup analyses based on specific intervention measures and durations, demonstrating the efficacy and safety of TGP in treating RAU. However, considering the various limitations, we interpret these results with caution, and further studies are warranted to validate these findings. To our knowledge, this is the first meta-analysis examining the use of TGP in the treatment of RAU.
Inevitably, several limitations should be acknowledged in this meta-analysis. 1) All the included studies were conducted in China, limiting the generalizability of the findings to other populations. These studies employed different intervention measures and durations and reported varying outcomes, which limited the number of studies available for any given outcome and contributed to heterogeneity. This may potentially affect the scientific validity and reliability of the conclusions. 2) The administration of active ingredients by the patients remains unclear. Apart from Liu's study (Liu Z. et al., 2023), other studies did not report the actual dosage of medication taken by the patients. On the other hand, due to the lack of fingerprinting results for drug samples, all included studies were unable to ascertain the true dosage of active ingredients in the medication consumed by the patients. 3) Only one study provided detailed descriptions of the blinding, randomization, and specific outcome measurement methods, while the remaining studies did not employ blinding and lacked detailed methodological reporting. The reporting of outcomes was not sufficiently detailed, with only three studies reporting specific indicators such as the interval and number of ulcers (Tao, 2012; Xu and Chen Z., 2017; Sun, 2020). Other studies only reported effective rates, but these rates were based on evaluations using indicators such as the interval and number of ulcers. This led to imprecision. Ideally, the evaluation of therapeutic efficacy should be derived from daily patient records. Otherwise, relying on patient reports during follow-up visits would inevitably introduce recall bias. However, this method was only applied in the study by Liu et al. (Liu Z. et al., 2023).
Furthermore, the relatively high-quality study reported that TGP required more than 6 months of usage to achieve significant efficacy in VAS, interval, and number of ulcers, which differed from other study results, indicating substantial heterogeneity. This to some extent reduced the accuracy of the results of this meta-analysis. 4) Except for the VAS, the results of the overall response rate and significant response rate were susceptible to publication bias. Other outcomes included too few studies for a publication bias assessment to be recommended, but this did not imply the absence of publication bias. 5) Safety evaluation did not receive sufficient attention in most studies, with a lack of detailed reporting on the duration of adverse reactions, mitigation measures, and monitoring of liver and kidney function.
Considering the aforementioned limitations, more research is needed to further validate our results. Future trials should adhere to rigorous methodology, encompassing a calculated sample size, extended follow-up period, pre-registered protocol, and implementation of a blinded method. Besides, the reporting of results should align with the guidelines provided by SPIRIT-TCM Extension 2018 (Dai et al., 2019) and CONSORT-CHM Formulas 2017 (Cheng et al., 2017).

Conclusion
This meta-analysis indicates that TGP demonstrates potential effectiveness in the treatment of RAU, particularly in alleviating pain, with no severe adverse effects observed. However, due to the significant heterogeneity and low quality of evidence, further large-scale, high-quality studies are necessary to substantiate and confirm the clinical efficacy of TGP in the treatment of RAU.

Author contributions
[...] acquisition, Conceptualization. HL: Writing-review and editing, Validation, Supervision, Conceptualization, Funding acquisition.

Funding
The author(s) declare that financial support was received for the research, authorship, and/or publication of this article. This study was sponsored by the Fujian Provincial Health Technology Project (2022QNB027), the National Natural Science Foundation of China (U19A2005; jointly accomplished by YiH and HL), and the Clinical Research Foundation of Peking University School and Hospital of Stomatology (PKUSS-2023CRF304). The funders had no role in the study design, data collection and analysis, decision on publication, or manuscript preparation.

FIGURE 1 Flow diagram of the study selection process.

TABLE 2 Subgroup analysis for outcomes.
