Warm needle acupuncture for osteoarthritis: An overview of systematic reviews and meta-analysis

Background Osteoarthritis (OA) is a chronic disease that is a major cause of pain and functional disability. Warm needle acupuncture (WA) therapy has been widely used to treat OA. This overview summarizes the evidence from systematic reviews (SRs) and assesses the methodological quality of previous SRs that evaluated the use of WA therapy for OA. Methods We searched electronic databases to identify SRs that evaluated the efficacy of WA therapy for OA. Two reviewers independently extracted data and assessed the methodological quality of the reviews according to the A Measurement Tool to Assess Systematic Reviews (AMSTAR 2) tool. The reporting quality was assessed using the Preferred Reporting Items for Systematic Reviews and Meta-Analysis 2020 (PRISMA 2020) guidelines. The quality of evidence was assessed according to the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) approach. Results Fifteen SRs were included in this study. WA therapy was more effective than control conditions for the treatment of OA. The results of the AMSTAR 2 tool showed that the methodological quality of all included studies was critically low. The items with the lowest scores were item 2 (reporting the protocol), item 7 (listing excluded studies and justifying the exclusions), and item 16 (including conflicts of interest). Regarding the PRISMA guidelines, 2 SRs exhibited greater than 85% compliance. The overall quality of evidence in the included SRs ranged from “very low” to “moderate.” Conclusion This overview shows that WA therapy was more effective than the control treatment for OA. However, the methodological quality of the reviews was low, indicating the need for improvements in the collection of evidence. Future studies are needed to collect high-quality evidence regarding the use of WA for OA. Systematic review registration https://www.researchregistry.com/, Research Registry (reviewregistry1317).


Introduction
Osteoarthritis (OA) is a common chronic disease whose main symptoms are joint stiffness, instability, and weakness. It usually occurs in middle-aged people (between 50 and 60 years of age) and is more common among women than men (1,2). The direct costs of OA amount to billions of US dollars per year (3,4). Therefore, treating OA is important for reducing pain in patients and alleviating the socioeconomic burden.
Traditional medicine has been used for thousands of years to treat numerous diseases, including to relieve pain and improve knee joint function in OA patients (5,6). Acupuncture is one of the options for treating OA (7,8). Warm needle acupuncture (WA) is a type of acupuncture combined with moxibustion (9). Heat is transmitted through the needle to the deep part of the acupoint, which helps reduce pain and improve function. Recently, the number of studies using WA for the treatment of musculoskeletal pain has increased, and the quality of the studies has gradually improved (5,10). A systematic review (SR) is performed on a particular topic to provide comprehensive and unbiased clinical evidence based on rigorous studies (11). One recent SR analyzed 66 randomized controlled trials (RCTs) and showed beneficial effects of WA for OA (12).
An overview of SRs is a method for compiling evidence and synthesizing the results of multiple SRs (13,14). The greater the amount of information gathered, the better the quality of evidence that can be provided for clinical work. Overviews of SRs on traditional Chinese medicine (TCM) for knee OA (15) and on acupuncture for knee OA (8,16) have been published recently and concluded that TCM generally appears to be effective for the treatment of knee OA. Nevertheless, the effectiveness of WA as a treatment for OA has not been thoroughly evaluated.
The purpose of this study was to summarize the efficacy of WA in the treatment of OA presented in SRs and to evaluate the methodological quality of the SRs.

Methods
We followed the Preferred Reporting Items for Overviews of Reviews (PRIOR) statement (17). This overview was registered in the Research Registry (reviewregistry1317) (18). Twelve databases, including OASIS, were searched from their inception to January 2023. The search terms were ("warm needle acupuncture" OR "wen zhen" OR "warm acupuncture" OR "warm needle moxibustion") AND ("osteoarthritis") AND ("systematic review" OR "Meta-analysis") in Korean, Chinese, and English. The search terms and the websites of the 12 databases are described in Supplementary 1.

Inclusion and exclusion criteria
Types of studies
SRs and meta-analyses of randomized controlled trials (RCTs) or quasi-RCTs that used WA for OA were included.

Population
Studies of participants diagnosed with OA were included. There were no restrictions regarding sex or age.

Intervention and comparators
Studies that used WA as an intervention to treat OA were included regardless of the type of comparator. Studies in which WA was combined with other therapies were also included.

Outcomes
SRs reporting patient health outcomes were included. Eligible SRs had to report data on at least one outcome evaluating the total treatment effect and clinical symptoms of interest.

Study selection and data extraction
Two reviewers (JHJ and TYC) independently screened the citations obtained during the search, and the full texts of potentially relevant SRs were retrieved and appraised for inclusion. One reviewer (JHJ) extracted the data using a standardized form. Two reviewers (JHJ and TYC) independently checked the extracted data, and any differences were resolved through discussion with two other authors (SP and MSL). The data extracted from the reviews included the first author, publication year, data search, number of trials included, interventions, comparators, outcomes, direction of effect, overall risk of bias, conclusion, and adverse events. An assessment of the methodological quality for each included SR was also conducted.

Overlap calculation of the reviews
The degree of overlap of primary studies among the SRs was assessed by creating a citation matrix of the SRs. We calculated the "corrected covered area" (CCA) index (19,20). The CCA measures overlap by dividing the frequency of repeated occurrences of index publications in other reviews by the product of the number of index publications and the number of reviews, reduced by the number of index publications. It was calculated as CCA = (N - r)/(rc - r), where N is the number of included publications in the evidence synthesis (the sum of the ticked boxes in the citation matrix), r is the number of rows (number of index publications), and c is the number of columns (number of reviews) (supplement overlap). Expressed as a percentage, a CCA of 0-5 can be considered a "slight overlap," 6-10 a "moderate overlap," 11-15 a "high overlap," and greater than 15 a "very high overlap."
Frontiers in Medicine 03 frontiersin.org
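The CCA calculation described in this subsection can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names and the toy citation matrix are hypothetical, the result is expressed as a percentage, and the handling of values that fall between the integer bands (for example 5.4) is an assumption.

```python
from typing import Sequence


def corrected_covered_area(matrix: Sequence[Sequence[int]]) -> float:
    """Return the CCA, as a percentage, for a citation matrix.

    Rows are index publications (primary studies) and columns are reviews;
    a cell is 1 if the review includes the publication and 0 otherwise.
    """
    r = len(matrix)                      # number of index publications (rows)
    if r == 0:
        raise ValueError("citation matrix is empty")
    c = len(matrix[0])                   # number of reviews (columns)
    if c < 2:
        raise ValueError("CCA needs at least two reviews")
    n = sum(sum(row) for row in matrix)  # N: sum of the ticked boxes
    return 100.0 * (n - r) / (r * c - r)


def overlap_label(cca_percent: float) -> str:
    """Map a CCA percentage to an overlap band (boundary handling is assumed)."""
    if cca_percent <= 5:
        return "slight"
    if cca_percent <= 10:
        return "moderate"
    if cca_percent <= 15:
        return "high"
    return "very high"


# Hypothetical example: 4 primary studies across 3 reviews.
toy_matrix = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 1, 0],
]
cca = corrected_covered_area(toy_matrix)  # (7 - 4) / (4 * 3 - 4) = 37.5%
print(f"CCA = {cca:.1f}% ({overlap_label(cca)} overlap)")
```

In the toy matrix, N = 7, r = 4, and c = 3, so the CCA is 3/8, or 37.5%, which falls in the "very high overlap" band.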

Methodological quality assessment
The quality of the included SRs was evaluated using A Measurement Tool to Assess Systematic Reviews 2 (AMSTAR 2) (21). The tool has 16 items. Each item was rated as sufficiently reported and performed (Yes), insufficiently reported (Partial Yes), or not reported (No). The overall confidence in the results of the review was rated as follows: critically low quality (more than one critical flaw with or without non-critical weaknesses), low quality (one critical flaw with or without non-critical weaknesses), moderate quality (more than one non-critical weakness), and high quality (zero or one non-critical weakness). The AMSTAR 2 tool was applied by two authors (JHJ and TYC). Disagreements were resolved by the other authors (SP and MSL).
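The overall rating rule stated above can be written as a short decision function. The following Python sketch is illustrative only; the function name and its two count parameters are hypothetical, and it assumes the numbers of critical flaws and non-critical weaknesses have already been determined from the 16 items.

```python
def amstar2_overall_rating(critical_flaws: int, noncritical_weaknesses: int) -> str:
    """Return the AMSTAR 2 overall confidence rating from the two counts."""
    if critical_flaws < 0 or noncritical_weaknesses < 0:
        raise ValueError("counts must be non-negative")
    if critical_flaws > 1:
        return "critically low"   # more than one critical flaw
    if critical_flaws == 1:
        return "low"              # exactly one critical flaw
    if noncritical_weaknesses > 1:
        return "moderate"         # no critical flaw, several non-critical weaknesses
    return "high"                 # no critical flaw, zero or one non-critical weakness


print(amstar2_overall_rating(critical_flaws=2, noncritical_weaknesses=3))
```

Critical flaws are checked first because a single critical flaw caps the rating at "low" regardless of how few non-critical weaknesses a review has.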

Reporting quality assessment
We used the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA 2020) checklist (22), which comprises 27 items, to evaluate SR reporting quality. Each item was answered with "Yes," "Partial yes," or "No." We reported the results as a ratio.
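One way to turn the item-level responses into a compliance ratio is sketched below in Python. The text does not state how "Partial yes" responses were counted, so this sketch assumes that only "Yes" counts as compliant; the function name and the example responses are hypothetical.

```python
from collections import Counter
from typing import Sequence

PRISMA_ITEMS = 27  # number of items in the PRISMA 2020 checklist


def prisma_compliance(responses: Sequence[str]) -> float:
    """Return the percentage of checklist items answered "Yes".

    Assumption: "Partial yes" and "No" are both treated as non-compliant.
    """
    if len(responses) != PRISMA_ITEMS:
        raise ValueError(f"expected {PRISMA_ITEMS} responses, got {len(responses)}")
    counts = Counter(r.strip().lower() for r in responses)
    return 100.0 * counts["yes"] / PRISMA_ITEMS


# Hypothetical review: 23 "Yes", 2 "Partial yes", 2 "No".
example = ["Yes"] * 23 + ["Partial yes"] * 2 + ["No"] * 2
print(f"{prisma_compliance(example):.1f}%")
```

Under this assumption, a review with 23 of 27 items rated "Yes" has a compliance of 85.2%.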

Certainty of evidence
The quality of evidence for the outcomes of the included SRs was evaluated using the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) tool (23,24). If the GRADE tool was not used in an SR, we evaluated the strength of evidence from the primary trials. Five GRADE categories influenced (i.e., downgraded or upgraded) the quality of evidence: risk of bias, inconsistency, indirectness, imprecision, and publication bias. The quality of evidence was rated as "high," "moderate," "low," or "very low." Evidence based on RCTs began as high quality. Two authors (JHJ and TYC) independently assessed the quality of evidence. Disagreements were resolved by discussion with a third author (MSL).
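The downgrading logic can be illustrated with a small Python sketch. It is a simplification under stated assumptions rather than the authors' procedure: evidence from RCTs starts at "high," each of the five categories may downgrade by 0, 1, or 2 levels, the result cannot fall below "very low," and upgrading is not modelled. The function and variable names are hypothetical.

```python
from typing import Mapping

GRADE_LEVELS = ["very low", "low", "moderate", "high"]
GRADE_DOMAINS = (
    "risk of bias",
    "inconsistency",
    "indirectness",
    "imprecision",
    "publication bias",
)


def grade_certainty(downgrades: Mapping[str, int], start: str = "high") -> str:
    """Return the GRADE level after applying per-domain downgrades."""
    for domain, levels in downgrades.items():
        if domain not in GRADE_DOMAINS:
            raise ValueError(f"unknown GRADE domain: {domain}")
        if levels not in (0, 1, 2):
            raise ValueError("each domain may downgrade by 0, 1, or 2 levels")
    index = GRADE_LEVELS.index(start) - sum(downgrades.values())
    return GRADE_LEVELS[max(index, 0)]  # floor at "very low"


# Hypothetical outcome downgraded once each for risk of bias and imprecision.
print(grade_certainty({"risk of bias": 1, "imprecision": 1}))
```

In this example, two single-level downgrades move RCT evidence from "high" to "low."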

Data synthesis and analysis
A narrative synthesis was provided because of the high heterogeneity. The results of the WA interventions in the included SRs were summarized narratively, and the direction of effects was determined. The summary covered the features of the intervention, methodological quality, and quality of evidence.

Results

Study selection
Searches of 12 databases identified 161 potentially relevant studies, of which 39 duplicates were removed. Of the remaining 122 studies, 93 were excluded because they were unrelated to the topic or were reviews, protocols, or RCTs. The full texts of the remaining 29 studies were retrieved. After the full texts were read, 15 SRs (12,25-38) were included in this overview. The details of the SR selection process are shown in Figure 1. The list of excluded studies and the reasons for exclusion are shown in Supplementary 2.

Total effective rate
Fourteen SRs (12,25-30,32-38) suggested that the total effective rate of WA alone or combined with other therapies in OA patients was superior to that in the control group. One SR (30), which had the largest sample size, included 66 RCTs with 6,231 patients; a comparison of WA or WA plus Western medicine (WM) versus control showed a greater effect in the intervention group than in the control group. In most studies, WA was effective for OA. However, two SRs (35,38) reported no significant differences between WA and electroacupuncture (EA).

Function
One SR (12) evaluated the effects of WA alone or WA plus WM on function compared with WM. In this SR, function improved significantly in the intervention group compared with the control group.

WOMAC total score
Seven SRs (28,31,32,34-36,38) reported the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) total score. The meta-analyses showed favorable effects of WA alone or combined with other therapies on the WOMAC total score. However, four SRs (28,31,32,34) failed to show that WA had superior effects compared with EA on the WOMAC.

Adverse events
Of the 15 SRs, five (12,27,30,33,35) mentioned adverse events. The main adverse events reported in the WA treatment groups were skin burns. Most of the RCTs included in the SRs reported no adverse events. Four SRs (12,30,33,35) reported that no serious adverse events occurred. One SR (27) indicated that the incidence of adverse events in the WA treatment groups was lower than that in the control groups, suggesting that WA is a safe therapy for OA.

Methodological quality of the included systematic reviews
The AMSTAR 2 results showed that the included SRs were of critically low, low, or moderate quality (Figure 2; Supplementary 4). Ten SRs (25-27, 29-31, 33, 36-38) were considered to have critically low quality, four SRs (28,32,34,35) were considered to have low quality, and one SR (12) was considered to have moderate quality. All of the SRs reported the inclusion of PICO components (item 1). None of the SRs provided a complete list of excluded studies with reasons (item 7). Some SRs were evaluated with a partial yes in three domains (e.g., items 4 and 8).
Seven domains (items 2, 4, 7, 9, 11, 13, and 15) of the AMSTAR 2 tool are critical domains. For item 2, 14 of the SRs (25-38) did not provide a registered protocol, and one SR (12) was registered with PROSPERO and published a protocol. For item 4, six SRs (12,28,30,31,34,35) searched core databases (PubMed, the Cochrane Library, and Embase) and related intervention databases. However, nine SRs (25-27, 29, 32, 33, 36-38) did not search the core databases. For item 7, none of the SRs listed the excluded studies or explained the reasons for exclusion. For item 9, 13 SRs (25-30, 32-37) described the risk of bias, one SR (38) reported it insufficiently, and one SR performed the assessment but did not describe the results. For item 11, all of the SRs performed a meta-analysis. For item 13, six SRs (12,28,33,35-37) took the risk of bias into account when discussing the results and drew their conclusions with caution. For item 15, all of the SRs investigated publication bias and analyzed its potential effects on the results of the review.

Report quality of included systematic reviews
To assess the reporting quality of the included SRs, we used the PRISMA 2020 checklist (22). Figure 3 shows the reporting quality assessment results of the included SRs. Item 1 (title), item 2 (abstract), item 4 (objectives), item 8 (selection process), item 19 (results of individual studies), item 20 (results of syntheses), and item 21 (reporting biases) were reported adequately (100%). Item 15 (methods used to assess certainty (or confidence) in the body of evidence for an outcome), item 22 (certainty of evidence), and item 24 (registration and protocol) were insufficiently reported. Overall, two SRs (12,34) exhibited over 85% compliance. The results are shown in Supplementary 5.

Certainty of evidence
We evaluated the quality of evidence for the outcomes extracted from the included SRs. Table 2 shows the level of evidence quality for the reported outcomes. The quality of evidence for outcomes evaluated by the GRADE approach ranged from very low to moderate (Supplementary 6). Risk of bias and imprecision mainly accounted for the downgrading. The quality of evidence was moderate for 5 outcomes (8.62%), low for 21 outcomes (36.21%), and very low for 32 outcomes (55.17%).
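The share of outcomes at each evidence level follows directly from the outcome counts. The Python sketch below shows the arithmetic, assuming that the three counts reported above cover all graded outcomes (58 in total); the function name is hypothetical.

```python
from typing import Dict, Mapping


def outcome_shares(counts: Mapping[str, int]) -> Dict[str, float]:
    """Return each count as a percentage of the total, rounded to 2 decimals."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no outcomes to summarize")
    return {level: round(100.0 * n / total, 2) for level, n in counts.items()}


# Outcome counts by GRADE level, as reported in this section.
grade_counts = {"moderate": 5, "low": 21, "very low": 32}
for level, share in outcome_shares(grade_counts).items():
    print(f"{level}: {share}%")
```

Recomputing the shares from the counts is a quick consistency check, since the three percentages should sum to approximately 100%.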

Discussion
This overview of SRs was intended to summarize the features and evaluate the methodological quality, reporting quality, and quality of evidence of the included SRs on the efficacy of WA in OA. Fifteen SRs reported that intervention groups using WA alone or WA plus other therapies showed symptom improvements compared with control groups (32,35,36,38). WA treatment was reported to be safer than control treatment, and serious adverse events did not occur; however, the safety evidence from the included reviews was insufficient because certain data were missing. Most of the SRs were associated with a high risk of bias, were rated moderate to very low with the GRADE approach, and were rated critically low with the AMSTAR 2 tool. Thus, it is not possible to draw a clear conclusion. Future research involving large sample sizes and high-quality studies is needed. Regarding reporting quality, only 2 SRs (12,34) exhibited over 85% compliance.
All included studies had average reporting quality according to the PRISMA 2020 checklist. Seven items (items 1, 2, 4, 8, 19, 20, and 21) were completely reported. Only two SRs exceeded 85% compliance, reaching 85.2% (34) and 100% (12). Most of the included SRs were on knee OA and were conducted and published in China. In future studies, appropriate use of the Consolidated Standards of Reporting Trials (CONSORT) (39) and PRISMA (22) checklists will improve the reporting quality of SRs and meta-analyses, which will reduce potential selection bias.
In nine SRs, the methodological quality was critically low because there were deficits in the critical items of the AMSTAR 2 tool, which included items 2 (registration protocol), 7 (list of excluded studies), and 16 (potential source of conflicts of interest). For item 2, only one SR (12) reported a registered protocol. Preregistration helps to promote transparency, minimize potential biases in reporting and reviewing, reduce duplication of effort among groups, and keep SRs up to date. For item 7, an exclusion list is recommended because, without it, authors can arbitrarily exclude RCTs that differ from their desired results (21). Nevertheless, because the AMSTAR 2 tool is more rigorous than the previous version, the evaluation results should be interpreted with the possibility in mind that the methodological quality of the published SRs was underestimated.
A major reason for downgrading the evidence with the GRADE tool was that most of the included SRs were assessed as having a risk of bias and inconsistency. The main reasons for these assessments were that randomization and blinding methods were not described and that heterogeneity was high.
This overview has some limitations. First, the SRs depended on RCTs published in China. The results of this overview may therefore not be applicable or generalizable to other settings. In the future, clinical research should be actively conducted in countries other than China so that WA treatment for OA can be used more widely. Second, the evaluation tools used (AMSTAR 2, PRISMA, and GRADE) are subjective. Two independent reviewers performed the evaluations, and the results were checked; nevertheless, their own judgment may have influenced the assessment of each item. Third, this overview used only AMSTAR 2 to evaluate the methodological quality of the SRs. Consequently, the risk of bias of the included SRs was not assessed. Future research should use the Risk of Bias in Systematic Reviews (ROBIS) tool (40) to evaluate risk of bias and the PRISMA checklist (22) to evaluate the reporting characteristics of the included SRs.
In conclusion, WA or WA plus other therapies was more effective than the control conditions. However, the methodological quality of most of the included systematic reviews was critically low. Therefore, future studies should report SRs according to reporting guidelines, such as the PRISMA 2020 checklist, to improve the methodological quality and quality of evidence. This overview will help improve the evidence-based treatment and acupuncture evaluation system and facilitate research conducted by clinicians and scientific researchers.

Data availability statement
The original contributions presented in the study are included in the article/Supplementary material; further inquiries can be directed to the corresponding author.

Author contributions
JJ and ML: conceptualization, methodology, investigation, and writing - original draft. JJ: software, visualization, and project administration. T-YC and SP: validation and writing - review and editing. JJ and T-YC: formal analysis and resources. SP and ML: data curation and supervision. ML: funding acquisition. All authors read and approved the final manuscript.

Funding
This research was supported by the Korea Institute of Oriental Medicine (KSN 2022210). The authors alone are responsible for the writing and content of the paper. The funder had no role in this study.