A Field Test of Major Value Frameworks in Chemotherapy of Nasopharyngeal Carcinoma—To Know, Then to Measure

Background: The European Society for Medical Oncology (ESMO) and the American Society of Clinical Oncology (ASCO) have independently developed frameworks to assess the benefits of different cancer treatment options, which have significant implications for health science and policy. We aimed to compare these frameworks in nasopharyngeal carcinoma. Methods: We identified all randomized controlled trials of systemic chemotherapies for nasopharyngeal carcinoma published up to April 5th, 2020. Trials were eligible if significant differences favoring the experimental group in a prespecified primary or secondary outcome were reported. Two assessors independently scored the trials and the final scores were determined by consensus. Results: Fifteen trials were included in the analysis. Five different toxicity grading criteria were applied to the 15 trials. Ten (66.7%) trials did not report grade 1–2 toxicities and eight (53.3%) did not report late toxicities. The number of acute toxicities reported was strikingly different (17 vs. 8) in two trials using the same regimen. All trials met the ESMO criteria for a high level of benefit. However, significant variations in ASCO scores between trials were observed (mean [standard deviation]: 38.9 [20.0]). Conclusions: The underreporting and inconsistent reporting of toxicities would significantly impair the assessment of value using any framework. Moreover, there is a concern that the ASCO framework generated highly inconsistent scoring for treatments that met the ESMO criteria for a high level of benefit. The anomalies identified in how the frameworks function may help guide their future improvement.


INTRODUCTION
The goal of any cancer therapy is to help patients live longer, live better, or both. In the clinic, oncologists and patients need to discuss the balance of benefit and toxicity associated with different treatment options in order to make the best decision for each patient. The European Society for Medical Oncology (ESMO) and the American Society of Clinical Oncology (ASCO) have proposed and updated frameworks to assess the value of cancer treatment options (1,2).
Nasopharyngeal carcinoma (NPC) is prevalent in Southern China, Southeast Asia, North Africa, the Middle East, and Alaska (3). Radiotherapy (RT) is the primary treatment for non-metastatic NPC. Multiple randomized controlled trials (RCTs) have shown that combining chemotherapy with RT improves outcome in loco-regionally advanced NPC. However, different sequences (induction, concurrent, adjuvant, and their combinations) and regimens of chemotherapy were used in these RCTs and controversy remains over which treatment option is optimal (4). In recurrent or metastatic NPC, chemotherapy is the mainstay of treatment and various regimens have been used in the clinic.
Recently, researchers have used the ESMO and ASCO frameworks to assess systemic therapies for cancers (5-8). However, to the best of our knowledge, no study has tested these frameworks in NPC. We applied the updated ESMO and ASCO value frameworks to RCTs investigating systemic chemotherapies in NPC.

METHODS
Literature Search
This systematic analysis aimed to include all relevant published trials on systemic chemotherapies in NPC. The following electronic databases were searched to identify potentially eligible trials: PubMed, Web of Science, and the Cochrane Central Register of Controlled Trials (CENTRAL). The search was supplemented by a manual search of the reference lists of primary studies, review articles, meta-analyses, and relevant books. To search PubMed and Web of Science, we adopted a search algorithm used in the latest individual patient data meta-analysis of chemotherapy in NPC (4). For CENTRAL, we used the Medical Subject Heading "nasopharyngeal neoplasms" to search for studies. No language or date restrictions were applied to the search, which was performed on April 5th, 2020.
The search algorithms were as follows:

Study Selection
The following criteria were applied to the selection of RCTs: 1) The trial reported significant differences favoring the experimental group in a prespecified primary or secondary outcome. Trials with "negative" results were excluded, as they were not assessable according to the frameworks. This is in accordance with the ESMO-Magnitude of Clinical Benefit Scale (MCBS) version 1.1, which states that only "adequately powered studies showing statistically significant improvement in the primary outcomes or secondary outcomes" should be scored (2). 2) At least 50% of trial participants had NPC. 3) At least 30 patients had been included in each arm. 4) The trial did not use split-course RT.
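The four selection criteria above can be read as a simple eligibility predicate. The sketch below is only an illustration of that reading; the `Trial` record and its field names are hypothetical and are not taken from the study's data file.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    """Hypothetical record holding only the fields needed for screening."""
    significant_benefit_for_experimental_arm: bool  # criterion 1
    npc_fraction: float          # share of participants with NPC, 0.0 to 1.0
    smallest_arm_size: int       # patients in the smallest trial arm
    uses_split_course_rt: bool   # criterion 4 excludes these trials


def is_eligible(trial: Trial) -> bool:
    """Return True if the trial satisfies all four selection criteria."""
    return (
        trial.significant_benefit_for_experimental_arm
        and trial.npc_fraction >= 0.5
        and trial.smallest_arm_size >= 30
        and not trial.uses_split_course_rt
    )
```

In practice, the screening was done by two authors reading each report, so a function like this would only record the outcome of that judgment rather than replace it.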
Two authors (YZ and XL) independently performed the literature search and study selection. Any inconsistencies were discussed until consensus was reached.

Frameworks
The updated ASCO-Value Framework (ASCO-VF) and ESMO-MCBS both quantify gains in overall survival (OS) or its surrogates (e.g., disease-free survival [DFS]) (1,2). In ASCO-VF, the hazard ratio (HR) is subtracted from one and the result is multiplied by 100 to derive a Clinical Benefit Score; in ESMO-MCBS, HRs and/or survival gains are linked to a particular grade in a pre-specified manner. For example, in the curative setting, a >5% improvement of OS at ≥3 years of follow-up translates to a grade of A. Both scales use different forms for treatments in the curative and palliative settings.
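As a minimal sketch of the arithmetic just described, the Clinical Benefit Score and the single ESMO-MCBS example rule quoted above could be written as follows. The function names are hypothetical, and the ESMO check covers only the one example rule given in the text, not the full evaluation form.

```python
def clinical_benefit_score(hazard_ratio: float) -> float:
    """ASCO-VF Clinical Benefit Score as described in the text: (1 - HR) x 100."""
    if hazard_ratio <= 0:
        raise ValueError("hazard ratio must be positive")
    return (1.0 - hazard_ratio) * 100.0


def meets_esmo_grade_a_os_example(os_gain_pct: float, follow_up_years: float) -> bool:
    """Check only the example rule quoted in the text for the curative setting:
    a >5% improvement of OS at >=3 years of follow-up."""
    return os_gain_pct > 5.0 and follow_up_years >= 3.0


# Illustrative (made-up) inputs: HR = 0.60 gives a score of about 40.
print(round(clinical_benefit_score(0.60), 1))
print(meets_esmo_grade_a_os_example(os_gain_pct=6.0, follow_up_years=5.0))
```

The contrast is visible in the return types: the ASCO-VF calculation yields a continuous number, whereas the ESMO-MCBS rule is a threshold that is either met or not.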
Toxicity and quality-of-life (QoL) data are used to adjust the scores or grades in both frameworks. For ASCO-VF, points are assigned to every "clinically meaningful toxicity" based on its frequency and severity (e.g., 2.0 points for every grade 3 or 4 toxicity with a frequency ≥5%). The percentage difference in the sum of toxicity points between the two regimens is then multiplied by −20 to obtain a Toxicity Score. If the test regimen is more toxic than the comparator, the Toxicity Score is negative, and vice versa. In the ESMO-MCBS, some prespecified severe toxicities are explicitly outlined, and the grade is reduced by 1 level if toxic effects meet any of these prespecifications (e.g., a statistically significant increase of toxic death rate >2%).
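The Toxicity Score calculation can be sketched as below. Two points are assumptions rather than statements from the text: the percentage difference is taken relative to the comparator's total toxicity points, and the result is clamped to the range −20 to 20. The function name is hypothetical.

```python
def toxicity_score(test_points: float, comparator_points: float) -> float:
    """Sketch of an ASCO-VF style Toxicity Score.

    Assumption: the percentage difference is (test - comparator) / comparator.
    Assumption: the score is clamped to [-20, 20].
    """
    if comparator_points <= 0:
        raise ValueError("comparator toxicity points must be positive")
    relative_difference = (test_points - comparator_points) / comparator_points
    score = -20.0 * relative_difference
    return max(-20.0, min(20.0, score))


# Illustrative (made-up) totals: a test regimen with 12 points against a
# comparator with 8 points is 50% more toxic, so the score is negative.
print(toxicity_score(12.0, 8.0))
```

With these assumptions, a test regimen that is less toxic than the comparator receives a positive score, matching the sign convention described above.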
Both frameworks award a bonus for a "tail of the survival curve effect." The ASCO-VF awards 16-20 bonus points if there is a 50% or greater improvement in the proportion of patients alive with the test regimen at the time point on the survival curve that is 2 × the median survival of the comparator regimen. The ESMO-MCBS requires an upgrade of 1 level if there is a long-term plateau in the survival curve. Final ASCO-VF scores, termed Net Health Benefit, are the sum of the Clinical Benefit Score, the Toxicity Score, and any bonus points (possible range −20 to more than 120 with bonus point allocation); ESMO-MCBS grades are ranked C, B, or A (for the curative setting), and 1, 2, 3, 4, or 5 (for the palliative setting). ESMO-MCBS defines "substantial clinical benefit" as a grade of 4, 5, B, or A, whereas ASCO-VF includes no explicit definition.
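The way the final scores are assembled can be illustrated with a short sketch. The function names are hypothetical, the "50% or greater improvement" is assumed to be a relative improvement in the proportion alive, and the bonus value is passed in by the caller because the text gives only a 16-20 range.

```python
def qualifies_for_tail_bonus(alive_test: float, alive_comparator: float) -> bool:
    """Both inputs are proportions alive at 2 x the comparator's median survival.

    Assumption: "50% or greater improvement" means a relative improvement.
    """
    if alive_comparator <= 0:
        raise ValueError("comparator proportion must be positive")
    return (alive_test - alive_comparator) / alive_comparator >= 0.5


def net_health_benefit(clinical_benefit: float, toxicity: float, bonus: float = 0.0) -> float:
    """Net Health Benefit: Clinical Benefit Score + Toxicity Score + bonus points."""
    return clinical_benefit + toxicity + bonus


def esmo_substantial_benefit(grade: str) -> bool:
    """ESMO-MCBS 'substantial clinical benefit': grade 4, 5, B, or A."""
    return str(grade).upper() in {"A", "B", "4", "5"}


# Illustrative (made-up) values only.
print(net_health_benefit(clinical_benefit=40.0, toxicity=-10.0, bonus=20.0))
print(esmo_substantial_benefit("B"))
```

The sketch also makes the asymmetry noted in the text concrete: the ESMO-MCBS grade maps onto a yes/no definition of substantial benefit, while the Net Health Benefit is a number with no stated cut-off.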

Data Abstraction, Scoring, and Statistical Analysis
Firstly, two assessors (YZ, XL) independently scored the trials according to both frameworks. Secondly, the two assessors discussed the results and determined the final scores by consensus. Bias in trials was evaluated by one assessor (XL) using the Cochrane risk of bias assessment tool (9). Data were collected in an Excel file designed for this study. Descriptive statistics were used to summarize the scoring.

RESULTS
The electronic and manual search identified 195 references after the removal of duplicates. After screening, 22 references for 15 trials were eligible (Figure 1). Only one study was excluded because of insufficient information to assign a score for either framework. The median sample size of the 15 included trials was 284. Thirteen trials investigated chemotherapy in the curative treatment of non-metastatic NPC, including four trials comparing concurrent chemoradiation (CCRT) plus adjuvant chemotherapy (AC) vs. RT alone, four trials comparing CCRT vs. RT alone, and five trials comparing induction chemotherapy (IC) plus CCRT vs. CCRT (Table 1). Two trials investigated palliative treatment of recurrent or metastatic NPC: one compared cisplatin and gemcitabine vs. cisplatin and fluorouracil and the other compared cisplatin and fluorouracil every 2 weeks vs. every 4 weeks (Table 2).
We found significant variation in the reporting of toxicities. Among the 15 trials, five different toxicity grading criteria were

DISCUSSION
To the best of our knowledge, this study is the first to test the ESMO and ASCO frameworks in trials evaluating chemotherapy in NPC. We found significant variation in the reporting of toxicities, including different grading criteria and deficiencies in the reporting of grade 1-2 and long-term toxicities. These results are consistent with previous evidence suggesting that the reporting of toxicity data from RCTs needs improvement (32). The underreporting and inconsistent reporting of toxicities would significantly impair the assessment of value using any framework in any setting, not only in NPC. Compliance with established guidance on toxicity reporting and sharing of clinical trial data may help mitigate this problem (33,34). Moreover, subjective toxicities are at high risk of underreporting by physicians, even when prospectively collected within randomized trials (35). This strongly supports the need to incorporate patient-reported outcomes and QoL data into toxicity reporting in clinical trials (36).
Our two assessors were in perfect agreement in the ESMO-MCBS analysis except in the assessment of one trial in the palliative setting (30). In the non-curative setting, the ESMO-MCBS requires upgrading one level if the new treatment is associated with "statistically significantly less grade 3-4 toxicities impacting on daily wellbeing" compared with the standard therapy. In the trial comparing cisplatin and gemcitabine vs. cisplatin and fluorouracil in recurrent or metastatic NPC (SYSUCC-GP), the overall incidences of grade 3-4 toxicities were not significantly different between the two arms (43.3 vs. 35.8%, p = 0.18), while the experimental arm had significantly less grade 3-4 mucosal inflammation (0 vs. 14.5%, p < 0.001) (30). Our two assessors differed on whether this met the criterion for an upgrade. After discussion, they decided that no upgrade should be applied. More detailed guidance on this criterion might help avoid such discrepancies in the future.
For the ASCO framework, however, wide variation occurred between the two assessors in the initial independent analysis, mainly due to different interpretations of "clinically meaningful toxicity." The ASCO-VF defines "clinically meaningful toxicities" as toxicities other than those that are laboratory abnormalities only, which might be ambiguous and prone to different interpretations. For example, grade 1-2 hyponatremia may be symptomless while grade 3-4 hyponatremia might cause symptoms like fatigue. A clearer definition would facilitate more consistent scoring, as was also suggested by de Hosson et al. (6).
Our results showed that both frameworks could be applied to trials of chemotherapy in NPC, but with differing consistency. Trials included in this study achieved highly consistent grades using the ESMO-MCBS. The ASCO-VF, however, gave very inconsistent and disparate scores. For example, in the curative setting, all except one trial met the ESMO criteria for the highest level of benefit (grade A), while significant variations were found in the ASCO-VF scoring of Clinical Benefit, Toxicity, and the final Net Health Benefit.
An important difference between these two frameworks is that the ESMO-MCBS places increasing weight on the toxicity profile as the treatment intent moves from curative to increasingly palliative settings, while the calculation of the Toxicity Score in the ASCO-VF is the same regardless of curative or palliative setting. For example, in the curative setting, for a new treatment regimen that improved OS by >5%, the ESMO-MCBS would assign a grade of A regardless of toxicity, while the ASCO-VF would take toxicity into consideration. In theory, the ASCO approach might be more reasonable. However, this is also part of the reason why the ASCO-VF scores varied significantly. On the other hand, unlike the ESMO-MCBS, the ASCO-VF does not mention grade 5 toxicity (treatment-related death), which we believe is of vital importance in assessing toxicities.
For ASCO-VF, each toxicity is assigned a score between 0.5 and 2.0, based on grade and frequency. However, these points are arbitrary and not intuitive, and they may have obscured the actual differences in toxicity. For example, in the PWHQEH-94 trial comparing CCRT vs. RT alone, grade 3-4 stomatitis was observed in 48.9 and 35.8% of patients in the CCRT and RT-only groups, respectively, with a significant difference of 13.1% (p = 0.002) (18). However, when graded using the ASCO criteria, both groups scored two points, despite the apparent clinically relevant difference. In the original ASCO framework, the HR for survival was also assigned a score of 1-5 on the basis of the magnitude of difference (e.g., 5 for an HR < 0.2), whereas in the revised framework, a continuous scoring system is used to avoid arbitrary cut-offs (1). In the same vein, a continuous scoring system for toxicity might more accurately reflect the absolute difference in toxicity, as shown in Table 3. Such calculations could be easily performed once the framework is converted to a software application, as planned by ASCO (1).
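The stomatitis example can be reproduced with a short sketch that contrasts threshold-based points with a continuous alternative. Only the 2.0-point tier (grade 3-4, frequency ≥5%) is stated in the text; the other tiers and their cut-offs in `threshold_points` are filled in as assumptions. The `continuous_points` function is a purely hypothetical scheme chosen for illustration and is not the calculation presented in Table 3.

```python
def threshold_points(is_grade_3_4: bool, frequency_pct: float) -> float:
    """Threshold-based toxicity points in the 0.5 to 2.0 range.

    Stated in the text: 2.0 points for grade 3-4 toxicity with frequency >= 5%.
    Assumed for illustration: the 1.5, 1.0, and 0.5 tiers and their cut-offs.
    """
    if is_grade_3_4:
        return 2.0 if frequency_pct >= 5.0 else 1.5
    return 1.0 if frequency_pct >= 10.0 else 0.5


def continuous_points(is_grade_3_4: bool, frequency_pct: float) -> float:
    """Hypothetical continuous scheme: severity weight x observed frequency."""
    weight = 2.0 if is_grade_3_4 else 1.0
    return weight * frequency_pct / 100.0


# Grade 3-4 stomatitis in PWHQEH-94: 48.9% (CCRT) vs. 35.8% (RT alone).
for label, pct in (("CCRT", 48.9), ("RT alone", 35.8)):
    print(label, threshold_points(True, pct), round(continuous_points(True, pct), 3))
```

Under the threshold rule both arms receive the same 2.0 points, so the 13.1% absolute difference disappears, whereas any scheme that scales with frequency keeps the two arms distinguishable.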
The study had some limitations. Firstly, only trials reporting significant results favoring the experimental arm were assessable using the frameworks. However, our study was a field test of the ESMO and ASCO frameworks in systemic chemotherapy of NPC and was not aimed at determining the value of different treatment options. A balanced value assessment requires the consideration of all relevant studies, whether they report significant findings or not, which was beyond the scope of this study. Secondly, our research was limited to RCTs investigating systemic chemotherapy in NPC; the applicability of value frameworks to other treatments or other diseases might be different. Nevertheless, it is likely that similar situations apply to other settings. Thirdly, no trials included in the current study reported QoL data, so it is not clear how such data would affect value assessments. Finally, only 6 of 13 trials in the curative setting used intensity-modulated radiotherapy, which has become the standard of care in NPC (37).
In conclusion, significant variations in toxicity reporting were found in trials evaluating chemotherapy in NPC. Both frameworks could be applied to the systemic chemotherapy of NPC. However, there is concern that the ASCO-VF generated highly inconsistent scoring for treatments that met the ESMO criteria for a high level of benefit. The successful future application of value frameworks requires consistent reporting of toxicities as well as iterative refinement and intergroup alignment of the different frameworks.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

AUTHOR CONTRIBUTIONS
Study design: YZ and JM. Data collection: YZ, XL, and Y-QL. Revision of the manuscript, data analysis, and interpretation: All authors. Writing of the manuscript: YZ. Statistical analysis: YZ, XL, and Y-QL. The funders of the study had no role in study design, data collection, data analysis, data interpretation, or writing of this report.