Cost-effectiveness of first-line nivolumab-ipilimumab combination therapy for advanced non-small cell lung cancer: A systematic review and methodological quality assessment

To assess the methodological quality of cost-effectiveness analyses (CEA) of nivolumab in combination with ipilimumab in the first-line treatment of patients with recurrent or metastatic non-small cell lung cancer (NSCLC), whose tumors express programmed death ligand-1, with no epidermal growth factor receptor or anaplastic lymphoma kinase genomic tumor aberrations, we conducted a systematic literature review. PubMed, Embase, and the Cost-Effectiveness Analysis Registry were searched, in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines. The methodological quality of the included studies was assessed by the Philips checklist and the Consensus Health Economic Criteria (CHEC) checklist. A total of 171 records were identified. Seven studies met the inclusion criteria. Cost-effectiveness analyses differed substantially due to the applied modeling methods, sources of costs, health state utilities, and key assumptions. Quality assessment of the included studies highlighted shortcomings in data identification, uncertainty assessment, and methods transparency. Our systematic review and methodology assessment revealed that the methods of estimation of long-term outcomes, quantification of health state utility values, estimation of drug costs, the accuracy of data sources, and their credibility have important implications for the cost-effectiveness outcomes. None of the included studies fulfilled all of the criteria reported in the Philips and the CHEC checklists. To compound the economic consequences presented in this limited number of CEAs, ipilimumab's drug action as a combination therapy poses significant uncertainty. We encourage further research to address the economic consequences of these combination agents in future CEAs and the clinical uncertainties of ipilimumab for NSCLC in future trials.


Introduction
Platinum-based doublet chemotherapy was historically the standard first-line treatment for patients with recurrent or metastatic non-small cell lung cancer (NSCLC), whose tumors lack epidermal growth factor receptor mutations or anaplastic lymphoma kinase translocations. More recently, pembrolizumab monotherapy for patients with a high level of tumor programmed cell death ligand-1 (PD-L1) expression ≥1% became the standard first-line therapy for advanced NSCLC without treatable driver mutations (1-3). Nivolumab and ipilimumab are monoclonal antibodies that bind to programmed death-1 (PD-1) and cytotoxic T-lymphocyte antigen 4 (CTLA-4) receptors, respectively, to restore T-cell activity against tumor cells. In 2019, the CheckMate 227 Phase 3 trial showed improved progression-free and overall survival with this dual checkpoint inhibition in recurrent or metastatic NSCLC (4). The CheckMate 227 trial results indicated that nivolumab in combination with ipilimumab was associated with improved survival in pre-specified subgroups, including PD-L1 ≥ 1% and PD-L1 < 1% (4). In 2021, the CheckMate 9LA Phase 3 trial, which stratified patients by PD-L1 ≥ 1% and < 1%, showed that nivolumab in combination with ipilimumab plus two cycles of chemotherapy improved progression-free and overall survival, compared with four cycles of chemotherapy (5). The United States (US) Food and Drug Administration (FDA) approved nivolumab in combination with ipilimumab for patients with PD-L1 ≥ 1% (6), and the National Comprehensive Cancer Network panel extended their use for patients with PD-L1 < 1% (7). Nivolumab plus ipilimumab with two cycles of chemotherapy was also approved by the US FDA for patients regardless of PD-L1 expression levels (8).
Although several studies have shown single-agent immune checkpoint inhibitors with or without chemotherapy to be cost-effective (9-13), double-agent immunotherapy combinations may not be deemed cost-effective, given their high price tags. To assess the economic value of nivolumab in combination with ipilimumab, we conducted a systematic literature review of model-based cost-effectiveness analyses (CEA) in the first-line treatment of patients with recurrent or metastatic NSCLC. To evaluate the methodological quality of the published CEAs, we used the Philips checklist (14) and the Consensus Health Economic Criteria (CHEC) checklist (15) to critically review the applied methods and modeling efforts in this setting.

Search strategy
A systematic literature review was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines (16). We searched PubMed, Embase, and the Cost-Effectiveness Analysis (CEA) Registry database. The searches were built using the Population Intervention Comparison Outcome (PICO) framework (Supplementary Tables S1A-C). Each search was limited to English-language studies of human subjects. No date restrictions were applied. The search strategy included MeSH terms in PubMed and Emtree terms in Embase, as well as free-text terms in the CEA Registry database (Supplementary Table S1). Manual reference checks supplemented database searches. All searches were finalized on January 5, 2022.

Study selection
Studies accepted at the title-abstract screening stage were retrieved in full text for review. Two reviewers screened all studies and resolved any issues of discrepancy through consensus or consultation with a third reviewer. Studies were included if they fulfilled the eligibility criteria. The process of selection and inclusion and exclusion of articles was recorded in both Rayyan (https://www.rayyan.ai/cite) and Microsoft Excel. This method provides transparency regarding all selection steps and assures reproducibility. The details of the inclusion and exclusion criteria are presented in Table 1.

Data extraction
An evidence table (Table 2) was created according to the PICO framework to extract data on the study author, year, country, population, clinical trial, PD-L1 expression, intervention, comparator, time horizon, study perspective, incremental outcomes (QALYs and costs), incremental cost-effectiveness ratio (ICER), as well as the authors' stated conclusions.

Quality assessment of the methodology
The quality assessment of the included studies was performed by using the Philips checklist (14) and the Consensus Health Economic Criteria (CHEC) checklist (15). The quality of the methodology was assessed by one reviewer and validated by a second reviewer. Any issues of discrepancy were resolved through consensus or by consultation with a third reviewer. Table S2). Table 3 shows the quality assessment results based on the CHEC checklist. Table 4 shows the quality assessment results based on the Philips checklist. A schematic representation of the outcomes and differences between these checklists is presented in the Supplementary Appendix (Supplementary Figures S1, S2).

Included CEAs and study characteristics
In the first-line treatment of advanced NSCLC, the cost-effectiveness of nivolumab-ipilimumab and/or nivolumab-ipilimumab plus two cycles of chemotherapy was compared with standard chemotherapy. According to the PICO framework (see Table 2), in the CEAs (17-20, 22, 23) that sourced the CheckMate 227 clinical trial (Population), nivolumab (3 mg/kg every two weeks) in combination with ipilimumab (1 mg/kg every six weeks) (Interventions) was compared with platinum-doublet chemotherapy every three weeks for up to four cycles (Comparator) (4). In the CEAs (21, 23) that sourced the CheckMate 9LA clinical trial (Population), nivolumab (360 mg every three weeks) and ipilimumab (1 mg/kg every six weeks) were combined with histology-based, platinum-doublet chemotherapy (every three weeks for two cycles) (Interventions), and were compared with chemotherapy alone (every three weeks for four cycles) (Comparator) (5). Study outcomes in all CEAs were expressed in incremental costs, QALYs, and ICERs (see Table 2 for details on the included study Outcomes and conclusions).
Estimation of long-term outcomes showed variability among the included CEA studies due to: (i) variation in the extraction of data points of survival curves from the CheckMate 227 and the CheckMate 9LA trials, (ii) calibration of the probability of progressive disease to death at each model cycle (i.e., intervals of one week (18), three weeks (21), six weeks (19, 20, 22, 23), and one month (17)) to fit the overall survival curve, and (iii) variation in statistical techniques in fitting and extrapolating survival functions. Age-specific mortality from other causes was estimated based on the US life tables (25).
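To illustrate why cycle length matters in point (ii), the following is a minimal sketch, assuming a constant hazard of death from progressive disease. The annual rate of 0.60 and the function name are hypothetical and are not taken from any of the reviewed models; the sketch only shows how a per-cycle transition probability would need to be rescaled for each of the cycle lengths listed above.

```python
import math


def rate_to_cycle_probability(annual_rate: float, cycle_years: float) -> float:
    """Convert a constant annual hazard rate into a per-cycle transition probability."""
    return 1.0 - math.exp(-annual_rate * cycle_years)


# Hypothetical annual hazard of death from progressive disease (illustrative only).
ANNUAL_RATE = 0.60

# Cycle lengths used across the reviewed models, expressed in years.
CYCLE_LENGTHS = {
    "1 week": 1 / 52,
    "3 weeks": 3 / 52,
    "6 weeks": 6 / 52,
    "1 month": 1 / 12,
}

for label, years in CYCLE_LENGTHS.items():
    p = rate_to_cycle_probability(ANNUAL_RATE, years)
    # Compounding the per-cycle probability over one year recovers the same
    # one-year survival, whichever cycle length is chosen.
    one_year_survival = (1.0 - p) ** (1.0 / years)
    print(f"{label:>8}: p per cycle = {p:.4f}, 1-year survival = {one_year_survival:.4f}")
```

Under this constant-hazard assumption the cycle length alone does not change long-term survival; differences between models would arise when probabilities are calibrated per cycle without such rescaling, or when the hazard is not constant.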

Costs and their sources
All CEAs included the United States (US) healthcare, payer or societal perspectives, and expressed costs in US dollars (years ranging from 2018 to 2021). In one study (19), the authors did not specify a year for the included costs. In another study (18), the rationale for the cost year of 2018 was not included. In this study (18), the authors indicated that the vial prices of nivolumab-ipilimumab were discounted by 17%, based on a previously published study (26), and the cost of chemotherapy was $24,437 per patient regardless of histology (27). In the same study (18), the cost of maintenance chemotherapy was $5,887 for nonsquamous NSCLC (27). All remaining sources for drug prices were obtained from the US Medicare and Medicaid Services (25), literature, and publicly available sources (28). Medical consumer price indices (29) were used to express costs in US dollars.
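As an illustration of how a price index is typically used to bring costs from different years onto a common cost year, the sketch below rescales a cost by the ratio of two index values. The index values are placeholders, not actual medical consumer price index figures, and the chemotherapy cost from study (18) is treated here as being in 2018 dollars, which is an assumption based on that study's stated cost year.

```python
def inflate_cost(cost: float, index_from: float, index_to: float) -> float:
    """Rescale a cost from its source year to a target year using a price index."""
    if index_from <= 0:
        raise ValueError("index_from must be positive")
    return cost * index_to / index_from


# Placeholder index values (hypothetical, NOT actual medical CPI figures).
MEDICAL_CPI = {2018: 100.0, 2021: 108.0}

# Chemotherapy cost per patient reported in study (18), assumed to be 2018 US dollars.
chemo_cost_2018 = 24_437.0
chemo_cost_2021 = inflate_cost(chemo_cost_2018, MEDICAL_CPI[2018], MEDICAL_CPI[2021])

print(f"2018 cost: ${chemo_cost_2018:,.2f}")
print(f"2021 cost under the placeholder index: ${chemo_cost_2021:,.2f}")
```

When a study does not state its cost year, as noted for study (19), this adjustment cannot be reproduced, which limits comparison of its costs with those of the other CEAs.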

Cost-effectiveness thresholds
For the US setting, two studies used a willingness-to-pay (WTP) threshold of $100,000 per QALY (17, 18), four studies used a WTP of $150,000 per QALY (19-21, 23), and one study included both thresholds (22). In addition, one study included the perspective of the Chinese healthcare system and used a WTP of $27,351 per QALY (18).
The ICERs reported in the included studies that were deemed cost-effective were as follows. In one study (19), for patients with PD-L1 expression levels ≥50% and ≥1% or a high Tumor Mutational Burden (TMB), the ICERs were $107,404 and $133,732 per QALY gained, respectively. In another study (18), the ICER was $75,871 per QALY gained (for the US setting). However, the credibility of the data sources in this study (18) is questionable and poses a challenge to accurately comparing study outcomes. For the US setting, the outcomes of this study (18) should therefore be interpreted with caution.
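The decision rule behind these statements can be sketched as follows: the ICER is the incremental cost divided by the incremental QALYs, and an intervention is deemed cost-effective when the ICER does not exceed the WTP threshold. The function names are illustrative; the ICER values are those reported above for studies (18) and (19), and the incremental cost and QALY values in the final lines are hypothetical.

```python
def icer(delta_cost: float, delta_qaly: float) -> float:
    """Incremental cost-effectiveness ratio: extra cost per extra QALY gained."""
    if delta_qaly <= 0:
        raise ValueError("ICER is only interpreted here for positive incremental QALYs")
    return delta_cost / delta_qaly


def is_cost_effective(icer_value: float, wtp: float) -> bool:
    """An intervention is deemed cost-effective if its ICER does not exceed the WTP."""
    return icer_value <= wtp


# ICERs (US$ per QALY gained) as reported in the text for studies (19) and (18).
REPORTED_ICERS = {
    "study (19), first ICER": 107_404,
    "study (19), second ICER": 133_732,
    "study (18), US setting": 75_871,
}

# The two US thresholds used across the included studies.
THRESHOLDS = (100_000, 150_000)

for label, value in REPORTED_ICERS.items():
    verdicts = ", ".join(
        f"WTP ${wtp:,}: {'yes' if is_cost_effective(value, wtp) else 'no'}"
        for wtp in THRESHOLDS
    )
    print(f"{label}: ${value:,} per QALY -> {verdicts}")

# Hypothetical increments, only to show the ratio itself.
example = icer(delta_cost=120_000, delta_qaly=0.8)
print(f"Hypothetical ICER: ${example:,.0f} per QALY")
```

The comparison makes visible that the conclusion for a given ICER depends on which of the two US thresholds a study adopts.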

Sensitivity and/or subgroup analyses
For patients with PD-L1 levels <1%, ≥1%, and ≥50%, the ICERs were $332,100, $440,100, and $375,700 per QALY gained, respectively (17). The most influential model inputs were drug acquisition costs, duration of combination immunotherapy, patients' body weight, and survival hazard ratio. In one study (19), the analysis of patients with a high TMB resulted in an ICER of $69,182 per QALY gained compared with chemotherapy. In this study (19), for patients with PD-L1 < 1%, nivolumab-ipilimumab combination therapy could be deemed cost-effective if the cost of nivolumab were to be discounted by 21% or the cost of ipilimumab were to be discounted by 24%. Another study (20) reported from the US perspective that the ICERs were $143,434, $196,507, and $212,111 per QALY gained in patients with PD-L1 < 1%, ≥1%, and ≥50%, respectively. The authors of this study calculated that the cost of nivolumab should be discounted by 20% in order to have an ICER below the WTP threshold (20). In one study (22), the authors showed that when patients' weight increased to 140 kg or the overall survival hazard ratio increased to 0.84, the ICER was above the WTP threshold of $150,000 per QALY. Finally, one study (21) showed that patients with an Eastern Cooperative Oncology Group score of 0 and central nervous system metastases favored nivolumab-ipilimumab plus chemotherapy, with more than a 50% probability of being cost-effective compared with chemotherapy. However, the cost-effectiveness probability was extremely low for subgroups of patients with an unfavorable hazard ratio of overall survival, such as those older than 75 years, with squamous NSCLC, and with liver metastases (21). In this study, when the cost of nivolumab was reduced by at least 28%, nivolumab-ipilimumab plus chemotherapy was cost-effective compared with chemotherapy alone at a threshold of $150,000 per QALY (21). Table 3 shows the methodological quality assessment results based on the CHEC checklist.
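The price-discount thresholds reported in the sensitivity analyses above can be understood through a simple back-calculation, sketched here under strong simplifying assumptions: the drug's cost enters only the intervention arm, and a discount reduces incremental cost linearly without affecting outcomes. All input values and the function name are hypothetical; none of the reviewed studies is known to have used this exact calculation.

```python
def required_discount(
    delta_cost: float, delta_qaly: float, drug_cost: float, wtp: float
) -> float:
    """Smallest fractional discount on one drug that brings the ICER to the WTP.

    Solves (delta_cost - d * drug_cost) / delta_qaly = wtp for d.
    """
    if delta_qaly <= 0 or drug_cost <= 0:
        raise ValueError("delta_qaly and drug_cost must be positive")
    excess = delta_cost - wtp * delta_qaly
    if excess <= 0:
        return 0.0  # already at or below the threshold, no discount needed
    if excess > drug_cost:
        raise ValueError("even a 100% discount on this drug cannot reach the threshold")
    return excess / drug_cost


# Hypothetical inputs (illustrative only).
discount = required_discount(
    delta_cost=200_000,  # incremental cost versus chemotherapy
    delta_qaly=1.0,      # incremental QALYs versus chemotherapy
    drug_cost=250_000,   # total cost of the drug being discounted
    wtp=150_000,         # willingness-to-pay threshold per QALY
)
print(f"Required discount: {discount:.0%}")
```

Because the required discount scales with the drug's share of total cost, inputs such as body weight and treatment duration, which drive drug acquisition costs, would also shift the discount at which the threshold is met.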
The CHEC checklist consists of 19 questions (15). The quality outcomes of each study were based on whether insufficient or missing information was identified in the article, or in other published materials. If the study authors paid sufficient attention to the listed checklist items then the assessment criteria were fulfilled. Table 4 shows the quality assessment results based on the Philips checklist. This checklist consists of 20 quality dimensions, according to model structure, data, and consistency (14). Similar assessment criteria were employed, and the quality outcomes of each study based on the Philips checklist are presented in Table 4. For a visual representation of the quality assessment study findings and differences among these checklists, see the Supplementary Appendix (Supplementary Figures S1, S2).

Frontiers in Health Services
Based on the CHEC checklist, time horizon and health outcome measurement (Table 3) were items that were "partially fulfilled" by Courtney et al. (17). In studies reported by Hu et al. (19), Hao et al. (18), Li et al. (20), Wan et al. (22), and Peng et al. (21), a combination of "partially fulfilled" and "not reported" checklist items affected the quality of each study. Using this checklist, the CEA that scored the highest methodological quality was published by Yang et al. (23).
Based on the Philips checklist, time horizon, cycle length, health utilities, and external consistency (Table 4) were items that were "partially fulfilled" by Courtney et al. (17). In studies reported by Hu et al. (19), Hao et al. (18), Li et al. (20), Wan et al. (22), and Peng et al. (21), a combination of "partially fulfilled" and "not reported" checklist items affected the quality of each study. According to the Philips checklist, the CEA that scored the highest methodological quality was published by Yang et al. (23).
Overall, our assessment highlighted shortcomings in data identification and methods transparency. Quantification of health state utility values, estimation of drug costs, the accuracy of data sources, and their credibility have important quality implications for the cost-effectiveness outcomes. None of the included studies fulfilled all of the criteria reported in the Philips and the CHEC checklists. Although the conclusions of four of the seven CEAs indicated that nivolumab-ipilimumab combination therapy had favorable cost-effectiveness, the quality assessment of these studies revealed a number of uncertainties and limitations pertaining to each study. From a clinical perspective, ipilimumab has no approved single-agent (monotherapy) activity in the treatment of NSCLC, and its mechanism of action (i.e., synergy or additivity), when combined with nivolumab, is not fully understood in this setting (35). To compound the economic consequences presented in this limited number of CEAs, ipilimumab's drug action as a combination therapy poses significant uncertainty and requires further clinical investigation (35). We encourage further research to address the economic consequences of these combination agents in future CEAs and the clinical uncertainties of ipilimumab for NSCLC in future trials.

Discussion
Nivolumab-ipilimumab combination therapy has a high price tag and the potential to be used for a range of indications, including in combination with other agents. Our systematic review showed that the methods of estimation of long-term outcomes, quantification of the health state utility, estimation of drug costs, the accuracy of data sources, and their credibility have important implications for the ICERs. None of the included studies fulfilled all of the requirements presented in the Philips checklist and the CHEC checklist. Quality assessment of the included studies highlighted shortcomings in the data identification, uncertainty assessment, and methods transparency domains.
The estimation of long-term immunotherapy outcomes has important implications. Given that the CEA model inputs were sourced from the clinical trials, the durability of response, and potential long-term survival after immunotherapy are crucial factors for these economic analyses. Currently, the minimum effective dose of immunotherapy remains unknown, as does the optimal duration of treatment. A better understanding of optimal drug dosage and treatment duration may influence the overall costs of immunotherapy. To theoretically address the long-term estimation of outcomes, CEAs are encouraged to vary nivolumab-ipilimumab dosing and treatment duration in their sensitivity analyses.
Assessing the cost-effectiveness of immunotherapy drugs depends not only on the relative efficacy of treatments observed in the clinical trials, but also on the model structure and assumptions. Good practice recommendations were developed specifically for fitting curves to observed progression-free and overall survival (36, 37). Although stochastic uncertainty (i.e., model parameters and assumptions) is usually assessed in CEA models, structural uncertainty (i.e., alternative modeling approaches) is not often considered. It is common practice to acknowledge potential limitations in model structure; however, the studies identified in our review lack clarity about methods to characterize the uncertainty surrounding alternative structural assumptions and their contribution to decision uncertainty. Given that alternative modeling techniques (e.g., cure models, spline-based models) may complement standard methods, future CEAs may incorporate structural uncertainty by considering alternative modeling approaches concurrently.
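A small numerical sketch can show why the choice of survival function is a structural assumption with consequences for long-term outcomes. Below, an exponential and a Weibull curve are both calibrated to the same median survival and then extrapolated over a 20-year horizon. The median of 1.5 years, the Weibull shape of 0.7, and the horizon are hypothetical values chosen for illustration; they are not estimates from the CheckMate trials or from any reviewed model.

```python
import math
from typing import Callable


def exponential_survival(t: float, rate: float) -> float:
    """Exponential survival function S(t) = exp(-rate * t)."""
    return math.exp(-rate * t)


def weibull_survival(t: float, scale: float, shape: float) -> float:
    """Weibull survival function S(t) = exp(-(t / scale) ** shape)."""
    return math.exp(-((t / scale) ** shape))


def restricted_mean(survival: Callable[[float], float], horizon: float, step: float = 0.01) -> float:
    """Area under the survival curve up to the horizon (trapezoidal rule)."""
    n_steps = int(round(horizon / step))
    total = 0.0
    for i in range(n_steps):
        t0 = i * step
        total += 0.5 * (survival(t0) + survival(t0 + step)) * step
    return total


# Hypothetical inputs (illustrative only).
MEDIAN_YEARS = 1.5
HORIZON_YEARS = 20.0
WEIBULL_SHAPE = 0.7  # shape < 1 implies a hazard that decreases over time

# Calibrate both curves so that S(median) = 0.5.
exp_rate = math.log(2) / MEDIAN_YEARS
weibull_scale = MEDIAN_YEARS / (math.log(2) ** (1.0 / WEIBULL_SHAPE))

ly_exponential = restricted_mean(lambda t: exponential_survival(t, exp_rate), HORIZON_YEARS)
ly_weibull = restricted_mean(
    lambda t: weibull_survival(t, weibull_scale, WEIBULL_SHAPE), HORIZON_YEARS
)

print(f"Life-years, exponential: {ly_exponential:.2f}")
print(f"Life-years, Weibull:     {ly_weibull:.2f}")
```

Although both curves agree at the median, the decreasing-hazard Weibull curve accumulates more life-years over the horizon, so the incremental QALYs, and therefore the ICER, would differ depending on which form is extrapolated.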
Although patient-reported outcomes (PROs) were collected in the CheckMate 227 and the CheckMate 9LA trials, six CEA models were developed based on health utility estimates that were sourced from previously published studies (30-33). Similarly, utility decrements of adverse events (AEs) were sourced from the publicly available literature. Cancers with a high TMB, such as NSCLC, are associated with a higher risk of immune-related AEs (irAEs) during immunotherapy treatment than cancers with a low TMB. Although irAEs are rare, the cost of treatment in such cases is rather high. Therefore, the benefits of nivolumab-ipilimumab combination therapy could be over- or underestimated in the included models. The inclusion of irAEs in future economic models of NSCLC is encouraged.
TMB is an emerging biomarker for immunotherapy in lung cancer (38-42). The results of the CheckMate 568 trial showed that a TMB of more than 10 mutations per megabase could be used as an effective cutoff value for selecting responders (43). Similarly, the analysis of Hellmann et al. showed that first-line treatment with nivolumab-ipilimumab provided clinical benefits for patients with NSCLC with a high TMB (≥10 mutations per megabase), regardless of their tumor PD-L1 expression levels (44). Although nivolumab-ipilimumab provided the greatest absolute survival for patients with a high TMB in the CheckMate 227 trial, the clinical benefits were similar to those of chemotherapy in patients regardless of their TMB. Therefore, it is necessary to understand the implications of TMB as a biomarker and then re-analyze clinical and cost-effectiveness findings accordingly.
This study is the first systematic review to focus on the methodological quality of CEAs conducted specifically for the front-line nivolumab-ipilimumab combination. Previously published systematic reviews of CEAs focusing on immunotherapy in advanced NSCLC (45-47) did not assess the quality of the study methodology based on either the Philips checklist or the CHEC checklist. One study (45) used the Consolidated Health Economic Evaluation Reporting Standards checklist (48). However, this checklist is not designed for the quality assessment of CEA study methodology.
All in all, efficient allocation of existing resources is essential for health systems to meet the evolving needs of populations and sustainability efforts. From our analysis, the quality assessment of the included CEAs highlighted shortcomings in various domains of the included checklists. To improve methodological study quality, we encourage authors of future CEAs to consider the inclusion of either the CHEC or the Philips checklist in their studies and to follow its guidance in reporting their analyses. The application of high-quality knowledge that stems from scientific evidence and economic modeling can aid in achieving sustainable health systems worldwide. Improving the methodological quality of future CEAs would be a significant step in the right direction toward this achievement.

Data availability statement
The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author/s.