Mapping the value for money of precision medicine: a systematic literature review and meta-analysis

Objective This study aimed to quantify heterogeneity in the value for money of precision medicine (PM) by application type across contexts and conditions, and to quantify sources of heterogeneity in areas of particular promise or concern as the field of PM moves forward. Methods A systematic search was performed in Embase, Medline, EconLit, and CRD databases for studies published between 2011 and 2021 on cost-effectiveness analysis (CEA) of PM interventions. Based on a willingness-to-pay threshold of one times the GDP per capita of each study country, the net monetary benefit (NMB) of PM was pooled using random-effects meta-analyses. Sources of heterogeneity and study biases were examined using random-effects meta-regressions, jackknife sensitivity analysis, and the biases in economic studies checklist. Results Among the 275 unique CEAs of PM, publicly sponsored studies found neither genetic testing nor gene therapy cost-effective in general, in contrast to studies funded by commercial entities and early stage evaluations. Evidence of PM being cost-effective was concentrated in genetic tests used for screening, diagnosis, or as companion diagnostics (pooled NMBs: $48,152, $8,869, and $5,693, respectively; p < 0.001) and in multigene panel testing (pooled NMB = $31,026, p < 0.001), and it applied only to a few disease areas, such as cancer, and to high-income countries. Incremental effectiveness was an essential value driver for varied genetic tests but not gene therapy. Conclusion Precision medicine's value for money across application types and contexts was difficult to conclude from published studies, which might be subject to systematic bias. The conduct and reporting of CEAs of PM should be locally based and standardized for meaningful comparisons.


Introduction
Precision medicine (PM) is a novel medical approach that tailors intervention decisions based on expression profiling of individual phenotypes and genotypes or directly corrects pathogenic gene mutations (1,2). The rapid evolution of PM technology (3,4) has led to global efforts to introduce PM into existing healthcare settings to transform healthcare (5)(6)(7)(8). However, the clinical adoption rate of PM remains low (9)(10)(11)(12)(13), and due to the lack of knowledge about PM's value for money, the incentives among key stakeholders are poorly aligned to catalyze its development and adoption (9,12,14).
Cost-effectiveness analysis (CEA) provides a systematic framework to inform such decisions: over a relevant time horizon of expected PM benefits and within the context of societal willingness-to-pay (WTP) thresholds for such benefits, it assesses the cost of an intervention relative to the expected health gains in standard terms, such as quality-adjusted life years (QALYs) (15). CEAs are commonly used to inform public and private sectors' reimbursement decisions, clinical guidelines, benefit designs, and price negotiations (16) ("conventional CEA"), and to help in decisions regarding product profile development and research priorities at an early clinical cycle ("early CEA") (17). To guide research, practice, and policy related to PM, it is valuable to have a detailed understanding of the CEA literature, focusing on how a reported value is related to the contexts and conditions of PM interventions, as well as the specifications and potential biases of CEAs. Previous reports have described the general relationship between various characteristics of the PM interventions studied and estimated cost-effectiveness (18,19). However, they have not formally assessed this literature using meta-analytic approaches.
This study aimed to quantify heterogeneity in the value for money of PM by pooling the net monetary benefits (NMBs) across the types of PM application [(1) screening for genetic conditions that predispose to disease, (2) early diagnosis, (3) prediction of disease progression, (4) companion diagnostic for targeting drug selection, and (5) gene therapy for established conditions], as well as other contexts related to PM technology, disease domain, clinical stage, country capacity, and funder types. A secondary objective was to quantify sources of heterogeneity in PM's value for money in the areas of particular promise or concern as the field of PM moves forward.

Methods
The review was reported following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines (20), and the protocol was recently published (PROSPERO: 2021 CRD42021272956) (21).

Search strategy and study selection
We conducted the systematic search and study selection using the Covidence® platform. Embase, MEDLINE Ovid, EconLit, CRD, and Web of Science databases were searched to identify relevant studies published between January 1, 2011 and July 8, 2021, limited to studies published in or translated into English. In addition, we searched gray literature from reimbursement dossiers of several health technology assessment (HTA) agencies. Appendix 1 presents the details of the search strategies and search results for each database. To satisfy the inclusion criteria, a study had to be original research on cost-effectiveness pertaining to human subjects, reporting costs and either life years (LYs), QALYs, disability-adjusted life years (DALYs), or incremental cost-effectiveness ratios (ICERs), and the intervention of interest had to conform to the working definition of PM (2). Five independent reviewers excluded selected studies with overlapping content.

Data extraction
Five reviewers independently extracted data on the characteristics of the study (author's name, publication year, geographic region, country-income level, type of funders, and conflict of interest), study population (target population, cascade testing, age, sex, disease areas, and associated prevalence and mortality rates), PM intervention (intervention type, profiling method, developmental stage, clinical pathways, test accuracy, uptake, and treatment compliance), comparators, outcomes (economic and effectiveness parameters, surrogate outcome, data source, and willingness-to-pay thresholds), and modeling (study perspective, time horizon, model type, discount rates, measures of dispersion, and uncertainty). In addition, the risk of bias in the CEAs was assessed using the modified economic evaluations bias (ECOBIAS) checklist (22), which assesses sources of heterogeneity and bias in the overall structure and model of economic evaluations.

Data harmonization and statistical analysis
The statistical analysis was performed using Stata MP version 17. The primary outcome was the net monetary benefit (NMB), which measures the difference between a monetized equivalent of incremental effectiveness (i.e., multiplied by a WTP threshold) and the incremental cost of a new technology. By the central limit theorem, NMB is approximately normally distributed and is thus commonly used for quantitative analysis of CEAs (19,23,24). Although standard practice typically uses nationally specific WTP thresholds, to enable global comparison that involves low- and middle-income countries (LMICs), we followed the recommendation of the World Health Organization (WHO) and World Bank (25) that defined the WTP threshold as one times the national gross domestic product (GDP) per capita as of the study year. To standardize costing data, all NMBs were first inflated to 2020 values in the currency of the study country and then converted to 2020 USD ($) according to the consumer price index and exchange rate from the World Bank (26).
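The NMB definition and the currency standardization described above can be illustrated with a minimal Python sketch. It is not the Stata code used in this study; the class name `CeaEstimate`, the function names, and all numeric inputs are hypothetical and serve only to show the arithmetic (NMB = WTP threshold × incremental effectiveness − incremental cost, followed by inflation to 2020 prices and conversion to USD).

```python
from dataclasses import dataclass


@dataclass
class CeaEstimate:
    """One cost-effectiveness estimate (hypothetical container for illustration)."""
    delta_cost: float      # incremental cost, local currency, study-year prices
    delta_effect: float    # incremental effectiveness, e.g., QALYs gained
    gdp_per_capita: float  # WTP threshold: 1x GDP per capita, same currency and year


def net_monetary_benefit(est: CeaEstimate) -> float:
    """NMB = WTP threshold x incremental effectiveness - incremental cost."""
    return est.gdp_per_capita * est.delta_effect - est.delta_cost


def to_2020_usd(amount: float, cpi_study_year: float, cpi_2020: float,
                local_per_usd_2020: float) -> float:
    """Inflate a local-currency amount to 2020 prices, then convert to USD."""
    inflated = amount * cpi_2020 / cpi_study_year
    return inflated / local_per_usd_2020


# Made-up example: a test costing 2,000 more and gaining 0.1 QALY,
# in a country with a GDP per capita of 40,000 (local currency units).
example = CeaEstimate(delta_cost=2000.0, delta_effect=0.1, gdp_per_capita=40000.0)
nmb_local = net_monetary_benefit(example)
nmb_usd = to_2020_usd(nmb_local, cpi_study_year=100.0, cpi_2020=110.0,
                      local_per_usd_2020=1.25)
```

A positive NMB indicates that the intervention is cost-effective at the chosen threshold, which is why the pooled NMB is tested against zero.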
Following the latest guideline for data harmonization in meta-analyses of CEAs (27), we prepared NMB data, with details and the published protocol described in Appendix 2 (21). Through data harmonization, the NMB and its variance were consistently calculated by comparing PM to a conventional intervention strategy. Based on the COMER methodology (28), we performed a random-effects meta-analysis in which the pooled NMB was the weighted mean Σ(w_i × NMB_i) / Σ w_i, where w_i refers to the inverse of variance. Heterogeneity was tested using the Cochran Q test and I² statistics (30) [...] (32), World Bank country-income level (per capita Gross National Income in 2020 USD when most information was available) (33), and funder type (public vs. non-profit private, for-profit private, and mixed or unspecified funding sources).
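As an illustration of inverse-variance pooling with heterogeneity statistics, the sketch below implements a random-effects meta-analysis in plain Python. It assumes the DerSimonian-Laird estimator of the between-study variance τ², which is a common choice but is not stated to be the estimator used for the pooled NMBs in this study; the function name `random_effects_pool` and the input values are hypothetical.

```python
import math


def random_effects_pool(nmb, var):
    """Pool study-level NMBs with DerSimonian-Laird random-effects weights.

    nmb: list of study-level NMB estimates
    var: list of their within-study variances
    Returns the pooled NMB, its standard error, Cochran's Q, I^2 (%), and tau^2.
    """
    k = len(nmb)
    w = [1.0 / v for v in var]                       # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, nmb)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, nmb))  # Cochran's Q
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0       # between-study variance
    i2 = max(0.0, (q - (k - 1)) / q) * 100.0 if q > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in var]         # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, nmb)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return {"pooled": pooled, "se": se, "q": q, "i2": i2, "tau2": tau2}


# Made-up example with two discordant estimates of equal precision.
result = random_effects_pool([0.0, 10.0], [1.0, 1.0])
ci_low = result["pooled"] - 1.96 * result["se"]
ci_high = result["pooled"] + 1.96 * result["se"]
```

When τ² is estimated as zero, the random-effects weights reduce to the plain inverse-variance weights, so the pooled value coincides with the fixed-effect estimate.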
To assess the robustness and conclusiveness of pooled NMB findings, the jackknife sensitivity analysis was performed for each abovementioned subgroup, which omitted one study at a time and repeated the meta-analysis in the rest of the studies (34). This examined whether pooled NMB was consistent across the studies or excessively affected by any influential CEAs.
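The leave-one-out logic of the jackknife analysis can be sketched as follows. The pooling step is a compact DerSimonian-Laird random-effects estimator, which is an assumption made for illustration; the function names `pool`, `jackknife`, and `is_robust`, and the robustness criterion (every leave-one-out 95% CI excludes zero on the same side as the full-sample CI), are hypothetical simplifications of the assessment described above.

```python
import math


def pool(y, v):
    """Compact DerSimonian-Laird pooled mean with a 95% confidence interval."""
    k = len(y)
    w = [1.0 / vi for vi in v]
    sw = sum(w)
    mean_fe = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - mean_fe) ** 2 for wi, yi in zip(w, y))
    c = sw - sum(wi * wi for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    ws = [1.0 / (vi + tau2) for vi in v]
    mean = sum(wi * yi for wi, yi in zip(ws, y)) / sum(ws)
    half = 1.96 * math.sqrt(1.0 / sum(ws))
    return mean, mean - half, mean + half


def jackknife(y, v):
    """Re-pool after omitting each study in turn.

    Returns a list of (omitted_index, pooled, ci_low, ci_high) tuples.
    """
    results = []
    for i in range(len(y)):
        y_rest = y[:i] + y[i + 1:]
        v_rest = v[:i] + v[i + 1:]
        results.append((i, *pool(y_rest, v_rest)))
    return results


def is_robust(y, v):
    """True if the full CI and every leave-one-out CI exclude zero on one side."""
    _, low, high = pool(y, v)
    if low <= 0.0 <= high:
        return False
    positive = low > 0.0
    return all((lo > 0.0) if positive else (hi < 0.0)
               for _, _, lo, hi in jackknife(y, v))
```

A subgroup whose conclusion flips when a single estimate is removed would be flagged as driven by an influential CEA.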
Following expert recommendation, publication bias was assessed using funnel plots and Egger's test (27). A funnel plot put NMB estimates on the x-axis against the quantified uncertainty interval on the y-axis. Egger's test assessed whether the funnel was symmetrical, or there was heterogeneity and/or missing studies.
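Egger's test is commonly formulated as an ordinary least-squares regression of the standardized effect (estimate divided by its standard error) on precision (the reciprocal of the standard error), with an intercept far from zero suggesting funnel-plot asymmetry. The sketch below follows that textbook formulation in plain Python; it is not the Stata routine used in this study, the function name `egger_test` is hypothetical, and the p-value (from a t distribution with n − 2 degrees of freedom) is omitted to keep the example dependency-free.

```python
import math


def egger_test(effects, std_errors):
    """Egger's regression: standardized effect regressed on precision.

    Returns the intercept (asymmetry coefficient), its standard error,
    the t statistic, and the slope.
    """
    n = len(effects)
    if n < 3:
        raise ValueError("Egger's regression needs at least three estimates")
    z = [y / s for y, s in zip(effects, std_errors)]   # standardized effects
    precision = [1.0 / s for s in std_errors]
    mean_p = sum(precision) / n
    mean_z = sum(z) / n
    sxx = sum((p - mean_p) ** 2 for p in precision)
    if sxx == 0:
        raise ValueError("standard errors must not all be equal")
    sxy = sum((p - mean_p) * (zi - mean_z) for p, zi in zip(precision, z))
    slope = sxy / sxx
    intercept = mean_z - slope * mean_p
    rss = sum((zi - intercept - slope * p) ** 2 for p, zi in zip(precision, z))
    se_intercept = math.sqrt(rss / (n - 2) * (1.0 / n + mean_p ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return {"intercept": intercept, "se": se_intercept, "t": t_stat, "slope": slope}
```

In this formulation, estimates that share a common true effect regardless of their precision give an intercept of zero, whereas small, imprecise studies reporting systematically larger effects shift the intercept away from zero.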
To identify and quantify sources of heterogeneity in the pooled NMB of each PM type, first, we ranked the frequency of the most sensitive parameters to ICER that were reported in the sensitivity analyses of CEAs. Second, we performed univariate random-effects meta-regressions to examine the impact of 19 influencing factors that explain NMB heterogeneity due to study year, target population (age, sex, disease incidence rate, and use of cascade testing), and intervention characteristics (PM cost, incremental effectiveness, integrations of test uptake, test accuracy, and treatment compliance) and that explain value bias as a function of methods (study perspective, time horizon, model type, respective sources of cost and effectiveness data, any use of surrogate outcome, % of "yes" answers in overall ECOBIAS assessment, % of "yes" answers in model-specific ECOBIAS assessment, and any conflict of interest). Third, because many covariates were found to be associated with NMB in the univariate meta-regressions (p < 0.05), we used a generalized Lasso approach with 10-fold cross validation to select essential covariates to be included in the multivariate meta-regression (the best-fitting model) (35). Finally, essential covariates were included in a multivariate random-effects meta-regression (35) to quantify the impact of essential value drivers on the NMB of each PM type. Of note, we compared three random-effects meta-regression models, namely REML, DL, and empirical Bayes, and selected the model that yielded the greatest reduction in between-study heterogeneity (τ²) of NMBs.
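The univariate step and the "reduction in τ²" criterion can be illustrated with a minimal sketch of a random-effects meta-regression with a single covariate. It assumes a method-of-moments (DL-type) estimator of the residual τ² and reports an R² analog defined as the proportional reduction in τ² relative to the model without covariates; the function names are hypothetical, and the Lasso selection, REML, and empirical Bayes variants mentioned above are not reproduced here.

```python
import math


def _wls(y, x, w):
    """Weighted least squares for y = b0 + b1*x; returns (b0, b1, det, sum_w)."""
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = sw * swxx - swx ** 2
    b1 = (sw * swxy - swx * swy) / det
    b0 = (swy - b1 * swx) / sw
    return b0, b1, det, sw


def meta_regression(y, v, x):
    """Univariate random-effects meta-regression (method-of-moments tau^2)."""
    k = len(y)
    w = [1.0 / vi for vi in v]
    b0, b1, det, sw = _wls(y, x, w)
    # Residual heterogeneity after accounting for the covariate.
    q_e = sum(wi * (yi - b0 - b1 * xi) ** 2 for wi, yi, xi in zip(w, y, x))
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    s2 = sum(wi ** 2 for wi in w)
    s2x = sum(wi ** 2 * xi for wi, xi in zip(w, x))
    s2xx = sum(wi ** 2 * xi * xi for wi, xi in zip(w, x))
    trace = (swxx * s2 - 2.0 * swx * s2x + sw * s2xx) / det
    denom = sw - trace
    tau2 = max(0.0, (q_e - (k - 2)) / denom) if denom > 0 else 0.0
    # Total heterogeneity without covariates (DerSimonian-Laird).
    mean_fe = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q0 = sum(wi * (yi - mean_fe) ** 2 for wi, yi in zip(w, y))
    c0 = sw - s2 / sw
    tau2_null = max(0.0, (q0 - (k - 1)) / c0) if c0 > 0 else 0.0
    # Refit with random-effects weights 1 / (v_i + tau^2).
    w_star = [1.0 / (vi + tau2) for vi in v]
    b0, b1, det, sw_star = _wls(y, x, w_star)
    se_b1 = math.sqrt(sw_star / det)
    # R^2 analog: share of between-study variance explained (0 if none to explain).
    r2 = max(0.0, 1.0 - tau2 / tau2_null) if tau2_null > 0 else 0.0
    return {"intercept": b0, "slope": b1, "se_slope": se_b1,
            "tau2": tau2, "tau2_null": tau2_null, "r2": r2}
```

Under this sketch, the slope is read as the change in NMB per one-unit change in the covariate, for example per additional unit of incremental effectiveness.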

Literature search and study characteristics
The literature search initially identified 5,187 articles. The final analysis included 275 unique CEAs with 463 cost-effectiveness estimates of varied PM applications because one CEA may include multiple test-treatment strategies, comparators, and settings (Figure 1, Flowchart of literature search and selection; Appendix 3, Full list of included studies).
Table 1 presents the study characteristics. Appendix 4 provides more details. Among the 238 CEAs on genetic testing and 37 CEAs on gene therapy, most were performed in high-income and Western countries and applied to cancer. The median unit cost ranged between $220 and $3,091 for genetic tests and was $321,268 for gene therapy. The median ∆QALY was the lowest in the prognostic test compared to other test types (0.07 vs. 0.23-0.73) and the highest (3.83) in gene therapy. The pattern of risk of bias was persistent across varied PM application types, mainly relating to narrow perspective, cost measurement omission, intermittent data collection, double counting, limited sensitivity analysis, and limited scope (Appendix 5).
Within each test type, only certain disease areas showed evidence of cost-effectiveness in general. Genetic tests had positive pooled NMBs when used for screening in endocrine and metabolic diseases (especially familial hypercholesterolemia) and cancer, in particular breast cancer ($96,018, $57,889, and $187,000, respectively), for diagnosis in Barrett's esophagus (a pre-malignant digestive condition) and cancer, most commonly thyroid cancer ($58,975, $8,422, and $6,051, respectively), and as a companion diagnostic in chronic infectious diseases (chronic hepatitis C, HIV), gout, and rheumatoid arthritis ($61,333, $4,850, and $4,173, respectively). Nonetheless, the pooled NMBs of the prognostic test were not statistically positive in varied disease areas (Figure 3A).

Gene therapy
In the meta-analysis of 56 cost-effectiveness estimates, the pooled NMB of gene therapy was not significantly greater than 0 in a variety of contexts (Figures 2B, 3B).

System-level variations in PM's value for money
At the structural level, for both genetic tests and gene therapy, commercially funded studies yielded high pooled NMBs, whereas publicly sponsored studies found no evidence of PM being cost-effective in general (Figures 2A,B). Early CEAs also reported a higher pooled NMB compared to conventional CEAs both in genetic tests ($26,009 vs. $16,215; Figure 2A) and gene therapy ($1,830,000 vs. $0 [insignificant value]; Figure 2B).

Consistency, robustness, and publication bias of PM's value for money
In the jackknife sensitivity analysis, cost-effectiveness findings were valid and consistent in the above-described subgroups, i.e., both the pooled NMB and the corresponding 95% CI remained in the original position and direction regardless of the omission of any single datapoint (Appendix 6).
As seen by the asymmetry on the funnel plots (Appendix 7), publication bias was present in pooled NMBs of genetic tests in general (Egger's test, coefficient = −0.75, SE = 0.10, p < 0.001), particularly in screening, diagnosis, and companion diagnostics.

Sources of heterogeneity in PM's value for money
The ICERs of varied genetic tests but not gene therapy were highly sensitive to disease progression rate and test cost; the ICERs of diagnostic, prognostic, companion tests, and gene therapy but not screening tests were highly sensitive to treatment cost and effectiveness; and the ICERs of screening and diagnostic tests but not prognostic or companion tests were highly sensitive to test accuracy (Appendix 9).
In the univariate meta-regressions of NMBs of genetic tests (Appendix 8), 18 out of 19 selected covariates were significantly associated with NMBs of studies of each test type. Multivariate meta-regression results based on Lasso-selected essential features are presented in Table 2. Overall, 97.2% of variability [i.e., R²] in screening tests' NMBs was explained by incremental effectiveness (p < 0.001) and target age (p = 0.32); 95.9% of variability in the diagnostic tests' NMBs was explained by incremental effectiveness, target sex, source of cost data, model type, and overall study bias (all had p < 0.001); and 48.5% of the variability in the prognostic tests' NMBs was explained by incremental effectiveness (p < 0.001), target sex (p = 0.07), study perspective (p < 0.001), test accuracy (p = 0.26), treatment compliance (p = 0.001), and publication year (p = 0.33), whereas the only essential predictor of the companion tests' NMBs was incremental effectiveness (p < 0.001) when treatment cost was absent, explaining 11.8% of the variability. Test cost was not identified as an essential value driver for any genetic test type. In particular, one extra unit of incremental effectiveness was associated with a marginal increase in NMB of $60,181 (95% CI 59,752-60,609) for the screening test, $41,943 (95% CI 40,381-43,504) for the diagnostic test, $24,515 (95% CI 20,581-28,450) for the prognostic test, and $27,375 (95% CI 26,496-28,255) for the companion diagnostic test.
In the univariate meta-regressions of gene therapy, treatment cost, study perspective, and target patient sex were significant value drivers (p < 0.001 for all), but incremental effectiveness barely explained any NMB variability [R² = 0%, p = 0.79; Appendix 8].

Discussion
In this systematic review and meta-analysis of 275 CEAs on PM published during 2011-2021, the value for money of genetic tests was highly context-specific: while genetic tests appeared cost-effective for screening, diagnosis, or companion diagnosis, such evidence was mainly based on established profiling methods and treatments and well-studied disease indications, and came from high-income countries. Evidence in new technologies (e.g., ES and gene therapy) and LMICs remained scarce and inconclusive. Incremental effectiveness and target population, but not test cost, were the essential drivers of value for money of varied genetic tests. Importantly, studies funded by public agencies generally found NMBs of PM to be not significantly greater than 0, whereas commercially funded and/or early stage studies consistently supported PM as cost-effective.
Our findings were generally in line with previous literature. Kasztura et al. (18) (19) found gene therapies barely cost-effective in general whereas industry sponsorship was positively associated with cost-effectiveness, and our results confirmed both. As an update and extension, we quantified sources of heterogeneity in PM's value for money using an extensive collection of covariates, which, for the first time, enabled in-depth investigation into heterogeneity by application type across disease areas, technologies, and clinical stages, as well as sources of heterogeneity related to intervention characteristics, model specifications, and study biases.
Across clinical applications, genetic tests reported differential value for money. Of note, test cost was similar across genetic test types and had no major influence on their NMBs. However, a one-unit increase in ∆QALY would lead to a 2-3 times higher ∆NMB when used for screening and diagnosis than when used for prognosis or companion diagnostics, which indicated that PM-enabled early intervention (through risk detection or early diagnosis) was more efficient than PM-enabled treatment stratification (by predicted clinical risk or treatment response) in controlling the costs of disease management in general. In support of this, only screening and diagnostic tests appeared cost-effective in cancer in published studies, whereas prognostic and companion tests were as plentiful in number but did not appear to be cost-effective. In particular, the prognostic test typically stratifies severe subgroups to advanced treatment, which may still be patented and costly, resulting in not-cost-effective profiles in general. Furthermore, the cost-effectiveness of the same type of genetic test varied across disease areas, probably because it was largely dependent on incremental effectiveness, which can explain the substantial value difference of genetic screening in breast cancer (a genetic test was used for primary screening) vs. cervical cancer (a genetic test was used as an add-on to pap smear screening). Nonetheless, what is subject to change is PM's inconclusive cost-effectiveness profile in new innovations, new disease indications, and new markets. Over time, the costs of new PM innovations (in particular gene therapy, which on average costs $321,268 per patient) can fall substantially when the scale and scope of production increase, and evidence in new indications and new markets can accumulate. These changes may render currently not-cost-effective PM interventions good value for money in the future.
Our study revealed significant systematic biases. The substantial discrepancies in PM's value for money between early and conventional CEAs, and between commercially funded and publicly sponsored CEAs, can be related to study manipulation as a result of overambition or overoptimism from the R&D community and commercial entities, especially at an early stage when best guesses were commonly adopted, or to publication bias such that positive results were more likely to be submitted for publication. For instance, study perspective was found to be an essential value driver of the prognostic genetic test and gene therapy but not of genetic tests used for screening, diagnosis, or companion diagnostics. This pattern could indicate a greater share of analyses leveraging societal perspectives for interventions that were relatively less cost-effective from a healthcare system's perspective. Therefore, our study supports the call from a recent perspective in Nature Reviews (36) that a reference case should be developed to standardize the evaluation and reporting of the economic impact of PM. For this reason, we are conducting an in-depth analysis of methodological variations that can lead to the development of a reference case for PM evaluation. The results will be published in a separate study.
This study has several limitations. First, it was impossible to capture all sources of heterogeneity due to systemic differences in health service utilization across settings. Second, we excluded non-English publications. Third, using the same WTP for LYs as for QALYs or DALYs may be inappropriate, but over 96% of included studies measured QALYs. Fourth, we were unable to extract treatment cost from the complex, stratified, and/or changing treatment regimens in many studies. Last but not least, the cost-effectiveness findings did not apply to LMICs because no data were available from low-income countries and the studies from middle-income countries provided no support for the cost-effectiveness of PM in general.
To conclude, a large body of evidence suggests that the value for money of PM applications is concentrated in established technologies, disease domains, and markets, and is mainly influenced by incremental effectiveness in favor of early intervention over treatment stratification at diseased stages. It takes time for PM in new innovations, new indications, and new markets to accumulate evidence to affirm its value for money. Moreover, current CEAs of PM are prone to study manipulation and systematic bias. Thus, it is difficult to draw an overall conclusion on PM's value for money across application types and disease areas. To enable meaningful comparisons for truly informed decision-making, policymakers and stakeholders should conduct local studies, with appropriate consensus approaches to standardize the conduct and reporting of CEAs of PM.

Data availability statement
The data analyzed in this study are subject to the following licenses/restrictions: the data are currently available upon request. The data will later be deposited in a central depository at the National University of Singapore, Saw Swee Hock School of Public Health, for public access. Requests to access these datasets should be directed to wenjiach@nus.edu.sg.

FIGURE 1 PRISMA 2020 flow diagram of systematic search and selection. Numbers refer to unique study records, not datasets, except where otherwise indicated.

FIGURE 2 Summary forest plot showing the weighted-pooled summary estimates of incremental net monetary benefit of precision medicine. (A) Left panel, genetic testing in general; (B) Right panel, gene therapy. The error bars show the 95% confidence interval. The red vertical line marks the border for significance.

FIGURE 3 Summary forest plot showing the weighted-pooled summary estimates (in ≥ two datapoints) of incremental net monetary benefit of precision medicine across major ICD disease domains. (A) Genetic testing for different purposes; (B) Gene therapy. The error bars show the 95% confidence interval. The box shows neoplasm/cancer and detailed sub-categories. The red vertical line marks the border for significance.

TABLE 1 General and economic characteristics of cost-effectiveness analyses reporting precision medicine interventions.

TABLE 2 Parameter estimates from multivariate meta-regression model on the net monetary benefit of genetic testing.