Cost-Utility Analysis: Current Methodological Issues and Future Perspectives

The use of cost–effectiveness as the final criterion in the reimbursement process for listing new pharmaceuticals can be questioned from a scientific and policy point of view. There is a lack of consensus on the main methodological issues, and consequently we may question the appropriateness of using cost–effectiveness data in health care decision-making. Another concern is the appropriateness of the selection and use of an incremental cost–effectiveness threshold (cost/QALY). In this review, we focus on a few key methodological concerns relating to discounting, the utility concept, cost assessment, and modeling methodologies. Finally, we consider the relevance of some other important decision criteria, such as social values and equity.


INTRODUCTION
Escalating costs have become a major concern for healthcare professionals, decision-makers, and the public, prompting the implementation of new cost containment measures over the last decade, especially for new pharmaceuticals. There has been a trend toward an increasing demand for cost-effectiveness data in the decision-making process in Europe (Drummond et al., 1999). A prominent example is the assessment procedure of the National Institute for Health and Clinical Excellence (NICE) in the United Kingdom, but The Netherlands, Scotland, Sweden, Belgium, and Portugal also have formal requirements for the submission of cost-effectiveness data. Cost-effectiveness data should permit reliable, reproducible, and verifiable insight into the effectiveness of a pharmaceutical, the costs that will result from its use, and the potential savings that will be made compared with other pharmaceuticals and/or treatments. To do this, health economists use a measure called the "quality-adjusted life-year," or QALY (pronounced "qually"). The lower the cost per QALY, the more cost-effective a health intervention is said to be. Even though there is no theoretical or empirical basis for it, values ranging from $50,000 to $100,000 per QALY are sometimes used as a threshold in the United States, whereas in the UK, NICE has adopted a cost-effectiveness threshold range of £20,000 to £30,000 per QALY gained.
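The cost per QALY discussed above is usually reported as an incremental ratio: the extra cost of a new treatment divided by the extra QALYs it yields relative to a comparator. The following minimal Python sketch illustrates that calculation. The function name and all input values are hypothetical and chosen only for illustration; they are not taken from any study cited here.

```python
def icer(cost_new, cost_old, qaly_new, qaly_old):
    """Incremental cost per QALY gained of a new treatment versus a comparator."""
    delta_cost = cost_new - cost_old
    delta_qaly = qaly_new - qaly_old
    if delta_qaly <= 0:
        # Without a QALY gain the ratio cannot be read as a "cost per QALY gained".
        raise ValueError("no QALY gain: the ratio is not meaningful")
    return delta_cost / delta_qaly


# Hypothetical per-patient lifetime costs and QALYs (illustration only).
ratio = icer(cost_new=30_000, cost_old=10_000, qaly_new=6.0, qaly_old=5.0)
print(f"ICER: {ratio:,.0f} per QALY gained")

# Comparison with a hypothetical threshold of 20,000 per QALY.
print("at or below threshold:", ratio <= 20_000)
```

A decision rule based on a threshold then reduces to a single comparison, which is precisely why the choice of the threshold value and of the inputs to the ratio matters so much.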
The following example illustrates the concept, and also the dilemma, of cost per QALY thresholds. Two cost-effectiveness studies compared preventive treatment with interferon beta to no preventive treatment in patients with multiple sclerosis (MS) in a British setting. The first study, by our group, found interferons to be much more cost-effective than the other study did (Parkin et al., 1998; Nuijten and Hutton, 2002). The incremental cost-effectiveness ratio (ICER) in our study was £51,582 per QALY, while the outcomes in the other study were £328,300 and £228,300 per QALY over a period of 5 and 10 years, respectively. Authorities applying a threshold of about £50,000 per QALY might accept the use of interferons based on the first study, whereas they would surely reject it based on the second study. The contrasting cost-effectiveness outcomes in our example have far-reaching implications for patients with MS, and raise a number of fundamental issues regarding the use of economic evaluations by health authorities and decision-makers.
There are several key steps in performing and interpreting research on the economics of disease that are not part of usual patient-oriented research practice. These include (1) defining perspective and time horizon, (2) collecting data on health care utilization, (3) costing health care resources, (4) analyzing data on utilization and cost, (5) defining and measuring health effects, (6) adjusting costs and effects for inflation and discounting, and (7) evaluating uncertainty. The cost-effectiveness of a new intervention depends heavily on the researcher's choices on these issues. A review of the literature shows that there is a lack of consensus on the main methodological issues, and consequently we may question the appropriateness of using cost-effectiveness data in health care decision-making, especially in pricing and reimbursement decisions for new pharmaceuticals. Another concern is the appropriateness of the selection and use of an incremental cost-effectiveness threshold (cost/QALY). Finally, a broader concern is whether the ICER captures all criteria relevant to a decision maker, for example social values and equity. In the remainder of this review, we focus on a few key methodological concerns.

DISCOUNTING
In cost-effectiveness analysis, the valuing of costs and health effects over time remains a controversial issue. Decisions about the resources dedicated to prevention depend on the weight given to future health in economic evaluations. Future costs and health gains are commonly weighted in relation to the time at which they occur, with future costs and effects receiving less weight than present ones. This procedure is called discounting.
The debate mostly focuses on whether the discount rates for health and money should be equal and which discounting model and time preferences are most appropriate. The majority view is that benefits and costs should be discounted at the same rate (Gold et al., 1996). However, the experts themselves admit that the reasoning behind the use of equal discount rates for costs and health outcomes is not well developed in the published guidance (Claxton et al., 2006). A number of authors have suggested that the value of health grows over time and that, as a consequence, the discount rate on health effects should be lower than the discount rate on costs (Brouwer et al., 2005). For example, the current Belgian guidelines for pharmacoeconomic evaluations recommend that future costs be discounted at a rate of 3% and future benefits at a rate of 1.5% (Cleemput et al., 2008). Various arguments are offered to justify a different rate of discounting for health effects than for costs. Brouwer et al. (2005) advocate discounting future health gains at a lower rate than future costs. They dismiss observed popular preferences for discounting health gains from government programs at a high rate as being "implausibly high" (Cropper et al., 1992). They also argue that people in the future may put a higher value on their own health than people do today. On the other hand, standard cost-effectiveness analysis for health policies, including uniform discounting, reflects the moral imperative to value each person's life equally, and the political judgment that the social benefit to taxpayers of saving a poorer citizen's life through government action is as large as that of saving the life of a richer, or future, citizen (Keeler and Weinstein, 2005). Recently, Claxton et al. (2011) presented another argument in this debate: they demonstrated that if the budget for health care is fixed and decisions are based on ICERs, discounting costs and health gains at the same rate is correct only if the threshold remains constant.
The choice of discount rates can have varying effects on interventions, depending on the disease area, and matters especially for chronic diseases and preventive interventions. It is therefore crucial that appropriate discount rates are used in economic evaluations. However, the lack of consensus on one of the most sensitive variables in a health economic study makes the case for a more restricted use of cost-effectiveness in reimbursement decisions. The lack of consensus is illustrated both by differences in discounting between countries and by changes in opposite directions in revised guidelines (UK, Netherlands, Belgium; Brouwer et al., 2005). As a consequence, the reimbursement of a new pharmaceutical may vary between countries solely because a different discount rate is applied to health outcomes, leading to opposite cost-effectiveness outcomes. This policy may be questioned from an equity perspective within Europe, especially when the studies would lead to similar cost-effectiveness results if a similar discount rate were applied. As a minimum, the impact of different discount rates on the overall cost-effectiveness results should be evaluated in a sensitivity analysis.
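The sensitivity of the ICER to the discount rate can be illustrated with a minimal Python sketch. It assumes the common convention of dividing a value occurring in year t by (1 + r)^t, and it applies this to a purely hypothetical preventive program whose costs fall in the first year and whose health gains arrive a decade later. The 3% and 1.5% rates are those of the Belgian guideline quoted above; everything else (function name, cost and QALY streams) is an assumption made for illustration.

```python
def present_value(stream, rate):
    """Discount a yearly stream of values; year 0 is left undiscounted."""
    return sum(value / (1 + rate) ** year for year, value in enumerate(stream))


# Hypothetical preventive program: cost paid up front, health gained later.
incremental_costs = [10_000] + [0] * 19      # paid in year 0
incremental_qalys = [0] * 10 + [0.1] * 10    # gained in years 10 to 19

scenarios = {
    "equal 3% / 3%": (0.03, 0.03),
    "differential 3% / 1.5%": (0.03, 0.015),
    "undiscounted": (0.0, 0.0),
}

results = {}
for label, (rate_costs, rate_effects) in scenarios.items():
    costs = present_value(incremental_costs, rate_costs)
    qalys = present_value(incremental_qalys, rate_effects)
    results[label] = costs / qalys
    print(f"{label:>24}: {results[label]:,.0f} per QALY")
```

Because the health gains occur late, a lower discount rate on effects raises their present value and lowers the cost per QALY, so the same intervention can fall on either side of a fixed threshold depending only on the guideline applied.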

THE UTILITY CONCEPT
Another methodological controversy is the utility concept. Health effects in cost-effectiveness analysis are commonly expressed in life-years gained, QALYs gained, or lives saved. Although QALYs are a great step forward in cost-effectiveness analysis, their use is not straightforward. First, the use of QALYs does not imply that fairness and equity in health care are automatically taken into account. Second, there are serious measurement problems in determining QALYs. Several techniques are available to assign a value to a particular health state, e.g., the standard gamble and the time-trade-off, but there is little consensus on which technique is most suitable. The problem is that these techniques give widely different results, leading to different cost-effectiveness outcomes, as recently shown by Joree et al. (2010). In addition, the handling of time preference may even lead to different outcomes using the same scale. QALY values are mostly not corrected for time preference, i.e., a lower valuation being attached to later life-years than to earlier life-years, and therefore may underestimate the true QALY weights. Brouwer et al. (2005) showed the possible consequences for health policymaking when they correct time-trade-off scores for time preference, thereby taking into account severity and time horizon (Attema and Brouwer, 2010).
Another unresolved issue is how to adjust the potential cost per QALY threshold for the severity of the disease. For example, in severe MS the QALY will be lower than in other chronic disease areas, and hence those patients might be worse off in terms of reimbursement decisions.
In practical applications, the choice of the technique to elicit health status values is crucial. Another unresolved issue is whose values to take. Should the valuations of health care providers and professionals be used? Or should preference be given to the values of healthy people or the values of patients? Or is a random sample drawn from the general population the appropriate group? In summary, the cost-effectiveness of a new pharmaceutical may depend heavily on the underlying methodological choices for the measurement of QALYs. As a consequence, a reimbursement decision for a new pharmaceutical based on cost-effectiveness data can be challenged from a scientific point of view, as long as there is no consensus on the measurement of QALYs.

COST PER QALY THRESHOLDS
Apart from the methodological inconsistencies, the interpretation of the cost-effectiveness results of health economic evaluations by policy-makers can also be troublesome. Partly, this is because of the aggregated nature of the outcome of a cost-effectiveness analysis. All economic and health aspects of the interventions are combined into one single ratio, the ICER. This ratio encompasses not only monetary costs and savings, but also elements of patients' functioning and well-being, and averted future mortality and morbidity. From a policy maker's point of view, the expression of the results in a single cost-effectiveness ratio might be attractive since it simplifies the stream of information needed for decision-making. For example, in The Netherlands, an almost formally defined threshold value for cost-effectiveness of €20,000 per QALY is reported in cost-effectiveness studies (van Lier et al., 2010). In the literature, although not undisputed, a value of US$50,000 per QALY is often proposed as a cost-effectiveness threshold. The main problem with the threshold is its justification: what is the basis of the reported thresholds? Does the threshold capture all societal preferences for selecting priorities in the decision-making process, especially innovation? For example, US$50,000 per QALY has been quoted for many years as the threshold in the US and was based on the dialysis standard, the purported annual cost per QALY to the Medicare program for patients with chronic renal failure (Rabinovich et al., 2007). This study was performed in 1982, and this threshold has never been adjusted for inflation, as is the case with most internally used thresholds. A comparison of the thresholds also shows large differences within Europe, which cannot be explained by economic differences. For example, the threshold in The Netherlands of €20,000/QALY is about half of the threshold in the United Kingdom, which ranges from €36,000 to €54,000. As a consequence, a new innovative pharmaceutical with a cost-effectiveness of €25,000 per QALY might be considered not cost-effective and therefore not be reimbursed in The Netherlands. The same pharmaceutical might be considered extremely cost-effective in the UK. However, it should be noted that the threshold in The Netherlands has recently been raised, with an upper limit of €80,000/QALY depending on the severity of the disease, and that too much uncertainty in the ICER may also lead to rejection in the UK (Busschbach and Dewel, 2010).
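The divergence described above can be made concrete with a small Python sketch that applies the same ICER to the two sets of threshold values quoted in the text (20,000 per QALY for The Netherlands and 36,000 to 54,000 per QALY for the United Kingdom, in the same currency). The dictionary layout, the function name, and the three-way classification are illustrative assumptions, not a description of how either authority actually decides.

```python
# Threshold values per QALY as quoted in the text: (lower bound, upper bound).
thresholds = {
    "The Netherlands": (20_000, 20_000),
    "United Kingdom": (36_000, 54_000),
}


def classify(icer, lower, upper):
    """Position of an ICER relative to a threshold or a threshold range."""
    if icer <= lower:
        return "below threshold"
    if icer <= upper:
        return "within threshold range"
    return "above threshold"


icer = 25_000  # the pharmaceutical from the example in the text
outcomes = {
    country: classify(icer, lower, upper)
    for country, (lower, upper) in thresholds.items()
}
for country, outcome in outcomes.items():
    print(f"{country}: {outcome}")
```

The same evidence thus yields opposite classifications purely because of where the line is drawn, which is the equity concern raised in this section.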
This example shows that the use of cost-effectiveness data as a final criterion in the reimbursement process might lead to unequal access to innovative health care in Europe, which raises serious equity and ethical considerations. Although there are several suggestions to set a differential threshold value between countries, associated with their relative wealth, this would not resolve the large differences between European countries. Other proposals include a threshold value that differs according to disease and treatment characteristics; for example, in The Netherlands, a range between €10,000 and €80,000/QALY has recently been suggested based on the societal perspective, including indirect costs (Busschbach and Dewel, 2010). NICE recognizes that there will be circumstances, especially for end-of-life treatment, in which it may be appropriate to recommend the use of treatments with high reference case ICERs (http://www.nice.org.uk/media/E4A/79/SupplementaryAdviceTACEoL.pdf).
Adoption of a "flexible threshold" approach, in which the threshold is not the exclusive criterion for decision-making, might resolve the previously mentioned ethical issues related to unequal access.

QALY GAINS
Another concern is the interpretation of a QALY gain, which is calculated by multiplying the utility gain by the number of life-years gained. The problem is that a given number of QALYs gained may be the result of a prolonged survival time, a utility gain, or both. For example, a patient who has a survival gain of 4 years at a utility level of 0.4 will have the same 1.6 QALYs gained as a patient with a survival gain of 2 years at a utility level of 0.8. Thus, QALYs may be gained without additional life-years gained. The terminology of "quality-adjusted life-years" is thus potentially confusing in this regard. As a consequence, the QALY gain is a rough aggregated measure, which does not allow differentiation of the underlying components (disease severity, survival). Furthermore, the QALY gain is only useful in assessing the relative differences in outcomes and does not give information on the absolute values, whereas there may be a need to incorporate concerns for severity of illness as an independent factor in societal valuations of health outcomes (Nord et al., 1999). For example, a new treatment A may increase the QALYs over a year from 0.1 to 0.2, leading to a QALY gain of 0.1, whereas another new treatment B may increase the QALYs over a year from 0.8 to 0.9, also leading to a QALY gain of 0.1. The relative improvement in QALYs is much higher for treatment A (100%) than for treatment B (12.5%), and as a consequence a similar QALY gain may be valued differently. Nord et al. (1999) show how equity weights may serve to incorporate concerns for severity and potentials for health in QALY calculations. It is suggested that the QALY as a measure of amounts of well life does not carry sufficient empirical meaning. As a measure of individuals' personal appreciation of outcomes in their own lives, the QALY does not seem to be valid for comparisons of life-saving interventions with interventions that improve health or increase life expectancy (Nord, 1994).
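The arithmetic behind the two examples above can be reproduced in a few lines of Python. The sketch below uses only the figures given in the text; the function names are illustrative.

```python
import math


def qaly_gain(years_gained, utility):
    """QALYs gained from surviving a number of years at a given utility level."""
    return years_gained * utility


def relative_improvement(before, after):
    """Change in QALYs expressed as a fraction of the starting value."""
    return (after - before) / before


# Same QALY gain from very different combinations of survival and utility.
gain_long_low = qaly_gain(4, 0.4)    # 4 years at utility 0.4
gain_short_high = qaly_gain(2, 0.8)  # 2 years at utility 0.8
print(gain_long_low, gain_short_high, math.isclose(gain_long_low, gain_short_high))

# Same absolute gain of 0.1 QALY, very different relative improvements.
rel_a = relative_improvement(0.1, 0.2)  # treatment A
rel_b = relative_improvement(0.8, 0.9)  # treatment B
print(f"treatment A: {rel_a:.1%}, treatment B: {rel_b:.1%}")
```

Both pairs are indistinguishable once they are collapsed into a single QALY gain, which is the loss of information the text describes.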
Finally, it is important to note that there is more to health-related quality of life than preference-based utility values. Non-preference-based descriptive health-related quality of life measures provide unique additional value in understanding the patient's own viewpoint on disease and its treatment. Both generic and disease-specific questionnaires offer the potential to assist therapeutic value assessment at all stages of a product's lifecycle and to inform medical decision-making in daily practice.

MODELING
In practice, it is not always possible to derive all necessary information from prospective randomized controlled trials. In these cases, decision-analytic models may be used to provide the necessary cost-effectiveness information using various existing data sources for clinical and economic information. Modeling studies are based on decision analysis, which is a well-recognized method for analyzing the consequences of decisions that are made under uncertainty (Weinstein and Fineberg, 1980). It is an explicit, quantitative, prescriptive approach to healthcare decision-making and allows both clinical and economic consequences of medical actions and attitudes to be analyzed under conditions of uncertainty. From treatment algorithms, a model can be constructed that considers the timing of actions and their consequences over time. In effect, a model shows the consequences and complications of different therapeutic interventions, and it should correspond as closely as possible to the real-life situation of the disease.
Projections about a pharmaceutical's effectiveness and expected costs can be modeled using realistic and explicit assumptions based on data from clinical studies. In addition, modeling often helps overcome the practical limitations of prospective studies, particularly for chronic conditions like Parkinson's disease that may require longer-term extrapolations of therapeutic effects and cost implications. Data sources for the variables used in a model may be meta-analyses, databases, clinical trials, and/or Delphi panels. The role of modeling in economic evaluation has been explored by discussing the concerns about models, which mainly relate to the trade-off between internal and external validity: concerns about the inappropriate use of clinical data, concerns about observational data, concerns about the difficulties in extrapolation, concerns about the transparency or validity of the model, and the impact of confounding variables (Nuijten, 2003). There is also debate over which type of model offers the most appropriate modeling structure. Heeg et al. (2008) provided a critical assessment of the advantages and disadvantages of three modeling techniques using schizophrenia as an example: decision trees, Markov models, and discrete event simulation models. They conclude that, depending on the research question, the optimal modeling approach should be selected based on the expected differences between the comparators, the number of covariates, the number of patient subgroups, the interactions between covariates, and simulation time (Heeg et al., 2008). As a consequence, the ICER outcome of a health economic model should be considered with prudence, and there is no guarantee that the outcome reflects the true cost per QALY. The example at the beginning of this paper on the cost-effectiveness of interferons in MS shows the variance in ICER outcomes of two health economic models, ranging from £51,582 per QALY to £328,300 per QALY.
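To make one of the model types mentioned above more tangible, the following Python sketch implements a very small Markov cohort model with yearly cycles and three health states. It is a sketch under stated assumptions only: the states, transition probabilities, costs, utilities, time horizon, and the 3% discount rate are all hypothetical, and the code does not reproduce any of the published models discussed in this review.

```python
def run_markov(transitions, state_costs, state_utilities, cycles, rate=0.03):
    """Cohort simulation with yearly cycles.

    Returns the discounted (cost, QALYs) per patient. The whole cohort
    starts in state 0; transitions[i][j] is the yearly probability of
    moving from state i to state j.
    """
    n_states = len(transitions)
    distribution = [1.0] + [0.0] * (n_states - 1)
    total_cost = 0.0
    total_qalys = 0.0
    for cycle in range(cycles):
        weight = 1 / (1 + rate) ** cycle
        total_cost += weight * sum(p * c for p, c in zip(distribution, state_costs))
        total_qalys += weight * sum(p * u for p, u in zip(distribution, state_utilities))
        # Move the cohort one cycle forward.
        distribution = [
            sum(distribution[i] * transitions[i][j] for i in range(n_states))
            for j in range(n_states)
        ]
    return total_cost, total_qalys


# States: 0 = stable, 1 = progressed, 2 = dead. All values are hypothetical.
usual_care = [[0.80, 0.15, 0.05],
              [0.00, 0.85, 0.15],
              [0.00, 0.00, 1.00]]
new_drug = [[0.88, 0.08, 0.04],
            [0.00, 0.85, 0.15],
            [0.00, 0.00, 1.00]]
utilities = [0.8, 0.5, 0.0]
costs_usual_care = [1_000, 5_000, 0]
costs_new_drug = [4_000, 5_000, 0]   # drug cost incurred only while stable

cost_0, qalys_0 = run_markov(usual_care, costs_usual_care, utilities, cycles=20)
cost_1, qalys_1 = run_markov(new_drug, costs_new_drug, utilities, cycles=20)
icer = (cost_1 - cost_0) / (qalys_1 - qalys_0)
print(f"incremental cost: {cost_1 - cost_0:,.0f}")
print(f"incremental QALYs: {qalys_1 - qalys_0:.2f}")
print(f"ICER: {icer:,.0f} per QALY")
```

Even in this toy model the resulting ICER depends on every assumed input and on structural choices such as the number of states, the cycle length, and the time horizon, which is why two models of the same treatment can produce very different results.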

COSTS
The long list of cost categories can be divided into two discrete resource categories: direct costs and productivity costs. Direct costs reflect the monetary burden of the medical care and non-medical care expenditures made in response to disease. The cost of pharmaceuticals is one type of direct medical cost. Other types of direct medical costs include the cost of hospitalizations, physician visits, tests and procedures, and durable medical equipment. Direct non-medical costs include costs to caregivers, such as the monetary value of the time spent caring for a loved one. Productivity costs reflect the monetary value of the work lost due to death or morbidity induced by disease or its treatment. Therefore, productivity cost is especially important for studies conducted from the societal perspective. A variety of approaches for collecting data on utilization exist, including subject interviews, subject surveys, provider surveys, medical record reviews, health care utilization diaries, and insurance claims data (Goossens et al., 2000). Collecting data on utilization can be viewed as a detailed accounting exercise. However, much more research is needed to validate the accuracy of these different data collection methods. Finally, missing utilization data can be of particular concern, and imputation or modeling methods may need to be used to evaluate the impact of non-random missing cost information.
Costing resource units should be viewed as a research exercise in itself, and usually occurs after the collection of medical resources. The decentralization of cost information and a lack of a research-based "cost-coding dictionary" can make the costing exercise inaccurate. The ideal cost estimates for each resource use would be their opportunity cost, defined as the value of that good or service in its next best use. Opportunity costs are reflected as the price in a perfectly competitive marketplace. No marketplace is "perfect," however, and the health care marketplace has many distinguishing features (e.g., information asymmetries, market distortions, and cross-subsidies) that make it less perfect than other markets. Therefore, routinely used prices of health care goods and services (e.g., charges and reimbursements) are not true opportunity costs. At their best, health care market prices can be viewed as "proxy" costs, which can be either higher or lower than opportunity costs. Therefore, cost estimates used in economic studies may be far removed from opportunity costs, and there are methods to convert certain available prices to better reflect costs (e.g., hospital cost-to-charge ratios; Mushlin et al., 1998). A specific example is the costing of pharmaceuticals, when generic pharmaceuticals are available. In this case in most health economic evaluations generic prices are used. This may be considered a conservative approach, as these generic prices do not reflect the true opportunity costs of these pharmaceuticals. On the contrary, these generic prices do not result from market mechanisms but are due to patent legislation disturbing the self-regulating nature of the market place. 
Therefore, a cost-effectiveness analysis in a setting where generic pharmaceuticals are available will result in a high cost per QALY, which may exceed the threshold, whereas a cost-effectiveness analysis based on real opportunity costs would have resulted in a favorable cost per QALY and consequently reimbursement.
Finally, there remains a debate on the inclusion or exclusion of future unrelated costs associated with a treatment that increases survival. These are the medical costs that may arise during the life-years gained as a result of the treatment, but are not related to the treatment of the disorder. Although the health economic guidelines do not recommend the inclusion of these costs in the health economic analysis, the debate on this issue continues because of the potentially high impact of including these costs on the ICER. An example is cardiovascular disease, where prevention of mortality due to myocardial infarction in a patient may later lead to costs of an unrelated disease. For example, a patient surviving a myocardial infarction may later suffer from cancer with its associated costs (van Baal et al., 2011).

SUMMARY
In summary, the use of cost-effectiveness as the final criterion in the reimbursement process for listing new pharmaceuticals can be questioned from a scientific and policy point of view. Although cost-effectiveness evaluation is not yet the sole criterion in the current decision-making process, its weight can already be substantial in the various health care systems.
It may lead to inappropriate reimbursement decisions because of inaccuracy in the applied methodologies as well as in the cost per QALY threshold. Unacceptable differences between West-European countries, especially in discounting and thresholds, may lead to unequal access to new innovative pharmaceuticals.
Apart from the issue of which method should be used to assess the optimum value of the cost per QALY, the value of a single aggregated measure to express all aspects of costs and effectiveness could be questioned. Decision-makers may instead be provided with a list of the consequences of interventions rather than an ICER. Such a cost-consequence approach would enable decision-makers to tailor the analysis to their specific needs.
Finally, it is argued that the cost-effectiveness ratio does not fully capture all social values. A fundamental limitation of health economics is the fact that it does not adequately consider social values outside of efficiency. By definition, HTA "is a multidisciplinary process that summarizes information about the medical, social, economic, and ethical issues related to the use of a health technology in a systematic, transparent, unbiased, and robust manner." The emergence of health economic guidelines for the submission of economic evidence has brought about more consistency and quality. But with respect to social and ethical issues, a systematic set of principles and processes of assessment has not emerged. Further, whether or not social values are included in an HTA appears to be relatively capricious and arbitrary. The need to advance consideration of social values in HTA and economic evaluation has long been recognized. Compelling arguments for economics to recognize the relevance of social values were made long ago (Williams, 1995; Nord et al., 1999).
Hence the health care community "cannot seem to get out of the starting blocks when it comes to considering objectives outside of cost-effectiveness." For example, separate budgets may be required for life-saving products, which are not cost-effective.
Does all this mean that cost-effectiveness should not be used at all? No; on the contrary, cost-effectiveness analyses show us how much health "bang" we get for our "buck," which is very valuable information for health decision-makers trying to relate costs to clinical outcomes. However, a reimbursement decision should also incorporate other decision criteria, such as the efficacy and safety of a new treatment, equity and social values, and the impact on health-related quality of life. Reimbursement decisions would therefore be better based on a multi-criteria process. Cost-effectiveness can be included, but its weight in the overall decision should be considered with prudence, given the methodological controversies and its substantial impact on patients.