Use of a Non-parametric Bayesian Method to Model Health State Preferences: An Application to Polish and Irish EQ-5D-5L Valuations

Valuations of preference-based measures for health are conducted in different countries. There is scope to use results from existing countries' valuations to generate better valuation estimates than analyzing the data from each country separately. We analyse data from two smaller design EQ-5D-5L valuation studies where a sample of 119 Polish migrants and 123 native Irish valued 30 common health states using similar composite time trade-off protocols. We apply a non-parametric Bayesian method to provide better predictions of the Polish (Irish) population utility function when the existing Irish (Polish) results were used as informative priors. The resultant new estimates were then compared to those obtained by analyzing the data from each country by itself via different prediction criterions. The results suggest that existing countries' valuations could be used as potential informative priors to produce better valuation estimates under all prediction criterions used. The implications of these results will be hugely important in countries where valuation studies are expensive and hard to conduct. Future application to other countries and to other preference-based health measures are encouraged.


INTRODUCTION
Several preference-based measures of health-related quality of life (HRQoL) are currently available. Such generic measures include the EuroQol five-dimensional (EQ-5D) questionnaire (1), health utilities index 2 (HUI2) and 3 (2,3), assessment of quality of life (4), Quality of Well-being scale (QWB) (5), and the six-dimensional health state short form (SF-6D) (6) along with disease-specific measures (7,8). All of these measures produce derived health-state utilities that can be used for computing quality-adjusted life-years (QALYs) for use in cost-effectiveness analyses (9).
In the context when plenty of data on each country is available, good utility estimates for each country can be produced by analyzing its data separately. However, in the case when limited quantity of data on some (or all) countries is available, it is argued that combined analysis may generate better estimation of every country's utility estimates than analyzing its data by itself. This sort of analysis (adopting strength from existing countries) will be greatly important in countries where large-scale evaluation exercises are very expensive and hard to conduct, especially for countries with smaller populations or low-and middle-income countries (LMIC).
Advancement in statistical modeling techniques, such as Bayesian inference methods (23), offers the potential for borrowing strength from existing countries. In particular, it offers the potential for using the existing results of country 1 to improve those in country 2 by using the results in country 1 as informative priors. As such, the resultant utility estimates of country 2 could be more precise than modeling its own data separately. A number of researchers have investigated the use of such approach. For example, Chan et al. (24) found that EQ-5D-3L health state utilities obtained from shrinkage estimation allow valuation studies with very low sample size to adopt strength from another valuation studies to help improve precision in the estimated mean health utilities and reduce uncertainty. Kharroubi (25) developed a non-parametric Bayesian method that allows the already existing results from one country to be employed as a potential prior information in another country, and applied this method for analyzing the US EQ-5D-3L valuation dataset alongside the available UK dataset (26). In (27)(28)(29)(30)(31)(32), Kharroubi et al. extended this work further to handle the SF-6D Hong Kong, Japan and Lebanon alongside the already available UK dataset, respectively.
Our primary purpose in the present paper is to investigate the use of aforementioned method for countries with smaller design valuation studies and different population compositions, type of work, cultures and languages, all of which could have an impact, suggesting that analyzing its own data separately may not always generate precise valuation estimates. This is investigated using a case study for Polish migrants and native Irish data modeling Irish (Polish) data alongside small Polish (Irish) samples to generate Polish (Irish) estimates. Despite the present paper not offering new methodological advances as the model presented here is a replication of that already reported in  papers, it further tells a reassuring story regarding the superiority and flexibility of the non-parametric Bayesian approach in using existing preference data, thereby generating accurate estimates.
The Polish and Irish EQ-5D-5L valuation surveys and study methods alongside the corresponding datasets are summarized in Section 2. In the next section, the Bayesian non-parametric model is described whereas the results are reported in Section 3. In the last section, the results are discussed and some limitations and suggestions for further research are set out.

MATERIALS AND METHODS
The EQ-5D-5L The EQ-5D-5L describes five multi-level dimensions of health: mobility, self-care, usual activities, pain/discomfort and anxiety/depression. Each dimension is assigned to five levels of health-related problems: no problems "1, " slight problems "2, " moderate problems "3, " severe problems "4, " and extreme problems "5" (11). Different combinations allow the determination of 3,125 distinct health states, each of which is associated with a five-digit identifier, beginning from 11,111 for best health state (perfect health) and ending with 55,555 for the worst health state, referred to as "the pits."

Survey Design and Sampling Strategy
Data from two smaller design EQ-5D-5L valuation studies where, using similar composite time trade-off protocols (cTTO), valuations for 30 common health states were elicited from Polish migrants and native Irish, both living in Ireland. Since they represent the largest non-Irish community residing in Ireland, the Polish migrants were chosen to be included in this valuation study. Detailed description of the survey design and sampling strategy has been reported elsewhere (33). In brief, a sample of 240 (120 Polish migrants and 120 native Irish) respondents was recruited to value six practice states each in addition to one block of 11 cTTO states. The valuation study was conducted between June 2018 and September 2019 and the orthogonal design of the health states was provided by the EuroQol Research Foundation. All interview sessions were conducted using the EuroQol Portable Valuation Technology (EQ-PVT), which is a computer assisted personal interview software and protocol (34,35). The valuation study has been ethically approved by the NUI Galway's Research Ethics Committee. Further details on the valuation study is provided elsewhere (33).

Experimental Design
The orthogonal design was analogous to the one applied in Yang et al. (34). A sample of 30 EQ-5D-5L health states was chosen for valuation. The sample contained 25 health states, including the 'pits' state. A further five mild health states were selected, resulting a total of 30 EQ-5D-5L health states for valuation. A key advantage of using an orthogonal design is that it allows for a small number of health states to be valued and a small sample size to be used in comparison to that of a full national valuation study, which includes 86 EQ-5D-5L health states and a minimum sample size of 1,000 respondents (17). A comparative study between estimates of a smaller design EQ-5D-5L valuation study and that of the full national valuation study has been conducted in Yang et al. (34,36). Results revealed that the smaller design EQ-5D-5L valuation study performed well in comparison to the larger design and that no significant changes in prediction errors have been obtained when modeling EQ-5D-5L cTTO data.

Blocking
Using the blocking algorithm that is readily available in the "AlgDesign" package in the software package R, the 30 EQ-5D-5L health states were split up into three blocks of 11 health states. The rationale for this is to make sure that within-block variance is maximized, thereby observations on the full health-dead scale are attained [see Kelleher et al. (33) for an overview].

Interviewers and Respondents
The EQ-5D-5L cTTO data were extracted from Polish migrants and native Irish, both living whole-time in Ireland. Full details on the valuation study is provided in Kelleher et al. (33). Prior to the survey, respondents were asked to state their country of birth and whether they reside in Ireland whole-time. The interviews were conducted by a group of seven interviewers plus one study coordinator. Respondents were contacted through a Facebook page, by email through the study coordinator, or through friends and family using snowball sampling. Respondents were also asked to provide written consent to be included in the study.
More detailed explanation on the extended experimental design, survey design and sampling strategy is provided in Kelleher et al. (33).

Modeling
Kharroubi (25) developed a non-parametric Bayesian model that allows the already existing results from one country to be employed as a potential prior information in another country.
Here we make use of this model to investigate whether the use of Polish (Irish) alongside the existing Irish (Polish) dataset generates more accurate utility estimates than modeling the data from each country alone. These resultant estimates are then compared in terms of different prediction criterions, including predicted against actual mean utility estimates, mean predicted error and root mean square error (RMSE).
Following Kharroubi (25), the non-parametric Bayesian model is defined as where for i = 1, 2, . . . , I j and j = 1, 2, . . . , J, x ij is the ith health state evaluated by respondent j and the dependent variable y ij is the respondent j's cTTO valuation for that health state, α j is a random effect of respondent j and ε ij is the usual random error. Assume that t j is a vector of covariates for respondent j e.g., age, gender, socio-economic status or level of education. Kharroubi (25) then used the following distributions: where γ is the vector of unknown coefficients and τ 2 and v 2 are further unknown variance parameters to be estimated. That is, the distribution of the respondent effect α j is then independent log-normal, resulting in a skewness that is also typically observed in valuation data, and ε ij are independent normally distributed errors, Note that, because of the way that the respondent effects have been modeled in distribution (2), the utility function u(x) turns out to be the median utility of health state x. Given it is an unknown function, it becomes a random variable in the Bayesian model, which in turn needs a prior belief. Kharroubi (25) formally assigned a multivariate normal distribution for u(x) with mean and variance-covariance matrix where E u 0 (x)|y represents the average utility value for state x and cov u 0 (x) , u 0 (x ′ )|y denotes the variance-covariance matrix between the two utility functions u 0 (x) and u 0 (x ′ )¬ for two distinct health states x and x ′ , both of which are computed directly from modeling the existing countries' data. More details on this are given in Kharroubi (25). Given equations (3) and (4), it is noteworthy that x = (x 1 , x 2 , . . . , x 5 ) denotes a vector comprising discrete levels on each EQ-5D-5L dimensions and γ , β, and σ 2 are unknown parameters. Note also that the regression parameters γ and β represent, respectively, the intercept term and the slopes as each of the 5 dimensions (mobility, self-care, usual activities, pain/discomfort and anxiety/depression) increases, whereas the term c x, x ′ , defined below, represents the correlation between the two utility functions u (x) and u x ′ for two distinct health states x and x ′ in the new country's data. As for equation (3), the prior mean E (u (x) |β) represents a prior belief about the utility function that it is approximately linear and additive in the different dimensions. In addition, the actual utility function is allowed to vary around this mean in accordance with to its multivariate normal distribution, and so it takes any form at all. With regards to equation (4), the correlation c(x, x ′ ) decreases when the distance between x and x ′ gets bigger. Kharroubi (25) defined where for d = 1, 2, . . . , 5, x d and x ′ d -represents, respectively, the levels of dimension d in x and x ′ . The term b d denotes a roughness parameter which by definition controls how close the actual utility function to a linear form in dimension d. For more explanation of this specific point, see Kharroubi (25).
In order to complete the Bayesian model, we need to assign prior distributions for hyperparameters γ , τ 2 , v 2 , β and σ 2 . Vague priors are usually specified unless specific prior information is available. Formally, we assign Note that a flat prior was specified for σ , hence p(σ 2 ) ∝ σ −1 (37). Note also that no prior distributions were assigned to the roughness hyper-parameters b d s. It is noted in Kharroubi et al. (23) that inference about b d s in Gaussian models is generally problematic, thus it is recommended to give them fixed values. We shall discuss one method to demonstrate this in section results. We now formulate the posterior distribution of interest. Letting u = (u 1 , u 2 , . . . , u n ) T be the vector of utilities for the health states in the sample. Equations (3) and (4) give rise for the prior distribution of u Note that u 0 and C 0 are obtained from modeling the existing countries' data. Now let α = (α 1 , α 2 , . . . , α J ) T be the vector of respondent effects, then the posterior distribution of We now compute p(u|y) to predict utility estimates for all health states in the sample. This is obtained by integrating out equation (7) with respect to α and the hyperparameters γ , τ 2 , v 2 , β, and σ 2 . It follows from equation (7) that the posterior distribution of θ is not in the closed form. This implies that the Markov Chain Monte Carlo (MCMC) methods are then needed. Full conditional posterior distributions for all parameters u, α, γ , τ 2 , v 2 , β, and σ 2 using MCMC methods are all set out in Kharroubi (25). To this end, it is important to correct utility to the population mean. Note that the distribution of the individual respondent effect α j in (2) is defined as independent log-normal. This implies that the utility function u(x) in model (1) represents the population median utility for a health state x and not the required population mean utility which, using model (1), is defined as where E(α) represents the expected value of α over the total population. When E(α) = 1, then u (x) will be the same as u(x).
In the context when no covariates are used, the distribution (2) of the respondent effects becomes α j ∼ LN(0, τ 2 ). This results in E (α) =exp(τ 2 /2). However, when there are covariates, it follows from (2) that all of which are obtained directly from the MCMC simulation. Therefore, the calculation of E (α) is straightforward All theoretical and technical details of the non-parametric Bayesian model are reported in Kharroubi et al. (23,25). Matlab source code for implementing the Bayesian approach is available online in the Supplementary Material. Note that the codes are not generic and need to be modified as per users' specific purposes.

Irish With Polish Prior
The Bayesian model (1) was first implemented to predict an Irish value set, where the Polish results were employed as informative priors (which we will refer to as combined analysis from now on). The resultant utility estimates were then compared to those obtained from analyzing the Irish data alone (which from now on we will refer to as single analysis).
Here, the vector of individual-specific covariates is set as (Age, Age 2 , Sex). Also the roughness parameters b d in formula (5)  The MCMC sampler was allowed to iterate for 3,000 runs, with an initial run of 1,000 iterations as "burn-in" (these runs were discarded). Figures 1A,B presents the estimated (line in pink) and actual (line in blue) mean utilities for the 30 EQ-5D-5L health states valued in the Irish survey as well as the full health obtained from the combined and single analyses, respectively. The errors in both plots are displayed by the line in yellow and are obtained by calculating the difference between the two utilities. In both figures, the health states were ordered in terms of estimated mean utilities and were then plotted accordingly. The single analysis in Figure 1A exhibits a clear variation of the actual mean utilities around the estimated ones, particularly for the mild and worse health states, whereas Figure 1B clearly shows that the combined analysis predicts the mean utilities quite well across the full board.
Another way to check the adequacy of the assumed models is to quantify the gains in terms of bias and/or precision. This is achieved by the Bland-Altman agreement plot that displays the difference values between predicted and actual mean utilities against the average bias (38) . Figures 2A,B present the Bland-Altman agreement plot obtained from the combined and single analyses, respectively. The solid line in each plot represents the average bias, whereas the dotted lines are the 95% limits-ofagreement. It can be clearly seen that the combined analysis shows a much greater agreement than the single one. The rationale for this is based on the following three observations. Firstly, shorter width of the 95% limits of agreement, with values of 0.1402 for the combined analysis vs. 0.2030 for the single one. Secondly, smaller difference in average bias, with values of 0.0144 for the combined analysis and 0.0441. Thirdly, the standard deviation of the differences obtained from the combined analysis is also smaller than from the single analysis, with values of 0.0357 and 0.0518, respectively. The overall impact of this can be seen from Table 1 which displays the inferences for the utilities of the 30 EQ-5D-3L states evaluated in the study as well as the perfect health.  Polish estimated mean utility and standard deviation that were used as informative priors in the combined analysis. Moreover, Columns 6 and 7 exhibit the predicted mean utility and standard deviation obtained from the single analysis, respectively, whereas Columns 8 and 9 show the corresponding estimates obtained from the combined analysis. As clearly seen throughout this comprehensive table, the combined analysis provides much better predictions compared to the single analysis overall, with RMSE of 0.038 vs. 0.068, respectively. Furthermore, it can also be seen that the posterior standard deviations of the utility estimates are larger for the single analysis. The posterior standard deviations for the combined analysis are smaller is due to the fact that it is a model that employs the Polish results as prior beliefs, hence producing better estimates.

Polish With Irish Prior
We now apply model (1) to predict a Polish value set, where the Irish results were now employed as informative priors (combined analysis), and the resultant utility estimates were then compared to those obtained from analyzing the Polish data alone (single analysis).
In a similar way to section Irish With Polish Prior, Figures 3A,B present the predicted and actual mean utility estimates for the 30 EQ-5D-5L health states valued in the Polish survey along with their differences obtained from the combined and single analyses, respectively. As is the case in section Irish With Polish Prior, Figure 3A exhibits a clear variability of the actual values around the estimated mean utilities, particularly for the mild and worse health states, while Figure 3B clearly shows that that the combined analysis predicts the mean utilities quite well for almost all heath states in the study.
When comparing the two analyses using the Bland-Altman agreement plots (Figures 4A,B), we can also see clearly that the combined analysis has performed better in terms of a greater agreement. As in section Irish With Polish Prior, this is also due to (1) shorter width of the 95% limits of agreement, with values of 0.1205 for the combined analysis compared to a value of 0.2186 for the single one, (2) smaller difference in mean bias, with values of −0.0006 for the combined analysis and −0.0011, and (3) smaller standard deviation of the differences, with values of 0.0307 for the combined analysis vs. 0.0557 for the single analysis. Finally, and in a similar pattern to Table 1, it can be clearly seen throughout the comprehensive Table 2 that the combined analysis provides much better predictive performance when compared to the single analysis overall, with a value of 0.030 for RMSE against 0.055 from the single analysis. The posterior standard deviations of the utility estimates are also smaller for the combined analysis. Kharroubi (25) built a non-parametric Bayesian model that allows the already existing results from one country to be employed as a potential prior information in another country. Here, we explored the use of this method for Polish migrants and native Irish data modeling Irish (Polish) data alongside small Polish (Irish) samples to generate Polish (Irish) estimates. The resultant new estimates were then compared to those obtained modeling the data from each country alone. The findings proved that existing countries' valuations could be used as informative priors to produce better utility estimates under all criterions used, including estimated against actual mean utilities, mean predicted error and RMSE. This sort of analysis (adopting strength from existing countries) will be greatly important in countries where large-scale evaluation exercises are hard to conduct, especially for countries with small population size or LMICs. Despite the present paper not offering new methodological advances, the novelty here was to investigate the use of nonparametric Bayesian model for countries with smaller design valuation studies and different population compositions, type of work, cultures and languages. All of these could have an impact on the relative valuations of the dimensions of health (such as, self-care and anxiety/depression) and on where-about each health state lies on the [0-1] dead-perfect health scale. This suggests that the approach presented here may not generate precise utility estimates all the time. Further, the analysis presented here also provided a re-assuring story regarding the superiority and flexibility of the non-parametric Bayesian approach in using existing preference data, thereby generating accurate utility estimates.

DISCUSSION
It is true from the two analysis presented here that the improvement in the utility estimates in general and in the mean-squared error in particular is moderate. However, there are other crucial benefits especially those related to health and quality of life gains. As lots of reimbursement agencies worldwide need cost-effectiveness assessments, effectiveness analysis would become more international with combination of data across   Table 1 revealed that the combined analysis provides much better predictive performance when compared to the single analysis overall, with values for RMSE of 0.038 and 0.068, respectively. Therefore, the difference in utility estimates is, on average, equal to 0.03, which leads to an increase in QALYs from 0.5 to 0.53 for a treatment that extends life by an extra year. This in turn leads to a decrease in the cost per QALY from £20,000 to £18, 867 for a treatment costing £10,000, and so puts it under the cost-effectiveness threshold employed by the National Institute for Health and Clinical Excellence (NICE) in the UK. As a result, this could potentially influence the probability of whether a new treatment or health care scheme is deemed costeffective and funded. It could also impact on the validity of the resource allocation decisions being made. Heijink et al. (39) drew similar conclusion from their analysis on the impact of different valuation functions on QALYs.
A key benefit of the non-parametric Bayesian model presented here is that it allows for multi-countries to be analyzed rather than two. Further, equations (3) and (4) may be generalized further to handle n (n > 2) countries. Thus, we formally assign a multivariate normal distribution for u(x) with mean: and variance-covariance matrix where n k=1 E u k (x) is the overall expected utility of state x and n k=1 cov u k (x) , u k (x ′ ) is the overall variance-covariance matrix between the two utility functions u k (x) and u k (x ′ )¬¬ for two distinct health states x and x' , both of which are computed directly from modeling the existing datasets in n different countries. Work is in progress on demonstrating this idea for SF-6D in the UK, Hong Kong and Japan has introductory results that are particularly promising.
As already mentioned, generic measures of HRQoL, such as the EQ-5D and SF-6D, have been valued in different countries and so there are many different value sets from different countries and subgroups available. Such valuation studies are very expensive and are potentially wasteful. The analysis presented here demonstrates how existing countries' valuations could be used as informative priors to produce better utility estimates. This offers the potential to reduce the need for conducting large surveys in every country which in turn will reduce the cost of cross-country valuation. The approach presented here (borrowing strength from existing countries) could be particularly promising for countries where large-scale evaluation exercises are hard to conduct, especially for countries with small population size or LMICs. For instance, it is worth noting that large-scale national EQ-5D-5L valuations studies are significant undertakings of research that require substantial resources and logistics to be completed. These studies require a minimum of 1,000 respondents to value 86 health states, as mentioned above. As such, using the methods employed in this paper and a similar smaller design EQ-5D-5L valuation study to Kelleher et al. (33), future national valuation studies could be conducted more efficiently by having less respondents to value less health states compared to current large-scale national valuation studies. Examining this could be particularly promising and would form a key research agenda for further research. In addition, further analysis could be conducted more efficiently using simulated data. The thing is that, if through simulated data we know how the value sets differ, then we can explore the relationship between how different the countries are and how useful the use of priors are. However, although the empirical example is helpful and a worthwhile addition to the literature in its own right, it does not allow exploration of the full range of distances between national value sets (24). Further research is encouraged to examine this. The present study has certain limitations that ought to be considered. First, the study sample size was small which may in some sense limit the generalizability of the utility values obtained. Second, snowball sampling has been used, thus a good representative sample of Polish migrants and/or native Irish was not selected, let alone there was some difficulties associated with obtaining a good representative sample [see Kelleher et al. (33) for an overview]. This implies that the results of this study ought not to be considered as good representative of Polish migrants and/or native Irish in Ireland. Third, the small number of health states (i.e., 30 EQ-5D-5L health states) valued in this study could have an impact on the precision of econometric modeling, suggesting that the presented EQ-5D-5L utility estimates may not be considered as representative of the general Polish migrants or native Irish population. Future research with more representative samples is then encouraged to produce the Polish migrants or native Irish specific EQ-5D-5L value set. However, because of the way the non-parametric Bayesian model is defined, this should not in theory impact on the resulting utility estimates from this paper, though this could be further examined in future work.
In conclusion, the promising results suggest that existing countries' valuations could be used as informative priors to generate better utility estimates than modeling the data from each country separately. This kind of analysis could be particularly promising in terms of reducing the need for conducting large surveys in every country which in turn would reduce the cost of cross-country valuation. This will be greatly important for countries where large-scale evaluation exercises are expensive and hard to conduct. Similar approach could be used to other descriptive measures like HUI-II (40), in addition to other condition-specific measures (41). Ongoing research is underway to examine this

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by NUI Galway's Research Ethics Committee (application number 18-Mar-13). The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
SK participated in the conceptualization of the idea, the design of the methodology, software, data analysis, data interpretation, manuscript drafting, and the final review of the manuscript. DK reviewed and finalized the manuscript and participated in the conceptualization of the study. All authors have read and approved the final manuscript.