Comparative efficacy of three Bayesian variable selection methods in the context of weight loss in obese women

The use of high-dimensional data has expanded in many fields, including in clinical research, thus making variable selection methods increasingly important compared to traditional statistical approaches. The work aims to compare the performance of three supervised Bayesian variable selection methods to detect the most important predictors among a high-dimensional set of variables and to provide useful and practical guidelines of their use. We assessed the variable selection ability of: (1) Bayesian Kernel Machine Regression (BKMR), (2) Bayesian Semiparametric Regression (BSR), and (3) Bayesian Least Absolute Shrinkage and Selection Operator (BLASSO) regression on simulated data of different dimensions and under three scenarios with disparate predictor-response relationships and correlations among predictors. This is the first study describing when one model should be preferred over the others and when methods achieve comparable results. BKMR outperformed all other models with small synthetic datasets. BSR was strongly dependent on the choice of its own intrinsic parameter, but its performance was comparable to BKMR with large datasets. BLASSO should be preferred only when it is reasonable to hypothesise the absence of synergies between predictors and the presence of monotonous predictor-outcome relationships. Finally, we applied the models to a real case study and assessed the relationships among anthropometric, biochemical, metabolic, cardiovascular, and inflammatory variables with weight loss in 755 hospitalised obese women from the Follow Up OBese patients at AUXOlogico institute (FUOBAUXO) cohort.


Introduction
The advent of exposomics and the abundance of novel toxicological information have made the identification of the associations between both exposures and their molecular responses and a health outcome more challenging. Therefore, many variable selection approaches have increased in importance and popularity (1). When multiple exposures and their molecular responses co-occur and have a strong complex correlation structure, traditional statistical models are limited in accounting for multicollinearity or standard error inflation (2). To reduce this problem, dimensionality reduction methods-such as principal component and factor analyses-become very valuable. However, those approaches focus on the transformation of the original variables thus leading to interpretability issues. In addition, multiple co-occurring predictors can have nonlinear and nonadditive relationships with the health outcome, and most statistical methods fail to properly model those relationships. Penalised regression methods are used in this context, such as the least Absolute Shrinkage and Selection Operator (LASSO) (3) and its numerous variants (4-7), along with Bayesian variable selection methods, which have recently been developed to handle jointly multiple correlated predictors and both nonlinear and nonadditive relationships, allowing for the inclusion of prior information (8,9).
Among these Bayesian methods, those employing spike-and-slab priors (10) or shrinkage priors (11) stand out for features selection. These methods are now widely studied and employed within the environmental epidemiological literature (9,12), but only a few studies have evaluated the differences and similarities of those approaches in a more general setting.
In this work, we compared three supervised Bayesian models: the Bayesian Kernel Machine Regression (BKMR) (13), the Bayesian Semiparametric Regression (BSR) (14), and the Bayesian LASSO (BLASSO) (15) and we provided useful and practical guidelines of their use. BKMR models the outcome-predictors associations through the use of a kernel function of predictors, while BSR flexibly models the relationships between the predictors and the outcome by employing natural splines. We evaluated the models' goodness of fit and selection results, simulating several predictors with a complex correlation structure and with disparate relationships with a continuous outcome and considering data with different sample sizes. We finally assessed the models' performance on a real case study. We leveraged data on weight loss in hospitalised obese women from the Follow Up OBese patients at AUXOlogico institute (FUOBAUXO) cohort (16) and determined the association between biochemical, anthropometric, and clinical variables on weight loss percentage in these patients over a period of 40 days. The paper is organised as follows. In Section 2, we provide a brief overview of the models; in Section 3, we describe the simulation studies for evaluating the models' selection and performance based on three scenarios; in Section 4, we illustrate the methods' performance on data from the FUOBAUXO study; and in the last section, we offer some concluding remarks and suggestions for further research.

Methods
In this section we summarise the methodologies applied throughout the study. We assume a continuous outcome variable Y and several main predictors Z and confounders/covariates X for all models. The BKMR model has the following form: where Y i is the outcome for individual i i n = … ( ) 1, , ; z im is the m th predictor, also called the exposure variable; x i is a vector of potential confounders; h denotes the unknown exposure-response function to be estimated; β represents the effect of the covariates; and ε i represents residuals iid N(0, σ 2 ). To model the h function, BKMR uses a Gaussian kernel machine representation of the form: where Z z , ,z is the vector of predictors' values for the individual i, K(.) is the Gaussian kernel, and α i represents unknown parameters (13). Such representation can include a large number of variables of interest and allows for nonlinear and nonadditive relationships between these variables and the health outcome. In linear models, including linear or spline terms of all main predictors and their interactions with the outcome can lead to over-fitting issues due to the high number of parameters to estimate. The use of the kernel machine representation for h addresses this concern by regularising the high-dimensional response function, with the result that subjects with similar exposures will have similar health risks (13). Prior distributions on all the unknown parameters are placed as described in the supplemental material of (13). Adding an auxiliary parameter to the kernel function, which assumes a value of zero when the predictor is no longer included in the model, and a value of one -For large sample size data, BKMR and BSR may be employed indiscriminately, as they were able to capture complex relationships between predictors and the outcome even in presence of highly correlated variables. -For small sample size data: BKMR correctly selected the important variables, while BSR was strongly dependent on the choice of its tuning parameter and additional considerations should be taken into account to ensure an accurate variable selection. -BLASSO should be preferred only when it is reasonable to hypothesise the absence of interactions between predictors and in the presence of monotonous relationships. -In general, the performance of these models strongly depends on the sample size of the dataset, the correlation structure, and the predictor-outcome relationships.
Frontiers in Nutrition 03 frontiersin.org otherwise, BKMR allows for variable selection. The posterior inclusion probabilities (PIPs), which provide measures of variable importance, are estimated for each variable included in the model. Finally, the interactions can be studied using a graphic tool, investigating the predictor-response function of a single predictor in Z for another predictor in Z fixed at various quantiles (and for the remaining predictors set to a particular value). The BSR model is similar to the form of BKMR (1), but the predictor-response function h is modelled through the use of suitable spline functions: where each function f(.) is a natural cubic spline of a required dimension d, and dots indicate that the summation extends all the way through M-way interactions (14). The d parameter corresponds to the number of degrees of freedom used to model the effects of the predictors. The authors suggest building the model for different d values and using the Watanabe-Akaike information criterion (WAIC) to evaluate which model should be utilised going forward.
As a result of variable selection, the BSR provides the PIPs for the main effects and the interactions. Lastly, the LASSO is a technique for improving the ordinary least squares estimates for a linear exposureresponse function h by imposing the L1 penalty on the regression coefficients, shrinking or setting them to zero: , , , .

M
The λ parameter controls the amount of shrinkage that is applied to the estimates. Within a Bayesian context, BLASSO is obtained by placing a conditional Laplace prior on the α parameter (15), considering that LASSO estimates can be interpreted as posterior mode estimates when the regression parameters have independent and identical Laplace priors (3).

Software
The R programming language was used for all analyses, along with open-source software packages for BKMR (bkmr), BSR (NLinteraction), and BLASSO (monomvn).

Simulation
We compared the BKMR, BSR, and BLASSO models in terms of their ability with feature selection and performance prediction in three simulated scenarios with: (1) linear, (2) quadratic, and (3) logistic predictor-response relationships. We considered each scenario under two circumstances: with and without interactions among predictors. Changes in correlations structures and sample size (small, n = 100; large, n = 1,000) were also evaluated, and two prior settings were used in each model.
The correlation structures were set as moderate (r = 0.5) or high (r = 0.8) between the first five predictors and low (r = 0.2) or absent between all the others ( Figure 1).
We assessed the robustness of each model by changing the underlying priors in two situations: for BKMR and BSR, we assumed that a mixture component z m was included in the model with a prior probability π following a beta(a π = 2, b π = 2) or beta(a π = 2, b π = 6) distribution such that, a priori, we would expect, respectively, 50 and 25% of the components to be included (6,7). For λ 2 (square of the lasso penalty parameter) of BLASSO, we used gamma distribution priors with α (shape) and β (rate) parameters (1, 0.5) and (1, 2), corresponding to exponential distributions (10). A total of 48 configurations (x3 High-and low-correlation structure for the simulated datasets. Frontiers in Nutrition 04 frontiersin.org scenarios, x2 interaction situations, x2 correlation structures, x2 sample sizes, x2 prior settings) were studied for all the models. For each of these configurations, we generated K = 100 simulated datasets with M = 10 predictors Z Z distribution with a mean of zero and a standard deviation of one for each predictor, and with the correlation structure defined above. The predictor-response functions depending on the scenario and the presence/absence of an interaction are shown in Table 1; in Figure 2, a summary of the simulation structure and parameters is presented. Training and validation sets were selected with an 80/20 ratio for each dataset. We computed the average PIP of each predictor from the training datasets and then reported mean PIPs and standard errors (SEs) for all variables of interest. We fit BSR for a few d values, which are considered optimal in Antonelli et al. (14) (i.e., d ∈ {1, 2, 3, 4}).
We assessed the performance prediction on the validation sets using the mean squared error (MSE), reporting the mean and the standard deviation (SD) for each configuration. To select the important predictors in each model, we used the median probability model (MPM) (17), which consisted of having a marginal PIP of at least 50%.

Results
The results in terms of feature selection under each configuration and considering only the first set of priors are shown in  Due to the robustness of the results, the findings with the second prior setting are shown in Supplementary Figures S1-S3. Each bar represents the mean PIP for each variable and model, and the black horizontal line shows the MPM threshold of 50%.
Under the first scenario, all the models were able to select the predictors Z 2 and Z 3 when the sample size was large. With a small sample size (n = 100), BLASSO and BKMR showed similar selection abilities, selecting both Z 2 and Z 3 . BSR was unable to identify Z 2 in the situation of high correlation with a sample size of 100. BSR was influenced by the choice of the parameter d; in this scenario, d = 1 led to the best results. Z 1 was not selected by any model, nor were the highly correlated variables Z 4 and Z 5 . Considering the interaction terms, the results were mostly consistent with previous findings, with the exception of the BLASSO, which did not select any predictor. BSR with d = 1 led to the best results when the sample size was large; that is, it showed mean PIP values that were higher than those of the models with different d values. It also identified the weak relationship with Z 1 , but to the detriment of a greater number of false positives (selecting also the correlated variables Z 4 and Z 5 ).
In the second scenario, all the models were able to identify the variables Z 2 and Z 3 in all configurations. The BKMR and BSR also identified Z 1 as important in the scenario with a large sample size, while in the low sample size, only the BKMR identified this predictor. When the interactions were included, we detected a decrease in the selection performance of the BLASSO that did not identify any predictors, while the results of the BKMR and BSR were unchanged. BSR showed strong variability in terms of mean PIP as the parameter d varied, especially when the predictor-response relationship was weak.
Finally, in the third scenario, the BKMR was able to select the predictors Z 2 and Z 3 in all configurations, both with and without interactions, while the BSR was unable to identify Z 2 in the scenario with a low sample size, high correlation, and without the interactions. Z 1 was identified by both models only in when the interactions were included and a large sample size was used, while only the BKMR was able to identify this relationship when the sample size was small. The BLASSO performed well, selecting Z 2 and Z 3 in all situations without interactions. When interactions were present, it was able to identify Z 3 and, only with a large sample size and low correlation, also Z 2 . Changes in prior settings led to consistent results for all models.
To test the predictive ability of the models on the validation set, we used MSE. The results in terms of the mean MSE for each configuration and the first prior setting are shown in Table 2 Scenario 2: quadratic predictor-response associations  Simulation summary.

Real case study on morbidly obese women
Here we present a case study to compare the models using data from the Istituto Auxologico Italiano. The FUOBAUXO study, approved by the Ethical Committee of the Istituto Auxologico Italiano (protocol research project code: 18A301) (16), included 1,129 women hospitalised for clinical complication care and to access the metabolic rehabilitation program at San Giuseppe Hospital in Piancavallo (VB), Italy. The eligibility criteria for enrolment in this cohort included being over 18 years of age and having a Body Mass Index (BMI) ≥ 30 kg/m 2 . All women were recruited between May 2015 and July 2019. The protocol was explained to the patients, who gave their written informed consent. The overall objective of this study was to determine the association between the rehabilitation program and each patient's Mean PIPs for each configuration in Scenario 2 (quadratic predictor-response associations). Results for prior setting 1 shown (π following a beta (a π = 2, b π = 2) for BKMR/BSR, λ 2 following a gamma (α = 1, β = 0.5) for BLASSO]. Mean PIPs for each configuration in Scenario 3 (logistic predictor-response associations). Results for prior setting 1 shown (π following a beta (a π = 2, b π = 2) for BKMR/BSR, λ 2 following a gamma (α = 1, β = 0.5) for BLASSO]. In this study, we focused on the association between several predictors-anthropometric, clinical, and biochemical variables-and weight loss percentage at discharge on obesity-affected women. The predictors included in the models were grouped according to a medical experts' suggestion in: anthropometric, biochemical, cardiovascular, metabolic, and inflammatory (Supplementary Table S2). Furthermore, predictors were characterised by a complex correlation structure (Supplementary Figure S4). Women with missing information on the predictors were not included in the analyses, thus leaving a total of 755 participants. We divided the initial dataset into training (80%, n 1 = 604) and validation (20%, n 2 = 151) sets. Women in the training and validation sets had similar demographic characteristics (Table 3). We fitted the three models on the training dataset and considered the 50% PIP threshold for predictor selection ( Table 4). The BSR WAICs when d = 3 and d = 4 were equivalent (2793.0 and 2792.4, respectively). Therefore, we chose the model with d = 3 because the model increases its complexity with an increase of the d value.
In the training data among 39 predictors, 13 were selected by the models. All three models agreed in selecting the lean body mass (mean PIP: BKMR = 99.9%, BSR = 81.7%, BLASSO = 82.7%). However, we identified discrepancies among the variable selections performed by the BKMR, BSR, and BLASSO models. The BLASSO selected all anthropometric predictors except for uric acid. Both the BKMR and BSR selected thyroid-stimulating hormone (TSH) and erythrocyte sedimentation rate, showing a nonadditive effect between the two (BKMR and BSR results are shown in Figure 6). Among the differences between the BKMR and BSR results, we identified fasting plasma glucose, which was a relevant predictor for the BSR but not for the BKMR. The BKMR selected a greater number of variables, suggesting a wider interaction pattern ( Figure 6) and more complex mechanisms of action (e.g., between total cholesterol and TSH). We measured and compared the prediction ability of each model by leveraging the validation set and using the MSE. The MSE obtained were 3.6, 7.6, and 3.6. for the BKMR, BSR, and BLASSO, respectively.

Discussion
In this study, we compared three supervised Bayesian methods (BKMR, BSR, and BLASSO) that were designed to perform variable selection and evaluate the association of multiple predictors on an outcome. We highlighted advantages and limitations of each approach under different data configurations. Both the BKMR and BSR are  Frontiers in Nutrition 08 frontiersin.org novel semiparametric approaches that are able to handle nonlinear and nonadditive relationships, while the BLASSO regression accommodates only additive and linear predictor-outcome associations. In the simulations, we observed that the performance of both the BKMR and BSR strongly depends on the sample size and correlation structure. When the data sample size was large, both the BKMR and BSR correctly selected the predictors associated with the outcome, except when the true relationship was extremely weak. With a small sample size, the predictors more strongly associated with the outcome were always selected by the BKMR. With a small sample size and when the predictor-outcome relationship was weak, only the BKMR correctly selected the predictors, thus making the BKMR the most robust variable selection approach across scenarios. With a small sample size dataset and independently of the strength of predictor-outcome associations, BSR showed high heterogeneity in the variable selection abilities because of strong influence the tuning d parameter. The BLASSO captured well strong linear and additive relationships between predictors and outcome. BLASSO variable selection ability was hampered by strong nonlinear associations or interactions. When there were high correlations between predictors, PIP values were more unstable for all models. In terms of prediction ability, no differences were observed between the BKMR and BSR (for each value of d considered), while the BLASSO seemed to be characterised by higher average MSE values in most configurations. Despite the presence of highly correlated variables that are not associated with the outcome of interest, the models considered do not detect spurious relationships. This confirms their ability to handle situations of high correlation between predictors.   Our case study had a sample size comparable to the large-samplesize synthetic scenarios. When we applied all three models (BKMR, BSR, and BLASSO) to the case study, we observed that, out of all variables, the lean body mass was selected by all models. The selection of at least one anthropometric parameter, such as lean body mass, emphasised the importance of body composition in the process of weight loss. Several weight loss programs have resulted in short-term success, but many patients fail in long-term weight loss maintenance because of the loss of lean body mass that frequently and unintentionally occurs (18). There is a wide literature on the variation of body compartments during weight loss phases, which demonstrates that body composition improvement is mainly associated with the maintenance or increase of lean muscle mass (besides fat mass decrease) when dieting (19). The BLASSO selected other anthropometric measures; however, those variables were supposed to be strongly correlated with lean body mass, thus suggesting that they were false positives.
Both the BKMR and BSR identified the outcome "weight loss" as linked to TSH and the erythrocyte sedimentation rate (ESR), suggesting the existence of a nonlinear relationship between those variables and the outcome. Thyroid hormones have an essential and well-known role in body weight regulation, mainly through energy expenditure modulation. Indeed, hyperthyroidism or hypothyroidism frequently lead to significant changes in body weight (20). Thyroid hormones are tightly linked to cholesterol levels, which may lead to an increase of androgen levels and facilitate weight loss. The ESR is an inflammation marker routinely used in clinical practice to estimate overall inflammation. Common metabolic abnormalities, such as obesity and metabolic syndrome, are pro-inflammatory states that can be associated with increased ESR levels (21). Moreover, in morbidly obese patients, surgically induced weight loss is associated with a marked decrease in ESR and white blood cell count, which may indicate general inflammatory status improvements (22).
The BKMR identified five other variables associated with the outcome, but the BSR did not select them. Based on what we have learnt with the simulated scenarios, those variables may be weakly associated with weight loss, and further studies should investigate the effective role of those variables on weight loss.
In the case study, the BSR identified fasting plasma glucose as associated with weight loss. This variable was correlated with other metabolic variables selected by the BKMR, thus suggesting a connection between metabolic parameters and weight loss. This result was in part expected, given that the beneficial effects of decreased energy intake and weight loss in blood glucose control begin to occur rapidly during rehabilitation programs. The use in the case study of BKMR and BSR, models capable of handling situations with high correlation and complex relationships with the outcome, enabled the identification of important predictors belonging to different clinical groups by selecting only one or a few variables from each group. In contrast, BLASSO selected highly correlated variables largely belonging to a single clinical group, the anthropometric group, with the greatest possibility of identifying spurious relationships between predictors and outcomes.
A few limitations of our study should be noted. Neither the synthetic nor real case scenarios considered the high-dimensional framework, which can include more than 40 predictors and more than 1,000 participants. Moreover, in the simulations, we did not consider departures from the normality assumption in the linear models. Although these approaches adapt very well to high-dimensional biological or clinical contexts with different underlining distribution functions, further studies should assess the performance of those approaches under the more extended framework. Furthermore, considering the role played by the prior choice in the Bayesian variable selection, we selected two prior settings, which guaranteed a low and medium inclusion probability, a priori, of each component. However, further studies may need to assess the robustness of the results. Lastly, the real case scenario focused only on Caucasian women undergoing a weight loss program; for this reason, these results cannot be generalised to the entire population, to men, or to minorities who are more prone to develop metabolic conditions (i.e., Hispanics). Larger and more ethnically diverse studies should confirm our findings.

Conclusion
This is a first attempt to compare the performance of these innovative methods beyond typically environmental contexts and to provide practical usage information. In summary, the performance of all these methods depended on the sample size, the correlation structure, and the relationship between the predictors and the outcome.
The BKMR showed excellent performance in all scenarios, both with high and low correlation and sample size, identifying predictors strongly associated with the outcome and without false positives. It was also able to identify true weak relationships.
The performance of the BSR was also excellent in all different scenarios, but showed a very strong dependence on the choice of its own parameter, especially with a small sample size or high correlations between predictors, showing a higher variability in selection abilities.
The BLASSO performed poorly under nonlinear and synergistic scenarios but had good performance under the linear and additive scenarios. Our results suggest that both the BKMR and BSR may be employed in all scenarios, with attention in selecting an appropriate tuning parameter for BSR when sample size is small. The BLASSO should be preferred when it is possible to hypothesise the absence of interactions between predictors and in the presence of monotonous relationships and low correlations between predictors.

Data availability statement
The simulation data supporting the conclusions of this article will be made available by the authors, without undue reservation. FUOBAUXO data are available upon request from the Auxologico team. Please contact n.pesenti@campus.unimib.it.

Ethics statement
The studies involving human participants were reviewed and approved by Ethical Committee of the Istituto Auxologico Italiano (protocol research project code: 18A301). The patients/participants provided their written informed consent to participate in this study.

Author contributions
NP participated in the conceptualization and design of the work, managed the acquisition of data, gave substantial contributions to the analysis or interpretation of data, wrote the first draft of the paper, gave final approval of the version published, and ensured that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. PQ and EC participated in the conceptualization and design of the work, gave substantial contribution to the interpretation of data, critically revised the work and gave final approval of the version published. RC and MS gave substantial contribution to the interpretation of data, revised the work and gave final approval of the version published. AZ participated in the conceptualization and design of the work, gave essential contribution to the interpretation of data, revised the draft giving important intellectual contribution and gave final approval of the version published and ensured that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. All authors contributed to the article and approved the submitted version.

Funding
During the preparation of this manuscript, EC was supported by the National Institute of Environmental Health Science (NIEHS): R01ES032242 and P30ES023515.