Modified Quantile Regression for Modeling the Low Birth Weight

This study aims to identify the best model of low birth weight by applying and comparing several modifications of the quantile regression method. The birth weight data violate the assumptions of the linear model; thus, quantile approaches are used. Quantile regression is combined with the Bayesian approach because the Bayesian method can produce a better model in small samples. The three modified quantile regression methods considered here are Bayesian quantile regression, Bayesian Lasso quantile regression, and Bayesian Adaptive Lasso quantile regression. This article uses the skewed Laplace distribution as the likelihood function in the Bayesian analysis. The cross-sectional study collected primary data on 150 birth weights in West Sumatera, Indonesia. Based on the simulation study, Bayesian Adaptive Lasso quantile regression performed well compared to the other two methods, with a smaller absolute bias and a shorter Bayesian credible interval. This study also found that, in the best model, birth weight is significantly affected by maternal education, the number of pregnancy problems, and parity.


INTRODUCTION
Birth weight is considered a significant predictor of later life's physical, psychological, and behavioral outcomes. Infants with low birth weight (LBW) (less than 2,500 g) tend to experience a delay in their development and face a greater risk of early childhood mortality than normal-weight infants [1][2][3][4]. Investigating the causes of low birth weight has become necessary and has come under intense global scrutiny.
There are several LBW determinants. One of the most relevant is the maternal education level [5]. In developed countries, mothers with unfavorable socioeconomic status and low education levels are more vulnerable to having LBW children [6]. Conversely, the use of prenatal health care and health technologies in the preconception, prenatal, and perinatal periods has led to an increase in the proportion of LBW, especially in the more affluent social strata, which have greater access to such procedures. Additionally, late pregnancies also add to the proportion of LBW. Recent observational studies have shown an increase in LBW in more privileged social groups and in regions with higher economic growth [7]. Gestational weight gain (GWG) is also an important determinant of pregnancy outcomes and LBW. Low GWG has been linked to a higher incidence of preterm delivery and LBW [8].
LBW concerns a specific part of the birth weight distribution, so its determinants cannot be identified with ordinary least squares (OLS). OLS techniques focus on the average relationship between a set of regressors and the outcome variable, based on the conditional mean function only. This study instead describes these relationships from the perspective of the conditional distribution. The quantile regression method provides that capability [9][10][11][12]. Quantile regression has gained increasing popularity due to its two primary features. First, it offers more valuable information than regular mean regression about the predictors' effects on different quantiles of the response variable. Second, it is relatively insensitive to heteroscedasticity, outliers, or other anomalies of latent responses, and thus it can accommodate the non-normal errors commonly encountered in many practical applications [13][14][15]. These two strengths have resulted in a rapid expansion of quantile regression applications over recent years, particularly in the social sciences, public health, medicine, and econometrics.
Yanuar et al. [7] wrote that quantile regression needs a sample size of more than 250 to produce a better model. They then suggested implementing the Bayesian approach to construct the model with a small to moderate sample size. Bayesian techniques for variable selection in quantile regression have received considerable attention in recent literature because Bayesian methods are often more competitive for small or moderate data sets with a low signal-to-noise ratio [16][17][18]. Li et al. [19] gave a generic treatment of a set of regularization approaches, including Lasso, group Lasso, and elastic net penalties. Alhamzawi and Yu [9,20], Ji et al. [21], and Chen et al. [22] extended stochastic search variable selection (SSVS) methods from mean regression to quantile regression. Benoit et al. [23] proposed a Bayesian hierarchical model for variable selection and estimation in the context of binary quantile regression. Oh et al. [10] proposed an alternative Bayesian variable selection method in quantile regression using the Savage-Dickey density ratio. Theoretical aspects of quantile regression were also discussed by Muharisa et al. [24] and Yanuar et al. [25]. Muharisa et al. [24] demonstrated the capability of the Bayesian quantile method in handling non-normal problems, while Yanuar et al. [25] used a simulation study to describe the capability of the quantile method in handling a heteroscedastic problem.
This study focuses on constructing the best model of LBW by comparing the performance of three modifications of the quantile regression method, i.e., Bayesian quantile regression (BQR), Bayesian Lasso quantile regression (BLQR), and Bayesian Adaptive Lasso quantile regression (BALQR). The primary data set of birth weight in West Sumatera was used in this study. The acceptability of the algorithm for implementing all three methods is also tested by a simulation study under three error conditions. The rest of the article is organized as follows.
In Section 2, we provide information on the data used for this study. Section 3 presents a description of the Bayesian quantile regression and Bayesian quantile regression with Lasso and Adaptive Lasso. Section 4 contains the results of this study, consisting of a simulation study to examine the performance of the proposed methods and modeling of LBW in West Sumatera, Indonesia. Finally, brief conclusions are given in Section 5.

METHODOLOGY
The statistical analysis used in this study is a modification of quantile regression, since the sample size is moderate, with 150 observations. Three modified quantile regression methods are considered, namely, Bayesian quantile regression, Bayesian Lasso quantile regression, and Bayesian Adaptive Lasso quantile regression.
For $0 < \tau < 1$, let $Q_{y_i}(\tau \mid x_i)$ denote the $\tau$-th quantile regression function of $y_i$ given the associated $p$-dimensional vector of covariates $x_i$. The quantile regression function is expressed in the form $Q_{y_i}(\tau \mid x_i) = x_i^T \beta$, for $i = 1, 2, \ldots, n$, where $\beta$ is a $p \times 1$ vector of coefficients for the indicator variables that depends on $\tau$. Then, a linear quantile regression model can be expressed as
$$y_i = x_i^T \beta + e_i, \quad i = 1, 2, \ldots, n. \tag{1}$$
Here, $e_i$ is the error term, whose distribution is restricted so that its $\tau$-th quantile is equal to zero. Then, the quantile regression estimate of $\beta$ is obtained by minimizing
$$\sum_{i=1}^{n} \rho_\tau\left(y_i - x_i^T \beta\right), \tag{2}$$
where $\rho_\tau(u)$ is the check function defined by
$$\rho_\tau(u) = u\left(\tau - I(u < 0)\right). \tag{3}$$
Here, $I(\cdot)$ is an indicator function that takes the value of unity when its argument is true and zero otherwise, and $u = y_i - x_i^T \beta$. However, the check function is not differentiable at zero, and explicit solutions to the minimization problem are unobtainable [26,27]. In quantile regression methods, linear programming is often implemented for parameter estimation.
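The role of the check function can be illustrated with a minimal Python sketch. It is not the authors' implementation (the study used R); the function names and the small list of weights are hypothetical, and the intercept-only model is chosen only because its minimizer can be found by searching over the data points instead of by linear programming.

```python
def check_loss(u, tau):
    """Check function rho_tau(u) = u * (tau - I(u < 0))."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def total_check_loss(y, fitted, tau):
    """Objective in expression (2): sum of check losses of the residuals."""
    return sum(check_loss(yi - fi, tau) for yi, fi in zip(y, fitted))


def best_constant(y, tau):
    """Intercept-only quantile regression.

    For a model with a single constant, a minimizer of the objective
    always exists among the observed values, so a direct search works.
    """
    return min(y, key=lambda c: total_check_loss(y, [c] * len(y), tau))


# Hypothetical birth weights in kg, used only for illustration.
weights = [2.1, 2.4, 2.9, 3.0, 3.1, 3.3, 3.4, 3.6, 3.8, 4.2]
q25 = best_constant(weights, 0.25)
print(q25)
```

With ten observations and $\tau = 0.25$, the minimizer is the third smallest value, which matches the usual idea of a lower quartile. With covariates, the same objective is minimized over $\beta$, which is where linear programming comes in.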
Yu and Moyeed [27] found that minimizing expression (2) is equivalent to maximizing a likelihood function formed from independently distributed asymmetric Laplace errors. The asymmetric Laplace distribution is employed as the likelihood in order to make Bayesian estimation more natural and effective [27,28], since this distribution provides a parametric link between the minimization problem in Equation (2) and maximum likelihood theory. Therefore, a random variable $\varepsilon_i$ is said to follow a skewed Laplace distribution if its density is [29]
$$f(\varepsilon_i \mid \delta) = \delta \tau (1 - \tau) \exp\left\{-\delta \rho_\tau(\varepsilon_i)\right\}, \tag{4}$$
where $\delta$ is a scale parameter. It is also known that the mean and variance of the asymmetric Laplace distribution are given respectively by
$$E(\varepsilon_i) = \frac{1 - 2\tau}{\delta \tau (1 - \tau)}, \qquad \mathrm{Var}(\varepsilon_i) = \frac{1 - 2\tau + 2\tau^2}{\delta^2 \tau^2 (1 - \tau)^2}. \tag{5}$$
One property of the asymmetric Laplace distribution is that it admits various mixture representations. The Gibbs sampling algorithm is then used for Bayesian analysis of the quantile regression model, based on the theoretical derivation of the skewed Laplace distribution. Representing the skewed Laplace distribution as a mixture of exponential and normal distributions allows efficient Gibbs sampling [29][30][31]. More specifically, let
$$\varepsilon_i = \theta z_i + \phi \sqrt{z_i / \delta}\, \xi_i, \tag{6}$$
where
$$\theta = \frac{1 - 2\tau}{\tau (1 - \tau)}, \qquad \phi^2 = \frac{2}{\tau (1 - \tau)}. \tag{7}$$
Equations (1) and (6) lead to
$$y_i = x_i^T \beta + \theta z_i + \phi \sqrt{z_i / \delta}\, \xi_i. \tag{8}$$
The random variables $z_i$ and $\xi_i$ are mutually independent. The variable $z_i$ is exponentially distributed with mean $1/\delta$, and $\xi_i$ follows a standard normal distribution. Thus, the conditional distribution of $y_i$ given $z_i$ is normal with mean $x_i^T \beta + \theta z_i$ and variance $\phi^2 z_i / \delta$, which reduces to $\phi^2 z_i$ when $\delta = 1$. The quantile regression model is therefore represented as a normal regression model. This representation provides an easy way to construct a Gibbs sampler and saves time in sampling the regression coefficients.
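The mixture representation can be checked numerically. The following Python sketch is an illustration only, not the authors' code: it fixes $\delta = 1$, uses the standard constants $\theta = (1 - 2\tau)/(\tau(1 - \tau))$ and $\phi^2 = 2/(\tau(1 - \tau))$, and treats the way $\delta$ enters the normal term as an assumption of the sketch. If the representation is correct, a share of about $\tau$ of the simulated errors should fall below zero.

```python
import math
import random


def ald_mixture_draw(tau, delta, rng):
    """One draw of theta*z + phi*sqrt(z/delta)*xi with z ~ Exp(mean 1/delta)."""
    theta = (1 - 2 * tau) / (tau * (1 - tau))
    phi = math.sqrt(2 / (tau * (1 - tau)))
    z = rng.expovariate(delta)   # expovariate takes the rate, so the mean is 1/delta
    xi = rng.gauss(0.0, 1.0)     # standard normal
    return theta * z + phi * math.sqrt(z / delta) * xi


rng = random.Random(2020)        # fixed seed so the run is reproducible
tau, delta = 0.25, 1.0           # delta = 1 is an assumption of this sketch
draws = [ald_mixture_draw(tau, delta, rng) for _ in range(40000)]

share_below_zero = sum(d <= 0 for d in draws) / len(draws)
sample_mean = sum(draws) / len(draws)
expected_mean = (1 - 2 * tau) / (delta * tau * (1 - tau))
print(round(share_below_zero, 3), round(sample_mean, 3), round(expected_mean, 3))
```

The empirical share below zero should be close to $\tau$, which is the restriction placed on the error term in the quantile regression model, and the sample mean should be close to the mean of the asymmetric Laplace distribution.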

Bayesian Quantile Regression With Lasso and Adaptive Lasso
One crucial problem in building a quantile regression model is the selection of predictors. Prediction accuracy is often improved by choosing an appropriate subset of predictors, and in practice it is often desirable to identify a smaller subset from a large set of predictors to obtain a better interpretation. There have been active studies on the sparse representation of linear regression. Li et al. [19] showed that the least absolute shrinkage and selection operator (Lasso) technique could simultaneously perform variable selection and parameter estimation. The Lasso estimate, which is also known as an L1-regularized least squares estimate, uses the sum of the coefficients' absolute values as the penalty. This L1 regularization has the advantage of simultaneously controlling the variance of the fitted coefficients and performing automatic variable selection. Following Alhamzawi and Yu [9,20] and Zou [32], the Lasso estimates are defined as
$$\hat{\beta} = \arg\min_{\beta} \sum_{i=1}^{n} \rho_\tau\left(y_i - x_i^T \beta\right) + \lambda \sum_{k=1}^{p} |\beta_k|, \tag{9}$$
where $\lambda > 0$ is a regularization parameter that controls the degree of penalization. The second term in expression (9) is an $L_1$ regularization term, and the Lasso estimate can be interpreted as a Bayesian posterior mode estimate under independent Laplace priors for the coefficients. As the nonnegative regularization parameter $\lambda$ increases, the Lasso continuously shrinks the quantile regression coefficients toward zero. The appropriate prior distribution for $\beta_k \in \beta$ $(k = 1, \ldots, p)$ is the Laplace distribution, defined as follows:
$$\pi(\beta_k \mid \delta, \lambda) = \frac{\delta \lambda}{2} \exp\left\{-\delta \lambda |\beta_k|\right\}. \tag{10}$$
Here, it is assumed that the residuals $\varepsilon_i$ have a skewed Laplace distribution as represented in Equation (4). Then, we employed the Bayesian Adaptive Lasso quantile regression (BALQR) to estimate the unknown model parameters. The penalty function in (9) can be made "adaptive" by choosing different shrinkages for different coefficients:
$$\hat{\beta} = \arg\min_{\beta} \sum_{i=1}^{n} \rho_\tau\left(y_i - x_i^T \beta\right) + \sum_{k=1}^{p} \lambda_k |\beta_k|, \tag{11}$$
where $\lambda_k > 0$ is the tuning parameter for the $k$th coefficient. Here, the newly proposed Laplace prior for $\beta_k$ is
$$\pi(\beta_k \mid \delta, \lambda_k^2) = \frac{\delta^{1/2}}{2 \lambda_k} \exp\left\{-\frac{\delta^{1/2} |\beta_k|}{\lambda_k}\right\}. \tag{12}$$
This prior can be represented as a scale mixture of normals with an exponential mixing density [29,33,34].
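Let $a_k = \delta^{1/2} / \lambda_k$. The prior distribution for $\beta_k$ is expressed in the form
$$\pi(\beta_k \mid \delta, \lambda_k^2) = \frac{a_k}{2} \exp\left\{-a_k |\beta_k|\right\}$$
or in the form
$$\pi(\beta_k \mid \delta, \lambda_k^2) = \int_0^\infty \frac{1}{\sqrt{2 \pi s_k}} \exp\left\{-\frac{\beta_k^2}{2 s_k}\right\} \frac{a_k^2}{2} \exp\left\{-\frac{a_k^2}{2} s_k\right\} ds_k. \tag{13}$$
Then, we consider the class of inverse gamma priors on $\lambda_k^2$ of the form
$$\pi(\lambda_k^2 \mid \sigma, \rho) = \frac{\rho^{\sigma}}{\Gamma(\sigma)} \left(\lambda_k^2\right)^{-1-\sigma} \exp\left\{-\frac{\rho}{\lambda_k^2}\right\}. \tag{14}$$
By combining Equations (13) and (14), we find that the posterior distribution of $\lambda_k^2$ is inverse gamma with shape parameter $1 + \sigma$ and rate parameter $\delta s_k / 2 + \rho$. The values of the two hyperparameters $\sigma$ and $\rho$ affect the amount of shrinkage in the prior in Equation (14); e.g., larger $\sigma$ and smaller $\rho$ lead to stronger penalization. BALQR thus uses a Laplace prior for $\beta_k$ such that each $\beta_k$ has its own Lasso-type penalization parameter $\delta^{1/2} / \lambda_k$. More detailed explanations of the posterior density functions for all model parameters are given in Alhamzawi et al. [29] and Xu and Tang [34].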
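Two ingredients of this hierarchy can be illustrated with a short Python sketch. It is an illustration under stated assumptions, not the sampler used in the study: it assumes the Laplace rate is $a_k = \delta^{1/2}/\lambda_k$, that the exponential mixing density has rate $a_k^2/2$, and that the inverse gamma distribution is parameterized by shape and rate. The function names and numeric values are hypothetical.

```python
import math
import random


def laplace_via_mixture(a, rng):
    """Draw beta ~ Laplace(rate a) as a normal scale mixture.

    s ~ Exponential(rate a^2 / 2), then beta | s ~ N(0, s).
    """
    s = rng.expovariate(a * a / 2.0)
    return rng.gauss(0.0, math.sqrt(s))


def draw_lambda_sq(delta, s_k, sigma, rho, rng):
    """Draw lambda_k^2 from an inverse gamma with shape 1 + sigma and
    rate delta * s_k / 2 + rho, using the reciprocal of a gamma draw.
    random.gammavariate takes (shape, scale), so the scale is 1 / rate.
    """
    shape = 1.0 + sigma
    rate = delta * s_k / 2.0 + rho
    return 1.0 / rng.gammavariate(shape, 1.0 / rate)


rng = random.Random(7)

# Check 1: a Laplace(rate a) variable has E|beta| = 1 / a.
a = 2.0
betas = [laplace_via_mixture(a, rng) for _ in range(40000)]
mean_abs_beta = sum(abs(b) for b in betas) / len(betas)

# Check 2: an inverse gamma with shape 3 and rate 1 has mean 1 / (3 - 1) = 0.5.
lams = [draw_lambda_sq(1.0, 1.0, 2.0, 0.5, rng) for _ in range(40000)]
mean_lambda_sq = sum(lams) / len(lams)
print(round(mean_abs_beta, 3), round(mean_lambda_sq, 3))
```

Both sample averages should be close to 0.5. In a full Gibbs sampler, `draw_lambda_sq` would be called once per coefficient in every iteration, with the current values of $\delta$ and $s_k$.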
Credible intervals constructed directly from the posterior are not automatically valid. Yang et al. [35] proposed the Wald method, based on the asymptotic approximation to the variance-covariance matrix of the posterior sequences, to estimate the Bayes credible interval, as also reported in Li et al. [19] and Yue and Hang [36]. The present study adopts this Wald method to estimate the Bayes credible interval.
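A Wald-type interval from posterior draws can be sketched as follows. This is a simplified illustration, not the exact procedure of Yang et al. [35]: it assumes the interval is the posterior mean plus or minus a normal quantile times the posterior standard deviation, without any further adjustment of the variance-covariance matrix. The function name and the draws are hypothetical.

```python
import statistics


def wald_interval(draws, z=1.96):
    """Normal-approximation interval: mean +/- z * standard deviation of the draws."""
    center = statistics.mean(draws)
    spread = statistics.stdev(draws)   # sample standard deviation
    return center - z * spread, center + z * spread


# Hypothetical posterior draws for one coefficient.
posterior_draws = [1.4, 1.5, 1.6]
lower, upper = wald_interval(posterior_draws)
width = upper - lower
print(round(lower, 3), round(upper, 3), round(width, 3))
```

The interval width computed this way is the quantity compared across methods in the simulation study.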

Simulation Study
In this section, we demonstrate the application of the Bayesian, Bayesian Lasso, and Bayesian Adaptive Lasso approaches in quantile regression to several different data-generating processes. The goal of this simulation study is to assess the performance of the proposed methods and their associated algorithm in recovering the true parameters. For the proposed model, the MCMC simulations were implemented in R version 3.6.1 [37]. In this simulation study, the response variable $y_i$ is generated from the following regression model:
$$y_i = x_i^T \beta + e_i,$$
where each covariate $x_{ik}$ is simulated from a standard normal distribution and $\beta = (5, 1.5, 3)$. Three different distributions for $e_i$ were considered: (i) heteroscedastic normal, $(1 + x_1) N(0, 1)$, (ii) autocorrelated error, sin(seq(0.1π, 18.3π, 0.1π) + Z) with Z (0, 0.1), and (iii) non-normal error, a mixture of two Student's t distributions, $0.1\, t_{(1)} + 0.9\, t_{(3)}$. For each choice of the error distribution, a Bootstrap resampling method with 100 simulations was carried out. In each simulation, 150 observations were generated. Four different values of the quantile, $\tau = 0.25$, $0.50$, $0.75$, and $0.95$, were considered. To assess the sampling efficiency of the proposed algorithm, the Monte Carlo standard errors for each $\beta_k$, $k = 1, 2, 3$, were calculated by running the Gibbs sampler for 5,000 iterations with an initial burn-in of 1,000 iterations to lessen the effect of the initial values. The process resulted in 4,000 final posterior samples for each regression parameter. Then the width of the 95% Bayes credible interval was estimated at each selected quantile for each proposed method. The absolute bias was also computed by comparing the estimates $\hat{\beta}_k$ with the true values $\beta_k$. The results are summarized in Tables 1-3.
Table 1 presents the results of all three proposed methods at the selected quantiles (i.e., $\tau = 0.25$, $\tau = 0.50$, $\tau = 0.75$, and $\tau = 0.95$), including the absolute bias and the width of the 95% Bayes credible interval. The table shows that all proposed methods yielded very similar results and that, in general, Bayesian Adaptive Lasso quantile regression performs well compared to the two other methods, BLQR and BQR.
Table 2 shows the results for autocorrelated errors. All proposed methods yielded similar values at corresponding quantiles. In general, BALQR performed best among the three methods because it attains the smallest value in the largest number of table entries.
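Part of the simulation design can be sketched in Python. The sketch is illustrative and rests on several assumptions that are not confirmed by the text: the model has no separate intercept, the non-normal error is a probability mixture that draws from $t_{(1)}$ with probability 0.1 and from $t_{(3)}$ with probability 0.9, and the absolute bias is the absolute difference between the average estimate and the true value. The autocorrelated case is left out because the distribution of $Z$ is not fully specified. All function names are hypothetical.

```python
import math
import random

BETA = (5.0, 1.5, 3.0)  # true coefficients stated in the text


def student_t(df, rng):
    """Student's t draw: standard normal over sqrt(chi-square / df)."""
    chi2 = rng.gammavariate(df / 2.0, 2.0)  # chi-square with df degrees of freedom
    return rng.gauss(0.0, 1.0) / math.sqrt(chi2 / df)


def heteroscedastic_error(x1, rng):
    """Error condition (i): (1 + x1) * N(0, 1)."""
    return (1.0 + x1) * rng.gauss(0.0, 1.0)


def t_mixture_error(rng):
    """Error condition (iii), read here as a 0.1 / 0.9 probability mixture."""
    df = 1 if rng.random() < 0.1 else 3
    return student_t(df, rng)


def simulate(n, error_type, rng):
    """Generate n pairs (x, y) with y = x^T beta + e."""
    data = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in BETA]
        signal = sum(b * xk for b, xk in zip(BETA, x))
        if error_type == "heteroscedastic":
            e = heteroscedastic_error(x[0], rng)
        else:
            e = t_mixture_error(rng)
        data.append((x, signal + e))
    return data


def absolute_bias(estimates, truth):
    """Absolute difference between the average estimate and the true value."""
    return abs(sum(estimates) / len(estimates) - truth)


rng = random.Random(361)
sample_i = simulate(150, "heteroscedastic", rng)
sample_iii = simulate(150, "t_mixture", rng)
print(len(sample_i), len(sample_iii), round(absolute_bias([1.4, 1.6, 1.8], 1.5), 3))
```

Each simulated data set would then be passed to the three samplers, and the absolute bias and credible interval width would be averaged over the 100 replications.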
Table 3 presents the absolute bias and the width of the 95% Bayes credible interval for the three model parameters under the non-normal error condition. The results show that the absolute bias values at corresponding quantiles are similar across methods. Note that BALQR yielded the smallest values among the three methods except for $\hat{\beta}_3$. The Bayes credible intervals of all proposed methods at the same quantiles are also similar. In general, the credible interval for BALQR is narrower than those of the other two methods.
The results for the three error distributions tell us two things. First, although the performance of the three methods proposed in this study is very close in general, the Bayesian Adaptive Lasso quantile regression method performs better than the Bayesian Lasso quantile regression and Bayesian quantile regression methods. Second, the BALQR method is robust to violations of the error distribution assumptions, such as normality, homoscedasticity, or uncorrelated errors. We therefore expect BALQR to produce more suitable parameter estimates than the other methods.
Next, we evaluate the convergence of the algorithm used in the BALQR method. We use convergence diagnostics such as the posterior plots (trace plot and density plot) and autocorrelation analysis. Figures 1-3 present the trace plot, density plot, and autocorrelation plot of $\hat{\beta}_1$ at quantile $\tau = 0.25$ for the three error conditions, i.e., heteroscedastic normal, autocorrelated error, and non-normal error, respectively. Other plots are omitted because of limited space. The trace plots in Figures 1A, 2A, and 3A show that all generated samples lie within two parallel horizontal lines, centered at their respective values, and no trends are detected. The histograms of the marginal posteriors in Figures 1B, 2B, and 3B indicate that the conditional posterior distributions are the desired stationary univariate normal distributions. All posterior distributions concentrate around the true parameter value (trace plot and density plot). Furthermore, Figures 1C, 2C, and 3C show a decrease in the empirical autocorrelation of the posterior samples, which indicates that the underlying chains are stationary.
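The autocorrelation diagnostic can be sketched in a few lines of Python. This is a generic illustration of the empirical autocorrelation at a given lag, not the plotting routine used for Figures 1C, 2C, and 3C; the function name and the short chain are hypothetical.

```python
def autocorrelation(chain, lag):
    """Empirical autocorrelation of a sequence of draws at the given lag."""
    n = len(chain)
    mean = sum(chain) / n
    deviations = [value - mean for value in chain]
    denominator = sum(d * d for d in deviations)
    numerator = sum(deviations[t] * deviations[t + lag] for t in range(n - lag))
    return numerator / denominator


# A short, strongly trending sequence used only to show the calculation.
chain = [1.0, 2.0, 3.0, 4.0, 5.0]
print(autocorrelation(chain, 0), autocorrelation(chain, 1))
```

For a well-mixing chain, the values returned for increasing lags should fall toward zero quickly, which is the pattern described for the autocorrelation plots.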
The results obtained from these convergence diagnostics indicate that our algorithm used in the BALQR approach could produce adequate and acceptable values of the estimated parameter.

Sample Data
The analysis is applied to primary data on Birth weight. The data were collected by distributing online questionnaires from February to April 2020 to mothers who had just delivered a singleton live birth and were living in West Sumatera, Indonesia. In total, 150 respondents with complete information were included in the analysis.
This study uses Birth weight, recorded in kilograms, as the response variable and 11 indicator variables, consisting of six continuous variables and five categorical variables. The continuous indicator variables were the Mother's age, Mother's weight gain (during pregnancy), Hemoglobin, Last birth interval, Parity, and Prenatal care. The categorical indicator variables were Maternal education, Maternal occupation, Residence, Number of pregnancy problems, and Sex of the baby. Maternal education was divided into three levels, Low, Middle, and High, where the Low level was set as the reference category, so coefficients are interpreted relative to this category. Maternal occupation was classified into three categories, i.e., Government employee, Housewife, and Others. Residence was categorized as Urban or Rural. The Number of pregnancy problems was categorized into three types: More than one problem, One problem, and No problem, where More than one problem was used as the reference category.
Table 4 displays the summary statistics for the continuous variables of the sample. The mean Birth weight is 3.06 kg, and the mean Mother's age is 30.22 years. The average Mother's weight gain is 12.45 kg, and the mean Hemoglobin is 11.97. The mean Last birth interval is 3.10 years. The mean Parity is 1.88, and mothers had Prenatal care 7.93 times on average.
Table 5 presents the summary statistics for the categorical variables of the sample. In terms of Maternal education, 22.7% of mothers have a Low level of education, 38% have a Middle level, and 39.3% are University or College graduates. For Maternal occupation, more than half of the mothers are Housewives (51.3%), 22.0% are Government employees, and 26.7% fall into Others. More than half of the mothers (64%) live in urban areas. In terms of the Number of pregnancy problems, the largest group of mothers had No problem during pregnancy (46%), 34.7% had One problem, and 19.3% had More than one problem. For the Sex of the baby, 46.7% are girls and 53.3% are boys.

Construction of LBW Model
In the preliminary analysis, we ran several tests on the Birth weight model. The tests indicate that the errors of the model violate the normality and homoscedasticity assumptions. The quantile regression model between the response Birth weight and the eleven predictors, without an intercept, was then applied. In this empirical study, the frequentist quantile regression model is also employed for comparison purposes. The same equation shown in (1) is used here. The wild bootstrap method, as proposed by Feng et al. [38] and Yanuar and Zetra [39], was implemented for the quantile regression to obtain the parameter estimates. The procedure for the wild bootstrap is as follows: 1. Fit Equation (1) to the data to obtain the parameter vector $\hat{\beta}$ and the residuals $\hat{e}_i$ for $i = 1, \ldots, n$.
In constructing the proposed model, we fitted various combinations of predictors and compared them to obtain the best and most acceptable model, also taking model simplicity into account (results for the model comparison are available upon request).
Hence, our reduced model involves only the significant indicator variables, namely, Maternal education (with three categories), the Number of pregnancy problems (with three categories), and Parity. Thus, we consider a quantile regression (QR) model with these predictors. The accompanying figure shows the OLS estimated mean with its upper and lower bounds for a 95% CI. This figure indicates that the width of the 95% confidence interval based on the OLS estimate appears similar to that of the quantile estimates, especially at lower quantiles. Table 6 summarizes the estimated mean and the width of the 95% CI obtained from quantile regression with the wild Bootstrap resampling method and from the modified quantile regression methods (BQR, BLQR, and BALQR). Since LBW concerns the low quantiles, the quantiles $\tau = 0.10$, $0.25$, and $0.50$ were selected. For the QR model at all selected quantiles $\tau$, the 95% CI is wider than that of each modified quantile method. These results were expected, since the sample size (150 observations) is relatively small for the QR method. Thus, in this analysis, QR does not produce an adequate model. An ANOVA test also indicated significant differences between the QR estimates and those of each modified quantile method at the selected quantiles. Furthermore, the table shows that at the 0.10th quantile, all indicator variables for QR and modified QR are significantly different from zero at the 5% level, except for No problem. We also conclude that BALQR tends to yield the shortest 95% CI among the methods.
The model obtained with BALQR is interpreted as follows. The 0.10th quantile of Birth weight for Middle is 0.613 kg higher than for Low, holding all else constant. The 0.10th quantile of Birth weight for High is 0.488 kg higher than for Low, holding all else constant. In addition, the 0.10th quantile of Birth weight for No problem is 0.409 kg higher than for More than one problem. Parity also affects Birth weight: for every one-unit increase in Parity, the 0.10th quantile of Birth weight increases by 0.116 kg, holding all else constant. A similar interpretation applies to the BALQR model at the 0.25th and 0.50th quantiles, except for No problem (not significant). At the 0.50th quantile, none of the four dummy variables is significantly different from zero at the 5% level; only Parity is. Parity, as a continuous variable, has a significant impact on Birth weight. The next analysis is the convergence check for all estimated parameters. Figure 5 shows the diagnostic plots for Middle at the 0.25th quantile for illustrative purposes. Other plots are omitted because of limited space.

CONCLUSION
The study yields an acceptable model of LBW in West Sumatera, Indonesia, after a comparative study of three modifications of quantile regression. The strength of the quantile method is that it can model the predictors' effects on different quantiles of the response variable. It can accommodate non-normal errors since it is insensitive to heteroscedasticity and outliers. The limitation of the quantile method is that it requires a large sample size; therefore, it should be modified by combining it with the Bayesian approach. Under the Bayesian quantile regression approach, the model parameters are estimated by minimizing the check function, which is equivalent to maximizing a likelihood function formed from independently distributed asymmetric Laplace errors. This technique is robust for small to moderate-sized samples and can handle cases in which the normality assumption is violated. In general, the Bayesian method does not depend on the error assumptions of the classical linear model.
Even though many studies have examined the determinants of LBW, none has modeled LBW by comparing Bayesian quantile regression and its modifications as done in this study. Few studies have constructed an LBW model using all 11 indicator variables considered here. The indicators found significant in determining LBW are Maternal education at three levels (Middle, High, and Low, with Low as the reference category), the Number of pregnancy problems in three categories (One problem, No problem, and More than one problem, with More than one problem as the reference category), and Parity. These results are in line with previous studies, such as research by Silvestrin et al. [5] and Yanuar et al. [18]. Here, the low birth weight model was constructed from 150 respondents. We assumed that these respondents represent the condition of other mothers who have just had a baby and live in West Sumatera. Based on the data, we found that this sample size meets the requirement of sample adequacy. However, we recommend that future research use at least 200 samples to avoid misleading results when implementing the quantile regression approach.
In the present study, we implemented the Wald method, based on the asymptotic approximation to the variance-covariance matrix of the posterior sequences, to estimate the Bayes credible interval. Based on the simulation study and the empirical study, Bayesian Adaptive Lasso quantile regression was shown to yield the smallest absolute bias and the shortest 95% Bayes credible interval among the three methods. This study is also of considerable significance for policymakers and decision-making organizations concerned with maternal pregnancy health, who may use it to improve the adequacy of prenatal care use and to facilitate the development of culturally sensitive interventions that enhance the nutritional status and health of pregnant mothers.

DATA AVAILABILITY STATEMENT
The datasets presented in this article are not publicly available because restrictions apply. Requests to access the datasets should be directed to the corresponding author and require permission from the West Sumatra Provincial Health Office, Indonesia.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by M. Djamil Hospital. Written informed consent to participate in this study was provided by the participants' legal guardian/next of kin.