Bayesian-bootstrap process capability analysis for mixture model

Kanwal, Tahira; Abbas, Kamran; Shamoon, Imran; Zaman, Mehwish; Hussain, Zamir

doi:10.3389/fams.2025.1744829

ORIGINAL RESEARCH article

Front. Appl. Math. Stat., 12 January 2026

Sec. Statistics and Probability

Volume 11 - 2025 | https://doi.org/10.3389/fams.2025.1744829

Mehwish Zaman³*

¹Department of Statistics, King Abdullah Campus Chatter Kalas, The University of Azad Jammu and Kashmir, Muzaffarabad, Pakistan
²The University of Azad Jammu and Kashmir, Muzaffarabad, Pakistan
³Department of Statistical Science, University of Padova, Padova, PD, Italy
⁴School of Interdisciplinary Engineering and Sciences (SINES), National University of Sciences and Technology (NUST), Islamabad, Pakistan

This research contributes valuable insights into the evaluation of the process capability indices S_pmk and C_PY through the development of a two-component mixture of Frechet distributions whose parameters are estimated by maximum likelihood and Bayesian methods. Furthermore, bootstrapping is used to assess the stability and performance of the estimated process capability indices. The comparative study revealed that the Bayesian estimators outperform their maximum likelihood counterparts in terms of mean squared error and width of the bootstrap confidence intervals across all sample sizes considered, from small to large. The real-life data results reinforce the findings across different analytical approaches. These findings hold implications for researchers and quality control experts engaged in manufacturing, services, and other industries, and they emphasize the importance of methodological choices in ensuring robust and accurate process capability analysis when the underlying process distribution is complex and possibly multimodal.

1 Introduction

Process capability refers to a process's intrinsic ability to produce a good product as specified in the product design process. A crucial component of any continuous quality improvement endeavor is measuring the performance of a process and taking appropriate action based on the measurements (Spiring [1]). Businesses evaluate the effectiveness of their processes using a variety of metrics. The process capability indices (PCIs) are the most prevalent of these metrics [2]. The key reason for their popularity is that businesses need quantitative measures of the process's performance in relation to its specification constraints [3]. PCIs are unitless statistical measures used to compare how well a process characteristic performs. The quality control literature includes many process capability indices applicable to various process scenarios. Some commonly used ones are Cp by Juran et al. [4]; Cpk by Kane [5]; Cpm by Chan et al. [6]; and Cpp by Greenwich and Jahr-Schaffrath [7]. Other PCIs introduced in the literature include CNp, CNpk, CNpm, and CNpmk (see Pearn and Chen [8]; Schneider et al. [9]; and Tong and Chen [10]).

In statistics, finite mixture models have been used extensively. They are particularly valued for generalizing distributional assumptions and for modeling heterogeneity in a population. Finite mixture models have several uses in various fields, including biomedical science, engineering, social sciences, medicine, economics, marketing, reliability studies, and life testing problems, because mixture distributions can represent heterogeneous data sets whether the data show evidence of multimodality or are simply unimodal. Generally, a mixture distribution is formulated by combining two or more distributions using mixing parameters, and a mixture of two sub-populations can be a suitable model for characterizing the overall population. Moreover, multiple causes of failure can be studied simultaneously via mixture distributions. Usually, the failure time population comprises weak and strong components corresponding to short and long lives, respectively. Recently, various authors have discussed different types of mixtures of distributions. The readers may refer to Turkan and Calis [31]; Everitt [11]; Everitt and Hand [12]; Jiang and Murthy [13]; Jiang and Murthy [14]; Marin et al. [15]; McLachlan and Peel [16]; Titterington et al. [17]; Titterington et al. [18]; and Lindsy [19]. Ali and Riaz [20] studied the generalized capability indices using a Bayesian approach with different loss functions for simple and mixture generalized lifetime models. Moreover, Ouyang et al. [21] and Lin et al. [22] developed credible intervals for the PCIs. Wu [32] employed non-informative priors to construct the process capability estimator for subsamples gathered over time using the squared error loss function. Kargar et al. [23] employed a Bayesian technique with a normal prior based on subsamples to assess process capability using the capability index Cpk. Saxena and Singh [24] studied Bayesian estimation of the Cp index for normal distributions. Kanwal and Abbas [25] developed confidence intervals for PCIs using an objective Bayesian approach and compared them with the classical approach. The PCIs Cpy and CNpmk were developed for non-normal processes by Kanwal et al. [26].

The main objective of this study is to develop the Bayesian estimators for the vector of parameters (Θ = {ψ, α_j, β_j}, j = 1, 2) of the mixture of Frechet distributions (MFD) and to estimate the PCIs S_pmk and C_PY for MFD. Furthermore, bootstrap confidence intervals (BCIs) are evaluated as a measure of performance for both maximum likelihood estimates (MLEs) and Bayesian estimates (BE). The novelty of the present work lies in the fact that no attempt has previously been made to study the PCIs for MFD. Our aim is to give recommendations for selecting a suitable approach for estimating Θ for MFD and for the subsequent evaluation of the PCIs. This study is hoped to be valuable to practicing engineers, quality experts, and applied statisticians. In this work, we propose an MFD that can be successfully used in the evaluation of PCIs in real-life problems. Let a random variable T follow a finite mixture model with k components; then the probability density function (PDF) can be written as

\begin{array}{l} f (t) = \sum_{j = 1}^{k} ψ_{j} f_{j} (t) & (1) \end{array}

where the ψ_j are the mixing parameters, which are non-negative proportions such that $\sum_{j = 1}^{k} ψ_{j} = 1$. For k = 2, writing ψ₁ = ψ and ψ₂ = 1 − ψ, the PDF of MFD is given as

\begin{array}{l} f (t) = ψ f_{1} (t; α_{1}, β_{1}) + (1 - ψ) f_{2} (t; α_{2}, β_{2}) & (2) \end{array}

The PDF of the jth component, j = 1, 2, is

\begin{array}{l} f_{j} (t ∣ α_{j}, β_{j}) = (\frac{α_{j}}{β_{j}}) {(\frac{β_{j}}{t})}^{α_{j} + 1} \exp [- {(\frac{β_{j}}{t})}^{α_{j}}], t, α_{j}, β_{j} > 0, & (3) \end{array}

where α_j and β_j are the shape and scale parameters of the Frechet distribution (FD). The cumulative distribution function (CDF), reliability function, and hazard rate function of the MFD are, respectively, given by

\begin{array}{l} F (t) = ψ \exp [- {(\frac{β_{1}}{t})}^{α_{1}}] + (1 - ψ) \exp [- {(\frac{β_{2}}{t})}^{α_{2}}] & (4) \end{array}

\begin{array}{l} R (t) = ψ [1 - \exp (- {(\frac{β_{1}}{t})}^{α_{1}})] + (1 - ψ) [1 - \exp (- {(\frac{β_{2}}{t})}^{α_{2}})] & (5) \end{array}

\begin{array}{l} h (t) = ψ (\frac{(\frac{α_{1}}{β_{1}}) {(\frac{β_{1}}{t})}^{α_{1} + 1} \exp [- {(\frac{β_{1}}{t})}^{α_{1}}]}{1 - \exp [- {(\frac{β_{1}}{t})}^{α_{1}}]}) \\ + (1 - ψ) (\frac{(\frac{α_{2}}{β_{2}}) {(\frac{β_{2}}{t})}^{α_{2} + 1} \exp [- {(\frac{β_{2}}{t})}^{α_{2}}]}{1 - \exp [- {(\frac{β_{2}}{t})}^{α_{2}}]}) & (6) \end{array}
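As a minimal sketch of the definitions above, the mixture density, CDF, and reliability function can be coded directly from the component Frechet formulas. The function names are illustrative and not taken from the article, and Python is used here only for illustration (the article's own computations were done in R). The parameter values are the ones quoted later in the simulation settings.

```python
import math


def frechet_pdf(t, alpha, beta):
    """Frechet density (alpha/beta) * (beta/t)^(alpha+1) * exp(-(beta/t)^alpha); zero for t <= 0."""
    if t <= 0:
        return 0.0
    ratio = beta / t
    return (alpha / beta) * ratio ** (alpha + 1) * math.exp(-(ratio ** alpha))


def frechet_cdf(t, alpha, beta):
    """Frechet CDF exp(-(beta/t)^alpha); zero for t <= 0."""
    if t <= 0:
        return 0.0
    return math.exp(-((beta / t) ** alpha))


def mfd_pdf(t, psi, a1, b1, a2, b2):
    """Two-component mixture density: psi * f1 + (1 - psi) * f2."""
    return psi * frechet_pdf(t, a1, b1) + (1 - psi) * frechet_pdf(t, a2, b2)


def mfd_cdf(t, psi, a1, b1, a2, b2):
    """Mixture CDF: psi * F1 + (1 - psi) * F2."""
    return psi * frechet_cdf(t, a1, b1) + (1 - psi) * frechet_cdf(t, a2, b2)


def mfd_reliability(t, psi, a1, b1, a2, b2):
    """Reliability R(t) = 1 - F(t), i.e. the weighted sum of component reliabilities."""
    return 1.0 - mfd_cdf(t, psi, a1, b1, a2, b2)


# (psi, alpha1, beta1, alpha2, beta2) as used in the simulation settings.
params = (0.4, 0.6, 1.4, 0.7, 1.0)
print(mfd_pdf(2.0, *params), mfd_cdf(2.0, *params), mfd_reliability(2.0, *params))
```

A quick sanity check on such code is that a central finite difference of `mfd_cdf` should agree with `mfd_pdf` at any positive t.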

The rest of the article is organized as follows. In Section 2, the MLEs of the parameters of MFD are obtained. Bayesian estimators are presented in Section 3, and PCIs are discussed in Section 4. A simulation study is given in Section 5, and BCIs are presented in Section 6. Finally, a real data set is analyzed in Section 7, and conclusions are presented in Section 8.

2 Maximum likelihood estimation

Let t₁, t₂, ..., t_n be a random sample of size n from MFD. The log-likelihood function is

\begin{array}{r} L = \log l = \sum_{i = 1}^{n} \log [ψ α_{1} β_{1}^{α_{1}} t_{i}^{- (α_{1} + 1)} \exp [- {(\frac{β_{1}}{t_{i}})}^{α_{1}}] \\ + (1 - ψ) α_{2} β_{2}^{α_{2}} t_{i}^{- (α_{2} + 1)} \exp [- {(\frac{β_{2}}{t_{i}})}^{α_{2}}]] & (7) \end{array}

Differentiating Equation 7 with respect to ψ, α_j, and β_j, j = 1, 2, and equating the derivatives to zero gives the score equations. Writing f(t_i) = ψ f₁(t_i) + (1 − ψ) f₂(t_i) for the mixture density at t_i, they are

\begin{array}{l} \begin{array}{l} \frac{\partial L}{\partial ψ} = \sum_{i = 1}^{n} \frac{f_{1} (t_{i}) - f_{2} (t_{i})}{f (t_{i})} = 0 \\ \frac{\partial L}{\partial α_{1}} = \sum_{i = 1}^{n} \frac{ψ f_{1} (t_{i})}{f (t_{i})} [\frac{1}{α_{1}} + \log (\frac{β_{1}}{t_{i}}) (1 - {(\frac{β_{1}}{t_{i}})}^{α_{1}})] = 0 \\ \frac{\partial L}{\partial β_{1}} = \sum_{i = 1}^{n} \frac{ψ f_{1} (t_{i})}{f (t_{i})} \frac{α_{1}}{β_{1}} [1 - {(\frac{β_{1}}{t_{i}})}^{α_{1}}] = 0 \\ \frac{\partial L}{\partial α_{2}} = \sum_{i = 1}^{n} \frac{(1 - ψ) f_{2} (t_{i})}{f (t_{i})} [\frac{1}{α_{2}} + \log (\frac{β_{2}}{t_{i}}) (1 - {(\frac{β_{2}}{t_{i}})}^{α_{2}})] = 0 \\ \frac{\partial L}{\partial β_{2}} = \sum_{i = 1}^{n} \frac{(1 - ψ) f_{2} (t_{i})}{f (t_{i})} \frac{α_{2}}{β_{2}} [1 - {(\frac{β_{2}}{t_{i}})}^{α_{2}}] = 0 \end{array} & (8) \end{array}

where f₁(t) and f₂(t) are the PDFs of the FD components. This system of nonlinear equations cannot be solved in closed form. Here, we use the BB package, which is available in the R software, to obtain the MLEs of the model parameters.
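The log-likelihood and the score for ψ can be sketched as follows. This is not the authors' R code and it does not solve the score equations; it only evaluates the log-likelihood and checks the analytic ψ-score against a finite difference. The data values and function names are hypothetical and serve only to exercise the functions.

```python
import math


def frechet_pdf(t, alpha, beta):
    """Frechet density for t > 0."""
    ratio = beta / t
    return (alpha / beta) * ratio ** (alpha + 1) * math.exp(-(ratio ** alpha))


def log_likelihood(data, psi, a1, b1, a2, b2):
    """Mixture log-likelihood for positive observations."""
    return sum(
        math.log(psi * frechet_pdf(t, a1, b1) + (1 - psi) * frechet_pdf(t, a2, b2))
        for t in data
    )


def score_psi(data, psi, a1, b1, a2, b2):
    """Analytic derivative of the log-likelihood with respect to psi."""
    total = 0.0
    for t in data:
        f1 = frechet_pdf(t, a1, b1)
        f2 = frechet_pdf(t, a2, b2)
        total += (f1 - f2) / (psi * f1 + (1 - psi) * f2)
    return total


# Hypothetical positive observations, not data from the article.
data = [0.8, 1.2, 2.5, 0.6, 3.1, 1.9]
theta = (0.5, 0.6, 1.4, 0.7, 1.0)  # (psi, alpha1, beta1, alpha2, beta2)

# Central finite difference in psi as an independent check of the score.
h = 1e-6
numeric = (
    log_likelihood(data, theta[0] + h, *theta[1:])
    - log_likelihood(data, theta[0] - h, *theta[1:])
) / (2 * h)
print(score_psi(data, *theta), numeric)
```

The same finite-difference check can be applied to the other four score components before passing them to a nonlinear solver.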

3 Bayesian estimation

For Bayesian estimation, we need prior distributions for ψ, α₁, α₂, β₁, and β₂. We assume that ψ has a Beta(s₁, s₂) prior, whereas α₁, α₂, β₁, and β₂ have independent gamma priors, with PDFs, respectively, given by

\begin{array}{l} π (ψ) = \frac{1}{B (s_{1}, s_{2})} ψ^{s_{1} - 1} {(1 - ψ)}^{s_{2} - 1}, 0 \leq ψ \leq 1, s_{1}, s_{2} > 0 \\ π (α_{j}) = \frac{d_{j}^{c_{j}}}{Γ (c_{j})} α_{j}^{c_{j} - 1} \exp (- d_{j} α_{j}), α_{j} > 0, c_{j}, d_{j} > 0 \\ π (β_{j}) = \frac{m_{j}^{k_{j}}}{Γ (k_{j})} β_{j}^{k_{j} - 1} \exp (- m_{j} β_{j}), β_{j} > 0, k_{j}, m_{j} > 0 \end{array}

where (s₁, s₂, c_j, d_j, k_j, m_j) are the hyperparameters. The joint prior density of the random vector Θ = {ψ, α_j, β_j}, j = 1, 2 is

\begin{array}{l} π (\underline{θ}) = π (ψ) \prod_{j = 1}^{2} π (α_{j}) π (β_{j}) & (9) \end{array}

The joint posterior density function for MFD is given by

\begin{array}{l} π^{'} (\underline{θ} ∣ \underline{t}) = \frac{L (\underline{θ} ∣ t) π (ψ) π_{1} (α_{1}) π_{2} (β_{1}) π_{3} (α_{2}) π_{4} (β_{2})}{\int L (ψ, α_{1}, β_{1}, α_{2}, β_{2} ∣ t) π (ψ) π_{1} (α_{1}) π_{2} (β_{1}) π_{3} (α_{2}) π_{4} (β_{2}) d ψ d α_{1} d β_{1} d α_{2} d β_{2}} & (10) \end{array}

The posterior distribution (Equation 10) is a ratio of integrals that has no closed form. Therefore, we use the Lindley [27] approximation to obtain the BE, which can be expressed as

\begin{array}{l} I (t) ≊ U (\hat{ψ}, \hat{α_{1}}, \hat{β_{1}}, \hat{α_{2}}, \hat{β_{2}}) \\ + [u_{1} a_{1} + u_{2} a_{2} + u_{3} a_{3} + u_{4} a_{4} + u_{5} a_{5} + a_{6} + a_{7}] \\ + \frac{1}{2} [A (u_{1} σ_{11} + u_{2} σ_{12} + u_{3} σ_{13} + u_{4} σ_{14} + u_{5} σ_{15}) \\ + B (u_{1} σ_{21} + u_{2} σ_{22} + u_{3} σ_{23} + u_{4} σ_{24} + u_{5} σ_{25}) \\ + C (u_{1} σ_{31} + u_{2} σ_{32} + u_{3} σ_{33} + u_{4} σ_{34} + u_{5} σ_{35}) \\ + D (u_{1} σ_{41} + u_{2} σ_{42} + u_{3} σ_{43} + u_{4} σ_{44} + u_{5} σ_{45}) \\ + E (u_{1} σ_{51} + u_{2} σ_{52} + u_{3} σ_{53} + u_{4} σ_{54} + u_{5} σ_{55})] & (11) \end{array}

where a_i = ρ₁σ_i1 + ρ₂σ_i2 + ρ₃σ_i3 + ρ₄σ_i4 + ρ₅σ_i5, i = 1, 2, 3, 4, 5

\begin{array}{l} ρ_{1} = \frac{s_{1} - 1}{\hat{ψ}} - \frac{s_{2} - 1}{1 - \hat{ψ}}, ρ_{2} = \frac{c_{1} - 1}{\hat{α_{1}}} - d_{1}, ρ_{3} = \frac{k_{1} - 1}{{\hat{β}}_{1}} - m_{1}, \\ ρ_{4} = \frac{c_{2} - 1}{\hat{α_{2}}} - d_{2}, ρ_{5} = \frac{k_{2} - 1}{{\hat{β}}_{2}} - m_{2} \end{array}

\begin{array}{l} A = σ_{11} L_{111} + 2 σ_{12} L_{121} + 2 σ_{13} L_{131} + 2 σ_{14} L_{141} + 2 σ_{15} L_{151} \\ + 2 σ_{23} L_{231} + 2 σ_{24} L_{241} + 2 σ_{25} L_{251} + 2 σ_{34} L_{341} \\ + 2 σ_{35} L_{351} + 2 σ_{45} L_{451} + σ_{22} L_{221} + σ_{33} L_{331} + σ_{44} L_{441} + σ_{55} L_{551} \\ B = σ_{11} L_{112} + 2 σ_{12} L_{122} + 2 σ_{13} L_{132} + 2 σ_{14} L_{142} + 2 σ_{15} L_{152} \\ + 2 σ_{23} L_{232} + 2 σ_{24} L_{242} + 2 σ_{25} L_{252} + 2 σ_{34} L_{342} \\ + 2 σ_{35} L_{352} + 2 σ_{45} L_{452} + σ_{22} L_{222} + σ_{33} L_{332} + σ_{44} L_{442} + σ_{55} L_{552} \\ C = σ_{11} L_{113} + 2 σ_{12} L_{123} + 2 σ_{13} L_{133} + 2 σ_{14} L_{143} + 2 σ_{15} L_{153} \\ + 2 σ_{23} L_{233} + 2 σ_{24} L_{243} + 2 σ_{25} L_{253} + 2 σ_{34} L_{343} \\ + 2 σ_{35} L_{353} + 2 σ_{45} L_{453} + σ_{22} L_{223} + σ_{33} L_{333} + σ_{44} L_{443} + σ_{55} L_{553} \\ D = σ_{11} L_{114} + 2 σ_{12} L_{124} + 2 σ_{13} L_{134} + 2 σ_{14} L_{144} + 2 σ_{15} L_{154} \\ + 2 σ_{23} L_{234} + 2 σ_{24} L_{244} + 2 σ_{25} L_{254} + 2 σ_{34} L_{344} \\ + 2 σ_{35} L_{354} + 2 σ_{45} L_{454} + σ_{22} L_{224} + σ_{33} L_{334} + σ_{44} L_{444} + σ_{55} L_{554} \\ E = σ_{11} L_{115} + 2 σ_{12} L_{125} + 2 σ_{13} L_{135} + 2 σ_{14} L_{145} + 2 σ_{15} L_{155} \\ + 2 σ_{23} L_{235} + 2 σ_{24} L_{245} + 2 σ_{25} L_{255} + 2 σ_{34} L_{345} \\ + 2 σ_{35} L_{355} + 2 σ_{45} L_{455} + σ_{22} L_{225} + σ_{33} L_{335} + σ_{44} L_{445} + σ_{55} L_{555} \end{array}

θ_{1} = ψ, θ_{2} = α_{1}, θ_{3} = β_{1}, θ_{4} = α_{2}, θ_{5} = β_{2}

L_{i j} = \frac{\partial^{2} L (\underline{θ})}{\partial θ_{i} \partial θ_{j}}, i, j = 1, 2, \dots, 5

L_{i j k} = \frac{\partial^{3} L (\underline{θ})}{\partial θ_{i} \partial θ_{j} \partial θ_{k}}, i, j, k = 1, 2, \dots, 5

Here σ_{ij} denotes the (i, j)th element of the inverse of the matrix [−L_{ij}], and all quantities are evaluated at the MLEs.

The Bayesian estimators for ψ, α₁, α₂, β₁, and β₂ under squared error loss function are

\begin{array}{l} {\hat{ψ}}_{B E} = \hat{ψ} + (\frac{s_{1} - 1}{\hat{ψ}} - \frac{s_{2} - 1}{1 - \hat{ψ}}) σ_{11} + (\frac{c_{1} - 1}{{\hat{α}}_{1}} - d_{1}) σ_{12} \\ + (\frac{k_{1} - 1}{{\hat{β}}_{1}} - m_{1}) σ_{13} + (\frac{c_{2} - 1}{{\hat{α}}_{2}} - d_{2}) σ_{14} + (\frac{k_{2} - 1}{{\hat{β}}_{2}} - m_{2}) σ_{15} \\ + 0.5 [A σ_{11} + B σ_{21} + C σ_{31} + D σ_{41} + E σ_{51}] & (12) \end{array}

\begin{array}{l} {\hat{α_{1}}}_{B E} = \hat{α_{1}} + (\frac{s_{1} - 1}{\hat{ψ}} - \frac{s_{2} - 1}{1 - \hat{ψ}}) σ_{21} + (\frac{c_{1} - 1}{{\hat{α}}_{1}} - d_{1}) σ_{22} \\ + (\frac{k_{1} - 1}{{\hat{β}}_{1}} - m_{1}) σ_{23} + (\frac{c_{2} - 1}{{\hat{α}}_{2}} - d_{2}) σ_{24} + (\frac{k_{2} - 1}{{\hat{β}}_{2}} - m_{2}) σ_{25} \\ + 0.5 [A σ_{12} + B σ_{22} + C σ_{32} + D σ_{42} + E σ_{52}] & (13) \end{array}

\begin{array}{l} {\hat{β_{1}}}_{B E} = \hat{β_{1}} + (\frac{s_{1} - 1}{\hat{ψ}} - \frac{s_{2} - 1}{1 - \hat{ψ}}) σ_{31} + (\frac{c_{1} - 1}{{\hat{α}}_{1}} - d_{1}) σ_{32} \\ + (\frac{k_{1} - 1}{{\hat{β}}_{1}} - m_{1}) σ_{33} + (\frac{c_{2} - 1}{{\hat{α}}_{2}} - d_{2}) σ_{34} + (\frac{k_{2} - 1}{{\hat{β}}_{2}} - m_{2}) σ_{35} \\ + 0.5 [A σ_{13} + B σ_{23} + C σ_{33} + D σ_{43} + E σ_{53}] & (14) \end{array}

\begin{array}{l} {\hat{α_{2}}}_{B E} = \hat{α_{2}} + (\frac{s_{1} - 1}{\hat{ψ}} - \frac{s_{2} - 1}{1 - \hat{ψ}}) σ_{41} + (\frac{c_{1} - 1}{{\hat{α}}_{1}} - d_{1}) σ_{42} \\ + (\frac{k_{1} - 1}{{\hat{β}}_{1}} - m_{1}) σ_{43} + (\frac{c_{2} - 1}{{\hat{α}}_{2}} - d_{2}) σ_{44} + (\frac{k_{2} - 1}{{\hat{β}}_{2}} - m_{2}) σ_{45} \\ + 0.5 [A σ_{14} + B σ_{24} + C σ_{34} + D σ_{44} + E σ_{54}] & (15) \end{array}

\begin{array}{l} {\hat{β_{2}}}_{B E} = \hat{β_{2}} + (\frac{s_{1} - 1}{\hat{ψ}} - \frac{s_{2} - 1}{1 - \hat{ψ}}) σ_{51} + (\frac{c_{1} - 1}{{\hat{α}}_{1}} - d_{1}) σ_{52} \\ + (\frac{k_{1} - 1}{{\hat{β}}_{1}} - m_{1}) σ_{53} + (\frac{c_{2} - 1}{{\hat{α}}_{2}} - d_{2}) σ_{54} + (\frac{k_{2} - 1}{{\hat{β}}_{2}} - m_{2}) σ_{55} \\ + 0.5 [A σ_{15} + B σ_{25} + C σ_{35} + D σ_{45} + E σ_{55}] & (16) \end{array}

where $\hat{ψ}, \hat{α_{1}}, \hat{α_{2}}, \hat{β_{1}}$ and $\hat{β_{2}}$ are the maximum likelihood estimates of ψ, α₁, α₂, β₁ and β₂, respectively. The remaining derivatives are available on request.

4 Process capability indices

The most frequently used PCIs C_p, C_pk, C_pmk, and C_pm rely strictly on the normality assumption for a given process with mean and process dispersion. However, many production and service processes violate the normality assumption.

4.1 The Index S_pmk for MFD

The PCI S_pmk was introduced by Chen and Ding [2] for any underlying distribution. It accounts for the process variability, the departure of the process mean (μ) from the target value (T₁), and the fraction of non-conformities:

\begin{array}{l} S_{p m k} = \frac{ϕ^{- 1} (\frac{1 + F (U S L) - F (L S L)}{2})}{3 \sqrt{1 + {(\frac{μ - T_{1}}{σ})}^{2}}} & (17) \end{array}

or, equivalently,

\begin{array}{l} S_{p m k} = \frac{ϕ^{- 1} (1 - p / 2)}{3 \sqrt{1 + {(\frac{μ - T_{1}}{σ})}^{2}}} & (18) \end{array}

Here, ϕ(·) is the CDF of the standard normal distribution and ϕ⁻¹(·) is its inverse, p = 1 − F(USL) + F(LSL) denotes the proportion of non-conformities, USL and LSL are the upper and lower specification limits, and F(·) is the CDF of the process distribution. The PCI S_pmk for an MFD quality characteristic can be written as

\begin{array}{l} S_{p m k} = \frac{ϕ^{- 1} (\frac{1 + ψ (\exp [- {(\frac{β_{1}}{U S L})}^{α_{1}}] - \exp [- {(\frac{β_{1}}{L S L})}^{α_{1}}]) + (1 - ψ) (\exp [- {(\frac{β_{2}}{U S L})}^{α_{2}}] - \exp [- {(\frac{β_{2}}{L S L})}^{α_{2}}])}{2})}{3 \sqrt{1 + {(\frac{μ - T_{1}}{σ})}^{2}}} & (19) \end{array}

The true values of θ = (ψ, α₁, β₁, α₂, β₂) are unknown, and here we use the Bayesian and maximum likelihood estimators to get their estimates. The unbiased estimators of μ and σ² are $\bar{T}$ and S², respectively, where

\bar{T} = \frac{\sum_{i = 1}^{n} T_{i}}{n}, S^{2} = \frac{\sum_{i = 1}^{n} {(T_{i} - \bar{T})}^{2}}{n - 1}
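A minimal plug-in sketch of S_pmk for the MFD is given below, assuming the parameter estimates, the sample mean, and the sample standard deviation are already available. The function names and the values of `mu` and `sigma` are hypothetical placeholders; the specification limits and parameters are the ones quoted in the simulation settings. The standard normal quantile comes from Python's `statistics.NormalDist`.

```python
import math
from statistics import NormalDist


def mfd_cdf(t, psi, a1, b1, a2, b2):
    """CDF of the two-component Frechet mixture; zero for t <= 0."""
    if t <= 0:
        return 0.0
    return psi * math.exp(-((b1 / t) ** a1)) + (1 - psi) * math.exp(-((b2 / t) ** a2))


def s_pmk(lsl, usl, target, mu, sigma, theta):
    """Plug-in S_pmk: normal quantile of (1 + yield)/2 over 3*sqrt(1 + ((mu - T)/sigma)^2)."""
    process_yield = mfd_cdf(usl, *theta) - mfd_cdf(lsl, *theta)
    numerator = NormalDist().inv_cdf((1.0 + process_yield) / 2.0)
    return numerator / (3.0 * math.sqrt(1.0 + ((mu - target) / sigma) ** 2))


theta = (0.4, 0.6, 1.4, 0.7, 1.0)  # (psi, alpha1, beta1, alpha2, beta2)
# mu and sigma would be the sample mean and standard deviation;
# the numbers below are placeholders, not results from the article.
print(s_pmk(lsl=0.0, usl=15.0, target=7.5, mu=3.0, sigma=4.0, theta=theta))
```

When the mean sits exactly on target, the denominator reduces to 3, so any off-target value of `mu` can only lower the index.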

4.2 The Index C_PY for MFD

Maiti et al. [28] proposed a generalized PCI, C_PY. It covers normal and non-normal, continuous as well as discrete random variables, and it is directly or indirectly related to the majority of PCIs. It is defined as follows:

\begin{array}{l} C_{p y} = \frac{F (U S L) - F (L S L)}{F (U D L) - F (L D L)} = \frac{p}{p_{0}} & (20) \end{array}

where LDL and UDL are the lower and upper desirable limits, respectively, p is the process yield, and p₀ is the desirable yield. If the process distribution is normal with LDL = μ−3σ and UDL = μ+3σ, then the generalized PCI C_py can be written as (p/0.9973). The index C_py for MFD can be expressed as

\begin{array}{l} C_{p y} = \frac{1}{p_{0}} [ψ (\exp [- {(\frac{β_{1}}{U S L})}^{α_{1}}] - \exp [- {(\frac{β_{1}}{L S L})}^{α_{1}}]) \\ + (1 - ψ) (\exp [- {(\frac{β_{2}}{U S L})}^{α_{2}}] - \exp [- {(\frac{β_{2}}{L S L})}^{α_{2}}])] & (21) \end{array}

Here, we use the Bayesian and maximum likelihood estimators to get the estimates of θ = (ψ, α₁, β₁, α₂, β₂).
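The index C_py for the MFD is a ratio of the process yield to the desirable yield, so a plug-in version is short. In the sketch below the function names are illustrative, and the choice p₀ = 0.9973 is an assumption borrowed from the normal-theory case mentioned above; the article does not state the p₀ used for the MFD.

```python
import math


def mfd_cdf(t, psi, a1, b1, a2, b2):
    """CDF of the two-component Frechet mixture; zero for t <= 0."""
    if t <= 0:
        return 0.0
    return psi * math.exp(-((b1 / t) ** a1)) + (1 - psi) * math.exp(-((b2 / t) ** a2))


def c_py(lsl, usl, p0, theta):
    """Plug-in C_py = [F(USL) - F(LSL)] / p0 for the Frechet mixture."""
    if not 0.0 < p0 <= 1.0:
        raise ValueError("p0 must lie in (0, 1]")
    return (mfd_cdf(usl, *theta) - mfd_cdf(lsl, *theta)) / p0


theta = (0.4, 0.6, 1.4, 0.7, 1.0)  # (psi, alpha1, beta1, alpha2, beta2)
# p0 = 0.9973 is an assumed desirable yield, not a value from the article.
print(c_py(lsl=0.0, usl=15.0, p0=0.9973, theta=theta))
```

Replacing `theta` by the MLEs or the BE gives the corresponding estimate of the index.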

5 Simulation study

In this section, Monte Carlo simulation is conducted to examine the behavior of PCIs for both MLEs and BE in terms of BCIs, and mean squared errors (MSEs) against different sample sizes. The Monte Carlo simulation is performed as follows:

1. Take the initial values of the vectors of parameters Θ = {ψ, α_j, β_j}.

2. Generate random samples of sizes n = 10, 20, 30, 50, 70, 80, and 100, for each vector of the parameters from MFD. Random samples of the mixture model are generated as:
i. Generate U₁ and U₂, two independent uniform variates.
ii. If U₁ < ψ, use U₁ to generate a random variate T from the MFD using $t = F^{- 1} (U_{1})$.
iii. If U₁ ≥ ψ, use U₂ to generate a random variate T from the MFD using $t = F^{- 1} (U_{2})$.

3. The process has been replicated 10,000 times for each sample size, and average estimates were determined for each approach using the R software (version: i386 4.3.1), and the results are presented in Tables 1–3.

4. The simulation evaluates the performance of the two PCIs, S_pmk and C_PY, for MFD based on MLEs and BE. The sample sizes are n = 10, 20, 30, 50, 70, 80, and 100. LSL and USL are set at 0 and 15, respectively. The midpoint of the two specification limits is used as the target value, and the parameter vector is α₁ = 0.6, β₁ = 1.4, α₂ = 0.7, and β₂ = 1, with ψ = 0.4, 0.5, and 0.6.

5. Using bootstrapping, the estimates of the PCIs, the coefficient of skewness $(\sqrt{Ω})$, the coefficient of kurtosis (Λ), MSEs, and BCIs are calculated from the MLEs and BE using R software with 50,000 replications. The criteria used for performance evaluation are the MSEs and the width of the BCIs. The results for S_pmk are listed in Tables 4–6 and those for C_PY in Tables 7–9.
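The sample-generation step can be sketched with the standard composition method: one uniform variate selects the component, and an independent second uniform is passed through that component's Frechet quantile function, t = β(−ln u)^(−1/α). This is one reading of step 2 and is not necessarily the authors' exact R code; the function names and the seed are illustrative.

```python
import math
import random


def frechet_quantile(u, alpha, beta):
    """Inverse of F(t) = exp(-(beta/t)^alpha) for 0 < u < 1."""
    return beta * (-math.log(u)) ** (-1.0 / alpha)


def sample_mfd(n, psi, a1, b1, a2, b2, rng):
    """Draw n mixture variates: U1 picks the component, U2 is inverted through it."""
    sample = []
    while len(sample) < n:
        u1 = rng.random()
        u2 = rng.random()
        if u2 == 0.0:  # random() lies in [0, 1); skip 0 to avoid log(0)
            continue
        if u1 < psi:
            sample.append(frechet_quantile(u2, a1, b1))
        else:
            sample.append(frechet_quantile(u2, a2, b2))
    return sample


rng = random.Random(2025)  # fixed seed so the sketch is reproducible
draws = sample_mfd(20, 0.4, 0.6, 1.4, 0.7, 1.0, rng)
print(min(draws), max(draws))
```

Because both shape parameters in the simulation settings are below 1, occasional very large draws are expected from such a sampler.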

Table 1. Point estimates for MFD when α₁ = 0.6, β₁ = 1.4, α₂ = 0.7, and β₂ = 1.

Table 2. Point estimates for MFD when α₁ = 0.6, β₁ = 1.4, α₂ = 0.7, and β₂ = 1.

Table 3. Point estimates for MFD when α₁ = 0.6, β₁ = 1.4, α₂ = 0.7, and β₂ = 1.

Table 4. The true values of S_pmk along with average estimates and respective MSEs for ψ = 0.4, α₁ = 0.6, β₁ = 1.4, α₂ = 0.7, and β₂ = 1 for MFD.

Table 5. The true values of S_pmk along with average estimates and respective MSEs for ψ = 0.5, α₁ = 0.6, β₁ = 1.4, α₂ = 0.7, and β₂ = 1 for MFD.

Table 6. The true values of S_pmk along with average estimates and respective MSEs for ψ = 0.6, α₁ = 0.6, β₁ = 1.4, α₂ = 0.7, and β₂ = 1 for MFD.

Table 7. The true values of C_PY along with average estimates and respective MSEs for ψ = 0.4, α₁ = 0.6, β₁ = 1.4, α₂ = 0.7, and β₂ = 1 for MFD.

Table 8. The true values of C_PY along with average estimates and respective MSEs for ψ = 0.5, α₁ = 0.6, β₁ = 1.4, α₂ = 0.7, and β₂ = 1 for MFD.

Table 9. The true values of C_PY along with average estimates and respective MSEs for ψ = 0.6, α₁ = 0.6, β₁ = 1.4, α₂ = 0.7, and β₂ = 1 for MFD.

6 BCIs for S_pmk and C_PY

Franklin and Gray [29] developed the bootstrap method, and BCIs are typically employed to obtain the empirical distribution of different PCIs. The following steps are followed for bootstrapping, and the standard BCIs are considered for S_pmk and C_PY:

1. A random sample (t₁, t₂, …, t_n) of size n is drawn from MFD(ψ, α_j, β_j), where j = 1, 2. A bootstrap sample of size n is then drawn with replacement from this initial sample, with a probability mass of 1/n on each observation. The MLEs $(\hat{ψ}, \hat{α}, \hat{β})$ and the BE $(\hat{ψ}, \hat{α}, \hat{β})$ are evaluated, and the R-th bootstrap estimators of S_pmk and C_PY are then calculated, i.e., $Ŝ_{p m k}^{(R)} = Ŝ_{p m k} (t_{1}, t_{2}, t_{3}, \dots, t_{n})$ and ${\hat{C_{P Y}}}^{(R)} = \hat{C_{P Y}} (t_{1}, t_{2}, t_{3}, \dots, t_{n})$.

2. There are nⁿ possible re-samples in total. For each re-sample, $Ŝ_{p m k}^{(R)}$ is estimated, and together these estimates form the bootstrap distribution $\{Ŝ_{p m k}^{* (J)}; J = 1, 2, \dots, B\}$, whose ordered values are denoted by $Ŝ_{p m k}^{* (1)} \leq Ŝ_{p m k}^{* (2)} \leq \dots \leq Ŝ_{p m k}^{* (B)}$. The same process is applied to ${\hat{C_{P Y}}}^{(R)}$.
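The resampling steps above can be sketched as follows. Two assumptions are made that the text does not spell out: B random re-samples are used instead of all nⁿ, and the "standard" interval is taken to be the mean of the bootstrap estimates plus or minus a normal quantile times their standard deviation. The estimator `empirical_yield` is a simple nonparametric stand-in for the MLE or BE plug-in estimators of the PCIs, and the data are hypothetical.

```python
import random
import statistics
from statistics import NormalDist


def standard_bootstrap_ci(data, estimator, n_boot=1000, level=0.95, rng=None):
    """Return (lower, upper, ordered bootstrap estimates) for a generic estimator."""
    rng = rng or random.Random()
    n = len(data)
    # Each re-sample draws n observations with replacement (mass 1/n on each).
    boot = sorted(
        estimator([rng.choice(data) for _ in range(n)]) for _ in range(n_boot)
    )
    center = statistics.fmean(boot)
    spread = statistics.stdev(boot)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return center - z * spread, center + z * spread, boot


def empirical_yield(sample, lsl=0.580, usl=3.165):
    """Fraction of observations inside [lsl, usl]; a stand-in estimator only."""
    return sum(lsl <= t <= usl for t in sample) / len(sample)


# Hypothetical observations, not the data set analyzed in the article.
data = [0.6, 0.9, 1.1, 1.3, 1.4, 1.6, 1.8, 2.0, 2.3, 2.7, 3.0, 3.4]
lower, upper, boot = standard_bootstrap_ci(
    data, empirical_yield, n_boot=500, rng=random.Random(1)
)
print(lower, upper)
```

The width `upper - lower` is the quantity that would be compared between estimation methods.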

From the results of simulation study, conclusions are drawn regarding the behavior of the estimators and PCIs, which are listed below.

• Tables 1–3 present a comparative analysis of the MLEs and the BE obtained using Lindley's approximation for sample sizes n = 10, 20, 30, 50, 70, 80, and 100.

• The PCIs S_pmk and C_PY are estimated using the MLE and Bayesian methods of estimation, keeping the values of the hyperparameters constant at zero. Furthermore, to assess the performance of these methods, bootstrapping is used to compute BCIs for each estimate along with $\sqrt{Ω}$, Λ, and MSEs. The results are shown in Tables 4–9.

• The results in Tables 4–9 and Figures 1–6 clearly show that the MSEs decrease with increasing sample size and are consistently smaller for BE, for both PCIs, i.e., S_pmk and C_PY, and for the different vectors of parameters.

• The BCIs for both S_pmk and C_PY are consistently narrower using Bayesian estimation, reinforcing the efficiency of this method in estimating above-mentioned PCIs with less uncertainty compared to those obtained via MLE. So, BE offers superior precision in calculating S_pmk and C_PY.

Figure 1. MSEs for S_pmk with ψ = 0.4, α₁ = 0.6, β₁ = 1.4, α₂ = 0.7, and β₂ = 1 for MFD.

Figure 2. MSEs for C_py with ψ = 0.4, α₁ = 0.6, β₁ = 1.4, α₂ = 0.7, and β₂ = 1 for MFD.

Figure 3. MSEs for S_pmk with ψ = 0.5, α₁ = 0.6, β₁ = 1.4, α₂ = 0.7, and β₂ = 1 for MFD.

Figure 4. MSEs for C_py with ψ = 0.5, α₁ = 0.6, β₁ = 1.4, α₂ = 0.7, and β₂ = 1 for MFD.

Figure 5. MSEs for S_pmk with ψ = 0.6, α₁ = 0.6, β₁ = 1.4, α₂ = 0.7, and β₂ = 1 for MFD.

Figure 6. MSEs for C_py with ψ = 0.6, α₁ = 0.6, β₁ = 1.4, α₂ = 0.7, and β₂ = 1 for MFD.

7 Real-data illustration

In this section, we consider a real-data illustration based on the time between failures of a repairable item: a random sample of 30 components of a certain type, presented in Table 10 and documented by Murthy et al. [30]. We use these data to evaluate the estimates obtained with the different estimation methods and the related PCIs. First, we check whether the data set follows the FD using the Kolmogorov–Smirnov (KS) test. The KS statistic is D = 0.1336 with p-value = 0.6106. Since the p-value exceeds the 5 percent level of significance, the hypothesis that the data follow the FD is not rejected. Moreover, the data set is randomly divided into two groups, presented in Table 11, for MFD assuming ψ = 0.5. The point estimates of the parameters of MFD are presented in Table 12 along with AIC values. Moreover, Figure 7 shows the comparative results for both estimation methods.
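For readers who want to reproduce the goodness-of-fit step, the KS distance D between the empirical CDF and a fitted Frechet CDF can be computed as in the sketch below. The data and the parameter values are hypothetical (Table 10 is not reproduced here), the function names are illustrative, and the p-value calculation is omitted.

```python
import math


def frechet_cdf(t, alpha, beta):
    """Frechet CDF exp(-(beta/t)^alpha); zero for t <= 0."""
    return math.exp(-((beta / t) ** alpha)) if t > 0 else 0.0


def ks_statistic(data, cdf):
    """One-sample KS distance: largest gap between the empirical CDF and cdf."""
    ordered = sorted(data)
    n = len(ordered)
    d = 0.0
    for i, t in enumerate(ordered, start=1):
        f = cdf(t)
        # Compare the fitted CDF with the empirical CDF just after and just before t.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


# Hypothetical failure times and parameter values, for illustration only.
data = [0.7, 0.9, 1.0, 1.2, 1.3, 1.5, 1.8, 2.1, 2.6, 3.2]
alpha_hat, beta_hat = 2.0, 1.2
print(ks_statistic(data, lambda t: frechet_cdf(t, alpha_hat, beta_hat)))
```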

Table 10. Time between failures for repairable items.

Table 11. Mixture real-life data regarding time between failures for repairable items.

Table 12. Point estimates of MFD for time between failures for repairable item.

Figure 7. Comparison of estimation methods.

To estimate the PCIs, LSL and USL were placed at 0.580 and 3.165, respectively, and the target value is their midpoint, 1.8725. Table 13 displays the BCIs of the PCIs S_pmk and C_PY using the two aforementioned estimation approaches for MFD. From Table 13, it is evident that the BE for both PCIs has a lower MSE and a shorter BCI than its counterpart. Based on these findings, BE offers the more effective approach for estimating the observed PCIs for MFD.

Table 13. BCIs for PCIs of MFD.

8 Conclusion

In this study, statistical inference for the unknown parameters of the MFD under the MLE and Bayesian frameworks has been considered. Lindley's approximation is employed for BE, and the simulations show that this approach is more effective than the maximum likelihood method. Further, the comparative analysis of the classical and Bayesian methods for estimating the PCIs S_pmk and C_PY demonstrates that Bayesian estimation consistently produces shorter BCIs for both indices across all sample sizes, indicating higher precision and reduced uncertainty compared to the MLE approach. Similarly, the MSEs for both PCIs decrease with increasing sample size, and they decrease more rapidly for the PCIs evaluated under Bayesian estimation. Results from the simulation studies and the real-data analysis exhibit a high degree of similarity. This consistency reinforces the reliability and validity of the findings across different analytical approaches. Considering the findings of this study, we suggest that the BE for MFD can be used to estimate PCIs when dealing with heterogeneous data sets. The work can be extended further by considering other flexible extreme value distributions or censoring schemes.

Data availability statement

The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author.

Author contributions

TK: Conceptualization, Formal analysis, Methodology, Software, Writing – original draft. KA: Formal analysis, Project administration, Software, Supervision, Writing – review & editing. IS: Formal analysis, Software, Writing – review & editing. MZ: Resources, Writing – review & editing, Methodology. ZH: Methodology, Project administration, Writing – review & editing.

Funding

The author(s) declared that financial support was not received for this work and/or its publication.

Conflict of interest

The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declared that generative AI was not used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher's note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

1. Spiring FA. Process capability: a total quality management tool. Total Qual Managem. (1995) 6:21–34. doi: 10.1080/09544129550035558

2. Chen JP, Ding CG. A new process capability index for non-normal distributions. Int J Qual Reliab Managem. (2001) 18:762–70. doi: 10.1108/02656710110396076

3. Anis MZ. Basic process capability indices: an expository review. Int Stat Rev. (2008) 76:347–67. doi: 10.1111/j.1751-5823.2008.00060.x

4. Juran JM, Gryna FM, Bingham RS. Quality Control Handbook. New York: McGraw-Hill. (1979).

5. Kane VE. Process capability indices. J Qual Technol. (1986) 18:41–52. doi: 10.1080/00224065.1986.11978984

6. Chan LK, Cheng SW, Spiring FA. A new measure of process capability: CPM. J Qual Technol. (1988) 20:162–75. doi: 10.1080/00224065.1988.11979102

7. Greenwich M, Jahr-Schaffrath BL. A process incapability index. Int J Qual Reliab Managem. (1995) 12:58–71. doi: 10.1108/02656719510087328

8. Pearn WL, Chen KS. Capability indices for non-normal distributions with an application in electrolytic capacitor manufacturing. Microelectron Reliab. (1997) 37:1853–8. doi: 10.1016/S0026-2714(97)00023-1

9. Schneider H, Pruett J, Lagrange C. Uses of process capability indices in the supplier certification process. Qual Eng. (1995) 8:225–35. doi: 10.1080/08982119508904621

10. Tong LI, Chen JP. Lower confidence limits of process capability indices for non-normal process distributions. Int J Qual Reliab Managem. (1998) 15:907–19. doi: 10.1108/02656719810199006

11. Everitt BS. An introduction to finite mixture distributions. Stat Methods Med Res. (1996) 5:107–27. doi: 10.1177/096228029600500202

12. Everitt B. Finite Mixture Distributions. Cham: Springer Science & Business Media (2013).

13. Jiang R, Murthy DN. Modeling failure-data by mixture of 2 Weibull distributions: a graphical approach. IEEE Trans Reliab. (1995) 44:477–88. doi: 10.1109/24.406588

14. Jiang R, Murthy DN. Two sectional models involving three Weibull distributions. Qual Reliab Eng Int. (1997) 13:83–96.

15. Marin J, Rodriguez-Bernal M, Wiper M. Using Weibull mixture distributions to model heterogeneous survival data. Commun Statist-Simulat Comp. (2005) 34:673–84. doi: 10.1081/SAC-200068372

16. McLachlan G, Peel D. Finite Mixture Models. New York: John Wiley & Sons (2004).

17. Titterington DM, Smith AF, Makov UE. Statistical Analysis of Finite Mixture Distributions. Chichester: Wiley & Sons. (1985).

18. McLachlan GJ, Basford KE. Mixture Models: Inference and Applications to Clustering. New York, NY: M. Dekker. (1988).

19. Lindsay BG. Mixture Models: Theory, Geometry, and Applications. Hayward, California: The Institute of Mathematical Statistics (1995).

20. Ali S, Riaz M. On the generalized process capability under simple and mixture models. J Appl Stat. (2014) 41:832–52. doi: 10.1080/02664763.2013.856386

21. Ouyang LY, Wu CC, Kuo HL. Bayesian assessment for some process capability indices. Int J Inform Managem Sci. (2002) 13:1–8.

22. Lin TY, Wu CW, Chen JC, Chiou YH. Applying Bayesian approach to assess process capability for asymmetric tolerances based on CPMK index. Appl Math Model. (2011) 35:4473–89. doi: 10.1016/j.apm.2011.03.011

23. Kargar M, Mashinchi M, Parchami A. A Bayesian approach to capability testing based on Cpk with multiple samples. Qual Reliab Eng Int. (2014) 30:615–21. doi: 10.1002/qre.1512

24. Saxena S, Singh HP. A Bayesian estimator of process capability index. J Statis Managem Syst. (2006) 9:269–83. doi: 10.1080/09720510.2006.10701206

25. Kanwal T, Abbas K. Bootstrap confidence intervals of process capability indices Spmk, Spmkc and Cs for Frechet distribution. Qual Reliab Eng Int. (2023) 39:2244–57. doi: 10.1002/qre.3333

26. Kanwal T, Abbas K, Zaman M, Hussain Z. Bootstrap confidence intervals of process capability indices Cpy and CNpmk using different methods of estimation for Frechet distribution. Front Appl Mathem Statist. (2025) 11:1668809. doi: 10.3389/fams.2025.1668809

27. Lindley DV. Approximate Bayesian methods. Trabajos de Estadística y de Investigación Operativa. (1980) 31:223–45. doi: 10.1007/BF02888353

28. Maiti SS, Saha M, Nanda AK. On generalizing process capability indices. Qual Technol Quant Managem. (2010) 7:279–300. doi: 10.1080/16843703.2010.11673233

29. Franklin LA, Wasserman GS. Bootstrap confidence interval estimates of Cpk: an introduction. Commun Stat-Simul Comp. (1991) 20:231–42. doi: 10.1080/03610919108812950

30. Murthy DP, Xie M, Jiang R. Weibull Models. New York: John Wiley & Sons (2004).

31. Turkan AH, Caliis N. Comparison of two-component mixture distribution models for heterogeneous survival datasets: a review study. Istatistik J Turk Stat Assoc. (2014) 7:33–42.

32. Wu CW. Assessing process capability based on Bayesian approach with subsamples. Eur J Operat Res. (2008) 184:207–28. doi: 10.1016/j.ejor.2006.10.054

Keywords: Bayesian estimators, bootstrapping, maximum likelihood estimators, mean squared errors, mixed Frechet distribution

Citation: Kanwal T, Abbas K, Shamoon I, Zaman M and Hussain Z (2026) Bayesian-bootstrap process capability analysis for mixture model. Front. Appl. Math. Stat. 11:1744829. doi: 10.3389/fams.2025.1744829

Received: 12 November 2025; Revised: 12 December 2025;
Accepted: 15 December 2025; Published: 12 January 2026.

Edited by:

Artur Lemonte, Federal University of Rio Grande do Norte, Brazil

Reviewed by:

Edgar O. Reséndiz-Flores, Tecnológico Nacional de México/IT Saltillo, Mexico
Walaa Ahmed Hamdi, University of Jeddah, Saudi Arabia

Copyright © 2026 Kanwal, Abbas, Shamoon, Zaman and Hussain. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Mehwish Zaman, mehwish.zaman@studenti.unipd.it

