- 1 Department of Statistics, King Abdullah Campus Chatter Kalas, The University of Azad Jammu and Kashmir, Muzaffarabad, Pakistan
- 2 The University of Azad Jammu and Kashmir, Muzaffarabad, Pakistan
- 3 Department of Statistical Science, University of Padova, Padova, PD, Italy
- 4 School of Interdisciplinary Engineering and Sciences (SINES), National University of Sciences and Technology (NUST), Islamabad, Pakistan
This research evaluates two process capability indices, Spmk and CPY, for a two-component mixture of Frechet distributions, with parameters estimated by maximum likelihood and by Bayesian methods. Bootstrapping is used to assess the stability and performance of the estimated process capability indices. The comparative study shows that the Bayesian estimators outperform their maximum likelihood counterparts in terms of mean squared errors and width of bootstrap confidence intervals across small to large sample sizes. The real-life data results reinforce these findings. The findings hold implications for researchers and quality control experts engaged in manufacturing, services, and other industries, and they emphasize the importance of the choice of estimation method for robust and accurate process capability analysis when the underlying process distribution is complex and possibly multimodal.
1 Introduction
Process capability refers to a process's intrinsic ability to produce a product that meets the specifications set in the product design. A crucial component of any continuous quality improvement endeavor is measuring the performance of a process and taking appropriate action based on the measurements (Spiring [1]). Businesses evaluate the effectiveness of their processes using a variety of metrics, of which process capability indices (PCIs) are the most prevalent [2]. The key reason for their popularity is that businesses need quantitative measures of a process's performance in relation to its specification limits [3]. PCIs are unitless statistical measures used to compare how well a process characteristic performs. The quality control literature includes many PCIs applicable to various process scenarios. Some commonly used ones are Cp by Juran et al. [4]; Cpk by Kane [5]; Cpm by Chan et al. [6]; and Cpp by Greenwich and Jahr-Schaffrath [7]. Other PCIs introduced in the literature include CNp, CNpk, CNpm, and CNpmk (see Pearn and Chen [8]; Schneider et al. [9]; and Tong and Chen [10]).
Finite mixture models have been used extensively in statistics. They are particularly valued for generalizing distributional assumptions and for modeling heterogeneity in a population. Finite mixture models are used in many fields, including biomedicine, engineering, the social sciences, medicine, economics, marketing, reliability studies, and life testing, because mixture distributions can represent heterogeneous data sets whether the data appear multimodal or unimodal. Generally, a mixture distribution is formed by combining two or more distributions using mixing parameters, and a mixture of two sub-populations can be a suitable model for characterizing the overall population. Moreover, multiple causes of failure can be studied simultaneously via mixture distributions. Usually, the failure time population comprises weak and strong components corresponding to short and long lives, respectively. Various authors have discussed different types of mixtures of distributions; readers may refer to Turkan and Calis [31]; Everitt [11]; Everitt and Hand [12]; Jiang and Murthy [13]; Jiang and Murthy [14]; Marin et al. [15]; McLachlan and Peel [16]; Titterington et al. [17]; McLachlan and Basford [18]; and Lindsay [19]. Ali and Riaz [20] studied generalized capability indices using a Bayesian approach under different loss functions for simple and mixture generalized lifetime models. Moreover, Ouyang et al. [21] and Lin et al. [22] developed credible intervals for PCIs. Wu [32] employed non-informative priors to construct the PC estimator for subsamples gathered over time under the squared error loss function. Kargar et al. [23] employed a Bayesian technique with a normal prior based on subsamples to assess process capability using the index Cpk. Saxena and Singh [24] studied Bayesian estimation of the Cp index for normal distributions.
Kanwal and Abbas [25] developed confidence intervals for PCIs using an objective Bayesian approach and compared them with the classical approach. The PCIs Cpy and CNpmk were developed for non-normal processes by Kanwal et al. [26].
The main objective of this study is to develop Bayesian estimators for the parameter vector Θ = {ψ, αj, βj}, j = 1, 2, of the mixture of Frechet distributions (MFD) and to estimate the PCIs Spmk and CPY for the MFD. Furthermore, bootstrap confidence intervals (BCIs) are evaluated as a measure of performance for both the maximum likelihood estimates (MLEs) and the Bayesian estimates (BE). The novelty of the present work is that, to our knowledge, no attempt has been made to study PCIs for the MFD. Our aim is to give recommendations for selecting a suitable approach for estimating Θ for the MFD and for the subsequent evaluation of the PCIs. We hope that this study will be valuable to practicing engineers, quality experts, and applied statisticians. In this work, we propose an MFD that can be used to evaluate PCIs in real-life problems. Let a random variable T follow a finite mixture model with k components; then the probability density function (PDF) can be written as
$$f(t) = \sum_{i=1}^{k} \psi_i f_i(t), \tag{1}$$
where ψi is the mixing proportion of the ith component, a non-negative proportion such that ψ1 + ψ2 + ... + ψk = 1. For k = 2, with ψ1 = ψ and ψ2 = 1 − ψ, the PDF of the MFD is given as
$$f(t) = \psi f_1(t) + (1 - \psi) f_2(t), \quad t > 0. \tag{2}$$
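Assuming the standard shape-scale parameterization, the PDF of the jth component, j = 1, 2, is
$$f_j(t) = \frac{\alpha_j}{\beta_j}\left(\frac{t}{\beta_j}\right)^{-(\alpha_j + 1)} \exp\left\{-\left(\frac{t}{\beta_j}\right)^{-\alpha_j}\right\}, \quad t > 0, \tag{3}$$
where αj > 0 and βj > 0 are the shape and scale parameters of the Frechet distribution (FD), and the component CDF is Fj(t) = exp{−(t/βj)^(−αj)}. The cumulative distribution function (CDF), reliability function, and hazard rate function of the MFD are, respectively, given by
$$F(t) = \psi F_1(t) + (1 - \psi) F_2(t), \tag{4}$$
$$R(t) = 1 - F(t), \tag{5}$$
$$h(t) = \frac{f(t)}{R(t)}. \tag{6}$$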
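The four MFD functions can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the shape-scale form of the Frechet CDF, F_j(t) = exp(-(t/beta_j)^(-alpha_j)), and the function names are hypothetical.

```python
import math


def frechet_cdf(t, alpha, beta):
    """Assumed Frechet CDF: exp(-(t/beta)^(-alpha)) for t > 0, else 0."""
    if t <= 0:
        return 0.0
    return math.exp(-((t / beta) ** (-alpha)))


def frechet_pdf(t, alpha, beta):
    """Derivative of frechet_cdf with respect to t."""
    if t <= 0:
        return 0.0
    z = (t / beta) ** (-alpha)
    return (alpha / t) * z * math.exp(-z)


def mfd_pdf(t, psi, a1, b1, a2, b2):
    """Two-component mixture density with mixing proportion psi."""
    return psi * frechet_pdf(t, a1, b1) + (1.0 - psi) * frechet_pdf(t, a2, b2)


def mfd_cdf(t, psi, a1, b1, a2, b2):
    """Two-component mixture CDF."""
    return psi * frechet_cdf(t, a1, b1) + (1.0 - psi) * frechet_cdf(t, a2, b2)


def mfd_reliability(t, psi, a1, b1, a2, b2):
    """Reliability (survival) function R(t) = 1 - F(t)."""
    return 1.0 - mfd_cdf(t, psi, a1, b1, a2, b2)


def mfd_hazard(t, psi, a1, b1, a2, b2):
    """Hazard rate h(t) = f(t) / R(t)."""
    return mfd_pdf(t, psi, a1, b1, a2, b2) / mfd_reliability(t, psi, a1, b1, a2, b2)
```

For example, `mfd_cdf(15.0, 0.4, 0.6, 1.4, 0.7, 1.0)` evaluates the mixture CDF at t = 15 for one of the parameter vectors used later in the simulation study.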
The rest of the article is organized as follows. In Section 2, the MLEs for the parameters of the MFD are obtained. Bayesian estimators are presented in Section 3, and PCIs are discussed in Section 4. A simulation study is given in Section 5, and BCIs are presented in Section 6. Finally, a real data set is analyzed in Section 7, and conclusions are presented in Section 8.
2 Maximum likelihood estimation
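Let t1, t2, ..., tn be a random sample of size n from the MFD; then the log-likelihood function is
$$\ell(\Theta) = \sum_{i=1}^{n} \ln\left[\psi f_1(t_i) + (1 - \psi) f_2(t_i)\right]. \tag{7}$$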
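Differentiating Equation 7 with respect to ψ, αj, and βj, j = 1, 2, and equating the derivatives to zero gives the score equations
$$\frac{\partial \ell}{\partial \psi} = \sum_{i=1}^{n} \frac{f_1(t_i) - f_2(t_i)}{f(t_i)} = 0,$$
$$\frac{\partial \ell}{\partial \alpha_j} = \sum_{i=1}^{n} \frac{\psi_j}{f(t_i)} \frac{\partial f_j(t_i)}{\partial \alpha_j} = 0, \qquad \frac{\partial \ell}{\partial \beta_j} = \sum_{i=1}^{n} \frac{\psi_j}{f(t_i)} \frac{\partial f_j(t_i)}{\partial \beta_j} = 0, \quad j = 1, 2,$$
where f(t) = ψ f1(t) + (1 − ψ) f2(t), ψ1 = ψ, ψ2 = 1 − ψ, and f1(t) and f2(t) are the PDFs of the FD components. This system of nonlinear equations has no closed-form solution. Here, we use the BB package, which is available in the R software, to obtain the MLEs of the model parameters numerically.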
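To make the numerical step concrete, the following sketch maximizes the MFD log-likelihood directly. It is not the BB-based procedure used in the paper: a simple derivative-free compass search on transformed parameters stands in for the nonlinear solver, the Frechet density assumes the CDF exp(-(t/beta)^(-alpha)), and all function names are hypothetical. Mixture likelihoods can be unbounded and multimodal, so the result depends on the starting point.

```python
import math


def frechet_pdf(t, alpha, beta):
    """Frechet density (t > 0) under the assumed CDF exp(-(t/beta)^(-alpha))."""
    log_z = -alpha * math.log(t / beta)
    if log_z > 700.0:  # exp(-z) has underflowed to zero long before this
        return 0.0
    z = math.exp(log_z)
    return (alpha / t) * z * math.exp(-z)


def mfd_loglik(theta, data):
    """Log-likelihood of theta = (psi, a1, b1, a2, b2) for positive data."""
    psi, a1, b1, a2, b2 = theta
    total = 0.0
    for t in data:
        dens = psi * frechet_pdf(t, a1, b1) + (1.0 - psi) * frechet_pdf(t, a2, b2)
        if dens <= 0.0:
            return -math.inf
        total += math.log(dens)
    return total


def _to_theta(u):
    """Map unconstrained values to psi in (0, 1) and positive shapes/scales."""
    psi = 1.0 / (1.0 + math.exp(-u[0]))
    return (psi,) + tuple(math.exp(v) for v in u[1:])


def fit_mfd(data, theta0, step=0.5, tol=1e-4, max_iter=500):
    """Compass search: accept a coordinate move only if it raises the log-likelihood."""
    psi0 = theta0[0]
    u = [math.log(psi0 / (1.0 - psi0))] + [math.log(v) for v in theta0[1:]]
    best = mfd_loglik(_to_theta(u), data)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i in range(len(u)):
            for delta in (step, -step):
                cand = list(u)
                cand[i] += delta
                val = mfd_loglik(_to_theta(cand), data)
                if val > best:
                    u, best, improved = cand, val, True
        if not improved:
            step /= 2.0  # shrink the search pattern once no move helps
    return _to_theta(u), best
```

A call such as `fit_mfd(data, (0.5, 1.0, 1.0, 1.0, 2.0))` returns the fitted parameter tuple and the attained log-likelihood. In practice, several starting points would be tried and the best fit kept.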
3 Bayesian estimation
For Bayesian estimation, we need prior distributions for ψ, α1, α2, β1, and β2. We assume that ψ has a Beta(s1, s2) prior, whereas α1, α2, β1, and β2 have independent inverted gamma priors, with PDFs, respectively, given by
where (dj, cj, nj, lj) are the hyperparameters. The joint prior density of the random vector Θ = {ψ, αj, βj}, j = 1, 2 is
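The joint posterior density function for the MFD is given by
$$\pi(\Theta \mid \mathbf{t}) = \frac{L(\mathbf{t} \mid \Theta)\, \pi(\Theta)}{\int_{\Theta} L(\mathbf{t} \mid \Theta)\, \pi(\Theta)\, d\Theta}, \tag{10}$$
where L(t | Θ) is the likelihood function and π(Θ) is the joint prior density.
The posterior distribution (Equation 10) is a ratio of integrals that has no closed form. Therefore, we use the Lindley [27] approximation to obtain the BE, which can be expressed as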
where ai = ρ1σi1 + ρ2σi2 + ρ3σi3 + ρ4σi4 + ρ5σi5, i = 1, 2, 3, 4, 5
The Bayesian estimators for ψ, α1, α2, β1, and β2 under squared error loss function are
where ψ̂, α̂1, α̂2, β̂1, and β̂2 are the maximum likelihood estimates of ψ, α1, α2, β1, and β2, respectively. The remaining derivatives are available on request.
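The prior structure of this section can be illustrated with a short sketch of the unnormalized log-posterior, that is, the log-likelihood plus the log of the joint prior. This is an illustration only: the way the hyperparameters are paired into (shape, scale) tuples, the dictionary keys, and the function names are assumptions, and the Frechet density assumes the CDF exp(-(t/beta)^(-alpha)). Lindley's approximation itself is not implemented here.

```python
import math


def log_beta_pdf(x, s1, s2):
    """Log density of Beta(s1, s2) at 0 < x < 1."""
    log_norm = math.lgamma(s1 + s2) - math.lgamma(s1) - math.lgamma(s2)
    return log_norm + (s1 - 1.0) * math.log(x) + (s2 - 1.0) * math.log(1.0 - x)


def log_inv_gamma_pdf(x, shape, scale):
    """Log density of an inverted gamma distribution at x > 0."""
    return (shape * math.log(scale) - math.lgamma(shape)
            - (shape + 1.0) * math.log(x) - scale / x)


def log_prior(theta, hyper):
    """Independent priors: Beta on psi, inverted gamma on each shape and scale."""
    psi, a1, b1, a2, b2 = theta
    total = log_beta_pdf(psi, *hyper["psi"])
    for value, key in ((a1, "alpha1"), (b1, "beta1"), (a2, "alpha2"), (b2, "beta2")):
        total += log_inv_gamma_pdf(value, *hyper[key])
    return total


def frechet_pdf(t, alpha, beta):
    """Frechet density under the assumed CDF exp(-(t/beta)^(-alpha))."""
    z = (t / beta) ** (-alpha)
    return (alpha / t) * z * math.exp(-z)


def log_posterior(theta, data, hyper):
    """Unnormalized log-posterior: log-likelihood plus log-prior."""
    psi, a1, b1, a2, b2 = theta
    loglik = sum(
        math.log(psi * frechet_pdf(t, a1, b1) + (1.0 - psi) * frechet_pdf(t, a2, b2))
        for t in data
    )
    return loglik + log_prior(theta, hyper)
```

The normalizing integral in the denominator of the posterior is what makes a closed form unavailable. An approximation such as Lindley's works with derivatives of this log-posterior evaluated at the MLE rather than with the integral itself.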
4 Process capability indices
The most frequently used PCIs, Cp, Cpk, Cpmk, and Cpm, rely strictly on the normality assumption for a given process with mean μ and process dispersion σ. However, in many production and service processes the assumption of normality does not hold.
4.1 The Index Spmk for MFD
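The PCI Spmk was introduced by Chen and Ding [2] for any underlying distribution. It accounts for the process variability, the departure of the process mean (μ) from the target value (T1), and the fraction of non-conformity:
$$S_{pmk} = \frac{\Phi^{-1}\left(\frac{1 + F(USL) - F(LSL)}{2}\right)}{3\sqrt{1 + \left(\frac{\mu - T_1}{\sigma}\right)^2}} = \frac{\Phi^{-1}\left(1 - \frac{p}{2}\right)}{3\sqrt{1 + \left(\frac{\mu - T_1}{\sigma}\right)^2}}.$$
Here, Φ(·) is the CDF of the standard normal distribution, p = 1 − F(USL) + F(LSL) denotes the proportion of non-conformities, USL and LSL are the upper and lower specification limits, and F(·) is the CDF of the process distribution. For an MFD quality characteristic, Spmk is obtained by taking F(·) to be the CDF of the MFD, F(t) = ψF1(t) + (1 − ψ)F2(t).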
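The true values of θ = (ψ, α1, β1, α2, β2) are unknown, so we use the Bayesian and maximum likelihood estimators to estimate them. The unbiased estimators of μ and σ2 are the sample mean t̄ and the sample variance S2, respectively, where
$$\bar{t} = \frac{1}{n}\sum_{i=1}^{n} t_i, \qquad S^2 = \frac{1}{n - 1}\sum_{i=1}^{n} (t_i - \bar{t})^2.$$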
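As a rough illustration of how Spmk could be computed for an MFD, the sketch below plugs a parameter estimate into the mixture CDF and uses the sample mean and standard deviation for μ and σ. It assumes the form Spmk = Φ^(-1)((1 + F(USL) − F(LSL))/2) / (3·sqrt(1 + ((μ − T)/σ)^2)) and the Frechet CDF exp(-(t/beta)^(-alpha)). The function names are hypothetical.

```python
import math
import statistics
from statistics import NormalDist


def frechet_cdf(t, alpha, beta):
    """Assumed Frechet CDF: exp(-(t/beta)^(-alpha)) for t > 0, else 0."""
    if t <= 0:
        return 0.0
    return math.exp(-((t / beta) ** (-alpha)))


def mfd_cdf(t, theta):
    """Mixture CDF for theta = (psi, a1, b1, a2, b2)."""
    psi, a1, b1, a2, b2 = theta
    return psi * frechet_cdf(t, a1, b1) + (1.0 - psi) * frechet_cdf(t, a2, b2)


def spmk(f_usl, f_lsl, mu, sigma, target):
    """Spmk from the CDF values at the specification limits.

    inv_cdf raises StatisticsError if the conforming fraction is exactly 0 or 1.
    """
    conforming = f_usl - f_lsl
    z = NormalDist().inv_cdf((1.0 + conforming) / 2.0)
    return z / (3.0 * math.sqrt(1.0 + ((mu - target) / sigma) ** 2))


def spmk_mfd(data, theta, lsl, usl, target):
    """Plug-in Spmk: MFD CDF at estimated theta, sample mean and SD from data."""
    mu_hat = statistics.fmean(data)
    s_hat = statistics.stdev(data)
    return spmk(mfd_cdf(usl, theta), mfd_cdf(lsl, theta), mu_hat, s_hat, target)
```

As a sanity check on the formula, a centered normal process with limits at μ ± 3σ gives a conforming fraction of about 0.9973 and an index of 1, which matches the usual interpretation of a just-capable process.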
4.2 The Index CPY for MFD
Maiti et al. [28] proposed a generalized PCI, CPY. It applies to normal and non-normal, continuous as well as discrete random variables, and it is either directly or indirectly related to the majority of PCIs. It is defined as follows:
$$C_{py} = \frac{p}{p_0} = \frac{F(USL) - F(LSL)}{F(UDL) - F(LDL)},$$
where LDL and UDL are the lower and upper desirable limits, respectively, p is the process yield, and p0 is the desirable yield. If the process distribution is normal with LDL = μ−3σ and UDL = μ+3σ, then the generalized PCI Cpy can be written as (p/0.9973). The index Cpy for the MFD is obtained by taking F(·) to be the CDF of the MFD, F(t) = ψF1(t) + (1 − ψ)F2(t).
Here, we use the Bayesian and maximum likelihood estimators to get the estimates of θ = (ψ, α1, β1, α2, β2).
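A minimal sketch of the yield-ratio computation follows. It assumes the Frechet CDF exp(-(t/beta)^(-alpha)) for each component and takes the desirable limits LDL and UDL as user-supplied inputs. The function names are hypothetical.

```python
import math


def frechet_cdf(t, alpha, beta):
    """Assumed Frechet CDF: exp(-(t/beta)^(-alpha)) for t > 0, else 0."""
    if t <= 0:
        return 0.0
    return math.exp(-((t / beta) ** (-alpha)))


def mfd_cdf(t, theta):
    """Mixture CDF for theta = (psi, a1, b1, a2, b2)."""
    psi, a1, b1, a2, b2 = theta
    return psi * frechet_cdf(t, a1, b1) + (1.0 - psi) * frechet_cdf(t, a2, b2)


def cpy(process_yield, desirable_yield):
    """Generalized index: ratio of process yield p to desirable yield p0."""
    return process_yield / desirable_yield


def cpy_mfd(theta, lsl, usl, ldl, udl):
    """Plug-in CPY for the MFD at a given parameter estimate theta."""
    p = mfd_cdf(usl, theta) - mfd_cdf(lsl, theta)
    p0 = mfd_cdf(udl, theta) - mfd_cdf(ldl, theta)
    return cpy(p, p0)
```

Replacing `theta` with the MLE or the Bayesian estimate gives the corresponding estimate of the index. When the specification limits coincide with the desirable limits, the ratio is exactly 1.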
5 Simulation study
In this section, a Monte Carlo simulation is conducted to examine the behavior of the PCIs under both MLEs and BE, in terms of BCIs and mean squared errors (MSEs), for different sample sizes. The Monte Carlo simulation is performed as follows:
1. Take the initial values of the vectors of parameters Θ = {ψ, αj, βj}.
2. Generate random samples of sizes n = 10, 20, 30, 50, 70, 80, and 100, for each vector of the parameters from MFD. Random samples of the mixture model are generated as:
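i. Generate U1 and U2 independently from the Uniform(0, 1) distribution.
ii. If U1 < ψ, use U2 to generate a random variate T from the first component of the MFD by inverting its CDF, t = F1^(−1)(U2).
iii. If U1 ≥ ψ, use U2 to generate a random variate T from the second component of the MFD by inverting its CDF, t = F2^(−1)(U2).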
3. The process has been replicated 10,000 times for each sample size, and average estimates were determined for each approach using the R software (version: i386 4.3.1), and the results are presented in Tables 1–3.
4. The simulation evaluates the performance of the two PCIs, Spmk and CPY, for the MFD based on MLEs and BE. The sample sizes considered are n = 10, 20, 30, 50, 70, 80, and 100. The LSL and USL are set at 0 and 15, respectively. The midpoint of the two specification limits is used as the target value, and the parameter vector is defined as α1 = 0.6, β1 = 1.4, α2 = 0.7, and β2 = 1, with ψ = 0.4, 0.5, and 0.6.
5. Using bootstrapping, the estimators of the PCIs, the coefficient of skewness, the coefficient of kurtosis (Λ), MSEs, and BCIs are calculated for both MLEs and BE using R software with 50,000 replications. The criteria used for performance evaluation are the MSEs and the widths of the BCIs. The results are listed for Spmk in Tables 4–6 and for CPY in Tables 7–9.
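The sample-generation step of the procedure above can be sketched as follows. The sketch assumes the Frechet CDF exp(-(t/beta)^(-alpha)), whose inverse is beta·(−ln u)^(−1/alpha). One uniform draw selects the component and a second, independent draw is inverted. The function names are hypothetical, and the paper's own R code is not reproduced here.

```python
import math
import random


def frechet_quantile(u, alpha, beta):
    """Inverse of the assumed Frechet CDF for 0 < u < 1."""
    return beta * (-math.log(u)) ** (-1.0 / alpha)


def rmfd(n, psi, a1, b1, a2, b2, rng=None):
    """Draw n variates from the two-component Frechet mixture."""
    rng = rng or random.Random()
    sample = []
    for _ in range(n):
        u1 = rng.random()      # selects the component
        u2 = rng.random()      # inverted to obtain the variate
        while u2 == 0.0:       # log(0) is undefined; redraw in that rare case
            u2 = rng.random()
        if u1 < psi:
            sample.append(frechet_quantile(u2, a1, b1))
        else:
            sample.append(frechet_quantile(u2, a2, b2))
    return sample
```

For instance, `rmfd(30, 0.4, 0.6, 1.4, 0.7, 1.0, random.Random(1))` draws one sample of size 30 for the first parameter vector of the study. With shape parameters below 1 the components are very heavy-tailed, so occasional extremely large values are expected.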
Table 4. The true values of Spmk along with average estimates and respective MSEs for ψ = 0.4, α1 = 0.6, β1 = 1.4, α2 = 0.7, and β2 = 1 for MFD.
Table 5. The true values of Spmk along with average estimates and respective MSEs for ψ = 0.5, α1 = 0.6, β1 = 1.4, α2 = 0.7, and β2 = 1 for MFD.
Table 6. The true values of Spmk along with average estimates and respective MSEs for ψ = 0.6, α1 = 0.6, β1 = 1.4, α2 = 0.7, and β2 = 1 for MFD.
Table 7. The true values of CPY along with average estimates and respective MSEs for ψ = 0.4, α1 = 0.6, β1 = 1.4, α2 = 0.7, and β2 = 1 for MFD.
Table 8. The true values of CPY along with average estimates and respective MSEs for ψ = 0.5, α1 = 0.6, β1 = 1.4, α2 = 0.7, and β2 = 1 for MFD.
Table 9. The true values of CPY along with average estimates and respective MSEs for ψ = 0.6, α1 = 0.6, β1 = 1.4, α2 = 0.7, and β2 = 1 for MFD.
6 BCIs for Spmk and CPY
Franklin and Wasserman [29] introduced bootstrap confidence intervals for PCIs, which are built from the empirical (bootstrap) distribution of the estimated index. The following steps are followed for bootstrapping, and the standard BCIs are considered for Spmk and CPY:
1. A random sample (t1, t2, …, tn) of size n is drawn from MFD(ψ, αj, βj), where j = 1, 2. A bootstrap sample of size n is then drawn with replacement from the initial sample, with a probability mass of 1/n on each observation. The MLEs and BE of the parameters are evaluated from the bootstrap sample, and the R-th bootstrap estimators of Spmk and CPY are calculated from them.
2. There are n^n possible resamples in total. The estimates computed from each resample form the complete bootstrap distribution of the Spmk estimator, and the same process is followed for the CPY estimator.
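The resampling steps above can be sketched generically. In the sketch below, `statistic` is any function that maps a sample to a PCI estimate, for example a routine that fits the MFD and then evaluates Spmk or CPY. The interval is the standard bootstrap form, namely the mean of the bootstrap estimates plus or minus a normal quantile times their standard deviation. A finite number of resamples `n_boot` is used instead of all n^n. The function name and defaults are assumptions.

```python
import random
import statistics
from statistics import NormalDist


def standard_bootstrap_ci(data, statistic, n_boot=1000, level=0.95, rng=None):
    """Standard bootstrap confidence interval for statistic(data)."""
    rng = rng or random.Random()
    n = len(data)
    estimates = []
    for _ in range(n_boot):
        # Resample with replacement, mass 1/n on each observation.
        resample = [data[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistic(resample))
    center = statistics.fmean(estimates)
    spread = statistics.stdev(estimates)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return center - z * spread, center + z * spread
```

The width of the returned interval (upper limit minus lower limit) is the quantity compared between the MLE-based and Bayesian-based PCI estimates.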
From the results of the simulation study, the following conclusions are drawn regarding the behavior of the estimators and the PCIs.
• The results presented in Tables 1–3 provide a comparative analysis of MLEs and BE using Lindley's approximation for sample sizes n = 10, 20, 30, 50, 70, 80, and 100.
• The PCIs Spmk and CPY are estimated using the MLE and Bayesian methods of estimation, keeping the values of the hyperparameters constant at zero. Furthermore, to assess the performance of these methods, bootstrapping is used to compute BCIs for each estimate along with the coefficient of skewness, Λ, and MSEs. The results are shown in Tables 4–9.
• The results in Tables 4–9 and Figures 1–6 show clearly that the MSEs decrease with increasing sample size and are consistently smaller for BE, for both Spmk and CPY and for the different parameter vectors.
• The BCIs for both Spmk and CPY are consistently narrower under Bayesian estimation, which indicates that this method estimates these PCIs with less uncertainty than MLE. BE therefore offers superior precision in calculating Spmk and CPY.
7 Real-data illustration
In this section, we consider a real-data illustration based on the times between failures of a repairable item, a random sample of 30 components of a certain type, presented in Table 10 and documented by Murthy et al. [30], to evaluate the estimates obtained by the different estimation methods and the related PCIs. First, we check whether the data set is consistent with the FD using the Kolmogorov–Smirnov (KS) test. The KS statistic is D = 0.1336 with p-value = 0.6106. Since the p-value exceeds the 5 percent level of significance, the FD is not rejected as a model for this data set. The data set is then randomly divided into two groups, presented in Table 11, for the MFD assuming ψ = 0.5. The point estimates of the parameters of the MFD are presented in Table 12 along with AIC values. Moreover, Figure 7 shows the comparative results for both estimation methods.
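The KS statistic reported above is the largest gap between the empirical CDF of the data and the fitted CDF. A minimal sketch of that computation follows. It covers the statistic D only and not the p-value, the Frechet CDF form exp(-(t/beta)^(-alpha)) is an assumption, and the function names are hypothetical.

```python
import math


def frechet_cdf(t, alpha, beta):
    """Assumed Frechet CDF: exp(-(t/beta)^(-alpha)) for t > 0, else 0."""
    if t <= 0:
        return 0.0
    return math.exp(-((t / beta) ** (-alpha)))


def ks_statistic(data, cdf):
    """One-sample KS statistic: sup |F_n(x) - F(x)| over the sample points."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i - 1)/n to i/n at x; check both sides.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d
```

With fitted values `alpha_hat` and `beta_hat` (hypothetical names), the statistic would be obtained as `ks_statistic(data, lambda t: frechet_cdf(t, alpha_hat, beta_hat))`.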
To estimate the PCIs, the LSL and USL were set at 0.580 and 3.165, respectively. The target value, taken as their midpoint, is 1.8725. Table 13 displays the BCIs of the PCIs Spmk and CPY under the two estimation approaches for the MFD. From Table 13, it is evident that the BE of both PCIs has a lower MSE and a shorter BCI than its MLE counterpart. Based on these findings, we conclude that BE offers a more effective approach for estimating the PCIs considered here for the MFD.
8 Conclusion
In this study, statistical inference for the unknown parameters of the MFD under the MLE and Bayesian frameworks has been considered. Lindley's approximation is employed for BE, and the simulations show that this approach is more effective than the maximum likelihood method. Further, the comparative analysis of classical and Bayesian estimation of the PCIs Spmk and CPY demonstrates that Bayesian estimation consistently produces shorter BCIs for both indices at all sample sizes, indicating higher precision and reduced uncertainty compared with the MLE approach. Similarly, the MSEs for both PCIs decrease with increasing sample size, and they decrease more rapidly under Bayesian estimation. The simulation studies and the real-data analysis lead to the same conclusions, which reinforces the reliability and validity of the findings. Considering these findings, we suggest that BE for the MFD can be used to estimate PCIs when dealing with heterogeneous data sets. The work can be extended by considering other flexible extreme value distributions or censoring schemes.
Data availability statement
The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author.
Author contributions
TK: Conceptualization, Formal analysis, Methodology, Software, Writing – original draft. KA: Formal analysis, Project administration, Software, Supervision, Writing – review & editing. IS: Formal analysis, Software, Writing – review & editing. MZ: Resources, Writing – review & editing, Methodology. ZH: Methodology, Project administration, Writing – review & editing.
Funding
The author(s) declared that financial support was not received for this work and/or its publication.
Conflict of interest
The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declared that generative AI was not used in the creation of this manuscript.
Publisher's note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
References
1. Spiring FA. Process capability: a total quality management tool. Total Qual Managem. (1995) 6:21–34. doi: 10.1080/09544129550035558
2. Chen JP, Ding CG. A new process capability index for non-normal distributions. Int J Qual Reliab Managem. (2001) 18:762–70. doi: 10.1108/02656710110396076
3. Anis MZ. Basic process capability indices: an expository review. Int Stat Rev. (2008) 76:347–67. doi: 10.1111/j.1751-5823.2008.00060.x
5. Kane VE. Process capability indices. J Qual Technol. (1986) 18:41–52. doi: 10.1080/00224065.1986.11978984
6. Chan LK, Cheng SW, Spiring FA. A new measure of process capability: CPM. J Qual Technol. (1988) 20:162–75. doi: 10.1080/00224065.1988.11979102
7. Greenwich M, Jahr-Schaffrath BL. A process incapability index. Int J Qual Reliab Managem. (1995) 12:58–71. doi: 10.1108/02656719510087328
8. Pearn WL, Chen KS. Capability indices for non-normal distributions with an application in electrolytic capacitor manufacturing. Microelectron Reliab. (1997) 37:1853–8. doi: 10.1016/S0026-2714(97)00023-1
9. Schneider H, Pruett J, Lagrange C. Uses of process capability indices in the supplier certification process. Qual Eng. (1995) 8:225–35. doi: 10.1080/08982119508904621
10. Tong LI, Chen JP. Lower confidence limits of process capability indices for non-normal process distributions. Int J Qual Reliab Managem. (1998) 15:907–19. doi: 10.1108/02656719810199006
11. Everitt BS. An introduction to finite mixture distributions. Stat Methods Med Res. (1996) 5:107–27. doi: 10.1177/096228029600500202
13. Jiang R, Murthy DN. Modeling failure-data by mixture of 2 Weibull distributions: a graphical approach. IEEE Trans Reliab. (1995) 44:477–88. doi: 10.1109/24.406588
14. Jiang R, Murthy DN. Two sectional models involving three Weibull distributions. Qual Reliab Eng Int. (1997) 13:83–96.
15. Marin J, Rodriguez-Bernal M, Wiper M. Using Weibull mixture distributions to model heterogeneous survival data. Commun Statist-Simulat Comp. (2005) 34:673–84. doi: 10.1081/SAC-200068372
17. Titterington DM, Smith AF, Makov UE. Statistical Analysis of Finite Mixture Distributions. Chichester: Wiley & Sons. (1985).
18. McLachlan GJ, Basford KE. Mixture Models: Inference and Applications to Clustering. New York, NY: M. Dekker. (1988).
19. Lindsay BG. Mixture Models: Theory, Geometry, and Applications. Hayward, CA: Institute of Mathematical Statistics. (1995).
20. Ali S, Riaz M. On the generalized process capability under simple and mixture models. J Appl Stat. (2014) 41:832–52. doi: 10.1080/02664763.2013.856386
21. Ouyang LY, Wu CC, Kuo HL. Bayesian assessment for some process capability indices. Int J Inform Managem Sci. (2002) 13:1–8.
22. Lin TY, Wu CW, Chen JC, Chiou YH. Applying Bayesian approach to assess process capability for asymmetric tolerances based on CPMK index. Appl Math Model. (2011) 35:4473–89. doi: 10.1016/j.apm.2011.03.011
23. Kargar M, Mashinchi M, Parchami A. A Bayesian approach to capability testing based on Cpk with multiple samples. Qual Reliab Eng Int. (2014) 30:615–21. doi: 10.1002/qre.1512
24. Saxena S, Singh HP. A Bayesian estimator of process capability index. J Statis Managem Syst. (2006) 9:269–83. doi: 10.1080/09720510.2006.10701206
25. Kanwal T, Abbas K. Bootstrap confidence intervals of process capability indices Spmk, Spmkc and Cs for Frechet distribution. Qual Reliab Eng Int. (2023) 39:2244–57. doi: 10.1002/qre.3333
26. Kanwal T, Abbas K, Zaman M, Hussain Z. Bootstrap confidence intervals of process capability indices Cpy and CNpmk using different methods of estimation for Frechet distribution. Front Appl Mathem Statist. (2025) 11:1668809. doi: 10.3389/fams.2025.1668809
27. Lindley DV. Approximate Bayesian methods. Trabajos de Estadística y de Investigación Operativa. (1980) 31:223–45. doi: 10.1007/BF02888353
28. Maiti SS, Saha M, Nanda AK. On generalizing process capability indices. Qual Technol Quant Managem. (2010) 7:279–300. doi: 10.1080/16843703.2010.11673233
29. Franklin LA, Wasserman GS. Bootstrap confidence interval estimates of Cpk: an introduction. Commun Stat-Simul Comp. (1991) 20:231–42. doi: 10.1080/03610919108812950
31. Turkan AH, Calis N. Comparison of two-component mixture distribution models for heterogeneous survival datasets: a review study. Istatistik J Turk Stat Assoc. (2014) 7:33–42.
Keywords: Bayesian estimators, bootstrapping, maximum likelihood estimators, mean squared errors, mixed Frechet distribution
Citation: Kanwal T, Abbas K, Shamoon I, Zaman M and Hussain Z (2026) Bayesian-bootstrap process capability analysis for mixture model. Front. Appl. Math. Stat. 11:1744829. doi: 10.3389/fams.2025.1744829
Received: 12 November 2025; Revised: 12 December 2025;
Accepted: 15 December 2025; Published: 12 January 2026.
Edited by:
Artur Lemonte, Federal University of Rio Grande do Norte, Brazil
Reviewed by:
Edgar O. Reséndiz-Flores, Tecnológico Nacional de México/IT Saltillo, Mexico
Walaa Ahmed Hamdi, University of Jeddah, Saudi Arabia
Copyright © 2026 Kanwal, Abbas, Shamoon, Zaman and Hussain. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Mehwish Zaman, mehwish.zaman@studenti.unipd.it