Extending the Kumaraswamy distribution: properties and applications

Jan, Mahvish; Ahad, Tabasum; Correa, Fabio Mathias; Ahmad, S. P.

doi:10.3389/fams.2025.1694353

ORIGINAL RESEARCH article

Front. Appl. Math. Stat., 23 January 2026

Sec. Statistics and Probability

Volume 11 - 2025 | https://doi.org/10.3389/fams.2025.1694353

Extending the Kumaraswamy distribution: properties and applications

Mahvish Jan¹^*

Tabasum Ahad¹

Fabio Mathias Correa²

S. P. Ahmad¹

¹Department of Statistics, School of Physcial and Mathematical Sciences, University of Kashmir, Srinagar, Kashmir, India
²Department of Mathematical Statistics and Actuarial Sciences, University of the Free State, Bloemfontein, South Africa

In this study, we propose a new extension of the Kumaraswamy distribution, termed the PNJKumaraswamy Distribution, developed using the PNJ transformation technique. The PNJKumaraswamy distribution provides enhanced flexibility compared to several existing generalizations of the Kumaraswamy model. We derive and examine its key statistical properties, including moments, quantiles, hazard rate behavior and related structural measures. The parameter estimation is performed using the maximum likelihood method. A simulation study is conducted to evaluate the behavior of the maximum likelihood estimates under various parameter settings. The practical usefulness of the proposed model is further demonstrated through the analysis of real data sets, where the PNJKumaraswamy distribution achieves a superior fit relative to competing models.

1 Introduction

In statistical distribution theory, incorporating additional parameters into existing distribution families has become a widely adopted and valuable methodological advancement. Such extensions significantly increase the adaptability of probability models, enabling them to more accurately represent complex data structures and achieve improved fit across diverse real-world applications. This methodological development not only enriches the theoretical properties of probability distributions but also broadens their applicability in various scientific and engineering domains. In recent years, a wide range of generator mechanisms and transformation techniques has been introduced to extend classical distributions and enhance their ability to model complex data behaviors. These generators provide flexible frameworks for modifying baseline distributions through additional shape parameters, functional transformations, or mixing strategies, thereby offering improved control over skewness, tail heaviness, and overall distributional form. The exponentiated Weibull family for analyzing bathtub failure-rate data improves tail flexibility by exponentiating a baseline cdf [1]. The Alpha-Log-Power Transformed-G technique was proposed by Musekwa et al. [2] to modify the scale and shape behavior through a log-power structure, and Mahdavi and Kundu [3] proposed an alpha power transformation family which incorporates a new parameter α into the baseline distribution within the CDF. Likewise, the T-X family provides a broad framework for constructing new distributions through sequential functional transformation [4]. Other important generator systems include the Marshall–Olkin family [5] and the beta-generated family [6], both of which provide robust frameworks for constructing flexible distributions. The Gamma–G family [7] is a new model by embedding the gamma distribution into a baseline G. The exponentiated Weibull family of distributions is among the earliest approaches for enhancing a baseline model by incorporating an additional shape parameter, thereby improving its ability to capture diverse and complex data behaviors [8]. Further advancement has been achieved through transformation-based methodologies such as the SMP transformation [9]. The trigonometric-based ASP family of distributions represents another generalization technique [10]. Several extended versions of the Kumaraswamy distribution have also been proposed in the literature, including the Kumaraswamy–Weibull distribution [11], the exponentiated Kumaraswamy distribution [12], the inverted Kumaraswamy distribution [13], and the SMP Kumaraswamy distribution [14].

In this study, we contribute to this ongoing effort by introducing a new flexible extension of the Kumaraswamy distribution via the PNJ transformation technique. The resulting model, termed the PNJ Kumaraswamy distribution (PNJKD), incorporates an additional shape parameter that significantly enhances its capability to model diverse data patterns. The key motivations for this work are threefold:

1. To develop a more flexible alternative to the classical Kumaraswamy distribution that can effectively model non-constant hazard rates, including increasing, decreasing, and unimodal shapes.

2. The third parameter increases the ability of the distribution to model diverse shapes such as heavy or light tails, high skewness, or peaked densities, allowing a much better fit than traditional two-parameter models.

3. Unlike many existing distributions that have limited hazard shapes, the proposed distribution can generate multiple forms of hazard functions making it more useful in reliability and survival analysis. More Versatile Hazard Rate Patterns.

4. To provide a comprehensive statistical foundation for the new distribution by deriving its essential properties, such as moments, entropy and order statistics, making it a fully characterized tool for practitioners.

5. To demonstrate empirical superiority by rigorously testing the PNJK against several well-established competing models using real-world datasets, thereby establishing its practical utility.

This study proposes and investigates a novel extension of the Kumaraswamy distribution using the PNJ transformation. This new distribution, known as the PNJ Kumaraswamy distribution (PNJKD), introduces an extra parameter which gives it several desirable properties and makes the shapes of the hazard and density functions more flexible. Furthermore, when applied to two real datasets, the proposed model outperforms several well-known competing models. The rest of the study is organized as follows. Section 2 introduces the PNJ transformation. Section 3 presents the PNJKD in detail. Sections 4, 5 describes some statistical and moment properties, respectively. Section 6 focuses on estimating the unknown parameters using the maximum likelihood approach. Sections 7, 8 present the simulation study and real-life applications, respectively, and Section 9 concludes the study.

2 PNJ transformation

The cumulative distribution function (cdf) and the probability density function (pdf) of the PNJ distribution proposed by Ahad et al. [15] are given below

\begin{array}{l} F_{PNJ} (x) = {\begin{array}{l} \frac{e^{\log (ζ) F (x)} - S (x)}{ζ}, ζ > 1, \\ F (x), ζ = 1. \end{array} & (1) \end{array}

and

\begin{array}{l} f_{PNJ} (x) = {\begin{array}{l} \frac{f (x)}{ζ} {\log (ζ) e^{\log (ζ) F (x)} + 1}, ζ > 1, \\ f (x), ζ = 1. \end{array} & (2) \end{array}

where F(x), f(x), and S(x) are the cdf, pdf, and survival function of the distribution to be generalized.

3 PNJ Kumaraswamy distribution

Suppose the random variable X has the Kumaraswamy distribution with parameters α and β, then its cdf and pdf are, respectively, given by:

\begin{array}{l} G (x; α, β) = 1 - {(1 - x^{α})}^{β}, & (3) \end{array}

and

\begin{array}{l} g (x; α, β) = α β x^{α - 1} {(1 - x^{α})}^{β - 1}, & (4) \end{array}

where 0 < x < 1, α>0, β>0.

Using the cdf and pdf of the Kumaraswamy distribution as baseline distributions in PNJ Transformation 2, we obtain the PNJK distribution, with the cdf and pdf given, respectively, by:

\begin{array}{l} F_{P N J K} (x) = \frac{e^{l o g (ζ) (1 - {(1 - x^{α})}^{β})} - {(1 - x^{α})}^{β}}{ζ}, & (5) \end{array}

and

\begin{array}{l} f_{P N J K} (x) = \frac{α β x^{α - 1} {(1 - x^{α})}^{β - 1} {l o g (ζ) e^{l o g (ζ) (1 - {(1 - x^{α})}^{β})} + 1}}{ζ} . & (6) \end{array}

Figure 1 shows the pdf for different combinations of PNJK parameters, ζ, α, and β. Figure 1 illustrates how the PNJ Kumaraswamy distribution's pdf changes under different parameter choices: In the first graph (α = 4, β = 3, ζ = 1.5), the curve is moderately peaked and slightly right-skewed; in the second (α = 4, β = 3, ζ = 10), the same base shape becomes much sharper and more concentrated because a large ζ greatly increases the height of the mode; the third graph (α = 2.1, β = 1, ζ = 3.5) shows a mostly increasing density with no strong peak due to the lower α and β; and the fourth (α = 1.5, β = 1.56, ζ = 1.3) presents a smooth, mildly skewed bell-shaped curve, demonstrating the flexibility of the distribution to produce increasing, unimodal, or sharply peaked forms depending on the parameter combination.

Figure 1

Four graphs depict different probability density functions with varying parameters. The first graph shows a skewed curve peaking around $x = 0.7$ with $\alpha = 4$, $\beta = 3$, $\zeta = 1.5$. The second graph is similar, peaking slightly higher with $\zeta = 10$. The third graph illustrates a continuously rising curve with $\alpha = 2.1$, $\beta = 1$, $\zeta = 3.5$. The fourth graph presents a symmetric curve peaking around $x = 0.5$ with $\alpha = 1.5$, $\beta = 1.56$, $\zeta = 1.3$.

Figure 1. Pdf plot of PNJKD with different parameter value.

4 Some properties of the PNJ Kumaraswamy distribution

This section focuses on deriving key reliability measures for the PNJKD, such as the survival function, hazard rate, reverse hazard rate, and cumulative hazard function.

The survival function for the PNJKD is given by:

\begin{array}{l} S_{P N J K} (x) = \frac{ζ + {(1 - x^{α})}^{β} - e^{l o g (ζ) (1 - {(1 - x^{α})}^{β})}}{ζ} . & (7) \end{array}

Figure 2 shows the PNJKD survival plots. In the first panel (α = 1.5, β = 1.7, ζ = 4.1), the moderately large α and β produce a smooth but noticeable decline, meaning failures increase gradually with x. In the second panel (α = 2.1, β = 5.1, ζ = 7.1), the very large β makes the curve drop sharply, showing a rapid loss of survival and indicating a strong sensitivity to changes in x. In the third panel (α = 0.2, β = 1.1, ζ = 3.1), the small α slows the impact of x^α, so the curve decreases more slowly, reflecting a more prolonged survival pattern. Finally, in the fourth panel (α = 0.1, β = 2.88, ζ = 0.5), the extremely small α makes the decline start very late, but the relatively large β forces a sudden drop once the effect kicks in; the small ζ compresses this behavior further, giving a fast, early fall in survival. In general, increasing α or β accelerates decay, while decreasing ζ shifts survival downward, and each set of parameters forms a distinct survival pattern.

Figure 2

Four plots of survival functions S(x) with varying parameters. Each plot shows a curve representing the survival function value on the y-axis against x. The parameters for each plot are: first plot α=1.5, β=1.7, ζ=4.1; second plot α=2.1, β=5.1, ζ=7.1; third plot α=0.2, β=1.1, ζ=3.1; fourth plot α=0.1, β=2.88, ζ=0.5. Each curve decreases steeply, indicating the declining probability of survival.

Figure 2. Plots of the survival function for the PNJKD.

The expression for the hazard rate of the PNJKD is obtained as:

\begin{array}{l} h_{P N J K} (x) = \frac{α β x^{α - 1} {(1 - x^{α})}^{β - 1} {e^{l o g (ζ) (1 - {(1 - x^{α})}^{β})} + 1}}{ζ + {(1 - x^{α})}^{β} - e^{l o g (ζ) (1 - {(1 - x^{α})}^{β})}}, & (8) \end{array}

Figure 3 presents the hazard function of the PNJKD distribution for different parameter combinations. The plots illustrate the flexibility of the model in capturing various hazard-rate shapes. For (α = 1.5, β = 1.7, ζ = 4.1) and (α = 2.1, β = 5.1, ζ = 7.1), the hazard rate is monotonically increasing, with the latter showing a steeper rise. For (α = 0.2, β = 1.1, ζ = 3.1), the hazard increases slowly and becomes prominent only in the tail region. In contrast, for (α = 0.1, β = 2.88, ζ = 0.5), the hazard rate is decreasing, indicating an initially high failure risk followed by a stabilizing trend. These patterns confirm that the PNJKD distribution accommodates increasing, decreasing, and rapidly increasing hazard behaviors across its parameter space.

Figure 3

Four graphs display hazard rates as functions of $ x $ with red curves. Each graph varies parameters $\alpha$, $\beta$, and $\zeta$. The hazard rates increase sharply near $ x = 1 $ in the first three graphs, while the fourth shows a decreasing trend.

Figure 3. Plots of the hazard function for the PNJKD.

The expressions for the reversed hazard rate and the cumulative hazard function are given below, respectively,

\begin{array}{l} h_{r} (x) = \frac{f_{P N J K} (x)}{F_{P N J K} (x)} \\ = \frac{α β x^{α - 1} {(1 - x^{α})}^{β - 1} {l o g (ζ) \cdot e^{\log (ζ) (1 - {(1 - x^{α})}^{β})} + 1}}{e^{\log (ζ) (1 - {(1 - x^{α})}^{β})} - {(1 - x^{α})}^{β}}, & (9) \end{array}

Figure 4 displays the reverse hazard function of the PNJKD distribution for selected parameter combinations. In all cases, the reverse hazard rate exhibits a decreasing pattern, starting from a very high value near x = 0 and declining rapidly as x increases. For (α = 1.5, β = 1.1, ζ = 2.5) and (α = 4.5, β = 0.9, ζ = 1.5), the decline is steep, indicating that the probability of failure from the left tail reduces quickly. For (α = 5.3, β = 1.8, ζ = 3.1) and (α = 1.1, β = 0.68, ζ = 2.3), the shape is similarly decreasing but with slower decay, reflecting heavier left-tail behavior. These plots confirm that the PNJKD distribution can accommodate different magnitudes of decreasing reverse hazard rates, demonstrating its flexibility in modeling left-tail risk characteristics.

\begin{array}{l} Λ_{P N J K} (x) = - l o g (S_{P N J K} (x)) = \\ l o g {\frac{ζ}{ζ + {(1 - x^{α})}^{β} - e^{l o g (ζ) (1 - {(1 - x^{α})}^{β})}}}, & (10) \end{array}

Figure 4

Four line graphs depicting reverse hazard rate against variable x, each with different parameters: Alpha, Beta, and Zeta. All graphs show a steep decline. Parameters vary per graph: (1.5, 1.1, 2.5), (4.5, 0.9, 1.5), (5.3, 1.8, 3.1), (1.1, 0.68, 2.3).

Figure 4. Plots of the reverse hazard rate for the PNJKD.

Figure 5 presents the cumulative hazard function of the PNJKD distribution for different parameter settings. In all cases, the cumulative hazard increases with x, as expected. For (α = 1.5, β = 1.9, ζ = 3.8) and (α = 1.7, β = 0.4, ζ = 3.4), the growth is initially slow and then rises sharply near the right tail. For (α = 0.2, β = 1.1, ζ = 3.1), the increase is gradual at first and becomes steeper for larger x. In contrast, for (α = 0.1, β = 2.8, ζ = 0.5), the cumulative hazard grows rapidly at the beginning and then stabilizes. These results demonstrate that the PNJKD distribution is capable of generating diverse cumulative hazard shapes, reflecting its flexibility in modeling different reliability patterns.

Figure 5

Four cumulative hazard plots displayed side by side. Each plot shows a curve labeled with different parameters: Plot one has α=1.5, β=1.9, ξ=3.8, plot two has α=1.7, β=0.4, ξ=3.4, plot three has α=0.2, β=1.1, ξ=3.1, and plot four has α=0.1, β=2.8, ξ=0.5. All plots have cumulative hazard on the vertical axis and x on the horizontal axis, with curves increasing sharply.

Figure 5. Plots of the cumulative hazard function for the PNJKD.

5 Moment

The r^th moment for the PNJK distribution can be obtained as follows:

\begin{array}{l} μ_{r}^{'} = E (X^{r}) = \int_{0}^{1} x^{r} f_{PNJK} (x) d x \\ = \frac{α β}{ζ} \int_{0}^{1} x^{r + α - 1} {(1 - x^{α})}^{β - 1} {\log (ζ) e^{\log (ζ) (1 - {(1 - x^{α})}^{β})} + 1} d x . & (11) \end{array}

using expansions

e^{x} = \sum_{j = 0}^{\infty} \frac{x^{j}}{j!},

{(1 - x)}^{n} = \sum_{k = 0}^{\infty} {(- 1)}^{k} (\begin{matrix} n \\ k \end{matrix}) x^{k}, | x | < 1,

and substitute x^α = z, we get

\begin{array}{l} E (X^{r}) = \frac{β}{ζ} {\sum_{j = 0}^{\infty} \sum_{k = 0}^{\infty} \frac{{(l o g (ζ))}^{j + 1}}{j!} {(- 1)}^{k} (\begin{matrix} j \\ k \end{matrix}) B (\frac{r}{α} + 1, β (k + 1)) \\ + B (\frac{r}{α} + 1, β)}, & (12) \end{array}

setting r = 1, in Equation 12 we get the first moment as

\begin{array}{l} E (X) = \frac{β}{ζ} {\sum_{j = 0}^{\infty} \sum_{k = 0}^{\infty} \frac{{(l o g (ζ))}^{j + 1}}{j!} {(- 1)}^{k} (\begin{matrix} j \\ k \end{matrix}) B (\frac{1}{α} + 1, β (k + 1)) \\ + B (\frac{1}{α} + 1, β)} . & (13) \end{array}

Theorem 1: Let X ~ PNJK(α, β, ζ), then the moment generating function, M_x(t) is

\begin{array}{l} M x (t) = \frac{β}{ζ} \sum_{r = 0}^{\infty} \frac{t^{r}}{r!} {\sum_{j = 0}^{\infty} \sum_{k = 0}^{\infty} \frac{{(l o g (ζ))}^{j + 1}}{j!} {(- 1)}^{k} (\begin{matrix} j \\ k \end{matrix}) \\ B (\frac{r}{α} + 1, β (k + 1)) + B (\frac{r}{α} + 1, β)} . \end{array}

Proof: The MGF M_X(t) is given as

\begin{array}{l} M x (t) = \int_{0}^{1} e^{t x} f_{P N J K} (x) d x = \int_{0}^{1} e^{t x} \frac{α β}{ζ} x^{α - 1} {(1 - x^{α})}^{β} \\ {l o g (ζ) \cdot e^{l o g (ζ) (1 - {(1 - x^{α})}^{β})} + 1} . & (14) \end{array}

By Using the same procedure as in Equation 12, we get

\begin{array}{l} M x (t) = \frac{β}{ζ} \sum_{r = 0}^{\infty} \frac{t^{r}}{r!} {\sum_{j = 0}^{\infty} \sum_{k = 0}^{\infty} \frac{{(l o g (ζ))}^{j + 1}}{j!} {(- 1)}^{k} (\begin{matrix} j \\ k \end{matrix}) \\ B (\frac{r}{α} + 1, β (k + 1)) + B (\frac{r}{α} + 1, β)} . & (15) \end{array}

Theorem 2: Let X ~ PNJK(α, β, ζ), then the Characteristic function, ϕ_x(t) is

\begin{array}{l} ϕ_{X} (t) = \frac{β}{ζ} \sum_{r = 0}^{\infty} \frac{{(i t)}^{r}}{r!} {\sum_{j = 0}^{\infty} \sum_{k = 0}^{\infty} \frac{{(l o g (ζ))}^{j + 1}}{j!} {(- 1)}^{k} (\begin{matrix} j \\ k \end{matrix}) \\ B (\frac{r}{α} + 1, β (k + 1)) + B (\frac{r}{α} + 1, β)} . \end{array}

Proof: The Characteristic function, ϕ_X(t) is given as

\begin{array}{l} ϕ_{X} (t) = \int_{0}^{1} e^{i t x} f_{P N J K} (x) d x = \int_{0}^{1} e^{i t x} \frac{α β}{ζ} x^{α - 1} {(1 - x^{α})}^{β} \\ {l o g (ζ) \cdot e^{l o g (ζ) (1 - {(1 - x^{α})}^{β})} + 1}, & (16) \end{array}

Using the same procedure of the Equation 12, we have:

\begin{array}{l} ϕ_{X} (t) = \frac{β}{ζ} \sum_{r = 0}^{\infty} \frac{{(i t)}^{r}}{r!} {\sum_{j = 0}^{\infty} \sum_{k = 0}^{\infty} \frac{{(l o g (ζ))}^{j + 1}}{j!} {(- 1)}^{k} (\begin{matrix} j \\ k \end{matrix}) \\ B (\frac{r}{α} + 1, β (k + 1)) + B (\frac{r}{α} + 1, β)} . & (17) \end{array}

Theorem 3: Let X ~ PNJK (α, β, ζ) with PDF given in Equation 6, then the r^th incomplete moment I_r(x) of X are

\begin{array}{l} I_{r} (x) = \frac{β}{ζ} {\sum_{j = 0}^{\infty} \sum_{k = 0}^{\infty} \frac{{(l o g (ζ))}^{j + 1}}{j!} {(- 1)}^{k} (\begin{matrix} j \\ k \end{matrix}), \\ B (t^{α}; \frac{r}{α} + 1, β (k + 1)) + B (t^{α}; \frac{r}{α} + 1, β)} . & (18) \end{array}

Proof: The r^th incomplete moment are defined as

\begin{array}{l} I_{r} (t) = \int_{0}^{t} x^{r} f_{P N J K} (x) d x = \int_{0}^{t} \frac{α β}{ζ} x^{r + α - 1} {(1 - x^{α})}^{β - 1} \\ {\log (ζ) e^{\log (ζ) (1 - {(1 - x^{α})}^{β})} + 1} d x . & (19) \end{array}

By Equation 12, then

\begin{array}{l} I_{r} (x) = \frac{β}{ζ} {\sum_{j = 0}^{\infty} \sum_{k = 0}^{\infty} \frac{{(l o g (ζ))}^{j + 1}}{j!} {(- 1)}^{k} (\begin{matrix} j \\ k \end{matrix}), \\ B (t^{α}; \frac{r}{α} + 1, β (k + 1)) + B (t^{α}; \frac{r}{α} + 1, β)} & (20) \end{array}

Setting r=1 in Equation 20, the first incomplete moment as given by:

\begin{array}{l} I_{1} (x) = \frac{β}{ζ} {\sum_{j = 0}^{\infty} \sum_{k = 0}^{\infty} \frac{{(l o g (ζ))}^{j + 1}}{j!} {(- 1)}^{k} (\begin{matrix} j \\ k \end{matrix}) \\ B (t^{α}; \frac{1}{α} + 1, β (k + 1)) + B (t^{α}; \frac{1}{α} + 1, β)} . & (21) \end{array}

5.1 Mean residual life

The mean residual life of a component is the expected remaining time that the component will live for, given that it has survived up to time t. It is given as:

μ (t) = \frac{1}{S (t)} [E (X) - \int_{0}^{t} x f_{P N J K} (x) d x] - t,

where

\begin{array}{l} E (X) = \frac{β}{ζ} {\sum_{j = 0}^{\infty} \sum_{k = 0}^{\infty} \frac{{(l o g (ζ))}^{j + 1}}{j!} {(- 1)}^{k} (\begin{matrix} j \\ k \end{matrix}) B (\frac{1}{α} + 1, β (k + 1)) \\ + B (\frac{1}{α} + 1, β)}, \end{array}

and S(t) is the survival function, is the probability that a system survives longer than a specific time and is given as

S (t) = \frac{ζ + {(1 - x^{α})}^{β} - e^{l o g (ζ) (1 - {(1 - x^{α})}^{β})}}{ζ},

\begin{array}{l} \int_{0}^{t} x f_{P N J K} (x) d x = \frac{β}{ζ} {\sum_{j = 0}^{\infty} \sum_{k = 0}^{\infty} \frac{{(l o g (ζ))}^{j + 1}}{j!} {(- 1)}^{k} (\begin{matrix} j \\ k \end{matrix}) \\ B (t^{α}; \frac{1}{α} + 1, β (k + 1)) + B (t^{α}; \frac{1}{α} + 1, β)}, \end{array}

and

Hence,

\begin{array}{l} μ (t) = \frac{β}{ζ + {(1 - t^{α})}^{β} - e^{\log (ζ) {(1 - (1 - t^{α}))}^{β}}} [\sum_{j = 0}^{\infty} \sum_{k = 0}^{\infty} \frac{{(\log ζ)}^{j + 1}}{j!} \\ {(- 1)}^{k} (\begin{matrix} j \\ k \end{matrix}) {B (\frac{1}{α} + 1, β (k + 1)) + B (\frac{1}{α} + 1, β) \\ - B (t^{α}; \frac{1}{α} + 1, β (k + 1))} - B (t^{α}; \frac{1}{α} + 1, β)] - t . \end{array}

5.2 Mean waiting time

The mean waiting time is defined as the elapsed time since an item failed, provided that the failure occurred within the time period [0, t]. It is defined as follows:

\bar{μ} (t) = t - \frac{1}{F (t)} \int_{0}^{t} x f_{P N J K} (x) d x,

\begin{array}{l} \bar{μ} (t) = t - \frac{β}{e^{\log (ζ) (1 - {(1 - t^{α})}^{β})} - {(1 - t^{α})}^{β}} \\ [\sum_{j = 0}^{\infty} \sum_{k = 0}^{\infty} \frac{{(\log ζ)}^{j + 1}}{j!} {(- 1)}^{k} (\begin{matrix} j \\ k \end{matrix}) B (t^{α}; \frac{1}{α} + 1, β (k + 1)) \\ + B (t^{α}; \frac{1}{α} + 1, β)} . \end{array}

5.3 Renyi entropy

The Rényi entropy is a quantity that generalizes various notions of entropy. The Rényi entropy is named after Alfréd Rényi, Rényi, [16] who looked for the most general way to quantify information while preserving additivity for independent events and is defined as

I_{v} = \frac{1}{1 - v} \log \int_{0}^{1} f^{v} (x) d x,

using pdf given in Equation 6

\begin{array}{l} I_{v} = \frac{1}{1 - v} \log [{(\frac{α β}{ζ})}^{v} \int_{0}^{1} x^{v (α - 1)} {(1 - x^{α})}^{v (β - 1)} \\ {\log (ζ) e^{\log (ζ) (1 - {(1 - x^{α})}^{β})} + 1}^{v} d x] . & (22) \end{array}

After integration, the Renyi entropy reduces to:

I_{v} = \frac{1}{1 - v} [v \log (\frac{α β}{ζ}) + \log {T (v, α, β, ζ)}],

where

\begin{array}{l} T (v, α, β, ζ) = \sum_{m = 0}^{v} \sum_{n = 0}^{\infty} \sum_{k = 0}^{n} (\begin{matrix} v \\ m \end{matrix}) (\begin{matrix} n \\ k \end{matrix}) \frac{{(- 1)}^{k} m^{n} {[\log (ζ)]}^{m + n}}{n!} \\ B (\frac{v α - v - α + 1}{α} + 1, v (β - 1) + β (k + 1)) . \end{array}

6 Estimation of parameters

This section covers the maximum likelihood estimation method to determine the unknown parameters of α, β, and ζ.

6.1 Maximum likelihood estimation

Let x₁, x₂, ..., x_n be a random sample from the PNJK distribution. Then the log likelihood function is given by:

\begin{array}{l} ℓ (α, β, ζ) = n \log α + n \log β - n \log ζ + (α - 1) \sum_{i = 1}^{n} \log x_{i} \\ + (β - 1) \sum_{i = 1}^{n} \log (1 - x_{i}^{α}) + \sum_{i = 1}^{n} \log \\ {\log (ζ) e^{\log (ζ) (1 - {(1 - x_{i}^{α})}^{β})} + 1}, & (23) \end{array}

The MLEs of α, β, and ζ are obtained by partially differentiating Equation 23 with respect to the corresponding parameters and setting them to zero. So,

\begin{array}{l} \frac{\partial ℓ}{\partial α} = \frac{n}{α} + \sum_{i = 1}^{n} \log x_{i} - (β - 1) \sum_{i = 1}^{n} \frac{x_{i}^{α} \log x_{i}}{1 - x_{i}^{α}} + \sum_{i = 1}^{n} \\ \frac{\log^{2} (ζ) e^{\log (ζ) (1 - {(1 - x_{i}^{α})}^{β})} \cdot β x_{i}^{α} \log x_{i} {(1 - x_{i}^{α})}^{β - 1}}{\log (ζ) e^{\log (ζ) (1 - {(1 - x_{i}^{α})}^{β})} + 1} = 0, & (24) \end{array}

\begin{array}{l} \frac{\partial ℓ}{\partial β} = \frac{n}{β} + \sum_{i = 1}^{n} \log (1 - x_{i}^{α}) - \sum_{i = 1}^{n} \\ \frac{\log^{2} (ζ) e^{\log (ζ) (1 - {(1 - x_{i}^{α})}^{β})} {(1 - x_{i}^{α})}^{β} \log (1 - x_{i}^{α})}{\log (ζ) e^{\log (ζ) (1 - {(1 - x_{i}^{α})}^{β})} + 1} = 0, & (25) \end{array}

\begin{array}{l} \frac{\partial ℓ}{\partial ζ} = - \frac{n}{ζ} + \sum_{i = 1}^{n} \\ \frac{e^{\log (ζ) (1 - {(1 - x_{i}^{α})}^{β})} [1 + \log (ζ) (1 - {(1 - x_{i}^{α})}^{β})]}{ζ (\log (ζ) e^{\log (ζ) (1 - {(1 - x_{i}^{α})}^{β})} + 1)} = 0. & (26) \end{array}

The above three equations are not in closed form. We will use the Newton–Raphson method and R software to solve them and estimate the parameters.

6.2 Verification of the global maximum

To confirm that the maximum likelihood estimates (MLEs) correspond to a global maximum of the log-likelihood function, we examine the second-order behavior of the log-likelihood. Let ℓ(α, β, ζ) denote the log-likelihood function defined in Equation 23. The Hessian matrix of second partial derivatives is

H (α, β, ζ) = (\begin{matrix} \frac{\partial^{2} ℓ}{\partial α^{2}} & \frac{\partial^{2} ℓ}{\partial α \partial β} & \frac{\partial^{2} ℓ}{\partial α \partial ζ} \\ \frac{\partial^{2} ℓ}{\partial β \partial α} & \frac{\partial^{2} ℓ}{\partial β^{2}} & \frac{\partial^{2} ℓ}{\partial β \partial ζ} \\ \frac{\partial^{2} ℓ}{\partial ζ \partial α} & \frac{\partial^{2} ℓ}{\partial ζ \partial β} & \frac{\partial^{2} ℓ}{\partial ζ^{2}} \end{matrix}) .

Analytical expressions of the second derivatives are algebraically complex; therefore, the Hessian is evaluated numerically at the MLEs. For a maximum to occur, the Hessian must be negative definite. This is confirmed by checking that:

• the diagonal elements of the Hessian are negative, and

• all eigenvalues of $H (\hat{α}, \hat{β}, \hat{ζ})$ are strictly negative.

Furthermore, the optimization algorithm was run using several dispersed initial values and consistently converged to the same parameter estimates, indicating that no alternative local maxima exist. Hence, the MLEs obtained for the PNJK model correspond to a unique global maximum of the log-likelihood function.

7 Simulation study

To evaluate the performance of the maximum likelihood estimation procedure for the PNJK distribution, a comprehensive simulation study was conducted. The simulation was designed to examine the behavior of the estimator under different parameter settings and across varying sample sizes. The steps involved in the simulation are summarized as follows:

7.1 Simulation steps

1. Initial parameter values: Three distinct combinations of the parameters (α, β, ζ) were selected as the initial parameter values for the simulation. These combinations were chosen to represent distributions with varying shapes, degrees of skewness, and tail behaviors.

2. Generation of random samples: For each parameter combination, random samples were generated from the PNJKD in R software. Sample sizes of n = 20, 100, and 600 were considered to reflect small, moderate, and large sample scenarios encountered in practical applications.

3. Estimation of parameters: For every simulation run, the maximum likelihood estimates (MLEs) of the parameters were obtained. The variance and mean squared error (MSE) of the MLEs were computed to assess estimation accuracy, stability, and efficiency. The average MLEs, along with their corresponding variances and MSE values across the repeated simulations, were recorded and summarized in Table 1. The parameter combinations (0.4, 0.50, 1.5), (0.70, 0.80, 3.0), and (0.70, 0.80, 2.5) were deliberately chosen to ensure a meaningful and comprehensive assessment of the proposed estimation method. This is achieved by selecting parameter values (α, β) that ensure coverage of diverse distributional shapes, ranging from light-tailed to heavy-tailed patterns with moderate to high skewness, reflecting realistic scenarios often seen in applied studies like reliability modeling, hydrology, and survival analysis. Additionally, by including varying transformation parameter (ζ) values, the assessment thoroughly examines the estimator's sensitivity to the strength of the PNJ transformation. This balanced approach tests the robustness of the estimator across multiple shape and tail behaviors of the distribution.

Table 1

Table 1. MLE, variance, and MSE for the parameters α, β, and ζ.

7.2 Simulation results

The findings reveal that the MLEs closely approximate the initial parameter values across all configurations. Additionally, both variance and MSE decrease as the sample size increases, confirming the consistency and efficiency of the MLE for the PNJKD. These results validate the reliability of the proposed estimation method for practical applications.

8 Application

This section demonstrates the practical applicability of the PNJKD using two real-life datasets. We determine the potential of the proposed model by comparing its performance with that of several other models, including the transmuted Kumaraswamy distribution [17], the Kumaraswamy inverse exponential distribution [18], and the Kumaraswamy distribution [19]. Goodness-of-fit criteria are used, including the Akaike information criterion (AIC), the Bayesian information criterion (BIC), the Akaike information criterion corrected (AICC), and the Hannan–Quinn information criterion (HQIC), -2log-likeihood. The distribution with the lowest AIC, BIC, AICC, and HQIC values is considered the best fit.

The probability density functions of the comparison models are as follows:

• Transmuted Kumaraswamy distribution (TRKD)

\begin{array}{l} f (x; α, β, λ) = α β x^{(α - 1)} {(1 - x^{α})}^{(β - 1)} (1 - λ + 2 λ {(1 - x^{α})}^{β}); \\ α, β > 0, - 1 \leq λ \leq 1, \end{array}

where λ is a transmuted parameter.

• Kumaraswamy inverse exponential distribution (KIED)

f (x; α, β, θ) = \frac{α β θ}{x^{2}} e^{\frac{(- θ α)}{x}} {(1 - e^{\frac{- θ α}{x}})}^{(β - 1)}; α, β, θ > 0,

where θ is a scale parameter.

• Kumaraswamy distribution (KUMD)

The pdf for KUMD is given in Equation 4.

Data set 1: milk production data

The data set shows the measurements of the proportion of total milk production in the first birth of 107 SINDI cows from Cordeiro et al. [11] and has been previously used by Jan and Ahmad [14]. The data set is given as follows,

0.4365, 0.4260, 0.5140, 0.6907, 0.7471, 0.2605, 0.6196, 0.8781, 0.4990, 0.6058, 0.6891, 0.5770, 0.5394,0.1479, 0.2356, 0.6012, 0.1525, 0.5483, 0.6927, 0.7261, 0.3323, 0.0671, 0.2361, 0.4800, 0.5707, 0.7131,0.5853, 0.6768, 0.5350, 0.4151, 0.6789, 0.4576, 0.3259, 0.2303, 0.7687, 0.4371, 0.3383, 0.6114, 0.3480, 0.4564, 0.7804, 0.3406, 0.4823, 0.5912, 0.5744, 0.5481, 0.1131, 0.7290, 0.0168, 0.5529, 0.4530, 0.3891,0.4752, 0.3134, 0.3175, 0.1167, 0.6750, 0.5113, 0.5447, 0.4143, 0.5627, 0.5150, 0.0776, 0.3945, 0.4553,0.4470, 0.5285, 0.5232, 0.6465, 0.0650, 0.8492, 0.8147, 0.3627, 0.3906, 0.4438, 0.4612, 0.3188, 0.2160,0.6707, 0.6220, 0.5629, 0.4675, 0.6844, 0.3413, 0.4332, 0.0854, 0.3821, 0.4694, 0.3635, 0.4111, 0.5349,0.3751, 0.1546, 0.4517, 0.2681, 0.4049, 0.5553, 0.5878, 0.4741, 0.3598, 0.7629, 0.5941, 0.6174, 0.6860,0.0609, 0.6488, 0.2747.

The estimated parameters for the evaluated models are in Table 2.

Table 2

Table 2. MLEs (Standard Error) of PNJKD and competitive models for milk production data.

Data set 2: monthly water capacity trends in Shasta Reservoir, California.

The data has been previously studied by Nadar et al. [20]. The data set is given as follows,

0.338936, 0.768007, 0.431915, 0.843485, 0.759932, 0.787408, 0.724626, 0.849868, 0.757583, 0.695970, 0.811556, 0.842316, 0.785339, 0.828689, 0.783660, 0.580194, 0.815627, 0.430681, 0.847413, 0.742563.

The estimated parameters for the evaluated models are in Table 3.

Table 3

Table 3. MLEs (Standard Error) of PNJKD and competitive models for the monthly water capacity trends dataset.

The PNJKD model performed best for both datasets (Tables 4, 5), achieving the lowest values for all four goodness-of-fit criteria (Table 4). In contrast, the KIED model performed had the highest values for all goodness-of-fit criteria, indicating a significantly worse fit than the other models. The TRKD and KUMD models produced intermediate results, but were outperformed by the PNJKD model.

Table 4

Table 4. Comparison of PNJKD and competitive models for milk production data.

Table 5

Table 5. Comparison of PNJKD and competitive models for the monthly water capacity trends dataset.

The fitted model plots, shown in Figures 6, 7, further support this observation by illustrating that the PNJKD closely aligns with both datasets.

Figure 6

Histogram with density curves fitting data set I, titled “Model fitting for data set I“. Curves represent different models: PNJKD (black), TRKD (green dashed), KIED (blue dashed), GWD (pink dashed), GGD (purple dotted), and KUMD (yellow dashed). X-axis is labeled as x, and Y-axis is labeled as Density. Bars show data distribution, with varying model fit curves overlayed.

Figure 6. Fitted density plots for milk production dataset.

Figure 7

Density plot titled “Model fitting for data set II” with a histogram in the background. Several lines in different colors represent different model fits: black solid line for PNJKD, red dashed for TRKD, green for KIED, blue for GWD, purple for GGD, and yellow for KUMD. The x-axis is labeled “x” and the y-axis “Density“.

Figure 7. Fitted density plots for the monthly water capacity trends dataset.

9 Conclusion

This study introduces the PNJK distribution as a novel extension of the Kumaraswamy distribution, developed using the PNJ approach. Several statistical properties of the proposed model have been investigated, including the survival function, hazard rate function, reverse hazard rate, moments, mean residual life, mean waiting time, Rényi entropy, and moment-generating function. Parameters of the model are estimated using the method of maximum likelihood. A simulation study is conducted to assess the performance of the maximum likelihood estimators (MLEs). The model's fit is further evaluated using various goodness-of-fit criteria. The proposed distribution is shown to be unimodal, symmetric, and negatively skewed. It can accommodate decreasing, increasing, and J-shaped hazard rate behaviors depending on the parameter values. Hence, it is suitable for modeling datasets exhibiting such characteristics. To demonstrate its practical utility, the PNJK distribution is applied to two real-life datasets, where it outperforms other competing models and provides a superior fit. Although the model performs well, the study is limited by the use of a single estimation method, and the consideration of only two real datasets and some analytical expressions require numerical evaluation due to their complexity. Future research may focus on developing alternative estimation methods, extending the model to regression or multivariate frameworks, exploring additional theoretical properties and conducting broader empirical studies to further validate and enhance the applicability of the PNJK distribution.

9.1 Verification of global maximum for the datasets

For both datasets analyzed in this study, the numerical Hessian evaluated at the estimated parameter values was found to be negative definite. Specifically, all eigenvalues of the Hessian matrix were strictly negative, confirming concavity of the log-likelihood surface at the optimum. Additionally, multiple starting values were used for the optimization routine. In every case, the algorithm converged to the same solution, providing strong evidence that the MLEs obtained for each dataset correspond to the global maximum of the log-likelihood function. The eigenvalues and global maximum for the PNJK model are presented in Table 6.

Table 6

Table 6. Numerical verification of global maximum for the PNJK model.

Data availability statement

The original contributions presented in the study are included in the article/supplementary material, further inquiries can be directed to the corresponding author.

Author contributions

MJ: Conceptualization, Formal analysis, Methodology, Writing – original draft, Writing – review & editing. TA: Data curation, Methodology, Validation, Writing – original draft. FC: Data curation, Investigation, Validation, Writing – review & editing. SA: Conceptualization, Supervision, Validation, Writing – review & editing.

Funding

The author(s) declared that financial support was not received for this work and/or its publication.

Acknowledgments

The authors express sincere thanks to the editor and reviewers for their time and consideration.

Conflict of interest

The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declared that generative AI was not used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher's note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

1. Cordeiro GM, Ortega EM, da Cunha DC. The exponentiated generalized class of distributions. J Data Sci. (2013) 11:1–27. doi: 10.6339/JDS.201301_11(1).0001

Crossref Full Text | Google Scholar

2. Musekwa RR, Gabaitiri L, Makubate B. A new technique to generate families of continuous distributions. Rev Colomb Estad. (2024) 47:329–54. doi: 10.15446/rce.v47n2.112245

Crossref Full Text | Google Scholar

3. Mahdavi A, Kundu D. A new method of generating distributions with an application to exponential distribution. Commun Stat Theory Applic. (2015) 46:6543–57. doi: 10.1080/03610926.2015.1130839

Crossref Full Text | Google Scholar

4. Alzaatreh A, Lee C, Famoye F. A new method for generating families of continuous distributions. Metron. (2013) 71:63–79. doi: 10.1007/s40300-013-0007-y

Crossref Full Text | Google Scholar

5. Marshall AW, Olkin I. A new method for adding a parameter to a family of distributions with application to the exponential and Weibull families. Biometrika. (1997) 84:641–52. doi: 10.1093/biomet/84.3.641

Crossref Full Text | Google Scholar

6. Eugene N, Lee C, Famoye F. Beta-normal distribution and its applications. Commun Stat Theory Methods. (2002) 31:497–512. doi: 10.1081/STA-120003130

Crossref Full Text | Google Scholar

7. Nadarajah S, Cordeiro GM, Ortega EM. The Zografos-Balakrishnan-G family of distributions: mathematical properties and applications. Commun Stat Theory Methods. (2015) 44:186–215. doi: 10.1080/03610926.2012.740127

Crossref Full Text | Google Scholar

8. Mudholkar GS, Srivastava DK. Exponentiated Weibull family for analyzing bathtub failure-rate data. IEEE Trans Reliab. (1993) 42:299–302. doi: 10.1109/24.229504

Crossref Full Text | Google Scholar

9. Rasool SU, Lone MA, Ahmad SP. An innovative technique for generating probability distributions: a study on lomax distribution with applications in medical and engineering fields. Ann Data Sci. (2025) 12:439–55. doi: 10.1007/s40745-024-00515-6

Crossref Full Text | Google Scholar

10. Mir AA, Rasool SU, Ahmad S, Bhat A, Jawa TM, Sayed-Ahmed N, et al. A robust framework for probability distribution generation: analyzing structural properties and applications in engineering and medicine. Axioms. (2025) 14:281. doi: 10.3390/axioms14040281

Crossref Full Text | Google Scholar

11. Cordeiro GM, Ortega EM, Nadarajah S. The Kumaraswamy Weibull distribution with application to failure data. J Franklin Inst. (2010) 347:1399–429. doi: 10.1016/j.jfranklin.2010.06.010

Crossref Full Text | Google Scholar

12. Lemonte AJ, Barreto-Souza W, Cordeiro GM. The exponentiated Kumaraswamy distribution and its log-transform. Braz. J. Probab. Stat. (2013) 27:31–53. doi: 10.1214/11-BJPS149

Crossref Full Text | Google Scholar

13. Malik AS, Ahmad S. Generalized inverted kumaraswamy-rayleigh distribution: Properties and application. J Mod Appl Stat Methods. (2024) 23:1–13. doi: 10.56801/Jmasm.V23.i1.15

Crossref Full Text | Google Scholar

14. Jan M, Ahmad SP. A new extension of kumaraswamy distribution for improved data modeling: properties and applications. Reliability. (2024) 19:654–665. doi: 10.24412/1932-2321-2024-480-525-540

Crossref Full Text | Google Scholar

15. Ahad N, Ahmad SP, Reshi J. A novel approach to distribution generation with applications in electrical engineering. Reliability. (2024) 19:525–540. doi: 10.5923/j.statistics.20140402.05

Crossref Full Text | Google Scholar

16. Rényi A. On measures of entropy and information. In: Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, volume 1: Contributions to the Theory of Statistics (University of California Press) (1961). p. 547–562.

Google Scholar

17. Khan MS, King R, Hudson IL. Transmuted kumaraswamy distribution. Stat Trans New Series. (2016) 17:183–210. doi: 10.21307/stattrans-2016-013

Crossref Full Text | Google Scholar

18. Oguntunde P, Babatunde O, Ogunmola A. Theoretical analysis of the Kumaraswamy-inverse exponential distribution. Int J Stat Applic. (2014) 4:113–6. doi: 10.24412/1932-2321-2024-480-654-665

Crossref Full Text | Google Scholar

19. Kumaraswamy P. A generalized probability density function for double-bounded random processes. J Hydrol. (1980) 46:79–88. doi: 10.1016/0022-1694(80)90036-0

Crossref Full Text | Google Scholar

20. Nadar M, Papadopoulos A, Kızılaslan F. Statistical analysis for Kumaraswamy's distribution based on record data. Stat Papers. (2013) 54:355–69. doi: 10.1007/s00362-012-0432-7

Crossref Full Text | Google Scholar

Keywords: cumulative hazard function, hazard rate, likelihood, mean residual life, PNJ transformation

Citation: Jan M, Ahad T, Correa FM and Ahmad SP (2026) Extending the Kumaraswamy distribution: properties and applications. Front. Appl. Math. Stat. 11:1694353. doi: 10.3389/fams.2025.1694353

Received: 28 August 2025; Revised: 17 December 2025;
Accepted: 29 December 2025; Published: 23 January 2026.

Edited by:

Abdulzeid Yen Anafo, University of Mines and Technology, Ghana

Reviewed by:

Benjamin Odoi, University of Mines and Technology, Ghana
Rama Shanker, Assam University, India
Regent Retrospect Musekwa, Botswana International University of Science and Technology, Botswana

Copyright © 2026 Jan, Ahad, Correa and Ahmad. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Mahvish Jan, d2FuaW1haHZpc2hAZ21haWwuY29t

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.