Further aspects of information-generating function of order statistics with health application in symmetry of chronic disease management

Mohamed, Mohamed Said; Al-Labadi, Manal; Almuhur, Eman; Sakr, Hanan H.

doi:10.3389/fams.2026.1733600

ORIGINAL RESEARCH article

Front. Appl. Math. Stat., 02 February 2026

Sec. Statistics and Probability

Volume 12 - 2026 | https://doi.org/10.3389/fams.2026.1733600


Mohamed Said Mohamed¹

Manal Al-Labadi²

Eman Almuhur³

Hanan H. Sakr⁴*

¹Department of Mathematics, College of Science and Humanities, Prince Sattam bin Abdulaziz University, Hawtat Bani Tamim, Saudi Arabia
²Department of Mathematics, Faculty of Arts and Sciences, University of Petra, Amman, Jordan
³Department of Mathematics, Faculty of Science, Applied Science Private University, Amman, Jordan
⁴Department of Management Information Systems, College of Business Administration in Hawtat Bani Tamim, Prince Sattam Bin Abdulaziz University, Hawtat Bani Tamim, Saudi Arabia

This investigation aimed to explore novel theoretical aspects and applications of the information-generating function measure for order statistics. We developed fundamental properties and established stochastic ordering relationships based on this information-theoretic measure. Our analysis demonstrated that when two order statistics share identical information-generating measures, their underlying parent distributions can be uniquely identified. We implemented our proposed measure to characterize the exponential distribution. Moreover, we derived bounds and investigated monotonicity properties for these functional measures. The study further examined how information-generating functions characterize distributional symmetry, with particular applications to uniform and normal distributions for identifying symmetry points of order statistics. Building on these theoretical foundations, we proposed a new symmetry test statistic derived from the information-generating properties of the order statistics. Using comprehensive Monte Carlo simulations, we evaluated the test's statistical power against existing alternatives. The present results demonstrated superior performance across various asymmetric distributional alternatives. The practical utility of our methodology is illustrated through an empirical analysis of chronic disease prevalence data.

1 Introduction and background

Several criteria have been proposed in information theory to gauge a probabilistic model's degree of uncertainty. The most significant information measurement that has been applied in several scientific and technical fields is the Shannon entropy. It started with Shannon's groundbreaking research [1], which examined how systems behaved when characterized by probability density or mass functions (pdf or pmf). Assuming that the variable X^* has a pdf h(x) in the continuous case, the differential entropy, often known as the Shannon entropy, is analogously provided by

\begin{array}{l} \begin{matrix} E n (X^{*}) = - \int_{- \infty}^{\infty} h (x) ln h (x) d x . \end{matrix} & (1) \end{array}

The moment-generating function is a practical tool for obtaining the mean, variance, and other moments of a probability distribution. When the successive moments exist, they can be found by taking successive derivatives of the moment-generating function at zero. In information theory, generating functions for pdfs have likewise been defined to calculate information quantities such as extropy, Kullback-Leibler divergence, and Shannon information. Inspired by moment-generating and probability-generating functions, Golomb [2] proposed the information-generating function of a random variable X^*, which, provided the integral exists, is defined as

\begin{array}{l} \begin{matrix} G E n_{δ} (X^{*}) = E (e^{(δ - 1) ln h (x)}) = \int_{- \infty}^{\infty} h^{δ} (x) d x, \end{matrix} & (2) \end{array}

for any δ > 0. Golomb [2] established the following properties of the information-generating function:

1. $G E n_{1} (X^{*}) = 1$

2. $\frac{\partial}{\partial δ} G E n_{δ} (X^{*}) |_{δ = 1} = - E n (X^{*})$ (the negative of Shannon's entropy in Equation 1).
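As a rough numerical illustration of these two properties, the sketch below approximates $G E n_{δ}$ for an exponential density by a midpoint rule and compares the slope at δ = 1 with the negative Shannon entropy. The rate `theta`, the truncation point `HI`, and the helper name `gen` are assumptions made only for this example; they do not come from the paper.

```python
import math

def gen(pdf, delta, lo, hi, n=200_000):
    """Midpoint-rule approximation of the integral of pdf(x)**delta over [lo, hi]."""
    step = (hi - lo) / n
    return step * sum(pdf(lo + (k + 0.5) * step) ** delta for k in range(n))

theta = 2.0  # hypothetical rate, chosen only for the illustration

def exp_pdf(x):
    return theta * math.exp(-theta * x)

LO, HI = 0.0, 40.0  # assumption: the exponential tail beyond 40 is negligible here

# Property 1: GEn_1 should equal 1.
gen_at_1 = gen(exp_pdf, 1.0, LO, HI)

# Property 2: central difference for d/d(delta) GEn_delta at delta = 1.
eps = 1e-4
slope_at_1 = (gen(exp_pdf, 1 + eps, LO, HI) - gen(exp_pdf, 1 - eps, LO, HI)) / (2 * eps)

# Closed-form Shannon entropy of the exponential law with rate theta.
shannon_entropy = 1.0 - math.log(theta)

print(gen_at_1, slope_at_1, -shannon_entropy)
```

The finite-difference slope and the negative entropy should agree to a few decimal places; the agreement is limited by the step `eps` and by the quadrature, not by the identity itself.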

Because information-generating functions are important in information theory, several authors have recently investigated them. For a list of information-generating functions and their many features and uses, see Kharazmi and Balakrishnan [3–7], Zamani et al. [8], Kharazmi et al. [9], and Kayal and Balakrishnan [10].

Specifically, the information-generating function measure is simplified to $G E n_{2} (X^{*})$ , sometimes referred to as the informative-energy function, when δ = 2. Using the example of kinetic energy in mechanics, Onicescu [11] introduced a discrete version of the informative-energy measurement into information theory. Bhatia [12] provides further information.

In many statistical methodologies, it is commonly assumed that the distribution of the population under study is symmetric. For example, the validity of regression models often hinges on the assumption that the residuals exhibit symmetry. This makes it critical to rigorously assess whether the symmetry assumption holds in practice. Let $S_{X}^{*}$ denote the support of the cumulative distribution function (cdf) H. Assume further that there exists a constant μ^* such that for all $x \in S_{X}^{*}$ , the equation H(μ^*−x)+H(μ^*+x) = 1 is satisfied. When this condition is met, the distribution of X^* is considered symmetric about the point μ^*.

Symmetry is a concept of substantial theoretical and practical importance in both probability and statistics. It underpins many models and inferential procedures and has been explored extensively across various contexts. Researchers have introduced a range of characterizations for symmetric distributions, often using ordered samples such as order statistics, record values, and sequential statistics. For instance, Balakrishnan and Selvitella [13] showed that, for a sample of size m, the distributional identity $X_{i, m} \overset{D I}{=} - X_{m - i + 1, m}$ holds for a fixed i = 1, …, m if and only if the underlying distribution H is symmetric about zero. In this notation, $\overset{D I}{=}$ signifies that the two random variables have identical distributions.

Furthermore, Ahmadi [14] introduced innovative formulations of symmetry for continuous distributions by leveraging the properties of k-record values. Building on this foundation, Mahdizadeh and Zamanzade employed ranked set sampling techniques to construct nonparametric estimators of symmetric distribution functions [15]. Broadly speaking, assessing symmetry often involves developing criteria tailored to its specific structural features. This task is frequently carried out using goodness-of-fit tests, as demonstrated by Dai et al. [16] and Bozin et al. [17].

In this study, we explore several stochastic orderings that are useful for comparing random variables in a meaningful way. Suppose $X_{1}^{*}$ and $X_{2}^{*}$ are two continuous random variables with pdfs h₁ and h₂, and corresponding cdfs H₁ and H₂. Their generalized inverses (also known as left-continuous quantile functions) are defined as $H_{1}^{- 1} (x) = \inf \{v : H_{1} (v) \geq x\}$ and $H_{2}^{- 1} (x) = \inf \{v : H_{2} (v) \geq x\}$ for 0 < x < 1.

Based on these definitions, we say that $X_{1}^{*}$ is smaller than $X_{2}^{*}$ in various stochastic orders if the following conditions hold for all x ≥ 0:

(1) Likelihood Ratio Order ( $X_{1}^{*} \leq^{l r} X_{2}^{*}$ ): This ordering holds if the ratio $\frac{h_{1} (x)}{h_{2} (x)}$ is a decreasing function of x.

(2) Hazard Rate Order ( $X_{1}^{*} \leq^{h r} X_{2}^{*}$ ): This comparison holds if the hazard rate function of $X_{1}^{*}$ is greater than or equal to that of $X_{2}^{*}$ for all x. That is, $Λ_{X_{1}^{*}} (x) \geq Λ_{X_{2}^{*}} (x)$ .

(3) Usual Stochastic Order ( $X_{1}^{*} \leq^{s t} X_{2}^{*}$ ): This relation holds when the survival function of $X_{1}^{*}$ is less than or equal to that of $X_{2}^{*}$ , i.e., ${\bar{H}}_{1} (x) \leq {\bar{H}}_{2} (x)$ .

(4) Super-Additive Order ( $X_{1}^{*} \leq^{s u} X_{2}^{*}$ ): This order applies if the composition $H_{2}^{- 1} (H_{1} (x))$ defines a super-additive function.

(5) Dispersive Order ( $X_{1}^{*} \leq^{d i s p} X_{2}^{*}$ ): This ordering is satisfied if the difference $H_{2}^{- 1} (H_{1} (x)) - x$ increases with x.

Notably, the hazard rate function for a random variable $X_{i}^{*}$ is given by $Λ_{X_{i}^{*}} (v) = \frac{h_{i} (v)}{1 - H_{i} (v)}$ for v ≥ 0, where the survival function is denoted by ${\bar{H}}_{i} (v) = 1 - H_{i} (v)$ for i = 1, 2. For a comprehensive treatment of these stochastic orders and their properties, readers are encouraged to consult Shaked and Shanthikumar [18].
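The five orders listed above can be checked on a simple pair of distributions. The following sketch assumes $X_{1}^{*}$ is exponential with rate 2 and $X_{2}^{*}$ is exponential with rate 1 (an illustrative choice, not an example from the paper) and tests each defining condition on a finite grid, so it is a plausibility check rather than a proof. All function names are hypothetical.

```python
import math

RATE_1, RATE_2 = 2.0, 1.0  # hypothetical rates: X1 ~ Exp(2), X2 ~ Exp(1)

def pdf(rate, x):
    return rate * math.exp(-rate * x)

def survival(rate, x):
    return math.exp(-rate * x)

def quantile(rate, p):
    return -math.log1p(-p) / rate

def psi(x):
    """H2^{-1}(H1(x)), the map used by the super-additive and dispersive orders."""
    return quantile(RATE_2, -math.expm1(-RATE_1 * x))

def is_monotone(values, increasing, tol=1e-9):
    pairs = zip(values, values[1:])
    return all((b >= a - tol) if increasing else (b <= a + tol) for a, b in pairs)

grid = [0.05 * k for k in range(1, 101)]  # x in (0, 5]
coarse = grid[::10]

# (1) likelihood ratio order: h1 / h2 is decreasing.
lr_holds = is_monotone([pdf(RATE_1, x) / pdf(RATE_2, x) for x in grid], increasing=False)
# (2) hazard rate order: hazard of X1 is at least the hazard of X2.
hr_holds = all(
    pdf(RATE_1, x) / survival(RATE_1, x) >= pdf(RATE_2, x) / survival(RATE_2, x) - 1e-9
    for x in grid
)
# (3) usual stochastic order: survival of X1 is at most the survival of X2.
st_holds = all(survival(RATE_1, x) <= survival(RATE_2, x) for x in grid)
# (4) super-additive order: psi(x + y) >= psi(x) + psi(y), up to rounding.
su_holds = all(psi(x + y) >= psi(x) + psi(y) - 1e-6 for x in coarse for y in coarse)
# (5) dispersive order: psi(x) - x is increasing.
disp_holds = is_monotone([psi(x) - x for x in grid], increasing=True)

print(lr_holds, hr_holds, st_holds, su_holds, disp_holds)
```

For this pair, $H_{2}^{- 1} (H_{1} (x)) = 2 x$, so the super-additive condition holds with equality and the tolerance only absorbs floating-point rounding.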

Kharazmi and Balakrishnan [6] explored the information-generating function for ordered random variables, specifically order statistics. In their study, they derived several properties of mixed systems built from independent and identically distributed components. Building on this foundation, we present a comparative analysis of mixed systems using these information metrics.

In a separate study on record values, Zamani et al. [8] investigated comparative outcomes linked to the information-generating (IG) measure. A key finding was that if two upper record value sequences share an identical IG function, the underlying distributions from which they originate must be the same. Their research also offers a rigorous characterization of the exponential distribution, demonstrating that its IG function for record values is either maximized or minimized under specific constraints.

This study aims to further explore the properties of the information-generating function for order statistics and to demonstrate its application in testing for symmetry. The remainder of the paper is structured as follows: Section 2 develops characterizations and examines monotonicity properties using ordered variables. Section 3 investigates stochastic ordering results based on the information-generating function of order statistics and establishes bounds for this measure. Section 4 analyzes the symmetric properties of the information-generating function model for order statistics, proposes a nonparametric test for symmetry, and illustrates the methodology using chronic disease management data.

2 Properties of information-generating function

We first discuss some stochastic ordering results for the information-generating function. Theorem 4.B.2 of Shaked and Shanthikumar [18] gives the following implications:

1. If $X_{1}^{*} \leq^{l r} X_{2}^{*}$ , then $X_{1}^{*} \leq^{h r} X_{2}^{*}$ , which in turn implies $X_{1}^{*} \leq^{s t} X_{2}^{*}$ .

2. If $X_{1}^{*} \leq^{s t} X_{2}^{*}$ and $X_{1}^{*} \leq^{s u} X_{2}^{*}$ , then $X_{1}^{*} \leq^{d i s p} X_{2}^{*}$ .

Lemma 2.1. Assume that $X_{1}^{*} \leq^{d i s p} X_{2}^{*}$ . Then the following inequality holds: $G E n_{δ} (X_{1}^{*}) \geq (\leq) G E n_{δ} (X_{2}^{*})$ for δ ≥ 1 (respectively, 0 < δ ≤ 1).

Proof. Starting from Equation 2, we express the information-generating functional entropy as:

\begin{array}{l} \begin{matrix} G E n_{δ} (X^{*}) = \int_{- \infty}^{\infty} {[h (x)]}^{δ} d x = \int_{0}^{1} {[h (H^{- 1} (v))]}^{δ - 1} d v . \end{matrix} \end{array}

Given that $X_{1}^{*} \leq^{d i s p} X_{2}^{*}$ , it follows that $h_{1} (H_{1}^{- 1} (v)) \geq h_{2} (H_{2}^{- 1} (v))$ holds for every v in the interval (0, 1). Consequently, we derive:

\begin{array}{l} G E n_{δ} (X_{1}^{*}) = \int_{0}^{1} {[h_{1} (H_{1}^{- 1} (v))]}^{δ - 1} d v \geq (\leq) \int_{0}^{1} {[h_{2} (H_{2}^{- 1} (v))]}^{δ - 1} d v = G E n_{δ} (X_{2}^{*}), \end{array}

which confirms the result for δ ≥ 1 (respectively, 0 < δ ≤ 1).
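A small numerical sketch of Lemma 2.1 follows. It assumes two exponential variables, $X_{1}^{*}$ with rate 2 and $X_{2}^{*}$ with rate 1, for which $h (H^{- 1} (v)) = θ (1 - v)$ and $X_{1}^{*} \leq^{d i s p} X_{2}^{*}$ . The helper `gen_from_density_quantile` is a hypothetical name; it evaluates the quantile representation used in the proof with a midpoint rule.

```python
def gen_from_density_quantile(dq, delta, n=100_000):
    """Midpoint approximation of the integral over (0, 1) of dq(v)**(delta - 1),
    where dq(v) = h(H^{-1}(v)) is the density evaluated at the v-quantile."""
    step = 1.0 / n
    return step * sum(dq((k + 0.5) * step) ** (delta - 1) for k in range(n))

# Assumed example: exponential laws, whose density-quantile function is rate * (1 - v).
def dq_x1(v):
    return 2.0 * (1.0 - v)  # X1 ~ Exp(2)

def dq_x2(v):
    return 1.0 * (1.0 - v)  # X2 ~ Exp(1)

results = {
    delta: (gen_from_density_quantile(dq_x1, delta), gen_from_density_quantile(dq_x2, delta))
    for delta in (0.5, 2.0)
}

for delta, (gen_1, gen_2) in results.items():
    relation = ">=" if gen_1 >= gen_2 else "<="
    print(f"delta={delta}: GEn(X1)={gen_1:.4f} {relation} GEn(X2)={gen_2:.4f}")
```

Because the dispersive order gives a pointwise inequality between the two density-quantile functions, the ordering of the two integrals survives the quadrature error even for δ < 1, where the integrand has an integrable singularity at v = 1.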

2.1 Characterizations using order statistics

Let $X_{1}^{*}, \dots, X_{m}^{*}$ be independent and identically distributed random variables with cdf H and pdf h, and let $X_{1, m}^{*} \leq X_{2, m}^{*} \leq \dots \leq X_{m, m}^{*}$ denote the order statistics of the sample. The pdf of the ith order statistic $X_{i, m}^{*}$ , for 1 ≤ i ≤ m, is expressed as:

\begin{array}{l} h_{i, m} (x) = \frac{1}{Δ_{h} (i, m - i + 1)} H^{i - 1} (x) {\bar{H}}^{m - i} (x) h (x), & (3) \end{array}

where the normalizing constant is given by $Δ_{h} (i, m - i + 1) = \frac{Γ (i) Γ (m - i + 1)}{Γ (m + 1)}$ . Therefore, from Equation 2, we can define the information-generating function measure for the ith order statistic $X_{i, m}^{*}$ as:

\begin{array}{l} G E n_{δ} (X_{i, m}^{*}) = \int_{- \infty}^{\infty} h_{i, m}^{δ} (x) d x \\ = {(\frac{1}{Δ_{h} (i, m - i + 1)})}^{δ} \int_{- \infty}^{\infty} H^{δ i - δ} (x) {\bar{H}}^{δ m - δ i} (x) h^{δ} (x) d x, & (4) \end{array}

for any δ > 0, 1 ≤ i ≤ m.

To support the main conclusions of this section, we refer to a corollary derived from the Stone–Weierstrass Theorem, as presented by Aliprantis and Burkinshaw [19]. This yields the following lemma:

Lemma 2.2. Let ζ^* be a continuous function on the interval [0, 1]. If it satisfies the integral condition $\int_{0}^{1} z^{m} ζ^{*} (z) d z = 0$ for all integers m ≥ 0, then it follows that ζ^*(z) = 0 for every z ∈ [0, 1].

The next theorem shows that the characteristics of the information-generating function associated with the order statistic $X_{i, m}^{*}$ uniquely identify the distribution of the parent.

Theorem 2.1. Assume that h₁ and h₂ are two pdfs, with corresponding cdfs H₁ and H₂, for the random variables $X_{1}^{*}$ and $X_{2}^{*}$ , respectively. Fix a value of i, with 1 ≤ i ≤ m, and let δ > 0. Then the following equivalence holds:

X_{1}^{*} \overset{DI}{=} X_{2}^{*} \Leftrightarrow G E n_{δ} (X_{1; i, m}^{*}) = G E n_{δ} (X_{2; i, m}^{*}), \forall m \geq i .

Proof. We only need to establish sufficiency, since necessity is immediate. Assume that

G E n_{δ} (X_{1; i, m}^{*}) = G E n_{δ} (X_{2; i, m}^{*}), \forall m \geq i .

Using Equations 2, 3, 4, this is equivalent to

\begin{array}{l} \int_{- \infty}^{\infty} H_{1}^{δ i - δ} (x) {\bar{H}}_{1}^{δ m - δ i} (x) h_{1}^{δ} (x) d x \\ = \int_{- \infty}^{\infty} H_{2}^{δ i - δ} (x) {\bar{H}}_{2}^{δ m - δ i} (x) h_{2}^{δ} (x) d x . & (5) \end{array}

Step 1: Change of variables. Note that $d {\bar{H}}_{k}^{δ} (x) = - δ {\bar{H}}_{k}^{δ - 1} (x) h_{k} (x) d x$ . Rewriting Equation 5 yields

\begin{array}{l} \int_{- \infty}^{\infty} H_{1}^{δ i - δ} (x) {\bar{H}}_{1}^{δ m - δ i} (x) Λ_{X_{1}^{*}}^{δ - 1} (x) d {\bar{H}}_{1}^{δ} (x) \\ = \int_{- \infty}^{\infty} H_{2}^{δ i - δ} (x) {\bar{H}}_{2}^{δ m - δ i} (x) Λ_{X_{2}^{*}}^{δ - 1} (x) d {\bar{H}}_{2}^{δ} (x), \end{array}

where $Λ_{X_{k}^{*}}^{δ - 1} (x) = h_{k}^{δ - 1} (x) / {\bar{H}}_{k}^{δ - 1} (x)$ .

Let

v = {\bar{H}}_{k}^{δ} (x), k = 1, 2 .

Since ${\bar{H}}_{k}$ is continuous and strictly decreasing, the mapping is bijective and sends x ∈ (−∞, ∞) to v ∈ [0, 1]. The identity becomes

\begin{array}{l} \int_{0}^{1} {(1 - v^{1 / δ})}^{δ i - δ} v^{m - i} Λ_{X_{1}^{*}}^{δ - 1} (H_{1}^{- 1} (1 - v^{1 / δ})) d v \\ = \int_{0}^{1} {(1 - v^{1 / δ})}^{δ i - δ} v^{m - i} Λ_{X_{2}^{*}}^{δ - 1} (H_{2}^{- 1} (1 - v^{1 / δ})) d v . & (6) \end{array}

Step 2: Application of Lemma 2.2. Let

ζ^{*} (v) = Λ_{X_{1}^{*}}^{δ - 1} (H_{1}^{- 1} (1 - v^{1 / δ})) - Λ_{X_{2}^{*}}^{δ - 1} (H_{2}^{- 1} (1 - v^{1 / δ})) .

Equation 6 implies

\int_{0}^{1} {(1 - v^{1 / δ})}^{δ i - δ} ζ^{*} (v) v^{l} d v = 0, \forall l = m - i \geq 0 .

The prefactor ${(1 - v^{1 / δ})}^{δ i - δ}$ is continuous and strictly positive for v ∈ (0, 1); hence the above is equivalent to

\int_{0}^{1} v^{l} ζ^{*} (v) d v = 0, \forall l \geq 0 .

Since ζ^* is continuous, Lemma 2.2 implies

ζ^{*} (v) = 0, \forall v \in [0, 1] .

Therefore,

\begin{array}{l} Λ_{X_{1}^{*}}^{δ - 1} (H_{1}^{- 1} (p)) = Λ_{X_{2}^{*}}^{δ - 1} (H_{2}^{- 1} (p)), \forall p \in [0, 1] . & (7) \end{array}

Step 3: Deduction of equality of the densities at corresponding quantiles. Recall that

Λ_{X_{k}^{*}}^{δ - 1} (x) = \frac{h_{k}^{δ - 1} (x)}{{\bar{H}}_{k}^{δ - 1} (x)} .

Since for the argument $x = H_{k}^{- 1} (p)$ we have ${\bar{H}}_{k} (x) = 1 - p$ , Equation 7 gives

h_{1} (H_{1}^{- 1} (p)) = h_{2} (H_{2}^{- 1} (p)), \forall p \in [0, 1] .

Step 4: Equality of derivatives of inverse cdfs. Using the identity

h_{k} (H_{k}^{- 1} (p)) = \frac{1}{{(H_{k}^{- 1})}^{'} (p)},

we obtain

{(H_{1}^{- 1})}^{'} (p) = {(H_{2}^{- 1})}^{'} (p), \forall p \in (0, 1) .

Integrating over [0, p] yields

H_{1}^{- 1} (p) = H_{2}^{- 1} (p) + C,

for some constant C.

Step 5: Determination of the constant. Both inverse cdfs satisfy

\lim_{p \to 0} H_{k}^{- 1} (p) = \inf \{x : H_{k} (x) > 0\},

which is finite and equal for the two distributions, because equality of information-generating functions implies identical lower-support endpoints. Hence, the limit of the difference is zero, implying C = 0. Thus,

H_{1}^{- 1} (p) = H_{2}^{- 1} (p), \forall p \in [0, 1] .

Therefore, H₁ = H₂, which completes the proof.

Remark 2.1. By taking i = 1 in Theorem 2.1, we have

X_{1}^{*} \overset{DI}{=} X_{2}^{*} \Leftrightarrow G E n_{δ} (X_{1; 1, m}^{*}) = G E n_{δ} (X_{2; 1, m}^{*}), \forall m \geq 1 .

It is well established that the exponential distribution plays a significant role in reliability theory. In what follows, we present a novel characterization of this distribution.

Theorem 2.2. Let the exponential distribution be defined by $\bar{H} (x) = e^{- θ x}$ , where θ > 0 and x > 0. This distribution is uniquely identified by the condition

G E n_{δ} (X_{1, m}^{*}) = m^{δ - 1} G E n_{δ} (X^{*}), \forall m \geq 1 .

Here δ > 0.

Proof. We first verify the forward implication, then prove the converse.

(i) If X^* is exponential, then the IGF identity holds. If $\bar{H} (x) = e^{- θ x}$ (θ > 0), a direct computation using Equations 2, 3 (the expression for GEn_δ of an order statistic and the definition of $Λ_{X^{*}}$ ) yields

G E n_{δ} (X_{1, m}^{*}) = \frac{θ^{δ - 1} m^{δ - 1}}{δ} = m^{δ - 1} (\frac{θ^{δ - 1}}{δ}) = m^{δ - 1} G E n_{δ} (X^{*}),

for every integer m ≥ 1. Thus, the displayed identity holds for the exponential distribution.
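The forward implication can also be checked numerically. The sketch below is an illustration under assumed values (θ = 1.5, δ = 2), not part of the proof: it evaluates Equation 4 with i = 1 after the substitution v = H(x), so that $G E n_{δ} (X_{1, m}^{*}) = m^{δ} \int_{0}^{1} {(1 - v)}^{δ (m - 1)} {[h (H^{- 1} (v))]}^{δ - 1} d v$ . A uniform parent on (0, 1) is included as an assumed counterexample for which the identity fails. The names `gen_minimum` and `identity_gap` are hypothetical.

```python
def gen_minimum(dq, delta, m, n=20_000):
    """GEn_delta of the sample minimum X_{1,m}, via Equation 4 with i = 1 and v = H(x).
    dq(v) = h(H^{-1}(v)) is the density evaluated at the v-quantile."""
    step = 1.0 / n
    total = 0.0
    for k in range(n):
        v = (k + 0.5) * step
        total += (1.0 - v) ** (delta * (m - 1)) * dq(v) ** (delta - 1)
    return m ** delta * step * total

DELTA = 2.0  # assumed order of the information-generating function
THETA = 1.5  # assumed exponential rate

def dq_exponential(v):
    return THETA * (1.0 - v)  # exponential parent: h(H^{-1}(v)) = theta * (1 - v)

def dq_uniform(v):
    return 1.0  # uniform(0, 1) parent: constant density

def identity_gap(dq, m):
    """GEn(X_{1,m}) - m**(delta - 1) * GEn(X); zero when the identity of Theorem 2.2 holds."""
    return gen_minimum(dq, DELTA, m) - m ** (DELTA - 1) * gen_minimum(dq, DELTA, 1)

exponential_gaps = [identity_gap(dq_exponential, m) for m in range(1, 6)]
uniform_gaps = [identity_gap(dq_uniform, m) for m in range(1, 6)]

print("exponential:", [round(g, 8) for g in exponential_gaps])
print("uniform:    ", [round(g, 8) for g in uniform_gaps])
```

The exponential gaps vanish up to quadrature error, while the uniform gaps are nonzero for m ≥ 2, which is consistent with the characterization.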

(ii) Converse: the IGF identity implies an exponential parent.

Assume

G E n_{δ} (X_{1, m}^{*}) = m^{δ - 1} G E n_{δ} (X^{*}), \forall m \geq 1 .

Using the integral representations in Equations 2, 3, this equality can be written as

\int_{- \infty}^{\infty} m^{δ} {\bar{H}}^{δ m - δ} (x) h^{δ} (x) d x = m^{δ - 1} \int_{- \infty}^{\infty} h^{δ} (x) d x, \forall m \geq 1 .

Bring all terms to one side and perform the change of variable

v = {\bar{H}}^{δ} (x), v \in [0, 1] .

As in the proof of Theorem 2.1, this substitution is admissible because $\bar{H}$ is continuous and monotone on the support, and it yields, for every integer m ≥ 1,

\int_{0}^{1} [\frac{1}{δ} Λ_{X^{*}}^{δ - 1} (H^{- 1} (1 - v^{1 / δ})) - G E n_{δ} (X^{*})] v^{m - 1} d v = 0 .

Define the continuous function on [0, 1]

ζ (v) : = \frac{1}{δ} Λ_{X^{*}}^{δ - 1} (H^{- 1} (1 - v^{1 / δ})) - G E n_{δ} (X^{*}) .

The previous displayed family of equalities says that $\int_{0}^{1} ζ (v) v^{m - 1} d v = 0$ for every integer m ≥ 1. Reindex by letting l = m−1 (so l ≥ 0) and apply Lemma 2.2; we conclude ζ(v)≡0 on [0, 1]. Hence

Λ_{X^{*}}^{δ - 1} (H^{- 1} (1 - v^{1 / δ})) = δ G E n_{δ} (X^{*}) for all v \in [0, 1] .

Equivalently, with $p := 1 - v^{1 / δ} \in [0, 1]$ ,

Λ_{X^{*}} (H^{- 1} (p)) = C for all p \in [0, 1],

where $C := {(δ G E n_{δ} (X^{*}))}^{1 / (δ - 1)}$ is a positive constant. Thus the composed function $Λ_{X^{*}} \circ H^{- 1}$ is constant on [0, 1], and therefore

Λ_{X^{*}} (x) = C for all x in the (interior of the) support.

(iii) From constant $Λ_{X^{*}}$ to constant hazard (and hence exponential).

Recall from Section 1 that $Λ_{X^{*}}$ is the hazard rate of X^*, that is,

Λ_{X^{*}} (x) = λ (x) : = \frac{h (x)}{\bar{H} (x)} .

Hence $Λ_{X^{*}} (x) = C$ for all x in the support means that the hazard rate itself is constant. Denote θ: = C > 0. Therefore,

λ (x) = θ, x in the support .

A distribution with constant hazard λ(x)≡θ has survival function

\bar{H} (x) = exp (- \int_{0}^{x} λ (t) d t) = exp (- θ x) .

Thus, H is the exponential distribution with rate θ. Substituting $C = (δ G E n_{δ} (X^{*}))^{1 / (δ - 1)}$ gives $θ = (δ G E n_{δ} (X^{*}))^{1 / (δ - 1)}$ , which agrees with the value $G E n_{δ} (X^{*}) = θ^{δ - 1} / δ$ obtained in part (i). This completes the proof.

2.2 Monotonicity properties

Ebrahimi et al. [20], Zamani et al. [8], and other related studies have examined the monotonic behavior of information measures for ordered variables. This section covers the monotonicity properties of the information-generating function of order δ for order statistics.

Lemma 2.3. (Adapted from Shaked and Shanthikumar [18]) Let $X_{i, m_{1}}^{*}$ and $X_{j, m_{2}}^{*}$ be the ith and jth order statistics of independent samples of sizes m₁ and m₂, respectively, drawn from a distribution H with a monotone non-increasing failure rate. Then,

X_{i, m_{1}}^{*} \leq^{d i s p} X_{j, m_{2}}^{*} whenever i \leq j and m_{1} - i \geq m_{2} - j .

An immediate consequence of Lemma 2.3 is that if $X_{1}^{*}, X_{2}^{*}, \dots, X_{m}^{*}$ are independent and identically distributed observations from a monotone non-increasing failure rate distribution, then for any i = 1, …, m, it holds that

\begin{array}{l} X_{i, m + 1}^{*} \leq^{disp} X_{i, m}^{*} \leq^{disp} X_{i + 1, m + 1}^{*} . & (8) \end{array}

Utilizing this result alongside Lemma 2.3, we can now establish the following theorem.

Theorem 2.3. (1) Suppose X^* follows a distribution with a monotone non-increasing failure rate. Then for a fixed index i satisfying 1 ≤ i ≤ m, the information-generating function $G E n_{δ} (X_{i, m}^{*})$ increases with m.

(2) Under the same distributional assumption, for a fixed sample size m with m≥i ≥ 1, $G E n_{δ} (X_{i, m}^{*})$ decreases as i increases.

Here δ ∈ ℕ⁺.

Proof. The result follows by applying Lemma 2.1 to the dispersive orderings given in Lemma 2.3 and Equation 8.

Let us recall that a random variable X^* is said to have an increasing reversed hazard rate if the function ${\tilde{Λ}}_{X^{*}} (x) = h (x) / H (x)$ is non-decreasing in x. Under this alternative assumption, we now present the reversed implications of Theorem 2.3.

Theorem 2.4. (1) If X^* has an increasing reversed hazard rate, then for a fixed i within 1 ≤ i ≤ m, the quantity $G E n_{δ} (X_{i, m}^{*})$ decreases with increasing m.

(2) Under the same condition, if m is fixed and m≥i ≥ 1, then $G E n_{δ} (X_{i, m}^{*})$ increases as i becomes larger.

Here δ ∈ ℕ⁺.

Proof. According to Equations 2, 3, we have

\begin{array}{l} \frac{G E n_{δ} (X_{i, m}^{*})}{G E n_{δ} (X_{i, m + 1}^{*})} = {(\frac{m - i + 1}{m + 1})}^{δ} \frac{\int_{- \infty}^{\infty} H^{δ i - δ} (x) {\bar{H}}^{δ m - δ i} (x) h^{δ} (x) d x}{\int_{- \infty}^{\infty} H^{δ i - δ} (x) {\bar{H}}^{δ m + δ - δ i} (x) h^{δ} (x) d x} \\ = J (m; i; δ) \frac{\int_{0}^{1} \frac{1}{Δ_{h} (δ i, δ m - δ i + 1)} v^{δ i - 1} {(1 - v)}^{δ m - δ i} {\tilde{Λ}}_{X^{*}}^{δ - 1} (H^{- 1} (v)) d v}{\int_{0}^{1} \frac{1}{Δ_{h} (δ i, δ m - δ i + δ + 1)} v^{δ i - 1} {(1 - v)}^{δ m - δ i + δ} {\tilde{Λ}}_{X^{*}}^{δ - 1} (H^{- 1} (v)) d v}, & (9) \end{array}

where

J (m; i; δ) = {(\frac{m - i + 1}{m + 1})}^{δ} \cdot \frac{Γ (δ (m - i) + 1) Γ (δ (m + 1) + 1)}{Γ (δ m + 1) Γ (δ (m - i + 1) + 1)} .

We introduce t = m−i to simplify the notation J(m; i; δ). The gamma function terms can be rewritten using the property of the gamma function for shifted arguments:

\frac{Γ (δ m + δ + 1)}{Γ (δ m + 1)} \cdot \frac{Γ (δ t + 1)}{Γ (δ t + δ + 1)}

This can be expressed as a product of terms:

(δ m + 1) (δ m + 2) \dots (δ m + δ) \cdot \frac{1}{(δ t + 1) (δ t + 2) \dots (δ t + δ)}

Combining these products with the initial term ${(\frac{m - i + 1}{m + 1})}^{δ}$ , we get a product over k from 1 to δ:

\begin{array}{l} J (m; i; δ) = \prod_{k = 1}^{δ} (\frac{(m - i + 1) (δ m + k)}{(m + 1) (δ (m - i) + k)}), & (10) \end{array}

where δ ∈ ℕ⁺, 1 ≤ i ≤ m. Substituting from Equation 10 in Equation 9, we obtain

\begin{array}{c} \frac{G E n_{δ} (X_{i, m}^{*})}{G E n_{δ} (X_{i, m + 1}^{*})} = J (m; i; δ) \frac{\begin{matrix} \int_{0}^{1} \frac{1}{Δ_{h} (δ i, δ m - δ i + 1)} \\ v^{δ i - 1} {(1 - v)}^{δ m - δ i} {\tilde{Λ}}_{X^{*}}^{δ - 1} (H^{- 1} (v)) d v \end{matrix}}{\begin{matrix} \int_{0}^{1} \frac{1}{Δ_{h} (δ i, δ m - δ i + δ + 1)} \\ v^{δ i - 1} {(1 - v)}^{δ m - δ i + δ} {\tilde{Λ}}_{X^{*}}^{δ - 1} (H^{- 1} (v)) d v \end{matrix}} \\ \geq \frac{𝔼 [{\tilde{Λ}}_{X^{*}}^{δ - 1} (H^{- 1} (W_{δ i, δ m}^{*}))]}{𝔼 [{\tilde{Λ}}_{X^{*}}^{δ - 1} (H^{- 1} (W_{δ i, δ m + δ}^{*}))]}, & (11) \end{array}

where $W_{i, m}^{*}$ represents the ith order statistic of a sample of size m drawn from the uniform distribution on [0, 1]. The corresponding pdf is given by $h_{i, m} (w) = \frac{1}{Δ_{h} (i, m - i + 1)} w^{i - 1} {(1 - w)}^{m - i}$ for w ∈ [0, 1], and i = 1, 2, …, m. Theorem 1.B.28 of Shaked and Shanthikumar [18] states that $W_{δ i, δ m}^{*} \geq^{h r} W_{δ i, δ m + δ}^{*}$ , which in turn implies $W_{δ i, δ m}^{*} \geq^{s t} W_{δ i, δ m + δ}^{*}$ . Since X^* has an increasing reversed hazard rate and δ ∈ ℕ⁺, the map $v \mapsto {\tilde{Λ}}_{X^{*}}^{δ - 1} (H^{- 1} (v))$ is non-decreasing, which leads to the inequality:

\begin{array}{l} 𝔼 [{\tilde{Λ}}_{X^{*}}^{δ - 1} (H^{- 1} (W_{δ i, δ m}^{*}))] \geq 𝔼 [{\tilde{Λ}}_{X^{*}}^{δ - 1} (H^{- 1} (W_{δ i, δ m + δ}^{*}))], \end{array}

which in turn implies that $\frac{G E n_{δ} (X_{i, m}^{*})}{G E n_{δ} (X_{i, m + 1}^{*})} \geq 1$ . Similarly, for Part (2), we have

\begin{array}{c} \frac{G E n_{δ} (X_{i, m}^{*})}{G E n_{δ} (X_{i + 1, m}^{*})} = J^{*} (m; i; δ) \frac{\begin{matrix} \int_{0}^{1} \frac{1}{Δ_{h} (δ i, δ m - δ i + 1)} \\ v^{δ i - 1} {(1 - v)}^{δ m - δ i} {\tilde{Λ}}_{X^{*}}^{δ - 1} (H^{- 1} (v)) d v \end{matrix}}{\begin{matrix} \int_{0}^{1} \frac{1}{Δ_{h} (δ i + δ, δ m - δ i - δ + 1)} \\ v^{δ i + δ - 1} {(1 - v)}^{δ m - δ i - δ} {\tilde{Λ}}_{X^{*}}^{δ - 1} (H^{- 1} (v)) d v \end{matrix}} \\ \leq \frac{𝔼 [{\tilde{Λ}}_{X^{*}}^{δ - 1} (H^{- 1} (W_{δ i, δ m}^{*}))]}{𝔼 [{\tilde{Λ}}_{X^{*}}^{δ - 1} (H^{- 1} (W_{δ i + δ, δ m}^{*}))]}, & (12) \end{array}

where

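\begin{array}{l} J^{*} (m; i; δ) = {(\frac{i}{m - i})}^{δ} \cdot \frac{Γ (δ i) Γ (δ (m - i) + 1)}{Γ (δ i + δ) Γ (δ (m - i) - δ + 1)} = \prod_{k = 1}^{δ} (\frac{i (δ (m - i - 1) + k)}{(m - i) (δ i + k - 1)}), \end{array}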

where δ ∈ ℕ⁺, m≥i ≥ 1. Thus, $\frac{G E n_{δ} (X_{i, m}^{*})}{G E n_{δ} (X_{i + 1, m}^{*})} \leq 1$ , with noting that $W_{δ i, δ m}^{*} \leq^{s t} W_{δ i + δ, δ m}^{*}$ .
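The constants appearing in this proof can be sanity-checked numerically. The sketch below computes J(m; i; δ) in two ways: as the ratio of normalizing constants in Equation 9 and as the finite product in Equation 10, and it confirms that J ≥ 1 on a small grid. It also evaluates the Part (2) constant directly from the ratio of Beta-function constants in Equation 12 (rather than from any product formula) and checks that it does not exceed 1. Function names and the grid of (m, i, δ) values are assumptions made for the illustration.

```python
import math

def log_beta(a, b):
    """Logarithm of the Beta function, i.e. of the constant written Delta_h(a, b) in the text."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def j_from_beta(m, i, delta):
    """J(m; i; delta) as the ratio of normalizing constants in Equation 9."""
    lead = delta * (math.log(m - i + 1) - math.log(m + 1))
    ratio = log_beta(delta * i, delta * (m - i) + 1) - log_beta(delta * i, delta * (m - i) + delta + 1)
    return math.exp(lead + ratio)

def j_from_product(m, i, delta):
    """J(m; i; delta) from the finite product in Equation 10 (integer delta)."""
    value = 1.0
    for k in range(1, delta + 1):
        value *= (m - i + 1) * (delta * m + k) / ((m + 1) * (delta * (m - i) + k))
    return value

def j_star_from_beta(m, i, delta):
    """Part (2) constant for X_{i,m} versus X_{i+1,m}; requires i < m."""
    lead = delta * (math.log(i) - math.log(m - i))
    ratio = log_beta(delta * i, delta * (m - i) + 1) - log_beta(delta * i + delta, delta * (m - i) - delta + 1)
    return math.exp(lead + ratio)

checks = []
for delta in (1, 2, 3):
    for m in range(1, 13):
        for i in range(1, m + 1):
            j_beta, j_prod = j_from_beta(m, i, delta), j_from_product(m, i, delta)
            j_star = j_star_from_beta(m, i, delta) if i < m else None
            checks.append((delta, m, i, j_beta, j_prod, j_star))

print(all(abs(c[3] - c[4]) <= 1e-9 * c[4] for c in checks))
print(all(c[4] >= 1 - 1e-12 for c in checks))
print(all(c[5] <= 1 + 1e-9 for c in checks if c[5] is not None))
```

For δ = 1 both constants reduce to 1, as they must, because $G E n_{1}$ equals 1 for every distribution.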

Theorem 2.5. (1) If X^* has a decreasing reversed hazard rate, then for a fixed i within 1 ≤ i ≤ m, the quantity $G E n_{δ} (X_{i, m}^{*})$ increases with increasing m.

(2) Under the same condition, if m is fixed and m≥i ≥ 1, then $G E n_{δ} (X_{i, m}^{*})$ decreases as i becomes larger.

Here δ ∈ ℕ⁺.

Proof. The steps are similar to those in the proof of the previous theorem.

Recall that the Pareto distribution with cdf $H (x) = 1 - x^{- α}$ , x ≥ 1, α > 0, has a decreasing reversed hazard rate. For this distribution and δ = 2, 3, Figures 1, 2 show the information-generating function of $X_{i, m}^{*}$ as m increases and as i increases, respectively, illustrating the monotonicity properties of Theorem 2.5 when δ ∈ ℕ⁺.

[Figure 1: two panels, titled “Pareto distribution, δ=2” and “Pareto distribution, δ=3”, each plotting $G E n_{δ} (X_{i, m}^{*})$ against m; both curves increase, more steeply for δ = 3.]

Figure 1. Information-generating function of $X_{4, m}^{*}$ for the Pareto distribution (with parameter α = 2), with increasing m and δ = 2, 3.

[Figure 2: two panels, for δ = 2 and δ = 3, with horizontal axes ranging from 0 to 60; both curves decrease, more sharply for δ = 3.]

Figure 2. Information-generating function of $X_{i, 60}^{*}$ for a Pareto distribution (with parameter α = 2), with increasing i and δ = 2, 3.
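The qualitative behavior shown in Figures 1, 2 can be reproduced with a short computation. This is a sketch under the assumption that the figures use the Pareto cdf $1 - x^{- α}$ with α = 2, as stated in the captions; the closed form below is derived here from Equation 4 with v = H(x) and $h (H^{- 1} (v)) = α {(1 - v)}^{(α + 1) / α}$ , and is not taken from the paper. The function names are hypothetical.

```python
import math

def log_beta(a, b):
    """Logarithm of the Beta function, written Delta_h(a, b) in the text."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def gen_pareto_order_stat(i, m, delta, alpha):
    """GEn_delta(X_{i,m}) for the Pareto cdf 1 - x**(-alpha), x >= 1.

    After v = H(x), Equation 4 becomes a Beta integral because
    h(H^{-1}(v)) = alpha * (1 - v)**((alpha + 1) / alpha).
    """
    tail_power = (delta - 1) * (alpha + 1) / alpha
    log_value = (
        (delta - 1) * math.log(alpha)
        + log_beta(delta * (i - 1) + 1, delta * (m - i) + tail_power + 1)
        - delta * log_beta(i, m - i + 1)
    )
    return math.exp(log_value)

ALPHA = 2.0  # parameter quoted in the figure captions

for delta in (2, 3):
    # Setting of Figure 1: fixed i = 4, increasing sample size m.
    by_m = [gen_pareto_order_stat(4, m, delta, ALPHA) for m in range(4, 61)]
    # Setting of Figure 2: fixed m = 60, increasing index i.
    by_i = [gen_pareto_order_stat(i, 60, delta, ALPHA) for i in range(1, 61)]
    increasing_in_m = all(a < b for a, b in zip(by_m, by_m[1:]))
    decreasing_in_i = all(a > b for a, b in zip(by_i, by_i[1:]))
    print(delta, increasing_in_m, decreasing_in_i)
```

Working on the log scale avoids underflow in the Beta constants when m is as large as 60.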

3 Ordering outcomes based on the information-generating function of order statistics

In this section, we present some stochastic comparison results for the information-generating function measure of order statistics. The information-generating function of order statistics can be rewritten as in the following lemma.

Lemma 3.1. The information-generating function measure of the ith order statistic, $X_{i, m}^{*}$ , can be written as

\begin{array}{l} G E n_{δ} (X_{i, m}^{*}) = \frac{Δ_{h} (δ i - δ + 1, δ m - δ i + 1)}{{(Δ_{h} (i, m - i + 1))}^{δ}} 𝔼 [h^{δ - 1} (H^{- 1} (V^{*}))], & (13) \end{array}

where the random variable V^* has the pdf

\begin{array}{l} h_{V^{*}} (v) = \frac{1}{Δ_{h} (δ i - δ + 1, δ m - δ i + 1)} v^{δ i - δ} {(1 - v)}^{δ m - δ i}, & (14) \end{array}

v ∈ [0, 1].

Proof. From Equations 2, 3, and making use of the transformation v = H(x), we can express the information-generating function measure of the ith order statistic, $X_{i, m}^{*}$ , as

\begin{array}{l} G E n_{δ} (X_{i, m}^{*}) = \frac{Δ_{h} (δ i - δ + 1, δ m - δ i + 1)}{{(Δ_{h} (i, m - i + 1))}^{δ}} \int_{0}^{1} \frac{v^{δ i - δ} {(1 - v)}^{δ m - δ i}}{Δ_{h} (δ i - δ + 1, δ m - δ i + 1)} h^{δ - 1} (H^{- 1} (v)) d v, \end{array}

and the result follows.
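The representation in Lemma 3.1 can be compared with a direct evaluation of Equation 4. The sketch below does this for an exponential parent under assumed values (θ = 1.5, i = 2, m = 5, δ = 2.5). It reads the weight $v^{δ i - δ} {(1 - v)}^{δ m - δ i}$ as a Beta kernel with parameters (δi − δ + 1, δm − δi + 1), which is an assumption of this sketch, and uses $h (H^{- 1} (v)) = θ (1 - v)$ to obtain the expectation in closed form. The function names and the truncation point are hypothetical choices.

```python
import math

def log_beta(a, b):
    """Logarithm of the Beta function, written Delta_h(a, b) in the text."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

THETA, I, M, DELTA = 1.5, 2, 5, 2.5  # assumed illustration values

def order_stat_pdf(x):
    """Equation 3 for an exponential parent with rate THETA."""
    cdf = -math.expm1(-THETA * x)
    survival = math.exp(-THETA * x)
    parent_pdf = THETA * survival
    return math.exp(-log_beta(I, M - I + 1)) * cdf ** (I - 1) * survival ** (M - I) * parent_pdf

def direct_gen(upper=40.0, n=200_000):
    """Midpoint-rule evaluation of the integral of h_{i,m}(x)**DELTA over [0, upper]."""
    step = upper / n
    return step * sum(order_stat_pdf((k + 0.5) * step) ** DELTA for k in range(n))

def beta_form_gen():
    """Beta-weighted form: constant times E[h^(DELTA-1)(H^{-1}(V))], V Beta-distributed."""
    a = DELTA * (I - 1) + 1  # first Beta parameter of the kernel
    b = DELTA * (M - I) + 1  # second Beta parameter of the kernel
    # For the exponential parent, h(H^{-1}(v)) = THETA * (1 - v), so the
    # expectation is THETA**(DELTA-1) * B(a, b + DELTA - 1) / B(a, b).
    log_expectation = (DELTA - 1) * math.log(THETA) + log_beta(a, b + DELTA - 1) - log_beta(a, b)
    log_constant = log_beta(a, b) - DELTA * log_beta(I, M - I + 1)
    return math.exp(log_constant + log_expectation)

direct_value = direct_gen()
beta_value = beta_form_gen()
print(direct_value, beta_value)
```

The two values should agree up to quadrature error; a non-integer δ is used deliberately, since the representation does not require δ ∈ ℕ⁺.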

The impact of monotonic transformations on the information-generating function measure of order statistics is examined in the following theorem.

Theorem 3.1. Assume that φ is a strictly increasing function satisfying φ(−∞) = 0 and φ(∞) = ∞. Then, for the ith order statistic of the transformed random variable Y^* = φ(X^*), the information-generating function measure is expressed as

G E n_{δ} (Y_{i, m}^{*}) = \frac{Δ_{h} (δ i - δ + 1, δ m - δ i + 1)}{{(Δ_{h} (i, m - i + 1))}^{δ}} 𝔼 [{(\frac{h (H^{- 1} (V^{*}))}{φ^{'} (H^{- 1} (V^{*}))})}^{δ - 1}],

where V^* denotes a random variable whose pdf is defined in Equation 14, that is, proportional to $v^{δ i - δ} {(1 - v)}^{δ m - δ i}$ on [0, 1].

Proof. Given the transformation Y^* = φ(X^*), the cdf and pdf of Y^* become F(y) = H(φ⁻¹(y)) and $f (y) = \frac{h (φ^{- 1} (y))}{φ^{'} (φ^{- 1} (y))}$ , respectively. Using the definition of the information-generating function for the ith order statistic, we have

\begin{array}{l} G E n_{δ} (Y_{i, m}^{*}) = \frac{1}{{(Δ_{h} (i, m - i + 1))}^{δ}} \int_{- \infty}^{\infty} H^{δ i - δ} (φ^{- 1} (y)) {\bar{H}}^{δ m - δ i} (φ^{- 1} (y)) {[\frac{h (φ^{- 1} (y))}{φ^{'} (φ^{- 1} (y))}]}^{δ} d y . \end{array}

Next, applying the change of variables x = φ⁻¹(y), followed by v = H(x), we obtain

\begin{array}{l} G E n_{δ} (Y_{i, m}^{*}) = \frac{1}{{(Δ_{h} (i, m - i + 1))}^{δ}} \int_{- \infty}^{\infty} H^{δ i - δ} (x) {\bar{H}}^{δ m - δ i} (x) [\frac{h^{δ} (x)}{{(φ^{'} (x))}^{δ - 1}}] d x \\ = \frac{Δ_{h} (δ i - δ + 1, δ m - δ i + 1)}{{(Δ_{h} (i, m - i + 1))}^{δ}} 𝔼 [{(\frac{h (H^{- 1} (V^{*}))}{φ^{'} (H^{- 1} (V^{*}))})}^{δ - 1}] . \end{array}

Theorem 3.2. Let X^* be a random variable with pdf h, and let φ be a strictly increasing and convex function satisfying φ(0) = 0 and φ(x) → ∞ as x → ∞. Assume further that φ′(x) exists, is non-decreasing, and fulfills the condition φ′(0) ≥ 1. Then:

(1) If δ ≥ 1, then

G E n_{δ} (φ (X_{i, m}^{*})) \leq G E n_{δ} (X_{i, m}^{*}) .

(2) If 0 < δ ≤ 1, then

G E n_{δ} (φ (X_{i, m}^{*})) \geq G E n_{δ} (X_{i, m}^{*}) .

Proof. Since φ is convex and strictly increasing, its derivative φ′(x) is non-decreasing and satisfies

φ^{'} (x) \geq φ^{'} (0) \geq 1, \forall x \geq 0 .

Let Y = φ(X^*). By a standard change-of-variable argument, the pdf of Y is given by

h_{Y} (φ (x)) = \frac{h (x)}{φ^{'} (x)} \leq h (x),

because φ′(x) ≥ 1.

Equation 2.3 gives the IGF representation

G E n_{δ} (X^{*}) = 𝔼 [Λ_{X^{*}}^{δ - 1} (H^{- 1} (V^{*}))],

and therefore

Λ_{Y} (φ (x)) = \frac{h (x)}{φ^{'} (x)} \leq h (x) = Λ_{X^{*}} (x) .

When δ ≥ 1, the function $u \mapsto u^{δ - 1}$ is increasing, which implies

Λ_{Y}^{δ - 1} (φ (x)) \leq Λ_{X^{*}}^{δ - 1} (x) .

For 0 < δ ≤ 1, the same function is decreasing, hence

Λ_{Y}^{δ - 1} (φ (x)) \geq Λ_{X^{*}}^{δ - 1} (x) .

Lemma 3.1 together with Theorem 3.1 ensures that these inequalities carry over to the IGF evaluated at the order statistic $X_{i, m}^{*}$ . Consequently:

- If δ ≥ 1, then

G E n_{δ} (φ (X_{i, m}^{*})) \leq G E n_{δ} (X_{i, m}^{*}) .

- If 0 < δ ≤ 1, then

G E n_{δ} (φ (X_{i, m}^{*})) \geq G E n_{δ} (X_{i, m}^{*}) .

This completes the proof.
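The inequalities of Theorem 3.2 can be checked numerically. The following is a minimal sketch, not part of the original analysis: it assumes X^* follows a standard exponential distribution, takes φ(x) = e^x − 1 (convex, with φ(0) = 0 and φ′(0) = 1), and reads Δ_h(·, ·) as the beta function. The helper name `gen_order_stat` and the midpoint quadrature are illustrative choices.

```python
import math

def gen_order_stat(i, m, delta, dens_quantile, n=20000):
    """Midpoint-rule approximation of GEn_delta(X_{i,m}).

    dens_quantile(v) must return h(H^{-1}(v)) for 0 < v < 1.
    """
    # log of the beta function B(i, m - i + 1)
    log_beta = math.lgamma(i) + math.lgamma(m - i + 1) - math.lgamma(m + 1)
    total = 0.0
    for k in range(n):
        v = (k + 0.5) / n
        total += (v ** (delta * (i - 1))
                  * (1.0 - v) ** (delta * (m - i))
                  * dens_quantile(v) ** (delta - 1))
    return (total / n) * math.exp(-delta * log_beta)

def exp_dq(v):
    # X ~ Exp(1): H^{-1}(v) = -ln(1 - v), hence h(H^{-1}(v)) = 1 - v.
    return 1.0 - v

def transformed_dq(v):
    # Y = phi(X), phi(x) = e^x - 1, phi'(H^{-1}(v)) = 1 / (1 - v),
    # so h / phi' evaluated at H^{-1}(v) equals (1 - v)^2.
    return (1.0 - v) ** 2

for delta in (0.5, 2.0, 3.0):
    gx = gen_order_stat(2, 5, delta, exp_dq)
    gy = gen_order_stat(2, 5, delta, transformed_dq)
    print(delta, gx, gy)
```

Under these assumptions the transformed value should not exceed the original one for δ ≥ 1, and the order should reverse for 0 < δ ≤ 1, because the transformed density quantile is pointwise no larger than the original one.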

Remark 3.1. The additional requirement φ′(0) ≥ 1 is not intended to restrict the class of admissible convex transformations. Its role is to ensure that the map φ does not locally contract the distribution near the origin. Since the IGF involves powers of the hazard function, such a contraction would reverse the direction of the inequalities in Theorem 3.2. The condition φ′(0) ≥ 1 is therefore a convenient and sufficient way to guarantee that

Λ_{φ (X^{*})} (φ (x)) = \frac{h (x)}{φ^{'} (x)} \leq h (x) = Λ_{X^{*}} (x),

which is the key step in applying Lemma 3.1 and Theorem 3.1. We note that this assumption may be relaxed to φ′(x) ≥ 1 on a neighborhood of the origin, without altering the main results. In this sense, the condition is mild and does not significantly reduce the applicability of the theorem.

We next compare the information-generating function measures of the ith order statistics of two continuous random variables. Theorem 3.B.26 of Shaked and Shanthikumar [18] states that if $X_{1}^{*} \leq^{d i s p} X_{2}^{*}$, then $X_{1; i, m}^{*} \leq^{d i s p} X_{2; i, m}^{*}$ for i = 1, 2, ..., m. Combining this with Lemma 2.1 yields the following result.

Proposition 3.1. Assume that $X_{1}^{*} \leq^{d i s p} X_{2}^{*}$ . Then, it holds that $G E n_{δ} (X_{1; i, m}^{*}) \geq (\leq) G E n_{δ} (X_{2; i, m}^{*})$ for δ ≥ 1 (respectively, 0 < δ ≤ 1).

Proof. From Lemma 2.1 and Equation 13, let $X_{1}^{*} \leq^{d i s p} X_{2}^{*}$ . Then, for any δ ≥ 1 (0 < δ ≤ 1), we have

\begin{array}{l} 𝔼 [h_{1}^{δ - 1} (H_{1}^{- 1} (V^{*}))] \geq (\leq) 𝔼 [h_{2}^{δ - 1} (H_{2}^{- 1} (V^{*}))], \end{array}

and the result follows.

The following theorem shows how an ordering between the information-generating functions of two random variables carries over to the information-generating functions of their ith order statistics.

Theorem 3.3. Consider two continuous random variables, $X_{1}^{*}$ and $X_{2}^{*}$, associated with cdfs H₁ and H₂, and corresponding pdfs h₁ and h₂. Suppose that the condition $\inf Ψ_{1}^{*} \geq \sup Ψ_{2}^{*}$ holds, where

\begin{array}{l} Ψ_{1}^{*} = \{ v^{*} > 0 \mid \frac{h_{2} (H_{2}^{- 1} (v^{*}))}{h_{1} (H_{1}^{- 1} (v^{*}))} \leq 1 \}, \\ Ψ_{2}^{*} = \{ v^{*} > 0 \mid \frac{h_{2} (H_{2}^{- 1} (v^{*}))}{h_{1} (H_{1}^{- 1} (v^{*}))} > 1 \} . \end{array}

Then, the following statements are true:

(1) If 0 < δ ≤ 1 and $G E n_{δ} (X_{1}^{*}) \leq G E n_{δ} (X_{2}^{*})$, then it follows that $G E n_{δ} (X_{1; i, m}^{*}) \leq G E n_{δ} (X_{2; i, m}^{*})$.

(2) If δ ≥ 1 and $G E n_{δ} (X_{1}^{*}) \geq G E n_{δ} (X_{2}^{*})$, then it follows that $G E n_{δ} (X_{1; i, m}^{*}) \geq G E n_{δ} (X_{2; i, m}^{*})$.

Proof. When either of the sets $Ψ_{1}^{*}$ or $Ψ_{2}^{*}$ is empty, the conclusion holds trivially. Therefore, we assume both sets are non-empty. Given the assumption that $G E n_{δ} (X_{1}^{*}) \leq G E n_{δ} (X_{2}^{*})$ , we can write

\begin{array}{l} \int_{- \infty}^{\infty} h_{1}^{δ} (x) d x - \int_{- \infty}^{\infty} h_{2}^{δ} (x) d x \\ = \int_{0}^{1} [h_{1}^{δ - 1} (H_{1}^{- 1} (v)) - h_{2}^{δ - 1} (H_{2}^{- 1} (v))] d v \leq 0 . \end{array}

Since 0 ≤ (1−v) ≤ 1 for v ∈ [0, 1], it follows that

\begin{array}{l} \int_{0}^{1} {(1 - v)}^{δ (m - i)} [h_{1}^{δ - 1} (H_{1}^{- 1} (v)) - h_{2}^{δ - 1} (H_{2}^{- 1} (v))] d v \leq 0, & (15) \end{array}

where m−i ≥ 0 for i = 1, 2, …, m.

Now, applying the definition of the information-generating function of the i-th order statistic, we obtain

\begin{array}{r} G E n_{δ} (X_{1; i, m}^{*}) - G E n_{δ} (X_{2; i, m}^{*}) = \int_{- \infty}^{\infty} h_{1; i, m}^{δ} (x) d x \\ - \int_{- \infty}^{\infty} h_{2; i, m}^{δ} (x) d x = Ω (x), \end{array}

where we define

Ω (x) = \int_{- \infty}^{\infty} h_{1; i, m}^{δ} (x) d x - \int_{- \infty}^{\infty} h_{2; i, m}^{δ} (x) d x .

To verify the first part of the theorem, it suffices to show that Ω(x) ≤ 0. Using the substitution v = H_j(x) for j = 1, 2, we rewrite Ω(x) as follows:

\begin{array}{l} \begin{array}{l} {(Δ_{h} (i, m - i + 1))}^{δ} Ω (x) = \int_{0}^{1} v^{δ i - δ} {(1 - v)}^{δ m - δ i} [h_{1}^{δ - 1} (H_{1}^{- 1} (v)) \\ - h_{2}^{δ - 1} (H_{2}^{- 1} (v))] d v \\ = \int_{Ψ_{1}^{*}} v^{δ i - δ} {(1 - v)}^{δ m - δ i} [h_{1}^{δ - 1} (H_{1}^{- 1} (v)) \\ - h_{2}^{δ - 1} (H_{2}^{- 1} (v))] d v \\ + \int_{Ψ_{2}^{*}} v^{δ i - δ} {(1 - v)}^{δ m - δ i} [h_{1}^{δ - 1} (H_{1}^{- 1} (v)) \\ - h_{2}^{δ - 1} (H_{2}^{- 1} (v))] d v . \end{array} \end{array}

From the given condition $\inf Ψ_{1}^{*} \geq \sup Ψ_{2}^{*}$ and the boundedness of $v^{δ (i - 1)}$ on [0, 1], we obtain:

\begin{array}{l} \begin{array}{l} {(Δ_{h} (i, m - i + 1))}^{δ} Ω (x) \leq {(inf Ψ_{1}^{*})}^{δ (i - 1)} \int_{Ψ_{1}^{*}} {(1 - v)}^{δ (m - i)} \\ [h_{1}^{δ - 1} (H_{1}^{- 1} (v)) - h_{2}^{δ - 1} (H_{2}^{- 1} (v))] d v \\ + {(sup Ψ_{2}^{*})}^{δ (i - 1)} \int_{Ψ_{2}^{*}} {(1 - v)}^{δ (m - i)} [h_{1}^{δ - 1} (H_{1}^{- 1} (v)) \\ - h_{2}^{δ - 1} (H_{2}^{- 1} (v))] d v \\ \leq {(inf Ψ_{1}^{*})}^{δ (i - 1)} \int_{0}^{1} {(1 - v)}^{δ (m - i)} [h_{1}^{δ - 1} (H_{1}^{- 1} (v)) \\ - h_{2}^{δ - 1} (H_{2}^{- 1} (v))] d v \leq 0 . \end{array} \end{array}

The last inequality follows directly from Equation 15 and the assumption that $inf Ψ_{1}^{*} \geq sup Ψ_{2}^{*}$ . A similar argument can be applied to prove the second part.

3.1 Bounds for information-generating function measure of order statistics

Theorem 3.4. Let X^* be a random variable with cdf H and pdf h. If $M_{d}^{*} = h (m_{d}^{*}) < \infty$, where $m_{d}^{*} = \sup \{ x : h (x) \leq M_{d}^{*} \}$ is the mode of X^*, then

\begin{array}{r} G E n_{δ} (X_{i, m}^{*}) \leq \max \{ \frac{{(M_{d}^{*})}^{δ - 1}}{{(Δ_{h} (i, m - i + 1))}^{δ}} D^{*} (δ i; δ m), \\ \frac{{(i - 1)}^{δ i - δ} {(m - i)}^{δ m - δ i}}{{(m - 1)}^{δ m - δ} {(Δ_{h} (i, m - i + 1))}^{δ}} G E n_{δ} (X^{*}) \} & (16) \end{array}

where $D^{*} (δ i; δ m) = \int_{0}^{1} v^{δ i - δ} {(1 - v)}^{δ m - δ i} d v$ , and under the condition δ ≥ 1.

Proof. From Equations 2, 3, and under the condition δ ≥ 1, we can use the transformation v = H(x) to express the information-generating function measure for the ith order statistics, $X_{i, m}^{*}$ , as

\begin{array}{l} G E n_{δ} (X_{i, m}^{*}) = \frac{1}{{(Δ_{h} (i, m - i + 1))}^{δ}} \int_{0}^{1} v^{δ i - δ} {(1 - v)}^{δ m - δ i} h^{δ - 1} \\ (H^{- 1} (v)) d v . \end{array}

Given $h (x) \leq M_{d}^{*}$ , it follows that

\begin{array}{l} G E n_{δ} (X_{i, m}^{*}) \leq \frac{{(M_{d}^{*})}^{δ - 1}}{{(Δ_{h} (i, m - i + 1))}^{δ}} D^{*} (δ i; δ m) . \end{array}

On the other hand, since the beta distribution with pdf $\frac{1}{Δ_{h} (i, m - i + 1)} v^{i - 1} {(1 - v)}^{m - i}$, 0 < v < 1, has the mode $\frac{i - 1}{m - 1}$, we can say that

\begin{array}{l} G E n_{δ} (X_{i, m}^{*}) \leq \frac{1}{{(Δ_{h} (i, m - i + 1))}^{δ}} {(\frac{i - 1}{m - 1})}^{δ i - δ} \\ {(1 - \frac{i - 1}{m - 1})}^{δ m - δ i} \int_{0}^{1} h^{δ - 1} (H^{- 1} (v)) d v \\ = \frac{{(i - 1)}^{δ i - δ} {(m - i)}^{δ m - δ i}}{{(m - 1)}^{δ m - δ} {(Δ_{h} (i, m - i + 1))}^{δ}} G E n_{δ} (X^{*}) . \end{array}

Example 3.1. Suppose X^* follows a Pareto distribution with pdf given by $h (x) = \frac{α s^{α}}{x^{α + 1}}, x \geq s > 0, α > 0$. It can be shown that the transformed density becomes $h (H^{- 1} (v)) = \frac{α}{s} {(1 - v)}^{1 + \frac{1}{α}}, 0 < v < 1$. Taking α = 1 and s = 1, we find $M_{d}^{*} = 1$, and hence, h(H⁻¹(v)) = (1−v)². Furthermore, we compute

G E n_{δ} (X^{*}) = \int_{1}^{\infty} x^{- 2 δ} d x = \frac{1}{2 δ - 1}, δ > \frac{1}{2} .

According to Theorem 3.4, we deduce that

\begin{array}{l} G E n_{δ} (X_{i, m}^{*}) \leq \max \{ \frac{D^{*} (δ i; δ m)}{{(Δ_{h} (i, m - i + 1))}^{δ}}, \\ \frac{{(i - 1)}^{δ i - δ} {(m - i)}^{δ m - δ i}}{(2 δ - 1) {(m - 1)}^{δ m - δ} {(Δ_{h} (i, m - i + 1))}^{δ}} \} . \end{array}

Letting m = 20 and i = 15, for δ = 3, we evaluate

\begin{array}{l} G E n_{3} (X_{15, 20}^{*}) \leq \max \{ 9.83131, 13.6021 \} = 13.6021 . \end{array}
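The two terms in Example 3.1 can be evaluated with a few lines of code. The sketch below is illustrative only and assumes that Δ_h(·, ·) denotes the beta function, so that D^*(δi; δm) = Δ_h(δi − δ + 1, δm − δi + 1). The function names are hypothetical, and the logarithms in the second term require 1 < i < m.

```python
import math

def log_beta(a, b):
    """Natural log of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def theorem_3_4_terms(i, m, delta, mode_density, gen_parent):
    """Return the two terms inside the max of Equation 16 (needs 1 < i < m)."""
    log_norm = delta * log_beta(i, m - i + 1)   # log of Delta_h(i, m-i+1)^delta
    log_d = log_beta(delta * (i - 1) + 1, delta * (m - i) + 1)   # log D*(delta i; delta m)
    first = mode_density ** (delta - 1) * math.exp(log_d - log_norm)
    log_second = (delta * (i - 1) * math.log(i - 1)
                  + delta * (m - i) * math.log(m - i)
                  - delta * (m - 1) * math.log(m - 1)
                  - log_norm)
    second = math.exp(log_second) * gen_parent
    return first, second

# Pareto case with alpha = s = 1: M_d* = 1 and GEn_delta(X*) = 1 / (2 delta - 1).
delta, i, m = 3, 15, 20
first, second = theorem_3_4_terms(i, m, delta, 1.0, 1.0 / (2 * delta - 1))

# Exact value for comparison: h(H^{-1}(v)) = (1 - v)^2 adds 2(delta - 1) to the
# exponent of (1 - v), so the integral is again a beta function.
exact = math.exp(log_beta(delta * (i - 1) + 1, delta * (m - i) + 2 * (delta - 1) + 1)
                 - delta * log_beta(i, m - i + 1))
print(first, second, exact)
```

Because the Pareto density quantile is available in closed form, the exact value of $G E n_{3} (X_{15, 20}^{*})$ can be printed next to the two bounding terms, which gives a direct sense of how tight the bound is in this case.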

4 Information-generating function model symmetric features of the order statistics

Several interesting features of the information-generating function of order statistics emerge when the underlying independent and identically distributed random variables have a symmetric pdf. We begin with two lemmas, whose proofs follow immediately from the symmetry assumption and the definition of h_{i, m} in Equation 3.

Lemma 4.1. (Fashandi and Ahmadi [21]) Let X^* be a continuous random variable defined over the support $S_{X^{*}}^{*}$, with pdf h and cdf H. If the following condition holds:

h (H^{- 1} (v)) = h (H^{- 1} (1 - v)), for all v \in (0, 1),

then the cdf H(x) is symmetric with respect to some point $c^{*} \in S_{X^{*}}^{*}$ .

Lemma 4.2. (Balakrishnan and Selvitella [13]) Suppose the order statistic $X_{i, m}^{*}$ , for i = 1, …, m, arises from a distribution whose pdf h satisfies the symmetry condition h(μ^*+x) = h(μ^*−x) for x ≥ 0, where μ^* denotes the mean of X^*. Under this assumption, the following identities are satisfied:

H (μ^{*} + x) = \bar{H} (μ^{*} - x), h_{i, m} (μ^{*} + x) = h_{m - i + 1, m} (μ^{*} - x) .

Theorem 4.1. Assume that $X_{1}^{*}, \dots, X_{m}^{*}$ are iid samples drawn from a distribution with pdf h that is symmetric about its mean μ^*. Then, the following properties hold:

1. If the sample size m is odd, then for every i = 1, …, m,

G E n_{δ} (X_{i, m}^{*}) = G E n_{δ} (X_{m - i + 1, m}^{*}) .

2. The pdf h is symmetric (about some point) if and only if

G E n_{δ} (X_{1, m}^{*}) = G E n_{δ} (X_{m, m}^{*}) for all integers m \geq 1 .

Moreover, if the first moment exists, the center of symmetry equals the mean μ^*.

Proof. (1) (Symmetry implies equality of GEn for reflected order statistics). By Lemma 4.2, we have the pointwise identity

h_{i, m} (μ^{*} + x) = h_{m - i + 1, m} (μ^{*} - x), x \in ℝ .

Using this identity and the substitution y = μ^*+x (whose Jacobian is dy = dx), we obtain

\begin{array}{l} G E n_{δ} (X_{i, m}^{*}) = \int_{- \infty}^{\infty} h_{i, m}^{δ} (y) d y = \int_{- \infty}^{\infty} h_{i, m}^{δ} (μ^{*} + x) d x \\ = \int_{- \infty}^{\infty} {(h_{m - i + 1, m} (μ^{*} - x))}^{δ} d x \\ = \int_{- \infty}^{\infty} h_{m - i + 1, m}^{δ} (t) d t = G E n_{δ} (X_{m - i + 1, m}^{*}), \end{array}

where in the penultimate equality we used the change of variable t = μ^*−x. This proves (1).

(2) (Necessity). The argument used for part (1) does not depend on the parity of m, so taking i = 1 gives $G E n_{δ} (X_{1, m}^{*}) = G E n_{δ} (X_{m, m}^{*})$ for every m ≥ 1, which is the required identity.

(Sufficiency). Assume that

\begin{array}{l} G E n_{δ} (X_{1, m}^{*}) = G E n_{δ} (X_{m, m}^{*}) for every m \geq 1 . & (17) \end{array}

Using the representations of GEn_δ and proceeding exactly as in the proof of Theorem 2.1, Equation 17 yields, after the standard change of variable $v = {\bar{H}}^{δ} (x)$ and grouping factors, an identity of the form

\int_{0}^{1} w (v) [h (H^{- 1} (v)) - h (H^{- 1} (1 - v))] v^{ℓ} d v = 0 for all ℓ \geq 0,

where $w (v) = {(1 - v^{1 / δ})}^{δ i - δ}$ is continuous and strictly positive on (0, 1). Dividing by w(v) and using the continuity of the integrand, we obtain

\int_{0}^{1} v^{ℓ} [h (H^{- 1} (v)) - h (H^{- 1} (1 - v))] d v = 0 for all ℓ \geq 0 .

By Lemma 2.2 (Stone–Weierstrass corollary), the continuous function

ζ (v) : = h (H^{- 1} (v)) - h (H^{- 1} (1 - v))

must vanish identically on [0, 1]; hence

\begin{array}{l} h (H^{- 1} (v)) = h (H^{- 1} (1 - v)), \forall v \in (0, 1) . & (18) \end{array}

Now Lemma 4.1 (Fashandi and Ahmadi) implies that the cdf H is symmetric about some point c^* ∈ ℝ (that is, H(c^*+x) = 1−H(c^*−x) for all x). Consequently h is symmetric about c^*.

To identify the center c^* with the mean μ^*, note that for any distribution symmetric about c^* with a finite first moment, we necessarily have

𝔼 [X] = c^{*} .

Therefore, when the first moment exists, the center of symmetry equals the mean, and the pdf is symmetric about μ^*. This completes the proof of sufficiency and hence of the theorem.

Corollary 4.1. As a direct consequence of Theorem 4.1, let the forward difference operator with respect to i be defined as $Ξ G E n_{δ} (X_{i, m}^{*}) = G E n_{δ} (X_{i + 1, m}^{*}) - G E n_{δ} (X_{i, m}^{*})$ for 1 ≤ i ≤ m−1. Then, it follows that $Ξ G E n_{δ} (X_{i, m}^{*}) = - Ξ G E n_{δ} (X_{m - i, m}^{*})$ for i = 1, …, m−1.
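The reflection identity of Theorem 4.1 and the antisymmetry of the forward differences in Corollary 4.1 can be illustrated numerically for the standard normal distribution. This is a minimal sketch under stated assumptions: Δ_h(·, ·) is read as the beta function, the integral over (0, 1) is approximated by a midpoint rule, and the function name `gen_normal_order_stats` is hypothetical.

```python
import math
from statistics import NormalDist

def gen_normal_order_stats(m, delta, n=4000):
    """Approximate GEn_delta(X_{i,m}) for i = 1..m when X ~ N(0, 1)."""
    std = NormalDist()
    grid = [(k + 0.5) / n for k in range(n)]
    # h(H^{-1}(v))^(delta - 1) on the grid, computed once and reused for all i
    dq = [std.pdf(std.inv_cdf(v)) ** (delta - 1) for v in grid]
    values = []
    for i in range(1, m + 1):
        log_beta = math.lgamma(i) + math.lgamma(m - i + 1) - math.lgamma(m + 1)
        s = sum(v ** (delta * (i - 1)) * (1.0 - v) ** (delta * (m - i)) * w
                for v, w in zip(grid, dq))
        values.append(s / n * math.exp(-delta * log_beta))
    return values

vals = gen_normal_order_stats(7, 2.0)
diffs = [vals[j + 1] - vals[j] for j in range(len(vals) - 1)]
print(vals)
print(diffs)
```

In the printed lists, entry i of `vals` should agree with entry m − i + 1, and the forward differences should change sign under the same reflection.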

Remark 4.1. Define $Θ_{m} = G E n_{δ} (X_{1, m}^{*}) - G E n_{δ} (X_{m, m}^{*})$ . Then, Θ_m = 0 if and only if X^* is symmetric. Hence, Θ_m serves as a potential measure of symmetry and can be used as a test statistic for assessing symmetry.

Based on the conditions outlined in Corollary 4.1, the information-generating function $G E n_{δ} (X_{i, m}^{*})$ attains either a local maximum or a minimum at the median position. This behavior can be illustrated using the uniform U(−1, 1) and standard normal N(0, 1) distributions. Specifically, for the median case (i = 4) when the sample size is m = 7, we can observe (refer to Figures 3, 4):

(1) Under the U(−1, 1) distribution, the function reaches a minimum value of 3.263403 for δ = 2, 5.940808 for δ = 3, and 11.36502 for δ = 4.

(2) Under the N(0, 1) distribution the function reaches a maximum value of 0.6147224 for δ = 2, 0.43858655 for δ = 3, and 0.33141763 for δ = 4.

Figure 3

Three line graphs showing parabolic curves for different values of delta. Top left: delta equals two, values range from four to seven. Top right: delta equals three, values range from six to thirty. Bottom: delta equals four, values range from ten to two hundred. Each graph has i as the x-axis.

Figure 3. Information-generating function of the ith order statistics of U(−1, 1) distribution.

Figure 4

Three line graphs show data distributions for different delta values. The first graph with delta equals two peaks at 0.6. The second graph with delta equals three peaks at 0.4. The third graph with delta equals four peaks slightly above 0.3. Each graph features a horizontal axis labeled “i” and distinct vertical scales.

Figure 4. Information-generating function of the ith order statistics of N(0, 1) distribution.

4.1 Symmetry test using nonparametric estimation

Nonparametric approaches to testing symmetry have been extensively explored in the literature; notable contributions include those by Xiong et al. [22], Noughabi and Jarrahiferiz [23], and Mohamed and Almuqrin [24]. In this section, we focus on a nonparametric estimation framework for the information-generating function inspired by the methodology proposed by Vasicek [25]. This formulation is then employed to assess symmetry in a distribution. Consider a random sample $X_{1}^{*}, \dots, X_{m}^{*}$ drawn from a continuous distribution H(x) with associated density function h(x). The hypothesis under investigation is:

H y_{0} : H (μ^{*} - x) = 1 - H (μ^{*} + x), for all x .

where the parameter μ^* is unspecified. The alternative hypothesis is expressed as:

H y_{1} : H (μ^{*} - x) \neq 1 - H (μ^{*} + x) .

When the underlying random variables are independent and identically distributed with a symmetric pdf, the information-generating function derived from their order statistics exhibits several notable properties. The Vasicek entropy estimator, originally introduced in Equation 1, has been instrumental in the progression of statistical analysis techniques. Its formulation is given by:

\begin{array}{l} \begin{matrix} E n (h_{m}) = - \int_{- \infty}^{\infty} h (x) ln h (x) d x = - \int_{0}^{1} ln [{(\frac{d}{d χ} H^{- 1} (χ))}^{- 1}] d χ \\ = \frac{1}{m} \sum_{i = 1}^{m} ln [\frac{m}{2 u^{*}} (X_{i + u^{*}, m}^{*} - X_{i - u^{*}, m}^{*})], \end{matrix} & (19) \end{array}

Here, u^* is a positive integer satisfying $u^{*} < \frac{m}{2}$. For boundary handling, the order statistics are extended such that $X_{i, m}^{*} = X_{1, m}^{*}$ when i < 1, and $X_{i, m}^{*} = X_{m, m}^{*}$ when i > m. The generalized entropy expressions for the smallest and largest order statistics can be reformulated as follows:

\begin{array}{l} \begin{matrix} G E n_{δ} (X_{1, m}^{*}) = \int_{0}^{1} {[m {(1 - u)}^{m - 1}]}^{δ} h^{δ - 1} (H^{- 1} (u)) d u, \end{matrix} \end{array}

\begin{array}{l} \begin{matrix} G E n_{δ} (X_{m, m}^{*}) = \int_{0}^{1} {[m u^{m - 1}]}^{δ} h^{δ - 1} (H^{- 1} (u)) d u . \end{matrix} \end{array}

Park [26], expanding on the foundation laid by Vasicek [25], proposed a test for symmetry based on entropy derived from order statistics. Following this approach, sample-based estimators of $G E n_{δ} (X_{1, m}^{*})$ and $G E n_{δ} (X_{m, m}^{*})$ for a sample size m and k = 1, 2, …, can be expressed as:

\begin{array}{l} \begin{matrix} \hat{G E n_{δ} (X_{1, k}^{*})} = \frac{k^{δ}}{m} \sum_{i = 1}^{m} {(1 - \frac{i}{m + 1})}^{k δ - δ} {(\frac{2 u^{*}}{m (X_{i + u^{*}, m}^{*} - X_{i - u^{*}, m}^{*})})}^{δ - 1}, \end{matrix} \end{array}

\begin{array}{l} \begin{matrix} \hat{G E n_{δ} (X_{k, k}^{*})} = \frac{k^{δ}}{m} \sum_{i = 1}^{m} {(\frac{i}{m + 1})}^{k δ - δ} {(\frac{2 u^{*}}{m (X_{i + u^{*}, m}^{*} - X_{i - u^{*}, m}^{*})})}^{δ - 1} . \end{matrix} \end{array}

Accordingly, the expression $\hat{Θ_{k}} = \hat{G E n_{δ} (X_{1, k}^{*})} - \hat{G E n_{δ} (X_{k, k}^{*})}$ , defined for k = 1, 2, …, can be approximated through the following empirical estimator:

\begin{array}{l} \hat{Θ_{k}} = \frac{k^{δ}}{m} \sum_{i = 1}^{m} {(\frac{2 u^{*}}{m (X_{i + u^{*}, m}^{*} - X_{i - u^{*}, m}^{*})})}^{δ - 1} [{(1 - \frac{i}{m + 1})}^{k δ - δ} \\ - {(\frac{i}{m + 1})}^{k δ - δ}] . \end{array}

To simplify the analysis, we fix k = 2 in what follows and suggest employing the estimator:

\begin{array}{l} \hat{Θ_{2}} = \frac{2^{δ}}{m} \sum_{i = 1}^{m} {(\frac{2 u^{*}}{m (X_{i + u^{*}, m}^{*} - X_{i - u^{*}, m}^{*})})}^{δ - 1} [{(1 - \frac{i}{m + 1})}^{δ} \\ - {(\frac{i}{m + 1})}^{δ}] \\ = \frac{2^{δ}}{m} \sum_{i = 1}^{m} {(\frac{2 u^{*}}{m (X_{i + u^{*}, m}^{*} - X_{i - u^{*}, m}^{*})})}^{δ - 1} Φ (\frac{i}{m + 1}), \end{array}

in which Φ(v) = −Φ(1−v), and Φ(v) is both continuous and bounded. This estimator corresponds to $Θ_{2} = G E n_{δ} (X_{1, 2}^{*}) - G E n_{δ} (X_{2, 2}^{*})$ and is used to evaluate whether the distribution of the random variable X^* is symmetric. Substantial deviations of $\hat{Θ_{2}}$ from zero, whether in a positive or negative direction, can be interpreted as evidence of asymmetry in the underlying distribution.

Theorem 4.2. Let $X_{1}^{*}, \dots, X_{m}^{*}$ be independent and identically distributed random variables, and define $Y_{i}^{*} = a X_{i}^{*} + b^{*}$ for constants a > 0 and b^* ∈ ℝ, for each i = 1, …, m. Denote the estimators of Θ₂ based on the sequences ${X_{i}^{*}}$ and ${Y_{i}^{*}}$ as ${\hat{Θ_{2}}}^{X^{*}}$ and ${\hat{Θ_{2}}}^{Y^{*}}$, respectively. Then, the following relationships (expectation, variance, and mean square error, respectively) hold:

(1) $𝔼 [{\hat{Θ_{2}}}^{Y^{*}}] = \frac{𝔼 [{\hat{Θ_{2}}}^{X^{*}}]}{a^{δ - 1}}$ ,

(2) $Var [{\hat{Θ_{2}}}^{Y^{*}}] = \frac{Var [{\hat{Θ_{2}}}^{X^{*}}]}{a^{2 δ - 2}}$ ,

(3) $MSE [{\hat{Θ_{2}}}^{Y^{*}}] = \frac{MSE [{\hat{Θ_{2}}}^{X^{*}}]}{a^{2 δ - 2}}$ .

Proof. We begin by expressing the estimator for Θ₂ based on the transformed variables:

\begin{array}{l} {\hat{Θ_{2}}}^{Y^{*}} = \frac{2^{δ}}{m} \sum_{i = 1}^{m} {(\frac{2 u^{*}}{m (Y_{i + u^{*}, m}^{*} - Y_{i - u^{*}, m}^{*})})}^{δ - 1} [{(1 - \frac{i}{m + 1})}^{δ} \\ - {(\frac{i}{m + 1})}^{δ}] \\ = \frac{2^{δ}}{m} \sum_{i = 1}^{m} {(\frac{2 u^{*}}{m (a X_{i + u^{*}, m}^{*} - a X_{i - u^{*}, m}^{*})})}^{δ - 1} [{(1 - \frac{i}{m + 1})}^{δ} \\ - {(\frac{i}{m + 1})}^{δ}] \\ = \frac{1}{a^{δ - 1}} {\hat{Θ_{2}}}^{X^{*}} . \end{array}

Since a > 0 preserves the ordering of the sample and the shift b^* cancels in every spacing, taking the expectation, variance, and mean square error of both sides gives the stated scaling properties, completing the proof.
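The estimator $\hat{Θ_{2}}$ and the scaling relation behind Theorem 4.2 can be sketched in code. The following is an illustrative implementation rather than the authors' Mathematica or R code: the function names are hypothetical, the window size follows the heuristic of Equation 20, and the boundary rule clamps indices below 1 and above m to the sample extremes.

```python
import math
import random

def window_size(m):
    """Heuristic window u* = floor(sqrt(m) + 0.5) (Equation 20)."""
    return int(math.floor(math.sqrt(m) + 0.5))

def theta2_hat(sample, delta, u=None):
    """Spacing-based estimator of Theta_2 for a sample without ties."""
    x = sorted(sample)
    m = len(x)
    if u is None:
        u = window_size(m)
    total = 0.0
    for i in range(1, m + 1):
        lo = x[max(i - u, 1) - 1]   # X_{i-u,m}, clamped to X_{1,m}
        hi = x[min(i + u, m) - 1]   # X_{i+u,m}, clamped to X_{m,m}
        spacing_term = (2.0 * u / (m * (hi - lo))) ** (delta - 1)
        p = i / (m + 1)
        total += spacing_term * ((1.0 - p) ** delta - p ** delta)
    return (2.0 ** delta / m) * total

rng = random.Random(1)
x = [rng.gauss(0.0, 1.0) for _ in range(30)]   # synthetic illustration data
a, b = 2.5, 3.0
y = [a * xi + b for xi in x]

delta = 3.0
print(theta2_hat(y, delta), theta2_hat(x, delta) / a ** (delta - 1))
```

The two printed numbers should agree up to floating-point rounding, which is the sample-level identity from which the expectation, variance, and mean square error relations follow.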

However, the estimator $\hat{Θ_{2}}$ depends not only on the observed sample, but also varies with the chosen window size u^*. Determining its exact distribution under the null hypothesis presents significant analytical challenges. Consequently, Monte Carlo simulation is used to estimate the critical values. Following prior studies (e.g., McWilliams [27] and Corzo and Babativa [28]), the generalized lambda distribution is selected as an alternative model. From this distribution, samples of sizes m = 20, 30, 50, and 100 are generated across nine different parameter settings. The simulated data are defined as

x_{i} = η_{1} + \frac{v_{i}^{η_{3}} - {(1 - v_{i})}^{η_{4}}}{η_{2}}, 0 \leq v_{i} \leq 1, i = 1, 2, \dots, m .

Table 1 presents the parameter values η₁, η₂, η₃, and η₄, originally chosen by McWilliams [27]. For each parameter combination, 1,000 samples are produced for each sample size. To select the window size u^*, we use a heuristic formula suggested by Grzegorzewski and Wieczorkowski [29] for entropy estimation, given by

\begin{array}{l} u^{*} = [\sqrt{m} + 0.5], & (20) \end{array}

where [·] denotes the floor function. Figure 5 illustrates the empirical distributions of the test statistic $\hat{Θ_{2}}$ , based on 10,000 replications from the standard normal distribution. These distributions are shown for sample sizes m = 25, 40, 50, 70, and 100, with u^* selected via Equation 20. Sample generation and computation of the test statistic were performed using Wolfram Mathematica (version 13), chosen for its efficient random number generation and symbolic computation features. Further statistical analysis and visualization were conducted in R, leveraging its advanced capabilities in statistical computing and graphical presentation. In Figure 5, as the sample size m increases, the empirical pdf of the statistic $\hat{Θ_{2}}$ becomes increasingly concentrated around its central value. Specifically, larger sample sizes yield steeper and more sharply peaked curves, reflecting a reduction in variability due to the greater amount of information contained in the sample. Conversely, smaller sample sizes produce flatter, more dispersed distributions, indicating greater variability. This behavior is consistent with the general principles of asymptotic theory, where statistics based on larger samples tend to exhibit reduced variance and greater stability.

Table 1

Table 1. Parameter configurations of the generalized lambda distribution used in the Monte Carlo simulations, categorized into nine distinct cases.

Figure 5

Two line graphs depict empirical PDFs of $\hat{\Theta}_2$ for different sample sizes: 25, 40, 50, 70, and 100. Both graphs show density versus $\hat{\Theta}_2$, with density peaking at zero and broader curves for smaller sample sizes.

Figure 5. Empirical density plots of the test statistic based on 50,000 samples generated under the null distribution for sample sizes m = 25, 40, 50, 70, and 100, with δ = 2 (top panel) and δ = 2 (bottom panel).

Table 2 presents the critical values of the statistic $\hat{Θ_{2}}$ for varying sample sizes at the significance level α^* = 0.05, obtained from a Monte Carlo simulation with 1,000 replications. According to Table 2, zero lies within the critical intervals as both m and δ increase. Furthermore, the length of these intervals decreases significantly, and the intervals concentrate closely around zero.

Table 2

Table 2. Critical intervals of the test statistic $\hat{Θ_{2}}$ at the significance level of 0.05.

Furthermore, the power of the test is calculated as the proportion of the 1,000 samples whose test statistic falls in the critical region, that is, the samples for which the null hypothesis of symmetry is rejected at the significance level α^* = 0.05. The estimated power levels for the proposed test are shown in Table 3.

Table 3

Table 3. Comparative analysis of the power examination for the test at the 0.05 significance threshold.

The determination of the critical values and the power for our proposed symmetry test at a significance level of α^* = 0.05 was carried out as follows:

(1) Generate a random sample of size m from the standard normal distribution, and then calculate the corresponding test statistic for the sample;

(2) Repeat Step 1 a total of 1,000 times and define the critical values from the 2.5th and 97.5th percentiles of the obtained test statistics, that is, from the 25th and 975th order statistics, $\hat{Θ_{2}^{(25)}}$ and $\hat{Θ_{2}^{(975)}}$, since for α^* = 0.05 we have $\frac{α^{*}}{2} = 0.025 = \frac{25}{1, 000}$ and $1 - \frac{α^{*}}{2} = 0.975 = \frac{975}{1, 000}$. The null hypothesis is rejected if $\hat{Θ_{2}}$ falls below $\hat{Θ_{2}^{(25)}}$ or exceeds $\hat{Θ_{2}^{(975)}}$, and is not rejected when $\hat{Θ_{2}^{(25)}} < \hat{Θ_{2}} < \hat{Θ_{2}^{(975)}}$;

(3) Draw another sample of size m from the distribution under study (one of the generalized lambda cases in Table 1), then check whether the test statistic falls outside the critical interval obtained in Step 2;

(4) Estimate the test's power as the proportion of rejections over 1,000 repetitions of Step 3.
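The four steps above can be sketched as follows. This is a simplified illustration, not the authors' implementation: the function names are hypothetical, the estimator uses the window size from Equation 20, and the generalized lambda parameters in `ETA_DEMO` are an arbitrary asymmetric setting chosen for the demonstration (they are not taken from Table 1).

```python
import math
import random

def theta2_hat(sample, delta):
    """Spacing-based estimator of Theta_2 with window u* from Equation 20."""
    x = sorted(sample)
    m = len(x)
    u = int(math.floor(math.sqrt(m) + 0.5))
    total = 0.0
    for i in range(1, m + 1):
        lo = x[max(i - u, 1) - 1]   # X_{i-u,m}, clamped to X_{1,m}
        hi = x[min(i + u, m) - 1]   # X_{i+u,m}, clamped to X_{m,m}
        p = i / (m + 1)
        total += ((2.0 * u / (m * (hi - lo))) ** (delta - 1)
                  * ((1.0 - p) ** delta - p ** delta))
    return 2.0 ** delta / m * total

def critical_values(rng, m, delta, reps=1000):
    """Steps 1-2: lower and upper critical values under N(0, 1)."""
    stats = sorted(theta2_hat([rng.gauss(0.0, 1.0) for _ in range(m)], delta)
                   for _ in range(reps))
    # with reps = 1000 these are the 25th and 975th order statistics
    return stats[round(0.025 * reps) - 1], stats[round(0.975 * reps) - 1]

def gld_sample(rng, m, eta1, eta2, eta3, eta4):
    """Inverse-transform draws from the generalized lambda quantile function."""
    draws = []
    for _ in range(m):
        v = rng.random()
        draws.append(eta1 + (v ** eta3 - (1.0 - v) ** eta4) / eta2)
    return draws

def estimate_power(rng, m, delta, lower, upper, eta, reps=1000):
    """Steps 3-4: proportion of samples whose statistic leaves [lower, upper]."""
    rejections = 0
    for _ in range(reps):
        t = theta2_hat(gld_sample(rng, m, *eta), delta)
        if t < lower or t > upper:
            rejections += 1
    return rejections / reps

rng = random.Random(2026)
ETA_DEMO = (0.0, 1.0, 1.4, 0.25)   # hypothetical setting, NOT a row of Table 1
lower, upper = critical_values(rng, m=30, delta=2.0)
power = estimate_power(rng, 30, 2.0, lower, upper, ETA_DEMO)
print(lower, upper, power)
```

The seed and the number of replications are arbitrary here. Repeating the run with several seeds gives a rough sense of the Monte Carlo error in both the critical values and the estimated power.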

4.1.1 Performance assessment using Monte Carlo methods

To rigorously evaluate the proposed testing methodology, we implement Monte Carlo simulation techniques. The comparative analysis examines statistical power across multiple competing tests, with detailed results presented in Tables 3, 4.

Table 4

Table 4. Comparison of the test's power analysis at the significance level 0.05.

4.1.1.1 Comparative test procedures

The study incorporates the following established testing approaches for benchmarking purposes:

1. McWilliams' runs-based examination [27] utilizes the counting measure At⁽¹⁾ as its fundamental test statistic, quantifying total sequence runs.

2. Baklizi's modified runs analysis [30] introduces an adjusted formulation of the runs test, operationalized through statistic At⁽²⁾.

3. Wilcoxon signed-rank procedure [31], as described by Gibbons and Chakraborti, employs the test measure At⁽³⁾ for distribution-free inference.

4. Tajuddin's rank-sum approach [32] adapts the Wilcoxon two-sample framework using test statistic At⁽⁴⁾.

5. Cheng-Balakrishnan rank methodology [33] implements the testing criterion At⁽⁵⁾ for nonparametric analysis.

6. Modarres' trimmed statistical measure [34] incorporates a proportional trimming factor q within its test statistic $A t_{q}^{(6)}$ .

7. Baklizi's size-adaptive test [35] accounts for both sample size m and trimming proportion q through statistic $A t_{m; q}^{(7)}$.

8. Baklizi's secondary testing framework [35] presents an alternative formulation based on At⁽⁸⁾.

9. Baklizi's extended testing protocol [36] features an enhanced version using evaluation metric At⁽⁹⁾.

10. Corzo-Babativa nonparametric technique [28] establishes its testing procedure on the foundation of At⁽¹⁰⁾.

11. Noughabi-Jarrahiferiz extropy-based method [23] develops a novel approach using order statistic extropy measure, formalized as At⁽¹¹⁾.

Case 1 in Table 3 corresponds to a symmetric distribution, and, as expected, the power of the test statistic $\hat{Θ_{2}}$ is close to 0.05 for all values of δ. The remaining eight cases are asymmetric (cases 2 and 3 are almost symmetric). In cases 5, 7, 8, and 9, the test statistics exhibit comparable powers across the values of δ, particularly as δ grows. The lower power in case 4 may be explained by the fact that η₁ is much larger than 0, whereas it is nearly 0 in the other cases. We conclude that, compared with the other tests in Table 4, the proposed test based on the information-generating function of order statistics performs well in the simulation study as δ increases. Therefore, we anticipate that the suggested test will outperform the competing tests across a wide range of real-world applications.

4.2 Real data set

To demonstrate the applicability of our methodology, we used data from the health statistics bulletin published by the General Authority for Statistics in the Kingdom of Saudi Arabia. This comprehensive dataset captures key health indicators, including:

1. Prevalence of chronic diseases,

2. Mental health status.

The statistical population encompasses all households (both Saudi and non-Saudi) permanently residing in the Kingdom of Saudi Arabia. The survey covers 13 administrative regions and 151 governorates, with 2023 as the base year for calculating indicators. Health status among adults (aged 15 years and above) is assessed using the Visual Analog Scale (VAS), with scores ranging from 0 (worst possible health) to 100 (excellent health). The data are stratified by administrative region and age group, enabling detailed demographic and geographic analysis. The complete dataset is publicly available through the General Authority for Statistics portal at: https://www.stats.gov.sa/statistics-tabs?tab=436312&category=417594. Figure 6 shows the histogram of the data set together with its kernel density estimate, while Figure 7 shows the Q–Q diagram.

Figure 6

Histogram with blue bars and a red kernel density estimate line. The x-axis represents data values ranging from 0 to 100, and the y-axis shows density. The distribution is unimodal and skewed right, peaking around 80.

Figure 6. The kernel density estimate and histogram of the data set.

Figure 7

Q-Q plot comparing sample quantiles to theoretical quantiles. Points largely follow the red diagonal line, indicating the data is approximately normally distributed, with some deviation at the tails.

Figure 7. Q–Q diagram of all the data.

4.2.1 Bootstrap procedure

Since the null distribution of ${\hat{Θ}}_{2}$ is non-pivotal, we employ a reflection bootstrap:

1. Symmetrize the data by generating $X_{sym} = \{ X_{1}^{*}, \dots, X_{m}^{*} \} \cup \{ 2 \tilde{X} - X_{1}^{*}, \dots, 2 \tilde{X} - X_{m}^{*} \}$, where $\tilde{X}$ is the sample median.

2. For each b = 1, …, B: (i) Resample $X_{b}^{*}$ uniformly from $X_{sym}$ . (ii) Compute ${\hat{Θ}}_{2, b}^{*}$ .

3. The p-value is $\frac{1}{B} \sum_{b = 1}^{B} I ({\hat{Θ}}_{2, b}^{*} \geq {\hat{Θ}}_{2}^{obs})$ .
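The reflection bootstrap can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, each bootstrap sample has the same size m as the data and is drawn with replacement, and zero spacings caused by repeated values are skipped, a tie-handling rule that is our assumption because the text does not discuss ties. The demonstration uses synthetic exponential data, not the VAS data.

```python
import math
import random
import statistics

def theta2_hat(sample, delta):
    """Spacing-based estimator of Theta_2 with window u* from Equation 20."""
    x = sorted(sample)
    m = len(x)
    u = int(math.floor(math.sqrt(m) + 0.5))
    total = 0.0
    for i in range(1, m + 1):
        lo = x[max(i - u, 1) - 1]
        hi = x[min(i + u, m) - 1]
        if hi == lo:
            continue   # tie guard (assumption): resampling can repeat values
        p = i / (m + 1)
        total += ((2.0 * u / (m * (hi - lo))) ** (delta - 1)
                  * ((1.0 - p) ** delta - p ** delta))
    return 2.0 ** delta / m * total

def reflection_bootstrap_pvalue(sample, delta, n_boot=500, seed=0):
    rng = random.Random(seed)
    m = len(sample)
    center = statistics.median(sample)
    # Step 1: pool the data with its reflection about the sample median
    x_sym = list(sample) + [2.0 * center - x for x in sample]
    observed = theta2_hat(sample, delta)
    exceed = 0
    # Step 2: resample from the symmetrized pool and recompute the statistic
    for _ in range(n_boot):
        resample = [rng.choice(x_sym) for _ in range(m)]
        if theta2_hat(resample, delta) >= observed:
            exceed += 1
    # Step 3: share of bootstrap statistics at least as large as the observed one
    return exceed / n_boot

demo_rng = random.Random(7)
demo = [demo_rng.expovariate(1.0) for _ in range(40)]   # synthetic skewed data
p_value = reflection_bootstrap_pvalue(demo, delta=2.0)
print(p_value)
```

The number of bootstrap replications `n_boot` and the seed are arbitrary choices for the sketch. A larger `n_boot` reduces the Monte Carlo error of the reported p-value.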

Results. The sample exhibits negative skewness (−1.89) and high kurtosis (9.55), indicating:

• A left-skewed distribution with a longer left tail

• Heavy tails and peakedness relative to a normal distribution

The symmetry test results for different sensitivity parameters δ are given in Table 5.

Table 5

Table 5. The symmetry test results for different sensitivity parameters δ.

Key findings:

• Strong evidence against symmetry for δ = 2 and 3 (p < 0.05).

• Marginal evidence at δ = 4 (p = 0.051).

• Insufficient evidence to reject symmetry at δ = 5 (p = 0.103).

Interpretation:

• The negative skewness suggests potential outliers in the left tail.

• The increasing p-values with higher δ indicate the test's reduced sensitivity to asymmetry.

5 Conclusion

This research advances the theoretical understanding of information-generating functions for order statistics through several key contributions. We have systematically investigated monotonicity properties and derived bounds for the proposed measure. The study establishes important stochastic ordering results based on this information-theoretic framework, demonstrating that equality of information-generating function measures for order statistics uniquely determines their parent distributions. Furthermore, we have developed novel characterization theorems for the exponential distribution using this approach. For symmetric distributions, our analysis shows that the information-generating function $G E n_{δ} (X_{i, m}^{*})$ exhibits extremal behavior (either a local maximum or a minimum) at the median position. This theoretical finding is substantiated through explicit computations for both uniform and standard normal distributions. Building on these theoretical insights, we have formulated a nonparametric symmetry test based on the proposed measure, whose effectiveness increases with δ. The practical utility of our methodology is validated through comprehensive simulation studies and an application to chronic disease management data. Both theoretical and empirical results consistently show that higher values of δ significantly improve the test's performance, confirming the robustness of our approach.

Future studies will compare the proposed test with a wider set of established symmetry tests, such as the Baringhaus–Henze, Ahmad–Li, and Bonett–Seier tests, to situate our method within the literature. While this study provides initial validation of the test's power, a fuller investigation against a wider array of alternatives, including heavy-tailed and bounded-support distributions, is a priority for future research.

Data availability statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Author contributions

MM: Writing – original draft, Investigation, Software, Formal analysis, Funding acquisition, Visualization, Resources, Supervision, Validation, Project administration, Conceptualization, Writing – review & editing, Data curation, Methodology. MA-L: Writing – review & editing, Methodology, Formal analysis. EA: Writing – review & editing, Methodology, Formal analysis. HS: Funding acquisition, Data curation, Visualization, Resources, Conceptualization, Formal analysis, Validation, Project administration, Methodology, Writing – review & editing, Software, Investigation, Writing – original draft, Supervision.

Funding

The author(s) declared that financial support was received for this work and/or its publication. This work was funded by Prince Sattam bin Abdulaziz University (PSAU/2025/02/35141).

Acknowledgments

The authors extend their appreciation to Prince Sattam bin Abdulaziz University for funding this research work through the project number (PSAU/2025/02/35141).

Conflict of interest

The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declared that generative AI was not used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher's note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

1. Shannon C. A mathematical theory of communication. Bell Syst Tech J. (1948) 27:379–423. doi: 10.1002/j.1538-7305.1948.tb01338.x

Crossref Full Text | Google Scholar

2. Golomb S. The information generating function of a probability distribution (corresp.). IEEE Trans Inform Theory. (1966) 12:75–7. doi: 10.1109/TIT.1966.1053843

Crossref Full Text | Google Scholar

3. Kharazmi O, Balakrishnan N. Cumulative and relative cumulative residual information generating measures and associated properties. Commun Stat Theory Methods. (2021) 52:5260–73. doi: 10.1080/03610926.2021.2005100

Crossref Full Text | Google Scholar

4. Kharazmi O, Balakrishnan N. Cumulative residual and relative cumulative residual Fisher information and their properties. IEEE Trans Inf Theory. (2021) 67:6306–12. doi: 10.1109/TIT.2021.3073789

Crossref Full Text | Google Scholar

5. Kharazmi O, Balakrishnan N. Jensen-information generating function and its connections to some well-known information measures. Stat Prob Lett. (2021) 170:108995. doi: 10.1016/j.spl.2020.108995

Crossref Full Text | Google Scholar

6. Kharazmi O, Balakrishnan N. Information generating function for order statistics and mixed reliability systems. Commu Stat Theory Methods. (2021) 51:7846–55. doi: 10.1080/03610926.2021.1881123

Crossref Full Text | Google Scholar

7. Kharazmi O, Balakrishnan N. Generating function for generalized Fisher information measure and its application to finite mixture models. Hacet J Math Stati. (2022) 51:1472–83. doi: 10.15672/hujms.1094273

Crossref Full Text | Google Scholar

8. Zamani Z, Kharazmi O, Balakrishnan N. Information generating function of record values. Math Methods Stat. (2022) 31:120–33. doi: 10.3103/S1066530722030036

Crossref Full Text | Google Scholar

9. Kharazmi O, Balakrishnan N, Ozonur D. Jensen-discrete information generating function with an application to image processing. Soft Comput. (2023) 27:4543–52. doi: 10.1007/s00500-023-07863-0

Crossref Full Text | Google Scholar

10. Kayal S, Balakrishnan N. Quantile-based information generating functions and their properties and uses. Prob Eng Inf Sci. (2024) 38:1–19. doi: 10.1017/S0269964824000068

Crossref Full Text | Google Scholar

11. Onicescu O. The Informational Energy, Component of the Statistical Barometer Concerning the Systems. Bucharest: Technical Publishing House (1966).

Google Scholar

12. Bhatia PK. On measures of information energy. Inf Sci. (1997) 97:233–40. doi: 10.1016/0020-0255(94)00071-9

Crossref Full Text | Google Scholar

13. Balakrishnan N, Selvitella A. Symmetry of a distribution via symmetry of order statistics. Stat Probabil Lett. (2017) 129:367–72. doi: 10.1016/j.spl.2017.06.023

Crossref Full Text | Google Scholar

14. Ahmadi J. Characterization results for symmetric continuous distributions based on the properties of k-records and spacings. Stat Probabil Lett. (2020) 162:108764. doi: 10.1016/j.spl.2020.108764

Crossref Full Text | Google Scholar

15. Mahdizadeh M, Zamanzade E. Estimation of a symmetric distribution function in multistage ranked set sampling. Stat Papers. (2020) 61:851–67. doi: 10.1007/s00362-017-0965-x

Crossref Full Text | Google Scholar

16. Dai XJ, Niu CZ, Guo X. Testing for central symmetry and inference of the unknown center. Comput Stat Data An. (2018) 127:15–31. doi: 10.1016/j.csda.2018.05.007

Crossref Full Text | Google Scholar

17. Bozin V, Milosevic B, Nikitin YY, Obradovic M. New characterization-based symmetry tests. Bull Malays Math Sci Soc. (2020) 43:297–320. doi: 10.1007/s40840-018-0680-3

Crossref Full Text | Google Scholar

18. Shaked M, Shanthikumar JG. Stochastic Orders and Their Applications. San Diego, CA: Academic Press (1994).

Google Scholar

19. Aliprantis CD, Burkinshaw O. Principles of Real Analysis. London: Edward Arnold (1981).

Google Scholar

20. Ebrahimi N, Soofi ES, Zahedi H. Information properties of order statistics and spacings. IEEE Trans Inform Theory. (2004) 50:177–83. doi: 10.1109/TIT.2003.821973

Crossref Full Text | Google Scholar

21. Fashandi M, Ahmadi J. Characterizations of symmetric distributions based on Renyi entropy. Stat Probabil Lett. (2012) 82:798–804. doi: 10.1016/j.spl.2012.01.004

Crossref Full Text | Google Scholar

22. Xiong PH, Zhuang WW, Qiu GX. Testing symmetry based on the extropy of record values. J Nonparametr Stat. (2021) 33:134–55. doi: 10.1080/10485252.2021.1914338

Crossref Full Text | Google Scholar

23. Noughabi HA, Jarrahiferiz J. Extropy of order statistics applied to testing symmetry. Commun Stat-Simul C. (2022) 51:3389–99. doi: 10.1080/03610918.2020.1714660

Crossref Full Text | Google Scholar

24. Mohamed MS, Almuqrin MA. Properties of fractional generalized entropy in ordered variables and symmetry testing. AIMS Math. (2025) 10:1116–41. doi: 10.3934/math.2025053

Crossref Full Text | Google Scholar

25. Vasicek O. A test for normality based on sample entropy. J R Stat Soc B. (1976) 38:54–9. doi: 10.1111/j.2517-6161.1976.tb01566.x

Crossref Full Text | Google Scholar

26. Park S. A goodness-of-fit test for normality based on the sample entropy of order statistics. Stat Probabil Lett. (1999) 44:359–63. doi: 10.1016/S0167-7152(99)00027-9

Crossref Full Text | Google Scholar

27. McWilliams TP. A distribution-free test for symmetry based on a runs statistic. J Am Stat Assoc. (1990) 85:1130–3. doi: 10.1080/01621459.1990.10474985

Crossref Full Text | Google Scholar

28. Corzo J, Babativa G. A modified runs test for symmetry. J Stat Comput Sim. (2013) 83:984–91. doi: 10.1080/00949655.2011.647026

Crossref Full Text | Google Scholar

29. Crzcgorzewski P, Wirczorkowski R. Entropy-based goodness-of-fit test for exponentiality. Commun Stat-Theor M. (1999) 28:1183–202. doi: 10.1080/03610929908832351

Crossref Full Text | Google Scholar

30. Baklizi A. A conditional distribution runs test for symmetry. J Nonparametr Stat. (2003) 15:713–8. doi: 10.1080/10485250310001634737

Crossref Full Text | Google Scholar

31. Gibbons JD, Chakraborti SM. Non-Parametric Statistical Inference. New York, NY: Dekker (1992).

Google Scholar

32. Tajuddin IH. Distribution-free test for symmetry based on the Wilcoxon two-sample test. J Appl Stat. (1994) 21:409–15. doi: 10.1080/757584017

Crossref Full Text | Google Scholar

33. Cheng WH, Balakrishnan N. A modified sign test for symmetry. Commun Stat-Simul C. (2004) 33:703–9. doi: 10.1081/SAC-200033302

Crossref Full Text | Google Scholar

34. Modarres R, Gastwirth JL. A modified runs test for symmetry. Stat Probabil Lett. (1996) 31:107–12. doi: 10.1016/S0167-7152(96)00020-X

Crossref Full Text | Google Scholar

35. Baklizi A. Testing symmetry using a trimmed longest run statistic. Aust N Z J Stat. (2007) 49:339–47. doi: 10.1111/j.1467-842X.2007.00485.x

Crossref Full Text | Google Scholar

36. Baklizi A. Improving the power of the hybrid test. Int J Contemp Math Sciences. (2008) 3:497–9.

Google Scholar

Keywords: information-generating function, non-parametric estimation, order statistics, stochastic order comparison, symmetry testing

Citation: Mohamed MS, Al-Labadi M, Almuhur E and Sakr HH (2026) Further aspects of information-generating function of order statistics with health application in symmetry of chronic disease management. Front. Appl. Math. Stat. 12:1733600. doi: 10.3389/fams.2026.1733600

Received: 27 October 2025; Revised: 24 December 2025; Accepted: 05 January 2026;
Published: 02 February 2026.

Edited by:

Han-Ying Liang, Tongji University, China

Reviewed by:

Zakariya Yahya Algamal, University of Mosul, Iraq
Shuji Ando, Tokyo University of Science, Japan

Copyright © 2026 Mohamed, Al-Labadi, Almuhur and Sakr. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Hanan H. Sakr, h.sakr@psau.edu.sa