On testing exponentiality under Type-I censoring

Two new goodness-of-fit testing procedures are introduced to test exponentiality when data are subject to Type-I censoring. We propose four test statistics for this purpose. Through extensive Monte Carlo simulations, we show that the proposed tests maintain the nominal significance level and have good power against both monotonic and non-monotonic hazard function alternatives, even for samples as small as n = 10. A real dataset is studied for illustrative purposes.


1. Introduction
In reliability and life testing problems, Type-I censoring is popular because the duration of the experiment is fixed before it starts and is therefore under the control of the experimenter.
Suppose we wish to study the lifetimes of n items in a life testing experiment. To control the total time, the experiment is terminated at a time T that is fixed before the experiment begins. Consequently, d observations are recorded in the form $X_{1:n} \le X_{2:n} \le \dots \le X_{d:n}$, and the remaining n − d values are censored, as discussed by Balakrishnan and Cohen [1] and Cohen [2].
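The censoring scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `type1_censored_sample`, the seed, and the values n = 20, T = 1 and θ = 1 are hypothetical choices, not taken from the study.

```python
import random

def type1_censored_sample(n, T, theta=1.0, rng=None):
    """Put n exponential(scale=theta) items on test and stop at time T.

    Returns the ordered failure times observed by T. The remaining
    n - d items are only known to have survived beyond T.
    """
    rng = rng or random.Random()
    # random.expovariate takes the rate, i.e. 1 / scale
    lifetimes = [rng.expovariate(1.0 / theta) for _ in range(n)]
    return sorted(x for x in lifetimes if x <= T)

rng = random.Random(2024)  # arbitrary seed, used only for reproducibility
failures = type1_censored_sample(n=20, T=1.0, theta=1.0, rng=rng)
d = len(failures)
print(d, "failures observed,", 20 - d, "items censored at T")
```

Note that d is random here, whereas T is fixed in advance; this is what distinguishes Type-I censoring from schemes that fix the number of failures.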
The exponential distribution considered in this article is a well-known and frequently used lifetime model. It is a special case of many important statistical models, such as the Weibull and gamma distributions. Its simplicity and the existence of closed-form solutions for many problems make the exponential model appealing, which motivates the current study (see also Balakrishnan and Basu [3]). We assume the following form of the pdf for the exponential distribution with scale parameter θ:

$$f(x; \theta) = \frac{1}{\theta} \exp\left(-\frac{x}{\theta}\right), \quad x > 0,\ \theta > 0.$$

Suppose n items are placed on a life testing experiment, which will be terminated at a predetermined time T > 0. Let $X_{1:n}, X_{2:n}, \dots, X_{d:n}$ be the corresponding Type-I censored sample from a distribution function F. Consider the following goodness-of-fit hypothesis:

$$H_0: F(x) = 1 - \exp\left(-\frac{x}{\theta}\right), \quad x > 0, \tag{1}$$

for some positive scale parameter θ. That is, the current study is interested in testing for exponentiality. The maximum likelihood estimator (MLE) of θ based on the censored data $X_{1:n}, X_{2:n}, \dots, X_{d:n}$ is given by

$$\hat{\theta} = \frac{1}{d}\left[\sum_{i=1}^{d} X_{i:n} + (n - d)\,T\right], \tag{2}$$

provided that d ≥ 1. Hereafter we assume that d ≥ 1, i.e. that at least one failure is observed before time T.
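As a minimal sketch, the MLE under Type-I censoring can be computed as the total time on test divided by the number of observed failures. This assumes the standard form of the estimator for the exponential scale parameter; the function name `exp_mle_type1` and the toy data are hypothetical.

```python
def exp_mle_type1(failures, n, T):
    """MLE of the exponential scale parameter under Type-I censoring.

    failures : observed failure times (each <= T)
    n        : number of items placed on test
    T        : pre-fixed termination time
    """
    d = len(failures)
    if d < 1:
        raise ValueError("the MLE requires at least one observed failure")
    if d > n or any(x > T for x in failures):
        raise ValueError("failures must be at most n values not exceeding T")
    # Each of the n - d censored items contributes T to the total time on test.
    total_time_on_test = sum(failures) + (n - d) * T
    return total_time_on_test / d

# Toy data: 3 failures out of n = 5 items, test stopped at T = 4.
theta_hat = exp_mle_type1([1.0, 2.0, 3.0], n=5, T=4.0)
print(theta_hat)  # (1 + 2 + 3 + 2 * 4) / 3 = 14 / 3
```

The explicit error for d = 0 mirrors the requirement that at least one failure be observed.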
Pearson [4] was the first to study the problem of goodness-of-fit, which is a statistical procedure for testing the suitability of a specific model to describe a given set of complete or censored data. For a detailed discussion of this problem see D'Agostino and Stephens [5], Huber-Carol et al. [6], and Nikulin and Chimitova [7] among others.
Stephens [8] proposed versions of the Cramér-von Mises and Anderson-Darling goodness-of-fit test statistics for Type-I censored data. Pakyari and Balakrishnan [9] studied goodness-of-fit testing for the exponential distribution when the available data are Type-I censored, by treating the Type-I censored data as a complete sample and then performing classical goodness-of-fit tests for complete data.
Their method considered the Type-I censored sample $X_{1:n} \le X_{2:n} \le \dots \le X_{d:n}$ as the order statistics of a complete sample of size d from an exponential distribution right truncated at time T.
This article presents new procedures for testing the goodness-of-fit of the exponential model when data are Type-I censored. We study several procedures in this regard: tests based on order statistics, a test based on quantiles, and a test based on the binomial distribution. Our main proposals are the tests based on order statistics, followed by the test based on quantiles. We investigate the empirical power of the proposed tests through an extensive Monte Carlo simulation study.
This study aims to provide some easy yet powerful goodness-of-fit testing procedures for exponentiality, which is known to be a special case among many well-applied lifetime models.
The paper is structured as follows. Section 2 introduces some test statistics constructed from order statistics. In Section 3 we propose a test statistic based on a linear combination of the elements of a quantile vector. A test based on the binomial distribution is discussed in Section 4. In Section 5, we investigate the validity of the proposed tests by calculating the empirical significance levels and comparing them with the nominal levels. We then perform a Monte Carlo simulation study to assess the empirical power of the proposed tests and compare it with the power of some known tests from the literature. Finally, we illustrate the proposed tests using a real data example.

2. Tests based on order statistics
Note that, conditional on D = d,

$$(X_{1:n}, X_{2:n}, \dots, X_{d:n}) \stackrel{d}{=} (V_{1:d}, V_{2:d}, \dots, V_{d:d}),$$

where $V_{1:d}, \dots, V_{d:d}$ are the order statistics of a random sample of size d from the exponential distribution right truncated at T; see Arnold et al. [22] and David and Nagaraja [23].
Having found the MLE of θ, it is useful to transform the Type-I censored sample $X_{1:n}, X_{2:n}, \dots, X_{d:n}$ to the complete uniformly distributed sample $U_{1:d}, U_{2:d}, \dots, U_{d:d}$ using the following transformation:

$$U_{i:d} = \frac{1 - \exp(-X_{i:n}/\hat{\theta})}{1 - \exp(-T/\hat{\theta})}, \quad i = 1, 2, \dots, d. \tag{3}$$

Therefore, testing that the Type-I censored data $X_{1:n}, X_{2:n}, \dots, X_{d:n}$ follow an exponential distribution is equivalent to testing that the complete data $U_{1:d}, U_{2:d}, \dots, U_{d:d}$ follow a uniform distribution on (0, 1).
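The transformation to uniformity can be sketched as follows, assuming it is the fitted exponential cdf at each failure time divided by the fitted cdf at T (the cdf of the right-truncated exponential). The function name `to_uniform` and the input values are illustrative only.

```python
import math

def to_uniform(failures, T, theta_hat):
    """Map ordered failure times to (0, 1) through the cdf of the
    exponential distribution right truncated at T."""
    F_T = 1.0 - math.exp(-T / theta_hat)  # fitted probability of failing by T
    return [(1.0 - math.exp(-x / theta_hat)) / F_T for x in failures]

# Illustrative values: three failures before T = 2 with theta_hat = 1.
u = to_uniform([0.5, 1.0, 1.5], T=2.0, theta_hat=1.0)
print(u)
```

Because the map is increasing, the transformed values keep the ordering of the original failure times.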
If we let $\nu_i = U_{i:d} - i/(n+1)$ be the deviation of each order statistic $U_{i:d}$ from its expected value, then several goodness-of-fit test statistics, denoted $T_1$, $T_2$, and $T_3$, can be constructed from these deviations.
Large values of these statistics lead to rejection of the null hypothesis of exponentiality. In Section 5, we use Monte Carlo simulation to determine the upper tail of the simulated values of the statistics $T_1$, $T_2$, and $T_3$, which serve as critical points for testing exponentiality.
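Obtaining an upper-tail critical point by simulation can be sketched as below. The statistic used here, a plain sum of squared deviations, is a hypothetical stand-in chosen only to make the example runnable; it is not one of $T_1$, $T_2$, or $T_3$. The centering i/(n + 1) follows the text as written, and the seed, n, T, and number of replications are arbitrary.

```python
import math
import random

def censored_sample(n, T, rng):
    """Standard exponential lifetimes; keep only failures observed by T."""
    return sorted(x for x in (rng.expovariate(1.0) for _ in range(n)) if x <= T)

def stand_in_statistic(failures, n, T):
    """HYPOTHETICAL statistic: sum of squared deviations nu_i.
    A placeholder, not one of the statistics proposed in the paper."""
    d = len(failures)
    theta_hat = (sum(failures) + (n - d) * T) / d        # MLE under Type-I censoring
    F_T = 1.0 - math.exp(-T / theta_hat)
    u = [(1.0 - math.exp(-x / theta_hat)) / F_T for x in failures]
    return sum((u_i - i / (n + 1)) ** 2 for i, u_i in enumerate(u, start=1))

def null_draws(n, T, reps, rng):
    """Simulate the statistic under exponentiality, conditioning on d >= 1."""
    draws = []
    while len(draws) < reps:
        sample = censored_sample(n, T, rng)
        if sample:                                       # skip samples with no failures
            draws.append(stand_in_statistic(sample, n, T))
    return sorted(draws)

def upper_critical_value(sorted_draws, alpha):
    """Empirical (1 - alpha) quantile of the simulated null distribution."""
    k = math.ceil((1.0 - alpha) * len(sorted_draws)) - 1
    return sorted_draws[k]

rng = random.Random(1)                                   # arbitrary seed
draws = null_draws(n=10, T=1.0, reps=2000, rng=rng)
crit = upper_critical_value(draws, alpha=0.10)
exceed = sum(s > crit for s in draws) / len(draws)       # at most alpha by construction
print(crit, exceed)
```

A test at level α then rejects exponentiality when the statistic computed from the observed data exceeds `crit`. Swapping in another statistic only requires replacing `stand_in_statistic`.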

3. Test based on quantiles
Note that the order statistics $U_{i:d}$ defined by Equation (3) follow the beta distribution with parameters (i, d − i + 1). Define $p_i$, for i = 1, 2, ..., d, as the value of the beta(i, d − i + 1) distribution function at the observed $u_{i:d}$. The vector of quantiles $(p_1, \dots, p_d)$ can be used as a measure of goodness-of-fit. Extreme values of $p_i$, i.e. values close to zero or one, are signs of "badness-of-fit". It is noteworthy that, although the $p_i$ are uniformly distributed over (0, 1), they are not statistically independent.
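A minimal sketch of the quantile vector follows, assuming each $p_i$ is the beta(i, d − i + 1) distribution function evaluated at $u_{i:d}$. For integer parameters this cdf equals a binomial tail probability, P(Bin(d, u) ≥ i), so only the standard library is needed. The function names are hypothetical.

```python
import math

def beta_cdf_int(u, i, d):
    """P(Beta(i, d - i + 1) <= u) for integers 1 <= i <= d.

    Uses the identity with the binomial tail: the i-th of d uniform order
    statistics is <= u exactly when at least i of the d uniforms are <= u.
    """
    return sum(math.comb(d, k) * u**k * (1.0 - u) ** (d - k) for k in range(i, d + 1))

def quantile_vector(u_sorted):
    """p_i for each ordered uniform value u_{i:d}."""
    d = len(u_sorted)
    return [beta_cdf_int(u, i, d) for i, u in enumerate(u_sorted, start=1)]

# d = 2 with both values at 0.5:
# p_1 = P(Bin(2, 0.5) >= 1) = 0.75 and p_2 = P(Bin(2, 0.5) >= 2) = 0.25
p = quantile_vector([0.5, 0.5])
print(p)
```

Values of `p` near 0 or 1 flag order statistics that sit far in the tails of their null beta distributions.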
We propose a test statistic, $T_P$, in terms of a linear combination of $p_i$ and $1 - p_i$ with weights $w_i = (i-1)/d$, for i = 1, 2, ..., d, where the $p_{(i)}$ are the values of $p_i$ arranged from smallest to largest. Note that $T_P$ is calculated only for values of $p_{(i)}$ in the open interval (0, 1), i.e. we exclude the cases with $p_{(i)} = 0$ or $p_{(i)} = 1$. Note also that while the $u_{i:d}$ are ordered by value, the $p_i$ are not necessarily ordered. Moreover, $T_P$ will be large whenever any of the $p_i$ is close to zero or one. Hence, large values of $T_P$ provide evidence that the null hypothesis $H_0$ of exponentiality should be rejected.

4. Test based on binomial distribution
If the null hypothesis is true, i.e. under the validity of the exponential model, each item fails before T with probability F(T), so we expect to observe nF(T) failures. Hence, testing the null hypothesis of exponentiality (1) is equivalent to performing a binomial test on the observed number of failures d. The usual binomial test may then be used to find the associated p-value.
For large sample sizes n, the binomial distribution is well approximated by the Gaussian model, in which case a z-test with continuity correction is applied to the test statistic Z. However, our Monte Carlo simulations showed that the test statistic $T_B$ does not maintain the nominal significance level for small sample sizes, even for sample sizes up to n = 40, so we did not include the power of $T_B$ in our simulation study.
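The binomial test and its normal approximation can be sketched as follows. This uses the textbook two-sided exact binomial p-value and a common form of the continuity-corrected z statistic; it is an assumption that these match the forms used for $T_B$. The null probability p stands for F(T) and is taken as given; all function names and numbers are illustrative.

```python
import math

def binom_pmf(k, n, p):
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)

def exact_binomial_pvalue(d, n, p):
    """Two-sided exact p-value: total probability of all outcomes that are
    no more likely than the observed count d."""
    observed = binom_pmf(d, n, p)
    tol = 1.0 + 1e-7  # guard against floating-point ties
    total = sum(binom_pmf(k, n, p) for k in range(n + 1)
                if binom_pmf(k, n, p) <= observed * tol)
    return min(1.0, total)

def z_with_continuity_correction(d, n, p):
    """|d - np| shrunk by 0.5 (never below zero), in standard-deviation units."""
    excess = max(abs(d - n * p) - 0.5, 0.0)
    return excess / math.sqrt(n * p * (1.0 - p))

def two_sided_normal_pvalue(z):
    """2 * (1 - Phi(z)) for z >= 0, written with the complementary error function."""
    return math.erfc(z / math.sqrt(2.0))

# Illustrative numbers: d = 8 failures among n = 10 items when F(T) = 0.5.
p_exact = exact_binomial_pvalue(8, 10, 0.5)   # 112 / 1024 = 0.109375
z = z_with_continuity_correction(8, 10, 0.5)  # 2.5 / sqrt(2.5)
p_normal = two_sided_normal_pvalue(z)
print(p_exact, z, p_normal)
```

Comparing `p_exact` with `p_normal` for small n gives a quick sense of how rough the Gaussian approximation can be at such sample sizes.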
In the following section, we perform a Monte Carlo simulation to assess the power of the proposed tests for various alternative models, and for a combination of various sample sizes n and censoring proportion 1 − F(T) = exp(−T/θ).
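As a small worked step, the termination time T that yields a target proportion of failures follows by inverting the exponential distribution function. The symbol $p$ for the target failure proportion and the numerical values are illustrative assumptions.

```latex
% Solve F(T) = p for T under the exponential model with scale \theta
F(T) = 1 - \exp(-T/\theta) = p
\quad\Longrightarrow\quad
T = -\theta \ln(1 - p).

% Example: \theta = 1 and p = 0.6, i.e. a censoring proportion of 0.4
T = -\ln(0.4) \approx 0.916.
```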

5. Simulation study
In this section, the performance of our proposed tests will be evaluated by studying the empirical significance level and the empirical power through extensive Monte Carlo simulations. We used the R pseudo-random generator with 50,000 iterations.
First, we investigate the null distribution of the test statistics presented in the previous sections using Monte Carlo estimates of the coefficient of skewness ($\sqrt{\beta_1}$) and the coefficient of kurtosis ($\beta_2$) when the underlying distribution is standard exponential. The results are shown in Table 1. The coefficient of skewness and the coefficient of kurtosis are defined as

$$\sqrt{\beta_1} = \frac{\mu_3}{\mu_2^{3/2}} \quad \text{and} \quad \beta_2 = \frac{\mu_4}{\mu_2^{2}},$$

where $\mu_k$ denotes the k-th central moment.
From Table 1, it is clear that the null distributions of all the test statistics are far from normal, as $\sqrt{\beta_1}$ and $\beta_2$ are not close to 0 and 3, respectively, which are the coefficients of skewness and kurtosis of the normal distribution. This is also evident from Figure 1, which depicts the simulated pdf curves of the test statistics under the null hypothesis. Indeed, all the test statistics are skewed to the right. Hence, we use empirical critical values to perform the goodness-of-fit tests.
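The moment-based coefficients can be estimated from simulated values of a statistic as in the sketch below, which uses the usual central-moment ratios $\mu_3/\mu_2^{3/2}$ and $\mu_4/\mu_2^2$. The function name and the toy input are hypothetical.

```python
def skewness_kurtosis(values):
    """Return (sqrt(beta_1), beta_2) computed from sample central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2**1.5, m4 / m2**2

# Symmetric toy sample: m2 = 2 and m4 = 6.8, so skewness is 0 and kurtosis is 1.7.
sqrt_b1, b2 = skewness_kurtosis([1.0, 2.0, 3.0, 4.0, 5.0])
print(sqrt_b1, b2)
```

Applied to the simulated null values of a test statistic, a clearly positive first component indicates right skewness.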
We compare the empirical power of the proposed tests to those of the EDF-based test statistics proposed by Pettitt and Stephens [24] and Stephens [8].
Stephens [8] studied a modification of the Kolmogorov-Smirnov statistic for Type-I censored data from an exponential model. It is computed from $u_{(i)} = 1 - \exp(-x_{i:n}/\hat{\theta})$ and $u_{(d+1)} = 1 - \exp(-T/\hat{\theta})$, with $\hat{\theta}$ being the MLE of the scale parameter θ given by Equation (2). Pettitt and Stephens [24] also studied the Cramér-von Mises statistic ${}_1W^2_{T:n}$ and the Anderson-Darling statistic ${}_1A^2_{T:n}$ under Type-I censoring.

We considered seven alternative models in three groups $G_1$, $G_2$ and $G_3$, based on the behavior of their hazard functions, as follows:

2. Log-normal distribution with location parameter µ = 0 and scale parameter σ = 1.0, denoted by Log-normal (0, 1.0).
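The following forms of the probability density functions were used here.

The gamma distribution with density function

$$f(x) = \frac{x^{\alpha-1} e^{-x/\beta}}{\Gamma(\alpha)\,\beta^{\alpha}}, \quad x > 0,$$

where α > 0 is the shape parameter and β > 0 is the scale parameter.

The Weibull distribution with density function

$$f(x) = \frac{a}{b}\left(\frac{x}{b}\right)^{a-1} \exp\left[-\left(\frac{x}{b}\right)^{a}\right], \quad x > 0,$$

where a > 0 and b > 0 are the shape and scale parameters, respectively.

The log-normal distribution with density function

$$f(x) = \frac{1}{x\,\sigma\sqrt{2\pi}} \exp\left[-\frac{(\ln x - \mu)^2}{2\sigma^2}\right], \quad x > 0,$$

where −∞ < µ < ∞ is the mean and σ > 0 is the standard deviation of the transformed normal distribution.

Finally, the Lomax distribution (also known as Pareto Type II), with probability density function

$$f(x) = \frac{d}{c}\left(1 + \frac{x}{c}\right)^{-(d+1)}, \quad x > 0,$$

with the scale parameter c > 0 and the shape parameter d > 0.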
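To illustrate how alternatives are grouped by hazard behavior, the sketch below evaluates the closed-form hazard functions of the Weibull and Lomax models, assuming the standard shape/scale parametrizations described above. The parameter values are illustrative only and are not the settings used in the simulation study.

```python
def weibull_hazard(x, a, b):
    """Weibull hazard with shape a and scale b: (a / b) * (x / b) ** (a - 1)."""
    return (a / b) * (x / b) ** (a - 1)

def lomax_hazard(x, c, d):
    """Lomax hazard with scale c and shape d: d / (c + x)."""
    return d / (c + x)

grid = [0.5, 1.0, 2.0]
increasing = [weibull_hazard(x, a=2.0, b=1.0) for x in grid]  # shape > 1: rising hazard
constant = [weibull_hazard(x, a=1.0, b=1.0) for x in grid]    # shape = 1: exponential case
decreasing = [lomax_hazard(x, c=1.0, d=2.0) for x in grid]    # always falling hazard
print(increasing, constant, decreasing)
```

The Weibull case with shape 1 reduces to the constant hazard of the exponential model, which is the null hypothesis being tested.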
The CDFs of the alternative distributions in groups $G_1$, $G_2$ and $G_3$ are depicted in Figure 2.
For a comprehensive discussion of these distributions, one may refer to Johnson et al. [25,26] and Kleiber and Kotz [27].
Verifying the empirical significance level is essential to the validity of any goodness-of-fit test. To assess the validity of our tests, we estimated the empirical significance level by generating 100,000 Type-I censored samples from the exponential distribution with rate equal to one (standard exponential). We considered combinations of various sample sizes n and proportions (probabilities) of failure F(T) = 1 − exp(−T). The empirical significance levels at the nominal level α = 0.10 are tabulated in Table 2. The values in this table confirm that the proposed tests preserve the nominal significance level.
The power of the proposed tests together with the powers associated with the classical EDF-based tests are recorded in Tables 3-5 for sample sizes n = 10, n = 20, and n = 30, respectively for the three alternative groups G 1 , G 2 , and G 3 . Figures 4-6 depict the corresponding heatmaps to provide better visualization of the results. The greyscale is given in Figure 7.
The test statistics $T_3$ and $T_P$ outperformed the classical EDF-based statistics for groups $G_2$ and $G_3$, which contain the monotonically increasing and the non-monotonic hazard function alternatives, respectively, for all sample sizes considered here. The test statistic $T_2$ also had the best power in some cases in groups $G_2$ and $G_3$. However, for the group $G_1$ alternatives, which have monotonically decreasing hazard functions, the EDF-based test statistic AD performed better than the other tests. In Table 5, for the log-normal (0, 0.5) alternative and n = 30, the empirical powers are equal to 1.00 for most tests when the proportion of failures F(T) is at least 60%. This illustrates the consistency of the test statistics considered here. Moreover, as one would expect, the empirical power of all the tests considered here increases when the sample size n increases and/or when the proportion of failures F(T) increases.
In summary, for the monotonically increasing and non-monotonic hazard rate alternatives, we recommend using the test statistics $T_3$ and $T_P$. For the Lomax model alternative, we recommend $T_P$ for small values of F(T) and the AD statistic for large values of F(T).

6. Numerical example
In this section, we study a numerical example to illustrate our proposed procedure and test statistics. The data are the times to breakdown of an insulating fluid tested at 34 kilovolts, with n = 19 observations (see Nelson [28], Table 1.1, page 105).
Suppose we decided to terminate the experiment at time T = 15, so that any observation larger than 15 is censored. The complete and the Type-I censored data are summarized in Table 6.
The value of d is found to be d = 14, and the MLE of θ is obtained from Equation (2). The values of the test statistics and the associated p-values are given in Table 7. The p-values are sufficiently large for all test statistics, so the null hypothesis of exponentiality is not rejected and the exponential model is judged to fit the data. The histogram of the complete data and the fitted exponential pdf curve with scale parameter $\hat{\theta} = 10$ are depicted in Figure 3.

7. Concluding remarks
In this paper, we proposed some new goodness-of-fit tests for exponentiality when the available data are Type-I censored. We employed two methods for this purpose. The first is based on the distance between the observed order statistics and their theoretical means under the assumption of exponentiality.
The second method is based on the quantiles of uniform order statistics, which are known to follow the beta distribution, and on the fact that, under the null hypothesis, most of the quantiles $p_i$ should be close to 0.5. We proposed a test statistic based on the weighted mean of the logarithm of $p_i$.
Among the four test statistics presented in this article, the test statistic $T_3$, based on order statistics, is the most powerful, followed by the test statistic $T_P$, which is based on quantiles.
The large sample properties of the proposed estimators will be examined in a separate future study through Monte Carlo simulation.

Data availability statement
The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author.

Author contributions
Sections 1-4 were prepared by RP. Sections 5, 6 were prepared by OA-H (60%) and RP (40%). All authors contributed to the article and approved the submitted version.