ORIGINAL RESEARCH article

Front. Phys., 20 March 2020

Sec. Statistical and Computational Physics

Volume 8 - 2020 | https://doi.org/10.3389/fphy.2020.00071

Performance of Three-Stage Sequential Estimation of the Normal Inverse Coefficient of Variation Under Type II Error Probability: A Monte Carlo Simulation Study

  • Department of Mathematics, Faculty of Engineering, Kuwait College of Science and Technology, Kuwait, Kuwait

Abstract

This paper examines the performance of three-stage sequential estimation of the population inverse coefficient of variation of the normal distribution under moderate sample sizes. We estimate the final sample size generated by the three-stage procedure, the population mean, the population variance, the population inverse coefficient of variation, the asymptotic coverage probability, and the asymptotic regret incurred by estimating the population inverse coefficient of variation by its sample counterpart under a squared-error loss function plus linear sampling cost. We also examine how sensitive the constructed confidence interval is in detecting a potential shift in the population inverse coefficient of variation, under both an uncontrolled optimal sample size and an optimal sample size controlled against the type II error probability; we do so by computing the characteristic operating function. In addition, we address the sensitivity of the three-stage procedure as the underlying distribution departs from normality, considering two classes of distributions: Student's t distribution and the beta distribution. The study uses Monte Carlo simulations, implemented in FORTRAN using Microsoft Developer Studio software. The simulation results reveal that the controlled confidence intervals provide coverage probabilities that exceed the prescribed nominal value even for small optimal sample sizes, whereas the uncontrolled case attains the nominal value only asymptotically. Moreover, the procedure is more sensitive in detecting a potential shift in the parameter of concern under the controlled case than under the uncontrolled case. Finally, the three-stage procedure is insensitive to departures from normality for normal-like distributions.

Introduction

Let X1, X2, … be a sequence of independent and identically distributed random variables from a normal distribution N(μ, σ²) with mean μ ∈ ℝ and variance σ² ∈ ℝ+, where both parameters are finite but unknown. Pearson [1] introduced the concept of the coefficient of variation in the statistical literature. The population coefficient of variation is simply the ratio of the population standard deviation to the population mean, provided the mean is not zero. The higher the coefficient of variation, the greater the level of dispersion around the mean. It is a unit-free measure that allows for comparison between distributions of values whose scales of measurement are not comparable.

The measure has a wide range of applications across many fields of science; see Nairy and Rao [2] for a brief survey of recent applications in business, climatology, engineering, and other fields. Recently, Hima Bindu et al. [3] published a book that covers computational strategies, properties of the coefficient of variation, and the extraction of metadata leading to efficient knowledge representation. It also compiles representational and classification strategies based on the measure through illustrative explanations. The disadvantage of the measure arises when the population mean equals zero or approaches zero. For that reason, we recommend working with the reciprocal of the measure, the inverse coefficient of variation, that is, η = μ/σ, η ∈ ℝ.

Having observed a random sample X1, X2, …, Xn of size n ≥ 2 from the normal distribution, we recommend using the customary measures $\bar{X}_n = n^{-1}\sum_{i=1}^{n} X_i$ and $S_n = \sqrt{(n-1)^{-1}\sum_{i=1}^{n}(X_i - \bar{X}_n)^2}$ as initial point estimates for the population mean μ and the population standard deviation σ, respectively. Note that $(\bar{X}_n, S_n^2)$ are complete sufficient statistics for (μ, σ²). Consequently, the customary sample inverse coefficient of variation is $\hat{\eta}_n = \bar{X}_n / S_n$.
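As a small illustration of the point estimate just described, the following Python sketch computes the sample inverse coefficient of variation as the sample mean divided by the sample standard deviation. The function name `sample_icv` and the data values are hypothetical, and the divisor n − 1 in the standard deviation is an assumption about the "customary" estimate.

```python
from statistics import fmean, stdev


def sample_icv(xs):
    """Sample inverse coefficient of variation: sample mean / sample standard deviation.

    Assumes the standard deviation uses the divisor n - 1 (statistics.stdev).
    """
    if len(xs) < 2:
        raise ValueError("need at least two observations")
    sd = stdev(xs)
    if sd == 0:
        raise ValueError("sample standard deviation is zero")
    return fmean(xs) / sd


# Hypothetical data, for illustration only.
data = [9.1, 10.4, 11.2, 8.7, 10.6]
print(sample_icv(data))
```

Unlike the coefficient of variation, this ratio stays well defined when the sample mean is close to zero; it fails only when all observations coincide.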

Lehman [4] obtained an exact form for the distribution function of the sample coefficient of variation, which depends on the non-central t-distribution, while Jayakumar and Sulthan [5] derived a density function for the sample coefficient of variation in terms of the confluent hypergeometric distribution. Moreover, they obtained the first two moments of the distribution and proved that the sample coefficient of variation is a biased estimator of the population coefficient of variation. Sharma and Krishna [6] found the asymptotic distribution of the sample inverse coefficient of variation without assuming normality. They derived an asymptotic confidence interval for the population inverse coefficient of variation and then applied the result to make inferences regarding the Gamma and Weibull distributions. Albatineh et al. [7] examined, via Monte Carlo simulation, the performance of the asymptotic confidence interval for a wide class of underlying distributions: normal, lognormal, χ² (chi-squared), Gamma, and Weibull. Gulha et al. [8] used simulation to compare several confidence intervals for the population coefficient of variation based on parametric, non-parametric, and modified methods. Their objective was to compare the performance of the existing and newly proposed methods. Banik and Kibria [9] also used simulation to study various confidence intervals for the population coefficient of variation under several classes of symmetric and skewed distributions. They also included some proposed bootstrap interval estimators for the coefficient of variation. Inference for the coefficient of variation in these studies is therefore limited to parametric methods or the standard bootstrap. Wang et al. [10] used non-parametric methods based on empirical likelihood and a modified jackknife empirical likelihood method to construct confidence intervals for the coefficient of variation. They also proposed bootstrap procedures for calibrating the test statistics.

In this paper, we propose a sequential procedure for estimating the population inverse coefficient of variation of the normal distribution and show that sequential estimation provides better results than the classical methods.

Problem Setting

Suppose we desire to construct a confidence interval for η such that

$$P\left(|\hat{\eta}_n - \eta| \le d\right) \ge 1 - \alpha, \tag{1}$$

where d (> 0) and 0 < α < 1 are predetermined constants. That is, the half-width of the interval is d, and the coverage probability is at least 100(1 − α)%.

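It was shown in Yousef and Hamdy [11], Corollary 2, parts (i) and (ii), that as n → ∞, $\sqrt{n}\,(\hat{\eta}_n - \eta) \xrightarrow{D} N(0,\, 1 + \eta^2/2)$, where “D” denotes convergence in distribution. It follows that

$$P\left(|\hat{\eta}_n - \eta| \le d\right) \approx 2\Phi\left(\frac{d\sqrt{n}}{\sqrt{1 + \eta^2/2}}\right) - 1 \ge 1 - \alpha, \tag{2}$$

where Φ(·) is the cumulative distribution function of N(0, 1) and a is the upper α/2 cut-off point of N(0, 1). Solving Equation (2) for n provides

$$n \ge \frac{a^2}{d^2}\left(1 + \frac{\eta^2}{2}\right) = n^*. \tag{3}$$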

If η is known, then (3) is the optimal fixed sample size required to solve (1) uniformly for all μ ∈ ℝ and σ > 0. However, since η is unknown, Dantzig [12] showed that no fixed-sample-size procedure can satisfy (1); multistage sequential sampling procedures are required instead. In this paper, we use Hall's three-stage sequential sampling procedure.
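To make the role of the optimal fixed sample size concrete, the sketch below computes it in Python under one explicit assumption: that the asymptotic variance of √n times the sample inverse coefficient of variation under normality is 1 + η²/2, so that n* = (a/d)²(1 + η²/2) with a the upper α/2 point of N(0, 1). The function names are hypothetical, and the numerical inputs are illustrative rather than taken from the study.

```python
import math
from statistics import NormalDist


def optimal_sample_size(eta, d, alpha=0.05):
    """Fixed sample size giving half-width d and coverage 1 - alpha.

    Assumes the asymptotic variance factor 1 + eta**2 / 2 (normal data).
    """
    a = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 point of N(0, 1)
    return (a / d) ** 2 * (1 + eta ** 2 / 2)


def half_width_for(n_star, eta, alpha=0.05):
    """Inverse relation: the half-width d that makes n_star optimal."""
    a = NormalDist().inv_cdf(1 - alpha / 2)
    return a * math.sqrt((1 + eta ** 2 / 2) / n_star)


# Illustrative values: eta = 2 and a target optimal sample size of 96.
d = half_width_for(96, 2.0)
print(d, optimal_sample_size(2.0, d))
```

Because n* depends on the unknown η, the value can only be evaluated like this when η is given, which is what motivates the multistage procedures discussed next.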

Before we review Hall's three-stage procedure [13, 14], we summarize the customary asymptotic measures through which one judges the quality of inference, as presented in the literature. These asymptotic measures help in comparing different methods of multistage sampling.

Let N be the final random sample size generated by a multistage sampling procedure, and let n* be as in (3). Then the multistage procedure is said to be (i) first-order asymptotically efficient if E(N/n*) → 1 as λ → ∞, while it is (ii) second-order asymptotically efficient if, as λ → ∞, E(N − n*) is bounded by a finite number unrelated to n*, in the sense of Ghosh and Mukhopadhyay [15].

Let I_N be the fixed-width confidence interval constructed via a multistage procedure. Then the procedure is (iii) consistent or exactly consistent if P(η ∈ I_N) ≥ 1 − α uniformly for all μ and σ, while it is (iv) asymptotically consistent if P(η ∈ I_N) → 1 − α as λ → ∞, in the sense of Stein [16], Mukhopadhyay [17], and Chow and Robbins [18].

Let R_N be the multistage risk encountered in estimating η by the corresponding sample measure, and let R_{n*} be the optimal fixed-sample-size risk had η been known. Then the procedure is (v) first-order asymptotically risk efficient if R_N/R_{n*} → 1 as λ → ∞, while it (vi) has bounded asymptotic second-order regret if, as λ → ∞, R_N − R_{n*} is bounded by a finite number, in the sense of Ghosh and Mukhopadhyay [15]. For more details about the procedures, see Mukhopadhyay and de Silva ([19], Ch. 6).

In addition to the above asymptotic measures, we address other factors for comparison: the practical implementation in real-life problems, the insensitivity to changes in the underlying distribution, and the sensitivity in detecting changes in the parameter under consideration.

Three-Stage Sequential Procedure

Stein [16, 20] and Cox [21] introduced the two-stage procedure for solving (1) regarding the population normal mean. The two-stage procedure attains consistency but leads to oversampling; in other words, it is asymptotically inefficient. To overcome this deficiency, Anscombe [22], Ray [23], and Chow and Robbins [18] proposed the purely sequential procedure, which attains efficiency and asymptotic consistency but is time-consuming because observations are taken one at a time. As a compromise, Hall [13] introduced the three-stage procedure to achieve two primary objectives: the operational savings made possible by sampling in batches and the asymptotic efficiency attained by purely sequential sampling. The procedure is based on three stages, as we describe later. It combines the efficiency of the one-by-one purely sequential procedure of Anscombe and of Chow and Robbins with the operational savings of sampling in bulk through Stein's group sampling technique. It is thus a trade-off between the efficiency of the purely sequential procedure and the ease of implementation of the two-stage procedure. The procedure attains all the properties above except exact consistency.

Mukhopadhyay [24] made further developments to the three-stage sampling by focusing on higher-order moments of the stopping variable. Hamdy [25] extended Hall's results and proposed a three-stage sampling point estimation procedure to estimate the normal mean while Liu [26] extended Hall's results to tackle hypothesis-testing problems for the normal mean.

Yousef [27, 28] tackle the three-stage fixed-width confidence interval for the mean of a continuous distribution where but unknown under two cases; the first when the explicit form of the underlying function is known and the second when the underlying distribution can be approximated by Edgeworth series of order two. Heuristically, he showed that the kurtosis of the underlying distribution mainly influences the performance of the asymptotic coverage probability. He studied the asymptotic characteristics of each confidence interval and discussed the sensitivity of the three-stage procedure as the underlying distribution departs away from normality. Son et al. [29] proposed a triple sampling sequential procedure, which yields both a fixed-width confidence interval and a hypothesis testing for the normal mean while controlling Type II error probability. Yousef [28] extended their results to a wider class of underlying continuous distributions. Both Son et al. [29] and Yousef [28] provided second-order approximations to the characteristic operating function of the inference. See also Hamdy et al. [30].

For a complete list of three-stage estimation, see Ghosh et al. [31].

In this paper, we use the three-stage procedure to generate inference for the population inverse coefficient of variation η based on (3).

The Pilot-Stage: take a pilot sample of size m from the normal distribution and calculate the sample mean, sample variance, and the sample inverse coefficient of variation.

The Main-Study Stage: let [x] denote the largest integer not exceeding x, and let γ, 0 < γ < 1, be the design factor. The stage depends on the stopping rule

The Fine-Tuning Stage: Apply the rule

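Once the procedure terminates, we propose $\bar{X}_N$, $S_N$, and $\hat{\eta}_N = \bar{X}_N / S_N$ as estimates of μ, σ, and η, respectively. The 100(1 − α)% fixed-width confidence interval of η is $I_N = (\hat{\eta}_N - d,\ \hat{\eta}_N + d)$.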
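The three stages can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the FORTRAN code used in the study: it assumes Hall-type rules of the form T = max{m, [γλĝ] + 1} and N = max{T, [λĝ] + 1}, with λ = (a/d)² and the plug-in factor ĝ = 1 + η̂²/2, and it omits any correction term in the fine-tuning stage. The names `three_stage`, `plug_in_factor`, and `draw` are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean, stdev


def plug_in_factor(xs):
    """Assumed variance factor 1 + eta_hat**2 / 2 evaluated at the current sample."""
    eta_hat = fmean(xs) / stdev(xs)
    return 1 + eta_hat ** 2 / 2


def three_stage(draw, m, d, gamma=0.5, alpha=0.05):
    """One run of a three-stage procedure; draw() returns a single observation."""
    a = NormalDist().inv_cdf(1 - alpha / 2)
    lam = (a / d) ** 2
    xs = [draw() for _ in range(m)]                            # pilot stage
    T = max(m, math.floor(gamma * lam * plug_in_factor(xs)) + 1)
    xs += [draw() for _ in range(T - m)]                       # main-study stage
    N = max(T, math.floor(lam * plug_in_factor(xs)) + 1)
    xs += [draw() for _ in range(N - T)]                       # fine-tuning stage
    eta_hat = fmean(xs) / stdev(xs)
    return N, eta_hat, (eta_hat - d, eta_hat + d)


rng = random.Random(1)
N, eta_hat, interval = three_stage(lambda: rng.gauss(10, 5), m=15, d=0.35)
print(N, eta_hat, interval)
```

Sampling happens in at most three batches, which is the operational saving over one-at-a-time sampling; the design factor γ controls how much of the estimated optimal sample size is collected before the final adjustment.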

Review of Sequential Estimation of the Population Inverse Coefficient of Variation

Regarding sequential estimation of the population inverse coefficient of variation of the normal distribution, Chaturvedi and Rani [32] developed a purely sequential procedure to find a fixed-width confidence interval estimation for the inverse coefficient of variation of the normal distribution. They showed mathematically that the proposed procedure attains asymptotic efficiency and consistency in the sense of Chow and Robbins [18] without any numerical or simulation results.

Later, Yousef and Hamdy [33] tackled the same problem using Hall's three-stage sequential procedure. They found a unified optimal sample size that handles both point and interval estimation for the population normal mean. As an application, they found, through Monte Carlo simulation, the asymptotic coverage probability of the population inverse coefficient of variation and the asymptotic regret under the squared-error loss function with linear sampling cost. The results showed that the three-stage procedure attains asymptotic efficiency and consistency in the sense of Chow and Robbins [18]. Recently, Yousef and Hamdy [11] reconsidered the same problem theoretically, using an optimal sample size of the form (3), so that the stopping rule depends directly on the population inverse coefficient of variation. They found a compact form for the asymptotic coverage probability of the population inverse coefficient of variation, as well as the asymptotic regret under a squared-error loss function plus linear sampling cost. Moreover, they found the characteristic operating function for a simple hypothesis against a shift that may occur in the population inverse coefficient of variation. They showed mathematically that the three-stage procedure attains asymptotic efficiency, while under proper choices of γ (the design factor) and α the procedure attains consistency. Collectively, the three-stage procedure attains the nominal value only asymptotically. In both cases, the asymptotic regret takes negative values.

To our knowledge, none of the existing papers in the sequential estimation literature uses Monte Carlo simulations to examine the performance of the three-stage procedure for inference on the normal inverse coefficient of variation using the optimal sample size defined in (3).

In this paper, we continue the research on estimating the population inverse coefficient of variation of the normal distribution by examining the performance of the procedure under (3) and verifying the theoretical results found by Yousef and Hamdy [11] under moderate sample sizes. We estimate all the parameters of concern: the final sample size N, the population mean μ, the population variance σ², and the population inverse coefficient of variation η. We tackle two estimation problems: first, the point estimation problem under the squared-error loss function plus linear sampling cost, and second, the fixed-width confidence estimation problem under an optimal sample size controlled against the type II error probability. We also discuss the sensitivity of the procedure in detecting a potential shift in the population inverse coefficient of variation under both uncontrolled and controlled optimal sample sizes. Finally, we study the sensitivity of the procedure as the underlying distribution departs from normality, considering the t distribution with different degrees of freedom (leptokurtic) and the beta distribution with different parameters (platykurtic). We use Monte Carlo simulations for this study using Microsoft Developer Studio software.

Sequential Inference for the Population Inverse Coefficient of Variation

Point Estimation Problem

Consider the loss incurred by estimating the population inverse coefficient of variation η by its customary estimate, the sample inverse coefficient of variation, given by

$$L_n(A) = A\,(\hat{\eta}_n - \eta)^2 + c\,n, \tag{6}$$

where A is a known constant, and c is the cost per unit sample. The risk associated with (6) is

Minimizing (7) with respect to n yields

where n0 is the optimal fixed sample size required for estimating η.

Now, if we set Equation (3) equal to Equation (8), we find the optimal sample size needed to perform both point and confidence interval estimation for η with fixed width 2d and coverage probability at least 100(1 − α)%. That is, the constant A should be chosen such that

As d → 0, A → ∞. For more details regarding A, see [33].

The optimal risk is . While the three-stage sequential risk is

The asymptotic regret, which is the difference between the risks of using the three-stage procedure minus the optimal risk, would be

While the asymptotic relative risk (efficiency ratio) is the sequential risk relative to the optimal risk, that is

Now, if , then Equation (11) provides a negative regret see Martinsek [34] while Equation (12) yields ν(d) < 1. This implies that the three-stage procedure provides a better estimation than the optimal had η been known.

The Asymptotic Coverage Probability of the Population Inverse Coefficient of Variation

Recall the three-stage sampling confidence interval of the inverse coefficient variation, the asymptotic coverage probability of η is

The results of Anscombe [35] provide that as λ → ∞ independent of the random variable N = m, m + 1, m + 2, …. Thus,

Constructing a Fixed-Width Confidence Interval With Controlled Type II Error Probability for the Population Inverse Coefficient of Variation

There is a close relationship between statistical hypothesis testing and confidence intervals in the sense that they can serve similar inference objectives. Confidence intervals, however, provide more information than their hypothesis-testing counterparts; see Tukey [36]. They signify, by their length, the precision of estimation and the direction of error. Moreover, confidence intervals show which parameter values would not be rejected if they were hypothesized as null values. Therefore, the sensitivity of confidence intervals in detecting shifts in the true parameter value η0 becomes a crucial issue in ensuring the quality of inference.

Son et al. [29], Costanza et al. [37] were the first who brought up the idea for the normal mean, while Hamdy [38] considered the idea for estimating the location parameter of the exponential distribution.

From a practical standpoint, this issue is essential when constructing quality control charts to monitor the mean quality of service or production. We formulate the following hypotheses:

Both hypotheses make statements about the population value of the test statistic and are mutually exclusive. The null hypothesis H0 asserts that no shift in the actual population inverse coefficient of variation has occurred, against the alternative hypothesis Ha, which asserts that the actual inverse coefficient of variation has shifted by a distance k measured in units of d.

The probability of not detecting a shift, given that the shift has already occurred, can be assessed by the type II error probability βkc.

Since the process is equally likely to commit a type II error above or below the centerline, we consider only a positive shift from the actual parameter value η0.

Let τ be the probability of committing a type II error. Our objective is to control this probability. We do so by finding the characteristic operating curve, which gives the probability of acceptance for various possible values of η1. The minimum sample size required to control both α and τ is

where b is the upper τ point of N(0, 1). For more details, see Nelson [39, 40].

The second-order approximation of the controlled characteristic operating function under Equations (14) and (16) as λ → ∞

The uncontrolled case occurs by setting b = 0 in (16) to give βk.
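For intuition about the uncontrolled case, the following Python sketch evaluates a first-order, fixed-sample normal approximation to the probability that the interval of half-width d around the estimate still covers η0 when the true value has shifted to η0 + kd. This is not the second-order approximation in (17); it assumes the sample size is fixed at (a/d)²(1 + η0²/2) and that the estimate is approximately normal with variance (1 + η1²/2)/n at the shifted value η1. The function name `oc_first_order` and the inputs are hypothetical.

```python
from statistics import NormalDist


def oc_first_order(k, eta0, d, alpha=0.05):
    """Approximate P(|eta_hat - eta0| <= d) when the true value is eta0 + k*d.

    First-order sketch only: fixed sample size tuned to eta0, normal
    approximation with assumed variance factor 1 + eta**2 / 2.
    """
    z = NormalDist()
    a = z.inv_cdf(1 - alpha / 2)
    n = (a / d) ** 2 * (1 + eta0 ** 2 / 2)       # sample size tuned to eta0
    eta1 = eta0 + k * d                          # shifted parameter value
    sd = ((1 + eta1 ** 2 / 2) / n) ** 0.5        # approximate sd of eta_hat at eta1
    return z.cdf((d - k * d) / sd) - z.cdf((-d - k * d) / sd)


# Shift grid used in the simulation study: k = 0(0.1)1, 1.5 and 2.
grid = [i / 10 for i in range(11)] + [1.5, 2.0]
for k in grid:
    print(k, round(oc_first_order(k, eta0=2.0, d=0.35), 4))
```

At k = 0 the value equals the nominal coverage 1 − α by construction, and it falls as the shift grows, which is the qualitative behavior the operating characteristic function is meant to capture.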

Monte Carlo Simulation

Since the sequential results are asymptotic, it is worthwhile to evaluate the above quantities through Monte Carlo simulations. We do so by writing FORTRAN codes and running them using Microsoft Developer Studio software.

Simulation Methodology

The simulation proceeds as follows. Fix the values of m, γ, α, and n*.

  • A. Generate the i-th sample of size m ≥ 8 from the normal distribution, and compute the sample mean, the sample variance, and the sample inverse coefficient of variation as initial point estimates of μ, σ², and η, respectively.

  • B. Apply Equation (4) and compute the numerical value of T.

    • If T ≤ m, then we have enough observations, and thus the experiment terminates. In this case, the estimates are those computed from the pilot sample.

    • If T > m, sample T − m extra observations, say X_{m+1}, X_{m+2}, …, X_T, and augment the sample in (A) with them to obtain a sample of size T. Then compute the updated statistics for the parameters μ, σ, and η.

  • C. Apply Equation (5) and compute N.

    • If N ≤ T, sampling is terminated, and the final estimates are those computed from the T observations.

    • If N > T, further observations are needed. Sample the difference N − T, say X_{T+1}, X_{T+2}, …, X_N, and augment the previous T observations with them. The updated sample is of size N, from which the new estimates are computed.

Upon termination, record the resultant sample size, the simulated mean, the simulated standard deviation, and the simulated inverse coefficient of variation for i = 1, 2, …, L.

  • D. As a result, we obtain L recorded observations of each of these four quantities.

  • E. Calculate the estimated means for N, μ, σ, and η, each as the average of the recorded values over the L replicates:

    • the estimated mean sample size,

    • the estimated mean of the sample mean,

    • the estimated mean of the sample standard deviation, and

    • the estimated mean of the sample inverse coefficient of variation.

  • F. Compute the simulated standard error of each of the four estimated means.

  • G. Compute the simulated regret.

  • H. Compute the simulated relative risk.

  • I. Compute the simulated coverage probability.

  • J. Compute the simulated controlled operating characteristic function for k = 0(0.1)1, 1.5, and 2.

The study covers two points: the performance of the procedure at fixed m, and the performance of the procedure as m varies over m = 8, 10, 15, and 20.
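The replicate summaries in the steps above can be sketched as follows. This Python fragment is illustrative only: it assumes that the standard error of an estimated mean is the replicate standard deviation divided by √L and that the coverage probability is the proportion of replicates whose interval of half-width d contains η. The function name `summarize` and the toy replicate values are hypothetical.

```python
import math
from statistics import fmean, stdev


def summarize(replicates, eta, d):
    """Summarize L independent runs given as (N_i, eta_hat_i) pairs."""
    L = len(replicates)
    sizes = [n for n, _ in replicates]
    etas = [e for _, e in replicates]
    return {
        "mean_N": fmean(sizes),
        "se_N": stdev(sizes) / math.sqrt(L),        # assumed standard-error formula
        "mean_eta": fmean(etas),
        "se_eta": stdev(etas) / math.sqrt(L),
        # proportion of intervals eta_hat +/- d that contain the true eta
        "coverage": sum(abs(e - eta) <= d for e in etas) / L,
    }


# Toy replicates for illustration; a real study would use many more runs.
toy = [(10, 1.9), (12, 2.1), (14, 2.6)]
print(summarize(toy, eta=2.0, d=0.35))
```

The same pattern extends to the sample mean and the sample standard deviation by adding their recorded values to each replicate tuple.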

Simulation Experiment and Results

To conduct the simulation study, a series of L = 50,000 replications was generated from N(μ, σ²), with μ = 10 and σ = 0.5, 1, 2, 10/3, 5, 20/3, 10, and 20, which give η = 20, 10, 5, 3, 2, 1.5, 1, and 0.5.

The optimal sample sizes are chosen to represent small, medium, and large sample sizes, that is,

n* = 24, 43, 61, 76, 96, 125, 171, 246, and 500.

The design factor is chosen to be γ = 0.5, and the pilot sample sizes are taken as m = 8, 10, 15, and 20, representing small to moderate pilot samples. For brevity, we consider α = 5%, which gives a = 1.96.

Table 1 reports the simulation results at m = 15 as the optimal sample size increases. We note the following:

  • Regarding the final sample size N: the simulated mean sample size exceeds n* for all values of n*, and the difference between the two shrinks relative to n* as the optimal sample size increases. The simulated standard errors increase as n* increases.

  • Regarding the population mean: the simulated mean converges asymptotically to the population mean. That is, the sample mean is an asymptotically unbiased estimator of μ. The standard errors decrease as n* increases.

  • Regarding the population standard deviation: the estimates converge asymptotically to the population standard deviation. That is, the sample standard deviation is asymptotically unbiased for σ. The standard errors decrease as n* increases.

  • The simulated regret takes negative values, which indicates that the three-stage procedure provides estimates of the population inverse coefficient of variation that are better than those of the optimal fixed-sample procedure had η been known.

  • Regarding the relative risk, the simulation results reveal that

    • for fixed n*, the simulated values decrease slightly as η decreases;

    • for fixed η, the simulated values decrease as n* increases;

    • at n* = 500, the simulated relative risk converges to 0.759 at η = 20 and 10 and approaches 0.752 at η = 1. Even for η < 1, it converges to 0.752.

  • Overall, the simulated relative risk converges asymptotically to nearly 0.75. This implies that the sequential risk is about 25% less than the optimal risk.

  • Regarding the simulated coverage probability: the three-stage procedure attains the desired nominal value asymptotically (asymptotic consistency). The coverage improves as n* increases. Figure 1 shows the performance of the coverage probability as the optimal sample size increases at μ = 10 as σ increases up to 20, in other words, as η decreases.

Figure 1

Table 1

n* | mean N | SE | mean μ̂ | SE | mean σ̂ | SE | mean η̂ | SE | regret | relative risk | coverage

μ = 10, σ = 0.5, η = 20
24 | 35.02 | 0.032 | 9.999 | 0.0005 | 0.5142 | 0.0003 | 19.811 | 0.0120 | −0.99 | 0.982 | 0.9643
43 | 46.92 | 0.053 | 9.998 | 0.0004 | 0.5168 | 0.0003 | 19.671 | 0.0110 | −17.96 | 0.792 | 0.9051
61 | 62.91 | 0.071 | 9.999 | 0.0003 | 0.5082 | 0.0002 | 19.874 | 0.0086 | −28.69 | 0.766 | 0.9406
76 | 77.79 | 0.083 | 9.999 | 0.0003 | 0.5053 | 0.0002 | 19.930 | 0.0075 | −36.22 | 0.762 | 0.9444
96 | 98.10 | 0.096 | 9.999 | 0.0002 | 0.5040 | 0.0002 | 19.949 | 0.0065 | −45.89 | 0.761 | 0.9483
125 | 127.53 | 0.114 | 9.999 | 0.0002 | 0.5030 | 0.0001 | 19.961 | 0.0057 | −59.96 | 0.760 | 0.9465
171 | 174.49 | 0.140 | 10.000 | 0.0002 | 0.5020 | 0.0001 | 19.977 | 0.0049 | −81.95 | 0.759 | 0.9487
246 | 250.61 | 0.176 | 10.000 | 0.0001 | 0.5014 | 0.0001 | 19.983 | 0.0040 | −118.35 | 0.760 | 0.9500
500 | 509.10 | 0.302 | 10.000 | 0.0001 | 0.5008 | 0.0001 | 19.988 | 0.0028 | −240.94 | 0.759 | 0.9509

μ = 10, σ = 1.0, η = 10
24 | 35.09 | 0.033 | 9.992 | 0.0010 | 1.0300 | 0.0007 | 9.889 | 0.0061 | −0.95 | 0.976 | 0.9620
43 | 46.91 | 0.052 | 9.991 | 0.0008 | 1.0347 | 0.0006 | 9.820 | 0.0055 | −18.01 | 0.791 | 0.9025
61 | 62.73 | 0.070 | 9.996 | 0.0006 | 1.0161 | 0.0005 | 9.936 | 0.0043 | −28.87 | 0.764 | 0.9420
76 | 77.58 | 0.082 | 9.997 | 0.0005 | 1.0112 | 0.0004 | 9.958 | 0.0038 | −36.47 | 0.761 | 0.9447
96 | 97.96 | 0.096 | 9.998 | 0.0005 | 1.0087 | 0.0003 | 9.967 | 0.0033 | −46.10 | 0.760 | 0.9457
125 | 127.41 | 0.113 | 9.998 | 0.0004 | 1.0064 | 0.0003 | 9.976 | 0.0029 | −60.13 | 0.760 | 0.9471
171 | 174.14 | 0.137 | 9.999 | 0.0003 | 1.0045 | 0.0002 | 9.984 | 0.0025 | −82.38 | 0.760 | 0.9477
246 | 250.21 | 0.177 | 9.999 | 0.0003 | 1.0036 | 0.0002 | 9.984 | 0.0020 | −118.93 | 0.758 | 0.9497
500 | 508.80 | 0.299 | 10.000 | 0.0002 | 1.0016 | 0.0001 | 9.994 | 0.0014 | −241.25 | 0.759 | 0.9498

μ = 10, σ = 2.0, η = 5
24 | 34.82 | 0.032 | 9.973 | 0.0021 | 2.0521 | 0.0013 | 4.951 | 0.0031 | −1.18 | 0.977 | 0.9643
43 | 46.80 | 0.051 | 9.971 | 0.0015 | 2.0597 | 0.0012 | 4.922 | 0.0028 | −18.00 | 0.789 | 0.9030
61 | 62.77 | 0.069 | 9.985 | 0.0012 | 2.0285 | 0.0009 | 4.970 | 0.0022 | −28.79 | 0.764 | 0.9434
76 | 77.62 | 0.079 | 9.989 | 0.0010 | 2.0199 | 0.0008 | 4.981 | 0.0019 | −36.39 | 0.760 | 0.9440
96 | 97.86 | 0.092 | 9.991 | 0.0009 | 2.0151 | 0.0007 | 4.985 | 0.0017 | −46.14 | 0.759 | 0.9470
125 | 127.30 | 0.108 | 9.994 | 0.0008 | 2.0107 | 0.0006 | 4.991 | 0.0015 | −60.16 | 0.760 | 0.9486
171 | 174.14 | 0.135 | 9.995 | 0.0007 | 2.0078 | 0.0005 | 4.993 | 0.0013 | −82.32 | 0.759 | 0.9484
246 | 250.33 | 0.169 | 9.996 | 0.0006 | 2.0055 | 0.0004 | 4.995 | 0.0010 | −118.66 | 0.759 | 0.9499
500 | 508.56 | 0.289 | 9.999 | 0.0004 | 2.0023 | 0.0003 | 4.999 | 0.0007 | −241.30 | 0.758 | 0.9501

μ = 10, σ = 10/3, η = 3
24 | 34.57 | 0.031 | 9.936 | 0.0034 | 3.4161 | 0.0022 | 2.965 | 0.0020 | −1.44 | 0.970 | 0.9599
43 | 46.21 | 0.048 | 9.925 | 0.0026 | 3.4229 | 0.0020 | 2.948 | 0.0018 | −18.57 | 0.786 | 0.8943
61 | 62.50 | 0.065 | 9.966 | 0.0020 | 3.3721 | 0.0015 | 2.983 | 0.0014 | −29.02 | 0.762 | 0.9436
76 | 77.60 | 0.075 | 9.973 | 0.0018 | 3.3590 | 0.0013 | 2.990 | 0.0012 | −36.36 | 0.761 | 0.9466
96 | 97.82 | 0.085 | 9.983 | 0.0015 | 3.3540 | 0.0011 | 2.992 | 0.0011 | −46.13 | 0.760 | 0.9461
125 | 127.25 | 0.101 | 9.988 | 0.0013 | 3.3483 | 0.0010 | 2.995 | 0.0009 | −60.16 | 0.759 | 0.9473
171 | 173.62 | 0.123 | 9.990 | 0.0011 | 3.3453 | 0.0008 | 2.995 | 0.0008 | −82.86 | 0.758 | 0.9510
246 | 249.50 | 0.155 | 9.992 | 0.0010 | 3.3411 | 0.0007 | 2.997 | 0.0007 | −119.46 | 0.758 | 0.9479
500 | 506.62 | 0.262 | 9.996 | 0.0007 | 3.3377 | 0.0005 | 2.998 | 0.0005 | −243.41 | 0.757 | 0.9498

μ = 10, σ = 5.0, η = 2
24 | 34.24 | 0.027 | 9.863 | 0.0052 | 5.0953 | 0.0035 | 1.974 | 0.0015 | −1.74 | 0.963 | 0.9588
43 | 45.55 | 0.044 | 9.870 | 0.0039 | 5.0882 | 0.0028 | 1.970 | 0.0013 | −19.06 | 0.778 | 0.8950
61 | 62.33 | 0.058 | 9.942 | 0.0030 | 5.0348 | 0.0021 | 1.992 | 0.0010 | −29.08 | 0.760 | 0.9479
76 | 77.31 | 0.066 | 9.954 | 0.0026 | 5.0295 | 0.0019 | 1.992 | 0.0009 | −36.63 | 0.760 | 0.9481
96 | 97.60 | 0.076 | 9.963 | 0.0023 | 5.0215 | 0.0016 | 1.995 | 0.0008 | −46.32 | 0.760 | 0.9496
125 | 126.80 | 0.089 | 9.975 | 0.0020 | 5.0199 | 0.0014 | 1.995 | 0.0007 | −60.66 | 0.758 | 0.9510
171 | 173.24 | 0.106 | 9.981 | 0.0017 | 5.0142 | 0.0012 | 1.996 | 0.0006 | −83.22 | 0.757 | 0.9498
246 | 248.95 | 0.139 | 9.987 | 0.0014 | 5.0091 | 0.0010 | 1.998 | 0.0005 | −119.99 | 0.756 | 0.9507
500 | 505.14 | 0.219 | 9.993 | 0.0010 | 5.0045 | 0.0007 | 1.999 | 0.0003 | −244.80 | 0.755 | 0.9509

μ = 10, σ = 20/3, η = 1.5
24 | 33.73 | 0.025 | 9.814 | 0.0068 | 6.7614 | 0.0048 | 1.481 | 0.0013 | −2.19 | 0.954 | 0.9524
43 | 44.92 | 0.040 | 9.837 | 0.0052 | 6.7341 | 0.0035 | 1.482 | 0.0011 | −19.55 | 0.772 | 0.9193
61 | 62.18 | 0.051 | 9.921 | 0.0039 | 6.6958 | 0.0027 | 1.494 | 0.0008 | −29.20 | 0.761 | 0.9488
76 | 77.27 | 0.058 | 9.935 | 0.0035 | 6.6919 | 0.0024 | 1.494 | 0.0007 | −36.63 | 0.759 | 0.9487
96 | 97.47 | 0.067 | 9.951 | 0.0031 | 6.6841 | 0.0022 | 1.496 | 0.0007 | −46.40 | 0.758 | 0.9510
125 | 126.71 | 0.079 | 9.962 | 0.0027 | 6.6813 | 0.0019 | 1.497 | 0.0006 | −60.67 | 0.757 | 0.9495
171 | 172.75 | 0.092 | 9.968 | 0.0023 | 6.6780 | 0.0016 | 1.497 | 0.0005 | −83.67 | 0.755 | 0.9482
246 | 248.35 | 0.116 | 9.982 | 0.0019 | 6.6755 | 0.0013 | 1.498 | 0.0004 | −120.55 | 0.755 | 0.9516
500 | 504.01 | 0.188 | 9.991 | 0.0013 | 6.6708 | 0.0009 | 1.499 | 0.0003 | −245.87 | 0.754 | 0.9504

μ = 10, σ = 10, η = 1.0
24 | 33.06 | 0.021 | 9.701 | 0.0101 | 10.0631 | 0.0075 | 0.983 | 0.0011 | −2.82 | 0.942 | 0.9477
43 | 44.27 | 0.033 | 9.852 | 0.0070 | 10.0175 | 0.0049 | 0.995 | 0.0008 | −20.05 | 0.766 | 0.9509
61 | 62.28 | 0.041 | 9.900 | 0.0057 | 10.0093 | 0.0041 | 0.997 | 0.0007 | −29.04 | 0.762 | 0.9518
76 | 77.29 | 0.045 | 9.924 | 0.0052 | 10.0135 | 0.0036 | 0.998 | 0.0006 | −36.53 | 0.760 | 0.9499
96 | 97.33 | 0.052 | 9.942 | 0.0046 | 10.0149 | 0.0032 | 0.998 | 0.0006 | −46.48 | 0.758 | 0.9499
125 | 126.48 | 0.059 | 9.941 | 0.0040 | 10.0056 | 0.0028 | 0.997 | 0.0005 | −60.88 | 0.756 | 0.9517
171 | 172.66 | 0.072 | 9.963 | 0.0034 | 10.0045 | 0.0024 | 0.999 | 0.0004 | −83.67 | 0.754 | 0.9515
246 | 247.67 | 0.087 | 9.970 | 0.0029 | 10.0051 | 0.0020 | 0.998 | 0.0003 | −121.21 | 0.754 | 0.9509
500 | 502.20 | 0.135 | 9.986 | 0.0020 | 10.0025 | 0.0014 | 0.999 | 0.0002 | −247.66 | 0.752 | 0.9508

μ = 10, σ = 20, η = 0.5
24 | 32.13 | 0.016 | 9.54 | 0.0207 | 19.910 | 0.0158 | 0.4893 | 0.0011 | −3.62 | 0.925 | 0.9357
43 | 44.31 | 0.020 | 9.81 | 0.0133 | 19.941 | 0.0096 | 0.4972 | 0.0007 | −19.97 | 0.768 | 0.9536
61 | 62.32 | 0.023 | 9.86 | 0.0113 | 19.957 | 0.0081 | 0.4980 | 0.0006 | −28.96 | 0.763 | 0.9529
76 | 77.39 | 0.027 | 9.90 | 0.0101 | 19.958 | 0.0072 | 0.4989 | 0.0005 | −36.38 | 0.761 | 0.9518
96 | 97.35 | 0.029 | 9.91 | 0.0091 | 19.966 | 0.0065 | 0.4990 | 0.0005 | −46.42 | 0.758 | 0.9506
125 | 126.42 | 0.034 | 9.93 | 0.0080 | 19.976 | 0.0057 | 0.4991 | 0.0004 | −60.86 | 0.757 | 0.9519
171 | 172.47 | 0.040 | 9.95 | 0.0068 | 19.987 | 0.0048 | 0.4993 | 0.0004 | −83.81 | 0.755 | 0.9513
246 | 247.56 | 0.049 | 9.97 | 0.0057 | 19.995 | 0.0040 | 0.4996 | 0.0003 | −121.22 | 0.754 | 0.9491
500 | 501.59 | 0.071 | 9.98 | 0.0040 | 19.992 | 0.0028 | 0.4998 | 0.0002 | −248.19 | 0.752 | 0.9490

Three-stage sequential estimation of the population inverse coefficient of variation under (3) at m = 15, γ = 0.5, α = 5%, η = 20, 10, 5, 3, 2, 1.5, 1.0, and 0.5.

n* indicates the optimal sample size.

We now examine the impact of increasing m on the performance of the procedure, as reported in Table 2. We note the following:

  • At n* = 24 and 43, the absolute difference between the optimal sample size and the simulated final sample size increases as the pilot sample increases, while at n* = 61, 76, 96, and 125, the absolute difference decreases slightly. At n* = 171, 246, and 500, the absolute difference decreases significantly, approaching the desired optimal sample size. The corresponding standard errors decrease.

  • For fixed n*, the simulated mean approaches the population mean as the pilot sample increases. The estimates also improve, with smaller standard errors, as the optimal sample size increases.

  • The simulated standard deviation approaches the population standard deviation as the optimal sample size increases. The bias decreases as the optimal sample size increases.

  • In general, the simulated regret decreases as the pilot sample increases, except at n* = 24.

  • Figure 2 demonstrates the performance of the simulated coverage probability as m increases.

Figure 2

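Table 2

n* | mean N | SE | mean μ̂ | SE | mean σ̂ | SE | mean η̂ | SE | regret | coverage

m = 8
24 | 27.61 | 0.049 | 9.77 | 0.0054 | 5.151 | 0.0039 | 1.9497 | 0.0018 | −8.48 | 0.9044
43 | 46.25 | 0.076 | 9.92 | 0.0035 | 5.049 | 0.0025 | 1.9883 | 0.0012 | −18.16 | 0.9494
61 | 65.27 | 0.095 | 9.94 | 0.0029 | 5.033 | 0.0021 | 1.9920 | 0.0010 | −26.15 | 0.9484
76 | 81.24 | 0.124 | 9.96 | 0.0026 | 5.028 | 0.0018 | 1.9930 | 0.0009 | −32.69 | 0.9504
96 | 102.47 | 0.145 | 9.97 | 0.0023 | 5.022 | 0.0016 | 1.9943 | 0.0008 | −41.47 | 0.9521
125 | 133.46 | 0.189 | 9.97 | 0.0020 | 5.015 | 0.0014 | 1.9962 | 0.0007 | −53.95 | 0.9519
171 | 182.36 | 0.252 | 9.98 | 0.0017 | 5.013 | 0.0012 | 1.9962 | 0.0006 | −74.11 | 0.9521
246 | 261.55 | 0.345 | 9.99 | 0.0014 | 5.008 | 0.0010 | 1.9981 | 0.0005 | −107.36 | 0.9520
500 | 531.03 | 0.696 | 9.99 | 0.0010 | 5.004 | 0.0007 | 1.9993 | 0.0003 | −218.84 | 0.9516

m = 10
24 | 28.61 | 0.034 | 9.75 | 0.0053 | 5.182 | 0.0039 | 1.9335 | 0.0017 | −7.61 | 0.9272
43 | 45.14 | 0.056 | 9.92 | 0.0035 | 5.046 | 0.0025 | 1.9895 | 0.0012 | −19.26 | 0.9490
61 | 63.65 | 0.074 | 9.95 | 0.0029 | 5.032 | 0.0021 | 1.9927 | 0.0010 | −27.75 | 0.9496
76 | 79.01 | 0.083 | 9.95 | 0.0026 | 5.027 | 0.0018 | 1.9925 | 0.0009 | −34.92 | 0.9499
96 | 99.54 | 0.101 | 9.96 | 0.0023 | 5.023 | 0.0016 | 1.9933 | 0.0008 | −44.43 | 0.9490
125 | 129.47 | 0.123 | 9.97 | 0.0020 | 5.019 | 0.0014 | 1.9952 | 0.0007 | −57.98 | 0.9491
171 | 177.06 | 0.167 | 9.98 | 0.0017 | 5.012 | 0.0012 | 1.9975 | 0.0006 | −79.34 | 0.9506
246 | 254.77 | 0.235 | 9.99 | 0.0014 | 5.009 | 0.0010 | 1.9979 | 0.0005 | −114.15 | 0.9512
500 | 516.49 | 0.431 | 9.99 | 0.0010 | 5.004 | 0.0007 | 1.9993 | 0.0003 | −233.39 | 0.9515

m = 15
24 | 34.24 | 0.030 | 9.87 | 0.0051 | 5.093 | 0.0035 | 1.9751 | 0.0015 | −1.73 | 0.9566
43 | 45.45 | 0.044 | 9.87 | 0.0039 | 5.086 | 0.0028 | 1.9701 | 0.0013 | −19.16 | 0.8942
61 | 62.32 | 0.058 | 9.94 | 0.0030 | 5.036 | 0.0021 | 1.9913 | 0.0010 | −29.10 | 0.9491
76 | 77.33 | 0.066 | 9.96 | 0.0026 | 5.031 | 0.0019 | 1.9923 | 0.0009 | −36.61 | 0.9490
96 | 97.53 | 0.076 | 9.96 | 0.0023 | 5.024 | 0.0016 | 1.9935 | 0.0008 | −46.42 | 0.9479
125 | 126.87 | 0.088 | 9.97 | 0.0020 | 5.016 | 0.0014 | 1.9958 | 0.0007 | −60.56 | 0.9503
171 | 173.10 | 0.105 | 9.98 | 0.0017 | 5.013 | 0.0012 | 1.9963 | 0.0006 | −83.36 | 0.9498
246 | 249.06 | 0.137 | 9.99 | 0.0014 | 5.006 | 0.0010 | 1.9987 | 0.0005 | −119.79 | 0.9488
500 | 504.94 | 0.223 | 9.99 | 0.0010 | 5.005 | 0.0007 | 1.9988 | 0.0003 | −245.01 | 0.9519

m = 20
24 | 41.09 | 0.038 | 9.97 | 0.0048 | 4.977 | 0.0033 | 2.0443 | 0.0016 | 5.70 | 0.9511
43 | 49.65 | 0.037 | 9.83 | 0.0040 | 5.123 | 0.0028 | 1.9470 | 0.0013 | −15.29 | 0.9252
61 | 62.66 | 0.054 | 9.92 | 0.0031 | 5.048 | 0.0022 | 1.9847 | 0.0011 | −28.87 | 0.9353
76 | 77.11 | 0.063 | 9.95 | 0.0026 | 5.028 | 0.0019 | 1.9932 | 0.0009 | −36.81 | 0.9494
96 | 97.20 | 0.071 | 9.97 | 0.0023 | 5.021 | 0.0016 | 1.9956 | 0.0008 | −46.69 | 0.9481
125 | 126.31 | 0.082 | 9.97 | 0.0020 | 5.017 | 0.0014 | 1.9961 | 0.0007 | −61.10 | 0.9490
171 | 172.18 | 0.097 | 9.98 | 0.0017 | 5.013 | 0.0012 | 1.9965 | 0.0006 | −84.28 | 0.9519
246 | 247.38 | 0.118 | 9.99 | 0.0014 | 5.010 | 0.0010 | 1.9975 | 0.0005 | −121.57 | 0.9493
500 | 502.43 | 0.178 | 9.99 | 0.0010 | 5.005 | 0.0007 | 1.9989 | 0.0003 | −247.51 | 0.9481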

The impact of increasing the pilot sample m on the procedure at μ = 10, σ = 5, η = 2.

n* indicates the optimal sample size.
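The first observation on Table 2 can be checked with a few lines of arithmetic. The Python sketch below is only an illustration and is not part of the original Fortran study: it copies the simulated final sample sizes for three values of n* from Table 2 and computes the absolute difference from n* for each pilot size m. The names `final_size` and `abs_gap` are hypothetical.

```python
# Simulated final sample sizes copied from Table 2 (mu = 10, sigma = 5).
# Outer keys are the optimal sample size n*; inner keys are the pilot size m.
final_size = {
    24: {8: 27.61, 10: 28.61, 15: 34.24, 20: 41.09},
    171: {8: 182.36, 10: 177.06, 15: 173.10, 20: 172.18},
    500: {8: 531.03, 10: 516.49, 15: 504.94, 20: 502.43},
}


def abs_gap(n_star):
    """Absolute difference between the simulated final size and n*, per m."""
    return {m: round(abs(n_bar - n_star), 2)
            for m, n_bar in final_size[n_star].items()}


for n_star in final_size:
    print(n_star, abs_gap(n_star))
```

For n* = 24 the gap grows with m, while for n* = 171 and 500 it shrinks, which is the pattern described in the first bullet.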

Table 3 shows the asymptotic coverage probability and the asymptotic characteristic operating function under the uncontrolled optimal sample size given in (3). The procedure attains the desired nominal value only asymptotically. As k increases, the simulated characteristic operating function decreases and approaches zero at k = 2. The other part of Table 3 shows the results under the controlled optimal sample size given in (16) against type II error probability. Here the procedure exceeds the desired nominal value, even for small optimal sample sizes, and the simulated characteristic operating function decreases sharply, reaching 0.0000 at k = 1.5 and k = 2 for every n*. This indicates that controlling the confidence interval against type II error probability makes the procedure more sensitive to an appreciable shift. Figures 3 and 4 illustrate these observations.

Table 3

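Uncontrolled optimal sample size defined in (3)

| n* | 24 | 43 | 61 | 76 | 96 | 125 | 171 | 246 | 500 |
|---|---|---|---|---|---|---|---|---|---|
| Coverage probability | 0.9588 | 0.895 | 0.9479 | 0.9481 | 0.9496 | 0.951 | 0.9498 | 0.9507 | 0.9509 |

Uncontrolled characteristic operating values

| k \ n* | 24 | 43 | 61 | 76 | 96 | 125 | 171 | 246 | 500 |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.5088 | 0.5146 | 0.5126 | 0.5114 | 0.5085 | 0.5050 | 0.5094 | 0.5055 | 0.5061 |
| 0.1 | 0.4355 | 0.4394 | 0.4367 | 0.4298 | 0.4324 | 0.4319 | 0.4281 | 0.4271 | 0.4241 |
| 0.2 | 0.3625 | 0.3633 | 0.3595 | 0.3595 | 0.3536 | 0.3538 | 0.3552 | 0.3503 | 0.3542 |
| 0.3 | 0.2949 | 0.2971 | 0.2881 | 0.2879 | 0.2912 | 0.2839 | 0.2847 | 0.2812 | 0.2782 |
| 0.4 | 0.2314 | 0.2403 | 0.2275 | 0.2265 | 0.2263 | 0.2230 | 0.2226 | 0.2216 | 0.2193 |
| 0.5 | 0.1808 | 0.1895 | 0.1696 | 0.1704 | 0.1677 | 0.1684 | 0.1684 | 0.1683 | 0.1672 |
| 0.6 | 0.1317 | 0.1517 | 0.1292 | 0.1267 | 0.1261 | 0.1251 | 0.1250 | 0.1226 | 0.1188 |
| 0.7 | 0.0929 | 0.1238 | 0.0912 | 0.0929 | 0.0917 | 0.0909 | 0.0876 | 0.0884 | 0.0877 |
| 0.8 | 0.0620 | 0.1034 | 0.0655 | 0.0658 | 0.0617 | 0.0621 | 0.0630 | 0.0597 | 0.0600 |
| 0.9 | 0.0412 | 0.0919 | 0.0449 | 0.0417 | 0.0419 | 0.0418 | 0.0415 | 0.0407 | 0.0401 |
| 1.0 | 0.0231 | 0.0852 | 0.0306 | 0.0306 | 0.0279 | 0.0285 | 0.0276 | 0.0277 | 0.0267 |
| 1.5 | 0.0006 | 0.0121 | 0.0027 | 0.0024 | 0.0022 | 0.0021 | 0.0020 | 0.0022 | 0.0018 |
| 2.0 | 0.0000 | 0.0007 | 0.0005 | 0.0001 | 0.0001 | 0.0000 | 0.0000 | 0.0001 | 0.0001 |

Controlled optimal sample size defined in (15), τ = 0.05

| n* | 24 | 43 | 61 | 76 | 96 | 125 | 171 | 246 | 500 |
|---|---|---|---|---|---|---|---|---|---|
| Coverage probability | 0.9999 | 0.9994 | 0.9993 | 0.9999 | 0.9999 | 0.9999 | 0.9999 | 0.9998 | 0.9999 |

Controlled characteristic operating values

| k \ n* | 24 | 43 | 61 | 76 | 96 | 125 | 171 | 246 | 500 |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.5102 | 0.5160 | 0.5096 | 0.5109 | 0.05059 | 0.5055 | 0.5076 | 0.5050 | 0.5017 |
| 0.1 | 0.3628 | 0.3622 | 0.3583 | 0.3587 | 0.3555 | 0.3552 | 0.3565 | 0.3562 | 0.3475 |
| 0.2 | 0.2313 | 0.2369 | 0.2243 | 0.2244 | 0.2250 | 0.2214 | 0.2213 | 0.2199 | 0.2166 |
| 0.3 | 0.2313 | 0.2369 | 0.2243 | 0.2244 | 0.2250 | 0.2214 | 0.2213 | 0.2199 | 0.2166 |
| 0.4 | 0.1343 | 0.1536 | 0.1301 | 0.1286 | 0.1250 | 0.1256 | 0.1254 | 0.1246 | 0.1230 |
| 0.5 | 0.0619 | 0.1045 | 0.0645 | 0.0644 | 0.0632 | 0.0612 | 0.0616 | 0.0621 | 0.0604 |
| 0.6 | 0.0242 | 0.0848 | 0.0291 | 0.0278 | 0.0285 | 0.0272 | 0.0260 | 0.0275 | 0.0265 |
| 0.7 | 0.0064 | 0.0419 | 0.0125 | 0.0125 | 0.0112 | 0.0101 | 0.0101 | 0.0102 | 0.0095 |
| 0.8 | 0.0013 | 0.0189 | 0.0044 | 0.0042 | 0.0045 | 0.0037 | 0.0038 | 0.0034 | 0.0033 |
| 0.9 | 0.0002 | 0.0067 | 0.0018 | 0.0015 | 0.0012 | 0.0011 | 0.0011 | 0.0009 | 0.0009 |
| 1.0 | 0.0000 | 0.0020 | 0.0008 | 0.0004 | 0.0004 | 0.0004 | 0.0003 | 0.0004 | 0.0003 |
| 1.5 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| 2.0 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |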

The robustness of three–stage at m = 15, γ = 0.5, α = 0.05, μ = 10, σ = 5, η = 2.0.

n* indicates the optimal sample size.
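To make the comparison in Table 3 concrete, the following Python sketch (an illustration only, not the author's Fortran code) copies the n* = 500 column of the uncontrolled and controlled characteristic operating values and reports the smallest tabulated shift k at which each falls below a threshold. The threshold of 0.05 and the names `k_values`, `uncontrolled`, `controlled`, and `first_k_below` are assumptions chosen for this example.

```python
# Shifts k and characteristic operating values for n* = 500, copied from Table 3.
k_values = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.5, 2.0]
uncontrolled = [0.5061, 0.4241, 0.3542, 0.2782, 0.2193, 0.1672, 0.1188,
                0.0877, 0.0600, 0.0401, 0.0267, 0.0018, 0.0001]
controlled = [0.5017, 0.3475, 0.2166, 0.2166, 0.1230, 0.0604, 0.0265,
              0.0095, 0.0033, 0.0009, 0.0003, 0.0000, 0.0000]


def first_k_below(values, threshold):
    """Smallest tabulated k whose operating value is below the threshold."""
    for k, value in zip(k_values, values):
        if value < threshold:
            return k
    return None  # threshold never reached on the tabulated grid


# Illustrative threshold (an assumption, not a quantity from the paper).
print("uncontrolled:", first_k_below(uncontrolled, 0.05))
print("controlled:  ", first_k_below(controlled, 0.05))
```

On this column the controlled values drop below the illustrative threshold at a smaller shift than the uncontrolled values, consistent with the greater sensitivity reported for the controlled case.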

Figure 3

Figure 4

The Sensitivity of the Normal-Based Three-Stage Procedure for Underlying Distribution

Suppose we need to estimate the population inverse coefficient of variation for a class of non-normal distributions using the normal-based optimal sample size in (3). How sensitive is the three-stage procedure to this departure from normality? To examine this, and without loss of generality, we consider two families of underlying distributions: Student's t(r) with r = 5, 10, and 20 degrees of freedom, and the beta family with Beta(0.5, 0.5) and Beta(1, 1). Table 4 shows the asymptotic results for the t distribution with the selected degrees of freedom. We obtain good estimates for all parameters, and the procedure satisfies all asymptotic measures except consistency. The simulated relative risk converges asymptotically to 0.752. Table 5 shows the results for the beta distribution. Here the three-stage procedure provides coverage probabilities that exceed the prescribed nominal value. This may be due to the structural behavior of the beta distribution, since it belongs to the uniform power series functions. Again, the simulated relative risk approaches 0.752.
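As a reference point for reading Tables 4 and 5, the population mean, standard deviation, and inverse coefficient of variation of the two families follow from standard moment formulas. The Python sketch below is an illustration and not part of the original study; the function names are hypothetical, and the inverse coefficient of variation is taken as the mean divided by the standard deviation.

```python
import math


def t_moments(r):
    """Mean and standard deviation of Student's t with r > 2 degrees of freedom."""
    return 0.0, math.sqrt(r / (r - 2))


def beta_moments(a, b):
    """Mean and standard deviation of the Beta(a, b) distribution."""
    mean = a / (a + b)
    variance = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, math.sqrt(variance)


def inverse_cv(mean, sd):
    """Inverse coefficient of variation, taken here as mean / sd."""
    return mean / sd


for r in (5, 10, 20):
    mean, sd = t_moments(r)
    print(f"t({r}): mean = {mean:.4f}, sd = {sd:.3f}, "
          f"inverse CV = {inverse_cv(mean, sd):.4f}")

for a, b in ((0.5, 0.5), (1.0, 1.0)):
    mean, sd = beta_moments(a, b)
    print(f"Beta({a}, {b}): mean = {mean:.4f}, sd = {sd:.3f}, "
          f"inverse CV = {inverse_cv(mean, sd):.3f}")
```

The population standard deviations are about 1.291, 1.118, and 1.054 for t(5), t(10), and t(20), and the population inverse coefficients of variation are about 1.414 for Beta(0.5, 0.5) and 1.732 for Beta(1, 1). The simulated values at n* = 500 in Tables 4 and 5 lie close to these figures.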

Table 4

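| r | n* | Final sample size | SE | Mean | SE | Standard deviation | SE | Inverse CV | SE | Regret | Relative risk | Coverage probability |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5 | 24 | 30.60 | 0.012 | 0.0003 | 0.0015 | 1.246 | 0.0016 | −0.0001 | 0.0012 | −4.99 | 0.894 | 0.8665 |
| 5 | 43 | 44.61 | 0.006 | 0.0000 | 0.0009 | 1.272 | 0.0010 | −0.0002 | 0.0007 | −19.64 | 0.771 | 0.9540 |
| 5 | 61 | 62.61 | 0.006 | 0.0004 | 0.0007 | 1.276 | 0.0009 | 0.0003 | 0.0006 | −28.64 | 0.765 | 0.9538 |
| 5 | 76 | 77.62 | 0.006 | −0.0007 | 0.0007 | 1.280 | 0.0008 | −0.0005 | 0.0005 | −36.13 | 0.762 | 0.9524 |
| 5 | 96 | 97.63 | 0.006 | −0.0006 | 0.0006 | 1.282 | 0.0007 | −0.0004 | 0.0005 | −46.12 | 0.760 | 0.9515 |
| 5 | 125 | 126.62 | 0.006 | −0.0008 | 0.0005 | 1.283 | 0.0006 | −0.0006 | 0.0004 | −60.63 | 0.758 | 0.9520 |
| 5 | 171 | 172.63 | 0.006 | −0.0009 | 0.0004 | 1.285 | 0.0006 | −0.0007 | 0.0003 | −83.62 | 0.756 | 0.9511 |
| 5 | 246 | 247.63 | 0.006 | 0.0000 | 0.0004 | 1.287 | 0.0005 | 0.0000 | 0.0003 | −121.12 | 0.754 | 0.9504 |
| 5 | 500 | 501.48 | 0.006 | −0.0001 | 0.0003 | 1.288 | 0.0003 | −0.0001 | 0.0002 | −248.27 | 0.752 | 0.9494 |
| 10 | 24 | 30.37 | 0.013 | 0.0005 | 0.0012 | 1.093 | 0.0011 | 0.0003 | 0.0012 | −5.22 | 0.894 | 0.8656 |
| 10 | 43 | 44.61 | 0.006 | −0.0009 | 0.0007 | 1.108 | 0.0006 | −0.0006 | 0.0007 | −19.64 | 0.771 | 0.9528 |
| 10 | 61 | 62.63 | 0.006 | −0.0007 | 0.0006 | 1.110 | 0.0005 | −0.0005 | 0.0006 | −28.63 | 0.765 | 0.9518 |
| 10 | 76 | 77.62 | 0.006 | −0.0008 | 0.0006 | 1.114 | 0.0005 | −0.0007 | 0.0005 | −36.14 | 0.762 | 0.9531 |
| 10 | 96 | 97.62 | 0.006 | −0.0009 | 0.0005 | 1.115 | 0.0004 | −0.0008 | 0.0005 | −46.13 | 0.760 | 0.9518 |
| 10 | 125 | 126.62 | 0.006 | −0.0004 | 0.0004 | 1.115 | 0.0004 | −0.0004 | 0.0004 | −60.64 | 0.757 | 0.9506 |
| 10 | 171 | 172.63 | 0.006 | −0.0005 | 0.0004 | 1.116 | 0.0003 | −0.0004 | 0.0003 | −83.63 | 0.755 | 0.9503 |
| 10 | 246 | 247.63 | 0.006 | −0.0003 | 0.0003 | 1.116 | 0.0003 | −0.0002 | 0.0003 | −121.13 | 0.754 | 0.9510 |
| 10 | 500 | 501.48 | 0.007 | −0.0006 | 0.0002 | 1.117 | 0.0002 | −0.0005 | 0.0002 | −248.27 | 0.752 | 0.9502 |
| 20 | 24 | 30.60 | 0.011 | −0.0005 | 0.0012 | 1.035 | 0.0009 | −0.0004 | 0.0012 | −4.99 | 0.896 | 0.8667 |
| 20 | 43 | 44.61 | 0.006 | −0.0004 | 0.0007 | 1.047 | 0.0005 | −0.0004 | 0.0007 | −19.65 | 0.772 | 0.9543 |
| 20 | 61 | 62.62 | 0.006 | 0.0001 | 0.0006 | 1.049 | 0.0005 | 0.0002 | 0.0006 | −28.64 | 0.765 | 0.9522 |
| 20 | 76 | 77.61 | 0.006 | 0.0000 | 0.0005 | 1.051 | 0.0004 | 0.0000 | 0.0005 | −36.14 | 0.762 | 0.9535 |
| 20 | 96 | 97.62 | 0.006 | −0.0001 | 0.0005 | 1.051 | 0.0004 | −0.0001 | 0.0005 | −46.13 | 0.760 | 0.9526 |
| 20 | 125 | 126.63 | 0.006 | −0.0011 | 0.0004 | 1.052 | 0.0003 | −0.0010 | 0.0004 | −60.63 | 0.757 | 0.9518 |
| 20 | 171 | 172.62 | 0.006 | −0.0004 | 0.0004 | 1.052 | 0.0003 | −0.0005 | 0.0003 | −83.63 | 0.755 | 0.9511 |
| 20 | 246 | 247.62 | 0.006 | 0.0003 | 0.0003 | 1.053 | 0.0002 | 0.0003 | 0.0003 | −121.13 | 0.754 | 0.9508 |
| 20 | 500 | 501.49 | 0.006 | −0.0002 | 0.0002 | 1.054 | 0.0002 | −0.0002 | 0.0002 | −248.26 | 0.752 | 0.9503 |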

Three-stage estimation of the inverse coefficient of variation for underlying t(r), r = 5, 10, 20, at m = 15, γ = 0.5, α = 5%.

n* indicates the optimal sample size.

Table 5

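| Distribution | n* | Final sample size | SE | Mean | SE | Standard deviation | SE | Inverse CV | SE | Regret | Relative risk | Coverage probability |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Beta(0.5, 0.5) | 24 | 33.04 | 0.022 | 0.4891 | 0.0004 | 0.356 | 0.0001 | 1.381 | 0.0010 | −3.08 | 0.936 | 0.9886 |
| Beta(0.5, 0.5) | 43 | 43.51 | 0.033 | 0.4934 | 0.0003 | 0.354 | 0.0001 | 1.399 | 0.0008 | −21.05 | 0.755 | 0.9788 |
| Beta(0.5, 0.5) | 61 | 61.42 | 0.040 | 0.4960 | 0.0002 | 0.354 | 0.0001 | 1.405 | 0.0006 | −30.12 | 0.752 | 0.9866 |
| Beta(0.5, 0.5) | 76 | 76.44 | 0.045 | 0.4966 | 0.0002 | 0.354 | 0.0001 | 1.406 | 0.0006 | −37.64 | 0.752 | 0.9869 |
| Beta(0.5, 0.5) | 96 | 96.42 | 0.051 | 0.4975 | 0.0002 | 0.354 | 0.0001 | 1.408 | 0.0005 | −47.64 | 0.752 | 0.9871 |
| Beta(0.5, 0.5) | 125 | 125.49 | 0.064 | 0.4978 | 0.0001 | 0.354 | 0.0001 | 1.409 | 0.0004 | −62.09 | 0.752 | 0.9867 |
| Beta(0.5, 0.5) | 171 | 171.59 | 0.072 | 0.4985 | 0.0001 | 0.354 | 0.0000 | 1.410 | 0.0004 | −84.99 | 0.752 | 0.9875 |
| Beta(0.5, 0.5) | 246 | 246.76 | 0.088 | 0.4989 | 0.0001 | 0.354 | 0.0000 | 1.411 | 0.0003 | −122.34 | 0.752 | 0.9872 |
| Beta(0.5, 0.5) | 500 | 501.53 | 0.136 | 0.4995 | 0.0001 | 0.354 | 0.0000 | 1.413 | 0.0002 | −248.48 | 0.751 | 0.9874 |
| Beta(1, 1) | 24 | 33.38 | 0.022 | 0.4913 | 0.0003 | 0.291 | 0.0001 | 1.701 | 0.0012 | −2.72 | 0.940 | 0.9873 |
| Beta(1, 1) | 43 | 43.88 | 0.035 | 0.4939 | 0.0002 | 0.290 | 0.0001 | 1.713 | 0.0009 | −20.72 | 0.759 | 0.9624 |
| Beta(1, 1) | 61 | 61.50 | 0.044 | 0.4969 | 0.0002 | 0.289 | 0.0001 | 1.723 | 0.0007 | −30.02 | 0.754 | 0.9855 |
| Beta(1, 1) | 76 | 76.59 | 0.050 | 0.4975 | 0.0001 | 0.289 | 0.0001 | 1.725 | 0.0007 | −37.43 | 0.754 | 0.9852 |
| Beta(1, 1) | 96 | 96.56 | 0.057 | 0.4980 | 0.0001 | 0.289 | 0.0001 | 1.726 | 0.0006 | −47.48 | 0.753 | 0.9856 |
| Beta(1, 1) | 125 | 125.71 | 0.066 | 0.4984 | 0.0001 | 0.289 | 0.0001 | 1.727 | 0.0005 | −61.83 | 0.753 | 0.9856 |
| Beta(1, 1) | 171 | 171.80 | 0.077 | 0.4988 | 0.0001 | 0.289 | 0.0000 | 1.728 | 0.0004 | −84.75 | 0.752 | 0.9853 |
| Beta(1, 1) | 246 | 247.07 | 0.097 | 0.4992 | 0.0001 | 0.289 | 0.0000 | 1.730 | 0.0004 | −121.96 | 0.752 | 0.9859 |
| Beta(1, 1) | 500 | 501.75 | 0.154 | 0.4996 | 0.0001 | 0.289 | 0.0000 | 1.731 | 0.0003 | −248.32 | 0.752 | 0.9854 |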

Three-stage estimation of the inverse coefficient of variation for underlying Beta(0.5, 0.5) and Beta(1, 1), at m = 15, γ = 0.5, α = 5%.

n* indicates the optimal sample size.

Conclusion

We examined the performance of the three-stage procedure for estimating the population inverse coefficient of variation of the normal distribution. We estimated all parameters of concern and found that the three-stage procedure attains efficiency and asymptotic consistency as the width of the interval approaches zero. When the confidence intervals are controlled against type II error probabilities, the procedure provides coverage probabilities that exceed the prescribed nominal value and becomes more sensitive to any potential shift in the population inverse coefficient of variation. Regarding departures of the underlying distribution from normality, we found that the three-stage procedure is robust for normal-like distributions.

Statements

Data availability statement

All datasets generated for this study are included in the article/supplementary material.

Author contributions

AY conducted the entire study, wrote the Fortran programs and ran them using Microsoft Developer Studio, and wrote the draft and the final version of the paper.

Conflict of interest

The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Summary

Keywords

asymptotic consistency, asymptotic efficiency, inverse coefficient of variation, Monte Carlo simulation, normal distribution, squared-error loss function, three-stage procedure

Citation

Yousef A (2020) Performance of Three-Stage Sequential Estimation of the Normal Inverse Coefficient of Variation Under Type II Error Probability: A Monte Carlo Simulation Study. Front. Phys. 8:71. doi: 10.3389/fphy.2020.00071

Received

06 January 2020

Accepted

02 March 2020

Published

20 March 2020

Volume

8 - 2020

Edited by

Dumitru Baleanu, University of Craiova, Romania

Reviewed by

Kolade Matthew Owolabi, Federal University of Technology, Nigeria; Zakia Hammouch, Moulay Ismail University, Morocco

Copyright

*Correspondence: Ali Yousef

This article was submitted to Mathematical Physics, a section of the journal Frontiers in Physics

Disclaimer

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
