ORIGINAL RESEARCH article

Front. Appl. Math. Stat., 15 July 2022

Sec. Statistics and Probability

Volume 8 - 2022 | https://doi.org/10.3389/fams.2022.952142

A New Tobit Ridge-Type Estimator of the Censored Regression Model With Multicollinearity Problem

  • 1. Department of Mathematics, Al-Aqsa University, Gaza, Palestine

  • 2. Department of Applied Statistics and Econometrics, Faculty of Graduate Studies for Statistical Research, Cairo University, Giza, Egypt

  • 3. Department of Quantitative Analysis, College of Business Administration, King Saud University, Riyadh, Saudi Arabia

  • 4. Electrical Engineering Department, Faculty of Engineering & Technology, Future University in Egypt, New Cairo, Egypt

Abstract

In the censored regression model, the Tobit maximum likelihood estimator is unstable and inefficient when multicollinearity is present. To reduce the effects of this problem, the Tobit ridge and the Tobit Liu estimators have been proposed. This study proposes a new Tobit estimator called the Tobit new ridge-type (TNRT) estimator. The TNRT estimator is compared theoretically with the Tobit maximum likelihood, the Tobit ridge, and the Tobit Liu estimators via the mean squared error criterion. Moreover, we performed a Monte Carlo simulation to study the performance of the TNRT estimator relative to these estimators. Finally, we used the Mroz dataset to confirm the theoretical and simulation results.

Introduction

The limited dependent variables (LDVs) in regression models are defined as censored, discrete, and truncated outcomes. Tobin [1] introduced the Tobit model for a censored dependent variable, which belongs to the LDV family, and Goldberger [2] gave it its current name. Censored data arise when information is lost on the dependent variable, while truncated data arise when information is lost on both the dependent and the independent variables. In this study, we used the standard Tobit regression model, which is the Type 1 model of the Tobit models (Types 1–5) categorized by Amemiya [3], to deal with censored data and their estimation. The censored normal regression model, which is called the Tobit model, is used because the least squares estimator (LSE) gives biased and inconsistent results for censored data. Therefore, the Tobit maximum likelihood estimator (TMLE) is used to estimate the parameters and to carry out statistical inference. When the explanatory (independent) variables are correlated with one another, a problem called multicollinearity arises, and this problem is often ignored in censored regression models. Multicollinearity makes the Tobit maximum likelihood estimates of the regression coefficients unreliable and unstable because the mean squared error (MSE) values of these estimates are inflated. For this case, Khalaf et al. [4] examined the effects of multicollinearity on the TMLE and introduced the Tobit ridge estimator (TRE). Then, Alhusseini and Odah [5] introduced a Tobit principal component estimator, and Toker et al. [6] introduced a Tobit Liu estimator (TLE).

In the linear regression model (LRM), several alternatives to the LSE have been proposed for estimating the regression coefficients under multicollinearity because, in this case, the LSE has large variances, may give wrong signs, and becomes unstable. The most popular are the ridge estimator of Hoerl and Kennard [7] and the Liu estimator of Liu [8]. Recently, Kibria and Lukman [9] proposed a new ridge-type estimator (NRTE). The NRTE has been extended to different regression models in several studies, such as Lukman et al. [10], Lukman et al. [11], Akram et al. [12], Dawoud and Abonazel [13], Awwad et al. [14], and Abonazel et al. [15]. Multicollinearity is as serious a problem in the Tobit model as it is in the LRM. Biased estimators for handling multicollinearity have long been studied in the LRM, but they have received little investigation in the Tobit model. Studies of biased estimators as alternatives to the TMLE for reducing the effects of multicollinearity on the regression coefficients in the Tobit model are therefore needed. In this context, the TRE introduced by Khalaf et al. [4] and the TLE introduced by Toker et al. [6] were the starting points of biased estimation in the Tobit model. In this study, we define the Tobit NRTE (TNRTE). We also study the theoretical properties of the TNRTE under the MSE criterion and compare it with the TMLE, the TRE, and the TLE.

The rest of this study is organized as follows: the Methodology section defines the Tobit regression model and presents the TNRTE and its theoretical properties. The Monte Carlo Simulation section presents the simulation study. The Real-Life Data section analyzes the Mroz dataset. The Conclusions section gives concluding remarks.

Methodology

Tobit Regression Model

The Tobit regression model is

$$y_i^{*} = x_i\beta + u_i, \quad i = 1, 2, \ldots, n, \tag{1}$$

where $y_i^{*}$ is the latent dependent variable, $x_i$ is the i-th row of the known n × (p + 1) matrix X, and p is the number of explanatory variables. β is the unknown (p + 1) × 1 coefficient vector (when the model contains the intercept β0), and $u_i$ is an independent error term that follows a normal distribution with mean 0 and variance $\sigma^2$. We considered left censoring, where $y_i$ is defined as follows:

$$y_i = \begin{cases} y_i^{*}, & y_i^{*} > 0, \\ 0, & y_i^{*} \le 0. \end{cases} \tag{2}$$

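On the basis of n observations on $y_i$ and $x_i$, we consider the estimation of β and $\sigma^2$. For the model defined in Equation (1), assume that $n_a$ is the number of observations with $y_i = 0$ and that the remaining $n - n_a$ observations have $y_i > 0$, with the non-zero observations of $y_i$ ordered first. The log-likelihood function of the censored data then takes the standard form

$$\ell(\beta, \sigma^2) = \sum_{y_i = 0} \log\left[1 - \Phi\left(\frac{x_i\beta}{\sigma}\right)\right] - \frac{n - n_a}{2}\log\left(2\pi\sigma^2\right) - \frac{1}{2\sigma^2}\sum_{y_i > 0}\left(y_i - x_i\beta\right)^2, \tag{3}$$

where $\Phi(\cdot)$ is the standard normal cumulative distribution function.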
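As a small illustration of the censored log-likelihood described above, the following sketch evaluates it for a Tobit model left-censored at zero. It is only a sketch under stated assumptions: the function name `tobit_loglik`, the list-of-rows layout of `X`, and the numerical floor inside the logarithm are illustrative choices, not part of the original study.

```python
import math


def tobit_loglik(beta, sigma, X, y):
    """Log-likelihood of a Tobit model left-censored at zero.

    X is a list of rows (each row already contains the leading 1 for the
    intercept), y holds the observed responses, beta is the coefficient
    list, and sigma > 0 is the error standard deviation.
    """
    total = 0.0
    for row, yi in zip(X, y):
        xb = sum(b * x for b, x in zip(beta, row))
        if yi <= 0:
            # Censored case: P(y* <= 0) = Phi(-xb / sigma)
            #              = 0.5 * erfc(xb / (sigma * sqrt(2))).
            p = 0.5 * math.erfc(xb / (sigma * math.sqrt(2.0)))
            total += math.log(max(p, 1e-300))  # floor avoids log(0) on underflow
        else:
            # Uncensored case: normal log-density of the residual.
            r = (yi - xb) / sigma
            total += -0.5 * math.log(2.0 * math.pi * sigma ** 2) - 0.5 * r * r
    return total


# Tiny hypothetical example: one censored and one uncensored observation.
X_demo = [[1.0, 0.0], [1.0, 1.0]]
y_demo = [0.0, 2.0]
print(tobit_loglik([0.0, 0.0], 1.0, X_demo, y_demo))
```

Maximizing this function over β and σ (for example by Fisher scoring, as described next) gives the TMLE; the sketch only evaluates the objective.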

The TMLE of β is obtained by setting the derivative of Equation (3) to zero. The resulting equation is not linear in β, so it is solved iteratively by Fisher's scoring method, which uses the expected second derivative. Fisher's scoring method is given as

where is the matrix of the Fisher information which is given at where is β estimate at iteration(r), is β estimate at iteration (r − 1), , D is called as the diagonal matrix and So, the TMLE is written as:

Then, is given as:

Since the TMLE becomes inefficient and unstable when the multicollinearity problem occurs, Khalaf et al. [4] proposed the TRE and Toker et al. [6] proposed the TLE to eliminate the effects of this problem.

The TRE is given iteratively as

and the first step of the TRE is

such that is the first estimate of β, , is given at β(0), the TMLE first step values are as same as that of the TRE, and is the first step of the TMLE. When k = 0, .

The TLE is given iteratively as

and the first step of the TLE is

where the first-step values of the TLE coincide with those of the TMLE if d = 1 (see Amemiya [16], Fair [17], and Toker et al. [6] for more details).

New Ridge-Type Estimator

The usefulness of the NRTE among the one-parameter estimators (RE and LE) in many different regression models and the extension of the one-parameter estimators to the area of the Tobit regression model encouraged us to derive the NRTE in this model as follows:

We extend Equation (3), the log-likelihood function of the censored data, with a penalization term as

where is called a Lagrangian multiplier and c is a constant, and by differentiating J due to β, we got

where .

Taking the second derivative of J with respect to β and then taking the expectation, we obtained the following form for the matrix:

Then, we employed Fisher's scoring method to introduce the TNRTE as:

By using Equation (4), we have the TNRTE in its final form as:

The TNRTE of Equation (15) was obtained iteratively. The first step of the TNRTE is given as follows:

and the first step of the TNRTE is

where the first step values of the TNRTE are same as that of the Tobit LE and is evaluated at β(0) if k = 0, .

Asymptotic MSE Comparisons

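To compare the estimators, we used the MSE criterion. When $\hat{\beta}$ is an estimator of β, the matrix form of the MSE criterion is given as

$$\mathrm{MSEM}(\hat{\beta}) = \mathrm{Cov}(\hat{\beta}) + \mathrm{Bias}(\hat{\beta})\,\mathrm{Bias}(\hat{\beta})',$$

where $\mathrm{Cov}(\hat{\beta})$ is the variance-covariance matrix and $\mathrm{Bias}(\hat{\beta}) = E(\hat{\beta}) - \beta$ is the bias vector of the estimator. Then, the scalar MSE is given by

$$\mathrm{MSE}(\hat{\beta}) = \mathrm{tr}\left[\mathrm{MSEM}(\hat{\beta})\right] = \mathrm{tr}\left[\mathrm{Cov}(\hat{\beta})\right] + \mathrm{Bias}(\hat{\beta})'\,\mathrm{Bias}(\hat{\beta}).$$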

Since the TMLE for the first step is known as an asymptotically unbiased estimator, it means that the asymptotic matrix form of the MSE equals the asymptotic matrix form of the variance-covariance as follows:

The asymptotic MSE matrix form of is given as

The asymptotic MSE matrix form of is given as

The first step TNRTE asymptotic bias and its asymptotic variance-covariance forms are given as follows:

and

Then, the asymptotic MSE matrix form of is given as

Model (1) is written in the canonical form using the orthogonal transformation and the spectral decomposition such that the Fisher matrix form of the first step is given as , where C = [C0, C1, ..., Cp] is called a (p + 1) × (p + 1) orthogonal matrix form and refers to the eigenvectors columns, is called a (p + 1) × (p + 1) diagonal matrix form with the eigenvalues on the diagonal, such that M = XC. The canonical form formula of the asymptotic matrix form and the scalar MSE for , , and are written as follows:

where α = C′β, , , and .
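As a rough numerical illustration of the canonical-form comparison, the sketch below evaluates scalar MSE expressions for the four estimators. It assumes the forms familiar from the linear-model literature carried over to the first-step Tobit setting: the asymptotic covariance of the canonical TMLE is taken to be the inverse of the diagonal eigenvalue matrix, and the TRE, TLE, and TNRTE shrink the j-th component by λ_j/(λ_j + k), (λ_j + d)/(λ_j + 1), and (λ_j − k)/(λ_j + k), respectively. The function names and the example values of the eigenvalues, α, k, and d are hypothetical.

```python
def smse_tmle(lam):
    """Scalar MSE of the asymptotically unbiased TMLE: sum of 1/lambda_j."""
    return sum(1.0 / l for l in lam)


def smse_tre(alpha, lam, k):
    """Ridge-type shrinkage lambda/(lambda + k): variance plus squared bias."""
    var = sum(l / (l + k) ** 2 for l in lam)
    bias2 = k ** 2 * sum(a ** 2 / (l + k) ** 2 for a, l in zip(alpha, lam))
    return var + bias2


def smse_tle(alpha, lam, d):
    """Liu-type shrinkage (lambda + d)/(lambda + 1)."""
    var = sum((l + d) ** 2 / (l * (l + 1.0) ** 2) for l in lam)
    bias2 = (1.0 - d) ** 2 * sum(a ** 2 / (l + 1.0) ** 2 for a, l in zip(alpha, lam))
    return var + bias2


def smse_tnrte(alpha, lam, k):
    """New ridge-type shrinkage (lambda - k)/(lambda + k)."""
    var = sum((l - k) ** 2 / (l * (l + k) ** 2) for l in lam)
    bias2 = 4.0 * k ** 2 * sum(a ** 2 / (l + k) ** 2 for a, l in zip(alpha, lam))
    return var + bias2


# Hypothetical ill-conditioned example: one eigenvalue is close to zero.
lam_demo = [10.0, 1.0, 0.01]
alpha_demo = [0.5, 0.5, 0.5]
print("TMLE :", smse_tmle(lam_demo))
print("TRE  :", smse_tre(alpha_demo, lam_demo, 0.005))
print("TLE  :", smse_tle(alpha_demo, lam_demo, 0.5))
print("TNRTE:", smse_tnrte(alpha_demo, lam_demo, 0.005))
```

Under these assumptions, setting k = 0 (or d = 1) recovers the TMLE value, which matches the first-step remarks made for each estimator above.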

The following lemmas are used in the theoretical comparisons among the above estimators.

Lemma 1: Let F and I be n × n matrices with F > 0 and I > 0 (or I ≥ 0). Then F > I iff $\lambda_{\max}(IF^{-1}) < 1$, where $\lambda_{\max}(IF^{-1})$ is the maximum eigenvalue of the matrix $IF^{-1}$ [18].

Lemma 2: If F is an n × n positive definite matrix, i.e., F > 0, and α is a vector, then F − αα′ > 0 iff $\alpha' F^{-1} \alpha < 1$ [19].

Lemma 3: Suppose αi = Kim, i = 1, 2 are two α linear estimators and suppose , where refers to covariance matrix and , i = 1, 2 [20], then consequently,

iff , where .

Comparisons Among the Estimators

Theorem 1: is superior to iff

Proof : The dispersion difference is:

We observed that is positive definite since for k > 0. By Lemma 3, the proof is completed.

Theorem 2: When , is superior to iff

where

Proof:

where and

It is clear that, for k > 0 and 0 < d < 1, F > 0 and I > 0. It is obvious that F − I > 0 if and only if , where is the maximum eigenvalue of the matrix IF−1. By Lemma 1, the proof is completed.

Theorem 3: is superior to if and only if

where

Proof: The dispersion difference is

We observed that is applicable if and only if . For k > 0, it was observed that . By Lemma 3, the proof is completed.

The Selection of k Parameter of the TNRTE

Using the Kibria and Lukman [9] method, the optimal biasing parameter k of the TNRTE is given as:

and using the unbiased estimates of σ2 and α2, the optimal estimated k of the TNRTE is given as:

A Monte Carlo Simulation

To assess the performance of the proposed TNRTE relative to the other estimators, we conducted simulation experiments at several factor levels. The design follows the techniques of Kibria [21], Yenilmez et al. [22], Khalaf et al. [4], Yenilmez and Kantar [23], Toker et al. [6], and Yenilmez et al. [24]. The degree of correlation (τ) among the explanatory variables is one of the essential factors in the simulation. To vary the correlation, the explanatory variables were generated using the following model:

$$x_{ij} = \left(1 - \tau^2\right)^{1/2} z_{ij} + \tau z_{i,p+1}, \quad i = 1, \ldots, n, \; j = 1, \ldots, p,$$

where the $z_{ij}$ are independent standard normal pseudo-random numbers. The dependent variable is generated using the following equation:

$$y_i^{*} = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_p x_{ip} + u_i, \quad i = 1, \ldots, n,$$

where the $u_i$ are independent and identically distributed pseudo-random numbers from N(0, σ2), and the parameter vector is chosen such that β′β = 1, as in the studies of Dawoud and Abonazel [25], Awwad et al. [26], Awwad et al. [14], Abonazel and Dawoud [27], Algamal and Abonazel [28], Abonazel et al. [15], and Abonazel et al. [29]. The dependent variable is then censored using Equation (2). All factors used in this simulation are stated in Table 1.

Table 1

Factor | Symbol | Design
Censoring level | CL | 5, 25, 50%
Sample size | n | 100, 400, 800
Variance | σ | 0.5, 1, 5
Degree of correlation | τ | 0.85, 0.9, 0.95, 0.99
Number of explanatory variables | p | 4, 8
Number of replicates | MCN | 1,000

Values of factors that are considered in the simulation.
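The data-generating design can be sketched in a few lines. This is a minimal illustration, assuming the widely used construction in which each regressor mixes its own standard normal draw with one shared draw, so that any two regressors have correlation τ². The function name `simulate_tobit`, the seed, and the example coefficient vector are hypothetical, and the sketch does not reproduce how the study tuned the censoring level.

```python
import math
import random


def simulate_tobit(n, p, tau, sigma, beta, rng):
    """Generate one left-censored sample.

    beta has length p + 1 (intercept first). Returns (X, y), where each
    row of X starts with 1.0 for the intercept.
    """
    X, y = [], []
    for _ in range(n):
        z = [rng.gauss(0.0, 1.0) for _ in range(p + 1)]
        # The shared component z[p] induces correlation tau**2 between
        # any two regressors; each regressor keeps unit variance.
        x = [math.sqrt(1.0 - tau ** 2) * z[j] + tau * z[p] for j in range(p)]
        y_star = beta[0] + sum(b * xj for b, xj in zip(beta[1:], x))
        y_star += rng.gauss(0.0, sigma)
        X.append([1.0] + x)
        y.append(max(y_star, 0.0))  # left-censor the latent response at zero
    return X, y


rng = random.Random(2022)
beta_demo = [0.0, 0.5, 0.5, 0.5, 0.5]  # satisfies beta'beta = 1
X_sim, y_sim = simulate_tobit(n=100, p=4, tau=0.95, sigma=1.0, beta=beta_demo, rng=rng)
censored_share = sum(1 for v in y_sim if v == 0.0) / len(y_sim)
print("share of censored observations:", censored_share)
```

Repeating this for each combination of factors in Table 1 and for MCN replicates gives the inputs to the EMSE comparison.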

The estimated biasing parameters of the TRE, the TLE, and the proposed TNRTE used in this simulation study are as follows:

  • The estimated parameter of k for the TRE is considered according to Hoerl and Kennard [7], as

  • The estimated parameter d for the TLE is considered, according to Liu [8] as follows

    when has negative value, Ozkale and Kaciranlar [30] considered the alternative parameter of d as:

  • Following the study of Kibria and Lukman [9], the estimated biasing parameter minimum value and the harmonic-mean of k for the proposed TNRTE are considered as follows:
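The two TNRTE choices named in the last item, a minimum value and a harmonic mean of k, can be sketched as follows. The per-component formula k_j = σ²/(2α_j² + σ²/λ_j) is an assumption taken from the linear-model setting of Kibria and Lukman [9]; whether the Tobit version uses exactly this form is not confirmed by the text here. The function name and the example inputs are hypothetical.

```python
def tnrte_k_candidates(alpha_hat, lam, sigma2_hat):
    """Return (k_min, k_hm) from component-wise candidates.

    Each candidate is k_j = sigma2 / (2 * alpha_j**2 + sigma2 / lambda_j)
    (assumed form). k_min is the smallest candidate and k_hm is the
    harmonic mean of all candidates.
    """
    denoms = [2.0 * a * a + sigma2_hat / l for a, l in zip(alpha_hat, lam)]
    k_each = [sigma2_hat / d for d in denoms]
    k_min = min(k_each)
    # Harmonic mean: p / sum(1 / k_j) = p * sigma2 / sum(denominators).
    k_hm = len(denoms) * sigma2_hat / sum(denoms)
    return k_min, k_hm


k_min, k_hm = tnrte_k_candidates(alpha_hat=[1.0, 0.5], lam=[4.0, 0.5], sigma2_hat=1.0)
print("k_min =", k_min, " k_hm =", k_hm)
```

Because a harmonic mean is never smaller than the minimum of the same values, the harmonic-mean choice always shrinks at least as strongly as the minimum choice under this assumed form.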

To examine the performance of the TMLE, TRE, TLE, and the proposed TNRTE, we computed the estimated MSE (EMSE) as:

$$\mathrm{EMSE}(\hat{\alpha}) = \frac{1}{\mathrm{MCN}} \sum_{l=1}^{\mathrm{MCN}} \left(\hat{\alpha}_l - \alpha\right)'\left(\hat{\alpha}_l - \alpha\right),$$

where $\hat{\alpha}_l$ is the estimate obtained in the l-th replication and α is the true parameter vector. The simulation results (EMSE values) are reported in Tables 2–7.
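A minimal sketch of the EMSE computation follows, assuming it is the average squared Euclidean distance between the replicate estimates and the true parameter vector. The function name `emse` and the toy inputs are hypothetical.

```python
def emse(estimates, alpha_true):
    """Average of (alpha_hat - alpha)'(alpha_hat - alpha) over replications.

    estimates is a list of coefficient lists, one per Monte Carlo replicate.
    """
    total = 0.0
    for est in estimates:
        total += sum((e - a) ** 2 for e, a in zip(est, alpha_true))
    return total / len(estimates)


# Two toy replicates around a true vector of zeros.
print(emse([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]))
```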

Table 2

CL   | n   | τ    | | | | |
0.05 | 100 | 0.85 | 0.08439 | 0.08177 | 0.08024 | 0.07580 | 0.04701
     |     | 0.90 | 0.14477 | 0.14108 | 0.13687 | 0.13340 | 0.09588
     |     | 0.95 | 0.37866 | 0.30013 | 0.28805 | 0.21957 | 0.14546
     |     | 0.99 | 2.12123 | 1.69281 | 1.17409 | 1.36538 | 1.07352
     | 400 | 0.85 | 0.06948 | 0.06873 | 0.06862 | 0.06682 | 0.04976
     |     | 0.90 | 0.03736 | 0.03698 | 0.03664 | 0.03595 | 0.02226
     |     | 0.95 | 0.15997 | 0.15486 | 0.15322 | 0.14427 | 0.10026
     |     | 0.99 | 0.51461 | 0.44249 | 0.41698 | 0.36556 | 0.26895
     | 800 | 0.85 | 0.01500 | 0.01495 | 0.01492 | 0.01482 | 0.01131
     |     | 0.90 | 0.02420 | 0.02403 | 0.02393 | 0.02356 | 0.01506
     |     | 0.95 | 0.05889 | 0.05823 | 0.05779 | 0.05653 | 0.03707
     |     | 0.99 | 0.26189 | 0.24239 | 0.23242 | 0.21132 | 0.13378
0.25 | 100 | 0.85 | 0.15183 | 0.13607 | 0.13891 | 0.11350 | 0.09149
     |     | 0.90 | 0.34139 | 0.23299 | 0.27936 | 0.16125 | 0.13711
     |     | 0.95 | 0.81721 | 0.33112 | 0.47175 | 0.18070 | 0.15651
     |     | 0.99 | 3.07911 | 1.94488 | 1.30970 | 1.30760 | 0.94765
     | 400 | 0.85 | 0.09315 | 0.09179 | 0.09245 | 0.08912 | 0.09653
     |     | 0.90 | 0.14761 | 0.14084 | 0.14458 | 0.13033 | 0.12393
     |     | 0.95 | 0.28787 | 0.21218 | 0.25017 | 0.14216 | 0.10127
     |     | 0.99 | 1.16488 | 0.51121 | 0.64053 | 0.28221 | 0.23849
     | 800 | 0.85 | 0.05474 | 0.05454 | 0.05462 | 0.05407 | 0.06394
     |     | 0.90 | 0.14315 | 0.13700 | 0.14052 | 0.12498 | 0.09854
     |     | 0.95 | 0.11686 | 0.10747 | 0.11202 | 0.09119 | 0.06602
     |     | 0.99 | 1.23499 | 0.81775 | 0.89744 | 0.53421 | 0.35129
0.50 | 100 | 0.85 | 0.94398 | 0.49023 | 0.72343 | 0.34857 | 0.32451
     |     | 0.90 | 0.89523 | 0.37400 | 0.61122 | 0.29265 | 0.28724
     |     | 0.95 | 0.45112 | 0.15818 | 0.25473 | 0.12841 | 0.12884
     |     | 0.99 | 4.57183 | 0.72390 | 0.72960 | 0.39133 | 0.45197
     | 400 | 0.85 | 0.41065 | 0.30470 | 0.38918 | 0.25460 | 0.24267
     |     | 0.90 | 0.36824 | 0.27886 | 0.34462 | 0.22898 | 0.20956
     |     | 0.95 | 0.78458 | 0.39098 | 0.66041 | 0.29614 | 0.27413
     |     | 0.99 | 6.15895 | 3.91878 | 2.45550 | 1.67715 | 0.37830
     | 800 | 0.85 | 0.28978 | 0.27517 | 0.28713 | 0.25792 | 0.24808
     |     | 0.90 | 0.38991 | 0.31360 | 0.37730 | 0.26372 | 0.24913
     |     | 0.95 | 0.32927 | 0.29935 | 0.32278 | 0.28405 | 0.28284
     |     | 0.99 | 0.64817 | 0.29716 | 0.46906 | 0.25371 | 0.24986

Simulation results in case of p = 4 and σ = 0.5.

Table 3

CL   | n   | τ    | | | | |
0.05 | 100 | 0.85 | 0.27994 | 0.22248 | 0.23948 | 0.15818 | 0.09746
     |     | 0.90 | 0.48548 | 0.36945 | 0.38882 | 0.26648 | 0.18523
     |     | 0.95 | 1.06487 | 0.53619 | 0.59676 | 0.29376 | 0.21853
     |     | 0.99 | 6.09446 | 3.40860 | 1.50770 | 1.88323 | 0.79975
     | 400 | 0.85 | 0.12066 | 0.11220 | 0.11558 | 0.09571 | 0.05239
     |     | 0.90 | 0.12223 | 0.11110 | 0.11400 | 0.09042 | 0.04230
     |     | 0.95 | 0.29436 | 0.24058 | 0.25852 | 0.17596 | 0.10436
     |     | 0.99 | 1.21836 | 0.66503 | 0.71062 | 0.40058 | 0.30620
     | 800 | 0.85 | 0.03940 | 0.03824 | 0.03857 | 0.03533 | 0.01791
     |     | 0.90 | 0.06245 | 0.05920 | 0.06033 | 0.05186 | 0.02352
     |     | 0.95 | 0.14561 | 0.13252 | 0.13633 | 0.10842 | 0.05392
     |     | 0.99 | 0.66375 | 0.42483 | 0.47623 | 0.26713 | 0.19121
0.25 | 100 | 0.85 | 0.33660 | 0.21404 | 0.26973 | 0.14257 | 0.12302
     |     | 0.90 | 0.65097 | 0.29570 | 0.45169 | 0.17768 | 0.15457
     |     | 0.95 | 1.35205 | 0.43767 | 0.61171 | 0.19088 | 0.13035
     |     | 0.99 | 6.27692 | 2.95777 | 1.26949 | 1.38631 | 0.60606
     | 400 | 0.85 | 0.12446 | 0.11477 | 0.12118 | 0.10195 | 0.10170
     |     | 0.90 | 0.23884 | 0.19092 | 0.22362 | 0.15172 | 0.13461
     |     | 0.95 | 0.37085 | 0.19966 | 0.30011 | 0.11732 | 0.09775
     |     | 0.99 | 1.80126 | 0.61418 | 0.76889 | 0.25384 | 0.13237
     | 800 | 0.85 | 0.07414 | 0.07139 | 0.07309 | 0.06594 | 0.06361
     |     | 0.90 | 0.17009 | 0.15060 | 0.16434 | 0.12400 | 0.09976
     |     | 0.95 | 0.21158 | 0.16012 | 0.19310 | 0.11050 | 0.08175
     |     | 0.99 | 1.74511 | 0.87871 | 1.07884 | 0.45816 | 0.25214
0.50 | 100 | 0.85 | 1.13850 | 0.51946 | 0.81290 | 0.35420 | 0.33801
     |     | 0.90 | 1.13729 | 0.40701 | 0.68195 | 0.30323 | 0.30289
     |     | 0.95 | 0.85386 | 0.20759 | 0.36866 | 0.13717 | 0.13773
     |     | 0.99 | 5.99679 | 0.90344 | 0.65449 | 0.41361 | 0.43331
     | 400 | 0.85 | 0.43862 | 0.28981 | 0.40717 | 0.24688 | 0.24386
     |     | 0.90 | 0.42703 | 0.27502 | 0.38620 | 0.22024 | 0.20849
     |     | 0.95 | 1.02445 | 0.43173 | 0.79452 | 0.29342 | 0.25947
     |     | 0.99 | 7.23887 | 3.73973 | 2.27123 | 1.30613 | 0.39580
     | 800 | 0.85 | 0.31729 | 0.28246 | 0.31201 | 0.25644 | 0.24980
     |     | 0.90 | 0.43023 | 0.30451 | 0.40978 | 0.25171 | 0.24425
     |     | 0.95 | 0.38920 | 0.30205 | 0.37000 | 0.27884 | 0.27794
     |     | 0.99 | 0.88767 | 0.31974 | 0.54905 | 0.25139 | 0.24847

Simulation results in case of p = 4 and σ = 1.

Table 4

CL   | n   | τ    | | | | |
0.05 | 100 | 0.85 | 5.85274 | 0.39991 | 0.80830 | 0.18888 | 0.19297
     |     | 0.90 | 9.23801 | 0.43827 | 0.75055 | 0.20275 | 0.17161
     |     | 0.95 | 20.26041 | 0.63066 | 0.54674 | 0.75467 | 0.93491
     |     | 0.99 | 107.51962 | 3.14511 | 0.33799 | 1.95772 | 1.76879
     | 400 | 0.85 | 1.70770 | 0.16823 | 0.75322 | 0.04656 | 0.03788
     |     | 0.90 | 2.53743 | 0.16743 | 0.78535 | 0.04929 | 0.04044
     |     | 0.95 | 4.58955 | 0.17121 | 0.79950 | 0.06594 | 0.04300
     |     | 0.99 | 21.62036 | 0.42102 | 0.40384 | 0.25861 | 0.13540
     | 800 | 0.85 | 0.73929 | 0.08473 | 0.47144 | 0.02380 | 0.02045
     |     | 0.90 | 1.19120 | 0.09931 | 0.61501 | 0.02961 | 0.02455
     |     | 0.95 | 2.38632 | 0.08583 | 0.77279 | 0.02664 | 0.01796
     |     | 0.99 | 11.96986 | 0.24125 | 0.53698 | 0.13878 | 0.06955
0.25 | 100 | 0.85 | 5.77461 | 0.46378 | 0.71224 | 0.29294 | 0.30568
     |     | 0.90 | 11.21582 | 0.48987 | 0.74074 | 0.22880 | 0.18155
     |     | 0.95 | 18.42567 | 0.63905 | 0.54834 | 0.56494 | 0.50903
     |     | 0.99 | 89.60013 | 2.05524 | 0.26170 | 0.98369 | 0.41537
     | 400 | 0.85 | 1.38431 | 0.42819 | 0.68425 | 0.18593 | 0.17036
     |     | 0.90 | 3.08322 | 0.35974 | 0.97073 | 0.15886 | 0.13434
     |     | 0.95 | 4.08642 | 0.25720 | 0.67809 | 0.12900 | 0.09643
     |     | 0.99 | 22.17859 | 0.40916 | 0.36989 | 0.31214 | 0.13774
     | 800 | 0.85 | 0.83619 | 0.17091 | 0.53177 | 0.06465 | 0.05761
     |     | 0.90 | 1.18690 | 0.23904 | 0.63327 | 0.10522 | 0.09467
     |     | 0.95 | 2.81468 | 0.17317 | 0.85394 | 0.07262 | 0.06217
     |     | 0.99 | 15.32023 | 0.41075 | 0.72909 | 0.15574 | 0.12420
0.50 | 100 | 0.85 | 9.33358 | 0.75539 | 1.21896 | 0.63396 | 0.61351
     |     | 0.90 | 10.71809 | 0.76879 | 0.97905 | 0.71838 | 0.71674
     |     | 0.95 | 17.16682 | 0.42785 | 0.44256 | 0.42213 | 0.32800
     |     | 0.99 | 64.19689 | 0.91622 | 0.55578 | 1.46511 | 0.66360
     | 400 | 0.85 | 1.84858 | 0.58827 | 0.86659 | 0.36689 | 0.34593
     |     | 0.90 | 2.42662 | 0.44374 | 0.82334 | 0.25360 | 0.22669
     |     | 0.95 | 6.36484 | 0.40697 | 0.98929 | 0.29202 | 0.24581
     |     | 0.99 | 42.22717 | 1.41102 | 0.73127 | 0.44336 | 0.27481
     | 800 | 0.85 | 1.24810 | 0.45470 | 0.82589 | 0.26673 | 0.25520
     |     | 0.90 | 1.79880 | 0.40571 | 0.95688 | 0.25133 | 0.23683
     |     | 0.95 | 2.05814 | 0.41260 | 0.74084 | 0.26399 | 0.24359
     |     | 0.99 | 7.95036 | 0.33278 | 0.52525 | 0.32700 | 0.23893

Simulation results in case of p = 4 and σ = 5.

Table 5

CL   | n   | τ    | | | | |
0.05 | 100 | 0.85 | 0.21299 | 0.21003 | 0.20261 | 0.19563 | 0.12179
     |     | 0.90 | 0.56969 | 0.51409 | 0.48277 | 0.36650 | 0.18440
     |     | 0.95 | 0.83207 | 0.72366 | 0.63084 | 0.50335 | 0.28609
     |     | 0.99 | 16.52158 | 10.44695 | 2.35048 | 4.92202 | 2.37078
     | 400 | 0.85 | 0.27154 | 0.26411 | 0.26211 | 0.23113 | 0.13296
     |     | 0.90 | 0.13371 | 0.12809 | 0.12646 | 0.10156 | 0.02604
     |     | 0.95 | 0.31037 | 0.29873 | 0.29093 | 0.25102 | 0.13099
     |     | 0.99 | 1.18991 | 1.01467 | 0.86321 | 0.71893 | 0.45445
     | 800 | 0.85 | 0.05024 | 0.04994 | 0.04966 | 0.04810 | 0.02107
     |     | 0.90 | 0.05721 | 0.05666 | 0.05614 | 0.05330 | 0.01864
     |     | 0.95 | 0.18054 | 0.17206 | 0.17089 | 0.13469 | 0.04490
     |     | 0.99 | 1.35089 | 1.07159 | 1.04126 | 0.69134 | 0.43935
0.25 | 100 | 0.85 | 1.56275 | 0.58823 | 1.03474 | 0.22315 | 0.19675
     |     | 0.90 | 1.89425 | 0.74730 | 1.13449 | 0.31928 | 0.26897
     |     | 0.95 | 3.92006 | 2.28015 | 2.15826 | 1.18369 | 0.83884
     |     | 0.99 | 15.48598 | 3.40925 | 0.68261 | 0.49189 | 0.45279
     | 400 | 0.85 | 0.21475 | 0.17820 | 0.19909 | 0.09342 | 0.05105
     |     | 0.90 | 0.51782 | 0.36158 | 0.45851 | 0.16621 | 0.10651
     |     | 0.95 | 1.89760 | 1.16085 | 1.37948 | 0.40165 | 0.11108
     |     | 0.99 | 3.85082 | 1.25270 | 1.42291 | 0.43649 | 0.29770
     | 800 | 0.85 | 0.36483 | 0.31813 | 0.35016 | 0.20515 | 0.13247
     |     | 0.90 | 0.12170 | 0.10803 | 0.11754 | 0.08141 | 0.07711
     |     | 0.95 | 0.55161 | 0.37153 | 0.48398 | 0.16621 | 0.08142
     |     | 0.99 | 3.11611 | 0.91577 | 1.55483 | 0.29862 | 0.18130
0.50 | 100 | 0.85 | 4.46921 | 1.76414 | 2.69672 | 0.83017 | 0.64431
     |     | 0.90 | 4.40601 | 1.93398 | 2.24012 | 0.70013 | 0.41793
     |     | 0.95 | 7.70547 | 2.09302 | 1.91177 | 0.47669 | 0.38454
     |     | 0.99 | 16.11585 | 1.44523 | 0.69649 | 0.80656 | 0.86553
     | 400 | 0.85 | 0.58769 | 0.25876 | 0.50664 | 0.19675 | 0.19763
     |     | 0.90 | 0.86237 | 0.34608 | 0.72974 | 0.22902 | 0.22895
     |     | 0.95 | 1.80472 | 0.60017 | 1.26100 | 0.35283 | 0.33833
     |     | 0.99 | 21.83544 | 5.59858 | 3.62775 | 1.13593 | 0.75091
     | 800 | 0.85 | 0.47881 | 0.34735 | 0.46012 | 0.28121 | 0.27713
     |     | 0.90 | 0.83702 | 0.49046 | 0.76243 | 0.29857 | 0.27414
     |     | 0.95 | 1.49459 | 0.57785 | 1.22709 | 0.31494 | 0.27184
     |     | 0.99 | 3.12040 | 0.66685 | 1.30715 | 0.27054 | 0.24975

Simulation results in case of p = 8 and σ = 0.5.

Table 6

CL   | n   | τ    | | | | |
0.05 | 100 | 0.85 | 1.15846 | 0.87888 | 0.91697 | 0.51322 | 0.27689
     |     | 0.90 | 1.53553 | 1.15383 | 1.07084 | 0.68093 | 0.38196
     |     | 0.95 | 4.01636 | 2.37195 | 1.80330 | 1.01617 | 0.57745
     |     | 0.99 | 12.27522 | 8.05584 | 2.46922 | 4.06623 | 2.06109
     | 400 | 0.85 | 0.19309 | 0.17901 | 0.18189 | 0.12447 | 0.03658
     |     | 0.90 | 0.43058 | 0.37714 | 0.39135 | 0.23424 | 0.09405
     |     | 0.95 | 0.73881 | 0.55514 | 0.60638 | 0.27658 | 0.12419
     |     | 0.99 | 4.29109 | 2.14779 | 1.75127 | 0.85536 | 0.46653
     | 800 | 0.85 | 0.11483 | 0.10893 | 0.11095 | 0.08191 | 0.02236
     |     | 0.90 | 0.16419 | 0.15237 | 0.15634 | 0.10503 | 0.02677
     |     | 0.95 | 0.32358 | 0.28838 | 0.29549 | 0.18040 | 0.05814
     |     | 0.99 | 1.60044 | 0.77540 | 0.97348 | 0.29564 | 0.18525
0.25 | 100 | 0.85 | 2.32574 | 1.08986 | 1.36734 | 0.27468 | 0.13717
     |     | 0.90 | 5.22360 | 2.78669 | 2.68889 | 1.03991 | 0.49823
     |     | 0.95 | 5.40588 | 3.28294 | 2.54175 | 1.61957 | 0.97862
     |     | 0.99 | 27.30491 | 8.10032 | 1.35084 | 1.91940 | 1.40683
     | 400 | 0.85 | 0.44521 | 0.26832 | 0.39387 | 0.11254 | 0.08142
     |     | 0.90 | 0.76937 | 0.45170 | 0.65382 | 0.18471 | 0.11600
     |     | 0.95 | 2.29500 | 1.21621 | 1.50964 | 0.27296 | 0.10602
     |     | 0.99 | 5.17561 | 1.69015 | 1.39349 | 0.33433 | 0.13233
     | 800 | 0.85 | 0.29510 | 0.24088 | 0.27898 | 0.12772 | 0.06503
     |     | 0.90 | 0.36689 | 0.24308 | 0.33794 | 0.12569 | 0.10288
     |     | 0.95 | 0.37569 | 0.20298 | 0.31807 | 0.08142 | 0.06467
     |     | 0.99 | 3.96692 | 1.07023 | 1.72913 | 0.30098 | 0.20096
0.50 | 100 | 0.85 | 3.37604 | 1.06933 | 1.78404 | 0.57155 | 0.55877
     |     | 0.90 | 1.37876 | 0.32910 | 0.68637 | 0.19041 | 0.19209
     |     | 0.95 | 5.84301 | 1.00650 | 1.43225 | 0.41051 | 0.42935
     |     | 0.99 | 73.10146 | 29.99253 | 1.11649 | 1.68054 | 2.97445
     | 400 | 0.85 | 1.12473 | 0.44371 | 0.94728 | 0.25330 | 0.23185
     |     | 0.90 | 1.83650 | 0.65919 | 1.39858 | 0.25178 | 0.18772
     |     | 0.95 | 1.39164 | 0.41154 | 0.91057 | 0.30256 | 0.30279
     |     | 0.99 | 12.10185 | 1.84634 | 1.58779 | 0.38332 | 0.36864
     | 800 | 0.85 | 0.48052 | 0.34534 | 0.46263 | 0.29511 | 0.29213
     |     | 0.90 | 0.53634 | 0.30901 | 0.49602 | 0.26758 | 0.26706
     |     | 0.95 | 1.81295 | 0.66496 | 1.39854 | 0.30232 | 0.25480
     |     | 0.99 | 4.02489 | 0.89612 | 1.47110 | 0.29179 | 0.26844

Simulation results in case of p = 8 and σ = 1.

Table 7

CL   | n   | τ    | | | | |
0.05 | 100 | 0.85 | 16.48429 | 0.76265 | 1.79934 | 0.47392 | 0.58901
     |     | 0.90 | 24.14235 | 1.02268 | 1.45280 | 0.77891 | 1.03974
     |     | 0.95 | 53.51362 | 2.15790 | 1.02047 | 1.79561 | 2.33215
     |     | 0.99 | 239.12441 | 9.28487 | 0.34627 | 4.55732 | 5.75055
     | 400 | 0.85 | 3.49016 | 0.13836 | 1.51275 | 0.03610 | 0.03901
     |     | 0.90 | 6.20417 | 0.20077 | 1.81764 | 0.05704 | 0.06378
     |     | 0.95 | 11.94495 | 0.35280 | 1.73986 | 0.09322 | 0.10407
     |     | 0.99 | 59.06904 | 1.59133 | 0.78171 | 0.32741 | 0.37704
     | 800 | 0.85 | 1.89563 | 0.08764 | 1.17527 | 0.02067 | 0.02213
     |     | 0.90 | 2.73088 | 0.09386 | 1.39951 | 0.01762 | 0.01943
     |     | 0.95 | 5.56415 | 0.13841 | 1.71080 | 0.02300 | 0.02574
     |     | 0.99 | 27.36199 | 0.61708 | 1.10996 | 0.13521 | 0.14900
0.25 | 100 | 0.85 | 18.87774 | 0.82976 | 1.55346 | 0.63487 | 0.74055
     |     | 0.90 | 34.43712 | 2.11859 | 1.93708 | 0.82710 | 0.99253
     |     | 0.95 | 56.67286 | 2.54356 | 1.20241 | 1.20350 | 1.40572
     |     | 0.99 | 234.54677 | 6.24261 | 0.46401 | 6.35297 | 7.65207
     | 400 | 0.85 | 3.64835 | 0.19408 | 1.46753 | 0.08255 | 0.08183
     |     | 0.90 | 6.64063 | 0.26842 | 1.94108 | 0.10631 | 0.11199
     |     | 0.95 | 16.55755 | 0.74765 | 2.08509 | 0.21553 | 0.21790
     |     | 0.99 | 53.92402 | 1.12368 | 0.64415 | 0.37636 | 0.40181
     | 800 | 0.85 | 2.28659 | 0.16307 | 1.35829 | 0.06684 | 0.06702
     |     | 0.90 | 2.58707 | 0.19128 | 1.28871 | 0.10171 | 0.10106
     |     | 0.95 | 5.41396 | 0.18338 | 1.54659 | 0.06600 | 0.06500
     |     | 0.99 | 32.35663 | 0.77203 | 1.22296 | 0.19496 | 0.20403
0.50 | 100 | 0.85 | 20.54085 | 1.08011 | 2.06415 | 0.91851 | 0.96065
     |     | 0.90 | 21.67315 | 0.57351 | 1.04308 | 0.54817 | 0.55866
     |     | 0.95 | 65.76152 | 2.32354 | 1.07240 | 2.24306 | 2.64051
     |     | 0.99 | 299.40422 | 12.27341 | 0.42142 | 11.04547 | 11.98258
     | 400 | 0.85 | 5.14950 | 0.39194 | 1.96635 | 0.25346 | 0.24852
     |     | 0.90 | 7.85514 | 0.34316 | 2.15083 | 0.14905 | 0.14598
     |     | 0.95 | 14.82821 | 0.65250 | 1.89624 | 0.47094 | 0.45788
     |     | 0.99 | 69.45204 | 1.45136 | 0.85636 | 0.94116 | 0.88360
     | 800 | 0.85 | 1.88897 | 0.37811 | 1.18269 | 0.25336 | 0.25122
     |     | 0.90 | 3.04633 | 0.40105 | 1.51743 | 0.30356 | 0.30175
     |     | 0.95 | 8.83255 | 0.40226 | 2.39202 | 0.20527 | 0.20621
     |     | 0.99 | 26.02972 | 0.54693 | 0.96760 | 0.27806 | 0.25399

Simulation results in case of p = 8 and σ = 5.

Based on the simulation results, we conclude the following:

  • The EMSE increases as n decreases.

  • The EMSE increases as p increases.

  • The EMSE increases as τ increases.

  • The EMSE increases as σ increases.

  • The EMSE increases as the CL increases.

  • The TMLE exhibited the worst performance at all levels of multicollinearity and censoring.

  • The TNRTE and the TLE outperform the TRE for all cases.

  • In a few cases with large σ and p values, the EMSE of the proposed TNRTE is close to that of the TLE.

  • The proposed TNRTE with the biasing parameters performs the best of all other mentioned estimators in terms of the EMSE, followed by the proposed TNRTE with the biasing parameters in most cases.

  • The performance of the proposed TNRTE and of the other biased estimators depends largely on the choice of their biasing parameter estimators.

  • Finally, the proposed TNRTE performs best among all the estimators considered in terms of the EMSE in most cases.

A Real-Life Data

In this section, we use the Mroz dataset, originally analyzed by Mroz [31], to illustrate the performance of the proposed TNRTE and the other estimators. The Mroz data contain 753 observations on married women with 21 variables, and the ages of these women range from 30 to 60 years. For 325 of the 753 women, the average hourly wage is zero. As in Barros et al. [32], the average hourly wage of the women is taken as the dependent variable (y), while the independent variables are as follows: age of the women (x1), education of the women (x2), number of children <6 years (x3), number of children between the ages 6 and 18 (x4), and previous labor market experience of the women (x5). Following the method of Toker et al. [6] for checking multicollinearity, the matrix eigenvalues are 69,601.81, 1,723.52, 334.22, 54.43, 6.22, and 0.36, and the condition number is calculated as 441.09; these results indicate high multicollinearity. The estimated parameters and MSE values are presented in Table 8.
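The multicollinearity check above can be reproduced approximately from the reported eigenvalues. The sketch assumes the condition number is defined as the square root of the ratio of the largest to the smallest eigenvalue; with the rounded eigenvalues listed in the text this gives a value of about 439.7, close to the reported 441.09 (the gap is consistent with the smallest eigenvalue having been rounded to 0.36).

```python
import math

# Eigenvalues as reported (rounded) for the Mroz application.
eigenvalues = [69601.81, 1723.52, 334.22, 54.43, 6.22, 0.36]


def condition_number(eigs):
    """Square root of the largest-to-smallest eigenvalue ratio (assumed definition)."""
    return math.sqrt(max(eigs) / min(eigs))


print(round(condition_number(eigenvalues), 2))
```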

Table 8

Estimator | α0 | α1 | α2 | α3 | α4 | α5 | MSE | k/d
 | −2.5349 | −0.1648 | 0.6825 | −2.6877 | 0.0781 | 0.2214 | 56.6834 | NA
 | −0.4558 | −0.1806 | 0.5691 | −2.0635 | −0.0077 | 0.2243 | 7.7517 | 2.270
 | −0.8312 | −0.1811 | 0.6051 | −2.4213 | 0.0083 | 0.2230 | 9.9986 | 0.029
 | 1.2387 | −0.1987 | 0.5007 | −1.9321 | −0.0770 | 0.2254 | 34.3714 | 1.342
 | −0.3934 | −0.1883 | 0.5992 | −2.5793 | −0.0090 | 0.2226 | 8.0258 | 0.296
 | −1.4565 | −0.1766 | 0.6405 | −2.6324 | 0.0343 | 0.2220 | 20.1089 | 0.300
 | −1.3068 | −0.1766 | 0.6267 | −2.4957 | 0.0278 | 0.2225 | 16.9422 | 0.300
 | −0.3780 | −0.1884 | 0.5986 | −2.5771 | −0.0096 | 0.2226 | 8.0235 | 0.300

The regression coefficients and the MSE results.

Table 8 shows that the TMLE performs worse as expected. Also, the TRE has a near MSE value with the biasing parameter estimator to that of the proposed TNRTE with biasing parameter estimator . Moreover, the proposed TNRTE has the lowest MSE value among the mentioned estimators (TRE and TLE), followed by TLE and then the TRE, when k = d = 0.3; this means that the proposed TNRTE is the best in this case.

Figure 1 shows that the proposed TNRTE with biasing parameter k from 0.18 to 0.58 performs better than the other estimators, and that it attains its smallest MSE at k = 0.36, which makes it the best of all the estimators considered, while the TMLE performs the worst, as expected.

Figure 1

MSE of TMLE, TRE, TLE, and TNRTE for different k, d.

Conclusions

In this study, we proposed the Tobit new ridge-type estimator (TNRTE) for overcoming the multicollinearity problem in the censored regression model. Theoretically, we compared the proposed TNRTE with the Tobit maximum likelihood estimator (TMLE), the Tobit ridge estimator (TRE), and the Tobit Liu estimator (TLE), and gave biasing parameter estimators for the proposed TNRTE. Then, a simulation study was performed to compare the performance of the TMLE, the TRE, and the TLE with that of the proposed TNRTE. The results of the simulation indicate that the proposed TNRTE is better than the other existing estimators in most cases. Moreover, the real-life Mroz data were used to illustrate the study results.

Publisher's Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Statements

Data availability statement

The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author.

Author contributions

ID, MA, and FA contributed to conception and structural design of the manuscript. MA performed the simulation and application. All authors contributed to manuscript revision, read, and approved the submitted version.

Acknowledgments

The authors would like to thank the Deanship of Scientific Research at King Saud University represented by the Research Center at CBA for supporting this research financially.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  • 1. Tobin J. Estimation of relationships for limited dependent variables. Econometrica. (1958) 26:24–36. 10.2307/1907382

  • 2. Goldberger AS. Econometric Theory, 1st ed. New York, NY: John Wiley and Sons (1964), p. 1–399.

  • 3. Amemiya T. Tobit models: a survey. J Econom. (1984) 24:3–61. 10.1016/0304-4076(84)90074-5

  • 4. Khalaf G, Mansson K, Sjolander P. A Tobit ridge regression estimator. Commun Stat Theory Methods. (2014) 43:131–40. 10.1080/03610926.2012.655881

  • 5. Alhusseini FHH, Odah MH. Principal component regression for Tobit model and purchases of gold. In: Proceedings of the 10th International Management Conference, Bucharest, Romania. (2016) 10:491–500.

  • 6. Toker S, Özbay N, Siray GÜ, Yenilmez I. Tobit Liu estimation of censored regression model: an application to Mroz data and a Monte Carlo simulation study. J Stat Comput Simul. (2021) 91:1061–91. 10.1080/00949655.2020.1828416

  • 7. Hoerl AE, Kennard RW. Ridge regression: biased estimation for nonorthogonal problems. Technometrics. (1970) 12:55–67. 10.1080/00401706.1970.10488634

  • 8. Liu K. A new class of biased estimate in linear regression. Commun Stat Theory Methods. (1993) 22:393–402. 10.1080/03610929308831027

  • 9. Kibria BMG, Lukman AF. A new ridge-type estimator for the linear regression model: simulations and applications. Scientifica. (2020) 2020:1–16. 10.1155/2020/9758378

  • 10. Lukman AF, Dawoud I, Kibria BM, Algamal ZY. A new ridge-type estimator for the gamma regression model. Scientifica. (2021) 2021:5545356. 10.1155/2021/5545356

  • 11. Lukman AF, Algamal ZY, Kibria BG. The KL estimator for the inverse Gaussian regression model. Concurr Comput Pract Exp. (2021) 33:e6222. 10.1002/cpe.6222

  • 12. Akram MN, Kibria BG, Abonazel MR. On the performance of some biased estimators in the gamma regression model: simulation and applications. J Stat Comput Simul. (2022). 10.1080/00949655.2022.2032059. [Epub ahead of print].

  • 13. Dawoud I, Abonazel MR. Generalized Kibria-Lukman estimator: method, simulation, and application. Front Appl Math Stat. (2022) 8:880086. 10.3389/fams.2022.880086

  • 14. Awwad FA, Odeniyi KA, Dawoud I, Algamal ZY, Abonazel MR, BM, Tag Eldin E. New two-parameter estimators for the logistic regression model with multicollinearity. WSEAS Trans Math. (2022) 21:403–14. 10.37394/23206.2022.21.48

  • 15. Abonazel MR, Dawoud I, Awwad FA. Dawoud–Kibria estimator for beta regression model: simulation and application. Front Appl Math Stat. (2022) 8:775068. 10.3389/fams.2022.775068

  • 16. Amemiya T. Regression analysis when the dependent variable is truncated normal. Econometrica. (1973) 41:997–1016. 10.2307/1914031

  • 17. Fair RC. A note on computation of the Tobit estimator. Econometrica. (1977) 45:1723–7. 10.2307/1913962

  • 18. Wang SG, Wu MX, Jia ZZ. Matrix Inequalities. 2nd ed. Beijing: Chinese Science Press (2006), p. 1–116.

  • 19. Farebrother RW. Further results on the mean square error of ridge regression. J R Stat Soc B. (1976) 38:248–50. 10.1111/j.2517-6161.1976.tb01588.x

  • 20. Trenkler G, Toutenburg H. Mean squared error matrix comparisons between biased estimators-an overview of recent results. Stat Pap. (1990) 31:165–79. 10.1007/BF02924687

  • 21. Kibria BMG. Performance of some new ridge regression estimators. Commun Stat Simul Comput. (2003) 32:419–35. 10.1081/SAC-120017499

  • 22. Yenilmez I, Mert Kantar Y, Acitaş S. Estimation of censored regression model in the case of non-normal error. Sigma J Eng Nat Sci. (2018) 36:513–21.

  • 23. Yenilmez I, Mert Kantar Y. An alternative estimation method based on alpha skew logistic distribution for parameters of censored regression model. Data Sci Appl. (2019) 2:16–20.

  • 24. Yenilmez I, Ilhan U, Mert Kantar Y. Quasi-maximum likelihood estimator based on moyal distribution for censored data. In: 5th International Researchers, Statisticians and Young Statisticians Congress, Aydin, Turkey. (2019), p. 419–27.

  • 25. Dawoud I, Abonazel MR. Robust Dawoud–Kibria estimator for handling multicollinearity and outliers in the linear regression model. J Stat Comput Simul. (2021) 91:3678–92. 10.1080/00949655.2021.1945063

  • 26. Awwad FA, Dawoud I, Abonazel MR. Development of robust Özkale–Kaçiranlar and Yang–Chang estimators for regression models in the presence of multicollinearity and outliers. Concurr Comput Pract Exp. (2022) 34:e6779. 10.1002/cpe.6779

  • 27. Abonazel MR, Dawoud I. Developing robust ridge estimators for Poisson regression model. Concurr Comput Pract Exp. (2022) 34:e6979. 10.1002/cpe.6979

  • 28. Algamal ZY, Abonazel MR. Developing a Liu-type estimator in beta regression model. Concurr Comput Pract Exp. (2022) 34:e6685. 10.1002/cpe.6685

  • 29. Abonazel MR, Algamal ZY, Awwad FA, Taha IM. A new two-parameter estimator for beta regression model: method, simulation, and application. Front Appl Math Stat. (2022) 7:780322. 10.3389/fams.2021.780322

  • 30. Ozkale MR, Kaçiranlar S. The restricted and unrestricted two-parameter estimators. Commun Stat Theory Methods. (2007) 36:2707–25. 10.1080/03610920701386877

  • 31. Mroz TA. The sensitivity of an empirical model of married women's hours of work to economic and statistical assumptions. Econometrica. (1987) 55:765–99. 10.2307/1911029

  • 32. Barros M, Galea M, Leiva V. Generalized Tobit models: diagnostics and application in econometrics. J Appl Stat. (2018) 45:145–67. 10.1080/02664763.2016.1268572

Keywords

censored regression model, multicollinearity, Tobit Liu estimator, Tobit ridge estimator, Tobit new ridge-type estimator

Citation

Dawoud I, Abonazel MR, Awwad FA and Tag Eldin E (2022) A New Tobit Ridge-Type Estimator of the Censored Regression Model With Multicollinearity Problem. Front. Appl. Math. Stat. 8:952142. doi: 10.3389/fams.2022.952142

Received

24 May 2022

Accepted

21 June 2022

Published

15 July 2022

Volume

8 - 2022

Edited by

Han-Ying Liang, Tongji University, China

Reviewed by

Guoliang Fan, Shanghai Maritime University, China; Fuxia Cheng, Illinois State University, United States

Copyright

*Correspondence: Mohamed R. Abonazel

This article was submitted to Statistics and Probability, a section of the journal Frontiers in Applied Mathematics and Statistics
