On composite length-biased exponential-Pareto distribution: Properties, simulation, and application in actuarial science

The composite length-biased exponential-Pareto (CLBEP) distribution is a new composite distribution introduced in this article. Its statistical characteristics, including the probability density function, moments, and quantiles, are derived mathematically. Maximum-likelihood estimation of the parameters and stochastic ordering are discussed. A comparison study with other new composite and conventional distributions is also included. Specifically, using two real fire insurance data sets (Algerian and Danish fire insurance losses), the goodness of fit of the new model is compared with that of the composite exponential-Pareto, composite lognormal-Pareto, and composite Rayleigh-Pareto distributions. 2010 AMS subject classifications 62E10; 60E05.


. Introduction
Currently, digital methods are used in biology, economics, the physical sciences, the statistical sciences, and other fields. The statistical sciences are essential both in applications to these fields and in daily life. Probability distributions are the foundation of statistical modeling, yet many problems in these fields do not follow any of the fundamental probability distributions. Actuarial science and finance generally use common distributions to describe data on payments, the size and number of claims, and premium computation. Examples of these distributions are the exponential, Poisson, length-biased exponential, and Pareto distributions.
The length-biased exponential distribution offers a wide range of practical applications in several areas (reliability, actuarial science, survival analysis, and mathematical finance). It is frequently used to model the lifetime of a phenomenon with no memory, no aging, and no wear and tear, the profits of an insurance company, or various models of surpluses and financial assets.
The modeling of unimodal insurance loss data with a long tail is of interest to actuaries. Distributions that can replicate the heavy tail of insurance loss data, including the gamma, Pareto, length-biased exponential, Rayleigh, lognormal, and Weibull distributions, are needed to provide a sufficiently precise estimate of the associated business risk. An insurance portfolio typically contains both small and large losses. When modeling very large losses, practitioners tend to favor the Pareto distribution as a size distribution. Length-biased exponential, lognormal, Rayleigh, or Weibull models are preferred when the losses consist of smaller values with high frequencies and larger losses with low frequencies [1]. Nevertheless, no conventional size model can simultaneously account for both small and large losses. Length-biased exponential, lognormal, Rayleigh, and Weibull models give a good overall fit but fit the tail poorly, whereas Pareto models fit the tail well.
Composite distributions appear appropriate for modeling data with heavy tails. For instance, the one-parameter exponential-Pareto (exp-Pareto) model and the one-parameter inverse gamma-Pareto (IG-Pareto) model have both been proposed for modeling insurance data. When fitted to well-known insurance data sets, such as the Danish fire insurance data set, they still do not perform satisfactorily, so these models need to be improved. By exponentiating the random variable associated with the probability density function (pdf) of an inverse gamma-Pareto distribution, Liu and Ananda [2] proposed an improved version of the one-parameter IG-Pareto model. Their model outperformed the original model significantly across several data sets. Other composite models include the composite lognormal-Pareto (cLP) model (see Scollnik [3]) and the composite Rayleigh-Pareto (cRP) model (see Benatmane et al. [4]). For more details, see [5][6][7][8][9][10][11][12].
In this study, we therefore propose a novel composite distribution that blends the length-biased exponential and Pareto distributions. The resulting CLBEP distribution has a single parameter, and its mathematical properties can be determined in explicit form. Because it is composed of two distributions that are both used in survival analysis and actuarial applications, and both of which can be simulated, the new distribution offers practical advantages. Many real-life data sets can be analyzed using the CLBEP model, which provides suitable fits to these data sets.
The current article is structured as follows: The composite length-biased exponential-Pareto distribution and some of its statistical characteristics are discussed in Section 2. The estimation of parameters is addressed in Section 3. A numerical example with a comparison of various classical and composite models using two real data sets is provided in Section 4.

. Formulation of the CLBEP distribution
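For many theoretical issues, the length-biased exponential and Pareto distributions might not be adequate. To have a flexible model, we construct the composite length-biased exponential-Pareto (CLBEP) distribution, based on the composite transformation. Let $T$ be a random variable with density function
$$f(t) = \begin{cases} c\, f_1(t), & 0 < t \le \theta, \\ c\, f_2(t), & \theta < t < \infty, \end{cases}$$
where $f_1$ is a length-biased exponential density, $f_2$ is a two-parameter Pareto density, and $c$ is the normalizing constant. Hence,
$$f_1(t) = \lambda^2 t\, e^{-\lambda t}, \quad t > 0, \qquad f_2(t) = \frac{\alpha\,\theta^{\alpha}}{t^{\alpha+1}}, \quad t > \theta,$$
where $\lambda$, $\alpha$, and $\theta$ are unknown non-negative parameters. To obtain a composite smooth density function, we use the continuity and differentiability conditions at the threshold point $\theta$, i.e.,
$$f_1(\theta) = f_2(\theta), \qquad f_1'(\theta) = f_2'(\theta).$$
These two restrictions give
$$\lambda^2 \theta\, e^{-\lambda\theta} = \frac{\alpha}{\theta}, \qquad \lambda^2 e^{-\lambda\theta}(1 - \lambda\theta) = -\frac{\alpha(\alpha+1)}{\theta^2}.$$
After some calculation, we get
$$\lambda\theta = \alpha + 2, \qquad (\alpha+2)^2 e^{-(\alpha+2)} = \alpha.$$
Using numerical methods, we find
$$\alpha = 0.5118, \qquad \lambda\theta = 2.5118.$$
To find the normalizing constant, we use the density property $\int_0^{\infty} f(t)\,dt = 1$, which gives
$$c = \frac{1}{1 + F_1(\theta)}, \qquad F_1(\theta) = 1 - (1 + \lambda\theta)\, e^{-\lambda\theta}.$$
The density $f(t; \theta)$ can therefore be expressed as
$$f(t;\theta) = \begin{cases} c\,\dfrac{(2.5118)^2}{\theta^2}\, t\, e^{-2.5118\,t/\theta}, & 0 < t \le \theta, \\[2ex] c\,\dfrac{0.5118\,\theta^{0.5118}}{t^{1.5118}}, & \theta < t < \infty. \end{cases}$$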
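The numerical step in the construction can be checked with a few lines of code. The sketch below is illustrative only and assumes the standard forms $f_1(t) = \lambda^2 t e^{-\lambda t}$ and $f_2(t) = \alpha\theta^{\alpha}/t^{\alpha+1}$, under which the smoothness conditions reduce to $\lambda\theta = \alpha + 2$ and $(\alpha+2)^2 e^{-(\alpha+2)} = \alpha$. The function name `solve_alpha` and the bracketing interval are hypothetical choices, and plain bisection stands in for whatever numerical method the authors used.

```python
import math

def solve_alpha(lo=0.1, hi=1.0, tol=1e-12):
    """Bisection root of g(a) = (a + 2)^2 * exp(-(a + 2)) - a on [lo, hi].

    The equation comes from the assumed continuity and differentiability
    conditions at the threshold; g is strictly decreasing for a > 0.
    """
    def g(a):
        return (a + 2.0) ** 2 * math.exp(-(a + 2.0)) - a

    if g(lo) * g(hi) > 0:
        raise ValueError("root is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

alpha = solve_alpha()      # Pareto shape parameter
a = alpha + 2.0            # a = lambda * theta
# Normalizing constant: c * (F1(theta) + 1) = 1, with F1 the c.d.f. of f1.
F1_theta = 1.0 - (1.0 + a) * math.exp(-a)
c = 1.0 / (1.0 + F1_theta)

print(f"alpha = {alpha:.4f}, lambda*theta = {a:.4f}, c = {c:.4f}")
```

Under these assumptions the root agrees with the constant 2.5118 that appears in the quantile derivation, which is a useful consistency check on the assumed density forms.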

. Statistical properties of the CLBEP distribution
In this section, several statistical properties are presented, such as the behavior of the PDF and the quantile function, as well as the moments and stochastic ordering. The following proposition states that the PDF of the CLBEP distribution has a single shape. Plots of the PDF for some parameter values of the proposed model are presented in Figure 1.
The CLBEP distribution is unimodal, with its maximum attained at the unique mode $t_{\mathrm{mod}} = 0.39814\,\theta$.
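The location of the mode can be checked numerically. The following sketch assumes the composite density has body $c\,\lambda^2 t e^{-\lambda t}$ with $\lambda = 2.5118/\theta$ and Pareto tail with shape $\alpha = 0.5118$; the names `clbep_pdf` and `grid_mode`, and the grid-search approach itself, are illustrative choices rather than anything taken from the article.

```python
import math

A = 2.5118                  # lambda * theta, as quoted in the text
ALPHA = A - 2.0             # assumed Pareto shape (from the smoothness conditions)
C = 1.0 / (2.0 - (1.0 + A) * math.exp(-A))   # assumed normalizing constant

def clbep_pdf(t, theta):
    """Assumed CLBEP density: length-biased exponential body, Pareto tail."""
    if t <= 0:
        return 0.0
    if t <= theta:
        lam = A / theta
        return C * lam ** 2 * t * math.exp(-lam * t)
    return C * ALPHA * theta ** ALPHA / t ** (ALPHA + 1.0)

def grid_mode(theta, steps=200_000):
    """Crude grid search for the maximizer of the density on (0, 3*theta]."""
    upper = 3.0 * theta
    grid = (upper * i / steps for i in range(1, steps + 1))
    return max(grid, key=lambda t: clbep_pdf(t, theta))

theta = 5.0
print(grid_mode(theta) / theta)   # ratio t_mod / theta, close to 0.398
```

On the body of the density the maximizer is $t = 1/\lambda = \theta/2.5118$, so the ratio printed above does not depend on the value chosen for $\theta$.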

. . Cumulative distribution function and moments of the CLBEP distribution
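The cumulative distribution function (c.d.f.) of this composite model is
$$F(t;\theta) = \begin{cases} c\left[1 - \left(1 + \dfrac{2.5118\,t}{\theta}\right) e^{-2.5118\,t/\theta}\right], & 0 < t \le \theta, \\[2ex] 1 - c\left(\dfrac{\theta}{t}\right)^{\alpha}, & \theta < t < \infty. \end{cases} \tag{2}$$
The kth moment about the origin of the CLBEP distribution can be obtained as: The mean of the CLBEP distribution is given by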

. . The quantile function of the CLBEP distribution
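The quantile function of the CLBEP distribution is given in the following theorem.
Theorem 1. The quantile function of the CLBEP distribution is
$$Q(u) = \begin{cases} -\dfrac{\theta}{2.5118}\left[1 + W_{-1}\!\left(\dfrac{u - c}{c\,e}\right)\right], & 0 < u \le u_0, \\[2ex] \theta\left(\dfrac{c}{1-u}\right)^{1/\alpha}, & u_0 < u < 1, \end{cases}$$
where $u_0 = F(\theta)$ and $W_{-1}$ denotes the negative branch of the Lambert W function.
Proof. For $u \in (0, u_0]$, the equation $F(t) = u$ reads
$$\left(1 + \frac{2.5118\,t}{\theta}\right) e^{-2.5118\,t/\theta} = 1 - \frac{u}{c}.$$
Multiplying both sides by $-e^{-1}$, we find
$$-\left(1 + \frac{2.5118\,t}{\theta}\right) e^{-\left(1 + 2.5118\,t/\theta\right)} = \frac{u - c}{c\,e}. \tag{3}$$
Moreover, for any $\theta, t > 0$, it is immediate that $-\left(1 + \frac{2.5118\,t}{\theta}\right) < 0$, and it can also be checked that
$$\frac{u - c}{c\,e} \in \left(-\frac{1}{e},\, 0\right).$$
By taking into account the properties of the negative branch of the Lambert W function, Equation (3) becomes
$$-\left(1 + \frac{2.5118\,t}{\theta}\right) = W_{-1}\!\left(\frac{u - c}{c\,e}\right).$$
Now, for $u \in (u_0, 1)$, we have to solve the equation $F(t) = u$ with respect to $t$, $t > 0$; it is easy to find
$$t = \theta\left(\frac{c}{1-u}\right)^{1/\alpha},$$
where $u_0 < u < 1$.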
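Inverting the c.d.f. also gives a direct way to simulate from the model. The sketch below is a hedged illustration: it assumes the c.d.f. has body $c\,[1 - (1 + 2.5118\,t/\theta)e^{-2.5118\,t/\theta}]$ and tail $1 - c\,(\theta/t)^{\alpha}$ with $\alpha = 0.5118$, and it replaces the Lambert W evaluation on the body with plain bisection so that only the standard library is needed. The names `clbep_quantile` and `clbep_sample` are hypothetical.

```python
import math
import random

A = 2.5118                  # lambda * theta, as quoted in the text
ALPHA = A - 2.0             # assumed Pareto shape
C = 1.0 / (2.0 - (1.0 + A) * math.exp(-A))   # assumed normalizing constant
U0 = 1.0 - C                # u_0 = F(theta): probability mass below the threshold

def clbep_quantile(u, theta, tol=1e-12):
    """Invert the assumed CLBEP c.d.f. at probability level u in (0, 1)."""
    if not 0.0 < u < 1.0:
        raise ValueError("u must lie in (0, 1)")
    if u > U0:
        # Pareto tail: 1 - C * (theta / t)^ALPHA = u has a closed-form solution.
        return theta * (C / (1.0 - u)) ** (1.0 / ALPHA)
    # Body: solve C * (1 - (1 + x) * exp(-x)) = u for x = A * t / theta in (0, A].
    lo, hi = 0.0, A
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if C * (1.0 - (1.0 + mid) * math.exp(-mid)) < u:
            lo = mid
        else:
            hi = mid
    return theta * 0.5 * (lo + hi) / A

def clbep_sample(n, theta, seed=None):
    """Draw n values by inversion: t = Q(u) with u uniform on (0, 1)."""
    rng = random.Random(seed)
    out = []
    while len(out) < n:
        u = rng.random()
        if u > 0.0:          # random() may return exactly 0.0; skip it
            out.append(clbep_quantile(u, theta))
    return out

sample = clbep_sample(108, theta=5.0, seed=2024)
print(min(sample), max(sample))
```

Bisection is slower than a closed-form Lambert W evaluation but avoids a dependency; where SciPy is available, `scipy.special.lambertw` with `k=-1` could be used on the body instead.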
. Generating random values from the CLBEP distribution

. . Parameter estimation
In this section, we will introduce two methods of estimating the unknown parameter θ .

. . . An ad hoc procedure based on percentiles
The following ad hoc procedure provides a closed form for the estimate of the parameter $\theta$ based on percentiles. Let $t_1, t_2, \ldots, t_n$ be a random sample from the CLBEP model. Assume that $t_1 \le t_2 \le \ldots \le t_n$ and $t_m \le \theta \le t_{m+1}$. The parameter $\theta$ can then be estimated as the $p$th percentile of the sample, where $p = F(\theta)$. The Pareto distribution or the length-biased exponential distribution will be a better model than the composite length-biased exponential-Pareto distribution according as $\hat\theta$ is closer to $t_1$ or $t_n$, respectively.
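As a rough illustration of the percentile idea, the sketch below computes $p = F(\theta)$ and reads off the corresponding sample percentile. It assumes the normalization $c\,[1 + F_1(\theta)] = 1$, so that $p = 1 - c$, and it uses linear interpolation between order statistics, which is only one of several possible percentile conventions and is not taken from the article. The name `percentile_estimate` is hypothetical.

```python
import math

A = 2.5118                                   # lambda * theta, as quoted in the text
C = 1.0 / (2.0 - (1.0 + A) * math.exp(-A))   # assumed normalizing constant
P = 1.0 - C                                  # p = F(theta), roughly 0.417

def percentile_estimate(sample):
    """Estimate theta as the P-th sample percentile (linear interpolation)."""
    ts = sorted(sample)
    n = len(ts)
    if n == 0:
        raise ValueError("sample must not be empty")
    h = P * (n - 1)                # fractional position among order statistics
    k = int(math.floor(h))
    if k + 1 >= n:
        return ts[-1]
    return ts[k] + (h - k) * (ts[k + 1] - ts[k])

print(round(P, 4))
print(percentile_estimate([1.0, 2.0, 3.0, 4.0, 5.0]))
```

Because $p$ does not depend on $\theta$ under these assumptions, the estimate needs no iteration, which makes it a convenient starting value for the likelihood-based search.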

. . . Maximum-likelihood estimation
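Assume again that $t_1 \le t_2 \le \ldots \le t_n$ and $t_m \le \theta \le t_{m+1}$. Then, with $\lambda = 2.5118/\theta$ and $\alpha = 0.5118$, the likelihood function is
$$L(t_1, \ldots, t_n; \theta) = c^n \prod_{i=1}^{m} \lambda^2 t_i\, e^{-\lambda t_i} \prod_{i=m+1}^{n} \frac{\alpha\,\theta^{\alpha}}{t_i^{\alpha+1}}.$$
Hence, the solution of the likelihood equation $\frac{d \ln L}{d\theta} = 0$ is
$$\hat\theta = \frac{2.5118 \sum_{i=1}^{m} t_i}{2.5118\,m - 0.5118\,n}.$$
Since this estimator requires the value of $m$, we recommend the following algorithm (see Teodorescu and Vernic [13]):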

. Numerical and application examples
In this section, the estimation procedure described in Section 3 is illustrated using two data samples generated from the CLBEP model. The generating algorithm is based on the inversion of the c.d.f. (Equation 2).
Step 3. Verify whether $t_m \le \hat\theta \le t_{m+1}$. If so, then $\hat\theta$ is the MLE. If not, use Algorithm 2.
An alternative algorithm would be to replace Step 1 by considering all possible values of $m$ and carrying out the verification of Step 3 for each of them:
Algorithm 2.
Step 1. Estimate $\theta$ using MLE.
Step 2. If there is no solution for $\theta$, try an alternative model.
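The exhaustive search over $m$ can be sketched in code. This is an illustration under stated assumptions, not the authors' implementation: it assumes the composite density with $\lambda = 2.5118/\theta$ and Pareto shape $\alpha = 0.5118$, for which setting $d\ln L/d\theta = 0$ with $m$ fixed gives $\hat\theta = 2.5118\sum_{i \le m} t_i \,/\, [2m - 0.5118\,(n-m)]$. The function name `mle_candidates` and the toy data are hypothetical.

```python
A = 2.5118          # lambda * theta, as quoted in the text
ALPHA = A - 2.0     # assumed Pareto shape

def mle_candidates(sample):
    """Try every split m and keep estimates with t_m <= theta_hat <= t_{m+1}.

    Returns a list of (m, theta_hat) pairs; an empty list means that no
    admissible solution exists and another model should be considered.
    """
    ts = sorted(sample)
    n = len(ts)
    out = []
    partial_sum = 0.0
    for m in range(1, n):               # m = number of observations <= theta
        partial_sum += ts[m - 1]
        denom = 2.0 * m - ALPHA * (n - m)
        if denom <= 0.0:                # estimator undefined for this split
            continue
        theta_hat = A * partial_sum / denom
        if ts[m - 1] <= theta_hat <= ts[m]:
            out.append((m, theta_hat))
    return out

toy = [1.0, 2.0, 3.0, 10.0, 50.0]       # made-up losses, for illustration only
print(mle_candidates(toy))
```

More than one split can pass the check in principle; a natural tie-break, not shown here, would be to keep the candidate with the largest log-likelihood.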

. . Example
The data set given in this subsection, consisting of 108 values, was sampled from a length-biased exponential-Pareto population with parameter θ = 5 (see Table 1 in the Appendix).
- By Algorithm 2, MLE Step 2: $\hat\theta_3 = 4.9810$.
We notice that Algorithm 2 in Step 1 gives a more accurate value. We also applied the $\chi^2$ test to check the distribution fitting, and the results for $\hat\theta_3$ are given in Tables 1-4. The $\chi^2$ distances were calculated for all the estimated values of the parameter. As expected, the $\chi^2$ test accepts the length-biased exponential-Pareto model for all values of the parameter, with $d^2_{\hat\theta_1}$ being the minimum.

. . Goodness of fit
In this subsection, we apply the composite length-biased exponential-Pareto model to two real insurance data sets.
Data set I consists of 100 Algerian (SAA company) fire insurance losses (see Appendix).
Table 5 provides the parameter estimates of the fitted models and the values of −LL, AIC, AICc, and BIC evaluated at the maximum-likelihood estimates.
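For reference, the three information criteria can be computed from the negative log-likelihood with the standard textbook formulas, as in the short sketch below. The function name `information_criteria` and the example numbers are illustrative and are not taken from the tables of this article.

```python
import math

def information_criteria(neg_log_lik, k, n):
    """Return AIC, AICc, and BIC for a model with k parameters fitted to n points.

    neg_log_lik is -LL, the negative log-likelihood at the MLE.
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    aic = 2.0 * neg_log_lik + 2.0 * k
    aicc = aic + 2.0 * k * (k + 1) / (n - k - 1)
    bic = 2.0 * neg_log_lik + k * math.log(n)
    return {"AIC": aic, "AICc": aicc, "BIC": bic}

# Made-up values: a one-parameter model with -LL = 100 on n = 100 observations.
print(information_criteria(100.0, k=1, n=100))
```

Smaller values indicate a better trade-off between fit and complexity, which is the sense in which the fitted models are ranked.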
Data set II consists of 2,156 Danish fire insurance losses, to which we apply the same analysis. Tables 5 and 6 indicate that the CLBEP model outperforms the classical distributions and the composite Rayleigh-Pareto, composite exponential-Pareto, and composite lognormal-Pareto models in terms of −LL, AIC, AICc, and BIC for data sets I and II. In addition, in data set

. Conclusion
A new distribution, the composite length-biased exponential-Pareto (CLBEP) distribution, is proposed. Several of its mathematical properties are studied, including the quantile function, stochastic ordering, moments, and maximum-likelihood estimation. The goodness of fit of this model is compared with that of conventional models and of recent composite models, namely the composite exponential-Pareto, composite lognormal-Pareto, and composite Rayleigh-Pareto distributions, using two real fire insurance data sets (Algerian and Danish fire insurance losses). Compared to the standard models, the composite models provide a far better fit to the data, and the composite exponential-Pareto, composite lognormal-Pareto, and composite Rayleigh-Pareto distributions do not fit as well as the CLBEP model. We expect that researchers interested in the statistical sciences and their applications, such as reliability and actuarial science, will be drawn to the CLBEP model. Future research may examine Bayesian estimation of the CLBEP parameter and introduce a truncated version of the CLBEP distribution. In addition, it would be interesting to use similar composite distributions to model epidemic problems.

Data availability statement
The original contributions presented in the study are included in the article/Supplementary material; further inquiries can be directed to the corresponding author.