Estimation of the Parameters of Power Function Distribution based on Records

This paper estimates the power function distribution parameters and predicts future record values when samples are available only in the form of upper record values. We consider maximum likelihood and Bayesian techniques for the estimation. We also construct asymptotic and bootstrap confidence intervals and HPD credible intervals for the unknown parameters. Bayes estimators are derived under the squared error loss function, entropy loss function, and Linex loss function using the Lindley approximation and importance sampling procedures. Finally, we conduct a simulation study to compare all the proposed estimation methods and analyse a real data set for illustration purposes.


Introduction
Record values and associated statistics are essential in several real-life problems involving weather, economic, and sports data. Applications of record values in agriculture, finance, and sports have gained attention. Record value prediction helps athletes plan their training and practice in athletic events; the forecast of the intensity of the next strongest earthquake is helpful for disaster management planning; likewise, the highest or lowest face value of shares in the share market is significant for planning investment strategies, and so on. The statistical study of records was first introduced by Chandler (1952); interested readers can refer to Awad and Raqab (2000) and Ahsanullah (2015).
In this paper, we consider estimation for the power function distribution (PFD) parameters when the sample data are available only as records, and we build point and interval estimators for the unknown parameters of the PFD. The probability density function (PDF) and the cumulative distribution function (CDF) of the PFD are given by

f(x; θ, λ) = θ x^(θ−1) / λ^θ, (1)

F(x; θ, λ) = (x / λ)^θ, (2)

for 0 < x < λ and θ, λ > 0.
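For concreteness, the pdf, cdf and quantile function above, together with the standard exponential-spacing representation of upper records (H(Ri) = E1 + ... + Ei, where H(x) = −log(1 − F(x)) and the Ej are iid standard exponentials), can be sketched as follows; the function names are ours, chosen for illustration:

```python
import math
import random

def pfd_pdf(x, theta, lam):
    """Density of the PFD: theta * x**(theta - 1) / lam**theta on (0, lam)."""
    if not (0.0 < x < lam):
        return 0.0
    return theta * x ** (theta - 1.0) / lam ** theta

def pfd_cdf(x, theta, lam):
    """CDF F(x) = (x / lam)**theta on (0, lam)."""
    if x <= 0.0:
        return 0.0
    if x >= lam:
        return 1.0
    return (x / lam) ** theta

def pfd_quantile(u, theta, lam):
    """Inverse CDF: F^{-1}(u) = lam * u**(1 / theta)."""
    return lam * u ** (1.0 / theta)

def generate_upper_records(n, theta, lam, rng):
    """Generate n upper record values via H(R_i) = E_1 + ... + E_i,
    where H(x) = -log(1 - F(x)) and the E_j are iid Exp(1)."""
    records, cum = [], 0.0
    for _ in range(n):
        cum += rng.expovariate(1.0)
        records.append(pfd_quantile(1.0 - math.exp(-cum), theta, lam))
    return records
```

The representation guarantees a strictly increasing record sequence without simulating and discarding a long iid stream.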
Many authors have discussed the Bayesian estimation of the parameters of the PFD, including Bagchi and Sarkar (1986), Omar, and others. We also compute the asymptotic and bootstrap confidence intervals for the MLEs and the highest posterior density (HPD) credible intervals for the Bayes estimators.
The rest of the paper is organized as follows. The MLE and the asymptotic and two bootstrap confidence intervals of the PFD parameters based on record values are given in Section 2. Bayes estimators of the parameters under the squared error, entropy and Linex loss functions using the Lindley approximation method and the importance sampling procedure, together with HPD credible intervals, are discussed in Section 3; we also derive the Bayes estimates using the Metropolis-Hastings method. The problem of predicting future record values from the PFD using the ML and Bayesian approaches is explained in Section 4. In Section 5, a Monte Carlo simulation study compares the performance of the proposed estimators, and real-life data are used for illustration. Finally, a brief conclusion is given in Section 6.

Maximum Likelihood Estimation
This section derives the MLE of the PFD parameters based on upper record values and computes the asymptotic and bootstrap confidence intervals. Let R = (R1, R2, ..., Rn) be the upper records of size n from the PFD with pdf given in (1). Then the joint density function of R is given by (Ahsanullah (2015))

f(r) = f(rn) ∏_{i=1}^{n−1} f(ri) / [1 − F(ri)], 0 < r1 < r2 < ... < rn, (3)

where f(.) and F(.) are the pdf and cdf of the parent distribution. Using (1) and (2) in (3), the likelihood function is given by

L(θ, λ | r) = θ^n (∏_{i=1}^{n} ri^(θ−1)) / (λ^θ ∏_{i=1}^{n−1} (λ^θ − ri^θ)), 0 < r1 < ... < rn < λ. (4)

From (4), the log-likelihood function is given by

ℓ(θ, λ) = n log θ + (θ − 1) Σ_{i=1}^{n} log ri − θ log λ − Σ_{i=1}^{n−1} log(λ^θ − ri^θ).
The resulting normal equations are given as

∂ℓ/∂θ = n/θ + Σ_{i=1}^{n} log ri − log λ − Σ_{i=1}^{n−1} (λ^θ log λ − ri^θ log ri) / (λ^θ − ri^θ) = 0, (5)

and

∂ℓ/∂λ = −θ/λ − θ λ^(θ−1) Σ_{i=1}^{n−1} 1 / (λ^θ − ri^θ) = 0,

which do not admit closed-form solutions and are handled numerically to obtain the MLEs θ̂ and λ̂.
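Numerically, note that both terms of ∂ℓ/∂λ are negative, so the log-likelihood is decreasing in λ and is maximized at the boundary λ̂ = Rn. Under that observation, a minimal sketch of the maximization (a simple grid scan with local refinement for θ; this is our illustrative routine, not necessarily the paper's) is:

```python
import math

def log_likelihood(theta, lam, records):
    """Log of the record likelihood (4) for the PFD, records sorted increasingly."""
    n = len(records)
    if theta <= 0.0 or lam < records[-1]:
        return -math.inf
    ll = n * math.log(theta) + (theta - 1.0) * sum(math.log(r) for r in records)
    ll -= theta * math.log(lam)
    for r in records[:-1]:                   # the product runs over i = 1..n-1
        ll -= math.log(lam ** theta - r ** theta)
    return ll

def mle_records(records):
    """MLEs of (theta, lam): the likelihood decreases in lam, so lam_hat = r_n;
    theta_hat is found by a coarse grid scan followed by local refinement."""
    lam_hat = records[-1]
    thetas = [0.01 * k for k in range(1, 5001)]          # scan 0.01 .. 50
    best = max(thetas, key=lambda t: log_likelihood(t, lam_hat, records))
    lo, hi = best - 0.01, best + 0.01
    for _ in range(40):                                  # shrink the bracket
        grid = [lo + (hi - lo) * k / 10.0 for k in range(11)]
        best = max(grid, key=lambda t: log_likelihood(t, lam_hat, records))
        step = (hi - lo) / 10.0
        lo, hi = max(1e-9, best - step), best + step
    return best, lam_hat
```

Any one-dimensional optimizer could replace the grid refinement; it is used here only to keep the sketch dependency-free.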

Asymptotic Confidence Interval
This section derives the asymptotic confidence intervals, including the coverage probability (CP), of θ and λ. A consistent estimator of the Fisher information matrix is the observed information matrix, which for η = (θ, λ) is defined as

I(η̂) = − [ ∂²ℓ/∂θ²   ∂²ℓ/∂θ∂λ ; ∂²ℓ/∂λ∂θ   ∂²ℓ/∂λ² ], evaluated at η = η̂.
The above approach is used to derive approximate 100(1 − α)% confidence intervals of the parameters θ and λ of the forms θ̂ ± Z_(α/2) √Var(θ̂) and λ̂ ± Z_(α/2) √Var(λ̂), where Z_(α/2) is the upper (α/2)th percentile of the standard normal distribution and the variances are the diagonal elements of I(η̂)^(−1). Arnold et al. (1998) point out that record samples encountered in practice are typically small. In such situations, one can use resampling techniques to provide more accurate approximate confidence intervals. Here, we propose two bootstrap confidence intervals for different sample sizes.
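A generic numerical sketch of this construction, approximating the observed information by central finite differences of a supplied log-likelihood; the step size `eps` and the hard-coded z-value (which corresponds to α = 0.05) are our choices:

```python
import math

def wald_ci(loglik, theta_hat, lam_hat, eps=1e-4):
    """Approximate 95% Wald intervals for (theta, lam) from the observed
    information matrix, with second derivatives taken by finite differences."""
    def d2(i, j):
        # central finite-difference estimate of d^2 loglik / d eta_i d eta_j
        def f(di, dj):
            q = [theta_hat, lam_hat]
            q[i] += di
            q[j] += dj              # when i == j this shifts the same entry twice
            return loglik(q[0], q[1])
        return (f(eps, eps) - f(eps, -eps) - f(-eps, eps) + f(-eps, -eps)) / (4 * eps * eps)

    i00, i01, i11 = -d2(0, 0), -d2(0, 1), -d2(1, 1)
    det = i00 * i11 - i01 * i01
    var_theta, var_lam = i11 / det, i00 / det      # diagonal of the inverse matrix
    z = 1.959963984540054                          # upper 2.5% point of N(0, 1)
    return ((theta_hat - z * math.sqrt(var_theta), theta_hat + z * math.sqrt(var_theta)),
            (lam_hat - z * math.sqrt(var_lam), lam_hat + z * math.sqrt(var_lam)))
```

The routine only assumes that `loglik` is twice differentiable near the MLE; it applies to any two-parameter model, not just the PFD.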

Bootstrap Confidence Intervals
In this section, we derive bootstrap confidence intervals for the parameters of the PFD. We provide two methods of constructing bootstrap confidence intervals for upper record samples: the percentile bootstrap (boot-p) interval proposed by Efron (1982) and the student's-t bootstrap (boot-t) interval proposed by Hall (1988). We use the following steps for this procedure.
Step 1: Compute the estimates θ̂ and λ̂ from the observed record sample.
Step 2: Generate an upper record sample of size n from the PFD with parameters (θ̂, λ̂) and compute the bootstrap estimates θ̂* and λ̂*.
Step 3: Repeat Step 2 D times to obtain a set of bootstrap samples of θ and λ, say (θ̂*1, ..., θ̂*D) and (λ̂*1, ..., λ̂*D).
To construct the bootstrap confidence intervals of the parameter η (which is θ or λ), the bootstrap samples generated by Algorithm 1 are used, and two different bootstrap confidence intervals are obtained as follows:
(i) Boot-p method: the 100(1 − α)% interval is formed by the α/2 and 1 − α/2 empirical quantiles of the ordered bootstrap replicates of η̂*.
(ii) Boot-t method: the replicates are first studentized as T* = (η̂* − η̂)/√Var(η̂*), and the empirical quantiles of T* are used to centre an interval around η̂.
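The steps above can be sketched as a parametric bootstrap. Here `crude_estimator` is a deliberately simple illustrative plug-in (λ̂ = rn and an iid-form estimate of θ), not the MLE of Section 2, and only the boot-p interval is shown:

```python
import math
import random

def simulate_records(theta, lam, n, rng):
    """Upper record sample of size n from PFD(theta, lam) via
    H(R_i) = E_1 + ... + E_i with H(x) = -log(1 - (x / lam)**theta)."""
    out, cum = [], 0.0
    for _ in range(n):
        cum += rng.expovariate(1.0)
        out.append(lam * (1.0 - math.exp(-cum)) ** (1.0 / theta))
    return out

def crude_estimator(records):
    """Illustrative plug-in estimates: lam_hat = r_n and the iid-form
    theta_hat = n / sum(log(lam_hat / r_i)); chosen for simplicity only."""
    lam_hat = records[-1]
    theta_hat = len(records) / sum(math.log(lam_hat / r) for r in records)
    return theta_hat, lam_hat

def boot_p_ci(records, D=500, alpha=0.05, seed=7):
    """Percentile (boot-p) intervals for theta and lam from D parametric
    bootstrap record samples."""
    rng = random.Random(seed)
    th0, lm0 = crude_estimator(records)
    reps = [crude_estimator(simulate_records(th0, lm0, len(records), rng))
            for _ in range(D)]
    cis = []
    for j in range(2):                       # j = 0: theta, j = 1: lam
        vals = sorted(rep[j] for rep in reps)
        cis.append((vals[int(alpha / 2 * D)], vals[int((1 - alpha / 2) * D) - 1]))
    return cis
```

A boot-t interval would reuse the same replicates after studentizing them; only the quantile step changes.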

Bayesian Estimation
In this section, we provide the Bayes estimation of the unknown parameters of the PFD under different loss functions. We consider symmetric as well as asymmetric loss functions for our estimation purposes. One symmetric loss function is the squared error loss function (SELF), defined as L(η, δ(X)) = [δ(X) − η]². Varian (1975) proposed an asymmetric linear-exponential loss function known as the Linex loss function (LLF), defined as

L(η, δ(X)) = exp{h(δ(X) − η)} − h(δ(X) − η) − 1,

where h is the shape parameter, known as the degree of asymmetry. Another asymmetric loss function is the entropy loss function (ELF) proposed by Calabria and Pulcini (1996), defined as

L(η, δ(X)) = (δ(X)/η)^p − p log(δ(X)/η) − 1, p ≠ 0,

where p is the shape parameter. Under SELF, LLF and ELF, the Bayes estimators of η are defined respectively as

η̂self = E(η | x), η̂llf = −(1/h) log E(e^(−hη) | x), and η̂elf = [E(η^(−p) | x)]^(−1/p).

In the Bayesian approach, the parameter of interest is considered a random variable with some specified distribution. It may be noted that if both parameters θ and λ are unknown, joint conjugate priors do not exist. In such cases, there are several ways to choose the priors; one way is to consider piecewise independent priors. The gamma prior (GP) is one of the priors researchers most often use, and it forms a conjugate family; the conjugate prior approach originated with Raiffa and Schlaifer (1961). Assume that the unknown parameters θ and λ have independent gamma priors Γ(a1, b1) and Γ(a2, b2), with density

π(η) ∝ η^(ai − 1) e^(−bi η), η > 0, (6)

where η represents one of the parameters θ and λ, and (ai, bi), i = 1, 2, are the corresponding hyper-parameters. Using equations (4) and (6), the joint posterior distribution can be obtained as

π(θ, λ | r) ∝ θ^(n + a1 − 1) e^(−b1 θ) λ^(a2 − 1) e^(−b2 λ) λ^(−θ) ∏_{i=1}^{n} ri^(θ−1) / ∏_{i=1}^{n−1} (λ^θ − ri^θ), λ > rn. (7)

Now, using (7), the Bayes estimators of η = (θ, λ) under SELF, LLF and ELF are derived respectively as

θ̂self = E(θ | r), (8)

θ̂llf = −(1/h) log E(e^(−hθ) | r), (9)

and θ̂elf = [E(θ^(−p) | r)]^(−1/p), (10)

where θ̂self, θ̂llf and θ̂elf denote the Bayes estimates of θ under the different loss functions. Similarly, the Bayes estimates of λ, denoted respectively λ̂self, λ̂llf and λ̂elf, can be derived by following the above steps.
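Once a sample from the posterior is available (by any of the sampling methods discussed in the later subsections), the three estimators reduce to simple Monte Carlo averages; a sketch:

```python
import math

def bayes_estimates(draws, h=2.0, p=2.0):
    """Bayes estimates of a scalar parameter from posterior draws:
    SELF -> posterior mean,
    LLF  -> -(1/h) * log E[exp(-h * eta)],
    ELF  -> (E[eta ** -p]) ** (-1/p)."""
    m = len(draws)
    self_est = sum(draws) / m
    llf_est = -math.log(sum(math.exp(-h * d) for d in draws) / m) / h
    elf_est = (sum(d ** (-p) for d in draws) / m) ** (-1.0 / p)
    return self_est, llf_est, elf_est
```

For h, p → 0 all three reduce to the posterior mean, while for h, p > 0 the LLF and ELF estimates are pulled below it by Jensen's inequality, reflecting the asymmetry of those losses.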
We can see that the Bayes estimators take the form of a ratio of integrals that cannot be simplified to closed form. Hence, we use two approximation methods, namely the Lindley approximation and the importance sampling method, to evaluate this ratio of integrals; they are discussed in the following sections.

Lindley's approximation method
This section discusses the Lindley approximation method for the proposed Bayes estimates of the PFD parameters based on upper record values. In this model, the Bayes estimators of η = (θ, λ) under SELF, ELF and LLF given in (8)-(10) are ratios of integrals of the form ∫ u(η) e^(ℓ(η) + ρ(η)) dη / ∫ e^(ℓ(η) + ρ(η)) dη, where ℓ is the log-likelihood and ρ the log-prior; Lindley's method approximates this ratio by a Taylor expansion about the MLE.

Importance sampling procedure
In this section, we discuss another method, the importance sampling procedure, to evaluate the ratio of integrals in the Bayes estimation. We also derive the HPD credible intervals for the unknown parameters. The joint posterior distribution given in (7) can be rewritten as the product of a density f(θ; β1, γ1), a conditional density f(λ | θ; β2, γ2), and a weight function w(θ, λ), (13) where β1, γ1, β2 and γ2 are functions of the data and the hyper-parameters. Therefore, the Bayes estimators of θ and λ can be calculated using the following steps: 1. Generate θ1 from f(θ; β1, γ1). 2. For the generated value θ1, generate λ1 from f(λ | θ1; β2, γ2). 3. Repeat Steps 1-2 M times to obtain (θi, λi), i = 1, ..., M. 4. The Bayes estimate of any function u(θ, λ) under SELF is the weighted average Σ u(θi, λi) w(θi, λi) / Σ w(θi, λi); the LLF and ELF estimates follow by the corresponding transformations. The HPD credible intervals are then obtained from the weighted sample using the method of Chen and Shao (1999).
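One concrete instance of this idea, assuming (for illustration) that the gamma priors themselves serve as the proposal, so that the importance weight is exactly the record likelihood (4); this is a simplification of the scheme, not necessarily the decomposition (13):

```python
import math
import random

def is_bayes_theta(records, a1, b1, a2, b2, M=3000, seed=3):
    """Importance-sampling estimate of E[theta | r] under SELF: draw
    (theta, lam) from the gamma priors and weight by the record likelihood."""
    rng = random.Random(seed)
    draws = []
    for _ in range(M):
        theta = rng.gammavariate(a1, 1.0 / b1)
        lam = rng.gammavariate(a2, 1.0 / b2)
        if lam <= records[-1]:
            continue                         # likelihood is zero off the support
        ll = len(records) * math.log(theta) \
            + (theta - 1.0) * sum(math.log(r) for r in records) \
            - theta * math.log(lam) \
            - sum(math.log(lam ** theta - r ** theta) for r in records[:-1])
        draws.append((theta, ll))
    top = max(ll for _, ll in draws)         # stabilise the exponentiation
    wsum = sum(math.exp(ll - top) for _, ll in draws)
    return sum(t * math.exp(ll - top) for t, ll in draws) / wsum
```

Working with log-weights and subtracting their maximum before exponentiating avoids underflow when the likelihood values are very small.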

MCMC Method
The Gibbs sampling technique generates samples from the joint posterior distribution (7). The first step is to break down θ and λ in the joint posterior distribution into full conditional distributions. The conditional posterior distribution of θ given λ can be written as

π(θ | λ, r) ∝ θ^(n + a1 − 1) e^(−b1 θ) λ^(−θ) ∏_{i=1}^{n} ri^(θ−1) / ∏_{i=1}^{n−1} (λ^θ − ri^θ). (14)

The conditional posterior distribution of λ given θ can be written as

π(λ | θ, r) ∝ λ^(a2 − 1) e^(−b2 λ) λ^(−θ) / ∏_{i=1}^{n−1} (λ^θ − ri^θ), λ > rn. (15)

Since these conditionals are not of standard form, the algorithm of Tierney (1994) is used to find estimates of θ and λ from (14) and (15); this is the Metropolis-Hastings method used within Gibbs sampling. As proposals, we take for θ a gamma distribution with shape parameter (a1 + n) and the corresponding rate parameter obtained from (14), and for λ a gamma distribution with parameters (a2 + nθ) and b2. Hence, one can generate a sample from the posterior distribution of θ and λ. The detailed algorithm is given below.

Repeat the sampling steps N times to obtain the chain {(θi, λi), i = 1, ..., N}.
Compute the Bayes estimates of θ under SELF, ELF and LLF. Using the first M iterations as the burn-in period, the estimates of θ are respectively given by

θ̂self = (1/(N − M)) Σ_{i=M+1}^{N} θi,

θ̂elf = [(1/(N − M)) Σ_{i=M+1}^{N} θi^(−p)]^(−1/p),

and θ̂llf = −(1/h) log[(1/(N − M)) Σ_{i=M+1}^{N} exp(−h θi)].
Similarly, we can construct the Bayes estimator of λ using different loss functions.
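A minimal Metropolis-within-Gibbs sketch targeting the posterior (7); for simplicity we use Gaussian random-walk proposals rather than the gamma proposals described above, so this illustrates the scheme rather than reproducing it exactly:

```python
import math
import random

def log_post(theta, lam, records, a1, b1, a2, b2):
    """Log joint posterior (7) up to a constant: record likelihood (4)
    times independent gamma priors on theta and lam."""
    if theta <= 0.0 or lam <= records[-1]:
        return -math.inf
    ll = len(records) * math.log(theta) \
        + (theta - 1.0) * sum(math.log(r) for r in records) \
        - theta * math.log(lam) \
        - sum(math.log(lam ** theta - r ** theta) for r in records[:-1])
    ll += (a1 - 1.0) * math.log(theta) - b1 * theta    # gamma prior on theta
    ll += (a2 - 1.0) * math.log(lam) - b2 * lam        # gamma prior on lam
    return ll

def mh_within_gibbs(records, a1, b1, a2, b2, N=3000, burn=500, seed=11):
    """Update theta and lam in turn with random-walk Metropolis steps;
    returns the post-burn-in chain of (theta, lam) pairs."""
    rng = random.Random(seed)
    theta, lam = 1.0, records[-1] * 1.5
    chain = []
    for i in range(N):
        for which in (0, 1):
            t_new = theta + (rng.gauss(0.0, 0.3) if which == 0 else 0.0)
            l_new = lam + (rng.gauss(0.0, 0.3) if which == 1 else 0.0)
            delta = (log_post(t_new, l_new, records, a1, b1, a2, b2)
                     - log_post(theta, lam, records, a1, b1, a2, b2))
            if math.log(rng.random()) < delta:   # accept with prob min(1, e^delta)
                theta, lam = t_new, l_new
        if i >= burn:
            chain.append((theta, lam))
    return chain
```

The resulting chain can be fed directly into the burn-in averaging formulas above; the proposal standard deviation 0.3 is an arbitrary tuning choice.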

Maximum Likelihood Prediction
The Bayesian prediction of record data and the related prediction intervals is an important topic in the literature. For example, if we have observed record values of rainfall and snowfall to date, the coming rainfall and snowfall can be predicted based on the current record values. However, little attention has been given to prediction problems for the PFD. The focus here is on predicting the s th upper record value from the past record values R = (R1, ..., Rn), s > n. Suppose that the first n upper record values R = (R1, ..., Rn) have been observed from a population with pdf f(x | η). To find a predicted value of Y = Rs, s > n, the joint predictive likelihood function of Y = Rs and η is needed, which is given by

L(y, η | r) = ([H(y) − H(rn)]^(s−n−1) / Γ(s − n)) (f(y) / (1 − F(rn))) f(r | η),

where H(x) = −log(1 − F(x)) and f(r | η) is the joint record density in (3). Thus, by using (1)-(2), the log predictive likelihood function for the PFD is obtained, where 0 < r1 < r2 < ... < rn < y < λ. Hence, setting the partial derivatives of the log predictive likelihood equal to zero yields the predictive likelihood equations (16) and (17).
Solving (16)-(17) numerically, one can predict the value of the s th upper record for the PFD. The Bayesian approach to predicting future record values under SELF, ELF and LLF is based on the same data: having observed R, one wants to predict the s th upper record Y = Rs, s > n. The conditional pdf of Y given R is of the following form:

f(y | r, η) = ([H(y) − H(rn)]^(s−n−1) / Γ(s − n)) f(y) / (1 − F(rn)), y > rn. (18)
The posterior predictive density of y given the observed data is f*(y | r) = ∫ f(y | r, η) π(η | r) dη. Hence the predictive survival function is given by (19). So, the lower and upper 100(1 − α)% prediction bounds (L(y), U(y)) for y are obtained by equating (19) to 1 − α/2 and α/2, respectively. Then, for the PFD with pdf given by (1), the conditional density function in (18) can be written explicitly.
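Since H(Rs) − H(Rn) is a sum of s − n standard exponential spacings, predictive draws of Rs are easy to simulate once (θ, λ) are fixed at estimates. The equal-tail interval below is a Monte Carlo stand-in for the bounds obtained from (19), not an analytical evaluation:

```python
import math
import random

def predict_future_record(r_n, s_minus_n, theta, lam, M=5000, alpha=0.05, seed=5):
    """Simulate the s-th upper record given R_n = r_n using
    H(R_s) = H(R_n) + E_1 + ... + E_{s-n}, with H(x) = -log(1 - (x/lam)**theta).
    Returns the predictive mean and an equal-tail 100(1-alpha)% interval."""
    rng = random.Random(seed)
    h_n = -math.log(1.0 - (r_n / lam) ** theta)
    draws = []
    for _ in range(M):
        h = h_n + sum(rng.expovariate(1.0) for _ in range(s_minus_n))
        draws.append(lam * (1.0 - math.exp(-h)) ** (1.0 / theta))
    draws.sort()
    mean = sum(draws) / M
    return mean, (draws[int(alpha / 2 * M)], draws[int((1 - alpha / 2) * M) - 1])
```

Replacing the plug-in (θ, λ) by fresh posterior draws inside the loop would turn this into a sample from the posterior predictive density f*(y | r).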

Numerical Data
This section illustrates the proposed methods using simulated and real data sets. We study the performance of the MLEs, the Bayes estimators, the confidence/credible intervals, and the predictors using simulated data and real-life data.

Simulation study
This section compares the Bayes and ML estimators developed in the previous sections using a simulation study. Using the coverage probability concept, we compare the performance of the asymptotic confidence intervals (ACI), the bootstrap confidence intervals, and the HPD credible intervals. To compare the point predictors, we use the mean squared prediction error (MSPE). Figure 1 compares the estimators in terms of MSE, and Figure 2 shows the trace plots and autocorrelation plots of the Markov chain samples. We can draw the following conclusions from the numerical results reported in Tables 1-5.
1. The bias and MSE of the proposed estimators decrease as the number of records increases, which supports the consistency of the estimators.
2. The average length of the approximate confidence intervals decreases as the number of records increases, while the coverage probability stays around 0.95.
3. From Tables 2-4, the MSE of the estimators decreases with increasing sample size.
4. The HPD credible intervals are narrower than the asymptotic confidence intervals in all cases. Moreover, the width of the confidence/HPD intervals decreases as the sample size increases.
5. The coverage probability of the asymptotic confidence intervals is, in most cases, well below the nominal level, even at substantial sample sizes.
6. In Figure 1, in terms of MSE, the Bayes estimators perform better than the MLEs, since the Bayes estimators incorporate additional information through the prior.
7. Figure 2 shows the sequential realizations of the model parameters. The chain exhibits several flat stretches and suffers from slow mixing, and the autocorrelation remains significant even after lag 30; the chain can be run longer to find an appropriate burn-in.
8. Among the bootstrap confidence intervals, in most cases the boot-t intervals have longer expected lengths than the boot-p intervals, while their performance is almost identical in terms of coverage probability.
9. The MSPEs of the point predictors decrease as the sample size increases in all cases. The Bayes predictors under the Linex loss function with shape parameter h = 2 perform better than the other predictors.

Data Analysis
To study the performance of the estimators proposed in this paper, we use a real data set consisting of the number of thousands of cycles to failure for electrical appliances in a life test (Yousaf et al. (2019)). The appliances were operated repeatedly by an automatic testing machine, and the lifespan is the number of usage cycles completed until the appliance fails. We checked the validity of the PFD using the Kolmogorov-Smirnov (KS) test and observed a KS distance of 0.0645 with a corresponding p-value of 0.9769. The fitted pdf of the PFD is presented in Figure 3, which indicates that the PFD provides a good fit to the data. For the Bayes estimation, we choose the hyper-parameter values a1 = 1.6, b1 = 3.2, a2 = 2 and b2 = 2. The Bayes estimates under the Linex and entropy loss functions are evaluated for h = 2 and p = 2. The MLEs and Bayes estimates under the different loss functions are given in Table 6.

Conclusions
In this paper, we proposed the MLE and Bayes estimators of the PFD parameters under different loss functions and predicted future record values when samples are available only in the form of upper record values. We constructed asymptotic and bootstrap confidence intervals for the unknown parameters based on the MLE. The Bayes estimates under the squared error, entropy and Linex loss functions take the form of a ratio of two integrals; further, using the method of Chen and Shao (1999), we computed the highest posterior density intervals. The analyses of simulated and real data show that the Bayes estimators under the different loss functions perform better than the MLE. The proposed Bayesian modelling linked with MCMC estimation has worked well: in Figure 2, the Markov chain appears to mix adequately, and after some initial upward and downward trends it settles toward convergence. We also predicted the future record values when samples are available only in the form of upper records. Finally, the Bayesian modelling has been of great benefit, and similar approaches are likely to be helpful in other situations, which can be an area of future study.