An Appraisal on Some Methods for Estimating the 2-Parameter Weibull Distribution with Application to Wind Speeds Sample

Six methods for estimating the Weibull shape and scale parameters are considered and compared in this paper. These methods are: the least squares method, weighted least squares method, method of moments, energy pattern factor method, method of L-moments and the maximum likelihood method. A simulation study and an application to a real data set (a wind speeds sample) are used to test the performance of the different methods under the smallest mean square error criterion. Results from the simulation study indicate that the maximum likelihood method is the most efficient method for large sample sizes, while the weighted least squares method, method of moments and method of L-moments are quite efficient for small and moderate sample sizes. When all six methods were applied to the wind speeds sample, the maximum likelihood method performed best, attaining the smallest mean square error. A very useful result obtained from the study is that the weighted least squares method performed considerably well in estimating the Weibull parameters, an outcome that is rare in many studies.


Introduction
Wind speed is a classic example of a stochastic variable. Due to this stochastic nature, its characteristics and behavior at a given location can be captured by fitting specific probability distributions to wind speed samples collected over time. The 2-parameter Weibull distribution has become the most widely used probability distribution in wind speed analysis. Results from many studies suggest that, owing to its flexibility, it fits many wind regimes adequately, whether in tropical or extratropical latitudes. Central to the use of the Weibull model in wind speed analysis is the estimation of its parameters. Several methods for estimating the Weibull parameters have been proposed over the years. Generally, these methods can be classified into two broad categories, namely graphical methods and statistical methods.
The graphical methods have been found to be very simple to implement. They also have the direct advantage of showing visibly whether or not the Weibull model fits the observed sample well, and they offer initial estimates that can be used as starting points for the statistical methods. However, estimates obtained from the graphical methods have been found to be less accurate than those obtained using statistical methods. Some commonly used graphical methods include the empirical cumulative distribution function plot, the Weibull probability plot and the hazard rate plot. Statistical methods, on the other hand, are based on mathematical and statistical theory, which makes them more general and applicable to different data types. The Weibull estimators obtained using statistical methods are generally more accurate, but more complex to deal with, than their graphical counterparts. While it may not be possible to obtain the asymptotic properties of estimators based on the graphical methods, this is possible with the statistical methods. Some commonly used statistical methods include the method of moments, method of percentiles, method of maximum likelihood estimation, L-moments estimators, interval estimation, the Bayesian method and the energy pattern factor method. Many works have appeared in the literature over the years aimed at determining the best method for estimating the Weibull shape and scale parameters.
The rest of this paper is organized as follows. The Weibull distribution is introduced in section 2, and some methods for estimating the Weibull parameters are considered in section 3. A simulation study and its results are presented in sections 4 and 5 respectively. An application to a wind speeds sample and the results obtained from it are contained in sections 6 and 7 respectively, with the conclusion in section 8.

Weibull Distribution
The Weibull distribution is a continuous probability distribution named after Waloddi Weibull, who described it in detail in 1951, although it was first identified by Maurice Fréchet in 1927 and first applied by Rosin and Rammler in 1933 to describe the size distribution of particles. The probability density function (PDF) and cumulative distribution function (CDF) of the Weibull distribution are respectively of the form
$$f(x) = \frac{k}{c}\left(\frac{x}{c}\right)^{k-1}\exp\left[-\left(\frac{x}{c}\right)^{k}\right], \quad x \ge 0, \tag{1}$$
$$F(x) = 1 - \exp\left[-\left(\frac{x}{c}\right)^{k}\right], \quad x \ge 0, \tag{2}$$
where $k > 0$ and $c > 0$ are the shape and scale parameters respectively. When applied to wind speed studies, $k$ is a dimensionless shape parameter that shows the peakedness of the distribution of the wind speeds at the measuring location, and for varying values of $k$ the distribution of the wind speeds takes the shape of other distributions. For $k = 1$ the distribution is exponential, for $k = 2$ it is the Rayleigh distribution, and for $k$ of roughly 3 to 4 the distribution becomes approximately normal. The scale parameter $c$, measured in the dimension of the wind speed (m/s), shows the dispersion of the wind speeds at a given location. The inverse CDF, or quantile function, of the Weibull distribution is obtained by finding the root of the equation $F(x) - u = 0$ for $0 < u < 1$. This gives the quantile function as
$$Q(u) = c\left[-\ln(1-u)\right]^{1/k}. \tag{3}$$
IASSL ISSN-2424-6271
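As a minimal sketch of the three functions just described, the following Python code evaluates the Weibull PDF, CDF and quantile function and draws a sample by pushing uniform variates through the quantile function. The symbols `k` (shape) and `c` (scale), the function names and the seed are assumptions made for illustration; they are not taken from the paper.

```python
import math
import random


def weibull_pdf(x, k, c):
    """Weibull density with shape k and scale c (taken as 0 for x <= 0)."""
    if x <= 0:
        return 0.0
    z = x / c
    return (k / c) * z ** (k - 1) * math.exp(-(z ** k))


def weibull_cdf(x, k, c):
    """Weibull CDF: probability that the variable does not exceed x."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-((x / c) ** k))


def weibull_quantile(u, k, c):
    """Inverse CDF: the x that solves F(x) = u for 0 <= u < 1."""
    return c * (-math.log(1.0 - u)) ** (1.0 / k)


def weibull_sample(n, k, c, seed=0):
    """Draw n values by inverse-transform sampling (hypothetical helper)."""
    rng = random.Random(seed)
    return [weibull_quantile(rng.random(), k, c) for _ in range(n)]
```

Applying `weibull_cdf` to the output of `weibull_quantile` should return the original probability, which is a quick way to check that the two functions are inverses of each other.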

Methods for Estimating the Weibull Shape and Scale Parameters
In this section we consider some methods for estimating the Weibull parameters. These methods include the least squares method (LSM), weighted least squares method (WLSM), method of moments (MOM), energy pattern factor method (EPFM), method of L-moments (MOL) and maximum likelihood method (MLM).

3.1. Least Squares Method (LSM)
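The LSM is extensively used in practical problems to estimate the parameters of a model. Consider the model of the form
$$Y_i = a + bX_i + \varepsilon_i, \quad i = 1, 2, \ldots, n. \tag{4}$$
The least squares estimates of the parameters $a$ and $b$ are the values which minimize the function
$$S(a, b) = \sum_{i=1}^{n}\left(Y_i - a - bX_i\right)^2,$$
and thus the estimators of $b$ and $a$ are given respectively by
$$\hat{b} = \frac{n\sum_{i=1}^{n}X_iY_i - \sum_{i=1}^{n}X_i\sum_{i=1}^{n}Y_i}{n\sum_{i=1}^{n}X_i^2 - \left(\sum_{i=1}^{n}X_i\right)^2}, \qquad \hat{a} = \bar{Y} - \hat{b}\bar{X}.$$
Now, to estimate the Weibull parameters $k$ and $c$ using the LSM, we first linearize the Weibull CDF given by (2). Taking logarithms twice gives
$$\ln\left[-\ln\left(1 - F(x)\right)\right] = k\ln x - k\ln c. \tag{8}$$
Equation (8) is of the form (4) with $Y_i = \ln[-\ln(1 - \hat{F}(x_{(i)}))]$, $X_i = \ln x_{(i)}$, $b = k$ and $a = -k\ln c$, where $x_{(1)} \le x_{(2)} \le \cdots \le x_{(n)}$ are the ordered observations and $\hat{F}(x_{(i)})$ is an estimate of the CDF at the $i$th ordered observation. It follows that
$$\hat{k} = \frac{n\sum_{i=1}^{n}X_iY_i - \sum_{i=1}^{n}X_i\sum_{i=1}^{n}Y_i}{n\sum_{i=1}^{n}X_i^2 - \left(\sum_{i=1}^{n}X_i\right)^2} \tag{12}$$
and
$$\hat{c} = \exp\left(\bar{X} - \frac{\bar{Y}}{\hat{k}}\right). \tag{13}$$
The LSM estimates of the Weibull shape and scale parameters are obtained using (12) and (13) respectively.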
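The least squares fit on the linearized CDF can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' implementation: the CDF at the ordered observations is estimated here with the median-rank plotting position $(i - 0.3)/(n + 0.4)$, which is an assumed choice, and the names `lsm_weibull`, `k_hat` and `c_hat` are hypothetical.

```python
import math


def lsm_weibull(data):
    """Least squares estimates (k_hat, c_hat) from the linearized Weibull CDF.

    Regresses Y = ln(-ln(1 - F)) on X = ln(x); the slope estimates the shape k
    and the intercept equals -k * ln(c).
    """
    xs = sorted(data)                       # order statistics, all must be > 0
    n = len(xs)
    # ASSUMPTION: median-rank plotting position as the CDF estimate.
    F = [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]
    X = [math.log(x) for x in xs]
    Y = [math.log(-math.log(1.0 - f)) for f in F]
    x_bar = sum(X) / n
    y_bar = sum(Y) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
    s_xx = sum((x - x_bar) ** 2 for x in X)
    k_hat = s_xy / s_xx                     # slope of the fitted line
    c_hat = math.exp(x_bar - y_bar / k_hat) # from intercept = -k * ln(c)
    return k_hat, c_hat
```

If the data lie exactly on a Weibull probability plot line, the function recovers the generating parameters up to rounding error, which gives a simple deterministic check.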

3.2. Weighted Least Squares Method (WLSM)
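For the model given by (4), the WLSM estimates of the parameters $a$ and $b$ are the values which minimize the function
$$Q(a, b) = \sum_{i=1}^{n} w_i\left(Y_i - a - bX_i\right)^2,$$
where $w_i$ is a weight factor. Hence the WLSM estimators of $b$ and $a$ are given by
$$\hat{b} = \frac{\sum w_i \sum w_iX_iY_i - \sum w_iX_i \sum w_iY_i}{\sum w_i \sum w_iX_i^2 - \left(\sum w_iX_i\right)^2}, \qquad \hat{a} = \frac{\sum w_iY_i - \hat{b}\sum w_iX_i}{\sum w_i},$$
with all sums running over $i = 1, \ldots, n$. Taking (8) to be of the form (4), with $X_i = \ln x_{(i)}$ and $Y_i = \ln[-\ln(1 - \hat{F}(x_{(i)}))]$, and using similar arguments as put forward for the LSM, one readily obtains $b = k$ and $a = -k\ln c$. It follows that
$$\hat{k} = \frac{\sum w_i \sum w_iX_iY_i - \sum w_iX_i \sum w_iY_i}{\sum w_i \sum w_iX_i^2 - \left(\sum w_iX_i\right)^2}$$
and
$$\hat{c} = \exp\left(-\frac{\sum w_iY_i - \hat{k}\sum w_iX_i}{\hat{k}\sum w_i}\right).$$
In this paper, a weight function $w_i$ proposed in the literature is used.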
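A weighted version of the same straight-line fit can be sketched as follows. The weight function used in the paper is not recoverable from the text, so the default weight below, $[(1 - F)\ln(1 - F)]^2$, is only an illustrative assumption of a form that appears in the Weibull plotting literature; the median-rank plotting position and the name `wlsm_weibull` are likewise assumptions.

```python
import math


def wlsm_weibull(data, weight=None):
    """Weighted least squares estimates (k_hat, c_hat) for the Weibull model.

    `weight` maps a CDF estimate F in (0, 1) to a non-negative weight.
    ASSUMPTION: the default weight and the plotting position are illustrative
    choices, not necessarily those used in the paper.
    """
    if weight is None:
        weight = lambda f: ((1.0 - f) * math.log(1.0 - f)) ** 2
    xs = sorted(data)                       # order statistics, all must be > 0
    n = len(xs)
    F = [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]
    X = [math.log(x) for x in xs]
    Y = [math.log(-math.log(1.0 - f)) for f in F]
    w = [weight(f) for f in F]
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, X))
    swy = sum(wi * yi for wi, yi in zip(w, Y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, X))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, X, Y))
    k_hat = (sw * swxy - swx * swy) / (sw * swxx - swx ** 2)  # weighted slope
    a_hat = (swy - k_hat * swx) / sw                          # weighted intercept
    c_hat = math.exp(-a_hat / k_hat)                          # intercept = -k ln c
    return k_hat, c_hat
```

Passing `weight=lambda f: 1.0` reduces the function to the unweighted least squares fit, which makes the effect of the weighting easy to compare on the same data.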

3.3. Method of Moments (MOM)
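It is standard practice in statistics, when fitting a parametric family of distributions to a data set, to estimate the parameters by equating the sample moments to those of the fitted distribution. For a continuous random variable $X$ with PDF $f(x)$, the $r$th moment about the origin is defined as
$$\mu'_r = E(X^r) = \int_{-\infty}^{\infty} x^r f(x)\,dx.$$
If the random variable $X$ follows the Weibull distribution, the $r$th moment about the origin is given by
$$\mu'_r = \int_{0}^{\infty} x^r\,\frac{k}{c}\left(\frac{x}{c}\right)^{k-1}\exp\left[-\left(\frac{x}{c}\right)^{k}\right]dx. \tag{23}$$
Evaluating (23) gives
$$\mu'_r = c^r\,\Gamma\left(1 + \frac{r}{k}\right), \tag{24}$$
where $\Gamma(\cdot)$ is the complete gamma function defined as
$$\Gamma(z) = \int_{0}^{\infty} t^{z-1}e^{-t}\,dt.$$
The mean of the Weibull random variable is obtained from (24) by setting $r = 1$ to obtain
$$\mu = c\,\Gamma\left(1 + \frac{1}{k}\right). \tag{26}$$
The variance, defined as $\sigma^2 = \mu'_2 - \mu^2$, is thus given for the Weibull distribution as
$$\sigma^2 = c^2\left[\Gamma\left(1 + \frac{2}{k}\right) - \Gamma^2\left(1 + \frac{1}{k}\right)\right].$$
For a random sample $x_1, x_2, \ldots, x_n$ of size $n$, the $r$th sample moment is defined as
$$m'_r = \frac{1}{n}\sum_{i=1}^{n} x_i^r. \tag{28}$$
The sample mean is obtained from (28) by setting $r = 1$ to obtain
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i,$$
and consequently the sample variance is given by
$$s^2 = \frac{1}{n}\sum_{i=1}^{n} x_i^2 - \bar{x}^2.$$
The MOM estimators of the Weibull shape and scale parameters $k$ and $c$ are obtained by equating the population moments in (24) to the sample moments in (28). In particular, from (26),
$$\bar{x} = c\,\Gamma\left(1 + \frac{1}{k}\right), \tag{31}$$
and equating the variances and dividing by the square of (31) eliminates $c$, so that
$$\frac{\Gamma\left(1 + \frac{2}{k}\right)}{\Gamma^2\left(1 + \frac{1}{k}\right)} - 1 - \frac{s^2}{\bar{x}^2} = 0 \tag{34}$$
is a function of $k$ only. The estimate $\hat{k}$ of the shape parameter is obtained by numerically finding the root of the equation given in (34). Once $\hat{k}$ is obtained, the scale parameter is then obtained from (31) as
$$\hat{c} = \frac{\bar{x}}{\Gamma\left(1 + \frac{1}{\hat{k}}\right)}. \tag{35}$$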
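The numerical root-finding step of the MOM can be illustrated with a short Python sketch. It matches the squared coefficient of variation of the sample to its Weibull counterpart, which depends on the shape `k` only, and then recovers the scale `c` from the sample mean. The bisection solver, the bracket `[0.05, 100]`, the tolerance and the function name are assumptions chosen for illustration; the paper does not state which root finder was used.

```python
import math


def mom_weibull(data, lo=0.05, hi=100.0, tol=1e-10):
    """Method-of-moments estimates (k_hat, c_hat) for the Weibull model."""
    n = len(data)
    mean = sum(data) / n
    var = sum((x - mean) ** 2 for x in data) / n   # second central sample moment
    target = var / mean ** 2                       # squared coefficient of variation

    def g(k):
        # Gamma(1 + 2/k) / Gamma(1 + 1/k)^2 - 1 - s^2 / mean^2, via log-gamma
        ratio = math.exp(math.lgamma(1 + 2 / k) - 2 * math.lgamma(1 + 1 / k))
        return ratio - 1.0 - target

    if g(lo) * g(hi) > 0:
        raise ValueError("no root for the shape parameter inside the bracket")
    while hi - lo > tol:                           # plain bisection (assumed solver)
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    k_hat = 0.5 * (lo + hi)
    c_hat = mean / math.gamma(1 + 1 / k_hat)       # scale from the sample mean
    return k_hat, c_hat
```

A sample whose variance equals the square of its mean has the coefficient of variation of an exponential distribution, so the sketch should return a shape close to 1 for such data.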

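3.4. Energy Pattern Factor Method (EPFM)
The EPFM for estimating the Weibull parameters, developed within the context of wind speed analysis, is based on the energy pattern factor (EPF). The EPF is the ratio of the average wind power density (WPD) to the power density evaluated at the mean wind speed; the common factor $\rho/2$ cancels, leaving the mean of the cubed wind speeds divided by the cube of the mean wind speed. The WPD is obtained as follows.
Wind, as air in motion, possesses kinetic energy. For a given air mass $m$, the kinetic energy present in air moving with velocity $v$ is given as
$$E = \frac{1}{2}mv^2. \tag{36}$$
To obtain the mass of flowing air passing through an area $A$ perpendicular to its velocity, its volume $Avt$ after time $t$ has elapsed is multiplied by the air density $\rho$ (whose value in this paper is taken to be $1.225\ \mathrm{kg/m^3}$). This gives the value of $m$ as
$$m = \rho A v t. \tag{37}$$
Putting (37) into (36) gives the total wind energy as
$$E = \frac{1}{2}\rho A t v^3. \tag{38}$$
To obtain the total wind power, we consider that power is the rate of change of energy with time. Thus we differentiate (38) with respect to $t$ to obtain the wind power
$$P = \frac{dE}{dt} = \frac{1}{2}\rho A v^3. \tag{39}$$
Observe that the wind power is proportional to the cube of the wind velocity $v$. The actual average wind power based on a sample of $n$ wind speed observations $v_1, \ldots, v_n$ is obtained from (39) by replacing $v^3$ with the mean value of the third power of the observations, $\frac{1}{n}\sum_{i=1}^{n} v_i^3$. It follows that the average WPD associated with the PDF $f(v)$ of the wind speed random variable $V$ is given as
$$\frac{\bar{P}}{A} = \frac{1}{2}\rho\int_{0}^{\infty} v^3 f(v)\,dv. \tag{40}$$
Observe also that the average WPD given by (40) is obtained by computing the third moment about zero of the wind speed random variable $V$. If $V$ follows the Weibull distribution, one readily obtains the WPD using (24) as
$$\frac{\bar{P}}{A} = \frac{1}{2}\rho\,c^3\,\Gamma\left(1 + \frac{3}{k}\right). \tag{41}$$
The cube of the mean wind speed for a wind speed random variable following the Weibull distribution is given by
$$\left[E(V)\right]^3 = c^3\,\Gamma^3\left(1 + \frac{1}{k}\right). \tag{42}$$
Since the factor $\rho/2$ is common to the average WPD and to the power density at the mean wind speed, it follows that the EPF for a wind speed random variable following the Weibull distribution is given by
$$\mathrm{EPF} = \frac{E(V^3)}{\left[E(V)\right]^3} = \frac{\Gamma\left(1 + \frac{3}{k}\right)}{\Gamma^3\left(1 + \frac{1}{k}\right)}. \tag{43}$$
Equating the population moments which make up (43) to their corresponding sample moments, we obtain
$$\frac{\frac{1}{n}\sum_{i=1}^{n} v_i^3}{\left(\frac{1}{n}\sum_{i=1}^{n} v_i\right)^3} = \frac{\Gamma\left(1 + \frac{3}{k}\right)}{\Gamma^3\left(1 + \frac{1}{k}\right)},$$
which is further expressed as
$$\frac{\Gamma\left(1 + \frac{3}{k}\right)}{\Gamma^3\left(1 + \frac{1}{k}\right)} - \frac{n^2\sum_{i=1}^{n} v_i^3}{\left(\sum_{i=1}^{n} v_i\right)^3} = 0. \tag{44}$$
The estimate of $k$ is obtained numerically by finding the root of the equation given by (44). Once the estimate of $k$ is obtained, the estimate of $c$ can be obtained using (35).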
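The following Python sketch illustrates the two computations in this subsection: the average wind power density of a sample, and the EPFM estimates obtained by matching the sample ratio of the mean cubed speed to the cubed mean speed against its Weibull counterpart. The bisection solver, its bracket and tolerance, and the function names are assumptions for illustration; `rho=1.225` is the air density quoted in the text.

```python
import math


def wind_power_density(speeds, rho=1.225):
    """Average wind power per unit area: 0.5 * rho * mean(v^3)."""
    return 0.5 * rho * sum(v ** 3 for v in speeds) / len(speeds)


def epfm_weibull(speeds, lo=0.05, hi=100.0, tol=1e-10):
    """Energy pattern factor estimates (k_hat, c_hat) for the Weibull model."""
    n = len(speeds)
    mean_v = sum(speeds) / n
    mean_v3 = sum(v ** 3 for v in speeds) / n
    epf = mean_v3 / mean_v ** 3                    # sample energy pattern factor

    def g(k):
        # Gamma(1 + 3/k) / Gamma(1 + 1/k)^3 - EPF, evaluated via log-gamma
        return math.exp(math.lgamma(1 + 3 / k) - 3 * math.lgamma(1 + 1 / k)) - epf

    if g(lo) * g(hi) > 0:
        raise ValueError("no root for the shape parameter inside the bracket")
    while hi - lo > tol:                           # plain bisection (assumed solver)
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    k_hat = 0.5 * (lo + hi)
    c_hat = mean_v / math.gamma(1 + 1 / k_hat)     # scale from the mean speed
    return k_hat, c_hat
```

At the returned shape value, the Weibull ratio of gamma functions should reproduce the sample energy pattern factor to within the solver tolerance.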

3.5. Method of L-moments (MOL)
The moment-based methods for estimating the Weibull parameters discussed in sections 3.3 and 3.4 are not always satisfactory. Sample moments, especially when the sample size is small, can differ significantly from those of the probability distribution from which the sample was drawn, and the estimated parameters of distributions fitted by the method of moments are often markedly less accurate than those obtainable by other estimation procedures such as the maximum likelihood method. An alternative to the conventional moment-based methods is the method based on L-moments. L-moments, though analogous to the conventional moments, are estimated by linear combinations of order statistics, i.e. by L-statistics. L-moments have the theoretical advantages over conventional moments of being able to characterize a wider range of distributions and, when estimated from a sample, of being more robust to the presence of outliers in the data.
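Let $X$ be a real-valued random variable with CDF $F(x)$ and quantile function $Q(u)$, and let $X_{1:n} \le X_{2:n} \le \cdots \le X_{n:n}$ be the order statistics of a random sample of size $n$ drawn from the distribution of $X$. The L-moments of $X$ are the quantities
$$\lambda_r = \frac{1}{r}\sum_{j=0}^{r-1}(-1)^j\binom{r-1}{j}E\left(X_{r-j:r}\right), \quad r = 1, 2, \ldots$$
The natural estimator of $\lambda_r$ based on an observed sample of data is a linear combination of the ordered data values, i.e. an L-statistic. The expectation of an order statistic may be written as
$$E\left(X_{j:r}\right) = \frac{r!}{(j-1)!\,(r-j)!}\int x\left[F(x)\right]^{j-1}\left[1 - F(x)\right]^{r-j}dF(x). \tag{46}$$
Substituting (46) into the definition of $\lambda_r$ and changing the variable to $u = F(x)$ gives, for the first two L-moments,
$$\lambda_1 = \int_{0}^{1} Q(u)\,du, \qquad \lambda_2 = \int_{0}^{1} Q(u)\,(2u - 1)\,du. \tag{48}$$
If $X$ follows the Weibull distribution, then the first two L-moments of $X$ using the system in (48) are given by
$$\lambda_1 = c\,\Gamma\left(1 + \frac{1}{k}\right), \tag{49}$$
$$\lambda_2 = c\,\Gamma\left(1 + \frac{1}{k}\right)\left(1 - 2^{-1/k}\right). \tag{50}$$
The L-moments described in (48) above are for a probability distribution. In practice, they are often estimated from a finite sample. Suppose $x_{1:n} \le x_{2:n} \le \cdots \le x_{n:n}$ is an ordered sample. The sample L-moments can be characterized better through the estimator of the probability weighted moments $\beta_r = E\{X[F(X)]^r\}$. An unbiased estimator of $\beta_r$ based on the ordered sample is defined as
$$b_r = \frac{1}{n}\sum_{i=r+1}^{n}\frac{(i-1)(i-2)\cdots(i-r)}{(n-1)(n-2)\cdots(n-r)}\,x_{i:n}.$$
The sample L-moments are related to $b_r$ by
$$l_{r+1} = \sum_{j=0}^{r} p^{*}_{r,j}\,b_j, \quad r = 0, 1, \ldots, n-1,$$
where
$$p^{*}_{r,j} = (-1)^{r-j}\binom{r}{j}\binom{r+j}{j}.$$
The first two sample L-moments are
$$l_1 = b_0, \tag{53}$$
$$l_2 = 2b_1 - b_0. \tag{54}$$
To estimate the Weibull parameters, we equate the population L-moments given in (49) and (50) to their corresponding sample estimates given in (53) and (54). The following results follow:
$$l_1 = c\,\Gamma\left(1 + \frac{1}{k}\right), \tag{55}$$
$$l_2 = c\,\Gamma\left(1 + \frac{1}{k}\right)\left(1 - 2^{-1/k}\right). \tag{56}$$
Equation (55) implies that $c = l_1/\Gamma(1 + 1/k)$. Substituting (55) into (56) and evaluating gives the estimate of $k$ as
$$\hat{k} = -\frac{\ln 2}{\ln\left(1 - l_2/l_1\right)}. \tag{57}$$
Once $\hat{k}$ is determined from (57), the estimate of $c$ is then obtained from (55) by
$$\hat{c} = \frac{l_1}{\Gamma\left(1 + \frac{1}{\hat{k}}\right)}.$$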
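Because the L-moment estimates are available in closed form, they can be sketched directly in Python without a root finder. The code below computes the first two sample L-moments from the unbiased probability weighted moment estimators and then solves for the shape `k` and scale `c`; the function and variable names are hypothetical.

```python
import math


def lmom_weibull(data):
    """L-moment estimates (k_hat, c_hat) for the Weibull model."""
    xs = sorted(data)                              # ordered sample
    n = len(xs)
    b0 = sum(xs) / n                               # estimator of beta_0
    # estimator of beta_1: weights (i - 1) / (n - 1) for i = 1, ..., n
    b1 = sum((i - 1) / (n - 1) * x for i, x in enumerate(xs, start=1)) / n
    l1 = b0                                        # first sample L-moment
    l2 = 2.0 * b1 - b0                             # second sample L-moment
    k_hat = -math.log(2.0) / math.log(1.0 - l2 / l1)
    c_hat = l1 / math.gamma(1.0 + 1.0 / k_hat)
    return k_hat, c_hat
```

For the two-point sample `[1.0, 3.0]` the sample L-moments are 2 and 1, so the ratio is 0.5 and the sketch returns a shape of 1 and a scale of 2.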

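3.6. Maximum Likelihood Method (MLM)
Given the PDF $f(x; k, c)$ of the Weibull distribution with parameters $k$ and $c$, and a random independent sample $x_1, x_2, \ldots, x_n$ of size $n$, the MLM estimates of the parameters $k$ and $c$ are obtained by maximizing the log-likelihood function
$$\ln L(k, c) = \sum_{i=1}^{n}\ln f(x_i; k, c).$$
For the Weibull distribution, the log-likelihood function is given as
$$\ln L(k, c) = n\ln k - nk\ln c + (k - 1)\sum_{i=1}^{n}\ln x_i - \sum_{i=1}^{n}\left(\frac{x_i}{c}\right)^{k}. \tag{60}$$
To obtain the estimates of $k$ and $c$, we take the partial derivatives of (60) with respect to $k$ and $c$ and equate the resulting partial derivatives to zero. This gives
$$\frac{\partial \ln L}{\partial k} = \frac{n}{k} - n\ln c + \sum_{i=1}^{n}\ln x_i - \sum_{i=1}^{n}\left(\frac{x_i}{c}\right)^{k}\ln\left(\frac{x_i}{c}\right) = 0, \tag{61}$$
$$\frac{\partial \ln L}{\partial c} = -\frac{nk}{c} + \frac{k}{c}\sum_{i=1}^{n}\left(\frac{x_i}{c}\right)^{k} = 0. \tag{62}$$
Observe that (61) and (62) cannot be solved in closed form. From (62),
$$\hat{c} = \left(\frac{1}{n}\sum_{i=1}^{n} x_i^{\hat{k}}\right)^{1/\hat{k}},$$
and substituting this into (61) gives
$$\frac{\sum_{i=1}^{n} x_i^{k}\ln x_i}{\sum_{i=1}^{n} x_i^{k}} - \frac{1}{k} - \frac{1}{n}\sum_{i=1}^{n}\ln x_i = 0,$$
which is solved numerically for $\hat{k}$.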
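One way to carry out the likelihood maximization numerically is sketched below in Python. It solves the profile likelihood equation in the shape `k` alone, obtained by eliminating the scale `c` from the two score equations, and then computes `c` in closed form. The bisection solver, the bracket, the rescaling of the data by its maximum (used only to avoid overflow in the powers) and the function name are assumptions made for illustration; the paper does not state which numerical scheme was used.

```python
import math


def mle_weibull(data, lo=0.05, hi=100.0, tol=1e-10):
    """Maximum likelihood estimates (k_hat, c_hat); all data must be > 0."""
    n = len(data)
    m = max(data)
    y = [x / m for x in data]          # rescaled data, so y**k stays in (0, 1]
    logs = [math.log(v) for v in y]
    mean_log = sum(logs) / n

    def h(k):
        # sum(y^k ln y) / sum(y^k) - 1/k - mean(ln y); unchanged by rescaling
        powers = [v ** k for v in y]
        num = sum(p * l for p, l in zip(powers, logs))
        return num / sum(powers) - 1.0 / k - mean_log

    if h(lo) * h(hi) > 0:
        raise ValueError("no root for the shape parameter inside the bracket")
    while hi - lo > tol:               # plain bisection (assumed solver)
        mid = 0.5 * (lo + hi)
        if h(lo) * h(mid) <= 0:
            hi = mid
        else:
            lo = mid
    k_hat = 0.5 * (lo + hi)
    # scale from the second score equation, undoing the rescaling by m
    c_hat = m * (sum(v ** k_hat for v in y) / n) ** (1.0 / k_hat)
    return k_hat, c_hat
```

A quick consistency check is to substitute the returned pair back into the derivative of the log-likelihood with respect to `k`, which should be close to zero.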

Simulation Study
For the simulation study, a random sample is generated from the Weibull distribution with fixed values of the shape parameter $k$ and scale parameter $c$ for each of several sample sizes $n$. For each sample size, $B$ bootstrap samples are generated. For each method discussed in section 3, $B$ estimates of the parameter $k$ and $B$ estimates of the parameter $c$ are obtained. The means ($\bar{k}$ and $\bar{c}$) of these estimates are computed. To compare the performance of the methods of estimation, the mean square error (MSE) is computed from the relation
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left[\hat{F}(x_i) - F_n(x_i)\right]^2, \tag{66}$$
where $\hat{F}(x_i)$ is obtained by substituting $\bar{k}$ and $\bar{c}$ (for each method) in (2), while $F_n(x_i)$ is the empirical CDF (ECDF). The method with the smallest MSE is taken as the best among the others. Results from the simulation study are contained in Table 1.
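The comparison procedure can be sketched in Python as follows. The sketch resamples the data with replacement, averages the parameter estimates over the resamples, and scores the averaged fit by the mean squared difference between the fitted CDF and the ECDF. Several details are assumptions rather than facts from the paper: the ECDF is taken as $i/n$ at the $i$th ordered observation, the L-moment estimator stands in for any of the six methods, and the values of `B`, the seed and all function names are hypothetical.

```python
import math
import random


def weibull_cdf(x, k, c):
    """Weibull CDF with shape k and scale c."""
    return 1.0 - math.exp(-((x / c) ** k)) if x > 0 else 0.0


def mse_against_ecdf(data, k, c):
    """Mean squared difference between the fitted CDF and the ECDF.

    ASSUMPTION: ECDF value i/n at the i-th ordered observation (no ties).
    """
    xs = sorted(data)
    n = len(xs)
    return sum((weibull_cdf(x, k, c) - i / n) ** 2
               for i, x in enumerate(xs, start=1)) / n


def lmom_estimator(data):
    """Closed-form L-moment estimator, used here only as an example method."""
    xs = sorted(data)
    n = len(xs)
    l1 = sum(xs) / n
    b1 = sum((i - 1) / (n - 1) * x for i, x in enumerate(xs, start=1)) / n
    l2 = 2.0 * b1 - l1
    k = -math.log(2.0) / math.log(1.0 - l2 / l1)
    return k, l1 / math.gamma(1.0 + 1.0 / k)


def bootstrap_means(data, estimator, B, seed=0):
    """Average the (k, c) estimates over B resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(data)
    ks, cs = [], []
    for _ in range(B):
        resample = [data[rng.randrange(n)] for _ in range(n)]
        k, c = estimator(resample)
        ks.append(k)
        cs.append(c)
    return sum(ks) / B, sum(cs) / B
```

In use, one would call `bootstrap_means` once per estimation method on the same sample and pass each averaged pair to `mse_against_ecdf`; the method with the smallest value is then preferred under this criterion.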

Discussion of Results from Simulation Study
Results from the simulation study, as presented in Table 1 and Figure 1, clearly reveal that the MSE decreases as the sample size increases for all 6 methods considered. The ranking of the methods varies with the sample size. At one sample size the MOM has the smallest MSE, followed by the MOL, while the EPFM reports the highest MSE, followed by the WLSM, LSM and MLM respectively. At another sample size the WLSM reports the lowest MSE, followed by the MLM, MOL, MOM, LSM and EPFM. The MLM reports the smallest MSE at two of the sample sizes and the MOL at two others. It is evident from the study that the EPFM reported the highest MSE for all sample sizes. The WLSM appears to be very efficient for small sample sizes as well as for large ones. The MOL and MLM can also be viewed as good methods for small, moderate and large sample sizes.

Application to Wind Speed Sample
For the application, the Weibull distribution was fitted, using the 6 methods discussed in section 3, to a sample of daily mean wind speeds covering 3 years (2012-2015) obtained from the National Center for Energy and Environment (NCEE), University of Benin, Benin City. A total of 988 observations was used for the analysis out of a possible 1096. The missing observations were due to temporary shutdowns and maintenance of the recording systems at the Center. Results from the analysis are contained in Table 2, which also contains the mean square errors obtained using the 6 methods and computed from the relation in (66).

Discussion of Results from Application
Results obtained from using the 6 methods to fit the Weibull distribution to the wind speeds data, as contained in Table 2 and Figure 2, suggest that the Weibull distribution provides a very good model for the wind speeds data. This is evident from the fact that the MSEs obtained using the 6 methods are small, and the fitted density in Figure 2 also testifies to that. The MLM proved to be the best method for estimating the Weibull parameters for the wind speeds sample, since it has the smallest MSE. The WLSM and MOL also proved to be good methods. This result further confirms the one obtained from the simulation study on the efficacy of the MLM, WLSM and MOL in estimating the Weibull parameters.

Conclusion
In this paper, we have compared six methods for estimating the Weibull shape and scale parameters using a simulation study and an application to a real data set (a wind speeds sample). Based on the results obtained in this paper, we recommend that the MLM be used when dealing with large sample sizes. The WLSM, MOM and MOL should also be considered when fitting samples of small and moderate size.