Goodness-of-Fit Tests for the Generalized Pareto Distribution

Abstract: Tests of fit are given for the generalized Pareto distribution (GPD) based on Cramér-von Mises statistics. Examples are given to illustrate the estimation techniques and the goodness-of-fit procedures. The tests are applied to the exceedances over given thresholds for 238 river flows in Canada; in general, the GPD provides an adequate fit. The tests are useful in deciding the threshold in such applications; this method is investigated, as is the closeness of the GPD to some other distributions that might be used for long-tailed data.

KEY WORDS: Exceedances; Extreme values; Hydrology; Threshold selection.

The generalized Pareto distribution (GPD) has distribution function

F(x) = 1 - (1 - kx/a)^{1/k},  (1)

where a is a positive scale parameter and k is a shape parameter. The density function is

f(x) = (1/a)(1 - kx/a)^{(1-k)/k};  (2)

the range of x is 0 ≤ x < ∞ for k ≤ 0 and 0 ≤ x ≤ a/k for k > 0. The mean and variance are μ = a/(1 + k) and σ^2 = a^2/{(1 + k)^2 (1 + 2k)}; thus the variance of the GPD is finite only for k > -0.5. For the special values k = 0 and k = 1, the GPD becomes the exponential and uniform distributions, respectively. The name generalized Pareto was given by Pickands (1975); the distribution is sometimes called simply Pareto when k < 0. In this case, the GPD has a long tail to the right and has been used to model datasets that exhibit this form in several areas of application.

In particular, the GPD is used to model extreme values. This application was discussed by many authors, for example Hosking and Wallis (1987), Smith (1984, 1989, 1990), Davison (1984), and Davison and Smith (1990). Smith (1990) gave an excellent review of the two most widely used methods in this field, based on generalized extreme value distributions and on the GPD.
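Definitions (1) and (2) and the moment formulas translate directly into code. A minimal plain-Python sketch (our own illustration, not part of the article):

```python
import math

def gpd_cdf(x, a, k):
    """GPD distribution function (1): F(x) = 1 - (1 - k*x/a)**(1/k)."""
    if k == 0.0:                      # k = 0 limit: the exponential distribution
        return 1.0 - math.exp(-x / a)
    return 1.0 - (1.0 - k * x / a) ** (1.0 / k)

def gpd_pdf(x, a, k):
    """GPD density (2): f(x) = (1/a)(1 - k*x/a)**((1-k)/k)."""
    if k == 0.0:
        return math.exp(-x / a) / a
    return (1.0 - k * x / a) ** ((1.0 - k) / k) / a

def gpd_mean_var(a, k):
    """Mean a/(1+k); variance a^2/((1+k)^2 (1+2k)), finite only for k > -0.5."""
    mean = a / (1.0 + k)
    var = a * a / ((1.0 + k) ** 2 * (1.0 + 2.0 * k)) if k > -0.5 else float("inf")
    return mean, var

# k = 1 gives the uniform distribution on [0, a]; k = 0 the exponential
u = gpd_cdf(0.3, 1.0, 1.0)
m, v = gpd_mean_var(2.0, 0.0)
```

For k = 1 the CDF reduces to F(x) = x/a, and for k = 0 the mean and variance reduce to a and a^2, matching the special cases noted above.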
In hydrology, the GPD is often called the "peaks over thresholds" (POT) model, since it is used to model exceedances over threshold levels in flood control. Davison and Smith (1990) discussed this application in their section 9, using river-flow exceedances for a particular river over a period of 35 years. The authors fit the GPD for exceedances over a series of thresholds and also calculated the Kolmogorov-Smirnov and Anderson-Darling statistics to test fit. In the article, they mentioned the lack of tests for the GPD [echoing a remark made earlier by Smith (1984)]. In the absence of such tests, they used tables for testing exponentiality, which, as the authors pointed out, give too high critical values. This dataset is tested for GPD in Section 3.

In this article, goodness-of-fit tests are given for the GPD, based on the Cramér-von Mises statistic W^2 and the Anderson-Darling statistic A^2. We concentrate on the most practical case, in which the parameters are not known. Estimation of parameters is discussed in Section 1, and the goodness-of-fit tests are given in Section 2. In Section 3, they are applied to exceedances for a Canadian river, and it is shown how the tests can be used to help select the threshold in the POT model, following ideas suggested by Davison and Smith (1990). Moreover, the Davison-Smith example is revisited, using the GPD test and the Anderson-Darling statistic. The tests are then used to model the exceedances over thresholds of 238 Canadian river-flow series; they indicate the adequacy of the GPD fit. The technique of choosing the threshold is based on a stability property of the GPD; the efficacy of the technique when the true exceedance distribution is not GPD is examined in Section 4. In Section 5, we investigate the versatility of the GPD as a distribution that might be used for many types of data with a long tail and no mode in the density. Finally, the asymptotic theory of the tests is given in Section 6.
1. ESTIMATION OF PARAMETERS

Before the test of fit can be made, unknown parameters in (1) must first be estimated. The asymptotic theory of the test statistics requires an efficient method of estimation; we shall use maximum likelihood. Although it is theoretically possible to have datasets for which no solution exists to the likelihood equations, in practice this appears to be extremely rare. For example, in our examination of the 238 Canadian rivers, maximum likelihood estimates of the parameters could be found in every case. Thus we shall assume that such estimates exist for the dataset under test. Hosking and Wallis (1987), Castillo and Hadi (1997), and Dupuis and Tsao (1998) studied other methods of estimating the parameters. One of these methods is the use of probability-weighted moments; although there appear to be some advantages in terms of bias, Chen and Balakrishnan (1995) and Dupuis (1996) later showed that this technique is not always feasible.

We now discuss estimation by maximum likelihood. Suppose that x_1, ..., x_n is a given random sample from the GPD given in (1), and let x_(1) ≤ x_(2) ≤ ... ≤ x_(n) be the order statistics. We consider three distinct cases: Case 1, in which the shape parameter k is known and the scale parameter a is unknown; Case 2, in which the shape parameter k is unknown and the scale parameter a is known; and Case 3, in which both parameters a and k are unknown. Case 3 is the most likely situation to arise in practice. The log-likelihood is given by

L(a, k) = -n log a - (1 - 1/k) Σ_{i=1}^n log(1 - k x_i/a)   for k ≠ 0,
L(a, 0) = -n log a - Σ_{i=1}^n x_i/a                        for k = 0.  (3)

The range for a is a > 0 for k ≤ 0 and a > k x_(n) for k > 0.
When k < 1/2, Smith (1984) showed that, under certain regularity conditions, the maximum likelihood estimators are asymptotically normal and asymptotically efficient. When 0.5 ≤ k < 1, Smith (1984) identified the problem as nonregular, which alters the rate of convergence of the maximum likelihood estimators. For k ≥ 1, and as n → ∞, the probability approaches 1 that the likelihood has no local maximum. We now consider Cases 1 to 3 separately.

Case 1 (k known, a unknown). For this case, we have the following result.

Proposition 1. For any known k with k < 1, the maximum likelihood estimate of a exists and is unique.

Proof. For k = 0 (the exponential distribution), the result is well known. Suppose that k ≠ 0; then â, the maximum likelihood estimate of a, will be a solution of ∂L(a, k)/∂a = 0, which may be simplified to L_1(a) = 0, where L_1(a) = n - (1 - k) Σ_{i=1}^n x_i/(a - k x_i). The value of ∂^2 L(a, k)/∂a^2 at a = â is -(1 - k) Σ_{i=1}^n x_i (â - k x_i)^{-2}/â < 0, which implies that at a = â the likelihood function attains its maximum value. Moreover, L_1(a) is an increasing function on the range of a, because ∂L_1(a)/∂a = (1 - k) Σ_{i=1}^n x_i/(a - k x_i)^2 > 0; the function can take negative and positive values, so it cuts the a axis at exactly one point. Hence â is unique.

Case 2 (a known, k unknown). In this situation, there may or may not exist a maximum likelihood estimate for k. To see this, consider the likelihood function L(a, k) given in (3) for -∞ < k ≤ a/x_(n). Since a is known, the likelihood will be regarded as a function L(k) of k only.
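Proposition 1 suggests a simple numerical recipe for Case 1: since L_1(a) is increasing and changes sign exactly once on the admissible range, bisection on L_1 locates â. A hedged sketch (the bracketing strategy and iteration count are our own choices, not from the article):

```python
def case1_mle_a(x, k):
    """MLE of the scale a for known shape k < 1 (Case 1).
    Solves L1(a) = n - (1-k) * sum(x_i / (a - k*x_i)) = 0 by bisection;
    L1 is increasing in a, so the root is unique (Proposition 1)."""
    n = len(x)
    if k == 0.0:
        return sum(x) / n            # exponential case: a-hat is the sample mean
    def L1(a):
        return n - (1.0 - k) * sum(xi / (a - k * xi) for xi in x)
    lo = max(0.0, k * max(x)) + 1e-12   # left end of the admissible range for a
    hi = lo + 1.0
    while L1(hi) < 0.0:                  # expand rightward until L1 changes sign
        hi *= 2.0
    for _ in range(200):                 # bisection to machine precision
        mid = 0.5 * (lo + hi)
        if L1(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# usage sketch: known k = 0.5 with a small illustrative sample
a_hat = case1_mle_a([1.0, 2.0, 3.0], 0.5)
```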
Set k* = a/x_(n); then

lim_{k → -∞} L(k) = -∞,  (4)

lim_{k → k*} L(k) = -∞  if k* < 1,  (5)

and

lim_{k → k*} L(k) = +∞  if k* > 1.  (6)

Note that the value of k is not restricted to be less than or equal to 1. From (4) and (5), it follows that, if k* < 1, there is at least one maximum. For a fixed sample of size n, Pr(k* < 1) = 1 - [1 - (1 - k)^{1/k}]^n > 1 - (3/4)^n. Similarly, if k* > 1, it follows from (4) and (6) that there may or may not exist a local maximum.

Case 3 (both parameters unknown). Maximum likelihood estimation of the parameters a and k when both are unknown was discussed in detail by Grimshaw (1993). This case is similar to Case 2: in principle, maximum likelihood estimates for a and k may not exist. However, as stated previously, this is unlikely in practical applications, especially when the sample size is reasonably large. To find a solution, Davison (1984) pointed out that, by a change of parameters to θ = k/a and k = k, the problem is reduced to a unidimensional search: we search for the value θ̂ that gives a local maximum of the profile log-likelihood (the log-likelihood maximized over k). This is

L*(θ) = -n - Σ_{i=1}^n log(1 - θ x_i) - n log{-(nθ)^{-1} Σ_{i=1}^n log(1 - θ x_i)}  (7)

for θ < 1/x_(n). Suppose that a local maximum θ̂ of (7) can be found; then

k̂ = -n^{-1} Σ_{i=1}^n log(1 - θ̂ x_i)  (8)

and

â = k̂/θ̂.  (9)

2. GOODNESS-OF-FIT TESTS

In this section, the Cramér-von Mises statistic W^2 and the Anderson-Darling statistic A^2 are described. The Anderson-Darling statistic is a modification of the Cramér-von Mises statistic giving more weight to observations in the tail of the distribution, which is useful in detecting outliers.
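Equations (7)-(9) reduce Case 3 to a one-dimensional search over θ. A crude grid-search sketch (the grid bounds and resolution are our pragmatic choices, not from the article; a proper implementation would use a safeguarded optimizer):

```python
import math

def gpd_profile_loglik(theta, x):
    """Profile log-likelihood L*(theta) of Eq. (7), valid for theta < 1/x_(n)."""
    n = len(x)
    if theta == 0.0:                  # k = 0 limit: the exponential likelihood
        return -n - n * math.log(sum(x) / n)
    s = sum(math.log(1.0 - theta * xi) for xi in x)
    return -n - s - n * math.log(-s / (n * theta))

def fit_gpd_ml(x, grid=2000):
    """Locate theta-hat by grid search, then apply Eqs. (8) and (9)."""
    n, xmax = len(x), max(x)
    lo, hi = -5.0 / xmax, 0.9999 / xmax   # search interval below 1/x_(n)
    best_t = 0.0
    best_l = gpd_profile_loglik(0.0, x)
    for i in range(1, grid):
        t = lo + (hi - lo) * i / grid
        l = gpd_profile_loglik(t, x)
        if l > best_l:
            best_t, best_l = t, l
    if best_t == 0.0:
        return sum(x) / n, 0.0            # exponential: a-hat = sample mean
    k_hat = -sum(math.log(1.0 - best_t * xi) for xi in x) / n   # Eq. (8)
    return k_hat / best_t, k_hat                                # Eq. (9)

# usage sketch: a deterministic "quantile sample" from GPD(a = 1, k = 0.2)
sample = [(1.0 - (1.0 - (i - 0.5) / 200.0) ** 0.2) / 0.2 for i in range(1, 201)]
a_hat, k_hat = fit_gpd_ml(sample)
```

Keeping the upper grid bound strictly below 1/x_(n) avoids the likelihood spike of (6), which can otherwise dominate a naive search.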
The null hypothesis is H_0: the random sample x_1, ..., x_n comes from Distribution (1). When parameters a and k are known in (1), the GPD is completely specified; we call this situation Case 0. Then the transformation z_i = F(x_i) produces a z sample that will be uniformly distributed between 0 and 1 under H_0. Many tests of the uniform distribution exist, including Cramér-von Mises tests (see Stephens 1986), so we shall not consider this case further. In the more common Cases 1, 2, and 3, when one or both parameters must be estimated, the goodness-of-fit test procedure is as follows:

1. Find the estimates of unknown parameters as described previously, and make the transformation z_(i) = F(x_(i)), for i = 1, ..., n, using the estimates where necessary.

2. Calculate statistics W^2 and A^2 as follows:

W^2 = Σ_{i=1}^n {z_(i) - (2i - 1)/(2n)}^2 + 1/(12n)

and

A^2 = -n - (1/n) Σ_{i=1}^n (2i - 1)[log{z_(i)} + log{1 - z_(n+1-i)}].

Tables 1 and 2 give upper-tail asymptotic percentage points for the statistics W^2 and A^2, for values of k between -0.9 and 0.5, and for Cases 1, 2, and 3. In Cases 2 and 3, where k must be estimated, the appropriate table should be entered at k̂, and in all tables, if k or k̂ is greater than 0.5, the table should be entered at k = 0.5. Critical points for other values of k can be obtained by interpolation. Linear interpolation in Table 2 for A^2, Case 3, gives a maximum error in α from 0.0011 to 0.0003 as α moves through the important range R: 0.10 ≥ α ≥ 0.005, and the error is less than 0.003 for the two values S: α = 0.5 and 0.25; for W^2 the corresponding figures are 0.0025 to 0.0004 over R and less than 0.007 for S.
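Step 2 is straightforward to code. A sketch of the two statistics (our own illustration), assuming the z_(i) have already been obtained from the fitted F:

```python
import math

def cvm_statistics(z):
    """Cramer-von Mises W^2 and Anderson-Darling A^2 from z_(i) = F(x_(i))."""
    z = sorted(z)
    n = len(z)
    # W^2 = sum {z_(i) - (2i-1)/(2n)}^2 + 1/(12n)
    w2 = sum((z[i] - (2 * i + 1) / (2.0 * n)) ** 2 for i in range(n)) + 1.0 / (12.0 * n)
    # A^2 = -n - (1/n) sum (2i-1)[log z_(i) + log(1 - z_(n+1-i))]
    a2 = -n - sum((2 * i + 1) * (math.log(z[i]) + math.log(1.0 - z[n - 1 - i]))
                  for i in range(n)) / n
    return w2, a2

# usage: a "perfect" uniform sample, for which W^2 equals 1/(12n) exactly
w2, a2 = cvm_statistics([(2 * i + 1) / 20.0 for i in range(10)])
```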
For the much less important Cases 1 and 2, linear interpolation in Table 1 gives maximum error for A^2, Case 1, from 0.009 to 0.0017 over R, and 0.014 for S; for W^2, Case 1, the figures are 0.0055 to 0.0015 over R and 0.008 for S. For Case 2, the maximum errors are smaller by a factor of approximately 10.

The points in these tables were calculated from the asymptotic theory to be presented in Section 6. Points for finite n were generated from Monte Carlo samples, using 10,000 samples for each combination of n and k. The results show that these points converge quickly to the asymptotic points for all three cases, a feature of statistics W^2 and A^2 found also in many other applications. The asymptotic points can then be used with good accuracy for, say, n ≥ 25. More extensive versions of Tables 1 and 2, and also tables showing the convergence to the asymptotic points, were given by Choulakian and Stephens (2000).

3. EXAMPLES

In this section we consider two specific examples and then discuss the overall results when the GPD is applied to 238 Canadian rivers.

In the first example, the data are n = 72 exceedances of flood peaks (in m^3/s) of the Wheaton River near Carcross in Yukon Territory, Canada. The initial threshold value τ is 27.50 (how this was found will be explained later), and the 72 exceedances, for the years 1958 to 1984, rounded to one decimal place, were 1.7, 2.2, 14.4, 1.1, 0.4, 20.6, 5.3, 0.7, 1.9, 13, 12, 9.3, 1.4, 18.7, 8.5, 25.5, 11.6, 14.1, 22.1, 1.1, 2.5, 14.4, 1.7, 37.6, 0.6, 2.2, 39, 0.3, 15, 11, 7.3, 22.9, 1.7, 0.1, 1.1, 0.6, 9, 1.7, 7, 20.1, 0.4, 2.8, 14.1, 9.9, 10.4, 10.7, 30, 3.6, 5.6, 30.8, 13.3, 4.2, 25.5, 3.4, 11.9, 21.5, 27.6, 36.4, 2.7, 64, 1.5, 2.5, 27.4, 1, 27.1, 20.2, 16.8, 5.3, 9.7, 27.5, 2.5, 27. The maximum likelihood estimates of the parameters are k̂ = -0.006 and â = 12.14.
We now apply the tests presented in Section 2; the values of the test statistics are W^2 = 0.2389 (p value < 0.025) and A^2 = 1.452 (p value < 0.010), so the GPD does not fit the dataset well at the threshold value of 27.50. Next we follow the technique of Davison and Smith (1990), who suggested raising the threshold until the GPD fits the new values of the remaining exceedances. Here the threshold was raised successively by the value of the smallest order statistic, which was then deleted, until the p values of the A^2 and W^2 statistics exceeded 10%. This happens when six order statistics are deleted; the threshold value is then 27.50 + 0.60 = 28.10. The GPD now fits the data quite well; details are given in Table 3. This example shows how the tests may help in choosing a threshold value in the POT model.

In this connection, we next revisit the example given by Davison and Smith (1990). Their tables 5 and 6 give the parameter estimates (Case 3) and the values of the Anderson-Darling statistic for a range of thresholds from 140 down to 70 at intervals of 10 units. The values of the Anderson-Darling statistic were compared with 1.74, the 5% point for a test for exponentiality, since no test for GPD was available, and the statistics were not nearly significant by this criterion. Now Table 2 can be used to give the asymptotic 5% critical values z_5 for A^2 corresponding to the estimate of the parameter k. The first two and last two threshold results are given in Table 4. Davison and Smith pointed out the sudden increase in the value of A^2 at threshold level 70; against the exponential point this value is still not significant, but using Table 2, it falls just at the critical 5% level.

The Wheaton River in the previous example is one of 238 Canadian rivers for which we have similar data. We now examine how well the GPD fits the exceedances for the other rivers. First one must decide the threshold level for a given river flow.
The first threshold estimate, τ, was chosen so that the number of exceedances per year could be modeled by a Poisson distribution; see, for instance, Todorovic (1979). This was done by taking τ such that, if N_τ is the number of exceedances, the mean of N_τ divided by its variance was approximately 1. The Poisson assumption will be tested more rigorously later. After the threshold level was chosen, the maximum likelihood estimates of the parameters k and a were calculated and the W^2 and A^2 tests applied. Table 5 gives the frequencies of p values for the 238 rivers. It is clear that the GPD fits quite well; using W^2, only 9 rivers gave p values less than 0.01, and using A^2, there were only 15. At the 0.05 level, these figures are 34 using W^2 and 49 using A^2. The results demonstrate also that A^2 is more sensitive (and therefore more powerful) than W^2 against possible outliers in the tail, as was suggested in Section 2. More details were given by Choulakian and Stephens (2000). For the 49 "rejections" using A^2, the threshold was increased, as described in the first example, by deleting the smallest order statistics until the p value became larger than 0.10. Then only 10 of the 49 sets were still rejected as GPD by A^2. Finally, for these 10 rejected river flows, the Poisson assumption was tested by the Cramér-von Mises tests given by Spinelli and Stephens (1997). Only one dataset rejected the Poisson assumption.

Another result of interest is that 229 of the 238 values of k̂ were between -0.5 and 0.4; this confirms the findings of Hosking and Wallis (1987), who restricted attention to -0.5 < k < 0.5, where the great majority of values fall.

4. POWER OF THE FAILURE-TO-REJECT METHOD

The method of choosing the threshold by using as many exceedances as possible, subject to passing a test for GPD, can be investigated for power as follows.
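The dispersion-index rule for the initial threshold (mean of N_τ divided by its variance near 1) can be sketched as follows; the per-year data layout, the candidate list, and the tolerance are our own assumptions, not from the article:

```python
def dispersion_index(counts):
    """Ratio mean/variance of the yearly exceedance counts N_tau."""
    n = len(counts)
    m = sum(counts) / n
    v = sum((c - m) ** 2 for c in counts) / (n - 1)
    return m / v if v > 0 else float("inf")

def choose_threshold(yearly_flows, candidates, tol=0.25):
    """Return the lowest candidate tau whose yearly counts look Poisson
    (mean/variance within tol of 1); `yearly_flows` is a hypothetical
    list of per-year lists of peak flows."""
    for tau in sorted(candidates):
        counts = [sum(1 for f in year if f > tau) for year in yearly_flows]
        if abs(dispersion_index(counts) - 1.0) <= tol:
            return tau
    return None

# usage on a tiny hand-made example: tau = 4 gives counts [1, 2, 4, 1]
years = [[1, 5], [5, 6], [5, 6, 7, 8, 2], [5, 3]]
tau = choose_threshold(years, [0, 4], tol=0.05)
```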
The data analyst wishes to fit a GPD to the exceedances far enough in the tail. He or she therefore starts with as many as possible, say 100, and tests; if the test fails, the smallest values are omitted one by one, with the test repeated, until the test yields acceptance for the GPD. The efficiency of this procedure can now be investigated when the true distribution is not GPD. Suppose, for instance, that the true exceedance distribution is Weibull with parameter 0.75, and suppose, starting with sample size 100, the sample fails the GPD test until K lower-order statistics have been removed; then it passes. Statistic K will vary from sample to sample, but its mean will give a measure of the power of the test. The standard deviation and other statistics from the distribution of K are also of interest. This has been investigated for various alternative distributions for the true exceedances; Table 6 provides these results. They are based on 1,000 Monte Carlo samples of initial size n = 100, 50, or 30. Only statistic A^2 is reported, because A^2 outperforms W^2 at detecting tail discrepancies. Column 3 gives the initial rejection rate for a GPD test with parameters estimated, at test size 0.05; this is also the power in a conventional study of power against the specified alternative. Subsequent columns give the mean, standard deviation, the three quartiles, and the maximum values of K. Thus, for the Weibull(0.75) alternative, 509 of the 1,000 samples of size 100 were rejected as GPD initially; of these rejected samples, the mean number of order statistics to be deleted before acceptance was 8.169, with standard deviation 9.178. The distribution of K is long-tailed; the quartiles are 2, 5, and 11, respectively, but one sample needed 52 order statistics removed to achieve acceptance by the GPD test.

In Table 6, the distributions were chosen to give a reasonable starting rejection rate.
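The deletion loop that defines K can be written with the goodness-of-fit decision left abstract, so that any test using the tabled critical values could be plugged in. In the demo below the decision rule is a placeholder (not the A^2 test), used only to exercise the loop:

```python
def count_deletions(sample, passes_test):
    """K = number of smallest order statistics removed until `passes_test`
    accepts the remaining exceedances; `passes_test` is any supplied GPD
    goodness-of-fit decision rule taking a list of exceedances."""
    xs = sorted(sample)
    k = 0
    while xs and not passes_test(xs):
        smallest = xs.pop(0)                  # raise the threshold by the smallest value
        xs = [x - smallest for x in xs]       # exceedances over the new threshold
        k += 1
    return k
```

With a real test plugged in, averaging `count_deletions` over Monte Carlo samples from a non-GPD alternative reproduces the kind of summary reported in Table 6.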
Several other distributions could be supposed as alternatives (e.g., the half-normal, half-Cauchy, or gamma distributions), but these can be made very close to the GPD by suitable choice of parameters, so the initial power is small, and very few low-order statistics, if any, need be deleted to achieve GPD acceptance (cf. the numerical results in Sec. 5). In their section 9, Davison and Smith suggested that the exceedances for their example might possibly be truly modeled by a mixture of two populations, and in Table 6 we have included three mixtures of GPD as possible true models. The test statistic distinguishes quite effectively between these mixtures and a single GPD.

For all alternative distributions considered in Table 6, it may be seen that, as n increases, the mean, standard deviation, and maximum of K all increase; the results also show the increase in the power of the test as the sample size increases.

5. VERSATILITY OF THE GPD

As was suggested previously, some distributions often fitted to long-tailed data may be brought close to a GPD by suitable choice of k and a. This closeness was investigated, and results are given in Table 7. In this table, for example, the standard half-normal and standard half-Cauchy distributions (respectively, the distribution of |X| when X has the standard normal or the standard Cauchy distribution) are compared with the GPD(0.293, 1.022) for the first and GPD(-0.710, 1.152) for the second; the standard lognormal is compared with GPD(-0.140, 1.406). The choice of the GPD parameters can be made by equating the first two moments of the GPD and those of the distribution compared, but this does not produce as good a fit as the following maximum likelihood procedure. A sample of 500 values was taken from the half-normal distribution, and a GPD fit was made, estimating the parameters by maximum likelihood, first solving (7) for θ̂ and then obtaining k̂ and â from (8) and (9).
This was repeated 100 times, and the average values of k̂ and â were used as the parameters in the GPD. Although the matches are clearly not perfect, the error in α, the probability level of a given percentile, is small when one distribution is used instead of another. Similar comparisons are given for several Weibull distributions and a gamma distribution. The exponential (Weibull with parameter 1) is omitted because it is a special case of the GPD with k = 0. Again the two distributions are quite close for Weibull parameter greater than 1 (where the Weibull has a mode), but the match is less good for Weibull with parameter less than 1, for example the Weibull(0.5) or the Weibull(0.75), where the density rises to infinity at x = 0.

Overall, the results suggest that a GPD could often be used as a model for data with a long tail when neither a mode nor an infinite density is suggested by the nature of the variables or by the data themselves.

6. ASYMPTOTIC THEORY OF THE TESTS

In this section we summarize the asymptotic theory of Cramér-von Mises tests. The calculation of asymptotic distributions of the statistics follows a procedure described, for instance, by Stephens (1976). It is based on the fact that y_n(z) = √n {F_n(z) - z}, 0 ≤ z ≤ 1, where F_n(z) is the empirical distribution function of the z set, tends to a Gaussian process y(z) as n → ∞, and the statistics are functionals of this process. The mean of y(z) is 0; we need the covariance function ρ(s, t) = E{y(s)y(t)}, 0 ≤ s, t ≤ 1. When all the parameters are known, this covariance is ρ_0(s, t) = min(s, t) - st. When parameters are estimated, the covariance will depend in general on the true values of the estimated parameters.
However, if the method of estimation is efficient, the covariance will not depend on the scale parameter a, but it will depend on the shape parameter k. We illustrate for Case 3 only. As the sample size n → ∞, the maximum likelihood estimators (â, k̂) have a bivariate normal distribution with mean (a, k) and variance-covariance matrix Σ/n, where

Σ = (1 - k) ( 2a^2    a
              a       1 - k ).  (10)

When both parameters a and k are estimated, the covariance function of y(z) becomes

ρ_3(s, t) = ρ_0(s, t) - {g(s)}' Σ g(t),  (11)

where s = F(x) and g(s) = (g_1(s), g_2(s))' is a vector with coordinates

g_1(s) = ∂F/∂a = (1 - s){1 - (1 - s)^{-k}}/(ak)

and

g_2(s) = ∂F/∂k = (1 - s){k log(1 - s) - 1 + (1 - s)^{-k}}/k^2.

When Σ and g(s) are inserted into (11), ρ_3(s, t) is independent of a. When k ≥ 0.5, the maximum likelihood estimates of a and k are superefficient in the sense of Darling (1955), and then the covariance and the resulting asymptotic distributions are the same as for k = 0.5. Thus, if 0.5 ≤ k ≤ 1, the table should be entered at k = 0.5, as described in Section 2. In Case 1 the covariance of y(z) becomes

ρ_1(s, t) = ρ_0(s, t) - a^2 (1 - 2k) g_1(s) g_1(t)  (12)

and for Case 2 it becomes

ρ_2(s, t) = ρ_0(s, t) - (1 - k)(1 - 2k) g_2(s) g_2(t)/2.  (13)

In both these cases, at k = 0.5, the asymptotic covariance becomes ρ_0(s, t).
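The covariance functions are easy to evaluate numerically. In the sketch below, Σ is taken to be the standard asymptotic covariance of √n(â - a, k̂ - k) for the GPD, Σ = (1 - k)[[2a^2, a], [a, 1 - k]]; this matrix is an assumption we supply (it is consistent with the Case 1 and Case 2 variances above), not a quotation from the article:

```python
import math

def rho0(s, t):
    """Covariance min(s,t) - st of the limiting process when parameters are known."""
    return min(s, t) - s * t

def g1(s, k, a=1.0):
    """dF/da expressed in s = F(x); proportional to 1/a."""
    return (1.0 - s) * (1.0 - (1.0 - s) ** (-k)) / (a * k)

def g2(s, k):
    """dF/dk expressed in s = F(x); free of a."""
    return (1.0 - s) * (k * math.log(1.0 - s) - 1.0 + (1.0 - s) ** (-k)) / k ** 2

def rho3(s, t, k, a=1.0):
    """Case 3 covariance rho0(s,t) - g(s)' Sigma g(t), with the assumed Sigma."""
    S = [[2.0 * (1 - k) * a * a, (1 - k) * a],
         [(1 - k) * a, (1 - k) ** 2]]
    gs = (g1(s, k, a), g2(s, k))
    gt = (g1(t, k, a), g2(t, k))
    quad = sum(gs[i] * S[i][j] * gt[j] for i in range(2) for j in range(2))
    return rho0(s, t) - quad

# the a's cancel: rho3 depends on k only, as the text asserts
check = abs(rho3(0.3, 0.6, -0.2, a=1.0) - rho3(0.3, 0.6, -0.2, a=5.0))
```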
This is the same as for a test for Case 0, when both a and k are known, and the asymptotic points are the same as for such a test (see Stephens 1986).

The Cramér-von Mises statistic W^2 is based directly on the process y(z), while A^2 is based on the process w(z) = y(z)/{z(1 - z)}^{1/2}; asymptotically, the distributions of the statistics are those of W^2 = ∫_0^1 y^2(z) dz and A^2 = ∫_0^1 w^2(z) dz. The asymptotic distribution of each statistic is a sum of weighted independent χ^2_1 variables; the weights for W^2 must be found from the eigenvalues of an integral equation with the appropriate ρ_j(s, t) for Case j as kernel. For A^2, the covariance ρ_A(s, t) of the w(z) process is ρ_j(s, t)/{st(1 - s)(1 - t)}^{1/2}, and this is the kernel of the integral equation. Once the weights are known, the percentage points of the distributions can be calculated by Imhof's method. For details of these procedures, see, for example, Stephens (1976).

ACKNOWLEDGMENTS

We thank the editor and two referees for their many helpful suggestions. This work was supported by the Natural Sciences and Engineering Research Council of Canada.

REFERENCES

Castillo, E., and Hadi, A. S. (1997), "Fitting the Generalized Pareto Distribution to Data," Journal of the American Statistical Association, 92, 1609-1620.

Chen, G., and Balakrishnan, N. (1995), "The Infeasibility of Probability Weighted Moments Estimation of Some Generalized Distributions," in Recent Advances in Life-Testing and Reliability, ed. N. Balakrishnan, London: CRC Press, pp. 565-573.

Choulakian, V., and Stephens, M. A. (2000), "Goodness-of-Fit Tests for the Generalized Pareto Distribution," research report, Simon Fraser University, Dept. of Mathematics and Statistics.

Darling, D. (1955), "The Cramer-von Mises Test in the Parametric Case," The Annals of Mathematical Statistics, 26, 1-20.

Davison, A. C.
(1984), "Modeling Excesses Over High Thresholds, With an Application," in Statistical Extremes and Applications, ed. J. Tiago de Oliveira, Dordrecht: D. Reidel, pp. 461-482.

Davison, A. C., and Smith, R. L. (1990), "Models for Exceedances Over High Thresholds" (with comments), Journal of the Royal Statistical Society, Ser. B, 52, 393-442.

Dupuis, D. J. (1996), "Estimating the Probability of Obtaining Nonfeasible Parameter Estimates of the Generalized Pareto Distribution," Journal of Statistical Computation and Simulation, 54, 197-209.

Dupuis, D. J., and Tsao, M. (1998), "A Hybrid Estimator for Generalized Pareto and Extreme-Value Distributions," Communications in Statistics--Theory and Methods, 27, 925-941.

Grimshaw, S. D. (1993), "Computing Maximum Likelihood Estimates for the Generalized Pareto Distribution," Technometrics, 35, 185-191.

Hosking, J. R. M., and Wallis, J. R. (1987), "Parameter and Quantile Estimation for the Generalized Pareto Distribution," Technometrics, 29, 339-349.

Pickands, J. (1975), "Statistical Inference Using Extreme Order Statistics," The Annals of Statistics, 3, 119-131.

Smith, R. L. (1984), "Threshold Methods for Sample Extremes," in Statistical Extremes and Applications, ed. J. Tiago de Oliveira, Dordrecht: Reidel, pp. 621-638.

_____ (1989), "Extreme Value Analysis of Environmental Time Series: An Application to Trend Detection in Ground-Level Ozone," Statistical Science, 4, 367-393.

_____ (1990), "Extreme Value Theory," in Handbook of Applicable Mathematics (Vol. 7), ed. W. Ledermann, Chichester, U.K.: Wiley.

Spinelli, J. J., and Stephens, M. A. (1997), "Test of Fit for the Poisson Distribution," The Canadian Journal of Statistics, 25, 257-268.

Stephens, M. A. (1976), "Asymptotic Results for Goodness-of-Fit Statistics With Unknown Parameters," The Annals of Statistics, 4, 357-369.

_____ (1986), "Tests Based on EDF Statistics," in Goodness-of-Fit Techniques, eds. R. B. D'Agostino and M. A. Stephens, New York: Marcel Dekker, pp. 97-122.

Todorovic, P.
(1979), "A Probabilistic Approach to Analysis and Prediction of Floods," in Proceedings of the International Statistical Institute, Buenos Aires (Vol. 1), pp. 113-124.

Table 1. Upper-Tail Asymptotic Percentage Points for W^2 (normal type) and for A^2 (bold), Cases 1 and 2; p is Pr(W^2 ≥ z) or Pr(A^2 ≥ z), where z is the table entry. Case 1: k known, a unknown ... Case 2: a known, k unknown.

Psychological Statistics Lecture 6: The Chi-Square Statistic

Parametric and Nonparametric Tests
• There are two nonparametric hypothesis tests using the chi-square statistic: the chi-square test for goodness of fit and the chi-square test for independence.

The Chi-Square Test for Goodness-of-Fit
• The null hypothesis specifies the proportion of the population that should be in each category.
• The proportions from the null hypothesis are used to compute expected frequencies that describe how the sample would appear if it were in perfect agreement with the null hypothesis.

The Chi-Square Test for Independence
• The data are usually presented in a matrix with the categories for one variable defining the rows and the categories of the second variable defining the columns.
• The data, again called observed frequencies, show how many individuals are in each cell of the matrix.
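The expected-frequency computation behind the goodness-of-fit test can be sketched in a few lines (a generic illustration of the standard Pearson statistic, not code taken from the slides):

```python
def chi_square_gof(observed, null_props):
    """Pearson chi-square for goodness of fit: expected E_i = n * p_i,
    statistic = sum (O_i - E_i)^2 / E_i."""
    n = sum(observed)
    expected = [n * p for p in null_props]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, expected

# usage: perfect agreement gives 0; a shifted sample gives a positive statistic
stat0, exp0 = chi_square_gof([50, 25, 25], [0.5, 0.25, 0.25])
stat1, _ = chi_square_gof([60, 20, 20], [0.5, 0.25, 0.25])
```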

Goodness of fit test for ergodic diffusion processes

V_T(x) = T^{-1/2} ∫_0^T 1_{(-∞, x]}(X_t) (dX_t - S_0(X_t) dt).

As usual, 1_A denotes the indicator function of a set A. Following Koul and Stute (1999), we call V_T the score marked empirical process. We prove that the test based on the statistic sup_x |V_T(x)| is asymptotically distribution free under each simple null hypothesis S = S_0 and that it is consistent under any simple fixed alternative S = S_1 ≠ S_0. Despite their importance in applications, few works have, to our knowledge, been devoted to goodness-of-fit tests for diffusions, so the construction of goodness-of-fit tests for this kind of model is important and needs detailed study. Kutoyants (2004) discusses some possibilities for the construction of such tests. In particular, he considers the Kolmogorov-Smirnov statistic Δ_T(X^T) = sup_x √T |F̂_T(x) - F_{S_0}(x)|. The goodness-of-fit test based on this statistic is asymptotically consistent, and its asymptotic distribution under the null hypothesis follows from the weak convergence of the empirical process to a suitable Gaussian process (Negri, 1998; van der Vaart and van Zanten, 2005). However, due to the structure of the covariance of the limit process, the Kolmogorov-Smirnov statistic is not
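A small simulation sketch of sup_x |V_T(x)| (entirely our illustration: an Euler-Maruyama path with unit diffusion coefficient assumed, an Ornstein-Uhlenbeck null drift S_0(x) = -x, and a coarse grid of x levels):

```python
import math, random

def score_marked_sup(drift, T=20.0, dt=0.02, n_grid=21, seed=7):
    """Approximate sup_x |V_T(x)| for
    V_T(x) = T^{-1/2} int_0^T 1{X_t <= x} (dX_t - S_0(X_t) dt),
    simulating dX_t = S_0(X_t) dt + dW_t by Euler-Maruyama."""
    rng = random.Random(seed)
    n = int(T / dt)
    x, path, incs = 0.0, [], []
    for _ in range(n):
        path.append(x)
        dx = drift(x) * dt + rng.gauss(0.0, math.sqrt(dt))
        incs.append(dx)
        x += dx
    lo, hi = min(path), max(path)
    sup = 0.0
    for j in range(n_grid):
        level = lo + (hi - lo) * j / (n_grid - 1)
        # under H0 the integrand reduces to the Brownian increments
        v = sum(dxi - drift(xi) * dt for xi, dxi in zip(path, incs) if xi <= level)
        sup = max(sup, abs(v) / math.sqrt(T))
    return sup

# usage: null drift S_0(x) = -x (Ornstein-Uhlenbeck)
sup_v = score_marked_sup(lambda x: -x)
```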

生物统计学 第六章 卡平方测验

生物统计学 第六章 卡平方测验

解:( 1 )列联表
第一块田 第二块田 总数
有锈病 372 ( 396*702/774=359.16 ) 330 ( 378*702/774=342.84 ) 702
无锈病 24 ( 396*72/774=36.84 ) 48 ( 378*72/774=35.16 ) 72
H 0 : 两块地发病率一致, H A : 两块地 发病率不一致
进行列联表分析,那些情况下需要进行连续 性矫正( A)。
A.2×2 表 B.2×3 表 C.2×c 表 D.r ×2 表
以红米非糯稻和白米糯稻杂交,子二代检测
179 株,数据如下:
属性 (x)
红米非糯 红米糯 白米非糯 白
株数
96
37
31
问子二代分离是否符合 9 : 3 : 3 : 1 的
规律? ( A)。
Consulting the chi-square table, the difference is highly significant, so H0 is rejected: this variety is no longer pure.

Since the table has three rows and three columns, the degrees of freedom are df = (3 − 1) × (3 − 1) = 4, and no continuity correction is needed. Consulting the table, the difference is not significant, so H0 is accepted: leaf senescence is unrelated to the irrigation method.
22. The plant-height variance of pure-line maize should not exceed 64. Heights were measured for 75 plants of a certain variety, giving
Totals: Field 1 = 396, Field 2 = 378, overall = 774.
χ²c = 0.4240 + 4.1334 + 0.4442 + 4.3309 = 9.3325

The critical value is χ²(0.005, df = 1) = 7.8794. Since 9.3325 > 7.8794, the difference is highly significant, so H0 is rejected: the two fields differ in disease incidence.

(2) Test of two percentages: H0: the two fields have the same disease incidence; HA: the incidences differ.
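The slides' corrected chi-square can be reproduced directly. This is a sketch, not the lecture's code; the slides used expected values rounded to two decimals, which gives 9.3325, while unrounded expected values give about 9.328.

```python
# 2x2 chi-square test with Yates' continuity correction for the rust-disease data.
obs = [[372, 330],   # rusted:     field 1, field 2
       [24, 48]]     # not rusted: field 1, field 2
row = [sum(r) for r in obs]          # 702, 72
col = [sum(c) for c in zip(*obs)]    # 396, 378
n = sum(row)                         # 774
chi2_c = 0.0
for i in range(2):
    for j in range(2):
        e = row[i] * col[j] / n      # expected frequency under independence
        chi2_c += (abs(obs[i][j] - e) - 0.5) ** 2 / e
print(chi2_c)  # about 9.33, exceeding the 1-df critical value 7.8794
```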

Goodness-of-Fit Tests with Censored Data

The statistic S_p is a quadratic form in the process ∫ {dN(w) − Y(w) λ_0(w; η̂) dw}.
• Test: Reject H0 if S_p > χ²_{p*; α}.
A Choice of Generalizing Pearson
• Partition [0, t] into 0 = a_1 < a_2 < … < a_p = t, and let
λ_p(t; θ, η) = λ_0(t; η) exp{ Σ_{k=1}^p θ_k Γ_k(t; η) } = λ_0(t; η) exp{ θ′Γ(t; η) }.
• Embedding class:
C_p = { λ_p(t; θ, η) = λ_0(t; η) exp{ Σ_{k=1}^p θ_k Γ_k(t; η) } }
Goodness-of-Fit Problem
• T1, T2, …, Tn are IID from an unknown distribution function F
• Case 1: F is a continuous df
• Case 2: F is a discrete df with known jump points
• Case 3: F is a mixed distribution with known jump points
• C1, C2, …, Cn are (nuisance) variables that right-censor the Ti's
Ê_j = ∫_{a_{j−1}}^{a_j} Y(w) λ_0(w; η̂) dw

• The Ê_j's are dynamic expected frequencies.
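The dynamic expected frequencies can be illustrated for the simplest case. This is our sketch, not Pena's code: an exponential (constant-hazard) null with complete, uncensored data, where Y(w) counts the subjects still at risk at time w.

```python
# Dynamic expected frequencies E_j = integral over (a_{j-1}, a_j] of
# Y(w) * lambda0 dw for a constant (exponential) null hazard.
def dynamic_expected(times, cuts):
    lam = len(times) / sum(times)   # exponential MLE hazard (no censoring here)
    e = []
    for a, b in zip(cuts[:-1], cuts[1:]):
        # integral of Y(w) over (a, b]: each subject contributes its time
        # at risk inside that interval
        at_risk_time = sum(max(0.0, min(t, b) - a) for t in times)
        e.append(lam * at_risk_time)
    return e

ej = dynamic_expected([1.0, 2.0, 3.0, 4.0], [0.0, 2.0, 4.0])
print(ej)   # approximately [2.8, 1.2]; the E_j sum to n = 4 here because the
            # partition covers [0, max time] and the hazard is the MLE
```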

goodness-of-fit


1 Introduction
Social network analysis (Wasserman and Faust, 1994) is concerned with links among entities. The network data considered here correspond to the directed ties among the members of a set of actors. It is common for the ties to be binary, but ties may take on arbitrary values. When modeling networks, the ties between the actors are treated as random variables. The tie variables are, however, not independent. Well-known examples of dependencies are reciprocity (second-order dependence) and transitivity (Holland and Leinhardt, 1970, 1976), which represents third-order dependence among ties and implies clustering ("group structure") in social networks. Since the tie variables are dependent, statistical inference has proved to be hard. (Curved) exponential random graph models (ERGMs) (Snijders, Pattison, Robins, and Handcock, 2006; Hunter and Handcock, 2006) have been used to model networks observed at one time point, but, in spite of recent advances in the specification and estimation of ERGMs, some theoretical and practical issues remain.

Here, the focus is on longitudinal network data, which frequently come in the form of panel data. There is broad agreement in the literature (see Frank, 1991) that the most promising models for network panel data are continuous-time Markov models, which assume that the network was observed at discrete time points and that between these time points latent, continuous-time Markov processes shape the network. Holland and Leinhardt (1977) and Wasserman (1979, 1980) proposed methods for statistical inference for such Markov models, but these methods are limited to models with second-order dependence and thus neglect fundamental third- and higher-order dependencies. Snijders (2001) considered a family of continuous-time Markov models which makes it possible to model third- and higher-order dependencies, and proposed the method of moments to estimate the parameter θ. The probabilistic framework may be described as actor-

Chi-Square Effect Size Estimator


Chapter 900: Chi-Square Effect Size Estimator

Introduction

This procedure calculates the effect size of the chi-square test for use in power and sample size calculations. Based on your input, the procedure provides effect size estimates for chi-square goodness-of-fit tests and for chi-square tests of independence.

The chi-square test is often used to test whether sets of frequencies or proportions follow certain patterns. The two most common cases are tests of goodness of fit and tests of independence in contingency tables. The chi-square goodness-of-fit test is used to test whether a set of data follows a particular distribution. For example, you might want to test whether a set of data comes from the normal distribution.

The chi-square test for independence in a contingency table is another common application of this test. Here individuals (people, animals, or things) are classified by two (nominal or ordinal) classification variables into a two-way contingency table. This table contains the counts of the number of individuals in each combination of the row categories and column categories. The chi-square test determines if there is dependence (association) between the two classification variables.

Effect Size

For each cell of a table containing m cells, there are two proportions considered: one specified by a null hypothesis and the other specified by the alternative hypothesis. Usually, the proportions specified by the alternative hypothesis are those occurring in the data. Define p_0i to be the proportion in cell i given by the null hypothesis and p_1i to be the proportion in cell i according to the alternative hypothesis. The effect size, W, is calculated using the formula

W = sqrt( Σ_{i=1}^m (p_0i − p_1i)² / p_0i ).

The formula for computing the chi-square value, χ², is

χ² = Σ_{i=1}^m (O_i − E_i)² / E_i = N Σ_{i=1}^m (p_0i − p_1i)² / p_0i,

where N is the total count in all the cells. Hence, the relationship between W and χ² is χ² = N W², or W = sqrt(χ² / N).

Contingency Table Tab

This window allows you to enter up to an eight-by-eight contingency table. You can enter percentages or counts. If you enter counts, the Chi-Square and Prob Level values are correct and may be used to test the independence of the row and column variables. If you enter percentages, you should ignore the Chi-Square and Prob Level values. Note that if you are entering percentages, it does not matter whether you enter table percentages or row (or column) percentages, as long as you are consistent.

Example

Suppose you are planning a survey with the primary purpose of testing whether marital status is related to gender. You decide to adopt four marital status categories: never married, married, divorced, widowed. In the population you are studying, previous studies have found the following percentages in each of these categories:

Never Married  27%
Married        39%
Divorced       23%
Widowed        11%

You decide that you want to calculate the effect size when the individual percentages for males and females are:

               Male   Female
Never Married  22%    32%
Married        46%    33%
Divorced       22%    24%
Widowed        10%    11%

To complete this example, you would load the Chi-Square Effect Size Estimator procedure from the PASS-Other menu and enter "22 46 22 10" across the top row and "32 33 24 11" across the next row. The value of W turns out to be 0.143626. Note that even though a chi-square value (4.13) and probability level (0.248) are displayed, you would ignore them, since you have entered percentages, not counts, into the table. If you had entered counts, these results could be used to test the hypothesis of independence.

Multinomial Test Tab

This window allows you to enter a multinomial table with up to fourteen cells. You can enter percentages or counts. If you enter counts, the Chi-Square and Prob Level values are correct and may be used to test the statistical significance of the table. If you enter percentages, you should ignore the Chi-Square and Prob Level values.

Note that if you are using the window to perform a goodness-of-fit test on a set of data, you will need to adjust the degrees of freedom for the number of parameters you are estimating. For example, if you are testing whether the data are distributed normally and you estimate the mean and standard deviation from the data, you will need to reduce the degrees of freedom by two.

Example

Suppose you are going to use the chi-square goodness-of-fit statistic calculated from a multinomial table to test whether a set of exponential data follows the normal distribution. That is, you want to find a reasonable effect size for comparing exponentially distributed data to the normal distribution. You decide to divide the data into five groups: 5 or less, 5-10, 10-15, 15-20, 20+. Using tables for the normal and exponential distributions, you find that the probabilities for each group are:

Category   Normal   Exponential
5 or less    11%      39%
5 to 10      20%      26%
10 to 15     38%      18%
15 to 20     20%      11%
Above 20     11%       6%

To complete this example, you would set the Chi-Square Effect Size Estimator procedure to the Multinomial Test tab and enter "11 20 38 20 11" down the first column and "39 26 18 11 6" down the second column. The calculated value of W is 0.948271. You would enter this value into the Effect Size option of the Chi-Square Test window in PASS to determine the necessary sample size.
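The multinomial example can be checked in a few lines. This is a sketch of the W formula itself, not of the PASS software.

```python
import math

# PASS multinomial example: effect size W comparing the normal-based cell
# probabilities (null) with the exponential-based ones (alternative).
p0 = [0.11, 0.20, 0.38, 0.20, 0.11]   # null proportions (normal)
p1 = [0.39, 0.26, 0.18, 0.11, 0.06]   # alternative proportions (exponential)
w = math.sqrt(sum((a - b) ** 2 / a for a, b in zip(p0, p1)))
print(w)   # approximately 0.948271, matching the value reported in the text
```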

Goodness-of-Fit Tests with Censored Data (lecture slides)


Research support from NIH and NSF.
Practical Problem
Product-Limit Estimator and Best-Fitting Exponential Survivor Function
Question: Is the underlying survivor function modeled by a
family of exponential distributions? a Weibull distribution?
A Car Tire Data Set
• Times to withdrawal (in hours) of 171 car tires, with
withdrawal either due to failure or right-censoring.
Goodness-of-Fit Tests with Censored Data
Edsel A. Pena
Statistics Department University of South Carolina
Columbia, SC [E-Mail: pena@]
Talk at Cornell University, 3/13/02
Ê_i = n F_0(I_i; η̂) is the estimated expected number of observations in the i-th interval under the null model.
Obstacles with Censored Data
• With right-censored data, determining the exact values of the Oj's is not possible.
• They need to be estimated, using the product-limit estimator (Hollander and Pena, '92; Li and Doss, '93), the Nelson-Aalen estimator (Akritas, '88; Hjort, '90), or self-consistency arguments.
• It is hard to examine the power or optimality properties of the resulting Pearson generalizations because of the ad hoc nature of their derivations.
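A minimal sketch of the product-limit (Kaplan-Meier) estimator mentioned above, on hypothetical data; it handles ties in event times but computes no variance estimate.

```python
# Product-limit (Kaplan-Meier) estimator for right-censored data.
# times: observed times; events: 1 = failure observed, 0 = right-censored.
def kaplan_meier(times, events):
    s, curve = 1.0, []
    for t in sorted(set(t for t, e in zip(times, events) if e == 1)):
        n_at_risk = sum(1 for u in times if u >= t)
        d = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        s *= 1.0 - d / n_at_risk        # survival drops at each event time
        curve.append((t, s))
    return curve

# hypothetical data: failures at 1, 2, 4; one censored observation at 3
curve = kaplan_meier([1, 2, 3, 4], [1, 1, 0, 1])
print(curve)   # survival steps down to 0.75, then 0.5, then 0.0
```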

Goodness of fit


Goodness-of-Fit Statistics

After using graphical methods to evaluate the goodness of fit, you should examine the goodness-of-fit statistics. Curve Fitting Toolbox™ software supports these goodness-of-fit statistics for parametric models:

• The sum of squares due to error (SSE)
• R-square
• Adjusted R-square
• Root mean squared error (RMSE)

For the current fit, these statistics are displayed in the Results pane in the Curve Fitting app. For all fits in the current curve-fitting session, you can compare the goodness-of-fit statistics in the Table of fits. To get goodness-of-fit statistics at the command line, either:

• In the Curve Fitting app, select Fit > Save to Workspace to export your fit and goodness of fit to the workspace.
• Specify the gof output argument with the fit function.

Sum of Squares Due to Error

This statistic measures the total deviation of the response values from the fit to the response values. It is also called the summed square of residuals and is usually labeled as SSE.

SSE = Σ_{i=1}^n w_i (y_i − ŷ_i)²

A value closer to 0 indicates that the model has a smaller random error component, and that the fit will be more useful for prediction.

R-Square

This statistic measures how successful the fit is in explaining the variation of the data. Put another way, R-square is the square of the correlation between the response values and the predicted response values. It is also called the square of the multiple correlation coefficient and the coefficient of multiple determination. R-square is defined as the ratio of the sum of squares of the regression (SSR) and the total sum of squares (SST). SSR is defined as

SSR = Σ_{i=1}^n w_i (ŷ_i − ȳ)²

SST is also called the sum of squares about the mean, and is defined as

SST = Σ_{i=1}^n w_i (y_i − ȳ)²

where SST = SSR + SSE. Given these definitions, R-square is expressed as

R-square = SSR / SST = 1 − SSE / SST

R-square can take on any value between 0 and 1, with a value closer to 1 indicating that a greater proportion of variance is accounted for by the model. For example, an R-square value of 0.8234 means that the fit explains 82.34% of the total variation in the data about the average.

If you increase the number of fitted coefficients in your model, R-square will increase although the fit may not improve in a practical sense. To avoid this situation, you should use the degrees-of-freedom adjusted R-square statistic described below. Note that it is possible to get a negative R-square for equations that do not contain a constant term. Because R-square is defined as the proportion of variance explained by the fit, if the fit is actually worse than just fitting a horizontal line then R-square is negative. In this case, R-square cannot be interpreted as the square of a correlation. Such situations indicate that a constant term should be added to the model.

Degrees-of-Freedom Adjusted R-Square

This statistic uses the R-square statistic defined above, and adjusts it based on the residual degrees of freedom. The residual degrees of freedom is defined as the number of response values n minus the number of fitted coefficients m estimated from the response values:

v = n − m

v indicates the number of independent pieces of information involving the n data points that are required to calculate the sum of squares. Note that if parameters are bounded and one or more of the estimates are at their bounds, then those estimates are regarded as fixed. The degrees of freedom is increased by the number of such parameters. The adjusted R-square statistic is generally the best indicator of the fit quality when you compare two models that are nested, that is, a series of models each of which adds additional coefficients to the previous model.

adjusted R-square = 1 − SSE (n − 1) / (SST · v)

The adjusted R-square statistic can take on any value less than or equal to 1, with a value closer to 1 indicating a better fit. Negative values can occur when the model contains terms that do not help to predict the response.

Root Mean Squared Error

This statistic is also known as the fit standard error and the standard error of the regression. It is an estimate of the standard deviation of the random component in the data, and is defined as

RMSE = s = sqrt(MSE)

where MSE is the mean square error or the residual mean square:

MSE = SSE / v

Just as with SSE, an MSE value closer to 0 indicates a fit that is more useful for prediction.
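A minimal sketch of these statistics (our helper function, with unit weights w_i = 1 and illustrative data):

```python
import math

# Goodness-of-fit statistics for response values y, fitted values yhat,
# and m fitted coefficients (unit weights assumed).
def gof_stats(y, yhat, m):
    n = len(y)
    ybar = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, yhat))   # sum sq. of residuals
    sst = sum((yi - ybar) ** 2 for yi in y)                # total sum of squares
    r2 = 1.0 - sse / sst
    v = n - m                                              # residual deg. of freedom
    adj_r2 = 1.0 - sse * (n - 1) / (sst * v)
    rmse = math.sqrt(sse / v)
    return sse, r2, adj_r2, rmse

sse, r2, adj_r2, rmse = gof_stats([1, 2, 3, 4, 5], [1.1, 1.9, 3.2, 3.9, 5.0], m=2)
print(sse, r2, adj_r2, rmse)
```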

FIRO-B


The extent of power or dominance a person seeks
How much you want to lead others or want others to lead and influence you
Refers to one-on-one relationships and your behavior in groups
            Inclusion        Control         Affection
Expressed   Inclusion (eI)   Control (eC)    Affection (eA)
Wanted      Inclusion (wI)   Control (wC)    Affection (wA)
2021/4/1
Hypotheses to be explored
You are responsible for interpreting the meaning
Must be considered in the overall context of your life
Results CAN be influenced significantly by
Schutz identified three needs (inclusion, control, and affection), and two levels (expressed and wanted)
The Six-Cell Model
Results can be used to…
FIRO-B Stage I Interpretation

A Study on the Risk Assessment of Non-Contact Injuries of the Lower Limbs and Trunk of Chinese Rugby Players


Article ID: 1002-9826(2018)05-0117-06. DOI: 10.16470/j.csst.201805018. China Sport Science and Technology, 2018, Vol. 54, No. 5, 117-122.

A Study on the Risk Assessment of Non-Contact Injuries of the Lower Limbs and Trunk of Chinese Rugby Players

GAO Xiao-lin, XU Hui, HUANG Peng, LI Yu, YANG Hui-jun

Abstract: Objective: To build a risk-equation model for non-contact injuries of the lower limbs and trunk of Chinese rugby players using logistic regression and to screen predictors, providing a scientific basis for sports-injury risk assessment.

Methods: Active Chinese national- and provincial-team rugby players served as subjects. Data were collected with the standard Y Balance Test (YBT) and the Functional Movement Screen (FMS), and non-contact injuries of the lower limbs and trunk were tracked. Logistic regression was used to screen risk factors, build the equation, and analyze the relation between the risk factors and injury risk.

Results: The regression equation for non-contact injury risk was Logistic[P(Y=1)] = −1.639 − 1.492X1 − 0.013X2 + 2.188X3 + 1.184X4 + 0.118X5 + 1.901X6 (X1 = sex, X2 = years in the sport, X3 = history of sports injury, X4 = lower-limb Y test, X5 = FMS total score, X6 = pain during FMS testing, Y = injury). In the omnibus test of model coefficients, the p-values for step, block, and model were all below 0.01. In the goodness-of-fit test, the −2 log-likelihood (−2LL) was 82.629, Cox & Snell R² was 0.381, and Nagelkerke R² was 0.527. The model predicted injury with 89.7% accuracy and no injury with 68.6% accuracy, 82.5% on average.
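The published equation can be evaluated as follows, assuming the left-hand side is the logit of P(Y=1). The input profile below is hypothetical and only illustrates how the fitted coefficients are applied; the coding of X1-X6 (e.g. how sex and injury history are scored) is not specified here and would need to follow the original paper.

```python
import math

# Published coefficients: intercept, then X1..X6
coef = [-1.639, -1.492, -0.013, 2.188, 1.184, 0.118, 1.901]

def injury_probability(x):
    z = coef[0] + sum(b * xi for b, xi in zip(coef[1:], x))
    return 1.0 / (1.0 + math.exp(-z))   # logistic link: P(Y = 1)

# hypothetical athlete: X1=1, X2=5 years, X3=1 (prior injury),
# X4=0.9 (Y-test score), X5=15 (FMS total), X6=1 (pain on FMS test)
p = injury_probability([1, 5, 1, 0.9, 15, 1])
print(p)
```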

Precautions for a Medical Examination (English essay)


Title: Essential Considerations for Undergoing a Medical Examination

Undertaking a medical examination is an integral part of maintaining good health. It provides crucial insights into one's physical condition, allowing for timely detection and treatment of potential health issues. However, to ensure the accuracy and effectiveness of these examinations, it is imperative to follow certain guidelines and considerations.

Firstly, prior to the medical examination, it is essential to schedule an appointment with the healthcare provider. This not only ensures that the necessary resources and personnel are available, but also allows the individual to prepare adequately. During this scheduling process, it is advisable to inquire about the specific procedures and tests that will be performed, as well as any pre-examination instructions or restrictions.

One crucial aspect to consider before a medical examination is diet. Depending on the specific tests being performed, certain dietary restrictions may be required. For instance, certain blood tests may require the individual to refrain from eating or drinking for a specific duration before the examination. It is, therefore, crucial to adhere to these dietary instructions to ensure the accuracy of the test results.

Furthermore, it is essential to avoid certain substances that may interfere with the medical examination. This includes alcohol, tobacco, and certain medications.

A Healthy Weight Is Important for Physical Fitness (English essay)


健康的体重对身体素质很重要英语作文Maintaining a Healthy Weight: The Key to Optimal Physical Well-beingAchieving and maintaining a healthy weight is a fundamental aspect of overall physical well-being. The importance of maintaining a healthy weight cannot be overstated, as it directly impacts an individual's physical, mental, and emotional health. In this essay, we will explore the various ways in which a healthy weight contributes to improved physical fitness and overall quality of life.One of the primary benefits of maintaining a healthy weight is the positive impact it has on cardiovascular health. Excess body weight, particularly in the form of visceral fat, can put significant strain on the heart and circulatory system. This increased strain can lead to the development of conditions such as high blood pressure, high cholesterol, and an increased risk of heart disease and stroke. By maintaining a healthy weight, individuals can reduce the burden on their cardiovascular system, lowering their risk of these potentially life-threatening conditions.In addition to the cardiovascular benefits, a healthy weight also playsa crucial role in musculoskeletal health. Carrying excess weight can place additional stress on the joints, particularly the knees, hips, and ankles, leading to the development of osteoarthritis and other joint-related issues. Maintaining a healthy weight not only reduces the strain on the joints but also enhances overall mobility and flexibility, allowing individuals to engage in physical activities with greater ease and reduced risk of injury.Furthermore, a healthy weight is closely linked to improved respiratory function. Excess body weight can impair the efficiency of the lungs and respiratory system, making it more difficult to breathe and engage in physical activity. 
By maintaining a healthy weight, individuals can experience improved lung capacity, better oxygen intake, and a reduced risk of respiratory problems such as sleep apnea and asthma.Beyond the physical benefits, a healthy weight also has a significant impact on an individual's mental and emotional well-being. Excess weight can lead to feelings of low self-esteem, depression, and anxiety, which can further exacerbate physical health issues. Conversely, maintaining a healthy weight can boost self-confidence, improve mood, and enhance overall quality of life. This positive impact on mental health can then have a ripple effect, improving an individual's ability to engage in physical activity and make healthier lifestyle choices.It is important to note that achieving and maintaining a healthy weight is not a one-size-fits-all approach. Each individual's body type, metabolism, and overall health status can vary, and it is essential to work with healthcare professionals to determine the appropriate weight range and strategies for maintaining it. This may involve a combination of balanced nutrition, regular physical activity, and in some cases, medical interventions such as weight management programs or medications.In conclusion, the importance of maintaining a healthy weight cannot be overstated. A healthy weight contributes to improved cardiovascular health, enhanced musculoskeletal function, better respiratory function, and a positive impact on mental and emotional well-being. By prioritizing a healthy weight, individuals can unlock the full potential of their physical fitness and overall quality of life. Embracing a holistic approach to health and wellness, which includes maintaining a healthy weight, is a crucial step towards achieving optimal physical well-being.。

for assessing the goodness of fit

Table 1: Recurrence rates y/m from cervical carcinoma

 L \ E      0        1        2        3
 1        21/124    7/21     9/16    13/13
 2        18/58     6/12     5/7      5/5
 3         4/14    16/19     9/12    10/12
a wider family, again with E(y_i) = μ_i and model (1), but with variance and cumulants proportional to those under the null hypothesis, where the proportionality (dispersion) constant equals 1. This covers the problem of overdispersion in Poisson or binomial models, where the variability of the residuals is systematically larger than specified by the model. Departures from the null model are commonly evaluated by the (generalized) Pearson statistic
X² = Σ_{i=1}^n (y_i − μ̂_i)² / V̂_i.
Example 1. To illustrate the effect of using X, a logistic regression
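A sketch of the generalized Pearson statistic for binomial responses, using our notation and hypothetical fitted values (the paper's modified statistics are not reproduced here): with y_i successes in m_i trials and fitted probability pi_i, the fitted mean is mu_i = m_i * pi_i and the null variance is V_i = m_i * pi_i * (1 - pi_i).

```python
# Generalized Pearson statistic X^2 = sum_i (y_i - mu_i)^2 / V_i
# for binomial responses (hypothetical fitted values, purely illustrative).
def pearson_x2(y, m, pi):
    return sum((yi - mi * p) ** 2 / (mi * p * (1 - p))
               for yi, mi, p in zip(y, m, pi))

x2 = pearson_x2(y=[3, 5], m=[10, 10], pi=[0.4, 0.5])
print(x2)
```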

Discrete Goodness-of-Fit Test Package (dgof) Manual


Package 'dgof'  October 13, 2022

Version: 1.4
Date: 2022-07-16
Title: Discrete Goodness-of-Fit Tests
Author: Taylor B. Arnold, John W. Emerson, R Core Team and contributors worldwide
Maintainer: Taylor B. Arnold <*************************>
Description: A revision to the stats::ks.test() function and the associated ks.test.Rd help page. With one minor exception, it does not change the existing behavior of ks.test(), and it adds features necessary for doing one-sample tests with hypothesized discrete distributions. The package also contains cvm.test(), for doing one-sample Cramer-von Mises goodness-of-fit tests.
License: GPL (>= 2.0)
LazyLoad: yes
NeedsCompilation: yes
Repository: CRAN
Date/Publication: 2022-06-16 16:50:02 UTC

R topics documented: cvm.test, ks.test

cvm.test: Discrete Cramer-von Mises Goodness-of-Fit Tests

Description

Computes the test statistics for doing one-sample Cramer-von Mises goodness-of-fit tests and calculates asymptotic p-values.

Usage

cvm.test(x, y, type = c("W2", "U2", "A2"),
         simulate.p.value = FALSE, B = 2000, tol = 1e-8)

Arguments

x: a numerical vector of data values.
y: an ecdf or step-function (stepfun) for specifying the hypothesized model.
type: the variant of the Cramer-von Mises test; "W2" is the default and most common method, "U2" is for cyclical data, and "A2" is the Anderson-Darling alternative. For details see references.
simulate.p.value: a logical indicating whether to compute p-values by Monte Carlo simulation.
B: an integer specifying the number of replicates used in the Monte Carlo test (for discrete goodness-of-fit tests only).
tol: used as an upper bound for possible rounding error in values (say, a and b) when needing to check for equality (a == b) (for discrete goodness-of-fit tests only).

Details

While the Kolmogorov-Smirnov test may be the most popular of the nonparametric goodness-of-fit tests, Cramer-von Mises tests have been shown to be more powerful against a large class of alternative hypotheses. The original test was developed by Harald Cramer and Richard von Mises (Cramer, 1928; von Mises, 1928) and further adapted by Anderson and Darling (1952), and Watson (1961).

Value

An object of class htest.

Note

Additional notes?

Author(s)

Taylor B. Arnold and John W. Emerson. Maintainer: Taylor B. Arnold <**********************>

References

T.W. Anderson and D.A. Darling (1952). Asymptotic theory of certain "goodness of fit" criteria based on stochastic processes. Annals of Mathematical Statistics, 23:193-212.
V. Choulakian, R.A. Lockhart, and M.A. Stephens (1994). Cramer-von Mises statistics for discrete distributions. The Canadian Journal of Statistics, 22(1):125-137.
H. Cramer (1928). On the composition of elementary errors. Skand. Akt., 11:141-180.
M.A. Stephens (1974). EDF statistics for goodness of fit and some comparisons. Journal of the American Statistical Association, 69(347):730-737.
R.E. von Mises (1928). Wahrscheinlichkeit, Statistik und Wahrheit. Julius Springer, Vienna, Austria.
G.S. Watson (1961). Goodness of fit tests on the circle. Biometrika, 48:109-114.

See Also

ks.test, ecdf, stepfun

Examples

require(dgof)

x3 <- sample(1:10, 25, replace = TRUE)

# Using ecdf() to specify a discrete distribution:
ks.test(x3, ecdf(1:10))
cvm.test(x3, ecdf(1:10))

# Using step() to specify the same discrete distribution:
myfun <- stepfun(1:10, cumsum(c(0, rep(0.1, 10))))
ks.test(x3, myfun)
cvm.test(x3, myfun)

# Usage of U2 for cyclical distributions (note U2 unchanged, but W2 not)
set.seed(1)
y <- sample(1:4, 20, replace = TRUE)
cvm.test(y, ecdf(1:4), type = "W2")
cvm.test(y, ecdf(1:4), type = "U2")
z <- y
cvm.test(z, ecdf(1:4), type = "W2")
cvm.test(z, ecdf(1:4), type = "U2")

# Compare analytic results to simulation results
set.seed(1)
y <- sample(1:3, 10, replace = TRUE)
cvm.test(y, ecdf(1:6), simulate.p.value = FALSE)
cvm.test(y, ecdf(1:6), simulate.p.value = TRUE)

ks.test: Kolmogorov-Smirnov Tests

Description

Performs one or two sample Kolmogorov-Smirnov tests.

Usage

ks.test(x, y, ..., alternative = c("two.sided", "less", "greater"),
        exact = NULL, tol = 1e-8, simulate.p.value = FALSE, B = 2000)

Arguments

x: a numeric vector of data values.
y: a numeric vector of data values, or a character string naming a cumulative distribution function or an actual cumulative distribution function such as pnorm. Alternatively, y can be an ecdf function (or an object of class stepfun) for specifying a discrete distribution.
...: parameters of the distribution specified (as a character string) by y.
alternative: indicates the alternative hypothesis and must be one of "two.sided" (default), "less", or "greater". You can specify just the initial letter of the value, but the argument name must be given in full. See 'Details' for the meanings of the possible values.
exact: NULL or a logical indicating whether an exact p-value should be computed. See 'Details' for the meaning of NULL. Not used for the one-sided two-sample case.
tol: used as an upper bound for possible rounding error in values (say, a and b) when needing to check for equality (a == b); a value of NA or 0 does exact comparisons but risks making errors due to numerical imprecisions.
simulate.p.value: a logical indicating whether to compute p-values by Monte Carlo simulation, for discrete goodness-of-fit tests only.
B: an integer specifying the number of replicates used in the Monte Carlo test (for discrete goodness-of-fit tests only).

Details

If y is numeric, a two-sample test of the null hypothesis that x and y were drawn from the same continuous distribution is performed. Alternatively, y can be a character string naming a continuous (cumulative) distribution function (or such a function), or an ecdf function (or object of class stepfun) giving a discrete distribution. In these cases, a one-sample test is carried out of the null that the distribution function which generated x is distribution y with parameters specified by ....

The presence of ties generates a warning unless y describes a discrete distribution (see above), since continuous distributions do not generate them.

The possible values "two.sided", "less" and "greater" of alternative specify the null hypothesis that the true distribution function of x is equal to, not less than or not greater than the hypothesized distribution function (one-sample case) or the distribution function of y (two-sample case), respectively. This is a comparison of cumulative distribution functions, and the test statistic is the maximum difference in value, with the statistic in the "greater" alternative being D+ = max_u [F_x(u) − F_y(u)]. Thus in the two-sample case alternative = "greater" includes distributions for which x is stochastically smaller than y (the CDF of x lies above and hence to the left of that for y), in contrast to t.test or wilcox.test.

Exact p-values are not available for the one-sided two-sample case, or in the case of ties if y is continuous. If exact = NULL (the default), an exact p-value is computed if the sample size is less than 100 in the one-sample case with y continuous, or if the sample size is less than or equal to 30 with y discrete, or if the product of the sample sizes is less than 10000 in the two-sample case for continuous y. Otherwise, asymptotic distributions are used whose approximations may be inaccurate in small samples. With y continuous, in the one-sample two-sided case, exact p-values are obtained as described in Marsaglia, Tsang & Wang (2003); the formula of Birnbaum & Tingey (1951) is used for the one-sample one-sided case.

In the one-sample case with y discrete, the methods presented in Conover (1972) and Gleser (1985) are used when exact = TRUE (or when exact = NULL) and length(x) <= 30 as described above. When exact = FALSE, or exact = NULL with length(x) > 30, the test is not exact and the resulting p-values are known to be conservative. Usage of exact = TRUE with sample sizes greater than 30 is not advised due to numerical instabilities; in such cases, simulated p-values may be desirable.

If a single-sample test is used with y continuous, the parameters specified in ... must be pre-specified and not estimated from the data. There is some more refined distribution theory for the KS test with estimated parameters (see Durbin, 1973), but that is not implemented in ks.test.

Value

A list with class "htest" containing the following components:

statistic: the value of the test statistic.
p.value: the p-value of the test.
alternative: a character string describing the alternative hypothesis.
method: a character string indicating what type of test was performed. a character string giving the name(s) of the data.

Author(s)

Modified by Taylor B. Arnold and John W. Emerson to include one-sample testing with a discrete distribution (as presented in Conover's 1972 paper; see references).

References

Z.W. Birnbaum and Fred H. Tingey (1951). One-sided confidence contours for probability distribution functions. The Annals of Mathematical Statistics, 22(4):592-596.
William J. Conover (1971). Practical Nonparametric Statistics. New York: John Wiley & Sons. Pages 295-301 (one-sample Kolmogorov test), 309-314 (two-sample Smirnov test).
William J. Conover (1972). A Kolmogorov Goodness-of-Fit Test for Discontinuous Distributions. Journal of the American Statistical Association, 67(339):591-596.
Leon Jay Gleser (1985). Exact Power of Goodness-of-Fit Tests of Kolmogorov Type for Discontinuous Distributions. Journal of the American Statistical Association, 80(392):954-958.
J. Durbin (1973). Distribution theory for tests based on the sample distribution function. SIAM.
George Marsaglia, Wai Wan Tsang and Jingbo Wang (2003). Evaluating Kolmogorov's distribution. Journal of Statistical Software, 8/18. https:///v08/i18/

See Also

shapiro.test, which performs the Shapiro-Wilk test for normality; cvm.test for Cramer-von Mises type tests.

Examples

require(graphics)
require(dgof)

set.seed(1)
x <- rnorm(50)
y <- runif(30)
# Do x and y come from the same distribution?
ks.test(x, y)
# Does x come from a shifted gamma distribution with shape 3 and rate 2?
ks.test(x + 2, "pgamma", 3, 2)  # two-sided, exact
ks.test(x + 2, "pgamma", 3, 2, exact = FALSE)
ks.test(x + 2, "pgamma", 3, 2, alternative = "gr")

# test if x is stochastically larger than x2
x2 <- rnorm(50, -1)
plot(ecdf(x), xlim = range(c(x, x2)))
plot(ecdf(x2), add = TRUE, lty = "dashed")
t.test(x, x2, alternative = "g")
wilcox.test(x, x2, alternative = "g")
ks.test(x, x2, alternative = "l")

# TBA, JWE new examples added for discrete distributions:
x3 <- sample(1:10, 25, replace = TRUE)

# Using ecdf() to specify a discrete distribution:
ks.test(x3, ecdf(1:10))

# Using step() to specify the same discrete distribution:
myfun <- stepfun(1:10, cumsum(c(0, rep(0.1, 10))))
ks.test(x3, myfun)

# The previous R ks.test() does not correctly calculate the
# test statistic for discrete distributions (gives warning):
# stats::ks.test(c(0, 1), ecdf(c(0, 1)))
# ks.test(c(0, 1), ecdf(c(0, 1)))

# Even when the correct test statistic is given, the
# previous R ks.test() gives conservative p-values:
stats::ks.test(rep(1, 3), ecdf(1:3))
ks.test(rep(1, 3), ecdf(1:3))
ks.test(rep(1, 3), ecdf(1:3), simulate = TRUE, B = 10000)
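For readers outside R, the one-sample KS statistic against a hypothesized discrete distribution can be sketched in Python (our function; the dgof package additionally computes p-values and handles ties). For step CDFs, the supremum of |F_n − F_0| is attained at the support points.

```python
# One-sample KS statistic against a hypothesized discrete distribution.
# support: the (sorted) jump points; cdf0: the hypothesized CDF at those points.
def discrete_ks(sample, support, cdf0):
    n = len(sample)
    d = 0.0
    for t, f0 in zip(support, cdf0):
        f_n = sum(1 for x in sample if x <= t) / n   # empirical CDF at t
        d = max(d, abs(f_n - f0))
    return d

# uniform null on {1, 2, 3}: F0 = (1/3, 2/3, 1)
d = discrete_ks([1, 1, 2], [1, 2, 3], [1/3, 2/3, 1.0])
print(d)   # approximately 1/3
```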


Effect Size and Confidence Intervals in General Linear Models for Categorical Data Analysis

Randall E. Schumacker
University of North Texas, Health Science Center

In general linear models for categorical data analysis, goodness-of-fit statistics provide only a broad significance test of whether the model fits the sample data. Hypothesis testing has traditionally reported the chi-square or G2 likelihood-ratio (deviance) statistic and associated p-value when testing the significance of a model or comparing alternative models. The effect size (log odds ratio) and confidence interval (ASE) need to receive more attention when interpreting categorical response data using the logistic regression model. This trend is supported by recent efforts in general linear models for continuous data (t-test, analysis of variance, least squares regression) that have criticized the sole use of statistical significance testing and the p < .05 criterion for a Type I error rate.

The American Psychological Association has recently advocated that hypothesis testing go beyond statistical significance testing at p < .05 for a Type I error rate (Wilkinson, L., & APA Task Force on Statistical Inference, 1999). Traditional statistical significance testing has placed an emphasis upon the probability of a statistical value occurring beyond a chance level, given the sampling distribution of the statistic (Harlow, Mulaik, & Steiger, 1997). Recently, more emphasis has been placed on the practical interpretation of results, including effect size, confidence interval, and confidence intervals around an effect size; however, the discussion has centered on statistical applications that use continuous data (Kirk, 1996). The present study highlights a typical application that uses the general linear model for categorical data analysis (DeMaris, 1992; Fox, 1997). The logistic regression goodness-of-fit criteria for categorical data analysis will be presented (Kleinbaum, 1994).
The results go beyond the statistical test of significance and highlight the important role that effect size (odds ratio, log odds ratio, relative risk or probability ratio) and confidence interval (asymptotic standard error; ASE) have in the general linear model for categorical data analysis.

Categorical data analysis techniques are used when subject responses are binary and mutually exclusive. The typical method of analyzing relationships among categorical variables is the chi-square statistic or phi correlation coefficient (Upton, 1978). The general linear model for categorical response variables, however, has become more widely used in the behavioral sciences because many research questions involve a categorical dependent variable and one or more categorical independent variables.

Logistic regression is a special case of log-linear regression where both the dependent and independent variables are categorical in nature (Hosmer & Lemeshow, 1989; Kleinbaum, 1994). It offers distinct advantages over the chi-square method for the analysis of categorical variables. In logit models, natural log odds of the frequencies are computed, which allows different models and different model parameters to be compared, given the additive nature of the G2 component for each model. If a non-significant likelihood-ratio chi-square (G2) value is computed, then a given model fits the observed data.

Goodness-of-fit Criteria

A theoretical logit regression model is generally postulated (null model or base model). A common practice is then to create alternative models where each new model contains the parameters of the previous model plus a hypothesized new parameter. The theoretical model can be tested beginning with a null model and adding parameters, or beginning with a saturated model and deleting parameters. Several logit regression models may fit equally well, based on the various goodness-of-fit criteria used to determine whether the model fits the data.
The goodness-of-fit criteria typically reported are:

1. Pearson chi-square
2. Likelihood-ratio chi-square (G2)
3. Predictive efficacy (R-squared type measure)
4. Deviance (-2[L_M - L_S])

Pearson chi-square is calculated as χ2 = Σ (O - E)2 / E. The chi-square distribution is defined by its degrees of freedom, df. The mean of the chi-square distribution is equal to df, with standard deviation equal to the square root of 2df. As the degrees of freedom increase, the chi-square sampling distribution goes from a right-skewed distribution to a normal distribution.

The likelihood-ratio chi-square (G2) is based on the ratio of maximum likelihood values, Λ, and is expressed in logarithm form as -2 log(Λ). The G2 statistic can also be expressed as G2 = 2 Σ Oij log(Oij / Eij), where O is the observed cell frequency, E is the expected cell frequency, and the i and j subscripts index the individual cells of the cross-tabulated table. The log transformation yields an approximate chi-squared sampling distribution with a minimum value of zero, with larger values suggesting rejection of the null hypothesis. The p-value simply indicates the strength of evidence against the null hypothesis.

Predictive efficacy refers to whether a model generates accurate predictions of group membership on the dependent variable. It is possible to have an excellent fit between the logit model and the data without having predictive efficacy. Recall that if G2 = 0, a saturated model exists which perfectly fits the data, yet predictive efficacy can be far from perfect. The R2 type measure for logistic regression is not meant as a variance-accounted-for interpretation, as traditionally noted in least squares regression, because it underestimates the proportion of variance explained in the categorical variables.
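The Pearson chi-square and G2 formulas can be checked numerically. The sketch below computes both for a small hypothetical 2x2 table; the counts are illustrative only, not taken from the article:

```python
import math

# Hypothetical 2x2 contingency table (illustrative counts).
observed = [[30, 10],
            [20, 40]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected cell frequencies under independence: E_ij = row_i * col_j / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

cells = [(o, e) for orow, erow in zip(observed, expected)
         for o, e in zip(orow, erow)]

# Pearson chi-square: sum of (O - E)^2 / E over all cells.
chisq = sum((o - e) ** 2 / e for o, e in cells)

# Likelihood-ratio chi-square: G2 = 2 * sum O * log(O / E).
g2 = 2 * sum(o * math.log(o / e) for o, e in cells)
```

For this table the two statistics are close (chi-square ≈ 16.67, G2 ≈ 17.26), as expected when cell counts are moderate; both are referred to a chi-square distribution with (rows - 1)(cols - 1) = 1 degree of freedom.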
Instead, the R2 type measure is an approximation for assessing predictive efficacy, ranging from zero (0) [independence model] to one (1) [saturated model].

The deviance value provides a way to examine differences in nested logistic regression models. The G2 from one model is simply subtracted from the G2 of the second model, similar to testing a full versus restricted model in multiple regression. The deviance value is -2[L_M - L_S], where L represents the respective log-likelihood function of each model, with degrees of freedom equal to the difference in the degrees of freedom of the two models. The deviance is the likelihood-ratio statistic (G2) for comparing model M to the saturated model S. Since the saturated model has G2 = 0, this reduces to the G2 statistic for the hypothesized logistic regression model. If G2 is non-significant, then additional independent categorical predictors in the model are not needed. This type of test is appropriate only for the likelihood-ratio chi-square and not the Pearson chi-square, because adding independent categorical predictor variables will never result in a poorer fit of the model to the data.

Effect Size and Confidence Interval Criteria

Effect size measures and the asymptotic standard error (ASE) play a major role in interpreting the practical significance of estimated parameters in general linear models for categorical variables. The parameter estimates in logistic regression are calculated using maximum likelihood estimation and possess asymptotic properties. As sample size increases, the parameter estimates become unbiased and consistent with population parameters. The sampling distribution also approaches normality, with variance lower than that of other unbiased estimation procedures.

The effect size measures typically used in categorical data analysis are:

1. z test
2. odds ratio
3. log odds ratio
4.
relative risk or probability ratio

The z test, given larger samples, can be used to test a parameter's significance and compute a confidence interval. The formula is z = B / ASE. The confidence interval is computed as z +/- 1.96σ, where σ = [p(1 - p)/n]1/2. The significance test simply indicates whether an estimated parameter is reasonable, whereas the confidence interval yields a range of possible values for the parameter, given sampling error.

Odds are computed as Odds = p / (1 - p). If the probability of success is .8, the probability of failure is .2, and the odds are .8 / .2 = 4; this indicates 4 successes for every one failure. Unfortunately, odds ratios in small to moderate samples have skewed sampling distributions and therefore are not widely used.

The log odds ratio, or natural logarithm of the odds ratio, log(θ), is preferred for interpreting an effect size. Independence of the categorical variables is equivalent to log(θ) = 0; that is, an odds ratio of 1 corresponds to a log odds ratio of 0. The sampling distribution of the log odds ratio approximates a normal distribution as sample size increases, with mean log(θ) and standard deviation ASE. Parameter estimates in logit models can be readily interpreted as log odds ratios. The odds ratio is calculated as e^β for a single parameter, or e^(β1 - β2) for the difference between two parameters; the latter is useful when examining contrasts between levels of two independent categorical predictor variables.

The relative risk or probability ratio should be interpreted separately from the odds ratio (Cohen, 2000). The relative risk (RR) indicates a ratio of probabilities and is computed as probability p1 divided by probability p2 [RR = p1 / p2]. In contrast, the odds ratio (OR) is [p1 / (1 - p1)] divided by [p2 / (1 - p2)]. The odds ratio is therefore related to, but different from, the relative risk: OR = RR x [(1 - p2) / (1 - p1)].
For logistic model interpretation, a gender coefficient (male = 0 and female = 1) of e^1.67 would indicate the odds of females over males participating, whereas the statement "females were two-thirds more likely than males to participate" is a relative risk or probability statement.

The asymptotic standard error (ASE), or standard deviation of the log-transformed sampling distribution, is computed as ASE[log(π)] = [1/n1 + … + 1/nk]1/2. A 95% confidence interval around the log odds ratio is then computed as log(π) +/- 1.96 ASE[log(π)]. If the resulting confidence interval, transformed back to the odds-ratio scale, does not contain the value 1.0, the true odds are different for the two groups being compared. The confidence interval also provides valuable information about the range of minimum and maximum log odds ratios.

Method

The logistic regression model (log(π) = α + β1X1 + … + βkXk) was applied to a set of categorical data (Stokes, Davis, & Koch, 1995). The goodness-of-fit criteria, effect size, confidence interval, and confidence interval around the effect size are reported. The importance of effect size and confidence interval reporting, above and beyond significance testing, is then discussed.

Data Analysis

An example data set relating myocardial infarction and aspirin use is provided as follows (Agresti, 1996):

Group     Yes    No      Total
Placebo   189    10,845  11,034
Aspirin   104    10,933  11,037

The proportion p1 for the placebo group is 189 / 11,034 = .0171, indicating that 1.71 percent suffered myocardial infarction while taking a placebo. In contrast, the proportion p2 for the aspirin group is 104 / 11,037 = .0094, indicating that 0.94 percent suffered myocardial infarction while taking aspirin. The difference in proportions is .0077 with a standard error of .0015; z = .0077/.0015 = 5.133, which is statistically significant. The 95% confidence interval for the true difference is .0077 +/- 1.96(.0015), or (.005, .011), so taking aspirin appears to diminish the risk of myocardial infarction.
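The difference-of-proportions test just described can be reproduced directly. This is a sketch using the counts from the Agresti (1996) example; the small difference from the article's z = 5.133 comes only from its rounding of the standard error to .0015:

```python
import math

# Counts from the aspirin example (Agresti, 1996).
mi_placebo, n_placebo = 189, 11034
mi_aspirin, n_aspirin = 104, 11037

p1 = mi_placebo / n_placebo   # proportion of MI cases, placebo group
p2 = mi_aspirin / n_aspirin   # proportion of MI cases, aspirin group
diff = p1 - p2

# Standard error of the difference of two independent proportions.
se = math.sqrt(p1 * (1 - p1) / n_placebo + p2 * (1 - p2) / n_aspirin)
z = diff / se

# 95% confidence interval for the true difference.
ci = (diff - 1.96 * se, diff + 1.96 * se)
```

Here diff ≈ .0077, se ≈ .0015, z ≈ 5.0, and ci ≈ (.005, .011) after rounding, matching the text; the interval excludes zero, so the difference is statistically significant.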
The relative risk is .0171 divided by .0094, or 1.82. Using relative risk, the proportion of MI cases was 82% higher for the group taking the placebo. The 95% confidence interval is (1.43, 2.30); thus we can be 95% confident that the proportion of MI cases for the group taking the placebo was at least 43% higher than for the group taking aspirin. The relative risk indicates that the difference is not trivial and may have important health implications.

The natural log odds ratio is log(1.82) = .599. The ASE[log(π)] is computed as [1/189 + 1/10,845 + 1/104 + 1/10,933]1/2 = .123. The 95% confidence interval for log(π) is (.358, .840). The corresponding confidence interval for π is (1.43, 2.30). Since it does not contain 1.0, the true odds of myocardial infarction appear to be different for the two groups.

Results and Interpretation

The categorical data example indicates a statistically significant z-test of the difference between the proportions of myocardial infarction cases for the placebo and aspirin groups. The effect size (odds ratio, log odds ratio, and relative risk or probability ratio) provides a more practical interpretation of the efficacy of using aspirin to prevent myocardial infarction in patients. Moreover, the confidence interval, and especially the confidence interval around the effect size (log odds ratio), provided important additional information for our interpretation of the results.

Statistical significance testing has come under attack by scholars in recent years because it is influenced by a researcher's choice of sample size, power, and Type I error rate. The research literature, however, has focused on continuous data analysis techniques and has not fully included categorical data analysis methods. The American Psychological Association and the editors of several popular journals are now requiring educational researchers to report effect sizes and confidence intervals.
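The relative risk, odds ratio, and log-scale confidence interval for the same 2x2 table can be verified as follows. This sketch uses the exact odds ratio (≈ 1.832), whereas the article takes log(1.82), the log of the relative risk; the two are nearly identical here because the event is rare:

```python
import math

# 2x2 counts: (MI yes, MI no) for placebo and aspirin rows (Agresti, 1996).
a, b = 189, 10845   # placebo
c, d = 104, 10933   # aspirin

p1 = a / (a + b)
p2 = c / (c + d)

rr = p1 / p2                    # relative risk (probability ratio)
odds_ratio = (a / b) / (c / d)  # sample odds ratio
log_or = math.log(odds_ratio)

# Asymptotic standard error of the log odds ratio.
ase = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)

# 95% CI on the log scale, then back-transformed to the odds-ratio scale.
lo, hi = log_or - 1.96 * ase, log_or + 1.96 * ase
ci_odds = (math.exp(lo), math.exp(hi))
```

This gives rr ≈ 1.82 and ase ≈ .123 as in the text, and ci_odds ≈ (1.44, 2.33), agreeing with the article's (1.43, 2.30) up to its use of log(RR) in place of log(OR). Since the interval excludes 1.0, the true odds differ between groups.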
The use and interpretation of effect size and confidence interval in categorical data analysis is therefore also important to understand and report.

References

Agresti, A. (1996). An Introduction to Categorical Data Analysis. NY: John Wiley & Sons, Inc.

Cohen, M.P. (2000). Note on the odds ratio and the probability ratio. Journal of Educational and Behavioral Statistics, 25(2), 249-252.

DeMaris, A. (1992). Logit modeling: Practical applications. Sage University Paper Series on Quantitative Applications in the Social Sciences, no. 07-086. Newbury Park, CA: Sage.

Fox, J. (1997). Applied Regression Analysis, Linear Models, and Related Methods. Newbury Park, CA: Sage.

Harlow, L.L., Mulaik, S.A., & Steiger, J.H. (Eds.) (1997). What If There Were No Significance Tests? NJ: Lawrence Erlbaum Associates, Inc.

Hosmer, D.W., & Lemeshow, S. (1989). Applied Logistic Regression. NY: John Wiley & Sons, Inc.

Kirk, R. (1996). Practical significance: A concept whose time has come. Educational and Psychological Measurement, 56, 746-759.

Kleinbaum, D.G. (1994). Logistic Regression. NY: Springer-Verlag.

Stokes, M.E., Davis, C.S., & Koch, G.G. (1995). Categorical Data Analysis Using the SAS System. Cary, NC: SAS Institute, Inc.

Send correspondence to: Randall E. Schumacker, Ph.D., University of North Texas, Health Science Center. Email: *******************


In hydrology, the GPD is often called the "peaks over thresholds" (POT) model, since it is used to model exceedances over threshold levels in flood control. Davison and Smith (1990) discussed this application in their section 9, using river-flow exceedances for a particular river over a period of 35 years. The authors fit the GPD for exceedances over a series of thresholds and also calculated the Kolmogorov-Smirnov and Anderson-Darling statistics to test fit. In the article, they mentioned the lack of tests for the GPD [echoing a remark made earlier by Smith (1984)]. In the absence of such tests, they used tables for testing exponentiality, which, as the authors pointed out, give too high critical values. This dataset is tested for GPD in Section 3.

In this article, goodness-of-fit tests are given for the GPD, based on the Cramer-von Mises statistic [W.sup.2] and the Anderson-Darling [A.sup.2]. We concentrate on the most practical case, in which the parameters are not known. Estimation of parameters is discussed in Section 1, and the goodness-of-fit tests are given in Section 2. In Section 3, they are applied to exceedances for a Canadian river, and it is shown how the tests can be used to help select the threshold in the POT model, following ideas suggested by Davison and Smith (1990). Moreover, the Davison-Smith example is revisited, using the GPD test and the Anderson-Darling statistic. The tests are then used to model the exceedances over thresholds of 238 Canadian river-flow series; they indicate the adequacy of the GPD fit. The technique of choosing the threshold is based on a stability property of the GPD; the efficacy of the technique when the true exceedance distribution is not GPD is examined in Section 4. In Section 5, we investigate the versatility of the GPD as a distribution that might be used for many types of data with a long tail and no mode in the density. Finally, the asymptotic theory of the tests is given in Section 6.

1.
ESTIMATION OF PARAMETERS

Before the test of fit can be made, unknown parameters in (1) must first be estimated. The asymptotic theory of the test statistics requires an efficient method of estimation; we shall use maximum likelihood. Although it is theoretically possible to have datasets for which no solution exists to the likelihood equations, in practice this appears to be extremely rare. For example, in our examination of the 238 Canadian rivers, maximum likelihood estimates of the parameters could be found in every case. Thus we shall assume that such estimates exist for the dataset under test. Hosking and Wallis (1987), Castillo and Hadi (1997), and Dupuis and Tsao (1998) studied other methods of estimating the parameters. One of these methods is the use of probability-weighted moments; although there appear to be some advantages in terms of bias, Chen and Balakrishnan (1995) and Dupuis (1996) later showed that this technique is not always feasible.

We now discuss estimation by maximum likelihood. Suppose that [x.sub.1],..., [x.sub.n] is a given random sample from the GPD given in (1), and let [x.sub.(1)] [less than or equal to] [x.sub.(2)] [less than or equal to] ... [less than or equal to] [x.sub.(n)] be the order statistics. We consider three distinct cases--Case 1, in which the shape parameter k is known and the scale parameter a is unknown; Case 2, in which the shape parameter k is unknown and the scale parameter is known; and Case 3, in which both parameters a and k are unknown. Case 3 is the situation most likely to arise in practice. The log-likelihood is given by

L(a, k) = -n log a - (1 - 1/k) [summation over (i=1/n)] log(1 - [kx.sub.i]/a) for k [not equal to] 0

= -n log a - [summation over (i=1/n)] [x.sub.i]/a for k = 0. (3)

The range for a is a > 0 for k [less than or equal to] 0 and a > [kx.sub.(n)] for k > 0.
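The log-likelihood (3) is easy to code directly. The sketch below (function name ours) handles the k = 0 exponential limit and returns -inf outside the parameter range, which is convenient for numerical maximization:

```python
import math

def gpd_loglik(x, a, k):
    """GPD log-likelihood (3): scale a > 0, shape k.

    Uses the exponential limit at k = 0 and returns -inf whenever a
    sample point falls outside the support (1 - k*x/a must stay > 0)."""
    n = len(x)
    if a <= 0:
        return float("-inf")
    if abs(k) < 1e-12:  # k = 0: exponential with mean a
        return -n * math.log(a) - sum(xi / a for xi in x)
    total = 0.0
    for xi in x:
        u = 1 - k * xi / a
        if u <= 0:
            return float("-inf")
        total += math.log(u)
    return -n * math.log(a) - (1 - 1 / k) * total

# k = 1 reduces the GPD to the uniform distribution on [0, a],
# so each observation contributes log(1/a):
ll = gpd_loglik([0.5, 1.5], a=2.0, k=1.0)
```

For the uniform check above, ll equals 2 log(1/2); the same function can be handed to any maximizer for Cases 1-3.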
When k < 1/2, Smith (1984) showed that, under certain regularity conditions, the maximum likelihood estimators are asymptotically normal and asymptotically efficient. When 0.5 [less than or equal to] k < 1, Smith (1984) identified the problem as nonregular, which alters the rate of convergence of the maximum likelihood estimators. For k [greater than or equal to] 1, and as n [right arrow] [infinity], the probability approaches 1 that the likelihood has no local maximum. We now consider Cases 1 to 3 separately.

Case 1 (k known, a unknown). For this case, we have the following result.

Proposition 1. For any known k with k < 1, the maximum likelihood estimate of a exists and is unique.

Proof. For k = 0 (the exponential distribution), the result is well known. Suppose that k [not equal to] 0; then a, the maximum likelihood estimate of a, will be a solution of [partial]L(a, k)/[partial]a = 0, which may be simplified to [L.sub.1](a) = 0, where [L.sub.1](a) = n - (1 - k) [summation over (i=1/n)] [x.sub.i]/(a - [kx.sub.i]). The value of [[partial].sup.2]L(a, k)/[partial][a.sup.2] at a = a is -(1 - k) [summation over (i=1/n)] [x.sub.i][(a - [kx.sub.i]).sup.-2]/a < 0, which implies that at a = a the likelihood function attains its maximum value. Moreover, the function [L.sub.1](a) is an increasing function on the range of a because [partial][L.sub.1](a)/[partial]a = (1 - k) [summation over (i=1/n)] [x.sub.i]/[(a - [kx.sub.i]).sup.2] > 0; the function can take negative and positive values; thus it cuts the a axis at exactly one point. Hence, a is unique.

Case 2 (a known, k unknown). In this situation, there may or may not exist a maximum likelihood estimate for k. To see this, consider the likelihood function L(a, k) given in (3) for -[infinity] < k [less than or equal to] a/[x.sub.(n)]. Since a is known, the likelihood will be regarded as a function L(k) of k only.
Set [k.sup.*] = a/[x.sub.(n)]; then

[lim.sub.k[right arrow]-[infinity]] L(k) = -[infinity], (4)

[lim.sub.k[right arrow][k.sup.*]] L(k) = -[infinity] if [k.sup.*] < 1, (5)

and

[lim.sub.k[right arrow][k.sup.*]] L(k) = +[infinity] if [k.sup.*] > 1. (6)

Note that the value of k is not restricted to be less than or equal to 1. From (4) and (5), it follows that, if [k.sup.*] < 1, there is at least one maximum. For a fixed sample of size n, Pr([k.sup.*] < 1) = 1 - [[1 - [(1 - k).sup.1/k]].sup.n] > 1 - [(3/4).sup.n]. Similarly, if [k.sup.*] > 1, it follows from (4) and (6) that there may or may not exist a local maximum.

Case 3 (both parameters unknown). Maximum likelihood estimation of the parameters a and k when both are unknown was discussed in detail by Grimshaw (1993). This case is similar to Case 2: In principle, maximum likelihood estimates for a and k may not exist. However, as stated previously, this is unlikely in practical applications, especially when the sample size is reasonably large. To find a solution, Davison (1984) pointed out that, by a change of parameters to [theta] = k/a and k = k, the problem is reduced to a unidimensional search; we search for [theta], which gives a local maximum of the profile log-likelihood (the log-likelihood maximized over k). This is

[L.sup.*]([theta]) = -n - [summation over (i=1/n)] log(1 - [theta][x.sub.i]) - n log[-[(n[theta]).sup.-1] [summation over (i=1/n)] log(1 - [theta][x.sub.i])] (7)

for [theta] < 1/[x.sub.(n)]. Suppose that a local maximum [theta] of (7) can be found; then

k = -([n.sup.-1]) [summation over (i=1/n)] log(1 - [theta][x.sub.i]) (8)

and

a = k/[theta]. (9)

2. GOODNESS-OF-FIT TESTS

In this section, the Cramer-von Mises statistic [W.sup.2] and the Anderson-Darling statistic [A.sup.2] are described. The Anderson-Darling statistic is a modification of the Cramer-von Mises statistic giving more weight to observations in the tail of the distribution, which is useful in detecting outliers.
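Equations (7)-(9) reduce Case 3 to a one-dimensional search over [theta], which can be sketched with a simple grid. This is our illustrative implementation, not Grimshaw's algorithm: the grid bounds and simulated sample are arbitrary choices, and the grid deliberately stops short of [theta] = 1/[x.sub.(n)], where the profile likelihood can degenerate.

```python
import math
import random

def profile_loglik(theta, x):
    """Profile log-likelihood (7); requires theta < 1/max(x), theta != 0."""
    n = len(x)
    s = sum(math.log(1 - theta * xi) for xi in x)  # sum of log(1 - theta*x_i)
    return -n - s - n * math.log(-s / (n * theta))

def fit_gpd(x, lo=-1.0, grid=2000):
    """Grid-search theta over [lo, 0.5/max(x)), then recover k via (8)
    and a via (9).  A sketch only; a real search would refine locally."""
    hi = 0.5 / max(x)          # stay well away from the boundary 1/x_(n)
    best_ll, best_theta = None, None
    for i in range(1, grid):
        theta = lo + i * (hi - lo) / grid
        if abs(theta) < 1e-9:  # theta = 0 is the exponential limit; skip
            continue
        ll = profile_loglik(theta, x)
        if best_ll is None or ll > best_ll:
            best_ll, best_theta = ll, theta
    k = -sum(math.log(1 - best_theta * xi) for xi in x) / len(x)   # (8)
    return k, k / best_theta                                       # (9)

# Simulated check: draw from a GPD with a = 1, k = -0.25 by inverting (1),
# i.e. x = a * (1 - (1 - u)^k) / k for uniform u.
random.seed(7)
a_true, k_true = 1.0, -0.25
xs = [a_true * (1 - (1 - random.random()) ** k_true) / k_true
      for _ in range(400)]
k_hat, a_hat = fit_gpd(xs)
```

With 400 observations the grid estimates land close to the true (k, a); the same routine applied to real exceedances would be followed by the goodness-of-fit tests of Section 2.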
The null hypothesis is [H.sub.0]: the random sample [x.sub.1],..., [x.sub.n] comes from Distribution (1).

When parameters a and k are known in (1), the GPD is completely specified; we call this situation Case 0. Then the transformation [z.sub.i] = F([x.sub.i]) produces a z sample that will be uniformly distributed between 0 and 1 under [H.sub.0]. Many tests of the uniform distribution exist, including Cramer-von Mises tests (see Stephens 1986), so we shall not consider this case further. In the more common Cases 1, 2, and 3, when one or both parameters must be estimated, the goodness-of-fit test procedure is as follows:

1. Find the estimates of unknown parameters as described previously, and make the transformation [z.sub.(i)] = F([x.sub.(i)]), for i = 1,..., n, using the estimates where necessary.

2. Calculate statistics [W.sup.2] and [A.sup.2] as follows:

[W.sup.2] = [summation over (i=1/n)] [{[z.sub.(i)] - (2i - 1)/(2n)}.sup.2] + 1/(12n)

and

[A.sup.2] = -n - (1/n) [summation over (i=1/n)] (2i - 1)[log{[z.sub.(i)]} + log{1 - [z.sub.(n+1-i)]}].

Tables 1 and 2 give upper-tail asymptotic percentage points for the statistics [W.sup.2] and [A.sup.2], for values of k between -0.9 and 0.5, and for Cases 1, 2, and 3. In Cases 2 and 3, where k must be estimated, the appropriate table should be entered at k, and in all tables, if k or k is greater than 0.5, the table should be entered at k = 0.5. Critical points for other values of k can be obtained by interpolation; linear interpolation in Table 2 for [A.sup.2], Case 3, gives a maximum error in [alpha] from 0.0011 to 0.0003 as [alpha] moves through the important range R: 0.10 [greater than or equal to] [alpha] [greater than or equal to] 0.005, and less than 0.003 for the two values S: [alpha] = 0.5 and 0.25; for [W.sup.2] the corresponding figures are 0.0025 to 0.0004 over R and less than 0.007 for S.
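Step 2 can be sketched as a short function; here z holds the probability-integral transforms [z.sub.(i)] = F([x.sub.(i)]) (the function name and the tiny worked values are ours):

```python
import math

def cvm_statistics(z):
    """Cramer-von Mises W^2 and Anderson-Darling A^2 computed from
    probability-integral transforms z_i = F(x_i), all in (0, 1)."""
    z = sorted(z)
    n = len(z)
    # W^2 = sum_i { z_(i) - (2i - 1)/(2n) }^2 + 1/(12n)
    w2 = sum((zi - (2 * i - 1) / (2 * n)) ** 2
             for i, zi in enumerate(z, start=1)) + 1 / (12 * n)
    # A^2 = -n - (1/n) sum_i (2i - 1)[log z_(i) + log(1 - z_(n+1-i))]
    a2 = -n - sum((2 * i - 1) * (math.log(z[i - 1]) + math.log(1 - z[n - i]))
                  for i in range(1, n + 1)) / n
    return w2, a2

# Tiny worked example with z = (0.25, 0.5, 0.75):
w2, a2 = cvm_statistics([0.25, 0.5, 0.75])   # w2 = 1/24, a2 ≈ 0.2694
```

Large values of either statistic, compared with the percentage points of Tables 1 and 2 entered at the estimated k, lead to rejection of the GPD.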
For the much less important Cases 1 and 2, linear interpolation in Table 1 gives a maximum error for [A.sup.2], Case 1, from 0.009 to 0.0017 over R, and 0.014 for S; for [W.sup.2], Case 1, the figures are 0.0055 to 0.0015 over R and 0.008 for S. For Case 2, the maximum errors are smaller by a factor of approximately 10.

The points in these tables were calculated from the asymptotic theory to be presented in Section 6. Points for finite n were generated from Monte Carlo samples, using 10,000 samples for each combination of n and k. The results show that these points converge quickly to the asymptotic points for all three cases, a feature of statistics [W.sup.2] and [A.sup.2] found also in many other applications. The asymptotic points can then be used with good accuracy for, say, n [greater than or equal to] 25. More extensive versions of Tables 1 and 2, and also tables showing the convergence to the asymptotic points, were given by Choulakian and Stephens (2000).

3. EXAMPLES

In this section we consider two specific examples and then discuss the overall results when the GPD is applied to 238 Canadian rivers.

In the first example, the data are n = 72 exceedances of flood peaks (in [m.sup.3]/s) of the Wheaton River near Carcross in Yukon Territory, Canada. The initial threshold value [tau] is 27.50 (how this was found will be explained later), and the 72 exceedances, for the years 1958 to 1984, rounded to one decimal place, were 1.7, 2.2, 14.4, 1.1, 0.4, 20.6, 5.3, 0.7, 1.9, 13, 12, 9.3, 1.4, 18.7, 8.5, 25.5, 11.6, 14.1, 22.1, 1.1, 2.5, 14.4, 1.7, 37.6, 0.6, 2.2, 39, 0.3, 15, 11, 7.3, 22.9, 1.7, 0.1, 1.1, 0.6, 9, 1.7, 7, 20.1, 0.4, 2.8, 14.1, 9.9, 10.4, 10.7, 30, 3.6, 5.6, 30.8, 13.3, 4.2, 25.5, 3.4, 11.9, 21.5, 27.6, 36.4, 2.7, 64, 1.5, 2.5, 27.4, 1, 27.1, 20.2, 16.8, 5.3, 9.7, 27.5, 2.5, 27. The maximum likelihood estimates of the parameters are k = -0.006 and a = 12.14.
We now apply the tests presented in Section 2; the values of the test statistics are [W.sup.2] = 0.2389 (p value < 0.025) and [A.sup.2] = 1.452 (p value < 0.010), so the GPD does not fit the dataset well at the threshold value of 27.50. Next we follow the technique of Davison and Smith (1990), who suggested raising the threshold until the GPD fits the new values of the remaining exceedances. Here the threshold was raised successively by the value of the smallest order statistic, which is then deleted, until the p values of the [A.sup.2] and [W.sup.2] statistics exceeded 10%. This happens when six order statistics are deleted; the threshold value is then 27.50 + 0.60 = 28.10. The GPD now fits the data quite well: Details are given in Table 3. This example shows how the tests may help in choosing a threshold value in the POT model.

In this connection, we next revisit the example given by Davison and Smith (1990). Their tables 5 and 6 give the parameter estimates (Case 3) and the values of the Anderson-Darling statistic for a range of thresholds from 140 down to 70 at intervals of 10 units. The values of the Anderson-Darling statistic were compared with 1.74, the 5% point for a test of exponentiality, since no test for GPD was available, and the statistics were not nearly significant by this criterion. Now Table 2 can be used to give the asymptotic 5% critical values [z.sub.5] for [A.sup.2] corresponding to the estimate of the parameter k. The first two and last two threshold results are given in Table 4. Davison and Smith pointed out the sudden increase in the value of [A.sup.2] at threshold level 70; against the exponential point this value is still not significant, but using Table 2, it falls just at the critical 5% level.

The Wheaton River in the previous example is one of 238 Canadian rivers for which we have similar data. We now examine how well the GPD fits the exceedances for the other rivers.
First one must decide the threshold level for a given river flow. The first threshold estimate, [tau], was chosen so that the number of exceedances per year could be modeled by a Poisson distribution; see, for instance, Todorovic (1979). This was done by taking [tau] such that, if N[tau] is the number of exceedances, the mean of N[tau] divided by its variance was approximately 1. The Poisson assumption will be tested more rigorously later. After the threshold level was chosen, the maximum likelihood estimates of the parameters k and a were calculated and the [W.sup.2] and [A.sup.2] tests applied. Table 5 gives the frequencies of p values for the 238 rivers. It is clear that the GPD fits quite well; using [W.sup.2], only 9 gave p values less than 0.01, and using [A.sup.2], there were only 15. At the 0.05 level, these figures are 34 using [W.sup.2] and 49 using [A.sup.2]. The results demonstrate also that [A.sup.2] is more sensitive (and therefore more powerful) than [W.sup.2] against possible outliers in the tail, as was suggested in Section 2. More details were given by Choulakian and Stephens (2000). For the 49 "rejections" using [A.sup.2], the threshold was increased, as described in the first example, by deleting the smallest order statistics until the p value became larger than 0.10. Then only 10 of the 49 sets were still rejected as GPD by [A.sup.2]. Finally, for these 10 rejected river flows, the Poisson assumption was tested by the Cramer-von Mises tests given by Spinelli and Stephens (1997). Only one dataset rejected the Poisson assumption.

Another result of interest is that 229 of the 238 values of k were between -0.5 and 0.4; this confirms the findings of Hosking and Wallis (1987), who restricted attention to -0.5 < k < 0.5, where the great majority of values fall.

4.
4. POWER OF THE FAILURE-TO-REJECT METHOD

The method of choosing the threshold by using as many exceedances as possible subject to passing a test for GPD can be investigated for power as follows. The data analyst wishes to fit a GPD to the exceedances far enough in the tail. She/he therefore starts with as many as possible, say 100, and tests; if the test fails, she/he omits the smallest values one by one and tests again until the test yields acceptance for the GPD. The efficiency of this procedure can now be investigated when the true distribution is not GPD. Suppose, for instance, the true exceedance distribution is Weibull with parameter 0.75, and suppose, starting with sample size 100, the sample fails the GPD test until K lower-order statistics have been removed; then it passes. Statistic K will vary from sample to sample, but its mean will give a measure of the power of the test. The standard deviation and other statistics from the distribution of K are also of interest. This has been investigated for various alternative distributions for the true exceedances; Table 6 provides these results. They are based on 1,000 Monte Carlo samples of initial size n = 100, 50, or 30. Only statistic [A.sup.2] is reported because [A.sup.2] outperforms [W.sup.2] at detecting tail discrepancies. Column 3 gives the initial rejection rate for a GPD test with parameters estimated, at test size 0.05; this is also the power in a conventional study of power against the specified alternative. Subsequent columns give the mean, standard deviation, the three quartiles, and the maximum values for K. Thus for the Weibull(0.75) alternative, 509 of the 1,000 samples of size 100 were rejected as GPD initially; of these rejected samples, the mean number of order statistics to be deleted before acceptance was 8.169 with standard deviation 9.178.
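One run of the experiment just described can be mimicked as follows. This is a sketch under stated assumptions: it uses a single fixed cutoff for [A.sup.2] in place of the k-dependent asymptotic points of Table 2, so the counts it produces are only indicative.

```python
import numpy as np
from scipy.stats import genpareto

def a2_gpd(x):
    """A^2 for a maximum likelihood GPD fit (scipy's shape c = -k)."""
    c, _, scale = genpareto.fit(x, floc=0.0)
    z = np.sort(np.clip(genpareto.cdf(x, c, scale=scale), 1e-12, 1 - 1e-12))
    n = len(z)
    i = np.arange(1, n + 1)
    return -n - np.mean((2 * i - 1) * (np.log(z) + np.log1p(-z[::-1])))

def deletions_until_accept(x, crit=1.0):
    """K = number of smallest order statistics removed before A^2 < crit.

    `crit` is an illustrative cutoff, not a tabulated critical value.
    """
    x = np.sort(x)
    for k in range(len(x) - 10):        # keep at least 10 observations
        if a2_gpd(x[k:]) < crit:
            return k
    return len(x) - 10

rng = np.random.default_rng(1)
sample = rng.weibull(0.75, 100)         # Weibull(0.75) "true" exceedances
K = deletions_until_accept(sample)
```

Repeating this over many samples and averaging K over the initially rejected samples reproduces the kind of summary reported in Table 6.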
The distribution of K is long-tailed; the quartiles are 2, 5, and 11, respectively, but one sample needed 52 order statistics removed to achieve acceptance by the GPD test.

In Table 6, the distributions were chosen to give a reasonable starting rejection rate. Several other distributions could be proposed as alternatives (e.g., the half-normal, half-Cauchy, or gamma distributions), but these can be made very close to the GPD by suitable choice of parameters, so the initial power is small, and very few low-order statistics, if any, need be deleted to achieve GPD acceptance (cf. the numerical results in Sec. 5). In their section 9, Davison and Smith suggested that the exceedances for their example might possibly be truly modeled by a mixture of two populations, and in Table 6 we have included three mixtures of GPD as possible true models. The test statistic distinguishes quite effectively between these mixtures and a single GPD.

For all alternative distributions considered in Table 6, it may be seen that, as n increases, the mean, standard deviation, and maximum of K all increase; the results also show the increase in the power of the test as the sample size increases.

5. VERSATILITY OF THE GPD

As was suggested previously, some distributions often fitted to long-tailed data may be brought close to a GPD by suitable choice of k and a. This closeness was investigated, and results are given in Table 7. In this table, for example, the standard half-normal and standard half-Cauchy distributions (respectively, the distribution of [absolute value of X] when X has the standard normal or the standard Cauchy distribution) are compared with the GPD(0.293, 1.022) for the first and GPD(-0.710, 1.152) for the second; the standard lognormal is compared with GPD(-0.140, 1.406).
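This kind of closeness can be checked directly. In the sketch below (an illustration, not the authors' computation) the half-normal is compared with the GPD(0.293, 1.022) pairing quoted above, converting to scipy's parametrization via c = -k, scale = a.

```python
import numpy as np
from scipy.stats import genpareto, halfnorm

k, a = 0.293, 1.022                    # Table 7 match for the half-normal
x = np.linspace(0.0, 5.0, 2001)
# Maximum discrepancy between the two distribution functions on a grid;
# this bounds the error in alpha, the probability level of a percentile,
# when one distribution is used in place of the other.
err = np.max(np.abs(halfnorm.cdf(x) - genpareto.cdf(x, -k, scale=a)))
```

For this pair the discrepancy stays at a few percent, which is the sense in which the GPD can stand in for these long-tailed alternatives.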
The choice of the GPD parameters can be made by equating the first two moments of the GPD to those of the distribution compared, but this does not produce as good a fit as the following maximum likelihood procedure. A sample of 500 values was taken from the half-normal distribution, and a GPD fit was made, estimating the parameters by maximum likelihood, first solving (7) for [theta] and then obtaining k, a from (8) and (9). This was repeated 100 times, and the average values of k and a were used as the parameters in the GPD. Although the matches are clearly not perfect, the error in [alpha], the probability level of a given percentile, is small when one distribution is used instead of another. Similar comparisons are given for several Weibull distributions and a gamma distribution. The exponential (Weibull with parameter 1) is omitted because it is a special case of the GPD with k = 0. Again the two distributions are quite close for Weibull parameter greater than 1 (where the Weibull has a mode), but the match is less good for Weibull with parameter less than 1--for example, the Weibull(0.5) or the Weibull(0.75)--where the density rises to infinity at x = 0.

Overall, the results suggest that a GPD could often be used as a model for data with a long tail when neither a mode nor an infinite density is suggested by the nature of the variables or by the data themselves.

6. ASYMPTOTIC THEORY OF THE TESTS

In this section we summarize the asymptotic theory of the Cramer-von Mises tests. The calculation of asymptotic distributions of the statistics follows a procedure described, for instance, by Stephens (1976). It is based on the fact that [y.sub.n](z) = [square root of n]{[F.sub.n](z) - z}, 0 [less than or equal to] z [less than or equal to] 1, where [F.sub.n](z) is the empirical distribution function of the z set, tends to a Gaussian process y(z) as n [right arrow] [infinity], and the statistics are functionals of this process.
The mean of y(z) is 0; we need the covariance function [rho](s, t) = E{y(s)y(t)}, 0 [less than or equal to] s, t [less than or equal to] 1. When all the parameters are known, this covariance is [[rho].sub.0](s, t) = min(s, t) - st. When parameters are estimated, the covariance will depend in general on the true values of the estimated parameters. However, if the method of estimation is efficient, the covariance will not depend on the scale parameter a but will depend on the shape parameter k. We illustrate for Case 3 only. As the sample size n [right arrow] [infinity], the maximum likelihood estimators of (a, k) are asymptotically bivariate normal with mean (a, k) and variance-covariance matrix [SIGMA]/n, where

[SIGMA] = (1 + k) [ 2[a.sup.2]    a   ]
                  [     a       1 + k ]. (10)

When both parameters a and k are estimated, the covariance function of y(z) becomes

[[rho].sub.3](s, t) = [[rho].sub.0](s, t) - {g(s)}'[SIGMA]g(t), (11)

where s = F(x) and g(s) = ([g.sub.1](s), [g.sub.2](s))' is a vector with coordinates

[g.sub.1](s) = [partial]F/[partial]a = (1 - s){1 - [(1 - s).sup.-k]}/(ak)

and

[g.sub.2](s) = [partial]F/[partial]k = (1 - s){k log(1 - s) - 1 + [(1 - s).sup.-k]}/[k.sup.2].

When [SIGMA] and g(s) are inserted into (11), [[rho].sub.3](s, t) will be independent of a. When k [greater than or equal to] 0.5, the maximum likelihood estimates of a and k are superefficient in the sense of Darling (1955), and then the covariance and the resulting asymptotic distributions will be the same as for k = 0.5. Thus, if 0.5 [less than or equal to] k [less than or equal to] 1, the table should be entered at k = 0.5, as described in Section 2. In Case 1 the covariance of y(z) becomes

[[rho].sub.1](s, t) = [[rho].sub.0](s, t) - (1 - 2k)[g.sub.1](s)[g.sub.1](t) (12)

and for Case 2 it becomes

[[rho].sub.2](s, t) = [[rho].sub.0](s, t) - (1 - k)(1 - 2k)[g.sub.2](s)[g.sub.2](t)/2. (13)

In both these cases, at k = 0.5, the asymptotic covariance becomes [[rho].sub.0](s, t).
This is the same as for a test for Case 0, when both a and k are known, and the asymptotic points are the same as for such a test (see Stephens 1986).

The Cramer-von Mises statistic [W.sup.2] is based directly on the process y(z), whereas [A.sup.2] is based on the process w(z) = y(z)/[{z(1 - z)}.sup.1/2]; asymptotically the distributions of [W.sup.2] and [A.sup.2] are those of [W.sup.2] = [[integral].sup.1.sub.0][y.sup.2](z)dz and [A.sup.2] = [[integral].sup.1.sub.0][w.sup.2](z)dz. The asymptotic distribution of each statistic is a sum of weighted independent [[chi square].sub.1] variables; the weights for [W.sup.2] must be found from the eigenvalues of an integral equation with the appropriate [[rho].sub.j](s, t) for Case j as kernel. For [A.sup.2], [[rho].sub.A](s, t), the covariance of the w(z) process, is [[rho].sub.j](s, t)/[{st(1 - s)(1 - t)}.sup.1/2], and this is the kernel of the integral equation. Once the weights are known, the percentage points of the distributions can be calculated by Imhof's method. For details of these procedures, see, for example, Stephens (1976).

ACKNOWLEDGMENTS

We thank the editor and two referees for their many helpful suggestions. This work was supported by the Natural Sciences and Engineering Research Council of Canada.

REFERENCES

Castillo, E., and Hadi, A. S. (1997), "Fitting the Generalized Pareto Distribution to Data," Journal of the American Statistical Association, 92, 1609-1620.

Chen, G., and Balakrishnan, N. (1995), "The Infeasibility of Probability Weighted Moments Estimation of Some Generalized Distributions," in Recent Advances in Life-Testing and Reliability, ed. N. Balakrishnan, London: CRC Press, pp. 565-573.

Choulakian, V., and Stephens, M. A. (2000), "Goodness-of-Fit Tests for the Generalized Pareto Distribution," research report, Simon Fraser University, Dept. of Mathematics and Statistics.

Darling, D.
(1955), "The Cramer-von Mises Test in the Parametric Case," The Annals of Mathematical Statistics, 26, 1-20.

Davison, A. C. (1984), "Modeling Excesses Over High Thresholds, With an Application," in Statistical Extremes and Applications, ed. J. Tiago de Oliveira, Dordrecht: D. Reidel, pp. 461-482.

Davison, A. C., and Smith, R. L. (1990), "Models for Exceedances Over High Thresholds" (with comments), Journal of the Royal Statistical Society, Ser. B, 52, 393-442.

Dupuis, D. J. (1996), "Estimating the Probability of Obtaining Nonfeasible Parameter Estimates of the Generalized Pareto Distribution," Journal of Statistical Computation and Simulation, 54, 197-209.

Dupuis, D. J., and Tsao, M. (1998), "A Hybrid Estimator for Generalized Pareto and Extreme-Value Distributions," Communications in Statistics--Theory and Methods, 27, 925-941.

Grimshaw, S. D. (1993), "Computing Maximum Likelihood Estimates for the Generalized Pareto Distribution," Technometrics, 35, 185-191.

Hosking, J. R. M., and Wallis, J. R. (1987), "Parameter and Quantile Estimation for the Generalized Pareto Distribution," Technometrics, 29, 339-349.

Pickands, J. (1975), "Statistical Inference Using Extreme Order Statistics," The Annals of Statistics, 3, 119-131.

Smith, R. L. (1984), "Threshold Methods for Sample Extremes," in Statistical Extremes and Applications, ed. J. Tiago de Oliveira, Dordrecht: Reidel, pp. 621-638.

_____ (1989), "Extreme Value Analysis of Environmental Time Series: An Application to Trend Detection in Ground-Level Ozone," Statistical Science, 4, 367-393.

_____ (1990), "Extreme Value Theory," in Handbook of Applicable Mathematics (Vol. 7), ed. W. Ledermann, Chichester, U.K.: Wiley.
