Lecture 10: Statistical Inference, Interval Estimation (Tsinghua University course slides)
Hypothesis: $H_0: p = p_0$ versus $H_1: p < p_0$.
Test statistic: $B = \sum_{i=1}^{n} X_i$. Small values of $B$ give evidence that $H_1$ is true.
Under the null, $B$ has a Binomial($n, p_0$) distribution. Therefore, the p-value is
$$p(x) = P\left(B_{n,p_0} \le \sum_{i=1}^{n} x_i\right).$$
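The lower-tail p-value above can be evaluated directly by summing the binomial pmf. The following is a minimal sketch using only the Python standard library; the function names `binom_pmf` and `binomial_test_less` and the sample numbers are illustrative assumptions, not part of the slides.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(B = k) for B ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_test_less(successes, n, p0):
    """Lower-tail p-value P(B_{n,p0} <= successes) for H1: p < p0."""
    return sum(binom_pmf(k, n, p0) for k in range(successes + 1))


if __name__ == "__main__":
    # Hypothetical data: 3 successes in 20 trials, tested against p0 = 0.5.
    print(binomial_test_less(3, 20, 0.5))
```

A small p-value here means that a count as low as the observed one would be unusual if $p = p_0$.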
Fisher’s exact test
Since $S_X = \sum_{i=1}^{n_X} X_i$ is a sufficient statistic for $p_X$ and $S_Y = \sum_{i=1}^{n_Y} Y_i$
Let $W(X)$ be a test statistic such that large values of $W$ give evidence that $H_1$ is true. Let $S(X)$ be a sufficient statistic for the parameter $\theta$ under the null model. For each sample point $x$, define
$$p(x) = P\big(W(X) \ge W(x) \mid S = S(x)\big).$$
Then, $p(X)$ is a valid p-value.
11/24/2010
Is the proportion of girls in Tsinghua less than 50%?
Statistical Methods with Applications
Rui Jiang, PhD Associate Professor
Ministry of Education Key Laboratory of Bioinformatics Bioinformatics Division, TNLIST/Department of Automation Tsinghua University, Beijing 100084, China
Tests of two proportions
Protocol
Tests of two proportions
Contingency table
Let Population X and Population Y in a contingency table have Bernoulli($p_X$) and Bernoulli($p_Y$) distributions, respectively. We would like to test
(1) $H_0: p_X = p_Y$ versus $H_1: p_X > p_Y$;
(2) $H_0: p_X = p_Y$ versus $H_1: p_X < p_Y$;
(3) $H_0: p_X = p_Y$ versus $H_1: p_X \ne p_Y$.
Here, $B_{n,\theta}$ is a Binomial($n, \theta$) random variable.
Then, $p(X)$ is a valid p-value.
Binomial test, less
One-sample binomial test, less
Normal approximation
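For the $\chi^2$ approximation, we have that
$$Z_n^2 = \left(\frac{\bar{X} - \mu}{\hat{\sigma}/\sqrt{n}}\right)^2 = \frac{(\hat{p} - p)^2}{\hat{p}(1 - \hat{p})/n} \to \chi^2_1 \quad \text{in distribution.}$$
This suggests using $Z_n^2$ as the test statistic and rejecting $H_0$ if and only if $Z_n^2 > \chi^2_{1,\alpha}$.

With $\theta_0 < \theta_1$, $S_X$ is a sufficient statistic for $\theta$, and the pmf of $S_X$ satisfies $f(S_x \mid \theta_i) \propto \theta_i^{S_x}(1 - \theta_i)^{n - S_x}$. Therefore
$$\frac{f(S_x \mid \theta_1)}{f(S_x \mid \theta_0)} = \frac{\theta_1^{S_x}(1 - \theta_1)^{n - S_x}}{\theta_0^{S_x}(1 - \theta_0)^{n - S_x}} = \left(\frac{\theta_1}{\theta_0} \cdot \frac{1 - \theta_0}{1 - \theta_1}\right)^{S_x} \left(\frac{1 - \theta_1}{1 - \theta_0}\right)^{n}.$$
Since $\theta_1 > \theta_0$, we have $1 - \theta_0 > 1 - \theta_1$, so
$$\frac{\theta_1}{\theta_0} > 1; \qquad \frac{1 - \theta_0}{1 - \theta_1} > 1; \qquad \frac{\theta_1}{\theta_0} \cdot \frac{1 - \theta_0}{1 - \theta_1} > 1.$$
Hence $f(x \mid \theta_1) > k f(x \mid \theta_0)$ is equivalent to
$$S_x > \left[\log k + n \log\left(\frac{1 - \theta_0}{1 - \theta_1}\right)\right] \Big/ \log\left(\frac{\theta_1}{\theta_0} \cdot \frac{1 - \theta_0}{1 - \theta_1}\right) \equiv c(k).$$
Therefore, we would like to reject $H_0$ when $S_x$ is sufficiently large.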
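The equivalence between the likelihood-ratio condition and the threshold $c(k)$ on $S_x$ can be checked numerically. The sketch below is a sanity check under assumed values ($n = 10$, $\theta_0 = 0.3$, $\theta_1 = 0.6$, $k = 2$, all hypothetical); the function names are illustrative.

```python
from math import log


def likelihood_ratio(s, n, theta0, theta1):
    """f(s | theta1) / f(s | theta0) for a Binomial(n, theta) count s."""
    return (theta1 / theta0) ** s * ((1 - theta1) / (1 - theta0)) ** (n - s)


def threshold_c(k, n, theta0, theta1):
    """c(k) such that the likelihood ratio exceeds k exactly when s > c(k)."""
    rho = theta1 * (1 - theta0) / (theta0 * (1 - theta1))
    return (log(k) + n * log((1 - theta0) / (1 - theta1))) / log(rho)


if __name__ == "__main__":
    n, theta0, theta1, k = 10, 0.3, 0.6, 2.0  # hypothetical values
    c = threshold_c(k, n, theta0, theta1)
    for s in range(n + 1):
        # Both columns should agree for every possible count s.
        print(s, likelihood_ratio(s, n, theta0, theta1) > k, s > c)
```

The rejection region depends on the data only through $S_x$, which is why the test can be stated as a cutoff on the number of successes.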
Are two coins equally fair?
In 1000 tosses of coin 1, 520 heads and 480 tails appear. In 1200 tosses of coin 2, 680 heads and 520 tails appear. Is it reasonable to assume that the two coins are equally fair?
Fisher’s Exact Test
Fundamentals of Statistics: Functions of Random Variables
“A random variable is a quantity whose values are random and to which a probability distribution is assigned.”
is a sufficient statistic for $p_Y$, we can make the decision using only $S_X$ and $S_Y$. Obviously, $S_X$ has a Binomial($n_X, p_X$) distribution and $S_Y$ has a Binomial($n_Y, p_Y$) distribution. When the null is true ($p_X = p_Y = p$), the joint pmf of $(S_X, S_Y)$ is
$$f(s_X, s_Y \mid n_X, n_Y, p) = \binom{n_X}{s_X}\binom{n_Y}{s_Y}\, p^{s_X + s_Y}(1 - p)^{(n_X + n_Y) - (s_X + s_Y)},$$
which suggests that $S = S_X + S_Y$ is a sufficient statistic for $p$ under the null hypothesis.
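Conditioning on $S = s$ cancels the factor $p^{s}(1-p)^{n-s}$, so the conditional distribution of $S_X$ is hypergeometric: $P(S_X = k \mid S = s) = \binom{n_X}{k}\binom{n_Y}{s-k} / \binom{n_X + n_Y}{s}$. This is a standard fact rather than something stated on the slide. The sketch below uses it to compute a one-sided conditional p-value for $H_1: p_X < p_Y$; the function name `fisher_exact_less` is an illustrative assumption.

```python
from math import comb


def fisher_exact_less(s_x, n_x, s_y, n_y):
    """P(S_X <= s_x | S_X + S_Y = s) under H0: p_X = p_Y.

    Small values of S_X give evidence for H1: p_X < p_Y.
    """
    s = s_x + s_y
    total = comb(n_x + n_y, s)
    lowest = max(0, s - n_y)  # S_X cannot be smaller than this given S = s
    favourable = sum(
        comb(n_x, k) * comb(n_y, s - k) for k in range(lowest, s_x + 1)
    )
    return favourable / total


if __name__ == "__main__":
    # Coin data from the slides: 520 heads in 1000 tosses vs 680 heads in 1200.
    print(fisher_exact_less(520, 1000, 680, 1200))
```

The sum is carried out in exact integer arithmetic and divided once at the end, which avoids underflow for sample sizes like those in the coin example.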
Karlin-Rubin Theorem
Let $X_1, \ldots, X_n$ be iid Bernoulli($\theta$). Consider the hypotheses $H_0: \theta = \theta_0$ versus $H_1: \theta > \theta_0$. $S_X$ is a sufficient statistic for $\theta$. Moreover, $S_X$ has a Binomial distribution, which has a monotone likelihood ratio. According to the Karlin-Rubin theorem, the test that rejects $H_0$ when $S_X > t_0$ is a UMP level $\alpha$ test, where $\alpha = P(B_{n,\theta_0} > t_0)$.
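Because $S_X$ is discrete, only certain levels $\alpha = P(B_{n,\theta_0} > t_0)$ are attainable. The sketch below computes that level for a given cutoff and searches for the smallest cutoff whose level does not exceed a target; the function names and the example values are illustrative assumptions.

```python
from math import comb


def upper_tail(t0, n, theta):
    """P(B > t0) for B ~ Binomial(n, theta)."""
    return sum(
        comb(n, k) * theta**k * (1 - theta) ** (n - k)
        for k in range(t0 + 1, n + 1)
    )


def smallest_cutoff(n, theta0, alpha):
    """Smallest integer t0 with P(B_{n,theta0} > t0) <= alpha."""
    for t0 in range(n + 1):
        if upper_tail(t0, n, theta0) <= alpha:
            return t0
    return n  # not reached: the tail beyond n is always 0


if __name__ == "__main__":
    # Hypothetical setting: n = 10 trials, theta0 = 0.5, target level 0.05.
    t0 = smallest_cutoff(10, 0.5, 0.05)
    print(t0, upper_tail(t0, 10, 0.5))
```

The attained level is usually strictly below the target, which is a known feature of exact tests on discrete statistics.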
Under $H_0$, $f(S_x \mid \theta_0) \propto \theta_0^{S_x}(1 - \theta_0)^{n - S_x}$.
A random sample $X_1, \ldots, X_n$ is observed from a Bernoulli population whose probability of success is $p$. We would like to test
(1) $H_0: p = p_0$ versus $H_1: p > p_0$;
(2) $H_0: p = p_0$ versus $H_1: p < p_0$;
(3) $H_0: p = p_0$ versus $H_1: p \ne p_0$.
p-value
p-value
Let $W(X)$ be a test statistic such that large values of $W$ give evidence that $H_1$ is true. For each sample point $x$, define
$$p(x) = \sup_{\theta \in \Theta_0} P_\theta\big(W(X) \ge W(x)\big).$$
November 24, 2010
Tests of a single proportion
Protocol
Neyman-Pearson Test
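Let $X_1, \ldots, X_n$ be iid Bernoulli($\theta$). Consider the simple hypotheses $H_0: \theta = \theta_0$ versus $H_1: \theta = \theta_1$. The pmf of $S_X$ satisfies $f(S_x \mid \theta_i) \propto \theta_i^{S_x}(1 - \theta_i)^{n - S_x}$, so the likelihood under $H_1$ is proportional to $\theta_1^{S_x}(1 - \theta_1)^{n - S_x}$.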
              Successes          Failures           Total
Population X  S_X                F_X                n_X
Population Y  S_Y                F_Y                n_Y
Total         S = S_X + S_Y      F = F_X + F_Y      n = n_X + n_Y
Conditional on a sufficient statistic
Conditional on a sufficient statistic to define a p-value
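Let $X_1, \ldots, X_n$ be a random sample from a Bernoulli($p$) population. Consider the test $H_0: p \le p_0$ versus $H_1: p > p_0$. Since for a Bernoulli population $\mu = EX = p$ and $\sigma^2 = \mathrm{Var}\,X = p(1 - p)$, the central limit theorem and Slutsky's theorem give, as $n \to \infty$,
$$Z_n = \frac{\bar{X} - \mu}{\hat{\sigma}/\sqrt{n}} = \frac{\hat{p} - p}{\sqrt{\hat{p}(1 - \hat{p})/n}} \to N(0, 1) \quad \text{in distribution,}$$
where $\hat{p} = \bar{X} = \sum_{i=1}^{n} X_i / n$ is the estimator of $p$, and $\hat{\sigma}^2 = \hat{p}(1 - \hat{p})$ is that of $\sigma^2$. This suggests a test with the test statistic
$$Z_n = \frac{\bar{X} - \mu_0}{\hat{\sigma}/\sqrt{n}} = \frac{\hat{p} - p_0}{\sqrt{\hat{p}(1 - \hat{p})/n}}.$$
This test rejects $H_0$ if and only if $Z_n > z_\alpha$.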
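The large-sample test above can be sketched in a few lines of standard-library Python. The upper-tail normal probability is obtained from `math.erfc`, and the same function gives the $\chi^2_1$ upper tail for the squared statistic. The function names and the sample numbers are illustrative assumptions; the sketch also assumes $0 < \hat{p} < 1$ so that the estimated variance is nonzero.

```python
from math import erfc, sqrt


def proportion_z_test_greater(successes, n, p0):
    """Return (Z_n, p-value) for H0: p <= p0 versus H1: p > p0.

    Assumes 0 < successes < n so that p_hat * (1 - p_hat) > 0.
    """
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p_hat * (1 - p_hat) / n)
    p_value = 0.5 * erfc(z / sqrt(2))  # P(N(0,1) > z)
    return z, p_value


def chi2_1_upper_tail(x):
    """P(chi^2_1 > x), using chi^2_1 = (standard normal)^2."""
    return erfc(sqrt(x / 2))


if __name__ == "__main__":
    # Hypothetical data: 60 successes in 100 trials, tested against p0 = 0.5.
    z, p = proportion_z_test_greater(60, 100, 0.5)
    print(z, p, chi2_1_upper_tail(z * z))
```

Note that the squared statistic discards the sign of $Z_n$, so its p-value corresponds to the two-sided alternative.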
$\chi^2$ approximation
Since the square of a standard normal random variable is a $\chi^2_1$ random variable, $Z_n^2$ converges in distribution to $\chi^2_1$.
Is the proportion of girls in our class significantly higher than that of Tsinghua?
Then, $p(X)$ is a valid p-value.
Let $W(X)$ be a test statistic such that small values of $W$ give evidence that $H_1$ is true. For each sample point $x$, define
$$p(x) = \sup_{\theta \in \Theta_0} P_\theta\big(W(X) \le W(x)\big).$$