Multiple Comparisons


Example (Sidak step-down)
Suppose we have m = 3 t-tests. Assume target α = 0.05.
Unadjusted P-values are P1 = 0.001, P2 = 0.013, P3 = 0.074.
For the jth test, calculate $1-(1-\alpha)^{1/(m-j+1)}$.
For test j = 1, $1-(1-\alpha)^{1/(m-j+1)} = 1-0.95^{1/(3-1+1)} = 0.01695$.
For test j = 1, the observed P1 = 0.001 is less than 0.0170, so we reject the null hypothesis.
2012-11-29
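Multiple comparisons in the narrow sense
In the narrow sense, multiple comparison refers specifically to the pairwise comparisons between groups (post hoc comparisons) made after an overall comparison of the population parameters or distributions of several groups.
• Comparison of several group means after analysis of variance
• Pairwise comparisons after comparing several rates
• Pairwise comparisons after comparing the ordinal distributions of several groups, etc.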
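Multiple comparisons in the broad sense
In the broad sense, the term usually refers to the multivariable setting, in which one question is answered by testing several variables one at a time, for example the hypothesis tests for each independent variable in a multiple regression. This is called multiple testing for short.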
FWER
Family-Wise Error Rate (overall Type I error rate). The FWER is defined as the probability of at least one Type I error (false positive):
FWER = P (V > 0)
FDR=E(V/R)
FNR
• False Non-discovery Rate
– FNR=E(T/W)
• Note the difference between the FDR and the "false positive rate"
• Note the difference between the FNR and the "false negative rate"
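Classification of statistical methods for multiple comparisons
By the error rate controlled
• Control of the family-wise error rate (FWER)
• Control of the false discovery rate (FDR), the error rate among positive results
• Control of the false non-discovery rate (FNR), the error rate among negative results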
If the adjusted P-value $\tilde{P}_i \le \alpha$, reject $H_i$.
Carlo Emilio Bonferroni (1892-1960)
Example (Bonferroni)
Suppose we have m = 3 t-tests. Assume target α = 0.05.
The Bonferroni-corrected significance level is α/m = 0.05/3 = 0.0167.
Unadjusted P-values are P1 = 0.001, P2 = 0.013, P3 = 0.074.
P1 = 0.001 < 0.0167, so reject the null.
P2 = 0.013 < 0.0167, so reject the null.
P3 = 0.074 > 0.0167, so do not reject the null.
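The Bonferroni example can be checked with a few lines of Python. This is a minimal sketch, not part of the original slides: the function name `bonferroni` is illustrative, and it simply applies the rule "reject when P_i ≤ α/m" together with the adjusted P-value min(m·P_i, 1).

```python
def bonferroni(p_values, alpha=0.05):
    """Bonferroni single-step correction (illustrative helper).

    Returns the adjusted P-values min(m * P_i, 1) and, for each test,
    whether the null hypothesis is rejected at family-wise level alpha.
    """
    m = len(p_values)
    adjusted = [min(m * p, 1.0) for p in p_values]
    reject = [p <= alpha / m for p in p_values]
    return adjusted, reject


# P-values from the example: m = 3, alpha = 0.05
adjusted, reject = bonferroni([0.001, 0.013, 0.074])
print(adjusted)
print(reject)
```

Comparing P_i with α/m and comparing the adjusted P-value with α give the same decisions; the adjusted form is convenient when results are reported without fixing α in advance.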
Holm step-down
Order the P values for the m hypotheses being tested from smallest to largest.
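$P_1 \le P_2 \le \dots \le P_m$
Adjusted P-values:
$\tilde{P}_1 = \min(mP_1,\ 1)$
$\tilde{P}_i = \min\bigl(\max(\tilde{P}_{i-1},\ (m-i+1)P_i),\ 1\bigr)$, $i = 2, \dots, m$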
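The Holm adjusted P-values take the i-th smallest P-value, multiply it by (m − i + 1), enforce monotonicity with a running maximum, and cap the result at 1. A minimal Python sketch under that reading of the slide follows; the helper name `holm_adjusted` is an assumption for illustration.

```python
def holm_adjusted(p_values):
    """Holm step-down adjusted P-values, returned in the input order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):          # rank 0 is the smallest P
        candidate = (m - rank) * p_values[idx]  # (m - i + 1) * P_i with i = rank + 1
        running_max = max(running_max, candidate)
        adjusted[idx] = min(running_max, 1.0)
    return adjusted


print(holm_adjusted([0.001, 0.013, 0.074]))
```

The running maximum guarantees that a hypothesis with a larger raw P-value never ends up with a smaller adjusted P-value than one ranked before it.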
Sidak single-step
For i = 1, 2, …, m,
if $P_i \le 1-(1-\alpha)^{1/m}$, reject $H_i$.
Adjusted P-value: $\tilde{P}_i = 1-(1-P_i)^m$. If $\tilde{P}_i \le \alpha$, reject $H_i$.
Zbyněk Šidák (1933-1999)
Example (Sidak single-step)
Suppose we have m = 3 t-tests. Assume target α = 0.05.
The Sidak-corrected significance level is $1-(1-\alpha)^{1/m} = 1-0.95^{1/3} = 0.01695$.
Unadjusted P-values are P1 = 0.001, P2 = 0.013, P3 = 0.074.
P1 = 0.001 < 0.0170, so reject the null.
P2 = 0.013 < 0.0170, so reject the null.
P3 = 0.074 > 0.0170, so do not reject the null.
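The Sidak single-step threshold 1 − (1 − α)^(1/m) and the adjusted P-value 1 − (1 − P_i)^m can be reproduced as follows. This is a sketch for checking the arithmetic; the function name `sidak` is illustrative and not from the slides.

```python
def sidak(p_values, alpha=0.05):
    """Sidak single-step correction (illustrative helper).

    Returns the per-test threshold 1 - (1 - alpha)**(1/m), the adjusted
    P-values 1 - (1 - P_i)**m, and the reject/do-not-reject decisions.
    """
    m = len(p_values)
    threshold = 1.0 - (1.0 - alpha) ** (1.0 / m)
    adjusted = [1.0 - (1.0 - p) ** m for p in p_values]
    reject = [p <= threshold for p in p_values]
    return threshold, adjusted, reject


threshold, adjusted, reject = sidak([0.001, 0.013, 0.074])
print(round(threshold, 5))
print(reject)
```

With m = 3 the Sidak threshold (about 0.01695) is only slightly larger than the Bonferroni threshold (about 0.01667), so the two methods reach the same decisions on this example.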
• Comparison of SNPs in genome-wide association studies (GWAS)
• Comparison of individual proteins/genes in microarray data analysis
GWAS data
id    SNP 1   SNP 2   …   SNP m   case
1     1       1       …   1       1
2     0       1       …   1       1
…     …       …       …   …       …
k     0       1       …   0       0
Microarray data
id    Gene 1   Gene 2   …   Gene m   case
1     ×.××     ×.××     …   ×.××     1
2     ×.××     ×.××     …   ×.××     1
…     …        …        …   …        …
k     ×.××     ×.××     …   ×.××     0
Multiple comparison
Multiple comparison, simply put, means starting from the sample at hand and posing a null hypothesis H0 about some question, where H0 is a family of hypotheses rather than a single hypothesis. Simply stated, multiple testing refers to any situation in which a collection of statistical hypotheses is formally or informally evaluated, and conclusions are drawn, from one dataset.
Example (Holm step-down)
Suppose we have m = 3 t-tests. Assume target α = 0.05.
Unadjusted P-values are P1 = 0.001, P2 = 0.013, P3 = 0.074.
For the jth test, calculate α/(m-j+1).
For test j = 1, α/(m-j+1) = 0.05/(3 - 1 + 1) = 0.05/3 = 0.0167.
For test j = 1, the observed P1 = 0.001 is less than 0.0167, so we reject the null hypothesis.
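Why the multiple testing problem must be considered
m independent hypotheses, each tested at level α (all null hypotheses true):
$P(\text{at least one hypothesis falsely rejected}) = 1-(1-\alpha)^m$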
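To see how quickly the chance of at least one false rejection, 1 − (1 − α)^m, grows with the number of independent tests m, the formula can be evaluated directly. The function name `fwer_independent` is an illustrative assumption, and the calculation assumes independent tests with all null hypotheses true.

```python
def fwer_independent(alpha: float, m: int) -> float:
    """Probability of at least one false rejection among m independent
    true null hypotheses, each tested at level alpha."""
    return 1.0 - (1.0 - alpha) ** m


for m in (1, 3, 10, 100):
    print(m, round(fwer_independent(0.05, m), 4))
```

At α = 0.05, three tests already push the overall error probability to about 0.14, and ten tests to about 0.40, which is the motivation for the correction procedures in this lecture.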
FDR
False Discovery Rate (the error rate among positive results). The FDR (Benjamini & Hochberg 1995) is the expected proportion of Type I errors among the rejected hypotheses, FDR = E(V/R).
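The quantity inside the expectation, V/R, is the proportion of false rejections in one experiment. It can only be computed when the truth of each null hypothesis is known, for instance in a simulation. The sketch below uses made-up truth labels purely for illustration; the function name `false_discovery_proportion` and the convention of returning 0 when nothing is rejected are assumptions of this sketch.

```python
def false_discovery_proportion(null_is_true, rejected):
    """Realized V / R for one experiment; taken as 0 when R = 0.

    null_is_true[i] is True when the i-th null hypothesis is actually true
    (known only in simulations); rejected[i] is the test decision.
    """
    v = sum(1 for t, r in zip(null_is_true, rejected) if t and r)  # false rejections
    r_total = sum(1 for r in rejected if r)                        # all rejections
    return v / r_total if r_total > 0 else 0.0


# Hypothetical labels: 5 tests, 3 rejections, 1 of them a true null.
truth = [True, False, False, True, False]
decisions = [True, True, True, False, False]
print(false_discovery_proportion(truth, decisions))
```

The FDR is the average of this proportion over repeated experiments, not the value from a single run.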
Example (Holm step-down, continued)
For test j = 2, α/(m-j+1) = 0.05/(3 - 2 + 1) = 0.05/2 = 0.025.
For test j = 2, the observed P2 = 0.013 is less than 0.025, so we reject the null hypothesis.
For test j = 3, α/(m-j+1) = 0.05/(3 - 3 + 1) = 0.05/1 = 0.05.
For test j = 3, the observed P3 = 0.074 is greater than 0.05, so we do not reject the null hypothesis.
Family
A family is a collection of inferences for which it is meaningful to take into account some overall measure of errors.
Family members:
– finite or infinite?
– confirmatory or exploratory?
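Graduate course: Advanced Medical Statistics
MEDICAL STATISTICS
Multiple Comparisons
柏建岭 (Bai Jianling), bjlcn@163.com
Department of Epidemiology and Health Statistics, School of Public Health, Nanjing Medical University
http://weibo.com/u/1596725354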
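Outline
• Definition of multiple comparisons
• Definitions of error rates
• Common statistical methods for multiple comparisons
• Worked examples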
By operating procedure
• Single-step methods
• Step-wise methods
• Resampling-based methods
FWER Controlling Procedures
• Single-step procedures
• Step-wise procedures
• Resampling-based algorithms
Sidak step-down (Sidak-Holm)
$\tilde{P}_1 = \min\bigl(1-(1-P_1)^m,\ 1\bigr)$
$\tilde{P}_i = \min\bigl(\max(\tilde{P}_{i-1},\ 1-(1-P_i)^{m-i+1}),\ 1\bigr)$, $i = 2, \dots, m$





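Example (Sidak step-down procedure)
If $P_1 > 1-(1-\alpha)^{1/m}$, accept all m hypotheses (i.e., none are significant).
If $P_1 \le 1-(1-\alpha)^{1/m}$, reject H1 (i.e., H1 is declared significant), and consider H2.
If $P_2 > 1-(1-\alpha)^{1/(m-1)}$, accept H2 and all remaining hypotheses.
If $P_2 \le 1-(1-\alpha)^{1/(m-1)}$, reject H2 and move on to H3.
Proceed with the hypotheses until the first j such that $P_j > 1-(1-\alpha)^{1/(m-j+1)}$; accept Hj and all remaining hypotheses.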
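The Sidak step-down procedure compares the j-th smallest P-value with 1 − (1 − α)^(1/(m − j + 1)) and stops at the first non-rejection. A minimal Python sketch of that loop is given below; the name `sidak_step_down` and the returned list of thresholds are illustrative choices, not part of the slides.

```python
def sidak_step_down(p_values, alpha=0.05):
    """Sidak step-down (Sidak-Holm) decisions, returned in the input order.

    Also returns the thresholds that were actually evaluated, one per step.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # smallest P first
    reject = [False] * m
    thresholds = []
    for j, idx in enumerate(order, start=1):
        threshold = 1.0 - (1.0 - alpha) ** (1.0 / (m - j + 1))
        thresholds.append(threshold)
        if p_values[idx] > threshold:
            break  # first non-rejection: this and all later hypotheses are kept
        reject[idx] = True
    return reject, thresholds


reject, thresholds = sidak_step_down([0.001, 0.013, 0.074])
print(reject)
print([round(t, 5) for t in thresholds])
```

On the three P-values used throughout the lecture the thresholds are about 0.01695, 0.02532 and 0.05, so H1 and H2 are rejected and H3 is not.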
Control the error rate over the m tests with a multiple testing procedure.
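Outcomes of m tests
Suppose m hypotheses are tested simultaneously, of which m0 are true; R denotes the number of hypotheses with a positive (rejected) test result.
H0       Not rejected   Rejected   Total
True     U              V          m0
False    T              S          m − m0
Total    W              R          m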
Single-step procedures
Bonferroni: $\tilde{P}_i = \min(mP_i,\ 1)$
Sidak: $\tilde{P}_i = 1-(1-P_i)^m$
Bonferroni single-step
For i = 1, 2, …, m,
if $P_i \le \alpha/m$, reject $H_i$.
Adjusted P-value: $\tilde{P}_i = \min(mP_i,\ 1)$
Here m is known before testing. R is an observable random variable, whereas U, V, S, and T are unobservable random variables.
PCER
Per-Comparison Error Rate (the per-comparison, or average, error rate). The PCER is defined as: PCER = E(V)/m
PFER
Per-Family Error Rate. Not really a rate: the PFER is defined as the expected number of Type I errors, PFER = E(V)
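The three error measures built on V, the number of Type I errors, are ordered. The short derivation below is added as a sketch using only the definitions PCER = E(V)/m, FWER = P(V > 0) and PFER = E(V); it relies on the fact that V is an integer between 0 and m.

```latex
% Since V is an integer with 0 <= V <= m, the indicator of {V > 0} satisfies
\[
\frac{V}{m} \;\le\; \mathbf{1}\{V > 0\} \;\le\; V .
\]
% Taking expectations of all three terms gives
\[
\mathrm{PCER} = \frac{E(V)}{m}
\;\le\; \mathrm{FWER} = P(V > 0)
\;\le\; \mathrm{PFER} = E(V).
\]
```

So a procedure that controls the PFER at level α also controls the FWER, and one that controls the FWER also controls the PCER, but not the other way round.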
Sture Holm




Example (Holm step-down procedure)
If P1 > α/m, accept all m hypotheses (i.e., none are significant).
If P1 ≤ α/m, reject H1 (i.e., H1 is declared significant), and consider H2.
If P2 > α/(m-1), accept H2 and all remaining hypotheses.
If P2 ≤ α/(m-1), reject H2 and move on to H3.
Proceed with the hypotheses until the first j such that Pj > α/(m-j+1); accept Hj and all remaining hypotheses.
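The Holm step-down procedure, which compares the j-th smallest P-value with α/(m − j + 1) and stops at the first non-rejection, can be written as a short loop. This is a minimal sketch; the function name `holm_step_down` is an illustrative assumption.

```python
def holm_step_down(p_values, alpha=0.05):
    """Holm step-down decisions, returned in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # smallest P first
    reject = [False] * m
    for j, idx in enumerate(order, start=1):
        if p_values[idx] > alpha / (m - j + 1):
            break  # first non-rejection: stop, keep all remaining hypotheses
        reject[idx] = True
    return reject


# P-values from the lecture examples: thresholds 0.0167, 0.025, 0.05
print(holm_step_down([0.001, 0.013, 0.074]))
```

Because the threshold loosens from α/m to α as j increases, Holm rejects every hypothesis that single-step Bonferroni rejects, and possibly more.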