Bioinformatics: Expression Profiling Microarray Analysis


Is there a difference?
between you…means, who is meaner?
T-test
1. Test for single mean: is the sample mean equal to the predefined population mean?
5. Comparison with theoretical value: if tabulated t(n-1) < calculated t(n-1), reject H0; if tabulated t(n-1) > calculated t(n-1), accept H0.
Inference
t-test for single mean
• Test statistics: n = 20, $\bar{x}$ = 21.0 mg, sd = 5.91, $\mu$ = 24.0 mg
$df = n_1 + n_2 - 2$
$S_p^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{(n_1 - 1) + (n_2 - 1)}$
Recall that for single samples:
$t_{obt} = \frac{\bar{X} - \mu}{s_{\bar{X}}}$, that is, (score - mean) / standard error
Detection Call Significance
p value
One-Sided Wilcoxon’s Signed Rank Test
Where is the p value from?
R
Differential Expression: The Affymetrix report
How is it calculated?
Normal distribution
Equal variance
Random sampling
t-Statistic
$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$
• When the sampled population is normally distributed, the t statistic is Student t distributed with n-1 degrees of freedom.
Analyze expression: cell intensity data is converted to expression probe analysis data (.chp file created)
.DAT File
.CEL file
Raw data, not background corrected
.CHP File
Is there a difference?
[Figure: group distributions with the treatment group mean marked.]
What does difference mean?
[Figure: three panels with low, medium, and high variability.]
The mean difference is the same for all three cases
Setting Up the Hypothesis
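H0: $\mu_1 = \mu_2$, H1: $\mu_1 \neq \mu_2$
H0: $\mu_1 \leq \mu_2$, H1: $\mu_1 > \mu_2$
H0: $\mu_1 \geq \mu_2$, H1: $\mu_1 < \mu_2$
OR
H0: $\mu_1 - \mu_2 = 0$, H1: $\mu_1 - \mu_2 \neq 0$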
$\mu$ tells us about the population: Population 1 has mean $\mu_1$ and Population 2 has mean $\mu_2$.
The sample mean tells us about $\mu$: Sample 1 gives $\bar{X}_1$ and Sample 2 gives $\bar{X}_2$.
The t-distribution
• Founder: W. S. Gosset (1876 to 1937)
• Wrote under the pseudonym "Student"
• Mostly worked in tea (t) time?
• Hence known as Student's t test
• Preferable when n < 60
• Certainly if n < 30
Outline
1. Microarray data: Warming up!
2. T-test and other tests
3. SAM: Significance Analysis of Microarrays
4. Rank Products and a real application
Determining the p-Value
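[Figure: standard normal curve (Z). Critical values ±1.96 leave an area of .025 in each tail; ±2.575 leave an area of .005 in each tail.]
[Figure: density f(t) with .95 of the area between -1.96 and 1.96 and .025 in each tail.]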
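The critical values and tail areas in the figure can be reproduced numerically. The following is a minimal sketch using only the Python standard library; the function names are illustrative and not part of the lecture.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1).
std_normal = NormalDist()

def two_sided_critical_value(alpha):
    """Z value that leaves alpha/2 of the area in each tail."""
    return std_normal.inv_cdf(1 - alpha / 2)

def two_sided_p_value(z):
    """Total area in both tails beyond |z|."""
    return 2 * (1 - std_normal.cdf(abs(z)))

z_05 = two_sided_critical_value(0.05)  # area .025 in each tail
z_01 = two_sided_critical_value(0.01)  # area .005 in each tail

print(f"alpha = .05 -> +/-{z_05:.3f}")
print(f"alpha = .01 -> +/-{z_01:.3f}")
print(f"two-sided p for z = 1.96: {two_sided_p_value(1.96):.3f}")
```

For small samples the same idea applies with the t distribution in place of Z, which is why tabulated t values are larger than 1.96 at the same significance level.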
Assumptions
$SPV_{i,j} = PV_{i,j} + \log_2(nf_i \cdot sf_i)$
Change Call
Decide the baseline array and experiment array
Calculate PM - MM of the two arrays
One-Sided Wilcoxon’s Signed Rank Test
$t = \frac{|21.0 - 24|}{5.91/\sqrt{20}} \approx 2.27$
Tabulated t: t(.05, 19) = 2.093
Accept H0 if |t| < 2.093
Reject H0 if |t| >= 2.093
Inference: the calculated t exceeds 2.093, so H0 is rejected. There is no evidence that the sample is taken from a population with mean weight of 24 mg.
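The single-mean test can be checked with a few lines of code. This is a minimal sketch that works from the summary values quoted in the example (n = 20, mean 21.0 mg, sd 5.91, population mean 24.0 mg) and the tabulated critical value 2.093; the function name is an illustrative choice.

```python
import math

def one_sample_t(sample_mean, sample_sd, n, mu0):
    """t statistic for H0: the population mean equals mu0."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - mu0) / standard_error

# Summary values from the rat-weight example.
t_calc = one_sample_t(sample_mean=21.0, sample_sd=5.91, n=20, mu0=24.0)

# Tabulated two-sided critical value t(.05, 19) quoted in the slide.
T_CRIT = 2.093

# Two-sided decision rule: reject H0 when |t| reaches the critical value.
reject_h0 = abs(t_calc) >= T_CRIT

print(f"t = {t_calc:.2f}, df = {20 - 1}, reject H0: {reject_h0}")
```

With these rounded summary values the statistic is about 2.27 in magnitude, which is above 2.093, so H0 is rejected at the 5% level.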
Lecture 8 Bioinformatics
Differential Expression Analysis of Microarray Data
Chunsheng Han, Ph.D.
Division of Bioinformatics
State Key Laboratory of Reproductive Biology
Institute of Zoology, CAS
April 20, 2011
What does difference mean?
[Figure: three panels with low, medium, and high variability. Which one shows the greatest difference?]
• like a signal-to-noise ratio
So we estimate
$$\frac{\text{signal}}{\text{noise}} = \frac{\text{difference between group means}}{\text{variability of groups}} = \frac{\bar{X}_T - \bar{X}_C}{SE(\bar{X}_T - \bar{X}_C)} = t\text{-value}$$
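To make the signal-to-noise reading concrete, the sketch below holds the difference between group means fixed and changes only the spread. The data are made up for illustration, and the standard error is taken as sqrt(s_T^2/n_T + s_C^2/n_C), which is one common choice and an assumption here rather than something the slide specifies.

```python
import math
from statistics import mean, variance

def t_value(treatment, control):
    """Difference between group means divided by its standard error."""
    signal = mean(treatment) - mean(control)
    noise = math.sqrt(variance(treatment) / len(treatment)
                      + variance(control) / len(control))
    return signal / noise

def make_group(center, spread):
    """Five made-up values placed symmetrically around center."""
    return [center + k * spread for k in (-2, -1, 0, 1, 2)]

# Same mean difference (12 - 10 = 2) in every case; only the spread changes.
results = {}
for label, spread in [("low", 0.5), ("medium", 2.0), ("high", 5.0)]:
    control = make_group(10.0, spread)
    treatment = make_group(12.0, spread)
    results[label] = t_value(treatment, control)
    print(f"{label:>6} variability: t = {results[label]:.2f}")
```

The t-value shrinks as variability grows even though the mean difference never changes, which is why the low-variability case shows the greatest difference.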
Background subtraction
IM: Ideal MM
Performance on corrupted data…
1. Simple average
2.
3. Tukey biweight
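The difference between a simple average and a Tukey biweight estimate on corrupted data can be sketched as follows. This is a one-step biweight built on the median and the median absolute deviation; the tuning constant c = 5 and the small epsilon are assumptions chosen for illustration, and the probe values are made up.

```python
from statistics import mean, median

def tukey_biweight(values, c=5.0, epsilon=1e-4):
    """One-step Tukey biweight mean: points far from the median get weight 0."""
    center = median(values)
    # Median absolute deviation from the median, a robust measure of spread.
    mad = median([abs(v - center) for v in values])
    weights = []
    for v in values:
        u = (v - center) / (c * mad + epsilon)  # scaled distance from the median
        weights.append((1 - u * u) ** 2 if abs(u) < 1 else 0.0)
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

# Made-up intensities; the last value stands in for a corrupted probe.
corrupted = [10.0, 10.2, 9.8, 10.1, 100.0]

simple = mean(corrupted)
robust = tukey_biweight(corrupted)
print(f"simple average: {simple:.2f}")
print(f"Tukey biweight: {robust:.2f}")
```

The single outlier drags the simple average far from the bulk of the data, while the biweight estimate stays close to 10 because the outlier receives zero weight.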
Discrimination score R
R = (PM - MM) / (PM + MM)
If PM >> MM, then R ≈ 1
If PM = MM, then R = 0
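The discrimination score and the one-sided Wilcoxon's signed rank test can be combined into a detection p value roughly as sketched below. The threshold tau = 0.015, the function names, and the probe intensities are assumptions for illustration. The p value is computed exactly by enumerating all sign patterns, which is only practical for the small number of probe pairs in a probe set.

```python
from itertools import product

def discrimination_scores(pm, mm):
    """R = (PM - MM) / (PM + MM) for each probe pair."""
    return [(p - m) / (p + m) for p, m in zip(pm, mm)]

def signed_rank_p_value(scores, tau=0.015):
    """Exact one-sided p value for H1: scores tend to exceed tau."""
    diffs = [s - tau for s in scores if s != tau]  # drop zero differences
    ordered = sorted(abs(d) for d in diffs)

    def mid_rank(a):
        # Average rank of a among the absolute differences (handles ties).
        first = ordered.index(a) + 1
        return first + (ordered.count(a) - 1) / 2

    ranks = [mid_rank(abs(d)) for d in diffs]
    observed = sum(r for r, d in zip(ranks, diffs) if d > 0)
    # Under H0 every rank is equally likely to carry a plus or a minus sign.
    at_least = sum(
        1
        for signs in product((0, 1), repeat=len(ranks))
        if sum(r for r, s in zip(ranks, signs) if s) >= observed
    )
    return at_least / 2 ** len(ranks)

# Made-up intensities for eight probe pairs, PM clearly above MM.
pm = [200, 340, 150, 410, 260, 300, 180, 220]
mm = [100, 120, 90, 130, 110, 105, 95, 100]

scores = discrimination_scores(pm, mm)
p_value = signed_rank_p_value(scores)
print(f"detection p value: {p_value:.5f}")
```

A small p value means the scores sit consistently above the threshold, which supports a "present" call; the cutoffs that turn the p value into a call are not shown here.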
• a statistical difference is a function of the difference between means relative to the variability
• a small difference between means with large variability could be due to chance
9
1 2 3
16
18 18 19 19 20
21 22 22 24 24
26 27 29 30 32
Steps for test for single mean
1. Question to be answered: is the mean weight of the sample of 20 rats 24 mg? n = 20, $\bar{x}$ = 21.0 mg, sd = 5.91, $\mu$ = 24.0 mg
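H0: $\mu_1 - \mu_2 \leq 0$, H1: $\mu_1 - \mu_2 > 0$ (Right Tail)
OR
H0: $\mu_1 - \mu_2 \geq 0$, H1: $\mu_1 - \mu_2 < 0$ (Left Tail)
OR
H0: $\mu_1 - \mu_2 = 0$, H1: $\mu_1 - \mu_2 \neq 0$ (Two Tail)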
Mean systolic BP in nephritis is significantly higher than that of a normal person
Data, data, data…: the Affymetrix system
Scan probe array: image data (.dat file created)
Compute cell intensity data from the image data: intensity data (.cel file created)
2. Null hypothesis: the mean weight of rats is 24 mg. That is, the sample mean is equal to the population mean.
3. Test statistic: $t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$, which follows a t distribution with (n-1) df
4.
2. Test for difference in means: is the CD4 level of patients taking treatment A equal to the CD4 level of patients taking treatment B?
3. Test for paired observation: did the treatment confer any significant benefit?
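For related samples:
$t_{obt} = \frac{\bar{D} - \mu_D}{s_{\bar{D}}}$
where $\bar{D}$ is the mean of the paired differences, $\mu_D$ is the hypothesized mean difference, and $s_{\bar{D}}$ is the standard error of $\bar{D}$.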
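For related samples the test reduces to a single-mean test on the paired differences. The sketch below assumes the standard error of the mean difference is the standard deviation of the differences divided by sqrt(n); the before and after values are made up for illustration.

```python
import math
from statistics import mean, stdev

def paired_t(before, after, mu_d=0.0):
    """t statistic and degrees of freedom for paired observations."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    # Standard error of the mean difference.
    se_mean_diff = stdev(diffs) / math.sqrt(n)
    return (mean(diffs) - mu_d) / se_mean_diff, n - 1

# Made-up measurements on the same four subjects.
before = [10.0, 12.0, 9.0, 11.0]
after = [12.0, 15.0, 10.0, 13.0]

t_obt, df = paired_t(before, after)
print(f"t = {t_obt:.3f}, df = {df}")
```

Pairing removes the subject-to-subject variability, so only the spread of the differences enters the noise term.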
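T-test for difference in means
$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$$
where $\mu_1 - \mu_2$ is the hypothesized difference (usually zero when testing for equal means).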
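A minimal sketch of the two-sample test with a pooled variance follows, assuming equal variances in the two populations. The pooled variance weights each sample variance by its degrees of freedom, and the test has n1 + n2 - 2 df. The data and the function name are made up for illustration.

```python
import math
from statistics import mean, variance

def pooled_t(sample1, sample2, hypothesized_diff=0.0):
    """Two-sample t statistic with pooled variance, plus its df."""
    n1, n2 = len(sample1), len(sample2)
    s1_sq, s2_sq = variance(sample1), variance(sample2)  # n - 1 denominators
    df = n1 + n2 - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / df
    standard_error = math.sqrt(sp_sq * (1 / n1 + 1 / n2))
    t = (mean(sample1) - mean(sample2) - hypothesized_diff) / standard_error
    return t, df

# Made-up measurements for two independent groups.
group_a = [5.0, 6.0, 7.0, 8.0, 9.0]
group_b = [3.0, 4.0, 5.0, 6.0, 7.0]

t_value, df = pooled_t(group_a, group_b)
print(f"t = {t_value:.2f}, df = {df}")
```

The calculated t is then compared with the tabulated t for the same df, exactly as in the single-mean case.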
Generate a change p value
Make a change call
T-test for single mean
The following are the weights (mg) of each of 20 rats drawn at random from a large stock. Is it likely that the mean weight of these 20 rats is similar to the mean weight (24 mg) of the whole stock?
[Figure: distribution of systolic BP on an axis from 100 to 140, with an area of 0.025 in each tail.]
Mean systolic BP in nephritis is significantly different from that of a normal person
Statistical Analysis
control group mean