Chapter 10: Regression and Correlation
Sample’s Correlation Coefficient: r
Different Patterns of Correlation
[Figure: scatter-plot panels. Positive: 0<r<1; Negative: -1<r<0; Null: r=0 (two panels); Completely Positive; Completely Negative]
Statistical inference for the population correlation coefficient
Hypothesis test
Measure of correlation
Pearson's linear correlation coefficient: the correlation, denoted by r, measures the
Rabbits, rank: 1 2 3 4 5 6.5 6.5 8 9.5 11 12 14
Rank sums: T1=127.5 (cats, n1=8), T2=82.5 (rabbits)
Solution to Example 9.1
H0: M1 = M2, the population locations of survival time for cats and rabbits are equal
H1: M1 ≠ M2, the population locations of survival time for cats and rabbits are not equal
α = 0.05
Sort and rank the pooled observations, then calculate the rank sum of each group. Take the rank sum of the group with the smaller n as T: n1=8 < n2=12, so T = T1 = 127.5.
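The ranking step can be sketched in a few lines of Python. This is only an illustration using the survival times from Table 9.1; the helper name `average_ranks` is hypothetical, and tied values receive the mean of the ranks they occupy.

```python
def average_ranks(values):
    """Return 1-based ranks for values; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # mean of 1-based positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks

cats = [25, 34, 44, 46, 46, 48, 49, 50]                        # minutes, n1 = 8
rabbits = [14, 15, 16, 17, 19, 21, 21, 23, 25, 28, 30, 35]     # minutes, n2 = 12

ranks = average_ranks(cats + rabbits)   # rank the pooled sample
T1 = sum(ranks[:len(cats)])             # rank sum of cats
T2 = sum(ranks[len(cats):])             # rank sum of rabbits
print(T1, T2)  # 127.5 82.5
```

The two rank sums add up to N(N+1)/2 = 210 for N = 20, which is a quick check on the ranking.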
variables.
S =R*R
Probabilistic Model: a method used to capture the randomness that is part of a real-life process.
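The contrast between the two model types can be sketched as follows. The line y = 0.8X - 69 is the weight-height example given in these slides; the normal error term and its standard deviation `error_sd` are assumptions made purely for illustration.

```python
import random

def deterministic_weight(height_cm):
    """Deterministic model: weight is fully determined by height."""
    return 0.8 * height_cm - 69

def probabilistic_weight(height_cm, rng, error_sd=5.0):
    """Probabilistic model: the same line plus a random error term.

    error_sd is a hypothetical value, not taken from the slides.
    """
    return deterministic_weight(height_cm) + rng.gauss(0.0, error_sd)

rng = random.Random(0)  # fixed seed so repeated runs give the same draws
height = 170
print(deterministic_weight(height))                                     # same value every time
print([round(probabilistic_weight(height, rng), 1) for _ in range(3)])  # varies draw to draw
```

Under the deterministic model every person of a given height has the same weight; under the probabilistic model people of the same height scatter around the line.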
[Figure: Weight (Y, kg) vs. Height (X, cm)]
Lecture 10
CORRELATION & REGRESSION
Xiaojin Yu
Department of Epidemiology and Biostatistics, School of Public Health, Southeast University
Review
Comparison of means: t-test
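$l_{XY} = \sum XY - \frac{(\sum X)(\sum Y)}{n} = 18221 - \frac{272 \times 534}{8} = 65.00$
$r = \frac{l_{XY}}{\sqrt{l_{XX}\, l_{YY}}} = \frac{65.00}{\sqrt{70.00 \times 67.50}} = 0.9456$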
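As a check on the arithmetic, the sums of squares and r can be recomputed from the eight pairs of heights in Ex.1. This is a minimal sketch; the variable names are chosen here for illustration.

```python
from math import sqrt

x = [39, 30, 32, 34, 35, 36, 36, 30]  # height at 2 years old (inch)
y = [71, 63, 63, 67, 68, 68, 70, 64]  # height at 20 years old (inch)
n = len(x)

sum_x, sum_y = sum(x), sum(y)
l_xx = sum(v * v for v in x) - sum_x ** 2 / n              # sum of squares of X
l_yy = sum(v * v for v in y) - sum_y ** 2 / n              # sum of squares of Y
l_xy = sum(a * b for a, b in zip(x, y)) - sum_x * sum_y / n  # sum of cross-products

r = l_xy / sqrt(l_xx * l_yy)
print(l_xx, l_yy, l_xy, round(r, 4))  # 70.0 67.5 65.0 0.9456
```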
Hypothesis test for ρ
H0: ρ=0, there is no linear relationship between x and y
H1: ρ≠0, there is a linear relationship between x and y
Test methods: ① t-test ② look up table
Total rank sum: $\frac{N(N+1)}{2} = \frac{12 \times 13}{2} = 78$; mean rank: $\frac{N+1}{2} = 6.5$
[Figure legend: blue = male, red = female]
EXAMPLE 9.1: Table 9.1 Survival times of cats and rabbits without oxygen
Cats            minutes: 25 34 44 46 46 48 49 50
                rank:    9.5 13 15 16.5 16.5 18 19 20
Rabbits (n2=12) minutes: 14 15 16 17 19 21 21 23 25 28 30 35
simple linear regression
Correlations in medicine
Drinking a glass of red wine per day may decrease your chances of a heart attack.
Taking one aspirin per day may decrease your chances of stroke or of a heart attack.
Eating lots of certain kinds of fish may improve your health and make you smarter.
Pregnant women who smoke tend to have low-birthweight babies.
Taller people tend to weigh more.
Animals with large brains tend to be more intelligent.
The more you study for an exam, the higher the score you are likely to receive.
Comparison of proportions: Chi-square test
Comparison of medians: Rank sum test
Review of the rank sum test
Raw data and rank (cardinal and ordinal numbers)
Rank sum test: methods based on ranks
Dependent variable: denoted Y
Independent variables: denoted X1, X2, …, Xk
Ex.1 Height at 2 years old and at 20 years old
Height at 2 years old (inch):  39 30 32 34 35 36 36 30
Height at 20 years old (inch): 71 63 63 67 68 68 70 64
If we are interested in predicting the value of one variable (the dependent variable) on the basis of other variables (the independent variables), we use regression analysis.
For example, for 18-year-olds: y = 0.8X - 69
Correlation & regression
If we are interested only in determining whether a relationship exists, we use correlation analysis.
Model Types: Relationship between variables
Deterministic Model: an equation that allows us to fully determine the value of the dependent variable from the values of the independent variables.
The critical interval for T at α=0.05 is 58-110. T=127.5 falls outside this interval, so P<0.05 and H0 is rejected. We conclude that the survival times of cats and rabbits in an environment without oxygen might be different.
t-test for ρ
H0: ρ=0, there is no linear relationship between the 2 variables
H1: ρ≠0, there is a linear relationship between the 2 variables
α=0.05
$t = \frac{r}{\sqrt{\frac{1 - r^2}{n - 2}}} = \frac{0.9456}{\sqrt{\frac{1 - 0.9456^2}{8 - 2}}} = 7.1196$
ν = 8 - 2 = 6
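The test statistic can be reproduced with a short sketch. The values r = 0.9456 and n = 8 come from the height example; the critical value 2.447 is the two-sided 5% point of the t distribution with 6 degrees of freedom as listed in standard t tables, entered here by hand rather than computed.

```python
from math import sqrt

r, n = 0.9456, 8
df = n - 2                        # degrees of freedom, ν = n - 2
t = r / sqrt((1 - r ** 2) / df)   # t statistic for H0: ρ = 0

t_crit = 2.447  # two-sided critical value for α = 0.05, ν = 6 (from a t table)
print(round(t, 2), df, abs(t) > t_crit)  # 7.12 6 True
```

Because |t| exceeds the critical value, P < 0.05 and H0 is rejected.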
According to the critical value of t, P<0.05. Reject H0 and accept H1:
there is a linear relationship between height at 2 years old and adult height.
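Calculation table (continued)
no      X     Y     X²     Y²      XY
4       34    67    1156   4489    2278
5       35    68    1225   4624    2380
6       36    68    1296   4624    2448
7       36    70    1296   4900    2520
8       30    64    900    4096    1920
total   272   534   9318   35712   18221

$l_{XX} = \sum (X - \bar{X})^2 = \sum X^2 - \frac{(\sum X)^2}{n} = 9318 - \frac{272^2}{8} = 70.00$
$l_{YY} = \sum (Y - \bar{Y})^2 = \sum Y^2 - \frac{(\sum Y)^2}{n} = 35712 - \frac{534^2}{8} = 67.50$
$l_{XY} = \sum (X - \bar{X})(Y - \bar{Y}) = \sum XY - \frac{(\sum X)(\sum Y)}{n}$, with $\sum XY = 18221$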
[Figure panel labels: Completely Positive, r=1; Completely Negative, r=-1; Null, r=0 (two panels)]
How high must a correlation be to be considered meaningful?
Magnitude & direction
The larger the absolute value of the correlation coefficient, the stronger the correlation.
2 independent groups: Wilcoxon rank sum test
2 paired groups: signed rank sum test
Solution to the height comparison between females (F) and males (M)
Ranks: 12 11 10 9 8 7 6 5 4 3 2 1; TM=52, Tf=26
Caution: Correlation does not necessarily imply causation.
If X is correlated with Y, there could be five explanations:
Height at 20 years old (inch), subjects 3 to 8: 63 67 68 68 70 64
Scatter plot
[Figure: Y, height at 20 years old (inch), axis from 63 to 71, plotted against X, height at 2 years old (inch), axis from 30 to 40]
Correlation
Concept
Calculate r
cats and
might be
environment
Basic logic of scientific research
To find the difference
To find the correlation
Contents
Linear correlation
Rank correlation
amount of linear association between two variables: its strength and direction. r always lies between -1 and 1 inclusive, that is, in [-1, 1].
Population’s Correlation Coefficient: ρ
If the sign is positive (+), the two variables vary in the same direction;
if the sign is negative (-), the two variables vary in opposite directions.
Calculation of correlation coefficient
$r = \frac{1}{n-1} \sum \left( \frac{X - \bar{X}}{s_X} \right) \left( \frac{Y - \bar{Y}}{s_Y} \right)$
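The standardized-score form of the formula can be sketched with Python's standard library. The function name `pearson_r` is chosen here for illustration; `statistics.stdev` uses the n - 1 divisor, matching s_X and s_Y. The data are the eight height pairs from Ex.1.

```python
from statistics import mean, stdev

def pearson_r(x, y):
    """Pearson r as the sum of products of standardized scores, divided by n - 1."""
    n = len(x)
    mx, my = mean(x), mean(y)
    sx, sy = stdev(x), stdev(y)  # sample standard deviations (divisor n - 1)
    return sum(((a - mx) / sx) * ((b - my) / sy) for a, b in zip(x, y)) / (n - 1)

x = [39, 30, 32, 34, 35, 36, 36, 30]  # height at 2 years old (inch)
y = [71, 63, 63, 67, 68, 68, 70, 64]  # height at 20 years old (inch)
print(round(pearson_r(x, y), 4))  # 0.9456
```

This gives the same value as the sums-of-squares form, since s_X and s_Y are the square roots of l_XX/(n-1) and l_YY/(n-1).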
Calculation of correlation coefficient
no   Height at 2 years (inch), X   Adult height (inch), Y
1    39                            71
2    30                            63
3    32                            63