Chapter 9: Regression Analysis

Regression Analysis
Chapter 11
Regression and Correlation
Techniques that are used to establish whether there is a mathematical relationship between two or more variables, so that the behavior of one variable can be used to predict the behavior of others. Applicable to “Variables” data only.
A simple linear relationship can be described mathematically by
Y = mX + b
Simple Linear Regression
slope = rise / run = (6 - 3) / (10 - 4) = 1/2
intercept = 1
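The slope and intercept above can be checked numerically. This is a minimal sketch, assuming the two points read off the figure are (4, 3) and (10, 6), which is what the rise (6 - 3) and the run (10 - 4) imply; the function name line_through is illustrative only.

```python
def line_through(p1, p2):
    """Return (slope, intercept) of the straight line through two points."""
    (x1, y1), (x2, y2) = p1, p2
    rise = y2 - y1
    run = x2 - x1
    if run == 0:
        raise ValueError("vertical line: slope is undefined")
    slope = rise / run
    intercept = y1 - slope * x1  # solve y1 = slope * x1 + b for b
    return slope, intercept


# Assumed points, inferred from the rise and run shown on the slide.
m, b = line_through((4, 3), (10, 6))
print(f"Y = {m}X + {b}")
```

With these assumed points the result agrees with the slide: a slope of 1/2 and an intercept of 1.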
Step 1: Scatter plot
[Figure: scatter plot of Rent (vertical axis, 500 to 2500) against Size (horizontal axis, 500 to 2100)]
Scatter plot suggests that there is a ‘linear’ relationship between Rent and Size
[Figure: scatter plot of Y against X, each axis running from Low to High]
Simple Linear Regression
m = slope = rise / run
b = Y intercept = the Y value at the point where the line intersects the Y axis
Using regression for prediction: Caution!
Regression equation is valid only over the range over which it was estimated!
Do not use the equation in predicting Y when X values are not within the range of data used to develop the equation.
Is there a Relationship Between the Variables?
What Direction is the Relationship?
How Strong is the Relationship?
ANOVA
             df   SS         MS         F          Significance F
Regression   1    2268777    2268777    59.91376   7.51833E-08
Residual     23   870949.5   37867.37
Total        24   3139726

               Coefficients   Std Error   t Stat     P-value
Intercept      177.12082      161.0043    1.1001     0.28267
X Variable 1   1.0651439      0.137608    7.740398   7.52E-08
[Figure: graph of the line Y = 0.5X + 1]
Simple regression example
An agent for a residential real estate company in a large city would like to predict the monthly rental cost for apartments based on the size of the apartment as defined by square footage. A sample of 25 apartments in a particular residential neighborhood was selected to gather the information.
(continuous data)
Does Y depend on X? Which line is correct?
Examples:
Process conditions and product properties
Sales and advertising budget
Simple Linear Regression
• “Regression” provides a functional relationship (Y=f(x)) between the variables; the function represents the “average” relationship.
• “Correlation” tells us the direction and the strength of the relationship.
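The "average" relationship can be made concrete with a small fit. The sketch below assumes ordinary least squares, the method the deck names when it describes the regression line; the function name and the data points are made up for illustration and are not the apartment sample.

```python
def least_squares_fit(xs, ys):
    """Fit Y = slope * X + intercept by ordinary least squares."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long lists with at least 2 points")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of squared deviations of X, and sum of cross-products of deviations.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all X values are identical: slope is undefined")
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar  # the fitted line passes through (x_bar, y_bar)
    return slope, intercept


# Hypothetical data scattered around Y = 0.5X + 1 (illustration only).
xs = [0, 2, 4, 6, 8, 10]
ys = [1.2, 1.8, 3.1, 3.9, 5.2, 5.8]
slope, intercept = least_squares_fit(xs, ys)
print(f"Y = {slope:.3f}X + {intercept:.3f}")
```

The fitted line does not pass through every point; it is the line that minimizes the sum of squared vertical distances to the points.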
Regression Equation
Rent = 177.12082 + 1.0651439 * Size
Meaning of the regression coefficient
What does the coefficient of Size mean?
For every additional square foot, Rent goes up by $1.0651439.
Using regression for prediction
Predict monthly rent when apartment size is 1000 square feet:
Regression Equation: Rent = 177.12082 + 1.0651439 * Size
Thus Rent = 177.12082 + 1.0651439 * 1000
Rent = $1242.26472
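The same prediction can be reproduced in a few lines. This is a minimal sketch using the intercept and slope reported in the Excel output; the function name predict_rent is illustrative.

```python
# Coefficients taken from the Excel regression output in this chapter.
INTERCEPT = 177.12082
SLOPE = 1.0651439  # dollars of rent per additional square foot


def predict_rent(size_sq_ft):
    """Predicted monthly rent (in dollars) for an apartment of the given size."""
    return INTERCEPT + SLOPE * size_sq_ft


rent = predict_rent(1000)
print(f"Predicted rent: ${rent:.2f}")
```

The difference predict_rent(1001) - predict_rent(1000) equals the slope, which is the meaning of the regression coefficient given above.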
The sign of r is the same as that of the coefficient of X (Size) in the regression equation (in our case the sign is positive). Also, if you look at the scatter plot, you will note that the sign should be positive.
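In simple linear regression, Excel's "Multiple R" is the absolute value of r, so the sign has to be taken from the slope. A minimal sketch, using the sums of squares and the slope from the Excel output; the variable names are illustrative.

```python
import math

# Values from the ANOVA table and coefficient table of the Excel output.
SS_REGRESSION = 2268777.0
SS_TOTAL = 3139726.0
SLOPE = 1.0651439

# R Square is the share of total variation explained by the regression.
r_squared = SS_REGRESSION / SS_TOTAL

# r has the magnitude sqrt(R Square) and the sign of the slope.
r = math.copysign(math.sqrt(r_squared), SLOPE)
print(f"R Square = {r_squared:.2f}, r = {r:+.2f}")
```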
In Excel: Tools >> Data Analysis >> Regression (or Correlation)
Simple Linear Regression
What is it?
Determines if Y depends on X and provides a math equation for the relationship.
Coefficient of correlation from EXCEL
SUMMARY OUTPUT
Regression Statistics
Multiple R          0.85
R Square            0.72
Adjusted R Square   0.71
Standard Error      194.60
Observations        25
Correlation Levels
[Figure: four scatter plots illustrating r = 0.05, r = 0.50, r = 0.95, and r = -0.95]
Correlation tells us how much linear association there is between two variables.
Correlation (r)
• “Correlation coefficient”, r, is a measure of the strength and the direction of the relationship between two variables. Values of r range from +1 (very strong direct relationship), through “0” (no relationship), to –1 (very strong inverse relationship). It measures the degree of scatter of the points around the “Least Squares” regression line.
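The correlation coefficient can be computed directly from paired data. The sketch below assumes the usual Pearson formula, r = Sxy / sqrt(Sxx * Syy); the function name and the sample values are illustrative and are not taken from the apartment data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long lists with at least 2 points")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Illustrative data only: a perfect direct and a perfect inverse relationship.
r_direct = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
r_inverse = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])
print(r_direct, r_inverse)
```

Points lying exactly on a rising line give r = +1 and points on a falling line give r = -1; more scatter around the line pulls r toward 0.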
The analysis starts with a Scatter Plot of Y vs X.
Regression and Correlation
Excel will do both Regression analysis and Correlation analysis.
Thus, we should not use the equation to predict rent for an apartment whose size is 500 square feet, since this value is not in the range of size values used to create the regression equation.
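One way to enforce this caution in code is to refuse predictions outside the observed range. This is a sketch only: the deck does not state the smallest and largest apartment sizes in the sample, so the bounds 700 and 2000 used below are hypothetical placeholders, and the function name is illustrative.

```python
# Coefficients taken from the Excel regression output in this chapter.
INTERCEPT = 177.12082
SLOPE = 1.0651439


def predict_rent_checked(size_sq_ft, min_size, max_size):
    """Predict rent, but only for sizes inside the range used to fit the line."""
    if not (min_size <= size_sq_ft <= max_size):
        raise ValueError(
            f"size {size_sq_ft} is outside the fitted range "
            f"[{min_size}, {max_size}]; the equation is not valid there"
        )
    return INTERCEPT + SLOPE * size_sq_ft


# HYPOTHETICAL bounds: replace with the actual minimum and maximum Size in the sample.
MIN_SIZE, MAX_SIZE = 700, 2000

print(predict_rent_checked(1000, MIN_SIZE, MAX_SIZE))
try:
    predict_rent_checked(500, MIN_SIZE, MAX_SIZE)
except ValueError as err:
    print("Refused:", err)
```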
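Step 2: Analysis via EXCEL
SUMMARY OUTPUT
Regression Statistics
Multiple R          0.85
R Square            0.72
Adjusted R Square   0.71
Standard Error      194.60
Observations        25

ANOVA
             df   SS         MS         F          Significance F
Regression   1    2268777    2268777    59.91376   7.51833E-08
Residual     23   870949.5   37867.37
Total        24   3139726

               Coefficients   Std Error   t Stat     P-value
Intercept      177.12082      161.0043    1.1001     0.28267
X Variable 1   1.0651439      0.137608    7.740398   7.52E-08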
Interpreting EXCEL output
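Several numbers in the Excel output follow from the others, which helps when reading it. The sketch below recomputes them from the sums of squares, degrees of freedom and slope reported in the output; it is an illustration of the standard formulas, not a replacement for Excel, and it does not compute the P-values.

```python
import math

# Values copied from the Excel ANOVA and coefficient tables.
n = 25
df_regression, df_residual = 1, 23
ss_regression, ss_residual = 2268777.0, 870949.5
slope, slope_std_error = 1.0651439, 0.137608

# Mean squares are sums of squares divided by their degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# F tests whether there is a relationship at all.
f_stat = ms_regression / ms_residual

# "Standard Error" in Regression Statistics is the square root of the residual MS.
standard_error = math.sqrt(ms_residual)

# R Square and Adjusted R Square measure how strong the relationship is.
r_squared = ss_regression / (ss_regression + ss_residual)
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / df_residual

# t Stat for the slope; in simple regression its square is close to F.
t_slope = slope / slope_std_error

print(f"MS residual = {ms_residual:.2f}")
print(f"F = {f_stat:.2f}, t = {t_slope:.2f}")
print(f"Standard Error = {standard_error:.2f}")
print(f"R Square = {r_squared:.2f}, Adjusted R Square = {adjusted_r_squared:.2f}")
```

A very small Significance F (7.51833E-08 here) indicates a relationship exists, the positive slope gives its direction, and R Square of 0.72 says that about 72% of the variation in Rent is explained by Size.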