Chapter 2: Multiple Regression Analysis
Because R2 will usually increase with the number of independent variables, it is not a good way to compare models.

An Example: a crime model (Wooldridge, p. 82)
“Partialling Out” continued
The previous equation implies that regressing y on x1 and x2 gives the same effect of x1 as regressing y on the residuals from a regression of x1 on x2
ŷ = b̂0 + b̂1x1 + b̂2x2 + … + b̂kxk

The above estimated equation is called the OLS regression line or the sample regression function (SRF). It is the estimated equation,
Now, we first regress educ on exper and tenure to partial out the effects of exper and tenure. Then we regress wage on the residuals from the regression of educ on exper and tenure. Do we get the same result?
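This two-step procedure can be sketched numerically. The block below uses a small made-up data set (hypothetical numbers, not the wage1 data) and a hand-rolled OLS routine to check that the educ slope from the full regression equals the slope from regressing wage on the partialled-out residuals:

```python
def ols(y, xs):
    """OLS with an intercept, solved from the normal equations by
    Gaussian elimination. xs is a list of regressor columns; returns
    the coefficient list [b0, b1, ..., bk]."""
    n = len(y)
    cols = [[1.0] * n] + [list(map(float, x)) for x in xs]
    k = len(cols)
    A = [[sum(cols[i][t] * cols[j][t] for t in range(n)) for j in range(k)]
         for i in range(k)]
    c = [sum(cols[i][t] * y[t] for t in range(n)) for i in range(k)]
    for p in range(k):  # forward elimination with partial pivoting
        piv = max(range(p, k), key=lambda r: abs(A[r][p]))
        A[p], A[piv], c[p], c[piv] = A[piv], A[p], c[piv], c[p]
        for r in range(p + 1, k):
            f = A[r][p] / A[p][p]
            for j in range(p, k):
                A[r][j] -= f * A[p][j]
            c[r] -= f * c[p]
    b = [0.0] * k
    for i in range(k - 1, -1, -1):  # back substitution
        b[i] = (c[i] - sum(A[i][j] * b[j] for j in range(i + 1, k))) / A[i][i]
    return b

# Hypothetical data for eight workers (illustration only)
educ   = [12, 14, 16, 12, 18, 13, 15, 11]
exper  = [10, 5, 2, 20, 1, 15, 7, 25]
tenure = [5, 2, 1, 10, 0, 8, 3, 12]
wage   = [9.0, 10.5, 13.0, 10.0, 15.5, 9.8, 12.0, 9.5]

# Step 1: the full multiple regression of wage on educ, exper, tenure
b = ols(wage, [educ, exper, tenure])

# Step 2: partial out exper and tenure from educ, keeping the residuals
g = ols(educ, [exper, tenure])
resid = [educ[i] - (g[0] + g[1] * exper[i] + g[2] * tenure[i])
         for i in range(len(educ))]

# Step 3: regress wage on the residuals alone; the slope equals b[1]
h = ols(wage, [resid])
print(b[1], h[1])  # the two educ slopes coincide
```

The agreement of the two slopes is the Frisch–Waugh "partialling out" result, and it holds exactly, not just approximately.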
Σ_{i=1}^n (yi − b̂0 − b̂1xi1 − … − b̂kxik) = 0
Σ_{i=1}^n xi1(yi − b̂0 − b̂1xi1 − … − b̂kxik) = 0
Σ_{i=1}^n xi2(yi − b̂0 − b̂1xi1 − … − b̂kxik) = 0
⋮
Σ_{i=1}^n xik(yi − b̂0 − b̂1xi1 − … − b̂kxik) = 0
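These first order conditions can be checked numerically. The sketch below (made-up data, hypothetical variable names) fits a two-regressor OLS by solving the normal equations and verifies that the residuals sum to zero and are orthogonal to each regressor:

```python
def ols(y, xs):
    """OLS with an intercept via the normal equations; xs is a list of
    regressor columns, the result is [b0, b1, ..., bk]."""
    n = len(y)
    cols = [[1.0] * n] + [list(map(float, x)) for x in xs]
    k = len(cols)
    A = [[sum(cols[i][t] * cols[j][t] for t in range(n)) for j in range(k)]
         for i in range(k)]
    c = [sum(cols[i][t] * y[t] for t in range(n)) for i in range(k)]
    for p in range(k):  # Gaussian elimination with partial pivoting
        piv = max(range(p, k), key=lambda r: abs(A[r][p]))
        A[p], A[piv], c[p], c[piv] = A[piv], A[p], c[piv], c[p]
        for r in range(p + 1, k):
            f = A[r][p] / A[p][p]
            for j in range(p, k):
                A[r][j] -= f * A[p][j]
            c[r] -= f * c[p]
    b = [0.0] * k
    for i in range(k - 1, -1, -1):
        b[i] = (c[i] - sum(A[i][j] * b[j] for j in range(i + 1, k))) / A[i][i]
    return b

# Made-up data for illustration
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y  = [3.1, 3.9, 7.2, 7.8, 11.1, 11.9]

b = ols(y, [x1, x2])
u = [y[i] - (b[0] + b[1] * x1[i] + b[2] * x2[i]) for i in range(len(y))]

s0 = sum(u)                                  # Σ ûi       ≈ 0
s1 = sum(x1[i] * u[i] for i in range(len(u)))  # Σ xi1 ûi ≈ 0
s2 = sum(x2[i] * u[i] for i in range(len(u)))  # Σ xi2 ûi ≈ 0
```

All three sums are zero up to floating-point error, which is exactly what the k+1 first order conditions require.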
Obtaining OLS Estimates, cont.
Goodness-of-Fit
We can think of each observation as being made up of an explained part and an unexplained part, yi = ŷi + ûi. We then define the following:
x1 and x2 are uncorrelated in the sample
The wage determination: an example

The estimated equations are as below:

wage = -2.873 + 0.599 educ + 0.022 exper + 0.169 tenure
log(wage) = 0.284 + 0.092 educ + 0.0041 exper + 0.022 tenure
The Stata commands:

use [path]wage1.dta   (or: insheet using [path]wage1.raw or [path]wage1.txt)
reg wage educ exper tenure
reg lwage educ exper tenure
A “Partialling Out” Interpretation
R2 = [Σ(yi − ȳ)(ŷi − ŷ̄)]² / [Σ(yi − ȳ)² Σ(ŷi − ŷ̄)²]

where ŷ̄ is the sample average of the fitted values ŷi.
More about R-squared
R2 can never decrease when another independent variable is added to a regression, and usually will increase
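This monotonicity is easy to demonstrate. The sketch below (made-up data; x2 is an arbitrary extra regressor unrelated to y) fits a model with and without the extra variable and compares the two R-squared values:

```python
def ols(y, xs):
    """OLS with an intercept via the normal equations; xs is a list of
    regressor columns, the result is [b0, b1, ..., bk]."""
    n = len(y)
    cols = [[1.0] * n] + [list(map(float, x)) for x in xs]
    k = len(cols)
    A = [[sum(cols[i][t] * cols[j][t] for t in range(n)) for j in range(k)]
         for i in range(k)]
    c = [sum(cols[i][t] * y[t] for t in range(n)) for i in range(k)]
    for p in range(k):  # Gaussian elimination with partial pivoting
        piv = max(range(p, k), key=lambda r: abs(A[r][p]))
        A[p], A[piv], c[p], c[piv] = A[piv], A[p], c[piv], c[p]
        for r in range(p + 1, k):
            f = A[r][p] / A[p][p]
            for j in range(p, k):
                A[r][j] -= f * A[p][j]
            c[r] -= f * c[p]
    b = [0.0] * k
    for i in range(k - 1, -1, -1):
        b[i] = (c[i] - sum(A[i][j] * b[j] for j in range(i + 1, k))) / A[i][i]
    return b

def r_squared(y, xs):
    """R2 = 1 - SSR/SST for an OLS fit of y on the columns in xs."""
    b = ols(y, xs)
    n = len(y)
    yhat = [b[0] + sum(b[j + 1] * xs[j][i] for j in range(len(xs)))
            for i in range(n)]
    ybar = sum(y) / n
    sst = sum((y[i] - ybar) ** 2 for i in range(n))
    ssr = sum((y[i] - yhat[i]) ** 2 for i in range(n))
    return 1.0 - ssr / sst

# Made-up data; x2 is an arbitrary, essentially irrelevant regressor
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [5, 3, 8, 1, 9, 2, 7, 4]
y  = [2.0, 4.1, 5.9, 8.2, 9.8, 12.1, 14.0, 16.2]

r2_one = r_squared(y, [x1])
r2_two = r_squared(y, [x1, x2])
# r2_two >= r2_one: adding a regressor can never lower R-squared
```

Adding x2 can only shrink (never enlarge) the residual sum of squares, which is why R2 cannot fall, and why R2 alone is a poor model-selection criterion.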
educ = 13.575 - 0.0738 exper + 0.048 tenure
wage = 5.896 + 0.599 resid
log(wage) = 1.623 + 0.092 resid

We can see that the coefficient on resid is the same as the coefficient on educ in the first estimated equation, and likewise for the log(wage) equation.
wage = b0 + b1 educ + b2 exper + b3 tenure + u
log(wage) = b0 + b1 educ + b2 exper + b3 tenure + u
The estimated equations are as below:

wage = -2.873 + 0.599 educ + 0.022 exper + 0.169 tenure
log(wage) = 0.284 + 0.092 educ + 0.0041 exper + 0.022 tenure
Years of education (educ), years of labor market experience (exper), years with the current employer (tenure)

The relationship between wage and educ, exper, and tenure:
R2 = SSE/SST = 1 – SSR/SST
Goodness-of-Fit (continued)
We can also think of R2 as being equal to the squared correlation coefficient between the actual yi and the fitted values ŷi.
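This equivalence can be verified directly. The sketch below (made-up data) fits a simple regression with the closed-form slope, then computes R2 both as SSE/SST and as the squared correlation between y and ŷ:

```python
# Made-up data for a one-regressor illustration
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 2.1, 2.8, 4.2, 4.9, 6.1]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

# Closed-form simple-regression estimates
b1 = (sum((x[i] - xbar) * (y[i] - ybar) for i in range(n))
      / sum((x[i] - xbar) ** 2 for i in range(n)))
b0 = ybar - b1 * xbar
yhat = [b0 + b1 * x[i] for i in range(n)]

# R-squared from the sums of squares
sst = sum((y[i] - ybar) ** 2 for i in range(n))
ssr = sum((y[i] - yhat[i]) ** 2 for i in range(n))
r2 = 1.0 - ssr / sst

# Squared correlation between actual y and fitted yhat
yhbar = sum(yhat) / n
num = sum((y[i] - ybar) * (yhat[i] - yhbar) for i in range(n))
den = (sum((y[i] - ybar) ** 2 for i in range(n))
       * sum((yhat[i] - yhbar) ** 2 for i in range(n)))
corr2 = num ** 2 / den
# r2 == corr2 up to floating-point error
```

The identity holds for any OLS regression that includes an intercept; it is what justifies calling R2 a measure of how well ŷ tracks y.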
The wage determination

The estimated equations are as below:

wage = -2.873 + 0.599 educ + 0.022 exper + 0.169 tenure
log(wage) = 0.284 + 0.092 educ + 0.0041 exper + 0.022 tenure
y = b0 + b1x1 + b2x2 + . . . + bkxk + u

b0 is still the intercept; b1 to bk are all called slope parameters. u is still the error term (or disturbance). We still need to make a zero conditional mean assumption.
Consider the case where k = 2, i.e. ŷ = b̂0 + b̂1x1 + b̂2x2. Then

b̂1 = Σ r̂i1 yi / Σ r̂i1²,

where the r̂i1 are the residuals from the estimated regression x̂1 = γ̂0 + γ̂2 x2 (the regression of x1 on x2).
not the true equation. The true equation is the population regression line, which we don't know. We can only estimate it; different samples generate different OLS intercept and slope estimates, i.e., a different estimated sample regression line. The population regression line is

E(y | x) = b0 + b1x1 + b2x2 + … + bkxk
Interpreting Multiple Regression
ŷ = b̂0 + b̂1x1 + b̂2x2 + … + b̂kxk, so

Δŷ = b̂1Δx1 + b̂2Δx2 + … + b̂kΔxk,
What determines whether a person commits a crime? (The dependent variable is the number of times the man was arrested during 1986, narr86.)
Simple vs Multiple Reg Estimate
Compare the simple regression ỹ = b̃0 + b̃1x1 with the multiple regression ŷ = b̂0 + b̂1x1 + b̂2x2. Generally, b̃1 ≠ b̂1 unless b̂2 = 0 (i.e., no partial effect of x2) OR x1 and x2 are uncorrelated in the sample.
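In fact the two slopes are linked by an exact sample identity: b̃1 = b̂1 + b̂2·δ̃1, where δ̃1 is the slope from regressing x2 on x1 (it vanishes in precisely the two cases above). A sketch with made-up data:

```python
def ols(y, xs):
    """OLS with an intercept via the normal equations; xs is a list of
    regressor columns, the result is [b0, b1, ..., bk]."""
    n = len(y)
    cols = [[1.0] * n] + [list(map(float, x)) for x in xs]
    k = len(cols)
    A = [[sum(cols[i][t] * cols[j][t] for t in range(n)) for j in range(k)]
         for i in range(k)]
    c = [sum(cols[i][t] * y[t] for t in range(n)) for i in range(k)]
    for p in range(k):  # Gaussian elimination with partial pivoting
        piv = max(range(p, k), key=lambda r: abs(A[r][p]))
        A[p], A[piv], c[p], c[piv] = A[piv], A[p], c[piv], c[p]
        for r in range(p + 1, k):
            f = A[r][p] / A[p][p]
            for j in range(p, k):
                A[r][j] -= f * A[p][j]
            c[r] -= f * c[p]
    b = [0.0] * k
    for i in range(k - 1, -1, -1):
        b[i] = (c[i] - sum(A[i][j] * b[j] for j in range(i + 1, k))) / A[i][i]
    return b

# Made-up data for illustration
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y  = [3.1, 3.9, 7.2, 7.8, 11.1, 11.9]

b = ols(y, [x1, x2])   # multiple regression: b-hat 0, 1, 2
t = ols(y, [x1])       # simple regression:   b-tilde 0, 1
d = ols(x2, [x1])      # regression of x2 on x1: delta-tilde 0, 1

# Exact sample identity: t[1] == b[1] + b[2] * d[1]
gap = t[1] - (b[1] + b[2] * d[1])
```

Here `gap` is zero up to floating-point error, so the simple-regression slope differs from the multiple-regression slope exactly by the omitted-variable term b̂2·δ̃1.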
We still need a zero conditional mean assumption, so now assume that E(u|x1, x2, …, xk) = 0. We are still minimizing the sum of squared residuals, so we have k+1 first order conditions.
Obtaining OLS Estimates
In the general case with k independent variables, we seek estimates b̂0, b̂1, …, b̂k in the linear equation ŷ = b̂0 + b̂1x1 + … + b̂kxk that minimize the sum of squared residuals:

Σ_{i=1}^n (yi − b̂0 − b̂1xi1 − … − b̂kxik)²

From the first order conditions, we can get k+1 equations in the k+1 unknowns b̂0, b̂1, …, b̂k.
Σ(yi − ȳ)² is the total sum of squares (SST)
Σ(ŷi − ȳ)² is the explained sum of squares (SSE)
Σûi² is the residual sum of squares (SSR)

Then SST = SSE + SSR.
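The decomposition can be confirmed on a small made-up example. The sketch below fits a simple regression with the closed-form slope and checks that SST = SSE + SSR, and that the two definitions of R2 agree:

```python
# Made-up data for a one-regressor illustration
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

# Closed-form simple-regression estimates
b1 = (sum((x[i] - xbar) * (y[i] - ybar) for i in range(n))
      / sum((x[i] - xbar) ** 2 for i in range(n)))
b0 = ybar - b1 * xbar
yhat = [b0 + b1 * x[i] for i in range(n)]

# The three sums of squares
sst = sum((y[i] - ybar) ** 2 for i in range(n))
sse = sum((yhat[i] - ybar) ** 2 for i in range(n))
ssr = sum((y[i] - yhat[i]) ** 2 for i in range(n))

# sst == sse + ssr up to floating-point error, so the two
# R-squared formulas give the same number:
r2_a = sse / sst
r2_b = 1.0 - ssr / sst
```

The identity relies on the regression including an intercept (so the residuals sum to zero and are uncorrelated with ŷ); without an intercept SST = SSE + SSR need not hold.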
Goodness-of-Fit (continued)
The estimated equations without tenure:

wage = -3.391 + 0.644 educ + 0.070 exper
log(wage) = 0.217 + 0.098 educ + 0.0103 exper

And with educ alone:

wage = -0.905 + 0.541 educ
log(wage) = 0.584 + 0.083 educ
How do we think about how well our sample regression line fits our sample data?
Can compute the fraction of the total sum of squares (SST) that is explained by the model, call this the R-squared of regression
so holding x2, …, xk fixed implies that

Δŷ = b̂1Δx1,

that is, each b̂ has a ceteris paribus interpretation
An Example (Wooldridge, p. 76)
The determination of wage (dollars per hour), wage:
Chapter 2: Multiple Regression Analysis: Estimation
y = b0 + b1x1 + b2x2 + . . . + bkxk + u
Multiple Regression Analysis
y = b0 + b1x1 + b2x2 + . . . + bkxk + u
1. Estimation
Parallels with Simple Regression
This means only the part of xi1 that is uncorrelated with xi2 is being related to yi, so we're estimating the effect of x1 on y after x2 has been "partialled out"