Instrumental Variables (IV): A Detailed Explanation


IV
The 2SLS name notwithstanding, we don't usually construct 2SLS estimates in two steps. For one thing, the resulting standard errors are wrong, as we discuss later. 2SLS is also an IV estimator: the 2SLS estimate of $\rho$ is the sample analog of $\operatorname{cov}(Y_i, \hat{s}_i^*) / \operatorname{cov}(s_i, \hat{s}_i^*)$, where $\hat{s}_i^*$ is the residual from a regression of the first-stage fitted value $\hat{s}_i$ on $X_i$. This follows from the multivariate regression anatomy formula and the fact that $\operatorname{cov}(s_i, \hat{s}_i^*) = V(\hat{s}_i^*)$. It is also easy to show that, in a model with a single endogenous variable and a single instrument, the 2SLS estimator is the same as the corresponding Indirect Least Squares (ILS) estimator.
(Q3) By 2SLS,
Chapter 4
First, in a restricted model with constant effects:
$f_i(s) = \alpha + \rho s + \eta_i$ (4.1.1)
$\eta_i = A_i'\gamma + v_i$, with $E[s_i v_i] = 0$, so that
$Y_i = \alpha + \rho s_i + A_i'\gamma + v_i$ (4.1.2)

$\rho = \operatorname{cov}(Y_i, \hat{s}_i^*) / \operatorname{cov}(s_i, \hat{s}_i^*)$


$Y_i = \alpha' X_i + \rho s_i + \eta_i$ (4.1.6)
The link between 2SLS and IV warrants a bit more elaboration in the multi-instrument case. Assuming each instrument captures the same causal effect (a strong assumption that is relaxed below), we might want to combine these alternative IV estimates into a single, more precise estimate. In models with multiple instruments, 2SLS provides just such a linear combination by combining multiple instruments into a single instrument. Suppose, for example, we have three instrumental variables, $Z_{1i}$, $Z_{2i}$, and $Z_{3i}$. In the Angrist and Krueger (1991) application, these are dummies for first-, second-, and third-quarter births. The first-stage equation then becomes
$s_i = X_i'\pi_{10} + \pi_{11} Z_{1i} + \pi_{12} Z_{2i} + \pi_{13} Z_{3i} + \xi_{1i}$
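How several instruments collapse into one can be checked numerically. The following is only a minimal sketch on simulated data: the quarter-of-birth dummies, the coefficient values, and every variable name are hypothetical and not taken from Angrist and Krueger (1991). It compares the second-stage coefficient with a plain IV estimate that uses the first-stage fitted value as the single instrument.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
rho = 0.10  # hypothetical true causal effect of schooling

# Three hypothetical instruments: dummies for birth in quarters 1, 2, 3.
quarter = rng.integers(1, 5, size=n)
Z = np.column_stack([(quarter == q).astype(float) for q in (1, 2, 3)])

ability = rng.normal(size=n)  # unobserved confounder
s = 12 + Z @ np.array([-0.9, -0.6, -0.3]) + ability + rng.normal(size=n)
y = 1 + rho * s + 0.5 * ability + rng.normal(size=n)

# First stage: regress s on a constant and all three instruments.
W = np.column_stack([np.ones(n), Z])
pi_hat = np.linalg.lstsq(W, s, rcond=None)[0]
s_hat = W @ pi_hat  # fitted value = one linear combination of the instruments

# (a) Second stage: OLS of y on a constant and the fitted value.
X2 = np.column_stack([np.ones(n), s_hat])
rho_two_step = np.linalg.lstsq(X2, y, rcond=None)[0][1]

# (b) Plain IV using the fitted value as the single instrument.
rho_single_iv = np.cov(y, s_hat)[0, 1] / np.cov(s, s_hat)[0, 1]

print(rho_two_step, rho_single_iv)
```

The two numbers agree up to floating-point error because, in the sample, the covariance of $s_i$ with its own fitted value equals the variance of the fitted value.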
4.1 IV and causality
The IV idea is developed in two steps: first, in a restricted model with constant effects; second, in a framework with unrestricted heterogeneous potential outcomes.
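The instrument (IV) $Z_i$ is assumed to be uncorrelated with $\eta_i$: $\operatorname{cov}(Z_i, \eta_i) = 0$. Then
$\rho = \dfrac{\operatorname{cov}(Y_i, Z_i)}{\operatorname{cov}(s_i, Z_i)} = \dfrac{\operatorname{cov}(Y_i, Z_i) / V(Z_i)}{\operatorname{cov}(s_i, Z_i) / V(Z_i)}$ (4.1.3)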
• Q1: The second equality in (4.1.3) is useful because it's usually easier to think in terms of regression coefficients than in terms of covariances.
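The two forms of the IV formula in (4.1.3) can be compared on simulated data. This is a minimal sketch, not a result from the source: the data-generating process, the parameter values, and the variable names are all assumptions chosen for illustration. It computes the OLS slope, the ratio of covariances, and the ratio of the reduced-form coefficient to the first-stage coefficient.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
rho = 0.10  # hypothetical true causal effect

ability = rng.normal(size=n)                     # unobserved A_i
z = rng.binomial(1, 0.5, size=n).astype(float)   # instrument, independent of ability
s = 12 + 1.0 * z + 1.5 * ability + rng.normal(size=n)
y = 1.0 + rho * s + 0.5 * ability + rng.normal(size=n)


def cov(a, b):
    """Sample covariance of two 1-D arrays."""
    return np.cov(a, b)[0, 1]


# OLS slope of y on s: contaminated by the omitted ability term.
ols = cov(y, s) / np.var(s, ddof=1)

# First equality in (4.1.3): ratio of covariances.
iv = cov(y, z) / cov(s, z)

# Second equality: reduced-form coefficient over first-stage coefficient.
reduced_form = cov(y, z) / np.var(z, ddof=1)
first_stage = cov(s, z) / np.var(z, ddof=1)
ratio = reduced_form / first_stage

print(f"OLS: {ols:.3f}  IV: {iv:.3f}  reduced form / first stage: {ratio:.3f}")
```

In this setup the OLS slope is pushed away from `rho` by the omitted ability term, while the two IV expressions coincide because the variance of the instrument cancels.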
• 1. Origin
Studying agricultural markets in the 1920s, the father and son research team of Philip and Sewall Wright were interested in a challenging problem of causal inference: how to estimate the slope of supply and demand curves when observed data on prices and quantities are determined by the intersection of these two curves. In other words, equilibrium prices and quantities (the only ones we get to observe) solve these two stochastic equations at the same time. Upon which curve, therefore, does the observed scatterplot of prices and quantities lie? The fact that population regression coefficients do not capture the slope of any one equation in a set of simultaneous equations had been understood by Philip Wright for some time. The IV method, first laid out in Wright (1928), solves the statistical simultaneous equations problem by using variables that appear in one equation to shift this equation and trace out the other. The variables that do the shifting came to be known as instrumental variables (Reiersol, 1941).
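$Y_i = \alpha' X_i + \rho [X_i'\pi_{10} + \pi_{11} Z_i + \xi_{1i}] + \eta_i$
$= X_i'[\alpha + \rho\pi_{10}] + \rho\pi_{11} Z_i + [\rho\xi_{1i} + \eta_i]$
$= X_i'\pi_{20} + \pi_{21} Z_i + \xi_{2i}$ (4.1.7)
where $\pi_{20} \equiv \alpha + \rho\pi_{10}$; $\pi_{21} \equiv \rho\pi_{11}$; $\xi_{2i} \equiv \rho\xi_{1i} + \eta_i$.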
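$Y_i = \alpha' X_i + \rho \hat{s}_i + [\rho (s_i - \hat{s}_i) + \eta_i]$ (4.1.9)
Here $\hat{s}_i$ is the first-stage fitted value, and the bracketed term plays the role of $\xi_{2i} = \rho\xi_{1i} + \eta_i$, with the first-stage residual $s_i - \hat{s}_i$ in place of $\xi_{1i}$.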
The resulting estimator is consistent for $\rho$ because (a) first-stage estimates are consistent; and (b) the covariates, $X_i$, and instruments, $Z_i$, are uncorrelated with both $\eta_i$ and $(s_i - \hat{s}_i)$.
[Figure: causal diagram with nodes Z, S, A, and Y]
First, the instrument must have a clear effect on $s_i$. This is the first stage. Second, the only reason for the relationship between $Y_i$ and $Z_i$ is the first stage.
4.1.1 Two-Stage Least Squares
Given the first-stage and reduced-form equations:
$s_i = X_i'\pi_{10} + \pi_{11} Z_i + \xi_{1i}$ (4.1.4a)
$Y_i = X_i'\pi_{20} + \pi_{21} Z_i + \xi_{2i}$ (4.1.4b)
Substitute (4.1.4a) into (4.1.6):
$Y_i = \alpha' X_i + \rho s_i + \eta_i$ (4.1.6)
In (4.1.5), $\tilde{Z}_i$ is the residual from a regression of $Z_i$ on the exogenous covariates, $X_i$. The right-hand side of (4.1.5) therefore swaps $\tilde{Z}_i$ for $Z_i$ in the general IV formula, (4.1.3). Econometricians call the sample analog of the left-hand side of equation (4.1.5) an Indirect Least Squares (ILS) estimator of $\rho$ in the causal model with covariates.
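The equivalence between the ILS ratio and the IV formula with a partialled-out instrument can be verified directly. The sketch below assumes a simulated data set with one covariate and one instrument; the coefficients and names such as `z_tilde` are hypothetical. It computes the ratio of the reduced-form to the first-stage coefficient on the instrument, then the covariance ratio using the residual of the instrument after regressing it on the covariates.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50_000
rho = 0.10  # hypothetical true causal effect

x = rng.normal(size=n)             # one exogenous covariate
z = 0.5 * x + rng.normal(size=n)   # instrument, correlated with the covariate
ability = rng.normal(size=n)       # unobserved confounder
s = 1 + 0.8 * x + 0.7 * z + ability + rng.normal(size=n)
y = 2 + 0.3 * x + rho * s + 0.5 * ability + rng.normal(size=n)

X = np.column_stack([np.ones(n), x])  # covariates, including a constant


def ols(W, target):
    """Least-squares coefficients from regressing target on the columns of W."""
    return np.linalg.lstsq(W, target, rcond=None)[0]


# ILS: reduced-form coefficient on z divided by first-stage coefficient on z.
XZ = np.column_stack([X, z])
pi_11 = ols(XZ, s)[-1]
pi_21 = ols(XZ, y)[-1]
ils = pi_21 / pi_11

# IV formula with z replaced by its residual from a regression on X.
z_tilde = z - X @ ols(X, z)
iv_partialled = np.cov(y, z_tilde)[0, 1] / np.cov(s, z_tilde)[0, 1]

print(ils, iv_partialled)
```

The agreement is exact in the sample (up to floating-point error), since each coefficient on the instrument is itself a covariance with `z_tilde` divided by the variance of `z_tilde`.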
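Example
Angrist and Krueger (1991) exploit the variation induced by compulsory schooling laws in a paper that typifies the use of "natural experiments" to try to eliminate omitted variables bias. Under compulsory schooling laws, children must start school at age six, so children born in the second half of the year are younger when they enter school, and students must stay in school until they turn 16. The data therefore cover the 1930 to 1939 birth years. The first-stage regression uses year of birth and quarter of birth (the instrument) to capture the relationship between schooling and quarter of birth; a second regression on year of birth and quarter of birth (the instrument) then captures the relationship between quarter of birth and weekly earnings (the reduced form).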
Slightly rearranging equation (4.1.7):
$Y_i = \alpha' X_i + \rho [X_i'\pi_{10} + \pi_{11} Z_i] + \xi_{2i}$, (4.1.8)
where $X_i'\pi_{10} + \pi_{11} Z_i$ is the population fitted value from the first-stage regression of $s_i$ on $X_i$ and $Z_i$. (A2)
• 2. Uses
(1) Solving the two stochastic equations at the same time (outdated).
(2) Causal inference.
(3) Solving the problem of bias from measurement error in regression models.
(4) Solving the problem of omitted variables bias (most important).
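In practice, of course, we almost always work with data from samples. Given a random sample, the first-stage fitted values in the population are consistently estimated by $\hat{s}_i = X_i'\hat{\pi}_{10} + \hat{\pi}_{11} Z_i$. The procedure is:
Step 1: regress $s_i$ on $X_i$ and $Z_i$ to obtain $\hat{s}_i$.
Step 2: regress $Y_i$ on $\hat{s}_i$ and $X_i$.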
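The two steps can be sketched with ordinary least squares alone. This is an illustrative sketch under assumed simulated data, not a replacement for a dedicated 2SLS routine; all names and parameter values are hypothetical. It also shows one reason the manual second stage gives misleading standard errors: its residuals are built from the fitted value rather than from the observed $s_i$, so the residual variance differs from the one computed with the observed regressor.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50_000
rho = 0.10  # hypothetical true causal effect

x = rng.normal(size=n)        # exogenous covariate
z = rng.normal(size=n)        # instrument
ability = rng.normal(size=n)  # unobserved confounder
s = 1 + 0.5 * x + 0.8 * z + ability + rng.normal(size=n)
y = 2 + 0.3 * x + rho * s + 0.5 * ability + rng.normal(size=n)

ones = np.ones(n)

# Step 1: regress s on the covariates and the instrument, keep fitted values.
first_stage_X = np.column_stack([ones, x, z])
pi_hat = np.linalg.lstsq(first_stage_X, s, rcond=None)[0]
s_hat = first_stage_X @ pi_hat

# Step 2: regress y on the covariates and the fitted values.
second_stage_X = np.column_stack([ones, x, s_hat])
coef = np.linalg.lstsq(second_stage_X, y, rcond=None)[0]
rho_2sls = coef[2]

# Residuals from the manual second stage use s_hat ...
naive_resid = y - second_stage_X @ coef
# ... whereas the structural residuals use the observed s with the same coefficients.
correct_resid = y - np.column_stack([ones, x, s]) @ coef

naive_var = naive_resid.var()
correct_var = correct_resid.var()
print(rho_2sls, naive_var, correct_var)
```

Standard errors based on `naive_var` would be computed from the wrong residuals, which is why packaged 2SLS routines are normally preferred over running the two regressions by hand.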
$s_i = X_i'\pi_{10} + \pi_{11} Z_i + \xi_{1i}$ (4.1.4a)
$Y_i = X_i'\pi_{20} + \pi_{21} Z_i + \xi_{2i}$ (4.1.4b)
Results: (1) people with more schooling earn more; (2) older people earn more (those born in 1930 earn more than those born in 1931).
Q2:
$\rho = \dfrac{\pi_{21}}{\pi_{11}} = \dfrac{\operatorname{cov}(Y_i, \tilde{Z}_i)}{\operatorname{cov}(s_i, \tilde{Z}_i)}$ (4.1.5)
So where can you find an instrumental variable?
One possible source of instruments for schooling is differences in costs due, say, to loan policies or other subsidies that vary independently of ability or earnings potential. A second source of variation in schooling is institutional constraints.