Wooldridge, Econometrics (3)
Wooldridge, Econometrics, 6th Edition, Solutions: Chapter 3
CHAPTER 3

TEACHING NOTES

For undergraduates, I do not work through most of the derivations in this chapter, at least not in detail. Rather, I focus on interpreting the assumptions, which mostly concern the population. Other than random sampling, the only assumption that involves more than population considerations is the assumption about no perfect collinearity, where the possibility of perfect collinearity in the sample (even if it does not occur in the population) should be touched on. The more important issue is perfect collinearity in the population, but this is fairly easy to dispense with via examples. These come from my experiences with the kinds of model specification issues that beginners have trouble with.

The comparison of simple and multiple regression estimates – based on the particular sample at hand, as opposed to their statistical properties – usually makes a strong impression. Sometimes I do not bother with the "partialling out" interpretation of multiple regression.

As far as statistical properties, notice how I treat the problem of including an irrelevant variable: no separate derivation is needed, as the result follows from Theorem 3.1.

I do like to derive the omitted variable bias in the simple case. This is not much more difficult than showing unbiasedness of OLS in the simple regression case under the first four Gauss-Markov assumptions. It is important to get the students thinking about this problem early on, and before too many additional (unnecessary) assumptions have been introduced.

I have intentionally kept the discussion of multicollinearity to a minimum. This partly indicates my bias, but it also reflects reality. It is, of course, very important for students to understand the potential consequences of having highly correlated independent variables. But this is often beyond our control, except that we can ask less of our multiple regression analysis.
If two or more explanatory variables are highly correlated in the sample, we should not expect to precisely estimate their ceteris paribus effects in the population.

I find extensive treatments of multicollinearity, where one "tests" or somehow "solves" the multicollinearity problem, to be misleading, at best. Even the organization of some texts gives the impression that imperfect collinearity is somehow a violation of the Gauss-Markov assumptions. In fact, they include multicollinearity in a chapter or part of the book devoted to "violation of the basic assumptions," or something like that. I have noticed that master's students who have had some undergraduate econometrics are often confused on the multicollinearity issue. It is very important that students not confuse multicollinearity among the included explanatory variables in a regression model with the bias caused by omitting an important variable.

I do not prove the Gauss-Markov theorem. Instead, I emphasize its implications. Sometimes, and certainly for advanced beginners, I put a special case of Problem 3.12 on a midterm exam, where I make a particular choice for the function g(x). Rather than have the students directly compare the variances, they should appeal to the Gauss-Markov theorem for the superiority of OLS over any other linear, unbiased estimator.

SOLUTIONS TO PROBLEMS

3.1 (i) hsperc is defined so that the smaller it is, the higher the student's standing in high school. Everything else equal, the worse the student's standing in high school, the lower is his/her expected college GPA.

(ii) Just plug these values into the equation:

$\widehat{colgpa}$ = 1.392 − .0135(20) + .00148(1050) = 2.676.

(iii) The difference between A and B is simply 140 times the coefficient on sat, because hsperc is the same for both students. So A is predicted to have a score .00148(140) ≈ .207 higher.

(iv) With hsperc fixed, $\Delta\widehat{colgpa} = .00148\,\Delta sat$.
Now, we want to find $\Delta sat$ such that $\Delta\widehat{colgpa} = .5$, so $.5 = .00148(\Delta sat)$ or $\Delta sat = .5/(.00148) \approx 338$. Perhaps not surprisingly, a large ceteris paribus difference in SAT score – almost two and one-half standard deviations – is needed to obtain a predicted difference in college GPA of a half a point.

3.2 (i) Yes. Because of budget constraints, it makes sense that, the more siblings there are in a family, the less education any one child in the family has. To find the increase in the number of siblings that reduces predicted education by one year, we solve $1 = .094(\Delta sibs)$, so $\Delta sibs = 1/.094 \approx 10.6$.

(ii) Holding sibs and feduc fixed, one more year of mother's education implies .131 years more of predicted education. So if a mother has four more years of education, her son is predicted to have about a half a year (.524) more years of education.

(iii) Since the number of siblings is the same, but meduc and feduc are both different, the coefficients on meduc and feduc both need to be accounted for. The predicted difference in education between B and A is .131(4) + .210(4) = 1.364.

3.3 (i) If adults trade off sleep for work, more work implies less sleep (other things equal), so $\beta_1 < 0$.

(ii) The signs of $\beta_2$ and $\beta_3$ are not obvious, at least to me. One could argue that more educated people like to get more out of life, and so, other things equal, they sleep less ($\beta_2 < 0$). The relationship between sleeping and age is more complicated than this model suggests, and economists are not in the best position to judge such things.

(iii) Since totwrk is in minutes, we must convert five hours into minutes: $\Delta totwrk = 5(60) = 300$. Then sleep is predicted to fall by .148(300) = 44.4 minutes. For a week, 45 minutes less sleep is not an overwhelming change.

(iv) More education implies less predicted time sleeping, but the effect is quite small.
If we assume the difference between college and high school is four years, the college graduate sleeps about 45 minutes less per week, other things equal.

(v) Not surprisingly, the three explanatory variables explain only about 11.3% of the variation in sleep. One important factor in the error term is general health. Another is marital status, and whether the person has children. Health (however we measure that), marital status, and number and ages of children would generally be correlated with totwrk. (For example, less healthy people would tend to work less.)

3.4 (i) A larger rank for a law school means that the school has less prestige; this lowers starting salaries. For example, a rank of 100 means there are 99 schools thought to be better.

(ii) $\beta_1 > 0$, $\beta_2 > 0$. Both LSAT and GPA are measures of the quality of the entering class. No matter where better students attend law school, we expect them to earn more, on average. $\beta_3, \beta_4 > 0$. The number of volumes in the law library and the tuition cost are both measures of the school quality. (Cost is less obvious than library volumes, but should reflect quality of the faculty, physical plant, and so on.)

(iii) This is just the coefficient on GPA, multiplied by 100: 24.8%.

(iv) This is an elasticity: a one percent increase in library volumes implies a .095% increase in predicted median starting salary, other things equal.

(v) It is definitely better to attend a law school with a lower rank. If law school A has a ranking 20 less than law school B, the predicted difference in starting salary is 100(.0033)(20) = 6.6% higher for law school A.

3.5 (i) No. By definition, study + sleep + work + leisure = 168. Therefore, if we change study, we must change at least one of the other categories so that the sum is still 168.

(ii) From part (i), we can write, say, study as a perfect linear function of the other independent variables: study = 168 − sleep − work − leisure.
This holds for every observation, so MLR.3 is violated.

(iii) Simply drop one of the independent variables, say leisure:

$GPA = \beta_0 + \beta_1 study + \beta_2 sleep + \beta_3 work + u$.
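The argument in Problem 3.5 can be checked numerically. The sketch below is only an illustration: the five weekly time budgets are made-up numbers (not data from the text), and `matrix_rank` is a small helper written for this example. It shows that an intercept plus all four activities gives five columns but only rank 4, while dropping leisure leaves four linearly independent columns.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a matrix, via exact Gaussian elimination on fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        # Look for a nonzero pivot in this column, at or below row `rank`.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


# Hypothetical (study, sleep, work) hours per week; leisure is whatever is left of 168.
hours = [(20, 56, 40), (30, 50, 20), (10, 60, 45), (25, 49, 30), (15, 63, 10)]

# Columns: intercept, study, sleep, work, leisure.
full = [[1, s, sl, w, 168 - s - sl - w] for s, sl, w in hours]
# Same design with leisure dropped, as in part (iii).
reduced = [row[:4] for row in full]

rank_full = matrix_rank(full)        # fewer than 5: perfect collinearity
rank_reduced = matrix_rank(reduced)  # equals the number of columns
print(rank_full, rank_reduced)
```

Because the rank of the full design is below its number of columns, the OLS first order conditions do not have a unique solution; after dropping leisure they do.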
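Study notes: Wooldridge, Econometrics, 5th edition, Chapter 3, Multiple Regression Analysis: Estimation

Chapter 3 Multiple Regression Analysis: Estimation
19 March 2020, 10:47

Notes on use:
1. These notes organize the content of Chapter 3 of Wooldridge's Econometrics, 5th edition, as I worked through it.
2. My own understanding is limited, so the notes are probably hard to follow on their own and may well contain errors; read them alongside the textbook.
3. I hope they are of some help to anyone who wants to learn econometrics.

I. The multiple linear regression model
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u$
1. We can study the effect of some variables on y while holding other variables fixed, rather than assuming that those other variables are unrelated to the ones of interest.
2. We can also generalize the functional relationship between variables, for example: $cons = \beta_0 + \beta_1 inc + \beta_2 inc^2 + u$
By including more variables in the model, we come closer to what SLR.4 was meant to achieve.
Key assumption of the general multiple regression model (u is uncorrelated with every x): $E(u|x_1, x_2, \dots, x_k) = 0$ (3.8)

II. Ordinary least squares (algebraic properties of the multiple linear regression model and interpretation of the equation)

2.1 Obtaining the OLS estimates
OLS still minimizes the sum of squared residuals: $\min \sum_{i=1}^{n} (y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \dots - \hat\beta_k x_{ik})^2$ (3.12)
Taking the k+1 partial derivatives of (3.12) gives the first order conditions (left to the computer to solve). We assume the k+1 equations have a unique solution.

2.2 Interpreting the OLS regression equation
Example 3.1: comparing the two coefficients shows that once one factor is included in the model, the other loses much of its predictive power.
1. The coefficients are partial effects (the effect on y holding the other variables fixed).
2. The meaning of "holding other variables fixed": multiple regression gives us a ceteris paribus interpretation even when the data were not collected in a ceteris paribus fashion.
3. Changing more than one independent variable at the same time: simply add up the effects.

2.3 OLS fitted values and residuals
Generalizing from the single-variable case:
1. The sample average of the residuals is zero.
2. The sample covariance between each independent variable and the OLS residuals is zero. Consequently, the sample covariance between the OLS fitted values and the OLS residuals is also zero.
3. The point $(\bar x_1, \dots, \bar x_k, \bar y)$ is always on the OLS regression line.
(Properties 1 and 2 follow from the first order conditions; property 3 follows from property 1.)

2.4 A "partialling out" interpretation
$\hat\beta_1 = \sum_{i=1}^{n} \hat r_{i1} y_i \Big/ \sum_{i=1}^{n} \hat r_{i1}^2$, where $\hat r_{i1}$ is the residual from regressing $x_1$ on the other variables (that is, $x_1$ with the influence of the other variables removed, analogous to orthogonal vectors).

2.5 Comparison of simple and multiple regression estimates
Consider the simple regression and the regression with two independent variables: $\tilde y = \tilde\beta_0 + \tilde\beta_1 x_1$ and $\hat y = \hat\beta_0 + \hat\beta_1 x_1 + \hat\beta_2 x_2$. Then
$\tilde\beta_1 = \hat\beta_1 + \hat\beta_2 \tilde\delta_1$ (3.23)
where $\tilde\delta_1$ is the slope from the simple regression of $x_2$ on $x_1$. Equation (3.23) explains the difference between $\tilde\beta_1$ and $\hat\beta_1$; the two coincide in two cases:
1. The partial effect of $x_2$ on $\hat y$ is zero in the sample, that is, $\hat\beta_2 = 0$.
2. $x_1$ and $x_2$ are uncorrelated in the sample, that is, $\tilde\delta_1 = 0$.
(Both cases can be understood in terms of orthogonal vectors.)

2.6 Goodness of fit (much the same as in simple regression)
It can be shown that $R^2$ is also the squared correlation coefficient between the actual $y_i$ and the fitted values $\hat y_i$.
Because, by definition, adding an explanatory variable never lowers $R^2$, the decision whether a variable belongs in the model should rest on whether its partial effect on y is nonzero in the population.

2.7 Regression through the origin
1. The properties derived earlier no longer hold; in particular, the OLS residuals no longer have a zero sample average.
2. There is no set rule for computing $R^2$.
3. If the intercept $\beta_0 \neq 0$, the OLS slope estimators from the regression through the origin are biased. If $\beta_0 = 0$, the cost of estimating an intercept is a larger variance for the OLS slope estimators.

2.8 The expected value of the OLS estimators
MLR.1 (linear in parameters)
MLR.2 (random sampling)
MLR.3 (no perfect collinearity; some correlation among the regressors is allowed). Take care when defining functions of variables not to violate MLR.3.
MLR.4 (zero conditional mean). An endogenous explanatory variable is one that may be correlated with the error term.
Theorem 3.1 Unbiasedness of OLS: under MLR.1 through MLR.4, $E(\hat\beta_j) = \beta_j$.

2.9 Overspecification and underspecification (including an irrelevant variable and omitting a relevant one)
2.9.1 Overspecification does not affect the unbiasedness of the OLS estimators, but it does affect their variances.
2.9.2 Underspecification
1. Simple case: from one slope parameter to two. Taking expectations in (3.23) gives $E(\tilde\beta_1) = \beta_1 + \beta_2 \tilde\delta_1$, so the bias is $\beta_2 \tilde\delta_1$. The direction of the bias depends on the two signs and its size on the product of the two terms; in applications, common sense often tells us the direction.
2. Extended case: from two slope parameters to three. If $x_1$ and $x_2$ are assumed to be uncorrelated, the relationship between $E(\tilde\beta_1)$ and $\beta_1$ is the same as in the simple case.

2.10 The variance of the OLS estimators
MLR.5 (homoskedasticity; it simplifies the formulas and also delivers efficiency)
Theorem 3.2 Sampling variances of the OLS slope estimators: under MLR.1 through MLR.5, conditional on the sample values of the independent variables,
$Var(\hat\beta_j) = \dfrac{\sigma^2}{SST_j (1 - R_j^2)}$ (3.51)
where $SST_j$ is the total sample variation in $x_j$ and $R_j^2$ is the R-squared from regressing $x_j$ on all other independent variables (including an intercept).
By (3.51), the sampling variance depends on three components:
1. The error variance $\sigma^2$ (the more noise, the harder it is to estimate).
2. The total sample variation in $x_j$ (the more spread out, the easier it is to estimate).
3. The linear relationships among the independent variables, $R_j^2$ (the more $x_j$ is correlated with the other regressors, the worse for estimation).
A high $R_j^2$ is not necessarily a problem: the size of the sampling variance also depends on the other two components, and multicollinearity can be reduced by collecting more data.
When we look at the variance of one particular $\hat\beta_j$, if $x_j$ is uncorrelated with the other independent variables, the relationships among those other variables do not matter. Some economists include many controls in order to isolate the causal effect of a particular variable; correlation among the controls does not prevent that effect from being established.

2.10.1 Variances under overspecification (building on the discussion of unbiasedness under overspecification)
With two explanatory variables: $Var(\hat\beta_1) = \sigma^2 / [SST_1 (1 - R_1^2)]$ (3.54)
With one explanatory variable: $Var(\tilde\beta_1) = \sigma^2 / SST_1$ (3.55)
(3.54) and (3.55) show that, unless $x_1$ and $x_2$ are uncorrelated in the sample, $Var(\tilde\beta_1) < Var(\hat\beta_1)$. This gives two conclusions:
1. When $\beta_2 = 0$, both estimators are unbiased, but $Var(\tilde\beta_1) < Var(\hat\beta_1)$, so $\tilde\beta_1$ is preferred.
2. When $\beta_2 \neq 0$, leaving out $x_2$ causes bias and including it raises the variance. We prefer to include $x_2$ because the bias from omitting it does not shrink as the sample size grows, whereas the extra variance from including it shrinks to zero.

2.10.2 Standard errors of the OLS estimators (same as in simple regression)
Theorem 3.3 Unbiased estimation of $\sigma^2$: under MLR.1 through MLR.5, $E(\hat\sigma^2) = \sigma^2$.
Under MLR.1 through MLR.5, $sd(\hat\beta_j) = \sigma / [SST_j (1 - R_j^2)]^{1/2}$, which gives the standard error $se(\hat\beta_j) = \hat\sigma / [SST_j (1 - R_j^2)]^{1/2}$. If MLR.5 fails (heteroskedasticity), the standard errors are invalid.
The second expression, $se(\hat\beta_j) = \hat\sigma / [\sqrt{n}\, sd(x_j) \sqrt{1 - R_j^2}]$, makes clear how the standard error shrinks as the sample size grows while the other three terms ($\hat\sigma$, $sd(x_j)$, $R_j^2$) settle down to constants.

2.11 A proper view of OLS estimation
Theorem 3.4 Gauss-Markov theorem: under MLR.1 through MLR.5, the OLS estimators are the best linear unbiased estimators: linear (as in (3.22)), unbiased, and with the smallest variance among linear unbiased estimators.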
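The comparison in section 2.5 can be verified on a small sample. The sketch below uses made-up numbers, and the helper names (`simple_ols`, `residuals`, `partial_slope`) are hypothetical, written only for this illustration. It obtains the two-regressor slopes through the partialling-out formula of section 2.4 and then checks the in-sample identity (3.23), that the simple regression slope equals $\hat\beta_1 + \hat\beta_2 \tilde\delta_1$. Exact fractions are used so that the identity holds without rounding error.

```python
from fractions import Fraction


def simple_ols(x, y):
    """Intercept and slope from regressing y on x (with an intercept)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def residuals(x, y):
    """Residuals from the simple regression of y on x."""
    b0, b1 = simple_ols(x, y)
    return [yi - b0 - b1 * xi for xi, yi in zip(x, y)]


def partial_slope(x_main, x_other, y):
    """Multiple regression slope on x_main, via partialling out x_other."""
    r = residuals(x_other, x_main)  # part of x_main not explained by x_other
    return sum(ri * yi for ri, yi in zip(r, y)) / sum(ri * ri for ri in r)


# Hypothetical data, chosen only so that x1 and x2 are correlated.
x1 = [Fraction(v) for v in (1, 2, 3, 4, 5, 6)]
x2 = [Fraction(v) for v in (2, 1, 4, 3, 6, 4)]
y = [Fraction(v) for v in (3, 4, 8, 7, 12, 11)]

beta1_hat = partial_slope(x1, x2, y)   # slope on x1 in the regression on x1 and x2
beta2_hat = partial_slope(x2, x1, y)   # slope on x2 in the same regression
_, beta1_tilde = simple_ols(x1, y)     # slope from regressing y on x1 only
_, delta1_tilde = simple_ols(x1, x2)   # slope from regressing x2 on x1

print(beta1_tilde == beta1_hat + beta2_hat * delta1_tilde)
```

The two special cases in the notes follow directly: if `beta2_hat` or `delta1_tilde` is zero, the second term vanishes and the two slopes on `x1` coincide.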
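Econometrics (Wooldridge, 3rd edition, Chinese edition): answers to end-of-chapter problems

Chapter 1 Solutions to Problems

1.1 (i) Ideally, we could randomly assign students to classes of different sizes. That is, each student is assigned a class size without regard to any student characteristics such as ability and family background. For reasons we will see in Chapter 2, we would like substantial variation in class sizes (subject, of course, to ethical considerations and resource constraints).

(ii) A negative correlation means that larger class size is associated with lower performance. We might find a negative correlation because larger class size actually hurts performance. However, with observational data, there are other reasons we might find a negative relationship. For example, children from more affluent families might be more likely to attend schools with smaller class sizes, and affluent children generally score better on standardized tests. Another possibility is that, within a school, a principal might assign the better students to smaller classes. Or, some parents might insist that their children be in the smaller classes, and these same parents tend to be more involved in their children's education.

(iii) Given the potential for confounding factors – some of which are listed in (ii) – finding a negative correlation would not be strong evidence that smaller class sizes actually lead to better performance. Some way of controlling for the confounding factors is needed, and this is the subject of multiple regression analysis.

1.2 (i) Here is one way to pose the question: If two firms, say A and B, are identical in all respects except that firm A supplies job training one hour per worker more than firm B, by how much would firm A's output differ from firm B's?

(ii) Firms are likely to choose job training depending on the characteristics of workers. Some observed characteristics are years of schooling, years in the workforce, and experience in a particular job. Firms might even discriminate based on age, gender, or race. Perhaps firms choose to offer training to more or less able workers, where "ability" might be difficult to quantify but where a manager has some idea about the relative abilities of different employees. Moreover, different kinds of workers might be attracted to firms that offer more job training on average, and this might not be evident to employers.

(iii) The amount of capital and technology available to workers would also affect output. So, two firms with exactly the same kinds of employees would generally have different outputs if they use different amounts of capital or technology. The quality of managers would also have an effect.

(iv) No, unless the amount of training is randomly assigned. The many factors listed in parts (ii) and (iii) can contribute to finding a positive correlation between output and training even if job training does not improve worker productivity.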
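Wooldridge econometrics: summary of key points

1. Basic concepts
Econometrics, as presented in Wooldridge's text, uses mathematics and statistics to analyze and forecast economic phenomena quantitatively. It is an important branch of economics: by building mathematical models and testing them against empirical data, it can uncover economic regularities and support policy analysis.

2. Classical assumptions
A few classical assumptions are central. The first is linearity: the model is assumed to be linear in its parameters. The second is random sampling: the sample is drawn at random and is representative of the population. The remaining assumptions include no perfect collinearity among the regressors, homoskedasticity, and no serial correlation in the errors.

3. Building the model
An econometric study starts by specifying a suitable model. Common choices include the simple linear regression model, the multiple regression model, time series models, and cross-sectional models. Building a model involves choosing among models, deciding which variables to include, and settling on a functional form.

4. Parameter estimation
Once the model is specified, its parameters are estimated. The usual method is ordinary least squares, which chooses the parameter estimates that minimize the sum of squared residuals. At this stage one has to consider the consistency and efficiency of the estimators, as well as hypothesis testing.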
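As a minimal sketch of the least squares idea described above, the following computes the simple regression estimates in closed form for a tiny made-up data set and checks that moving either estimate away from its OLS value raises the sum of squared residuals. The data and the function names (`ols_fit`, `ssr`) are assumptions for illustration only.

```python
from fractions import Fraction


def ols_fit(x, y):
    """Closed-form OLS intercept and slope for y = b0 + b1*x + u."""
    n = len(x)
    xbar = Fraction(sum(x), n)
    ybar = Fraction(sum(y), n)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1


def ssr(b0, b1, x, y):
    """Sum of squared residuals for candidate coefficients (b0, b1)."""
    return sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))


x = [1, 2, 3, 4, 5]  # hypothetical regressor values
y = [2, 4, 5, 4, 5]  # hypothetical outcomes

b0_hat, b1_hat = ols_fit(x, y)
ssr_min = ssr(b0_hat, b1_hat, x, y)

# Evaluate the criterion at eight nearby coefficient pairs.
step = Fraction(1, 10)
nearby = [
    ssr(b0_hat + d0, b1_hat + d1, x, y)
    for d0 in (-step, 0, step)
    for d1 in (-step, 0, step)
    if (d0, d1) != (0, 0)
]

print(b0_hat, b1_hat, ssr_min)  # 11/5 3/5 12/5
print(all(v > ssr_min for v in nearby))
```

Exact fractions are used here only so that the comparison is free of rounding error; with real data one would normally rely on a statistics package.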
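5. Model diagnostics
Diagnostics are an important step: checking a model's validity, robustness, and suitability helps ensure that its results are accurate and reliable. Typical checks cover multicollinearity, heteroskedasticity, serial correlation, and out-of-sample validation.

6. Forecasting and policy analysis
An econometric study usually ends with forecasting and policy analysis. Assessing a model's predictive performance and its estimated policy effects gives decision makers useful information and helps explain the economic phenomenon under study.

In my view, econometrics as taught by Wooldridge is both interesting and important. It helps us understand the regularities behind economic phenomena and provides input for policy making. By building mathematical models and testing them against data, we can examine economic questions in more depth and reach better-founded judgments. I have also come to appreciate that econometric work draws on mathematics, statistics, and economics together, which makes real demands on our overall ability.

To sum up, econometrics is a broad subject with a strong internal logic, and studying it requires a solid understanding of economic phenomena and the ability to analyze them.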
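Introductory Econometrics (Wooldridge): solutions to end-of-chapter problems, Chinese edition

2.10 (iii) From (2.57), $Var(\hat\beta_1) = \sigma^2 \Big/ \sum_{i=1}^{n} (x_i - \bar x)^2$. From the hint, $\sum_{i=1}^{n} x_i^2 \ge \sum_{i=1}^{n} (x_i - \bar x)^2$, and so $Var(\tilde\beta_1) \le Var(\hat\beta_1)$. A more direct way to see this is to write $\sum_{i=1}^{n} (x_i - \bar x)^2 = \sum_{i=1}^{n} x_i^2 - n\bar x^2$, which is less than $\sum_{i=1}^{n} x_i^2$ unless $\bar x = 0$.

(iv) Holding $\sum_{i=1}^{n} x_i^2$ fixed, as $\bar x$ increases the variance of $\hat\beta_1$ increases relative to $Var(\tilde\beta_1)$. The bias in $\tilde\beta_1$ is also small when $\beta_0$ is small. Therefore, whether we prefer $\tilde\beta_1$ or $\hat\beta_1$ on a mean squared error basis depends on the sizes of $\beta_0$, $\bar x$, and $n$ (in addition to the size of $\sum_{i=1}^{n} x_i^2$).

3.7 We can use Table 3.2. By definition, $\beta_2 > 0$, and by assumption, $Corr(x_1, x_2) < 0$. Therefore, there is a negative bias in $\tilde\beta_1$: $E(\tilde\beta_1) < \beta_1$. This means that, on average across different random samples, the simple regression estimator underestimates the effect of the training program. It is even possible that $E(\tilde\beta_1)$ is negative even though $\beta_1 > 0$.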
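The direction of the bias in Problem 3.7 can be illustrated with a small simulation. This is only a sketch under assumed values: the population parameters (`beta1 = 0.5`, `beta2 = 2.0`, `delta = -0.5`), the normal errors, and the function names are all hypothetical choices, not taken from the text. With a positive effect of the omitted variable and a negative relationship between the two regressors, the average simple regression slope across samples falls below the true effect, and here it is even negative.

```python
import random


def simple_slope(x, y):
    """Slope from the simple regression of y on x (with an intercept)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return sxy / sxx


def average_short_slope(beta1, beta2, delta, n=200, reps=300, seed=0):
    """Average slope on x1 across samples when x2 is left out of the regression."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(reps):
        x1 = [rng.gauss(0, 1) for _ in range(n)]
        # x2 is negatively related to x1 when delta < 0.
        x2 = [delta * a + rng.gauss(0, 1) for a in x1]
        y = [beta1 * a + beta2 * b + rng.gauss(0, 1) for a, b in zip(x1, x2)]
        slopes.append(simple_slope(x1, y))
    return sum(slopes) / reps


beta1, beta2, delta = 0.5, 2.0, -0.5  # assumed population values
avg = average_short_slope(beta1, beta2, delta)

# Under these assumptions the average should be near beta1 + beta2 * delta.
print(round(avg, 2), beta1 + beta2 * delta)
```

The comparison value `beta1 + beta2 * delta` is the population analogue of the omitted variable formula; the simulated average only approximates it.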
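Wooldridge, Introductory Econometrics: notes and worked solutions to end-of-chapter problems (the simple regression model) [published by Shengcai]

Shengcai E-books: a learning platform with 100,000 e-books, question banks, and videos for postgraduate entrance and certification exams

Chapter 2 The Simple Regression Model

2.1 Review notes

I. Definition of the simple regression model
1. The two-variable linear regression model
A simple equation is $y = \beta_0 + \beta_1 x + u$. Assuming the equation holds in the population of interest, it defines a simple linear regression model. Because it relates the two variables x and y, it is also called the two-variable or bivariate linear regression model.
2. Regression terminology
Table 2-1 Terminology for simple regression
3. The zero conditional mean assumption
(1) Zero conditional mean: the average value of u does not depend on the value of x. This can be written $E(u|x) = E(u)$. When this equation holds, we say that u is mean independent of x.
(2) What the zero conditional mean assumption implies
① The assumption gives $\beta_1$ another very useful interpretation. Taking expected values conditional on x and using $E$ […] $\beta_1$ is the slope parameter.
② Given the zero conditional mean assumption $E(u|x) = 0$, it is useful to view y in the equation as made up of two parts. One part is $\beta_0 + \beta_1 x$, which represents $E(y|x)$; the other part is u, called the nonsystematic part, that is, the part of y not explained by x.

II. Derivation of ordinary least squares
1. The least squares estimates
From the population moment conditions, including $E[x(y - \beta_0 - \beta_1 x)] = 0$, we obtain the sample counterparts
$\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat\beta_0 - \hat\beta_1 x_i) = 0$
and
$\frac{1}{n} \sum_{i=1}^{n} x_i (y_i - \hat\beta_0 - \hat\beta_1 x_i) = 0$.
These two equations can be solved for $\hat\beta_0$ and $\hat\beta_1$. The first gives $\bar y = \hat\beta_0 + \hat\beta_1 \bar x$, so $\hat\beta_0 = \bar y - \hat\beta_1 \bar x$.

[…] measures the sample variation in the $y_i$, and SSR measures the sample variation in the $\hat u_i$. The total variation in y can always be expressed as the sum of the explained variation SSE and the unexplained variation SSR. Therefore, SST = SSE + SSR.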
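The two first order conditions and the decomposition SST = SSE + SSR can be checked on a small example. The sketch below uses four made-up observations and exact fractions; the variable names are chosen for this illustration only.

```python
from fractions import Fraction

# Hypothetical sample.
x = [Fraction(v) for v in (0, 1, 2, 3)]
y = [Fraction(v) for v in (1, 3, 2, 6)]
n = len(x)

xbar = sum(x) / n
ybar = sum(y) / n

# OLS slope and intercept from the two first order conditions.
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sum(
    (xi - xbar) ** 2 for xi in x
)
b0 = ybar - b1 * xbar

fitted = [b0 + b1 * xi for xi in x]
resid = [yi - fi for yi, fi in zip(y, fitted)]

# Sample moment conditions: both sums should be exactly zero.
foc_intercept = sum(resid)
foc_slope = sum(xi * ui for xi, ui in zip(x, resid))

sst = sum((yi - ybar) ** 2 for yi in y)      # total variation in y
sse = sum((fi - ybar) ** 2 for fi in fitted)  # explained variation
ssr = sum(ui ** 2 for ui in resid)            # unexplained variation

print(foc_intercept, foc_slope)
print(sst == sse + ssr, sse / sst)  # the ratio SSE/SST is the R-squared
```

The decomposition relies on both moment conditions holding, which is why it can fail when the regression is forced through the origin.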
Wooldridge, Econometrics, 3rd Edition, Instructor's Manual: CHAPTER 16
CHAPTER 16

TEACHING NOTES

I spend some time in Section 16.1 trying to distinguish between good and inappropriate uses of SEMs. Naturally, this is partly determined by my taste, and many applications fall into a gray area. But students who are going to learn about SEMs should know that just because two (or more) variables are jointly determined does not mean that it is appropriate to specify and estimate an SEM. I have seen many bad applications of SEMs where no equation in the system can stand on its own with an interesting ceteris paribus interpretation. In most cases, the researcher either wanted to estimate a tradeoff between two variables, controlling for other factors – in which case OLS is appropriate – or should have been estimating what is (often pejoratively) called the "reduced form."

The identification of a two-equation SEM in Section 16.3 is fairly standard except that I emphasize that identification is a feature of the population. (The early work on SEMs also had this emphasis.) Given the treatment of 2SLS in Chapter 20XXXX, the rank condition is easy to state (and test).

Romer's (20XXXX0XX3) inflation and openness example is a nice example of using aggregate cross-sectional data. Purists may not like the labor supply example, but it has become common to view labor supply as being a two-tier decision. While there are different ways to model the two tiers, specifying a standard labor supply function conditional on working is not outside the realm of reasonable models.

Section 16.5 begins by expressing doubts of the usefulness of SEMs for aggregate models such as those that are specified based on standard macroeconomic models. Such models raise all kinds of thorny issues; these are ignored in virtually all texts, where such models are still used to illustrate SEM applications.

SEMs with panel data, which are covered in Section 16.6, are not covered in any other introductory text.
Presumably, if you are teaching this material, it is to more advanced students in a second semester, perhaps even in a more applied course. Once students have seen first differencing or the within transformation, along with IV methods, they will find specifying and estimating models of the sort contained in Example 16.8 straightforward. Levitt's example concerning prison populations is especially convincing because his instruments seem to be truly exogenous.

SOLUTIONS TO PROBLEMS

16.1 (i) If $\alpha_1 = 0$ then $y_1 = \beta_1 z_1 + u_1$, and so the right-hand side depends only on the exogenous variable $z_1$ and the error term $u_1$. This then is the reduced form for $y_1$. If $\alpha_2 = 0$, the reduced form for $y_1$ is $y_1 = \beta_2 z_2 + u_2$. (Note that having both $\alpha_1$ and $\alpha_2$ equal zero is not interesting as it implies the bizarre condition $u_2 - u_1 = \beta_1 z_1 - \beta_2 z_2$.)

If $\alpha_1 \neq 0$ and $\alpha_2 = 0$, we can plug $y_1 = \beta_2 z_2 + u_2$ into the first equation and solve for $y_2$:

$\beta_2 z_2 + u_2 = \alpha_1 y_2 + \beta_1 z_1 + u_1$

or

$\alpha_1 y_2 = -\beta_1 z_1 + \beta_2 z_2 + u_2 - u_1$.

Dividing by $\alpha_1$ (because $\alpha_1 \neq 0$) gives

$y_2 = -(\beta_1/\alpha_1) z_1 + (\beta_2/\alpha_1) z_2 + (u_2 - u_1)/\alpha_1 \equiv \pi_{21} z_1 + \pi_{22} z_2 + v_2$,

where $\pi_{21} = -\beta_1/\alpha_1$, $\pi_{22} = \beta_2/\alpha_1$, and $v_2 = (u_2 - u_1)/\alpha_1$. Note that the reduced form for $y_2$ generally depends on $z_1$ and $z_2$ (as well as on $u_1$ and $u_2$).

(ii) If we multiply the second structural equation by $(\alpha_1/\alpha_2)$ and subtract it from the first structural equation, we obtain

$y_1 - (\alpha_1/\alpha_2) y_1 = \alpha_1 y_2 - \alpha_1 y_2 + \beta_1 z_1 - (\alpha_1/\alpha_2)\beta_2 z_2 + u_1 - (\alpha_1/\alpha_2) u_2 = \beta_1 z_1 - (\alpha_1/\alpha_2)\beta_2 z_2 + u_1 - (\alpha_1/\alpha_2) u_2$

or

$[1 - (\alpha_1/\alpha_2)] y_1 = \beta_1 z_1 - (\alpha_1/\alpha_2)\beta_2 z_2 + u_1 - (\alpha_1/\alpha_2) u_2$.

Because $\alpha_1 \neq \alpha_2$, $1 - (\alpha_1/\alpha_2) \neq 0$, and so we can divide the equation by $1 - (\alpha_1/\alpha_2)$ to obtain the reduced form for $y_1$: $y_1 = \pi_{11} z_1 + \pi_{12} z_2 + v_1$, where $\pi_{11} = \beta_1/[1 - (\alpha_1/\alpha_2)]$, $\pi_{12} = -(\alpha_1/\alpha_2)\beta_2/[1 - (\alpha_1/\alpha_2)]$, and $v_1 = [u_1 - (\alpha_1/\alpha_2) u_2]/[1 - (\alpha_1/\alpha_2)]$.

A reduced form does exist for $y_2$, as can be seen by subtracting the second equation from the first:

$0 = (\alpha_1 - \alpha_2) y_2 + \beta_1 z_1 - \beta_2 z_2 + u_1 - u_2$;

because $\alpha_1 \neq \alpha_2$, we can rearrange and divide by $\alpha_1 - \alpha_2$ to obtain the reduced form.

(iii) In supply and demand examples, $\alpha_1 \neq \alpha_2$ is very reasonable.
If the first equation is the supply function, we generally expect $\alpha_1 > 0$, and if the second equation is the demand function, $\alpha_2 < 0$. The reduced forms can exist even in cases where the supply function is not upward sloping and the demand function is not downward sloping, but we might question the usefulness of such models.

16.2 Using simple economics, the first equation must be the demand function, as it depends on income, which is a common determinant of demand. The second equation contains a variable, rainfall, that affects crop production and therefore corn supply.

16.3 No. In this example, we are interested in estimating the tradeoff between sleeping and working, controlling for some other factors. OLS is perfectly suited for this, provided we have been able to control for all other relevant factors. While it is true individuals are assumed to optimally allocate their time subject to constraints, this does not result in a system of simultaneous equations. If we wrote down such a system, there is no sense in which each equation could stand on its own; neither would have an interesting ceteris paribus interpretation. Besides, we could not estimate either equation because economic reasoning gives us no way of excluding exogenous variables from either equation. See Example 20XXXX.2 for a similar discussion.

16.4 We can easily see that the rank condition for identifying the second equation does not hold: there are no exogenous variables appearing in the first equation that are not also in the second equation. The first equation is identified provided $\gamma_3 \neq 0$ (and we would presume $\gamma_3 < 0$). This gives us an exogenous variable, log(price), that can be used as an IV for alcohol in estimating the first equation by 2SLS (which is just standard IV in this case).

16.5 (i) Other things equal, a higher rate of condom usage should reduce the rate of sexually transmitted diseases (STDs).
So $\beta_1 < 0$.

(ii) If students having sex behave rationally, and condom usage does prevent STDs, then condom usage should increase as the rate of infection increases.

(iii) If we plug the structural equation for infrate into $conuse = \gamma_0 + \gamma_1 infrate + \dots$, we see that conuse depends on $\gamma_1 u_1$. Because $\gamma_1 > 0$, conuse is positively related to $u_1$. In fact, if the structural error ($u_2$) in the conuse equation is uncorrelated with $u_1$, $Cov(conuse, u_1) = \gamma_1 Var(u_1) > 0$. If we ignore the other explanatory variables in the infrate equation, we can use equation (5.4) to obtain the direction of bias: $plim(\hat\beta_1) - \beta_1 > 0$ because $Cov(conuse, u_1) > 0$, where $\hat\beta_1$ denotes the OLS estimator. Since we think $\beta_1 < 0$, OLS is biased towards zero. In other words, if we use OLS on the infrate equation, we are likely to underestimate the importance of condom use in reducing STDs. (Remember, the more negative is $\beta_1$, the more effective is condom usage.)

(iv) We would have to assume that condis does not appear, in addition to conuse, in the infrate equation. This seems reasonable, as it is usage that should directly affect STDs, and not just having a distribution program. But we must also assume condis is exogenous in the infrate equation: it cannot be correlated with unobserved factors (in $u_1$) that also affect infrate.

We must also assume that condis has some partial effect on conuse, something that can be tested by estimating the reduced form for conuse. It seems likely that this requirement for an IV – see equations (20XXXX.30) and (20XXXX.31) – is satisfied.

16.6 (i) It could be that the decision to unionize certain segments of workers is related to how a firm treats its employees. While the timing may not be contemporaneous, with the snapshot of a single cross section we might as well assume that it is.

(ii) One possibility is to collect information on whether workers' parents belonged to a union, and construct a variable that is the percentage of workers who had a parent in a union (say, perpar).
This may be (partially) correlated with the percent of workers that belong to a union.

(iii) We would have to assume that perpar is exogenous in the pension equation. We can test whether perunion is partially correlated with perpar by estimating the reduced form for perunion and doing a t test on perpar.

16.7 (i) Attendance at women's basketball may grow in ways that are unrelated to factors that we can observe and control for. The taste for women's basketball may increase over time, and this would be captured by the time trend.

(ii) No. The university sets the price, and it may change price based on expectations of next year's attendance; if the university uses factors that we cannot observe, these are necessarily in the error term $u_t$. So even though the supply is fixed, it does not mean that price is uncorrelated with the unobservables affecting demand.

(iii) If people only care about how this year's team is doing, $SEASPERC_{t-1}$ can be excluded from the equation once $WINPERC_t$ has been controlled for. Of course, this is not a very good assumption for all games, as attendance early in the season is likely to be related to how the team did last year. We would also need to check that $lPRICE_t$ is partially correlated with $SEASPERC_{t-1}$ by estimating the reduced form for $lPRICE_t$.

(iv) It does make sense to include a measure of men's basketball ticket prices, as attending a women's basketball game is a substitute for attending a men's game. The coefficient on $lMPRICE_t$ would be expected to be positive: an increase in the price of men's tickets should increase the demand for women's tickets. The winning percentage of the men's team is another good candidate for an explanatory variable in the women's demand equation.

(v) It might be better to use first differences of the logs, which are then growth rates. We would then drop the observation for the first game in each season.

(vi) If a game is sold out, we cannot observe true demand for that game.
We only know that desired attendance is some number above capacity. If we just plug in capacity, we are understating the actual demand for tickets. (Chapter 20XXXX discusses censored regression methods that can be used in such cases.)

16.8 We must first eliminate the unobserved effect, $a_{i1}$. If we difference, we have

$\Delta lHPRICE_{it} = \delta_t + \beta_1 \Delta lEXPEND_{it} + \beta_2 \Delta lPOLICE_{it} + \beta_3 \Delta lMEDINC_{it} + \beta_4 \Delta PROPTAX_{it} + \Delta u_{it}$,

for t = 2, 3. The $\delta_t$ here denotes different intercepts in the two years. The key assumption is that the change in the (log of) the state allocation, $\Delta lSTATEALL_{it}$, is exogenous in this equation. Naturally, $\Delta lSTATEALL_{it}$ is (partially) correlated with $\Delta lEXPEND_{it}$ because local expenditures depend at least partly on the state subsidy. The policy change in 20XXXX0XX4 means that there should be significant variation in $\Delta lSTATEALL_{it}$, at least for the 20XXXX0XX4 to 20XXXX0XX6 change. Therefore, we can estimate this equation by pooled 2SLS, using $\Delta lSTATEALL_{it}$ as an IV for $\Delta lEXPEND_{it}$; of course, this assumes the other explanatory variables in the equation are exogenous. (We could certainly question the exogeneity of the policy and property tax variables.) Without a policy change, $\Delta lSTATEALL_{it}$ would probably not vary sufficiently across i or t.

SOLUTIONS TO COMPUTER EXERCISES

C16.1 (i) Assuming the structural equation represents a causal relationship, $100 \cdot \beta_1$ is the approximate percentage change in income if a person smokes one more cigarette per day.

(ii) Since consumption and price are, ceteris paribus, negatively related, we expect $\gamma_5 \le 0$ (allowing for $\gamma_5 = 0$). Similarly, everything else equal, restaurant smoking restrictions should reduce cigarette smoking, so $\gamma_6 \le 0$.

(iii) We need $\gamma_5$ or $\gamma_6$ to be different from zero.
That is, we need at least one exogenous variable in the cigs equation that is not also in the log(income) equation.

(iv) OLS estimation of the log(income) equation gives

$\widehat{\log(income)}$ = 7.80 + .0020XXXX cigs + .20XXXX0 educ + .20XXXX8 age − .0020XXXX3 age²
(0.20XXXX) (.0020XXXX) (.020XXXX) (.020XXXX) (.00020XXXX)
n = 820XXXX, R² = .20XXXX5.

The coefficient on cigs implies that cigarette smoking causes income to increase, although the coefficient is not statistically different from zero. Remember, OLS ignores potential simultaneity between income and cigarette smoking.

(v) The estimated reduced form for cigs is

$\widehat{cigs}$ = 1.58 − .450 educ + .823 age − .020XXXX0XX age² − .351 log(cigpric) − 2.74 restaurn
(23.70) (.20XXXX2) (.20XXXX4) (.0020XXXX) (5.766) (1.20XXXX)
n = 820XXXX, R² = .20XXXX1.

While log(cigpric) is very insignificant, restaurn had the expected negative sign and a t statistic of about –2.47. (People living in states with restaurant smoking restrictions smoke almost three fewer cigarettes, on average, given education and age.) We could drop log(cigpric) from the analysis but we leave it in. (Incidentally, the F test for joint significance of log(cigpric) and restaurn yields p-value .20XXXX4.)

(vi) Estimating the log(income) equation by 2SLS gives

$\widehat{\log(income)}$ = 7.78 − .20XXXX2 cigs + .20XXXX0 educ + .20XXXX4 age − .0020XXXX0XX age²
(0.23) (.20XXXX6) (.020XXXX) (.20XXXX3) (.0020XXXX7)
n = 820XXXX.

Now the coefficient on cigs is negative and almost significant at the 20XXXX% level against a two-sided alternative. The estimated effect is very large: each additional cigarette someone smokes lowers predicted income by about 4.2%. Of course, the 95% CI for cigs is very wide.

(vii) Assuming that state level cigarette prices and restaurant smoking restrictions are exogenous in the income equation is problematical. Incomes are known to vary by region, as do restaurant smoking restrictions.
It could be that in states where income is lower (after controlling for education and age), restaurant smoking restrictions are less likely to be in place.

C16.2 (i) We estimate a constant elasticity version of the labor supply equation (naturally, only for hours > 0), again by 2SLS. We get

$\widehat{\log(hours)}$ = 8.37 + 1.20XXXX log(wage) − .235 educ − .020XXXX age − .465 kidslt6 − .020XXXX nwifeinc
(0.69) (0.56) (.20XXXX1) (.020XXXX) (.220XXXX) (.020XXXX)
n = 428,

which implies a labor supply elasticity of 1.20XXXX. This is even higher than the 1.26 we obtained from equation (20XXXX.24) at the mean value of hours (20XXXX20XXXX).

(ii) Now we estimate the equation by 2SLS but allow log(wage) and educ to both be endogenous. The full list of instrumental variables is age, kidslt6, nwifeinc, exper, exper², motheduc, and fatheduc. The result is

$\widehat{\log(hours)}$ = 7.26 + 1.81 log(wage) − .20XXXX9 educ − .020XXXX age − .543 kidslt6 − .020XXXX nwifeinc
(1.20XXXX) (0.50) (.20XXXX7) (.020XXXX) (.220XXXX) (.020XXXX)
n = 428.

The biggest effect is to reduce the size of the coefficient on educ as well as its statistical significance. The labor supply elasticity is only moderately smaller.

(iii) After obtaining the 2SLS residuals, $\hat u_1$, from the estimation in part (ii), we regress these on age, kidslt6, nwifeinc, exper, exper², motheduc, and fatheduc. The n-R-squared statistic is 420XXXX(.0020XXXX) = .428. We have two overidentifying restrictions, so the p-value is roughly $P(\chi^2_2 > .43) \approx .81$. There is no evidence against the exogeneity of the IVs.

C16.3 (i) The OLS estimates are

$\widehat{inf}$ = 25.23 − .220XXXX open
(4.20XXXX) (.20XXXX3)
n = 20XXXX4, R² = .20XXXX5.

The IV estimates are

$\widehat{inf}$ = 29.61 − .333 open
(5.66) (.20XXXX0)
n = 20XXXX4, R² = .20XXXX2.

The OLS coefficient is the same, to three decimal places, when log(pcinc) is included in the model. The IV estimate with log(pcinc) in the equation is −.337, which is very close to −.333.
Therefore, dropping log(pcinc) makes little difference.

(ii) Subject to the requirement that an IV be exogenous, we want an IV that is as highly correlated as possible with the endogenous explanatory variable. If we regress open on land we obtain R² = .20XXXX5. The simple regression of open on log(land) gives R² = .448. Therefore, log(land) is much more highly correlated with open. Further, if we regress open on log(land) and land we get

$\widehat{open}$ = 20XXXX9.22 − 8.40 log(land) + .000020XXXX3 land
(20XXXX.47) (0.20XXXX) (.000020XXXX1)
n = 20XXXX4, R² = .457.

While log(land) is very significant, land is not, so we might as well use only log(land) as the IV for open.

[Instructor's Note: You might ask students whether it is better to use log(land) as the single IV for open or to use both land and land². In fact, log(land) explains much more variation in open.]

(iii) When we add oil to the original model, and assume oil is exogenous, the IV estimates are

$\widehat{inf}$ = 24.01 − .337 open + .820XXXX log(pcinc) − 6.56 oil
(20XXXX.20XXXX) (.20XXXX4) (2.20XXXX) (9.80)
n = 20XXXX4, R² = .20XXXX5.

Being an oil producer is estimated to reduce average annual inflation by over 6.5 percentage points, but the effect is not statistically significant. This is not too surprising, as there are only seven oil producers in the sample.

C16.4 (i) The usual form of the test assumes no serial correlation under $H_0$, and this appears to be the case. We also assume homoskedasticity. After estimating (20XXXX.35), we obtain the 2SLS residuals, $\hat u_t$. We then run the regression of $\hat u_t$ on $gc_{t-1}$, $gy_{t-1}$, and $r3_{t-1}$. The n-R-squared statistic is 35(.20XXXX20XXXX) ≈ 2.20XXXX.
With one df the (asymptotic) p-value is $P(\chi^2_1 > $ 2.20XXXX$)$ ≈ .20XXXX3, and so the instruments pass the overidentification test at the 20XXXX% level.

(ii) If we estimate (20XXXX.35) but with $gc_{t-2}$, $gy_{t-2}$, and $r3_{t-2}$ as the IVs, we obtain, with n = 34,

$\widehat{gc}_t$ = .020XXXX4 + 1.220XXXX $gy_t$ − .0020XXXX3 $r3_t$.
(.20XXXX74) (1.272) (.0020XXXX0XX)

The coefficient on $gy_t$ has doubled in size compared with equation (20XXXX.35), but it is not statistically significant. The coefficient on $r3_t$ is still small and statistically insignificant.

(iii) If we regress $gy_t$ on $gc_{t-2}$, $gy_{t-2}$, and $r3_{t-2}$ we obtain

$\widehat{gy}_t$ = .20XXXX1 − .20XXXX0 $gc_{t-2}$ + .20XXXX4 $gy_{t-2}$ + .0020XXXX4 $r3_{t-2}$
(.020XXXX) (.469) (.330) (.0020XXXX6)
n = 34, R² = .020XXXX7.

The F statistic for joint significance of all explanatory variables yields p-value .94, and so there is no correlation between $gy_t$ and the proposed IVs, $gc_{t-2}$, $gy_{t-2}$, and $r3_{t-2}$. Therefore, we never should have done the IV estimation in part (ii) in the first place.

[Instructor's Note: There may be serial correlation in this regression, in which case the F statistic is not valid. But the point remains that $gy_t$ is not at all correlated with two lags of all variables.]

C16.5 This is an open-ended question without a single answer. Even if we settle on extending the data through a particular year, we might want to change the disposable income and nondurable consumption numbers in earlier years, as these are often recalculated. For example, the value for real disposable personal income in 20XXXX0XX5, as reported in Table B-29 of the 20XXXX Economic Report of the President (ERP), is $4,945.8 billions. In the 20XXXX ERP, this value has been changed to $4,920XXXX.0 billions (see Table B-31). All series can be updated using the latest edition of the ERP. The key is to use real values and make them per capita by dividing by population.
Make sure that you use nondurable consumption.C20XXXX.6 (i) If we estimate the inverse supply function by OLS we obtain (with the coefficients on the monthly dummies suppressed)tgprc= .020XXXX4 .20XXXX43 gcem t+.20XXXX28 gprcpet t +(.020XXXX2) (.020XXXX1)(.020XXXX3)n = 220XXXX, R2 = .386.Several of the monthly dummy variables are very statistically significant, but their coefficients are not of direct interest here. The estimated supply curve slopes down, not up, and the coefficient on gcem t is very statistically significant (t statistic ≈ 4.87).(ii) We need gdefs t to have a nonzero coefficient in the reduced form for gcem t. More precisely, if we writegcem t = 0 + 1gdefs t + 2gprcpet t + 3feb t + + 20XXXX dec t + v t,then identification requires10. When we run this regression,1ˆπ= 1.20XXXX4 with a t statistic of about –0.294. Therefore,we cannot reject H0:1= 0 at any reasonable significance level,and we conclude that gdefs t is not a useful IV for gcem t(even if grdefs t is exogenous in the supply equation).(iii) Now the reduced form for gcem isgcem t =+1gres t +2gnon t +3gprcpet t +4feb t + +20XXXXdec t + v t ,and we need at least one of1and2to be different from zero. Infact, 1ˆπ = .20XXXX6, t (1ˆπ) = .20XXXX4 and 2ˆπ = 1.20XXXX, t (2ˆπ) =5.47. So gnon t is very significant in the reduced form for gcem t , and we can proceed with IV estimation.(iv) We use both gres t and gnon t as IVs for gcem t and apply 2SLS, even though the former is not significant in the RF. The estimated labor supply function (with seasonal dummy coefficients suppressed) is nowt gprc = .20XXXX28.020XXXX0XX gcem t +.20XXXX20XXXX gprcpet t +(.020XXXX3)(.20XXXX77)(.020XXXX7)n = 220XXXX, R 2 = .356.While the coefficient on gcem t is still negative, it is only about one-fourth the size of the OLS coefficient, and it is now very insignificant. At this point we would conclude that the static supply function is horizontal (with gprc on the vertical axis, asusual). 
Shea (20XXXX0XX3) adds many lags of gcem t and estimates a finite distributed lag model by IV, using leads as well as lags of gres t and gnon t as IVs. He estimates a positive long run propensity. C20XXXX.7 (i) If county administrators can predict when crime rates will increase, they may hire more police to counteract crime. This would explain the estimated positive relationship betweenlog(crmrte) and log(polpc) in equation (20XXXX.33).(ii) This may be reasonable, although tax collections depend in part on income and sales taxes, and revenues from these depend on the state of the economy, which can also influence crime rates.(iii) The reduced form for log(polpc it), for each i and t, is log(polpc it) = 0+ 1d83t+ 2d84t+ 3d85t+ 4d86t + 5d87tlog(prbarr it) + 7log(prbconv it) ++6log(prbpris it)8+log (avgsen it) + 20XXXX log(taxpc it) +9v it.We need0 for log(taxpc it) to be a reasonable IV candidate20XXXXfor log(polpc it). When we estimate this equation by pooled OLSˆ = .020XXXX2 with a t (N= 90, T= 6 for n= 540), we obtain10statistic of only .20XXXX0. Therefore, log(taxpc it) is not a good IV for log(polpc it).(iv) If the grants were awarded randomly, then the grant amounts, say grant it for the dollar amount for county i and year t, will be uncorrelated with u it, the changes in unobservables that affect county crime rates. By definition, grant it should be correlated with log(polpc it) across i and t. This means we have an exogenous variable that can be omitted from the crime equation and that is (partially) correlated with the endogenous explanatory variable. We could reestimate (20XXXX.33) by IV.C20XXXX.8(i) To estimate the demand equations, we need at least one exogenous variable that appears in the supply equation.(ii) For wave2t and wave3t to be valid IVs for log(avgprc t), we need two assumptions. The first is that these can be properly excluded from the demand equation. 
This may not be entirely reasonable, and wave heights are determined partly by weather, and demand at a local fish market could depend on demand. The second assumption is thatat least one of wave2t and wave3t appears in the supply equation. There is indirect evidence of this in part three, as the two variables are jointly significant in the reduced form for log(avgprc t).(iii) The OLS estimates of the reduced form areavgprc = 1.20XXXX .020XXXX mon t.020XXXX0 tues t log()t+ .20XXXX1 wed t+ .20XXXX4 thurs t(.20XXXX) (.20XXXX4) (.20XXXX20XXXX) (.20XXXX2) (.20XXXX1)+ .20XXXX4 wave2t + .20XXXX3 wave3t(.20XXXX1) (.20XXXX0)n = 20XXXX, R2 = .320XXXXThe variables wave2t and wave3t are jointly very significant: F = 20XXXX.1, p-value = zero to four decimal places.(iv) The 2SLS estimates of the demand function arelog()totqty = 8.20XXXX .820XXXX log(avgprc t)t.320XXXX mon t.685 tues t(.20XXXX) (.327) (.229) (.226).521 wed t + .20XXXX5 thurs t(.224)(.225)n = 20XXXX, R 2 = .20XXXX3The 95% confidence interval for the demand elasticity is roughly 1.47 to .20XXXX. The point estimate, .82, seems reasonable: a 20XXXX percent increase in price reduces quantity demanded by about 8.2%.(v) The coefficient on ,1ˆi t uis about .294 (se = .20XXXX0XX), so there is strong evidence of positive serial correlation, although the estimate of is not huge. One could compute a Newey-West standard error for 2SLS in place of the usual standard error.(vi) To estimate the supply elasticity, we would have to assume that the day-of-the-week dummies do not appear in the supply equation, but they do appear in the demand equation. Part (iii) provides evidence that there are day-of-the-week effects in the demand function. But we cannot know about the supply function.(vii) Unfortunately, in the estimation of the reduced form for log(avgprc t ) in part (iii), the variables mon , tues , wed , and thurs are jointly insignificant [F (4,90) = .53, p -value = .71.] 
This means that, while some of these dummies seem to show up in the demand equation, things cancel out in a way that they do not affect equilibrium price, once wave2 and wave3 are in the equation. So, without more information, we have no hope of estimating the supply equation.
[Instructor's Note: You could have the students try part (vii), anyway, to see what happens. Also, you could have them estimate the demand function by OLS, and compare the estimates with the 2SLS estimates in part (iv). You could also have them compute the test of the single overidentification condition.]
C20XXXX.9 (i) The demand function should be downward sloping, so the coefficient on log(fare) should be negative: as price increases, quantity demanded for air travel decreases.
(ii) The estimated price elasticity is −.391 (t statistic = −5.82).
(iii) We must assume that passenger demand depends only on air fare, so that, once price is controlled for, passengers are indifferent about the fraction of travel accounted for by the largest carrier.
(iv) The reduced form equation for log(fare) is
fitted log(fare) = 6.20XXXX + .395 concen − .936 log(dist) + .20XXXX0XX [log(dist)]^2
standard errors: (0.89) (.20XXXX3) (.272) (.20XXXX1)
n = 1,20XXXX9, R^2 = .420XXXX
The coefficient on concen shows a pretty strong link between concentration and fare. If concen increases by .20XXXX (20XXXX percentage points), fare is estimated to increase by almost 4%. The t statistic is about 6.3.
(v) Using concen as an IV for log(fare) [and where the distance variables act as their own IVs], the estimated price elasticity is −1.20XXXX, which shows much greater price sensitivity than did the OLS estimate. The IV estimate suggests that a one percent increase in fare leads to a slightly more than one percent drop in passenger demand. Of course, the standard error of the IV estimate is much larger (about .389 compared with the OLS standard error of .20XXXX7), but the IV estimate is statistically significant (t is about −3.0).
(vi) The relationship between log(fare) and log(dist) has a U-shape.
[Graph: fitted log(fare) plotted against log(dist)]
APPENDIX E
SOLUTIONS TO PROBLEMS

E.1 This follows directly from partitioned matrix multiplication in Appendix D. Write

$$X = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}, \qquad X' = (x_1' \;\; x_2' \;\; \cdots \;\; x_n'), \qquad y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}.$$

Therefore, $X'X = \sum_{t=1}^{n} x_t' x_t$ and $X'y = \sum_{t=1}^{n} x_t' y_t$. An equivalent expression for $\hat\beta$ is

$$\hat\beta = \Big(n^{-1}\sum_{t=1}^{n} x_t' x_t\Big)^{-1}\Big(n^{-1}\sum_{t=1}^{n} x_t' y_t\Big),$$

which, when we plug in $y_t = x_t\beta + u_t$ for each $t$ and do some algebra, can be written as

$$\hat\beta = \beta + \Big(n^{-1}\sum_{t=1}^{n} x_t' x_t\Big)^{-1}\Big(n^{-1}\sum_{t=1}^{n} x_t' u_t\Big).$$

As shown in Section E.4, this expression is the basis for the asymptotic analysis of OLS using matrices.

E.2 (i) Following the hint, we have $SSR(b) = (y - Xb)'(y - Xb) = [\hat u + X(\hat\beta - b)]'[\hat u + X(\hat\beta - b)] = \hat u'\hat u + \hat u'X(\hat\beta - b) + (\hat\beta - b)'X'\hat u + (\hat\beta - b)'X'X(\hat\beta - b)$. But by the first order conditions for OLS, $X'\hat u = 0$, and so $(X'\hat u)' = \hat u'X = 0$. But then $SSR(b) = \hat u'\hat u + (\hat\beta - b)'X'X(\hat\beta - b)$, which is what we wanted to show.
(ii) If $X$ has rank $k$ then $X'X$ is positive definite, which implies that $(\hat\beta - b)'X'X(\hat\beta - b) > 0$ for all $b \neq \hat\beta$. The term $\hat u'\hat u$ does not depend on $b$, and so $SSR(b) - SSR(\hat\beta) = (\hat\beta - b)'X'X(\hat\beta - b) > 0$ for $b \neq \hat\beta$.

E.3 (i) We use the placeholder feature of the OLS formulas. By definition, $\tilde\beta = (Z'Z)^{-1}Z'y = [(XA)'(XA)]^{-1}(XA)'y = [A'(X'X)A]^{-1}A'X'y = A^{-1}(X'X)^{-1}(A')^{-1}A'X'y = A^{-1}(X'X)^{-1}X'y = A^{-1}\hat\beta$.
(ii) By definition of the fitted values, $\hat y_t = x_t\hat\beta$ and $\tilde y_t = z_t\tilde\beta$. Plugging $z_t$ and $\tilde\beta$ into the second equation gives $\tilde y_t = (x_t A)(A^{-1}\hat\beta) = x_t\hat\beta = \hat y_t$.
(iii) The estimated variance matrix from the regression of $y$ on $Z$ is $\tilde\sigma^2(Z'Z)^{-1}$, where $\tilde\sigma^2$ is the error variance estimate from this regression. From part (ii), the fitted values from the two regressions are the same, which means the residuals must be the same for all $t$. (The dependent variable is the same in both regressions.) Therefore, $\tilde\sigma^2 = \hat\sigma^2$. Further, as we showed in part (i), $(Z'Z)^{-1} = A^{-1}(X'X)^{-1}(A')^{-1}$, and so $\tilde\sigma^2(Z'Z)^{-1} = \hat\sigma^2 A^{-1}(X'X)^{-1}(A^{-1})'$, which is what we wanted to show.
(iv) The $\tilde\beta_j$ are obtained from a regression of $y$ on $XA$, where $A$ is the $k \times k$ diagonal matrix with $1, a_2, \ldots, a_k$ down the diagonal. From part (i), $\tilde\beta = A^{-1}\hat\beta$. But $A^{-1}$ is easily seen to be the $k \times k$ diagonal matrix with $1, a_2^{-1}, \ldots, a_k^{-1}$ down its diagonal. Straightforward multiplication shows that the first element of $A^{-1}\hat\beta$ is $\hat\beta_1$ and the $j$th element is $\hat\beta_j/a_j$, $j = 2, \ldots, k$.
(v) From part (iii), the estimated variance matrix of $\tilde\beta$ is $\hat\sigma^2 A^{-1}(X'X)^{-1}(A^{-1})'$. But $A^{-1}$ is a symmetric, diagonal matrix, as described above. The estimated variance of $\tilde\beta_j$ is the $j$th diagonal element of $\hat\sigma^2 A^{-1}(X'X)^{-1}A^{-1}$, which is easily seen to be $\hat\sigma^2 c_{jj}\,a_j^{-2}$, where $c_{jj}$ is the $j$th diagonal element of $(X'X)^{-1}$. The square root of this, $\hat\sigma\sqrt{c_{jj}}/|a_j|$, is $se(\tilde\beta_j)$, which is simply $se(\hat\beta_j)/|a_j|$.
(vi) The t statistic for $\tilde\beta_j$ is, as usual, $\tilde\beta_j/se(\tilde\beta_j) = (\hat\beta_j/a_j)/[se(\hat\beta_j)/|a_j|]$, and so the absolute value is $(|\hat\beta_j|/|a_j|)/[se(\hat\beta_j)/|a_j|] = |\hat\beta_j|/se(\hat\beta_j)$, which is just the absolute value of the t statistic for $\hat\beta_j$. If $a_j > 0$, the t statistics themselves are identical; if $a_j < 0$, the t statistics are simply opposite in sign.

E.4 (i) $E(\hat\delta \mid X) = E(G\hat\beta \mid X) = G\,E(\hat\beta \mid X) = G\beta = \delta$.
(ii) $Var(\hat\delta \mid X) = Var(G\hat\beta \mid X) = G[Var(\hat\beta \mid X)]G' = G[\sigma^2(X'X)^{-1}]G' = \sigma^2 G(X'X)^{-1}G'$.
(iii) The vector of regression coefficients from the regression of $y$ on $XG^{-1}$ is

$$[(XG^{-1})'(XG^{-1})]^{-1}(XG^{-1})'y = [(G^{-1})'X'XG^{-1}]^{-1}(G^{-1})'X'y = G(X'X)^{-1}G'(G')^{-1}X'y = G(X'X)^{-1}X'y = \hat\delta.$$

Further, as shown in Problem E.3, the residuals are the same as from the regression of $y$ on $X$, and so the error variance estimate, $\hat\sigma^2$, is the same. Therefore, the estimated variance matrix is

$$\hat\sigma^2[(XG^{-1})'(XG^{-1})]^{-1} = \hat\sigma^2 G(X'X)^{-1}G',$$

which is the proper estimate of the expression in part (ii).
(iv) It is easily seen by matrix multiplication that choosing

$$G = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & \cdots & 0 & 1 & 0 \\ c_1 & c_2 & \cdots & c_{k-1} & c_k \end{pmatrix}$$

does the trick: if $\delta = G\beta$ then $\delta_j = \beta_j$, $j = 1, \ldots, k-1$, and $\delta_k = c_1\beta_1 + c_2\beta_2 + \cdots + c_k\beta_k$.
(v) Straightforward matrix multiplication shows that, for the suggested choice of $G^{-1}$, $G^{-1}G = I_k$. Also by multiplication, it is easy to see that, for each $t$,

$$x_t G^{-1} = [\,x_{t1} - (c_1/c_k)x_{tk},\; x_{t2} - (c_2/c_k)x_{tk},\; \ldots,\; x_{t,k-1} - (c_{k-1}/c_k)x_{tk},\; x_{tk}/c_k\,].$$

E.5 (i) By plugging in for $y$, we can write

$$\tilde\beta = (Z'X)^{-1}Z'y = (Z'X)^{-1}Z'(X\beta + u) = \beta + (Z'X)^{-1}Z'u.$$

Now we use the fact that $Z$ is a function of $X$ to pull $Z$ outside of the conditional expectation:

$$E(\tilde\beta \mid X) = \beta + E[(Z'X)^{-1}Z'u \mid X] = \beta + (Z'X)^{-1}Z'E(u \mid X) = \beta.$$

(ii) We start from the same representation in part (i): $\tilde\beta = \beta + (Z'X)^{-1}Z'u$, and so

$$Var(\tilde\beta \mid X) = (Z'X)^{-1}Z'[Var(u \mid X)]Z[(Z'X)^{-1}]' = (Z'X)^{-1}Z'(\sigma^2 I_n)Z(X'Z)^{-1} = \sigma^2(Z'X)^{-1}Z'Z(X'Z)^{-1}.$$

A common mistake is to forget to transpose the matrix $Z'X$ in the last term.
(iii) The estimator $\tilde\beta$ is linear in $y$ and, as shown in part (i), it is unbiased (conditional on $X$). Because the Gauss-Markov assumptions hold, the OLS estimator, $\hat\beta$, is best linear unbiased. In particular, its variance-covariance matrix is "smaller" (in the matrix sense) than $Var(\tilde\beta \mid X)$. Therefore, we prefer the OLS estimator.
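The rescaling result in Problem E.3 (regressing y on XA with a diagonal A divides each coefficient by the corresponding a_j) can be checked numerically. The following is only a sketch: the data, the diagonal entries, and the helper names `solve` and `ols` are made up for illustration and are not part of the manual.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [v - f * w for v, w in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(X, y):
    """Return the OLS coefficients (X'X)^{-1} X'y, where X is a list of rows."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical data: an intercept and two regressors.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [1.0, 2.5, 2.0, 4.5, 4.0, 6.5]
X = [[1.0, u, v] for u, v in zip(x1, x2)]

# A = diag(1, a_2, a_3); the negative entry illustrates the sign flip in part (vi).
a_diag = [1.0, 10.0, -0.5]
Z = [[value * scale for value, scale in zip(row, a_diag)] for row in X]

beta_hat = ols(X, y)      # coefficients from the regression of y on X
beta_tilde = ols(Z, y)    # coefficients from the regression of y on XA
rescaled = [b / s for b, s in zip(beta_hat, a_diag)]  # A^{-1} beta_hat

print(beta_tilde)
print(rescaled)
```

The two printed vectors should agree up to floating-point rounding, which is the content of part (iv).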
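Econometrics summary: chapter-by-chapter notes on Wooldridge
Asymptotics
If OLS is not unbiased, consistency is the minimum requirement for an estimator. Consistency means that, as the sample size grows without bound, the distribution of the estimator collapses onto the true parameter value. Under the first four assumptions, the OLS estimators are consistent. If the zero conditional mean assumption is weakened to the two conditions E(u) = 0 and Cov(x, u) = 0 (u has zero mean and is uncorrelated with each x), OLS is still consistent even though the zero conditional mean assumption fails and OLS is no longer unbiased.
If any x is correlated with u, OLS is inconsistent. Omitting a variable x2 that is correlated with x1 also causes inconsistency. If the omitted variable is uncorrelated with every included variable, there is no inconsistency. If x1 is correlated with u, but x1 and u are both uncorrelated with the other explanatory variables, only the estimator of the coefficient on x1 is inconsistent.
A nonnormal population does not affect unbiasedness or the BLUE property, but exact t and F inference requires the normality assumption (the sixth assumption).
With a large enough sample, however, the central limit theorem implies that the OLS estimators are asymptotically normally distributed.
This result relies on homoskedasticity and the zero conditional mean assumption.
Under these assumptions the OLS estimators also have the smallest asymptotic variance.
Dummy variables
Dummy variables measure qualitative information. Coding a dummy variable as 0 and 1 gives its coefficient a natural interpretation. Including two complementary dummy variables in an equation that also has an intercept creates the dummy variable trap, which results in perfect collinearity. The category whose dummy is left out of the model is usually called the base group.
Intercept dummy variable: the dummy enters on its own as a regressor with its own coefficient.
Graphically it appears only as an intercept shift: the line moves in parallel.
If male = 1 for men, the intercept for women is α and the intercept for men is γ + α.
Slope dummy variable: the dummy enters as an interaction with another explanatory variable.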
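The intercept-shift interpretation and the dummy variable trap described above can be illustrated with a few lines of arithmetic. This is a minimal sketch with invented wage data; the variable names are hypothetical. It regresses wage on a single male dummy and compares α̂ and α̂ + γ̂ with the two group means.

```python
# Hypothetical data: male = 1 for men, 0 for women (women are the base group).
male = [1, 0, 1, 1, 0, 0, 1, 0]
wage = [12.0, 9.5, 14.0, 11.0, 10.0, 8.5, 13.0, 10.0]
n = len(wage)

# The complementary dummy. With an intercept, male + female equals the constant
# column of ones in every row, which is exactly the perfect collinearity of the trap.
female = [1 - m for m in male]
trap = all(m + f == 1 for m, f in zip(male, female))

# Simple regression wage = alpha + gamma * male + u, using the closed-form OLS slope.
mbar = sum(male) / n
wbar = sum(wage) / n
gamma = (sum((m - mbar) * (w - wbar) for m, w in zip(male, wage))
         / sum((m - mbar) ** 2 for m in male))
alpha = wbar - gamma * mbar

# Group means for comparison.
female_mean = sum(w for m, w in zip(male, wage) if m == 0) / female.count(1)
male_mean = sum(w for m, w in zip(male, wage) if m == 1) / male.count(1)

print(trap)                       # True: the three columns are linearly dependent
print(alpha, female_mean)         # intercept for the base group (women)
print(alpha + gamma, male_mean)   # intercept for men
```

With these numbers the base-group intercept equals the women's average wage (9.5) and the coefficient on the dummy equals the difference in group means (3.0).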
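Introductory Econometrics, Wooldridge, 4th edition: notes and solutions to problems (Chapters 2-8)
$E(u \mid inc) = E(\sqrt{inc}\, e \mid inc) = \sqrt{inc}\, E(e \mid inc) = \sqrt{inc} \cdot 0 = 0$.
$Var(u \mid inc) = Var(\sqrt{inc}\, e \mid inc) = (\sqrt{inc})^2\, Var(e \mid inc) = \sigma_e^2\, inc$.
(iii) Low-income families have less flexibility in their spending, because they must first pay for necessities such as food, clothing, housing, and transportation. High-income families have more discretion: some choose more consumption, while others choose more saving. This greater discretion suggests that saving varies more among high-income families.
(iii) In the equation from part (ii), what should the sign of β1 be if the preparation course is effective? (iv) In the equation from part (ii), how should β0 be interpreted?
Answer: (i) To construct the experiment, first assign the number of hours in the preparation course at random, so that time spent in the course is independent of the other factors that affect SAT performance. Then collect SAT data for each student in the experiment, giving a sample {(sat_i, hour_i): i = 1, ..., n}, where n is the number of students included in the experiment.
Therefore, fitted GPA = 0.5681 + 0.1022 ACT. The intercept has no useful interpretation here, because ACT is not close to zero anywhere in the sample. If the ACT score increases by 5 points, predicted GPA increases by 0.1022 × 5 = 0.511. (ii) The fitted values and residuals for each observation are given in Table 2-3:
[Table 2-3: columns i, GPA, fitted GPA, residual]
7. Using the data of Kiel and McClain (1995) on houses sold during 1988 in Andover, Massachusetts, the following equation relates housing price (price) to the distance from a recently built garbage incinerator (dist):
fitted log(price) = 9.40 + 0.312 log(dist), n = 135, R^2 = 0.162
$y = (\alpha_0 + \beta_0) + \beta_1 x + (u - \alpha_0)$
Let the new error term be $e = u - \alpha_0$, so that $E(e) = 0$. The new intercept is $\alpha_0 + \beta_0$, and the slope is unchanged at $\beta_1$.
2. The following table contains the ACT scores and GPA (grade point average) for 8 students. GPA is based on a four-point scale and is rounded to one decimal place.
[Table: columns student (1 through 8), GPA, ACT]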
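Wooldridge, Introductory Econometrics
Abstract:
1. Introduction: (1) basic concepts of econometrics; (2) research methods and fields of application
2. Foundations of probability and mathematical statistics: (1) random variables and probability distributions; (2) expected value and variance; (3) sampling distributions and hypothesis testing
3. Linear regression analysis: (1) specifying and estimating the regression equation; (2) significance tests for regression coefficients; (3) diagnosing and correcting the regression model
4. Multiple linear regression analysis: (1) setting up the multiple linear regression model; (2) solving the multiple linear regression; (3) significance tests in multiple linear regression
5. Time series analysis: (1) basic concepts and features of time series; (2) determining stationarity and transforming series; (3) building time series models and forecasting
6. Nonparametric statistical methods: (1) basic ideas and methods of nonparametric tests; (2) nonparametric regression and interpolation; (3) strengths, weaknesses, and use cases of nonparametric methods
7. Econometrics in practice: (1) case studies of econometrics in China's economic development; (2) applications in international trade, finance, and the environment; (3) the role of econometrics in evaluating and making policy
8. Assessment of and lessons from Wooldridge's Introductory Econometrics: (1) structure and content of the textbook; (2) the book's influence in China; (3) lessons for econometrics education in China
Main text: Econometrics is the discipline that applies probability theory, statistics, and mathematics to study economic phenomena and the regularities behind them.
In modern economics, econometrics has become an important branch that is widely used in research, teaching, and practice.
Wooldridge's Introductory Econometrics presents the basic principles, methods, and applications of econometrics systematically, giving readers both theoretical guidance and practical experience.
The book first introduces the basic concepts and research methods of econometrics.
These methods include empirical analysis, theoretical analysis, and approaches that combine the two.
The scope covers macroeconomic, microeconomic, and policy-evaluation questions.
The book also briefly reviews probability and mathematical statistics, which lays the groundwork for the later chapters.
In the part on probability and mathematical statistics, the book explains random variables, probability distributions, expected values, and variances, as well as sampling distributions and hypothesis testing.
This material provides the theoretical support for the regression analysis that follows.
Linear regression analysis is one of the core topics of econometrics.
The book covers specifying and estimating the regression equation, testing the significance of regression coefficients, and diagnosing and correcting regression models.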
Wooldridge econometrics lecture notes 3
Alternate form of the White test
Consider that the fitted values from OLS, ŷ, are a function of all the x's.
Thus, ŷ^2 will be a function of the squares and cross products, and ŷ and ŷ^2 can proxy for all of the x_j, x_j^2, and x_j x_h.
So, regress the squared residuals on ŷ and ŷ^2 and use the R^2 to form an F or LM statistic.
Note that we are now testing only 2 restrictions.
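The steps on this slide can be sketched in plain Python. Everything below is an illustration under stated assumptions: the simulated data (with error variance that grows with x), the seed, and the helper names `solve` and `ols` are invented here, not taken from the lecture.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [v - f * w for v, w in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(X, y):
    """Return the OLS coefficients (X'X)^{-1} X'y, where X is a list of rows."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


# Simulated data (an assumption): the error standard deviation is proportional to x.
rng = random.Random(0)
n = 200
x = [rng.uniform(1.0, 5.0) for _ in range(n)]
y = [1.0 + 2.0 * xi + xi * rng.gauss(0.0, 1.0) for xi in x]

# Step 1: OLS of y on x; keep the fitted values and squared residuals.
b = ols([[1.0, xi] for xi in x], y)
yhat = [b[0] + b[1] * xi for xi in x]
u2 = [(yi - fi) ** 2 for yi, fi in zip(y, yhat)]

# Step 2: regress the squared residuals on yhat and yhat^2 and compute the R-squared.
g = ols([[1.0, fi, fi ** 2] for fi in yhat], u2)
fit = [g[0] + g[1] * fi + g[2] * fi ** 2 for fi in yhat]
mean_u2 = sum(u2) / n
ssr = sum((ui - fi) ** 2 for ui, fi in zip(u2, fit))
sst = sum((ui - mean_u2) ** 2 for ui in u2)
r2 = 1.0 - ssr / sst

# Step 3: LM = n * R^2 (compare with a chi-square with 2 df; the 5% critical value
# is about 5.99), or the F form with 2 and n - 3 degrees of freedom.
lm = n * r2
f_stat = (r2 / 2) / ((1.0 - r2) / (n - 3))
print(r2, lm, f_stat)
```

Both statistics test the same two restrictions (zero coefficients on ŷ and ŷ^2), which is why the degrees of freedom are 2 regardless of how many x's are in the original model.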
Robust Standard Errors (cont)
It is important to remember that these robust standard errors have only asymptotic justification. With small sample sizes, t statistics formed with robust standard errors will not have a distribution close to the t, and inferences will not be correct.
In Stata, robust standard errors are easily obtained using the robust option of reg.
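For a simple regression, the heteroskedasticity-robust variance of the slope has a short closed form, sum of (x_i − x̄)^2 û_i^2 divided by SST_x^2, which can be compared with the usual σ̂^2 / SST_x. The sketch below uses five invented observations purely so the arithmetic is easy to follow; as the slide warns, a sample this small gives no basis for robust inference. The version shown has no degrees-of-freedom adjustment (often labeled HC0); software may rescale it, so output need not match exactly.

```python
# Hypothetical data, chosen only to make the arithmetic easy to follow by hand.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 3.0, 2.0, 5.0, 4.0]
n = len(x)

xbar = sum(x) / n
ybar = sum(y) / n
sst_x = sum((xi - xbar) ** 2 for xi in x)

# OLS slope, intercept, and residuals.
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sst_x
b0 = ybar - b1 * xbar
resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]

# Usual variance estimate: valid under homoskedasticity.
sigma2_hat = sum(u ** 2 for u in resid) / (n - 2)
var_usual = sigma2_hat / sst_x

# Heteroskedasticity-robust variance estimate (no degrees-of-freedom adjustment).
var_robust = sum((xi - xbar) ** 2 * u ** 2 for xi, u in zip(x, resid)) / sst_x ** 2

print(b1, var_usual ** 0.5, var_robust ** 0.5)
```

Here the slope is 0.8, the usual variance estimate is 1.2/10 = 0.12, and the robust one is 4.16/100 = 0.0416; the two need not be ordered in any particular way in general.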
Weighted Least Squares
While it's always possible to estimate robust standard errors for OLS estimates, if we know something about the specific form of the heteroskedasticity, we can obtain more efficient estimates than OLS.
The basic idea is to transform the model into one that has homoskedastic errors; this is called weighted least squares.
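The transformation idea can be sketched as follows, assuming (purely for illustration) that Var(u | x) = σ^2 h(x) with h(x) = x known. Dividing every term of the equation, including the intercept, by the square root of h gives a model with homoskedastic errors; OLS on the transformed data is the same as minimizing the sum of squared residuals weighted by 1/h. The data and the helper names `solve` and `ols` are hypothetical.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [v - f * w for v, w in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(X, y):
    """Return the OLS coefficients (X'X)^{-1} X'y, where X is a list of rows."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical data; the variance function h(x) = x is an assumption of the example.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.9, 5.3, 6.6, 9.8, 10.1, 14.9, 13.2, 19.5]
h = list(x)

# (a) Transform the model: divide the constant, x, and y by sqrt(h), then run OLS.
X_star = [[1.0 / math.sqrt(hi), xi / math.sqrt(hi)] for xi, hi in zip(x, h)]
y_star = [yi / math.sqrt(hi) for yi, hi in zip(y, h)]
b_transformed = ols(X_star, y_star)

# (b) Equivalent view: solve the normal equations with weights w_i = 1 / h_i.
w = [1.0 / hi for hi in h]
sw = sum(w)
swx = sum(wi * xi for wi, xi in zip(w, x))
swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
swy = sum(wi * yi for wi, yi in zip(w, y))
swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
b_weighted = solve([[sw, swx], [swx, swxx]], [swy, swxy])

print(b_transformed)
print(b_weighted)
```

The two coefficient vectors should coincide up to rounding, which shows that "weighted" and "transformed" least squares are the same estimator. Note that the transformed regression has no separate constant column: the original intercept becomes the coefficient on 1/sqrt(h).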
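Econometrics (Wooldridge, 5th edition, Chinese edition): answers
(iii) Given the potential for confounding factors, some of which are listed in part (ii), finding a negative correlation would not be strong evidence that smaller class sizes actually lead to better performance.
Some way of controlling for the confounding factors is needed, and this is the subject of multiple regression analysis.
1.2 (i) Here is one way to pose the question: if two firms, say A and B, are identical in all respects except that firm A supplies one more hour of job training per worker than firm B, by how much would firm A's output differ from firm B's?
(ii) Firms are likely to choose job training depending on the characteristics of workers.
Some observed characteristics are years of schooling, years in the workforce, and experience in a particular job.
Firms might even discriminate based on age, gender, or race.
Perhaps firms choose to offer training to more or less able workers, where "ability" might be difficult to quantify but where a manager has some idea about the relative abilities of different employees.
Moreover, different kinds of workers might be attracted to firms that offer more job training on average, and this might not be evident to employers.
(iii) The amount of capital and technology available to workers would also affect output.
So, two firms with exactly the same kinds of employees would generally have different outputs if they use different amounts of capital or technology.
The quality of managers would also have an effect.
(iv) No, unless the amount of training is randomly assigned.
The many factors listed in parts (ii) and (iii) can contribute to finding a positive correlation between output and training even if job training does not improve worker productivity.
1.3 It does not make sense to pose the question in terms of causality.
Economists would assume that students choose a mix of studying and working (and other activities, such as attending class, leisure, and sleeping) based on rational behavior, such as maximizing utility subject to the constraint that there are only 168 hours in a week.
We can then use statistical methods to measure the association between studying and working, including regression analysis, which we cover starting in Chapter 2.
But we would not be claiming that one variable "causes" the other.
They are both choice variables of the student.
Chapter 2: solutions to problems
2.1 (i) Income, age, and family background (such as number of siblings) are just a few possibilities.
It seems that each of these could be correlated with years of education.
(Income and education are probably positively correlated; age and education may be negatively correlated because women in more recent cohorts have, on average, more education; and number of siblings and education are probably negatively correlated.)
(ii) Not if the factors we listed in part (i) are correlated with educ.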
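Wooldridge, Econometrics, Chapter 3

Question 1: Why use multiple regression instead of simple regression?
Answer: 3.1 Motivation for multiple regression (the simple-regression assumption is fragile; more flexible functional forms)

Question 2: How is the multiple regression estimated?
3.2 Mechanics of OLS: minimizing the sum of squared residuals; method of moments estimation

Question 3: How are the multiple regression estimates interpreted?
3.2 Interpreting OLS: (1) partial effects; (2) holding other factors fixed; (3) after partialling out the effects of the other variables:
$\hat\beta_1 = \Big(\sum_{i=1}^{n} \hat r_{i1} y_i\Big) \Big/ \Big(\sum_{i=1}^{n} \hat r_{i1}^2\Big)$  (3.22)
$\hat\beta_1 = \beta_1 + \Big(\sum_{i=1}^{n} \hat r_{i1} u_i\Big) \Big/ \Big(\sum_{i=1}^{n} \hat r_{i1}^2\Big)$  (3.62)

Question 4: What properties does OLS have?
Algebraic properties: 3.2. Small-sample properties (advantages): 3.3 to 3.5.
Three algebraic properties:
$\sum_{i=1}^{n} \hat u_i = 0$
$\sum_{i=1}^{n} x_i \hat u_i = 0$, $\sum_{i=1}^{n} \hat y_i \hat u_i = 0$
The point of sample means always lies on the regression line.

3.3 Expected value of OLS: unbiasedness
The four assumptions for unbiasedness: MLR.1: linear in parameters. MLR.2: random sampling. MLR.3: no perfect collinearity. MLR.4: zero conditional mean. Unbiasedness is a property of the procedure.

3.4 Variance of OLS
MLR.5: homoskedasticity. Components of the variance:
$Var(\hat\beta_j) = \dfrac{\sigma^2}{SST_j\,(1 - R_j^2)}$

Question 5: What are the consequences of misspecifying the model?
Omitted variables:
$\tilde\beta_j = \hat\beta_j + \hat\beta_k \tilde\delta_j$  (3.63)

3.5 Efficiency of OLS
Gauss-Markov assumptions -> Gauss-Markov theorem

Estimating the standard error: $se(\hat\beta_j) = \hat\sigma \big/ \sqrt{SST_j\,(1 - R_j^2)}$, where $\hat\sigma^2$ is the unbiased estimator of $\sigma^2$.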
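The omitted-variable relationship in (3.63) is an exact algebraic identity in any sample, so it can be verified directly: the slope from the short regression equals the long-regression slope plus the omitted variable's coefficient times the slope from regressing the omitted variable on the included one. The sketch below uses invented data and hypothetical helper names (`solve`, `ols`).

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [v - f * w for v, w in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(X, y):
    """Return the OLS coefficients (X'X)^{-1} X'y, where X is a list of rows."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical data: y depends on x1 and x2, and x2 is correlated with x1.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [1.0, 2.5, 2.0, 4.5, 4.0, 6.5]

# Long regression: y on (1, x1, x2).
b_long = ols([[1.0, u, v] for u, v in zip(x1, x2)], y)

# Short regression that omits x2: y on (1, x1).
b_short = ols([[1.0, u] for u in x1], y)

# Auxiliary regression of the omitted variable on the included one: x2 on (1, x1).
d = ols([[1.0, u] for u in x1], x2)

# Identity: short slope = long slope on x1 + (coefficient on x2) * (auxiliary slope).
implied_short_slope = b_long[1] + b_long[2] * d[1]
print(b_short[1], implied_short_slope)
```

The two printed numbers should agree up to rounding. The sign of the bias in the short regression is therefore the sign of the product of the coefficient on the omitted variable and the auxiliary slope.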
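Wooldridge, Introductory Econometrics
Abstract: 1. Overview of Wooldridge's Introductory Econometrics. 2. The multiple linear regression model and its assumptions. 3. The Gauss-Markov assumptions. 4. Answers to the book's end-of-chapter problems. 5. Summary.
Main text: Econometrics is the discipline that, building on economic theory, uses mathematical and statistical methods to quantify the relationships among economic variables by constructing econometric models.
Wooldridge's Introductory Econometrics is a classic textbook in the field and is widely read and used.
This article discusses an overview of the book, the multiple linear regression model and its assumptions, the Gauss-Markov assumptions, and the answers to the book's end-of-chapter problems.
Overview of Wooldridge's Introductory Econometrics
Introductory Econometrics is an econometrics textbook written by Wooldridge, now in its 6th edition.
The book aims to give readers a complete, systematic body of econometric knowledge and to help them understand and master the basic concepts, theory, and methods of econometrics.
The book is organized into three parts: regression analysis with cross-sectional data (including multiple regression analysis), regression analysis with time series data, and advanced topics (including panel data methods).
Each part combines theory with applied examples, so readers can both understand and apply the material.
The multiple linear regression model and its assumptions
The multiple linear regression model is one of the most commonly used models in econometrics; it is used to analyze the relationship between several independent variables and a dependent variable.
The book treats the model in detail, including model specification, parameter estimation, and hypothesis testing.
It also states the assumptions of the multiple linear regression model, which are known as the Gauss-Markov assumptions.
The Gauss-Markov assumptions
The Gauss-Markov assumptions for the multiple linear regression model consist of the following five assumptions: 1. Linear in parameters: the dependent variable is a linear function of the parameters plus an error term.
2. Random sampling: the data are a random sample from the population.
3. No perfect collinearity: no independent variable is constant, and there are no exact linear relationships among the independent variables.
4. Zero conditional mean: the error term has an expected value of zero given any values of the independent variables. 5. Homoskedasticity: the error term has the same variance given any values of the independent variables.
Under the first four assumptions the OLS estimators are unbiased, and under all five they have the smallest variance among linear unbiased estimators. Normality of the errors is not one of the Gauss-Markov assumptions; it is added only for exact inference.
Answers to the end-of-chapter problems
Every chapter of Introductory Econometrics comes with detailed end-of-chapter problems that help readers consolidate and test what they have learned.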
Wooldridge, Introductory Econometrics, 3rd edition, instructor's manual: CHAPTER
CHAPTER 20XXXX

TEACHING NOTES

Because of its realism and its care in stating assumptions, this chapter puts a somewhat heavier burden on the instructor and student than traditional treatments of time series regression. Nevertheless, I think it is worth it. It is important that students learn that there are potential pitfalls inherent in using regression with time series data that are not present for cross-sectional applications. Trends, seasonality, and high persistence are ubiquitous in time series data. By this time, students should have a firm grasp of multiple regression mechanics and inference, and so you can focus on those features that make time series applications different from cross-sectional ones.

I think it is useful to discuss static and finite distributed lag models at the same time, as these at least have a shot at satisfying the Gauss-Markov assumptions. Many interesting examples have distributed lag dynamics. In discussing the time series versions of the CLM assumptions, I rely mostly on intuition. The notion of strict exogeneity is easy to discuss in terms of feedback. It is also pretty apparent that, in many applications, there are likely to be some explanatory variables that are not strictly exogenous. What the student should know is that, to conclude that OLS is unbiased (as opposed to consistent), we need to assume a very strong form of exogeneity of the regressors. Chapter 20XXXX shows that only contemporaneous exogeneity is needed for consistency.

Although the text is careful in stating the assumptions, in class, after discussing strict exogeneity, I leave the conditioning on X implicit, especially when I discuss the no serial correlation assumption. As this is a new assumption I spend some time on it. (I also discuss why we did not need it for random sampling.)

Once the unbiasedness of OLS, the Gauss-Markov theorem, and the sampling distributions under the classical linear model assumptions have been covered, which can be done rather quickly, I focus on applications. Fortunately, the students already know about logarithms and dummy variables. I treat index numbers in this chapter because they arise in many time series examples.

A novel feature of the text is the discussion of how to compute goodness-of-fit measures with a trending or seasonal dependent variable. While detrending or deseasonalizing y is hardly perfect (and does not work with integrated processes), it is better than simply reporting the very high R-squareds that often come with time series regressions with trending variables.

SOLUTIONS TO PROBLEMS

20XXXX.1 (i) Disagree. Most time series processes are correlated over time, and many of them strongly correlated. This means they cannot be independent across observations, which simply represent different time periods. Even series that do appear to be roughly uncorrelated, such as stock returns, do not appear to be independently distributed, as you will see in Chapter 20XXXX under dynamic forms of heteroskedasticity.
(ii) Agree. This follows immediately from Theorem 20XXXX.1. In particular, we do not need the homoskedasticity and no serial correlation assumptions.
(iii) Disagree. Trending variables are used all the time as dependent variables in a regression model. We do need to be careful in interpreting the results because we may simply find a spurious association between y_t and trending explanatory variables. Including a trend in the regression is a good idea with trending dependent or independent variables. As discussed in Section 20XXXX.5, the usual R-squared can be misleading when the dependent variable is trending.
(iv) Agree.
With annual data, each time period represents a year and is not associated with any season.

20XXXX.2 We follow the hint and write

$$gGDP_{t-1} = \alpha_0 + \delta_0\, int_{t-1} + \delta_1\, int_{t-2} + u_{t-1},$$

and plug this into the right-hand side of the $int_t$ equation:

$$int_t = \gamma_0 + \gamma_1(\alpha_0 + \delta_0\, int_{t-1} + \delta_1\, int_{t-2} + u_{t-1} - 3) + v_t = (\gamma_0 + \gamma_1\alpha_0 - 3\gamma_1) + \gamma_1\delta_0\, int_{t-1} + \gamma_1\delta_1\, int_{t-2} + \gamma_1 u_{t-1} + v_t.$$

Now by assumption, $u_{t-1}$ has zero mean and is uncorrelated with all right-hand-side variables in the previous equation, except itself of course. So

$$Cov(int_t, u_{t-1}) = E(int_t \cdot u_{t-1}) = \gamma_1 E(u_{t-1}^2) > 0$$

because $\gamma_1 > 0$. If $\sigma_u^2 = E(u_t^2)$ for all $t$ then $Cov(int_t, u_{t-1}) = \gamma_1\sigma_u^2$. This violates the strict exogeneity assumption, TS.2. While $u_t$ is uncorrelated with $int_t$, $int_{t-1}$, and so on, $u_t$ is correlated with $int_{t+1}$.

20XXXX.3 Write

$$y^* = \alpha_0 + (\delta_0 + \delta_1 + \delta_2)z^* = \alpha_0 + LRP \cdot z^*,$$

and take the change: $\Delta y^* = LRP \cdot \Delta z^*$.

20XXXX.4 We use the R-squared form of the F statistic (and ignore the information on the adjusted R-squared). The 20XXXX% critical value with 3 and 20XXXX4 degrees of freedom is about 2.20XXXX (using 20XXXX0 denominator df in Table G.3a). The F statistic is

F = [(.320XXXX − .281)/(1 − .320XXXX)](20XXXX4/3) ≈ 1.43,

which is well below the 20XXXX% cv. Therefore, the event indicators are jointly insignificant at the 20XXXX% level. This is another example of how the (marginal) significance of one variable (afdec6) can be masked by testing it jointly with two very insignificant variables.

20XXXX.5 The functional form was not specified, but a reasonable one is

$$\log(hsestrts_t) = \alpha_0 + \alpha_1 t + \delta_1 Q2_t + \delta_2 Q3_t + \delta_3 Q4_t + \beta_1\, int_t + \beta_2 \log(pcinc_t) + u_t,$$

where $Q2_t$, $Q3_t$, and $Q4_t$ are quarterly dummy variables (the omitted quarter is the first) and the other variables are self-explanatory. The inclusion of the linear time trend allows the dependent variable and $\log(pcinc_t)$ to trend over time ($int_t$ probably does not contain a trend), and the quarterly dummies allow all variables to display seasonality. The parameter $\beta_2$ is an elasticity and 20XXXX0·$\beta_1$ is a semi-elasticity.

20XXXX.6 (i) Given $\delta_j = \gamma_0 + \gamma_1 j + \gamma_2 j^2$ for $j = 0, 1, \ldots, 4$, we can write

$$y_t = \alpha_0 + \gamma_0 z_t + (\gamma_0 + \gamma_1 + \gamma_2)z_{t-1} + (\gamma_0 + 2\gamma_1 + 4\gamma_2)z_{t-2} + (\gamma_0 + 3\gamma_1 + 9\gamma_2)z_{t-3} + (\gamma_0 + 4\gamma_1 + 16\gamma_2)z_{t-4} + u_t$$
$$= \alpha_0 + \gamma_0(z_t + z_{t-1} + z_{t-2} + z_{t-3} + z_{t-4}) + \gamma_1(z_{t-1} + 2z_{t-2} + 3z_{t-3} + 4z_{t-4}) + \gamma_2(z_{t-1} + 4z_{t-2} + 9z_{t-3} + 16z_{t-4}) + u_t.$$

(ii) This is suggested in part (i). For clarity, define three new variables: $z_{t0} = (z_t + z_{t-1} + z_{t-2} + z_{t-3} + z_{t-4})$, $z_{t1} = (z_{t-1} + 2z_{t-2} + 3z_{t-3} + 4z_{t-4})$, and $z_{t2} = (z_{t-1} + 4z_{t-2} + 9z_{t-3} + 16z_{t-4})$. Then, $\alpha_0$, $\gamma_0$, $\gamma_1$, and $\gamma_2$ are obtained from the OLS regression of $y_t$ on $z_{t0}$, $z_{t1}$, and $z_{t2}$, $t = 1, 2, \ldots, n$. (Following our convention, we let $t = 1$ denote the first time period where we have a full set of regressors.) The $\hat\delta_j$ can be obtained from $\hat\delta_j = \hat\gamma_0 + \hat\gamma_1 j + \hat\gamma_2 j^2$.
(iii) The unrestricted model is the original equation, which has six parameters ($\alpha_0$ and the five $\delta_j$). The PDL model has four parameters. Therefore, there are two restrictions imposed in moving from the general model to the PDL model. (Note how we do not have to actually write out what the restrictions are.) The df in the unrestricted model is $n - 6$. Therefore, we would obtain the unrestricted R-squared, $R_{ur}^2$, from the regression of $y_t$ on $z_t, z_{t-1}, \ldots, z_{t-4}$ and the restricted R-squared, $R_r^2$, from the regression in part (ii). The F statistic is

$$F = \frac{(R_{ur}^2 - R_r^2)}{(1 - R_{ur}^2)} \cdot \frac{(n - 6)}{2}.$$

Under $H_0$ and the CLM assumptions, $F \sim F_{2,\,n-6}$.

20XXXX.7 (i) $pe_{t-1}$ and $pe_{t-2}$ must be increasing by the same amount as $pe_t$.
(ii) The long-run effect, by definition, should be the change in gfr when pe increases permanently. But a permanent increase means the level of pe increases and stays at the new level, and this is achieved by increasing $pe_{t-2}$, $pe_{t-1}$, and $pe_t$ by the same amount.

SOLUTIONS TO COMPUTER EXERCISES

C20XXXX.1 Let post79 be a dummy variable equal to one for years after 20XXXX0XX9, and zero otherwise.
Adding post79 to equation20XXXX.20XXXX) gives3t i= 1.30 + .620XXXX inf t+ .363 def t+ 1.56 post79t(0.43) (.20XXXX6) (.120XX)(0.51)n = 56, R2 = .664, 2R = .644.The coefficient on post79is statistically significant (t statistic 3.06) and economically large: accounting for inflation and deficits, i3was about 1.56 points higher on average in years after 20XXXX0XX9. The coefficient on def falls once post79 is included in the regression.C20XXXX.2 (i) Adding a linear time trend to (20XXXX.22) giveslog()chnimp = 2.37 .686 log(chempi)+ .466 log(gas) + .20XXXX8 log(rtwex)(20XX.78) (1.240) (.876) (.472)+ .20XXXX0 befile6+ .20XXXX0XX affile6.351 afdec6+ .020XXXX t(.251) (.257) (.282) (.020XXXX) n = 20XXXX1, R2 = .362, 2R = .325.Only the trend is statistically significant. In fact, in addition to the time trend, which has a t statistic over three, only afdec6 has a t statistic bigger than one in absolute value. Accounting for a linear trend has important effects on the estimates.(ii) The F statistic for joint significance of all variables except the trend and intercept, of course) is about .54. The df in the F distribution are 6 and 20XXXX3. The p-value is about .78, and so the explanatory variables other than the time trend are jointly very insignificant. We would have to conclude that once a positive linear trend is allowed for, nothing else helps to explain log(chnimp). This is a problem for the original event study analysis.(iii) Nothing of importance changes. In fact, the p-value for thetest of joint significance of all variables except the trend andmonthly dummies is about .79. The 20XXXX monthly dummies themselvesare not jointly significant: p-value≈ .59.C20XXXX.3 Adding log(prgnp) to equation (20XXXX.38) giveslog()prepop= 6.66 .220XXXXtlog(mincov t) + .486 log(usgnp t) + .285 log(prgnp t)(1.26) (.20XXXX0) (.222) (.20XXXX0).20XXXX7 t(.020XXXX)n = 38, R2 = .889, 2R = .876.The coefficient on log(prgnp t) is very statistically significant (tstatistic≈ 3.56). 
Because the dependent and independent variableare in logs, the estimated elasticity of prepop with respect to prgnpis .285. Including log(prgnp) actually increases the size of the minimum wage effect: the estimated elasticity of prepop with respectto mincov is now .220XXXX, as compared with .20XXXX9 in equation(20XXXX.38).C20XXXX.4If we run the regression of gfr t on pe t, (pe t-1–pe t), (pe t-2–pe t), ww2t, and pill t, the coefficient and standard error on pe t are, rounded to four decimal places, .20XXXX20XXXX and .20XXXX20XXXX, respectively. When rounded to three decimal places weobtain .20XXXX1 and .20XXXX0, as reported in the text.C20XXXX.5(i) The coefficient on the time trend in the regression of log(uclms) on a linear time trend and 20XXXX monthly dummy variables is about .020XXXX9 (se≈ .0020XXXX), which implies that monthly unemployment claims fell by about 1.4% per month on average. The trend is very significant. There is also very strong seasonality in unemployment claims, with 6 of the 20XXXX monthly dummy variables having absolute t statistics above 2. The F statistic for joint significance of the 20XXXX monthly dummies yieldsp-value≈ .0020XXXX.(ii) When ez is added to the regression, its coefficient is about .520XXXX (se≈ .20XXXX6). Because this estimate is so large in magnitude, we use equation (7.20XXXX): unemployment claims are estimated to fall 20XXXX0[1 – exp(.520XXXX)] ≈ 39.8% after enterprise zone designation.(iii) We must assume that around the time of EZ designation there were not other external factors that caused a shift down in the trend of log(uclms). 
We have controlled for a time trend and seasonality, but this may not be enough.C20XXXX.6 (i) The regression of gfr t on a quadratic in time givesˆgfr= 20XXXX0XX.20XXXX +t.20XXXX2 t- .020XXXX0 t2(6.20XXXX) (.382)(.020XXXX1)n = 72, R2 = .320XXXX.Although t and t2 are individually insignificant, they are jointly very significant (p-value≈ .0000).(ii) Usinggfr as the dependent variable in (20XXXX.35) givestR2≈.620XXXX, compared with about .727 if we do not initially detrend. Thus, the equation still explains a fair amount of variation in gfr even after we net out the trend in computing the total variation in gfr.(iii) The coefficient and t statistic on t3are about .0020XXXX9 and .00020XXXX, respectively, which results in a very significant t statistic. It is difficult to know what to make of this. The cubic trend, like the quadratic, is not monotonic. So this almost becomes a curve-fitting exercise.C20XXXX.7 (i) The estimated equation isgc= .020XXXX1 + .571 gy tt(.0020XXXX) (.20XXXX7)n = 36, R2 = .679.This equation implies that if income growth increases by one percentage point, consumption growth increases by .571 percentage points. The coefficient on gy t is very statistically significant (t statistic 8.5).(ii) Adding gy t-1 to the equation givesgc= .020XXXX4 + .552 gy t+t.20XXXX0XX gy t-1(.020XXXX3) (.20XXXX0)(.20XXXX9)n = 35, R2 = .695.The t statistic on gy t-1 is only about 1.39, so it is not significant at the usual significance levels. (It is significant at the 20XX% level against a two-sided alternative.) In addition, the coefficient is not especially large. At best there is weak evidence of adjustment lags in consumption.(iii) If we add r3t to the model estimated in part (i) we obtaingc= .020XXXX2 + .578 gy t+t.0020XXXX1 r3t(.020XXXX0) (.20XXXX2)(.0020XXXX3)n = 36, R2 = .680.The t statistic on r3t is very small. 
The estimated coefficient is also practically small: a one-point increase in r3t reduces consumption growth by about .20XXXX1 percentage points.C20XXXX.8 (i) The estimated equation isgfr= 92.20XXXX + .20XXXX9 pe t.020XXXX0 pe t-1+ t.020XXXX4 pe t-2+ .020XXXX pe t-3+ .020XXXX pe t-4(3.33) (.20XXXX6) (.20XXXX31)(.20XXXX51) (.20XXXX4) (.20XXXX0XX)21.34 ww2t31.20XXXX pill t(20XXXX.54) (3.90)n = 68, R2 = .537, 2R = .483.The p-value for the F statistic of joint significance of pe t-3and pe t-4 is about .94, which is very weak evidence against H.(ii) The LRP and its standard error can be obtained as the coefficient and standard error on pe t in the regressiongfr t on pe t, (pe t-1–pe t), (pe t-2–pe t), (pe t-3–pe t), (pe t-4–pe t), ww2t, pill tWe get LRP≈ .20XXXX9 (se≈ .20XXXX0), which is above the estimatedLRP with only two lags (.20XXXX1). The standard errors are the same rounded to three decimal places.(iii) We estimate the PDL with the additional variables ww22 andpill t . To estimate,1, and2, we define the variablesz0t = pe t + pe t -1 + pe t -2 + pe t -3 + pe t -4z1t = pe t -1 + 2pe t -2 + 3pe t -3 + 4pe t -4z2t = pe t -1 + 4pe t -2 + 9pe t -3 + 20XXXX pe t -4.Then, run the regression gfrt t on z0t , z1t , z2t , ww2t , pill t . Using the data in FERTIL3.RAW gives (to three decimal places) 0ˆγ= .20XXXX9,1ˆγ= –.20XXXX7, 2ˆγ= .020XXXX. So 0ˆδ= 0ˆγ = .20XXXX9,1ˆδ= .20XXXX9 - .20XXXX7 + .020XXXX = .20XXXX4, 2ˆδ= .20XXXX9 –2(.20XXXX7) + 4(.020XXXX) = .020XXXX, 3ˆδ= .20XXXX9 – 3(.20XXXX7) + 9(.020XXXX) = .020XXXX, 4ˆδ= .20XXXX9 – 4(.20XXXX7) + 20XXXX(.020XXXX) = .20XXXX3. Therefore, the LRP is .20XXXX5. This is slightly above the .20XXXX9 obtained from the unrestricted model, but not much.Incidentally, the F statistic for testing the restrictions imposed by the PDL is about [(.537 - .536)/(1.537)](60/2) ≈ .20XXXX5,which is very insignificant. Therefore, the restrictions are not rejected by the data. 
Anyway, the only parameter we can estimate with any precision, the LRP, is not very different in the two models.
C20XXXX.9 (i) The sign of β2 is fairly clear-cut: as interest rates rise, stock returns fall, so β2 < 0. Higher interest rates imply that T-bill and bond investments are more attractive, and also signal a future slowdown in economic activity. The sign of β1 is less clear. While economic growth can be a good thing for the stock market, it can also signal inflation, which tends to depress stock prices.
(ii) The estimated equation is
rsp500_t = 20XXXX.84 + .20XXXX6 pcip_t - 1.36 i3_t
(3.27) (.20XXXX9) (0.54)
n = 557, R^2 = .020XXXX.
A one percentage point increase in industrial production growth is predicted to increase the stock market return by .20XXXX6 percentage points (a very small effect). On the other hand, a one percentage point increase in interest rates decreases the stock market return by an estimated 1.36 percentage points.
(iii) Only i3 is statistically significant, with t statistic ≈ -2.52.
(iv) The regression in part (ii) has nothing directly to say about predicting stock returns because the explanatory variables are dated contemporaneously with rsp500. In other words, we do not know i3_t before we know rsp500_t. What the regression in part (ii) says is that a change in i3 is associated with a contemporaneous change in rsp500.
C20XXXX.10 (i) The sample correlation between inf and def is only about .020XXXX, which is pretty small. Perhaps surprisingly, inflation and the deficit rate are practically uncorrelated over this period.
Of course, this is a good thing for estimating the effects of each variable on i3, as it implies almost no multicollinearity.
(ii) The equation with the lags is
i3_t = 1.61 + .343 inf_t + .382 inf_{t-1} - .190 def_t + .569 def_{t-1}
(0.40) (.20XXXX5) (.134) (.221) (.20XXXX0XX)
n = 55, R^2 = .685, adjusted R^2 = .660.
(iii) The estimated LRP of i3 with respect to inf is .343 + .382 = .725, which is somewhat larger than .620XXXX, which we obtain from the static model in (20XXXX.20XXXX). But the estimates are fairly close considering the size and significance of the coefficient on inf_{t-1}.
(iv) The F statistic for significance of inf_{t-1} and def_{t-1} is about 5.22, with p-value .020XXXX. So they are jointly significant at the 1% level. It seems that both lags belong in the model.
C20XXXX.11 (i) The variable beltlaw becomes one at t = 61, which corresponds to January, 20XXXX0XX6. The variable spdlaw goes from zero to one at t = 77, which corresponds to May, 20XXXX0XX7.
(ii) The OLS regression gives
log(totacc) = 20XXXX.469 + .020XXXX75 t - .20XXXX27 feb + .20XXXX20XXXX mar + .020XXXX5 apr
(.020XXXX) (.00020XXXX) (.20XXXX44) (.20XXXX44) (.20XXXX45)
+ .20XXXX21 may + .20XXXX20XXXX jun + .20XXXX76 jul + .20XXXX40 aug
(.20XXXX45) (.20XXXX45) (.20XXXX45) (.20XXXX45)
+ .20XXXX24 sep + .20XXXX21 oct + .20XXXX20XXXX nov + .20XXXX0XX2 dec
(.20XXXX45) (.20XXXX45) (.20XXXX45) (.20XXXX45)
n = 20XXXX0XX, R^2 = .720XXXX
When multiplied by 20XXXX0, the coefficient on t gives roughly the average monthly percentage growth in totacc, ignoring seasonal factors. In other words, once seasonality is eliminated, totacc grew by about .275% per month over this period, or 20XXXX(.275) = 3.3% at an annual rate.
There is pretty clear evidence of seasonality. Only February has a lower number of total accidents than the base month, January. The peak is in December: roughly, there are 9.6% more accidents in December than in January in the average year. The F statistic for joint significance of the monthly dummies is F = 5.20XXXX.
With 20XXXX and 95 df, this gives a p-value essentially equal to zero.
(iii) I will report only the coefficients on the new variables:
log(totacc) = 20XXXX.640 + … + .020XXXX33 wkends - .20XXXX20XXXX unem - .20XXXX38 spdlaw + .20XXXX54 beltlaw
(.20XXXX3) (.020XXXX78) (.020XXXX4) (.020XXXX6) (.020XXXX2)
n = 20XXXX0XX, R^2 = .920XXXX
The negative coefficient on unem makes sense if we view unem as a measure of economic activity. As economic activity increases (unem decreases), we expect more driving, and therefore more accidents. The estimate is that a one percentage point increase in the unemployment rate reduces total accidents by about 2.1%. A better economy does have costs in terms of traffic accidents.
(iv) At least initially, the coefficients on spdlaw and beltlaw are not what we might expect. The coefficient on spdlaw implies that accidents dropped by about 5.4% after the highway speed limit was increased from 55 to 65 miles per hour. There are at least a couple of possible explanations. One is that people became safer drivers after the speed limit increased, recognizing that they must be more cautious. It could also be that some other change, other than the increased speed limit or the relatively new seat belt law, caused a lower total number of accidents, and we have not properly accounted for this change.
The coefficient on beltlaw also seems counterintuitive at first. But perhaps people became less cautious once they were forced to wear seatbelts.
(v) The average of prcfat is about .886, which means, on average, slightly less than one percent of all accidents result in a fatality.
The highest value of prcfat is 1.220XXXX, which means there was one month where 1.2% of all accidents resulted in a fatality.
(vi) As in part (iii), I do not report the coefficients on the time trend and seasonal dummy variables:
prcfat = 1.20XXXX0 + … + .0020XXXX3 wkends - .020XXXX4 unem + .20XXXX71 spdlaw - .20XXXX95 beltlaw
(.20XXXX0XX) (.020XXXX20XXXX) (.020XXXX5) (.20XXXX20XXXX) (.20XXXX32)
n = 20XXXX0XX, R^2 = .720XXXX
Higher speed limits are estimated to increase the percent of fatal accidents by .20XXXX7 percentage points. This is a statistically significant effect. The new seat belt law is estimated to decrease the percent of fatal accidents by about .20XXXX, but the two-sided p-value is about .21.
Interestingly, increased economic activity also increases the percent of fatal accidents. This may be because more commercial trucks are on the roads, and these probably increase the chance that an accident results in a fatality.
C20XXXX.20XXXX (i) OLS estimation using all of the data gives
inf = 1.20XXXX + .520XXXX unem
(1.55) (.266)
n = 56, R^2 = .20XXXX2, adjusted R^2 = .20XXXX5,
so there are 56 years of data.
(ii) The estimates are similar to those in equation (20XXXX.20XXXX). Adding the extra years does not help in finding a tradeoff between inflation and unemployment. In fact, the slope estimate becomes even larger (and is still positive) in the full sample.
(iii) Using only data from 20XXXX to 20XXXX gives
inf = 4.20XXXX - .378 unem
(1.65) (.334)
n = 7, R^2 = .220XXXX, adjusted R^2 = .20XXXX4.
The equation now shows a tradeoff between inflation and unemployment: a one percentage point increase in unem is estimated to reduce inf by about .38 percentage points. Not surprisingly, with such a small sample size, the estimate is not statistically different from zero: the two-sided p-value is .31.
So, while it is tempting to think that the inflation-unemployment tradeoff reemerges in the last part of the sample, the estimates are not precise enough to draw that conclusion.
(iv) The regressions in parts (i) and (iii) are an example of this setup, with n1 = 49 and n2 = 7. The weighted average of the slopes from the two different periods is (49/56)(.468) + (7/56)(-.378) ≈ .362. But the slope estimate on the entire sample is .520XXXX. Generally, there is no simple relationship between the slope estimate on the entire sample and the slope estimates on two sub-samples.
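The weighted average in part (iv) is easy to verify by hand or with a few lines of code. The sketch below uses only the figures quoted in the solution: sub-sample sizes 49 and 7, the first-period slope .468, and the part (iii) slope, which is negative (-.378) because that equation shows a tradeoff. The variable names are illustrative.

```python
# Sub-sample sizes and slope estimates quoted in part (iv).
n1, n2 = 49, 7
slope_first = 0.468    # slope on unem in the first 49 years
slope_last = -0.378    # slope on unem in the last 7 years, from part (iii)

n = n1 + n2
weighted_avg = (n1 / n) * slope_first + (n2 / n) * slope_last
print(round(weighted_avg, 3))
```

The result differs from the full-sample slope, which is the point of the exercise: pooled OLS is not a sample-size-weighted average of the sub-sample slopes.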
Wooldridge, Chapter 3
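$R^2 = \dfrac{\left[\sum (y_i - \bar{y})(\hat{y}_i - \bar{\hat{y}})\right]^2}{\sum (y_i - \bar{y})^2 \, \sum (\hat{y}_i - \bar{\hat{y}})^2}$
(Without a constant term, in general $\sum \hat{u}_i = \sum e_i \neq 0$ and $\bar{y} \neq \bar{\hat{y}}$.)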
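The partial effects of the different x cannot be separated from one another. The number of sample observations must exceed the number of parameters.
Assumption 4. Zero conditional mean (also called strict exogeneity of X, or mean independence); the most crucial assumption. Given the explanatory variables, the error term has mean zero. That is,
$E(u_i \mid x_1, x_2, \dots, x_k) = E(u_i \mid X) = E(u_i) = 0, \quad i = 1, 2, \dots, n$
This implies the following two assumptions.
First, the conditional mean of the error, $E(u_i \mid X)$, equals its unconditional mean $E(u_i)$, so $u_i$ is mean independent of all the explanatory variables, that is, of any X. (In a cross-sectional regression, if X is nonrandom, or is held fixed across repeated samples, this condition holds automatically. If X is random, the assumption has to be stated explicitly.)
Second, $E(u_i \mid X) = E(u_i) = 0$ means that the model is correctly specified: there is no functional form misspecification, no omitted variable, and no systematic measurement error in the explanatory variables.
Reasons the assumption can fail:
① functional form misspecification
② omitted variables (the omitted variable is correlated with the other variables)
③ measurement error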
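A small simulation can make the omitted-variable failure of the zero conditional mean assumption concrete. This is a sketch under made-up assumptions, not part of the notes: the data-generating process (true coefficients 1, 2, 3 and x2 = 0.5 x1 + noise), the seed, and the sample size are all hypothetical. Leaving x2 out pushes it into the error term, which is then correlated with x1, so the short-regression slope settles near 2 + 3(0.5) = 3.5 rather than the true value 2.

```python
import random

random.seed(0)  # fixed seed so the run is reproducible
n = 5000

# Hypothetical population: y = 1 + 2*x1 + 3*x2 + u, with x2 correlated with x1.
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [0.5 * a + random.gauss(0, 1) for a in x1]
u = [random.gauss(0, 1) for _ in range(n)]
y = [1 + 2 * a + 3 * b + e for a, b, e in zip(x1, x2, u)]


def simple_slope(x, y):
    """OLS slope from a simple regression of y on x (with an intercept)."""
    xbar = sum(x) / len(x)
    ybar = sum(y) / len(y)
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    sxx = sum((a - xbar) ** 2 for a in x)
    return sxy / sxx


# Omitting x2 violates E(u | x1) = 0 for the short regression's error term.
slope_short = simple_slope(x1, y)
print(f"true effect of x1: 2.0, slope when x2 is omitted: {slope_short:.2f}")
```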
$\hat{\alpha}_1 = \dfrac{\sum (x_1 - \bar{x}_1)(y - \bar{y})}{\sum (x_1 - \bar{x}_1)^2}$
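7. The "partialling out" interpretation
Mathematical derivation.
8. The difference between simple and multiple regression
$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1$
$\hat{y} = \hat{\alpha}_0 + \hat{\alpha}_1 x_1 + \hat{\alpha}_2 x_2$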
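The two fitted equations are linked by an exact in-sample identity: the simple-regression slope equals the multiple-regression slope on x1 plus the slope on x2 times the slope from regressing x2 on x1. The sketch below checks this on a tiny hypothetical data set; the numbers and the helper `demeaned_cross` are made up for illustration, and the names `beta1`, `alpha1`, `alpha2` follow the notation of the two equations above.

```python
def demeaned_cross(a, b):
    """Sum of (a_i - mean(a)) * (b_i - mean(b))."""
    abar, bbar = sum(a) / len(a), sum(b) / len(b)
    return sum((p - abar) * (q - bbar) for p, q in zip(a, b))


# Hypothetical sample, for illustration only.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [3.1, 3.9, 7.2, 7.8, 11.1, 11.9]

s11 = demeaned_cross(x1, x1)
s22 = demeaned_cross(x2, x2)
s12 = demeaned_cross(x1, x2)
s1y = demeaned_cross(x1, y)
s2y = demeaned_cross(x2, y)

# Simple regression of y on x1.
beta1 = s1y / s11

# Multiple regression of y on x1 and x2 (closed form for two regressors).
det = s11 * s22 - s12 ** 2
alpha1 = (s22 * s1y - s12 * s2y) / det
alpha2 = (s11 * s2y - s12 * s1y) / det

# Slope from regressing x2 on x1.
delta1 = s12 / s11

print(abs(beta1 - (alpha1 + alpha2 * delta1)) < 1e-9)
```

The two slopes on x1 coincide only when `alpha2` is zero or when x1 and x2 are uncorrelated in the sample (`delta1` is zero).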
Introductory Econometrics
Motivation : Advantage
It can explain more of the variation in the dependent variable.
It can incorporate more general functional form.
If other factors affecting y are not correlated with x, then changing x leaves u unchanged, and the effect of x on y can be identified.
Multiple regression analysis is more amenable to ceteris paribus analysis because it allows us to explicitly control for many other factors that simultaneously affect the dependent variable.
Motivation: An Example
Consider a model that says family consumption is a quadratic function of family income:
$Cons = b_0 + b_1\,inc + b_2\,inc^2 + u$
Now the marginal propensity to consume is approximated by
pcolGPA: predicted values of college grade point average
hsGPA: high school GPA
ACT: achievement test score
pcolGPA = 1.29 + 0.453 hsGPA + 0.0094 ACT
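The fitted equation can be used directly for prediction and for ceteris paribus comparisons. A minimal sketch, using only the coefficients reported above; the function name and the example students (hsGPA and ACT values) are hypothetical.

```python
def predict_colgpa(hs_gpa: float, act: float) -> float:
    """Predicted college GPA from the fitted two-variable equation."""
    return 1.29 + 0.453 * hs_gpa + 0.0094 * act


# Two hypothetical students with the same ACT score but hsGPA one point apart.
student_a = predict_colgpa(3.5, 24)
student_b = predict_colgpa(2.5, 24)

# Holding ACT fixed, the predicted gap is the coefficient on hsGPA.
gap = student_a - student_b
print(round(gap, 3))
```

Because ACT is held at the same value for both students, the intercept and the ACT term cancel in the difference, which is what the ceteris paribus reading of the hsGPA coefficient means.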
$\Delta\hat{y} = \hat{b}_1 \Delta x_1$, that is, each b has a ceteris paribus interpretation
Example 3.4: Determinants of College GPA (GPA1.dta)
Two-independent-variable regression
The multiple regression model is the most widely used vehicle for empirical analysis.
Motivation: An Example
Consider a simple version of the wage equation for obtaining the effect of education on hourly wage:
Interpreting Multiple Regression
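$\hat{y} = \hat{b}_0 + \hat{b}_1 x_1 + \hat{b}_2 x_2 + \dots + \hat{b}_k x_k$, so
$\Delta\hat{y} = \hat{b}_1 \Delta x_1 + \hat{b}_2 \Delta x_2 + \dots + \hat{b}_k \Delta x_k$,
so holding $x_2, \dots, x_k$ fixed implies that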
exper: years of labor market experience
$wage = b_0 + b_1\,educ + b_2\,exper + u$
In this example experience is explicitly taken out of the error term.
the residuals from the estimated regression $\hat{x}_1 = \hat{\gamma}_0 + \hat{\gamma}_2 x_2$
A “Partialling Out” Interpretation
Regress our first independent variable x1 on our second independent variable x2 ,
Example: Determinants of College GPA
One-independent-variable regression
pcolGPA = 2.4 +0.0271ACT
The coefficient on ACT is almost three times as large.
If these two regressions were both true, they could be considered the results of two different experiments.
A “Partialling Out” Interpretation
Consider the case where $k = 2$, i.e. $\hat{y} = \hat{b}_0 + \hat{b}_1 x_1 + \hat{b}_2 x_2$; then
$\hat{b}_1 = \sum \hat{r}_{1i} y_i \big/ \sum \hat{r}_{1i}^2$, where $\hat{r}_{1i}$ are
$MPC = b_1 + 2 b_2\,inc$
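This expression is the derivative of the quadratic consumption function $Cons = b_0 + b_1\,inc + b_2\,inc^2 + u$ with respect to income, so the marginal propensity to consume depends on the income level. The sketch below illustrates that; the coefficient values and the units of income are hypothetical, picked only to show the mechanics, and the function names are made up.

```python
# Hypothetical coefficients (income measured in thousands); not estimates.
B0, B1, B2 = 5.0, 0.9, -0.002


def cons(inc: float) -> float:
    """Systematic part of the quadratic consumption function."""
    return B0 + B1 * inc + B2 * inc ** 2


def mpc(inc: float) -> float:
    """Marginal propensity to consume: d cons / d inc = b1 + 2*b2*inc."""
    return B1 + 2 * B2 * inc


# Compare the analytic MPC with a central finite difference at inc = 50.
h = 1e-3
numeric = (cons(50.0 + h) - cons(50.0 - h)) / (2 * h)
print(round(mpc(50.0), 4), round(numeric, 4))
```

With a negative `B2`, the MPC falls as income rises, which is why b1 alone cannot be read as the effect of income on consumption in this model.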
The Model with k Independent Variables
The general multiple linear regression model can be written as
$y_i = b_0 + b_1 x_{1i} + b_2 x_{2i} + \dots + b_k x_{ki} + u_i$
and then obtain the residual $\hat{r}_1$.
Then, do a simple regression of y on $\hat{r}_1$ to obtain $\hat{b}_1$.
“Partialling Out” continued
Previous equation implies that regressing y
Still need to make a zero conditional mean assumption, so now assume that $E(u \mid x_1, x_2, \dots, x_k) = 0$
Still minimizing the sum of squared residuals, so have k+1 first order conditions
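The k+1 first order conditions say that the OLS residuals sum to zero and are orthogonal to every regressor, which is equivalent to the normal equations X'Xb = X'y. The sketch below solves those equations for a small hypothetical data set (k = 2) and checks the three conditions numerically. The data, the helper names, and the use of plain Gauss-Jordan elimination are illustrative choices, not anything prescribed by the slides.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(columns, y):
    """Return the design matrix (with intercept) and the OLS coefficients."""
    x = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k1 = len(x[0])
    xtx = [[sum(row[p] * row[q] for row in x) for q in range(k1)]
           for p in range(k1)]
    xty = [sum(row[p] * yi for row, yi in zip(x, y)) for p in range(k1)]
    return x, solve(xtx, xty)


# Hypothetical sample with k = 2 regressors.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [3.1, 3.9, 7.2, 7.8, 11.1, 11.9]

x, b = ols([x1, x2], y)
resid = [yi - sum(bj * xij for bj, xij in zip(b, row))
         for row, yi in zip(x, y)]

# One condition per coefficient: sum of x_ij * residual_i should be zero.
foc = [sum(row[j] * u for row, u in zip(x, resid)) for j in range(len(b))]
print([abs(v) < 1e-8 for v in foc])
```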
Parallels with Simple Regression
b0 is still the intercept
b1 to bk are all called slope parameters
u is still the error term (or disturbance)
on x1 and x2 gives the same effect of x1 as regressing y on the residuals from a regression of x1 on x2
This means only the part of x1 that is uncorrelated with x2 is being related to y, so we’re estimating the effect of x1 on y after x2 has been “partialled out”
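The partialling-out result can be checked numerically in two steps: regress x1 on x2 and keep the residuals, then relate y to those residuals. The sketch below does this on a small hypothetical data set and compares the answer with the coefficient on x1 from the two-regressor formula. The data and helper names are made up for illustration.

```python
def mean(v):
    return sum(v) / len(v)


def demeaned_cross(a, b):
    """Sum of (a_i - mean(a)) * (b_i - mean(b))."""
    abar, bbar = mean(a), mean(b)
    return sum((p - abar) * (q - bbar) for p, q in zip(a, b))


# Hypothetical sample, for illustration only.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [3.1, 3.9, 7.2, 7.8, 11.1, 11.9]

# Step 1: regress x1 on x2 (with an intercept) and keep the residuals.
slope_12 = demeaned_cross(x1, x2) / demeaned_cross(x2, x2)
intercept_12 = mean(x1) - slope_12 * mean(x2)
r1 = [a - (intercept_12 + slope_12 * b) for a, b in zip(x1, x2)]

# Step 2: b1 = sum(r1_i * y_i) / sum(r1_i ** 2).
b1_partial = sum(r * yi for r, yi in zip(r1, y)) / sum(r * r for r in r1)

# Direct multiple-regression coefficient on x1 (two-regressor closed form).
s11, s22 = demeaned_cross(x1, x1), demeaned_cross(x2, x2)
s12 = demeaned_cross(x1, x2)
s1y, s2y = demeaned_cross(x1, y), demeaned_cross(x2, y)
b1_multiple = (s22 * s1y - s12 * s2y) / (s11 * s22 - s12 ** 2)

print(abs(b1_partial - b1_multiple) < 1e-9)
```

The residuals `r1` are, by construction, uncorrelated with x2 in the sample, which is why only the part of x1 not explained by x2 ends up driving the estimate.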
Holding Other Factors Fixed
The power of multiple regression analysis is that it allows us to do in non-experimental environments what natural scientists are able to do in a controlled laboratory setting: keep other factors fixed.
Whether the ceteris paribus effects are reliable or not depends on whether the conditional mean assumption is realistic.
Motivation: Advantage
x1 and x2 are uncorrelated in the sample
“Partialling Out” continued
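In the general model with k explanatory variables, $\hat{b}_1$ can still be written as in the equation $\hat{b}_1 = \sum_{i=1}^{n} \hat{r}_{1i} y_i \big/ \sum_{i=1}^{n} \hat{r}_{1i}^2$, but the residual $\hat{r}_1$ comes from the regression of $x_1$ on $x_2, \dots, x_k$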
3. Multiple Regression Analysis: Estimation
$y_i = b_0 + b_1 x_{1i} + b_2 x_{2i} + \dots + b_k x_{ki} + u_i$
Motivation: Advantage