伍德里奇计量经济学讲义7
计量经济学(伍德里奇第三版中文版)课后习题答案
第1章解决问题的办法1.1(一)理想的情况下,我们可以随机分配学生到不同尺寸的类。
也就是说,每个学生被分配一个不同的类的大小,而不考虑任何学生的特点,能力和家庭背景。
对于原因,我们将看到在第2章中,我们想的巨大变化,班级规模(主题,当然,伦理方面的考虑和资源约束)。
(二)呈负相关关系意味着,较大的一类大小是与较低的性能。
因为班级规模较大的性能实际上伤害,我们可能会发现呈负相关。
然而,随着观测数据,还有其他的原因,我们可能会发现负相关关系。
例如,来自较富裕家庭的儿童可能更有可能参加班级规模较小的学校,和富裕的孩子一般在标准化考试中成绩更好。
另一种可能性是,在学校,校长可能分配更好的学生,以小班授课。
或者,有些家长可能会坚持他们的孩子都在较小的类,这些家长往往是更多地参与子女的教育。
(三)鉴于潜在的混杂因素- 其中一些是第(ii)上市- 寻找负相关关系不会是有力的证据,缩小班级规模,实际上带来更好的性能。
在某种方式的混杂因素的控制是必要的,这是多元回归分析的主题。
1.2(一)这里是构成问题的一种方法:如果两家公司,说A和B,相同的在各方面比B公司à用品工作培训之一小时每名工人,坚定除外,多少会坚定的输出从B公司的不同?(二)公司很可能取决于工人的特点选择在职培训。
一些观察到的特点是多年的教育,多年的劳动力,在一个特定的工作经验。
企业甚至可能歧视根据年龄,性别或种族。
也许企业选择提供培训,工人或多或少能力,其中,“能力”可能是难以量化,但其中一个经理的相对能力不同的员工有一些想法。
此外,不同种类的工人可能被吸引到企业,提供更多的就业培训,平均,这可能不是很明显,向雇主。
(iii)该金额的资金和技术工人也将影响输出。
所以,两家公司具有完全相同的各类员工一般都会有不同的输出,如果他们使用不同数额的资金或技术。
管理者的素质也有效果。
(iv)无,除非训练量是随机分配。
许多因素上市部分(二)及(iii)可有助于寻找输出和培训的正相关关系,即使不在职培训提高工人的生产力。
伍德里奇《计量经济学导论》(第5版)笔记和课后习题详解-第7章含有定性信息的多元回归分析:二值(或
伍德里奇《计量经济学导论》(第5版)笔记和课后习题详解-第7章含有定性信息的多元回归分析:二值(或第7章含有定性信息的多元回归分析:二值(或虚拟)变量7.1复习笔记一、对定性信息的描述定性信息通常以二值信息的形式出现。
在计量经济学中,二值变量最常见的称呼是虚拟变量。
二、只有一个虚拟自变量1.只有一个虚拟自变量的简单模型考虑如下决定小时工资的简单模型:001wage female educ uβδβ=+++用0δ表示female 的参数,以强调虚拟变量参数的含义。
假定零条件均值假定() 0E u female educ =,成立,那么:()()0| 1 |0 E wage female educ E wage female educ δ==-=,,由于female=1对应于女性且female=0对应于男性,所以可以简单的把模型写为:()()0| | E wage female educ E wage male educ δ=-,,这种情况可以在图形上描绘成男性与女性之间的截距变化。
男性线的截距是0β,女性线的截距是00βδ+。
由于只有两组数据,所以只需要两个不同的截距。
这意味着,除了0β之外,只需要一个虚拟变量。
因为female +male=1,意味着male 是female 的一个完全线性函数,如果使用两个虚拟变量就会导致完全多重共线性,这就是虚拟变量陷阱。
2.当因变量为log(y)时,对虚拟解释变量系数的解释在应用研究中有一个常见的设定,当自变量中有一个或多个虚拟变量时,因变量则以对数形式出现。
在这种情况下,此系数具有一种百分比解释。
当log(y)是一个模型的因变量时,将虚拟变量的系数乘以100,可解释为y 在保持所有其他因素不变情况下的百分数差异。
当一个虚拟变量的系数意味着y 有较大比例的变化时,可以得到精确的百分数差异。
一般地,如果1β是一个虚拟变量(比方说x 1)的系数,那么,当log(y)是因变量时,在x 1=1时预测的y 相对于在x 1=0时预测的y,精确的百分数差异为:()1?100exp 1β-??三、使用多类别虚拟变量1.在方程中包括虚拟变量的一般原则如果回归模型具有g 组或g 类不同截距,那就需要在模型中包含g-1个虚拟变量和一个截距。
伍德里奇-横截面与面板数据的经济计量分析7.2-7.4-7.8-8.6
Ecmt675:Econometrics IAssignment5Problem1a.maxc i;q iU(c i;q i)=c i+ log(1+q i)s:t:c i+p i q i=y iUsing the Lagrange equation with Lagrange multiplierL=c i+ log(1+q i)+ (y i c i p i q i) We can get the…rst-order conditions:[c i]1 =0[q i]1+q i p i=0Therefore,we can getq i=p i 1(1)c i=y i+p i (2) From(1),we can getlog(q i+1)=log( ) log(p i)(3) Therefore,we can estimate the coe¢cients bylog(q i)= 0+ 1log(p i)+u iFrom(2),we can use the equationc i= 0+ 1p i+ 2y i+v ito estimate the coe¢cients.To sum up,log(q i)= 0+ 1log(p i)+u ic i= 0+ 1p i+ 2y i+v i1bLet i=exp(x i ),the equation(3)becomeslog(q i+1)=x i log(p i)Therefore,we can estimate the coe¢cients bylog(q i)=x i + 1log(p i)+u iSimilarly,let i=x i ,then(2)becomesc i=y i+p i x iTherefore,we can estimate the coe¢cients byc i=x i + 1p i+ 2y i+v iTo sum up,log(q i)=x i + 1log(p i)+u ic i=x i + 1p i+ 2y i+v icIn the…rst equation in a.or b.,there exists the endogenous problem if p i actually depends on q i.This could happen because the donation amount may actually move a person to a di¤erent tax bracket.Therefore,we will get the inconsistent estimators in the…rst equation.One possible instrument is the change of tax schedule that will a¤ect p i.Problem2(7.2)a.p^ )= 1N N X i=1X0i X i 1 1p N N X i=1X0i u iVar p N(^ ) = 1N N X i=1X0i X i 1 1N N X i=1X0i u i u0i X i 1N N X i=1X0i X i 1By LLN! E(X0i X i) 1E(X0i u i u0i X i) E(X0i X i) 1= E(X0i X i) 1E(X0i X i) E(X0i X i) 12)Avar(^ SOLS)=1N E(X0i X i) 1E(X0i X i) E(X0i X i) 1 b.Using^ =1NN X i=1^^u i^^u0ito estimate .Therefore,we can useN X i=1X0i X i 1(N X i=1X0i^ X i) N X i=1X0i X i 1to estimate asymptotic variance of^ SOLS.cAvar(^ F GLS) 1 Avar(^ SOLS) 1=N E(X0 1i X i) E(X0i X i) E(X0i X i) 1E(X0i X i)Let Z i= 12X i;W i= 12X i=N E(Z0i Z i) E(Z0i W i) E(W0i W i) 1E(W0i Z i)From the linear regression Z i on W i,we know that E(Z0i Z i) E(Z0i W i) E(W0i W i) 1E(W0i Z i) is p.s.d.dIf = 2I G,thenAvar(^ SOLS)=1N E(X0i X i) 1E(X0i X i) E(X0i X i) 1= 2N E(X0i X i) 1Avar(^ F GLS)=1N E(X0i X i) 1= 2N E(X0i X i) 1eFrom c,we know that the FGLS estimator always has smaller asymptotic variance than the SOLS estimator.3Problem 3(7.4)Because ^^u i =y i X i ^^=X i +u i X i ^^ =u i +X i ( ^^ ),we can get 1p X ^^u i ^^u 0i =1p X u i u 0i(4)+1pN X u i ( ^^ )0X 0i (5)+1pN X X i ( ^^ )u 0i(6)+1pXX i ( ^^ )( ^^ )0X 0i(7)From (5),1p N X vec u i ( ^^ )0X 0i =1p NX (X i u i )vec ( ^^ )0 =1NX(X i u i ) p ^^ )0 =o p (1)O p (1)=o p (1)Also,(6)is similarly equal to o p (1).From (7)1p NX vec X i ( ^^ )( ^^ )0X 0i=1p X (X i X i )vec ( ^^)( ^^ )0 =1p N 1N X (X i X i ) vec p N ( ^^ )p N ( ^^ )0 =1p NO p (1)O p (1)O p (1)=o p (1)Therefore1p N X ^^u i ^^u 0i =1p NX u i u 0i +o p (1))p N (^ )=1p N(X u i u 0i )+o p (1)4Problem4(7.8).sureg(hrearn educ exper expersq tenure tenuresq union south nrtheast nrthcen married white male)(hrvac educ exper expersq tenure tenuresq union south nrtheast nrthcen married white male)(hrsick educ exper expersq tenure tenuresq union south nrtheast nrthcen married white male)(hrins educ exper expersq tenure tenuresq union south nrtheast nrthcen married white male)(hrpens educ exper expersq tenure tenuresq union south nrtheast nrthcen married white male) (The results are omitted here).test married(1)[hrearn]married=0(2)[hrvac]married=0(3)[hrsick]married=0(4)[hrins]married=0(5)[hrpens]married=0chi2(5)=14.48Prob>chi2=0.0128.test[hrpens]educ=[hrins]educ(1)-[hrins]educ+[hrpens]educ=0chi2(1)=102.24Prob>chi2=0.0000From the results,you can see the joint test or each t-statistics in eachequation to test whether the married status a¤ect them.Also,from thesecond test,we know that two e¤ects are signi…cant di¤erent.Problem5(8.6)a.E(Z0 1i u i)=E 0@z0i100z0i21A0@ 11 1212 221A0@u i1u i21A=0@ 11E(z0i1u i1)+ 12E(z0i1u i2) 12E(z0i2u i1)+ 22E(z0i2u i2)1A5So under the assumption E(z0i1u i1)=0and E(z0i2u i2)=0,E(Z0 1i u i)is still possible not equal to zero if the terms E(z0i2u i1)=0or E(z0i1u i2)=0b.If 12=0,then E(Z0 1i u i)=0c.If z i1=z i2,then E(z0i1u i2)=E(z0i2u i2)=0.Also E(z0i2u i1)=E(z0i1u i1)=0. Therefore,E(Z0 1i u i)=0.6。
计量经济学授课教案(多场合)
计量经济学授课教案一、课程概述计量经济学是经济学的一个重要分支,它运用数学、统计学和计算机科学的方法,研究经济现象中的数量关系和规律性。
本课程旨在帮助学生掌握计量经济学的基本理论、方法和应用,提高学生运用计量经济学方法分析和解决实际经济问题的能力。
二、教学目标1.理解计量经济学的基本概念、原理和方法;2.掌握经典线性回归模型的估计、检验和预测;3.了解非线性回归模型、面板数据模型和时间序列模型;4.学会运用计量经济学软件进行数据处理和分析;5.培养学生运用计量经济学方法解决实际经济问题的能力。
三、教学内容与安排1.第一讲:导论1.1计量经济学的定义与作用1.2计量经济学的研究方法与步骤1.3计量经济学软件介绍2.第二讲:经典线性回归模型2.1一元线性回归模型2.2多元线性回归模型2.3回归模型的估计方法:最小二乘法3.第三讲:回归模型的检验与预测3.1模型拟合优度检验3.2回归参数的显著性检验3.3回归模型的预测与区间估计4.第四讲:非线性回归模型4.1线性模型的局限性4.2二次回归模型4.3Logit回归模型与Probit回归模型5.第五讲:面板数据模型5.1面板数据的定义与特点5.2面板数据模型的设定与估计5.3面板数据模型的检验与预测6.第六讲:时间序列模型6.1时间序列数据的定义与特点6.2自回归模型(AR)6.3移动平均模型(MA)6.4自回归移动平均模型(ARMA)7.第七讲:计量经济学应用案例分析7.1金融市场分析7.2货币政策分析7.3贸易政策分析四、教学方法1.课堂讲授:讲解计量经济学的基本理论、方法和应用;2.案例分析:通过实际经济案例,引导学生运用计量经济学方法解决实际问题;3.上机实践:指导学生运用计量经济学软件进行数据处理和分析;4.小组讨论:鼓励学生分组讨论,提高学生的合作能力和沟通能力。
五、考核方式1.平时成绩:包括课堂表现、作业完成情况和小组讨论;2.期中考试:考查学生对计量经济学基本理论、方法和应用的理解;3.期末考试:综合考查学生对计量经济学的掌握程度,包括理论知识和实际应用能力。
伍德里奇《计量经济学导论》(第6版)复习笔记和课后习题详解OLS用于时间序列数据的其他问题
伍德里奇《计量经济学导论》(第6版)复习笔记和课后习题详解OLS用于时间序列数据的其他问题第11章OLS用于时间序列数据的其他问题11.1复习笔记考点一:平稳和弱相关时间序列★★★★1.时间序列的相关概念(见表11-1)表11-1时间序列的相关概念2.弱相关时间序列(1)弱相关对于一个平稳时间序列过程{x t:t=1,2,…},随着h的无限增大,若x t和x t+h“近乎独立”,则称为弱相关。
对于协方差平稳序列,如果x t和x t+h之间的相关系数随h的增大而趋近于0,则协方差平稳随机序列就是弱相关的。
本质上,弱相关时间序列取代了能使大数定律(LLN)和中心极限定理(CLT)成立的随机抽样假定。
(2)弱相关时间序列的例子(见表11-2)表11-2弱相关时间序列的例子考点二:OLS的渐近性质★★★★1.OLS的渐近性假设(见表11-3)表11-3OLS的渐近性假设2.OLS的渐近性质(见表11-4)表11-4OLS的渐进性质考点三:回归分析中使用高度持续性时间序列★★★★1.高度持续性时间序列(1)随机游走(见表11-5)表11-5随机游走(2)带漂移的随机游走带漂移的随机游走的形式为:y t=α0+y t-1+e t,t=1,2,…。
其中,e t(t=1,2,…)和y0满足随机游走模型的同样性质;参数α0被称为漂移项。
通过反复迭代,发现y t的期望值具有一种线性时间趋势:y t=α0t+e t+e t-1+…+e1+y0。
当y0=0时,E(y t)=α0t。
若α0>0,y t的期望值随时间而递增;若α0<0,则随时间而下降。
在t时期,对y t+h的最佳预测值等于y t加漂移项α0h。
y t的方差与纯粹随机游走情况下的方差完全相同。
带漂移随机游走是单位根过程的另一个例子,因为它是含截距的AR(1)模型中ρ1=1的特例:y t=α0+ρ1y t-1+e t。
2.高度持续性时间序列的变换(1)差分平稳过程I(1)弱相关过程,也被称为0阶单整或I(0),这种序列的均值已经满足标准的极限定理,在回归分析中使用时无须进行任何处理。
计量经济学课件英文版 伍德里奇
Two methods to estimate
2016/9/22 Department of Statistics-Zhaoliqin 17
Some Terminology, cont.
β0 :intercept parameter β1 :slope parameter means that a one-unit increase in x changes the expected value of y by the amount β1,holding the other factors in u fixed.
Department of Statistics-Zhaoliqin 16
2016/9/22
Some Terminology, cont.
y = β0 + β1x + u, E [u|x] = 0. β0 + β1x is the systematic part of y. u is the unsystematic part of y. u is denoted the error term. Other terms for u : error shock disturbance residual (sometimes in reference to fitted error term)
6
2016/9/22
Department of Statistics-Zhaoliqin
7
Hence, we are interested in studying is the mean of wages given the years of Education that will be denoted as E [Wages|Education] Following economic theory, we assume a specific model for E[Wages|Education]
伍德里奇《计量经济学导论》(第5版)笔记和课后习题详解-第7章 含有定性信息的多元回归分析:二值(或
第7章含有定性信息的多元回归分析:二值(或虚拟)变量7.1复习笔记一、对定性信息的描述定性信息通常以二值信息的形式出现。
在计量经济学中,二值变量最常见的称呼是虚拟变量。
二、只有一个虚拟自变量1.只有一个虚拟自变量的简单模型考虑如下决定小时工资的简单模型:001wage female educ uβδβ=+++用0δ表示female 的参数,以强调虚拟变量参数的含义。
假定零条件均值假定() 0E u female educ =,成立,那么:()()0| 1 |0 E wage female educ E wage female educ δ==-=,,由于female=1对应于女性且female=0对应于男性,所以可以简单的把模型写为:()()0| | E wage female educ E wage male educ δ=-,,这种情况可以在图形上描绘成男性与女性之间的截距变化。
男性线的截距是0β,女性线的截距是00βδ+。
由于只有两组数据,所以只需要两个不同的截距。
这意味着,除了0β之外,只需要一个虚拟变量。
因为female+male=1,意味着male 是female 的一个完全线性函数,如果使用两个虚拟变量就会导致完全多重共线性,这就是虚拟变量陷阱。
2.当因变量为log(y)时,对虚拟解释变量系数的解释在应用研究中有一个常见的设定,当自变量中有一个或多个虚拟变量时,因变量则以对数形式出现。
在这种情况下,此系数具有一种百分比解释。
当log(y)是一个模型的因变量时,将虚拟变量的系数乘以100,可解释为y 在保持所有其他因素不变情况下的百分数差异。
当一个虚拟变量的系数意味着y 有较大比例的变化时,可以得到精确的百分数差异。
一般地,如果1ˆβ是一个虚拟变量(比方说x 1)的系数,那么,当log(y)是因变量时,在x 1=1时预测的y 相对于在x 1=0时预测的y,精确的百分数差异为:()1ˆ100exp 1β⎡⎤⋅-⎣⎦三、使用多类别虚拟变量1.在方程中包括虚拟变量的一般原则如果回归模型具有g 组或g 类不同截距,那就需要在模型中包含g-1个虚拟变量和一个截距。
计量经济学课件英文版 伍德里奇
20
1.3 The Structure of Economic Data
Cross Sectional Time Series Panel
Department of Statistics by Zhaoliqin
21
Types of Data – Cross Sectional
Department of Statistics by Zhaoliqin 10
Why study Econometrics?
An empirical analysis uses data to test a theory or to estimate a relationship
A formal economic model can be tested
Theory may be ambiguous as to the effect of some policy change – can use econometrics to evaluate the program
Department of Statistics by Zhaoliqin 11
1.2 steps in empirical economic analysis
Welcome to Econometrics
Department of Statistic:赵丽琴 liqinzhao_618@
Department of Statistics by Zhaoliqin
1
About this Course
Textbook: Jeffrey M. Wooldridge, Introductory Econometrics—A Modern Approach. Main Software: Eviews. Sample data can be acquired from internet. If time permitted, R commands will be introduced.
伍德里奇计量经济学笔记
伍德里奇计量经济学笔记伍德里奇计量经济学(Wooldridge Econometrics)是一门应用计量经济学的学科,它结合了经济学和数理统计学的理论和方法。
1. 引言- 计量经济学的定义:利用数理统计学和计量经济模型来分析经济问题。
- 经济学模型包括描述经济系统和理论关系的方程。
- 计量经济学的目标是估计和测试经济模型中的参数。
2. 统计学基础- 假设检验:用统计方法来验证经济理论。
- 最小二乘法(OLS):估计经济模型中未知参数的方法。
- OLS估计结果的性质和假设:无偏性、一致性和有效性。
3. 单变量回归模型- 简单线性回归模型:一个自变量和一个因变量之间的线性关系。
- 估计参数和评估模型:OLS估计、t统计量、R方和调整的R 方。
- 解释和预测:利用估计的模型进行解释和预测。
4. 多变量回归模型- 多元线性回归模型:多个自变量和一个因变量之间的线性关系。
- 估计参数和评估模型:OLS估计、t统计量、F统计量、R方和调整的R方。
- 控制变量和决策:利用控制变量来减少混淆因素,做出更准确的决策。
5. 动态模型- 差分方程:描述变量随时间变化的关系。
- 滞后变量和滞后因变量:引入滞后变量来解释变量之间的时序关系。
- 动态因果关系:解释一些经济变量之间的长期和短期关系。
6. 面板数据模型- 面板数据:包含多个个体和多个时间观测的数据集。
- 固定效应模型和随机效应模型:解释面板数据中个体效应和时间效应。
- 引入个体和时间固定效应:控制个体特征和时间变化对变量关系的影响。
7. 工具变量估计- 决定性和随机性端变量:用于解决内生性问题的变量。
- 工具变量的选择和检验:选择有效的工具变量来估计内生性模型。
- 两阶段最小二乘法(2SLS):用工具变量估计内生性模型。
8. 非线性回归模型- 非线性函数:描述实际经济关系的复杂性。
- 估计非线性模型:使用非线性最小二乘法(NLS)估计非线性模型。
- 非线性回归模型的解释和预测:利用估计的非线性模型进行解释和预测。
伍德里奇 计量经济学导论
伍德里奇计量经济学导论计量经济学是一门运用数学、统计学、经济学理论及计算机技术等方法研究经济现象之间数量关系的学科。
在《伍德里奇计量经济学导论》这本书中,作者详细介绍了计量经济学的基本原理与方法,为读者提供了丰富的理论知识与实践案例。
本书共分为八个部分,下面我们将分别进行介绍。
首先是引言部分,作者对计量经济学的定义与作用进行了阐述,指出计量经济学在经济预测、政策评估、经济研究等方面具有重要意义。
此外,作者还对本书的结构安排进行了说明,以便读者更好地把握全书内容。
第二部分为概率论与数理统计基础。
在这一部分,作者详细介绍了随机变量、概率分布、数学期望、方差等基本概念,并讲解了常见概率分布如正态分布、t分布、卡方分布等。
此外,作者还介绍了最大似然估计方法,为后续回归分析奠定了基础。
接下来是一元线性回归模型部分。
作者首先建立了回归方程,并讲解了如何进行参数估计。
在此基础上,作者对拟合优度检验、显著性检验进行了阐述,并介绍了如何进行预测与控制。
第四部分为多元线性回归模型。
作者首先介绍了多元线性回归方程,并讲解了参数估计方法。
在此基础上,作者对多元线性回归的检验进行了详细说明,并为矩阵计算提供了方法。
第五部分为时间序列分析。
作者首先讲解了时间序列的基本概念,并介绍了平稳性检验。
随后,作者分别阐述了自回归模型、移动平均模型及自回归移动平均模型,为时间序列分析奠定了基础。
第六部分为非线性回归模型。
作者首先概述了非线性回归,并介绍了非线性回归的估计方法。
在此基础上,作者对非线性回归的检验进行了说明。
第七部分为计量经济学应用案例。
作者选取了我国经济增长、通货膨胀及环境污染与经济增长的关系等研究课题,展示了计量经济学在实际问题中的应用。
最后是第八部分,作者对伍德里奇计量经济学导论的评价及启示。
作者认为该教材在内容安排、理论阐述、案例分析等方面具有优点,但同时也指出了不足之处。
在此基础上,作者对我国计量经济学发展提出了有益的启示。
伍德里奇计量经济学讲义7
~ Normal
0,1
because it
ˆ b j is distribute d normally is a linear combinatio
n of the errors
5
The t Test
Under the CLM assumption s ˆ se b j
bˆ
j
b
j
ˆ c se b , where c is the 1 - a percentile ˆ bj j 2
in a t n k 1 distributi on
16
Computing p-values for t tests
An alternative to the classical approach is to ask, “what is the smallest significance level at which the null would be rejected?” So, compute the t statistic, and then look up what percentile it is in the appropriate t distribution – this is the p-value p-value is the probability we would observe the t statistic we did, if the null were true
3
The homoskedastic normal distribution with a single explanatory variable
y
f(y|x)
伍德里奇《计量经济学导论》(第4版)笔记和课后习题详解(2-8章)
GPA GPA Ai
ˆ 5.8125 / 56.875 0.1022 。 根据公式 2.19 可得: 1
ˆ 3.2125 0.1022 25.875 0.5681 。 根据公式 2.17 可知: 0
????2222211112222221111varvarvar?nnnniiiiiiiiiinnniiiiiixxuxxuxxx??????????????????????????????????????????????????????????????????根据公式257????2211?varniixx????????????对任何数据样本??2211nniiiixxx??????除非0x?
7.利用 Kiel and McClain(1995)有关 1988 年马萨诸塞州安德沃市的房屋出售数据,如下方程给出了房屋 价格( price )和距离一个新修垃圾焚化炉的距离( dist )之间的关系:
log price 9.40 0.312log dist n 135 , R 2 0.162
因此 GPA 0.5681 0.1022 ACT 。 此处截距没有一个很好的解释, 因为对样本而言,ACT 并不接近 0。 如果 ACT 分数提高 5 分,预期 GPA 会提高 0.1022× 5=0.511。 (Ⅱ)每次观测的拟合值和残差表如表 2-3 所示: 表 2-3
i
GPA
GPA
^
^
ˆ u
1 2 3 4 5 6 7 8
^
2.8 3.4 3.0 3.5 3.6 3.0 2.7 3.7
2.7143 3.0209 3.2253 3.3275 3.5319 3.1231 3.1231 3.6341
伍德里奇计量经济学导论ppt课件
E(Y|Xi) = 0 + 1 Xi,
ppt课件.
21
Ø 随机误差项u的意义:
l 反映被忽略掉的因素对被解释变量的影响。 或者理论不够完善,或者数据缺失;或者影响轻微。
l 模型设定误差 l 度量误差 l 人类行为内在的随机性
ppt课件.
22
Ø 随机误差项主要包括下列因素:
在解释变量中被忽略的因素的影响; 变量观测值的观测误差的影响; 残缺数据; 模型关系的设定误差的影响; 其他随机因素的影响。
l 对于某一个家庭,如何描述可支配收入和消费支出的关系?
Yi=E(Y|Xi) + ui =0 + 1 Xi + ui
某个家庭的消费支出分为两部分:一是E(Y|Xi)=0 + 1 Xi ,称为系统成
分或确定性成分;二是ui,称为非系统或随机性成分。
ppt课件.
20
l 随机性总体回归函数
Yi=0 + 1 Xi + ui
260
— 152
— — 180 185 — 3 517
ppt课件.
26
样本回归线
样本均值连线
ppt课件.
27
Ø 总体回归模型和样本回归模型的比较
ppt课件.
28
注意:分清几个关系式和表示符号
E(Y|Xi) = 0 + 1 Xi (1)总体(真实的)回归直线: Yi
E(Y|Xi)01Xi
Y2
Y1
或: Yi ˆ0ˆ1Xi ei
其中: Yˆi 为Yi的估计值(拟合值); ˆ0 , ˆ1 为 0 , 1 的估计值;
110 115 120 130 135 140
- 6 750
伍德里奇计量经济学讲义-文档资料
More difficult if one model uses y and the other uses ln(y) Can follow same basic logic and transform predicted ln(y) to get ŷ for the second step In any case, Davidson-MacKinnon test may reject neither or both models rather than clearly preferring one specification
2
Functional Form (continued)
First, use economic theory to guide you Think about the interpretation Does it make more sense for x to affect y in percentage (use logs) or absolute terms? Does it make more sense for the derivative of x1 to vary with x1 (quadratic) or with x2 (interactions) or to be fixed?
7
Proxy Variables
What if model is misspecified because no data is available on an important x variable? It may be possible to avoid omitted variable bias by using a proxy variable A proxy variable must be related to the unobservable variable – for example: x3* = d0 + d3x3 + v3, where * implies unobserved Now suppose we just substitute x3 for x3*
伍德里奇《计量经济学导论--现代观点》
X
22
3
1.
15
p
0.2 0.1 0.1 0.1 0.1 0.3 0.1
( X ,Y ) (1,1) (1,0) (1,1) (2,1) (2,1) (3,0) (3,1)
( X Y )2 4 1 0 9 1 9 4
得 E[(X Y )2] 4 0.3 1 0.2 0 0.1 9 0.4 5.
( X ,Y ) (1,1) (1,0) (1,1) (2,1) (2,1) (3,0) (3,1) Y X 1 0 1 1 2 1 2 0 1 3
于是
E Y 1 0.2 0 0.1 1 0.1 1 0.1 1 0.1 0 0.3 1 0.1
(2) 级数的绝对收敛性保证了级数的和不 随级数各项次序的改变而改变 , 之所以这样要 求是因为数学期望是反映随机变量X 取可能值 的平均值,它不应随可能值的排列次序而改变.
(3) 随机变量的数学期望与一般变量的算 术平均值不同.
例1 谁的技术比较好? 甲,乙两个射手,他们的射击技术分别为
甲射手
击中环数 8 9 10 概率 0.3 0.1 0.6
第四章
随机变量的数字特征
第一节 数学期望
一、随机变量的数学期望 二、随机变量函数的数学期望 三、数学期望的性质 四、小结
一、随机变量的数学期望
1. 离散型随机变量的数学期望
定义4.1设离散型随机变量 X 的分布律为
P{ X xk } pk , k 1,2,.
若级数
xk pk 绝对收敛,则称级数
故甲射手的技术比较好.
例2 如何确定投资决策方向?
某人有10万元现金, 想投资
伍德里奇---计量经济学第7章部分计算机习题详解(STATA)
班级:金融学×××班姓名:××学号:×××××××C7.10 NBASAL.RAWpoints=β0+β1exper+β2exper2+β3guard+β4forward+u 解:(ⅰ)估计一个线性回归模型,将单场得分与联赛中打球经历和位置(后卫、前锋或中锋)联系起来。
包括打球经历的二次项形式,并将中锋作为基组。
以通常形式报告结果。
由上图可知:points=4.76+1.28exper−0.072exper2+2.31guard+1.54forward1.180.330.024 1.00 (1.00)n=269,R2=0.0910,R2=0.0772。
(ⅱ)在第(ⅰ)部分中,你为什么不将所有三个位置虚拟变量包括进来?由于forward+center+guard=1,意味着forward和guard之和是center的一个线性函数,所以如果在模型中同时使用三个虚拟变量将会导致完全多重共线性,即包括三个位置虚拟变量会掉入虚拟变量陷阱,故不能将三个位置虚拟变量都包括在模型中。
(ⅲ)保持经历不变,一个后卫的得分比一个中锋多吗?多多少?这个差异统计显著吗?由(ⅰ)中估计方程可知:一个后卫的得分比一个中锋多,且多得2.31分。
同时,guard的t统计量为2.31,所以这个差异统计显著。
(ⅳ)现在,将婚姻状况加入方程。
保持位置和经历不变,已婚球员是否更高效?将婚姻状况加入方程后,回归结果如下所示:points=4.703+1.233exper−0.0704exper2+2.286guard+1.541forward+0.584marr1.180.330.024 1.00 1.00 (0.74)n=269,R2=0.0931,R2=0.0759。
从方程中marr的系数不难发现:在保持位置和经历不变时,已婚球员每场得分比没结婚的球员高0.5分,可是事实上,变量marr的t统计量为0.789,t检验的p值为43.1%,所以marr统计并不显著,故无法得出“已婚球员得分更高效”的结论。
伍德里奇计量经济学第六版答案Chapter 7
CHAPTER 7TEACHING NOTESThis is a fairly standard chapter on using qualitative information in regression analysis, although I try to emphasize examples with policy relevance (and only cross-sectional applications are included.).In allowing for different slopes, it is important, as in Chapter 6, to appropriately interpret the parameters and to decide whether they are of direct interest. For example, in the wage equation where the return to education is allowed to depend on gender, the coefficient on the female dummy variable is the wage differential between women and men at zero years of education. It is not surprising that we cannot estimate this very well, nor should we want to. In this particular example we would drop the interaction term because it is insignificant, but the issue of interpreting the parameters can arise in models where the interaction term is significant.In discussing the Chow test, I think it is important to discuss testing for differences in slope coefficients after allowing for an intercept difference. In many applications, a significant Chow statistic simply indicates intercept differences. (See the example in Section 7.4 on student-athlete GPAs in the text.) From a practical perspective, it is important to know whether the partial effects differ across groups or whether a constant differential is sufficient.I admit that an unconventional feature of this chapter is its introduction of the linear probability model. I cover the LPM here for several reasons. First, the LPM is being used more and more because it is easier to interpret than probit or logit models. Plus, once the proper parameter scalings are done for probit and logit, the estimated effects are often similar to the LPM partial effects near the mean or median values of the explanatory variables. The theoretical drawbacks of the LPM are often of secondary importance in practice. Computer Exercise C7.9 is a good one to illustrate that, even with over 9,000 observations, the LPM can deliver fitted values strictly between zero and one for all observations.If the LPM is not covered, many students will never know about using econometrics to explain qualitative outcomes. This would be especially unfortunate for students who might need to read an article where an LPM is used, or who might want to estimate an LPM for a term paper or senior thesis. Once they are introduced to purpose and interpretation of the LPM, along with its shortcomings, they can tackle nonlinear models on their own or in a subsequent course.A useful modification of the LPM estimated in equation (7.29) is to drop kidsge6 (because it is not significant) and then define two dummy variables, one for kidslt6 equal to one and the other for kidslt6 at least two. These can be included in place of kidslt6 (with no young children being the base group). This allows a diminishing marginal effect in an LPM. I was a bit surprised when a diminishing effect did not materialize.SOLUTIONS TO PROBLEMS7.1 (i) The coefficient on male is 87.75, so a man is estimated to sleep almost one and one-half hours more per week than a comparable woman. Further, t male = 87.75/34.33 ≈ 2.56, which is close to the 1% critical value against a two-sided alternative (about 2.58). Thus, the evidence for a gender differential is fairly strong.(ii) The t statistic on totwrk is -.163/.018 ≈ -9.06, which is very statistically significant. The coefficient implies that one more hour of work (60 minutes) is associated with .163(60) ≈ 9.8 minutes less sleep.(iii) To obtain 2r R , the R -squared from the restricted regression, we need to estimate themodel without age and age 2. When age and age 2 are both in the model, age has no effect only if the parameters on both terms are zero.7.2 (i) If ∆cigs = 10 then log()bwght ∆ = -.0044(10) = -.044, which means about a 4.4% lower birth weight.(ii) A white child is estimated to weigh about 5.5% more, other factors in the first equation fixed. Further, t white ≈ 4.23, which is well above any commonly used critical value. Thus, the difference between white and nonwhite babies is also statistically significant.(iii) If the mother has one more year of education, the child’s birth weight is estimated to be .3% higher. This is not a huge effect, and the t statistic is only one, so it is not statistically significant.(iv) The two regressions use different sets of observations. The second regression uses fewer observations because motheduc or fatheduc are missing for some observations. We would have to reestimate the first equation (and obtain the R -squared) using the same observations used to estimate the second equation.7.3 (i) The t statistic on hsize 2 is over four in absolute value, so there is very strong evidence that it belongs in the equation. We obtain this by finding the turnaround point; this is the value ofhsize that maximizes ˆsat(other things fixed): 19.3/(2⋅2.19) ≈ 4.41. Because hsize is measured in hundreds, the optimal size of graduating class is about 441.(ii) This is given by the coefficient on female (since black = 0): nonblack females have SAT scores about 45 points lower than nonblack males. The t statistic is about –10.51, so thedifference is very statistically significant. (The very large sample size certainly contributes to the statistical significance.)(iii) Because female = 0, the coefficient on black implies that a black male has an estimated SAT score almost 170 points less than a comparable nonblack male. The t statistic is over 13 in absolute value, so we easily reject the hypothesis that there is no ceteris paribus difference.(iv) We plug in black = 1, female = 1 for black females and black = 0 and female = 1 for nonblack females. The difference is therefore –169.81 + 62.31 = -107.50. Because the estimate depends on two coefficients, we cannot construct a t statistic from the information given. The easiest approach is to define dummy variables for three of the four race/gender categories and choose nonblack females as the base group. We can then obtain the t statistic we want as the coefficient on the black female dummy variable.7.4 (i) The approximate difference is just the coefficient on utility times 100, or –28.3%. The t statistic is -.283/.099 ≈ -2.86, which is very statistically significant.(ii) 100⋅[exp(-.283) – 1) ≈ -24.7%, and so the estimate is somewhat smaller in magnitude.(iii) The proportionate difference is .181 - .158 = .023, or about 2.3%. One equation that can be estimated to obtain the standard error of this difference islog(salary ) = 0β + 1βlog(sales ) + 2βroe + 1δconsprod + 2δutility +3δtrans + u ,where trans is a dummy variable for the transportation industry. Now, the base group is finance , and so the coefficient 1δ directly measures the difference between the consumer products and finance industries, and we can use the t statistic on consprod .7.5 (i) Following the hint, colGPA = 0ˆβ + 0ˆδ(1 – noPC ) + 1ˆβhsGPA + 2ˆβACT = (0ˆβ + 0ˆδ) - 0ˆδnoPC + 1ˆβhsGPA + 2ˆβACT . For the specific estimates in equation (7.6), 0ˆβ = 1.26 and 0ˆδ = .157, so the new intercept is 1.26 + .157 = 1.417. The coefficient on noPC is –.157.(ii) Nothing happens to the R -squared. Using noPC in place of PC is simply a different way of including the same information on PC ownership.(iii) It makes no sense to include both dummy variables in the regression: we cannot hold noPC fixed while changing PC . We have only two groups based on PC ownership so, in addition to the overall intercept, we need only to include one dummy variable. If we try toinclude both along with an intercept we have perfect multicollinearity (the dummy variable trap).7.6 In Section 3.3 – in particular, in the discussion surrounding Table 3.2 – we discussed how to determine the direction of bias in the OLS estimators when an important variable (ability, in this case) has been omitted from the regression. As we discussed there, Table 3.2 only strictly holds with a single explanatory variable included in the regression, but we often ignore the presence of other independent variables and use this table as a rough guide. (Or, we can use the results of Problem 3.10 for a more precise analysis.) If less able workers are more likely to receivetraining, then train and u are negatively correlated. If we ignore the presence of educ and exper , or at least assume that train and u are negatively correlated after netting out educ and exper , then we can use Table 3.2: the OLS estimator of 1β (with ability in the error term) has a downward bias. Because we think 1β ≥ 0, we are less likely to conclude that the training program waseffective. Intuitively, this makes sense: if those chosen for training had not received training, they would have lowers wages, on average, than the control group.7.7 (i) Write the population model underlying (7.29) asinlf = 0β + 1βnwifeinc + 2βeduc + 3βexper +4βexper 2 + 5βage+ 6βkidslt6 + 7βkidsage6 + u ,plug in inlf = 1 – outlf , and rearrange:1 – outlf = 0β + 1βnwifeinc + 2βeduc + 3βexper +4βexper2 + 5βage+ 6βkidslt6 + 7βkidsage6 + u ,oroutlf = (1 - 0β) - 1βnwifeinc - 2βeduc - 3βexper - 4βexper 2 - 5βage - 6βkidslt6 - 7βkidsage6 - u ,The new error term, -u , has the same properties as u . From this we see that if we regress outlf on all of the independent variables in (7.29), the new intercept is 1 - .586 = .414 and each slope coefficient takes on the opposite sign from when inlf is the dependent variable. For example, the new coefficient on educ is -.038 while the new coefficient on kidslt6 is .262.(ii) The standard errors will not change. In the case of the slopes, changing the signs of the estimators does not change their variances, and therefore the standard errors are unchanged (butthe t statistics change sign). Also, Var(1 - 0ˆβ) = Var(0ˆβ), so the standard error of the intercept is the same as before.(iii) We know that changing the units of measurement of independent variables, or entering qualitative information using different sets of dummy variables, does not change the R -squared. But here we are changing the dependent variable. Nevertheless, the R -squareds from the regressions are still the same. To see this, part (i) suggests that the squared residuals will be identical in the two regressions. For each i the error in the equation for outlf i is just the negative of the error in the other equation for inlf i , and the same is true of the residuals. Therefore, the SSRs are the same. Further, in this case, the total sum of squares are the same. For outlf we haveSST = 2211()[(1)(1)]n n i i i i outlf outlf inlf inlf ==-=---∑∑= 2211()()n ni i i i inlf inlf inlf inlf ==-+=-∑∑,which is the SST for inlf . Because R 2 = 1 – SSR/SST, the R -squared is the same in the two regressions.7.8 (i) We want to have a constant semi-elasticity model, so a standard wage equation with marijuana usage included would belog(wage ) = 0β + 1βusage + 2βeduc + 3βexper + 4βexper 2 + 5βfemale + u .Then 100⋅1β is the approximate percentage change in wage when marijuana usage increases byone time per month.(ii) We would add an interaction term in female and usage :log(wage ) = 0β + 1βusage + 2βeduc + 3βexper + 4βexper 2 + 5βfemale+ 6βfemale ⋅usage + u .The null hypothesis that the effect of marijuana usage does not differ by gender is H 0: 6β = 0.(iii) We take the base group to be nonuser. Then we need dummy variables for the other three groups: lghtuser , moduser , and hvyuser . Assuming no interactive effect with gender, the model would belog(wage ) = 0β + 1δlghtuser + 2δmoduser + 3δhvyuser + 2βeduc + 3βexper + 4βexper 2 + 5βfemale + u .(iv) The null hypothesis is H 0: 1δ = 0, 2δ= 0, 3δ = 0, for a total of q = 3 restrictions. If n is the sample size, the df in the unrestricted model – the denominator df in the F distribution – is n – 8. So we would obtain the critical value from the F q ,n -8 distribution.(v) The error term could contain factors, such as family background (including parental history of drug abuse) that could directly affect wages and also be correlated with marijuana usage. We are interested in the effects of a person’s drug usage on his or her wage, so we would like to hold other confounding factors fixed. We could try to collect data on relevant background information.7.9 (i) Plugging in u = 0 and d = 1 gives 10011()()()f z z βδβδ=+++.(ii) Setting **01()()f z f z = gives **010011()()z z βββδβδ+=+++ or *010z δδ=+. Therefore, provided 10δ≠, we have *01/z δδ=-. Clearly, *z is positive if and only if 01/δδ is negative, which means 01 and δδ must have opposite signs.(iii) Using part (ii) we have *.357/.03011.9totcoll == years.(iv) The estimated years of college where women catch up to men is much too high to be practically relevant. While the estimated coefficient on female totcoll ⋅ shows that the gap is reduced at higher levels of college, it is never closed – not even close. In fact, at four years of。
- 1、下载文档前请自行甄别文档内容的完整性,平台不提供额外的编辑、内容补充、找答案等附加服务。
- 2、"仅部分预览"的文档,不可在线预览部分如存在完整性等问题,可反馈申请退款(可完整预览的文档不适用该条件!)。
- 3、如文档侵犯您的权益,请联系客服反馈,我们会尽快为您处理(人工客服工作时间:9:00-18:30)。
• y|x ~ Normal(b0 + b1x1 +…+ bkxk, s2)
• While for now we just assume normality, clear that sometimes not the case
To perform our test we first need to form
"the"t statistic for bˆj :tbˆ j bˆ j se bˆ j
We will then use our t statistic along with a rejection rule to determine whether to accept the null hypothesis, H0
t Test: One-Sided Alternatives
• Besides our null, H0, we need an alternative hypothesis, H1, and a significance level
• H1 may be one-sided, or two-sided
One-Sided Alternatives (cont)
yi = b0 + b1xi1 + … + bkxik + ui
• Large samples will let us drop normality
The homoskedastic normal distribution with a single explanatory variable
y
f(y|x)
.
.
E(y|x) = b0 + b1x
Normal distributions
• H1: bj > 0 and H1: bj < 0 are one-sided • H1: bj 0 is a two-sided alternative
• If we want to have only a 5% probability of rejecting H0 if it is really true, then we say our significance level is 5%
• Assume that u is independent of x1, x2,…, xk and u is normally distributed with zero
mean and variance s2: u ~ Normal(0,s2)
CLM Assumptions (cont)
• Under CLM, OLS is not only BLUE, but is the minimum variance unbiased estimator
• Start with a null hypothesis
• For example, H0: bj=0
• If accept null, then accept that xj has no effect on y, controlling for other x’s
The t Test (cont)
x1
x2
Normal Samplinห้องสมุดไป่ตู้ Distributions
Under the CLM assumptions, conditional on the sample values of the independent variables
bˆ j ~ Normal b j ,Var bˆ j , so that
• We can reject the null hypothesis if the t statistic is greater than the critical value
• If the t statistic is less than the critical value then we fail to reject the null
Assumptions of the Classical Linear Model (CLM)
• So far, we know that given the GaussMarkov assumptions, OLS is BLUE,
• In order to do classical hypothesis testing, we need to add another assumption (beyond the Gauss-Markov assumptions)
One-Sided Alternatives (cont)
• Having picked a significance level, a, we look up the (1 – a)th percentile in a t distribution with n – k – 1 df and call this c, the critical value
Note this is a t distributi on (vs normal)
because we have to estimate s 2by sˆ 2
Note the degrees of freedom : n k 1
The t Test (cont)
• Knowing the sampling distribution for the standardized estimator allows us to carry out hypothesis tests
bˆ j b j sd bˆ j ~ Normal0,1
bˆj is distributed normally because it
is a linear combination of the errors
The t Test
Under the CLM assumption s
bˆ j b j se bˆ j ~ tnk1