Harvard Econometrics Lecture Notes 5
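Econometrics Lecture Notes
Yao Yaojun, School of Finance, Zhejiang Gongshang University

Contents
Lecture 1: The Algebra of OLS
Lecture 2: The OLS Estimator
Lecture 3: Hypothesis Testing
Lecture 4: Heteroskedasticity
Lecture 5: Autocorrelation
Lecture 6: Multicollinearity
Lecture 7: Dummy Variables
Lecture 8: A First Look at Time Series: Stationarity and Unit Roots
Lecture 9: Cointegration and Error Correction Models
Lecture 10: ARCH Models and Their Extensions

Lecture 1: The Algebra of OLS

I. The Problem

Suppose y and x are approximately linearly related:

$$y = \beta_0 + \beta_1 x + \varepsilon,$$

where $\varepsilon$ is a random error term. We know nothing about the values of the two parameters $\beta_0, \beta_1$. Our task is to use a sample to guess them. Suppose we have a sample of size N with observations $(y_1, x_1), (y_2, x_2), \ldots, (y_N, x_N)$. How can this sample be used to guess the values of $\beta_0, \beta_1$?

A simple approach is to plot the observations as a scatter diagram with x on the horizontal axis and y on the vertical axis. Since y and x are approximately linearly related, we fit a straight line through the scatter:

$$\hat{y} = \hat\beta_0 + \hat\beta_1 x.$$

This line approximates the true relationship between y and x, and $\hat\beta_0, \hat\beta_1$ are our guesses (estimates) of $\beta_0, \beta_1$. The question is how to choose $\hat\beta_0$ and $\hat\beta_1$ so that the guess looks reasonable.

II. Two Ways of Thinking about OLS

Approach 1: $(y_1, y_2, \ldots, y_N)'$ and $(\hat y_1, \hat y_2, \ldots, \hat y_N)'$ are two points in N-dimensional space, and $\hat\beta_0$ and $\hat\beta_1$ should be chosen to make the distance between these two points as short as possible. This reduces to the following minimization problem:

$$\min_{\hat\beta_0, \hat\beta_1} \sum_{i=1}^{N} (y_i - \hat y_i)^2 = \min_{\hat\beta_0, \hat\beta_1} \sum_{i=1}^{N} (y_i - \hat\beta_0 - \hat\beta_1 x_i)^2,$$

where $y_i - \hat y_i$ defines the residual $\hat\varepsilon_i$.

Approach 2: For a given $x_i$, it seems that the closer $y_i$ is to $\hat y_i$ the better (the smallest possible distance is 0). However, when the fitted line is chosen so that $y_i$ and $\hat y_i$ are very close, the distance between some other $y_j$ and $\hat y_j$ may grow, so there is a trade-off.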
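The minimization in Approach 1 can be sketched numerically. The block below is a minimal illustration, not part of the original notes: it uses the standard closed-form solution of that least-squares problem (slope equal to the sample covariance of x and y divided by the sample variance of x). The function names `ols_line` and `ssr` and the sample data are assumptions made for the example.

```python
def ols_line(x, y):
    """Return (b0, b1) minimizing sum((y_i - b0 - b1*x_i)**2).

    Uses the standard closed-form solution of the simple-regression
    least-squares problem; names are illustrative only.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must vary across observations")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def ssr(x, y, b0, b1):
    """Sum of squared residuals for the line y = b0 + b1*x."""
    return sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))


# Hypothetical sample of size N = 4
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 2.0, 5.0]
b0, b1 = ols_line(x, y)
print(b0, b1, ssr(x, y, b0, b1))
```

Moving either coefficient away from the returned values cannot lower the sum of squared residuals, which is the trade-off described in Approach 2: pulling the line toward one observation pushes it away from others.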
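Econometrics Review Notes
School of Economics, Jilin University
Review notes for "Econometrics". Companion textbook: Econometrics (Li Zinai and Pan Wenqing, 3rd edition)

Chapter 2: The Simple Linear Regression Model

I. Correlation and regression
Computing the correlation coefficient:
Regression analysis: the variables are not treated symmetrically

II. Parameter estimation
1. Population and sample regression models:
2. Ordinary least squares (OLS)
Estimates of β0 and β1
Variances and probability distributions of the estimates of β0 and β1
Estimate of the population variance
3. Statistical tests
Goodness-of-fit test. Coefficient of determination: R2 = ESS/TSS
Significance test: H0: βi = 0, H1: βi ≠ 0
Confidence interval estimation (1-α)
To narrow the confidence interval: increase the sample size n; improve the goodness of fit of the model.
3. How to prove linearity and unbiasedness
Linearity:
Unbiasedness:
4. Prediction
Of the conditional mean:
Of an individual value:

Chapter 3: The Multiple Linear Regression Model

I. Population regression function
General form: Y = β0 + β1X1 + β2X2 + … + βkXk + µ
Matrix form: Y = Xβ + µ
II. Basic assumptions (omitted)
III. Parameter estimation: ordinary least squares
Parameter estimates:
Estimate of the variance of µ:
IV. Statistical properties
V. Sample size
n ≥ k+1: the sample size cannot be smaller than the number of explanatory variables (including the constant term).
n ≥ 30, or at least n ≥ 3(k+1), meets the basic requirements for estimating the model.
VI. Statistical tests
1. Goodness-of-fit test
Adjusted coefficient of determination
Akaike information criterion and Schwarz criterion: adding an explanatory variable is justified if these criteria decrease.
2. Significance tests
Significance of the equation: H0: β1, …, βk are all zero; H1: they are not all zero. If the F statistic is too large, reject H0 in favor of the alternative, which indicates that the linear relationship of the model is significant.
When the overall linear relationship is highly significant, there is no need to insist on a high coefficient of determination.
Significance of individual variables
Confidence intervals for the parameters
To narrow the confidence interval: increase the sample size n; improve the goodness of fit of the model; increase the dispersion of the sample observations.
VII. Prediction
1. Prediction of the mean
2. Prediction of an individual value
VIII. Nonlinear models
Transformation to a linear model
Nonlinear ordinary least squares
IX. Restricted regression
1. Restrictions on the parameters
After imposing the restrictions, e*'e* ≥ e'e, that is, the residual sum of squares may increase.
Unless the restrictions are true, the explanatory power of the model may fall.
If F is too large, the restrictions are rejected.
2. Adding or dropping explanatory variables
A model with fewer variables can be viewed as a model with more variables subject to restrictions.
q = kU - kR, kU = k + q
3. Parameter stability: the Chow test of parameter stability (n2 > k)
The specification with no structural change amounts to imposing k+1 restrictions on the specification that allows change: H0: β = α. An F test is used to judge whether the restrictions are appropriate.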
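The F test for restrictions mentioned in section IX can be written out as follows. This is the standard textbook form, stated here as a sketch in the notation of the outline (e*'e* for the restricted residual sum of squares, e'e for the unrestricted one, q = kU - kR restrictions); the symbol n for the sample size and the degrees-of-freedom convention (kU regressors plus a constant) are assumptions.

```latex
F = \frac{\left(e_*' e_* - e'e\right)/q}{e'e/\left(n - k_U - 1\right)}
\sim F\left(q,\; n - k_U - 1\right),
\qquad q = k_U - k_R .
```

A large value of F means that imposing the restrictions raises the residual sum of squares by more than sampling variation would explain, so the restrictions are rejected.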
Harvard Econometrics Lecture Notes 8
SIMULTANEOUS EQUATIONS ESTIMATION
externally. Thus in the present case G and R are both endogenous and I is exogenous. The exogenous variables and the disturbance terms ultimately determine the values of the endogenous variables, once one has cut through the circularity. The mathematical relationships expressing the endogenous variables in terms of the exogenous variables and disturbance terms are known as the reduced form equations. The original equations that we wrote down when specifying the model are described as the structural equations.

We will derive the reduced form equations for G and R. To obtain that for G, we take the structural equation for G and substitute for R from the second equation:

$$G = \alpha + \beta_1 R + \beta_2 I + u = \alpha + \beta_1(\gamma + \delta_1 G + v) + \beta_2 I + u$$

Hence

$$(1 - \beta_1\delta_1) G = (\alpha + \beta_1\gamma) + \beta_2 I + u + \beta_1 v$$

and so
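The last step can be checked numerically. Dividing the final equation through by (1 - β1δ1), assuming that factor is nonzero, expresses G in terms of I, u and v alone. The sketch below is illustrative only: the function names and all parameter values are assumptions, and the equation for R is the second structural equation implied by the substitution above, R = γ + δ1G + v.

```python
def reduced_form_G(alpha, beta1, beta2, gamma, delta1, I, u, v):
    """G from the reduced form, obtained by dividing
    (1 - beta1*delta1) G = (alpha + beta1*gamma) + beta2*I + u + beta1*v
    by (1 - beta1*delta1)."""
    denom = 1.0 - beta1 * delta1
    if denom == 0.0:
        raise ValueError("reduced form undefined when beta1 * delta1 == 1")
    return (alpha + beta1 * gamma + beta2 * I + u + beta1 * v) / denom


def structural_R(gamma, delta1, G, v):
    """R from the second structural equation, R = gamma + delta1*G + v."""
    return gamma + delta1 * G + v


# Hypothetical parameter values, chosen only for illustration
alpha, beta1, beta2 = 1.0, 0.5, 2.0
gamma, delta1 = 0.3, 0.4
I, u, v = 10.0, 0.1, -0.2

G = reduced_form_G(alpha, beta1, beta2, gamma, delta1, I, u, v)
R = structural_R(gamma, delta1, G, v)

# The pair (G, R) should satisfy the structural equation for G as well
print(G, alpha + beta1 * R + beta2 * I + u)
```

The two printed numbers agree, which confirms that the reduced-form value of G is consistent with both structural equations.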
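Econometrics Lecture Notes (Chapters 1 to 4) (Econometrics, Dongbei University of Finance and Economics, Wang
Econometrics Lecture Notes, taught by Wang Weiguo

Nature of the course
Econometrics is an interdisciplinary field that combines economics, statistics and mathematics. In terms of its disciplinary character, it belongs to applied economics. More specifically, econometrics studies economic phenomena with random features under the guidance of economic theory, using the methods and techniques of mathematics, statistics and computing, with the aim of uncovering the regularities in how those phenomena develop and change.

Teaching objectives
By content, econometrics divides into theoretical econometrics and applied econometrics. This course uses multimedia teaching combined with the Eviews software to cover the most basic material of theoretical econometrics. The objectives are, first, for students to understand the econometric problems that may arise in the real economy and to master the methods and techniques for detecting and handling them; and second, for students to be able to build econometric models with the help of computer software, laying a foundation for other specialized courses and for empirical analysis of economic problems.

Majors and cohort
This syllabus applies to the econometrics course for the 2001 cohort of the Quantitative Economics major.

Class hours and credits
The course has 72 class hours and carries 4 credits.

Relation to other courses
The course requires knowledge of probability and mathematical statistics, calculus, linear algebra, Excel, microeconomics, macroeconomics and economic statistics. Mathematics courses such as probability and mathematical statistics provide the methodological foundation of econometrics. Econometrics mainly deals with methods and techniques for analyzing relationships among economic variables, and the regularities in how they evolve, when the assumptions of mathematical statistics are not met in practice. Economics supplies the theoretical groundwork: it proposes theoretical hypotheses about the relationships among economic variables but does not test them empirically. Only with a basic knowledge of econometrics can such practical problems be handled well.

Textbook and reference materials
Textbook: Basic Econometrics, 3rd edition, by Damodar N. Gujarati (USA), translated by Lin Shaogong, China Renmin University Press, 1st edition, March 2000. The textbook is a best seller in the United States and is widely used in Britain and other English-speaking countries. It takes full account of the frontier of the discipline, places great weight on teaching and practicing the fundamentals, and explains difficult material in accessible terms.
Reference materials:
1. Wang Weiguo, Econometrics, Dongbei University of Finance and Economics, 2001.
2. Aaron C. Johnson, Econometrics: Basic and Applied

Allocation of class hours
Lecture 1. Introduction: the nature and scope of econometrics
Section 1. What is econometrics?
I. The origins of econometrics
II. The definition of econometrics: several definitions of econometrics.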
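Wooldridge, Econometrics, chap. 5
Chapter 5: OLS Asymptotics (the sample size grows without bound: large-sample properties of OLS)

5.1 Consistency
(1) Definition of convergence in probability
(2) Definition of convergence in mean square
(3) Rules for probability limits
(4) Laws of large numbers (weak law of large numbers; Chebyshev and Khinchin)
(5) Consistency
- Assumptions: MLR.1 to MLR.4
- Inconsistency arises when MLR.4 fails. Simple regression model. Multiple regression model: in general, if $x_1$ is correlated with $u$ while all other regressors are uncorrelated with $u$, then all of $\hat\beta_1, \ldots, \hat\beta_k$ are inconsistent. Special case: if $x_1$ is correlated with $u$, all other regressors are uncorrelated with $u$, and the other regressors are also uncorrelated with $x_1$, then only $\hat\beta_1$ is inconsistent.
Comparison with bias: similarities and differences (the difference between sample and population).

5.2 Asymptotic normality (role: in large samples it can replace assumption MLR.6)
(1) Definition of convergence in distribution: notation $\xrightarrow{d}$, limiting distribution
(2) Central limit theorem
(3) Asymptotic distribution (derived from the limiting distribution, but distinct from it), notation $\overset{a}{\sim}$
- Assumptions: MLR.1 to MLR.5. What happens if homoskedasticity fails? Asymptotic normality still holds, but the asymptotic variance is computed differently, so the usual t and F statistics must be replaced by versions based on the corrected variance.
- Interpretation: as $n \to \infty$, $\hat\beta_j$ converges in mean square, that is, it converges to its expectation and its variance converges to 0. For finite but large n, however, $\hat\beta_j$ can be treated as approximately normally distributed with a variance that has not yet shrunk to 0; this is the asymptotic variance. See equation 5.7. As $n \to \infty$, the estimated asymptotic variance goes to 0 at rate $1/n$, so the standard error goes to 0 at rate $1/\sqrt{n}$.
- Where does equation 5.7 come from? From the term $\left(\sum_i \hat r_{ij}^2\right)^{-1}$.

5.3 Asymptotic efficiency
- Definition: a consistent and asymptotically normal estimator is asymptotically efficient if its asymptotic covariance matrix is no larger than that of any other consistent and asymptotically normal estimator.
- Greene, p. 76: we have not shown that OLS is optimal in large samples by "any" criterion. Theorem 5.3 only tells us that OLS is optimal, that is, asymptotically efficient, within a certain class of estimators.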
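Consistency and the shrinking sampling error discussed in 5.1 and 5.2 can be illustrated with a small simulation. This is a sketch under stated assumptions, not material from the notes: the data-generating process (standard normal regressor and error, true slope 2), the seeds and the function names are all chosen for the example.

```python
import random


def ols_slope(x, y):
    """OLS slope in a simple regression of y on x with an intercept."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / sxx


def simulate_slope(n, seed=0, beta0=1.0, beta1=2.0):
    """Draw one sample of size n from y = beta0 + beta1*x + u,
    with x and u independent standard normals (so MLR.4 holds),
    and return the OLS slope estimate."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [beta0 + beta1 * xi + rng.gauss(0.0, 1.0) for xi in x]
    return ols_slope(x, y)


# The estimate should settle near the true slope 2.0 as n grows
for n in (50, 500, 5000, 20000):
    print(n, simulate_slope(n))
```

Under this design the standard deviation of the slope estimate is roughly $1/\sqrt{n}$, so increasing n by a factor of 100 should shrink the typical error by a factor of about 10.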
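Econometrics Tutoring Notes.doc
Extracurricular tutoring notes for the course "Econometrics"

Note: these sessions focus on the key and difficult parts of the course material; they are not exam-oriented review sessions. (The key is understanding the material, then mastering it, then applying it.)

Main contents of the course:
Chapter 2: Basic ideas of linear regression: the two-variable model
Chapter 3: The two-variable model: hypothesis testing
Chapter 4: Multiple regression: estimation and hypothesis testing
Chapter 5: Functional forms of regression equations
Chapter 6: Dummy variable regression models
Chapter 7: Model selection: criteria and tests (may be skipped for the ethnic-minority class)
Chapter 8: Multicollinearity
Chapter 9: Heteroskedasticity
Chapter 10: Autocorrelation

Contents of the first tutoring session: Chapters 2, 3 and 4.

I. Basic form of the classical linear regression model (note the composition of the random error term)

$$Y_i = B_1 + B_2 X_i + u_i, \qquad E(Y \mid X_i) = B_1 + B_2 X_i, \qquad \hat Y_i = b_1 + b_2 X_i$$

$$Y_i = \hat Y_i + e_i, \qquad Y_i = E(Y \mid X_i) + u_i$$

II. Basic assumptions of the classical linear regression model
Assumption 1: the regression model is linear in the parameters and correctly specified.
Assumption 2: the explanatory variables are uncorrelated with the random disturbance u (this holds automatically when the explanatory variables are nonstochastic).
Assumption 3 (zero mean): $E(u) = 0$.
Assumption 4 (homoskedasticity): $\mathrm{Var}(u_i)$ = constant.
Assumption 5 (no autocorrelation): $\mathrm{Cov}(u_i, u_j) = 0$, $i \ne j$.
Assumption 6: the random error u is normally distributed with mean zero and constant (common) variance: $u_i \sim N(0, \sigma^2)$.
Assumption 7: there is no exact linear relationship among the explanatory variables.
Note on the meaning of "linear" in the linear regression model: in general, linearity may refer to linearity in the explanatory variables or linearity in the parameters. Here we mean linearity in the parameters.

III. Parameter estimation in the classical linear regression model
1. Estimation method: ordinary least squares (OLS)
2. The least squares principle: choose the parameters so that the residual sum of squares (RSS) over all observations is as small as possible. In mathematical form:

$$\min \sum e_i^2 = \min \sum (Y_i - \hat Y_i)^2 = \min \sum (Y_i - b_1 - b_2 X_i)^2$$

Applying the first-order conditions for an extremum yields the normal equations, which can be solved for the estimators.
3. Properties of the OLS estimators. Gauss-Markov theorem: if the basic assumptions of the classical linear regression model hold, then among all linear unbiased estimators the OLS estimator has the smallest variance. That is, the OLS estimator is the best linear unbiased estimator (BLUE).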
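For reference, the normal equations and their solution in the two-variable case take the following standard form. This is a sketch in the $b_1$, $b_2$ notation used above; $n$ denotes the sample size and $\bar X$, $\bar Y$ the sample means, symbols introduced here for the illustration.

```latex
\begin{aligned}
\sum Y_i &= n\,b_1 + b_2 \sum X_i, \\
\sum X_i Y_i &= b_1 \sum X_i + b_2 \sum X_i^2, \\[4pt]
b_2 &= \frac{\sum (X_i - \bar X)(Y_i - \bar Y)}{\sum (X_i - \bar X)^2},
\qquad
b_1 = \bar Y - b_2 \bar X .
\end{aligned}
```

The first two lines are obtained by setting the partial derivatives of $\sum (Y_i - b_1 - b_2 X_i)^2$ with respect to $b_1$ and $b_2$ equal to zero; the last line solves them.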
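Econometrics Chapter 5: PPT Teaching Plan
$\beta_j$ gives the change in the mean of Y, E(Y), when $X_j$ changes by one unit while the other explanatory variables are held constant. In other words, $\beta_j$ gives the "direct" or "net" effect (excluding the other variables) of a one-unit change in $X_j$ on the mean of Y.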
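The n stochastic equations of the population regression model can be written in matrix form as

$$Y = X\beta + \mu$$

where

$$X = \begin{bmatrix} 1 & X_{11} & \cdots \\ 1 & X_{12} & \cdots \\ \vdots & \vdots & \\ 1 & X_{1n} & \cdots \end{bmatrix}$$

Therefore,

$$\hat\beta \sim N\left(\beta,\; \sigma^2 (X'X)^{-1}\right)$$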
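Let $c_{ii}$ denote the i-th element on the main diagonal of $(X'X)^{-1}$. The variance of the parameter estimator $\hat\beta_i$ is then $\sigma^2 c_{ii}$, so that

$$\hat\beta_i \sim N(\beta_i,\; \sigma^2 c_{ii})$$

where $\sigma^2$ is the population variance of the random error term. Because the population is unknown, this variance is unknown as well, so in practical computation it is replaced by its estimator.

Second-order condition:

$$\frac{\partial^2 Q}{\partial \hat\beta\, \partial \hat\beta'} = 2X'X$$

is a positive definite matrix, so

$$\hat\beta = (X'X)^{-1} X'Y$$

is the solution that minimizes Q.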
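Key point: positive definite matrices

For any nonzero vector c, let

$$a = c'X'Xc$$

Then

$$a = c'X'Xc = v'v = \sum v_i^2, \qquad v = Xc$$

Unless every element of v is zero, a is positive. But if v is zero, then

$$v = Xc = 0$$

which contradicts the assumption that the column vectors of X are linearly independent. Hence, when X has full column rank, X'X must be positive definite.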
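The argument that $c'X'Xc = v'v \ge 0$ can be checked numerically. The sketch below is illustrative only: the matrix X, the vector c and the function names are assumptions made for the example. It computes the quadratic form in two ways, once as the sum of squares of v = Xc and once through the matrix X'X.

```python
def quad_form_via_v(X, c):
    """Compute c'X'Xc as v'v, where v = Xc."""
    v = [sum(x_ij * c_j for x_ij, c_j in zip(row, c)) for row in X]
    return sum(v_i * v_i for v_i in v)


def xtx(X):
    """Form the matrix X'X from a list-of-rows matrix X."""
    k = len(X[0])
    return [[sum(row[i] * row[j] for row in X) for j in range(k)]
            for i in range(k)]


def quad_form_via_xtx(X, c):
    """Compute c'(X'X)c directly from the matrix X'X."""
    M = xtx(X)
    k = len(c)
    return sum(c[i] * M[i][j] * c[j] for i in range(k) for j in range(k))


# Hypothetical design matrix: a constant and one regressor, n = 3
X = [[1.0, 2.0], [1.0, 3.0], [1.0, 5.0]]
c = [1.0, -1.0]
print(quad_form_via_v(X, c), quad_form_via_xtx(X, c))
```

Both routes give the same positive number for this c. Because the two columns of X are linearly independent, no nonzero c can make Xc equal to zero, so the quadratic form is strictly positive for every nonzero c.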
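Recall from linear algebra:
A matrix that has no inverse is called a singular matrix; a matrix that has an inverse is called nonsingular.
An n-th order square matrix A is nonsingular if and only if the determinant of A is not equal to zero.
Proof that X'X is invertible: the determinant of a matrix is nonzero if and only if the matrix has full rank. Since

$$\mathrm{rank}(X'X) = \mathrm{rank}(X) = k + 1$$

and X'X is a square matrix of order (k+1)×(k+1), X'X is nonsingular and therefore invertible.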
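Econometrics Lecture Notes (new)
Chapter 1: Introduction
§ Econometrics

I. The origin and development of econometrics

Econometrics is a branch of economics whose subject matter is the quantitative relationships that exist objectively in economic activity. Its founder, R. Frisch, defined it as the combination of economic theory, statistics and mathematics, while stressing that it is quite distinct from each of these three disciplines taken separately.

The term "econometrics" was coined in 1926 by the Norwegian economist R. Frisch, modeled on the word "biometrics". In December 1930, Frisch, Tinbergen, Fisher and other economists founded the Econometric Society in Cleveland, USA. The journal Econometrica began publication in 1933; in its opening editorial Frisch defined econometrics as the combination of economic theory, mathematics and statistics.

Intellectual and historical roots of econometrics:
In the 17th century, the English economist William Petty, in his book Political Arithmetick, used "number, weight, or measure" to describe economic phenomena.
In the 19th century, the French economist Cournot, in Researches into the Mathematical Principles of the Theory of Wealth, argued that certain economic categories such as demand, price and supply can be viewed as functions of one another, so that the relationships in a market can be expressed by a system of functional equations and some economic laws can be stated systematically in mathematical language (he is regarded as the founder of the mathematical school).
Later, the French-born economist Walras, who taught at Lausanne in Switzerland, created general equilibrium theory, using simultaneous equations to study the conditions that determine general equilibrium (he is regarded as the pioneer of the Lausanne school).
The Italian economist Pareto developed general equilibrium theory and used solid geometry to study the relationships among economic variables.
In 1890, the publication of Principles of Economics by Marshall (founder of the Cambridge school) made mathematics an indispensable tool of description and analytical reasoning in economic research, laying a foundation for econometrics.

Since its birth in the 1930s, econometrics has shown great vitality. On the one hand, the need for policies of economic intervention led many countries to adopt econometric theory and methods widely for economic forecasting, market research and the evaluation of economic policy. On the other hand, with the progress of science and technology, the sciences increasingly cooperated and interpenetrated, and computer science, mathematics, systems theory, information theory and control theory entered economic research one after another. The rapid development of computing in particular paved the way for the wide application of econometrics. The development of econometrics is the process by which econometric models have been built, applied and developed.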
Harvard University, Department of Economics
Economics 2140A, Spring Term 1997

A. Regression: Point Estimation

1. Univariate regression is the study of regression functions with one dependent variable and any number of independent variables. "Univariate" refers to the number of dependent variables, namely one. Univariate regression can be contrasted with multivariate regression, a set of regression functions with the same independent variables. In multivariate regression there are several dependent variables and, possibly, several independent variables. It is customary to refer to univariate regression as multiple regression, since there are multiple independent variables. In presenting univariate regression or multiple regression we find it convenient to employ matrix notation. We begin our presentation with a brief review of matrix notation and its applications to the differential calculus and to probability theory.

A vector, say $a$, is an array of numbers:

$$a = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_m \end{bmatrix}. \tag{1}$$

We can add vectors, say $a$ and $b$, by adding their elements:

$$a + b = \begin{bmatrix} a_1 + b_1 \\ a_2 + b_2 \\ \vdots \\ a_m + b_m \end{bmatrix}. \tag{2}$$

The sum of two vectors, $a$ and $b$, is another vector, denoted $a + b$. We refer to the operation of adding vectors as vector addition.

The operation of vector addition is commutative, that is, the sum of two vectors is independent of the order of the vectors:

$$a + b = b + a. \tag{3}$$

The sum of three (or more) vectors can be defined in terms of successive sums of pairs of vectors; the operation of vector addition is associative, that is, independent of the order in which vectors are combined:

$$a + (b + c) = (a + b) + c. \tag{4}$$

If we define the zero vector, say $0$, as the vector with all elements zero:

$$0 = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \tag{5}$$

then the sum of the zero vector and any vector, say $a$, is the vector $a$:

$$0 + a = a. \tag{6}$$

Furthermore, there is a unique vector, say $-a$, such that:

$$a + (-a) = 0. \tag{7}$$

This completes our discussion of vector addition.

We can multiply a vector, say $a$, by a constant, say $c$, by multiplying each element of $a$ by $c$:

$$ca = \begin{bmatrix} ca_1 \\ ca_2 \\ \vdots \\ ca_m \end{bmatrix}. \tag{8}$$

Again, we obtain another vector, denoted $ca$. To distinguish a single number like $c$ from a vector like $a$, we refer to $c$ as a scalar. We refer to the multiplication of a vector by a scalar as scalar multiplication.

The operation of scalar multiplication is associative. If we have two scalars, $c$ and $d$, and a vector $a$, then:

$$c(da) = (cd)a. \tag{9}$$

Scalar multiplication is distributive in vectors; where $a$ and $b$ are vectors:

$$c(a + b) = ca + cb. \tag{10}$$

Scalar multiplication is also distributive in scalars; where $c$ and $d$ are scalars:

$$(c + d)a = ca + da. \tag{11}$$

Finally, scalar multiplication of a vector by unity results in the same vector:

$$1 \cdot x = x. \tag{12}$$

This completes our discussion of scalar multiplication.

We can transpose a vector $a$ by writing it as an array of numbers in the form:

$$a' = [a_1 \; a_2 \; \ldots \; a_m]. \tag{13}$$

We refer to the vector $a'$ as a row vector and the original vector $a$ as a column vector. We can add row vectors and carry out scalar multiplication, as before. Further, we can multiply a row vector, say $a'$, and a column vector, say $b$, by summing the products of their elements:

$$a'b = [a_1 \; a_2 \; \cdots \; a_m] \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix} = a_1 b_1 + a_2 b_2 + \cdots + a_m b_m = \sum a_i b_i. \tag{14}$$

We refer to the product of a row vector and a column vector as the inner product of the two vectors. Obviously,

$$a'b = b'a. \tag{15}$$

The inner product of $a'$ and $b$ is the same as the inner product of $b'$ and $a$.

At this point we draw attention to an important property of vector addition and of the inner product of a row vector and a column vector. These operations require that the vectors have the same number of elements. For vectors with differing numbers of elements, these operations are not defined. Where vector addition or the inner product operation is defined for two vectors, we say that these vectors are conformable. In applications of these operations we must verify that the vectors involved are conformable.
For example, if it is proposed that we form the sum of two vectors with differing numbers of elements, we see that the two vectors are not conformable, so that vector addition is not defined. Similarly, if it is proposed that we form the inner product of a row vector and a column vector with differing numbers of elements, again, the vectors are not conformable, so that the inner product is not defined.

2. We can now introduce matrix notation. A matrix, say $A$, is a rectangular array of numbers:

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}.$$

Alternatively, we can regard a matrix as an array of vectors:

$$A = [a_1 \; a_2 \; \ldots \; a_n],$$

where the jth element of this array, say $a_j$, can be represented in vector form as:

$$a_j = \begin{bmatrix} a_{1j} \\ a_{2j} \\ \vdots \\ a_{mj} \end{bmatrix}, \qquad (j = 1, 2 \ldots n).$$

We can define matrix addition by creating a new matrix with elements equal to the sum of the corresponding elements of the two matrices being added. For two matrices, say $A$ and $B$, we can write:

$$A + B = \begin{bmatrix} a_{11} + b_{11} & a_{12} + b_{12} & \cdots & a_{1n} + b_{1n} \\ a_{21} + b_{21} & a_{22} + b_{22} & \cdots & a_{2n} + b_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} + b_{m1} & a_{m2} + b_{m2} & \cdots & a_{mn} + b_{mn} \end{bmatrix}. \tag{19}$$

Matrix addition has the same properties as vector addition. Matrix addition is commutative, so that:

$$A + B = B + A. \tag{20}$$

It is also associative, so that:

$$A + (B + C) = (A + B) + C. \tag{21}$$

We can define the zero matrix, say $0$, as the matrix with all elements zero:

$$0 = \begin{bmatrix} 0 & \cdots & 0 \\ \vdots & & \vdots \\ 0 & \cdots & 0 \end{bmatrix}. \tag{22}$$

The sum of the zero matrix and any matrix, say $A$, is the matrix $A$:

$$0 + A = A. \tag{23}$$

There is a unique matrix, say $-A$, such that:

$$A + (-A) = 0. \tag{24}$$

This completes our discussion of matrix addition.

We can define scalar multiplication for a matrix, say $A$, as:

$$cA = \begin{bmatrix} ca_{11} & ca_{12} & \cdots & ca_{1n} \\ ca_{21} & ca_{22} & \cdots & ca_{2n} \\ \vdots & \vdots & & \vdots \\ ca_{m1} & ca_{m2} & \cdots & ca_{mn} \end{bmatrix}, \tag{25}$$

where $c$ is a scalar. Scalar multiplication for matrices has the same properties as scalar multiplication for vectors. Scalar multiplication is associative:

$$c(dA) = (cd)A, \tag{26}$$

where $c$ and $d$ are scalars. Scalar multiplication is distributive in matrices:

$$c(A + B) = cA + cB, \tag{27}$$

and scalars:

$$(c + d)A = cA + dA. \tag{28}$$

Finally, scalar multiplication by unity leaves a matrix unchanged:

$$1 \cdot A = A.$$

This completes our discussion of scalar multiplication of matrices.

We can define the transpose of a matrix as a matrix with the columns of the original matrix as rows. For a matrix $A$ the transpose matrix, say $A'$, is defined as:

$$A' = \begin{bmatrix} a_{11} & a_{21} & \cdots & a_{m1} \\ a_{12} & a_{22} & \cdots & a_{m2} \\ \vdots & \vdots & & \vdots \\ a_{1n} & a_{2n} & \cdots & a_{mn} \end{bmatrix}.$$

The transpose of the transpose of a matrix, say $A$, is the matrix $A$:

$$(A')' = A.$$

Obviously, a matrix can be represented as an array of its rows as well as its columns. For example, we can represent the transpose matrix $A'$ in the form:

$$A' = \begin{bmatrix} a_1' \\ a_2' \\ \vdots \\ a_n' \end{bmatrix},$$

where the vector $a_j'$ is the transpose of the vector $a_j$ $(j = 1, 2 \ldots n)$ defined above.

A matrix is symmetric if it is equal to its own transpose:

$$A = A'. \tag{32}$$

For a symmetric matrix we obtain an identical element by interchanging any pair of row and column subscripts:

$$a_{ij} = a_{ji}, \qquad (i, j = 1, 2 \ldots m). \tag{33}$$

We have introduced the operations of matrix addition and scalar multiplication. We can now introduce matrix multiplication, defined as follows:

$$AB = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} b_{11} & b_{12} & \cdots \\ b_{21} & b_{22} & \cdots \\ \vdots & \vdots & \\ b_{n1} & b_{n2} & \cdots \end{bmatrix} = \begin{bmatrix} \sum a_{1i} b_{i1} & \sum a_{1i} b_{i2} & \cdots \\ \sum a_{2i} b_{i1} & \sum a_{2i} b_{i2} & \cdots \\ \vdots & \vdots & \\ \sum a_{mi} b_{i1} & \sum a_{mi} b_{i2} & \cdots \end{bmatrix}. \tag{34}$$

The element in the ith row and jth column of the matrix product $AB$ is the sum of the products of the elements of the ith row of $A$ and the corresponding elements of the jth column of $B$.

Matrix multiplication is associative:

$$A(BC) = (AB)C; \tag{35}$$

this property can be verified by direct computation. Matrix multiplication is not usually commutative, so that, in general:

$$AB \ne BA,$$

even where both matrix products are defined. Matrix addition and matrix multiplication have the distributive property:

$$A(B + C) = AB + AC.$$

Again, this property can be verified by direct computation. We can easily show that the transpose of a matrix product is the product of the transposes in reverse order:

$$(AB)' = B'A'.$$

This implies that the matrix product of a matrix and its own transpose is symmetric, since:

$$(A'A)' = A'A.$$

We have introduced a zero matrix, having the property that the sum of the zero matrix and any other matrix is the other matrix.
Similarly, we can introduce an identity or unit matrix, say $I$, where:

$$I = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix}.$$

The matrix product of the identity matrix and any matrix, say $A$, is the matrix $A$:

$$IA = A.$$

This property can be verified by direct computation.

If $A$ has the same number of rows and columns, we refer to $A$ as a square matrix. For a square matrix we can define the inverse matrix, say $A^{-1}$, as a matrix such that:

$$AA^{-1} = I.$$

The product of a square matrix and its inverse is the identity matrix. Not every square matrix has an inverse, but if the inverse of a matrix exists:

$$AA^{-1}A = A,$$

so that:

$$A^{-1}A = AA^{-1} = I,$$

and:

$$(A^{-1})^{-1} = A.$$

The inverse of the inverse matrix of a matrix $A$ is the matrix $A$.

If the inverse of a square matrix exists, then it is unique, for if $A_1^{-1}$ and $A_2^{-1}$ are two inverses of the matrix $A$:

$$AA_1^{-1} = AA_2^{-1} = I.$$

But then:

$$A_1^{-1}AA_1^{-1} = A_1^{-1}AA_2^{-1},$$

which implies:

$$A_1^{-1} = A_2^{-1}.$$

This completes our discussion of matrix multiplication.

Matrix addition can be defined in terms of vector addition; the sum of two matrices, say $A + B$, is the matrix with columns equal to the vector sum of the corresponding columns of the matrices $A$ and $B$. Similarly, matrix multiplication can be defined in terms of the inner product operation for vectors. The element of the matrix product $AB$ in the ith row and jth column is the inner product of the ith row of $A$ and the jth column of $B$, considered as row and column vectors. Matrices, like vectors, must be conformable for the operations of matrix addition and matrix multiplication to be defined. For two matrices to be conformable for matrix addition they must have the same number of rows and the same number of columns. For two matrices to be conformable for matrix multiplication the number of columns of the first matrix must be equal to the number of rows of the second matrix.
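The definition of the matrix product as a table of inner products, together with the conformability requirement, can be sketched in a few lines. The code is an illustration rather than part of the handout; the function names and the example matrices are assumptions, and matrices are represented as plain lists of rows.

```python
def transpose(A):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    """Matrix product AB; element (i, j) is the inner product of the
    ith row of A and the jth column of B."""
    if len(A[0]) != len(B):
        raise ValueError("not conformable: columns of A != rows of B")
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
I = [[1, 0], [0, 1]]

print(matmul(A, B))          # AB
print(matmul(B, A))          # BA, which differs from AB here
print(matmul(I, A) == A)     # the identity leaves A unchanged
# Transpose of a product equals the product of transposes in reverse order
print(transpose(matmul(A, B)) == matmul(transpose(B), transpose(A)))
```

Passing a pair such as a 1×2 matrix and a 3×1 matrix to `matmul` raises an error, which mirrors the statement that the product is not defined for matrices that are not conformable.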
In applications of these operations we must verify that the matrices involved are conformable; if they are not conformable, the operations are not defined.

We can define the determinant of a square matrix $A$, denoted $|A|$, as a certain scalar function of the elements of $A$. First, we define a submatrix of a matrix $A$, not necessarily square, as a matrix obtained by striking rows and columns of the matrix $A$. A minor is the determinant of a square submatrix of the matrix $A$. The minor corresponding to an element $a_{ij}$ of a square matrix $A$ is the determinant of the square submatrix obtained by striking the ith row and jth column of $A$. The cofactor $A_{ij}$ of an element $a_{ij}$ of a square matrix $A$ is $(-1)^{i+j}$ times the minor of $a_{ij}$. The determinant of $A$ can be defined in terms of cofactors as follows:

$$|A| = \sum_{j=1}^{n} a_{ij} A_{ij} = \sum_{i=1}^{n} a_{ij} A_{ij}.$$

We can express the cofactors of $A$ in terms of cofactors of matrices of order $n-2$, these cofactors in terms of cofactors of matrices of order $n-3$, and so on, until we express the definition in terms of cofactors of matrices of order two. These cofactors involve determinants of matrices of order one, which are equal to the corresponding elements.

An example of the definition of the determinant is the determinant of a matrix of order two:

$$|A| = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21}.$$

A second example is the determinant of an identity matrix:

$$|I| = \begin{vmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 1 \end{vmatrix} = 1.$$

A square matrix is diagonal if $a_{ij} = 0$ for $i \ne j$. The product of two diagonal matrices is diagonal. The determinant of a diagonal matrix is the product of the diagonal elements:

$$|A| = \prod_{i=1}^{n} a_{ii}.$$

Similarly, a square matrix is triangular if $a_{ij} = 0$ for $i < j$ or $i > j$. In the first case we say that the matrix is lower triangular; in the second case we say that the matrix is upper triangular.
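The recursive definition of the determinant through cofactors translates directly into code. The sketch below expands along the first row only, which is one of the equivalent row or column expansions in the definition above; the function names and test matrices are assumptions made for the example, and the recursion is meant for small matrices rather than efficient computation.

```python
def minor_matrix(A, i, j):
    """Submatrix obtained by striking row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]


def det(A):
    """Determinant by cofactor expansion along the first row."""
    n = len(A)
    if any(len(row) != n for row in A):
        raise ValueError("determinant is defined for square matrices only")
    if n == 1:
        return A[0][0]
    # (-1)**j is the cofactor sign (-1)^(i+j) for the first row (i = 0)
    return sum((-1) ** j * A[0][j] * det(minor_matrix(A, 0, j))
               for j in range(n))


print(det([[1, 2], [3, 4]]))                      # a11*a22 - a12*a21
print(det([[1, 0, 0], [0, 1, 0], [0, 0, 1]]))     # identity matrix
print(det([[2, 0, 0], [0, 3, 0], [0, 0, 6]]))     # diagonal matrix
```

For the diagonal example the result equals the product of the diagonal elements, in line with the formula for diagonal matrices.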
The determinant of a triangular matrix is the product of the diagonal elements.

Three important properties of determinants are that the determinant of the transpose of a matrix is equal to the determinant of the matrix:

$$|A| = |A'|;$$

second, that the determinant of a matrix product is the product of the determinants:

$$|AB| = |A| \cdot |B|;$$

third, that the determinant of a matrix with two identical rows or two identical columns is equal to zero.

We say that a matrix $A$ is singular if its determinant is equal to zero and nonsingular if its determinant is not equal to zero. A matrix has an inverse if and only if it is nonsingular. To show this we observe that if a matrix $A$ has an inverse:

$$|I| = |A \cdot A^{-1}| = |A| \cdot |A^{-1}| = 1,$$

so that $|A|$ is not equal to zero. Conversely, if $|A|$ is not equal to zero, then the typical element of $A^{-1}$, say $a^{ij}$, can be written:

$$a^{ij} = \frac{A_{ji}}{|A|}, \qquad (i, j = 1, 2 \ldots n),$$

where $A_{ji}$ is the cofactor of $a_{ji}$ and $|A|$ is the determinant of $A$. To show that this matrix is the inverse of $A$ we carry out the matrix multiplication:

$$A \cdot A^{-1} = (a_{ij})(a^{ij}) = (a_{ij})\left(\frac{A_{ji}}{|A|}\right) = \left(\sum_k a_{ik} \frac{A_{jk}}{|A|}\right) = I.$$

The typical element of this matrix product is unity for $i = j$, since:

$$\sum_k a_{ik} A_{ik} = |A|,$$

and zero for $i \ne j$, since $\sum_k a_{ik} A_{jk}$ is the determinant of a matrix with two identical rows.

Similarly, we can define the trace of a square matrix of order $n$, denoted $\operatorname{tr} A$, as the sum of the diagonal elements of $A$:

$$\operatorname{tr} A = \sum_{i=1}^{n} a_{ii}.$$

The following properties of the trace can be verified directly:

$$\operatorname{tr}(A + B) = \operatorname{tr} A + \operatorname{tr} B, \qquad \operatorname{tr} AB = \operatorname{tr} BA.$$

3. We turn next to applications of matrix notation.
The first application we consider is to the solution of a system of linear equations:

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \ldots + a_{1m}x_m &= y_1, \\ a_{21}x_1 + a_{22}x_2 + \ldots + a_{2m}x_m &= y_2, \\ &\;\;\vdots \\ a_{m1}x_1 + a_{m2}x_2 + \ldots + a_{mm}x_m &= y_m. \end{aligned}$$

We assume at the outset that the number of equations and the number of unknowns are equal.

We can represent our system of equations in matrix notation as follows:

$$Ax = y, \tag{52}$$

where $A$ is a matrix:

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mm} \end{bmatrix},$$

and $x$ and $y$ are vectors:

$$x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{bmatrix}, \qquad y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{bmatrix}.$$

The problem of solving a system of equations is to express the unknowns $\{x_i\}$ $(i = 1, 2 \ldots m)$ in terms of the known constants $\{y_i\}$ $(i = 1, 2 \ldots m)$ and the coefficients $\{a_{ij}\}$ $(i, j = 1, 2 \ldots m)$.

If the matrix of coefficients $A$ has an inverse, this inverse is unique and:

$$x = Ix = A^{-1}Ax = A^{-1}y.$$

We conclude that the vector $A^{-1}y$ is the unique solution to the system of equations (A52). If the matrix $A$ of coefficients has an inverse, then there exists a unique solution to this system of equations. Conversely, if there exists a unique solution to the system of equations (A52) for any vector $y$, we can take $y$ to be, successively, the unit vectors $u_1, u_2 \ldots u_m$, where $u_i$ has unity in the ith position and zeros elsewhere. We denote the unique solutions corresponding to these vectors $a^1, a^2 \ldots a^m$:

$$A[a^1 \; a^2 \; \ldots \; a^m] = [u_1, u_2 \ldots u_m] = I,$$

so that these vectors are the columns of $A^{-1}$:

$$[a^1 \; a^2 \; \ldots \; a^m] = A^{-1}.$$

We conclude that there exists a unique solution to the system of equations (A52) for any vector $y$ if and only if there exists an inverse to the matrix of coefficients $A$.
Of course, the only solution of the equation,

$$Ax = 0,$$

is the trivial solution,

$$x = A^{-1}Ax = A^{-1} \cdot 0 = 0.$$

A set of vectors $x_1, x_2 \ldots x_m$ is linearly dependent if and only if there is a set of constants $c_1, c_2 \ldots c_m$, not all zero, such that:

$$\sum c_i x_i = 0.$$

If a set of vectors is not linearly dependent, we say that it is linearly independent. We say that the rank of a set of vectors is the maximum number of linearly independent vectors that can be chosen from the set. Finally, a vector $y$ is a linear combination of a set of vectors $x_1, x_2 \ldots x_m$ if and only if there is a set of constants $c_1, c_2 \ldots c_m$ such that:

$$\sum c_i x_i = y.$$

The main result on linear dependence is that if each of the set of vectors $y_0, y_1 \ldots y_m$ is a linear combination of the set of vectors $x_1, x_2 \ldots x_m$, then the $\{y_i\}$ are linearly dependent. To show this we proceed by induction. First, for $m = 1$, $y_0 = c_0 x_1$ and $y_1 = c_1 x_1$. If both $c_0$ and $c_1$ are zero, $y_0 + y_1 = 0$ and the two vectors are linearly dependent. Otherwise, we can suppose that $c_0 \ne 0$, so that $y_1 - \frac{c_1}{c_0} y_0 = 0$; again, the two vectors are linearly dependent. Second, suppose that the proposition is true for $m = k - 1$. By assumption we can express the $k + 1$ vectors $y_0, y_1 \ldots y_k$ in terms of the $k$ vectors $x_1, x_2 \ldots x_k$:

$$y_j = \sum_{i=1}^{k} c_{ij} x_i, \qquad (j = 0, 1 \ldots k).$$

If all the coefficients $\{c_{ij}\}$ are equal to zero, $\sum_j y_j = 0$ and the $k + 1$ vectors are linearly dependent. Otherwise, we can suppose that $c_{10}$ is not equal to zero, so that:

$$z_j = y_j - \frac{c_{1j}}{c_{10}} y_0 = \sum_{i=2}^{k} \left(c_{ij} - \frac{c_{1j}}{c_{10}} c_{i0}\right) x_i, \qquad (j = 1, 2, \ldots k),$$

where the coefficient of $x_1$ vanishes, so that the set of vectors $z_1, z_2 \ldots z_k$ is linearly dependent by the induction hypothesis. Hence, there exists a set of constants $d_1, d_2 \ldots d_k$, not all zero, such that:

$$0 = \sum d_j z_j = \sum d_j \left(y_j - \frac{c_{1j}}{c_{10}} y_0\right),$$

so that the set of vectors $y_0, y_1 \ldots y_k$ is linearly dependent.

We can now apply the main result on linear dependence to the solution of systems of linear equations.
First, we observe that any set of $m + 1$ vectors with $m$ components is linearly dependent. To show this we consider such a set of vectors $x_0, x_1 \ldots x_m$. We can express each of these vectors in terms of the $m$ unit vectors with $m$ components $u_1, u_2 \ldots u_m$. For example, if $x_i' = (x_{ij})$, then:

$$x_i = \sum_j x_{ij} u_j, \qquad (i = 0, 1 \ldots m),$$

so that the set of vectors $x_0, x_1 \ldots x_m$ is linearly dependent.

To consider the solutions to a system of $m$ equations in $m$ unknowns (A52), we take the columns of the matrix of coefficients $A$ and the vector $y$ to be a set $a_1, a_2 \ldots a_m, y$. A necessary and sufficient condition for the existence of a unique solution to this system of equations is that the set of vectors $a_1, a_2 \ldots a_m$, comprising the columns of the matrix $A$, is linearly independent. First, we observe that the set of vectors $a_1, a_2 \ldots a_m, y$ is linearly dependent, so that there exists a set of constants $c_0, c_1 \ldots c_m$, not all zero, such that:

$$c_0 y + \sum c_i a_i = 0.$$

If $c_0 = 0$, the set of vectors $a_1, a_2 \ldots a_m$ is linearly dependent, which contradicts the hypothesis that this set is linearly independent. Accordingly, we suppose that $c_0 \ne 0$ and write:

$$y = -\sum \frac{c_i}{c_0} a_i = \sum x_i a_i = Ax,$$

so that there is a solution to this set of equations. It is unique, since if there exists another set of constants $d_0, d_1 \ldots d_m$, not all zero, we have:

$$y = -\sum \frac{c_i}{c_0} a_i = -\sum \frac{d_i}{d_0} a_i.$$

Either the coefficients $\frac{c_i}{c_0}$ and $\frac{d_i}{d_0}$ are identical or:

$$\sum \left(\frac{c_i}{c_0} - \frac{d_i}{d_0}\right) a_i = 0,$$

and the set of vectors $a_1, a_2 \ldots a_m$ is linearly dependent, a contradiction.

Conversely, we suppose that there exists a unique solution to the system of equations (A52), so that:

$$Ax = \sum x_i a_i = y.$$

If the set of vectors $a_1, a_2 \ldots a_m$ is linearly dependent, there exists a set of constants $c_1, c_2 \ldots c_m$, not all zero, such that:

$$\sum c_i a_i = 0.$$

For any $\lambda \ne 0$, we can write:

$$\sum x_i a_i + \lambda \sum c_i a_i = \sum (x_i + \lambda c_i) a_i = y,$$

so that $\{x_i + \lambda c_i\}$ is another solution to the system of equations (A52), contradicting the hypothesis that the solution $\{x_i\}$ is unique.
We conclude that the set of vectors $a_1, a_2 \ldots a_m$ is linearly independent. We refer to the rank of this set of vectors as the column rank of the matrix $A$. We can say that the column rank of the matrix $A$ is equal to $m$ if and only if the system of equations (A52) has a unique solution. In this case, the matrix $A$ is nonsingular, $|A| \ne 0$, and $A$ has an inverse. Under these conditions the only solution to the equation:

$$Ax = 0,$$

is the trivial solution $x = 0$, as before.

We find it useful to consider systems of equations under the hypothesis that the set of vectors $a_1, a_2 \ldots a_m$ is linearly dependent. There exists a set of constants $c_0, c_1 \ldots c_m$, not all zero, such that:

$$\sum c_i a_i = c_0 y.$$

There are two possibilities: Either $c_0 = 0$ for all such sets of constants, so that there is no solution to the system of equations (A52) unless $y = 0$. Alternatively, suppose that there exists a set of constants for which $c_0 \ne 0$. In this case there is a solution to the system of equations:

$$y = \sum \frac{c_i}{c_0} a_i;$$

However, this solution is not unique, since the set of vectors $a_1, a_2 \ldots a_m$ is linearly dependent.

We say that a matrix $A$, not necessarily square, has rank $r$ if the maximum number of linearly independent columns of $A$ is equal to $r$. For a matrix $A$ of rank $r$ every minor of order $r + 1$ must be equal to zero, since any square submatrix of order $r + 1$ has columns such that the equation $Ax = 0$ has a nontrivial solution. Second, there is at least one minor of order $r$ not equal to zero, since for some set of $r$ columns of the matrix $A$ this equation has only the trivial solution. Since the determinant of the transpose of the matrix is equal to the determinant of the matrix itself, the rank of a matrix is also the maximum number of linearly independent rows.

The second application of matrix notation we consider is to partial differentiation of a function of several variables.
We first consider a linear form,

$$L(x_1, x_2 \ldots x_m) = a_1 x_1 + a_2 x_2 + \ldots + a_m x_m = \sum a_i x_i.$$

In vector notation we can write this function in more compact form as:

$$L(x) = a'x,$$

where $a$ and $x$ are vectors:

$$a = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_m \end{bmatrix}, \qquad x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{bmatrix}.$$

We also consider a quadratic form:

$$Q(x_1, x_2 \ldots x_m) = \sum_i \sum_j a_{ij} x_i x_j.$$

In matrix notation we can write this function in more compact form as:

$$Q(x) = x'Ax,$$

where $A$ is a matrix:

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mm} \end{bmatrix},$$

and $x$ is a vector, as before. Without loss of generality, we can take the matrix $A$ to be symmetric, since the coefficient of the typical cross-product $x_i x_j$ $(i \ne j;\ i, j = 1, 2 \ldots m)$ is $a_{ij} + a_{ji}$. We adopt the convention that:

$$a_{ij} = a_{ji} = \tfrac{1}{2}(a_{ij} + a_{ji}), \qquad (i \ne j;\ i, j = 1, 2 \ldots m).$$

This convention does not affect the value of the quadratic form.

A quadratic form is positive definite if and only if $Q(x) > 0$ for all $x \ne 0$; a quadratic form is non-negative definite if and only if $Q(x) \ge 0$ for all $x$. Similarly, a quadratic form is negative definite if and only if $Q(x) < 0$ for all $x \ne 0$; a quadratic form is non-positive definite if and only if $Q(x) \le 0$ for all $x$. If a quadratic form is neither non-negative definite nor non-positive definite, we say that it is indefinite.

If a quadratic form $Q(x)$ with matrix of coefficients $A$ is positive definite, then $A$ is nonsingular and has an inverse. Suppose that $A$ were singular; then there is a nontrivial solution to the equation:

$$Ax = 0,$$

so that the quadratic form:

$$Q(x) = x'Ax = 0,$$

which contradicts the assumption that this quadratic form is positive definite. We conclude that the matrix of coefficients $A$ of a positive definite or a negative definite quadratic form has an inverse.

Second, if $B$ is nonsingular and $A$ is the matrix of coefficients of a positive definite quadratic form, then the matrix:

$$C = B'AB,$$

is the matrix of coefficients of a positive definite quadratic form. To show this we observe that if $B$ is nonsingular, there is no nontrivial solution of the equation:

$$Bz = 0,$$

so that:

$$z'Cz = z'B'ABz > 0,$$

for $z \ne 0$.

Finally, if $A$ is the matrix of coefficients of a positive definite quadratic form, then $A^{-1}$ is positive definite.
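To show this we let $B = A^{-1}$ in the expression given above:

$$C = (A^{-1})'AA^{-1} = (A^{-1})' = A^{-1}.$$

We can write the identity matrix in the form:

$$I = AA^{-1} = A^{-1}A = A'(A^{-1})' = A(A^{-1})',$$

so that:

$$(A^{-1})' = A^{-1},$$

and $A^{-1}$ is a symmetric matrix.

We can define the partial derivative of a function, say $f$, with respect to a vector, say $x$, as the vector of partial derivatives:

$$\frac{\partial f}{\partial x} = \begin{bmatrix} \dfrac{\partial f}{\partial x_1} \\ \dfrac{\partial f}{\partial x_2} \\ \vdots \\ \dfrac{\partial f}{\partial x_m} \end{bmatrix}.$$

Applying this definition to a linear form $L(x)$, we obtain:

$$\frac{\partial L}{\partial x} = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_m \end{bmatrix} = a.$$

Similarly, applying this definition to a quadratic form $Q(x)$, we obtain:

$$\frac{\partial Q}{\partial x} = \begin{bmatrix} 2\sum_j a_{1j} x_j \\ 2\sum_j a_{2j} x_j \\ \vdots \\ 2\sum_j a_{mj} x_j \end{bmatrix} = 2Ax.$$

We can extend the notion of a partial derivative with respect to a vector by introducing the notion of a second partial derivative:

$$\frac{\partial^2 f}{\partial x\, \partial x'} = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x_1 \partial x_1} & \dfrac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots \\ \dfrac{\partial^2 f}{\partial x_2 \partial x_1} & \dfrac{\partial^2 f}{\partial x_2 \partial x_2} & \cdots \\ \vdots & \vdots & \end{bmatrix}.$$

In this definition we first differentiate the function $f$ with respect to a column vector $x$ and then differentiate the vector partial derivative with respect to the row vector $x'$. Obviously, the order of differentiation can be interchanged:

$$\frac{\partial^2 f}{\partial x'\, \partial x} = \frac{\partial^2 f}{\partial x\, \partial x'}.$$

The first and second vector partial derivatives are useful in the second-order Taylor's series expansion of a function with continuous second-order partial derivatives:

$$f(x_1, x_2 \ldots x_m) = f(x_1^0, x_2^0 \ldots x_m^0) + \sum_i \frac{\partial f}{\partial x_i}(x_1^0, x_2^0 \ldots x_m^0)(x_i - x_i^0) + \frac{1}{2} \sum_i \sum_j \frac{\partial^2 f}{\partial x_i \partial x_j}(x_1^1, x_2^1 \ldots x_m^1)(x_i - x_i^0)(x_j - x_j^0),$$

where:

$$x_i^1 = \theta x_i^0 + (1 - \theta) x_i, \qquad 0 \le \theta \le 1, \qquad (i = 1, 2 \ldots m).$$

In matrix notation, we can represent this expansion in the form:

$$f(x) = f(x^0) + \frac{\partial f}{\partial x}(x^0)'(x - x^0) + \frac{1}{2}(x - x^0)' \frac{\partial^2 f}{\partial x\, \partial x'}(x^1)(x - x^0),$$

where:

$$x^0 = \begin{bmatrix} x_1^0 \\ x_2^0 \\ \vdots \\ x_m^0 \end{bmatrix}, \qquad x^1 = \begin{bmatrix} x_1^1 \\ x_2^1 \\ \vdots \\ x_m^1 \end{bmatrix}.$$

For a twice differentiable function $f$ a necessary condition for a local minimum at $x^0$ is that the vector partial derivative is equal to zero:

$$\frac{\partial f}{\partial x}(x^0) = \begin{bmatrix} \dfrac{\partial f}{\partial x_1}(x^0) \\ \dfrac{\partial f}{\partial x_2}(x^0) \\ \vdots \\ \dfrac{\partial f}{\partial x_m}(x^0) \end{bmatrix} = 0.$$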
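The derivative rule for a quadratic form with a symmetric matrix of coefficients, $\partial Q / \partial x = 2Ax$, can be checked against a finite-difference approximation. The block below is a sketch; the matrix, the evaluation point, the step size and the function names are assumptions made for the example.

```python
def quad_form(A, x):
    """Q(x) = x'Ax for a square matrix A given as a list of rows."""
    m = len(x)
    return sum(A[i][j] * x[i] * x[j] for i in range(m) for j in range(m))


def grad_analytic(A, x):
    """Vector partial derivative 2Ax, valid when A is symmetric."""
    return [2.0 * sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def grad_numeric(f, x, h=1e-6):
    """Central-difference approximation to the vector of partials of f."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        down = list(x)
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2.0 * h))
    return grad


A = [[2.0, 1.0], [1.0, 3.0]]   # symmetric, hypothetical coefficients
x0 = [1.0, -2.0]
print(grad_analytic(A, x0))
print(grad_numeric(lambda x: quad_form(A, x), x0))
```

The two gradients agree up to rounding error. Setting $2Ax$ equal to zero gives the first-order condition for a minimum of the quadratic form, which for a nonsingular $A$ is satisfied only at $x = 0$.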
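Econometrics Lecture Notes, Lecture 5 (of ten)

Lecture 5: Autocorrelation

Gauss-Markov assumption 5 is:

$$\mathrm{Covariance}(\varepsilon_i, \varepsilon_j) = \delta_{\varepsilon_i \varepsilon_j} = 0, \qquad i \ne j.$$

If this assumption fails, the error term of the model is said to be serially correlated. Because serial correlation mainly concerns time series data, below we write t in place of i and T in place of the sample size N.

Notes:
1. If the error term of a regression model based on cross-sectional data is correlated, this is called spatial autocorrelation. Keep in mind, however, that unless the ordering of the observations has some logical or economic meaning, the ordering in a cross-sectional regression is arbitrary. The errors may therefore show one pattern of autocorrelation under one ordering and a different pattern under another. With time series, by contrast, the observations follow a natural ordering in time.
2. In time series regressions of economic variables, the error term is often called a shock. Shocks to an economic system are often persistent, which gives a practical reason to expect serially correlated errors.

I. Consequences of Autocorrelation

In proving the Gauss-Markov theorem, the no-serial-correlation assumption was used only to show that the OLS estimator has the smallest variance among all linear unbiased estimators; it was not used in proving linearity or unbiasedness. Violating the assumption of no autocorrelation therefore does not affect linearity or unbiasedness; it affects only the minimum-variance property.

The proof of minimum variance had two steps, the first of which was to compute the variance of the OLS estimator. For the model

$$y_t = \beta_0 + \beta_1 x_t + \varepsilon_t$$

we have:

$$\delta^2_{\hat\beta_1} = \mathrm{Variance}\left(\beta_1 + \frac{\sum (x_t - \bar x)\varepsilon_t}{\sum (x_t - \bar x)^2}\right) = \frac{\mathrm{Variance}\left(\sum (x_t - \bar x)\varepsilon_t\right)}{\left[\sum (x_t - \bar x)^2\right]^2}.$$

Under assumption 5, $\delta_{\varepsilon_t \varepsilon_{t+j}} = 0$ for $j \ne 0$, this gives:

$$\delta^2_{\hat\beta_1} = \frac{\sum (x_t - \bar x)^2\, \delta^2_\varepsilon}{\left[\sum (x_t - \bar x)^2\right]^2}.$$

If assumption 5 fails, the correct expression for the variance is:

$$\delta^2_{\hat\beta_1} = \frac{\sum_{t=1}^{T} (x_t - \bar x)^2\, \delta^2_\varepsilon + 2\sum_{t=1}^{T-1} \sum_{j=1}^{T-t} (x_t - \bar x)(x_{t+j} - \bar x)\, \delta_{\varepsilon_t \varepsilon_{t+j}}}{\left[\sum_{t=1}^{T} (x_t - \bar x)^2\right]^2}.$$

Hence the usual expression for the variance of the coefficient estimator under OLS is wrong.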
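The gap between the usual variance formula and the correct one can be made concrete with a small calculation. The sketch below is illustrative: it assumes, beyond what the notes state, that the error covariances follow a first-order autoregressive pattern, $\delta_{\varepsilon_t \varepsilon_{t+j}} = \rho^j \delta^2_\varepsilon$. The function name, the regressor values and the parameter values are likewise assumptions.

```python
def slope_variances(x, sigma2, rho):
    """Return (usual, correct) variances of the OLS slope.

    usual   : sigma2 / sum((x_t - xbar)^2), valid without autocorrelation
    correct : adds the cross terms 2 * sum_t sum_j d_t d_{t+j} cov(t, t+j),
              with cov(t, t+j) = rho**j * sigma2 (AR(1) pattern, assumed)
    """
    T = len(x)
    x_bar = sum(x) / T
    d = [xt - x_bar for xt in x]
    sxx = sum(dt * dt for dt in d)
    usual = sigma2 / sxx
    cross = sum(d[t] * d[t + j] * sigma2 * rho ** j
                for j in range(1, T)
                for t in range(T - j))
    correct = (sxx * sigma2 + 2.0 * cross) / sxx ** 2
    return usual, correct


x = [1.0, 2.0, 3.0, 4.0, 5.0]          # hypothetical trending regressor
print(slope_variances(x, sigma2=1.0, rho=0.0))
print(slope_variances(x, sigma2=1.0, rho=0.5))
```

With no autocorrelation the two expressions coincide. With positive autocorrelation and a trending regressor, the correct variance in this example exceeds the usual one, so standard errors computed from the usual formula would overstate the precision of the slope estimate.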
Harvard University Econometrics Exam: Final 05
Department of Economics, Harvard University
Economics 1123, Fall 2005

Final Exam
2:15 p.m., Saturday January 21, 2006

Instructions
1. Do not turn this page until so instructed.
2. This exam is three hours long.
3. The exam has five parts for a total of 100 points.
4. Please put each question in a separate blue book (five blue books total). Put your name and Harvard ID number on the cover of each blue book.
5. You are permitted one two-sided 8½" x 11" sheet of notes, plus a calculator. Computers and wireless devices are not permitted without prior permission. You may not share resources with anyone else.
6. Turn in this exam when you finish. (An electronic copy of the exam with answers will be posted on the course Web site in a few days, and hard copies of the exam will be available when you pick up your graded blue books.)
7. Some questions ask you to draw a real-world judgment in a problem of practical importance. The quality of that judgment counts. For example, consider the question: "It is 10°F outside. In your judgment, why are so many people wearing heavy coats?" The answer, "To stay warm" would receive more points than the answer, "Because they are fashion-conscious."

Background for Parts I and II: Voting on Women's Issues

Parts I and II examine the relationship between the gender of a U.S. representative's children and his/her voting record on "women's issues." The data pertain to votes taken during the 105th Congress (1997-1998; each Congress lasts two years). The observational unit is a U.S. representative (House of Representatives only, no senators). There are 435 representatives, but the study focuses on the 371 who have at least one child (regressions with fewer than n = 371 reflect some missing opinion survey data). Among these 371 representatives with at least one child, 89% are men and the mean age is 53.

Two voting measures are considered.
The first ("Teen contraceptive") is binary: whether the representative voted to support a specific bill that would increase teenagers' access to contraceptives. The second ("NOW") is a score ranging from 0 to 100 based on votes on multiple bills related to women's issues, computed by the National Organization for Women (NOW), measuring the agreement between the representative's votes and the voting recommendations made by NOW (0 to 100, with 100 = perfect agreement).

The data set contains variables that measure the characteristics of the representative's district and the results of a political opinion survey administered to voters in his/her district.

Variables in the Voting Data Set

Variable                                   Definition
Teen contraceptive                         = 1 if the representative voted in favor of a specific bill increasing teen access to contraception, = 0 otherwise
NOW                                        Composite NOW voting score: 0 = complete disagreement with NOW's positions, 100 = complete agreement with NOW's positions
Fraction daughters                         fraction of the representative's children who are female (range is 0 to 1)
District characteristics
  Registered Democrat                      proportion of voters registered as Democratic Party (0 to 1)
  District income                          median income in district (thousands of dollars)
  Fraction white                           fraction of district voters who are white (0 to 1)
  Fraction college grads                   fraction of district voters who are college graduates (0 to 1)
District opinions
  Abortion should be legal                 fraction of survey respondents in district who agree (0 to 1)
  Women are equal to men                   fraction of survey respondents in district who agree (0 to 1)
  Anti-crime spending should increase      fraction of survey respondents in district who agree (0 to 1)
  Social service spending should increase  fraction of survey respondents in district who agree (0 to 1)
  Should be laws to protect homosexuals
    from discrimination                    fraction of survey respondents in district who agree (0 to 1)

The questions in Parts I and II refer to Table 1.

Table 1
The Effect of Having Daughters on Representatives' Votes

                                        (1)               (2)               (3)               (4)               (5)
Dependent variable                      Teen contra-      Teen contra-      NOW               NOW               Fract.
                                        ceptives?         ceptives?                                             daughters
Estimation method                       Probit            OLS               OLS               OLS               OLS

Regressors
Intercept                               -0.51** (0.10)    0.38** (0.06)     40.2** (4.1)      38.6** (2.3)      0.07 (0.29)
Fraction daughters                      0.36** (0.12)     0.13** (0.05)     6.18* (2.67)      6.01* (2.86)
District characteristics
  Registered Democrat                   0.71** (0.28)     0.23** (0.09)     84.27** (11.57)   82.1** (15.8)     0.20 (0.26)
  District income                                                                             0.21 (0.20)       0.00 (0.00)
  Fraction white                                                                              -8.6 (9.5)        0.08 (0.19)
  Fraction college grads                                                                      -108.5 (77.7)     -1.72 (1.58)
District opinions
  Abortion should be legal                                                                    41.0* (20.6)      -0.32 (0.40)
  Women are equal to men                                                                      -20.6 (23.1)      0.25 (0.29)
  Anti-crime spending should increase                                                         30.2 (18.7)       0.82 (0.52)
  Social service spending should increase                                                     -14.8 (16.8)      -1.53** (0.47)
  Should be laws to protect homosexuals
    from discrimination                                                                       10.2 (13.9)       -0.06 (0.36)

N                                       371               371               371               331               331

F-statistics testing that the coefficients on variables in a group are all zero
  District characteristics                                                                    0.93 (0.46)       1.10 (0.36)
  District opinions                                                                           1.98 (0.081)      1.41 (0.220)

Notes: Heteroskedasticity-robust standard errors appear in parentheses next to regression coefficients, and p-values appear in parentheses next to F-statistics. The regressions are estimated using data on U.S. representatives during the 105th Congress (1997-1998). Significant at the: **1%, *5% significance level.

Questions for Part I

1) Interpret the coefficient on Fraction daughters in regression (2). (3 points)
2) Consider a representative with 2 daughters and 1 son, from a district in which 55% of voters are registered Democrats.
   a) Using regression (1), compute the probability that this representative voted in favor of the bill on teen access to contraception. (3 points)
   b) Using regression (2), compute the probability that this representative voted in favor of the bill on teen access to contraception. (3 points)
3) Does the coefficient on Fraction daughters change substantially (in a real-world sense) from regression (3) to regression (4)? What does this tell you about the additional variables that were included in regression (4)? (3 points)
4) A critic asserts that a shortfall of this study is that it focuses exclusively on daughters, indicating gender bias by the author. The critic suggests adding one more regressor to regression (4), specifically, Fraction sons, which is the fraction of males among the representative's children. What would be learned from this regression? Be specific. (3 points)
5) Another critic suggests that more conservative districts might elect representatives with fewer daughters, so that Fraction daughters is endogenous. The author responds that regression (5) provides evidence against this hypothesis, because Fraction daughters is (with only one exception) unpredictable by the other regressors and thus is exogenous. Do you agree or disagree with the author's response? Why? Be precise. (3 points)

Questions for Part II

1) The following questions concern regression (4):
   a) Provide a potential reason why the coefficient on District income in (4) is subject to omitted variable bias. (2 points)
   b) Comment on the following statement: Your answer to the previous question implies that the conditional mean of the error term in (4) is nonzero, given the regressors in (4). Therefore, the first least squares assumption is violated and the coefficient on Fraction daughters in (4) does not have a causal interpretation. (3 points)

For the remaining questions, suppose (hypothetically) that the data set is extended to be panel data for T = 3 Congresses, the 105th (1997-1998), 106th (1999-2000), and 107th (2001-2002) Congresses. The observational unit would be a representative (his/her votes, children, and district) in a given Congressional session. The data set would consist of all representatives who were elected to Congress for all three sessions. Suppose n = 300, so there is a total of 900 observations (representatives are elected for two-year terms, and almost all who run for reelection are reelected).

2) Representatives in the 105th Congress who retire, are not reelected, or die would be in the cross-sectional data set used in Table 1, but would not be in the panel data set. Would this introduce sample selection bias into the panel data estimate of the effect of Fraction daughters? (3 points)

Regardless of your answer to question (2), for the rest of these questions, ignore the possibility of sample selection bias.

3) To what extent would including representative fixed effects address the "endogeneity" criticism? Explain. (3 points)
4) Would it be appropriate to include time fixed effects, in addition to representative fixed effects, in the panel data regression? Explain. (3 points)
5) Consider a hypothetical panel data version of regression (4) in Table 1, in which both representative fixed effects and time fixed effects are included. Call this hypothetical regression (P4) ("P" for panel).
   a) What is the problem that is solved by "clustered" or "HAC" standard errors, and how do clustered standard errors solve that problem? (3 points)
   b) In regression (P4), which would you recommend using: conventional (heteroskedasticity-robust) standard errors or clustered standard errors? Explain, with specific reference to regression (P4). (3 points)
   c) Suppose that the author estimated regression (P4), using the standard errors you recommended in part (b). Using your judgment, do you think that these standard errors in hypothetical panel regression (P4) would be smaller, larger, or about the same as those in the cross-section regression (4) in Table 1? Explain. (3 points)

Background to Parts III and IV: Female Labor Supply

Harvard economist Claudia Goldin attributes much of the rise of professional women in the U.S. labor force to their ability to engage in family planning after the introduction of the birth-control pill.
In developing countries early childbearing is associated with lower levels of education and more dependency of women on their husband's earnings.

This question examines the effect of family size on female labor supply. The data set consists of n = 254,654 married women (aged 21 to 35), as reported in the 1980 U.S. Census of the Population (the data pertain to the full calendar year of 1979).

Variables in the Female Labor Supply Data Set

Variable                      Definition
Wife's weeks worked           No. of weeks wife worked for pay in 1979
Husband's weeks worked        No. of weeks husband worked for pay in 1979
Same sex                      = 1 if first two children have same sex, = 0 otherwise
2 boys                        = 1 if first two children are boys, = 0 otherwise
2 girls                       = 1 if first two children are girls, = 0 otherwise
Kids>2                        = 1 if family has more than 2 children, = 0 otherwise
Boy first                     = 1 if first child is a boy, = 0 otherwise
Current age of mother         age of mother in 1979
Age of mother at 1st birth    age of mother at birth of first child
Black                         = 1 if black
Hispanic                      = 1 if Hispanic
Other race                    = 1 if nonwhite/nonblack/nonHispanic

The questions in Parts III and IV refer to Table 2.

Table 2
Child Sex Composition, Family Size, and Labor Supply

                            (1)              (2)              (3)              (4)              (5)              (6)
Dependent variable          Kids>2           Kids>2           Wife's weeks     Wife's weeks     Wife's weeks     Husband's weeks
                                                              worked           worked           worked           worked
Estimation method           OLS              OLS              OLS              TSLS             TSLS             TSLS
Instruments                                                                    Same sex         2 boys, 2 girls  Same sex

Regressors
Same sex                    .0694** (.0018)
2 boys                                       .0599** (.0026)
2 girls                                      .0789** (.0026)
Kids>2                                                        -8.04** (0.09)   -5.40** (1.21)   -5.16** (1.20)   1.01 (0.63)
Boy first                   -.0011 (.0019)   -.0015 (.0026)   -0.05 (0.08)     -0.02 (0.08)     -0.02 (0.08)     0.03 (0.08)
Current age of mother       .0304** (.0003)  .0304** (.0003)  1.33** (0.01)    1.25** (0.04)    1.25** (0.04)    0.10* (0.04)
Age of mother at 1st birth  -.0436** (.0003) -.0436** (.0003) -1.36** (0.17)   -1.24** (0.05)   -1.24** (0.05)   -0.21** (0.06)
Black                       .0680** (.0042)  .0680** (.0042)  10.83** (0.19)   10.66** (0.21)   10.64** (0.21)   -4.10** (0.26)
Hispanic                    .1260** (.0039)  .1260** (.0039)  -0.04 (0.18)     -0.38 (0.23)     -0.41 (0.23)     -2.61** (0.23)
Other race                  .0480** (.0044)  .0480** (.0044)  2.82** (0.20)    2.70** (0.21)    2.69** (0.21)    2.02** (0.18)

N                           254,654          254,654          254,654          254,654          254,654          254,654

F-statistic on Same sex: 1413.0
F-statistic on 2 boys, 2 girls: 725.9
J-statistic: 3.24

Notes: Regressions (4), (5), and (6) are estimated by two stage least squares (TSLS) regression, in which the included endogenous variable is Kids>2. Heteroskedasticity-robust standard errors appear in parentheses next to regression coefficients, and p-values appear in parentheses under F-statistics. All regressions include an estimated intercept, which is not reported. Regressions (1) to (5) are estimated using data on married women for 1979; regression (6) is estimated using data for the husbands of those married women. Significant at the: **1%, *5% significance level.

Questions for Part III

1) Give the best reason you can why the OLS estimator of the coefficient on Kids>2 in Table 2, column (3) might be biased. (3 points)
2) Consider the hypothesis that, on average, U.S. parents want to have children of both genders (that is, they prefer at least one girl and one boy to all girls or all boys). Does Table 2 provide evidence in favor of this hypothesis, against this hypothesis, or neither? Explain. (3 points)
3) Consider the following potential instrumental variables for Kids>2 in regression (3):
   a) Whether wife came from large family (binary) (3 points)
   b) The teen pregnancy rate in the wife's city or town of residence (3 points)
   For each proposed instrument, is the variable arguably a valid instrumental variable? Briefly explain.
4) Based on a combination of your judgment and the empirical results in Table 2:
   a) Is Same sex a valid instrument in regression (4)? (3 points)
   b) Is the pair of variables, 2 boys and 2 girls, a valid set of instruments in regression (5)? (3 points)
5) The estimated coefficient on Kids>2 differs in regressions (3) and (4) (the OLS estimate is more negative than the TSLS estimate). Provide a real-world explanation (an interpretation of the results) that explains why the OLS estimate is more negative than the TSLS estimate. (3 points)

Questions for Part IV

1) Consider a hypothetical regression (7),

   Wife's weeks worked_i = β0 + β1 Kids>2_i + u_i        (7)

   which would be estimated by TSLS, using Same sex as an instrument (so regression (7) is regression (4) without the variables Boy first, ..., Other race). For this question, assume that Same sex is a valid instrument in regression (4) and in addition that Same sex is distributed independently of all the control variables in regression (4), so E(Boy first|Same sex) = 0, ..., E(Other race|Same sex) = 0.
   a) Explain why Same sex would be a valid instrument in regression (7). (3 points)
   b) Provide a reason why, despite the validity of Same sex as an instrument in regression (7), you would still prefer regression (4). (3 points)
2) Some women are more ambitious professionally than others. Suppose that the effect on labor force participation of having a large family is not the same for every woman; specifically, the more ambitious the woman, the smaller is the effect (the most ambitious women will work whether or not they have a large family). How, if at all, would this change your interpretation of the results in regressions (4) and (5)? Explain your reasoning. (5 points)

Use Table 2 to comment on the following statements. For each statement, do you agree or disagree with the statement, and explain why (be specific).

3) Families with large numbers of children tend to be unusual in certain ways, in some cases coming from certain religious/ethnic backgrounds (traditional Catholic families, Mormons, etc.). So the analysis in regressions (4) and (5) is not providing a valid estimate of the effect of family size on labor supply; it just reflects this religious/ethnic effect. (3 points)
4) Even though having large families reduces female labor force participation, this is only half of the story because their husbands will work more to compensate for the loss of the wife's earnings. (3 points)

Background to Part V: The Term Spread and Output Growth

The U.S. Treasury issues bonds of different maturities. A 10-year bond is debt that is paid off over 10 years. A one-year bond is debt that is paid off over one year. Usually, the rate of interest on a 10-year bond exceeds the rate of interest on a one-year bond. If short-term interest rates are unusually high, however, then the rate of interest on a one-year bond can exceed the rate of interest on a 10-year bond. The difference between the rate of interest on a long-term bond (here, the 10-year bond) and the rate of interest on a short-term bond (here, the one-year bond) is called the Term Spread. If the 10-year rate is 4.5 (percent) and the 1-year rate is 3.5 (percent), then the spread is 1.0 (percentage points).

The Term Spread is often viewed as a measure of monetary policy. If monetary policy is especially tight, then short-term interest rates are high relative to long-term interest rates, and the term spread is negative.

Over the past few months, the Term Spread in the U.S. has fallen, and just recently it became negative for the first time since the onset of the recession in 2000.

The Term Spread data set contains quarterly time series data for the U.S. from the first quarter of 1960 (1960:I) through the third quarter of 2005 (2005:III). The data are plotted in Figure 1.

Variables in Term Spread Data Set

Variable       Definition
GDP growth     quarterly growth rate of GDP, expressed in percent at an annual rate (computed using the logarithmic approximation, GDP growth = 400 ln(GDP_t/GDP_t-1), where GDP_t is the real Gross Domestic Product of the U.S. in quarter t). (Quarterly GDP is the total value of final goods and services produced in the United States in that quarter.)
Term Spread    the interest rate on a 10-year U.S. Treasury bill, minus the interest rate on a 1-year U.S. Treasury bill.

Figure 1. Time series plots of quarterly GDP growth and Term Spread, 1960:I to 2005:III

The questions in Part V refer to Table 3.

Table 3
GDP Growth and the Term Spread
Dependent variable: GDP growth_t

                          (1)               (2)               (3)               (4)               (5)
Sample period             1960:I-2005:III   1960:I-2005:III   1960:I-2005:III   1960:I-1984:IV    1985:I-2005:III

Regressors
Intercept                 2.42** (0.38)     2.04** (0.52)     1.85** (0.45)     2.05** (0.56)     2.07** (0.57)
GDP growth_t-1            0.27** (0.08)     0.24** (0.08)     0.26** (0.07)     0.23* (0.10)      0.25* (0.12)
GDP growth_t-2                              0.18 (0.14)
GDP growth_t-3                              -0.06 (0.08)
GDP growth_t-4                              0.01 (0.10)
Term Spread_t-1                                               0.67** (0.25)     1.56** (0.44)     0.18 (0.20)

Quandt Likelihood Ratio
  (QLR) statistic
  (p-value in parentheses) 1.18 (0.41)      1.71 (0.32)       5.37 (0.03)       2.59 (0.26)       2.88 (0.24)
T                         183               183               183               100               83
SER                       3.3               3.2               3.1               3.8               1.93
F-statistic testing zero
  coefficients on GDP
  growth_t-2, GDP
  growth_t-3, and GDP
  growth_t-4 (p-value in
  parentheses)                              1.27 (0.29)

Notes: Estimation is by OLS, with heteroskedasticity-robust standard errors in parentheses. The regressions are estimated over the sample period given in the first row. The QLR statistic is for all the regressors in the regression, including the intercept. Significant at the: **1%, *5%, +10% significance level.

Questions for Part V (20 points). Please answer these questions in Blue Book V.

1) The value of GDP growth in 2005:III was 4.1 (that is, in the third quarter of 2005, GDP grew by 4.1% at an annual rate).
   a) Use regression (1) in Table 3 to compute a forecast of GDP growth for 2005:IV. (3 points)
   b) Suppose that the errors in regression (1) are normally distributed. Compute a 95% prediction interval (forecast interval) for GDP growth in 2005:IV. (3 points)
   c) Suppose that forecast errors come in clusters, for example, some years have more volatile GDP growth than others, so that GDP growth is more difficult to predict in some years than in others. Suggest a modification of regression (1) in Table 3 that would produce more reliable forecast intervals if there is this forecast error volatility clustering. (2 points)
2) Table 3 reports heteroskedasticity-robust standard errors. Should it report HAC standard errors instead? Explain. (2 points)
3) In Business Week Online (January 9, 2006), David Wyss, chief economist for Standard and Poor's, wrote about how the recent decline of Term Spread has created worries about a slowdown in U.S. economic growth. Based on the results in Table 3, do you think that these worries are justified? Fully explain your reasoning. (5 points)
4) Suppose the U.S. Federal Reserve Bank is considering setting Term Spread to 1.0, that is, increasing Term Spread from its current value of approximately zero by 1.0 percentage point. (Suppose that, because long rates are more sluggish than short rates, the Fed can do this by lowering short-term interest rates until Term Spread equals 1.0.)
   a) Use regression (5) to estimate the effect of this easing. (1 point)
   b) In your judgment, do you think that your answer in (a) provides a good estimate of the effect of this proposed policy intervention by the Fed? Why or why not? (4 points)

Selected Tables from Stock and Watson, Introduction to Econometrics
Econometrics (5th Edition) Teaching Slides 5
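Critical values by model, statistic, and sample size

Model  Statistic           Significance   Sample size
                           level          25      50      100     250     500     >500
1      t on lagged level   0.01           -2.66   -2.62   -2.60   -2.58   -2.58   -2.58
2      t on lagged level   0.01           -3.75   -3.58   -3.51   -3.46   -3.44   -3.43
2      t on constant       0.01           3.41    3.28    3.22    3.19    3.18    3.18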
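- then the random time series is said to be stationary, and the stochastic process is a stationary stochastic process.
This notion is also called wide-sense (weak) stationarity.
• A white noise process is stationary: $X_t = \mu_t,\ \mu_t \sim N(0, \sigma^2)$
• A random walk process is nonstationary: $X_t = X_{t-1} + \mu_t,\ \mu_t \sim N(0, \sigma^2)$, with $\mathrm{Var}(X_t) = t\sigma^2$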
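The variance result for the random walk stated above can be checked with a small simulation. This is only a sketch under stated assumptions: the walk starts at X_0 = 0 (the slide does not give a starting value), the shocks are independent normal draws, and the function name `random_walk_variance` and its parameters are hypothetical.

```python
import random


def random_walk_variance(t, n_paths=5000, sigma=1.0, seed=0):
    """Sample variance of X_t across simulated random walks.

    Assumes X_0 = 0 and independent N(0, sigma^2) shocks (assumptions made
    for this illustration).
    """
    rng = random.Random(seed)
    finals = []
    for _ in range(n_paths):
        x = 0.0
        for _ in range(t):
            x += rng.gauss(0.0, sigma)   # X_s = X_{s-1} + shock
        finals.append(x)
    mean = sum(finals) / n_paths
    return sum((v - mean) ** 2 for v in finals) / (n_paths - 1)


if __name__ == "__main__":
    for t in (5, 20, 50):
        # The simulated variance should be close to t * sigma^2 and grow with t.
        print(t, random_walk_variance(t))
```

Because the variance grows with t instead of settling at a constant, the process cannot satisfy the constant-variance requirement of stationarity.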
Model 3

Statistic           Significance   Sample size
                    level          25      50      100     250     500     >500
t on lagged level   0.01           -4.38   -4.15   -4.04   -3.99   -3.98   -3.96
                    0.025          -3.95   -3.80   -3.73   -3.69   -3.68   -3.66
                    0.05           -3.60   -3.50   -3.45   -3.43   -3.42   -3.41
                    0.10           -3.24   -3.18   -3.15   -3.13   -3.13   -3.12
t on constant       0.01           4.05    3.87    3.78    3.74    3.72    3.71
                    0.025          3.59    3.47    3.42    3.39    3.38    3.38
                    0.05           3.20    3.14    3.11    3.09    3.08    3.08
                    0.10           2.77    2.75    2.73    2.73    2.72    2.72
t on trend          0.01           3.74    3.60    3.53    3.49    3.48    3.46
                    0.025          3.25    3.18    3.14    3.12    3.11    3.11
                    0.05           2.85    2.81    2.79    2.79    2.78    2.78
                    0.10           2.39    2.38    2.38    2.38    2.38    2.38
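Harvard Economics Notes
Harvard Economics Notes (original practical edition)

Contents
1. Introduction: the background and importance of the Harvard Economics Notes
2. Main content: an overview of the main content and viewpoints of the Harvard Economics Notes
3. The theoretical system of Harvard economics: its theoretical system and characteristics
4. The methodology of Harvard economics: its methodology and applications
5. The practical significance of Harvard economics: its influence on, and lessons for, the real world
6. Conclusion: a summary of the main viewpoints and contributions of the Harvard Economics Notes

Main text

1. Introduction
The Harvard Economics Notes is an economics textbook written by economics professors at Harvard University, intended to give students a comprehensive and systematic body of economic theory and methodology.
With the rapid development of the global economy, economics is being applied ever more widely across fields, and the Harvard Economics Notes has therefore become an important reference work in economics.
This article discusses the Notes in terms of their main content, theoretical system, methodology, and practical significance.

2. Main content
The Harvard Economics Notes mainly cover microeconomics, macroeconomics, and economic policy.
The microeconomics part includes consumer behavior, producer behavior, and market structure; the macroeconomics part includes the determination of national income, inflation, and unemployment; the economic policy part includes monetary policy and fiscal policy.
This material is intended to help students understand the basic principles of economics and their applications.

3. The theoretical system of Harvard economics
The theoretical system of Harvard economics is based on neoclassical economics and emphasizes the decisive role of the market mechanism in resource allocation.
In this system, consumers seek to maximize utility, producers seek to maximize profit, and market competition allocates resources efficiently.
In addition, Harvard economics addresses market failure and government intervention, and examines how economic policy can achieve economic stability and growth.

4. The methodology of Harvard economics
The Harvard Economics Notes adopt a methodology that combines case analysis with theoretical exposition.
The book provides a wealth of real cases to help students better understand economic principles.
At the same time, the theoretical exposition stresses logical reasoning and mathematical proof, so that students can understand the basic concepts and theorems of economics more deeply.
This methodology helps develop students' ability to think independently and solve problems.

5. The practical significance of Harvard economics
The Harvard Economics Notes offer important guidance for the real world.
First, by studying Harvard economics we can better understand economic phenomena and provide a theoretical basis for policymaking.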
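Econometrics Lecture Notes: The Nature and Scope of Econometrics (.ppt)
3. How should a model be built and developed (how many variables should the model actually include)?
3. How should hypothesis tests be carried out? Significance tests of the regression results, goodness-of-fit tests, random
Why study econometrics
For students majoring in economics and business, studying econometrics is of practical use; econometrics has become an indispensable part of study and training in economics and business. (For example, someone working in a firm's sales department needs to forecast market demand for its products.)
The methodology of econometrics
(1) Statement of the theory or hypothesis;
(2) Collection of data;
(3) Specification of the mathematical model;
(4) Specification of the statistical, or econometric, model;
(5) Estimation of the parameters of the econometric model;
(6) Checking the adequacy of the model: testing the model's specification;
(7) Testing hypotheses derived from the model;
(8) Using the model for prediction.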
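a main focus of this part), and are computed with an appropriate method such as ordinary least squares. Applying least squares to these data gives equation (3) below:

$$\widehat{CLEPR} = \hat{B}_1 + \hat{B}_2\, CUNP$$

For example:

$$\widehat{CLEPR} = 69.9355 - 0.6458\, CUNP$$

The hat (∧) over CLEPR indicates that equation (3) is an estimate of equation (2). The line in the scatter diagram is the regression line estimated from the actual data. Equation (3) indicates that if the unemployment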
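A question: how accurate is the model described by equation (3), $\widehat{CLEPR} = \hat{B}_1 + \hat{B}_2\, CUNP$?
Before entering the labor market, people weigh labor market conditions on the basis of certain factors (such as the unemployment rate).
But other factors also influence the decision to enter the labor market. Hourly wages (earnings), for example, are an important determinant, because higher wages attract more workers into the labor market. There may of course be other factors as well.
If we tried to include every conceivable variable in the regression model, the model would become so unwieldy that it would be of no practical use. The model finally chosen should be a reasonable replica of reality.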
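To make the discussion of equation (3) concrete, the fitted line can be evaluated at a few unemployment rates. This is a minimal sketch, not part of the original slides. The intercept 69.9355 and the slope magnitude 0.6458 are the figures quoted for equation (3); the negative sign of the slope is an assumption here, because the operator is not legible in the source. The function name `predicted_clepr` and the sample unemployment rates are hypothetical.

```python
def predicted_clepr(cunp, b1=69.9355, b2=-0.6458):
    """Fitted value of CLEPR from equation (3) for a given CUNP.

    The negative sign of b2 is an assumption; the source shows only the
    magnitude 0.6458.
    """
    return b1 + b2 * cunp


if __name__ == "__main__":
    for cunp in (4.0, 5.0, 6.0):   # illustrative unemployment rates, in percent
        print(cunp, round(predicted_clepr(cunp), 4))
```

Under that sign assumption, each additional percentage point of CUNP lowers the fitted CLEPR by 0.6458. Any variable left out of the model, such as the hourly wage mentioned above, is absorbed into the error term and is not reflected in these fitted values.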
(7) Testing hypotheses derived from the model
Once the model has been finalized, we carry out hypothesis testing.
6 DUMMY VARIABLES (draft)

Note: This draft will not be updated in the current academic year. Nor will the slideshows. The slideshows cover the lectures, but are in an old format.

The Basic Idea

[Figure: COST against N]

Suppose that you hypothesize that the schools have the following cost functions:

Regular schools:        COST = α + βN + u
Occupational schools:   COST = α' + βN + u

These equations incorporate the implicit assumption that the schools have the same marginal cost. Only the overhead cost differs. The assumption may or may not be true and it will be relaxed later. Defining δ to be equal to (α' - α), the equation for the occupational schools can be rewritten:

Occupational schools:   COST = α + δ + βN + u

Now the two equations can be combined as

COST = α + δD + βN + u

where D is an artificial variable, known as a dummy variable, with two possible values: 0 and 1. It is set equal to 0 for observations relating to the regular schools and to 1 for observations relating to the occupational schools. By including it we allow the intercept to switch between α, for regular schools, and α + δ, for occupational schools. Hence we can run just one regression using the complete sample, instead of two separate regressions for the different types of school. This has two advantages. The main one is that we have a larger sample, which will reduce the population variances of the coefficients, and this should be reflected by smaller standard errors. The other is that we obtain a single estimate of β, instead of two separate ones that are likely to conflict. The price we have to pay is that we have to assume that β is the same for both subsamples. We will relax this assumption in due course.

Example: Regular and Occupational Schools in Shanghai

The scatter diagram shows the annual recurrent expenditure (COST), measured in yuan, then worth about 40 cents U.S., plotted against enrolment (N), for a sample of 74 secondary schools in Shanghai in the mid-1980s.
As can be seen from the diagram, the occupational schools tend to cost more to run than the regular schools.

[Figure: scatter diagram of COST against N]

We will fit a cost function including a dummy variable OCC which is equal to 1 for occupational schools and 0 for regular schools:

COST = α + δOCC + βN + u

Data

Table 9.1 shows the data for the first 10 of the 74 observations. Note that the dummy variable OCC takes value 0 if the observation relates to a regular school and 1 if it relates to an occupational school.

Table 9.1: Recurrent Expenditure (COST) and Enrolment (N) by Type of School

School   Type             COST      N     OCC
1        Occupational     345,000   623   1
2        Occupational     537,000   653   1
3        Regular          170,000   400   0
4        Occupational     526,000   663   1
5        Regular          100,000   563   0
6        Regular           28,000   236   0
7        Regular          160,000   307   0
8        Occupational      45,000   173   1
9        Occupational     120,000   146   1
10       Occupational      61,000    99   1

Once it has been defined, OCC is treated like any other explanatory variable in the regression model. The Stata output which follows shows the result of regressing COST on N and OCC.

. reg cost n occ

  Source |       SS       df       MS                  Number of obs =      74
---------+------------------------------               F(  2,    71) =   56.86
   Model |  9.0582e+11     2  4.5291e+11               Prob > F      =  0.0000
Residual |  5.6553e+11    71  7.9652e+09               R-squared     =  0.6156
---------+------------------------------               Adj R-squared =  0.6048
   Total |  1.4713e+12    73  2.0155e+10               Root MSE      =   89248

------------------------------------------------------------------------------
    cost |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
       n |   331.4493   39.75844      8.337   0.000       252.1732    410.7254
     occ |   133259.1   20827.59      6.398   0.000       91730.06    174788.1
   _cons |  -33612.55   23573.47     -1.426   0.158      -80616.71    13391.61
------------------------------------------------------------------------------

The regression result is therefore (standard errors in parentheses)

COST^ = -34,000 + 133,000 OCC + 331 N
        (24,000)  (21,000)      (40)

Putting OCC equal to 0 and 1, respectively, we can obtain the implicit cost functions shown in the following diagram for the two types of school.

Regular schools:        COST^ = -34,000 + 331 N
Occupational schools:   COST^ = -34,000 + 133,000 + 331 N = 99,000 + 331 N

[Figure: fitted cost functions, COST against N]

Extension to More than Two Categories

In actual fact there are four types of school in the sample: secondary technical schools and skilled workers' schools, the occupational schools in the previous example, and general academic schools and vocational schools. The vocational schools were classified with the general schools as regular schools in the previous example because they tended to be converted general schools without serious provision of occupational training.

[Figure: scatter diagram of COST against N]

We need to choose a reference category to which the basic equation applies. It is usually best to choose the dominant or most normal category, if there is one. We will choose general academic schools. Then define dummies for the other categories: TECH for technical schools, WORKER for skilled workers' schools, and VOC for vocational schools. They are defined to be 1 if the observation relates to that type of school and 0 otherwise.
The model is now

COST = α + δT TECH + δW WORKER + δV VOC + βN + u

Data

Table 9.2: Recurrent Expenditure, Enrolments and Type of School

School   Type               COST      N     TECH   WORKER   VOC
1        Technical          345,000   623   1      0        0
2        Technical          537,000   653   1      0        0
3        General            170,000   400   0      0        0
4        Skilled workers'   526,000   663   0      1        0
5        General            100,000   563   0      0        0
6        Vocational          28,000   236   0      0        1
7        Vocational         160,000   307   0      0        1
8        Technical           45,000   173   1      0        0
9        Technical          120,000   146   1      0        0
10       Skilled workers'    61,000    99   0      1        0

. reg cost n tech worker voc

  Source |       SS       df       MS                  Number of obs =      74
---------+------------------------------               F(  4,    69) =   29.63
   Model |  9.2996e+11     4  2.3249e+11               Prob > F      =  0.0000
Residual |  5.4138e+11    69  7.8461e+09               R-squared     =  0.6320
---------+------------------------------               Adj R-squared =  0.6107
   Total |  1.4713e+12    73  2.0155e+10               Root MSE      =   88578

------------------------------------------------------------------------------
    cost |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
       n |   342.6335    40.2195      8.519   0.000       262.3978    422.8692
    tech |   154110.9   26760.41      5.759   0.000       100725.3    207496.4
  worker |   143362.4    27852.8      5.147   0.000       87797.57    198927.2
     voc |   53228.64   31061.65      1.714   0.091      -8737.646    115194.9
   _cons |  -54893.09   26673.08     -2.058   0.043      -108104.4   -1681.748
------------------------------------------------------------------------------

The regression result is therefore (standard errors in parentheses)

COST^ = -55,000 + 154,000 TECH + 143,000 WORKER + 53,000 VOC + 342 N
        (27,000)  (27,000)       (28,000)         (31,000)     (40)

From this equation we can obtain the implicit cost functions shown in the following diagram for the four types of school.

General schools:       COST^ = -55,000 + 342 N
Technical schools:     COST^ = -55,000 + 154,000 + 342 N = 99,000 + 342 N
Skilled workers':      COST^ = -55,000 + 143,000 + 342 N = 88,000 + 342 N
Vocational schools:    COST^ = -55,000 + 53,000 + 342 N = -2,000 + 342 N

[Figure: fitted cost functions for the four types of school, COST against N]

Joint Explanatory Power of a Group of Dummy Variables

We can test the joint explanatory power of the dummy variables as a group by comparing the residual sum of squares in a regression containing them with that in an equation omitting them. If our theoretical model is

COST = α + δT TECH + δW WORKER + δV VOC + βN + u

then the null hypothesis for the test is H0: δT = δW = δV = 0. H1 is that at least one δ is non-zero.

To perform the test we need the result of regressing COST only on N:

. reg cost n

  Source |       SS       df       MS                  Number of obs =      74
---------+------------------------------               F(  1,    72) =   46.82
   Model |  5.7974e+11     1  5.7974e+11               Prob > F      =  0.0000
Residual |  8.9160e+11    72  1.2383e+10               R-squared     =  0.3940
---------+------------------------------               Adj R-squared =  0.3856
   Total |  1.4713e+12    73  2.0155e+10               Root MSE      = 1.1e+05

------------------------------------------------------------------------------
    cost |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
       n |   339.0432   49.55144      6.842   0.000       240.2642    437.8222
   _cons |    23953.3   27167.96      0.882   0.381      -30205.04    78111.65
------------------------------------------------------------------------------

The test is a standard F test for a group of explanatory variables. The numerator is the reduction in RSS when they are added, divided by the cost. The cost is the number of degrees of freedom given up, which in this case is 3, one for each extra parameter estimated. The denominator is RSS after the dummy variables have been included, divided by the number of degrees of freedom remaining, in this case 69.

F(3,69) = [(8.9160 - 5.4138)/3] / (5.4138/69) = 1.1674 / 0.07846 = 14.9

Note that the RSS figures in the regression results all carry a common factor of 10^11, but this factor can be ignored because it appears in both the numerator and the denominator of the F statistic. Note also that the ratios were calculated to four significant figures. This will ensure that the F statistic will be correct to three significant figures.
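The arithmetic of the F statistic above can be reproduced in a few lines. This is a minimal sketch that uses only the residual sums of squares and degrees of freedom reported in the two Stata outputs; the function name `f_stat_for_added_variables` and its parameter names are hypothetical.

```python
def f_stat_for_added_variables(rss_restricted, rss_unrestricted, n_added, df_remaining):
    """F statistic for the joint explanatory power of a group of added variables.

    rss_restricted:   RSS of the regression without the group (here: COST on N)
    rss_unrestricted: RSS of the regression including the group
    n_added:          number of variables in the group (degrees of freedom given up)
    df_remaining:     degrees of freedom remaining in the larger regression
    """
    numerator = (rss_restricted - rss_unrestricted) / n_added
    denominator = rss_unrestricted / df_remaining
    return numerator / denominator


# RSS values taken from the Stata outputs above.
f = f_stat_for_added_variables(8.9160e11, 5.4138e11, n_added=3, df_remaining=69)
print(round(f, 1))  # 14.9
```

Using the full RSS values, including the factor of 10^11, gives the same result as the hand calculation, which confirms that the common factor cancels.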
The critical value of F(3,69) will be a little below 4.13, the critical value for F(3,60), at the 1% significance level, so we can reject H0 at this level. Indeed if we performed a 0.1% significance test, we would still reject H0. This is only to be expected because t tests showed that δT and δW were both significantly different from zero, and it is rare (but not impossible) for the F test not to reject H0 when one or more coefficients is significant.

Change of Omitted Category

The skilled workers' schools were considerably less academic than the others, even the technical schools. Suppose that we wish to investigate whether their costs were significantly different from the others. The easiest way to do this is to make them the omitted category (reference category). Then the coefficients of the dummy variables become estimates of the differences between the overhead costs of the other types of school and those of the skilled workers' schools. Since skilled workers' schools are now the reference category, we need a dummy variable, which will be called GEN, for the general academic schools.

COST = α + δT TECH + δG GEN + δV VOC + βN + u

The data table now is as shown in Table 9.3.

Table 9.3: Recurrent Expenditure, Enrolments and Type of School

School   Type               COST      N     TECH   GEN   VOC
1        Technical          345,000   623   1      0     0
2        Technical          537,000   653   1      0     0
3        General            170,000   400   0      1     0
4        Skilled workers'   526,000   663   0      0     0
5        General            100,000   563   0      1     0
6        Vocational          28,000   236   0      0     1
7        Vocational         160,000   307   0      0     1
8        Technical           45,000   173   1      0     0
9        Technical          120,000   146   1      0     0
10       Skilled workers'    61,000    99   0      0     0

The Stata regression result is given below:

. reg cost n tech voc gen

  Source |       SS       df       MS                  Number of obs =      74
---------+------------------------------               F(  4,    69) =   29.63
   Model |  9.2996e+11     4  2.3249e+11               Prob > F      =  0.0000
Residual |  5.4138e+11    69  7.8461e+09               R-squared     =  0.6320
---------+------------------------------               Adj R-squared =  0.6107
   Total |  1.4713e+12    73  2.0155e+10               Root MSE      =   88578

------------------------------------------------------------------------------
    cost |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
       n |   342.6335    40.2195      8.519   0.000       262.3978    422.8692
    tech |   10748.51   30524.87      0.352   0.726      -50146.93    71643.95
     voc |  -90133.74   33984.22     -2.652   0.010      -157930.4   -22337.07
     gen |  -143362.4    27852.8     -5.147   0.000      -198927.2   -87797.57
   _cons |   88469.29   28849.56      3.067   0.003       30916.01    146022.6
------------------------------------------------------------------------------

The regression result is therefore (standard errors in parentheses)

COST^ = 88,000 + 11,000 TECH - 143,000 GEN - 90,000 VOC + 342 N
        (29,000) (30,000)      (28,000)      (34,000)     (40)

From this equation we can again obtain the implicit cost functions shown in the following diagram for the four types of school.

General schools:       COST^ = 88,000 - 143,000 + 342 N = -55,000 + 342 N
Technical schools:     COST^ = 88,000 + 11,000 + 342 N = 99,000 + 342 N
Skilled workers':      COST^ = 88,000 + 342 N
Vocational schools:    COST^ = 88,000 - 90,000 + 342 N = -2,000 + 342 N

Note that these equations are identical to those obtained when general schools were the omitted category. The choice of omitted category does not affect the substance of the regression results. The only components which change are the intercept and the coefficients of the dummy variables, together with their standard errors and the meaning of their t tests. R2, the coefficients of the other variables, the t statistics for the other variables, and the F statistic for the equation as a whole do not alter. And of course the diagram representing the four cost functions is the same as before.

More Than One Set of Dummy Variables

It is common to have more than one set of dummy variables in a regression equation. An example will be given here. Some of the schools were boarding schools (residential schools), others day schools.
You would expect the overhead costs of boarding schools to be relatively high, so we introduce a dummy variable, BOARD, which is equal to 1 for boarding schools and 0 for the others.

[Figure: cost functions, COST (vertical axis) against N (horizontal axis)]

For the sake of simplicity we will revert to the occupational/regular classification of school type. The model now becomes

$$COST = \alpha + \delta\,OCC + \varepsilon\,BOARD + \beta N + u$$

Data

Of the first 10 schools in the sample, the second, fourth and seventh were boarding schools. Hence the values of BOARD are as shown in Table 9.4.

Table 9.4: Recurrent Expenditure by School Type and Whether Day or Boarding

School  Type                    COST      N    OCC  BOARD
1       Occupational, day       345,000   623  1    0
2       Occupational, boarding  537,000   653  1    1
3       Regular, day            170,000   400  0    0
4       Occupational, boarding  526,000   663  1    1
5       Regular, day            100,000   563  0    0
6       Regular, day             28,000   236  0    0
7       Regular, boarding       160,000   307  0    1
8       Occupational, day        45,000   173  1    0
9       Occupational, day       120,000   146  1    0
10      Occupational, day        61,000    99  1    0

The Stata regression result is given below:

. reg cost n occ board

  Source |       SS       df       MS                  Number of obs =      74
---------+------------------------------               F(  3,    70) =   40.43
   Model |  9.3297e+11     3  3.1099e+11               Prob > F      =  0.0000
Residual |  5.3838e+11    70  7.6911e+09               R-squared     =  0.6341
---------+------------------------------               Adj R-squared =  0.6184
   Total |  1.4713e+12    73  2.0155e+10               Root MSE      =   87699

------------------------------------------------------------------------------
    cost |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
       n |    321.833   39.40225      8.168   0.000       243.2477    400.4183
     occ |   109564.6   24039.58      4.558   0.000       61619.15      157510
   board |   57909.01   30821.31      1.879   0.064      -3562.137    119380.2
   _cons |  -29045.27   23291.54     -1.247   0.217      -75498.78    17408.25
------------------------------------------------------------------------------

The regression result is therefore (standard errors in parentheses)

$$\widehat{COST} = \underset{(23{,}000)}{-29{,}000} + \underset{(24{,}000)}{110{,}000}\,OCC + \underset{(31{,}000)}{58{,}000}\,BOARD + \underset{(39)}{322}\,N$$

[Figure: cost functions, COST (vertical axis) against N (horizontal axis)]

Slope Dummy Variables

Suppose that we wish to relax the assumption that the marginal cost per student is the same for all types of school. We can do this by introducing a slope dummy variable, NOCC, defined to be the product of N and OCC:

$$COST = \alpha + \delta\,OCC + \beta N + \lambda\,NOCC + u$$

If OCC is zero, so is NOCC and the equation becomes

$$COST = \alpha + \beta N + u$$

If OCC is one, NOCC is equal to N and the equation becomes

$$COST = \alpha + \delta + (\beta + \lambda) N + u$$

$\lambda$ is therefore the incremental marginal cost associated with occupational schools, in the same way that $\delta$ is the incremental overhead cost associated with them.

Data

School  Type          COST      N    OCC
1       Occupational  345,000   623  1
2       Occupational  537,000   653  1
3       Regular       170,000   400  0
4       Occupational  526,000   663  1
5       Regular       100,000   563  0
6       Regular        28,000   236  0
7       Regular       160,000   307  0
8       Occupational   45,000   173  1
9       Occupational  120,000   146  1
10      Occupational   61,000    99  1

. g nocc=n*occ
. reg cost n occ nocc

  Source |       SS       df       MS                  Number of obs =      74
---------+------------------------------               F(  3,    70) =   49.64
   Model |  1.0009e+12     3  3.3363e+11               Prob > F      =  0.0000
Residual |  4.7045e+11    70  6.7207e+09               R-squared     =  0.6803
---------+------------------------------               Adj R-squared =  0.6666
   Total |  1.4713e+12    73  2.0155e+10               Root MSE      =   81980

------------------------------------------------------------------------------
    cost |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
       n |   152.2982   60.01932      2.537   0.013       32.59349     272.003
     occ |  -3501.177   41085.46     -0.085   0.932      -85443.55    78441.19
    nocc |   284.4786   75.63211      3.761   0.000       133.6351    435.3221
   _cons |   51475.25   31314.84      1.644   0.105      -10980.24    113930.7
------------------------------------------------------------------------------

[Figure: cost functions, COST (vertical axis) against N (horizontal axis)]

Joint Explanatory Power of the Intercept and Slope Dummy Variables

The joint explanatory power of the intercept and slope dummies can be tested using the usual F test for a group of variables. RSS in the regression without the dummy variables was $8.9160 \times 10^{11}$, and in the regression with the dummy variables it was $4.7045 \times 10^{11}$. The F statistic is therefore (the common factor of $10^{11}$ cancels)

$$F(2,70) = \frac{(8.9160 - 4.7045)/2}{4.7045/70} = 31.3$$

The critical value of F(2,70) at the 1% significance level is a little below 4.98, the critical value for F(2,60), so we come to the conclusion that the null hypothesis $H_0$: $\delta = \lambda = 0$ should be rejected. We know from the t tests that $\lambda$ is significantly different from zero but $\delta$ is not.

Chow Test

The Chow test is designed to test whether the same regression model can be applied to two or more distinct subsamples of observations in the sample. We will illustrate it with reference to the school cost function data, making a simple distinction between regular and occupational schools. We need to run three regressions. In the first we regress COST on N using the whole sample. We have already done this (see above). This is called the pooled regression. We make a note of RSS for it, $8.9160 \times 10^{11}$. In the second and third we run the same regression for the two subsamples of regular and occupational schools separately and again make a note of RSS.

. reg cost n if occ==0

  Source |       SS       df       MS                  Number of obs =      40
---------+------------------------------               F(  1,    38) =   13.53
   Model |  4.3273e+10     1  4.3273e+10               Prob > F      =  0.0007
Residual |  1.2150e+11    38  3.1973e+09               R-squared     =  0.2626
---------+------------------------------               Adj R-squared =  0.2432
   Total |  1.6477e+11    39  4.2249e+09               Root MSE      =   56545

------------------------------------------------------------------------------
    cost |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
       n |   152.2982   41.39782      3.679   0.001       68.49275    236.1037
   _cons |   51475.25   21599.14      2.383   0.022       7750.064    95200.43
------------------------------------------------------------------------------

. reg cost n if occ==1

  Source |       SS       df       MS                  Number of obs =      34
---------+------------------------------               F(  1,    32) =   55.52
   Model |  6.0538e+11     1  6.0538e+11               Prob > F      =  0.0000
Residual |  3.4895e+11    32  1.0905e+10               R-squared     =  0.6344
---------+------------------------------               Adj R-squared =  0.6229
   Total |  9.5433e+11    33  2.8919e+10               Root MSE      = 1.0e+05

------------------------------------------------------------------------------
    cost |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
       n |   436.7769   58.62085      7.451   0.000       317.3701    556.1836
   _cons |   47974.07   33879.03      1.416   0.166      -21035.26    116983.4
------------------------------------------------------------------------------

RSS is $1.2150 \times 10^{11}$ for the regular schools and $3.4895 \times 10^{11}$ for the occupational schools. The total RSS from the subsample regressions is therefore $4.7045 \times 10^{11}$. It is lower than RSS for the pooled regression because the subsample regressions fit their subsamples better than the pooled regression. The question is whether the difference in the fit is significant, and once again we test this with an F test.
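The subsample output above can be cross-checked against the slope dummy regression reported earlier. The following is a minimal sketch in Python (not part of the original Stata session); the variable names are illustrative, and every number is copied from the Stata output in the text. It checks that the two subsample RSS values add up to the RSS of the regression with OCC and NOCC, and that the occupational-school intercept and slope equal α + δ and β + λ from that regression.

```python
# Values copied from the Stata output in the text; names are illustrative only.
rss_regular = 1.2150e11        # reg cost n if occ==0
rss_occupational = 3.4895e11   # reg cost n if occ==1
rss_slope_dummy = 4.7045e11    # reg cost n occ nocc

# Coefficients from the slope dummy regression (reg cost n occ nocc).
alpha_hat = 51475.25    # _cons
beta_hat = 152.2982     # n
delta_hat = -3501.177   # occ
lambda_hat = 284.4786   # nocc

# Total RSS after splitting the sample.
rss_split = rss_regular + rss_occupational

# Implied intercept and slope for occupational schools (OCC = 1).
occ_intercept = alpha_hat + delta_hat
occ_slope = beta_hat + lambda_hat

print(f"Split-sample RSS:       {rss_split:.4e}")
print(f"Slope dummy RSS:        {rss_slope_dummy:.4e}")
print(f"Occupational intercept: {occ_intercept:.2f}")
print(f"Occupational slope:     {occ_slope:.4f}")
```

The implied occupational intercept and slope agree, up to rounding in the printed output, with the coefficients of the regression restricted to occ==1, which is why the two approaches leave the same residual sum of squares.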
The numerator is the improvement in fit on splitting the sample, divided by the cost (having to estimate two sets of parameters instead of only one). In this case it is $(8.9160 - 4.7045) \times 10^{11}$ divided by 2 (we have had to estimate two intercepts and two slope coefficients, instead of only one of each). The denominator is the joint RSS remaining after splitting the sample, divided by the joint number of degrees of freedom remaining. In this case it is $4.7045 \times 10^{11}$ divided by 70 (74 observations, less four degrees of freedom because two parameters were estimated in each equation). When we calculate the F statistic the $10^{11}$ factors cancel out and we have

$$F(2,70) = \frac{4.2115/2}{4.7045/70} = 31.3$$

The critical value of F(2,70) at the 1% significance level is a little below 4.98, the critical value for F(2,60), so we come to the conclusion that there is a significant improvement in the fit on splitting the sample and that we should not use the pooled regression. Note that this test is exactly equivalent to the F test on the joint explanatory power of the intercept and slope dummy variables.

[Figure: cost functions, COST (vertical axis) against N (horizontal axis)]
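The Chow test arithmetic described above can be reproduced with a few lines of Python. This is a sketch under stated assumptions, not part of the original Stata session: the function name `chow_f` and its parameter names are hypothetical, the RSS values and sample size are those quoted in the text, and the 4.98 threshold is the F(2,60) critical value the text uses as an upper bound for F(2,70).

```python
def chow_f(rss_pooled, rss_split, k, n_obs, n_groups=2):
    """Chow test F statistic (hypothetical helper).

    k is the number of parameters in each regression (intercept plus slopes),
    n_obs the total number of observations, n_groups the number of subsamples.
    """
    extra_params = k * (n_groups - 1)   # cost of splitting the sample
    df_resid = n_obs - k * n_groups     # degrees of freedom remaining
    numerator = (rss_pooled - rss_split) / extra_params
    denominator = rss_split / df_resid
    return numerator / denominator, extra_params, df_resid


# RSS values quoted in the text.
rss_pooled = 8.9160e11
rss_split = 1.2150e11 + 3.4895e11   # regular + occupational subsamples

f_stat, df1, df2 = chow_f(rss_pooled, rss_split, k=2, n_obs=74)

# Upper bound on the 1% critical value of F(2,70), taken from the text.
critical_upper_bound = 4.98

print(f"F({df1},{df2}) = {f_stat:.1f}")
print("Reject pooled regression at 1%:", f_stat > critical_upper_bound)
```

Passing the RSS from the regression with the intercept and slope dummies as `rss_split` gives the same statistic, which is the equivalence noted in the text.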