Functional Form (continued)
First, use economic theory to guide you. Think about the interpretation. Does it make more sense for x to affect y in percentage terms (use logs) or in absolute terms? Does it make more sense for the derivative of y with respect to x1 to vary with x1 (quadratic), to vary with x2 (interactions), or to be fixed?
Functional Form (continued)
We already know how to test joint exclusion restrictions to see whether higher-order terms or interactions belong in the model. It can be tedious to add and test extra terms, and we may find that a squared term matters when using logs would really be even better. A test of functional form is Ramsey's regression specification error test (RESET).
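As a rough illustration of the idea behind RESET, the sketch below fits a linear model to data that are actually quadratic, then tests whether powers of the fitted values add explanatory power. It assumes the common variant that adds the squared and cubed fitted values and uses an F-test; the data are synthetic and the helper names `ols_ssr` and `reset_f_stat` are hypothetical, not taken from the slides.

```python
import numpy as np

def ols_ssr(X, y):
    """Return the sum of squared residuals and the fitted values from OLS."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta
    resid = y - fitted
    return float(resid @ resid), fitted

def reset_f_stat(X, y):
    """F statistic for adding fitted^2 and fitted^3 to the original regression."""
    n, n_coefs = X.shape                      # n_coefs includes the intercept
    ssr_restricted, fitted = ols_ssr(X, y)
    X_aug = np.column_stack([X, fitted**2, fitted**3])
    ssr_unrestricted, _ = ols_ssr(X_aug, y)
    q = 2                                     # number of added terms
    df = n - n_coefs - q
    return ((ssr_restricted - ssr_unrestricted) / q) / (ssr_unrestricted / df)

# Synthetic data (assumption): the true relationship is quadratic in x.
rng = np.random.default_rng(0)
x = rng.uniform(1, 10, size=200)
y = 1.0 + 0.5 * x**2 + rng.normal(0, 1, size=200)

# Misspecified linear model: intercept and x only.
X = np.column_stack([np.ones_like(x), x])
f_stat = reset_f_stat(X, y)
print("RESET F statistic:", f_stat)
```

A large F statistic signals that the linear form is missing something, but it does not say which alternative form (logs, squares, interactions) is the right one.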
Proxy Variables (continued)
What do we need for this solution to give us consistent estimates of β1 and β2?
Chapter 1, 1.
a. Stochastic error term: a term added to a regression equation to introduce all of the variation in Y that cannot be explained by the included Xs.
b. Linear: an equation is linear if plotting the function in terms of X and Y generates a straight line.
c. Slope coefficient (β): shows the response of Y to a one-unit increase in X.
d. Multivariate regression model: a linear regression model that has more than one independent variable.
e. Expected value: the expression β0 + β1X is called the deterministic component of the regression equation; this deterministic component can also be thought of as the expected value of Y given X.
f. Residual: the difference between the actual value of the dependent variable and its estimated value.

Chapter 2, p. 43
a. Ordinary Least Squares is a regression estimation technique that calculates the βs so as to minimize the sum of the squared residuals.
b. A multivariate regression coefficient indicates the change in the dependent variable associated with a one-unit increase in the independent variable in question, holding the other independent variables constant.
c. The total sum of squares is the sum of the squared variations of Y around its mean, a measure of the amount of variation to be explained by the regression. The explained sum of squares is the amount of the squared deviation of Yi from its mean that is explained by the regression line. The residual sum of squares is the amount left unexplained, in an empirical sense, by the estimated regression equation.
d. The coefficient of determination is the ratio of the explained sum of squares to the total sum of squares.
e. Degrees of freedom is the excess of the number of observations (N) over the number of coefficients estimated (K + 1, including the intercept).
f. R^2 is the coefficient of determination.

Chapter 3, 1.
a. Review the literature and develop the theoretical model. Specify the model. Hypothesize the expected signs of the coefficients. Collect data. Estimate and evaluate the equation. Document the results.
b. Apply other researchers' models to my data set.
c. Any mistake in the specification of a model.
d. The excess of the number of observations over the number of coefficients to be estimated.

Chapter 4, 1.
a. (1) The regression model is linear. (2) The error term has a zero population mean. (3) All explanatory variables are uncorrelated with the error term. (4) Observations of the error term are uncorrelated with each other. (5) The error term has a constant variance. (6) No explanatory variable is a perfect linear function of any other explanatory variables. (7) The error term is normally distributed.
b. An error term satisfying Assumptions one through five.
c. A normal distribution with a mean equal to zero and a variance equal to one.
d. SE is the square root of the estimated variance of the estimated βs.
e. An estimator of β is unbiased if its sampling distribution has as its expected value the true value of β.
f. Best Linear Unbiased Estimator.
g. The probability distribution of the estimated β values across different samples.

Chapter 5, 1.
a. A statement of the values that the researcher does not expect.
b. A statement of the values that the researcher expects.
c. We reject a true null hypothesis.
d. Indicates the probability of observing an estimated t-value greater than the critical t-value if the null hypothesis were correct.
e. A test in which the alternative hypothesis has values on both sides of the null hypothesis.
f. When testing a hypothesis, you have to calculate a sample statistic and compare it with a critical value selected in advance.
g. A value that divides the acceptance region from the rejection region when testing a null hypothesis.
h. The ratio of the departure of an estimated parameter from its notional value to its standard error.
i. A range which contains the true value of an item a specified percentage of the time.
j. A p-value for a t-score is the probability of observing a t-score that big or bigger if the null hypothesis were true.

Chapter 6, 1.
a. An omitted variable is an important explanatory variable that has been left out of a regression equation.
b. An irrelevant variable is a variable included in an equation that doesn't belong there.
c. Specification bias is the bias caused by leaving a variable out of an equation.
d. Sequential specification search: see Section 6.4.2.
e. A specification error results from choosing the incorrect independent variables, the incorrect functional form, or the incorrect form of the stochastic error term.
f. The four valid criteria are those that help decide whether a given variable belongs in the equation.
g. Expected bias: see Section 6.1.3.

Chapter 7, 1.
a. The elasticity of Y with respect to X is the percentage change in the dependent variable caused by a 1 percent increase in the independent variable, holding the other variables in the equation constant.
b. In a double-log functional form, the natural log of Y is the dependent variable and the natural log of X is the independent variable.
c. The semilog functional form is a variant of the double-log equation in which some but not all of the variables are expressed in terms of their natural logs.
d. Polynomial functional forms express Y as a function of independent variables, some of which are raised to powers other than one.
e. The inverse functional form expresses Y as a function of the reciprocal of one or more of the independent variables.
f. A slope dummy is a dummy variable that is multiplied by an independent variable to allow the slope of the relationship between the dependent variable and the particular independent variable to change, depending on whether or not a particular condition is met.
g. The natural log of a number is its logarithm to the base of the mathematical constant e, where e is an irrational and transcendental number approximately equal to 2.718.
h. The omitted condition forms the basis against which the included conditions are compared.
i. An interaction term is an independent variable in a regression equation that is the product of two or more other independent variables.
j. A form that is linear in the variables should be used unless a specific hypothesis suggests otherwise.
k. An equation is linear in the coefficients only if the coefficients appear in their simplest form: they are not raised to any powers, are not multiplied or divided by other coefficients, and do not themselves include some sort of function.
Master's Econometrics 1-05: Functional Forms
Example: Housing Price
Suppose the relationship between the housing price y and pollution nox is log-log, and the relationship between y and rooms is log-linear. We have:

$\log(price) = \beta_0 + \beta_1 \log(nox) + \beta_2\, rooms + u$

In this case, each 1 percent change in pollution is associated with a $\hat\beta_1$ percent change in price. As we have shown, each additional room causes the price to increase by approximately $100\hat\beta_2$ percent.
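The two interpretations can be checked with a few lines of arithmetic. The coefficient values below are hypothetical, chosen only for illustration (the slide does not report estimates); the sketch also shows the exact percentage change implied by a log-linear coefficient, which differs from the 100 times coefficient approximation when the coefficient is not small.

```python
import math

# Hypothetical estimates, for illustration only (not from the slides).
beta1_hat = -0.7   # coefficient on log(nox): an elasticity
beta2_hat = 0.3    # coefficient on rooms: a semi-elasticity

# Log-log: a 1% increase in nox changes price by about beta1_hat percent.
pct_change_per_pct_nox = beta1_hat

# Log-linear: one more room changes price by roughly 100 * beta2_hat percent.
approx_pct_per_room = 100 * beta2_hat

# Exact change implied by the model: 100 * (exp(beta2_hat) - 1) percent.
exact_pct_per_room = 100 * (math.exp(beta2_hat) - 1)

print("approximate % change per room:", approx_pct_per_room)
print("exact % change per room:", exact_pct_per_room)
```

The approximation is close for small coefficients and drifts apart as the coefficient grows, which is worth keeping in mind when reading log-linear results.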
Example: Log Model
Let's see an example of heteroskedastic variance in the salary vs. catalog purchase dataset using EViews. In other words, the variance of the error term increases as x increases.
of work experience vs. salary
Holding other factors constant, does each extra year of experience lead to the same amount of increase in salary (in % or in $)?
Functional-coefficient regression models for nonlinear time series
This paper adapts the functional-coefficient modeling technique to analyze nonlinear time series data. The approach allows appreciable flexibility on the structure of fitted model without suffering from the "curse of dimensionality".

... stationary ... with $U_i$ taking values in $\mathbb{R}^k$ and $X_i$ ... strictly small. Let $E(Y_1^2)$ ... $U_i$ and $X_i$ consist of some lagged values of $Y_i$. The functional-coefficient regression model has the form

$m(u, x) = \sum_{j=1}^{p} a_j(u)\, x_j, \qquad (1.2)$

where $a_j(\cdot)$'s are measurable functions from $\mathbb{R}^k$ to $\mathbb{R}^1$ and $x = (x_1, \ldots, x_p)^T$ with $T$ denoting the transpose of a matrix or vector. The idea to model time series in such a form is not new; see, for example, Nicholls and Quinn (1982). In fact, many useful time series models may be viewed
Chapter 7 Specification Choosing A Functional Form
1. Section 7.2 presents alternative functional forms that are useful when specifying econometric models. Linear models are frequently too restrictive to properly fit the functional form suggested by the underlying theory.

The last column of Table 7.1 below shows the correct EViews specification for the alternative functional forms printed in UE, Table 7.1, p. 214. You can use the table as a guide, but you must realize that Y represents the dependent variable while X1 & X2 represent the only independent variables in all of the equations/specifications. Note that a constant (C) should be included in all models even if theory suggests otherwise (see UE, p. 201). You must have a workfile open in order to specify and estimate a regression model. Then, to specify a regression model in EViews, select Objects/New Object/Equation from the workfile menu and enter the appropriate EViews specification (see the last column of the table below) in the Equation Specification: window. [1]

Table 7.1: EViews Specification of Functional Forms

Section  Equation #  Fcn. Form    Equation specification               EViews specification
7.2.1    ----        Linear       Y = β0 + β1X1 + β2X2                 Y C X1 X2
7.2.2    7.3         Double-Log   lnY = β0 + β1lnX1 + β2lnX2           log(Y) C log(X1) log(X2)
7.2.3    7.7         Lin-Log      Y = β0 + β1lnX1 + β2X2               Y C log(X1) X2
7.2.3    7.9         Log-Lin      lnY = β0 + β1X1 + β2X2               log(Y) C X1 X2
7.2.4    7.10        Polynomial   Y = β0 + β1X1 + β2(X1^2) + β3X2      Y C X1 X1^2 X2
7.2.5    7.13        Inverse      Y = β0 + β1(1/X1) + β2X2             Y C 1/X1 X2
7.5      7.20        Dummy*       Y = β0 + β1X1 + β2D1                 Y C X1 D1
7.5      7.22        Dummy**      Y = β0 + β1X1 + β2D1 + β3D1X1        Y C X1 D1 D1*X1

* Intercept dummy variable. ** Intercept and slope dummy variables.

Calculating "Quasi-R2" in EViews (UE 7.3.1, footnote 5, p. 215):

The dependent variable must be in the same form when using R2 and adjusted R2 to compare the overall goodness of fit between two equations.
For example, it would not be appropriate to compare the R2 for a linear model with that of a double-log or a log-lin model. However, it would be appropriate to compare R2 for a linear model with that of a lin-log, a polynomial, or an inverse functional form model. Likewise, it would be appropriate to compare R2 for double-log and log-lin functional form models. The car acceleration data introduced in UE, Exercise 16, p. 234, will be used to demonstrate the process of calculating the quasi-R2. The steps below show how to compare the goodness of fit for models using S (the number of seconds it takes a car to accelerate from 0 to 60 miles per hour) as the dependent variable versus using the natural log of S as the dependent variable. In both models, the independent variables are the same as in the original model printed at the top of UE, p. 236.

Calculating "Quasi-R2" for a linear versus a log-lin model using EViews:

Step 1. Open the EViews workfile named Cars7.wk1.
Step 2. Select Objects/New Object/Equation on the workfile menu bar, enter S C T E P H in the Equation Specification: window, and click OK.
Step 3. Select Name on the equation menu bar, write linear in the Name to identify object: window, and click OK. Minimize the equation object named linear.
Step 4. Select Objects/New Object/Equation on the workfile menu bar, enter log(S) C T E P H in the Equation Specification: window (i.e., the log-lin functional form), and click OK.
Step 5. Select Name on the equation menu bar, write loglin in the Name to identify object: window, and click OK.
Step 6. Select Forecast on the equation menu bar, select S in the Forecast of: [2] window, enter SF in the Forecast name: window, uncheck the two boxes in the Output: window (the only objective here is to create a forecast series, not a forecast evaluation), and click OK. A new series named SF appears in the workfile window.

Steps 7, 8 & 9 calculate the quasi-R2 for this regression (UE 7.3.1, footnote 5, p. 215).

Step 7. Minimize the equation window, select Genr on the workfile menu bar, type numerator=(S-SF)^2 in the Enter equation: window, and click OK (this step generates the un-summed variable in the numerator of the quasi-R2 equation).
Step 8. Select Genr on the workfile menu bar, type denominator=(S-@mean(S))^2 in the Enter equation: window, and click OK (this step generates the un-summed variable in the denominator of the quasi-R2 equation).
Step 9. To calculate the quasi-R2, type the following equation in the command window and press Enter: scalar quasir2=1-(@sum(numerator)/@sum(denominator)). A new variable named quasir2 will appear in the workfile window. Double click on it and the value for the quasi-R2 will be displayed in the lower left of the screen (0.783958974).

The quasi-R2 calculated in Step 9 (i.e., 0.78) lies between the R2 from the linear model estimated in Step 2 (i.e., 0.71) and the R2 from the log-lin model estimated in Step 5 (i.e., 0.81).

[1] Alternately, select Quick/Estimate Equation from the main menu. If this method is used you must name the equation to save it. Select Name on the equation menu bar, enter the desired name in the Name to identify object: window, and click OK.
[2] The Forecast procedure in EViews gives you the option of forecasting the transformed dependent variable (i.e., LOG(S) in this case) or the original variable (i.e., S in this case). Select S, since the computation of quasi-R2 requires converting LOG(S) to S by taking the anti-log of the dependent variable (this can also be done by using the EViews command @exp(LOG(S))).

Coefficient restrictions tests using EViews (UE, Appendix 7.7):

The F-test can be used to test a wide range of hypotheses concerning regression coefficients.
For example, suppose that the claim was made that when a car has a manual transmission it increases its acceleration speed (i.e., decreases the number of seconds it takes to accelerate from 0 to 60 miles per hour) just as much as adding 100 horsepower to the car. Translating this into the language of UE, Equation 7.28, p. 235, this means that the absolute value of the coefficient on Ti is 100 times larger than the absolute value of the coefficient on Hi. Just looking at the size of the estimated coefficients, it appears that you can easily reject the hypothesis, because the absolute value of the coefficient on Ti is only about 41.5 times larger than the absolute value of the coefficient on Hi (divide the coefficient on Ti by the coefficient on Hi). However, these coefficients are just estimates.

Follow these steps to carry out an F-test for the null hypothesis that the absolute value of the coefficient on Ti is 100 times larger than the absolute value of the coefficient on Hi:

Step 1. Open the EViews workfile named Cars7.wk1.
Step 2. Select Objects/New Object/Equation on the workfile menu bar, enter S C T E P H in the Equation Specification: window, and click OK.
Step 3. Select Name on the equation menu bar, write EQ01 in the Name to identify object: window, and click OK.
Step 4. Select View/Coefficients Tests/Wald-Coefficient Restrictions … on the equation menu bar, enter -C(2)=-100*C(5) in the Coefficients separated by commas: window, and click OK to reveal the following output: [3]

Wald Test:
Equation: EQ01
Null Hypothesis: -C(2)=-100*C(5)
F-statistic    2.485049    Probability    0.124472
Chi-square     2.485049    Probability    0.114933

The null hypothesis is -C(2)=-100*C(5), since variable T is the second coefficient and variable H is the fifth coefficient in the EViews Estimation Output from Step 2. The F-statistic compares the residual sum of squares computed with and without the restrictions imposed. If the restrictions are valid, there should be little difference between the two residual sums of squares and the F-value should be small. Based on the Wald Test: results table, the null hypothesis cannot be rejected at the 5% level of significance. The calculated F-statistic of 2.49 is less than the critical F-value of 4.14. The critical F-value can be found in UE, Table B-2, p. 609 for 1 degree of freedom in the numerator and 33 (interpolate between 30 and 40) degrees of freedom in the denominator, or EViews can calculate its value. [4] The reported probability is the marginal significance level of the F-test. It supports this result: if the null hypothesis were true, an F-statistic this large or larger would occur about 12.4% of the time.

The Chi-square statistic is equal to the F-statistic times the number of restrictions under test. In this example, there is only one restriction, so the two test statistics are identical, with the p-values of both statistics indicating that we cannot reject the null hypothesis (that the absolute value of the coefficient on Ti is 100 times larger than the absolute value of the coefficient on Hi) at the 10% significance level. The 10% significance critical value for the χ2 test can be found in UE, Table B-8, p. 619 to be 2.71.

[3] The coefficients should be referred to as C(1), C(2), and so on (do not use series names). Multiple coefficient restrictions must be separated by commas and the restrictions should be expressed as equations involving estimated coefficients and constants.

The Chow test, alternately termed Chow's Breakpoint Test (UE, Appendix 7.7):

Chow's Breakpoint Test divides the data into two sub-samples. [5] It then estimates the same equation for each sub-sample separately, to see whether there are significant differences in the estimated equations.
A significant difference indicates a structural change in the relationship.

Follow these steps to apply the Chow breakpoint test, as described in UE, pp. 241-242, to determine whether there was a structural change in the demand for chicken in 1976:

Step 1. Open the EViews workfile named Chick6.wf1.
Step 2. Select Objects/New Object/Equation on the workfile menu bar, enter Y C PC PB YD in the Equation Specification: window, and click OK.
Step 3. Select Name on the equation menu bar, write EQ01 in the Name to identify object: window, and click OK.
Step 4. Select View/Stability Tests/Chow Breakpoint Test… on the equation menu bar, enter 1976 in the Enter one date (observation) for the Forecast Test or one or more dates for the Breakpoint Test: window, and click OK to reveal the following output:

Chow Breakpoint Test: 1976
F-statistic             4.542962    Probability    0.004498
Log likelihood ratio    17.98027    Probability    0.001245

EViews reports two test statistics for the Chow breakpoint test. The F-statistic is based on the comparison of the restricted and unrestricted sum of squared residuals. EViews calculates the F-statistic using the formula printed in UE, Equation 7.36, p. 242. In this case, the calculated F-statistic of 4.54 exceeds the critical F-value of 2.63 for the 5% level of significance, so the null hypothesis of no structural change can be rejected. The critical F-value can be found in UE, Table B-2, p. 609 for 4 degrees of freedom in the numerator and 36 (interpolate between 30 and 40) degrees of freedom in the denominator, or EViews can calculate its value. [6] The reported probability is the marginal significance level of the F-test. It supports this result: if the null hypothesis were true, an F-statistic this large or larger would occur less than 0.4498% of the time.

The log likelihood ratio statistic is based on the comparison of the restricted and unrestricted maximum of the log likelihood function. The LR test statistic has an asymptotic χ2 distribution with degrees of freedom equal to (m-1)*(k+1) under the null hypothesis of no structural change, where m is the number of sub-samples and k is the number of independent variables in the model (i.e., m = 2 in this case because one breakpoint is selected, and k = 3). The calculated value of the LR test statistic, 17.98, exceeds the critical values of 9.49 for the 5% level of significance and 13.28 for the 1% level of significance, so the null hypothesis of no structural change can be rejected. [7] The reported probability is the marginal significance level of the χ2 test. It supports this result: if the null hypothesis were true, an LR statistic this large or larger would occur less than 0.1245% of the time.

[4] To have EViews calculate the 5% critical F-value for this problem, type the following equation in the command window: =@qfdist(0.95,1,eq01.@regobs-eq01.@ncoefs), press Enter, and view the value on the status bar in the lower left of the screen. For the 10% critical F-value, type =@qfdist(0.90,1,eq01.@regobs-eq01.@ncoefs) in the command window, press Enter, and view the value on the status bar in the lower left of the screen.
[5] One major drawback of the breakpoint test is that each sub-sample requires at least as many observations as the number of estimated parameters. This may be a problem if, for example, you want to test for structural change between wartime and peacetime where there are only a few observations in the wartime sample.
[6] To have EViews calculate the 5% critical F-value for this problem, type the following equation in the command window: =@qfdist(0.95,eq01.@ncoef,eq01.@regobs-2*eq01.@ncoef), press Enter, and view the value on the status bar in the lower left of the screen.
[7] The critical value for the χ2 test can be found in UE, Table B-8, p. 619.
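The breakpoint F-statistic can also be reproduced by hand from three regressions. The following is a minimal Python sketch, not EViews code: the data are synthetic, and the helper names `ssr` and `chow_f_stat` are hypothetical. It assumes the standard form of the Chow statistic, with k + 1 degrees of freedom in the numerator and n - 2(k + 1) in the denominator, which gives 4 and 36 when there are 3 regressors and 44 observations.

```python
import numpy as np

def ssr(X, y):
    """Sum of squared residuals from an OLS regression of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def chow_f_stat(X, y, split):
    """Chow breakpoint F statistic for a single break at row index `split`."""
    n, p = X.shape                        # p = k + 1 (includes the intercept)
    ssr_pooled = ssr(X, y)                # restricted: one equation for all rows
    ssr_unrestricted = ssr(X[:split], y[:split]) + ssr(X[split:], y[split:])
    numerator = (ssr_pooled - ssr_unrestricted) / p
    denominator = ssr_unrestricted / (n - 2 * p)
    return numerator / denominator

# Synthetic data (assumption): 44 observations, 3 regressors, break after row 25.
rng = np.random.default_rng(0)
n, split = 44, 25
Z = rng.normal(size=(n, 3))
X = np.column_stack([np.ones(n), Z])
p = X.shape[1]
beta_before = np.array([1.0, 1.0, 1.0, 1.0])
beta_after = np.array([3.0, 2.0, 1.0, 1.0])   # intercept and first slope shift
y = np.where(np.arange(n) < split, X @ beta_before, X @ beta_after)
y = y + rng.normal(0, 0.1, size=n)

f_stat = chow_f_stat(X, y, split)
print("Chow F statistic:", f_stat, "with df", p, "and", n - 2 * p)
```

The statistic is then compared with the critical F-value for the same degrees of freedom, as in the EViews walkthrough.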
4. The unusually high capacity of specialists (doctors, theologians, philosophers, scientists, etc.)
In such a translation one is concerned with the dynamic relationship, that the relationship between receptor and message should be substantially the same as that which existed between the original receptors and the message.
The nature of the message
Messages differ primarily in the degree to which content or form is the dominant consideration. Of course, the content of a message can never be completely abstracted from the form, and form is nothing apart from content; but in some messages the content is of primary consideration, and in others the form must be given a higher priority.
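For example, the Laffer curve describing the relationship between tax revenue and the tax rate is a parabola: $s = a + b r + c r^2$
Let $X_1 = r$ and $X_2 = r^2$; the original equation then becomes $s = a + b X_1 + c X_2$, with $c < 0$.
$Y_i = B_2 X_i + u_i$. The no-intercept model differs from the ordinary model in that: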
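I. Double-log model
$Y_i = A X_i^{B_2}$
Y: gambling expenditure; X: personal disposable income (PDI). The equation above can be transformed into $\ln Y_i = \ln A + B_2 \ln X_i$. Model feature: nonlinear in the variables.
Such a model is called a double-log model (because both variables appear in log form) or a log-linear model (because the relationship between the logged variables is linear).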
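$Y = A L^{B_1} K^{B_2} e^{u}$
Elasticity of labor input + elasticity of capital input = returns-to-scale parameter
Increasing returns to scale ($B_1 + B_2 > 1$); decreasing returns to scale ($B_1 + B_2 < 1$); constant returns to scale ($B_1 + B_2 = 1$)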
IV. Semilog model
(1) Log-lin model: measuring the growth rate
Figure 5-11 Summary of functional forms.
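$\ln Y_i = B_1 + B_2 \ln X_{2i} + B_3 \ln X_{3i} + u_i$, where $B_2$ and $B_3$ are also called partial elasticity coefficients: each measures the partial elasticity of the dependent variable with respect to one explanatory variable, holding the other variables constant.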
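To see that the coefficients of a double-log regression are the partial elasticities, the sketch below generates data from a multiplicative model and recovers the exponents by running OLS on the logged variables. The parameter values and data are synthetic assumptions for illustration only.

```python
import numpy as np

# Hypothetical true parameters (assumption, not from the source).
b1, b2, b3 = 0.5, 0.8, -0.3

rng = np.random.default_rng(1)
n = 500
x2 = rng.uniform(1, 100, size=n)
x3 = rng.uniform(1, 100, size=n)
u = rng.normal(0, 0.05, size=n)

# Multiplicative model: Y = exp(b1) * X2^b2 * X3^b3 * exp(u)
y = np.exp(b1) * x2**b2 * x3**b3 * np.exp(u)

# Taking logs gives ln Y = b1 + b2 ln X2 + b3 ln X3 + u, which is linear in logs.
X = np.column_stack([np.ones(n), np.log(x2), np.log(x3)])
coef, *_ = np.linalg.lstsq(X, np.log(y), rcond=None)

print("estimated partial elasticities:", coef[1], coef[2])
```

The estimated slopes should sit close to 0.8 and -0.3: a 1 percent rise in X2 raises Y by about 0.8 percent, holding X3 fixed.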
Adding More Regressors to Reduce the Error Variance
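Given

$Var(\hat\beta_j) = \dfrac{\sigma^2}{SST_j\,(1 - R_j^2)}$

what is the effect of adding more variables to our model on $Var(\hat\beta_j)$? It tends to reduce the estimated error variance, because $\hat\sigma^2 = \sum \hat u_i^2 / (n-k-1) = SSR/(n-k-1)$ and SSR cannot rise when regressors are added. In turn, it might reduce $Var(\hat\beta_j)$ if the added x's are not correlated with $x_j$.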
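The variance formula can be verified numerically against the matrix expression for the OLS variance. The sketch below uses synthetic regressors and an assumed error variance; it computes the variance of the coefficient on x1 both ways, where R_1^2 comes from regressing x1 on the other regressors.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)       # deliberately correlated with x1
X = np.column_stack([np.ones(n), x1, x2])
sigma2 = 4.0                              # assumed error variance

# Matrix form: Var(beta_hat) = sigma^2 * (X'X)^{-1}
var_matrix = sigma2 * np.linalg.inv(X.T @ X)

# Scalar form for j = 1: sigma^2 / (SST_1 * (1 - R_1^2))
sst_1 = np.sum((x1 - x1.mean()) ** 2)
Z = np.column_stack([np.ones(n), x2])     # the other regressors
g, *_ = np.linalg.lstsq(Z, x1, rcond=None)
r2_1 = 1 - np.sum((x1 - Z @ g) ** 2) / sst_1
var_scalar = sigma2 / (sst_1 * (1 - r2_1))

print(var_matrix[1, 1], var_scalar)
```

The two numbers agree, which makes the trade-off concrete: a new regressor helps only if the drop in the error variance outweighs the rise in R_j^2 that it causes.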
Which one is a better model? Can we use R2 to choose between the two? The adjusted R2 ($\bar{R}^2$) can be used to determine which is the better model.
Choice of Models
Based on adjusted R2, the polynomial rather than the log model is the better model.
Selection of Regressors (cont.)
On the other hand, if our purpose is to see the effect of lot size and number of rooms on the housing price, including the assessed value makes no sense, for it takes away the impact of lot size and number of rooms on the housing price. Let's compare the two models in EViews.
Adjusted R2
Adding more variables to a model will not reduce R2. To compare two models with different sets of regressors (independent variables), the R2 adjusted by degrees of freedom, denoted $\bar{R}^2$, gives us a better means to compare the two. It penalizes a model with more regressors.
[Figure: two scatter plots, one against Sales and one against Log(Sales)]
Selection of Regressors
What we should include in or exclude from our model depends on the purpose of our study. For example, in the housing price model, if our purpose is to see whether the assessed value of the house for tax purposes is rational, i.e. whether the assessed value reflects the market value of the house, then showing that the inclusion of other factors makes no difference completes the test.
Standard Errors of the Intervals
First, suppose that we want an estimate of
The Mean Interval is a prediction based on the variability of the coefficient estimates only. The Prediction Interval also takes account of the variability of the error term. The standard errors of the two are different.
Two Types of Confidence Interval Predictions
There are two types of Confidence Interval predictions:
Mean Interval (an interval about the mean)
Prediction Interval (an interval about a particular value)
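The difference between the two standard errors can be made concrete. The sketch below uses the standard OLS expressions, with synthetic data and a hypothetical evaluation point `x0`: the mean interval uses the variance of the fitted value alone, while the prediction interval adds the estimated error variance.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50
x = rng.uniform(0, 10, size=n)
y = 2.0 + 1.5 * x + rng.normal(0, 1, size=n)   # synthetic data (assumption)
X = np.column_stack([np.ones(n), x])

beta = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta
s2 = resid @ resid / (n - X.shape[1])          # estimated error variance

x0 = np.array([1.0, 5.0])                      # hypothetical point: x = 5
y0_hat = x0 @ beta
h0 = x0 @ np.linalg.inv(X.T @ X) @ x0

se_mean = np.sqrt(s2 * h0)                     # variability of coefficients only
se_pred = np.sqrt(s2 * (1 + h0))               # adds variability of the error term

print("fitted value:", y0_hat)
print("se (mean interval):", se_mean, "se (prediction interval):", se_pred)
```

Either interval is then the fitted value plus or minus the appropriate t critical value times its standard error, so the prediction interval is always the wider of the two.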
R-squared 0.061329 Adjusted R-squared 0.03004
Mean dependent var 3.265625 S.D. dependent var 1.874079
Let's see why from the following graphs.
Example: Choosing Between Two Models
See if we can use the RDCHEM dataset to choose between:
$rdintens = \beta_0 + \beta_1 \log(sales) + u$
vs.
$rdintens = \beta_0 + \beta_1 sales + \beta_2 sales^2 + u$
Class 9 Functional Forms II
More on interaction terms; Polynomial (Quadratic); Adjusted R2; Prediction Intervals
Adjusted Model
$\bar{R}^2 = 1 - \dfrac{SSR/(n-k-1)}{TSS/(n-1)}$
or, equivalently,
$\bar{R}^2 = 1 - (1 - R^2)(n-1)/(n-k-1)$
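Both forms of the formula can be checked against the housing-price regression output reported in this document (88 observations, 3 regressors, R-squared 0.642965, sum of squared residuals 2.862563, S.D. of the dependent variable 0.303573). The function names are hypothetical, and TSS is backed out from the reported standard deviation, so small rounding differences are expected.

```python
def adjusted_r2_from_sums(ssr, tss, n, k):
    """Adjusted R-squared from the residual and total sums of squares."""
    return 1 - (ssr / (n - k - 1)) / (tss / (n - 1))

def adjusted_r2_from_r2(r2, n, k):
    """Adjusted R-squared from the ordinary R-squared."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

n, k = 88, 3                       # observations and regressors (excluding C)
r2 = 0.642965                      # reported R-squared
ssr = 2.862563                     # reported sum of squared residuals
tss = 0.303573 ** 2 * (n - 1)      # from the reported S.D. of the dependent variable

adj_a = adjusted_r2_from_r2(r2, n, k)
adj_b = adjusted_r2_from_sums(ssr, tss, n, k)
print(adj_a, adj_b)
```

Both values should be close to the adjusted R-squared of 0.630214 shown in that output.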
Example: Comparing Different Functional Forms with Adjusted R2
When choosing between two functional forms, for example:
Confidence Interval Predictions of Ŷ
Sometimes we are interested in more than just the expected value of Y:

$E(Y) = \hat{Y} = \hat\beta_0 + \hat\beta_1 X_1 + \hat\beta_2 X_2 + \dots + \hat\beta_k X_k$

We might want to know, with probability $(1-\alpha)$, that an interval around E(Y) contains the true value of the dependent variable. Just as we can compute confidence interval estimates of the coefficients, we can also compute a confidence interval estimate of the predicted value of the dependent variable.
If our purpose is to see the effect of lot size, rooms, and house size on the housing price, should we have included the assessed value in the model, even if adding it would increase the adjusted R-squared to .77?
$y = \beta_0 + \beta_1 \log(x) + u$
vs.
$y = \beta_0 + \beta_1 x + \beta_2 x^2 + u$
Neither the F-test nor the t-test will help us much in choosing between two non-nested models. Adjusted R2 comes to the rescue.
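A comparison of this kind can be sketched in a few lines. The data below are synthetic (generated from a quadratic relationship, an assumption made purely for illustration), and `adjusted_r2` is a hypothetical helper; both candidate models share the same dependent variable, which is what makes the adjusted R2 values comparable.

```python
import numpy as np

def adjusted_r2(X, y):
    """Adjusted R-squared for an OLS fit; X must include the intercept column."""
    n, p = X.shape                     # p = k + 1
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    ssr = resid @ resid
    tss = np.sum((y - y.mean()) ** 2)
    return 1 - (ssr / (n - p)) / (tss / (n - 1))

# Synthetic data (assumption): y follows an inverted-U shape in x.
rng = np.random.default_rng(4)
x = rng.uniform(1, 10, size=100)
y = 2 + 3 * x - 0.25 * x**2 + rng.normal(0, 0.5, size=100)

X_log = np.column_stack([np.ones_like(x), np.log(x)])      # y on log(x)
X_quad = np.column_stack([np.ones_like(x), x, x**2])       # y on x and x^2

adj_log = adjusted_r2(X_log, y)
adj_quad = adjusted_r2(X_quad, y)
print("log model:", adj_log, "quadratic model:", adj_quad)
```

Here the quadratic model wins even after paying the penalty for its extra regressor, because the log form cannot capture the downturn in y.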
No, that would have defeated our purpose. The assessed value has taken all these factors into account, so we would not be able to gauge the effects separately.

Dependent Variable: Log(PRICE)
Method: Least Squares
Sample: 1 88
Included observations: 88

Variable        Coefficient   Std. Error   t-Statistic   Prob.
C               -1.297041     0.651284     -1.991514     0.0497
Log(SQRFT)       0.700232     0.092865      7.540304     0.0000
BDRMS            0.036958     0.027531      1.342411     0.1831
Log(LOTSIZE)     0.167967     0.038281      4.387717     0.0000

R-squared            0.642965    Mean dependent var      5.633180
Adjusted R-squared   0.630214    S.D. dependent var      0.303573
S.E. of regression   0.184603    Akaike info criterion  -0.496833
Sum squared resid    2.862563    Schwarz criterion      -0.384227
Log likelihood       25.86066    F-statistic             50.42372
Durbin-Watson stat   2.088996    Prob(F-statistic)       0.000000