Beijing Institute of Technology, Mathematics Major: Applied Regression Analysis Final Exam Papers (MTH17095)
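Course No.: 07000237 Beijing Institute of Technology, Second Semester, Academic Year 2011-2012
Applied Regression Analysis Final Exam, Class of 2009, Paper A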
1.(35) Consider the following model: $y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i3} + \varepsilon_i$,
where y = labor force participation (%) by family heads of poor families, x1 = mean family income ($), x2 = mean family size,
x3 = unemployment rate (% of civilian labor force unemployed).
Two versions of the model were estimated as follows (the standard errors are in the brackets).
(A) $\hat{y}_i = -33.46 + 0.019 x_{i1} + 15.52 x_{i2} + 0.813 x_{i3}$
(48.78) (0.019) (9.46) (1.911)
$n = 15$, $SS_T = 5130.13$, $SS_{Res}(A) = 3716.98$
(B) $\hat{y}_i = -26.51 + 0.018 x_{i1} + 15.30 x_{i2}$
(44.37) (0.018) (9.12)
$SS_{Res}(B) = 3778.11$
(1) Interpret the coefficient of mean family income in model (B);
(2) Carry out a t-test of whether mean family size has a significant effect on labor force participation in model (A) ($\alpha = 0.05$);
(3) Carry out a partial F-test of whether the unemployment rate has a significant effect on labor force participation ($\alpha = 0.05$);
(4) What is the adjusted coefficient of determination (adjusted $R^2$) of model (A)?
(5) Test the significance of model (B) ($\alpha = 0.05$);
(6) Find a 95% confidence interval for the coefficient $\beta_1$ of $x_1$ in model (B);
(7) Interpret the confidence coefficient 95% in (6).
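The arithmetic behind parts (2) to (6) can be sketched from the summary figures alone. The following is a minimal sketch, not an official solution: the variable names are illustrative, and it assumes the usual degrees of freedom n - k - 1 for a model with k regressors plus an intercept.

```python
# Summary figures quoted in the problem statement.
n = 15
ss_t = 5130.13
ss_res_a, k_a = 3716.98, 3   # model (A): three regressors
ss_res_b, k_b = 3778.11, 2   # model (B): two regressors

# Critical values copied from the attached list of this paper.
t_025_11, t_025_12 = 2.201, 2.1788
f_05_1_11, f_05_2_12 = 4.8443, 3.8853

# (2) t statistic for mean family size in model (A): estimate / standard error.
t_size = 15.52 / 9.46

# (3) Partial F statistic for the unemployment rate (one regressor dropped in B).
ms_res_a = ss_res_a / (n - k_a - 1)
f_partial = (ss_res_b - ss_res_a) / ms_res_a

# (4) Adjusted R^2 of model (A).
adj_r2_a = 1 - ms_res_a / (ss_t / (n - 1))

# (5) Overall F statistic of model (B).
f_model_b = ((ss_t - ss_res_b) / k_b) / (ss_res_b / (n - k_b - 1))

# (6) 95% confidence interval for beta_1 in model (B).
half_width = t_025_12 * 0.018
ci_beta1 = (0.018 - half_width, 0.018 + half_width)

print(f"(2) t = {t_size:.4f}, critical value {t_025_11}")
print(f"(3) partial F = {f_partial:.4f}, critical value {f_05_1_11}")
print(f"(4) adjusted R^2 = {adj_r2_a:.4f}")
print(f"(5) F = {f_model_b:.4f}, critical value {f_05_2_12}")
print(f"(6) CI = ({ci_beta1[0]:.4f}, {ci_beta1[1]:.4f})")
```

Under these assumptions none of the three test statistics exceeds its tabulated critical value, and the interval in (6) contains zero.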
2. [VIF output for this problem not recovered]
x1 = national income (100 million yuan)
x2 = volume of consumption (100 million yuan)
x3 = volume of railway passengers (ten thousand persons)
x4 = length of civil aviation routes (ten thousand km)
x5 = number of inbound tourist arrivals (ten thousand persons)
y = volume of civil aviation passengers (ten thousand persons)
(1) What problem do the VIFs imply?
(2) Which regression coefficients may have the wrong sign?
(3) Discuss the reasons for the problem in (2).
3.(12) Consider the following model (n = 8): $y = \beta_0 + \beta_1 x + \beta_2 x^2 + \varepsilon$,
where y = body temperature of a pig (degrees Celsius), x = time elapsed since the pig was infected (hours).
(1) Test the significance of $x^2$ ($\alpha = 0.05$);
(2) Predict the body temperature at x = 80;
(3) If the observations of x lie in (8, 64), what is your suggestion about the prediction in (2)?
4.(18) $y = X\beta + \varepsilon$, $X: n \times p$, $\mathrm{rk}(X) = p$, $E(\varepsilon) = 0$, $\mathrm{Var}(\varepsilon) = \sigma^2 V$, $V > 0$.
(1) Find the GLSE for $\beta$;
(2) Find an unbiased estimator for $\sigma^2$.
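5.(20) Full model:
$y_i = \beta_1 x_{i1} + \beta_2 x_{i2} + \varepsilon_i$, $E(\varepsilon_i) = 0$, $\mathrm{cov}(\varepsilon_i, \varepsilon_j) = \begin{cases} \sigma^2, & i = j \\ 0, & i \neq j \end{cases}$, $i, j = 1, 2, \dots, n$
Subset model:
$y_i = \beta_1 x_{i1} + \varepsilon_i$, $E(\varepsilon_i) = 0$, $\mathrm{cov}(\varepsilon_i, \varepsilon_j) = \begin{cases} \sigma^2, & i = j \\ 0, & i \neq j \end{cases}$, $i, j = 1, 2, \dots, n$
(1) Under the subset model, calculate the OLSE $\hat{\beta}_1$ for $\beta_1$;
(2) Assuming the full model is true, calculate $E(\hat{\beta}_1)$ and $\mathrm{Var}(\hat{\beta}_1)$.
Attached list:
$t_{0.025}(11) = 2.201$, $t_{0.025}(12) = 2.1788$, $t_{0.025}(5) = 2.5706$, $F_{0.05}(1, 11) = 4.8443$, $F_{0.05}(2, 12) = 3.8853$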
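Course No.: MTH17095 Beijing Institute of Technology, Second Semester, Academic Year 2012-2013
Applied Regression Analysis Final Exam, Class of 2010, Paper A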
Attached list: $F_{0.05}(1, 22) = 4.30$, $F_{0.05}(1, 23) = 4.28$, $F_{0.04}(3, 22) = 3.418$,
$t_{0.025}(22) = 2.074$, $t_{0.025}(23) = 2.0687$
1.(28) Consider the following model: $\hat{y} = \beta_0 + \beta_1 x_1 + \beta_2 x_2$, n = 25, where y = delivery time (minutes), x1 = number of cases of product, x2 = distance walked by the route driver (feet).
Two versions of the model were estimated as follows (the standard errors are in the brackets).
(A) $\hat{y} = 2.341 + 1.616 x_1 + 0.014 x_2$
(1.097) (0.171) (0.004)
$SS_T = 5784.543$, $SS_{Res}(A) = 233.732$
(B) $\hat{y} = 3.321 + 2.176 x_1$
(1.371) (0.124)
$SS_{Res}(B) = 402.134$
(1) Interpret the coefficient of the number of cases of product in model (A);
(2) Carry out a t-test of whether the number of cases of product has a significant effect on delivery time in model (A) ($\alpha = 0.05$);
(3) Carry out a partial F-test of whether distance has a significant effect on delivery time ($\alpha = 0.05$);
(4) Test the significance of model (B) ($\alpha = 0.05$);
(5) Find a 95% confidence interval for the parameter $\beta_1$ in model (B);
(6) Find 90% Bonferroni confidence intervals for the parameters $\beta_0$ and $\beta_1$ in model (B);
(7) Explain the result in (6).
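As a rough check of the numbers involved in parts (2) to (6), the statistics can be computed directly from the quoted summary figures. This is a minimal sketch under stated assumptions rather than a model answer: the helper `interval` and all variable names are illustrative, and the Bonferroni step assumes two simultaneous intervals, each built at level 1 - 0.10/2, which is why the tabulated $t_{0.025}(23)$ is reused.

```python
# Summary figures quoted in the problem statement.
n = 25
ss_t = 5784.543
ss_res_a = 233.732   # model (A): x1 and x2
ss_res_b = 402.134   # model (B): x1 only

# Critical values copied from the attached list of this paper.
t_025_22, t_025_23 = 2.074, 2.0687
f_05_1_22, f_05_1_23 = 4.30, 4.28


def interval(estimate, std_err, t_crit):
    """Two-sided interval: estimate +/- t_crit * std_err."""
    half = t_crit * std_err
    return (estimate - half, estimate + half)


# (2) t statistic for the number of cases in model (A), 22 residual df.
t_cases = 1.616 / 0.171

# (3) Partial F statistic for distance, with (1, 22) degrees of freedom.
f_partial = (ss_res_b - ss_res_a) / (ss_res_a / (n - 3))

# (4) Overall F statistic of model (B), with (1, 23) degrees of freedom.
f_model_b = (ss_t - ss_res_b) / (ss_res_b / (n - 2))

# (5) 95% confidence interval for beta_1 in model (B).
ci_beta1 = interval(2.176, 0.124, t_025_23)

# (6) Bonferroni intervals with joint level at least 90%:
# each of the two intervals uses t_{0.10/(2*2)}(23) = t_{0.025}(23).
bonf_beta0 = interval(3.321, 1.371, t_025_23)
bonf_beta1 = interval(2.176, 0.124, t_025_23)

print(f"(2) t = {t_cases:.4f}, critical value {t_025_22}")
print(f"(3) partial F = {f_partial:.4f}, critical value {f_05_1_22}")
print(f"(4) F = {f_model_b:.3f}, critical value {f_05_1_23}")
print(f"(5) CI for beta_1: ({ci_beta1[0]:.4f}, {ci_beta1[1]:.4f})")
print(f"(6) beta_0: ({bonf_beta0[0]:.4f}, {bonf_beta0[1]:.4f}), "
      f"beta_1: ({bonf_beta1[0]:.4f}, {bonf_beta1[1]:.4f})")
```

With two parameters and a 90% joint level, the Bonferroni interval for $\beta_1$ coincides numerically with the individual 95% interval in (5); only the interpretation (joint versus individual coverage) differs.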
2.(18) Consider the following model: $\hat{y} = \beta_0 + \beta_1 x_1 + \beta_2 x_2$, n = 25, where y = delivery time (minutes), x1 = number of cases of product, x2 = distance walked by the route driver (feet).
(1) What are the horizontal scale and the vertical scale in the following partial regression plot? What does the plot indicate?
(2) It is reported that at point 9 the studentized residual is $r_9 = 3.2138$ and $h_{99} = 0.4983$, where $h_{ii}$ is the i-th diagonal element of the hat matrix H, and that Cook's distance is $D_9 = 3.418$. Interpret the results.
(3) The correlation coefficient between x1 and x2 is $r_{12} = 0.824$. What does the result imply? What are the sources of the problem?
3.(15) To study the relationship between the annual per capita expenditure on education and the annual per capita consumption expenditure, two models are used to fit the data, where y = annual per capita expenditure on education and x = annual per capita consumption expenditure.
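4.(21) Consider the simple linear regression model: $y = \beta_0 + \beta_1 x_1 + \varepsilon$,
with $E(\varepsilon) = 0$, $\mathrm{Var}(\varepsilon) = \sigma^2$, and $\varepsilon$ uncorrelated.
(1) Show that $E(MS_R) = \sigma^2 + \beta_1^2 S_{xx}$;
(2) Show that $E(MS_{Res}) = \sigma^2$.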
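5.(18) A linear regression model is written as follows: $y = \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \varepsilon$,
$E(\varepsilon) = 0$, $\mathrm{Var}(\varepsilon) = \sigma^2$. The data are shown in the following table:
(2) Calculate the OLSE $\hat{\beta}_1$ for $\beta_1$;
(3) Calculate $\mathrm{Var}(\hat{\beta}_1)$.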
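Course No.: MTH17095 Beijing Institute of Technology, Second Semester, Academic Year 2013-2014
Applied Regression Analysis Final Exam, Class of 2011, Paper * (class year inferred to be 2011; paper type unknown)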
Attached table: $F_{0.05}(5, 10) = 3.33$, $t_{0.025}(10) = 2.2281$
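1.(28 points) The regression equation for China's civil aviation passenger volume is (standard errors in brackets):
$\hat{y} = 450.9 + 0.354 x_1 - 0.561 x_2 - 0.0073 x_3 + 21.578 x_4 + 0.435 x_5$
(178.08) (0.085) (0.125) (0.002) (4.030) (0.052)
$n = 16$, $SST = 13843371.750$, $SSR = 13818876.769$
where: y = civil aviation passenger volume (ten thousand persons), x1 = national income (100 million yuan), x2 = consumption (100 million yuan),
x3 = railway passenger volume (ten thousand persons), x4 = civil aviation route length (ten thousand km), x5 = number of inbound tourist arrivals (ten thousand persons).
(1) Interpret the regression coefficient of civil aviation route length in the regression equation;
(2) Test the significance of the regression equation ($\alpha = 0.05$);
(3) Calculate the coefficient of determination of the regression equation and interpret it;
(4) Calculate the standard error of the regression and interpret the result;
(5) Carry out a t-test of whether the number of inbound tourist arrivals has a significant effect on civil aviation passenger volume;
(6) Construct a 95% confidence interval for the regression coefficient $\beta_4$ of $x_4$.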
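The quantities asked for in parts (2) to (6) follow from n, SST, SSR and the reported standard errors. The following is a minimal sketch with illustrative variable names, assuming five regressors plus an intercept, so that the residual degrees of freedom are 16 - 5 - 1 = 10; it is not an official solution.

```python
import math

# Figures quoted in the problem statement.
n, k = 16, 5
sst = 13843371.750
ssr = 13818876.769
sse = sst - ssr                     # residual sum of squares

# Critical values copied from the attached table of this paper.
f_05_5_10, t_025_10 = 3.33, 2.2281

mse = sse / (n - k - 1)

# (2) Overall F statistic with (5, 10) degrees of freedom.
f_stat = (ssr / k) / mse

# (3) Coefficient of determination.
r2 = ssr / sst

# (4) Standard error of the regression (same unit as y).
sigma_hat = math.sqrt(mse)

# (5) t statistic for inbound tourist arrivals (x5).
t_tourists = 0.435 / 0.052

# (6) 95% confidence interval for beta_4.
half_width = t_025_10 * 4.030
ci_beta4 = (21.578 - half_width, 21.578 + half_width)

print(f"(2) F = {f_stat:.2f}, critical value {f_05_5_10}")
print(f"(3) R^2 = {r2:.5f}")
print(f"(4) sigma_hat = {sigma_hat:.3f}")
print(f"(5) t = {t_tourists:.3f}, critical value {t_025_10}")
print(f"(6) CI = ({ci_beta4[0]:.4f}, {ci_beta4[1]:.4f})")
```

Both the F statistic and the t statistic come out far above the tabulated critical values, and the interval in (6) lies entirely above zero.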
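2.(15 points) The regression equation for China's civil aviation passenger volume is:
$\hat{y} = 450.9 + 0.354 x_1 - 0.561 x_2 - 0.0073 x_3 + 21.578 x_4 + 0.435 x_5$
where: y = civil aviation passenger volume (ten thousand persons), x1 = national income (100 million yuan), x2 = consumption (100 million yuan),
x3 = railway passenger volume (ten thousand persons), x4 = civil aviation route length (ten thousand km), x5 = number of inbound tourist arrivals (ten thousand persons).
[Collinearity diagnostics table not recovered. a. Dependent Variable: Y]
(1) State the definition of the condition number and interpret the condition number results for the civil aviation passenger volume model;
(2) Interpret the variance proportion results for the civil aviation passenger volume model;
(3) Using the civil aviation passenger volume model, explain the effects of multicollinearity on a regression model.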
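3.(16 points) Study the effect of cutting tool type on cutting tool life.
y is the life of the cutting tool, x1 is the lathe speed (revolutions per minute), and x2 is the type of cutting tool: x2 = 0 if the observation comes from tool type A, and x2 = 1 if the observation comes from tool type B.
Model A: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$
Model B: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \varepsilon$
(1) Write the regression equation of model (A);
(2) Interpret the regression coefficients of model (A). Does x2 have a significant effect on y in model (A)? ($\alpha = 0.05$)
(3) Discuss the difference between model (A) and model (B);
(4) Are the slopes of the two regression lines in model (B) equal? ($\alpha = 0.05$)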
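4.(13 points) Let $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$, $E(\varepsilon_i) = 0$, $\mathrm{cov}(\varepsilon_i, \varepsilon_j) = \begin{cases} \sigma^2, & i = j \\ 0, & i \neq j \end{cases}$, $i, j = 1, 2, \dots, n$.
Let $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$, where $\hat{\beta}_0, \hat{\beta}_1$ are the least squares estimators of $\beta_0, \beta_1$.
Prove that $\hat{\sigma}^2 = \frac{1}{n-2} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$ is an unbiased estimator of $\sigma^2$.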
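5.(20 points) Let the full model be a linear regression model with two regressors, written in matrix form as
$y = X\beta + \varepsilon$, $E(\varepsilon) = 0$, $\mathrm{Var}(\varepsilon) = \sigma^2 I_n$,
where
$y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}$, $X = \begin{pmatrix} x_{11} & x_{12} \\ x_{21} & x_{22} \\ \vdots & \vdots \\ x_{n1} & x_{n2} \end{pmatrix}$, $\beta = \begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix}$, $\varepsilon = \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{pmatrix}$.
Let the subset model be the linear regression model with one regressor:
$y_i = \beta_1 x_{i1} + \varepsilon_i$, $E(\varepsilon_i) = 0$, $\mathrm{cov}(\varepsilon_i, \varepsilon_j) = \begin{cases} \sigma^2, & i = j \\ 0, & i \neq j \end{cases}$, $i, j = 1, 2, \dots, n$.
Find:
(1) the least squares estimators $\hat{\beta}_1, \hat{\beta}_2$ of $\beta_1, \beta_2$ under the full model;
(2) the least squares estimator $\hat{\beta}_1$ of $\beta_1$ under the subset model;
(3) Prove that if the full model is correct, the least squares estimator $\hat{\beta}_1$ of the subset model coefficient $\beta_1$ is a biased estimator of the corresponding full model parameter $\beta_1$;
(4) Briefly describe the effects of variable selection on the estimation and prediction of the regression equation.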
6.(8 points) What is autocorrelation? Give examples to illustrate how autocorrelation arises.
Applied Regression Analysis Final Exam, Class of 2012, Paper B
Attached table: $F_{0.05}(1, 22) = 4.30$, $F_{0.05}(1, 23) = 4.28$, $t_{0.025}(23) = 2.0687$
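1.(30 points) Consider the following model: $\hat{y} = \beta_0 + \beta_1 x_1 + \beta_2 x_2$, n = 25, where y = delivery time (minutes), x1 = number of cases of product, x2 = delivery distance (feet).
The following two regression equations were obtained (standard errors in brackets).
(A) $\hat{y} = 2.341 + 1.616 x_1 + 0.014 x_2$
(1.097) (0.171) (0.004)
$SST = 5784.543$, $SSE(A) = 233.732$
(B) $\hat{y} = 3.321 + 2.176 x_1$
(1.371) (0.124)
$SSE(B) = 402.134$
(1) Interpret the regression coefficient of the number of cases of product in model (A);
(2) Carry out a partial F-test of whether delivery distance has a significant linear effect on delivery time ($\alpha = 0.05$);
(3) Calculate the multiple correlation coefficient of y with x1, x2 and interpret the result;
(4) Calculate the partial correlation coefficient of y and x2;
(5) Test the significance of model (B) ($\alpha = 0.05$);
(6) Construct a 95% confidence interval for the parameter $\beta_1$ in model (B).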
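Parts (3) and (4) can be sketched numerically from the three sums of squares. This is a minimal sketch with illustrative variable names; it assumes that the partial correlation in (4) is taken with x1 held fixed, so that its square equals the share of SSE(B) removed by adding x2, and that its sign matches the sign of the x2 coefficient in model (A).

```python
import math

# Sums of squares quoted in the problem statement.
sst = 5784.543
sse_a = 233.732   # model (A): x1 and x2
sse_b = 402.134   # model (B): x1 only

# (3) Multiple correlation coefficient of y with x1, x2: square root of R^2 of (A).
r2_a = 1 - sse_a / sst
multiple_r = math.sqrt(r2_a)

# (4) Partial correlation of y and x2 given x1 (assumed interpretation).
partial_r2 = (sse_b - sse_a) / sse_b
# Positive root chosen because the coefficient of x2 in model (A) is +0.014.
partial_r = math.sqrt(partial_r2)

print(f"R^2(A) = {r2_a:.4f}, multiple R = {multiple_r:.4f}")
print(f"partial r^2 = {partial_r2:.4f}, partial r = {partial_r:.4f}")
```

The same difference SSE(B) - SSE(A) also drives the partial F statistic in (2), which is why the two parts lead to consistent conclusions about x2.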
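2.(15 points) Answer the following three questions in the setting of the delivery model of Problem 1.
(1) In model (A), observation 9 has studentized residual $SRE_9 = 3.2138$, leverage $h_{99} = 0.4983$, and Cook's distance $D_9 = 3.418$. Interpret these results;
(2) Explain the relationship among the studentized residual, the leverage, and Cook's distance;
(3) If model (A) is correct but model (B) is used by mistake, what are the consequences for estimation and prediction?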
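3.(18 points) An economist wants to investigate the effect of education level on household savings. From a middle-income sample, 13 highly educated households and 14 households with medium or low education were surveyed at random. The dependent variable y is the increase in household savings over the previous year, the independent variable x1 is the total household income in the previous year, and the independent variable x2 indicates household education: x2 = 1 for highly educated households and x2 = 0 for low-educated households.
Model (A): $\hat{y} = -7976 + 3826 x_1 - 3700 x_2$
Model (B): $\hat{y} = -8763 + 4057 x_1 - 776 x_2 - 787 x_1 x_2$
(1) Interpret the regression coefficients in model (A);
(2) The mean annual increase in savings is 3009.31 yuan for the 13 highly educated households and 5059.36 yuan for the 14 low-educated households, which would suggest that highly educated households save on average 5059.36 - 3009.31 = 2050.05 yuan less per year than low-educated households. Compare this result with the result obtained from model (A);
(3) Discuss the difference between model (B) and model (A);
(4) In model (B), the significance test of the coefficient of $x_1 x_2$ has significance probability (sig) = 0.247. Interpret this result.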
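4.(10 points) In a social survey on public transport, one survey item is "Do you commute by bus or by bicycle?"
The dependent variable y = 1 indicates commuting mainly by bus, and y = 0 indicates commuting mainly by bicycle [...] indicates female.
(1) Write [...]
5.(20 points) Consider the following simple linear regression model:
$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$, $\varepsilon_i$ i.i.d. $N(0, \sigma^2)$, $i = 1, 2, \dots, n$, where $\beta_0$ is known.
(1) Find the least squares estimator of $\beta_1$;
(2) Find the variance of the least squares estimator of $\beta_1$;
(3) Find a confidence interval for $\beta_1$ with confidence level $1 - \alpha$.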
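6.(7 points) How should rejection of the null hypothesis in the significance test of a multiple linear regression equation be correctly understood?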
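Applied Regression Analysis Final Exam, Class of 2013, Paper B
Attached table: $F_{0.05}(5, 10) = 3.33$, $t_{0.025}(10) = 2.2281$, $\ln 7.33 = 1.99$, $\ln 11.5 = 2.44$, $\ln 17.99 = 2.89$
1.(30 points) The regression equation for China's civil aviation passenger volume is (standard errors in brackets):
$\hat{y} = 450.9 + 0.354 x_1 - 0.561 x_2 - 0.0073 x_3 + 21.578 x_4 + 0.435 x_5$
(178.078) (0.085) (0.125) (0.002) (4.030) (0.052)
$n = 16$, $SST = 13843371.750$, $SSR = 13818876.769$
where: y = civil aviation passenger volume (ten thousand persons), x1 = national income (100 million yuan), x2 = consumption (100 million yuan),
x3 = railway passenger volume (ten thousand persons), x4 = civil aviation route length (ten thousand km), x5 = number of inbound tourist arrivals (ten thousand persons).
(1) Interpret the regression coefficient of the number of inbound tourist arrivals in the regression equation;
(2) Test the significance of the regression equation ($\alpha = 0.05$);
(3) Calculate the coefficient of determination of the regression equation and interpret it;
(4) Carry out a t-test of whether railway passenger volume has a significant effect on civil aviation passenger volume;
(5) Construct a 95% confidence interval for the regression coefficient $\beta_4$ of $x_4$.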
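2.(15 points) The regression equation for China's civil aviation passenger volume is (VIF values in brackets):
$\hat{y} = 450.9 + 0.354 x_1 - 0.561 x_2 - 0.0073 x_3 + 21.578 x_4 + 0.435 x_5$
(1963) (1741) (3.171) (55.5) (25.2)
where: y = civil aviation passenger volume (ten thousand persons), x1 = national income (100 million yuan), x2 = consumption (100 million yuan),
x3 = railway passenger volume (ten thousand persons), x4 = civil aviation route length (ten thousand km), x5 = number of inbound tourist arrivals (ten thousand persons).
(1) State the definition of the variance inflation factor (VIF);
(2) Using the VIF results, analyze the problem present in the civil aviation passenger volume model;
(3) Discuss how to eliminate the problem present in the civil aviation passenger volume model.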
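3.(16 points) Consider the following model (n = 8): $y = \beta_0 + \beta_1 x + \beta_2 x^2 + \varepsilon$,
where y = body temperature of a pig (degrees Celsius), x = time since the pig was infected (hours).
(1) Write the regression equation and interpret the meaning of the regression coefficients;
(2) Test the significance of $x^2$ ($\alpha = 0.05$);
(3) Predict the body temperature of the pig at x = 80;
(4) If the observations of x lie in (8, 64), what is your suggestion about the prediction in (3)?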
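4.(15 points) An insurance company studies the relationship between the number of years a customer has been insured (x) and whether the customer renews the policy (y = 1 means renewal, y = 0 means no renewal). It is known that customers insured for one year renew with probability 88%, and customers insured for two years renew with probability 92%.
(1) Establish the logistic regression equation of y on x;
(2) Predict the renewal probability of customers insured for three years;
(3) Discuss the differences between the logistic regression model and the linear regression model.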
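Parts (1) and (2) reduce to solving two linear equations on the logit scale. The following is a minimal sketch, assuming the two stated probabilities lie exactly on a curve of the form logit(p) = beta0 + beta1 * x; the function and variable names are illustrative and this is not an official solution.

```python
import math


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1 - p))


def inv_logit(z):
    """Inverse of logit: maps log-odds back to a probability."""
    return 1 / (1 + math.exp(-z))


# Renewal probabilities given in the problem for x = 1 and x = 2 years.
p1, p2 = 0.88, 0.92

# Two points determine the line logit(p) = beta0 + beta1 * x.
beta1 = logit(p2) - logit(p1)
beta0 = logit(p1) - beta1

# (2) Predicted renewal probability after three years.
p3 = inv_logit(beta0 + 3 * beta1)

print(f"beta0 = {beta0:.4f}, beta1 = {beta1:.4f}")
print(f"predicted renewal probability at x = 3: {p3:.4f}")
```

Working by hand with the rounded logarithms in the attached table (1.99, 2.44, 2.89) gives nearly the same values: a slope of about 0.45, an intercept of about 1.54, and odds of about 17.99 at x = 3.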
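5.(15 points) Let $y_i = \beta_1 x_i + \varepsilon_i$, $E(\varepsilon_i) = 0$, $\mathrm{cov}(\varepsilon_i, \varepsilon_j) = \begin{cases} \sigma^2, & i = j \\ 0, & i \neq j \end{cases}$, $i, j = 1, 2, \dots, n$.
Write down an unbiased estimator of $\sigma^2$ and prove that it is unbiased.
6.(9 points) Give examples to illustrate how heteroscedasticity arises.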