Applied Regression Analysis: Multicollinearity
Removing x1
Coefficientsa
6.6
1) Removing variables based on multicollinearity
Coefficientsa
Model
Unstandardized Coefficients
Standardized Coefficients
t
Sig.
CollinearityStatistics
B
Std. Error
Beta
Tolerance
VIF
1
(Constant)
.04
.00
4
.006
29.476
.03
.06
.11
.59
.21
5
.001
70.426
.96
.09
.46
.09
.79
a. Dependent Variable: y
Removing x2
ANOVAa
Model          Sum of Squares  df  Mean Square  F  Sig.
1  Regression  690.551         3   230.184
Model
Unstandardized Coefficients
Standardized Coefficients
t
Sig.
CollinearityStatistics
B
Std. Error
Beta
Tolerance
VIF
1
(Constant)
102.934
12.403
8.299
.000
x1
-.227
Model  B      Std. Error  Beta   t       Sig.  Tolerance  VIF
x3     -.923  .459        -.604  -2.012  .063  .003       306.617
x4     .026   .017        .093   1.591   .132  .086       11.605
x5     .510   .078        2.815  6.527   .000  .002       632.896
x6     -.011  .008        -.028  -1.274  .222  .608       1.645
a. Dependent Variable: y
6
.001
77.546
.96
.48
.10
.10
.01
.72
a. Dependent Variable: y
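At this point all VIFs are below 10 (around 1.x), but the significance tests on the regression coefficients show that some variables are still not significant.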
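The two checks made in the sentence above can be sketched in a few lines of Python. The tolerance, VIF and Sig. values are copied from the x1 to x4 rows of the five-predictor Coefficients output in this report (VIFs 1.408, 1.116, 1.579, 1.414). The 0.05 significance level and the VIF cut-off of 10 are conventional choices assumed here, and the variable names are illustrative only.

```python
# Tolerance, VIF and Sig. as reported for x1..x4 in the Coefficients output.
rows = {
    "x1": {"tolerance": 0.710, "vif": 1.408, "sig": 0.011},
    "x2": {"tolerance": 0.896, "vif": 1.116, "sig": 0.378},
    "x3": {"tolerance": 0.634, "vif": 1.579, "sig": 0.000},
    "x4": {"tolerance": 0.707, "vif": 1.414, "sig": 0.706},
}

ALPHA = 0.05      # assumed significance level
VIF_LIMIT = 10.0  # conventional rule-of-thumb threshold (assumption)

# VIF is the reciprocal of tolerance; recompute it to check the printed values.
recomputed = {name: 1.0 / r["tolerance"] for name, r in rows.items()}

high_vif = [name for name, r in rows.items() if r["vif"] >= VIF_LIMIT]
not_significant = [name for name, r in rows.items() if r["sig"] > ALPHA]

for name, r in rows.items():
    print(f"{name}: reported VIF {r['vif']:.3f}, 1/tolerance {recomputed[name]:.3f}")
print("VIF >= 10:", high_vif)
print("not significant at 0.05:", not_significant)
```

Small differences between the reported VIF and 1/tolerance come from the tolerance being rounded to three decimals in the output.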
Removing x4
ANOVAa
Model
Sum of Squares
df
Mean Square
F
Sig.
1
Regression
695.147
.100
-.222
-2.273
.032
.661
1.513
Model  B       Std. Error  Beta   t       Sig.  Tolerance  VIF
x2     -.074   .055        -.116  -1.359  .187  .866       1.155
x3     -2.629  .385        -.685  -6.835  .000  .629       1.591
x4     -.022   .066        -.031  -.326   .747  .706       1.416
x5
-.370
.120
-.711
-3.084
Model
Dimension
Eigenvalue
Condition Index
Variance Proportions
(Constant)
x1
x2
x3
x4
x5
x6
1
1
6.950
1.000
.00
.00
.00
.00
.00
.00
.00
2
.019
19.291
.00
.15
.01
.03
.39
.00
.00
3
.015
.052
-.243
-2.414
.023
.720
1.389
a. Dependent Variable: y
CollinearityDiagnosticsa
Model
Dimension
Eigenvalue
Condition Index
Variance Proportions
(Constant)
x1
x2
x3
21.501
.00
.15
.24
.13
.03
.00
.00
4
.009
27.621
.01
.03
.18
.61
.19
.00
.00
5
.006
33.829
.00
.11
.44
.13
.36
.02
.01
6
.001
82.638
.80
.50
.10
.10
.02
.07
.01
7
.000
196.786
.19
.06
.02
.01
.01
Model          Sum of Squares  df  Mean Square  F       Sig.
1  Regression                                   38.643  .000b
   Residual    160.831         27  5.957
   Total       851.382         30
a. Dependent Variable: y
b. Predictors: (Constant), x5, x3, x1
Coefficientsa
(B, Std. Error: Unstandardized Coefficients; Beta: Standardized Coefficients; Tolerance, VIF: Collinearity Statistics)
Model          B        Std. Error  Beta   t       Sig.  Tolerance  VIF
1  (Constant)  115.662  11.226             10.303  .000
   x3          -2.772   .365        -.722  -7.597  .000  .781       1.280
x1
Standardized Coefficients
t
Sig.
CollinearityStatistics
B
Std. Error
Beta
Tolerance
VIF
1
(Constant)
-1252.832
1507.836
-.831
.419
x1
-.735
.163
-1.291
-4.524
.000
.004
276.969
x4
x5
1
1
5.953
1.000
.00
.00
.00
.00
.00
.00
2
.019
17.857
.00
.16
.01
.02
.39
.00
3
.014
20.503
.00
.15
.37
.10
.02
.01
4
.009
26.002
.01
.04
.07
.74
.27
.01
5
.005
35.652
.03
.17
.45
.03
.30
.26
Model Summary
Model  R      R Square  Adjusted R Square  Std. Error of the Estimate
1      .901a  .811      .790               2.440634
a. Predictors: (Constant), x5, x3, x1
At this point all regression coefficients are significant.
F = 38.643 with Sig. = 0.000, so the regression equation is significant.
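As a cross-check, the summary statistics of the three-predictor model follow from its ANOVA table (regression sum of squares 690.551 on 3 df, residual sum of squares 160.831 on 27 df, total 851.382 on 30 df). The fitted equation in the last line is pieced together from fragmented Coefficients output (constant 115.662, x3 at -2.772, x5 at -.131, and an unlabeled row at -.256 taken to be x1), so it should be read as a best reconstruction rather than a verified result.

```latex
\begin{align*}
F &= \frac{SSR/3}{SSE/27} = \frac{230.184}{5.957} \approx 38.64, \\
R^2 &= \frac{SSR}{SST} = \frac{690.551}{851.382} \approx 0.811, \\
R^2_{adj} &= 1 - \frac{SSE/27}{SST/30} = 1 - \frac{5.957}{28.379} \approx 0.790, \\
\hat{\sigma} &= \sqrt{5.957} \approx 2.441, \\
\hat{y} &= 115.662 - 0.256\,x_1 - 2.772\,x_3 - 0.131\,x_5 .
\end{align*}
```

These agree with the reported F, R Square, Adjusted R Square and Std. Error of the Estimate up to rounding.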
Standardized Coefficients
t
Sig.
CollinearityStatistics
B
Std. Error
Beta
Tolerance
VIF
1
(Constant)
111.718
10.235
10.915
.000
x3
-2.825
.358
-.736
-7.886
.000
.804
1.244
x1
.000
Model  B       Std. Error  Beta   t       Sig.  Tolerance  VIF
x1     -.285   .104        -.279  -2.753  .011  .710       1.408
x2     -.052   .058        -.081  -.898   .378  .896       1.116
x3     -2.704  .412        -.704  -6.561  .000  .634       1.579
x4     -.027   .071        -.039  -.382   .706  .707       1.414
x5
-.126
Coefficientsa
Model
Unstandardized Coefficients
Standardized Coefficients
t
Sig.
CollinearityStatistics
B
Std. Error
Beta
Tolerance
VIF
1
(Constant)
116.488
11.618
10.027
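(1) Analyze the multicollinearity of the data using the variance inflation factor (VIF) method;
(2) Analyze the multicollinearity of the data using the eigenvalue method;
(3) Is variable removal an appropriate way to eliminate the collinearity in this problem? If so, remove variables (give the regression equation and the main statistics);
III. Results and Analysis (including program output, data analysis, and interpretation)
(1) Analyze the multicollinearity of the data using the variance inflation factor (VIF) method;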
Coefficientsa
Model          Sum of Squares  df  Mean Square  F       Sig.
1  Regression  696.052         5   139.210      22.406  .000b
   Residual    155.329         25  6.213
   Total       851.382         30
a. Dependent Variable: y
b. Predictors: (Constant), x5, x2, x3, x1, x4
.91
.98
a. Dependent Variable: y
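Apart from the first eigenvalue (6.950), the eigenvalues are all very close to 0 and the largest condition index is 196.786, so the variables are judged to be severely multicollinear.
From the variance-proportions matrix, x5 and x6 may be collinear: on the last dimension their variance proportions are .91 and .98.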
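A minimal sketch of how eigenvalues, condition indices and variance proportions of this kind can be computed with NumPy. It assumes the common convention of scaling every column of the design matrix, including the constant, to unit length before decomposing it; that convention is consistent with the eigenvalues above summing to the number of columns, but it is not taken from the report. The data are synthetic and the function name is hypothetical.

```python
import numpy as np

def collinearity_diagnostics(X):
    """Eigenvalues, condition indices and variance proportions (intercept included)."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    A = np.column_stack([np.ones(n), X])      # add the constant column
    A = A / np.linalg.norm(A, axis=0)         # scale each column to unit length
    _, s, vt = np.linalg.svd(A, full_matrices=False)
    eigenvalues = s ** 2                      # eigenvalues of A'A, largest first
    condition_index = s[0] / s                # sqrt(lambda_max / lambda_k)
    phi = (vt.T / s) ** 2                     # phi[j, k] = v_jk^2 / lambda_k
    # Rows are dimensions, columns are coefficients; each column sums to 1.
    proportions = (phi / phi.sum(axis=1, keepdims=True)).T
    return eigenvalues, condition_index, proportions

# Synthetic example (not the report's data): x3 is almost a copy of x1.
rng = np.random.default_rng(0)
n = 31
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = x1 + 0.05 * rng.normal(size=n)
eig, ci, props = collinearity_diagnostics(np.column_stack([x1, x2, x3]))

print("eigenvalues:      ", np.round(eig, 3))
print("condition indices:", np.round(ci, 3))
print("variance proportions (rows = dimensions; columns = const, x1, x2, x3):")
print(np.round(props, 2))
```

In the printed matrix, a dimension with a large condition index on which two or more coefficients both have high variance proportions points to the variables involved in the near dependency, which is the reading applied to x5 and x6 above.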
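(3) Is variable removal an appropriate way to eliminate the collinearity in this problem? If so, remove variables (give the regression equation and the main statistics);
Removing x6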
ANOVAa
Model
Sum of Squares
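Lab Report
Course: Applied Regression Analysis    Lab No. 7    Date: 12.17
Class    Student ID    Name    Grade
Requirements:
Upload all data, result, and other files to be submitted, named as student ID + name.
I. Objectives
Learn how to detect and eliminate multicollinearity in data using SPSS.
Learn how to perform ridge regression analysis in SPSS.
II. Lab Content
1. In a study of oxygen-consumption capacity during training, we want to build a relationship that predicts oxygen-consumption capacity from training-test data, so that the expensive and cumbersome oxygen-consumption test does not have to be run. The dependent variable y is OXY (oxygen-consumption capacity); the independent variables are x1 (age), x2 (weight), x3 (RunTime, time to run 1.5 miles), x4 (RstPulse, resting pulse), x5 (RunPulse, pulse while running), and x6 (maximum pulse while running; the source labels it RunPulse as well, which appears to be a slip for MaxPulse). The data are in Problem 2 of "回归人大数据12_学生.xls". Use statistical software for the calculations.
Removing x5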
Coefficientsa
Model
Unstandardized Coefficients
Standardized Coefficients
t
Sig.
CollinearityStatistics
B
Std. Error
Beta
Tolerance
VIF
1
(Constant)
-2715.046
2829.351
-.256
.096
-.251
-2.664
.013
.790
1.267
x5
-.131
.051
-.252
-2.588
.015
.738
1.355
a. Dependent Variable: y
CollinearityDiagnosticsa
Model
Dimension
Eigenvalue
Condition Index
.005
.119
8.437
x6
.303
.136
.522
2.221
.036
.114
8.744
a. Dependent Variable: y
The VIFs of x5 and x6 are about 8 (8.437 and 8.744), while those of x1 to x4 are all about 1.5, so multicollinearity is suspected.
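For reference, the VIF of a predictor is 1 / (1 - R_j^2), where R_j^2 comes from regressing that predictor on all the others. The sketch below computes it with NumPy on synthetic data; the function name, the sample size of 31 and the data themselves are illustrative assumptions, not the report's data set.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (n_samples x n_features)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])   # intercept + remaining predictors
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
        out[j] = 1.0 / (1.0 - r2)                   # VIF_j = 1 / (1 - R_j^2)
    return out

# Synthetic example: x3 is nearly a copy of x1, x2 is unrelated.
rng = np.random.default_rng(0)
n = 31
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = x1 + 0.1 * rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

vifs = vif(X)
print(np.round(vifs, 2))
```

The same numbers can be read off the diagonal of the inverse correlation matrix of the predictors, which is a convenient way to double-check an implementation.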
(2) Analyze the multicollinearity of the data using the eigenvalue method;
CollinearityDiagnosticsa
-.087
.932
.037
27.177
Model  B      Std. Error  Beta   t      Sig.  Tolerance  VIF
x5     .671   .128        3.706  5.241  .000  .001       1860.726
x6     -.008  .008        -.020  -.928  .369  .574       1.743
a. Dependent Variable: y
Removing x2
Coefficientsa
Model
Unstandardized Coefficients
4
173.787
28.921
.000b
Residual
156.235
26
6.009
Total
851.382
30
a. Dependent Variable: y
b. Predictors: (Constant), x5, x2, x3, x1
Coefficientsa
Model
Unstandardized Coefficients
1348.338
2211.463
.610
.552
Model  B      Std. Error  Beta    t       Sig.  Tolerance  VIF
x1     -.641  .167        -1.125  -3.840  .002  .003       319.484
x2     -.317  .204        -1.306  -1.551  .143  .000       2636.564
x3     -.413  .548        -.270   -.752   .464  .002       479.288
x4
-.002
.024
-.007
                                               Variance Proportions
Model  Dimension  Eigenvalue  Condition Index  (Constant)  x3   x1   x5
1      1          3.978       1.000            .00         .00  .00  .00
       2          .012        18.340           .00         .50  .38  .01
       3          .009        20.800           .03         .42  .19  .10
       4          .001        60.601           .96         .08  .42  .90
a. Dependent Variable: y
Model
Dimension
Eigenvalue
Condition Index
Variance Proportions
(Constant)
x3
x1
x2
x5
1
1
4.967
1.000
.00
.00
.00
.00
.00
2
.014
18.529
.00
.03
.30
.29
.01
3
.011
20.838
.01
.83
.12
-.960
.352
Model  B      Std. Error  Beta   t      Sig.  Tolerance  VIF
x1     -.047  .235        -.083  -.202  .843  .006       160.513
x3     1.463  .526        .957   2.781  .013  .009       111.949
x4     .036   .031        .128   1.160  .263  .087       11.507
x6     .003   .015        .008   .206   .839  .649       1.540
a. Dependent Variable: y
-.276
.099
-.270
-2.783
.010
.748
1.338
Model  B      Std. Error  Beta   t       Sig.  Tolerance  VIF
x2     -.049  .056        -.077  -.875   .390  .908       1.102
x5     -.129  .051        -.249  -2.544  .017  .737       1.356
a. Dependent Variable: y
CollinearityDiagnosticsa