统计建模与R软件第六章习题
- 1、下载文档前请自行甄别文档内容的完整性,平台不提供额外的编辑、内容补充、找答案等附加服务。
- 2、"仅部分预览"的文档,不可在线预览部分如存在完整性等问题,可反馈申请退款(可完整预览的文档不适用该条件!)。
- 3、如文档侵犯您的权益,请联系客服反馈,我们会尽快为您处理(人工客服工作时间:9:00-18:30)。
第六章 1. a .
> x <- c(5.1, 3.5, 7.1, 6.2, 8.8, 7.8, 4.5, 5.6, 8.0, 6.4)
> y <- c(1907, 1287, 2700, 2373, 3260, 3000, 1947, 2273, 3113,2493) > plot(x,y)
456789
15002000
2500300
x
y
X 与Y 线性相关
b .
> x <- c(5.1, 3.5, 7.1, 6.2, 8.8, 7.8, 4.5, 5.6, 8.0, 6.4)
> y <- c(1907, 1287, 2700, 2373, 3260, 3000, 1947, 2273, 3113,2493) > lm.sol<-lm(y~1+x) > summary(lm.sol) Call:
lm(formula = y ~ 1 + x)
Residuals:
Min 1Q Median 3Q Max
-128.591 -70.978 -3.727 49.263 167.228
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 140.95 125.11 1.127 0.293
x 364.18 19.26 18.908 6.33e-08 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 96.42 on 8 degrees of freedom
Multiple R-Squared: 0.9781, Adjusted R-squared: 0.9754
F-statistic: 357.5 on 1 and 8 DF, p-value: 6.33e-08
回归方程为Y=140.95+364.18X,极为显著
d.
> new <- data.frame(x=7)
> lm.pred <- predict(lm.sol,new,interval="prediction",level=0.95)
> lm.pred
fit lwr upr
[1,] 2690.227 2454.971 2925.484
Y(7)= 2690.227, [2454.971,2925.484]
2.
> out<-data.frame(
+ x1 <- c(0.4,0.4,3.1,0.6,4.7,1.7,9.4,10.1,11.6,12.6,10.9,23.1,23.1,21.6,23.1,1.9,26.8,29.9), + x2 <- c(52,34,19,34,24,65,44,31,29,58,37,46,50,44,56,36,58,51),
+ x3 <- c(158,163,37,157,59,123,46,117,173,112,111,114,134,73,168,143,202,124),
+ y <- c(64,60,71,61,54,77,81,93,93,51,76,96,77,93,95,54,168,99)
+ )
> lm.sol<-lm(y~x1+x2+x3,data=out)
> summary(lm.sol)
Call:
lm(formula = y ~ x1 + x2 + x3, data = out)
Residuals:
Min 1Q Median 3Q Max
-27.575 -11.160 -2.799 11.574 48.808
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 44.9290 18.3408 2.450 0.02806 *
x1 1.8033 0.5290 3.409 0.00424 **
x2 -0.1337 0.4440 -0.301 0.76771
x3 0.1668 0.1141 1.462 0.16573
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 19.93 on 14 degrees of freedom
Multiple R-Squared: 0.551, Adjusted R-squared: 0.4547
F-statistic: 5.726 on 3 and 14 DF, p-value: 0.009004
回归方程为y=44.9290+1.8033x1-0.1337x2+0.1668x3
由计算结果可以得到,回归系数与回归方程的检验都是显著的
3.
a.
> x<-c(1,1,1,1,2,2,2,3,3,3,4,4,4,5,6,6,6,7,7,7,8,8,8,9,11,12,12,12)
> y<-c(0.6,1.6,0.5,1.2,2.0,1.3,2.5,2.2,2.4,1.2,3.5,4.1,5.1,5.7,3.4,9.7,8.6,4.0,5.5,10.5,17.5,13.4,4.5, + 30.4,12.4,13.4,26.2,7.4)
> lm.sol <- lm(y ~ 1+x)
> summary(lm.sol)
Call:
lm(formula = y ~ 1 + x)
Residuals:
Min 1Q Median 3Q Max
-9.84130 -2.33691 -0.02137 1.05921 17.83201
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.4519 1.8353 -0.791 0.436
x 1.5578 0.2807 5.549 7.93e-06 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 5.168 on 26 degrees of freedom
Multiple R-Squared: 0.5422, Adjusted R-squared: 0.5246
F-statistic: 30.8 on 1 and 26 DF, p-value: 7.931e-06
线性回归方程为y=-1.4519+1.5578x,并且未通过t检验和F检验
> plot(x,y)
> abline(-1.4519,1.5578)
>
24681012
0510********
x
y
c .
> x<-c(1,1,1,1,2,2,2,3,3,3,4,4,4,5,6,6,6,7,7,7,8,8,8,9,11,12,12,12)
> y<-c(0.6,1.6,0.5,1.2,2.0,1.3,2.5,2.2,2.4,1.2,3.5,4.1,5.1,5.7,3.4,9.7,8.6,4.0,5.5,10.5,17.5,13.4,4.5, + 30.4,12.4,13.4,26.2,7.4)
> y.res<-resid(lm.sol);y.fit<-predict(lm.sol) > plot(y.res~y.fit)
> y.rst<-rstandard(lm.sol) > plot(y.rst~y.fit) >
普通残差
05
1015
-10-505101
5
y.fit
y .r e s
标准化残差
05
1015
-2-1012
3
y.fit
y .r s t
d . (4)
> lm.new<-update(lm.data3,sqrt(.)~.);coef(lm.new) (Intercept) x 0.7665018 0.2913620
2
x 0.084890650.4466549x 0.58752220.2913620x
0.7665018 ++=+=∧
∧
y y
> plot(x,y)
> lines(x,y=0.5875222+0.08489065*x^2+0.4466549*x) > y.res<-resid(lm.new);y.fit<-predict(lm.new) > plot(y.res~y.fit)
> y.rst<-rstandard(lm.new)
> plot(y.rst~y.fit)
4.
> lm.sol<-lm(Y~X1+X2,data=toothpaste)
> summary(lm.sol)
Call:
lm(formula = Y ~ X1 + X2, data = toothpaste)
Residuals:
Min 1Q Median 3Q Max
-0.497785 -0.120312 -0.008672 0.110844 0.581059
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 4.4075 0.7223 6.102 1.62e-06 ***
X1 1.5883 0.2994 5.304 1.35e-05 ***
X2 0.5635 0.1191 4.733 6.25e-05 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.2383 on 27 degrees of freedom
Multiple R-squared: 0.886, Adjusted R-squared: 0.8776
F-statistic: 105 on 2 and 27 DF, p-value: 1.845e-13
> source("Reg_Diag.R");Reg_Diag(lm.sol)
residual s1 standard s2 student s3 hat_matrix s4 DFFITS
1 -0.047231639 -0.21248023 -0.20868284 0.13012303 -0.08071140
2 -0.09807022
3 -0.42151698 -0.41500522 0.04704358 -0.09220771
3 0.07428862
4 0.33492116 0.3293452
5 0.13385955 0.12947381
4 -0.006645926 -0.03003380 -0.02947287 0.13797634 -0.01179139
5 0.581059204 * 2.55701395 * 2.88236603 * 0.09091719 0.91152886
6 -0.107785364 -0.46031439 -0.45349258 0.03475104 -0.08604674
7 0.300758350 1.31532456 1.33418993 0.07955382 0.39223739
8 0.424810360 2.05723960 * 2.19842345 * 0.24932956 * 1.26699112
9 -0.027532493 -0.12804079 -0.12568545 0.18600211 -0.06008033
10 -0.026629932 -0.11546376 -0.11333335 0.06356511 -0.02952761
11 -0.497785364 -2.12587089 * -2.28622467 * 0.03475104 -0.43379359
12 0.061913782 0.26540305 0.26078220 0.04194215 0.05456415
13 0.112816344 0.48968055 0.48267493 0.06556908 0.12785881
14 -0.150249714 -0.65395500 -0.64687388 0.07069034 -0.17841010
15 0.104927089 0.45112292 0.44436784 0.04761107 0.09935493
16 0.154341375 0.66319490 0.65616401 0.04652075 0.14493729
17 0.057639541 0.25360011 0.24915642 0.09056633 0.07862668
18 -0.146012230 -0.64156026 -0.63442169 0.08813106 -0.19723152
19 -0.124487198 -0.53776150 -0.53055795 0.05659354 -0.12994717
20 -0.040713930 -0.18584177 -0.18248454 0.15505538 -0.07817275
21 -0.199843357 -0.88485598 -0.88118586 0.10202733 -0.29702583
22 -0.359558498 -1.59386063 -1.64328265 0.10408399 -0.56010648
23 -0.387785364 -1.65609855 -1.71455446 0.03475104 -0.32532355
24 -0.010697936 -0.04979066 -0.04886215 0.18729463 -0.02345680
25 -0.283315771 -1.25181018 -1.26568782 0.09823492 -0.41774652
26 0.017855655 0.08230563 0.08077720 0.17144488 0.03674433
27 0.279286070 1.27482211 1.29043074 0.15505538 0.55279486
28 0.022483501 0.09864705 0.09682047 0.08548985 0.02960262
29 0.174895100 0.76513625 0.75910825 0.08017207 0.22411038
30 0.147269942 0.66281470 0.65578162 0.13089383 0.25449685
s5 cooks_distance s6 COVRATIO s7
1 2.251190e-03 1.2809524
2 2.923723e-0
3 1.1521157
3 5.778631e-03 1.2769056
4 4.812656e-0
5 1.2989982
5 * 2.179654e-01 0.5361673 *
6 2.542824e-03 1.1330964
7 4.984335e-02 0.9974548
8 * 4.685683e-01 * 0.8945237
9 1.248734e-03 1.3732722
10 3.016556e-04 1.1941261
11 5.423517e-02 0.6696819
12 1.027897e-03 1.1597812
13 5.608624e-03 1.1668135
14 1.084362e-02 1.1487064
15 3.391268e-03 1.1494743
16 7.153137e-03 1.1180504
17 2.134879e-03 1.2226245
18 1.326021e-02 1.1728006
19 5.782640e-03 1.1493238
20 2.112633e-03 1.3203083
21 2.965359e-02 1.1417402
22 9.837757e-02 0.9293112
23 3.291392e-02 0.8413370
24 1.904437e-04 1.3775853
25 5.690208e-02 1.0379534
26 4.672410e-04 1.3505881
27 9.941147e-02 1.1001732
28 3.032306e-04 1.2232439
29 1.700877e-02 1.1399973
30 2.205512e-02 1.2266609
> toothpaste<-data.frame(
+ X1=c(-0.05, 0.25,0.60,0,0.20, 0.15,-0.15, 0.15,
+ 0.10,0.40,0.45,0.35,0.30, 0.50,0.50, 0.40,-0.05,
+ -0.05,-0.10,0.20,0.10,0.50,0.60,-0.05,0, 0.05, 0.55), + X2=c( 5.50,6.75,7.25,5.50,6.50,6.75,5.25,6.00,
+ 6.25,7.00,6.90,6.80,6.80,7.10,7.00,6.80,6.50,
+ 6.25,6.00,6.50,7.00,6.80,6.80,6.50,5.75,5.80,6.80), + Y =c( 7.38,8.51,9.52,7.50,8.28,8.75,7.10,8.00,
+ 8.15,9.10,8.86,8.90,8.87,9.26,9.00,8.75,7.95,
+ 7.65,7.27,8.00,8.50,8.75,9.21,8.27,7.67,7.93,9.26) + )
>
> lm.sol<-lm(Y~X1+X2,data=toothpaste)
> summary(lm.sol)
Call:
lm(formula = Y ~ X1 + X2, data = toothpaste)
Residuals:
Min 1Q Median 3Q Max
-0.37130 -0.10114 0.03066 0.10016 0.30162
Coefficients:
Estimate Std. Error t value Pr(>|t|) (Intercept) 4.0759 0.6267 6.504 1.00e-06 ***
X1 1.5276 0.2354 6.489 1.04e-06 ***
X2 0.6138 0.1027 5.974 3.63e-06 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.1767 on 24 degrees of freedom Multiple R-squared: 0.9378, Adjusted R-squared: 0.9327
F-statistic: 181 on 2 and 24 DF, p-value: 3.33e-15
5.
XX<-cor(cement[1:4])
kappa(XX,exact=TRUE)
[1] 1376.881
> eigen(XX)
$values
[1] 2.235704035 1.576066070 0.186606149 0.001623746
$vectors
[,1] [,2] [,3] [,4]
[1,] -0.4759552 0.5089794 0.6755002 0.2410522
[2,] -0.5638702 -0.4139315 -0.3144204 0.6417561
[3,] 0.3940665 -0.6049691 0.6376911 0.2684661
[4,] 0.5479312 0.4512351 -0.1954210 0.6767340
删去了X3,X4
> cement<-data.frame(
+ X1=c( 7, 1, 11, 11, 7, 11, 3, 1, 2, 21, 1, 11, 10),
+ X2=c(26, 29, 56, 31, 52, 55, 71, 31, 54, 47, 40, 66, 68),
+ Y =c(78.5, 74.3, 104.3, 87.6, 95.9, 109.2, 102.7, 72.5, + 93.1,115.9, 83.8, 113.3, 109.4)
+ )
> XX<-cor(cement[1:2])
> kappa(XX,exact=TRUE)
[1] 1.592620
复共线性消失了。