Simple Linear Regression with R: Lab Report
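Mathematical Statistics Computer Lab Report
Lab topic: simple linear regression with R
Lab objectives:
1. Deepen understanding of the basic idea of hypothesis testing, and learn to use tests to carry out statistical inference.
2. Learn how to carry out hypothesis tests in R.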
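Basic theory and method of simple linear regression:
Basic theory: Let Y be the dependent variable to be predicted and X a single independent variable that influences it; Y changes in step with increases or decreases in X.
Simple linear regression uses a sample of observations (Xi, Yi), i = 1, 2, ..., n, to find the regression line Y = a + b*X   (1)
Method: For each Xi, the regression line gives an estimate Ŷi of the dependent variable.
The error between the observed value Yi and the estimate Ŷi is written ei = Yi - Ŷi.
The smaller the n errors are overall, the better the fitted line reflects the average linear relationship between the two variables. Positive and negative errors cancel in a plain sum, so their overall size is measured through the squared errors.
Regression analysis therefore chooses the line that minimizes the mean squared deviation, which is the least squares method. Substituting the resulting a and b into equation (1) gives the fitted line Ŷi = a + b*Xi.
Then, for any given Xi, Ŷi can be used as the predicted value of the dependent variable Yi.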
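The least squares step described above can be sketched in R. This is a minimal illustration, not the report's own computation: the five (x, y) pairs are made-up values chosen only for demonstration, and `lm()` is used to cross-check the closed-form estimates of a and b.

```r
# Hypothetical illustrative data (NOT from the report)
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)

# Closed-form least squares estimates for Y = a + b*X
b <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
a <- mean(y) - b * mean(x)

# The same fit with R's built-in linear model function
fit <- lm(y ~ x)
coef(fit)            # intercept and slope; should agree with a and b

# Residuals e_i = Y_i - (a + b*X_i); with an intercept they sum to about zero
e <- y - (a + b * x)
sum(e)
sum(e^2)             # the quantity that least squares minimizes

# Predicted value of Y for a new X
y_new <- predict(fit, newdata = data.frame(x = 6))
y_new
```

The plain sum of residuals is essentially zero for any least squares line with an intercept, which is why the fit is judged by the sum of squared residuals instead.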
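(I)
Example and data:
Two experimenters, A and B, measured the same indicator in the same experiment; their results are the vectors x and y entered in the procedure below.
Question: is there a significant difference between the measurements of A and B? Use significance level α = 0.05.
Lab procedure: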
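(1) State the hypotheses: H0: u1 - u2 = 0; H1: u1 - u2 < 0
(2) Degrees of freedom: n1 + n2 - 2 = 14; significance level α = 0.05
(3) Compute the sample means, the sample variances, the pooled variance, and the observed value of the test statistic:
alpha<-0.05;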
n1<-8;
n2<-8;
x<-c(4.3,3.2,3.8,3.5,3.5,4.8,3.3,3.9);
y<-c(3.7,4.1,3.8,3.8,4.6,3.9,2.8,4.4);
var1<-var(x);
xbar<-mean(x);
var2<-var(y);
ybar<-mean(y);
Sw2<-((n1-1)*var1+(n2-1)*var2)/(n1+n2-2)
t<-(xbar-ybar)/(sqrt(Sw2)*sqrt(1/n1+1/n2));
tvalue<-qt(alpha,n1+n2-2);
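(4) Compute the critical value: tvalue<-qt(alpha,n1+n2-2)
(5) Compare the critical value with the observed value of the statistic and draw the statistical inference.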
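Steps (3) to (5) can be wrapped into one small function. The sketch below is only an illustration: `pooled_t_less` is a hypothetical helper name, not something the report or base R defines, and it assumes equal population variances and the left-tailed alternative H1: u1 - u2 < 0 stated in step (1).

```r
# Hypothetical helper (name chosen for illustration only):
# left-tailed pooled-variance two-sample t test
pooled_t_less <- function(x, y, alpha = 0.05) {
  n1 <- length(x)
  n2 <- length(y)
  df <- n1 + n2 - 2
  # Pooled variance Sw^2
  sw2 <- ((n1 - 1) * var(x) + (n2 - 1) * var(y)) / df
  # Observed value of the test statistic
  t_obs <- (mean(x) - mean(y)) / sqrt(sw2 * (1 / n1 + 1 / n2))
  # Left-tail critical value
  t_crit <- qt(alpha, df)
  # Reject H0 only when the statistic falls below the critical value
  list(t_obs = t_obs, t_crit = t_crit, df = df, reject_H0 = t_obs < t_crit)
}

x <- c(4.3, 3.2, 3.8, 3.5, 3.5, 4.8, 3.3, 3.9)
y <- c(3.7, 4.1, 3.8, 3.8, 4.6, 3.9, 2.8, 4.4)
res <- pooled_t_less(x, y)
res
```

Returning the statistic, the critical value, and the decision together keeps the comparison in step (5) explicit.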
Results and analysis:
> alpha<-0.05;
> n1<-8;
> n2<-8;
> x<-c(4.3,3.2,3.8,3.5,3.5,4.8,3.3,3.9);
> y<-c(3.7,4.1,3.8,3.8,4.6,3.9,2.8,4.4);
> var1<-var(x);
> xbar<-mean(x);
> var2<-var(y);
> ybar<-mean(y);
> Sw2<-((n1-1)*var1+(n2-1)*var2)/(n1+n2-2)
> t<-(xbar-ybar)/(sqrt(Sw2)*sqrt(1/n1+1/n2));
> var1
[1] 0.2926786
> xbar
[1] 3.7875
> var2
[1] 0.2926786
> ybar
[1] 3.8875
> Sw2
[1] 0.2926786
> t
[1] -0.3696873
> tvalue<-qt(alpha,n1+n2-2);
> tvalue
[1] -1.76131
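Analysis: since t = -0.3696873 > tvalue = -1.76131, the statistic does not fall in the rejection region, so H0 is not rejected; that is, there is no significant difference between the measurements of A and B.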
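As a cross-check, the same comparison can be run with R's built-in `t.test()`. This is a sketch under the report's assumption of equal variances (`var.equal = TRUE`). It shows the left-tailed alternative used in the steps above and, for comparison, the two-sided alternative that is the more common choice when the question is whether there is any difference at all.

```r
x <- c(4.3, 3.2, 3.8, 3.5, 3.5, 4.8, 3.3, 3.9)
y <- c(3.7, 4.1, 3.8, 3.8, 4.6, 3.9, 2.8, 4.4)

# Left-tailed pooled-variance test, matching H1: u1 - u2 < 0
res_less <- t.test(x, y, alternative = "less", var.equal = TRUE)
res_less$statistic   # observed t
res_less$parameter   # degrees of freedom
res_less$p.value     # compare with alpha = 0.05

# Two-sided version, matching H1: u1 - u2 != 0
res_two <- t.test(x, y, alternative = "two.sided", var.equal = TRUE)
res_two$p.value
```

A p-value above 0.05 leads to the same decision as the critical-value comparison: H0 is not rejected.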
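(II)
Example and data:
The transverse elongation of a certain type of cellophane is required to be no lower than 65% and is assumed to follow a normal distribution. 100 measurements were taken on a batch of this type; they are entered as the vector x in step (3) below.
Lab procedure:
(1) State the hypotheses: H0: u = 65; H1: u < 65.
(2) Degrees of freedom: n - 1 = 100 - 1 = 99; significance level α = 0.05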
(3) Enter the data:
x<-c(35.5,35.5,35.5,35.5,35.5,35.5,35.5,
37.5,37.5,37.5,37.5,37.5,37.5,37.5,37.5,
39.5,39.5,39.5,39.5,39.5,39.5,39.5,39.5,39.5,39.5,39.5,
41.5,41.5,41.5,41.5,41.5,41.5,41.5,41.5,41.5,
43.5,43.5,43.5,43.5,43.5,43.5,43.5,43.5,43.5,
45.5,45.5,45.5,45.5,45.5,45.5,45.5,45.5,45.5,45.5,45.5,45.5,
47.5,47.5,47.5,47.5,47.5,47.5,47.5,47.5,47.5,47.5,47.5,47.5,47.5,47.5,47.5,47.5,47.5,
49.5,49.5,49.5,49.5,49.5,49.5,49.5,49.5,49.5,49.5,49.5,49.5,49.5,49.5,
51.5,51.5,51.5,51.5,51.5,53.5,53.5,53.5,55.5,55.5,59.5,59.5,63.5)
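(4) Use R to compute the observed value of the test statistic and the critical value.
(5) Compare the critical value with the observed value of the statistic and draw the inference.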
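The 100 measurements take only 13 distinct values, so a more compact way to enter them is with `rep()`. The sketch below is an alternative to typing the full vector; the names `values` and `counts` are introduced here for illustration, and the counts were tallied from the data vector in step (3).

```r
# Distinct measured values and how often each occurs in the data of step (3)
values <- c(35.5, 37.5, 39.5, 41.5, 43.5, 45.5, 47.5,
            49.5, 51.5, 53.5, 55.5, 59.5, 63.5)
counts <- c(7, 8, 11, 9, 9, 12, 17, 14, 5, 3, 2, 2, 1)

# Expand into the full sample
x <- rep(values, times = counts)

length(x)   # sample size; should be 100
mean(x)     # sample mean
sd(x)       # sample standard deviation
table(x)    # frequency table, to compare against the counts above
```

Checking `length(x)` and `table(x)` right after data entry catches typing mistakes before any test statistic is computed.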
Results and analysis:
The computation is as follows:
alpha<-0.05;
n<-100;
x<-c(35.5,35.5,35.5,35.5,35.5,35.5,35.5,
37.5,37.5,37.5,37.5,37.5,37.5,37.5,37.5,
39.5,39.5,39.5,39.5,39.5,39.5,39.5,39.5,39.5,39.5,39.5,
41.5,41.5,41.5,41.5,41.5,41.5,41.5,41.5,41.5,
43.5,43.5,43.5,43.5,43.5,43.5,43.5,43.5,43.5,
45.5,45.5,45.5,45.5,45.5,45.5,45.5,45.5,45.5,45.5,45.5,45.5,
47.5,47.5,47.5,47.5,47.5,47.5,47.5,47.5,47.5,47.5,47.5,47.5,47.5,47.5,47.5,47.5,47.5,
49.5,49.5,49.5,49.5,49.5,49.5,49.5,49.5,49.5,49.5,49.5,49.5,49.5,49.5,
51.5,51.5,51.5,51.5,51.5,53.5,53.5,53.5,55.5,55.5,59.5,59.5,63.5)
sd1<-sd(x);
xbar<-mean(x);
t<-(xbar-65.0)/(sd1/sqrt(n));
tvalue<-qt(alpha,n-1);
sd1
[1] 5.815896
xbar
[1] 45.06
t
[1] -34.28534
tvalue
[1] -1.660391
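Inference: since t = -34.28534 < tvalue = -1.660391, the statistic falls in the rejection region and the null hypothesis is rejected.
That is, the transverse elongation of this batch of cellophane does not meet the requirement.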
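The manual computation can be cross-checked with the built-in one-sample `t.test()`. This is a sketch, not the report's own procedure: the data are rebuilt with `rep()` from the value counts in the sample (the names `values` and `counts` are introduced here for illustration), and the alternative is the left-tailed H1: u < 65.

```r
# Rebuild the 100 measurements from their distinct values and counts
values <- c(35.5, 37.5, 39.5, 41.5, 43.5, 45.5, 47.5,
            49.5, 51.5, 53.5, 55.5, 59.5, 63.5)
counts <- c(7, 8, 11, 9, 9, 12, 17, 14, 5, 3, 2, 2, 1)
x <- rep(values, times = counts)

# One-sample, left-tailed t test of H0: u = 65 against H1: u < 65
res <- t.test(x, mu = 65, alternative = "less")
res$statistic   # observed t
res$parameter   # degrees of freedom
res$p.value     # compare with alpha = 0.05

reject_H0 <- res$p.value < 0.05
reject_H0
```

Comparing the p-value with α is equivalent to comparing the statistic with the critical value from `qt()`.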
(III)
Example and data:
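To compare two new treatment schemes for a hybrid crop, 16 plots were selected at random in the same region, and the crop on the experimental plots was treated under the two schemes. The yields per unit area (in kg) on these 16 plots were:
Scheme 1 yields: 86 87 56 93 84 93 75 79 81 78 79 90 68 65 87 90;
Scheme 2 yields: 80 79 58 91 77 82 74 66 58 59 64 78 76 80 82 55;
Assume the yields under both schemes are normally distributed, N(u1, σ^2) and N(u2, σ^2) respectively, with σ^2 unknown. Find a confidence interval for the difference in means u1 - u2.
Results and analysis:
The solution in R is as follows: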
> alpha<-0.05;
> x<-c(86,87,56,93,84,93,75,79,81,78,79,90,68,65,87,90);
> y<-c(80,79,58,91,77,82,74,66,58,59,64,78,76,80,82,55);
> n1<-length(x);
> n2<-length(y);
> xbar=mean(x);
> ybar=mean(y);
> # sw: pooled standard deviation, the square root of the pooled variance
> sw<-sqrt(((n1-1)*var(x)+(n2-1)*var(y))/(n1+n2-2));
> q<-qt(1-alpha/2,(n1+n2-2));
> left<-xbar-ybar-q*sw*sqrt(1/n1+1/n2);
> right<-xbar-ybar+q*sw*sqrt(1/n1+1/n2);
> n1
[1] 16
> n2
[1] 16
> left
[1] 0.5367875
> right
[1] 15.96321
So the 95% confidence interval for u1 - u2 is [0.5367875, 15.96321].
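The interval can also be obtained from `t.test()` with `var.equal = TRUE`, which pools the two sample variances as Sw^2 = ((n1-1)*var(x) + (n2-1)*var(y))/(n1+n2-2). The sketch below assumes, as the problem statement does, two independent normal samples with a common unknown variance, and it repeats the manual formula so the two results can be compared.

```r
x <- c(86, 87, 56, 93, 84, 93, 75, 79, 81, 78, 79, 90, 68, 65, 87, 90)
y <- c(80, 79, 58, 91, 77, 82, 74, 66, 58, 59, 64, 78, 76, 80, 82, 55)

# Built-in 95% confidence interval for u1 - u2 with pooled variance
ci <- t.test(x, y, var.equal = TRUE, conf.level = 0.95)$conf.int
ci

# Manual computation of the same interval
n1 <- length(x)
n2 <- length(y)
sw <- sqrt(((n1 - 1) * var(x) + (n2 - 1) * var(y)) / (n1 + n2 - 2))
half_width <- qt(0.975, n1 + n2 - 2) * sw * sqrt(1 / n1 + 1 / n2)
manual <- c(mean(x) - mean(y) - half_width, mean(x) - mean(y) + half_width)
manual
```

Agreement between `ci` and `manual` is a quick guard against algebra slips in the pooled standard deviation, such as taking the square root of each variance before pooling.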