Case Study of SAS Linear Regression Analysis
Linear Regression
20094788 Chen Lei, Computing Class 2
Southwest Jiaotong University
-------------------------------------------------------------------
Linear regression is divided into simple (one-variable) linear regression and multiple linear regression.
The simple linear regression model is
$Y = \beta_0 + \beta_1 X + \varepsilon$,
where $X$ is the independent variable, $Y$ is the dependent variable, and $\varepsilon$ is a random error term.
It is usually assumed that the random error has mean zero and variance $\sigma^2$ ($\sigma^2 > 0$), where $\sigma^2$ does not depend on the value of $X$. If the random error is further assumed to follow a normal distribution, the model is called a normal linear model.
In general, given $k$ independent variables and one dependent variable, the value of the dependent variable can be split into two parts. One part is due to the influence of the independent variables, that is, a function of the independent variables whose form is known but which contains some unknown parameters. The other part is due to factors not considered and to random effects, that is, the random error.
When the function is linear in the unknown parameters, the model is called a linear regression model.
If there are several independent variables, the regression model is
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k + \varepsilon$.
Because the linear model contains a random error, the straight line described by the regression model is not fixed. The main purpose of regression analysis is to find, among these candidate lines, the one that best fits the original data and describes the relationship between the dependent variable and the independent variables. This line is called the regression equation.
In regression analysis, the following classical assumptions are usually made about $\varepsilon$:
1. The expected value of $\varepsilon$ is zero.
2. $\varepsilon$ has the same variance for all values of $X$.
3. $\varepsilon$ is a normally distributed random variable, and the errors are mutually independent.
This paper explains linear regression through examples. The examples below include simple (one-variable) regression analyses and a two-variable regression analysis.
Example
(Data Analysis Methods, Exercise 2.4, page 79)
A company manager wants to understand the relationship between the monthly sales $Y$ (unit: boxes) of a cosmetic product in a city and both the number of people in the city who use the product, $X_1$ (unit: thousand persons), and their per capita monthly income, $X_2$ (unit: yuan). In a certain month, 15 cities were surveyed to obtain observations of these variables, as shown in Table 2.12.
Table 2.12 Cosmetics sales data
City   Sales volume (y)   Number of people (x1)   Income (x2)
1      162                274                     2450
2      120                180                     3254
3      223                375                     3802
4      131                205                     2838
5      67                 86                      2347
6      169                265                     3782
7      81                 98                      3008
8      192                330                     2450
9      116                195                     2137
10     55                 53                      2560
11     252                430                     4020
12     232                372                     4427
13     144                236                     2660
14     103                157                     2088
15     212                370                     2605
Assume that $Y$ and $X_1$, $X_2$ satisfy the linear regression relationship
$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \varepsilon_i$, $i = 1, 2, \ldots, 15$,
where the $\varepsilon_i$ are independent and identically distributed as $N(0, \sigma^2)$.
(1) Find the least squares estimates of the regression coefficients $\beta_0$, $\beta_1$, $\beta_2$ and the estimate of the error variance $\sigma^2$, write down the regression equation, and interpret the regression coefficients;
(2) Give the ANOVA table and explain the significance test of the linear regression relationship. Find the square of the multiple correlation coefficient, $R^2$, and explain its meaning;
(3) Find the 95% confidence intervals for $\beta_1$ and $\beta_2$ respectively;
(4) At $\alpha = 0.05$, test whether the number of people $X_1$ and the income $X_2$ each have a significant effect on the sales volume $Y$. Using the general method for testing hypotheses about regression coefficients, test whether the interaction of $X_1$ and $X_2$ (that is, $X_1 X_2$) has a significant effect on $Y$.
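For reference, the quantities asked for in (1) to (3) have standard closed forms in matrix notation. The block below is only a sketch under the usual least squares assumptions. The symbols $X$ (the $15 \times 3$ design matrix whose first column is all ones), $y$ (the vector of sales values), $p = 3$ (the number of coefficients), and $c_{kk}$ (the diagonal element of $(X^{T}X)^{-1}$ belonging to $\beta_k$) are notation introduced here and do not come from the exercise.

```latex
\begin{aligned}
\hat\beta &= (X^{T}X)^{-1}X^{T}y, \\
\hat\sigma^{2} &= \mathrm{MSE} = \frac{\mathrm{SSE}}{n-p},
  \qquad \mathrm{SSE} = \sum_{i=1}^{n} (y_i - \hat y_i)^{2}, \\
R^{2} &= \frac{\mathrm{SSR}}{\mathrm{SST}} = 1 - \frac{\mathrm{SSE}}{\mathrm{SST}}, \\
\hat\beta_k &\pm t_{1-\alpha/2}(n-p)\, s(\hat\beta_k),
  \qquad s(\hat\beta_k) = \sqrt{\hat\sigma^{2}\, c_{kk}} .
\end{aligned}
```

With $n = 15$ and $p = 3$, the $t$ quantile has $n - p = 12$ degrees of freedom.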
Data import
Enter the data import code for this problem in the editor window:
Title "Data Analysis Methods_Exercise 2.4_page79"; /* Title; omitting it does not affect the analysis results */
Data mylib.ch2_2_4; /* First create a new logical library mylib; the data set ch2_2_4 is then created in it */
Input y x1 x2 @@; /* @@ allows continuous input of several observations per line; y is the dependent variable, x1 and x2 are the independent variables */
/* The data lines start after the Cards statement */
Cards;
162 274 2450 120 180 3254 223 375 3802
131 205 2838 67 86 2347 169 265 3782
81 98 3008 192 330 2450 116 195 2137
55 53 2560 252 430 4020 232 372 4427
144 236 2660 103 157 2088 212 370 2605
;
/* Missing data must be entered as "."; otherwise the corresponding group of data is automatically deleted */
Run; /* The Run statement tells SAS to execute all the preceding statements of the current step */
After pressing F8 to run the program, open the logical library mylib to see the new data set ch2_2_4.
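The library mylib has to exist before the data step runs, and the text does not show how it was created. A minimal sketch, assuming the library folder is F:\Mylib (a guess based on the path in the INFILE example in this document; replace it with your own folder):

```sas
/* Assumption: F:\Mylib is an existing folder; the text does not name the library path */
libname mylib "F:\Mylib";

/* Run the data step from the text after the LIBNAME statement, then check the result */
proc print data=mylib.ch2_2_4;   /* lists the 15 observations of y, x1, x2 */
run;
```

PROC PRINT is only one way to check the data set. Browsing the library in the SAS Explorer window, as the text describes, serves the same purpose.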
SAS provides a variety of ways to import data, for example:
1. Reading data from a file: INFILE "F:\Mylib\CH2_2_4.txt";
2. Using an existing data set: Proc reg data=mylib.ch2_2_4;
Data can also be imported directly from external sources such as Excel. The program above was entered directly in the editor window.
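The INFILE statement only works inside a data step. A minimal sketch of the file-based variant, assuming the text file holds the same whitespace-separated values of y, x1, and x2 as the Cards block (the file layout is an assumption; only the path appears in the text):

```sas
data mylib.ch2_2_4;
  /* Path taken from the text; the file layout is assumed, not documented */
  infile "F:\Mylib\CH2_2_4.txt";
  input y x1 x2 @@;   /* @@ keeps reading triples from the same line */
run;
```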
Procedure call
The procedure called in this problem is PROC REG. PROC REG is one of the many regression analysis procedures in the SAS system. Besides fitting the general linear regression model, it also provides a variety of optimal model selection methods and model checking methods.
Questions (1), (2), and (3) mainly use the results of the multiple linear regression analysis; question (4) uses the results of the simple linear regression analyses.
(I) Linear regression analysis of Y on X1 and X2
Proc reg; /* Call the REG procedure */
Model y=x1 x2; /* The dependent variable is y; the independent variables are x1 and x2 */
Run;
MODEL statement: defines the model's dependent variable, independent variables, model options, and output options.
Common options are:
SELECTION= specifies the variable selection method: FORWARD (forward selection), BACKWARD (backward elimination), STEPWISE (stepwise regression), ADJRSQ (adjusted multiple correlation coefficient criterion), CP (Cp criterion), and so on.
NOINT indicates that the model does not include a constant term;
STB outputs the standardized regression coefficients;
CLI outputs the confidence interval for an individual predicted value;
R performs a residual analysis and outputs the results;
I outputs the $(X^{T}X)^{-1}$ matrix.
Format: MODEL dependent-variable = independent-variable-list [/ options];
Example:
Model y=x1 x2 / selection=stepwise; /* Stepwise regression */
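Several of these options can be combined in one MODEL statement. The sketch below assumes the data set mylib.ch2_2_4 already exists. It also uses CLB, an option that is not in the list above and that requests confidence limits for the parameter estimates.

```sas
proc reg data=mylib.ch2_2_4;
  /* STB: standardized coefficients; R: residual analysis; I: (X'X) inverse matrix */
  /* CLB (added here, not listed in the text): confidence limits for the estimates */
  model y = x1 x2 / stb clb r i;
run;
quit;   /* PROC REG is interactive, so QUIT ends the procedure */
```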
Running the program gives the following results.
Parameter estimates table
(1) Least squares estimates: $\hat\beta = (\hat\beta_0, \hat\beta_1, \hat\beta_2) = (3.45261, 0.49600, 0.00920)$.
Regression equation: $\hat{Y} = 3.45261 + 0.49600 X_1 + 0.00920 X_2$.
ANOVA table
(2) Estimate of the error variance: $\hat\sigma^2 = \mathrm{MSE} = 4.74040$.
Square of the multiple correlation coefficient: $R^2 = 0.9989$ (R-Square in the output).
Meaning: the value of $R^2$ shows that the linear relationship between $Y$ and ($X_1$, $X_2$) is highly significant.
The square of the multiple correlation coefficient can also be obtained by calculation: $R^2 = \mathrm{SSR}/\mathrm{SST} = 53845/53902 = 0.9989$.
(3) Confidence intervals: $\hat\beta_k \pm t_{1-\alpha/2}(n-p)\, s(\hat\beta_k)$.
$t_{0.975}(12) = 2.17881$ (obtained from the t distribution table); it can also be obtained with the function Y=TINV(P, DF).
$\beta_1$: $0.496 \pm 2.179 \times 0.00605$, giving (0.4828, 0.5092).
$\beta_2$: $0.0092 \pm 2.179 \times 0.00096811$, giving (0.0071, 0.0113).
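The hand calculations in (2) and (3) can be reproduced in a short data step. This is a sketch that copies the estimates, standard errors, and sums of squares from the results above. The variable names are made up for illustration.

```sas
data _null_;
  /* Values copied from the PROC REG results quoted in the text */
  b1 = 0.49600;  se1 = 0.00605;
  b2 = 0.00920;  se2 = 0.00096811;
  ssr = 53845;   sst = 53902;

  t  = tinv(0.975, 12);     /* 0.975 quantile of t with n - p = 15 - 3 = 12 df */
  r2 = ssr / sst;           /* square of the multiple correlation coefficient */

  lo1 = b1 - t*se1;  hi1 = b1 + t*se1;   /* 95% interval for beta1 */
  lo2 = b2 - t*se2;  hi2 = b2 + t*se2;   /* 95% interval for beta2 */

  put t= r2=;               /* results are written to the SAS log */
  put lo1= hi1=;
  put lo2= hi2=;
run;
```

Neither interval contains zero, which corresponds to rejecting $\beta_k = 0$ at $\alpha = 0.05$ in the two-variable model.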
(II) Linear regression analysis of Y on X1
Proc reg data=mylib.ch2_2_4; /* Reference the data set directly */
Model y=x1;
Run;
(4) The square of the multiple correlation coefficient is 0.9910, so X1 has a significant effect on Y.
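(III) Linear regression analysis of Y on X2
Proc reg data=mylib.ch2_2_4; /* Reference the data set directly */
Model y=x2;
Run;
(4) The square of the multiple correlation coefficient is 0.4087, much lower than for X1, so X2 alone explains Y far less well. The corresponding F statistic, $F = 13 \times 0.4087 / (1 - 0.4087) \approx 8.99$, still exceeds the critical value $F_{0.95}(1, 13) \approx 4.67$, so the effect of X2 on Y is significant at $\alpha = 0.05$, though much weaker than that of X1.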
(IV) Linear regression analysis of Y on X1X2
Data mylib.ch2_2_4;
Set mylib.ch2_2_4; /* Read the data set */
Z=x1*x2; /* New independent variable z */
Run;
Proc reg;
Model y=z; /* The independent variable is z */
Run;
(4) The square of the multiple correlation coefficient is 0.9030, so X1X2 has a significant effect on Y.
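The exercise asks for the interaction to be tested with the general hypothesis test on regression coefficients. One common way to do that, which differs from the single-regressor fit used above, is to add the product term to the two-variable model and test whether its coefficient is zero. The sketch below assumes that approach. The data set name work.ch2_int and the variable name x1x2 are hypothetical.

```sas
/* Build a copy of the data with the product term; names are illustrative only */
data work.ch2_int;
  set mylib.ch2_2_4;
  x1x2 = x1 * x2;                /* interaction term */
run;

proc reg data=work.ch2_int;
  model y = x1 x2 x1x2;
  Interaction: test x1x2 = 0;    /* F test of H0: the coefficient of x1x2 is zero */
run;
quit;
```

The interaction is judged significant at $\alpha = 0.05$ if the p-value reported for this test is below 0.05.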
Linear regression analysis using the Analyst module
(I) Simple linear regression analysis
Start the SAS system and click "Solutions" -> "Analysis" -> "Analyst" in turn, then click "File" -> "Open" and open the data set "Ch2_2_4.sas7bdat".
Click "Statistics" -> "Regression" -> "Simple" in turn to open the dialog box.
[Figure: regression dialog box, with the variable list, independent variable, dependent variable, and confidence level α marked]
(1) Variable settings
In the variable list on the left, select Y and click the "Dependent" button to set it as the dependent variable; select X2 and click the "Explanatory" button to set it as the independent variable.
In the "Model" settings, "Linear" is selected by default, which means linear regression.
(2) Tests settings
Click the "Tests" button to open its dialog box. The confidence level setting (α) defaults to 0.05 and can be changed. Click "OK".
(3) Plots settings
Click the "Plots" button to open the plotting options dialog box and choose the "Residual" tab.
"Studentized" stands for the studentized residuals, and "Normal quantile-quantile plot" stands for the normal QQ plot check.
[Figure: settings and output, with the residual options, normality check options, variable column, analysis of variance, and parameter estimates marked]
Click "OK", then click "OK" in the main settings dialog box to obtain the results.
[Figure: output showing the regression equation]
In the "Analysis (New Project)" window, click "Plot of RSTUDENT vs X2" to open the residual plot, then click "Plot of RSTUDENT vs NQQ" to open the QQ plot.
The normal QQ plot of the studentized residuals shows that the model error term is approximately normally distributed.
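The same two diagnostic plots can also be produced in code. The sketch below is not the program that Analyst generates. It is a hand-written equivalent that assumes SAS 9.2 or later (for PROC SGPLOT), and the names work.res, rstud, and pred are hypothetical.

```sas
proc reg data=mylib.ch2_2_4;
  model y = x2;
  /* Save studentized (RSTUDENT) residuals and predicted values; names are illustrative */
  output out=work.res rstudent=rstud predicted=pred;
run;
quit;

/* Residual plot: studentized residuals against x2 */
proc sgplot data=work.res;
  scatter x=x2 y=rstud;
  refline 0 / axis=y;
run;

/* Normal QQ plot of the studentized residuals */
proc univariate data=work.res noprint;
  qqplot rstud / normal(mu=est sigma=est);
run;
```

Points lying close to the reference line in the QQ plot support the normality assumption for the error term.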
(II) Multiple linear regression analysis
Start the SAS system and click "Solutions" -> "Analysis" -> "Analyst" in turn, then click "File" -> "Open" and open the data set "Ch2_2_4.sas7bdat".
Click "Statistics" -> "Regression" -> "Linear" in turn to open the dialog box.
Select the independent variables X1 and X2 and the dependent variable Y. Click the "Model" button to open its dialog box.
[Figure: Model dialog box, with the independent variable selection options marked]
The "Selection method" column provides the independent variable selection methods. For example, "Stepwise selection" means the stepwise regression method, and "Adjusted R-square" means the adjusted multiple correlation coefficient criterion. This example selects the stepwise regression method. Click "OK".
The Plots settings are similar to those for the simple regression analysis. Finally, click "OK".
[Figures: multiple linear regression output, residual plot, and QQ plot]
In addition, clicking "Code" in the "Analysis (New Project)" window opens the program window.
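The program shown in that window is not reproduced in the text. A hand-written equivalent of the stepwise analysis might look like the sketch below. The entry and stay significance levels are assumptions chosen for illustration, not values taken from the Analyst dialog.

```sas
proc reg data=mylib.ch2_2_4;
  /* SLENTRY= and SLSTAY= set the significance levels for entering and staying; */
  /* the 0.05 values are assumed here and are not given in the text             */
  model y = x1 x2 / selection=stepwise slentry=0.05 slstay=0.05;
run;
quit;
```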
The above mainly explains how to use the SAS system for linear regression, so the results are analyzed only briefly. For example, the QQ plot shows that the points lie close to a straight line, which indicates that the error term is approximately normally distributed.
