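Lab Report 6: One-Way Analysis of Variance
Experiment 6: One-Way Analysis of Variance
Objectives:
1. Master the theory and methods of one-way analysis of variance;
2. Use SAS to build the model and carry out significance tests, and apply them to practical problems.
Requirements: write the programs and analyze the results.
Tasks:
1. Write out the steps of the one-way ANOVA model and the sum-of-squares decomposition formula;
Solution: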
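I. Steps of the one-way ANOVA model:
(1) MODEL dependent-variable = factor-effects; statement. For the one-way model: MODEL Y=A;
(2) MEANS factor-effects / options; statement. The options can include the following:
1) T (or LSD): pairwise t tests comparing the means at the different levels of each factor listed in effects
2) BON: Bonferroni simultaneous pairwise t tests comparing the means at the different levels of each factor listed in effects
4) CLDIFF: output confidence intervals for the pairwise differences between the means at the different levels of each factor listed in effects
5) CLM: output confidence intervals for the means at the different levels of each factor listed in effects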
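Put together, the two statements above sit inside a single PROC ANOVA step. The following is a minimal sketch rather than a definitive program: the data set name `mydata`, the response `y` and the factor `a` are hypothetical placeholders, and only options from the list above are used.

```sas
/* Hypothetical names: mydata (data set), y (response), a (factor). */
proc anova data=mydata;
  class a;                          /* declare a as the classification factor */
  model y = a;                      /* one-way model: response = factor effect */
  means a / t clm alpha=0.05;       /* t-based confidence intervals for the level means */
  means a / bon cldiff alpha=0.05;  /* Bonferroni simultaneous intervals for pairwise differences */
run;
```

Each MEANS statement produces its own table, so several option combinations can be requested in the same step.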
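II. Sum-of-squares decomposition:
The overall variation among the $y_{ij}$ is measured by the total sum of squares (of deviations) $SS_T$, where $\bar{y}$ is the grand mean and $\bar{y}_{i\cdot}$ is the mean of group $i$:
$$SS_T=\sum_{i=1}^{a}\sum_{j=1}^{n_i}(y_{ij}-\bar{y})^2$$
The variation caused by random error is measured by the within-group sum of squares, also called the error sum of squares $SS_E$:
$$SS_E=\sum_{i=1}^{a}\sum_{j=1}^{n_i}(y_{ij}-\bar{y}_{i\cdot})^2$$
Besides random error, the between-group deviations also reflect differences among the effects, so the variation caused by different effects is measured by the between-group sum of squares, also called the sum of squares of factor A, $SS_A$:
$$SS_A=\sum_{i=1}^{a}n_i(\bar{y}_{i\cdot}-\bar{y})^2$$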
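Decomposing the total sum of squares:
\begin{align*}
SS_T &= \sum_{i=1}^{a}\sum_{j=1}^{n_i}(y_{ij}-\bar{y})^2
      = \sum_{i=1}^{a}\sum_{j=1}^{n_i}(y_{ij}-\bar{y}_{i\cdot}+\bar{y}_{i\cdot}-\bar{y})^2 \\
     &= \sum_{i=1}^{a}\sum_{j=1}^{n_i}(y_{ij}-\bar{y}_{i\cdot})^2
      + \sum_{i=1}^{a}\sum_{j=1}^{n_i}(\bar{y}_{i\cdot}-\bar{y})^2
      + 2\sum_{i=1}^{a}\sum_{j=1}^{n_i}(y_{ij}-\bar{y}_{i\cdot})(\bar{y}_{i\cdot}-\bar{y}) \tag{3.5} \\
     &= \sum_{i=1}^{a}\sum_{j=1}^{n_i}(y_{ij}-\bar{y}_{i\cdot})^2
      + \sum_{i=1}^{a}n_i(\bar{y}_{i\cdot}-\bar{y})^2 \\
     &= SS_E+SS_A
\end{align*}
where the cross term vanishes:
$$\sum_{i=1}^{a}\sum_{j=1}^{n_i}(y_{ij}-\bar{y}_{i\cdot})(\bar{y}_{i\cdot}-\bar{y})
=\sum_{i=1}^{a}\Big[(\bar{y}_{i\cdot}-\bar{y})\sum_{j=1}^{n_i}(y_{ij}-\bar{y}_{i\cdot})\Big]
=\sum_{i=1}^{a}(\bar{y}_{i\cdot}-\bar{y})(n_i\bar{y}_{i\cdot}-n_i\bar{y}_{i\cdot})=0,$$
That is: total sum of squares = error sum of squares + factor sum of squares.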
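As a follow-up to the decomposition, the sketch below spells out how the two sums of squares become the test statistic $f=MS_A/MS_E$ that appears in the SAS output. The symbol $n=\sum_i n_i$ for the total number of observations is notation assumed here, and the F distribution holds under the usual one-way ANOVA assumptions (independent normal errors with equal variance), which the report does not state explicitly.

```latex
\begin{align*}
n-1 &= (n-a) + (a-1), \qquad n=\sum_{i=1}^{a} n_i,\\
MS_A &= \frac{SS_A}{a-1}, \qquad MS_E=\frac{SS_E}{n-a},\\
F &= \frac{MS_A}{MS_E}\sim F(a-1,\,n-a)\quad\text{under } H_0:\mu_1=\cdots=\mu_a .
\end{align*}
```

The degrees of freedom split in the same way as the sums of squares, which is why the ANOVA table lists DF next to each sum of squares.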
2. Exercise 3.4; Exercise 3.5 (optional)
Exercise 3.4
Program:
data examp3_4;
  input chj $ delv @@;
  cards;
a1 0.88 a1 0.85 a1 0.79 a1 0.86 a1 0.85 a1 0.83
a2 0.87 a2 0.92 a2 0.85 a2 0.83 a2 0.90 a2 0.80
a3 0.84 a3 0.78 a3 0.81 a3 0.80 a3 0.85 a3 0.83
a4 0.81 a4 0.86 a4 0.90 a4 0.87 a4 0.78 a4 0.79
;
run;
proc anova data=examp3_4; /* invoke the ANOVA procedure */
  class chj;
  model delv=chj;
run;
The SAS System 11:16 Friday, October 22, 2013 1
The ANOVA Procedure
Class Level Information
Class    Levels    Values
chj      4         a1 a2 a3 a4
(One factor, chj, with four levels.)
Number of observations    24
(The sample contains 24 observations.)
The SAS System 18:50 Saturday, December 4, 2012 2
The ANOVA Procedure
Dependent Variable: delv
Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               3    0.00584583        0.00194861     1.31       0.3002
Error              20    0.02985000        0.00149250
Corrected Total    23    0.03569583
(Columns: source of variation, degrees of freedom, sum of squares, mean square, $f=MS_A/MS_E$, p-value.)
R-Square    Coeff Var    Root MSE    delv Mean
0.163768    4.601436     0.038633    0.839583
Source    DF    Anova SS      Mean Square    F Value    Pr > F
chj        3    0.00584583    0.00194861     1.31       0.3002
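As a quick consistency check, the F value and R-Square can be recomputed from the sums of squares and degrees of freedom printed in the table above. No quantities beyond the table entries are assumed.

```latex
\begin{align*}
MS_A &= \frac{SS_A}{a-1} = \frac{0.00584583}{3} \approx 0.00194861,\\
MS_E &= \frac{SS_E}{n-a} = \frac{0.02985000}{20} = 0.00149250,\\
f &= \frac{MS_A}{MS_E} = \frac{0.00194861}{0.00149250} \approx 1.31,\\
R^2 &= \frac{SS_A}{SS_T} = \frac{0.00584583}{0.03569583} \approx 0.1638 .
\end{align*}
```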
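From the output, for testing the hypothesis $H_0:\mu_1=\mu_2=\mu_3=\mu_4$, the statistic is $f=MS_A/MS_E=1.31$ and $p=P(F(3,20)\ge f)=0.3002>0.05$. Because this p-value is large, $H_0$ is not rejected, and the four catalysts are judged to have no significant effect on the yield of the chemical product.
Exercise 3.5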
(1) Program:
data examp3_5;
  input kyjf $ tgl @@;
  cards;
a1 7.6 a1 8.2 a1 6.8 a1 5.8 a1 6.9 a1 6.6 a1 6.3 a1 7.7 a1 6.0
a2 6.7 a2 8.1 a2 9.4 a2 8.6 a2 7.8 a2 7.7 a2 8.9 a2 7.9 a2 8.3 a2 8.7 a2 7.1 a2 8.4
a3 8.5 a3 9.7 a3 10.1 a3 7.8 a3 9.6 a3 9.5
;
run;
proc anova data=examp3_5;
  class kyjf;
  model tgl=kyjf;
run;
The ANOVA Procedure
Class Level Information
Class    Levels    Values
kyjf     3         a1 a2 a3
(One factor, kyjf, with three levels.)
Number of observations    27
The ANOVA Procedure
Dependent Variable: tgl
Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               2    20.12518519       10.06259259    15.72
Error              24    15.36222222        0.64009259
Corrected Total    26    35.48740741
(Columns: source of variation, degrees of freedom, sum of squares, mean square, $f=MS_A/MS_E$, p-value.)
R-Square    Coeff Var    Root MSE    tgl Mean
0.567108    10.06128     0.800058    7.951852
Source    DF    Anova SS       Mean Square    F Value    Pr > F
kyjf       2    20.12518519    10.06259259    15.72
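From the output, for testing the hypothesis $H_0:\mu_1=\mu_2=\mu_3$, the statistic is $f=MS_A/MS_E=15.72$ with $p=P(F(2,24)\ge f)<0.05$, so $H_0$ is rejected: differences in research funding input have a significant effect on the increase in productivity that year.
(2)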
proc anova data=examp3_5;
  class kyjf;
  model tgl=kyjf;
  means kyjf;
  means kyjf/t clm alpha=0.05;
  means kyjf/t cldiff alpha=0.05;
run;
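The ANOVA Procedure
Level of          -------------tgl-------------
kyjf        N     Mean           Std Dev
a1          9     6.87777778     0.81359968
a2         12     8.13333333     0.75718778
a3          6     9.20000000     0.86717934
Columns: level of factor kyjf; number of observations $n_i$; sample mean of each population, $\bar{y}_{i\cdot}=\sum_{j=1}^{n_i}y_{ij}/n_i$; sample standard deviation $s_i$ of each population, where $s_i^2=\sum_{j=1}^{n_i}(y_{ij}-\bar{y}_{i\cdot})^2/(n_i-1)$.
The next table gives the confidence intervals for $\mu_i$ at confidence level $1-\alpha$: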
The ANOVA Procedure
t Confidence Intervals for tgl
Alpha                       0.05
Error Degrees of Freedom    24
Error Mean Square           0.640093
Critical Value of t         2.06390
kyjf     N     Mean       95% Confidence Limits
a3       6     9.2000     8.5259     9.8741
a2      12     8.1333     7.6567     8.6100
a1       9     6.8778     6.3274     7.4282
The ANOVA Procedure
t Tests (LSD) for tgl
NOTE: This test controls the Type I comparisonwise error rate, not the experimentwise error rate.
Alpha                       0.05
Error Degrees of Freedom    24          (error degrees of freedom, $n-a$)
Error Mean Square           0.640093    (mean square error, $MS_E$)
Critical Value of t         2.06390     (critical t value, $t_{0.975}(27-3)$)
Comparisons significant at the 0.05 level are indicated by ***.
kyjf           Difference
Comparison     Between Means    95% Confidence Limits
a3 - a2         1.0667           0.2410     1.8923    ***
a3 - a1         2.3222           1.4519     3.1925    ***
a2 - a3        -1.0667          -1.8923    -0.2410    ***
a2 - a1         1.2556           0.5274     1.9837    ***
a1 - a3        -2.3222          -3.1925    -1.4519    ***
a1 - a2        -1.2556          -1.9837    -0.5274    ***
Columns: levels compared; estimate of the mean difference $\mu_i-\mu_j$; 95% confidence interval for the mean difference.
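From Table 3.6, $MS_E=0.64009259$, $t_{1-\alpha/2}(n-a)=t_{0.975}(24)=2.06390$, $n_1=9$, $n_2=12$, $n_3=6$, and the estimates are $\bar{y}_{1\cdot}=6.8778$, $\bar{y}_{2\cdot}=8.1333$, $\bar{y}_{3\cdot}=9.2000$.
The confidence interval for $\mu_i$ at confidence level $1-\alpha$ is
$$\left(\bar{y}_{i\cdot}-t_{1-\alpha/2}(n-a)\sqrt{MS_E/n_i},\ \ \bar{y}_{i\cdot}+t_{1-\alpha/2}(n-a)\sqrt{MS_E/n_i}\right)$$
Hence the 95% confidence intervals for the mean increases in production capacity $\mu_1,\mu_2,\mu_3$ are (6.3274, 7.4282), (7.6567, 8.6100) and (8.5259, 9.8741), respectively.
The 95% confidence interval for $\mu_i-\mu_j$ is
$$\left(\bar{y}_{i\cdot}-\bar{y}_{j\cdot}-t_{1-\alpha/2}(n-a)\sqrt{\left(\frac{1}{n_i}+\frac{1}{n_j}\right)MS_E},\ \ \bar{y}_{i\cdot}-\bar{y}_{j\cdot}+t_{1-\alpha/2}(n-a)\sqrt{\left(\frac{1}{n_i}+\frac{1}{n_j}\right)MS_E}\right)$$
Hence the 95% confidence intervals for the pairwise differences of the mean increases in production capacity $\mu_1,\mu_2,\mu_3$ are
$\mu_1-\mu_2$: (-1.9837, -0.5274)
$\mu_1-\mu_3$: (-3.1925, -1.4519)
$\mu_2-\mu_3$: (-1.8923, -0.2410)
None of these intervals contains 0 and all limits are negative, so $\mu_3$ is significantly greater than $\mu_2$ and $\mu_1$, and $\mu_2$ is significantly greater than $\mu_1$.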
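To see how the interval formulas reproduce the SAS limits, here is a worked check for one mean and one difference. It uses only values from the listings ($MS_E=0.640093$, critical value 2.06390, and the level means and counts); levels are identified by their SAS labels a1, a2, a3, and intermediate values are rounded to five decimals.

```latex
\begin{align*}
\text{a3 } (n=6):\quad
 & 9.20000 \pm 2.06390\sqrt{0.640093/6}
   \approx 9.20000 \pm 0.67412
   \;\Rightarrow\; (8.5259,\ 9.8741),\\
\text{a1}-\text{a2 } (n=9,\ 12):\quad
 & -1.25556 \pm 2.06390\sqrt{0.640093\left(\tfrac{1}{9}+\tfrac{1}{12}\right)}
   \approx -1.25556 \pm 0.72813
   \;\Rightarrow\; (-1.9837,\ -0.5274).
\end{align*}
```

Both results agree with the limits printed in the t Confidence Intervals and t Tests (LSD) tables.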
(3)
proc anova data=examp3_5;
  class kyjf;
  model tgl=kyjf;
  means kyjf/bon cldiff alpha=0.05;
run;
The simultaneous confidence intervals for the mean differences are given below:
The ANOVA Procedure
Bonferroni (Dunn) t Tests for tgl
NOTE: This test controls the Type I experimentwise error rate, but it generally has a higher Type II error rate than Tukey's for all pairwise comparisons.
Alpha                       0.05
Error Degrees of Freedom    24
Error Mean Square           0.640093
Critical Value of t         2.57364
Comparisons significant at the 0.05 level are indicated by ***.
kyjf           Difference       Simultaneous 95%
Comparison     Between Means    Confidence Limits
a3 - a2         1.0667           0.0371     2.0962    ***
a3 - a1         2.3222           1.2370     3.4074    ***
a2 - a3        -1.0667          -2.0962    -0.0371    ***
a2 - a1         1.2556           0.3476     2.1635    ***
a1 - a3        -2.3222          -3.4074    -1.2370    ***
a1 - a2        -1.2556          -2.1635    -0.3476    ***
Columns: levels compared; estimate of the mean difference $\mu_i-\mu_j$; simultaneous 95% confidence interval for the mean difference.