drgwen.org-Statistics Tutorial

ANALYSIS OF VARIANCE

I. Differences Among Group Means: One-Way Analysis of Variance (ANOVA)

a. Introduction

1) Many clinical research questions involve comparing several groups on a particular measure. When the measure is interval- or ratio-level data, the next step is to determine whether the groups differ from one another in their distribution of scores. The basic t-test compares two means in relation to the distribution of the differences between pairs of means drawn from random samples. When there are more than two groups and we are interested in the differences among the set of groups, we are dealing with several combinations of pairs of means. If we chose to analyze the differences by t-test, we would need to run a number of t-tests. The problem with multiple group comparisons is that each time a test is run, we run the risk of a Type I error. The probability level set as the point at which we reject the null hypothesis is also the level of risk we decide we are comfortable with. If the level is .05, we accept the risk that, when the null hypothesis is true, we will reject it in error 5 times out of 100.

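2) When we calculate multiple t-tests on independent samples measured on the same variable, the chance of making at least one Type I error rises with the number of tests conducted. If the m tests were independent, each at level $\alpha$, that chance would be $1 - (1 - \alpha)^m$: about .14 for three tests at .05, rather than .05.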

3) Instead of using a series of individual comparisons, we can examine the differences among the groups through a single analysis that considers the variation across all groups: the Analysis of Variance (ANOVA).

4) The question answered by ANOVA is whether the group means differ from each other.
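
The inflation of Type I error from running several t-tests can be illustrated with a few lines of Python. This is a minimal sketch and not part of the original tutorial: the function names are hypothetical, and the formula assumes the tests are independent, which is only approximately true for pairwise t-tests that share groups.

```python
from math import comb


def pairwise_tests(k: int) -> int:
    """Number of distinct pairs of groups among k groups: k(k - 1) / 2."""
    return comb(k, 2)


def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one Type I error across n_tests tests.

    Assumption: the tests are independent, each run at level alpha.
    """
    return 1.0 - (1.0 - alpha) ** n_tests


for k in (3, 4, 5):
    m = pairwise_tests(k)
    rate = familywise_error_rate(0.05, m)
    print(f"{k} groups: {m} t-tests, chance of at least one Type I error = {rate:.3f}")
```

With three groups there are three pairwise t-tests, and under the independence assumption the overall risk is already close to .14 instead of .05.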

b. Type of Data Required

1) Independent variables: at the nominal level. A "one-way" ANOVA means that there is only one independent variable (called a factor), which has two or more levels. A "two-way" ANOVA indicates two independent variables.

2) Dependent variable: at the interval or ratio level.

c. Assumptions

1) ANOVA is considered "robust", which means that even if the assumptions are not rigidly adhered to, the results may still be close to the truth.

2) The assumptions are the same as those for the t-test.

II. One-Way ANOVA

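a. There is one categorical independent variable with k levels and one continuous dependent variable. Example: we may have two or more experimental conditions and want to determine whether the groups differ in their scores. The question asked is "Do the group means differ from one another?" A significant F does not by itself show which groups differ from which others; that requires follow-up comparisons.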

b. Statistical Question

1) The statistical question in ANOVA is based on the null hypothesis: the assumption that all groups are equal, drawn from the same population. Under this assumption, any difference found comes from sampling variation that is random in nature. To test the null hypothesis, we must consider the variation in the scores and compare the amount of variation found between the groups with the amount found within them.

c. Source of Variance

1) Under the null hypothesis, all groups are from the same population, and all scores come from the same population of measures. We need to look at two sources of variability:

1. Within-group variation
2. Between-group variation

2) ANOVA examines the variation and tests whether the between-group variation is greater than the within-group variation. When the between-group variance is statistically greater than the within-group variance, we conclude that the group means differ. When the between-group variance is about the same as the within-group variance, the group means are not importantly different.

d. Measure of Variance: Sums of Squares

1) A sum of squares is the sum of the squared deviations of each of the scores around a respective mean. Types:

1. Sum of Squares for Total Variation: $SS_{tot}$

a. The total sum of squares is equal to the sum of the squared deviations of each score in all groups from the grand mean. It is expressed as the following equation:

$$SS_{tot} = \sum (X - M_{tot})^2 = \sum x_{tot}^2$$

where $M_{tot}$ is the grand mean and lowercase $x$ denotes a deviation score ($x = X - M$).
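
The definition of the total sum of squares translates almost word for word into code. The following is a minimal Python sketch, assuming the scores are held in plain lists; the function name `ss_total` is hypothetical, and the data are the three groups from the worked example later in this tutorial.

```python
# Scores for the three groups in the worked example later in this tutorial
groups = [
    [2, 2, 3, 3, 4, 5],
    [7, 6, 5, 3, 3, 4],
    [6, 4, 1, 3, 2, 5],
]


def ss_total(groups):
    """Sum of squared deviations of every score from the grand mean."""
    scores = [x for group in groups for x in group]  # pool all groups
    grand_mean = sum(scores) / len(scores)
    return sum((x - grand_mean) ** 2 for x in scores)


print(f"SStot = {ss_total(groups):.2f}")
```

For these data the result is about 45.11, the same total sum of squares the worked example reaches with the raw-score formula.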

2. Sum of Squares for Within Group Variation: $SS_w$

a. The within group variation is the total of the variation that occurs in each subgroup. It is calculated by finding the sum of squares for each group separately and then summing the results. The formula is:

$$SS_w = SS_{w1} + SS_{w2} + SS_{w3} + \dots + SS_{wk}$$

(*$SS_{wk}$ = the last group in the set included in the analysis)

Example:
Group 1: $SS_{w1} = \sum (X_1 - M_1)^2 = \sum x_1^2$
Group 2: $SS_{w2} = \sum (X_2 - M_2)^2 = \sum x_2^2$

Applying this formula to our three-group example, we see that the within group variation follows this equation:

$$SS_w = SS_{w1} + SS_{w2} + SS_{w3}$$

In calculating the within group variation, we add together the sum of squares of each of the groups.
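
To make the "one sum of squares per group, then add" step concrete, here is a minimal Python sketch. The function name `ss_within_by_group` is hypothetical, and the data are the three groups from the worked example later in this tutorial.

```python
groups = [
    [2, 2, 3, 3, 4, 5],  # group 1
    [7, 6, 5, 3, 3, 4],  # group 2
    [6, 4, 1, 3, 2, 5],  # group 3
]


def ss_within_by_group(groups):
    """Return one sum of squares per group, each taken around that group's own mean."""
    per_group = []
    for scores in groups:
        group_mean = sum(scores) / len(scores)
        per_group.append(sum((x - group_mean) ** 2 for x in scores))
    return per_group


per_group = ss_within_by_group(groups)
ss_w = sum(per_group)  # SSw = SSw1 + SSw2 + SSw3
print([round(v, 2) for v in per_group], round(ss_w, 2))
```

The three group values come to about 6.83, 13.33 and 17.50, which add up to about 37.67.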

3. Sum of Squares for Between Group Variation: $SS_b$

a. Between group variation examines how each of the groups varies from the grand mean, using the group means as representative of the individual groups. First find the squared deviation of each group mean from the grand mean, $(\bar{X}_g - \bar{X}_{tot})^2$. Then weight the squared deviation by the size of each group, $n_g (\bar{X}_g - \bar{X}_{tot})^2$. Finally, sum the weighted deviations:

$$SS_b = \sum n_g (\bar{X}_g - \bar{X}_{tot})^2$$

where
$n_g$ = the n size of each group
$\bar{X}_g$ = the mean of each group
$\bar{X}_{tot}$ = the grand mean (written $M_{tot}$ in the total sum of squares formula)

With three groups, we have the following equation:

$$SS_b = n_1 (\bar{X}_1 - \bar{X}_{tot})^2 + n_2 (\bar{X}_2 - \bar{X}_{tot})^2 + n_3 (\bar{X}_3 - \bar{X}_{tot})^2$$
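
The three steps for the between group sum of squares (deviation of each group mean, weighting by group size, summing) can be sketched in Python as follows. The function name `ss_between` is hypothetical, and the data are the three groups from the worked example later in this tutorial.

```python
groups = [
    [2, 2, 3, 3, 4, 5],
    [7, 6, 5, 3, 3, 4],
    [6, 4, 1, 3, 2, 5],
]


def ss_between(groups):
    """Sum over groups of n_g * (group mean - grand mean)^2."""
    n_tot = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_tot
    total = 0.0
    for scores in groups:
        group_mean = sum(scores) / len(scores)
        squared_dev = (group_mean - grand_mean) ** 2  # step 1: squared deviation
        total += len(scores) * squared_dev            # steps 2 and 3: weight, then sum
    return total


print(f"SSb = {ss_between(groups):.2f}")
```

For these data the result is about 7.44. Groups of unequal size are handled by the same code, because each deviation is weighted by its own n.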

4. Homogeneity of Variance (we need to test that the variances of the groups do not differ significantly from each other):

$$F = \frac{\text{largest group variance}}{\text{smallest group variance}}$$

*Levene test in SPSS
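
The variance ratio above is easy to compute by hand or in code. The sketch below uses only Python's standard library; the function name `variance_ratio` is hypothetical, and the data are the three groups from the worked example later in this tutorial. It does not implement the Levene test, and the ratio it prints still has to be judged against an appropriate critical value, which this sketch does not supply.

```python
from statistics import variance  # sample variance (divides by n - 1)

groups = [
    [2, 2, 3, 3, 4, 5],
    [7, 6, 5, 3, 3, 4],
    [6, 4, 1, 3, 2, 5],
]


def variance_ratio(groups):
    """Largest group variance divided by smallest group variance."""
    variances = [variance(scores) for scores in groups]
    return max(variances) / min(variances)


print(f"largest / smallest variance = {variance_ratio(groups):.2f}")
```

For these data the group variances are about 1.37, 2.67 and 3.50, so the ratio is about 2.56.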

5. Calculating the Variance: The Mean Square

a. The mean square is simply the term used for variance. We divide the sum of squares by the df, rather than by n, because we are using samples as estimates of populations.

Between group variance is represented symbolically by the following:
$$MS_b = SS_b / df_b$$

Within group variance is represented symbolically by the following:
$$MS_w = SS_w / df_w$$

6. Degrees of Freedom

a. The df for the between group variance is equal to the number of groups minus one: $df_b = k - 1$ (e.g., $df_b = 3 - 1 = 2$).

b. The df for the within group variance is equal to the df for each group added together: $df_w = n_{tot} - k$ (e.g., $df_w = 30 - 3 = 27$).

*$n_{tot}$ equals the total n across all groups, and k equals the number of groups.

7. Now calculate the mean square terms for between and within variation using the appropriate degrees of freedom. The mean square term for between group variation is calculated with the following formula:

$$MS_b = SS_b / df_b$$

8. Calculate the mean square for the within group variation:

$$MS_w = SS_w / df_w$$

9. F-Ratio (to determine whether the between group difference is great enough to reject the null hypothesis, we compare it statistically to the within group variance).

$$F = MS_b / MS_w$$

Then compare the value to the F distribution in a critical values table. Locate the between df in the row across the top of the table, and locate the within df in the column on the left side of the table. Then locate the two critical values for F where they intersect.
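
Steps 6 through 9 (degrees of freedom, mean squares, F ratio, comparison with a critical value) can be chained in a short Python sketch. The function names are hypothetical. The inputs 7.5, 37.6, k = 3 and n = 18 and the critical value 3.68 are taken from the worked example later in this tutorial; the critical value still has to be looked up in a table, since the sketch does not compute it.

```python
def f_ratio(ss_b, ss_w, k, n_tot):
    """Degrees of freedom, mean squares and F from the two sums of squares."""
    df_b = k - 1        # between groups
    df_w = n_tot - k    # within groups
    ms_b = ss_b / df_b
    ms_w = ss_w / df_w
    return {"df_b": df_b, "df_w": df_w, "MS_b": ms_b, "MS_w": ms_w, "F": ms_b / ms_w}


def is_significant(f_value, f_critical):
    """True when the obtained F reaches the tabled critical value."""
    return f_value >= f_critical


result = f_ratio(ss_b=7.5, ss_w=37.6, k=3, n_tot=18)
print(result)
print("reject null hypothesis:", is_significant(result["F"], f_critical=3.68))
```

With these inputs F comes out at about 1.5, below the critical value, so the null hypothesis is not rejected.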

10. Displaying the results: ANOVA summary table.

Source of Variance	SS	df	MS	F	p
Between Group
Within Group
Total

ANOVA Example (Three Groups)

Begin by developing the computation table.

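$X_1$	$X_1^2$	$X_2$	$X_2^2$	$X_3$	$X_3^2$
2	4	7	49	6	36
2	4	6	36	4	16
3	9	5	25	1	1
3	9	3	9	3	9
4	16	3	9	2	4
5	25	4	16	5	25
19	67	28	144	21	91

(The last row gives the column sums.)

$\bar{X}_1 = 3.2$		$\bar{X}_2 = 4.7$		$\bar{X}_3 = 3.5$
$\sum X_{tot} = \sum X_1 + \sum X_2 + \sum X_3 = 19 + 28 + 21 = 68$
$\sum X_{tot}^2 = \sum X_1^2 + \sum X_2^2 + \sum X_3^2 = 67 + 144 + 91 = 302$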

Now calculate sum of squares.

$$SS_{tot} = \sum X_{tot}^2 - \frac{(\sum X_{tot})^2}{n_{tot}}$$

$$SS_{tot} = 302 - \frac{(68)^2}{18} = 302 - 256.89 = 45.11$$

$$SS_b = \frac{(\sum X_1)^2}{n_1} + \frac{(\sum X_2)^2}{n_2} + \frac{(\sum X_3)^2}{n_3} - \frac{(\sum X_{tot})^2}{n_{tot}}$$

$$SS_b = \frac{(19)^2}{6} + \frac{(28)^2}{6} + \frac{(21)^2}{6} - \frac{(68)^2}{18}$$

$$SS_b = 60.2 + 130.7 + 73.5 - 256.9 = 7.5$$

$$SS_w = SS_{tot} - SS_b$$

$$SS_w = 45.1 - 7.5 = 37.6$$
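
The raw-score (computational) formulas used in this step can be checked with a short Python sketch. The function name `raw_score_sums_of_squares` is hypothetical. Because the code carries full precision instead of rounding each term to one decimal, its results differ slightly from the hand calculation.

```python
groups = [
    [2, 2, 3, 3, 4, 5],
    [7, 6, 5, 3, 3, 4],
    [6, 4, 1, 3, 2, 5],
]


def raw_score_sums_of_squares(groups):
    """SStot, SSb and SSw from sums and sums of squares, without computing means."""
    sum_x = sum(sum(g) for g in groups)               # sum of all scores
    sum_x2 = sum(x * x for g in groups for x in g)    # sum of all squared scores
    n_tot = sum(len(g) for g in groups)
    correction = sum_x ** 2 / n_tot                   # (sum of all scores)^2 / n_tot
    ss_tot = sum_x2 - correction
    ss_b = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    ss_w = ss_tot - ss_b                              # within = total - between
    return ss_tot, ss_b, ss_w


ss_tot, ss_b, ss_w = raw_score_sums_of_squares(groups)
print(f"SStot = {ss_tot:.2f}, SSb = {ss_b:.2f}, SSw = {ss_w:.2f}")
```

At full precision the between and within sums of squares are about 7.44 and 37.67, against 7.5 and 37.6 from the rounded terms; the conclusion of the analysis is unaffected.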

Next calculate the degrees of freedom.

1) $df_b = k - 1 = 3 - 1 = 2$ (for between group variance)
2) $df_w = n_{tot} - k = 18 - 3 = 15$ (for within group variance)

Now calculate the mean square terms.

1) $MS_b = SS_b / df_b = 7.5 / 2 = 3.75$
2) $MS_w = SS_w / df_w = 37.6 / 15 = 2.5$

Then, calculate the F ratio.

$$F = MS_b / MS_w = 3.75 / 2.5 = 1.5$$

Now, develop the Summary Table.

Source of Variance	SS	df	MS	F	p
Between Group	7.5	2	3.75	1.5	> .05
Within Group	37.6	15	2.5
Total	45.1	17

Results: F(2, 15) = 1.5, p > .05

*An F of at least 3.68 (Critical Values Table) is needed for statistical significance at the .05 level. Because the obtained F of 1.5 is smaller, we fail to reject the null hypothesis: the data do not show a difference among the three groups.
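
The whole example can be reproduced end to end with a short Python sketch that fills in the summary table. The function name `one_way_anova` is hypothetical. The p value relies on a closed form for the F distribution that holds only when the between group df equals 2, as it does here; for any other number of groups the sketch returns no p value, and a statistics package or table would be needed.

```python
groups = [
    [2, 2, 3, 3, 4, 5],
    [7, 6, 5, 3, 3, 4],
    [6, 4, 1, 3, 2, 5],
]


def one_way_anova(groups):
    """One-way ANOVA from raw scores; p is given only for the df_b == 2 case."""
    scores = [x for g in groups for x in g]
    n_tot, k = len(scores), len(groups)
    grand_mean = sum(scores) / n_tot
    ss_b = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_w = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    df_b, df_w = k - 1, n_tot - k
    ms_b, ms_w = ss_b / df_b, ss_w / df_w
    f = ms_b / ms_w
    # Assumption: upper-tail probability of F with 2 numerator df,
    # P(F > f) = (df_w / (df_w + 2 f)) ** (df_w / 2). Not valid for other df_b.
    p = (df_w / (df_w + 2 * f)) ** (df_w / 2) if df_b == 2 else None
    return {"SS_b": ss_b, "SS_w": ss_w, "SS_tot": ss_b + ss_w,
            "df_b": df_b, "df_w": df_w, "MS_b": ms_b, "MS_w": ms_w, "F": f, "p": p}


res = one_way_anova(groups)
p_text = f"{res['p']:.2f}" if res["p"] is not None else "n/a"
print("Source of Variance\tSS\tdf\tMS\tF\tp")
print(f"Between Group\t{res['SS_b']:.2f}\t{res['df_b']}\t{res['MS_b']:.2f}\t{res['F']:.2f}\t{p_text}")
print(f"Within Group\t{res['SS_w']:.2f}\t{res['df_w']}\t{res['MS_w']:.2f}")
print(f"Total\t{res['SS_tot']:.2f}\t{res['df_b'] + res['df_w']}")
```

At full precision F is about 1.48 with a p of roughly .26, well above .05, which agrees with the decision not to reject the null hypothesis.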
