CHI-SQUARE

I. Chi-Square (χ²) Analysis

a. The research question answered by chi-square analysis concerns the relationship among selected categories. More precisely, it asks whether membership in one category is associated with membership in another. If there is no relationship, the categories are considered independent of one another; if there is a relationship, they are considered contingent upon one another.

b. A relationship does not imply causation. Because the null hypothesis of the chi-square test is that the categories are unrelated, it is sometimes called the test of independence.

c. Types of Data Required

1) Relationships between groups are examined using two (or more) nominal-level variables. Chi-square is symbolized as χ². The χ² statistic is based on the difference between the expected number of subjects that fall into each category and the actual, or observed, number of subjects.

d. Assumptions

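1) Data must be frequency data (counts of subjects), not percentages or scores.

2) There must be an adequate sample size. A common rule of thumb is that the expected frequency in each cell should be at least 5.

3) Measures must be independent of one another: each subject is counted in only one cell.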

4) There must be some theoretical basis for the categorization of the variables.

e. Calculation of Chi-Square

1) Expected frequencies: assuming the null hypothesis is true, we can calculate the expected frequencies (fe) and compare them with what actually occurred in the sample, the observed frequencies (fo).

$$ f_e = \frac{T(\text{row}) \times T(\text{column})}{\text{total no. of subjects}} $$

For a 2 x 2 table with cells A through D:

             Column 1    Column 2    Total
   Row 1     A           B           T(r1)
   Row 2     C           D           T(r2)
   Total     T(c1)       T(c2)       T(N)

where: T(r1) = total of row 1
T(r2) = total of row 2
T(c1) = total of column 1
T(c2) = total of column 2
T(N) = total sample

According to the formula:

$$ f_e(A) = \frac{T(r1) \times T(c1)}{T(N)} $$

$$ f_e(B) = \frac{T(r1) \times T(c2)}{T(N)} $$

and so on for each of the cells.

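2) For a table larger than 2 x 2, use this general formula:

$$ f_e(k) = \frac{T(rk) \times T(ck)}{T(N)} $$

where k is any one of the cells, and T(rk) and T(ck) are the totals of the row and the column that contain cell k. The formula is repeated for each of the cells in the table.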
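The general formula can be sketched in a few lines of Python. This is only an illustration: the function name expected_frequencies and the 2 x 3 table of counts are hypothetical and do not come from the notes. The table is stored as a list of rows.

```python
def expected_frequencies(observed):
    """Return expected counts: (row total x column total) / grand total per cell."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hypothetical 2 x 3 table of observed counts (illustration only).
observed = [[10, 20, 30],
            [20, 20, 20]]
expected = expected_frequencies(observed)
print(expected)
```

Because every expected count is built from the marginal totals, each row and column of the expected table sums to the same total as the observed table.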

3) Chi-Square Formula: once expected frequencies for all the cells have been calculated, they are compared with the observed frequencies using the χ² analysis.

$$ \chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} $$

fo = observed frequency
fe = expected frequency
∑ = overall sum, for all cells
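As a rough sketch, the summation can be written as a loop over matching cells of the observed and expected tables. The function name chi_square and the counts below are hypothetical, chosen so the arithmetic is easy to check by hand.

```python
def chi_square(observed, expected):
    """Sum (fo - fe)^2 / fe over every cell of the table."""
    total = 0.0
    for obs_row, exp_row in zip(observed, expected):
        for fo, fe in zip(obs_row, exp_row):
            total += (fo - fe) ** 2 / fe
    return total


# Hypothetical counts: every cell differs from its expected value by 10.
observed = [[30, 10],
            [10, 30]]
expected = [[20.0, 20.0],
            [20.0, 20.0]]
statistic = chi_square(observed, expected)
print(statistic)
```

Each of the four cells contributes 100 / 20 = 5, so the statistic for this made-up table is 20.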

4) Reminder: degrees of freedom (df) are the extent to which values are free to vary given a specific number of subjects and a total score. In χ² analysis, frequencies rather than scores are used, and no means are calculated. The number of values that are free to vary depends on the number of cells in the table, that is, on the number of rows (r) and columns (c). The df for a 2 x 2 chi-square analysis is always 1, regardless of the sample size.

df = (r-1)(c-1)
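The df formula depends only on the shape of the table, which a short sketch makes explicit. The function name degrees_of_freedom is an illustrative choice, and the table is assumed to be a list of equal-length rows.

```python
def degrees_of_freedom(table):
    """Return (r - 1)(c - 1) for a table stored as a list of rows."""
    n_rows = len(table)
    n_cols = len(table[0])
    return (n_rows - 1) * (n_cols - 1)


# The cell counts are irrelevant here; only the table's shape matters.
print(degrees_of_freedom([[1, 2], [3, 4]]))              # 2 x 2 table
print(degrees_of_freedom([[1, 2, 3, 4]] * 3))            # 3 x 4 table
```

Changing the counts inside a 2 x 2 table never changes the result, which is why its df is always 1.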

f. Interpreting the Chi-Square

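1) Use a table of the χ² probability distribution to find the critical value for the chosen significance level and the df. Reject the null hypothesis (Ho) when the calculated χ² value is greater than the critical χ² value; otherwise retain Ho.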
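The decision rule can be sketched as a table lookup followed by a comparison. The sketch assumes a .05 significance level and copies the first five critical values from a standard χ² table; the names CRITICAL_05 and reject_null are hypothetical.

```python
# Critical values of chi-square at the .05 level for df = 1 to 5,
# copied from a standard chi-square table (assumed significance level).
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def reject_null(statistic, df, critical_values=CRITICAL_05):
    """Return True when the calculated chi-square exceeds the critical value."""
    return statistic > critical_values[df]


# The same calculated value can lead to different decisions at different df.
print(reject_null(4.2, 1))
print(reject_null(4.2, 2))
```

A calculated value of 4.2 exceeds the critical value for df = 1 but not for df = 2, so the df must be worked out before the table is consulted.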

Chi-Square Example

Research Question: Is there a relationship between gender and seat belt use?

                Male          Female        Total
   Users        86 (A)        104 (B)       190 (R1)
   Non-Users    161 (C)       36 (D)        197 (R2)
   Total        247 (C1)      140 (C2)      387

A-D = individual cells
R = row total
C = column total

$$ \chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} $$

fo = observed frequency
fe = expected frequency
∑ = overall sum, for all cells


Expected frequencies:

$$ f_e(\text{male users}) = \frac{190 \times 247}{387} = 121.27 $$

$$ f_e(\text{female users}) = \frac{190 \times 140}{387} = 68.73 $$

$$ f_e(\text{male non-users}) = \frac{197 \times 247}{387} = 125.73 $$

$$ f_e(\text{female non-users}) = \frac{197 \times 140}{387} = 71.27 $$

$$
\begin{aligned}
\chi^2 &= \frac{(86 - 121.27)^2}{121.27} + \frac{(104 - 68.73)^2}{68.73} + \frac{(161 - 125.73)^2}{125.73} + \frac{(36 - 71.27)^2}{71.27} \\
&= \frac{(-35.27)^2}{121.27} + \frac{(35.27)^2}{68.73} + \frac{(35.27)^2}{125.73} + \frac{(-35.27)^2}{71.27} \\
&= \frac{1244}{121.27} + \frac{1244}{68.73} + \frac{1244}{125.73} + \frac{1244}{71.27} \\
&= 10.26 + 18.10 + 9.89 + 17.45 \\
&= 55.7
\end{aligned}
$$
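The hand calculation rounds at each step, so it can be reassuring to repeat it without rounding. The sketch below recomputes the statistic from the seat belt counts using only the two formulas already given; the variable names are illustrative.

```python
# Observed counts from the seat belt example.
observed = [[86, 104],    # users: male, female
            [161, 36]]    # non-users: male, female

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi_sq = 0.0
for i, row in enumerate(observed):
    for j, fo in enumerate(row):
        fe = row_totals[i] * col_totals[j] / n   # expected frequency for this cell
        chi_sq += (fo - fe) ** 2 / fe

print(round(chi_sq, 2))
```

Carried at full precision the statistic is about 55.69, which agrees with the hand-calculated 55.7 to one decimal place.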

Next, calculate the degrees of freedom from the number of rows (r) and columns (c).

$$ df = (r - 1)(c - 1) = (2 - 1)(2 - 1) = 1 $$

Compare the result with the table of critical values. At the .05 significance level with df = 1, χ² must be greater than 3.841 to be significant. The calculated value of 55.7 exceeds 3.841, so we reject the null hypothesis: there is a significant relationship between gender and seat belt use.
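For df = 1 only, the probability behind the table lookup can be checked directly, because a χ² variable with one degree of freedom is the square of a standard normal variable. The sketch below uses that fact with the standard library's complementary error function; the function name p_value_df1 is hypothetical, and the approach does not extend to larger tables.

```python
import math


def p_value_df1(statistic):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    # P(Z^2 > x) = 2 * (1 - Phi(sqrt(x))) = erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(statistic / 2.0))


print(p_value_df1(3.841))   # the critical value itself, close to .05
print(p_value_df1(55.7))    # the seat belt example, far below .05
```

The critical value 3.841 returns a probability of about .05, which is where the table entry comes from, while 55.7 returns a probability many orders of magnitude smaller.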