CORRELATION
I. Correlation
a. Introduction
1) Correlation techniques are used to study relationships between variables.
1. Exploratory studies: determine whether or not relationships exist.
2. Association-testing studies: test a hypothesis about a particular relationship.
2) A correlation shows that a relationship exists; it does not imply that one variable caused the other.
b. Type of Data Required
1) The Pearson Product Moment Correlation Coefficient (r) is the most common method of quantifying the relation between two variables.
2) To calculate r there must be at least two measures on each subject at the ordinal/interval level. Categorical variables can also be coded for use with r and with regression equations.
c. Assumptions
1) Certain assumptions must be met if we are to generalize beyond the sample statistic, that is, to make inferences about the population itself.
1. Sample must be representative of the population to which the inference will be made.
2. Variables to be correlated (X and Y) must each have a normal distribution (scores approximate normal curve).
3. For every value of X, the distribution of Y scores must have approximately equal variability; this is called the assumption of homoscedasticity (the same as homogeneity of variance).
4. The relationship between X and Y must be linear; that is, when the two scores for each individual are graphed, the points should tend to form a straight line.
d. Correlation Coefficient
1) r allows us to:
1. state mathematically what relationship exists between two variables
2. tell the type of relationship that exists, whether the relationship is positive or negative.
2) The correlation coefficient may range from +1.00 through 0.00 to −1.00. A +1.00 indicates a perfect positive relationship, 0.00 indicates no relationship, and −1.00 indicates a perfect negative relationship.
3) Strength of the Correlation Coefficient
1. "It depends" on what is being correlated:
a. correlations between alternate forms of a test should be high
b. when studying relationships among various aspects of human behavior, a correlation of .50 may be good
c. direction does not affect the strength of the relationship. A correlation of −.90 is just as high, or just as "strong," as an r of +.90. The following categories apply to both + and − rs:
0.00–0.25 little, if any
0.26–0.49 low
0.50–0.69 moderate
0.70–0.89 high
0.90–1.00 very high
4) Significance of the Correlation
1. To generalize the r calculated from the sample to the correlation of the two variables in the population, determine the level of probability of r, that is, the probability that an r this large occurred by chance alone.
2. Either a one-tailed or a two-tailed test of significance can be used, depending on whether or not you hypothesized the direction of the relationship.
3. When r is calculated by hand, consult a table of critical values to determine the level of statistical significance.
4. The level of statistical significance is greatly affected by sample size (n).
5) Meaningfulness of the Correlation Coefficient
1. The coefficient of determination, r², is often used as a measure of the meaningfulness of r. It is a measure of the amount of variance that the two variables share.
2. To determine the meaningfulness of r, square the correlation coefficient; the result is the proportion of variance shared by the two variables. For example, r = .50 gives r² = .25, or 25% shared variance.
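The strength categories and the coefficient of determination can be sketched as two small Python helpers. This is only an illustration: the function names are hypothetical, and rounding r to two decimals before applying the category boundaries is an assumption made here so that values such as .255 fall into a category.

```python
def strength_label(r):
    """Return the outline's descriptive category for a correlation coefficient."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1.00 and +1.00")
    # Direction does not affect strength, so work with the absolute value.
    # Assumption: round to two decimals, matching the two-decimal category limits.
    size = round(abs(r), 2)
    if size <= 0.25:
        return "little, if any"
    if size <= 0.49:
        return "low"
    if size <= 0.69:
        return "moderate"
    if size <= 0.89:
        return "high"
    return "very high"


def coefficient_of_determination(r):
    """Square r to get the proportion of variance the two variables share."""
    return r * r


print(strength_label(-0.90))               # same category as +0.90
print(coefficient_of_determination(0.50))  # 0.25, i.e. 25% shared variance
```

Because the label depends only on the size of r, a negative and a positive coefficient of the same magnitude receive the same category.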
3. Calculations
\[
r = \frac{\sum XY - \dfrac{(\sum X)(\sum Y)}{n}}{\sqrt{\left(\sum X^2 - \dfrac{(\sum X)^2}{n}\right)\left(\sum Y^2 - \dfrac{(\sum Y)^2}{n}\right)}}
\]
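The computational (raw-score) formula for r can be sketched in Python as follows. The function name `pearson_r` and the sample scores are hypothetical and chosen only for illustration; the code simply accumulates the five sums the formula needs (∑X, ∑Y, ∑X², ∑Y², ∑XY).

```python
import math


def pearson_r(xs, ys):
    """Pearson r from raw scores using the computational formula."""
    if len(xs) != len(ys):
        raise ValueError("each subject needs one X score and one Y score")
    n = len(xs)
    if n < 2:
        raise ValueError("at least two subjects are required")

    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    numerator = sum_xy - (sum_x * sum_y) / n
    ss_x = sum_x2 - sum_x ** 2 / n  # sum of squares for X
    ss_y = sum_y2 - sum_y ** 2 / n  # sum of squares for Y
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable has no variability")
    return numerator / math.sqrt(ss_x * ss_y)


# Hypothetical scores: Y is exactly twice X, so the relationship is perfectly linear.
print(pearson_r([1, 2, 3], [2, 4, 6]))
```

With perfectly linear hypothetical data such as these, the function should return a value at (or within floating-point error of) +1.00.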
6) To determine the statistical significance of the correlation use the critical values table. The degrees of freedom for r are n – 2.
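The table lookup can be sketched in Python as below. The critical values are two-tailed values at the .05 level as they commonly appear in published tables for df = 1 through 10; they are not taken from this outline, so check them against the table you are using. The names `CRITICAL_R_05_TWO_TAILED` and `is_significant` are hypothetical.

```python
# Two-tailed critical values of r at alpha = .05, keyed by degrees of freedom.
# Assumption: values as commonly tabled; verify against your own table.
CRITICAL_R_05_TWO_TAILED = {
    1: 0.997, 2: 0.950, 3: 0.878, 4: 0.811, 5: 0.754,
    6: 0.707, 7: 0.666, 8: 0.632, 9: 0.602, 10: 0.576,
}


def is_significant(r, n, table=CRITICAL_R_05_TWO_TAILED):
    """Compare |r| with the critical value for df = n - 2."""
    df = n - 2
    if df not in table:
        raise KeyError(f"no critical value stored for df = {df}")
    return abs(r) >= table[df]


# The same r can be non-significant with a small n and significant with a larger n.
print(is_significant(0.72, 5))   # df = 3
print(is_significant(0.72, 10))  # df = 8
```

The two calls illustrate how strongly the result depends on n: the critical value shrinks as the degrees of freedom increase.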
Correlation Example (Pearson r)
Begin by developing the computation table.
Subject     X     Y    X²    Y²    XY
1           8     6    64    36    48
2           4     3    16     9    12
3           7     2    49     4    14
4           6     5    36    25    30
5           9    10    81   100    90
Sum (∑)    34    26   246   174   194

n = 5, ∑X = 34, ∑Y = 26, ∑X² = 246, ∑Y² = 174, ∑XY = 194
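\[
\begin{aligned}
r &= \frac{\sum XY - \dfrac{(\sum X)(\sum Y)}{n}}{\sqrt{\left(\sum X^2 - \dfrac{(\sum X)^2}{n}\right)\left(\sum Y^2 - \dfrac{(\sum Y)^2}{n}\right)}} \\
  &= \frac{194 - \dfrac{(34)(26)}{5}}{\sqrt{\left(246 - \dfrac{(34)^2}{5}\right)\left(174 - \dfrac{(26)^2}{5}\right)}} \\
  &= \frac{194 - 176.8}{\sqrt{(246 - 231.2)(174 - 135.2)}} \\
  &= \frac{17.2}{\sqrt{(14.8)(38.8)}} \\
  &= \frac{17.2}{\sqrt{574.24}} \\
  &= \frac{17.2}{23.96} \\
  &= .72
\end{aligned}
\]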
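As a cross-check on the hand calculation, the same five pairs of scores can be run through the deviation-score form of the Pearson formula, which is algebraically equivalent to the computational formula but is not the form used in this example. The function name `pearson_r_deviation` is hypothetical.

```python
import math

# Scores for the five subjects in the example.
x = [8, 4, 7, 6, 9]
y = [6, 3, 2, 5, 10]


def pearson_r_deviation(xs, ys):
    """Pearson r from deviations about the means (equivalent to the raw-score formula)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of the deviations, and the two sums of squares.
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    ss_x = sum((a - mean_x) ** 2 for a in xs)
    ss_y = sum((b - mean_y) ** 2 for b in ys)
    return sp / math.sqrt(ss_x * ss_y)


r = pearson_r_deviation(x, y)
print(round(r, 2))  # 0.72
```

The sum of cross-products here corresponds to the 17.2 in the numerator of the hand calculation, and the two sums of squares correspond to the 14.8 and 38.8 under the square root.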