**MULTIPLE CORRELATION & REGRESSION**

I. Multiple Correlations (Overview)

a. The relationship is measured between one variable and a combination of other variables. In simple correlation (r), we are talking about one independent variable (X) and one dependent variable (Y). In multiple correlation (R), we are talking about more than one independent variable (X1, X2, X3, and so on) and one dependent variable (Y).
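To make the distinction concrete, here is a minimal sketch with made-up data (numpy assumed): the multiple correlation R can be obtained as the simple correlation between Y and the scores predicted from X1 and X2 together.

```python
import numpy as np

# Hypothetical scores for two independent variables and one dependent variable.
X1 = np.array([2.0, 4.0, 5.0, 7.0, 8.0, 9.0])
X2 = np.array([1.0, 3.0, 2.0, 6.0, 5.0, 7.0])
Y = np.array([3.0, 6.0, 7.0, 11.0, 12.0, 15.0])

# Least-squares fit of Y on X1 and X2 (the column of ones carries the intercept).
design = np.column_stack([np.ones_like(X1), X1, X2])
coeffs, *_ = np.linalg.lstsq(design, Y, rcond=None)
Y_pred = design @ coeffs

# R is the simple (Pearson) correlation between actual and predicted Y.
R = np.corrcoef(Y, Y_pred)[0, 1]
```

With only one predictor, the same calculation reduces to the ordinary r between X and Y.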

II. Regression

a. Introduction

1) Regression is a technique that makes use of the correlation between variables and the notion of a straight line to develop a prediction equation. Once a relationship has been established between two variables, it is possible to develop an equation that allows us to predict the score of one of the variables, given the score of the other.

2) In multiple correlation, regression is used to establish a prediction equation (independent variables are each assigned a weight based on their relationship to the dependent variable).

3) Regression may be used in relation-searching and association-testing.

III. Simple Linear Regression

a. Simple Regression: A correlation between two variables is used to develop a prediction equation, based on a linear relationship.

1) The higher the correlation, the more accurate the prediction.

2) To be able to make predictions, the relationship between two variables, the independent (X) and the dependent (Y), must be measured. If there is a correlation, a regression equation can be developed that will allow prediction of Y, given X.

3) “Regression” means literally a falling back toward the mean. Each prediction "regresses" back toward the mean, depending on the strength of the correlation.

4) Prediction Equation

1. Y' = a + bX

Y' is the predicted score. Given data on X and Y from a sample of subjects, called the regression sample, a and b can be calculated. With those two measures, Y can be predicted, given X.

2. The letter a is called the intercept constant and is the value of Y when X = 0. It is the point at which the regression line intercepts the Y axis.

3. The letter b is called the regression coefficient and is the rate of change in Y with a unit change in X. It is a measure of the slope of the regression line.

4. The regression line is the "line of best fit" and is formed by a technique called the method of least squares. Because the mean is the center of the data, the sum of the deviations of the scores around the mean, ∑(X - M), adds up to 0. Also, if you square those deviations and add them, that number will be smaller than the sum of the squared deviations around any other measure of central tendency. In the same way, the regression line passes through the exact center of the scatter diagram; thus, it is the "line of best fit." The regression line represents the predicted scores (Y'), but since prediction is not perfect, actual scores (Y) will deviate somewhat from the predicted scores. Because the regression line passes through the center of the pairs of scores, if you add up the deviations from the regression line (Y - Y'), they will equal 0.
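The pieces above (the slope b, the intercept a, and deviations from the line summing to 0) can be sketched in a few lines of Python with made-up data:

```python
# Minimal least-squares sketch with made-up data: fit Y' = a + bX and
# check that the deviations from the regression line (Y - Y') sum to 0.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.0, 4.0, 5.0, 4.0, 6.0]
n = len(X)

mean_x = sum(X) / n
mean_y = sum(Y) / n

# Regression coefficient (slope): b = sum((X - Mx)(Y - My)) / sum((X - Mx)^2)
b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y))
     / sum((x - mean_x) ** 2 for x in X))
# Intercept constant: the least-squares line passes through (Mx, My).
a = mean_y - b * mean_x

predicted = [a + b * x for x in X]
residual_sum = sum(y - yp for y, yp in zip(Y, predicted))
# residual_sum is 0, up to floating-point rounding.
```

For this sample, b = 0.8 and a = 1.8, so a subject with X = 4 would be predicted to score Y' = 1.8 + 0.8(4) = 5.0.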

IV. Multiple Regression: This is possible when there is a measurable multiple correlation between a group of predictor variables and one dependent variable. The prediction equation is:

Y' = a + b1X1 + b2X2 + b3X3 + ... + bkXk

a. Significance Testing

1) When doing a simple linear regression, the correlation between the two variables is tested for significance, and r2 represents meaningfulness.

2) With multiple correlation, we are interested not only in the significance of the overall R, and thus the amount of variance accounted for (R2), but also in the significance of each of the independent variables.

3) In multiple regression, the multiple correlation is tested for significance, and each of the b-weights is also tested for significance. Testing the b-weight tells whether or not the independent variable associated with it is contributing significantly to the variance accounted for in the dependent variable.

4) The F distribution is used for testing the significance of the R2s, and either the F- or t-distribution is used to test the significance of the bs.

5) When testing the significance of the R2s, the degrees of freedom (df) are calculated as k/(n - k - 1). The k stands for the number of independent variables, and n stands for the number of subjects. When testing the significance of a b-weight, the df are 1/(n - k - 1).
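As a rough sketch of points 4 and 5 (hypothetical data; numpy assumed; the resulting F would still be compared against a critical value from an F table), R2 and its F statistic with df = k and n - k - 1 can be computed as:

```python
import numpy as np

# Hypothetical data: n = 8 subjects, k = 2 independent variables.
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0],
              [5.0, 6.0], [6.0, 5.0], [7.0, 8.0], [8.0, 7.0]])
Y = np.array([2.0, 3.0, 5.0, 6.0, 7.0, 9.0, 10.0, 12.0])
n, k = X.shape

# Least-squares fit of Y' = a + b1*X1 + b2*X2.
design = np.column_stack([np.ones(n), X])
coeffs, *_ = np.linalg.lstsq(design, Y, rcond=None)
Y_pred = design @ coeffs

# R2: proportion of variance in Y accounted for by the predictors.
ss_total = float(np.sum((Y - Y.mean()) ** 2))
ss_resid = float(np.sum((Y - Y_pred) ** 2))
R2 = 1 - ss_resid / ss_total

# F statistic for R2, with df = k and n - k - 1.
df1, df2 = k, n - k - 1
F = (R2 / df1) / ((1 - R2) / df2)
```

Here df1 = 2 and df2 = 8 - 2 - 1 = 5, so the computed F would be checked against the critical F for 2 and 5 degrees of freedom.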