
The PowerPoint presentation contains important information for this unit on correlations. Contact the instructor (del.siegle@uconn.edu) if you have trouble viewing it.

When are correlation methods used?

How is correlational research different from experimental research?
In correlational research we do not (or at least try not to) influence any variables but only measure them and look for relations (correlations) between some set of variables, such as blood pressure and cholesterol level. In experimental research, we manipulate some variables and then measure the effects of this manipulation on other variables; for example, a researcher might artificially increase blood pressure and then record cholesterol level. Data analysis in experimental research also comes down to calculating "correlations" between variables, specifically, those manipulated and those affected by the manipulation. However, experimental data may potentially provide qualitatively better information: Only experimental data can conclusively demonstrate causal relations between variables. For example, if we found that whenever we change variable A then variable B changes, then we can conclude that "A influences B." Data from correlational research can only be "interpreted" in causal terms based on some theories that we have, but correlational data cannot conclusively prove causality. Source: http://www.statsoft.com/textbook/stathome.html

Correlation research asks the question: What relationship exists?

Please try the scatterplot demo showing the directions and strength of various correlation coefficients (the script for this was created by Juha Puranen from Finland).

When there is no relationship between the measures (variables), they are called unrelated, uncorrelated, orthogonal, or independent.

Some Math for Bivariate Product Moment Correlation (not required for EPSY 341):
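Multiply the z scores of each pair and add all of those products. Divide that by the number of pairs of scores. (Pretty easy. Note that this assumes the z scores were computed with the population standard deviation; if they were computed with the sample standard deviation, divide by the number of pairs minus 1 instead.)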
---or---
Each pair has two scores, one from each of two variables.

  1. For each pair, subtract the mean for each variable from the pair's score on that variable and multiply the two results together --> (Score1 - Mean1) * (Score2 - Mean2).
  2. Total those results for all of the pairs --> SUM((Score1 - Mean1) * (Score2 - Mean2)).
  3. Divide that by the number of pairs minus 1 --> SUM((Score1 - Mean1) * (Score2 - Mean2)) / (n - 1).
  4. Multiply the standard deviations of the two variables together and divide your previous answer by that --> (SUM((Score1 - Mean1) * (Score2 - Mean2)) / (n - 1)) / (SD1 * SD2).
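The two calculations above can be sketched in Python to show that they give the same answer. This is only an illustration: the function names and the five pairs of scores are made up for this example and do not come from the course materials. The first function uses population standard deviations (dividing by n) and the second uses sample standard deviations (dividing by n - 1).

```python
from math import sqrt

def pearson_from_z(xs, ys):
    """Method 1: average the products of the paired z scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Population standard deviations (divide by n), to match dividing by n below.
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    z_products = [((x - mean_x) / sd_x) * ((y - mean_y) / sd_y)
                  for x, y in zip(xs, ys)]
    return sum(z_products) / n

def pearson_from_deviations(xs, ys):
    """Method 2: sum of deviation products / (n - 1), divided by SD1 * SD2."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y)
                     for x, y in zip(xs, ys)) / (n - 1)
    # Sample standard deviations (divide by n - 1).
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return covariance / (sd_x * sd_y)

# Hypothetical scores for five pairs (invented for illustration).
scores_1 = [1, 2, 3, 4, 5]
scores_2 = [2, 4, 5, 4, 5]

r_z = pearson_from_z(scores_1, scores_2)
r_dev = pearson_from_deviations(scores_1, scores_2)
print(round(r_z, 4), round(r_dev, 4))  # 0.7746 0.7746
```

Both methods agree as long as the same divisor (n or n - 1) is used consistently within each method.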

Some correlation questions elementary students can investigate are
What is the relationship between...


Correlations only describe the relationship; they do not prove cause and effect. Correlation is a necessary, but not a sufficient, condition for determining causality.

There are Three Requirements to Infer a Causal Relationship

  1. A statistically significant relationship between the variables
  2. The causal variable occurred prior to the other variable
  3. There are no other factors that could account for the relationship

(Correlation studies do not meet the last requirement and may not meet the second requirement.)

There is a strong relationship between the number of ice cream cones sold and the number of people who drown each month.  Just because there is a relationship (strong correlation) does not mean that one caused the other.

If there is a relationship between A (ice cream cone sales) and B (drowning) it could be because

The point is...just because there is a correlation, you CANNOT say that one variable causes the other. On the other hand, if there is NO correlation, you can say that one DID NOT cause the other (assuming the measures are valid and reliable).


Format for correlations research questions and hypotheses:

Question: Is there a (statistically significant) relationship between height and arm span?
H0: There is no (statistically significant) relationship between height and arm span (H0: r = 0).
HA: There is a (statistically significant) relationship between height and arm span (HA: r ≠ 0).

Coefficient of Determination (Shared Variation)

One way researchers often express the strength of the relationship between two variables is by squaring their correlation coefficient. This squared correlation coefficient is called a COEFFICIENT OF DETERMINATION. The coefficient of determination is useful because it gives the proportion of the variance of one variable that is predictable from the other variable.
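As a quick worked illustration of squaring a correlation coefficient, the sketch below converts a few example r values into coefficients of determination. The function name and the r values are chosen only for this example.

```python
def coefficient_of_determination(r):
    """Square a correlation coefficient to get the shared variation."""
    return r ** 2

# Example correlation coefficients (illustrative values, not real data).
for r in (0.30, 0.50, 0.90):
    shared = coefficient_of_determination(r)
    print(f"r = {r:.2f} -> r squared = {shared:.2f} ({shared:.0%} shared variance)")
```

Note that a correlation of .50 corresponds to only 25% shared variance, so the strength of a relationship drops off quickly as r moves away from 1 or -1.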

Factors which could limit a product-moment correlation coefficient (link to a PowerPoint illustrating these factors)

  1. Homogeneous group (the subjects are very similar on the variables)
  2. Unreliable measurement instrument (your measurements can't be trusted and bounce all over the place)
  3. Nonlinear relationship (Pearson's r is based on linear relationships...other formulas can be used in this case)
  4. Ceiling or Floor with measurement (lots of scores clumped at the top or bottom...therefore no spread which creates a problem similar to the homogeneous group)
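The third factor (a nonlinear relationship) can be illustrated with a small sketch. The data below are invented for this example: one set of scores follows a perfect curve and the other follows a perfect straight line. The helper function pearson_r is written here only for the illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation from deviation scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sum_products = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sum_products / sqrt(ss_x * ss_y)

# Invented scores: x is symmetric around zero.
x = [-3, -2, -1, 0, 1, 2, 3]
y_curved = [value ** 2 for value in x]      # perfect U-shaped relationship
y_linear = [2 * value + 1 for value in x]   # perfect straight-line relationship

r_curved = pearson_r(x, y_curved)
r_linear = pearson_r(x, y_linear)
print(r_curved, r_linear)
```

In this example the curved scores are completely predictable from x, yet Pearson's r for them is zero, while the straight-line scores give an r of 1. This is why a scatterplot should be checked before trusting a low r.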

Assumptions one must meet in order to use the Pearson product-moment correlation

  1. The measures are approximately normally distributed
  2. The variance of the two measures is similar (homoscedasticity) -- check with scatterplot
  3. The relationship is linear -- check with scatterplot
  4. The sample represents the population
  5. The variables are measured on an interval or ratio scale

There are different types of relationships: Linear - Nonlinear or Curvilinear - Non-monotonic (concave or cyclical). Different procedures are used to measure different types of relationships using different types of scales. The issue of measurement scales is very important for this class. Be sure that you understand them.

Predictor and Criterion Variables (NOT NEEDED FOR EPSY 341)

When using a critical value table for Pearson's product-moment correlation, the value found at the intersection of the degrees of freedom (n - 2) and the alpha level you are testing (for example, .05) is the minimum r value (ignoring its sign) needed for the relationship to be statistically significant, that is, unlikely to be due to chance alone.
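The values in such a table can be related to the t distribution. The sketch below assumes the standard conversion r = t / sqrt(t squared + df) and uses t = 2.306, the two-tailed .05 critical value for 8 degrees of freedom found in standard t tables. The function names and the sample size of 10 pairs are hypothetical, chosen only for this example.

```python
from math import sqrt

def critical_r(t_critical, df):
    """Convert a critical t value into the matching critical r value."""
    return t_critical / sqrt(t_critical ** 2 + df)

def is_significant(r, n, t_critical):
    """Compare the size of r (ignoring sign) with the critical r for df = n - 2."""
    return abs(r) >= critical_r(t_critical, n - 2)

# Assumption: two-tailed alpha = .05 with df = 8, taken from a standard t table.
T_CRITICAL_DF8 = 2.306
n_pairs = 10  # hypothetical sample of 10 pairs, so df = 10 - 2 = 8

print(round(critical_r(T_CRITICAL_DF8, n_pairs - 2), 3))  # 0.632
print(is_significant(0.70, n_pairs, T_CRITICAL_DF8))      # True
print(is_significant(-0.50, n_pairs, T_CRITICAL_DF8))     # False
```

With 10 pairs of scores, an r of .70 exceeds the critical value of .632 and would be called statistically significant at the .05 level, while an r of -.50 would not.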

The statistics package SPSS as well as Microsoft's Excel can be used to calculate the correlation. We will use Microsoft's Excel.

Del Siegle, Ph.D.
Neag School of Education - University of Connecticut
del.siegle@uconn.edu

www.delsiegle.com
