The major feature that distinguishes experimental
research from other types of research is that
the researcher manipulates the independent variable. There are a number
of group designs in research. Some of these
qualify as experimental research; others do not.
- In true experimental research, the researcher not
only manipulates the independent variable, but also randomly assigns individuals to
the various treatment categories (i.e., control and treatment).
- In quasi-experimental research, the
researcher does not randomly assign individual subjects to treatment and control groups. In other words, the treatment is not distributed among participants randomly. In some cases, a researcher may randomly
assign one whole group to treatment and one whole group to control. In this case, quasi-experimental research involves using intact groups in an experiment, rather than assigning individuals at random to research conditions. (Some researchers define
this latter situation differently. For our course, we will use this definition.)
- In causal-comparative (ex post facto)
research, the groups are already formed. It does not meet the standards of an experiment
because the independent variable is not manipulated.
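The difference between assigning individuals at random and assigning intact groups at random can be sketched in a few lines of Python. This is only an illustration under assumptions of my own: the function names, the two-condition split, and the sample class rosters are hypothetical and do not come from our text.

```python
import random


def assign_individuals(participants, seed=0):
    """True experimental design: each individual is randomly assigned."""
    rng = random.Random(seed)          # seeded so the sketch is repeatable
    shuffled = list(participants)      # copy; leave the caller's list alone
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}


def assign_intact_groups(classes, seed=0):
    """Quasi-experimental design (as defined in this course): whole,
    already-formed groups are randomly assigned to a condition."""
    rng = random.Random(seed)
    names = sorted(classes)            # fixed starting order before shuffling
    rng.shuffle(names)
    half = len(names) // 2
    return {
        "treatment": {n: classes[n] for n in names[:half]},
        "control": {n: classes[n] for n in names[half:]},
    }


if __name__ == "__main__":
    # Hypothetical rosters, used only to show the two kinds of assignment.
    rosters = {
        "Room 101": ["Ana", "Ben", "Cal"],
        "Room 102": ["Dee", "Eli", "Fay"],
    }
    everyone = [name for roster in rosters.values() for name in roster]
    print(assign_individuals(everyone))   # classmates can land in different conditions
    print(assign_intact_groups(rosters))  # each room stays together
```

In the first function, chance decides each person's condition, so pre-existing differences tend to balance out across groups. In the second, chance only decides which room gets the treatment, so any differences between the rooms remain in the comparison.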
Statistics by themselves have no meaning.
They only take on meaning within the design of your study. If we examine the statistics alone, without considering how the study was designed, even bread can look deadly.
The term validity is used three ways in research...
- In the
sampling unit, we learn about external validity (generalizability).
- In the
survey unit, we learn about instrument validity.
- In this unit, we learn
about internal validity and external validity. Internal validity means that
the differences that were found between groups on the dependent variable
in an experiment were directly related to what the researcher did to the independent
variable, and not due to some other unintended variable (confounding variable).
Simply stated, the question addressed by internal validity is "Was the
study done well?" Once the researcher is satisfied that the study was
done well and the independent variable caused the dependent variable (internal
validity), then the researcher examines external validity (under what conditions
[ecological] and with whom [population] can these results be replicated [Will
I get the same results with a different group of people or under different
circumstances?]). If a study is not internally valid, then considering external
validity is a moot point (if the independent variable did not cause the dependent variable,
then there is no point in applying the results [generalizing the results]
to other situations). Interestingly, as one tightens a study to control for
threats to internal validity, one decreases the generalizability of the study
(to whom and under what conditions one can generalize the results).
There are several common threats to internal validity
in experimental research. They are described in our text. I have reviewed
each below (this material is also included in the PowerPoint
presentation for this unit):
- Subject Characteristics (Selection Bias/Differential
Selection) -- The groups may have been different from the start. If you were testing
instructional strategies to improve reading and one group enjoyed reading more than the
other group, they may improve more in their reading because they enjoy it, rather than because of the
instructional strategy you used.
- Loss of Subjects (Mortality) -- All of the
high- or low-scoring subjects may have dropped out or been missing from one of the groups.
If we collected posttest data on a day when the honor society was on a field trip at the
treatment school, the mean for the treatment group would probably be much lower than it
really should have been.
- Location -- Perhaps one group was at a disadvantage
because of its location. The city may have been demolishing a building next to one
of the schools in our study, and the constant distractions interfere with our treatment.
- Instrumentation (Instrument Decay) -- The testing
instruments may not be scored similarly. Perhaps the person grading the posttest is
fatigued and pays less attention to the last set of papers reviewed. It may be that those
papers are from one of our groups and will receive different scores than the earlier
papers.
- Data Collector Characteristics -- The subjects of
one group may react differently to the data collector than the other group. A male
interviewing males and females about their attitudes toward a type of math instruction may
not receive the same responses from females as a female interviewing females would.
- Data Collector Bias -- The person collecting data may
favor one group, or some characteristic that some subjects possess, over another. A principal
who favors strict classroom management may rate students' attention under different
teaching conditions with a bias toward one of the teaching conditions.
- Testing -- The act of taking a pretest or posttest
may influence the results of the experiment. Suppose we were conducting a unit to increase
student sensitivity to prejudice. As a pretest we have the control and treatment groups
watch Schindler's List and write a reaction essay. The pretest may have actually
increased both groups' sensitivity and we find that our treatment group didn't score any
higher on a posttest given later than the control group did. If we hadn't given the
pretest, we might have seen differences in the groups at the end of the study.
- History -- Something may happen at one site during
our study that influences the results. Perhaps a classmate dies in a car accident at the
control site for a study teaching children bike safety. The control group may actually
demonstrate more concern about bike safety than the treatment group.
- Maturation -- There may be natural changes in the
subjects that can account for the changes found in a study. A critical thinking unit may
appear more effective if it is taught during a time when children are developing abstract
reasoning.
- Hawthorne Effect -- The subjects may respond
differently just because they are being studied. The name comes from a classic study in
which researchers were studying the effect of lighting on worker productivity. As the
intensity of the factory lights increased, so did the worker productivity. One researcher
suggested that they reverse the treatment and lower the lights. The productivity of the
workers continued to increase. It appears that being observed by the researchers was
increasing productivity, not the intensity of the lights.
- John Henry Effect -- One group may view itself as being in
competition with the other group and may work harder than it would under normal
circumstances. This generally is applied to the control group "taking on" the
treatment group. The term refers to the classic story of John Henry laying railroad
track.
- Resentful Demoralization of the Control Group -- The
control group may become discouraged because it is not receiving the special attention
that is given to the treatment group. They may perform lower than usual because of this.
- Regression (Statistical Regression) -- A
class that scores particularly low can be expected to score slightly higher on a later test just by
chance. Likewise, a class that scores particularly high will have a tendency to score
slightly lower by chance. The change in these scores may have nothing to do with the
treatment.
- Implementation -- The treatment may not be
implemented as intended. A study where teachers are asked to use student modeling
techniques may not show positive results, not because modeling techniques don't work, but
because the teachers didn't implement them or didn't implement them as they were designed.
- Compensatory Equalization of Treatment -- Someone
may feel sorry for the control group because they are not receiving much attention and
give them special treatment. For example, a researcher could be studying the effect of
laptop computers on students' attitudes toward math. The teacher feels sorry for the class
that doesn't have computers and sponsors a popcorn party during math class. The control
group begins to develop a more positive attitude about mathematics.
- Experimental Treatment Diffusion -- Sometimes the
control group actually implements the treatment. If two different techniques are being
tested in two different third grades in the same building, the teachers may share what
they are doing. Unconsciously, the control teacher may use some of the techniques she or he learned
from the treatment teacher.
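Of the threats listed above, statistical regression is the easiest to demonstrate with a small simulation. The sketch below is not from our text, and every number in it is an assumption chosen for illustration: each class has a stable "true" average, each test adds independent random error, and no treatment of any kind is given between the two tests.

```python
import random
import statistics


def simulate_retest(n_classes=1000, seed=42):
    """Simulate two tests with NO treatment in between and compare the
    extreme-scoring classes on the first test with their second test."""
    rng = random.Random(seed)
    # Assumed model: true class average ~ N(70, 5); each test adds N(0, 5) error.
    true_scores = [rng.gauss(70, 5) for _ in range(n_classes)]
    pretest = [t + rng.gauss(0, 5) for t in true_scores]
    posttest = [t + rng.gauss(0, 5) for t in true_scores]

    # Rank classes by pretest score and take the bottom and top 10 percent.
    order = sorted(range(n_classes), key=lambda i: pretest[i])
    k = n_classes // 10
    low, high = order[:k], order[-k:]

    def mean_of(scores, indices):
        return statistics.mean([scores[i] for i in indices])

    return {
        "low_pre": mean_of(pretest, low),
        "low_post": mean_of(posttest, low),
        "high_pre": mean_of(pretest, high),
        "high_post": mean_of(posttest, high),
    }


if __name__ == "__main__":
    for label, value in simulate_retest().items():
        print(f"{label}: {value:.1f}")
```

Under these assumptions, the lowest-scoring classes average higher on the second test and the highest-scoring classes average lower, even though nothing was done to either group. A study that selects only extreme scorers and then credits the treatment with the change is open to this threat.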
When planning a study, it is important to consider the threats to internal validity as
we finalize the study design. After we complete our study, we should reconsider each of
the threats to internal validity as we review our data and draw conclusions.
Del Siegle, Ph.D.
Neag School of Education - University of Connecticut