Inferential Statistics

Statistical procedures fall into one of two categories. In the last module we discussed descriptive statistics. The second category, inferential statistics, allows us to make more general statements about our results.

Inferential statistics permit confident generalizations from observed facts

Having evaluated experimental results to determine what happened, a researcher may wish to make inferences to a larger population. For example, no one is interested in the opinions of a mere 2,000 individuals as to whether they thought the President was involved in unethical election practices. However, if those 2,000 responses can be generalized to the total population of U.S. voters, the poll becomes very interesting indeed. As you read this module, keep in mind the following questions.



Techniques of random sampling give each member of a population an equal chance of being included in the sample

The word "population" refers to all members of a class of events, objects, or persons. The population of the United States is approximately 200 million persons, whereas the population of students enrolled in a typical section of Introductory Psychology is probably somewhere between 20 and 200 students. Research is seldom carried out on total populations, because it is not economical or feasible, but a sample of a population may behave in a way that typifies the total population. A sample is any subset of a population smaller than the population itself. In order to generalize results from a sample to a population, the sample must be selected at random. Statistical inferences require random samples, although many practical problems make this a difficult requirement to meet. A random sample demands that subjects be selected on a chance basis; in other words, the only biases operating are chance selection factors.

One way to take a random sample of 200 students at the University of New Mexico would be to write each student's name on a piece of paper, shake the pieces thoroughly, then draw out 200 pieces of paper. It would still be possible to draw only psychology majors, but the most likely distribution would be representative of the population as a whole.
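The slips-of-paper procedure above can be sketched in a few lines of Python. The roster here is a made-up stand-in for the student body; `random.sample` draws names without replacement, giving every name the same chance of inclusion.

```python
import random

# Hypothetical roster standing in for the student body described above.
roster = [f"student_{i:05d}" for i in range(20000)]

# random.sample draws without replacement -- the shaken-hat procedure, automated.
sample = random.sample(roster, 200)

print(len(sample))       # 200 names drawn
print(len(set(sample)))  # 200 -- no student appears twice
```

As in the paper-slip version, nothing prevents an all-psychology-majors draw; it is merely very unlikely.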

For a sample to be unbiased (or representative of the total population), each member of the population must have an equal chance of being included in the sample. A sampling error is anything that contributes to making a sample nonrepresentative, by giving one member of the population a better chance of being selected than another member. These errors are always possible, but taking several unbiased samples can minimize the probability of sampling errors. Although probability and magnitude of sampling errors can be computed, the techniques are beyond the scope of this course.



Inferential statistics provide a measure of the probability of an event occurring by chance

Suppose you select a random sample of 100 students from the 900 enrolled in Introductory Psychology at Jones College. You give all 100 students in the sample a form of the final examination before they begin the course. After grading the 100 tests, you find a mean score of 38. Suppose you then permit another student to spend two hours examining the text -- spot- reading the topic headings, first sentences of paragraphs, tables, and charts. You then give him the same examination, on which he gets a score of 47, or 9 points higher than the mean score for the random sample of 100 students. What would the higher score say about the effect of the two hours of review? Just by chance, you might have picked a student who would have done well without a review. What is the probability of a student scoring 9 points above the mean?

A "significant" result would be unlikely to occur through chance alone

If it were found that one student out of every two could be expected to achieve a mean score of 47 without any time spent in review, the study would provide little evidence of the effect of the review. If the probability were one in five that an individual would score that high, the evidence is better, but not impressive. Generally, if the probability is less than 1 in 20 of the observed results occurring by chance alone, the evidence in favor of the hypothesis is considered significant. (Note that "significant" does not mean "notable," only that chance is unlikely to be the sole determining factor.)

Figure 13. Galton's demonstration of normal distribution (three contraptions showing how balls dispensed randomly form a normal curve)

The shot are released at the top and travel among the pins and into the grooves. Note that the general shape remains the same as more shot are added. This bell-shaped distribution is referred to as the normal distribution. The larger the sample of shot, the more closely the figure resembles the shape of the normal curve. (Courtesy of News, 36, No. 2, Feb. 1958.)

In this case, how do you decide that you can, or cannot, legitimately infer that the review influenced the student's performance on the examination? The probability of a chance deviation from the mean is determined by comparison of the observed results with a normal distribution.


In 1885, Sir Francis Galton demonstrated that chance alone produces a distribution of events that follows a particular pattern. Figure 13 shows Galton's device and explains his experiment.

The accumulation of buckshot is greatest in the center slots and progressively smaller in the other slots as they are further away from the center. The general shape of the pattern is the normal curve (or bell-shaped curve) that we encountered earlier.

The important thing to remember is that the normal curve reflects the way things happen when they occur entirely by chance, and the shape of this normal curve can be expressed in a mathematical formula. Having identified empirically how chance events occur, we can compare a particular event with the normal (or chance) distribution of events, and calculate the probability that the observed event would have occurred by chance alone. Figure 14 shows the principal characteristics of the normal curve derived from the mathematical formula.

A normal distribution reflects the way things actually happen when chance is the only causative factor


Figure 14. The curve of a normal distribution

As you can see, the mean is defined as zero, and variations from the mean are expressed as numbers of standard deviations above (+) or below (-) the mean of zero. We know a great deal about this normal curve. For example:

  • The curve is perfectly symmetrical, which means that 50% of the area under the curve is above the mean and 50% is below.
  • 68.3% of the area under the curve is between -1.0 and 1.0 standard deviations, which means that 68.3% of the chance events fall between these values. 95.4% of the area is within 2 standard deviations of the mean.
  • 99.7% of the area is within 3 standard deviations. Thus there is a very small probability of any event having a value higher than 3 standard deviations from the mean in either direction.

    We can calculate the area under the curve for any number of standard deviations on one or both sides of the mean. For example, 1.5 standard deviations would encompass 86.6% of the area.
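The areas listed above can be checked with Python's standard-library normal distribution; this short sketch computes the fraction of the area within a given number of standard deviations of the mean.

```python
from statistics import NormalDist

std = NormalDist()  # standard normal curve: mean 0, SD 1

def area_within(k):
    """Fraction of the area under the curve within +/- k standard deviations."""
    return std.cdf(k) - std.cdf(-k)

print(round(area_within(1.0) * 100, 1))  # 68.3
print(round(area_within(2.0) * 100, 1))  # 95.4
print(round(area_within(3.0) * 100, 1))  # 99.7
print(round(area_within(1.5) * 100, 1))  # 86.6
```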

    Returning to the example of a test score of 47 compared with a mean score of 38, it should be clear that we can find the probability that this event would occur by chance if we can express the deviation as a number of standard deviations from the mean score for the sample.

    The Z-score (or standard score)

    We can transform experimental data to a standard distribution by using the formula

    Z = (X - M)/SD

    where X is the score to be tested; M is the mean score of the sample; SD is the standard deviation of the sample of scores; and Z (the standard score) is the deviation of a score above or below the sample mean, expressed as a number of standard deviations.

    Suppose the standard deviation (SD) for the sample was 4.5 (you already know the mean was 38). You can now calculate Z for the score of 47: Z = (X - M)/SD = (47 - 38)/4.5 = 9/4.5 = 2 SD.

    Thus the score of 47 was 2 SD above the mean. Referring to Figure 14, we calculate that 97.7% of the scores fall at or below the score of 47: 50% of the total area lies below the mean, and the region from the mean to 2 SD above it covers another 47.7% (95.4/2). Thus, based on chance alone, we would expect only 2.3% of the students to achieve a score above 47 on the examination.
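The Z-score arithmetic can be verified in a few lines; the figures (mean 38, SD 4.5, observed score 47) come from the example above.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Standard score: how many SDs the raw score x lies from the mean."""
    return (x - mean) / sd

z = z_score(47, 38, 4.5)
print(z)  # 2.0

# Fraction of chance scores at or below 2 SD above the mean:
pct_below = NormalDist().cdf(z) * 100
print(round(pct_below, 1))  # 97.7
```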

    Using the same procedure, we can make inferences about the likelihood of someone in the population of 900 making a passing grade of 60 on the pretest. Once again, we convert the score to a Z-score: Z = (60 - 38)/4.5 ≈ 4.89.

    Although a Z-score of 4.89 is not shown in Figure 14, we can calculate that less than 0.15% of our 900 students could be expected to do this well on the pretest, before taking the course. In other words, we have a probability of less than .0015 that any student will make a grade of at least 60 on the pretest. Here we are making statistical inferences about the population based on the data generated by a sample of 100, using the properties of the normal curve. These are the basic techniques, used in hypothesis testing, by which we determine the probability that results are chance occurrences.
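The pretest inference above can be sketched the same way, modeling pretest scores as a normal distribution with the sample's mean and SD.

```python
from statistics import NormalDist

# Model pretest scores with the sample's mean (38) and SD (4.5).
pretest = NormalDist(mu=38, sigma=4.5)

z = (60 - 38) / 4.5
print(round(z, 2))  # 4.89

# Probability of a chance score of at least 60:
p = 1 - pretest.cdf(60)
print(p < 0.0015)   # True, consistent with the text's bound
print(900 * p < 1)  # True -- we expect no student to reach 60 by chance
```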

    t-Test and Null Hypothesis

    Consider an experiment to test the effectiveness of two study methods. In one group (A), the subjects study for 1 hour every night for a week before the exam; in a second group (B), the students cram for 7 hours the night before the exam. Assume group A's average on the exam is 76, while group B's is 72. Would you state emphatically that the procedure used by group A was superior to the procedure used by group B? Remember, we want to generalize to all people under these conditions. Since we could not test everyone, but only a sample from the population, our results are subject to sampling error. In other words, the true mean for B could be higher or lower than 72, and the true mean for A could be some value other than 76. Sample A may have included unusually good students. Since they were chosen by chance, we could not know this beforehand. However, we can determine the probability of a chance occurrence of these results in the population. A variation of the Z-score permits such inferences.

    The t-test uses the means and standard deviations from each group to calculate the likelihood that the difference between the mean scores for A and B was due to chance alone. If this difference has a low probability (for instance, 1 in 20 or 1 in 100) of occurring by chance, then we reject the assumption that the difference between A and B is due to chance alone.
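A minimal sketch of the comparison the t-test performs, using a pooled-variance two-sample t statistic. The exam scores here are hypothetical (invented so each group averages the 76 and 72 of the example); a real study would also look the statistic up in a t table to obtain its probability.

```python
from statistics import mean, stdev
from math import sqrt

def pooled_t(a, b):
    """Two-sample t statistic with a pooled variance estimate."""
    na, nb = len(a), len(b)
    va, vb = stdev(a) ** 2, stdev(b) ** 2
    # Pool the two sample variances, weighting by degrees of freedom.
    pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    # Difference of means, scaled by its estimated standard error.
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))

# Hypothetical exam scores for the two study groups described above.
group_a = [76, 80, 74, 79, 71, 76]   # distributed study (mean 76)
group_b = [72, 69, 75, 70, 74, 72]   # cramming (mean 72)
print(round(pooled_t(group_a, group_b), 2))  # 2.45
```

The larger the t value relative to the spread within the groups, the less plausible it is that chance alone produced the 4-point difference in means.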

    A null hypothesis assumes that there is no difference between an experimental and a control group

    The assumption that our differences are due to chance is called the null hypothesis. The experimenter assumes that the results of an experiment are due to chance alone and will not adopt another hypothesis unless the t-test discredits the possibility that the observed difference is due to chance alone.


    The null hypothesis is rejected when it is shown statistically to be highly improbable

    Psychologists often make use of a null hypothesis, which asserts that there is no difference between two groups. For example, the experimenter might assume that there is no relationship between success and scores on a particular aptitude test. He would reject the null hypothesis only if testing shows a relationship that is not likely to occur by chance alone. A null hypothesis could also assume no significant difference between control and experimental groups. An experimenter might assume that the administration of a tranquilizer before a test will have no effect on the students' performance. He divides a volunteer class, has tranquilizers administered to half of them, and has sugar pills given to the other half. If a significant difference between groups is shown, the null hypothesis is rejected. In this case (as in most experiments), care must be taken to assure that the two groups would ordinarily score the same. The experimenter might reject a true null hypothesis, committing a Type I error.

    A Type II error is accepting a false null hypothesis. That is, the differences were, in fact, due to the manipulation of the independent variable and not due to chance factors alone. This, too, can occur when two groups are not evenly matched or when adequate controls are not maintained.

    Now take Progress Check 1.

    MODULE 6

    1. We generalize our results to populations on the basis of:

    a. descriptive statistics.
    b. inferential statistics.
    c. (both of the above)
    d. (neither of the above)

    2. The null hypothesis states:

    a. that there is a difference between the groups
    b. explicitly what results are expected.
    c. that there are only chance differences between the groups.
    d. (none of the above)

    3. Type I errors occur:

    a. when we accept a true null hypothesis.
    b. when we reject a true null hypothesis.
    c. when we accept a false null hypothesis.
    d. when we reject a false null hypothesis.

    4. If the null hypothesis has a low probability (1/20) of occurrence, we usually:

    a. reject it.
    b. accept it.
    c. decide it is in error.
    d. (none of the above)

    5. Here are some values for a particular distribution. (Note: You may refer to the text for help.)

    X̄ (mean) = 3

    Median = 3

    Mode = 5

    Range = 5

    SD = 1.67

    For a score of 4, find Z = ______

    6. Match the following for a population of "all psychology undergraduates in Tennessee."



    _____ Biased sample

    _____ Unbiased sample

    a) Every psychology student whose name begins
    with "Z" in Tennessee junior colleges

    b) Every tenth undergraduate psychology student, in alphabetical order, at the University of Tennessee
    c) Three percent of the psychology undergraduates, drawn by lot, at every institution in Tennessee



    The formula for a Z-score is: the raw score minus the mean, divided by the standard deviation: Z = (X - M)/SD.

    If the mean is 5 and the standard deviation is 3, convert a score of 10 to a Z-score.


    A biased sample is one in which each member of the population did not have an equal chance of being included in the sample. Write whether each of the following samples is biased or unbiased. The population is all full-time students over 21 who attend San Francisco State University.
    a. Every student who enters a particular bar__________________

    b. Every student who is registered to vote________________

    c. A drawing by lot from every full-time student in the college over 21 _______________
    ________________________________________ 4
    The result of an experiment is statistically significant when the probability of its occurring by chance is low. Which of the following might be a statistically significant finding in the population of a large city?
    a. An average IQ of 70
    b. An average male height of 5'9"
    c. Approximately the same number of males as females
    d. An average weight of 180 pounds

    A graduate assistant assumes that sixth graders in his city are not more intelligent than those across the country. He is preparing to conduct an experiment so that he can legitimately make inferences about all sixth-grade students in his city. He selects three schools in the city by writing the names of all schools on separate slips, shaking the slips in a hat, and drawing out three. He then asks each sixth grade teacher at these schools to send him eight students. The teachers, naturally, send their brightest students. The graduate assistant then tests the 24 students with a standard IQ test. When he finds that their average IQ is significantly higher than the mean for sixth graders across the country, he decides his original assumption was wrong.

    A representative or unbiased sample is one in which every member of the population has an equal chance to be included. A sample is biased when this is not true. Indicate whether each of the following samples for the example above was biased or unbiased.

    a. The sample of schools___________________

    b. The sample of students from each school______________

    c. The total sample______________________

    _____________________________________ 3

    A sampling error is anything that contributes to making a sample nonrepresentative. Write the sampling error for the example above.

    1 a

    2 Z = (X - M)/SD = (10 - 5)/3 = 1.67

    3 a. unbiased
    b. biased
    c. biased

    4 a. biased
    b. biased
    c. unbiased

    5. not picking the students at random.

    The graduate assistant assumed the null hypothesis in his experiment. What was his hypothesis?
    ____________________________________ 1

    In general, the null hypothesis is the assumption that

    _____________________________________ 3
    The graduate assistant committed a Type I error in hypothesis testing. He:
    a. accepted a false hypothesis.
    b. rejected a true hypothesis.
    c. confirmed his null hypothesis.

    1) Sampling error_____
    2) Type I error______
    3) Type II error _______

    a. Using a null hypothesis
    b. Accepting a false hypothesis
    c. Rejecting a true hypothesis
    d. Using a random sample
    e. Anything that contributes to a sample's not being representative



    1 that sixth graders in his city are no more intelligent than others across the country
    2 b
    3 there is no difference between groups.

    4 1) e
      2) c
      3) b



        1. Z-scores may tell us:
        a. how far a score is from the mean in standard deviation units.
        b. what percent of the scores fall at or below a particular score.
        c. the likelihood of occurrence of scores greater than a given Z.
        d. (all of the above)
        e. (none of the above)
        2. Random samples:
        a. should have only chance factors operating. 
        b. are inappropriate when we want to generalize to a population. 
        c. give everybody an equal chance of being chosen. 
        d. have known biases.
        3. Find Z for a raw score of 20.
        X̄ (mean) = 60
        Median = 40
        Mode = 50
        SD = 10
       4. Accepting a false hypothesis is a:
        a. Type I error.
        b. Type II error.
        c. null error.
     5. A psychologist is studying the interests of junior college students in California as compared to
        freshmen and sophomores at four-year colleges. He cannot reasonably test all of the students who fit
        into these groups. The psychologist will probably:
        a. test only the junior college population. 
        b. test  an unbiased sample from each group. 
        c. test a biased sample from each group.
        d. test a bimodal sample from each group.
     6. The psychologist assigns a number to each student in each group. He selects from the numbers
        without knowing who the numbers represent. He is:
        a. introducing sampling errors. 
        b. obtaining a random sample.
        c. biasing the sample.
        d. allowing each student an equal chance to be included in the sample.
     7. Before starting his testing, the psychologist assumes that both groups will have the same interests. He will:
        a. test a null hypothesis.
        b. reject this hypothesis only if a difference is statistically significant.
        c. test his hypothesis in this experiment.
        d. (none of these)




    Take the last UNIT TEST


