MODULE 5

Descriptive Statistics

The first problem in analysis is to interpret what actually happened

Although graphs provide some information about trends, it is still useful to describe data formally so that other researchers will know precisely what results were obtained. Your objective in this module is to learn how to analyze data statistically in order to understand what actually happened during an experiment. How to draw valid inferences from the results is the subject of the next module. Try to answer the following questions as you read this module.


CENTRAL TENDENCY

Descriptive statistics may involve the calculation of one or more of the measures of central tendency: mean, median, and mode. These are methods for determining an "average" score for a set of data.


The mode is defined as the most frequently occurring response, score, or event, and might be referred to as "the typical case."

The median divides the distribution into two equal parts, as one-half of the scores fall below it and one-half above it. In other words, half of the scores have a greater value than the median and half have a lesser value.

The mean, symbolized X̄ or M, is the arithmetic average, calculated by summing all the scores in a distribution and dividing by the total number of scores.

A skewed curve gives different values for different measures of central tendency

In symmetrical distributions (typified by a bell-shaped curve) the mode, mean, and median have nearly the same value. When this is true, the mean is most often used because it represents the typical case, or performance, and because it can be handled most easily in other statistical operations that are usually necessary in the evaluation of psychological data. However, when a distribution is skewed (not symmetrical) the mean gives a markedly different view of the distribution than the median. This can be seen in Table 2, which records the scores on an examination in Introductory Psychology.


                               Table 2
    
    Scores on an Introductory Psychology examination:
    
    98, 97, 80, 73, 72, 65, 65, 65, 64, 61, 59

    Mode = 65
    
    Median = 65

    Mean = (Sum of all scores) / (Number of scores) = 799 / 11 = 72.6, or approx. 73



    
    The formula for finding the mean is usually written as:  M = ΣX / N

    Where Σ means to sum what follows. ΣX, in this case, equals the sum of
    all the scores recorded.

    X is a symbol for the distribution, i.e., Introductory
    Psychology scores on an examination.

    N is equal to the number of scores (or persons), i.e., 11. M is the
    usual symbol for the mean of distribution X.

    The median is the middlemost score, or the sixth case in this example.

    The mode is the most frequently occurring score, which is 65 in this
    example.
    
    

Although the mean for this data is 72.6, the median is just 65. Obviously, the instructor's view of what the "average" student did on this test would depend on what measure of "average" he uses. In this case, the instructor would be more likely to curve the examination results on the median rather than on the mean.
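As a quick check on Table 2, the three measures can be computed with a short Python sketch. It relies on Python's standard statistics module; the variable names are illustrative and are not part of the original text.

```python
import statistics

# Examination scores from Table 2
scores = [98, 97, 80, 73, 72, 65, 65, 65, 64, 61, 59]

mode = statistics.mode(scores)      # most frequently occurring score
median = statistics.median(scores)  # middle score once the scores are ordered
mean = sum(scores) / len(scores)    # sum of all scores divided by N

print(f"Mode = {mode}, Median = {median}, Mean = {mean:.1f}")
# prints: Mode = 65, Median = 65, Mean = 72.6
```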

The reason there is such a great discrepancy between median and mean is that the former identifies only a single score from the distribution while the latter depends upon all the scores. It may be that no one in the class actually received the mean score. Statistics do not lie, of course, but they can give false impressions to the unwary.
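To see how much the two highest scores pull the mean, the sketch below recomputes both measures with those scores left out. Dropping scores is an assumption made purely for illustration; it is not a procedure recommended by the text.

```python
import statistics

scores = [98, 97, 80, 73, 72, 65, 65, 65, 64, 61, 59]  # Table 2
trimmed = sorted(scores)[:-2]  # the same scores without 97 and 98

mean_all = statistics.mean(scores)
mean_trimmed = statistics.mean(trimmed)
median_all = statistics.median(scores)
median_trimmed = statistics.median(trimmed)

print(f"Mean:   {mean_all:.1f} -> {mean_trimmed:.1f}")  # 72.6 -> 67.1
print(f"Median: {median_all} -> {median_trimmed}")      # 65 -> 65
```

The mean shifts by more than five points while the median does not move, which is why the median is the steadier description of a skewed distribution.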

MEASURES OF DISPERSION

Experimental results are not described completely without a computation of the measure of dispersion. First look at the distributions in Table 3.


Table 3

Distribution of X Scores.

90, 80, 80, 70, 70, 70, 70, 60, 60, 50

Distribution of Y Scores.

80, 75, 75, 70, 70, 70, 70, 65, 65, 60

Note: Since these distributions are symmetrical, the mean, median, and mode are equal. These measures of central tendency are not equal in skewed distributions.

The range for distribution X equals 40 (90 - 50), while the range for Y equals 20.

Dispersion can be expressed as a range or in terms of standard deviations

Both distributions have the same mean, but they are not equal; one is fat, while the other is slender. These distributions vary in the degree of dispersion; in other words, the scores are spread over a wider range in one distribution than in the other. The statistics that provide information about dispersion are called the range and the standard deviation.

The range is the simplest, but not necessarily the best, measure of dispersion. To obtain the range, one simply subtracts the smallest score from the largest score in the distribution. Since it relies on only two scores, the two extreme ones, the range is a very crude and unstable measure. It would be very unlikely that the range calculated from the results of one experiment would ever be duplicated in another, or that it would be characteristic of the range encountered in a total population.
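A minimal Python sketch of the range, applied to the two distributions in Table 3. The helper name score_range and the extra score of 20 are hypothetical, introduced only to show how a single extreme score changes the result.

```python
# Distributions from Table 3
x_scores = [90, 80, 80, 70, 70, 70, 70, 60, 60, 50]
y_scores = [80, 75, 75, 70, 70, 70, 70, 65, 65, 60]


def score_range(scores):
    """Return the largest score minus the smallest score."""
    return max(scores) - min(scores)


print(score_range(x_scores))  # 40
print(score_range(y_scores))  # 20

# One hypothetical extreme score (20) is enough to change the range of X.
print(score_range(x_scores + [20]))  # 70
```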


Psychologist Clifford T. Morgan (1961) described the standard deviation as follows:

The standard deviation is the measure par excellence of the variability of measurements in a distribution. This is such a good measure that, if the frequency distribution is reasonably normal, the distribution can be reconstructed by knowing only two numbers, the mean and the standard deviation. This is true because mathematicians have a precise formula for the normal-probability curve, and the only two unknowns in it are the mean and the standard deviation. Given these, one can draw the normal curve that best fits the particular frequency distribution. Thus, in so far as a distribution is normal, the mean and the standard deviation completely describe and specify it.
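Morgan's point can be illustrated with Python's standard statistics.NormalDist class, which builds a normal curve from nothing more than a mean and a standard deviation. The mean of 70 and the standard deviation of 10.95 are the values this module works out for distribution X in Table 3; treating that small set of scores as normal is an assumption made only for this sketch.

```python
from statistics import NormalDist

# A normal curve is fully specified by its mean and standard deviation.
curve = NormalDist(mu=70, sigma=10.95)

# Height of the curve at a few scores; it is symmetrical about the mean.
for score in (50, 60, 70, 80, 90):
    print(score, round(curve.pdf(score), 4))

# Proportion of the curve lying within one standard deviation of the mean.
share = curve.cdf(70 + 10.95) - curve.cdf(70 - 10.95)
print(round(share, 2))  # 0.68
```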

Computing the standard deviation of a distribution takes more work, but it is a far more useful measure of dispersion.

The standard deviation is calculated as the square root of the average of the squared deviations from the mean. Calculating the standard deviation for a particular distribution requires the following steps.

1. Subtract the mean value (M) from each score (X - M) to obtain deviation scores, symbolized x (lower case). For example, the deviation score for 90 in Table 3 is x = (X - M), or 90 - 70 = 20. For the other scores, the deviations are: 10, 10, 0, 0, 0, 0, -10, -10, -20.

2. Square the deviation scores, x²: 400, 100, 100, 0, 0, 0, 0, 100, 100, 400.

3. Add all these squared deviation scores and divide by N, the number of scores:

Σx²/N = 1200/10 = 120

4. Find the square root of this value: √120 = 10.95. Thus, the standard deviation = 10.95.

The standard deviation for distribution X is 10.95. From the preceding steps, you can see that the formula for obtaining the standard deviation is √(Σx²/N).
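The four steps can be sketched in Python for distribution X. The variable names are illustrative; the arithmetic follows the steps as the text gives them, dividing by N.

```python
import math

x_scores = [90, 80, 80, 70, 70, 70, 70, 60, 60, 50]  # distribution X, Table 3

n = len(x_scores)
mean = sum(x_scores) / n  # M = 70.0

deviations = [score - mean for score in x_scores]  # Step 1: x = X - M
squared = [d ** 2 for d in deviations]             # Step 2: square each x
variance = sum(squared) / n                        # Step 3: 1200 / 10 = 120
sd = math.sqrt(variance)                           # Step 4: square root

print(f"{sd:.2f}")  # 10.95
```

Python's statistics.pstdev also divides by N and gives the same value, whereas statistics.stdev divides by N - 1 and returns a slightly larger figure.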

Now suppose we wish to calculate the standard deviation for the Y distribution. We would follow the same steps. The table below illustrates the procedure. For practice, fill in all the blanks and compute the standard deviation for the Y distribution.


      

                (Step 1)               (Step 2)
                (Subtract the mean)    (Square the deviations)
    Y-scores    Y - M = y              y²
       80           10                 100
       75            5                 ___
       75            5                 ___
       70            0                 ___
       70            0                   0
       70            0                 ___
       70            0                 ___
       65           -5                 ___
       65           -5                 ___
       60          -10                 100

    ΣY = 700    Σy = 0

    (Step 3) (Add the squared deviations)  Σy² = ___

    (Step 4) Standard deviation = square root of Σy²/N = ___

(Remember, N is the number of scores in the sample.) Compare your work with the following solution: Σy² = 300, so Σy²/N = 300/10 = 30, and the standard deviation is √30 = 5.48.
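One way to check each row of the practice table is to print the deviations and squared deviations for distribution Y. This is a minimal sketch; the column layout and variable names are assumptions of the example.

```python
import math

y_scores = [80, 75, 75, 70, 70, 70, 70, 65, 65, 60]  # distribution Y, Table 3

n = len(y_scores)
mean = sum(y_scores) / n  # M = 70.0

print(f"{'Y':>4} {'y = Y - M':>10} {'y^2':>6}")
sum_sq = 0.0
for score in y_scores:
    y = score - mean   # Step 1: deviation score
    sum_sq += y ** 2   # Steps 2 and 3: square and accumulate
    print(f"{score:>4} {y:>10.0f} {y ** 2:>6.0f}")

sd_y = math.sqrt(sum_sq / n)  # Step 4
print(f"Sum of y^2 = {sum_sq:.0f}, SD = {sd_y:.2f}")
```

The result, about half the standard deviation of distribution X, matches the impression that Y is the more slender of the two distributions.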



MODULE 5
PROGRESS CHECK 1

    
    1. Which of the following are measures of central tendency?
    
    a. Range
    b. Mode
    c. Median
    d. Mean
    
    2. If a distribution is markedly skewed, which measure of central tendency is 
      most appropriate?
    
    a. Mode
    b. Median
    c. Mean
    d. (none of the above)
    
    For the following distribution, calculate the measures below:
    
    5, 4, 1, 5, 0, 5, 3, 3, 2, 2. (Note: You may refer to the text for help.)
    
    3. Mean=
    
    4. Median =
    
    5. Mode =
    
    6. Range =
    
    7. Standard deviation =
    

ANSWER KEY PAGE 71


5 OR MORE CORRECT PAGE 61
FEWER THAN 5 CORRECT PAGE 57


MODULE 5
EXERCISES

    Each of the three measures of central tendency, commonly referred to as the
    average, provides a single point along the score scale that represents the
    trend of the entire distribution. Remember that the mode is the most
    frequently occurring value in the distribution. The median divides the list of
    scores such that half of the scores have a greater value than the median and
    half are less than the median. The median is easier to find if the scores are
    ordered first. The mean is simply the sum of the scores divided by the
    number of scores. Use the following distribution for the next series of
    exercises.
    
    10, 9, 2, 4, 7, 1, 4, 4, 5, 4
    
Find the mode.
    
______________________________________4

    Now find the median for the above distribution.
    
______________________________________2

 Now find the mean.                             

__________________________________________1

The first step in obtaining the standard deviation is to subtract the
    mean from each of the scores in the distribution. For example, if the mean is
    70, then from the scores 80 and 40 we may obtain an x of 10 and -30,
    that is 80 - 70 and 40 - 70.
    
    The x for 90 would be:
    
    a. 20
    b. - 20
    c. 70
    d. 90

_________________________________________6
    
    After you subtract the mean, you square each value of x.
    
    X       x = (X - M)     x²
    80          10          100
    40         -30          900
    60         ____         ___
 
_________________________________  5
    
    Here is a table for the distribution. We have calculated some of the
          values. Complete the table and determine the standard deviation.
    
    X       (X - M) = x     x²

    10          5           25
     9
     7
     5          0            0
    
    Range and standard deviation are measures of variability. They give
    information about how dispersed the scores are. We find the range by
    subtracting the smallest score from the largest score in the distribution.
    
    The range for our distribution is____________________________________
    
_________________________________________________3

ANSWERS

2. The median is 4 since this value is halfway between the two middlemost scores when the scores are ordered as below:
            
     1, 2, 4, 4, 4, 4, 5, 7, 9, 10
    
    
        


          X    (X - M) = x    x²
         10          5        25
          9          4        16
          7          2         4
          5          0         0
          4         -1         1
          4         -1         1
          4         -1         1
          4         -1         1
          2         -3         9
          1         -4        16

    ΣX = 50   Σx = 0   Σx² = ___
    
The distribution in the previous example is:

    a. positively skewed.
    b. negatively skewed.
    c. symmetrical.

Graph the scores of the above distribution, plotting frequencies against scores.

[Blank graph: frequency (0 to 4) on the vertical axis, SCORES on the horizontal axis]

NOW TAKE PROGRESS CHECK 2

1 a   2 c   3 b

MODULE 5
PROGRESS CHECK 2

   
    1.  A distribution with a small standard deviation has _________________
         variability than a distribution with a large standard deviation.
    
    a. more 
    b. less
    
    2.    The median is found:
    
    a. by ordering the scores and identifying the middlemost value. 
    b. by adding up the scores and dividing by N. 
    c. by finding the most frequently occurring score.
    d. by subtracting the smallest score from the largest score.
    
    3. The standard deviation measures:
    
    a. dispersion. 
    b. central tendency.
    c. variability.
    d. degree of skewness.
    
    For the following distribution, calculate the measures below:
    
     8, 2, 7, 7, 5, 3.
    
    4. M
    
    5. Median
    
    6. Mode
    
    7. Range
    
    8. Standard deviation
    


6 OR MORE CORRECT PAGE 61
FEWER THAN 6 CORRECT INSTRUCTOR CONFERENCE

ANSWER KEY
