Intelligence Tests


Intelligence is frequently defined as "what intelligence tests measure." For this reason, we shall begin our study of intelligence with an overview of intelligence testing. In developing the first intelligence tests, psychologists made a large contribution to our current understanding of intelligence.

As you read the text, try to answer the following questions.

Binet's Test of Critical Judgment

IQ reflects the relationship between mental age and chronological age in children

In the nineteenth century a French psychologist, Alfred Binet, began studying psychological processes in children. He suggested the creation of special classes for slow learners who were not progressing normally in school. About 1904, the French government commissioned Binet and a colleague, Dr. Simon, to find a way to identify these slow learners. This could not be done on the basis of teacher judgments without being unfair to some children. Binet was asked to develop an objective test that could be used to measure the relative intelligence of all children.

Binet devised a graded series of tests, arranged in order of increasing difficulty, to determine the level of a child's intelligence. The test items covered such factors as memory, attention, discrimination, and ability to follow orders. Different factors, however, were covered in each test, according to age level. By working with children who were making normal progress in school, he determined the levels at which average children of various ages performed on different tests. This enabled him to classify one child's performance as typical of a particular age and assign the child to a special class appropriate to that age. For example, a five-year-old child might pass only the number of items passed by the average three-year-old. The child was then judged to be two years retarded.

How Did Binet Recognize a Good Test Item?

Just looking at an item is not enough. A question on a subject that is encountered only by a particular social or ethnic group is not a good item. For example, a 10-year-old Jewish child may not be able to define "first communion," while a Catholic child of the same age could do it with ease. In general, an item was considered an adequate measure of, say, five-year-old intelligence if it satisfied three criteria. First, approximately 70 to 75 percent of the five-year-olds must be able to perform it. Second, less than 60 percent of the four-year-olds must be able to perform it. And third, more than 75 percent of the six-year-olds must be able to perform it. Thus, the item had to be too difficult for the age level below and too easy for the age level above to be considered a good measure of a particular age level of intelligence.
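
The three criteria can be written as a simple check. The sketch below is only an illustration: the function name and the example pass rates are hypothetical, and "approximately 70 to 75 percent" is treated as an inclusive range, which is an assumption rather than part of Binet's procedure.

```python
def is_good_binet_item(pct_younger, pct_target, pct_older,
                       target_range=(70, 75), younger_max=60, older_min=75):
    """Return True if pass rates (in percent) meet the three criteria in the text.

    The default thresholds are the ones quoted above; treating the target
    range as inclusive is an assumption made for this illustration.
    """
    low, high = target_range
    target_ok = low <= pct_target <= high      # about 70-75% at the target age
    younger_ok = pct_younger < younger_max     # fewer than 60% one year younger
    older_ok = pct_older > older_min           # more than 75% one year older
    return target_ok and younger_ok and older_ok


# Hypothetical pass rates for a candidate five-year-old item.
print(is_good_binet_item(45, 72, 88))  # all three criteria met
print(is_good_binet_item(65, 72, 88))  # too easy for four-year-olds
```

Keeping the thresholds as parameters makes it easy to try the slightly different cut-offs that other descriptions of the procedure use.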

Figure 6 illustrates the types of items which have been found to meet the conditions outlined above.

The IQ Measure

During the test revision, Binet and Simon developed the concept of mental age. The five-year-old who could pass only the three-year-old tests was said to have a mental age of three. A child whose chronological age (CA) and mental age (MA) were both five was considered to have normal intelligence. Later, a German psychologist, William Stern, devised the intelligence quotient (IQ) using the CA and MA. The formula he developed was:

IQ = MA/CA X 100

Thus the child with an MA of 3 and a CA of 5 would have an IQ of 60. The "normal" five-year-old has an IQ of 100. The five-year-old with an MA of 7 has an IQ of 140 (7/5 X 100). Computational formulas had to be modified later, however, for the measurement of adult intelligence, which is not presumed to vary directly with age.
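
Stern's formula is easy to check by machine. The following is a minimal sketch; the function name ratio_iq is hypothetical, and the three cases are the ones worked in the paragraph above.

```python
def ratio_iq(mental_age, chronological_age):
    """Stern's quotient: IQ = MA / CA x 100 (both ages in the same unit)."""
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    return 100 * mental_age / chronological_age


# The three five-year-olds discussed in the text.
for ma, ca in [(3, 5), (5, 5), (7, 5)]:
    print(f"MA={ma}, CA={ca}: IQ={ratio_iq(ma, ca):.0f}")
```

Ages may be given in years or in months, as long as MA and CA use the same unit.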

The Stanford-Binet
The Stanford-Binet Scale was devised by Lewis C. Terman, at Stanford University, following Binet's procedures. Terman rearranged many of Binet's tests and added new ones, especially at the upper end of the scale. The Stanford-Binet Test became the standard instrument for measuring intelligence in the United States for more than 20 years. This scale, first prepared in 1916, has undergone several major revisions -- the most recent in 1972. The most significant changes have been in the use of the scale for superior adults and the development of national norms.

Three-hole form board
Places form (e.g., circle) in correct hole.

Block building: tower
Builds a four-block tower from model after demonstration.

Block building: bridge
Builds a bridge consisting of the side blocks and one top block from model after demonstration.

Identifying parts of the body
Points out hair, mouth, etc., on large paper doll.

Naming objects from memory
One of three objects (e.g., toys, dog, or shoe) is covered after child has seen all objects; child then names covered object from memory.

Picture identification
Points to correct pictures of objects on a card when asked, "Show me what we cook on" or "What do we carry when it is raining?"


Answers such questions as, "In what way are coal and wood alike? Ship and automobile?"

Copying a diamond
Copies a diamond in the record booklet.

Defines eight words from a list.

Memory for stories
Listens to a story, then repeats the gist of it.

Verbal absurdities
Must say what is foolish about stories similar to: "I saw a well-dressed young man who was walking down the street with his hands in his pockets and twirling a brand new cane."

Digit reversal
Must repeat four digits backward.


Average Adult

Defines 20 words from a list.

Explains in own words the meaning of two or more common proverbs.

Must answer questions similar to: "What direction would you have to face so your left hand would be toward the south?"

Figure 6. Some illustrative items from the Stanford-Binet Intelligence Scale. The sample items should be passed by most of the children at the ages indicated. (From Terman and Merrill, 1960)

Intelligence scaling for people older than 15 years requires a different approach

Measuring Adult Intelligence

The Binet-type test works well for children, but it is not as successful for measuring the intelligence of adults. For one thing, it is very hard to find items that most 17-year-olds will fail but 18-year-olds will consistently pass. Generally, the measure of mental age becomes meaningless past the age of 15. Another frequently voiced criticism of Binet's scale was that its items were essentially designed for children. Even though materials intended to be more appropriate for adults were later added, they were not essentially different. The result was that the tests were not intrinsically motivating to adults, nor did they allow sufficient room at the top to differentiate among adults. In an effort to meet this criticism, Wechsler brought out a scale in 1939 specifically designed to measure adult intelligence.

There are several important differences between Wechsler's test and the Binet type of test. First, the tests are not grouped by age; the scale yields a percentile score rather than an IQ measure by age group. But the most striking difference between the two tests is that the Wechsler breaks down the score for general intelligence into separate scores for each type of subtest (vocabulary, information, arithmetic, picture arrangement, blocks, and so on) and then builds these into two composite scores, one for verbal and one for performance capabilities. The Wechsler test met with such widespread acceptance that Wechsler soon developed the Wechsler Intelligence Scale for Children (WISC), and his original scale was renamed the Wechsler Adult Intelligence Scale (WAIS).

Adult intelligence scores usually represent percentile rankings within a given sample of test scores

Converting Percentile Score to IQ

A percentile score expresses the ranking of an individual's score on a particular test relative to the scores of a sample of other people who have taken the same test. When someone receives such a score, he knows the percent of people in the sample whose scores fell below his own; if he scored in the 82nd percentile, his score was better than 82% of those in the sample and only 18% scored above him. This is a relative performance measure only. He cannot tell from this score what his absolute performance was.
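
The definition above amounts to counting how many scores in the sample fall below a given score. A minimal sketch follows; the function name and the ten sample scores are hypothetical and serve only to illustrate the calculation.

```python
def percentile_rank(score, sample_scores):
    """Percent of scores in the sample that fall strictly below `score`."""
    if not sample_scores:
        raise ValueError("sample must not be empty")
    below = sum(1 for s in sample_scores if s < score)
    return 100 * below / len(sample_scores)


# Hypothetical raw scores for a sample of ten people.
sample = [12, 15, 18, 20, 21, 23, 25, 27, 30, 34]
print(percentile_rank(27, sample))  # 7 of the 10 scores are lower
```

Note that the result says nothing about how many items the person answered correctly; it depends entirely on the sample used for comparison.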

However, the interpretation of percentile scores is not widely understood, and many people prefer to have a percentile score translated into some kind of equivalent IQ. But the formula for IQ does not work for people in their late teens or early adulthood, so another method has to be used. If we plot a frequency diagram of scores on the WAIS, we find that they form a normal distribution, or bell-shaped curve.

If a score at the 50th percentile is presumed to represent the performance of an average person, we can make that score equal to an IQ of 100. By various other statistical procedures it is possible to relate values of IQ to this distribution. Figure 8 illustrates that comparison.


Figure 8. Comparison of IQ distribution and percentiles
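
One way to sketch the conversion from percentile to IQ is to read the IQ off a normal curve centered on 100. The text does not give the spread of the curve; the standard deviation of 15 used below is an assumption (it is the convention commonly used with Wechsler-type scales), and the function name is hypothetical.

```python
from statistics import NormalDist

# Assumption: IQ scores follow a normal curve with mean 100 and SD 15.
IQ_SCALE = NormalDist(mu=100, sigma=15)


def percentile_to_iq(percentile):
    """Map a percentile (strictly between 0 and 100) to an IQ on the assumed scale."""
    if not 0 < percentile < 100:
        raise ValueError("percentile must be strictly between 0 and 100")
    return IQ_SCALE.inv_cdf(percentile / 100)


print(round(percentile_to_iq(50)))  # the 50th percentile sits at the mean
print(round(percentile_to_iq(82)))
```

With a different assumed standard deviation the same percentile would map to a different IQ, which is one reason such scores still need interpretation.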

Adult IQ scores, therefore, are determined from a statistical analysis of the distribution of test scores made by a sample of the adult population. The adult IQ score is still a percentile measure, but for the general public it has the advantage that it does not appear to need interpretation; one simply concludes that his score is above or below "average." In fact, however, it does require interpretation. Translating a percentile score into an IQ score does not add to either the reliability or the validity of the tests on which it is based. Has the reliability of the tests been verified through repeated testing of the same sample? Has the validity been verified by ensuring that the sample represented an "average" cross-section of the population that is being evaluated against the sample? As so often happens, some well-informed people think the tests are reliable and valid while others do not.

Group Tests
Both the Stanford-Binet and Wechsler Scales are individual tests. That is, the test is administered by a trained person to a single subject. Such testing procedures are costly and time-consuming.

Modern tests are based on assumptions of different types of intelligence

But at about the same time that Terman was developing the Stanford-Binet test, several psychologists, such as Arthur Otis and E. L. Thorndike, were experimenting with tests that could be given to a group of subjects at one time. The First World War was the immediate impetus for this development. Thousands of young men were being inducted into the service. Military authorities needed a way to weed out those whose intelligence was too low, and to select others for officer training. For this reason, army psychologists devised two tests. One was for men who could read and write; these persons were then classified roughly according to intelligence. The other test was for illiterates and non-English-speaking candidates. It emphasized nonverbal problems with oral instructions. These tests were drastically revised during the Second World War and became the Army General Classification Test (AGCT).

A. If 5.5 tons of bark cost $33, what will 3.5 cost? ( )

B. A train is harder to stop than an automobile because ( ) it is longer. ( ) it is heavier. ( ) the brakes are not so good.

C. If the two words of a pair mean the same or nearly the same thing, draw a line under same. If they mean the opposite or nearly the opposite thing, draw a line under opposite.
comprehensive   restricted   same   opposite
allure          attract      same   opposite
latent          hidden       same   opposite
deride          ridicule     same   opposite

D. If, when you have arranged the following words to make a sentence, the sentence is true, underline true; if it is false, underline false.
people enemies arrogant many make        true   false
never who heedless those stumble are     true   false
never man the show the deeds             true   false

E. Underline the word that correctly completes each sentence.
The pitcher has an important place in      tennis   football   baseball   handball
Dismal is to dark as cheerful is to        laugh   bright   house   gloomy

Figure 9. Items from the AGCT

Like the WAIS, the AGCT separates intelligence into several factors. Four subtest scores can be obtained for verbal ability, spatial comprehension, mathematical computation, and reasoning.

Some sample items from these early group tests are shown in Figure 9.

Since the development of these early tests, many other group tests have been prepared for civilians. Today, group tests are far more frequently used than individual tests. Some of these group tests, such as the S.R.A. Primary Ability test, were developed for children, while others, like the Otis Quick-Scoring Mental Ability Tests (Gamma), are appropriate for high school and college students.


Now test yourself without looking back.

1. The purpose of the earliest intelligence test was to identify________________________________________

2. Which of these results would indicate the best test items for a Binet-type test for the five-year-old level?

a. 20% of the four-year-olds, 40% of the five-year-olds, and 60% of the six-year-olds passed it.
b. 70% of the four-year-olds, 70% of the five-year-olds, and 95% of the six-year-olds passed it.
c. 20% of the four-year-olds, 75% of the five-year-olds, and 90% of the six-year-olds passed it.
d. 20% of the four-year-olds, 75% of the five-year-olds, and 75% of the six-year-olds passed it.

3. The IQ score for an adult is:

a. derived from a statistical analysis of the distribution of scores made by a sample of adults.
b. MA/CA X 100
c. based on his chronological age.

4. Match.

1) Group test______

2) Individual test______

a. Given to many people simultaneously
b. Given to one person at a time
c. Devised earlier
d. More widely used today
e. Generally less expensive

5. Match.
1) Wechsler Scale_____
2) Stanford-Binet Scale__________

a. Given on an individual basis
b. Uses the same types of tests for all age levels
c. Separates intelligence test into several factors, i.e., verbal and performance.
d. Scored using the concept of MA
e. Scored based on comparison with the population distribution

6. A nine-year-old child has a mental age of ten. The IQ of this child is:
a. 90.
b. 111.
c. found by using the formula CA X 100.



An early individual intelligence test, the Binet-Simon Scale, used different types of test items for each age level. It was devised to identify slow learners. Following the same principles, Terman developed the Stanford-Binet Scale for use with U.S. schoolchildren. More recent individual tests are the Wechsler Scales, devised to test each age level on the same types of items. These are the WAIS, which is designed to test adults, and WISC, for children. Finally, army psychologists have developed group tests (AGCT) for military personnel.
Fill in the information about each test above.

The normal intelligence quotient (IQ) is 100. IQ's for children are found using this formula:

IQ = Mental Age (MA)/Chronological Age (CA) X 100

Write the IQ's for the following:

a. A child with an MA of 6 and a CA of 7_______

b. A five-year-old with a mental age of 7_______

c. A 13-year-old with an average IQ_______

d. A 10-year-old with an average IQ_______
________________________________________ 5

An adult IQ is based on comparisons with the average of the population as a unit. Adult IQ's are:

a. computed just like children's IQ's.
b. found using the formula X 100.
c. based on population norms.


Both Wechsler Scales provided separate scores for verbal and performance subtests as well as a total score. The Wechsler Scales:

a. can be used for either adults or children.
b. provide a single mental age score.
c. are divided into two categories.
d. use the same types of items for all age groups.


On most intelligence group tests a person's score is given in the form of a percentage. The typical group IQ test given today is similar to which individual test in its method of scoring?

________________________________ 3


1. a, c, d
2. b, c
3. Wechsler
4. c
5. a. 86
   b. 140
   c. 100
   d. 100

Suppose a given test item was passed by 70% of the five-year-olds and 65% of the four-year-olds. If a child got it right, you would be certain that:
a. he is brighter than the average four-year-old.
b. he is brighter than the average five-year-old.
c. he is brighter than the average six-year-old.
d. (none of the above)

A Binet-type test item is very good if slightly more than half of the children at the particular age get it right (60-75%), while most of the children a year younger miss it (less than 60%) and most of those a year older get it right (above 75%). Which of these items would be a good Binet item for a test for five-year-olds?

a. 30% of the four-year-olds, 50% of the five-year-olds, and 70% of the six-year-olds passed it.
b: 65% of the ALSO of the five-year-olds and 80% of the six-year-olds passed it.
c. 20% of the four-year-olds, 85% of the five-year-olds, and 90% of the six-year-olds passed it.
d. 40% of the four-year-olds, 65% of the five-year-olds, and 80% of the six-year-olds passed it.


Test reliability can be indicated by a coefficient of correlation between scores on retests with the same test; a high correlation signifies high consistency of scores for the population tested.

Below are the reliability coefficients of several intelligence tests. Which is the most reliable?
a. .85
b. .58
c. .08
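
The test-retest coefficient described in this frame can be computed as an ordinary Pearson correlation between the two sets of scores. The sketch below is an illustration only: the function name and the scores for five people are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least two scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares of the deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for five people tested twice with the same test.
first_testing = [95, 102, 110, 118, 125]
second_testing = [97, 100, 112, 115, 128]
print(round(pearson_r(first_testing, second_testing), 2))
```

A coefficient near 1 means people kept nearly the same rank order on the retest; a coefficient near 0 means the two sets of scores are essentially unrelated.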


Reliability depends on, among other things, the length of a test. This is one reason why individual tests generally are more reliable than group tests. Which of these two tests would you suspect was more reliable?
a. the WAIS
b. the AGCT


NOW TAKE PROGRESS CHECK 2


1 d
2 a
3 d


1. The earliest intelligence test was:
a. devised by Wechsler.
b. used by the army to classify inductees.
c. devised to identify slow learners.
d. devised by Terman.

2. Match.
1) Terman_____
2) Binet_____
3) Wechsler_____

a. Devised the earliest well-known intelligence test
b. Devised a test that gave separate verbal and motor performance scores
c. Devised the Stanford-Binet Scale
d. Devised an individual test for adults

3. Adult IQ is based on:
a. the concept of mental age.
b. a statistical comparison with the distribution of adult scores.
c. a measure of chronological age and achievement.

4. A 12-year-old child has a mental age of 10 years and 6 months. His IQ is:

a. 83.5.
b. 87.5.
c. found by CA/MA X 100
d. calculated just like an adult's IQ.

5. Which of the following is true of the Stanford-Binet Scale?

a. Only appropriate for adults
b. Given on an individual basis
c. Developed for use by the U.S. Army
d. Gives different scores for different subtests

6. The IQ tests most widely used today are (group/individual) _____________________

7. Which of these items would be considered best for a Binet-type test for the seven-year-old level?

a. 20% of the six-year-olds, 40% of the seven-year-olds, and 75% of the eight-year-olds passed it.
b. 60% of the six-year-olds, 65% of the seven-year-olds, and 70% of the eight-year-olds passed it.
c. 20% of the six-year-olds, 85% of the seven-year-olds, and 86% of the eight-year-olds passed it.
d. 50% of the six-year-olds, 70% of the seven-year-olds, and 85% of the eight-year-olds passed it.



Oct. 11, 2005