MODULE 3

Reinforcement Schedules

The importance of reinforcement schedules in operant conditioning research can hardly be overstated. They permit the experimenter to maintain the subject's behavior over long periods of time so that reactions to the variables of which behavior is a function can be examined in detail. But perhaps of equal importance is the fact that all of us in our everyday behavior are subjected to reinforcement schedules of various sorts.

By studying what happens to animals on various schedules, one can understand one's own behavior better.


For many years, researchers used only continuous reinforcement schedules in their experiments; that is, they reinforced the desired behavior every time it occurred. Until the "pleasant Saturday afternoon" described by Skinner below, no one had thought of doing anything else. Skinner recounts that he was running eight rats in four homemade Skinner boxes and making his own food pellets with a pill machine, a tedious and time-consuming task.

Here is Skinner's (1959) personal account of how he happened to hit upon the effects of intermittent reinforcement.

One pleasant Saturday afternoon I surveyed my supply of dry pellets and, appealing to certain elemental theorems in arithmetic, deduced that unless I spent the rest of that afternoon and evening at the pill machine, the supply would be exhausted by ten-thirty Monday morning.

Since I do not wish to deprecate the hypothetico-deductive method, I am glad to testify here to its usefulness. It led me to apply our second principle of unformalized scientific method and to ask myself why every press of the lever had to be reinforced. I was not then aware of what had happened at the Brown Laboratories, as Harold Schlosberg later told the story. A graduate student had been given the task of running a cat through a difficult discrimination experiment. One Sunday, the student found the supply of cat food exhausted. The stores were closed, and so, with a beautiful faith in the frequency-theory of learning, he ran the cat as usual and took it back to its living cage unrewarded. Schlosberg reports that the cat howled its protest continuously for nearly forty-eight hours. Unaware of this, I decided to reinforce a response only once every minute and to allow all other responses to go unreinforced. There were two results: (a) my supply of pellets lasted almost indefinitely, and (b) each rat stabilized at a fairly constant rate of responding.

Now, a steady state was something I was familiar with from physical chemistry, and I therefore embarked upon the study of periodic reinforcement. I soon found that the constant rate at which the rat stabilized depended upon how hungry it was. Hungry rat, high rate; less hungry rat, lower rate. At that time I was bothered by the practical problem of controlling food deprivation. I was working half time at the Medical School (on chronaxie of subordination) and could not maintain a good schedule in working with the rats. The rate of responding under periodic reinforcement suggested a scheme for keeping a rat at a constant level of deprivation. The argument went like this: Suppose you reinforce the rat, not at the end of a given period, but when it has completed the number of responses ordinarily emitted in that period. And suppose you use substantial pellets of food and give the rat continuous access to the lever. Except for periods when the rat sleeps, it should operate the lever at a constant rate around the clock. For, whenever it grows hungrier, it will work faster, get food faster, and become less hungry, while whenever it grows slightly less hungry, it will respond at a lower rate, get less food, and grow hungrier. By setting the reinforcement at a given number of responses, it should even be possible to hold the rat at any given level of deprivation. I visualized a machine with a dial which one could set to make available, at any time of day or night, a rat in a given state of deprivation. Of course, nothing of the sort happens. This is fixed-ratio rather than fixed-interval reinforcement and, as I soon found out, it produces a very different type of performance. This is an example of a fifth unformalized principle of scientific practice, but one which has at least been named. Walter Cannon described it with a word invented by Horace Walpole: serendipity, the art of finding one thing while looking for something else.

Thus, the study of the very important implications of reinforcement schedules arose from a desire to avoid the tedium of preparing food pellets for rats.

THE CUMULATIVE RECORD

To understand the use and the effects of various reinforcement schedules, it helps to see how behavior is described in a cumulative record.

A cumulative recorder is a device in which a pen makes marks on a steadily moving strip of paper. For each response, the pen moves upward a little way. The illustration below shows how a single response would appear on paper.

We can see from this record that the animal did not emit the desired response either before or after his single response. When the animal is not responding, you simply get a horizontal line on the graph; the graph never goes down. The pen is merely reset to the bottom when it runs out of room at the top. The faster an animal is responding, the steeper the slope of the cumulative record will be. The illustration below shows the records of two rats, one responding at a faster rate than the other.

The general shape of the curve, and not the individual response steps themselves, is of primary interest in a cumulative record. For this reason, we will draw curves as though viewed from some distance and will just show the general shape. The graph below shows two sketches of cumulative records; one is an acquisition curve and the other is an extinction curve. Notice the difference in curvature between the two records.
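To make the recorder's operation concrete, here is a minimal sketch in Python (not part of the original module; the pen step, paper height, and response times are illustrative assumptions) that builds a cumulative record from a list of response times:

```python
# A minimal simulation of a cumulative recorder (an illustrative sketch,
# not the actual laboratory device). The paper moves at a constant speed;
# the pen steps up one unit per response and resets at the top edge.

PEN_STEP = 1        # pen movement per response (assumed)
PAPER_HEIGHT = 50   # pen resets to 0 past this height (assumed)

def cumulative_record(response_times, total_time):
    """Return (time, pen_height) pairs sampled once per unit of time."""
    record = []
    pen = 0
    responses = sorted(response_times)
    i = 0
    for t in range(total_time + 1):
        # Step the pen up once for each response made by time t.
        while i < len(responses) and responses[i] <= t:
            pen += PEN_STEP
            if pen > PAPER_HEIGHT:   # out of room at the top: reset
                pen = 0
            i += 1
        record.append((t, pen))      # a flat stretch means no responding
    return record

# Two rats: the faster responder produces the steeper record.
fast = cumulative_record(list(range(0, 60, 2)), total_time=60)
slow = cumulative_record(list(range(0, 60, 10)), total_time=60)
```

Plotting `fast` against `slow` would reproduce the two-rat illustration above: both lines only rise or stay flat, and the fast responder's line climbs more steeply.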

EXTINCTION

As we said previously, the procedure of reinforcing every response is referred to as a continuous schedule. Other schedules are called intermittent or partial reinforcement schedules because many unreinforced responses are mixed with reinforced ones. The effects of intermittent schedules answer, in part, the question Skinner asked himself: must every response be reinforced? Not only is it unnecessary to reinforce each response, it is often preferable not to do so.

In addition to increasing response rates, intermittent schedules have still another attractive feature. Since so many nonreinforced trials are mixed in with reinforced ones, it is more difficult to obtain extinction. Data show that organisms on some schedules may make literally thousands of responses after the last reinforcement has been delivered.

The resistance to extinction caused by intermittent schedules is precisely why extinction is often so difficult to obtain in everyday life.

Extinction following intermittent reinforcement requires a greater number of responses than extinction following a continuous schedule, even when the same number of reinforcements was used in the original conditioning.

KINDS OF INTERMITTENT REINFORCEMENT SCHEDULES

Although there is an unlimited number of possible reinforcement schedules, you will be pleased to know that we are only going to discuss four of them. The four can be divided into two main groups: ratio schedules and interval schedules. These in turn can vary in two ways: you may have fixed or variable ratio schedules and fixed or variable interval schedules. It will be easy for you to remember these four kinds if you just think of a two by two matrix, like this:

              Ratio    Interval
Fixed         FR       FI
Variable      VR       VI

Now we will examine each of these four reinforcement schedules in a little more detail.

FIXED RATIO (FR) REINFORCEMENT SCHEDULES

In a fixed ratio schedule, a given amount of behavior is required for each reinforcement. For example, an experimenter working with a rat or pigeon in a Skinner box may set the equipment to require ten responses (bar presses or key pecks) for each reinforcement; this is an FR 10 schedule.

Intermittently reinforced behavior is difficult to extinguish.

On a fixed ratio schedule, the animal typically responds in bursts: after reinforcement he pauses, then begins responding again at a high, constant rate. Figure 8 shows a cumulative record of an FR performance. The vertical slash marks indicate reinforcement. The horizontal sections of the record indicate periods of no responding, and the diagonal portions show a constant, high rate of response.

Figure 8. Cumulative response curve for a fixed ratio schedule

If too many unreinforced responses are demanded for each reinforcement, that is, if the ratio becomes too large, then the pauses after reinforcement become very long. This effect is called ratio strain, or simply strain.

It should be emphasized that this is not a physical strain on the animal. It is, rather, a feature of the schedule. The pauses can be eliminated by reducing the ratio.

Ratio strain appears in people, too. Students who have two or three reports due at one time often show it. After a student finishes the first report, the probability of his starting another one, for a while at least, is near zero.

VARIABLE RATIO (VR) REINFORCEMENT SCHEDULES

The characteristic pause after reinforcement on a fixed ratio schedule is eliminated in the variable ratio schedule. Although the average number of responses required for a fixed ratio schedule and a variable ratio schedule may be the same, the number of responses required before each reinforcement in a VR schedule is randomly determined.

In general, ratio schedules, both fixed and variable, generate the highest rates of responding. The variable ratio is precisely the type of schedule designed into a slot machine. You cannot hit the jackpot unless you play, and the machines are set to pay off on some preset variable ratio schedule. Anyone who has visited Las Vegas can testify to the high response rate of slot machine players.

FIXED INTERVAL (FI) REINFORCEMENT SCHEDULES

In interval schedules, the animal gets reinforcement for the first response he makes after a given interval of time has elapsed. For example, on an FI 5 schedule the animal would be reinforced for his first response after 5 minutes had elapsed. Thus, theoretically at least, the animal could be reinforced while making only one response every 5 minutes. On an FI schedule, after much training, the animal has a very low probability of responding right after reinforcement. As the time for reinforcement approaches, however, he responds at a faster and faster rate. The result is the scalloped curve characteristic of FI performance. After considerable training on an FI schedule, a record similar to that shown in Figure 9 results.


Figure 9. Cumulative response curve for a fixed interval schedule

VARIABLE INTERVAL (VI) REINFORCEMENT SCHEDULES

As with the VR schedule, the pauses after reinforcement can be eliminated by shifting from the fixed interval to a variable interval schedule. In this way, a steady rate of responding will be generated. How high that rate is will depend on the frequency of reinforcement.
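The decision rules behind the four schedules are simple enough to state in a few lines of code. The sketch below (Python; the function name, the uniform draw used for the variable schedules, and the once-per-second responder are illustrative assumptions, not part of the original module) shows when each schedule delivers reinforcement:

```python
import random

def run_schedule(kind, value, response_times, seed=0):
    """Return the times at which responses earn reinforcement.

    kind  -- "FR", "VR", "FI", or "VI"
    value -- the ratio (responses) or the interval (seconds)
    """
    rng = random.Random(seed)
    reinforced = []
    count = 0          # responses since the last reinforcement
    last_time = 0.0    # time of the last reinforcement

    def requirement():
        # Fixed schedules always require exactly `value`; variable
        # schedules redraw a requirement averaging `value` each time.
        if kind in ("FR", "FI"):
            return value
        return rng.uniform(0.5 * value, 1.5 * value)

    req = requirement()
    for t in response_times:
        count += 1
        ratio_met = kind in ("FR", "VR") and count >= req
        interval_met = kind in ("FI", "VI") and (t - last_time) >= req
        if ratio_met or interval_met:
            reinforced.append(t)
            count, last_time, req = 0, t, requirement()
    return reinforced

# An animal pressing once per second for two minutes:
presses = [float(t) for t in range(1, 121)]
print("FR 10:", run_schedule("FR", 10, presses))  # every tenth press
print("FI 15:", run_schedule("FI", 15, presses))  # first press after 15 s
```

Note that the animal's behavior is not in the sketch at all; the schedule is purely a rule about which responses pay off, which is why the same responder earns reinforcement at very different points under each rule.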

CONDITIONING OF A SUPERSTITION

Suppose a naive subject is brought into a classroom and sees the following: the experimenter is standing in front of a counter with a hand switch connected to it. He instructs, "Say words, any words at all, but not sentences. Try to earn as many points as you can." The subject then begins to say words and, occasionally, he (and the class) will hear a click of the counter as the subject earns points. Actually, the hand switch has nothing to do with whether the subject gets a reinforcement or not; the counter is controlled by a timer. The subject gets reinforced not because of anything he does, but solely with the passage of time.

As the subject says words and gets reinforced, he and members of the class will form superstitious hypotheses. Some subjects may guess that reinforcement depends on saying plural words; another will say it has to do with nouns, and so on. Some of the hypotheses (superstitions) will turn out to be quite involved, but all will be incorrect, except the hypothesis that the reinforcement is noncontingent!
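The noncontingency is easy to see in a short sketch (Python; the word list, the timer setting, and the session length are assumptions for illustration, not from the module). The click depends only on elapsed time, yet it always lands just after some word, inviting a superstitious hypothesis:

```python
# A hedged sketch of the classroom demonstration: the counter clicks on a
# timer, regardless of what the subject says.
import random

random.seed(1)
TIMER_INTERVAL = 10                      # counter clicks every 10 s (assumed)
words = [(t, random.choice(["dog", "cats", "run", "tables"]))
         for t in range(1, 61, 2)]      # subject says a word every 2 s

points = 0
next_click = TIMER_INTERVAL
for t, word in words:
    if t >= next_click:                  # the timer, not the word, pays off
        points += 1
        next_click += TIMER_INTERVAL
        print(f"click! (subject had just said {word!r}: pure coincidence)")
print("points:", points)
```

Whatever word happens to precede each click gets credited by the subject, exactly as described above.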

MODULE 3

PROGRESS CHECK 1

Now test yourself without looking back.
1. Which of the following classifications include all reinforcement schedules?

a. Fixed ratio and fixed interval
b. Fixed ratio and variable interval
c. Variable ratio and fixed interval
d. Continuous and intermittent

2. Do rats respond at a more constant rate on a variable interval schedule or on a fixed interval schedule?

3. Will an organism tend to pause after reinforcement on a fixed ratio or a variable ratio schedule?

4. Which generates behavior more resistant to extinction: continuous or intermittent reinforcement?

5. What will happen if too many unreinforced responses are required for each reinforcement on a ratio schedule?

6. What is an FR 7 schedule?

7. What is the graph called that shows a continuing record of responses of the animal, plotted against time?


ANSWER KEY

Be sure to do all the exercises that follow.



MODULE 3

EXERCISES

In the last module, we discussed conditioning in which every response was reinforced. This is known as:

a. intermittent reinforcement.
b. continuous reinforcement.
c. fixed interval reinforcement.
d. variable ratio reinforcement.

_______________________________________________________________ 2

When every response is not reinforced, intermittent reinforcement is being used. Four types of intermittent reinforcement are shown by this chart.

              Ratio    Interval
Fixed         FR       FI
Variable      VR       VI

Write the letters for the type of reinforcement after each of the following, using C for continuous reinforcement.

a. A rat is reinforced every fifth time he makes a response. _______

b. A rat is reinforced for every response. _______

c. A pigeon's response is reinforced only if it occurs 40 seconds or longer after its previous reinforced response______.

d. A rat's response is reinforced if it is made after a varying period of time (1 to 30 seconds) since its last reinforced response. _____

e. A pigeon is reinforced after differing numbers of responses have been made. ________

_______________________________________________________ 4

A cumulative record keeps a record of responses against time. The steeper the slope of the graph the faster the rate of responding. The cumulative record below shows the responses of a rat on a fixed interval schedule. The slashes indicate reinforcements.

[Figure: cumulative record of a rat on a fixed interval schedule; slashes mark reinforcements]

Which of the following is true?

a. The rat maintains a steady rate of response.

b. Response rate is lower immediately after each reinforcement.

c. The rat's rate of responding changes after each pause.

________________________________________________________________ 5

In a fixed ratio (FR) schedule a pause characteristically follows each reinforcer. This pause is eliminated in a variable ratio (VR) schedule. Label the curves below.

[Figure: two cumulative records to be labeled]


1 VI

2 b

3 VR

4 a. FR   b. C   c. FI   d. VI   e. VR

5 b

In an interval schedule, an animal gets reinforced for the first response after a given interval of time has elapsed. Thus if the animal were on a fixed interval schedule of four minutes he would be reinforced:
a. for his first response, if less than four minutes have passed.
b. for every fourth response.
c. for his first response after four minutes have passed.
d. every four minutes, regardless of response.

____________________________________________________________________ 3

On a fixed interval schedule, the animal tends not to respond immediately after a reinforcement. As the time approaches for reinforcement, his rate of responding will _____________________________________________________________________1

Explain when a rat would be reinforced on a variable interval schedule of reinforcement. __________________________________________________________________________ 4

[Figure: a cumulative record with scalloped curves, like the fixed interval graph shown earlier]

a. This graph is called a________________________________

b. It develops under what kind of reinforcement schedule?_______________________________________________________________ 2

One advantage of intermittent reinforcement is that it establishes behavior more resistant to extinction than a reinforcement schedule in which every response is reinforced. For what type of schedule is every response reinforced?
a. Fixed interval
b. Variable ratio
c. Continuous reinforcement

____________________________________________________________________ 5

NOW TAKE PROGRESS CHECK 2



1 increase

2 a. cumulative record
  b. fixed interval

3 c

4 after his first response following a randomly determined interval of time

5 c


MODULE 3

PROGRESS CHECK 2

1. Name two classifications that will include all reinforcement schedules.
a.

b.

2. Explain briefly the rate of responding by an organism on a fixed interval reinforcement schedule._____________________________________________________________________

3. What is a cumulative record?______________________________________________________________________________________________________

4. Which of the curves below shows a rat with the most rapid rate of acquisition?

5. In general, will rats respond at a more constant rate on a fixed or variable schedule of reinforcement?

_______________________________________________________

6. What general classification of reinforcement schedules would you use in conditioning a rat so that his behavior would be easily extinguished?

___________________________________________________________________

7. Contingent reinforcement of responses occurring after randomly determined periods of time is called

________________________________________________________

ANSWER KEY

6 OR MORE CORRECT GO TO MODULE 4

FEWER THAN 6 CORRECT -- INSTRUCTOR CONFERENCE
