Chapter 8. Learning
My Story of Post-traumatic Stress Disorder
It is a continuous challenge living with post-traumatic stress disorder (PTSD), and I’ve suffered from it for
most of my life. I can look back now and gently laugh at all the people who thought I had the perfect life. I
was young, beautiful, and talented, but unbeknownst to them, I was terrorized by an undiagnosed debilitating
mental illness.
Having been properly diagnosed with PTSD at age 35, I know that there is not one aspect of my life that
has gone untouched by this mental illness. My PTSD was triggered by several traumas, most importantly a
sexual attack at knifepoint that left me thinking I would die. I would never be the same after that attack. For
me there was no safe place in the world, not even my home. I went to the police and filed a report. Rape
counselors came to see me while I was in the hospital, but I declined their help, convinced that I didn’t need
it. This would be the most damaging decision of my life.
For months after the attack, I couldn’t close my eyes without envisioning the face of my attacker. I suffered
horrific flashbacks and nightmares. For four years after the attack I was unable to sleep alone in my house. I
obsessively checked windows, doors, and locks. By age 17, I’d suffered my first panic attack. Soon I became
unable to leave my apartment for weeks at a time, ending my modeling career abruptly. This just became a
way of life. Years passed when I had few or no symptoms at all, and I led what I thought was a fairly normal
life, just thinking I had a “panic problem.”
Then another traumatic event retriggered the PTSD. It was as if the past had evaporated, and I was back in
the place of my attack, only now I had uncontrollable thoughts of someone entering my house and harming
my daughter. I saw violent images every time I closed my eyes. I lost all ability to concentrate or even
complete simple tasks. Normally social, I stopped trying to make friends or get involved in my community.
I often felt disoriented, forgetting where, or who, I was. I would panic on the freeway and became unable
to drive, again ending a career. I felt as if I had completely lost my mind. For a time, I managed to keep it
together on the outside, but then I became unable to leave my house again.
Around this time I was diagnosed with PTSD. I cannot express to you the enormous relief I felt when I
discovered my condition was real and treatable. I felt safe for the first time in 32 years. Taking medication
and undergoing behavioural therapy marked the turning point in my regaining control of my life. I’m
rebuilding a satisfying career as an artist, and I am enjoying my life. The world is new to me and not limited
by the restrictive vision of anxiety. It amazes me to think back to what my life was like only a year ago, and
just how far I’ve come.
For me there is no cure, no final healing. But there are things I can do to ensure that I never have to suffer as
I did before being diagnosed with PTSD. I’m no longer at the mercy of my disorder, and I would not be here
today had I not had the proper diagnosis and treatment. The most important thing to know is that it’s never
too late to seek help. (Philips, 2010)
The topic of this chapter is learning—the relatively permanent change in knowledge or behaviour that is the result
of experience. Although you might think of learning in terms of what you need to do before an upcoming exam,
the knowledge that you take away from your classes, or new skills that you acquire through practice, these changes
represent only one component of learning. In fact, learning is a broad topic that is used to explain not only how we
acquire new knowledge and behaviour but also how we acquire a wide variety of other psychological processes,
including the development of both appropriate and inappropriate social behaviours, and even how a person may
acquire a debilitating psychological disorder such as PTSD.
Figure 8.1 Skinner and Watson. B. F. Skinner (left) and John B. Watson (right) were champions
of the behaviourist school of learning.
Learning is perhaps the most important human capacity. Learning allows us to create effective lives by being able
to respond to changes. We learn to avoid touching hot stoves, to find our way home from school, and to remember
which people have helped us in the past and which people have been unkind. Without the ability to learn from our
experiences, our lives would be remarkably dangerous and inefficient. The principles of learning can also be used to
explain a wide variety of social interactions, including social dilemmas in which people make important, and often
selfish, decisions about how to behave by calculating the costs and benefits of different outcomes.
The study of learning is closely associated with the behaviourist school of psychology, in which the study of learning was seen as
a scientific alternative to the failed method of introspection. The behaviourists, including John B. Watson and
B. F. Skinner (Figure 8.1), focused their research entirely on behaviour, to the exclusion of any kinds of mental
processes. For behaviourists, the fundamental aspect of learning is the process of conditioning — the ability to
connect stimuli (the changes that occur in the environment) with responses (behaviours or other actions).
But conditioning is just one type of learning. We will also consider other types, including learning through insight,
as well as observational learning (also known as modelling). In each case we will see not only what psychologists
have learned about the topics but also the important influence that learning has on many aspects of our everyday
lives. And we will see that in some cases learning can be maladaptive — for instance, when a person like P. K.
Philips continually experiences disruptive memories and emotional responses to a negative event.
References
Philips, P. K. (2010). My story of survival: Battling PTSD. Anxiety Disorders Association of America. Retrieved
from http://www.adaa.org/living-with-anxiety/personal-stories/my-story-survival-battling-ptsd
Image Attributions
Figure 8.1: “B.F. Skinner” (http://commons.wikimedia.org/wiki/File:B.F._Skinner_at_Harvard_circa_1950.jpg) is
licensed under the CC BY 3.0 license (http://creativecommons.org/licenses/by/3.0/deed.en). “John Broadus
Watson” (http://en.wikipedia.org/wiki/File:John_Broadus_Watson.JPG) is in the public domain.
303 • INTRODUCTION TO PSYCHOLOGY - 1ST CANADIAN EDITION
8.1 Learning by Association: Classical Conditioning
Learning Objectives
1. Describe how Pavlov’s early work in classical conditioning influenced the understanding of
learning.
2. Review the concepts of classical conditioning, including unconditioned stimulus (US),
conditioned stimulus (CS), unconditioned response (UR), and conditioned response (CR).
3. Explain the roles that extinction, generalization, and discrimination play in conditioned
learning.
Pavlov Demonstrates Conditioning in Dogs
In the early part of the 20th century, Russian physiologist Ivan Pavlov (1849-1936), shown in Figure 8.2, was
studying the digestive system of dogs when he noticed an interesting behavioural phenomenon: the dogs began
to salivate when the lab technicians who normally fed them entered the room, even though the dogs had not yet
received any food. Pavlov realized that the dogs were salivating because they knew that they were about to be fed;
the dogs had begun to associate the arrival of the technicians with the food that soon followed their appearance in
the room.
With his team of researchers, Pavlov began studying this process in more detail. He conducted a series of
experiments in which, over a number of trials, dogs were exposed to a sound immediately before receiving food. He
systematically controlled the onset of the sound and the timing of the delivery of the food, and recorded the amount
of the dogs’ salivation. Initially the dogs salivated only when they saw or smelled the food, but after several pairings
of the sound and the food, the dogs began to salivate as soon as they heard the sound. The animals had learned to
associate the sound with the food that followed.
Pavlov had identified a fundamental associative learning process called classical conditioning. Classical
conditioning refers to learning that occurs when a neutral stimulus (e.g., a tone) becomes associated with a
stimulus (e.g., food) that naturally produces a behaviour. After the association is learned, the previously neutral
stimulus is sufficient to produce the behaviour.
As you can see in Figure 8.3, “4-Panel Image of Whistle and Dog,” psychologists use specific terms to identify
the stimuli and the responses in classical conditioning. The unconditioned stimulus (US) is something (such as
food) that triggers a naturally occurring response, and the unconditioned response (UR) is the naturally occurring
response (such as salivation) that follows the unconditioned stimulus. The conditioned stimulus (CS) is a neutral
stimulus that, after being repeatedly presented prior to the unconditioned stimulus, evokes a response similar to the
one evoked by the unconditioned stimulus. In Pavlov’s experiment, the sound of the tone served as the conditioned stimulus that, after
learning, produced the conditioned response (CR), which is the acquired response to the formerly neutral stimulus.
Figure 8.2 Ivan Pavlov.
Note that the UR and the CR are the same behaviour—in this case salivation—but they are given different names
because they are produced by different stimuli (the US and the CS, respectively).
Figure 8.3 4-Panel Image of Whistle and Dog.
Conditioning is evolutionarily beneficial because it allows organisms to develop expectations that help them prepare
for both good and bad events. Imagine, for instance, that an animal first smells a new food, eats it, and then gets
sick. If the animal can learn to associate the smell (CS) with the food (US), it will quickly learn that the food creates
the negative outcome and will not eat it the next time.
The Persistence and Extinction of Conditioning
After he had demonstrated that learning could occur through association, Pavlov moved on to study the variables
that influenced the strength and the persistence of conditioning. In some studies, after the conditioning had taken
place, Pavlov presented the sound repeatedly but without presenting the food afterward. Figure 8.4, “Acquisition,
Extinction, and Spontaneous Recovery,” shows what happened. As you can see, after the initial acquisition
(learning) phase in which the conditioning occurred, when the CS was then presented alone, the behaviour rapidly
decreased — the dogs salivated less and less to the sound, and eventually the sound did not elicit salivation at all.
Extinction refers to the reduction in responding that occurs when the conditioned stimulus is presented repeatedly
without the unconditioned stimulus.
Figure 8.4 Acquisition, Extinction, and Spontaneous Recovery. Acquisition: The CS and the US
are repeatedly paired together and behaviour increases. Extinction: The CS is repeatedly presented
alone, and the behaviour slowly decreases. Spontaneous recovery: After a pause, when the CS is
again presented alone, the behaviour may again occur and then again show extinction.
Although at the end of the first extinction period the CS was no longer producing salivation, the effects of
conditioning had not entirely disappeared. Pavlov found that, after a pause, sounding the tone again elicited
salivation, although to a lesser extent than before extinction took place. The increase in responding to the CS
following a pause after extinction is known as spontaneous recovery. When Pavlov again presented the CS alone,
the behaviour again showed extinction until it disappeared again.
Although the behaviour has disappeared, extinction is never complete. If conditioning is again attempted, the animal
will learn the new associations much faster than it did the first time.
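The acquisition and extinction curves described above can be sketched with a short simulation. This is a minimal illustration, not Pavlov's procedure: the error-correction update rule (in the style of the Rescorla-Wagner model, which this chapter does not cover), the learning rate, the trial counts, and the assumption that half of the peak strength returns after a pause are all hypothetical choices made for the example.

```python
def run_trials(strength, n_trials, us_present, learning_rate=0.3):
    """Return the CS-US associative strength recorded after each trial.

    strength: starting associative strength (0 = none, 1 = maximum).
    us_present: True when the US (food) follows the CS (sound).
    learning_rate: hypothetical fraction of the remaining gap closed per trial.
    """
    target = 1.0 if us_present else 0.0
    history = []
    for _ in range(n_trials):
        # Move a fixed fraction of the way toward the target on every trial.
        strength += learning_rate * (target - strength)
        history.append(strength)
    return history


# Acquisition: the CS and the US are paired, so responding increases.
acquisition = run_trials(0.0, 10, us_present=True)

# Extinction: the CS is presented alone, so responding decreases.
extinction = run_trials(acquisition[-1], 10, us_present=False)

# Spontaneous recovery (assumed, not derived): after a pause, half of the
# peak strength returns, then extinguishes again when the CS is presented alone.
recovered = 0.5 * acquisition[-1]
second_extinction = run_trials(recovered, 10, us_present=False)
```

Plotted one after another, the three lists should give roughly the shape of Figure 8.4. This simple rule does not capture the faster relearning that follows extinction; that would need an additional assumption.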
Pavlov also experimented with presenting new stimuli that were similar, but not identical, to the original conditioned
stimulus. For instance, if the dog had been conditioned to being scratched before the food arrived, the stimulus
would be changed to being rubbed rather than scratched. He found that the dogs also salivated upon experiencing
the similar stimulus, a process known as generalization. Generalization refers to the tendency to respond to stimuli
that resemble the original conditioned stimulus. The ability to generalize has important evolutionary significance.
If we eat some red berries and they make us sick, it would be a good idea to think twice before we eat some purple
berries. Although the berries are not exactly the same, they nevertheless are similar and may have the same negative
properties.
Lewicki (1985) conducted research that demonstrated the influence of stimulus generalization and how quickly and
easily it can happen. In his experiment, high school students first had a brief interaction with a female experimenter
who had short hair and glasses. The study was set up so that the students had to ask the experimenter a question,
and (according to random assignment) the experimenter responded either in a negative way or a neutral way toward
the students. Then the students were told to go into a second room in which two experimenters were present
and to approach either one of them. However, the researchers arranged it so that one of the two experimenters
looked a lot like the original experimenter, while the other one did not (she had longer hair and no glasses). The
students were significantly more likely to avoid the experimenter who looked like the earlier experimenter when that
experimenter had been negative to them than when she had treated them more neutrally. The participants showed
stimulus generalization such that the new, similar-looking experimenter created the same negative response in the
participants as had the experimenter in the prior session.
The flip side of generalization is discrimination — the tendency to respond differently to stimuli that are similar
but not identical. Pavlov’s dogs quickly learned, for example, to salivate when they heard the specific tone that had
preceded food, but not upon hearing similar tones that had never been associated with food. Discrimination is also
useful — if we do try the purple berries, and if they do not make us sick, we will be able to make the distinction in
the future. And we can learn that although two people in our class, Courtney and Sarah, may look a lot alike, they
are nevertheless different people with different personalities.
In some cases, an existing conditioned stimulus can serve as an unconditioned stimulus for a pairing with a new
conditioned stimulus—a process known as second-order conditioning. In one of Pavlov’s studies, for instance, he
first conditioned the dogs to salivate to a sound and then repeatedly paired a new CS, a black square, with the sound.
Eventually he found that the dogs would salivate at the sight of the black square alone, even though it had never
been directly associated with the food. Secondary conditioners in everyday life include our attractions to things that
stand for or remind us of something else, such as when we feel good on a Friday because it has become associated
with the paycheque that we receive on that day, which itself is a conditioned stimulus for the pleasures that the
paycheque buys us.
The Role of Nature in Classical Conditioning
As we have seen in Chapter 1, “Introducing Psychology,” scientists associated with the behaviourist school argued
that all learning is driven by experience, and that nature plays no role. Classical conditioning, which is based on
learning through experience, represents an example of the importance of the environment. But classical conditioning
cannot be understood entirely in terms of experience. Nature also plays a part, as our evolutionary history has made
us better able to learn some associations than others.
Clinical psychologists make use of classical conditioning to explain the learning of a phobia — a strong and
irrational fear of a specific object, activity, or situation. For example, driving a car is a neutral event that would not
normally elicit a fear response in most people. But if a person were to experience a panic attack in which he or she
suddenly experienced strong negative emotions while driving, that person may learn to associate driving with the
panic response. The driving has become the CS that now creates the fear response.
Psychologists have also discovered that people do not develop phobias to just anything. Although people may
in some cases develop a driving phobia, they are more likely to develop phobias toward objects (such as snakes
and spiders) or places (such as high locations and open spaces) that have been dangerous to people in the past. In
modern life, it is rare for humans to be bitten by spiders or snakes, to fall from trees or buildings, or to be attacked
by a predator in an open area. Being injured while riding in a car or being cut by a knife are much more likely. But
in our evolutionary past, the potential for being bitten by snakes or spiders, falling out of a tree, or being trapped in
an open space was an important evolutionary concern, and therefore humans are still evolutionarily prepared to learn
these associations over others (Öhman & Mineka, 2001; LoBue & DeLoache, 2010).
Another evolutionarily important type of conditioning is conditioning related to food. In their important research on
food conditioning, John Garcia and his colleagues (Garcia, Kimeldorf, & Koelling, 1955; Garcia, Ervin, & Koelling,
1966) attempted to condition rats by presenting either a taste, a sight, or a sound as a neutral stimulus before the
rats were given drugs (the US) that made them nauseous. Garcia discovered that taste conditioning was extremely
powerful—the rat learned to avoid the taste associated with illness, even if the illness occurred several hours later.
But conditioning the behavioural response of nausea to a sight or a sound was much more difficult. These results
contradicted the idea that conditioning occurs entirely as a result of environmental events, such that it would occur
equally for any kind of unconditioned stimulus that followed any kind of conditioned stimulus. Rather, Garcia’s
research showed that genetics matters — organisms are evolutionarily prepared to learn some associations more
easily than others. You can see that the ability to associate smells with illness is an important survival mechanism,
allowing the organism to quickly learn to avoid foods that are poisonous.
Classical conditioning has also been used to help explain the experience of post-traumatic stress disorder (PTSD),
as in the case of P. K. Philips described in the chapter opener. PTSD is a severe anxiety disorder that can develop
after exposure to a fearful event, such as the threat of death (American Psychiatric Association, 2000). PTSD occurs
when the individual develops a strong association between the situational factors that surrounded the traumatic event
(e.g., military uniforms or the sounds or smells of war) and the US (the fearful trauma itself). As a result of the
conditioning, being exposed to or even thinking about the situation in which the trauma occurred (the CS) becomes
sufficient to produce the CR of severe anxiety (Keane, Zimering, & Caddell, 1985).
PTSD develops because the emotions experienced during the event have produced neural activity in the amygdala
and created strong conditioned learning. In addition to the strong conditioning that people with PTSD experience,
they also show slower extinction in classical conditioning tasks (Milad et al., 2009). In short, people with PTSD
have developed very strong associations with the events surrounding the trauma and are also slow to show extinction
to the conditioned stimulus.
Key Takeaways
• In classical conditioning, a person or animal learns to associate a neutral stimulus (the conditioned
stimulus, or CS) with a stimulus (the unconditioned stimulus, or US) that naturally produces a
behaviour (the unconditioned response, or UR). As a result of this association, the previously
neutral stimulus comes to elicit the same response (the conditioned response, or CR).
• Extinction occurs when the CS is repeatedly presented without the US, and the CR eventually
disappears, although it may reappear later in a process known as spontaneous recovery.
• Stimulus generalization occurs when a stimulus that is similar to an already-conditioned stimulus
begins to produce the same response as the original stimulus does.
• Stimulus discrimination occurs when the organism learns to differentiate between the CS and
other similar stimuli.
• In second-order conditioning, a neutral stimulus becomes a CS after being paired with a
previously established CS.
• Some stimulus-response pairs, such as those between smell and food, are more easily
conditioned than others because they have been particularly important in our evolutionary past.
Exercises and Critical Thinking
1. A teacher places gold stars on the chalkboard when the students are quiet and attentive.
Eventually, the students start becoming quiet and attentive whenever the teacher approaches the
chalkboard. Can you explain the students’ behaviour in terms of classical conditioning?
2. Recall a time in your life, perhaps when you were a child, when your behaviours were
influenced by classical conditioning. Describe in detail the nature of the unconditioned and
conditioned stimuli and the response, using the appropriate psychological terms.
3. If post-traumatic stress disorder (PTSD) is a type of classical conditioning, how might
psychologists use the principles of classical conditioning to treat the disorder?
References
American Psychiatric Association. (2000). Diagnostic and statistical manual of mental disorders (4th ed., text rev.).
Washington, DC: Author.
Garcia, J., Ervin, F. R., & Koelling, R. A. (1966). Learning with prolonged delay of reinforcement. Psychonomic
Science, 5(3), 121–122.
Garcia, J., Kimeldorf, D. J., & Koelling, R. A. (1955). Conditioned aversion to saccharin resulting from exposure to
gamma radiation. Science, 122, 157–158.
Keane, T. M., Zimering, R. T., & Caddell, J. M. (1985). A behavioral formulation of posttraumatic stress disorder
in Vietnam veterans. The Behavior Therapist, 8(1), 9–12.
Lewicki, P. (1985). Nonconscious biasing effects of single instances on subsequent judgments. Journal of
Personality and Social Psychology, 48, 563–574.
LoBue, V., & DeLoache, J. S. (2010). Superior detection of threat-relevant stimuli in infancy. Developmental
Science, 13(1), 221–228.
Milad, M. R., Pitman, R. K., Ellis, C. B., Gold, A. L., Shin, L. M., Lasko, N. B.,…Rauch, S. L. (2009).
Neurobiological basis of failure to recall extinction memory in posttraumatic stress disorder. Biological Psychiatry,
66(12), 1075–82.
Öhman, A., & Mineka, S. (2001). Fears, phobias, and preparedness: Toward an evolved module of fear and fear
learning. Psychological Review, 108(3), 483–522.
Image Attributions
Figure 8.2: Ivan Pavlov (http://commons.wikimedia.org/wiki/File:Ivan_Pavlov_LIFE.jpg) is in the public domain.
8.2 Changing Behaviour through Reinforcement and Punishment: Operant
Conditioning
Learning Objectives
1. Outline the principles of operant conditioning.
2. Explain how learning can be shaped through the use of reinforcement schedules and secondary
reinforcers.
In classical conditioning the organism learns to associate new stimuli with natural biological responses such as
salivation or fear. The organism does not learn something new but rather begins to perform an existing behaviour
in the presence of a new signal. Operant conditioning, on the other hand, is learning that occurs based on the
consequences of behaviour and can involve the learning of new actions. Operant conditioning occurs when a dog
rolls over on command because it has been praised for doing so in the past, when a schoolroom bully threatens
his classmates because doing so allows him to get his way, and when a child gets good grades because her parents
threaten to punish her if she doesn’t. In operant conditioning the organism learns from the consequences of its own
actions.
How Reinforcement and Punishment Influence Behaviour: The Research of Thorndike and Skinner
Psychologist Edward L. Thorndike (1874-1949) was the first scientist to systematically study operant conditioning.
In his research Thorndike (1898) observed cats who had been placed in a “puzzle box” from which they tried to
escape (“Video Clip: Thorndike’s Puzzle Box”). At first the cats scratched, bit, and swatted haphazardly, without
any idea of how to get out. But eventually, and accidentally, they pressed the lever that opened the door and exited to
their prize, a scrap of fish. The next time the cat was constrained within the box, it attempted fewer of the ineffective
responses before carrying out the successful escape, and after several trials the cat learned to almost immediately
make the correct response.
Observing these changes in the cats’ behaviour led Thorndike to develop his law of effect, the principle that
responses that create a typically pleasant outcome in a particular situation are more likely to occur again in a
similar situation, whereas responses that produce a typically unpleasant outcome are less likely to occur again
in the situation (Thorndike, 1911). The essence of the law of effect is that successful responses, because they
are pleasurable, are “stamped in” by experience and thus occur more frequently. Unsuccessful responses, which
produce unpleasant experiences, are “stamped out” and subsequently occur less frequently.
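The "stamping in" and "stamping out" described by the law of effect can be illustrated with a small simulation. This is only a sketch under stated assumptions: Thorndike gave no numeric rule, so the response names, the starting weights, and the amounts by which weights grow or shrink are all hypothetical.

```python
import random


def simulate_puzzle_box(n_trials=20, seed=0):
    """Simulate repeated escapes; return (attempts per trial, final weights)."""
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    # Hypothetical starting point: every response is equally likely.
    weights = {"scratch": 1.0, "bite": 1.0, "swat": 1.0, "press_lever": 1.0}
    attempts_per_trial = []
    for _ in range(n_trials):
        attempts = 0
        while True:
            attempts += 1
            responses = list(weights)
            # Pick one response with probability proportional to its weight.
            response = rng.choices(
                responses, weights=[weights[r] for r in responses]
            )[0]
            if response == "press_lever":
                # Successful response: "stamped in" by the pleasant outcome.
                weights[response] += 1.0
                break
            # Unsuccessful response: "stamped out" a little.
            weights[response] *= 0.8
        attempts_per_trial.append(attempts)
    return attempts_per_trial, weights


attempts, final_weights = simulate_puzzle_box()
```

Because the lever press gains weight on every trial while the ineffective responses can only lose weight, later trials will tend to need fewer attempts, which mirrors the faster escapes Thorndike observed.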
When Thorndike placed his cats in a puzzle box, he found that they learned to engage in the important escape
behaviour faster after each trial. Thorndike described the learning that follows reinforcement in terms of the law of
effect.
Watch: “Thorndike’s Puzzle Box” [YouTube]: http://www.youtube.com/watch?v=BDujDOLre-8
The influential behavioural psychologist B. F. Skinner (1904-1990) expanded on
Thorndike’s ideas to develop a more complete set of principles to explain operant
conditioning. Skinner created specially designed environments known as operant
chambers (usually called Skinner boxes) to systematically study learning. A Skinner
box (operant chamber) is a structure that is big enough to fit a rodent or bird and that
contains a bar or key that the organism can press or peck to release food or water. It
also contains a device to record the animal’s responses (Figure 8.5).
The most basic of Skinner’s experiments was quite similar to Thorndike’s research with cats. A rat placed in the
chamber reacted as one might expect, scurrying about the box and sniffing and clawing at the floor and walls.
Eventually the rat chanced upon a lever, which it pressed to release pellets of food. The next time around, the rat
took a little less time to press the lever, and on successive trials, the time it took to press the lever became shorter
and shorter. Soon the rat was pressing the lever as fast as it could eat the food that appeared. As predicted by the
law of effect, the rat had learned to repeat the action that brought about the food and cease the actions that did not.
Skinner studied, in detail, how animals changed their behaviour through reinforcement and punishment, and
he developed terms that explained the processes of operant learning (Table 8.1, “How Positive and Negative
Reinforcement and Punishment Influence Behaviour”). Skinner used the term reinforcer to refer to any event that
strengthens or increases the likelihood of a behaviour, and the term punisher to refer to any event that weakens
or decreases the likelihood of a behaviour. And he used the terms positive and negative to refer to whether a
stimulus was presented or removed, respectively. Thus, positive reinforcement strengthens a response by
presenting something pleasant after the response, and negative reinforcement strengthens a response by reducing
or removing something unpleasant. For example, giving a child praise for completing his homework represents
positive reinforcement, whereas taking Aspirin to reduce the pain of a headache represents negative reinforcement.
In both cases, the reinforcement makes it more likely that the behaviour will occur again in the future.
Figure 8.5 Skinner Box. B. F. Skinner used a Skinner box to study operant learning. The box
contains a bar or key that the organism can press to receive food and water, and a device that
records the organism’s responses.
Table 8.1 How Positive and Negative Reinforcement and Punishment Influence Behaviour.

| Operant conditioning term | Description | Outcome | Example |
|---|---|---|---|
| Positive reinforcement | Add or increase a pleasant stimulus | Behaviour is strengthened | Giving a student a prize after he or she gets an A on a test |
| Negative reinforcement | Reduce or remove an unpleasant stimulus | Behaviour is strengthened | Taking painkillers that eliminate pain increases the likelihood that you will take painkillers again |
| Positive punishment | Present or add an unpleasant stimulus | Behaviour is weakened | Giving a student extra homework after he or she misbehaves in class |
| Negative punishment | Reduce or remove a pleasant stimulus | Behaviour is weakened | Taking away a teen’s computer after he or she misses curfew |
Reinforcement, either positive or negative, works by increasing the likelihood of a behaviour. Punishment, on the
other hand, refers to any event that weakens or reduces the likelihood of a behaviour. Positive punishment weakens
a response by presenting something unpleasant after the response, whereas negative punishment weakens a
response by reducing or removing something pleasant. A child who is grounded after fighting with a sibling
(positive punishment) or who loses out on the opportunity to go to recess after getting a poor grade (negative
punishment) is less likely to repeat these behaviours.
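The four operant conditioning terms follow from two yes-or-no questions: is a stimulus added or removed, and is that stimulus pleasant or unpleasant? The sketch below encodes that logic. The function name and its string arguments are hypothetical, chosen only for this illustration.

```python
def classify_consequence(stimulus_change, stimulus_quality):
    """Name the operant term for a consequence and its effect on behaviour.

    stimulus_change: "add" or "remove".
    stimulus_quality: "pleasant" or "unpleasant".
    Returns a (term, effect) tuple.
    """
    if stimulus_change not in ("add", "remove"):
        raise ValueError("stimulus_change must be 'add' or 'remove'")
    if stimulus_quality not in ("pleasant", "unpleasant"):
        raise ValueError("stimulus_quality must be 'pleasant' or 'unpleasant'")

    # "Positive" means a stimulus is presented; "negative" means it is removed.
    sign = "positive" if stimulus_change == "add" else "negative"

    # Adding something pleasant or removing something unpleasant
    # strengthens the behaviour; the other two combinations weaken it.
    strengthens = (stimulus_change == "add") == (stimulus_quality == "pleasant")
    kind = "reinforcement" if strengthens else "punishment"
    effect = "strengthened" if strengthens else "weakened"
    return f"{sign} {kind}", effect


# Taking painkillers removes pain (an unpleasant stimulus).
print(classify_consequence("remove", "unpleasant"))
# ('negative reinforcement', 'strengthened')

# Extra homework adds an unpleasant stimulus.
print(classify_consequence("add", "unpleasant"))
# ('positive punishment', 'weakened')
```

Keeping the sign (positive or negative) separate from the kind (reinforcement or punishment) makes it harder to confuse negative reinforcement with punishment.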
Although the distinction between reinforcement (which increases behaviour) and punishment (which decreases it)
is usually clear, in some cases it is difficult to determine whether a reinforcer is positive or negative. On a hot day
a cool breeze could be seen as a positive reinforcer (because it brings in cool air) or a negative reinforcer (because
it removes hot air). In other cases, reinforcement can be both positive and negative. One may smoke a cigarette
both because it brings pleasure (positive reinforcement) and because it eliminates the craving for nicotine (negative
reinforcement).
It is also important to note that reinforcement and punishment are not simply opposites. The use of positive
reinforcement in changing behaviour is almost always more effective than using punishment. This is because
positive reinforcement makes the person or animal feel better, helping create a positive relationship with the person
providing the reinforcement. Types of positive reinforcement that are effective in everyday life include verbal praise
or approval, the awarding of status or prestige, and direct financial payment. Punishment, on the other hand, is
more likely to create only temporary changes in behaviour because it is based on coercion and typically creates a
negative and adversarial relationship with the person providing the punishment. When the person who provides
the punishment leaves the situation, the unwanted behaviour is likely to return.
Creating Complex Behaviours through Operant Conditioning
Perhaps you remember watching a movie or being at a show in which an animal (maybe a dog, a horse, or a
dolphin) did some pretty amazing things. The trainer gave a command and the dolphin swam to the bottom of the
pool, picked up a ring on its nose, jumped out of the water through a hoop in the air, dived again to the bottom of
the pool, picked up another ring, and then took both of the rings to the trainer at the edge of the pool. The animal
was trained to do the trick, and the principles of operant conditioning were used to train it. But these complex
behaviours are a far cry from the simple stimulus-response relationships that we have considered thus far. How can
reinforcement be used to create complex behaviours such as these?
One way to expand the use of operant learning is to modify the schedule on which the reinforcement is applied.
To this point we have only discussed a continuous reinforcement schedule, in which the desired response is
reinforced every time it occurs; whenever the dog rolls over, for instance, it gets a biscuit. Continuous reinforcement
results in relatively fast learning but also rapid extinction of the desired behaviour once the reinforcer disappears.
The problem is that because the organism is used to receiving the reinforcement after every behaviour, the responder
may give up quickly when it doesn’t appear.
Most real-world reinforcers are not continuous; they occur on a partial (or intermittent) reinforcement schedule
— a schedule in which the responses are sometimes reinforced and sometimes not. In comparison to continuous
reinforcement, partial reinforcement schedules lead to slower initial learning, but they also lead to greater resistance
to extinction. Because the reinforcement does not appear after every behaviour, it takes longer for the learner to
determine that the reward is no longer coming, and thus extinction is slower. The four types of partial reinforcement
schedules are summarized in Table 8.2, “Reinforcement Schedules.”
Table 8.2 Reinforcement Schedules.
Reinforcement schedule | Explanation | Real-world example
Fixed-ratio | Behaviour is reinforced after a specific number of responses. | Factory workers who are paid according to the number of products they produce
Variable-ratio | Behaviour is reinforced after an average, but unpredictable, number of responses. | Payoffs from slot machines and other games of chance
Fixed-interval | Behaviour is reinforced for the first response after a specific amount of time has passed. | People who earn a monthly salary
Variable-interval | Behaviour is reinforced for the first response after an average, but unpredictable, amount of time has passed. | Person who checks email for messages
Partial reinforcement schedules are determined by whether the reinforcement is presented on the basis of the time
that elapses between reinforcement (interval) or on the basis of the number of responses that the organism engages
in (ratio), and by whether the reinforcement occurs on a regular (fixed) or unpredictable (variable) schedule. In
a fixed-interval schedule, reinforcement occurs for the first response made after a specific amount of time has
passed. For instance, on a one-minute fixed-interval schedule the animal receives a reinforcement every minute,
assuming it engages in the behaviour at least once during the minute. As you can see in Figure 8.6, “Examples
of Response Patterns by Animals Trained under Different Partial Reinforcement Schedules,” animals under
fixed-interval schedules tend to slow down their responding immediately after the reinforcement but then increase the
behaviour again as the time of the next reinforcement gets closer. (Most students study for exams the same way.)
In a variable-interval schedule, the reinforcers appear on an interval schedule, but the timing is varied around
the average interval, making the actual appearance of the reinforcer unpredictable. An example might be checking
your email: you are reinforced by receiving messages that come, on average, say, every 30 minutes, but the
reinforcement occurs only at random times. Interval reinforcement schedules tend to produce slow and steady rates
of responding.
In a fixed-ratio schedule, a behaviour is reinforced after a specific number of responses. For instance, a rat’s
behaviour may be reinforced after it has pressed a key 20 times, or a salesperson may receive a bonus after he or
she has sold 10 products. As you can see in Figure 8.6, “Examples of Response Patterns by Animals Trained under
Different Partial Reinforcement Schedules,” once the organism has learned to act in accordance with the fixed-ratio
schedule, it will pause only briefly when reinforcement occurs before returning to a high level of responsiveness.
Figure 8.6 Examples of Response Patterns by Animals Trained under Different Partial
Reinforcement Schedules. Schedules based on the number of responses (ratio types) induce
a greater response rate than do schedules based on elapsed time (interval types). Also, unpredictable
schedules (variable types) produce stronger responses than do predictable schedules (fixed types).
A variable-ratio schedule provides reinforcers after an average, but unpredictable, number of responses. Winning money
from slot machines or on a lottery ticket is an example of reinforcement that occurs on a variable-ratio schedule. For
instance, a slot machine (see Figure 8.7, “Slot Machine”) may be programmed to provide a win every 20 times the
user pulls the handle, on average. Ratio schedules tend to produce high rates of responding because reinforcement
increases as the number of responses increases.
Figure 8.7 Slot Machine. Slot machines are examples of a variable-ratio reinforcement schedule.
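The four partial reinforcement schedules described above can also be written as simple rules that decide, response by response, whether a reinforcer is delivered. The following Python sketch is only an illustration and is not taken from the research literature: the function names are hypothetical, the variable-ratio rule is approximated by reinforcing each response with probability 1/mean, and the variable-interval rule draws its waiting times from an exponential distribution. Both of these are assumptions made for the sake of a short example.

```python
import random


def fixed_ratio(n):
    """Reinforce every n-th response (e.g., a bonus after every 10 sales)."""
    count = 0

    def respond(t):
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False

    return respond


def variable_ratio(mean_n, rng):
    """Reinforce each response with probability 1/mean_n, so reinforcers
    arrive after an unpredictable number of responses averaging mean_n
    (an assumption of this sketch)."""

    def respond(t):
        return rng.random() < 1.0 / mean_n

    return respond


def fixed_interval(seconds):
    """Reinforce the first response made once `seconds` have elapsed
    since the previous reinforcer."""
    last = 0.0

    def respond(t):
        nonlocal last
        if t - last >= seconds:
            last = t
            return True
        return False

    return respond


def variable_interval(mean_seconds, rng):
    """Like fixed_interval, but each waiting time is drawn at random
    around mean_seconds (exponential draw: an assumption of this sketch)."""
    last = 0.0
    wait = rng.expovariate(1.0 / mean_seconds)

    def respond(t):
        nonlocal last, wait
        if t - last >= wait:
            last = t
            wait = rng.expovariate(1.0 / mean_seconds)
            return True
        return False

    return respond


def count_reinforcers(schedule, response_times):
    """Count how many of the responses at the given times are reinforced."""
    return sum(schedule(t) for t in response_times)


# Hypothetical animal that responds once per second for three minutes.
times = [float(t) for t in range(1, 181)]
rng = random.Random(0)  # fixed seed so that a run is repeatable

print("fixed-ratio 20:", count_reinforcers(fixed_ratio(20), times))          # 9
print("fixed-interval 60 s:", count_reinforcers(fixed_interval(60.0), times))  # 3
print("variable-ratio 20:", count_reinforcers(variable_ratio(20, rng), times))
print("variable-interval 60 s:", count_reinforcers(variable_interval(60.0, rng), times))
```

Under the two fixed rules the count is fully determined by the responses, whereas under the two variable rules it changes from run to run unless the seed is fixed. That unpredictability is what the text means when it says the learner cannot tell when the next reinforcer is due.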
Complex behaviours are also created through shaping, the process of guiding an organism’s behaviour to the
desired outcome through the use of successive approximation to a final desired behaviour. Skinner made extensive
use of this procedure in his boxes. For instance, he could train a rat to press a bar two times to receive food, by first
providing food when the animal moved near the bar. When that behaviour had been learned, Skinner would begin
to provide food only when the rat touched the bar. Further shaping limited the reinforcement to only when the rat
pressed the bar, to when it pressed the bar and touched it a second time, and finally to only when it pressed the bar
twice. Although it can take a long time, in this way operant conditioning can create chains of behaviours that are
reinforced only when they are completed.
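The logic of shaping, in which the criterion for reinforcement is raised step by step, can be sketched in a few lines of Python. This is a rough illustration rather than Skinner's actual procedure: the stage labels paraphrase the example above, and the rule for moving on to the next stage (three reinforced responses in a row, the hypothetical `successes_needed` parameter) is an assumption, since the text says only that each behaviour had to be learned before the requirement was tightened.

```python
# Successive approximations, ordered from the easiest to the final behaviour.
STAGES = [
    "moves near bar",
    "touches bar",
    "presses bar",
    "presses bar and touches it again",
    "presses bar twice",
]


def shape(observed, successes_needed=3):
    """Reinforce only behaviour that meets the current criterion, and raise
    the criterion after `successes_needed` reinforced responses in a row
    (a hypothetical advancement rule)."""
    stage = 0    # index of the current criterion in STAGES
    streak = 0   # consecutive reinforced responses at this criterion
    log = []
    for behaviour in observed:
        level = STAGES.index(behaviour)
        reinforced = level >= stage  # earlier approximations no longer pay off
        log.append((behaviour, reinforced))
        if reinforced:
            streak += 1
            if streak >= successes_needed and stage < len(STAGES) - 1:
                stage += 1
                streak = 0
        else:
            streak = 0
    return stage, log


# Hypothetical sequence of behaviours from one training session.
session = ["moves near bar"] * 3 + ["touches bar"] * 3 + ["moves near bar"]
final_stage, log = shape(session)
print(STAGES[final_stage])  # presses bar
print(log[-1])              # ('moves near bar', False)
```

In this run the last response goes unreinforced: once the criterion has moved on to pressing the bar, merely approaching it no longer earns food.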
Reinforcing animals if they correctly discriminate between similar stimuli allows scientists to test the animals’
ability to learn, and the discriminations that they can make are sometimes remarkable. Pigeons have been trained
to distinguish between images of Charlie Brown and the other Peanuts characters (Cerella, 1980), and between
different styles of music and art (Porter & Neuringer, 1984; Watanabe, Sakamoto & Wakita, 1995).
Behaviours can also be trained through the use of secondary reinforcers. Whereas a primary reinforcer includes
stimuli that are naturally preferred or enjoyed by the organism, such as food, water, and relief from pain, a
secondary reinforcer (sometimes called a conditioned reinforcer) is a neutral event that has become associated with
a primary reinforcer through classical conditioning. An example of a secondary reinforcer would be the whistle
given by an animal trainer, which has been associated over time with the primary reinforcer, food. An example of
an everyday secondary reinforcer is money. We enjoy having money, not so much for the stimulus itself, but rather
for the primary reinforcers (the things that money can buy) with which it is associated.
Key Takeaways
• Edward Thorndike developed the law of effect: the principle that responses that create a typically
pleasant outcome in a particular situation are more likely to occur again in a similar situation,
whereas responses that produce a typically unpleasant outcome are less likely to occur again in
the situation.
• B. F. Skinner expanded on Thorndike’s ideas to develop a set of principles to explain operant
conditioning.
• Positive reinforcement strengthens a response by presenting something that is typically pleasant
after the response, whereas negative reinforcement strengthens a response by reducing or
removing something that is typically unpleasant.
• Positive punishment weakens a response by presenting something typically unpleasant after the
response, whereas negative punishment weakens a response by reducing or removing something
that is typically pleasant.
• Reinforcement may be either partial or continuous. Partial reinforcement schedules are
determined by whether the reinforcement is presented on the basis of the time that elapses
between reinforcements (interval) or on the basis of the number of responses that the organism
engages in (ratio), and by whether the reinforcement occurs on a regular (fixed) or unpredictable
(variable) schedule.
• Complex behaviours may be created through shaping, the process of guiding an organism’s
behaviour to the desired outcome through the use of successive approximation to a final desired
behaviour.
Exercises and Critical Thinking
1. Give an example from daily life of each of the following: positive reinforcement, negative
reinforcement, positive punishment, negative punishment.
2. Consider the reinforcement techniques that you might use to train a dog to catch and retrieve a
Frisbee that you throw to it.
3. Watch the following two videos from current television shows. Can you determine which
learning procedures are being demonstrated?
a. The Office: http://www.break.com/usercontent/2009/11/the-office-altoidexperiment-1499823
b. The Big Bang Theory [YouTube]: http://www.youtube.com/watch?v=JA96Fba-WHk
References
Cerella, J. (1980). The pigeon’s analysis of pictures. Pattern Recognition, 12, 1–6.
Kassin, S. (2003). Essentials of psychology. Upper Saddle River, NJ: Prentice Hall. Retrieved from Essentials
of Psychology Prentice Hall Companion Website: http://wps.prenhall.com/hss_kassin_essentials_1/15/3933/1006917.cw/index.html
Porter, D., & Neuringer, A. (1984). Music discriminations by pigeons. Journal of Experimental Psychology: Animal
Behavior Processes, 10(2), 138–148.
Thorndike, E. L. (1898). Animal intelligence: An experimental study of the associative processes in
animals. Washington, DC: American Psychological Association.
Thorndike, E. L. (1911). Animal intelligence: Experimental studies. New York, NY: Macmillan. Retrieved
from http://www.archive.org/details/animalintelligen00thor
Watanabe, S., Sakamoto, J., & Wakita, M. (1995). Pigeons’ discrimination of paintings by Monet and
Picasso. Journal of the Experimental Analysis of Behavior, 63(2), 165–174.
Image Attributions
Figure 8.5: “Skinner box” (http://en.wikipedia.org/wiki/File:Skinner_box_photo_02.jpg) is licensed under the CC
BY SA 3.0 license (http://creativecommons.org/licenses/by-sa/3.0/deed.en). “Skinner box scheme” by Andreas1
(http://en.wikipedia.org/wiki/File:Skinner_box_scheme_01.png) is licensed under the CC BY SA 3.0 license
(http://creativecommons.org/licenses/by-sa/3.0/deed.en)
Figure 8.6: Adapted from Kassin (2003).
Figure 8.7: “Slot Machines in the Hard Rock Casino” by Ted Murpy (http://commons.wikimedia.org/wiki/File:HardRockCasinoSlotMachines.jpg) is licensed under CC BY 2.0 (http://creativecommons.org/licenses/by/2.0/deed.en).
8.3 Learning by Insight and Observation
Learning Objective
1. Understand the principles of learning by insight and observation.
John B. Watson and B. F. Skinner were behaviourists who believed that all learning could be explained by the
processes of conditioning — that is, that associations, and associations alone, influence learning. But some kinds
of learning are very difficult to explain using only conditioning. Thus, although classical and operant conditioning
play a key role in learning, they constitute only a part of the total picture.
One type of learning that is not determined only by conditioning occurs when we suddenly find the solution
to a problem, as if the idea just popped into our head. This type of learning is known as insight, the sudden
understanding of a solution to a problem. The German psychologist Wolfgang Köhler (1925) carefully observed
what happened when he presented chimpanzees with a problem that was not easy for them to solve, such as placing
food in an area that was too high in the cage to be reached. He found that the chimps first engaged in trial-and-error
attempts at solving the problem, but when these failed they seemed to stop and contemplate for a while. Then, after
this period of contemplation, they would suddenly seem to know how to solve the problem: for instance, by using a
stick to knock the food down or by standing on a chair to reach it. Köhler argued that it was this flash of insight, not
the prior trial-and-error approaches, which were so important for conditioning theories, that allowed the animals to
solve the problem.
Edward Tolman studied the behaviour of three groups of rats that were learning to navigate through mazes (Tolman
& Honzik, 1930). The first group always received a reward of food at the end of the maze. The second group never
received any reward, and the third group received a reward, but only beginning on the 11th day of the experimental
period. As you might expect when considering the principles of conditioning, the rats in the first group quickly
learned to negotiate the maze, while the rats of the second group seemed to wander aimlessly through it. The rats in
the third group, however, although they wandered aimlessly for the first 10 days, quickly learned to navigate to the
end of the maze as soon as they received food on day 11. By the next day, the rats in the third group had caught up
in their learning to the rats that had been rewarded from the beginning.
It was clear to Tolman that the rats that had been allowed to experience the maze, even without any reinforcement,
had nevertheless learned something, and Tolman called this latent learning. Latent learning refers to learning that
is not reinforced and not demonstrated until there is motivation to do so. Tolman argued that the rats had formed a
“cognitive map” of the maze but did not demonstrate this knowledge until they received reinforcement.
Observational Learning: Learning by Watching
The idea of latent learning suggests that animals, and people, may learn simply by experiencing or watching.
Observational learning (modelling) is learning by observing the behaviour of others. To demonstrate the
importance of observational learning in children, Bandura, Ross, and Ross (1963) showed children a live image of
either a man or a woman interacting with a Bobo doll, a filmed version of the same events, or a cartoon version of
the events. As you can see in “Video Clip: Bandura Discussing Clips From His Modelling Studies,” the Bobo doll
is an inflatable balloon with a weight in the bottom that makes it bob back up when you knock it down. In all three
conditions, the model violently punched the doll, kicked it, sat on it, and hit it with a hammer.
Take a moment to see how Albert Bandura explains his research into the modelling of
aggression in children.
Watch: “Bandura Discussing Clips from His Modelling Studies” [YouTube]:
http://www.youtube.com/watch?v=jWsxfoJEwQQ&feature=youtu.be
The researchers first let the children view one of the three types of modelling, and then
let them play in a room in which there were some really fun toys. To create some
frustration in the children, Bandura let the children play with the fun toys for only a
couple of minutes before taking them away. Then Bandura gave the children a chance
to play with the Bobo doll.
If you guessed that most of the children imitated the model, you would be correct. Regardless of which type of
modelling the children had seen, and regardless of the sex of the model or the child, the children who had seen
the model behaved aggressively — just as the model had done. They also punched, kicked, sat on the doll, and hit
it with the hammer. Bandura and his colleagues had demonstrated that these children had learned new behaviours
simply by observing and imitating others.
Observational learning is useful for animals and for people because it allows us to learn without having to actually
engage in what might be a risky behaviour. Monkeys that see other monkeys respond with fear to the sight of a
snake learn to fear the snake themselves, even if they have been raised in a laboratory and have never actually seen
a snake (Cook & Mineka, 1990). As Bandura put it,
the prospects for [human] survival would be slim indeed if one could learn only by suffering the consequences
of trial and error. For this reason, one does not teach children to swim, adolescents to drive automobiles, and
novice medical students to perform surgery by having them discover the appropriate behaviour through the
consequences of their successes and failures. The more costly and hazardous the possible mistakes, the heavier
is the reliance on observational learning from competent examples. (Bandura, 1977, p. 212)
Although modelling is normally adaptive, it can be problematic for children who grow up in violent families. These
children are not only the victims of aggression, but they also see it happening to their parents and siblings. Because
children learn how to be parents in large part by modelling the actions of their own parents, it is no surprise that there
is a strong correlation between family violence in childhood and violence as an adult. Children who witness their
parents being violent or who are themselves abused are more likely as adults to inflict abuse on intimate partners or
their children, and to be victims of intimate violence (Heyman & Slep, 2002). In turn, their children are more likely
to interact violently with each other and to aggress against their parents (Patterson, Dishion, & Bank, 1984).
Research Focus: The Effects of Violent Video Games on Aggression
The average North American child watches more than four hours of television every day, and two out of
three of the programs they watch contain aggression. It has been estimated that by the age of 12, the average
North American child has seen more than 8,000 murders and 100,000 acts of violence. At the same time,
children are also exposed to violence in movies, video games, and virtual reality games, as well as in music
videos that include violent lyrics and imagery (Henry J. Kaiser Family Foundation, 2003; Schulenburg,
2007; Coyne & Archer, 2005).
It might not surprise you to hear that these exposures to violence have an effect on aggressive behaviour. The
evidence is impressive and clear: the more media violence that people, including children, view, the more
aggressive they are likely to be (Anderson et al., 2003; Cantor et al., 2001). The relationship between viewing
television violence and aggressive behaviour is about as strong as the relationship between smoking and
cancer or between studying and academic grades. People who watch more violence become more aggressive
than those who watch less violence.
It is clear that watching television violence can increase aggression, but what about violent video games?
These games are more popular than ever, and also more graphically violent. Youths spend countless hours
playing these games, many of which involve engaging in extremely violent behaviours. The games often
require the player to take the role of a violent person, to identify with the character, to select victims, and of
course to kill the victims. These behaviours are reinforced by winning points and moving on to higher levels,
and are repeated over and over.
Again, the answer is clear — playing violent video games leads to aggression. A recent meta-analysis by
Anderson and Bushman (2001) reviewed 35 research studies that had tested the effects of playing violent
video games on aggression. The studies included both experimental and correlational studies, with both
male and female participants in both laboratory and field settings. They found that exposure to violent video
games is significantly linked to increases in aggressive thoughts, aggressive feelings, physiological arousal
(including blood pressure and heart rate), as well as aggressive behaviour. Furthermore, playing more video
games was found to relate to less altruistic behaviour.
In one experiment, Bushman and Anderson (2002) assessed the effects of playing violent video games
on aggressive thoughts and behaviour. Participants were randomly assigned to play either a violent or
a nonviolent video game for 20 minutes. Each participant played one of four violent video games
(Carmageddon, Duke Nukem, Mortal Kombat, or Future Cop) or one of four nonviolent video games (Glider
Pro, 3D Pinball, Austin Powers, or Tetra Madness).
Participants then read a story — for instance, this one about Todd — and were asked to list 20 thoughts,
feelings, and actions they would have if they were Todd:
Todd was on his way home from work one evening when he had to brake quickly for a yellow light. The
person in the car behind him must have thought Todd was going to run the light because he crashed into the
back of Todd’s car, causing a lot of damage to both vehicles. Fortunately, there were no injuries. Todd got
out of his car and surveyed the damage. He then walked over to the other car.
As you can see in Figure 8.8, “Results From Bushman and Anderson, 2002,” the students who had played
one of the violent video games responded much more aggressively to the story than did those who played
the nonviolent games. In fact, their responses were often extremely aggressive. They said things like “Call
the guy an idiot,” “Kick the other driver’s car,” “This guy’s dead meat!” and “What a dumbass!”
However, although modelling can increase violence, it can also have positive effects. Research has found
that, just as children learn to be aggressive through observational learning, they can also learn to be altruistic
in the same way (Seymour, Yoshida, & Dolan, 2009).
Figure 8.8 Researchers found that undergraduate students who had just played a violent video
game expressed significantly more violent responses to a story than did those who had just played
a nonviolent video game. Adapted from Bushman & Anderson (2002).
Key Takeaways
• Not all learning can be explained through the principles of classical and operant conditioning.
• Insight is the sudden understanding of the components of a problem that makes the solution
apparent.
• Latent learning refers to learning that is not reinforced and not demonstrated until there is
motivation to do so.
• Observational learning occurs by viewing the behaviours of others.
• Both aggression and altruism can be learned through observation.
Exercises and Critical Thinking
1. Describe a time when you learned something by insight. What do you think led to your
learning?
2. Imagine that you had a 12-year-old brother who spent many hours a day playing violent video
games. Basing your answer on the material covered in this chapter, do you think that your parents
should limit his exposure to the games? Why or why not?
3. How might we incorporate principles of observational learning to encourage acts of kindness
and selflessness in our society?
References
Anderson, C. A., & Bushman, B. J. (2001). Effects of violent video games on aggressive behavior, aggressive
cognition, aggressive affect, physiological arousal, and prosocial behavior: A meta-analytic review of the scientific
literature. Psychological Science, 12(5), 353–359.
Anderson, C. A., Berkowitz, L., Donnerstein, E., Huesmann, L. R., Johnson, J. D., Linz, D.,…Wartella, E. (2003).
The influence of media violence on youth. Psychological Science in the Public Interest, 4(3), 81–110.
Bandura, A. (1977). Self-efficacy: Toward a unifying theory of behavior change. Psychological Review, 84,
191–215.
Bandura, A., Ross, D., & Ross, S. A. (1963). Imitation of film-mediated aggressive models. The Journal of
Abnormal and Social Psychology, 66(1), 3–11.
Bushman, B. J., & Anderson, C. A. (2002). Violent video games and hostile expectations: A test of the general
aggression model. Personality and Social Psychology Bulletin, 28(12), 1679–1686.
Cantor, J., Bushman, B. J., Huesmann, L. R., Groebel, J., Malamuth, N. M., Impett, E. A.,…Singer, J. L. (Eds.).
(2001). Some hazards of television viewing: Fears, aggression, and sexual attitudes. Thousand Oaks, CA: Sage.
Cook, M., & Mineka, S. (1990). Selective associations in the observational conditioning of fear in rhesus
monkeys. Journal of Experimental Psychology: Animal Behavior Processes, 16(4), 372–389.
Coyne, S. M., & Archer, J. (2005). The relationship between indirect and physical aggression on television and in
real life. Social Development, 14(2), 324–337.
Henry J. Kaiser Family Foundation. (2003, Spring). Key facts: TV Violence [PDF]. Menlo Park, CA: Author.
Retrieved from https://kaiserfamilyfoundation.files.wordpress.com/2013/01/key-facts-tv-violence.pdf
Heyman, R. E., & Slep, A. M. S. (2002). Do child abuse and interparental violence lead to adulthood family
violence? Journal of Marriage and Family, 64(4), 864–870.
Köhler, W. (1925). The mentality of apes (E. Winter, Trans.). New York, NY: Harcourt Brace Jovanovich.
Patterson, G. R., Dishion, T. J., & Bank, L. (1984). Family interaction: A process model of deviancy
training. Aggressive Behavior, 10(3), 253–267.
Schulenburg, C. (2007, January). Dying to entertain: Violence on prime time broadcast television, 1998 to 2006
[PDF]. Los Angeles, CA: Parents Television Council. Retrieved from http://www.parentstv.org/PTC/publications/
reports/violencestudy/DyingtoEntertain.pdf
Seymour, B., Yoshida, W., & Dolan, R. (2009). Altruistic learning. Frontiers in Behavioral Neuroscience, 3, 23.
Tolman, E. C., & Honzik, C. H. (1930). Introduction and removal of reward, and maze performance in
rats. University of California Publications in Psychology, 4, 257–275.
Long Descriptions
Figure 8.8 long description: Effect of Violent and Nonviolent Video Games
Response type | Non-violent video game | Violent video game
Do/say | 3.0 | 4.8
Think | 1.8 | 2.5
Feel | 5.5 | 7.0
8.4 Using the Principles of Learning to Understand Everyday Behaviour
Learning Objectives
1. Review the ways that learning theories can be applied to understanding and modifying
everyday behaviour.
2. Describe the situations under which reinforcement may make people less likely to enjoy
engaging in a behaviour.
3. Explain how principles of reinforcement are used to understand social dilemmas, such as the
prisoner’s dilemma, and why people are likely to make competitive choices in them.
The principles of learning are some of the most general and most powerful in all of psychology. It would be fair
to say that these principles account for more behaviour using fewer principles than any other set of psychological
theories. The principles of learning are applied in numerous ways in everyday settings. For example, operant
conditioning has been used to motivate employees, to improve athletic performance, to increase the functioning of
those suffering from developmental disabilities, and to help parents successfully toilet train their children (Azrin &
Foxx, 1974; McGlynn, 1990; Pedalino & Gamboa, 1974; Simek & O’Brien, 1981). In this section we will consider
how learning theories are used in advertising, in education, and in understanding competitive relationships between
individuals and groups.
Using Classical Conditioning in Advertising
Classical conditioning has long been, and continues to be, an effective tool in marketing and advertising (Hawkins,
Best, & Coney, 1998). The general idea is to create an advertisement that has positive features such that the ad
creates enjoyment in the person exposed to it. The enjoyable ad serves as the unconditioned stimulus (US), and
the enjoyment is the unconditioned response (UR). Because the product being advertised is mentioned in the ad,
it becomes associated with the US, and then becomes the conditioned stimulus (CS). In the end, if everything has
gone well, seeing the product online or in the store will then create a positive response in the buyer, leading him or
her to be more likely to purchase the product.
Can you determine how classical conditioning is being used in these commercials?
Watch: “Television Ads” [YouTube]: http://www.youtube.com/v/dsESVrArhbk
A similar strategy is used by corporations that sponsor teams or events. For instance, if people enjoy watching a
university basketball team playing basketball, and if that team is sponsored by a product, such as Pepsi, then people
may end up experiencing positive feelings when they view a can of Pepsi. Of course, the sponsor wants to sponsor
only good teams and good athletes because these create more pleasurable responses.
Advertisers use a variety of techniques to create positive advertisements, including enjoyable music, cute babies,
attractive models, and funny spokespeople. In one study, Gorn (1982) showed research participants pictures of
writing pens of different colours, but paired one of the pens with pleasant music and the other with
unpleasant music. When given a choice as a free gift, more people chose the pen colour associated with the pleasant
music. And Schemer, Matthes, Wirth, and Textor (2008) found that people were more interested in products that
had been embedded in music videos of artists that they liked and less likely to be interested when the products were
in videos featuring artists that they did not like.
Another type of ad that is based on principles of classical conditioning is one that associates fear with the use of a
product or behaviour, such as those that show pictures of deadly automobile accidents to encourage seatbelt use or
images of lung cancer surgery to discourage smoking. These ads have also been found to be effective (Das, de Wit,
& Stroebe, 2003; Perloff, 2003; Witte & Allen, 2000), due in large part to conditioning. When we see a cigarette
and the fear of dying has been associated with it, we are hopefully less likely to light up.
Taken together then, there is ample evidence of the utility of classical conditioning, using both positive as well
as negative stimuli, in advertising. This does not, however, mean that we are always influenced by these ads.
The likelihood of conditioning being successful is greater for products that we do not know much about, where
the differences between products are relatively minor, and when we do not think too carefully about the choices
(Schemer et al., 2008).
Psychology in Everyday Life: Operant Conditioning in the Classroom
John B. Watson and B. F. Skinner believed that all learning was the result of reinforcement, and thus that
reinforcement could be used to educate children. For instance, Watson wrote in his book on behaviourism,
Give me a dozen healthy infants, well-formed, and my own specified world to bring them up in and
I’ll guarantee to take any one at random and train him to become any type of specialist I might select
— doctor, lawyer, artist, merchant-chief and, yes, even beggar-man and thief, regardless of his talents,
penchants, tendencies, abilities, vocations, and race of his ancestors. I am going beyond my facts and I
admit it, but so have the advocates of the contrary and they have been doing it for many thousands of
years (Watson, 1930, p. 82).
Skinner promoted the use of programmed instruction, an educational tool that consists of self-teaching
with the aid of a specialized textbook or teaching machine that presents material in a logical sequence
(Skinner, 1965). Programmed instruction allows students to progress through a unit of study at their own
rate, checking their own answers and advancing only after answering correctly. Programmed instruction is
used today in many classes — for instance, to teach computer programming (Emurian, 2009).
Although reinforcement can be effective in education, and teachers make use of it by awarding gold stars,
good grades, and praise, there are also substantial limitations to using reward to improve learning. To be
most effective, rewards must be contingent on appropriate behaviour. In some cases teachers may distribute
rewards indiscriminately — for instance, by giving praise or good grades to children whose work does not
warrant it — in the hope that students will “feel good about themselves” and that this self-esteem will lead
to better performance. Studies indicate, however, that high self-esteem alone does not improve academic
performance (Baumeister, Campbell, Krueger, & Vohs, 2003). When rewards are not earned, they become
meaningless and no longer provide motivation for improvement.
Another potential limitation of rewards is that they may teach children that the activity should be performed
for the reward, rather than for one’s own interest in the task. If rewards are offered too often, the task itself
becomes less appealing. Mark Lepper and his colleagues (Lepper, Greene, & Nisbett, 1973) studied this
possibility by leading some children to think that they engaged in an activity for a reward, rather than because
they simply enjoyed it. First, they placed some fun felt-tipped markers in the classroom of the children they
were studying. The children loved the markers and played with them right away. Then the markers were
taken out of the classroom, and the children were given a chance to play with the markers individually at
an experimental session with the researcher. At the research session, the children were randomly assigned
to one of three experimental groups. One group of children (the expected reward condition) was told that
if they played with the markers they would receive a good drawing award. A second group (the unexpected
reward condition) also played with the markers, and also got the award — but they were not told ahead of
time that they would be receiving the award; it came as a surprise after the session. The third group (the no
reward group) played with the markers too, but got no award.
Then the researchers placed the markers back in the classroom and observed how much the children in each
of the three groups played with them. As you can see in Figure 8.9, “Undermining Intrinsic Interest,” the
children who had been led to expect a reward for playing with the markers during the experimental session
played with the markers less at the second session than they had at the first session. The idea is that, when
the children had to choose whether or not to play with the markers when the markers reappeared in the
classroom, they based their decision on their own prior behaviour. The children in the no reward group and
the children in the unexpected reward group realized that they played with the markers because they liked
them. Children in the expected reward condition, however, remembered that they were promised a reward
for the activity the last time they played with the markers. These children, then, were more likely to draw
the inference that they play with the markers only for the external reward, and because they did not expect
to get an award for playing with the markers in the classroom, they determined that they didn’t like them.
Expecting to receive the award at the session had undermined their initial interest in the markers.
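The pattern just described can be checked with simple arithmetic. The short Python sketch below uses the values given in the long description of Figure 8.9. The variable names are ours, and because the text does not state the unit of the play measure, the numbers are treated simply as scores.

```python
# Scores from the long description of Figure 8.9: (first session, second session).
# The unit of measurement is not stated in the text, so these are plain scores.
play_scores = {
    "expected award": (17, 8),
    "no award": (15, 16),
    "unexpected award": (17, 17),
}

# Change in play from the first session to the second, for each group.
changes = {
    group: second - first
    for group, (first, second) in play_scores.items()
}

for group, change in changes.items():
    print(f"{group}: {change:+d}")
```

Only the group that expected the award shows a drop in play (from 17 to 8); the other two groups stay essentially level.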
Figure 8.9 Undermining Intrinsic Interest. Mark Lepper and his colleagues (1973) found that
giving rewards for playing with markers, which the children naturally enjoyed, could reduce their
interest in the activity. [Long Description]
This research suggests that, although receiving a reward may in many cases lead us to perform an activity
more frequently or with more effort, a reward may not always increase our liking for the activity. In some
cases a reward may actually make us like an activity less than we did before we were rewarded for it. This
outcome is particularly likely when the reward is perceived as an obvious attempt on the part of others
to get us to do something. When children are given money by their parents to get good grades in school,
they may improve their school performance to gain the reward. But at the same time their liking for school
may decrease. On the other hand, rewards that are seen as more internal to the activity, such as rewards
that praise us, remind us of our achievements in the domain, and make us feel good about ourselves as a
result of our accomplishments, are more likely to be effective in increasing not only the performance of,
but also the liking of, the activity (Hulleman, Durik, Schweigert, & Harackiewicz, 2008; Ryan & Deci,
2002).
Other research findings also support the general principle that punishment is less effective
than reinforcement in changing behaviour. In a recent meta-analysis, Gershoff (2002) found that although
children who were spanked by their parents were more likely to immediately comply with the parents’
demands, they were also more aggressive, showed less ability to control aggression, and had poorer mental
health in the long term than children who were not spanked. The problem seems to be that children who are
punished for bad behaviour are likely to change their behaviour only to avoid the punishment, rather than by
internalizing the norms of being good for its own sake. Punishment also tends to generate anger, defiance,
and a desire for revenge. Moreover, punishment models the use of aggression and ruptures the important
relationship between the teacher and the learner (Kohn, 1993).
Reinforcement in Social Dilemmas
The basic principles of reinforcement, reward, and punishment have been used to help understand a variety of
human behaviours (Bandura, 1977; Miller & Dollard, 1941; Rotter, 1945). The general idea is that, as predicted
by principles of operant learning and the law of effect, people act in ways that maximize their outcomes, where
outcomes are defined as the presence of reinforcers and the absence of punishers.
Consider, for example, a situation known as the commons dilemma, as proposed by the ecologist Garrett Hardin
(1968). Hardin noted that in many European towns there was at one time a centrally located pasture, known as
the commons, which was shared by the inhabitants of the village to graze their livestock. But the commons was
not always used wisely. The problem was that each individual who owned livestock wanted to be able to use the
commons to graze his or her own animals. However, when each group member took advantage of the commons by
grazing many animals, the commons became overgrazed, the pasture died, and the commons was destroyed.
Although Hardin focused on the particular example of the commons, the basic dilemma of individual desires versus
the benefit of the group as a whole can also be found in many contemporary public goods issues, including the use
of limited natural resources, air pollution, and public land. In large cities, most people may prefer the convenience
of driving their own car to work each day rather than taking public transportation. Yet this behaviour uses up public
goods (the space on limited roadways, crude oil reserves, and clean air). People are lured into the dilemma by
short-term rewards, seemingly without considering the potential long-term costs of the behaviour, such as air pollution
and the necessity of building even more highways.
A social dilemma such as the commons dilemma is a situation in which the behaviour that creates the most positive
outcomes for the individual may in the long term lead to negative consequences for the group as a whole. The
dilemmas are arranged in such a way that it is easy to be selfish, because the personally beneficial choice (such
as using water during a water shortage or driving to work alone in one’s own car) produces reinforcements for the
individual. Furthermore, social dilemmas tend to work on a type of time delay. The problem is that, because the
long-term negative outcome (the extinction of fish species or dramatic changes in the earth’s climate) is far away
in the future and the individual benefits are occurring right now, it is difficult for an individual to see how many
costs there really are. The paradox, of course, is that if everyone takes the personally selfish choice in an attempt to
maximize his or her own outcomes, the long-term result is poorer outcomes for every individual in the group. Each
individual prefers to make use of the public goods for himself or herself, whereas the best outcome for the group as
a whole is to use the resources more slowly and wisely.
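The paradox can be made concrete with a toy calculation. The Python sketch below is only an illustration under assumed numbers: the private benefit of 1.0 per extra animal, the shared damage of 3.0 per extra animal, the group of 10 herders, and the function name are all hypothetical and do not come from Hardin (1968).

```python
# Toy model of a commons dilemma. All numbers are hypothetical.
BENEFIT_PER_ANIMAL = 1.0  # private gain to the owner of one extra animal
DAMAGE_PER_ANIMAL = 3.0   # total damage one extra animal does to the pasture,
                          # shared equally by all herders


def herder_payoff(adds_animal: bool, others_adding: int, herders: int) -> float:
    """Payoff to one herder, given own choice and how many others add an animal."""
    animals_added = others_adding + (1 if adds_animal else 0)
    share_of_damage = DAMAGE_PER_ANIMAL * animals_added / herders
    private_gain = BENEFIT_PER_ANIMAL if adds_animal else 0.0
    return private_gain - share_of_damage


herders = 10

# Whatever the others do, adding an animal pays more for the individual ...
selfish_always_better = all(
    herder_payoff(True, others, herders) > herder_payoff(False, others, herders)
    for others in range(herders)
)

# ... yet if everyone adds an animal, each herder ends up worse off
# than if everyone had held back.
all_add = herder_payoff(True, herders - 1, herders)
none_add = herder_payoff(False, 0, herders)

print(selfish_always_better, all_add, none_add)
```

With these assumed values the selfish choice is always individually better, but universal selfishness leaves every herder with a lower payoff than universal restraint, which is the structure of a social dilemma.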
One method of understanding how individuals and groups behave in social dilemmas is to create such situations in
the laboratory and observe how people react to them. The best known of these laboratory simulations is called the
prisoner’s dilemma game (Poundstone, 1992). This game represents a social dilemma in which the goals of the
individual compete with the goals of another individual (or sometimes with a group of other individuals). Like all
social dilemmas, the prisoner’s dilemma assumes that individuals will generally try to maximize their own outcomes
in their interactions with others.
In the prisoner’s dilemma game, the participants are shown a payoff matrix in which numbers are used to express
the potential outcomes for each of the players in the game, given the decisions each player makes. The payoffs are
chosen beforehand by the experimenter to create a situation that models some real-world outcome. Furthermore, in
the prisoner’s dilemma game, the payoffs are normally arranged as they would be in a typical social dilemma, such
that each individual is better off acting in his or her immediate self-interest, and yet if all individuals act according
to their self-interests, then everyone will be worse off.
In its original form, the prisoner’s dilemma game involves a situation in which two prisoners (we’ll call them Frank
and Malik) have been accused of committing a crime. The police believe that the two worked together on the crime,
but they have been able to gather only enough evidence to convict each of them of a lesser offence. In an
attempt to gain more evidence, and thus be able to convict the prisoners of the larger crime, each of the prisoners
is interrogated individually, with the hope that he will confess to having been involved in the larger crime
in return for a promise of a reduced sentence if he confesses first. Each prisoner can make either the cooperative
choice (which is to not confess) or the competitive choice (which is to confess).
The incentives for either confessing or not confessing are expressed in a payoff matrix such as the one shown in
Figure 8.10, “The Prisoner’s Dilemma.” The top of the matrix represents the two choices that Malik might make
(to either confess that he did the crime or not confess), and the side of the matrix represents the two choices that
Frank might make (also to either confess or not confess). The payoffs that each prisoner receives, given the choices
of each of the two prisoners, are shown in each of the four squares.
If both prisoners take the cooperative choice by not confessing (the situation represented in the upper left square
of the matrix), there will be a trial, the limited available information will be used to convict each prisoner, and
they each will be sentenced to a relatively short prison term of three years. However, if either of the prisoners
confesses, turning “state’s evidence” against the other prisoner, then there will be enough information to convict the
other prisoner of the larger crime, and that prisoner will receive a sentence of 30 years, whereas the prisoner who
confesses will get off free. These outcomes are represented in the lower left and upper right squares of the matrix.
Finally, it is possible that both players confess at the same time. In this case there is no need for a trial, and in return
the prosecutors offer a somewhat reduced sentence (of 10 years) to each of the prisoners.
The prisoner’s dilemma has two interesting characteristics that make it a useful model of a social dilemma. For
one, the prisoner’s dilemma is arranged in such a way that a positive outcome for one player does not necessarily
mean a negative outcome for the other player. If you consider again the matrix in Figure 8.10, “The Prisoner’s
Dilemma,” you can see that if one player takes the cooperative choice (to not confess) and the other takes the
competitive choice (to confess), then the prisoner who cooperates loses, whereas the other prisoner wins. However,
if both prisoners make the cooperative choice, each remaining quiet, then neither gains more than the other, and
both prisoners receive a relatively light sentence. In this sense, both players can win at the same time.
Figure 8.10 The Prisoner’s Dilemma. In the prisoner’s dilemma game, two suspected criminals
are interrogated separately. The matrix indicates the outcomes for each prisoner, measured as the
number of years each is sentenced to prison, as a result of each combination of cooperative (don’t
confess) and competitive (confess) decisions. Outcomes for Malik are in black and outcomes for
Frank are in grey. [Long Description]
Second, the prisoner’s dilemma matrix is arranged so that each individual player is motivated to take the competitive
choice because this choice leads to a higher payoff regardless of what the other player does. Imagine for a moment
that you are Malik, and you are trying to decide whether to cooperate (don’t confess) or to compete (confess). And
imagine that you are not really sure what Frank is going to do. Remember the goal of the individual is to maximize
outcomes. The values in the matrix make it clear that if you think that Frank is going to confess, you should confess
yourself (to get 10 rather than 30 years in prison). And it is also clear that if you think Frank is not going to confess,
you should still confess (to get no time in prison rather than three years). So the matrix is arranged so that the “best”
alternative for each player, at least in the sense of pure reward and self-interest, is to make the competitive choice,
even though in the end both players would prefer the combination in which both players cooperate to the one in
which they both compete.
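The reasoning about Malik’s choice can be checked mechanically. The following Python sketch encodes the sentences from Figure 8.10 and asks which choice gives Malik the fewest years for each thing Frank might do. The names used (sentences, malik_years, best_reply_for_malik) are ours and purely illustrative.

```python
# Years in prison for (Malik, Frank), taken from Figure 8.10. Fewer years is better.
CONFESS = "confess"
STAY_QUIET = "don't confess"

# Keys are (Malik's choice, Frank's choice).
sentences = {
    (STAY_QUIET, STAY_QUIET): (3, 3),
    (CONFESS, STAY_QUIET): (0, 30),
    (STAY_QUIET, CONFESS): (30, 0),
    (CONFESS, CONFESS): (10, 10),
}


def malik_years(malik_choice, frank_choice):
    """Return Malik's sentence for a given pair of choices."""
    return sentences[(malik_choice, frank_choice)][0]


def best_reply_for_malik(frank_choice):
    """Return the choice that minimizes Malik's sentence, given Frank's choice."""
    return min(
        (CONFESS, STAY_QUIET),
        key=lambda choice: malik_years(choice, frank_choice),
    )


for frank_choice in (CONFESS, STAY_QUIET):
    reply = best_reply_for_malik(frank_choice)
    print(f"If Frank chooses {frank_choice!r}, Malik does best to choose {reply!r}")
```

Confessing is Malik’s best reply in both cases, yet the table also shows that mutual silence (3 years each) is better for both prisoners than mutual confession (10 years each).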
Although initially specified in terms of the two prisoners, similar payoff matrices can be used to predict behaviour
in many different types of dilemmas involving two or more parties and including choices of helping and not helping,
working and loafing, and paying and not paying debts. For instance, we can use the prisoner’s dilemma to help us
understand roommates living together in a house who might not want to contribute to the housework. Each of them
would be better off if they relied on the other to clean the house. Yet if neither of them makes the cooperative
choice of cleaning, the house becomes a mess and they will both be worse off.
Key Takeaways
• Learning theories have been used to change behaviours in many areas of everyday life.
• Some advertising uses classical conditioning to associate a pleasant response with a product.
• Rewards are frequently and effectively used in education but must be carefully designed to be
contingent on performance and to avoid undermining interest in the activity.
• Social dilemmas, such as the prisoner’s dilemma, can be understood in terms of a desire to
maximize one’s outcomes in a competitive relationship.
Exercises and Critical Thinking
1. Find and share with your class some examples of advertisements that make use of classical
conditioning to create positive attitudes toward products.
2. Should parents use both punishment and reinforcement to discipline their children? On
what principles of learning do you base your opinion?
3. Think of a social dilemma other than one that has been discussed in this chapter, and explain
people’s behaviour in it in terms of principles of learning.
References
Azrin, N., & Foxx, R. M. (1974). Toilet training in less than a day. New York, NY: Simon & Schuster.
Bandura, A. (1977). Social learning theory. New York, NY: General Learning Press.
Baumeister, R. F., Campbell, J. D., Krueger, J. I., & Vohs, K. D. (2003). Does high self-esteem cause better
performance, interpersonal success, happiness, or healthier lifestyles? Psychological Science in the Public Interest,
4, 1–44.
Das, E. H. H. J., de Wit, J. B. F., & Stroebe, W. (2003). Fear appeals motivate acceptance of action
recommendations: Evidence for a positive bias in the processing of persuasive messages. Personality & Social
Psychology Bulletin, 29(5), 650–664.
Emurian, H. H. (2009). Teaching Java: Managing instructional tactics to optimize student learning. International
Journal of Information & Communication Technology Education, 3(4), 34–49.
Gershoff, E. T. (2002). Corporal punishment by parents and associated child behaviors and experiences: A
meta-analytic and theoretical review. Psychological Bulletin, 128(4), 539–579.
Gorn, G. J. (1982). The effects of music in advertising on choice behavior: A classical conditioning
approach. Journal of Marketing, 46(1), 94–101.
Hardin, G. (1968). The tragedy of the commons. Science, 162, 1243–1248.
Hawkins, D., Best, R., & Coney, K. (1998). Consumer Behavior: Building Marketing Strategy (7th ed.). Boston,
MA: McGraw-Hill.
Hulleman, C. S., Durik, A. M., Schweigert, S. B., & Harackiewicz, J. M. (2008). Task values, achievement goals,
and interest: An integrative analysis. Journal of Educational Psychology, 100(2), 398–416.
Kohn, A. (1993). Punished by rewards: The trouble with gold stars, incentive plans, A’s, praise, and other bribes.
Boston, MA: Houghton Mifflin and Company.
Lepper, M. R., Greene, D., & Nisbett, R. E. (1973). Undermining children’s intrinsic interest with extrinsic reward:
A test of the “overjustification” hypothesis. Journal of Personality & Social Psychology, 28(1), 129–137.
McGlynn, S. M. (1990). Behavioral approaches to neuropsychological rehabilitation. Psychological Bulletin, 108,
420–441.
Miller, N., & Dollard, J. (1941). Social learning and imitation. New Haven, CT: Yale University Press.
Pedalino, E., & Gamboa, V. U. (1974). Behavior modification and absenteeism: Intervention in one industrial
setting. Journal of Applied Psychology, 59, 694–697.
Perloff, R. M. (2003). The dynamics of persuasion: Communication and attitudes in the 21st century (2nd ed.).
Mahwah, NJ: Lawrence Erlbaum Associates.
Poundstone, W. (1992). The prisoner’s dilemma. New York, NY: Doubleday.
Rotter, J. B. (1945). Social learning and clinical psychology. Upper Saddle River, NJ: Prentice Hall.
Ryan, R. M., & Deci, E. L. (2002). Overview of self-determination theory: An organismic-dialectical perspective.
In E. L. Deci & R. M. Ryan (Eds.), Handbook of self-determination research (pp. 3–33). Rochester, NY: University
of Rochester Press.
Schemer, C., Matthes, J. R., Wirth, W., & Textor, S. (2008). Does “Passing the Courvoisier” always pay off?
Positive and negative evaluative conditioning effects of brand placements in music videos. Psychology &
Marketing, 25(10), 923–943.
Simek, T. C., & O’Brien, R. M. (1981). Total golf: A behavioral approach to lowering your score and getting more
out of your game. New York, NY: Doubleday & Company.
Skinner, B. F. (1965). The technology of teaching. Proceedings of the Royal Society B Biological Sciences,
162(989), 427–443.
Watson, J. B. (1930). Behaviorism (Rev. ed.). New York, NY: Norton.
Witte, K., & Allen, M. (2000). A meta-analysis of fear appeals: Implications for effective public health
campaigns. Health Education & Behavior, 27(5), 591–615.
Image Attributions
Figure 8.9: Adapted from Lepper, Greene, & Nisbett (1973).
Long Descriptions
Figure 8.9 long description: Undermining intrinsic interest.
                   First Session   Second Session
Expected award     17              8
No award           15              16
Unexpected award   17              17
[Return to Figure 8.9]
Figure 8.10 long description: The Prisoner’s Dilemma. If both Malik and Frank don’t confess, they each get three
years in prison. If only one of them confesses, the confessor gets no years in prison while the person who did not
confess gets 30 years in prison. If they both confess, they each get 10 years in prison. [Return to Figure 8.10]
8.5 Chapter Summary
Classical conditioning was first studied by physiologist Ivan Pavlov. In classical conditioning, a person or animal
learns to associate a neutral stimulus (the conditioned stimulus, or CS) with a stimulus (the unconditioned stimulus,
or US) that naturally produces a behaviour (the unconditioned response, or UR). As a result of this association, the
previously neutral stimulus comes to elicit the same or similar response (the conditioned response, or CR).
Classically conditioned responses show extinction if the CS is repeatedly presented without the US. The CR may
reappear later in a process known as spontaneous recovery.
Organisms may show stimulus generalization, in which stimuli similar to the CS may produce similar behaviours,
or stimulus discrimination, in which the organism learns to differentiate between the CS and other similar stimuli.
Second-order conditioning occurs when a second CS is conditioned to a previously established CS.
Psychologist Edward Thorndike developed the law of effect: the idea that responses that are reinforced are “stamped
in” by experience and thus occur more frequently, whereas responses that are punished are “stamped out” and
subsequently occur less frequently.
B. F. Skinner expanded on Thorndike’s ideas to develop a set of principles to explain operant conditioning.
Positive reinforcement strengthens a response by presenting something pleasant after the response, and negative
reinforcement strengthens a response by reducing or removing something unpleasant. Positive punishment weakens
a response by presenting something unpleasant after the response, whereas negative punishment weakens a response
by reducing or removing something pleasant.
Shaping is the process of guiding an organism’s behaviour to the desired outcome through the use of reinforcers.
Reinforcement may be either partial or continuous. Partial-reinforcement schedules are determined by whether the
reward is presented on the basis of the time that elapses between rewards (interval) or on the basis of the number
of responses that the organism engages in (ratio), and by whether the reinforcement occurs on a regular (fixed) or
unpredictable (variable) schedule.
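The two distinctions (ratio versus interval, fixed versus variable) can be sketched in a few lines of Python. This is a minimal illustration rather than a model of any particular experiment: the function names, the ratio of 3 responses, the 10-second interval, and the ranges used for the variable schedules are all assumptions chosen for the example.

```python
import itertools
import random


def ratio_schedule(requirements):
    """Reinforce after a required number of responses.

    `requirements` is an iterator of response counts: a constant stream gives
    a fixed-ratio schedule, a random stream gives a variable-ratio schedule.
    """
    needed = next(requirements)
    count = 0

    def respond():
        nonlocal needed, count
        count += 1
        if count >= needed:
            count = 0
            needed = next(requirements)
            return True  # reinforcer delivered
        return False

    return respond


def interval_schedule(intervals):
    """Reinforce the first response made after the required time has elapsed.

    `intervals` is an iterator of waiting times: a constant stream gives a
    fixed-interval schedule, a random stream gives a variable-interval schedule.
    """
    wait = next(intervals)
    last_reward_time = 0.0

    def respond(now):
        nonlocal wait, last_reward_time
        if now - last_reward_time >= wait:
            last_reward_time = now
            wait = next(intervals)
            return True  # reinforcer delivered
        return False

    return respond


rng = random.Random(0)  # seeded so the variable schedules are repeatable

fixed_ratio = ratio_schedule(itertools.repeat(3))
variable_ratio = ratio_schedule(iter(lambda: rng.randint(1, 5), None))
fixed_interval = interval_schedule(itertools.repeat(10.0))
variable_interval = interval_schedule(iter(lambda: rng.uniform(5.0, 15.0), None))

# Nine responses on the fixed-ratio schedule: every third one is reinforced.
fr_results = [fixed_ratio() for _ in range(9)]

# Responses at these times (in seconds) on the fixed-interval schedule.
fi_results = [fixed_interval(t) for t in (3, 9, 12, 15, 23)]

print(fr_results)
print(fi_results)
```

On the fixed schedules the reinforcer arrives predictably (every third response, or for the first response once 10 seconds have passed), whereas on the variable schedules the requirement changes after each reinforcer, so the organism cannot tell which response will pay off.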
Not all learning can be explained through the principles of classical and operant conditioning. Insight is the sudden
understanding of the components of a problem that makes the solution apparent, and latent learning refers to
learning that is not reinforced and not demonstrated until there is motivation to do so.
Learning by observing the behaviour of others and the consequences of those behaviours is known as observational
learning. Aggression, altruism, and many other behaviours are learned through observation.
Learning theories can be and have been applied to change behaviours in many areas of everyday life. Some
advertising uses classical conditioning to associate a pleasant response with a product.
Rewards are frequently and effectively used in education but must be carefully designed to be contingent on
performance and to avoid undermining interest in the activity.
Social dilemmas, such as the prisoner’s dilemma, can be understood in terms of a desire to maximize one’s
outcomes in a competitive relationship.