Chapter 5: Learning — Master Introductory Psychology

Learning

Classical conditioning, operant conditioning, and cognitive learning — how experience changes behavior and shapes who we become.



On a common-sense level we all know what learning is, but how can we come up with a precise definition of learning? How can we measure it? As a teacher, I wonder every day whether my students have learned something, because I can't really get inside their heads to find out. In school situations, we generally depend on observation to determine whether learning has occurred. We don't just take your word that you've learned something; we want to see evidence. This is very similar to how behavioral psychologists of the 1920s to 1950s thought about learning. They believed that the inner workings of the mind weren't important; what mattered was observable behavior. With this emphasis on behavior, we can define learning as a relatively long-lasting change in behavior that results from our experience with the world. Because of this emphasis on behavior in assessing learning, the conditioning theories we'll learn about in this chapter are often collectively referred to as behaviorism. Behavior is not the only way of assessing learning, and as we'll see at the end of this chapter, research eventually moved away from a strict emphasis on behavior. But before we get ahead of ourselves, let's look to the start of behaviorist psychology with the work of Ivan Pavlov.


Classical Conditioning

Chances are you’ve heard someone mention “Pavlov’s dog” or something being a “Pavlovian response”, so let’s take a look at the details of some of Pavlov’s work and the key terms for discussing what is known as classical conditioning: a type of learning in which a neutral stimulus becomes associated with a meaningful stimulus, eventually producing a similar response.

Ivan Pavlov was a Russian physiologist studying digestion in dogs (for which he won the Nobel Prize in 1904) who noticed an interesting phenomenon. Pavlov had been collecting saliva samples from dogs when food was presented to them, but he noticed that the dogs began salivating even before the food was presented (like while the researcher was preparing the food). It was as if the dogs knew that food was on the way.

Pavlov then wanted to see if dogs could also learn to expect food (shown by their salivation) following a particular stimulus like a bell, a metronome, a light, etc. In the most well-known version of his experiments, he rang a bell prior to the presentation of food. Initially the bell was a meaningless sound, but by repeatedly following that sound with food, Pavlov was able to teach the dogs to salivate whenever the bell rang.

In this case, the bell started as a neutral stimulus (NS), meaning that it didn't elicit a response on its own. If you go up to a random dog on the street and ring a bell, there isn't a specific response that will occur. The food, however, would be an unconditioned stimulus (US), because it doesn't need to be conditioned, or taught, to the dog. If you give food to any random dog, a predictable response will occur: salivation. This response, salivating to food, hasn't been taught, so it's called an unconditioned response (UR). Conditioning consists of repeatedly presenting the neutral stimulus followed by the unconditioned stimulus, which will automatically cause the unconditioned response.

After enough training, the neutral stimulus will be able to elicit a predictable response all by itself. The previously neutral stimulus can now be called a conditioned stimulus (CS). The response that has been taught, salivating to the sound of a bell, is now called a conditioned response (CR); it resembles the unconditioned response but is typically weaker. When the dog has learned to salivate to the sound of the bell, we can say that acquisition has occurred. Here’s a summary of the steps in classical conditioning:

Before Conditioning

Neutral Stimulus (bell) alone: no response

Unconditioned Stimulus (food) alone: Unconditioned Response (salivate)

The Conditioning Process

Neutral Stimulus (bell) then Unconditioned Stimulus (food): Unconditioned Response (salivate)

After repeating this several times, the Neutral Stimulus becomes a Conditioned Stimulus, which means...

After Conditioning

Conditioned Stimulus (bell): Conditioned Response (salivate to the bell)

If we stop pairing the bell with food, this conditioned response won't continue forever. Eventually, the dog will stop responding to the stimulus, and we can say that extinction has occurred. But just because the dog has stopped responding to the stimulus doesn't necessarily mean that the dog has completely forgotten the learning. In fact, Pavlov found that after extinction the conditioned response would reappear after a rest period of about a day. This is referred to as spontaneous recovery, and it shows us that the dog hasn't forgotten the association; he has just temporarily stopped responding. In addition, if this dog were trained with bell/food pairings again in the future, he would relearn the association more quickly than a dog without any prior conditioning.
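The acquisition and extinction pattern described above can be illustrated with a toy simulation. This is a minimal sketch: the exponential update rule, the asymptote, and the learning-rate value are illustrative assumptions, not Pavlov's data.

```python
# Toy model of acquisition and extinction. Associative strength V
# moves a fraction alpha of the way toward an asymptote lam on each
# trial: during acquisition lam = 1 (bell paired with food), during
# extinction lam = 0 (bell alone). All parameter values are assumed.

def run_trials(v, n_trials, lam, alpha=0.3):
    """Update associative strength v over n_trials toward asymptote lam."""
    history = []
    for _ in range(n_trials):
        v += alpha * (lam - v)
        history.append(round(v, 3))
    return v, history

v = 0.0
v, acq = run_trials(v, 10, lam=1.0)   # acquisition: bell followed by food
v, ext = run_trials(v, 10, lam=0.0)   # extinction: bell presented alone
print("after acquisition:", acq[-1])  # climbs toward 1.0
print("after extinction:", ext[-1])   # decays back toward 0.0
```

Spontaneous recovery is the observation that after extinction the response returns partially on its own, which this simple one-variable model deliberately does not capture; it only shows the smooth rise and fall of responding within each phase.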

After teaching the dogs to salivate to the sound of a bell, Pavlov found that they also salivated to the sound of a similar bell (playing a slightly different tone). This became known as stimulus generalization: after conditioning, the conditioned response can occur in the presence of stimuli that are similar to the conditioned stimulus. As an interesting side note, Pavlov found that the response to other tones weakened as the tone frequencies got farther and farther from the original stimulus, but then increased when reaching an octave, indicating that dogs perceive octaves as sounding like the “same” note just as we do.

With more conditioning, Pavlov discovered that he could teach the dogs to respond only to a particular stimulus. For example, by always presenting the food in the presence of a particular tone but never in the presence of a second (different) tone, Pavlov could eventually condition the dogs to only salivate to the first tone. This ability to differentiate two similar stimuli is known as stimulus discrimination.

Pavlov also found that after dogs had been conditioned to salivate to a bell, he could get them to salivate to a light by turning it on before the bell, even though the light was never directly paired with food. This is known as second-order conditioning or higher-order conditioning, in which an established conditioned stimulus is used to condition a new neutral stimulus. Because the dogs had learned that the bell meant food was coming, then learned that the light meant that the bell was coming, they would salivate to the light.

Pavlov wasn't the only one conducting experiments in classical conditioning. John B. Watson and Rosalie Rayner conducted a study in 1920 showing that classical conditioning could be used to teach fear. In an ethically questionable procedure, Watson and Rayner presented a white rat to an 8-month-old baby (known as “Little Albert”), then hit a metal bar with a hammer, creating a loud, startling noise which made Albert cry. After repeated pairings of the rat with the loud noise, they found that Albert would show distress when the rat was presented, suggesting that fears could develop via classical conditioning; this is known as aversive conditioning. Albert also showed stimulus generalization, because he showed distress not only to rats but also to other furry objects like a rabbit and a white-bearded mask that Watson wore.


Biological Aspects of Conditioning

There are biological constraints on which types of associations can be learned via classical conditioning. While studying the effects of radiation on rats, John Garcia and colleagues noticed that the rats refused to drink water from the dishes in their cages following radiation exposure. The radiation was making the rats sick, and apparently the rats were automatically associating this illness with the water they had drunk prior to exposure. This suggested that the rats had a biological predisposition to learn relationships between food and illness. Garcia and Robert Koelling conducted later studies teaching rats to associate a noise with an electric shock which followed, and to associate flavored water with subsequent nausea, but they discovered that they couldn't teach rats to associate noise with nausea, or flavored water with an electric shock. This demonstrates that we have a biological preparedness for certain types of learning: some pairings will be learned more easily than others. If we consider an evolutionary approach to understanding this, it makes sense that we should be able to quickly learn to associate illness with food (so we can avoid that food in the future), while associating nausea with noise would be less practical. This learned taste aversion doesn't quite follow the usual rules of classical conditioning, because there may be only a single pairing of a food with illness, as well as a long delay between the two, and yet the learning can still occur and is highly resistant to extinction.

If you've seen Stanley Kubrick's A Clockwork Orange (based on the novel by Anthony Burgess), you might recall the scene in which Alex is taught to avoid violence via classical conditioning. He's given an injection that makes him feel sick (though he was told it was a vitamin supplement) and is then forced to watch violent films. The goal is for him to be conditioned to associate violence with illness, thus avoiding violence in the future. While it may sound like a good idea in theory, in practice Alex would have a biological predisposition to associate his illness with whatever food had been served to him hours earlier and not with the violent films he was forced to watch.


Operant Conditioning

Though classical conditioning provided some explanation for the formation of associations between stimuli and automatic responses, it didn’t provide much insight on the development of voluntary behaviors. In studies of classical conditioning, the organism being studied is rather passive. Pavlov’s dogs simply stood around waiting for bells to ring, food to be presented, etc. They weren’t exploring the environment, searching for food, or interacting with stimuli. In order to understand these real-life behaviors, we need to look to another type of conditioning, in which the organism operates on the environment and experiences the consequences of its behavior.

To begin, we’ll look to one of the first researchers in this area, Edward Thorndike. Thorndike placed a hungry cat in a box, then placed food outside the box. Inside the box was a lever which, when pressed, would cause the door to open, allowing the cat access to the food dish. Thorndike would repeatedly place cats in this “puzzle box”, and observe how long they took to escape. Thorndike found that the cats learned that pressing the lever opened the door and they became successively faster with each trial. Initially they may have stumbled upon the lever accidentally, but after repeated placement in the box, the cats had learned how to escape.

After observing his cats escaping from these puzzle boxes, Thorndike postulated the simple Law of Effect, stating that behavior which is followed by positive consequences is more likely to be repeated. In this case, pressing the lever gave the hungry cats access to food, so this lever pressing would be more likely to be repeated. Thorndike referred to this as instrumental learning: learning to perform behaviors which bring positive consequences.

Thorndike’s research examining the relationship between behavior and consequences laid the groundwork for B.F. Skinner's research (if your name was Burrhus Frederic, you'd prefer to go by B.F. as well). Skinner examined behaviors and their outcomes in greater detail, believing that nearly all behaviors could be explained by their associations with rewards or punishments.

Skinner referred to a desirable consequence of a behavior as reinforcement, because it reinforced to the animal that this was a behavior that should be repeated in the future. Reinforcement could come in many forms, from things which satisfy basic biological drives like food or warmth (known as primary reinforcers), to things like grades, stickers, or money (known as secondary reinforcers), which are rewarding due to learned associations (using money to buy food, being praised for good grades, etc.).

Skinner differentiated between positive reinforcement, in which an organism receives something desirable (like food), and negative reinforcement, in which an organism has something undesirable removed (like turning off electric shocks or loud noises). Imagine you have a terrible headache. You perform the behavior of taking a pill and your headache goes away. This will make it more likely that you'll take a pill in the future (whenever you have a headache), and so the behavior of pill-taking is being reinforced. It’s important to note that both types of reinforcement (positive and negative) are used to encourage a behavior so that it will be repeated. This is a commonly-confused concept of conditioning, particularly in popular culture, as films and television shows have often mistakenly referred to negative reinforcement when they should have referred to punishment.

Some consequences reduce the likelihood of a behavior being performed in the future. This is punishment. There are also two main types of punishment. Positive punishment is receiving something unpleasant. So when you touch a button and receive a painful electric shock, you'll be less likely to touch that button in the future. Negative punishment (also known as omission training) is when something desirable is taken away as a result of a behavior. So a police officer issuing a fine (taking away money) for speeding is a negative punishment designed to reduce the behavior of speeding in the future.

While punishment can be effective in quickly stopping behavior (like if you press a button and receive a shock), one of the problems it has is that it doesn't provide any information about which behaviors are desirable. In addition, punishments often don't follow immediately after the undesirable behaviors, and therefore may not actually be associated with those behaviors. From the example above, rather than reducing speed, in the future a driver may simply drive on other roads in an attempt to avoid police (associating the fine with getting caught, not with speeding).


The “Skinner Box”

Skinner studied the different effects of reinforcement by placing animals like rats or pigeons inside what he called “operant boxes” (often referred to by others as “Skinner boxes”) which offered food rewards for behaviors like lever pressing or disc pecking. These boxes tracked how often the animal performed the behavior, and this data could clearly demonstrate increases in behavior based on reinforcement.

Skinner also found that behaviors could be successfully linked together in what is known as chaining. For instance, a pigeon might be reinforced for pushing a lever and also reinforced for pecking a disc, and then these two behaviors could be chained together so the pigeon is only rewarded for pecking the disc and then pressing the lever.

For teaching more complex behaviors, Skinner would reinforce successive approximations of the desired behavior, a process known as shaping. If you want to teach a rat how to play basketball you can’t simply wait around for the rat to pick up a ball and place it into a hoop then reinforce this behavior. A behavior this complex won’t just suddenly occur on its own (like pecking a disc or pressing a lever might) and therefore it needs to be gradually developed. In this particular case, a rat might first be reinforced just for touching a ball. Then touching alone would no longer be good enough, and the rat would only be rewarded for actually picking up the ball, then only for carrying it to one side of the cage, then only for placing it inside a basket, and so on until the rat is only rewarded for completing the full behavior (picking up the ball, carrying it to the raised hoop, and dropping the ball in).

This process of shaping applies to human behaviors as well. If we want kids to learn calculus, we don't sit around just waiting for them to solve an integral equation so we can reward them. First we reward counting, then addition, subtraction, multiplication, algebra, etc. until finally we only reward the complex behavior (Mom probably doesn't praise you for adding 3+4 like she used to, now it seems you've got to bring home an A in multivariate calculus to get any rewards).

By closely monitoring the occurrence of behaviors and the frequency of rewards, Skinner was able to look for patterns. Receiving a reward each time the lever is pressed would be an example of continuous reinforcement. But Skinner also wanted to know how behavior might change if the reward wasn't always present. This is known as intermittent reinforcement (or partial reinforcement). By tracking the accumulated behavioral responses of animals in his operant boxes over time, Skinner could see how different reward schedules influenced the timing and frequency of behavior. Though each of these approaches could be varied in countless ways, there were four general types of schedules that Skinner tested.


Schedules of Reinforcement

Fixed-Ratio (The Vending Machine)

A fixed-ratio schedule follows a consistent pattern of reinforcing a certain number of behaviors. This may come in the form of rewarding every behavior (1:1) or only rewarding every 5th response (5:1), according to some set rule. Just as nobody continuously feeds coins to a broken vending machine, when the set ratio is violated (like when a lever press no longer delivers food), animals quickly learn to reduce their behavior.

Variable-Ratio (The Slot Machine)

A variable-ratio schedule rewards a particular behavior but does so in an unpredictable fashion. The reinforcement may come after the 1st lever press or the 15th, and then may follow immediately with the next press, or perhaps not for another 10 presses. The unpredictable nature of a variable-ratio schedule can lead to a high frequency of behavior, as the animal (or human) may believe that the next press will “be the one” that delivers the reward.

This is the type of reinforcement seen in gambling, as each next play could provide the big payoff. Skinner found that behaviors rewarded with a variable-ratio schedule were most resistant to extinction. To illustrate this, consider a broken vending machine (fixed ratio) versus a broken slot machine (variable-ratio). How long would you keep putting money into a broken vending machine? You'd probably give up after your first or maybe second try didn't result in a delicious Snickers bar. But now imagine playing a slot machine that is broken and unable to pay out (though everything else appears to be working). You might play 15 times or more before you cease your coin-inserting and button-pressing behavior.

Fixed-Interval (The Paycheck)

In a fixed-interval schedule, reinforcement for a behavior is provided only at fixed time intervals. The reward may be given after 1 minute, every 5 minutes, once an hour, etc. What Skinner found when implementing this schedule was that the frequency of behavior would increase as the time for the reward approached (ensuring that the animal gets the reward), but would then decrease immediately following the reward, as if the animal knew that another reward wouldn’t be arriving any time soon.

This may be of concern for human fixed-interval situations like biweekly or monthly paychecks, as work effort may be reduced immediately after a paycheck has been received (just as most students reduce studying effort in the days immediately following exams, because the next exams aren't coming for a while).

Variable-Interval (The Pop-Quiz)

In a variable-interval schedule, reinforcement of a behavior is provided at a varying time interval since the last reinforcement. This means a pigeon might be rewarded for pecking after 10 seconds, or it might be rewarded after 1 minute, then after 5 minutes, then 5 seconds; the time interval between reinforcements is always changing. This schedule produces a slow and steady rate of response. The pigeon pecks steadily so it doesn't miss any opportunities for reinforcement, but there's no need to rush, since rushing won't influence the length of the delays.

A human comparison might be a class with pop-quizzes for extra credit given at varying and unpredictable times. These would encourage students to study a little each day to always be prepared to earn some points, though they probably wouldn't cram for hours and hours every night.
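The four schedules described in this section can be sketched as simple reward-delivery rules. This is a hypothetical illustration; the ratio and interval values are arbitrary choices, not Skinner's parameters.

```python
import random

# Each function answers one question: "would this response be rewarded?"
# Ratio schedules count responses; interval schedules watch the clock.

def fixed_ratio(n_responses, ratio=5):
    """Reward every `ratio`-th response (responses numbered from 1)."""
    return n_responses % ratio == 0

def variable_ratio(mean_ratio=5, rng=random):
    """Reward each response with probability 1/mean_ratio, so rewards
    arrive after an unpredictable number of responses on average."""
    return rng.random() < 1.0 / mean_ratio

def fixed_interval(now, last_reward_time, interval=60.0):
    """Reward the first response made after `interval` seconds elapse."""
    return now - last_reward_time >= interval

def variable_interval(now, next_reward_time):
    """Reward the first response after an unpredictable deadline passes;
    the caller re-randomizes next_reward_time after each reward."""
    return now >= next_reward_time

# Fixed-ratio example: the 5th, 10th, and 15th responses are rewarded.
rewarded = [n for n in range(1, 16) if fixed_ratio(n)]
print(rewarded)  # [5, 10, 15]
```

In a real operant chamber the interval schedules still require a response after the deadline; these functions only decide whether a given response would earn the reward, which is enough to see why ratio schedules tie reward to effort while interval schedules tie it to timing.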

Superstitious Minds

Skinner also tried rewarding the animals at random, dropping food into the box at unpredictable times that didn’t correspond to any particular desired behavior. Rather than doing nothing and just waiting for the food to arrive, the animals who were rewarded randomly developed bizarre “superstitious” behaviors.

If the animal was lifting a leg or turning his head in the moment preceding the reward, this behavior would be reinforced, making it more likely to be repeated. If, by chance, this behavior was repeated as the reward was delivered again (randomly), this would further serve to reinforce the behavior. As a result, Skinner found pigeons turning in circles or hopping on one leg, simply as a result of this random reinforcement. From this we may view all sorts of superstitious human behaviors, from rain dances to lucky charms to salt thrown over the shoulder, as the result of chance occurrences of reinforcement.


Behavior Reinforcing Behavior

David Premack did research with monkeys which suggested that some behaviors are naturally more desirable than others, and they could therefore be considered to be high-probability behaviors. This is known as the relativity theory of reinforcement or the Premack Principle. In essence it means that we can use preferences for one behavior as reinforcement for another behavior, such as letting a child play video games (presumably a high-probability behavior) only after finishing a number of math problems (presumably a low-probability behavior).

So if your mom ever insisted that you could only play after you finished your homework, now you know she was actually applying this Premack principle, attempting to use a desirable behavior (game playing) to reinforce a less desirable behavior (doing homework). Of course, preferences for some behaviors may vary between individuals, which will influence which behaviors can be used as reinforcement. Some children may love solving math problems and hate playing video games, so our strategy above wouldn't work well for them.


Biological Limits to Learning

Earlier we saw Garcia and Koelling's research on food aversions demonstrating that biological drives influence the associations that can be formed in classical conditioning. Keller and Marian Breland also found that they weren't able to train animals to perform certain tasks which conflicted with biological instincts. For instance, rather than carrying coins to a piggy bank, they found that pigs would repeatedly “root” the coins, pushing them around with their snouts as if the coins were food. Similarly, raccoons would repeatedly rub coins together and attempt to wash them with their paws, something that they usually do with food prior to eating. Both of these demonstrate instinctual drift, which is when a behavior being conditioned is similar to an instinctual behavior, and as a result, the instinct takes over and prevents the conditioning from being acquired properly.

Cognitive Learning

The strict focus on observable behavior couldn't last forever and eventually psychologists began considering the cognitive elements of learning more closely. Over time, evidence began to accumulate that we could actually study cognitive elements in a scientific manner, and as a result, psychology shifted away from its emphasis on behavior and started on what is often referred to as the “cognitive revolution”. In the following two chapters we'll look at cognition more closely, but before we do so, let's look at some of the theories that shifted the tide away from the behaviorist view and towards a cognitive approach.

Beyond Associations: The Contingency Model of Classical Conditioning

You might wonder, if Pavlov's dogs knew to expect food during his bell experiments, why didn't they always salivate to Pavlov's mere presence? Why weren't they just salivating as soon as he walked into the room and started setting up his bells, lights, and metronomes? There were innumerable things that the dogs could have associated with food (the presence of Pavlov, the dog happening to wag his tail before food arrived, the clock on the wall, the clipboard on the table for Pavlov to record data, and so on). This suggests that while the dogs may have been learning that all of these other stimuli were associated with food, mere association wasn't enough to cause salivation.

The Rescorla-Wagner model suggests that a dog in Pavlov's experiment wasn't just learning an association between the bell and the food, but was really learning that the bell is a reliable predictor of food. Part of the reason the bell is a reliable predictor is because the bell is salient. Pavlov may be standing there the whole time, so the dog may look at him without food arriving, or may not be looking at him just before food arrives, but we can be sure that the bell will at least temporarily capture the dog's attention just before the food. Rescorla and Wagner's contingency model of conditioning emphasizes that we don't just want to learn associations, we want to know what these associations mean and how accurate our predictions about the world will be.
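The intuition above has a standard mathematical form. In the Rescorla-Wagner model, every cue present on a trial shares a single prediction error, so a salient cue like the bell soaks up most of the associative strength while background cues like Pavlov's presence gain far less. A minimal sketch follows; the salience and learning-rate values are illustrative assumptions, not fitted parameters.

```python
# Rescorla-Wagner update (standard textbook form): on each trial, every
# present cue X changes by
#     dV_X = alpha_X * beta * (lam - V_total)
# where lam is 1 when food arrives (0 otherwise), alpha_X is the cue's
# salience, beta is a learning rate, and V_total sums the strengths of
# all cues present. The shared error term (lam - V_total) is what makes
# cues compete to predict the food.

def rw_trial(strengths, present, saliences, lam, beta=0.5):
    """Run one conditioning trial, updating `strengths` in place."""
    v_total = sum(strengths[cue] for cue in present)
    for cue in present:
        strengths[cue] += saliences[cue] * beta * (lam - v_total)

strengths = {"bell": 0.0, "context": 0.0}
saliences = {"bell": 0.8, "context": 0.1}  # assumed: the bell is salient

for _ in range(30):  # bell rings in the background context, food follows
    rw_trial(strengths, ["bell", "context"], saliences, lam=1.0)

print(strengths)  # the salient bell ends up with most of the strength
```

Because the two cues split a fixed amount of predictive value, the bell's strength approaches its share of the asymptote while the low-salience context stays weak, which is why the dog salivates to the bell and not to the room.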

Observational Learning

Imagine that you watch me walk up to a strange device you’ve never seen before. I press a button and immediately scream out in pain as I receive an electric shock. Are you willing to try pressing the button? Or perhaps I press a different button; a slot opens, and cash pours out into my hands. How likely is it that you would press this button?

According to strict theories of behaviorism, the probability of a behavior should only change when you are reinforced or punished. In this scenario, however, this view doesn’t quite work, as you haven’t actually been rewarded or punished, and in fact, you haven’t really even performed a behavior. Yet we see that your probability of performing a particular behavior has in fact been modified by your experience of watching me. You’ve learned something about the button-pressing behavior without needing to do it yourself. The idea that we can learn without actually doing led Albert BanduraAlbert BanduraPsychologist who demonstrated observational learning and developed social cognitive theory, including the concept of self-efficacy. to investigate whether something like aggressive behavior might be learned simply through observation. To study this, Bandura had children observe an adult playing with a “Bobo doll”, a large inflatable toy.

Some of the children saw adult behavior that was “aggressive”: punching, kicking, throwing, and beating the doll. Other children were not exposed to this aggressive play (instead the adult quietly played with tinker toys). Bandura wanted to see if observing this behavior had an effect on how the children later played with the toys. After observing the adult, the children were allowed to play with the toys in the room, including the Bobo doll. Bandura then observed and recorded the children’s actions, looking for mimicry of the behaviors, as well as novel “aggressive” behaviors.

It may not be surprising to learn that Bandura found children directly mimicking behaviors they had observed. Children exposed to the “aggressive” adult play were more likely to perform the “aggressive” behaviors themselves, suggesting that they had learned aggression via what Bandura called modeling.

In later versions of the study, Bandura rewarded (with candy) or punished (via scolding) the adult models to see whether this influenced the children's subsequent behavior. Because the children experienced these consequences secondhand, through observation rather than directly, Bandura referred to this as vicarious reinforcement. He found that children could demonstrate the effects of reinforcement (or punishment) on behavior simply by watching another person be rewarded (or punished) for performing that behavior. This has important implications for understanding how children can learn so many behaviors so quickly: they don't need to experience every consequence through trial-and-error in order to learn appropriate behavior.

While it may seem obvious that children learn by observing and mimicking behavior, Bandura's study demonstrated that direct reinforcement wasn't necessary for learning, which also implies that there were cognitive elements at play while children were watching the adult's behavior.

Latent Learning

Edward Tolman conducted research in the 1930s and 40s, though the dominance of the behaviorist approach to understanding learning at that time meant that his work was largely ignored until the 1960s. In one classic study, Tolman and Charles Honzik (1930) divided rats into three conditions and had them complete the same maze repeatedly over the course of 17 days. They monitored each rat and counted the number of errors it made in completing the maze.

Tolman's latent learning experiment — rats that explored a maze without reward learned its layout and performed as well as rewarded rats once food was introduced.

For the first group of rats, there was a food reward at the end of the maze. As we would expect from operant conditioningoperant conditioningA type of learning in which behavior is shaped by its consequences — reinforcement increases it, punishment decreases it., these rats quickly learned to run to the end of the maze to get their reward, and they gradually made fewer and fewer errors over the course of the 17 days. For the next group of rats, there was no food reward at the end of the maze. Again, without reinforcement, behaviorist theory would suggest that the rats wouldn't learn anything, and not surprisingly, these rats wandered around the maze day after day, and the number of errors they made didn't decrease much over the 17 days.

Cognitive maps in navigation — we build flexible mental representations of space that allow novel routes rather than memorizing fixed sequences of turns.

So far you may be wondering why Tolman's work is still remembered today. As you might guess, the interesting part of the study comes with the third group of rats. This group did not have a food reward at the end of the maze for the first 10 days, then for the last 7 days a food reward was added. A behaviorist might predict that these rats would wander for 10 days, then gradually start learning beginning on the 11th day. But this is not what happened! While these rats did wander aimlessly for the first 10 days, once the food reward was available they rapidly reduced the number of errors they made, and in the last few days these rats actually performed better than the rats in the first group, who had been rewarded all along.

This suggests that the rats in the third group actually were learning during those first 10 days; they just weren't demonstrating their learning yet because there was no incentive to do so. Tolman referred to this as latent learninglatent learningLearning that occurs without reinforcement and is not immediately expressed in behavior — demonstrated by Tolman's maze studies.. This is learning that occurs without reinforcement and isn't demonstrated until there is an incentive or reward.

This type of learning isn't just for rats in mazes, and in fact you probably do it all the time. Teachers rely on the fact that students engage in latent learning during classes and lectures. We assume that between when class started and when class ended you actually learned something, even though we may not have provided any way for you to demonstrate this learning. Just like the rats who waited 10 days before showing how well they knew the maze, you probably have waited weeks in order to demonstrate all the learning leading up to that midterm or final exam.

In another study, Tolman repeatedly placed a rat into a simple maze with only one route available and other routes blocked. The rat would run straight, turn left, then right, then turn right again, then run down a long hallway which ended with a food reward. After learning this maze, the rat was then placed in a similar maze, except the first straight path was blocked and several hallways were available in other directions. A behaviorist would expect the rat to either choose the most similar hall (almost straight) or perhaps to turn left, since that was the first turn it usually made. Instead, the rat showed that it actually knew the general location of the food in relation to the starting point, and correctly turned down the path angling off to the right and leading to the food, even though this angled right turn had never been rewarded in the past. This suggests that the rat had learned a cognitive mapcognitive mapA mental representation of the layout of an environment, acquired without direct reinforcement., a mental representation of the food's location, rather than just the behaviors (i.e. turn left, turn right, turn right) which had been rewarded in the past.

Humans also use cognitive maps, and this is easily demonstrated if you imagine arriving at a restaurant. While you wait to be seated, you ask to use the restroom and are instructed to walk straight, turn left, turn right, then go to the end of the hallway. Later in the evening, after being seated at your table, you need to use the restroom again. Instead of following the specific previous behaviors (turn left, turn right), you now need to perform new behaviors to get to the restroom (walk straight, turn left) because you are starting from your table. Rather than becoming hopelessly lost and soiling your pants, you use a cognitive map of your position in relation to the location of the bathroom to figure out a new route with relatively little effort.

Abstract Learning

We don't only learn specific behaviors tied to specific situations or stimuli. We also learn more complex concepts like what a tree is (we can understand that a willow is a tree and a palm tree is a tree, even though they look quite different). Similarly, we're capable of learning other abstract concepts like love, humility, or pride and these influence our thoughts and behavior even though they may not be readily observable.

While humans may excel at it, the ability to learn concepts may not be limited to our species. In fact, research has shown that even pigeons can demonstrate this type of abstract learning. After being taught to peck a picture of a chair, pigeons could also peck at pictures of chairs that they hadn't seen before. While this may seem like mere stimulus generalization, their ability to correctly respond to a novel stimulus suggests the possibility that pigeons actually have a cognitive understanding of the concept of a chair. In a study by Shigeru Watanabe, Junko Sakamoto, and Masumi Wakita (1995), pigeons were taught to differentiate between Picasso and Monet paintings and later could do so with paintings from these artists that they had not seen before. This implies that there are important cognitive elements in their learning process that can't be fully understood through simple behavioral observation.

Insight Learning

Wolfgang Köhler studied learning in chimps by presenting them with puzzles to solve in order to obtain bananas. Bananas were suspended high above the ground or placed out of reach behind a fence, and chimps needed to stack boxes to stand on or use sticks to drag the bananas closer. Once the chimps had figured out what to do, they worked at that solution (such as stacking boxes) until they were able to perform it successfully. Köhler observed that the behavior of the chimps differed from that of Thorndike's cats, who discovered the solution to the puzzle box by simple trial-and-error. It wasn't the case that the chimps simply stumbled onto the solution; instead, they mentally worked out a solution and then acted in a purposeful way.

Rather than a gradual strengthening of a stimulus-response association, the learning in this case seemed to appear in the form of a sudden realization, which Köhler considered to be an example of insight. After their initial failures to reach the bananas, the chimps seemed to contemplate solutions, rather than constantly trying new behaviors. This suggests an internal cognitive learning process that is occurring without the need for repeated trial-and-error behavior.

End of an Era

The accumulating evidence of biological and cognitive aspects of learning gradually led to the decline of behaviorism as the dominant approach in psychology. In the following two chapters on memory, language, and cognition, we'll attempt to better understand the internal workings of the mind and how our mental representations of the world influence our thoughts, emotions, and behaviors.

Chapter Summary

Key takeaways — Chapter 5
  • Classical conditioning refers to learning which results from the repeated pairing of stimuli. A neutral stimulus is followed by an unconditioned stimulus (which generates an unconditioned response) repeatedly, until the neutral stimulus alone causes a response (called a conditioned response), at which point the neutral stimulus is referred to as a conditioned stimulus.
  • In operant conditioning an organism learns to associate rewards and punishments with particular behaviors. Positive and negative reinforcement both encourage a behavior either by giving something desirable or taking away something undesirable, while positive and negative punishment discourage a behavior by giving something undesirable or taking away something desirable.
  • Reinforcement in operant conditioning can be given according to different schedules including fixed-ratio, variable-ratio, fixed-interval, and variable-interval.
  • We have biological predispositions for some types of learning such as associating taste with nausea. Instinctual drift can also prevent some behaviors from being learned properly.
  • Increasing evidence for cognitive components of learning from studies of latent learning, observational learning, abstract learning, and insight learning eventually led researchers away from a strict emphasis on behavior for understanding learning.

Review Questions

Chapter 5 — Learning
10 multiple choice questions