A GRAND EXPERIMENT
“The better I get to know men, the more I find myself loving dogs.”
Since we’re in the business of teaching people how to speak dog, it is vital to understand how dogs take in information. As the esteemed trainer and veterinarian Dr. Ian Dunbar points out, the dog believes it’s training you. The dog wants a piece of food or to be petted, and it sits patiently, with waiting eyes. You say, “Good doggy” and massage its head, and the dog thinks, “Good owner.” When a dog that has trained its owner wants to sniff and be petted by a friendly stranger, it will sit and wait until the obedient stranger offers a hand, followed by a pet. The dog thinks, “Good stranger, now carry on. Owner, let’s keep walking,” and off they go.
CAUSE AND EFFECT
Dogs have an outstanding handle on cause and effect because they are tireless practitioners of trial and error. They are continually performing experiments and making learned associations from the results. For example, my dog Pacino is an absolute sucker for affection and will shamelessly seek affection from anyone, anytime, anywhere. In my apartment, he has figured out that when I’m at my desk, I’m typically working. In turn, he won’t approach me for a pet when I’m sitting there. When I’m on the couch, he’ll take a few shots at me. Sometimes he wins, and sometimes he loses, but he’s figured something out. I have a comfortable armchair, and Pacino will accost anyone who gets in that chair for affection. It took me at least a year to figure out why he’s at his most tenacious over that chair, but he quickly figured out that he has a high batting average in that spot. Why? Because when I’m in that chair, I’m never working, and I’m not watching television. When I’m occupying that chair, I’m never doing anything in particular so Pacino’s pet batting average is at a high there. Moreover, guests tend to gravitate toward that chair, and no guests are turning down Pacino when they first sit. He is also allowed to climb into people’s laps on that chair, something he would not typically be allowed to do. To speak to Dunbar’s point on the dog training you, it was only after that chair officially got dubbed “the Pacino chair” that I realized what had happened. I got so programmed by my dog that I now make it clear to people who sit in that chair that they are not allowed to ignore Pacino. It is an immutable rule. This evolution took place over the course of a year or more, and I was the last to catch on.
With enough consistency in the results, a voluntary response is produced, and the dog (or the owner) continues to perform the behavior. When conditioned well, we behave consistently, and at some point a voluntary response turns involuntary. Think Pavlov’s dogs; think about a dog’s reaction as you head toward the leash closet or when it hears the rustle of dry food being unpacked. The dog’s excitement to eat or go for a walk is involuntary. These learned associations and responses work in both directions, though. Should a dog condition itself to be on alert when it hears the rumblings of a garbage truck, there is no conscious control over this response. These problems are compounded by issues the dog picks up from an unwitting owner. I trained a great little Boston terrier down in the Financial District who went ballistic every time a garbage truck passed by. The dog’s owner was a super-nice woman from Texas. Dave got her to open up some about her anxieties, and by session’s end, she had a breakthrough moment on how much she hated New York City garbage trucks. It was a hysterical purging. It all started on her second day in New York. She’d been ruthlessly hit on by a garbage man. Hailing from Texas, she was too polite to just walk on, so my man probably figured he was getting somewhere. He wasn’t her type, plus, it was August, and the stench of hot garbage that permeated the air had nauseated her. Her dog was surely on edge just being in a city for the first time, and although he was probably okay with the smell, the loud engine running and his mommy feeling far less than charmed set off a few of his alarms. These alarms continued ringing whenever a garbage truck went by. It took far less work to decondition the dog once the owner realized that tensing up and suddenly picking up the pace whenever a sanitation truck drove by caused her dog to follow suit.
The environment and our every action are either conditioning or deconditioning a dog. Therefore, it is imperative that we become aware of the messages we are sending.
Dogs jump on our beds, eat off our plates, gnaw on the legs of our furniture, usurp our armchairs, kill rodents, and dig holes for us to beam in amazement at. These behaviors are part of being a dog. Like most creatures, dogs seek out a good time while avoiding pain. Whatever experiment they believe produces a worthwhile result, they will continue to perform, and whatever experiment is met with substandard results, they will abandon. This is how dogs learn, and they reveal this learning to us when they consistently partake in or avoid certain behaviors.
These are all the clues we need to provide them with a happy outcome that works in concert with our needs. For example, if I want my dog to stay in a given place when the doorbell rings, I can teach him to do so by providing affection and a piece of food in connection with performing this action. For both of us, this is a pleasant consequence. Conversely, if your dog chases down a stick you heaved into the woods and proceeds to drop that stick in a sewer, it’s a safe bet no one’s throwing that stick again—most likely an undesired consequence if your dog has any interest in playing fetch, particularly with that stick. Dogs are quite literal, so your dog may wonder why the game stopped so abruptly; he may also wonder why you didn’t climb into the sewer to fetch the stick. The dog is now motivated to get back to playing this game, which is where we come in to provide instruction. With repetition, the dog will realize that fetch requires the stick to remain in play and that the game is a two-way street.
Our aim is to repeat the actions that engender favorable consequences and limit those that lead to negative consequences. Feed the positive, starve the negative.
Dogs are often on the right track or at least in the ballpark. Chiquita is always on alert, which is fine, but I don’t want her on red alert. I appreciate the fact that she barks when there is someone at the door, but I did not appreciate it when she acted as if we were under attack. She would frantically pace, growl, and bark wildly, as if screaming, “The British are coming, the British are here!” In the beginning, feeding the positive was not possible. I couldn’t stuff a treat in her mouth after the first bark, because she was off to the races. So I ignored her. I did not move a muscle, and she went bananas all by herself. Starve the negative. I also informed friends in advance that I might not get the door for a few minutes. She had to calm down first, and she had to look to me for guidance. The moment she shot a glance in my direction, I would calmly motion for her to come, and have her sit. I did not want to give her a treat because that could have reinforced her reactive behavior. Once I got my friend in the door and Chiquita remained seated, only then would I reward her. Slowly—and it was slow in her case—she caught on, and in time, I was able to reward her after the first few barks. Feed the positive, starve the negative.
Side note on the aftermath: I often mention to people to cut their dogs some slack if something is not particularly bothersome. I was never vigilant about enforcing Chiquita’s doorbell protocol, and over time, she regressed some. Nowadays, when I answer the door, Chiquita freaks out a little, and then I mildly yell at her to chill out, which does nothing. For me, this is an acceptable amount of chaos and a long way from her days of being stressed out and aggressive whenever she heard the slightest noise outside my door.
When dogs take actions in their world, they quickly learn whether a behavior produces a negative, positive, or neutral outcome. They get balanced in a hurry because dogs warn each other first, then bite, and move on. In dealing with humans, they are navigating far murkier waters. Our world is full of breakable objects with intrinsic value, fearful people, unsoiled belongings, and, most challenging of all, we speak a language that dogs do not intuitively understand. Their world offers a far more effective vantage ground, as nature may be cruel, but it is consistent. Every time a dog investigates a skunk too closely, it pays a price. With us, inconsistencies abound, so learning can be difficult to trust. Dogs would ask us things like “How come I could sleep with you as a puppy and now I can’t? Did I do something wrong?” or “How was I supposed to know I got too big to sit in your lap?” As it’s been pointed out, a puppy jumping on people is cute, but once it gets big enough, it will get punished for the same behavior. Inconsistent and unfair.
In order to speak dog effectively, we listen with our eyes and respect the fact that we are dealing with a dog that is blessed with an inherent set of behaviors that may not always jibe with our wishes. From there, we can gain an understanding of the basic principles of dog training and how dogs learn, so we can seamlessly move into commands and socialization before putting it all together.
Reinforcement is one of the biggest buzzwords in training and, unfortunately, one of the most misunderstood. The business of positive versus negative reinforcement is the stuff of great debate in the world of dog training. Personally, I am gobsmacked that some of the most outspoken dog trainers can so vociferously share their misunderstanding of reinforcement.
In today’s dog parlance, positive reinforcement refers to praise or rewards in teaching. Negative reinforcement has become synonymous with dominance-based training or, more accurately, punishing a dog until it gets it right.
Positive reinforcement is commonly believed to encompass training techniques that rely entirely on praise, play, petting, treats, toys, and affection. When a dog misbehaves or performs poorly, the trainer offers continued encouragement and essentially waits for the dog to do it right while passively standing by, waiting to heap praise with a fistful of treats.
Negative reinforcement is considered to involve yanking on collars, hitting dogs, spraying dogs with repellents, yelling, and making mean faces, all in the name of teaching.
These are all inaccurate interpretations of positive and negative reinforcement that have stifled the evolution of dog training. After my short disclaimer, I will offer a simple, factual explanation of the genesis of reinforcement and how it applies to dogs.
In my estimation, nature offers enough deterrents for dogs. A dog that decided to grapple with a porcupine should not come home only to get spanked for having muddy paws. I also believe that punishment-based training can cause negative behaviors to persist when the dog’s trainer/tormentor leaves the room. It is unnecessarily cruel and, in my estimation, ineffective. At best, the dog will behave properly when the abusive person is around. As I once heard, no one would dare train a bear this way. Taking advantage of a dog in a violent manner is inexcusable, sickens me, and makes a strong case for capital punishment.
Dogma aside, here’s the deal: You can lead a dog to training, but you can’t make it click. If a dog gets sprayed, smacked, and verbally derided for its efforts, it will not be too hip on future training sessions. Trainers who believe in establishing dominance and punishing as the nucleus of their training have not only dogs that hate them but a growing list of conscientious objectors, including me. There is no place for that.
The sad part is that many well-intended positive-based trainers are, regrettably, the most uninformed and indiscriminately critical. Still, they at least aim to make life better for dogs.
That’s my disclaimer. If you can stomach the coming pages on operant conditioning, you will have a sense of the psychology behind dog (and people) responsiveness, and at some point, you may have a chance to correct a misinformed trainer. As Josh Billings said, “A dog is the only thing on earth that loves you more than it loves itself,” so let’s show a little love for our pups by gutting out the next few pages.
The term “operant conditioning” was coined by B. F. Skinner, a decorated psychologist and behaviorist who decided to pursue psychology after being inspired by the works of none other than Ivan Pavlov. In a 2002 survey of his fellows, Skinner was considered the most profoundly influential psychologist of the twentieth century. Not bad. His statement “the consequences of behavior determine the probability that the behavior will occur again” may well be the calculus of dog training. Although his body of work was largely in the realm of human psychology, his paper simply titled “How to Teach Animals” is an insightful guide into animal learning.
Ironic side note: Skinner built his name through his research on operant conditioning and developed a device called a “cumulative recorder.” With this device, he determined that behavior did not depend on the preceding stimulus, as Pavlov had posited, but instead hinged upon what happened after the response. He called this operant behavior.
Operant conditioning supposes that learning is based entirely on the rewards and punishments for a given behavior. Via operant conditioning, dogs make an association between a behavior and the consequences of that behavior.
Skinner’s work suggested that thoughts and motivations are not reliable in explaining behavior. Rather, we should focus on the external, observable causes of behavior. Through these external factors, we can understand how people acquire the range of behaviors they exhibit on a daily basis. Think of a guy buying flowers to make a good impression on a date. Guys do not buy flowers because they love the smell; they do it based on the potential promise of a reward. Should they receive that reward, the behavior will increase so long as that reward still holds value. When you think of the song “You Don’t Bring Me Flowers Anymore,” it is clear that the reward-producing behavior has come and gone. Sadly, the potential reward of bringing flowers is no longer of value for the giver. Thus, the behavior of buying flowers has decreased.
Operant conditioning causes behaviors to increase, decrease, or cease depending on the outcome of performing that behavior and the extent to which we value the outcome. What? Behavior increases, decreases, or stops depending on the payoff or lack thereof. For example, a dog learning to relieve itself outside may be doing so to avoid being yelled at or to seek out praise.
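For readers who like to see an idea spelled out, here is a minimal sketch of the principle that consequences shift the odds of a behavior happening again. It is a toy model, not Skinner's own mathematics: the function name `update_probability`, the `learning_rate` value, and the idea of scoring an outcome from -1 (unwanted) to 1 (highly valued) are all assumptions made for illustration.

```python
def update_probability(p, outcome_value, learning_rate=0.2):
    """Return the new chance that a behavior is repeated.

    p             -- current probability of the behavior (0 to 1)
    outcome_value -- how the dog values the consequence, from -1 to 1
                     (hypothetical scale: 1 = great payoff, 0 = nothing
                     happened, -1 = strongly unwanted)
    learning_rate -- how fast the dog adjusts (illustrative value)
    """
    if outcome_value >= 0:
        # A valued outcome moves the probability part of the way toward 1.
        return p + learning_rate * outcome_value * (1.0 - p)
    # An unwanted outcome moves the probability part of the way toward 0.
    return p + learning_rate * outcome_value * p


# A dog that starts out relieving itself outside half the time,
# and is praised on every outdoor success.
p = 0.5
for trip in range(1, 6):
    p = update_probability(p, outcome_value=1.0)
    print(f"after praised trip {trip}: {p:.2f}")
```

A neutral outcome (a value of 0) leaves the probability where it was, which matches the idea that the payoff, or the lack of one, is what moves the behavior.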
It is important to understand the use of the words “positive” and “negative” with regard to punishment and reinforcement. Positive merely means the introduction of a stimulus, while negative denotes the removal of a stimulus. Positive and negative have nothing to do with good and bad in this context.
KEY COMPONENTS OF OPERANT CONDITIONING
Reinforcement is any occurrence that increases the behavior it follows. There are two kinds, positive and negative.
1. Positive reinforcement: Favorable events or outcomes that are presented after the behavior. Guy buys flowers and appreciative flower-loving date throws her arms around him. The behavior is strengthened by her affection. A dog sits when instructed and gets a piece of food. The food is presented after the behavior of sitting.
2. Negative reinforcement: The removal of unwanted outcomes or events after the display of behavior. Behaviors are strengthened when the negative reinforcer is removed or avoided altogether. Something unwanted is being subtracted from the equation. In this case, flower guy buys flowers and allergic flower-hating girl sneezes in his face. From this moment on, he buys chocolates for his dates and mother (although he may still get in trouble for trying to gain favor with high-calorie foods). The example that is commonly used is putting on sunscreen to avoid getting burned. The negative reinforcer is the sunburn, which serves as the reminder that strengthens the behavior of applying sunscreen.
These responses are created via the subtraction of something undesirable: sneezing and sunburns, respectively. In the above cases, the negative reinforcers increase a behavior, which is what makes them different from punishment. Punishment is the introduction or removal of a stimulus designed to decrease or weaken the behavior it follows. Again, there are two types:
1. Positive punishment: The presentation or introduction of an unfavorable event or outcome to weaken the response it follows. Example: A dog jumps on your guests and gets sprayed with a water bottle.
2. Negative punishment: A favorable event or outcome is removed after a behavior occurs. We all know this one: When I play ball in the house, Mom takes the ball away.
Punishment is aimed at weakening or decreasing a particular behavior via the introduction of an unwanted stimulus or the removal of a desired stimulus.
Reinforcement aims to increase wanted behaviors, while punishment is designed to decrease unwanted behaviors.
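The four terms above come down to two yes-or-no questions: was something added or taken away, and did the behavior go up or down? The short sketch below lays that out as a lookup. The names `Stimulus`, `Effect`, and `classify` are made up for this illustration; the example situations are the ones used in the definitions above.

```python
from enum import Enum


class Stimulus(Enum):
    ADDED = "positive"     # something is introduced after the behavior
    REMOVED = "negative"   # something is taken away (or avoided)


class Effect(Enum):
    INCREASES = "reinforcement"  # the behavior becomes more likely
    DECREASES = "punishment"     # the behavior becomes less likely


def classify(stimulus, effect):
    """Name the operant-conditioning term for a consequence."""
    return f"{stimulus.value} {effect.value}"


# Situations taken from the definitions above.
examples = {
    "dog sits, gets a piece of food": (Stimulus.ADDED, Effect.INCREASES),
    "sunscreen applied, sunburn avoided": (Stimulus.REMOVED, Effect.INCREASES),
    "dog jumps on guests, gets sprayed": (Stimulus.ADDED, Effect.DECREASES),
    "ball played in the house, ball taken away": (Stimulus.REMOVED, Effect.DECREASES),
}

for situation, (stimulus, effect) in examples.items():
    print(f"{situation}: {classify(stimulus, effect)}")
```

Notice that nothing in the lookup asks whether the consequence is kind or harsh. That is the whole point: “positive” and “negative” describe adding and subtracting, nothing more.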
THE ROAD TO HELL
Intentions are really the thing we have to look at. Are we offering our dogs a chance to do something good or are we attempting to stop them from doing something we don’t like? I want my dogs to be calm when I leash them up in the face of their excitement about going outside. I take the leashes off the hook and they begin to act up so I put the leashes back on the hook. Am I doing this to encourage calm or to halt chaos? Putting the leashes away is an example of negative punishment if my aim is to weaken the behavior of my dogs acting unruly. Still, I’m taking an action that anyone would consider reasonable and entirely humane. What makes this act something other than negative reinforcement is a matter of technical definition and a mere tweaking of the act. Let me get to the point. Examine your motives and don’t get caught up in the technical definitions. Do I have my dog’s best interests in mind or am I acting on my own frustration? Am I guiding my dog to the promised land or am I speeding its path to Hades, hoping it will realize that’s no place to be? Drama aside, check your motives and try to offer your dog the swiftest, happiest path to good behavior.
SCHEDULES OF REINFORCEMENT
This is where it gets interesting. In operant conditioning, when and how often a behavior gets reinforced can have a significant influence on the strength and rate of learning or response. A schedule of reinforcement is a set of rules that determines what behaviors will be reinforced, as well as the frequency with which reinforcement occurs. Behaviors will be reinforced all the time, some of the time, or none of the time. Positive reinforcement or negative reinforcement may be used, depending on the circumstance, but, remember, the goal is always to strengthen favorable behavior and increase its likelihood going forward. Remember when I confessed that I got a little lax about Chiquita overreacting when the doorbell rings? I forgot to schedule in my schedule of reinforcement.
Let’s take a look at the following reinforcement schedules.
1. Continuous reinforcement: Continuous is as it implies—the behavior is continuously reinforced. For example, if you are trying to teach a dog to sit, catching him in the act of sitting, whether or not you asked for it, is cause for rewarding. In the initial stages of learning, continuous reinforcement will make a strong association between the behavior of sitting and the response—for example, getting a reward.
2. Partial reinforcement: In keeping with the sit example, partial reinforcement says the dog is rewarded only when you specifically ask it to sit; it would not receive a reward for sitting on its own accord. This is the most common reward schedule with commands and is particularly effective in guarding against “extinction.”
Extinction occurs when a trained behavior is no longer rewarded or the reward ceases to be rewarding. For a dog, something ventured and nothing gained means it’s time to move on. Stick to a schedule of partial reinforcement to safeguard learning from extinction. For example, food and affection typically maintain their value, but the same old toys can get old in a hurry. Vary the rewards and the way they are presented so the dog will keep knowledge of the rewarded behavior handy.
SCHEDULES OF PARTIAL REINFORCEMENT
1. Fixed ratio: A behavior is reinforced after a specified number of responses. For example, once my dog can follow the “sit” command with eighty percent success, I will reward the behavior only every third requested “sit.” Fixed ratios help to wean dogs off of food rewards and can increase their attention span.
2. Variable ratio: A behavior is reinforced after an unpredictable number of responses. Sometimes you win, sometimes you lose. This schedule creates a tremendous response in humans, as evidenced by the popularity of games of chance. In the beginning, consistency in schedules is everything, but once a command is learned, dogs get hyped up by unexpected rewards.
3. Fixed-interval schedules: The response is rewarded after a specified amount of time. Useful with dogs when teaching commands like “stay.” Fixed intervals allow us to increase the amount of time in which the dog performs the requested behavior.
4. Variable-interval schedules: A reward is given after an unpredictable amount of time. This schedule produces a steady response and, like variable ratio, helps to keep a dog’s interest.
5. Shaping: This is not a schedule of reinforcement in the strict sense, but a technique dog trainers use alongside the schedules. As dogs learn, we expect more from them, but in the beginning, a halfhearted squat can count as a successful “sit.” The partial execution of a command, or even an attempt to perform it, may warrant a reward.
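The first four schedules in the list above can each be written as a simple rule that answers one question: does this response earn a reward? The sketch below is one way to express those rules, under some loud assumptions. The function names, the use of seconds as the unit of time, and the choice to draw the variable schedules from Python's `random` module are all illustrative; none of it comes from Skinner or from any training manual.

```python
import random


def fixed_ratio(n):
    """Reward every n-th response (for example, every third 'sit')."""
    count = 0

    def should_reward():
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False

    return should_reward


def variable_ratio(mean_n, rng):
    """Reward unpredictably, on average once per mean_n responses."""
    def should_reward():
        return rng.random() < 1.0 / mean_n

    return should_reward


def fixed_interval(seconds):
    """Reward the first response at least `seconds` after the last reward."""
    last_reward = 0.0

    def should_reward(now):
        nonlocal last_reward
        if now - last_reward >= seconds:
            last_reward = now
            return True
        return False

    return should_reward


def variable_interval(mean_seconds, rng):
    """Like fixed_interval, but the required wait changes after each reward."""
    last_reward = 0.0
    wait = rng.expovariate(1.0 / mean_seconds)  # random wait, averaging mean_seconds

    def should_reward(now):
        nonlocal last_reward, wait
        if now - last_reward >= wait:
            last_reward = now
            wait = rng.expovariate(1.0 / mean_seconds)
            return True
        return False

    return should_reward


# Every third requested "sit" earns a treat.
every_third = fixed_ratio(3)
print([every_third() for _ in range(6)])

# A "stay" checked every 2 seconds, rewarded once 5 seconds have passed.
stay = fixed_interval(5)
print([stay(t) for t in (2, 4, 6, 8, 10, 12)])

# Slot-machine style: a treat on roughly one "sit" in three.
lucky = variable_ratio(3, random.Random(0))
print(sum(lucky() for _ in range(30)), "treats in 30 sits")
```

The ratio rules count responses, while the interval rules watch the clock; the variable versions simply swap a fixed number for an unpredictable one, which is where the element of surprise comes from.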
Determining a reward schedule is pretty simple. When teaching something brand-new, continuous reinforcement is great for capturing behaviors. Once the behavior is learned, partial reinforcement refines the lesson, while maintaining the value of the reward.
Things can get humdrum if rewards don’t vary some in value, frequency, and presentation. If you don’t want your dog to fall into a predictable pattern, then you can’t, either. Once a dog understands a command, feel free to reward with different hands, have someone else give the reward (helps with socializing, too!), make the dog wait with anticipation, and occasionally, give it right away. Anything you can think of to keep things fresh works, and, if all goes to plan, you’ll be having fun, as well.
Why am I so intent on varying the rewards as well as keeping things upbeat and moving? The brain circuitry of humans and dogs, of course. Dopamine is a feel-good neurotransmitter that drives the reward response in humans and most animals. A dog’s brain, as well as ours, is extremely attuned to expectations, and when expectations go unmet, dopamine levels fall precipitously. Try telling a kid who ran home from school that he has to wait until tomorrow to play Xbox, and watch the effects of falling dopamine levels.
To understand the link between dopamine and reward circuitry, you have to travel deep within the brain to a place called the nucleus accumbens, where dopamine cells are waiting to fire in anticipation of a reward. Professor Wolfram Schultz at Cambridge University discovered that environmental signs indicating an upcoming reward create a response in the brain that releases dopamine. These are rewards we know to expect, and the expectation alone is enough to release dopamine. But guess what releases more dopamine than expected rewards? Unexpected rewards. Surprising rewards and upbeat, unpredictable deliveries can help ward off any downturns in mood. In humans, the drop-off in dopamine for not receiving an expected reward is akin to pain and triggers a small threat response, which is not exactly a receptive learning state.
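One common way to picture this is as the gap between what was expected and what actually showed up. The sketch below is a deliberate simplification and an assumption on my part, not a model taken from Schultz's research: the names `prediction_error` and `update_expectation`, the 0-to-1 reward scale, and the `rate` value are all hypothetical. A positive gap stands in for the pleasant surprise, a negative gap for the letdown.

```python
def prediction_error(expected, received):
    """Positive when the reward beats expectations, negative when it falls short."""
    return received - expected


def update_expectation(expected, received, rate=0.3):
    """Move the expectation part of the way toward what was actually received."""
    return expected + rate * prediction_error(expected, received)


# The same treat, delivered after every "sit": the surprise fades as the
# dog comes to expect it.
expected = 0.0
for trial in range(1, 6):
    surprise = prediction_error(expected, received=1.0)
    print(f"trial {trial}: expected {expected:.2f}, surprise {surprise:+.2f}")
    expected = update_expectation(expected, received=1.0)

# Now the expected treat does not arrive.
print(f"treat withheld: surprise {prediction_error(expected, received=0.0):+.2f}")
```

In this toy picture, a reward that is fully expected produces no surprise at all, while a withheld reward produces a negative one, which is the argument for mixing in deliveries the dog cannot predict.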
Dopamine cells connect to the prefrontal cortex, which is critical for concentration and learning. Keep expectations positive for the dog, and focus grows accordingly. The connection between dopamine levels and perception is believed to be the reason that happier people experience improved mental performance and increased problem-solving abilities. I personally believe this is also true for dogs, although their prefrontal cortex, in relation to brain size, is much smaller than ours.
To recap: Vary the rewards, and deliver them at expected and unexpected times. Keep the vibe light, positive, and varied so you’ll have a supercharged dog to work with. Once some learning is in place, use the power of variable ratio and interval reinforcement schedules so a dog can bask in the element of surprise.
As a dog gains proficiency in training, we can up the ante to reward only the best responses based on our criteria. An underperforming dog will soon realize what it takes and up his game.