What is Balanced Training?
There is much talk these days about "positive training" and "force free" training. In opposition to this, you will hear people refer to "balanced training." What do these terms mean? Is positive training really better than balanced training?
The phrase "balanced training" refers to a system of training which utilizes the 4 quadrants of operant conditioning, all of which have great value and provide us with a full toolbox to use on dogs that have different personalities and histories. Positive reinforcement is GREAT. It should be the first tool we pull out of our toolbox and is especially helpful with young puppies, with teaching novel behaviors, and with training any sort of tricks or agility training. However, when it comes to some things, especially established behavioral issues, sometimes positive reinforcement doesn't work, and we need to use a different quadrant. Some dogs, especially in certain breeds or with certain types of histories, are not approval-motivated or food-motivated. Aussies do tend to be approval/food motivated (lucky us!) but sometimes a dog with an unknown history may have issues there.
As a brief primer, these are the 4 quadrants.
First, let’s discuss vocabulary. The phrase “positive only training” gets thrown around a lot, but that’s actually a misnomer. What they mean is “reward based” or “force-free”. The word "positive" means we added something. It doesn’t mean we rewarded the dog. It just means we ADDED something (could be something good, could be something bad). The word "negative" means we took something away. It doesn’t mean a correction. It means we removed something from the picture. Maybe it was a good something and maybe it was a bad something. The word "reinforcement" means what we did is intended to make the dog repeat the behavior. The word "punishment" means what we did is intended to inhibit the behavior. Punishment doesn’t mean force or correction. It simply means we are trying to discourage a behavior. Many lay persons use the word "punishment" incorrectly and in training circles, it is good to understand how trainers use the word so there are no misunderstandings. For example, most people who say they are "positive only trainers" actually use a lot of "negative punishment" (see below) and most still use some "negative reinforcement" to a degree. What they really mean to say is that they avoid using "positive punishment" which makes the name "positive only" a little bit confusing, doesn't it? So, what does all that mean? Well let's look a little closer at those 4 quadrants!
1)Positive reinforcement: The dog does something good, so something good happens. This is what most people associate with force-free programs. And you know what? This is a great training method! It works on a LOT of things. When we are trying to train a new behavior, this is absolutely the best way to do it. Little puppies are usually quite motivated by positive reinforcement. Example: My dog sits, so I give him a reward. We added something (the treat) to increase a desired behavior (sitting). The reward we are adding may be food, praise, petting, attention, a toy, or anything else your dog enjoys.
2)Positive Punishment: The dog does something “bad”, so something bad happens. This is what most people think the word "punishment" means. Example: your dog jumps on you, so you knee him in the chest. You added something (the knee movement) to decrease a behavior (jumping).
What are other forms of positive punishment? If your dog does something he shouldn't and you yell "no!", that is positive punishment. Positive punishment is not an inherently bad thing and can be very, very helpful when used correctly. Other forms include E-collar training, leash "pops", spraying with a water bottle etc. These things are excellent tools to use in some circumstances. For example, puppy biting is often quickly decreased by simply giving a loud strong "no!" This works much more quickly than simply redirecting to toys or ignoring the behavior or God forbid, yelping when the puppy bites! We must first tell our puppy that what he did was wrong before we redirect. This is why most reward-based trainers struggle with behavior modification. We must clearly explain to our puppy what behaviors are right and what behaviors are wrong.
3)Negative reinforcement: the dog does something good which makes something "bad" stop happening. Example: You want your dog to come, he doesn't, so you pull the leash. The dog comes, you stop putting pressure on the leash. Many "positive trainers" actually use this in limited amounts in their program as well. You removed something unpleasant (leash pressure) to increase a desired behavior (coming to you). Anything involving "pressure" is based on negative reinforcement.
4) Negative punishment. This is the other half of the typical force-free program, which is why the term "positive only" is such a misnomer. I’ve never met a truly positive-only trainer! This is when the dog does something bad, and that makes something good stop happening. Example: Your dog jumps on you, so you turn your back to him. You took away something (your attention) to decrease an undesired behavior (jumping). Anytime you withhold a treat, attention, etc. you are using negative punishment.
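For readers who think better in code, the four quadrants above boil down to two yes/no questions: did we add something or take something away, and are we trying to get more of the behavior or less of it? Here is a minimal sketch of that idea in Python. The function name `quadrant` and its parameter names are my own illustration, not standard training terminology.

```python
def quadrant(added: bool, increases_behavior: bool) -> str:
    """Name the operant conditioning quadrant from two yes/no answers.

    added: True if we added something, False if we took something away.
    increases_behavior: True if the goal is more of the behavior,
    False if the goal is less of it.
    """
    operation = "positive" if added else "negative"
    goal = "reinforcement" if increases_behavior else "punishment"
    return f"{operation} {goal}"


# The four examples from the list above.
examples = {
    "give a treat when the dog sits": quadrant(True, True),
    "knee the dog when he jumps": quadrant(True, False),
    "release leash pressure when the dog comes": quadrant(False, True),
    "turn your back when the dog jumps": quadrant(False, False),
}

for action, name in examples.items():
    print(f"{action}: {name}")
```

Notice that "positive" and "negative" only ever answer the first question, and "reinforcement" and "punishment" only ever answer the second.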
All these tactics are very useful depending on the dog, the situation, and the experience level of the trainer! Positive punishment in particular is hard for lay people to “get right” because you have to have impeccable timing and know exactly how forceful to be.
So what is the problem with all-positive reinforcement, you might ask? Well it's very simple. If we only use positive reinforcement, then what we are offering the dog as a reward for obedience has to be MORE desirable than what he gets for disobedience. If it isn’t, you won’t be able to modify his behavior.
Let's use some numbers to make this clearer to our human brains. Let's say that on a scale of -10 to +10, every action has a value to your dog. Something he really loves is worth +10 and something he really HATES is a -10. Neutral things are a zero.
Let's say we want our dog to sit and we have a piece of hot dog in our hand. Our dog really loves hot dogs. Hot dogs are +8 in his mind! If he doesn't sit, what does he get? Nothing. He gets no reward and no punishment (value: 0). The act of standing in one place is neither rewarding nor punishing. When you tell him to sit, his options are to do it and get the hot dog (score of 8) or to not do it and get nothing. Man, this is an easy choice, right? Positive reinforcement for the win! This is why positive reinforcement is so great for training new behaviors.
Now let's take a different scenario. Let's say your dog sees a squirrel! Chasing squirrels is awesome. Chasing squirrels is a +10 in his book! There is NOTHING you can offer him that is more attractive than squirrel chasing. You tell him to ignore the squirrel and you offer a hot dog. So, what are his choices? Ignore the squirrel and get a hot dog (score: 8) or run off and chase a squirrel (score 10). He's going to go chase that squirrel! What you have isn’t good enough to pull his attention from the squirrel. Positive reinforcement only works when the reward for obedience is greater than the reward for disobedience. Things like squirrel chasing are what we call “self-rewarding behaviors” because the behavior itself is so fun to the dog that it constitutes a reward.
Now let's introduce a form of positive punishment to that 2nd scenario. Let's say your dog has been taught that if he doesn't come to you, you are going to say NO!!!! He doesn't like when you yell no. That's worth -3 points. So, if he ignores the squirrel he gets a hot dog (score 8). If he chases the squirrel, he gets to enjoy the chase (worth 10 points) but he also gets yelled at (-3), so now squirrel chasing is only worth 7 points. So, ignoring the squirrel is the better option.
Obviously, our dogs don't do math, but I use this example to show how correction can tip the scale in your favor when your dog makes a choice. He's not just factoring in what's more fun. He must also factor in consequences and decide if making the wrong decision is worth it.
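If you like seeing the arithmetic spelled out, here is a minimal sketch of the three scenarios in Python. The function name `best_choice` and the dictionary layout are my own illustration, and the scores are the made-up numbers from the examples above, not measurements of anything. Each option is a list of the values the dog experiences, and the dog "picks" whichever option adds up to the most.

```python
def best_choice(options):
    """Return (winning option, net totals) for a dict of option -> list of values."""
    totals = {name: sum(values) for name, values in options.items()}
    winner = max(totals, key=totals.get)
    return winner, totals


# Scenario 1: sit for a hot dog (+8) vs. keep standing (0).
sit = {"sit": [8], "keep standing": [0]}

# Scenario 2: hot dog (+8) vs. squirrel chase (+10), rewards only.
squirrel_reward_only = {"ignore squirrel": [8], "chase squirrel": [10]}

# Scenario 3: same as scenario 2, but chasing also earns a verbal "NO" (-3).
squirrel_with_correction = {"ignore squirrel": [8], "chase squirrel": [10, -3]}

for label, scenario in [
    ("new behavior", sit),
    ("squirrel, rewards only", squirrel_reward_only),
    ("squirrel, with correction", squirrel_with_correction),
]:
    winner, totals = best_choice(scenario)
    print(f"{label}: dog picks '{winner}' {totals}")
```

The only thing that changes between the second and third scenario is the -3 attached to chasing, and that alone is enough to flip which option comes out on top.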
So how do we know when to use the different forms of training? First, let’s use language that is clear. I’m going to avoid using words like positive or punishment. We are going to talk about REWARDS (things your dog enjoys like food or praise) and CORRECTIONS (things your dog doesn’t enjoy like leash pressure or a verbal no).
I break training down into 4 basic categories and each one benefits from different quadrants of operant conditioning.
1)Teaching Novel Behaviors- A novel behavior is something new that your dog doesn’t know how to do yet. Let’s take learning to “sit”. Your puppy isn’t born knowing how to sit on command. You have to teach him that. So, if you look at him and say “sit”, and he doesn’t…well you can’t very well correct that can you? It’s not his fault that he doesn’t know what “sit” means. You’ve never taught him that. Imagine you have a child that’s learning math. Are you going to smack them every time they get a wrong answer? No! That won’t help them get the right answer. You are going to tell them, “that’s not the right answer, let me show how to find the correct answer”. And if they still don’t get it, you’re going to go back and show them a different method of finding the answer. And when they get it right, you’re going to say good job! That’s right! You’re going to teach them to problem solve and let them know when they do well. It’s the same for our dogs. Your 8-week-old puppy isn’t refusing to sit on purpose. He doesn’t know what that word means. There is no point in even saying the word in fact. When I train sit, I take a treat and hold it above the puppy’s head so that he leans back. This naturally causes his butt to drop. When his butt drops, I say YES! And give him the treat (you can also use a clicker instead of the word yes). I’m going to lure him into this behavior several times so that the minute he sees my hand raise, he drops his butt. Only then do I “name” the behavior and tell him that what he just did is called “sitting”. I’m going to raise my hand with the treat and as the butt starts to drop, I say “sit” and then “yes!” I’ll do this for a session or two. Only then will I start saying the word “sit” at the beginning of the exercise. We have to show him what’s correct first. If he doesn’t sit, I’m going to go back a step or two until he is being successful and try again. 
If he offers an incorrect behavior (like lying down), I’m going to say “nope!” or “uh-oh!” in a light-hearted voice. That isn’t a correction. It’s just me saying “try again, that wasn’t quite right”. Anytime we are learning a new behavior, this is how I’m going to approach it.
2) Teaching them to refrain from inappropriate behaviors. By inappropriate behaviors, I’m referring to things we, as humans, wish they wouldn’t do. They aren’t inherently “wrong” in and of themselves. These things may be acceptable in certain circumstances (urinating, which is acceptable outdoors but not indoors) or things that some owners approve of while others do not (jumping on the furniture). Other examples include jumping on us, nipping, digging holes, chewing. The thing about these behaviors is that the dog doesn’t know they’re undesirable. They are very normal doggy behaviors! They are things we as humans want him to avoid in order to integrate into our human lives in a way that is most convenient and desirable to us as humans. To punish him for these behaviors without telling him that his behavior is inappropriate wouldn’t be fair. Little children don’t know that it’s inappropriate to run around without clothing on. That’s a rule of society that we teach them. We aren’t going to slap a 4yr old for walking out of the bathroom naked, are we? Of course not! We are going to explain that we don’t walk around without clothes on because it’s not appropriate behavior.
It’s the same for dogs. With these behaviors, I follow the pattern of correct, redirect, reward. In this case the correction starts with a loud verbal NO that snaps them out of the behavior. The “no” isn’t to frighten them. It’s to startle them and stop the behavior mid-moment. Your dog will quickly learn that “no” means “stop what you’re doing”. You can also do a loud clap instead of a “no” as long as you’re consistent. Once we have stopped the behavior and told the dog we don’t think it’s appropriate, we now need to tell him what IS appropriate and reward that. So, our dog starts to piddle on the carpet. We correct them by saying NO! (It has to be strong enough and loud enough to make them startle a bit). We pick them up, we go outside, we say “go potty” (redirect) and then when they potty outside we reward them. Same thing with jumping. When little puppies jump, I say NO! Tell them to sit, reward the sit. Puppy biting? Same thing. Puppy nips, I say NO, to stop the behavior, then hand him a chew toy to redirect his chewing, and reward with praise.
3)Bad behavior. These are behaviors that are ALWAYS wrong, and the dog simply should never be allowed to do. Bad behavior will very rarely respond to positive reinforcement alone, because almost all of them are self-reinforcing. If you have a teenager who is sneaking out of the house at night, do you just ignore that behavior and then reward them on the days they don’t sneak out? Of course not. You tell them the behavior is wrong and you correct it the first time it happens. It’s simply not acceptable. You aren’t going to bribe them into avoiding the behavior or simply praise them when they don’t do it. You’re going to CORRECT it. It’s the same with dogs. The training method is the same as above, except we are going to offer much stronger corrections from Day 1 to emphasize that this is never ever ok and always provokes a strong aversive response. This is one of the few times I’ll use physical corrections (generally a leash pop). A leash pop is when you give the leash one strong tug-and-release. It is not a steady pull or a yank. The leash must be loose before you do your “pop”. The purpose is to surprise the dog and if the leash is tight, it doesn’t work. You may have to step forward to put a bit of slack in the leash before giving the correction. What sort of behaviors are we talking about here? Aggression towards other dogs, chasing other family pets, lunging on the leash, and “reactivity” towards dogs or other people are the most common things I see. People often claim that reactivity is based in fear, but the reason for the reactivity really doesn’t matter in the long run. You still need your dog to work through his emotions and behavior appropriately. Just as humans aren’t allowed to act out just because we are scared or angry, neither is your dog. Some responses just aren’t appropriate. If your dog is “fearful” of another dog and barks at it once, that’s ok. He’s telling you he’s scared. 
If you say, “leave it” or “let’s go” and pull him onward, he should respect that command regardless of whether he’s fearful or not. He’s told you he’s scared and you’ve acknowledged that. As his leader, you’ve told him not to worry. At that point, he should leave the stimulus and trust you. Continuing to bark, lunge, etc isn’t ok. It’s an excessive and out of proportion reaction. Say NO, give a leash pop, change direction and walk the other way. These behaviors are simply never tolerated. Only once the dog is truly focused on me will I ask him for a different behavior and reward him. We do not want him associating the reward with the bad behavior. If my dog lunges at another dog, I leash pop, change direction and if the dog truly shifts his focus to me, I will ask for a sit/down/nose touch/etc and reward that behavior.
4)Disobedience. Disobedience is when the dog knows how to behave and makes a choice not to. Some positive-only trainers will insist this doesn’t happen, that dogs don’t think this way. I assure you, they do. They can know exactly what you’re asking of them and decide that they simply feel like ignoring you and doing their own thing. The behavior in question may be any one of the previously mentioned things (a behavior you taught them, an inappropriate behavior you’ve taught them to avoid, or a bad behavior they know has serious repercussions). Disobedience must be corrected if your dog is going to respect you. It’s the same as telling a child, “clean your room or you’re grounded”. If they don’t clean the room, and you don’t follow through with the grounding, do you think your child will listen to you the next time you tell them to clean their room? Of course not. They know they can get away with ignoring you!
Your dog cannot believe it’s ok to ignore commands that he understands. A great example of this is “stay”. I use positive reinforcement to teach “stay”. But once the dog understands the concept, if they break a stay, I’m going to correct it. The level of correction should correlate to the age and temperament of the dog and to the importance of the behavior. What I mean by that is, if your dog doesn’t sit when you told him to, that isn’t the end of the world. I’d correct it with a verbal reprimand and a more forceful command, “No, SIT”. I might follow up with a leash pop if they’re being really stubborn about it. (Yes, dogs are capable of stubbornness). But that’s as far as it’s going to go. “Stay” on the other hand is a safety-oriented, life saving behavior. If your child is about to run into a street and you say “stop” and he ignores you, are you going to ask him nicely again? Are you going to wave a candy bar at him and lure him back to you? No! You’re going to grab his arm and yank him back. Is it possible you might bruise his arm or that he might find the experience unpleasant? Of course. But it was necessary to save his life. Your dog must understand that this is a CRITICAL command that must always be obeyed no matter what.
Loose leash walking also falls in this category for me. We teach loose-leash walking with positive reinforcement but if they continue to pull once you’ve introduced them to the “heel” command, I will now use corrections. I will use a much softer correction on a puppy than I will on an adolescent or adult dog. Again, this is a safety behavior so must be understood. Use the lowest level of force/correction that is required to gain compliance; however you must gain compliance.
I hope this quick primer has helped people to understand the quadrants of operant conditioning and why reward-based models of training are not sufficient for creating a truly well-behaved dog. Feel free to message me with any questions you may have!