How animals actually learn: reward, punishment, and why aversive methods backfire

Dr. Alastair Greenway, MRCVS

10 min read
Vet reviewed by Claire Greenway, BVM&S MRCVS. Last reviewed 10 Jun 2026

Almost every behaviour problem you will ever try to solve, and almost every nice thing you will ever teach, runs on the same small set of rules. Your dog learns that the doorbell means a visitor; your cat learns that the cupboard opening means dinner; a nervous dog learns the vet car park is frightening. None of this is character or stubbornness or spite. It is learning, and learning follows laws that are now very well understood. Once you can see those laws at work, training stops feeling like a battle of wills and starts feeling like something you can steer.

This article is the engine room for the rest of the behaviour space. It explains how animals learn, why reward-based methods are the ones every major veterinary behaviour body recommends, and why aversive methods (smacks, prong and electronic collars, staring-down and pinning) tend to backfire in ways that are now well documented rather than merely disapproved of. One thing first: if a behaviour has changed suddenly, the first step is a vet check to rule out pain or illness, because a sore or unwell animal cannot be trained out of a medical problem. We hand that conversation to "is it behaviour or is it medical?", with our behaviour check to help you decide. Everything below assumes that front door has been opened.

Two kinds of learning, working at once

Learning theory splits into two halves, both running in every animal, every day, dogs and cats alike (Landsberg, Hunthausen & Ackerman, 2013).

The first is classical conditioning, sometimes called Pavlovian or associative learning: the involuntary, emotional kind, where an animal learns that one thing predicts another and starts to feel about the first the way it feels about the second. Pavlov's dogs salivating at a bell is the textbook case. The version that matters at home is emotional: if the lead always comes out just before a walk it becomes exciting, and if the carrier only ever appears before a trip to the vet it becomes frightening. Your animal is not deciding this; the feeling has simply attached itself to the predictor. This is the half of learning that governs how an animal feels, and feelings are usually what drive the behaviours that worry us.

The second is operant conditioning, the learning that comes from consequences. The animal does something, a consequence follows, and that consequence makes the behaviour more or less likely next time. A cat that paws the cupboard and gets fed will paw it again; a dog that jumps up and gets a fuss will jump up again. This is the half we lean on for most cue training, and the two run together: the best plans use both.

The four consequences, and the words that confuse everyone

Operant learning has exactly four possible consequences, best drawn as a simple grid. The single most useful thing I can tell you is that the words "positive" and "negative" here do not mean good and bad: they mean add and subtract. Get that one idea straight and the whole field suddenly makes sense (AVSAB, 2021; FelineVMA, 2024).

Figure: a two-by-two grid showing the four operant consequences, with positive meaning add and negative meaning subtract.
Caption: Positive means add, negative means subtract. Reward-based training lives in the top-left and bottom-right; aversive training lives in the other two.

Positive reinforcement adds something the animal likes, and the behaviour increases: a treat for a sit. Negative reinforcement removes something it dislikes, and the behaviour increases: the lead pressure stops the moment the dog gives in. Positive punishment adds something it dislikes, and the behaviour decreases: a yank, a shout, a shock. Negative punishment takes away something it wants, and the behaviour decreases: the attention stops the instant the puppy jumps up. Reward-based training is built almost entirely from the first and last; aversive training relies on the middle two. That distinction, mechanical rather than moral, is what the rest of this article is about.

Why reward-based training wins on the evidence

It would be easy to assume this is just a kindness argument, reward-based training being the nice-but-soft option. The evidence says otherwise. Reward-based methods are at least as effective as aversive methods, and several lines of research point to their being more effective, while there is no good evidence aversive methods are more effective in any context (AVSAB, 2021). A review of seventeen studies agreed: aversive methods carry risks to physical and mental wellbeing and show no efficacy advantage (Ziv, 2017).

The most striking single study took sixty-three pet dogs with recall or off-lead problems, the very situation electronic collars are most often sold to fix, and compared e-collar trainers, the same trainers without the collar, and reward-focused trainers. Over five days the reward-focused dogs came out ahead, obeying a single "come" or "sit" cue more often (around eighty-two percent against roughly seventy-one) and responding faster, with no sign the e-collar was necessary even for its headline indication (China, Mills & Cooper, 2020). The collar did not win even on its home turf.

For cats, the same principles hold and the trainability is real, despite the old joke that cats cannot be trained. In one study a hundred shelter cats were taught four brand-new behaviours over fifteen short sessions, with most mastering a target-touch and many learning to spin, high-five or sit, regardless of age (Kogan, Kolus & Schoenfeld-Tacher, 2017). Cats learn by the same rules as dogs, and positive reinforcement is the way to teach them; punishment does not teach a cat what you want, it teaches it to fear the situation, and often you (FelineVMA, 2024). Cats are first-class learners here, not an afterthought.

The four reasons punishment backfires

So why does adding something unpleasant, which can genuinely stop a behaviour in the moment, tend to go wrong? Not as a matter of opinion, but for four documented reasons (AVSAB, 2007).

First, it teaches fear and anxiety, and that fear does not stay attached to the behaviour you meant to target: it spreads to whatever is around, your hand, the room, the lead, you. Second, punishment suppresses the visible behaviour without touching the emotion underneath. If a dog growls because it is frightened and you punish the growl, the fear is still there; you have only removed the dog's way of warning you, and the problem re-emerges, often in a worse form. Third, it can provoke defensive aggression, the effect behaviourists call fallout: in a large survey of owners using confrontational techniques (alpha rolls, staring the dog down, hitting, forcibly taking an item away, the "dominance down"), several of these techniques drew an aggressive response from at least a quarter of the dogs they were tried on (Herron, Shofer & Reisner, 2009). Punishing a frightened animal is one of the more reliable ways to get bitten. Fourth, it damages the relationship: an animal that has learned you are sometimes a source of something unpleasant is a worse learner and a more anxious companion (AVSAB, 2021).

There is now hard welfare data behind all this. Dogs trained with aversive or mixed methods show more stress behaviours during training (crouching, yelping, panting, low tense postures) and bigger rises in the stress hormone cortisol afterwards than reward-trained dogs; those trained exclusively with aversive methods were also more "pessimistic" in a careful cognitive-bias test, a sign of a longer-term dip in mood (Vieira de Castro et al., 2020). That finding has been independently replicated, with a separate UK group finding dogs whose owners used two or more aversive techniques significantly slower to approach an ambiguous bowl in the same kind of test (Casey et al., 2021). The case against electronic collars is just as concrete: in a DEFRA-funded study, dogs showed negative behavioural changes on application of the stimulus and elevated cortisol afterwards, with no consistent training benefit to set against that cost (Cooper et al., 2014). For why prong, choke and electronic collars are the wrong tool, our guide to reactivity equipment, harnesses and muzzles takes it from here.

There is also a quieter, practical point. To suppress a behaviour reliably, punishment has to be consistent, near-instant (within a second or two), and delivered at the right intensity every time, and owners frequently apply it incorrectly (AVSAB, 2007). Punish late or inconsistently, as people do in real life, and the animal cannot work out what the unpleasantness is even for; it just learns that you are unpredictable and a bit frightening. Reward-based training is far more forgiving of clumsy timing, which is one reason it works so much better in ordinary households.

What to do instead: reward, manage, and let some things go

If punishment is off the table, what fills the gap? You reinforce the behaviour you do want, you manage the environment so the unwanted behaviour cannot keep being practised, and, more often than owners expect, you simply stop feeding the behaviour you do not want. That last one is the most underused tool there is. A great many annoying behaviours are accidentally reinforced by us: the jumping up that gets a fuss, the barking that gets attention, the counter-surfing that occasionally finds a sandwich. Each is rewarded, just not on purpose. The fix is rarely to add a punishment; it is to remove the reward and prevent the animal rehearsing the behaviour meanwhile. There are two technical names for this: negative punishment, when something the animal wants is taken away the moment the behaviour happens, and extinction, when the reward that kept the behaviour going simply stops coming. With the jumping puppy, that means giving attention only once the paws are back on the floor.

One honest warning so you do not lose your nerve: when you stop rewarding a behaviour that used to pay off, it usually gets briefly worse before it fades. This is the extinction burst, completely normal, the equivalent of pressing a lift button harder when it does not arrive. Hold the line, because giving in at the peak only teaches the animal that persistence pays. And note that behaviours like these come from accidental rewarding rather than any urge for status or rank, which is why the old "show them who is boss" framing leads owners so badly astray (AVSAB, 2008). The full retirement of the dominance myth, and the emotions that really drive behaviour, belongs to its companion piece, "why your pet does this"; read the two together and you have the whole of the basics.

Where this takes you next

Everything here is the foundation for the single most important technique in behaviour work: desensitisation and counter-conditioning, the method behind almost every successful fear, noise and reactivity plan. It pairs careful, below-threshold exposure with something the animal loves, so the feeling itself changes. Both halves of learning are at work, and we teach the how-to in full in desensitisation and counter-conditioning, the natural next click from here.

If you take one thing away, let it be this: you change behaviour fastest by deciding what you want the animal to do and rewarding that, not by chasing what you want it to stop. So next time your dog or cat frustrates you, ask one question: what would I rather they did instead, and how can I make that the rewarding option? And if a trainer you have hired reaches for prong collars, e-collars, alpha rolls or "flooding", that is your cue to find someone else; our guide "finding real help" shows you how to tell a qualified, reward-based professional from the rest.

References

  1. Landsberg G, Hunthausen W, Ackerman L. Behavior Problems of the Dog and Cat. 3rd ed. Saunders Elsevier, 2013.
  2. American Veterinary Society of Animal Behavior. AVSAB Position Statement on Humane Dog Training. AVSAB, 2021.
  3. Feline Veterinary Medical Association. Positive Reinforcement Training Educational Toolkit: How Cats Learn. FelineVMA, 2024.
  4. Ziv G. The effects of using aversive training methods in dogs: a review. Journal of Veterinary Behavior, 2017;19:50-60.
  5. China L, Mills DS, Cooper JJ. Efficacy of dog training with and without remote electronic collars vs. a focus on positive reinforcement. Frontiers in Veterinary Science, 2020;7:508.
  6. Kogan L, Kolus C, Schoenfeld-Tacher R. Assessment of clicker training for shelter cats. Animals, 2017;7(10):73.
  7. American Veterinary Society of Animal Behavior. AVSAB Position Statement: The Use of Punishment for Behavior Modification in Animals. AVSAB, 2007.
  8. Herron ME, Shofer FS, Reisner IR. Survey of the use and outcome of confrontational and non-confrontational training methods in client-owned dogs showing undesired behaviors. Applied Animal Behaviour Science, 2009;117(1-2):47-54.
  9. Vieira de Castro AC, Fuchs D, Morello GM, Pastur S, de Sousa L, Olsson IAS. Does training method matter? Evidence for the negative impact of aversive-based methods on companion dog welfare. PLOS ONE, 2020;15(12):e0225023.
  10. Casey RA, Naj-Oleari M, Campbell S, Mendl M, Blackwell EJ. Dogs are more pessimistic if their owners use two or more aversive training methods. Scientific Reports, 2021;11:19023.
  11. Cooper JJ, Cracknell N, Hardiman J, Wright H, Mills D. The welfare consequences and efficacy of training pet dogs with remote electronic training collars in comparison to reward based training. PLOS ONE, 2014;9(9):e102722.
  12. American Veterinary Society of Animal Behavior. AVSAB Position Statement on the Use of Dominance Theory in Behavior Modification of Animals. AVSAB, 2008.