# moving from drive-based reward to non-reward



## Bart Karmich (Jul 16, 2010)

Let's say you train a dog using food or tug rewards for positive reinforcement, and you use non-reward (negative punishment) for corrections (i.e., the Balabanov method). To continue the training, what are some good ways to transition to working without reward delivery?

I understand you can gradually raise the criteria on performance and extend work duration before the release and reward, until the point where you are completing a trial. But do you do anything more specific to prevent the dog from perceiving a negative punishment? Obviously you are developing a vocabulary of markers ("no," "yes," or "OK"), and the dog does not hear the negative marker, but what kinds of problems can you run into when the rewards are not coming as frequently?

I am not asking how this is possible, but how this can be done well.


----------



## Timothy Saunders (Mar 12, 2009)

It's not about the length of time. It is about the variable rewards.


----------



## Bart Karmich (Jul 16, 2010)

If you use variable rewards then how do you address violating expectations? An extreme example would be if the dog is hoping for a bite and you offer a piece of hot dog instead. But I think you are talking more along the lines of variability in the length of a play reward event. Can you elaborate?


----------



## Terrasita Cuffie (Jun 8, 2008)

It's the ratio. First you may start with a 1:1 ratio, then a 1:2 ratio. I may vacillate between 1:1 and 1:5 and mix it up to the point that he never knows when the reward is coming. Pretty soon it's entire trial routines, or three of them. This is the hugely important part of marker training that some don't understand or implement, and then they conclude that marker training doesn't work for trial preparation.

T
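As a rough illustration of the ratio progression described in this post, here is a minimal Python sketch. It assumes "1:5" means one reward per five correct repetitions, which is one reading of the ratios above. The stage boundaries, the number of rewards per stage, and the function name `thinning_schedule` are all hypothetical choices made for the example, not anything specified in the thread.

```python
import random

def thinning_schedule(stages, rewards_per_stage, seed=None):
    """Yield how many correct repetitions to ask for before each reward.

    stages: list of (low, high) bounds on the ratio. (1, 1) means reward
    every repetition; (1, 5) means the reward may come after 1 to 5 reps.
    """
    rng = random.Random(seed)
    for low, high in stages:
        for _ in range(rewards_per_stage):
            # Pick the next ratio at random so the dog cannot predict it.
            yield rng.randint(low, high)

# Hypothetical progression: 1:1, then up to 1:2, then anywhere from 1:1 to 1:5.
stages = [(1, 1), (1, 2), (1, 5)]
plan = list(thinning_schedule(stages, rewards_per_stage=4, seed=0))
print(plan)
```

Each number in `plan` is the count of correct repetitions before the next reward. Because later stages still allow a ratio of 1, an immediate reward never fully disappears from the mix.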


----------



## Timothy Saunders (Mar 12, 2009)

Bart Karmich said:


> If you use variable rewards then how do you address violating expectations? An extreme example would be if the dog is hoping for a bite and you offer a piece of hot dog instead. But I think you are talking more along the lines of variability in the length of a play reward event. Can you elaborate?


The type of reward could be the same. It's about some rewards and no rewards. You could go from rewarding every time to rewarding every now and then, or from rewarding every time to not rewarding at all. The dog works with the expectation of a reward but doesn't feel "punishment" when he doesn't get it.


----------



## Bob Scott (Mar 30, 2006)

I like to use a slot machine as an example.
You perform the behavior of pulling the handle hoping for a reward. You occasionally win, and that's enough to keep you coming back and pulling the handle. You have no idea how many times you will have to pull that handle, but you KNOW that if you keep doing it you get rewarded.
Too many times on the same machine without a reward and you quit.
The randomness of your reward to the dog is no different. Go too long without the reward and the dog loses interest. Reward too often or in a pattern and the dog wises up and slacks off on the performance in between those predictable rewards.
Example: Heeling
It's real easy to stop rewarding a dog for the first step, the first ten steps, etc. once it's performing well for 50-100 steps. NEVER stop random reward at 1 step, 8 steps, etc.
Same with correction-only training. Once the dog figures out YOUR pattern, why would it stay in the correct position if it "KNOWS" it can go 25-50 steps with no correction?
Random! That's the key to getting through a whole pattern without reward or correction. The dog always BELIEVES the reward or correction is a possibility.
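
The heeling example can be sketched in a few lines of Python, purely as an illustration. The sketch assumes a 100-step pattern and a cap of 25 steps between rewards. Both numbers and the function name `reward_steps` are made up for the example, and drawing gaps uniformly is just one simple way to get unpredictability.

```python
import random

def reward_steps(total_steps, max_gap, seed=None):
    """Pick unpredictable step counts at which to reward during heeling.

    Each gap is drawn uniformly from 1..max_gap, so a reward at step 1 or
    step 8 always stays possible, and no stretch runs longer than max_gap.
    """
    rng = random.Random(seed)
    steps = []
    position = 0
    while True:
        position += rng.randint(1, max_gap)
        if position > total_steps:
            break
        steps.append(position)
    return steps

# Hypothetical session: 100 steps of heeling, never more than 25 unrewarded.
steps = reward_steps(total_steps=100, max_gap=25, seed=1)
print(steps)
```

The cap reflects the "go too long and the dog loses interest" point, while the random gaps keep the dog from learning a pattern. Raising `max_gap` over time is one way to stretch toward a full routine.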


----------

