# intermittent reinforcement questions



## Eiche Park (Mar 11, 2008)

I have a question regarding intermittent reinforcement. Up to this point, I have always rewarded operantly conditioned movements or positions. Specifically, I use a clicker as a bridge stimulus, and most of the time I use food as reward and motivation. I realize I have to move on to intermittent reinforcement, but I'm not quite sure how to do this. I am pretty sure that I want to use a variable schedule, but I will also use jackpots for the fastest and best responses. All of the correct responses thus far have been marked with the clicker, and the dog is released for food or sometimes tug play. If the dog gives me the behavior and I am on a VR schedule, what do I do? Do I click or not? Do I say "good" and ask for a different behavior? I am confused as to how to use intermittent reinforcement.


----------



## Connie Sutherland (Mar 27, 2006)

Eiche Park said:


> I have a question regarding intermittent reinforcement. Up to this point, I have always rewarded operantly conditioned movements or positions. Specifically, I use a clicker as a bridge stimulus, and most of the time I use food as reward and motivation. I realize I have to move on to intermittent reinforcement, but I'm not quite sure how to do this. I am pretty sure that I want to use a variable schedule, but I will also use jackpots for the fastest and best responses. All of the correct responses thus far have been marked with the clicker, and the dog is released for food or sometimes tug play. If the dog gives me the behavior and I am on a VR schedule, what do I do? Do I click or not? Do I say "good" and ask for a different behavior? I am confused as to how to use intermittent reinforcement.



Phasing out food for me is done on a totally random schedule. The dog knows that he *might* get a treat. I do still mark and praise while I'm doing this. 

Also, I start phasing it out with rewarding much more often than not, and gradually going to occasional random rewards.
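The thinning described above (mark and praise every time, food reward often at first and then only occasionally) can be sketched as a small simulation. This is only an illustration under stated assumptions: the function names, the linear decrease, and the 0.9 and 0.2 start and end probabilities are hypothetical choices made for the sketch, not values given in this thread.

```python
import random


def reward_probabilities(n_trials, start_p=0.9, end_p=0.2):
    """Return a per-trial chance of food that moves linearly from start_p to end_p.

    The linear shape and the default values are assumptions for illustration.
    """
    if n_trials < 1:
        raise ValueError("n_trials must be at least 1")
    if n_trials == 1:
        return [start_p]
    step = (end_p - start_p) / (n_trials - 1)
    return [start_p + step * i for i in range(n_trials)]


def run_session(n_trials, seed=None, start_p=0.9, end_p=0.2):
    """Simulate one session: every correct response is praised, food is random."""
    rng = random.Random(seed)
    session = []
    for p in reward_probabilities(n_trials, start_p, end_p):
        # Praise is given on every trial; food is delivered with probability p.
        session.append({"praise": True, "food": rng.random() < p, "p_food": p})
    return session


if __name__ == "__main__":
    for number, trial in enumerate(run_session(10, seed=1), start=1):
        outcome = "praise + food" if trial["food"] else "praise only"
        print(f"trial {number:2d}  p(food)={trial['p_food']:.2f}  {outcome}")
```

Rewarding each response with a fixed chance is sometimes called a random-ratio schedule, a close relative of a variable-ratio (VR) schedule. How quickly to thin the rewards for a real dog is a training judgment that this sketch does not settle.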

Are you using both a clicker and a verbal marker? I haven't started with my new clicker yet, but I've read that there usually isn't much confusion over switching (either way). 

Bob Scott? Are you here?


----------



## Bob Scott (Mar 30, 2006)

I don't like to withhold reward if I've marked the behaviour. I don't want that marker/clicker/etc. to lose its value.
The "good" can be used as a bridge to the next behaviour without a marker. That's the basics of chaining. 
I might also finish on occasion with an "OK". MY dog responds to that release as a time to play. For him, that's just as good as a food marker.
I won't use the release if I've been rewarding with the tug. Again, for MY dog, the tug has a higher value than just a bit of playtime. 
Obviously you want to build a chain of behaviours but never, never completely eliminate random reward for individual behaviours in the chain. The dog has to know that it WILL come!
Just a few options for random reward.


----------



## Connie Sutherland (Mar 27, 2006)

Bob Scott said:


> I don't like to withhold reward if I've marked the behaviour. I don't want that marker/clicker/etc. to lose its value.



Ooooh. Good point.


----------



## Bob Scott (Mar 30, 2006)

Connie Sutherland said:


> Ooooh. Good point.


Even if I ef up and mark at the wrong time I'll still reward.


----------



## Anne Vaini (Mar 15, 2007)

We call them "screw-up cookies" :lol:


----------



## Bob Scott (Mar 30, 2006)

Anne Vaini said:


> We call them "screw-up cookies" :lol:


A whole lot less stressful than a "screw-up correction". :grin: ;-)


----------



## Eiche Park (Mar 11, 2008)

Bob Scott said:


> I don't like to withhold reward if I've marked the behaviour.


I agree, to me it's like a promise. When I click, I will reward. Period. The dog uses the marker as a predictor of size and probability of reward. I have read that some people will click with no reward, but that only lessens the power of the bridge stimulus marker IMO. If the reward is underpredicted, then new learning occurs. If the reward is overpredicted by the marker, then inhibition occurs. If the reward is what was predicted, then homeostasis occurs. This is what several behaviorists theorize.
My dog knows "good" or "good girl" as a higher-order reinforcer. This is not a release for her but more of a way to tell her she is correct and to keep going... you are on the right path to reward. I am still not clear how to release the dog from a correct operant response without reward. I'd like to tell her she was correct if in fact she was correct. Maybe a little bit of praise? She knows the click is a release for reward 100% of the time. She knows "OK" is also a release for play reward, food reward, or a combination. I usually use "OK" to jackpot when she has made a significant step in learning. I don't really know what to do when she's correct but not getting a reward. I don't always want to ask for another operant response, or do I?
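The prediction idea above (underpredicted, overpredicted, or as predicted) is often formalized with the Rescorla-Wagner learning rule. The post does not name a model, so treating it as the theory meant here is an assumption. In that rule, the change in the marker's predictive strength $V$ on one trial is

$$\Delta V = \alpha \beta \, (\lambda - V)$$

where $\lambda$ is the value of the reward actually delivered and $\alpha$, $\beta$ are learning-rate constants between 0 and 1. If the reward is underpredicted ($\lambda > V$), then $\Delta V > 0$ and the marker gains strength. If the reward is overpredicted ($\lambda < V$), then $\Delta V < 0$ and the marker loses strength. If the reward matches the prediction ($\lambda = V$), then $\Delta V = 0$ and nothing changes.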


----------



## Kayce Cover (Oct 30, 2007)


Hi all.

I have a different take on this. I use Terminal Bridges all the time with no primary reinforcer. The bridge is not necessarily weakened by this. An easy way to do this is to use Intermediate Bridges during training, and these mark success without giving a Terminal Bridge, and can be faded in and out, reassuring the animal that although food is not there, he is doing great. 

Recently, I worked with two dogs whose accomplished owners said could not be worked because they were not (at that time) food motivated. I asked permission to interject myself (both were students of mine) and proceeded to work both dogs for about an hour each with no food in one case (nor any toys) and less than a quarter ounce of cheese in the other case (no toys). My main reinforcer was my interaction, the information, and the use of rhythm.

In fact, it is a tenet of exotic animal training that we do not want to create any rigid expectation of food or anything else. The most frequent problem we get into is from animals that want to prevent us from leaving their areas, or who are angry because we do not give food every single time. Sometimes this is unavoidable: let's say another animal manages to knock into you and spill your bucket, which is not good. Better to avoid a sense of entitlement. 

If you decide to test what I am saying here, I believe you will end up with a stronger bond with your dog and behavior more reliable and resistant to discouragement, and there is research to back that up.

Regards,
Kayce, just home again.


----------



## Eiche Park (Mar 11, 2008)

I am inferring that when you say intermediate bridge, you are talking about saying "good" as if to say "yes that's right, keep going". A terminal bridge is for example a click and release, right? 
To follow up with my question, how do I start to use intermittent reinforcement if I have been using a clicker as my bridge stimulus (conditioned reinforcer) and food as my primary reinforcer? Keep in mind I have been clicking and rewarding after every correct operant response.


----------



## Kayce Cover (Oct 30, 2007)

Eiche Park said:


> I am inferring that when you say intermediate bridge, you are talking about saying "good" as if to say "yes that's right, keep going". A terminal bridge is for example a click and release, right?
> To follow up with my question, how do I start to use intermittent reinforcement if I have been using a clicker as my bridge stimulus (conditioned reinforcer) and food as my primary reinforcer? Keep in mind I have been clicking and rewarding after every correct operant response.


Yes, specifically, the Intermediate Bridge (IB) tells the animal: You are headed toward success, but are not there yet. Continue on this path and you will reach success. *


The Terminal Bridge can be a click and release, exactly as you describe. With experienced animals, the behavior will continue anyway, depending on context, but in the early stages this is true.

If you want to experiment with changing over, do NOT use a clicker for a terminal bridge, and use an IB that is different than your TB. This helps change the expectations of a reward for every click in a way that is not too jarring to your dog. You can use any hard sound, like g, d, k, or x. So, for cross-over dogs, it might be gggggg for the IB and a D! for the TB, so, ggggggD! 

(email me with your email address if you want free instructions on conditioning verbal bridges and the target) 

If a dog quits after the click and shows that he wants food, I say, "I see you want a treat, and you did well, will you (please) do this next behavior, gggggggggD!" as he complies. Then I give the treat. Sometimes dogs will just shut down and I just let them think about things awhile and then try again. Usually they are on board within three days.

Ironically, the hardest dogs to change over are clicker dogs because of this rigid expectation of food. However, we have many certified trainers who were Clicker trainers (as in, following the rules of Clicker training, versus just using a clicker as a terminal bridge) and they report it is worth the effort, although even years later, their dogs will tend to throw behaviors unasked. Clicker trainers are now saying that it is not necessary to feed every click, which is important. With exotic animals, we know that a rigid expectation of food is dangerous.

*I do not use the words "keep going" because Clicker trainers use a "Keep Going Signal" (KGS), which many confuse with the IB. They are not the same. The KGS is a re-cue, and has a different meaning (in the animal's mind), although the trainer's intent may be the same.

The IB is not a cue, it is a reinforcer. 

The difference can be tested: I ask you to hand me a pen and keep saying hand-it, hand-it, hand-it, as you bring the pen toward me. People, in tests, report frustration and anger with this approach. If I ask you to hand me the pen, and say, thanks, thanks, thanks, THANK YOU! people report a different feeling as they respond, and their expression is different and their approach is faster and more confident.

Well, this is long-winded, although it takes less than a minute to actually do!

Let me know if it is not clear and I will try again.

For the start up instructions, [email protected]


----------



## Bob Scott (Mar 30, 2006)

Eiche,
I've just started reading Kayce's manual. It's definitely giving me new insight on motivational training. All good!


----------

