Scheduling Reinforcement - Quiz

Choose your answer to each question and write it down.

NOTE: The transcript from the video is listed below the quiz for your reference.

1. Giving a lab rat food every third time it presses a lever is an example of a

2. In operant conditioning, scheduling is useful because

3. Fishing, in which you may catch something after 15 minutes, then after another 45 minutes, then after two hours, is an example of a

4. Playing a slot machine, which provides inconsistently occurring payouts, is an example of a

5. A child's weekly allowance is an example of a


Have you ever wondered how our behavior is conditioned? How does the timing of punishments and rewards affect our learning? In this lesson, we'll take a look at how reward scheduling can influence how fast we learn a behavior and how strongly it's reinforced.

Operant conditioning is usually based on the idea that we reward or punish specific behaviors. In real-world applications, we can't reward or punish behavior every time. Even if we could, research suggests that rewarding every instance of a positive behavior can actually be less effective than rewarding more selectively. So for training purposes, we have four different ways of spacing out the rewards and punishments:

  1. fixed ratio
  2. variable ratio
  3. fixed interval, and
  4. variable interval

             FIXED                VARIABLE
  Ratio      after 3 times        after 2, 3, or 4 times
  Interval   every 5 minutes      every 5, 10, or 15 minutes

Ratio refers to how many successes you need before you get a reward, or how many failures before you get a punishment.

With a fixed ratio, the reward or punishment comes after a set number of successes or failures. Let's say three: every third time you do the behavior, we give you a reward or punishment.

With the variable ratio, we change it up. Sometimes we'll reward you twice in a row; sometimes you'll have to wait through a bunch of successes.
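To make the two ratio schedules concrete, here is a minimal sketch in Python. The function names, the ratio of 3, and the one-in-three probability are illustrative assumptions, not part of the lesson. The fixed version rewards every third response exactly; the variable version rewards each response with a one-in-three chance, so rewards average one per three responses but arrive unpredictably:

    import random

    def fixed_ratio_reward(response_count, ratio=3):
        # Reward exactly on the 3rd, 6th, 9th, ... response.
        return response_count % ratio == 0

    def variable_ratio_reward(p=1/3):
        # Reward any given response with probability 1/3: one reward
        # per three responses on average, but the exact gap between
        # rewards is unpredictable.
        return random.random() < p

    # Simulate ten lever presses under each schedule.
    for press in range(1, 11):
        fixed = "reward" if fixed_ratio_reward(press) else "-"
        variable = "reward" if variable_ratio_reward() else "-"
        print(f"press {press:2d}: fixed={fixed:6s} variable={variable}")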

Interval refers to the time between rewards. If that amount of time is predetermined and the same every time, it's a fixed interval. If, however, it's changing and unpredictable, it's a variable interval.
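The interval schedules can be sketched the same way, except the trigger is elapsed time rather than a response count: the first response after the interval has passed is the one that gets rewarded. The class names below are made up for illustration, and while the 5-minute and 5-to-15-minute figures come from the table above, the code is a sketch, not a standard API:

    import random
    import time

    class FixedInterval:
        # Reward the first response after a set time has elapsed
        # (e.g., 5 minutes), then start timing the next interval.
        def __init__(self, minutes=5):
            self.interval = minutes * 60
            self.last_reward = time.monotonic()

        def respond(self):
            now = time.monotonic()
            if now - self.last_reward >= self.interval:
                self.last_reward = now
                return True   # interval elapsed: reward this response
            return False      # too early: no reward

    class VariableInterval(FixedInterval):
        # Same idea, but each new interval is redrawn at random
        # (here 5 to 15 minutes), so the subject cannot predict
        # when responding will pay off again.
        def __init__(self, min_minutes=5, max_minutes=15):
            super().__init__(minutes=random.uniform(min_minutes, max_minutes))
            self.bounds = (min_minutes, max_minutes)

        def respond(self):
            rewarded = super().respond()
            if rewarded:
                self.interval = random.uniform(*self.bounds) * 60
            return rewarded

Note that under either interval schedule, responding faster earns nothing extra; only the passage of time makes the next reward available, which is what separates interval schedules from ratio schedules.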

These four methods are used for different types of operant conditioning. All of them lead to longer-lasting results than a continuous reinforcement schedule, where you're rewarded every time. With continuous reinforcement, you come to expect the reward, and as soon as the rewards stop coming, the conditioning quickly goes to extinction.
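In this framing, continuous reinforcement is the degenerate schedule where the reward check always succeeds. A one-line sketch, following the same illustrative conventions as the functions above:

    def continuous_reward(response_count):
        # Every response is rewarded: fastest to condition, but the
        # behavior extinguishes quickly once rewards stop coming.
        return True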

Now let's look at some examples of reward schedules.

A fixed ratio might be shown by training a seal to do tricks, where it has to do three tricks in a row to get a reward. So the seal balances a ball on its nose, jumps through a hoop, and claps its flippers, all for one fish at the end.

Now, a variable ratio is best exemplified by the slot machine. With the slot machine, we never know when we're going to win, but we know we won't win if we stop pulling that handle. That's what keeps us playing in the hopes of hitting the jackpot.

Fixed interval reinforcement is like your paycheck: you go to work every day, and on a set schedule you're rewarded with a sum of money. A variable interval, on the other hand, is like the random bonuses your boss gives out every so often.

These are the main ways of reinforcing behavior. The fastest way to condition a behavior is to reward it every time. But since the reward comes to be expected, the behavior is unlikely to continue once the rewards stop. The different fixed and variable schedules take longer to condition, but their effects are longer-lasting. This is especially true for variable ratios, because they keep us guessing whether the next response will be the one that wins the jackpot.
