diff --git a/units/en/unit5/snowball-target.mdx b/units/en/unit5/snowball-target.mdx
index c65511d..e1061a5 100644
--- a/units/en/unit5/snowball-target.mdx
+++ b/units/en/unit5/snowball-target.mdx
@@ -10,7 +10,10 @@ The goal in this environment is that Julien the bear **hit as many targets as po
 In addition, to avoid "snowball spamming" (aka shooting a snowball every timestep), **Julien the bear has a "cool off" system** (it needs to wait 0.5 seconds after a shoot to be able to shoot again).
-ADD GIF COOLOFF
+
+Cool Off System +
The agent needs to wait 0.5s before being able to shoot a snowball again
+
 ## The reward function and the reward engineering problem