Add illustrations

This commit is contained in:
simoninithomas
2023-01-07 10:48:28 +01:00
parent 92dc5ce8eb
commit 98f4c85709
2 changed files with 4 additions and 5 deletions


@@ -23,6 +23,8 @@ If you want to know more about curiosity, the next section (optional) will expla
In terms of observation, we **use 148 raycasts that can each detect objects** (switch, bricks, golden brick, and walls).
<img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit7/pyramids_raycasts.png"/>
We also use a **boolean variable indicating the switch state** (whether or not we turned on the switch to spawn the Pyramid) and a vector that **contains the agent's speed**.
TODO: add code screenshot
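The observation described above (148 raycast readings, a switch-state boolean, and the agent's speed vector) can be illustrated with a toy Python sketch. This is not the actual Unity/ML-Agents implementation (which is written in C#); the function name `build_observation` and its arguments are hypothetical, shown only to make the vector layout concrete:

```python
import numpy as np

def build_observation(raycast_hits, switch_on, agent_velocity):
    """Concatenate the Pyramids observations into one flat vector.

    raycast_hits: 148 raycast readings (one per ray).
    switch_on: bool, whether the switch has been activated.
    agent_velocity: the agent's 3-D speed vector.
    """
    return np.concatenate([
        np.asarray(raycast_hits, dtype=np.float32),      # 148 raycast observations
        np.array([float(switch_on)], dtype=np.float32),  # switch state encoded as 0/1
        np.asarray(agent_velocity, dtype=np.float32),    # agent speed vector
    ])

obs = build_observation(np.zeros(148), switch_on=True, agent_velocity=[0.0, 0.0, 1.0])
print(obs.shape)  # (152,)
```

The boolean is cast to a float so the whole observation can be fed to the policy network as one numeric vector.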


@@ -13,7 +13,7 @@ In addition, to avoid "snowball spamming" (aka shooting a snowball every timeste
<figure>
<img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit7/cooloffsystem.gif" alt="Cool Off System"/>
<figcaption>The agent needs to wait 0.5s before being able to shoot a snowball again</figcaption>
</figure>
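The cool-off system shown in the figure can be sketched as a simple timer: a shot is only allowed if at least 0.5 s have passed since the last one. This is a minimal Python illustration, not the game's actual C# code; the class name `SnowballCooldown` is made up for the example:

```python
class SnowballCooldown:
    """Cool-off system: the agent may only shoot every `cooldown` seconds."""

    def __init__(self, cooldown=0.5):
        self.cooldown = cooldown
        self.last_shot_time = -float("inf")  # no shot fired yet

    def try_shoot(self, current_time):
        """Return True (and register the shot) if the cool-off has elapsed."""
        if current_time - self.last_shot_time >= self.cooldown:
            self.last_shot_time = current_time
            return True
        return False

timer = SnowballCooldown()
print(timer.try_shoot(0.0))  # True  - first shot is allowed
print(timer.try_shoot(0.3))  # False - still cooling off
print(timer.try_shoot(0.6))  # True  - 0.6s since the last shot
```

This is what prevents "snowball spamming": between two allowed shots the shoot action simply has no effect.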
## The reward function and the reward engineering problem
@@ -39,10 +39,7 @@ Think of raycasts as lasers that will detect if it passes through an object.
In this environment, our agent has multiple sets of raycasts:
TODO: add the raycasts that can each detect objects (target, walls) and how many we have
<img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit7/snowball_target_raycasts.png" alt="Raycasts"/>
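To build intuition for raycasts as "lasers" that detect what they pass through, here is a toy 2-D ray-marching sketch in Python. It is not how Unity's physics engine implements raycasts; the `raycast` function and the grid-cell scene representation are simplifications invented for illustration:

```python
def raycast(origin, direction, obstacles, max_dist=10.0, step=0.1):
    """March a ray from `origin` along `direction`; report the first hit.

    obstacles: dict mapping a tag ("target", "wall") to a set of integer
    grid cells that object occupies (a toy stand-in for colliders).
    Returns (tag, distance) for the first object hit, or (None, max_dist).
    """
    x, y = origin
    dx, dy = direction
    dist = 0.0
    while dist < max_dist:
        cell = (round(x), round(y))
        for tag, cells in obstacles.items():
            if cell in cells:
                return tag, dist  # the ray passed through this object
        x += dx * step
        y += dy * step
        dist += step
    return None, max_dist  # nothing hit within range

scene = {"target": {(3, 0)}, "wall": {(5, 0)}}
print(raycast((0, 0), (1, 0), scene))  # detects the target before the wall
```

Each ray in the agent's observation works the same way conceptually: it reports which kind of object it hit (and how far away), giving the agent a coarse picture of its surroundings.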
## The action space