mirror of
https://github.com/huggingface/deep-rl-class.git
synced 2026-04-05 11:38:43 +08:00
Update pyramids.mdx
@@ -11,6 +11,9 @@ The reward function is:
<img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit7/pyramids-reward.png" alt="Pyramids Environment"/>
In terms of code, it looks like this:
<img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit7/pyramids-reward-code.png" alt="Pyramids Reward"/>
To train this new agent to seek the button and then the Pyramid to destroy, we’ll use a combination of two types of rewards:
- The *extrinsic one* given by the environment (illustration above).
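As a rough sketch of how that combination works (the function names and the `strength` coefficient below are illustrative, not the actual ML-Agents implementation — ML-Agents configures curiosity in the trainer config), the training signal adds an intrinsic curiosity bonus to the environment's extrinsic reward:

```python
def curiosity_bonus(predicted_next_obs, actual_next_obs):
    # ICM-style intrinsic reward: the forward model's prediction error.
    # Novel states are poorly predicted, so they yield a bigger bonus.
    sq_errors = [(p - a) ** 2 for p, a in zip(predicted_next_obs, actual_next_obs)]
    return sum(sq_errors) / len(sq_errors)

def total_reward(extrinsic, intrinsic, strength=0.02):
    # Combined training signal: extrinsic reward from the environment
    # plus the scaled intrinsic (curiosity) reward.
    return extrinsic + strength * intrinsic

r = total_reward(extrinsic=1.0,
                 intrinsic=curiosity_bonus([0.0, 0.0], [0.1, 0.0]))
```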
@@ -26,11 +29,11 @@ In terms of observation, we **use 148 raycasts that can each detect objects** (s
We also use a **boolean variable indicating the switch state** (did we turn the switch on to spawn the Pyramid or not) and a vector that **contains the agent’s speed**.
-<img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit5/pyramids-obs-code.png" alt="Pyramids obs code"/>
+<img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit7/pyramids-obs-code.png" alt="Pyramids obs code"/>
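A minimal sketch of how those pieces could be flattened into one observation vector (the function name and layout are illustrative; the real observation collection happens in the Unity C# agent):

```python
def build_observation(raycast_hits, switch_on, velocity):
    # 148 raycast results + 1 switch-state flag + the agent's velocity.
    # Illustrative layout only, not the actual ML-Agents sensor code.
    assert len(raycast_hits) == 148, "the Pyramids agent uses 148 raycasts"
    return list(raycast_hits) + [1.0 if switch_on else 0.0] + list(velocity)

obs = build_observation([0.0] * 148, switch_on=True, velocity=[0.0, 0.0, 1.0])
len(obs)  # 148 raycasts + 1 flag + 3 velocity components = 152
```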
## The action space
The action space is **discrete** with four possible actions:
-<img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit5/pyramids-action.png" alt="Pyramids Environment"/>
+<img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit7/pyramids-action.png" alt="Pyramids Environment"/>
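For illustration, sampling one of the four discrete actions from policy logits might look like the sketch below (the action list is a placeholder; the real action ordering is defined by the Unity environment, not shown here):

```python
import math
import random

ACTIONS = ["action_0", "action_1", "action_2", "action_3"]  # placeholder names

def sample_action(logits, rng=random):
    # Softmax over the four logits, then sample one discrete action index.
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return rng.choices(range(len(ACTIONS)), weights=probs)[0]

a = sample_action([0.0, 0.0, 0.0, 0.0])  # uniform over the four actions
```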