diff --git a/units/en/unit5/pyramids.mdx b/units/en/unit5/pyramids.mdx
index 8983692..c5d23f6 100644
--- a/units/en/unit5/pyramids.mdx
+++ b/units/en/unit5/pyramids.mdx
@@ -11,6 +11,9 @@ The reward function is:
+In terms of code, it looks like this:
+
+
To train this new agent, which must press the button and then destroy the Pyramid, we’ll use a combination of two types of rewards:
- The *extrinsic one* given by the environment (illustration above).
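As a rough sketch (not ML-Agents' actual implementation), combining the two reward signals can be as simple as a weighted sum; the names `extrinsic_reward`, `curiosity_reward`, and the weight `beta` below are illustrative assumptions:

```python
# Illustrative sketch: the agent optimizes the environment's extrinsic
# reward plus a weighted intrinsic curiosity bonus.
def total_reward(extrinsic_reward: float, curiosity_reward: float, beta: float = 0.02) -> float:
    # beta controls how much the exploration bonus matters
    # relative to the task reward (value here is arbitrary)
    return extrinsic_reward + beta * curiosity_reward

print(total_reward(extrinsic_reward=1.0, curiosity_reward=0.5))
```

A larger `beta` pushes the agent to explore more; a smaller one makes it focus on the extrinsic task reward.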
@@ -26,11 +29,11 @@ In terms of observation, we **use 148 raycasts that can each detect objects** (s
We also use a **boolean variable indicating the switch state** (did we turn the switch on to spawn the Pyramid or not) and a vector that **contains the agent’s speed**.
-
+
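To make the shape of this observation concrete, here is a minimal sketch of how the pieces described above could be concatenated into one vector; the field names and the 3-dimensional speed are assumptions for illustration only:

```python
import numpy as np

# Illustrative sketch: the observation combines the raycast results,
# the switch state, and the agent's speed. Names and the speed vector's
# size are assumptions; only the 148 raycasts come from the text above.
NUM_RAYCASTS = 148

raycasts = np.zeros(NUM_RAYCASTS, dtype=np.float32)  # what each ray detects
switch_on = np.array([0.0], dtype=np.float32)        # boolean switch state
speed = np.zeros(3, dtype=np.float32)                # agent's speed vector

observation = np.concatenate([raycasts, switch_on, speed])
print(observation.shape)  # (152,)
```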
## The action space
The action space is **discrete** with four possible actions:
-
+
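Since the action space is discrete with four actions, the policy only needs to output an integer id in `{0, 1, 2, 3}`. The sketch below shows this structure; the action names are placeholders, not the environment's actual labels:

```python
import random

# Illustrative sketch of a discrete action space with four actions.
# The names are placeholder assumptions; the environment only ever
# receives the integer action id.
ACTIONS = {0: "action_0", 1: "action_1", 2: "action_2", 3: "action_3"}

def sample_action() -> int:
    # A random policy: pick one of the four discrete actions uniformly
    return random.randrange(len(ACTIONS))

action_id = sample_action()
print(action_id in ACTIONS)  # True
```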