Update advantages-disadvantages.mdx

Change remaining references to rose states
2026-04-01 17:51:01 +08:00 · 2023-05-09 18:47:59 +05:30
parent adbd2abb38
commit e30be856bc
1 changed file with 1 addition and 1 deletion
--- a/units/en/unit4/advantages-disadvantages.mdx
+++ b/units/en/unit4/advantages-disadvantages.mdx
@@ -38,7 +38,7 @@ Under a deterministic policy, the policy will either always move right when in a

 Under a value-based Reinforcement learning algorithm, we learn a **quasi-deterministic policy** ("greedy epsilon strategy"). Consequently, our agent can **spend a lot of time before finding the dust**.

-On the other hand, an optimal stochastic policy **will randomly move left or right in rose states**. Consequently, **it will not be stuck and will reach the goal state with a high probability**.
+On the other hand, an optimal stochastic policy **will randomly move left or right in red (colored) states**. Consequently, **it will not be stuck and will reach the goal state with a high probability**.

 <figure class="image table text-center m-0 w-full">
  <img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit6/hamster3.jpg" alt="Hamster 1"/>