mirror of
https://github.com/huggingface/deep-rl-class.git
synced 2026-04-03 10:38:27 +08:00
Add illustration links
@@ -14,7 +14,7 @@ This worked great, and the single-agent system is useful for many applications.
<figcaption>
A patchwork of all the environments you’ve trained your agents on since the beginning of the course
</figcaption>
</figure>
But, as humans, **we live in a multi-agent world**. Our intelligence comes from interaction with other agents. And so, our **goal is to create agents that can interact with other humans and other agents**.
@@ -11,7 +11,12 @@ To design this multi-agent reinforcement learning system (MARL), we have two so
## Decentralized system
<figure>
<img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit10/decentralized.png" alt="Decentralized"/>
<figcaption>
Source: <a href="https://www.youtube.com/watch?v=qgb0gyrpiGk"> Introduction to Multi-Agent Reinforcement Learning </a>
</figcaption>
</figure>
In decentralized learning, **each agent is trained independently from the others**. In the example given, each vacuum learns to clean as much of the space as it can **without caring about what the other vacuums (agents) are doing**.
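As a minimal sketch of this idea (the `Policy` class and vacuum names here are illustrative, not from the course code), decentralized learning means each agent owns its own policy and updates it only from its own experience:

```python
import random

class Policy:
    """Toy stand-in for an agent's own policy network."""
    def __init__(self, n_actions):
        self.n_actions = n_actions
        self.seen = []  # transitions this agent alone has collected

    def act(self, obs):
        return random.randrange(self.n_actions)

    def update(self, transition):
        # Each agent updates from its OWN experience only:
        # it never sees the other agents' observations or actions.
        self.seen.append(transition)

# Each vacuum (agent) gets a separate, independent policy.
policies = {f"vacuum_{i}": Policy(n_actions=4) for i in range(3)}

for step in range(10):
    for name, policy in policies.items():
        obs = step                    # placeholder observation
        action = policy.act(obs)
        policy.update((obs, action))  # independent update per agent
```

From each agent's point of view, the other agents are just part of the environment, which is exactly why the environment appears non-stationary to standard single-agent algorithms.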
@@ -24,7 +29,12 @@ And this is problematic for many reinforcement learning algorithms **that can't
## Centralized approach
<figure>
<img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit10/centralized.png" alt="Centralized"/>
<figcaption>
Source: <a href="https://www.youtube.com/watch?v=qgb0gyrpiGk"> Introduction to Multi-Agent Reinforcement Learning </a>
</figcaption>
</figure>
In this architecture, **we have a high-level process that collects the agents' experiences**: the experience buffer. We then use these experiences **to learn a common policy**.
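As a hedged sketch (the buffer and function names are illustrative, not from the course code), the centralized approach pools every agent's experience into one shared buffer, and a single common policy is updated from that pooled data:

```python
from collections import deque

# Shared experience buffer: every agent's transitions go into the SAME buffer.
shared_buffer = deque(maxlen=10_000)

def collect(agent_id, obs, action, reward):
    """Called by each agent; experiences are pooled centrally."""
    shared_buffer.append((agent_id, obs, action, reward))

def update_common_policy(buffer):
    """Placeholder for a learning step on the pooled data.

    In practice this would be a gradient update of one shared policy;
    here we just compute the mean reward over the buffer.
    """
    return sum(r for (_, _, _, r) in buffer) / len(buffer)

# Four agents all contribute experience to the same buffer.
for agent_id in range(4):
    collect(agent_id, obs=0, action=1, reward=float(agent_id))

avg = update_common_policy(shared_buffer)  # one policy learns from all agents
```

Because the learner sees all agents' experiences together, the joint behavior of the group is part of the training data, which sidesteps the non-stationarity problem of the decentralized setting.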
@@ -1,11 +1,11 @@
# Self-Play: a classic technique to train competitive agents in adversarial games
Now that we've studied the basics of multi-agent systems, we're ready to go deeper. As mentioned in the introduction, we're going **to train agents in an adversarial game with SoccerTwos, a 2vs2 game**.
<figure>
<img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit10/soccertwos.gif" alt="SoccerTwos"/>
<figcaption>This environment was made by the <a href="https://github.com/Unity-Technologies/ml-agents">Unity MLAgents Team</a></figcaption>
</figure>
@@ -80,7 +80,7 @@ After every game:
So if A and B have ratings Ra and Rb, then the **expected scores are** given by:
<img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit10/elo1.png" alt="ELO Score"/>
Then, at the end of the game, we need to update the player's actual Elo score. We use a linear adjustment **proportional to the amount by which the player over-performed or under-performed.**
@@ -91,7 +91,7 @@ We also define a maximum adjustment rating per game: K-factor.
If Player A has Ea points but scored Sa points, then the player’s rating is updated using the formula:
<img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit10/elo2.png" alt="ELO Score"/>
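The two formulas above can be sketched in a few lines of Python. This is a minimal illustration of the standard Elo update (expected score, then a K-factor-weighted adjustment), not the course's own implementation:

```python
def expected_score(r_a: float, r_b: float) -> float:
    # E_a = 1 / (1 + 10 ** ((R_b - R_a) / 400))
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def update_rating(r_a: float, e_a: float, s_a: float, k: float = 16) -> float:
    # R'_a = R_a + K * (S_a - E_a): adjustment proportional to
    # how much the player over- or under-performed, capped by K.
    return r_a + k * (s_a - e_a)

# A 2600-rated player beats a 2300-rated player (S_a = 1 for a win).
e_a = expected_score(2600, 2300)
new_r_a = update_rating(2600, e_a, s_a=1.0)
print(round(e_a, 3))      # 0.849  (the favorite was expected to win)
print(round(new_r_a, 1))  # 2602.4 (small gain, since the win was expected)
```

Note how winning as the heavy favorite yields only a small rating gain, while an upset win by the underdog would move both ratings much more.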
### Example