Files
deep-rl-class/units/en/unit7/introduction-to-marl.mdx
Dylan Wilson 87e65269dd Typos Unit7
2023-04-19 10:50:25 -05:00

56 lines
3.3 KiB
Plaintext

# An introduction to Multi-Agents Reinforcement Learning (MARL)
## From single agent to multiple agents
In the first unit, we learned to train agents in a single-agent system. When our agent was alone in its environment: **it was not cooperating or collaborating with other agents**.
<figure>
<img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit10/patchwork.jpg" alt="Patchwork"/>
<figcaption>
A patchwork of all the environments you've trained your agents on since the beginning of the course
</figcaption>
</figure>
When we do multi-agents reinforcement learning (MARL), we are in a situation where we have multiple agents **that share and interact in a common environment**.
For instance, you can think of a warehouse where **multiple robots need to navigate to load and unload packages**.
<figure>
<img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit10/warehouse.jpg" alt="Warehouse"/>
<figcaption> [Image by upklyak](https://www.freepik.com/free-vector/robots-warehouse-interior-automated-machines_32117680.htm#query=warehouse robot&position=17&from_view=keyword) on Freepik </figcaption>
</figure>
Or a road with **several autonomous vehicles**.
<figure>
<img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit10/selfdrivingcar.jpg" alt="Self driving cars"/>
<figcaption>
[Image by jcomp](https://www.freepik.com/free-vector/autonomous-smart-car-automatic-wireless-sensor-driving-road-around-car-autonomous-smart-car-goes-scans-roads-observe-distance-automatic-braking-system_26413332.htm#query=self driving cars highway&position=34&from_view=search&track=ais) on Freepik
</figcaption>
</figure>
In these examples, we have **multiple agents interacting in the environment and with the other agents**. This implies defining a multi-agents system. But first, let's understand the different types of multi-agent environments.
## Different types of multi-agent environments
Given that, in a multi-agent system, agents interact with other agents, we can have different types of environments:
- *Cooperative environments*: where your agents need **to maximize the common benefits**.
For instance, in a warehouse, **robots must collaborate to load and unload the packages efficiently (as fast as possible)**.
- *Competitive/Adversarial environments*: in this case, your agent **wants to maximize its benefits by minimizing the opponent's**.
For example, in a game of tennis, **each agent wants to beat the other agent**.
<img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit10/tennis.png" alt="Tennis"/>
- *Mixed of both adversarial and cooperative*: like in our SoccerTwos environment, two agents are part of a team (blue or purple): they need to cooperate with each other and beat the opponent team.
<figure>
<img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit10/soccertwos.gif" alt="SoccerTwos"/>
<figcaption>This environment was made by the <a href="https://github.com/Unity-Technologies/ml-agents">Unity MLAgents Team</a></figcaption>
</figure>
So now we might wonder: how can we design these multi-agent systems? Said differently, **how can we train agents in a multi-agent setting** ?