add glossary to unit7

2026-06-15 06:27:24 +08:00 · 2023-07-05 14:40:16 +02:00
parent 4f4e0e40ed
commit 719b8a05ec
2 changed files with 31 additions and 0 deletions
--- a/units/en/_toctree.yml
+++ b/units/en/_toctree.yml
@@ -172,6 +172,8 @@
    title: Designing Multi-Agents systems
  - local: unit7/self-play
    title: Self-Play
+  - local: unit7/glossary
+    title: Glossary
  - local: unit7/hands-on
    title: Let's train our soccer team to beat your classmates' teams (AI vs. AI)
  - local: unit7/conclusion
--- a/units/en/unit7/glossary.mdx
+++ b/units/en/unit7/glossary.mdx
@@ -0,0 +1,29 @@
+# Glossary
+
+This is a community-created glossary. Contributions are welcome!
+
+- **Multi-Agent Reinforcement Learning (MARL):** A subfield of reinforcement learning that deals with scenarios where multiple agents interact with each other and a shared environment. In MARL, the goal is to learn effective policies for each agent, considering the dynamic interactions and interdependencies among them.
+
+- **Cooperative Agents:** Agents that work together to maximize a common benefit.
+
+- **Competitive Agents:** Agents that compete against each other, each maximizing its own benefit by minimizing that of its opponents.
+
+- **Mixed Agents:** Agents that exhibit both cooperative and competitive behaviors, where some agents need to cooperate to beat an opponent (or a group of opponents).
+
+- **Decentralized Learning:** An approach in MARL where each agent is trained independently, treating the other agents as part of the environment rather than considering their actions or policies. The main drawback is that, because the other agents are learning too, the environment becomes non-stationary from each agent's point of view, so reaching a global optimum is not guaranteed.
+
+- **Centralized Learning:** A learning architecture in which a high-level process (the **experience buffer**) collects experiences from multiple agents. These experiences are used to learn a single common policy that maximizes a global reward.
+
+- **Non-Stationarity:** The condition in which the underlying Markov decision process in the environment changes over time due to the interactions and decisions made by other agents. It makes it difficult for algorithms to converge to a globally optimal solution since the environment is in a constant state of change.
+
+- **Self-Play:** A training technique where an agent plays against previous versions of its own policy to create a challenging, yet progressively improving, environment.
+
+- **Zero-Sum Game:** A type of game where the players' rewards sum to zero, meaning one player's gain is exactly balanced by the other players' losses.
+
+- **Elo Score:** A rating system, named after Arpad Elo, commonly used in adversarial games to track the relative skill levels of players. It assigns each player a numerical rating that is updated according to the outcomes of matches against opponents.
+
+If you want to improve the course, you can [open a Pull Request](https://github.com/huggingface/deep-rl-class/pulls).
+
+This glossary was made possible thanks to:
+
+- [Diego Carpintero](https://github.com/dcarpintero)