diff --git a/.DS_Store b/.DS_Store
new file mode 100644
index 0000000..778cf1a
Binary files /dev/null and b/.DS_Store differ
diff --git a/units/.DS_Store b/units/.DS_Store
new file mode 100644
index 0000000..c88dcc6
Binary files /dev/null and b/units/.DS_Store differ
diff --git a/units/en/.DS_Store b/units/en/.DS_Store
new file mode 100644
index 0000000..5a0bf24
Binary files /dev/null and b/units/en/.DS_Store differ
diff --git a/units/en/unit7/glossary.mdx b/units/en/unit7/glossary.mdx
deleted file mode 100644
index 04f6b94..0000000
--- a/units/en/unit7/glossary.mdx
+++ /dev/null
@@ -1,29 +0,0 @@
-# Glossary
-
-This is a community-created glossary. Contributions are welcomed!
-
-- **Multi-Agent Reinforcement Learning (MARL):** A subfield of reinforcement learning that deals with scenarios where multiple agents interact with each other and a shared environment. In MARL, the goal is to learn effective policies for each agent, considering the dynamic interactions and interdependencies among them.
-
-- **Cooperative Agents:** Agents that work together to maximize a common benefit.
-
-- **Competitive Agents:** Agents that compete against each other to maximize their benefits by minimizing the opponents.
-
-- **Mixed Agents:** Agents that exhibit both cooperative and competitive behaviors, where some agents need to cooperate to beat an opponent (or a group of opponents).
-
-- **Decentralized Learning:** An approach in MARL where each agent is trained independently without considering the actions or policies of other agents. The big drawback of this technique is that it will make the environment non-stationary and prevent reaching a global optimum.
-
-- **Centralized Learning:** A learning architecture in which a high-level process, **experience buffer**, collects experiences from multiple agents to learn a (single) common policy and achieve a global reward.
-
-- **Non-Stationarity:** The condition in which the underlying Markov decision process in the environment changes over time due to the interactions and decisions made by other agents. It makes it difficult for algorithms to converge to a globally optimal solution since the environment is in a constant state of change.
-
-- **Self-Play:** A training technique where an agent plays against previous versions of its own policy to create a challenging, yet progressively improving, environment.
-
-- **Zero-Sum Game:** A type of game where the total reward is constant, meaning one player's gain is equivalent to another player's loss.
-
-- **ELO Score:** A rating system, named after Arpad Elo, commonly used in adversarial games to track relative skill levels between players. It determines numerical ratings of players based on their match outcomes against opponents.
-
-If you want to improve the course, you can [open a Pull Request.](https://github.com/huggingface/deep-rl-class/pulls)
-
-This glossary was made possible thanks to:
-
-- [Diego Carpintero](https://github.com/dcarpintero)
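
Note on the removed **ELO Score** entry: it says ratings are determined from match outcomes but does not show how. As a rough illustration, here is a minimal sketch of the standard Elo update. The function names, the K-factor of 32, and the 400-point scale are assumptions chosen for the example (common defaults); they do not come from the glossary or from any course code.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B (logistic curve, 400-point scale)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return updated (rating_a, rating_b) after one match.

    score_a is 1.0 if A won, 0.5 for a draw, 0.0 if A lost.
    k is an assumed K-factor; real systems tune it.
    """
    expected_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - expected_a)
    # What A gains, B loses, so the sum of the two ratings is unchanged.
    return rating_a + delta, rating_b - delta


if __name__ == "__main__":
    # Two equally rated players; A wins.
    print(update_elo(1200.0, 1200.0, 1.0))  # (1216.0, 1184.0)
```

In this sketch the rating exchange is itself zero-sum, which matches the **Zero-Sum Game** entry: the winner's gain equals the loser's loss.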