Files
deep-rl-class/unit2/README.md
2022-09-18 19:45:13 +02:00

98 lines
6.1 KiB
Markdown
Raw Blame History

This file contains invisible Unicode characters
This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Unit 2: Introduction to Q-Learning
In this Unit, we're going to dive deeper into one of the Reinforcement Learning methods: value-based methods and **study our first RL algorithm: Q-Learning**.
We'll also implement our **first RL agent from scratch**: a Q-Learning agent and will train it in two environments:
- [Frozen-Lake-v1 ⛄ (non-slippery version)](https://www.gymlibrary.dev/environments/toy_text/frozen_lake/): where our agent will need to go from the starting state (S) to the goal state (G) by walking only on frozen tiles (F) and avoiding holes (H).
- [An autonomous taxi 🚕](https://www.gymlibrary.dev/environments/toy_text/taxi/?highlight=taxi) will need to learn to navigate a city to transport its passengers from point A to point B.
<img src="assets/img/envs.gif" alt="unit 2 environments"/>
You'll then be able to **compare your agents results with other classmates thanks to a leaderboard** 🔥.
This Unit is divided into 2 parts:
- [Part 1](https://huggingface.co/blog/deep-rl-q-part1)
- [Part 2](https://huggingface.co/blog/deep-rl-q-part2)
<img src="assets/img/two_parts.jpg" alt="Two parts"/>
This course is **self-paced**, you can start whenever you want.
## Required time ⏱️
The required time for this unit is, approximately:
- **2-3 hours** for the theory
- **1 hour** for the hands-on.
## Start this Unit 🚀
Here are the steps for this Unit:
1⃣ 📝 If it's not already done, sign up to our Discord Server. This is the place where you **can exchange with the community and with us, create study groups to grow each other and more** 
👉🏻 [https://discord.gg/aYka4Yhff9](https://discord.gg/aYka4Yhff9).
Are you new to Discord? Check our **discord 101 to get the best practices** 👉 https://github.com/huggingface/deep-rl-class/blob/main/DISCORD.Md
2⃣ 👋 **Introduce yourself on Discord in #introduce-yourself Discord channel 🤗 and check on the left the Reinforcement Learning section.**
- In #rl-announcements we give the last information about the course.
- #discussions is a place to exchange.
- #unity-ml-agents is to exchange about everything related to this library.
- #study-groups, to create study groups with your classmates.
<img src="assets/img/discord_channels.jpg" alt="Discord Channels"/>
3⃣ 📖 **Read An [Introduction to Q-Learning Part 1](https://huggingface.co/blog/deep-rl-q-part1)**.
4⃣ 📝 Take a piece of paper and **check your knowledge with this series of questions** ❔ 👉 https://github.com/huggingface/deep-rl-class/blob/main/unit2/quiz1.md
5⃣ 📖 **Read An [Introduction to Q-Learning Part 2](https://huggingface.co/blog/deep-rl-q-part2)**.
6⃣ 📝 Take a piece of paper and **check your knowledge with this series of questions** ❔ 👉 https://github.com/huggingface/deep-rl-class/blob/main/unit2/quiz2.md
7⃣ 👩‍💻 Then dive on the hands-on, where **youll implement our first RL agent from scratch**, a Q-Learning agent, and will train it in two environments:
1. Frozen Lake v1 ❄️: where our agent will need to **go from the starting state (S) to the goal state (G)** by walking only on frozen tiles (F) and avoiding holes (H).
2. An autonomous taxi 🚕: where the agent will need **to learn to navigate** a city to **transport its passengers from point A to point B.**
Thanks to a leaderboard, **you'll be able to compare your results with other classmates** and exchange the best practices to improve your agent's scores Who will win the challenge for Unit 2 🏆?
👩‍💻 The hands-on 👉 [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/deep-rl-class/blob/main/unit2/unit2.ipynb)
🏆 The leaderboard 👉 https://huggingface.co/spaces/huggingface-projects/Deep-Reinforcement-Learning-Leaderboard
You can work directly **with the colab notebook, which allows you not to have to install everything on your machine (and its free)**.
8⃣ The best way to learn **is to try things on your own**. Thats why we have a challenges section in the colab where we give you some ideas on how you can go further: using another environment, using another model etc.
## Additional readings 📚
- [Reinforcement Learning: An Introduction, Richard Sutton and Andrew G. Barto Chapter 5, 6 and 7](http://incompleteideas.net/book/RLbook2020.pdf)
- [Foundations of Deep RL Series, L2 Deep Q-Learning by Pieter Abbeel](https://youtu.be/Psrhxy88zww)
- To dive deeper on Monte Carlo and Temporal Difference Learning:
- [Why do temporal difference (TD) methods have lower variance than Monte Carlo methods?](https://stats.stackexchange.com/questions/355820/why-do-temporal-difference-td-methods-have-lower-variance-than-monte-carlo-met)
- [When are Monte Carlo methods preferred over temporal difference ones?](https://stats.stackexchange.com/questions/336974/when-are-monte-carlo-methods-preferred-over-temporal-difference-ones)
## How to make the most of this course
To make the most of the course, my advice is to:
- **Participate in Discord** and join a study group.
- **Read multiple times** the theory part and takes some notes
- Dont just do the colab. When you learn something, try to change the environment, change the parameters and read the libraries' documentation. Have fun 🥳
- Struggling is **a good thing in learning**. It means that you start to build new skills. Deep RL is a complex topic and it takes time to understand. Try different approaches, use our additional readings, and exchange with classmates on discord.
## This is a course built with you 👷🏿‍♀️
We want to improve and update the course iteratively with your feedback. **If you have some, please fill this form** 👉 https://forms.gle/3HgA7bEHwAmmLfwh9
## Dont forget to join the Community 📢
We have a discord server where you **can exchange with the community and with us, create study groups to grow each other and more** 
👉🏻 [https://discord.gg/aYka4Yhff9](https://discord.gg/aYka4Yhff9).
Dont forget to **introduce yourself when you sign up 🤗**
❓If you have other questions, [please check our FAQ](https://github.com/huggingface/deep-rl-class#faq)
## Keep learning, stay awesome 🤗,