Files
deep-rl-class/units/en/unit2/introduction.mdx
Thomas Simonini 20a12074bc Apply suggestions from code review from Omar
Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>
2022-12-08 15:50:46 +01:00

27 lines
1.6 KiB
Plaintext
Raw Permalink Blame History

This file contains invisible Unicode characters
This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Introduction to Q-Learning [[introduction-q-learning]]
<img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit3/thumbnail.jpg" alt="Unit 2 thumbnail" width="100%">
In the first unit of this class, we learned about Reinforcement Learning (RL), the RL process, and the different methods to solve an RL problem. We also **trained our first agents and uploaded them to the Hugging Face Hub.**
In this unit, we're going to **dive deeper into one of the Reinforcement Learning methods: value-based methods** and study our first RL algorithm: **Q-Learning.**
We'll also **implement our first RL agent from scratch**, a Q-Learning agent, and will train it in two environments:
1. Frozen-Lake-v1 (non-slippery version): where our agent will need to **go from the starting state (S) to the goal state (G)** by walking only on frozen tiles (F) and avoiding holes (H).
2. An autonomous taxi: where our agent will need **to learn to navigate** a city to **transport its passengers from point A to point B.**
<img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit3/envs.gif" alt="Environments"/>
Concretely, we will:
- Learn about **value-based methods**.
- Learn about the **differences between Monte Carlo and Temporal Difference Learning**.
- Study and implement **our first RL algorithm**: Q-Learning.
This unit is **fundamental if you want to be able to work on Deep Q-Learning**: the first Deep RL algorithm that played Atari games and beat the human level on some of them (breakout, space invaders, etc).
So let's get started! 🚀