mirror of
https://github.com/huggingface/deep-rl-class.git
synced 2026-04-14 18:31:36 +08:00
Some small updates
@@ -46,6 +46,10 @@
     title: Play with Huggy
   - local: unitbonus1/conclusion
     title: Conclusion
+- title: Live 1. How the course works, Q&A, and playing with Huggy 🐶
+  sections:
+  - local: live1/live1.mdx
+    title: Live 1. How the course works, Q&A, and playing with Huggy 🐶
 - title: Unit 2. Introduction to Q-Learning
   sections:
   - local: unit2/introduction
@@ -96,7 +100,7 @@
     title: Conclusion
   - local: unit3/additional-readings
     title: Additional Readings
-- title: Unit Bonus 2. Automatic Hyperparameter Tuning with Optuna
+- title: Bonus Unit 2. Automatic Hyperparameter Tuning with Optuna
   sections:
   - local: unitbonus2/introduction
     title: Introduction
10 units/en/live1/live1.mdx Normal file
@@ -0,0 +1,10 @@
+# Live 1: Deep RL Course. Intro, Q&A, and playing with Huggy 🐶
+
+In this first live stream, we explained how the course works (scope, units, challenges, and more) and answered your questions.
+
+And finally, we saw some LunarLander agents you've trained and played with your Huggies 🐶
+
+<Youtube id="JeJIswxyrsM" />
+
+
+To know when the next live is scheduled, **check the Discord server**. We will also send **you an email**. If you can't participate, don't worry, we record the live sessions.
@@ -14,3 +14,9 @@ You will be able then to play with him 🤗.
 
+<video src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit1/huggy.mp4" alt="Huggy" type="video/mp4">
+</video>
+
+Finally, we would love **to hear what you think of the course and how we can improve it**. If you have some feedback, please 👉 [fill this form](https://forms.gle/BzKXWzLAGZESGNaE9)
+
+### Keep Learning, stay awesome 🤗
@@ -15,5 +15,7 @@ In the next chapter, we’re going to dive deeper by studying our first Deep Rei
 
 <img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit4/atari-envs.gif" alt="Atari environments"/>
 
+Finally, we would love **to hear what you think of the course and how we can improve it**. If you have some feedback, please 👉 [fill this form](https://forms.gle/BzKXWzLAGZESGNaE9)
+
 ### Keep Learning, stay awesome 🤗
@@ -62,7 +62,7 @@ For each state, the state-value function outputs the expected return if the agent
 
 In the action-value function, for each state and action pair, the action-value function **outputs the expected return** if the agent starts in that state, takes that action, and then follows the policy forever after.
 
-The value of taking action an in state \\(s\\) under a policy \\(π\\) is:
+The value of taking action \\(a\\) in state \\(s\\) under a policy \\(π\\) is:
 
-<img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit3/action-state-value-function-1.jpg" alt="Action State value function"/>
+<img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit3/action-state-value-function-2.jpg" alt="Action State value function"/>
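Written out, the definition those images express is the standard one, with \\(G_t\\) the return and \\(\gamma\\) the discount factor:

\\(Q_{\pi}(s, a) = \mathbb{E}_{\pi}\left[G_t \mid S_t = s, A_t = a\right] = \mathbb{E}_{\pi}\left[\sum_{k=0}^{\infty} \gamma^{k} R_{t+k+1} \mid S_t = s, A_t = a\right]\\)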
@@ -11,4 +11,7 @@ Don't hesitate to train your agent in other environments (Pong, Seaquest, QBert,
 
 In the next unit, **we're going to learn about Optuna**. One of the most critical tasks in Deep Reinforcement Learning is finding a good set of training hyperparameters, and Optuna is a library that helps you automate that search.
 
+Finally, we would love **to hear what you think of the course and how we can improve it**. If you have some feedback, please 👉 [fill this form](https://forms.gle/BzKXWzLAGZESGNaE9)
+
 ### Keep Learning, stay awesome 🤗
@@ -13,7 +13,7 @@ Internally, our Q-function has **a Q-table, a table where each cell corresponds
 
 The problem is that Q-Learning is a *tabular method*. It only works when the state and action spaces **are small enough for the value functions to be represented as arrays and tables**. In other words, it is **not scalable**.
 Q-Learning worked well with small state space environments like:
 
-- FrozenLake, we had 14 states.
+- FrozenLake, we had 16 states.
 - Taxi-v3, we had 500 states.
 
 But think of what we're going to do today: we will train an agent to learn to play Space Invaders, a more complex game, using the frames as input.
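To make the scale problem concrete, here is a sketch of the tabular setup for FrozenLake (16 states, 4 actions); the transition values below are made up for illustration:

```python
# Q-table for FrozenLake: 16 states x 4 actions (left, down, right, up).
n_states, n_actions = 16, 4
q_table = [[0.0] * n_actions for _ in range(n_states)]

# One tabular Q-Learning update on a hypothetical transition:
# Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
alpha, gamma = 0.1, 0.99
s, a, r, s_next = 0, 2, 1.0, 1
td_target = r + gamma * max(q_table[s_next])
q_table[s][a] += alpha * (td_target - q_table[s][a])

# This table has only 16 * 4 = 64 entries. A stack of four 84x84
# greyscale Atari frames has 256**(84*84*4) possible states, far too
# many to enumerate in any table.
print(q_table[0][2])  # 0.1
```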
@@ -6,5 +6,7 @@ You can now sit and enjoy playing with your Huggy 🐶. And don't **forget to sp
 
 <img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/notebooks/unit-bonus1/huggy-cover.jpeg" alt="Huggy cover" width="100%">
 
+Finally, we would love **to hear what you think of the course and how we can improve it**. If you have some feedback, please 👉 [fill this form](https://forms.gle/BzKXWzLAGZESGNaE9)
+
-### Keep Learning, stay awesome 🤗
+### Keep Learning, Stay Awesome 🤗
@@ -9,3 +9,8 @@ Now that you've learned to use Optuna, we give you some ideas to apply what you'
 
 By doing that, you're going to see how valuable and powerful Optuna is for training better agents.
 
+Have fun,
+
+Finally, we would love **to hear what you think of the course and how we can improve it**. If you have some feedback, please 👉 [fill this form](https://forms.gle/BzKXWzLAGZESGNaE9)
+
+### Keep Learning, stay awesome 🤗