mirror of
https://github.com/huggingface/deep-rl-class.git
synced 2026-04-14 18:31:36 +08:00
Some small updates
@@ -46,6 +46,10 @@
     title: Play with Huggy
   - local: unitbonus1/conclusion
     title: Conclusion
+- title: Live 1. How the course works, Q&A, and playing with Huggy 🐶
+  sections:
+  - local: live1/live1.mdx
+    title: Live 1. How the course works, Q&A, and playing with Huggy 🐶
 - title: Unit 2. Introduction to Q-Learning
   sections:
   - local: unit2/introduction
@@ -96,7 +100,7 @@
     title: Conclusion
   - local: unit3/additional-readings
     title: Additional Readings
-- title: Unit Bonus 2. Automatic Hyperparameter Tuning with Optuna
+- title: Bonus Unit 2. Automatic Hyperparameter Tuning with Optuna
   sections:
   - local: unitbonus2/introduction
     title: Introduction
10 units/en/live1/live1.mdx Normal file
@@ -0,0 +1,10 @@
+# Live 1: Deep RL Course. Intro, Q&A, and playing with Huggy 🐶
+
+In this first live stream, we explained how the course works (scope, units, challenges, and more) and answered your questions.
+
+And finally, we saw some LunarLander agents you've trained and played with your Huggies 🐶
+
+<Youtube id="JeJIswxyrsM" />
+
+
+To know when the next live is scheduled, **check the Discord server**. We will also send **you an email**. If you can't participate, don't worry, we record the live sessions.
@@ -14,3 +14,9 @@ You will be able then to play with him 🤗.
 
+<video src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit1/huggy.mp4" alt="Huggy" type="video/mp4">
+</video>
+
+Finally, we would love **to hear what you think of the course and how we can improve it**. If you have some feedback, please 👉 [fill this form](https://forms.gle/BzKXWzLAGZESGNaE9)
+
+### Keep Learning, stay awesome 🤗
@@ -15,5 +15,7 @@ In the next chapter, we’re going to dive deeper by studying our first Deep Rei
 
 <img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit4/atari-envs.gif" alt="Atari environments"/>
 
+Finally, we would love **to hear what you think of the course and how we can improve it**. If you have some feedback, please 👉 [fill this form](https://forms.gle/BzKXWzLAGZESGNaE9)
+
 ### Keep Learning, stay awesome 🤗
@@ -62,7 +62,7 @@ For each state, the state-value function outputs the expected return if the agent
 
 In the action-value function, for each state and action pair, the action-value function **outputs the expected return** if the agent starts in that state, takes that action, and then follows the policy forever after.
 
-The value of taking action an in state \\(s\\) under a policy \\(π\\) is:
+The value of taking action \\(a\\) in state \\(s\\) under a policy \\(π\\) is:
 
-<img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit3/action-state-value-function-1.jpg" alt="Action State value function"/>
+<img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit3/action-state-value-function-2.jpg" alt="Action State value function"/>
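Written out, the definition those images express is the standard one, with \\(G_t\\) the return and \\(\gamma\\) the discount factor:

\\(Q_{\pi}(s, a) = \mathbb{E}_{\pi}\left[G_t \mid S_t = s, A_t = a\right] = \mathbb{E}_{\pi}\left[\sum_{k=0}^{\infty} \gamma^{k} R_{t+k+1} \mid S_t = s, A_t = a\right]\\)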
@@ -11,4 +11,7 @@ Don't hesitate to train your agent in other environments (Pong, Seaquest, QBert,
 
 In the next unit, **we're going to learn about Optuna**. One of the most critical tasks in Deep Reinforcement Learning is finding a good set of training hyperparameters, and Optuna is a library that helps you automate that search.
 
+Finally, we would love **to hear what you think of the course and how we can improve it**. If you have some feedback, please 👉 [fill this form](https://forms.gle/BzKXWzLAGZESGNaE9)
+
 ### Keep Learning, stay awesome 🤗
@@ -13,7 +13,7 @@ Internally, our Q-function has **a Q-table, a table where each cell corresponds
 
 The problem is that Q-Learning is a *tabular method*. It only works when the state and action spaces **are small enough for the value functions to be represented as arrays and tables**. In other words, it is **not scalable**.
 Q-Learning worked well with small state space environments like:
 
-- FrozenLake, we had 14 states.
+- FrozenLake, we had 16 states.
 - Taxi-v3, we had 500 states.
 
 But think of what we're going to do today: we will train an agent to learn to play Space Invaders, a more complex game, using the frames as input.
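To make the scale problem concrete, here is a sketch of the tabular setup for FrozenLake (16 states, 4 actions); the transition values below are made up for illustration:

```python
# Q-table for FrozenLake: 16 states x 4 actions (left, down, right, up).
n_states, n_actions = 16, 4
q_table = [[0.0] * n_actions for _ in range(n_states)]

# One tabular Q-Learning update on a hypothetical transition:
# Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
alpha, gamma = 0.1, 0.99
s, a, r, s_next = 0, 2, 1.0, 1
td_target = r + gamma * max(q_table[s_next])
q_table[s][a] += alpha * (td_target - q_table[s][a])

# This table has only 16 * 4 = 64 entries. A stack of four 84x84
# greyscale Atari frames has 256**(84*84*4) possible states, far too
# many to enumerate in any table.
print(q_table[0][2])  # 0.1
```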
@@ -6,5 +6,7 @@ You can now sit and enjoy playing with your Huggy 🐶. And don't **forget to sp
 
 <img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/notebooks/unit-bonus1/huggy-cover.jpeg" alt="Huggy cover" width="100%">
 
+Finally, we would love **to hear what you think of the course and how we can improve it**. If you have some feedback, please 👉 [fill this form](https://forms.gle/BzKXWzLAGZESGNaE9)
+
-### Keep Learning, stay awesome 🤗
+### Keep Learning, Stay Awesome 🤗
@@ -9,3 +9,8 @@ Now that you've learned to use Optuna, we give you some ideas to apply what you'
 
 By doing that, you're going to see how valuable and powerful Optuna is for training better agents.
 
+Have fun,
+
+Finally, we would love **to hear what you think of the course and how we can improve it**. If you have some feedback, please 👉 [fill this form](https://forms.gle/BzKXWzLAGZESGNaE9)
+
+### Keep Learning, stay awesome 🤗