diff --git a/units/en/unit1/hands-on.mdx b/units/en/unit1/hands-on.mdx
index 078b3ec..2c65154 100644
--- a/units/en/unit1/hands-on.mdx
+++ b/units/en/unit1/hands-on.mdx
@@ -1,4 +1,5 @@
-# Hands on [[hands-on]]
+# Train your first Deep Reinforcement Learning Agent 🤖 [[hands-on]]
+
diff --git a/units/en/unit2/hands-on.mdx b/units/en/unit2/hands-on.mdx
index 58a2a57..08c63d7 100644
--- a/units/en/unit2/hands-on.mdx
+++ b/units/en/unit2/hands-on.mdx
@@ -1,10 +1,10 @@
 # Hands-on [[hands-on]]
-
+
@@ -21,6 +21,7 @@ Thanks to a [leaderboard](https://huggingface.co/spaces/huggingface-projects/Dee
 [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/deep-rl-class/blob/master/notebooks/unit2/unit2.ipynb)
+
 # Unit 2: Q-Learning with FrozenLake-v1 ⛄ and Taxi-v3 🚕
 Unit 2 Thumbnail
diff --git a/units/en/unit2/mid-way-quiz.mdx b/units/en/unit2/mid-way-quiz.mdx
index b1ffe3a..abb4b8b 100644
--- a/units/en/unit2/mid-way-quiz.mdx
+++ b/units/en/unit2/mid-way-quiz.mdx
@@ -37,7 +37,8 @@ The best way to learn and [to avoid the illusion of competence](https://www.cour
 **The Bellman equation is a recursive equation** that works like this: instead of starting for each state from the beginning and calculating the return, we can consider the value of any state as:
-\\(Rt+1 + (\gamma * V(St+1)))\\
+\\(R_{t+1} + \gamma * V(S_{t+1})\\)
+
 The immediate reward + the discounted value of the state that follows
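The last hunk rewrites the Bellman equation, \\(V(S_t) = R_{t+1} + \gamma * V(S_{t+1})\\). As a sanity check on that formula, here is a minimal sketch that applies the recursion backward over a toy 3-step episode; the reward values and discount factor are arbitrary assumptions chosen for illustration.

```python
gamma = 0.9          # discount factor (assumed value)
rewards = [1.0, 0.0, 2.0]  # toy rewards R_1, R_2, R_3 for a 3-step episode

# Work backward from the terminal state, whose value is 0:
# V(S_t) = R_{t+1} + gamma * V(S_{t+1})
V = [0.0] * (len(rewards) + 1)
for t in reversed(range(len(rewards))):
    V[t] = rewards[t] + gamma * V[t + 1]

# Each V[t] is the immediate reward plus the discounted value
# of the state that follows, exactly as the quiz text states.
print(V[:-1])
```

Computing the values by hand confirms the recursion: the last state's value is just its reward (2.0), and each earlier value folds in the discounted value of its successor.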