Mirror of https://github.com/huggingface/deep-rl-class.git (synced 2026-04-13 16:29:42 +08:00)
Small updates
@@ -12,6 +12,6 @@ Thanks to our <a href="https://huggingface.co/spaces/huggingface-projects/Deep-R

 So let's get started! 🚀

-To start the hands-on click on Open In Colab button 👇 :
+**To start the hands-on click on Open In Colab button** 👇 :

 []()
@@ -1,6 +1,6 @@
 # Quiz [[quiz]]

-The best way to learn and [to avoid the illusion of competence](https://fr.coursera.org/lecture/learning-how-to-learn/illusions-of-competence-BuFzf) **is to test yourself.** This will help you to find **where you need to reinforce your knowledge**.
+The best way to learn and [to avoid the illusion of competence](https://www.coursera.org/lecture/learning-how-to-learn/illusions-of-competence-BuFzf) **is to test yourself.** This will help you to find **where you need to reinforce your knowledge**.

 ### Q1: What is Reinforcement Learning?

@@ -108,7 +108,7 @@ Taking this information into consideration is crucial because it will **have im

 The reward is fundamental in RL because it’s **the only feedback** for the agent. Thanks to it, our agent knows **if the action taken was good or not.**

-The cumulative reward at each time step t can be written as:
+The cumulative reward at each time step **t** can be written as:

 <figure>
 <img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit1/rewards_1.jpg" alt="Rewards">
@@ -1,6 +1,6 @@
 # First Quiz [[quiz1]]

-The best way to learn and [to avoid the illusion of competence](https://fr.coursera.org/lecture/learning-how-to-learn/illusions-of-competence-BuFzf) **is to test yourself.** This will help you to find **where you need to reinforce your knowledge**.
+The best way to learn and [to avoid the illusion of competence](https://www.coursera.org/lecture/learning-how-to-learn/illusions-of-competence-BuFzf) **is to test yourself.** This will help you to find **where you need to reinforce your knowledge**.


 ### Q1: What are the two main approaches to find optimal policy?
@@ -1,6 +1,6 @@
 # Second Quiz [[quiz2]]

-The best way to learn and [to avoid the illusion of competence](https://fr.coursera.org/lecture/learning-how-to-learn/illusions-of-competence-BuFzf) **is to test yourself.** This will help you to find **where you need to reinforce your knowledge**.
+The best way to learn and [to avoid the illusion of competence](https://www.coursera.org/lecture/learning-how-to-learn/illusions-of-competence-BuFzf) **is to test yourself.** This will help you to find **where you need to reinforce your knowledge**.


 ### Q1: What is Q-Learning?
@@ -1,6 +1,6 @@
 # Quiz [[quiz]]

-The best way to learn and [to avoid the illusion of competence](https://fr.coursera.org/lecture/learning-how-to-learn/illusions-of-competence-BuFzf) **is to test yourself.** This will help you to find **where you need to reinforce your knowledge**.
+The best way to learn and [to avoid the illusion of competence](https://www.coursera.org/lecture/learning-how-to-learn/illusions-of-competence-BuFzf) **is to test yourself.** This will help you to find **where you need to reinforce your knowledge**.

 ### Q1: What are tabular methods?
