mirror of https://github.com/huggingface/deep-rl-class.git, synced 2026-04-14 02:11:17 +08:00
Merge pull request #262 from huggingface/ThomasSimoniniMarchUpdate
March 2023 Update
@@ -289,6 +289,7 @@
},
"outputs": [],
"source": [
"pip install setuptools==65.5.0\n",
"!pip install -r requirements.txt\n",
"# Since colab uses Python 3.9 we need to add this installation\n",
"!pip install gym[atari,accept-rom-license]==0.21.0"
@@ -52,21 +52,21 @@ The course is composed of:

You can choose to follow this course either:

- *To get a certificate of completion*: you need to complete 80% of the assignments before the end of April 2023.
- *To get a certificate of honors*: you need to complete 100% of the assignments before the end of April 2023.
- *To get a certificate of completion*: you need to complete 80% of the assignments before the end of June 2023.
- *To get a certificate of honors*: you need to complete 100% of the assignments before the end of June 2023.
- *As a simple audit*: you can participate in all challenges and do assignments if you want, but you have no deadlines.

Both paths **are completely free**.
Whatever path you choose, we advise you **to follow the recommended pace to enjoy the course and challenges with your fellow classmates.**

You don't need to tell us which path you choose. At the end of April, when we will verify the assignments **if you get more than 80% of the assignments done, you'll get a certificate.**
You don't need to tell us which path you choose. **If you get more than 80% of the assignments done, you'll get a certificate.**

## The Certification Process [[certification-process]]

The certification process is **completely free**:

- *To get a certificate of completion*: you need to complete 80% of the assignments before the end of April 2023.
- *To get a certificate of honors*: you need to complete 100% of the assignments before the end of April 2023.
- *To get a certificate of completion*: you need to complete 80% of the assignments before the end of June 2023.
- *To get a certificate of honors*: you need to complete 100% of the assignments before the end of June 2023.

<img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit0/certification.jpg" alt="Course certification" width="100%"/>

@@ -127,7 +127,10 @@ cd /content/rl-baselines3-zoo/
```

```bash
pip install setuptools==65.5.0
pip install -r requirements.txt
# Since colab uses Python 3.9 we need to add this installation
pip install gym[atari,accept-rom-license]==0.21.0
```

## Train our Deep Q-Learning Agent to Play Space Invaders 👾

@@ -31,7 +31,7 @@ We do the same with self-play:
- We **start with a copy of our agent as an opponent** this way, this opponent is on a similar level.
- We **learn from it**, and when we acquire some skills, we **update our opponent with a more recent copy of our training policy**.

The theory behind self-play is not something new. It was already used by Arthur Samuel’s checker player system in the fifties and by Gerald Tesauro’s TD-Gammon in 1955. If you want to learn more about the history of self-play [check this very good blogpost by Andrew Cohen](https://blog.unity.com/technology/training-intelligent-adversaries-using-self-play-with-ml-agents)
The theory behind self-play is not something new. It was already used by Arthur Samuel’s checker player system in the fifties and by Gerald Tesauro’s TD-Gammon in 1995. If you want to learn more about the history of self-play [check this very good blogpost by Andrew Cohen](https://blog.unity.com/technology/training-intelligent-adversaries-using-self-play-with-ml-agents)

## Self-Play in MLAgents
