Update Course

This commit is contained in:
Thomas Simonini
2023-03-28 15:45:22 +02:00
parent f14c53f06a
commit 4be6adb7f4
3 changed files with 5 additions and 1 deletion


@@ -289,6 +289,7 @@
},
"outputs": [],
"source": [
"!pip install setuptools==65.5.0\n",
"!pip install -r requirements.txt\n",
"# Since Colab uses Python 3.9, we need this additional installation\n",
"!pip install gym[atari,accept-rom-license]==0.21.0"


@@ -127,7 +127,10 @@ cd /content/rl-baselines3-zoo/
```
```bash
pip install setuptools==65.5.0
pip install -r requirements.txt
# Since Colab uses Python 3.9, we need this additional installation
pip install "gym[atari,accept-rom-license]==0.21.0"
```
## Train our Deep Q-Learning Agent to Play Space Invaders 👾


@@ -31,7 +31,7 @@ We do the same with self-play:
- We **start with a copy of our agent as an opponent**; this way, the opponent is at a similar level.
- We **learn from it**, and when we acquire some skills, we **update our opponent with a more recent copy of our training policy**.
The theory behind self-play is not new. It was already used by Arthur Samuel's checkers-playing system in the fifties and by Gerald Tesauro's TD-Gammon in 1955. If you want to learn more about the history of self-play, [check out this blog post by Andrew Cohen](https://blog.unity.com/technology/training-intelligent-adversaries-using-self-play-with-ml-agents)
The theory behind self-play is not new. It was already used by Arthur Samuel's checkers-playing system in the fifties and by Gerald Tesauro's TD-Gammon in 1995. If you want to learn more about the history of self-play, [check out this blog post by Andrew Cohen](https://blog.unity.com/technology/training-intelligent-adversaries-using-self-play-with-ml-agents)
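The loop described above (train against a frozen copy, then periodically refresh that copy) can be sketched in a few lines of plain Python. This is an illustrative toy, not the ML-Agents implementation: `Policy`, `train_step`, and `swap_every` are hypothetical names, and "skill" stands in for whatever the real training signal is.

```python
import copy
import random

class Policy:
    """Toy policy: a single skill number that training nudges upward."""
    def __init__(self, skill=0.0):
        self.skill = skill

    def train_step(self):
        # Stand-in for one update of the training policy.
        self.skill += random.uniform(0.0, 1.0)

def self_play(total_steps=100, swap_every=20):
    learner = Policy()
    opponent = copy.deepcopy(learner)      # start against a copy of ourselves
    for step in range(1, total_steps + 1):
        learner.train_step()               # learn by playing the frozen opponent
        if step % swap_every == 0:         # periodically refresh the opponent
            opponent = copy.deepcopy(learner)
    return learner, opponent

learner, opponent = self_play()
```

Because the opponent is only updated every `swap_every` steps, the learner always faces a slightly older version of itself, which keeps the match-up at a similar level throughout training.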
## Self-Play in ML-Agents