diff --git a/notebooks/unit6/unit6.ipynb b/notebooks/unit6/unit6.ipynb
index b797060..01b6802 100644
--- a/notebooks/unit6/unit6.ipynb
+++ b/notebooks/unit6/unit6.ipynb
@@ -5,7 +5,6 @@
     "colab": {
       "provenance": [],
       "private_outputs": true,
-      "authorship_tag": "ABX9TyPDFLK3trc6MCLJLqUUuAbl",
       "include_colab_link": true
     },
     "kernelspec": {
@@ -36,7 +35,7 @@
       "\n",
       "\"Thumbnail\"/\n",
       "\n",
-      "In this notebook, you'll learn to use A2C with PyBullet and Panda-Gym, two set of robotics environments. \n",
+      "In this notebook, you'll learn to use A2C with PyBullet and Panda-Gym, two sets of robotics environments.\n",
       "\n",
       "With [PyBullet](https://github.com/bulletphysics/bullet3), you're going to **train a robot to move**:\n",
       "- `AntBulletEnv-v0` 🕸️ More precisely, a spider (they say Ant but come on... it's a spider 😆) 🕸️\n",
@@ -62,12 +61,12 @@
   {
     "cell_type": "markdown",
     "source": [
-      "### 🎮 Environments: \n",
+      "### 🎮 Environments:\n",
       "\n",
       "- [PyBullet](https://github.com/bulletphysics/bullet3)\n",
       "- [Panda-Gym](https://github.com/qgallouedec/panda-gym)\n",
       "\n",
-      "###📚 RL-Library: \n",
+      "### 📚 RL-Library:\n",
       "\n",
       "- [Stable-Baselines3](https://stable-baselines3.readthedocs.io/)"
     ],
@@ -112,7 +111,7 @@
       "\n",
       "- 📖 Study Deep Reinforcement Learning in **theory and practice**.\n",
       "- 🧑‍💻 Learn to **use famous Deep RL libraries** such as Stable Baselines3, RL Baselines3 Zoo, CleanRL and Sample Factory 2.0.\n",
-      "- 🤖 Train **agents in unique environments** \n",
+      "- 🤖 Train **agents in unique environments**\n",
       "\n",
       "And more check 📚 the syllabus 👉 https://simoninithomas.github.io/deep-rl-course\n",
       "\n",
@@ -192,7 +191,7 @@
     "source": [
       "## Create a virtual display 🔽\n",
       "\n",
-      "During the notebook, we'll need to generate a replay video. To do so, with colab, **we need to have a virtual screen to be able to render the environment** (and thus record the frames). \n",
+      "During the notebook, we'll need to generate a replay video. To do so, with colab, **we need to have a virtual screen to be able to render the environment** (and thus record the frames).\n",
       "\n",
       "Hence the following cell will install the librairies and create and run a virtual screen 🖥"
     ],
@@ -266,7 +265,10 @@
     },
     "outputs": [],
     "source": [
-      "!pip install -r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit6/requirements-unit6.txt"
+      "!pip install stable-baselines3[extra]==1.8.0\n",
+      "!pip install huggingface_sb3\n",
+      "!pip install panda_gym==2.0.0\n",
+      "!pip install pyglet==1.5.1"
     ]
   },
   {
@@ -403,7 +405,7 @@
   {
     "cell_type": "markdown",
     "source": [
-      "A good practice in reinforcement learning is to [normalize input features](https://stable-baselines3.readthedocs.io/en/master/guide/rl_tips.html). \n",
+      "A good practice in reinforcement learning is to [normalize input features](https://stable-baselines3.readthedocs.io/en/master/guide/rl_tips.html).\n",
       "\n",
       "For that purpose, there is a wrapper that will compute a running average and standard deviation of input features.\n",
       "\n",
@@ -630,7 +632,7 @@
       "\n",
       "\"Create\n",
       "\n",
-      "- Copy the token \n",
+      "- Copy the token\n",
       "- Run the cell below and paste the token"
     ]
   },
@@ -855,7 +857,7 @@
     "cell_type": "code",
     "source": [
       "# 6\n",
-      "model_name = \"a2c-PandaReachDense-v2\"; \n",
+      "model_name = \"a2c-PandaReachDense-v2\"\n",
       "model.save(model_name)\n",
       "env.save(\"vec_normalize.pkl\")\n",
       "\n",
@@ -927,4 +929,4 @@
       }
     }
   ]
-}
+}
\ No newline at end of file
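
Note: the notebook text touched by this diff describes "a wrapper that will compute a running average and standard deviation of input features" (SB3's `VecNormalize`). The sketch below illustrates that idea only — a minimal stdlib-only running normalizer using Welford's algorithm; it is not Stable-Baselines3's actual implementation, and the class name `RunningNormalizer` is hypothetical.

```python
# Minimal sketch of running observation normalization, the idea behind
# SB3's VecNormalize wrapper. Not SB3's implementation; name is hypothetical.
class RunningNormalizer:
    def __init__(self, eps=1e-8):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0   # running sum of squared deviations (Welford's algorithm)
        self.eps = eps  # avoids division by zero for near-constant features

    def update(self, x):
        # Incrementally update mean and squared deviations with one observation.
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    def normalize(self, x):
        # Scale an observation by the running mean and standard deviation.
        var = self.m2 / self.count if self.count > 1 else 1.0
        return (x - self.mean) / ((var + self.eps) ** 0.5)


if __name__ == "__main__":
    norm = RunningNormalizer()
    for obs in [2.0, 4.0, 6.0, 8.0]:
        norm.update(obs)
    print(norm.mean)            # 5.0
    print(norm.normalize(5.0))  # ~0.0 (the running mean maps to zero)
```

In the notebook itself, the same effect comes from wrapping the vectorized environment (`VecNormalize(env, norm_obs=True, norm_reward=False, ...)`) and saving its statistics with `env.save("vec_normalize.pkl")`, as the diff's `# 6` cell shows — the saved statistics must be reloaded at evaluation time so observations are scaled the same way as during training.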