mirror of
https://github.com/huggingface/deep-rl-class.git
synced 2026-04-02 18:18:18 +08:00
Update (bug package to hub)
This commit is contained in:
@@ -5,7 +5,6 @@
"colab": {
"provenance": [],
"private_outputs": true,
"authorship_tag": "ABX9TyPDFLK3trc6MCLJLqUUuAbl",
"include_colab_link": true
},
"kernelspec": {
@@ -36,7 +35,7 @@
"\n",
"<img src=\"https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit8/thumbnail.png\" alt=\"Thumbnail\"/>\n",
"\n",
-"In this notebook, you'll learn to use A2C with PyBullet and Panda-Gym, two set of robotics environments. \n",
+"In this notebook, you'll learn to use A2C with PyBullet and Panda-Gym, two set of robotics environments.\n",
"\n",
"With [PyBullet](https://github.com/bulletphysics/bullet3), you're going to **train a robot to move**:\n",
"- `AntBulletEnv-v0` 🕸️ More precisely, a spider (they say Ant but come on... it's a spider 😆) 🕸️\n",
@@ -62,12 +61,12 @@
{
"cell_type": "markdown",
"source": [
-"### 🎮 Environments: \n",
+"### 🎮 Environments:\n",
"\n",
"- [PyBullet](https://github.com/bulletphysics/bullet3)\n",
"- [Panda-Gym](https://github.com/qgallouedec/panda-gym)\n",
"\n",
-"###📚 RL-Library: \n",
+"###📚 RL-Library:\n",
"\n",
"- [Stable-Baselines3](https://stable-baselines3.readthedocs.io/)"
],
@@ -112,7 +111,7 @@
"\n",
"- 📖 Study Deep Reinforcement Learning in **theory and practice**.\n",
"- 🧑💻 Learn to **use famous Deep RL libraries** such as Stable Baselines3, RL Baselines3 Zoo, CleanRL and Sample Factory 2.0.\n",
-"- 🤖 Train **agents in unique environments** \n",
+"- 🤖 Train **agents in unique environments**\n",
"\n",
"And more check 📚 the syllabus 👉 https://simoninithomas.github.io/deep-rl-course\n",
"\n",
@@ -192,7 +191,7 @@
"source": [
"## Create a virtual display 🔽\n",
"\n",
-"During the notebook, we'll need to generate a replay video. To do so, with colab, **we need to have a virtual screen to be able to render the environment** (and thus record the frames). \n",
+"During the notebook, we'll need to generate a replay video. To do so, with colab, **we need to have a virtual screen to be able to render the environment** (and thus record the frames).\n",
"\n",
"Hence the following cell will install the librairies and create and run a virtual screen 🖥"
],
@@ -266,7 +265,10 @@
},
"outputs": [],
"source": [
-"!pip install -r https://raw.githubusercontent.com/huggingface/deep-rl-class/main/notebooks/unit6/requirements-unit6.txt"
+"!pip install stable-baselines3[extra]==1.8.0\n",
+"!pip install huggingface_sb3\n",
+"!pip install panda_gym==2.0.0\n",
+"!pip install pyglet==1.5.1"
]
},
{
@@ -403,7 +405,7 @@
{
"cell_type": "markdown",
"source": [
-"A good practice in reinforcement learning is to [normalize input features](https://stable-baselines3.readthedocs.io/en/master/guide/rl_tips.html). \n",
+"A good practice in reinforcement learning is to [normalize input features](https://stable-baselines3.readthedocs.io/en/master/guide/rl_tips.html).\n",
"\n",
"For that purpose, there is a wrapper that will compute a running average and standard deviation of input features.\n",
"\n",
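The markdown cell in this hunk describes Stable-Baselines3's normalization wrapper (`VecNormalize`), which maintains a running mean and standard deviation of observations. As a rough illustration of the underlying idea only — not the SB3 API — a minimal running normalizer using Welford's online algorithm might look like this (`RunningNormalizer` is a hypothetical name for this sketch):

```python
import math


class RunningNormalizer:
    """Tracks a running mean/variance via Welford's algorithm and
    normalizes values, similar in spirit to SB3's VecNormalize."""

    def __init__(self, eps: float = 1e-8):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean
        self.eps = eps

    def update(self, x: float) -> None:
        # One Welford step: incorporate a new observation.
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    def normalize(self, x: float) -> float:
        # Scale by the running statistics; eps guards against division by zero.
        var = self.m2 / self.count if self.count > 1 else 1.0
        return (x - self.mean) / math.sqrt(var + self.eps)
```

The real wrapper does this per observation dimension (and optionally for rewards), and its state is exactly what the notebook later saves to `vec_normalize.pkl`.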
@@ -630,7 +632,7 @@
"\n",
"<img src=\"https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/notebooks/create-token.jpg\" alt=\"Create HF Token\">\n",
"\n",
-"- Copy the token \n",
+"- Copy the token\n",
"- Run the cell below and paste the token"
]
},
@@ -855,7 +857,7 @@
"cell_type": "code",
"source": [
"# 6\n",
-"model_name = \"a2c-PandaReachDense-v2\"; \n",
+"model_name = \"a2c-PandaReachDense-v2\";\n",
"model.save(model_name)\n",
"env.save(\"vec_normalize.pkl\")\n",
"\n",
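The cell in this hunk saves two artifacts that must travel together: the A2C policy (`model.save(model_name)`) and the `VecNormalize` statistics (`env.save("vec_normalize.pkl")`). If the statistics are not reloaded at evaluation time, observations are scaled differently than during training and the policy degrades. A stdlib-only sketch of that save/load pairing — `save_pair` and `load_pair` are hypothetical helpers standing in for the SB3 calls, not part of any library:

```python
import os
import pickle


def save_pair(model_params: dict, norm_stats: dict, dirname: str) -> None:
    """Save policy parameters and normalization statistics side by side,
    mirroring model.save(...) + env.save("vec_normalize.pkl")."""
    with open(os.path.join(dirname, "model.pkl"), "wb") as f:
        pickle.dump(model_params, f)
    with open(os.path.join(dirname, "vec_normalize.pkl"), "wb") as f:
        pickle.dump(norm_stats, f)


def load_pair(dirname: str):
    """Reload both artifacts; evaluation should use the SAME stats as training."""
    with open(os.path.join(dirname, "model.pkl"), "rb") as f:
        model_params = pickle.load(f)
    with open(os.path.join(dirname, "vec_normalize.pkl"), "rb") as f:
        norm_stats = pickle.load(f)
    return model_params, norm_stats
```

This is also why the hub upload in this unit packages `vec_normalize.pkl` alongside the model file rather than the model alone.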
@@ -927,4 +929,4 @@
}
}
]
}
}