mirror of
https://github.com/huggingface/deep-rl-class.git
synced 2026-02-02 18:09:24 +08:00
Merge pull request #543 from mohamedsaeed8223/main
Edit monitor environment object in unit1 handson
@@ -31,6 +31,9 @@
 },
 {
 "cell_type": "markdown",
+"metadata": {
+"id": "x7oR6R-ZIbeS"
+},
 "source": [
 "### The environment 🎮\n",
 "\n",
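Aside from the one-line Monitor fix named in the PR title, most hunks in this diff only reorder cell keys so that "metadata" precedes "source" (and, in code cells, "execution_count" and "outputs" precede "source"). That ordering is what you get from sorting keys alphabetically, which is consistent with the file having been re-serialized by nbformat-style tooling; a minimal sketch (the cell content is taken from the hunk above, the `sort_keys` explanation is an assumption):

```python
import json

# A markdown cell as the diff leaves it: keys in sorted order
# (cell_type, metadata, source).
cell = {
    "cell_type": "markdown",
    "metadata": {"id": "x7oR6R-ZIbeS"},
    "source": ["### The environment 🎮\n", "\n"],
}

# Dumping with sort_keys=True reproduces this ordering regardless of how
# the dict was originally built, then a round-trip preserves it.
canonical = json.loads(json.dumps(cell, sort_keys=True))
print(list(canonical.keys()))
```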
@@ -39,19 +42,16 @@
 "### The library used 📚\n",
 "\n",
 "- [Stable-Baselines3](https://stable-baselines3.readthedocs.io/en/master/)"
-],
-"metadata": {
-"id": "x7oR6R-ZIbeS"
-}
+]
 },
 {
 "cell_type": "markdown",
-"source": [
-"We're constantly trying to improve our tutorials, so **if you find some issues in this notebook**, please [open an issue on the Github Repo](https://github.com/huggingface/deep-rl-class/issues)."
-],
 "metadata": {
 "id": "OwEcFHe9RRZW"
-}
+},
+"source": [
+"We're constantly trying to improve our tutorials, so **if you find some issues in this notebook**, please [open an issue on the Github Repo](https://github.com/huggingface/deep-rl-class/issues)."
+]
 },
 {
 "cell_type": "markdown",
@@ -72,14 +72,14 @@
 },
 {
 "cell_type": "markdown",
+"metadata": {
+"id": "Ff-nyJdzJPND"
+},
 "source": [
 "## This notebook is from Deep Reinforcement Learning Course\n",
 "\n",
 "<img src=\"https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/notebooks/deep-rl-course-illustration.jpg\" alt=\"Deep RL Course illustration\"/>"
-],
-"metadata": {
-"id": "Ff-nyJdzJPND"
-}
+]
 },
 {
 "cell_type": "markdown",
@@ -120,14 +120,14 @@
 },
 {
 "cell_type": "markdown",
+"metadata": {
+"id": "HoeqMnr5LuYE"
+},
 "source": [
 "## A small recap of Deep Reinforcement Learning 📚\n",
 "\n",
 "<img src=\"https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit1/RL_process_game.jpg\" alt=\"The RL process\" width=\"100%\">"
-],
-"metadata": {
-"id": "HoeqMnr5LuYE"
-}
+]
 },
 {
 "cell_type": "markdown",
@@ -157,6 +157,9 @@
 },
 {
 "cell_type": "markdown",
+"metadata": {
+"id": "qDploC3jSH99"
+},
 "source": [
 "# Let's train our first Deep Reinforcement Learning agent and upload it to the Hub 🚀\n",
 "\n",
@@ -167,23 +170,20 @@
 "To find your result, go to the [leaderboard](https://huggingface.co/spaces/huggingface-projects/Deep-Reinforcement-Learning-Leaderboard) and find your model, **the result = mean_reward - std of reward**\n",
 "\n",
 "For more information about the certification process, check this section 👉 https://huggingface.co/deep-rl-course/en/unit0/introduction#certification-process"
-],
-"metadata": {
-"id": "qDploC3jSH99"
-}
+]
 },
 {
 "cell_type": "markdown",
+"metadata": {
+"id": "HqzznTzhNfAC"
+},
 "source": [
 "## Set the GPU 💪\n",
 "\n",
 "- To **accelerate the agent's training, we'll use a GPU**. To do that, go to `Runtime > Change Runtime type`\n",
 "\n",
 "<img src=\"https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/notebooks/gpu-step1.jpg\" alt=\"GPU Step 1\">"
-],
-"metadata": {
-"id": "HqzznTzhNfAC"
-}
+]
 },
 {
 "cell_type": "markdown",
@@ -215,14 +215,14 @@
 },
 {
 "cell_type": "code",
-"source": [
-"!apt install swig cmake"
-],
+"execution_count": null,
 "metadata": {
 "id": "yQIGLPDkGhgG"
 },
-"execution_count": null,
-"outputs": []
+"outputs": [],
+"source": [
+"!apt install swig cmake"
+]
 },
 {
 "cell_type": "code",
@@ -237,65 +237,65 @@
 },
 {
 "cell_type": "markdown",
+"metadata": {
+"id": "BEKeXQJsQCYm"
+},
 "source": [
 "During the notebook, we'll need to generate a replay video. To do so, with colab, **we need to have a virtual screen to be able to render the environment** (and thus record the frames).\n",
 "\n",
 "Hence the following cell will install virtual screen libraries and create and run a virtual screen 🖥"
-],
-"metadata": {
-"id": "BEKeXQJsQCYm"
-}
+]
 },
 {
 "cell_type": "code",
+"execution_count": null,
+"metadata": {
+"id": "j5f2cGkdP-mb"
+},
+"outputs": [],
 "source": [
 "!sudo apt-get update\n",
 "!sudo apt-get install -y python3-opengl\n",
 "!apt install ffmpeg\n",
 "!apt install xvfb\n",
 "!pip3 install pyvirtualdisplay"
-],
-"metadata": {
-"id": "j5f2cGkdP-mb"
-},
-"execution_count": null,
-"outputs": []
+]
 },
 {
 "cell_type": "markdown",
-"source": [
-"To make sure the new installed libraries are used, **sometimes it's required to restart the notebook runtime**. The next cell will force the **runtime to crash, so you'll need to connect again and run the code starting from here**. Thanks to this trick, **we will be able to run our virtual screen.**"
-],
 "metadata": {
 "id": "TCwBTAwAW9JJ"
-}
+},
+"source": [
+"To make sure the new installed libraries are used, **sometimes it's required to restart the notebook runtime**. The next cell will force the **runtime to crash, so you'll need to connect again and run the code starting from here**. Thanks to this trick, **we will be able to run our virtual screen.**"
+]
 },
 {
 "cell_type": "code",
-"source": [
-"import os\n",
-"os.kill(os.getpid(), 9)"
-],
+"execution_count": null,
 "metadata": {
 "id": "cYvkbef7XEMi"
 },
-"execution_count": null,
-"outputs": []
+"outputs": [],
+"source": [
+"import os\n",
+"os.kill(os.getpid(), 9)"
+]
 },
 {
 "cell_type": "code",
+"execution_count": null,
+"metadata": {
+"id": "BE5JWP5rQIKf"
+},
+"outputs": [],
 "source": [
 "# Virtual display\n",
 "from pyvirtualdisplay import Display\n",
 "\n",
 "virtual_display = Display(visible=0, size=(1400, 900))\n",
 "virtual_display.start()"
-],
-"metadata": {
-"id": "BE5JWP5rQIKf"
-},
-"execution_count": null,
-"outputs": []
+]
 },
 {
 "cell_type": "markdown",
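The restart trick in the hunk above works by sending the kernel's own process an un-catchable SIGKILL (signal 9), so Colab's supervisor restarts the runtime with the freshly installed libraries visible. A sketch of the same mechanism aimed at a child process instead of our own, so it is safe to run anywhere:

```python
import subprocess
import sys

# Spawn a throwaway child process that would otherwise sleep for a minute.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])

# Same idea as os.kill(os.getpid(), 9) in the notebook cell, but targeting
# the child: the process is terminated immediately and cannot clean up.
child.kill()
child.wait()

# On POSIX a signal-killed process reports a negative return code (-9 here).
print(child.returncode)
```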
@@ -581,12 +581,12 @@
 },
 {
 "cell_type": "markdown",
-"source": [
-"<img src=\"https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit1/sb3.png\" alt=\"Stable Baselines3\">"
-],
 "metadata": {
 "id": "HLlClRW37Q7e"
-}
+},
+"source": [
+"<img src=\"https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit1/sb3.png\" alt=\"Stable Baselines3\">"
+]
 },
 {
 "cell_type": "markdown",
@@ -776,7 +776,7 @@
 "outputs": [],
 "source": [
 "#@title\n",
-"eval_env = Monitor(gym.make(\"LunarLander-v2\"))\n",
+"eval_env = Monitor(gym.make(\"LunarLander-v2\", render_mode='rgb_array'))\n",
 "mean_reward, std_reward = evaluate_policy(model, eval_env, n_eval_episodes=10, deterministic=True)\n",
 "print(f\"mean_reward={mean_reward:.2f} +/- {std_reward}\")"
 ]
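The cell changed above evaluates the policy over 10 episodes, and the course leaderboard scores each model as mean_reward - std_reward. A stdlib sketch of that arithmetic (the per-episode returns are made-up numbers; the use of the population standard deviation mirrors numpy's default `np.std`, which is an assumption about evaluate_policy's internals):

```python
import statistics

# Hypothetical returns from 10 evaluation episodes (illustrative values only).
episode_rewards = [250.0, 261.5, 243.2, 255.8, 248.9,
                   262.1, 239.7, 257.3, 251.0, 246.5]

mean_reward = statistics.mean(episode_rewards)
# pstdev = population standard deviation (ddof=0), matching np.std's default.
std_reward = statistics.pstdev(episode_rewards)

# The leaderboard ranks models by mean_reward - std_reward.
result = mean_reward - std_reward
print(f"mean_reward={mean_reward:.2f} +/- {std_reward:.2f}, result={result:.2f}")
```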
@@ -939,6 +939,11 @@
 },
 {
 "cell_type": "code",
+"execution_count": null,
+"metadata": {
+"id": "I2E--IJu8JYq"
+},
+"outputs": [],
 "source": [
 "import gymnasium as gym\n",
 "\n",
@@ -974,15 +979,13 @@
 " eval_env=eval_env, # Evaluation Environment\n",
 " repo_id=repo_id, # id of the model repository from the Hugging Face Hub (repo_id = {organization}/{repo_name} for instance ThomasSimonini/ppo-LunarLander-v2\n",
 " commit_message=commit_message)\n"
-],
-"metadata": {
-"id": "I2E--IJu8JYq"
-},
-"execution_count": null,
-"outputs": []
+]
 },
 {
 "cell_type": "markdown",
+"metadata": {
+"id": "T79AEAWEFIxz"
+},
 "source": [
 "Congrats 🥳 you've just trained and uploaded your first Deep Reinforcement Learning agent. The script above should have displayed a link to a model repository such as https://huggingface.co/osanseviero/test_sb3. When you go to this link, you can:\n",
 "* See a video preview of your agent at the right.\n",
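The push cell in the hunk above expects repo_id in the form `{organization}/{repo_name}`, e.g. ThomasSimonini/ppo-LunarLander-v2. A tiny sketch of building and sanity-checking one (the username below is a placeholder, not a real account):

```python
# Placeholder Hub namespace; replace with your own username or organization.
username = "your-hf-username"
repo_name = "ppo-LunarLander-v2"

# repo_id must be exactly "namespace/name" for the Hub.
repo_id = f"{username}/{repo_name}"
assert repo_id.count("/") == 1, "repo_id must look like organization/repo_name"

print(repo_id)
```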
@@ -993,10 +996,7 @@
 "Under the hood, the Hub uses git-based repositories (don't worry if you don't know what git is), which means you can update the model with new versions as you experiment and improve your agent.\n",
 "\n",
 "Compare the results of your LunarLander-v2 with your classmates using the leaderboard 🏆 👉 https://huggingface.co/spaces/huggingface-projects/Deep-Reinforcement-Learning-Leaderboard"
-],
-"metadata": {
-"id": "T79AEAWEFIxz"
-}
+]
 },
 {
 "cell_type": "markdown",
@@ -1028,25 +1028,25 @@
 },
 {
 "cell_type": "markdown",
+"metadata": {
+"id": "bhb9-NtsinKB"
+},
 "source": [
 "Because the model I download from the Hub was trained with Gym (the former version of Gymnasium) we need to install shimmy a API conversion tool that will help us to run the environment correctly.\n",
 "\n",
 "Shimmy Documentation: https://github.com/Farama-Foundation/Shimmy"
-],
-"metadata": {
-"id": "bhb9-NtsinKB"
-}
+]
 },
 {
 "cell_type": "code",
-"source": [
-"!pip install shimmy"
-],
+"execution_count": null,
 "metadata": {
 "id": "03WI-bkci1kH"
 },
-"execution_count": null,
-"outputs": []
+"outputs": [],
+"source": [
+"!pip install shimmy"
+]
 },
 {
 "cell_type": "code",
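The markdown cell in the hunk above explains why shimmy is needed: legacy Gym envs return a 4-tuple from step (obs, reward, done, info), while Gymnasium returns a 5-tuple (obs, reward, terminated, truncated, info). A toy illustration of the kind of conversion involved; this is NOT shimmy's actual implementation, and `old_gym_step` is a made-up stand-in for a legacy env:

```python
def old_gym_step(action):
    # Stand-in for a legacy Gym env's step(): 4-tuple with a single `done` flag.
    return [0.0, 0.0], 1.0, True, {"TimeLimit.truncated": False}

def convert_step(step_result):
    """Map the old 4-tuple step API onto Gymnasium's 5-tuple API."""
    obs, reward, done, info = step_result
    # Legacy envs signal time-limit truncation through the info dict.
    truncated = info.get("TimeLimit.truncated", False)
    terminated = done and not truncated
    return obs, reward, terminated, truncated, info

obs, reward, terminated, truncated, info = convert_step(old_gym_step(0))
print(terminated, truncated)  # → True False
```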
@@ -1086,17 +1086,17 @@
 },
 {
 "cell_type": "code",
+"execution_count": null,
+"metadata": {
+"id": "PAEVwK-aahfx"
+},
+"outputs": [],
 "source": [
 "#@title\n",
 "eval_env = Monitor(gym.make(\"LunarLander-v2\"))\n",
 "mean_reward, std_reward = evaluate_policy(model, eval_env, n_eval_episodes=10, deterministic=True)\n",
 "print(f\"mean_reward={mean_reward:.2f} +/- {std_reward}\")"
-],
-"metadata": {
-"id": "PAEVwK-aahfx"
-},
-"execution_count": null,
-"outputs": []
+]
 },
 {
 "cell_type": "markdown",
@@ -1154,12 +1154,12 @@
 "metadata": {
 "accelerator": "GPU",
 "colab": {
-"private_outputs": true,
-"provenance": [],
 "collapsed_sections": [
 "QAN7B0_HCVZC",
 "BqPKw3jt_pG5"
-]
+],
+"private_outputs": true,
+"provenance": []
 },
 "gpuClass": "standard",
 "kernelspec": {