diff --git a/unit1/unit1.ipynb b/unit1/unit1.ipynb
index 09d1c54..6928b44 100644
--- a/unit1/unit1.ipynb
+++ b/unit1/unit1.ipynb
@@ -3,8 +3,8 @@
{
"cell_type": "markdown",
"metadata": {
- "id": "view-in-github",
- "colab_type": "text"
+ "colab_type": "text",
+ "id": "view-in-github"
},
"source": [
"
"
@@ -194,19 +194,24 @@
},
{
"cell_type": "markdown",
+ "metadata": {
+ "id": "LW_lsjc8h_up"
+ },
"source": [
"During the notebook, we'll need to generate a replay video. To do so, with colab, **we need to have a virtual screen to be able to render the environment** (and thus record the frames). \n",
"\n",
"Hence the following cell will install virtual screen libraries and create and run a virtual screen 🖥\n",
"\n",
"If you have this error `FileNotFoundError: [Errno 2] No such file or directory: 'Xvfb': 'Xvfb'` please restart the colab."
- ],
- "metadata": {
- "id": "LW_lsjc8h_up"
- }
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "dlx9uWHuhqfh"
+ },
+ "outputs": [],
"source": [
"!sudo apt-get update\n",
"!apt install python-opengl\n",
@@ -219,12 +224,7 @@
"\n",
"virtual_display = Display(visible=0, size=(1400, 900))\n",
"virtual_display.start()"
- ],
- "metadata": {
- "id": "dlx9uWHuhqfh"
- },
- "execution_count": null,
- "outputs": []
+ ]
},
{
"cell_type": "markdown",
@@ -295,6 +295,9 @@
},
{
"cell_type": "markdown",
+ "metadata": {
+ "id": "MRqRuRUl8CsB"
+ },
"source": [
"### Step 3: Understand what is Gym and how it works? 🤖\n",
"\n",
@@ -307,22 +310,22 @@
"\n",
"Let's look at an example, but first let's remember what's the RL Loop.\n",
"\n"
- ],
- "metadata": {
- "id": "MRqRuRUl8CsB"
- }
+ ]
},
{
"cell_type": "markdown",
- "source": [
- ""
- ],
"metadata": {
"id": "VvCOlJp-_kw4"
- }
+ },
+ "source": [
+ ""
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {
+ "id": "-TzNN0bQ_j-3"
+ },
"source": [
"At each step:\n",
"- Our Agent receives **state S0** from the **Environment** — we receive the first frame of our game (Environment).\n",
@@ -351,13 +354,15 @@
"- We reset the environment to its initial state with `observation = env.reset()`\n",
"\n",
"**Let's look at an example!** Make sure to read the code\n"
- ],
- "metadata": {
- "id": "-TzNN0bQ_j-3"
- }
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "w7vOFlpA_ONz"
+ },
+ "outputs": [],
"source": [
"import gym\n",
"\n",
@@ -381,12 +386,7 @@
" # Reset the environment\n",
" print(\"Environment is reset\")\n",
" observation = env.reset()"
- ],
- "metadata": {
- "id": "w7vOFlpA_ONz"
- },
- "execution_count": null,
- "outputs": []
+ ]
},
{
"cell_type": "markdown",
@@ -795,22 +795,22 @@
},
{
"cell_type": "markdown",
- "source": [
- ""
- ],
"metadata": {
"id": "RVJYd-YZRXRZ"
- }
+ },
+ "source": [
+ ""
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {
+ "id": "YWxcm2xiRSgA"
+ },
"source": [
"- Copy the token \n",
"- Run the cell below and paste the token"
- ],
- "metadata": {
- "id": "YWxcm2xiRSgA"
- }
+ ]
},
{
"cell_type": "code",
@@ -863,6 +863,11 @@
},
{
"cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "JPG7ofdGIHN8"
+ },
+ "outputs": [],
"source": [
"import gym\n",
"from stable_baselines3.common.vec_env import DummyVecEnv\n",
@@ -874,11 +879,12 @@
"## repo_id is the id of the model repository from the Hugging Face Hub (repo_id = {organization}/{repo_name} for instance ThomasSimonini/ppo-LunarLander-v2\n",
"repo_id = \n",
"\n",
+ "# TODO: Define the name of the environment\n",
+ "env_id = \n",
+ "\n",
"# Create the evaluation env\n",
"eval_env = DummyVecEnv([lambda: gym.make(env_id)])\n",
"\n",
- "# TODO: Define the name of the environment\n",
- "env_id = \n",
"\n",
"# TODO: Define the model architecture we used\n",
"model_architecture = \"\"\n",
@@ -898,12 +904,7 @@
"# Note: if after running the package_to_hub function and it gives an issue of rebasing, please run the following code\n",
"# cd && git add . && git commit -m \"Add message\" && git pull \n",
"# And don't forget to do a \"git push\" at the end to push the change to the hub."
- ],
- "metadata": {
- "id": "JPG7ofdGIHN8"
- },
- "execution_count": null,
- "outputs": []
+ ]
},
{
"cell_type": "markdown",
@@ -960,6 +961,9 @@
},
{
"cell_type": "markdown",
+ "metadata": {
+ "id": "pN0-W6kkKn44"
+ },
"source": [
"Congrats 🥳 you've just trained and uploaded your first Deep Reinforcement Learning agent. The script above should have displayed a link to a model repository such as https://huggingface.co/osanseviero/test_sb3. When you go to this link, you can:\n",
"* see a video preview of your agent at the right. \n",
@@ -970,13 +974,13 @@
"Under the hood, the Hub uses git-based repositories (don't worry if you don't know what git is), which means you can update the model with new versions as you experiment and improve your agent.\n",
"\n",
"Compare the results of your LunarLander-v2 with your classmates using the leaderboard 🏆 👉 https://huggingface.co/spaces/ThomasSimonini/Lunar-Lander-Leaderboard"
- ],
- "metadata": {
- "id": "pN0-W6kkKn44"
- }
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {
+ "id": "9nWnuQHRfFRa"
+ },
"source": [
"### Step 9: Load a saved LunarLander model from the Hub 🤗\n",
"Thanks to [ironbar](https://github.com/ironbar) for the contribution.\n",
@@ -985,33 +989,35 @@
"\n",
"You go https://huggingface.co/models?library=stable-baselines3 to see the list of all the Stable-baselines3 saved models.\n",
"1. You select one and copy its repo_id"
- ],
- "metadata": {
- "id": "9nWnuQHRfFRa"
- }
+ ]
},
{
"cell_type": "markdown",
- "source": [
- ""
- ],
"metadata": {
"id": "qDwKbhJ0ffuG"
- }
+ },
+ "source": [
+ ""
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {
+ "id": "hNPLJF2bfiUw"
+ },
"source": [
"2. Then we just need to use load_from_hub with:\n",
"- The repo_id\n",
"- The filename: the saved model inside the repo and its extension (*.zip)"
- ],
- "metadata": {
- "id": "hNPLJF2bfiUw"
- }
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "oj8PSGHJfwz3"
+ },
+ "outputs": [],
"source": [
"from huggingface_sb3 import load_from_hub\n",
"repo_id = \"\" # The repo_id\n",
@@ -1035,38 +1041,38 @@
"eval_env = gym.make(\"LunarLander-v2\")\n",
"mean_reward, std_reward = evaluate_policy(model, eval_env, n_eval_episodes=10, deterministic=True)\n",
"print(f\"mean_reward={mean_reward:.2f} +/- {std_reward}\")"
- ],
- "metadata": {
- "id": "oj8PSGHJfwz3"
- },
- "execution_count": null,
- "outputs": []
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {
+ "id": "Fs0Y-qgPgLUf"
+ },
"source": [
"Let's watch our agent performing 🎥 (Google Colab only) 👀\n",
"We're going to use [colabgymrender package by Ryan Rudes](https://github.com/ryanrudes) that records our agent performing in the environment and outputs a video."
- ],
- "metadata": {
- "id": "Fs0Y-qgPgLUf"
- }
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "stUaYyj8gKXE"
+ },
+ "outputs": [],
"source": [
"!pip install gym pyvirtualdisplay > /dev/null 2>&1\n",
"!apt-get install -y xvfb python-opengl ffmpeg > /dev/null 2>&1\n",
"!pip install colabgymrender==1.0.2"
- ],
- "metadata": {
- "id": "stUaYyj8gKXE"
- },
- "execution_count": null,
- "outputs": []
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "ickoCCH8gW1-"
+ },
+ "outputs": [],
"source": [
"from colabgymrender.recorder import Recorder\n",
"\n",
@@ -1079,12 +1085,7 @@
" action, _state = model.predict(obs)\n",
" obs, reward, done, info = env.step(action)\n",
"env.play()"
- ],
- "metadata": {
- "id": "ickoCCH8gW1-"
- },
- "execution_count": null,
- "outputs": []
+ ]
},
{
"cell_type": "markdown",
@@ -1156,20 +1157,20 @@
"accelerator": "GPU",
"colab": {
"collapsed_sections": [],
+ "include_colab_link": true,
"name": "Copie de Unit 1: Train your first Deep Reinforcement Learning Agent 🚀.ipynb",
- "provenance": [],
"private_outputs": true,
- "include_colab_link": true
+ "provenance": []
},
+ "gpuClass": "standard",
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
},
"language_info": {
"name": "python"
- },
- "gpuClass": "standard"
+ }
},
"nbformat": 4,
"nbformat_minor": 0
-}
\ No newline at end of file
+}