Update Unit Bonus

2026-05-16 13:55:52 +08:00 · 2022-12-05 02:16:03 +01:00
parent 7a6faea612
commit 8e17dcae02
7 changed files with 584 additions and 24 deletions
--- a/notebooks/unit1/unit1.ipynb
+++ b/notebooks/unit1/unit1.ipynb
@@ -295,7 +295,7 @@
        "id": "wrgpVFqyENVf"
      },
      "source": [
-        "## Step 2: Import the packages 📦\n",
+        "## Import the packages 📦\n",
        "\n",
        "One additional library we import is huggingface_hub **to be able to upload and download trained models from the hub**.\n",
        "\n",
@@ -330,7 +330,7 @@
        "id": "MRqRuRUl8CsB"
      },
      "source": [
-        "## Step 3: Understand what is Gym and how it works? 🤖\n",
+        "## Understand what Gym is and how it works 🤖\n",
        "\n",
        "🏋 The library containing our environment is called Gym.\n",
        "**You'll use Gym a lot in Deep Reinforcement Learning.**\n",
@@ -649,8 +649,8 @@
        "id": "ClJJk88yoBUi"
      },
      "source": [
-        "### Train the PPO agent 🏃\n",
-        "- Let's train our agent for 500,000 timesteps, don't forget to use GPU on Colab. It will take approximately ~10min, but you can use less timesteps if you just want to try it out.\n",
+        "## Train the PPO agent 🏃\n",
+        "- Let's train our agent for 1,000,000 timesteps. Don't forget to use a GPU on Colab. Training takes approximately 20 minutes, but you can use fewer timesteps if you just want to try it out.\n",
        "- During the training, take a ☕ break, you deserve it 🤗"
      ]
    },
@@ -662,7 +662,7 @@
      },
      "outputs": [],
      "source": [
-        "# TODO: Train it for 500,000 timesteps\n",
+        "# TODO: Train it for 1,000,000 timesteps\n",
        "\n",
        "# TODO: Specify file name for model and save the model to file\n",
        "model_name = \"\"\n"
@@ -686,8 +686,8 @@
      "outputs": [],
      "source": [
        "# SOLUTION\n",
-        "# Train it for 500,000 timesteps\n",
-        "model.learn(total_timesteps=500000)\n",
+        "# Train it for 1,000,000 timesteps\n",
+        "model.learn(total_timesteps=1000000)\n",
        "# Save the model\n",
        "model_name = \"ppo-LunarLander-v2\"\n",
        "model.save(model_name)"
@@ -699,7 +699,7 @@
        "id": "BY_HuedOoISR"
      },
      "source": [
-        "### Evaluate the agent 📈\n",
+        "## Evaluate the agent 📈\n",
        "- Now that our Lunar Lander agent is trained 🚀, we need to **check its performance**.\n",
        "- Stable-Baselines3 provides a function to do that: `evaluate_policy`.\n",
        "- To fill in that part, you need to [check the documentation](https://stable-baselines3.readthedocs.io/en/master/guide/examples.html#basic-usage-training-saving-loading)\n",
@@ -766,7 +766,7 @@
        "id": "IK_kR78NoNb2"
      },
      "source": [
-        "### Publish our trained model on the Hub 🔥\n",
+        "## Publish our trained model on the Hub 🔥\n",
        "Now that we've seen good results after training, we can publish our trained model on the Hub 🤗 with one line of code.\n",
        "\n",
        "📚 The library's documentation 👉 https://github.com/huggingface/huggingface_sb3/tree/main#hugging-face--x-stable-baselines3-v20\n",
@@ -1016,8 +1016,8 @@
      "outputs": [],
      "source": [
        "from huggingface_sb3 import load_from_hub\n",
-        "repo_id = \"\" # The repo_id\n",
-        "filename = \"\" # The model filename.zip\n",
+        "repo_id = \"Classroom-workshop/assignment2-omar\" # The repo_id\n",
+        "filename = \"ppo-LunarLander-v2.zip\" # The model filename.zip\n",
        "\n",
        "# When the model was trained on Python 3.8 the pickle protocol is 5\n",
        "# But Python 3.6, 3.7 use protocol 4\n",
@@ -1120,9 +1120,6 @@
      "collapsed_sections": [
        "dFD9RAFjG8aq",
        "QAN7B0_HCVZC",
-        "ClJJk88yoBUi",
-        "1bQzQ-QcE3zo",
-        "BY_HuedOoISR",
        "BqPKw3jt_pG5",
        "IK_kR78NoNb2",
        "Avf6gufJBGMw"