Update Unit Bonus

simoninithomas
2022-12-05 02:16:03 +01:00
parent 7a6faea612
commit 8e17dcae02
7 changed files with 584 additions and 24 deletions

@@ -295,7 +295,7 @@
 "id": "wrgpVFqyENVf"
 },
 "source": [
-"## Step 2: Import the packages 📦\n",
+"## Import the packages 📦\n",
 "\n",
 "One additional library we import is huggingface_hub **to be able to upload and download trained models from the hub**.\n",
 "\n",
@@ -330,7 +330,7 @@
 "id": "MRqRuRUl8CsB"
 },
 "source": [
-"## Step 3: Understand what is Gym and how it works? 🤖\n",
+"## Understand what is Gym and how it works 🤖\n",
 "\n",
 "🏋 The library containing our environment is called Gym.\n",
 "**You'll use Gym a lot in Deep Reinforcement Learning.**\n",
@@ -649,8 +649,8 @@
 "id": "ClJJk88yoBUi"
 },
 "source": [
-"### Train the PPO agent 🏃\n",
-"- Let's train our agent for 500,000 timesteps, don't forget to use GPU on Colab. It will take approximately ~10min, but you can use less timesteps if you just want to try it out.\n",
+"## Train the PPO agent 🏃\n",
+"- Let's train our agent for 1,000,000 timesteps, don't forget to use GPU on Colab. It will take approximately ~20min, but you can use less timesteps if you just want to try it out.\n",
 "- During the training, take a ☕ break you deserved it 🤗"
 ]
 },
@@ -662,7 +662,7 @@
 },
 "outputs": [],
 "source": [
-"# TODO: Train it for 500,000 timesteps\n",
+"# TODO: Train it for 1,000,000 timesteps\n",
 "\n",
 "# TODO: Specify file name for model and save the model to file\n",
 "model_name = \"\"\n"
@@ -686,8 +686,8 @@
 "outputs": [],
 "source": [
 "# SOLUTION\n",
-"# Train it for 500,000 timesteps\n",
-"model.learn(total_timesteps=500000)\n",
+"# Train it for 1,000,000 timesteps\n",
+"model.learn(total_timesteps=1000000)\n",
 "# Save the model\n",
 "model_name = \"ppo-LunarLander-v2\"\n",
 "model.save(model_name)"
@@ -699,7 +699,7 @@
 "id": "BY_HuedOoISR"
 },
 "source": [
-"### Evaluate the agent 📈\n",
+"## Evaluate the agent 📈\n",
 "- Now that our Lunar Lander agent is trained 🚀, we need to **check its performance**.\n",
 "- Stable-Baselines3 provides a method to do that: `evaluate_policy`.\n",
 "- To fill that part you need to [check the documentation](https://stable-baselines3.readthedocs.io/en/master/guide/examples.html#basic-usage-training-saving-loading)\n",
@@ -766,7 +766,7 @@
 "id": "IK_kR78NoNb2"
 },
 "source": [
-"### Publish our trained model on the Hub 🔥\n",
+"## Publish our trained model on the Hub 🔥\n",
 "Now that we saw we got good results after the training, we can publish our trained model on the hub 🤗 with one line of code.\n",
 "\n",
 "📚 The libraries documentation 👉 https://github.com/huggingface/huggingface_sb3/tree/main#hugging-face--x-stable-baselines3-v20\n",
@@ -1016,8 +1016,8 @@
 "outputs": [],
 "source": [
 "from huggingface_sb3 import load_from_hub\n",
-"repo_id = \"\" # The repo_id\n",
-"filename = \"\" # The model filename.zip\n",
+"repo_id = \"Classroom-workshop/assignment2-omar\" # The repo_id\n",
+"filename = \"ppo-LunarLander-v2.zip\" # The model filename.zip\n",
 "\n",
 "# When the model was trained on Python 3.8 the pickle protocol is 5\n",
 "# But Python 3.6, 3.7 use protocol 4\n",
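Review note on the pickle-protocol comment in the hunk above: Python 3.8 introduced pickle protocol 5, and older interpreters can read at most protocol 4. The standard-library sketch below shows one way to check which protocol a pickle stream was written with. The `payload` dict and the helper `pickle_protocol_of` are hypothetical, for illustration only; they are not part of the notebook or of `huggingface_sb3`.

```python
import pickle

# Hypothetical stand-in for pickled model data; any picklable object works.
payload = {"policy": "MlpPolicy", "env_id": "LunarLander-v2"}


def pickle_protocol_of(data: bytes) -> int:
    """Return the protocol number recorded at the start of a pickle stream."""
    # Streams written with protocol 2 or later begin with the PROTO opcode
    # (0x80) followed by a single byte holding the protocol number.
    if data[:1] != pickle.PROTO:
        raise ValueError("no PROTO opcode: stream uses protocol 0 or 1")
    return data[1]


# Write explicitly with protocol 4, which Python 3.6 and 3.7 can also read.
blob_v4 = pickle.dumps(payload, protocol=4)

print("written with protocol:", pickle_protocol_of(blob_v4))
print("highest protocol supported here:", pickle.HIGHEST_PROTOCOL)
print("readable here:", pickle_protocol_of(blob_v4) <= pickle.HIGHEST_PROTOCOL)
```

A stream whose recorded protocol exceeds `pickle.HIGHEST_PROTOCOL` cannot be loaded by that interpreter, which is presumably the mismatch the notebook comment is warning about.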
@@ -1120,9 +1120,6 @@
 "collapsed_sections": [
 "dFD9RAFjG8aq",
 "QAN7B0_HCVZC",
-"ClJJk88yoBUi",
-"1bQzQ-QcE3zo",
-"BY_HuedOoISR",
 "BqPKw3jt_pG5",
 "IK_kR78NoNb2",
 "Avf6gufJBGMw"