mirror of
https://github.com/huggingface/deep-rl-class.git
synced 2026-05-16 13:55:52 +08:00
Update Unit Bonus
@@ -295,7 +295,7 @@
 "id": "wrgpVFqyENVf"
 },
 "source": [
-"## Step 2: Import the packages 📦\n",
+"## Import the packages 📦\n",
 "\n",
 "One additional library we import is huggingface_hub **to be able to upload and download trained models from the hub**.\n",
 "\n",
@@ -330,7 +330,7 @@
 "id": "MRqRuRUl8CsB"
 },
 "source": [
-"## Step 3: Understand what is Gym and how it works? 🤖\n",
+"## Understand what is Gym and how it works 🤖\n",
 "\n",
 "🏋 The library containing our environment is called Gym.\n",
 "**You'll use Gym a lot in Deep Reinforcement Learning.**\n",
@@ -649,8 +649,8 @@
 "id": "ClJJk88yoBUi"
 },
 "source": [
-"### Train the PPO agent 🏃\n",
-"- Let's train our agent for 500,000 timesteps, don't forget to use GPU on Colab. It will take approximately ~10min, but you can use less timesteps if you just want to try it out.\n",
+"## Train the PPO agent 🏃\n",
+"- Let's train our agent for 1,000,000 timesteps, don't forget to use GPU on Colab. It will take approximately ~20min, but you can use less timesteps if you just want to try it out.\n",
 "- During the training, take a ☕ break you deserved it 🤗"
 ]
 },
@@ -662,7 +662,7 @@
 },
 "outputs": [],
 "source": [
-"# TODO: Train it for 500,000 timesteps\n",
+"# TODO: Train it for 1,000,000 timesteps\n",
 "\n",
 "# TODO: Specify file name for model and save the model to file\n",
 "model_name = \"\"\n"
@@ -686,8 +686,8 @@
 "outputs": [],
 "source": [
 "# SOLUTION\n",
-"# Train it for 500,000 timesteps\n",
-"model.learn(total_timesteps=500000)\n",
+"# Train it for 1,000,000 timesteps\n",
+"model.learn(total_timesteps=1000000)\n",
 "# Save the model\n",
 "model_name = \"ppo-LunarLander-v2\"\n",
 "model.save(model_name)"
@@ -699,7 +699,7 @@
 "id": "BY_HuedOoISR"
 },
 "source": [
-"### Evaluate the agent 📈\n",
+"## Evaluate the agent 📈\n",
 "- Now that our Lunar Lander agent is trained 🚀, we need to **check its performance**.\n",
 "- Stable-Baselines3 provides a method to do that: `evaluate_policy`.\n",
 "- To fill that part you need to [check the documentation](https://stable-baselines3.readthedocs.io/en/master/guide/examples.html#basic-usage-training-saving-loading)\n",
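Note on the hunk above: the notebook text points to `evaluate_policy` for checking performance. As a rough illustration of what that kind of evaluation computes (run several episodes, sum the reward of each, report the mean and standard deviation), here is a minimal pure-Python sketch. `ToyEnv`, its simplified `reset`/`step` interface, and the `evaluate` helper are hypothetical stand-ins invented for this note; this is not Stable-Baselines3's implementation or the Gym API.

```python
import statistics


class ToyEnv:
    """Hypothetical stand-in environment (not Gym): each episode lasts
    5 steps and pays a reward of 1.0 whenever the action is 1."""

    def reset(self):
        self.t = 0
        return self.t  # the observation is just the step counter

    def step(self, action):
        self.t += 1
        reward = 1.0 if action == 1 else 0.0
        done = self.t >= 5
        return self.t, reward, done


def evaluate(policy, env, n_eval_episodes=10):
    """Run full episodes and return (mean, std) of the per-episode return."""
    episode_returns = []
    for _ in range(n_eval_episodes):
        obs = env.reset()
        done = False
        total = 0.0
        while not done:
            obs, reward, done = env.step(policy(obs))
            total += reward
        episode_returns.append(total)
    return statistics.mean(episode_returns), statistics.pstdev(episode_returns)


# A trivial policy that always picks action 1.
mean_reward, std_reward = evaluate(lambda obs: 1, ToyEnv())
print(f"mean_reward={mean_reward:.2f} +/- {std_reward:.2f}")
```

Reporting the spread alongside the mean matters because two agents with the same average return can differ a lot in how reliably they land.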
@@ -766,7 +766,7 @@
 "id": "IK_kR78NoNb2"
 },
 "source": [
-"### Publish our trained model on the Hub 🔥\n",
+"## Publish our trained model on the Hub 🔥\n",
 "Now that we saw we got good results after the training, we can publish our trained model on the hub 🤗 with one line of code.\n",
 "\n",
 "📚 The libraries documentation 👉 https://github.com/huggingface/huggingface_sb3/tree/main#hugging-face--x-stable-baselines3-v20\n",
@@ -1016,8 +1016,8 @@
 "outputs": [],
 "source": [
 "from huggingface_sb3 import load_from_hub\n",
-"repo_id = \"\" # The repo_id\n",
-"filename = \"\" # The model filename.zip\n",
+"repo_id = \"Classroom-workshop/assignment2-omar\" # The repo_id\n",
+"filename = \"ppo-LunarLander-v2.zip\" # The model filename.zip\n",
 "\n",
 "# When the model was trained on Python 3.8 the pickle protocol is 5\n",
 "# But Python 3.6, 3.7 use protocol 4\n",
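Note on the hunk above: the context lines mention pickle protocol 5 versus protocol 4. A small standard-library sketch of how one might inspect which protocol a pickle payload was written with is shown below. It assumes Python 3.8 or newer (needed to write protocol 5); the `pickle_protocol` helper and the `weights` dictionary are hypothetical and do not come from the notebook.

```python
import pickle


def pickle_protocol(payload: bytes) -> int:
    """Return the protocol number recorded at the start of a pickle payload.

    Pickles written with protocol 2 or later begin with the PROTO opcode
    (byte 0x80) followed by one byte holding the protocol number.
    """
    if len(payload) >= 2 and payload[0] == 0x80:
        return payload[1]
    raise ValueError("no PROTO header: payload uses protocol 0 or 1")


# Hypothetical stand-in for saved model data.
weights = {"layer": [0.1, 0.2, 0.3]}

p4 = pickle.dumps(weights, protocol=4)
p5 = pickle.dumps(weights, protocol=5)

print(pickle_protocol(p4), pickle_protocol(p5))

# Both payloads decode to the same object on an interpreter that supports both.
assert pickle.loads(p4) == pickle.loads(p5) == weights
```

Writing with an explicit `protocol=4` is one way to keep a file readable by older interpreters, at the cost of the out-of-band buffer support that protocol 5 adds.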
@@ -1120,9 +1120,6 @@
"collapsed_sections": [
"dFD9RAFjG8aq",
"QAN7B0_HCVZC",
"ClJJk88yoBUi",
"1bQzQ-QcE3zo",
"BY_HuedOoISR",
"BqPKw3jt_pG5",
"IK_kR78NoNb2",
"Avf6gufJBGMw"