mirror of
https://github.com/huggingface/deep-rl-class.git
synced 2026-04-27 04:11:37 +08:00
572 lines
19 KiB
Plaintext
{
|
|
"cells": [
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "59a253e3",
|
|
"metadata": {},
|
|
"source": [
|
|
"## This notebook will guide you through logging your experiments"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 7,
|
|
"id": "2e020e33",
|
|
"metadata": {
|
|
"ExecuteTime": {
|
|
"end_time": "2022-05-08T01:11:29.835785Z",
|
|
"start_time": "2022-05-08T01:11:29.831878Z"
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"# Import libraries \n",
|
|
"\n",
|
|
"# The Training Libs\n",
|
|
"import gym\n",
|
|
"from stable_baselines3 import PPO\n",
|
|
"from stable_baselines3.common.evaluation import evaluate_policy\n",
|
|
"from stable_baselines3.common.env_util import make_vec_env\n",
|
|
"\n",
|
|
"from stable_baselines3.common.vec_env import VecVideoRecorder , DummyVecEnv\n",
|
|
"\n",
|
|
"# Import wandb stuff\n",
|
|
"import wandb\n",
|
|
"from wandb.integration.sb3 import WandbCallback"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "fa321aa8",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Weights and biases (wandb)\n",
|
|
"\n",
|
|
"[Weights and Biases](https://wandb.ai/) is a free and open source logging platform that allows you to track multiple results when you are training your model. It collects various metrics, and organizes them in an simple web dashboard. \n",
|
|
"\n",
|
|
"In this notebook, we will see how simple it is to start tracking your experiments with weights and biases"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "285ecb67",
|
|
"metadata": {},
|
|
"source": [
|
|
"It can log all the training metrics that is output during the training phase, along with a video of our agent at that point in our training. Additionally, it also logs system status like CPU/GPU usage and temperature.\n",
|
|
"\n",
|
|
"They have good documentation, which can be found here - https://docs.wandb.ai/"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "fbefa8c4",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Setup for wandb\n",
|
|
"\n",
|
|
"1. Create an account on wandb - https://wandb.ai/\n",
|
|
"\n",
|
|
"2. Install wandb: \n",
|
|
"\n",
|
|
" Wandb is distributed as a python package. To install it, run `pip install wandb`.\n",
|
|
" \n",
|
|
"3. Login into wandb:\n",
|
|
"\n",
|
|
" To login into wandb, run `wandb login`. Paste the api key. The key can be found here - https://wandb.ai/authorize"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 2,
|
|
"id": "ae0fd212",
|
|
"metadata": {
|
|
"ExecuteTime": {
|
|
"end_time": "2022-05-08T00:55:11.285850Z",
|
|
"start_time": "2022-05-08T00:55:05.640088Z"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"name": "stderr",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"\u001b[34m\u001b[1mwandb\u001b[0m: Currently logged in as: \u001b[33msupersecurehuman\u001b[0m. Use \u001b[1m`wandb login --relogin`\u001b[0m to force relogin\n"
|
|
]
|
|
},
|
|
{
|
|
"data": {
|
|
"text/html": [
|
|
"Tracking run with wandb version 0.12.16"
|
|
],
|
|
"text/plain": [
|
|
"<IPython.core.display.HTML object>"
|
|
]
|
|
},
|
|
"metadata": {},
|
|
"output_type": "display_data"
|
|
},
|
|
{
|
|
"data": {
|
|
"text/html": [
|
|
"Run data is saved locally in <code>/home/venom/Desktop/deep-rl-class/unit1/unit1_bonus/wandb/run-20220508_062506-2ovixu73</code>"
|
|
],
|
|
"text/plain": [
|
|
"<IPython.core.display.HTML object>"
|
|
]
|
|
},
|
|
"metadata": {},
|
|
"output_type": "display_data"
|
|
},
|
|
{
|
|
"data": {
|
|
"text/html": [
|
|
"Syncing run <strong><a href=\"https://wandb.ai/supersecurehuman/LunarLander-v2/runs/2ovixu73\" target=\"_blank\">warm-gorge-3</a></strong> to <a href=\"https://wandb.ai/supersecurehuman/LunarLander-v2\" target=\"_blank\">Weights & Biases</a> (<a href=\"https://wandb.me/run\" target=\"_blank\">docs</a>)<br/>"
|
|
],
|
|
"text/plain": [
|
|
"<IPython.core.display.HTML object>"
|
|
]
|
|
},
|
|
"metadata": {},
|
|
"output_type": "display_data"
|
|
}
|
|
],
|
|
"source": [
|
|
"# Initialize wandb\n",
|
|
"# https://docs.wandb.ai/guides/integrations/other/stable-baselines-3\n",
|
|
"\n",
|
|
"config = {\n",
|
|
" \"policy_type\": \"MlpPolicy\",\n",
|
|
" \"total_timesteps\": 100000,\n",
|
|
" \"env_name\": \"LunarLander-v2\",\n",
|
|
" \"learning_rate\" : 0.0002,\n",
|
|
"}\n",
|
|
"\n",
|
|
"run = wandb.init(\n",
|
|
" project=\"LunarLander-v2\", # It creates a project on wandb if it doesnt exist. The logging happens there\n",
|
|
" config=config,\n",
|
|
" sync_tensorboard=True, # auto-upload tensorboard metrics\n",
|
|
" monitor_gym=True, # auto-upload the videos of agents playing the game\n",
|
|
")\n",
|
|
"\n",
|
|
"# After running this cell, go to https://wandb.ai/home to see your new project created"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "5ee51e03",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Now, we will create a environment and a model to log the training\n",
|
|
"\n",
|
|
"This part is covered in the main notebook\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 3,
|
|
"id": "0b8046ae",
|
|
"metadata": {
|
|
"ExecuteTime": {
|
|
"end_time": "2022-05-08T00:55:59.965683Z",
|
|
"start_time": "2022-05-08T00:55:55.396867Z"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"Using cuda device\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"env = make_vec_env('LunarLander-v2', n_envs=16)\n",
|
|
"# Use the following line with caution. The video recorder will try to render the agent on the screen, so that ffmpeg can capture it. Here, we have 16 envs set. Trying to render 16 envs on screen will\n",
|
|
"# be pretty resource intensive. \n",
|
|
"# env = VecVideoRecorder(env, f\"videos/{run.id}\", record_video_trigger=lambda x: x % 2000 == 0, video_length=200) # Set the video recorder, to record our agent during training\n",
|
|
"\n",
|
|
"# I would suggest you to add all your hyperparameters in the config dictionary defined before the wandb init step. This would help you to visualize the effect those hyper parameters\n",
|
|
"# have on your model, via the wandb dashboard\n",
|
|
"model = PPO(\n",
|
|
" policy = config[\"policy_type\"],\n",
|
|
" env = env,\n",
|
|
" learning_rate=config[\"learning_rate\"],\n",
|
|
" tensorboard_log=\"logs\",\n",
|
|
" verbose=1)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 4,
|
|
"id": "f772cc3f",
|
|
"metadata": {
|
|
"ExecuteTime": {
|
|
"end_time": "2022-05-08T00:59:14.354142Z",
|
|
"start_time": "2022-05-08T00:56:32.259913Z"
|
|
},
|
|
"scrolled": true
|
|
},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"Logging to logs/PPO_1\n",
|
|
"---------------------------------\n",
|
|
"| rollout/ | |\n",
|
|
"| ep_len_mean | 91.1 |\n",
|
|
"| ep_rew_mean | -170 |\n",
|
|
"| time/ | |\n",
|
|
"| fps | 5432 |\n",
|
|
"| iterations | 1 |\n",
|
|
"| time_elapsed | 6 |\n",
|
|
"| total_timesteps | 32768 |\n",
|
|
"---------------------------------\n",
|
|
"-----------------------------------------\n",
|
|
"| rollout/ | |\n",
|
|
"| ep_len_mean | 98.9 |\n",
|
|
"| ep_rew_mean | -132 |\n",
|
|
"| time/ | |\n",
|
|
"| fps | 1480 |\n",
|
|
"| iterations | 2 |\n",
|
|
"| time_elapsed | 44 |\n",
|
|
"| total_timesteps | 65536 |\n",
|
|
"| train/ | |\n",
|
|
"| approx_kl | 0.009069282 |\n",
|
|
"| clip_fraction | 0.0836 |\n",
|
|
"| clip_range | 0.2 |\n",
|
|
"| entropy_loss | -1.38 |\n",
|
|
"| explained_variance | 0.00211 |\n",
|
|
"| learning_rate | 0.0002 |\n",
|
|
"| loss | 345 |\n",
|
|
"| n_updates | 10 |\n",
|
|
"| policy_gradient_loss | -0.00783 |\n",
|
|
"| value_loss | 810 |\n",
|
|
"-----------------------------------------\n",
|
|
"-----------------------------------------\n",
|
|
"| rollout/ | |\n",
|
|
"| ep_len_mean | 103 |\n",
|
|
"| ep_rew_mean | -104 |\n",
|
|
"| time/ | |\n",
|
|
"| fps | 1135 |\n",
|
|
"| iterations | 3 |\n",
|
|
"| time_elapsed | 86 |\n",
|
|
"| total_timesteps | 98304 |\n",
|
|
"| train/ | |\n",
|
|
"| approx_kl | 0.012206479 |\n",
|
|
"| clip_fraction | 0.12 |\n",
|
|
"| clip_range | 0.2 |\n",
|
|
"| entropy_loss | -1.35 |\n",
|
|
"| explained_variance | 0.436 |\n",
|
|
"| learning_rate | 0.0002 |\n",
|
|
"| loss | 77.5 |\n",
|
|
"| n_updates | 20 |\n",
|
|
"| policy_gradient_loss | -0.0125 |\n",
|
|
"| value_loss | 335 |\n",
|
|
"-----------------------------------------\n",
|
|
"-----------------------------------------\n",
|
|
"| rollout/ | |\n",
|
|
"| ep_len_mean | 113 |\n",
|
|
"| ep_rew_mean | -87.2 |\n",
|
|
"| time/ | |\n",
|
|
"| fps | 1031 |\n",
|
|
"| iterations | 4 |\n",
|
|
"| time_elapsed | 127 |\n",
|
|
"| total_timesteps | 131072 |\n",
|
|
"| train/ | |\n",
|
|
"| approx_kl | 0.012416394 |\n",
|
|
"| clip_fraction | 0.167 |\n",
|
|
"| clip_range | 0.2 |\n",
|
|
"| entropy_loss | -1.31 |\n",
|
|
"| explained_variance | 0.513 |\n",
|
|
"| learning_rate | 0.0002 |\n",
|
|
"| loss | 66.1 |\n",
|
|
"| n_updates | 30 |\n",
|
|
"| policy_gradient_loss | -0.015 |\n",
|
|
"| value_loss | 257 |\n",
|
|
"-----------------------------------------\n"
|
|
]
|
|
},
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"<stable_baselines3.ppo.ppo.PPO at 0x7fb0d70ccd50>"
|
|
]
|
|
},
|
|
"execution_count": 4,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"# Now we do the magical stuff of logging to wandb. All you have to do is add the wandb callback to the model's callback like this\n",
|
|
"\n",
|
|
"model.learn(total_timesteps=config[\"total_timesteps\"], \n",
|
|
" callback=[WandbCallback(\n",
|
|
" gradient_save_freq=100\n",
|
|
" )])"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 5,
|
|
"id": "d2a6341c",
|
|
"metadata": {
|
|
"ExecuteTime": {
|
|
"end_time": "2022-05-08T00:59:27.663456Z",
|
|
"start_time": "2022-05-08T00:59:14.382414Z"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/html": [
|
|
"Waiting for W&B process to finish... <strong style=\"color:green\">(success).</strong>"
|
|
],
|
|
"text/plain": [
|
|
"<IPython.core.display.HTML object>"
|
|
]
|
|
},
|
|
"metadata": {},
|
|
"output_type": "display_data"
|
|
},
|
|
{
|
|
"data": {
|
|
"application/vnd.jupyter.widget-view+json": {
|
|
"model_id": "",
|
|
"version_major": 2,
|
|
"version_minor": 0
|
|
},
|
|
"text/plain": [
|
|
"VBox(children=(Label(value='0.002 MB of 0.002 MB uploaded (0.000 MB deduped)\\r'), FloatProgress(value=1.0, max…"
|
|
]
|
|
},
|
|
"metadata": {},
|
|
"output_type": "display_data"
|
|
},
|
|
{
|
|
"data": {
|
|
"text/html": [
|
|
"<style>\n",
|
|
" table.wandb td:nth-child(1) { padding: 0 10px; text-align: left ; width: auto;} td:nth-child(2) {text-align: left ; width: 100%}\n",
|
|
" .wandb-row { display: flex; flex-direction: row; flex-wrap: wrap; justify-content: flex-start; width: 100% }\n",
|
|
" .wandb-col { display: flex; flex-direction: column; flex-basis: 100%; flex: 1; padding: 10px; }\n",
|
|
" </style>\n",
|
|
"<div class=\"wandb-row\"><div class=\"wandb-col\"><h3>Run history:</h3><br/><table class=\"wandb\"><tr><td>global_step</td><td>▁▃▆█</td></tr><tr><td>rollout/ep_len_mean</td><td>▁▃▅█</td></tr><tr><td>rollout/ep_rew_mean</td><td>▁▄▇█</td></tr><tr><td>time/fps</td><td>█▂▁▁</td></tr><tr><td>train/approx_kl</td><td>▁██</td></tr><tr><td>train/clip_fraction</td><td>▁▄█</td></tr><tr><td>train/clip_range</td><td>▁▁▁</td></tr><tr><td>train/entropy_loss</td><td>▁▄█</td></tr><tr><td>train/explained_variance</td><td>▁▇█</td></tr><tr><td>train/learning_rate</td><td>▁▁▁</td></tr><tr><td>train/loss</td><td>█▁▁</td></tr><tr><td>train/policy_gradient_loss</td><td>█▃▁</td></tr><tr><td>train/value_loss</td><td>█▂▁</td></tr></table><br/></div><div class=\"wandb-col\"><h3>Run summary:</h3><br/><table class=\"wandb\"><tr><td>global_step</td><td>131072</td></tr><tr><td>rollout/ep_len_mean</td><td>113.48</td></tr><tr><td>rollout/ep_rew_mean</td><td>-87.24881</td></tr><tr><td>time/fps</td><td>1031.0</td></tr><tr><td>train/approx_kl</td><td>0.01242</td></tr><tr><td>train/clip_fraction</td><td>0.16663</td></tr><tr><td>train/clip_range</td><td>0.2</td></tr><tr><td>train/entropy_loss</td><td>-1.30755</td></tr><tr><td>train/explained_variance</td><td>0.51275</td></tr><tr><td>train/learning_rate</td><td>0.0002</td></tr><tr><td>train/loss</td><td>66.06168</td></tr><tr><td>train/policy_gradient_loss</td><td>-0.01496</td></tr><tr><td>train/value_loss</td><td>257.38614</td></tr></table><br/></div></div>"
|
|
],
|
|
"text/plain": [
|
|
"<IPython.core.display.HTML object>"
|
|
]
|
|
},
|
|
"metadata": {},
|
|
"output_type": "display_data"
|
|
},
|
|
{
|
|
"data": {
|
|
"text/html": [
|
|
"Synced <strong style=\"color:#cdcd00\">warm-gorge-3</strong>: <a href=\"https://wandb.ai/supersecurehuman/LunarLander-v2/runs/2ovixu73\" target=\"_blank\">https://wandb.ai/supersecurehuman/LunarLander-v2/runs/2ovixu73</a><br/>Synced 6 W&B file(s), 0 media file(s), 0 artifact file(s) and 1 other file(s)"
|
|
],
|
|
"text/plain": [
|
|
"<IPython.core.display.HTML object>"
|
|
]
|
|
},
|
|
"metadata": {},
|
|
"output_type": "display_data"
|
|
},
|
|
{
|
|
"data": {
|
|
"text/html": [
|
|
"Find logs at: <code>./wandb/run-20220508_062506-2ovixu73/logs</code>"
|
|
],
|
|
"text/plain": [
|
|
"<IPython.core.display.HTML object>"
|
|
]
|
|
},
|
|
"metadata": {},
|
|
"output_type": "display_data"
|
|
}
|
|
],
|
|
"source": [
|
|
"# Finish run\n",
|
|
"run.finish()\n",
|
|
"\n",
|
|
"# This cell output will also give a global summary, along with giving you the link to view your run."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "e9f1e6a5",
|
|
"metadata": {},
|
|
"source": [
|
|
"### Note\n",
|
|
"\n",
|
|
"You need to enable tensorboard logging to view your training metrics in wandb dashboard."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "629f050a",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Package to 🤗 hub"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 10,
|
|
"id": "8be43170",
|
|
"metadata": {
|
|
"ExecuteTime": {
|
|
"end_time": "2022-05-08T01:13:41.386029Z",
|
|
"start_time": "2022-05-08T01:13:41.335337Z"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": []
|
|
},
|
|
"execution_count": 10,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"# You have to disable wandb while packaging it to hub, because it seems to be interfering with package to hub function.\n",
|
|
"wandb.init(mode=\"disabled\")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "481c5eae",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"from huggingface_hub import notebook_login # To log to our Hugging Face account to be able to upload models to the Hub."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "91d45e2a",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"# Note: You just need to run notebook_login() once in any machine you are trying to login. The token is saved in you machine, making future access to your account easier\n",
|
|
"notebook_login()\n",
|
|
"!git config --global credential.helper store"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "0192bfe0",
|
|
"metadata": {
|
|
"ExecuteTime": {
|
|
"end_time": "2022-05-08T01:14:35.254796Z",
|
|
"start_time": "2022-05-08T01:13:43.366135Z"
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"from huggingface_sb3 import package_to_hub\n",
|
|
"\n",
|
|
"env_id = config[\"env_name\"]\n",
|
|
"\n",
|
|
"model_architecture = \"PPO\"\n",
|
|
"model_name = \"PPO-LunarLander-v2\"\n",
|
|
"\n",
|
|
"repo_id = \"SuperSecureHuman/LunarLander_v2_PPO_wandb\"\n",
|
|
"\n",
|
|
"commit_message = \"Initial Commit\"\n",
|
|
"\n",
|
|
"eval_env = DummyVecEnv([lambda: gym.make(env_id)])\n",
|
|
"\n",
|
|
"package_to_hub(model=model, # Our trained model\n",
|
|
" model_name=model_name, # The name of our trained model \n",
|
|
" model_architecture=model_architecture, # The model architecture we used: in our case PPO\n",
|
|
" env_id=env_id, # Name of the environment\n",
|
|
" eval_env=eval_env, # Evaluation Environment\n",
|
|
" repo_id=repo_id, # id of the model repository from the Hugging Face Hub\n",
|
|
" commit_message=commit_message)\n",
|
|
"eval_env.close()"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "9368a9f9",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Congrats!\n",
|
|
"\n",
|
|
"Now you have now started to use wandb in your project. Do checkout the docs to know what are the other amazing stuff it is capable off!"
|
|
]
|
|
}
|
|
],
|
|
"metadata": {
|
|
"kernelspec": {
|
|
"display_name": "Python 3 (ipykernel)",
|
|
"language": "python",
|
|
"name": "python3"
|
|
},
|
|
"language_info": {
|
|
"codemirror_mode": {
|
|
"name": "ipython",
|
|
"version": 3
|
|
},
|
|
"file_extension": ".py",
|
|
"mimetype": "text/x-python",
|
|
"name": "python",
|
|
"nbconvert_exporter": "python",
|
|
"pygments_lexer": "ipython3",
|
|
"version": "3.7.12"
|
|
},
|
|
"toc": {
|
|
"base_numbering": 1,
|
|
"nav_menu": {},
|
|
"number_sections": true,
|
|
"sideBar": true,
|
|
"skip_h1_title": false,
|
|
"title_cell": "Table of Contents",
|
|
"title_sidebar": "Contents",
|
|
"toc_cell": false,
|
|
"toc_position": {},
|
|
"toc_section_display": true,
|
|
"toc_window_display": false
|
|
},
|
|
"varInspector": {
|
|
"cols": {
|
|
"lenName": 16,
|
|
"lenType": 16,
|
|
"lenVar": 40
|
|
},
|
|
"kernels_config": {
|
|
"python": {
|
|
"delete_cmd_postfix": "",
|
|
"delete_cmd_prefix": "del ",
|
|
"library": "var_list.py",
|
|
"varRefreshCmd": "print(var_dic_list())"
|
|
},
|
|
"r": {
|
|
"delete_cmd_postfix": ") ",
|
|
"delete_cmd_prefix": "rm(",
|
|
"library": "var_list.r",
|
|
"varRefreshCmd": "cat(var_dic_list()) "
|
|
}
|
|
},
|
|
"types_to_exclude": [
|
|
"module",
|
|
"function",
|
|
"builtin_function_or_method",
|
|
"instance",
|
|
"_Feature"
|
|
],
|
|
"window_display": false
|
|
}
|
|
},
|
|
"nbformat": 4,
|
|
"nbformat_minor": 5
|
|
}
|