diff --git a/notebooks/bonus-unit1/bonus-unit1.ipynb b/notebooks/bonus-unit1/bonus-unit1.ipynb
index b4f0e80..c433256 100644
--- a/notebooks/bonus-unit1/bonus-unit1.ipynb
+++ b/notebooks/bonus-unit1/bonus-unit1.ipynb
@@ -102,7 +102,7 @@
  "\n",
  "- 📖 Study Deep Reinforcement Learning in **theory and practice**.\n",
  "- 🧑‍💻 Learn to **use famous Deep RL libraries** such as Stable Baselines3, RL Baselines3 Zoo, CleanRL and Sample Factory 2.0.\n",
- "- 🤖 Train **agents in unique environments** \n",
+ "- 🤖 Train **agents in unique environments**\n",
  "\n",
  "And more check 📚 the syllabus 👉 https://simoninithomas.github.io/deep-rl-course\n",
  "\n",
@@ -254,7 +254,7 @@
  "id": "nyumV5XfPKzu"
  },
  "source": [
- "Make sure your file is accessible "
+ "Make sure your file is accessible"
 ]
 },
 {
@@ -321,11 +321,16 @@
  "\n",
  "- For the scope of this notebook, we're not going to modify the hyperparameters, but if you want to try as an experiment, you should also try to modify some other hyperparameters, Unity provides very [good documentation explaining each of them here](https://github.com/Unity-Technologies/ml-agents/blob/main/docs/Training-Configuration-File.md).\n",
  "\n",
- "- But we need to create a config file for Huggy. \n",
+ "- But we need to create a config file for Huggy.\n",
  "\n",
- "- Go to `/content/ml-agents/config/ppo`\n",
+ "  - To do that, click on the Folder logo on the left of your screen.\n",
  "\n",
- "- Create a new file called `Huggy.yaml`\n",
+ "  \"Create\n",
+ "\n",
+ "  - Go to `/content/ml-agents/config/ppo`\n",
+ "  - Right-click and create a new file called `Huggy.yaml`\n",
+ "\n",
+ "  \"Create\n",
  "\n",
  "- Copy and paste the content below 🔽"
 ],
@@ -385,9 +390,9 @@
  "\n",
  "- For instance **if you want to save more models during the training** (for now, we save every 200,000 training timesteps). You need to modify:\n",
  "  - `checkpoint_interval`: The number of training timesteps collected between each checkpoint.\n",
- "  - `keep_checkpoints`: The maximum number of model checkpoints to keep. \n",
+ "  - `keep_checkpoints`: The maximum number of model checkpoints to keep.\n",
  "\n",
- "=> Just keep in mind that **decreasing the `checkpoint_interval` means more models to upload to the Hub and so a longer uploading time** \n",
+ "=> Just keep in mind that **decreasing the `checkpoint_interval` means more models to upload to the Hub and so a longer uploading time**\n",
  "We’re now ready to train our agent 🔥."
 ],
 "metadata": {
@@ -413,9 +418,9 @@
  "3. `--run_id`: the name you want to give to your training run id.\n",
  "4. `--no-graphics`: to not launch the visualization during the training.\n",
  "\n",
- "Train the model and use the `--resume` flag to continue training in case of interruption. \n",
+ "Train the model and use the `--resume` flag to continue training in case of interruption.\n",
  "\n",
- "> It will fail first time when you use `--resume`, try running the block again to bypass the error. \n",
+ "> It will fail first time when you use `--resume`, try running the block again to bypass the error.\n",
  "\n"
 ]
 },
 {
@@ -462,7 +467,7 @@
  "\n",
  "\"Create\n",
  "\n",
- "- Copy the token \n",
+ "- Copy the token\n",
  "- Run the cell below and paste the token"
 ],
 "metadata": {
@@ -578,7 +583,7 @@
  "1. In step 1, choose your model repository which is the model id (in my case ThomasSimonini/ppo-Huggy).\n",
  "\n",
  "2. In step 2, **choose what model you want to replay**:\n",
- "  - I have multiple ones, since we saved a model every 500000 timesteps. \n",
+ "  - I have multiple ones, since we saved a model every 500000 timesteps.\n",
  "  - But since I want the more recent, I choose `Huggy.onnx`\n",
  "\n",
  "👉 What’s nice **is to try with different models steps to see the improvement of the agent.**"
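The `checkpoint_interval` / `keep_checkpoints` tradeoff discussed in the hunks above can be sketched numerically. This is an illustrative sketch only: the helper functions and the 2,000,000-step run length are hypothetical, not part of the notebook or of the ML-Agents API.

```python
# Illustrative sketch of how ML-Agents' checkpoint settings interact.
# `checkpoint_interval` and `keep_checkpoints` mirror the config fields
# discussed above; `max_steps` and the numbers below are hypothetical.

def checkpoints_written(max_steps: int, checkpoint_interval: int) -> int:
    """Checkpoints saved during training: one per interval, plus the final model."""
    return max_steps // checkpoint_interval + 1

def checkpoints_kept(written: int, keep_checkpoints: int) -> int:
    """Only the most recent `keep_checkpoints` checkpoints remain on disk."""
    return min(written, keep_checkpoints)

# With the interval mentioned in the notebook (save every 200,000 steps)
# and a hypothetical 2M-step run:
written = checkpoints_written(max_steps=2_000_000, checkpoint_interval=200_000)
kept = checkpoints_kept(written, keep_checkpoints=5)
print(written, kept)  # 11 5
```

Halving `checkpoint_interval` roughly doubles `written`, which is why the diff warns that a smaller interval means more models to upload to the Hub.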