From e145aeba1dafca0324b58da0cd9dff57a528feab Mon Sep 17 00:00:00 2001
From: Thomas Simonini
Date: Tue, 30 May 2023 09:14:44 +0200
Subject: [PATCH] Update with Gymnasium

---
 notebooks/unit3.ipynb | 833 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 833 insertions(+)
 create mode 100644 notebooks/unit3.ipynb

diff --git a/notebooks/unit3.ipynb b/notebooks/unit3.ipynb
new file mode 100644
index 0000000..f9eee5e
--- /dev/null
+++ b/notebooks/unit3.ipynb
@@ -0,0 +1,833 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "view-in-github",
+    "colab_type": "text"
+   },
+   "source": [
+    "\"Open"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "k7xBVPzoXxOg"
+   },
+   "source": [
+    "# Unit 3: Deep Q-Learning with Atari Games 👾 using RL Baselines3 Zoo\n",
+    "\n",
+    "\"Unit\n",
+    "\n",
+    "In this notebook, **you'll train a Deep Q-Learning agent** playing Space Invaders using [RL Baselines3 Zoo](https://github.com/DLR-RM/rl-baselines3-zoo), a training framework based on [Stable-Baselines3](https://stable-baselines3.readthedocs.io/en/master/) that provides scripts for training and evaluating agents, tuning hyperparameters, plotting results, and recording videos.\n",
+    "\n",
+    "We're using the [RL-Baselines-3 Zoo integration, a vanilla version of Deep Q-Learning](https://stable-baselines3.readthedocs.io/en/master/modules/dqn.html) with no extensions such as Double-DQN, Dueling-DQN, or Prioritized Experience Replay.\n",
+    "\n",
+    "⬇️ Here is an example of what **you will achieve** ⬇️"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "J9S713biXntc"
+   },
+   "outputs": [],
+   "source": [
+    "%%html\n",
+    ""
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "source": [
+    "### 🎮 Environments:\n",
+    "\n",
+    "- [SpaceInvadersNoFrameskip-v4](https://gymnasium.farama.org/environments/atari/space_invaders/)\n",
+    "\n",
+    "You can see the differences between the Space Invaders versions here 👉 https://gymnasium.farama.org/environments/atari/space_invaders/#variants\n",
+    "\n",
+    "### 📚 RL-Library:\n",
+    "\n",
+    "- [RL-Baselines3-Zoo](https://github.com/DLR-RM/rl-baselines3-zoo)"
+   ],
+   "metadata": {
+    "id": "ykJiGevCMVc5"
+   }
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "wciHGjrFYz9m"
+   },
+   "source": [
+    "## Objectives of this notebook 🏆\n",
+    "At the end of the notebook, you will:\n",
+    "- Have a deeper understanding of **how RL Baselines3 Zoo works**.\n",
+    "- Be able to **push your trained agent and the code to the Hub** with a nice video replay and an evaluation score 🔥.\n",
+    "\n",
+    "\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "source": [
+    "## This notebook is from the Deep Reinforcement Learning Course\n",
+    "\"Deep"
+   ],
+   "metadata": {
+    "id": "TsnP0rjxMn1e"
+   }
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "nw6fJHIAZd-J"
+   },
+   "source": [
+    "In this free course, you will:\n",
+    "\n",
+    "- 📖 Study Deep Reinforcement Learning in **theory and practice**.\n",
+    "- 🧑‍💻 Learn to **use famous Deep RL libraries** such as Stable Baselines3, RL Baselines3 Zoo, CleanRL and Sample Factory 2.0.\n",
+    "- 🤖 Train **agents in unique environments**.\n",
+    "\n",
+    "And more! Check 📚 the syllabus 👉 https://simoninithomas.github.io/deep-rl-course\n",
+    "\n",
+    "Don't forget to **sign up for the course** (we are collecting your email to be able to **send you the links when each Unit is published and give you information about the challenges and updates).**\n",
+    "\n",
+    "\n",
+    "The best way to keep in touch is to join our Discord server to exchange with the community and with us 👉🏻 https://discord.gg/ydHrjt3WP5"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "0vgANIBBZg1p"
+   },
+   "source": [
+    "## Prerequisites 🏗️\n",
+    "Before diving into the notebook, you need to:\n",
+    "\n",
+    "🔲 📚 **[Study Deep Q-Learning by reading Unit 3](https://huggingface.co/deep-rl-course/unit3/introduction)** 🤗 "
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "source": [
+    "We're constantly trying to improve our tutorials, so **if you find some issues in this notebook**, please [open an issue on the GitHub repo](https://github.com/huggingface/deep-rl-class/issues)."
+   ],
+   "metadata": {
+    "id": "7kszpGFaRVhq"
+   }
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "QR0jZtYreSI5"
+   },
+   "source": [
+    "# Let's train a Deep Q-Learning agent playing Atari's Space Invaders 👾 and upload it to the Hub.\n",
+    "\n",
+    "We strongly recommend that students **use Google Colab for the hands-on exercises instead of running them on their personal computers**.\n",
+    "\n",
+    "By using Google Colab, **you can focus on learning and experimenting without worrying about the technical aspects of setting up your environments**.\n",
+    "\n",
+    "To validate this hands-on for the certification process, you need to push your trained model to the Hub and **get a result of >= 200**.\n",
+    "\n",
+    "To find your result, go to the leaderboard and find your model: **the result = mean_reward - std of reward**.\n",
+    "\n",
+    "For more information about the certification process, check this section 👉 https://huggingface.co/deep-rl-course/en/unit0/introduction#certification-process"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "source": [
+    "## Some advice 💡\n",
+    "It's better to run this Colab from a copy on your Google Drive, so that **if it times out** you still have the saved notebook on your Google Drive and do not need to redo everything from scratch.\n",
+    "\n",
+    "To do that, you can either use `Ctrl + S` or `File > Save a copy in Google Drive.`\n",
+    "\n",
+    "Also, we're going to **train it for 90 minutes with 1M timesteps**. Running `!nvidia-smi` will tell you what GPU you're using.\n",
+    "\n",
+    "If you want to train for more steps, such as 10 million, it will take about 9 hours, potentially resulting in Colab timing out. In that case, I recommend running this on your local computer (or somewhere else). Just click on `File > Download`. "
+   ],
+   "metadata": {
+    "id": "Nc8BnyVEc3Ys"
+   }
+  },
+  {
+   "cell_type": "markdown",
+   "source": [
+    "## Set the GPU 💪\n",
+    "- To **accelerate the agent's training, we'll use a GPU**. To do that, go to `Runtime > Change Runtime type`\n",
+    "\n",
+    "\"GPU"
+   ],
+   "metadata": {
+    "id": "PU4FVzaoM6fC"
+   }
+  },
+  {
+   "cell_type": "markdown",
+   "source": [
+    "- `Hardware Accelerator > GPU`\n",
+    "\n",
+    "\"GPU"
+   ],
+   "metadata": {
+    "id": "KV0NyFdQM9ZG"
+   }
+  },
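+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Now that the GPU runtime is set, let's run `!nvidia-smi` (mentioned above) to check which GPU Colab assigned to us:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Show the GPU assigned to this Colab runtime\n",
+    "!nvidia-smi"
+   ]
+  },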
+  {
+   "cell_type": "markdown",
+   "source": [
+    "# Install RL-Baselines3 Zoo and its dependencies 📚\n",
+    "\n",
+    "If you see `ERROR: pip's dependency resolver does not currently take into account all the packages that are installed.` **this is normal and not a critical error**: it's a version conflict, but the packages we need are installed."
+   ],
+   "metadata": {
+    "id": "wS_cVefO-aYg"
+   }
+  },
+  {
+   "cell_type": "code",
+   "source": [
+    "# For now we install this update of RL-Baselines3 Zoo\n",
+    "!pip install git+https://github.com/DLR-RM/rl-baselines3-zoo@update/hf"
+   ],
+   "metadata": {
+    "id": "hLTwHqIWdnPb"
+   },
+   "execution_count": null,
+   "outputs": []
+  },
+  {
+   "cell_type": "markdown",
+   "source": [
+    "IF AND ONLY IF THE VERSION ABOVE DOES NOT EXIST ANYMORE, UNCOMMENT AND INSTALL THE ONE BELOW:"
+   ],
+   "metadata": {
+    "id": "p0xe2sJHdtHy"
+   }
+  },
+  {
+   "cell_type": "code",
+   "source": [
+    "#!pip install rl_zoo3==2.0.0a9"
+   ],
+   "metadata": {
+    "id": "N0d6wy-F-f39"
+   },
+   "execution_count": null,
+   "outputs": []
+  },
+  {
+   "cell_type": "code",
+   "source": [
+    "!apt-get install swig cmake ffmpeg"
+   ],
+   "metadata": {
+    "id": "8_MllY6Om1eI"
+   },
+   "execution_count": null,
+   "outputs": []
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "4S9mJiKg6SqC"
+   },
+   "source": [
+    "To be able to use Atari games in Gymnasium, we need to install the `atari` package, and `accept-rom-license` to download the ROM files (game files)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "source": [
+    "!pip install gymnasium[atari]\n",
+    "!pip install gymnasium[accept-rom-license]"
+   ],
+   "metadata": {
+    "id": "NsRP-lX1_2fC"
+   },
+   "execution_count": null,
+   "outputs": []
+  },
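+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "As an optional sanity check, you can verify that the ROMs were installed correctly by instantiating the environment (a quick sketch, assuming the two installs above succeeded):"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Optional sanity check: if the Atari ROMs are installed, this env can be created\n",
+    "import gymnasium as gym\n",
+    "\n",
+    "env = gym.make(\"SpaceInvadersNoFrameskip-v4\")\n",
+    "print(\"Observation space:\", env.observation_space)\n",
+    "print(\"Action space:\", env.action_space)\n",
+    "env.close()"
+   ]
+  },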
+  {
+   "cell_type": "markdown",
+   "source": [
+    "## Create a virtual display 🔽\n",
+    "\n",
+    "During the notebook, we'll need to generate a replay video. To do so in Colab, **we need a virtual screen to be able to render the environment** (and thus record the frames).\n",
+    "\n",
+    "Hence the following cells will install the libraries, then create and run a virtual screen 🖥"
+   ],
+   "metadata": {
+    "id": "bTpYcVZVMzUI"
+   }
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "jV6wjQ7Be7p5"
+   },
+   "outputs": [],
+   "source": [
+    "%%capture\n",
+    "!apt install python-opengl\n",
+    "!apt install ffmpeg\n",
+    "!apt install xvfb\n",
+    "!pip3 install pyvirtualdisplay"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "source": [
+    "# Virtual display\n",
+    "from pyvirtualdisplay import Display\n",
+    "\n",
+    "virtual_display = Display(visible=0, size=(1400, 900))\n",
+    "virtual_display.start()"
+   ],
+   "metadata": {
+    "id": "BE5JWP5rQIKf"
+   },
+   "execution_count": null,
+   "outputs": []
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "5iPgzluo9z-u"
+   },
+   "source": [
+    "## Train our Deep Q-Learning Agent to Play Space Invaders 👾\n",
+    "\n",
+    "To train an agent with RL-Baselines3-Zoo, we just need to do two things:\n",
+    "\n",
+    "1. Create a hyperparameter config file called `dqn.yml` that will contain our training hyperparameters.\n",
+    "\n",
+    "This is a template example:\n",
+    "\n",
+    "```\n",
+    "SpaceInvadersNoFrameskip-v4:\n",
+    "  env_wrapper:\n",
+    "    - stable_baselines3.common.atari_wrappers.AtariWrapper\n",
+    "  frame_stack: 4\n",
+    "  policy: 'CnnPolicy'\n",
+    "  n_timesteps: !!float 1e7\n",
+    "  buffer_size: 100000\n",
+    "  learning_rate: !!float 1e-4\n",
+    "  batch_size: 32\n",
+    "  learning_starts: 100000\n",
+    "  target_update_interval: 1000\n",
+    "  train_freq: 4\n",
+    "  gradient_steps: 1\n",
+    "  exploration_fraction: 0.1\n",
+    "  exploration_final_eps: 0.01\n",
+    "  # If True, you need to deactivate handle_timeout_termination\n",
+    "  # in the replay_buffer_kwargs\n",
+    "  optimize_memory_usage: False\n",
+    "```"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "_VjblFSVDQOj"
+   },
+   "source": [
+    "Here we see that:\n",
+    "- We use the `AtariWrapper`, which preprocesses the input (frame reduction, grayscale, stacking 4 frames)\n",
+    "- We use `CnnPolicy`, since we use convolutional layers to process the frames\n",
+    "- We train for 10 million timesteps (`n_timesteps`)\n",
+    "- The memory (Experience Replay) size is 100000, i.e., the number of experience steps saved to train your agent on again.\n",
+    "\n",
+    "💡 My advice is to **reduce the training timesteps to 1M,** which will take about 90 minutes on a P100. `!nvidia-smi` will tell you what GPU you're using. At 10 million steps, training will take about 9 hours, which will likely result in Colab timing out. In that case, I recommend running this on your local computer (or somewhere else). Just click on `File > Download`. "
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "5qTkbWrkECOJ"
+   },
+   "source": [
+    "In terms of hyperparameter optimization, my advice is to focus on these 3 hyperparameters:\n",
+    "- `learning_rate`\n",
+    "- `buffer_size` (Experience Memory size)\n",
+    "- `batch_size`\n",
+    "\n",
+    "As a good practice, you should **check the documentation to understand what each hyperparameter does**: https://stable-baselines3.readthedocs.io/en/master/modules/dqn.html#parameters\n",
+    "\n"
+   ]
+  },
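+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Let's write this config to `dqn.yml` so the training command in step 2 can find it. The cell below is a minimal sketch of step 1, using the template above with `n_timesteps` reduced to 1M as advised; feel free to edit the values to experiment:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "%%writefile dqn.yml\n",
+    "SpaceInvadersNoFrameskip-v4:\n",
+    "  env_wrapper:\n",
+    "    - stable_baselines3.common.atari_wrappers.AtariWrapper\n",
+    "  frame_stack: 4\n",
+    "  policy: 'CnnPolicy'\n",
+    "  # Reduced from 1e7 to 1e6 so training finishes in ~90 minutes on a P100\n",
+    "  n_timesteps: !!float 1e6\n",
+    "  buffer_size: 100000\n",
+    "  learning_rate: !!float 1e-4\n",
+    "  batch_size: 32\n",
+    "  learning_starts: 100000\n",
+    "  target_update_interval: 1000\n",
+    "  train_freq: 4\n",
+    "  gradient_steps: 1\n",
+    "  exploration_fraction: 0.1\n",
+    "  exploration_final_eps: 0.01\n",
+    "  # If True, you need to deactivate handle_timeout_termination\n",
+    "  # in the replay_buffer_kwargs\n",
+    "  optimize_memory_usage: False"
+   ]
+  },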
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "Hn8bRTHvERRL"
+   },
+   "source": [
+    "2. We start the training and save the models in the `logs` folder 📁\n",
+    "\n",
+    "- Define the algorithm after `--algo`, where to save the model after `-f`, and where the hyperparameter config is after `-c`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "Xr1TVW4xfbz3"
+   },
+   "outputs": [],
+   "source": [
+    "!python -m rl_zoo3.train --algo ________ --env SpaceInvadersNoFrameskip-v4 -f _________ -c _________"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "SeChoX-3SZfP"
+   },
+   "source": [
+    "#### Solution"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "PuocgdokSab9"
+   },
+   "outputs": [],
+   "source": [
+    "!python -m rl_zoo3.train --algo dqn --env SpaceInvadersNoFrameskip-v4 -f logs/ -c dqn.yml"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "_dLomIiMKQaf"
+   },
+   "source": [
+    "## Let's evaluate our agent 👀\n",
+    "- RL-Baselines3-Zoo provides `enjoy.py`, a Python script to evaluate our agent. In most RL libraries, the evaluation script is called `enjoy.py`.\n",
+    "- Let's evaluate it for 5000 timesteps 🔥"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "co5um_KeKbBJ"
+   },
+   "outputs": [],
+   "source": [
+    "!python -m rl_zoo3.enjoy --algo dqn --env SpaceInvadersNoFrameskip-v4 --no-render --n-timesteps _________ --folder logs/ "
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "Q24K1tyWSj7t"
+   },
+   "source": [
+    "#### Solution"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "P_uSmwGRSk0z"
+   },
+   "outputs": [],
+   "source": [
+    "!python -m rl_zoo3.enjoy --algo dqn --env SpaceInvadersNoFrameskip-v4 --no-render --n-timesteps 5000 --folder logs/"
+   ]
+  },
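+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The leaderboard computes `mean_reward - std of reward` for you, but if you want to estimate your score locally, here's a minimal sketch using Stable-Baselines3's `evaluate_policy`. The checkpoint path is an assumption (RL-Baselines3-Zoo saves runs under `logs/dqn/SpaceInvadersNoFrameskip-v4_<run>/`), so adjust it to match your own run:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Sketch: estimate the certification score (mean_reward - std of reward) locally.\n",
+    "# NOTE: the checkpoint path below is an assumption; check your logs/ folder\n",
+    "# for the actual run directory (e.g. SpaceInvadersNoFrameskip-v4_1).\n",
+    "from stable_baselines3 import DQN\n",
+    "from stable_baselines3.common.env_util import make_atari_env\n",
+    "from stable_baselines3.common.evaluation import evaluate_policy\n",
+    "from stable_baselines3.common.vec_env import VecFrameStack\n",
+    "\n",
+    "# Recreate the training preprocessing: AtariWrapper + 4 stacked frames\n",
+    "eval_env = make_atari_env(\"SpaceInvadersNoFrameskip-v4\", n_envs=1)\n",
+    "eval_env = VecFrameStack(eval_env, n_stack=4)\n",
+    "\n",
+    "model = DQN.load(\n",
+    "    \"logs/dqn/SpaceInvadersNoFrameskip-v4_1/SpaceInvadersNoFrameskip-v4.zip\",\n",
+    "    env=eval_env,\n",
+    ")\n",
+    "\n",
+    "mean_reward, std_reward = evaluate_policy(model, eval_env, n_eval_episodes=10)\n",
+    "print(f\"mean_reward={mean_reward:.2f} +/- {std_reward:.2f}\")\n",
+    "print(f\"result (mean - std) = {mean_reward - std_reward:.2f}\")"
+   ]
+  },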
"execution_count": null, + "metadata": { + "id": "Ygk2sEktTDEw" + }, + "outputs": [], + "source": [ + "!python -m rl_zoo3.push_to_hub --algo dqn --env SpaceInvadersNoFrameskip-v4 --repo-name _____________________ -orga _____________________ -f logs/" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "otgpa0rhS9wR" + }, + "source": [ + "#### Solution" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "_HQNlAXuEhci" + }, + "outputs": [], + "source": [ + "!python -m rl_zoo3.push_to_hub --algo dqn --env SpaceInvadersNoFrameskip-v4 --repo-name dqn-SpaceInvadersNoFrameskip-v4 -orga ThomasSimonini -f logs/" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "0D4F5zsTTJ-L" + }, + "source": [ + "###." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "ff89kd2HL1_s" + }, + "source": [ + "Congrats ๐Ÿฅณ you've just trained and uploaded your first Deep Q-Learning agent using RL-Baselines-3 Zoo. The script above should have displayed a link to a model repository such as https://huggingface.co/ThomasSimonini/dqn-SpaceInvadersNoFrameskip-v4. When you go to this link, you can:\n", + "\n", + "- See a **video preview of your agent** at the right. \n", + "- Click \"Files and versions\" to see all the files in the repository.\n", + "- Click \"Use in stable-baselines3\" to get a code snippet that shows how to load the model.\n", + "- A model card (`README.md` file) which gives a description of the model and the hyperparameters you used.\n", + "\n", + "Under the hood, the Hub uses git-based repositories (don't worry if you don't know what git is), which means you can update the model with new versions as you experiment and improve your agent.\n", + "\n", + "**Compare the results of your agents with your classmates** using the [leaderboard](https://huggingface.co/spaces/huggingface-projects/Deep-Reinforcement-Learning-Leaderboard) ๐Ÿ†" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "fyRKcCYY-dIo" + }, + "source": [ + "## Load a powerful trained model ๐Ÿ”ฅ\n", + "- The Stable-Baselines3 team uploaded **more than 150 trained Deep Reinforcement Learning agents on the Hub**.\n", + "\n", + "You can find them here: ๐Ÿ‘‰ https://huggingface.co/sb3\n", + "\n", + "Some examples:\n", + "- Asteroids: https://huggingface.co/sb3/dqn-AsteroidsNoFrameskip-v4\n", + "- Beam Rider: https://huggingface.co/sb3/dqn-BeamRiderNoFrameskip-v4\n", + "- Breakout: https://huggingface.co/sb3/dqn-BreakoutNoFrameskip-v4\n", + "- Road Runner: https://huggingface.co/sb3/dqn-RoadRunnerNoFrameskip-v4\n", + "\n", + "Let's load an agent playing Beam Rider: https://huggingface.co/sb3/dqn-BeamRiderNoFrameskip-v4" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "B-9QVFIROI5Y" + }, + "outputs": [], + "source": [ + "%%html\n", + "" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "7ZQNY_r6NJtC" + }, + "source": [ + "1. We download the model using `rl_zoo3.load_from_hub`, and place it in a new folder that we can call `rl_trained`" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "OdBNZHy0NGTR" + }, + "outputs": [], + "source": [ + "# Download model and save it into the logs/ folder\n", + "!python -m rl_zoo3.load_from_hub --algo dqn --env BeamRiderNoFrameskip-v4 -orga sb3 -f rl_trained/" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "LFt6hmWsNdBo" + }, + "source": [ + "2. 
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "LFt6hmWsNdBo"
+   },
+   "source": [
+    "2. Let's evaluate it for 5000 timesteps"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "aOxs0rNuN0uS"
+   },
+   "outputs": [],
+   "source": [
+    "!python -m rl_zoo3.enjoy --algo dqn --env BeamRiderNoFrameskip-v4 -n 5000 -f rl_trained/ --no-render"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "kxMDuDfPON57"
+   },
+   "source": [
+    "Why not try to train your own **Deep Q-Learning agent playing BeamRiderNoFrameskip-v4? 🏆**\n",
+    "\n",
+    "If you want to try, check https://huggingface.co/sb3/dqn-BeamRiderNoFrameskip-v4#hyperparameters: **in the model card, you have the hyperparameters of the trained agent.**"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "xL_ZtUgpOuY6"
+   },
+   "source": [
+    "But finding hyperparameters can be a daunting task. Fortunately, in the next Unit we'll see how we can **use Optuna to optimize the hyperparameters 🔥.**\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "-pqaco8W-huW"
+   },
+   "source": [
+    "## Some additional challenges 🏆\n",
+    "The best way to learn **is to try things on your own**!\n",
+    "\n",
+    "In the [Leaderboard](https://huggingface.co/spaces/huggingface-projects/Deep-Reinforcement-Learning-Leaderboard) you will find your agents. Can you get to the top?\n",
+    "\n",
+    "Here's a list of environments you can try to train your agent with:\n",
+    "- BeamRiderNoFrameskip-v4\n",
+    "- BreakoutNoFrameskip-v4\n",
+    "- EnduroNoFrameskip-v4\n",
+    "- PongNoFrameskip-v4\n",
+    "\n",
+    "Also, **if you want to learn to implement Deep Q-Learning by yourself**, you definitely should look at the CleanRL implementation: https://github.com/vwxyzjn/cleanrl/blob/master/cleanrl/dqn_atari.py\n",
+    "\n",
+    "\"Environments\"/"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "paS-XKo4-kmu"
+   },
+   "source": [
+    "________________________________________________________________________\n",
+    "Congrats on finishing this chapter!\n",
+    "\n",
+    "If you still feel confused by all these elements... it's totally normal! **It was the same for me and for everyone who has studied RL.**\n",
+    "\n",
+    "Take time to really **grasp the material before continuing and try the additional challenges**. It's important to master these elements and have solid foundations.\n",
+    "\n",
+    "In the next unit, **we're going to learn about [Optuna](https://optuna.org/)**. One of the most critical tasks in Deep Reinforcement Learning is finding a good set of training hyperparameters, and Optuna is a library that helps you automate the search.\n",
+    "\n",
+    "\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "5WRx7tO7-mvC"
+   },
+   "source": [
+    "\n",
+    "\n",
+    "### This is a course built with you 👷🏿‍♀️\n",
+    "\n",
+    "Finally, we want to improve and update the course iteratively with your feedback. If you have some, please fill out this form 👉 https://forms.gle/3HgA7bEHwAmmLfwh9\n",
+    "\n",
+    "We're constantly trying to improve our tutorials, so **if you find some issues in this notebook**, please [open an issue on the GitHub repo](https://github.com/huggingface/deep-rl-class/issues)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "source": [
+    "See you in Bonus Unit 2! 🔥 "
+   ],
+   "metadata": {
+    "id": "Kc3udPT-RcXc"
+   }
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "fS3Xerx0fIMV"
+   },
+   "source": [
+    "### Keep Learning, Stay Awesome 🤗"
+   ]
+  }
+ ],
+ "metadata": {
+  "colab": {
+   "private_outputs": true,
+   "provenance": [],
+   "include_colab_link": true
+  },
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.6"
+  },
+  "varInspector": {
+   "cols": {
+    "lenName": 16,
+    "lenType": 16,
+    "lenVar": 40
+   },
+   "kernels_config": {
+    "python": {
+     "delete_cmd_postfix": "",
+     "delete_cmd_prefix": "del ",
+     "library": "var_list.py",
+     "varRefreshCmd": "print(var_dic_list())"
+    },
+    "r": {
+     "delete_cmd_postfix": ") ",
+     "delete_cmd_prefix": "rm(",
+     "library": "var_list.r",
+     "varRefreshCmd": "cat(var_dic_list()) "
+    }
+   },
+   "types_to_exclude": [
+    "module",
+    "function",
+    "builtin_function_or_method",
+    "instance",
+    "_Feature"
+   ],
+   "window_display": false
+  },
+  "accelerator": "GPU",
+  "gpuClass": "standard"
+ },
+ "nbformat": 4,
+ "nbformat_minor": 0
+}
\ No newline at end of file