diff --git a/notebooks/unit3/unit3.ipynb b/notebooks/unit3/unit3.ipynb index 2252762..f9eee5e 100644 --- a/notebooks/unit3/unit3.ipynb +++ b/notebooks/unit3/unit3.ipynb @@ -7,7 +7,7 @@ "colab_type": "text" }, "source": [ - "\"Open" + "\"Open" ] }, { @@ -44,7 +44,9 @@ "source": [ "### 🎮 Environments: \n", "\n", - "- SpacesInvadersNoFrameskip-v4 \n", + "- [SpaceInvadersNoFrameskip-v4](https://gymnasium.farama.org/environments/atari/space_invaders/)\n", + "\n", + "You can see the difference between Space Invaders versions here 👉 https://gymnasium.farama.org/environments/atari/space_invaders/#variants\n", "\n", "### 📚 RL-Library: \n", "\n", @@ -127,6 +129,10 @@ "source": [ "# Let's train a Deep Q-Learning agent playing Atari' Space Invaders 👾 and upload it to the Hub.\n", "\n", + "We strongly recommend that students **use Google Colab for the hands-on exercises instead of running them on their personal computers**.\n", + "\n", + "By using Google Colab, **you can focus on learning and experimenting without worrying about the technical aspects of setting up your environments**.\n", + "\n", "To validate this hands-on for the certification process, you need to push your trained model to the Hub and **get a result of >= 200**.\n", "\n", "To find your result, go to the leaderboard and find your model, **the result = mean_reward - std of reward**\n", @@ -173,6 +179,81 @@ "id": "KV0NyFdQM9ZG" } }, + { + "cell_type": "markdown", + "source": [ + "# Install RL-Baselines3 Zoo and its dependencies 📚\n", + "\n", + "If you see `ERROR: pip's dependency resolver does not currently take into account all the packages that are installed.` **this is normal and not a critical error**: there is a version conflict, but the packages we need are installed." 
+ ], + "metadata": { + "id": "wS_cVefO-aYg" + } + }, + { + "cell_type": "code", + "source": [ + "# For now we install this update of RL-Baselines3 Zoo\n", + "!pip install git+https://github.com/DLR-RM/rl-baselines3-zoo@update/hf" + ], + "metadata": { + "id": "hLTwHqIWdnPb" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "IF AND ONLY IF THE VERSION ABOVE NO LONGER EXISTS, UNCOMMENT AND INSTALL THE ONE BELOW" + ], + "metadata": { + "id": "p0xe2sJHdtHy" + } + }, + { + "cell_type": "code", + "source": [ + "#!pip install rl_zoo3==2.0.0a9" + ], + "metadata": { + "id": "N0d6wy-F-f39" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "!apt-get install swig cmake ffmpeg" + ], + "metadata": { + "id": "8_MllY6Om1eI" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "4S9mJiKg6SqC" + }, + "source": [ + "To use Atari games in Gymnasium, we need to install the `atari` extra, plus `accept-rom-license` to download the ROM files (the game files)." 
+ ] + }, + { + "cell_type": "code", + "source": [ + "!pip install gymnasium[atari]\n", + "!pip install gymnasium[accept-rom-license]" + ], + "metadata": { + "id": "NsRP-lX1_2fC" + }, + "execution_count": null, + "outputs": [] + }, { "cell_type": "markdown", "source": [ @@ -201,29 +282,6 @@ "!pip3 install pyvirtualdisplay" ] }, - { - "cell_type": "code", - "source": [ - "# Additional dependencies for RL Baselines3 Zoo\n", - "!apt-get install swig cmake freeglut3-dev " - ], - "metadata": { - "id": "fWyKJCy_NJBX" - }, - "execution_count": null, - "outputs": [] - }, - { - "cell_type": "code", - "source": [ - "!pip install pyglet==1.5.1" - ], - "metadata": { - "id": "C5LwHrISW7Q5" - }, - "execution_count": null, - "outputs": [] - }, { "cell_type": "code", "source": [ @@ -234,68 +292,11 @@ "virtual_display.start()" ], "metadata": { - "id": "ww5PQH1gNLI4" + "id": "BE5JWP5rQIKf" }, "execution_count": null, "outputs": [] }, - { - "cell_type": "markdown", - "metadata": { - "id": "mYIMvl5X9NAu" - }, - "source": [ - "## Clone RL-Baselines3 Zoo Repo 📚\n", - "You can now directly install from python package `pip install rl_zoo3` but since we want **the full installation with extra environments and dependencies** we're going to clone `RL-Baselines3-Zoo` repository and install from source." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "eu5ZDPZ09VNQ" - }, - "outputs": [], - "source": [ - "!git clone https://github.com/DLR-RM/rl-baselines3-zoo" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "HCIoSbvbfAQh" - }, - "source": [ - "## Install dependencies 🔽\n", - "We can now install the dependencies RL-Baselines3 Zoo needs (this can take 5min ⏲)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "s2QsFAk29h-D" - }, - "outputs": [], - "source": [ - "%cd /content/rl-baselines3-zoo/ \n", - "!git checkout v1.8.0" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "3QaOS7Xj9j1s" - }, - "outputs": [], - "source": [ - "!pip install setuptools==65.5.0\n", - "!pip install -r requirements.txt\n", - "# Since colab uses Python 3.9 we need to add this installation\n", - "!pip install gym[atari,accept-rom-license]==0.21.0" - ] - }, { "cell_type": "markdown", "metadata": { @@ -305,9 +306,31 @@ "## Train our Deep Q-Learning Agent to Play Space Invaders 👾\n", "\n", "To train an agent with RL-Baselines3-Zoo, we just need to do two things:\n", - "1. We define the hyperparameters in `/content/rl-baselines3-zoo/hyperparams/dqn.yml`\n", "\n", - "\"DQN\n" + "1. 
Create a hyperparameter config file called `dqn.yml` that will contain our training hyperparameters.\n", + "\n", + "This is a template example:\n", + "\n", + "```\n", + "SpaceInvadersNoFrameskip-v4:\n", + " env_wrapper:\n", + " - stable_baselines3.common.atari_wrappers.AtariWrapper\n", + " frame_stack: 4\n", + " policy: 'CnnPolicy'\n", + " n_timesteps: !!float 1e7\n", + " buffer_size: 100000\n", + " learning_rate: !!float 1e-4\n", + " batch_size: 32\n", + " learning_starts: 100000\n", + " target_update_interval: 1000\n", + " train_freq: 4\n", + " gradient_steps: 1\n", + " exploration_fraction: 0.1\n", + " exploration_final_eps: 0.01\n", + " # If True, you need to deactivate handle_timeout_termination\n", + " # in the replay_buffer_kwargs\n", + " optimize_memory_usage: False\n", + "```" ] }, { @@ -346,7 +369,9 @@ "id": "Hn8bRTHvERRL" }, "source": [ - "2. We run `train.py` and save the models on `logs` folder 📁" + "2. We start the training and save the models in the `logs` folder 📁\n", + "\n", + "- Specify the algorithm after `--algo`, the folder where the model is saved after `-f`, and the path to the hyperparameter config after `-c`." 
] }, { @@ -357,7 +382,7 @@ }, "outputs": [], "source": [ - "!python train.py --algo ________ --env SpaceInvadersNoFrameskip-v4 -f _________" + "!python -m rl_zoo3.train --algo ________ --env SpaceInvadersNoFrameskip-v4 -f _________ -c _________" ] }, { @@ -377,7 +402,7 @@ }, "outputs": [], "source": [ - "!python train.py --algo dqn --env SpaceInvadersNoFrameskip-v4 -f logs/" + "!python -m rl_zoo3.train --algo dqn --env SpaceInvadersNoFrameskip-v4 -f logs/ -c dqn.yml" ] }, { @@ -399,7 +424,7 @@ }, "outputs": [], "source": [ - "!python enjoy.py --algo dqn --env SpaceInvadersNoFrameskip-v4 --no-render --n-timesteps _________ --folder logs/" + "!python -m rl_zoo3.enjoy --algo dqn --env SpaceInvadersNoFrameskip-v4 --no-render --n-timesteps _________ --folder logs/ " ] }, { @@ -419,7 +444,7 @@ }, "outputs": [], "source": [ - "!python enjoy.py --algo dqn --env SpaceInvadersNoFrameskip-v4 --no-render --n-timesteps 5000 --folder logs/" + "!python -m rl_zoo3.enjoy --algo dqn --env SpaceInvadersNoFrameskip-v4 --no-render --n-timesteps 5000 --folder logs/" ] }, { @@ -440,7 +465,7 @@ "id": "ezbHS1q3HYVV" }, "source": [ - "By using `rl_zoo3.push_to_hub.py` **you evaluate, record a replay, generate a model card of your agent and push it to the hub**.\n", + "By using `rl_zoo3.push_to_hub` **you evaluate, record a replay, generate a model card of your agent and push it to the hub**.\n", "\n", "This way:\n", "- You can **showcase our work** 🔥\n", @@ -518,6 +543,8 @@ "\n", "`-orga`: Your Hugging Face username\n", "\n", + "`-f`: Where the trained model folder is (in our case `logs`)\n", + "\n", "\"Select" ] }, @@ -649,7 +676,7 @@ }, "outputs": [], "source": [ - "!python enjoy.py --algo dqn --env BeamRiderNoFrameskip-v4 -n 5000 -f rl_trained/" + "!python -m rl_zoo3.enjoy --algo dqn --env BeamRiderNoFrameskip-v4 -n 5000 -f rl_trained/ --no-render" ] }, { @@ -803,4 +830,4 @@ }, "nbformat": 4, "nbformat_minor": 0 -} +} \ No newline at end of file
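
The diff above replaces the cloned repo's `hyperparams/dqn.yml` with a config file the reader creates and passes via `-c dqn.yml`, but it does not show a cell that writes that file. A minimal sketch of one way to do it from Python follows. The hyperparameter values are copied from the template in the diff; the two-space indentation, the output location (the current working directory), and the helper name `write_dqn_config` are assumptions made for illustration, not part of the notebook.

```python
from pathlib import Path

# Hyperparameters copied from the template shown in the diff.
# The two-space indentation is an assumption: the diff text has its
# original whitespace collapsed.
DQN_CONFIG = """\
SpaceInvadersNoFrameskip-v4:
  env_wrapper:
    - stable_baselines3.common.atari_wrappers.AtariWrapper
  frame_stack: 4
  policy: 'CnnPolicy'
  n_timesteps: !!float 1e7
  buffer_size: 100000
  learning_rate: !!float 1e-4
  batch_size: 32
  learning_starts: 100000
  target_update_interval: 1000
  train_freq: 4
  gradient_steps: 1
  exploration_fraction: 0.1
  exploration_final_eps: 0.01
  # If True, you need to deactivate handle_timeout_termination
  # in the replay_buffer_kwargs
  optimize_memory_usage: False
"""


def write_dqn_config(path="dqn.yml"):
    """Write the template to `path` and return its absolute path.

    Hypothetical helper for illustration; it is not part of rl_zoo3.
    """
    target = Path(path)
    target.write_text(DQN_CONFIG, encoding="utf-8")
    return target.resolve()
```

Calling `write_dqn_config()` in a cell before the training command would leave `dqn.yml` in the working directory, which is where `-c dqn.yml` in the solution cell expects to find it.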