{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "view-in-github", "colab_type": "text" }, "source": [ "\"Open" ] }, { "cell_type": "markdown", "metadata": { "id": "2D3NL_e4crQv" }, "source": [ "# Unit 5: An Introduction to ML-Agents\n", "\n" ] }, { "cell_type": "markdown", "source": [ "\"Thumbnail\"/\n", "\n", "In this notebook, you'll learn about ML-Agents and train two agents.\n", "\n", "- The first one will learn to **shoot snowballs onto spawning targets**.\n", "- The second needs to press a button to spawn a pyramid, then navigate to the pyramid, knock it over, **and move to the gold brick at the top**. To do that, it will need to explore its environment, and we will use a technique called curiosity.\n", "\n", "After that, you'll be able **to watch your agents playing directly in your browser**.\n", "\n", "For more information about the certification process, check this section 👉 https://huggingface.co/deep-rl-course/en/unit0/introduction#certification-process" ], "metadata": { "id": "97ZiytXEgqIz" } }, { "cell_type": "markdown", "source": [ "⬇️ Here is an example of what **you will achieve at the end of this unit.** ⬇️\n" ], "metadata": { "id": "FMYrDriDujzX" } }, { "cell_type": "markdown", "source": [ "\"Pyramids\"/\n", "\n", "\"SnowballTarget\"/" ], "metadata": { "id": "cBmFlh8suma-" } }, { "cell_type": "markdown", "source": [ "### 🎮 Environments: \n", "\n", "- [Pyramids](https://github.com/Unity-Technologies/ml-agents/blob/main/docs/Learning-Environment-Examples.md#pyramids)\n", "- SnowballTarget\n", "\n", "### 📚 RL-Library: \n", "\n", "- [ML-Agents (HuggingFace Experimental Version)](https://github.com/huggingface/ml-agents)\n", "\n", "⚠ We're going to use an experimental version of ML-Agents where you can push Unity ML-Agents models to the Hub and load them from the Hub, so **you need to install the same version**." ], "metadata": { "id": "A-cYE0K5iL-w" } }, { "cell_type": "markdown", "source": [ "We're constantly trying to improve our 
tutorials, so **if you find some issues in this notebook**, please [open an issue on the GitHub Repo](https://github.com/huggingface/deep-rl-class/issues)." ], "metadata": { "id": "qEhtaFh9i31S" } }, { "cell_type": "markdown", "source": [ "## Objectives of this notebook 🏆\n", "\n", "At the end of the notebook, you will:\n", "\n", "- Understand how **ML-Agents**, the environment library, works.\n", "- Be able to **train agents in Unity Environments**.\n" ], "metadata": { "id": "j7f63r3Yi5vE" } }, { "cell_type": "markdown", "source": [ "## This notebook is from the Deep Reinforcement Learning Course\n", "\"Deep" ], "metadata": { "id": "viNzVbVaYvY3" } }, { "cell_type": "markdown", "metadata": { "id": "6p5HnEefISCB" }, "source": [ "In this free course, you will:\n", "\n", "- 📖 Study Deep Reinforcement Learning in **theory and practice**.\n", "- 🧑‍💻 Learn to **use famous Deep RL libraries** such as Stable Baselines3, RL Baselines3 Zoo, CleanRL and Sample Factory 2.0.\n", "- 🤖 Train **agents in unique environments** \n", "\n", "And more: check 📚 the syllabus 👉 https://huggingface.co/deep-rl-course/communication/publishing-schedule\n", "\n", "Don’t forget to **sign up for the course** (we are collecting your email to be able to **send you the links when each Unit is published and give you information about the challenges and updates).**\n", "\n", "\n", "The best way to keep in touch is to join our Discord server to exchange with the community and with us 👉🏻 https://discord.gg/ydHrjt3WP5" ] }, { "cell_type": "markdown", "metadata": { "id": "Y-mo_6rXIjRi" }, "source": [ "## Prerequisites 🏗️\n", "Before diving into the notebook, you need to:\n", "\n", "🔲 📚 **Study [what is ML-Agents and how it works by reading Unit 5](https://huggingface.co/deep-rl-course/unit5/introduction)** 🤗 " ] }, { "cell_type": "markdown", "source": [ "# Let's train our agents 🚀\n", "\n", "The ML-Agents integration on the Hub is **still experimental**, 
some features will be added in the future. \n", "\n", "But for now, **to validate this hands-on for the certification process, you just need to push your trained models to the Hub**. There are no minimum results to attain to validate this one. But if you want to get nice results, you can try to reach:\n", "\n", "- For `Pyramids` : Mean Reward = 1.75\n", "- For `SnowballTarget` : Mean Reward = 15 or 30 targets hit in an episode.\n" ], "metadata": { "id": "xYO1uD5Ujgdh" } }, { "cell_type": "markdown", "source": [ "## Set the GPU 💪\n", "- To **accelerate the agent's training, we'll use a GPU**. To do that, go to `Runtime > Change Runtime type`\n", "\n", "\"GPU" ], "metadata": { "id": "DssdIjk_8vZE" } }, { "cell_type": "markdown", "source": [ "- `Hardware Accelerator > GPU`\n", "\n", "\"GPU" ], "metadata": { "id": "sTfCXHy68xBv" } }, { "cell_type": "markdown", "metadata": { "id": "an3ByrXYQ4iK" }, "source": [ "## Clone the repository and install the dependencies 🔽\n", "- We need to clone the repository that **contains the experimental version of the library that allows you to push your trained agent to the Hub.**" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "6WNoL04M7rTa" }, "outputs": [], "source": [ "%%capture\n", "# Clone the repository\n", "!git clone --depth 1 https://github.com/huggingface/ml-agents/ " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "d8wmVcMk7xKo" }, "outputs": [], "source": [ "%%capture\n", "# Go inside the repository and install the package\n", "%cd ml-agents\n", "!pip3 install -e ./ml-agents-envs\n", "!pip3 install -e ./ml-agents" ] }, { "cell_type": "markdown", "source": [ "## SnowballTarget ⛄\n", "\n", "If you need a refresher on how this environment works, check this section 👉\n", "https://huggingface.co/deep-rl-course/unit5/snowball-target" ], "metadata": { "id": "R5_7Ptd_kEcG" } }, { "cell_type": "markdown", "metadata": { "id": "HRY5ufKUKfhI" }, "source": [ "### Download and move 
the environment zip file to `./training-envs-executables/linux/`\n", "- Our environment executable is in a zip file.\n", "- We need to download it and place it in `./training-envs-executables/linux/`\n", "- We use a Linux executable because we use Colab, and Colab machines run Ubuntu (Linux)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "C9Ls6_6eOKiA" }, "outputs": [], "source": [ "# Here, we create training-envs-executables and linux\n", "!mkdir ./training-envs-executables\n", "!mkdir ./training-envs-executables/linux" ] }, { "cell_type": "markdown", "metadata": { "id": "jsoZGxr1MIXY" }, "source": [ "Download the file SnowballTarget.zip from https://drive.google.com/file/d/1YHHLjyj6gaZ3Gemx1hQgqrPgSS2ZhmB5 using `wget`. \n", "\n", "Check out the full solution to download large files from GDrive [here](https://bcrf.biochem.wisc.edu/2021/02/05/download-google-drive-files-using-wget/)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "QU6gi8CmWhnA" }, "outputs": [], "source": [ "!wget --load-cookies /tmp/cookies.txt \"https://docs.google.com/uc?export=download&confirm=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate 'https://docs.google.com/uc?export=download&id=1YHHLjyj6gaZ3Gemx1hQgqrPgSS2ZhmB5' -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\\1\\n/p')&id=1YHHLjyj6gaZ3Gemx1hQgqrPgSS2ZhmB5\" -O ./training-envs-executables/linux/SnowballTarget.zip && rm -rf /tmp/cookies.txt" ] }, { "cell_type": "markdown", "source": [ "We unzip the `SnowballTarget.zip` file" ], "metadata": { "id": "_LLVaEEK3ayi" } }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "8FPx0an9IAwO" }, "outputs": [], "source": [ "%%capture\n", "!unzip -d ./training-envs-executables/linux/ ./training-envs-executables/linux/SnowballTarget.zip" ] }, { "cell_type": "markdown", "metadata": { "id": "nyumV5XfPKzu" }, "source": [ "Make sure your file is accessible " ] }, { "cell_type": "code", 
"execution_count": null, "metadata": { "id": "EdFsLJ11JvQf" }, "outputs": [], "source": [ "!chmod -R 755 ./training-envs-executables/linux/SnowballTarget" ] }, { "cell_type": "markdown", "source": [ "### Define the SnowballTarget config file\n", "- In ML-Agents, you define the **training hyperparameters in config.yaml files.**\n", "\n", "There are multiple hyperparameters. To understand them better, read the explanation of each one in [the documentation](https://github.com/Unity-Technologies/ml-agents/blob/release_20_docs/docs/Training-Configuration-File.md)\n", "\n", "\n", "So you need to create a `SnowballTarget.yaml` config file in ./content/ml-agents/config/ppo/\n", "\n", "We'll give you here a first version of this config (to copy and paste into your `SnowballTarget.yaml` file), **but you should modify it**.\n", "\n", "```\n", "behaviors:\n", "  SnowballTarget:\n", "    trainer_type: ppo\n", "    summary_freq: 10000\n", "    keep_checkpoints: 10\n", "    checkpoint_interval: 50000\n", "    max_steps: 200000\n", "    time_horizon: 64\n", "    threaded: true\n", "    hyperparameters:\n", "      learning_rate: 0.0003\n", "      learning_rate_schedule: linear\n", "      batch_size: 128\n", "      buffer_size: 2048\n", "      beta: 0.005\n", "      epsilon: 0.2\n", "      lambd: 0.95\n", "      num_epoch: 3\n", "    network_settings:\n", "      normalize: false\n", "      hidden_units: 256\n", "      num_layers: 2\n", "      vis_encode_type: simple\n", "    reward_signals:\n", "      extrinsic:\n", "        gamma: 0.99\n", "        strength: 1.0\n", "```" ], "metadata": { "id": "NAuEq32Mwvtz" } }, { "cell_type": "markdown", "source": [ "\"Config\n", "\"Config" ], "metadata": { "id": "4U3sRH4N4h_l" } }, { "cell_type": "markdown", "source": [ "As an experiment, you should also try to modify some other hyperparameters. 
Unity provides [very good documentation explaining each of them here](https://github.com/Unity-Technologies/ml-agents/blob/main/docs/Training-Configuration-File.md).\n", "\n", "Now that you've created the config file and understand what most hyperparameters do, we're ready to train our agent 🔥." ], "metadata": { "id": "JJJdo_5AyoGo" } }, { "cell_type": "markdown", "metadata": { "id": "f9fI555bO12v" }, "source": [ "### Train the agent\n", "\n", "To train our agent, we just need to **launch mlagents-learn and select the executable containing the environment.**\n", "\n", "We define four parameters:\n", "\n", "1. `mlagents-learn <config>`: the path where the hyperparameter config file is.\n", "2. `--env`: where the environment executable is.\n", "3. `--run-id`: the name you want to give to your training run id.\n", "4. `--no-graphics`: to not launch the visualization during the training.\n", "\n", "\"MlAgents\n", "\n", "Train the model and use the `--resume` flag to continue training in case of interruption. \n", "\n", "> It will fail the first time if you use `--resume`; try running the block again to bypass the error. \n", "\n" ] }, { "cell_type": "markdown", "source": [ "The training will take 10 to 35 min depending on your config, go take a ☕️, you deserve it 🤗." 
], "metadata": { "id": "lN32oWF8zPjs" } }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "bS-Yh1UdHfzy" }, "outputs": [], "source": [ "!mlagents-learn ./config/ppo/SnowballTarget.yaml --env=./training-envs-executables/linux/SnowballTarget/SnowballTarget --run-id=\"SnowballTarget1\" --no-graphics" ] }, { "cell_type": "markdown", "metadata": { "id": "5Vue94AzPy1t" }, "source": [ "### Push the agent to the 🤗 Hub\n", "\n", "- Now that we've trained our agent, we’re **ready to push it to the Hub to be able to visualize it playing in your browser 🔥.**" ] }, { "cell_type": "markdown", "source": [ "To be able to share your model with the community there are three more steps to follow:\n", "\n", "1️⃣ (If it's not already done) create an account on HF ➡ https://huggingface.co/join\n", "\n", "2️⃣ Sign in and then, you need to store your authentication token from the Hugging Face website.\n", "- Create a new token (https://huggingface.co/settings/tokens) **with write role**\n", "\n", "\"Create\n", "\n", "- Copy the token \n", "- Run the cell below and paste the token" ], "metadata": { "id": "izT6FpgNzZ6R" } }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "rKt2vsYoK56o" }, "outputs": [], "source": [ "from huggingface_hub import notebook_login\n", "notebook_login()" ] }, { "cell_type": "markdown", "source": [ "If you don't want to use a Google Colab or a Jupyter Notebook, you need to use this command instead: `huggingface-cli login`" ], "metadata": { "id": "aSU9qD9_6dem" } }, { "cell_type": "markdown", "source": [ "Then, we simply need to run `mlagents-push-to-hf`.\n", "\n", "And we define 4 parameters:\n", "\n", "1. `--run-id`: the name of the training run id.\n", "2. `--local-dir`: where the agent was saved, it’s `results/<run_id name>`, so in my case `results/First Training`.\n", "3. `--repo-id`: the name of the Hugging Face repo you want to create or update. 
It’s always `<your huggingface username>/<the repo name>`\n", "If the repo does not exist **it will be created automatically**\n", "4. `--commit-message`: since HF repos are git repositories, you need to define a commit message.\n", "\n", "\"Push\n", "\n", "For instance:\n", "\n", "`!mlagents-push-to-hf --run-id=\"SnowballTarget1\" --local-dir=\"./results/SnowballTarget1\" --repo-id=\"ThomasSimonini/ppo-SnowballTarget\" --commit-message=\"First Push\"`" ], "metadata": { "id": "KK4fPfnczunT" } }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "dGEFAIboLVc6" }, "outputs": [], "source": [ "!mlagents-push-to-hf --run-id= # Add your run id --local-dir= # Your local dir --repo-id= # Your repo id --commit-message= # Your commit message" ] }, { "cell_type": "markdown", "source": [ "If everything worked, you should have this at the end of the process (but with a different URL 😆):\n", "\n", "\n", "\n", "```\n", "Your model is pushed to the hub. You can view your model here: https://huggingface.co/ThomasSimonini/ppo-SnowballTarget\n", "```\n", "\n", "It’s the link to your model. It contains a model card that explains how to use it, your TensorBoard logs and your config file. **What’s awesome is that it’s a git repository, which means you can have different commits, update your repository with a new push, etc.**" ], "metadata": { "id": "yborB0850FTM" } }, { "cell_type": "markdown", "source": [ "But now comes the best part: **being able to visualize your agent online 👀.**" ], "metadata": { "id": "5Uaon2cg0NrL" } }, { "cell_type": "markdown", "source": [ "### Watch your agent playing 👀\n", "\n", "For this step it’s simple:\n", "\n", "1. Remember your repo-id\n", "\n", "2. Go here: https://singularite.itch.io/snowballtarget\n", "\n", "3. Launch the game and put it in full screen by clicking on the bottom right button\n", "\n", "\"Snowballtarget" ], "metadata": { "id": "VMc4oOsE0QiZ" } }, { "cell_type": "markdown", "source": [ "1. 
In step 1, choose your model repository which is the model id (in my case ThomasSimonini/ppo-SnowballTarget).\n", "\n", "2. In step 2, **choose what model you want to replay**:\n", " - I have multiple ones, since we saved a model every 50000 timesteps. \n", " - But if I want the most recent one, I choose `SnowballTarget.onnx`\n", "\n", "👉 What’s nice **is to try different model checkpoints to see how the agent improves.**\n", "\n", "And don't hesitate to share the best score your agent gets on Discord in the #rl-i-made-this channel 🔥\n", "\n", "Let's now try a harder environment called Pyramids..." ], "metadata": { "id": "Djs8c5rR0Z8a" } }, { "cell_type": "markdown", "source": [ "## Pyramids 🏆\n", "\n", "### Download and move the environment zip file to `./training-envs-executables/linux/`\n", "- Our environment executable is in a zip file.\n", "- We need to download it and place it in `./training-envs-executables/linux/`\n", "- We use a Linux executable because we use Colab, and Colab machines run Ubuntu (Linux)" ], "metadata": { "id": "rVMwRi4y_tmx" } }, { "cell_type": "markdown", "metadata": { "id": "NyqYYkLyAVMK" }, "source": [ "Download the file Pyramids.zip from https://drive.google.com/uc?export=download&id=1UiFNdKlsH0NTu32xV-giYUEVKV4-vc7H using `wget`. 
Check out the full solution to download large files from GDrive [here](https://bcrf.biochem.wisc.edu/2021/02/05/download-google-drive-files-using-wget/)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "AxojCsSVAVMP" }, "outputs": [], "source": [ "!wget --load-cookies /tmp/cookies.txt \"https://docs.google.com/uc?export=download&confirm=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate 'https://docs.google.com/uc?export=download&id=1UiFNdKlsH0NTu32xV-giYUEVKV4-vc7H' -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\\1\\n/p')&id=1UiFNdKlsH0NTu32xV-giYUEVKV4-vc7H\" -O ./training-envs-executables/linux/Pyramids.zip && rm -rf /tmp/cookies.txt" ] }, { "cell_type": "markdown", "metadata": { "id": "bfs6CTJ1AVMP" }, "source": [ "**OR** download it directly to your local machine and then drag and drop the file from your local machine to `./training-envs-executables/linux`" ] }, { "cell_type": "markdown", "metadata": { "id": "H7JmgOwcSSmF" }, "source": [ "Wait for the upload to finish and then run the command below. 
\n", "\n", "![image.png](data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAASYAAAAfCAYAAABKxmALAAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAAAJcEhZcwAADsMAAA7DAcdvqGQAAAmZSURBVHhe7d0NTNTnHQfwL+rxcigHFHUH9LCCVuaKCWgB4029Gq7NcBuLPdPKEhh6iRhEmauLY2V0ptEUSZGwTJTA5jSBOnGDmpyxEMUqxEH0rOIQfDmtt/oCnsof9BT2PPAcx8nbeTQi9vcxF56X+///h8n98vs9/xfcYmJiukEIIS+RCeInIYS8NChjIoS4LCwsDMHBwfDx8cEc76u4dNcT5y63orm5WbzDNRSYCCHPhQejuLg4xMbGwtvbW4wCymNLRAuQFAtwpW0Sio+34+uzN8So8ygwEUKc4uXlheTkZCxbtkyMOOofmGy63P1R37EAm4u+QUdHhxgdGa0xEUJGxLOknJycQYPSvXv3cPnyZTx0n4Fut4litNeEx61YMNGAQxun9uzDWZQxEUKGxQNKVlaWQ9nGg9Hhw4dx8uRJ3Lx5U4wCQYouvD3bB+qfTEWsT70Ytfv1l287tf5EgYkQMiRevvFMSalUihHAYDCgpKQEjx49EiODWzwL+MN73Zji1iZGgO/8l+ODbQ0jlnXjtJQLheZ9HTQzRXcAHbKK8pEeK7rfi5GOScirh68p9Q9K+/fvx65du0YMStyxS0DcTjecuh8lRoDprRXY8pvFoje0IQJTNNLzilBU5PjK35qO+Dly8Z6xFAF1nBZaTbjovwhjcUxCxg4v4fqvKfFM6cCBA6LnvIziK7jdFSh6wBKfE4iaGyJ6gxs2Y5LOFiIlJYW90pC5rQT1CEfCulRoxzw2lSN7TQo272kU/RdhLI5JyNjhlwTY8DUlXr65avMBq2gBkyQTUjUeoje4IdaYeMakR1hLIdJ21okxJigJ2z+JgeVQBpoj86GVVSEzcx/M/ecqMvCVajv0Ac1okEcg0s8EQ0o2KiNX4XeJaqgUMqDLCkuLAQXbytHCt41NR/7qqWhmhwpboISchUvJXIeyf9yGWh+PUAV7T5cFLV8W4NNDfAteqmmBIynILuU7CEXC79chfhZ7Y5cE87lrkM0Lwc09acg7xablaui36BCt7I2okrkB5TsKUGUvfXutzEJRnEp07Ew9x3E8ZvT6/J7fsW5CWO9+2XFNJ4rx2d8aIIntCBnP9u7d27fgzUs4V7Kl/nKT3+hbEO+Y8hY0W7/taQ/GxTUmCZUXTYAyDGqRPckXhSDAakLjEfG1DAqHsqkMBTuKYUA8Nug1UJjKkJmWgrQdNbDMiEdyor12ZTtDiKIWxR9nILOYfbmV0Uj6rRqPq/OQsSkbZedZ+Hl3JdvTQBF6PeJnSKgpzkTGx8Wo9QhEgJjj4tevQrS8Gfsy05DGAmmzPBK61AQMSPxKs0WGyF+ZMFxlUd5iRFWFmH9WUAh86oqRuSkTJSctUP40GalLxRwh4xgv4/qfheNn30ar5pvbogV4Si3DXj7gfGDyi0D8h1Es+JjR3MBCU4URLVYVwpfzr7cc2jAlrNcbYbClC7dqUVBoQMMFEyzyepTu/BSfFVbBzOali/vQyNIs5Rtq8WbOjPqCSjSYLTCzzMPIg6m5Hn9hx7G0sayr7hokmRKhAxa0o7F0bgAsZ8tQcsIMC8uGKgvqe7M4Qe4hg9V8EbXs4JK5CmUHK1Fz8Q54IjYU5ft6aFiwMx7cjZqhUqBv+edrgLnNzIJiGc61yRE2n2VVhIxz/DYTG17G9b8kwFXHL9wXLVaqPZUQPWuy6A00bGCSz9PbF79z0pGgsqCh9HOU8aAhGdB43cqSJi0LSxqEv85KnvMGexnTyUoq0WTRAHemxEC/9a99+9Pyisnh6FZY+zaWYH3Kfjy1OlEWhcB3shV3bhhFn5HYvkSTq2lohHW2Dvm5W7GFZU8Rlirs+
2eNQ/ByEKSD/h0VpLPl2H1imE/g8PmMMN6QIJMPF+4IGR/4vW82ra2tojU6d9snoHPSNNFjFZB/p2gNNGxgsi9+i1daJgqqLbZZGM6zcu71cGji5kCFfmXcs9gXfUMyL+XKkb2pd18GtumLYq7IQdpHeSg7ZcLjaVHQbdiO7aujxeyzQqFbrYFKMqJ8T40TgdFOLpOJFiFkMN39Qk433ERrIOdLuUFIh4wsHKmgficE6F/GPUulRIDMhNrPDTD1LDjLIXO8cn0UruHeQxkCgiNEn5HLYA8REdB+uAraYCMMXxQiJzMDef+REPBWDCsCBwpdmQiNSkLDF3lDl3CDUkLpz0pGyRa4CRm/7t+3l13+/v6iNTp+8i54Pfmf6AHNrZ6iNdCoAhNQCeN1ICBADnNzvzLuWZYONqdE1Go1VDNYIEveAnWQmBu1OlSfvwPFPB2SFimhUEZCtzGGHc0uZJ4GCSuTEO3HOn4RiORn0To7wP/r5ZEJSErUsDyJmalD4lJWwjWUooCfzRuJKgablkdAwf5FJOoRM01C4wmDmCRk/Lpxw/5EAF9fXwQG2q9DctWicPsyR/ckb5y+9ED0BhplYGKhiZdzvIyrGCa9uFCIwgoTZNFJyPpjFnSht3GOn/HylDsEEFcZCwtReVXOAt5W5H6SjIgH13BHzPG1n8JdZWhEFPQ5vWtlUR7NqNxVyMZY0IqMgXqxmuVVzIJwqFiqpYjst7bGXvnrhyj7bt2ENXoNcotykb44AOavirHbmYBGyEuO38/W3t4uesDChQtFy3VLIuznyjvks4e9Z27U98r1XM/jV4O07LLnWo8Z72y/dwr7vQl5FaWmpvZd+c3PzK1du9apW1EGMz/EDfm/vCt6LFfpikFKfpPoDeRyxiRXhiJ8/iq8+2MZWs5W/qCCEiE/BEeOHBGt3nIuKSlJ9J7fn39hX+jukilQMsICrsuBac6v1mHTWg0CrhpQeojCEiGvGl5qHT16VPQArVaLFStWiJ7z8lOC4etmX1w50zkfNWeGf6olPfaEEDKk0Tz2JCRgEnI/8EbghCtiBHjo9SZ+nvdwxMeeTAwODv6TaBNCiIMnT56gqampZ/Hb3d29Z8z21AGZTNaz9vTggf3s2mSPbqjn+kH/Xgg2xH4HRbf98oAnMn+sORiIW7duiZGhUcZECBkRD0YbN250yJxseHDiV4er7v8LP5K+hsxqL9tszK+twEd/Nzn911MoYyKEjIgHnurqaigUCsyc6fi0RE9PT/j5+UHx6L/wanW82bfL/TUcf/wzrN95yqlMyYYCEyHEKbysO336NOrr6+Hm5obp06f3lXech+UM3O+dwVOPqZB8ImFsfxMFx4A9/zb2bPs8qJQjhLiMl3i2P3ip9j2DpjZv1F7qxLmL9gVvV1BgIoS8dEZ9SwohhHzfKDARQl4ywP8B/eN9dc0U7ocAAAAASUVORK5CYII=)" ] }, { "cell_type": "markdown", "source": [ "Unzip it" ], "metadata": { "id": "iWUUcs0_794U" } }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "i2E3K4V2AVMP" }, "outputs": [], "source": [ "%%capture\n", "!unzip -d ./training-envs-executables/linux/ ./training-envs-executables/linux/Pyramids.zip" ] }, { "cell_type": "markdown", "metadata": { "id": "KmKYBgHTAVMP" }, "source": [ "Make sure your file is accessible " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "Im-nwvLPAVMP" }, 
"outputs": [], "source": [ "!chmod -R 755 ./training-envs-executables/linux/Pyramids/Pyramids" ] }, { "cell_type": "markdown", "source": [ "### Modify the PyramidsRND config file\n", "- Unlike the first environment, which was a custom one, **Pyramids was made by the Unity team**.\n", "- So the PyramidsRND config file already exists and is in ./content/ml-agents/config/ppo/PyramidsRND.yaml\n", "- You might ask why \"RND\" in PyramidsRND. RND stands for *random network distillation*, a way to generate curiosity rewards. If you want to know more about it, we wrote an article explaining this technique: https://medium.com/data-from-the-trenches/curiosity-driven-learning-through-random-network-distillation-488ffd8e5938\n", "\n", "For this training, we’ll modify one thing:\n", "- The total training steps hyperparameter is too high since we can hit the benchmark (mean reward = 1.75) in only 1M training steps.\n", "👉 To do that, we go to config/ppo/PyramidsRND.yaml, **and change `max_steps` to 1000000.**\n", "\n", "\"Pyramids" ], "metadata": { "id": "fqceIATXAgih" } }, { "cell_type": "markdown", "source": [ "As an experiment, you should also try to modify some other hyperparameters. Unity provides [very good documentation explaining each of them here](https://github.com/Unity-Technologies/ml-agents/blob/main/docs/Training-Configuration-File.md).\n", "\n", "We’re now ready to train our agent 🔥." ], "metadata": { "id": "RI-5aPL7BWVk" } }, { "cell_type": "markdown", "source": [ "### Train the agent\n", "\n", "The training will take 30 to 45 min depending on your machine, go take a ☕️, you deserve it 🤗." 
], "metadata": { "id": "s5hr1rvIBdZH" } }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "fXi4-IaHBhqD" }, "outputs": [], "source": [ "!mlagents-learn ./config/ppo/PyramidsRND.yaml --env=./training-envs-executables/linux/Pyramids/Pyramids --run-id=\"Pyramids Training\" --no-graphics" ] }, { "cell_type": "markdown", "metadata": { "id": "txonKxuSByut" }, "source": [ "### Push the agent to the 🤗 Hub\n", "\n", "- Now that we've trained our agent, we’re **ready to push it to the Hub to be able to visualize it playing in your browser 🔥.**" ] }, { "cell_type": "code", "source": [], "metadata": { "id": "JZ53caJ99sX_" }, "execution_count": null, "outputs": [] }, { "cell_type": "code", "source": [ "!mlagents-push-to-hf --run-id= # Add your run id --local-dir= # Your local dir --repo-id= # Your repo id --commit-message= # Your commit message" ], "metadata": { "id": "yiEQbv7rB4mU" }, "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "source": [ "### Watch your agent playing 👀\n", "\n", "The temporary link for the Pyramids demo is: https://singularite.itch.io/pyramids" ], "metadata": { "id": "7aZfgxo-CDeQ" } }, { "cell_type": "markdown", "source": [ "### 🎁 Bonus: Why not train on another environment?\n", "Now that you know how to train an agent using ML-Agents, **why not try another environment?** \n", "\n", "ML-Agents provides 18 different environments, and we’re building some custom ones. 
The best way to learn is to try things on your own. Have fun!\n", "\n" ], "metadata": { "id": "hGG_oq2n0wjB" } }, { "cell_type": "markdown", "source": [ "![cover](https://miro.medium.com/max/1400/0*xERdThTRRM2k_U9f.png)" ], "metadata": { "id": "KSAkJxSr0z6-" } }, { "cell_type": "markdown", "source": [ "You have the full list of the ones currently available on Hugging Face here 👉 https://github.com/huggingface/ml-agents#the-environments\n", "\n", "For the demos to visualize your agent, the temporary link is: https://singularite.itch.io (temporary because we'll also put the demos on Hugging Face Spaces)\n", "\n", "For now we have integrated: \n", "- [Worm](https://singularite.itch.io/worm) demo where you teach a **worm to crawl**.\n", "- [Walker](https://singularite.itch.io/walker) demo where you teach an agent **to walk towards a goal**.\n", "\n", "If you want new demos to be added, please open an issue: https://github.com/huggingface/deep-rl-class 🤗" ], "metadata": { "id": "YiyF4FX-04JB" } }, { "cell_type": "markdown", "source": [ "That’s all for today. Congrats on finishing this tutorial!\n", "\n", "The best way to learn is to practice and try stuff. Why not try another environment? ML-Agents has 18 different environments, but you can also create your own. Check the documentation and have fun!\n", "\n", "See you in Unit 6 🔥,\n", "\n", "## Keep Learning, Stay awesome 🤗" ], "metadata": { "id": "PI6dPWmh064H" } } ], "metadata": { "accelerator": "GPU", "colab": { "provenance": [], "private_outputs": true, "include_colab_link": true }, "gpuClass": "standard", "kernelspec": { "display_name": "Python 3", "name": "python3" }, "language_info": { "name": "python" } }, "nbformat": 4, "nbformat_minor": 0 }