From 5c910ecb27189e6ba32de34474f7367c3e2ab670 Mon Sep 17 00:00:00 2001
From: Thomas Simonini
Date: Sat, 5 Aug 2023 15:52:24 +0200
Subject: [PATCH] Update Observation Space

---
 notebooks/unit1/unit1.ipynb | 40 ++++++++++++++++++++--------------------
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/notebooks/unit1/unit1.ipynb b/notebooks/unit1/unit1.ipynb
index 8283dd3..95562ff 100644
--- a/notebooks/unit1/unit1.ipynb
+++ b/notebooks/unit1/unit1.ipynb
@@ -7,7 +7,7 @@
  "colab_type": "text"
  },
  "source": [
- "\"Open"
+ "\"Open"
  ]
  },
  {
@@ -101,10 +101,10 @@
  "\n",
  "- 📖 Study Deep Reinforcement Learning in **theory and practice**.\n",
  "- 🧑‍💻 Learn to **use famous Deep RL libraries** such as Stable Baselines3, RL Baselines3 Zoo, CleanRL and Sample Factory 2.0.\n",
- "- 🤖 Train **agents in unique environments** \n",
+ "- 🤖 Train **agents in unique environments**\n",
  "- 🎓 **Earn a certificate of completion** by completing 80% of the assignments.\n",
  "\n",
- "And more! \n",
+ "And more!\n",
  "\n",
  "Check 📚 the syllabus 👉 https://simoninithomas.github.io/deep-rl-course\n",
  "\n",
@@ -248,7 +248,7 @@
  {
  "cell_type": "markdown",
  "source": [
- "During the notebook, we'll need to generate a replay video. To do so, with colab, **we need to have a virtual screen to be able to render the environment** (and thus record the frames). \n",
+ "During the notebook, we'll need to generate a replay video. To do so, in Colab, **we need a virtual screen to render the environment** (and thus record the frames).\n",
  "\n",
  "Hence the following cell will install virtual screen libraries and create and run a virtual screen 🖥"
  ],
@@ -428,7 +428,7 @@
  "  # Do this action in the environment and get\n",
  "  # next_state, reward, terminated, truncated and info\n",
  "  observation, reward, terminated, truncated, info = env.step(action)\n",
- "  \n",
+ "\n",
  "  # If the game is terminated (in our case we land, crashed) or truncated (timeout)\n",
  "  if terminated or truncated:\n",
  "      # Reset the environment\n",
@@ -453,7 +453,7 @@
  "---\n",
  "\n",
  "\n",
- "💡 A good habit when you start to use an environment is to check its documentation \n",
+ "💡 A good habit when you start using an environment is to check its documentation\n",
  "\n",
  "👉 https://gymnasium.farama.org/environments/box2d/lunar_lander/\n",
  "\n",
@@ -498,8 +498,8 @@
  "- Vertical speed (y)\n",
  "- Angle\n",
  "- Angular speed\n",
- "- If the left leg contact point has touched the land\n",
- "- If the right leg contact point has touched the land\n"
+ "- If the left leg contact point has touched the ground (boolean)\n",
+ "- If the right leg contact point has touched the ground (boolean)\n"
@@ -521,7 +521,7 @@
  "id": "MyxXwkI2Magx"
  },
  "source": [
- "The action space (the set of possible actions the agent can take) is discrete with 4 actions available 🎮: \n",
+ "The action space (the set of possible actions the agent can take) is discrete with 4 actions available 🎮:\n",
  "\n",
  "- Action 0: Do nothing,\n",
  "- Action 1: Fire left orientation engine,\n",
@@ -648,7 +648,7 @@
  "# TODO: Define a PPO MlpPolicy architecture\n",
  "# We use MultiLayerPerceptron (MLPPolicy) because the input is a vector,\n",
  "# if we had frames as input we would use CnnPolicy\n",
- "model = "
+ "model ="
@@ -762,7 +762,7 @@
  "eval_env =\n",
  "\n",
  "# Evaluate the model with 10 evaluation episodes and deterministic=True\n",
- "mean_reward, std_reward = \n",
+ "mean_reward, std_reward =\n",
  "\n",
  "# Print the results\n",
  "\n"
@@ -844,7 +844,7 @@
  "\n",
  "\"Create\n",
  "\n",
- "- Copy the token \n",
+ "- Copy the token\n",
  "- Run the cell below and paste the token"
@@ -913,10 +913,10 @@
  "\n",
  "## TODO: Define a repo_id\n",
  "## repo_id is the id of the model repository from the Hugging Face Hub (repo_id = {organization}/{repo_name} for instance ThomasSimonini/ppo-LunarLander-v2\n",
- "repo_id = \n",
+ "repo_id =\n",
  "\n",
  "# TODO: Define the name of the environment\n",
- "env_id = \n",
+ "env_id =\n",
  "\n",
  "# Create the evaluation env and set the render_mode=\"rgb_array\"\n",
  "eval_env = DummyVecEnv([lambda: Monitor(gym.make(env_id, render_mode=\"rgb_array\"))])\n",
@@ -930,7 +930,7 @@
  "\n",
  "# method save, evaluate, generate a model card and record a replay video of your agent before pushing the repo to the hub\n",
  "package_to_hub(model=model, # Our trained model\n",
- "               model_name=model_name, # The name of our trained model \n",
+ "               model_name=model_name, # The name of our trained model\n",
  "               model_architecture=model_architecture, # The model architecture we used: in our case PPO\n",
  "               env_id=env_id, # Name of the environment\n",
  "               eval_env=eval_env, # Evaluation Environment\n",
@@ -978,7 +978,7 @@
  "\n",
  "# PLACE the package_to_hub function you've just filled here\n",
  "package_to_hub(model=model, # Our trained model\n",
- "               model_name=model_name, # The name of our trained model \n",
+ "               model_name=model_name, # The name of our trained model\n",
  "               model_architecture=model_architecture, # The model architecture we used: in our case PPO\n",
  "               env_id=env_id, # Name of the environment\n",
  "               eval_env=eval_env, # Evaluation Environment\n",
@@ -995,7 +995,7 @@
  "cell_type": "markdown",
  "source": [
  "Congrats 🥳 you've just trained and uploaded your first Deep Reinforcement Learning agent. The script above should have displayed a link to a model repository such as https://huggingface.co/osanseviero/test_sb3. When you go to this link, you can:\n",
- "* See a video preview of your agent at the right. \n",
+ "* See a video preview of your agent on the right.\n",
  "* Click \"Files and versions\" to see all the files in the repository.\n",
  "* Click \"Use in stable-baselines3\" to get a code snippet that shows how to load the model.\n",
  "* A model card (`README.md` file) which gives a description of the model\n",
@@ -1017,7 +1017,7 @@
  "## Load a saved LunarLander model from the Hub 🤗\n",
  "Thanks to [ironbar](https://github.com/ironbar) for the contribution.\n",
  "\n",
- "Loading a saved model from the Hub is really easy. \n",
+ "Loading a saved model from the Hub is really easy.\n",
  "\n",
  "You go to https://huggingface.co/models?library=stable-baselines3 to see the list of all the Stable-baselines3 saved models.\n",
  "1. You select one and copy its repo_id\n",
@@ -1115,7 +1115,7 @@
  },
  "source": [
  "## Some additional challenges 🏆\n",
- "The best way to learn **is to try things by your own**! As you saw, the current agent is not doing great. As a first suggestion, you can train for more steps. With 1,000,000 steps, we saw some great results! \n",
+ "The best way to learn **is to try things on your own**! As you saw, the current agent is not doing great. As a first suggestion, you can train for more steps. With 1,000,000 steps, we saw some great results!\n",
  "\n",
  "In the [Leaderboard](https://huggingface.co/spaces/huggingface-projects/Deep-Reinforcement-Learning-Leaderboard) you will find your agents. Can you get to the top?\n",
  "\n",
@@ -1190,4 +1190,4 @@
  },
  "nbformat": 4,
  "nbformat_minor": 0
-}
+}
\ No newline at end of file
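-- 
Reviewer note (outside the patch, so git am/apply ignore it): the updated bullets document that the last two observation entries are leg-contact booleans. Below is a minimal sketch to sanity-check that layout yourself. It assumes gymnasium with the box2d extra installed (pip install "gymnasium[box2d]"); the variable names are illustrative, not taken from the notebook.

    import gymnasium as gym

    # Build the same environment the notebook uses
    env = gym.make("LunarLander-v2")
    observation, info = env.reset()

    # The observation is an 8-dimensional vector:
    # [x, y, vx, vy, angle, angular velocity, left leg contact, right leg contact]
    print(env.observation_space.shape)  # -> (8,)

    # The last two entries are the leg-contact booleans, encoded as 0.0 / 1.0
    left_contact, right_contact = observation[6], observation[7]
    print(bool(left_contact), bool(right_contact))

    env.close()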