Mirror of https://github.com/huggingface/deep-rl-class.git (synced 2026-05-16 13:55:52 +08:00)
Update colab
@@ -41,7 +41,7 @@
"from IPython.display import HTML\n",
"\n",
"HTML('''<video width=\"640\" height=\"480\" controls>\n",
" <source src=\"https://huggingface.co/edbeeching/doom_health_gathering_supreme_3333/resolve/main/replay.mp4\" \n",
" <source src=\"https://huggingface.co/edbeeching/doom_health_gathering_supreme_3333/resolve/main/replay.mp4\"\n",
" type=\"video/mp4\">Your browser does not support the video tag.</video>'''\n",
")"
]
@@ -124,15 +124,15 @@
"\n",
"### How sample-factory works\n",
"\n",
"Sample-factory is one of the **most highly optimized RL implementations available to the community**. \n",
"Sample-factory is one of the **most highly optimized RL implementations available to the community**.\n",
"\n",
"It works by **spawning multiple processes that run rollout workers, inference workers and a learner worker**. \n",
"It works by **spawning multiple processes that run rollout workers, inference workers and a learner worker**.\n",
"\n",
"The *workers* **communicate through shared memory, which lowers the communication cost between processes**. \n",
"The *workers* **communicate through shared memory, which lowers the communication cost between processes**.\n",
"\n",
"The *rollout workers* interact with the environment and send observations to the *inference workers*. \n",
"The *rollout workers* interact with the environment and send observations to the *inference workers*.\n",
"\n",
"The *inferences workers* query a fixed version of the policy and **send actions back to the rollout worker**. \n",
"The *inferences workers* query a fixed version of the policy and **send actions back to the rollout worker**.\n",
"\n",
"After *k* steps the rollout works send a trajectory of experience to the learner worker, **which it uses to update the agent’s policy network**.\n",
"\n",
@@ -164,9 +164,9 @@
"source": [
"## ViZDoom\n",
"\n",
"[ViZDoom](https://vizdoom.cs.put.edu.pl/) is an **open-source python interface for the Doom Engine**. \n",
"[ViZDoom](https://vizdoom.cs.put.edu.pl/) is an **open-source python interface for the Doom Engine**.\n",
"\n",
"The library was created in 2016 by Marek Wydmuch, Michal Kempka at the Institute of Computing Science, Poznan University of Technology, Poland. \n",
"The library was created in 2016 by Marek Wydmuch, Michal Kempka at the Institute of Computing Science, Poznan University of Technology, Poland.\n",
"\n",
"The library enables the **training of agents directly from the screen pixels in a number of scenarios**, including team deathmatch, shown in the video below. Because the ViZDoom environment is based on a game the was created in the 90s, it can be run on modern hardware at accelerated speeds, **allowing us to learn complex AI behaviors fairly quickly**.\n",
"\n",
@@ -195,7 +195,7 @@
"source": [
"## We first need to install some dependencies that are required for the ViZDoom environment\n",
"\n",
"Now that our Colab runtime is set up, we can start by installing the dependencies required to run ViZDoom on linux. \n",
"Now that our Colab runtime is set up, we can start by installing the dependencies required to run ViZDoom on linux.\n",
"\n",
"If you are following on your machine on Mac, you will want to follow the installation instructions on the [github page](https://github.com/Farama-Foundation/ViZDoom/blob/master/doc/Quickstart.md#-quickstart-for-macos-and-anaconda3-python-36)."
]
@@ -210,7 +210,7 @@
"source": [
"%%capture\n",
"%%bash\n",
"# Install ViZDoom deps from \n",
"# Install ViZDoom deps from\n",
"# https://github.com/mwydmuch/ViZDoom/blob/master/doc/Building.md#-linux\n",
"\n",
"apt-get install build-essential zlib1g-dev libsdl2-dev libjpeg-dev \\\n",
@@ -244,11 +244,21 @@
"source": [
"# install python libraries\n",
"# thanks toinsson\n",
"!pip install sample-factory==2.0.2\n",
"!pip install faster-fifo==1.4.2\n",
"!pip install vizdoom"
]
},
{
"cell_type": "code",
"source": [
"!pip install sample-factory==2.0.2"
],
"metadata": {
"id": "alxUt7Au-O8e"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
@@ -330,7 +340,7 @@
"- 1 available game variable: HEALTH\n",
"- death penalty = 100\n",
"\n",
"You can find out more about the scenarios available in ViZDoom [here](https://github.com/Farama-Foundation/ViZDoom/tree/master/scenarios). \n",
"You can find out more about the scenarios available in ViZDoom [here](https://github.com/Farama-Foundation/ViZDoom/tree/master/scenarios).\n",
"\n",
"There are also a number of more complex scenarios that have been create for ViZDoom, such as the ones detailed on [this github page](https://github.com/edbeeching/3d_control_deep_rl).\n",
"\n"
@@ -451,7 +461,7 @@
"\n",
"<img src=\"https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/notebooks/create-token.jpg\" alt=\"Create HF Token\">\n",
"\n",
"- Copy the token \n",
"- Copy the token\n",
"- Run the cell below and paste the token"
]
},
@@ -571,7 +581,7 @@
"source": [
"## Some additional challenges 🏆: Doom Deathmatch\n",
"\n",
"Training an agent to play a Doom deathmatch **takes many hours on a more beefy machine than is available in Colab**. \n",
"Training an agent to play a Doom deathmatch **takes many hours on a more beefy machine than is available in Colab**.\n",
"\n",
"Fortunately, we have have **already trained an agent in this scenario and it is available in the 🤗 Hub!** Let’s download the model and visualize the agent’s performance."
],
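
Note on the "How sample-factory works" cell touched by the second hunk: the data flow it describes (rollout workers send observations to inference workers, get actions back, and hand a trajectory to the learner after *k* steps) can be sketched roughly as below. This is a toy illustration only, not sample-factory's API. It uses standard-library threads and queues instead of processes and shared memory, and every name in it (`K`, `policy`, `rollout_worker`, `inference_worker`, `learner`, `run_demo`) plus the counter "environment" is a hypothetical stand-in.

```python
import queue
import threading

K = 4  # rollout length: the "k steps" mentioned in the notebook text (value is arbitrary)


def policy(obs):
    # Hypothetical frozen policy snapshot: a fixed rule instead of a neural network.
    return obs % 2


def inference_worker(obs_q, act_q):
    # Answers each observation with an action until it sees the None sentinel.
    while True:
        obs = obs_q.get()
        if obs is None:
            break
        act_q.put(policy(obs))


def rollout_worker(obs_q, act_q, traj_q, num_rollouts):
    # Steps a toy environment (an integer counter) and never evaluates the policy itself.
    obs = 0
    for _ in range(num_rollouts):
        trajectory = []
        for _ in range(K):
            obs_q.put(obs)
            action = act_q.get()
            reward = 1.0 if action == 1 else 0.0
            trajectory.append((obs, action, reward))
            obs += 1  # toy environment transition
        traj_q.put(trajectory)  # hand K steps of experience to the learner
    obs_q.put(None)   # tell the inference worker to stop
    traj_q.put(None)  # tell the learner to stop


def learner(traj_q, returns):
    # A real learner would run a gradient update here; this one only sums rewards.
    while True:
        trajectory = traj_q.get()
        if trajectory is None:
            break
        returns.append(sum(reward for _, _, reward in trajectory))


def run_demo(num_rollouts=2):
    obs_q, act_q, traj_q = queue.Queue(), queue.Queue(), queue.Queue()
    returns = []
    threads = [
        threading.Thread(target=inference_worker, args=(obs_q, act_q)),
        threading.Thread(target=rollout_worker, args=(obs_q, act_q, traj_q, num_rollouts)),
        threading.Thread(target=learner, args=(traj_q, returns)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return returns
```

`run_demo(2)` returns one summed reward per trajectory received by the learner. The point of the split is that the rollout side only steps the environment while policy evaluation is done by a separate worker, which is the separation the notebook text describes.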