mirror of
https://github.com/huggingface/deep-rl-class.git
synced 2026-04-05 03:28:05 +08:00
Update MLAgents introduction
@@ -136,8 +136,8 @@
title: Introduction
- local: unit5/how-mlagents-works
title: How ML-Agents works
- local: unit5/snowball-target
  title: The SnowballTarget environment
- local: unit5/pyramids
title: The Pyramids environment
- local: unit5/curiosity
@@ -65,4 +65,4 @@ The Academy will be the one that will **send the order to our Agents and ensure
<img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit5/academy.png" alt="The MLAgents Academy" width="100%">
Now that we understand how ML-Agents works, **we’re ready to train our agents.**
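Training is driven by the `mlagents-learn` CLI, which reads a trainer configuration YAML. As a minimal sketch of what such a config looks like (the hyperparameter values below are illustrative, not the course's official settings):

```yaml
behaviors:
  SnowballTarget:             # must match the Behavior Name set in Unity
    trainer_type: ppo         # the PPO algorithm we studied earlier
    hyperparameters:
      batch_size: 128
      buffer_size: 2048
      learning_rate: 0.0003
    network_settings:
      hidden_units: 256
      num_layers: 2
    reward_signals:
      extrinsic:
        gamma: 0.99
        strength: 1.0
    max_steps: 200000         # total environment steps to train for
    time_horizon: 64
    summary_freq: 10000       # how often to log training statistics
```

You would then launch training with something like `mlagents-learn ./config.yaml --run-id="SnowballTarget1"` (the config path and run id are placeholders).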
@@ -1,26 +1,26 @@
# An Introduction to Unity ML-Agents [[introduction-to-ml-agents]]
One of the critical elements in Reinforcement Learning is **to be able to create environments**. Game engines such as Godot, Unity, or Unreal Engine are interesting tools for that.
One of them, [Unity](https://unity.com/), created the [Unity ML-Agents Toolkit](https://github.com/Unity-Technologies/ml-agents), a plugin based on the game engine Unity that allows us **to use the Unity Game Engine as an environment builder to train agents**.
<figure>
<img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit5/example-envs.png" alt="MLAgents environments"/>
<figcaption>Source: <a href="https://github.com/Unity-Technologies/ml-agents">ML-Agents documentation</a></figcaption>
</figure>
The Unity ML-Agents Toolkit provides a ton of exceptional pre-made environments, from playing football (soccer) to learning to walk and jumping over big walls.
In this Unit, we'll learn to use ML-Agents, but **don't worry if you don't know how to use the Unity Game Engine**: you won't need it to train your agents.
Today, we're going to train two agents:
- The first one will learn to **shoot snowballs at spawning targets**.
- The second needs to **press a button to spawn a pyramid, then navigate to the pyramid, knock it over, and move to the gold brick at the top**. To do that, it will need to explore its environment, and we will use a technique called curiosity.
TODO: Add illustration environments
Then, after training, **you'll push the trained agents to the Hugging Face Hub**, and you'll be able to **visualize them playing directly in your browser without having to use the Unity Editor**. You'll also be able to visualize and download other trained agents from the community.
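Pushing a trained agent to the Hub is done with the `mlagents-push-to-hf` command from the Hugging Face integration of ML-Agents. A hedged sketch (the run id, results directory, and repo id below are placeholders you'd replace with your own):

```bash
mlagents-push-to-hf \
  --run-id="SnowballTarget1" \
  --local-dir="./results/SnowballTarget1" \
  --repo-id="YourUsername/ppo-SnowballTarget" \
  --commit-message="First push"
```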
Doing this Unit will **prepare you for the next challenge: AI vs. AI, where you will train agents in multi-agent environments and compete against your classmates' agents**.
Sound exciting? Let's get started!
@@ -1,12 +1,10 @@
# The SnowballTarget Environment
## The Agent's Goal
The first agent you're going to train is Julien the bear (named after our [CTO Julien Chaumond](https://twitter.com/julien_c)), and you'll teach it to hit targets with snowballs.
TODO ADD GIF

The goal in this environment is for Julien the bear to **hit as many spawned targets as possible within the limited time** (1000 timesteps). To do that, it will need to **position itself correctly relative to the target and shoot**. In addition, to avoid "snowball spamming" (i.e., shooting a snowball every timestep), **Julien the bear has a "cool off" system**: it needs to wait 0.5 seconds after a shot before it can shoot again.
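The cool-off mechanic can be sketched in plain Python. This is a hypothetical illustration, not the environment's actual Unity C# code; only the 0.5-second cool-off value comes from the description above.

```python
class SnowballCooldown:
    """Tracks whether the agent is allowed to shoot, given a cool-off period."""

    def __init__(self, cooldown_seconds=0.5):
        self.cooldown_seconds = cooldown_seconds
        self.last_shot_time = None  # None means no snowball fired yet

    def can_shoot(self, current_time):
        # The agent can always shoot if it has never shot before.
        if self.last_shot_time is None:
            return True
        return current_time - self.last_shot_time >= self.cooldown_seconds

    def shoot(self, current_time):
        # Fire a snowball if allowed; return whether the shot actually happened.
        if self.can_shoot(current_time):
            self.last_shot_time = current_time
            return True
        return False
```

For example, with the default 0.5-second cool-off, `shoot(0.0)` fires, `shoot(0.3)` is blocked (still cooling off), and `shoot(0.6)` fires again.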
## The Reward Function