Unit 5 quiz and rewording of unit 6
@@ -2,72 +2,114 @@
 The best way to learn and [to avoid the illusion of competence](https://www.coursera.org/lecture/learning-how-to-learn/illusions-of-competence-BuFzf) **is to test yourself.** This will help you to find **where you need to reinforce your knowledge**.

-### Q1: What of the following statemets are true about Unity ML-Agents?
+### Q1: Which of the following tools are specifically designed for video games development?

 <Question
 choices={[
   {
-    text: "It allows you to create learning environments from the Unity ´Scene´ objects",
+    text: "Unity (C#)",
+    explain: "",
     correct: true,
   },
   {
-    text: "It allows you to create and train your agents using Reinforcement Learning",
+    text: "Unreal Engine (C++)",
+    explain: "",
     correct: true,
   },
   {
-    text: "Its `Communicator` component manages the communication between Unity's C# Environments/Agents and the Python back-end",
+    text: "Godot (GDScript, C++, C#)",
+    explain: "",
     correct: true,
   },
   {
-    text: "The library which carries out the actual training is Pytorch",
-    explain: ""
-    correct: "true"
+    text: "JetBrains' Rider",
+    explain: "Although useful for its support of C# for Unity, it's not a video games development IDE",
+    correct: false,
   },
   {
-    text: "Unity ML-Agents only support Proximal Policy Optimization (PPO)",
-    explain: "No, Unity ML-Agents supports several families of algorithms, including Actor-Critic which is going to be explained in the next section"
-    correct: "false"
+    text: "JetBrains' CLion",
+    explain: "Although useful for its support of C++ for Unreal Engine, it's not a video games development IDE",
+    correct: false,
   },
   {
-    text: "Unity's programming language is C++",
-    explain: "It's C#. If you are interested in programming in C++, take a look at Unreal Engine `Learning Agents`"
-    correct: "false"
+    text: "Microsoft Visual Studio and Visual Studio Code",
+    explain: "Despite supporting both Unity and Unreal, they are generic IDEs, not ones oriented to video games development.",
+    correct: false,
   },
 ]}
 />

-### Q2: Explain with your own words what is the role of the `Academy`.
+### Q2: Which of the following statements are true about Unity ML-Agents?
+
+<Question
+choices={[
+  {
+    text: "Unity `Scene` objects can be used to create learning environments",
+    explain: "",
+    correct: true,
+  },
+  {
+    text: "Unity ML-Agents allows you to create and train your agents using Reinforcement Learning",
+    explain: "",
+    correct: true,
+  },
+  {
+    text: "Its `Communicator` component manages the communication between Unity's C# Environments/Agents and a Python back-end",
+    explain: "",
+    correct: true,
+  },
+  {
+    text: "The training process uses Reinforcement Learning algorithms, implemented in Pytorch",
+    explain: "",
+    correct: true,
+  },
+  {
+    text: "Unity ML-Agents only supports Proximal Policy Optimization (PPO)",
+    explain: "No, Unity ML-Agents supports several families of algorithms, including Actor-Critic, which is going to be explained in the next section",
+    correct: false,
+  },
+  {
+    text: "It includes a Gym Wrapper and a multi-agent version of it called `PettingZoo`",
+    explain: "",
+    correct: true,
+  },
+]}
+/>
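As an aside to the Gym Wrapper statement above, here is a minimal sketch of loading a Unity build as a Gym environment. The executable path `./SnowballTarget` is an assumption, and depending on the ml-agents release the wrapper lives in `mlagents_envs.envs.unity_gym_env` or the older `gym_unity.envs`:

```python
# Hedged sketch: wrap a local Unity build as a single-agent Gym environment.
from mlagents_envs.environment import UnityEnvironment
from mlagents_envs.envs.unity_gym_env import UnityToGymWrapper  # older releases: gym_unity.envs

unity_env = UnityEnvironment(file_name="./SnowballTarget", no_graphics=True)  # path is an assumption
env = UnityToGymWrapper(unity_env, uint8_visual=False)

obs = env.reset()
for _ in range(10):
    # drive the environment with random actions, as a smoke test
    obs, reward, done, info = env.step(env.action_space.sample())
    if done:
        obs = env.reset()
env.close()
```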

+### Q3: Fill in the missing letters
+
+- In Unity ML-Agents, the Policy of an Agent is called a b _ _ _ n
+- The component in charge of orchestrating the agents is called the _ c _ _ _ m _
+
 <details>
 <summary>Solution</summary>

 The `Academy` is the orchestrating module in charge of attending the requests from the Python API and sending them to the agents (e.g., `collect observations`)

 <img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit5/academy.png" alt="Academy"/>

+- b r a i n
+- a c a d e m y
 </details>
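The request/response loop the `Academy` serves can be made concrete with the low-level `mlagents_envs` Python API. A hedged sketch follows; the `./Pyramids` build path is an assumption:

```python
# Hedged sketch of the Python-side loop: request observations, send actions, step.
from mlagents_envs.environment import UnityEnvironment

env = UnityEnvironment(file_name="./Pyramids")  # path is an assumption
env.reset()
behavior_name = list(env.behavior_specs)[0]
spec = env.behavior_specs[behavior_name]

for _ in range(100):
    decision_steps, terminal_steps = env.get_steps(behavior_name)  # "collect observations" request
    action = spec.action_spec.random_action(len(decision_steps))   # here: random actions for illustration
    env.set_actions(behavior_name, action)                         # send actions back to the agents
    env.step()                                                     # advance the Unity simulation
env.close()
```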

-### Q3: What are the differences between capturing the environment using `frames` or `raycasts`?
+### Q4: Define with your own words what is a `raycast`
+
+<details>
+<summary>Solution</summary>
+
+A raycast is (most of the time) a linear projection, like a `laser`, which aims to detect collisions with objects.
+
+</details>
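To make the `laser` analogy concrete, here is a toy, engine-free sketch (not ML-Agents code) of a raycast over a 2D grid: step along a direction and report the first object hit, plus the distance to it.

```python
# Toy raycast: walk cell by cell along a direction until something is hit.
import numpy as np

grid = np.zeros((10, 10), dtype=int)  # 0 = empty
grid[4, 7] = 1                        # 1 = wall
grid[4, 3] = 2                        # 2 = target

def raycast(grid, origin, direction, max_steps=20):
    """Step along `direction` from `origin`; return (object_id, distance) of the first hit."""
    x, y = origin
    dx, dy = direction
    for dist in range(1, max_steps + 1):
        cx, cy = x + dx * dist, y + dy * dist
        if not (0 <= cx < grid.shape[0] and 0 <= cy < grid.shape[1]):
            return 0, dist  # ray left the grid: no hit
        if grid[cx, cy] != 0:
            return grid[cx, cy], dist
    return 0, max_steps

print(raycast(grid, origin=(4, 0), direction=(0, 1)))  # -> (2, 3): the target, 3 cells away
```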

+### Q5: Which are the differences between capturing the environment using `frames` or `raycasts`?

 <Question
 choices={[
   {
     text: "By using `frames`, the environment is defined by each of the pixels of the screen. By using `raycasts`, we only send a sample of those pixels.",
-    explain: "`Raycasts` don't have anything to do with pixels. They are projections of geometric shapes, normally lines, that we spawn to check for collisions.",
+    explain: "`Raycasts` don't have anything to do with pixels. They are linear projections (lasers) that we spawn to look for collisions.",
     correct: false,
   },
   {
-    text: "By using `raycasts`, the environment is defined by each of the pixels of the screen. By using `frames`, we spawn a geometric shape (normally lines) to check what objects it collides with",
+    text: "By using `raycasts`, the environment is defined by each of the pixels of the screen. By using `frames`, we spawn a projection (usually a line) to check what objects it collides with",
     explain: "It's the other way around - `frames` collect pixels, `raycasts` check for collisions.",
     correct: false,
   },
   {
-    text: "By using `frames`, we collect all the pixels of the screen, which define the environment. By using `raycast`, we don't use pixels, we spawn geometric shapes (normally lines) and check for collisions",
+    text: "By using `frames`, we collect all the pixels of the screen, which define the environment. By using `raycast`, we don't use pixels; we spawn (normally) lines and check their collisions",
     explain: "",
     correct: true,
   },
@@ -75,12 +117,13 @@ The `Academy` is the orchestrating module in charge of attending the requests fr
 />
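A quick way to see the practical difference is to compare observation sizes. The numbers below (an 84x84 RGB frame; 3 rays, each encoding 2 detectable tags plus a distance) are illustrative assumptions, not the exact ML-Agents sensor encoding:

```python
# Illustrative size comparison: a frame is a full pixel array,
# a raycast observation is a short vector of hit information.
import numpy as np

frame_obs = np.zeros((84, 84, 3), dtype=np.uint8)  # 84x84 RGB frame -> 21,168 values
ray_obs = np.zeros(3 * (2 + 1), dtype=np.float32)  # 3 rays x (2 tags + 1 distance) -> 9 values

print(frame_obs.size, ray_obs.size)  # 21168 vs 9
```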

-### Q4: Name several input variables which were used in any of the Snowball or Pyramid environments
+### Q6: Name several environment and agent input variables used to train the agent in the Snowball or Pyramid environments

 <details>
 <summary>Solution</summary>
-- Collisions of the raycasts in charge of detecting blocks, (invisible) walls, stones, our target, switches, etc. in the environment.
-- Traditional inputs describing agent features, as its speed (it could also be position, rotation, etc. although that is covered by our raycast already).
-- Some boolean vars, as the switch (on/off) in Pyramids or the `can I shoot?` in the SnowballTarget.
+- Collisions of the raycasts spawned from the agent detecting blocks, (invisible) walls, stones, our target, switches, etc.
+- Traditional inputs describing agent features, such as its speed
+- Boolean vars, such as the switch (on/off) in Pyramids or the `can I shoot?` in the SnowballTarget.
 </details>
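As a hedged sketch of how such heterogeneous inputs could be flattened into a single observation vector (the names and sizes are made up for illustration, not the course's code):

```python
# Concatenate ray hits, agent features, and boolean flags into one vector.
import numpy as np

ray_hits = np.array([0.0, 1.0, 0.0, 0.2], dtype=np.float32)  # e.g. tag one-hot + hit distance for one ray
agent_speed = np.array([0.8], dtype=np.float32)              # traditional agent feature
can_shoot = np.array([1.0], dtype=np.float32)                # boolean var encoded as 0/1

observation = np.concatenate([ray_hits, agent_speed, can_shoot])
print(observation.shape)  # (6,)
```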
@@ -3,7 +3,7 @@
 The best way to learn and [to avoid the illusion of competence](https://www.coursera.org/lecture/learning-how-to-learn/illusions-of-competence-BuFzf) **is to test yourself.** This will help you to find **where you need to reinforce your knowledge**.

-### Q1: What of the following interpretations of bias-variance tradeoff is the most accurate in the field of Reinforcement Learning?
+### Q1: Which of the following interpretations of bias-variance tradeoff is the most accurate in the field of Reinforcement Learning?

 <Question
 choices={[
@@ -20,7 +20,7 @@ The best way to learn and [to avoid the illusion of competence](https://www.cour
 ]}
 />

-### Q2: Which of the following statements are True, when talking about models with bias and/or variance in RL?
+### Q2: Which of the following statements are true, when talking about models with bias and/or variance in RL?

 <Question
 choices={[
@@ -48,29 +48,29 @@ The best way to learn and [to avoid the illusion of competence](https://www.cour
 />

-### Q3: Which of the following statements are true about Monte-carlo method?
+### Q3: Which of the following statements are true about the Monte Carlo method?

 <Question
 choices={[
   {
-    text: "It's a sampling mechanism, which means we don't consider analyze all the possible states, but a sample of those",
+    text: "It's a sampling mechanism, which means we don't analyze all the possible states, but a sample of those",
     explain: "",
     correct: true,
   },
   {
     text: "It's very resistant to stochasticity (random elements in the trajectory)",
-    explain: "Monte-carlo randomly estimates everytime a sample of trajectories. However, even same trajectories can have different reward values if they contain stochastic elements",
+    explain: "Monte Carlo estimates a fresh random sample of trajectories every time. However, even the same trajectory can yield different reward values if it contains stochastic elements",
     correct: false,
   },
   {
-    text: "To reduce the impact of stochastic elements in Monte-Carlo, we can take `n` strategies and average them, reducing their impact impact in case of noise",
+    text: "To reduce the impact of stochastic elements in Monte Carlo, we take `n` strategies and average them, reducing their individual impact",
     explain: "",
     correct: true,
   },
 ]}
 />
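The averaging idea in the last correct choice can be demonstrated with a toy simulation (illustrative numbers, not course code): a single sampled return is noisy, while the mean over many samples concentrates around the true value.

```python
# Toy Monte Carlo estimate: average n noisy episode returns to cut variance.
import random

def sample_return():
    # a stochastic trajectory: same policy, but noisy per-step rewards
    return sum(1.0 + random.gauss(0.0, 0.5) for _ in range(10))

single = sample_return()                                     # high-variance estimate
averaged = sum(sample_return() for _ in range(1000)) / 1000  # variance shrinks roughly as 1/n

print(round(single, 2), round(averaged, 2))  # the average lands close to the true value, 10.0
```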

-### Q4: What is the Advanced Actor-Critic Method (A2C)?
+### Q4: How would you describe, with your own words, the Actor-Critic Method (A2C)?

 <details>
 <summary>Solution</summary>
@@ -83,12 +83,12 @@ The idea behind Actor-Critic is that we learn two function approximations:

 </details>
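Since the solution describes learning two function approximations, here is a minimal single-step A2C-style update sketch in PyTorch; the network sizes, dummy transition, reward, and learning rate are assumptions for illustration, not ML-Agents internals.

```python
# Hedged A2C sketch: one actor (policy) update and one critic (value) update.
import torch
import torch.nn as nn

obs_dim, n_actions = 8, 4
actor = nn.Linear(obs_dim, n_actions)  # policy head: produces action logits
critic = nn.Linear(obs_dim, 1)         # value head: estimates V(s)
opt = torch.optim.Adam(list(actor.parameters()) + list(critic.parameters()), lr=1e-3)

obs, next_obs = torch.randn(obs_dim), torch.randn(obs_dim)  # dummy transition
reward, gamma = 1.0, 0.99

dist = torch.distributions.Categorical(logits=actor(obs))
action = dist.sample()

# TD advantage: how much better the outcome was than the Critic expected
with torch.no_grad():
    td_target = reward + gamma * critic(next_obs).squeeze()
advantage = td_target - critic(obs).squeeze()

actor_loss = -dist.log_prob(action) * advantage.detach()  # reinforce actions that beat the baseline
critic_loss = advantage.pow(2)                            # regress V(s) toward the TD target

opt.zero_grad()
(actor_loss + critic_loss).backward()
opt.step()
```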

-### Q5: Which of the following statemets are True about the Actor-Critic Method?
+### Q5: Which of the following statements are true about the Actor-Critic Method?

 <Question
 choices={[
   {
-    text: "The Critic does not learn from the training process",
+    text: "The Critic does not learn any function during the training process",
     explain: "Both the Actor and the Critic function parameters are updated during training time",
     correct: false,
   },