Mirror of https://github.com/huggingface/deep-rl-class.git (synced 2026-04-08 21:30:45 +08:00)
Merge pull request #430 from josejuanmartinez/unit-5-quiz
Unit 5 quiz and some rewording for Unit 6
@@ -148,6 +148,8 @@
     title: Hands-on
   - local: unit5/bonus
     title: Bonus. Learn to create your own environments with Unity and MLAgents
+  - local: unit5/quiz
+    title: Quiz
   - local: unit5/conclusion
     title: Conclusion
 - title: Unit 6. Actor Critic methods with Robotics environments
units/en/unit5/quiz.mdx (Normal file, 130 lines)
@@ -0,0 +1,130 @@
# Quiz

The best way to learn and [to avoid the illusion of competence](https://www.coursera.org/lecture/learning-how-to-learn/illusions-of-competence-BuFzf) **is to test yourself.** This will help you to find **where you need to reinforce your knowledge**.

### Q1: Which of the following tools are specifically designed for video game development?

<Question
  choices={[
    {
      text: "Unity (C#)",
      explain: "",
      correct: true,
    },
    {
      text: "Unreal Engine (C++)",
      explain: "",
      correct: true,
    },
    {
      text: "Godot (GDScript, C++, C#)",
      explain: "",
      correct: true,
    },
    {
      text: "JetBrains' Rider",
      explain: "Although useful for its C# support for Unity, it is an IDE, not a game development tool",
      correct: false,
    },
    {
      text: "JetBrains' CLion",
      explain: "Although useful for its C++ support for Unreal Engine, it is an IDE, not a game development tool",
      correct: false,
    },
    {
      text: "Microsoft Visual Studio and Visual Studio Code",
      explain: "Although they support both Unity and Unreal, they are general-purpose IDEs, not game development tools.",
      correct: false,
    },
  ]}
/>

### Q2: Which of the following statements are true about Unity ML-Agents?

<Question
  choices={[
    {
      text: "Unity `Scene` objects can be used to create learning environments",
      explain: "",
      correct: true,
    },
    {
      text: "Unity ML-Agents allows you to create and train your agents using Reinforcement Learning",
      explain: "",
      correct: true,
    },
    {
      text: "Its `Communicator` component manages the communication between Unity's C# Environments/Agents and a Python back-end",
      explain: "",
      correct: true,
    },
    {
      text: "The training process uses Reinforcement Learning algorithms, implemented in PyTorch",
      explain: "",
      correct: true,
    },
    {
      text: "Unity ML-Agents only supports Proximal Policy Optimization (PPO)",
      explain: "No, Unity ML-Agents supports several families of algorithms, including Actor-Critic, which will be explained in the next section",
      correct: false,
    },
    {
      text: "It includes a Gym Wrapper and a multi-agent version of it called `PettingZoo`",
      explain: "",
      correct: true,
    },
  ]}
/>
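
Training with ML-Agents is typically driven by a YAML trainer configuration passed to the `mlagents-learn` CLI. Below is a minimal sketch for a behavior named `SnowballTarget`; the hyperparameter values here are illustrative, not the course's exact config:

```yaml
behaviors:
  SnowballTarget:
    trainer_type: ppo        # PPO is the default, but other trainers exist too
    hyperparameters:
      batch_size: 128        # illustrative values, tune for your environment
      buffer_size: 2048
      learning_rate: 0.0003
    network_settings:
      hidden_units: 256
      num_layers: 2
    reward_signals:
      extrinsic:
        gamma: 0.99
        strength: 1.0
    max_steps: 200000
```

You would then launch training with something like `mlagents-learn ./config.yaml --run-id="SnowballTarget1"`, and the Python back-end communicates with the Unity executable through the `Communicator`.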

### Q3: Fill in the missing letters

- In Unity ML-Agents, the Policy of an Agent is called a b _ _ _ n
- The component in charge of orchestrating the agents is called the _ c _ _ _ m _

<details>
<summary>Solution</summary>
- b r a i n
- a c a d e m y
</details>

### Q4: Define, in your own words, what a `raycast` is

<details>
<summary>Solution</summary>
A raycast is (most of the time) a linear projection, like a `laser`, which aims to detect collisions with objects.
</details>

### Q5: What are the differences between capturing the environment using `frames` or `raycasts`?

<Question
  choices={[
    {
      text: "By using `frames`, the environment is defined by each of the pixels of the screen. By using `raycasts`, we only send a sample of those pixels.",
      explain: "`Raycasts` don't have anything to do with pixels. They are linear projections (lasers) that we spawn to look for collisions.",
      correct: false,
    },
    {
      text: "By using `raycasts`, the environment is defined by each of the pixels of the screen. By using `frames`, we spawn (usually) a line to check which objects it collides with",
      explain: "It's the other way around: `frames` collect pixels, `raycasts` check for collisions.",
      correct: false,
    },
    {
      text: "By using `frames`, we collect all the pixels of the screen, which define the environment. By using `raycasts`, we don't use pixels; we spawn (normally) lines and check their collisions",
      explain: "",
      correct: true,
    },
  ]}
/>

### Q6: Name several environment and agent input variables used to train the agent in the Snowball or Pyramid environments

<details>
<summary>Solution</summary>
- Collisions of the raycasts spawned from the agent detecting blocks, (invisible) walls, stones, our target, switches, etc.
- Traditional inputs describing agent features, such as its speed
- Boolean variables, such as the switch (on/off) in Pyramids or the `can I shoot?` in SnowballTarget.
</details>

Congrats on finishing this quiz 🥳! If you missed some elements, take time to read the chapter again to reinforce (😏) your knowledge.
@@ -3,7 +3,7 @@
 The best way to learn and [to avoid the illusion of competence](https://www.coursera.org/lecture/learning-how-to-learn/illusions-of-competence-BuFzf) **is to test yourself.** This will help you to find **where you need to reinforce your knowledge**.
 
 
-### Q1: What of the following interpretations of bias-variance tradeoff is the most accurate in the field of Reinforcement Learning?
+### Q1: Which of the following interpretations of bias-variance tradeoff is the most accurate in the field of Reinforcement Learning?
 
 <Question
   choices={[
@@ -20,7 +20,7 @@ The best way to learn and [to avoid the illusion of competence](https://www.cour
   ]}
 />
 
-### Q2: Which of the following statements are True, when talking about models with bias and/or variance in RL?
+### Q2: Which of the following statements are true, when talking about models with bias and/or variance in RL?
 
 <Question
   choices={[
@@ -48,29 +48,29 @@ The best way to learn and [to avoid the illusion of competence](https://www.cour
 />
 
 
-### Q3: Which of the following statements are true about Monte-carlo method?
+### Q3: Which of the following statements are true about the Monte Carlo method?
 
 <Question
   choices={[
     {
-      text: "It's a sampling mechanism, which means we don't consider analyze all the possible states, but a sample of those",
+      text: "It's a sampling mechanism, which means we don't analyze all the possible states, but a sample of those",
       explain: "",
       correct: true,
     },
     {
       text: "It's very resistant to stochasticity (random elements in the trajectory)",
-      explain: "Monte-carlo randomly estimates everytime a sample of trajectories. However, even same trajectories can have different reward values if they contain stochastic elements",
+      explain: "Monte Carlo estimates a random sample of trajectories every time. However, even the same trajectories can have different reward values if they contain stochastic elements",
       correct: false,
     },
     {
-      text: "To reduce the impact of stochastic elements in Monte-Carlo, we can take `n` strategies and average them, reducing their impact impact in case of noise",
+      text: "To reduce the impact of stochastic elements in Monte Carlo, we take `n` strategies and average them, reducing their individual impact",
       explain: "",
       correct: true,
     },
   ]}
 />
 
-### Q4: What is the Advanced Actor-Critic Method (A2C)?
+### Q4: How would you describe, with your own words, the Actor-Critic Method (A2C)?
 
 <details>
 <summary>Solution</summary>
@@ -83,12 +83,12 @@ The idea behind Actor-Critic is that we learn two function approximations:
 
 </details>
 
-### Q5: Which of the following statemets are True about the Actor-Critic Method?
+### Q5: Which of the following statements are true about the Actor-Critic Method?
 
 <Question
   choices={[
     {
-      text: "The Critic does not learn from the training process",
+      text: "The Critic does not learn any function during the training process",
       explain: "Both the Actor and the Critic function parameters are updated during training time",
       correct: false,
     },