From c3a210fbfb817ad83a2f850a761a9ea4840ca46b Mon Sep 17 00:00:00 2001
From: Daniel Regado
Date: Mon, 7 Aug 2023 18:41:30 +0100
Subject: [PATCH 1/4] Clarifying set of actions possible in Super Mario

---
 units/en/unit1/rl-framework.mdx | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/units/en/unit1/rl-framework.mdx b/units/en/unit1/rl-framework.mdx
index cf155e7..fbba374 100644
--- a/units/en/unit1/rl-framework.mdx
+++ b/units/en/unit1/rl-framework.mdx
@@ -83,11 +83,11 @@ The actions can come from a *discrete* or *continuous space*:
 Mario
-Again, in Super Mario Bros, we have only 5 possible actions: 4 directions and jumping
+In Super Mario Bros, we have only 4 possible actions: left, right, jumping and ducking
-In Super Mario Bros, we have a finite set of actions since we have only 4 directions and jump.
+Again, in Super Mario Bros, we have a finite set of actions since we have only 4 directions.
 - *Continuous space*: the number of possible actions is **infinite**.

From ec8973296ac3e7813ae0d016a88dca99cf88b8ba Mon Sep 17 00:00:00 2001
From: Daniel Regado
Date: Mon, 7 Aug 2023 18:49:53 +0100
Subject: [PATCH 2/4] Removing duplicate text directly below captions

---
 units/en/unit1/rl-framework.mdx | 4 ----
 units/en/unit1/two-methods.mdx | 2 --
 2 files changed, 6 deletions(-)

diff --git a/units/en/unit1/rl-framework.mdx b/units/en/unit1/rl-framework.mdx
index fbba374..1af2291 100644
--- a/units/en/unit1/rl-framework.mdx
+++ b/units/en/unit1/rl-framework.mdx
@@ -61,8 +61,6 @@ In a chess game, we have access to the whole board information, so we receive a
In Super Mario Bros, we only see the part of the level close to the player, so we receive an observation.
-In Super Mario Bros, we only see the part of the level close to the player, so we receive an observation.
-
 In Super Mario Bros, we are in a partially observed environment.
 We receive an observation **since we only see a part of the level.**
@@ -87,8 +85,6 @@ The actions can come from a *discrete* or *continuous space*:
-Again, in Super Mario Bros, we have a finite set of actions since we have only 4 directions.
-
 - *Continuous space*: the number of possible actions is **infinite**.
diff --git a/units/en/unit1/two-methods.mdx b/units/en/unit1/two-methods.mdx
index fcfc04a..34ddab8 100644
--- a/units/en/unit1/two-methods.mdx
+++ b/units/en/unit1/two-methods.mdx
@@ -82,8 +82,6 @@ Here we see that our value function **defined values for each possible state.**
Thanks to our value function, at each step our policy will select the state with the biggest value defined by the value function: -7, then -6, then -5 (and so on) to attain the goal.
-Thanks to our value function, at each step our policy will select the state with the biggest value defined by the value function: -7, then -6, then -5 (and so on) to attain the goal.
-
 If we recap:
 Vbm recap

From 71ef586129a86c96de931a231867f02a6a43fcb2 Mon Sep 17 00:00:00 2001
From: Daniel Regado
Date: Tue, 8 Aug 2023 11:14:26 +0100
Subject: [PATCH 3/4] Revert "Removing duplicate text directly below captions"

This reverts commit ec8973296ac3e7813ae0d016a88dca99cf88b8ba.

---
 units/en/unit1/rl-framework.mdx | 4 ++++
 units/en/unit1/two-methods.mdx | 2 ++
 2 files changed, 6 insertions(+)

diff --git a/units/en/unit1/rl-framework.mdx b/units/en/unit1/rl-framework.mdx
index 1af2291..fbba374 100644
--- a/units/en/unit1/rl-framework.mdx
+++ b/units/en/unit1/rl-framework.mdx
@@ -61,6 +61,8 @@ In a chess game, we have access to the whole board information, so we receive a
In Super Mario Bros, we only see the part of the level close to the player, so we receive an observation.
+In Super Mario Bros, we only see the part of the level close to the player, so we receive an observation.
+
 In Super Mario Bros, we are in a partially observed environment.
 We receive an observation **since we only see a part of the level.**
@@ -85,6 +87,8 @@ The actions can come from a *discrete* or *continuous space*:
+Again, in Super Mario Bros, we have a finite set of actions since we have only 4 directions.
+
 - *Continuous space*: the number of possible actions is **infinite**.
diff --git a/units/en/unit1/two-methods.mdx b/units/en/unit1/two-methods.mdx
index 34ddab8..fcfc04a 100644
--- a/units/en/unit1/two-methods.mdx
+++ b/units/en/unit1/two-methods.mdx
@@ -82,6 +82,8 @@ Here we see that our value function **defined values for each possible state.**
Thanks to our value function, at each step our policy will select the state with the biggest value defined by the value function: -7, then -6, then -5 (and so on) to attain the goal.
+Thanks to our value function, at each step our policy will select the state with the biggest value defined by the value function: -7, then -6, then -5 (and so on) to attain the goal.
+
 If we recap:
 Vbm recap

From 5130cf126e733a3724fa3eac100495a16f4a193a Mon Sep 17 00:00:00 2001
From: Daniel Regado
Date: Tue, 8 Aug 2023 11:23:59 +0100
Subject: [PATCH 4/4] Rephrased possible actions in Super Mario

---
 units/en/unit1/rl-framework.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/units/en/unit1/rl-framework.mdx b/units/en/unit1/rl-framework.mdx
index fbba374..9745357 100644
--- a/units/en/unit1/rl-framework.mdx
+++ b/units/en/unit1/rl-framework.mdx
@@ -83,7 +83,7 @@ The actions can come from a *discrete* or *continuous space*:
 Mario
-In Super Mario Bros, we have only 4 possible actions: left, right, jumping and ducking
+In Super Mario Bros, we have only 4 possible actions: left, right, up (jumping) and down (crouching).