From 162110aba93891d4753f834e4801db5b45ad3c9e Mon Sep 17 00:00:00 2001
From: Lutz von der Burchard <61054407+lutzvdb@users.noreply.github.com>
Date: Thu, 28 Dec 2023 11:00:33 +0100
Subject: [PATCH] Added clarification to the meaning of the rows of the Q-table

---
 units/en/unit2/q-learning.mdx | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/units/en/unit2/q-learning.mdx b/units/en/unit2/q-learning.mdx
index 5f46722..1ff8456 100644
--- a/units/en/unit2/q-learning.mdx
+++ b/units/en/unit2/q-learning.mdx
@@ -27,7 +27,8 @@ Let's go through an example of a maze.
 Maze example
-The Q-table is initialized. That's why all values are = 0. This table **contains, for each state and action, the corresponding state-action values.**
+The Q-table is initialized. That's why all values are = 0. This table **contains, for each state and action, the corresponding state-action values.**
+For this simple example, the state is only defined by the position of the mouse. Therefore, we have 2*3 rows in our Q-table, one row for each possible position of the mouse. In more complex scenarios, the state could contain more information than the position of the actor.
 Maze example
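
To make the row count in the added sentence concrete, here is a minimal Python sketch. It is not code from the course. It assumes the 2*3 maze is indexed row-major and uses a hypothetical set of four movement actions, since the patch text does not state the action set. The names `state_index`, `N_ROWS`, `N_COLS`, and `ACTIONS` are illustrative only.

```python
# Assumption: the maze is a 2x3 grid, so the mouse has 6 possible positions.
N_ROWS, N_COLS = 2, 3

# Assumption: four movement actions; the patch does not name the action set.
ACTIONS = ["left", "right", "up", "down"]


def state_index(row, col):
    """Map a (row, col) mouse position to its row in the Q-table (row-major)."""
    return row * N_COLS + col


# One Q-table row per state (mouse position), one column per action.
# Every state-action value starts at 0, matching "all values are = 0".
q_table = [[0.0 for _ in ACTIONS] for _ in range(N_ROWS * N_COLS)]

print(len(q_table), "states x", len(ACTIONS), "actions")
print("Q-values for position (1, 2):", q_table[state_index(1, 2)])
```

If the state carried more than the mouse position, as the last added sentence allows, the number of rows would grow with the number of distinct states, while the number of columns would still equal the number of actions.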