From 3bdc44cd354cc2434caab99a41475c586ff3c7d9 Mon Sep 17 00:00:00 2001
From: Thomas Simonini
Date: Tue, 20 Dec 2022 14:05:29 +0100
Subject: [PATCH] Update bellman-equation.mdx

---
 units/en/unit2/bellman-equation.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/units/en/unit2/bellman-equation.mdx b/units/en/unit2/bellman-equation.mdx
index 6979d23..f8f99f7 100644
--- a/units/en/unit2/bellman-equation.mdx
+++ b/units/en/unit2/bellman-equation.mdx
@@ -58,6 +58,6 @@ But you'll study an example with gamma = 0.99 in the Q-Learning section of this
 
 
-To recap, the idea of the Bellman equation is that instead of calculating each value as the sum of the expected return, **which is a long process.** This is equivalent **to the sum of immediate reward + the discounted value of the state that follows.**
+To recap, the idea of the Bellman equation is that instead of calculating each value as the sum of the expected return, **which is a long process**, we calculate the value as **the sum of the immediate reward and the discounted value of the state that follows.**
 
 Before going to the next section, think about the role of gamma in the Bellman equation. What happens if the value of gamma is very low (e.g. 0.1 or even 0)? What happens if the value is 1? What happens if the value is very high, such as a million?
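The sentence this patch rewrites, and the closing question about gamma, can be illustrated with a small sketch (not part of the patch itself; the reward sequence and function name are made up for illustration). For a fixed trajectory of rewards, the Bellman idea is that V(s_t) = r_{t+1} + gamma * V(s_{t+1}), computed backwards from the end instead of summing the whole discounted return for every state:

```python
def state_values(rewards, gamma):
    """Value of each state along a fixed trajectory of rewards,
    computed with the Bellman backup rather than a full return sum."""
    values = [0.0] * (len(rewards) + 1)  # terminal state has value 0
    # Work backwards: value = immediate reward + discounted value of the next state.
    for t in reversed(range(len(rewards))):
        values[t] = rewards[t] + gamma * values[t + 1]
    return values[:-1]

rewards = [1.0, 1.0, 1.0]
print(state_values(rewards, 0.0))  # gamma = 0: only the immediate reward counts -> [1.0, 1.0, 1.0]
print(state_values(rewards, 1.0))  # gamma = 1: undiscounted sum of all future rewards -> [3.0, 2.0, 1.0]
print(state_values(rewards, 0.5))  # intermediate gamma weights near rewards more -> [1.75, 1.5, 1.0]
```

With gamma near 0 the agent is myopic (far-future rewards vanish); with gamma = 1 every future reward counts equally, which can diverge on infinite horizons; a gamma greater than 1 (e.g. a million) makes values explode, which is why gamma is kept in [0, 1].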