From 87dde1584e34f48a839c7033b3ad059913d3133f Mon Sep 17 00:00:00 2001
From: Thomas Simonini
Date: Sat, 3 Dec 2022 11:13:33 +0100
Subject: [PATCH] Update units/en/unit2/bellman-equation.mdx

Co-authored-by: Sayak Paul
---
 units/en/unit2/bellman-equation.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/units/en/unit2/bellman-equation.mdx b/units/en/unit2/bellman-equation.mdx
index be70441..27c7d35 100644
--- a/units/en/unit2/bellman-equation.mdx
+++ b/units/en/unit2/bellman-equation.mdx
@@ -42,7 +42,7 @@ If we go back to our example, the value of State 1= expected cumulative return i
 To calculate the value of State 1: the sum of rewards **if the agent started in that state 1** and then followed the **policy for all the time steps.**
 
-Which is equivalent to \\(V(S_{t})\\) = Immediate reward \\(R_{t+1}\\) + Discounted value of the next state \\(gamma * V(S_{t+1})\\)
+This is equivalent to \\(V(S_{t})\\) = Immediate reward \\(R_{t+1}\\) + Discounted value of the next state \\(gamma * V(S_{t+1})\\)
 
 Bellman equation
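
The relation in the changed line, V(S_t) = R_{t+1} + gamma * V(S_{t+1}), can be checked numerically. The following is a minimal sketch, not taken from the course material: it assumes a deterministic chain of states under a fixed policy, and the reward list, the discount factor, and both function names are hypothetical choices made only for illustration.

```python
# Hypothetical example: a deterministic chain of states under a fixed policy.
# rewards[t] stands for R_{t+1}, the reward received when leaving state t.
# These numbers are assumed for illustration only.
rewards = [-1.0, -1.0, -1.0, 10.0]
gamma = 0.9  # discount factor (assumed value)


def value_by_full_return(rewards, gamma, start):
    """Sum every discounted reward from `start` until the episode ends."""
    return sum(gamma ** k * r for k, r in enumerate(rewards[start:]))


def values_by_bellman(rewards, gamma):
    """Apply V(S_t) = R_{t+1} + gamma * V(S_{t+1}), working backward."""
    values = [0.0] * (len(rewards) + 1)  # the terminal state has value 0
    for t in reversed(range(len(rewards))):
        values[t] = rewards[t] + gamma * values[t + 1]
    return values


bellman_values = values_by_bellman(rewards, gamma)
full_return_values = [
    value_by_full_return(rewards, gamma, t) for t in range(len(rewards))
]
```

Under these assumptions, both approaches give the same value for every state up to floating-point rounding, but the recursive form reuses the value of the next state instead of re-summing the whole return each time.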