diff --git a/units/en/unit2/bellman-equation.mdx b/units/en/unit2/bellman-equation.mdx
index e819f5a..577c6bb 100644
--- a/units/en/unit2/bellman-equation.mdx
+++ b/units/en/unit2/bellman-equation.mdx
@@ -41,7 +41,7 @@ If we go back to our example, we can say that the value of State 1 is equal to t
 To calculate the value of State 1: the sum of rewards **if the agent started in that state 1** and then followed the **policy for all the time steps.**
 
-This is equivalent to \\(V(S_{t})\\) = Immediate reward \\(R_{t+1}\\) + Discounted value of the next state \\(gamma * V(S_{t+1})\\)
+This is equivalent to \\(V(S_{t})\\) = Immediate reward \\(R_{t+1}\\) + Discounted value of the next state \\(\gamma * V(S_{t+1})\\)
Bellman equation
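
The corrected line states the one-step Bellman relation \\(V(S_{t}) = R_{t+1} + \gamma V(S_{t+1})\\). As a quick sanity check of that recursion, here is a minimal Python sketch; the discount factor, reward, and next-state value below are illustrative assumptions, not numbers from the course page:

```python
# One-step Bellman backup from the corrected line:
# V(S_t) = R_{t+1} + gamma * V(S_{t+1})
# All numbers are illustrative assumptions, not values from the course.

gamma = 0.99        # discount factor (assumed)
reward_next = 1.0   # immediate reward R_{t+1} for leaving S_t (assumed)
value_next = 5.0    # current estimate of V(S_{t+1}) (assumed)

value_current = reward_next + gamma * value_next  # V(S_t)
print(value_current)  # 5.95
```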