Update units/en/unit2/bellman-equation.mdx

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2026-04-13 18:00:45 +08:00 · 2022-12-03 11:13:33 +01:00
parent 3e9e315e53
commit 87dde1584e
1 changed file with 1 addition and 1 deletion
--- a/units/en/unit2/bellman-equation.mdx
+++ b/units/en/unit2/bellman-equation.mdx
@@ -42,7 +42,7 @@ If we go back to our example, the value of State 1= expected cumulative return i

 To calculate the value of State 1: the sum of rewards **if the agent started in that state 1** and then followed the **policy for all the time steps.**

-Which is equivalent to  \\(V(S_{t})\\)  = Immediate reward  \\(R_{t+1}\\)  + Discounted value of the next state  \\(gamma * V(S_{t+1})\\)
+This is equivalent to  \\(V(S_{t})\\)  = Immediate reward  \\(R_{t+1}\\)  + Discounted value of the next state  \\(gamma * V(S_{t+1})\\)

 <img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit3/bellman6.jpg" alt="Bellman equation"/>
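
Note on the changed line: the relation it states, \\(V(S_{t})\\) = \\(R_{t+1}\\) + \\(gamma * V(S_{t+1})\\), can be checked numerically. The snippet below is a minimal sketch and is not part of the course file. It assumes a single deterministic trajectory (so no expectation is needed) and a terminal value of 0. The names `bellman_backup` and `state_values` and the sample rewards are hypothetical, chosen only for illustration.

```python
def bellman_backup(reward, next_value, gamma=0.99):
    """Immediate reward R_{t+1} plus the discounted value of the next state."""
    return reward + gamma * next_value


def state_values(rewards, gamma=0.99):
    """Value of each state along one deterministic trajectory.

    rewards[t] is the reward received after leaving state t.
    Assumption: the state after the last reward is terminal, with value 0.
    """
    values = [0.0] * (len(rewards) + 1)  # extra slot holds the terminal value
    # Work backwards so V(S_{t+1}) is known before V(S_t) is computed.
    for t in reversed(range(len(rewards))):
        values[t] = bellman_backup(rewards[t], values[t + 1], gamma)
    return values[:-1]  # drop the terminal slot


if __name__ == "__main__":
    print(state_values([1, 1, 1], gamma=0.5))
```

With rewards `[1, 1, 1]` and `gamma=0.5`, the first state's value equals the directly summed discounted return, 1 + 0.5 + 0.25, which is what the equivalence in the changed line asserts.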