mirror of
https://github.com/huggingface/deep-rl-class.git
synced 2026-04-04 02:57:58 +08:00
Update units/en/unit2/bellman-equation.mdx
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
@@ -47,7 +47,7 @@ Which is equivalent to \\(V(S_{t})\\) = Immediate reward \\(R_{t+1}\\) + Dis
 
 <img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit3/bellman6.jpg" alt="Bellman equation"/>
 
-For simplification, here we don't discount, so gamma = 1.
+In the interest of simplicity, here we don't discount, so gamma = 1.
 
 - The value of \\(V(S_{t+1}) \\) = Immediate reward \\(R_{t+2}\\) + Discounted value of the next state ( \\(gamma * V(S_{t+2})\\) ).
 - And so on.
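The recursion the lesson edits here, \\(V(S_{t})\\) = \\(R_{t+1}\\) + \\(gamma * V(S_{t+1})\\), can be sketched in a few lines of Python. This is not code from the commit, just an illustrative unrolling over a hypothetical finite chain of states, using the lesson's gamma = 1 simplification as the default:

```python
def state_values(rewards, gamma=1.0):
    """Compute V for each state of a simple terminal chain by
    unrolling the Bellman recursion V(S_t) = R_{t+1} + gamma * V(S_{t+1}).

    rewards[t] is the immediate reward R_{t+1} collected on leaving state t.
    The chain ends in a terminal state whose value is 0.
    """
    values = [0.0] * (len(rewards) + 1)  # last entry: terminal state, V = 0
    # Work backwards from the terminal state, one Bellman backup per state.
    for t in reversed(range(len(rewards))):
        values[t] = rewards[t] + gamma * values[t + 1]
    return values

# With gamma = 1 (no discounting, as in the lesson), each state's value is
# simply the sum of the remaining rewards along the chain.
print(state_values([1, 1, 1, 1]))  # [4.0, 3.0, 2.0, 1.0, 0.0]
```

With gamma < 1 the same loop discounts the next state's value, which is exactly the term the diff's wording change describes.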