mirror of
https://github.com/huggingface/deep-rl-class.git
synced 2026-02-08 12:54:32 +08:00
Merge pull request #134 from ankandrew/minor-bold-fix
Fix minor bold text issue
@@ -18,7 +18,7 @@ Then, to calculate the \\(V(S_{t+1})\\), we need to calculate the return startin
<figure>
<img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit3/bellman3.jpg" alt="Bellman equation"/>
-<figcaption>To calculate the value of State 2: the sum of rewards **if the agent started in that state, and then followed the **policy for all the time steps.</figcaption>
+<figcaption>To calculate the value of State 2: the sum of rewards **if the agent started in that state**, and then followed the **policy for all the time steps.**</figcaption>
</figure>
So you may have noticed, we're repeating the computation of the value of different states, which can be tedious if you need to do it for each state value or state-action value.
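
To make that repetition concrete, here is a minimal sketch, not taken from the course itself. It assumes a short deterministic episode with made-up rewards and no discounting (`gamma = 1.0`); the names `rewards`, `value_by_full_sum`, and `values_by_bellman` are hypothetical. The first function re-sums every future reward for each state, while the second reuses the value of the next state, \\(V(S_t) = R_{t+1} + \gamma V(S_{t+1})\\).

```python
# Hypothetical rewards received after leaving each state of a short,
# deterministic episode (illustrative values, not from the course).
rewards = [-1.0, -1.0, -1.0, 10.0]
gamma = 1.0  # assumption: no discounting, to keep the arithmetic simple


def value_by_full_sum(t):
    """Sum every discounted reward from time step t to the end of the episode."""
    return sum(gamma ** k * r for k, r in enumerate(rewards[t:]))


def values_by_bellman():
    """Work backwards, reusing V(S_{t+1}) instead of re-summing the rewards."""
    values = [0.0] * (len(rewards) + 1)  # the terminal state has value 0
    for t in reversed(range(len(rewards))):
        values[t] = rewards[t] + gamma * values[t + 1]
    return values[:-1]  # drop the terminal state


# Repeated computation: each state re-sums the whole remaining trajectory.
naive = [value_by_full_sum(t) for t in range(len(rewards))]
# Recursive computation: each state needs one addition and one multiplication.
recursive = values_by_bellman()

print(naive)
print(recursive)
```

Both lists hold the same values; the difference is that the recursive version touches each reward once, whereas the full-sum version revisits the same rewards for every earlier state.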