From 0c3616c03ffcf8735a59ec495f08db0c73540c42 Mon Sep 17 00:00:00 2001
From: Artagon
Date: Fri, 16 Dec 2022 20:34:24 +0100
Subject: [PATCH] Replace ** by tags in figcaption

---
 units/en/unit2/bellman-equation.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/units/en/unit2/bellman-equation.mdx b/units/en/unit2/bellman-equation.mdx
index 99d753a..6979d23 100644
--- a/units/en/unit2/bellman-equation.mdx
+++ b/units/en/unit2/bellman-equation.mdx
@@ -18,7 +18,7 @@ Then, to calculate the \\(V(S_{t+1})\\), we need to calculate the return startin
 <img src="..." alt="Bellman equation"/>
-<figcaption>To calculate the value of State 2: the sum of rewards **if the agent started in that state**, and then followed the **policy for all the time steps.**</figcaption>
+<figcaption>To calculate the value of State 2: the sum of rewards <b>if the agent started in that state</b>, and then followed the <b>policy for all the time steps.</b></figcaption>
 So you may have noticed, we're repeating the computation of the value of different states, which can be tedious if you need to do it for each state value or state-action value.
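
Note (not part of the patch itself): the caption and the context line above describe the idea the Bellman equation captures, so a minimal sketch may help reviewers. It contrasts recomputing the full return from every state with the Bellman recursion V(s) = R(s) + γ·V(next(s)), which reuses already-computed values. The 4-state chain, its rewards, and the deterministic "always move right" policy are invented for illustration only.

```python
# Hypothetical 4-state chain MDP, used only to illustrate the recursion.
gamma = 0.9
rewards = {1: 0.0, 2: 1.0, 3: 2.0, 4: 10.0}   # invented reward received in each state
next_state = {1: 2, 2: 3, 3: 4, 4: None}       # deterministic policy: always move right

def v_naive(s):
    """Sum the discounted rewards from s to the end (recomputes every suffix)."""
    total, discount = 0.0, 1.0
    while s is not None:
        total += discount * rewards[s]
        discount *= gamma
        s = next_state[s]
    return total

def v_bellman(s, cache={}):
    """Bellman recursion: V(s) = R(s) + gamma * V(next(s)), caching each V(s)."""
    if s is None:
        return 0.0
    if s not in cache:
        cache[s] = rewards[s] + gamma * v_bellman(next_state[s])
    return cache[s]

# Both agree, e.g. V(1) = 0 + 0.9*1 + 0.81*2 + 0.729*10 = 9.81,
# but v_bellman computes each state's value once instead of re-summing suffixes.
assert abs(v_naive(1) - v_bellman(1)) < 1e-9
```

This is exactly the tedium the context line points at: without the recursion, the return must be re-summed from scratch for every state.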