From beaef9b0a44cb5a5b618d1c4a5668fd7143eec06 Mon Sep 17 00:00:00 2001
From: Thomas Simonini
Date: Tue, 20 Dec 2022 14:02:46 +0100
Subject: [PATCH] Update two-types-value-based-methods.mdx

---
 units/en/unit2/two-types-value-based-methods.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/units/en/unit2/two-types-value-based-methods.mdx b/units/en/unit2/two-types-value-based-methods.mdx
index 3422e7d..df83311 100644
--- a/units/en/unit2/two-types-value-based-methods.mdx
+++ b/units/en/unit2/two-types-value-based-methods.mdx
@@ -36,7 +36,7 @@ Consequently, whatever method you use to solve your problem, **you will have a
 So the difference is:
 
 - In policy-based, **the optimal policy (denoted π\*) is found by training the policy directly.**
-- In value-based, **finding an optimal value function (denoted Q\* or V\*, we'll study the difference after) in our leads to having an optimal policy.**
+- In value-based, **finding an optimal value function (denoted Q\* or V\*, we'll study the difference after) leads to having an optimal policy.**
 
 Link between value and policy