mirror of
https://github.com/huggingface/deep-rl-class.git
synced 2026-04-09 13:50:23 +08:00
Update two-types-value-based-methods.mdx
@@ -36,7 +36,7 @@ Consequently, whatever method you use to solve your problem, **you will have a

 So the difference is:

 - In policy-based, **the optimal policy (denoted π\*) is found by training the policy directly.**

-- In value-based, **finding an optimal value function (denoted Q\* or V\*, we'll study the difference after) in our leads to having an optimal policy.**
+- In value-based, **finding an optimal value function (denoted Q\* or V\*, we'll study the difference after) leads to having an optimal policy.**

 <img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit3/link-value-policy.jpg" alt="Link between value and policy"/>
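The sentence being corrected above states that an optimal value function leads to an optimal policy. A minimal sketch of that idea, not taken from the course repo: given an optimal action-value function Q\*(s, a), the optimal policy acts greedily with respect to it. The Q-table, state names, and action names below are invented for illustration.

```python
def greedy_policy(q_table, state):
    """Return the action with the highest Q-value in the given state.

    Acting greedily w.r.t. an optimal Q* yields an optimal policy pi*.
    """
    actions = q_table[state]
    return max(actions, key=actions.get)


# Hypothetical 2-state, 2-action Q-table (values are made up).
q_table = {
    "s0": {"left": 0.1, "right": 0.7},
    "s1": {"left": 0.9, "right": 0.4},
}

print(greedy_policy(q_table, "s0"))  # right
print(greedy_policy(q_table, "s1"))  # left
```

This is the link between value and policy that the image above depicts: the value function is learned, and the policy is read off from it rather than trained directly.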