Minor typo fix

2026-06-14 22:17:15 +08:00 · 2023-01-12 15:53:05 -05:00
parent d65108046a
commit 704bd156e0
1 changed file with 2 additions and 2 deletions
--- a/units/en/unit4/what-are-policy-based-methods.mdx
+++ b/units/en/unit4/what-are-policy-based-methods.mdx
@@ -37,6 +37,6 @@ Policy-gradient methods, what we're going to study in this unit, is a subclass o
 The difference between these two methods **lies on how we optimize the parameter** \\(\theta\\):

 - In *policy-based methods*, we search directly for the optimal policy. We can optimize the parameter \\(\theta\\) **indirectly** by maximizing the local approximation of the objective function with techniques like hill climbing, simulated annealing, or evolution strategies.
- In *policy-gradient methods*, because we're a subclass of the policy-based methods, we search directly for the optimal policy. But we optimize the parameter \\(\theta\\) **directly** by performing the gradient ascent on the performance of the objective function \\(J(\theta)\\).
+- In *policy-gradient methods*, because it is a subclass of the policy-based methods, we search directly for the optimal policy. But we optimize the parameter \\(\theta\\) **directly** by performing the gradient ascent on the performance of the objective function \\(J(\theta)\\).

-Before diving more into how works policy-gradient methods (the objective function, policy gradient theorem, gradient ascent, etc.), let's study the advantages and disadvantages of policy-based methods.
+Before diving more into how policy-gradient methods work (the objective function, policy gradient theorem, gradient ascent, etc.), let's study the advantages and disadvantages of policy-based methods.