diff --git a/units/en/unit2/q-learning.mdx b/units/en/unit2/q-learning.mdx
index 48f01d2..e78a598 100644
--- a/units/en/unit2/q-learning.mdx
+++ b/units/en/unit2/q-learning.mdx
@@ -144,7 +144,7 @@
 Is different from the policy we use during the training part:

 - *On-policy:* using the **same policy for acting and updating.**

-For instance, with Sarsa, another value-based algorithm, **the epsilon-greedy Policy selects the next state-action pair, not a greedy policy.**
+For instance, with Sarsa, another value-based algorithm, **the epsilon-greedy policy selects the next state-action pair, not a greedy policy.**