diff --git a/units/en/unit6/advantage-actor-critic.mdx b/units/en/unit6/advantage-actor-critic.mdx
index 8b7863c..64f07fc 100644
--- a/units/en/unit6/advantage-actor-critic.mdx
+++ b/units/en/unit6/advantage-actor-critic.mdx
@@ -16,7 +16,7 @@ On the other hand, your friend (Critic) will also update their way to provide fe
 
 This is the idea behind Actor-Critic. We learn two function approximations:
 
-- *A policy* that **controls how our agent acts**: \\( \pi_{\theta}(s,a) \\)
+- *A policy* that **controls how our agent acts**: \\( \pi_{\theta}(s) \\)
 
 - *A value function* to assist the policy update by measuring how good the action taken is: \\( \hat{q}_{w}(s,a) \\)
 
@@ -24,7 +24,7 @@ This is the idea behind Actor-Critic. We learn two function approximations:
 Now that we have seen the Actor Critic's big picture, let's dive deeper to understand how Actor and Critic improve together during the training.
 
 As we saw, with Actor-Critic methods, there are two function approximations (two neural networks):
-- *Actor*, a **policy function** parameterized by theta: \\( \pi_{\theta}(s,a) \\)
+- *Actor*, a **policy function** parameterized by theta: \\( \pi_{\theta}(s) \\)
 - *Critic*, a **value function** parameterized by w: \\( \hat{q}_{w}(s,a) \\)
 
 Let's see the training process to understand how Actor and Critic are optimized: