From afb42f18bd30938e00bf6f3ba4bb9a32b277890b Mon Sep 17 00:00:00 2001
From: dylwil3 <53534755+dylwil3@users.noreply.github.com>
Date: Tue, 2 May 2023 08:39:07 -0500
Subject: [PATCH] requested change

---
 units/en/unit4/introduction.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/units/en/unit4/introduction.mdx b/units/en/unit4/introduction.mdx
index 6dc4998..c087059 100644
--- a/units/en/unit4/introduction.mdx
+++ b/units/en/unit4/introduction.mdx
@@ -8,7 +8,7 @@ Since the beginning of the course, we have only studied value-based methods, **
 
 Link value policy
 
-In value-based methods, the policy ** \\(π\\) is determined by the action value estimates by a function** (for instance, the greedy-policy, which selects the action with the highest value given a state).
+In value-based methods, the policy ** \(π\) only exists because of the action value estimates since the policy is just a function** (for instance, greedy-policy) that will select the action with the highest value given a state.
 
 With policy-based methods, we want to optimize the policy directly **without having an intermediate step of learning a value function.**
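
For context on the line being changed: it describes how, in value-based methods, the policy is derived from the action-value estimates rather than learned directly. A minimal sketch of the greedy policy it mentions (the function and Q-table here are illustrative, not from the course code):

```python
import numpy as np

def greedy_policy(q_values: np.ndarray, state: int) -> int:
    """Derive the action from the action-value estimates:
    pick the action with the highest estimated value in `state`."""
    return int(np.argmax(q_values[state]))

# Toy Q-table: 2 states x 3 actions of action-value estimates.
q = np.array([[0.1, 0.5, 0.2],
              [0.9, 0.3, 0.4]])

print(greedy_policy(q, 0))  # action 1 has the highest value in state 0
```

This is exactly the "intermediate step" the surrounding text contrasts with policy-based methods, where the policy is optimized directly instead of being read off a value function.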