From fc66ea7e4aa2b53c761367d55154566477a98c17 Mon Sep 17 00:00:00 2001
From: Artagon
Date: Sat, 17 Dec 2022 22:33:02 +0100
Subject: [PATCH] Rephrasing for initial epsilon value

---
 units/en/unit2/q-learning.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/units/en/unit2/q-learning.mdx b/units/en/unit2/q-learning.mdx
index e78a598..2dd7190 100644
--- a/units/en/unit2/q-learning.mdx
+++ b/units/en/unit2/q-learning.mdx
@@ -80,7 +80,7 @@ We need to initialize the Q-table for each state-action pair. **Most of the tim

 Epsilon greedy strategy is a policy that handles the exploration/exploitation trade-off.

-The idea is that we define epsilon ɛ ≤ 1.0:
+The idea is that we define the initial epsilon ɛ = 1.0:

 - *With probability 1 — ɛ* : we do **exploitation** (aka our agent selects the action with the highest state-action pair value).
 - With probability ɛ: **we do exploration** (trying random action).
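
The behavior this patch clarifies can be sketched in a few lines. This is an illustrative implementation only, not code from the patched course unit; the function name `epsilon_greedy_action` and the toy Q-table are hypothetical:

```python
import random

def epsilon_greedy_action(q_table, state, epsilon, n_actions):
    """Epsilon-greedy selection: explore with probability epsilon,
    otherwise exploit the action with the highest Q-value."""
    if random.random() < epsilon:
        # Exploration: pick a uniformly random action
        return random.randrange(n_actions)
    # Exploitation: pick the action with the highest state-action value
    return max(range(n_actions), key=lambda a: q_table[state][a])

# Hypothetical Q-table with 2 states and 3 actions
q_table = {0: [0.1, 0.5, 0.2], 1: [0.0, 0.0, 0.0]}

# With the initial epsilon = 1.0 (as the patched wording says),
# the agent always explores at the start of training; epsilon is
# then decayed so exploitation gradually takes over.
action = epsilon_greedy_action(q_table, state=0, epsilon=1.0, n_actions=3)
```

With epsilon = 1.0 every call explores; with epsilon = 0.0 the call above would always return the greedy action (index 1 for state 0 in this toy table).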