Merge pull request #575 from carschandler/patch-1

Confusing wording in self-play.mdx
This commit is contained in:
Thomas Simonini
2024-12-20 11:54:32 +01:00
committed by GitHub


@@ -37,7 +37,7 @@ The theory behind self-play is not something new. It was already used by Arthur
Self-Play is integrated into the MLAgents library and is managed by multiple hyperparameters that we're going to study. But the main focus, as explained in the documentation, is the **tradeoff between the skill level and generality of the final policy and the stability of learning**.
- Training against a set of slowly changing or unchanging adversaries with low diversity **results in more stable training. But a risk to overfit if the change is too slow.**
+ Training against a set of slowly changing or unchanging adversaries with low diversity **results in more stable training. But there is a risk of overfitting if the change is too slow.**
So we need to control:
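In ML-Agents, these controls live under the `self_play` section of the trainer configuration file. A minimal sketch of such a config follows; the behavior name and the values are illustrative only, not taken from this PR:

```yaml
behaviors:
  SoccerTwos:            # illustrative behavior name
    trainer_type: poca
    self_play:
      save_steps: 50000  # steps between snapshots of the current policy
      team_change: 200000  # steps before switching the learning team
      swap_steps: 2000   # steps between swapping the opponent's policy
      window: 10         # size of the pool of past snapshots to sample opponents from
      play_against_latest_model_ratio: 0.5  # fraction of games vs. the latest policy
      initial_elo: 1200.0  # starting ELO rating used to track progress
```

Larger `window` and more frequent swaps increase opponent diversity (more general policy, less stable training); smaller values give the slowly changing, low-diversity adversaries described above.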