Merge pull request #384 from mrvincenzo/updates/AddMcAndTdToGlossary

Add MC and TD to Unit2 glossary
2026-06-16 15:07:58 +08:00 · 2023-08-17 10:00:27 +02:00
parent f1198dab15 bc9a54adcf
commit 1961ee99c8
1 changed files with 6 additions and 0 deletions
--- a/units/en/unit2/glossary.mdx
+++ b/units/en/unit2/glossary.mdx
@@ -32,6 +32,12 @@ This is a community-created glossary. Contributions are welcomed!
 - **Off-policy algorithms:** A different policy is used at training time and inference time
 - **On-policy algorithms:** The same policy is used during training and inference

+### Monte Carlo and Temporal Difference learning strategies
+
+- **Monte Carlo (MC):** Learning at the end of the episode. With Monte Carlo, we wait until the episode ends and then update the value function (or policy function) using the return observed over the complete episode.
+
+- **Temporal Difference (TD):** Learning at each step. With Temporal Difference learning, we update the value function (or policy function) at each step, using the immediate reward and the current value estimate of the next state, without waiting for the episode to end.
+
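The difference between the two strategies can be illustrated with a minimal sketch. It is not part of the glossary change itself. The three-state episode, the learning rate `ALPHA`, the discount `GAMMA`, and the function names below are all hypothetical, chosen only to show when each update happens and which target it uses. The TD variant shown is the one-step form, often called TD(0).

```python
# Hypothetical toy episode: (state, reward received on leaving that state).
# The episode ends after the last entry.
episode = [("A", 0.0), ("B", 0.0), ("C", 1.0)]

ALPHA = 0.5  # learning rate (assumed value)
GAMMA = 1.0  # discount factor (assumed value)


def monte_carlo_update(values, episode, alpha=ALPHA, gamma=GAMMA):
    """Update only once the episode is over, using the full return G_t."""
    values = dict(values)  # work on a copy
    g = 0.0
    # Walk backwards so the return can be accumulated incrementally.
    for state, reward in reversed(episode):
        g = reward + gamma * g
        values[state] += alpha * (g - values[state])
    return values


def td0_update(values, episode, alpha=ALPHA, gamma=GAMMA):
    """Update at every step, bootstrapping from the next state's estimate.

    A recorded episode is replayed here for convenience, but each update
    only needs the current transition, so it could run while acting.
    """
    values = dict(values)  # work on a copy
    for t, (state, reward) in enumerate(episode):
        is_last = t + 1 == len(episode)
        next_value = 0.0 if is_last else values[episode[t + 1][0]]
        target = reward + gamma * next_value  # one-step TD target
        values[state] += alpha * (target - values[state])
    return values


initial = {"A": 0.0, "B": 0.0, "C": 0.0}
mc_values = monte_carlo_update(initial, episode)
td_values = td0_update(initial, episode)
print("MC:", mc_values)
print("TD:", td_values)
```

After this single episode, the Monte Carlo update has moved all three state values toward the final return, while the TD update has only changed the value of `C`; the reward would propagate back to `B` and `A` over further episodes.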
 If you want to improve the course, you can [open a Pull Request.](https://github.com/huggingface/deep-rl-class/pulls)

 This glossary was made possible thanks to: