Update glossary.mdx

Thomas Simonini
2022-12-20 13:06:31 +01:00
committed by GitHub
parent c275b13ddf
commit a37804cebf


@@ -1,4 +1,7 @@
# Glossary
# Glossary [[glossary]]
This is a community-created glossary. Contributions are welcome!
### Strategies to find the optimal policy
@@ -9,3 +12,10 @@
- **The state-value function.** For each state, the state-value function is the expected return if the agent starts in that state and follows the policy until the end.
- **The action-value function.** In contrast to the state-value function, the action-value function gives, for each state-action pair, the expected return if the agent starts in that state, takes that action, and then follows the policy forever after.
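
The difference between the two value functions can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not part of the course material: the three-state chain environment, the `step` and `policy` helpers, the reward of 1 on reaching the terminal state, and the discount factor `GAMMA`. Because this toy environment and policy are deterministic, the expected return reduces to the return of a single rollout.

```python
GAMMA = 0.9      # assumed discount factor
TERMINAL = 2     # assumed terminal state of a 3-state chain: 0 - 1 - 2


def step(state, action):
    """Hypothetical deterministic environment: returns (next_state, reward)."""
    next_state = max(0, state - 1) if action == "left" else state + 1
    reward = 1.0 if next_state == TERMINAL else 0.0
    return next_state, reward


def policy(state):
    """Hypothetical fixed policy: always move right."""
    return "right"


def state_value(state):
    """Return obtained when starting in `state` and following the policy until the end."""
    total, discount = 0.0, 1.0
    while state != TERMINAL:
        state, reward = step(state, policy(state))
        total += discount * reward
        discount *= GAMMA
    return total


def action_value(state, action):
    """Return obtained when starting in `state`, taking `action` first, then following the policy."""
    next_state, reward = step(state, action)
    return reward + GAMMA * state_value(next_state)


if __name__ == "__main__":
    print("V(0) =", state_value(0))
    print("Q(0, right) =", action_value(0, "right"))
    print("Q(0, left) =", action_value(0, "left"))
```

In this sketch, `action_value(0, "right")` matches `state_value(0)` because the policy itself picks `"right"`, while `action_value(0, "left")` is lower because the first action deviates from the policy before the policy takes over again.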
If you want to improve the course, you can [open a Pull Request](https://github.com/huggingface/deep-rl-class/pulls).
This glossary was made possible thanks to:
- [Ramón Rueda](https://github.com/ramon-rd)