mirror of
https://github.com/huggingface/deep-rl-class.git
synced 2026-03-31 17:21:01 +08:00
21 lines
886 B
Plaintext
# Additional Readings
These are **optional readings** if you want to go deeper.
## Introduction to Policy Optimization
- [Part 3: Intro to Policy Optimization - Spinning Up documentation](https://spinningup.openai.com/en/latest/spinningup/rl_intro3.html)
## Policy Gradient
- [https://johnwlambert.github.io/policy-gradients/](https://johnwlambert.github.io/policy-gradients/)
- [RL - Policy Gradient Explained](https://jonathan-hui.medium.com/rl-policy-gradients-explained-9b13b688b146)
- [Chapter 13, Policy Gradient Methods; Reinforcement Learning: An Introduction by Richard S. Sutton and Andrew G. Barto](http://incompleteideas.net/book/RLbook2020.pdf)
## Implementation
- [PyTorch REINFORCE implementation](https://github.com/pytorch/examples/blob/main/reinforcement_learning/reinforce.py)
- [Implementations from DDPG to PPO](https://github.com/MrSyee/pg-is-all-you-need)