# Additional Readings [[additional-readings]]
These are **optional readings** if you want to go deeper.
## PPO Explained
- [Towards Delivering a Coherent Self-Contained Explanation of Proximal Policy Optimization by Daniel Bick](https://fse.studenttheses.ub.rug.nl/25709/1/mAI_2021_BickD.pdf)
- [What is the way to understand Proximal Policy Optimization Algorithm in RL?](https://stackoverflow.com/questions/46422845/what-is-the-way-to-understand-proximal-policy-optimization-algorithm-in-rl)
- [Foundations of Deep RL Series, L4 TRPO and PPO by Pieter Abbeel](https://youtu.be/KjWF8VIMGiY)
- [OpenAI PPO Blogpost](https://openai.com/blog/openai-baselines-ppo/)
- [Spinning Up RL PPO](https://spinningup.openai.com/en/latest/algorithms/ppo.html)
- [Paper Proximal Policy Optimization Algorithms](https://arxiv.org/abs/1707.06347)
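
As a quick companion to these explanations, here is a minimal sketch of PPO's clipped surrogate objective in plain Python (the function name and toy inputs are illustrative, not taken from any of the linked resources):

```python
import math

def ppo_clip_loss(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate loss over a batch of log-probs and advantages."""
    losses = []
    for nlp, olp, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio r_t = pi_theta(a|s) / pi_theta_old(a|s)
        ratio = math.exp(nlp - olp)
        unclipped = ratio * adv
        # Clip the ratio to [1 - eps, 1 + eps] before weighting the advantage
        clipped = max(min(ratio, 1 + clip_eps), 1 - clip_eps) * adv
        # PPO takes the pessimistic (minimum) of the two surrogates;
        # negate because we minimize the loss to maximize the objective
        losses.append(-min(unclipped, clipped))
    return sum(losses) / len(losses)
```

When the new and old policies agree (ratio = 1), the loss reduces to the vanilla policy-gradient surrogate; the clipping only kicks in once the ratio leaves the trust interval.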
## PPO Implementation Details
- [The 37 Implementation Details of Proximal Policy Optimization](https://iclr-blog-track.github.io/2022/03/25/ppo-implementation-details/)
- [Part 1 of 3 — Proximal Policy Optimization Implementation: 11 Core Implementation Details](https://www.youtube.com/watch?v=MEt6rrxH8W4)
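
One recurring implementation detail covered in these resources is Generalized Advantage Estimation (GAE), computed by scanning the rollout backwards. A minimal sketch, assuming per-step lists of rewards, value estimates, and done flags (all names here are illustrative):

```python
def compute_gae(rewards, values, last_value, dones, gamma=0.99, lam=0.95):
    """GAE advantages for one rollout, computed with a backward recursion."""
    advantages = [0.0] * len(rewards)
    gae = 0.0
    for t in reversed(range(len(rewards))):
        # Bootstrap from last_value at the end of the rollout
        next_value = last_value if t == len(rewards) - 1 else values[t + 1]
        # Mask the bootstrap when the episode terminated at step t
        next_nonterminal = 1.0 - dones[t]
        # TD error: delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)
        delta = rewards[t] + gamma * next_value * next_nonterminal - values[t]
        # A_t = delta_t + gamma * lambda * A_{t+1}
        gae = delta + gamma * lam * next_nonterminal * gae
        advantages[t] = gae
    return advantages
```

Setting `lam=1.0` recovers Monte Carlo returns minus the value baseline; `lam=0.0` recovers the one-step TD error.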
## Importance Sampling
- [Importance Sampling Explained](https://youtu.be/C3p2wI4RAi8)
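
The core idea behind importance sampling: to estimate an expectation under a target distribution p using samples drawn from a proposal distribution q, reweight each sample by the ratio p(x)/q(x). A minimal sketch with a toy check (all names are illustrative):

```python
import random

def importance_sampling_estimate(samples, target_pdf, proposal_pdf, f):
    """E_p[f(x)] ~= (1/N) * sum_i f(x_i) * p(x_i) / q(x_i), with x_i drawn from q."""
    total = 0.0
    for x in samples:
        weight = target_pdf(x) / proposal_pdf(x)
        total += weight * f(x)
    return total / len(samples)

# Toy check: estimate E_p[x] for the target p(x) = 2x on [0, 1]
# (true value 2/3) using samples from the uniform proposal q(x) = 1.
random.seed(0)
samples = [random.random() for _ in range(100_000)]
estimate = importance_sampling_estimate(
    samples,
    target_pdf=lambda x: 2 * x,
    proposal_pdf=lambda x: 1.0,
    f=lambda x: x,
)
```

This is the same ratio-reweighting that lets PPO evaluate the new policy's objective on trajectories collected by the old policy.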