From 143f169a654875f52044d409c4e2c13f726139d3 Mon Sep 17 00:00:00 2001
From: simoninithomas
Date: Fri, 30 Dec 2022 19:05:40 +0100
Subject: [PATCH] Adding reading resources

---
 units/en/unit6/additional-readings.mdx | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/units/en/unit6/additional-readings.mdx b/units/en/unit6/additional-readings.mdx
index 4361839..5e7f386 100644
--- a/units/en/unit6/additional-readings.mdx
+++ b/units/en/unit6/additional-readings.mdx
@@ -1,9 +1,16 @@
 # Additional Readings [[additional-readings]]
 ## Bias-variance tradeoff in Reinforcement Learning
+
 If you want to dive deeper into the question of variance and bias tradeoff in Deep Reinforcement Learning, you can check these two articles:
 
 - [Making Sense of the Bias / Variance Trade-off in (Deep) Reinforcement Learning](https://blog.mlreview.com/making-sense-of-the-bias-variance-trade-off-in-deep-reinforcement-learning-79cf1e83d565)
 - [Bias-variance Tradeoff in Reinforcement Learning](https://www.endtoend.ai/blog/bias-variance-tradeoff-in-reinforcement-learning/)
 
 ## Advantage Functions
+
 - [Advantage Functions, SpinningUp RL](https://spinningup.openai.com/en/latest/spinningup/rl_intro.html?highlight=advantage%20functio#advantage-functions)
+
+## Actor Critic
+
+- [Foundations of Deep RL Series, L3 Policy Gradients and Advantage Estimation by Pieter Abbeel](https://www.youtube.com/watch?v=AKbX1Zvo7r8)
+- [A2C Paper: Asynchronous Methods for Deep Reinforcement Learning](https://arxiv.org/abs/1602.01783v2)