mirror of
https://github.com/huggingface/deep-rl-class.git
synced 2026-04-05 11:38:43 +08:00
Add authors
@@ -48,3 +48,7 @@ For more information, we recommend you check out the following resources:
- [Evolving Curricula with Regret-Based Environment Design](https://arxiv.org/abs/2203.01302)
- [Curriculum Reinforcement Learning via Constrained Optimal Transport](https://proceedings.mlr.press/v162/klink22a.html)
- [Prioritized Level Replay](https://arxiv.org/abs/2010.03934)

## Author

This section was written by <a href="https://twitter.com/ClementRomac">Clément Romac</a>
@@ -25,3 +25,7 @@ For more information, we recommend you check out the following resources:
- [Decision Transformer: Reinforcement Learning via Sequence Modeling](https://arxiv.org/abs/2106.01345)
- [Online Decision Transformer](https://arxiv.org/abs/2202.05607)

## Author

This section was written by <a href="https://twitter.com/edwardbeeching">Edward Beeching</a>
@@ -43,3 +43,7 @@ Starcraft II is a famous *real-time strategy game*. DeepMind has used this game
To start using this environment, check these resources:
- [Starcraft gym](http://starcraftgym.com/)
- [A. I. Learns to Play Starcraft 2 (Reinforcement Learning) tutorial](https://www.youtube.com/watch?v=q59wap1ELQ4)

## Author

This section was written by <a href="https://twitter.com/ThomasSimonini">Thomas Simonini</a>
@@ -202,3 +202,7 @@ Try setting this property up to 8 to speed up training. This can be a great bene
### There’s more!

We have only scratched the surface of what can be achieved with Godot RL Agents; the library includes custom sensors and cameras to enrich the information available to the agent. Take a look at the [examples](https://github.com/edbeeching/godot_rl_agents_examples) to find out more!

## Author

This section was written by <a href="https://twitter.com/edwardbeeching">Edward Beeching</a>
@@ -6,4 +6,6 @@
Congratulations on finishing this course! **You now have a solid background in Deep Reinforcement Learning**.

But this course was just the beginning of your Deep Reinforcement Learning journey; there are so many subsections to discover. In this optional unit, we **give you resources to explore multiple concepts and research topics in Reinforcement Learning**.

Unlike the other units, this unit is a collective work of multiple people from Hugging Face. We mention the author of each section.

Sounds fun? Let's get started 🔥
@@ -39,3 +39,7 @@ For more information we recommend you check out the following resources:
- [Pre-Trained Language Models for Interactive Decision-Making](https://arxiv.org/abs/2202.01771)
- [Grounding Large Language Models with Online Reinforcement Learning](https://arxiv.org/abs/2302.02662v1)
- [Guiding Pretraining in Reinforcement Learning with Large Language Models](https://arxiv.org/abs/2302.06692)

## Author

This section was written by <a href="https://twitter.com/ClementRomac">Clément Romac</a>
@@ -26,3 +26,7 @@ For more information on MBRL, we recommend you check out the following resources
- A [blog post on debugging MBRL](https://www.natolambert.com/writing/debugging-mbrl).
- A [recent review paper on MBRL](https://arxiv.org/abs/2006.16712).

## Author

This section was written by <a href="https://twitter.com/natolambert">Nathan Lambert</a>
@@ -31,3 +31,7 @@ For more information, we recommend you check out the following resources:
- [Offline Reinforcement Learning, Talk by Sergey Levine](https://www.youtube.com/watch?v=qgZPZREor5I)
- [Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems](https://arxiv.org/abs/2005.01643)

## Author

This section was written by <a href="https://twitter.com/ThomasSimonini">Thomas Simonini</a>
@@ -50,3 +50,7 @@ record on [GitHub](https://github.com/RewardReports/reward-reports).
For further reading, you can visit the Reward Reports [paper](https://arxiv.org/abs/2204.10817) or look at [an example report](https://github.com/RewardReports/reward-reports/tree/main/examples).

## Author

This section was written by <a href="https://twitter.com/natolambert">Nathan Lambert</a>
@@ -44,3 +44,7 @@ And here is a snapshot of the growing set of papers that show RLHF's performance
- [Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned](https://arxiv.org/abs/2209.07858) (Ganguli et al. 2022): A detailed documentation of efforts to “discover, measure, and attempt to reduce [language models] potentially harmful outputs.”
- [Dynamic Planning in Open-Ended Dialogue using Reinforcement Learning](https://arxiv.org/abs/2208.02294) (Cohen et al. 2022): Using RL to enhance the conversational skill of an open-ended dialogue agent.
- [Is Reinforcement Learning (Not) for Natural Language Processing?: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization](https://arxiv.org/abs/2210.01241) (Ramamurthy and Ammanabrolu et al. 2022): Discusses the design space of open-source tools in RLHF and proposes a new algorithm, NLPO (Natural Language Policy Optimization), as an alternative to PPO.

## Author

This section was written by <a href="https://twitter.com/natolambert">Nathan Lambert</a>