Add authors

This commit is contained in:
simoninithomas
2023-02-20 15:50:07 +01:00
parent 575910d970
commit c39fe3b98f
10 changed files with 38 additions and 0 deletions

@@ -48,3 +48,7 @@ For more information, we recommend you check out the following resources:
- [Evolving Curricula with Regret-Based Environment Design](https://arxiv.org/abs/2203.01302)
- [Curriculum Reinforcement Learning via Constrained Optimal Transport](https://proceedings.mlr.press/v162/klink22a.html)
- [Prioritized Level Replay](https://arxiv.org/abs/2010.03934)
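The rank-based sampling at the heart of Prioritized Level Replay can be sketched in a few lines. This is an illustrative simplification, not the paper's implementation: the scores would come from a learning-potential estimate such as value loss, and `beta` is the temperature of the rank prioritization.

```python
import random

def plr_sampling_weights(scores, beta=0.1):
    """Rank-based prioritization (sketch): levels with higher learning
    potential (e.g. value-loss scores) get a larger sampling weight."""
    # Rank 1 = highest score; weight is proportional to (1/rank)^(1/beta).
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    weights = [0.0] * len(scores)
    for rank, i in enumerate(order, start=1):
        weights[i] = (1.0 / rank) ** (1.0 / beta)
    total = sum(weights)
    return [w / total for w in weights]

def sample_level(scores, beta=0.1, rng=random):
    """Draw a level index to replay, favoring high-score levels."""
    weights = plr_sampling_weights(scores, beta)
    return rng.choices(range(len(scores)), weights=weights, k=1)[0]
```

The small `beta` makes the distribution sharply peaked on the highest-scoring levels, which is the curriculum effect the paper exploits.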
## Author
This section was written by <a href="https://twitter.com/ClementRomac">Clément Romac</a>.

@@ -25,3 +25,7 @@ For more information, we recommend you check out the following resources:
- [Decision Transformer: Reinforcement Learning via Sequence Modeling](https://arxiv.org/abs/2106.01345)
- [Online Decision Transformer](https://arxiv.org/abs/2202.05607)
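Decision Transformers condition on returns-to-go rather than on immediate rewards. A minimal sketch of how those conditioning targets are computed from a reward sequence (the undiscounted case used in the paper; the function name is ours):

```python
def returns_to_go(rewards, gamma=1.0):
    """Compute R_t = r_t + gamma * r_{t+1} + ... for every timestep t,
    scanning the trajectory backwards in a single pass."""
    rtg = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        rtg[t] = running
    return rtg
```

At inference time, the model is prompted with a desired return-to-go, which is then decremented by the rewards actually obtained.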
## Author
This section was written by <a href="https://twitter.com/edwardbeeching">Edward Beeching</a>.

@@ -43,3 +43,7 @@ Starcraft II is a famous *real-time strategy game*. DeepMind has used this game
To start using this environment, check these resources:
- [Starcraft gym](http://starcraftgym.com/)
- [A. I. Learns to Play Starcraft 2 (Reinforcement Learning) tutorial](https://www.youtube.com/watch?v=q59wap1ELQ4)
## Author
This section was written by <a href="https://twitter.com/ThomasSimonini">Thomas Simonini</a>.

@@ -202,3 +202,7 @@ Try setting this property up to 8 to speed up training. This can be a great bene
### There's more!
We have only scratched the surface of what can be achieved with Godot RL Agents; the library also includes custom sensors and cameras to enrich the information available to the agent. Take a look at the [examples](https://github.com/edbeeching/godot_rl_agents_examples) to find out more!
## Author
This section was written by <a href="https://twitter.com/edwardbeeching">Edward Beeching</a>.

@@ -6,4 +6,6 @@
Congratulations on finishing this course! **You now have a solid background in Deep Reinforcement Learning**.
But this course was just the beginning of your Deep Reinforcement Learning journey: there are so many topics left to discover. In this optional unit, we **give you resources to explore multiple concepts and research topics in Reinforcement Learning**.
Unlike the other units, this unit is a collective work of multiple people from Hugging Face. We credit the author of each section.
Sound fun? Let's get started 🔥!

@@ -39,3 +39,7 @@ For more information we recommend you check out the following resources:
- [Pre-Trained Language Models for Interactive Decision-Making](https://arxiv.org/abs/2202.01771)
- [Grounding Large Language Models with Online Reinforcement Learning](https://arxiv.org/abs/2302.02662v1)
- [Guiding Pretraining in Reinforcement Learning with Large Language Models](https://arxiv.org/abs/2302.06692)
## Author
This section was written by <a href="https://twitter.com/ClementRomac">Clément Romac</a>.

@@ -26,3 +26,7 @@ For more information on MBRL, we recommend you check out the following resources
- A [blog post on debugging MBRL](https://www.natolambert.com/writing/debugging-mbrl).
- A [recent review paper on MBRL](https://arxiv.org/abs/2006.16712).
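To illustrate the planning half of the MBRL loop, here is a toy random-shooting planner that queries a dynamics model instead of the real environment. Everything here is an illustrative assumption: the 1-D continuous action space, the function names, and the fact that the "model" is passed in as a plain callable (in practice it is a learned neural network).

```python
import random

def plan_random_shooting(model, reward_fn, state, horizon=5,
                         n_candidates=100, rng=random):
    """Sample random action sequences, roll each out through the model,
    and return the first action of the highest-return sequence (MPC style)."""
    best_return, best_first_action = float("-inf"), None
    for _ in range(n_candidates):
        s, total = state, 0.0
        actions = [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        for a in actions:
            s = model(s, a)           # predicted next state
            total += reward_fn(s, a)  # predicted reward
        if total > best_return:
            best_return, best_first_action = total, actions[0]
    return best_first_action
```

Real MBRL systems refine the candidate distribution (e.g. with the cross-entropy method) and replan at every step, but the control loop has this shape.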
## Author
This section was written by <a href="https://twitter.com/natolambert">Nathan Lambert</a>.

@@ -31,3 +31,7 @@ For more information, we recommend you check out the following resources:
- [Offline Reinforcement Learning, Talk by Sergei Levine](https://www.youtube.com/watch?v=qgZPZREor5I)
- [Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems](https://arxiv.org/abs/2005.01643)
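Offline RL learns purely from a fixed dataset of logged transitions, with no further environment interaction. The simplest baseline in that setting is behavioral cloning, which imitates the logged policy; a tabular sketch, where the `(state, action)` dataset format is our illustrative assumption:

```python
from collections import Counter, defaultdict

def behavioral_cloning(dataset):
    """dataset: iterable of (state, action) pairs logged by some behavior
    policy. Returns a greedy policy dict: state -> most frequent action."""
    counts = defaultdict(Counter)
    for state, action in dataset:
        counts[state][action] += 1
    return {s: c.most_common(1)[0][0] for s, c in counts.items()}
```

The hard part of offline RL, which the resources above discuss, is doing better than this baseline without querying actions the dataset never covers.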
## Author
This section was written by <a href="https://twitter.com/ThomasSimonini">Thomas Simonini</a>.

@@ -50,3 +50,7 @@ record on [GitHub](https://github.com/RewardReports/reward-reports).
For further reading, you can visit the Reward Reports [paper](https://arxiv.org/abs/2204.10817)
or look at [an example report](https://github.com/RewardReports/reward-reports/tree/main/examples).
## Author
This section was written by <a href="https://twitter.com/natolambert">Nathan Lambert</a>.

@@ -44,3 +44,7 @@ And here is a snapshot of the growing set of papers that show RLHF's performance
- [Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned](https://arxiv.org/abs/2209.07858) (Ganguli et al. 2022): Detailed documentation of efforts to “discover, measure, and attempt to reduce [language models'] potentially harmful outputs.”
- [Dynamic Planning in Open-Ended Dialogue using Reinforcement Learning](https://arxiv.org/abs/2208.02294) (Cohen et al. 2022): Using RL to enhance the conversational skill of an open-ended dialogue agent.
- [Is Reinforcement Learning (Not) for Natural Language Processing?: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization](https://arxiv.org/abs/2210.01241) (Ramamurthy and Ammanabrolu et al. 2022): Discusses the design space of open-source tools in RLHF and proposes a new algorithm NLPO (Natural Language Policy Optimization) as an alternative to PPO.
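The reward-model stage shared by these RLHF pipelines trains on human preference pairs. A minimal sketch of the standard pairwise (Bradley-Terry) loss, assuming the model has already produced scalar reward scores for the chosen and rejected responses:

```python
import math

def preference_loss(reward_chosen, reward_rejected):
    """-log sigmoid(r_chosen - r_rejected): the loss is low when the model
    scores the human-preferred response higher than the rejected one."""
    margin = reward_chosen - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

The trained reward model then scores rollouts during the policy-optimization stage (PPO in most of the papers above, NLPO in Ramamurthy and Ammanabrolu et al.).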
## Author
This section was written by <a href="https://twitter.com/natolambert">Nathan Lambert</a>.