diff --git a/units/en/unitbonus3/rlhf.mdx b/units/en/unitbonus3/rlhf.mdx
index a4a2de9..3691f97 100644
--- a/units/en/unitbonus3/rlhf.mdx
+++ b/units/en/unitbonus3/rlhf.mdx
@@ -1,23 +1,27 @@
 # RLHF
 
-- Introduction to RL HF: Nathan
+Reinforcement learning from human feedback (RLHF) is a methodology for integrating human data labels into an RL-based optimization process.
+It is motivated by the challenge of modeling human preferences.
+For many questions, even if you could try to write down an equation for one ideal, humans differ in their preferences.
+Updating models based on measured data is an avenue to alleviate these inherently human problems in ML.
 
 ## Start Learning about RLHF
 
 To start learning about RLHF:
 
-1. Read [Illustrating Reinforcement Learning from Human Feedback (RLHF)](https://huggingface.co/blog/rlhf)
+1. Read this introduction: [Illustrating Reinforcement Learning from Human Feedback (RLHF)](https://huggingface.co/blog/rlhf).
 2. Watch the recorded live we did some weeks ago, where Nathan covered the basics of Reinforcement Learning from Human Feedback (RLHF) and how this technology is being used to enable state-of-the-art ML tools like ChatGPT.
 Most of the talk is an overview of the interconnected ML models. It covers the basics of Natural Language Processing and RL and how RLHF is used on large language models.
 We then conclude with the open question in RLHF.
-3. [Closed-API vs Open-source continues: RLHF, ChatGPT, data moats](https://robotic.substack.com/p/rlhf-chatgpt-data-moats)
+3. Read other blogs on this topic, such as [Closed-API vs Open-source continues: RLHF, ChatGPT, data moats](https://robotic.substack.com/p/rlhf-chatgpt-data-moats).
 
 Let us know if there are more you like!
 
 ## Additional readings
 
+*Note: this is copied from the Illustrating RLHF blog post above.*
 Here is a list of the most prevalent papers on RLHF to date.
 The field was recently popularized with the emergence of DeepRL (around 2017) and has grown into a broader study of the applications of LLMs from many large technology companies.
 Here are some papers on RLHF that pre-date the LM focus:
 
 - [TAMER: Training an Agent Manually via Evaluative Reinforcement](https://www.cs.utexas.edu/~pstone/Papers/bib2html-links/ICDL08-knox.pdf) (Knox and Stone 2008): Proposed a learned agent where humans provided scores on the actions taken iteratively to learn a reward model.