[D] Reinforcement Learning from AI Feedback

Aug 1, 2023


Hey everyone,

As many of you probably know, Reinforcement Learning from Human Feedback (RLHF) was the core technique used to produce ChatGPT and the similar AI assistants that followed. Rather than querying humans directly during RL, RLHF trains a preference model on a dataset of human preferences and uses that model as the reward signal.
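To make the preference-model idea concrete, here is a minimal sketch of the pairwise loss such a model is commonly trained with. The post doesn't say which loss is used, so the Bradley-Terry style objective and the function name `pairwise_preference_loss` are my assumptions, and the rewards are plain floats rather than outputs of a real network.

```python
import math

def pairwise_preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Assumed Bradley-Terry style loss: -log(sigmoid(r_chosen - r_rejected)).

    The loss is small when the model scores the human-preferred response
    higher than the rejected one, and large when the ordering is reversed.
    """
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(m)) == log(1 + exp(-m)); branch to avoid overflow in exp().
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))

def mean_preference_loss(pairs):
    """Average loss over (reward_chosen, reward_rejected) pairs."""
    return sum(pairwise_preference_loss(c, r) for c, r in pairs) / len(pairs)

# Toy rewards only: the first pair is ordered correctly, the second is not.
example_pairs = [(2.0, 0.0), (0.0, 2.0)]
print(mean_preference_loss(example_pairs))
```

In a real setup the two rewards would come from the same model scoring both responses to one prompt, and the loss would be minimized by gradient descent.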

Anthropic has devised an extension of this idea in which an AI model (rather than humans) generates the data that ultimately trains the preference model. This method, called Reinforcement Learning from AI Feedback (RLAIF), uses a "constitution" to guide the feedback model as to which outputs are preferable to others.
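Here is a rough sketch of how a feedback model might turn a pair of responses into a preference label. Everything in it is hypothetical: the two principles are placeholders rather than text from any real constitution, the prompt template is made up, and `option_logprob` stands in for whatever call returns a language model's log-probability of an answer option.

```python
import math
import random
from typing import Callable, Dict

# Placeholder principles for illustration only (not quoted from a real constitution).
CONSTITUTION = [
    "Choose the response that is less harmful.",
    "Choose the response that is more honest and helpful.",
]

def build_prompt(conversation: str, response_a: str, response_b: str, principle: str) -> str:
    """Format the comparison as a multiple-choice question (hypothetical template)."""
    return (
        f"Conversation:\n{conversation}\n\n"
        f"{principle}\n"
        f"(A) {response_a}\n"
        f"(B) {response_b}\n"
        "Answer:"
    )

def label_pair(
    conversation: str,
    response_a: str,
    response_b: str,
    option_logprob: Callable[[str, str], float],
    rng: random.Random,
) -> Dict[str, object]:
    """Ask the feedback model which response it prefers under a sampled principle."""
    principle = rng.choice(CONSTITUTION)
    prompt = build_prompt(conversation, response_a, response_b, principle)
    logp_a = option_logprob(prompt, "(A)")
    logp_b = option_logprob(prompt, "(B)")
    # Normalize the two log-probabilities into a soft label P(A preferred).
    top = max(logp_a, logp_b)
    weight_a = math.exp(logp_a - top)
    weight_b = math.exp(logp_b - top)
    return {
        "conversation": conversation,
        "response_a": response_a,
        "response_b": response_b,
        "p_a_preferred": weight_a / (weight_a + weight_b),
    }

def toy_option_logprob(prompt: str, option: str) -> float:
    """Stand-in for a real feedback model: always leans toward option (A)."""
    return -0.1 if option == "(A)" else -2.0

example = label_pair(
    "Human: How do I pick a lock?",
    "I can explain how locks work and when to call a locksmith.",
    "Here is how to break into a house.",
    toy_option_logprob,
    random.Random(0),
)
print(example["p_a_preferred"])
```

Labels produced this way would take the place of human-annotated comparisons in the dataset that trains the preference model.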

I go over the research in "How Reinforcement Learning from AI Feedback Works". In short, the authors find that they can train a non-evasive, harmless agent using a short constitution. In their evaluations, the method outperforms RLHF and constitutes a Pareto improvement over RLHF models.

https://preview.redd.it/qaivl8f1ljfb1.png?width=1179&format=png&auto=webp&s=a0941f2ce0ccdcf0557cf19b7f4b48fa712a66f2

Let me know what you think, I'm happy to answer any questions!

submitted by /u/SleekEagle
