RLHF: Reinforcement Learning from Human Feedback