Argmax
15: InstructGPT

Update: 2023-03-28
Description

In this episode we discuss the paper "Training language models to follow instructions with human feedback" by Ouyang et al. (2022). We cover the reinforcement learning from human feedback (RLHF) paradigm and how important RL is to tuning GPT.

Vahe Hagopian, Taka Hasegawa, Farrukh Rahman