John Schulman on dead ends, scaling RL, and building research institutions
Description
A conversation with John Schulman on the first year LLMs could have been useful, building research teams, and where RL goes from here.

00:00 - Speedrunning ChatGPT
09:22 - Archetypes of research managers
11:56 - Was OpenAI inspired by Bell Labs?
16:54 - The absence of value functions
18:23 - Continual learning
21:09 - Brittle generalization
24:05 - Co-training generators and verifiers, GANs
27:06 - John’s personal use of AI for research
28:54 - Day in the life
33:01 - Slowdowns in consequential ML ideas
36:21 - "Peer review" within the labs
39:19 - Distribution shift in researchers
43:33 - Future of RL
45:33 - Will the labs coordinate if the world needs them to?
44:46 - Forecasting ills in AGI and engineering
47:53 - Thinking Machines




