Arash Ahmadian on Rethinking RLHF

Update: 2024-03-25

Description

Arash Ahmadian is a researcher at Cohere and Cohere For AI, focused on preference training of large language models. He is also a researcher at the Vector Institute for Artificial Intelligence.

Featured Reference

Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs

Arash Ahmadian, Chris Cremer, Matthias Gallé, Marzieh Fadaee, Julia Kreutzer, Olivier Pietquin, Ahmet Üstün, Sara Hooker


Robin Ranjit Singh Chauhan