Deep Reinforcement Learning in the Real World with Anna Goldie


Update: 2024-03-26

Description

In this episode, you’ll explore the field of deep reinforcement learning and the ways it influences the real world with Anna Goldie, Senior Staff Research Scientist at Google DeepMind. 


Anna’s current role has her working on Large Language Model (LLM) research for Gemini and Bard. Prior to that, she worked on reinforcement learning for LLMs and retrieval-augmented LLMs at Anthropic, and was co-founder/lead of the ML for Systems team in Google Brain.


During this wide-ranging discussion, you’ll learn about her contributions to the field of reinforcement learning, and how we can leverage reinforcement learning effectively for real world applications going forward. 


Sponsored by: https://odsc.com/ 

Find more ODSC lightning interviews, webinars, live trainings, certifications, bootcamps here – https://aiplus.training/ 


Topics:

1. Professional journey and the key moments 

2. Core principles of deep reinforcement learning

3. Deep reinforcement learning for chip design vs traditional approaches

4. Key complexities in modern chip design and how deep reinforcement learning can address these complexities

5. Google’s TPUs (Tensor Processing Units), built specifically for accelerating machine learning workloads

6. The potential of deep reinforcement learning in computer systems or other domains within Google DeepMind

7. Deep reinforcement learning use in Large Language Models (LLMs)

8. Reinforcement Learning from Human Feedback (RLHF), designing effective rewards and providing feedback at scale

9. Scalable supervision techniques, for developing methods to efficiently gather feedback that aligns the LLM with human preferences

10. The Constitutional AI framework, in which AI models are guided by a set of foundational principles or 'constitutional' directives

11. How Retrieval-Augmented Generation (RAG) systems improve the accuracy and relevance of responses compared to standard large language models, even LLMs with large retrieval context windows

12. How “RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval” compares to traditional RAG approaches that retrieve short, contiguous chunks

13. Hierarchical summaries with RAPTOR

14. LLM Finetuning With Low-Rank Adaptation (LoRA)

15. Google's next-generation Gemini 1.5 LLM and its mixture-of-experts architecture

16. CALM—Composition to Augment Language Models

17. How dual undergraduate degrees in computer science and linguistics from MIT have contributed to your innovative work in machine learning

18. Constitutional AI at Anthropic https://www.anthropic.com/news/claudes-constitution

19. What is the best way to follow your work?

20. Keynote address at ODSC East in mid-April. 



SHOW NOTES

More about Anna Goldie

https://www.linkedin.com/in/adgoldie/

https://www.annagoldie.com/



More about Constitutional AI at Anthropic

https://www.anthropic.com/news/claudes-constitution

Constitutional AI: Harmlessness from AI Feedback

https://arxiv.org/pdf/2212.08073.pdf



More about Anna’s Paper

RAPTOR: RECURSIVE ABSTRACTIVE PROCESSING FOR TREE-ORGANIZED RETRIEVAL

https://openreview.net/pdf?id=GN921JHCRw

The official code implementation of RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval

https://github.com/parthsarthi03/raptor



More about Large Language Models

Direct Preference Optimization: Your Language Model is Secretly a Reward Model

https://arxiv.org/abs/2305.18290

LLM Finetuning With Low-Rank Adaptation (LoRA)

https://lightning.ai/pages/community/article/lora-llm



CALM—Composition to Augment Language Models

https://arxiv.org/pdf/2401.02412.pdf

