Machine Learning Street Talk (MLST)

Jay Alammar on LLMs, RAG, and AI Engineering

Update: 2024-08-11

Description

Jay Alammar, renowned AI educator and researcher at Cohere, discusses the latest developments in large language models (LLMs) and their applications in industry. Jay shares his expertise on retrieval augmented generation (RAG), semantic search, and the future of AI architectures.




MLST is sponsored by Brave:


The Brave Search API covers over 20 billion webpages, built from scratch without Big Tech biases or the recent extortionate price hikes on search API access. Perfect for AI model training and retrieval augmented generation. Try it now - get 2,000 free queries monthly at http://brave.com/api.
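If you want to try it as the retrieval step of a RAG pipeline, here is a minimal Python sketch of fetching web results and turning them into snippets you could pass to an LLM as grounding documents. The endpoint, the X-Subscription-Token header, and the response fields are assumptions based on the publicly documented API; verify them at http://brave.com/api before relying on them.

# Minimal sketch: fetch web results from the Brave Search API for use as RAG context.
# Endpoint, header name, and response fields are assumptions; check the official docs.
import os
import requests

BRAVE_API_KEY = os.environ["BRAVE_API_KEY"]  # your Brave Search subscription token

def brave_web_search(query: str, count: int = 5) -> list[dict]:
    """Return a list of {title, url, snippet} dicts for the query."""
    resp = requests.get(
        "https://api.search.brave.com/res/v1/web/search",  # assumed endpoint
        headers={
            "Accept": "application/json",
            "X-Subscription-Token": BRAVE_API_KEY,  # assumed auth header
        },
        params={"q": query, "count": count},
        timeout=10,
    )
    resp.raise_for_status()
    results = resp.json().get("web", {}).get("results", [])
    return [
        {"title": r.get("title"), "url": r.get("url"), "snippet": r.get("description")}
        for r in results
    ]

if __name__ == "__main__":
    # The returned snippets would then be handed to a generator model as documents.
    for doc in brave_web_search("retrieval augmented generation"):
        print(doc["title"], "-", doc["url"])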




Cohere Command R model series: https://cohere.com/command




Jay Alammar:


https://x.com/jayalammar




Buy Jay's new book here!


Hands-On Large Language Models: Language Understanding and Generation


https://amzn.to/4fzOUgh




TOC:


00:00:00 Introduction to Jay Alammar and AI Education


00:01:47 Cohere's Approach to RAG and AI Re-ranking


00:07:15 Implementing AI in Enterprise: Challenges and Solutions


00:09:26 Jay's Role at Cohere and the Importance of Learning in Public


00:15:16 The Evolution of AI in Industry: From Deep Learning to LLMs


00:26:12 Expert Advice for Newcomers in Machine Learning


00:32:39 The Power of Semantic Search and Embeddings in AI Systems


00:37:59 Jay Alammar's Journey as an AI Educator and Visualizer


00:43:36 Visual Learning in AI: Making Complex Concepts Accessible


00:47:38 Strategies for Keeping Up with Rapid AI Advancements


00:49:12 The Future of Transformer Models and AI Architectures


00:51:40 Evolution of the Transformer: From 2017 to Present


00:54:19 Preview of Jay's Upcoming Book on Large Language Models




Disclaimer: This is the fourth video from our Cohere partnership. We were not told what to say in the interview, and nothing was edited out. Note also that this episode combines several previously unpublished interviews with Jay into one: the earlier one, at Tim's house, was shot in August 2023, and the more recent one, in Toronto, in May 2024.




Refs:


The Illustrated Transformer


https://jalammar.github.io/illustrated-transformer/




Attention Is All You Need


https://arxiv.org/abs/1706.03762




The Unreasonable Effectiveness of Recurrent Neural Networks


http://karpathy.github.io/2015/05/21/rnn-effectiveness/




A Neural Network in 11 Lines of Python


https://iamtrask.github.io/2015/07/12/basic-python-network/




Understanding LSTM Networks (Chris Olah's blog post)


http://colah.github.io/posts/2015-08-Understanding-LSTMs/




Luis Serrano's YouTube Channel


https://www.youtube.com/channel/UCgBncpylJ1kiVaPyP-PZauQ




Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks


https://arxiv.org/abs/1908.10084




GPT (Generative Pre-trained Transformer) models


https://jalammar.github.io/illustrated-gpt2/


https://openai.com/research/gpt-4




BERT (Bidirectional Encoder Representations from Transformers)


https://jalammar.github.io/illustrated-bert/


https://arxiv.org/abs/1810.04805




RoPE (Rotary Position Embedding)


https://arxiv.org/abs/2104.09864 (RoFormer: Enhanced Transformer with Rotary Position Embedding)




Grouped Query Attention


https://arxiv.org/abs/2305.13245




RLHF (Reinforcement Learning from Human Feedback)


https://openai.com/research/learning-from-human-preferences


https://arxiv.org/abs/1706.03741




DPO (Direct Preference Optimization)


https://arxiv.org/abs/2305.18290
