Machine Learning Street Talk (MLST)

Jay Alammar on LLMs, RAG, and AI Engineering

Update: 2024-08-11

Description

Jay Alammar, renowned AI educator and researcher at Cohere, discusses the latest developments in large language models (LLMs) and their applications in industry. Jay shares his expertise on retrieval augmented generation (RAG), semantic search, and the future of AI architectures.




MLST is sponsored by Brave:


The Brave Search API covers over 20 billion webpages, built from scratch without Big Tech biases or the recent extortionate price hikes on search API access. Perfect for AI model training and retrieval augmented generation. Try it now - get 2,000 free queries monthly at http://brave.com/api.
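If you want to try it as the retrieval step of a RAG pipeline, here is a minimal Python sketch of fetching web results and turning them into snippets you could pass to an LLM as grounding documents. The endpoint, the X-Subscription-Token header, and the response fields are assumptions based on the publicly documented API; verify them at http://brave.com/api before relying on them.

# Minimal sketch: fetch web results from the Brave Search API for use as RAG context.
# Endpoint, header name, and response fields are assumptions; check the official docs.
import os
import requests

BRAVE_API_KEY = os.environ["BRAVE_API_KEY"]  # your Brave Search subscription token

def brave_web_search(query: str, count: int = 5) -> list[dict]:
    """Return a list of {title, url, snippet} dicts for the query."""
    resp = requests.get(
        "https://api.search.brave.com/res/v1/web/search",  # assumed endpoint
        headers={
            "Accept": "application/json",
            "X-Subscription-Token": BRAVE_API_KEY,  # assumed auth header
        },
        params={"q": query, "count": count},
        timeout=10,
    )
    resp.raise_for_status()
    results = resp.json().get("web", {}).get("results", [])
    return [
        {"title": r.get("title"), "url": r.get("url"), "snippet": r.get("description")}
        for r in results
    ]

if __name__ == "__main__":
    # The returned snippets would then be handed to a generator model as documents.
    for doc in brave_web_search("retrieval augmented generation"):
        print(doc["title"], "-", doc["url"])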




Cohere Command R model series: https://cohere.com/command




Jay Alammar:


https://x.com/jayalammar




Buy Jay's new book here!


Hands-On Large Language Models: Language Understanding and Generation


https://amzn.to/4fzOUgh




TOC:


00:00:00 Introduction to Jay Alammar and AI Education


00:01:47 Cohere's Approach to RAG and AI Re-ranking


00:07:15 Implementing AI in Enterprise: Challenges and Solutions


00:09:26 Jay's Role at Cohere and the Importance of Learning in Public


00:15:16 The Evolution of AI in Industry: From Deep Learning to LLMs


00:26:12 Expert Advice for Newcomers in Machine Learning


00:32:39 The Power of Semantic Search and Embeddings in AI Systems


00:37:59 Jay Alammar's Journey as an AI Educator and Visualizer


00:43:36 Visual Learning in AI: Making Complex Concepts Accessible


00:47:38 Strategies for Keeping Up with Rapid AI Advancements


00:49:12 The Future of Transformer Models and AI Architectures


00:51:40 Evolution of the Transformer: From 2017 to Present


00:54:19 Preview of Jay's Upcoming Book on Large Language Models




Disclaimer: This is the fourth video from our Cohere partnership. We were not told what to say in the interview, and nothing was edited out. Note also that this episode combines several previously unpublished interviews with Jay into one: the earlier one, at Tim's house, was shot in August 2023, and the more recent one, in Toronto, in May 2024.




Refs:


The Illustrated Transformer


https://jalammar.github.io/illustrated-transformer/




Attention Is All You Need


https://arxiv.org/abs/1706.03762




The Unreasonable Effectiveness of Recurrent Neural Networks


http://karpathy.github.io/2015/05/21/rnn-effectiveness/




A Neural Network in 11 Lines of Python


https://iamtrask.github.io/2015/07/12/basic-python-network/




Understanding LSTM Networks (Chris Olah's blog post)


http://colah.github.io/posts/2015-08-Understanding-LSTMs/




Luis Serrano's YouTube Channel


https://www.youtube.com/channel/UCgBncpylJ1kiVaPyP-PZauQ




Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks


https://arxiv.org/abs/1908.10084




GPT (Generative Pre-trained Transformer) models


https://jalammar.github.io/illustrated-gpt2/


https://openai.com/research/gpt-4




BERT (Bidirectional Encoder Representations from Transformers)


https://jalammar.github.io/illustrated-bert/


https://arxiv.org/abs/1810.04805




RoPE (Rotary Position Embedding)


https://arxiv.org/abs/2104.09864 (RoFormer: Enhanced Transformer with Rotary Position Embedding)




Grouped Query Attention


https://arxiv.org/abs/2305.13245




RLHF (Reinforcement Learning from Human Feedback)


https://openai.com/research/learning-from-human-preferences


https://arxiv.org/abs/1706.03741




DPO (Direct Preference Optimization)


https://arxiv.org/abs/2305.18290
