Programming Throwdown
172: Transformers and Large Language Models

Update: 2024-03-11

Description

Intro topic: Is WFH actually WFC?

News/Links:


Book of the Show


Patreon Plug: https://www.patreon.com/programmingthrowdown?ty=h


Tool of the Show


Topic: Transformers and Large Language Models

  • How neural networks store information
    • Latent variables
  • Transformers
    • Encoders & Decoders
  • Attention Layers
    • History
      • RNN
        • Vanishing Gradient Problem (toy demo sketched after this outline)
      • LSTM
        • Gating mitigates gradients that explode or vanish as they backpropagate through many time steps, preserving long-term information
    • Differentiable algebra
    • Key-Query-Value
    • Self Attention (sketched in code after this outline)
  • Self-Supervised Learning & Forward Models
  • Human Feedback
    • Reinforcement Learning from Human Feedback
    • Direct Preference Optimization (DPO; pairwise ranking, with the loss sketched after this outline)
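
A toy illustration of the vanishing gradient problem from the RNN segment: backpropagating through a vanilla RNN multiplies the gradient by the recurrent weight at every time step, so the signal shrinks or blows up exponentially with sequence length. This is a minimal sketch, not code from the episode; the scalar w stands in for the largest singular value of the recurrent weight matrix.

# Scalar stand-in for a vanilla RNN's recurrent weight matrix.
# Backprop through T steps multiplies the gradient by w each step,
# so |gradient| ~ |w|**T: it vanishes for |w| < 1, explodes for |w| > 1.
for w in (0.9, 1.1):
    grad = 1.0
    for _ in range(100):  # 100 time steps
        grad *= w
    print(f"w={w}: gradient after 100 steps ~ {grad:.3e}")
# w=0.9 -> ~2.7e-05 (vanishing); w=1.1 -> ~1.4e+04 (exploding)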
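
And a minimal NumPy sketch of the key-query-value self-attention discussed in the episode. The shapes and variable names here are illustrative, not from the show; a real transformer adds learned per-head projections, masking, and multi-head concatenation.

import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    # X: (seq_len, d_model) token embeddings; Wq/Wk/Wv: (d_model, d_k) projections
    Q, K, V = X @ Wq, X @ Wk, X @ Wv          # queries, keys, values
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # scaled dot-product similarities
    weights = softmax(scores, axis=-1)        # each position attends over all positions
    return weights @ V                        # attention-weighted mix of values

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))                   # 4 tokens, embedding dim 8
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)    # -> (4, 8)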
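
Finally, a sketch of the DPO pairwise-ranking loss from the human-feedback segment. The inputs are assumed to be sequence log-probabilities of a chosen and a rejected response under the trained policy and under a frozen reference model; the function name, tensors, and beta value are illustrative, not the episode's code.

import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    # DPO: push the policy to rank the chosen response above the rejected
    # one, measured relative to a frozen reference model.
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    # -log sigmoid(beta * margin): small when chosen outranks rejected
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()

# Hypothetical per-response log-probs for a batch of 3 preference pairs
pol_c = torch.tensor([-12.0, -9.5, -20.1])
pol_r = torch.tensor([-14.2, -9.0, -25.0])
ref_c = torch.tensor([-13.0, -10.0, -21.0])
ref_r = torch.tensor([-13.5, -9.8, -24.0])
print(dpo_loss(pol_c, pol_r, ref_c, ref_r))  # scalar loss tensor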





★ Support this podcast on Patreon ★

Hosts: Patrick Wheeler and Jason Gauci