#055 Embedding Intelligence: AI's Move to the Edge

Update: 2025-08-13

Nicolay here,

while everyone races to cloud-scale LLMs, Pete Warden is solving AI problems by going completely offline. No network connectivity required.

Today I have the chance to talk to Pete Warden, CEO of Useful Sensors and author of the TinyML book.

His philosophy: if you can't explain to users exactly what happens to their data, your privacy model is broken.

Key Insight: The Real World Action Gap

LLMs excel at text-to-text transformations but fail catastrophically at connecting language to physical actions. There's nothing in the web corpus that teaches a model how "turn on the light" maps to sending a pin high on a microcontroller.

This explains why every AI agent demo focuses on booking flights and making API calls: those actions are documented in text. The moment you step off the web into real-world device control, even simple commands become impossible without custom training on action-to-outcome data.

Pete's company builds speech-to-intent systems that skip text entirely, going directly from audio to device actions using embeddings trained on limited action sets.
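The speech-to-intent idea can be sketched as nearest-neighbor matching in embedding space: encode the audio, compare it against embeddings of a small canonical action set, and emit a device action directly, with no transcript in between. This is a minimal illustrative sketch, not Useful Sensors' actual model; the action names and random vectors are hypothetical stand-ins for a trained audio encoder's output.

```python
import numpy as np

# Hypothetical canonical actions and their embeddings. In a real system
# these vectors would come from an audio encoder trained on the
# constrained action set; random vectors here are only for illustration.
rng = np.random.default_rng(0)
ACTIONS = ["light_on", "light_off", "volume_up"]
action_embeddings = {name: rng.standard_normal(16) for name in ACTIONS}

def classify_intent(audio_embedding, threshold=0.0):
    """Map an audio embedding directly to a device action via cosine
    similarity, without producing an intermediate transcript."""
    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    scores = {name: cosine(audio_embedding, emb)
              for name, emb in action_embeddings.items()}
    best = max(scores, key=scores.get)
    # Reject utterances that match nothing well: a constrained action
    # set makes "none of the above" a meaningful outcome.
    return best if scores[best] > threshold else None

# An embedding near "light_on" classifies as that action.
query = action_embeddings["light_on"] + 0.1 * rng.standard_normal(16)
result = classify_intent(query)
```

Keeping the similarity comparison at the very end is what "preserving ambiguity until final classification" means: no early, lossy commitment to one text string.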

💡 Core Concepts

Speech-to-Intent: Direct audio-to-action mapping that bypasses text conversion, preserving ambiguity until final classification

ML Sensors: Self-contained circuit boards processing sensitive data locally, outputting only simple signals without exposing raw video/audio

Embedding-Based Action Matching: Vector representations mapping natural language variations to canonical device actions within constrained domains

⏱ Important Moments

Real World Action Problem: [06:27] LLMs discuss turning on lights but lack training data connecting text commands to device control

Apple Intelligence Challenges: [04:07] Design-led culture clashes with AI accuracy limitations

Speech-to-Intent vs Speech-to-Text: [12:01] Breaking audio into text loses critical ambiguity information

Limited Action Set Strategy: [15:30] Smart speakers succeed by constraining to ~3 functions rather than infinite commands

8-Bit Quantization: [33:12] Remains the deployment sweet spot; processor instruction support matters more than compression

On-Device Privacy: [47:00] Complete local processing provides explainable guarantees vs. confusing hybrid systems
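The 8-bit quantization point can be made concrete with a small sketch: symmetric per-tensor quantization maps float weights to int8 with a single scale factor, and the round trip loses at most half a quantization step. This is a generic illustration, not any specific framework's implementation, and the weight values are made up.

```python
import numpy as np

def quantize_int8(weights):
    """Symmetric per-tensor 8-bit quantization: one scale factor maps
    floats into the int8 range [-127, 127]."""
    scale = np.max(np.abs(weights)) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values."""
    return q.astype(np.float32) * scale

w = np.array([0.5, -1.27, 0.003, 1.0], dtype=np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
# Worst-case rounding error is half a quantization step, i.e. scale / 2.
max_err = np.max(np.abs(w - w_hat))
```

The practical point from the episode is that int8's advantage is less about the 4x size reduction and more that microcontrollers and mobile CPUs ship native int8 instructions, so the quantized model actually runs faster.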

🛠 Tools & Tech

Whisper: github.com/openai/whisper

Moonshine: github.com/usefulsensors/moonshine

TinyML Book: oreilly.com/library/view/tinyml/9781492052036

Stanford Edge ML: github.com/petewarden/stanford-edge-ml

📚 Resources

Looking to Listen Paper: looking-to-listen.github.io

Lottery Ticket Hypothesis: arxiv.org/abs/1803.03635

Connect: pete@usefulsensors.com | petewarden.com | usefulsensors.com

Beta Opportunity: Moonshine browser implementation for client-side speech processing in JavaScript



Nicolay Gerold