Unlocking Unstructured Data with LLMs

Update: 2025-07-03

Description

Shreya Shankar is a PhD student at UC Berkeley in the EECS department. This episode explores how Large Language Models (LLMs) are revolutionizing the processing of unstructured enterprise data like text documents and PDFs. It introduces DocETL, a framework using a MapReduce approach with LLMs for semantic extraction, thematic analysis, and summarization at scale.
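
To make the MapReduce idea in the description concrete, here is a minimal sketch in Python. It is not DocETL's API. The names `fake_llm_extract`, `map_phase`, `reduce_phase`, and `summarize_themes` are hypothetical, and the keyword lookup is only a stand-in for a real LLM call. The map step tags each document with themes, and the reduce step groups documents by theme and aggregates each group.

```python
from collections import defaultdict
from typing import Callable


def fake_llm_extract(document: str) -> list[str]:
    """Hypothetical stand-in for an LLM call that tags a document with themes."""
    keywords = {
        "refund": "billing",
        "invoice": "billing",
        "crash": "reliability",
        "slow": "performance",
    }
    lowered = document.lower()
    # A set removes duplicate themes when several keywords map to the same one.
    return sorted({theme for word, theme in keywords.items() if word in lowered})


def map_phase(
    documents: list[str], extract: Callable[[str], list[str]]
) -> list[tuple[str, str]]:
    """Map: emit one (theme, document) pair per theme found in each document."""
    return [(theme, doc) for doc in documents for theme in extract(doc)]


def reduce_phase(pairs: list[tuple[str, str]]) -> dict[str, int]:
    """Reduce: group documents by theme and aggregate each group.

    The aggregate here is a simple count; a real pipeline might instead
    ask an LLM to summarize the documents in each group.
    """
    groups: dict[str, list[str]] = defaultdict(list)
    for theme, doc in pairs:
        groups[theme].append(doc)
    return {theme: len(docs) for theme, docs in groups.items()}


def summarize_themes(documents: list[str]) -> dict[str, int]:
    """Run the map step, then the reduce step, over a list of documents."""
    return reduce_phase(map_phase(documents, fake_llm_extract))


if __name__ == "__main__":
    sample = [
        "Customer wants a refund for a duplicate invoice.",
        "App crash on login.",
        "Dashboard is slow and then crash.",
    ]
    print(summarize_themes(sample))
```

Keeping the extraction function as a parameter of `map_phase` means the keyword stub can be swapped for a real model call without touching the grouping logic.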

Subscribe to the Gradient Flow Newsletter 📩 https://gradientflow.substack.com/

Detailed show notes - with links to many references - can be found on The Data Exchange web site.

Duration: 27:46

Ben Lorica