The Data Exchange with Ben Lorica
Unlocking Unstructured Data with LLMs

Update: 2025-07-03
Description

Shreya Shankar is a PhD student in the EECS department at UC Berkeley. This episode explores how Large Language Models (LLMs) are revolutionizing the processing of unstructured enterprise data such as text documents and PDFs. It introduces DocETL, a framework that uses a MapReduce approach with LLMs for semantic extraction, thematic analysis, and summarization at scale.

Subscribe to the Gradient Flow Newsletter 📩 https://gradientflow.substack.com/

Subscribe: Apple · Spotify · Overcast · Pocket Casts · AntennaPod · Podcast Addict · Amazon · RSS.

Detailed show notes, with links to many references, can be found on The Data Exchange website.

Ben Lorica