The Analytics Engineering Podcast

Tristan Handy has been curating the Analytics Engineering Roundup newsletter since 2015, pulling together the internet’s best data science & analytics articles. Tristan and co-host Julia Schottenstein now bring the Roundup to real life, hosting biweekly conversations with data practitioners inventing the future of analytics engineering. You can view full episode summaries and read back issues of the Roundup newsletter at https://roundup.getdbt.com. The podcast is sponsored by dbt labs, makers of the data transformation framework dbt. To reach our team, drop a note to podcast@dbtlabs.com.

Data as an assembly line (w/ Cedric Chin)

Cedric Chin runs Commoncog—a publication about accelerating business expertise. He joins Tristan to talk about the analytics development lifecycle, how organizations value (or misvalue) data, and why “data teams are not some IT helpdesk to be ignored.”   For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com. The Analytics Engineering Podcast is sponsored by dbt Labs.

11-17
51:11

The data jobs to be done (w/ Erik Bernhardsson)

Erik Bernhardsson, the CEO and co-founder of Modal Labs, joins Tristan to talk about Gen AI, the lack of GPUs, the future of cloud computing, and egress fees. They also discuss whether the job title of data engineer is something we should want more or less of in the future. Erik’s not afraid of a spicy take, so this is a fun one.  For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com. The Analytics Engineering Podcast is sponsored by dbt Labs.

11-03
42:55

Coalesce 2024 edition: What’s next for data teams? (w/ Scott Breitenother)

Show description: Scott Breitenother, founder of data consultancy Brooklyn Data Co., joins Tristan at Coalesce 2024 in Las Vegas to discuss the early days of dbt, the evolution of data teams, and what's next for the dbt community. For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com. The Analytics Engineering Podcast is sponsored by dbt Labs.

10-20
44:23

The current state of the AI ecosystem (w/ Julia Schottenstein)

Former co-host Julia Schottenstein returns to the show to go deep into the world of LLMs. Julia joined LangChain as an early employee, in Tristan’s words, to “Basically solve all of the problems that aren't specifically in product and engineering.” LangChain has become one of, if not the primary frameworks for developing applications using large language models. There are over a million developers using LangChain today, building everything from prototypes to production AI applications.

10-06
45:44

Creating value from GenAI in the enterprise (w/ Nisha Paliwal)

Nisha Paliwal, who leads enterprise data tech at Capital One, joins Tristan to discuss building a strong data culture for in the world of AI. She is the co-author of the book Secrets of AI Value Creation.  For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com. The Analytics Engineering Podcast is sponsored by dbt Labs.

09-22
45:20

Developer productivity on GitHub Copilot (w/ Eirini Kalliamvakou)

Dr. Eirini Kalliamvakou is a senior researcher at GitHub Next. Eirini has built a career on studying software engineers, how to measure their productivity, how developer experience impacts productivity, and more. Recently, Eirini has been working on quantifying the impacts of GitHub Copilot. Does it actually help software engineers be more productive? Tristan and Eirini explore how to quantify developer productivity in the first place, and finally, arriving at whether or not Copilot‌ makes a difference. In the search for real business value, this research is a real bellwether of things to come. For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com. The Analytics Engineering Podcast is sponsored by dbt Labs. Join data practitioners and data leaders this October in Las Vegas at Coalesce, the analytics engineering conference hosted by dbt Labs. Register now at coalesece.getdbt.com. Listeners of this show can use the code podcast20 for a 20% discount.

09-08
53:59

The rapid experimentation of AI agents (w/ Yohei Nakajima)

Yohei Nakajima is an investor by day and coder by night. In particular, one of his projects, an AI agent framework called BabyAGI that creates a plan-execute loop, got a ton of attention in the past year. The truth is that AI agents are an extremely experimental space, and depending on how strict you want to be with your definition, there aren't a lot of production use cases today.  Yohei discusses the current state of AI agents and where they might take us.  For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com. The Analytics Engineering Podcast is sponsored by dbt Labs.

06-09
45:55

Funnel analytics and AI models for event sequences (w/ Misha Panko)

Misha Panko has worked in data for a long time, including on high performance data teams at Uber and Google. Today, Misha is the co-founder and CEO of Motif Analytics, a product focused on helping growth and ops teams understand their event data. In this episode, Tristan and Misha nerd out about the state of the art in computational neuroscience, where Misha got his PhD. They then go deep into event stream data and how it differs from classical fact and dimension data, and why it needs different analytical tools. Make sure to check out the back half of the episode, where they dive into AI and how Motif is applying breakthroughs in language modeling to train foundation models of event sequences—check out his team’s blog post on their work. For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com. The Analytics Engineering Podcast is sponsored by dbt Labs.

05-26
44:09

From Moneyball to Gen AI

Eric Avidon is a journalist at TechTarget who's interviewed Tristan a few times, and now Tristan gets to flip the script and interview Eric. Eric is a journalist veteran, covering everything from finance to the Boston Red Sox, but now he spends a lot of time with vendors in the data space and has a broad view of what's going on. Eric and Tristan discuss AI and analytics and how mature these features really are today, data quality and its importance, the AI strategies of Snowflake and Databricks, and a lot more. Plus, part way through you can hear Tristan reacting to a mild earthquake that hit the East Coast. For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com.

05-12
37:37

Being Pro-Human in the AI Era

Barry McCardel is the co-founder and CEO of Hex. Hex is an analytics tool that's structured around a notebook experience, but as you'll hear in the episode, goes well beyond the traditional notebook. We're big fans of Hex at dbt Labs, and use it for a bunch of our internal data work. In this episode, Barry and Tristan discuss notebooks and data analysis, before zooming out to discuss the hype cycle of data science, how AI is different, the experience of building AI products, and how AI will impact data practitioners. For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com. The Analytics Engineering Podcast is sponsored by dbt Labs.

04-21
50:09

The 2024 Machine Learning, AI & Data Landscape (w/ Matt Turck)

Matt Turck has been publishing his ecosystem map since 2012. It was first called the Big Data Landscape. Now it’s the Machine Learning, AI & Data (MAD) Landscape.  The 2024 MAD Landscape includes 2,011(!) logos, which Matt attributes first a data infrastructure cycle and now an ML/AI cycle. As Matt writes, “Those two waves are intimately related. A core idea of the MAD Landscape every year has been to show the symbiotic relationship between data infrastructure, analytics/BI,  ML/AI, and applications.” Matt and Tristan discuss themes in Matt's post: generative AI’s impact on data analytics, the modern AI stack compared to the modern data stack, and Databricks vs. Snowflake (plus Microsoft Fabric). For full show notes and to read 7+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com. The Analytics Engineering Podcast is sponsored by dbt Labs.

04-07
36:22

How the Media Covers Gen AI (w/ Matthew Lynley, Supervised)

Matthew Lynley is a bit of a hybrid. He's been a long-time journalist covering enterprise tech, currently in his fantastic AI and data newsletter Supervised, and he's also been a hands-on data practitioner.  Matthew has covered the analytics tech stack, but this time Tristan turns the tables to get Matthew’s perspective on the rise of Gen AI as a topic in the popular press, what's going on in the space today, and where AI is headed. For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com. The Analytics Engineering Podcast is sponsored by dbt Labs.

03-24
48:15

AI's Impact in the World of Structured Data Analytics (w/ Juan Sequeda, data.world)

Juan Sequeda is a principal data scientist and head of the AI Lab at data.world, and is also the co-host of the fantastic data podcast Catalog and Cocktails.  This episode tackles semantics, semantic web, Juan’s research in how raw text-to-SQL performs versus text-to-semantic layer,  and where we both believe AI will make an impact in the world of structured data analytics. For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com. The Analytics Engineering Podcast is sponsored by dbt Labs.

03-10
48:18

The End of the Modern Data Stack (w/ Benn Stancil, Mode)

Benn Stancil, cofounder and CTO at Mode, returns to The Analytics Engineering Podcast to discuss the evolution of the term "modern data stack" and its value today. Tristan wrote on this idea for The Analytics Engineering Roundup in Is the Modern Data Stack Still a Useful Idea? For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com. The Analytics Engineering Podcast is sponsored by dbt Labs.

02-25
45:46

Data Mesh Architecture at Large Enterprises (w/ Moritz Heimpel and Ben Flusberg)

Moritz Heimpel from Siemens and Ben Flusberg from Cox Automotive have very similar jobs. They both act as stewards of the data strategies at large, complex companies. In this episode, we get into what it’s like to collaborate with data at scale. Ben and Mortitz share their experiences adopting a data mesh architecture and what that looks like at their organizations. For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com. The Analytics Engineering Podcast is sponsored by dbt Labs.

12-08
46:07

Let's Talk About Data Vault (w/ Brandon Taylor and Michael Olschimke)

If Data Vault is a new term for you, it’s a data modeling design pattern. We’re joined by Brandon Taylor, a senior data architect at Guild, and Michael Olschimke, who is the CEO of Scalefree—the consulting firm whose co-founder Dan Lindstedt is credited as the designer of the data vault architecture.  In this conversation with Tristan and Julia, Michael and Brandon explore the Data Vault approach among data warehouse design methodologies. They discuss Data Vault’s adoption in Europe, its alignment with data mesh architecture, and the ongoing debate over Data Vault vs. Kimball methods.  For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com. The Analytics Engineering Podcast is sponsored by dbt Labs.

11-17
44:04

Navigating AI Complexity (w/ Jonathan Frankle)

Jonathan Frankle is the Chief Scientist at MosaicML, which was recently bought by Databricks for $1.3 billion.  MosaicML helps customers train generative AI models on their data. Lots of companies are excited about gen AI, and the hope is that their company data and information will be what sets them apart from the competition.  In this conversation with Tristan and Julia, Jonathan discusses a potential future where you can train specialized, purpose-built models, the future of MosaicML inside of Databricks, and the importance of responsible AI practices. For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com. The Analytics Engineering Podcast is sponsored by dbt Labs.

11-03
46:20

Career Growth in Data Roles (w/ Hubspot's Kasey Mazza at Coalesce 2023)

In this conversation with Tristan recorded at Coalesce 2023, Kasey Mazza, an analytics engineering manager on the RevOps team at HubSpot, discusses the roles of data analysts and analytics engineers, the importance of building internal data communities, and the evolving landscape of data teams.  Watch Kasey’s Coalescse 2023 presentation The career growth software development lifecycle. For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com.  The Analytics Engineering Podcast is sponsored by dbt Labs.

10-20
29:17

Operationalizing Your Warehouse, Streaming Analytics, and Cereal (W/ Arjun Narayan of Materialize and Nathan Bean of General Mills)

It turns out data plays a big role in getting cereal manufactured and delivered so you can enjoy your Cheerios reliably for breakfast. We talk with Arjun Narayan, CEO of Materialize, a company building an operational warehouse, and Nathan Bean, a data leader at General Mills responsible for all of the company's manufacturing analytics and insights.  We discuss Materialize’s founding story, how streaming technology has matured, and how exactly companies are leveraging their warehouse to operationalize their business—in this case, at one of the largest consumer product companies in the United States.  For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com.  The Analytics Engineering Podcast is sponsored by dbt Labs.

10-06
42:23

Roche’s Data Transformation Journey (w/ Yannick Misteli)

Yannick Misteli is the head of engineering for the go-to-market domain at Roche, a $250 billion multinational pharmaceutical and diagnostics company.  Roche was an early supporter of dbt Cloud, and Yannick helped move his team of 120+ engineers to a modern data stack. He always finds a way to push the boundaries to make a large company founded in 1896 incredibly modern and innovative. We wanted to know more about the "how" of the work—the people, process, and technology.  Read more about Roche's data journey here: https://docs.getdbt.com/blog/dbt-squared

09-22
40:05

Recommend Channels