Data Engineering Central Podcast
Author: Data Engineering in Real Life
© dataengineeringdude
Description
Long Live the Data Engineer. No holds barred. Talking about Data Engineering news, topics, and general mayhem.
dataengineeringcentral.substack.com
17 Episodes
In this episode of the Data Engineering Central Podcast, I sit down with Maxine Meurer, DevOps engineer, author, and educator behind I Love DevOps, for a wide-ranging conversation about careers, infrastructure, automation, and what it actually means to build systems that last.
This isn’t a buzzword-heavy DevOps chat. It’s a grounded, honest discussion between two engineers about how people really get into tech, how careers evolve over time, and why modern infrastructure is as much about systems thinking and human judgment as it is about tools.
We talk through Maxine’s journey from early technical curiosity to hands-on DevOps work, her move from “ClickOps” to automation-first infrastructure, and how writing and teaching reshaped the way she thinks about engineering.
What we cover in this episode:
* 🛠️ From ClickOps to DevOps: what that transition actually looks like in the real world
* 🧠 Why DevOps is fundamentally about systems and people, not just pipelines and YAML
* 📚 How Maxine went from self-teaching to authoring practical guides like LLMs for Humans and The DevOps Career Switch Blueprint
* 🤯 Common mistakes engineers make when learning DevOps, cloud, and distributed systems
* 🔍 Testing failures, production realities, and where modern infrastructure still breaks down
* 🤖 What AI and LLMs actually change for engineers, and what’s mostly hype
* 🧭 Career advice for engineers without a traditional background
* 🔮 Where DevOps and platform engineering are heading over the next 3–5 years
Throughout the conversation, Maxine brings a refreshing, human-centered perspective to topics that are often over-abstracted or oversold. We dig into the tradeoffs behind tooling choices, the reality of production systems, and the importance of learning how to think, not just what to deploy.
If you’re navigating a DevOps or infrastructure career, wrestling with modern stacks, or trying to make sense of AI’s role in engineering, this episode offers clarity, context, and hard-won insight.
Learn more about Maxine’s work:
* Writing & guides:
* LinkedIn: https://www.linkedin.com/in/maxinemeurer/
* Gumroad resources: https://mameurer.gumroad.com
Thanks for reading Data Engineering Central! This post is public so feel free to share it. This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit dataengineeringcentral.substack.com/subscribe

In this episode, I sit down with industry veteran Robin Moffatt, Sr. Principal Advisor in Streaming Data Technologies (Kafka, etc.) and a longtime voice in the data engineering community, to unpack the journey from old-school data architectures to today’s real-time streaming ecosystems. From early mainframe data processing and COBOL through the rise of Apache Kafka, streaming ETL, and event-driven systems, Robin shares lived experience from across decades of building, scaling, and evolving data platforms.
We dive into:
* 🧠 How the role of software engineering has shifted with the rise of distributed, real-time systems
* 📊 Why event streaming and platforms like Kafka aren’t just messaging systems, but the backbone of modern data architectures
* 🚀 How the community’s tooling and mental models have had to evolve, from static databases and nightly jobs to continuous, always-on streaming applications
* 🤖 A candid look at how AI and real-time data are intersecting, shaping both tooling and expectations for the next decade
* 🔮 Robin’s perspective on where the industry is headed, beyond buzzwords toward real engineering maturity
Along the way, we get historical context, real-world lessons from conference stages and community forums, and a perspective on building resilient, scalable systems that power today’s data-rich applications.
If you’ve ever wondered how we got from batch jobs to continuous event streams, or what it really takes to build modern pipelines that support AI workflows, this conversation with Robin is a must-listen.
For more from Robin:
* 📍 His personal blog & talks: https://rmoff.net/
* 🔗 LinkedIn profile: https://www.linkedin.com/in/robinmoffatt

In this episode of the Data Engineering Central Podcast, I sit down with R. Tyler Croy for a wide-ranging conversation on the present, and future, of modern data platforms.
Tyler is a long-time open-source contributor to projects such as delta-rs. You can watch him on YouTube, read his blog, or work directly with him through his consultancy, Buoyant Data.
Tyler has spent years deep in the open-source data ecosystem, contributing to projects such as Delta Lake and thinking critically about how real-world data systems are built and maintained. This isn’t a hype-driven conversation; it’s a grounded discussion about what’s working, what’s breaking, and what’s coming next.
We dig into:
* What the Lakehouse architecture gets right, and where it still falls short
* Why multimodal data (text, images, audio, video, embeddings) changes everything
* How open table formats like Delta Lake fit into the next generation of data platforms
* The growing gap between data tooling hype and day-to-day data engineering reality
* What skills and architectural thinking will matter most for data engineers over the next decade
If you’re building or operating modern data platforms, and trying to separate real signal from noise, this episode is for you.

In this episode of the Data Engineering Central Podcast, I sit down with Hoyt Emerson, founder of The Full Data Stack and Early Signal, for a wide-ranging conversation on data, analytics, and creating content in the tech world.
We talk candidly about:
* What actually matters in modern data and analytics
* Why so much “data content” misses the mark
* The difference between noise and real signal
* What works (and doesn’t) when building a technical audience
* Writing, consistency, and credibility in the data space
* Why opinions + experience beat trends and buzzwords
If you’re a data engineer, analyst, or technologist who’s curious about both building better data systems and communicating ideas that resonate, this episode goes deep on the lessons learned from doing both.
This is less about hacks, and more about craft, judgment, and long-term thinking.

In this episode of the Data Engineering Central Podcast, I sit down with Andy Leonard, someone who’s been building systems long before “data engineering” was even a job title.
Andy’s career didn’t start in software at all. It started with physical circuits, literally wiring systems as an electrician, before moving into programming, databases, and eventually decades of hands-on data engineering work.
This conversation isn’t about trends or hype cycles. It’s about how the fundamentals of data work have evolved, what hasn’t changed, and what you only learn after years of building, breaking, fixing, and rebuilding real systems.
We talk about how the industry got here, how tools have changed, where they haven’t helped as much as advertised, and what newer data engineers can learn from a long, practical career spent close to the metal.
If you’re interested in perspective, experience, and lessons earned the hard way, this one’s for you.

In this episode of the Data Engineering Central Podcast, I interview a Data OG, someone who’s been around the data space forever, and we talk about all things data: past, present, and future.
I’m joined by Thomas Horton, a longtime friend and one of the most well-rounded data professionals I know. Over the course of his career, Tom has worn just about every hat in data: developer, DBA, analyst, and everything in between. He’s lived through the era of on-prem databases, the rise of analytics, and the constant reinvention that defines modern data engineering today.
We talk about what’s changed, what hasn’t, and why many of the “new” problems in data feel oddly familiar. We also dig into lessons learned the hard way, lessons that are just as relevant for early-career data engineers as they are for seasoned practitioners navigating today’s ever-expanding stacks.
On a personal note, a huge portion of what I know about relational databases and analytics can be traced back to Tom. This conversation is part reflection, part history lesson, and part reality check on where the data industry is headed next.
If you’re interested in the past, present, and future of data, and what really matters beneath all the tooling, this is an episode you won’t want to miss.

In this episode, I sit down with Scott Haines (O’Reilly author, Databricks MVP, and veteran of Yahoo, Nike, and Twilio) for a wide-ranging conversation on the real state of modern data engineering. We dig into open-source ecosystems, Lakehouse architectures, the evolution of Spark, streaming, what’s broken and what’s working in today’s data tooling, and the lessons Scott has learned scaling platforms at some of the biggest companies in the world.
If you care about data engineering, architecture, OSS, or the future of the modern data stack, you’ll love this one.
Make sure to follow Scott here on Substack, and over on GitHub.

Hello! A new episode of the Data Engineering Central Podcast is dropping today. We will be covering a few hot topics!
* Cluster Fatigue
* The Death of Open Source
Going to be a great show, come along for the ride!

This is a free preview of a paid episode. To hear more, visit dataengineeringcentral.substack.com
Hello! A new episode of the Data Engineering Central Podcast is dropping today. We will be covering a few hot topics!
* Apache Iceberg Catalogs
* new Boring Catalog
* new full Iceberg support from Databricks/Unity Catalog
* Databricks SQL Scripting
* DuckDB coming to a Lake House near you
* Lakebase from Databricks
Going to be a great show, come along for the ride!
Thanks …

Hello, my fair-weathered friends and readers! I am gone on vacation this week with my family, probably at this moment lying in the sand on a beach (Lord willing the creek don’t rise), not thinking of you all.
Anywho, be that as it may, I didn’t want you to miss my pretty face, so here is a video of me ranting about Apache Iceberg, something I’ve had a lot of practice doing and enjoy quite thoroughly.
For all you freeloaders out there, you can get 20% off to celebrate Memorial Day.
https://dataengineeringcentral.substack.com/Merica

This is a free preview of a paid episode. To hear more, visit dataengineeringcentral.substack.com
It’s time for another episode of the Data Engineering Central Podcast. In this episode, we cover …
* A Rust-based tool called uv to replace pip, Poetry, etc.
* Apache XTable and the Future of the Lake House
* How is AI going to affect you?
Thanks for being a consumer of Data Engineering Central; your support means a lot. Please share this podcast with your friend…

It’s time for another episode of the Data Engineering Central Podcast. In this episode, we cover …
* AWS Lambda + DuckDB and Delta Lake (Polars, Daft, etc.)
* IaC: Long Live Terraform
* Databricks Data Quality with DQX
* Unity Catalog releases for DuckDB and Polars
* Bespoke vs Managed Data Platforms
* Delta Lake vs. Iceberg and UniForm for a single table
Thanks for b…

In today’s episode of the Data Engineering Central Podcast, we talk about a few hot topics: AWS S3 Tables, Databricks raising money, whether Data Contracts are dead, and the Lake House storage format battle!
It's a good one, buckle up!

It’s time for another episode of the Data Engineering Central Podcast. In this episode, we cover …
* Apache Airflow vs Databricks Workflows
* End-of-Year Engineering Planning for 2025
* 10 Billion Row Challenge with DuckDB vs Daft vs Polars
* Raw Data Ingestion
As usual, the full episode is available to paid subscribers, and a shortened version to you freeloaders out there. Don’t worry, I still love you though.

It’s time for another episode of the Data Engineering Central Podcast, our third one! Topics in this episode …
* Should you use DuckDB or Polars?
* Small Engineering Changes (PR Reviews)
* Daft vs Spark on Databricks with Unity Catalog (Delta Lake)
* Primary and Foreign keys in the Lake House
Enjoy!

Welcome to the Data Engineering Central Podcast, a no-holds-barred discussion on the Data Landscape.
Welcome to Episode 02
In today’s episode, we will talk about the following topics from the Data Engineering perspective …
* Using OpenAI’s o1 Model to do Data Engineering work
* Lord Save us from more ETL tools
* Rust for the small things
* Hosted (SaaS) vs Build

Welcome to the Data Engineering Central Podcast, a no-holds-barred discussion on the Data Landscape.
Welcome to Episode 01
In today’s episode, we will talk about the following topics from the Data Engineering perspective …
* Snowflake vs Databricks.
* Is Apache Spark being replaced??
* Notebooks in Production. Bad.




















