The Joe Reis Show

Author: Joe Reis

Subscribed: 49 · Played: 2,622

Description

What happens when a best-selling author and "recovering data scientist" gets a microphone? This podcast.

I'm Joe Reis, and each week I broadcast from wherever I am in the world, sharing candid thoughts on the data, tech, and AI industry.

Sometimes it's a solo rant. Other times, I'm chatting with the smartest people I know.

If you're looking for an unfiltered perspective on the state of AI, data, and tech, you've found it.
329 Episodes
Oh yeah...ontologies. In this mini-clip, Matt Housley and I chat about why ontologies are super popular now.
Had an interesting discussion with my 15-year-old son. He and his friends see white collar work as “cooked.” They see it as a rat race where the work is increasingly insecure, abusive, and meaningless. Then there’s the looming question of AI… Instead, they’re interested in careers they find meaningful and that aren’t as exposed to whatever AI does to work. And if they own a company, they’ll just hire “clankers” whenever that moment arrives. I’m excited that these kids are looking at what’s happening right now, questioning if it’s their path, and choosing a life that’s fit for them. More broadly, especially in the age of AI, I think some of the most important conversations we need to have are about what we find valuable and meaningful, making a living and the nature of work, and the nature of community.
It's Friday! Matt Housley and I catch up to discuss the aftermath of AWS re:Invent and why the industry’s obsession with AI Agents might be premature. We also dive deep into the hardware wars between Google and NVIDIA, the "brain-damaged" nature of current LLMs, and the growing "enshittification" of the internet and platforms like LinkedIn. Plus, I reveal some details about my upcoming "Mixed Model Arts" project.
In this episode, I sit down with Mark Freeman and Chad Sanderson (Gable.ai) to discuss the release of their new O’Reilly book, Data Contracts: Developing Production-Grade Pipelines at Scale. They dive deep into the chaotic journey of writing a 350-page book while simultaneously building a venture-backed startup. The conversation takes a sharp turn into the evolution of Data Contracts. While the concept started with data engineers, Mark and Chad explain why they pivoted their focus to software engineers. They argue that software engineers are facing a "Data Lake Moment," prioritizing speed over craftsmanship, resulting in massive technical debt and integration failures. Gable: https://www.gable.ai/
I meet a lot of people who want to accomplish major goals next year. Then the year comes and goes, and most people are still waiting to get started. It's almost December. Rather than wait until the New Year to get going, use December to plan how you'll execute on "that thing" you're itching to accomplish. Time waits for nobody, so get going.
In this episode, Ciro Greco (Co-founder & CEO, Bauplan) joins me to discuss why the future of data infrastructure must be "Code-First" and how this philosophy accidentally created the perfect environment for AI Agents. We explore why the "Modern Data Stack" isn't ready for autonomous agents and why a programmable lakehouse is the solution. Ciro explains that while we trust agents to write code (because we can roll it back), allowing them to write data requires strict safety rails. He breaks down how Bauplan uses "Git for Data" semantics - branching, isolation, and transactionality - to provide an air-gapped sandbox where agents can safely operate without corrupting production data. Welcome to the future of the lakehouse. Bauplan: https://www.bauplanlabs.com/
Just launched your Substack? Great! Here’s what to do next. This episode covers the realities of writing long-form in public, the traps that cause most writers to stall, how to build consistency, and how to grow an engaged audience from day one.
Data engineering is undergoing a fundamental shift. In this episode, I sit down with Nick Schrock, founder and CTO of Dagster, to discuss why he went from being an "AI moderate" to believing 90% of code will be written by AI. Being hands-on also led to a massive pivot in Dagster’s roadmap and a new focus on managing and engineering context. We dive deep into why simply feeding data to LLMs isn't enough. Nick explains why real-time context tools (like MCPs) can become "token hogs" that lack precision and why the future belongs to "context pipelines": offline, batch-computed context that is governed, versioned, and treated like code. We also explore Compass, Dagster’s new collaborative agent that lives in Slack, bridging the gap between business stakeholders and data teams. If you’re wondering how your role as a data engineer will evolve in an agentic world, this conversation maps out the territory. Dagster: dagster.io | Nick Schrock on X: @schrockn
The days of easy entry into data jobs are over. Maggie Wolff joins the show to discuss the new reality of the data career landscape. We dive into why the bar is higher than ever and why "cold DMing" on LinkedIn is a terrible strategy. Maggie also breaks down her secret strategy for networking as an introvert: treating events like a game or role-playing a more extroverted friend. Plus, we discuss the rise of AI in education, the problem with "lazy" learning, and why companies replacing humans with AI are making a mistake.
Matt Housley joins me for our monthly round-up of topics. This time, there's danger everywhere: the AI bubble, how vibe coding is evolving, AI slop, and more.
After 1,500+ conversations with CDOs and VPs of data, guest Malcolm Hawker noticed a disturbing pattern: a "limiting mindset" that causes data leaders to fail. He argues that too many leaders blame external factors such as "culture," "data literacy," or a lack of support rather than taking accountability for delivering value. In this conversation, Malcolm breaks down how this mindset is reinforced by the analyst and consultant community and why it leads to a "value fatigue" where no one can prove their own ROI. He offers a clear path forward, starting with a simple three-question framework for any new CDO, and explains why "culture" is actually an outcome of delivering value, not a prerequisite for it. We also discuss his new book, "The Data Hero Playbook," and tackle the "AI Ready" myth, explaining why conflating it with "BI Ready" is holding companies back and why your data is likely "good enough" to start right now.
In this conversation, Dr. Cecilia Dones and I discuss the social skills we're losing as AI becomes more integrated into our lives. We explore the erosion of social norms, from AI companions joining Zoom calls without consent and endless enshittified content to my son's generation calling AI girlfriends "clankers". Is there hope? We break down the "rage currency" that dominates media and the positive AI stories that go unheard. The biggest takeaway: as the world becomes more synthetic, "showing up" in person will become the ultimate "premium value."
In conversations I've been having with leaders and practitioners, there are open-ended questions about the impact of AI on vendors and open-source projects. If you don't have a moat, you need to start thinking about how AI coding tools will erode the edges of your product. And what about getting users and traction? I cover this and much more in this episode. Enjoy!
Sujay Dutta and Sidd Rajagopal, authors of "Data as the Fourth Pillar," join the show to make the compelling case that for C-suite leaders obsessed with AI, data must be elevated to the same level as people, process, and technology. They provide a practical playbook for Chief Data Officers (CDOs) to escape the "cost center" trap by focusing on the "demand side" (business value) instead of just the "supply side" (technology). They also introduce frameworks like "Data Intensity" and "Total Addressable Value (TAV)" for data. We also tackle the reality of AI "slopware" and the "Great Pacific garbage patch" of junk data, explaining how to build the critical "context" (or "Data Intelligence Layer") that most GenAI projects are missing. Finally, they explain why the CDO must report directly to the CEO to play "offense," not defense.
Matt Turck (VC at FirstMark) joins the show to break down the most controversial MAD (Machine Learning, AI, and Data) Landscape yet. This year, the team "declared bankruptcy" and cut over 1,000 logos to better reflect the market reality: a "Cambrian explosion" of AI companies and a fierce "struggle and tension between the very large companies and the startups". Matt discusses why incumbents are "absolutely not lazy", which categories have "largely just gone away" (like Customer Data Platforms and Reverse ETL), and what new categories (like AI Agents and Local AI) are emerging. We also cover his investment thesis in a world dominated by foundation models, the "very underestimated" European AI scene, and whether an AI could win a Nobel Prize by 2027. https://www.mattturck.com/mad2025
I travel a TON, and the most frequent questions I get relate to travel: why I do it and any tips I have. Here, I answer those questions and more.
Jeremiah Lowin, founder of Prefect, returns to the show to discuss the seismic shift in the data and AI landscape since our last conversation a few years ago. He shares the wild origin story of FastMCP, a project he started to create a more "Pythonic" wrapper for Anthropic's Model Context Protocol (MCP). Jeremiah explains how this side project was incorporated into Anthropic's official SDK and then exploded to over a million downloads a day after MCP gained support from OpenAI and Google. He clarifies why this is a complementary expansion for Prefect, not a pivot, and provides a simple analogy for MCP as the "USB-C for AI agents". Most surprisingly, Jeremiah reveals that the primary adoption of MCP isn't for external products, but internally by data teams who are using it to finally fulfill the promise of the self-serve semantic layer and create a governable, "LLM-free zone" for AI tools.
I'm back with some notes from the road, thoughts on choosing tools and vendors, having a plan B for tools, and more.
There's no shortage of technical content for data engineers, but a massive gap exists when it comes to the non-technical skills required to advance beyond a senior role. I sit down with Yordan Ivanov, Head of Data Engineering and writer of "Data Gibberish," to talk about this disconnect. We dive into his personal journey of failing as a manager the first time, learning the crucial "people" skills, and his current mission to help data engineers learn how to speak the language of business.
Key areas we explore:
The Senior-Level Content Gap: Yordan explains why his non-technical content on career strategy and stakeholder communication gets "terrible" engagement compared to technical posts, even though it's what's needed to advance.
The Managerial Trap: Yordan's candid story about his first attempt at management, where he failed because he cared only about code and wasn't equipped for the people-centric aspects and politics of the role.
The Danger of AI Over-reliance: A deep discussion on how leaning too heavily on AI can prevent the development of fundamental thinking and problem-solving skills, both in coding and in life.
The Maturing Data Landscape: We reflect on the end of the "modern data stack euphoria" and what the wave of acquisitions means for innovation and the future of data tooling.
AI Adoption in Europe vs. the US: A look at how AI adoption is perceived as massive and mandatory in Europe, while US census data shows surprisingly low enterprise adoption rates.
The world of data is being reset by AI, and the infrastructure needs to evolve with it. I sit down with streaming legend Tyler Akidau to discuss how the principles of stream processing are forming the foundation for the next generation of "agentic AI" systems. Tyler, who was an AI cynic until recently, explains why he's now convinced that AI agents will fundamentally change how businesses operate and what problems we need to solve to deploy them safely.
Key topics we explore:
From Human Analytics to Agentic Systems: How data architectures built for human analysis must be re-imagined for a world with thousands of AI agents operating at machine speed.
Auditing Everything: Why managing AI requires a new level of governance where we must record all data an agent touches, not just metadata, to diagnose its complex and opaque behavior.
The End of Windowing's Dominance: Tyler reflects on the influential Dataflow paper he co-authored and explains why he now sees a table-based abstraction as a more powerful and user-friendly model than focusing on windowing.
The D&D Alignment of AI: Tyler's brilliant analogy for why enterprises are struggling to adopt AI: we're trying to integrate "chaotic" agents into systems built for "lawful good" employees.
A Reset for the Industry: Why the rise of AI feels like the early 2010s of streaming, where the problems are unsolved and everyone is trying to figure out the answers.