The AI Native Dev - from Copilot today to AI Native Software Development tomorrow

AI Evaluation and Testing: How to Know When Your Product Works (or Doesn’t)

Updated: 2024-12-10

Description

This episode of AI Native Dev, hosted by Simon Maple and Guy Podjarny, features a mashup of conversations with leading figures in the AI industry. Guests include Des Traynor, founder of Intercom, who discusses the paradigm shift generative AI brings to product development. Rishabh Mehrotra, Head of AI at Sourcegraph, emphasizes the importance of evaluation processes over model training. Tamar Yehoshua, President of Products and Technology at Glean, shares her experiences in enterprise search and the challenges of using LLMs in data-sensitive environments. Finally, Simon Last, Co-Founder and CTO of Notion, talks about continuous improvement and the iterative processes behind Notion's AI features. Each guest offers practical insight into evaluating and testing AI-driven products.

Watch the episode on YouTube: https://youtu.be/gZ4sGROvOdQ


Tessl