Complete Beginner's Course on AI Evaluations: Step by Step (2025) | Aman Khan

Update: 2025-08-24
Description

Today, I want to share a new episode with Aman Khan. The best way to learn about AI evaluations is to watch two PMs build them live from scratch. In this episode, Aman and I walk through creating evals for an AI customer support agent, from labeling a golden dataset to aligning LLM judges. This is the complete beginner's AI eval course you've been waiting for.

Aman and I talked about:

(00:00) What are AI evals and how to get good at them

(02:52) The 4 types of AI evaluations everyone should know

(06:08) Live demo: Building evals for a customer support agent

(10:29) Using Anthropic's console to generate great prompts

(15:13) Creating the evaluation criteria

(17:40) Adding human labels to the golden dataset

(31:05) Scaling evals with LLM-judge prompts

(38:21) How to align LLM judges with human judgment

Get the takeaways: https://creatoreconomy.so/p/complete-beginner-course-on-ai-evaluations-aman-khan

Where to find Aman:

LinkedIn: https://www.linkedin.com/in/amanberkeley/

Website: https://arize.com/

📌 Subscribe to this channel – more interviews coming soon!
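
For readers who want a concrete picture of what "aligning LLM judges with human judgment" can mean, here is a minimal sketch. It is not taken from the episode. It assumes you already have a small golden dataset where each response carries a human label and an LLM-judge label. The function names, the "good"/"bad" label set, and the sample labels are all hypothetical, made up for illustration. The sketch reports raw agreement, Cohen's kappa (agreement corrected for chance), and the rows where the two disagree, which are usually the rows worth reading before revising a judge prompt.

```python
from collections import Counter


def agreement_rate(human, judge):
    """Fraction of examples where the judge label equals the human label."""
    if len(human) != len(judge) or not human:
        raise ValueError("label lists must be non-empty and the same length")
    return sum(h == j for h, j in zip(human, judge)) / len(human)


def cohens_kappa(human, judge):
    """Agreement corrected for the agreement expected by chance."""
    n = len(human)
    observed = agreement_rate(human, judge)
    human_counts = Counter(human)
    judge_counts = Counter(judge)
    labels = set(human) | set(judge)
    # Chance agreement: probability both pick the same label independently.
    expected = sum(human_counts[l] * judge_counts[l] for l in labels) / (n * n)
    if expected == 1.0:  # both raters used a single identical label
        return 1.0
    return (observed - expected) / (1.0 - expected)


def disagreements(human, judge):
    """Indices of examples to review when revising the judge prompt."""
    return [i for i, (h, j) in enumerate(zip(human, judge)) if h != j]


# Hypothetical labels for eight support-agent responses (illustrative only).
human_labels = ["good", "good", "bad", "good", "bad", "bad", "good", "good"]
judge_labels = ["good", "bad", "bad", "good", "bad", "good", "good", "good"]

print(f"agreement: {agreement_rate(human_labels, judge_labels):.2f}")
print(f"kappa:     {cohens_kappa(human_labels, judge_labels):.2f}")
print(f"review rows: {disagreements(human_labels, judge_labels)}")
```

One plausible loop is to edit the judge prompt, re-label the same golden dataset, and check whether agreement on those rows improves.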

Peter Yang