Model Metrics: Benchmarking AI

Update: 2025-05-15

Description

In this episode of "Paul, Weiss Waking Up With AI," Katherine Forrest and Anna Gressel discuss AI benchmarking, exploring how these standardized tests evaluate AI models against each other and against human capabilities, helping developers and deployers assess performance, safety and progress toward artificial general intelligence.

Learn more about Paul, Weiss’s Artificial Intelligence practice:

https://www.paulweiss.com/industries/artificial-intelligence

In Channel

Synthetic Identity: Navigating the New Deepfake Frontier

2025-12-11 (25:42)

Gemini 3: Google’s Big Jump

2025-12-04 (22:30)

Pixels, Preemption, and Cheese Pie: How AI Is Breathing New Life Into the Video Privacy Protection Act

2025-11-28 (19:39)

AI Authorship on Trial: Thaler’s SCOTUS Bid and Getty’s UK Fight

2025-11-20 (21:34)

Autonomous Vehicles Unpacked: An Overview of the Technology, Safety, and the Road to Regulation

2025-11-13 (18:13)

Agentic Autonomy, Alignment, Acceleration: Anthropic’s Claude & Haiku 4.5

2025-11-06 (15:23)

Advancements in AI Video Generation

2025-10-30 (17:31)

Algorithmic Collusion: California’s AI and Antitrust Update

2025-10-23 (22:36)

Legislative Developments: Spotlight on California’s SB 53

2025-10-16 (16:24)

The Embodied AI Trifecta

2025-10-09 (22:36)

When Superintelligence Requires a New Social Contract

2025-10-02 (13:46)

Understanding Grokking

2025-09-25 (15:58)

AI and “Workflows”

2025-09-18 (18:13)

The Increasing Importance of Knowing Your Model’s Provenance

2025-09-11 (20:56)

Hot Week in AI Safety Developments

2025-09-04 (19:41)

AI Changes Coding

2025-08-28 (20:05)

New Developments in World Models

2025-08-21 (19:35)

OpenAI's Next Moves

2025-08-14 (29:28)

Subliminal Learning in AI

2025-08-07 (13:03)

Unpacking America’s AI Action Plan

2025-07-31 (33:25)
