(Voiceover) Building on evaluation quicksand

Update: 2024-10-16

Description

Read the full post here: https://www.interconnects.ai/p/building-on-evaluation-quicksand

Chapters

00:00 Building on evaluation quicksand

01:26 The causes of closed evaluation silos

06:35 The challenge facing open evaluation tools

10:47 Frontiers in evaluation

11:32 New types of synthetic data contamination

13:57 Building harder evaluations

Figures

Fig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/manual/openai-predictions.webp



Get full access to Interconnects at www.interconnects.ai/subscribe

Nathan Lambert