DiscoverAI BreakdownDataRater: Meta-Learned Dataset Curation
DataRater: Meta-Learned Dataset Curation

DataRater: Meta-Learned Dataset Curation

Update: 2025-12-05
Share

Description

In this episode, we discuss DataRater: Meta-Learned Dataset Curation by Dan A. Calian, Gregory Farquhar, Iurii Kemaev, Luisa M. Zintgraf, Matteo Hessel, Jeremy Shar, Junhyuk Oh, András György, Tom Schaul, Jeffrey Dean, Hado van Hasselt, David Silver. The paper proposes DataRater, a meta-learning approach that estimates the value of individual training data points to improve dataset curation. By leveraging meta-gradients, DataRater optimizes data selection to enhance training efficiency on held-out data. Experiments demonstrate that filtering data with DataRater significantly boosts compute efficiency across various model scales and datasets.
Comments 
00:00
00:00
x

0.5x

0.8x

1.0x

1.25x

1.5x

2.0x

3.0x

Sleep Timer

Off

End of Episode

5 Minutes

10 Minutes

15 Minutes

30 Minutes

45 Minutes

60 Minutes

120 Minutes

DataRater: Meta-Learned Dataset Curation

DataRater: Meta-Learned Dataset Curation

agibreakdown