DataRater: Meta-Learned Dataset Curation
Update: 2025-12-05
Description
In this episode, we discuss DataRater: Meta-Learned Dataset Curation by Dan A. Calian, Gregory Farquhar, Iurii Kemaev, Luisa M. Zintgraf, Matteo Hessel, Jeremy Shar, Junhyuk Oh, András György, Tom Schaul, Jeffrey Dean, Hado van Hasselt, David Silver. The paper proposes DataRater, a meta-learning approach that estimates the value of individual training data points to improve dataset curation. By leveraging meta-gradients, DataRater optimizes data selection to enhance training efficiency on held-out data. Experiments demonstrate that filtering data with DataRater significantly boosts compute efficiency across various model scales and datasets.
Comments
In Channel



