121. Alexei Baevski - data2vec and the future of multimodal learning

Update: 2022-04-27

Description

If the name data2vec sounds familiar, that’s probably because it made quite a splash on social and even traditional media when it came out, about two months ago. It’s an important entry in a growing list of strategies for building single machine learning architectures that can handle many different data types, like text, images and speech.

Most self-supervised learning techniques involve taking some input data (say, an image or a piece of text), masking out certain components of it (say, by blacking out pixels or words), and training a model to predict those masked-out components.

That “filling in the blanks” task is hard enough to force AIs to learn facts about their data that generalize well, but it also means training models to perform tasks that are very different depending on the input data type. Filling in blacked out pixels is quite different from filling in blanks in a sentence, for example.
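
To make the "filling in the blanks" setup concrete, here is a minimal Python sketch of the masking step alone. It is an illustration under stated assumptions, not code from data2vec or any other model: the `mask_inputs` helper, the `MASK` placeholder and the toy inputs are all hypothetical. It shows why the prediction targets differ by data type: for text they are words, for images they are pixel values.

```python
import random

MASK = None  # hypothetical placeholder that marks a hidden element


def mask_inputs(items, mask_ratio=0.3, seed=0):
    """Hide a random subset of items.

    Returns the masked copy and a dict mapping each hidden index
    to the original value a model would be trained to predict.
    """
    rng = random.Random(seed)
    n_masked = max(1, round(len(items) * mask_ratio))
    hidden = sorted(rng.sample(range(len(items)), n_masked))
    masked = list(items)
    targets = {}
    for i in hidden:
        targets[i] = masked[i]
        masked[i] = MASK
    return masked, targets


# Toy inputs (assumed for illustration): a sentence and a row of pixel intensities.
words = "the cat sat on the mat".split()
pixels = [0.1, 0.8, 0.5, 0.9, 0.3, 0.7]

masked_words, word_targets = mask_inputs(words)
masked_pixels, pixel_targets = mask_inputs(pixels)

# The same masking routine yields different kinds of targets:
# strings for text, floats for pixels.
print(masked_words, word_targets)
print(masked_pixels, pixel_targets)
```

The masking itself is shared across both inputs, but what the model has to predict is not, which is the mismatch the paragraph above describes.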

So what if there was a way to come up with one task that we could use to train machine learning models on any kind of data? That’s where data2vec comes in.

For this episode of the podcast, I’m joined by Alexei Baevski, a researcher at Meta AI and one of the creators of data2vec. In addition to data2vec, Alexei has been involved in quite a bit of pioneering work on text and speech models, including wav2vec, Facebook’s widely publicized unsupervised speech model. Alexei joined me to talk about how data2vec works and what’s next for that research direction, as well as the future of multimodal learning.

***

Intro music:

- Artist: Ron Gelinas

- Track Title: Daybreak Chill Blend (original mix)

- Link to Track: https://youtu.be/d8Y2sKIgFWc

***

Chapters:

2:00 Alexei’s background

10:00 Software engineering knowledge

14:10 Role of data2vec in progression

30:00 Delta between student and teacher

38:30 Losing interpreting ability

41:45 Influence of greater abilities

49:15 Wrap-up


The TDS team
