NLP Highlights
129 - Transformers and Hierarchical Structure, with Shunyu Yao


Update: 2021-07-02

Description

In this episode, we talk to Shunyu Yao about recent insights into how transformers can represent hierarchical structure in language. Bounded-depth hierarchical structure is thought to be a key feature of natural languages, motivating Shunyu and his coauthors to show that transformers can efficiently represent bounded-depth Dyck languages, which can be thought of as a formal model of the structure of natural languages. We went on to discuss some of the intuitive ideas that emerge from the proofs, connections to RNNs, and insights about positional encodings that may have practical implications. More broadly, we also touched on the role of formal languages and other theoretical tools in modern NLP.
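To make the central object concrete: a bounded-depth Dyck language contains balanced bracket strings whose nesting never exceeds a fixed depth bound. The sketch below (not from the episode or paper; the function name and parameters are illustrative) recognizes such strings with a simple stack, under the assumption of two bracket types and a depth bound of 2.

```python
# Illustrative sketch: a recognizer for bounded-depth Dyck languages,
# i.e. balanced strings over a fixed set of bracket pairs whose nesting
# depth never exceeds a bound. Names and defaults are hypothetical.

def is_bounded_dyck(s, pairs=("()", "[]"), max_depth=2):
    """Return True if s is balanced over the given bracket pairs and its
    nesting depth never exceeds max_depth."""
    opens = {p[0]: p[1] for p in pairs}   # open bracket -> matching close
    closes = {p[1] for p in pairs}
    stack = []                            # expected closing brackets
    for ch in s:
        if ch in opens:
            stack.append(opens[ch])
            if len(stack) > max_depth:    # nesting exceeds the depth bound
                return False
        elif ch in closes:
            if not stack or stack.pop() != ch:
                return False              # unmatched or crossed brackets
        else:
            return False                  # character outside the alphabet
    return not stack                      # every bracket must be closed
```

Here a transformer (or RNN) that models the language must, in effect, track the same stack state; the paper's question is how efficiently self-attention can do so.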

Papers discussed in this episode:

- Self-Attention Networks Can Process Bounded Hierarchical Languages (https://arxiv.org/abs/2105.11115)
- Theoretical Limitations of Self-Attention in Neural Sequence Models (https://arxiv.org/abs/1906.06755)
- RNNs can generate bounded hierarchical languages with optimal memory (https://arxiv.org/abs/2010.07515)
- On the Practical Computational Power of Finite Precision RNNs for Language Recognition (https://arxiv.org/abs/1805.04908)

Shunyu Yao's webpage: https://ysymyth.github.io/

The hosts for this episode are William Merrill and Matt Gardner.

Allen Institute for Artificial Intelligence