CS224U

16 Episodes


Sam Bowman on benchmarking and AI alignment

2023-02-23 · 01:26:27

Lessons learned about benchmarking, adversarial testing, the dangers of over- and under-claiming, and AI alignment. Transcript: https://web.stanford.edu/class/cs224u/podcast/bowman/ Sam's website Sam on Twitter NYU Linguistics NYU Data Science NYU Computer Science Anthropic SNLI paper: A large annotated corpus for learning natural language inference SNLI leaderboard FraCaS SICK A SICK cure for the evaluation of compositional distributional semantic models SemEval-2014 Task 1: Evaluation of Compositional Distributional Semantic Models on Full Sentences through Semantic Relatedness and Textual Entailment RTE Knowledge Resources Richard Socher Chris Manning Andrew Ng Ray Kurzweil SQuAD Gabor Angeli Adina Williams Adina Williams podcast episode MultiNLI paper: A broad-coverage challenge corpus for sentence understanding through inference MultiNLI leaderboards Twitter discussion of LLMs and negation GLUE SuperGLUE DecaNLP GPT-3 paper: Language Models are Few-Shot Learners FLAN Winograd schema challenges BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding JSALT: General-Purpose Sentence Representation Learning Ellie Pavlick Ellie Pavlick podcast episode Tal Linzen Ian Tenney Dipanjan Das Yoav Goldberg Fine-grained Analysis of Sentence Embeddings Using Auxiliary Prediction Tasks Big Bench Upwork Surge AI Dynabench Douwe Kiela Douwe Kiela podcast episode Ethan Perez NYU Alignment Research Group Eliezer Shlomo Yudkowsky Alignment Research Center Redwood Research Percy Liang podcast episode Richard Socher podcast episode

Amir Goldberg on the impact of AI

2023-01-27 · 01:28:19

AI and social science, the causal revolution in economics, predictions about the impact of AI, teaching MBAs, productizing AI, and a journey from Tel Aviv to Princeton to Stanford. Transcript: https://web.stanford.edu/class/cs224u/podcast/goldberg/ Amir's website Amir on Twitter Computational Culture Lab ChatGPT Laura Nelson Bart Bonikowski Chris Winship Bernie Koch Treebanks BIG-bench Guido Imbens Endogeneity Susan Athey Cambridge Analytica Prediction Machines Speech and Language Processing DALL-E 2 Midjourney Stable Diffusion Postmodernism, or, the Cultural Logic of Late Capitalism Turing test Matt Salganik Paul DiMaggio

Marie-Catherine de Marneffe on understanding your data

2022-11-07 · 01:08:41

Leaving Ohio, being back in Belgium, organizing NAACL 2022, reviewing at NLP-scale, universal dependencies, and doing NLU before it was cool. Transcript: https://web.stanford.edu/class/cs224u/podcast/demarneffe/ Marie's website Generating Typed Dependency Parses from Phrase Structure Parses Universal Dependencies project OSU Linguistics NAACL 2022 Dan Jurafsky Dan Roth Chris Manning ARR Priscilla Rasmussen Transactions of the ACL Finding Contradictions in Text Not a simple yes or no: Uncertainty in indirect answers Recognizing Textual Entailment Anna Rafferty Scott Grimm "Was It Good? It Was Provocative." Learning the Meaning of Scalar Adjectives Did It Happen? The Pragmatic Complexity of Veridicality Assessment Yejin Choi Yejin Choi's ACL 2022 talk Barbara Plank Linguistically debatable or just plain wrong? Jesse Dodge Reproducibility badges at NAACL 2022 Stanford Sentiment Treebank Judith Tonhauser Nan-Jiang Jiang Lauri Karttunen Climbing towards NLU: On Meaning, Form, and Understanding in the Age of Data Microsoft DeBERTa surpasses human performance on the SuperGLUE benchmark Daniel Zeman Marta Recasens

Sasha Rush on NLP research, engineering, and education

2022-10-04 · 01:22:39

Coding puzzles, practices, and education, structured prediction, the culture of Hugging Face, large models, and the energy of New York. Transcript: https://web.stanford.edu/class/cs224u/podcast/rush/ Sasha's website Sasha on Twitter Sasha on the Humans of AI podcast Sasha on The Thesis Review Podcast with Sean Welleck Sasha on the Talking Machines Podcast Sasha interviewed by Sayak Paul Hugging Face PyTorch The Annotated Transformer The Annotated Alice The Annotated S4 Sasha and Dan Oneață's declarative graphics library Chalk Drawing Big Ben in Chalk OpenNMT Ken Shan Blog post by Ken and Dylan Thurston Edward Z. Yang Stuart Shieber Literate programming Soumith Chintala Lua Torch TensorFlow Graham Neubig Chris Dyer DyNet JAX jax.vmap Matt Johnson Finale Doshi-Velez, whose undergrad ML course inspired and informed Sasha's Tensor Puzzles GPU Puzzles A tweet that Chris added to his CV Adam Paszke Dougal MacLaurin Dex Named Tensor notation Named Tensors in PyTorch TorchDim Mini Torch Torch-Struct Sarah Hooker's paper 'The hardware lottery' Jacob Andreas Kevin Ellis Hugging Face transformers library Hugging Face datasets library Hugging Face diffusers library Hugging Face evaluate library scikit-learn Big Science blog BLOOM The Technology Behind BLOOM Training CRFM Eleuther T0 and PromptSource Washington Post: Big Tech builds AI with bad data. So scientists sought better data The bet: Is Attention All You Need? Democratizing access to large-scale language models with OPT-175B Epic OPT-175 Logbook Google's PaLM United's shares plunge 76% on bogus bankruptcy report Imagen Albert Gu Bell Labs

Diyi Yang on socially aware language technologies

2022-08-01 · 01:21:49

Moving to Stanford, linguistic and social variation, interventional studies, and shared stories and lessons learned from an ACL Young Rising Star. Transcript: https://web.stanford.edu/class/cs224u/podcast/yang/ Diyi's website Diyi on Twitter Dan Jurafsky The Stanford NLP Group Buford Highway in Atlanta Sweet tea VALUE paper AAE GLUE Negative concord Exploring the role of grammar and word choice in bias toward African American English (AAE) in hate speech classification Inducing positive perspectives with text reframing Dynabench Datasheets for datasets MTurk Upwork Prolific Seekers, Providers, Welcomers, and Storytellers: Modeling Social Roles in Online Health Communities ToTTo: A controlled table-to-text generation dataset Six questions for socially aware language technologies The importance of modeling social factors of language: Theory and practice Dirk Hovy Workshop on Shared Stories and Lessons Learned EMNLP 2022 Workshop on Shared Stories and Lessons Learned ICCV 2021 Jeff Hancock

Maria Antoniak on cultural analytics

2022-06-27 · 01:26:26

Birth narratives, stable static representations, NLP for everyone, AI2 and Semantic Scholar, the mission of Ukrainian Catholic University, and books books books. Transcript: https://web.stanford.edu/class/cs224u/podcast/antoniak/ Maria's website Maria on Twitter Semantic Scholar Elliott Ash ETH Zurich Center for Law and Economics Text As Data (TADA) 2022 David Mimno A computational reading of a birth stories community r/BabyBumps Roger Schank Nate Chambers ICWSM 2022 workshop: BERT for Social Sciences and Humanities Measuring Word Similarity with BERT (Sephora Makeup Reviews) Melanie Walsh word2vec BERT Nick Vincent's Twitter thread on Meta's OPT-175B filtering strategies Stemming Alexandra Schofield LDA LSA GloVe Evaluating the stability of embedding-based word similarities Narrative datasets through the lenses of NLP and HCI Belmont report Casey Fiesler Naive Bayes Allen Institute CORD-19 dataset, which appeared March 16, 2020! Books books books Pushkin Press New York Review Books Posthumous Memoirs of Brás Cubas And Then There Were None Stanisław Lem Jeff VanderMeer Italo Calvino Jorge Luis Borges xkcd War and Peace Middlemarch Beloved Novelist Cormac McCarthy's tips on how to write a great science paper Blood Meridian No Country for Old Men (book) No Country for Old Men (movie) The Road Taking a visual walk through Burnt Norton Ukrainian Catholic University Support Ukraine Now: Real Ways You can Help Ukraine Let Ukraine Speak: Integrating Scholarship on Ukraine into Classroom Syllabi Ukraine Trust Chain spilka World Central Kitchen Caritas Ukraine Science for Ukraine Data Science Crash Course: Interview Prep

Percy Liang on the Center for Research on Foundation Models

2022-06-13 · 01:27:24

Realizing that Foundation Models are a big deal, scaling, why Percy founded CRFM, Stanford's position in the field, benchmarking, privacy, and CRFM's first and next 30 years. Transcript: https://web.stanford.edu/class/cs224u/podcast/liang/ Percy's website Percy on Twitter CRFM On the opportunities and risks of foundation models ELMo: Deep contextualized word representations BERT: Pre-training of deep bidirectional Transformers for language understanding Sam Bowman GPT-2 Adversarial examples for evaluating reading comprehension systems System 1 and System 2 The Unreasonable Effectiveness of Data Chinchilla: Training Compute-Optimal Large Language Models GitHub Copilot LaMDA: Language models for dialog applications AI Test Kitchen DALL-E 2 Richard Socher on the CS224U podcast you.com Chris Ré Fei-Fei Li Chris Manning HAI Rob Reich Erik Brynjolfsson Dan Ho Russ Altman Jeff Hancock The time is now to develop community norms for the release of foundation models Twitter Spaces event Best practices for deploying language models Model Cards for model reporting Datasheets for datasets Strathern's law

Roger Levy on computational psycholinguistics in the deep learning era

2022-06-10 · 01:28:46

From genes to memes, evidence in linguistics, central questions of computational psycholinguistics, academic publishing woes, and the benefits of urban density. Transcript: https://web.stanford.edu/class/cs224u/podcast/levy/ Roger's website Roger on Twitter Roger's courses The Selfish Gene Joan Bresnan John Rickford Chris Manning Noah Goodman Thomas Clark Ted Gibson Ethan Wilcox Critical period Yevgeni Berzak Heritage language How many words do kids hear each year? See footnote 10. W.E.I.R.D Kristina Gulordava Poverty of stimulus hypothesis Formal grammar and information theory: together again? Expectation-based syntactic comprehension Google Ngram viewer Google Ngram data files Geoff Hinton's 2001 Rumelhart Prize from the Cognitive Science Society Center embedding Mark Johnson Stuart Shieber Ivan Sag Cognitive constraints and island effects The Chicken or the Egg? A Probabilistic Analysis of English Binomials Sarah Bunin Benor Roger's pinned tweet Eric Baković MIT's committee on the library system Project DEAL Diamond open access Fernanda Ferreira Brian Dillon Glossa Psycholinguistics Glossa Johan Rooryck La Jolla Cove

Kalika Bali on language technologies for a multilingual world

2022-05-24 · 01:29:07

Giving a TED talk, linguistic diversity, code switching and large language models, the Indian NLP scene, empowering women with language consultation work, Wordle, and "once a linguist, always a linguist". Transcript: https://web.stanford.edu/class/cs224u/podcast/bali/ Kalika's website Kalika on Twitter Kalika's TED talk Microsoft Research India HAL IndicBERT AI4Bharat mBERT Hindi Bangla English Gondi Adivasi radio Oriya Karya crowdsourcing platform Sandy Chung Language processing experiments in the field Tamil Telugu Idu Mishmi COMPASS 2022 Digital Green Everwell Wordle Information-theoretic analysis of Wordle

Yulia Tsvetkov on ethical NLP

2022-05-16 · 01:23:03

Coast-to-coast professional journeys, multilingual NLP, teaching in a fast-changing field, the history of hate speech detection in NLP, ethics review of NLP research, research on sensitive topics, mentoring researchers, and optimizing for your own passions. Transcript: https://web.stanford.edu/class/cs224u/podcast/tsvetkov/ Yulia's website TsvetShop Shuly Wintner Just when I thought I was out ... Algorithms for NLP HMMs Kneser–Ney smoothing Noah Smith Demoting racial bias in hate speech detection The risk of racial bias in hate speech detection Demoting racial bias in hate speech detection Fortifying toxic speech detectors against veiled toxicity This is the daily stormer's playbook Microaggressions.com Finding microaggressions in the wild: A case for locating elusive phenomena in social media posts https://delphi.allenai.org Delphi: Towards Machine Ethics and Norms Yejin Choi

Ellie Pavlick on true language understanding

2022-05-09 · 01:23:32

Grounding through pure language modeling objectives, the origins of probing, the nature of understanding, the future of system assessment, signs of meaningful progress in the field, and having faith in yourself. Transcript: https://web.stanford.edu/class/cs224u/podcast/pavlick/ Ellie's website The LUNAR Lab MIT Scientist Captures 90,000 Hours of Video of His Son’s First Words, Graphs It Michael Frank Spot robots Dylan Ebert Ian Tenney What do you learn from context? Probing for sentence structure in contextualized word representations BERT Rediscovers the Classical NLP Pipeline JSALT: General-Purpose Sentence Representation Learning Sam Bowman Skip thought vectors What you can cram into a single $&!#* vector: Probing sentence embeddings for linguistic properties Hex Charlie Lovering Designing and interpreting probes with control tasks Jerry Fodor Been Kim Mycal Tucker What if this modified that? Syntactic interventions via counterfactual embeddings Yonatan Belinkov HANS: Right for the Wrong Reasons: Diagnosing Syntactic Heuristics in Natural Language Inference Conceptual pacts and lexical choice in conversation Locating and editing factual knowledge in GPT Could a purely self-supervised language model achieve grounded language understanding? Dartmouth Summer Research Project on Artificial Intelligence (1956) Equal numbers of neuronal and nonneuronal cells make the human brain an isometrically scaled-up primate brain

Richard Socher on conviction in research

2022-05-02 · 01:34:57

The early days of the rise of deep learning in NLP, conviction, the importance of applied work in the current moment, start-up risks, the state of Web search, paramotoring, and overlooked gems in the U.S. National Park system. Transcript: https://web.stanford.edu/class/cs224u/podcast/socher/ Richard's website Richard's Twitter Richard's paramotoring Salesforce Research you.com @YouSearchEngine Nate Chambers Tensor Product Variable Binding and the Representation of Symbolic Structures in Connectionist Systems Parsing Natural Scenes and Natural Language with Recursive Neural Networks CS224n Collobert and Weston 2008: A unified architecture for natural language processing: deep neural networks with multitask learning Stephen Merity CoVe: Learned in Translation: Contextualized Word Vectors https://believermag.com/ghosts/ Frances Arnold ProGen: Language Modeling for Protein Generation Lav Varshney decaNLP Chris Ré Loebner Prize Eugene Goostman Richard's castle on airbnb American Samoa National Park Great Sand Dunes National Park White Sands National Park Zion National Park

Omar Khattab on neural information retrieval

2022-04-25 · 01:25:29

Pronouncing "ColBERT", the origins of ColBERT, doing NLP from an IR perspective, how getting "scooped" can be productive, OpenQA and related tasks, PhD journeys, why even retrieval plus attention is not all you need, multilingual knowledge-intensive NLP, and aiming high in research projects. Transcript: https://web.stanford.edu/class/cs224u/podcast/khattab/ Omar's website Matei Zaharia Keshav Santhanam Stephen Colbert throwing paper with Obama The ColBERT paper and the ColBERTv2 paper DeepImpact: Learning passage impacts for inverted indexes DPR: Dense passage retrieval for open-domain question answering Incorporating query term independence assumption for efficient retrieval and ranking using deep neural networks DeepCT: Context-aware sentence/passage term importance estimation for first stage retrieval Reading Wikipedia to answer open-domain questions ORQA: Latent retrieval for weakly supervised open domain question answering QRECC ColBERT-QA: Relevance-guided Supervision for OpenQA with ColBERT Baleen: Robust Multi-Hop Reasoning at Scale via Condensed Retrieval Passage reranking with BERT UniK-QA: Unified Representations of Structured and Unstructured Knowledge for Open-Domain Question Answering Self-driving search engines: The neural hype and comparisons against weak baselines Mohammad Hammoud RAG: Retrieval-augmented generation for knowledge-intensive NLP tasks Hindsight: Posterior-guided training of retrievers for improved open-ended generation Learning Cross-Lingual IR from an English Retriever Blog post: A moderate proposal for radically better AI-powered Web search Blog post: Building scalable, explainable, and adaptive NLP models with retrieval XOR-TyDi

Adina Williams on deep linguistic analysis in NLP

2022-04-20 · 01:27:52

Neuroscience and neural networks, being a linguist in the world of NLP, evaluation methods, fine-grained NLI questions, the pace of research, and the vexing fact that, on the internet, people = men. Transcript: https://web.stanford.edu/class/cs224u/podcast/williams/ Adina's website Adina on Twitter Based on billions of words on the internet, people = men April Bailey Andrei Cimpian Androcentrism GloVe fastText Preregistration P-hacking Rishi Bommasani Interpreting pretrained contextualized representations via reductions to static embeddings Common Crawl Battlestar Galactica MultiNLI ANLI DynaBench Breaching experiment Liina Pylkkänen Sam Bowman Nikita Nangia Ludwig Wittgenstein Ido Dagan Sebastian Riedel SNLI Yixin Nie Douwe Kiela Jason Weston Emily Dinan Build it break it fix it for dialogue safety: Robustness from adversarial human attack Allyson Ettinger GLUE DynaSent DeBERTa RoBERTa Breaking NLI systems with sentences that require simple lexical inferences HANS: Right for the wrong reasons: Diagnosing syntactic heuristics in natural language inference Targeted syntactic evaluation of language models Max Bartolo Magnetoencephalography Marr's Levels Marco Baroni Richard Futrell Ryan Cotterell SIGMORPHON 2022 Alexis Conneau FLORES XNLI OCNLI: Original Chinese natural language inference Yann LeCun Novel Ideas in Learning-to-Learn through Interaction Grounding semantics in olfactory perception Brain in a vat Ellie Pavlick Tom Kwiatkowski Mohit Bansal Identifying inherent disagreement in natural language inference DALL-E 2 Winoground Gary Marcus on Winoground

Douwe Kiela on research at Hugging Face

2022-04-18 · 01:21:36

Hugging Face, multimodality, data and model auditing, ethics review, adversarial testing, attention as more and less than you ever needed, neural information retrieval, philosophy of mind and consciousness, augmenting human creativity, openness in science, and a definitive guide to pronouncing Douwe. Transcript: https://web.stanford.edu/class/cs224u/podcast/kiela/ Douwe's website Hugging Face Grounding semantics in olfactory perception Model Cards for Model reporting Datasheets for datasets Dynabench Hugging Face Spaces http://www.isattentionallyouneed.com The Annotated S4 Retrieval-Augmented Generation for knowledge-intensive NLP tasks Language models as slightly conscious Fields of wheat as slightly pasta True few-shot learning with language models https://believermag.com/ghosts/ I Am A Strange Loop AI Dungeon LIGHT Good first issue

Rishi Bommasani on Foundation Models

2022-04-11 · 01:29:59

Deriving static representations from contextual ones, interdisciplinary research, training large models, the Foundation Models paper and CRFM, being an academic on Twitter, and progress in NLP. Transcript: https://web.stanford.edu/class/cs224u/podcast/bommasani/ Rishi's website Rishi on Twitter Bommasani et al 2020 On the opportunities and risks of foundation models Reflections on foundation models EleutherAI Chinchilla http://www.isattentionallyouneed.com Rishi on The Gradient podcast
