The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

698 Episodes

Teaching Large Language Models to Reason with Reinforcement Learning with Alex Havrilla - #680

2024-04-16 45:54

Today we're joined by Alex Havrilla, a PhD student at Georgia Tech, to discuss "Teaching Large Language Models to Reason with Reinforcement Learning." Alex discusses the role of creativity and exploration in problem solving and explores the opportunities presented by applying reinforcement learning algorithms to the challenge of improving reasoning in large language models. Alex also shares his research on the effect of noise on language model training, highlighting the robustness of LLM architecture. Finally, we delve into the future of RL, and the potential of combining language models with traditional methods to achieve more robust AI reasoning. The complete show notes for this episode can be found at twimlai.com/go/680.

Localizing and Editing Knowledge in LLMs with Peter Hase - #679

2024-04-08 49:16

Today we're joined by Peter Hase, a fifth-year PhD student at the University of North Carolina NLP lab. We discuss "scalable oversight", and the importance of developing a deeper understanding of how large neural networks make decisions. We learn how matrices are probed by interpretability researchers, and explore the two schools of thought regarding how LLMs store knowledge. Finally, we discuss the importance of deleting sensitive information from model weights, and how "easy-to-hard generalization" could increase the risk of releasing open-source foundation models. The complete show notes for this episode can be found at twimlai.com/go/679.

Coercing LLMs to Do and Reveal (Almost) Anything with Jonas Geiping - #678

2024-04-01 47:57

Today we're joined by Jonas Geiping, a research group leader at the ELLIS Institute, to explore his paper: "Coercing LLMs to Do and Reveal (Almost) Anything". Jonas explains how neural networks can be exploited, highlighting the risk of deploying LLM agents that interact with the real world. We discuss the role of open models in enabling security research, the challenges of optimizing over certain constraints, and the ongoing difficulties in achieving robustness in neural networks. Finally, we delve into the future of AI security, and the need for a better approach to mitigate the risks posed by optimized adversarial attacks. The complete show notes for this episode can be found at twimlai.com/go/678.

V-JEPA, AI Reasoning from a Non-Generative Architecture with Mido Assran - #677

2024-03-25 47:17

Today we’re joined by Mido Assran, a research scientist at Meta’s Fundamental AI Research (FAIR). In this conversation, we discuss V-JEPA, a new model being billed as “the next step in Yann LeCun's vision” for true artificial reasoning. V-JEPA, the video version of Meta’s Joint Embedding Predictive Architecture, aims to bridge the gap between human and machine intelligence by training models to learn abstract concepts in a more efficient predictive manner than generative models. V-JEPA uses a novel self-supervised training approach that allows it to learn from unlabeled video data without being distracted by pixel-level detail. Mido walks us through the process of developing the architecture and explains why it has the potential to revolutionize AI. The complete show notes for this episode can be found at twimlai.com/go/677.

Video as a Universal Interface for AI Reasoning with Sherry Yang - #676

2024-03-18 49:04

Today we’re joined by Sherry Yang, senior research scientist at Google DeepMind and a PhD student at UC Berkeley. In this interview, we discuss her new paper, “Video as the New Language for Real-World Decision Making,” which explores how generative video models can play a role similar to language models as a way to solve tasks in the real world. Sherry draws the analogy between natural language as a unified representation of information and text prediction as a common task interface and demonstrates how video as a medium and generative video as a task exhibit similar properties. This formulation enables video generation models to play a variety of real-world roles as planners, agents, compute engines, and environment simulators. Finally, we explore UniSim, an interactive demo of Sherry's work and a preview of her vision for interacting with AI-generated environments. The complete show notes for this episode can be found at twimlai.com/go/676.

Assessing the Risks of Open AI Models with Sayash Kapoor - #675

2024-03-11 39:56

Today we’re joined by Sayash Kapoor, a Ph.D. student in the Department of Computer Science at Princeton University. Sayash walks us through his paper: “On the Societal Impact of Open Foundation Models.” We dig into the controversy around AI safety, the risks and benefits of releasing open model weights, and how we can establish common ground for assessing the threats posed by AI. We discuss the application of the framework presented in the paper to specific risks, such as the biosecurity risk of open LLMs, as well as the growing problem of "Non Consensual Intimate Imagery" using open diffusion models. The complete show notes for this episode can be found at twimlai.com/go/675.

OLMo: Everything You Need to Train an Open Source LLM with Akshita Bhagia - #674

2024-03-04 31:42

Today we’re joined by Akshita Bhagia, a senior research engineer at the Allen Institute for AI. Akshita joins us to discuss OLMo, a new open source language model with 7 billion and 1 billion variants, but with a key difference compared to similar models offered by Meta, Mistral, and others: AI2 has also published the dataset and key tools used to train the model. In our chat with Akshita, we dig into the OLMo models and the various projects falling under the OLMo umbrella, including Dolma, an open three-trillion-token corpus for language model pretraining, and Paloma, a benchmark and tooling for evaluating language model performance across a variety of domains. The complete show notes for this episode can be found at twimlai.com/go/674.

Training Data Locality and Chain-of-Thought Reasoning in LLMs with Ben Prystawski - #673

2024-02-26 24:33

Today we’re joined by Ben Prystawski, a PhD student in the Department of Psychology at Stanford University working at the intersection of cognitive science and machine learning. Our conversation centers on Ben’s recent paper, “Why think step by step? Reasoning emerges from the locality of experience,” which he recently presented at NeurIPS 2023. In this conversation, we start out exploring basic questions about LLM reasoning, including whether it exists, how we can define it, and how techniques like chain-of-thought reasoning appear to strengthen it. We then dig into the details of Ben’s paper, which aims to understand why thinking step-by-step is effective and demonstrates that local structure is the key property of LLM training data that enables it. The complete show notes for this episode can be found at twimlai.com/go/673.

Reasoning Over Complex Documents with DocLLM with Armineh Nourbakhsh - #672

2024-02-19 45:08

Today we're joined by Armineh Nourbakhsh of JP Morgan AI Research to discuss the development and capabilities of DocLLM, a layout-aware large language model for multimodal document understanding. Armineh provides a historical overview of the challenges of document AI and an introduction to the DocLLM model. Armineh explains how this model, distinct from both traditional LLMs and document AI models, incorporates both textual semantics and spatial layout in processing enterprise documents like reports and complex contracts. We dig into her team’s approach to training DocLLM, their choice of a generative model as opposed to an encoder-based approach, the datasets they used to build the model, their approach to incorporating layout information, and the various ways they evaluated the model’s performance. The complete show notes for this episode can be found at twimlai.com/go/672.

Are Emergent Behaviors in LLMs an Illusion? with Sanmi Koyejo - #671

2024-02-12 01:05:10

Today we’re joined by Sanmi Koyejo, assistant professor at Stanford University, to continue our NeurIPS 2023 series. In our conversation, Sanmi discusses his two recent award-winning papers. First, we dive into his paper, “Are Emergent Abilities of Large Language Models a Mirage?”. We discuss the different ways LLMs are evaluated and the excitement surrounding their “emergent abilities” such as the ability to perform arithmetic. Sanmi describes how evaluating model performance using nonlinear metrics can lead to the illusion that the model is rapidly gaining new capabilities, whereas linear metrics show smooth improvement as expected, casting doubt on the significance of emergence. We continue on to his next paper, “DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models,” discussing the methodology it describes for evaluating concerns such as the toxicity, privacy, fairness, and robustness of LLMs. The complete show notes for this episode can be found at twimlai.com/go/671.

AI Trends 2024: Reinforcement Learning in the Age of LLMs with Kamyar Azizzadenesheli - #670

2024-02-05 01:09:55

Today we’re joined by Kamyar Azizzadenesheli, a staff researcher at Nvidia, to continue our AI Trends 2024 series. In our conversation, Kamyar updates us on the latest developments in reinforcement learning (RL), and how the RL community is taking advantage of the abstract reasoning abilities of large language models (LLMs). Kamyar shares his insights on how LLMs are pushing RL performance forward in a variety of applications, such as ALOHA, a robot that can learn to fold clothes, and Voyager, an RL agent that uses GPT-4 to outperform prior systems at playing Minecraft. We also explore the progress being made in assessing and addressing the risks of RL-based decision-making in domains such as finance, healthcare, and agriculture. Finally, we discuss the future of deep reinforcement learning, Kamyar’s top predictions for the field, and how greater compute capabilities will be critical in achieving general intelligence. The complete show notes for this episode can be found at twimlai.com/go/670.

Building and Deploying Real-World RAG Applications with Ram Sriharsha - #669

2024-01-29 35:46

Today we’re joined by Ram Sriharsha, VP of engineering at Pinecone. In our conversation, we dive into the topic of vector databases and retrieval augmented generation (RAG). We explore the trade-offs between relying solely on LLMs for retrieval tasks versus combining retrieval in vector databases and LLMs, the advantages and complexities of RAG with vector databases, the key considerations for building and deploying real-world RAG-based applications, and an in-depth look at Pinecone's new serverless offering. Currently in public preview, Pinecone Serverless is a vector database that enables on-demand data loading, flexible scaling, and cost-effective query processing. Ram discusses how the serverless paradigm impacts the vector database’s core architecture, key features, and other considerations. Lastly, Ram shares his perspective on the future of vector databases in helping enterprises deliver RAG systems. The complete show notes for this episode can be found at twimlai.com/go/669.

Nightshade: Data Poisoning to Fight Generative AI with Ben Zhao - #668

2024-01-22 39:15

Today we’re joined by Ben Zhao, a Neubauer professor of computer science at the University of Chicago. In our conversation, we explore his research at the intersection of security and generative AI. We focus on Ben’s recent Fawkes, Glaze, and Nightshade projects, which use “poisoning” approaches to provide users with security and protection against AI encroachments. The first tool we discuss, Fawkes, imperceptibly “cloaks” images in such a way that models perceive them as highly distorted, effectively shielding individuals from recognition by facial recognition models. We then dig into Glaze, a tool that employs machine learning algorithms to compute subtle alterations that are indiscernible to human eyes but adept at tricking the models into perceiving a significant shift in art style, giving artists a unique defense against style mimicry. Lastly, we cover Nightshade, a strategic defense tool for artists akin to a 'poison pill' which allows artists to apply imperceptible changes to their images that effectively “break” generative AI models that are trained on them. The complete show notes for this episode can be found at twimlai.com/go/668.

Learning Transformer Programs with Dan Friedman - #667

2024-01-15 38:18

Today, we continue our NeurIPS series with Dan Friedman, a PhD student in the Princeton NLP group. In our conversation, we explore his research on mechanistic interpretability for transformer models, specifically his paper, Learning Transformer Programs. The LTP paper proposes modifications to the transformer architecture which allow transformer models to be easily converted into human-readable programs, making them inherently interpretable. In our conversation, we compare the approach proposed by this research with prior approaches to understanding the models and their shortcomings. We also dig into the approach’s function and scale limitations and constraints. The complete show notes for this episode can be found at twimlai.com/go/667.

AI Trends 2024: Machine Learning & Deep Learning with Thomas Dietterich - #666

2024-01-08 01:04:48

Today we continue our AI Trends 2024 series with a conversation with Thomas Dietterich, distinguished professor emeritus at Oregon State University. As you might expect, Large Language Models figured prominently in our conversation, and we covered a vast array of papers and use cases exploring current research into topics such as monolithic vs. modular architectures, hallucinations, the application of uncertainty quantification (UQ), and using RAG as a sort of memory module for LLMs. Lastly, don’t miss Tom’s predictions on what he foresees happening this year as well as his words of encouragement for those new to the field. The complete show notes for this episode can be found at twimlai.com/go/666.

AI Trends 2024: Computer Vision with Naila Murray - #665

2024-01-02 51:31

Today we kick off our AI Trends 2024 series with a conversation with Naila Murray, director of AI research at Meta. In our conversation with Naila, we dig into the latest trends and developments in the realm of computer vision. We explore advancements in the areas of controllable generation, visual programming, 3D Gaussian splatting, and multimodal models, specifically vision plus LLMs. We discuss tools and open source projects, including Segment Anything–a tool for versatile zero-shot image segmentation using simple text prompts, clicks, and bounding boxes; ControlNet–which adds conditional control to stable diffusion models; and DINOv2–a visual encoding model enabling object recognition, segmentation, and depth estimation, even in data-scarce scenarios. Finally, Naila shares her view on the most exciting opportunities in the field, as well as her predictions for upcoming years. The complete show notes for this episode can be found at twimlai.com/go/665.

Are Vector DBs the Future Data Platform for AI? with Ed Anuff - #664

2023-12-28 47:43

Today we’re joined by Ed Anuff, chief product officer at DataStax. In our conversation, we discuss Ed’s insights on RAG, vector databases, embedding models, and more. We dig into the underpinnings of modern vector databases (like HNSW and DiskANN) that allow them to efficiently handle massive and unstructured data sets, and discuss how they help users serve up relevant results for RAG, AI assistants, and other use cases. We also discuss embedding models and their role in vector comparisons and database retrieval as well as the potential for GPU usage to enhance vector database performance. The complete show notes for this episode can be found at twimlai.com/go/664.

Quantizing Transformers by Helping Attention Heads Do Nothing with Markus Nagel - #663

2023-12-26 46:19

Today we’re joined by Markus Nagel, research scientist at Qualcomm AI Research, who helps us kick off our coverage of NeurIPS 2023. In our conversation with Markus, we cover his accepted papers at the conference, along with other work presented by Qualcomm AI Research scientists. Markus’ first paper, Quantizable Transformers: Removing Outliers by Helping Attention Heads Do Nothing, focuses on tackling activation quantization issues introduced by the attention mechanism and how to solve them. We also discuss Pruning vs Quantization: Which is Better?, which focuses on comparing the effectiveness of these two methods in achieving model weight compression. Additional papers discussed focus on topics like using scalarization in multitask and multidomain learning to improve training and inference, using diffusion models for a sequence of state models and actions, applying geometric algebra with equivariance to transformers, and applying a deductive verification of chain of thought reasoning performed by LLMs. The complete show notes for this episode can be found at twimlai.com/go/663.

Responsible AI in the Generative Era with Michael Kearns - #662

2023-12-22 36:51

Today we’re joined by Michael Kearns, professor in the Department of Computer and Information Science at the University of Pennsylvania and an Amazon scholar. In our conversation with Michael, we discuss the new challenges to responsible AI brought about by the generative AI era. We explore Michael’s learnings and insights from the intersection of his real-world experience at AWS and his work in academia. We cover a diverse range of topics under this banner, including service card metrics, privacy, hallucinations, RLHF, and LLM evaluation benchmarks. We also touch on Clean Rooms ML, a secured environment that balances accessibility to private datasets through differential privacy techniques, offering a new approach for secure data handling in machine learning. The complete show notes for this episode can be found at twimlai.com/go/662.

Edutainment for AI and AWS PartyRock with Mike Miller - #661

2023-12-18 29:16

Today we’re joined by Mike Miller, director of product at AWS responsible for the company’s “edutainment” products. In our conversation with Mike, we explore AWS PartyRock, a no-code generative AI app builder that allows users to easily create fun and shareable AI applications by selecting a model, chaining prompts together, and linking different text, image, and chatbot widgets together. Additionally, we discuss some of the previous tools Mike’s team has delivered at the intersection of developer education and entertainment, including DeepLens, a computer vision hardware device, DeepRacer, a programmable vehicle that uses reinforcement learning to navigate a track, and lastly, DeepComposer, a generative AI model that transforms musical inputs and creates accompanying compositions. The complete show notes for this episode can be found at twimlai.com/go/661.

Comments (24)

ali ghanbarzade

It was fantastic! Thank u very much!

Nov 21st

Hamed Gh

great

Aug 1st

Andrew Miller

As someone interested in both data science and agriculture, I found this podcast fascinating. The potential applications for AI in agriculture are vast and exciting, but as the podcast notes, high-quality data annotation is crucial to the success of these technologies. That's why I highly recommend checking out this article on https://www.waybinary.com/types-of-data-annotation-for-ai-applications/, which delves deeper into the importance of data annotation and the different techniques used in the field.

Apr 21st

10/10 podcast about an interesting topic. Today AI is everywhere and without proper data processing, it just can't function right. Additional to info here, check https://www.businessmodulehub.com/blog/advantages-of-data/. Some information overlaps with the podcast, but still, many new tips on annotation automation and quality control. Strongly recommend it to anyone interested in machine learning.

Apr 20th

Emilia Gray

Even though automation has improved over the years, it still lacks intelligence. Machine learning algorithms can organize data themselves by learning the ownership of specific data types, which makes automation more efficient, you can find good specialists in this field here https://indatalabs.com/services/machine-learning-consulting

May 24th

Flavio Coelho

what's ADP?

Dec 12th

Duncan Pullen

This was a simply amazing episode. so much depth of information about real life and life changing AI/ML

Nov 22nd

Daniel Sierra

Best podcast on machine learning an ai

May 27th

Özgür Yüksel

Thanks a lot for introducing us to the genius of our age. Tremendously inspiring.

Dec 11th

Glory Dey

A very good insightful episode, Maki Moussavi explains the various points in a lucid manner. Truly, we are the captain of our life's ship. We are responsible for our own emotions and actions. Being proactive rather than reactive is the key to success and happiness! I will be reading this book! Thanks for sharing this interesting podcast. Have a great day!

Oct 15th

I love this channel and all the great podcasts. The topics are very relevant and the speakers are well informed experts so the episodes are very educative. Only request, please change the opening music note of the podcast. It is very unpleasant tune sets a jarring effect right at the beginning. Otherwise all these episodes are very interesting in the field of innovations in Artificial Intelligence and Machine Learning! Regards!

Jun 25th

Billy Bloomer

so smart you can smell it

Jun 14th

raqueeb shaikh

great podcast

May 31st

Loza Boza

Phenomenal discussion. Thank you! Particularly enjoyed the parts on generative models and the link to Daniel Kahneman.

May 20th

simon abdou

Horrible Audio

May 9th

This is a very realistic and proper episode which explains quantum computing even as alone.

Apr 9th

Naadodi

Hello all, Thanks for podcast Can we combine the two agent learnings from same environment to find the best actions Thanks

Mar 14th

Bhavul Gauri

notes : * Data scientists are not trained to think of money optimisations. plotting cpu usage vs accuracy gives an idea about it. if u increase data 4x as much just to gain 1% increase in accuracy that may not be great because you're using 4 times as much CPU power * a team just decicated to monitoring. i. monitor inputs : should not go beyond a certain range for each feature that you are supposed to have. Nulls ratio shouldn't change by a lot. ii. monitor both business and model metrics. sometimes even if model metrics get better ur business metrics could go low....and this could be the case like better autocompletion makes for low performance spell check OR it could also depend upon other things that have changed. or seasonality. * Data scientists and ML engineers in pairs. ML Engineers get to learn about the model while Data Scientists come up with it. both use same language. ML Engineers make sure it gets scaled up and deployed to production. * Which parameters are somewhat stable

Mar 11th

Abhijeet Gulati

great podcast. do we reference to papers that were discussed by Ganju. good job

Jan 22nd
