Discover
Machine Learning Street Talk (MLST)

Machine Learning Street Talk (MLST)
Author: Machine Learning Street Talk (MLST)
Subscribed: 1,166Played: 36,032Subscribe
Share
© Machine Learning Street Talk (MLST)
Description
Welcome! We engage in fascinating discussions with pre-eminent figures in the AI field. Our flagship show covers current affairs in AI, cognitive science, neuroscience and philosophy of mind with in-depth analysis. Our approach is unrivalled in terms of scope and rigour – we believe in intellectual diversity in AI, and we touch on all of the main ideas in the field with the hype surgically removed. MLST is run by Tim Scarfe, Ph.D (https://www.linkedin.com/in/ecsquizor/) and features regular appearances from MIT Doctor of Philosophy Keith Duggar (https://www.linkedin.com/in/dr-keith-duggar/).
227 Episodes
Reverse
In this episode, hosts Tim and Keith finally realize their long-held dream of sitting down with their hero, the brilliant neuroscientist Professor Karl Friston. The conversation is a fascinating and mind-bending journey into Professor Friston's life's work, the Free Energy Principle, and what it reveals about life, intelligence, and consciousness itself.**SPONSORS**Gemini CLI is an open-source AI agent that brings the power of Gemini directly into your terminal - https://github.com/google-gemini/gemini-cli--- Take the Prolific human data survey - https://www.prolific.com/humandatasurvey?utm_source=mlst and be the first to see the results and benchmark their practices against the wider community!---cyber•Fund https://cyber.fund/?utm_source=mlst is a founder-led investment firm accelerating the cybernetic economyOct SF conference - https://dagihouse.com/?utm_source=mlst - Joscha Bach keynoting(!) + OAI, Anthropic, NVDA,++Hiring a SF VC Principal: https://talent.cyber.fund/companies/cyber-fund-2/jobs/57674170-ai-investment-principal#content?utm_source=mlstSubmit investment deck: https://cyber.fund/contact?utm_source=mlst***They kick things off by looking back on the 20-year journey of the Free Energy Principle. Professor Friston explains it as a fundamental rule for survival: all living things, from a single cell to a human being, are constantly trying to make sense of the world and reduce unpredictability. It’s this drive to minimize surprise that allows things to exist and maintain their structure.This leads to a bigger question: What does it truly mean to be "intelligent"? The group debates whether intelligence is everywhere, even in a virus or a plant, or if it requires a certain level of complexity. Professor Friston introduces the idea of different "kinds" of things, suggesting that creatures like us, who can model themselves and think about the future, possess a unique and "strange" kind of agency that sets us apart.From intelligence, the discussion naturally flows to the even trickier concept of consciousness. Is it the same as intelligence? Professor Friston argues they are different. He explains that consciousness might emerge from deep, layered self-awareness—not just acting, but understanding that you are the one causing your actions and thinking about your place in the world.They also explore intelligence at different sizes. Is a corporation intelligent? What about the entire planet? Professor Friston suggests there might be a "Goldilocks zone" for intelligence. It doesn't seem to exist at the super-tiny atomic level or at the massive scale of planets and solar systems, but thrives in the complex middle-ground where we live.Finally, they tackle one of the most pressing topics of our time: Can we build a truly conscious AI? Professor Friston shares his doubts about whether our current computers are capable of a feat like that. He suggests that genuine consciousness might require a different kind of "mortal" computation, where the machine's physical body and its "mind" are inseparable, much like in biological creatures.TRANSCRIPT:https://app.rescript.info/public/share/FZkF8BO7HMt9aFfu2_q69WGT_ZbYZ1VVkC6RtU3eeOITOC:00:00:00: Introduction & Retrospective on the Free Energy Principle00:09:34: Strange Particles, Agency, and Consciousness00:37:45: The Scale of Intelligence: From Viruses to the Biosphere01:01:35: Modelling, Boundaries, and Practical Application01:21:12: Conclusion
We are joined by Cristopher Moore, a professor at the Santa Fe Institute with a diverse background in physics, computer science, and machine learning.The conversation begins with Cristopher, who calls himself a "frog" explaining that he prefers to dive deep into specific, concrete problems rather than taking a high-level "bird's-eye view". They explore why current AI models, like transformers, are so surprisingly effective. Cristopher argues it's because the real world isn't random; it's full of rich structures, patterns, and hierarchies that these models can learn to exploit, even if we don't fully understand how.**SPONSORS**Take the Prolific human data survey - https://www.prolific.com/humandatasurvey?utm_source=mlst and be the first to see the results and benchmark their practices against the wider community!---cyber•Fund https://cyber.fund/?utm_source=mlst is a founder-led investment firm accelerating the cybernetic economy.Oct SF conference - https://dagihouse.com/?utm_source=mlst - Joscha Bach keynoting(!) + OAI, Anthropic, NVDA,++Hiring a SF VC Principal: https://talent.cyber.fund/companies/cyber-fund-2/jobs/57674170-ai-investment-principal#content?utm_source=mlstSubmit investment deck: https://cyber.fund/contact?utm_source=mlst***Cristopher Moore:https://sites.santafe.edu/~moore/TOC:00:00:00 - Introduction00:02:05 - Meet Christopher Moore: A Frog in the World of Science00:05:14 - The Limits of Transformers and Real-World Data00:11:19 - Intelligence as Creative Problem-Solving00:23:30 - Grounding, Meaning, and Shared Reality00:31:09 - The Nature of Creativity and Aesthetics00:44:31 - Computational Irreducibility and Universality00:53:06 - Turing Completeness, Recursion, and Intelligence01:11:26 - The Universe Through a Computational Lens01:26:45 - Algorithmic Justice and the Need for TransparencyTRANSCRIPT: https://app.rescript.info/public/share/VRe2uQSvKZOm0oIBoDsrNwt46OMCqRnShVnUF3qyoFkFilmed at DISI (Diverse Intelligences Summer Institute)https://disi.org/REFS:The Nature of computation [Chris Moore]https://nature-of-computation.org/ Birds and Frogs [Freeman Dyson]https://www.ams.org/notices/200902/rtx090200212p.pdf Replica Theory [Parisi et al]https://arxiv.org/pdf/1409.2722 Janossy pooling [Fabian Fuchs]https://fabianfuchsml.github.io/equilibriumaggregation/ Cracking the cryptic [YT channel]https://www.youtube.com/c/CrackingTheCrypticSudoko Bench [Sakana]https://sakana.ai/sudoku-bench/Fractured entangled representations “phylogenetic locking in comment” [Kumar/Stanley]https://arxiv.org/pdf/2505.11581 (see our shows on this)The War Against Cliché: [Martin Amis]https://www.amazon.com/War-Against-Cliche-Reviews-1971-2000/dp/0375727167Rule 110 (CA)https://mathworld.wolfram.com/Rule150.htmlUniversality in Elementary Cellular Automata [Matt Cooke]https://wpmedia.wolfram.com/sites/13/2018/02/15-1-1.pdf Small Semi-Weakly Universal Turing Machines [Damien Woods] https://tilde.ini.uzh.ch/users/tneary/public_html/WoodsNeary-FI09.pdf COMPUTING MACHINERY AND INTELLIGENCE [Turing, 1950]https://courses.cs.umbc.edu/471/papers/turing.pdf Comment on Space Time as a causal set [Moore, 88]https://sites.santafe.edu/~moore/comment.pdf Recursion Theory on the Reals and Continuous-time Computation [Moore, 96]
Dr. Michael Timothy Bennett is a computer scientist who's deeply interested in understanding artificial intelligence, consciousness, and what it means to be alive. He's known for his provocative paper "What the F*** is Artificial Intelligence" which challenges conventional thinking about AI and intelligence.**SPONSOR MESSAGES***Prolific: Quality data. From real people. For faster breakthroughs.https://prolific.com/mlst?utm_campaign=98404559-MLST&utm_source=youtube&utm_medium=podcast&utm_content=mb***Michael takes us on a journey through some of the biggest questions in AI and consciousness. He starts by exploring what intelligence actually is - settling on the idea that it's about "adaptation with limited resources" (a definition from researcher Pei Wang that he particularly likes).The discussion ranges from technical AI concepts to philosophical questions about consciousness, with Michael offering fresh perspectives that challenge Silicon Valley's "just scale it up" approach to AI. He argues that true intelligence isn't just about having more parameters or data - it's about being able to adapt efficiently, like biological systems do.TOC:1. Introduction & Paper Overview [00:01:34]2. Definitions of Intelligence [00:02:54]3. Formal Models (AIXI, Active Inference) [00:07:06]4. Causality, Abstraction & Embodiment [00:10:45]5. Computational Dualism & Mortal Computation [00:25:51]6. Modern AI, AGI Progress & Benchmarks [00:31:30]7. Hybrid AI Approaches [00:35:00]8. Consciousness & The Hard Problem [00:39:35]9. The Diverse Intelligences Summer Institute (DISI) [00:53:20]10. Living Systems & Self-Organization [00:54:17]11. Closing Thoughts [01:04:24]Michaels socials:https://michaeltimothybennett.com/https://x.com/MiTiBennettTranscript:https://app.rescript.info/public/share/4jSKbcM77Sf6Zn-Ms4hda7C4krRrMcQt0qwYqiqPTPIReferences:Bennett, M.T. "What the F*** is Artificial Intelligence"https://arxiv.org/abs/2503.23923Bennett, M.T. "Are Biological Systems More Intelligent Than Artificial Intelligence?" https://arxiv.org/abs/2405.02325Bennett, M.T. PhD Thesis "How To Build Conscious Machines"https://osf.io/preprints/thesiscommons/wehmg_v1Legg, S. & Hutter, M. (2007). "Universal Intelligence: A Definition of Machine Intelligence"Wang, P. "Defining Artificial Intelligence" - on non-axiomatic reasoning systems (NARS)Chollet, F. (2019). "On the Measure of Intelligence" - introduces the ARC benchmark and developer-aware generalizationHutter, M. (2005). "Universal Artificial Intelligence: Sequential Decisions Based on Algorithmic Probability"Chalmers, D. "The Hard Problem of Consciousness"Descartes, R. - Cartesian dualism and the pineal gland theory (historical context)Friston, K. - Free Energy Principle and Active Inference frameworkLevin, M. - Work on collective intelligence, cancer as information isolation, and "mind blindness"Hinton, G. (2022). "The Forward-Forward Algorithm" - introduces mortal computation conceptAlexander Ororbia & Friston - Formal treatment of mortal computationSutton, R. "The Bitter Lesson" - on search and learning in AIPearl, J. "The Book of Why" - causal inference and reasoningAlternative AGI ApproachesWang, P. - NARS (Non-Axiomatic Reasoning System)Goertzel, B. - Hyperon system and modular AGI architecturesBenchmarks & EvaluationHendrycks, D. - Humanities Last Exam benchmark (mentioned re: saturation)Filmed at:Diverse Intelligences Summer Institute (DISI) https://disi.org/
Deep dive with Dan Hendrycks, a leading AI safety researcher and co-author of the "Superintelligence Strategy" paper with former Google CEO Eric Schmidt and Scale AI CEO Alexandr Wang.*** SPONSOR MESSAGESGemini CLI is an open-source AI agent that brings the power of Gemini directly into your terminal - https://github.com/google-gemini/gemini-cliProlific: Quality data. From real people. For faster breakthroughs.https://prolific.com/mlst?utm_campaign=98404559-MLST&utm_source=youtube&utm_medium=podcast&utm_content=script-gen***Hendrycks argues that society is making a fundamental mistake in how it views artificial intelligence. We often compare AI to transformative but ultimately manageable technologies like electricity or the internet. He contends a far better and more realistic analogy is nuclear technology. Like nuclear power, AI has the potential for immense good, but it is also a dual-use technology that carries the risk of unprecedented catastrophe.The Problem with an AI "Manhattan Project":A popular idea is for the U.S. to launch a "Manhattan Project" for AI—a secret, all-out government race to build a superintelligence before rivals like China. Hendrycks argues this strategy is deeply flawed and dangerous for several reasons:- It wouldn’t be secret. You cannot hide a massive, heat-generating data center from satellite surveillance.- It would be destabilizing. A public race would alarm rivals, causing them to start their own desperate, corner-cutting projects, dramatically increasing global risk.- It’s vulnerable to sabotage. An AI project can be crippled in many ways, from cyberattacks that poison its training data to physical attacks on its power plants. This is what the paper refers to as a "maiming attack."This vulnerability leads to the paper's central concept: Mutual Assured AI Malfunction (MAIM). This is the AI-era version of the nuclear-era's Mutual Assured Destruction (MAD). In this dynamic, any nation that makes an aggressive, destabilizing bid for a world-dominating AI must expect its rivals to sabotage the project to ensure their own survival. This deterrence, Hendrycks argues, is already the default reality we live in.A Better Strategy: The Three PillarsInstead of a reckless race, the paper proposes a more stable, three-part strategy modeled on Cold War principles:- Deterrence: Acknowledge the reality of MAIM. The goal should not be to "win" the race to superintelligence, but to deter anyone from starting such a race in the first place through the credible threat of sabotage.- Nonproliferation: Just as we work to keep fissile materials for nuclear bombs out of the hands of terrorists and rogue states, we must control the key inputs for catastrophic AI. The most critical input is advanced AI chips (GPUs). Hendrycks makes the powerful claim that building cutting-edge GPUs is now more difficult than enriching uranium, making this strategy viable.- Competitiveness: The race between nations like the U.S. and China should not be about who builds superintelligence first. Instead, it should be about who can best use existing AI to build a stronger economy, a more effective military, and more resilient supply chains (for example, by manufacturing more chips domestically).Dan says the stakes are high if we fail to manage this transition:- Erosion of Control- Intelligence Recursion- Worthless LaborHendrycks maintains that while the risks are existential, the future is not set. TOC:1 Measuring the Beast [00:00:00]2 Defining the Beast [00:11:34]3 The Core Strategy [00:38:20]4 Ideological Battlegrounds [00:53:12]5 Mechanisms of Control [01:34:45]TRANSCRIPT:https://app.rescript.info/public/share/cOKcz4pWRPjh7BTIgybd7PUr_vChUaY6VQW64No8XMs<truncated, see refs and larger description on YT version>
This episode features Shlomi Fuchter and Jack Parker Holder from Google DeepMind, who are unveiling a new AI called Genie 3. The host, Tim Scarfe, describes it as the most mind-blowing technology he has ever seen. We were invited to their offices to conduct the interview (not sponsored).Imagine you could create a video game world just by describing it. That's what Genie 3 does. It's an AI "world model" that learns how the real world works by watching massive amounts of video. Unlike a normal video game engine (like Unreal or the one for Doom) that needs to be programmed manually, Genie generates a realistic, interactive, 3D world from a simple text prompt.**SPONSOR MESSAGES***Prolific: Quality data. From real people. For faster breakthroughs.https://prolific.com/mlst?utm_campaign=98404559-MLST&utm_source=youtube&utm_medium=podcast&utm_content=script-gen***Here’s a breakdown of what makes it so revolutionary:From Text to a Virtual World: You can type "a drone flying by a beautiful lake" or "a ski slope," and Genie 3 creates that world for you in about three seconds. You can then navigate and interact with it in real-time.It's Consistent: The worlds it creates have a reliable memory. If you look away from an object and then look back, it will still be there, just as it was. The guests explain that this consistency isn't explicitly programmed in; it's a surprising, "emergent" capability of the powerful AI model.A Huge Leap Forward: The previous version, Genie 2, was a major step, but it wasn't fast enough for real-time interaction and was much lower resolution. Genie 3 is 720p, interactive, and photorealistic, running smoothly for several minutes at a time.The Killer App - Training Robots: Beyond entertainment, the team sees Genie 3 as a game-changer for training AI. Instead of training a self-driving car or a robot in the real world (which is slow and dangerous), you can create infinite simulations. You can even prompt rare events to happen, like a deer running across the road, to teach an AI how to handle unexpected situations safely.The Future of Entertainment: this could lead to a "YouTube version 2" or a new form of VR, where users can create and explore endless, interconnected worlds together, like the experience machine from philosophy.While the technology is still a research prototype and not yet available to the public, it represents a monumental step towards creating true artificial worlds from the ground up.Jack Parker Holder [Research Scientist at Google DeepMind in the Open-Endedness Team]https://jparkerholder.github.io/Shlomi Fruchter [Research Director, Google DeepMind]https://shlomifruchter.github.io/TOC:[00:00:00] - Introduction: "The Most Mind-Blowing Technology I've Ever Seen"[00:02:30] - The Evolution from Genie 1 to Genie 2[00:04:30] - Enter Genie 3: Photorealistic, Interactive Worlds from Text[00:07:00] - Promptable World Events & Training Self-Driving Cars[00:14:21] - Guest Introductions: Shlomi Fuchter & Jack Parker Holder[00:15:08] - Core Concepts: What is a "World Model"?[00:19:30] - The Challenge of Consistency in a Generated World[00:21:15] - Context: The Neural Network Doom Simulation[00:25:25] - How Do You Measure the Quality of a World Model?[00:28:09] - The Vision: Using Genie to Train Advanced Robots[00:32:21] - Open-Endedness: Human Skill and Prompting Creativity[00:38:15] - The Future: Is This the Next YouTube or VR?[00:42:18] - The Next Step: Multi-Agent Simulations[00:52:51] - Limitations: Thinking, Computation, and the Sim-to-Real Gap[00:58:07] - Conclusion & The Future of Game EnginesREFS:World Models [David Ha, Jürgen Schmidhuber]https://arxiv.org/abs/1803.10122POEThttps://arxiv.org/abs/1901.01753[Akarsh Kumar, Jeff Clune, Joel Lehman, Kenneth O. Stanley]The Fractured Entangled Representation Hypothesishttps://arxiv.org/pdf/2505.11581TRANSCRIPT:https://app.rescript.info/public/share/Zk5tZXk6mb06yYOFh6nSja7Lg6_qZkgkuXQ-kl5AJqM
Prof. David Krakauer, President of the Santa Fe Institute argues that we are fundamentally confusing knowledge with intelligence, especially when it comes to AI.He defines true intelligence as the ability to do more with less—to solve novel problems with limited information. This is contrasted with current AI models, which he describes as doing less with more; they require astounding amounts of data to perform tasks that don't necessarily demonstrate true understanding or adaptation. He humorously calls this "really shit programming".David challenges the popular notion of "emergence" in Large Language Models (LLMs). He explains that the tech community's definition—seeing a sudden jump in a model's ability to perform a task like three-digit math—is superficial. True emergence, from a complex systems perspective, involves a fundamental change in the system's internal organization, allowing for a new, simpler, and more powerful level of description. He gives the example of moving from tracking individual water molecules to using the elegant laws of fluid dynamics. For LLMs to be truly emergent, we'd need to see them develop new, efficient internal representations, not just get better at memorizing patterns as they scale.Drawing on his background in evolutionary theory, David explains that systems like brains, and later, culture, evolved to process information that changes too quickly for genetic evolution to keep up. He calls culture "evolution at light speed" because it allows us to store our accumulated knowledge externally (in books, tools, etc.) and build upon it without corrupting the original.This leads to his concept of "exbodiment," where we outsource our cognitive load to the world through things like maps, abacuses, or even language itself. We create these external tools, internalize the skills they teach us, improve them, and create a feedback loop that enhances our collective intelligence.However, he ends with a warning. While technology has historically complemented our deficient abilities, modern AI presents a new danger. Because we have an evolutionary drive to conserve energy, we will inevitably outsource our thinking to AI if we can. He fears this is already leading to a "diminution and dilution" of human thought and creativity. Just as our muscles atrophy without use, he argues our brains will too, and we risk becoming mentally dependent on these systems.TOC:[00:00:00] Intelligence: Doing more with less[00:02:10] Why brains evolved: The limits of evolution[00:05:18] Culture as evolution at light speed[00:08:11] True meaning of emergence: "More is Different"[00:10:41] Why LLM capabilities are not true emergence[00:15:10] What real emergence would look like in AI[00:19:24] Symmetry breaking: Physics vs. Life[00:23:30] Two types of emergence: Knowledge In vs. Out[00:26:46] Causality, agency, and coarse-graining[00:32:24] "Exbodiment": Outsourcing thought to objects[00:35:05] Collective intelligence & the boundary of the mind[00:39:45] Mortal vs. Immortal forms of computation[00:42:13] The risk of AI: Atrophy of human thoughtDavid KrakauerPresident and William H. Miller Professor of Complex Systemshttps://www.santafe.edu/people/profile/david-krakauerREFS:Large Language Models and Emergence: A Complex Systems PerspectiveDavid C. Krakauer, John W. Krakauer, Melanie Mitchellhttps://arxiv.org/abs/2506.11135Filmed at the Diverse Intelligences Summer Institute:https://disi.org/
Dr. Maxwell Ramstead grills Guillaume Verdon (AKA “Beff Jezos”) who's the founder of Thermodynamic computing startup Extropic.Guillaume shares his unique path – from dreaming about space travel as a kid to becoming a physicist, then working on quantum computing at Google, to developing a radically new form of computing hardware for machine learning. He explains how he hit roadblocks with traditional physics and computing, leading him to start his company – building "thermodynamic computers." These are based on a new design for super-efficient chips that use the natural chaos of electrons (think noise and heat) to power AI tasks, which promises to speed up AND lower the costs of modern probabilistic techniques like sampling. He is driven by the pursuit of building computers that work more like your brain, which (by the way) runs on a banana and a glass of water! Guillaume talks about his alter ego, Beff Jezos, and the "Effective Accelerationism" (e/acc) movement that he initiated. Its objective is to speed up tech progress in order to “grow civilization” (as measured by energy use and innovation), rather than “slowing down out of fear”. Guillaume argues we need to embrace variance, exploration, and optimism to avoid getting stuck or outpaced by competitors like China. He and Maxwell discuss big ideas like merging humans with AI, decentralizing intelligence, and why boundless growth (with smart constraints) is “key to humanity's future”.REFS:1. John Archibald Wheeler - "It From Bit" Concept00:04:45 - Foundational work proposing that physical reality emerges from information at the quantum levelLearn more: https://cqi.inf.usi.ch/qic/wheeler.pdf 2. AdS/CFT Correspondence (Holographic Principle)00:05:15 - Theoretical physics duality connecting quantum gravity in Anti-de Sitter space with conformal field theoryhttps://en.wikipedia.org/wiki/Holographic_principle 3. Renormalization Group Theory00:06:15 - Mathematical framework for analyzing physical systems across different length scales https://www.damtp.cam.ac.uk/user/dbs26/AQFT/Wilsonchap.pdf 4. Maxwell's Demon and Information Theory00:21:15 - Thought experiment linking information processing to thermodynamics and entropyhttps://plato.stanford.edu/entries/information-entropy/ 5. Landauer's Principle00:29:45 - Fundamental limit establishing minimum energy required for information erasure https://en.wikipedia.org/wiki/Landauer%27s_principle 6. Free Energy Principle and Active Inference01:03:00 - Mathematical framework for understanding self-organizing systems and perception-action loopshttps://www.nature.com/articles/nrn2787 7. Max Tegmark - Information Bottleneck Principle01:07:00 - Connections between information theory and renormalization in machine learninghttps://arxiv.org/abs/1907.07331 8. Fisher's Fundamental Theorem of Natural Selection01:11:45 - Mathematical relationship between genetic variance and evolutionary fitnesshttps://en.wikipedia.org/wiki/Fisher%27s_fundamental_theorem_of_natural_selection 9. Tensor Networks in Quantum Systems00:06:45 - Computational framework for simulating many-body quantum systems https://arxiv.org/abs/1912.10049 10. Quantum Neural Networks00:09:30 - Hybrid quantum-classical models for machine learning applicationshttps://en.wikipedia.org/wiki/Quantum_neural_network 11. Energy-Based Models (EBMs)00:40:00 - Probabilistic framework for unsupervised learning based on energy functionshttps://www.researchgate.net/publication/200744586_A_tutorial_on_energy-based_learning 12. Markov Chain Monte Carlo (MCMC)00:20:00 - Sampling algorithm fundamental to modern AI and statistical physics https://en.wikipedia.org/wiki/Markov_chain_Monte_Carlo 13. Metropolis-Hastings Algorithm00:23:00 - Core sampling method for probability distributionshttps://arxiv.org/abs/1504.01896 ***SPONSOR MESSAGE***Google Gemini 2.5 Flash is a state-of-the-art language model in the Gemini app. Sign up at https://gemini.google.com
Are the AI models you use today imposters?Please watch the intro video we did before this: https://www.youtube.com/watch?v=o1q6Hhz0MAgIn this episode, hosts Dr. Tim Scarfe and Dr. Duggar are joined by AI researcher Prof. Kenneth Stanley and MIT PhD student Akash Kumar to discuss their fascinating paper, "Questioning Representational Optimism in Deep Learning."Imagine you ask two people to draw a perfect skull. One is a brilliant artist who understands anatomy, the other is a machine that just traces the image. Both drawings look identical, but the artist understands what a skull is—they know where the mouth is, how the jaw works, and that it's symmetrical. The machine just has a tangled mess of lines that happens to form the right picture.An AI with an elegant representation, has the building blocks to generate truly new ideas.The Path Is the Goal: As Kenneth Stanley puts it, "it matters not just where you get, but how you got there". Two students can ace a math test, but the one who truly understands the concepts—instead of just memorizing formulas—is the one who will go on to make new discoveries.The show is a mixture of 3 separate recordings we have done, the original Patreon warmup with Tim/Kenneth, the Tim/Keith "Steakhouse" recorded after the main interview, then the main interview with Kenneth/Akarsh/Keith/Tim. Feel free to skip around. We had to edit this in a rush as we are travelling next week but it's reasonably cleaned up. TOC:00:00:00 Intro: Garbage vs. Amazing Representations00:05:42 How Good Representations Form00:11:14 Challenging the "Bitter Lesson"00:18:04 AI Creativity & Representation Types00:22:13 Steakhouse: Critiques & Alternatives00:28:30 Steakhouse: Key Concepts & Goldilocks Zone00:39:42 Steakhouse: A Sober View on AI Risk00:43:46 Steakhouse: The Paradox of Open-Ended Search00:47:58 Main Interview: Paper Intro & Core Concepts00:56:44 Main Interview: Deception and Evolvability01:36:30 Main Interview: Reinterpreting Evolution01:56:16 Main Interview: Impostor Intelligence02:11:15 Main Interview: Recommendations for AI ResearchREFS:Questioning Representational Optimism in Deep Learning:The Fractured Entangled Representation HypothesisAkarsh Kumar, Jeff Clune, Joel Lehman, Kenneth O. Stanleyhttps://arxiv.org/pdf/2505.11581Kenneth O. Stanley, Joel LehmanWhy Greatness Cannot Be Planned: The Myth of the Objectivehttps://amzn.to/44xLaXKOriginal show with Kenneth from 4 years ago:https://www.youtube.com/watch?v=lhYGXYeMq_EKenneth Stanley is SVP Open Endedness at Lila Scienceshttps://x.com/kenneth0stanleyAkarsh Kumar (MIT)https://akarshkumar.com/AND... Kenneth is HIRING (this is an OPPORTUNITY OF A LIFETIME!)Research Engineer: https://job-boards.greenhouse.io/lila/jobs/7890007002Research Scientist: https://job-boards.greenhouse.io/lila/jobs/8012245002TRANSCRIPT:https://app.rescript.info/public/share/W_T7E1OC2Wj49ccqlIOOztg2MJWaaVbovTeyxcFEQdU
What if today's incredible AI is just a brilliant "impostor"? This episode features host Dr. Tim Scarfe in conversation with guests Prof. Kenneth Stanley (ex-OpenAI), Dr. Keith Duggar (MIT), and Arkash Kumar (MIT).While AI today produces amazing results on the surface, its internal understanding is a complete mess, described as "total spaghetti" [00:00:49]. This is because it's trained with a brute-force method (SGD) that’s like building a sandcastle: it looks right from a distance, but has no real structure holding it together [00:01:45].To explain the difference, Keith Duggar shares a great analogy about his high school physics classes [00:03:18]. One class was about memorizing lots of formulas for specific situations (like the "impostor" AI). The other used calculus to derive the answers from a deeper understanding, which was much easier and more powerful. This is the core difference: one method memorizes, the other truly understands.The episode then introduces a different, more powerful way to build AI, based on Kenneth Stanley's old experiment, "Picbreeder" [00:04:45]. This method creates AI with a shockingly clean and intuitive internal model of the world. For example, it might develop a model of a skull where it understands the "mouth" as a separate component it can open and close, without ever being explicitly trained on that action [00:06:15]. This deep understanding emerges bottom-up, without massive datasets.The secret is to abandon a fixed goal and embrace "deception" [00:08:42]—the idea that the stepping stones to a great discovery often don't look anything like the final result. Instead of optimizing for a target, the AI is built through an open-ended process of exploring what's "interesting" [00:09:15]. This creates a more flexible and adaptable foundation, a bit like how evolvability wins out in nature [00:10:30].The show concludes by arguing that this choice matters immensely. The "impostor" path may be hitting a wall, requiring insane amounts of money and energy for progress and failing to deliver true creativity or continual learning [00:13:00]. The ultimate message is a call to not put all our eggs in one basket [00:14:25]. We should explore these open-ended, creative paths to discover a more genuine form of intelligence, which may be found where we least expect it.REFS:Questioning Representational Optimism in Deep Learning:The Fractured Entangled Representation HypothesisAkarsh Kumar, Jeff Clune, Joel Lehman, Kenneth O. Stanleyhttps://arxiv.org/pdf/2505.11581Kenneth O. Stanley, Joel LehmanWhy Greatness Cannot Be Planned: The Myth of the Objectivehttps://amzn.to/44xLaXKOriginal show with Kenneth from 4 years ago:https://www.youtube.com/watch?v=lhYGXYeMq_EKenneth Stanley is SVP Open Endedness at Lila Scienceshttps://x.com/kenneth0stanleyAkarsh Kumar (MIT)https://akarshkumar.com/AND... Kenneth is HIRING (this is an OPPORTUNITY OF A LIFETIME!)Research Engineer: https://job-boards.greenhouse.io/lila/jobs/7890007002Research Scientist: https://job-boards.greenhouse.io/lila/jobs/8012245002Tim's Code visualisation of FER based on Akarsh repo: https://github.com/ecsplendid/ferTRANSCRIPT: https://app.rescript.info/public/share/YKAZzZ6lwZkjTLRpVJreOOxGhLI8y4m3fAyU8NSavx0
What if the most powerful technology in human history is being built by people who openly admit they don't trust each other? In this explosive 2-hour debate, three AI experts pull back the curtain on the shocking psychology driving the race to Artificial General Intelligence—and why the people building it might be the biggest threat of all. Kokotajlo predicts AGI by 2028 based on compute scaling trends. Marcus argues we haven't solved basic cognitive problems from his 2001 research. The stakes? If Kokotajlo is right and Marcus is wrong about safety progress, humanity may have already lost control.Sponsor messages:========Google Gemini: Google Gemini features Veo3, a state-of-the-art AI video generation model in the Gemini app. Sign up at https://gemini.google.comTufa AI Labs are hiring for ML Engineers and a Chief Scientist in Zurich/SF. They are top of the ARCv2 leaderboard! https://tufalabs.ai/========Guest PowerhouseGary Marcus - Cognitive scientist, author of "Taming Silicon Valley," and AI's most prominent skeptic who's been warning about the same fundamental problems for 25 years (https://garymarcus.substack.com/)Daniel Kokotajlo - Former OpenAI insider turned whistleblower who reveals the disturbing rationalizations of AI lab leaders in his viral "AI 2027" scenario (https://ai-2027.com/)Dan Hendrycks - Director of the Center for AI Safety who created the benchmarks used to measure AI progress and argues we have only years, not decades, to prevent catastrophe (https://danhendrycks.com/)Transcript: http://app.rescript.info/public/share/tEcx4UkToi-2jwS1cN51CW70A4Eh6QulBRxDILoXOnoTOC:Introduction: The AI Arms Race00:00:04 - The Danger of Automated AI R&D00:00:43 - The Rationalization: "If we don't, someone else will"00:01:56 - Sponsor Reads (Tufa AI Labs & Google Gemini)00:02:55 - Guest IntroductionsThe Philosophical Stakes00:04:13 - What is the Positive Vision for AGI?00:07:00 - The Abundance Scenario: Superintelligent Economy00:09:06 - Differentiating AGI and Superintelligence (ASI)00:11:41 - Sam Altman: "A Decade in a Month"00:14:47 - Economic Inequality & The UBI ProblemPolicy and Red Lines00:17:13 - The Pause Letter: Stopping vs. Delaying AI00:20:03 - Defining Three Concrete Red Lines for AI Development00:25:24 - Racing Towards Red Lines & The Myth of "Durable Advantage"00:31:15 - Transparency and Public Perception00:35:16 - The Rationalization Cascade: Why AI Labs Race to "Win"Forecasting AGI: Timelines and Methodologies00:42:29 - The Case for Short Timelines (Median 2028)00:47:00 - Scaling Limits: Compute, Data, and Money00:49:36 - Forecasting Models: Bio-Anchors and Agentic Coding00:53:15 - The 10^45 FLOP Thought ExperimentThe Great Debate: Cognitive Gaps vs. Scaling00:58:41 - Gary Marcus's Counterpoint: The Unsolved Problems of Cognition01:00:46 - Current AI Can't Play Chess Reliably01:08:23 - Can Tools and Neurosymbolic AI Fill the Gaps?01:16:13 - The Multi-Dimensional Nature of Intelligence01:24:26 - The Benchmark Debate: Data Contamination and Reliability01:31:15 - The Superhuman Coder Milestone Debate01:37:45 - The Driverless Car AnalogyThe Alignment Problem01:39:45 - Has Any Progress Been Made on Alignment?01:42:43 - "Fairly Reasonably Scares the Sh*t Out of Me"01:46:30 - Distinguishing Model vs. Process AlignmentScenarios and Conclusions01:49:26 - Gary's Alternative Scenario: The Neurosymbolic Shift01:53:35 - Will AI Become Jeff Dean?01:58:41 - Takeoff Speeds and Exceeding Human Intelligence02:03:19 - Final Disagreements and Closing RemarksREFS:Gary Marcus (2001) - The Algebraic Mind https://mitpress.mit.edu/9780262632683/the-algebraic-mind/ 00:59:00Gary Marcus & Ernest Davis (2019) - Rebooting AI https://www.penguinrandomhouse.com/books/566677/rebooting-ai-by-gary-marcus-and-ernest-davis/ 01:31:59Gary Marcus (2024) - Taming SV https://www.hachettebookgroup.com/titles/gary-marcus/taming-silicon-valley/9781541704091/ 00:03:01
We interview Professor Christopher Summerfield from Oxford University about his new book "These Strange New Minds: How AI Learned to Talk and What It". AI learned to understand the world just by reading text - something scientists thought was impossible. You don't need to see a cat to know what one is; you can learn everything from words alone. This is "the most astonishing scientific discovery of the 21st century."People are split: some refuse to call what AI does "thinking" even when it outperforms humans, while others believe if it acts intelligent, it is intelligent. Summerfield takes the middle ground - AI does something genuinely like human reasoning, but that doesn't make it human.Sponsor messages:========Google Gemini: Google Gemini features Veo3, a state-of-the-art AI video generation model in the Gemini app. Sign up at https://gemini.google.comTufa AI Labs are hiring for ML Engineers and a Chief Scientist in Zurich/SF. They are top of the ARCv2 leaderboard! https://tufalabs.ai/========Prof. Christopher Summerfieldhttps://www.psy.ox.ac.uk/people/christopher-summerfieldThese Strange New Minds: How AI Learned to Talk and What It Meanshttps://amzn.to/4e26BVaTable of Contents:Introduction & Setup00:00:00 Superman 3 Metaphor - Humans Absorbed by Machines00:02:01 Book Introduction & AI Debate Context00:03:45 Sponsor Segments (Google Gemini, Tufa Labs)Philosophical Foundations00:04:48 The Fractured AI Discourse00:08:21 Ancient Roots: Aristotle vs Plato (Empiricism vs Rationalism)00:10:14 Historical AI: Symbolic Logic and Its LimitsThe Language Revolution00:12:11 ChatGPT as the Rubicon Moment00:14:00 The Astonishing Discovery: Learning Reality from Words Alone00:15:47 Equivalentists vs Exceptionalists DebateCognitive Science Perspectives00:19:12 Functionalism and the Duck Test00:21:48 Brain-AI Similarities and Computational Principles00:24:53 Reconciling Chomsky: Evolution vs Learning00:28:15 Lamarckian AI vs Darwinian Human LearningThe Reality of AI Capabilities00:30:29 Anthropomorphism and the Clever Hans Effect00:32:56 The Intentional Stance and Nature of Thinking00:37:56 Three Major AI Worries: Agency, Personalization, DynamicsSocietal Risks and Complex Systems00:37:56 AI Agents and Flash Crash Scenarios00:42:50 Removing Frictions: The Lawfare Example00:46:15 Gradual Disempowerment Theory00:49:18 The Faustian Pact of TechnologyHuman Agency and Control00:51:18 The Crisis of Authenticity00:56:22 Psychology of Control vs Reward01:00:21 Dopamine Hacking and Variable ReinforcementFuture Directions01:02:27 Evolution as Goal-less Optimization01:03:31 Open-Endedness and Creative Evolution01:06:46 Writing, Creativity, and AI-Generated Content01:08:18 Closing RemarksREFS:Academic References (Abbreviated)Essential Books"These Strange New Minds" - C. Summerfield [00:02:01] - Main discussion topic"The Mind is Flat" - N. Chater [00:33:45] - Summerfield's favorite on cognitive illusions"AI: A Guide for Thinking Humans" - M. Mitchell [00:04:58] - Host's previous favorite"Principia Mathematica" - Russell & Whitehead [00:11:00] - Logic Theorist reference"Syntactic Structures" - N. Chomsky (1957) [00:13:30] - Generative grammar foundation"Why Greatness Cannot Be Planned" - Stanley & Lehman [01:04:00] - Open-ended evolutionKey Papers & Studies"Gradual Disempowerment" - D. Duvenaud [00:46:45] - AI threat model"Counterfeit People" - D. Dennett (Atlantic) [00:52:45] - AI societal risks"Open-Endedness is Essential..." - DeepMind/Rocktäschel/Hughes [01:03:42]Heider & Simmel (1944) [00:30:45] - Agency attribution to shapesWhitehall Studies - M. Marmot [00:59:32] - Control and health outcomes"Clever Hans" - O. Pfungst (1911) [00:31:47] - Animal intelligence illusionHistorical References<trunc, see https://youtu.be/35r0iSajXjA>
"Blurring Reality" - Chai's Social AI Platform - sponsoredThis episode of MLST explores the groundbreaking work of Chai, a social AI platform that quietly built one of the world's largest AI companion ecosystems before ChatGPT's mainstream adoption. With over 10 million active users and just 13 engineers serving 2 trillion tokens per day, Chai discovered the massive appetite for AI companionship through serendipity while searching for product-market fit.CHAI sponsored this show *because they want to hire amazing engineers* -- CAREER OPPORTUNITIES AT CHAIChai is actively hiring in Palo Alto with competitive compensation ($300K-$800K+ equity) for roles including AI Infrastructure Engineers, Software Engineers, Applied AI Researchers, and more. Fast-track qualification available for candidates with significant product launches, open source contributions, or entrepreneurial success.https://www.chai-research.com/jobs/The conversation with founder William Beauchamp and engineers Tom Lu and Nischay Dhankhar covers Chai's innovative technical approaches including reinforcement learning from human feedback (RLHF), model blending techniques that combine smaller models to outperform larger ones, and their unique infrastructure challenges running exaflop-class compute.SPONSOR MESSAGES:***Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on o-series style reasoning and AGI. They are hiring a Chief Engineer and ML engineers in Zurich and SF. Goto https://tufalabs.ai/***Key themes explored include:- The ethics of AI engagement optimization and attention hacking- Content moderation at scale with a lean engineering team- The shift from AI as utility tool to AI as social companion- How users form deep emotional bonds with artificial intelligence- The broader implications of AI becoming a social mediumWe also examine OpenAI's recent pivot toward companion AI with April's new GPT-4o, suggesting a fundamental shift in how we interact with artificial intelligence - from utility-focused tools to companion-like experiences that blur the lines between human and artificial intimacy.The episode also covers Chai's unconventional approach to hiring only top-tier engineers, their bootstrap funding strategy focused on user revenue over VC funding, and their rapid experimentation culture where one in five experiments succeed.TOC:00:00:00 - Introduction: Steve Jobs' AI Vision & Chai's Scale00:04:02 - Chapter 1: Simulators - The Birth of Social AI00:13:34 - Chapter 2: Engineering at Chai - RLHF & Model Blending00:21:49 - Chapter 3: Social Impact of GenAI - Ethics & Safety00:33:55 - Chapter 4: The Lean Machine - 13 Engineers, Millions of Users00:42:38 - Chapter 5: GPT-4o Becoming a Companion - OpenAI's Pivot00:50:10 - Chapter 6: What Comes Next - The Future of AI Intimacy TRANSCRIPT: https://www.dropbox.com/scl/fi/yz2ewkzmwz9rbbturfbap/CHAI.pdf?rlkey=uuyk2nfhjzezucwdgntg5ubqb&dl=0
Today GoogleDeepMind released AlphaEvolve: a Gemini coding agent for algorithm discovery. It beat the famous Strassen algorithm for matrix multiplication set 56 years ago. Google has been killing it recently. We had early access to the paper and interviewed the researchers behind the work.AlphaEvolve: A Gemini-powered coding agent for designing advanced algorithmshttps://deepmind.google/discover/blog/alphaevolve-a-gemini-powered-coding-agent-for-designing-advanced-algorithms/Authors: Alexander Novikov*, Ngân Vũ*, Marvin Eisenberger*, Emilien Dupont*, Po-Sen Huang*, Adam Zsolt Wagner*, Sergey Shirobokov*, Borislav Kozlovskii*, Francisco J. R. Ruiz, Abbas Mehrabian, M. Pawan Kumar, Abigail See, Swarat Chaudhuri, George Holland, Alex Davies, Sebastian Nowozin, Pushmeet Kohli, Matej Balog*(* indicates equal contribution or special designation, if defined elsewhere)SPONSOR MESSAGES:***Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on o-series style reasoning and AGI. They are hiring a Chief Engineer and ML engineers. Events in Zurich. Goto https://tufalabs.ai/***AlphaEvolve works like a very smart, tireless programmer. It uses powerful AI language models (like Gemini) to generate ideas for computer code. Then, it uses an "evolutionary" process – like survival of the fittest for programs. It tries out many different program ideas, automatically tests how well they solve a problem, and then uses the best ones to inspire new, even better programs.Beyond this mathematical breakthrough, AlphaEvolve has already been used to improve real-world systems at Google, such as making their massive data centers run more efficiently and even speeding up the training of the AI models that power AlphaEvolve itself. The discussion also covers how humans work with AlphaEvolve, the challenges of making AI discover things, and the exciting future of AI helping scientists make new discoveries.In short, AlphaEvolve is a powerful new AI tool that can invent new algorithms and solve complex problems, showing how AI can be a creative partner in science and engineering.Guests:Matej Balog: https://x.com/matejbalogAlexander Novikov: https://x.com/SashaVNovikovREFS:MAP Elites [Jean-Baptiste Mouret, Jeff Clune]https://arxiv.org/abs/1504.04909FunSearch [Bernardino Romera-Paredes, Mohammadamin Barekatain, Alexander Novikov, Matej Balog, M. Pawan Kumar, Emilien Dupont, Francisco J. R. Ruiz, Jordan S. Ellenberg, Pengming Wang, Omar Fawzi, Pushmeet Kohli & Alhussein Fawzi]https://www.nature.com/articles/s41586-023-06924-6TOC:[00:00:00] Introduction: Alpha Evolve's Breakthroughs, DeepMind's Lineage, and Real-World Impact[00:12:06] Introducing AlphaEvolve: Concept, Evolutionary Algorithms, and Architecture[00:16:56] Search Challenges: The Halting Problem and Enabling Creative Leaps[00:23:20] Knowledge Augmentation: Self-Generated Data, Meta-Prompting, and Library Learning[00:29:08] Matrix Multiplication Breakthrough: From Strassen to AlphaEvolve's 48 Multiplications[00:39:11] Problem Representation: Direct Solutions, Constructors, and Search Algorithms[00:46:06] Developer Reflections: Surprising Outcomes and Superiority over Simple LLM Sampling[00:51:42] Algorithmic Improvement: Hill Climbing, Program Synthesis, and Intelligibility[01:00:24] Real-World Application: Complex Evaluations and Robotics[01:05:39] Role of LLMs & Future: Advanced Models, Recursive Self-Improvement, and Human-AI Collaboration[01:11:22] Resource Considerations: Compute Costs of AlphaEvolveThis is a trial of posting videos on Spotify, thoughts? Email me or chat in our Discord
Randall Balestriero joins the show to discuss some counterintuitive findings in AI. He shares research showing that huge language models, even when started from scratch (randomly initialized) without massive pre-training, can learn specific tasks like sentiment analysis surprisingly well, train stably, and avoid severe overfitting, sometimes matching the performance of costly pre-trained models. This raises questions about when giant pre-training efforts are truly worth it.He also talks about how self-supervised learning (where models learn from data structure itself) and traditional supervised learning (using labeled data) are fundamentally similar, allowing researchers to apply decades of supervised learning theory to improve newer self-supervised methods.Finally, Randall touches on fairness in AI models used for Earth data (like climate prediction), revealing that these models can be biased, performing poorly in specific locations like islands or coastlines even if they seem accurate overall, which has important implications for policy decisions based on this data.SPONSOR MESSAGES:***Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on o-series style reasoning and AGI. They are hiring a Chief Engineer and ML engineers. Events in Zurich. Goto https://tufalabs.ai/***TRANSCRIPT + SHOWNOTES:https://www.dropbox.com/scl/fi/n7yev71nsjso71jyjz1fy/RANDALLNEURIPS.pdf?rlkey=0dn4injp1sc4ts8njwf3wfmxv&dl=0TOC:1. Model Training Efficiency and Scale [00:00:00] 1.1 Training Stability of Large Models on Small Datasets [00:04:09] 1.2 Pre-training vs Random Initialization Performance Comparison [00:07:58] 1.3 Task-Specific Models vs General LLMs Efficiency2. Learning Paradigms and Data Distribution [00:10:35] 2.1 Fair Language Model Paradox and Token Frequency Issues [00:12:02] 2.2 Pre-training vs Single-task Learning Spectrum [00:16:04] 2.3 Theoretical Equivalence of Supervised and Self-supervised Learning [00:19:40] 2.4 Self-Supervised Learning and Supervised Learning Relationships [00:21:25] 2.5 SSL Objectives and Heavy-tailed Data Distribution Challenges3. Geographic Representation in ML Systems [00:25:20] 3.1 Geographic Bias in Earth Data Models and Neural Representations [00:28:10] 3.2 Mathematical Limitations and Model Improvements [00:30:24] 3.3 Data Quality and Geographic Bias in ML DatasetsREFS:[00:01:40] Research on training large language models from scratch on small datasets, Randall Balestriero et al.https://openreview.net/forum?id=wYGBWOjq1Q[00:10:35] The Fair Language Model Paradox (2024), Andrea Pinto, Tomer Galanti, Randall Balestrierohttps://arxiv.org/abs/2410.11985[00:12:20] Muppet: Massive Multi-task Representations with Pre-Finetuning (2021), Armen Aghajanyan et al.https://arxiv.org/abs/2101.11038[00:14:30] Dissociating language and thought in large language models (2023), Kyle Mahowald et al.https://arxiv.org/abs/2301.06627[00:16:05] The Birth of Self-Supervised Learning: A Supervised Theory, Randall Balestriero et al.https://openreview.net/forum?id=NhYAjAAdQT[00:21:25] VICReg: Variance-Invariance-Covariance Regularization for Self-Supervised Learning, Adrien Bardes, Jean Ponce, Yann LeCunhttps://arxiv.org/abs/2105.04906[00:25:20] No Location Left Behind: Measuring and Improving the Fairness of Implicit Representations for Earth Data (2025), Daniel Cai, Randall Balestriero, et al.https://arxiv.org/abs/2502.06831[00:33:45] Mark Ibrahim et al.'s work on geographic bias in computer vision datasets, Mark Ibrahimhttps://arxiv.org/pdf/2304.12210
Prof. Kevin Ellis and Dr. Zenna Tavares talk about making AI smarter, like humans. They want AI to learn from just a little bit of information by actively trying things out, not just by looking at tons of data.They discuss two main ways AI can "think": one way is like following specific rules or steps (like a computer program), and the other is more intuitive, like guessing based on patterns (like modern AI often does). They found combining both methods works well for solving complex puzzles like ARC.A key idea is "compositionality" - building big ideas from small ones, like LEGOs. This is powerful but can also be overwhelming. Another important idea is "abstraction" - understanding things simply, without getting lost in details, and knowing there are different levels of understanding.Ultimately, they believe the best AI will need to explore, experiment, and build models of the world, much like humans do when learning something new.SPONSOR MESSAGES:***Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on o-series style reasoning and AGI. They are hiring a Chief Engineer and ML engineers. Events in Zurich. Goto https://tufalabs.ai/***TRANSCRIPT:https://www.dropbox.com/scl/fi/3ngggvhb3tnemw879er5y/BASIS.pdf?rlkey=lr2zbj3317mex1q5l0c2rsk0h&dl=0 Zenna Tavares:http://www.zenna.org/Kevin Ellis:https://www.cs.cornell.edu/~ellisk/TOC:1. Compositionality and Learning Foundations [00:00:00] 1.1 Compositional Search and Learning Challenges [00:03:55] 1.2 Bayesian Learning and World Models [00:12:05] 1.3 Programming Languages and Compositionality Trade-offs [00:15:35] 1.4 Inductive vs Transductive Approaches in AI Systems2. Neural-Symbolic Program Synthesis [00:27:20] 2.1 Integration of LLMs with Traditional Programming and Meta-Programming [00:30:43] 2.2 Wake-Sleep Learning and DreamCoder Architecture [00:38:26] 2.3 Program Synthesis from Interactions and Hidden State Inference [00:41:36] 2.4 Abstraction Mechanisms and Resource Rationality [00:48:38] 2.5 Inductive Biases and Causal Abstraction in AI Systems3. Abstract Reasoning Systems [00:52:10] 3.1 Abstract Concepts and Grid-Based Transformations in ARC [00:56:08] 3.2 Induction vs Transduction Approaches in Abstract Reasoning [00:59:12] 3.3 ARC Limitations and Interactive Learning Extensions [01:06:30] 3.4 Wake-Sleep Program Learning and Hybrid Approaches [01:11:37] 3.5 Project MARA and Future Research DirectionsREFS:[00:00:25] DreamCoder, Kevin Ellis et al.https://arxiv.org/abs/2006.08381[00:01:10] Mind Your Step, Ryan Liu et al.https://arxiv.org/abs/2410.21333[00:06:05] Bayesian inference, Griffiths, T. L., Kemp, C., & Tenenbaum, J. B.https://psycnet.apa.org/record/2008-06911-003[00:13:00] Induction and Transduction, Wen-Ding Li, Zenna Tavares, Yewen Pu, Kevin Ellishttps://arxiv.org/abs/2411.02272[00:23:15] Neurosymbolic AI, Garcez, Artur d'Avila et al.https://arxiv.org/abs/2012.05876[00:33:50] Induction and Transduction (II), Wen-Ding Li, Kevin Ellis et al.https://arxiv.org/abs/2411.02272[00:38:35] ARC, François Chollethttps://arxiv.org/abs/1911.01547[00:39:20] Causal Reactive Programs, Ria Das, Joshua B. Tenenbaum, Armando Solar-Lezama, Zenna Tavareshttp://www.zenna.org/publications/autumn2022.pdf[00:42:50] MuZero, Julian Schrittwieser et al.http://arxiv.org/pdf/1911.08265[00:43:20] VisualPredicator, Yichao Lianghttps://arxiv.org/abs/2410.23156[00:48:55] Bayesian models of cognition, Joshua B. Tenenbaumhttps://mitpress.mit.edu/9780262049412/bayesian-models-of-cognition/[00:49:30] The Bitter Lesson, Rich Suttonhttp://www.incompleteideas.net/IncIdeas/BitterLesson.html[01:06:35] Program induction, Kevin Ellis, Wen-Ding Lihttps://arxiv.org/pdf/2411.02272[01:06:50] DreamCoder (II), Kevin Ellis et al.https://arxiv.org/abs/2006.08381[01:11:55] Project MARA, Zenna Tavares, Kevin Ellishttps://www.basis.ai/blog/mara/
Eiso Kant, CTO of poolside AI, discusses the company's approach to building frontier AI foundation models, particularly focused on software development. Their unique strategy is reinforcement learning from code execution feedback which is an important axis for scaling AI capabilities beyond just increasing model size or data volume. Kant predicts human-level AI in knowledge work could be achieved within 18-36 months, outlining poolside's vision to dramatically increase software development productivity and accessibility. SPONSOR MESSAGES:***Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on o-series style reasoning and AGI. They are hiring a Chief Engineer and ML engineers. Events in Zurich. Goto https://tufalabs.ai/***Eiso Kant:https://x.com/eisokanthttps://poolside.ai/TRANSCRIPT:https://www.dropbox.com/scl/fi/szepl6taqziyqie9wgmk9/poolside.pdf?rlkey=iqar7dcwshyrpeoz0xa76k422&dl=0TOC:1. Foundation Models and AI Strategy [00:00:00] 1.1 Foundation Models and Timeline Predictions for AI Development [00:02:55] 1.2 Poolside AI's Corporate History and Strategic Vision [00:06:48] 1.3 Foundation Models vs Enterprise Customization Trade-offs2. Reinforcement Learning and Model Economics [00:15:42] 2.1 Reinforcement Learning and Code Execution Feedback Approaches [00:22:06] 2.2 Model Economics and Experimental Optimization3. Enterprise AI Implementation [00:25:20] 3.1 Poolside's Enterprise Deployment Strategy and Infrastructure [00:26:00] 3.2 Enterprise-First Business Model and Market Focus [00:27:05] 3.3 Foundation Models and AGI Development Approach [00:29:24] 3.4 DeepSeek Case Study and Infrastructure Requirements4. LLM Architecture and Performance [00:30:15] 4.1 Distributed Training and Hardware Architecture Optimization [00:33:01] 4.2 Model Scaling Strategies and Chinchilla Optimality Trade-offs [00:36:04] 4.3 Emergent Reasoning and Model Architecture Comparisons [00:43:26] 4.4 Balancing Creativity and Determinism in AI Models [00:50:01] 4.5 AI-Assisted Software Development Evolution5. AI Systems Engineering and Scalability [00:58:31] 5.1 Enterprise AI Productivity and Implementation Challenges [00:58:40] 5.2 Low-Code Solutions and Enterprise Hiring Trends [01:01:25] 5.3 Distributed Systems and Engineering Complexity [01:01:50] 5.4 GenAI Architecture and Scalability Patterns [01:01:55] 5.5 Scaling Limitations and Architectural Patterns in AI Code Generation6. AI Safety and Future Capabilities [01:06:23] 6.1 Semantic Understanding and Language Model Reasoning Approaches [01:12:42] 6.2 Model Interpretability and Safety Considerations in AI Systems [01:16:27] 6.3 AI vs Human Capabilities in Software Development [01:33:45] 6.4 Enterprise Deployment and Security ArchitectureCORE REFS (see shownotes for URLs/more refs):[00:15:45] Research demonstrating how training on model-generated content leads to distribution collapse in AI models, Ilia Shumailov et al. (Key finding on synthetic data risk)[00:20:05] Foundational paper introducing Word2Vec for computing word vector representations, Tomas Mikolov et al. (Seminal NLP technique)[00:22:15] OpenAI O3 model's breakthrough performance on ARC Prize Challenge, OpenAI (Significant AI reasoning benchmark achievement)[00:22:40] Seminal paper proposing a formal definition of intelligence as skill-acquisition efficiency, François Chollet (Influential AI definition/philosophy)[00:30:30] Technical documentation of DeepSeek's V3 model architecture and capabilities, DeepSeek AI (Details on a major new model)[00:34:30] Foundational paper establishing optimal scaling laws for LLM training, Jordan Hoffmann et al. (Key paper on LLM scaling)[00:45:45] Seminal essay arguing that scaling computation consistently trumps human-engineered solutions in AI, Richard S. Sutton (Influential "Bitter Lesson" perspective)<trunc - see PDF shownotes>
Connor Leahy and Gabriel Alfour, AI researchers from Conjecture and authors of "The Compendium," joinus for a critical discussion centered on Artificial Superintelligence (ASI) safety and governance. Drawing from their comprehensive analysis in "The Compendium," they articulate a stark warning about the existential risks inherent in uncontrolled AI development, framing it through the lens of "intelligence domination"—where a sufficiently advanced AI could subordinate humanity, much like humans dominate less intelligent species.SPONSOR MESSAGES:***Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on o-series style reasoning and AGI. They are hiring a Chief Engineer and ML engineers. Events in Zurich. Goto https://tufalabs.ai/***TRANSCRIPT + REFS + NOTES:https://www.dropbox.com/scl/fi/p86l75y4o2ii40df5t7no/Compendium.pdf?rlkey=tukczgf3flw133sr9rgss0pnj&dl=0https://www.thecompendium.ai/https://en.wikipedia.org/wiki/Connor_Leahyhttps://www.conjecture.dev/abouthttps://substack.com/@gabeccTOC:1. AI Intelligence and Safety Fundamentals [00:00:00] 1.1 Understanding Intelligence and AI Capabilities [00:06:20] 1.2 Emergence of Intelligence and Regulatory Challenges [00:10:18] 1.3 Human vs Animal Intelligence Debate [00:18:00] 1.4 AI Regulation and Risk Assessment Approaches [00:26:14] 1.5 Competing AI Development Ideologies2. Economic and Social Impact [00:29:10] 2.1 Labor Market Disruption and Post-Scarcity Scenarios [00:32:40] 2.2 Institutional Frameworks and Tech Power Dynamics [00:37:40] 2.3 Ethical Frameworks and AI Governance Debates [00:40:52] 2.4 AI Alignment Evolution and Technical Challenges3. Technical Governance Framework [00:55:07] 3.1 Three Levels of AI Safety: Alignment, Corrigibility, and Boundedness [00:55:30] 3.2 Challenges of AI System Corrigibility and Constitutional Models [00:57:35] 3.3 Limitations of Current Boundedness Approaches [00:59:11] 3.4 Abstract Governance Concepts and Policy Solutions4. Democratic Implementation and Coordination [00:59:20] 4.1 Governance Design and Measurement Challenges [01:00:10] 4.2 Democratic Institutions and Experimental Governance [01:14:10] 4.3 Political Engagement and AI Safety Advocacy [01:25:30] 4.4 Practical AI Safety Measures and International CoordinationCORE REFS:[00:01:45] The Compendium (2023), Leahy et al.https://pdf.thecompendium.ai/the_compendium.pdf[00:06:50] Geoffrey Hinton Leaves Google, BBC Newshttps://www.bbc.com/news/world-us-canada-65452940[00:10:00] ARC-AGI, Chollethttps://arcprize.org/arc-agi[00:13:25] A Brief History of Intelligence, Bennetthttps://www.amazon.com/Brief-History-Intelligence-Humans-Breakthroughs/dp/0063286343[00:25:35] Statement on AI Risk, Center for AI Safetyhttps://www.safe.ai/work/statement-on-ai-risk[00:26:15] Machines of Love and Grace, Amodeihttps://darioamodei.com/machines-of-loving-grace[00:26:35] The Techno-Optimist Manifesto, Andreessenhttps://a16z.com/the-techno-optimist-manifesto/[00:31:55] Techno-Feudalism, Varoufakishttps://www.amazon.co.uk/Technofeudalism-Killed-Capitalism-Yanis-Varoufakis/dp/1847927270[00:42:40] Introducing Superalignment, OpenAIhttps://openai.com/index/introducing-superalignment/[00:47:20] Three Laws of Robotics, Asimovhttps://www.britannica.com/topic/Three-Laws-of-Robotics[00:50:00] Symbolic AI (GOFAI), Haugelandhttps://en.wikipedia.org/wiki/Symbolic_artificial_intelligence[00:52:30] Intent Alignment, Christianohttps://www.alignmentforum.org/posts/HEZgGBZTpT4Bov7nH/mapping-the-conceptual-territory-in-ai-existential-safety[00:55:10] Large Language Model Alignment: A Survey, Jiang et al.http://arxiv.org/pdf/2309.15025[00:55:40] Constitutional Checks and Balances, Bokhttps://plato.stanford.edu/entries/montesquieu/<trunc, see PDF>
We are joined by Francois Chollet and Mike Knoop, to launch the new version of the ARC prize! In version 2, the challenges have been calibrated with humans such that at least 2 humans could solve each task in a reasonable task, but also adversarially selected so that frontier reasoning models can't solve them. The best LLMs today get negligible performance on this challenge. https://arcprize.org/SPONSOR MESSAGES:***Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on o-series style reasoning and AGI. They are hiring a Chief Engineer and ML engineers. Events in Zurich. Goto https://tufalabs.ai/***TRANSCRIPT:https://www.dropbox.com/scl/fi/0v9o8xcpppdwnkntj59oi/ARCv2.pdf?rlkey=luqb6f141976vra6zdtptv5uj&dl=0TOC:1. ARC v2 Core Design & Objectives [00:00:00] 1.1 ARC v2 Launch and Benchmark Architecture [00:03:16] 1.2 Test-Time Optimization and AGI Assessment [00:06:24] 1.3 Human-AI Capability Analysis [00:13:02] 1.4 OpenAI o3 Initial Performance Results2. ARC Technical Evolution [00:17:20] 2.1 ARC-v1 to ARC-v2 Design Improvements [00:21:12] 2.2 Human Validation Methodology [00:26:05] 2.3 Task Design and Gaming Prevention [00:29:11] 2.4 Intelligence Measurement Framework3. O3 Performance & Future Challenges [00:38:50] 3.1 O3 Comprehensive Performance Analysis [00:43:40] 3.2 System Limitations and Failure Modes [00:49:30] 3.3 Program Synthesis Applications [00:53:00] 3.4 Future Development RoadmapREFS:[00:00:15] On the Measure of Intelligence, François Chollethttps://arxiv.org/abs/1911.01547[00:06:45] ARC Prize Foundation, François Chollet, Mike Knoophttps://arcprize.org/[00:12:50] OpenAI o3 model performance on ARC v1, ARC Prize Teamhttps://arcprize.org/blog/oai-o3-pub-breakthrough[00:18:30] Chain-of-Thought Prompting Elicits Reasoning in Large Language Models, Jason Wei et al.https://arxiv.org/abs/2201.11903[00:21:45] ARC-v2 benchmark tasks, Mike Knoophttps://arcprize.org/blog/introducing-arc-agi-public-leaderboard[00:26:05] ARC Prize 2024: Technical Report, Francois Chollet et al.https://arxiv.org/html/2412.04604v2[00:32:45] ARC Prize 2024 Technical Report, Francois Chollet, Mike Knoop, Gregory Kamradthttps://arxiv.org/abs/2412.04604[00:48:55] The Bitter Lesson, Rich Suttonhttp://www.incompleteideas.net/IncIdeas/BitterLesson.html[00:53:30] Decoding strategies in neural text generation, Sina Zarrießhttps://www.mdpi.com/2078-2489/12/9/355/pdf
Mohamed Osman joins to discuss MindsAI's highest scoring entry to the ARC challenge 2024 and the paradigm of test-time fine-tuning. They explore how the team, now part of Tufa Labs in Zurich, achieved state-of-the-art results using a combination of pre-training techniques, a unique meta-learning strategy, and an ensemble voting mechanism. Mohamed emphasizes the importance of raw data input and flexibility of the network.SPONSOR MESSAGES:***Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on o-series style reasoning and AGI. They are hiring a Chief Engineer and ML engineers. Events in Zurich. Goto https://tufalabs.ai/***TRANSCRIPT + REFS:https://www.dropbox.com/scl/fi/jeavyqidsjzjgjgd7ns7h/MoFInal.pdf?rlkey=cjjmo7rgtenxrr3b46nk6yq2e&dl=0Mohamed Osman (Tufa Labs)https://x.com/MohamedOsmanMLJack Cole (Tufa Labs)https://x.com/MindsAI_JackHow and why deep learning for ARC paper:https://github.com/MohamedOsman1998/deep-learning-for-arc/blob/main/deep_learning_for_arc.pdfTOC:1. Abstract Reasoning Foundations [00:00:00] 1.1 Test-Time Fine-Tuning and ARC Challenge Overview [00:10:20] 1.2 Neural Networks vs Programmatic Approaches to Reasoning [00:13:23] 1.3 Code-Based Learning and Meta-Model Architecture [00:20:26] 1.4 Technical Implementation with Long T5 Model2. ARC Solution Architectures [00:24:10] 2.1 Test-Time Tuning and Voting Methods for ARC Solutions [00:27:54] 2.2 Model Generalization and Function Generation Challenges [00:32:53] 2.3 Input Representation and VLM Limitations [00:36:21] 2.4 Architecture Innovation and Cross-Modal Integration [00:40:05] 2.5 Future of ARC Challenge and Program Synthesis Approaches3. Advanced Systems Integration [00:43:00] 3.1 DreamCoder Evolution and LLM Integration [00:50:07] 3.2 MindsAI Team Progress and Acquisition by Tufa Labs [00:54:15] 3.3 ARC v2 Development and Performance Scaling [00:58:22] 3.4 Intelligence Benchmarks and Transformer Limitations [01:01:50] 3.5 Neural Architecture Optimization and Processing DistributionREFS:[00:01:32] Original ARC challenge paper, François Chollethttps://arxiv.org/abs/1911.01547[00:06:55] DreamCoder, Kevin Ellis et al.https://arxiv.org/abs/2006.08381[00:12:50] Deep Learning with Python, François Chollethttps://www.amazon.com/Deep-Learning-Python-Francois-Chollet/dp/1617294438[00:13:35] Deep Learning with Python, François Chollethttps://www.amazon.com/Deep-Learning-Python-Francois-Chollet/dp/1617294438[00:13:35] Influence of pretraining data for reasoning, Laura Ruishttps://arxiv.org/abs/2411.12580[00:17:50] Latent Program Networks, Clement Bonnethttps://arxiv.org/html/2411.08706v1[00:20:50] T5, Colin Raffel et al.https://arxiv.org/abs/1910.10683[00:30:30] Combining Induction and Transduction for Abstract Reasoning, Wen-Ding Li, Kevin Ellis et al.https://arxiv.org/abs/2411.02272[00:34:15] Six finger problem, Chen et al.https://openaccess.thecvf.com/content/CVPR2024/papers/Chen_SpatialVLM_Endowing_Vision-Language_Models_with_Spatial_Reasoning_Capabilities_CVPR_2024_paper.pdf[00:38:15] DeepSeek-R1-Distill-Llama, DeepSeek AIhttps://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-70B[00:40:10] ARC Prize 2024 Technical Report, François Chollet et al.https://arxiv.org/html/2412.04604v2[00:45:20] LLM-Guided Compositional Program Synthesis, Wen-Ding Li and Kevin Ellishttps://arxiv.org/html/2503.15540[00:54:25] Abstraction and Reasoning Corpus, François Chollethttps://github.com/fchollet/ARC-AGI[00:57:10] O3 breakthrough on ARC-AGI, OpenAIhttps://arcprize.org/[00:59:35] ConceptARC Benchmark, Arseny Moskvichev, Melanie Mitchellhttps://arxiv.org/abs/2305.07141[01:02:05] Mixtape: Breaking the Softmax Bottleneck Efficiently, Yang, Zhilin and Dai, Zihang and Salakhutdinov, Ruslan and Cohen, William W.http://papers.neurips.cc/paper/9723-mixtape-breaking-the-softmax-bottleneck-efficiently.pdf
Iman Mirzadeh from Apple, who recently published the GSM-Symbolic paper discusses the crucial distinction between intelligence and achievement in AI systems. He critiques current AI research methodologies, highlighting the limitations of Large Language Models (LLMs) in reasoning and knowledge representation. SPONSOR MESSAGES:***Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on o-series style reasoning and AGI. They are hiring a Chief Engineer and ML engineers. Events in Zurich. Goto https://tufalabs.ai/***TRANSCRIPT + RESEARCH:https://www.dropbox.com/scl/fi/mlcjl9cd5p1kem4l0vqd3/IMAN.pdf?rlkey=dqfqb74zr81a5gqr8r6c8isg3&dl=0TOC:1. Intelligence vs Achievement in AI Systems [00:00:00] 1.1 Intelligence vs Achievement Metrics in AI Systems [00:03:27] 1.2 AlphaZero and Abstract Understanding in Chess [00:10:10] 1.3 Language Models and Distribution Learning Limitations [00:14:47] 1.4 Research Methodology and Theoretical Frameworks2. Intelligence Measurement and Learning [00:24:24] 2.1 LLM Capabilities: Interpolation vs True Reasoning [00:29:00] 2.2 Intelligence Definition and Measurement Approaches [00:34:35] 2.3 Learning Capabilities and Agency in AI Systems [00:39:26] 2.4 Abstract Reasoning and Symbol Understanding3. LLM Performance and Evaluation [00:47:15] 3.1 Scaling Laws and Fundamental Limitations [00:54:33] 3.2 Connectionism vs Symbolism Debate in Neural Networks [00:58:09] 3.3 GSM-Symbolic: Testing Mathematical Reasoning in LLMs [01:08:38] 3.4 Benchmark Evaluation and Model Performance AssessmentREFS:[00:01:00] AlphaZero chess AI system, Silver et al.https://arxiv.org/abs/1712.01815[00:07:10] Game Changer: AlphaZero's Groundbreaking Chess Strategies, Sadler & Reganhttps://www.amazon.com/Game-Changer-AlphaZeros-Groundbreaking-Strategies/dp/9056918184[00:11:35] Cross-entropy loss in language modeling, Voitahttp://lena-voita.github.io/nlp_course/language_modeling.html[00:17:20] GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in LLMs, Mirzadeh et al.https://arxiv.org/abs/2410.05229[00:21:25] Connectionism and Cognitive Architecture: A Critical Analysis, Fodor & Pylyshynhttps://www.sciencedirect.com/science/article/pii/001002779090014B[00:28:55] Brain-to-body mass ratio scaling laws, Sutskeverhttps://www.theverge.com/2024/12/13/24320811/what-ilya-sutskever-sees-openai-model-data-training[00:29:40] On the Measure of Intelligence, Chollethttps://arxiv.org/abs/1911.01547[00:33:30] On definition of intelligence, Gignac et al.https://www.sciencedirect.com/science/article/pii/S0160289624000266[00:35:30] Defining intelligence, Wanghttps://cis.temple.edu/~wangp/papers.html[00:37:40] How We Learn: Why Brains Learn Better Than Any Machine... for Now, Dehaenehttps://www.amazon.com/How-We-Learn-Brains-Machine/dp/0525559884[00:39:35] Surfaces and Essences: Analogy as the Fuel and Fire of Thinking, Hofstadter and Sanderhttps://www.amazon.com/Surfaces-Essences-Analogy-Fuel-Thinking/dp/0465018475[00:43:15] Chain-of-thought prompting, Wei et al.https://arxiv.org/abs/2201.11903[00:47:20] Test-time scaling laws in machine learning, Brownhttps://podcasts.apple.com/mv/podcast/openais-noam-brown-ilge-akkaya-and-hunter-lightman-on/id1750736528?i=1000671532058[00:47:50] Scaling Laws for Neural Language Models, Kaplan et al.https://arxiv.org/abs/2001.08361[00:55:15] Tensor product variable binding, Smolenskyhttps://www.sciencedirect.com/science/article/abs/pii/000437029090007M[01:08:45] GSM-8K dataset, OpenAIhttps://huggingface.co/datasets/openai/gsm8k
Face recognition technology for surveillance and security applications has truly revolutionized the way we approach safety and monitoring. The seamless integration of this technology allows for quick and accurate identification of individuals, enhancing the overall efficiency of security systems, check for more on https://www.thefreemanonline.org/face-recognition-for-surveillance-and-security-applications/ . The experience of using face recognition for surveillance is incredibly intuitive and user-friendly, making it a valuable tool for both large-scale operations and everyday security measures. The precision and speed at which faces can be recognized is truly impressive, providing a sense of reassurance and peace of mind. Overall, the utilization of face recognition technology in surveillance and security applications has undoubtedly raised the bar in terms of safety and protection.
2:58:00 the moment Eliezer loses