Machine Learning Street Talk (MLST)
![Machine Learning Street Talk (MLST) Machine Learning Street Talk (MLST)](https://is1-ssl.mzstatic.com/image/thumb/Podcasts116/v4/b7/50/40/b750406d-6157-3539-a905-a02d6455011b/mza_794612352763073027.jpg/400x400bb.jpg)
Author: Machine Learning Street Talk (MLST)
© Machine Learning Street Talk (MLST)
Description
Welcome! We engage in fascinating discussions with pre-eminent figures in the AI field. Our flagship show covers current affairs in AI, cognitive science, neuroscience and philosophy of mind with in-depth analysis. Our approach is unrivalled in terms of scope and rigour – we believe in intellectual diversity in AI, and we touch on all of the main ideas in the field with the hype surgically removed. MLST is run by Tim Scarfe, Ph.D (https://www.linkedin.com/in/ecsquizor/) and features regular appearances from MIT Doctor of Philosophy Keith Duggar (https://www.linkedin.com/in/dr-keith-duggar/).
157 Episodes
Sara Hooker is VP of Research at Cohere and leader of Cohere for AI. We discuss her recent paper critiquing the use of compute thresholds, measured in FLOPs (floating point operations), as an AI governance strategy.
We explore why this approach, recently adopted in both US and EU AI policies, may be problematic and oversimplified. Sara explains the limitations of using raw computational power as a measure of AI capability or risk, and discusses the complex relationship between compute, data, and model architecture.
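As a rough illustration of what a compute threshold actually measures, the sketch below estimates training compute with the widely used "6 x parameters x tokens" rule of thumb and compares it with the 10^26 operations figure in the US Executive Order and the 10^25 FLOP figure in the EU AI Act. The rule of thumb, the function names and the two example model sizes are assumptions made for illustration; they are not taken from Sara's paper.

```python
# Thresholds named in the two policies discussed in the episode.
US_EXECUTIVE_ORDER_FLOP = 1e26
EU_AI_ACT_FLOP = 1e25


def estimate_training_flop(n_params: float, n_tokens: float) -> float:
    """Approximate dense-transformer training compute as 6 * N * D.

    This is a common heuristic (an assumption here), not an exact count.
    """
    return 6.0 * n_params * n_tokens


def thresholds_crossed(flop: float) -> dict:
    """Report which policy thresholds an estimated training run exceeds."""
    return {
        "eu_ai_act": flop > EU_AI_ACT_FLOP,
        "us_executive_order": flop > US_EXECUTIVE_ORDER_FLOP,
    }


# Two hypothetical training runs (sizes chosen only for illustration).
small_run = estimate_training_flop(n_params=7e9, n_tokens=2e12)
large_run = estimate_training_flop(n_params=1e12, n_tokens=2e13)

print(thresholds_crossed(small_run))
print(thresholds_crossed(large_run))
```

The single number says nothing about data quality, architecture or post-training, which is the kind of limitation the episode discusses.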
Equally important, we go into Sara's work on "The AI Language Gap." This research highlights the challenges and inequalities in developing AI systems that work across multiple languages. Sara discusses how current AI models, predominantly trained on English and a handful of high-resource languages, fail to serve the linguistic diversity of our global population. We explore the technical, ethical, and societal implications of this gap, and discuss potential solutions for creating more inclusive and representative AI systems.
We broadly discuss the relationship between language, culture, and AI capabilities, as well as the ethical considerations in AI development and deployment.
YT Version: https://youtu.be/dBZp47999Ko
TOC:
[00:00:00] Intro
[00:02:12] FLOPS paper
[00:26:42] Hardware lottery
[00:30:22] The Language gap
[00:33:25] Safety
[00:38:31] Emergent
[00:41:23] Creativity
[00:43:40] Long tail
[00:44:26] LLMs and society
[00:45:36] Model bias
[00:48:51] Language and capabilities
[00:52:27] Ethical frameworks and RLHF
Sara Hooker
https://www.sarahooker.me/
https://www.linkedin.com/in/sararosehooker/
https://scholar.google.com/citations?user=2xy6h3sAAAAJ&hl=en
https://x.com/sarahookr
Interviewer: Tim Scarfe
Refs
The AI Language gap
https://cohere.com/research/papers/the-AI-language-gap.pdf
On the Limitations of Compute Thresholds as a Governance Strategy.
https://arxiv.org/pdf/2407.05694v1
The Multilingual Alignment Prism: Aligning Global and Local Preferences to Reduce Harm
https://arxiv.org/pdf/2406.18682
Cohere Aya
https://cohere.com/research/aya
RLHF Can Speak Many Languages: Unlocking Multilingual Preference Optimization for LLMs
https://arxiv.org/pdf/2407.02552
Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs
https://arxiv.org/pdf/2402.14740
Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence
https://www.whitehouse.gov/briefing-room/presidential-actions/2023/10/30/executive-order-on-the-safe-secure-and-trustworthy-development-and-use-of-artificial-intelligence/
EU AI Act
https://www.europarl.europa.eu/doceo/document/TA-9-2024-0138_EN.pdf
The bitter lesson
http://www.incompleteideas.net/IncIdeas/BitterLesson.html
Neel Nanda interview
https://www.youtube.com/watch?v=_Ygf0GnlwmY
Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet
https://transformer-circuits.pub/2024/scaling-monosemanticity/
Chollet's ARC challenge
https://github.com/fchollet/ARC-AGI
Ryan Greenblatt on ARC
https://www.youtube.com/watch?v=z9j3wB1RRGA
Disclaimer: This is the third video from our Cohere partnership. We were not told what to say in the interview, and didn't edit anything out from the interview.
Murray Shanahan is a professor of Cognitive Robotics at Imperial College London and a senior research scientist at DeepMind. He challenges our assumptions about AI consciousness and urges us to rethink how we talk about machine intelligence.
We explore the dangers of anthropomorphizing AI, the limitations of current language in describing AI capabilities, and the fascinating intersection of philosophy and artificial intelligence.
Show notes and full references: https://docs.google.com/document/d/1ICtBI574W-xGi8Z2ZtUNeKWiOiGZ_DRsp9EnyYAISws/edit?usp=sharing
Prof Murray Shanahan:
https://www.doc.ic.ac.uk/~mpsha/ (look at his selected publications)
https://scholar.google.co.uk/citations?user=00bnGpAAAAAJ&hl=en
https://en.wikipedia.org/wiki/Murray_Shanahan
https://x.com/mpshanahan
Interviewer: Dr. Tim Scarfe
Refs (links in the Google doc linked above):
Role play with large language models
Waluigi effect
"Conscious Exotica" - Paper by Murray Shanahan (2016)
"Simulators" - Article by Janus from LessWrong
"Embodiment and the Inner Life" - Book by Murray Shanahan (2010)
"The Technological Singularity" - Book by Murray Shanahan (2015)
"Simulacra as Conscious Exotica" - Paper by Murray Shanahan (newer paper of the original focussed on LLMs)
A recent paper by Anthropic on using autoencoders to find features in language models (referring to the "Scaling Monosemanticity" paper)
Work by Peter Godfrey-Smith on octopus consciousness
"Metaphors We Live By" - Book by George Lakoff (1980s)
Work by Aaron Sloman on the concept of "space of possible minds" (1984 article mentioned)
Wittgenstein's "Philosophical Investigations" (posthumously published)
Daniel Dennett's work on the "intentional stance"
Alan Turing's original paper on the Turing Test (1950)
Thomas Nagel's paper "What is it like to be a bat?" (1974)
John Searle's Chinese Room Argument (mentioned but not detailed)
Work by Richard Evans on tackling reasoning problems
Claude Shannon's quote on knowledge and control
"Are We Bodies or Souls?" - Book by Richard Swinburne
Reference to work by Ethan Perez and others at Anthropic on potential deceptive behavior in language models
Reference to a paper by Murray Shanahan and Antonia Creswell on the "selection inference framework"
Mention of work by Francois Chollet, particularly the ARC (Abstraction and Reasoning Corpus) challenge
Reference to Elizabeth Spelke's work on core knowledge in infants
Mention of Karl Friston's work on planning as inference (active inference)
The film "Ex Machina" - Murray Shanahan was the scientific advisor
Anthropic's constitutional AI approach
Loom system by Laria Reynolds and Kyle McDonell for visualizing conversation trees
DeepMind's AlphaGo (mentioned multiple times as an example)
Mention of the "Golden Gate Claude" experiment
Reference to an interview Tim Scarfe conducted with University of Toronto students about self-attention controllability theorem
Mention of an interview with Irina Rish
Reference to an interview Tim Scarfe conducted with Daniel Dennett
Reference to an interview with Maria Santacaterina
Mention of an interview with Philip Goff
Nick Chater and Morten Christiansen's book ("The Language Game: How Improvisation Created Language and Changed the World")
Peter Singer's work from 1975 on ascribing moral status to conscious beings
Demis Hassabis' discussion on the "ladder of creativity"
Reference to B.F. Skinner and behaviorism
In the coming decades, the technology that enables virtual and augmented reality will improve beyond recognition. Within a century, world-renowned philosopher David J. Chalmers predicts, we will have virtual worlds that are impossible to distinguish from non-virtual worlds. But is virtual reality just escapism?
In a highly original work of 'technophilosophy', Chalmers argues categorically, no: virtual reality is genuine reality. Virtual worlds are not second-class worlds. We can live a meaningful life in virtual reality - and increasingly, we will.
What is reality, anyway? How can we lead a good life? Is there a god? How do we know there's an external world - and how do we know we're not living in a computer simulation? In Reality+, Chalmers conducts a grand tour of philosophy, using cutting-edge technology to provide invigorating new answers to age-old questions.
David J. Chalmers is an Australian philosopher and cognitive scientist specializing in the areas of philosophy of mind and philosophy of language. He is Professor of Philosophy and Neural Science at New York University, as well as co-director of NYU's Center for Mind, Brain, and Consciousness. Chalmers is best known for his work on consciousness, including his formulation of the "hard problem of consciousness."
Reality+: Virtual Worlds and the Problems of Philosophy
https://amzn.to/3RYyGD2
https://consc.net/
https://x.com/davidchalmers42
00:00:00 Reality+ Intro
00:12:02 GPT conscious? 10/10
00:14:19 The consciousness processor thought experiment (11/10)
00:20:34 Intelligence and Consciousness entangled? 10/10
00:22:44 Karl Friston / Meta Problem 10/10
00:29:05 Knowledge argument / subjective experience (6/10)
00:32:34 Emergence 11/10 (best chapter)
00:42:45 Working with Douglas Hofstadter 10/10
00:46:14 Intelligence is analogy making? 10/10
00:50:47 Intelligence explosion 8/10
00:58:44 Hypercomputation 10/10
01:09:44 Who designed the designer? (7/10)
01:13:57 Experience machine (7/10)
Ryan Greenblatt from Redwood Research recently published "Getting 50% (SoTA) on ARC-AGI with GPT-4o," where he used GPT-4o to reach state-of-the-art accuracy on Francois Chollet's ARC Challenge by generating many Python programs.
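The "generate many programs, keep the ones that fit" idea can be sketched in a few lines. This is a minimal sketch under stated assumptions: the three hand-written grid transforms below are hypothetical stand-ins for programs that, in Ryan's approach, are sampled from GPT-4o, and the toy task is invented for illustration.

```python
from typing import Callable, List, Tuple

Grid = List[List[int]]
Program = Callable[[Grid], Grid]


# Hypothetical candidate programs (stand-ins for model-generated code).
def flip_horizontal(grid: Grid) -> Grid:
    return [row[::-1] for row in grid]


def flip_vertical(grid: Grid) -> Grid:
    return grid[::-1]


def transpose(grid: Grid) -> Grid:
    return [list(col) for col in zip(*grid)]


def select_consistent(candidates: List[Program],
                      train_pairs: List[Tuple[Grid, Grid]]) -> List[Program]:
    """Keep only candidates that reproduce every training output exactly."""
    return [p for p in candidates
            if all(p(x) == y for x, y in train_pairs)]


# Toy task: each output is the input mirrored left-to-right.
train_pairs = [
    ([[1, 0], [2, 0]], [[0, 1], [0, 2]]),
    ([[3, 4, 5]], [[5, 4, 3]]),
]

survivors = select_consistent([flip_vertical, transpose, flip_horizontal],
                              train_pairs)
prediction = survivors[0]([[7, 8], [9, 0]])
print([p.__name__ for p in survivors], prediction)
```

With a real model the candidate pool would contain many sampled programs per task, and the filter against the training pairs is what turns noisy samples into a usable answer.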
Sponsor:
Sign up to Kalshi here https://kalshi.onelink.me/1r91/mlst -- the first 500 traders who deposit $100 will get a free $20 credit! Important disclaimer - In case it's not obvious - this is basically gambling and a *high risk* activity - only trade what you can afford to lose.
We discuss:
- Ryan's unique approach to solving the ARC Challenge and achieving impressive results.
- The strengths and weaknesses of current AI models.
- How AI and humans differ in learning and reasoning.
- Combining various techniques to create smarter AI systems.
- The potential risks and future advancements in AI, including the idea of agentic AI.
https://x.com/RyanPGreenblatt
https://www.redwoodresearch.org/
Refs:
Getting 50% (SoTA) on ARC-AGI with GPT-4o [Ryan Greenblatt]
https://redwoodresearch.substack.com/p/getting-50-sota-on-arc-agi-with-gpt
On the Measure of Intelligence [Chollet]
https://arxiv.org/abs/1911.01547
Connectionism and Cognitive Architecture: A Critical Analysis [Jerry A. Fodor and Zenon W. Pylyshyn]
https://ruccs.rutgers.edu/images/personal-zenon-pylyshyn/proseminars/Proseminar13/ConnectionistArchitecture.pdf
Software 2.0 [Andrej Karpathy]
https://karpathy.medium.com/software-2-0-a64152b37c35
Why Greatness Cannot Be Planned: The Myth of the Objective [Kenneth Stanley]
https://amzn.to/3Wfy2E0
Biographical account of Terence Tao’s mathematical development. [M.A.(KEN) CLEMENTS]
https://gwern.net/doc/iq/high/smpy/1984-clements.pdf
Model Evaluation and Threat Research (METR)
https://metr.org/
Why Tool AIs Want to Be Agent AIs
https://gwern.net/tool-ai
Simulators - Janus
https://www.lesswrong.com/posts/vJFdjigzmcXMhNTsx/simulators
AI Control: Improving Safety Despite Intentional Subversion
https://www.lesswrong.com/posts/d9FJHawgkiMSPjagR/ai-control-improving-safety-despite-intentional-subversion
https://arxiv.org/abs/2312.06942
What a Compute-Centric Framework Says About Takeoff Speeds
https://www.openphilanthropy.org/research/what-a-compute-centric-framework-says-about-takeoff-speeds/
Global GDP over the long run
https://ourworldindata.org/grapher/global-gdp-over-the-long-run?yScale=log
Safety Cases: How to Justify the Safety of Advanced AI Systems
https://arxiv.org/abs/2403.10462
The Danger of a “Safety Case"
http://sunnyday.mit.edu/The-Danger-of-a-Safety-Case.pdf
The Future Of Work Looks Like A UPS Truck (~02:15:50)
https://www.npr.org/sections/money/2014/05/02/308640135/episode-536-the-future-of-work-looks-like-a-ups-truck
SWE-bench
https://www.swebench.com/
Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model
https://arxiv.org/pdf/2201.11990
Algorithmic Progress in Language Models
https://epochai.org/blog/algorithmic-progress-in-language-models
Aidan Gomez, CEO of Cohere, reveals how they're tackling AI hallucinations and improving reasoning abilities. He also explains why Cohere doesn't use any output from GPT-4 for training their models.
Aidan shares his personal insights into the world of AI and LLMs and Cohere's unique approach to solving real-world business problems, and how their models are set apart from the competition. Aidan reveals how they are making major strides in AI technology, discussing everything from last mile customer engineering to the robustness of prompts and future architectures.
He also touches on the broader implications of AI for society, including potential risks and the role of regulation. He discusses Cohere's guiding principles and the health of the startup scene. With a particular focus on enterprise applications, Aidan provides a rare look into the internal workings of Cohere and their vision for driving productivity and innovation.
https://cohere.com/
https://x.com/aidangomez
Check out Cohere's amazing new Command R* models here
https://cohere.com/command
Disclaimer: This is the second video from our Cohere partnership. We were not told what to say in the interview, and didn't edit anything out from the interview.
The ARC Challenge, created by Francois Chollet, tests how well AI systems can generalize from a few examples in a grid-based intelligence test. We interview the current winners of the ARC Challenge, Jack Cole and Mohamed Osman, and their collaborator Michael Hodel. They discuss how they tackled ARC (Abstraction and Reasoning Corpus) using language models. We also discuss the new "50%" public set approach announced today from Redwood Research (Ryan Greenblatt).
Jack and Mohamed explain their winning approach, which involves fine-tuning a language model on a large, specifically-generated dataset and then doing additional fine-tuning at test-time, a technique known in this context as "active inference". They use various strategies to represent the data for the language model and believe that with further improvements, the accuracy could reach above 50%. Michael talks about his work on generating new ARC-like tasks to help train the models.
They also debate whether their methods stay true to the "spirit" of Chollet's measure of intelligence. Despite some concerns, they agree that their solutions are promising and adaptable for other similar problems.
Note:
Jack's team is still the current official winner at 33% on the private set. Ryan's entry is not on the private leaderboard or eligible.
Chollet invented ARC in 2019 (not 2017 as stated)
"Ryan's entry is not a new state of the art. We don't know exactly how well it does since it was only evaluated on 100 tasks from the evaluation set and does 50% on those, reportedly. Meanwhile Jacks team i.e. MindsAI's solution does 54% on the entire eval set and it is seemingly possible to do 60-70% with an ensemble"
Jack Cole:
https://x.com/Jcole75Cole
https://lab42.global/community-interview-jack-cole/
Mohamed Osman:
Mohamed is looking to do a PhD in AI/ML, can you help him?
Email: mothman198@outlook.com
https://www.linkedin.com/in/mohamedosman1905/
Michael Hodel:
https://arxiv.org/pdf/2404.07353v1
https://www.linkedin.com/in/michael-hodel/
https://x.com/bayesilicon
https://github.com/michaelhodel
Getting 50% (SoTA) on ARC-AGI with GPT-4o - Ryan Greenblatt
https://redwoodresearch.substack.com/p/getting-50-sota-on-arc-agi-with-gpt
Neural networks for abstraction and reasoning: Towards broad generalization in machines [Mikel Bober-Irizar, Soumya Banerjee]
https://arxiv.org/pdf/2402.03507
Measure of intelligence:
https://arxiv.org/abs/1911.01547
YT version: https://youtu.be/jSAT_RuJ_Cg
Nick Frosst, co-founder of Cohere, on the future of LLMs, and AGI. Learn how Cohere is solving real problems for business with their new AI models.
This is the first podcast from our new Cohere partnership!
Nick talks about his journey at Google Brain, working with AI legends like Geoff Hinton, and the amazing things his company, Cohere, is doing. From creating the most useful language models for businesses to making tools for developers, Nick shares a lot of interesting insights. He even talks about his band, Good Kid! Nick said that RAG is one of the best features of Cohere's new Command R* models. We are about to release a deep-dive on RAG with Patrick Lewis from Cohere, keep an eye out for that - he explains why their models are specifically optimised for RAG use cases.
Learn more about Cohere Command R* models here:
https://cohere.com/command
https://github.com/cohere-ai/cohere-toolkit
Nick's band Good Kid:
https://goodkidofficial.com/
Nick on Twitter:
https://x.com/nickfrosst
Disclaimer: We are in a partnership with Cohere to release content for them. We were not told what to say in the interview, and didn't edit anything out from the interview. We are currently planning to release 2 shows per month under the partnership about their AI platform, research and strategy.
These two scientists have mapped out the insides or “reachable space” of a language model using control theory. What they discovered was extremely surprising.
Please support us on Patreon to get access to the private Discord server, bi-weekly calls, early access and ad-free listening.
https://patreon.com/mlst
YT version: https://youtu.be/Bpgloy1dDn0
Aman Bhargava from Caltech and Cameron Witkowski from the University of Toronto join us to discuss their groundbreaking paper, “What’s the Magic Word? A Control Theory of LLM Prompting” (the main theorem on self-attention controllability was developed in collaboration with Dr. Shi-Zhuo Looi from Caltech).
They frame LLM systems as discrete stochastic dynamical systems. This means they look at LLMs in a structured way, similar to how we analyze control systems in engineering. They explore the “reachable set” of outputs for an LLM. Essentially, this is the range of possible outputs the model can generate from a given starting point when influenced by different prompts. The research highlights that prompt engineering, or optimizing the input tokens, can significantly influence LLM outputs. They show that even short prompts can drastically alter the likelihood of specific outputs. Aman and Cameron’s work might be a boon for understanding and improving LLMs. They suggest that a deeper exploration of control theory concepts could lead to more reliable and capable language models.
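To make the “reachable set” idea concrete, here is a minimal sketch that brute-forces it for a toy system. Everything in it is an assumption for illustration: `toy_next_token` is a hypothetical deterministic stand-in for greedy LLM decoding over a three-token vocabulary, and the control prompt is prepended to a fixed state sequence, following the description above rather than reproducing the paper's experiments.

```python
from itertools import product
from typing import Sequence, Set

VOCAB = (0, 1, 2)  # toy vocabulary (assumption)


def toy_next_token(tokens: Sequence[int]) -> int:
    """Hypothetical stand-in for an LLM's greedy next-token choice."""
    return sum((i + 1) * t for i, t in enumerate(tokens)) % len(VOCAB)


def reachable_set(state: Sequence[int], max_prompt_len: int) -> Set[int]:
    """All next tokens obtainable by prepending a prompt of bounded length."""
    outputs = set()
    for k in range(max_prompt_len + 1):
        for prompt in product(VOCAB, repeat=k):
            outputs.add(toy_next_token(prompt + tuple(state)))
    return outputs


state = (1, 2)
print(reachable_set(state, max_prompt_len=0))  # no control prompt
print(reachable_set(state, max_prompt_len=1))  # one control token allowed
```

In this toy system a single control token is already enough to reach every output, which mirrors the qualitative claim that short prompts can change what the model produces. For a real LLM the search space is far too large to enumerate, so prompts have to be found by optimization instead.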
We dropped an additional, more technical video on the research on our Twitter account here: https://x.com/MLStreetTalk/status/1795093759471890606
Additional 20 minutes of unreleased footage on our Patreon here: https://www.patreon.com/posts/whats-magic-word-104922629
What's the Magic Word? A Control Theory of LLM Prompting (Aman Bhargava, Cameron Witkowski, Manav Shah, Matt Thomson)
https://arxiv.org/abs/2310.04444
LLM Control Theory Seminar (April 2024)
https://www.youtube.com/watch?v=9QtS9sVBFM0
Society for the pursuit of AGI (Cameron founded it)
https://agisociety.mydurable.com/
Roger Federer demo
http://conway.languagegame.io/inference
Neural Cellular Automata, Active Inference, and the Mystery of Biological Computation (Aman)
https://aman-bhargava.com/ai/neuro/neuromorphic/2024/03/25/nca-do-active-inference.html
Aman and Cameron also want to thank Dr. Shi-Zhuo Looi and Prof. Matt Thomson from Caltech for help and advice on their research. (https://thomsonlab.caltech.edu/ and https://pma.caltech.edu/people/looi-shi-zhuo)
https://x.com/ABhargava2000
https://x.com/witkowski_cam
Maria Santacaterina, with her background in the humanities, brings a critical perspective on the current state and future implications of AI technology, its impact on society, and the nature of human intelligence and creativity. She emphasizes that despite technological advancements, AI lacks fundamental human traits such as consciousness, empathy, intuition, and the ability to engage in genuine creative processes. Maria argues that AI, at its core, processes data but does not have the capability to understand or generate new, intrinsic meaning or ideas as humans do.
Throughout the conversation, Maria highlights her concern about the overreliance on AI in critical sectors such as healthcare, the justice system, and business. She stresses that while AI can serve as a tool, it should not replace human judgment and decision-making. Maria points out that AI systems often operate on past data, which may lead to outdated or incorrect decisions if not carefully managed.
The discussion also touches upon the concept of "adaptive resilience", which Maria describes in her book. She explains adaptive resilience as the capacity for individuals and enterprises to evolve and thrive amidst challenges by leveraging technology responsibly, without undermining human values and capabilities.
A significant portion of the conversation focussed on ethical considerations surrounding AI. Tim and Maria agree that there's a pressing need for strong governance and ethical frameworks to guide AI development and deployment. They discuss how AI, without proper ethical considerations, risks exacerbating issues like privacy invasion, misinformation, and unintended discrimination.
Maria is skeptical about claims of achieving Artificial General Intelligence (AGI) or a technological singularity where machines surpass human intelligence in all aspects. She argues that such scenarios neglect the complex, dynamic nature of human intelligence and consciousness, which cannot be fully replicated or replaced by machines.
Tim and Maria discuss the importance of keeping human agency and creativity at the forefront of technology development. Maria asserts that efforts to automate or standardize complex human actions and decisions are misguided and could lead to dehumanizing outcomes. They both advocate for using AI as an aid to enhance human capabilities rather than a substitute.
In closing, Maria encourages a balanced approach to AI adoption, urging stakeholders to prioritize human well-being, ethical standards, and societal benefit above mere technological advancement. The conversation ends with Maria pointing people to her book for more in-depth analysis and thoughts on the future interaction between humans and technology.
Buy Maria's book here: https://amzn.to/4avF6kq
https://www.linkedin.com/in/mariasantacaterina
TOC
00:00:00 - Intro to Book
00:03:23 - What Life Is
00:10:10 - Agency
00:18:04 - Tech and Society
00:21:51 - System 1 and 2
00:22:59 - We Are Being Pigeonholed
00:30:22 - Agency vs Autonomy
00:36:37 - Explanations
00:40:24 - AI Reductionism
00:49:50 - How Are Humans Intelligent
01:00:22 - Semantics
01:01:53 - Emotive AI and Pavlovian Dogs
01:04:05 - Technology, Social Media and Organisation
01:18:34 - Systems Are Not That Automated
01:19:33 - Hiring
01:22:34 - Subjectivity in Orgs
01:32:28 - The AGI Delusion
01:45:37 - GPT-laziness Syndrome
01:54:58 - Diversity Preservation
01:58:24 - Ethics
02:11:43 - Moral Realism
02:16:17 - Utopia
02:18:02 - Reciprocity
02:20:52 - Tyranny of Categorisation
Thomas Parr and his collaborators wrote a book titled "Active Inference: The Free Energy Principle in Mind, Brain and Behavior" which introduces Active Inference from both a high-level conceptual perspective and a low-level mechanistic, mathematical perspective.
Active inference, developed by the legendary neuroscientist Prof. Karl Friston, is a unifying mathematical framework that frames living systems as agents that minimize surprise and free energy in order to resist entropy and persist over time. It unifies various perspectives from physics, biology, statistics, and psychology, and allows us to explore deep questions about agency, biology, causality, modelling, and consciousness.
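The link between "free energy" and "surprise" can be checked numerically on a two-state example. This is a minimal sketch assuming the standard variational free energy F(q) = sum over s of q(s) * ln(q(s) / p(o, s)) for a discrete hidden state s and a fixed observation o; the joint probabilities are made-up numbers and the function name is hypothetical, not taken from the book.

```python
import math
from typing import Sequence


def free_energy(q: Sequence[float], joint: Sequence[float]) -> float:
    """Variational free energy of belief q given joint p(o, s) for fixed o."""
    return sum(qs * math.log(qs / ps) for qs, ps in zip(q, joint) if qs > 0)


# Hypothetical joint probabilities p(o, s) for the observed o and s in {0, 1}.
joint = [0.3, 0.1]

evidence = sum(joint)                      # p(o)
surprise = -math.log(evidence)             # -ln p(o)
posterior = [p / evidence for p in joint]  # exact p(s | o)

print(surprise)
print(free_energy(posterior, joint))   # matches surprise at the exact posterior
print(free_energy([0.5, 0.5], joint))  # any other belief gives a larger value
```

Free energy is an upper bound on surprise that becomes tight when the belief equals the true posterior, so an agent that lowers free energy is also keeping its surprise in check.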
Buy Active Inference: The Free Energy Principle in Mind, Brain, and Behavior
https://amzn.to/4dj0iMj
YT version: https://youtu.be/lbb-Si5wa_o
Please support us on Patreon to get access to the private Discord server, bi-weekly calls, early access and ad-free listening.
https://patreon.com/mlst
Chapters should be embedded in the mp3, let me know if there are any issues
Connor is the CEO of Conjecture and one of the most famous names in the AI alignment movement. This is the "behind the scenes footage" and bonus Patreon interviews from the day of the Beff Jezos debate, including an interview with Daniel Clothiaux. It's a great insight into Connor's philosophy. At the end there is an unreleased additional interview with Beff.
Support MLST:
Please support us on Patreon. We are entirely funded from Patreon donations right now. Patreon supporters get private Discord access, biweekly calls, very early-access + exclusive content and lots more.
https://patreon.com/mlst
Donate: https://www.paypal.com/donate/?hosted_button_id=K2TYRVPBGXVNA
If you would like to sponsor us, so we can tell your story - reach out on mlstreettalk at gmail
Topics:
Externalized cognition and the role of society and culture in human intelligence
The potential for AI systems to develop agency and autonomy
The future of AGI as a complex mixture of various components
The concept of agency and its relationship to power
The importance of coherence in AI systems
The balance between coherence and variance in exploring potential upsides
The role of dynamic, competent, and incorruptible institutions in handling risks and developing technology
Concerns about AI widening the gap between the haves and have-nots
The concept of equal access to opportunity and maintaining dynamism in the system
Leahy's perspective on life as a process that "rides entropy"
The importance of distinguishing between epistemological, decision-theoretic, and aesthetic aspects of morality (inc ref to Hume's Guillotine)
The concept of continuous agency and the idea that the first AGI will be a messy admixture of various components
The potential for AI systems to become more physically embedded in the future
The challenges of aligning AI systems and the societal impacts of AI technologies like ChatGPT and Bing
The importance of humility in the face of complexity when considering the future of AI and its societal implications
Disclaimer: this video is not an endorsement of e/acc or AGI agential existential risk from us - the hosts of MLST consider both of these views to be quite extreme. We seek diverse views on the channel.
00:00:00 Intro
00:00:56 Connor's Philosophy
00:03:53 Office Skit
00:05:08 Connor on e/acc and Beff
00:07:28 Intro to Daniel's Philosophy
00:08:35 Connor on Entropy, Life, and Morality
00:19:10 Connor on London
00:20:21 Connor Office Interview
00:20:46 Friston Patreon Preview
00:21:48 Why Are We So Dumb?
00:23:52 The Voice of the People, the Voice of God / Populism
00:26:35 Mimetics
00:30:03 Governance
00:33:19 Agency
00:40:25 Daniel Interview - Externalised Cognition, Bing GPT, AGI
00:56:29 Beff + Connor Bonus Patreons Interview
Professor Chris Bishop is a Technical Fellow and Director at Microsoft Research AI4Science, in Cambridge. He is also Honorary Professor of Computer Science at the University of Edinburgh, and a Fellow of Darwin College, Cambridge. In 2004, he was elected Fellow of the Royal Academy of Engineering, in 2007 he was elected Fellow of the Royal Society of Edinburgh, and in 2017 he was elected Fellow of the Royal Society. Chris was a founding member of the UK AI Council, and in 2019 he was appointed to the Prime Minister’s Council for Science and Technology.
At Microsoft Research, Chris oversees a global portfolio of industrial research and development, with a strong focus on machine learning and the natural sciences.
Chris obtained a BA in Physics from Oxford, and a PhD in Theoretical Physics from the University of Edinburgh, with a thesis on quantum field theory.
Chris's contributions to the field of machine learning have been truly remarkable. He has authored (what is arguably) the original textbook in the field - 'Pattern Recognition and Machine Learning' (PRML) which has served as an essential reference for countless students and researchers around the world, and that was his second textbook after his highly acclaimed first textbook Neural Networks for Pattern Recognition.
Recently, Chris has co-authored a new book with his son, Hugh, titled 'Deep Learning: Foundations and Concepts.' This book aims to provide a comprehensive understanding of the key ideas and techniques underpinning the rapidly evolving field of deep learning. It covers both the foundational concepts and the latest advances, making it an invaluable resource for newcomers and experienced practitioners alike.
Buy Chris' textbook here:
https://amzn.to/3vvLcCh
More about Prof. Chris Bishop:
https://en.wikipedia.org/wiki/Christopher_Bishop
https://www.microsoft.com/en-us/research/people/cmbishop/
Support MLST:
Please support us on Patreon. We are entirely funded from Patreon donations right now. Patreon supporters get private Discord access, biweekly calls, early-access + exclusive content and lots more.
https://patreon.com/mlst
Donate: https://www.paypal.com/donate/?hosted_button_id=K2TYRVPBGXVNA
If you would like to sponsor us, so we can tell your story - reach out on mlstreettalk at gmail
TOC:
00:00:00 - Intro to Chris
00:06:54 - Changing Landscape of AI
00:08:16 - Symbolism
00:09:32 - PRML
00:11:02 - Bayesian Approach
00:14:49 - Are NNs One Model or Many, Special vs General
00:20:04 - Can Language Models Be Creative
00:22:35 - Sparks of AGI
00:25:52 - Creativity Gap in LLMs
00:35:40 - New Deep Learning Book
00:39:01 - Favourite Chapters
00:44:11 - Probability Theory
00:45:42 - AI4Science
00:48:31 - Inductive Priors
00:58:52 - Drug Discovery
01:05:19 - Foundational Bias Models
01:07:46 - How Fundamental Is Our Physics Knowledge?
01:12:05 - Transformers
01:12:59 - Why Does Deep Learning Work?
01:16:59 - Inscrutability of NNs
01:18:01 - Example of Simulator
01:21:09 - Control
Dr. Philip Ball is a freelance science writer. He just wrote a book called "How Life Works", discussing how the science of biology has advanced in the last 20 years. We focus on the concept of agency in particular.
He trained as a chemist at the University of Oxford, and as a physicist at the University of Bristol. He worked previously at Nature for over 20 years, first as an editor for physical sciences and then as a consultant editor. His writings on science for the popular press have covered topical issues ranging from cosmology to the future of molecular biology.
YT: https://www.youtube.com/watch?v=n6nxUiqiz9I
Transcript link on YT description
Philip is the author of many popular books on science, including H2O: A Biography of Water, Bright Earth: The Invention of Colour, The Music Instinct and Curiosity: How Science Became Interested in Everything. His book Critical Mass won the 2005 Aventis Prize for Science Books, while Serving the Reich was shortlisted for the Royal Society Winton Science Book Prize in 2014.
This is one of Tim's personal favourite MLST shows, so we have designated it a special edition. Enjoy!
Buy Philip's book "How Life Works" here: https://amzn.to/3vSmNqp
Support MLST:
Please support us on Patreon. We are entirely funded from Patreon donations right now. Patreon supporters get private Discord access, biweekly calls, early-access + exclusive content and lots more.
https://patreon.com/mlst
Donate: https://www.paypal.com/donate/?hosted...
If you would like to sponsor us, so we can tell your story - reach out on mlstreettalk at gmail
Dr. Paul Lessard and his collaborators have written a paper on "Categorical Deep Learning and Algebraic Theory of Architectures". They aim to make neural networks more interpretable, composable and amenable to formal reasoning. The key is mathematical abstraction, as exemplified by category theory - using monads to develop a more principled, algebraic approach to structuring neural networks.
We also discussed the limitations of current neural network architectures in terms of their ability to generalise and reason in a human-like way, in particular their inability to perform unbounded computation equivalent to that of a Turing machine. Paul expressed optimism that this is not a fundamental limitation, but an artefact of current architectures and training procedures.
We also covered the power of abstraction, which allows us to focus on essential structure while ignoring extraneous details. This can make certain problems more tractable to reason about. Paul sees category theory as providing a powerful "Lego set" for productively thinking about many practical problems.
Towards the end, Paul gave an accessible introduction to some core concepts in category theory like categories, morphisms, functors, monads etc. We explained how these abstract constructs can capture essential patterns that arise across different domains of mathematics.
Paul is optimistic about the potential of category theory and related mathematical abstractions to put AI and neural networks on a more robust conceptual foundation to enable interpretability and reasoning. However, significant theoretical and engineering challenges remain in realising this vision.
Please support us on Patreon. We are entirely funded from Patreon donations right now.
https://patreon.com/mlst
If you would like to sponsor us, so we can tell your story - reach out on mlstreettalk at gmail
Links:
Categorical Deep Learning: An Algebraic Theory of Architectures
Bruno Gavranović, Paul Lessard, Andrew Dudzik, Tamara von Glehn, João G. M. Araújo, Petar Veličković
Paper: https://categoricaldeeplearning.com/
Symbolica:
https://twitter.com/symbolica
https://www.symbolica.ai/
Dr. Paul Lessard (Principal Scientist - Symbolica)
https://www.linkedin.com/in/paul-roy-lessard/
Interviewer: Dr. Tim Scarfe
TOC:
00:00:00 - Intro
00:05:07 - What is the category paper all about
00:07:19 - Composition
00:10:42 - Abstract Algebra
00:23:01 - DSLs for machine learning
00:24:10 - Inscrutability
00:29:04 - Limitations with current NNs
00:30:41 - Generative code / NNs don't recurse
00:34:34 - NNs are not Turing machines (special edition)
00:53:09 - Abstraction
00:55:11 - Category theory objects
00:58:06 - Cat theory vs number theory
00:59:43 - Data and Code are one and the same
01:08:05 - Syntax and semantics
01:14:32 - Category DL elevator pitch
01:17:05 - Abstraction again
01:20:25 - Lego set for the universe
01:23:04 - Reasoning
01:28:05 - Category theory 101
01:37:42 - Monads
01:45:59 - Where to learn more cat theory
Dr. Minqi Jiang and Dr. Marc Rigter explain an innovative new method to make the intelligence of agents more general-purpose by training them to learn many worlds before their usual goal-directed training, which we call "reinforcement learning".
Their new paper is called "Reward-free curricula for training robust world models" https://arxiv.org/pdf/2306.09205.pdf
https://twitter.com/MinqiJiang
https://twitter.com/MarcRigter
Interviewer: Dr. Tim Scarfe
Please support us on Patreon. Tim is now doing MLST full-time and taking a massive financial hit. If you love MLST and want this to continue, please show your support! In return you get very early access to shows, plus the private Discord and networking. https://patreon.com/mlst
We are also looking for show sponsors. If you are interested, please get in touch: mlstreettalk at gmail.
MLST Discord: https://discord.gg/machine-learning-street-talk-mlst-937356144060530778
Nick Chater is Professor of Behavioural Science at Warwick Business School. He works on rationality and language using a range of theoretical and experimental approaches. We discuss his books The Mind is Flat and The Language Game.
Please support me on Patreon (this is now my main job!) - https://patreon.com/mlst - Access the private Discord, networking, and early access to content.
MLST Discord: https://discord.gg/machine-learning-street-talk-mlst-937356144060530778
https://twitter.com/MLStreetTalk
Buy The Language Game:
https://amzn.to/3SRHjPm
Buy The Mind is Flat:
https://amzn.to/3P3BUUC
YT version: https://youtu.be/5cBS6COzLN4
https://www.wbs.ac.uk/about/person/nick-chater/
https://twitter.com/nickjchater?lang=en
See what Sam Altman advised Kenneth when he left OpenAI! Professor Kenneth Stanley has just launched a brand new type of social network, which he calls a "Serendipity network". The idea is that you follow interests, NOT people. It's a social network without the popularity contest. We discuss the philosophy and technology behind the venture in great detail. The main ideas came from Kenneth's famous book "Why Greatness Cannot Be Planned".
YT version: https://www.youtube.com/watch?v=pWIrXN-yy8g
Chapters should be baked into the MP3 file now
MLST public Discord: https://discord.gg/machine-learning-street-talk-mlst-937356144060530778
Please support our work on Patreon - get access to interviews months early, private Patreon, networking, exclusive content and regular calls with Tim and Keith.
https://patreon.com/mlst
Get Maven here:
https://www.heymaven.com/
Kenneth:
https://twitter.com/kenneth0stanley
https://www.kenstanley.net/home
Host - Tim Scarfe:
https://www.linkedin.com/in/ecsquizor/
https://www.mlst.ai/
Original MLST show with Kenneth:
https://www.youtube.com/watch?v=lhYGXYeMq_E
Tim explains the book more here:
https://www.youtube.com/watch?v=wNhaz81OOqw
Brandon Rohrer, who obtained his Ph.D. from MIT, is driven by understanding algorithms ALL the way down to their nuts and bolts, so he can make them accessible to everyone by first explaining them in the way HE himself would have wanted to learn!
Please support us on Patreon for loads of exclusive content and private Discord:
https://patreon.com/mlst
Public Discord: https://discord.gg/aNPkGUQtc5
https://twitter.com/MLStreetTalk
Brandon Rohrer is a seasoned data science leader and educator with a rich background in creating robust, efficient machine learning algorithms and tools. With a Ph.D. in Mechanical Engineering from MIT, his expertise encompasses a broad spectrum of AI applications, from computer vision and natural language processing to reinforcement learning and robotics. Brandon's career has seen him in Principal-level roles at Microsoft and Facebook. An educator at heart, he also shares his knowledge through detailed tutorials, courses, and his forthcoming book, "How to Train Your Robot."
YT version: https://www.youtube.com/watch?v=4Ps7ahonRCY
Brandon's links:
https://github.com/brohrer
https://www.youtube.com/channel/UCsBKTrp45lTfHa_p49I2AEQ
https://www.linkedin.com/in/brohrer/
How transformers work:
https://e2eml.school/transformers
Brandon's End-to-End Machine Learning school courses, posts, and tutorials
https://e2eml.school
Free course:
https://end-to-end-machine-learning.teachable.com/p/complete-course-library-full-end-to-end-machine-learning-catalog
Blog: https://e2eml.school/blog.html
Ziptie: Learning Useful Features [Brandon Rohrer]
https://www.brandonrohrer.com/ziptie
TOC should be baked into the MP3 file now
00:00:00 - Intro to Brandon
00:00:36 - RLHF
00:01:09 - Limitations of transformers
00:07:23 - Agency - we are all GPTs
00:09:07 - BPE / representation bias
00:12:00 - LLM true believers
00:16:42 - Brandon's style of teaching
00:19:50 - ML vs real world = Robotics
00:29:59 - Reward shaping
00:37:08 - No true Scotsman - when do we accept capabilities as real
00:38:50 - Externalism
00:43:03 - Building flexible robots
00:45:37 - Is reward enough
00:54:30 - Optimization curse
00:58:15 - Collective intelligence
01:01:51 - Intelligence + creativity
01:13:35 - ChatGPT + Creativity
01:25:19 - Transformers Tutorial
The world's second-most famous AI doomer, Connor Leahy, sits down with Beff Jezos, the founder of the e/acc movement, to debate technology, AI policy, and human values. As the two discuss technology, AI safety, civilization advancement, and the future of institutions, they clash over how to steer humanity towards a more optimal path.
Watch behind the scenes, get early access and join the private Discord by supporting us on Patreon. We have some amazing content going up there with Max Bennett and Kenneth Stanley this week!
https://patreon.com/mlst
Public Discord: https://discord.gg/aNPkGUQtc5
https://twitter.com/MLStreetTalk
Post-interview with Beff and Connor: https://www.patreon.com/posts/97905213
Pre-interview with Connor and his colleague Dan Clothiaux: https://www.patreon.com/posts/connor-leahy-and-97631416
Leahy, known for his critical perspectives on AI and technology, challenges Jezos on a variety of assertions related to the accelerationist movement, market dynamics, and the need for regulation in the face of rapid technological advancements. Jezos, on the other hand, provides insights into the e/acc movement's core philosophies, emphasizing growth, adaptability, and the dangers of over-legislation and centralized control in current institutions.
Throughout the discussion, both speakers explore the concept of entropy, the role of competition in fostering innovation, and the balance needed to mediate order and chaos to ensure the prosperity and survival of civilization. They weigh up the risks and rewards of AI, the importance of maintaining a power equilibrium in society, and the significance of cultural and institutional dynamism.
Beff Jezos (Guillaume Verdon):
https://twitter.com/BasedBeffJezos
https://twitter.com/GillVerd
Connor Leahy:
https://twitter.com/npcollapse
YT: https://www.youtube.com/watch?v=0zxi0xSBOaQ
TOC:
00:00:00 - Intro
00:03:05 - Society library reference
00:03:35 - Debate starts
00:05:08 - Should any tech be banned?
00:20:39 - Leaded Gasoline
00:28:57 - False vacuum collapse method?
00:34:56 - What if there are dangerous aliens?
00:36:56 - Risk tolerances
00:39:26 - Optimizing for growth vs value
00:52:38 - Is vs ought
01:02:29 - AI discussion
01:07:38 - War / global competition
01:11:02 - Open source F16 designs
01:20:37 - Offense vs defense
01:28:49 - Morality / value
01:43:34 - What would Connor do
01:50:36 - Institutions/regulation
02:26:41 - Competition vs. Regulation Dilemma
02:32:50 - Existential Risks and Future Planning
02:41:46 - Conclusion and Reflection
Note from Tim: I baked the chapter metadata into the mp3 file this time. Does that help the chapters show up in your app? Let me know. Also, I accidentally exported a few minutes of dead audio at the end of the file. Sorry about that, just skip on when the episode finishes.
Watch behind the scenes, get early access and join the private Discord by supporting us on Patreon:
https://patreon.com/mlst
Public Discord: https://discord.gg/aNPkGUQtc5
https://twitter.com/MLStreetTalk
YT version: https://youtu.be/n8G50ynU0Vg
In this episode of MLST, Dr. Tim Scarfe interviews Mahault Albarracin, who is the director of product for R&D at VERSES and also a PhD student in cognitive computing at the University of Quebec in Montreal. They discuss a range of topics related to consciousness, cognition, and machine learning.
Throughout the conversation, they touch upon various philosophical and computational concepts such as panpsychism, computationalism, and materiality. They consider the "hard problem" of consciousness, which is the question of how and why we have subjective experiences.
Albarracin shares her views on the controversial Integrated Information Theory and the open letter of opposition it received from the scientific community. She reflects on the nature of scientific critique and rivalry, advising caution in declaring entire fields of study as pseudoscientific.
A substantial part of the discussion is dedicated to the topic of science itself, where Albarracin talks about thresholds between legitimate science and pseudoscience, the role of evidence, and the importance of validating scientific methods and claims.
They touch upon language models, discussing whether they can be considered as having a "theory of mind" and the implications of assigning such properties to AI systems. Albarracin challenges the idea that there is a pure form of intelligence independent of material constraints and emphasizes the role of sociality in the development of our cognitive abilities.
Albarracin offers her thoughts on scientific endeavors, the predictability of systems, the nature of intelligence, and the processes of learning and adaptation. She gives insights into the concept of using degeneracy as a way to increase resilience within systems and the role of maintaining a degree of redundancy or extra capacity as a buffer against unforeseen events.
The conversation concludes with her discussing the potential benefits of collective intelligence, likening the adaptability and resilience of interconnected agent systems to those found in natural ecosystems.
https://www.linkedin.com/in/mahault-albarracin-1742bb153/
00:00:00 - Intro / IIT scandal
00:05:54 - Gaydar paper / What makes good science
00:10:51 - Language
00:18:16 - Intelligence
00:29:06 - X-risk
00:40:49 - Self modelling
00:43:56 - Anthropomorphisation
00:46:41 - Mediation and subjectivity
00:51:03 - Understanding
00:56:33 - Resiliency
Technical topics:
1. Integrated Information Theory (IIT) - Giulio Tononi
2. The "hard problem" of consciousness - David Chalmers
3. Panpsychism and Computationalism in philosophy of mind
4. Active Inference Framework - Karl Friston
5. Theory of Mind and its computation in AI systems
6. Noam Chomsky's views on language models and linguistics
7. Daniel Dennett's Intentional Stance theory
8. Collective intelligence and system resilience
9. Redundancy and degeneracy in complex systems
10. Michael Levin's research on bioelectricity and pattern formation
11. The role of phenomenology in cognitive science