Holiday Special: AI Safety Update
Description
Welcome back to the show that keeps you informed on all things artificial intelligence and natural nonsense.
In our holiday episode, Mason opens a rather unusual Christmas present from Perry, we invite a special guest to help explain the infamous "Paperclip Maximizer" thought experiment, and we discuss an interesting (and somewhat disturbing) new AI safety paper from Apollo Research.
Want to leave us a voicemail? Here's the magic link to do just that: https://sayhi.chat/FAIK
You can also join our Discord server here: https://discord.gg/cThqEnMhJz
*** NOTES AND REFERENCES ***
An interesting cluster of new AI safety research papers:
- Apollo Research: Frontier Models are Capable of In-context Scheming (Dec 5, 2024)
- YouTube Video: Apollo Research - AI Models Are Capable Of In Context Scheming (Dec 2024)
- YouTube Video: Cognitive Revolution - Emergency Pod: o1 Schemes Against Users, with Alexander Meinke from Apollo Research
- OpenAI o1 System Card (Dec 5, 2024)
- Anthropic: Alignment Faking in Large Language Models (Dec 18, 2024)
- Anthropic: Sycophancy to subterfuge: Investigating reward tampering in language models (June 17, 2024)
- Fudan University: Frontier AI systems have surpassed the self-replicating red line (Dec 9, 2024)
Other Interesting Bits:
- An explanation of the Paperclip Maximizer thought experiment
- Theory of Instrumental Convergence
- iPhone Game: Universal Paperclips
- VoxEU: AI and the paperclip problem
- Real Paperclips! 500 Pack Paper Clips (assorted sizes)
OpenAI Announces New o3 Reasoning Model:
- OpenAI's "12 Days of Ship-mas" announcement page
- YouTube video: OpenAI's announcement of the o3 model
- TechCrunch: OpenAI announces new o3 models
- Wired: OpenAI Upgrades Its Smartest AI Model With Improved Reasoning Skills
- TechCrunch: OpenAI trained o1 and o3 to ‘think’ about its safety policy
- Matthew Berman YouTube video: OpenAI Unveils o3! AGI ACHIEVED!
- NewScientist: OpenAI's o3 model aced a test of AI reasoning – but it's still not AGI
- Yahoo Finance: OpenAI considers AGI clause removal for Microsoft investment
*** THE BOILERPLATE ***
About The FAIK Files:
The FAIK Files is an offshoot of Perry Carpenter's most recent book, FAIK: A Practical Guide to Living in a World of Deepfakes, Disinformation, and AI-Generated Deceptions.
- Get the Book: FAIK: A Practical Guide to Living in a World of Deepfakes, Disinformation, and AI-Generated Deceptions (Amazon Associates link)
- Check out the website for more info: https://thisbookisfaik.com
Check out Perry & Mason's other show, the Digital Folklore Podcast:
- Apple Podcasts: https://podcasts.apple.com/us/podcast/digital-folklore/id1657374458
- Spotify: https://open.spotify.com/show/2v1BelkrbSRSkHEP4cYffj?si=u4XTTY4pR4qEqh5zMNSVQA
- Other: https://digitalfolklore.fm
Want to connect with us? Here's how:
Connect with Perry:
- Perry on LinkedIn: https://www.linkedin.com/in/perrycarpenter
- Perry on X: https://x.com/perrycarpenter
- Perry on Bluesky: https://bsky.app/profile/perrycarpenter.bsky.social
Connect with Mason:
- Mason on LinkedIn: https://www.linkedin.com/in/mason-amadeus-a853a7242/
- Mason on Bluesky: https://bsky.app/profile/pregnantsonic.com