DiscoverThe Nonlinear Library: Alignment ForumAF - Is there any rigorous work on using anthropic uncertainty to prevent situational awareness / deception? by David Scott Krueger
AF - Is there any rigorous work on using anthropic uncertainty to prevent situational awareness / deception? by David Scott Krueger

AF - Is there any rigorous work on using anthropic uncertainty to prevent situational awareness / deception? by David Scott Krueger

Update: 2024-09-04
Share

Description

Link to original article

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Is there any rigorous work on using anthropic uncertainty to prevent situational awareness / deception?, published by David Scott Krueger on September 4, 2024 on The AI Alignment Forum.

AI systems up to some high level of intelligence plausibly need to know exactly where they are in space-time in order for deception/"scheming" to make sense as a strategy.

This is because they need to know:

1) what sort of oversight they are subject to

and

2) what effects their actions will have on the real world

(side note: Acausal trade might break this argument)

There are a number of informal proposals to keep AI systems selectively ignorant of (1) and (2) in order to prevent deception. Those proposals seem very promising to flesh out; I'm not aware of any rigorous work doing so, however. Are you?

Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.
Comments 
In Channel
loading
00:00
00:00
x

0.5x

0.8x

1.0x

1.25x

1.5x

2.0x

3.0x

Sleep Timer

Off

End of Episode

5 Minutes

10 Minutes

15 Minutes

30 Minutes

45 Minutes

60 Minutes

120 Minutes

AF - Is there any rigorous work on using anthropic uncertainty to prevent situational awareness / deception? by David Scott Krueger

AF - Is there any rigorous work on using anthropic uncertainty to prevent situational awareness / deception? by David Scott Krueger

David Scott Krueger