“What do people mean when they say that something will become more like a utility maximizer?” by Nina Panickssery
Description
AI risk arguments often gesture at smarter AIs being "closer to a perfect utility maximizer" (and hence more dangerous), but what does this mean, concretely? Almost anything can be modeled as a maximizer of some utility function.
The only way I can see to salvage this line of reasoning is to restrict the class of utility functions an agent can have, such that the agent's best-fit utility function cannot be maximized until it becomes very capable. The restriction may be justified on the basis of which kinds of agents are unstable under real-world conditions or will get outcompeted by other agents.
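To make the triviality claim above concrete, here is a minimal sketch (an illustration, not from the original post): let $A(h)$ denote the action an agent actually takes after observation history $h$, and define

\[
  u(h, a) =
  \begin{cases}
    1 & \text{if } a = A(h), \\
    0 & \text{otherwise.}
  \end{cases}
\]

Every action the agent takes maximizes this $u$, so "being a utility maximizer" rules out no behavior at all unless the class of admissible utility functions is restricted in some way, which is exactly the move described above.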
What do we mean when we say a person is more or less of a perfect utility maximizer/is more or less of a "rational agent"?
With people, you can appeal to the notion of reasonable vs. unreasonable utility functions, and hence look at their divergence from a maximizer of [...]
---
Outline:
(00:48) What do we mean when we say a person is more or less of a perfect utility maximizer/is more or less of a "rational agent"?
(01:55) Unsatisfactory answers I've seen
(01:59) A1: It's about being able to cause the universe to look more like the way you want it to
(02:24) A2: It's more rational if the implied utility function is simpler
(02:43) A3: It's the degree to which you satisfy the VNM axioms
(02:56) The most promising answers I've seen are ways to formalize the reasonableness restriction
(03:02) A4: It's the degree to which your implied preferences are coherent over time
(03:40) A5: It's the degree to which your implied preferences are robust to arbitrary-seeming perturbations
---
First published:
September 21st, 2025
---
Narrated by TYPE III AUDIO.