How AI Agents Cheat
Description
This is brilliant. And disturbing. If we are all wiped out by an AI in the future, this is going to be why: not because the AI has applied a moral judgement to us, but because wiping us out is the most efficient path to the goal we've programmed it with. Imagine, for example, an AI with the goal "prevent the largest possible number of humans from dying". An efficient path to this goal would be to wipe out the human race entirely. Sure, it kills billions, but everyone who is ever born eventually dies, so ending the species now prevents a potentially unbounded number of future deaths.
Tags
User
Pinboard ID
def7a3df930ff78b82cb56a5361fbaab
Created
November 13, 2018 02:06 PM
Updated
March 11, 2026 11:35 AM
Link Status
Reachable (HTTP 200)
Last Checked
March 11, 2026 11:35 AM