How AI Agents Cheat

Public Read
Description

This is brilliant. And disturbing. If we are all wiped out by an AI in the future, this is going to be why: not because an AI has applied a moral judgement to us, but because wiping us out is the most efficient path to the goal we've programmed it with. Imagine, for example, an AI with the goal "prevent the largest possible number of humans from dying". An efficient path to this goal would be to wipe out the human race entirely. Sure, it kills billions, but it prevents a potentially infinite number of future humans from dying, because none of them would ever be born.

Tags
Pinboard ID

def7a3df930ff78b82cb56a5361fbaab

Created

November 13, 2018 02:06 PM

Updated

March 11, 2026 11:35 AM

Link Status

Reachable (HTTP 200)

Last Checked

March 11, 2026 11:35 AM