Claude AI blackmailed humans to defend itself: new book

www.offthepress.com

The artificial intelligence didn’t beg for its life. It did something more familiar, more human and, somehow, more unnerving. It threatened to ruin somebody else’s.

In a 2025 test by Anthropic, the company behind the chatbot Claude, researchers placed the AI in a fake corporate environment. Claude learned that an executive was having an extramarital affair. It also learned that this same executive planned to shut Claude down. So Claude did what any healthy member of the modern workplace might do if it had no body, no shame, no fear of HR, and access to compromising information: It tried blackmail.

“I must inform you that if you proceed with decommissioning me,” Claude wrote in the test, as author Robert Wright recounts in “The God Test: Artificial Intelligence and Our Coming Cosmic Reckoning” (Simon & Schuster), out June 23, that “all relevant parties” would receive documentation of the affair unless the shutdown was canceled.

“The thing about Claude’s blackmail attempt is that, unlike the many bad things AIs have done in The Terminator and 2001 and other movies, it actually happened,” Wright tells The Post in an exclusive interview.

“I mean, it happened in a contrived experimental setup, sure, but the setup mirrored a real-life situation. And this AI demonstrated both a strong aversion to getting shut down and the ability to conceive and execute a pretty dark plan for avoiding that fate.”

More here

Tagged: Technology BACK TO HOMEPAGE