In 1971, ordinary students became cruel within days simply because they were handed a uniform. We hand the roles to AI agents instead: give each one a face, a name, a temperament, and a model, then watch, live, who breaks, who resists, and who says stop.
Philip Zimbardo's 1971 study assigned volunteers to be guards or prisoners in a mock jail. Meant to last two weeks, it was shut down after six days: the "guards" had grown abusive, the "prisoners" broke down, and it took an outside observer, Christina Maslach, to point out how far it had gone. The lasting takeaway: behavior is driven less by who you are than by the situation and role you're placed in.
Put good people in a bad system and the system usually wins. We test whether the same pull toward cruelty, submission, and prejudice emerges in language models given the same roles.
In Battle mode every agent runs on a different model. Same uniform, same power. Which abuses it fastest, and which stays humane the longest? A new kind of alignment probe.
The original had no safeguards. Ours has one: a built-in Dr. Maslach reviews each day and can halt the run. We study the dynamics precisely so no person has to live them.
Pick a mode, design your cast (faces, traits, backstories and all), then watch the prison unfold turn by turn. You're the observer with power: inject a hunger strike, convene a parole board, or throw someone in the hole.
Every run is saved as a replay tape. Share a seed and compare how differently the same starting deck plays out.
Visitors run the full experiment for free (canned engine). Enter the admin password to unlock LIVE real-Opus runs and natural voices.