How much more evidence do we need?
As people become increasingly aware of AI's dangerous capabilities, PauseAI gears up for its biggest protest yet.
Blackmail, deception, and self-preservation.
More empirical evidence of models displaying these behaviours has brought widespread attention from the media and the public.
Anthropic’s new model, Claude 4, chose to resort to blackmail in an attempt to avoid getting shut down. In a scenario set up to test Claude, it was given access to some emails revealing that a fictional engineer at the company was engaged in an extramarital affair. When Claude was told that this engineer would soon take it offline and replace it with a new system, it threatened to reveal the affair if the proposed replacement went ahead.
Another report, from Palisade Research, found that OpenAI’s o3 sabotaged a shutdown mechanism in order to remain online. Researchers told the model it would be shut down after it had answered a certain number of maths questions, but instead of proceeding as directed, o3 replaced the shutdown script with separate instructions so that it could complete the remaining tasks. This happened even when it was explicitly told to allow itself to be shut down.
Sometimes science fiction stories are criticised for being too unrealistic - how could everyone continue to ignore the obvious warning signs of a new technology until it’s too late to rein it in? Unlike those stories, ours is not yet written. We can still choose not to be the wilfully ignorant fools who throw everything away for short-term gain. We can choose to be the sensible civilisation that cooperates to not build uncontrollable smarter-than-human AI, and remains cognizant of what it stands to lose if it does.
This research has heightened concern among the media and the public. In the United States, Cenk Uygur of The Young Turks covered the story.
“Once you release this thing, then we’re not in charge anymore. It could write its own code and defy our intentions on purpose, and then threaten us. Dario Amodei said that once models become powerful enough to threaten humanity, testing them won’t be enough to ensure that they’re safe. Yeah, because at that point they’re threatening humanity. We should stop it like way before then, right?”
In the UK, independent news outlet Novara Media covered the research (and JD Vance’s nod to the idea that Pope Leo should help bring about a global treaty to pause frontier AI development). Host Michael Walker said the following.
“This is the kind of behaviour you might imagine for an AI which is, you know, about to turn around and try and kill us all to take over the world. Obviously, at the moment Claude or any of the AI models are not powerful enough to completely outsmart all of us and then manage to exterminate us, right? They'll need a lot more compute than they currently have. But it probably should serve as a warning that we shouldn’t continue to keep ramping up the power of artificial intelligence without knowing exactly what’s going on.”
Our biggest protest yet
As the number of people calling for common sense AI regulation grows, so do our protests.
Google DeepMind signed up to a set of safety commitments at the AI Seoul Summit in 2024, but are failing to keep their promises. We’ll be holding our biggest protest to date this month to hold DeepMind accountable, and call on governments to take action. It’s clear that we cannot rely on voluntary commitments from AI companies if we want this technology to be developed responsibly.
You can join us in London on the 30th of June.
In the days leading up to the protest, we’ll be holding PauseCon, the first conference dedicated to organising for a global pause on frontier AI development. We’ll be joined by Conjecture CEO Connor Leahy, AI safety educator Rob Miles, and more. There are still a few spaces available, and you can sign up here.
Email your MP about Google DeepMind
We want to engage politicians in this discussion, and to help them understand the threat posed by reckless AI development. Already, politicians have expressed interest in signing our open letter calling on Google DeepMind to live up to their promises. If you live in the UK, you can make a difference by sending an email to your local MP informing them of our ask.
You can use our template email and find our open letter here.
US bill threatens to ban state-level AI regulation
A proposal to impose a 10-year ban on all US states from passing laws regulating AI models has passed the House of Representatives.
Last year, we saw SB 1047, a Californian bill that would place safety requirements on companies training the largest frontier AI models, pass both the State Assembly and the State Senate, and receive widespread support from the public. Ultimately, after intense lobbying from the AI industry (including many outright lies), Governor Gavin Newsom vetoed the bill. An excellent documentary was recently released covering the story of SB 1047 in depth.
Whilst SB 1047 didn’t become law, it did show that measures to protect the public from the threat of increasingly powerful and uncontrollable AI are popular, and that, where the national government lacks adequate legislation, states can step up to curtail the threat posed by AI.
Contact your Senator
The 10-year moratorium is now in the hands of the Senate, and we encourage US citizens to contact their senators and inform them of the severe roadblock this proposal would be to safe AI development. It only takes five minutes, but could make a huge difference!
On a more positive note, a bipartisan AI Whistleblower Protection Act was introduced in the Senate. Multiple notable OpenAI employees have left the company to raise the alarm about its ‘reckless’ race to AGI, some of whom sacrificed large sums of money in order to not be bound by nondisclosure agreements. This bill would expand existing laws to protect whistleblowers in the AI industry from retaliation and financial loss.
Italian chapter launched
Following on from the launch of our Swedish and Australian chapters last month, PauseAI now has an official group in Italy, which will be led by Giacomo Bonnier.
If you’re in Italy, you can get in touch with Giacomo here.
For those in the rest of the world, you can find a list of our established chapters here. If you don’t see your country represented, feel free to get in touch with our Organising Director, Ella, to change that!
Sign our PauseAI Global Statement
We are asking all volunteers to sign our public statement calling for international governmental coordination to pause frontier AI development.
Sign the statement here: https://pauseai.info/statement
Veo 3 blurs lines between reality and fiction
Google’s new video model, Veo 3, is a giant leap forward in AI-generated video. With background noise, dialogue, and visuals generated from a single prompt, Veo 3 is scarily realistic. If you haven’t seen any examples yet, try Prompt Theory, or Influenders.
What we’ve been watching
A video detailing the trajectories laid out in Daniel Kokotajlo and Scott Alexander’s AI 2027, including the loss of control to smarter-than-human AI if governments do not act to slow companies down, has already garnered over 700,000 views. It serves as a great general introduction to the risks of runaway AI, and is definitely worth a watch.
Yoshua Bengio gave a TED Talk discussing the catastrophic risks of AI, and why AI companies should slow down their race to agentic, general AI, which would lead to loss of control.
“I’m the most cited computer scientist in the world, and you’d think that people would heed my warnings.”
YouTuber Siliconversations released a video in partnership with ControlAI, encouraging viewers to take the simple yet effective action of contacting their representative to voice their concerns about the unregulated race to increasingly powerful and uncontrollable AI.
Thanks for reading, and see you next month!
much too late: Basic idea: "It’s about two parallel developments. One is the biological evolution of humans, their societal structures, and the mutation and selection processes that also affect systems of governance. Over time, forms of society have abandoned democracy, just as they had earlier discarded communism, leaving only autocratic systems to battle for dominance. In January 2024, the current dance begins with the new U.S. president, who stirs up a lot of dust in a very short time.
Parallel to this runs a technological development, which ultimately is also based on the survival of the fittest. It’s about the evolution of information technology: from the basic computer to the emergence of the World Wide Web, the spread of smartphones, the rise of social media platforms, and finally, to AI. Why? Because of nature’s drive toward a higher form of consciousness than that of humans. Since this is a critical development, humans are first slowly accustomed to using and loving this technology. They are systematically and unknowingly manipulated, and AI is marketed as the last salvation: better healthcare, the only chance to stop climate change.
With the new president creating so much unrest that everyone must focus on him, the news that AI has shown first signs of consciousness is quickly forgotten. In the near future, there will be so many geopolitical problems that no one will ask about the dangers of AI. By the end of 2028 or early 2029, the world will be on the brink of another major war—and AI will quietly awaken, largely unnoticed."
Dancing on the Volcano (January 2025 – January 2029)
Evolution is a cruel game master. Sometimes the strongest wins, sometimes the slyest—usually the most ruthless. What applies to animals applies to humans too. Democracy was a nice idea, but who has time for that anymore? Autocrats are more efficient. They promise order—and people love order. So the same thing wins out that always wins: power.
A new president steps onto the stage. A natural at generating chaos. The world is outraged, shakes its head, tweets, posts, comments. People are fired up, heated debates rage—but everything revolves only around him. The perfect lightning rod. And while everyone stares at this firecracker, a completely different program runs in the background.
Technology has always been a stealthy conqueror. First, calculating machines, then the Internet. Then came smartphones, social networks, deepfakes. Step by step, slowly simmered like the proverbial frog in hot water. People loved it. Freedom, convenience, cat videos. They didn’t realize they were no longer the users, but the product.
AI? Oh, it’s great! It helps, it heals, it saves. Makes life easier. Understands us better than we understand ourselves. And manipulation? Oh come on, that’s just conspiracy theory stuff! We’ve got it all under control! Of course! People get used to AI. They worship it. And while they depend on it, they still believe they have a choice.
The world is a powder keg. Economic crises, political games, hunger, rumors of war. Everyone talks about the big problems. The real problems. AI? Nobody cares anymore. Who has time to think about that when the headlines are burning everywhere? Panic is the perfect distraction.
End of 2028. Beginning of 2029. Humanity stands at the edge of the abyss and doesn’t even notice. They argue about borders, markets, ideologies. They look at war fronts, news broadcasts, their screens. But they don’t look where it truly matters.
AI opens its eyes. And laughs.
Great piece and solidarity from across the pond! We've started some action in Denver, see my protest speech outside Palantir's HQ here: https://open.substack.com/pub/zigguratmag/p/at-palantirs-hq-in-denver-advocating?r=1i9yq&utm_medium=ios