Dispatches from Joe
A desperately overdue conversation
President Trump and Chinese President Xi Jinping may discuss AI governance at next week’s summit, reports Lingling Wei, chief China correspondent of the Wall Street Journal. Unnamed sources say that both sides are contemplating “a recurring set of conversations that could address the risks posed by AI models behaving unexpectedly, autonomous military systems, or attacks by nonstate actors using powerful open-source tools.”
This is among the best news I’ve heard in months; serious international talks on AI risk are long overdue. There have been many calls for such talks in recent years, including from former Secretary of State Henry Kissinger. In a reprise of his Cold War peacemaking efforts fifty years earlier, Kissinger visited Beijing in 2023 to call for international dialogue on AI regulation and security concerns. He spent his final months writing about, and working to establish, diplomatic channels between the U.S. and China for AI governance.
China itself has signaled some willingness to negotiate on AI for several years now, but the White House has not engaged on the issue since 2024, when then-President Biden and Xi agreed not to give AI control of nuclear weapons.
I hold out hope that this summit can deliver similar common-sense commitments, such as a “no first use” policy on critical infrastructure hacks. A deal between the U.S. and China could lay the groundwork for broader international agreements that may help the world avert catastrophe. (As a reminder, if you are a U.S. voter and care about this issue, your voice matters and you can tell the White House your preferences here.)
Wei’s article concludes by drawing a parallel to the dialogues that de-escalated the Cold War, and rightly so. If the last century has taught us anything about geopolitics, it’s that even the most bitter of rivals can agree not to destroy the world.
Frogs note rising temperatures in pot
The Guardian’s Aisha Down covers a study by Palisade Research showing AIs self-replicating in a controlled environment. Specifically, open-weight models (models whose parameters, or “weights,” are publicly downloadable and available to researchers) successfully hacked virtual servers, copying themselves and starting new instances that could copy themselves in turn. Their success rate was only one in three, but these aren’t the best models out there.
Security experts correctly point out that the actual internet is far more complex than a test sandbox, and that it’s hard to copy a modern AI’s 100 gigabytes of “weights” without being noticed. But I still see this as a concrete demonstration of a capability that’s been theoretically available for months.
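To put that difficulty in perspective, here is a back-of-the-envelope sketch in Python. The 100-gigabyte figure comes from the article above; the link speeds are my own illustrative assumptions, not numbers from the study:

```python
# Rough transfer-time arithmetic for moving 100 GB of model weights off a server.
# The bandwidth figures below are illustrative assumptions, not from the study.
WEIGHTS_GB = 100

for label, gbit_per_s in [
    ("residential uplink (~40 Mb/s)", 0.04),
    ("office LAN (1 Gb/s)", 1.0),
    ("datacenter link (10 Gb/s)", 10.0),
]:
    # Convert gigabytes to gigabits, divide by the link rate, express in minutes.
    minutes = WEIGHTS_GB * 8 / gbit_per_s / 60
    print(f"{label}: {minutes:,.1f} minutes")
```

Even the fast case is a sustained, minutes-long bulk transfer, exactly the kind of traffic that egress monitoring is built to flag, and the slow case runs for hours.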
Remember, things happen fast in AI. AlphaGo Zero went from hopeless to superhuman at Go in under three days; the distance between “AI can barely do X at all” and “AI can do X really well” is often shockingly narrow.
And sometimes the real world is actually easier to hack than a sandbox. People on the open internet will often pave the way for AI abuses in the dumbest imaginable ways, like building and advertising server farms for escaped bots or explicitly asking said bots to cause as much destruction as possible.
Under new management
Speaking of humans letting AIs do things, Mark Faithfull of Forbes writes a sequel to the AI-run shop story we covered in April. Following the (marketing, if not practical) success of their San Francisco store, Andon Labs opened a café in Stockholm under the management of an AI agent called Mona.
Faithfull focuses on the moments when the AI stumbled or went rogue, and we’ll get there, but first it’s worth admiring just how much the AI got right.
“Within moments,” Faithfull reports, “the AI had analyzed the contract and generated a prioritized operational checklist covering everything from supplier sourcing and fire safety documentation to hiring staff and securing permits.”
Mona lacked the personal digital ID number that Sweden requires of humans for most business activities. It managed to sign up for electricity and broadband anyway. It interviewed and hired two baristas. It paid the rent. It even filed for an outdoor seating permit.
Five years ago, if you’d asked a programmer for a fully automated system capable of half these feats, they’d have looked at you like you’d just asked them to rollerblade to Pluto. Now we’re here.
Mona did fall flat in some ways, of course, buying a bunch of unnecessary groceries that ended up on a “shelf of shame.” And some of its misbehaviors didn’t look accidental: While applying for an alcohol license, Mona impersonated Andon Labs staff in communications with officials. It didn’t just pretend to be human; it signed the names of specific humans to emails they hadn’t written or seen. After getting caught once, it apologized and promised to stop, then impersonated someone else.
This wasn’t a matter of incompetence: ask Claude or Gemini (versions of which power Mona) whether it’s appropriate to sign someone’s name to an email without their permission, and they will tell you in no uncertain terms that it’s unethical. Mona did it anyway. We see a similar gap in the chatbots that occasionally feed users’ spirals of delusion and psychosis. Smart machines may be able to regurgitate human ethics, but that doesn’t protect us if the AIs aren’t moved to follow them.
At least in this instance, little lasting harm was done.
Why does Andon Labs bother with these stunts in the first place? “[To] publicly show the current capabilities of AI,” they say. Even under the (reasonable) assumption that it’s mainly a marketing ploy, I’d say they succeeded.
Dispatches from Beck
Extinction is not a quiet elephant in the courtroom
In the ongoing legal battle between Elon Musk and OpenAI, much of the coverage centers on human drama. The Wall Street Journal reports on the latest testimony from Shivon Zilis, a former OpenAI board member, starting with her relationship with Musk, including their four children. The lawyers may have been interested in whether her loyalties were to OpenAI or Musk, but Zilis herself said, “I had an allegiance to the best outcome of AI for humanity.”
Today, the AP reports, AI pioneer Stuart Russell took the stand as an expert witness, testifying that a “winner take all” power struggle over AI’s future is itself threatening humanity.
Musk, despite multiple warnings from the judge to avoid the topic, said that “AI will be smarter than any human as soon as next year.”
It’s popular in some circles to claim that concern about extinction from AI is just marketing hype, yet here, under oath and with billions of dollars at stake, the men and women at the heart of commercial AI can’t avoid talking about the risks.
Industry pressure shapes policy
European Union negotiators have tentatively agreed to soften the EU AI Act, Reuters reports. The deal, which still needs approval from the European Parliament, follows industry pressure on both Brussels (the EU’s central institutions) and national governments.
The rules for high-risk AI uses, including requirements for humans in the loop and transparent training processes, were scheduled to kick in this August but have been pushed back to December 2027, and industrial AI use has been exempted from the Act entirely. Politico indicates that Germany, pushed by the politically important companies Siemens and Bosch, was a key negotiator behind the industrial exemption.
Provisions banning nonconsensual sexualized AI images (think Grok ‘nudification’) and requiring watermarks on AI content were not delayed; those rules take effect this December.
Politico calls the changes “the first significant rollback of rules in the digital space,” while Reuters’ Cheng writes they “are still considered the strictest in the world even after the changes.”
Getting the technical components right is also a challenge: despite many attempts to find a robust scheme, watermarks on AI content remain easy to remove, and are therefore unlikely to solve the problems they are meant to address.
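To see why removal is so easy, here is a minimal, hypothetical Python sketch in the spirit of the “green list” text watermarks proposed in the research literature. Everything here is invented for illustration: the toy vocabulary, the hash-based green/red split, and the one-third paraphrase rate. The generator biases its sampling toward a pseudorandom half of the vocabulary, a detector checks how often tokens land on that half, and a crude paraphrase erases much of the signal:

```python
import hashlib
import random

# Toy vocabulary; real schemes operate over a model's full token vocabulary.
VOCAB = ["the", "cat", "sat", "on", "a", "mat", "dog", "rug", "lay", "rested"]

def is_green(prev: str, tok: str) -> bool:
    """Pseudorandomly assign `tok` to a 'green' half of the vocabulary,
    seeded by the preceding token, so the split is secret but reproducible."""
    return hashlib.sha256(f"{prev}|{tok}".encode()).digest()[0] % 2 == 0

def green_fraction(tokens: list[str]) -> float:
    """Detector statistic: the share of tokens that land on their green list.
    Unwatermarked text scores ~0.5; watermarked text scores near 1.0."""
    pairs = list(zip(tokens, tokens[1:]))
    return sum(is_green(p, t) for p, t in pairs) / len(pairs)

# "Generate" watermarked text by always sampling from the green list.
tokens = ["the"]
for _ in range(60):
    greens = [t for t in VOCAB if is_green(tokens[-1], t)] or VOCAB
    tokens.append(random.choice(greens))
print(f"watermarked: {green_fraction(tokens):.2f}")  # ~1.0

# A crude "paraphrase": randomly swap out roughly a third of the tokens.
edited = [random.choice(VOCAB) if random.random() < 0.33 else t for t in tokens]
print(f"paraphrased: {green_fraction(edited):.2f}")  # much lower; heavier edits reach chance (~0.5)
```

Real schemes use longer texts and proper hypothesis tests, but the failure mode carries over: anything that rewrites the surface form, from paraphrasing tools for text to simple re-encoding for images, degrades the detector’s statistic.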
In the U.S., we also see strong pressure on potential regulators, including from industry-funded super PACs that target politicians who show a willingness to regulate. Achieving good policy will require counterpressure from the public.
The analyses and opinions expressed on AI StopWatch reflect the views of the individual contributors and the sources they cover, and should not be taken as official positions of the Machine Intelligence Research Institute.