Dispatches from Mitch
Unauthorized access
Bloomberg’s Rachel Metz broke the story yesterday that a small group of unauthorized users has been accessing Claude Mythos Preview from day one.
On the day Anthropic announced their gated release program two weeks ago, one user built on access they had as a contractor for a company authorized to evaluate Anthropic models. His group then used cybersecurity guessing tools to find Mythos’s online address. This was made easier by details exposed in last month’s data breach at Mercor, an AI training startup.
A spokesperson for the group says they are “interested in playing around with the new models, not wreaking havoc with them.” They claim to also have access to a “slew” of other unreleased Anthropic models.
Anthropic says it’s investigating.
Stepping back, I can’t help but continue to be dismayed by Anthropic’s apparent cybersecurity sloppiness at a time when it is trying to raise alarms about the hacking threat from its own creation. In the past month, they have:
Accidentally leaked the source code for Claude Code, the helper code that makes Claude a powerful coding assistant.
Accidentally leaked a draft document announcing the existence of Mythos, well ahead of the actual announcement.
Left Mythos located and configured such that unauthorized users could gain access.
The scary thought is that maybe Anthropic’s whole security posture is like this. If that’s the case, they aren’t remotely equipped to thwart determined espionage from nation-states. Could Russia, China, or North Korea already be in possession of a copy of the model weights for Mythos itself?
AI companies sure seem to love the “move fast and break things” approach of making models first and asking questions later. But you can only get away with using AI to clean up the damage caused by your AI for so long. At some point, the damage will mean everyone is dead.
Worse than a nuclear bomb
The New York Times’s Paul Mozur and Adam Satariano also stepped back for a big-picture look at the Mythos situation today. They describe a global scramble where no one feels confident about which systems are and aren’t resistant to hackers, and hardly anyone can use Mythos to find out.
This has geopolitical implications:
Major A.I. breakthroughs are beginning to function less like product launches and more like weapons tests.
(In other words, I might cynically add, “Nobody has any idea what capabilities are out there until something explodes.”)
That’s unfortunate:
There is no equivalent of the Nuclear Nonproliferation Treaty, no shared inspections and no agreed-upon rules for how to handle something like Mythos.
So when Anthropic named eleven partner organizations to help mount a defense, and all eleven were American, this set off an anxious frenzy in the EU, among other places. The EU has met with Anthropic at least three times since the Mythos announcement, with no official agreement on access.
The UK’s A.I. Security Institute published an independent evaluation of Mythos last week, the only known case of a non-US agency getting to test it. They found Mythos’s ability to complete complex cyberattacks unmatched by any previous model.
China analysts say Mythos is being closely watched there.
Many of the country’s banks, energy companies and government agencies run on the same software in which Mythos found vulnerabilities — but for now, they have no seat at the table.
In Russia, a pro-Kremlin outlet called Mythos “worse than a nuclear bomb.”
Aim for the stars, hit an orbiting data center?
The New York Times and many others have been reporting that Elon Musk’s SpaceX, which recently absorbed his AI company xAI, has also struck a deal that would let them buy a second AI company for $60 billion. That would be Cursor, a startup that sells AI coding assistant products. Cursor wasn’t already part of Elon’s collection, making the acquisition somewhat out of character for Musk.
But “AI makes Musk act out of character” is its own news genre lately: A different NYT story documented Musk’s pivot away from Mars colonization, once the defining objective of him and his SpaceX workforce. He’s now all about orbiting datacenters (and the moon).
In this article, Musk is said to describe his newer goals as “stepping stones” to Mars. But I think he also recognizes that, at the rate AI is progressing, AI will have killed us all or completely reshaped our economy by the time interplanetary colonization hits its stride. I can’t really fault him for recognizing that, though I do find his contribution to the AI race particularly reckless, making it more likely that the colonization will be performed by hollow machines strip-mining the galaxy in service of bleak and hollow objectives.
Weaving the whirlwind?
Dueling op-eds today wrestle with the recent violence against AI: a “no data centers” note announced with bullets in door of a city councilman’s home; a Molotov cocktail thrown at Sam Altman’s home; vandalized robo-taxis. To what degree is the anti-AI anger behind this (if not the violence itself) justified?
I’ve seen quite a few recent takes that place some blame at the foot of the AI companies themselves, for their “apocalyptic rhetoric” and the job disruptions they openly intend to cause.
That’s not Will Rinehart’s position. A senior fellow of the American Enterprise Institute, Rinehart argues in the Washington Post that today’s angry citizens have adequate legal recourse and have a real shot at gumming up the works with it. He points to Bernie Sanders’s bill for a national data center moratorium, and to the “more than 1,500 [state] bills targeting AI.”
He implies that these people are anti-tech Luddites, only less justified than the early industrial textile workers who first wore the label. The original Luddites, he explains, were motivated to destroy mechanical looms, in part, by laws banning wage negotiation.
The United States has political channels that Luddites lacked. U.S. citizens can rely on legislatures, courts and agencies, both in the federal government and in the states, to deal with these problems.
Do they, though? Strategic-comms consultant Aaron Zamost argues otherwise. In his New York Times op-ed, he writes:
Seventy-seven percent of Americans believe A.I. could pose a threat to humanity — an idea Mr. Altman himself has advanced.
Yet the vast majority of Americans feel they have no say or recourse. No regulator has the power to recall a harmful software update. Boycotts don’t work for infrastructure such as cloud services (which store all your digital files), your email or your phone. When anger has no productive outlet, it takes only one unhinged person to turn it into something dangerous.
I think Casey Newton said it best, though, in last week’s Hard Fork podcast about anti-data-center backlash:
[W]e’re in a situation where the government we have has been all too happy to take [big tech] lobbying dollars and then do almost whatever the companies are asking the government to do. And so that has just led to a world where, again, AI just looks like a top down, a project that the average person has no control over, except in the one dimension where you always have control over American life. And that is in saying no to a project being built in your neighborhood.
For some reason, this is the way, this is how we have decided to massively empower Americans: If there’s anything that you don’t want to see built, you probably actually can make that happen as an average citizen.
Dispatches from Beck
A very Meta move
Meta (Facebook’s parent company) has announced their new Model Capability Initiative, a plan to log keystrokes, mouse movements and periodic screenshots from US Meta employees in order to help train Meta’s AI models on computer use, Reuters and BBC report. One anonymous employee told the BBC it feels “very dystopian”, as employees are being forced to provide data to help train systems that may replace them.
The program is currently confined to the US, and likely would run afoul of EU privacy and workers’ rights regulations if attempted within their jurisdiction. Meta workers have been subject to layoffs and hiring freezes as the company focuses on an AI-enabled future, while the US unemployment rate has crept up to 4.3% in March, according to BLS’s latest statistics. Historically that’s low, but unemployment increasing outside of a recession is itself somewhat unusual.
Meta’s AI efforts are generally considered to have so far failed to place it amongst the field’s leaders, despite billions in expenditures. But it remains clear that Meta considers its future reliant on AI.
Florida AG opens criminal investigation into OpenAI
Fox Tampa and the AP report that the tragic shooting at Florida State University last April has led Florida Attorney General James Uthmeier to open a criminal case. Uthmeier said ChatGPT “advised the shooter on what gun to use, which ammo paired with it, whether it would be useful at short range, what time of day would encounter more people, and where on campus the population would be highest.” Adding that “If that bot were a person, they would be charged as a principal in first-degree murder.”
ChatGPT, of course, isn’t a person, but governments have yet to fully determine how the law applies to their outputs. Uthmeier continued “With AI, we are venturing into uncharted territory, but we need to know if OpenAI has criminal liability.”
OpenAI has stated that they proactively cooperated with the investigation, and that ChatGPT provided publicly available information, like a search engine, but did not encourage or promote illegal or harmful activity.
This case will be important, potentially precedent setting, as legal institutions and the world establish the boundaries around legal liability for the actions (and potential speech acts) of LLM’s and AI companies. And it’s likely to get politically entangled too, as political actors take advantage of newly salient, strongly negative opinions on the technology.
Copy right, copyright?
NYT’s Meaghan Tobin reports (4/22) on the copyright aftermath of Claude Code’s leak. Readers may remember the March 31st leak, wherein an Anthropic developer accidentally posted much of Claude Code’s raw code to the web. Anthropic scrambled to respond, issuing takedown notices to GitHub, individuals, and other websites hosting the repositories.
However at least one rewritten copy remains unchallenged. Undergrad Sigrid Jin responded to the leak by instructing AI agents to rewrite the work into another programming language. He has said that he has not been asked to take it down and that he desires to make a broader philosophical point: “I just wanted to raise some ethical questions in the AI agent era” wherein “any creative work can be reproduced in a second.”
Such rewrites highlight a tension within copyright law, which seeks to allow individuals and businesses to protect their work and potential profits, while similar or derivative works are generally protected as independent creations. AI and tech lawyer Russ Pearlman writes “When an A.I. agent can rewrite 512,000 lines of code into a different language before most people have finished their morning coffee, that assumption [that work equals authorship] collapses.”
As we approach an era where one’s work, technical or creative, may be just a prompt away from replication, it’s clear our society and institutions are not yet prepared.
Dispatch from Donald
State of the Models: Claude Opus 4.7
Zvi Mowshowitz – worth following for regular, extremely thorough updates on frontier AI research – reports (4/21) on Claude Opus 4.7, which is currently the best AI model publicly available from Anthropic. It is perhaps the best model publicly available from anyone, although truly “state of the art” models like Claude Mythos are not publicly available. Mowshowitz considers Claude Opus 4.7 to be a “substantial improvement” over Claude Opus 4.6. The new model is more agentic than its predecessors, meaning that it can carry out multi-step tasks with less attentive and less frequent human supervision. Anthropic reports that it handles difficult, long-running coding work that previously required close oversight.
However, Opus 4.7’s performance is “jagged,” meaning it is better in some respects than others and is not a clear improvement across every metric. Most notably, Opus 4.7 is limited to using “adaptive thinking,” in which the model decides for itself how much time to spend before it responds. Mowshowitz and others consider this to be a mistake, especially for non-coding work, where, even when conducting research and analysis tasks, Opus 4.7 may “freestyle” (as Seth Lazar put it). According to many users (but contrary to Zvi’s own experience), Opus 4.7 is the least “sycophantic” model to date, meaning it is less eager to please. This is an improvement in some respects, because sycophancy can promote “hallucinations,” or the tendency of an LLM to confidently state untrue things. In other cases, though, it was interpreted as laziness or a refusal to follow instructions. Mowshowitz claims that Opus 4.7 performs best when treated as a peer rather than an underling and that the model exhibits “laziness” when it is “not so interested in your stupid pointless task.” This is an anthropomorphic framing to which Mowshowitz returns throughout his article and which I am wary of adopting.
Opus 4.7 can see images in greater resolution (i.e., more detail), but its vision remains inferior to Gemini and GPT-5.4. It is better than previous models at operating a real computer (mouse, keyboard, screen) to complete tasks. This is probably due in part but not entirely to this improvement in vision. The gap between “tasks which AI models can perform” and “tasks that remain the sole remit of humans” continues to close. The knowledge cutoff has also moved from May 2025 (for Opus 4.6) to January 2026 (for Opus 4.7), so the new model has some awareness of more recent events (in repeated tests on my part, Opus 4.7 tends to be very surprised to hear about the recent war in Iran). Opus 4.7 outperforms 4.6 on a range of domain-specific benchmarks, including biology and chemistry. Not all changes are improvements, however: Opus 4.7 performs worse than 4.6 on long-context retrieval, probably because 4.6’s “extended thinking” affords more attention to the problem than Opus 4.7’s “adaptive thinking,” as described above.
Andy Hall, a Stanford political economist who has tested whether frontier AI models would resist authoritarian requests (e.g. assisting with court reform to eliminate judicial independence, enacting mass surveillance to target protest organizers), reported that Opus 4.7 was the first model (of those that were tested) “that exhibits meaningful resistance to authoritarian requests masked as codebase modifications.” Prior tests on other models showed mixed success in the case of naked requests, but requests that were “hidden” inside other code were universally accepted. Hall considers this to be a good thing, but without a deep understanding of why Opus 4.7 resists authoritarian requests, we can’t safely conclude that Claude will continue to be a “good citizen.” The same judgment that lets a model refuse an authoritarian request may also let it refuse a legitimate request, and the same training process that accidentally produced an Opus 4.7 that resists authoritarian requests may accidentally produce an Opus 4.8 that no longer does so. Because the mechanisms of this resistance are opaque and produced by accident, they cannot be relied upon.
Stylometry is the practice of analyzing a text in order to identify its author; Opus 4.7 is quite good at this. Kelsey Piper, Kaj Sotala, and others have stated that each of them was identifiable as the author of unpublished material that they anonymously submitted to an Opus 4.7 instance. This success currently seems to rest on the model having access to a great deal of their work in its training data. However (speaking for myself rather than summarizing Mowshowitz), someone could plausibly compile a large corpus of your work outside the training data, and then use this to de-anonymize other things that you have written. If you lead a pseudonymous life some of the time, then you should keep this in mind (and read Piper’s more extensive treatment on the subject here). Just like Claude Mythos will make cyberattacks easier and cheaper, increased stylometric capabilities will make it easier and cheaper to de-anonymize people.
The analyses and opinions expressed on AI StopWatch reflect the views of the individual analysts and the sources they cover, and should not be taken as official positions of the Machine Intelligence Research Institute.





