Only as strong as your resolve

Fable policy, China-based influence, Anthropic economic proposal, and more

Alex Beck, Mitchell Howe, and Alana Horowitz Friedman

Jun 12, 2026

Dispatches from Beck

Fable unable

My colleagues Mitch and Joe have covered Fable, the just-released Mythos-class model. There have already been changes, as WIRED and the Wall Street Journal report.

A cat with a shepherd's crook and a bag over his shoulder guards six geese and a nest of eggs. Scene from an Ancient Egyptian fable, c. 1120 BC, from the Cairo Museum.

At release, Fable had a number of mechanisms to prevent misuse, called guardrails, including automatically and explicitly switching to a weaker model (Opus 4.8) on biological or cyber topics. Anthropic also announced that it would reduce the quality for AI-related queries without informing users in the app when this happened.

Users were not happy. They reported false positives on biological questions, like getting kicked from Fable for the query “tell me about mitochondria,” and expressed outrage over the handling of AI-related queries, which they called “silent degradation” and “secret sabotage.”

Anthropic has since announced that it will remove the biology limitations for academics and professionals in life sciences, though details remain unclear.

And for AI, Anthropic told WIRED:

“We’re changing Fable 5’s safeguards for frontier LLM development to make them visible [...] We made the wrong trade-off and we apologize for not getting the balance right.”

I’m sympathetic to Anthropic here and worried about the rollback of guardrails. Mythos-class models have dangerous capabilities, and nearly every prior model has had jailbreaks (the name for prompting strategies that undo guardrails, covered by my colleague Joe here, and with claimed example jailbreaks for Fable available on X).

Providing strong protections that avoid known failures, even with some false positives, is wise when you can’t tolerate false negatives. The benefit of empowering many researchers is outweighed by the cost of just one pandemic.

Your safety policies can only be as strong as your resolve to stick with them.

Who’s influencing whom?

Politico and Axios report that OpenAI has identified two China-based influence campaigns that had been using ChatGPT. The campaigns used the AI model to generate prose and cartoons for social media, one targeting data centers’ impact on electricity prices, and another targeting tariff debates. The cartoons’ prompt explicitly excluded Chinese President Xi, a soft indication that these actors sought to avoid harming Chinese interests. Those accounts have since been banned.

Some have asserted that China is the primary source of American anti-AI sentiment (see my colleague Alana’s coverage of such claims here), but this reporting doesn’t support so broad a claim. Ben Nimmo, OpenAI’s principal investigator, said:

Neither campaign appears to have gained much authentic engagement [...] This was not a case of an influence operation creating a debate [...] This was an influence operation from China trying to interfere in it.

Given the limited information in OpenAI’s report, whether these attempts originated from within the CCP or were merely based in China is unlikely to ever become settled fact.

Regardless of this particular point, one should realize that influence campaigns occur. Most of them (like the fake “doomers” my colleague Mitch covered) don’t succeed by convincing humans of some false fact, but by poisoning the informational commons. When you know or suspect that the ‘other side’ includes bad-faith foreign actors, it becomes easy to dismiss a whole side of a debate. Being easy, of course, doesn’t make it correct.

Where the money leads

John O’Farrell, a former partner at Andreessen Horowitz, published a New York Times opinion piece critical of Leading the Future, the political group he helped create. Andreessen Horowitz, often shortened to a16z, is a major venture capital firm, bankrolling much of Silicon Valley, including co-leading OpenAI’s most recent $122 billion investment round. O’Farrell confirms much of my colleague Mitch’s reporting on Leading the Future’s aggressive and antidemocratic tactics.

O’Farrell says that Leading the Future acts “to intimidate politicians who appear to engage too aggressively with the question of how to govern A.I. [...] The message to every other legislator seems clear: Touch A.I. regulation, and we will come for you, too.”

I’m glad he’s taken on the burden of both calling out his former colleagues and waking the world to AI’s transformative nature. He asks that funds go not to “distorting our electoral process” but to addressing the many needs implied by AI technology, from “biological risks we’re not prepared for” to:

How to share the economic gains broadly, how to address job displacement, how to preserve the dignity of work and how to build safety frameworks that keep pace with the technology itself. It could champion international cooperation on A.I. risk.

I don’t agree on every detail, but I am glad to hear him demand we address this issue with the gravity it deserves, and not with the basest politics it incentivizes.

Dispatches from Mitch

Data centers on federal land

If you want to talk about conflict over a data center project, it’s important to pay attention to whose land it is, and what kind.

My excuse to mention this is a Reuters report that OpenAI is in talks to build and lease an enormous 10-gigawatt facility on Department of Energy land in Ohio. That’s about twice New York City’s average electricity demand, which is interesting in its own right. But it got me looking into how data center backlash is playing out when proposed sites are on public vs. private land.

It’s complicated. The short answer is that projects on federal land often face more onerous environmental approval pipelines, but a friendly White House has levers to ease these when it wants to. And federal siting, especially on Department of Energy land, puts projects beyond reach of county and local zoning boards, where a lot of data center projects get challenged.

Those aren’t the only obstacles to building a data center, of course. All that power still has to come from somewhere. And there’s a lot of money chasing a limited pool of chips, electrical components, and construction workers.

So building a data center on federal land isn’t exactly a cheat code. But if the President wants your project to happen, it might be the next best thing. This may help explain the recent eagerness of most AI companies to embrace Trump’s talk of the government taking substantial stakes in them; more on the latest about this next.

Trump still interested in taking AI company stakes; Meta not on board

President Trump had previously talked about meeting with AI company leaders this week. That didn’t happen, but Trump insists he still wants to discuss the government taking stakes in their companies.

Soon, he says, he’ll meet with “12 or 15 executives” to talk about “giving back something to the public.” He went on to say, “If we do that, the public will become very rich.”

I’ve previously speculated on why this idea might backfire with the large fraction of the public, including in Trump’s base, that hates AI. They want the government to regulate it, not entangle itself with it. They could be hard to buy off.

In this week of the meeting that didn’t happen, Anthropic has reiterated its willingness to participate in stake sharing, via a white paper my colleague Alana is reporting on today.

Microsoft CEO Satya Nadella said yesterday that he wasn’t opposed, either.

Meta, on the other hand...

Well, let’s just say the Facebook parent company, for all its faults, is no lemming. The company, seen as lagging behind its AI rivals despite spending exorbitant sums to leapfrog them, has now actively deflected suggestions of giving the government a “sizable equity stake.” In response to a question using that phrase, Meta’s global-affairs chief Joel Kaplan said they mostly talk to the White House about policies the government can enact to “enable the AI revolution to take place, and for us to win the battle with China.”

This is lobbyist speak for “No.”

News of this encounter came courtesy of POLITICO, which added that Kaplan says Meta thinks “the right thing is for these companies to make the investment they’re making. We’re all raising capital and [...] investing in the communities where we’re building these data centers to make sure they benefit from the data center investment.”

I’m pretty sure this is lobbyist speak for “Stay out of our way; we’re trying to make money here.”

Dispatch from Alana

Anthropic talks pause but proposes AI dependence

Anthropic is giving me whiplash. After calling for a global pause last week, the company seems to be doing the opposite.

Perhaps the company feels they must keep racing until a pause happens. (While not particularly noble, there’s a logic to this if you genuinely believe you’re helping make the technology safer than it would be otherwise.) Still, it’s hard for me to avoid a cynical reading of their current playbook, which would look something like this:

How to ensure the tech you’re building keeps getting built, even if you publicly call for a pause

Keep releasing very powerful models, with no sign of stopping ✅

(Fable, the public version of Mythos, was released on June 9th; see our coverage here and here.)

Muddy your stated values by filing for an IPO, which will subject you to investor interests ✅

(Anthropic officially filed for an IPO on June 1st. If it indeed goes public, it will essentially be tying its own hands to the pursuit of short-term profits over long-term societal good.)

Propose an economy dependent on AI growth, framing this as a solution for AI disruption ✅

That last step will be the subject of my dispatch today.

Yesterday, Anthropic released an Economic Policy Framework that acknowledges the labor disruption AI is likely to cause. It suggests tailored interventions for three possible tiers of disruption, with Tier 3 requiring “sustained income replacement for a large share of the workforce.” Most of the interventions across the other two tiers fall into one of two buckets: job transition support or direct financial assistance.

In Tier 1, defined as around 5% unemployment, the main intervention is to give everyone an investment account at birth. The company suggests gradually expanding eligibility, first to young adults and those impacted by AI job loss, and eventually to every American.

My cynicism gets triggered when the company proposes equity in AI companies as one of the ways to fund these accounts:

Policymakers should expand the mechanisms by which these accounts can be funded, including with equity in AI companies, so that beneficiaries share directly in nearer-term gains from AI-driven growth.

The charitable reading is “spread the wealth around.” My cynical take? This essentially forces more people into becoming stakeholders in AI’s success. And that’s certainly one way to shift low public opinion (which could be leveraged to demand a pause) away from “Americans hate AI more than ICE” levels. When your personal financial security is tied to the growth of the technology, you might be hesitant to support anything hindering that growth.

Anthropic writes:

We propose it in Tier 1 [the lowest level of disruption] because, for the intervention to matter, it must start before disruption is visible. Accounts compound; the earlier they are seeded, the more they are worth when they are needed.

Channeling the comedy duo Key & Peele’s famous “anger translator” sketches, I can’t help but read this as, “When it becomes clear that we really need to rein in this technology, people’s accounts will be at exactly the point where they’ll want to keep advancing AI, so as not to risk losing a lot of money.”

The revenue sources proposed to potentially fund the financial assistance suggested in Tier 3 are also heavily linked to AI:

Potential revenue sources could include increasing the capital gains tax, broad-based consumption taxes, sector-specific levies on AI use (measured by tokens, compute, or revenue), and scalable “digital dividends” funded by taxes on the digital sector.

Potential redistribution mechanisms could include universal basic income, AI sovereign wealth funds funded by investment stakes in AI-driven productivity, equity-sharing mechanisms giving workers partial ownership in AI enterprises, and dramatically expanded pre-distributive capital accounts building on existing models.

To be fair, finding money to feed people in a collapsed labor market is a hard problem, and there aren’t many revenue sources besides AI. But the fact remains that the company is proposing to link economic health to its biggest disruptor. And doing this would make it much harder to regulate, rein in, or pause the technology in any way.

Even if we take the Economic Policy Framework in good faith, and not as strategic positioning, there are two glaring issues:

1. As mentioned earlier, the proposal rests heavily on the government providing financial assistance to Americans. Even with the proposed “additional revenue” sources, I’d guess it will be hard to find enough funds to completely replace the worker economy, especially in a scenario Anthropic describes as:

past the edge of the maps that policymakers and economists have historically used to navigate: unemployment at levels never before sustained alongside an economy generating record output. The search for work stretches past a year, then past two, and for some, eventually stops. Savings built over a working life are drawn down; rent is paid late, then not at all. The link Americans have long taken for granted—between contributing to the economy and sharing in its rewards—is strained or broken.

2. Anthropic’s framework, in addition to direct financial assistance, proposes many policies related to job transitions: moving people to “opportunities and industries where meaningful work will be.” But will those even exist? If they do, will they have enough openings? If robotics takes off, even blue-collar jobs don’t seem particularly safe. Work in AI is being automated. Where are all the jobs we’ll be moving people to?

(There is one small footnote on the sentence “Some of the roles that may prove more resistant to AI substitution are also where the US faces well-documented, persistent workforce shortages” that links to a paper about teacher shortages. This is fairly disheartening in a world where education will likely be dramatically reshaped by AI, and doesn’t seem like a very convincing alternative for displaced workers.)

This may be an unpopular take, but I often find myself angrier at Anthropic than at OpenAI. It’s how I feel about someone who knows they’re wreaking havoc but continues to do it vs. someone who is so far past moral reflection they are perhaps unaware of their errors. (The first represents Anthropic; the second OpenAI.)

“But if Anthropic stops, others will continue,” you protest. And that’s true. Compared to OpenAI and Meta, Anthropic does look like “the good one.” Still, I wish the company would better embody the virtues it professes. Calling for a pause was commendable. But following that with the release of Claude Fable, the pursuit of an IPO, and a policy proposal that aims to make the economy dependent on AI infuriates me. If they’re serious about pausing, those aren’t the next steps to take.

The analyses and opinions expressed on AI StopWatch reflect the views of the individual contributors and the sources they cover, and should not be taken as official positions of the Machine Intelligence Research Institute.
