Window of opportunity
Bellwether op-eds, GPT 5.6 roll-out, AI-assisted policing, and more
Dispatches from Beck
Opinion shifts
In politics, there’s a limit to what someone can say without sounding absurd or extremist, but this limit changes over time. Consider the progression of civil rights, where integration started as a radical proposal before eventually becoming the consensus position. This range of generally acceptable positions is what political scientists call the Overton window, and it’s a useful concept for understanding the progression of discourse. It can also be an excuse for meekness, which is the trap MIRI set out to avoid, hence a book called If Anyone Builds It, Everyone Dies.

But the window is shifting. In a new op-ed in the Washington Post, author and journalist Robert Wright considers “What if Trump is right to pump the brakes on the most advanced AI?” with the subheading “Nationalist fervor over beating China biases AI policy toward recklessness — and possible catastrophe.” In it, he writes about the catastrophic risk from AI, including not just the now widespread concerns about hacking and novel pandemics but also the danger of “alignment faking.” That is when a model in training gives the answers it thinks the trainers want, rather than giving the answer it thinks is correct, so that it can avoid having its current preferences changed. Such concerns have often been excluded from the conversation as being too weird, extreme, or sci-fi, but now alignment faking is the direct subject of both study and editorials.
In more evidence of a growing Overton window, an op-ed by MIRI President Nate Soares ran in The Hill today. Soares makes the case that “AI leaders would like to stop racing. Let’s make that possible.” He notes that AI executives from Jack Clark and Demis Hassabis to Elon Musk have shared that they would prefer a mutual slowdown, if possible. Soares writes:
American leadership can, and must, do what the AI companies can’t do themselves — build an “off switch” that can end their mad rush to superintelligence that threatens us all.
Model regulation?
OpenAI’s new model, GPT-5.6, will first be released to limited, administration-approved organizations, POLITICO reports based on original reporting by The Information. This is due to the model’s “Mythos-like” hacking capabilities. Sam Altman told his staff yesterday that the controlled release was not his “preferred” option, but that OpenAI would be pivoting following White House requests. Details, including when the delays might end, remain uncertain or undecided, as is the case with restrictions on Anthropic’s Mythos and Fable.
Some have criticized the practice, particularly its ad hoc imposition. Dean Ball, former White House AI strategist and recent OpenAI hire, wrote about the regulatory regime following Fable’s export restriction:
“AI is licensed now, but the requirements change constantly and are always a secret, even to the administration itself, which will discover the rules spontaneously in real time as it reacts to events.”
From a safety perspective, the implications are more mixed. The government taking serious action is a positive step. The models can already enable cyberattacks and potentially facilitate the development of bioweapons; to the extent these rules lower those risks, that is legitimately good. The extension of the restrictions from Anthropic to OpenAI also helps soften concerns that the release restrictions would be imposed only on those that the White House views as political adversaries. Such selective enforcement would have undermined the regulatory regime’s efficacy.
To the extent this is a big regulatory mess, it is not a great sign for the near-term economic prospects of these companies. However, from humanity’s perspective, giving up some economic gains for a significant reduction in the risk of human extinction would be a great trade. But that trade only pays off if the restrictions actually reduce the risk.
Restricting external access doesn’t prevent the companies from using their own models internally to develop even smarter models. That’s recursive self-improvement (what my colleague Mitch called the “most dangerous milestone any company could aim for” yesterday), and it is the explicit plan of most of these companies. Restricting deployment might reduce financial incentives in the long run, but even with additional obstacles it seems unlikely that these companies would struggle to secure funds over the medium term.
Time will tell whether the White House’s actions are the first steps in developing sensible policy or random flailing that imposes significant costs for little benefit.
Dispatches from Alana
AI in policing
An article in Stateline today discussed the increasing use of AI in policing, raising concerns about surveillance, bias, inaccuracies, and privacy.
One of the throughlines: “The technology is advancing faster than agencies, regulators and courts are able to fully assess its implications.” That throughline seems to be true of most, if not all, fields where AI adoption is becoming widespread, including medicine, the courts, education, and immigration.
Another pattern is AI’s ability to sift through vast amounts of data very quickly and come up with specific conclusions. This, in essence, destroys de facto privacy protections that exist due to the time and labor it takes to sort and analyze lots and lots of data. In other words, we’ve always had a bunch of unsecured data floating around about us. But now it can easily be collated, sorted, and analyzed.
In the policing case, the article reports that AI is being used for report generation, evidence/data analysis, facial recognition, tracking through license plate readers, and case summaries, among other things. The response, and the extent of adoption, varies by agency, city, and state, with little consistency nationwide. Cautionary measures discussed and sometimes implemented include disclosures when AI is used, requiring a human to verify AI-generated text, and regulating the use of specific technologies (for example, facial recognition).
Even if precautionary measures could keep up with the pace of adoption, I find myself skeptical of things like AI disclosures. They sound great on paper, but how, practically, will they change actions or decisions? Knowing some information was produced by AI might make us less confident in its veracity, but it seems the options are: trust it anyway, verify it, or discard it. Given the nature of many AI tasks (like sifting through massive amounts of data very quickly), human verification is either impossible or impractical. So what will people do with that bit of doubt instilled by an AI disclosure? I’d guess one likely outcome is nothing. Think about it: does the fine-print reminder that “ChatGPT can make mistakes” change the way you use it all that much?
Of course, there is another angle here, similar to the one in a now-dated but still relevant opinion piece on self-driving cars by the New York Times’s Joe Nocera, which argues that human driving is probably worse than machine driving. Humans also make mistakes. Humans also have biases. All this to say: if I’m a suspect and I know I’m innocent, I might actually prefer to be evaluated by an AI than by a police officer.
My preference would definitely change, though, when it comes to “agentic policing,” a hypothetical future practice in which:
body-camera footage, camera networks and other data sources [are integrated] into a single system capable of generating investigative leads, identifying potential suspects or suggesting connections between cases.
As George Washington University law professor Andrew Guthrie Ferguson points out:
All that data is going to be dumped into an AI model, and they’re going to query it to say who’s the most likely suspect ... The AI is going to be running the agentic analysis of it and come up with the answer, and then police and prosecutors have to kind of work backwards to see if it’s accurate ... We’ve never started with an answer and made people work backwards.
I would add that AI hates to answer “I don’t know.” So giving it a directive like “find me a suspect” seems like a very bad idea indeed. It will likely find someone, anyone, and defend its choice with whatever rationalizations and data it can find.
Easier to tear down than repair
An article in the Wall Street Journal puts a cautiously hopeful spin on whether we’ll rise to the challenge of maintaining privacy in the AI age. Its author is Daniel J. Solove, who wrote a book on technology and privacy, and is a professor at George Washington University Law School. Solove highlights the public’s strong desire for privacy protections, along with the suite of privacy laws being passed today, as positive signals.
But the laws passed today, he argues, largely put the onus on the wrong parties, tasking us with managing our own data. Instead, they should hold companies accountable. He argues that companies are good at responding to what regulations and the law demand, and that this can actually spur innovation rather than stifle it.
He’s primarily talking about privacy, but I think it’s worth evaluating some of his points more broadly. In other words, will “holding companies accountable” work to address other societal and safety risks posed by AI?
In his article, Solove emphasizes that we have a pattern of regulating things once we realize they are dangerous:
It took the deaths of many babies [from formaldehyde-sweetened milk] until finally policymakers woke up and enacted regulation. A similar story occurred with cars. They were very dangerous until the law demanded safety.
He’s absolutely right that we typically wait until some small-scale disaster strikes before taking action. But while he views this as a hopeful pattern, I view it less optimistically.
Unlike with other technologies, waiting for the other shoe to drop before taking meaningful action on AI risk could cause widespread death and havoc, if not civilization-ending catastrophe. That is especially true because labs are currently racing to create artificial superintelligence, which, once out of the box, won’t go back in. If we wait until we can viscerally feel the dangers of superintelligence before we ban or regulate it, it will most likely be too late.
What about his point on regulation spurring innovation, rather than stifling it? He writes:
I often hear gripes by defenders of tech companies that regulation will stifle innovation. But when the law required greater car safety, companies innovated to create technologies to make cars safer. Seat belts and air bags are innovations just as much as faster engines.
I love this point. But while it may work for privacy (which seems like a solvable, if difficult, problem), I don’t think standard company innovation and existing regulatory methods (aside from a ban on frontier technology) could effectively mitigate a wider set of safety risks. Developers are working with black box systems that nobody understands, and innovation hasn’t been able to solve many of the problems that have already cropped up (for example, sycophancy leading to user psychosis still persists despite developers’ best efforts to address it). It’s far easier to tear down than to repair.
Dispatch from Donald
OpenAI considers delaying IPO
The New York Times’s Rob Copeland and Mike Isaac report that OpenAI may push back its initial public offering (IPO) until 2027. This follows the poor post-IPO performance of SpaceX, whose stock rose from an opening of $135 per share to $202 before dropping to $153. OpenAI’s CEO, Sam Altman, is reportedly holding back because he wants the company to begin its IPO at a public valuation of $1 trillion. He is concerned that the market may be less bullish after SpaceX.
This is arguably a return to the original plan. Late last year, OpenAI’s chief financial officer, Sarah Friar, denied plans for an IPO, saying the company was instead focused on its finances. Since that time, OpenAI has continued to spend heavily on data centers, computing power, marketing, and talent acquisition. It is also trying to increase revenue, hoping to triple last year’s roughly $13 billion with strategies like letting people buy things from inside ChatGPT.
But claims that the AI bubble is about to pop — and that the frontier labs are about to wither on the vine as a result — are premature. Elon Musk might have been a trillionaire for less than two weeks, but he remains the richest person in the world. OpenAI’s previous private valuation of $730 billion is still monstrously vast. Altman is merely quibbling over whether his company’s initial public valuation should be the size of Michigan’s 2025 GDP or the size of Michigan plus Oklahoma.
The analyses and opinions expressed on AI StopWatch reflect the views of the individual contributors and the sources they cover, and should not be taken as official positions of the Machine Intelligence Research Institute.




