Tomorrow, and tomorrow, and tomorrow
Changes to AI StopWatch, survey of Hill staffers, AI superforecasters
In this issue:
Changes to AI StopWatch - New Dispatches section, changes to our Notes, and more
Next year is also in the next 250 years - Congressional staffers rank “loss of control over AI” as a top danger to the United States
Tomorrow isn’t in the training data - AI can now predict the future nearly as well as the best human forecasters
Dispatches from Mitch
Changes to AI StopWatch
New Dispatches section, changes to our Notes, and more
AI StopWatch is an experiment, and we’re still fiddling with it.
We’ve found that it’s sometimes awkward to share StopWatch content because individual dispatches don’t always want the rest of a Daily Digest tagging along. We were using Substack’s Notes feature for standalone posting, but this has been an awkward fit.
So starting today, we’re introducing a new Dispatches section. This is where each of our posts will first appear, as soon as we write them. Subscribe to the new section only if you want a separate email for each new dispatch hot off the wire. Don’t subscribe to both Dispatches and the Daily Digest unless you don’t mind a little duplication in your inbox.
We’re also going to start using the Notes feature more as other Substacks typically do: to say a few words about each new post and link to its standalone page, for easier sharing.
And finally, in response to reader feedback, we’re adding a table of contents to the top of each Daily Digest, with descriptions and internal links to aid skimming.
Let me express my thanks to everyone who has provided feedback so far. We hope you’ll continue telling us how we can better help you have the conversations you want to have about where the AI race is taking us.
Next year is also in the next 250 years
Congressional staffers rank “loss of control over AI” as a top danger to the United States
Thanks to a tweet by Tim Schnabel, who provided additional numbers, I found Punchbowl’s reporting on a survey question they asked senior congressional staffers. The question:
Over the next 250 years, what do you believe will be the most important challenge for America to overcome?
Respondents were asked to select three responses out of 18 listed. The third-most chosen concern, selected by 35% of staffers, was “losing control of AI.” This put it ahead of war, fertility decline, and even geopolitical competition.
Their two top picks?
“National debt and federal deficit” (51%)
“Political polarization” (45%)
Unlike that #1 pick, however, losing control of AI was a strongly bipartisan concern, selected by 40% of Democrats and 31% of Republicans.
I find this really encouraging. Members of Congress commonly rely on their staffers, who are typically younger, to research emerging issues and suggest policy positions. So a survey like this can be a leading indicator of congressional evolution.
I like to think this is a sign that ongoing efforts by MIRI and others to brief the Hill are bearing fruit.
I just hope respondents didn’t fixate on the “next 250 years” clause in the question and pick the AI problem because it feels Far Out. As MIRI’s briefings emphasize, AI capabilities are advancing very quickly, and we can no longer be sure the extinction problem can be put off even to the next administration. Next year is also in the next 250 years.
In fact, I’m a lot more worried about 2027 (and 2028, and 2029) than I am about 2130 or 2276. If we can survive the next few decades, I think we’re in for an amazing couple of centuries.
Dispatch from Robert
Tomorrow isn’t in the training data
AI can now predict the future nearly as well as the best human forecasters
There are so many opaque AI benchmarks that it can sometimes be hard to see whether AI capabilities are still improving. Today Scott Alexander discussed a specific type of AI capability that’s actually quite easy to assess.
He discusses AI superforecasters. Superforecasters are traditionally people who are exceptionally good at predicting future events. These can include, for example, wars, stock market trends, or even elections. They don’t do this by looking into a crystal ball, but by making well-informed judgments based on expertise, background knowledge, probability models, and calculations.
AI superforecasters are frontier models that perform the same work, using deep search and analysis in structured software environments specifically tailored to support them, which are called “scaffolds.” The difference is that they do it much faster and more cost-effectively. At the most recent Metaculus Cup, a well-known competition for forecasters, humans took first and second place, but the AI superforecaster built by the start-up Preseen took third place. If development continues as the trend suggests, AI superforecasters will leave even the best human counterparts behind in about half a year. Incidentally, the Metaculus forecasters themselves estimate a 95% probability that this will happen before 2030.
Of course, the commercial potential is immense. According to Alexander, Preseen itself claims that its model turned just $35 into a whopping $2 million on the Kalshi prediction market in only seven months. Another company called FutureSearch even claims to significantly outperform the stock market. I’m always quite skeptical of such claims, because with a bit of luck and survivorship bias, even utter nonsense can look convincing at first glance. I find the experiment Alexander conducted more convincing: He paid $8 to have the FutureSearch model calculate the probability of a future event, and in just 5 minutes it spat out a result along with a list of the 212 sources it used. He also gave the same task to the Preseen model and a human superforecaster. All three arrived at nearly the same result. Not bad for a couple of “stochastic parrots,” right?
The “stochastic parrot” label (that is, the notion that LLMs essentially just remix what’s already in their training data and then regurgitate it) is hard to reconcile with results like these. Tomorrow isn’t in the training data. Of course, it’s possible that certain basic assumptions about probabilities and causal relationships were part of the training data. But analyzing and predicting new events on that basis, and nearly matching the best humans in the process, is precisely not “parroting.” We generally call that “judgment.” And the benchmark in this case isn’t just some test that someone created and for which the AI can potentially be prepared, but reality itself.
So how much longer will it take for the capabilities of AI superforecasters to eclipse those of all humans? As mentioned, human superforecasters believe it won’t be long. Skeptics of superhuman AI capabilities see it differently. Alexander cites an article on the topic by the computer scientists Arvind Narayanan and Sayash Kapoor, authors of the book AI Snake Oil, in which they argue that human capabilities in some areas represent the limit of what AIs can achieve:
Specifically, we propose two such areas: forecasting and persuasion. We predict that AI will not be able to meaningfully outperform trained humans (particularly teams of humans and especially if augmented with simple automated tools) at forecasting geopolitical events (such as elections). We make the same prediction for the task of persuading people to act against their own self-interest.
Just two weeks ago, we reported that frontier AI models can now out-persuade even expert human persuaders. That may not yet meet the strict criteria set by Narayanan and Kapoor, but you don’t need to be a superforecaster to suspect that it won’t be long before that changes, too. In any case, it’s a testable prediction by the two skeptics that’s currently being seriously shaken on both fronts.
Alexander thinks that superforecasting could become a kind of “opinion layer” of AI. That would enable far more people to make the best-informed decisions even in cases where no one would shell out for a human superforecaster.
But I would actually go one step further. In the online resources for If Anyone Builds It, Everyone Dies, Eliezer Yudkowsky and Nate Soares explain the danger of creating something that is better at predicting the world and steering it than humans are. Superforecasting is the prediction half of that pairing. And if superprediction and superpersuasion are already drawing nearer, how far away are superhuman research and the resulting threat of an intelligence explosion through smarter and smarter AIs building their own successors?
We’re not there yet, and we still have a chance to hit the brakes. But every new improvement increases AI’s ability to devise and implement plans — including plans nobody asked for.
The analyses and opinions expressed on AI StopWatch reflect the views of the individual contributors and the sources they cover, and should not be taken as official positions of the Machine Intelligence Research Institute.





