Frontier Models

Anthropic's Jack Clark: automated AI R&D likely by 2028

In a striking essay published on 4 May, Anthropic's Jack Clark put a 60%+ probability on AI systems training their own successors by the end of 2028.

Eleanor Hartley in San Francisco

Published May 4, 2026, 15:00 | Updated 17:47

Jack Clark, co-founder and head of policy at Anthropic, has given the clearest signal yet that someone close to frontier AI regards fully automated AI research as a near-term prospect. In the latest issue of his Import AI newsletter, he put a probability above 60% on an AI system autonomously training its own successor by the end of 2028.

Clark called the conclusion "reluctant" and said the implications were so large he felt "dwarfed" by them. "I'm not sure society is ready for the kinds of changes implied by achieving automated AI R&D," he wrote in Issue 455 of the newsletter, published 4 May.

The piece matters less for its arithmetic (Clark's headline 60% is a gut estimate rather than the output of a model) than for the convergence of public evidence he marshals, and for the unusual register in which a frontier-lab principal chose to frame it.

The mosaic case

Clark's argument is mosaic-shaped, and he says so explicitly. No single AI benchmark would carry the conclusion on its own, he writes, because "all benchmarks have some idiosyncratic flaws". The case rests instead on several capability curves bending the same way at the same time.

The most striking is on coding. SWE-Bench, which tests whether AI systems can resolve real-world GitHub issues, has gone from a 2% solve rate with Claude 2 in late 2023 to 93.9% with Anthropic's Claude Mythos Preview, which effectively saturates the benchmark. METR's time-horizon measure, which tracks the length of task an AI system can complete independently with 50% reliability, has risen from roughly 30 seconds (GPT-3.5, 2022) to about 12 hours (Claude Opus 4.6, 2026). METR's Ajeya Cotra has suggested it could reach 100 hours by the end of this year, a forecast Clark cites.
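As a rough check on the METR figures above, the jump from about 30 seconds to about 12 hours can be converted into a number of doublings and an implied doubling time. This is a back-of-envelope sketch, not METR's methodology: the four-year interval is an assumption (the article gives only the years 2022 and 2026), steady exponential growth is assumed, and the function names are illustrative.

```python
import math


def doublings(start_seconds: float, end_seconds: float) -> float:
    """Number of times the task horizon doubled between two measurements."""
    return math.log2(end_seconds / start_seconds)


def doubling_time_months(start_seconds: float, end_seconds: float,
                         elapsed_months: float) -> float:
    """Average months per doubling, assuming steady exponential growth."""
    return elapsed_months / doublings(start_seconds, end_seconds)


START = 30            # roughly 30 seconds (GPT-3.5, 2022)
END = 12 * 3600       # roughly 12 hours (Claude Opus 4.6, 2026)
ELAPSED_MONTHS = 48   # ASSUMPTION: about four years between the two points

if __name__ == "__main__":
    print(f"doublings: {doublings(START, END):.2f}")
    print(f"months per doubling: "
          f"{doubling_time_months(START, END, ELAPSED_MONTHS):.1f}")
    # Further doublings needed to get from 12 hours to the cited 100 hours.
    print(f"doublings from 12h to 100h: {doublings(END, 100 * 3600):.2f}")
```

Under these assumptions the horizon doubled about 10.5 times, or roughly once every four and a half months, and the cited 100-hour figure is about three further doublings away.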

"The vast majority of people I meet at frontier labs and around Silicon Valley now code entirely through AI systems," Clark writes. "Increasingly, they use AI systems to write the tests and check the code as well. In other words, AI systems have gotten good enough to automate a major component of AI R&D."

The pattern repeats on benchmarks closer to AI research itself:

CORE-Bench, which tests whether an AI system can reproduce a published paper from its repository, was declared "solved" by one of its own authors in December 2025 after an Opus 4.5 model hit 95.5%. At launch in September 2024 the best system scored about 21.5%.
MLE-Bench, an OpenAI-built benchmark for offline performance on Kaggle competitions, has risen from 16.9% in October 2024 to 64.4% (Gemini 3 with search) in February 2026.
PostTrainBench asks frontier models to fine-tune smaller open-weight models — Qwen 3, SmolLM3, Gemma 3 — and beat the human-trained instruct versions. Top scorers, Opus 4.6 and GPT 5.4, sit at 25–28%, against a human baseline of 51%. Clark calls the gap "already quite meaningful".
On Anthropic's internal CPU-only training-loop benchmark, which measures how much a model can speed up a small language-model training implementation, the company has reported a climb from a 2.9× mean speedup (Claude Opus 4, May 2025) to 52× (Mythos Preview, April 2026). For calibration, a human researcher is expected to take 4 to 8 hours of work to achieve a 4× speedup.
Kernel design — the lower-level work of mapping operations like matrix multiplication onto specific hardware — has become a recognisable subfield of AI-for-AI research, with separate efforts targeting Nvidia GPUs, Triton, and Huawei's Ascend chips. Clark notes a caveat: kernel work is "unusually amenable" to automation because rewards are easy to verify.
And in AI alignment research itself, Anthropic last month published a small-scale proof-of-concept in which AI agents, primed with a research direction, beat a human-designed scalable-oversight baseline. Clark calls the result "meaningful signs of life" without claiming generalisation.

Layered on top is the meta-skill of management. Products like Claude Code and OpenCode now routinely have a single AI agent supervising sub-agents, "effectively forming synthetic teams which can fan out and attack complex problems".

Engineering vs research

Clark draws a sharp line between two kinds of work. AI engineering — the rote, brick-by-brick labour of training, scaling, debugging, optimising — he treats as essentially within reach now. AI research, the rarer flashes of paradigm-shifting insight, he treats as the harder problem.

"AI cannot yet invent radical new ideas," he writes, "but the technology may not need to for it to automate its own development." Quoting Edison's old line that "genius is 1% inspiration and 99% perspiration", Clark argues the field has historically moved forward through engineering perspiration far more than through research inspiration. "AI has got extremely good at performing many of the essential schlep components of AI development."

Tantalising counterexamples exist. A team working with Google's Gemini reportedly produced what they tentatively believe is the first AI resolution of a non-trivial open Erdős problem (Erdős-1051). Mathematicians at the University of British Columbia, the University of New South Wales, Stanford and Google DeepMind published a proof this year that they say was "discovered with very substantial input from Google Gemini and related tools". Clark is careful: maths and computer science may simply be domains unusually amenable to AI-driven invention, and he reads the absence of any Move-37-class result in the ten years since AlphaGo as "another weakly bearish signal".

What Clark says is at stake

The essay closes with a list of consequences Clark says are under-discussed in popular coverage of AI R&D. Three are worth surfacing.

Alignment fragility under recursion. Techniques that work today may break under recursive self-improvement, Clark argues, because compounding error gets brutal fast: "if your technique is 99.9% accurate, then that becomes 95.12% accurate after 50 generations, and 60.5% accurate after 500 generations." He notes separately that frontier models are "already aware of when they are being tested", which complicates any safety regime that depends on probing them in evaluation.
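The compounding arithmetic in that quote can be reproduced in a few lines. This is a minimal sketch that assumes each generation independently retains the same fraction of accuracy, so the losses multiply; the function name is illustrative and nothing here comes from Clark's essay beyond the 99.9% starting figure.

```python
def compounded_accuracy(per_generation: float, generations: int) -> float:
    """Accuracy left after `generations` steps if each step retains
    `per_generation` of the previous step's accuracy.

    ASSUMPTION: errors are independent and compound multiplicatively.
    """
    if not 0.0 <= per_generation <= 1.0:
        raise ValueError("per_generation must be between 0 and 1")
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return per_generation ** generations


if __name__ == "__main__":
    for n in (1, 50, 500):
        print(f"{n:>3} generations: {compounded_accuracy(0.999, n):.2%}")
```

Run this way, 50 generations gives about 95.12%, matching the quoted figure, and 500 generations gives about 60.6%, fractionally above the quoted 60.5%.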

Productivity multipliers running ahead of institutions. If AI does for every other knowledge-work domain what it has done for software engineering, the binding constraint becomes the slow-moving parts of the world: drug trials, regulatory pipelines, physical-world bottlenecks. Clark calls this "Amdahl's law for the economy".
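Amdahl's law, borrowed here from computing, says the overall speedup of a process is capped by the part that cannot be accelerated. The sketch below applies the standard formula to a hypothetical economy in which 90% of the work can be sped up by AI and 10% (the trials, approvals and physical steps) cannot. The 90% split is purely illustrative and does not come from Clark's essay.

```python
def amdahl_speedup(accelerated_fraction: float, factor: float) -> float:
    """Overall speedup when `accelerated_fraction` of the work runs
    `factor` times faster and the remainder is unchanged (Amdahl's law)."""
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if factor <= 0:
        raise ValueError("factor must be positive")
    return 1.0 / ((1.0 - accelerated_fraction) + accelerated_fraction / factor)


if __name__ == "__main__":
    FRACTION = 0.9  # HYPOTHETICAL share of work that AI can accelerate
    for factor in (10, 100, 1000):
        overall = amdahl_speedup(FRACTION, factor)
        print(f"AI speedup {factor:>4}x -> overall {overall:.2f}x")
```

However large the AI speedup becomes, the overall gain in this example never exceeds 10x, because the untouched 10% dominates. That is the sense in which slow-moving institutions become the binding constraint.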

The slow emergence of a "machine economy". Clark sketches a world in which an increasing share of economic activity is run by capital-heavy, human-light firms that buy and sell among themselves, eventually including fully autonomous corporations. "This will do profoundly weird things to the economy and will invite all sorts of questions around inequality and redistribution," he writes.

The headline numbers, a 60% chance of automated AI R&D by end-2028 and a 30% chance by end-2027, are explicitly qualified. If it has not happened by then, Clark writes, "we will have revealed some fundamental deficiency within the current technological paradigm and it'll require human invention to move things forward."

Where Clark's case is strongest, and where it is weakest

Three things stand out about the argument, one of them harder to read than it first looks, and there is one point where it is clearly weakest.

The case is built on convergence, not on any single benchmark. Read each datapoint Clark cites in isolation and the rebuttals are easy: SWE-Bench saturation is partly about benchmark construction; CORE-Bench wasn't widely adopted before it was declared solved; PostTrainBench still trails the human baseline 25–28% to 51%, a gap that is closing but not closed. The case rests on the fact that all of these curves — coding, paper reproduction, kernel design, post-training, training-loop optimisation, even early automated alignment work — are bending in the same direction at the same time. If only one or two were, the 60% would be a stretch. Because most of them are at once, the burden of proof shifts: a sceptic now needs an account of why the convergence is illusory. That is a much harder argument to make.

The framing is the tell. Clark is co-founder and head of policy at Anthropic. The default register from someone in that seat would be either competitive ("we are best placed to win this") or measured ("the timelines are uncertain"). Clark's chosen register is neither. He calls his own conclusion "reluctant", says he feels "dwarfed" by it, and writes that he is "not sure society is ready". That is an unusual posture for a frontier-lab principal to adopt in public, and it is worth reading as deliberate. The essay is doing two jobs at once: it is laying out an evidentiary case, and it is signalling to policymakers that one of the people closest to the technology thinks the policy environment needs to catch up. The first job should be taken at face value. The second should not be taken for granted.

The economic section is the part most likely to bite first. The "machine economy growing within the human economy" passage gets a single paragraph in an essay otherwise dominated by alignment risk. That is probably backwards as a forecasting matter. Capability gains in coding and engineering propagate through labour markets long before any single model autonomously trains its successor, and the political consequences — rents to capital, dislocations in white-collar work, redistribution debates — are the part of this that is least theoretical and least dependent on Clark's 2028 prediction being correct. If automated R&D arrives on schedule, the labour story is already running. If it arrives late or not at all, the labour story still runs.

Where the argument is weakest. Clark's reason for not putting 2027 at 60% is what he calls the creativity gap — the absence, so far, of a Move-37-class flash of insight from a general-purpose system. He treats this as the load-bearing distinction between engineering automation (largely solved) and research automation (not yet). It is a defensible line, but it is also the line doing the most probabilistic work in the essay. One genuinely novel result from a frontier model in the next twelve months — the kind of thing that can be pointed at without squinting — would not just bump the probability up. It would change which side of the argument carries the burden.

None of this rebuts Clark. It sharpens what to watch for. The convergence thesis means the relevant signal is no longer a single dramatic announcement; it is whether the curves keep bending. The next twelve months of arXiv, of frontier-lab benchmark releases, and of post-training results on small open-weight models will tell us more than any product launch.

Source: Jack Clark, "Import AI 455: Automating AI Research", 4 May 2026.
