An AI model programmed nonstop for 19 days on a single MirrorCode task that cost $2,600 to run
Epoch AI's new MirrorCode benchmark tests whether AI models can recreate complete programs without access to the original code. Claude Opus 4.7 leads with a 56 percent solve rate, rebuilding a 16,000-line toolkit in just 14 hours. But every model tested still fails on the most complex tasks. This article first appeared on The Decoder.
The Decoder · Matthias Bastian