NatureBench: Can Coding Agents Match the Published SOTA of Nature-Family Papers?

Hugging Face · Daily Papers · Yuru Wang, Lejun Cheng · January 23, 2026 · ▲ 48 upvotes

This paper is featured in Hugging Face's Daily Papers selection, curated by the AI research community.

Authors: Yuru Wang, Lejun Cheng, Yuxin Zuo, Sihang Zeng, Bingxiang He, Che Jiang

Topics: NatureBench, NatureGym, AI coding agents, scientific discovery, methodological translation, supervised prediction problems

Abstract

Original abstract, taken from the paper:

NatureBench presents a cross-disciplinary benchmark of 90 scientific tasks derived from Nature publications to assess AI coding agents' ability to achieve discovery rather than just reproduction, revealing that current agents primarily rely on methodological translation rather than genuine scientific innovation.
