NatureBench: Can Coding Agents Match the Published SOTA of Nature-Family Papers?

Hugging Face · Daily Papers · Yuru Wang, Lejun Cheng · ▲ 48 upvotes

This paper is featured in Hugging Face's Daily Papers selection, curated by the AI research community.

Authors: Yuru Wang, Lejun Cheng, Yuxin Zuo, Sihang Zeng, Bingxiang He, Che Jiang

  • 48 community upvotes
  • Topics: NatureBench, NatureGym, AI coding agents, scientific discovery, methodological translation, supervised prediction problems

Abstract

Original abstract, extracted from the paper:

NatureBench presents a cross-disciplinary benchmark of 90 scientific tasks derived from Nature publications to assess AI coding agents' ability to achieve discovery rather than just reproduction, revealing that current agents primarily rely on methodological translation rather than genuine scientific innovation.

