Dataset
LLMs & Text
ibm-research/ScarfBench
The ibm-research/ScarfBench dataset is among the featured datasets on Hugging Face, the data used to train and evaluate current models.
- 215 downloads
- 16 likes
About the dataset
Scarf (Self-Contained Application Refactoring) is a benchmark suite for evaluating AI agents' ability to migrate enterprise Java applications across Jakarta EE, Quarkus, and Spring while preserving functionality, idiomatic patterns, and architectural integrity.
Tags: text-generation, code, benchmark, evaluation, java, code-translation, agentic