Dataset · LLMs & Text

ibm-research/ScarfBench


Hugging Face · Datasets · ibm-research · January 22, 2026 · ↓ 215 · ♥ 16

The ibm-research/ScarfBench dataset is among those featured on Hugging Face, where featured datasets supply the training and evaluation data for current models.

215 downloads
16 likes

About the dataset

Scarf (Self-Contained Application Refactoring) is a benchmark suite for evaluating AI agents' ability to migrate enterprise Java applications across Jakarta EE, Quarkus, and Spring while preserving functionality, idiomatic patterns, and architectural integrity.

Tags: text-generation, code, benchmark, evaluation, java, code-translation, agentic

