AGORA: An Archive-Grounded Benchmark for Agentic Workplace Document Reasoning

This paper is featured in Hugging Face's Daily Papers selection, curated by the AI research community.

Authors: Honglin Guo, Qi Zhang, Yu Zhang, Weijie Li, Rui Zheng, Zhikai Lei

  • 0 community upvotes
  • Topics: large language models, archive-grounded reasoning, evidence retrieval, cross-document task synthesis, leakage-preventing obfuscation, difficulty filtering

Abstract

Original abstract, as extracted from the paper:

Large language models face challenges in archive-grounded reasoning tasks involving evidence retrieval and synthesis across diverse document collections, with performance varying significantly across domains.
