AGORA: An Archive-Grounded Benchmark for Agentic Workplace Document Reasoning
Hugging Face · Daily Papers
This paper is featured in Hugging Face's Daily Papers selection, curated by the AI research community.
Authors: Honglin Guo, Qi Zhang, Yu Zhang, Weijie Li, Rui Zheng, Zhikai Lei
- 0 community upvotes
- Topics: large language models, archive-grounded reasoning, evidence retrieval, cross-document task synthesis, leakage-preventing obfuscation, difficulty filtering
Abstract
Original abstract (in English), extracted from the paper:
Large language models face challenges in archive-grounded reasoning tasks involving evidence retrieval and synthesis across diverse document collections, with performance varying significantly across domains.