REKEY: Metadata-Grounded Visual-Key Regeneration for Contamination-Resilient VQA Evaluation

arXiv:2606.20736v1. Announce type: new.

Abstract: Static visual question answering (VQA) benchmarks age quickly: once the items leak into training corpora, scores can reflect memorization rather than genuine visual ability, thus obscuring real progress. Rebuilding high-quality benchmarks such as V*Bench requires substantial human annotation, yet each static release can quickly become another leaked artifact. We propose ReKey, a live benchmark protocol that randomly regenerates the answer-bearing l...

arXiv cs.CV · Tengjie Lin, Yutao Sun, Jingwei Ni, Shuhan Ge, Hao-Xuan Ma, Yanting Miao, Wangyue Lu, Mingshuai Chen, Tiancheng Zhao, Jianwei Yin · January 23, 2026


