GridVQA-X: A Framework for Evaluating Multimodal Explainability Methods
This paper is featured in the Hugging Face Daily Papers selection, curated by the AI research community.
Authors: Sujay Belsare, Sudarshan Nikhil, Sushant Kumar, Ponnurangam Kumaraguru, Chirag Agarwal
- 3 community upvotes
- Topics: Vision-Language Models, Multimodal Explainable AI, cross-modal reasoning, cross-modal shortcuts, diagnostic framework, spatial-relational reasoning
Abstract
Original abstract, as extracted from the paper:
GridVQA-X introduces a diagnostic framework to evaluate cross-modal explainability by distinguishing genuine spatial-relational reasoning from cross-modal shortcuts in multimodal models.