Are Text-to-Image Models Inductivist Turkeys? A Counterfactual Benchmark for Causal Reasoning
Hugging Face · Daily Papers
This paper is featured in Hugging Face's Daily Papers selection, curated by the AI research community.
Authors: Jiayi Lei, Yuandong Pu, Xingyu Han, Rongpeng Zhu, Jing Xu, Jinyao Wang
- 5 community upvotes
- Topics: text-to-image generation, counterfactual benchmark, Vision Language Model, CF-Eval, Prior Resistance Rate, Reasoning Retention Rate
Abstract
Original abstract, extracted from the paper:
Text-to-image models fail to generate counterfactual scenes because they rely on tightly coupled visual-textual patterns rather than causal reasoning, demonstrating limited understanding beyond pattern matching.