Perceive-to-Reason: Decoupling Perception and Reasoning for Fine-Grained Visual Reasoning
Hugging Face · Daily Papers
This paper is featured in Hugging Face's daily selection of papers, curated by the AI research community.
Authors: Hongxing Li, Xiufeng Huang, Dingming Li, Wenjing Jiang, Zixuan Wang, Haolei Xu
- 10 community upvotes
- Topics: vision-language models, fine-grained visual reasoning, Perceiver, Reasoner, Perception-Reasoning Alternating GRPO, reinforcement learning
Abstract
Original abstract (in English), extracted from the paper:
A unified framework named Perceive-to-Reason (P2R) is introduced that separates visual perception from reasoning in vision-language models through a two-stage process, improving fine-grained visual reasoning performance on high-resolution images.