ABACUS: Adapting Unified Foundation Model for Bridging Image Count Understanding and Generation
ABACUS is a unified vision-language model that performs object counting and related tasks through spatial grounding, boundary-aware counting policies, and self-critical…