Efficient Remote Sensing Instance Segmentation with Linear-Time State Space Distilled Visual Foundation Models
arXiv:2606.25324v1
Abstract: The computational complexity of Transformers scales quadratically with the number of tokens, which significantly constrains the efficiency of vision models, particularly recent ViT-based foundation models in dense prediction tasks. Instance segmentation, a typical dense visual prediction task in the remote sensing field, faces similar challenges. In this paper, inspired by the recent advances of knowledge distillation in large language models, we i...
arXiv cs.CV
Qinzhe Yang, Keyan Chen, Jia Xu, Zhenwei Shi, Zhengxia Zou