SpikeVLA: Vision-Language-Action Models with Spiking Neural Networks

arXiv:2606.27807v1. Abstract: Vision-Language-Action (VLA) models have become a dominant paradigm for embodied intelligence. However, most existing approaches are built on large-scale transformers, resulting in substantial inference latency and energy consumption that limit their practical deployment in low-power, real-time scenarios. We propose SpikeVLA, a spiking VLA architecture for embodied navigation with energy-efficient inference, consisting of three key components. (i) ...

arXiv cs.RO · Ruiqi Song, Dujun Nie, Siyu Teng, Baiyong Ding, Xiaotong Zhang, Dong Li, Chenming Zhang, Yuchen Li, Hangbin Wu, Long Chen · January 29, 2026
