SpikeVLA: Vision-Language-Action Models with Spiking Neural Networks
arXiv:2606.27807v1 Announce Type: new Abstract: Vision-Language-Action (VLA) models have become a dominant paradigm for embodied intelligence. However, most existing approaches are built on large-scale transformers, resulting in substantial inference latency and energy consumption that limit their practical deployment in low-power, real-time scenarios. We propose SpikeVLA, a spiking VLA architecture for embodied navigation with energy-efficient inference, consisting of three key components. (i) ...
arXiv cs.RO
·Ruiqi Song, Dujun Nie, Siyu Teng, Baiyong Ding, Xiaotong Zhang, Dong Li, Chenming Zhang, Yuchen Li, Hangbin Wu, Long Chen
·
// relacionados
Leia também
Blog
The US military used AI to pick thousands of targets but missed a note saying one was a school
Blog
HP accelerates enterprise workflows with OpenAI Frontier
Editorial
O fantasma do Fable 5: banido, o modelo vive nos datasets que o destilam
Editorial