ViTL: Temporal Logic-Guided Zero-Shot Natural Language Navigation via Vision-Language Models

arXiv:2606.30696v1 · Announce Type: new

Abstract: Enabling robots to follow natural language commands to complete zero-shot long-horizon tasks remains challenging. It requires extracting implicit temporal and logical constraints from natural language commands and executing multiple sub-tasks accordingly. Recent zero-shot object navigation methods use vision-language models (VLMs) to guide frontier-based exploration in unknown environments, but they are limited to single-target tasks. Real-world co...

arXiv cs.RO · Kaier Liang, Hengde Dai, Cristian-Ioan Vasile