
Keypose Exploration: Efficient Automatic Trajectory Labelling and Cross-Embodiment Policy Transfer

arXiv:2606.29028v1. Abstract: Keypose-based manipulation decomposes tasks into critical waypoints to simplify policy learning for long-horizon tasks, but existing approaches rely on task-specific heuristics or manual annotation to extract keyposes from demonstrations. We present an automatic trajectory labelling pipeline for grasp-related tasks. This pipeline combines vision-language models (VLMs) for semantic event detection with classical trajectory analysis for precise tempo...
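To make the "classical trajectory analysis" half of the idea concrete, here is a minimal Python sketch of a heuristic that is common in keypose-based manipulation work: mark a frame as a candidate keypose when the gripper state toggles or when the end effector comes to rest. This is not the authors' pipeline, which the truncated abstract does not spell out, and it leaves out the VLM-based semantic event detection entirely. The function name `find_keyposes`, the `speed_eps` threshold, and the toy trajectory are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def find_keyposes(
    positions: Sequence[Sequence[float]],
    gripper_open: Sequence[bool],
    speed_eps: float = 1e-3,
) -> List[int]:
    """Return frame indices of candidate keyposes (hypothetical heuristic).

    A frame is a candidate when the gripper state differs from the previous
    frame, or when the end effector has just come to rest (per-frame
    displacement drops below speed_eps). The final frame is always included.
    """
    if len(positions) != len(gripper_open):
        raise ValueError("positions and gripper_open must have equal length")

    keyposes: List[int] = []
    was_stopped = False
    for t in range(1, len(positions)):
        # Displacement between consecutive frames stands in for speed.
        step = math.dist(positions[t], positions[t - 1])
        stopped = step < speed_eps
        toggled = gripper_open[t] != gripper_open[t - 1]
        # Only the first frame of a rest period counts, to avoid duplicates.
        if toggled or (stopped and not was_stopped):
            keyposes.append(t)
        was_stopped = stopped

    last = len(positions) - 1
    if last >= 0 and (not keyposes or keyposes[-1] != last):
        keyposes.append(last)
    return keyposes


# Toy demonstration: move along x, pause, close the gripper, lift along z.
demo_positions = [
    (0.0, 0.0, 0.0),
    (0.1, 0.0, 0.0),
    (0.2, 0.0, 0.0),
    (0.2, 0.0, 0.0),
    (0.2, 0.0, 0.0),
    (0.2, 0.0, 0.1),
    (0.2, 0.0, 0.2),
]
demo_gripper_open = [True, True, True, True, False, False, False]
demo_keyposes = find_keyposes(demo_positions, demo_gripper_open)
print(demo_keyposes)  # [3, 4, 6]
```

On the toy trajectory, the heuristic picks the frame where the arm first stops, the frame where the gripper closes, and the final frame. A purely geometric rule like this has no notion of which event is semantically meaningful, which is the gap the abstract says the VLM component is meant to fill.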

arXiv cs.RO · Yupu Lu, Hang Xu, Yizhou Chen, Jia Pan · 30 January 2026

