Learning to Move Before Learning to Do: Task-Agnostic pretraining for VLAs

Hugging Face · Daily Papers · Junhao Shi, Siyin Wang · January 2, 2026 · ▲ 4 upvotes

This paper is featured in Hugging Face's Daily Papers selection, curated by the AI research community.

Authors: Junhao Shi, Siyin Wang, Xiaopeng Yu, Li Ji, Jingjing Gong, Xipeng Qiu

4 community upvotes
Topics: Vision-Language-Action models, expert demonstrations, physical competence, semantic alignment, self-supervised Inverse Dynamics, task-agnostic pretraining

Summary

Original summary, as extracted from the paper:

Task-Agnostic Pretraining framework trains robotic models using self-supervised inverse dynamics on unlabeled data followed by lightweight language grounding, achieving superior performance with minimal expert demonstrations.

Where to read

View on Hugging Face

Read also

Neuro-Symbolic Safety Guidance for Vision-Language-Action Models via Constrained Flow Matching

PairCoder++: Pair Programming as a Universal Paradigm for Verified Code-Driven Multimodal and Structured-Artifact Generation

Guided Action Flow: Q-Guided Inference for Flow-Matching Vision-Language-Action Policies

Bridge-WA: Predicting Where and How the World Changes for Robotic Action