MemLearner: Learning to Query Context memory for Video World Models

Hugging Face · Daily Papers · Jiwen Yu, Jianxiong Gao · ▲ 18 upvotes

This paper is featured in Hugging Face's Daily Papers selection, curated by the AI research community.

Authors: Jiwen Yu, Jianxiong Gao, Jianhong Bai, Yiran Qin, Kaiyi Huang, Quande Liu

  • 18 community upvotes
  • Topics: video world models, context frame retrieval, query tokens, video generation model, visual priors, multi-dataset training strategy

Summary

Original summary, taken from the paper:

MemLearner improves video world models by using learning-based adaptive context querying with query tokens to enhance scene consistency and memory in long video sequences with occlusions and dynamic objects.
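
To make the idea of querying context frames with query tokens concrete, here is a minimal toy sketch. It is not the paper's implementation: the function name `query_context`, the scaled dot-product scoring, the top-k selection, and the hand-written vectors are all assumptions chosen for illustration. In a learned system the query tokens would be trained parameters rather than fixed values.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def query_context(query_tokens, frame_features, top_k):
    """Hypothetical helper: rank stored context frames by how strongly
    the query tokens attend to them, and return the top_k frame indices
    together with the averaged attention weights."""
    scale = 1.0 / math.sqrt(len(frame_features[0]))
    relevance = [0.0] * len(frame_features)
    for q in query_tokens:
        # Attention weights of this query token over all context frames.
        weights = softmax([dot(q, f) * scale for f in frame_features])
        for i, w in enumerate(weights):
            relevance[i] += w
    # Average over query tokens so the weights still sum to 1.
    relevance = [r / len(query_tokens) for r in relevance]
    ranked = sorted(range(len(frame_features)),
                    key=lambda i: relevance[i], reverse=True)
    return ranked[:top_k], relevance


# Toy memory of four context frames, each summarized by a 2-D feature (assumed).
frames = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
# One query token, set by hand here; it would be learned in practice.
queries = [[4.0, 0.0]]

selected, relevance = query_context(queries, frames, top_k=2)
print(selected)
```

With these toy values the two frames whose features point in the same direction as the query token are ranked highest, which is the behavior an adaptive retrieval step is meant to provide when earlier views of a scene need to be recalled.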
