MemLearner: Learning to Query Context memory for Video World Models
Hugging Face · Daily Papers
This paper is featured in Hugging Face's Daily Papers selection, curated by the AI research community.
Authors: Jiwen Yu, Jianxiong Gao, Jianhong Bai, Yiran Qin, Kaiyi Huang, Quande Liu
- 18 community upvotes
- Topics: video world models, context frame retrieval, query tokens, video generation model, visual priors, multi-dataset training strategy
Abstract
Original abstract, extracted from the paper:
MemLearner improves video world models by using learning-based adaptive context querying with query tokens to enhance scene consistency and memory in long video sequences with occlusions and dynamic objects.