UnityShots: Memory-Driven Multi-Shot Audio-Video Generation with Boundary-Aware Gating

Hugging Face · Daily Papers · Jiehui Huang, Yuechen Zhang · ▲ 14 upvotes

This paper is featured in Hugging Face's Daily Papers selection, curated by the AI research community.

Authors: Jiehui Huang, Yuechen Zhang, Bin Xia, Jiahao Wang, Xu He, Zhenchao Tang

  • 14 community upvotes
  • Topics: multi-shot audio-video generation, LTX-2.3, long-term memory, short-term memory, boundary-conditioned gate, visual cut probability

Abstract

Original abstract, extracted from the paper:

UnityShots is a memory-driven audio-video generation system that maintains consistent subject appearance and audio across video cuts using fixed-size long-term and short-term memory slots with boundary-conditioned gates and discrete cut-type priors.
