Goku: A Million-Scale Universal Dataset and Benchmark for Instruction-Based Video Editing

A large-scale video editing dataset and model are introduced that support multi-task and structural manipulations through advanced data synthesis and network architectures.

Hugging Face · Daily Papers · Sen Liang, Cong Wang · ▲ 1 upvote

This paper is featured in Hugging Face's Daily Papers selection, curated by the AI research community.

Authors: Sen Liang, Cong Wang, Zhentao Yu, Fengbin Guan, Zhengguang Zhou, Teng Hu

  • 1 community upvote
  • Topics: instruction-aligned video editing pairs, data synthesis pipeline, progressive filtering system, MLLM, decoupled dual-branch design, mask branch
