
KM-Speaker: Keypoint-Based Style Control for High-Quality Speech-Driven 3D Facial Animation and Dialogue Localization

arXiv:2606.28568v1. Abstract: Speech-driven 3D facial animation methods face significant challenges in simultaneously achieving high-fidelity motion and precise artistic control at production quality. Existing controllable models typically learn global style control by relying on large-scale, low-quality in-the-wild datasets that compromise overall animation realism. Furthermore, these frameworks often lack the fine-grained temporal precision required for demanding tasks...

arXiv cs.CV · Arthur Josi, Emeline Got, Abdallah Dib, Luiz Gustavo Hafemann, Rafael M. O. Cruz · January 30, 2026


