EmoInstruct-TTS: Dual-Path Instruction-Guided Emotional Speech Synthesis
arXiv:2606.20650v1 (Announce Type: new)
Abstract: Instruction-based controllable speech synthesis enables users to specify emotions through natural language. However, existing approaches often rely on coarse emotion labels and lack explicit modeling of fine-grained intensity. We propose EmoInstruct-TTS, a dual-path instruction-guided framework for emotional speech synthesis. We introduce Emotion2embed, a supervised semantic-acoustic emotion embedding covering 48 emotional states, including fine-gr...
arXiv cs.CL
Minghui Wu, Ganjun Liu, Zikun Fang, Ting Meng, Hongchuan Wu, Bingao Xu, Yonglong Cai, Jiasheng Chen, Jun Du