Speaker Identity in Non-Verbal Vocalizations: Conditional Distillation and Mixture of Experts Approach
Hugging Face · Daily Papers
Tzu-Chieh Wei, Yi-Cheng Lin
This paper is featured in the Hugging Face daily papers selection, curated by the AI research community.
Authors: Tzu-Chieh Wei, Yi-Cheng Lin, Huang-Cheng Chou, Kuan-Yu Chen, Hsin-Yen Sung, Shrikanth Narayanan
- 0 community upvotes
- Topics: Data2Vec, ECAPA-TDNN, Mixture of Experts, conditional distillation loss, contrastive loss, speaker verification
Abstract
Original abstract, extracted from the paper:
A novel speaker verification framework combines frozen self-supervised features with ECAPA-TDNN and MoE modules to improve identity verification across both speech and non-verbal vocalizations while maintaining speech performance.
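The abstract names a Mixture of Experts (MoE) module but does not say how it is wired, so the following is only a generic illustration of soft MoE gating followed by cosine scoring, not the authors' architecture. The input vectors stand in for pooled features from a frozen upstream encoder, and every name, dimension, and weight here (`moe_layer`, `gate_weights`, `expert_weights`, `FEATURE_DIM`, and so on) is a hypothetical placeholder with random values.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vec):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def moe_layer(features, gate_weights, expert_weights):
    """Soft MoE: weight each expert's output by an input-dependent gate."""
    gates = softmax(matvec(gate_weights, features))
    expert_outputs = [matvec(w, features) for w in expert_weights]
    dim = len(expert_outputs[0])
    mixed = [
        sum(g * out[i] for g, out in zip(gates, expert_outputs))
        for i in range(dim)
    ]
    return mixed, gates


def cosine_score(a, b):
    """Cosine similarity, a common verification score between embeddings."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


# Hypothetical sizes and random weights, purely for illustration.
rng = random.Random(0)
FEATURE_DIM, EMBED_DIM, NUM_EXPERTS = 8, 4, 3


def random_matrix(rows, cols):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


gate_weights = random_matrix(NUM_EXPERTS, FEATURE_DIM)
expert_weights = [random_matrix(EMBED_DIM, FEATURE_DIM) for _ in range(NUM_EXPERTS)]

# Stand-ins for pooled features of an enrollment and a test utterance.
enroll_features = [rng.gauss(0.0, 1.0) for _ in range(FEATURE_DIM)]
test_features = [rng.gauss(0.0, 1.0) for _ in range(FEATURE_DIM)]

enroll_embedding, enroll_gates = moe_layer(enroll_features, gate_weights, expert_weights)
test_embedding, _ = moe_layer(test_features, gate_weights, expert_weights)
score = cosine_score(enroll_embedding, test_embedding)
```

In a verification setting, `score` would be compared against a threshold to accept or reject the claimed identity. The gate lets different experts dominate for different inputs, which is one plausible reason to use MoE when speech and non-verbal vocalizations share a single model.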