Interleaved Speech Language Models Latently Work In Text

Hugging Face · Daily Papers · Talia Sternberg, Gallil Maimon · ▲ 10 upvotes

This paper is featured in the Hugging Face Daily Papers selection, curated by the AI research community.

Authors: Talia Sternberg, Gallil Maimon, Yossi Adi

  • 10 community upvotes
  • Topics: speech language models, speech-text interleaving, logit lens, intermediate layers, text token, speech recognition

Abstract

Original abstract, as extracted from the paper:

Interleaved speech-text language models exhibit an implicit transcription phase where text tokens become decodable in intermediate layers, followed by text-based prediction before speech domain transformation.
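To make "decodable in intermediate layers" concrete, here is a minimal sketch of the logit lens idea listed among the topics: each layer's hidden state is projected through the model's output (unembedding) matrix, and the top-scoring token is read off. Everything in it is an assumption for illustration. The vocabulary, the matrix, and the hidden states are hand-built toy values chosen so the pattern is visible; they are not data, code, or results from the paper. A real logit lens would typically also apply the model's final normalization before the projection, which this sketch omits.

```python
def logit_lens(hidden_states, unembed):
    """Return the top-1 vocabulary index for each layer's hidden state.

    hidden_states: list of per-layer vectors (each of length d_model)
    unembed:       list of vocabulary rows (each of length d_model)
    """
    top_ids = []
    for h in hidden_states:
        # Logit for each vocabulary entry = dot product with the hidden state.
        logits = [sum(w * x for w, x in zip(row, h)) for row in unembed]
        top_ids.append(max(range(len(logits)), key=logits.__getitem__))
    return top_ids


# Hypothetical mixed vocabulary: two speech units and two text tokens.
VOCAB = ["<speech_17>", "<speech_42>", "cat", "sat"]

# Toy unembedding matrix (identity), so each dimension maps to one token.
UNEMBED = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

# Hand-constructed hidden states for one position, one vector per layer.
HIDDEN = [
    [0.9, 0.1, 0.0, 0.0],  # early layer: closest to the input speech unit
    [0.2, 0.0, 0.8, 0.1],  # middle layer: a text token becomes decodable
    [0.0, 0.1, 0.3, 0.7],  # later layer: a different text token dominates
    [0.1, 0.9, 0.0, 0.2],  # final layer: back to a speech unit
]

decoded = [VOCAB[i] for i in logit_lens(HIDDEN, UNEMBED)]
for layer, token in enumerate(decoded):
    print(f"layer {layer}: {token}")
```

With a real model, the per-layer vectors would come from the network's intermediate activations rather than being written by hand, and the question would be at which depth text tokens start to win the argmax.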
