Tapered Language Models

Tapered language models allocate more parameters to earlier layers and fewer to later layers, improving performance without increasing total parameters or compute costs.

Hugging Face · Daily Papers · Reza Bayat, Ali Behrouz · ▲ 3 upvotes

This paper is featured in Hugging Face's daily selection of papers, curated by the AI research community.

Authors: Reza Bayat, Ali Behrouz, Aaron Courville

  • 3 community upvotes
  • Topics: transformer, recurrent, memory-based variants, identical layers, parameter-efficient fine-tuning, perplexity

Summary

Original summary, as extracted from the paper:

Tapered language models allocate more parameters to earlier layers and fewer to later layers, improving performance without increasing total parameters or compute costs.
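
To make the idea of shifting parameters toward earlier layers under a fixed budget concrete, here is a minimal sketch. It is an illustration only, not the authors' method: the summary does not say which component is tapered or which schedule is used. The function name tapered_widths, the linear schedule, and the ratio parameter are all assumptions, and the sketch further assumes that per-layer parameter count scales linearly with a single width (for example, a feed-forward hidden size), so that keeping the sum of widths fixed keeps the total parameter count fixed.

```python
def tapered_widths(num_layers, uniform_width, ratio):
    """Hypothetical linear taper (not taken from the paper).

    Returns one width per layer, decreasing from the first layer to the
    last, whose sum equals num_layers * uniform_width. `ratio` is the
    approximate first-layer / last-layer width ratio (1.0 = no taper).
    """
    if num_layers < 1 or uniform_width < 1 or ratio < 1:
        raise ValueError("need num_layers >= 1, uniform_width >= 1, ratio >= 1")
    if num_layers == 1:
        return [uniform_width]

    # Unnormalised weights falling linearly from `ratio` down to 1.
    weights = [
        ratio - (ratio - 1) * i / (num_layers - 1) for i in range(num_layers)
    ]

    # Rescale so the widths sum to the budget of the uniform model.
    budget = num_layers * uniform_width
    scale = budget / sum(weights)
    widths = [int(w * scale) for w in weights]  # round down

    # Hand the units lost to rounding back to the earliest layers,
    # which keeps the sequence non-increasing.
    remainder = budget - sum(widths)
    for i in range(remainder):
        widths[i % num_layers] += 1
    return widths


if __name__ == "__main__":
    # Compare a uniform 4-layer stack with a tapered one of equal budget.
    print("uniform:", [1024] * 4)
    print("tapered:", tapered_widths(4, 1024, 3.0))
```

Under these assumptions, the tapered stack has the same summed width as the uniform one, with the first layer roughly three times as wide as the last. Whether such a schedule matches the paper's allocation is left to the paper itself.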
