Paper
LLMs & Text
Information-Aware KV Cache Compression for Long Reasoning
InfoKV is an entropy-aware KV cache compression framework that enhances long-context reasoning in LLMs by incorporating information-theoretic signals alongside attention weights.
Hugging Face · Daily Papers
This paper is featured in the Hugging Face Daily Papers selection, curated by the AI research community.
Authors: Jushi Kai, Zhuiri Xiao, Alexandra Birch, Zhouhan Lin
- 9 community upvotes
- Topics: KV cache, attention weights, token importance, predictive uncertainty, information-theoretic signals, Forward Influence
Abstract
Original abstract, extracted from the paper:
InfoKV is an entropy-aware KV cache compression framework that enhances long-context reasoning in LLMs by incorporating information-theoretic signals alongside attention weights.