Paper
LLMs & Texto
Rethinking Shrinkage Bias in LLM FP4 Pretraining: Geometric Origin, Systemic Impact, and UFP4 Recipe
Uniform 4-bit training with RHT-based quantization outperforms E2M1-based methods by eliminating shrinkage bias and improving training stability across large language model archite…
Hugging Face · Daily Papers
·Qian Zhao, Kunlong Chen
·
·▲ 7 upvotes
Este artigo está em destaque na seleção diária de papers do Hugging Face, curada pela comunidade de pesquisa em IA.
Autores: Qian Zhao, Kunlong Chen, Changxin Tian, Zhonghui Jiang, Haitao Zhang, Chaofan Yu
- 7 upvotes da comunidade
- Temas: FP4 training, Shrinkage Bias, Random Hadamard Transform, uniform grids, E1M2, INT4
Resumo
Resumo original (em inglês), extraído do paper:
Uniform 4-bit training with RHT-based quantization outperforms E2M1-based methods by eliminating shrinkage bias and improving training stability across large language model architectures.
// relacionados
Leia também
Blog
How Businesses Are Building Specialized AI They Can Trust
Blog
Fika Jobs raises $4M to build a video-first hiring platform where AI agents interview candidates
Blog
Build real agentic apps using CUGA: two dozen working examples on a lightweight harness
Blog