Dataset LLMs & Text

openbmb/Ultra-FineWeb-L3

Featured dataset on Hugging Face, with 90.3K downloads.

Hugging Face · Datasets · openbmb · ↓ 90331 · ♥ 301

The openbmb/Ultra-FineWeb-L3 dataset is among the featured datasets on Hugging Face, the data used to train and evaluate current models.

  • 90.3K downloads
  • 301 likes

About the dataset

Ultra-FineWeb-L3 is the L3 refined data for general high-quality web data within UltraData's L0-L4 tiered data management framework.

text-generation llm pretraining data-synthesis data-filtering high-quality general-knowledge qa-generation

Explore the dataset on Hugging Face →
