Dataset · LLMs & Text

cais/mmlu

Dataset with 10K to 100K examples and 429.7K downloads on Hugging Face. Dataset card for MMLU. Dataset summary: Measuring Massive Multitask Language Understanding by Dan Hendrycks, Collin Burns, Steven Basart, Andy Zou, Mant…

Hugging Face · Datasets · cais · January 8, 2024 · ↓ 429,656 · ♥ 778

The cais/mmlu dataset is among the featured datasets on Hugging Face: data used to train and evaluate current models.

Dataset details

Size: 10K to 100K examples
Tasks: question answering
Languages: English
License: MIT
Downloads: 429.7K · Likes: 778

About the dataset

Dataset card for MMLU. Dataset summary: Measuring Massive Multitask Language Understanding by Dan Hendrycks, Collin Burns, Steven Basart, Andy Zou, Mantas Mazeika, Dawn Song, and Jacob Steinhardt (ICLR 2021).

How to load

Use the Hugging Face datasets library:

pip install -U datasets

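from datasets import load_dataset

# cais/mmlu needs a configuration name: "all" covers every subject,
# or pass a single subject such as "abstract_algebra".
ds = load_dataset("cais/mmlu", "all")
print(ds)  # lists the available splits and their sizes
# There is no "train" split; the evaluation data is in "test".
print(ds["test"][0])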
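Once a record is loaded, it usually has to be turned into a multiple-choice prompt before it can be used for evaluation. The sketch below shows one way to do that. It assumes each record has a "question" string, a "choices" list of four strings, and an integer "answer" index; check the printed record to confirm the field names. The helper format_example and the sample record are hypothetical and are not part of the datasets library or the dataset itself.

```python
# Hypothetical helper: renders one MMLU-style record as a multiple-choice prompt.
# Assumed fields: "question" (str), "choices" (list of str), "answer" (int index).
LETTERS = "ABCD"


def format_example(example: dict, include_answer: bool = False) -> str:
    """Return the question, lettered choices, and an 'Answer:' line."""
    lines = [example["question"].strip()]
    for letter, choice in zip(LETTERS, example["choices"]):
        lines.append(f"{letter}. {choice}")
    # Append the gold letter only when building few-shot demonstrations.
    answer = f" {LETTERS[example['answer']]}" if include_answer else ""
    lines.append(f"Answer:{answer}")
    return "\n".join(lines)


# Made-up record for illustration only; it is not taken from the dataset.
sample = {
    "question": "What is 2 + 2?",
    "subject": "elementary_mathematics",
    "choices": ["3", "4", "5", "6"],
    "answer": 1,
}
print(format_example(sample, include_answer=True))
```

With include_answer=False the prompt ends at "Answer:" so a model can complete the letter; with include_answer=True the same function can build few-shot demonstrations.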
