Dataset
LLMs & Text
cais/mmlu
Dataset with 10K to 100K examples and 429.7K downloads on Hugging Face. Dataset Card for MMLU. Dataset Summary: Measuring Massive Multitask Language Understanding by Dan Hendrycks, Collin Burns, Steven Basart, Andy Zou, Mant…
Hugging Face · Datasets · cais · ↓ 429,656 · ♥ 778
The cais/mmlu dataset is among the featured datasets on Hugging Face: data used to train and evaluate current models.
Dataset facts
- Size: 10K to 100K examples
- Tasks: question answering
- Languages: English
- License: MIT
- Downloads: 429.7K · Likes: 778
About the dataset
Dataset Card for MMLU. Dataset Summary: Measuring Massive Multitask Language Understanding, by Dan Hendrycks, Collin Burns, Steven Basart, Andy Zou, Mantas Mazeika, Dawn Song, and Jacob Steinhardt (ICLR 2021).
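How to load
Use the Hugging Face datasets library. The dataset is organized into per-subject configurations, so load_dataset needs a configuration name; "all" combines every subject:
pip install -U datasets
from datasets import load_dataset
# A configuration name is required; "all" covers every subject.
ds = load_dataset("cais/mmlu", "all")
print(ds)
# MMLU has no "train" split; inspect the first test example instead.
print(ds["test"][0])
Tags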
question-answering
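Once a record is loaded, a common next step for a question-answering benchmark is to turn it into a multiple-choice prompt. The sketch below shows one way to do that. It assumes each record has the fields question, choices (a list of four strings) and answer (an integer index into choices); these field names come from the dataset viewer rather than from this page, so check them against print(ds) before relying on them. The helper names format_prompt and gold_letter are hypothetical, and the sample record is hand-written for illustration.

```python
from typing import Any, Mapping

LETTERS = "ABCD"  # assumption: every question has exactly four options


def format_prompt(example: Mapping[str, Any]) -> str:
    """Render one multiple-choice record as a lettered prompt ending in 'Answer:'."""
    lines = [example["question"].strip()]
    for letter, choice in zip(LETTERS, example["choices"]):
        lines.append(f"{letter}. {choice}")
    lines.append("Answer:")
    return "\n".join(lines)


def gold_letter(example: Mapping[str, Any]) -> str:
    """Map the integer answer index (assumed 0-based) to its option letter."""
    return LETTERS[example["answer"]]


# Hand-written record for illustration only; it is not taken from the dataset.
sample = {
    "question": "What is 2 + 3?",
    "subject": "elementary_mathematics",
    "choices": ["4", "5", "6", "7"],
    "answer": 1,
}

prompt = format_prompt(sample)
print(prompt)
print("Gold:", gold_letter(sample))
```

The same functions can be applied to a real record, for example format_prompt(ds["test"][0]), provided the fields match the assumptions above.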