Measuring Judgment Quality in Natural-Language Explanations: Evidence from Forecasting Tournaments

arXiv:2606.30987v1. Abstract: Decision-makers routinely rely on expert judgments accompanied by written explanations, yet explanation quality is difficult to measure at scale. Forecasting tournaments offer a natural testing ground: probabilistic judgments are paired with natural-language rationales and scored against realized outcomes. We introduce Explanation Quality Markers (EQMs), a set of sixty theory-guided reasoning patterns scored by large language models (LLMs). In a pr...

arXiv cs.CL · Christopher W. Karvetski, Sheldon S. Huang, Simas Kučinskas, Nadja Flechner, Jingyu Hu, Philip Tetlock, Ezra Karger · January 1, 2026

Read also

Using Lift to Turn Research PDFs into Structured JSON with Controlled, Schema-Guided Field-Level Evaluation

Anthropic Redeploys Claude Fable 5 on July 1 After US Export Controls Lift, Adds New Cybersecurity Classifier

The latest AI news we announced in June 2026

Cloudflare’s new policy pushes AI companies to pay for publishers’ content