Measuring Judgment Quality in Natural-Language Explanations: Evidence from Forecasting Tournaments
arXiv:2606.30987v1 (announce type: new). Abstract: Decision-makers routinely rely on expert judgments accompanied by written explanations, yet explanation quality is difficult to measure at scale. Forecasting tournaments offer a natural testing ground: probabilistic judgments are paired with natural-language rationales and scored against realized outcomes. We introduce Explanation Quality Markers (EQMs), a set of sixty theory-guided reasoning patterns scored by large language models (LLMs). In a pr...
arXiv cs.CL
Christopher W. Karvetski, Sheldon S. Huang, Simas Kučinskas, Nadja Flechner, Jingyu Hu, Philip Tetlock, Ezra Karger