Blog LLMs & Texto Dados & Embeddings

T2D-Bench: Evidence-Gated Evaluation of LLM Outputs for Type 2 Diabetes Using a Multi-Layer Clinical-Lifestyle Knowledge Graph

arXiv:2606.24145v1 Announce Type: new Abstract: Large language models (LLMs) can produce clinically fluent recommendations for type 2 diabetes while failing to satisfy guideline constraints or explicitly justify lifestyle-related glycemic claims. We present T2D-Bench, a reproducible benchmark and evidence-gated evaluation framework for testing whether LLM outputs satisfy explicit, graph-checkable evidence requirements. T2D-Bench is built on a multi-layer clinical-lifestyle knowledge graph that c...

arXiv cs.AI ·Saba A. Farahani, Hung Cao, Ramesh Jain, Amir M. Rahmani · 24 de janeiro de 2026

Ver no Hugging Face

// relacionados

T2D-Bench: Evidence-Gated Evaluation of LLM Outputs for Type 2 Diabetes Using a Multi-Layer Clinical-Lifestyle Knowledge Graph

Leia também

Europe is pushing back on Washington’s chip war

Comfy-Org/Krea-2

Cerebras stock plunges after earnings as CEO says margin outlook was misunderstood

OpenAI and Broadcom announce chip designed for LLM inference at scale