From Knowing to Acting: Benchmarking Self-Awareness Capability of LLM Agents

arXiv:2606.20661v1. Abstract: The integration of external tools has turned LLM agents from passive responders into autonomous systems. However, current benchmarks prioritize execution success and neglect self-awareness capability: the ability to discern whether a problem requires external resources or can be solved with internal parametric knowledge. To address this, we introduce KAPRO (Knowing-Acting Quadrant PRObe), a framework that evaluates cognitive-behavioral...

arXiv cs.AI · Yifan Li, Shengbin Yue, Boyu Feng, Jinhu Qi, Bo Ke, Zixing Song, Hongru Wang, Zhongyu Wei, Irwin King · January 23, 2026
