Causal Discovery in the Era of Agents
Language models should assist causal discovery workflows by providing contextual support and explanations rather than generating causal conclusions, as demonstrated through a platf…
Trending papers, models, and datasets on Hugging Face, plus the official blog, with editorial commentary in Portuguese.
A failure detection framework for long-horizon robotic tasks uses action-conditioned world models and functional conformal prediction to monitor manipulation trajectories with only…
A bi-modal construction domain dataset combining stereo RGB and LiDAR data under challenging environmental conditions is introduced for autonomous system perception research.
AOHP presents an Android-based operating system framework that treats AI agents as first-class entities, enhancing task completion rates and reducing execution costs through specia…
PhoneBuddy combines real and mock app environments to improve training of open models for phone use, demonstrating enhanced task success rates through mixed reinforcement learning…
UniverSat introduces a Universal Patch Encoder for Vision Transformers that enables robust, sensor-agnostic spatial feature extraction across diverse Earth Observation data types.
Computer-use agents can execute software tasks through either graphical interfaces or programmatic command interfaces, but existing evaluations confound interaction modality with d…
Standard LLM agents rely on plan content remaining in context rather than maintaining it as persistent state, with evidence shown through replay pairing diagnostics and compression…
A principled synthesis engine generates high-quality terminal-agent tasks through multi-dimensional capability taxonomy and evidence-guided research, creating a distilled dataset t…
DR-MV3D presents a map-grounded learning framework with dense rewards to improve multi-view 3D visual question answering through global map construction, view-trajectory planning,…
Procedural memory enhances LLM agents on workplace tasks through skill transfer across roles and models, with varying generalization capabilities affecting deployment strategies.
Premature commitment in long-horizon LLM agents leads to silent failures where agents defend early interpretations without considering alternatives, and hidden-state convergenc…