Radar de IA: News, Models, and Papers

CIExplainer++: Generating Causal and Interpretable Explanations for Graph Neural Networks

arXiv:2606.20747v1 Announce Type: new Abstract: Explainable Artificial Intelligence aims to make black-box models more trustworthy by presenting, in a human-understandable manner, the elements that lead to the model's output. This involves both (i) identifying components and connections with genuine causal influence on outputs and (ii) translating such structures into an interpretable representation. For the former, we introduce CIExplainer, a novel perturbation-based method grounded in causal i...

23.06.2026

Blog LLMs & Text

Confidence Laundering in Agent Systems: Why Uncertainty Needs a Latent Carrier

arXiv:2606.20662v1 Announce Type: new Abstract: Modern agent systems can turn uncertainty into overconfidence. Fragile upstream decisions are often exposed to downstream components as clean intermediate artifacts, while the uncertainty behind those decisions is lost at the interface. As a result, local ambiguity can become system-level error amplification. We argue that this reveals an interface bottleneck in agent uncertainty propagation: uncertainty does not propagate simply because a trajecto...

23.06.2026

Blog Robotics & RL

Vesta: A Generalist Embodied Reasoning Model

arXiv:2606.20905v1 Announce Type: new Abstract: Robots operating in open-world environments must seamlessly integrate localization, spatial reasoning, navigation, and long-horizon planning. While specialist models excel at individual tasks, deploying a multi-model stack is computationally expensive and prone to cascading errors. We present Vesta, a unified embodied generalist that consolidates these capabilities into a single foundation model. Our approach combines a diverse and massive curated ...

23.06.2026

Blog Robotics & RL

ReFPO: Reflow Regularization for Flow Matching Policy Gradients

arXiv:2606.21086v1 Announce Type: new Abstract: We present Reflow-regularized Flow Matching Policy Gradients (ReFPO), a simple online RL method that adds explicit Reflow regularization to FPO for efficient flow-based control. We uncover a key structural property: the gradient updates in Flow Matching Policy Gradients (FPO) can be interpreted as an implicit advantage-weighted Reflow process, providing a new geometric perspective on flow-based policy gradients. Building on this insight, ReFPO intr...

23.06.2026

Blog LLMs & Text

Video2Code: Generating Interactive Webpages from UI Videos via Action-Aware Revisit

arXiv:2606.20711v1 Announce Type: new Abstract: UI videos provide a natural input for generating interactive webpages, as they capture both webpage appearance and action-triggered state transitions. However, directly applying video-capable vision-language models to this task remains insufficient. Existing models typically rely on sparse sampling or compressed temporal representations, which may miss short action boundaries and break the state-action-state transitions needed to implement webpage ...

23.06.2026

Blog LLMs & Text

On the Identifiability of User Adaptation in Co-Adaptive Neural Interfaces

arXiv:2606.20569v1 Announce Type: new Abstract: We analyze identifiability in co-adaptive human-machine systems. We show that closed-loop encoder estimates do not uniquely identify user adaptation, but instead reflect properties of the joint system. We discuss implications for interpreting behavioral adaptation and propose conditions for identification.

23.06.2026

Blog Data & Embeddings

CDER-SME: A Cross-Device Event-RGB Micro-Expression Dataset under Multi-Level Stress Induction

arXiv:2606.20715v1 Announce Type: new Abstract: Micro-expression recognition (MER) in realistic scenarios demands high temporal sensitivity and ecological validity, yet existing benchmarks are largely constrained to laboratory-controlled settings and rigid hardware-coupled sensing. We introduce CDER-SME, a cross-device Event-RGB dataset collected under a multi-level stress induction framework (cognitive and social) to elicit spontaneous emotional leakage. To enable reproducible acquisition with ...

23.06.2026

Blog Robotics & RL

Massive Activations Are Architecturally Robust: A Controlled Scratch/Commitment Residual Stream Test

arXiv:2606.20743v1 Announce Type: new Abstract: Trained transformers reliably develop massive activations, a small number of hidden dimensions whose magnitude is far above the median and which concentrate on the sequence-start token. Whether these outliers are a removable artifact of the residual stream's overloaded read and write role, or instead a functional necessity, is actively debated. We test the artifact hypothesis directly, with an architectural intervention. Our architecture, Ledger Re...

23.06.2026

Blog LLMs & Text

MAGNIFIED: RL Fine-tuning of Multimodal Large Language Models for Motion Planning

arXiv:2606.20641v1 Announce Type: new Abstract: Multi-modal Large Language Models (MLLMs) have demonstrated remarkable capabilities in semantic understanding and common sense reasoning, making them promising candidates for solving planning problems in autonomous driving. However, the next-token text prediction objectives traditionally used in pre-training and supervised fine-tuning (SFT) of MLLMs may fall short of fulfilling the planning objectives for autonomous vehicles. The next-token predict...

23.06.2026

Blog LLMs & Text

Democratizing and accelerating AI-driven pathology research through agentic intelligence

arXiv:2606.20677v1 Announce Type: new Abstract: Computational pathology has advanced rapidly with the emergence of foundation models, yet widespread adoption remains limited by substantial technical complexity and programming requirements. Here we present PathLab, an autonomous agentic framework that translates natural-language research objectives into executable and validated computational pathology workflows through the structured composition of domain-specific skills and tools. By organizing ...

23.06.2026

Blog Image Generation

Storyline Trees: Hierarchical Representations for Long-Form Narratives

arXiv:2606.20900v1 Announce Type: new Abstract: Long-form narratives are challenging for long-context models because their structure is implicit: events, characters, and plotlines interact across hundreds of pages without the explicit cues that guide navigation in structured documents. We address this by constructing storyline trees, hierarchical representations that organize narratives from global themes and major plotlines to fine-grained events. We first segment chapters into contiguous narra...

23.06.2026

Blog Computer Vision

VTOS: Learning to Orchestrate Vision Tools by Co-Searching Solutions and Observers

arXiv:2606.20728v1 Announce Type: new Abstract: Vision foundation tools such as open-vocabulary detectors, segmentation models, and post-processing operators are powerful building blocks for computer vision, but their effectiveness depends heavily on how they are orchestrated: which tools are used, in what order, with what parameters, and under what visual conditions. Existing visual-programming agents typically generate a fixed solution pipeline, making them brittle under dense objects, occlusi...

23.06.2026
