Robotics & RL - Radar de IA

Factor-Aware Mixture-of-Experts with Pretrained Encoder for Combinatorial Generalization

arXiv:2606.21100v1. Abstract: The integration of pretrained encoders with diffusion policies has become a dominant paradigm for visual robotic manipulation. However, it still struggles to generalize across complex environments with varying factors such as lighting and surface textures. To address this, we propose FAME, a framework that integrates a factor-aware mixture-of-experts (MoE) with a pretrained encoder to enhance generalization to environmental variations. FAME follows...

23.06.2026

Blog Robotics & RL

CIExplainer++: Generating Causal and Interpretable Explanations for Graph Neural Networks

arXiv:2606.20747v1. Abstract: Explainable Artificial Intelligence aims to make black-box models more trustworthy by presenting, in a human-understandable manner, the elements that lead to the model's output. This involves both (i) identifying components and connections with genuine causal influence on outputs and (ii) translating such structures into an interpretable representation. For the former, we introduce CIExplainer, a novel perturbation-based method grounded in causal i...

23.06.2026

Blog Robotics & RL

Vesta: A Generalist Embodied Reasoning Model

arXiv:2606.20905v1. Abstract: Robots operating in open-world environments must seamlessly integrate localization, spatial reasoning, navigation, and long-horizon planning. While specialist models excel at individual tasks, deploying a multi-model stack is computationally expensive and prone to cascading errors. We present Vesta, a unified embodied generalist that consolidates these capabilities into a single foundation model. Our approach combines a diverse and massive curated ...

23.06.2026

Blog Robotics & RL

ReFPO: Reflow Regularization for Flow Matching Policy Gradients

arXiv:2606.21086v1. Abstract: We present Reflow-regularized Flow Matching Policy Gradients (ReFPO), a simple online RL method that adds explicit Reflow regularization to FPO for efficient flow-based control. We uncover a key structural property: the gradient updates in Flow Matching Policy Gradients (FPO) can be interpreted as an implicit advantage-weighted Reflow process, providing a new geometric perspective on flow-based policy gradients. Building on this insight, ReFPO intr...

23.06.2026

Blog Robotics & RL

Massive Activations Are Architecturally Robust: A Controlled Scratch/Commitment Residual Stream Test

arXiv:2606.20743v1. Abstract: Trained transformers reliably develop massive activations, a small number of hidden dimensions whose magnitude is far above the median and which concentrate on the sequence-start token. Whether these outliers are a removable artifact of the residual stream's overloaded read and write role, or instead a functional necessity, is actively debated. We test the artifact hypothesis directly, with an architectural intervention. Our architecture, Ledger Re...

23.06.2026

Blog Robotics & RL

Geometric Entropy: When Trajectory Diversity Helps and Hurts in Imitation Learning

arXiv:2606.20871v1. Abstract: We study how trajectory-shape diversity in demonstrations affects imitation learning (IL) performance across models, tasks, and data scales. We introduce Geometric Entropy (H_G), a task-agnostic metric that quantifies the intrinsic diversity of transit trajectories after normalizing away extrinsic variation, such as goal pose and workspace scale, via target-frame alignment. Across multiple IL architectures and both simulated and real-robot contact-...

23.06.2026

Blog Robotics & RL

Formalizing Task-Space Complexity for Zero-Shot Generalization

arXiv:2606.20967v1. Abstract: Policies must operate across diverse conditions, yet a single policy is often conservative while fully adaptive schemes can be complex. We study zero-shot generalization in contextual dynamical systems and introduce a performance-centric, directional task dissimilarity--the signed divergence--that upper bounds the generalization gap from a source context to a target context. The signed divergence induces $\varepsilon$-tolerance sets that certify wh...

23.06.2026

Blog Robotics & RL

World Action Models: A Survey

arXiv:2606.20781v1. Abstract: World Action Models (WAMs) are embodied predictive-action models that make a forecast of the future available to action. Recent WAMs repurpose large video generation models, and a parallel line relies on language or vision-language backbones without a video-generation core. This rapid expansion has blurred the boundary among broad world models, video generation models, action-grounded video world models, Vision-Language-Action policies, and WAMs. T...

23.06.2026

Blog Robotics & RL

Inductive Generalization for Robotic Manipulation

arXiv:2606.20999v1. Abstract: Understanding the generalization capabilities of visuomotor policies is essential in the development of capable robotic agents. Generalizable models learn structures that transfer across domains. However, in practice, visuomotor policies test performance by interpolation on known distributions using unstructured domain shifts (e.g. lighting, clutter, diverse objects). We argue that to measure generalization capabilities we must instead test the ind...

23.06.2026

Blog LLMs & Text

Scalable Hierarchical Attention Transformers for Multi-Turn Jailbreak Detection in Long Conversations

arXiv:2606.21082v1. Abstract: Multi-turn jailbreaks can evade turn-level moderation by spreading unsafe intent across a dialogue through gradual escalation, reframing, and role manipulation. We address multi-turn jailbreak detection as a conversation-level classification problem and introduce an efficient hierarchical detector that avoids expensive long-context concatenation while retaining cross-turn reasoning. The model encodes individual turns to form compact turn representa...

23.06.2026

Blog Robotics & RL

SafeDojo: Safe Reinforcement Learning for VLA via Interactive World Model

arXiv:2606.20698v1. Abstract: Safe control is a prerequisite for real-world embodied intelligence, for which safe reinforcement learning has emerged as a promising paradigm. However, existing safe reinforcement learning methods either require costly real-world exploration or depend on hand-crafted safety functions. Neither scales to vision-language-action models deployed in open-world physical environments. We propose SafeDojo, the first model-based safe reinforcement learning ...

23.06.2026

Blog Robotics & RL

Tessellated Biomes: Distributed Robotic Assemblies for Architectural Resilience

arXiv:2606.20647v1. Abstract: This paper presents Tessellated Biomes, a cyber-physical framework for the adaptive robotic construction and reconfiguration of modular multi-material assemblies. It challenges the linear lifecycle of standard construction by fusing (1) local microfactory fabrication, (2) discrete multi-material optimization, and (3) distributed robotic assembly into a unified circular process of spatial adaptation. The research details methods for the digital fabr...

23.06.2026