AI Radar: News, Models, and Papers

MV-WAM: Manifold-Aware World Action Model with Value Augmentation

arXiv:2606.21088v1 Announce Type: new Abstract: Achieving robust and generalizable manipulation across diverse environments remains a fundamental challenge in embodied robotics. Recent world action models achieve strong in-domain performance, yet their gains do not extend proportionally to out-of-distribution scenarios. We attribute this to a structural mismatch between visual and action modalities, whose intrinsically heterogeneous manifolds cause joint optimization to disproportionately degrad...

23.06.2026

Blog LLMs & Text

MindAlign: Decoding Inner Speech from fMRI Signals via Multimodal Embedding Alignment under Limited Data

arXiv:2606.20696v1 Announce Type: new Abstract: Decoding inner speech from non-invasive brain signals remains a fundamental challenge due to the absence of overt linguistic output, limited training data, and large inter-subject variability. Existing brain-to-text approaches often rely on task-specific decoder fine-tuning, which restricts scalability and complicates adaptation to new participants. We propose MindAlign, a decoupled two-stage brain-to-language framework that enables open-ended text...

23.06.2026

Blog LLMs & Text

ARGUSTRACK: A Multi-View Annotation System for Multi-Object Tracking

arXiv:2606.20687v1 Announce Type: new Abstract: Multi-Camera Multi-Target (MCMT) tracking has emerged as a critical capability for applications ranging from autonomous driving to animal behavior monitoring. While recent advances have yielded sophisticated tracking algorithms, the availability of annotated multi-view data remains a significant bottleneck. Existing annotation tools predominantly support single-camera workflows or rely on LiDAR sensors, making cross-view labeling tedious and imprac...

23.06.2026

Blog LLMs & Text

A Viscosity Semigroup Framework for Stable Image Reconstruction

arXiv:2606.20620v1 Announce Type: new Abstract: Starting from the axiomatic formulation of scale-space theory, we develop a viscosity-solution framework for multiscale image representations arising from degenerate elliptic-parabolic partial differential equations. Rather than introducing a new semigroup theory, we work within the standard viscosity-solution setting, using comparison principles to obtain well-posedness, uniqueness, and contraction in the supremum norm. This perspective is used to...

23.06.2026

Blog LLMs & Text

EmoInstruct-TTS: Dual-Path Instruction-Guided Emotional Speech Synthesis

arXiv:2606.20650v1 Announce Type: new Abstract: Instruction-based controllable speech synthesis enables users to specify emotions through natural language. However, existing approaches often rely on coarse emotion labels and lack explicit modeling of fine-grained intensity. We propose EmoInstruct-TTS, a dual-path instruction-guided framework for emotional speech synthesis. We introduce Emotion2embed, a supervised semantic-acoustic emotion embedding covering 48 emotional states, including fine-gr...

23.06.2026

Blog LLMs & Text

$\Omega$: Operator-based Mixture Ensemble for Generative Assimilation

arXiv:2606.20920v1 Announce Type: new Abstract: Characterizing non-Gaussian posterior distributions in partially observed high-dimensional nonlinear systems remains a fundamental challenge in data assimilation. Ensemble Kalman filters rely on Gaussian approximations that can be inaccurate for strongly non-Gaussian posteriors, whereas particle filters suffer from severe scalability limitations. Recent score-based generative approaches improve posterior characterization but typically require super...

23.06.2026

Blog Robotics & RL

Heterogeneous Policy Networks for Composite Robot Team Communication and Coordination

arXiv:2606.20962v1 Announce Type: new Abstract: High-performing human-human teams learn intelligent and efficient communication and coordination strategies to maximize their joint utility. These teams implicitly understand the different roles of heterogeneous team members and adapt their communication protocols accordingly. Multi-Agent Reinforcement Learning (MARL) has attempted to develop computational methods for synthesizing such joint coordination-communication strategies, but emulating hete...

23.06.2026

Blog LLMs & Text

A UAV-Based Multi-Modal Vision System for Automated Sideslope Deformation Monitoring and Hazard Detection

arXiv:2606.20681v1 Announce Type: new Abstract: Slope hazards constitute a major safety threat to expressway infrastructure, and their evolution is typically manifested as slow surface deformation. Conventional manual inspection suffers from low efficiency and inadequate operational safety, especially on severely deteriorated slopes. Accordingly, there is an urgent need for an automated, high-precision solution capable of large-area slope observation and analysis. This study aims to develop a hi...

23.06.2026

Blog LLMs & Text

Expected Free Energy-based Planning as Variational Inference

arXiv:2606.20658v1 Announce Type: new Abstract: Planning under uncertainty requires agents to balance goal achievement with information gathering. Active inference addresses this through the Expected Free Energy (EFE), a cost function that unifies instrumental and epistemic objectives. However, existing EFE-based methods typically employ specialized optimization procedures that are difficult to extend or analyze. In this paper, we show that EFE-based planning can be formulated as Variational Fre...

23.06.2026

Blog LLMs & Text

UniSLAD: A Unified Framework for Structural and Logical Industrial Visual Anomaly Detection

arXiv:2606.20768v1 Announce Type: new Abstract: Visual anomaly detection is a fundamental task in industrial automation. While existing approaches have achieved notable progress in identifying structural defects, the detection of logical anomalies remains relatively underexplored. In practice, structural and logical anomalies frequently co-occur in industrial workflows. Therefore, a solution capable of detecting both structural and logical anomalies is crucial for advancing comprehensive anomaly...

23.06.2026

Blog Robotics & RL

One Image is All You Need: Agentic One-Shot Image Generation via Text-Based World Models for Long-Tail Spatial Perception

arXiv:2606.20764v1 Announce Type: new Abstract: Reliable spatial decision automation, such as autonomous driving and maritime surveillance, critically depends on robust visual perception. However, real-world spatiotemporal data exhibits severe heterogeneity, often manifesting as extreme long-tail distributions for safety-critical scenarios. This data scarcity induces dataset shift that degrades detection performance and poses safety risks. While synthetic data generation offers a potential soluti...

23.06.2026

Blog Data & Embeddings

Temporal Causal Prior-Data Fitted Networks for Panel Data with Learned Reliability Signals

arXiv:2606.20889v1 Announce Type: new Abstract: Estimating causal effects in industrial time series requires handling temporal dynamics, time-varying treatments, and unobserved confounders. Existing causal foundation models (CausalPFN, CausalFM) operate only on static cross-sectional data; neural temporal methods (CRN, G-Net) require per-dataset training; and concurrent temporal-PFN proposals have not been demonstrated at industrial scale. None output explicit per-pair reliability signals alongs...

23.06.2026
