Radar de IA: AI News, Models, and Papers

Post-Training Recipe, More Than Model Family, Shapes Multi-Agent LLM Conversational Behavior

arXiv:2606.20632v1 Announce Type: new Abstract: Multi-LLM systems use multiple language models to deliberate, judge each other's outputs, or coordinate as agents. Their value depends on the models producing measurably different conversational behaviors when given the same input. Prior offline studies recommend drawing one model per family for behavioral diversity, because LLMs prefer outputs from their own family when rating one another in isolation. Whether the same family label predicts behavi...

23.06.2026

Blog LLMs & Text

B[FM]$^2$: Brain Foundation Model via Flow Matching with SplitUNet

arXiv:2606.20812v1 Announce Type: new Abstract: EEG foundation models can learn generalizable representations from large-scale EEG corpora to enable single-backbone transfer across diverse clinical and brain-computer interface tasks. Existing models typically discretize the continuous multi-channel EEG waveform into patches or codebook tokens and train a transformer with masked self-supervision. Recognizing that this discretization fragments continuous brain rhythms and obscures fine-grained tem...

23.06.2026

Blog Multimodal

AEF-Econ: Toward Plug-and-Play Socioeconomic Foundation Embeddings from AlphaEarth for Urban Remote Sensing

arXiv:2606.20697v1 Announce Type: new Abstract: AlphaEarth Foundations (AEF) unify global remote sensing foundation embeddings through multimodal self-supervised learning, but their pretraining focuses on physical land-surface signals, limiting plug-and-play use in socioeconomic tasks. We integrate seven heterogeneous data streams across 36 Chinese cities over eight years - AEF embeddings, population, nighttime lights, remote sensing indices, points of interest (POIs), urban morphology, and cros...

23.06.2026

Blog LLMs & Text

From Knowing to Acting: Benchmarking Self-Awareness Capability of LLM Agents

arXiv:2606.20661v1 Announce Type: new Abstract: The integration of external tools has transitioned LLM agents from passive responders to autonomous systems. However, current benchmarks prioritize execution success, neglecting self-awareness capability, the ability to discern whether a problem requires necessary external resources or can be solved via internal parametric knowledge. To address this, we introduce KAPRO (Knowing-Acting Quadrant PRObe), a framework that evaluates cognitive-behavioral...

23.06.2026

Blog Robotics & RL

Learning-Based Modeling of Soft Robots via Cosserat Rod Theory

arXiv:2606.20958v1 Announce Type: new Abstract: Modeling soft robot dynamics is challenging due to their continuum structure and typically nonlinear dynamics. Creating models based on first-order principles is typically time-demanding, and their expressiveness is limited, whereas data-driven models lack interpretability and physical consistency. This work aims to overcome these challenges by introducing a port-Hamiltonian Gaussian Process Regression framework for learning and simulating the dyna...

23.06.2026

Blog Robotics & RL

Factor-Aware Mixture-of-Experts with Pretrained Encoder for Combinatorial Generalization

arXiv:2606.21100v1 Announce Type: new Abstract: The integration of pretrained encoders with diffusion policies has become a dominant paradigm for visual robotic manipulation. However, it still struggles to generalize across complex environments with varying factors such as lighting and surface textures. To address this, we propose FAME, a framework that integrates a factor-aware mixture-of-experts (MoE) with a pretrained encoder to enhance generalization to environmental variations. FAME follows...

23.06.2026

Blog LLMs & Text

LLM-Based Multi-Reference Evaluation for Efficient and Robust Assessment of Phrase Break Annotations

arXiv:2606.21098v1 Announce Type: new Abstract: Reliable evaluation of phrase break annotations is crucial, as subtle variations in prosodic boundaries directly affect the clarity and naturalness of speech. However, existing approaches exhibit major limitations: single-reference evaluation assumes a unique gold phrasing for an utterance despite multiple valid phrasings, while human judgment, though flexible, is labor-intensive and unscalable. To address these, we propose LLM-based Multi-Referenc...

23.06.2026

Blog LLMs & Text

In LLM Reasoning, there is Irrationality on top of Value Misalignment

arXiv:2606.20624v1 Announce Type: new Abstract: Significant progress has been made in aligning LLMs with target value functions. We argue that, even when an LLM has been well aligned in (post-)training, it may still fail to maximise the aligned value in reasoning. We mathematically formalise this gap as rational value risk: the utility discrepancy between a model's deployed reasoning strategy and its rational counterpart, which is defined to be the responses that maximise expected utility in the...

23.06.2026

Blog Multimodal

Beyond 'One Language, One Script': Quantifying Orthographic Bias in Multilingual VLMs with PuMVR

arXiv:2606.20770v1 Announce Type: new Abstract: Current Vision-Language Models (VLMs) are celebrated for their multilingual capabilities, yet they operate under a flawed assumption: that one language corresponds to a single writing system. This overlooks billions of users of multi-script languages like Punjabi, Serbian, Hindi-Urdu, Kurdish, among many others, for whom a model's capability may be fractured by orthographic bias. We introduce PuMVR (Punjabi Multimodal Visual Reasoning), the first b...

23.06.2026

Blog Multimodal

Evidential Fusion Network for Multimodal Survival Prediction under Missing Modalities

arXiv:2606.20757v1 Announce Type: new Abstract: Recent multimodal survival prediction models have demonstrated strong predictive performance by leveraging complementary information across modalities. However, such models generally assume data completeness and exhibit limited robustness toward missing modalities, which are frequently encountered in real-world clinical settings. We propose the Evidential Missing Modality Survival Fusion (EMMS) model for multimodal survival prediction under missing...

23.06.2026

Blog Data & Embeddings

Understanding Latent Flow Models for Tabular Data Synthesis: Targets, Paths, and Sampling

arXiv:2606.20878v1 Announce Type: new Abstract: Synthetic tabular data enables microdata sharing in regulated domains, yet deploying continuous-time generative models requires balancing analytical utility, disclosure risk, and computational cost. Latent-space flow models are flexible, but theoretical equivalences across learning targets, probability paths, and sampling dynamics can translate into different behaviour under finite-step integration and explicit compute budgets. We present an empiri...

23.06.2026

Blog LLMs & Text

GEOPHYS: The Geometry of Physical Plausibility

arXiv:2606.20707v1 Announce Type: new Abstract: While humans can identify physically implausible events within milliseconds, machine learning approaches addressing the same problem are extremely slow and expensive. They either rely on external multimodal-LLM judges or require ad-hoc modifications to the training procedure. In this work, we argue that indicators of physical plausibility are implicitly captured by five geometric properties of the per-frame embeddings produced by frozen image encod...

23.06.2026
