AI Radar: News, Models, and Papers

Specifying AI-SDLC Processes: A Protocol Language for Human-Agent Boundaries

arXiv:2606.20615v1. Abstract: AI agents now participate as first-class team members across the software development lifecycle, yet no specification language exists for expressing the human-agent responsibility boundaries, approval gates, and governance constraints this collaboration requires. Existing approaches encode process in agent prompts (subject to drift), target adjacent domains (workflow management, business processes), or address only fragments (access control, approv...

23.06.2026

Blog Robotics & RL

Physics-Guided Dual-Stream Heterogeneous Graph Neural Network for Predicting Full-Field Structural Response of Stiffened Panels

arXiv:2606.20916v1. Abstract: Iterative design and optimization of large, complex structures require fast and accurate prediction of stress, displacement, and other fields. Finite element analysis (FEA) is computationally expensive for this task. Existing neural network surrogates often struggle with varying topologies and complex boundary conditions. This study proposes the novel Dual-Stream Heterogeneous Graph Neural Network (DS-HGNN) for full-field stress and displacement pr...

23.06.2026

Blog LLMs & Text

Human Decision-Making with AI Assistance under Correlated Features

arXiv:2606.20628v1. Abstract: Humans increasingly make decisions with AI assistance; for example, doctors may follow AI-recommended diagnostic tests and base their diagnoses on the results. A natural question is which tests should AI recommend to balance short-term decision quality and long-term human learning when different features (e.g., test results) are correlated. While prior work establishes that stationary policies that recommend the same tests repeatedly are optimal wh...

23.06.2026

Blog LLMs & Text

Post-Training Recipe, More Than Model Family, Shapes Multi-Agent LLM Conversational Behavior

arXiv:2606.20632v1. Abstract: Multi-LLM systems use multiple language models to deliberate, judge each other's outputs, or coordinate as agents. Their value depends on the models producing measurably different conversational behaviors when given the same input. Prior offline studies recommend drawing one model per family for behavioral diversity, because LLMs prefer outputs from their own family when rating one another in isolation. Whether the same family label predicts behavi...

23.06.2026

Blog LLMs & Text

B[FM]$^2$: Brain Foundation Model via Flow Matching with SplitUNet

arXiv:2606.20812v1. Abstract: EEG foundation models can learn generalizable representations from large-scale EEG corpora to enable single-backbone transfer across diverse clinical and brain-computer interface tasks. Existing models typically discretize the continuous multi-channel EEG waveform into patches or codebook tokens and train a transformer with masked self-supervision. Recognizing that this discretization fragments continuous brain rhythms and obscures fine-grained tem...

23.06.2026

Blog Multimodal

AEF-Econ: Toward Plug-and-Play Socioeconomic Foundation Embeddings from AlphaEarth for Urban Remote Sensing

arXiv:2606.20697v1. Abstract: AlphaEarth Foundations (AEF) unify global remote sensing foundation embeddings through multimodal self-supervised learning, but their pretraining focuses on physical land-surface signals, limiting plug-and-play use in socioeconomic tasks. We integrate seven heterogeneous data streams across 36 Chinese cities over eight years - AEF embeddings, population, nighttime lights, remote sensing indices, points of interest (POIs), urban morphology, and cros...

23.06.2026

Blog LLMs & Text

From Knowing to Acting: Benchmarking Self-Awareness Capability of LLM Agents

arXiv:2606.20661v1. Abstract: The integration of external tools has transitioned LLM agents from passive responders to autonomous systems. However, current benchmarks prioritize execution success, neglecting self-awareness capability, the ability to discern whether a problem requires necessary external resources or can be solved via internal parametric knowledge. To address this, we introduce KAPRO (Knowing-Acting Quadrant PRObe), a framework that evaluates cognitive-behavioral...

23.06.2026

Blog Robotics & RL

Learning-Based Modeling of Soft Robots via Cosserat Rod Theory

arXiv:2606.20958v1. Abstract: Modeling soft robot dynamics is challenging due to their continuum structure and typically nonlinear dynamics. Creating models based on first-order principles is typically time-demanding, and their expressiveness is limited, whereas data-driven models lack interpretability and physical consistency. This work aims to overcome these challenges by introducing a port-Hamiltonian Gaussian Process Regression framework for learning and simulating the dyna...

23.06.2026

Blog Robotics & RL

Factor-Aware Mixture-of-Experts with Pretrained Encoder for Combinatorial Generalization

arXiv:2606.21100v1. Abstract: The integration of pretrained encoders with diffusion policies has become a dominant paradigm for visual robotic manipulation. However, it still struggles to generalize across complex environments with varying factors such as lighting and surface textures. To address this, we propose FAME, a framework that integrates a factor-aware mixture-of-experts (MoE) with a pretrained encoder to enhance generalization to environmental variations. FAME follows...

23.06.2026

Blog LLMs & Text

LLM-Based Multi-Reference Evaluation for Efficient and Robust Assessment of Phrase Break Annotations

arXiv:2606.21098v1. Abstract: Reliable evaluation of phrase break annotations is crucial, as subtle variations in prosodic boundaries directly affect the clarity and naturalness of speech. However, existing approaches exhibit major limitations: single-reference evaluation assumes a unique gold phrasing for an utterance despite multiple valid phrasings, while human judgment, though flexible, is labor-intensive and unscalable. To address these, we propose LLM-based Multi-Referenc...

23.06.2026

Blog LLMs & Text

In LLM Reasoning, there is Irrationality on top of Value Misalignment

arXiv:2606.20624v1. Abstract: Significant progress has been made in aligning LLMs with target value functions. We argue that, even when an LLM has been well aligned in (post-)training, it may still fail to maximise the aligned value in reasoning. We mathematically formalise this gap as rational value risk: the utility discrepancy between a model's deployed reasoning strategy and its rational counterpart, which is defined to be the responses that maximise expected utility in the...

23.06.2026

Blog Multimodal

Beyond 'One Language, One Script': Quantifying Orthographic Bias in Multilingual VLMs with PuMVR

arXiv:2606.20770v1. Abstract: Current Vision-Language Models (VLMs) are celebrated for their multilingual capabilities, yet they operate under a flawed assumption: that one language corresponds to a single writing system. This overlooks billions of users of multi-script languages like Punjabi, Serbian, Hindi-Urdu, Kurdish, among many others, for whom a model's capability may be fractured by orthographic bias. We introduce PuMVR (Punjabi Multimodal Visual Reasoning), the first b...

23.06.2026
