Radar de IA: News, Models, and Papers

RIZZ: Routing Interactions to Near Zero-Interference Zones for Continual Adaptation of Black-Box Agents

arXiv:2606.20638v1 (new). Abstract: Large language models are increasingly deployed as long-lived agents that must adapt across users, tasks, domains, modalities, and feedback regimes without access to model weights. Existing black-box adaptation methods typically optimize a single prompt, maintain an undifferentiated memory, or rely on repeated rollout-heavy search. However, these designs struggle when streams of input are nonstationary, feedback is sparse, and failures from one tas...

23.06.2026

Blog Data & Embeddings

D2HDMap: Non-visible Driveline Map Prior for Online Vectorized HD Map Prediction

arXiv:2606.20725v1 (new). Abstract: Accurate, up-to-date representations of road structures are critical for the safe operation of autonomous vehicles. Existing systems rely either on costly, maintenance-heavy high-definition (HD) maps which compromise safety when outdated, or purely sensor-based online mapping which struggles with long-range reliability and occlusion. Systems incorporating map prior information into online mapping seek to overcome drawbacks of both approaches by com...

23.06.2026

Blog Image Generation

Hierarchical Pooling for Sheaf Neural Networks

arXiv:2606.20932v1 (new). Abstract: Sheaf Neural Networks (SNNs) generalize Graph Neural Networks (GNNs) by replacing scalar node signals with stalk-valued signals and by using restriction maps to measure compatibility across edges. Unlike standard graph diffusion, which encourages neighboring node features to become similar, sheaf diffusion promotes consistency through the restriction maps and can therefore model more general relationships between neighboring nodes. However, existin...

23.06.2026
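
For readers new to the idea, the contrast drawn in the sheaf abstract above (consistency through restriction maps rather than plain similarity of neighbors) can be illustrated with a toy sketch. This is not the paper's method or its pooling scheme: it only builds the standard sheaf Laplacian for a single edge, and the node names, the rotation used as a restriction map, and the step size `alpha` are all assumptions chosen for illustration.

```python
def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def transpose(m):
    return [list(col) for col in zip(*m)]

def edge_disagreement(x, edges):
    """Sum over edges of ||F_u x_u - F_v x_v||^2, measured in the edge stalk."""
    total = 0.0
    for u, v, f_u, f_v in edges:
        diff = [a - b for a, b in zip(matvec(f_u, x[u]), matvec(f_v, x[v]))]
        total += sum(d * d for d in diff)
    return total

def sheaf_laplacian_apply(x, edges):
    """Apply L = delta^T delta to a stalk-valued signal x (dict: node -> vector)."""
    out = {n: [0.0] * len(vec) for n, vec in x.items()}
    for u, v, f_u, f_v in edges:
        diff = [a - b for a, b in zip(matvec(f_u, x[u]), matvec(f_v, x[v]))]
        back_u = matvec(transpose(f_u), diff)
        back_v = matvec(transpose(f_v), diff)
        out[u] = [o + b for o, b in zip(out[u], back_u)]
        out[v] = [o - b for o, b in zip(out[v], back_v)]
    return out

def diffusion_step(x, edges, alpha=0.25):
    """One explicit diffusion step: x <- x - alpha * L x."""
    lx = sheaf_laplacian_apply(x, edges)
    return {n: [xi - alpha * li for xi, li in zip(x[n], lx[n])] for n in x}

# Hypothetical two-node graph; the restriction maps are an identity and a 90-degree rotation.
identity = [[1.0, 0.0], [0.0, 1.0]]
rotation = [[0.0, -1.0], [1.0, 0.0]]
edges = [("u", "v", identity, rotation)]

# The two node features differ, yet agree once mapped into the edge stalk.
consistent = {"u": [1.0, 0.0], "v": [0.0, -1.0]}
# Identical node features, which the rotation makes inconsistent on the edge.
mismatched = {"u": [1.0, 0.0], "v": [1.0, 0.0]}

print(sheaf_laplacian_apply(consistent, edges))
print(edge_disagreement(mismatched, edges),
      edge_disagreement(diffusion_step(mismatched, edges), edges))
```

In this toy setup the signal with different node features is left unchanged by diffusion, while the signal with identical node features is pushed toward agreement under the restriction maps, which is the sense in which sheaf diffusion can model relationships other than similarity.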

Blog Data & Embeddings

NeuroShield: A Device-Agnostic Foundation Model for EEG Authentication

arXiv:2606.20673v1 (new). Abstract: A central challenge in EEG authentication is that models are typically tied to the acquisition settings in which they are trained. In particular, variations in headset hardware, channel layout, and signal duration create heterogeneous recordings that existing models are not designed to handle, causing each new headset or dataset to be treated as a separate model-development problem. This fragmentation limits multi-dataset learning, hinders knowledg...

23.06.2026

Blog LLMs & Text

Phonemes to the Rescue: Multilingual Tokenization Based on International Phonetic Alphabet

arXiv:2606.20993v1 (new). Abstract: Multilingual language models often exhibit performance disparities across languages that can arise as early as the tokenization stage. Widely-used subword tokenization approaches favor high-resource languages, and tokenizer-free methods still yield longer sequences for scripts with a higher bytes-per-character ratio. To address these shortcomings, we propose to use the International Phonetic Alphabet (IPA) as a language-agnostic input representatio...

23.06.2026

Blog Robotics & RL

Latent Goal Prediction from Language for Model-Based Planning

arXiv:2606.20627v1 (new). Abstract: Planning with world models is bottlenecked by compounding prediction errors and the difficulty of defining optimizable goals. Visual targets provide precise local gradients but poor distant guidance, while language is flexible yet limited by noisy cross-modal alignment or dependence on large generative models unsuited for the high-sampling nature of model-based planning. To address these challenges, we introduce Latent Goal Prediction from Language...

23.06.2026

Blog LLMs & Text

Investigating Linguistic Steering: An Analysis of Adjectival Effects Across Large Language Model Architectures

arXiv:2606.20572v1 (new). Abstract: Achieving reliable control of Large Language Models (LLMs) requires a precise, scalable understanding of how they interpret linguistic cues. We introduce a rigorous framework using Shapley values to quantify the steering effect of individual adjectives on model performance, moving beyond anecdotal heuristics to principled attribution. Applying this method to 100 adjectives across a diverse suite of models (including o3, gpt-4o-mini, phi-3, llama-3-...

23.06.2026
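
As a rough illustration of the Shapley-value attribution named in the linguistic-steering abstract above, the sketch below computes exact Shapley values for three adjectives by averaging marginal contributions over all orderings. It is not the paper's framework: the adjectives, the `toy_score` function, and its numbers are invented for the example, and exact enumeration is only feasible for a handful of items (a study of 100 adjectives would presumably need sampling or another approximation).

```python
from itertools import permutations
from math import factorial

def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        prev = value(coalition)
        for p in order:
            coalition = coalition | {p}
            cur = value(coalition)
            totals[p] += cur - prev
            prev = cur
    n_orders = factorial(len(players))
    return {p: t / n_orders for p, t in totals.items()}

def toy_score(adjectives):
    """Hypothetical task score for a prompt containing the given adjectives."""
    score = 0.50
    if "careful" in adjectives:
        score += 0.10
    if "concise" in adjectives:
        score -= 0.02
    # Invented interaction: "careful" and "expert" help more together.
    if "careful" in adjectives and "expert" in adjectives:
        score += 0.04
    return score

adjectives = ["careful", "concise", "expert"]
phi = shapley_values(adjectives, toy_score)
for adjective, contribution in phi.items():
    print(f"{adjective}: {contribution:+.3f}")
```

The attributions sum to the difference between the score with all adjectives and the score with none, and the interaction bonus is split evenly between the two adjectives that produce it.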

Blog Robotics & RL

MMGNN: Multi-level, multi-color graph neural networks for molecular property prediction

arXiv:2606.20906v1 (new). Abstract: Molecular message-passing neural networks commonly propagate chemically diverse interactions through a single graph, which may mix interaction-specific signals and require deep propagation to capture long-range effects. We introduce the Multi-level, Multi-color Graph Neural Network (MMGNN), a hierarchical framework that decomposes a molecular graph into overlapping atom-type-pair-specific subgraphs while preserving atom-level resolution. MMGNN-2D c...

23.06.2026

Blog LLMs & Text

DrugBench: Evaluating AI Control Protocols for Medication Harm Mitigation

arXiv:2606.20663v1 (new). Abstract: Large Language Models have the potential to expand and improve the access to clinical information by enabling new ways of interacting with medical knowledge in natural language. However, their deployment in medical question-answering settings is safety-critical, since misaligned outputs can lead to severe patient harm. AI control is an emerging approach that introduces external safeguards to mitigate unsafe behaviours in misaligned systems and has ...

23.06.2026

Blog LLMs & Text

Demographic Metadata as Construct-Irrelevant Noise in DistilBERT-Based Automated Essay Scoring

arXiv:2606.21066v1 (new). Abstract: Automated Essay Scoring (AES) systems are increasingly used to support teachers in managing grading workloads and to provide a supplementary rater in large-scale assessments. While human grading is frequently influenced by students' demographic characteristics, the efficacy of different strategies for integrating demographic metadata with textual input used to train AES models remains underexplored. This study investigates the impact of a specific ...

23.06.2026

Blog LLMs & Text

Scaling Diverse Language Generation for 3D Visual Grounding

arXiv:2606.20946v1 (new). Abstract: Developing robust models for 3D visual grounding (3DVG), the localization of entities in a 3D scene described in natural language, is important for enabling agents to correspond spatial language with objects in the physical world. However, the lack of diverse descriptions at scale prevents models from generalizing beyond simple linguistic patterns. Recent such attempts lack diversity in the constraint types and language used to ground objects. Capt...

23.06.2026

Blog LLMs & Text

Skill Coverage: A Test Adequacy Metric for Agent Skills

arXiv:2606.20659v1 (new). Abstract: Agent skills encode reusable procedural knowledge that guides large language model agents across tasks and execution contexts. Existing evaluations primarily assess skills through task level outcomes, yet task success alone does not reveal which parts of a skill have been exercised or which remain untested. We introduce skill coverage, a test adequacy metric that treats the skill artifact as the object under test. Our approach extracts observable s...

23.06.2026
