Radar de IA: News, Models, and Papers

Learning What Not to Forget: Long-Horizon Agent Memory from a Few Kilobytes of Learning

arXiv:2606.20954v1 Announce Type: new Abstract: Long-running language-model systems accumulate interaction history that outgrows the context window, so they must continually evict. When an eviction policy drops a load-bearing detail, for example an access token issued at login or a path the next call needs, the action fails. We present LRE (Learned Relevance Eviction), a few kilobytes, CPU-only, language-model-free scorer that learns which units of history are load-bearing and keeps them by verb...

23.06.2026

Blog Data & Embeddings

FirstPass: Grounding AI Scientific Judgment in Multi-Round Editorial Outcomes

arXiv:2606.20769v1 Announce Type: new Abstract: AI systems for peer review fail on three fronts: they train on Computer Science and Machine Learning venues alone, ignore the iterative dialogue that validates science, and evaluate on stylistic mimicry rather than real editorial judgment. We introduce FirstPass, a dataset and fine-tuned model that addresses all three. Curating 3,668 complete multi-round peer-review dialogues from Nature Communications across five scientific domains (biology, chemi...

23.06.2026

Blog LLMs & Text

Robust Image-Driven Phenotyping of Ovarian Tumor Cells using Optimized Dynamic Features in Hyperbolic Channels

arXiv:2606.20703v1 Announce Type: new Abstract: Label-free, image-based cellular mechanophenotyping in microfluidic devices provides a high-throughput method for single-cell profiling. However, while complex microchannels (e.g., hyperbolic geometries) reveal transient deformation dynamics under continuous extensional stress, the resulting high-dimensional feature spaces are highly susceptible to hydrodynamic artifacts. Flow rate variations often distort discriminative boundaries, linking feature...

23.06.2026

Blog LLMs & Text

From Sentiment to Actionable Insights: A Data-Driven Public Sentiment Analysis of Advanced Air Mobility

arXiv:2606.20751v1 Announce Type: new Abstract: Advanced Air Mobility (AAM) is an emerging low-altitude air transportation system whose successful deployment depends not only on technological advancement but also on public acceptance. This acceptance will drive government support, regulations, noise standards, and willingness to fly, and in turn the overall commercial viability of AAM. Understanding public sentiment toward AAM is therefore essential for identifying its societal barriers and info...

23.06.2026

Blog LLMs & Text

PEAR: Permutation-Equivariant Adaptive Routing Multi-Agent Debate

arXiv:2606.20621v1 Announce Type: new Abstract: Multi-agent debate improves the reliability of large language models (LLMs) through iterative peer critiques. However, fixed topologies often introduce persistent positional biases, amplify unreliable agents, and cause high sensitivity to role assignments. We introduce Permutation-Equivariant Adaptive Routing Multi-Agent Debate (PEAR), an inference-time protocol that dynamically reconfigures communication roles and sparse topologies across...

23.06.2026

Blog LLMs & Text

A Multi-Agent Audit Framework for High-Stakes Reasoning: Evaluation and Interpretability in Clinical Mental Health Screening

arXiv:2606.21123v1 Announce Type: new Abstract: High-stakes reasoning tasks necessitate transparent and verifiable workflows, yet conventional single-model large language models (LLMs) often struggle with hallucination and low interpretability under zero-shot paradigms. To address this general AI challenge, we propose a Multi-Agent Audit Framework that simulates a collaborative, multi-step verification process. We empirically validate this architecture in the sensitive domain of clinical mental ...

23.06.2026

Blog LLMs & Text

Right Knowledge, Wrong Answer: Test-Time Steering for Temporal Fact Conflicts in Open-Weight Language Models

arXiv:2606.20959v1 Announce Type: new Abstract: Large language models can store both outdated facts and newer superseding facts in their parameters, but standard prompting may still elicit the outdated answer. We formalize this problem as Parametric Temporal Conflict (PTC) and introduce Temporal Attractor Steering (TAS), a three-stage test-time intervention that detects likely conflicts, identifies a conflict-critical layer, and steers hidden states toward newer-fact representations without retr...

23.06.2026

Blog Image Generation

MotionPyramid: Hierarchical Motion Representation and Residual Interfaces

arXiv:2606.20705v1 Announce Type: new Abstract: We ask whether the representational hierarchy seen in perception, from local primitives such as edges to higher level structures such as parts and objects, can be established for motion. In humanoid control, low level actions specify immediate motor commands, while meaningful behavior is organized over longer temporal scales, including contacts, gait fragments, balance recovery, reaching, and whole body skills. We introduce MotionPyramid, a hierarc...

23.06.2026

Blog Image Generation

BayesFP: Posterior Estimation for Flow-Based Policies via Feynman-Kac Sampling

arXiv:2606.21014v1 Announce Type: new Abstract: Robots must generate trajectories that remain faithful to learned expert behavior while satisfying safety constraints and task-specific objectives specified only at inference time. We formulate constrained trajectory generation for pretrained diffusion and flow-matching policies as Bayesian posterior sampling, with the learned demonstration distribution as a prior and an inference-time, cost-derived likelihood tilting it toward feasible, optimal tr...

23.06.2026

Blog LLMs & Text

Specifying AI-SDLC Processes: A Protocol Language for Human-Agent Boundaries

arXiv:2606.20615v1 Announce Type: new Abstract: AI agents now participate as first-class team members across the software development lifecycle, yet no specification language exists for expressing the human-agent responsibility boundaries, approval gates, and governance constraints this collaboration requires. Existing approaches encode process in agent prompts (subject to drift), target adjacent domains (workflow management, business processes), or address only fragments (access control, approv...

23.06.2026

Blog Robotics & RL

Physics-Guided Dual-Stream Heterogeneous Graph Neural Network for Predicting Full-Field Structural Response of Stiffened Panels

arXiv:2606.20916v1 Announce Type: new Abstract: Iterative design and optimization of large, complex structures require fast and accurate prediction of stress, displacement, and other fields. Finite element analysis (FEA) is computationally expensive for this task. Existing neural network surrogates often struggle with varying topologies and complex boundary conditions. This study proposes the novel Dual-Stream Heterogeneous Graph Neural Network (DS-HGNN) for full-field stress and displacement pr...

23.06.2026

Blog LLMs & Text

Human Decision-Making with AI Assistance under Correlated Features

arXiv:2606.20628v1 Announce Type: new Abstract: Humans increasingly make decisions with AI assistance; for example, doctors may follow AI-recommended diagnostic tests and base their diagnoses on the results. A natural question is which tests should AI recommend to balance short-term decision quality and long-term human learning when different features (e.g., test results) are correlated. While prior work establishes that stationary policies that recommend the same tests repeatedly are optimal wh...

23.06.2026
