Radar de IA: News, Models, and Papers

Less is More: Lightweight Prompt Compression for Question Answering Applications on Edge Devices

arXiv:2606.20571v1 Announce Type: new Abstract: In agent-driven question answering (QA) applications, retrieval-augmented generation (RAG) is commonly introduced to enhance the response accuracy of large language models (LLMs) by providing additional context. Due to the inherent noise in retrieval results and the coarse granularity of document-level retrieval, the retrieved context often contains substantial redundant information. In this setting, the agent prompt, consisting of the user query a...

23.06.2026

Blog LLMs & Text

Quality and Agreement in Multilabel Emotion Annotation: A Case Study and Evaluation Framework

arXiv:2606.21069v1 Announce Type: new Abstract: Emotion annotation is inherently subjective, yet most NLP pipelines still assume "gold" labels, typically produced by majority voting, and treat annotator variation as noise. In this paper, we present a multilabel emotion annotation case study and use it to examine how annotator behavior and aggregation choices affect both agreement estimates and downstream emotion classifiers. Rather than collapsing disagreement into a single label, we represent t...

23.06.2026

Blog Data & Embeddings

XmoPipe: A Pipeline for Large-Scale In-the-Wild Human Motion Dataset Construction

arXiv:2606.20731v1 Announce Type: new Abstract: Large-scale human motion datasets are essential for training robust motion models for analysis, synthesis, and understanding. While marker-based motion capture provides precise data, it is costly and limited in scale and diversity. Recent advances in monocular motion capture and video-language understanding open the way to extract plausible motion from unconstrained online videos. We present a scalable pipeline for constructing in-the-wild human mo...

23.06.2026

Blog Data & Embeddings

Bridging Multi-Valued Heuristics and Dimensionality Reduction in Multi-Objective Search

arXiv:2606.20644v1 Announce Type: new Abstract: Multi-objective shortest-path (MOSP) algorithms traditionally rely on single-valued heuristics (SVHs), which associate each state with a single admissible cost vector. While SVHs provide safe lower bounds, they fail to capture the trade-off structure of the Pareto frontier and often yield weak search guidance. Multi-valued heuristics (MVHs) address this limitation by mapping states to sets of cost estimates, enabling a richer approximation of possi...

23.06.2026

Blog LLMs & Text

MIRAGE: Stealthy Visual Prompt Injection for Vulnerability Detection in Web Agents

arXiv:2606.20717v1 Announce Type: new Abstract: Multimodal Large Language Model (MLLM)-based web agents provide practical, high-precision solutions for visual browser automation; however, they inherently expand the attack surface, introducing novel vision-based vulnerabilities. Existing adversarial evaluations targeting these agents frequently rely on permissive threat models and visually conspicuous artifacts. In this paper, we investigate a constrained vulnerability detection setting: a truste...

23.06.2026

Blog LLMs & Text

Physics-Guided Fully Convolutional Spatiotemporal Learning Toward Digital-Twin-Enabled Microstructure Evolution Prediction

arXiv:2606.20983v1 Announce Type: new Abstract: Understanding and predicting microstructure evolution is central to materials design, yet purely data-driven spatiotemporal learning models often suffer from limited physical consistency and degraded long-term prediction accuracy. In this work, we introduce a physics-guided fully convolutional spatiotemporal learning framework for microstructure evolution prediction. Unlike prior self-supervised approaches, the proposed method explicitly incorporat...

23.06.2026

Blog LLMs & Text

The Metanym Game: A Self-Contained, Self-Consistent LLM Peer-Community Benchmark for Structural Intelligence

arXiv:2606.21008v1 Announce Type: new Abstract: The metanym game is a competitive word game for LLMs that measures structural intelligence against established cognitive-science constructs. No content is given in advance; the contestants create all of it -- a new kind of analogy test, analogical production falsifiable sentence by sentence, with no fixed test set to leak into training (contamination-resistant by construction). In the council-of-peers benchmark, the contestants also rate each other...

23.06.2026

Blog LLMs & Text

GRAG: Generic Response-Augmented Generation Framework for Personalized Conversational Systems

arXiv:2606.21097v1 Announce Type: new Abstract: Deploying highly capable personalized conversational agents in resource-constrained or privacy-sensitive environments remains a significant challenge. We identify a fundamental bottleneck in the existing approaches: current training paradigms treat personalization and grounding as a single monolithic learning problem. Under these paradigms, language models are forced to simultaneously address what to say (content grounding) and how to say it in a u...

23.06.2026

Blog Robotics & RL

Perturbation-Based Uncertainty for Failure Detection in Vision-Language-Action Models

arXiv:2606.20754v1 Announce Type: new Abstract: Vision-Language-Action (VLA) models have shown strong performance in robotic manipulation, but reliable uncertainty quantification remains challenging, particularly under distribution shift. Unlike autoregressive policies, many modern VLA models generate continuous actions through regression or flow-based generation, where explicit predictive probabilities are unavailable. Moreover, existing approaches often rely on stochastic action sampling or su...

23.06.2026

Blog Robotics & RL

Real-World Deployment of Massively Parallel Sampling-Based MPC for Contact-Rich Manipulation

arXiv:2606.20712v1 Announce Type: new Abstract: Sampling-based Model Predictive Control (SMPC) is a promising strategy for contact-rich robotic manipulation, combining gradient-free optimization with massively parallel GPU simulation. Yet, most prior work relies on simplified dynamics or remains confined to simulation. We present an MPC framework that leverages JAX for large-scale parallelization and efficient computation, coupled with the high-fidelity MuJoCo MJX simulator, and deploy it on a F...

23.06.2026

Blog LLMs & Text

Short-Term Electricity Demand Forecasting for New England Using a Hybrid Transformer-XGBoost Framework with Weather, Calendar, and COVID-19 Indicators

arXiv:2606.20918v1 Announce Type: new Abstract: Accurate short-term electricity demand forecasting is critical for reliable power system operation, energy market planning, and infrastructure optimization. This paper presents a hybrid framework combining a Transformer encoder for temporal feature extraction with gradient-boosted decision trees (XGBoost) for daily electricity demand forecasting across New England. The framework integrates meteorological observations from six cities spanning all si...

23.06.2026

Blog LLMs & Text

Beyond Fixed Budgets: Characterizing the Inelasticity and Limitations of Tree-of-Thought Reasoning Strategies

arXiv:2606.20599v1 Announce Type: new Abstract: Tree of Thought (ToT) search has become a promising direction for improving the reasoning capabilities of large language models, but deploying these methods in practice raises a question that has received little systematic attention: how do different search strategies behave under varying compute budgets, model sizes, and problem difficulties? In this work, we evaluate two representative ToT methods; DPTS, a Monte Carlo tree search based approach, ...

23.06.2026
