32 results for "RAG"

Article

Fine-tuning vs. RAG: When to Use Each

LLMs & Text

Fine-tuning or RAG? Understand the difference, the costs, and how to decide between tuning a model and connecting it to your knowledge base.

Guide

Embeddings & RAG: The Memory of AIs

Data & Embeddings

Understand embeddings, semantic search, vector databases, and RAG: how to give a language model memory and sources, and why everything depends on the data.

Guide

LLMs: How Language Models Work

LLMs & Text

Understand once and for all how LLMs work: the transformer architecture, training, why they hallucinate, fine-tuning, RAG, quantization, and agents.

Article

AI Agents: What They Are and How They Think

LLMs & Text

AI agents explained: how an LLM goes from merely answering to using tools, planning, and acting, and why this is more fragile than it seems.

Article

RAG from Scratch: Search + Generation

Data & Embeddings

RAG explained step by step: index documents with chunking and embeddings, retrieve relevant passages, assemble the prompt, and generate the answer.

News

Less is More: Lightweight Prompt Compression for Question Answering Applications on Edge Devices

LLMs & Text

arXiv:2606.20571v1. In agent-driven question answering (QA) applications, retrieval-augmented generation (RAG) is commonly introduced to enhance the response accuracy of large language models (LLMs) by providing additional context. Due to the inherent noise in retrieval results and the coarse granularity of document-level retrieval, the retrieved context often contains substantial redundant information. In this setting, the agent prompt, consisting of the user query a...

News

MIRAGE: Stealthy Visual Prompt Injection for Vulnerability Detection in Web Agents

LLMs & Text

arXiv:2606.20717v1. Multimodal Large Language Model (MLLM)-based web agents provide practical, high-precision solutions for visual browser automation; however, they inherently expand the attack surface, introducing novel vision-based vulnerabilities. Existing adversarial evaluations targeting these agents frequently rely on permissive threat models and visually conspicuous artifacts. In this paper, we investigate a constrained vulnerability detection setting: a truste...

News

GRAG: Generic Response-Augmented Generation Framework for Personalized Conversational Systems

LLMs & Text

arXiv:2606.21097v1. Deploying highly capable personalized conversational agents in resource-constrained or privacy-sensitive environments remains a significant challenge. We identify a fundamental bottleneck in the existing approaches: current training paradigms treat personalization and grounding as a single monolithic learning problem. Under these paradigms, language models are forced to simultaneously address what to say (content grounding) and how to say it in a u...

News

Real-World Deployment of Massively Parallel Sampling-Based MPC for Contact-Rich Manipulation

Robotics & RL

arXiv:2606.20712v1. Sampling-based Model Predictive Control (SMPC) is a promising strategy for contact-rich robotic manipulation, combining gradient-free optimization with massively parallel GPU simulation. Yet, most prior work relies on simplified dynamics or remains confined to simulation. We present an MPC framework that leverages JAX for large-scale parallelization and efficient computation, coupled with the high-fidelity MuJoCo MJX simulator, and deploy it on a F...

News

A Gated Graph Neural Network Approach to Fast-Convergent Dynamic Average Estimation

LLMs & Text

arXiv:2606.20955v1. Dynamic average estimation is a critical problem in multi-agent systems, enabling agents to collaboratively estimate time-varying signals using only local information exchange. Traditional model-based approaches often face challenges related to convergence speed and sensitivity to network topology changes. This paper introduces a novel learning-based solution leveraging Gated Graph Neural Networks (GGNNs) for fast-convergent dynamic average estimat...

News

Peeking Inside LLMs: Leveraging Internal Artifacts of LLMs for Enhancing Reliability in Legal Classification

LLMs & Text

arXiv:2606.20929v1. Large Language Models (LLMs) are increasingly being adopted in the legal domain. However, despite their strong performance, LLMs are prone to generating incorrect or hallucinated outputs, raising serious concerns about their reliability in high-stakes domains such as law. Detecting the correctness of responses of LLM-based systems is therefore a critical challenge. In this work, we explore the potential of leveraging internal artifacts of LLM to de...

News

ELADO: Elliptic PDE Assessment Datasets for Operator Learning

Data & Embeddings

arXiv:2606.20771v1. We introduce ELADO (Elliptic PDE Assessment Datasets for Operator Learning), a systematic benchmark suite constructed to show and quantify failure modes of neural operator architectures when learning solution operators of elliptic PDEs. While the benchmarks of existing datasets focus on average case performance, the ELADO datasets are constructed to highlight challenges that arise naturally in elliptic PDE problems. In particular, we construct seve...

News

Hierarchical Pooling for Sheaf Neural Networks

Image Generation

arXiv:2606.20932v1. Sheaf Neural Networks (SNNs) generalize Graph Neural Networks (GNNs) by replacing scalar node signals with stalk-valued signals and by using restriction maps to measure compatibility across edges. Unlike standard graph diffusion, which encourages neighboring node features to become similar, sheaf diffusion promotes consistency through the restriction maps and can therefore model more general relationships between neighboring nodes. However, existin...

News

NeuroShield: A Device-Agnostic Foundation Model for EEG Authentication

Data & Embeddings

arXiv:2606.20673v1. A central challenge in EEG authentication is that models are typically tied to the acquisition settings in which they are trained. In particular, variations in headset hardware, channel layout, and signal duration create heterogeneous recordings that existing models are not designed to handle, causing each new headset or dataset to be treated as a separate model-development problem. This fragmentation limits multi-dataset learning, hinders knowledg...

News

Skill Coverage: A Test Adequacy Metric for Agent Skills

LLMs & Text

arXiv:2606.20659v1. Agent skills encode reusable procedural knowledge that guides large language model agents across tasks and execution contexts. Existing evaluations primarily assess skills through task level outcomes, yet task success alone does not reveal which parts of a skill have been exercised or which remain untested. We introduce skill coverage, a test adequacy metric that treats the skill artifact as the object under test. Our approach extracts observable s...

News

A Quantum-Assisted Agentic Distributed Artificial Intelligence Framework for Deadline-Bounded Orchestration of Hybrid Renewable Microgrids

LLMs & Text

arXiv:2606.20667v1. The real-time orchestration of microgrids that combine fluctuating renewable sources, dispatchable units, storage and curtailable consumers requires the repeated solution of combinatorial dispatch and coalition formation problems under hard control deadlines. In this paper, a quantum-assisted agentic distributed artificial intelligence (DAI) framework is proposed in which the dispatch problem of each control slot is formulated as a quadratic uncons...

News

DEMM-Bench: A Cross-Regime Benchmark for Agent-Runtime Governance-Evidence Sufficiency

LLMs & Text

arXiv:2606.20634v1. Agent-runtime systems emit traces, ledgers, provenance graphs, policy logs, delegation tokens, cache events, and tool-firewall records, but those containers do not necessarily answer governance questions about a specific decision. DEMM-Bench is a cross-regime benchmark for agent-runtime governance-evidence sufficiency, grounded in the Decision Evidence Maturity Model (DEMM): it measures whether records across eight evidence regimes are sufficient t...

News

PoLAR: Factorizing Extent and Mode in Latent Actions for Robot Policy Learning

Robotics & RL

arXiv:2606.21139v1. Latent action pretraining learns representations of visual change from pairs of observations, but existing methods typically encode each transition as a single unstructured representation that entangles transition extent and transition mode. We introduce Polar Latent Actions with Radial structure (PoLAR), which imposes a radial-direction structure on latent actions, encouraging radius to encode transition extent and direction to retain transition m...

News

Mind the Privileged-to-Camera Gap: Actor-Centric Sidecar Supervision for Camera-First Open-Loop Waypoint Prediction

LLMs & Text

arXiv:2606.20772v1. Camera-first autonomous-driving models predict future ego waypoints from images, ego-state features, and route commands, but waypoint supervision alone does not explicitly supervise actor-level representations of nearby road users. We study this as supervised representation learning for open-loop waypoint prediction. The deployable model uses multi-view RGB, ego state, and route command at inference. During training, simulator-derived sidecar label...

News

Robust Zero-Shot Generalization for Open-Vocabulary Action Recognition via Task Arithmetic

LLMs & Text

arXiv:2606.20734v1. Open Vocabulary Action Recognition (OVAR) enables the recognition of novel actions by leveraging vision-language representations, overcoming the limitations of traditional closed-set approaches. However, achieving robust performance in real-world scenarios typically requires domain-specific fine-tuning, which is often costly and raises privacy and regulatory concerns. In this work, we propose an alternative paradigm that bypasses target-domain trai...

News

Topic-to-Timestamp Alignment by Constrained Evidence Selection

LLMs & Text

arXiv:2606.20890v1. Meeting archives are difficult to search when users remember what was discussed but not when. We study topic-to-timestamp alignment: given a natural-language topic and a timestamped meeting transcript, the goal is to return the time at which the topic is discussed. A standard RAG setup can retrieve relevant transcript excerpts, but still asks the language model to generate a timestamp, which can produce unsupported or invalid timecodes. We therefor...

News

MotionPyramid: Hierarchical Motion Representation and Residual Interfaces

Image Generation

arXiv:2606.20705v1. We ask whether the representational hierarchy seen in perception, from local primitives such as edges to higher level structures such as parts and objects, can be established for motion. In humanoid control, low level actions specify immediate motor commands, while meaningful behavior is organized over longer temporal scales, including contacts, gait fragments, balance recovery, reaching, and whole body skills. We introduce MotionPyramid, a hierarc...

News

Specifying AI-SDLC Processes: A Protocol Language for Human-Agent Boundaries

LLMs & Text

arXiv:2606.20615v1. AI agents now participate as first-class team members across the software development lifecycle, yet no specification language exists for expressing the human-agent responsibility boundaries, approval gates, and governance constraints this collaboration requires. Existing approaches encode process in agent prompts (subject to drift), target adjacent domains (workflow management, business processes), or address only fragments (access control, approv...

News

B[FM]$^2$: Brain Foundation Model via Flow Matching with SplitUNet

LLMs & Text

arXiv:2606.20812v1. EEG foundation models can learn generalizable representations from large-scale EEG corpora to enable single-backbone transfer across diverse clinical and brain-computer interface tasks. Existing models typically discretize the continuous multi-channel EEG waveform into patches or codebook tokens and train a transformer with masked self-supervision. Recognizing that this discretization fragments continuous brain rhythms and obscures fine-grained tem...

News

Evidential Fusion Network for Multimodal Survival Prediction under Missing Modalities

Multimodal

arXiv:2606.20757v1. Recent multimodal survival prediction models have demonstrated strong predictive performance by leveraging complementary information across modalities. However, such models generally assume data completeness and exhibit limited robustness toward missing modalities, which are frequently encountered in real-world clinical settings. We propose the Evidential Missing Modality Survival Fusion (EMMS) model for multimodal survival prediction under missing...

News

Confidence Laundering in Agent Systems: Why Uncertainty Needs a Latent Carrier

LLMs & Text

arXiv:2606.20662v1. Modern agent systems can turn uncertainty into overconfidence. Fragile upstream decisions are often exposed to downstream components as clean intermediate artifacts, while the uncertainty behind those decisions is lost at the interface. As a result, local ambiguity can become system-level error amplification. We argue that this reveals an interface bottleneck in agent uncertainty propagation: uncertainty does not propagate simply because a trajecto...

News

Mirage: a Clean-Label Backdoor against LiDAR 3D Object Detection

Computer Vision

arXiv:2606.20752v1. Deep neural network-based LiDAR 3D object detection serves as a critical perception component in safety-critical autonomous systems. However, recent studies have revealed its vulnerability to backdoor attacks. Existing attacks typically require white-box access or label modification and focus on geometric attacks such as object disappearance or bounding-box manipulation. In this paper, we present Mirage, a black-box and clean-label backdoor attack ...

News

NVIDIA Vera CPU Opens the Way for Agentic Scientific AI at Los Alamos National Laboratory

Robotics & RL

Mission, Vision and Veritas — new Los Alamos National Laboratory (LANL) supercomputers to be built with HPE and NVIDIA — are tapping NVIDIA Vera CPUs to accelerate scientific discovery, unlocking agentic AI for science. The supercomputers will use the HPE Cray Supercomputing GX5000 architecture with the NVIDIA Vera Rubin platform, combining NVIDIA Vera CPUs, NVIDIA […]

News

PerceptionDLM: Diffusion Models Learn to Describe Multiple Image Regions at Once

Computer Vision

Researchers at Peking University combine a diffusion language model with a vision encoder to generate descriptions of multiple image regions in parallel, 3.4 times faster than sequential methods.

News

Training Open Models for Agentic Phone Use

LLMs & Text

PhoneBuddy combines real and mock app environments to improve training of open models for phone use, demonstrating enhanced task success rates through mixed reinforcement learning…

News

UniverSat: Resolution- and Modality-Agnostic Transformers for Earth Observation

Computer Vision

UniverSat introduces a Universal Patch Encoder for Vision Transformers that enables robust, sensor-agnostic spatial feature extraction across diverse Earth Observation data types.

News

Crawlee for Python: Build a Web Crawling Pipeline with Robots Handling, Link Graphs, and RAG Chunk Export

Robotics & RL

In this tutorial, we build a complete Crawlee for Python workflow from setup to AI-ready output. We generate a local demo website, then crawl it with BeautifulSoupCrawler, ParselCrawler, and PlaywrightCrawler. We extract titles, metadata, product fields, and JavaScript-rendered cards, and capture full-page screenshots. We then normalize the data, build a link graph, and export JSON, CSV, and RAG-ready JSONL chunks.