Data & Embeddings - Radar de IA

Peeking Inside LLMs: Leveraging Internal Artifacts of LLMs for Enhancing Reliability in Legal Classification

arXiv:2606.20929v1 (announce type: new). Abstract: Large Language Models (LLMs) are increasingly being adopted in the legal domain. However, despite their strong performance, LLMs are prone to generating incorrect or hallucinated outputs, raising serious concerns about their reliability in high-stakes domains such as law. Detecting the correctness of responses of LLM-based systems is therefore a critical challenge. In this work, we explore the potential of leveraging internal artifacts of LLM to de...

23.06.2026

Blog Data & Embeddings

ELADO: Elliptic PDE Assessment Datasets for Operator Learning

arXiv:2606.20771v1 (announce type: new). Abstract: We introduce ELADO (Elliptic PDE Assessment Datasets for Operator Learning), a systematic benchmark suite constructed to show and quantify failure modes of neural operator architectures when learning solution operators of elliptic PDEs. While the benchmarks of existing datasets focus on average-case performance, the ELADO datasets are constructed to highlight challenges that arise naturally in elliptic PDE problems. In particular, we construct seve...

23.06.2026

Blog Data & Embeddings

Temporal Causal Prior-Data Fitted Networks for Panel Data with Learned Reliability Signals

arXiv:2606.20889v1 (announce type: new). Abstract: Estimating causal effects in industrial time series requires handling temporal dynamics, time-varying treatments, and unobserved confounders. Existing causal foundation models (CausalPFN, CausalFM) operate only on static cross-sectional data; neural temporal methods (CRN, G-Net) require per-dataset training; and concurrent temporal-PFN proposals have not been demonstrated at industrial scale. None output explicit per-pair reliability signals alongs...

23.06.2026

Blog Data & Embeddings

D2HDMap: Non-visible Driveline Map Prior for Online Vectorized HD Map Prediction

arXiv:2606.20725v1 (announce type: new). Abstract: Accurate, up-to-date representations of road structures are critical for the safe operation of autonomous vehicles. Existing systems rely either on costly, maintenance-heavy high-definition (HD) maps which compromise safety when outdated, or purely sensor-based online mapping which struggles with long-range reliability and occlusion. Systems incorporating map prior information into online mapping seek to overcome drawbacks of both approaches by com...

23.06.2026

Blog Image Generation

Hierarchical Pooling for Sheaf Neural Networks

arXiv:2606.20932v1 (announce type: new). Abstract: Sheaf Neural Networks (SNNs) generalize Graph Neural Networks (GNNs) by replacing scalar node signals with stalk-valued signals and by using restriction maps to measure compatibility across edges. Unlike standard graph diffusion, which encourages neighboring node features to become similar, sheaf diffusion promotes consistency through the restriction maps and can therefore model more general relationships between neighboring nodes. However, existin...

23.06.2026
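
To make the contrast between graph diffusion and sheaf diffusion in the abstract above concrete, here is a minimal sketch on a single edge with two-dimensional stalks. It is an illustration of the general idea only, not the paper's pooling method: the restriction maps `F_U` and `F_V`, the step size `alpha`, and all function names are assumptions chosen for the example. Each step is gradient descent on the edge disagreement `||F_U x_u - F_V x_v||^2`.

```python
# Sketch of sheaf diffusion on one edge (u, v). All names and values here
# are hypothetical and chosen for illustration, not taken from the paper.

def matvec(m, v):
    """Multiply a small dense matrix (list of rows) by a vector."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def transpose(m):
    return [list(row) for row in zip(*m)]

def disagreement(f_u, f_v, x_u, x_v):
    """Difference of the two node signals after mapping them onto the edge."""
    a, b = matvec(f_u, x_u), matvec(f_v, x_v)
    return [p - q for p, q in zip(a, b)]

def energy(f_u, f_v, x_u, x_v):
    """Squared disagreement on the edge; zero means the signals are consistent."""
    return sum(c * c for c in disagreement(f_u, f_v, x_u, x_v))

def diffusion_step(f_u, f_v, x_u, x_v, alpha=0.1):
    """One gradient step on the energy (the factor 2 is absorbed into alpha)."""
    d = disagreement(f_u, f_v, x_u, x_v)
    g_u = matvec(transpose(f_u), d)
    g_v = matvec(transpose(f_v), d)
    new_u = [x - alpha * g for x, g in zip(x_u, g_u)]
    new_v = [x + alpha * g for x, g in zip(x_v, g_v)]
    return new_u, new_v

# Assumed restriction maps: identity on u, negation on v. Consistency then
# means x_u == -x_v, a relationship that plain averaging cannot express.
F_U = [[1.0, 0.0], [0.0, 1.0]]
F_V = [[-1.0, 0.0], [0.0, -1.0]]

if __name__ == "__main__":
    x_u, x_v = [1.0, 0.0], [0.0, 1.0]
    for _ in range(200):
        x_u, x_v = diffusion_step(F_U, F_V, x_u, x_v)
    print(x_u, x_v, energy(F_U, F_V, x_u, x_v))
```

With identity maps on both ends the same update reduces to ordinary smoothing that pulls the two node signals together; with the negation map chosen here the signals are instead driven toward opposite values.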

Blog Data & Embeddings

NeuroShield: A Device-Agnostic Foundation Model for EEG Authentication

arXiv:2606.20673v1 (announce type: new). Abstract: A central challenge in EEG authentication is that models are typically tied to the acquisition settings in which they are trained. In particular, variations in headset hardware, channel layout, and signal duration create heterogeneous recordings that existing models are not designed to handle, causing each new headset or dataset to be treated as a separate model-development problem. This fragmentation limits multi-dataset learning, hinders knowledg...

23.06.2026

Blog LLMs & Text

Skill Coverage: A Test Adequacy Metric for Agent Skills

arXiv:2606.20659v1 (announce type: new). Abstract: Agent skills encode reusable procedural knowledge that guides large language model agents across tasks and execution contexts. Existing evaluations primarily assess skills through task-level outcomes, yet task success alone does not reveal which parts of a skill have been exercised or which remain untested. We introduce skill coverage, a test adequacy metric that treats the skill artifact as the object under test. Our approach extracts observable s...

23.06.2026

Blog LLMs & Text

A Quantum-Assisted Agentic Distributed Artificial Intelligence Framework for Deadline-Bounded Orchestration of Hybrid Renewable Microgrids

arXiv:2606.20667v1 (announce type: new). Abstract: The real-time orchestration of microgrids that combine fluctuating renewable sources, dispatchable units, storage and curtailable consumers requires the repeated solution of combinatorial dispatch and coalition formation problems under hard control deadlines. In this paper, a quantum-assisted agentic distributed artificial intelligence (DAI) framework is proposed in which the dispatch problem of each control slot is formulated as a quadratic uncons...

23.06.2026

Blog Multimodal

An approach with Visual and Tabular Mamba to multimodal medical data using Mixed Fusion

arXiv:2606.20738v1 (announce type: new). Abstract: This article presents a complementary approach for integrating multimodal medical data in cancer classification, based on state space models represented by the Mamba architecture. To this end, a mixed multimodal fusion architecture, called Mixed Fusion, was employed and developed to enhance the interpretability of the decision-making process. The proposed approach explores two variants of Mamba: one dedicated to visual processing, responsible for c...

23.06.2026

Blog Data & Embeddings

Detecting Satellites in Radio-Frequency Data via Semi-Supervised Learning

arXiv:2606.20976v1 (announce type: new). Abstract: Radio-frequency (RF) monitoring is essential for space domain awareness, but it often generates large, variable, and sparsely populated datasets with few labels. These observations can capture satellites, space debris, and the ionospheric background, yet interpreting them typically requires specialized subject-matter expertise. Supervised deep learning methods can perform well on labeled RF data, but they require many annotated examples and may nee...

23.06.2026

Blog LLMs & Text

DEMM-Bench: A Cross-Regime Benchmark for Agent-Runtime Governance-Evidence Sufficiency

arXiv:2606.20634v1 (announce type: new). Abstract: Agent-runtime systems emit traces, ledgers, provenance graphs, policy logs, delegation tokens, cache events, and tool-firewall records, but those containers do not necessarily answer governance questions about a specific decision. DEMM-Bench is a cross-regime benchmark for agent-runtime governance-evidence sufficiency, grounded in the Decision Evidence Maturity Model (DEMM): it measures whether records across eight evidence regimes are sufficient t...

23.06.2026

Blog Robotics & RL

PoLAR: Factorizing Extent and Mode in Latent Actions for Robot Policy Learning

arXiv:2606.21139v1 (announce type: new). Abstract: Latent action pretraining learns representations of visual change from pairs of observations, but existing methods typically encode each transition as a single unstructured representation that entangles transition extent and transition mode. We introduce Polar Latent Actions with Radial structure (PoLAR), which imposes a radial-direction structure on latent actions, encouraging radius to encode transition extent and direction to retain transition m...

23.06.2026
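
The radius and direction split mentioned in the PoLAR abstract can be pictured with the generic polar decomposition of a vector. The sketch below shows only that decomposition; it is not the PoLAR model or its training objective, and the function names `polar_split` and `polar_join` are hypothetical.

```python
import math

# Generic polar decomposition of a latent vector z into (radius, direction).
# Illustrative only: function names are hypothetical, and nothing here
# reproduces how PoLAR learns or regularizes its latent actions.

def polar_split(z):
    """Return (radius, unit direction); a zero vector gets a zero direction."""
    radius = math.sqrt(sum(c * c for c in z))
    if radius == 0.0:
        return 0.0, [0.0 for _ in z]
    return radius, [c / radius for c in z]

def polar_join(radius, direction):
    """Rebuild the latent vector from its radius and direction."""
    return [radius * c for c in direction]

if __name__ == "__main__":
    small = [3.0, 4.0]
    large = [6.0, 8.0]  # same direction as `small`, twice the magnitude
    print(polar_split(small))
    print(polar_split(large))
```

Scaling a vector changes only its radius and leaves its direction untouched, which is the property that makes it plausible to read the radius as "how much changed" and the direction as "what kind of change".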