Data & Embeddings - Radar de IA

Open Annotations and Synthetic Data for Field Localisation in Indian Bank Cheques

arXiv:2606.20682v1 (announce type: new). Abstract: Automated cheque processing requires localising key fields (date, legal amount, IFSC code, account number, signature, and payee name) before any recognition step. The IDRBT Cheque Image Dataset is, to our knowledge, the only public collection of Indian bank cheques, but it ships without field annotations and with no stated licence, so its redistribution terms are unclear. We address both limitations. First, we release six-field bounding-box annotat...

23.06.2026

Blog LLMs & Text

Topic-to-Timestamp Alignment by Constrained Evidence Selection

arXiv:2606.20890v1 (announce type: new). Abstract: Meeting archives are difficult to search when users remember what was discussed but not when. We study topic-to-timestamp alignment: given a natural-language topic and a timestamped meeting transcript, the goal is to return the time at which the topic is discussed. A standard RAG setup can retrieve relevant transcript excerpts, but still asks the language model to generate a timestamp, which can produce unsupported or invalid timecodes. We therefor...

23.06.2026

Blog Data & Embeddings

FirstPass: Grounding AI Scientific Judgment in Multi-Round Editorial Outcomes

arXiv:2606.20769v1 (announce type: new). Abstract: AI systems for peer review fail on three fronts: they train on Computer Science and Machine Learning venues alone, ignore the iterative dialogue that validates science, and evaluate on stylistic mimicry rather than real editorial judgment. We introduce FirstPass, a dataset and fine-tuned model that addresses all three. Curating 3,668 complete multi-round peer-review dialogues from Nature Communications across five scientific domains (biology, chemi...

23.06.2026

Blog Image Generation

MotionPyramid: Hierarchical Motion Representation and Residual Interfaces

arXiv:2606.20705v1 (announce type: new). Abstract: We ask whether the representational hierarchy seen in perception, from local primitives such as edges to higher level structures such as parts and objects, can be established for motion. In humanoid control, low level actions specify immediate motor commands, while meaningful behavior is organized over longer temporal scales, including contacts, gait fragments, balance recovery, reaching, and whole body skills. We introduce MotionPyramid, a hierarc...

23.06.2026

Blog LLMs & Text

Specifying AI-SDLC Processes: A Protocol Language for Human-Agent Boundaries

arXiv:2606.20615v1 (announce type: new). Abstract: AI agents now participate as first-class team members across the software development lifecycle, yet no specification language exists for expressing the human-agent responsibility boundaries, approval gates, and governance constraints this collaboration requires. Existing approaches encode process in agent prompts (subject to drift), target adjacent domains (workflow management, business processes), or address only fragments (access control, approv...

23.06.2026

Blog LLMs & Text

B[FM]$^2$: Brain Foundation Model via Flow Matching with SplitUNet

arXiv:2606.20812v1 (announce type: new). Abstract: EEG foundation models can learn generalizable representations from large-scale EEG corpora to enable single-backbone transfer across diverse clinical and brain-computer interface tasks. Existing models typically discretize the continuous multi-channel EEG waveform into patches or codebook tokens and train a transformer with masked self-supervision. Recognizing that this discretization fragments continuous brain rhythms and obscures fine-grained tem...

23.06.2026

Blog Multimodal

AEF-Econ: Toward Plug-and-Play Socioeconomic Foundation Embeddings from AlphaEarth for Urban Remote Sensing

arXiv:2606.20697v1 (announce type: new). Abstract: AlphaEarth Foundations (AEF) unify global remote sensing foundation embeddings through multimodal self-supervised learning, but their pretraining focuses on physical land-surface signals, limiting plug-and-play use in socioeconomic tasks. We integrate seven heterogeneous data streams across 36 Chinese cities over eight years - AEF embeddings, population, nighttime lights, remote sensing indices, points of interest (POIs), urban morphology, and cros...

23.06.2026

Blog LLMs & Text

From Knowing to Acting: Benchmarking Self-Awareness Capability of LLM Agents

arXiv:2606.20661v1 (announce type: new). Abstract: The integration of external tools has transitioned LLM agents from passive responders to autonomous systems. However, current benchmarks prioritize execution success, neglecting self-awareness capability, the ability to discern whether a problem requires necessary external resources or can be solved via internal parametric knowledge. To address this, we introduce KAPRO (Knowing-Acting Quadrant PRObe), a framework that evaluates cognitive-behavioral...

23.06.2026

Blog Multimodal

Evidential Fusion Network for Multimodal Survival Prediction under Missing Modalities

arXiv:2606.20757v1 (announce type: new). Abstract: Recent multimodal survival prediction models have demonstrated strong predictive performance by leveraging complementary information across modalities. However, such models generally assume data completeness and exhibit limited robustness toward missing modalities, which are frequently encountered in real-world clinical settings. We propose the Evidential Missing Modality Survival Fusion (EMMS) model for multimodal survival prediction under missing...

23.06.2026

Blog Data & Embeddings

Understanding Latent Flow Models for Tabular Data Synthesis: Targets, Paths, and Sampling

arXiv:2606.20878v1 (announce type: new). Abstract: Synthetic tabular data enables microdata sharing in regulated domains, yet deploying continuous-time generative models requires balancing analytical utility, disclosure risk, and computational cost. Latent-space flow models are flexible, but theoretical equivalences across learning targets, probability paths, and sampling dynamics can translate into different behaviour under finite-step integration and explicit compute budgets. We present an empiri...

23.06.2026

Blog LLMs & Text

Confidence Laundering in Agent Systems: Why Uncertainty Needs a Latent Carrier

arXiv:2606.20662v1 (announce type: new). Abstract: Modern agent systems can turn uncertainty into overconfidence. Fragile upstream decisions are often exposed to downstream components as clean intermediate artifacts, while the uncertainty behind those decisions is lost at the interface. As a result, local ambiguity can become system-level error amplification. We argue that this reveals an interface bottleneck in agent uncertainty propagation: uncertainty does not propagate simply because a trajecto...

23.06.2026

Blog Data & Embeddings

CDER-SME: A Cross-Device Event-RGB Micro-Expression Dataset under Multi-Level Stress Induction

arXiv:2606.20715v1 (announce type: new). Abstract: Micro-expression recognition (MER) in realistic scenarios demands high temporal sensitivity and ecological validity, yet existing benchmarks are largely constrained to laboratory-controlled settings and rigid hardware-coupled sensing. We introduce CDER-SME, a cross-device Event-RGB dataset collected under a multi-level stress induction framework (cognitive and social) to elicit spontaneous emotional leakage. To enable reproducible acquisition with ...

23.06.2026