LLMs & Texto — Radar de IA

Learning Splitting Heuristics for Parallel String Solvers

arXiv:2606.20656v1 Announce Type: new Abstract: String constraint solvers are crucial for reasoning about string-manipulating programs. However, many practical string constraints are undecidable, and real-world applications often present complex constraints that challenge current solvers. The rise of multi-core architectures offers an opportunity for parallel solving. A key parallel solving method is cube-and-conquer, in which the quality of splitting heuristics is critical to effectively...

23.06.2026

Blog LLMs & Texto

Darwin Mobile Agent: A Roadmap for Self-Evolution

arXiv:2606.20622v1 Announce Type: new Abstract: The goal of artificial intelligence is to create agents capable of general, adaptive behaviour in open-ended environments. Guided by the "Bitter Lesson", we argue that the most effective path toward this goal is to systematically remove human priors and allow intelligence to naturally emerge through interaction with a "Big World" that is orders of magnitude more complex than the agent itself. We propose the mobile Graphical User Interface (GUI) as ...

23.06.2026

PeerCheck: Enhancing LLM-Generated Academic Reviews Towards Human-Level Quality

arXiv:2606.20897v1 Announce Type: new Abstract: As academic submissions grow, the traditional peer review process struggles to keep up, raising concerns about quality and fairness. A trend of using large language models (LLMs) for assistance has emerged. In this work, we take a critical step toward improving the quality of LLM-generated reviews. We propose the PeerCheck framework, which investigates LLM-human review differences (RQ1) and explores methods to improve LLM-generated review quality (...

23.06.2026

Robust Zero-Shot Generalization for Open-Vocabulary Action Recognition via Task Arithmetic

arXiv:2606.20734v1 Announce Type: new Abstract: Open Vocabulary Action Recognition (OVAR) enables the recognition of novel actions by leveraging vision-language representations, overcoming the limitations of traditional closed-set approaches. However, achieving robust performance in real-world scenarios typically requires domain-specific fine-tuning, which is often costly and raises privacy and regulatory concerns. In this work, we propose an alternative paradigm that bypasses target-domain trai...

23.06.2026

Evaluation of Medical Vision Language Models HuluMed and MedGemma, and general purpose chatbots Gemma 3, ChatGPT Plus, and Claude Pro on real previously unseen wound images

arXiv:2606.20723v1 Announce Type: new Abstract: Chronic wound assessment remains a clinically challenging task that requires accurate interpretation of wound morphology, tissue composition, vascular characteristics, and infection risk. Recent advances in Vision-Language Models (VLMs) have introduced the possibility of automated multimodal wound analysis through image understanding combined with clinical reasoning. This study evaluates the performance of several general-purpose and medically spec...

23.06.2026

Topic-to-Timestamp Alignment by Constrained Evidence Selection

arXiv:2606.20890v1 Announce Type: new Abstract: Meeting archives are difficult to search when users remember what was discussed but not when. We study topic-to-timestamp alignment: given a natural-language topic and a timestamped meeting transcript, the goal is to return the time at which the topic is discussed. A standard RAG setup can retrieve relevant transcript excerpts, but still asks the language model to generate a timestamp, which can produce unsupported or invalid timecodes. We therefor...

23.06.2026

AlphaMemo: Structured Search-Process Memory for Self-Evolving Alpha Mining Agents

arXiv:2606.20625v1 Announce Type: new Abstract: LLM agents are promising for alpha mining via combining financial priors, symbolic reasoning, executable factor generation, and feedback-driven refinement. Yet, they face a combinatorial search space, noisy non-stationary feedback, redundant discoveries, and overfitting risks from naively reusing past successes. To address these challenges, we propose AlphaMemo, a self-evolving alpha mining agent with Structured Search-Process Memory. Rather than m...

23.06.2026

Learning What Not to Forget: Long-Horizon Agent Memory from a Few Kilobytes of Learning

arXiv:2606.20954v1 Announce Type: new Abstract: Long-running language-model systems accumulate interaction history that outgrows the context window, so they must continually evict. When an eviction policy drops a load-bearing detail, for example an access token issued at login or a path the next call needs, the action fails. We present LRE (Learned Relevance Eviction), a few-kilobyte, CPU-only, language-model-free scorer that learns which units of history are load-bearing and keeps them by verb...

23.06.2026

Robust Image-Driven Phenotyping of Ovarian Tumor Cells using Optimized Dynamic Features in Hyperbolic Channels

arXiv:2606.20703v1 Announce Type: new Abstract: Label-free, image-based cellular mechanophenotyping in microfluidic devices provides a high-throughput method for single-cell profiling. However, while complex microchannels (e.g., hyperbolic geometries) reveal transient deformation dynamics under continuous extensional stress, the resulting high-dimensional feature spaces are highly susceptible to hydrodynamic artifacts. Flow rate variations often distort discriminative boundaries, linking feature...

23.06.2026

From Sentiment to Actionable Insights: A Data-Driven Public Sentiment Analysis of Advanced Air Mobility

arXiv:2606.20751v1 Announce Type: new Abstract: Advanced Air Mobility (AAM) is an emerging low-altitude air transportation system whose successful deployment depends not only on technological advancement but also on public acceptance. This acceptance will drive government support, regulations, noise standards, and willingness to fly, and in turn the overall commercial viability of AAM. Understanding public sentiment toward AAM is therefore essential for identifying its societal barriers and info...

23.06.2026

PEAR: Permutation-Equivariant Adaptive Routing Multi-Agent Debate

arXiv:2606.20621v1 Announce Type: new Abstract: Multi-agent debate improves the reliability of large language models (LLMs) through iterative peer critiques. However, fixed topologies often introduce persistent positional biases, amplify unreliable agents, and cause high sensitivity to role assignments. We introduce Permutation-Equivariant Adaptive Routing Multi-Agent Debate (PEAR), an inference-time protocol that dynamically reconfigures communication roles and sparse topologies across...

23.06.2026

A Multi-Agent Audit Framework for High-Stakes Reasoning: Evaluation and Interpretability in Clinical Mental Health Screening

arXiv:2606.21123v1 Announce Type: new Abstract: High-stakes reasoning tasks necessitate transparent and verifiable workflows, yet conventional single-model large language models (LLMs) often struggle with hallucination and low interpretability under zero-shot paradigms. To address this general AI challenge, we propose a Multi-Agent Audit Framework that simulates a collaborative, multi-step verification process. We empirically validate this architecture in the sensitive domain of clinical mental ...

23.06.2026