// AI radar

What's happening now

Papers, models, and datasets trending on Hugging Face, plus the official blog, with editorial commentary in Portuguese.

World Action Models: A Survey

World Action Models are predictive-action systems that generate future states for decision-making, with designs balancing representational richness against computational constraint…

18.06.2026 ·▲ 33

Paper Image Generation

Go-with-the-Track: Video Compositing and Motion Control with Point Tracking

Go-with-the-Track unifies motion control and reference image compositing in video generation by using point-track embeddings with spatial-aware encoding and video diffusion transfo…

18.06.2026 ·▲ 1

Paper LLMs & Text

Toward Parking Spot Occupancy Recognition: A Self-Supervised Approach

A self-supervised transfer learning approach for parking spot occupancy recognition that achieves high accuracy with minimal labeled data through two-stage training and deployment…

18.06.2026

Paper LLMs & Text

FlowBender: Feedback-Aware Training for Self-Correcting Conditional Flows

FlowBender is a closed-loop framework that addresses constraint satisfaction in diffusion and flow models by training networks to correct alignment errors using inference-time feed…

18.06.2026 ·▲ 20

Paper LLMs & Text

Grouped Query Experts: Mixture-of-Experts on GQA Self-Attention

Grouped Query Experts (GQE) improves Transformer efficiency by selectively activating query heads based on token content while maintaining key-value cache benefits of grouped-query…

18.06.2026 ·▲ 25

Paper LLMs & Text

StylisticBias: A Few Human Visual Cues Drive Most Social Biases in MLLMs

Multimodal large language models exhibit social bias driven by specific visual attributes, with fashion style and socioeconomic cues having the greatest impact on model judgments.

18.06.2026 ·▲ 2

Paper LLMs & Text

Connect the Dots: Training LLMs for Long-Lifecycle Agents with Cross-Domain Generalization Via Reinforcement Learning

Large language models can be trained through reinforcement learning to develop a meta-capability enabling continuous learning and adaptation across long sequences of tasks in dynam…

18.06.2026 ·▲ 5

Paper LLMs & Text

CogniRoute: Learning to Route Social Evidence in Omni-Modal Models

CogniRoute is a schema-guided Mixture-of-Experts framework for social video question answering that improves multimodal reasoning through cognitive schema factorization and route-a…

18.06.2026 ·▲ 1

Paper Data & Embeddings

EventVLA: Event-Driven Visual Evidence Memory for Long-Horizon Vision-Language-Action Policies

EventVLA addresses long-horizon robotic manipulation challenges by introducing a sparse visual evidence memory framework with visual anchors and dynamic Keyframe Evidence Memory mo…

18.06.2026 ·▲ 1

Paper Data & Embeddings

DF3DV-1K: A Large-Scale Dataset and Benchmark for Distractor-Free Novel View Synthesis

A large-scale real-world dataset called DF3DV-1K is introduced to address the lack of clean and cluttered image sets for distractor-free radiance field research, containing 1,048 s…

18.06.2026 ·▲ 31

Paper Image Generation

The FID Lottery: Quantifying Hidden Randomness in Generative-Model Evaluation

Analysis of FID variance across different training and sampling seeds reveals significant reproducibility issues in image generation evaluation, with retraining causing larger fluc…

18.06.2026 ·▲ 6

Paper LLMs & Text

QG-MIL: A Gated Transformer Aggregator for Domain-Agnostic Multiple Instance Learning in Medical Imaging

QG-MIL introduces a gated transformer aggregator for multiple instance learning in medical imaging that stabilizes attention distribution and improves prediction consistency across…

18.06.2026 ·▲ 2

1720 items on the radar