NormGuard: Reward-Preserving Norm Constraints in Flow-Matching Reinforcement Learning

Hugging Face · Daily Papers · Tianlin Pan, Lianyu Pang · January 26, 2026 · ▲ 3 upvotes

This paper is featured in Hugging Face's Daily Papers selection, curated by the AI research community.

Authors: Tianlin Pan, Lianyu Pang, Cheng Da, Huan Yang, Changqian Yu, Kun Gai

3 community upvotes
Topics: flow-based generators, reinforcement learning, reward alignment, velocity norm, norm inflation, classifier-free guidance

Abstract

Original abstract, taken from the paper:

Reinforcement learning post-training degrades perceptual quality in flow-based generators through velocity norm inflation, which requires training-time intervention rather than inference-time corrections to maintain both reward alignment and image quality.

Where to read

View on Hugging Face
