LLMs & Text
Trending papers, models, and datasets on Hugging Face, plus the official blog, with editorial commentary in Portuguese.
Multi-LCB: Extending LiveCodeBench to Multiple Programming Languages
Multi-LCB addresses the single-language limitation of LiveCodeBench by providing a multi-language benchmark for evaluating LLMs across twelve programming languages while maintaining contamination…
Manifold Bandits: Bayesian Curriculum Learning over the Latent Geometry of Large Language Models
Reinforcement learning approaches for improving LLM reasoning capabilities are enhanced by a Bayesian Manifold Curriculum framework that structures problem sampling based on task m…
Rethinking Shrinkage Bias in LLM FP4 Pretraining: Geometric Origin, Systemic Impact, and UFP4 Recipe
Uniform 4-bit training with RHT-based quantization outperforms E2M1-based methods by eliminating shrinkage bias and improving training stability across large language model archite…
MobileForge: Annotation-Free Adaptation for Mobile GUI Agents with Hierarchical Feedback-Guided Policy Optimization
MobileForge enables efficient adaptation of mobile GUI agents through annotation-free learning by combining real app interaction grounding with hierarchical feedback-guided policy…
LedgerAgent: Structured State for Policy-Adherent Tool-Calling Agents
LedgerAgent is a method for customer service agents that maintains task state in a separate ledger to improve policy adherence and state management during tool calling.
Go-with-the-Track: Video Compositing and Motion Control with Point Tracking
Go-with-the-Track unifies motion control and reference image compositing in video generation by using point-track embeddings with spatial-aware encoding and video diffusion transfo…
Toward Parking Spot Occupancy Recognition: A Self-Supervised Approach
A self-supervised transfer learning approach for parking spot occupancy recognition that achieves high accuracy with minimal labeled data through two-stage training and deployment…
FlowBender: Feedback-Aware Training for Self-Correcting Conditional Flows
FlowBender is a closed-loop framework that addresses constraint satisfaction in diffusion and flow models by training networks to correct alignment errors using inference-time feed…
Grouped Query Experts: Mixture-of-Experts on GQA Self-Attention
Grouped Query Experts (GQE) improves Transformer efficiency by selectively activating query heads based on token content while maintaining key-value cache benefits of grouped-query…
StylisticBias: A Few Human Visual Cues Drive Most Social Biases in MLLMs
Multimodal large language models exhibit social bias driven by specific visual attributes, with fashion style and socioeconomic cues having the greatest impact on model judgments.
Connect the Dots: Training LLMs for Long-Lifecycle Agents with Cross-Domain Generalization Via Reinforcement Learning
Large language models can be trained through reinforcement learning to develop a meta-capability enabling continuous learning and adaptation across long sequences of tasks in dynam…