Manifold Bandits: Bayesian Curriculum Learning over the Latent Geometry of Large Language Models

Hugging Face · Daily Papers · Darrien McKenzie, Nicklas Hansen

This paper is featured in Hugging Face's Daily Papers selection, curated by the AI research community.

Authors: Darrien McKenzie, Nicklas Hansen, Xiaolong Wang

  • 0 community upvotes
  • Topics: reinforcement learning, large language models, adaptive curriculum learning, bandit problem, manifold-structured bandit, endogenous non-stationarity

Abstract

Original abstract, extracted from the paper:

Reinforcement learning approaches for improving LLM reasoning capabilities are enhanced by a Bayesian Manifold Curriculum framework that structures problem sampling based on task manifold relationships and endogenous non-stationarity.
