ShutterMuse: Capture-Time Photography Guidance with MLLMs

Hugging Face · Daily Papers · Jiayu Li, Yixiao Fang · ▲ 34 upvotes

This paper is featured in the Hugging Face Daily Papers selection, curated by the AI research community.

Authors: Jiayu Li, Yixiao Fang, Tianyu Hu, Wei Cheng, Ping Huang, Zheheng Fan

  • 34 community upvotes
  • Topics: multimodal large language models, aesthetic cropping, visual annotations, supervised fine-tuning, reinforcement fine-tuning, photographer-side composition

Summary

Original summary (in English), extracted from the paper:

Researchers developed a new benchmark and dataset for photography assistance, along with a unified multimodal model that provides both composition guidance and pose recommendations during image capture.
