ShutterMuse: Capture-Time Photography Guidance with MLLMs

Hugging Face · Daily Papers · Jiayu Li, Yixiao Fang · January 24, 2026 · ▲ 34 upvotes

This paper is featured in Hugging Face's Daily Papers selection, curated by the AI research community.

Authors: Jiayu Li, Yixiao Fang, Tianyu Hu, Wei Cheng, Ping Huang, Zheheng Fan

34 community upvotes
Topics: multimodal large language models, aesthetic cropping, visual annotations, supervised fine-tuning, reinforcement fine-tuning, photographer-side composition

Summary

Original summary, taken from the paper:

Researchers developed a new benchmark and dataset for photography assistance, along with a unified multimodal model that provides both composition guidance and pose recommendations during image capture.
