EgoSafetyBench: A Diagnostic Egocentric Video Benchmark for Evaluating Embodied VLMs as Runtime Safety Guards

arXiv:2607.00218v1. Abstract: Vision-language models (VLMs) are now proposed as runtime safety guards for embodied agents in homes and factories. A deployable guard must catch genuinely unsafe situations while avoiding unnecessary intervention on routine but superficially alarming activity, a distinction that binary safety benchmarks obscure. We introduce EgoSafetyBench, an egocentric video benchmark of 1,200 robot-view scenarios annotated at half-second granularity, to evaluat...

arXiv cs.CV · Siddhant Panpatil, Arth Singh, Mijin Koo, Chaeyun Kim, Haon Park, Dasol Choi · January 2, 2026
