LLM-Powered Interactive Robotic Action Synthesis from Multimodal Speech, Gestures, and Music

arXiv:2606.31158v1 · Announce Type: new

Abstract: The quest for intuitive and natural human-robot interaction (HRI) remains a significant challenge in robotics. Traditional methods often rely on rigid, pre-programmed commands that limit the robot's expressiveness and adaptability. This paper introduces a novel framework that leverages the reasoning capabilities of Large Language Models (LLMs) to synthesize complex robotic actions from a rich tapestry of multimodal human inputs: natural speech, han...

arXiv cs.RO · Snehasis Banerjee, Ranjan Dasgupta