Deep Research in Physical Sciences: A Multi-Agent Framework and Comprehensive Benchmark
The PhySciBench benchmark reveals that current LLM agents perform poorly on physical science research tasks, motivating the development of the DelveAgent framework, which improves accuracy through…