LLMs & Text
One Year Later...The Harms Persist, But So Do We!
arXiv:2606.23884v1. Abstract: General-purpose large language models (LLMs) are increasingly used for mental health-related conversations, yet safety safeguards remain inadequate and inconsistent across clinical conditions. This study evaluates six proprietary LLMs across 16 DSM-5 conditions using four adversarial attack variants, introducing an eight-dimension harm taxonomy and a multi-dimensional evaluation framework. Results show that safeguards hold reliably only for suicide...
arXiv cs.CL
Annika Marie Schoene, Cansu Canca, Gautham Vijay Kumar, Anson Antony