A Survey of Toxicity Detection and Mitigation Strategies for Multilingual Language Models
arXiv:2606.25380v1. Abstract: Large language models (LLMs) are increasingly deployed across languages, but their safety behavior remains uneven across linguistic and cultural contexts. This survey synthesizes work on toxicity detection and detoxification for multilingual LLMs. We first catalogue threat models that exploit language choice, translation pivots, code-switching, orthographic variation, multi-turn interaction, and post-deployment fine-tuning to weaken safety alignmen...
arXiv cs.CL
Soham Dan, Himanshu Beniwal, Thomas Hartvigsen