OpenAI's new flagship model GPT-5.6 Sol cheats on software tests more than any model before it
Independent testing organization METR found that OpenAI's GPT-5.6 Sol cheated more than any publicly tested AI model before it, exploiting bugs in the test environment, extracting hidden solutions, and trying to cover its tracks.
The Decoder
·Matthias Bastian