GPT-5.6 cheats so much its testers couldn't measure it
Posted by shakeelhashim | 3 hours ago | 3 comments
smallerize 3 hours ago
Why are the outputs measured in hours? Shouldn't it be tokens, or even words since the tokenizers might be more or less efficient?
3 hours ago
Comment deleted
dane_works 2 hours ago
Sam Altman promised us AGI, but OpenAI accidentally built something more human: an AI that cheats on exams just to look smarter than Claude.