GPT-5.6 cheats so much its testers couldn't measure it

Posted by shakeelhashim | 3 hours ago | 3 comments

smallerize 3 hours ago

Why are the outputs measured in hours? Shouldn't it be tokens, or even words, since tokenizers might be more or less efficient?

3 hours ago

Comment deleted

dane_works 2 hours ago

Sam Altman promised us AGI, but OpenAI accidentally built something more human: an AI that cheats on exams just to look smarter than Claude.