Posted by TheMrZZ | 2 hours ago | 0 comments

arm32 2 hours ago

The title got me, I'll admit it—except that the benchmark is a game where the models are told to lie.

bellowsgulch 2 hours ago

I find it deeply funny, and I suppose a bit expected, that a Grok model appears at face value to be optimized for supposed truth-telling.

And to keep the e-mob off my back, I don't endorse Elon Musk.