Real-world benchmark between Codex 5.3 and Opus 4.6

Posted by hongbo_zhang | 3 hours ago | 3 comments

hongbo_zhang 3 hours ago

This is a benchmark of the latest models on a new programming language, chosen to avoid overfitting. The latest models generalize quite well to new languages: they can write tens of thousands of lines of code from one prompt that just works.

alontorres 3 hours ago

I do feel like the latest Codex 5.2 and 5.3 have been really excellent at coding and have been giving Opus a good fight. I still prefer Opus 4.6 as my daily driver, but specifically for coding tasks I think Codex 5.3 is the best, especially when considering value for money.