OpenMythos: A looped transformer take on how Claude Mythos might work

Posted by steveharing1 | 3 hours ago | 1 comment

throw310822 2 hours ago

If I understand correctly, this is based on the "RYS" architecture (or findings) by David Ng? (https://dnhkng.github.io/posts/rys/)

Related: if small subsets of layers inside an LLM can be looped to improve its reasoning, and if the best layers to loop depend on which competencies the model is using in a given context, has anyone tried to build and train an LLM that decides for itself which layers to loop and how many times?
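
To make the question concrete, here is a toy sketch of the kind of forward pass I have in mind. It is not taken from OpenMythos or from the RYS post. Every name in it (`make_layer`, `make_router`, `forward`, `MAX_LOOPS`) is made up for illustration, the "layers" are tiny tanh residual maps rather than real transformer blocks, and the router uses untrained random weights. It only shows the control flow: each block of layers gets a small router that looks at the current hidden state and picks how many times to run that block.

```python
import math
import random


def make_layer(dim, rng):
    """Toy residual layer h + tanh(W h); a stand-in for a real transformer block."""
    w = [[rng.gauss(0, 0.3) for _ in range(dim)] for _ in range(dim)]

    def layer(h):
        return [
            h[i] + math.tanh(sum(w[i][j] * h[j] for j in range(dim)))
            for i in range(dim)
        ]

    return layer


def make_router(dim, max_loops, rng):
    """Hypothetical router: maps the hidden state to a loop count in [1, max_loops]."""
    w = [[rng.gauss(0, 1.0) for _ in range(dim)] for _ in range(max_loops)]

    def router(h):
        scores = [sum(w[k][j] * h[j] for j in range(dim)) for k in range(max_loops)]
        # Hard argmax over candidate loop counts. This is not differentiable,
        # so actually training it would need some other mechanism, which this
        # sketch leaves out.
        return 1 + scores.index(max(scores))

    return router


def forward(h, blocks, routers):
    """Run each block as many times as its router asks; record the choices."""
    loops_used = []
    for block, router in zip(blocks, routers):
        n = router(h)  # the decision depends on the current hidden state
        for _ in range(n):
            for layer in block:
                h = layer(h)
        loops_used.append(n)
    return h, loops_used


# Example: 3 blocks of 2 layers each, each block loopable up to MAX_LOOPS times.
rng = random.Random(0)
DIM, MAX_LOOPS = 4, 3
blocks = [[make_layer(DIM, rng) for _ in range(2)] for _ in range(3)]
routers = [make_router(DIM, MAX_LOOPS, rng) for _ in blocks]
out, loops_used = forward([1.0, -0.5, 0.25, 0.0], blocks, routers)
print(loops_used)
```

`loops_used` holds one loop count per block, and since the router reads the hidden state, different inputs can end up with different loop patterns. The open part of the question is how you would train the router, since the loop count is a discrete choice.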