
HelloRL: A modular framework, like Lego for Reinforcement Learning

Posted by AndrewHart | 2 hours ago | 1 comment

AndrewHart 2 hours ago

This is something I built while learning RL and decided to open-source. I noticed that every major RL algorithm (Actor-Critic, A2C, PPO, TD3, etc.) tends to be written from scratch, even though the algorithms share many of the same features, and each implementation of those shared features differs slightly from the others. So porting a feature from one algorithm to another, or trying to build your own, was a massive pain and error-prone.

HelloRL is a new modular framework built around a single `train()` function that scales up to every algorithm. The difference between Actor-Critic (discrete, on-policy, Monte Carlo returns, simple) and TD3 (continuous, off-policy, 1-step rollouts, target networks, reference critic, etc.) is just a different set of modules. That makes it easy to swap between algorithms, mix and match features, or build your own modules.
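
To make that concrete, here is a simplified sketch of the idea: one generic `train()` loop where the algorithm-specific pieces are plugged in as functions. The names below (`monte_carlo_targets`, `one_step_targets`, `collect`, `update`) are illustrative, not the exact HelloRL API:

```python
# Illustrative sketch only -- these names are NOT HelloRL's actual API.
from typing import Callable, List, Tuple

def monte_carlo_targets(rewards: List[float], values: List[float],
                        gamma: float = 0.99) -> List[float]:
    """Full-episode discounted returns (classic on-policy actor-critic).
    `values` is unused here; it is part of the shared module signature."""
    g, targets = 0.0, []
    for r in reversed(rewards):
        g = r + gamma * g
        targets.append(g)
    return targets[::-1]

def one_step_targets(rewards: List[float], values: List[float],
                     gamma: float = 0.99) -> List[float]:
    """Bootstrapped r_t + gamma * V(s_{t+1}) targets (TD3-style)."""
    next_values = values[1:] + [0.0]  # bootstrap to 0 at episode end
    return [r + gamma * v for r, v in zip(rewards, next_values)]

def train(collect: Callable[[], Tuple[List[float], List[float]]],
          make_targets: Callable[[List[float], List[float]], List[float]],
          update: Callable[[List[float]], None],
          iterations: int = 100) -> None:
    """One loop for every algorithm: the behavior comes from the modules."""
    for _ in range(iterations):
        rewards, values = collect()        # on-policy rollout or replay sample
        targets = make_targets(rewards, values)
        update(targets)                    # actor/critic gradient step(s)

# Swapping algorithms is just swapping modules, e.g.:
#   train(rollout_episode, monte_carlo_targets, actor_critic_update)
#   train(sample_replay,   one_step_targets,    td3_update)
```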