Train and deploy Reinforcement Learning agents faster with SheepRL.

Easy-to-use framework
Run reinforcement learning algorithms with a single command.
Accelerated with Lightning Fabric
Thanks to Fabric, we provide a simple, scalable and efficient framework.
Distributed Training
Run your agents on multiple GPUs.
Use different algorithms
You can use one of the already available algorithms, or create your own.
python sheeprl.py exp=ppo env=gym env.id=CartPole-v1
python sheeprl.py exp=ppo env=gym env.id=CartPole-v1 fabric.devices=2
python sheeprl.py exp=ppo env=gym env.id=CartPole-v1 fabric.devices=2 fabric.accelerator=gpu
python sheeprl.py exp=sac env=gym env.id=LunarLanderContinuous-v2 algo.total_steps=2000000 env.capture_video=True

Monitor your experiments

Visualize your trained agent behaviour

python sheeprl.py exp=ppo fabric.devices=[2,3] fabric.accelerator=gpu algo.cnn_keys.encoder=[rgb] algo.mlp_keys.encoder=[] env=atari env.id=PongNoFrameskip-v4 algo.optimizer.lr=2.5e-4 algo.anneal_lr=True algo.anneal_ent_coef=True algo.anneal_clip_coef=True algo.ent_coef=0.01 algo.clip_coef=0.1 algo.rollout_steps=128 algo.update_epochs=3 env.num_envs=8 algo.per_rank_batch_size=128 algo.total_steps=40000000 env.capture_video=True
python sheeprl.py exp=ppo fabric.devices=[2,3] fabric.accelerator=gpu env=atari env.id=BreakoutNoFrameskip-v4 algo.cnn_keys.encoder=[rgb] algo.mlp_keys.encoder=[] algo.anneal_lr=True algo.anneal_ent_coef=True algo.anneal_clip_coef=True algo.ent_coef=0.01 algo.clip_coef=0.1 algo.rollout_steps=128 algo.optimizer.lr=2.5e-4 algo.update_epochs=3 env.num_envs=8 algo.per_rank_batch_size=128 algo.total_steps=40000000 env.capture_video=True
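
The two Atari commands above differ only in env.id, so the shared key=value overrides can be kept in one place and the command line generated per game. The following is a minimal sketch of that idea, not part of SheepRL: SHARED_OVERRIDES and build_command are hypothetical names introduced here for illustration, and every override value is copied from the commands above.

```python
import shlex

# Overrides common to both Atari PPO runs, copied from the commands above.
# Values are kept as strings so they are passed through exactly as written.
SHARED_OVERRIDES = {
    "exp": "ppo",
    "fabric.devices": "[2,3]",
    "fabric.accelerator": "gpu",
    "env": "atari",
    "algo.cnn_keys.encoder": "[rgb]",
    "algo.mlp_keys.encoder": "[]",
    "algo.optimizer.lr": "2.5e-4",
    "algo.anneal_lr": "True",
    "algo.anneal_ent_coef": "True",
    "algo.anneal_clip_coef": "True",
    "algo.ent_coef": "0.01",
    "algo.clip_coef": "0.1",
    "algo.rollout_steps": "128",
    "algo.update_epochs": "3",
    "env.num_envs": "8",
    "algo.per_rank_batch_size": "128",
    "algo.total_steps": "40000000",
    "env.capture_video": "True",
}


def build_command(env_id, overrides=SHARED_OVERRIDES, script="sheeprl.py"):
    """Hypothetical helper: return the argument list for one training run."""
    merged = {**overrides, "env.id": env_id}  # add the per-game override
    return ["python", script] + [f"{key}={value}" for key, value in merged.items()]


if __name__ == "__main__":
    for env_id in ("PongNoFrameskip-v4", "BreakoutNoFrameskip-v4"):
        # shlex.join quotes arguments such as [2,3] so they are safe to paste
        # into a shell that would otherwise try to expand the brackets.
        print(shlex.join(build_command(env_id)))
```

The returned list can also be passed directly to subprocess.run, which avoids shell quoting altogether.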

Check your agent for emerging behaviours

Blog Posts