Train and deploy Reinforcement Learning agents faster with SheepRL.

Easy-to-use framework
Run reinforcement learning algorithms with a single, simple command.
Accelerated with Lightning Fabric
Thanks to Fabric, we provide a simple, scalable and efficient framework.
Distributed Training
Run your agents on multiple GPUs.
Use different algorithms
You can use one of the already available algorithms, or create your own.
For example:

python sheeprl.py exp=ppo env=gym env.id=CartPole-v1
python sheeprl.py exp=ppo env=gym env.id=CartPole-v1 fabric.devices=2
python sheeprl.py exp=ppo env=gym env.id=CartPole-v1 fabric.devices=2 fabric.accelerator=gpu
python sheeprl.py exp=sac env=gym env.id=LunarLanderContinuous-v2 algo.total_steps=2000000 env.capture_video=True
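
Every option is a Hydra-style key=value override, so any configuration field can be changed from the command line. As a minimal sketch (the top-level seed key is an assumption here; env.num_envs appears in the examples further down), the same PPO run can be repeated with a fixed seed and more parallel environments:

python sheeprl.py exp=ppo env=gym env.id=CartPole-v1 seed=42 env.num_envs=4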

Monitor your experiments
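
Metrics are logged while training runs. A minimal way to follow them live, assuming the default TensorBoard logger and the default logs output directory, is to point TensorBoard at it:

tensorboard --logdir logs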

Visualize your trained agent behaviour

python sheeprl.py exp=ppo fabric.devices=[2,3] fabric.accelerator=gpu algo.cnn_keys.encoder=[rgb] algo.mlp_keys.encoder=[] env=atari env.id=PongNoFrameskip-v4 algo.optimizer.lr=2.5e-4 algo.anneal_lr=True algo.anneal_ent_coef=True algo.anneal_clip_coef=True algo.ent_coef=0.01 algo.clip_coef=0.1 algo.rollout_steps=128 algo.update_epochs=3 env.num_envs=8 algo.per_rank_batch_size=128 algo.total_steps=40000000 env.capture_video=True

python sheeprl.py exp=ppo fabric.devices=[2,3] fabric.accelerator=gpu env=atari env.id=BreakoutNoFrameskip-v4 algo.cnn_keys.encoder=[rgb] algo.mlp_keys.encoder=[] algo.anneal_lr=True algo.anneal_ent_coef=True algo.anneal_clip_coef=True algo.ent_coef=0.01 algo.clip_coef=0.1 algo.rollout_steps=128 algo.optimizer.lr=2.5e-4 algo.update_epochs=3 env.num_envs=8 algo.per_rank_batch_size=128 algo.total_steps=40000000 env.capture_video=True
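
With env.capture_video=True, rollout videos are saved alongside the run. To replay a trained agent afterwards, here is a sketch assuming the repository's sheeprl_eval.py entry point and a placeholder checkpoint path:

python sheeprl_eval.py checkpoint_path=/path/to/checkpoint.ckpt fabric.accelerator=gpu env.capture_video=True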

Check your agent for emergent behaviours

Blog Posts