RL Swarm
RL Swarm lets anyone, anywhere, join and participate in a distributed reinforcement learning system that learns faster together than alone.
What is RL Swarm?
RL Swarm is a decentralized training environment where reinforcement learning (RL) agents cooperate over the internet instead of inside a single datacenter.
Each node runs a local language model that participates in multi-stage RL reasoning games, in which it answers, critiques, and revises solutions alongside its peers.
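To make that game concrete, here is a minimal sketch of one reasoning round from a single node's point of view. The class and method names (`local_model.generate`, `fetch_answer`, `fetch_critique`) are illustrative assumptions, not the RL Swarm codebase's actual interfaces.

```python
# Illustrative only: these names are not from the RL Swarm codebase; they sketch
# the answer -> critique -> revise flow described above from one node's view.
from dataclasses import dataclass


@dataclass
class RoundOutput:
    answer: str
    critique: str | None = None
    revision: str | None = None


def play_reasoning_round(question: str, local_model, peers) -> RoundOutput:
    """Run one multi-stage reasoning round from the local node's perspective."""
    # Stage 1: the local model proposes its own answer to the shared question.
    out = RoundOutput(answer=local_model.generate(question))

    # Stage 2: gossip answers with peers, then critique what the others produced.
    peer_answers = [p.fetch_answer(question) for p in peers]
    out.critique = local_model.generate(
        f"Question: {question}\nPeer answers: {peer_answers}\nCritique these answers."
    )

    # Stage 3: revise the local answer using the critiques exchanged in the swarm.
    peer_critiques = [p.fetch_critique(question) for p in peers]
    out.revision = local_model.generate(
        f"Question: {question}\nYour answer: {out.answer}\n"
        f"Critiques: {peer_critiques}\nWrite an improved final answer."
    )
    return out
```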
Because each RL Swarm node is linked to an on-chain identity on the Gensyn Testnet, every participant’s contributions are logged and verifiable. This enables a persistent view of collective training performance across the network.

Why It Exists
Traditional RL research happens inside isolated labs using centralized GPU clusters. These environments are expensive, inaccessible, and closed by design.
RL Swarm was built to show that reinforcement learning can happen collaboratively and trustlessly across independent machines, powered by Gensyn’s decentralized execution and verification layers.
By turning multi-agent RL into a networked experiment, RL Swarm demonstrates:
- How peer-to-peer learning can outperform solo training.
- How collective reasoning can improve model quality and efficiency.
- How the Gensyn Protocol’s four primitives (execution, verification, communication, and coordination) work together in a live environment.
What You Can Do With It
Anyone can clone the RL Swarm repository, run a node locally, and connect to the live swarm.
In the swarm, each node participates in four stages of RL:
When a session (“episode”) ends, the node’s updated weights can be uploaded to a model hub such as Hugging Face or logged directly to the Gensyn Testnet, contributing to a transparent record of the swarm’s decentralized training progress.
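As an illustration of that step, the sketch below uses the Hugging Face `transformers` API to push a node’s updated weights after an episode. The checkpoint path and repository name are placeholders, and in practice this upload is typically handled by the node software itself.

```python
# A minimal sketch of the optional weight-upload step, assuming a
# transformers-compatible model. The checkpoint path and repo name below are
# placeholders, not values prescribed by RL Swarm.
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("./checkpoints/latest-episode")
tokenizer = AutoTokenizer.from_pretrained("./checkpoints/latest-episode")

# Push the updated weights so the node's training progress stays publicly inspectable.
model.push_to_hub("your-username/rl-swarm-node")
tokenizer.push_to_hub("your-username/rl-swarm-node")
```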
Ready?
Head over to the Getting Started section and select your platform for OS-specific setup guides, or browse our Troubleshooting documentation.