Hermes Agent, the open-source MIT-licensed coding agent built by Nous Research, became the number-one daily application on OpenRouter's global rankings between May 10 and May 13, 2026. It now processes roughly 224 billion tokens per day across the platform, ahead of OpenClaw's roughly 186 billion. Its differentiator is a "do, learn, improve" self-improvement loop and a three-layer memory system. It runs model-agnostically across more than 200 models, and its latest release is v0.13.0 "The Tenacity Release," shipped May 7, 2026.
What Just Happened: An Open-Source Agent Took the Top Spot
For most of the agentic-coding era, the conversation has orbited three names: OpenAI, Anthropic, and Google. Every benchmark, every funding round, every model release gets framed as a move in that three-way game. So when an open-source agent from a relatively small research lab quietly climbs to the top of OpenRouter's global daily usage chart, it is worth stopping to look closely.
According to reporting from MarkTechPost and Nous Research's own Hermes Agent site, Hermes Agent now routes approximately 224 billion tokens per day through OpenRouter, the model-routing marketplace that aggregates traffic across hundreds of large language models. That figure puts it ahead of OpenClaw, the previous leader, which sits near 186 billion tokens per day. On OpenRouter's daily app and agent leaderboard, that makes Hermes Agent the single largest consumer of inference traffic on the platform.
This is not a benchmark score or a marketing claim. OpenRouter's rankings reflect real, paid token consumption — money moving through the platform as developers and automated workflows actually run the agent. That distinction matters. We have watched plenty of "fastest-growing AI tool" headlines that evaporate when you check whether anyone keeps using the thing after week one. Token throughput on a marketplace is a harder, dirtier number to game.
Who Is Nous Research?
Nous Research is an open-source AI lab known for the Hermes family of fine-tuned open-weight models. It built a reputation in the open-model community for releasing capable instruction-tuned and reasoning models without the closed-API gatekeeping that defines the frontier labs. Hermes Agent, launched in February 2026 under an MIT license, is the lab's bet that the agent layer — not just the model — should be open too.
That positioning is the whole story here. The three giants are racing to make their agents stickier, more proprietary, and more deeply tied to their own model APIs. Nous Research went the opposite direction: an agent that belongs to nobody, runs on everybody's models, and improves itself the more it is used.
The Numbers Behind the #1 Ranking
Let us put the headline figures side by side, because the gap is the most concrete fact in this story.
| Metric | Hermes Agent | OpenClaw |
|---|---|---|
| OpenRouter daily token throughput | ~224 billion per day | ~186 billion per day |
| OpenRouter daily rank (app/agent) | #1 global | #2 |
| License | MIT (open source) | Open source |
| Maintainer | Nous Research | Community / OpenClaw project |
| Model coverage | 200+ models, model-agnostic | Multi-model |
| Latest release | v0.13.0 (May 7, 2026) | Active |
The roughly 38-billion-token-per-day lead is not enormous in percentage terms — Hermes Agent is ahead by around 20 percent — but the direction of travel is the interesting part. OpenClaw held the top of this chart through the agent boom of early 2026. We covered OpenClaw closely from the security angle when the OpenClaw CVE-2026-25253 WebSocket hijack vulnerability exposed how fragile the agent-runtime layer can be. Watching a challenger overtake the incumbent on raw usage, this quickly, is a strategic signal worth reading.
Why Token Throughput Is the Metric That Matters
There are three ways an AI tool can claim to be winning: downloads, stars, or usage. Downloads and stars measure curiosity. Token throughput measures commitment. Every billion tokens routed through OpenRouter represents a developer or an automated pipeline that ran the agent, paid for the inference, and presumably got enough value to keep going.
For an agent specifically, sustained throughput is a proxy for two things at once: how many people adopted it, and how much work each of them gives it. A self-improving agent that gets more capable the more it runs has a structural reason to climb this particular chart faster than a static tool — usage compounds into capability, and capability pulls more usage. That feedback loop is the heart of the Hermes Agent story.
The Self-Improving Loop: "Do, Learn, Improve"
The single most-cited differentiator for Hermes Agent is its self-improvement cycle. Most coding agents operate as stateless executors: you give them a task, they run a plan-act-observe loop, they finish, and the next task starts from zero. Hermes Agent adds a fourth phase that persists across tasks — it writes its own reusable skill files.
The loop has three named stages. Do: the agent executes the task using its available tools, the way any agent would. Learn: when it solves something non-trivial, it reflects on what worked and distills the solution into a procedural skill. Improve: that skill is saved as a reusable file the agent can retrieve and apply to future tasks without rediscovering the solution from scratch.
Why Self-Generated Skills Change the Economics
This matters more than it first appears. In a stateless agent, every project pays the full cognitive cost of every problem, every time. The agent re-derives the same shell incantation, the same API quirk, the same build workaround on every run. With a skill library that the agent builds itself, repeated problems get cheaper over the lifetime of a workspace. The tenth time the agent hits a familiar class of bug, it retrieves a skill instead of reasoning from first principles.
We have seen variations of this idea before — Claude Code's skill files and OpenAI Codex's reusable instructions both move in this direction. What is distinctive about Hermes Agent is that the skill generation is the agent's own behavior, not a human-authored artifact. The agent decides what is worth remembering. That is a meaningfully different design philosophy from the human-curated playbooks most tools ship with.
The Obvious Risk: Bad Skills Compound Too
We will not pretend this is free. A self-improving agent that learns the wrong lesson can entrench a mistake just as efficiently as a good one. A skill distilled from a solution that happened to work by accident becomes a confidently-applied wrong answer on every future task that pattern-matches to it. Nous Research's bet is that the curation logic is good enough that the average skill helps more than it hurts. That is an empirical claim that the OpenRouter throughput numbers suggest is holding up at scale — but it is the part of this design we would watch most carefully in production.
The Three-Layer Memory Architecture
The self-improvement loop is only useful if the agent can actually store and retrieve what it learns. Hermes Agent's answer is a three-layer memory system, and the layering is deliberate — each tier solves a different recall problem.
Layer 1: Identity Snapshots
The first layer captures the agent's working identity — its current goals, its understanding of the project, and the high-level context it needs to stay coherent across long-running work. This is the layer that lets a persistent goal survive a restart or a context reset. It is the difference between an agent that knows why it is doing something and one that just knows the next step.
Layer 2: SQLite Full-Text Recall
The second layer is a SQLite database with full-text search over the agent's accumulated history. Rather than cramming everything into the model's context window, Hermes Agent stores its observations, decisions, and outcomes in a queryable store and retrieves the relevant slices on demand. This is the pragmatic engineering choice — SQLite is boring, durable, fast, and runs anywhere. Using full-text search over a local database instead of a heavyweight vector store is the kind of decision that signals the team optimized for reliability over novelty.
Layer 3: Procedural Skills
The third layer is the skill library produced by the do-learn-improve loop. These are the reusable procedural files — the "how to solve this class of problem" knowledge — that the agent retrieves when it recognizes a familiar situation. Together, the three layers separate three concerns cleanly: who the agent is (identity), what it has seen (full-text history), and what it knows how to do (skills).
This separation is the architectural insight worth taking away even if you never run Hermes Agent. Most agent memory systems collapse all of this into one undifferentiated context blob or one vector index. Splitting it into identity, episodic recall, and procedural skill is a cleaner mental model — and it maps onto how durable human expertise actually works.
Model-Agnostic by Design: 200+ Models
Hermes Agent runs on more than 200 models across OpenRouter, NVIDIA NIM, Amazon Bedrock, and local Ollama deployments. It is not tied to a single model provider, and that is a strategic choice with real consequences.
Why Model-Agnosticism Is a Competitive Weapon
The frontier labs have a structural incentive to bind their agents to their own models. Claude Code is best with Anthropic models; OpenAI Codex is built around OpenAI models. That tight coupling delivers a polished experience, but it also locks the user into one provider's pricing, availability, and roadmap. Hermes Agent inverts this. If one model gets too expensive, gets rate-limited, or gets deprecated, you switch the underlying model and the agent keeps running. The agent is the durable asset; the model is a swappable component.
For teams running agents at scale, that is not a minor convenience — it is a procurement and risk-management feature. It is also the most credible explanation for why a community-maintained agent could out-consume the proprietary leaders on a neutral marketplace like OpenRouter: it has no incentive to push you toward any one model, so it goes wherever the work is cheapest and fastest.
How It Compares to the Proprietary Agents
It is worth being precise about where Hermes Agent sits relative to the tools we cover regularly. Claude Code remains the most polished terminal-native agent with the deepest model integration. OpenClaw built the open-source momentum that Hermes Agent has now overtaken on usage. Autonomous agents like Devin and Manus target a different shape of work — long-horizon autonomous task completion rather than the developer-in-the-loop model. Our Devin vs Manus comparison and our Claude Code vs OpenAI Codex breakdown map that landscape in detail. Hermes Agent's distinct claim is not "better model" — it is "the agent layer should be open, portable, and self-improving."
v0.13.0 "The Tenacity Release": What Shipped May 7
The momentum is not just historical. On May 7, 2026, Nous Research shipped Hermes Agent v0.13.0, named "The Tenacity Release." The naming is on-theme: this release is largely about making long-running, multi-agent work survive interruptions.
Durable Multi-Agent Kanban
v0.13.0 introduces a durable multi-agent Kanban board. Multiple agents can coordinate work across a shared board whose state persists, so a crash or restart does not lose the orchestration. For anyone who has watched a multi-agent run die halfway through and take its plan with it, durable orchestration state is the unglamorous feature that actually makes the pattern usable.
Persistent /goal and the Ralph Loop
The release adds a persistent /goal directive that survives across sessions — the so-called Ralph loop, where the agent keeps grinding toward a stated goal until it is genuinely done rather than until the session ends. This is the practical expression of the identity-snapshot memory layer: a goal that does not evaporate when the context window resets.
Checkpoints v2 and Eight P0 Security Patches
Checkpoints v2 improves the agent's ability to roll back to a known-good state, which pairs naturally with self-improvement — you want cheap undo when an experimental approach goes wrong. The release also patches eight P0 security issues. After what the industry learned from the OpenClaw WebSocket hijack incident, an open-source agent treating eight priority-zero security fixes as headline release content is the right instinct. Agent runtimes are now a serious attack surface, and treating security as a first-class release concern is exactly the maturity this category needs.
Google Chat as the 20th Channel
v0.13.0 also adds Google Chat as the agent's twentieth integration channel, continuing a pattern of meeting developers where they already work rather than forcing them into a dedicated interface. Twenty channels is a quiet signal of how broadly the agent has been deployed.
The Hermes 4 Model Stack Underneath
Hermes Agent is model-agnostic, but it has a natural home: the Hermes 4 family of open-weight models, also from Nous Research. Understanding this stack explains why the agent behaves the way it does.
14B, 70B, and 405B Open Weights
Hermes 4 ships in three sizes: roughly 14 billion, 70 billion, and 405 billion parameters. All three are open-weight, which means teams can run them on their own infrastructure with no API dependency. The size range matters for agent work — you can run a small Hermes 4 model for cheap, fast tool-calling and reserve the 405B model for the hard reasoning steps, all within the same agent and the same model family.
Hybrid Reasoning on a Qwen and Llama 3.1 Base
The Hermes 4 models are hybrid-reasoning models built on a base that blends Qwen and Llama 3.1 lineages. Hybrid reasoning means the model can switch between fast direct responses and slower deliberate reasoning depending on task difficulty — useful for an agent that alternates between trivial tool calls and genuinely hard planning steps within a single run.
Trained Heavily on Agent Traces Across 40+ Tools
The detail we find most telling: Hermes 4 was trained with a heavy emphasis on agent traces spanning more than 40 tools. Most models are trained on text and then asked to behave like agents as an afterthought. Hermes 4 was shaped by examples of agents actually using tools. That training focus is the likeliest explanation for why the agent built on top of it performs well at the tool-heavy, multi-step work that drives OpenRouter token consumption — the model and the agent were co-designed for the same job.
Why an Open-Source Agent Beating the Giants Matters
Step back from the feature list. The strategic story is that a marketplace neutral enough to measure real usage just crowned an MIT-licensed, community-maintained, model-agnostic agent over the proprietary leaders. That is a data point about where the agent layer is heading.
The Agent Layer May Commoditize Faster Than the Model Layer
Frontier models are expensive to train and hard to replicate, so the model layer stays concentrated among well-funded labs. The agent layer is different. An agent is orchestration logic, memory design, and tool integration — software, not a multi-hundred-million-dollar training run. Hermes Agent is early evidence that the agent layer can commoditize and open up even while the model layer stays proprietary. If that holds, the durable moat in agentic AI may not be the agent at all.
Self-Improvement as a Genuine Differentiator
The do-learn-improve loop is the part competitors will study hardest. An agent that gets cheaper and more capable the more a given workspace uses it has a compounding advantage that a static agent cannot match. We expect the proprietary agents to ship their own versions of self-generated skill libraries within the next few release cycles, because the throughput numbers make the value of the pattern hard to argue with.
The Sustainability Question
The honest caveat: a community-maintained project carrying #1-on-OpenRouter scale of usage takes on real maintenance, security, and reliability obligations. The eight P0 patches in v0.13.0 show the project is taking that seriously, but sustaining frontier-scale usage on an open-source governance model is an unsolved problem in general. Whether Hermes Agent stays at the top depends as much on the project's ability to keep up with that obligation as on the elegance of its architecture.
What We Are Watching Next
Three things will tell us whether this is a durable shift or a momentary spike. First, whether Hermes Agent holds the #1 OpenRouter position through the next OpenClaw release cycle — leads taken can be leads lost. Second, whether the proprietary agents respond with their own self-improving skill systems, which would validate the architecture and intensify the competition. Third, whether the self-improvement loop's quality holds up as the skill libraries grow large — the failure mode of compounding bad skills is real, and scale is the test.
For now, the headline stands on verifiable ground: an open-source MIT agent from Nous Research is the largest single consumer of inference traffic on OpenRouter, ahead of OpenClaw, with roughly 224 billion tokens flowing through it per day. The challenger that came from outside the three-giant frame is, on the one metric that measures real commitment, leading.
Frequently Asked Questions
What is Hermes Agent?
Hermes Agent is an open-source, MIT-licensed coding agent built by Nous Research and launched in February 2026. It is model-agnostic, running across more than 200 models via OpenRouter, NVIDIA NIM, Amazon Bedrock, and local Ollama. Its defining features are a do-learn-improve self-improvement loop and a three-layer memory system. As of May 13, 2026, it is the number-one daily application on OpenRouter's global rankings.
Why is Hermes Agent ranked #1 on OpenRouter?
Hermes Agent processes approximately 224 billion tokens per day through OpenRouter, ahead of OpenClaw's roughly 186 billion. OpenRouter ranks applications and agents by real, paid token throughput, so the #1 position reflects actual sustained usage rather than a benchmark score or marketing claim.
How does Hermes Agent compare to OpenClaw?
Both are open-source agents, but Hermes Agent has overtaken OpenClaw on OpenRouter daily token throughput, roughly 224 billion versus 186 billion per day, taking the #1 spot OpenClaw previously held. Hermes Agent's distinct differentiators are its self-improving do-learn-improve loop, its three-layer memory, and its model-agnostic design across 200-plus models.
What is the do-learn-improve loop?
The do-learn-improve loop is Hermes Agent's self-improvement cycle. The agent executes a task (do), reflects on non-trivial solutions and distills them into procedural knowledge (learn), and saves that knowledge as a reusable skill file it can retrieve for future tasks (improve). Unlike human-authored playbooks, the agent decides what is worth remembering itself.
What is Hermes Agent's three-layer memory system?
The three layers are identity snapshots (the agent's goals and project understanding, which survive restarts), a SQLite full-text database (queryable episodic history retrieved on demand instead of stuffed into context), and procedural skills (the reusable skill files from the do-learn-improve loop). The design separates who the agent is, what it has seen, and what it knows how to do.
Is Hermes Agent really open source?
Yes. Hermes Agent is released under the MIT license, one of the most permissive open-source licenses, which allows commercial use, modification, and redistribution. It is maintained by Nous Research, an open-source AI lab also known for the Hermes family of open-weight models.
What models does Hermes Agent run on?
Hermes Agent is model-agnostic and runs on more than 200 models across OpenRouter, NVIDIA NIM, Amazon Bedrock, and local Ollama deployments. Its natural home is the Hermes 4 open-weight family from Nous Research, available in roughly 14B, 70B, and 405B parameter sizes.
What is in the v0.13.0 "Tenacity Release"?
Hermes Agent v0.13.0, shipped May 7, 2026, adds a durable multi-agent Kanban board, a persistent /goal directive (the Ralph loop), Checkpoints v2 for state rollback, patches for eight P0 security issues, and Google Chat as the agent's twentieth integration channel. The release focuses on making long-running multi-agent work survive interruptions.
What is the Hermes 4 model family?
Hermes 4 is Nous Research's open-weight model family in roughly 14B, 70B, and 405B parameter sizes. The models use hybrid reasoning on a base blending Qwen and Llama 3.1 lineages and were trained with heavy emphasis on agent traces spanning more than 40 tools, which explains why an agent built on them performs well at tool-heavy multi-step work.
How is Hermes Agent different from Claude Code or OpenAI Codex?
Claude Code and OpenAI Codex are proprietary agents tightly coupled to their providers' models. Hermes Agent is open-source, model-agnostic across 200-plus models, and self-improving via its do-learn-improve loop. The trade-off: the proprietary agents offer a more polished, integrated experience, while Hermes Agent offers portability, no provider lock-in, and a compounding skill library.
Why does an open-source agent beating the giants matter?
It is evidence that the agent layer may commoditize faster than the model layer. Frontier models stay concentrated among well-funded labs because training is expensive, but agents are orchestration software. A neutral marketplace measuring real usage crowning an MIT-licensed agent suggests the durable moat in agentic AI may not be the agent itself.
Can Hermes Agent's self-improvement loop make mistakes?
Yes. A self-improving agent can entrench a bad lesson as efficiently as a good one — a skill distilled from an accidental success becomes a confidently-applied wrong answer on similar future tasks. Nous Research's bet is that the curation logic keeps the average skill net-positive, a claim the OpenRouter throughput numbers suggest is holding up at scale but that warrants careful monitoring in production.



