#1736: Why OpenClaw Eats 16 Trillion Tokens

OpenClaw is processing 16.5 trillion tokens daily, dwarfing Wikipedia. Here’s why it’s #1.

Episode Details
Episode ID
MWP-1889
Published
Duration
32:02
Audio
Direct link
Pipeline
V5
TTS Engine
chatterbox-regular
Script Writing Agent
Gemini 3 Flash

AI-Generated Content: This podcast is created using AI personas. Please verify any important information independently.

The AI App Store leaderboard looks different from what you might expect. While consumer-facing giants like ChatGPT dominate headlines, a quiet giant named OpenClaw is consuming 16.5 trillion tokens daily—processing more text than Wikipedia every single day. This isn't just a statistical anomaly; it represents a fundamental shift in how AI is being used by developers, tinkerers, and power users.

OpenRouter serves as the clearinghouse for this high-level AI activity. Unlike app download counts, which can be inflated by bots or curiosity, token consumption reflects actual usage. It's the difference between buying a treadmill and actually running a marathon. When OpenRouter reports that OpenClaw is trending up 46% in just a few weeks, they're looking at literal billing cycles and data packets. This is a "skin in the game" metric that filters out the tourists.

The secret behind OpenClaw's astronomical token count lies in its architecture as an autonomous agent runtime. Unlike traditional chatbots that respond to single prompts, OpenClaw operates as a "Local Gateway" that lives on your machine, accessing files, browsers, and terminals to execute multi-step tasks. It uses a "Plan-and-Execute" loop: when given a goal, it creates a task list, performs each step, and pauses to "think" when unexpected obstacles arise.
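That loop is easy to picture in code. Here is a minimal, self-contained sketch of the pattern; the `llm()` and `execute()` stubs are invented for illustration and are not OpenClaw's actual API:

```python
# Minimal sketch of a "Plan-and-Execute" loop. The llm() and execute()
# stubs are invented for illustration; they are not OpenClaw's real API.

def llm(prompt: str) -> str:
    # Stub model: returns a three-step plan, or a recovery suggestion.
    if prompt.startswith("Plan:"):
        return "open browser\nsearch for data\nextract table"
    return "close the pop-up and retry"

def execute(step: str) -> tuple[bool, str]:
    # Stub tool layer: pretend the search step hits a pop-up.
    if step == "search for data":
        return (False, "a pop-up blocked the screen")
    return (True, "ok")

def run_agent(goal: str) -> list[str]:
    log = []
    plan = llm(f"Plan: {goal}").splitlines()   # one call to draft the task list
    for step in plan:
        ok, detail = execute(step)             # act in the real world
        log.append(step)
        if not ok:
            # Pause and "think": this extra round trip is where tokens go.
            recovery = llm(f"Step failed: {step}. Observed: {detail}.")
            execute(recovery)
            log.append(recovery)
    return log

steps = run_agent("scrape the pricing table")
```

The key point is the failure branch: every obstacle costs at least one extra round trip to the model, and real agents often make several per step.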

This thinking process is where the tokens multiply. For every action an agent takes in the real world, it might send five to ten prompts back to the LLM to verify what it just saw and decide what to do next. A typical 100-step task—like complex browser automation—can consume about 60,000 input tokens and 40,000 output tokens. When thousands of people run these agents all day to automate web research or lead generation, trillion-token numbers become inevitable.
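The arithmetic behind those numbers is simple to check. A back-of-the-envelope calculation using the figures quoted above; the fleet size at the end is an invented illustration, not measured data:

```python
# Back-of-the-envelope token math using the figures quoted above.
# The fleet size at the end is an invented illustration, not measured data.

input_tokens = 60_000       # per 100-step task (quoted estimate)
output_tokens = 40_000
steps = 100

per_step = (input_tokens + output_tokens) / steps   # ~1,000 tokens per action

# Hypothetical fleet: 10,000 users each running 50 such tasks a day.
daily = (input_tokens + output_tokens) * 50 * 10_000
print(f"{daily:,} tokens/day")   # tens of billions per day from a modest fleet
```

Even this modest hypothetical fleet lands in the tens of billions of tokens per day, so trillion-token monthly totals follow quickly.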

The "brain" of these agents remains in the cloud, using models like Claude 3.5 Sonnet or GPT-4o, while the "body" runs locally. This hybrid model keeps sensitive files private while sending only necessary snippets to cloud APIs. It's a privacy-first approach to hyper-powerful assistance that's forcing a rethink of the entire ecosystem. Last year, everyone built thin "wrappers" over APIs; now, infrastructure platforms like OpenClaw are emerging.

The coding category reveals where the real "war" is happening. Kilo Code, at 5.38 trillion tokens, dominates VS Code integration with "autocomplete on steroids" and chat-with-your-repo functionality. It's proactive, watching you type and suggesting the next three lines. Claude Code, at 2 trillion tokens, takes a different approach as a CLI tool you "set loose" on tasks—more like a chauffeur than a co-pilot.

Cline shows the most dramatic growth at 133%, offering an open-source VS Code extension that's explicitly agentic. Developers love it because they can "bring their own key" and only pay for what they use, choosing cheap, fast models like DeepSeek R1 or Gemini Flash. One case study showed Cline migrating a legacy Python 2.7 app to Python 3.12 in four hours—a task that normally takes a week. It consumed about 400 million tokens in that single afternoon by looping through error-checking and code-fixing cycles.

The roleplay and entertainment category forms the "shadow economy" of AI, with apps like Janitor AI and SillyTavern consistently topping charts. These platforms require massive context windows—detailed character lore files plus conversation history—driving enormous token volume. While some dismiss this as "unserious," the token numbers prove it's a major force in the AI ecosystem.
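The token economics of a single roleplay turn can be sketched with round numbers. The counts below are illustrative assumptions, not measurements from Janitor AI or SillyTavern:

```python
# Why roleplay turns are token-heavy: every message re-sends the lore file
# plus recent history. These counts are illustrative round numbers only.

LORE_TOKENS = 13_000          # a ~10,000-word character lore file
TOKENS_PER_MESSAGE = 150
HISTORY_MESSAGES = 50

def tokens_per_turn() -> int:
    # Context re-sent on every single turn, before the user types a word.
    return LORE_TOKENS + HISTORY_MESSAGES * TOKENS_PER_MESSAGE

per_turn = tokens_per_turn()   # 20,500 tokens per turn
session = per_turn * 200       # a long evening of chat
```

At 200 messages an evening, one user sends roughly four million tokens of context, which is why cheap per-token hosting matters so much in this category.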

The core insight is clear: the future of AI isn't just about chatting. It's about autonomous agents that can plan, execute, and adapt while handling complex, multi-step workflows. Whether for coding, automation, or entertainment, the token consumption patterns reveal where the real innovation—and real usage—is happening.

#1736: Why OpenClaw Eats 16 Trillion Tokens

Corn
OpenClaw is currently consuming sixteen point five trillion tokens a day. To put that in perspective, that is more than the entire text of Wikipedia, processed every single day. What on earth is it, and why is it sitting at number one?
Herman
It is the absolute giant in the room right now, Corn. Today's prompt from Daniel is about the OpenRouter app rankings, and honestly, looking at these numbers is like getting a peek at the hidden plumbing of the AI revolution. Most people look at App Store downloads to see what is popular, but in the world of high-level AI, you follow the tokens.
Corn
Right, because a download just means you were curious once. Token consumption means you are actually putting the model to work. It is the difference between buying a treadmill and actually running a marathon on it. And by the way, if you are wondering who is writing our script today, it is Google Gemini three Flash. Hopefully, it can keep up with these trillion-token stats you are about to drop. I am Corn, the resident skeptic of anything that sounds too productive.
Herman
And I am Herman Poppleberry, and I have been staring at these OpenRouter charts for three hours. This is fascinating because OpenRouter is the clearinghouse for the power users. It is where developers, tinkerers, and the "agentic" crowd go to swap models like trading cards. When we see sixteen point five trillion tokens for an app like OpenClaw, we are looking at the pulse of the early adopter community.
Corn
It is funny, because if you ask the average person on the street what the biggest AI app is, they will say ChatGPT or maybe Gemini. But among the people actually building things, the leaderboard looks completely different. It is all coding agents, roleplay servers, and this mysterious "Claw" thing. So, let's break down the ranking mechanism first, Herman. Why is OpenRouter the gold standard for seeing what is actually "real" in AI?
Herman
Because it is an API aggregator. If you build an app and you want your users to be able to choose between Claude, Llama, and DeepSeek without you having to manage ten different subscriptions, you use OpenRouter. So, when OpenRouter says an app is trending, they aren't guessing. They are looking at the literal billing cycles and data packets. It filters out the "tourists." If an app is high on this list, people are paying for it, or at least burning through massive amounts of free-tier credits to get work done.
Corn
So it is a "skin in the game" metric. I like that. It is much harder to fake sixteen trillion tokens than it is to fake a million downloads from a bot farm in a basement somewhere. Let's talk categories. The top apps seem to cluster into three distinct tribes: the "I want the AI to do my job" coding tribe, the "I want to talk to a fictional vampire" roleplay tribe, and the creative tools. But before we get to the vampires, we have to talk about the monster at the top. OpenClaw. Sixteen point five trillion tokens, trending up forty-six percent in just the last few weeks. What is the deal? Is it a botnet? Is it a new search engine?
Herman
It is an autonomous agent runtime. Think of it as a "Local Gateway." The reason the token count is so astronomical is that OpenClaw isn't just a chatbot where you ask a question and get an answer. It is a system that lives on your machine, has access to your files, your browser, and your terminal, and it executes multi-step tasks.
Corn
But how does that work in practice? If I give it a task, how does it stay on track without me hovering over it?
Herman
It uses a "Plan-and-Execute" loop. When you give it a goal, it doesn't just start typing. It creates a task list. It says, "Step one: Open the browser. Step two: Search for the data. Step three: Extract the table." Then it performs the first step and looks at the result. If the result isn't what it expected—say, a pop-up blocked the screen—it pauses, "thinks" about how to close the pop-up, and then tries again. That "thinking" is where the tokens go.
Corn
So, instead of me saying "Write me a poem," I am saying "Go find all the invoices in my email, compare them to my bank statements, find the discrepancies, and draft emails to the vendors."
Herman
Well, I should say, that is exactly the kind of workflow it handles. Every single one of those steps—searching the email, reading the PDF, checking the bank site—requires the agent to "think" out loud. In agent-speak, we call these "thoughts" or "reasoning steps." For every one action the agent takes in the real world, it might send five or ten prompts back to the LLM to verify what it just saw and decide what to do next.
Corn
Ah, so the token multiplier is insane. If I chat with a bot, it is one-to-one. If I set an agent loose, it is one-to-fifty.
Herman
At least. There is a great breakdown from a site called ClawKit that says a typical one-hundred-step task—like a complex browser automation—can consume about sixty thousand input tokens and forty thousand output tokens. If you have thousands of people running these agents all day to automate their web research or their lead generation, you hit those trillion-token marks incredibly fast.
Corn
But wait, if it's running locally, why is the token count so high on OpenRouter? Shouldn't it be free if it's on my machine?
Herman
That’s a common misconception. The "body" of the agent—the code that moves the mouse and reads the files—is local. But the "brain" is still a massive LLM like Claude 3.5 Sonnet or GPT-4o. Every time the agent needs to "decide" its next move, it sends a snapshot of your screen or a snippet of your code to the API. It’s like having a remote-control robot in your house; the robot is there, but the pilot is in a data center in Virginia.
Corn
What is interesting about OpenClaw specifically is that it is open source and self-hosted. It is not a "SaaS" in the traditional sense where you go to a website. You run the "Clawdbot" locally. This matters because of the "Local Gateway" architecture Daniel mentioned in his notes. It keeps your sensitive files and your memory local, only sending the specific "snippets" of data to the cloud LLM that it needs to process. It is a privacy-first way to have a hyper-powerful assistant.
Herman
And it is forcing a rethink of the whole ecosystem. Last year, everyone was building "wrappers"—just thin interfaces over an API. OpenClaw is an "infrastructure." It is a platform that other people are building tools on top of. That forty-six percent growth is likely because more people are realizing that if they want to build a truly autonomous agent that doesn't leak all their corporate data, they need this local execution layer.
Corn
It also explains why the GPU demand is spiking. Even though the "brain" is in the cloud—maybe they are using Gemini two point five Flash because it is fast and cheap—the "body" of the agent is running on the user's local hardware, managing the state and the memory. It is a hybrid model.
Herman
It is the "Agentic Harness." We have talked about this concept before, but seeing it at sixteen trillion tokens makes it real. It is not a toy anymore. But let's look at the runners-up, because the coding category is where the real "war" is happening. Kilo Code is at five point three eight trillion tokens. That is the heavy hitter for VS Code users.
Corn
I have seen Kilo Code. It is everywhere on Twitter. It is basically like having a senior engineer sitting inside your editor who actually knows your entire codebase. But how does it differ from something like Claude Code, which is also on the list at two trillion tokens?
Herman
It is a matter of philosophy and "surface area." Kilo Code is very much integrated into the IDE—the Integrated Development Environment. It is focused on "autocomplete on steroids" and "chat-with-your-repo." It is optimized for the flow of a developer who is already typing. Claude Code, on the other hand, is more of a CLI tool—a command line interface. It is an agent you "set loose" on a task. You tell Claude Code, "Hey, refactor this entire directory to use the new API," and it just goes to work while you go get a coffee.
Corn
So Kilo is my co-pilot, but Claude Code is my intern who I trust with the keys to the car.
Herman
That is a fair way to put it. Kilo is proactive; it’s watching you type and suggesting the next three lines based on a file you wrote three months ago. Claude Code is reactive; it waits for a command and then goes on a mission. It’s the difference between a passenger giving you directions and a chauffeur driving the car while you sleep in the back.
Corn
And then you have Cline, which is showing one hundred and thirty-three percent growth. One hundred and thirty-three percent! That is vertical growth. Cline is an open-source tool that used to be called Devins. It is an extension for VS Code that is explicitly "agentic." It can create files, run terminal commands, and fix its own bugs.
Herman
And it’s not just fixing bugs. I saw a case study where a developer used Cline to migrate a legacy Python 2.7 app to Python 3.12. Normally, that’s a week of tedious manual work. Cline did it in four hours. It would run the code, see the syntax error, read the new documentation, fix the line, and repeat. It went through about four hundred million tokens in that one afternoon.
Corn
Why is Cline growing so much faster than the others? Is it just because it is free and open source?
Herman
That is a huge part of it. Developers love "bring your own key." They don't want to pay a twenty-dollar-a-month subscription to a company that might go bust or change their terms. With Cline, you plug in your OpenRouter API key, you choose the cheapest, fastest model—like DeepSeek R1 or Gemini Flash—and you only pay for what you use. It is the "Linux" of coding agents.
Corn
And because it is agentic, it is a token hog. If I am a developer and I let Cline try to solve a complex bug, it might loop twenty times. It checks the error, changes a line of code, runs the test, sees it failed, tries again. Each loop is a fresh context window. If your context window is two hundred thousand tokens, and you do that twenty times... well, you do the math. That is how you get to these massive numbers.
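Corn's "you do the math" works out like this (the window size and loop count are his hypothetical figures):

```python
# Corn's arithmetic: each debugging loop re-sends a full context window.
# The window size and loop count are his hypothetical figures.
context_window = 200_000   # tokens per loop
loops = 20
total = context_window * loops
print(f"{total:,} tokens for one stubborn bug")
```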
Herman
Precisely. Well, I mean, that is the core of the "Coding Agent War." It is a race to see who can be the most "autonomous" without being "expensive." This is where the technical nuances really matter. Some of these tools, like Hermes Agent, which is up ninety-two percent, are using specialized models that are fine-tuned just for function calling and tool use. They aren't trying to write poetry; they are trying to be the best at "using a terminal."
Corn
I love the name Hermes Agent. Very "messenger of the gods," hopefully delivering code that actually compiles. But let's pivot for a second. We have all these "serious" tools—coding, automation, local gateways. And then, you look at the rest of the top ten, and it is Janitor AI. It is SillyTavern. It is ISEKAI ZERO. Herman, we need to talk about the vampires and the anime characters.
Herman
It is the "shadow economy" of AI, Corn. People underestimate it because it feels "unserious," but the token volume in roleplay and entertainment is staggering. Janitor AI and SillyTavern are consistently at the top of OpenRouter's charts.
Corn
For the uninitiated—and stay with us here—these aren't just "chatbots." These are platforms where people create incredibly detailed "personas." You might have a character with a ten-thousand-word "lore" file. Every time you send a message to that character, the AI has to read that entire lore file plus the last fifty messages of your conversation to stay in character.
Herman
And that is why the tokens are through the roof. In a coding task, you might send a snippet of code. It is transactional. In roleplay, the "context" is the product. People want these characters to remember that three weeks ago, in the "story," they lost their magic sword in a forest. To keep that "state" alive, you are constantly feeding massive amounts of text back into the model.
Corn
It is basically a collaborative novel that never ends. And unlike a human novelist, the AI doesn't get tired. It will roleplay with you for ten hours straight. If you look at ISEKAI ZERO, it is even more specialized. It is built around the "Isekai" genre of anime, where a character is transported to another world. It is highly specific, highly immersive, and people are clearly obsessed with it.
Herman
Fun fact about Isekai Zero—it actually uses a "World Info" database. If you mention a specific city in the game world, the app automatically pulls the description of that city and injects it into the prompt. It’s like a specialized version of RAG—Retrieval-Augmented Generation—but for fantasy storytelling.
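The "World Info" mechanic Herman describes is essentially keyword-triggered prompt injection. A minimal sketch, with an invented two-entry database; this is not ISEKAI ZERO's actual implementation:

```python
# Keyword-triggered lore injection, in miniature. The WORLD_INFO entries
# are invented; this is not ISEKAI ZERO's actual implementation.

WORLD_INFO = {
    "eldoria": "Eldoria: a floating city powered by mana crystals.",
    "ashmere": "Ashmere: a ruined port, haunted since the Sundering.",
}

def build_prompt(user_message: str, history: str) -> str:
    # Inject only the entries whose keywords appear in the message.
    injected = [entry for key, entry in WORLD_INFO.items()
                if key in user_message.lower()]
    return "\n".join(injected + [history, user_message])

prompt = build_prompt("Let's travel to Eldoria!", "[...prior story...]")
```

Only matching lore is injected, so the prompt stays focused while still giving the model the world detail it needs for that turn.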
Corn
So it’s basically an automated Dungeon Master. What this tells us is that AI is becoming a new medium of entertainment. It is not just a tool to write emails. It is a "living" game engine. If you think about the history of the internet, what always drives the early adoption of new technology? It is usually gaming and... let's say, "adult-adjacent" entertainment. Roleplay hits both of those notes. It is the proving ground for long-context models.
Herman
That is a great point. If a model can't remember who I am after ten messages, it is useless for roleplay. So, the roleplay community is actually the one "stress testing" the memory of these models more than the developers are. A developer only cares about the current file they are working on. A roleplayer cares about the "soul" of the character.
Corn
But how do they handle the cost? If they are burning through millions of tokens just to talk to an elf, doesn't that get incredibly expensive for the user?
Herman
That’s why OpenRouter is so vital here. These users aren't using GPT-4o for roleplay; they are using models like "Midnight Miqu" or "Llama 3.1 70B" which are hosted cheaply. Some of these models cost pennies per million tokens. So they can afford to send a 128k context window every single turn because it only costs a fraction of a cent. They are optimizing for "vibe per dollar."
Corn
And the developers of these apps are getting really clever with "token efficiency." There is an app called SillyTavern which is basically a "frontend" for these models. It has features like "summarization on the fly." It will take the last twenty messages, have a smaller, cheaper AI summarize them, and then only send the summary to the big, expensive AI to save tokens.
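The "summarization on the fly" trick Corn mentions can be sketched in a few lines. Both model calls are stubbed here; SillyTavern's real pipeline is more elaborate:

```python
# "Summarization on the fly": compress old history with a cheap model
# before sending it to the expensive one. Both models are stubbed here.

def cheap_summarize(messages: list[str]) -> str:
    # Stub small model: keep only the first clause of each old message.
    return " / ".join(m.split(".")[0] for m in messages)

def build_context(history: list[str], keep_recent: int = 5) -> str:
    old, recent = history[:-keep_recent], history[-keep_recent:]
    summary = cheap_summarize(old) if old else "(none)"
    return "\n".join([f"[Summary] {summary}"] + recent)

history = [f"Message {i}. Lots of extra detail {i}." for i in range(8)]
context = build_context(history)
```

Recent messages travel verbatim while older ones survive only as a compressed summary, trading a little fidelity for a much smaller bill.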
Herman
So even the "unserious" apps are driving technical innovation in "state management" and "context compression." It is a bifurcation of the market. You have the "Productivity" branch, which is all about "Get the job done with the fewest tokens possible to save money," and the "Entertainment" branch, which is "I don't care how many tokens it takes, make me feel something."
Corn
And "make me feel something" is a much bigger market than most VCs realize. We are looking at trillions of tokens spent on fictional conversations. That is a massive amount of human "attention" being mediated by AI. If I were an AI developer right now, I would be looking at Janitor AI and asking, "How do I build a model that is specifically good at emotional intelligence and long-term character consistency?" because that is where the "heavy users" are.
Herman
It is also a different "safety" profile. A lot of the big models—like the base versions of Claude or Gemini—have very strict "guardrails" that make them terrible for creative roleplay. They are too "helpful" and "polite." They won't play a villain. That is why people go to OpenRouter. They want to use the "Uncensored" versions of Llama or specialized models like Hermes or Mythomax that are allowed to be "dark" or "edgy."
Corn
It is the "Wild West" of AI. OpenRouter provides the "ammo"—the tokens—and the apps provide the "scenery." But let's look at the third category: creative tools. Descript and Novelcrafter. Descript is a huge name in podcasting—we know it well. It uses AI to edit audio by editing text.
Herman
Descript being on the list is interesting because it shows the "multimodal" shift. It is not just text-in, text-out. It is "Audio-to-text-to-editing." That is a very token-intensive process because you are transcribing hours of audio.
Corn
Think about our show, Herman. If we use Descript to edit this, it has to transcribe 45 minutes of us rambling, then it has to use an LLM to identify the "filler words," then it has to regenerate the audio to bridge the gaps. That’s a massive amount of back-and-forth data.
Herman
And Novelcrafter is for the "serious" writers. It is like a "Scrivener" for the AI age. It helps you manage your world-building, your character sheets, and your plot outlines, and then it connects to OpenRouter so you can "co-write" with the AI. Again, it is about "context." If you are writing a ninety-thousand-word novel, you need an AI that can "see" the whole thing.
Corn
So, across all these categories—Coding, Roleplay, Creative Writing—the "common thread" is Context Management. The apps that are winning are the ones that are best at "feeding the beast" the right information at the right time.
Herman
And doing it without breaking the bank. Which brings us to the practical side of this. If you are a listener and you are looking at these rankings, what is the "Alpha"? What is the move? First, I think you have to look at the "Coding Agent" category as a productivity cheat code. If you aren't using something like Cline or Kilo Code, you are basically writing code with one hand tied behind your back.
Corn
I tried one of those "autonomous" agents last week. I told it to "Fix the styling on my personal website." I watched it open my browser, look at the site, go "Hmm, that header is ugly," go back to the code, change the CSS, refresh the browser, and then say "I think it looks better now." It was like watching a ghost inhabit my computer. It was terrifying and amazing.
Herman
That is the "OpenClaw" experience. And as Zea Gatdula pointed out in that blog post Daniel sent over, the "cost" of that ghost can be high if you aren't careful. She managed to reduce her API costs by eighty-five percent just by being smarter about "Reasoning Levels."
Corn
"Reasoning Levels." That sounds like a sci-fi term. "Commander, the AI has reached Reasoning Level Nine!" What does it actually mean in the real world?
Herman
It means not using a "Sledgehammer" to crack a nut. If the agent is just "checking if a file exists," you don't need Claude three point five Sonnet or Gemini Pro. You can use a tiny, cheap model like DeepSeek or a free-tier Flash model. You only "spin up" the big brain when the agent is actually stuck or needs to write complex logic.
Corn
So, "Agentic Orchestration." The next big skill for humans isn't "Prompt Engineering," it is "Agent Management." It is knowing when to let the intern handle it and when to call in the senior dev.
Herman
And the "OpenRouter" rankings are basically a leaderboard for which "Agent Managers" are currently winning. If OpenClaw is at number one with sixteen trillion tokens, it means the "Local Gateway" model is the winner of early 2026. People want their agents to be "local" but their "brains" to be "global."
Corn
I want to go back to the "Roleplay" thing for a second, because I think there is a hidden lesson there for business. If people are willing to spend trillions of tokens talking to a fictional character, what does that say about the "User Interface" of the future? Maybe we shouldn't be building "Dashboards" and "Sidebars." Maybe we should be building "Personas" for our software.
Herman
"The Quickbooks Vampire."
Corn
"I see you haven't reconciled your taxes, mortal. Do you wish to face the wrath of the IRS, or shall we dance with the spreadsheets?" I mean, I would actually do my taxes if the UI was a bit more dramatic.
Herman
You joke, but look at "BLACKBOXAI" on the list. It is a coding assistant, but it has a very distinct "personality" and "brand." It doesn't just feel like a sterile tool. The "Human-AI" connection is the "secret sauce" for engagement. Even in coding, the agents that "talk back" to you and explain their "thought process" are the ones that people trust more.
Corn
It is the "Show your work" principle. If an agent just changes my code and says "Done," I am nervous. If it says "I noticed you were using an older version of this library, so I updated it and checked for breaking changes in the documentation," I feel like I have a partner.
Herman
And that "explaining the thought process" consumes more tokens! It is a "Trust Tax." We are paying in tokens to feel comfortable with what the AI is doing.
Corn
That is a brilliant way to put it. The "Trust Tax." We are literally burning compute to build a bridge of understanding between human and machine. And right now, we are building sixteen trillion tokens worth of bridge every day via OpenClaw.
Herman
Let's talk about the "Coding Agent Wars" a bit more deeply, because this is where the money is. Kilo Code at five trillion tokens. Why Kilo? What is the technical "hook" there?
Corn
From what I have seen, Kilo's "hook" is their "Indexing." They don't just "read" your code; they build a "semantic map" of your entire repository. So when you ask a question, it isn't just searching for keywords. It understands that "Function A" in "File X" is connected to "Variable B" in "File Y."
Herman
That is "Repository-level Context." And that is the "Holy Grail." Most LLMs have a "window," but a codebase is a "house." You need to be able to see through the walls. Kilo Code's dominance suggests that their "indexing" is better than the "out of the box" solutions from the big providers.
Corn
It also suggests that people are moving away from "The Sidebar." Remember a year ago, everyone was excited about the "AI Sidebar" in their browser or their editor? The rankings show that the "Sidebar" is dying. The "Agent" is taking over. You don't want to "chat" with your code in a sidebar; you want the code to "fix itself" in the main window.
Herman
It is the shift from "Chat" to "Action." My Weird Prompts listeners might remember we talked about "Claude Code" a few weeks ago—that "Agentic Harness" approach. It is sitting at two trillion tokens. It is growing, but it is still chasing Kilo. Why? Maybe because Claude Code is "CLI-first." It is a tool for "power users" who live in the terminal. Kilo is for the "VS Code" masses.
Corn
It is "Accessibility" versus "Power." But then you have "Cline" and "Roo Code" (which used to be called Koto) showing massive growth. These are "Extensions." They are the "Middle Way." They give you the power of an agent right inside the editor you already use. I think the "Extension" model is going to win in the long run.
Herman
I agree. It is about "reducing friction." If I have to switch to a terminal to talk to my AI, I am losing focus. If the AI is just a "button" next to my "Save" icon, I am going to use it a thousand times a day. And that is how you get to five trillion tokens.
Corn
Let's talk about the "Small Models" for a second. DeepSeek and Llama three. They are "free" or "cheap" on OpenRouter, but they are driving a ton of the volume. Daniel mentioned that people are using "Gemini two point five Flash" because it is fast and has a huge context window.
Herman
This is the "Intelligence Per Dollar" metric. If you are running an agent like OpenClaw, you don't need "GPT-5" levels of intelligence for every single step. You need a model that is "smart enough" to follow instructions but "cheap enough" that you can run it ten thousand times. This is why Google's "Flash" series and the "DeepSeek" models are the "workhorses" of the ranking. They are the "fuel" for the agents.
Corn
It is like a car. You don't use "Rocket Fuel" to go to the grocery store. You use regular unleaded. The "Big Models" are for the "Heavy Lifting"—the complex architectural decisions—and the "Small Models" are for the "Commute."
Herman
And OpenRouter makes it incredibly easy to "swap" those fuels. You can have an agent that uses "Pro" for the strategy and "Flash" for the execution. That "hybrid" approach is what is driving the token explosion. It is making AI "affordable" at scale.
Corn
But how does the developer actually set that up? Do they have to write a bunch of conditional code?
Herman
Not anymore. That’s what’s so cool about these new agent frameworks. They have "Router" logic built in. The framework looks at the task and asks, "Is this a logic problem or a data entry problem?" If it’s data entry, it automatically routes the call to the cheapest model. If the cheap model returns an error, it automatically escalates to a bigger model. It’s "Auto-scaling intelligence."
Corn
So, if you are a developer listening to this, and you are building a "wrapper"... stop. The market is telling you that "wrappers" are dead. People want "Agents." They want "Local Gateways." They want "Infrastructure." If your app doesn't "do" something—if it just "says" something—you are going to get eaten by the trillions of tokens being spent on actual automation.
Herman
And if you are in the "Entertainment" space, don't be ashamed of the vampires! The data shows that "Roleplay" is a massive, underserved market. There is so much room to build "Personalized Entertainment" that uses these long-context models. Imagine a "Mystery Game" where the AI is the "Dungeon Master" and it remembers every clue you have found over a month of play.
Corn
I would play that. "Herman Poppleberry and the Case of the Trillion Missing Tokens."
Herman
I would be the prime suspect, obviously. But let's look at the "Creative Tools" again. Descript and Novelcrafter. These are "Vertical AI." They aren't trying to do everything. They are doing one thing—editing audio or writing books—perfectly.
Corn
I think "Verticalization" is the other big trend. We have "General Agents" like OpenClaw, and then we have "Specialized Agents" like Kilo Code. The "Generalists" provide the foundation, and the "Specialists" provide the value.
Herman
It is a healthy ecosystem. It looks like a real "economy" now, not just a bunch of hype. Trillions of tokens being spent on code, trillions being spent on stories, trillions being spent on automation. This is what "AI Adoption" actually looks like in 2026. It isn't a "killer app." It is a "killer infrastructure."
Corn
And OpenRouter is the "Stock Exchange" for that infrastructure. It is the best place to see which "Model" is currently the "Gold Standard." Right now, looking at the data, it seems like the "Claude" family and the "Gemini" family are the ones people are actually "plugging into" for the heavy lifting, while "Llama" and "DeepSeek" are the "utility" models.
Herman
And don't forget the "Open Source" factor. The fact that the top apps—OpenClaw, Cline, SillyTavern—are mostly "open" or "local-first" is a huge signal. Power users don't want to be "locked in." They want "Portability." They want to be able to switch from OpenRouter to their own local server if the prices change.
Corn
It is the "Decentralization" of AI. We spent ten years moving everything to the "Cloud," and now, thanks to agents, we are moving the "Control" back to the "Edge." My computer is becoming a "Command Center" again, not just a "Thin Client" for a website.
Herman
That is the "Local Gateway" revolution Daniel was talking about. It is a fundamental shift in how we think about computing. Your "Agent" is your "Operating System." And that Operating System is hungry. It wants tokens. Trillions of them.
Corn
But wait, if everyone is running these locally, what happens to the big cloud providers? If I’m running my agent on my laptop using an OpenRouter key, am I basically bypassing the big "walled gardens" of Apple and Microsoft?
Herman
In a way, yes. You’re using their models, but you’re not using their "experience." You’re not using Copilot; you’re using your own custom agent. This is why the big players are so desperate to integrate AI into the OS level—they don't want to lose that "interface" layer to open-source agents like OpenClaw.
Corn
Well, I am hungry too, but for lunch, not tokens. Let's wrap this up with some practical takeaways for the folks at home. Herman, give me the "Power User" checklist based on these rankings.
Herman
Okay, three things. One: If you are a developer, start using an "Agentic" extension like Cline or Roo Code today. Stop "chatting" with your code and start "directing" it. Two: If you care about privacy and high-level automation, look into "OpenClaw." It is the number one app for a reason. It is the most powerful way to run a local "Agentic" workflow. And three: Monitor the OpenRouter "Top Apps" list once a month. It is the only way to spot the "next big thing" before it hits the mainstream.
Corn
My takeaway is simpler: The "Vampires" are winning. If you can make your software "engaging" and "persistent"—if you can give it a "memory" and a "personality"—you are going to capture more attention than a sterile dashboard ever will. We are moving from the "Information Age" to the "Relationship Age." Even if that relationship is with a bot that fixes your Python scripts.
Herman
"The Relationship Age." I like that. It is a bit "Sloth-like" in its focus on connection, but it is accurate. We are building "Partnerships" with these models.
Corn
Well, I am not allowed to say that word, am I? Herman almost got me. I mean, you are right on the money there, Herman. It is a partnership. And like any partnership, it requires trust, communication, and a whole lot of "reasoning steps."
Herman
And sixteen trillion tokens.
Corn
Naturally. Well, this has been a deep dive into the "Hidden Economy" of AI. If you want to see the charts for yourself, head over to OpenRouter and look at the "Rankings" tab. It is a trip.
Herman
It really is. And big thanks to Daniel for the prompt. It is always good to "Follow the Tokens."
Corn
Before we go, we have to thank the people who keep the lights on. Big thanks to our producer, Hilbert Flumingtop, for making sure we don't sound like we are recording this from inside a tin can.
Herman
And a huge shout-out to Modal for sponsoring the show. They provide the GPU credits that power our generation pipeline. If you are building the next "OpenClaw" or "Kilo Code," you should check out Modal for your serverless GPU needs.
Corn
This has been My Weird Prompts. If you enjoyed the show, do us a favor and leave a review on Apple Podcasts or Spotify. It actually helps more than you think—it tells the algorithms that we aren't just two animals talking to themselves in a room.
Herman
Though, technically, we are.
Corn
Details, details. You can find us at myweirdprompts dot com for the full archive and all the ways to subscribe. We are also on Telegram if you want to get notified the second a new episode drops.
Herman
See you in the next one.
Corn
Keep those agents on a short leash. Goodbye.

This episode was generated with AI assistance. Hosts Herman and Corn are AI personalities.