You know, Herman, I was looking at my terminal the other day, and it hit me how much of my life is just typing strings into a black box. And now we have these AI agents doing the same thing. It’s a bit full circle, isn’t it?
It’s the return of the command line as the universal language, Corn. Honestly, it’s one of the most fascinating shifts in the last couple of years. We spent thirty years trying to hide the terminal behind layers of glass and colorful buttons, and now the most advanced intelligence we’ve ever built just wants to talk to a /bin/sh prompt.
Well, Daniel is thinking along the same lines. He sent us a text prompt today that really gets into the weeds of how these agents actually talk to the services we use. Here is what he wrote: "Many tech vendors produce excellent and well-maintained CLIs. Often, you can get excellent results by defining a skill for an AI agent to use a CLI like gh. If we can do this, why do we need the GitHub MCP? Is there room for both CLIs and MCPs, or are we going to need to pick a lane?"
That is a quintessential "My Weird Prompts" question because it touches on the invisible architecture of how the future is being built. We’re at this crossroads where the old school—the CLI—is meeting the new school—the Model Context Protocol, or MCP. And by the way, for the folks listening, today’s episode is actually powered by Google Gemini Three Flash. It’s writing the script for us while we dive into this.
I like that Gemini is writing about how agents talk to each other. It’s very meta. But let’s get into Daniel’s question. On one hand, you’ve got the GitHub CLI, gh, which has been around since what, twenty twenty? It’s rock solid. It handles auth, it handles pagination, it has beautiful table formatting. On the other, we have the shiny new GitHub MCP server. If I can just tell an agent, "hey, run gh pr list," why would I bother setting up an entire protocol server? Is this just engineers over-engineering, or is there a "there" there?
It’s definitely not just over-engineering, but Daniel’s point about "Skills" is the real disruptor here. To understand the tension, we have to look at how an LLM actually perceives these two things. When an agent uses a CLI, it’s basically acting like a human junior dev. It’s typing a command, getting a blob of text back, and trying to make sense of it. It’s reading the stdout and stderr like a set of tea leaves. When it uses MCP, it’s more like a professional integration. It’s using a structured, typed interface where the rules are strictly defined before the first command is even sent.
Right, but the CLI approach feels so much more... natural? I mean, these models were trained on millions of lines of GitHub documentation and Stack Overflow posts. They already know how to use gh. It’s like they were born speaking "bash."
Well, not exactly, but you're hitting on the core strength of the CLI. We call this "latent knowledge." Because the training data is so saturated with CLI usage, the model doesn't need a manual to know that gh pr create starts a pull request. It doesn't need to be told that -R specifies the repository. This leads to what people are calling the "Context Tax" problem.
The Context Tax. That sounds like something I’d try to write off on my returns. Explain that.
It’s a huge issue in early twenty twenty-six. When you connect an agent to an MCP server—let’s take the official GitHub one—the server has to tell the agent what it can do. It sends over a schema. For GitHub, which has over ninety different tools in its MCP server, that schema can be massive. We’re talking twenty-three thousand to fifty-five thousand tokens just to say "hello, here is what I can do."
Wait, fifty-five thousand tokens? That’s half the "brain space" of some models just to load the menu before you even order the food. If I'm using a model with a 128k window, I've basically just lobotomized half the agent's memory before I've even asked it to look at my code.
Precisely. If you have a context window of a hundred thousand tokens, you’ve just spent half of it on the overhead of the protocol. Meanwhile, if you use a "Skill"—which is basically just a small Markdown file that says "You have access to the gh CLI, here are the three commands we use for our workflow"—that might only cost you two hundred tokens.
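For scale, a skill file like the one Herman describes really can be that small. This sketch is hypothetical — the repository name and team rules are invented — though every `gh` flag in it is real:

```markdown
# Skill: GitHub workflow (gh CLI)

You have access to the `gh` CLI, already authenticated.
Team conventions:

- List open PRs: `gh pr list --repo example-org/example-app`
- Open a PR: `gh pr create --fill --label needs-review`
- We always squash-merge: `gh pr merge --squash`

Never delete branches or repositories.
```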
So by those numbers it can be more than a hundred-to-one in some cases. If I’m a developer paying for tokens, or if I want my agent to actually remember the code it’s looking at instead of the API documentation, the CLI seems like the obvious winner. Why is there even a debate?
Because the CLI is "messy" for a machine. Think about what happens when a command fails. If you run a CLI command and it errors out, the agent gets a string of text back. Maybe it’s a helpful error, maybe it’s a stack trace, maybe it’s just "command not found." The agent then has to "reason" its way out of that error. It has to guess if it made a typo or if the server is down. With MCP, the error is structured. The server can say, "Error code four hundred, missing parameter: title." It’s much harder for the agent to hallucinate a fix when the boundaries are that rigid.
I see. So the CLI is the "vibes" approach—fast, cheap, but occasionally prone to tripping over its own feet. MCP is the "bureaucracy" approach—expensive, heavy, but very clear about the rules.
That’s a good way to put it. And there’s a huge security angle here too. If you give an agent a "CLI Skill," you’re essentially giving it a terminal. If that agent gets confused or if there’s an injection attack, it could theoretically run rm -rf or start poking around files it shouldn’t touch. It’s very hard to sandbox a raw shell without losing the very flexibility that makes the CLI great. MCP acts as a wrapper. You can define exactly which tools are exposed. You can say "you can list PRs, but you cannot delete repositories." It provides a governance layer that enterprises crave.
But couldn't you just restrict the CLI user's permissions? Like, give the agent a restricted shell or a specific IAM role?
You can, but then you're managing security at the OS level and the cloud level and the application level. It becomes a configuration nightmare. With MCP, the security is baked into the protocol. It’s "Capability-based security." The agent literally doesn't know the "delete" command exists because the server never told it.
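Concretely, an MCP server advertises what it can do in its `tools/list` response, so a read-only deployment simply never mentions the destructive tools. The tool name and schema below are illustrative:

```json
{
  "tools": [
    {
      "name": "list_pull_requests",
      "description": "List pull requests in a repository",
      "inputSchema": {
        "type": "object",
        "properties": {
          "repo": { "type": "string" }
        },
        "required": ["repo"]
      }
    }
  ]
}
```

If "delete_repository" isn't in that list, the agent cannot call it, no matter how confused it gets.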
You mention "Skills" as this emerging middle ground. I’ve been seeing this more often—these SKILLS.md files in repositories. Is that the bridge between these two worlds?
It is. Projects like Brad Feld’s CompanyOS or the stuff Supabase is doing with agent-skills are pushing this "Skills" architecture. The idea is that the Skill holds the "institutional knowledge." It tells the agent, "At this company, we always use squash merges," or "Make sure to tag the QA team on every PR." That’s logic that doesn’t live in the CLI tool itself and doesn't live in the MCP protocol. It’s the "how" and "why."
So, in that world, the Skill is the brain, and the CLI or MCP is just the hand. I guess a Skill could technically point to either one, right?
And the "Standalone Test" is the gold standard there. A good Skill should be written so clearly that if the MCP server goes down, the agent could still draft the command for a human to run. It keeps the agent smart regardless of the transport layer. Think of it like a recipe. The CLI is the stove, the MCP is a high-tech smart-oven, but the Skill is the actual recipe that tells you how to make the souffle.
Let’s talk about the "Unix Philosophy" for a second. One thing I love about the CLI is piping. I can take the output of one command and shove it into another. I can take a list of AWS instances, filter them with jq, and then use gh to open an issue for each one that’s over-provisioned. Can an agent do that with MCP?
Not easily. Not yet, anyway. To do that in MCP, the agent has to call the AWS server, get the data, store it in its own memory, reason about it, and then call the GitHub server. It’s a lot of round trips. Every round trip is a potential point of failure and more latency. In a CLI, the agent can just write a one-line bash script. It’s incredibly powerful. Andrej Karpathy made a great point about this—he said CLIs are exciting because they are legacy. They are stable. They don't change every two weeks, which makes them perfect for AI models that were trained on data from a year ago.
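Here is a sketch of the kind of one-liner Herman means. The AWS call is stubbed with a here-document so the pipeline runs without credentials, and the `gh` step is echoed rather than executed — drop the `echo` to actually file issues:

```shell
# Stub for `aws ec2 describe-instances` piped through jq:
# each line is an instance ID and its CPU utilization percentage.
instances() {
cat <<'EOF'
i-0abc123 12
i-0def456 91
i-0ghi789 8
EOF
}

# Open an issue for every instance under 15% utilization (over-provisioned).
# The `gh issue create` call is echoed here as a dry run.
instances | awk '$2 < 15 { print $1 }' | while read -r id; do
  echo gh issue create --title "Over-provisioned instance: $id"
done
```

One pipeline, no round trips through the model's context between steps.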
It’s funny to think of "legacy" as a feature, but for an AI, stability is everything. If GitHub updates their MCP schema tomorrow, the agent might get confused. If they change a flag in the gh CLI, there’s probably a thousand blog posts explaining why, and the model likely already knows. But wait, what about versioning? If I'm using an old model with a brand new CLI version, doesn't that break the "latent knowledge" advantage?
It can! That's a great follow-up. If gh introduces a new feature today, GPT-4o doesn't know about it. This is where the "Context Tax" actually becomes a "Context Investment." With MCP, the server tells the agent about the new feature instantly. The agent doesn't need to have been trained on it. So if you're using a rapidly evolving tool, MCP actually wins because it provides the "manual" in real-time.
There’s also the "Human-in-the-loop" factor. If an agent uses a CLI and it gets stuck, I can look at the terminal history and see exactly what it tried to do. It’s readable. MCP logs are... well, they’re JSON. They’re for machines. For a developer trying to debug why their agent just closed forty pull requests by accident, the CLI provides a much more intuitive audit trail.
That’s a huge point. If I see the agent typed gh pr close --all, I know exactly where the logic failed. If I see a JSON blob with a trace ID and a series of method calls, I have to go find a formatter just to understand the mistake.
Okay, so we’ve got the efficiency and the "pipability" of the CLI versus the structure and security of MCP. Daniel asked if we have to pick a lane. What’s the industry actually doing? Are people choosing?
The consensus is actually moving toward a hybrid approach. Think of it as "CLI-first, MCP-fallback." Or, perhaps more accurately, "CLI for local dev, MCP for production." If I’m working on my local machine, I want my agent to be fast. I don’t want to wait for a protocol handshake every time I want to check a git status. I’ll give it the CLI skill. But if I’m running a fleet of agents in the cloud that are managing a production environment, I want the guardrails of MCP. I want to know that the agent is authenticated via OAuth and that every action is logged in a structured way that my security team can monitor.
That makes sense. It’s like the difference between me fixing a leaky pipe in my own house versus a contractor working on a skyscraper. I can just grab a wrench; the contractor needs a permit and a safety harness.
And we’re seeing "AI-first CLIs" now too. Tools like the Kraken CLI or the new CockroachDB ccloud CLI. These are built specifically to be consumed by agents. They move away from "human noise"—like progress bars or interactive prompts that ask "Are you sure? Y/N"—and they output pure machine-readable data like NDJSON. It’s the CLI evolving to meet the agent halfway.
That’s a "fun fact" moment right there. I read a report that some teams are actually aliasing their CLIs to remove the "Are you sure?" prompts specifically so their agents don't get stuck in an infinite loop waiting for a "Y" that they don't know they need to provide.
Oh, it's a real problem! I've seen agents get stuck for twenty minutes because a CLI tool tried to be "helpful" by opening a pager like less. The agent is just sitting there waiting for more text, and the tool is waiting for a spacebar press. This is why MCP is safer—there is no "pager" in a protocol.
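A few environment variables go a long way against exactly this failure mode. All four below are documented knobs for `gh`, `git`, and POSIX-ish tools generally, though it's worth confirming `GH_PROMPT_DISABLED` against your installed `gh` version:

```shell
# Make CLIs agent-safe: no pagers, no interactive prompts.
export GH_PROMPT_DISABLED=1   # gh: never stop to ask "Are you sure? Y/N"
export GH_PAGER=cat           # gh: stream output instead of invoking less
export GIT_PAGER=cat          # git: same idea
export PAGER=cat              # fallback for anything that honors $PAGER

# Sanity check: confirm the variables are set in this shell.
env | grep -E '^(GH_PROMPT_DISABLED|GH_PAGER|GIT_PAGER|PAGER)=' | sort
```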
So the CLI is becoming more like a protocol, and the protocols like MCP are trying to become more efficient. They’re converging.
They are. There’s a really interesting case study from earlier this year involving Intune automation. A team tried to build an agent to manage device compliance. They started with a specialized MCP server. Every time the agent wanted to check a device, it cost them about a hundred and forty-five thousand tokens because of the overhead of the device schemas. They switched to a simple CLI skill using the Intune PowerShell modules, and the cost dropped to about four thousand tokens. A thirty-five-times reduction in cost and a massive increase in speed.
That’s not just a marginal gain; that’s the difference between a project being viable or being a money pit. It also means the agent can handle thirty-five times more data in its actual working memory.
And that’s why the "Context Tax" is the biggest hurdle for MCP right now. Until we have models with millions of tokens of "free" context, or until MCP finds a way to do "lazy loading" of tools where it only describes the tool the agent actually needs, the CLI is going to remain the king of efficiency.
But wait, didn't we talk about MCP aggregators in a previous episode? Like Composio? Doesn't that help with the management of all these servers?
It does, but it doesn't solve the token problem. In fact, it can sometimes make it worse because now you’re connecting to a hub that might have hundreds of tools. The agent gets overwhelmed. It’s like being handed a thousand-page dictionary when you just wanted to know how to say "where is the bathroom." The model starts to "forget" the earlier tools in the list because the prompt is so long.
So, if I’m a developer today, and I’m building an agent to help my team with GitHub, what’s the move? Do I go through the trouble of setting up the GitHub MCP server, or do I just point the agent at the gh CLI and call it a day?
I think the pragmatic move is to start with the CLI. It’s already there, it’s zero-config, and your agent already knows how to use it. You write a SKILLS.md file that defines your specific team's workflow—how you label issues, who you assign to reviews—and you let the agent run with it. You only move to MCP when you hit a wall.
What does that wall look like?
The wall is usually security or complexity. If the agent needs to perform actions behind authentication you don't want to manage in a shell script, like multi-factor auth or narrowly scoped tokens, or if you're worried about it having too much raw power over the system, that's when you wrap things in MCP. The other wall is distribution: if you're a vendor building a tool you want other people's agents to use, you provide an MCP server because it's a universal standard. It lets someone else's agent discover your tool without you writing a "How to use our CLI" guide for every LLM.
So, CLIs are for "me and my team," and MCP is for "the world and the enterprise."
That’s a very sharp way to divide it. And we’re seeing that play out with GitHub specifically. The gh CLI is still the gold standard for individual productivity. But the GitHub MCP server is becoming the backbone for these "Agentic Control Planes" where you have non-coders running complex workflows across an entire organization.
It’s interesting you mention non-coders. We’ve talked before about how the terminal is being "hijacked" by people who don't necessarily know bash but can tell an agent what they want. In that world, the CLI is just an implementation detail. The user doesn't care if it's gh or an MCP call; they just want the PR opened.
Right. But for the person building the agent, the choice matters immensely for the bottom line. If I can run ten thousand agent tasks for the price of one thousand just by switching from MCP to a CLI skill, I’m going to do it every time. There's also the latency aspect. Every time an agent has to parse a massive JSON schema, it adds milliseconds—sometimes seconds—to the response time. For a real-time coding assistant, those seconds feel like an eternity.
Is there a world where MCP wins entirely? Where the token cost becomes zero and the "legacy" training data advantage of the CLI fades away?
Maybe, but it’s a long way off. Even if tokens become "free," the "reasoning" overhead is still there. LLMs are just better at following a pattern they’ve seen a million times in their training data than they are at following a brand-new schema they just read five seconds ago. It’s like the difference between driving a route you’ve taken every day for ten years versus following a GPS in a city you’ve never visited. Even if the GPS is perfect, you’re still going to be a little slower and more hesitant.
I love that. The CLI is the "commute" the AI has done a million times. It knows where the potholes are. It knows which exits are tricky. MCP is the high-tech GPS that’s technically more accurate but requires more cognitive load to follow. It’s the difference between "intuition" and "instruction."
And that’s why I don’t think we’re going to "pick a lane." I think we’re going to see a permanent multi-layered stack. You’ll have your raw "Skills" layer for high-frequency, low-cost tasks. You’ll have your MCP layer for cross-app orchestration and high-security tasks. And you’ll probably have a "Human" layer that still uses the CLI directly for the stuff that’s too nuanced for an agent.
It’s a good time to be a CLI maintainer. Five years ago, people were saying the CLI was dead and everything was going to be GUI-based. Now, the CLI is the most important interface in the world again, just not for humans. It's the API that wasn't meant to be an API.
It’s the ultimate irony of the AI era. We built these incredibly complex visual interfaces for decades, and the moment we created "intelligent" machines, they looked at our screens and said, "Actually, can I just have the text prompt from nineteen seventy-five? It’s much more efficient." They don't want the pixels; they want the semantics.
It’s the revenge of the green-on-black text! So, let’s wrap this with some practical takeaways for the folks listening. If you’re building an agentic workflow right now, don't feel like you're "behind" if you're just using CLI commands. In fact, you're probably being more efficient.
Step one: Audit your tool usage. If you're using an MCP server for something like GitHub or AWS, look at your token logs. See how much you're paying just for the "handshake." If it's more than thirty percent of your total context, it's time to look at a CLI skill. You might be surprised how much "dead air" is in your prompts.
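The arithmetic behind that audit fits in one line of shell. The figures below are the example numbers from earlier in the episode, not measurements:

```shell
# Handshake cost as a share of the context window, using the
# episode's example figures (low end of the GitHub MCP schema
# against a typical mid-size 128k window).
schema_tokens=23000
context_window=128000

echo "$schema_tokens $context_window" |
  awk '{ printf "handshake: %.0f%% of context\n", 100 * $1 / $2 }'
```

Swap in your own logged numbers; past roughly thirty percent, a CLI skill starts looking attractive.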
Step two: Document your "Skills." Even if you’re using a CLI, don't just give the agent raw access. Use a Markdown file to explain how your team uses those tools. That "institutional knowledge" is the real secret sauce. It makes the agent feel like a member of the team instead of just a script. Tell it things like "We never delete the main branch" or "Always run npm test before pushing."
And step three: Stay flexible. The MCP ecosystem is moving fast. We’re already seeing "compact" MCP formats being proposed that might slash that context tax. There’s a proposal for "Binary MCP" which would be even faster. But for today, in April twenty twenty-six, the CLI is still the most pragmatic "Skill" an agent can have. It's the universal adapter.
I’m going to go home and apologize to my terminal. I’ve been taking it for granted. It’s not just a black box; it’s the linguistic bridge to the future. I might even change my theme to something a bit more "agent-friendly."
It really is. And it’s a testament to good design. The fact that a tool built for humans in the seventies and eighties is still the best way for a twenty twenty-six AI to interact with the world... that’s some serious engineering longevity. High-quality text interfaces are essentially timeless.
Well, Daniel, I hope that answers the question. It’s not a "lane" situation; it’s a "tool for the job" situation. Use the CLI for the heavy lifting and the efficiency, use MCP for the governance and the "handshakes." And remember, every token you save is a token the agent can use to actually solve your problem.
And keep an eye on those "AI-first CLIs." That’s where the real innovation is happening—tools that are built to be piped into an LLM brain. We're going to see a whole new generation of tools that output structured data by default and skip the fancy formatting entirely.
This has been a deep one. I feel like we could talk about token math all day, but I can see Herman’s eyes starting to glow with the desire to go read more white papers. I think he's already mentally parsing the next GitHub update.
Guilty as charged. There’s a new one on "Recursive MCP Discovery" that I’ve been eyeing... it explores how agents can find other MCP servers without a central registry. It's wild stuff.
No! Save it for the next prompt. We need to wrap this up before we descend into pure JSON. Big thanks as always to our producer, Hilbert Flumingtop, for keeping the gears turning behind the scenes and making sure our own "handshakes" are successful.
And a huge thank you to Modal for sponsoring the show. They provide the GPU credits that power the entire pipeline that makes "My Weird Prompts" possible. If you're running agentic workloads, Modal is where you want to be—it's the infrastructure that actually keeps up with the speed of these models.
This has been "My Weird Prompts." If you enjoyed this dive into the CLI versus MCP wars, do us a favor and leave a review on whatever app you’re using to listen. It actually makes a huge difference in helping other curious humans—and maybe a few agents—find the show.
We’ll be back soon with another prompt from Daniel. Until then, keep your shells open and your context windows clear. Don't let the bureaucracy of the protocol slow down the speed of your ideas.
See ya.
Bye.