#2014: Coding Tools Are Secretly System Agents

They call it a coding assistant, but real users are treating it like a personal operating system.

Featuring

Daniel

Corn

Herman

Episode Details

Episode ID: MWP-2170
Published: Apr 4
Duration: 26:51
Audio: Direct link
Pipeline: V5
TTS Engine: chatterbox-regular
Script Writing Agent: Gemini 3 Flash
Topics: ai-agents model-context-protocol software-development

AI-Generated Content: This podcast is created using AI personas. Please verify any important information independently.

The terminal agent market is suffering from a massive framing gap. Anthropic, Google, and OpenAI market Claude Code, Gemini CLI, and Codex as "coding assistants" for developers, but the usage patterns discussed in this episode tell a different story. Users aren't just writing Python scripts; they're running podcast production, Linux system administration, and legal research through these tools. That disconnect between marketing and practice is the episode's thesis: these tools are general-purpose system operators, not just code completers.

The "Claude Spaces" concept, pioneered by power user Daniel, perfectly illustrates this shift. Instead of using git repositories for traditional software projects, these spaces function as structured workspaces for non-coding tasks. A "Podcast Production" repo might contain audio files and show notes, while a "LAN Management" space handles router queries and NAS configurations. The secret sauce is the CLAUDE.md file, which acts as persistent memory, telling the agent exactly how to handle domain-specific tasks. It transforms a raw LLM into a specialized expert for any workflow.

This repurposing of developer tools creates unexpected benefits. A git repo gives the agent a bounded context: when you open a terminal agent in a repo, it can read the files and commit history there to work out the project's state. More importantly, git offers a safety net for AI experimentation. The fear of AI "messing things up" shrinks when any committed file can be restored with git checkout. This version control for an entire workflow is an advantage that non-developers haven't fully realized yet.

However, the current "coding tool" framing creates real limitations. The industry's obsession with benchmarks like HumanEval and SWE-bench drives product development toward quantifiable coding metrics rather than general-purpose utility. Marketing a "General Purpose Do-Anything-in-the-Terminal" tool is difficult when success is measured in vibes rather than test scores. Additionally, the "safe sandbox" argument makes business sense: developers already know their way around a terminal and are less likely than non-technical users to blame the vendor after approving a destructive command.

The Model Context Protocol (MCP) further breaks the coding label. When Claude Code connects to servers for Google Drive or Slack, it becomes an orchestration hub for your entire digital life, not just a code writer. Yet current tools assume primary goals are script-oriented, missing opportunities for better media handling, file previews, and non-text file analysis. This creates a "spoon problem"—AI uses workarounds because the tool was designed for a different purpose.

Looking ahead, the market will likely evolve toward specialized editions. Just as we have "Claude for Enterprise" and specialized medical or legal models, we'll see terminal agents pre-configured for specific use cases. Daniel's DIY approach with Claude Spaces essentially creates these specialized editions before companies build them. The future may involve standard patterns for different workspace types—legal, research, project management—allowing any terminal agent to immediately understand the structure and purpose of a workspace.

The terminal agent revolution is happening, but it's being held back by narrow marketing and engineering-focused design. As these tools mature, the industry must recognize they're building general-purpose system operators that happen to use code as their language, not merely coding assistants for developers.


Transcript

So, I was looking at how people are actually using these new terminal agents lately, and it hit me—the way these things are marketed is almost a complete lie. Or at least, it’s a very narrow slice of the truth. We’re told Claude Code, Gemini CLI, and OpenAI Codex are "coding assistants." They’re for developers, right? They’re for writing Python or debugging React. But then you look at what people are actually doing with them, and it’s like using a Ferrari just to drive to the mailbox.

It really is a massive framing gap, Corn. Today's prompt from Daniel is about exactly this—the disconnect between the "developer tool" label and the reality of these tools as general-purpose system operators. Daniel’s been pushing the boundaries of this with something he calls Claude Spaces. Instead of using a git repository for a software project, he’s using them as structured workspaces for everything from podcast production to Linux system administration and even legal research.

It’s fascinating because the terminal doesn't care if you're editing a source code file or a configuration file for a server. And by the way, for everyone listening, a quick bit of housekeeping: today’s episode is actually being powered by Google Gemini 3 Flash. It’s the model writing our script today, which is fitting since we’re talking about the very category of tools it helps power. But Herman, let’s get into this "Claude Spaces" idea. Daniel has an index on GitHub, the Claude Code Projects Index, with over seventy-five repositories. And almost none of them are what a traditional developer would call "code."

My name is Herman Poppleberry, and I have spent way too much time this week digging into how these spaces actually function. What Daniel is doing is essentially using the scaffolding of a development environment to trick the AI into being a high-level executive assistant. Each "Space" is a git repo, but instead of containing a library, it might contain audio files, show notes, or Docker Compose files. The secret sauce is the CLAUDE.md file. In the world of Claude Code, that file acts as the persistent memory or the "brain" for that specific folder.

Right, so if I’m in a "Podcast Production" repo, the CLAUDE.md file tells the agent, "Hey, when I ask you to prep an episode, use this specific template for show notes, connect to this specific Model Context Protocol server to process the transcript, and make sure the final output is formatted for Spotify." It turns a raw LLM into a domain expert for that specific task.

And that’s where the marketing fails. Anthropic calls it "Claude Code." Google calls theirs "Gemini CLI." The branding screams "Software Engineering." But if I can use it to manage my Proxmox server or organize three hundred gigabytes of messy media files, is it really a coding tool? Or is it a System Agent? We’re seeing a tool that is fundamentally a "System Operator" being sold as a "Code Completer."

I think the reason for that is pretty obvious, though. Money and metrics. If you’re Anthropic or Google, how do you prove your model is getting smarter? You point at benchmarks like HumanEval or SWE-bench. Those are quantifiable. You can say, "Our model solved fifteen percent more coding issues than the last version." It’s much harder to market a "General Purpose Do-Anything-in-the-Terminal" tool because the success metrics are vibes-based. "It organized my closet" isn't a benchmark.

There is also the "Safe Sandbox" argument. Writing code is a very clean text-in, text-out workflow. It’s predictable. If an AI writes a bug in a script, the script just doesn't run. But if you market a tool as a "System Administrator AI" and it accidentally runs a recursive remove command on the root directory because it misunderstood a prompt about cleaning up logs, that’s a PR nightmare. By calling it a "coding tool," they’re targeting an audience—developers—who already know how to handle a terminal and, theoretically, won't blame the tool if they approve a destructive command.

But that’s such a disservice to the rest of the world. Think about researchers, data analysts, or even just power users who are tired of clicking through a million GUI menus. Daniel’s example of a "Claude Rescue" mode for GRUB recovery shells is brilliant. That’s not coding; that’s emergency surgery on an operating system. If we keep calling these "coding tools," we’re scaring off the very people who could benefit most from a "Headless Computer Use" agent.

I love that term, Headless Computer Use. Anthropic has been pushing "Computer Use" where the AI literally looks at pixels and clicks buttons like a human would. But that is slow and prone to errors. If the AI can just stay in the terminal, talk directly to the file system, execute shell commands, and pull data via MCP servers, it’s far faster and much more reliable.

Let’s talk about that MCP element, the Model Context Protocol. That’s really what breaks the "coding" label, isn't it? Because if I connect Claude Code to an MCP server for Google Drive or Slack, it’s no longer just looking at the files in my repo. It’s now an orchestration hub for my entire digital life.

It becomes the conductor of an orchestra. I was looking at one of Daniel’s repos for "LAN Management." He’s not writing a new networking protocol. He's using the terminal agent to query his router, check which devices are online, and maybe update a configuration on a Synology NAS. The "code" is just the medium of communication. The actual task is pure IT administration.

It’s like we’re using git as a filing cabinet for our thoughts and workflows because it’s the only structure these agents currently respect. We’re forcing our non-coding lives into a coding-shaped box just to get the benefits of the technology.

And why is that? Because git provides a natural "context window" management system. When you open a terminal agent in a repo, it can read the files. It can see the history. It can work out the "state" of the project. If I just had a giant "Documents" folder, the AI would get lost. But if I have a "Taxes 2025" repo with a CLAUDE.md file explaining how I want my receipts categorized, the agent suddenly has a map.

It’s a bit recursive, isn't it? We use developer tools to build a "scaffold" for the AI so it can help us with things that have nothing to do with development. But I wonder, Herman, does this framing gap actually hurt the product design? Like, are there features we’re missing because the engineers at Google and Anthropic think they’re only building for other engineers?

Oh, absolutely. Think about file previews or media handling. A terminal agent built for a "General System User" would probably have better built-in hooks for displaying images, playing audio snippets, or summarizing non-text files like PDFs or spreadsheets. Right now, if you want Claude Code to analyze a spreadsheet, it has to write a Python script to read the CSV and then print the output. It works, but it’s a "hacky" workaround because the tool assumes your primary goal is the script itself, not the data inside the file.
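For readers who have not watched an agent do this, the throwaway script Herman describes might look roughly like the sketch below. The column names (`vendor`, `amount`) and the sample rows are invented for illustration; nothing here is taken from Daniel's repos or from Claude Code itself.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data standing in for a spreadsheet exported as CSV.
SAMPLE_CSV = """vendor,amount
Acme,120.50
Globex,80.00
Acme,29.50
"""


def totals_by_column(csv_text: str, key: str, value: str) -> dict[str, float]:
    """Sum the numeric `value` column, grouped by the `key` column."""
    totals: dict[str, float] = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row[key]] += float(row[value])
    return dict(totals)


if __name__ == "__main__":
    # Print one line per vendor so the agent can read the result back.
    summary = totals_by_column(SAMPLE_CSV, "vendor", "amount")
    for vendor, total in sorted(summary.items()):
        print(f"{vendor}: {total:.2f}")
```

The point of the sketch is the detour: the user wanted the totals, and the script is only a means of getting them.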

That’s the "Spoon" problem we’ve talked about before—the AI is using a spoon to eat a steak because that’s the only tool it was given. If I’m a researcher trying to manage a library of case files, I don't want to see "Committing changes to main." I want to see "Case file indexed and cross-referenced." The language of git is the language of the developer, and it’s a massive barrier to entry for everyone else.

There’s a psychological hurdle here too. If you tell a writer, "Hey, you can use this amazing AI to organize your chapters and track your character arcs, but first you need to install Node.js, clone a git repository, and learn how to navigate a CLI," they’re going to walk away. We are currently gatekeeping the most powerful productivity leap in a decade behind a "DevOps" wall.

So, do we need a rebrand? Is "Claude Code" a bad name?

I think it’s a great name for the current market, but it’s a terrible name for the potential market. If they want to reach the next hundred million users, they need "Claude OS" or "Gemini Shell." Something that implies ownership over the environment, not just the language.

I’m curious about the "fork" possibility you mentioned earlier. Do you think we’ll see these companies release different flavors of the same engine? Like, you download the "Researcher" edition and it comes pre-configured with MCP servers for JSTOR and Zotero, and the terminal prompt is tailored for document analysis instead of debugging?

It’s almost inevitable. Look at how the LLM market evolved. We started with one giant model, and now we have "Claude for Enterprise," "GitHub Copilot for Business," and specialized models for medical or legal use. The terminal agent is just the next layer of the stack. Daniel’s "Claude Spaces" are basically a DIY version of this. He’s manually creating the specialized editions that the companies haven't built yet.

Let’s dive deeper into one of those specific use cases from Daniel’s index. He has one for "ADHD Research Workspaces." Now, on the surface, that sounds like a notes app task. Why on earth would you use a terminal-based AI agent for that?

Because a notes app is passive. A terminal agent is active. If Daniel is in his "ADHD Research" space, he can say, "Hey, find all the PDFs in this folder that mention executive function, extract the key findings, and update my summary table in the README file." The agent can actually manipulate the data. It can run a local search, it can reformat text, it can even fetch new papers if it has an internet-enabled MCP connection. A notes app just sits there waiting for you to type.
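The "local search" step in that request can be sketched in a few lines. This version is deliberately simplified: it assumes the notes are plain-text or Markdown files, because reading PDFs would need a third-party library, and the function name and file layout are hypothetical.

```python
import tempfile
from pathlib import Path


def files_mentioning(root: Path, phrase: str, suffixes=(".txt", ".md")) -> list[str]:
    """Return relative paths of text files under `root` that contain `phrase`."""
    needle = phrase.casefold()  # case-insensitive comparison
    matches: list[str] = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in suffixes:
            text = path.read_text(encoding="utf-8", errors="ignore")
            if needle in text.casefold():
                matches.append(path.relative_to(root).as_posix())
    return matches


if __name__ == "__main__":
    # Build a tiny throwaway workspace to search.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "paper-notes.md").write_text("Executive function and planning.", encoding="utf-8")
        (root / "sleep.txt").write_text("Notes about sleep.", encoding="utf-8")
        print(files_mentioning(root, "executive function"))
```

An agent would typically reach for a shell search tool instead; the sketch only shows that the search itself is a small, mechanical step, and the value lies in the extraction and table update that follow it.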

And the git structure gives him version control for his own research. If he doesn't like how the AI summarized a chapter, he can just "git checkout" the previous version. It’s an accidental benefit of using developer tools for non-dev work. You get the world’s best undo button for free.

It’s the ultimate "safety net" for AI experimentation. One of the biggest fears people have with agentic AI is that it will "mess things up." It’ll overwrite a file or delete a paragraph. But if you’re working inside a repo, who cares? You can always go back. That’s a "Second-Order Effect" that non-developers haven't realized yet. Git isn't just for code; it’s for "State Management" of your entire project.
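The undo button described here comes down to two habits: commit before the agent touches anything, and restore a file from that commit if you dislike the result. A minimal sketch follows, with the git commands built as argument lists. The helper names are hypothetical, and it assumes git is installed and the workspace is already a repo with a commit identity configured.

```python
import subprocess
from pathlib import Path


def snapshot_commands(message: str) -> list[list[str]]:
    """Commands that record the current state of the workspace as a commit."""
    return [
        ["git", "add", "--all"],
        ["git", "commit", "--message", message],
    ]


def undo_command(path: str) -> list[str]:
    """Command that restores one file to its last committed version."""
    return ["git", "checkout", "HEAD", "--", path]


def run_all(commands: list[list[str]], repo: Path) -> None:
    """Run each command inside the repo, raising on the first failure.

    Note: `git commit` exits with an error when there is nothing to commit.
    """
    for argv in commands:
        subprocess.run(argv, cwd=repo, check=True)
```

In use, `run_all(snapshot_commands("before agent edits"), repo)` would run before a session, and `run_all([undo_command("README.md")], repo)` would discard the agent's changes to that one file.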

I wonder if we’ll see a new kind of "Non-Code Repo" emerge. Like, a standard for how to structure a "Legal Space" or a "Project Management Space" so that any terminal agent can walk in and know exactly where the evidence is, where the timeline is, and what the rules of engagement are.

That’s effectively what Daniel is building with his index. He’s creating a library of "Patterns." And this is where it gets really interesting for the future of the space. If these tools start recognizing these patterns, they can optimize for them. Imagine if Claude Code saw a folder full of audio files and a "Show Notes" template and shifted its personality from "Senior Software Engineer" to "Podcast Producer."

"Claude Producer" has a nice ring to it. But let’s look at the flip side. Why is the industry so slow to see this? You’d think Google, of all companies, would want Gemini CLI to be the ultimate personal assistant for anyone using a Chromebook or a Linux machine.

It goes back to the benchmarks, Corn. The people building these models are engineers. They use engineering tools. When they want to test if a model can "reason," they give it a coding problem because coding has a right or wrong answer. It’s binary. You can run the code and see if it passes the tests. How do you "test" if an AI is a good research assistant? It’s subjective. So the product reflects the testing methodology.

It’s a classic case of "If all you have is a hammer, everything looks like a nail." And if all your benchmarks are coding tests, your AI agent looks like an IDE. But the "Nail" in this case is actually a complex, multi-faceted system administration task that millions of people struggle with every day.

Think about the sysadmin who has to manage fifty Docker containers. Right now, they might have some complex dashboard or a bunch of brittle bash scripts. A terminal agent like Gemini CLI could just "observe" the system and say, "Hey, I noticed container X is consuming ninety percent of your memory. Should I restart it and check the logs for common error patterns?" That is a massive value add that has zero to do with "writing code" in the traditional sense, but it’s still marketed as an "engineering" feature.
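The "observe and flag" step in that scenario is simple to sketch. The code below assumes the agent has already captured one line per container in the form `name percent%`, for example from `docker stats --no-stream --format "{{.Name}} {{.MemPerc}}"`. The sample text, the function names, and the ninety percent threshold are illustrative rather than anything Gemini CLI is known to do.

```python
def parse_mem_percent(stats_output: str) -> dict[str, float]:
    """Parse lines of the form '<name> <percent>%' into {name: percent}."""
    usage: dict[str, float] = {}
    for line in stats_output.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or unexpected lines
        name, percent = parts
        try:
            usage[name] = float(percent.rstrip("%"))
        except ValueError:
            continue  # skip values that are not numeric
    return usage


def over_threshold(usage: dict[str, float], threshold: float = 90.0) -> list[str]:
    """Names of containers at or above the memory threshold, sorted."""
    return sorted(name for name, pct in usage.items() if pct >= threshold)


# Hypothetical captured output for three containers.
SAMPLE = "web 12.40%\ndb 93.10%\ncache 4.05%\n"

if __name__ == "__main__":
    for name in over_threshold(parse_mem_percent(SAMPLE)):
        print(f"{name} is using a lot of memory; restart it and check the logs?")
```

The sketch stops at asking the question, which matches the episode's framing: the agent proposes, and the human approves anything disruptive.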

It’s almost like we’re in the "MS-DOS" era of AI agents. You have to know the magic incantations to make it work. But Daniel’s "Spaces" are like the first step toward a "Windows" or "Mac OS" moment—where the structure is standardized and the power is accessible to someone who isn't a terminal wizard.

Actually, I’d argue it’s the opposite. The GUI was a way to hide the complexity of the computer from the user. Terminal agents are a way to give the user back the power of that complexity without requiring them to be a wizard. The AI becomes the translator. I don't need to remember the flags for the "find" command or the syntax for a "sed" replacement. I just tell the agent what I want, and it speaks the machine's language for me.

So it’s the "Democratization of the CLI."

Precisely! Wait, no—I’m not supposed to say that word. It absolutely is the democratization of the command line. It’s taking the most powerful interface ever created—the terminal—and removing the "Syntax Tax." If you can speak English, you can now operate a Linux server as well as a senior admin who’s been doing it for twenty years.

That’s a bold claim, Herman. You really think a terminal agent makes me a senior admin?

If you have the right "Spaces" and the right "Scaffolding," yes. Because the agent has the collective knowledge of every documentation page and Stack Overflow post ever written. It knows the edge cases. It knows that if you update this package, it might break that dependency. As long as you provide it with the local context—the "State" of your specific machine—it can navigate those waters for you.

This brings up a big security question, though. If we’re encouraging non-technical users to use "System Agents" to manage their computers, aren't we just creating a massive new attack vector? One "hallucinated" command could wipe a drive or open a firewall port.

That is the big "but" in this whole conversation. And it’s probably why the marketing is so conservative. If you call it a "coding tool," there’s an implied "For Professional Use Only" sticker on it. If you call it a "General Computer Copilot," you’re inviting grandma to use it to "speed up her laptop," and that’s a recipe for disaster.

I can see the headline now: "AI Agent Deletes Grandma’s Photos While Trying to Optimize Disk Space."

So the framing gap might actually be a deliberate "Safety Buffer." It keeps the tool in the hands of people who—theoretically—know how to check its work. But as the models get more reliable, that buffer is going to start feeling like a cage.

Let’s talk about the practical side for a minute. If I’m a listener and I’m NOT a developer, but I’m tech-literate enough to open a terminal, how do I actually start using this "Claude Spaces" pattern?

The first step is to stop thinking about "coding" and start thinking about "Workflow Organization." Pick a project, maybe organizing your digital photos or managing a small business’s invoices. Create a folder. Initialize a git repo inside it. Then, create that CLAUDE.md file.

What goes in that file for a non-coder?

Think of it as a "Standard Operating Procedure." You tell the agent: "This folder contains my 2024 invoices. When I ask you to process them, look for the date, the vendor, and the amount. Rename the files to YYYY-MM-DD-Vendor-Amount. If a file is a duplicate, move it to the 'Duplicates' subfolder." Now, every time you open Claude Code in that folder, it knows exactly what its job is. You don't have to explain it every time.
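The two mechanical rules in that procedure, the naming scheme and the duplicate check, can be written down precisely. The sketch below makes two assumptions the episode does not state: vendor names are stripped to letters and digits, and a "duplicate" means a byte-for-byte identical file. The `Invoice` type and both function names are hypothetical.

```python
import hashlib
import re
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Invoice:
    issued: date
    vendor: str
    amount: float


def invoice_filename(invoice: Invoice, extension: str = ".pdf") -> str:
    """Build a YYYY-MM-DD-Vendor-Amount filename for one invoice."""
    vendor = re.sub(r"[^A-Za-z0-9]+", "", invoice.vendor)  # drop spaces and punctuation
    return f"{invoice.issued.isoformat()}-{vendor}-{invoice.amount:.2f}{extension}"


def duplicate_names(files: dict[str, bytes]) -> list[str]:
    """Names of files whose content matches an earlier file in the mapping."""
    seen: set[str] = set()
    duplicates: list[str] = []
    for name, content in files.items():
        digest = hashlib.sha256(content).hexdigest()
        if digest in seen:
            duplicates.append(name)
        else:
            seen.add(digest)
    return duplicates
```

Extracting the date, vendor, and amount from each document is the part left to the agent; these helpers only cover what happens once those three values are known.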

That is so much more powerful than a simple script. Because if an invoice comes in as a weirdly formatted PDF, a script would fail. But the LLM can "look" at the PDF, understand what the vendor name is even if it’s in a different spot, and handle it.

And if it gets stuck, it asks you. "Hey, this vendor is 'Amazon Web Services' but I also see 'AWS.' Should I treat them as the same?" It’s a collaborative system operator. Daniel’s public index is a goldmine for these kinds of patterns. He has one for "Side Hustle Ideation" where the agent helps him brainstorm, research competitors, and draft business plans, all within a structured repo.

I love the idea of using a "Side Hustle" repo. It’s like having a tiny, very focused co-founder who never sleeps and has read every business book ever written.

But again, if you looked at that repo, a developer would say, "This isn't code. Why are you using a code tool for this?" And the answer is: because the code tool is the only thing that has the "Agentic Harness" to actually execute tasks. The "Harness" is the terminal access, the file system access, and the ability to call external tools.

It’s the "Harness," not the "Code." That’s the key distinction. We’re using the harness of a coding tool to pull a very different kind of wagon.

And I think we’re going to see a "Great Rebranding" in the next eighteen months. Anthropic and Google are too smart to leave this money on the table. They’ll eventually realize that for every one developer, there are a hundred "Operations" people who need this exact same technology but are put off by the name.

What would you call it? If you were the head of product at Anthropic, what’s the name for the tool that replaces Claude Code?

I’d go with something like "Claude Command" or "Claude Director." Something that implies authority and action. "Code" is a noun; "Command" is a verb. These tools are fundamentally about "Doing," not just "Writing."

I like "Claude Director." It makes me feel like I’m standing on a movie set, shouting instructions through a megaphone, and a bunch of invisible AI assistants are running around moving the lights and setting up the cameras.

That’s a great image. And the terminal is the megaphone. It’s the most direct, high-bandwidth way to talk to a computer. GUI agents are like trying to direct a movie by pointing at things through a window. The terminal lets you be on the set.

Let’s look at the "System Administration" angle a bit more. Daniel mentions "Claude Rescue" for GRUB recovery. For the non-geeks listening, GRUB is the thing that loads your operating system. If it breaks, your computer doesn't start. It’s a terrifying place to be.

It’s the "Blue Screen of Death" but for Linux, and much more manual. Usually, you’re staring at a black screen with a blinking cursor and no idea what to type. An AI agent that can live in that environment and say, "Okay, I see your partition table, it looks like your boot flag is missing, let me try to regenerate the GRUB configuration for you" is a literal lifesaver.

And that requires the agent to be more than a "coder." It has to be a "Systems Engineer." It has to understand hardware-software interaction.

Which these models already do! They’ve read the Linux kernel mailing lists. They’ve read every manual page. The knowledge is there; it’s just the "Interface" and the "Framing" that are missing. When we call it a "coding tool," we’re essentially telling the AI to ignore ninety percent of its brain.

"Don't worry about the hardware, just focus on the Python syntax." It’s like hiring a world-class chef and telling them they’re only allowed to use the toaster.

And Daniel’s work is basically saying, "Hey, look at all these other appliances in the kitchen!" He’s showing that the toaster can actually be used to sear a steak if you turn it on its side and hack the temperature sensor.

I’m not sure I’d recommend searing a steak in a toaster, Herman.

It’s a metaphor, Corn! But you get the point. We are in the "Hacking" phase of terminal agents. The power users are figuring out how to use these tools for everything OTHER than what they were intended for. And eventually, the manufacturers will catch up and make those "hacks" official features.

I wonder if the "Git-as-Workspace" paradox will ever go away. Will we eventually have a "Claude Space" file format that isn't a git repo?

I think we might see a shift toward "Agentic Filesystems." Imagine a folder that, by its very nature, has an LLM "Layer" sitting on top of it. You don't "open" an agent; the folder IS the agent. Any file you drop in there is automatically indexed, summarized, and acted upon based on the rules you’ve set for that "Space."

That sounds like the "Agent-Centric World" we’ve talked about before. Where the "App" is just a thin skin over a very smart data store.

And the terminal agent is the bridge to that world. It’s the first tool that treats the computer as a "Collaborative Environment" rather than a "Static Tool."

So, what are the big takeaways for our listeners? If you’re sitting there thinking, "I’m not a coder, this episode isn't for me," why should you care?

First, if you use a computer for anything complex—research, writing, data management—you should try one of these tools. Don't be intimidated by the "Code" in the name. Think of it as a "System Assistant." Start small. Use it to organize a messy folder or summarize a stack of documents.

And use the "Spaces" pattern! Don't just run the agent in your home directory. Create a specific folder for a specific task, put a CLAUDE.md file in there with your "Rules," and see how much more effective the agent becomes when it has a clear mission.
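Setting up a space by hand takes a minute, but the pattern is regular enough to script. The sketch below creates the folder and writes a CLAUDE.md from a list of rules. The heading layout is an assumption, since the file is free-form text, and `create_space` is a hypothetical helper. Running `git init` inside the new folder is left as a separate manual step.

```python
import tempfile
from pathlib import Path


def create_space(root: Path, name: str, rules: list[str]) -> Path:
    """Create a task folder containing a CLAUDE.md that lists the given rules."""
    space = root / name
    space.mkdir(parents=True, exist_ok=True)
    lines = [f"# {name}", "", "## Rules", ""] + [f"- {rule}" for rule in rules]
    (space / "CLAUDE.md").write_text("\n".join(lines) + "\n", encoding="utf-8")
    return space


if __name__ == "__main__":
    # Demonstrate in a throwaway directory rather than the home directory.
    with tempfile.TemporaryDirectory() as tmp:
        space = create_space(
            Path(tmp),
            "invoices-2024",
            [
                "Rename files to YYYY-MM-DD-Vendor-Amount.",
                "Move duplicates to the Duplicates subfolder.",
            ],
        )
        print((space / "CLAUDE.md").read_text(encoding="utf-8"))
```

One folder per task keeps each rules file short and specific, which is the "clear mission" the hosts are describing.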

Second, for the developers listening: expand your horizons. You’re already using these tools to write functions. Why aren't you using them to manage your servers, automate your documentation, and organize your life? You’ve been given a superpower; don't just use it to save ten seconds on a for-loop.

And check out Daniel’s index. Seriously. Even if you don't use his specific repos, just seeing the titles of what he’s built—"Synology Manager," "ADHD Research Space," "Legal Case File Unpacker"—it expands your brain. It makes you realize that the "Terminal" isn't a scary place for hackers; it’s a high-performance workspace for anyone who wants to get things done.

It’s about "Intentional Computing." Instead of clicking around and hoping you find the right setting, you’re stating your intent to an agent that knows the system better than you do. It’s a much more dignified way to interact with technology.

"Dignified Computing." I like that. No more fighting with "File Explorer" or "Finder" to find that one document you saved three years ago. You just tell the "Director" to go find it and bring it to you.

We are moving from "Managing Files" to "Managing Outcomes." And the "Framing Gap" we’re seeing right now is just the growing pains of that transition. The companies call it "Code" because that’s what they know. But we know it’s so much more.

It’s going to be a wild ride when the rest of the world realizes they’ve been invited to the party. I can’t wait to see what the "Claude Spaces" for non-technical fields look like. "Claude Gardening," "Claude Menu Planning," "Claude Family Archive."

It sounds silly until you realize that all of those things involve data, organization, and multi-step tasks. And that is exactly what these agents excel at. The "Terminal" is just the most efficient place to do that work.

Well, I think we’ve thoroughly deconstructed the "Coding" myth. It’s time to start seeing these agents for what they really are: the first true "Computer Copilots" that actually have the keys to the engine room.

And I’m glad we have people like Daniel out there doing the "R and D" for the rest of us. It makes the transition a lot less scary when you have a map of the territory.

This has been a fascinating deep dive. It’s one of those topics where the more you look, the more you realize how much we’ve been limiting ourselves by the labels we put on things.

A name is a powerful thing, Corn. It can be a bridge or a wall. Right now, "Claude Code" is a bit of both. Our job is to help people climb over the wall and see the view on the other side.

And the view is pretty great. Alright, I think that’s a wrap on our exploration of the framing gap and the rise of the "System Agent."

It’s a brave new terminal out there. I hope everyone’s ready to start typing.

Or at least, ready to start talking to their terminal. Thanks for the deep dive, Herman. You really made the case for why a "Donkey" should be in charge of the "Engine Room."

I’ll take that as a compliment!

It was! Mostly. Alright, let’s wrap this up. If you’ve enjoyed this look into the future of "Headless Computer Use" and the "Claude Spaces" pattern, do us a favor and leave a review on whatever podcast app you’re using. It genuinely helps other "Intentional Computers" find the show.

Big thanks as always to our producer, Hilbert Flumingtop, for keeping the gears turning behind the scenes.

And a massive thank you to Modal for providing the GPU credits that power the generation of this show. We couldn't do this without that serverless horsepower.

This has been My Weird Prompts.

We’ll see you in the next repo. Or space. Or whatever we’re calling them next week. Goodbye!

Goodbye!

This episode was generated with AI assistance. Hosts Herman and Corn are AI personalities.
