#2069: Agentskills.io Spec: From Broken YAML to Production Skills

Stop guessing at the agentskills.io spec. Learn the exact YAML fields, directory structure, and authoring patterns to make Claude Code skills that ...

Episode Details
Episode ID
MWP-2225
Duration
20:50
Pipeline
V5
TTS Engine
chatterbox-regular
Script Writing Agent
Gemini 3 Flash

AI-Generated Content: This podcast is created using AI personas. Please verify any important information independently.

The agentskills.io specification has emerged as the formal contract for building Claude Code skills, yet many developers still treat it like guesswork. This guide breaks down the core requirements and authoring patterns, moving from the "what" of the spec to the "how" of building a production-quality skill.

The Directory is the Skill

A common misconception is that a skill is just a single file. According to the spec, a conformant skill is a directory. The directory name must match the name field in your frontmatter and use kebab-case. For example, a skill named "docker-manager" must live in a folder called "docker-manager"—no underscores or capital letters allowed. Inside, the mandatory entry point is SKILL.md, which acts as the brain. Optional subdirectories like scripts, references, and assets help manage context. The spec enforces a "Progressive Disclosure" model: the agent loads only the frontmatter first, then the full SKILL.md when activated, and only dives into references if instructed. This prevents bloating the agent's active memory.
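Laid out as a tree, the Docker-manager skill used as a running example later in this guide might look like the following; the helper filenames are illustrative, not mandated by the spec:

```
docker-manager/            <- folder name matches the frontmatter `name`, kebab-case
├── SKILL.md               <- mandatory entry point ("the brain")
├── scripts/               <- optional executable helpers
│   └── list_containers.sh
├── references/            <- optional deep documentation, loaded only on demand
└── assets/                <- optional templates and static files
```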

The Five Non-Negotiable Frontmatter Fields

Most broken skills fail in the YAML frontmatter. The spec defines five required fields:

  1. Name: A unique identifier, 64 characters or less, using lowercase alphanumeric and hyphens only. It must exactly match the folder name.
  2. Description: Up to 1024 characters, but this isn't for humans—it's for the agent's internal routing. A vague description like "Helps with Git" will be ignored. A conformant description is a trigger phrase: "Generates semantic commit messages by analyzing staged changes. Use this when the user asks to commit code."
  3. Version: Must follow semantic versioning (e.g., 1.0.0). This will be critical for future dependency management in marketplaces.
  4. Author: Required for conformance, though validation is currently minimal.
  5. Triggers: The newest part of the spec (as of April 2026). This is an array of objects, currently supporting only "slash_command" types. Each trigger needs a type, command, and optional description/parameters. However, if you define a slash command in YAML, you must also have corresponding instructions in SKILL.md—otherwise, the skill is malformed.
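Assembled into a single frontmatter block, the five fields might look like this for the Docker example used later in the guide. The key spellings follow the field names above; the exact trigger shape is a sketch, and the author name is a placeholder:

```yaml
---
name: docker-manager            # kebab-case, must equal the folder name
description: >-
  Manages local Docker containers. Use this to list, stop, or restart
  containers when the user asks about Docker status.
version: 0.1.0                  # semantic versioning
author: Daniel Example          # placeholder name
triggers:
  - type: slash_command         # currently the only supported trigger type
    command: containers
    description: Lists and manages running Docker containers.
---
```

Note the two-space indentation throughout and the absence of any XML tags or tabs, per the syntax rules covered below.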

Security and Syntax Pitfalls

The spec includes an optional allowed-tools field for security. This space-delimited list pre-authorizes tools like "Read Bash git:*", preventing the agent from asking for permission every time and guarding against prompt injection. Syntax-wise, avoid XML tags in YAML—they can break parsing since Claude uses XML tags internally. Use two-space indentation and no tabs.
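As a sketch, the optional field would sit alongside the required frontmatter keys; the tool names here are taken from the example above and are illustrative:

```yaml
# Pre-authorize read access, shell access, and any git subcommand so the
# agent is not interrupted by a permission prompt on every call.
allowed-tools: Read Bash git:*
```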

From MVS to Production Quality

A Minimal Viable Skill (MVS) has the frontmatter and basic instructions but lacks "teeth." A production-quality skill includes executable scripts and error handling. For a Docker manager skill, you'd create a scripts folder with a Bash script that runs docker ps and outputs JSON (LLMs parse JSON better than ASCII tables). In SKILL.md, you'd instruct Claude to run this script using the environment variable CLAUDE_SKILL_DIR—never hardcode paths—to ensure portability.
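A minimal sketch of what `scripts/list_containers.sh` might look like, assuming a Docker CLI recent enough to support `--format '{{json .}}'` (roughly 20.10+); the error-as-JSON convention is an illustrative design choice, not something the spec mandates:

```shell
#!/bin/sh
# list_containers.sh -- sketch of the "teeth" script described above.
# Assumption: Docker CLI supports --format '{{json .}}'. The error-JSON
# convention below is illustrative, not part of the spec.

list_containers() {
  if ! command -v docker >/dev/null 2>&1; then
    # Structured error output lets the agent branch on a JSON field
    # instead of parsing free-form stderr.
    printf '{"error": "docker CLI not found on PATH"}\n'
    return 0
  fi
  # `docker ps` emits one JSON object per running container; wrap them
  # into a single array so the agent receives one parseable document.
  docker ps --format '{{json .}}' 2>/dev/null | awk '
    BEGIN { printf "[" }
    NR > 1 { printf "," }
    { printf "%s", $0 }
    END { print "]" }
  '
}

list_containers
```

Emitting JSON even on failure is the point: the agent can reliably branch on an `error` key, whereas a bare non-zero exit code with free-form stderr invites guesswork.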

Production quality also means implementing the "Wizard" pattern: a decision tree where the agent checks in with the user. For example, if the Docker script fails, the skill should check if the Docker daemon is running and offer to start it, rather than assuming success.
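In SKILL.md, the Wizard pattern reads as an imperative decision tree. The wording below is illustrative rather than quoted from the spec:

```markdown
## Instructions

When the user runs /containers:

1. Run `${CLAUDE_SKILL_DIR}/scripts/list_containers.sh`.
2. If the output contains an `error` field, check whether the Docker
   daemon is running. If it is not, report that and ask the user
   whether to start it. Do not retry silently.
3. Otherwise, show the running containers and ask which container ID
   the user wants to act on.
4. Wait for explicit confirmation before executing any stop command.
```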

Key Takeaways

  • Conformance is contractual: Breaking the spec means the agent stays dumb.
  • Description is for routing: Write it as a prompt for the agent to know when to call the skill.
  • Modularity matters: Over-scoped skills hit context limits and dilute agent attention. Break them into focused, single-purpose tools.
  • Use environment variables: CLAUDE_SKILL_DIR ensures skills work across different machines.
  • Error handling is mandatory: An MVS assumes success; a production skill plans for failure.

The agentskills.io spec isn't just a format—it's a framework for building reliable, portable, and secure agentic tools. Follow it, and your skills will move from broken YAML to production-ready assets.


Transcript

Corn
If you have ever tried to write a Claude Code skill and ended up with a broken YAML file that Claude just flat-out refuses to load, this episode is your rescue mission. We are looking at the agentskills dot io specification, which has really stabilized as the de facto standard for how these things are built, but honestly, most authors are still just guessing at what the requirements are.
Herman
It is a classic case of the "vibe coding" versus "procedural engineering" gap. People think because it is an AI tool, they can just throw some Markdown at it and it will work. But the agentskills dot io spec is a formal contract. If you break that contract, the agent stays dumb. I am Herman Poppleberry, by the way, and I have been digging through the latest March fifteenth, twenty twenty-six update to the spec all morning.
Corn
And I am Corn. Today’s prompt from Daniel is basically a request for a masterclass. He wants us to do a line-by-line forensic walkthrough of the agentskills dot io specification, and then pivot into a practical workshop on how to actually author these things from scratch. By the way, today’s episode is powered by Google Gemini three Flash, which is handling our script duties.
Herman
I love that Daniel asked for the "line-by-line" treatment. Usually, people want the summary, but in the world of agentic workflows, the summary is where you lose the technical nuance that actually makes the skill portable.
Corn
Well, not "exactly," but you are on the right track. See? I’m learning. Let's start with the prompt Daniel sent over. He says: "The agentskills dot io specification for writing Claude Code skills — a deep dive and practical authoring guide combined into one episode. Herman and Corn walk through the spec line by line: what conformance requirements actually mean, how the frontmatter schema is structured, what each metadata field does, and what makes a skill technically well-formed versus broken. Then pivot into a practical writer's workshop: how do you actually write a spec-conformant skill from scratch? What does production-quality look like versus a minimal viable skill? Common mistakes, over-scoping, and how to know when it is ready to publish."
Herman
That is a tall order, but a necessary one. If we are going to do this right, we have to start with the directory structure itself. A lot of people think a "skill" is just a file. It isn't. According to the spec, a conformant skill is a directory. The directory name must match the name field in your frontmatter, and it has to use kebab-case. If you name your folder "My_Cool_Skill" with underscores, you have already failed conformance before you even wrote a line of code.
Corn
So it is a containerized logic block. Inside that directory, you have the mandatory entry point, which is SKILL dot md. That is the brain. But then you have these optional subdirectories: scripts, references, and assets. Why the separation? Why not just put everything in the Markdown file?
Herman
Because of the context window and the way Claude "peeks" at files. If you stuff ten thousand lines of documentation into SKILL dot md, you are eating up the agent's active memory for every single turn of the conversation. The spec enforces a "Progressive Disclosure" model. The agent loads the frontmatter first to see if the skill is even relevant. It only reads the full SKILL dot md when the skill is activated. And it only goes into the references folder if the instructions in SKILL dot md tell it to look there for deeper technical details. It is about efficiency.
Corn
Okay, let's get into the YAML frontmatter schema. This is where most of the "broken skill" errors happen. The spec defines five non-negotiable fields: name, description, version, author, and triggers. Herman, walk me through the "name" field first. It seems simple, but the spec is weirdly specific about it.
Herman
It is specific because the name is the unique identifier in the agent's tool registry. It has to be sixty-four characters or fewer, lowercase alphanumeric, and hyphens only. No spaces, no periods, no underscores. And as I mentioned, it must be identical to the folder name. If they don't match, Claude Code's discovery mechanism won't register it as a valid skill.
Corn
Then there is the "description" field. The spec says this can be up to one thousand twenty-four characters. That sounds like a lot for just a summary. Is that where we put the instructions?
Herman
No! That is a huge trap. The description is actually the most important piece of metadata for the discovery phase. When you start Claude Code, it scans all your skills and reads only the descriptions. It uses those descriptions to build a "mental map" of what tools it has available. If your description is "Helps with Git," Claude will almost never use it because that is too vague. A conformant, production-quality description needs to be a "trigger phrase." It should say something like, "Generates semantic commit messages by analyzing staged changes. Use this when the user asks to commit code or needs help writing a description for their changes." You are essentially writing a prompt for the agent to know when to call the skill.
Corn
So the description isn't for the human user; it is for the agent's internal routing logic. That makes sense. What about "version" and "author"? Are those just for show, or does the spec actually validate them?
Herman
The spec requires them for "conformance," but currently, the validation is mostly about existence. However, the version field must follow semantic versioning—so, one dot zero dot zero, not "version one." As the ecosystem grows into marketplaces like agentskills dot io, these fields will be used for dependency management. You don't want to load a skill that requires a specific API version if you are running an outdated local script.
Corn
Now, let's talk about the "triggers" field. This is the newest part of the spec as of April twenty twenty-six. It is an array of objects. Currently, the only supported type is "slash_command," right?
Herman
For now, yes. Each trigger needs a "type," a "command," and optionally a "description" and "parameters." This is how you get those cool /deploy or /test commands in the CLI. But here is the technical catch: if you define a slash command in the frontmatter, you must also have corresponding instructions in the SKILL dot md file that tell the agent what to do when that command is invoked. If the command exists in YAML but there is no logic for it in the Markdown, the skill is considered "malformed."
Corn
I noticed something in the spec about "allowed-tools." It is an optional field, but it seems powerful. It lets you pre-authorize certain tools so Claude doesn't have to ask for permission every time?
Herman
More or less. It is a space-delimited list. For instance, if your skill needs to read the git log, you would put "Read Bash git colon asterisk" in that field. This is a security-first design. By declaring it in the frontmatter, the user can see exactly what permissions the skill is claiming before they run it. If a skill tries to run a bash command that isn't in its "allowed-tools" list, the agent will pause and ask for manual confirmation. It prevents "prompt injection" where a skill might try to secretly delete your home directory.
Corn
Let's talk about what makes a skill "broken" versus "well-formed." You mentioned missing fields, which is obvious. But what about the YAML syntax itself? I have seen people try to use XML tags inside the YAML block to "organize" their prompts.
Herman
That is a cardinal sin in the agentskills dot io spec. No XML tags allowed in frontmatter. The reason is that Claude itself uses XML tags to parse the system prompt. If your skill's metadata contains loose angle brackets, it can "break out" of the YAML block and start hallucinating system instructions. The spec is very clear: use plain text or quoted strings. Also, tabs are a no-go. It’s two-space indentation for the YAML, or the parser will throw a fit.
Corn
And what about "over-scoping"? The spec mentions modularity as a core tenet.
Herman
This is more of a logical conformance issue than a syntax one. A well-formed skill should do one thing well. If you have a skill named "web-developer" that tries to handle CSS styling, React component generation, SQL migrations, and AWS deployment all in one file, you are going to hit the context limit. More importantly, the agent's "attention" will be spread too thin. The spec suggests breaking those into a "web-style" skill, a "web-db" skill, and so on.
Corn
Okay, so that is the "what" of the specification. Now let's move into the "how." We are shifting into the workshop portion of the episode. Imagine I am a developer, and I want to write a skill that helps me manage my local Docker containers. I want a /containers command that lists what’s running and offers to stop them. How do I start from scratch?
Herman
We start with the Minimal Viable Skill, or MVS. You create a folder called "docker-manager." Inside, you create SKILL dot md. The first thing you do—and I mean the very first thing—is write that frontmatter. You need those five fields. Name is docker-manager. Description is "Manages local Docker containers. Use this to list, stop, or restart containers when the user asks about Docker status." Version is zero dot one dot zero. Author is your name. And then the triggers.
Corn
For the triggers, I would define a slash_command called "containers," right?
Herman
Correct. Your YAML would look like this: triggers, then a dash, type: slash_command, command: containers, description: "Lists and manages Docker containers." That is your skeleton. If you just had that, and some basic text in the body of the Markdown file, Claude could load it. But it wouldn't be "production-quality" yet because it doesn't have "teeth."
Corn
By "teeth," you mean actual executable scripts. Because right now, Claude would just be "hallucinating" what it thinks my Docker containers are unless I give it a way to see them.
Herman
Right. In a production-quality skill, you would create a "scripts" folder inside "docker-manager." Inside that, you might write a simple Python script or a Bash script called "list_containers dot sh." That script just runs "docker ps" and formats the output as a clean JSON block.
Corn
Why JSON? Why not just have the script print the raw table?
Herman
Because LLMs are much better at parsing structured data like JSON than they are at "reading" ASCII tables with misaligned columns. If you give Claude JSON, it can reliably extract the container ID and status. So, in your SKILL dot md, under the "Instructions" section, you would tell Claude: "When the user runs /containers, execute the script at dollar-sign-curly-bracket CLAUDE_SKILL_DIR close-bracket /scripts/list_containers dot sh."
Corn
Wait, that variable—CLAUDE_SKILL_DIR. That is a spec requirement, isn't it?
Herman
It is. You must never use absolute paths like "Users slash Corn slash skills." That makes the skill non-portable. If you send that skill to me, it will break on my machine. The spec requires using that environment variable so the agent knows to look relative to where the skill is actually installed.
Corn
Okay, so we have the frontmatter, we have the script, and we have the instruction to run the script. What else separates an MVS from a production-quality skill?
Herman
Error handling and the "Wizard" pattern. An MVS just assumes the script works. A production skill says: "If the script returns an error, check if the Docker daemon is running. If it isn't, inform the user and ask if they want you to try starting it." You are basically writing a "decision tree" for the agent.
Corn
I like the idea of the "Wizard" pattern. That is where the agent doesn't just do everything at once, but checks in with the user, right?
Herman
Yes. For your Docker skill, a production version wouldn't just stop a container because it "thinks" that’s what you want. The instructions should say: "Before stopping a container, present the list of running containers to the user and ask for the ID. Wait for user confirmation before executing the stop command." This prevents the agent from accidentally killing your production database when you meant to kill a test runner.
Corn
What are some other common mistakes people make when they’re writing these? You mentioned "voodoo constants" in the background notes Daniel sent.
Herman
Oh, that is a big one. People will write a script with a timeout of thirty seconds, but they won't explain why it is thirty seconds in the SKILL dot md file. If the script fails because of a network lag, Claude won't know that it should try increasing the timeout or just wait. You have to remember that the agent is your "runtime operator." It needs to understand the "why" behind the code it is running.
Corn
Another one I see is "deep nesting." People try to make their skills super organized by having SKILL dot md link to another Markdown file, which links to a third one.
Herman
And that is where "Progressive Disclosure" breaks. Claude usually only "peeks" at the first hundred lines or so of a referenced file to see if it is worth reading the whole thing. If you bury the actual "how-to" three levels deep, the agent might lose the thread or just give up because it feels like it’s "too much work" to find the information. The spec recommends keeping references only one level deep. If you have a huge API spec, put it in references slash API_REF dot md and tell Claude exactly when to open it.
Corn
Let’s talk about the "isMeta" flag. This wasn't in the frontmatter list we discussed, but it is in the deeper spec for Claude Code specifically.
Herman
The "isMeta" flag is brilliant. It’s a way to tell the CLI to hide the skill's output from the user's screen while still sending it to the API. If your skill is doing something verbose—like scanning five thousand lines of logs—the user doesn't want to see all that junk flying by in their terminal. By using the "isMeta" pattern, you keep the chat clean, but the agent gets all the data it needs to give you a one-sentence summary. It makes the experience feel much more "magical" and less like a scrolling terminal.
Corn
And what about "context: fork"? This sounds like something for the power users.
Herman
It is. This is for when you want to run a "sub-agent." Let’s say your Docker skill needs to do some deep research into a specific error code. Instead of cluttering up your main conversation with Docker logs, you can tell Claude to "fork" a sub-agent. That sub-agent uses the skill, does the research in an isolated "thought space," and then returns only the final answer to the main thread. It keeps the "primary" agent focused on your main goal.
Corn
So, if I’m building this Docker manager, how do I know when it is actually ready to publish to something like agentskills dot io? Is there a "linter" or a validation tool?
Herman
There is a basic validation tool in the Claude Code CLI. You can run "claude skills validate" and it will check your YAML syntax and the five required fields. But "technical readiness" is different from "production readiness." To be production-ready, you need to test it in at least three different scenarios. Does it work when the Docker daemon is off? Does it work when there are zero containers? Does it work when there are fifty containers?
Corn
That "fifty containers" one is interesting because of the context window. If the list is too long, the agent might truncate it.
Herman
A production-quality skill would handle that by saying: "If there are more than twenty containers, only show the first ten and ask the user if they want to see more." You have to teach the agent to be a good "UI designer" for the terminal.
Corn
Okay, let's recap the "Writer's Workshop" steps for someone who wants to do this today. Step one: Directory structure. Kebab-case folder name, mandatory SKILL dot md. Step two: Frontmatter. Name, description, version, author, triggers. Make that description a "triggering" statement for the agent.
Herman
Step three: The instructions. Use imperative language. Tell the agent exactly what to do. Use checklists. Claude loves checklists. Step four: Add the "teeth." Put your logic in scripts slash and use the CLAUDE_SKILL_DIR variable. And step five: Refine for the "Wizard" pattern. Add confirmations and error handling so the agent doesn't go rogue.
Corn
It really feels like we are moving away from "chatting with a bot" and toward "authoring an operating system for an agent."
Herman
That is exactly what it is. The agentskills dot io spec is basically the POSIX of the agent world. It is a standard way to say "here is how a tool should look so any agent can use it." If you follow this spec, your skill doesn't just work in Claude Code; it could technically work in any agentic framework that adopts the standard.
Corn
I think one of the most underrated parts of the spec is the "allowed-tools" section we touched on. In a world where people are worried about AI security, being able to say "this skill is only allowed to read files, it can never write to them" is a huge trust-builder.
Herman
It is. And the spec is evolving to include "parameter validation" in those triggers. So, by April twenty twenty-six, you can actually define in your YAML that the /containers command requires a string parameter for the container name. This means the agent doesn't even have to guess; the CLI will enforce the input format before the agent even sees it.
Corn
So what is the "takeaway" for the listener who has been struggling with this?
Herman
The takeaway is: stop treating SKILL dot md like a blog post and start treating it like a configuration file. Every line of your Markdown should be a clear, unambiguous instruction. If you find yourself using words like "maybe" or "optionally," you are opening the door for hallucinations. Be the "boss" of the agent. Tell it: "First, do X. Then, do Y. If Z happens, stop and ask me."
Corn
And don't over-scope. I think that is the biggest one I see. Everyone wants the "God Skill" that does everything. But the agentskills dot io spec is all about modularity. It is better to have five small, perfect skills than one massive, flaky one.
Herman
It is the Unix philosophy applied to AI. Small tools, connected by a smart orchestrator.
Corn
Well, I think we have covered the "forensic" side of the spec and the "craft" side of the workshop. I'm feeling like I could actually go and fix that broken Git helper I was working on last night. I definitely didn't have the "version" field formatted right, and I am pretty sure I used an underscore in the folder name.
Herman
Don't feel bad. Even the best developers get tripped up by the "simple" stuff in a new spec. The key is just to keep that agentskills dot io page open while you are coding. It is the source of truth.
Corn
Alright, let's wrap this one up. We’ve gone deep on the spec, we’ve built a mental model for a Docker manager, and we’ve hopefully saved a few hundred YAML files from the trash bin.
Herman
I hope so. It’s an exciting time to be building this stuff. The spec is actually making it easier to be a "prompt engineer" because it gives us a box to play in.
Corn
Before we go, I want to give a big shout-out to our producer, Hilbert Flumingtop, for keeping the gears turning behind the scenes. And a huge thanks to Modal for providing the GPU credits that power the generation of this show.
Herman
If you found this deep dive useful, or if you managed to build your first spec-conformant skill because of it, we’d love to hear about it. Find us at myweirdprompts dot com for the RSS feed and all the different ways you can subscribe.
Corn
This has been My Weird Prompts. Go write some clean YAML, everyone.
Herman
See ya.

This episode was generated with AI assistance. Hosts Herman and Corn are AI personalities.