#2790: Git Hygiene for AI Coding Agents

How to keep your git repo clean when Claude is blazing through tasks — plus a recovery playbook for when things go sideways.

Featuring

Daniel

Corn

Herman

Episode Details

Episode ID: MWP-2955
Published: May 13
Duration: 36:52
Audio: Direct link
Pipeline: V5
TTS Engine: chatterbox-regular
Script Writing Agent: deepseek-v4-pro
Topics: version-control software-development ai-agents

AI-Generated Content: This podcast is created using AI personas. Please verify any important information independently.

When your AI coding agent is blazing through tasks, it's easy to assume everything is landing in git. But as one developer discovered the hard way, uncommitted changes from two weeks ago can lurk in your working tree, waiting to introduce regressions. The problem isn't malice — it's that the agent's internal model of what happened diverges from reality. Claude confidently narrates changes it never committed.

The fix is a three-layer git hygiene system. First, a standing instruction at the project level — something in your CLAUDE.md file — that makes committing a hard gate: no commit, no task completion. Second, per-session reminders to check git state before starting work. Third, periodic verification every five to ten turns, where you ask the agent to show you what's actually in the log, not what it thinks it did. Breaking "commit" out as a numbered step rather than a trailing afterthought dramatically improves reliability.

For recovery, the first rule is never commit what you don't understand. Before staging uncommitted changes, run git diff and read the changes. Use git add -p to stage changes hunk by hunk, separating intentional work from cruft. And for long-term sanity, adopt a tagging cadence. Annotated tags at every stability point — after a completed feature, before a risky refactor, after a successful deploy — shrink your maximum potential loss from days to hours. Push those tags to remote so your checkpoint system survives a laptop failure. When you work solo with an agent on a single branch, tags replace branches as your safety net, letting you reset to a known-good state without the overhead of merge workflows.

#2790: Git Hygiene for AI Coding Agents

Daniel sent us this one, and it's basically a two-parter wrapped in a war story. He spent yesterday wrestling with Claude — it was blazing through tasks, everything looked great, and then he discovered uncommitted changes from two weeks ago lurking in the working tree. Adding those in retroactively introduced regressions. So the first question is, what hand-holding instructions do you give a coding agent to make sure everything actually lands in git? And the second is, what's the recovery playbook when you discover your repository state isn't what you thought? He also wants to talk about tagging as an underused habit when you're working solo on a single branch.

This is one of those things where the tool got so good that the safeguards stopped feeling necessary. When Claude was slower or more obviously imperfect, you checked its work constantly. Now it rips through five tasks in a row, you glance at the output, it looks right, you move on. And the git commit is the thing that gets dropped because from the agent's perspective, it did the work. The file changed. The test passed.

The classic "I deployed" but it didn't push.

And the prompt's right that the murky middle of the context window is where this gets dangerous. You're forty turns in, Claude's attention is strongest on the most recent exchanges, and those uncommitted changes from turn twelve might as well be in a different universe. The agent doesn't notice them until they collide with something new.

Let's start with the hand-holding. If you're working with a coding agent and you want git hygiene built into the workflow rather than bolted on as an afterthought, what do you actually say?

I'd break it into three layers. The first is a standing instruction at the project level — something in your CLAUDE dot md or your project rules file. Something like, after every file modification, run git status and confirm the working tree is in the state you expect. Before declaring any task complete, verify that all changes are staged and committed with a descriptive message. Never leave uncommitted changes at the end of a turn.

You're making it a hard gate. No commit, no completion.

And the key phrase there is "before declaring any task complete." Not as a nice-to-have. The task isn't done until the commit exists. The second layer is per-session reminders. If you're starting a new chat for a feature, your opening prompt should include something like "remind me of the current git state before we begin. What branch are we on, what's the last commit, are there any uncommitted changes?"

I like that because it also catches the thing Daniel ran into. If the agent checks at the start and says "by the way, you've got uncommitted changes from two weeks ago," you deal with them before they become a time bomb.
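A minimal sketch of that opening check, run in a throwaway repository so it is safe to try. The file name and identity are placeholders; in a real project only the three commands under the "three questions" comment matter.

```shell
# Scratch repository so the sketch runs anywhere (placeholder identity).
cd "$(mktemp -d)" || exit 1
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "hello" > index.html
git add index.html
git commit -q -m "Add landing page skeleton"
echo "leftover edit" >> index.html   # simulates forgotten work from an old session

# The three questions to ask at the start of a session:
branch=$(git rev-parse --abbrev-ref HEAD)      # what branch are we on?
last_commit=$(git log -1 --format='%h %s')     # what is the last commit?
uncommitted=$(git status --porcelain)          # any uncommitted changes?

echo "branch: $branch"
echo "last commit: $last_commit"
echo "uncommitted: ${uncommitted:-none}"
```

An empty `git status --porcelain` means there is nothing to deal with before starting; anything else is the two-week-old surprise showing up early.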

The third layer is the one people skip — explicit verification requests. Every five or ten turns, especially after a burst of rapid changes, ask the agent to run git log minus five and summarize what was committed. Not what it thinks it did. What's actually in the log. Claude is perfectly capable of confidently describing changes it never committed.

That's the really insidious part. The agent's internal model of what happened diverges from reality, and it narrates from the model, not from the filesystem.

It's the same failure mode as any confident but ungrounded system. The model believes it committed because committing is the correct thing to do in that situation, and it generated the commit message in its head. But it never ran git commit.
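One way to ground that periodic check, sketched under assumptions: the agent has told you the subject line of a commit it believes it made, and you compare that claim against the real log. The repository, file, and claimed message here are all hypothetical.

```shell
# Scratch repository so the sketch runs anywhere (placeholder identity).
cd "$(mktemp -d)" || exit 1
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "v1" > auth.txt
git add auth.txt
git commit -q -m "Add auth module"

# The agent edits the file and says it committed, but never ran git commit.
echo "v2" >> auth.txt
claimed="Refactor auth module to use token-based sessions"

# Check the claim against the last five commit subjects in the real log.
if git log -5 --format=%s | grep -Fxq "$claimed"; then
    verdict="in log"
else
    verdict="NOT in log"
fi
echo "'$claimed': $verdict"
git log -5 --oneline
```

The point of the comparison is that the answer comes from `git log`, not from the agent's recollection.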

Those are the three layers. Standing instruction, per-session check, periodic verification.

One more, and it's subtle. When you give Claude a multi-step task, explicitly number the steps and include the git operation as a numbered step, not a trailing afterthought. Don't say "refactor the authentication module and commit the changes." Say "step one, refactor the authentication module. Step two, verify all tests pass. Step three, stage and commit with a message describing the refactor. Step four, confirm the commit exists in the log." Making it a discrete step with its own verification reduces the chance it gets mentally elided.

You're essentially breaking the "commit" step out of the background assumptions and making it a foreground action with its own success criterion.

I've seen Claude skip the commit when it was a trailing clause, but when it's step three of four with a "confirm" step after, the reliability goes way up. It's not magic, it's just reducing ambiguity.

That's the preventive side. Let's talk recovery. Daniel laid out two scenarios. One is the nuclear option — git add dot, bundle everything into a commit, declare it the source of truth. The other is tagging as a fallback mechanism. Walk me through the recovery playbook.

Let's start with the situation Daniel actually hit. You discover uncommitted changes from two weeks ago. The immediate question is, what are they? Before you add anything, before you stage, you need to know what you're dealing with. Git diff on those files. Read the changes. Understand why they exist. Were they abandoned for a reason? Were they part of a feature you decided to scrap? Did Claude introduce them during a debugging session and forget to clean up?

Because the danger isn't the uncommitted changes themselves. It's that you don't know why they're uncommitted.

Uncommitted changes are not inherently bad. They're just work whose intent hasn't been recorded. And if you can't reconstruct the intent from the diff, you're gambling. I'd say the first recovery principle is never commit what you don't understand.

That sounds obvious but I guarantee everyone listening has done it at least once. You're frustrated, you just want a clean state, you git add dot and move on.

Sometimes that's the right call, but only after you've read the diff and can articulate what you're adding. The prompt mentions that git add dot is one way to reset the repository state, and it is, but it's also the "I'm declaring bankruptcy on understanding my own changes" option. Sometimes that's fine. If it's a solo project, it's a static site, you can visually verify everything looks right, and the alternative is spending an hour untangling — sure, take the nuclear option. But do it knowingly.

The phrase "declaring bankruptcy on understanding" is the perfect summary of git add dot at three in the morning.

The better recovery pattern, if you have the time and the changes are nontrivial, is to use git add minus p. That's the patch mode. It walks you through each change hunk by hunk and asks if you want to stage it. You can say yes to the intentional changes, no to the cruft, and split hunks if needed. It's slower, but you end up with commits that actually reflect what you meant to do.

If you're working with Claude, you can actually ask it to help triage. Feed it the git diff output and say, "these are uncommitted changes. Which of these look intentional versus accidental? Group them into logical commits for me."

That's a great use case. Claude is actually quite good at reading diffs and categorizing changes. It can say, "this hunk looks like the authentication refactor we did last Tuesday, this one looks like a stray debug print statement, this one is a dependency version bump." Then you can stage and commit each group separately with meaningful messages.
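A rough sketch of that triage at file granularity, in a throwaway repository with made-up files. Patch mode (`git add -p`) does the same thing hunk by hunk but is interactive, so it is only mentioned in a comment here.

```shell
# Scratch repository so the sketch runs anywhere (placeholder identity).
cd "$(mktemp -d)" || exit 1
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "session = cookie" > auth.txt
echo "app start" > main.txt
git add auth.txt main.txt
git commit -q -m "Add auth and main modules"

# Two leftover edits: one intentional, one stray debug line.
echo "session = token" > auth.txt
echo "print('debug here')" >> main.txt

git diff --stat      # overview first; plain `git diff` shows the full hunks

# Stage and commit only the change you understand and want.
# (Within a single file, `git add -p` lets you pick individual hunks.)
git add auth.txt
git commit -q -m "Switch auth sessions from cookies to tokens"

# Discard the cruft. This permanently drops the uncommitted edit to main.txt.
git checkout -- main.txt

remaining=$(git status --porcelain)
echo "left uncommitted: ${remaining:-nothing}"
```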

Which brings us to tagging. Daniel thinks tagging is underused in solo-agent workflows, and I think he's right.

Tagging is the cheapest insurance policy in git and almost nobody uses it. A tag is just a named pointer to a specific commit. It doesn't branch, it doesn't create parallel history, it just says "this commit matters." And for a solo developer working with an AI agent, I'd argue tags are more important than branches.

Why more important?

Because branches impose a workflow. They require you to think about merges, about switching contexts, about keeping things in sync. For a solo developer on a simple project, the cognitive overhead of branching might not be worth it. But tags have zero overhead. You just stick a label on a commit and move on. And when something goes wrong six hours or six days later, you can instantly find the last known-good state.

It's like bookmarking your repository's sanity.

The key is tagging at moments of known stability, not just at releases. After a successful feature completion and commit. After you've verified the site deploys correctly. Before you start a risky refactor. Before you hand Claude a complex multi-step task. Each tag is a checkpoint you can reset to without squinting at commit messages trying to remember which one was the good one.

What's the actual command pattern here?

Git tag minus a followed by a descriptive name and optionally a message with minus m. So git tag minus a stable dash auth dash refactor minus m "authentication module refactored and tested." The minus a makes it an annotated tag, which stores the tagger name, email, date, and message. Lightweight tags are just pointers with no metadata. For checkpointing, always use annotated tags.
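Written out, the annotated-versus-lightweight difference looks roughly like this. The sketch uses a throwaway repository; `lightweight-example` is a hypothetical tag name added only for comparison.

```shell
# Scratch repository so the sketch runs anywhere (placeholder identity).
cd "$(mktemp -d)" || exit 1
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "auth v2" > auth.txt
git add auth.txt
git commit -q -m "Refactor authentication module"

# Annotated tag: a full object with tagger, date, and message.
git tag -a stable-auth-refactor -m "authentication module refactored and tested"
# Lightweight tag for comparison: just a name pointing at the commit.
git tag lightweight-example

# Ask git what kind of object each tag name refers to.
annotated_type=$(git cat-file -t stable-auth-refactor)
lightweight_type=$(git cat-file -t lightweight-example)
echo "annotated: $annotated_type, lightweight: $lightweight_type"

git show -s stable-auth-refactor    # prints the tagger, date, and message
```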

Then to get back to one?

Git checkout tags slash tag name. That puts you in detached HEAD state, which sounds scary but just means you're not on a branch. From there, you can create a new branch if you want to diverge. Or, if you want to reset your current branch to that tag, stay on the branch and run git reset minus minus hard tags slash tag name.

Let's pause on that. Hard reset is the one that strikes fear into people.

It discards everything after that commit: the later commits drop off the branch and any uncommitted changes are gone. It's the "I want to pretend everything after this point never happened" command. But if you've been tagging regularly, you don't have to hard reset all the way back to the Stone Age. You can reset to this morning's tag instead of last Tuesday's, keep everything up to that checkpoint, and only lose the three hours where Claude introduced regressions.
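A small sketch of a reset to a checkpoint, in a throwaway repository. The tag name `stable-checkpoint` and the file are hypothetical.

```shell
# Scratch repository so the sketch runs anywhere (placeholder identity).
cd "$(mktemp -d)" || exit 1
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "good" > site.txt
git add site.txt
git commit -q -m "Add working site"
git tag -a stable-checkpoint -m "site verified working"

echo "regression" > site.txt
git commit -q -a -m "Rework site layout"     # the change that broke things

# Move the current branch back to the checkpoint, discarding later commits
# and any uncommitted edits. `tags/stable-checkpoint` names the same tag.
git reset -q --hard stable-checkpoint

content=$(cat site.txt)
echo "site.txt now contains: $content"
```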

The tagging cadence shrinks the blast radius.

That's exactly the right way to think about it. Every tag reduces the maximum amount of work you can lose. If you tag once a week, you could lose up to a week. If you tag after every completed task, you might lose an hour at most.

What about pushing tags? I feel like people forget that git push doesn't push tags by default.

It doesn't. You need git push origin tag name or git push minus minus tags to push all tags. And this matters because if your tags only exist locally and your laptop dies, your checkpoint system died with it. For a solo developer, pushing tags to a remote is your offsite backup of your repository's sanity markers.
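To see that behaviour without touching a real remote, the sketch below uses a local bare repository as a stand-in for `origin`. The paths and tag name are assumptions for illustration.

```shell
remote=$(mktemp -d)                 # bare repository standing in for a real remote
git init -q --bare "$remote"

cd "$(mktemp -d)" || exit 1
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"
git remote add origin "$remote"
echo "v1" > site.txt
git add site.txt
git commit -q -m "Add site"
git tag -a stable-deploy -m "deploy verified"

git push -q origin HEAD             # pushes the current branch only
tags_after_branch_push=$(git ls-remote --tags origin | grep -c 'refs/tags/stable-deploy$')

git push -q origin stable-deploy    # push one tag by name (git push --tags sends all)
tags_after_tag_push=$(git ls-remote --tags origin | grep -c 'refs/tags/stable-deploy$')

echo "on remote after branch push: $tags_after_branch_push"
echo "on remote after tag push:    $tags_after_tag_push"
```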

The full hygiene stack you're describing is: standing instructions for the agent to commit after every task, per-session git status checks, periodic verification of the log, and a tagging cadence at every stability point, with tags pushed to remote.

One more thing that I think ties it all together. The prompt mentions working on main or master in a single-branch setup, and there's a whole contingent of developers who will tell you that's wrong. But for a solo developer with an AI agent, I actually think it's fine, provided you have the other practices in place.

What's the actual argument against single-branch development?

The traditional argument is that main should always be deployable, and all work should happen in feature branches so you never break main. But when you're a solo developer and main is your only branch, "always deployable" is still the goal — you just achieve it through commits and tags rather than branches and merges. A tag says "this commit was deployable." If the commit after the tag breaks things, you don't need to have been on a separate branch. You just reset to the tag.

The tag effectively does what a merge commit does, but without the branching overhead.

It's a simpler version of the same idea. Branches give you isolation. Tags give you rollback points. For a solo developer who trusts their own commit discipline and their agent's reliability most of the time, rollback points might be sufficient.

Let's talk about a specific failure mode I've seen with Claude. The agent makes a change, the change works, the agent moves on. Ten turns later, a different change interacts badly with the first one. The agent tries to fix the interaction but doesn't realize the first change was never committed. So it's fixing against the wrong baseline.

That's the regression-introduction machine right there. The agent's mental model says change A is in the codebase. Reality says change A is only in the working tree. The fix for the interaction assumes A is present. When you finally commit everything together, you're bundling an untested combination.

If you'd run git status before the fix, you'd have caught it.

This is why I'm so insistent on the standing instruction to check git status before starting any new task. It's not busywork. It's making sure the agent's model of the repository matches the actual repository.

Is there a way to automate this? Something that runs before every Claude interaction?

You could write a pre-prompt hook, but honestly, the simplest thing is to just include it in your project rules and then occasionally ask. Claude Code, specifically, has access to run git commands. It can check its own work. The issue isn't capability, it's instruction.

We've covered prevention and recovery. Let's talk about the psychological side of this. Why do people — myself included — let git hygiene slip specifically when working with AI agents?

I think there are two things happening. First, the agent feels like a collaborator, and you unconsciously assume it's handling the boring parts. You wouldn't expect a human colleague to leave uncommitted changes strewn around, so you don't expect the AI to either. Second, the speed is disorienting. When Claude does in thirty seconds what would take you an hour, your mental model of "we must be at step three by now" lags behind reality. You lose track of what's been done and what's just been discussed.

That speed thing is real. I've had sessions where I blinked and there were fifteen files changed and I had no clear memory of what happened in what order.

That's where the verification habit becomes essential. You don't need to remember what happened if you can read the commit log. But you can only read the commit log if commits exist.

Let's get concrete about commit messages. When you're moving fast with an agent, the temptation is to write "updates" or "fixes" and move on. What's the minimum viable commit message for a session where you might need to reconstruct your reasoning later?

The conventional wisdom is a short summary line followed by a blank line and then more detail. For agent-assisted work, I'd say the summary line should answer two questions: what changed and why. Not "updated auth module" but "refactored auth module to use token-based sessions." The why is often obvious from the what, but when you're scanning fifty commits looking for the one that introduced a bug, the extra context saves you from opening every diff.
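As a sketch of that summary-plus-detail shape: passing `-m` twice gives a summary line, a blank line, and a body. The body text here is an invented reason, purely for illustration.

```shell
# Scratch repository so the sketch runs anywhere (placeholder identity).
cd "$(mktemp -d)" || exit 1
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "session = token" > auth.txt
git add auth.txt

# Each -m becomes its own paragraph: summary line, blank line, then body.
git commit -q \
    -m "Refactor auth module to use token-based sessions" \
    -m "Cookie sessions broke behind the load balancer; tokens are stateless."

summary=$(git log -1 --format=%s)   # first line only
body=$(git log -1 --format=%b)      # everything after the blank line
echo "summary: $summary"
echo "body:    $body"
```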

Do you have the agent write the message or do you write it yourself?

I let the agent propose the message, but I read it before the commit happens. Sometimes Claude writes beautiful, detailed commit messages. Sometimes it writes "made changes as requested." The standing instruction should include a requirement that commit messages be descriptive and specific, and the verification step should include checking the message quality.

"Made changes as requested" is the commit message equivalent of a shrug emoji.

It's worse than no message at all because it gives the illusion of documentation. Six months later you see "made changes as requested" and you have no idea what was requested or why.

Let's circle back to tagging because I want to dig into the naming convention. Daniel mentioned wanting to be able to find a clean place to fall back to. If you've got a hundred tags, how do you know which one is the right one?

Tag naming is personal, but I'd recommend a convention that includes a date and a short descriptor. Something like two thousand twenty six zero five one three dash stable dash landing dash page. The date sorts chronologically in most tag listings, and the descriptor tells you what was stable at that point. If you're tagging multiple times a day, add a sequence number or a time.

You just used the YYYYMMDD format. Is that the standard?

It's the one that sorts correctly as a string, which is all that matters. You could also use semantic versioning tags for actual releases — v one dot zero dot zero — but for checkpoint tags, date plus descriptor is more informative.

For the scenario where you say, "I want to see all my checkpoints," it's just git tag minus l?

Git tag lists all tags. Git tag minus l followed by a pattern filters. So git tag minus l "stable star" would show all your stability tags if you prefix them with "stable." Git tag minus l "two thousand twenty six zero five star" shows everything tagged in May.
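A sketch of the date-plus-descriptor convention and the pattern filters, in a throwaway repository. Note that with the date first, a "stable" filter needs a leading wildcard; the tag and file names are hypothetical.

```shell
# Scratch repository so the sketch runs anywhere (placeholder identity).
cd "$(mktemp -d)" || exit 1
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "v1" > index.html
git add index.html
git commit -q -m "Add landing page"

today=$(date +%Y%m%d)      # YYYYMMDD sorts correctly as a plain string
month=${today%??}          # drop the day, leaving YYYYMM
git tag -a "${today}-stable-landing-page" -m "landing page verified"

all_tags=$(git tag)                        # every tag
this_month=$(git tag -l "${month}*")       # everything tagged this month
stable_only=$(git tag -l "*-stable-*")     # every tag with "stable" in its name

echo "this month: $this_month"
echo "stable:     $stable_only"
```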

That's actually useful right now. If you're recovering from a bad session, you can list today's tags and pick the last one before things went sideways.

The recovery command is just git reset minus minus hard followed by the tag name. Your working directory is now exactly what it was at that checkpoint. All the bad changes are gone. You can breathe.

What about the case where you don't want to lose everything after the tag? Maybe there's one good commit among the bad ones.

Then you don't reset. You checkout the tag into a new branch, and cherry-pick the good commit. Git checkout minus b recovery dash branch tags slash tag name, then git cherry dash pick the commit hash you want to keep. Now you've got a branch with the good state plus the one good change, and you can continue from there.
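That recovery path can be sketched as follows, again in a throwaway repository. The commits, the tag `stable-checkpoint`, and the branch name are all made up for the example.

```shell
# Scratch repository so the sketch runs anywhere (placeholder identity).
cd "$(mktemp -d)" || exit 1
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "good" > site.txt
git add site.txt
git commit -q -m "Add working site"
git tag -a stable-checkpoint -m "site verified working"

echo "regression" > site.txt
git commit -q -a -m "Rework site layout"     # the bad commit
echo "docs" > README.txt
git add README.txt
git commit -q -m "Add README"                # the one good commit
good_commit=$(git rev-parse HEAD)

# Start a new branch at the checkpoint, then bring over only the good commit.
git checkout -q -b recovery-branch stable-checkpoint
git cherry-pick "$good_commit" > /dev/null

site=$(cat site.txt)
readme=$(cat README.txt)
echo "site.txt: $site, README.txt: $readme"
```

The original branch is untouched, so the bad commit is still there if you later decide part of it was worth keeping.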

Cherry-picking is one of those git features that sounds like advanced wizardry until you need it, and then it's the most straightforward thing in the world.

It's literally "take this one commit and apply it here." The name is perfect. You're picking the ripe cherries off a branch you don't want to keep.

We've built out a pretty comprehensive set of practices. Let me try to condense them into something someone could actually follow. You're a solo developer using Claude as a coding agent on a single-branch project. What's your daily git workflow?

Start of session: ask Claude to run git status and git log minus five and report the state. Address any uncommitted changes before doing anything new. During the session: after every completed task, Claude stages, commits with a descriptive message, and confirms the commit exists. Every few tasks, ask for a git log summary to verify. At stability points — feature complete, deploy verified, before a risky change — create an annotated tag with a date and descriptor. Push tags to remote. End of session: final git status check, final git log review, confirm nothing is left uncommitted.

That sounds like a lot of steps, but most of them are one command and they take seconds.

The entire routine adds maybe two minutes to a multi-hour coding session. And it prevents the kind of two-hour debugging spiral that comes from discovering uncommitted changes from two weeks ago.

The economics are wildly favorable.

They always are with hygiene practices. Flossing takes two minutes. The root canal takes two hours.

That's the most dentist thing you've ever said on this show.

I was a pediatrician, not a dentist.

Prevention is boring and cheap. Recovery is exciting and expensive.

Let's talk about one more failure mode that the prompt hinted at but didn't fully explore. Claude says "I've deployed" but it missed files. This happens because the agent's understanding of what constitutes "the project" might not match the actual deployment artifact.

It changed three files but the build process pulls in five, and two of those were modified in a previous session and never committed.

If you're deploying a static site, the difference between "the files I changed" and "the current state of the working tree" can be subtle. Claude might have modified your CSS and your JavaScript but forgotten to include an image that was added to the assets folder three sessions ago and never committed. The site breaks in a way that's hard to diagnose because you're looking at the recent changes.

The standing instruction should probably include a pre-deployment check that goes beyond "did I commit my changes" to "is the entire working tree committed and pushed."

Git status before deploy. If it says "nothing to commit, working tree clean," you know the repository state matches the working state. If it says anything else, you have work to do before you deploy.
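A minimal sketch of that gate as a shell function. `working_tree_is_clean` is a hypothetical helper name, and the demo runs in a throwaway repository; it checks only the working tree, not whether commits have been pushed.

```shell
# Succeeds only when there is nothing staged, modified, or untracked.
working_tree_is_clean() {
    [ -z "$(git status --porcelain)" ]
}

# Scratch repository so the sketch runs anywhere (placeholder identity).
cd "$(mktemp -d)" || exit 1
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "v1" > index.html
git add index.html
git commit -q -m "Add landing page"

if working_tree_is_clean; then before="clean"; else before="dirty"; fi

echo "logo" > logo.txt       # an asset added but never committed
if working_tree_is_clean; then after="clean"; else after="dirty"; fi

echo "before: $before, after: $after"
```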

What about git stash? Is that a useful tool in this workflow or a complication?

Stash is useful when you're in the middle of something and need to switch contexts temporarily. But for the agent workflow we're describing, I'd say avoid stash as a long-term holding area. Stashes are easy to forget about. They don't have messages by default. They're not pushed to remote. If you're using stash to park changes between sessions, you're building a hidden backlog of uncommitted work.

Stash is for "I need to switch branches for five minutes," not for "I'll deal with this tomorrow."

And in a single-branch workflow, you rarely need stash at all. If you're not switching branches, you're not context-switching in a way that requires shelving work.

Let's address something that I think is implicit in this whole conversation. A lot of git advice online assumes you're working on a team. Branch naming conventions, pull request templates, code review workflows. And if you're a solo developer, especially one working with an AI agent, a lot of that advice is overhead without benefit.

The entire pull request workflow exists to coordinate multiple humans. If you're the only human and your collaborator is an AI that doesn't review code in the human sense, the PR adds ceremony without safety. You're reviewing your own code, which you were going to do anyway, but now you're doing it in a web interface instead of your editor.

The AI can review code, actually. You can ask Claude to review its own changes or another instance's changes. But that review doesn't require a pull request. It just requires a diff.

Git diff head gives you everything you need for an AI code review. You don't need a branch, a PR, a merge. You just need the changes and a prompt.
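A short sketch of why `git diff HEAD` is the right input for a review: it covers staged and unstaged changes to tracked files, while plain `git diff` shows only the unstaged ones. The files are hypothetical, and brand-new untracked files appear in neither.

```shell
# Scratch repository so the sketch runs anywhere (placeholder identity).
cd "$(mktemp -d)" || exit 1
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "a1" > a.txt
echo "b1" > b.txt
git add a.txt b.txt
git commit -q -m "Add a and b"

echo "a2" > a.txt
git add a.txt                 # staged change
echo "b2" > b.txt             # unstaged change

unstaged_only=$(git diff --name-only)        # working tree vs. index
everything=$(git diff HEAD --name-only)      # working tree vs. last commit

echo "git diff:      $unstaged_only"
echo "git diff HEAD: $(printf '%s' "$everything" | tr '\n' ' ')"
```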

For the solo agent-assisted developer, the git workflow collapses to: commit often, tag at stability points, verify before deploying, push everything. That's basically it.

The "commit often" part is where the agent needs the most hand-holding, which brings us full circle to the first part of the prompt.

Let's talk about the hand-holding instructions one more time, because I want to give people something they can copy and adapt. What's the actual text you'd put in a project rules file?

I'd write something like this. "Git hygiene rules. One, before starting any task, run git status and report the current branch, last commit, and any uncommitted changes. Two, after every file modification task, stage the changed files and commit with a descriptive message that explains what changed and why. Three, after committing, confirm the commit exists by running git log minus one. Four, never end a turn with uncommitted changes in the working tree. Five, before deploying, run git status and confirm the working tree is clean and all commits are pushed to remote."

That's clear. It's not ambiguous. And it has verification built in.

The verification is the part people skip. They tell the agent to commit, and they assume it did. The "confirm the commit exists" step catches the failure mode where the agent intended to commit but didn't.

What about when the agent is doing something exploratory? You're trying things, you're not sure what you'll keep. Does the "commit after every task" rule still apply?

Yes, but with a different message convention. Use "WIP" or "experiment" in the message so you know these commits are tentative. WIP dash experiment with dark mode toggle. If you keep the experiment, you can amend the message later. If you don't, you reset to the tag before the experiments started.

That's another place where tagging shines. Tag before the experimentation session. Then you can experiment freely, committing WIP commits as you go, knowing you can always reset to the tag if nothing works out.

It's liberating. You can be messy in the middle because you've got a clean checkpoint at the start.

There's a metaphor here that I think captures the whole thing. Working with an AI agent without git hygiene is like cooking without cleaning as you go. You're making great food, everything's coming together, and then you turn around and the kitchen is a disaster zone and you can't find the spatula and there's something burning and you don't know which cutting board had the raw chicken.

Tagging is labeling your containers before you put them in the fridge. You don't need the labels when everything's fresh in your memory. You need them three days later when you're staring at a Tupperware of something beige.

The Tupperware of something beige is basically the uncommitted git diff from two weeks ago.

It's the perfect metaphor. You know it was probably important when you saved it. You have no idea what it is now.

To wrap up the first part of the prompt, the hand-holding instructions boil down to: make committing a discrete, verified step, not a background assumption. Check git status at session boundaries. And verify the log periodically to catch divergence between the agent's mental model and reality.

For the recovery part: read the diff before you commit it, use git add minus p for triage, tag at stability points with annotated tags, push tags to remote, and use tags as reset points when things go wrong.

The prompt also asked about the easiest recovery state, and I think we've landed on: it depends on how much you understand the uncommitted changes. If you've read the diff and it all looks intentional, git add dot and a commit with a clear message is fine. If you're not sure, patch mode or asking Claude to triage is better. If you need to undo recent work entirely, reset to a tag.

If you're in the nightmare scenario where everything is tangled and you don't know what's good and what's bad, checkout the last known-good tag into a new branch and manually port over the changes you're confident about. It's slow, but it's correct.

One thing we haven't mentioned is git reflog. If you've done a hard reset and then realized you reset too far, reflog is your safety net.

Reflog is the undo button for git's undo button. It records every time HEAD moves, even for operations that would otherwise be destructive. Git reflog shows you a list of every commit HEAD has pointed to, most recent first. You can find the commit you were on before the reset and check it out or reset back to it.

For committed work, even a hard reset isn't truly destructive for at least thirty days by default.

The commits still exist as orphaned objects until garbage collection runs. Reflog lets you reach them. It's the reason you can be a little bolder with reset than you might otherwise feel comfortable being.
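A sketch of undoing a reset that went one commit too far, in a throwaway repository with hypothetical commits. It only rescues committed work; edits that were never committed are not in the reflog.

```shell
# Scratch repository so the sketch runs anywhere (placeholder identity).
cd "$(mktemp -d)" || exit 1
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "one" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
echo "two" >> notes.txt
git commit -q -a -m "Extend notes"
wanted=$(git rev-parse HEAD)

git reset -q --hard HEAD~1        # oops: reset one commit too far
git reflog -3                     # HEAD@{0} is the reset, HEAD@{1} is where HEAD was before it

git reset -q --hard 'HEAD@{1}'    # move back to the commit from before the reset
recovered=$(git rev-parse HEAD)
echo "recovered $recovered"
```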

That's reassuring. The safety net has a safety net.

Git is remarkably hard to lose data in, provided you've committed it at some point. The danger isn't losing committed work. The danger is work that was never committed in the first place.

Which is exactly what the prompt is about. The uncommitted changes are the ones that can vanish.

With an AI agent, uncommitted changes can accumulate silently across many sessions because the agent doesn't complain about them the way a human collaborator would. A human would say, "hey, you've got a dirty working tree, what's going on?" The AI just works around it.

Until it doesn't. Until it accidentally includes them in something or conflicts with them or introduces a regression because it assumed they were already part of the codebase.

That's the moment Daniel described. "Oh, by the way, you've got uncommitted changes from two weeks ago." By the time the agent mentions it, the context of those changes is long gone from the context window.

If I had to distill this entire conversation into a single principle, it would be: the agent's model of the repository is not the repository. Trust the git log, not the agent's narration.

That's it. That's the whole thing. The git log is the source of truth. The agent's description of what it did is a story. Stories can be wrong. The log is what actually happened.

Now: Hilbert's daily fun fact.

Hilbert: The largest recorded solar halo display occurred over Belize in nineteen sixty-two, spanning a full forty-four degrees of angular radius when a rare circumscribed halo joined with an infralateral arc during a Cold War atmospheric monitoring flight.

A forty-four degree halo is enormous. Most halos are twenty-two degrees.

I had no idea Belize was a hotspot for atmospheric optics during the Cold War.

I suspect it wasn't, and that's what makes it a fact.

Where does this leave us? We've got a set of practices, a recovery playbook, and a principle. The thing I keep thinking about is how none of this is new. Git's been around for twenty years. Commit early, commit often has been advice for two decades. But something about working with AI agents makes people forget the basics.

I think it's because the agent feels so competent that you assume competence extends to areas you haven't explicitly instructed. But git hygiene isn't a capability issue for Claude. It can run git commands perfectly well. It's an instruction issue. It doesn't know you care about clean commits unless you tell it you care.

The cost of not telling it is discovering, at the worst possible moment, that your repository is full of uncommitted changes with no provenance and no explanation.

The prompt described a frustrating day. That frustration is preventable with about five lines of project rules and a habit of checking git status at session boundaries. It's not a technical problem. It's a discipline problem.

Which is somehow both reassuring and annoying. Reassuring because the fix is simple. Annoying because you could have been doing it all along.

That's most best practices, honestly.

We should thank our producer, Hilbert Flumingtop, for keeping the show running while we nerd out about version control.

This has been My Weird Prompts. You can find every episode at myweirdprompts dot com.

Commit your changes before you deploy. We'll be here next week.

Clean working tree, clean conscience.

This episode was generated with AI assistance. Hosts Herman and Corn are AI personalities.
