#2345: Why File Naming Conventions Are More Than Just Style

Discover how file naming conventions like snake_case and camelCase impact development workflows, CI/CD pipelines, and filesystem compatibility.

Episode Details
Episode ID: MWP-2503
Duration: 24:47
Pipeline: V5
TTS Engine: chatterbox-regular
Script Writing Agent: Claude Sonnet 4.6

AI-Generated Content: This podcast is created using AI personas. Please verify any important information independently.

File naming conventions are often dismissed as mere stylistic preferences, but they play a pivotal role in development workflows, CI/CD pipelines, and filesystem compatibility. This episode explores the taxonomy of naming conventions—snake_case, camelCase, PascalCase, kebab-case, SCREAMING_SNAKE_CASE, and Train-Case—detailing their origins, ecosystem preferences, and practical implications.

Each convention emerged from specific constraints. For instance, snake_case traces back to C and Python, where underscores provided a clean separator that parsers could reliably interpret. camelCase and PascalCase, rooted in Algol and Pascal traditions, became staples in Java and C#, with PascalCase signaling types or classes in TypeScript. SCREAMING_SNAKE_CASE, used for constants in Unix shell scripts, emphasizes visual distinctiveness to enforce discipline.

The episode highlights the machine-safety concerns tied to filenames, such as case sensitivity across filesystems. Developers often work on case-insensitive systems like macOS’s APFS or Windows’ NTFS, while production servers typically run case-sensitive systems like Linux’s ext4. This mismatch can lead to latent failures, where a renamed file works locally but breaks in CI/CD pipelines or on different operating systems. Git’s handling of case-insensitive renames further complicates matters, as it may not track changes that only surface in case-sensitive environments.

The discussion underscores the importance of treating filenames as interfaces rather than labels. Conventions serve as shared contracts between developers, tools, and downstream processes, ensuring reliability and reducing cognitive friction. By understanding the tradeoffs and constraints behind each convention, developers can make informed choices that align with their ecosystems and avoid costly errors.

Ultimately, file naming conventions are more than just style—they’re architectural decisions that impact the robustness and maintainability of codebases. This episode offers practical insights and actionable advice for navigating this often-overlooked aspect of software development.

Corn
A team pushes a rename to their repo: capital S on a shell script, Script.sh instead of script.sh. On their MacBooks, running APFS, nothing breaks. The filesystem doesn't even register that anything changed. They merge, the CI runner spins up on Linux, ext4, case-sensitive, and it can't find the file. Not because the code was wrong. Because a letter changed case.
Herman
I'm Herman Poppleberry, and that story is not hypothetical. That exact failure mode — I keep wanting to say failure mode, but let's just call it what it is, that exact way things broke — shows up constantly in post-mortems. And it's almost never the first thing people look for, which is part of what makes it so costly.
Corn
Daniel sent us this one, and I think he framed it well. The question is essentially: why do file naming conventions and machine-safe naming practices matter more than most developers actually treat them? He wants us to cover the full taxonomy — kebab-case, snake_case, camelCase, PascalCase, Train-Case, SCREAMING_SNAKE_CASE — where each one comes from, what ecosystems favor it, when to reach for which. And then the deeper layer: the practical machine-safety concerns. Spaces and special characters in filenames, case sensitivity across filesystems, length limits, reserved characters, Unicode hazards, emoji in paths. And what all of that means when it actually breaks — shell scripts, glob expansion, Git on a case-insensitive filesystem, CI/CD pipelines going down. The underlying principle he's pushing toward is that files aren't labels. They're interfaces.
Herman
Which is the reframe that I think unlocks everything else. Once you think of a filename as an interface — something that other systems, other processes, other humans working programmatically will consume — the question of what you name it stops being aesthetic and becomes architectural.
Corn
By the way, today's episode is powered by Claude Sonnet four point six.
Herman
Good to know our script has impeccable taste.
Corn
The CI/CD example is worth sitting with for a second before we get into taxonomy, because it illustrates something specific. It wasn't a logic error. It wasn't a dependency failure. It was a filesystem disagreement about whether two strings were the same string.
Herman
The insidious part is that Git, on a case-insensitive filesystem, will not track that rename as a rename. It sees no change. So the developer who made the change has no indication anything went wrong. Their local tests pass. Their colleague's tests pass. Everyone on macOS is fine. The problem only materializes when the code hits a system with different assumptions baked into the kernel.
Corn
The failure was latent. It existed the moment the rename happened. It just didn't surface until the environment changed.
Herman
And that latency is what makes poor file naming dangerous in a way that a syntax error isn't. A syntax error fails immediately and loudly. A filename that violates machine-safety assumptions can sit quietly in a repository for months before it detonates in production, or in a deployment pipeline, or when someone tries to run the project on a different operating system.
Corn
How long are we talking, realistically? Like, in a real team, how long could something like that sit before it surfaces?
Herman
Indefinitely, if the team is homogeneous enough. If everyone is on macOS, running the same CI image, never touching Windows — you could go years. The trigger is usually something environmental. You migrate your CI from one provider to another, you onboard a developer who runs Linux locally, you upgrade your Docker base image and the new one uses a different filesystem configuration. Something shifts in the environment and suddenly a latent assumption becomes a live failure.
Corn
Which is a pretty good argument for treating this as infrastructure, not housekeeping.
Herman
Infrastructure is exactly the right frame. And the taxonomy of naming conventions is where that becomes concrete, because each convention exists for a reason that's grounded in what the consuming system expects.
Corn
Right, and I think that's the thing people miss. They see snake_case versus camelCase as a style preference, like tabs versus spaces, something to argue about and then forget. But the conventions map to actual ecosystem constraints.
Herman
The Wikipedia article on programming naming conventions traces snake_case back to C — specifically to Kernighan and Ritchie's original work in 1978. The underscore was the separator that worked cleanly in identifiers when spaces obviously couldn't, and Python inherited that lineage hard. The standard library, PEP 8, all of it.
Corn
Kebab-case is the URL-friendly cousin. Hyphens instead of underscores, which is why you see it everywhere in web contexts — CSS class names, URL slugs, Lisp, which predates most of this by decades.
Herman
CamelCase and PascalCase come out of the Algol and Pascal traditions, which fed directly into Java and C sharp. PascalCase is literally named after the Pascal language. And the distinction between camel and Pascal — whether the first word is lowercase or capitalized — sounds trivial until you're in a codebase where the convention signals whether something is a variable or a type.
Corn
There's actually a fun piece of trivia here. The term camelCase itself wasn't widely standardized until the nineties, even though the style had been in use for decades. Different communities called it different things — InterCaps, BumpyCaps, WikiCase if you were in that world. The camel metaphor only stuck because it was the most evocative. You look at the humps in the middle of the word and it just clicks.
Herman
WikiCase is a good one because it shows how the same convention gets reinvented independently when the constraint is the same. Wiki software needed page names that were both human-readable and automatically linkable without special syntax. So you smash the words together with capital letters and the software can detect word boundaries. Same underlying problem as a parser that can't handle spaces, same solution.
Corn
SCREAMING_SNAKE_CASE is the one that announces itself. All caps, underscores, constants only. Unix environment variables. If you see MAX_RETRIES in a shell script, you know immediately what it is and you know not to reassign it mid-execution.
Herman
Which is the point of the convention. It's communicating to the reader and to the tooling simultaneously. Machine-safe naming works the same way — it's not just about what the filesystem accepts, it's about what downstream processes can reliably parse without you having to handle edge cases.
Corn
The conventions are a shared contract.
Herman
A shared contract between the developer who names the file, the tools that consume it, the CI system that runs against it, and the next developer who has to work with it six months later without any context.
Corn
Contracts have consequences when you break them. So let's actually work through what each convention is doing mechanically, because I think the tradeoffs become visible when you look at them that way.
Herman
Start with snake_case, because it's probably the cleanest example of a convention that emerged from a hard constraint. The underscore was never going to be misinterpreted by a parser. It's not a mathematical operator, it's not a path separator, it's not a shell metacharacter. It just sits there, inert, doing its job.
Corn
Which is why Python leaned into it so completely. You look at something like the requests library: get, post, raise_for_status, status_code, iter_content. Everything lowercase, everything underscored. There's no ambiguity about what the tokenizer sees.
Herman
The readability argument is real. For long identifiers, underscores are genuinely easier to scan than camel humps. calculate_total_invoice_amount is easier to parse at a glance than calculateTotalInvoiceAmount, at least for most readers.
Corn
Though JavaScript developers would fight you on that.
Herman
They would, and not without reason. camelCase in JavaScript is load-bearing. The language itself, the DOM API, every major framework and runtime (React, Vue, Node) uses camelCase for variables and functions. It's so deeply embedded that violating it reads as a bug, not a preference. If you're writing a React component and you name a prop background_color instead of backgroundColor, someone is going to think something went wrong.
Corn
The convention carries semantic weight. It signals which ecosystem you're operating in.
Herman
PascalCase takes that further. In TypeScript, in C sharp, PascalCase on an identifier is a strong signal that you're looking at a type or a class, not a variable. MyComponent, UserProfile, InvoiceService. The capitalization is doing type-system communication before you even read the definition.
Corn
That's actually enforced by some linters, right? It's not just convention at that point — the tooling will flag it.
Herman
In TypeScript with strict ESLint rules, yes. There are rules that specifically require PascalCase for type aliases and interfaces and will throw a warning if you deviate. So the convention has been promoted from social agreement to automated enforcement. Which is exactly where you want it.
Corn
Train-Case is the one that doesn't come up as often in these conversations. It's the HTTP header convention: Content-Type, Accept-Encoding, X-Request-ID. It's kebab-case with the first letter of each word capitalized.
Herman
Right, and it exists almost entirely in that one context. You wouldn't use Train-Case for a Python variable or a JavaScript function. It's domain-specific in a way the others aren't. If you see it somewhere unexpected, that's actually a signal that something's probably wrong.
Corn
SCREAMING_SNAKE_CASE — the interesting thing about it is that it's the only convention where the visual weight is intentional by design. It's supposed to stand out.
Herman
Defensive programming through typography, almost. The all-caps is saying: this is a constant, treat it with respect, do not shadow it, do not reassign it. In Unix shell scripts, environment variables like PATH, HOME, MAX_CONNECTIONS — the convention enforces a discipline that the language itself often doesn't enforce mechanically.
Corn
Each of these conventions is solving a specific problem in a specific context. The mistake is importing one into an ecosystem that expects another.
Herman
Which happens constantly. A Python developer who's been writing snake_case for years joins a TypeScript project and names everything with underscores. The code still runs, but every other developer on the team has to do a small cognitive translation every time they read it. That friction compounds.
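A minimal sketch of how the taxonomy above could be checked mechanically, for instance to spot an identifier imported from the wrong ecosystem. The regular expressions and the `classify` helper are illustrative assumptions, not a standard; single-word names are left unclassified because they fit several conventions at once.

```python
import re

# One pattern per convention discussed above. These regexes are illustrative
# assumptions; real style guides differ on digits and acronyms.
CONVENTIONS = {
    "snake_case": re.compile(r"[a-z][a-z0-9]*(_[a-z0-9]+)+"),
    "SCREAMING_SNAKE_CASE": re.compile(r"[A-Z][A-Z0-9]*(_[A-Z0-9]+)+"),
    "kebab-case": re.compile(r"[a-z][a-z0-9]*(-[a-z0-9]+)+"),
    "Train-Case": re.compile(r"[A-Z][a-zA-Z0-9]*(-[A-Z][a-zA-Z0-9]*)+"),
    "camelCase": re.compile(r"[a-z][a-z0-9]*([A-Z][a-z0-9]*)+"),
    "PascalCase": re.compile(r"([A-Z][a-z0-9]+){2,}"),
}


def classify(name):
    """Return the first convention whose pattern matches the whole name."""
    for label, pattern in CONVENTIONS.items():
        if pattern.fullmatch(name):
            return label
    return "unclassified"


for example in ["send_request", "backgroundColor", "UserProfile",
                "Content-Type", "MAX_RETRIES", "url-slug"]:
    print(example, "->", classify(example))
```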
Corn
In filenames specifically, the stakes are higher than in code identifiers, because the filesystem doesn't know which language you're using.
Herman
The filesystem is the great equalizer. It doesn't care about your language idioms or your team's style guide. It has its own rules, and they vary depending on which filesystem you're actually sitting on.
Corn
Which is where things get treacherous. Because most developers work on one machine, with one filesystem, and they build up intuitions that are just... wrong in other contexts.
Herman
The three you need to understand are ext4, APFS, and NTFS. ext4 is the default on most Linux systems. It's case-sensitive. Foo.txt and foo.txt are two different files. APFS is what macOS has been running since 2017 and it's case-insensitive by default, though you can format a volume as case-sensitive if you know to ask. NTFS, Windows, also case-insensitive by default. So you have this situation where the majority of developer laptops are running case-insensitive filesystems, and the majority of production servers are running case-sensitive ones.
Corn
That's a structural mismatch baked into the industry.
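One way to see that mismatch in a repository is to look for paths that differ only by letter case: legal side by side on ext4, but colliding on a case-insensitive volume. The sketch below assumes the paths arrive as a plain list of strings (for example, collected from a listing of tracked files); `case_collisions` is a hypothetical helper name.

```python
from collections import defaultdict


def case_collisions(paths):
    """Group paths that become identical once letter case is ignored."""
    groups = defaultdict(set)
    for path in paths:
        # casefold() is a more aggressive lower() intended for caseless matching.
        groups[path.casefold()].add(path)
    return [sorted(group) for group in groups.values() if len(group) > 1]


tracked = ["src/utils.js", "src/Utils.js", "README.md"]
print(case_collisions(tracked))
```

An empty result means no two paths in the list would collapse into one on a case-insensitive filesystem.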
Herman
Git doesn't paper over it cleanly. If you rename a file from utils.js to Utils.js on macOS, Git on that filesystem sees no change. The rename simply does not register. You have to use git mv with the dash dash force flag to make it stick, or rename it to something else entirely and then rename it back. It's awkward.
Corn
If someone doesn't know to do that, they commit what they think is a rename, push it, and the CI runner on Linux tries to find Utils.js and finds utils.js instead, which is a different file, which may or may not exist.
Herman
The pipeline breaks with a file not found error, and the developer is staring at their screen thinking, but it's right there. I can see it. Because on their machine, it is right there.
Corn
That's a particularly cruel debugging experience.
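The "it's right there" situation can be reproduced locally with a stricter existence check. This is a sketch, not an established tool: `exists_with_exact_case` is a hypothetical helper that compares each path component against the names the directory listing actually reports, so a reference to Utils.js fails when the stored name is utils.js, even on a case-insensitive filesystem.

```python
import os
import tempfile


def exists_with_exact_case(root, relative_path):
    """True only if every component matches the stored name byte for byte."""
    current = root
    for part in relative_path.split("/"):
        try:
            entries = os.listdir(current)  # names exactly as stored on disk
        except (FileNotFoundError, NotADirectoryError):
            return False
        if part not in entries:
            return False
        current = os.path.join(current, part)
    return True


with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "src"))
    open(os.path.join(root, "src", "utils.js"), "w").close()
    print(exists_with_exact_case(root, "src/utils.js"))  # True
    print(exists_with_exact_case(root, "src/Utils.js"))  # False
```

On macOS or Windows, `os.path.exists` would typically report both paths as present; the listing-based check reports what a case-sensitive CI runner would see.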
Herman
And the length limits add another layer. NTFS supports filenames up to two hundred and fifty-five characters. ext4 supports up to two hundred and fifty-five bytes. Those sound equivalent until you introduce Unicode, because a single Unicode character can be two, three, or four bytes. So a filename that's two hundred characters long in a script using multi-byte characters might be perfectly legal on NTFS and blow the limit on ext4.
Corn
Nobody tests for that. Nobody is sitting there counting bytes in their filenames.
Herman
Until a deployment script hits a path that's too long and fails silently or throws an error that doesn't obviously point to the filename length as the cause.
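The character-versus-byte gap is easy to measure. A small sketch, assuming filenames are encoded as UTF-8 on the Linux side and using the 255-byte per-name limit mentioned above; the helper names are made up for illustration.

```python
EXT4_MAX_NAME_BYTES = 255  # per-filename limit quoted in the discussion above


def name_lengths(filename):
    """Return (characters, UTF-8 bytes) for a single filename."""
    return len(filename), len(filename.encode("utf-8"))


def fits_byte_limit(filename, limit=EXT4_MAX_NAME_BYTES):
    return len(filename.encode("utf-8")) <= limit


ascii_name = "a" * 200 + ".txt"
accented_name = "\u00e9" * 200 + ".txt"  # 200 copies of e-acute, 2 bytes each in UTF-8

print(name_lengths(ascii_name), fits_byte_limit(ascii_name))        # (204, 204) True
print(name_lengths(accented_name), fits_byte_limit(accented_name))  # (204, 404) False
```

Both names have the same character count, but only the ASCII one stays under the byte limit.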
Corn
What about reserved characters? Because Windows has a list that I think surprises people who've only worked on Unix.
Herman
It's substantial. On Windows with NTFS, you cannot use a forward slash, backslash, colon, asterisk, question mark, double quote, less-than, greater-than, or pipe in a filename. That's nine characters that are either path separators, shell metacharacters, or redirects. Unix is more permissive — technically the only truly forbidden characters in a filename on ext4 are the forward slash and the null byte. Everything else is legal, which is precisely the problem.
Corn
Because legal on the filesystem and safe in a shell are completely different things.
Herman
A filename with a space in it is perfectly legal on every major filesystem. It will also break any shell script that isn't quoting its variables correctly. And most shell scripts, if we're being honest, are not quoting their variables correctly everywhere.
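The difference between "legal on the filesystem" and "safe in a shell" can be sketched as two separate checks. `WINDOWS_RESERVED` holds the nine characters listed above; `SHELL_RISKY` is an illustrative, non-exhaustive assumption about characters that commonly need quoting in POSIX-style shells.

```python
# The nine characters Windows rejects in filenames, as listed above.
WINDOWS_RESERVED = set('/\\:*?"<>|')

# Assumption for illustration: characters that often need quoting or escaping
# in a POSIX-style shell. Not an exhaustive list.
SHELL_RISKY = set(" \t\n'\"$`&;|<>()*?[]{}!#~\\")


def windows_reserved_in(name):
    """Characters in the name that Windows would reject."""
    return sorted(set(name) & WINDOWS_RESERVED)


def shell_risky_in(name):
    """Characters in the name that an unquoted shell expansion may mishandle."""
    return sorted(set(name) & SHELL_RISKY)


name = "quarterly report.csv"
print(windows_reserved_in(name))  # [] : legal on Windows
print(shell_risky_in(name))       # [' '] : still risky in an unquoted script
```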
Corn
Someone names a file quarterly report.csv, and then a script tries to process it and the shell interprets quarterly and report.csv as two separate arguments.
Herman
Glob expansion makes it worse. If you have a directory with files named report 1.txt, report 2.txt, report 3.txt, and you write a script that loops over the output of ls, something like for file in $(ls *.txt), the shell word-splits that output on the spaces, and suddenly your loop is iterating over report, 1.txt, report, 2.txt, report, 3.txt: six tokens instead of three files. A bare for file in *.txt keeps each name intact in the loop header, but an unquoted $file inside the loop body splits the same way.
Corn
Which is a bug that only appears when the filenames have spaces, so it works fine in testing with clean names and detonates in production when a user uploads something with a normal human-readable name.
Herman
The fix in the shell script — quoting your variable in double quotes — is one of those things that feels like a minor style point until it isn't. ShellCheck, the static analysis tool for shell scripts, will flag unquoted variables, and this is exactly why. The tool exists because the failure mode is so common and so non-obvious.
Corn
That's actually a good example of the linter doing work that the runtime won't. The shell will happily execute the broken version. It just won't do what you meant.
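The word-splitting failure and the quoting fix can be imitated in Python with the standard `shlex` module. This is a rough model of shell behavior under simplifying assumptions (whitespace splitting only, no glob or variable expansion), not a shell emulator.

```python
import shlex

files = ["report 1.txt", "report 2.txt", "report 3.txt"]

# Roughly what an unquoted expansion does: the names are joined into one
# string and then split again on whitespace.
unquoted_tokens = " ".join(files).split()
print(len(unquoted_tokens), unquoted_tokens)
# 6 ['report', '1.txt', 'report', '2.txt', 'report', '3.txt']

# Quoting each name keeps it together as a single argument.
quoted_command = " ".join(shlex.quote(name) for name in files)
round_tripped = shlex.split(quoted_command)
print(len(round_tripped), round_tripped)
# 3 ['report 1.txt', 'report 2.txt', 'report 3.txt']
```

When a script launches other programs from Python, passing arguments as a list to `subprocess.run` avoids the shell's word splitting altogether.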
Herman
Unicode and emoji take this further. Modern filesystems handle Unicode reasonably well in isolation. The problem is cross-platform consistency and the tools that sit above the filesystem. A filename with an emoji in it might display correctly in Finder, refuse to tab-complete in certain terminals, fail to match in a regex that wasn't written to handle multi-byte sequences, and cause a Python script using the older string handling to throw a codec error.
Corn
There's the normalization issue. Unicode has multiple ways to represent the same character. A filename with an accented e might be stored as a single precomposed character on one system and as a base letter plus a combining accent on another. Those are different byte sequences. Git sees them as different files.
Herman
Which is a real source of mysterious duplicates in repositories when developers on different operating systems are working with filenames that include diacritics.
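The normalization problem fits in a few lines with the standard `unicodedata` module. The filename is made up for illustration; the two spellings render identically but are different code point sequences until both are normalized to the same form.

```python
import unicodedata

precomposed = "caf\u00e9.txt"   # e-acute as a single code point (NFC style)
decomposed = "cafe\u0301.txt"   # plain e followed by a combining acute accent (NFD style)

print(precomposed == decomposed)           # False: different code points
print(len(precomposed), len(decomposed))   # 8 9


def same_name(a, b):
    """Compare two filenames after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(same_name(precomposed, decomposed))  # True
```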
Corn
The principle that ties all of this together is the one Daniel pushed toward in the prompt. Files aren't labels. They're interfaces.
Herman
An interface that has to be consumed reliably by your shell, your build system, your version control, your CI runner, your deployment scripts, and every developer who clones the repository on whatever operating system they happen to be using. When you name a file, you're not just describing its contents. You're making a promise about how it can be referenced programmatically.
Corn
Breaking that promise doesn't always announce itself immediately. That's the thing that separates a filename problem from a code problem. A bad variable name causes a syntax error or a type error right away. A bad filename sits quietly until the environment changes, or until a script runs that wasn't written defensively, or until someone on a different OS joins the project.
Herman
Latent failures are always more expensive than immediate ones. The CI pipeline that breaks after a merge is expensive. The production deployment that fails because of a path issue that's been in the codebase for eight months is catastrophic.
Corn
Entirely preventable with about five minutes of thinking upfront.
Herman
Five minutes and a linter, honestly. Because the good news is most of this is automatable. You don't have to rely on developers remembering the rules under deadline pressure.
Corn
What does that actually look like in practice? Someone starts a new project — what are the concrete decisions they should be making on day one?
Herman
First decision: pick one convention for filenames and write it down. Not in your head. In a contributing guide, in a README, somewhere that a new team member will actually find it. The specific convention matters less than the consistency. kebab-case for everything is a perfectly defensible choice for a web project. snake_case for a Python project. What kills you is mixing them because nobody decided.
Corn
Is there a way to enforce that without it becoming a code review argument every time someone opens a PR?
Herman
That's exactly the right question, because code review is the worst place to catch this. By the time it's in a PR, someone has already done the work, and asking them to rename files feels petty even when it matters. You want the enforcement to happen before the commit, not after. Which is where pre-commit hooks come in.
Corn
The filesystem-level stuff?
Herman
Assume case-insensitive even if your current environment is case-sensitive. It costs you nothing to name files with that constraint in mind, and it means you'll never hit the Git rename trap. Practically: all lowercase, hyphens or underscores, no spaces, no special characters outside that set. That's a rule you can put in a pre-commit hook and enforce automatically.
Corn
Pre-commit hooks are underused for exactly this kind of thing.
Herman
There's a tool called pre-commit, the Python package, that makes it trivial to run filename checks before anything gets committed. You can write a hook in twenty lines that rejects any filename containing a space, an uppercase letter in a context where you've decided on lowercase, or a character outside your allowed set. The failure is immediate and local, not three steps downstream in CI.
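A sketch of what such a check might look like. It assumes the hook receives the staged file paths as command-line arguments, and it assumes a policy of lowercase ASCII letters, digits, dots, hyphens, and underscores; both the policy and the function names are illustrative, and a real project would likely whitelist files such as README.md.

```python
import re
import sys

# Assumed policy: every path component is lowercase ASCII letters, digits,
# dots, hyphens, or underscores. Adjust to the convention the team wrote down.
ALLOWED_COMPONENT = re.compile(r"[a-z0-9._-]+")


def violations(paths):
    """Return the paths that contain at least one disallowed component."""
    bad = []
    for path in paths:
        components = path.split("/")
        if not all(ALLOWED_COMPONENT.fullmatch(part) for part in components):
            bad.append(path)
    return bad


def main(argv):
    """Print each offending path and return a non-zero status if any exist."""
    bad = violations(argv)
    for path in bad:
        print(f"disallowed filename: {path}", file=sys.stderr)
    return 1 if bad else 0


# A hook entry point would call: sys.exit(main(sys.argv[1:]))
print(violations(["src/utils.js", "docs/Quarterly Report.csv", "src/Utils.js"]))
```

A non-zero return value is what lets the hook block the commit, so the failure shows up on the developer's machine rather than in CI.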
Corn
What about the length issue? The byte-versus-character problem on ext4?
Herman
Keep filenames short. Under a hundred characters is a good rule of thumb that gives you headroom on every major filesystem as long as the names are mostly ASCII. If a filename is approaching two hundred characters, that's usually a sign the path structure needs rethinking, not that you need to count bytes. And for Unicode: stick to ASCII for filenames. Not because Unicode support is bad, but because the cross-platform normalization problems are subtle enough that they'll bite you at the worst moment. If you're building tooling that has to handle arbitrary user-supplied filenames, sanitize on ingest. Strip or transliterate anything outside ASCII before it touches your filesystem.
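A minimal sketch of sanitizing on ingest, assuming the target policy is lowercase ASCII with hyphens as separators and a 100-character cap. `sanitize_filename` is a hypothetical helper; its transliteration is crude (accents are dropped, characters with no ASCII decomposition are removed), and a production version would also need to handle name collisions and preserve extensions when truncating.

```python
import re
import unicodedata


def sanitize_filename(name, max_length=100):
    """Reduce an arbitrary user-supplied name to a conservative ASCII form."""
    # Split accented characters into base letter plus combining mark, then
    # drop everything that is not ASCII (marks, emoji, other scripts).
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Lowercase, then collapse each run of disallowed characters to one hyphen.
    cleaned = re.sub(r"[^a-z0-9._-]+", "-", ascii_only.lower())
    # Trim separators and dots from both ends; note this also removes the
    # leading dot of hidden files.
    cleaned = cleaned.strip("-.")
    # Simple truncation; this may cut into the extension of very long names.
    return cleaned[:max_length] or "unnamed"


print(sanitize_filename("R\u00e9sum\u00e9 2024.PDF"))   # resume-2024.pdf
print(sanitize_filename("\U0001F4CA report.csv"))       # report.csv
```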
Corn
The interface principle in action. You control the contract at the boundary.
Herman
And the last thing I'd add is: treat your CI environment as the source of truth for what's acceptable. If your pipeline runs on Linux with ext4, your local development should be testing under those same constraints, not assuming macOS forgiveness will hold.
Corn
Which is an argument for containerized development environments, but that's a whole other episode.
Herman
It really is. Though I'll say — even without full containerization, something as simple as running your test suite inside a Docker container that uses a Linux base image catches a huge proportion of these issues before they reach CI. It's not a complete solution but it closes the most common gap.
Corn
Let's land the plane here, because I think the thing worth sitting with is how much of this is invisible until it isn't. You can ship software for years with sloppy file naming and never notice, and then one environment change, one new contributor on a different OS, one CI migration, and suddenly you're debugging something that looks completely unrelated to what's actually wrong.
Herman
The diagnosis is hard. The fix is rename the file, add a linter rule, update the convention doc. But you've already burned hours getting there.
Corn
Which is what makes it feel like such a waste. It's not a hard problem. It's a neglected one.
Herman
The forward-looking question I keep coming back to is whether filesystem design is going to catch up to the mess. There are experiments with content-addressable storage, systems where the identifier isn't a human-readable name at all but a hash of the content. Git's internal object store works that way already. If that model ever surfaces at the filesystem level, a lot of these naming problems just dissolve.
Corn
Though you'd introduce a completely different set of problems around human legibility. Someone has to know what the hash refers to.
Herman
There's probably no world where you fully escape the tension between names that are meaningful to humans and names that are safe for machines. The best you can do is be deliberate about where you sit on that spectrum and enforce it consistently.
Corn
Which is, honestly, a reasonable place to leave it. Don't assume the filesystem is forgiving just because your laptop is.
Herman
Write it down before you need it, not after.
Corn
Thanks to Hilbert Flumingtop for producing the show, and to Modal for keeping our infrastructure from doing exactly what we've been describing for the last twenty-five minutes. This has been My Weird Prompts. If you've got a moment, a review on Spotify goes a long way. We'll see you next time.

This episode was generated with AI assistance. Hosts Herman and Corn are AI personalities.