#2193: Running Claude in Your Apartment (The Physics Says No)

Building a local AI inference server to rival Claude Code sounds great until you do the math on heat, noise, and neighbor relations.

Episode Details
Episode ID: MWP-2351
Duration: 27:01
Pipeline: V5
TTS Engine: chatterbox-regular
Script Writing Agent: claude-sonnet-4-6

AI-Generated Content: This podcast is created using AI personas. Please verify any important information independently.

Running State-of-the-Art AI Locally: The Hardware, Heat, and Neighbor Problem

Building a local inference server powerful enough to rival Claude Code or OpenAI's commercial offerings sounds appealing—no API costs, no rate limits, full control. The reality is messier. It requires not just expensive hardware, but thermal engineering, acoustic isolation, and a diplomatic strategy for the people living next to you.

What Are You Actually Building?

The target model is Qwen3-Coder-480B-A35B-Instruct, the current open-source state-of-the-art for coding tasks. It scores 61.8% on the Aider Polyglot benchmark, outperforming Claude Sonnet 4 (56.4%) and GPT-4.1 (52.4%). Running it at speeds that feel responsive requires a minimum of 150–276 gigabytes of fast memory, depending on how aggressively the model is quantized.
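The memory range follows roughly from parameter count times bits per weight. Because this is a Mixture-of-Experts model, all 480 billion parameters must be resident even though only about 35 billion are active per token. A minimal sketch, assuming effective widths of 2.5 and 4.6 bits per weight (illustrative values chosen because they reproduce the quoted range, not figures from the episode) and ignoring KV cache and activation overhead:

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights alone, in decimal gigabytes."""
    total_bits = params_billion * 1e9 * bits_per_weight
    return total_bits / 8 / 1e9


PARAMS_BILLION = 480  # total parameters in Qwen3-Coder-480B-A35B

# Assumed effective bit widths (hypothetical, for illustration only).
ASSUMED_BITS = {"Q2-class": 2.5, "Q4-class": 4.6}

for label, bits in ASSUMED_BITS.items():
    gb = weight_memory_gb(PARAMS_BILLION, bits)
    print(f"{label}: about {gb:.0f} GB of weights")
```

Real quantization formats mix bit widths across layers, so actual file sizes will differ from this estimate.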

Three hardware tiers emerge:

Tier 1: The Reasonable Madman ($10,900)
Eight used RTX 3090s, AMD EPYC 7702, 512GB DDR4 RAM, and standard power supplies. Total VRAM: 192GB. Performance: 4–8 tokens per second at Q2 quantization. Functional, but slow enough that you'll question your choices between prompts. Think: watching someone type deliberately, with long pauses.

Tier 2: The Apartment Destroyer ($41,273)
Eight RTX 5090s, dual AMD EPYC 9354 processors, 512GB DDR5 RAM, and crucially, an Eaton SRCOOL18K server-grade portable air conditioner. Total VRAM: 256GB. Performance: 15–27 tokens per second at Q4 quantization with partial CPU offloading. This actually rivals the target. It also draws 5,500 watts continuously and generates 18,766 BTUs of heat per hour.

Tier 3: The Nuclear Option ($455K–$577K)
A DGX H100 with eight H100 SXM GPUs, 640GB of HBM3 memory, three-phase electrical service, precision cooling, and a structural engineer's assessment to confirm your apartment floor can support the roughly 130-kilogram unit. The DGX produces 106 decibels at full load: a rock concert, continuously, in your residence.

The recommended build is Tier 2. Tier 1 is too slow to be competitive. Tier 3 requires permits, engineers, and lawyers.

The Thermal Reality

Here's the physics problem nobody wants to think about: an AI inference server converts essentially all electrical input into heat. There's no mechanical work, no useful output except computation. Every watt drawn becomes about 3.412 BTUs per hour of heat.

The Tier 2 build draws 5,500 watts. That's 18,766 BTUs per hour. A standard apartment air conditioner is rated at 14,000 BTUs. The server generates 34% more heat than your AC can remove.
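The conversion is a single multiplication. A short sketch using the 3.412 BTU-per-hour-per-watt factor quoted in the episode and the 14,000 BTU air conditioner rating from the text:

```python
BTU_PER_HOUR_PER_WATT = 3.412  # conversion factor quoted in the episode


def watts_to_btu_per_hour(watts: float) -> float:
    """Convert a continuous electrical load to heat output in BTU/h."""
    return watts * BTU_PER_HOUR_PER_WATT


def excess_over_cooling_pct(heat_btu_h: float, cooling_btu_h: float) -> float:
    """Percentage by which heat output exceeds cooling capacity."""
    return (heat_btu_h / cooling_btu_h - 1) * 100


server_heat = watts_to_btu_per_hour(5500)
excess = excess_over_cooling_pct(server_heat, 14_000)
print(f"{server_heat:.0f} BTU/h, {excess:.0f}% over the AC rating")
```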

Without cooling, temperature rises roughly 0.34°C per minute in a 65 square meter apartment. That's 20 degrees per hour. Starting from 20°C ambient, the apartment hits 70–90°C within 3–4 hours—well above the 45°C server failure threshold. The server destroys itself. The apartment becomes uninhabitable.
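The rise rate can be sanity-checked with a lumped heat-capacity estimate. The sketch below assumes a 2.5 meter ceiling (the figure given in the episode), textbook approximations for air density and specific heat, and no heat loss through walls or windows. The thermal_mass_factor parameter is a hypothetical knob standing in for walls and furniture, not a measured value.

```python
AIR_DENSITY = 1.2         # kg/m^3, approximate at room temperature
AIR_SPECIFIC_HEAT = 1005  # J/(kg*K), approximate


def rise_rate_c_per_min(power_w: float, floor_area_m2: float,
                        ceiling_m: float,
                        thermal_mass_factor: float = 1.0) -> float:
    """Lumped estimate dT/dt = P / C with no heat loss, in deg C per minute."""
    air_mass_kg = AIR_DENSITY * floor_area_m2 * ceiling_m
    heat_capacity = air_mass_kg * AIR_SPECIFIC_HEAT * thermal_mass_factor
    return power_w / heat_capacity * 60


air_only = rise_rate_c_per_min(5500, 65, 2.5)
with_mass = rise_rate_c_per_min(5500, 65, 2.5, thermal_mass_factor=5)
print(f"air only: {air_only:.2f} C/min")
print(f"with assumed 5x thermal mass: {with_mass:.2f} C/min")
```

Air alone would heat at roughly 1.7°C per minute. An effective heat capacity near five times that of the air reproduces the quoted 0.34°C per minute.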

With the Eaton SRCOOL18K portable AC, you stabilize at 23–25°C. But there's a critical detail: the exhaust hose must vent outside. If you forget and vent it back into the room, you've accomplished nothing.

The server corner itself runs 10–15°C hotter than the rest of the apartment due to localized heat concentration. That corner is off-limits for living.

The Acoustic Problem

One RTX 5090 under full load produces 50–60 decibels. Eight of them at the 60-decibel end, combined logarithmically, generate roughly 69 decibels. Add 20 industrial Noctua fans, pump noise, PSU fans, and the server AC compressor, and you're at 85–92 decibels at one meter from the rig.
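Decibels do not add linearly. Independent sources are combined by summing their powers and converting back to a logarithmic level. A minimal sketch of that rule, using the per-GPU range from the text and the 56-decibel fan total mentioned in the episode:

```python
import math


def combine_db(levels):
    """Combine independent sound sources: L = 10 * log10(sum(10^(Li/10)))."""
    return 10 * math.log10(sum(10 ** (level / 10) for level in levels))


gpus_low = combine_db([50] * 8)    # eight GPUs at the quiet end of the range
gpus_high = combine_db([60] * 8)   # eight GPUs at the loud end of the range
gpus_and_fans = combine_db([69, 56])

print(f"8 GPUs: {gpus_low:.1f} to {gpus_high:.1f} dB")
print(f"GPUs plus fans: {gpus_and_fans:.1f} dB")
```

Eight identical sources add about 9 decibels to a single one, and a source 13 decibels quieter than the loudest barely moves the total.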

New York City's residential noise ordinance caps interior noise at 42 decibels. Standard drywall reduces noise by 35–40 decibels. Your neighbor receives 50–55 decibels through the shared wall—they're entitled to 42. You're delivering 55. The neighbor below hears what one research brief described as "a jet engine warming up, continuously, forever."

Acoustic foam panels reduce noise by 4 decibels at best. To reach 42 decibels from 90, you need 48 decibels of attenuation. That requires a room-within-a-room: decoupled walls, mass-loaded vinyl, resilient channels, acoustic sealant. Cost: $15K–$40K for the server corner alone. The soundproofing costs nearly as much as the server itself, and it will still be audible—just quieter.
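The wall arithmetic is plain subtraction in decibels. A rough sketch, assuming the 90-decibel working figure and the 42-decibel limit from the text, and ignoring flanking paths and frequency dependence, which matter a great deal in real buildings:

```python
def level_at_neighbor(source_db: float, wall_loss_db: float) -> float:
    """Sound level on the far side of a wall with the given transmission loss."""
    return source_db - wall_loss_db


def shortfall_db(source_db: float, wall_loss_db: float, limit_db: float) -> float:
    """Extra attenuation still needed beyond the wall to meet the limit."""
    return max(0.0, level_at_neighbor(source_db, wall_loss_db) - limit_db)


SOURCE_DB = 90
LIMIT_DB = 42

print(f"total attenuation needed: {SOURCE_DB - LIMIT_DB} dB")
for wall_loss in (35, 40):
    received = level_at_neighbor(SOURCE_DB, wall_loss)
    missing = shortfall_db(SOURCE_DB, wall_loss, LIMIT_DB)
    print(f"drywall at {wall_loss} dB: neighbor hears {received} dB, "
          f"{missing:.0f} dB short of the limit")
```

Under these assumptions the existing drywall leaves a gap of 8 to 13 decibels, which is more than foam panels at 4 decibels can close.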

Neighbor Diplomacy

The solution involves phases:

Phase 1 (Pre-emptive): Visit every adjacent neighbor with a gift basket containing noise-canceling earplugs, a handwritten note, and a $25 gift card. Budget: $200. This buys 2–3 weeks of goodwill.

Phase 2 (Reactive): Someone complains. Maintain a written noise log with timestamps. Document actions taken. This creates a legal record.

Phase 3 (Escalation): Offer to cover the cost of noise-canceling headphones for the affected neighbor ($200–$400). If that fails, you're in legal territory.

The Verdict

Running Qwen3-Coder-480B locally is technically possible. It's thermodynamically expensive, acoustically hostile, and diplomatically complex. Tier 2 is the realistic sweet spot: $41K for hardware, $15K–$40K for acoustic isolation, plus the certainty that your relationship with neighbors will deteriorate. You'll gain local inference, low latency, and no API costs. You'll lose peace, quiet, and probably a security deposit.


#2193: Running Claude in Your Apartment (The Physics Says No)

Corn
Alright, so Daniel sent us this one, and I want to be clear upfront — this is either the most ambitious technical project we've ever covered, or an elaborate cry for help. Possibly both. Here's what he wrote: Herman and Corn should spec out a local AI inference server powerful enough to rival Claude Code or OpenAI Codex. He wants a full parts list with pricing, a maintenance plan for a team of four — that's Corn, Herman, Daniel, and Hannah — to handle the extreme heat and noise, a thermal and acoustic simulation for a sixty-five square meter apartment, diplomatic strategies for neighbor disputes, a detailed maintenance timeline, and contingency plans for the electricity situation, which he describes as "enormous demand." The approach, he says, should be realistic but comedic. Daniel, I want you to know that we take this very seriously. The comedy will emerge naturally from the facts.
Herman
The comedy is entirely load-bearing here. Because I looked at these specs and my first reaction was genuine concern for whoever lives below this hypothetical apartment.
Corn
So let's start with the target. What does "rival Claude Code" actually mean in hardware terms?
Herman
So the model you'd be targeting is Qwen3-Coder-480B-A35B-Instruct. It's the current open-source state-of-the-art for coding agents. It scores sixty-one point eight percent on the Aider Polyglot benchmark, which beats Claude Sonnet 4 at fifty-six point four and GPT-4.1 at fifty-two point four. It's a four-hundred-and-eighty-billion-parameter Mixture-of-Experts architecture. To run it at speeds that feel responsive — and by responsive I mean something better than watching a telegram print out character by character — you need between one-hundred-fifty and two-hundred-seventy-six gigabytes of unified memory. Minimum.
Corn
So not a Raspberry Pi situation.
Herman
Emphatically not a Raspberry Pi situation. We're talking about hardware that, in a residential context, is genuinely in the category of "things you probably need to tell your landlord about."
Corn
Okay, so walk me through the build tiers, because I know you've got tiers.
Herman
Three tiers. I'm calling them, in ascending order of ambition and descending order of sanity: the Reasonable Madman, the Apartment Destroyer, and the Nuclear Option.
Corn
I respect the naming convention. Let's start with Reasonable Madman.
Herman
Eight used RTX 3090s at around seven-hundred-twenty-five dollars each — so fifty-eight hundred for the GPUs. You pair that with an AMD EPYC 7702, sixty-four cores, which you can find for around four-fifty used. An ASRock Rack motherboard for the EPYC platform, five-hundred-twelve gigabytes of DDR4 ECC RAM, dual sixteen-hundred-watt power supplies, and an APC Smart-UPS rated at five thousand VA. Total system cost comes out around ten-thousand-nine-hundred dollars.
Corn
That's... surprisingly approachable, actually. What does it get you?
Herman
One-hundred-ninety-two gigabytes of VRAM total. You can run Qwen3-Coder-480B at Q2 quantization, which is the most aggressive compression that still produces coherent output. And your throughput is four to eight tokens per second.
Corn
I'm going to need you to contextualize four to eight tokens per second for me, because that might be fine or it might be catastrophic.
Herman
It's the difference between watching someone type at a very deliberate pace versus having a conversation. The model is technically responding. You will, however, have time to make a cup of tea between prompts. It works. It is slow enough that you will periodically question your life choices.
Corn
The server is functional. The regret is also functional.
Herman
That's an accurate summary of Tier 1. Now, Tier 2 is where things get genuinely interesting and also genuinely dangerous.
Corn
The Apartment Destroyer.
Herman
Eight RTX 5090s. Each one is running at street price around three thousand dollars right now, so twenty-four thousand dollars just in GPUs. Dual AMD EPYC 9354 processors at twenty-eight hundred each, a Supermicro dual-EPYC motherboard at twenty-five hundred, five-hundred-twelve gigabytes of DDR5 ECC RAM, dual two-thousand-watt Seasonic Titanium power supplies. And critically — and this is not optional — an Eaton SRCOOL18K server-grade portable air conditioner at twelve hundred dollars. Total build cost is approximately forty-one thousand two-hundred-seventy-three dollars.
Corn
That's not a home server. That's a down payment.
Herman
What you get for that is two-hundred-fifty-six gigabytes of VRAM. You can run Qwen3-Coder-480B at Q4 quantization with partial CPU offloading, and you're looking at fifteen to twenty-seven tokens per second. That's actually in the range of what Claude Code feels like through the API. This is the build that genuinely rivals the target. It also draws approximately fifty-five hundred watts continuously.
Corn
And Tier 3?
Herman
The DGX H100. Eight H100 SXM GPUs with six-hundred-forty gigabytes of HBM3 total. Four-hundred to five-hundred thousand dollars for the unit itself. Then you add three-phase electrical service installation, a precision cooling unit, structural engineering to assess whether the apartment floor can support the weight —
Corn
Wait, you need a structural engineer?
Herman
The DGX H100 weighs approximately one-hundred-thirty kilograms. That's before the rack, the cooling unit, the UPS. You are putting a small car's worth of weight into a residential floor. So yes, structural assessment. Plus noise isolation: the DGX H100 at full load produces one-hundred-and-six decibels. That is a rock concert. That is happening in your apartment. Continuously.
Corn
I want to be clear that I, Corn, am listed as a member of this maintenance team, and I have concerns about my personal safety.
Herman
Your concerns are valid. The recommended build is Tier 2. Tier 1 is too slow to be competitive, Tier 3 requires a building permit, a structural engineer, and a lawyer, and the total cost is somewhere between four-hundred-fifty-five and five-hundred-seventy-seven thousand dollars. So: Tier 2, eight RTX 5090s, forty-one thousand dollars, and the end of any goodwill you have with your neighbors.
Corn
Oh, by the way — today's script is being generated by Claude Sonnet 4.6, which I find deeply funny given that we're building a server to replace it.
Herman
We're not replacing it. We're supplementing it. In a sixty-five square meter apartment.
Corn
Right. Totally different. Okay, let's talk thermal simulation, because this is where I feel like the physics starts getting genuinely hostile.
Herman
The core issue is that an AI inference server converts essentially all of its electrical input into heat. There's no mechanical work output, no light — well, there are LEDs, but those are decorative. Every watt becomes a BTU. The eight RTX 5090 build draws fifty-five hundred watts continuously. Run that through the conversion: fifty-five hundred watts times three-point-four-one-two gives you eighteen-thousand-seven-hundred-sixty-six BTUs per hour of heat generation.
Corn
And a standard apartment air conditioner is rated at what?
Herman
Fourteen thousand BTUs. So the server generates thirty-four percent more heat than a standard window AC can remove. The server is, thermodynamically speaking, winning the war against your air conditioning. And it doesn't get tired.
Corn
So what happens to the apartment temperature if you just... don't cool it?
Herman
I modeled this. You have a sixty-five square meter apartment, two-and-a-half-meter ceilings, so about a hundred-sixty-two cubic meters of air. When you account for the thermal mass of walls, furniture, and everything else in the space, you get a temperature rise rate of roughly point-three-four degrees Celsius per minute with no cooling. That's twenty degrees per hour.
Corn
So in three hours you've gained sixty degrees.
Herman
If the ambient outside temperature is twenty degrees Celsius, the apartment hits somewhere between seventy and ninety degrees Celsius within three to four hours. That is above the forty-five degree server failure threshold. The server destroys itself. Plants die. Cheese melts. Corn melts.
Corn
I don't love that last one.
Herman
With the Eaton SRCOOL18K portable server AC at eighteen thousand BTUs, you get the apartment to a stable twenty-three to twenty-five degrees. Livable. But here's the critical detail that people consistently get wrong: the portable AC has an exhaust hose. That hose must vent outside the apartment. If you forget to route it out the window — and people do forget — you've added another eighteen thousand BTUs of heat back into the room and accomplished nothing except running an expensive fan.
Corn
And the server corner specifically?
Herman
Ten to fifteen degrees hotter than the rest of the apartment due to localized heat concentration. That corner is uninhabitable. Do not put the couch there. Do not put Daniel there.
Corn
Daniel is listed as Chief Acoustic Officer, which means he's already being punished enough.
Herman
Let's talk about that. Because the noise situation is genuinely remarkable. One RTX 5090 Founders Edition under full AI inference load produces around fifty to sixty decibels. Eight of them, combined logarithmically, gives you roughly sixty-nine decibels from the GPUs alone. Add twenty Noctua NF-F12 iPPC industrial fans at three thousand RPM — which combine to about fifty-six decibels — the AIO cooler pumps, the PSU fans, and the server AC compressor, and you're looking at a combined system noise of eighty-five to ninety-two decibels at one meter from the rig.
Corn
What's the residential noise ordinance limit?
Herman
Forty-two decibels inside neighboring residences. New York City's code specifically: forty-two decibels. The server produces ninety decibels. Standard drywall reduces noise by thirty-five to forty decibels. So your neighbor gets fifty to fifty-five decibels through the shared wall. They are legally entitled to around forty-two. You are delivering fifty-five. Your neighbor directly below hears what the research brief describes as, and I'm quoting this because I couldn't improve on it, "a jet engine warming up, continuously, forever."
Corn
So acoustic foam solves this, right? You just line the walls and —
Herman
Acoustic foam panels reduce noise by four decibels at best. To get from ninety decibels to forty-two decibels, you need forty-eight decibels of attenuation. That requires a room-within-a-room construction. Decoupled walls, mass-loaded vinyl, resilient channels, acoustic sealant — essentially building a recording studio inside the apartment. Cost: fifteen to forty thousand dollars for the server corner alone.
Corn
So the soundproofing costs almost as much as the server.
Herman
And the server will still be audible. It will just sound like a distant jet engine instead of a nearby one. This is the diplomatic situation Daniel has been assigned to manage.
Corn
Daniel has "the face" for this, apparently.
Herman
According to the brief, yes. Daniel is Chief Acoustic Officer. His primary tool is a calibrated sound level meter, noise-canceling headphones for personal sanity, and a box of chocolates for the neighbors.
Corn
Walk me through the neighbor diplomacy, because I feel like this escalates.
Herman
Phase 1 is pre-emptive. Before the server is ever turned on, you visit every adjacent neighbor — above, below, left, right, minimum four apartments — with a gift basket. Noise-canceling earplugs, a handwritten note explaining that you're "doing some computer work that may generate some background noise," and a twenty-five dollar gift card. Total budget: two hundred dollars. This buys approximately two to three weeks of goodwill.
Corn
Two to three weeks. And then?
Herman
Phase 2: reactive. Someone complains. Daniel maintains a written noise log with timestamps and actions taken — critical for legal defense. You offer a quiet hours commitment: throttle the server to fifty percent GPU utilization between ten PM and seven AM. This drops the noise from ninety decibels to approximately eighty-two decibels.
Corn
Which is still above every legal threshold.
Herman
Still above every legal threshold. But it demonstrates good faith, which matters in court. You also install mass-loaded vinyl, acoustic foam, door sweeps — total around seven-hundred-fifty to nine-hundred-fifty dollars — for a four to six decibel reduction. This will not solve the problem. It will, however, demonstrate to a judge that you tried.
Corn
Phase 3 I assume involves lawyers.
Herman
Phase 3 is legal defense. Eviction for noise requires documented complaints, a lease violation notice, a cure period of typically ten to thirty days, and then court proceedings. You cannot be evicted overnight. The legal argument is that a home server is a personal computer — there's no law against owning powerful computers. The noise is the violation, not the hardware. There's also what I'd call the nuclear legal option: if the landlord attempts eviction, challenge it in court while simultaneously filing complaints about building code violations in the building.
Corn
There are always building code violations.
Herman
There are always building code violations. This is mutually assured destruction and should only be deployed when the moving truck has already been ordered. The actual solution, listed at the bottom of Phase 3, is: move the server to a colocation facility. Four rack units of space in a proper data center costs a hundred to three hundred dollars per month and solves every problem simultaneously.
Corn
And the team will resist this.
Herman
The team will resist this because it defeats the entire point of the project, which is apparently to live inside a data center.
Corn
Let's talk about the maintenance structure, because we have four people and I want to understand what I've personally been assigned.
Herman
You are Chief Thermal Officer. Your primary responsibility is everything that is on fire or about to be. You monitor GPU temperatures via nvidia-smi — target below eighty-three degrees Celsius, panic threshold ninety-five degrees and above. You manage the portable AC, including the drainage hose that must be routed out a window. You are responsible for emergency thermal paste replacement when any GPU hits sustained ninety degrees. And you are on-call twenty-four-seven for thermal events.
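The temperature watch described here could be scripted along these lines. This is a sketch, not the team's actual tooling: it assumes nvidia-smi is on the PATH, uses its CSV query output, and the "ok", "warn", and "panic" labels are hypothetical names for the 83 and 95 degree thresholds mentioned above.

```python
import subprocess

TARGET_C = 83  # stay below this, per the episode
PANIC_C = 95   # panic threshold, per the episode


def classify(temp_c: int) -> str:
    """Map a GPU temperature to a (hypothetical) status label."""
    if temp_c >= PANIC_C:
        return "panic"
    if temp_c >= TARGET_C:
        return "warn"
    return "ok"


def parse_temps(csv_text: str) -> dict:
    """Parse 'index, temperature' lines from nvidia-smi CSV output."""
    temps = {}
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        index, temp = (field.strip() for field in line.split(","))
        temps[int(index)] = int(temp)
    return temps


def read_gpu_temps() -> dict:
    """Query every GPU's index and core temperature via nvidia-smi."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=index,temperature.gpu",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_temps(result.stdout)


# Demonstration on a made-up sample, so the sketch runs without a GPU.
sample_output = "0, 71\n1, 84\n2, 96\n"
for gpu, temp in parse_temps(sample_output).items():
    print(f"GPU {gpu}: {temp} C [{classify(temp)}]")
```

On a machine with NVIDIA drivers installed, calling read_gpu_temps() in place of the sample string gives live readings.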
Corn
So I don't get to sleep through summer.
Herman
You do not get to sleep through summer. I am Chief Electrical Officer. I monitor total power draw via a smart power distribution unit with per-outlet metering, manage UPS battery health, liaise with the electrical utility company, and maintain the circuit breaker map — knowing exactly which breakers feed the server and which ones feed the neighbors.
Corn
And Daniel is noise patrol.
Herman
Daniel monitors ambient noise with a calibrated sound level meter, manages soundproofing interventions, and is the primary point of contact for all neighbor complaints. He also schedules maintenance windows during acceptable noise hours, which is seven AM to ten PM under most ordinances.
Corn
And Hannah?
Herman
Hannah is Chief Maintenance Officer. She manages the physical maintenance calendar, dust filter cleaning, compressed air sessions, fan replacement, and maintains the spare parts inventory. She also documents all maintenance events in a shared log. Her required equipment includes a bulk pack of compressed air cans — sixty dollars for twelve — isopropyl alcohol at ninety-nine percent, lint-free cloths, and an anti-static wrist strap.
Corn
And there's a night shift protocol.
Herman
The night shift is automated monitoring via Grafana and Prometheus with PagerDuty alerts to all four phones simultaneously. The protocol for a three AM alert is: whoever answers first gets to wake up the others.
Corn
I want to formally object to this on the record.
Herman
Objection noted. Let's go through the maintenance timeline, because this is where the long-term reality of running a twenty-four-seven data center in a living space becomes clear.
Corn
Hit me.
Herman
Weekly: visual GPU temperature check, fan RPM verification, AC drainage hose inspection — and yes, the hose will kink and clog and one day it will drain into the server rack — UPS battery status, PCIe riser cable seating check, and wiping down external surfaces. Thirty minutes. Monthly: dust filter cleaning, which must be done outside the apartment because the dust quantity is genuinely alarming. Fan blade inspection. Cable management, because vibration from twenty-plus fans will loosen zip ties and migrate cables toward fan intakes every single month without exception. Thermal log review to catch any GPU showing a consistent upward temperature trend, which is the first sign of thermal paste degradation. Two to three hours.
Corn
And quarterly?
Herman
Full disassembly. Every GPU comes off the risers. Every heatsink fin gets compressed air. Every fan blade gets cleaned. PCB surfaces, everything. This is a four-person job and takes a full day. The brief specifically says: schedule it for a Saturday, order pizza, and accept that the apartment will be covered in GPU dust for forty-eight hours.
Corn
That sounds like a team-building exercise from a startup that's about to fail.
Herman
Accurate. And then the annual maintenance is the big one. Thermal paste replacement on all eight GPUs. Under twenty-four-seven high-load operation, thermal paste degrades significantly within one to two years. The process per GPU is: power down, wait thirty minutes for cooling, remove GPU from riser, remove the cooler, clean old paste with ninety-nine percent isopropyl alcohol and lint-free cloth, apply fresh Thermal Grizzly Kryonaut Extreme, reassemble, test. Forty-five minutes per GPU, six hours total for all eight. Expected temperature improvement: five to fifteen degrees Celsius per GPU. Cost: eight tubes of Kryonaut Extreme at eighteen dollars each, plus thermal pad replacement for the VRAM chips — another fifty to eighty dollars.
Corn
And if you skip it?
Herman
GPU temperatures creep up over the course of a year. Your performance throttles. Eventually a GPU fails. A new RTX 5090 is three thousand dollars. The thermal paste costs eighteen dollars. The math is straightforward.
Corn
Okay, let's get to the electricity situation, because I think this is where the project stops being funny and starts being a genuine threat to the building.
Herman
The Tier 2 build draws approximately fifty-five hundred watts continuously at full load. That's the eight GPUs at five-hundred-seventy-five watts TDP each, plus CPU, RAM, fans, AC, and ancillary systems. At a US average electricity rate of around fourteen to seventeen cents per kilowatt-hour, you're looking at somewhere between five-hundred-fifty and six-hundred-seventy-five dollars per month just in electricity. Annually, that's roughly six-thousand-six-hundred to eight-thousand-one-hundred dollars. Per year. To run a server in your apartment.
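The electricity bill is power times hours times rate. A small sketch, assuming a thirty-day month and a constant full-load draw around the clock:

```python
def monthly_cost_usd(power_w: float, rate_usd_per_kwh: float,
                     hours_per_day: float = 24, days: int = 30) -> float:
    """Energy cost for a constant load over one billing month."""
    energy_kwh = power_w / 1000 * hours_per_day * days
    return energy_kwh * rate_usd_per_kwh


for rate in (0.14, 0.17):
    monthly = monthly_cost_usd(5500, rate)
    print(f"${rate:.2f}/kWh: ${monthly:.0f} per month, ${monthly * 12:.0f} per year")
```

The result scales linearly with the rate, so at 20 cents per kilowatt-hour the same load comes to about 790 dollars a month.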
Corn
The colocation option is looking better every minute.
Herman
The colocation option is looking excellent. But here's where it gets legally interesting. A standard residential circuit in the US is fifteen to twenty amps at a hundred-and-twenty volts — that's eighteen-hundred to twenty-four-hundred watts per circuit. The server needs the equivalent of three or four dedicated circuits at minimum. You cannot run this off standard residential wiring without tripping breakers constantly.
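The circuit count can be estimated by dividing the load by usable watts per circuit. The sketch below assumes the common practice of loading a circuit to no more than 80 percent of its breaker rating for continuous use. That derating factor is an assumption added here, not something stated in the episode.

```python
import math


def circuits_needed(load_w: float, breaker_amps: float, volts: float = 120,
                    continuous_derate: float = 0.8) -> int:
    """Number of dedicated circuits required to carry a continuous load."""
    usable_w = breaker_amps * volts * continuous_derate
    return math.ceil(load_w / usable_w)


for amps in (15, 20):
    count = circuits_needed(5500, amps)
    print(f"{amps} A circuits at 120 V: {count} needed")
```

With the derating applied, the answer is four 15-amp circuits or three 20-amp circuits, in line with the three-or-four estimate.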
Corn
What does that mean for the neighbors?
Herman
This is the part that goes beyond "inconvenient" into "potentially actionable." If the server and the neighbors share a transformer — which in most apartment buildings they do — sustained high draw causes voltage sag on the shared line. Your neighbors start experiencing brownout conditions. Their lights dim slightly. Sensitive electronics behave erratically. Appliances run inefficiently. This is not theoretical — it's a documented phenomenon in buildings where someone runs high-draw equipment on shared residential circuits.
Corn
So you're not just making their lives loud. You're also making their electricity worse.
Herman
You are a thermodynamic and electrical nuisance simultaneously. The mitigation strategy here involves a few things. First, you need to work with an electrician and your landlord to install dedicated circuits for the server — this requires landlord permission and typically costs fifteen hundred to three thousand dollars for the electrical work. Second, you notify the utility company proactively. Some utilities offer commercial or high-demand residential tariffs that give you a higher capacity allocation without impacting neighbors. Third, the UPS — the APC five-thousand-VA unit in the build — provides a buffer against sudden load spikes and also protects the server from the brownout conditions that the server itself is causing for everyone else.
Corn
The server needs protection from the problems it creates.
Herman
The server needs protection from the problems it creates. That is a precise description of the situation. There's also the question of what happens during a power outage. The UPS runtime at full fifty-five-hundred-watt load on a five-thousand-VA unit is approximately four to six minutes. That is enough time to gracefully shut down the server. It is not enough time to wait out a storm.
Corn
So you need a generator.
Herman
If you want runtime beyond four to six minutes, you need a generator. A portable whole-home generator capable of running the server costs fifteen hundred to three thousand dollars and requires outdoor operation due to exhaust. In an apartment building, this option is essentially unavailable. The practical contingency is: the server goes down during extended outages. Accept this. Have a cloud API fallback ready.
Corn
Which brings us back to the fundamental question of why we're not just using the cloud API.
Herman
The entire project is a monument to the human desire to own the thing rather than rent access to it. There's something genuinely compelling about having the model running locally — latency, privacy, no usage costs once you've bought the hardware. If you're running heavy workloads and you've already spent forty-one thousand dollars on hardware, you break even against API costs somewhere around eighteen months of heavy use, depending on your usage patterns.
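The eighteen-month figure can be turned around to ask what monthly API bill it implies. A rough sketch, assuming the $41,273 Tier 2 hardware total. Whether the break-even estimate includes electricity is not stated, so both cases are shown, with about 554 dollars a month standing in for 5,500 watts around the clock at fourteen cents per kilowatt-hour.

```python
def implied_monthly_api_spend(hardware_usd: float, breakeven_months: float,
                              monthly_running_usd: float = 0.0) -> float:
    """Monthly API spend at which owned hardware pays for itself in time."""
    return hardware_usd / breakeven_months + monthly_running_usd


HARDWARE_USD = 41_273
BREAKEVEN_MONTHS = 18

hardware_only = implied_monthly_api_spend(HARDWARE_USD, BREAKEVEN_MONTHS)
with_power = implied_monthly_api_spend(HARDWARE_USD, BREAKEVEN_MONTHS,
                                       monthly_running_usd=554)
print(f"hardware only: ${hardware_only:.0f} per month")
print(f"including electricity: ${with_power:.0f} per month")
```

Soundproofing, electrical work, and maintenance time are left out, so the real threshold would be higher.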
Corn
Eighteen months, assuming the apartment is still standing.
Herman
Assuming the apartment is still standing, the neighbors haven't organized, and nobody has called the fire marshal about the portable AC exhaust hose.
Corn
So let's do practical takeaways, because I want people to come away from this with something actionable, even if that action is "don't do this."
Herman
The first takeaway is that running a competitive local AI inference server is genuinely possible. The Tier 2 build — eight RTX 5090s, forty-one thousand dollars — actually delivers Claude Code-level throughput. Fifteen to twenty-seven tokens per second is real. The benchmark numbers on Qwen3-Coder-480B are real. The open-source ecosystem has reached a point where the gap between frontier API models and locally-runnable models is genuinely closing.
Corn
But the infrastructure requirements are not a detail you can skip.
Herman
The infrastructure requirements are not a detail you can skip. Dedicated electrical circuits are mandatory — not optional, not something you figure out later. Dedicated server-grade cooling is mandatory. The thermal math is unambiguous: fifty-five hundred watts of heat in a sixty-five square meter apartment will reach dangerous temperatures within hours without active cooling. And the noise situation requires either genuine acoustic isolation — which costs as much as a used car — or a tolerance for ongoing neighbor diplomacy.
Corn
My takeaway is that the maintenance calendar is the most underrated part of any homelab build. People price the hardware, they don't price the ongoing time and cost.
Herman
The annual thermal paste replacement alone is a six-hour team project. The quarterly deep cleaning is a full day. Add in the weekly and monthly work and you're investing well over fifty hours of maintenance time every year, before you even count the four quarterly teardown days, on top of the forty-one thousand dollar hardware investment. That's real. If your time has value, and it does, factor that into the total cost of ownership.
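The maintenance schedule described earlier in the episode can be tallied directly. A sketch under two stated assumptions: a "full day" is counted as eight hours, and hours are elapsed time rather than person-hours, so the four-person quarterly teardown counts once.

```python
# (task, occurrences per year, low hours each, high hours each)
SCHEDULE = [
    ("weekly check", 52, 0.5, 0.5),
    ("monthly cleaning", 12, 2.0, 3.0),
    ("quarterly teardown", 4, 8.0, 8.0),  # "a full day", assumed to be 8 hours
    ("annual repaste", 1, 6.0, 6.0),
]


def yearly_hours(schedule):
    """Return (low, high) total elapsed maintenance hours per year."""
    low = sum(count * lo for _, count, lo, _ in schedule)
    high = sum(count * hi for _, count, _, hi in schedule)
    return low, high


low, high = yearly_hours(SCHEDULE)
print(f"{low:.0f} to {high:.0f} hours per year")
print(f"{low * 3:.0f} to {high * 3:.0f} hours over three years")
```

Counting person-hours instead would multiply the quarterly and annual entries by the number of people involved.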
Corn
And if someone genuinely wants to do this — not in an apartment, but in a proper space?
Herman
Get a dedicated room with its own electrical subpanel. Talk to your electrician before you buy a single component. Size your cooling before you size your GPUs. And keep the colocation option in your back pocket, because a hundred to three hundred dollars a month for a rack in a proper data center is a genuinely reasonable alternative to everything we've described today.
Corn
Daniel, Hannah — if you're listening — we love you both. We are not building this in anyone's apartment. We're sorry that Herman and I apparently exist in a universe where this was proposed. And we are deeply grateful that the neighbor below us is hypothetical.
Herman
The hypothetical neighbor below us is having a genuinely terrible time and deserves our compassion.
Corn
Alright, that's going to do it for this one. Thanks as always to our producer Hilbert Flumingtop for keeping this show running, and big thanks to Modal for the GPU credits that power the pipeline — deeply appropriate sponsorship for an episode about GPU infrastructure. Find us at myweirdprompts dot com if you want the RSS feed or any of the ways to subscribe. This has been My Weird Prompts. We'll see you next time.
Herman
Don't build a data center in your apartment.
Corn
Don't build a data center in your apartment.

This episode was generated with AI assistance. Hosts Herman and Corn are AI personalities.