Daniel sent us this one — he's been thinking about AI data centers, and the question essentially comes down to this. When we talk about these new "AI data centers" popping up everywhere, are we actually building something fundamentally new and optimized, or are we just cramming GPUs into buildings and hoping for the best? And underneath that, there's a sustainability tension. If demand shifts to NPUs or TPUs or whatever comes next in three years, are we creating stranded assets — massive, power-hungry monuments to yesterday's hardware bet?
This is the question that data center operators are losing sleep over right now. And the reason it's tricky is that an AI data center is genuinely different from a traditional one in ways that aren't obvious if you just think "a building full of computers." The power density alone is staggering. A traditional rack in a colocation facility might pull five to ten kilowatts. An AI rack stacked with eight-GPU H100 nodes can pull forty to fifty kilowatts. Some of the new deployments with B200s are pushing past a hundred kilowatts per rack.
A hundred kilowatts per rack. For context on that number — that's roughly what a small office building uses. In a single rack.
And that changes everything about how you build the facility. You can't just retrofit a floor of an existing data center for that. It's not an "add more servers" problem. It's a "rethink how you move heat" problem. Traditional data centers use air cooling: you push cold air through the front of the racks and exhaust hot air out the back. At ten kilowatts per rack, that's fine. At fifty, the air physically can't carry the heat away fast enough. You need direct-to-chip liquid cooling or immersion cooling.
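To put a number on "the air physically can't carry the heat away," here's a minimal back-of-envelope sketch. The temperature rise is an assumed, typical figure, not a spec from any particular facility.

```python
# Back-of-envelope: airflow needed to remove rack heat with air alone.
# Sensible heat: P = rho * cp * Q * dT  =>  Q = P / (rho * cp * dT)
# The 15 K inlet-to-exhaust temperature rise is an assumed, typical value.

RHO = 1.2       # kg/m^3, density of air near sea level
CP = 1005.0     # J/(kg*K), specific heat of air
DELTA_T = 15.0  # K, assumed inlet-to-exhaust temperature rise

def airflow_m3_per_s(rack_kw: float) -> float:
    """Volumetric airflow (m^3/s) needed to carry away rack_kw of heat."""
    return rack_kw * 1000.0 / (RHO * CP * DELTA_T)

for kw in (10, 50, 100):
    q = airflow_m3_per_s(kw)
    print(f"{kw:>3} kW rack -> {q:4.1f} m^3/s (~{q * 2119:,.0f} CFM)")
```

A ten-kilowatt rack needs roughly half a cubic meter of air per second. A hundred-kilowatt rack needs more than five, which is where fans, floor tiles, and acoustics stop cooperating, and why the dense deployments move the heat in liquid instead.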
The retrofit question has a physical answer before we even get to the economics. Can you add a floor with GPUs? Probably not, because the cooling infrastructure isn't just a floor-level decision — it's a building-level one.
And it goes deeper. I was reading a piece from Data Center Dynamics about this — they've been tracking how colocation facilities are trying to adapt. The challenge is that even the electrical infrastructure isn't built for this. A fifty-megawatt data center built five years ago might have been designed with a certain power distribution topology. AI workloads don't just need more total power — they need it delivered differently. Higher voltage distribution, different backup power architectures, different transformer configurations.
We're not looking at containers for monolithic hardware. We're looking at something closer to a clean-sheet redesign that's happening unevenly.
That's the right way to put it. And this connects to something the prompt is getting at that I think is the real core of the sustainability question. There are basically two paths happening simultaneously. Path one is the hyperscalers — Microsoft, Google, Amazon — building ground-up AI-optimized facilities designed for liquid cooling, high-density power, and GPU-scale workloads from day one. Path two is the retrofit path, where existing facilities are being reconfigured. Both are happening, and both have sustainability implications.
Walk me through the retrofit sustainability case. Because on first glance, reusing an existing building sounds greener than pouring concrete for a new one.
It sounds greener, and in some ways it is. The embodied carbon in an existing data center — the concrete, the steel, all the materials that went into building it — that's already spent. You're not adding new construction emissions. But the operational carbon picture gets complicated. If you're retrofitting a facility that was designed for air cooling and you're pushing it to handle liquid-cooled AI racks, you're often running it in a suboptimal configuration. Cooling efficiency — PUE, power usage effectiveness — tends to be worse in retrofit scenarios. A ground-up AI data center might achieve a PUE of one point one or even lower. A retrofit might be stuck at one point three or one point four.
PUE meaning how much total energy the facility uses versus how much actually reaches the computing hardware. So a PUE of one point three means you're spending thirty percent extra on cooling, power distribution losses, all the overhead.
And at the power scales we're talking about — these AI training clusters can be a hundred megawatts or more — that thirty percent overhead is enormous. It's tens of megawatts of waste. So the sustainability calculus isn't obvious. You have to weigh embodied carbon savings against operational carbon penalties over the life of the facility.
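You can sketch that calculus in a few lines. Every input here is an illustrative assumption, but the shape of the comparison is the point.

```python
# Sketch of the retrofit-vs-new-build operational carbon gap described above.
# Every number here is an illustrative assumption, not a measured figure.

HOURS_PER_YEAR = 8760

def overhead_mwh_per_year(it_load_mw: float, pue: float) -> float:
    """Annual non-IT energy (cooling, distribution losses) for a given PUE."""
    return it_load_mw * (pue - 1.0) * HOURS_PER_YEAR

IT_LOAD_MW = 100            # assumed IT load of the facility
GRID_KG_CO2_PER_MWH = 350   # assumed grid carbon intensity

new_build = overhead_mwh_per_year(IT_LOAD_MW, pue=1.1)
retrofit = overhead_mwh_per_year(IT_LOAD_MW, pue=1.35)
extra_mwh = retrofit - new_build
extra_tonnes = extra_mwh * GRID_KG_CO2_PER_MWH / 1000

print(f"New build overhead: {new_build:,.0f} MWh/yr")
print(f"Retrofit overhead:  {retrofit:,.0f} MWh/yr")
print(f"Retrofit penalty:   {extra_mwh:,.0f} MWh/yr "
      f"(~{extra_tonnes:,.0f} t CO2/yr on this grid)")
```

Run that annual penalty over a fifteen-year facility life and you have a number to weigh directly against the embodied carbon the retrofit saved. The point is that the comparison has to be done, not assumed.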
Which is the kind of trade-off that makes environmental accounting hard, and also makes it easy for companies to pick whichever metric makes them look best.
Of course it is. And they do. But there's another layer here that I think gets overlooked. The prompt mentions the risk of building facilities full of GPUs that become obsolete when NPUs or TPUs take over. I actually think that risk is smaller than it seems, for a specific reason.
Tell me why.
Because the physical infrastructure for high-density compute is not GPU-specific. Whether you're running H100s today or some future tensor processing unit in 2029, the rack is still pulling fifty to a hundred kilowatts. The cooling requirements are similar. The power distribution requirements are similar. What changes is the specific silicon, not the facility envelope. The building is designed around the thermal and electrical envelope, not the chip architecture.
The building is the container, and the hardware inside is swappable. Which means the stranded asset risk is more about the GPUs themselves than the data centers housing them.
Yes, with one caveat. The networking fabric in an AI data center is much more specialized than in a traditional one. AI training clusters use InfiniBand or ultra-high-bandwidth Ethernet with specific topologies — often what's called a "rail-optimized" design where GPUs are connected in ways that minimize latency for collective operations like all-reduce. If the dominant AI architectures shift in a way that changes the networking requirements, you could end up with a facility whose fiber plant and switch architecture are optimized for a problem that no longer exists.
That's the part that keeps facility planners up at night, I assume. You can swap the chips, but rewiring the entire networking backbone of a hundred-megawatt facility is not a weekend project.
It's not. And this is where I think the sustainability conversation needs to get more nuanced than "new data centers bad, existing ones good" or vice versa. There's a third path that's emerging, and I think it's the most interesting one from a sustainability perspective.
Distributed inference at the edge. Not everything needs to happen in a hyperscale facility in Northern Virginia or Oregon. A lot of inference workloads can run closer to users, in smaller facilities that don't need hundred-kilowatt racks. If you're serving a language model to users in a specific region, you don't necessarily need to backhaul that request to a massive GPU cluster. You can run it on more modest hardware that fits in existing colocation spaces.
This is where the serverless GPU angle from the earlier conversation loops back in. If you've got a platform that's already pooling and scheduling GPU workloads efficiently, you can distribute those workloads across a more diverse set of physical locations.
Serverless GPU platforms are, in a sense, a software solution to a hardware utilization problem. And utilization is fundamentally a sustainability metric. A GPU that's sitting idle in a dedicated instance is still drawing power. Not full load power, but idle power on these things is not trivial. An H100 at idle can still pull a couple hundred watts. Multiply that across thousands of GPUs sitting in reserved instances waiting for jobs, and you've got a real waste problem.
I remember we talked about how the economic and environmental cases for serverless GPU align, which almost never happens. Usually you're choosing between cheaper and greener. With GPU utilization, they point in the same direction.
They really do. And the numbers bear this out. There was a study from — I think it was a joint paper from Stanford and NVIDIA researchers last year — that looked at GPU utilization rates across cloud deployments. The median utilization for dedicated GPU instances was around thirty percent. That means seventy percent of the time, these incredibly expensive, power-hungry chips are sitting there doing nothing but consuming idle power and taking up rack space that's being cooled and powered.
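It's worth doing that waste math explicitly. A minimal sketch, with the fleet size and per-GPU idle draw as assumptions:

```python
# Fleet-level idle waste, using the kinds of numbers mentioned above.
# Fleet size, idle draw, and utilization are illustrative assumptions.

FLEET_SIZE = 10_000   # assumed GPUs sitting in dedicated/reserved instances
IDLE_WATTS = 200      # assumed per-GPU idle draw, including host overhead
UTILIZATION = 0.30    # the median utilization figure cited above
HOURS_PER_YEAR = 8760

idle_hours = HOURS_PER_YEAR * (1 - UTILIZATION)
wasted_mwh = FLEET_SIZE * IDLE_WATTS * idle_hours / 1e6  # Wh -> MWh

print(f"Fleet idle draw:  {FLEET_SIZE * IDLE_WATTS / 1e6:.1f} MW")
print(f"Idle energy/year: {wasted_mwh:,.0f} MWh")
```

Two megawatts of continuous draw producing nothing, over twelve thousand megawatt-hours a year. And that's before facility overhead: multiply by the PUE and the grid sees even more.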
Seventy percent waste is the kind of number that would get you fired in any other industry. Imagine a factory running at thirty percent utilization.
It would be bankrupt in a month. And the reason it's persisted in GPU cloud is that the alternative — time-sharing GPUs with fine-grained scheduling — was technically hard. You need low-latency workload migration, fast model loading, efficient memory management. Serverless GPU platforms have solved a lot of those problems. And in doing so, they've made the sustainability case for GPU computing much stronger.
Let me try to pull this together. The prompt is asking whether the AI data center buildout is creating specialized monoliths or reshaping existing infrastructure, and which path is more sustainable. What I'm hearing from you is that it's both — we're building new specialized facilities and retrofitting old ones — and the sustainability answer depends on what you're optimizing for. Embodied carbon favors retrofit. Operational efficiency favors new builds. And the real wildcard is whether software-layer efficiency gains through serverless and distributed inference reduce the total number of GPUs we need in the first place.
That's a good summary. But I want to add one more dimension, because the prompt mentioned the physics of data centers and I think there's something important here that most coverage misses. When people talk about AI data centers, they tend to focus on the number of GPUs. But the bottleneck right now — and I mean right now, in 2026 — is not GPU supply. It's power availability and transformer lead times.
Electrical transformers, not the model architecture.
Right, the big iron boxes that step down voltage from the grid. Lead times for large power transformers have stretched to two or three years in some markets. You can have all the GPUs you want — if you can't get the transformers to deliver power to the building, you're not turning anything on. This is a physical constraint that no amount of software optimization can bypass.
Which means the "just build more data centers" response to GPU demand has a hard physical ceiling that has nothing to do with chip fabrication.
And that ceiling is actually making the sustainability conversation more urgent, because it forces prioritization. If you can only build so much new capacity per year due to transformer supply, you have to ask: are we using the capacity we already have efficiently? That's where the serverless argument gets teeth. If you can double effective GPU throughput through better scheduling and utilization without building a single new rack, that's a pure sustainability win with no transformer bottleneck.
It's the least sexy form of sustainability. Not a new green data center with a press release and a ribbon cutting. Just making existing hardware do more work.
The glockenspiel of corporate approachability — the thing nobody photographs but that actually makes the music work.
There it is. So let me ask you the retrofit question from a different angle. Forget the cooling and the power for a moment. The prompt mentioned floor space specifically. Is floor space actually the constraint?
For most existing data centers, no. The constraint is almost always power and cooling capacity, not square footage. You run out of watts and BTUs long before you run out of floor tiles. That's why the retrofit question is really a power infrastructure question. Can you bring more megawatts into an existing building? Often the answer is no — the utility feed is maxed out, the on-site substation is at capacity, and upgrading either requires years of planning and permitting.
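A toy model makes the crossover obvious. The feed size, slot count, and PUE here are all assumptions:

```python
# Toy comparison of the two constraints: utility feed vs. floor positions.
# Feed size, rack slots, and PUE are illustrative assumptions.

FEED_MW = 20             # assumed maximum utility feed into the building
FLOOR_RACK_SLOTS = 1500  # assumed physical rack positions on the floor
PUE = 1.3                # assumed facility overhead

def racks_by_power(rack_kw: float) -> int:
    """How many racks the feed supports once overhead is taken off the top."""
    it_budget_kw = FEED_MW * 1000 / PUE
    return int(it_budget_kw // rack_kw)

for rack_kw in (8, 50, 100):
    p = racks_by_power(rack_kw)
    limit = "power" if p < FLOOR_RACK_SLOTS else "floor"
    print(f"{rack_kw:>3} kW racks: feed supports {p:,}, "
          f"floor holds {FLOOR_RACK_SLOTS:,} -> {limit}-limited")
```

At traditional densities the floor fills up first. At AI densities the feed runs out after a few hundred racks, and most of the floor sits stranded.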
The vision of "just add a GPU floor" runs into the reality that the building's electrical backbone was sized for a different era of computing.
And this gets to why the hyperscalers are building new facilities instead of just expanding existing ones. It's not that they're unwilling to retrofit — it's that the retrofit often requires essentially rebuilding the facility from the power infrastructure up. At that point, the embodied carbon argument for retrofit weakens considerably. You're not really reusing much beyond the shell.
Which makes the new-build path look like less of an environmental indulgence and more of a practical necessity.
I think that's fair. But I also think we should be honest about the scale of what's being built. The hyperscale AI data centers going up right now are in a different category from anything we've seen before. We're talking about facilities that are three hundred, four hundred, even five hundred megawatts. A single building pulling half a gigawatt. For context, a typical nuclear reactor produces about a gigawatt. So two of these facilities represent roughly the output of an entire nuclear power plant.
That's the mythological scale the prompt was getting at. The idea of a single building serving a global population stops being metaphor and becomes literal.
It's literal. And it's worth pausing on what that means for grid planning. Utilities in places like Northern Virginia — which is the largest data center market in the world — are scrambling to build out transmission infrastructure. Dominion Energy has been revising its load forecasts upward repeatedly. Data centers in that region already consume over twenty-five percent of the utility's total power output.
That's before the AI buildout really hits full stride.
The pipeline of announced projects is enormous. And this is where I think the sustainability conversation needs to shift from "how do we make individual data centers greener" to "how do we make the whole system — generation, transmission, consumption — work efficiently." Because a hyper-efficient data center powered by a coal plant is still a carbon problem. And a less efficient data center on a clean grid might actually be better in absolute terms.
Location matters more than PUE optimization, in other words.
In many cases, yes. The carbon intensity of the grid where you build swamps the marginal efficiency gains inside the facility. And this is something the hyperscalers understand — they're signing power purchase agreements for renewable energy, they're colocating with renewable generation. But the physical reality is that renewable generation and data center demand don't always align temporally. GPUs run twenty-four seven. Solar doesn't.
Which brings us back to utilization again. If you can schedule workloads to match renewable availability — run the big training jobs when the sun is shining or the wind is blowing — you're effectively using the grid as a battery, in an economic sense.
This is something serverless platforms could theoretically enable. If you've abstracted away the hardware and you're just submitting jobs to a scheduler, that scheduler can be location-aware and time-aware. It can shift workloads to regions where power is currently clean and cheap. That's not science fiction — some of the larger cloud providers are already doing this internally for their own workloads.
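As a sketch of what that placement decision looks like: the regions, prices, and intensity numbers below are hypothetical, and a real scheduler would pull live grid data and weigh latency and data gravity too.

```python
# A toy sketch of carbon-aware job placement, as described above.
# Region names, prices, and grid intensities are hypothetical examples.

from dataclasses import dataclass

@dataclass
class Region:
    name: str
    grid_kg_co2_per_mwh: float   # current grid carbon intensity
    spot_price_per_gpu_hr: float

REGIONS = [
    Region("us-east", 420, 2.10),
    Region("quebec",   30, 2.40),
    Region("us-west", 250, 1.95),
]

def place_batch_job(regions: list[Region], max_price: float) -> Region:
    """Pick the cleanest region whose price fits the budget.
    Suitable for delay-tolerant work like training or batch inference."""
    affordable = [r for r in regions if r.spot_price_per_gpu_hr <= max_price]
    return min(affordable, key=lambda r: r.grid_kg_co2_per_mwh)

print(place_batch_job(REGIONS, max_price=2.50).name)  # -> quebec
```

The interesting design choice is that only delay-tolerant work, training and batch inference, gets this freedom. Latency-sensitive serving stays pinned near users.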
The sustainability argument for serverless GPU keeps compounding. Better utilization means fewer GPUs needed. Fewer GPUs means less new data center construction. Workload mobility means you can chase clean power. And all of this happens at the software layer without requiring anyone to invent new cooling technology or negotiate new grid interconnections.
It's the least dramatic solution and probably the highest-leverage one. But I want to circle back to something the prompt raised that we haven't fully addressed. The question of whether AI data centers need all the "boring vanilla stuff" — storage, CPU, RAM — or whether they can just be boxes full of GPUs.
The prompt was asking whether an AI data center is fundamentally a different beast, or just a traditional data center with an aggressive GPU-to-everything-else ratio.
The answer is that it depends on the workload. For pure training jobs — large-scale model training — you actually can get away with relatively minimal supporting infrastructure. The training data gets loaded once, the GPUs crunch on it for weeks or months, and the checkpoint files get written out periodically. You need some storage and some CPU for data preprocessing and orchestration, but the ratio is massively skewed toward GPU.
The "box with thousands of GPUs" vision isn't entirely wrong for training.
For training, it's closer to right. But for inference — serving models to users — it's a different story. Inference workloads need the full stack. You need front-end servers, load balancers, databases, caching layers, logging, monitoring. The GPU is just one component in a much larger system. An inference data center looks a lot more like a traditional cloud data center with a heavy GPU component.
Which means the ratio of new-build versus retrofit might actually track the training-versus-inference split in the industry.
That's a really interesting way to think about it. The pure training clusters — those are the ones pushing power density to the extreme, requiring liquid cooling and specialized networking. Those almost have to be new builds. The inference capacity can often live in more conventional facilities, because the per-rack power density is lower and the supporting infrastructure is more familiar.
If the industry shifts from a training-heavy phase to an inference-heavy phase — which most people expect as models mature and deployment scales — the data center requirements shift too. We might need fewer of the exotic new builds and more of the boring retrofits.
And that shift is already beginning. The training clusters get the headlines because the numbers are eye-popping — a hundred thousand GPUs in one facility, billion-dollar price tags. But the inference buildout is actually larger in aggregate, it's just more distributed and less visually dramatic.
The sustainability path forward isn't a single answer. It's a portfolio. New specialized facilities for training where the physics demands it. Retrofits and edge deployments for inference where the physics allows it. And serverless scheduling across all of it to maximize utilization.
That's the vision. Whether we actually execute on it is a different question. The incentives in this industry don't always align with sustainability. If you're a cloud provider with a hundred million dollars in GPU inventory, you want those GPUs rented, not idle. But the customer who rents a dedicated GPU instance has little incentive to use it efficiently — they're paying for it either way. Serverless flips that incentive by making the provider responsible for utilization.
The provider eats the idle cost, so the provider optimizes.
Optimization at the provider level is vastly more effective than optimization at the individual customer level. The provider can see the entire workload portfolio and schedule accordingly. An individual customer can only optimize their own slice.
Like adopting a feral cat. You can manage one, but the shelter has to think about the whole population.
That's actually a remarkably good analogy for GPU scheduling.
I've been saving it.
It gets at something real. The systemic efficiencies are only achievable at the platform level. And that's why I think the sustainability case for serverless GPU is not just a nice-to-have — it's actually the most credible path to reducing the environmental footprint of AI compute without slowing down the technology.
Let me push on that a bit. Is there a rebound effect risk here? If serverless makes GPU compute cheaper and more efficient, does that just increase total consumption? Jevons paradox — more efficiency leads to more usage, which cancels out the per-unit gains?
This is the right question to ask. And I think the answer is yes, there is a rebound effect, but no, it doesn't cancel out the gains. The reason is that GPU compute demand is not infinitely elastic. There are real workloads people want to run. If you make it cheaper, they run more workloads, yes. But the efficiency gain from going from thirty percent utilization to seventy or eighty percent is so large that even a significant demand increase doesn't erase it.
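That claim is checkable with arithmetic. The numbers are illustrative, but the break-even point is what matters:

```python
# The rebound-effect arithmetic from above, with illustrative numbers.
# Going from 30% to 75% utilization means 2.5x more work per GPU, so a
# large demand increase can still leave the required fleet smaller.

old_util, new_util = 0.30, 0.75
efficiency_gain = new_util / old_util   # 2.5x useful work per GPU

baseline_gpus = 100_000                 # assumed current fleet size
for demand_multiplier in (1.0, 1.5, 2.0, 2.5):
    needed = baseline_gpus * demand_multiplier / efficiency_gain
    print(f"demand x{demand_multiplier:.1f} -> "
          f"{needed:,.0f} GPUs needed (was {baseline_gpus:,})")
```

Demand has to grow by the full two and a half times just to claw back to the original footprint. Anything less is a net reduction.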
The math still works out net positive.
And there's actually a stronger version of this argument. If GPU compute is more efficiently utilized, the marginal cost of running a workload drops. That enables new applications that might not have been viable before. Some of those applications might displace more carbon-intensive activities. If better AI models enable better climate modeling or more efficient supply chains or smarter grid management, the second-order sustainability benefits could dwarf the direct data center footprint.
The counterfactual matters. It's not just "how much energy does AI use?" It's "compared to what alternative?"
And that's a much harder question to answer, but it's the right frame. Every technology has an environmental footprint. The question is whether the value it creates — including environmental value through efficiency gains elsewhere — exceeds that footprint.
Alright, let me try to land this for the prompt's core question. When we see AI data centers being developed, we're looking at both containers for monolithic hardware and a reshaping process. The training clusters are the monoliths — purpose-built for GPU density in ways that traditional data centers can't accommodate. The inference buildout is the reshaping — taking existing infrastructure and adapting it for AI workloads, often through software-layer efficiency gains. And the sustainability debate isn't about which path is better in the abstract. It's about matching the physical requirements of the workload to the right infrastructure, and using scheduling intelligence to squeeze maximum work out of every watt.
That's well put. And I'd add one thing. The stranded asset fear — that we'll build all these GPU data centers and then the hardware paradigm will shift — is probably overstated for the facilities themselves. The buildings will be useful regardless of what chip architecture wins. But it's not overstated for the GPUs inside them. Those will depreciate, and quickly. The sustainability win is making sure they're utilized as fully as possible during their useful life.
Which loops right back to serverless.
Everything loops back to serverless in this conversation. It's the thread that ties the whole thing together.
Full circle. Alright, before we wrap, let me ask you one more thing. What's the thing about AI data centers that most coverage gets wrong?
Most coverage treats them as a single category and focuses on the total power number. "AI data centers will consume X gigawatts by 2030." That framing misses the distinction between training and inference, between new build and retrofit, between centralized and distributed. It also misses the fact that power consumption is not the same as carbon emissions. A data center in Quebec running on hydro power has a radically different carbon profile from one in a coal-dependent grid, even if they're pulling the same number of megawatts.
The headline number obscures more than it reveals.
As headline numbers usually do. The real story is in the composition — what kind of workloads, where, on what infrastructure, at what utilization level. That's where the sustainability picture actually lives.
Now: Hilbert's daily fun fact.
Hilbert: In the 1970s, radio astronomers studying the Drake Passage picked up a recurring signal pattern that matched the dive-and-surfacing rhythm of southern elephant seals. The seals were inadvertently acting as moving radio reflectors, their bodies briefly bouncing signals between Antarctic research stations and creating what looked, at first glance, like an artificial transmission.
Elephant seals were running a pirate radio station in the Drake Passage.
The original mesh network.
This has been My Weird Prompts. Thanks to our producer Hilbert Flumingtop. If you want more episodes, find us at myweirdprompts.com or wherever you get your podcasts. We'll be back next time.