Why Liquid Cooling Is the Backbone of the Modern AI Data Center
Liquid cooling has quickly become essential infrastructure for modern AI data centers as compute density and heat output outpace what traditional air cooling can handle. With AI racks now running at around 85 kW and projected to reach 200–250 kW, liquid cooling offers a far more efficient way to remove heat directly from chips, improving performance, reliability, and energy efficiency. Technologies like direct-to-chip cooling and coolant distribution units are now standard, supported by rapid market adoption and major investments across the industry. While implementation introduces new complexity in design, maintenance, and risk management, the shift is unavoidable. Ultimately, liquid cooling isn’t just a performance upgrade; it’s the foundation for scaling AI infrastructure sustainably and competitively.
Walk into any data center built for large-scale AI workloads and you’ll quickly notice the rows of servers are denser and the power draw is enormous. There’s a good chance you’ll see pipes — actual water-carrying pipes — running directly to the hardware. That’s liquid cooling, and it’s no longer an experimental add-on. It’s become the foundational infrastructure choice for anyone running serious AI compute.
To understand why, you need to understand the problem it solves.
Traditional data centers were designed around air cooling — cold air pushed through raised floors or overhead ducts, circulating past servers, absorbing heat, and getting exhausted out of the building. For most of computing history, this worked. Conventional server racks ran at 5–10 kW of power per cabinet. Air could handle that.
AI changed the equation entirely. Training large language models and running inference on modern GPU clusters demands sustained, extreme compute density. Today’s high-density AI racks routinely run at around 85 kW per cabinet, roughly 8 to 17 times the 5–10 kW that conventional racks were built for. Projections suggest the next generation of AI workloads could push racks to 200–250 kW. Air simply can’t keep up as a heat transfer medium: per unit volume, it carries only a small fraction of the heat that water does for the same temperature rise.
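To put rough numbers on that gap, here is a minimal sketch based on the standard sensible-heat relation Q = ρ · V̇ · c_p · ΔT. It assumes all 85 kW of rack power ends up as heat, a 10 K coolant temperature rise, and approximate textbook properties for air and plain water. Real loops often use water-glycol mixes and different temperature rises, so treat the function name and constants as illustrative rather than as design values.

```python
# Sensible heat transport: Q = density * volumetric_flow * c_p * delta_T
# Fluid properties are approximate textbook values near room temperature.
AIR = {"density_kg_m3": 1.2, "cp_j_per_kg_k": 1005.0}
WATER = {"density_kg_m3": 997.0, "cp_j_per_kg_k": 4186.0}


def volumetric_flow_m3_s(heat_w, fluid, delta_t_k):
    """Flow rate needed to carry heat_w watts with a temperature rise of delta_t_k."""
    if delta_t_k <= 0:
        raise ValueError("delta_t_k must be positive")
    heat_per_m3 = fluid["density_kg_m3"] * fluid["cp_j_per_kg_k"] * delta_t_k
    return heat_w / heat_per_m3


RACK_HEAT_W = 85_000.0  # the ~85 kW rack from the text, assumed to be all heat
DELTA_T_K = 10.0        # assumed coolant temperature rise (illustrative)

air_flow = volumetric_flow_m3_s(RACK_HEAT_W, AIR, DELTA_T_K)
water_flow = volumetric_flow_m3_s(RACK_HEAT_W, WATER, DELTA_T_K)

print(f"Air:   {air_flow:.2f} m^3/s")
print(f"Water: {water_flow * 1000:.2f} L/s")
print(f"Air needs about {air_flow / water_flow:.0f}x the volume of water")
```

Under these assumptions a single rack needs about 7 cubic meters of air per second but only about 2 liters of water per second, which is why a pipe can replace a very large amount of airflow.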
The numbers tell the story clearly. Research from Dell comparing cooling technologies found that single-phase direct liquid cooling maintained chip-to-coolant temperature differences of just 17–20°C at high processor loads. Air-cooled systems under similar conditions? Over 60°C. That’s not just an efficiency gap — it’s the difference between a GPU running at peak performance and one that’s throttling, overheating, or failing.
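One way to read those temperature differences is as headroom: the smaller the gap between chip and coolant, the warmer the coolant can be while the chip stays under its limit. The sketch below uses the 17–20°C and 60°C figures from the text; the 90°C chip limit is a hypothetical value chosen only for illustration, and for the air-cooled case "coolant" simply means the inlet air.

```python
def max_coolant_temp_c(chip_limit_c, chip_to_coolant_delta_c):
    """Warmest coolant (or inlet air) that keeps the chip at or below its limit."""
    return chip_limit_c - chip_to_coolant_delta_c


CHIP_LIMIT_C = 90.0  # hypothetical chip temperature limit, not from the source

# Temperature differences quoted in the text (air is "over 60", so 60 is a best case).
deltas_c = {
    "liquid, low end": 17.0,
    "liquid, high end": 20.0,
    "air, best case": 60.0,
}

for label, delta in deltas_c.items():
    ceiling = max_coolant_temp_c(CHIP_LIMIT_C, delta)
    print(f"{label}: coolant can be up to {ceiling:.0f} C")
```

With that assumed limit, the liquid loop can tolerate coolant in the low 70s Celsius, while air has to arrive below 30°C, which leaves far less margin on a hot day or under a load spike.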
The core concept is elegantly simple: instead of cooling the air around a server, you remove heat directly from the chip itself. Think of how a car’s engine cooling system works. Coolant circulates through sealed pipes, absorbs heat from the engine, flows to a radiator where fans dissipate that heat to the outside air, then recirculates and does it all again. The coolant never leaves the system. The heat does.

Data center liquid cooling follows the same principle, with a few key approaches now in widespread use:
Coolant Distribution Units (CDUs): The infrastructure backbone of any liquid cooling deployment. CDUs manage coolant flow, pressure, and temperature across the facility — essentially the pumping heart of the loop.
Direct-to-Chip (DTC) / Direct Liquid Cooling (DLC): Metal cold plates are mounted directly on CPUs and GPUs. Coolant flows through them, pulling heat away at the source. This is currently the dominant approach for AI data centers — proven, scalable, and compatible with both new builds and retrofits.
Immersion Cooling: Entire servers are submerged in a non-conductive dielectric fluid. Thermally superior to DTC in some scenarios, but operationally more complex and expensive to deploy at scale.
The adoption curve here is steep. Goldman Sachs forecasts liquid-cooled AI servers growing from 15% of the market in 2024 to 76% in 2026. That trajectory is already playing out in M&A activity. Just this year, Ecolab announced it’s acquiring CoolIT Systems — a leading manufacturer of CDUs, cold plates, and direct-to-chip systems — for approximately $4.75 billion. When a water and hygiene company writes a check that size for a data center cooling firm, it signals how central thermal management has become to the AI infrastructure stack.
Hyperscalers and colocation operators aren’t waiting; they have moved aggressively on liquid cooling adoption. Hardware vendors are moving in lockstep. ASUS recently unveiled a full liquid-cooling portfolio purpose-built for NVIDIA’s next-generation Vera Rubin NVL72 systems, with a PUE (power usage effectiveness, the ratio of total facility energy to IT equipment energy) of just 1.18 achieved in a recent Taiwan deployment. What’s easy to overlook in that headline number is the operational side: as cooling infrastructure becomes more tightly integrated with IT equipment, asset lifecycle management becomes considerably more complex. CDUs, cold plates, and piping loops are now mission-critical assets that require the same disciplined tracking, predictive maintenance, and lifecycle planning as any other piece of critical infrastructure.

Cooling already accounts for roughly 40% of total energy use in a data center. That number, combined with the sheer scale of new AI infrastructure coming online, means that how you cool your facility is increasingly a sustainability story — not just an operational one.
Liquid cooling helps on several fronts. It significantly improves PUE. It enables higher compute density per megawatt, meaning more AI work gets done per unit of power consumed. And it opens up heat reuse opportunities that air cooling simply can’t. In 2026, more AI data centers are expected to capture waste heat from liquid cooling loops and redirect it for district heating, agricultural use, or industrial processes. What was once an exhaust problem is becoming a resource.
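The "more compute per megawatt" point follows directly from the definition of PUE (total facility energy divided by IT equipment energy). The sketch below compares two cases drawn from figures in this article: a facility where cooling takes 40% of total energy, and the 1.18 PUE cited for the ASUS deployment. It assumes, for simplicity, that cooling is the only non-IT overhead; the function names are illustrative.

```python
def it_power_kw(facility_power_kw, pue):
    """IT load that a given facility power budget can support at a given PUE."""
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return facility_power_kw / pue


def pue_from_overhead_share(overhead_share_of_total):
    """PUE when non-IT overhead is a given fraction of total facility energy."""
    if not 0.0 <= overhead_share_of_total < 1.0:
        raise ValueError("share must be in [0, 1)")
    return 1.0 / (1.0 - overhead_share_of_total)


FACILITY_KW = 1_000.0  # one megawatt of total facility power

scenarios = [
    # Assumes cooling is the only overhead, so this PUE is a lower bound.
    ("cooling at 40% of total energy", pue_from_overhead_share(0.40)),
    ("deployment cited at PUE 1.18", 1.18),
]

for label, pue in scenarios:
    it_kw = it_power_kw(FACILITY_KW, pue)
    print(f"{label}: PUE {pue:.2f}, about {it_kw:.0f} kW of IT load per MW")
```

Under these simplifying assumptions, the same megawatt supports roughly 600 kW of IT load in the first case and roughly 850 kW in the second.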
Closed-loop designs take this further by eliminating continuous water consumption entirely. Oracle’s AI infrastructure data centers, for example, use closed-loop liquid cooling systems where the coolant perpetually recirculates and is never consumed — the heat leaves the building, the liquid does not. For communities near these facilities, it’s a meaningful distinction.
It would be misleading to frame liquid cooling as a plug-and-play upgrade. One of the leading voices in data center infrastructure is direct about this: deploying liquid cooling at scale is complex. Material compatibility between server-side and facility-side components must be carefully managed. Controls and piping become deeply integrated with the IT equipment itself. Design errors that might be compensated for in air-cooled environments are much harder to correct after the fact — which is exactly why operators considering the transition should start with a thorough facility assessment before committing to an implementation approach. Understanding exactly where your current infrastructure stands — its gaps, risks, and readiness — is the difference between a smooth deployment and an expensive course correction.
And when something does go wrong in a liquid-cooled environment, it tends to matter more than an air cooling failure would. Coolant leaks, CDU failures, or pressure drops can cascade quickly in a tightly integrated system. Having a structured incident management framework — one built around root-cause analysis, corrective tracking, and continuous improvement — isn’t optional for liquid-cooled facilities. It’s fundamental.
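As a rough illustration of what "structured" can mean here, the sketch below models an incident record that cannot be closed until a root cause is recorded and every corrective action is done. The class names, fields, and the example asset ID are hypothetical and not taken from any particular incident management product.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CorrectiveAction:
    description: str
    owner: str
    done: bool = False


@dataclass
class Incident:
    asset_id: str   # e.g. a CDU or piping loop identifier (hypothetical)
    summary: str
    root_cause: Optional[str] = None
    actions: List[CorrectiveAction] = field(default_factory=list)

    def can_close(self) -> bool:
        """Closable only with a root cause and at least one action, all completed."""
        has_actions = len(self.actions) > 0
        all_done = all(action.done for action in self.actions)
        return self.root_cause is not None and has_actions and all_done


incident = Incident(asset_id="CDU-07", summary="Pressure drop on secondary loop")
print(incident.can_close())  # no root cause or actions recorded yet

incident.root_cause = "Worn quick-disconnect seal"
incident.actions.append(
    CorrectiveAction("Replace seal and pressure-test loop", owner="facilities")
)
print(incident.can_close())  # action still open

incident.actions[0].done = True
print(incident.can_close())  # root cause recorded and all actions done
```

The design choice worth noting is that closure is a computed property rather than a flag someone can set, which keeps root-cause analysis and corrective tracking from being skipped under time pressure.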
But the engineering complexity is a problem to be engineered around — not a reason to avoid the transition. The thermal demands of modern AI workloads don’t leave much room for debate. Operators who get liquid cooling right will run more reliable, more efficient, and more competitive facilities. Those who don’t will find themselves constrained by the physics of air.
Liquid cooling has moved from a high-performance computing niche to the defining infrastructure choice of the AI era. The heat being generated by today’s GPU clusters demands it. The market is pricing it in. The hyperscalers have committed to it. For data center operators, the question is no longer whether to adopt liquid cooling — it’s how to do it well, how to do it at scale, and how to do it in a way that positions their facilities for the next generation of AI workloads that’s already on its way. If you’re not sure where your facility stands today, a free risk assessment is the right first step.
—
Sources: Goldman Sachs, Lombard Odier (Jan 2026), Oracle Cloud Infrastructure, Ecolab/CoolIT acquisition announcement (Mar 2026), AIRSYS North America, ASUS Pressroom.