New Technologies Drive Data Center Cooling Challenges

With much industry emphasis on hybrid cloud strategies, deploying new technologies on-premises, both in enterprise data centers and at the edge, is quietly creating challenges for data center operators. The heat wave gripping the United States has brought these challenges into focus as data center owners struggle to push cooling systems beyond their capacity. One owner told us, "this summer, we upgraded from our maintenance technician spraying down our condenser coils once per week to an automatic sprinkler that we set up for each condenser."

While the extreme heat we are experiencing is a driver of cooling shortfalls, the deployment of new technology in the data center does the most to expose the limitations of existing cooling systems. Technology refresh projects are often undertaken based on business needs without evaluating whether existing data center cooling systems can support the planned upgrades. Several types of technology upgrades have a significant impact on data center cooling systems. These include:

Hyperconverged Infrastructure (HCI)

The HCI market continues to show strong growth, and we see the number of deployments rising across industry sectors. While the benefits to an organization adopting HCI are substantial, the change from a distributed to a converged infrastructure changes the density profile of the racks in your data center. For example, the Dell VxRail platform relies mostly on 2U nodes with power supplies typically rated around 1100W each. Some implementations are as small as three nodes, and many configurations scale upwards of 32 nodes. Implementing a full rack of VxRail technology can drive the density of an individual cabinet as high as 30kW.

In the field, we are seeing run rate (i.e., the actual power consumption measured at the rack PDU level) deployments of these technologies average around 10-15kW per rack when deployed with a larger number of nodes. Several cooling strategies can support this, but many legacy data centers struggle to provide the airflow these high-density systems require. That limits their ability to deploy the technology or forces them to spread the nodes across several racks, decreasing the efficiency of rack space utilization in the facility.
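To illustrate how quickly HCI density climbs, here is a minimal sketch of the rack-level math. The 1100W supply rating follows the VxRail example above; the node count, rack capacity, and run-rate factor are illustrative assumptions, not measured values.

```python
# Rough sketch, not a sizing tool: estimate rack heat load for an HCI deployment.
# The 1100W figure follows the VxRail example above; the node count and
# run-rate factor are illustrative assumptions, not measured values.

def rack_load_kw(nodes: int, watts_per_node: float, run_rate_factor: float = 1.0) -> float:
    """Estimate rack heat load in kW from node count and per-node power draw."""
    return nodes * watts_per_node * run_rate_factor / 1000.0

# Assumption: a 42U rack of 2U nodes holds roughly 20 nodes once switching is accounted for.
print(rack_load_kw(20, 1100))        # ~22.0 kW at nameplate
print(rack_load_kw(20, 1100, 0.6))   # ~13.2 kW at an assumed 0.6 run-rate factor
```

Substituting your own PDU measurements for the assumed factor gives a quick check of whether a planned consolidation fits inside your existing cooling envelope.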

Cisco UCS

The latest version of Cisco's UCS chassis, the Cisco UCS X9508, has increased cooling requirements compared with previous generations of the UCS technology. The X9508 is a 7U chassis with six 2800W power supplies (at 200-240 VAC). Our clients are looking to deploy three to four chassis per rack.

The power supplies for the UCS chassis are redundant, typically in an N+1 configuration. Even with only five power supplies operating, the potential power consumption and associated cooling load is 14kW per chassis (5 × 2800W).

Recognizing that 2800W is the nameplate rating of each power supply, we can apply a conservative factor of 0.65 to predict the run rate load of the X9508. At this factor, owners can still expect a load of over 9kW per chassis (14,000W × 0.65), or nearly 30kW per rack with three chassis deployed in an individual cabinet.
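For readers who want to reproduce the arithmetic, here is a short sketch of that nameplate-to-run-rate estimate. The supply rating, active supply count, and 0.65 factor come from the figures above; the three-chassis rack is just one example layout.

```python
# Sketch of the nameplate-to-run-rate estimate described above.
# Figures follow the article: five active 2800W supplies per X9508 chassis
# and a conservative 0.65 derating factor from nameplate to measured load.

SUPPLY_W = 2800          # per-supply nameplate rating
ACTIVE_SUPPLIES = 5      # N+1 configuration leaves five supplies carrying the load
RUN_RATE_FACTOR = 0.65   # conservative nameplate-to-run-rate derating

chassis_kw = SUPPLY_W * ACTIVE_SUPPLIES * RUN_RATE_FACTOR / 1000
rack_kw = chassis_kw * 3  # example layout: three chassis in one cabinet

print(f"~{chassis_kw:.1f} kW per chassis, ~{rack_kw:.1f} kW per rack")
# ~9.1 kW per chassis, ~27.3 kW per rack
```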

Research & High-Performance Computing

Our practice is seeing more research and high-performance computing (HPC) systems deployed in on-premises data centers. While traditionally concentrated in institutional environments, the growth of Artificial Intelligence and Machine Learning across industries has driven HPC technologies into verticals beyond education: healthcare, financial services, life sciences, utilities, and more.

Today, computing is a critical pillar of the scientific and research process, and US spending on Research & Development continues to increase, topping $708 billion in 2020, a 65% increase over the past ten years.

Grant and research funding is driving upgrades from legacy environments to highly dense HPC node deployments, including the switch from CPU-based configurations to GPU-based processing.

More importantly, grant funding makes these deployments happen quickly without the multi-year planning cycle typical for enterprise computing. Data center owners are struggling to accommodate these requests without purpose-built HPC strategies for power and cooling, and many are taking steps to evaluate next-generation cooling systems like direct-to-chip or immersion cooling architectures.

How Your Data Center Can Evolve

It is difficult for data center operators to fully align the technology procurement process with the physical infrastructure upgrade cycle in the data center. The two cycles are naturally mismatched: technology refreshes every 24-36 months, while critical infrastructure is upgraded on a much longer timescale. For data center owners trying to support new technologies in existing data centers, the most impactful step you can take is to evaluate the capacity and scale of your current cooling infrastructure. In many cases, the design criteria developed when the data center was built no longer represent the current state of technology.

Assessing your current cooling capacity, redundancy, and airflow management and distribution strategy will provide a critical perspective on the gaps between your existing infrastructure and its ability to support today's high-density data center technologies. An assessment report should include:

- An understanding of your technology profile and expected scale

- Evaluation of your existing data center, including power, cooling, network, and other critical infrastructures

- Conceptual designs for data center modifications to support higher densities, including a project plan or risk matrix that identifies how to upgrade without compromising uptime of the existing systems

- Estimated costs associated with any upgrades, allowing you to organize budget requests accordingly

You can't ignore data center cooling systems while deploying new strategies and innovations such as Hyperconverged Infrastructure, Cisco UCS, and High-Performance Research Computing. You will quickly lose the efficiencies and promised performance gains of new technology if you support it with sub-par, patchwork cooling fixes.

Questions? Contact us below.