Data centers continue to evolve rapidly. The increase in High-Performance Computing (HPC) workloads, the push toward edge strategies, and the use of cloud technologies are changing how we think about facility design.
Traditionally, data centers were designed with as much redundancy as possible. Most owners required Tier III topologies or an “N+1” configuration for all major infrastructure. While most owners still want this reliability, a growing number are considering ‘variable resiliency,’ or the ability to deliver varying levels of redundancy to different racks in their data center.
A big reason for this shift is that end users have begun to think about their data centers much more on the application level than the server level. Historically (and before virtualization), users treated each piece of server hardware in the data center equally. Today, many users talk about their data center in terms of application “tiers,” with “Tier 1” applications being mission critical to their business.
This shift in mindset has opened the conversation about variable resiliency. In a data center, variable resiliency is a concept whereby you provide less redundancy to specific racks (or applications) based on their criticality. Instead of providing a traditional N+1 power architecture to your entire data center, you would supply your less critical applications with grid power only. A frequent example that we see is HPC applications: they are extremely high in density and require unique design strategies, but in many cases they can also sustain a short outage without damage to the work being conducted. (See Dr. James Cuff of Harvard University talking about ‘shutting off’ his data center in demand response.)
There is a strong cost argument for this approach. By providing grid power only to certain applications in your data center, you reduce the size of your Uninterruptible Power Supply (UPS) equipment, which decreases your capital costs during construction and lowers your annual maintenance costs.
If you are considering variable resiliency for your data center, keep several factors in mind:
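To make the sizing argument concrete, here is a minimal sketch of how the UPS-backed load shrinks when some racks move to grid power only. The rack names, loads, and the `ups_backed` flag are hypothetical values chosen for illustration, and a real sizing exercise would also account for redundancy, growth, and power factor.

```python
from dataclasses import dataclass


@dataclass
class Rack:
    name: str          # hypothetical rack label
    load_kw: float     # assumed IT load in kW
    ups_backed: bool   # True if the rack stays on UPS power


def ups_load_kw(racks):
    """Sum the load of only those racks that remain UPS-backed."""
    return sum(r.load_kw for r in racks if r.ups_backed)


# Illustrative inventory: two critical racks, two HPC racks on grid power only.
racks = [
    Rack("tier1-db", 8.0, True),
    Rack("tier1-web", 6.0, True),
    Rack("hpc-a", 30.0, False),
    Rack("hpc-b", 30.0, False),
]

total_kw = sum(r.load_kw for r in racks)
protected_kw = ups_load_kw(racks)
reduction = 1 - protected_kw / total_kw

print(f"Total IT load:   {total_kw:.1f} kW")
print(f"UPS-backed load: {protected_kw:.1f} kW")
print(f"UPS load reduction vs. protecting everything: {reduction:.0%}")
```

In this made-up inventory the high-density HPC racks dominate the load, which is why moving them off the UPS has such a large effect on the protected capacity.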
Business Continuity Planning
A variable resiliency strategy requires thoughtful IT planning. You may be choosing to provide lower redundancy to certain data center applications because they are less critical or because they are replicated asynchronously to another site, but in either case the delivery of services from these applications will experience more interruptions than in a traditional “N+1” data center. This requires an IT organization to think through the downstream implications of intermittent interruptions on its service delivery model.
If you are forgoing the use of a UPS to filter the ‘dirty’ power from your grid, it is important to understand the reliability of your utility power. Historical trending data on its service can typically be obtained by working with the utility directly. Review a minimum of 36 months' worth of data, take the worst case out of those months (i.e., the highest number of outages), and make sure you are comfortable that your data center can sustain that variability.
Having a robust monitoring solution is critical to any data center with varying levels of reliability. When your data center has different reliability ‘sections,’ a single utility power event will affect each section differently. Make sure that your monitoring platform provides granular insight into the status of your physical infrastructure so that you can make informed decisions, especially if your data center is lights-out.
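The worst-case review described above can be sketched in a few lines. This assumes the utility's data has already been reduced to a per-month outage count keyed by a "YYYY-MM" string; that format, the function name `worst_case_month`, and the sample counts are all hypothetical, since utilities report this information in different ways.

```python
def worst_case_month(monthly_outages, min_months=36):
    """Return (month, outage_count) for the month with the most outages.

    monthly_outages: dict mapping "YYYY-MM" to the number of outages that
    month, including months with zero outages (assumed input format).
    """
    if len(monthly_outages) < min_months:
        raise ValueError(
            f"need at least {min_months} months of data, got {len(monthly_outages)}"
        )
    month = max(monthly_outages, key=monthly_outages.get)
    return month, monthly_outages[month]


# Illustrative history: 36 months starting January 2021, mostly outage-free.
history = {
    f"{2021 + i // 12}-{i % 12 + 1:02d}": 0
    for i in range(36)
}
history["2022-07"] = 3  # made-up storm season
history["2023-01"] = 1  # made-up single event

month, count = worst_case_month(history)
print(f"Worst month on record: {month} with {count} outages")
```

The number to plan around is the worst month, not the average, because the question is whether the less critical applications can tolerate the most disruptive period the grid has actually delivered.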
Standby Generator Planning
Users of variable resiliency designs should be prepared for more proactive monitoring of their grid and of events that may impact its uptime (e.g., severe weather). Less critical applications may be able to withstand short outages, but many users will want to prevent sustained downtime by using their standby generator system. More frequent use of your generator plant will consume more fuel and potentially increase maintenance, something that should be considered during the design process.
Variable resiliency is worth considering for your data center. It requires careful planning but can deliver long-term cost savings that, for some organizations, outweigh the risks of lower reliability for certain applications.
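A rough way to gauge the fuel impact during design is to compare expected generator run hours with and without proactive starts. The sketch below is only an illustration: the burn rate, the number of runs, and their durations are hypothetical, and real consumption varies with generator size and load.

```python
def annual_fuel_litres(run_hours, burn_litres_per_hour):
    """Estimate yearly fuel use from a list of individual run durations (hours)."""
    return sum(run_hours) * burn_litres_per_hour


BURN_RATE_L_PER_H = 200.0        # assumed burn rate for the generator plant

routine_tests = [0.5] * 12       # assumed monthly 30-minute test runs
weather_runs = [4.0] * 6         # assumed proactive runs ahead of severe weather

baseline = annual_fuel_litres(routine_tests, BURN_RATE_L_PER_H)
proactive = annual_fuel_litres(routine_tests + weather_runs, BURN_RATE_L_PER_H)

print(f"Routine testing only:     {baseline:.0f} L/year")
print(f"With proactive operation: {proactive:.0f} L/year")
```

Even with invented numbers, the comparison shows why fuel storage and refueling contracts deserve a second look when generators are expected to run more often than a routine test schedule would require.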