Few data centers expect the speed and volume of the data they handle to slow down. Instead, operators constantly ask, “Where should our next 100 servers be placed?” The variables to consider include heat generation, power requirements with proper load balancing, network bandwidth and network security. In certain co-location scenarios, any new servers that host applications and data for a particular tenant must be placed within zones reserved for that tenant.
Computational Fluid Dynamics (CFD) can help data centers better predict thermal behavior and cooling requirements as they strive to fit more and more computing capacity into fixed spaces. CFD combined with lifecycle management capabilities for tracking workflows, work orders and maintenance events for servers, power distribution units (PDUs) and other physical assets can form the backbone of a powerful data center infrastructure management (DCIM) platform.
Claridion, a DCIM consulting services provider based in Quebec, recognized that even with predictive thermal analytics and lifecycle software, its customers could still only approximate current conditions in their data centers. To make data-supported operational decisions, they needed more. To answer questions about expanding capacity, refreshing old equipment and adhering to service agreements, data center operators needed a real-time data engine. Claridion’s Salvatore Cimmino and Rémi Duquette of partner firm MAYA HTT Ltd. joined forces to deliver DCIM as a service, building on MAYA HTT’s comprehensive DCIM solution. Claridion had been searching for a universal data trending platform to collect and analyze data on cooling and power capacity, available network bandwidth, current router connectivity and other concerns. That search ended with MAYA HTT’s Datacenter Clarity LC solution.
Better visibility into data center operations, and better-informed decisions when expanding compute capacity, would make serious gains in energy efficiency and business agility possible. Claridion, its data center customers, and the users who depended on the hosted applications and data all stood to gain, and everyone would be spared the pain of unexpected system failures and downtime caused by overheating.
Adding Operational Data
Claridion now uses Datacenter Clarity LC® powered by OSIsoft’s PI System as the engine to feed real-time data and capture asset performance over time for the DCIM platform. The platform can tap into all the relevant data streaming from virtual machines (VMs), server hardware, racks, the communications infrastructure, the building management system (BMS), and any additional relevant data sources such as lighting and physical access systems. It lets data center customers collect and analyze high volumes of key performance indicator data, from the single-asset level up through the zone, floor and building levels.
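As a rough illustration of what analysis "from the single-asset level up through the zone, floor and building levels" can mean, the sketch below sums per-asset power readings into totals at each level. It is not based on the PI System or Datacenter Clarity LC APIs. The tuple layout, the names `READINGS` and `roll_up`, and all values are assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical readings: (building, floor, zone, asset, power draw in kW).
# In a real deployment these values would come from the real-time data engine.
READINGS = [
    ("B1", "F1", "Z1", "server-001", 0.42),
    ("B1", "F1", "Z1", "server-002", 0.38),
    ("B1", "F1", "Z2", "server-003", 0.55),
    ("B1", "F2", "Z3", "server-004", 0.61),
]


def roll_up(readings):
    """Sum asset-level power into zone, floor and building totals (kW)."""
    totals = {
        "zone": defaultdict(float),
        "floor": defaultdict(float),
        "building": defaultdict(float),
    }
    for building, floor, zone, _asset, kw in readings:
        totals["zone"][(building, floor, zone)] += kw
        totals["floor"][(building, floor)] += kw
        totals["building"][building] += kw
    return totals


totals = roll_up(READINGS)
for level in ("zone", "floor", "building"):
    for key, kw in sorted(totals[level].items()):
        print(f"{level:8} {key}: {kw:.2f} kW")
```

The same pattern would apply to any other additive indicator, such as heat load or rack units in use.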
Templates pull from the PI Data Server to present trends and real-time information as detailed dashboards. The time-series performance data is overlaid with business data and presented as 3D renderings and graphical visualizations. This makes clear the steps needed to optimize available capacity and to avoid downtime and energy waste. Power consumption trending data, for example, is delivered with a visualization showing how devices are related along power lines, together with their current and power readings and thresholds. You can click on a device in the series to call up its maintenance history from the Datacenter Clarity LC lifecycle management database. Combining the relevant data in this easily navigated, visual way facilitates collaboration and fast, accurate decision-making among stakeholders. Now, as decision-makers review their server placement options inside the platform, they are presented with visualizations based on real-time data that filter for each constraint, ultimately revealing optimal locations for new servers.
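The idea of filtering placement options for each constraint can be sketched in a few lines. The code below is a minimal illustration, not how Datacenter Clarity LC works internally. The `Zone` and `Request` classes, their fields, the sample figures, and the simplification that cooling demand equals power drawn are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    name: str
    tenant: str                  # tenant the zone is reserved for
    free_rack_units: int
    spare_power_kw: float
    spare_cooling_kw: float
    spare_bandwidth_gbps: float


@dataclass
class Request:
    tenant: str
    servers: int
    rack_units_each: int
    power_kw_each: float
    bandwidth_gbps_each: float


def candidate_zones(zones, req):
    """Return names of zones meeting every constraint, most power headroom first."""
    need_ru = req.servers * req.rack_units_each
    need_kw = req.servers * req.power_kw_each
    need_bw = req.servers * req.bandwidth_gbps_each
    fits = [
        z for z in zones
        if z.tenant == req.tenant
        and z.free_rack_units >= need_ru
        and z.spare_power_kw >= need_kw
        and z.spare_cooling_kw >= need_kw  # assumption: heat to remove ~ power drawn
        and z.spare_bandwidth_gbps >= need_bw
    ]
    fits.sort(key=lambda z: z.spare_power_kw - need_kw, reverse=True)
    return [z.name for z in fits]


# Hypothetical zones and a request for 100 1U servers at 0.4 kW each.
zones = [
    Zone("A", "acme", 120, 45.0, 50.0, 40.0),
    Zone("B", "acme", 200, 60.0, 35.0, 40.0),     # not enough spare cooling
    Zone("C", "other", 300, 100.0, 100.0, 100.0),  # reserved for another tenant
    Zone("D", "acme", 150, 70.0, 80.0, 20.0),
]
request = Request("acme", servers=100, rack_units_each=1,
                  power_kw_each=0.4, bandwidth_gbps_each=0.1)
print(candidate_zones(zones, request))
```

Ranking by remaining power headroom is only one possible tie-breaker. A real platform would weigh thermal simulation results and live measurements as well.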
Phases of DCIM
The first phase of DCIM is planning: Where can we place our initial or additional 100 servers? The next phase is reserving. The data center staff needs to reserve ahead not only the physical space, but also the power and the cooling that the new machines will require. A different set of predictive analytics and asset management tools applies here. Then, when the servers arrive, the data team needs decision support to design the workflows and generate the work orders to deploy the 100 servers efficiently. When hosting business-critical applications or those that handle protected personal information, consideration must be given to the levels of security, redundancy, high availability and recoverability agreed to in the service agreement. All these considerations can be addressed by DCIM predictive analytics; however, predictive simulation alone yields only an approximation of reality. Datacenter Clarity LC results are made reliable and useful by the addition of operational data from the PI System.
Another perennial question for data center operators is “When should older IT equipment be phased out and replaced?” This is a business decision that again calls for careful analysis of a multitude of factors. If the equipment in a particular zone is older, it typically can support fewer clients than the latest generation and thus may be limiting revenue. So, when a maintenance event is scheduled for a server or rack, the data center’s CFO might want to be made aware of the situation, especially if it involves downtime.
Depending on how business-critical the applications supported by the asset are, and on the client service-level agreement (SLA), the cost of data center downtime can be as much as $8,000 a minute. The prudent financial decision may be to buy new IT hardware and phase it in without incurring any downtime. In this scenario, it may be preferable to present the operational performance data in financial terms to bring the CFO into the decision-making more easily.
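To make the trade-off concrete, the sketch below compares the cost of an expected outage with the cost of replacing hardware without downtime. Only the $8,000-per-minute ceiling comes from the figure above. The function names, the 45-minute and 10-minute outages, and the $250,000 replacement cost are hypothetical values chosen for illustration.

```python
def downtime_cost(minutes, cost_per_minute=8000):
    """Cost of an outage in dollars at a given per-minute rate."""
    return minutes * cost_per_minute


def cheaper_option(expected_downtime_minutes, replacement_cost, cost_per_minute=8000):
    """Return which option costs less and the dollar difference between them."""
    outage = downtime_cost(expected_downtime_minutes, cost_per_minute)
    if outage > replacement_cost:
        return ("replace", outage - replacement_cost)
    return ("maintain", replacement_cost - outage)


# Hypothetical case: a 45-minute maintenance outage vs. $250,000 of new hardware.
print(downtime_cost(45))            # 360000
print(cheaper_option(45, 250_000))  # ('replace', 110000)
print(cheaper_option(10, 250_000))  # ('maintain', 170000)
```

A real analysis would also account for SLA penalties, the revenue limits of older equipment and the residual value of the hardware being retired.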
Visibility into data center infrastructure, through both real-time monitoring and lifecycle asset tracking, offers huge returns in driving data center operational efficiency and continuous improvement. When that visibility is combined with thermal predictive analytics, the potential for energy efficiency gains increases by another order of magnitude. Analysts estimate that energy-related costs account for approximately 12% of total data center operational costs, and cooling strategies are a big factor. High-density zones featuring special cooling techniques such as chilled water and outside air methods are best-practice strategies for improving power usage effectiveness (PUE) and cooling usage efficiency (CUE). Designing such zones requires the type of insight that comes from combining thermal modeling with a real-time engine serving actual measurements of power and heat across the internal environment: the type of insight that comes from Datacenter Clarity LC® with the PI System at its core.
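PUE is commonly defined as total facility energy divided by the energy delivered to IT equipment, so a value closer to 1.0 means less overhead for cooling and power distribution. The sketch below shows how measured energy figures translate into that ratio. The function name and the energy values are hypothetical and are not measurements from any Claridion customer.

```python
def pue(total_facility_kwh, it_equipment_kwh):
    """Power usage effectiveness: total facility energy / IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


# Hypothetical monthly figures before and after a cooling change.
before = pue(total_facility_kwh=1800, it_equipment_kwh=1000)
after = pue(total_facility_kwh=1500, it_equipment_kwh=1000)
print(f"PUE before: {before:.2f}, after: {after:.2f}")
```

Because the IT load is the denominator, the ratio is only meaningful when both energy figures cover the same period, which is one reason continuous measurement matters.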
To learn more about DCIM and how to manage, visualize, and distribute data, enabling all stakeholders to understand performance data at a given moment, please visit www.osisoft.com
Claridion is an IT service firm with expertise in data center infrastructure management (DCIM) based in Montreal, Quebec, with data center customers in both enterprise-owned and hosting categories. MAYA HTT Ltd. is also based in Montreal, Quebec, and has decades of experience in Computational Fluid Dynamics (CFD), which informs its thermal models and analytics.