Measuring application power consumption on the Linux* operating system

Power consumption is a common and growing concern in large compute installations, whether HPC, cloud or enterprise: facility power and space limitations are making it increasingly difficult to support the explosive growth of computational needs. We therefore need to dig deeper into how best to reduce power consumption at multiple levels, from hardware to software. In this article I will describe some of the known approaches and tools available to tackle this challenge.

There are multiple approaches to addressing power consumption -- barring the reduction of the 'need', which I don't think will likely happen anytime soon --  including:

  • Use faster applications (if you are outsourcing the software you run) or introduce more parallelism in your internal software, thus using less power per unit of computation. This ultimately means that you can run more jobs through your datacenters, which will slow down the rate at which your facility needs to grow.
  • Remove hardware that isn't used or needed. This includes not only unneeded servers, but also boards and components inside the systems.
  • Disable or remove unnecessary software components, freeing up 'cycles' for jobs (e.g. disable unneeded services at the operating system level).
  • Upgrade to systems that consume less power. Newer hardware often consumes less power, due to advances in power management.
  • Limit the amount of power consumed by systems or applications (which may affect overall throughput). This can be done through software that runs externally from the system (for example, with Intel(R) Node Manager) or through software running on the systems themselves: you can tune the system BIOS settings to enable a power-saving mode; additionally, some Virtual Machine Monitor implementations can be configured to power down virtual machines when they are not in use, and you can often place a limit on CPU consumption by any virtual machine (which will consequently limit power consumption).

If you want to slow down your rate of purchases or need for capital investments, perhaps you should take a second look at your software. Besides making it run faster:

  • Can you make it run in a more power efficient way?   
  • Is there a way to gain an understanding of how much power your software consumes during execution?   
  • Are there constructs and methods that will (generally) improve or degrade an application's power footprint?

I put these questions to our Linux* experts here at Intel to see what pointers I'd get. The result is this blog, where I share what I learned. I focus specifically on Linux, since many HPC clusters and cloud installations run this operating system.

There are two categories of energy monitoring tools:

   a. Out of band monitors

   b. In band monitors

 

   a. Out of band monitors - These monitors show information about a system's power consumption as a whole, so you won't be able to see which specific applications drove increased power consumption. On the plus side, they don't affect the power consumption of the system while collecting the statistics. An example of such an out of band monitor is the Intel(R) Node Manager. It can report power consumption and makes this data accessible via IPMI. In-band operation is also possible, but it results in visible overhead on highly loaded applications if you sample data too often, and sampling at exact one-second intervals cannot be done effectively. A data sampling rate of every 3-5 seconds is recommended.
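To make the sampling-rate recommendation concrete, here is a minimal Python sketch of a polling loop that takes a reading every few seconds and summarizes the result. The names `sample_power`, `summarize` and `read_watts` are hypothetical and not part of any Intel tool; `read_watts` stands in for whatever you use to obtain a power reading (for example, a function that parses the output of an IPMI query).

```python
import time
from statistics import mean


def sample_power(read_watts, interval_s=4.0, samples=5, sleep=time.sleep):
    """Collect `samples` readings from `read_watts`, pausing `interval_s` between them.

    `read_watts` is a caller-supplied function returning the current power in
    watts (hypothetical; how it obtains the value is up to you). The default
    interval sits inside the 3-5 second range recommended above.
    """
    readings = []
    for i in range(samples):
        readings.append(float(read_watts()))
        if i < samples - 1:  # no need to wait after the last reading
            sleep(interval_s)
    return readings


def summarize(readings):
    """Return the minimum, maximum and average of a list of readings."""
    return {"min": min(readings), "max": max(readings), "avg": mean(readings)}


# Example use, with a placeholder reader of your own:
#   stats = summarize(sample_power(my_reader, interval_s=4.0, samples=30))
```

Passing the `sleep` function in as a parameter is only a convenience for testing the loop without actually waiting.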

   b. In-band monitors - Most of these tools rely on energy counters exposed via hardware registers, such as the RAPL (Running Average Power Limit) counters, which are exposed to software via Model-Specific Registers (MSRs). The Intel® Xeon® E5 series processors based on the Intel microarchitecture code-named Sandy Bridge make such counters available. With in-band monitoring you may be able to obtain greater detail (to determine which applications are generating higher power consumption); the downside of this approach is that collecting the data will itself consume power. RAPL counters are updated approximately every millisecond, and provide estimates of energy consumption based on internal processor heuristics rather than actual measurements. In Intel's experience the accuracy of the data is good around TDP (Thermal Design Power) levels, and may be 10% off at lower power load levels. Also, RAPL won't provide the complete platform power, just CPU and memory power. Unprivileged user-space applications can gather power usage statistics on Linux, provided that the msr.ko kernel module is loaded and the application has permission to read /dev/cpu/*/msr. Most of the tools in the table below are considered in-band monitors.
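As an illustration of the in-band approach, the following Python sketch reads the package energy counter through /dev/cpu/*/msr and converts two readings into an average power figure. It is a sketch under assumptions, not a reference implementation: the register addresses (0x606 for the RAPL power unit register, 0x611 for the package energy status register) and the bit layout are taken from Intel's Software Developer's Manual and should be checked against the documentation for your processor, and the function names are made up for this example.

```python
import os
import struct
import time

# Register addresses as documented in the Intel SDM (verify for your CPU model).
MSR_RAPL_POWER_UNIT = 0x606
MSR_PKG_ENERGY_STATUS = 0x611


def read_msr(cpu, reg):
    """Read one 64-bit MSR; requires msr.ko and read access to /dev/cpu/<cpu>/msr."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        # The msr driver uses the file offset as the register address.
        return struct.unpack("<Q", os.pread(fd, 8, reg))[0]
    finally:
        os.close(fd)


def energy_unit_joules(power_unit_msr):
    """Bits 12:8 hold the energy status unit (ESU); one count is 1/2**ESU joules."""
    esu = (power_unit_msr >> 8) & 0x1F
    return 1.0 / (1 << esu)


def energy_delta_joules(before, after, unit_j):
    """Energy between two readings of the 32-bit counter, tolerating one wraparound."""
    delta = ((after & 0xFFFFFFFF) - (before & 0xFFFFFFFF)) & 0xFFFFFFFF
    return delta * unit_j


def package_power_watts(cpu=0, interval_s=1.0):
    """Average package power over `interval_s` seconds, as estimated by RAPL."""
    unit_j = energy_unit_joules(read_msr(cpu, MSR_RAPL_POWER_UNIT))
    before = read_msr(cpu, MSR_PKG_ENERGY_STATUS)
    time.sleep(interval_s)
    after = read_msr(cpu, MSR_PKG_ENERGY_STATUS)
    return energy_delta_joules(before, after, unit_j) / interval_s
```

The counter is only 32 bits wide and wraps around, which is why the difference is masked; with long intervals under high load it may wrap more than once, so keep the interval short enough for your system.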

The measurement method that is most suitable for your needs  will depend on how the power consumption data is going to be used. In many cases, Intel engineers use both measurement methods.   Assessing power efficiency is not always easy or straightforward.  You need to ask yourself questions such as the following:

  • Should I use the Operating System to measure power?
  • Do I need to measure overall system power?
  • How can I trace the power consumption back to an application?
  • Given a problematic application, how do I drill down to problem areas within it?

We suggest that by using both methods and correlating out-of-band monitoring with in-band monitoring data, you can construct a more complete picture of what is going on. Once you have identified the culprit application(s), you can use Intel(R) VTune Amplifier XE, the Intel(R) Energy Checker SDK (mentioned below) or PyTimechart to drill down from there.

 

The following table summarizes some of the tools or capabilities currently used by Intel engineers (and for what purpose):

 
Tool - What it does, the pros and cons

powertop - Shows, at a high level, what your application is doing (similar to the type of output you see from 'top'). It is useful in identifying misbehaving programs while the computer is idle, and is very popular among Intel Linux* users.

Intel(R) VTune(TM) Amplifier XE 2013 - A GUI tool, available for purchase, with support for power analysis on Linux kernels >= 2.6.32. This analysis can be done on relatively recent Intel(R) Xeon(R) systems (2010 or newer) as well as several Intel(R) based tablets and phones running Android*. More information on what it can measure is available here: http://software.intel.com/en-us/articles/how-to-use-the-power-analysis-types-of-intel-vtune-amplifier-xe-2013-on-linux

Intel(R) Energy Checker - An API that provides the functions required for exporting and importing counters from an application. Through software instrumentation, it exposes metrics for the "useful work" done by an application.

Intel(R) Power Governor - A utility and library that allow developers to monitor and regulate power at very fine time granularities (a few tens of milliseconds). Power monitoring and control is available for the package, core, graphics, uncore and DRAM domains.

Intel(R) Performance Counter Monitor - Provides sample C++ routines and utilities to estimate the internal resource utilization of the latest Intel® Xeon® and Core™ processors.

turbostat - Tracks CPU frequency and C-states.

ipmitool and Intel(R) Node Manager - A combination for sampling data out-of-band. Used together, they provide a more complete picture of what is happening. Examples of commands that can be used on recent Intel(R) server platforms are as follows (the second one uses ipmi-oem, which is part of FreeIPMI rather than ipmitool):

ipmitool -H ipmihost -U root -P root delloem powermonitor

or

ipmi-oem intelnm get-node-manager-statistics

Likwid - A lightweight performance measurement tool that is useful for monitoring multi-node HPC applications. It allows the measurement of both memory and cache usage, as well as thread-core affinity (among other things).

Intel(R) Performance Bottleneck Analyzer - Finds and prioritizes issues impacting performance on Intel Architectures. It detects architectural issues such as partial flag/register stalls, store forwarding, and L2 misses. Support for Linux* based applications is limited: see the User Guide for more information.

PyTimechart - Allows you to see what events are causing C-state exits. This includes events that are causing your system to wake up frequently (leading to higher power consumption).

 

What do you do, once you've figured out what applications may be consuming a lot of power?    

There are quite a few resources available that explain which software development practices lead to higher or lower power consumption. There are some great pointers on the Lesswatts website, and in the Intel(R) Energy Efficiency Community. The book "Energy Aware Computing" has an entire section dedicated to writing energy-efficient software for the data center. We encourage you to check those out!

The general idea is to keep unnecessary 'work' to a minimum: overly frequent gettimeofday() calls or repeated events, for example, are a good place to look. Minimizing system calls is another: some APIs and system calls are more expensive than others. Perform an in-depth analysis of your application to understand its I/O patterns (are you experiencing a lot of cache misses?), improve data locality, and maximize user time (over system time). All of these factor into your application's power consumption.
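One rough way to see how a change affects the balance between user time and system time is to measure both around a piece of work. The Python sketch below is illustrative only: the helper names are made up for this example, and writing to os.devnull is just a convenient stand-in for a real I/O target. It compares many small unbuffered writes (one system call each) with a single batched write of the same data.

```python
import os


def cpu_time_split(work):
    """Run work() and return (user_seconds, system_seconds) used by this process."""
    start = os.times()
    work()
    end = os.times()
    return end.user - start.user, end.system - start.system


def many_small_writes(path=os.devnull, n=20000):
    """Write n bytes one at a time; unbuffered, so each write is a system call."""
    with open(path, "wb", buffering=0) as f:
        for _ in range(n):
            f.write(b"x")


def one_batched_write(path=os.devnull, n=20000):
    """Write the same n bytes with a single system call."""
    with open(path, "wb", buffering=0) as f:
        f.write(b"x" * n)


# Example use:
#   print("small writes:", cpu_time_split(many_small_writes))
#   print("batched write:", cpu_time_split(one_batched_write))
```

The absolute numbers depend on the machine and are coarse for short runs, so treat them as a hint about where time goes rather than as a power measurement.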

We recommend that you also consult with Intel by submitting your questions to the Power Efficiency Community Forum.

For complete information about compiler optimizations, see the Optimization Notice.